Salary: $150,000 - $200,000 · Location: United States of America · Posted: 07/05/2023
##### Job Description:
**BigLynx, Inc.** is an American multinational technology corporation
headquartered in Seattle, Washington, with operations in the _United States,
Canada, and India_. The company began in 2016 as a product development
company specializing in _AI/ML Data Engineering_ in the retail vertical,
with its products Warehouse & Fast. Post-pandemic, in 2022, BigLynx added
a boutique technology consulting division specializing in _**Data
Engineering, Full Stack, and Microsoft Dynamics**_, helping clients build
the next generation of data platforms and big data pipelines.
* Data pipeline development: Design, develop, and maintain scalable and efficient data pipelines using Databricks to ingest, transform, and load data from various sources. This includes data extraction, cleansing, transformation, and loading processes.
* Data modeling and schema design: Design and implement data models, database schemas, and data structures on Databricks. Optimize data models for performance, scalability, and ease of use.
* ETL processes: Develop and maintain ETL (Extract, Transform, Load) processes using Databricks to transform and cleanse data. Implement efficient data integration and transformation logic using languages such as Python, SQL, or Scala (see the sketch after this list).
* Data integration: Integrate data from multiple systems and sources, ensuring data consistency, accuracy, and quality. Develop and maintain data connectors, APIs, and data ingestion processes.
* Performance optimization: Identify and address performance bottlenecks in data pipelines and data models. Optimize query performance, data loading, and data processing capabilities on Databricks.
* Data governance and security: Implement data governance practices, data privacy measures, and security controls on Databricks. Ensure compliance with data governance policies and regulations.
* Monitoring and troubleshooting: Monitor the health and performance of Databricks data infrastructure, data pipelines, and data processing jobs. Troubleshoot issues and provide timely resolutions.
* Collaboration and teamwork: Collaborate with cross-functional teams, including data scientists, data analysts, and business stakeholders, to understand data requirements, provide data engineering expertise, and support their data-related needs.
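As an illustration of the kind of pipeline this role owns, here is a minimal PySpark ETL sketch. The source path, column names, and target table are all hypothetical, and the pattern shown (read raw files, cleanse, append to a Delta table) is one common approach rather than BigLynx's actual pipeline.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# On Databricks a SparkSession named `spark` is provided automatically;
# the builder call just makes this sketch self-contained.
spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Extract: read raw order events (path and columns are hypothetical).
raw = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("/mnt/raw/orders/")
)

# Transform: drop rows missing key fields, fix types, add a load date.
clean = (
    raw.dropna(subset=["order_id", "customer_id"])
    .withColumn("order_total", F.col("order_total").cast("double"))
    .withColumn("load_date", F.current_date())
)

# Load: append to a Delta table partitioned by load date.
(
    clean.write.format("delta")
    .mode("append")
    .partitionBy("load_date")
    .saveAsTable("analytics.orders_clean")
)
```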
##### **Qualifications:**
- Databricks expertise: Strong knowledge and hands-on experience with the Databricks platform, including Databricks notebooks, the Databricks runtime, and Databricks clusters.
- Data engineering skills: Proficiency in data engineering principles, ETL processes, data modeling, and data integration techniques. Experience with programming languages such as Python, SQL, or Scala.
- Big data technologies: Experience with big data technologies such as Apache Spark, Apache Hadoop, or related frameworks. Familiarity with distributed computing and data processing concepts.
- Cloud platforms: Experience working with cloud platforms, preferably Azure Databricks, AWS Databricks, or Databricks on Google Cloud. Knowledge of cloud storage, compute, and networking services.
- Database and data warehouse concepts: Understanding of relational databases, data warehousing concepts, and SQL. Familiarity with data warehousing best practices and dimensional modeling.
- Performance optimization: Strong skills in optimizing Spark jobs and queries on Databricks, and the ability to identify and resolve performance bottlenecks (see the sketch after this list).
- Problem-solving skills: Strong analytical and problem-solving abilities to tackle complex data engineering challenges and troubleshoot issues.
- Collaboration and communication: Excellent collaboration and communication skills to work effectively with cross-functional teams and stakeholders, translating business requirements into technical solutions and providing technical guidance.
- Education: A bachelor's or master's degree in computer science, data engineering, or a related field is typically required. Relevant certifications, such as Databricks Certified Data Engineer, are a plus.
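For the performance-optimization qualification above, a hedged sketch of two routine Spark tuning moves on Databricks: broadcasting a small dimension table so a join avoids shuffling the large fact table, and caching an aggregate that several downstream queries reuse. The table and column names are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("spark-tuning").getOrCreate()

# Hypothetical tables: a large fact table and a small dimension table.
orders = spark.table("analytics.orders_clean")
stores = spark.table("analytics.stores_dim")

# Broadcast the small dimension so the join avoids a full shuffle.
joined = orders.join(F.broadcast(stores), on="store_id", how="left")

# Cache an aggregate that several downstream queries will reuse,
# then materialize it once so later reads hit memory.
daily_revenue = (
    joined.groupBy("load_date", "region")
    .agg(F.sum("order_total").alias("revenue"))
    .cache()
)
daily_revenue.count()
```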