Insite AI: Senior Data Engineer - Python, PySpark, Azure, Databricks

5+ years
Full Remote
Python
PySpark
Azure
Databricks

Requirements

Must-haves

- 5+ years of data engineering experience
- Proficiency in Python and PySpark
- Azure experience
- Databricks experience
- Experience with B2B solutions in areas such as CPG, retail, or supply chain
- Experience designing and deploying large-scale data management systems
- Experience with optimization and AI-enabled industries
- Deep understanding of big data technologies and platforms
- Ability to work collaboratively in a cross-functional team environment
- Strong communication skills in both spoken and written English
- Bachelor's Degree in Computer Engineering, Computer Science, or equivalent

Nice-to-haves

- Data Science background
- Experience with data warehousing and ETL technologies (e.g. Airflow, Redshift, Snowflake)
- AWS and/or GCP experience
- Startup experience

What you will work on

- Create and maintain data integration, ETL pipelines, and data warehouse structures
- Develop scalable, available, fault-tolerant data management systems that support AI models and analytics applications
- Test and validate data so that downstream users and processes can confidently depend on the results
- Implement robust and adaptable data pipelines that support immediate decision-making
- Develop reusable documentation to facilitate knowledge transfer among teams
- Collaborate with cross-functional teams, including Data Scientists, Software Engineers, Product Managers, and Customer Success, to identify and solve problems concerning data quality, ingestion, and integration
- Help drive innovation and best practices
- Advise on tools to improve the performance, scalability, and security of the data infrastructure