Data Engineer

  • Closed
  • US Company | Small
  • LATAM (100% remote)
  • 3+ years
  • Long-term (40h)
  • Healthcare
  • Fully remote

Required skills

  • SQL
  • BigQuery
  • Dataflow
  • Python
  • dbt
  • Dataform
  • Google Cloud Platform

Requirements

Must-haves

  • 3-5 years of data engineering experience
  • Experience with the GCP ecosystem (BigQuery, Dataflow, Pub/Sub, Cloud Functions)
  • Proficiency with dbt or Dataform
  • Proficiency with Python for data manipulation and automation
  • Ability to write and optimize advanced SQL queries for large-scale datasets
  • Deep knowledge of dimensional modeling (star/snowflake schemas); a query sketch follows this list
  • Deep knowledge of modern variations such as Data Vault or wide tables for ML
  • Strong communication skills in both spoken and written English
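
A minimal illustration of the SQL and dimensional-modeling bar above: a star-schema query (one fact table joined to two dimensions) run through the BigQuery Python client. The project, dataset, table, and column names are hypothetical, not taken from this posting.

    # Hypothetical star schema: fact_claims joined to dim_date and dim_provider.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses application-default credentials

    QUERY = """
    SELECT
      d.calendar_date,
      p.provider_name,
      SUM(f.claim_amount) AS total_claims
    FROM `my-project.warehouse.fact_claims` AS f
    JOIN `my-project.warehouse.dim_date` AS d USING (date_key)
    JOIN `my-project.warehouse.dim_provider` AS p USING (provider_key)
    GROUP BY d.calendar_date, p.provider_name
    ORDER BY total_claims DESC
    LIMIT 10
    """

    for row in client.query(QUERY).result():
        print(row.calendar_date, row.provider_name, row.total_claims)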

Nice-to-haves

  • Startup experience
  • Experience with CI/CD pipelines, Terraform and Infrastructure as Code, Docker, and Dev Containers
  • Interest in MLOps and model deployment using Vertex AI or feature stores
  • Ability to contribute code in Golang, Java, or Kotlin
  • Experience with OpenLineage, Nix flakes, uv, or Poetry
  • Bachelor’s Degree in Computer Engineering, Computer Science, or equivalent

What you will work on

  • Lead the evolution and continuous improvement of a cloud-based data platform
  • Architect scalable data solutions on Google Cloud to support machine learning workflows and self-service BI
  • Own and optimize the BigQuery environment with a focus on performance, cost efficiency, and analytics usability

Data architecture and pipeline ownership

  • Design, build, and maintain scalable ETL and ELT pipelines using GCP-native services
  • Develop and optimize streaming and batch pipelines with Dataflow (Apache Beam) and Pub/Sub; a sketch follows this list
  • Automate and orchestrate data workflows using Airflow (Cloud Composer), SQLMesh, or Temporal
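
A minimal streaming sketch with the Apache Beam Python SDK, reading from Pub/Sub and appending windowed per-user counts to BigQuery. The topic, table, and event schema are assumptions for illustration; a production pipeline would run on the DataflowRunner.

    import json

    import apache_beam as beam
    from apache_beam import window
    from apache_beam.options.pipeline_options import PipelineOptions

    opts = PipelineOptions(streaming=True)  # add --runner=DataflowRunner in production

    with beam.Pipeline(options=opts) as p:
        (
            p
            | "ReadEvents" >> beam.io.ReadFromPubSub(
                topic="projects/my-project/topics/events")  # hypothetical topic
            | "Parse" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
            | "KeyByUser" >> beam.Map(lambda event: (event["user_id"], 1))
            | "Window" >> beam.WindowInto(window.FixedWindows(60))  # 1-minute windows
            | "CountPerUser" >> beam.CombinePerKey(sum)
            | "Format" >> beam.Map(lambda kv: {"user_id": kv[0], "events": kv[1]})
            | "Write" >> beam.io.WriteToBigQuery(
                "my-project:analytics.user_event_counts",  # table assumed to exist
                create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )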

Performance and cost optimization

  • Implement advanced BigQuery strategies including partitioning, clustering, and materialized views; a sketch follows this list
  • Monitor and reduce cloud costs through SQL optimization and use of BigQuery BI Engine
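
A hedged sketch of the partitioning and clustering strategy: create a day-partitioned table clustered on its hot filter columns, then filter on the partition column so BigQuery prunes the bytes it scans. All names are illustrative.

    from google.cloud import bigquery

    client = bigquery.Client()

    # Day-partitioned, clustered on the columns most often used in filters.
    client.query("""
    CREATE TABLE IF NOT EXISTS `my-project.analytics.events` (
      event_ts   TIMESTAMP,
      user_id    STRING,
      event_type STRING
    )
    PARTITION BY DATE(event_ts)
    CLUSTER BY user_id, event_type
    """).result()

    # Filtering on the partition column limits the scan to ~7 daily partitions.
    job = client.query("""
    SELECT event_type, COUNT(*) AS n
    FROM `my-project.analytics.events`
    WHERE event_ts >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
    GROUP BY event_type
    """)
    job.result()
    print("bytes processed:", job.total_bytes_processed)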

Data governance and quality engineering

  • Implement automated data quality checks including schema validation, deduplication, and anomaly detection; a sketch follows this list
  • Design row-level security and column-level encryption to meet HIPAA-level compliance requirements
  • Maintain metadata, lineage, and data catalogs using OpenLineage or Dataplex
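
One way the automated quality checks might look: SQL assertions driven from Python, where each query returns a failure count that should be zero. The table and column names are hypothetical.

    from google.cloud import bigquery

    client = bigquery.Client()

    # Each check returns one row with a `failures` count; 0 means the check passes.
    CHECKS = {
        "unique_claim_id": """
            SELECT COUNT(*) - COUNT(DISTINCT claim_id) AS failures
            FROM `my-project.warehouse.fact_claims`
        """,
        "no_null_member_id": """
            SELECT COUNT(*) AS failures
            FROM `my-project.warehouse.fact_claims`
            WHERE member_id IS NULL
        """,
    }

    failed = []
    for name, sql in CHECKS.items():
        [row] = client.query(sql).result()  # each check yields exactly one row
        status = "OK" if row.failures == 0 else f"FAILED ({row.failures} rows)"
        print(f"{name}: {status}")
        if row.failures:
            failed.append(name)

    if failed:
        raise RuntimeError(f"data quality checks failed: {failed}")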

Strategic collaboration and enablement

  • Partner with data analysts and product teams to ensure clean, analytics-ready data at ingestion
  • Support high-performance reporting for BI tools (Tableau, Looker)
  • Mentor junior team members and contribute to shared best practices for SQL and Python