Data Engineer

  • Closed
  • US Company | Small
  • LATAM (100% remote)
  • 3+ years
  • Long-term (40h)
  • Healthcare
  • Fully remote

Required skills

  • SQL
  • BigQuery
  • Dataflow
  • Python
  • dbt
  • Dataform
  • Google Cloud Platform

Requirements

Must-haves

  • 3-5 years of data engineering experience
  • Experience with the GCP ecosystem (BigQuery, Dataflow, Pub/Sub, Cloud Functions)
  • Proficiency with dbt or Dataform
  • Proficiency with Python for data manipulation and automation
  • Ability to write and optimize advanced SQL queries for large-scale datasets
  • Deep knowledge of dimensional modeling (star/snowflake schemas); a query sketch follows this list
  • Deep knowledge of modern variations such as Data Vault or wide tables for ML
  • Strong communication skills in both spoken and written English
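
A minimal illustration of the SQL and dimensional-modeling bar above: a star-schema query (one fact table joined to two dimensions) run through the BigQuery Python client. The project, dataset, table, and column names are hypothetical, not taken from this posting.

    # Hypothetical star schema: fact_claims joined to dim_date and dim_provider.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses application-default credentials

    QUERY = """
    SELECT
      d.calendar_date,
      p.provider_name,
      SUM(f.claim_amount) AS total_claims
    FROM `my-project.warehouse.fact_claims` AS f
    JOIN `my-project.warehouse.dim_date` AS d USING (date_key)
    JOIN `my-project.warehouse.dim_provider` AS p USING (provider_key)
    GROUP BY d.calendar_date, p.provider_name
    ORDER BY total_claims DESC
    LIMIT 10
    """

    for row in client.query(QUERY).result():
        print(row.calendar_date, row.provider_name, row.total_claims)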

Nice-to-haves

  • Startup experience
  • Experience with CI/CD pipelines, Terraform and Infrastructure as Code, Docker, and Dev Containers
  • Interest in MLOps and model deployment using Vertex AI or feature stores
  • Ability to contribute code in Golang, Java, or Kotlin
  • Experience with OpenLineage, Nix flakes, uv, or Poetry
  • Bachelor’s Degree in Computer Engineering, Computer Science, or equivalent

What you will work on

  • Lead the evolution and continuous improvement of a cloud-based data platform
  • Architect scalable data solutions on Google Cloud to support machine learning workflows and self-service BI
  • Own and optimize the BigQuery environment with a focus on performance, cost efficiency, and analytics usability

Data architecture and pipeline ownership

  • Design, build, and maintain scalable ETL and ELT pipelines using GCP-native services
  • Develop and optimize streaming and batch pipelines with Dataflow (Apache Beam) and Pub/Sub; a sketch follows this list
  • Automate and orchestrate data workflows using Airflow (Cloud Composer), SQLMesh, or Temporal
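
A minimal streaming sketch with the Apache Beam Python SDK, reading from Pub/Sub and appending windowed per-user counts to BigQuery. The topic, table, and event schema are assumptions for illustration; a production pipeline would run on the DataflowRunner.

    import json

    import apache_beam as beam
    from apache_beam import window
    from apache_beam.options.pipeline_options import PipelineOptions

    opts = PipelineOptions(streaming=True)  # add --runner=DataflowRunner in production

    with beam.Pipeline(options=opts) as p:
        (
            p
            | "ReadEvents" >> beam.io.ReadFromPubSub(
                topic="projects/my-project/topics/events")  # hypothetical topic
            | "Parse" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
            | "KeyByUser" >> beam.Map(lambda event: (event["user_id"], 1))
            | "Window" >> beam.WindowInto(window.FixedWindows(60))  # 1-minute windows
            | "CountPerUser" >> beam.CombinePerKey(sum)
            | "Format" >> beam.Map(lambda kv: {"user_id": kv[0], "events": kv[1]})
            | "Write" >> beam.io.WriteToBigQuery(
                "my-project:analytics.user_event_counts",  # table assumed to exist
                create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )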

Performance and cost optimization

  • Implement advanced BigQuery strategies including partitioning, clustering, and materialized views; a sketch follows this list
  • Monitor and reduce cloud costs through SQL optimization and use of BigQuery BI Engine
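
A hedged sketch of the partitioning and clustering strategy: create a day-partitioned table clustered on its hot filter columns, then filter on the partition column so BigQuery prunes the bytes it scans. All names are illustrative.

    from google.cloud import bigquery

    client = bigquery.Client()

    # Day-partitioned, clustered on the columns most often used in filters.
    client.query("""
    CREATE TABLE IF NOT EXISTS `my-project.analytics.events` (
      event_ts   TIMESTAMP,
      user_id    STRING,
      event_type STRING
    )
    PARTITION BY DATE(event_ts)
    CLUSTER BY user_id, event_type
    """).result()

    # Filtering on the partition column limits the scan to ~7 daily partitions.
    job = client.query("""
    SELECT event_type, COUNT(*) AS n
    FROM `my-project.analytics.events`
    WHERE event_ts >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
    GROUP BY event_type
    """)
    job.result()
    print("bytes processed:", job.total_bytes_processed)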

Data governance and quality engineering

  • Implement automated data quality checks including schema validation, deduplication, and anomaly detection; a sketch follows this list
  • Design row-level security and column-level encryption to meet HIPAA-level compliance requirements
  • Maintain metadata, lineage, and data catalogs using OpenLineage or Dataplex
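
One way the automated quality checks might look: SQL assertions driven from Python, where each query returns a failure count that should be zero. The table and column names are hypothetical.

    from google.cloud import bigquery

    client = bigquery.Client()

    # Each check returns one row with a `failures` count; 0 means the check passes.
    CHECKS = {
        "unique_claim_id": """
            SELECT COUNT(*) - COUNT(DISTINCT claim_id) AS failures
            FROM `my-project.warehouse.fact_claims`
        """,
        "no_null_member_id": """
            SELECT COUNT(*) AS failures
            FROM `my-project.warehouse.fact_claims`
            WHERE member_id IS NULL
        """,
    }

    failed = []
    for name, sql in CHECKS.items():
        [row] = client.query(sql).result()  # each check yields exactly one row
        status = "OK" if row.failures == 0 else f"FAILED ({row.failures} rows)"
        print(f"{name}: {status}")
        if row.failures:
            failed.append(name)

    if failed:
        raise RuntimeError(f"data quality checks failed: {failed}")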

Strategic collaboration and enablement

  • Partner with data analysts and product teams to ensure clean, analytics-ready data at ingestion
  • Support high-performance reporting for BI tools (Tableau, Looker)
  • Mentor junior team members and contribute to shared best practices for SQL and Python