MLOps Engineer - AWS, ML Infrastructure - Advertising Services market

5+ years
Long-term (40h)
Advertising Services
Full Remote
AWS
ML Infrastructure
Python
Terraform
CI/CD
Docker

Requirements

Must-haves

  • 5+ years of DevOps, MLOps, or Cloud Infrastructure Engineering experience
  • Experience with AWS services (CDK, Lambda, EC2, S3, SageMaker, CloudWatch)
  • Proficiency with Infrastructure as Code (IaC) tools (Terraform, CloudFormation)
  • Strong experience with Python for scripting and automation
  • Proficiency with containerization using Docker
  • Experience building and maintaining CI/CD pipelines for ML workflows
  • Deep knowledge of ML model lifecycle management, including deployment, monitoring, and retraining
  • Based in Brazil, Argentina, Paraguay, Colombia, or Mexico
  • Strong communication skills in both spoken and written English

Nice-to-haves

  • Startup experience
  • AWS Certifications (e.g. DevOps Engineer, Solutions Architect, Machine Learning Specialty)
  • Background in software engineering or ML/AI infrastructure
  • Bachelor’s degree in Computer Engineering, Computer Science, or an equivalent field

What you will work on

ML Infrastructure Architecture & Automation

  • Design, provision, and manage AWS infrastructure for ML workloads using AWS CDK and CloudFormation
  • Architect secure, scalable, and cost-efficient ML environments for experimentation, training, and inference
  • Implement cloud-native services (e.g. EC2, ECS, Lambda, S3, RDS, SageMaker, Bedrock, Step Functions)
  • Apply best practices for security, compliance, and disaster recovery in ML infrastructure

Model Deployment & CI/CD

  • Design and maintain CI/CD pipelines for training, deployment, and retraining of models using CodePipeline, CodeBuild, GitHub Actions, or similar
  • Automate testing, versioning, and rollback strategies for applications and ML models
  • Build and manage Docker containers for microservices and ML applications

MLOps Enablement

  • Collaborate with ML engineers to deploy, monitor, and maintain models in SageMaker
  • Develop end-to-end pipelines for data preprocessing, feature engineering, training, inference, and retraining
  • Integrate model monitoring, drift detection, and automated retraining triggers

Monitoring, Observability & Performance

  • Implement observability frameworks for ML workloads using CloudWatch, Datadog, and other tools
  • Track inference latency, accuracy, and resource usage to optimize performance
  • Troubleshoot production ML systems and lead incident resolution

Collaboration & Documentation

  • Partner with software, ML, and data teams to promote MLOps best practices
  • Maintain clear documentation for infrastructure, deployments, and operational processes
  • Contribute to code reviews and architectural discussions
Interested in this job? We're still accepting applications for this position