Senior DevOps Engineer - AWS, Python - Advertising Services market

Experience: 5+ years
Engagement: Long-term (40h)
Industry: Advertising Services
Location: Full Remote
Skills: AWS, Python, Terraform, Docker, CI/CD

Requirements

Must-haves

  • 5+ years of DevOps or cloud infrastructure experience
  • Experience with AWS services and tooling (e.g. CDK, Lambda, EC2, S3, SageMaker, CloudWatch)
  • Proficiency with Python for scripting and infrastructure automation
  • Experience with Infrastructure as Code (e.g. Terraform, CloudFormation)
  • Hands-on experience with Docker
  • Experience with CI/CD pipeline creation and maintenance
  • Strong communication skills in both spoken and written English

Nice-to-haves

  • Startup experience
  • AWS certifications (e.g. DevOps Engineer, Solutions Architect, Machine Learning Specialty)
  • Background in software engineering or ML/AI infrastructure
  • Bachelor's degree in Computer Engineering, Computer Science, or an equivalent field

What you will work on

  • Develop and manage scalable infrastructure and deployment workflows in AWS for data and machine learning applications
  • Build cloud-native systems with a focus on infrastructure as code, containerization, and CI/CD automation
  • Author infrastructure with AWS CDK, applying strong proficiency in AWS services and Python
  • Support ML workflows by integrating services like SageMaker and contributing to model operations infrastructure

Infrastructure Development & Automation:

  • Design, provision, and manage infrastructure in AWS using CDK and CloudFormation
  • Build secure, scalable, and cost-effective environments for machine learning and analytics workloads
  • Operate cloud-native services (e.g. EC2, ECS, Lambda, S3, RDS, SageMaker, Bedrock)
  • Apply best practices for security, compliance, and disaster recovery

CI/CD & Deployment Automation:

  • Design and maintain deployment pipelines using CodePipeline, CodeBuild, GitHub Actions, or similar
  • Automate testing, deployment, and rollback processes

Containerization & Orchestration:

  • Build and manage containerized applications using Docker
  • Deploy services on ECS, or on Lambda using container-based runtimes
  • Set up image build, versioning, and artifact management workflows

Machine Learning & Model Operations Support:

  • Collaborate with ML engineers to deploy and maintain models in SageMaker
  • Integrate pipelines for pre-processing, inference, and model retraining
  • Monitor model performance through logs and metrics

Monitoring, Observability & Logging:

  • Set up alerting and observability tools (e.g. CloudWatch, Datadog)
  • Investigate and resolve infrastructure, deployment, and performance issues

Collaboration & Documentation:

  • Partner with ML, software, and data teams to support DevOps practices
  • Maintain documentation for infrastructure and operational workflows
  • Participate in architecture discussions and code reviews