Site Reliability Engineer

Requirements

5+ years of experience as an SRE, with diverse infrastructure experience
Hands-on experience deploying AWS services in production environments
Experience with Docker containers
Experience in architecting, supporting, and deploying large-scale systems from scratch
Experience with CI/CD pipelines (e.g. GitLab CI, GitHub Actions)
Thorough understanding of the Software Development Lifecycle (SDLC)
Familiarity with massive-scale data ingestion and messaging systems
Curiosity and a proactive approach to problem-solving and learning new technologies
Ability to work collaboratively in a remote-first environment
Security-conscious mindset and up-to-date knowledge of security standards
Strong commitment to documentation
Strong communication skills in both spoken and written English

Startup experience
Strong coding skills in Python or Ruby and scripting in Shell (Bash)
Experience with Infrastructure as Code tools (e.g. Terraform, Ansible, Packer)
Bachelor's Degree in Computer Engineering, Computer Science, or equivalent

Infrastructure Management (80%)

Ensure the reliability, availability, and performance of applications and systems
Design, build, and maintain scalable and efficient infrastructure using AWS services
Deploy Docker containers in production environments
Utilize Infrastructure as Code (IaC) tools such as Terraform, Ansible, and Packer
Develop and manage complex CI/CD pipelines using tools such as GitLab CI or GitHub Actions
Code in Python or Ruby and script in Shell (Bash) for automation and integration tasks
Architect, support, and deploy large-scale systems from scratch
Implement best practices for massive-scale data ingestion and messaging systems

Collaboration and Documentation (20%)

Work closely with developers and operations teams, fostering a true DevOps culture
Document infrastructure and processes, ensuring information is centralized in the knowledge base
Communicate clearly in written form, using Slack, tickets, and documentation
Stay updated on security best practices and ensure the infrastructure is secure
Encourage collaboration and empower colleagues by sharing knowledge and feedback

This job is closed

LATAM · 100% Remote · Full-time (40h) · 5+ years