Software Development

Senior Site Reliability Engineer (Mexico)

Guadalajara, Jalisco
Work Type: Full Time

Job Title: Senior Site Reliability Engineer (SRE)

Experience: 5+ years

Location: Mexico/LATAM

Engagement Type: Full-Time/Contractual, Fully Remote

Job Description:

We are seeking a skilled Senior Site Reliability Engineer (SRE) to join our offshore team. In this role, you will be responsible for ensuring the reliability, performance, and scalability of our critical systems. You'll develop automation, build monitoring solutions, lead incident response, and work closely with engineering teams to implement infrastructure as code, CI/CD, and cloud-native tools. 


Job Responsibilities:

  • Maintain the reliability, availability, and performance of critical systems

  • Develop and maintain automation scripts and tools to streamline operations

  • Develop and maintain monitoring dashboards and alerts

  • Lead incident response, conduct post-mortem analysis, and implement preventative measures

  • Optimize system performance and scalability

  • Implement and maintain security best practices

  • Create and maintain comprehensive system and process documentation

  • Participate in on-call rotations for 24/7 critical system support

Must Haves:


  • Kubernetes (hands-on experience) – managing and deploying workloads

  • AWS Cloud Platform – deep understanding and production experience

  • Infrastructure as Code (IaC) – using tools like Terraform (or CloudFormation/Ansible)

  • Scripting/Programming – Proficiency in Python or Go

  • Monitoring & Alerting – Experience with Prometheus, Grafana

  • CI/CD Pipelines – Jenkins, GitLab CI, or similar

  • Incident Management – Proven experience in responding to and analyzing outages

  • Linux Systems & Networking – Strong fundamentals


Good to Haves:


  • ArgoCD, Linkerd, Karpenter, or other Kubernetes-related tools

  • Logging tools – Loki, ELK Stack

  • Security best practices – Cloud and container security knowledge

  • Leadership/Mentorship – Experience guiding junior engineers

  • Post-mortem writing & RCA – Comfortable documenting incidents and learnings

  • Experience in distributed systems or high-availability architectures


Recruitment Process:

  • AI-based online screening test

  • Assignment

  • 2 client interviews

  • CEO Discussion

  • Offer: Successful candidates will receive an offer to join the team. 


Soft Skills:

  • Excellent verbal and written communication skills in English (required)

  • Strong problem-solving ability with a customer-first mindset

  • Accountability – Takes ownership of reliability and incident outcomes

  • Demonstrated ability to work independently in high-pressure, multitasking environments

  • Passion for supporting and helping others


About Us: 

We at Think Future Technologies (TFT) provide technology services to our customers, enabling them to achieve superior business outcomes. We act as a trusted partner that fully owns the technology side of the engagement. We work through our customers' business problems with them, arrive at the right solution framework, deploy the right blend of technical resources, and deliver at every step of project implementation.

Candidate Source:
Direct Source
 
Experience Level:
5+ Years
 
