MentorMate creates durable technical solutions that deliver digital transformation at scale by blending strategic insights and thoughtful design with brilliant engineering. The company has completed over 1,500 projects and has global technological hubs in Europe and North and South America. With mature and established practices in enterprise web and mobile development, quality engineering, technical architecture, human-centered design, cloud, DevOps, data, and analytics, we provide contract-based career opportunities, competitive pay, and flexibility.
About the role
We are looking to hire a Senior Site Reliability Engineer to join an innovative project in the healthcare domain for one of the largest pharmaceutical companies in the world. The focus of the role is to enhance the reliability and performance of applications with automation. We regularly refine our methods of work, tools, and technologies. We value independence, natural curiosity, ownership, and the desire for constant improvement.
About the team
The seasoned experts in our cloud & DevOps team provide secure, flexible, and time-efficient solutions to our clients, whether shortening the development runway, modernizing legacy systems, or taking a more cost-effective approach to technology with the cloud. Our cloud & DevOps team's projects are HIPAA, PCI DSS, or SOC 2 compliant and are based on a well-architected framework focusing on security, sustainability, and scalability.
Responsibilities
- You will co-own critical production service designs to ensure high reliability is achievable and measurable
- You’ll drive reliability and observability improvements in the services within the engineering verticals
- Using monitoring and telemetry data, you’ll help teams make informed decisions on where reliability challenges may exist and help design and build solutions to improve them
- You’ll build and improve internal tools and automation software to make maintaining production services easier and safer
- You’ll champion and lead reliability-focused practices such as Failure Analysis, Load and Capacity Planning, Service Reviews, Architecture Designs, Incident Postmortems, and others
- You will build SRE dashboards from SLIs to measure SLO adherence
- You’ll define, from design through implementation details, the necessary auto-healing and fault-tolerant systems
- You’ll act as the point of contact for production application issues, working closely with engineering leadership
Requirements
- 4+ years of experience in an Infrastructure, SRE, DevOps, or CloudOps role, or 4+ years of experience in System Administration
- Experience programming in one or more of the following: C#, Java, Python, .NET, Node.js, Go, or a similar programming language, or with automation tools such as Terraform or Ansible
- Experience with at least one cloud platform, such as AWS, and its core services (compute, containers, storage, etc.)
- Experience with performant cloud-based microservices and event-driven architectures
- Understanding of information security concepts and terminology
- Distributed monitoring experience: logging, metrics, tracing, etc.
- Strong knowledge of software development methodologies and passion for creating high-standard tool sets for infrastructure-as-code
- Ability to analyze problems quickly and find suitable solutions based on available resources
- A proactive and open-minded attitude, with a clear client focus and a structured approach
- Very good command of English
Why take this opportunity
- Remote Work Model: Freedom to work remotely with a globally-minded team
- Global Tech Community: Work in an experienced team using the latest technologies
- Exciting Career Prospects: Enterprise projects that set standards and save lives
- Competitive Pay: Feel satisfied with the negotiated terms
- No Intermediaries: Direct communication with our teams