As a full-service data center with a global footprint, phoenixNAP delivers IT infrastructure solutions for medium-sized businesses and small enterprises to support their evolving digital needs. It makes enterprise-grade technology available on a pay-per-use model for greater cost savings and offers customizable solutions to meet even the most sophisticated needs.
Bare Metal Cloud is a cloud native-ready dedicated platform built to enable automated provisioning of physical servers. Billed on an hourly or monthly basis, BMC is a cost-efficient solution that facilitates building, deploying, and scaling apps from startup to enterprise.
Architected for DevOps teams, Bare Metal Cloud is built on the same principles. Our teams rely on automation to accelerate sprint cycles and ensure error-free deployment builds. Understanding the needs of modern IT teams, we created a solution to help code, test, and release new features faster.
As the Site Reliability Engineer, you will be a part of the Core Team responsible for building and managing the Platform that powers all the most innovative phoenixNAP products and services.
Using Cloud Native technologies and modern development practices such as Continuous Deployment and Infrastructure as Code, your mission will be to enable automation and constantly improve visibility, availability, security, and performance of our systems. You will manage the entire infrastructure, including several Kubernetes clusters, relational (MySQL and PostgreSQL) and document (MongoDB) databases, and log aggregation and monitoring solutions, distributed geographically across several data centres all over the world.
You will join a dynamic team formed by DevOps and Site Reliability Engineers fuelled by passion for modern technologies and innovation. You will have a direct impact on the future of the technological stack used by the Company. You will bring in your expertise in agile software development, DevOps automation, and CI/CD in cloud-based environments.
In this role, you will collaborate closely with Cloud, Network, System, and Software Architects, the Scrum Master, and the Product Owner to support several other teams that develop API-first products in an iterative and collaborative manner.
Key Job Responsibilities
- Build and manage the Platform infrastructure using Infrastructure as Code (IaC) with an "automate everything" mindset and a keen eye on security
- Build and operate multiple on-premises Kubernetes clusters distributed across several geographical locations
- Be responsible for the availability, performance, and security of these clusters while offering the necessary access to Product teams and Developers
- Build and manage several databases, middleware, and monitoring solutions
- Perform backups and design and implement disaster recovery solutions
- Be primarily responsible for all our CI/CD environments and pipelines, ensuring performance and optimizing resource usage
- Perform other system administrative tasks on Linux-based systems
- Build and maintain automation scripts as part of client feature delivery
- Research new technology and build proofs of concept
Requirements
- Experience with Kubernetes and related Cloud Native technologies such as container engines (Docker, containerd) and service meshes (Istio or Linkerd)
- Extensive experience in Infrastructure as Code (IaC) with tools such as Terraform and Ansible (or equivalent)
- Extensive experience in CI/CD with tools such as GitLab CI (or equivalent) and Git
- Proficiency in Linux system administration
- Experience with agile development and DevOps methodology
- Good understanding of virtualization technology
- Track record of successful research and proofs of concept with new technology
- Good communication, collaboration, and problem-solving skills
This is an exciting opportunity to work with a highly innovative and creative team, in a great working environment using the latest technologies, methodologies, and frameworks.