Imagine yourself as part of a team stretching the boundaries of cloud computing, adopting the latest technologies, and rapidly resolving unique SaaS incidents. In this role you'll act as one part firefighter and one part super sleuth. On a given day you might discover that a Docker container is down because the daemon is failing and you need to investigate which service is causing the conflict. Or, you might have to investigate an abuse report from AWS and determine if the issue is caused by a hacked server or a customer abusing the platform. Because there's always something new, your technical skills will never become obsolete.
If you're excited about diving into the unknown and taking on novel cloud computing challenges that affect thousands of users, then this is your opportunity to join a global SaaS operations support organization unlike any other. As Amazon's #1 centralized database partner, we are building new possibilities with today's cloud computing technologies to provide services for a rapidly growing network of business customers worldwide.
What you will be doing:
- Providing hands-on cloud computing services working with massive-scale SaaS infrastructure.
- Working as a senior member of a shift-based, 24x7 infrastructure support team for our cloud-based applications.
- Responding to challenging system change requests from the business and handling communications during system outages.
- Configuring and using infrastructure monitoring tools and responding to monitoring alerts to continually improve the stability of the systems.
What you will NOT be doing:
- Interfacing with customers.
- Doing deep investigations on root causes. This job is about fixing WHAT is wrong, not discovering HOW it got that way.
- Coding or testing. Instead, you'll use your coding knowledge to troubleshoot more efficiently.
In this role, you will be responsible for maintaining mission-critical infrastructure in a multibillion-dollar global software enterprise, supporting products used by Fortune 500 companies. Specifically, you will:
- Respond to system outages and monitoring alerts, resolving incidents to ensure system uptime and expected service levels
- Analyze cloud systems issues and provide recommendations for long-term solutions, such as monitoring and automation
- Execute change requests that impact production systems used by our diverse portfolio of software products
- Provide cloud operations support on a rotating on-call schedule as part of a global SaaS operations team
Success in this role depends on understanding the cloud technologies we support and how they interact. We look for candidates who possess the following:
- Bachelor's degree in Computer Science or a related technical field involving coding (preferred)
- 3+ years of experience with Linux and Windows Server operating systems
- 3+ years of experience maintaining SaaS applications on one of the major cloud platforms (Azure, GCP, AWS, IBM Cloud)
- Programming experience, including working with algorithms and data structures
- English language proficiency (written and verbal)