About the role

We are looking for an SRE / DevOps Engineer – Observability & Cloud to help design, build, and operate scalable observability and monitoring platforms across infrastructure, network services, and applications.

You will work closely with DevOps, platform, and engineering teams to improve system reliability, performance, and operational visibility through modern monitoring, logging, and telemetry solutions.

Key Responsibilities:

Design and maintain observability platforms for metrics, logs, and distributed tracing
Build telemetry pipelines to collect data from infrastructure, network devices, and applications
Develop dashboards and alerts for infrastructure and service reliability
Implement centralized logging solutions and log ingestion pipelines
Support application performance monitoring (APM) and distributed tracing
Implement synthetic monitoring to proactively detect service degradation
Collaborate with engineering teams to implement instrumentation and monitoring best practices
Automate infrastructure and monitoring configurations using infrastructure-as-code
Support incident investigation and root cause analysis

Requirements:

3+ years of experience in SRE, DevOps, Platform Engineering, or Infrastructure Engineering
Experience with observability tools such as Datadog, Splunk, Prometheus, or Grafana
Experience with cloud platforms (AWS, Azure, or GCP)
Experience with containerized environments such as Kubernetes
Experience with automation tools such as Terraform or Ansible
Scripting experience (Python, Bash, or similar)

Nice to have:

Experience with OpenTelemetry
Experience with distributed tracing
Familiarity with reliability concepts such as SLIs and SLOs

Thank you for your interest in this position. Please note that only candidates whose qualifications closely match our requirements will be contacted.

SRE / DevOps Engineer – Observability & Cloud

SkyHighGrowth Inc.
