We are looking for a colleague who is passionate about building data platforms, business insights, storytelling and narrative, heavy data lifting, analytics, and, more generally, helping data-driven products come alive.
Data engineering tasks will range from working on third-party integrations and implementing ETL processes to designing data pipelines and data lakes, automating and orchestrating computations, and building data-intensive systems.
If this sounds interesting to you and you do not like to be constrained by a single programming language or tool choice, then chances are we are a good fit for each other.
This position is open in all of our development centers.
Key Responsibilities:
- Take ownership of features and code quality
- Design and implement systems that depend on diverse data sources
- Design and implement data processing pipelines and ETL processes
- Design and implement fault-tolerant workflows
- Automate orchestration and monitoring of job executions
- Understand and advocate the importance of high data accuracy throughout the system
- Spread the culture of maintaining high data quality to support building data-driven products
- Make informed decisions about storage systems when designing and implementing data engineering/warehousing solutions.
Required skills:
- In-depth knowledge of at least one big data processing framework (preferably Spark)
- Knowledge of ETL principles
- Experience with SQL and data warehousing concepts
- Experience with at least one of the following: Scala, Java, or Python (preferably more than one of them)
- Experience with cloud computing and serverless paradigms
- Experience with building data processing pipelines and complex workflows
- Knowledge of Unix-like operating systems
- Experience with Version Control Systems (Git, SVN)
- English language proficiency.
Nice-to-have skills and traits:
- Strong knowledge of relational and non-relational databases
- Experience with streaming technologies (Kafka)
- Experience with workflow scheduling and/or specific job scheduling tools
- Experience with CQRS and event sourcing approaches
- Experience with distributed environments
- Experience with virtualization and containerized applications (Docker, Kubernetes)
- A desire to build valuable data assets and help business decision-makers.