Data Curation Engineer
Belgrade
Seven Bridges is connecting the world’s biomedical information to accelerate research and improve health. We are collaborating with a wide variety of distinguished pharmaceutical, health-care, and academic partners. A central part of our mission is to help these partners investigate and treat all kinds of genetic disorders.
At Seven Bridges we are building the most advanced cloud computing platform for genomics data analysis. Our team and product enable scientists to analyze genomic data faster and more efficiently than ever, so they can focus on making progress in genomics and personalized medicine.
As a member of our Engineering department, you will have the opportunity to extend our platform by building innovative tools and products, and to take part in designing new and custom solutions. You will work on a subset of our 80+ production services and tools and be involved in their design, implementation, integration, and unit testing, as well as their deployment and operation. You will also understand and analyze various biomedical datasets on one end, and design and implement the data ingress pipeline for the datasets used on the platform on the other.
Main responsibilities:
- Handle all activities related to importing various datasets to the Seven Bridges Platform, which includes being able to:
- Understand the origin of data, both in terms of vocabulary and technical aspects of representing a dataset (established domain vocabularies, data formats, access policies, etc.).
- Interpret data and advise on devising the best structure for the data that is imported to the Seven Bridges Platform.
- Parse the source data and transform it into the resulting data model so that it can be used on the Seven Bridges Platform.
- Manage data (follow changes) throughout its lifecycle, from creation and initial storage to the time it is archived for posterity or becomes obsolete and is therefore deleted.
- Keep track of industry trends and update or adapt data to lay the groundwork for new features.
- Understand different datasets that cover different areas of research and advise on the most beneficial ones to import to the Seven Bridges Platform.
- Provide support for the available datasets on the Seven Bridges Platform.
An ideal candidate would:
- Have an understanding of and experience in analyzing biomedical data.
- Be proficient in Python and/or Java, data extraction, analysis and ETL.
- Have a passion for information retrieval problems and solutions.
- Have basic knowledge of different database systems, REST APIs, and various data formats such as XML, JSON, CSV, and TSV.
It would be great if you:
- Have formal education in Medicine, Pharmacy, Molecular Biology, or a related scientific field.
- Have basic knowledge of graph databases, graph algorithms and inference engines.
- Feel comfortable with at least one major *NIX platform (Linux, macOS, FreeBSD, etc.).
- Are familiar with the Semantic Web technology stack: Ontologies, RDF, SPARQL, Linked Data.
- Have experience with interactive visualization and infographic development.
We value our team more than anything else. We like to learn from each other and share knowledge. Our engineering team is built upon a culture of initiative and openness, where we embrace open discussion about potential technical solutions and genuine curiosity. We value expertise, integrity, accountability, and patience.
If you would like to join our engineering team and help us build the next generation of genomics, please send us your resume/CV and a cover letter. Thank you for your interest in Seven Bridges!
Deadline for applications: 07.04.2019.