Crowds, Big Data, and Tools Team
We are a growing company with a big vision: Build the world’s most intelligent sales and marketing platform while we help communities around the world.
We are building machine learning engines that leverage a vast crowd of human researchers to provide the best data available. Our diverse development team works with cutting-edge technologies in messaging, visualization, crowdsourcing, machine learning, and natural language processing. At the same time, we execute on a not-so-secret social mission: attacking poverty at its core in communities around the world by creating thousands of fairly paid jobs in data curation.
As a LeadGenius Data Engineer, you will help extend our data collection, analytics, and machine learning platform to build a data-driven marketing engine for our customers. Your work at LeadGenius matters. Your big data platform skills will support our core strategic objectives and leverage our crowd resources to help customers. You will be responsible for data aggregation.
Senior Data Engineer
Key Responsibilities:
- Define, build and manage our market intelligence warehouse, which combines large sets of public data with our own proprietary data.
- Participate in the analysis and design of our data stores and data analytics.
- Incorporate human and mechanical feedback loops into our machine learning engines to continually improve our data and automated decision making.
- Make large, complex data sets available for data science, product development and analytics.
- Refine and manage our persistent data build and refinement process.
- Build new analytics and data products.
- Support data acquisition and custom analytic projects led by our Chief Scientist.
- Have fun working with a great team in an exceptional corporate culture.
- Assess data quality and write integration tests for the data processing pipeline.
Qualifications:
- 3 years of experience in the practical implementation and deployment of NoSQL data technologies such as Cassandra or Druid.
- 3 years of experience with relational databases and data modeling.
- Skills in the following technologies will help you hit the ground running: Spark, Elasticsearch, AWS, Bash, Airflow.
- Experience programming with Python or Scala.
- Experience in the design, practical implementation and deployment of data pipelines and RESTful APIs.
- Experience in entity resolution. Experience with graph databases and graph queries is a big plus.
- Master's degree in Computer Science, Engineering, or a similar field, and/or relevant work experience.
Location:
Work anywhere in the world as a contractor to LeadGenius. You’ll work remotely and set your own hours -- all you’ll need is a fast internet connection. You will work with the Operations Team and interact through chats, emails and calls. Meetings will be held weekly via web conferencing.
Deadline for applications: 27.12.2016.