- Designing data schemas/stores and data pipelines for storing and processing different kinds of structured and unstructured data sets, such as transaction data, review data, video feeds, process data, email data, and social feed data
- Executing batch jobs on our custom-built computing cluster, on standard ETL tools, or using custom code in SQL, Java, or Python
- Working closely with the data science team to integrate models that tag data in our data store against a KPI or raw data
- Working closely with our backend engineering team to build a robust suite of libraries for extracting and manipulating data for our apps
- Adding new IoT data connectors to the Byte Prophecy core platform by researching third-party protocols and standards
- Creating libraries for data quality assurance and data sanity checks
- Experience with SQL (MySQL), columnar (MariaDB/InfiniDB), NoSQL (Cassandra), and big data (Hadoop) data stores
- Familiarity with programming best practices, design patterns, and version control systems
- A sound understanding of parallel/distributed programming
- Willingness to go the extra mile to complete assigned work
- The ability to work effectively with people from a variety of backgrounds
Location – Ahmedabad
Experience – 1–5 years
Register your interest by sharing your updated resume with email@example.com
Tagged as: #cassandra, #python, #spark, #java, #scala