Associate Product Architect – Data Engineer


RESPONSIBILITIES

  • Design data pipelines from the requirements, incorporating optimization techniques suited to the sources involved and the data volume
  • Understanding of storage architectures such as Data Warehouses, Data Lakes, and Lakehouses
  • Decide the tech stack and development standards, propose technical solutions and architectural patterns, and recommend best practices for big data solutions
  • Provide thought leadership and mentoring to the data engineering team on storing and processing data more efficiently at scale
  • Ensure adherence to security and compliance policies for the products
  • Stay up to date with evolving cloud technologies and development best practices, including open-source software
  • Work in an Agile environment, deliver optimized solutions to customers, and use JIRA for project management
  • Proven problem-solving skills, with the ability to anticipate roadblocks, diagnose problems, and generate effective solutions
  • Analyze market segments and the customer base to develop market solutions
  • Experience working with batch-processing and real-time systems built on various technologies
  • Enhance and support solutions using PySpark/EMR, SQL and databases, AWS Athena, S3, Redshift, Lambda, AWS Glue, and other data engineering technologies (a minimal PySpark sketch follows this list)
  • Proficiency in SQL writing, SQL concepts, data modelling techniques, data validation, data quality checks, and data engineering concepts
  • Drive the design, build, deployment, and review of new and existing products, following SDLC best practices through to final client sign-off
  • Experience with technologies like Databricks, HDFS, Redshift, Hadoop, S3, Athena, RDS, and Elastic MapReduce (EMR) on AWS, or similar services in GCP/Azure
  • Schedule and monitor Spark jobs using tools like Airflow and Oozie (an example Airflow DAG also follows this list)
  • Familiar with version-control and CI/CD tools like Git, AWS CodeCommit, Jenkins, and AWS CodePipeline
  • Work in a cross-functional team alongside other Data Engineers, QA Engineers, and DevOps Engineers
  • Develop, test, and implement data solutions based on finalized design documents
  • Familiar with Unix/Linux and shell scripting
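To illustrate the day-to-day PySpark/EMR work described above, here is a minimal sketch of a batch ETL job that reads raw CSV from S3, cleanses it, and writes partitioned Parquet for downstream Athena or Redshift Spectrum queries. All bucket names, paths, and column names are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: raw CSV landed in S3 (EMR resolves s3:// paths natively).
orders = spark.read.option("header", "true").csv("s3://raw-zone/orders/")

# Transform: deduplicate, drop bad rows, and normalize types.
clean = (
    orders.dropDuplicates(["order_id"])
    .filter(F.col("amount").isNotNull())
    .withColumn("amount", F.col("amount").cast("double"))
    .withColumn("order_date", F.to_date("order_ts"))
)

# Load: partitioned Parquet, queryable from Athena or Redshift Spectrum.
(
    clean.write.mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://curated-zone/orders/")
)

spark.stop()
```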
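And a sketch of scheduling that job with Airflow, per the scheduling/monitoring bullet above. It assumes Airflow 2.4+ with the Apache Spark provider installed and a configured spark_default connection; the DAG id, script path, and schedule are placeholders (on EMR you might instead submit the job as a cluster step).

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

default_args = {
    "owner": "data-engineering",  # hypothetical team name
    "retries": 2,
    "retry_delay": timedelta(minutes=10),
}

# Use schedule_interval instead of schedule on Airflow versions older than 2.4.
with DAG(
    dag_id="orders_etl_daily",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args=default_args,
) as dag:
    run_etl = SparkSubmitOperator(
        task_id="run_orders_etl",
        application="s3://my-bucket/jobs/orders_etl.py",  # hypothetical script path
        conn_id="spark_default",  # assumes a configured Spark connection
        conf={"spark.executor.memory": "4g"},
    )
```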

SKILL SET
 

  • BE, BS, or MS in Computer Science or a related field, with 4+ years of overall working experience
  • Excellent communication and problem-solving skills
  • Highly proficient in project management principles, methods, techniques, and tools
  • Minimum of 4 years of working experience in PySpark, SQL, and AWS development
  • Experience mentoring junior team members
  • Hands-on experience with ETL processes and performance optimization techniques is a must (a sample optimization sketch follows this list)
  • Should have taken part in architecture design and discussions
  • Minimum of 4 years of experience working with batch-processing/real-time systems, using technologies like Databricks, HDFS, Redshift, Hadoop, Elastic MapReduce on AWS, Apache Spark, Hive/Impala, and NoSQL databases, or similar services in Azure or GCP
  • Minimum of 4 years of experience working on data warehouse or data lake projects in a role beyond just data consumption
  • Minimum of 4 years of hands-on experience building scalable solutions on AWS; an equivalent level of experience in Azure or Google Cloud is also acceptable
  • Minimum of 4 years of experience in programming languages (preferably Python)
  • Experience in the Pharma domain is a big plus
  • Familiar with tools like Git, AWS CodeCommit, Jenkins, and AWS CodePipeline
  • Familiar with Unix/Linux and shell scripting
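As a concrete example of the ETL performance optimization techniques mentioned above, here is a minimal PySpark sketch that uses a broadcast join to avoid shuffling a large fact table, plus a partitioned write aligned with common query filters. Table paths and column names are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("orders_enrichment").getOrCreate()

facts = spark.read.parquet("s3://curated-zone/orders/")       # large fact table
dims = spark.read.parquet("s3://curated-zone/dim_customer/")  # small dimension table

# Broadcasting the small dimension table replaces a shuffle-heavy
# sort-merge join with a map-side hash join.
enriched = facts.join(broadcast(dims), on="customer_id", how="left")

# Partition the output on the column most queries filter by, so engines
# like Athena can prune partitions instead of scanning everything.
(
    enriched.repartition("order_date")
    .write.mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://curated-zone/orders_enriched/")
)

spark.stop()
```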

Additional Skills:
 

  • Exposure to Pharma and life sciences would be an added advantage.
  • Certification in any cloud platform (AWS, GCP, or Azure)

 
