Recruit Big Data-Focused Senior Software Engineer

Recruit Experienced Engineers With Big Data Expertise

Big Data Technologies Specialist, 10 Yrs Experience

Summary

Experienced Software Engineer with 10+ years specializing in Big Data technologies (Hadoop, Spark, Kafka), AWS services, and agile project management. Proficient in Scala and Python, skilled in data migration, and adept at using version control tools. Strong communication and relationship-building abilities.

Work Experience

Leading IT Firm (Domain: Investment) (March ’22 – Present)

Technical Manager
  • Designed the architecture for data extraction, transformation, and loading from on-premise to AWS S3 using PySpark and Airflow.
  • Developed 19 PySpark jobs to process data from 19 sources using AWS EMR, AWS S3, and MSSQL Server.
  • Orchestrated PySpark jobs using AWS Managed Airflow (MWAA).
  • Developed configurations for EMR cluster creation and job submission using MWAA.
  • Designed and developed Spark jobs for execution in EKS (Elastic Kubernetes Service) cluster.
  • Created a Python job, orchestrated with MWAA, for ledger data verification during peak times.
  • Implemented data quality checks using PyDeequ for Python/PySpark/Pandas/Glue jobs.
  • Provided technical mentoring to the team and contributed to high-level architecture and solution discussions.
  • Designed an architecture for generating custom metrics from CloudWatch, Glue, and REST endpoints.
  • Delivered a centralized alert management and monitoring system.
  • Mentored the team to address technical challenges and provided solutions.
  • Led a team of 4 members.

Leading Innovation Labs (July ’20 – Aug ’21)

Associate Architect
  • Developed PySpark jobs for processing Parquet files in the Movoto recommendation system.
  • Improved geographic data processing performance by 300 times.
  • Designed and developed the Movoto recommendation system.
  • Increased report generation performance by 60 times.
  • Optimized infrastructure costs by using AWS spot instances for Spark jobs.
  • Collaborated with DevOps to create the infrastructure for Movoto Analytics.
  • Created Dockerfiles for Spark job submission on an EKS cluster via Airflow.
  • Led two teams: Data Science and Data Engineering.
  • Designed the architecture for data processing, data analytics, and report generation.

R & D Company (Mar ’19 – Mar ’20)

Associate Technical Architect
  • Artificial Intelligence and Machine Learning Practice (R & D)
  • Developed a PySpark Streaming job for processing Experian streaming data (credit risk project).
  • Designed and developed PySpark batch jobs on EMR clusters.
  • Created a sequence model for risk classification of warranty service comments for the TMAP project using NLTK, LSTM, Keras, TensorFlow, CuPy, SciPy, and Pandas on AWS GPUs.
  • Implemented conversational AI using the DeepPavlov framework.
  • Developed a credit risk model for predicting defaulters.
  • Led a team of 8 members and handled requirement gathering, problem reports, and daily production issues.

Leading Consulting Firm (Oct ’17 – Dec ’18)

Specialist Senior-Technology
  • Developed CNN and YOLO models for image analysis.
  • Collaborated with the CTO and the Master’s of Machine Learning.
  • Member of the Master’s of Machine Learning Guild.
  • Project Title: T Rowe Price
  • Designed data processing pipeline architecture.
  • Led the development team.
  • Utilized AWS services, Terraform, and Python scripting.
  • Managed requirement gathering, problem reports, and ticket analysis.

Leading R&D Institute (July ’15 – Sep ’17)

Engineer
  • Project Title: Server Development Group
  • Developed Recommendation Systems for Samsung Smart TV.
  • Utilized Python, Spark-MLlib, Kafka, AWS, and more.
  • Designed log processing pipelines and contributed to AWS IoT Hackathon.

Reputed Tech Corporation (Jan ’12 – Sep ’14)

Software Engineer
  • Developed MapReduce jobs in Java and maintained existing batch jobs.
  • Engaged in bug fixing, change requests, and unit testing.

Education

B.Tech from JNTU, Hyderabad.

Other

  • Databases: MySQL, PostgreSQL, Redshift
  • NoSQL Databases: MongoDB, Elasticsearch
  • Programming Languages: Python, Scala
  • Scripting: Shell Scripting, Python
  • Frameworks: Hadoop, Spark, Kafka
  • Cloud Technologies: AWS (Athena, EMR, Glue, S3, Redshift, IAM, CloudWatch, DMS)

Want to hire talent like this?

If yes, you've come to the right place! Tarmack can help you hire this person or others with a similar profile, wherever you are located in the world. We are a global platform that helps employers hire great talent across a wide range of skills and levels.

Want us to help you with your hiring needs?

You can also reach us by email at employers@tarmack.com

A truly global HR platform with everything you need to build, grow & manage a global team.

  • Identifying & recruiting the best talent
  • Payroll with full compliance across 100+ countries
  • Employment agreements as per local laws
  • Contractor invoices & time management
  • Smooth remote onboarding of employees
  • Immigration & mobility services around the world