Senior Data Engineer · Cloud Architect

ANKUR ROY

Architecting the Data-Driven Future

7+ years engineering petabyte-scale data platforms on GCP · Azure · AWS · Databricks · Snowflake — turning raw event streams into competitive intelligence at warp speed.

7+ Years XP
10+ TB Live Pipelines
₹2 Cr+ Cost Savings
5 Certifications
// 01   PROFILE

The Engineer Behind the Data

I'm a Certified Google Professional Cloud Architect and Microsoft Azure Data Engineer & Data Scientist with deep expertise in building systems that move fast, scale effortlessly, and cost less.

From founding a Flutter startup to leading data engineering teams at Walmart, Delhivery, and Infosys, I've architected everything from real-time HTAP platforms to ML-driven business verification systems.

I thrive at the intersection of distributed systems, cloud-native architecture, and business impact — turning messy raw data into competitive advantage.

GCP Cloud Architect · Azure Data Scientist (DP-100) · Azure Data Engineer (DP-203) · Advanced DS (IBM) · ISRO Geospatial
Data Engineer 3
Walmart Global Tech
// 02   MISSION LOG

Experience Timeline

Data Engineer 3 · Walmart · Aug 2023 – Present
  • Team Lead — Managing a 5-member India data engineering team for Walmart Business.
  • Pay By Invoice — Architected a TreviPay-integrated real-time credit, invoicing & reconciliation platform for B2B transactions.
  • Site Pop-up — Built a config-driven attribution engine with Spring Boot + Redis + ML models triggering behavior-driven UI pop-ups from in-session signals.
  • Sales Tech — Curated data facets for 6M+ Accounts & Contacts powering Salesforce with Spark + BigQuery for precise targeting.
  • Business Verification — Designed a NiFi + ML entity-resolution pipeline that improved accuracy by 32% and kept latency under 50 s.
Senior Data Engineer · Delhivery · Jan 2022 – Aug 2023
  • Real-Time Data Mart — Directed design of 92 real-time marts at 25+ TB scale with Kafka, Spark Streaming & TiDB HTAP at 8K+ write / 3K+ read QPS.
  • DB Migration — Migrated 20+ TB across 92 marts from Aurora PostgreSQL → TiDB HTAP for 14 teams.
  • Data Lake Migration — Moved 45+ TB Hudi → Delta Lake on Databricks, cutting batch time by 43%.
  • Cost Optimisation — Achieved 75% Spark savings (~₹1.5 Cr) via CPU slicing on YARN.
Data Engineer · Infosys · Nov 2020 – Jan 2022
  • Led a team of 4 engineers for Kraft Heinz, delivering scalable data solutions.
  • Indirect Tax Datamart — Built a Snowflake datamart transforming 80M+ orders and 100K+ products under international SAP HANA taxation rules.
  • Data Integration — ETL with Azure Data Factory → Databricks + DBT → Snowflake for enterprise analytics.
Founder & Product Engineering Lead · Startup · Jun 2019 – Oct 2020
  • Founded and led a cross-platform mobile app venture using Flutter with a team of 6.
  • Designed scalable apps with MVVM architecture, RESTful APIs & Firebase services.
  • Managed full product lifecycle — client discovery to successful delivery.

Impact Metrics

₹2 Cr+ Infra Cost Saved
45+ TB Data Lake Migrated
92 Real-time Data Marts
43% Batch Time Reduction
<50 s Verification Latency
6M+ Salesforce Accounts

Skills & Expertise

Languages
Python · Scala · SQL · Java · Bash
Data Platforms
Databricks · Apache Spark · Flink · Kafka · NiFi · Delta Lake · Hive · Presto · Snowflake · DBT
Databases
TiDB (HTAP) · BigQuery · PostgreSQL · MySQL · Redis
Cloud & Infrastructure
GCP · Azure · AWS · Kubernetes · Docker · Terraform · Airflow · YARN
Monitoring & Observability
Grafana · Prometheus
Core Expertise
Data Modeling · ETL / ELT · Data Warehousing · Governance · Data Quality · Cost Optimization · ML Integration

Achievements & Awards

Winner — Walmart Business Hackathon 2025
Topped the internal hackathon, demonstrating innovation in business data engineering.
Innovation Badge — Walmart Global Tech India
Honored for outstanding contributions to data platform innovation.
Speaker — Databricks Bangalore User Group
Presented at the Databricks community, sharing expertise on real-time data platforms.
First Runner-up — GenAI Hackathon, Google India & Delhivery
Secured 2nd place in a competitive Generative AI challenge.
Above and Beyond Award — Delhivery
Recognized for exceptional performance migrating real-time data platforms at scale.
Winner — Azure-a-thon by Infosys
Won the Azure cloud challenge organized by Infosys Limited.
Speaker — Flutter Meetup, Google Gurugram
Presented at Android Developer Club's Flutter event hosted at Google Gurugram.
Rank 31 — DUET 2020, MS Computer Science
Also secured Rank 50 in MCA — Delhi University Entrance Test 2020.

Certifications

Google Cloud Professional Cloud Architect
Google Cloud
Azure Data Scientist Associate (DP-100)
Microsoft
Azure Data Engineer Associate (DP-203)
Microsoft
Advanced Data Science with IBM Specialization
IBM / Coursera
Geospatial Applications for Disaster Risk Management
ISRO & UNOOSA

Let's Connect

Building a next-gen data platform, tackling a gnarly pipeline challenge, or just looking to geek out about distributed systems? I'm one message away.