Rao Abdul Mannan

Key Strengths

Extensive experience in Data Engineering, particularly with Databricks, Spark, and AWS.
Proven track record of optimizing data pipelines, reducing costs, and improving performance (e.g., 55% cost reduction on AWS, 70% read-write performance improvement).
Strong background in real-time data processing using Kafka and Spark Streaming.
Experience with data lake infrastructure design, dimensional modeling, and data governance (Unity Catalog, ACLs).
Demonstrated ability to lead and implement significant migration and automation initiatives (Spark 2.x to 3.x, Jenkins DSL, Databricks REST API).
Multiple certifications in Databricks and Apache Spark, indicating formal validation of skills.

Cultural & Operational Fit

Cultural Fit Analysis

The candidate has a long and diverse career path across various companies and roles, including freelance work, which suggests adaptability. The experience spans from traditional software engineering to specialized big data and analytics roles, indicating a willingness to evolve and learn. The project 'edX Analytics Pipeline' shows initiative in open-source collaboration and community involvement, which aligns with a collaborative culture. However, the target role 'Analytics Engineer' might require a stronger emphasis on analytical modeling, reporting, and business intelligence tools beyond core data engineering, which is less explicitly detailed in the experience.

Soft Skills & Operational Fit

The candidate's project descriptions highlight a strong focus on efficiency, cost reduction, and performance improvement, suggesting a results-oriented approach. Experience in leading redesigns and migrations indicates strong problem-solving and project management skills. The emphasis on automation and continuous monitoring points to a proactive operational mindset.

About

I’m a Sr. Data Engineer with 13+ years of experience designing, scaling, and optimizing cloud-native, distributed data platforms for enterprise organizations, including publicly traded companies. With deep expertise across Big Data Engineering, Distributed Systems, and Cloud Architecture, I build high-performance, fault-tolerant ecosystems that power mission-critical business operations.

I architect end-to-end data platforms, from ingestion to governance, across data lakes, Delta/Lakehouse architectures, real-time streaming, ETL/ELT frameworks, and multi-cloud infrastructures. My focus is reliability, scalability, performance tuning, and cost optimization, ensuring every solution is production-ready and built for long-term growth.

I work at the intersection of Spark internals, Databricks optimization, distributed compute, and cloud-first engineering. I also lead technical strategy, mentor teams, and bring structure and clarity to large-scale, complex environments.

Core Competencies:
🔹 Data Architecture & Engineering: Distributed systems, EDA, modeling, integration, lineage, batch/streaming pipelines
🔹 Databricks Ecosystem: Delta Lake, DLT, Unity Catalog, Databricks SQL, cluster optimization, governance
🔹 AWS (Advanced): EMR, Glue, Redshift, Athena, S3, RDS, Kinesis (Streams + Firehose)
🔹 Big Data Frameworks: Spark, Hadoop, MapReduce, Snowflake, Presto, dbt, Hive, HBase, Cassandra
🔹 Orchestration & Automation: Airflow, Luigi, Jenkins, CI/CD, YAML-driven workflows, automation frameworks
🔹 Enterprise ETL Tools: Informatica (DEI, DES, PowerExchange), Oracle Data Integrator, IBM Big Data stack
🔹 Cloud, Infra & Containers: AWS, Azure, GCP, Docker, Kubernetes, Terraform (IaC)
🔹 Programming: Python, Scala, Groovy, PHP

Technical Focus Areas:
⚙️ Spark optimization & distributed compute
⚙️ Databricks architecture, governance & scaling

Top Skills

Databricks, Data Streaming, Data Lake, Data Lakehouse, PySpark, Python, React.js, Django, React Native, Apache Spark, Hadoop, PHP, AngularJS, MVC, JavaScript, PayPal, Magento, MySQL, WordPress, Drupal, OOP, Web Applications, Git

Experience

GETTR USA INC

Principal Data Engineer

October 1, 2022 – March 1, 2024

Remote

Systems Limited

Principal Consultant Data Analytics & Growth Lead

January 1, 2022 – June 1, 2022

Systems Limited

Senior Consultant Data Analytics & Growth Lead

April 1, 2021 – January 1, 2022

IBM

Big Data Consultant

April 1, 2019 – March 1, 2021

Lahore, Pakistan

Upwork

Principal Data Engineer

January 1, 2018 – Present

Remote

Arbisoft

Data Engineer/Senior Software Engineer

October 1, 2015 – March 1, 2019

Lahore, Pakistan

Nextbridge Pvt. Ltd

Software Engineer

January 1, 2015 – October 1, 2015

Lahore

PureLogics

Software Engineer

December 1, 2012 – December 1, 2014

Lahore

MacroPak Solutions Pvt Limited

Junior Web Developer

July 1, 2012 – December 1, 2012

Sahiwal

Projects

edX Analytics Pipeline

January 1, 2017 – March 1, 2019

- Redesigned the Hadoop MapReduce/Hive ETL pipeline to run on Spark/Hive, achieving a 55% cost reduction on AWS.
- Improved the performance of the legacy pipeline using the Python workflow scheduler Luigi, reducing the cost of these workflows by 50-80%.
- Migrated the legacy monolithic development platform to Docker, cutting development setup time by 95% and increasing open-source community collaboration and involvement by 75%.
- Revamped CI/CD with Travis and Docker, improving deployment efficiency by 60%.
- Integrated pipeline alerts and failures with AWS CloudWatch and Opsgenie for continuous uptime monitoring and daily operations.
- Worked on improvements to the data warehouse system and its design.
- Automated development of Jenkins workflows using Jenkins DSL and Groovy.

Certifications

Oracle Data Integrator 12c Certified Implementation Specialist

Oracle

June 25, 2026 – Present

Databricks Certified Data Engineer Professional

Databricks

June 25, 2026 – Present

Databricks Certified Associate Developer for Apache Spark 3.0

Databricks

June 25, 2026 – Present

CCA Spark and Hadoop Developer

Cloudera

June 25, 2026 – Present
