Data Engineer
Senior AI Data Engineer responsible for designing, building, and scaling ETL/ELT pipelines for AI workloads, transforming unstructured data into vectorized formats for LLM consumption, and automating the data-to-model lifecycle on AWS.
At TechBiz Global, we provide recruitment services to the top clients in our portfolio.
We are currently looking for a dedicated Senior AI Data Engineer to join one of our clients' teams. If you're looking for an exciting opportunity to grow in an innovative environment, this could be the perfect fit for you.
Responsibilities:
▪ Design, build, and scale robust ETL/ELT pipelines optimized for AI workloads, including RAG, fine-tuning, and batch inference.
▪ Transform unstructured data sources such as PDFs, logs, and transcripts into structured and vectorized formats suitable for LLM consumption.
▪ Maintain and automate the data-to-model lifecycle, ensuring AI knowledge bases remain synchronized with changing business data.
▪ Develop and maintain real-time feature pipelines that support low-latency AI and machine learning applications.
▪ Integrate data platforms with Kafka and other event-driven systems to enable real-time processing and AI-driven responses.
▪ Manage and optimize Feature Stores to ensure consistency between model training and production environments.
▪ Implement automated data quality controls and validation processes to ensure the reliability and accuracy of AI training and inference data.
▪ Establish and maintain data lineage frameworks to provide traceability, auditability, and regulatory compliance across data workflows.
▪ Enforce data security, privacy, and governance standards, including PII protection and compliance with industry regulations.
▪ Manage data movement and synchronization across on-premises systems, cloud platforms, and data warehouses.
▪ Optimize data storage and retrieval strategies for Vector Databases to support high-performance RAG and AI search workloads.
▪ Collaborate with Data Scientists, ML Engineers, Software Engineers, and business stakeholders to deliver scalable AI data solutions.
Requirements:
▪ 10+ years of experience in Data Engineering or Backend Engineering with a strong focus on data platforms and pipelines.
▪ 2+ years of hands-on experience supporting AI/ML data pipelines, including data preparation for machine learning and generative AI applications.
▪ Expert-level proficiency in Python and SQL; experience with Java or Scala is an advantage.
▪ Strong experience building and maintaining real-time data streaming solutions using Apache Kafka, Flink, or Spark Streaming.
▪ Hands-on experience with modern data orchestration and transformation tools such as Airflow, dbt, and Prefect.
▪ Experience working with Vector Databases and Feature Stores to support AI and machine learning workloads.
▪ Strong knowledge of cloud-based data services on AWS, Azure, or GCP, including services such as Glue, Kinesis, Data Factory, or Dataflow.
▪ Experience deploying and managing data workloads in Kubernetes (K8s) environments.
▪ P
Posted June 21, 2026