remote

Data Engineer - Data Warehouse Architect - ICF


Data Engineer – Data Warehouse Architect responsible for designing, implementing, and optimizing enterprise data pipelines, database schemas, and data models to support business intelligence analytics using Python, SQL, ETL tools, AWS services, and Spark for scalable data processing.

About the role

This position focuses on developing, implementing, and maintaining architecture solutions across a large enterprise data warehouse to support effective and efficient data management and enterprise-wide business intelligence analytics.

Responsibilities:

Implement and optimize data pipeline architectures for sourcing, ingestion, transformation, and extraction processes, ensuring data integrity and compliance with organizational standards.

Develop and maintain scalable database schemas, data models, and data warehouse structures; perform data mapping, schema evolution, and integration between source systems, staging areas, and data marts.

Automate data extraction workflows and create comprehensive technical documentation for ETL/ELT procedures; collaborate with cross-functional teams to translate business requirements into technical specifications.

Establish and enforce data governance standards, including data quality metrics, validation rules, and best practices for data warehouse design and architecture.

Develop, test, and deploy ETL/ELT scripts using SQL, Python, Spark, or other relevant languages; optimize code for performance and scalability.

Tune data warehouse systems for query performance and batch processing efficiency; apply indexing, partitioning, and caching strategies.

Perform advanced data analysis, validation, and profiling using SQL and scripting languages; develop data models, dashboards, and reports in collaboration with stakeholders.

Conduct testing and validation of ETL workflows to ensure data loads meet SLAs and quality standards; document testing protocols and remediation steps.

Troubleshoot production issues, perform root cause analysis, and implement corrective actions; validate data accuracy and consistency across systems.

Basic Qualifications:

Minimum of 3 years of experience in data analysis.

Additional Qualifications:

Strong analytical and problem-solving skills with attention to detail.

Proficiency in SQL, including the ability to develop complex queries (e.g., multi-table joins), tune their performance, and troubleshoot them.

Experience with Unix/Linux shell scripting for ETL automation.

Familiarity with database tools and platforms (e.g., Teradata, Oracle, non-relational databases).

Excellent verbal and written communication skills; ability to collaborate across all levels.

Ability to prioritize and multi-task in a fast-paced environment.

Knowledge of Java/J2EE, REST APIs, Web Services, and event-driven microservices.

Experience with Kafka streaming, schema registry, and OAuth authentication.

Familiarity with Spring Framework, GCP services, Git, CI/CD pipelines, containerization, and data ingestion/data modeling.

Preferred Qualifications:

Experience with Databricks concepts and terminology (e.g., workspace, catalog).

Proficiency in Py