remote

Data Engineer - Anika Systems

Design, build, and optimize scalable data pipelines for federal clients, leveraging ETL/ELT, XBRL processing, Apache Iceberg, and advanced data optimization techniques to deliver trusted analytics and reporting.

About the role

Design, develop, and maintain robust ETL/ELT pipelines to ingest, transform, and deliver data across enterprise platforms.
Build scalable data ingestion frameworks for structured and semi-structured data, including XBRL filings and financial datasets.
Implement data transformation logic to support analytics, reporting, and regulatory use cases.
Ensure data pipelines are reliable, performant, and scalable in cloud environments.
Leverage AI-assisted development tools to accelerate pipeline development, testing, and optimization.
Develop and manage data solutions on AWS (e.g., S3, Airflow DAGs, Glue, Lambda, Redshift).
Implement and optimize Apache Iceberg table formats for large-scale, ACID-compliant data lakes.
Support lakehouse architectures that unify data lakes and data warehouses.
Optimize data storage and retrieval strategies for performance and cost efficiency.
Enable data platforms that support AI/ML workloads and downstream generative AI use cases.
Design and implement CI/CD workflows for data pipelines, infrastructure, and analytics code using tools such as GitHub Actions, GitLab CI, Jenkins, or AWS-native services.
Automate build, test, and deployment processes for ETL pipelines and data platform components.
Implement DataOps best practices, including version control, automated testing, environment promotion, and rollback strategies.
Ensure reproducibility, reliability, and governance of data pipeline deployments across environments.
Integrate AI-driven testing and monitoring tools to improve pipeline quality and reduce operational risk.
Design and implement materialized views and other performance optimization techniques to improve query efficiency.
Tune data pipelines and queries for performance, scalability, and cost.
Implement partitioning, indexing, and caching strategies aligned to workload patterns.
Develop pipelines to ingest, parse, and normalize XBRL (eXtensible Business Reporting Language) data.
Support regulatory and financial data use cases requiring high accuracy and traceability.
Ensure alignment with data standards and validation rules for financial reporting datasets.
Apply context engineering principles to ensure data is enriched with meaningful metadata, lineage, and business context.
Collaborate with Data Architects to support data modeling, schema design, and entity relationships.
Enable downstream analytics and AI use cases by structuring data for usability, discoverability, and governance.
Integrate pipelines with enterprise data catalogs and metadata management systems.
Support automated metadata capture, lineage tracking, and data quality monitoring.
Ensure alignment with data governance frameworks and standards established by OCDO organizations, including AI data readiness and trace

Skills