remote

Senior Data Engineer AWS - Capstone Integrated Solutions

Data Engineer

Senior Data Engineer responsible for designing, building, and maintaining scalable data pipelines on AWS, leveraging ETL processes, SQL, and Python to support retail software solutions and deliver high-quality data services.

About the role

Capnexus is a comprehensive services provider. Our team consists of outstanding professionals with extensive experience designing, building, and supporting retail software. We see ourselves as a build-as-a-service provider that follows a repeatable business pattern applicable to a variety of platforms and verticals. With a culture built on outcomes and delivery at the core of the business, Capnexus provides its customers with a complete suite of services for software development, system analysis, integration, implementation, and support, as well as the option to engage a single team to perform all the services they require.

Who You Are and What You'll Do:

Capnexus is looking for a highly skilled Senior AWS Data Engineer to lead data architecture, pipeline development, and data integrations. This is an exciting opportunity to apply advanced cloud data engineering skills on a platform that leverages generative AI to automate and modernize enterprise workflows.

Responsibilities:

Participate in data discovery workshops to inventory source systems including property management platforms, marketing channels, and CRM data, and translate findings into data lake architecture requirements.
Design and implement a multi-zone enterprise data lake on Amazon S3 (raw, conformed, enriched, aggregated) with ingest, cleansing, and business layers including schema versioning, checksum validation, business rule validation, and quarantine/notify workflows on failure.
Build batch and streaming data ingestion pipelines using AWS Glue, Amazon Kinesis, and containerized ingestion applications across CDP, marketing, and property management data sources.
Write PySpark and Python ETL code for AWS Glue jobs to transform, cleanse, and enrich data at scale; apply Apache Iceberg table format for ACID-compliant, schema-evolving data lake tables.
Implement data transformation and orchestration frameworks using AWS Glue ETL and AWS Step Functions; configure AWS Glue Data Catalog with crawlers for automated metadata management and discovery.
Implement AWS Lake Formation for fine-grained data governance, including table-level and column-level permissions, data filters, and resource links, rather than relying on IAM-level access controls alone.
Configure Amazon Athena for serverless SQL querying across the data lake with performance optimization (Parquet format, partitioning, column pruning, file size management, caching); implement Amazon DynamoDB for sub-second customer profile lookups, with DAX where latency requirements demand it.
Develop and deploy AWS Lambda functions using AWS Lambda Powertools for structured logging, handler routing, and observability; implement error handling patterns including exponential backoff, retries, dead-letter queues, and CloudWatch alarms.
Write and maintain Terraform (or CloudFormation/CDK) modules to provision and deploy AWS dat

Skills