Remote

Data Analyst - PySpark

We are looking for a Data Analyst skilled in PySpark and Python to design, monitor, and improve data pipelines while ensuring the compliance, quality, and security of large‑scale datasets.

About the role

Key Responsibilities

Develop, maintain, and optimize PySpark data pipelines for large‑scale batch and streaming workloads.
Implement data quality checks, validation rules, and monitoring dashboards to keep data accurate and reliable.
Collaborate with governance teams to enforce data compliance policies and security standards across all data assets.
Investigate and resolve data anomalies, security incidents, and performance bottlenecks.
Document data flows, lineage, and technical specifications in Jira for traceability and continuous improvement.

Requirements

3+ years of experience with PySpark, Python, and SQL in a big‑data environment.
Strong understanding of data quality frameworks, compliance regulations, and security best practices.
Proficiency with data‑pipeline orchestration tools and issue‑tracking systems such as Jira.
Ability to translate business requirements into scalable technical solutions.
Excellent problem‑solving skills and a collaborative mindset.

Skills

Python, SQL, Jira

Department: Engineering

Location: United States

Experience: 3+ years

Tenure: Full-time

Level: Mid-Level

Posted June 26, 2026