onsite
Data Engineer - PortSwigger
Design, build, and maintain scalable data pipelines and warehouses using Python, SQL, and cloud services, enabling fast, reliable analytics for security‑focused products.
About the role
Key Responsibilities
- Develop and maintain robust ETL pipelines to ingest, transform, and store large volumes of security telemetry.
- Design data models and optimize queries in relational and columnar databases for high‑performance analytics.
- Implement data processing workflows using Apache Spark and stream processing with Kafka.
- Collaborate with product and security teams to define data requirements and ensure data quality.
- Monitor, troubleshoot, and improve data pipeline reliability and cost efficiency on AWS.
Requirements
- 3+ years of experience building data pipelines with Python and SQL.
- Strong knowledge of cloud platforms, preferably AWS (S3, Redshift, Glue, Lambda).
- Hands‑on experience with big‑data technologies such as Apache Spark and Kafka.
- Proficiency in designing scalable data models and optimizing query performance.
- Familiarity with CI/CD practices and infrastructure‑as‑code tools.
Skills
Python, SQL, AWS, Apache Spark, Kafka