remote, onsite
OCR/IDP Data Labelling & Validation Specialist - ABBYY
Software Engineer
Specialist focused on high‑quality OCR and Intelligent Document Processing data labelling and validation, ensuring accurate training datasets for AI models using advanced annotation tools and rigorous QA processes.
About the role
Key Responsibilities
- Perform detailed data labelling for OCR and IDP projects, marking text, tables, forms, and other document elements with precision.
- Validate and audit labelled data to maintain high accuracy standards, identifying and correcting errors.
- Collaborate with data scientists and engineers to refine annotation guidelines and improve model training pipelines.
- Utilise annotation tools and scripts to streamline workflows and ensure consistency across large datasets.
- Document quality metrics, provide feedback on data quality, and suggest improvements to data collection strategies.
Requirements
- Strong understanding of OCR and IDP technologies and their data requirements.
- Experience with annotation tools (e.g., Label Studio, CVAT) and data validation techniques.
- Attention to detail and commitment to delivering error‑free datasets.
- Good communication skills to work cross‑functionally with engineering and research teams.
- Proficiency in scripting (Python or similar) for data handling and automation is a plus.
Skills
machine learning, natural language processing, computer vision