About The Role
GHX is building an LLM-powered document understanding platform focused on classification, structured data extraction, and intelligent orchestration at scale. This is a high-impact AI engineering role in which you will own the full lifecycle, from problem framing to production deployment. Initially, you will focus on prompt engineering and evaluation systems, building the quality foundation for AI performance. Over time, the role expands into agent orchestration, system architecture, and the migration of rule-based systems to LLM-driven pipelines. A strong foundation in software engineering (5+ years) is essential: the role demands engineering rigor in both traditional system design and AI system behavior.
Role Evolution
Now (0–6 months): Prompt Engineering & Evaluation
- Design and refine prompts for classification and data extraction
- Build ground truth datasets and evaluation pipelines
- Establish accuracy benchmarks and quality baselines
Mid-Term (6–12 months): Agent Orchestration & Pipeline Design
- Develop multi-agent document processing pipelines
- Integrate MCP-based (Model Context Protocol) tools and external service interfaces
- Orchestrate parallel AI workflows
Long-Term (12+ months): Platform Architecture & Ownership
- Own the end-to-end document intelligence architecture
- Lead migration from rule-based to LLM-powered systems
- Define engineering standards and best practices
Core Responsibilities
- Prompt Engineering: Design prompts for diverse document classification and extraction tasks; treat prompts as formal specifications (precise, structured, and edge-case-aware); develop few-shot, chain-of-thought, and structured output templates; manage the prompt lifecycle (versioning, testing, and rollback).
- LLM Output Evaluation: Create and maintain ground truth datasets; build automated evaluation pipelines (precision, recall, field-level accuracy); identify and resolve outputs that look correct on the surface but are conceptually wrong.
- AI Agent Orchestration: Design multi-agent workflows for document processing; implement tool-use patterns and integrate MCP servers; optimize orchestration for scale and efficiency.
- Software Engineering: Develop production-grade APIs and backend services; apply Clean Architecture and Domain-Driven Design (DDD) principles; write maintainable, testable Python code; contribute to CI/CD, deployment, and observability systems.
- Stakeholder Collaboration: Act as a bridge between business stakeholders and AI systems; translate product requirements into technical architectures; communicate system behavior, limitations, and quality metrics clearly.
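As an illustration only, the field-level evaluation mentioned under LLM Output Evaluation could look roughly like the sketch below. It is not GHX's actual pipeline: the names `FieldScore` and `score_fields`, the sample invoice fields, and the scoring convention (a wrong value counts as both a false positive and a false negative) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class FieldScore:
    """Per-field counts (hypothetical helper, not an existing GHX type)."""
    true_positives: int = 0
    false_positives: int = 0
    false_negatives: int = 0

    @property
    def precision(self) -> float:
        predicted = self.true_positives + self.false_positives
        return self.true_positives / predicted if predicted else 0.0

    @property
    def recall(self) -> float:
        expected = self.true_positives + self.false_negatives
        return self.true_positives / expected if expected else 0.0


def score_fields(predictions, ground_truth):
    """Compare extracted fields against labeled fields, document by document.

    Assumed convention: an exact match is a true positive; an extracted value
    that is wrong or unlabeled is a false positive; a labeled value that is
    missing or wrong is a false negative.
    """
    scores = {}
    for predicted, expected in zip(predictions, ground_truth):
        for field in set(predicted) | set(expected):
            score = scores.setdefault(field, FieldScore())
            got, want = predicted.get(field), expected.get(field)
            if got is not None and got == want:
                score.true_positives += 1
            else:
                if got is not None:
                    score.false_positives += 1
                if want is not None:
                    score.false_negatives += 1
    return scores


# Made-up sample data: two labeled documents and the model's extractions.
ground_truth = [
    {"invoice_number": "INV-1", "total": "100.00"},
    {"invoice_number": "INV-2", "total": "250.00"},
]
predictions = [
    {"invoice_number": "INV-1", "total": "100.00"},
    {"invoice_number": "INV-2"},  # the model missed the total here
]

scores = score_fields(predictions, ground_truth)
for name, score in sorted(scores.items()):
    print(name, score.precision, score.recall)
```

With this sample data, `invoice_number` scores 1.0 on both metrics, while `total` keeps a precision of 1.0 but drops to a recall of 0.5 because one labeled value was never extracted. Reporting per field rather than per document is one way to see which parts of an extraction prompt need work.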
Required Skills & Experience
- 5+ years of software engineering experience (Python preferred)
- Advanced prompt engineering and LLM evaluation expertise
- Experience with ground truth dataset design
- Hands-on experience with LLM APIs (OpenAI, Anthropic, Azure AI)
- AI agent orchestration experience
- Strong understanding of Clean Architecture / DDD principles
- REST API development
- Version control (Git), CI/CD practices, containerization
- Experience with AWS or similar cloud platforms
Nice to Have
- Document understanding / OCR tools (e.g., AWS Textract)
- Experience with AWS services (EC2, S3, SQS, Lambda, ECS)
- Familiarity with LangChain / LlamaIndex
- Background in NLP or text classification systems