We are sharing a specialized part-time consulting opportunity for advanced LLM power users experienced in personalized AI workflows, rubric-based evaluation, real-world task assessment, personal productivity systems, and high-context decision support.
This role supports current and upcoming remote consulting projects that evaluate how AI systems handle personalized, real-world life tasks across food, health, productivity, career, learning, research, planning, and personal workflow scenarios. Selected professionals will create realistic prompts, complete complex AI-assisted tasks, record their workflow execution, design or apply detailed rubrics, and evaluate whether AI outputs are useful, personalized, practical, safe, and successful in real-life contexts.
Key Responsibilities
Professionals in this role may contribute to:
Personalized AI Task Evaluation
- Create written responses, prompts, and explanations for complex personal-life tasks
- Evaluate whether AI outputs are practical, well-reasoned, personalized, realistic, and successful
- Identify where outputs succeed, miss context, overreach, provide generic advice, or fail to account for real constraints
- Use hands-on LLM experience to assess real-world usefulness across high-context personal workflows
Rubric Design & Quality Assessment
- Apply structured rubrics and quality criteria to evaluate AI system performance
- Create detailed evaluation rubrics for complex personal tasks and multi-step workflows
- Judge outputs against criteria involving usefulness, personalization, reasoning quality, safety, completeness, and success conditions
- Write clear, specific, and well-supported feedback explaining evaluation decisions
Real-World Workflow Execution
- Execute AI-assisted tasks while recording your screen, following project instructions
- Review task performance across tools, prompts, reasoning steps, outputs, and final recommendations
- Complete research-intensive personal workflows end-to-end within expected turnaround timelines
- Maintain careful documentation of task setup, execution, rubric design, and evaluation results
Ideal Profile
Strong candidates may have:
- Heavy personal usage of LLM products and AI tools
- Experience using AI for multi-step tasks, planning, research, decision-making, personal workflows, or life administration
- Familiarity with tools such as ChatGPT, Claude, Gemini, Perplexity, Cursor, Windsurf, Codex, or other AI agents
- Strong ability to explain what makes an AI output useful, incomplete, unsafe, unrealistic, generic, or poorly personalized
- Extensive rubric experience, including prior rubric design, evaluation, and quality assessment work
- Strong wri