AI Agent Evaluation Analyst for Autonomous Agents (No coding required)

Remote Full-time
We’re hiring detail-oriented, analytical contributors to help test and improve autonomous AI agent evaluations. This is part-time, fully remote work with flexible hours, ideal for people who enjoy finding edge cases, questioning assumptions, and strengthening complex systems.

What you’ll do
• Review and refine agent evaluation tasks and scenarios for logic, completeness, and realism
• Identify inconsistencies, ambiguities, and missing assumptions
• Define gold-standard expected behaviors for agents
• Annotate reasoning paths, cause-effect relationships, and plausible alternatives
• Collaborate with QA, writers, and developers to suggest refinements and expand edge case coverage
• Ensure autonomous agents are tested thoroughly and realistically

What we’re looking for
• Strong analytical thinking and excellent attention to detail
• Fluent written English with clear documentation skills
• Comfort reading structured formats such as JSON or YAML (no need to write code)
• Ability to reason about complex systems and spot what could break or be misinterpreted

Nice to have
• Prior exposure to QA/test-case thinking, logic puzzles, or evaluation frameworks
