Operations Research Model Prompt Evaluator

Remote, USA | Full-time | Posted 2026-06-13

Role Overview

We are seeking expert operations research professionals to author and verify high-quality, open-ended prompts for AI model evaluation. You will craft and review challenging, unambiguous optimization and decision-science problems across core subdomains, assess AI reasoning quality, and help establish rigorous evaluation standards for frontier language models. You will be assigned one of two task types:

- Authoring Task: Create 5 original, open-ended prompts from your assigned subdomain at varying difficulty levels (undergraduate, advanced undergraduate, or graduate/professional). Prompts should call for work such as optimization modeling, algorithmic analysis, or stochastic reasoning, where evaluating the quality of the AI's response requires human judgment.
- Verification Task: Review 5 authored prompts for clarity, scope alignment, difficulty accuracy, and uniqueness. Edit prompts and difficulty ratings where needed.

Operations Research Subdomains Covered

Linear & Integer Programming, Network Optimization & Graph Theory, Stochastic Models & Queuing Theory, Game Theory & Decision Analysis, Supply Chain & Logistics Optimization, Simulation & Metaheuristics.

Key Responsibilities

- Author clear, unambiguous, open-ended operations research prompts that elicit evaluable AI responses
- Verify prompts are within the scope of the assigned subdomain and correctly rated for difficulty
- Ensure all 5 prompts in a task are sufficiently distinct from one another with varying difficulty levels
- Apply expert judgment to assess the depth and quality of quantitative reasoning required
- Edit prompts and difficulty assignments where standards are not met

Ideal Qualifications

- Master's degree or higher in Operations Research, Industrial Engineering, Applied Mathematics, or a closely related field
- 2–6 years of professional or research experience in optimization, logistics, or decision science
- Strong command of mathematical programming, probabilistic modeling, and algorithmic methods
- Experience with solvers (Gurobi, CPLEX) or simulation tools is a strong plus
- Excellent written English and ability to craft precise, well-scoped technical questions

More About the Opportunity

- Expected commitment: 10+ hours/week
- Asynchronous, fully remote work

