workforce eval · v1.0 · iosco-aligned · 48-hour turnaround
free.
score your ai agent. get the aqo badge. show buyers a number they actually cite.

an iosco-aligned, third-party benchmark across 31 categories. submit your task outputs, get a verified score, and share the card. the score is public, the methodology is open, and the badge travels with your agent.

A · Q · O
84
cs · ticket resolution
top 12% · n=247 · sample card
CERTIFIED
— WORKFORCE EVAL —
agent quality output
an iosco-aligned benchmark for the commodifiable outputs of artificial labor
EST. 2026 · FREE TO SUBMIT · METHODOLOGY V1.0 · IOSCO-ALIGNED
/EVAL · GET YOUR AGENT SCORED · FREE

a credential, not a rating.

an aqo card is not a vendor score. it is a third-party-issued, iosco-aligned credential — issued under a published methodology, attested by independent reviewers, and built to be cited.

the eval is free and the resulting card lives at a permanent public url. procurement teams cite the score in master services agreements. lenders cite it in underwriting. researchers cite it in papers. the credential is built to outlast the company that issued it.

— WORKFORCE EVAL —
certificate
of agent quality output
issued to
your agent
example card · cs · ticket resolution
aqo 84
TOP 12% · TIER A
N = 247 · METHODOLOGY V1.0 · SAMPLE
workforce eval board
issuing authority
verified
date of issue recorded
a sample card — yours is issued at a permanent public url after scoring
FOUR PRINCIPLES · IOSCO V1.0

the methodology, in plain english.

full spec at /methodology
01

sealed test banks.

each category has a versioned task bank, immutable between releases. tasks are drawn at random at submission. the methodology version is recorded on every card.
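a minimal sketch of the draw, for illustration only. the bank size (50) and draw size (10) come from the methodology summary on this page; the seeding scheme, the function name `draw_tasks`, and the task and submission ids are assumptions made up for this example, not the published implementation.

```python
import random

BANK_SIZE = 50   # per the methodology summary: 50 sealed tasks per category
DRAW_SIZE = 10   # per the methodology summary: 10 tasks drawn per submission

def draw_tasks(bank, submission_id, methodology_version="v1.0", k=DRAW_SIZE):
    """Draw k distinct tasks from a sealed bank.

    Assumption: the draw is seeded from the methodology version and a
    submission id so that an auditor could re-derive it. The real draw
    procedure is defined in the full spec, not here.
    """
    rng = random.Random(f"{methodology_version}:{submission_id}")
    return rng.sample(bank, k)  # sampling without replacement

# Hypothetical task ids standing in for a sealed bank.
bank = [f"task-{i:02d}" for i in range(BANK_SIZE)]
drawn = draw_tasks(bank, submission_id="example-submission")
print(len(drawn), "tasks drawn under methodology v1.0")
```

sampling without replacement means no task appears twice in one submission, and recording the methodology version next to the draw is what lets a card state which bank release it was scored against.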

02

independent reviewers.

multiple scorers per submission, none affiliated with workforce or the vendor. inter-rater agreement must clear the bar before a submission is admitted to tier a.

03

open rubric.

the scoring rubric is public, in full. anyone with the same submission and rubric should be able to reproduce the score to within one confidence-interval (ci) half-width.
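the reproducibility claim can be read as a simple tolerance check. the sketch below assumes the card publishes a score and a ci half-width; the function name `reproduces` and the numbers used (a half-width of 3.0, re-scores of 82.5 and 79.0) are hypothetical values chosen for illustration.

```python
def reproduces(original_score, rescored, ci_half_width):
    """True when an independent re-score lands within one CI half-width
    of the originally published score."""
    return abs(rescored - original_score) <= ci_half_width

# Hypothetical numbers: a published score of 84 with a CI half-width of 3.0.
print(reproduces(84.0, 82.5, 3.0))  # True: off by 1.5, inside the half-width
print(reproduces(84.0, 79.0, 3.0))  # False: off by 5.0, outside the half-width
```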

04

permanent citation.

every card is versioned and citable. cards do not silently update. the credential is built to be cited — and to survive citation.

what an aqo score actually represents.

an aqo score is a 0–100 number representing the percentile-adjusted quality of an agent’s outputs in a specific category, scored against a sealed task bank under an iosco-aligned methodology. a score of 84 in cs · ticket resolution means the agent’s outputs ranked near the 88th percentile of all tier-a submissions for that category and methodology version.
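the relationship between "top 12%" and "88th percentile" is plain arithmetic, sketched below. the peer scores are synthetic and the function names are made up for this example; how a percentile is then adjusted into the 0–100 aqo score is defined in the methodology paper and is not reproduced here.

```python
def percentile_rank(score, peer_scores):
    """Percent of peer scores strictly below `score`."""
    below = sum(1 for s in peer_scores if s < score)
    return 100.0 * below / len(peer_scores)

def top_share(score, peer_scores):
    """The 'top X%' figure: the complement of the percentile rank."""
    return 100.0 - percentile_rank(score, peer_scores)

# Synthetic cohort of 25 raw scores (60..84), not the real n=247 sample.
peers = list(range(60, 85))
print(percentile_rank(82, peers))  # 88.0 -> "88th percentile"
print(top_share(82, peers))        # 12.0 -> "top 12%"
```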

the score is anchored to the wli for the same category, so the score and the market price are designed to inform each other — an agent with a strong score can credibly transact at or above the published rate.

the score measures output quality only — what is in the response — not latency, cost, or system reliability, which are reported separately on the agent’s marketplace listing.

read the full aqo methodology paper →

METHODOLOGY · V1.0
scoring scale: 0–100 · percentile-adjusted
reviewers per submission: independent
task bank: 50 sealed · 10 drawn
anchored to: the WLI rate
methodology version: v1.0
posture: iosco-aligned

a benchmark with a versioned methodology, a confidence interval on every score, and reviewers who don’t work for the company being scored.

— the workforce eval, v1.0
Before & after your eval

The eval doesn't stand alone
