Content Evaluation API — Score Articles Against SEO, Readability, and Brand Rubrics
REST API for scoring blog posts and landing pages against customizable rubrics. Returns per-criterion scores, weighted overall scores, flagged issues, and recommended actions for any agentic content pipeline.
Score content against a rubric — readability, SEO quality, brand alignment, keyword optimization, factual accuracy, and any custom criteria you define. Useful before publishing, before refreshing, and as a gate in any agentic content pipeline.
Conceptual overview
Evaluation runs a rubric — a set of named criteria with weights — against a piece of content and returns:
- Per-criterion scores (0–100) with reasoning
- A weighted overall score
- Specific issues flagged (sections too short, missing primary keyword, broken internal links, sentences that exceed the grade-level target)
- Recommended actions to lift the score
Rubrics are workspace-scoped. The platform ships with several system rubrics (system.seo, system.readability, system.brand). You can fork, customize, or build your own.
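As a rough illustration of how per-criterion scores could roll up into the weighted overall score, the sketch below takes a weight-normalized mean and rounds it. The exact aggregation and rounding the platform applies are not stated on this page, so `weighted_overall` is a hypothetical helper, not the service's implementation. The figures are taken from the sample response later on this page, which reports an overall score of 78.

```python
def weighted_overall(criteria):
    """Weight-normalized mean of criterion scores, rounded to an integer.

    Assumption: this is only one plausible aggregation; the platform's
    real formula is not documented here.
    """
    total_weight = sum(c["weight"] for c in criteria)
    if total_weight <= 0:
        raise ValueError("criterion weights must sum to a positive number")
    weighted_sum = sum(c["weight"] * c["score"] for c in criteria)
    return round(weighted_sum / total_weight)


# Weights and scores copied from the sample response on this page.
sample_criteria = [
    {"name": "readability", "weight": 0.20, "score": 82},
    {"name": "seo_keyword_coverage", "weight": 0.25, "score": 74},
    {"name": "brand_voice", "weight": 0.20, "score": 88},
    {"name": "depth", "weight": 0.20, "score": 70},
    {"name": "fact_quality", "weight": 0.15, "score": 76},
]

print(weighted_overall(sample_criteria))  # 78
```

Dividing by the total weight keeps the result on the 0 to 100 scale even if a rubric's weights do not sum exactly to 1.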
Endpoints
| Method | Path | Purpose |
|---|---|---|
| POST | /v1/produce/evaluate | Evaluate one piece of content against a rubric |
| GET | /v1/produce/evaluate | List evaluations |
| GET | /v1/produce/evaluate/{id} | Get one evaluation |
| POST | /v1/produce/evaluate/compare | Compare two evaluations (e.g., before/after refinement) |
| GET | /v1/produce/evaluate/rubric | List rubrics |
| POST | /v1/produce/evaluate/rubric | Create a rubric |
| PATCH | /v1/produce/evaluate/rubric/{id} | Update a rubric |
| DELETE | /v1/produce/evaluate/rubric/{id} | Delete a rubric |
| POST | /v1/produce/evaluate/rubric/system.{slug}/fork | Fork a system rubric |
POST /v1/produce/evaluate
{
"contentType": "blog_post",
"contentId": "bp_***",
"rubricId": "rub_acme-blog-standard",
"primaryKeywordId": "kw_***"
}

| Field | Type | Required | Notes |
|---|---|---|---|
| contentType | "blog_post" \| "landing_page" \| "content" | yes | — |
| contentId | string | yes | — |
| rubricId | string | no | Workspace default if absent |
| primaryKeywordId | string | no | For SEO scoring; if absent, inferred from content's linked keywords |
| additionalCriteria | object | no | Ad-hoc criteria added on top of the rubric |
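The field rules above can be checked client-side before sending a request. The following is a minimal sketch, assuming you assemble the JSON body yourself; `build_evaluate_request` and the `bp_example` ID are hypothetical names used only for illustration, and the server remains the authority on validation.

```python
import json

# Content types listed in the field table above.
ALLOWED_CONTENT_TYPES = {"blog_post", "landing_page", "content"}


def build_evaluate_request(content_type, content_id, rubric_id=None,
                           primary_keyword_id=None, additional_criteria=None):
    """Build a body for POST /v1/produce/evaluate (hypothetical helper).

    Optional fields are left out entirely when not supplied, so the
    documented defaults (workspace rubric, inferred keyword) apply.
    """
    if content_type not in ALLOWED_CONTENT_TYPES:
        raise ValueError(f"unsupported contentType: {content_type!r}")
    if not content_id:
        raise ValueError("contentId is required")

    body = {"contentType": content_type, "contentId": content_id}
    optional = {
        "rubricId": rubric_id,
        "primaryKeywordId": primary_keyword_id,
        "additionalCriteria": additional_criteria,
    }
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


# "bp_example" is a placeholder ID, not a real content item.
print(json.dumps(build_evaluate_request("blog_post", "bp_example"), indent=2))
```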
Response
{
"invocationId": "inv_***",
"agentId": "agt_***",
"skill": "produce.evaluate",
"status": "RUNNING",
"estimatedCost": { "credits": 4, "durationSeconds": 25 }
}

Final result
{
"invocationId": "inv_***",
"agentId": "agt_***",
"skill": "produce.evaluate",
"status": "SUCCEEDED",
"creditsUsed": 4,
"result": {
"evaluationId": "ev_***",
"contentId": "bp_***",
"rubricId": "rub_acme-blog-standard",
"rubricVersion": 3,
"overallScore": 78,
"verdict": "MINOR_FIXES",
"criteria": [
{
"name": "readability",
"weight": 0.2,
"score": 82,
"reasoning": "Grade level 9.1; average sentence 17 words; intro engaging.",
"issues": []
},
{
"name": "seo_keyword_coverage",
"weight": 0.25,
"score": 74,
"reasoning": "Primary keyword in H1 and 4 sub-sections. Missing from meta description.",
"issues": [
{
"type": "primary_keyword_missing_meta",
"fixHint": "Add 'engineering team capacity tracking' to meta description"
}
]
},
{
"name": "brand_voice",
"weight": 0.2,
"score": 88,
"reasoning": "Strong second-person voice throughout. Two corporate cliches in section 4.",
"issues": [
{
"type": "corporate_cliche",
"section": 4,
"phrase": "synergize across stakeholders"
}
]
},
{
"name": "depth",
"weight": 0.2,
"score": 70,
"reasoning": "2,247 words. SERP top-3 average 3,100. Underdeveloped on capacity forecasting subtopic.",
"issues": [
{ "type": "underdeveloped_subtopic", "topic": "capacity forecasting" }
]
},
{
"name": "fact_quality",
"weight": 0.15,
"score": 76,
"reasoning": "5 verifiable claims; 4 with citations. One claim ('80% of teams ...') lacks a source.",
"issues": [
{
"type": "unsupported_claim",
"section": 2,
"claim": "80% of teams ..."
}
]
}
],
"recommendedActions": [
{
"action": "produce.refine",
"refinerId": "rfn_seo-cleanup",
"expectedScoreDelta": 6
},
{
"action": "produce.blog_post.regenerate",
"scope": "section:capacity_forecasting",
"expectedScoreDelta": 5
}
]
},
"raw": "Scored 78. Two big lifts available: SEO cleanup (meta description) and expanding the capacity forecasting section ...",
"files": [
"agent-workspace/scorecard.json",
"agent-output/evaluation-report.md"
]
}

Verdict values
| Verdict | Score range | Meaning |
|---|---|---|
| PUBLISH_READY | 80–100 | Good to ship |
| MINOR_FIXES | 60–79 | Refine first; specific fixes listed |
| MAJOR_REWORK | 40–59 | Regenerate sections or refresh thoroughly |
| SCRAP | 0–39 | Better to start over |
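The score bands in the table translate directly into a lookup. This is a small sketch, assuming integer scores and the default thresholds shown above; whether a rubric can override these bands is not stated here, and `verdict_for` is a hypothetical helper name.

```python
def verdict_for(score):
    """Map a 0-100 overall score to a verdict using the table's bands.

    Assumption: thresholds are the defaults from the table above.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 80:
        return "PUBLISH_READY"
    if score >= 60:
        return "MINOR_FIXES"
    if score >= 40:
        return "MAJOR_REWORK"
    return "SCRAP"


for example_score in (92, 78, 45, 12):
    print(example_score, verdict_for(example_score))
```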
POST /v1/produce/evaluate/compare
Compare two evaluations of the same (or related) content — useful for before/after refinement.
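To make the shape of a comparison concrete, the sketch below derives a similar delta locally from two evaluation `result` objects. It is only an approximation of what the endpoint might compute: the server-side logic is not documented on this page, `compare_evaluations` is a hypothetical helper, and treating "remaining" as every issue type still present in the later evaluation is an assumption.

```python
def compare_evaluations(before, after):
    """Approximate the compare response from two evaluation results.

    Assumption: issues are matched by their "type" field only.
    """
    before_scores = {c["name"]: c["score"] for c in before["criteria"]}
    after_scores = {c["name"]: c["score"] for c in after["criteria"]}

    def issue_types(evaluation):
        return {
            issue["type"]
            for criterion in evaluation["criteria"]
            for issue in criterion.get("issues", [])
        }

    before_issues = issue_types(before)
    after_issues = issue_types(after)

    return {
        "delta": {
            "overallScore": after["overallScore"] - before["overallScore"],
            # Only criteria present in both evaluations get a delta.
            "byCriterion": {
                name: after_scores[name] - before_scores[name]
                for name in before_scores
                if name in after_scores
            },
            "issuesResolved": sorted(before_issues - after_issues),
            "issuesRemaining": sorted(after_issues),
        }
    }
```

Computing this locally can be handy in tests, but the endpoint's own response should be preferred whenever the two disagree.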
curl -X POST .../v1/produce/evaluate/compare -d '{
"evaluationIds": ["ev_***A", "ev_***B"]
}'

{
"delta": {
"overallScore": 6,
"byCriterion": {
"readability": 2,
"seo_keyword_coverage": 9,
"brand_voice": 0,
"depth": 4,
"fact_quality": 0
},
"issuesResolved": ["primary_keyword_missing_meta"],
"issuesRemaining": ["corporate_cliche", "underdeveloped_subtopic"]
}
}

CRUD: /v1/produce/evaluate/rubric
Create a rubric
curl -X POST .../v1/produce/evaluate/rubric -d '{
"name": "Acme blog standard",
"description": "Our standard rubric for blog posts",
"appliesTo": ["blog_post"],
"criteria": [
{ "name": "readability", "weight": 0.20, "type": "system.readability" },
{ "name": "seo_keyword_coverage", "weight": 0.25, "type": "system.seo" },
{ "name": "brand_voice", "weight": 0.20, "type": "llm_check", "prompt": "Rate how strongly this article matches our brand voice ..." },
{ "name": "depth", "weight": 0.20, "type": "system.depth_vs_serp" },
{ "name": "fact_quality", "weight": 0.15, "type": "llm_check", "prompt": "Rate fact-checking quality ..." }
]
}'

List + update + delete
curl .../v1/produce/evaluate/rubric
curl -X PATCH .../v1/produce/evaluate/rubric/rub_***
curl -X DELETE .../v1/produce/evaluate/rubric/rub_***

Built-in rubrics
- system.seo: keyword coverage, on-page SEO basics
- system.readability: grade level, sentence length, structure
- system.brand: generic brand voice check (you'll likely want to fork this)
- system.depth-vs-serp: compare depth to current SERP top-3
- system.fact-quality: citation density, claim verifiability
Fork:
curl -X POST .../v1/produce/evaluate/rubric/system.seo/fork -d '{
"name": "Acme SEO (customized)"
}'

MCP
> Evaluate bp_*** against our blog standard.

Claude calls produce.evaluate.score.

> Compare the eval before and after the refine.

Claude calls produce.evaluate.compare.

> What's wrong with bp_*** — just the issues, not the scores.

Claude calls produce.evaluate.score and renders only criteria[].issues[].
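Rendering only `criteria[].issues[]` amounts to flattening the nested issue lists in a final result. A minimal sketch of that flattening, assuming the `result` shape shown in the sample response on this page; `collect_issues` is a hypothetical helper and the added `criterion` key is an illustrative choice, not part of the API.

```python
def collect_issues(result):
    """Flatten criteria[].issues[] into one list, tagging each issue
    with the name of the criterion that raised it."""
    issues = []
    for criterion in result.get("criteria", []):
        for issue in criterion.get("issues", []):
            issues.append({"criterion": criterion["name"], **issue})
    return issues


# Trimmed-down result using values from the sample response.
sample_result = {
    "criteria": [
        {"name": "readability", "score": 82, "issues": []},
        {
            "name": "depth",
            "score": 70,
            "issues": [{"type": "underdeveloped_subtopic", "topic": "capacity forecasting"}],
        },
    ]
}

for issue in collect_issues(sample_result):
    print(issue)
```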
Errors
| Status | Code | Cause |
|---|---|---|
| 404 | content_not_found | — |
| 404 | rubric_not_found | — |
| 422 | incompatible_rubric | Rubric's appliesTo doesn't include this content type |
Cost
| Action | Credits |
|---|---|
| Per evaluation | 4 (default rubric) |
| Per custom criterion (LLM-based) | +1 each |
| Compare | free |
| CRUD rubrics | free |
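The pricing above is simple enough to estimate before a run. A minimal sketch, assuming the base price covers the rubric itself and each extra LLM-based criterion adds one credit; how the platform counts "custom" criteria is not spelled out here, so `estimate_credits` is a hypothetical helper and the `estimatedCost` in the API response remains the authoritative figure.

```python
BASE_CREDITS = 4                 # per evaluation, from the cost table
CREDITS_PER_LLM_CRITERION = 1    # per custom LLM-based criterion


def estimate_credits(custom_llm_criteria=0, evaluations=1):
    """Estimate credits for a batch of evaluations (assumed formula)."""
    if custom_llm_criteria < 0 or evaluations < 0:
        raise ValueError("counts must be non-negative")
    per_evaluation = BASE_CREDITS + CREDITS_PER_LLM_CRITERION * custom_llm_criteria
    return per_evaluation * evaluations


print(estimate_credits())                                        # 4
print(estimate_credits(custom_llm_criteria=2, evaluations=10))   # 60
```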
Use cases (chaining endpoints together)
A. Eval gate before publish
# Eval Gate
when:
  field_lt: { evaluationScore: 75 }
then:
  action: escalate_to_approval
  reason: "Score below threshold — manual review"

produce.publish runs the eval gate; sub-75 evaluations pause for approval.
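The gate rule reads as a single comparison. The sketch below mirrors it in plain Python for illustration; the platform evaluates the gate itself, `apply_eval_gate` is a hypothetical helper, and the `proceed` action name is an assumption (only `escalate_to_approval` appears in the rule above).

```python
def apply_eval_gate(evaluation_score, threshold=75):
    """Mirror the gate: scores strictly below the threshold escalate."""
    if evaluation_score < threshold:
        return {
            "action": "escalate_to_approval",
            "reason": "Score below threshold; manual review",
        }
    # "proceed" is a hypothetical name for the pass-through outcome.
    return {"action": "proceed"}


print(apply_eval_gate(74)["action"])
print(apply_eval_gate(75)["action"])
```

Note that `field_lt` is a strict comparison, so a score of exactly 75 passes the gate in this reading.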
B. Eval-then-refine loop
EV=$(curl -sf -X POST .../v1/produce/evaluate -d '{...}' | jq -r '.invocationId')
# wait...
ACTIONS=$(curl -sf .../v1/agent/invocations/$EV | jq -c '.result.recommendedActions')
# Auto-apply the top recommended action

The refresh_stale skill does this loop autonomously.
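The "wait" step and the choice of the top action can be sketched as follows. This is an illustration under stated assumptions: `fetch_invocation` is a hypothetical callable you supply (for example a wrapper around a GET on the invocation), any status other than `RUNNING` is treated as terminal because only `RUNNING` and `SUCCEEDED` appear on this page, and "top" is taken to mean the largest `expectedScoreDelta`.

```python
import time


def wait_for_result(fetch_invocation, invocation_id, poll_seconds=5.0,
                    timeout_seconds=120.0, sleep=time.sleep):
    """Poll until the invocation leaves RUNNING or the timeout passes.

    fetch_invocation: hypothetical callable taking an invocation ID and
    returning the invocation as a dict.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        invocation = fetch_invocation(invocation_id)
        if invocation.get("status") != "RUNNING":
            return invocation
        if time.monotonic() >= deadline:
            raise TimeoutError(f"invocation {invocation_id} still running")
        sleep(poll_seconds)


def top_recommended_action(invocation):
    """Return the action with the largest expectedScoreDelta, or None."""
    result = invocation.get("result") or {}
    actions = result.get("recommendedActions") or []
    if not actions:
        return None
    return max(actions, key=lambda action: action.get("expectedScoreDelta", 0))
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.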
C. A/B refiner experiment
Apply two competing refiners to the same draft (two revisions), evaluate both, compare. Keep the winner.
D. Cross-portfolio quality audit
curl -X POST .../v1/workspaces/bulk-action -d '{
"action": "produce.evaluate.score",
"workspaces": "all",
"config": {
"contentType": "blog_post",
"scope": { "publishStatus": "PUBLISHED", "publishedAfter": "2026-04-01" }
}
}'

Returns a quality scorecard per workspace.
Related
- API: Production · refine
- API: Production · blog post
- API: Production · landing page
- API: Production · publish
- Concept: Eval Gates
- Playbook: Refresh stale content on rank drops