
Keywords: lifecycle, provenance, priority, and pillar

How CitationBench models the Keyword resource — lifecycle states, source provenance, priority, pillar mapping, and rank history — separately from the 2D labeling system.

The Keyword is the most-used resource in CitationBench. Every blog post, landing page, rank check, outreach campaign, and AI citation eventually points back to one. This page explains how we model it: the lifecycle, the source provenance, the priority system, and the pillar relationship. The 2D labeling system is covered separately on its own concept page.

The short version

  • A Keyword is an org-scoped persistent record with a lifecycle (RAW → LABELLING → LABELED → FOCUSED → ARCHIVED)
  • Each carries provenance (where it came from), labels (the 2D taxonomy), tags, priority, optional pillar, rank history
  • The same keyword string can exist in many workspaces — but is unique within a workspace
  • Operations on keywords (research, search, label, tag, check rank) live on the Research · keyword API page
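
The uniqueness rule in the list above can be pictured as a composite key of workspace plus keyword string. The following is a minimal sketch, not CitationBench's storage layer: `KeywordStore` and the `ws_globex` workspace are hypothetical, and exact string matching is an assumption (the real rule may normalize case or whitespace).

```python
from dataclasses import dataclass, field


@dataclass
class KeywordStore:
    """In-memory stand-in for keyword storage (hypothetical, for illustration)."""

    _by_key: dict = field(default_factory=dict)

    def add(self, organization_id: str, keyword: str) -> bool:
        # Uniqueness is scoped to the workspace, so the key includes the org id.
        # Exact string match is an assumption made for this sketch.
        key = (organization_id, keyword)
        if key in self._by_key:
            return False  # duplicate within the same workspace
        self._by_key[key] = {"organizationId": organization_id, "keyword": keyword}
        return True


store = KeywordStore()
first = store.add("ws_acme", "project management software")
duplicate = store.add("ws_acme", "project management software")
other_workspace = store.add("ws_globex", "project management software")
```

Here the second insert into `ws_acme` is rejected, while the same string is accepted in a different workspace.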

Why we modeled it this way

Three design constraints shaped this:

1. Provenance matters for trust. When an agent surfaces a keyword for content creation, you need to know where it came from. Possible sources include DataForSEO related-keywords, Ahrefs matching, an LLM mention pass, a Google Search Console import, and manual entry, and each has different reliability. We carry source and sourceDetails on every keyword so downstream tools can weight them differently.

2. Lifecycle states avoid premature commitment. Most keywords start as RAW — discovered but not yet evaluated. The labeling pass moves them to LABELED. The strategist (or the agent) promotes the ones worth pursuing to FOCUSED. Stale or off-target ones get ARCHIVED. You can run different tools against different lifecycle slices.

3. Priority + pillar + tags are three independent axes. Priority answers "how urgent." Pillar answers "what content theme." Tags answer everything else. Conflating them caused every previous SEO tool we used to drown its keyword DB in soup. Keeping them orthogonal lets you cross-filter naturally.
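
To make the orthogonality in point 3 concrete, here is a minimal sketch of cross-filtering on the three axes. It is an in-memory illustration, not the keyword.search implementation; the `cross_filter` helper is hypothetical, the field names follow the record shown under "The data model", and the sample values are made up.

```python
def cross_filter(keywords, priority=None, pillar_id=None, tag=None):
    """Return keywords matching every axis that was supplied.

    An axis left as None is ignored, so the three filters stay independent.
    """
    matches = []
    for kw in keywords:
        if priority is not None and kw.get("priority") != priority:
            continue
        if pillar_id is not None and kw.get("pillarId") != pillar_id:
            continue
        if tag is not None and tag not in kw.get("tags", []):
            continue
        matches.append(kw)
    return matches


# Sample records; values are invented for illustration.
sample = [
    {"keyword": "pm software pricing", "priority": "HIGH",
     "pillarId": "pil_pricing", "tags": ["q2-2026"]},
    {"keyword": "pm software for engineers", "priority": "HIGH",
     "pillarId": None, "tags": ["engineering-icp"]},
    {"keyword": "kanban vs scrum", "priority": "LOW",
     "pillarId": "pil_pricing", "tags": ["q2-2026", "engineering-icp"]},
]

high_only = cross_filter(sample, priority="HIGH")
high_pricing = cross_filter(sample, priority="HIGH", pillar_id="pil_pricing")
q2_pricing = cross_filter(sample, pillar_id="pil_pricing", tag="q2-2026")
```

Because no axis is derived from another, adding or removing one filter never changes how the others behave.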

The data model

{
  "id": "kw_01HVZ...",
  "keyword": "project management software for engineering teams",
  "organizationId": "ws_acme",

  "source": "DATAFORSEO",
  "sourceDetails": {
    "method": "keyword_ideas",
    "seedTerm": "project management software"
  },
  "importedAt": "2026-05-24T08:01:42Z",
  "importBatchId": "inv_01HVZ...",

  "status": "LABELED",
  "priority": "HIGH",

  "intentLabels": ["SPECIFICATION", "PROBLEM_SOLUTION"],
  "intentConfidence": 0.91,
  "relevanceLabel": "OFFERING",
  "relevanceConfidence": 0.87,
  "isHighIntent": true,
  "isHighRelevance": true,
  "labelReason": "Searcher comparing PM tools by feature; aligned to our core offering.",
  "isAiGenerated": true,
  "labeledAt": "2026-05-24T08:02:18Z",

  "pillarId": "pil_pricing",
  "tags": ["q2-2026", "engineering-icp"],

  "parentKeywordId": null,
  "notes": null,

  "createdAt": "2026-05-24T08:01:42Z",
  "updatedAt": "2026-05-24T08:02:18Z"
}
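
As a rough illustration of reading such a record client-side, the sketch below parses a trimmed copy of the example above and derives a few facts from it. Nothing here is an official client; `parse_timestamp` is a hypothetical helper, and the record is shortened to the fields the sketch uses.

```python
import json
from datetime import datetime

# Trimmed copy of the example record above.
record_json = """
{
  "keyword": "project management software for engineering teams",
  "status": "LABELED",
  "importedAt": "2026-05-24T08:01:42Z",
  "labeledAt": "2026-05-24T08:02:18Z",
  "pillarId": "pil_pricing",
  "parentKeywordId": null
}
"""


def parse_timestamp(value):
    # Older Python versions do not accept a trailing "Z" in fromisoformat,
    # so it is rewritten as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


record = json.loads(record_json)

# Time between import and the end of the labeling pass, in seconds.
labeling_seconds = (
    parse_timestamp(record["labeledAt"]) - parse_timestamp(record["importedAt"])
).total_seconds()

has_pillar = record.get("pillarId") is not None          # pillar is optional
is_variant = record.get("parentKeywordId") is not None   # null means no head keyword
```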

Source enum

Value                  Meaning
DATAFORSEO             Discovered via DataForSEO (related, ideas, llm mentions)
AHREFS                 Discovered via Ahrefs (matching, related, suggestions)
GOOGLE_SEARCH_CONSOLE  Imported from a GSC property's actual impressions
AI_SUGGESTION          An agent (e.g. during bootstrap_brand) proposed it
USER_INPUT             Manually entered
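
A downstream tool might turn the source value into a reliability weight. The sketch below shows one way to do that; the numeric weights are entirely hypothetical (this page only says that sources differ in reliability), and `rank_by_source` is an illustrative helper, not part of the API.

```python
# Hypothetical weights: higher means more trusted. Tune these to your own data.
SOURCE_WEIGHT = {
    "GOOGLE_SEARCH_CONSOLE": 1.0,  # backed by real impressions
    "AHREFS": 0.8,
    "DATAFORSEO": 0.8,
    "USER_INPUT": 0.7,
    "AI_SUGGESTION": 0.5,
}


def rank_by_source(keywords):
    # Unknown sources get weight 0.0 and therefore sort last.
    # sorted() is stable, so ties keep their input order.
    return sorted(
        keywords,
        key=lambda kw: SOURCE_WEIGHT.get(kw["source"], 0.0),
        reverse=True,
    )


candidates = [
    {"keyword": "a", "source": "AI_SUGGESTION"},
    {"keyword": "b", "source": "GOOGLE_SEARCH_CONSOLE"},
    {"keyword": "c", "source": "USER_INPUT"},
]
ranked = [kw["keyword"] for kw in rank_by_source(candidates)]
```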

Status enum

RAW         → discovered, not yet labeled
LABELLING   → currently in the labeling pass
LABELED     → has intent + relevance + confidence
FOCUSED     → promoted as a keyword we're actively pursuing
ARCHIVED    → soft-deleted; excluded from default queries

Lifecycle transitions are open — you can move a keyword from any state to any other. Most are agent-driven.
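
The open-transition rule and the soft-delete behaviour of ARCHIVED can be sketched as follows. This is an in-memory illustration under stated assumptions: `set_status` and `default_query` are hypothetical helpers, and `include_archived` is an invented parameter name, not a documented query flag.

```python
from enum import Enum


class Status(str, Enum):
    RAW = "RAW"
    LABELLING = "LABELLING"
    LABELED = "LABELED"
    FOCUSED = "FOCUSED"
    ARCHIVED = "ARCHIVED"


def set_status(keyword, new_status):
    # Transitions are open, so the only check is that the target is a known state.
    keyword["status"] = Status(new_status).value
    return keyword


def default_query(keywords, include_archived=False):
    # ARCHIVED is a soft delete: hidden unless the caller opts in.
    if include_archived:
        return list(keywords)
    return [kw for kw in keywords if kw["status"] != Status.ARCHIVED.value]


kws = [
    {"keyword": "a", "status": "RAW"},
    {"keyword": "b", "status": "FOCUSED"},
]
set_status(kws[0], "ARCHIVED")  # RAW straight to ARCHIVED is allowed
set_status(kws[1], "RAW")       # FOCUSED back to RAW is allowed too
visible = default_query(kws)
```

Passing an unknown state name raises `ValueError`, which is the only guard this sketch applies.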

Priority enum

CRITICAL  → highest; agent prioritizes for content / rank tracking
HIGH      → next
MEDIUM    → default
LOW       → below default
BACKLOG   → discovered but not actionable yet

Priority is used by agents.rank_monitor, content_factory, and keyword_manager to order work.
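
Ordering work by priority might look like the sketch below. It assumes the enum order listed above is the work order and that a missing priority falls back to MEDIUM, the documented default; `order_work` is a hypothetical helper, not the agents' actual scheduler.

```python
PRIORITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW", "BACKLOG"]
PRIORITY_RANK = {name: rank for rank, name in enumerate(PRIORITY_ORDER)}


def order_work(keywords):
    # sorted() is stable, so keywords with equal priority keep their input order.
    # A missing priority is treated as MEDIUM, the documented default.
    return sorted(
        keywords,
        key=lambda kw: PRIORITY_RANK[kw.get("priority", "MEDIUM")],
    )


queue = order_work([
    {"keyword": "a", "priority": "BACKLOG"},
    {"keyword": "b"},                        # no priority set
    {"keyword": "c", "priority": "CRITICAL"},
    {"keyword": "d", "priority": "HIGH"},
])
ordered_names = [kw["keyword"] for kw in queue]
```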

How it interacts with other concepts

Concept                 Relationship
2D Keyword Labelling    Every labeled keyword has an intent × relevance pair with confidence
Pillars                 Keywords optionally belong to one LandingPagePillar. Pillars set default voice + landing page template.
Tags                    Many-to-many; org-scoped reusable label set
BlogPost / LandingPage  Keywords are linked to one or more blog posts and landing pages (primary + secondary)
KeywordRank             Each keyword has a rolling history of KeywordRank records: position, URL, owned-domain flag, location, device
Competitor              CompetitorKeyword is a separate model: keywords your competitors rank for, tied to the competitor domain

Common patterns

1. Discover → label → focus → write

research.keyword         (discover, source: DATAFORSEO/AHREFS, status: RAW)
        ↓
labeling pass            (status: LABELLING → LABELED)
        ↓
keyword.search filters   (you pick the winners by label + KD + volume)
        ↓
PATCH keyword.priority = HIGH, status = FOCUSED
        ↓
content_factory          (writes content for FOCUSED keywords first)
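
The "pick the winners" and PATCH steps of the flow above could be approximated client-side as in this sketch. Several things are assumptions: `keywordDifficulty` and `searchVolume` are hypothetical field names (the data model on this page does not show them), the thresholds are placeholders, and `pick_winners` and `promotion_patch` are illustrative helpers.

```python
def pick_winners(keywords, max_kd=40, min_volume=100):
    """Select LABELED keywords that are high intent and high relevance.

    "keywordDifficulty" and "searchVolume" are hypothetical field names, and
    the default thresholds are placeholders.
    """
    return [
        kw for kw in keywords
        if kw["status"] == "LABELED"
        and kw.get("isHighIntent")
        and kw.get("isHighRelevance")
        and kw["keywordDifficulty"] <= max_kd
        and kw["searchVolume"] >= min_volume
    ]


def promotion_patch():
    # Body for the PATCH step: raise priority and mark the keyword as FOCUSED.
    return {"priority": "HIGH", "status": "FOCUSED"}


labeled = [
    {"id": "kw_1", "status": "LABELED", "isHighIntent": True,
     "isHighRelevance": True, "keywordDifficulty": 25, "searchVolume": 900},
    {"id": "kw_2", "status": "LABELED", "isHighIntent": True,
     "isHighRelevance": False, "keywordDifficulty": 10, "searchVolume": 5000},
    {"id": "kw_3", "status": "RAW", "isHighIntent": True,
     "isHighRelevance": True, "keywordDifficulty": 5, "searchVolume": 300},
]
winners = pick_winners(labeled)
patches = {kw["id"]: promotion_patch() for kw in winners}
```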

2. Source-based filtering

Filter for keywords actually getting impressions in GSC (vs aspirational ones from DataForSEO):

curl -G .../v1/keywords --data-urlencode "source=GOOGLE_SEARCH_CONSOLE"
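
The same query string can be built in code. This sketch uses only the standard library; `keywords_url` is a hypothetical helper, and the host is a placeholder because the real base URL is elided on this page.

```python
from urllib.parse import urlencode


def keywords_url(base_url, **filters):
    # Drop unset filters, then percent-encode the rest, much as
    # curl -G does with --data-urlencode.
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    url = f"{base_url.rstrip('/')}/v1/keywords"
    return f"{url}?{query}" if query else url


# Placeholder host for illustration only; substitute your real API base URL.
url = keywords_url("https://api.example.invalid", source="GOOGLE_SEARCH_CONSOLE")
```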

3. Parent–child variants

A keyword can have a parentKeywordId pointing to a "head" keyword it's a long-tail variant of. Useful for clustering.
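
Clustering by head keyword can be sketched as a simple group-by on parentKeywordId. The helper name and sample ids below are made up, and the sketch assumes a single level of nesting, where a keyword with no parent heads its own cluster.

```python
from collections import defaultdict


def cluster_by_head(keywords):
    """Group keyword strings under the id of their head keyword."""
    clusters = defaultdict(list)
    for kw in keywords:
        # A keyword without a parent is treated as the head of its own cluster.
        head_id = kw.get("parentKeywordId") or kw["id"]
        clusters[head_id].append(kw["keyword"])
    return dict(clusters)


variants = [
    {"id": "kw_1", "keyword": "project management software",
     "parentKeywordId": None},
    {"id": "kw_2", "keyword": "project management software for engineering teams",
     "parentKeywordId": "kw_1"},
    {"id": "kw_3", "keyword": "gantt chart tool", "parentKeywordId": None},
]
clusters = cluster_by_head(variants)
```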

4. Bulk import + auto-label

curl -X POST .../v1/keywords/bulk -H "Content-Type: application/json" -d '{
  "keywords": [
    { "keyword": "...", "source": "USER_INPUT" },
    { "keyword": "...", "source": "USER_INPUT" }
  ],
  "label": true,
  "pillarId": "pil_pricing"
}'
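
When the keyword list comes from a spreadsheet or text file, the request body can be assembled programmatically. The sketch below builds the same shape as the curl example above; `build_bulk_payload` is a hypothetical helper, and trimming whitespace and dropping exact repeats are conveniences added here, not documented API behaviour.

```python
import json


def build_bulk_payload(strings, pillar_id=None, label=True):
    seen = set()
    items = []
    for raw in strings:
        text = raw.strip()
        # Skip blanks and exact repeats before sending.
        if not text or text in seen:
            continue
        seen.add(text)
        items.append({"keyword": text, "source": "USER_INPUT"})
    payload = {"keywords": items, "label": label}
    if pillar_id is not None:
        payload["pillarId"] = pillar_id  # pillar is optional
    return payload


payload = build_bulk_payload(
    ["gantt chart tool", " gantt chart tool ", "", "sprint planning software"],
    pillar_id="pil_pricing",
)
body = json.dumps(payload)  # JSON string to send as the request body
```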

5. Relabel after a product change

Edit your workspace's product description, then:

curl -X POST .../v1/keywords/relabel -H "Content-Type: application/json" -d '{ "scope": { "status": "LABELED" } }'

The agent re-runs the 2D labeling pass with the new context.
