AI Visibility Score Methodology

How to read this methodology

This page explains the scoring model behind the AI Readiness baseline. It is written for technical reviewers, web teams, SEO specialists, and governance owners.

What the score measures

The score measures whether AI systems can access the website, read the right company facts, and find signals that support accurate AI answers.

What the score does not guarantee

The score measures readiness signals. It does not guarantee that any AI platform will cite the website in every answer.

What teams should do with it

Use the score to prioritize website fixes: access, indexability, structured data, content clarity, and citation-readiness gaps.

Overall Score Calculation

The overall score combines three areas: AI access and understanding, SEO foundation, and citation-readiness signals.

VerisAI L1 checks whether AI systems can access a website. L2-L4 check whether they can understand it. L7 estimates whether the page is structured enough to be citation-ready.

Formula: Overall Score = (AI Readiness + SEO + Citation Readiness) / 3
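As a minimal sketch of the formula above combined with the PASS / WARN / FAIL thresholds used in this methodology, the calculation could look like the following. The function names are illustrative and not part of any VerisAI API. Treating every value below 70 (for example 69.5) as WARN is an assumption, since the thresholds are stated for whole numbers only.

```python
def overall_score(ai_readiness: float, seo: float, citation_readiness: float) -> float:
    """Equal-weight mean of the three component scores (each 0-100)."""
    return (ai_readiness + seo + citation_readiness) / 3


def status(score: float) -> str:
    """Map a 0-100 score to the PASS / WARN / FAIL bands."""
    if score >= 70:
        return "PASS"
    if score >= 40:  # assumption: fractional scores below 70 fall into WARN
        return "WARN"
    return "FAIL"


# Hypothetical component scores for one website.
example = overall_score(ai_readiness=82.0, seo=64.0, citation_readiness=55.0)
example_status = status(example)
```

With these hypothetical inputs the mean is 67, which falls in the WARN band even though one component on its own would pass.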

AI Readiness (33.3%)

Layer 1-4: Can AI systems access the website and understand the core company facts?

Components: Gateway, SSR, Indexability, Content Quality

SEO Foundation (33.3%)

Layer 5-6: Does the website have the technical and on-page foundation needed for discovery?

Components: Technical SEO, On-Page SEO

Citation Readiness (33.3%)

Layer 7: Does the website expose signals that make AI citation more likely?

Components: GPT, Gemini, Claude, Perplexity, Grok, and Meta readiness scoring

Thresholds:

≥70 = PASS (strong readiness signals)
40-69 = WARN (partial readiness, visible gaps remain)
<40 = FAIL (critical blocking issues or weak signals)

Component 1: AI Readiness Score (Layer 1-4)

Formula: AI Readiness = (Content Score × 0.6) + (Indexability Score × 0.4)

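Content Score = Layer 4 Quality × SSR Factor (the SSR Factor comes from Layer 2)
Indexability Score = 100 + Layer 3 Penalties (penalties are zero or negative, so the score never exceeds 100)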
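The two formulas above can be sketched as one small function. This is an illustration under stated assumptions, not the production scorer: the function and parameter names are hypothetical, and the Layer 1 and Layer 2 gates that force AI Readiness to 0 are left to the caller.

```python
def ai_readiness(layer4_quality: float, ssr_factor: float, layer3_penalties: float) -> float:
    """AI Readiness = Content Score * 0.6 + Indexability Score * 0.4.

    layer4_quality:   Layer 4 content quality, 0-100
    ssr_factor:       1.0 (GOOD) or 0.7 (PARTIAL) from Layer 2
    layer3_penalties: zero or negative sum of Layer 3 penalties
    """
    content_score = layer4_quality * ssr_factor
    indexability_score = 100 + layer3_penalties
    return content_score * 0.6 + indexability_score * 0.4


# Hypothetical page: quality 80, PARTIAL rendering, 30 penalty points.
example = ai_readiness(layer4_quality=80, ssr_factor=0.7, layer3_penalties=-30)
```

Here the content score is 56 and the indexability score is 70, which gives an AI Readiness of about 61.6.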

Layer 1: Gateway (Binary Gate)

Status: PASS or BLOCKED

Layer 1 separates declared crawler policy from observed access behavior. It checks robots.txt, guidance files, and bot-level HTTP access because robots.txt alone does not prove that AI crawlers receive a usable page.

Scoring (94 bot-weight pts + 15 pts llms.txt bonus, capped at 100):

Bot weights reflect the relative importance assigned by the VerisAI scoring model. Blocking major AI crawlers carries a higher penalty than blocking lower-priority crawlers. Weights should be reviewed periodically as platform behavior changes.

Bot	Platform	Points	AI chatbot share (US, May 2026)
GPTBot	ChatGPT / OpenAI	36	60.6 %
Google-Extended	Gemini / Google AI	20	15.1 %
OAI-SearchBot	ChatGPT Search / OpenAI	14	— (part of ChatGPT)
PerplexityBot	Perplexity AI	10	5.4 %
ClaudeBot	Claude / Anthropic	10	5.0 %
GrokBot	Grok / xAI	4	0.6 %
llms.txt	All platforms	+15 bonus	—

AI chatbot share source: FirstPageSage, U.S., May 2026. Bot weights are not a direct proportional mapping of share. They combine chatbot market share, live crawler activity (BotMonitor audit 2026-05-27), and citation impact, and are reviewed quarterly.
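The point table above can be read as a weighted sum with a cap. The sketch below assumes that a bot earns its full weight when it is accessible and nothing when it is blocked; this section does not spell out partial credit, so that rule is an assumption, as are the names used. How the point total maps to the PASS or BLOCKED status is not defined here and is not modeled.

```python
# Weights copied from the table above.
BOT_WEIGHTS = {
    "GPTBot": 36,
    "Google-Extended": 20,
    "OAI-SearchBot": 14,
    "PerplexityBot": 10,
    "ClaudeBot": 10,
    "GrokBot": 4,
}
LLMS_TXT_BONUS = 15
MAX_POINTS = 100


def gateway_points(accessible_bots: set, has_llms_txt: bool) -> int:
    """Sum the weights of accessible scored bots, add the llms.txt bonus, cap at 100."""
    points = sum(weight for bot, weight in BOT_WEIGHTS.items() if bot in accessible_bots)
    if has_llms_txt:
        points += LLMS_TXT_BONUS
    return min(points, MAX_POINTS)


all_bots = set(BOT_WEIGHTS)
full_access = gateway_points(all_bots, has_llms_txt=False)              # 94
gptbot_blocked = gateway_points(all_bots - {"GPTBot"}, has_llms_txt=True)  # 58 + 15 = 73
```

The cap matters only when every scored bot is accessible and llms.txt is present, because 94 + 15 would otherwise exceed 100.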

robots.txt check: Per-bot ALLOWED / BLOCKED / NOT_SPECIFIED
HTTP access test: Checks whether each of the 6 scored bots can fetch content with its configured User-Agent (HTTP status 200)
llms.txt bonus (+15 pts): Present at domain root, HTTP 200, non-empty content; signals intentional AI permission
llms-full.txt (detected, not scored here): Full content dump for AI consumption; used as a +10 pts per-platform signal in Layer 7 Citation Readiness
Informative bot checks: ChatGPT-User, anthropic-ai, Claude-SearchBot, Claude-Web, Claude-User, Google-CloudVertexBot, Googlebot, BingBot, Perplexity-User, Meta-ExternalAgent, Meta-ExternalFetcher, Amazonbot, Applebot-Extended, Applebot, Bytespider, DuckAssistBot, MistralAI-User, DeepseekBot, and CCBot are tracked for robots.txt diagnostics but do not change the L1 score.
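The per-bot robots.txt status described above can be approximated with Python's standard `urllib.robotparser`. This is a sketch, not the VerisAI implementation: it assumes NOT_SPECIFIED means that the bot is not named in any `User-agent` line (a wildcard group may still apply to it), and it checks only the root path. It covers declared policy only; the observed HTTP access test is a separate step.

```python
from urllib.robotparser import RobotFileParser


def _named_agents(robots_txt: str) -> set:
    """Collect lower-cased names from all User-agent lines, ignoring comments."""
    agents = set()
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "user-agent":
            agents.add(value.strip().lower())
    return agents


def robots_status(robots_txt: str, bot: str, path: str = "/") -> str:
    """Return ALLOWED, BLOCKED, or NOT_SPECIFIED for one bot."""
    if bot.lower() not in _named_agents(robots_txt):
        return "NOT_SPECIFIED"  # assumption: no group names this bot explicitly
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return "ALLOWED" if parser.can_fetch(bot, path) else "BLOCKED"


SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Allow: /

User-agent: *
Disallow: /private/
"""

statuses = {bot: robots_status(SAMPLE_ROBOTS, bot)
            for bot in ("GPTBot", "ClaudeBot", "PerplexityBot")}
```

For the sample file, GPTBot is reported as BLOCKED, ClaudeBot as ALLOWED, and PerplexityBot as NOT_SPECIFIED.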

Result: BLOCKED → AI Readiness = 0 (scoring stops)

Theory: If AI bots cannot access your site, all other optimizations are irrelevant. llms.txt and llms-full.txt are separate signals: llms.txt is an access/index file (L1), and llms-full.txt is a content completeness signal for citation (L7). Sources: GPTBot documentation, Google AI crawlers

Layer 2: Server-Side Rendering (Binary Gate)

Quality: GOOD / PARTIAL / FAILED

Checks (4 critical elements):

Valid <title> tag (not empty/placeholder)
<h1> heading exists
Text content >500 characters
JSON-LD schema present

Scoring:

0 missing = GOOD (SSR Factor = 1.0)
1-2 missing = PARTIAL (SSR Factor = 0.7)
3+ missing = FAILED → AI Readiness = 0
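A rough way to reproduce the four checks and the resulting SSR Factor with the standard library is sketched below. It is an approximation under assumptions: placeholder-title detection is omitted (only an empty title counts as missing), JSON-LD is checked for presence only, and the returned factor of 0.0 for FAILED is a stand-in, because the rule above sets the whole AI Readiness score to 0 in that case.

```python
from html.parser import HTMLParser


class CriticalElementParser(HTMLParser):
    """Collects the four Layer 2 signals from raw server-rendered HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.has_h1 = False
        self.has_json_ld = False
        self.text_chars = 0
        self._in_title = False
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "h1":
            self.has_h1 = True
        elif tag in ("script", "style"):
            self._skip_depth += 1
            script_type = (dict(attrs).get("type") or "").lower()
            if tag == "script" and script_type == "application/ld+json":
                self.has_json_ld = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip_depth:
            self.text_chars += len(data.strip())


def ssr_quality(html: str) -> tuple:
    """Return (quality, ssr_factor) based on how many critical elements are missing."""
    parser = CriticalElementParser()
    parser.feed(html)
    parser.close()
    missing = sum([
        not parser.title.strip(),   # empty or absent <title>
        not parser.has_h1,          # no <h1>
        parser.text_chars <= 500,   # text content must exceed 500 characters
        not parser.has_json_ld,     # no JSON-LD script block
    ])
    if missing == 0:
        return "GOOD", 1.0
    if missing <= 2:
        return "PARTIAL", 0.7
    return "FAILED", 0.0  # caller should set AI Readiness to 0
```

Because the parser reads only the HTML it is given, content injected later by client-side JavaScript does not count, which is the point of the check.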

Theory: AI bots rely on server-rendered HTML. A page that is missing critical elements looks empty to bots. Sources: HTML5 spec, Google structured data

Layer 3: Indexability (Penalty System)

Penalties: 0 to -60 points

Canonical Tags (-20 pts per issue, max -60):

Missing canonical tag: -20
Multiple conflicting canonicals: -20
Relative URL (not absolute): -20

Language Declaration (-10 pts): Missing <html lang="xx">: -10

JSON-LD Validation (-20 pts): Broken/invalid JSON-LD: -20
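The penalty rules above can be sketched as a single function. One point is an assumption rather than documented behavior: the listed items could add up to -90, while the section states a range of 0 to -60, so this sketch floors the total at -60. All names are illustrative.

```python
def layer3_penalties(
    *,
    missing_canonical: bool = False,
    conflicting_canonicals: bool = False,
    relative_canonical: bool = False,
    missing_lang: bool = False,
    invalid_json_ld: bool = False,
) -> int:
    """Return the zero-or-negative Layer 3 penalty total."""
    canonical_issues = sum([missing_canonical, conflicting_canonicals, relative_canonical])
    canonical = max(-20 * canonical_issues, -60)  # -20 per issue, max -60
    language = -10 if missing_lang else 0
    json_ld = -20 if invalid_json_ld else 0
    # Assumption: the overall total is floored at the stated -60.
    return max(canonical + language + json_ld, -60)


penalties = layer3_penalties(relative_canonical=True, missing_lang=True)
indexability_score = 100 + penalties
```

In this hypothetical case a relative canonical URL and a missing language declaration cost 30 points, leaving an Indexability Score of 70.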

Theory: Technical errors confuse AI crawlers about which version to index. Sources: Canonical URLs, JSON-LD spec

Layer 4: Content Quality (Type-Specific Scoring)

Auto-detected types: ARTICLE, VIDEO, AUDIO, PRODUCT, ORGANIZATION, GENERIC

Example: ARTICLE Scoring (Max 100):

Schema (40 pts): Base schema (10) + headline (6) + author (6) + datePublished (6) + publisher (6) + image (6)
Structure (40 pts): Word count >300 (20) + H2 subheadings (10) + Internal links ≥2 (10)
SEO (20 pts): Title 50-60 chars (10) + Meta description 150-160 chars (10)
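The ARTICLE breakdown above maps naturally to a small scoring function. The sketch below is illustrative: the `ArticleSignals` container and its field names are hypothetical, and counting schema properties only when the base schema is present is an assumption.

```python
from dataclasses import dataclass, field

SCORED_SCHEMA_FIELDS = {"headline", "author", "datePublished", "publisher", "image"}


@dataclass
class ArticleSignals:
    has_article_schema: bool
    schema_fields: set = field(default_factory=set)
    word_count: int = 0
    has_h2: bool = False
    internal_links: int = 0
    title_length: int = 0
    meta_description_length: int = 0


def article_score(s: ArticleSignals) -> int:
    """Score an ARTICLE page out of 100: Schema 40 + Structure 40 + SEO 20."""
    schema = 0
    if s.has_article_schema:  # assumption: properties count only with a base schema
        schema = 10 + 6 * len(s.schema_fields & SCORED_SCHEMA_FIELDS)
    structure = (
        (20 if s.word_count > 300 else 0)
        + (10 if s.has_h2 else 0)
        + (10 if s.internal_links >= 2 else 0)
    )
    seo = (
        (10 if 50 <= s.title_length <= 60 else 0)
        + (10 if 150 <= s.meta_description_length <= 160 else 0)
    )
    return schema + structure + seo


example = article_score(ArticleSignals(
    has_article_schema=True,
    schema_fields={"headline", "author", "datePublished"},
    word_count=450,
    has_h2=True,
    internal_links=1,
    title_length=55,
    meta_description_length=120,
))
```

The hypothetical page scores 28 for schema, 30 for structure, and 10 for SEO, for a total of 68.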

Example: ORGANIZATION Scoring (Max 100):

Schema (60 pts): Base (10) + name (10) + url (10) + logo (10) + contactPoint (10) + address (5) + sameAs (5)
NAP Consistency (20 pts): Phone (10) + Email (10)
Local Signals (20 pts): Opening hours (10) + Map/location (10)

Theory: Different content types serve different purposes. AI needs structured data matching the content type. Sources: Schema.org docs, Schema validator

Component 2: SEO Foundation (Layer 5-6)

Formula: SEO Score = (Technical SEO + On-Page SEO) / 2

Layer 5: Technical SEO (Max 100)

Sitemap (30 pts): sitemap.xml exists (15) + Valid XML (15)
HTTPS (20 pts): HTTPS enabled (20)
Mobile-Friendly (25 pts): Viewport meta tag (15) + width=device-width (10)
Performance (25 pts): CSS files <5 (10) + JS files <10 (10) + Images <50 (5)
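The Layer 5 point breakdown above can be sketched as follows. Parameter names are hypothetical, and two dependencies are assumptions: the XML-validity points require the sitemap to exist, and the width=device-width points require a viewport meta tag.

```python
def technical_seo_score(
    *,
    sitemap_exists: bool,
    sitemap_valid_xml: bool,
    https: bool,
    viewport_meta: bool,
    viewport_device_width: bool,
    css_files: int,
    js_files: int,
    images: int,
) -> int:
    """Score Technical SEO out of 100 from the listed point values."""
    score = 0
    score += 15 if sitemap_exists else 0
    score += 15 if sitemap_exists and sitemap_valid_xml else 0
    score += 20 if https else 0
    score += 15 if viewport_meta else 0
    score += 10 if viewport_meta and viewport_device_width else 0
    score += 10 if css_files < 5 else 0
    score += 10 if js_files < 10 else 0
    score += 5 if images < 50 else 0
    return score


# Hypothetical site without a sitemap but otherwise clean.
example = technical_seo_score(
    sitemap_exists=False, sitemap_valid_xml=False, https=True,
    viewport_meta=True, viewport_device_width=True,
    css_files=3, js_files=8, images=20,
)
```

A site that passes everything except the sitemap checks lands at 70, since the sitemap accounts for 30 of the 100 points.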

Theory: Technical foundation enables discovery and indexing. Sources: Sitemap protocol, Core Web Vitals

Layer 6: On-Page SEO (Max 100)

Open Graph (25 pts): og:title (8) + og:description (8) + og:image (9)
Twitter Cards (15 pts): twitter:card (8) + twitter:title (7)
Internal Links (20 pts): ≥5 links (20) or 2-4 links (10)
Images (20 pts): Alt coverage ≥80% (20) or ≥50% (10)
Headings (20 pts): Valid H1-H2 hierarchy (20) or Single H1 (10)
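Several Layer 6 items are tiered (full or half points), which the sketch below makes explicit, together with the Component 2 formula that averages Layer 5 and Layer 6. The names are illustrative. Reading "valid H1-H2 hierarchy" as exactly one H1 plus at least one H2 is an assumption.

```python
def on_page_seo_score(
    *,
    og_title: bool,
    og_description: bool,
    og_image: bool,
    twitter_card: bool,
    twitter_title: bool,
    internal_links: int,
    alt_coverage: float,  # share of images with alt text, 0.0-1.0
    h1_count: int,
    h2_count: int,
) -> int:
    """Score On-Page SEO out of 100 from the listed point values."""
    score = (8 if og_title else 0) + (8 if og_description else 0) + (9 if og_image else 0)
    score += (8 if twitter_card else 0) + (7 if twitter_title else 0)
    score += 20 if internal_links >= 5 else 10 if internal_links >= 2 else 0
    score += 20 if alt_coverage >= 0.8 else 10 if alt_coverage >= 0.5 else 0
    if h1_count == 1 and h2_count >= 1:  # assumption: this is a "valid hierarchy"
        score += 20
    elif h1_count == 1:
        score += 10
    return score


def seo_score(technical_seo: float, on_page_seo: float) -> float:
    """SEO Score = (Technical SEO + On-Page SEO) / 2."""
    return (technical_seo + on_page_seo) / 2


on_page = on_page_seo_score(
    og_title=True, og_description=True, og_image=True,
    twitter_card=True, twitter_title=False,
    internal_links=3, alt_coverage=0.85, h1_count=1, h2_count=4,
)
combined = seo_score(technical_seo=70, on_page_seo=on_page)
```

The hypothetical page earns 83 on-page points, and with a Technical SEO score of 70 the SEO Foundation score is 76.5.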

Theory: Proper meta tags enable social sharing and preview generation. Sources: Open Graph Protocol, Twitter Cards

Component 3: Citation Readiness (Layer 7)

Formula: Citation Readiness = GPT*0.612 + Gemini*0.155 + Claude*0.102 + Perplexity*0.071 + Grok*0.020 + Meta*0.040

Weighting: L7 is market-share-weighted using VerisAI Intel W15/2026. Weights are revised quarterly: GPT 61.2%, Gemini 15.5%, Claude 10.2%, Perplexity 7.1%, Grok 2.0%, Meta 4.0%.
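The weighted sum above can be written directly as a dictionary lookup. The sketch uses the weights stated in the formula; the per-platform scores in the example are made up for illustration.

```python
# Weights from the formula above; they sum to 1.0.
PLATFORM_WEIGHTS = {
    "GPT": 0.612,
    "Gemini": 0.155,
    "Claude": 0.102,
    "Perplexity": 0.071,
    "Grok": 0.020,
    "Meta": 0.040,
}


def citation_readiness(platform_scores: dict) -> float:
    """Market-share-weighted mean of the per-platform readiness scores (0-100)."""
    return sum(weight * platform_scores[name] for name, weight in PLATFORM_WEIGHTS.items())


# Hypothetical per-platform scores.
example = citation_readiness({
    "GPT": 80, "Gemini": 60, "Claude": 70, "Perplexity": 50, "Grok": 40, "Meta": 30,
})
```

With these inputs the result is about 70.95. Because GPT carries 61.2% of the weight, a change in the GPT score moves the total far more than the same change on any other platform.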

Scope note: Citation Readiness scores measure technical prerequisites and proxy signals. They do not predict exact citation frequency. Actual AI citations also depend on authority, brand mentions, source availability, platform behavior, and ranking systems that AI platforms do not fully disclose.

Shared signal across all platforms: llms-full.txt (+10 pts each)
If /llms-full.txt is present at domain root (HTTP 200), each platform score receives +10 pts. This file provides a structured full-content dump for direct AI corpus ingestion — a distinct signal from llms.txt (which is an access index scored in Layer 1). All platform scores are capped at 100.

GPT / ChatGPT (OpenAI) Readiness (Max 100)

OAI-SearchBot Access (25 pts): Allowed in robots.txt – citation bot for ChatGPT Search
GPTBot Access (10 pts): Allowed in robots.txt – training bot, affects brand representation
JSON-LD Schema (25 pts): Valid JSON-LD present
Content Quality (20 pts): Layer 4 score ≥70
Citation Structure (20 pts): H2 subheadings (10) + Lists/bullets (10)
llms-full.txt (+10 pts, capped at 100): Full content dump present
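As an illustration of how the GPT readiness points and the llms-full.txt bonus combine under the cap, consider the sketch below. The parameter names are hypothetical and the inputs are assumed to come from the earlier layers.

```python
def gpt_readiness(
    *,
    oai_searchbot_allowed: bool,
    gptbot_allowed: bool,
    valid_json_ld: bool,
    layer4_score: float,
    has_h2: bool,
    has_lists: bool,
    has_llms_full_txt: bool,
) -> int:
    """Score GPT / ChatGPT readiness, capped at 100."""
    score = 0
    score += 25 if oai_searchbot_allowed else 0
    score += 10 if gptbot_allowed else 0
    score += 25 if valid_json_ld else 0
    score += 20 if layer4_score >= 70 else 0
    score += (10 if has_h2 else 0) + (10 if has_lists else 0)
    score += 10 if has_llms_full_txt else 0  # shared bonus
    return min(score, 100)


# Hypothetical page: search bot allowed, training bot blocked, weak content.
example = gpt_readiness(
    oai_searchbot_allowed=True, gptbot_allowed=False, valid_json_ld=True,
    layer4_score=65, has_h2=True, has_lists=False, has_llms_full_txt=False,
)
```

The example scores 60. The bonus can compensate for one missing 10-point signal, but it cannot push a score above 100.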

Note: OAI-SearchBot drives ChatGPT Search citations. GPTBot affects training data representation only. Sources: GPTBot documentation, OAI-SearchBot documentation

Gemini (Google AI) Readiness (Max 100)

Google-Extended ALLOWED (10 pts): Not opted out of Google AI training data
HTTP Access (10 pts): HTTP access test succeeds
E-E-A-T proxy signals (55 pts): Author HTML (5) + author page link (5) + Person schema (10) + LinkedIn sameAs in Person schema (5) + published/updated dates (20) + contact info (10)
Schema and brand authority (25 pts): Valid JSON-LD (10) + specific content type (5) + Organization schema (5) + Wikipedia/Wikidata sameAs in Organization schema (5)
llms-full.txt (+10 pts, capped at 100): Full content dump present

Note: Gemini cites from the Google Search index. Google-Extended serves only as an AI-training opt-out proxy and is not a direct crawl-access bot. E-E-A-T-style signals are relevant quality signals, but Google does not disclose a complete Gemini citation formula. Sources: E-E-A-T guidelines, Google-Extended

Claude (Anthropic) Readiness (Max 100)

L1 Gateway PASS (20 pts): Site not blocking bots
Citation Metadata (20 pts): Canonical URL (10) + og:url (10)
Trust Signals (10 pts): HTTPS enabled
Answer Engine Optimization (40 pts): FAQ/Q&A section (15) + Question-format headings (15) + Definition lists (10)
Rendering Quality (10 pts): SSR quality GOOD
llms-full.txt (+10 pts, capped at 100): Full content dump for Brave Search ingestion

Note: Claude uses the Brave Search index for web citations. FAQ and question-format headings are key extraction signals for answer retrieval. Source: Anthropic docs

Perplexity Readiness (Max 100)

PerplexityBot Access (25 pts): Allowed in robots.txt (25)
Answer Engine Optimization (45 pts): FAQ section (20) + Question-format headings (15) + Definition lists (10)
Rendering Quality (15 pts): SSR quality GOOD (15)
Content Quality (15 pts): Layer 4 score ≥70 (15)
llms-full.txt (+10 pts, capped at 100): Structured full content for Perplexity indexing

Note: PerplexityBot is an important access signal. Blocking it can prevent Perplexity from indexing or citing the content. FAQ sections and question-format headings can improve answer extraction and citation readiness. Source: How Perplexity works

Grok (xAI) Readiness (Max 100)

GrokBot robots.txt access (25 pts): Allowed in robots.txt – primary access signal for xAI indexing
GrokBot HTTP access (10 pts): HTTP request succeeds for GrokBot user agent
Twitter/X card metadata (15 pts): twitter:card metadata present
X sameAs in Organization schema (10 pts): Organization schema links to twitter.com or x.com profile
Freshness (15 pts): dateModified present in structured data
Content Quality (15 pts): Layer 4 score ≥70
llms-full.txt (+10 pts, capped at 100): Structured full content for xAI indexing

Note: GrokBot access, X-native metadata, X profile identity, and freshness are the primary Grok readiness signals. The base signals above total 90 pts, so a score of 100 requires the llms-full.txt bonus. Source: xAI

Why Equal Weighting (1/3 + 1/3 + 1/3)?

AI Readiness (33.3%)

Technical capability: Can AI bots access and understand your content?

SEO Foundation (33.3%)

Discoverability: Can humans and search engines find you?

Citation Readiness (33.3%)

Authority: Will AI platforms cite you as a trusted source?

Critical Insight: All three pillars must be present:

High AI Readiness + Poor SEO = Nobody finds you
High SEO + Poor Citation = AI ignores you
High Citation + Blocked Gateway = AI Readiness drops to 0

Score thresholds defined

PASS / CITABLE (≥70): The page meets minimum AI visibility requirements. Signals are sufficient for AI crawlers to access, parse, and potentially cite the content.
WARN / PARTIAL (40–69): Significant gaps exist. The page may be accessible but has missing structured data, weak content signals, or citation barriers that reduce AI citation likelihood.
FAIL / NOT CITABLE (<40): Critical failures detected — blocked crawling, missing SSR, or insufficient content. AI systems cannot reliably access or cite this page.

Knowledge Diff Methodology

Purpose: compare AI-generated company narratives against deterministic, crawler-visible website facts.

Ground truth source

The AI knowledge gap snapshot uses VerisAI's VCL Layer 4 Ground Truth Completeness output as the source of website facts. The website fact set is derived from fetched HTML, structured data, visible content, and identity signals evaluated by the VCL content layer. It is not generated by asking an LLM to invent or infer the company's official facts.

Pre-comparison gate

The AI comparison runs only when crawler-visible ground truth is strong enough for a reliable diff. The current gate requires sufficient Layer 4 quality, sufficient ground truth completeness, and no critical missing identity facts. If the gate fails, the user receives a "website ground truth needed" result instead of an AI narrative comparison.
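The gate described above has three conditions. The sketch below shows their shape only: the numeric thresholds and the list of critical identity facts are hypothetical placeholders, since this methodology does not publish the actual values.

```python
# Hypothetical values for illustration; the real thresholds are not documented here.
MIN_LAYER4_QUALITY = 70
MIN_GROUND_TRUTH_COMPLETENESS = 0.6
CRITICAL_IDENTITY_FACTS = ("company_name", "description")


def comparison_gate(layer4_quality: float, completeness: float, facts: dict) -> str:
    """Decide whether the AI narrative comparison may run."""
    missing_critical = [name for name in CRITICAL_IDENTITY_FACTS if not facts.get(name)]
    if (
        layer4_quality < MIN_LAYER4_QUALITY
        or completeness < MIN_GROUND_TRUTH_COMPLETENESS
        or missing_critical
    ):
        return "website ground truth needed"
    return "run AI comparison"


weak_site = comparison_gate(82, 0.75, {"company_name": "Acme GmbH", "description": ""})
strong_site = comparison_gate(
    82, 0.75, {"company_name": "Acme GmbH", "description": "Industrial sensors"}
)
```

All three conditions must hold. A site with strong quality and completeness is still held back if a critical identity fact is empty.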

Compared fields

Company name and canonical identity
Company description and positioning
Products and services
Headquarters country and location signals where available
Target market, employee range, USP, certifications, and contact signals where available

AI narrative providers

When the gate passes, VerisAI takes a same-run snapshot across ChatGPT/OpenAI, Gemini, Claude, Perplexity, and Grok. Each platform answer is compared with the same L4-derived ground truth so results can be read per platform and across the aggregate report.

Diff categories

Matched: the AI answer aligns with the website fact.
Discrepancy: the AI answer states a different value or materially different interpretation.
Missing in AI: the website exposes a fact that the AI answer does not include.
Hallucinated by AI: the AI answer contains a claim not supported by the L4-derived website facts.
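The four categories above follow from which side states a value for a given field. The sketch below illustrates that decision logic with a deliberately simple comparison (case-insensitive, whitespace-normalized string equality). That comparison rule is an assumption; a real diff would need to tolerate paraphrase.

```python
from typing import Optional


def _normalize(value: Optional[str]) -> str:
    """Lower-case and collapse whitespace; None becomes an empty string."""
    return " ".join((value or "").split()).casefold()


def knowledge_diff(ground_truth: dict, ai_answer: dict) -> dict:
    """Classify each field as Matched, Discrepancy, Missing in AI, or Hallucinated by AI."""
    result = {}
    for name in sorted(set(ground_truth) | set(ai_answer)):
        site = _normalize(ground_truth.get(name))
        ai = _normalize(ai_answer.get(name))
        if site and ai:
            result[name] = "Matched" if site == ai else "Discrepancy"
        elif site:
            result[name] = "Missing in AI"
        elif ai:
            result[name] = "Hallucinated by AI"
        # Fields empty on both sides are skipped.
    return result


# Hypothetical facts for illustration.
website_facts = {"company_name": "Acme GmbH", "hq_country": "Germany", "certifications": "ISO 9001"}
ai_facts = {"company_name": "acme gmbh", "hq_country": "Austria", "employee_range": "50-100"}
diff = knowledge_diff(website_facts, ai_facts)
```

For the sample data, the company name is Matched, the headquarters country is a Discrepancy, the certification is Missing in AI, and the employee range is Hallucinated by AI because the website facts do not support it.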

Scope note: AI knowledge gap output is a point-in-time diagnostic snapshot. 100webs benchmark pages track historical rank movement where prior snapshot data is available. A customer monitoring drift view is being prepared as a weekly per-domain view. Real-time alerting, competitor comparison, and guaranteed citation detection are not currently offered.

EU AI Act public-evidence check Methodology

Purpose: map public website evidence that may indicate EU AI Act exposure or disclosure work before August 2026.

Evidence source

The EU AI Act public-evidence check reads reachable public website pages and extracts public signals such as chatbot references, AI-generated content claims, automated recommendation language, sensitive-context wording, disclosure text, contact paths, and governance or policy pages. It does not inspect internal systems, contracts, product logs, model inventories, or private documents.

Readiness output

The scan returns an operational exposure tier, public evidence findings, missing disclosure signals, and recommended next actions for internal owners, counsel, risk, or compliance review. The result is a lead triage and evidence-preparation tool, not a legal classification engine.

Boundary

VerisAI does not provide legal advice, legal certification, formal EU AI Act conformity assessment, or a definitive compliance determination. Any flagged item must be confirmed by qualified legal counsel before a compliance claim is made.

Industry Standards & Documentation

Official Standards Bodies:

Schema.org – Structured data vocabulary
Google Search Central – SEO best practices, E-E-A-T
W3C – Web standards (HTML, accessibility)
IETF – Internet protocols (robots.txt RFC 9309)

AI Platform Documentation:

OpenAI GPTBot – Bot access guidelines
Google AI Crawlers – CloudVertexBot, Google-Extended
Anthropic Claude – AI assistant documentation
Perplexity AI – Search quality overview

Version History

v2.5.0 (2026-06-14): L1 access interpretation and L7 platform weighting update

Clarified layer interpretation: L1 checks whether AI systems can access a website, L2-L4 check whether they can understand it, and L7 estimates citation readiness.
Clarified that L1 separates declared crawler policy from observed bot-level HTTP access because robots.txt alone does not prove usable AI crawler access.
L7: added Meta AI proxy readiness and updated market-share weighting: GPT 61.2 % • Gemini 15.5 % • Claude 10.2 % • Perplexity 7.1 % • Grok 2.0 % • Meta 4.0 %.

v2.4.1 (2026-05-29): Grok added to L7, L1 bot weights rebalanced

L7: added Grok (xAI) as 5th platform.
L1: OAI-SearchBot promoted from informative to scored. Current scored weights: GPTBot 36, Google-Extended 20, OAI-SearchBot 14, PerplexityBot 10, ClaudeBot 10, GrokBot 4; total bot weight 94, plus llms.txt bonus capped at 100.

v2.4.0 (2026-05-06): Current services and monitoring status

Added EU AI Act public-evidence check methodology and legal boundary language.
Clarified that 100webs historical rank snapshots are implemented where prior benchmark data exists.
Clarified that weekly per-domain monitoring drift is in progress and real-time alerting is not currently offered.

v2.3.0 (2026-05-02): Knowledge Diff ground truth methodology

Documented that Knowledge Diff uses VCL Layer 4 Ground Truth Completeness as the deterministic source of website facts.
Added the pre-comparison gate: weak website ground truth blocks AI narrative comparison and returns a ground-truth-needed result.
Clarified that AI knowledge gap output is a point-in-time diagnostic snapshot, not customer monitoring, real-time alerting, or guaranteed citation detection.

v2.2.0 (2026-04-06): llms.txt / llms-full.txt signal separation

L1 Gateway: llms.txt bonus increased from +10 to +15 pts, because the standard is transitioning from optional to baseline as Anthropic, Cursor, and Mintlify adopt it; validation tightened (HTTP 200 + non-empty content required)
L1 Gateway: llms-full.txt now detected and stored in score details as a separate signal from llms.txt
L7 Citation Readiness: llms-full.txt added as +10 pts to platform scores, capped at 100; a full content dump enables direct AI corpus ingestion independently of platform-specific crawlers
Rationale: llms.txt = access/index signal (belongs in L1 Gateway); llms-full.txt = content completeness signal (belongs in L7 Citation Readiness). Merging both into one L1 bonus was a design flaw

v2.1.0 (2026-04-06): L1 Gateway, market-share-weighted bot scoring

Replaced equal-weight formula with market-share-weighted per-bot points; current v2.5.0 weights are GPTBot 36 pts, Google-Extended 20 pts, OAI-SearchBot 14 pts, PerplexityBot 10 pts, ClaudeBot 10 pts, and GrokBot 4 pts.
Methodological rationale: blocking the dominant AI platform (ChatGPT, 60.2 % market share) cannot carry the same score penalty as blocking a minority platform; equal weighting created a false equivalence exploitable by competitors
Market share data source: VerisAI Market Intelligence Pipeline, Week 15/2026 (AI platform referral traffic analysis)
Commitment: bot weights will be revised quarterly (Q1 January • Q2 April • Q3 July • Q4 October) based on current market share data from the VerisAI Intel pipeline

v2.0.0 (2026-02-25): L7 methodology revision based on actual AI bot behavior research

GPT: replaced GPTBot-only scoring with OAI-SearchBot (25 pts) as primary citation bot + GPTBot (10 pts) for training; rebalanced remaining points
Gemini: replaced CloudVertexBot scoring with E-E-A-T signals as primary factor (author 25, dates 20, contact 10); added note that Gemini cites via Google Search index
Claude: replaced Privacy/Terms scoring with FAQ (15), Question headings (15), Definition lists (10); reflects Brave Search index citation behavior
Perplexity: PerplexityBot elevated to 25 pts as critical gate; FAQ expanded to 20 pts; removed logo/footer credibility signals
Added informative-only bot list to L1 (OAI-SearchBot, ChatGPT-User, Claude-Web, etc.) — tracked but not scored

v1.0.1 (2026-02-18): Scope clarification

Added Citation Readiness scope note: scores reflect technical eligibility indicators, not actual citation probability

v1.0.0 (2026-02-15): Initial public release

Established equal-weight formula: (AI Readiness + SEO + Citation) / 3
Documented all 8 layers with point breakdowns
Added platform-specific scoring (GPT, Gemini, Claude, Perplexity)
Linked official documentation sources

Quarterly Review Commitment: L1 bot weights are revised every quarter (Q1 Jan · Q2 Apr · Q3 Jul · Q4 Oct) based on current AI platform market share data from the VerisAI Intel pipeline. The full methodology is updated whenever AI platforms publish new official crawler documentation, new LLM citation research emerges, or web standards change in ways that affect AI visibility measurement.
