AI Visibility Score Methodology
How VerisAI measures whether AI systems can access, understand, and cite a company website.
Version 2.5.0 (2026-06-14) – Technical framework for AI access, website facts, SEO foundation, citation readiness, Knowledge Diff, and priority risk scoring.
How to read this methodology
This page explains the scoring model behind the AI Readiness baseline. It is written for technical reviewers, web teams, SEO specialists, and governance owners.
What the score measures
Whether AI systems can access the website, read the right company facts, and find signals that support accurate AI answers.
What the score does not guarantee
The score measures readiness signals. It does not guarantee that any AI platform will cite the website in every answer.
What teams should do with it
Use the score to prioritize website fixes: access, indexability, structured data, content clarity, and citation-readiness gaps.
Overall Score Calculation
The overall score combines three areas: AI access and understanding, SEO foundation, and citation-readiness signals.
VerisAI L1 checks whether AI systems can access a website. L2-L4 check whether they can understand it. L7 estimates whether the page is structured enough to be citation-ready.
Formula: Overall Score = (AI Readiness + SEO + Citation Readiness) / 3
AI Readiness (33.3%)
Layer 1-4: Can AI systems access the website and understand the core company facts?
Components: Gateway, SSR, Indexability, Content Quality
SEO Foundation (33.3%)
Layer 5-6: Does the website have the technical and on-page foundation needed for discovery?
Components: Technical SEO, On-Page SEO
Citation Readiness (33.3%)
Layer 7: Does the website expose signals that make AI citation more likely?
Components: GPT, Gemini, Claude, Perplexity, Grok, and Meta readiness scoring
Thresholds:
- ≥70 = PASS (strong readiness signals)
- 40-69 = WARN (partial readiness, visible gaps remain)
- <40 = FAIL (critical blocking issues or weak signals)
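The formula and thresholds above can be sketched as follows. The function names are illustrative and not taken from VerisAI's implementation. Treating every score from 40 up to (but not including) 70 as WARN is an assumption for fractional values that fall between 69 and 70.

```python
def overall_score(ai_readiness: float, seo: float, citation_readiness: float) -> float:
    """Equal-weight mean of the three component scores (each on a 0-100 scale)."""
    return (ai_readiness + seo + citation_readiness) / 3


def status(score: float) -> str:
    """Map a 0-100 score to the PASS / WARN / FAIL bands."""
    if score >= 70:
        return "PASS"
    if score >= 40:  # assumption: fractional scores in [69, 70) count as WARN
        return "WARN"
    return "FAIL"


example = overall_score(ai_readiness=82, seo=64, citation_readiness=55)
print(round(example, 1), status(example))
```

In this made-up example a strong AI Readiness score does not lift the overall result to PASS, because the two weaker components carry the same weight.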
Component 1: AI Readiness Score (Layer 1-4)
Formula: AI Readiness = (Content Score × 0.6) + (Indexability Score × 0.4)
Content Score = Layer 4 Quality × SSR Factor
Indexability Score = 100 + Layer 3 Penalties
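A minimal sketch of how these three formulas fit together, including the Layer 1 and Layer 2 gates described in the sections that follow. Parameter names are hypothetical; only the weights (0.6 and 0.4) and the formula shapes come from this methodology.

```python
def ai_readiness(
    layer4_quality: float,     # Layer 4 content quality, 0-100
    ssr_factor: float,         # 1.0 for GOOD, 0.7 for PARTIAL
    layer3_penalties: float,   # zero or negative, e.g. -20
    gateway_blocked: bool = False,
    ssr_failed: bool = False,
) -> float:
    """Combine Layers 1-4 into the AI Readiness score."""
    # Both gates short-circuit the score to 0.
    if gateway_blocked or ssr_failed:
        return 0.0
    content_score = layer4_quality * ssr_factor
    indexability_score = 100 + layer3_penalties
    return content_score * 0.6 + indexability_score * 0.4


# Illustrative inputs: quality 80, PARTIAL rendering, one canonical issue.
print(round(ai_readiness(80, 0.7, -20), 1))
```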
Layer 1: Gateway (Binary Gate)
Status: PASS or BLOCKED
Layer 1 separates declared crawler policy from observed access behavior. It checks robots.txt, guidance files, and bot-level HTTP access because robots.txt alone does not prove that AI crawlers receive a usable page.
Scoring (94 bot-weight pts + 15 pts llms.txt bonus, capped at 100):
Bot weights reflect the relative importance assigned by the VerisAI scoring model. Blocking major AI crawlers carries a higher penalty than blocking lower-priority crawlers. Weights should be reviewed periodically as platform behavior changes.
| Bot | Platform | Points | AI chatbot share (US, May 2026) |
|---|---|---|---|
| GPTBot | ChatGPT / OpenAI | 36 | 60.6 % |
| Google-Extended | Gemini / Google AI | 20 | 15.1 % |
| OAI-SearchBot | ChatGPT Search / OpenAI | 14 | — (part of ChatGPT) |
| PerplexityBot | Perplexity AI | 10 | 5.4 % |
| ClaudeBot | Claude / Anthropic | 10 | 5.0 % |
| GrokBot | Grok / xAI | 4 | 0.6 % |
| llms.txt | All platforms | +15 bonus | — |
AI chatbot share source: FirstPageSage, U.S., May 2026. Bot weights are not a direct proportional mapping of share — they reflect a combination of chatbot market share, live crawler activity (BotMonitor audit 2026-05-27), and citation impact. Reviewed quarterly.
- robots.txt check: Per-bot ALLOWED / BLOCKED / NOT_SPECIFIED
- HTTP access test: Checks whether each of the 6 scored bots can fetch content with its configured User-Agent (status 200)
- llms.txt bonus (+15 pts): Present at domain root, HTTP 200, non-empty content — signals intentional AI permission
- llms-full.txt (detected, not scored here): Full content dump for AI consumption; scored as a +10 pts signal per platform in Layer 7 Citation Readiness
- Informative bot checks: ChatGPT-User, anthropic-ai, Claude-SearchBot, Claude-Web, Claude-User, Google-CloudVertexBot, Googlebot, BingBot, Perplexity-User, Meta-ExternalAgent, Meta-ExternalFetcher, Amazonbot, Applebot-Extended, Applebot, Bytespider, DuckAssistBot, MistralAI-User, DeepseekBot, and CCBot are tracked for robots.txt diagnostics but do not change L1 score.
Result: BLOCKED → AI Readiness = 0 (scoring stops)
Theory: If AI bots cannot access your site, all other optimizations are irrelevant. llms.txt and llms-full.txt are separate signals: llms.txt is an access/index file (L1), while llms-full.txt is a content completeness signal for citation (L7). References: GPTBot documentation, Google AI crawlers.
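The Layer 1 point arithmetic can be sketched as below. The weights and the bonus come from the table above. How the robots.txt result and the HTTP test combine into "accessible" is not spelled out here, so the `accessible_bots` input is an assumption: it stands for the set of scored bots that passed both checks.

```python
BOT_WEIGHTS = {
    "GPTBot": 36,
    "Google-Extended": 20,
    "OAI-SearchBot": 14,
    "PerplexityBot": 10,
    "ClaudeBot": 10,
    "GrokBot": 4,
}
LLMS_TXT_BONUS = 15
MAX_SCORE = 100


def gateway_score(accessible_bots: set, has_llms_txt: bool) -> int:
    """Sum the weights of accessible scored bots, add the llms.txt bonus, cap at 100."""
    points = sum(weight for bot, weight in BOT_WEIGHTS.items() if bot in accessible_bots)
    if has_llms_txt:
        points += LLMS_TXT_BONUS
    return min(points, MAX_SCORE)


# Every scored bot allowed plus llms.txt: 94 + 15 = 109, capped at 100.
print(gateway_score(set(BOT_WEIGHTS), has_llms_txt=True))
# GPTBot blocked and no llms.txt: 94 - 36 = 58.
print(gateway_score(set(BOT_WEIGHTS) - {"GPTBot"}, has_llms_txt=False))
```

Because the bonus is applied before the cap, a site with llms.txt can block a low-weight bot such as GrokBot and still reach 100.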
Layer 2: Server-Side Rendering (Binary Gate)
Quality: GOOD / PARTIAL / FAILED
Checks (4 critical elements):
- Valid <title> tag (not empty/placeholder)
- <h1> heading exists
- Text content >500 characters
- JSON-LD schema present
Scoring:
- 0 missing = GOOD (SSR Factor = 1.0)
- 1-2 missing = PARTIAL (SSR Factor = 0.7)
- 3+ missing = FAILED → AI Readiness = 0
Theory: AI bots rely on server-rendered HTML. If critical elements are missing, bots effectively see an empty page. References: HTML5 spec, Google structured data.
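The mapping from the four checks to an SSR quality level can be sketched as follows. The inputs are assumed to be already extracted from the server-rendered HTML, and returning a factor of 0.0 for FAILED is a representation choice for this sketch (the methodology only states that AI Readiness becomes 0).

```python
def ssr_quality(has_valid_title: bool, has_h1: bool, text_length: int, has_json_ld: bool):
    """Return (quality label, SSR factor) from the four critical-element checks."""
    checks = [has_valid_title, has_h1, text_length > 500, has_json_ld]
    missing = checks.count(False)
    if missing == 0:
        return "GOOD", 1.0
    if missing <= 2:
        return "PARTIAL", 0.7
    return "FAILED", 0.0  # assumption: 0.0 stands in for "AI Readiness = 0"


# A page with title and h1 but thin text and no JSON-LD has two elements missing.
print(ssr_quality(True, True, 320, False))
```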
Layer 3: Indexability (Penalty System)
Penalties: 0 to -60 points
Canonical Tags (-20 pts per issue, max -60):
- Missing canonical tag: -20
- Multiple conflicting canonicals: -20
- Relative URL (not absolute): -20
Language Declaration (-10 pts): Missing <html lang="xx">: -10
JSON-LD Validation (-20 pts): Broken/invalid JSON-LD: -20
Theory: Technical errors leave AI crawlers unsure which version of a page to index. References: Canonical URLs, JSON-LD spec.
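A sketch of the penalty arithmetic under one stated assumption: the listed penalties can add up to -90, while the section gives a range of 0 to -60, so this example clamps the combined penalty at -60. That clamp is an interpretation of the stated range, not a confirmed rule.

```python
def layer3_penalties(canonical_issues: int, missing_lang: bool, invalid_json_ld: bool) -> int:
    """Return the total Layer 3 penalty as a zero-or-negative number."""
    canonical = max(-20 * canonical_issues, -60)   # -20 per issue, max -60
    language = -10 if missing_lang else 0
    json_ld = -20 if invalid_json_ld else 0
    # Assumption: the overall penalty is floored at -60 ("0 to -60 points").
    return max(canonical + language + json_ld, -60)


def indexability_score(canonical_issues: int, missing_lang: bool, invalid_json_ld: bool) -> int:
    """Indexability Score = 100 + Layer 3 Penalties."""
    return 100 + layer3_penalties(canonical_issues, missing_lang, invalid_json_ld)


# One canonical issue plus a missing lang attribute: 100 - 20 - 10 = 70.
print(indexability_score(1, True, False))
```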
Layer 4: Content Quality (Type-Specific Scoring)
Auto-detected types: ARTICLE, VIDEO, AUDIO, PRODUCT, ORGANIZATION, GENERIC
Example: ARTICLE Scoring (Max 100):
- Schema (40 pts): Base schema (10) + headline (6) + author (6) + datePublished (6) + publisher (6) + image (6)
- Structure (40 pts): Word count >300 (20) + H2 subheadings (10) + Internal links ≥2 (10)
- SEO (20 pts): Title 50-60 chars (10) + Meta description 150-160 chars (10)
Example: ORGANIZATION Scoring (Max 100):
- Schema (60 pts): Base (10) + name (10) + url (10) + logo (10) + contactPoint (10) + address (5) + sameAs (5)
- NAP Consistency (20 pts): Phone (10) + Email (10)
- Local Signals (20 pts): Opening hours (10) + Map/location (10)
Theory: Different content types serve different purposes, so AI systems need structured data that matches the content type. References: Schema.org docs, Schema validator.
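The ARTICLE breakdown can be expressed as a small scoring function. This is a sketch under assumptions: the schema is passed as an already parsed dictionary, "H2 subheadings" is read as "at least one H2", and all parameter names are hypothetical.

```python
from typing import Optional

ARTICLE_SCHEMA_FIELDS = ("headline", "author", "datePublished", "publisher", "image")


def article_score(
    schema: Optional[dict],
    word_count: int,
    h2_count: int,
    internal_links: int,
    title_length: int,
    meta_description_length: int,
) -> int:
    """Score an ARTICLE page out of 100 using the point breakdown above."""
    score = 0
    # Schema (40 pts): base schema 10, plus 6 per populated field.
    if schema is not None:
        score += 10
        score += 6 * sum(1 for name in ARTICLE_SCHEMA_FIELDS if schema.get(name))
    # Structure (40 pts).
    if word_count > 300:
        score += 20
    if h2_count >= 1:  # assumption: one H2 is enough to earn the points
        score += 10
    if internal_links >= 2:
        score += 10
    # SEO (20 pts).
    if 50 <= title_length <= 60:
        score += 10
    if 150 <= meta_description_length <= 160:
        score += 10
    return score


partial_schema = {"headline": "Example", "author": "A. Writer", "datePublished": "2026-01-10"}
print(article_score(partial_schema, 850, 4, 1, 55, 120))
```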
Component 2: SEO Foundation (Layer 5-6)
Formula: SEO Score = (Technical SEO + On-Page SEO) / 2
Layer 5: Technical SEO (Max 100)
- Sitemap (30 pts): sitemap.xml exists (15) + Valid XML (15)
- HTTPS (20 pts): HTTPS enabled (20)
- Mobile-Friendly (25 pts): Viewport meta tag (15) + width=device-width (10)
- Performance (25 pts): CSS files <5 (10) + JS files <10 (10) + Images <50 (5)
Theory: A sound technical foundation enables discovery and indexing. References: Sitemap protocol, Core Web Vitals.
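As a rough illustration, the Layer 5 points can be summed like this. The parameter names are hypothetical. The sketch also assumes that the "Valid XML" and "width=device-width" points are only awarded when the sitemap and the viewport tag exist at all.

```python
def technical_seo_score(
    sitemap_exists: bool,
    sitemap_valid_xml: bool,
    https: bool,
    has_viewport: bool,
    viewport_device_width: bool,
    css_files: int,
    js_files: int,
    images: int,
) -> int:
    """Score Layer 5 out of 100 using the point breakdown above."""
    score = 0
    # Sitemap (30 pts).
    if sitemap_exists:
        score += 15
        if sitemap_valid_xml:
            score += 15
    # HTTPS (20 pts).
    if https:
        score += 20
    # Mobile-friendly (25 pts).
    if has_viewport:
        score += 15
        if viewport_device_width:
            score += 10
    # Performance (25 pts): simple resource-count proxies.
    if css_files < 5:
        score += 10
    if js_files < 10:
        score += 10
    if images < 50:
        score += 5
    return score


# Everything in place except too many JS files: 100 - 10 = 90.
print(technical_seo_score(True, True, True, True, True, css_files=3, js_files=12, images=20))
```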
Layer 6: On-Page SEO (Max 100)
- Open Graph (25 pts): og:title (8) + og:description (8) + og:image (9)
- Twitter Cards (15 pts): twitter:card (8) + twitter:title (7)
- Internal Links (20 pts): ≥5 links (20) or 2-4 links (10)
- Images (20 pts): Alt coverage ≥80% (20) or ≥50% (10)
- Headings (20 pts): Valid H1-H2 hierarchy (20) or Single H1 (10)
Theory: Proper meta tags enable social sharing and preview generation. References: Open Graph Protocol, Twitter Cards.
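The tiered items in Layer 6 (full points or a smaller partial award) and the Component 2 average can be sketched as below. Names are hypothetical, alt coverage is assumed to be a fraction between 0 and 1, and what counts as a "valid H1-H2 hierarchy" is left to the caller.

```python
META_TAG_POINTS = {
    "og:title": 8,
    "og:description": 8,
    "og:image": 9,
    "twitter:card": 8,
    "twitter:title": 7,
}


def on_page_seo_score(
    meta_tags: set,
    internal_links: int,
    alt_coverage: float,
    h1_count: int,
    valid_h1_h2_hierarchy: bool,
) -> int:
    """Score Layer 6 out of 100 using the point breakdown above."""
    score = sum(points for tag, points in META_TAG_POINTS.items() if tag in meta_tags)
    # Internal links: 20 for five or more, 10 for two to four.
    if internal_links >= 5:
        score += 20
    elif internal_links >= 2:
        score += 10
    # Image alt coverage: 20 at 80 % or more, 10 at 50 % or more.
    if alt_coverage >= 0.8:
        score += 20
    elif alt_coverage >= 0.5:
        score += 10
    # Headings: 20 for a valid hierarchy, 10 for a single H1 only.
    if valid_h1_h2_hierarchy:
        score += 20
    elif h1_count == 1:
        score += 10
    return score


def seo_score(technical_seo: float, on_page_seo: float) -> float:
    """SEO Score = (Technical SEO + On-Page SEO) / 2."""
    return (technical_seo + on_page_seo) / 2


tags = {"og:title", "og:description", "og:image", "twitter:card"}
on_page = on_page_seo_score(tags, internal_links=3, alt_coverage=0.85, h1_count=1, valid_h1_h2_hierarchy=False)
print(on_page, seo_score(90, on_page))
```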
Component 3: Citation Readiness (Layer 7)
Formula: Citation Readiness = GPT*0.612 + Gemini*0.155 + Claude*0.102 + Perplexity*0.071 + Grok*0.020 + Meta*0.040
Weighting: L7 is market-share-weighted using VerisAI Intel W15/2026: GPT 61.2%, Gemini 15.5%, Claude 10.2%, Perplexity 7.1%, Grok 2.0%, Meta 4.0%. Weights are revised quarterly.
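A minimal sketch of the weighted sum, using the weights from the formula above. The per-platform input scores in the example are invented for illustration.

```python
PLATFORM_WEIGHTS = {
    "GPT": 0.612,
    "Gemini": 0.155,
    "Claude": 0.102,
    "Perplexity": 0.071,
    "Grok": 0.020,
    "Meta": 0.040,
}


def citation_readiness(platform_scores: dict) -> float:
    """Weighted sum of per-platform readiness scores (each 0-100)."""
    return sum(weight * platform_scores[name] for name, weight in PLATFORM_WEIGHTS.items())


example_scores = {"GPT": 80, "Gemini": 60, "Claude": 70, "Perplexity": 50, "Grok": 40, "Meta": 30}
print(round(citation_readiness(example_scores), 2))
```

Since GPT carries 61.2% of the weight, a 10-point change in the GPT score moves Citation Readiness by about 6 points, while the same change in the Grok score moves it by 0.2.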
Scope note: Citation Readiness scores measure technical prerequisites and proxy signals. They do not predict exact citation frequency. Actual AI citations also depend on authority, brand mentions, source availability, platform behavior, and ranking systems that AI platforms do not fully disclose.
Shared signal across all platforms: llms-full.txt (+10 pts each)
If /llms-full.txt is present at domain root (HTTP 200), each platform score receives +10 pts. This file provides a structured full-content dump for direct AI corpus ingestion — a distinct signal from llms.txt (which is an access index scored in Layer 1). All platform scores are capped at 100.
GPT / ChatGPT (OpenAI) Readiness (Max 100)
- OAI-SearchBot Access (25 pts): Allowed in robots.txt – citation bot for ChatGPT Search
- GPTBot Access (10 pts): Allowed in robots.txt – training bot, affects brand representation
- JSON-LD Schema (25 pts): Valid JSON-LD present
- Content Quality (20 pts): Layer 4 score ≥70
- Citation Structure (20 pts): H2 subheadings (10) + Lists/bullets (10)
- llms-full.txt (+10 pts, capped at 100): Full content dump present
Note: OAI-SearchBot drives ChatGPT Search citations. GPTBot affects training data representation only. References: GPTBot documentation, OAI-SearchBot documentation.
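The GPT breakdown, including the shared llms-full.txt bonus and the cap at 100, can be sketched like this. Parameter names are hypothetical, and the inputs are assumed to be precomputed by earlier layers.

```python
def gpt_readiness(
    oai_searchbot_allowed: bool,
    gptbot_allowed: bool,
    valid_json_ld: bool,
    layer4_score: float,
    has_h2: bool,
    has_lists: bool,
    has_llms_full_txt: bool,
) -> int:
    """Score GPT / ChatGPT readiness out of 100 using the point breakdown above."""
    score = 0
    score += 25 if oai_searchbot_allowed else 0   # citation bot for ChatGPT Search
    score += 10 if gptbot_allowed else 0          # training bot
    score += 25 if valid_json_ld else 0
    score += 20 if layer4_score >= 70 else 0
    score += 10 if has_h2 else 0
    score += 10 if has_lists else 0
    score += 10 if has_llms_full_txt else 0       # shared bonus across platforms
    return min(score, 100)


# GPTBot blocked, Layer 4 below 70, no lists, llms-full.txt present.
print(gpt_readiness(True, False, True, 65, True, False, True))
```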
Gemini (Google AI) Readiness (Max 100)
- Google-Extended ALLOWED (10 pts): Not opted out of Google AI training data
- HTTP Access (10 pts): HTTP access ok
- E-E-A-T proxy signals (55 pts): Author HTML (5) + author page link (5) + Person schema (10) + LinkedIn sameAs in Person schema (5) + published/updated dates (20) + contact info (10)
- Schema and brand authority (25 pts): Valid JSON-LD (10) + specific content type (5) + Organization schema (5) + Wikipedia/Wikidata sameAs in Organization schema (5)
- llms-full.txt (+10 pts, capped at 100): Full content dump present
Note: Gemini cites from the Google Search index. Google-Extended is only an AI training opt-out proxy, not a direct crawl access bot. E-E-A-T-style signals are relevant quality signals, but Google does not disclose a complete Gemini citation formula. References: E-E-A-T guidelines, Google-Extended.
Claude (Anthropic) Readiness (Max 100)
- L1 Gateway PASS (20 pts): Site not blocking bots
- Citation Metadata (20 pts): Canonical URL (10) + og:url (10)
- Trust Signals (10 pts): HTTPS enabled
- Answer Engine Optimization (40 pts): FAQ/Q&A section (15) + Question-format headings (15) + Definition lists (10)
- Rendering Quality (10 pts): SSR quality GOOD
- llms-full.txt (+10 pts, capped at 100): Full content dump for Brave Search ingestion
Note: Claude uses the Brave Search index for web citations. FAQ and question-format headings are key extraction signals for answer retrieval. References: Anthropic docs.
Perplexity Readiness (Max 100)
- PerplexityBot Access (25 pts): Allowed in robots.txt (25) – important access signal; blocking it can prevent Perplexity from indexing or citing the content
- Answer Engine Optimization (45 pts): FAQ section (20) + Question-format headings (15) + Definition lists (10)
- Rendering Quality (15 pts): SSR quality GOOD (15)
- Content Quality (15 pts): Layer 4 score ≥70 (15)
- llms-full.txt (+10 pts, capped at 100): Structured full content for Perplexity indexing
Note: FAQ sections and question-format headings can improve answer extraction and citation readiness. References: How Perplexity works.
Grok (xAI) Readiness (Max 100)
- GrokBot robots.txt access (25 pts): Allowed in robots.txt – primary access signal for xAI indexing
- GrokBot HTTP access (10 pts): HTTP request succeeds for GrokBot user agent
- Twitter/X card metadata (15 pts): twitter:card metadata present
- X sameAs in Organization schema (10 pts): Organization schema links to twitter.com or x.com profile
- Freshness (15 pts): dateModified present in structured data
- Content Quality (15 pts): Layer 4 score ≥70
- llms-full.txt (+10 pts, capped at 100): Structured full content for xAI indexing
Note: GrokBot access, X-native metadata, X profile identity, and freshness are the primary Grok readiness signals. References: xAI.
Why Equal Weighting (1/3 + 1/3 + 1/3)?
AI Readiness (33.3%)
Technical capability: Can AI bots access and understand your content?
SEO Foundation (33.3%)
Discoverability: Can humans and search engines find you?
Citation Readiness (33.3%)
Authority: Will AI platforms cite you as a trusted source?
Critical Insight: All three pillars must be present:
- High AI Readiness + Poor SEO = Nobody finds you
- High SEO + Poor Citation = AI ignores you
- High Citation + Blocked Gateway = AI Readiness of 0
Score thresholds defined
- PASS / CITABLE (≥70)
- The page meets minimum AI visibility requirements. Signals are sufficient for AI crawlers to access, parse, and potentially cite the content.
- WARN / PARTIAL (40–69)
- Significant gaps exist. The page may be accessible but has missing structured data, weak content signals, or citation barriers that reduce AI citation likelihood.
- FAIL / NOT CITABLE (<40)
- Critical failures detected — blocked crawling, missing SSR, or insufficient content. AI systems cannot reliably access or cite this page.
Knowledge Diff Methodology
Purpose: compare AI-generated company narratives against deterministic, crawler-visible website facts.
Ground truth source
The AI knowledge gap snapshot uses VerisAI's VCL Layer 4 Ground Truth Completeness output as the source of website facts. The website fact set is derived from fetched HTML, structured data, visible content, and identity signals evaluated by the VCL content layer. It is not generated by asking an LLM to invent or infer the company's official facts.
Pre-comparison gate
The AI comparison runs only when crawler-visible ground truth is strong enough for a reliable diff. The current gate requires sufficient Layer 4 quality, sufficient ground truth completeness, and no critical missing identity facts. If the gate fails, the user receives a "website ground truth needed" result instead of an AI narrative comparison.
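The gate logic can be sketched as a simple predicate. The methodology only says "sufficient", so the two numeric thresholds and the list of critical identity fields below are placeholders invented for this example, not VerisAI's actual values.

```python
# Hypothetical values: the methodology does not publish the real thresholds.
MIN_LAYER4_QUALITY = 70
MIN_GROUND_TRUTH_COMPLETENESS = 0.6
CRITICAL_IDENTITY_FIELDS = ("company_name", "description")


def precomparison_gate(layer4_quality: float, completeness: float, facts: dict) -> str:
    """Decide whether the AI narrative comparison should run."""
    missing_identity = [name for name in CRITICAL_IDENTITY_FIELDS if not facts.get(name)]
    if (
        layer4_quality < MIN_LAYER4_QUALITY
        or completeness < MIN_GROUND_TRUTH_COMPLETENESS
        or missing_identity
    ):
        return "website ground truth needed"
    return "run AI narrative comparison"


facts = {"company_name": "Example GmbH", "description": ""}
print(precomparison_gate(85, 0.8, facts))
```

The point of the design is that a weak website fact set produces no diff at all, rather than a diff measured against unreliable ground truth.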
Compared fields
- Company name and canonical identity
- Company description and positioning
- Products and services
- Headquarters country and location signals where available
- Target market, employee range, USP, certifications, and contact signals where available
AI narrative providers
When the gate passes, VerisAI takes a same-run snapshot across ChatGPT/OpenAI, Gemini, Claude, Perplexity, and Grok. Each platform answer is compared with the same L4-derived ground truth so results can be read per platform and across the aggregate report.
Diff categories
- Matched: the AI answer aligns with the website fact.
- Discrepancy: the AI answer states a different value or materially different interpretation.
- Missing in AI: the website exposes a fact that the AI answer does not include.
- Hallucinated by AI: the AI answer contains a claim not supported by the L4-derived website facts.
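The four categories can be illustrated with a field-by-field comparison. This sketch uses case-insensitive exact matching as a stand-in; how VerisAI actually decides that two values are "materially different" is not described here, and the field names are hypothetical.

```python
def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(value.lower().split())


def diff_fields(ground_truth: dict, ai_answer: dict) -> dict:
    """Label each field with one of the four diff categories."""
    result = {}
    for name in sorted(set(ground_truth) | set(ai_answer)):
        truth = ground_truth.get(name)
        claim = ai_answer.get(name)
        if truth and claim:
            same = normalize(truth) == normalize(claim)
            result[name] = "Matched" if same else "Discrepancy"
        elif truth:
            result[name] = "Missing in AI"
        elif claim:
            result[name] = "Hallucinated by AI"
    return result


website = {"company_name": "Example GmbH", "hq_country": "Germany", "employee_range": "50-100"}
answer = {"company_name": "example gmbh", "hq_country": "Austria", "founder": "J. Doe"}
for field_name, category in diff_fields(website, answer).items():
    print(field_name, "->", category)
```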
Scope note: AI knowledge gap output is a point-in-time diagnostic snapshot. 100webs benchmark pages track historical rank movement where prior snapshot data is available. A weekly per-domain monitoring drift view for customers is in preparation. Real-time alerting, competitor comparison, and guaranteed citation detection are not currently offered.
EU AI Act public-evidence check Methodology
Purpose: map public website evidence that may indicate EU AI Act exposure or disclosure work before August 2026.
Evidence source
The EU AI Act public-evidence check reads reachable public website pages and extracts public signals such as chatbot references, AI-generated content claims, automated recommendation language, sensitive-context wording, disclosure text, contact paths, and governance or policy pages. It does not inspect internal systems, contracts, product logs, model inventories, or private documents.
Readiness output
The scan returns an operational exposure tier, public evidence findings, missing disclosure signals, and recommended next actions for internal owners, counsel, risk, or compliance review. The result is a lead triage and evidence-preparation tool, not a legal classification engine.
Boundary
VerisAI does not provide legal advice, legal certification, formal EU AI Act conformity assessment, or a definitive compliance determination. Any flagged item must be confirmed by qualified legal counsel before a compliance claim is made.
Industry Standards & Documentation
Official Standards Bodies:
- Schema.org – Structured data vocabulary
- Google Search Central – SEO best practices, E-E-A-T
- W3C – Web standards (HTML, accessibility)
- IETF – Internet protocols (robots.txt RFC 9309)
AI Platform Documentation:
- OpenAI GPTBot – Bot access guidelines
- Google AI Crawlers – CloudVertexBot, Google-Extended
- Anthropic Claude – AI assistant documentation
- Perplexity AI – Search quality overview
Version History
v2.5.0 (2026-06-14): L1 access interpretation and L7 platform weighting update
- Clarified layer interpretation: L1 checks whether AI systems can access a website, L2-L4 check whether they can understand it, and L7 estimates citation readiness.
- Clarified that L1 separates declared crawler policy from observed bot-level HTTP access because robots.txt alone does not prove usable AI crawler access.
- L7: added Meta AI proxy readiness and updated market-share weighting: GPT 61.2 % • Gemini 15.5 % • Claude 10.2 % • Perplexity 7.1 % • Grok 2.0 % • Meta 4.0 %.
v2.4.0 (2026-05-06): Current services and monitoring status
- Added EU AI Act public-evidence check methodology and legal boundary language.
- Clarified that 100webs historical rank snapshots are implemented where prior benchmark data exists.
- Clarified that weekly per-domain monitoring drift is in progress and real-time alerting is not currently offered.
v2.3.0 (2026-05-02): Knowledge Diff ground truth methodology
- Documented that Knowledge Diff uses VCL Layer 4 Ground Truth Completeness as the deterministic source of website facts.
- Added the pre-comparison gate: weak website ground truth blocks AI narrative comparison and returns a ground-truth-needed result.
- Clarified that AI knowledge gap output is a point-in-time diagnostic snapshot, not customer monitoring, real-time alerting, or guaranteed citation detection.
v2.2.0 (2026-04-06): llms.txt / llms-full.txt signal separation
- L1 Gateway: llms.txt bonus increased from +10 to +15 pts — standard transitioning from optional to baseline as Anthropic, Cursor, Mintlify adopt it; validation tightened (HTTP 200 + non-empty content required)
- L1 Gateway: llms-full.txt now detected and stored in score details; separate signal from llms.txt
- L7 Citation Readiness: llms-full.txt added as +10 pts to platform scores, capped at 100; full content dump enables direct AI corpus ingestion independently of platform-specific crawlers
- Rationale: llms.txt = access/index signal (belongs in L1 Gateway); llms-full.txt = content completeness signal (belongs in L7 Citation Readiness). Merging both into one L1 bonus was a design flaw
v2.1.0 (2026-04-06): L1 Gateway — market-share-weighted bot scoring
- Replaced equal-weight formula with market-share-weighted per-bot points; current v2.5.0 weights are GPTBot 36 pts, Google-Extended 20 pts, OAI-SearchBot 14 pts, PerplexityBot 10 pts, ClaudeBot 10 pts, and GrokBot 4 pts.
- Methodological rationale: blocking the dominant AI platform (ChatGPT, 60.2 % market share) cannot carry the same score penalty as blocking a minority platform — equal weighting created false equivalence exploitable by competitors
- Market share data source: VerisAI Market Intelligence Pipeline, Week 15/2026 (AI platform referral traffic analysis)
- Commitment: bot weights will be revised quarterly (Q1 January • Q2 April • Q3 July • Q4 October) based on current market share data from the VerisAI Intel pipeline
v2.4.1 (2026-05-29): Grok added to L7, L1 bot weights rebalanced
- L7: added Grok (xAI) as 5th platform.
- L1: OAI-SearchBot promoted from informative to scored. Current scored weights: GPTBot 36, Google-Extended 20, OAI-SearchBot 14, PerplexityBot 10, ClaudeBot 10, GrokBot 4; total bot weight 94, plus llms.txt bonus capped at 100.
v2.0.0 (2026-02-25): L7 methodology revision based on actual AI bot behavior research
- GPT: replaced GPTBot-only scoring with OAI-SearchBot (25 pts) as primary citation bot + GPTBot (10 pts) for training; rebalanced remaining points
- Gemini: replaced CloudVertexBot scoring with E-E-A-T signals as primary factor (author 25, dates 20, contact 10); added note that Gemini cites via Google Search index
- Claude: replaced Privacy/Terms scoring with FAQ (15), Question headings (15), Definition lists (10); reflects Brave Search index citation behavior
- Perplexity: PerplexityBot elevated to 25 pts as critical gate; FAQ expanded to 20 pts; removed logo/footer credibility signals
- Added informative-only bot list to L1 (OAI-SearchBot, ChatGPT-User, Claude-Web, etc.) — tracked but not scored
v1.0.1 (2026-02-18): Scope clarification
- Added Citation Readiness scope note: scores reflect technical eligibility indicators, not actual citation probability
v1.0.0 (2026-02-15): Initial public release
- Established equal-weight formula: (AI Readiness + SEO + Citation) / 3
- Documented all 8 layers with point breakdowns
- Added platform-specific scoring (GPT, Gemini, Claude, Perplexity)
- Linked official documentation sources
Quarterly Review Commitment: L1 bot weights are revised every quarter (Q1 Jan · Q2 Apr · Q3 Jul · Q4 Oct) based on current AI platform market share data from the VerisAI Intel pipeline. The full methodology is updated whenever AI platforms publish new official crawler documentation, new LLM citation research emerges, or web standards change in ways that affect AI visibility measurement.
Run a free domain check first, then use this methodology to understand the score, the risks, and the fixes behind it.