Pillar research · By Ali Jakvani, Cofounder
The Citation Survivability Index: A Research Framework for Answer Engine Optimization
A working theory of how modern answer engines decide which sources to cite, which entities to trust, and which content survives the compression that happens between a user’s question and the model’s reply. Introducing CSI — Aeonic’s proprietary AEO methodology.
For twenty-five years, optimization for discovery meant one thing: get a crawler to like your page enough to put it near the top of a results list. That objective is no longer sufficient. A growing share of high-intent commercial queries now resolve inside an answer engine before the user ever sees a SERP. If your domain is not among the citations under that synthesized answer, you do not exist for that query — even if you would have ranked #1 on Google for the same string.
1. Why traditional SEO is structurally insufficient
SEO maps poorly onto AEO not because the underlying signals are wrong, but because the objective function changed without most operators noticing. Consider four canonical assumptions and how each breaks under retrieval:
| Traditional SEO assumption | What actually happens in AEO |
|---|---|
| “If it ranks, it will be seen.” | Rankings do not equal retrieval. Answer engines retrieve from indexes that are not the SERP, often using vector similarity over chunks rather than page-level relevance. |
| “Backlinks confer authority.” | Backlinks are a weak proxy for citation probability. Models cite sources whose chunks are confidently retrievable and unambiguously attributable, regardless of DR. |
| “Keyword density signals topical relevance.” | Embedding models care about semantic distribution, not literal repetition. Keyword stuffing increases ambiguity and decreases retrieval confidence. |
| “Crawlers and answer engines see the same web.” | They do not. Many answer engines query intermediate retrieval layers built on third-party indexes, partner data, real-time fetches, and structured-data caches. |
The implication: a page can be perfectly SEO-optimized and still be functionally invisible to AI search. The reverse is also true — a page can be ranked third on Google and yet be the dominant cited source across ChatGPT, Perplexity, and Claude. We see this empirically across customer corpora.
2. Research hypothesis
Our primary hypothesis (H₁) is that the probability a document is cited by an answer engine is a composite function of a small number of measurable properties: how confidently its chunks are retrieved, how unambiguously its entities resolve, how dense its corroboration is, how cleanly its answers extract, how stably it is cited across engines, and how little its structure fragments under chunking. We refer to this composite function as the Citation Survivability Index (CSI). A secondary hypothesis (H₂) follows: citation is self-reinforcing. Once a source is reliably cited for an intent cluster, reinforcement in training distributions, retrieval caches, and entity graphs makes it progressively harder to displace.
If H₂ holds, the strategic implication is large: AEO is a winner-take-most game in a way that even classical SEO — with its ten-blue-link diversity — was not.
3. The retrieval lifecycle
Before introducing the formula, we need a shared mental model of the pipeline CSI is trying to score against. The following lifecycle is a synthesis of the public architectures of major retrieval-augmented systems and the observable behavior of production answer engines.
| Stage | What happens | Where pages die |
|---|---|---|
| 1. Query understanding | Intent classification, entity recognition, sub-question generation | — |
| 2. Retrieval | Vector similarity over chunks, BM25 hybrid, structured-data lookups, live fetch | Chunks fail to enter the candidate set |
| 3. Re-ranking & filtering | Cross-encoder relevance, recency, source-trust priors, dedup, conflict detection | Low-density or duplicate sources discarded |
| 4. Entity resolution | Disambiguation against entity graph, canonical-source preference | Ambiguous brand entities silently swapped |
| 5. Answer extraction | Chunk to candidate answer span, confidence scoring per span | Buried answers lose to better-formatted ones |
| 6. Synthesis & compression | Multi-source merge, contradiction resolution, citation attribution | Less defensible sources dropped from final cite list |
A page can be eliminated at any of stages 2–6. Each stage corresponds, roughly, to a CSI sub-component. That correspondence is intentional — CSI is a model of where in the pipeline a page dies, not just whether it is “good content.”
4. The CSI framework: six variables
The Citation Survivability Index aggregates six sub-scores. Each is independently measurable, independently optimizable, and maps to a specific stage of the retrieval lifecycle.
4.1 Semantic Retrieval Confidence (SRC)
The probability that, given a query q and its embedding v(q), at least one chunk of the document falls within the top-k nearest neighbors in the retriever’s vector space. Range: 0 to 1.
Retrievers do not read pages; they read chunks. SRC measures whether the most retrievable chunk in a document is positioned in semantic space such that it appears in the candidate set for the queries the operator cares about. Stage 2 of the retrieval lifecycle is the first elimination — a document with low SRC never enters the candidate set, and nothing downstream can save it.
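SRC can be probed directly. The following is a minimal sketch, assuming an open-source embedding model (sentence-transformers) as a stand-in for the engine's actual retriever, which is not observable; `doc_chunks`, `target_queries`, and `corpus_chunks` are illustrative inputs.

```python
# A minimal SRC probe. The embedding model is a stand-in: production answer
# engines use retrievers we cannot inspect, so treat the score as relative.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

def src_estimate(doc_chunks, target_queries, corpus_chunks, k=10):
    """Fraction of target queries for which at least one of the document's
    chunks lands in the top-k nearest neighbors of the candidate corpus."""
    all_chunks = doc_chunks + corpus_chunks
    chunk_vecs = model.encode(all_chunks, normalize_embeddings=True)
    query_vecs = model.encode(target_queries, normalize_embeddings=True)
    doc_idx = set(range(len(doc_chunks)))    # positions of our own chunks

    hits = 0
    for qv in query_vecs:
        sims = chunk_vecs @ qv               # cosine similarity on unit vectors
        top_k = set(np.argsort(-sims)[:k].tolist())
        if doc_idx & top_k:
            hits += 1
    return hits / len(target_queries)
```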
4.2 Entity Resolution Depth (ERD)
The degree to which the entities mentioned in the document are disambiguated, canonically referenced, and cross-linked to authoritative graphs (Wikidata, schema.org, industry registries, the engine’s internal KG). Range: 0 to 1.
When a model encounters “Aeonic,” it must answer: which Aeonic? If the answer is uncertain, the model picks the wrong entity, hedges, or omits the citation entirely. Stage 4 is silent — models do not return an error when they fail to resolve an entity; they simply pick a safer source. Low ERD is the most common reason a content-rich domain is invisible in AI answers.
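The ambiguity can be probed the same way a resolver would encounter it. The sketch below queries Wikidata's public search API; the endpoint and parameters are real, but treating the number of plausible matches as an ambiguity proxy is this framework's assumption, not a Wikidata feature.

```python
# Probe entity ambiguity via Wikidata's public wbsearchentities API.
import requests

def wikidata_candidates(name, limit=10):
    """Entities a naive resolver would consider for a surface name."""
    resp = requests.get(
        "https://www.wikidata.org/w/api.php",
        params={
            "action": "wbsearchentities",
            "search": name,
            "language": "en",
            "format": "json",
            "limit": limit,
        },
        timeout=10,
    )
    resp.raise_for_status()
    return [(e["id"], e.get("description", "")) for e in resp.json()["search"]]

# Several candidates with near-identical descriptions means a model asked
# "which Aeonic?" has no canonical answer to fall back on: low ERD.
for qid, description in wikidata_candidates("Aeonic"):
    print(qid, description)
```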
4.3 Contextual Authority Density (CAD)
The amount of corroborating, authoritative, on-topic context per unit of content — measured as the ratio of citations, named entities, statistics, and verifiable claims to total tokens, weighted by source quality. Range: 0 to 1.
Authority is no longer page-level; it is paragraph-level. A 4,000-word post with two thin citations has lower CAD than a 1,000-word piece with eight verifiable references and tightly defined terminology. Re-rankers in stage 3 reward density. Compressors in stage 6 cite the source that gives them the most defensible single sentence.
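A toy version of the ratio, assuming you have already counted citation-backed claims and assigned each source a quality weight; the normalization constant (one well-sourced claim per 50 tokens saturates the score) and the averaging of source weights are illustrative assumptions.

```python
# Toy CAD ratio: quality-weighted verifiable claims per token, normalized so
# one well-sourced claim per 50 tokens saturates the score in [0, 1].
def cad_score(cited_claims, token_count, source_weights):
    """cited_claims: citation-backed claims, named entities, and statistics;
    source_weights: one quality weight in [0, 1] per citation's source."""
    if token_count == 0 or not source_weights:
        return 0.0
    avg_quality = sum(source_weights) / len(source_weights)
    density = (cited_claims * avg_quality) / (token_count / 50)
    return min(1.0, density)

# The 1,000-word piece with eight strong references out-scores the
# 4,000-word post with two thin citations, exactly as described above:
print(cad_score(8, 1300, [0.9] * 8))   # ≈ 0.277
print(cad_score(2, 5200, [0.4] * 2))   # ≈ 0.008
```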
4.4 Answer Extraction Probability (AEP)
The probability that a model, given the document and a target query, extracts a self-contained answer span from the document rather than from a competing source. Range: 0 to 1.
AEP is a property of form as much as content. A correct answer buried inside a meandering paragraph is extractable only in principle; a correct answer formatted as a clean, short, self-contained block is what actually gets extracted. AEP is the variable most under the operator’s control and the one most often misunderstood. “Write better content” is not a strategy; engineering chunks for extractability is.
4.5 Multi-Engine Citation Stability (MECS)
The variance-adjusted count of distinct answer engines that cite the document for the same intent cluster over a fixed observation window. Range: unbounded, dampened logarithmically.
Different engines weight retrieval signals differently. A page cited by only one engine is a sample-size-of-one; a page cited by most is structurally optimized rather than coincidentally surfaced. The other five variables are leading indicators; MECS is the lagging indicator that the others are working.
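One way to make “variance-adjusted count” concrete, assuming weekly citation counts per engine; discounting by the total variation distance from an even spread is our choice of estimator, not the only defensible one.

```python
# A concrete MECS estimator: engines citing the page, discounted by how
# unevenly the citations concentrate in a single engine. The discount
# (total variation distance from an even spread) is an assumption.
import math

def mecs(citations_per_engine):
    """citations_per_engine: {engine: citation count in the window}."""
    counts = [c for c in citations_per_engine.values() if c > 0]
    if not counts:
        return 0.0
    total = sum(counts)
    even = 1 / len(counts)
    imbalance = sum(abs(c / total - even) for c in counts) / 2
    return len(counts) * (1 - imbalance)

engines = {"chatgpt": 12, "perplexity": 9, "claude": 11, "gemini": 10}
m = mecs(engines)                    # ≈ 3.81: four engines, evenly spread
print(m, math.log(1 + m))            # ln-dampened, as in the master formula
```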
4.6 Retrieval Fragmentation Penalty (RFP)
The degree to which the document’s structure forces a retriever to reconstruct meaning across non-adjacent chunks, dilutes embeddings with mixed topics, or introduces ambiguity through inconsistent entity references. Range: ≥ 0, subtractive in the master formula.
RFP is the only negative variable. It rises with chunks that span unrelated subtopics, pronouns and anaphora that reach across chunk boundaries, inconsistent terminology, render-time content invisibility, duplicate sections, and gated or lazy-loaded content. Most “good” content has more RFP than its authors realize. Fragmentation is the silent killer.
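A rough audit of the failure modes above, applied to pre-chunked text; the per-violation weights are illustrative, not calibrated.

```python
# Rough RFP audit over pre-chunked text. Each heuristic maps to one failure
# mode named above; the weights (0.3, 0.4, 0.5) are illustrative only.
import re

PRONOUN_OPENER = re.compile(r"^(it|they|this|these|that|those)\b", re.I)

def rfp_estimate(chunks, name_variants):
    """name_variants: groups of surface forms for one entity,
    e.g. [("Aeonic", "Aeonic.ai", "AEONIC")]."""
    penalty = 0.0
    for chunk in chunks:
        # Anaphora reaching across a chunk boundary: the chunk opens with a
        # pronoun whose referent lives in the previous chunk.
        if PRONOUN_OPENER.match(chunk.strip()):
            penalty += 0.3
    for variants in name_variants:
        # Inconsistent entity naming: multiple surface forms in active use.
        used = [v for v in variants if any(v in c for c in chunks)]
        penalty += 0.4 * max(0, len(used) - 1)
    # Duplicate chunks confuse stage-3 deduplication.
    penalty += 0.5 * (len(chunks) - len(set(chunks)))
    return penalty
```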
5. The CSI formula
The Citation Survivability Index combines the six variables into a single composite score. The form is multiplicative in the positive components — because retrieval is a chain of conjunctive filters, and any near-zero variable should collapse the whole — while the fragmentation penalty enters additively in the denominator.
        SRC × ERD × CAD × AEP × ln(1 + MECS)
CSI = ──────────────────────────────────────────
                  1 + (λ × RFP)

where ln is the natural logarithm.

| Symbol | Variable | Range | Role |
|---|---|---|---|
| SRC | Semantic Retrieval Confidence | [0, 1] | Stage 2 survival |
| ERD | Entity Resolution Depth | [0, 1] | Stage 4 survival |
| CAD | Contextual Authority Density | [0, 1] | Stages 3 and 6 reinforcement |
| AEP | Answer Extraction Probability | [0, 1] | Stage 5 survival |
| MECS | Multi-Engine Citation Stability | ≥ 0 | Empirical confirmation across models |
| RFP | Retrieval Fragmentation Penalty | ≥ 0 | Subtractive penalty |
| λ | Fragmentation weight | tunable, default 0.5 | Calibrates penalty severity |
Why multiplicative on the positive side
Retrieval pipelines are conjunctive. A page must pass stage 2 and stage 3 and stage 4 and stage 5 and stage 6. If any stage rejects it, the page does not get cited — full stop. Multiplicative composition has the right shape: a single near-zero variable collapses the score, mirroring the actual brittleness of the pipeline.
Why logarithmic on MECS
MECS is a count, not a probability. The first additional engine that cites you is informationally enormous — it confirms the pattern is non-coincidental. The fifth additional engine is incrementally less so. Log dampening encodes diminishing returns and prevents MECS from dominating the score.
Why an additive penalty in the denominator
RFP is a friction term. We do not want it to zero out a score, because some fragmentation is recoverable through other strengths; we want it to progressively suppress the score. The denominator form 1 + λ·RFP gives smooth, bounded suppression and makes λ a single tuning knob.
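The formula transcribes directly into code. The sketch below uses the natural log and the default λ = 0.5, and checks itself against the two pages in the worked example that follows.

```python
# The master formula, transcribed directly (natural log, default λ = 0.5).
import math

def csi(src, erd, cad, aep, mecs, rfp, lam=0.5):
    return (src * erd * cad * aep * math.log(1 + mecs)) / (1 + lam * rfp)

page_a = csi(0.42, 0.30, 0.25, 0.35, mecs=1, rfp=1.8)   # ≈ 0.0040
page_b = csi(0.86, 0.78, 0.72, 0.81, mecs=4, rfp=0.4)   # ≈ 0.525
print(page_a, page_b, page_b / page_a)                  # B out-scores A ~130×
```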
6. A worked example
Consider two builds of the same real-world page — a SaaS pricing page for a developer-tools company.
Page A — typical SEO build
Long-tail keywords, generic copy, weak schema, mixed sections, no original data.
| Variable | Score | Reason |
|---|---|---|
| SRC | 0.42 | Mixed-topic chunks dilute query alignment |
| ERD | 0.30 | Minimal schema, no sameAs links |
| CAD | 0.25 | Few citations, no original data |
| AEP | 0.35 | Answers buried in marketing prose |
| MECS | 1 | Cited by one engine occasionally |
| RFP | 1.8 | Inconsistent product naming, lazy-loaded sections |
CSI(A) = (0.42 × 0.30 × 0.25 × 0.35 × ln 2) / (1 + 0.5 × 1.8)
       ≈ 0.0040

Page B — CSI-engineered build
Atomic chunks, full Product schema with sameAs, primary benchmarks, FAQ-formatted answers, consistent terminology.
| Variable | Score | Reason |
|---|---|---|
| SRC | 0.86 | Tightly scoped chunks, terminology-precise |
| ERD | 0.78 | Organization + Product + sameAs + consistent NAP |
| CAD | 0.72 | Original benchmark data, eight cited primary sources |
| AEP | 0.81 | Answer-first paragraphs, FAQ blocks, decision tables |
| MECS | 4 | Cited by ChatGPT, Perplexity, Claude, Gemini |
| RFP | 0.4 | Minor anaphora issues |
CSI(B) = (0.86 × 0.78 × 0.72 × 0.81 × ln 5) / (1 + 0.5 × 0.4)
       ≈ 0.525

7. Optimization methodology
Because the formula is multiplicative, the marginal return on improving any variable is proportional to how low that variable currently is. This matters operationally: most teams optimize the wrong variable first.
Step 1. Diagnose RFP first
Fragmentation is cheapest to fix and has the largest hidden suppression effect. Audit:
- Chunk boundary alignment — do paragraphs map to single concepts?
- Entity name consistency — one product, one canonical name, used identically across the site.
- Render-time content visibility — does a fetcher see the same DOM as a user?
- Duplicate or near-duplicate sections that confuse de-duplication.
- Anaphora resolution within chunks — pronouns that reach across boundaries.
Step 2. Lift ERD
Most domains have a 0.2–0.4 ERD ceiling because schema is partial, sameAs is missing, and entity references are inconsistent across the site. Target:
- A complete Organization block with a full sameAs array.
- Canonical Product, Service, Person, and Article schema everywhere applicable.
- One canonical name, one canonical description, one canonical logo, repeated identically.
- Entity profiles on Wikidata, Crunchbase, GitHub (where appropriate), LinkedIn, and industry registries.
A minimal sketch of the Organization payload follows this list.
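The sketch serializes the payload from Python to JSON-LD. The schema.org field names (@context, @type, sameAs, logo) are real; every URL and the Wikidata QID below are placeholders.

```python
# Minimal Organization payload of the kind Step 2 targets. All URLs and the
# QID are placeholders; swap in your real canonical identifiers.
import json

organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Aeonic",                                    # one canonical name
    "description": "AI search optimization platform.",   # one canonical description
    "url": "https://example.com",                        # placeholder domain
    "logo": "https://example.com/logo.png",              # one canonical logo
    "sameAs": [                                          # cross-graph links ERD needs
        "https://www.wikidata.org/wiki/Q00000000",       # placeholder QID
        "https://www.crunchbase.com/organization/example",
        "https://github.com/example",
        "https://www.linkedin.com/company/example",
    ],
}

# Embed the output in a <script type="application/ld+json"> tag on every page.
print(json.dumps(organization, indent=2))
```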
Step 3. Engineer chunks for AEP
This is content engineering, not content writing. For each high-value query intent:
- Write the answer in the first sentence of the relevant section.
- Bound the answer in 40–80 tokens of self-contained prose.
- Give the chunk a heading that contains the question phrasing.
- Follow with a short elaboration, not a long preamble.
- Format atomic facts as lists or tables that survive extraction intact.
These rules are mechanical enough to lint, as the sketch after this list shows.
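A crude lint for those rules; whitespace tokenization stands in for the retriever's real tokenizer, which we do not control, and the preamble phrases are illustrative.

```python
# Crude extractability lint for a (heading, body) chunk pair.
QUESTION_STARTS = ("what", "how", "why", "when", "which", "who", "does", "is", "can")
PREAMBLE_STARTS = ("in today's", "as we all know", "before we", "it's no secret")

def aep_lint(heading, body):
    issues = []
    first_sentence = body.strip().split(". ")[0]
    if "?" not in heading and not heading.lower().startswith(QUESTION_STARTS):
        issues.append("heading does not mirror question phrasing")
    if len(first_sentence.split()) > 80:
        issues.append("first sentence exceeds 80 tokens; the answer may be buried")
    if first_sentence.lower().startswith(PREAMBLE_STARTS):
        issues.append("chunk opens with preamble, not the answer")
    return issues

print(aep_lint(
    "What is the Citation Survivability Index?",
    "The Citation Survivability Index (CSI) is a composite score of six "
    "retrieval variables that predicts whether a page survives to citation. "
    "It maps each variable to a stage of the retrieval lifecycle.",
))  # -> []
```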
Step 4. Raise CAD
Density does not mean length; it means defensible claims per token. Replace assertions with cited claims. Cite primary sources, not aggregators. Include original data where possible (proprietary benchmarks, customer aggregates with permission, methodological notes). Name entities precisely. Avoid hedge-words that lower extractable certainty (“often,” “many,” “some”).
Step 5. Improve SRC
SRC is a function of everything above plus a few specific levers:
- Chunk size around 200–500 tokens with single-topic boundaries.
- Terminology density: use the field’s actual vocabulary.
- Heading structure that mirrors retrieval-likely query patterns.
- Internal linking that propagates topical context.
A heading-based chunker, sketched after this list, covers the first lever.
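The chunker splits markdown on headings, then re-splits over-long sections on paragraph boundaries; the 500-token ceiling and the whitespace token count are simplifying assumptions.

```python
# Heading-based chunker for markdown source: no chunk spans unrelated
# subtopics, and over-long sections break on paragraph boundaries.
import re

def chunk_markdown(md, max_tokens=500):
    sections = re.split(r"(?m)^(?=#{1,6} )", md)   # each piece keeps its heading
    chunks = []
    for section in filter(str.strip, sections):
        if len(section.split()) <= max_tokens:
            chunks.append(section.strip())
            continue
        # Re-split on paragraph boundaries, never mid-sentence, so every
        # chunk stays a self-contained unit.
        buf = []
        for para in section.split("\n\n"):
            if buf and len(" ".join(buf + [para]).split()) > max_tokens:
                chunks.append("\n\n".join(buf).strip())
                buf = []
            buf.append(para)
        if buf:
            chunks.append("\n\n".join(buf).strip())
    return chunks
```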
Step 6. Measure MECS
MECS is the outcome variable. It cannot be directly optimized; it is what you observe when steps 1–5 are working. Track citations across engines weekly. Variance across engines is signal — if Claude cites you and Gemini does not, the gap reveals which sub-component is weak.
8. SEO vs. AEO: the comparison that matters
| Dimension | Traditional SEO | Answer Engine Optimization (CSI) |
|---|---|---|
| Objective | Rank a page | Survive retrieval, extraction, and synthesis |
| Unit of optimization | Page | Chunk |
| Authority signal | Backlink graph | Entity graph + corroboration density |
| Trust proxy | Domain Rating | Entity Resolution Depth + Citation Survivability |
| Failure mode | Page ranks low | Page is retrieved but discarded, or cited inconsistently across models |
| Diversity dynamic | Ten blue links | One synthesized answer, 1–3 citations |
| Optimization horizon | Rolling rankings | Compounding citation gravity |
| Primary risk | Algorithm updates | Model swaps, retrieval-layer changes, training-distribution shifts |
| Measurement | SERP positions, traffic | Citation rate, citation stability, extraction frequency |
An SEO update can move a page from #3 to #8 — painful, recoverable. An AEO failure can remove a domain from the citation set across all major engines at once, because the structural variables that cause the failure (low ERD, high RFP) affect every engine simultaneously.
9. Citation gravity: why survival compounds
Citation gravity is the emergent property of the CSI framework. Once a source is reliably cited by an answer engine for a given entity or intent cluster, three reinforcement loops begin to operate:
- Training-distribution reinforcement. Future model training corpora over-represent already-cited domains, so the next generation enters the world with stronger priors for those sources.
- Retrieval-cache reinforcement. Many production answer engines maintain warm caches of frequently-cited chunks for popular queries. Sources in those caches are structurally faster to retrieve and harder to displace.
- Entity-graph reinforcement. When a model resolves an entity, the canonical sources for that entity become the default citations even for adjacent queries. A page that wins on “what is X?” tends to also win on “best alternatives to X.”
10. Semantic decay: why CSI is not static
CSI is not a one-time score. Three forces erode it continuously: model swaps that re-weight the retrieval and extraction pipelines; vocabulary drift as terminology evolves; and authority drift as cited sources are deprecated, paywalled, or dead-linked. Operators who treat CSI work as a one-time content audit will see scores erode within two to three model generations. Operators who maintain a rolling re-evaluation cadence will see citation gravity continue to compound.
11. Enterprise implications
- Content operations becomes content engineering. The discipline that produces CSI-optimized content is closer to documentation engineering than to traditional content marketing.
- Authority is paragraph-level, not domain-level. A high-DR domain with low CAD will lose citation share to a lower-DR domain with deeper paragraph-level authority.
- Entity strategy precedes content strategy. Without ERD, every content investment downstream is suppressed.
- The reporting model changes. Traffic and ranking dashboards under-represent AEO performance. Add citation-rate, citation-stability, and entity-resolution dashboards.
- Risk concentrates in retrieval-layer dependencies. Multi-engine optimization (high MECS) is the structural hedge.
12. Limitations and open questions
A research framework should name what it does not yet know. SRC, ERD, CAD, AEP, and RFP are not fully orthogonal — the current formula treats them as independent for tractability; a future iteration will model the covariance explicitly. The default weights presented here are operator-calibrated, not regression-fit on a public dataset; different engines likely warrant different λ and different MECS dampening curves. We have strong observational evidence for the variables but limited ability to isolate causal contributions in production systems we do not control. And as CSI-style optimization spreads, engines will adapt; some variables (notably AEP) are more vulnerable to gaming than others (notably MECS, which is hard to fake without genuine multi-source survival). The framework is intended to be revised under public scrutiny.
13. Conclusion
The optimization surface for discovery has changed. The page is no longer the unit; the chunk is. The ranking is no longer the objective; the citation is. The crawler is no longer the audience; the retrieval-extraction-synthesis pipeline is.
The Citation Survivability Index is a working theory of how that pipeline decides what survives. It is not the last word on AEO; it is a first defensible articulation of the variables that govern citation in answer engines, and a methodology for engineering against them. The page that ranks does not win the answer. The page that survives does.
— Ali Jakvani, Cofounder, Aeonic
References
- [1] Page, L., Brin, S., Motwani, R., & Winograd, T. (1999). The PageRank Citation Ranking: Bringing Order to the Web. Stanford InfoLab.
- [2] Robertson, S., & Zaragoza, H. (2009). The Probabilistic Relevance Framework: BM25 and Beyond. Foundations and Trends in Information Retrieval.
- [3] Karpukhin, V., et al. (2020). Dense Passage Retrieval for Open-Domain Question Answering. EMNLP.
- [4] Lewis, P., et al. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. NeurIPS.
- [5] Izacard, G., & Grave, E. (2021). Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering. EACL.
- [6] Khattab, O., & Zaharia, M. (2020). ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT. SIGIR.
- [7] Schema.org documentation — Organization, Article, Product, FAQPage, Dataset.
- [8] Wikidata — entity reconciliation and sameAs linking conventions.
- [9] Aeonic — AI Search Optimization Platform.