EPISTEMIC CONTRACT

Grounding Methodology — How Stratensight Keeps LLM Outputs Grounded in the Dataset

A four-layer epistemic contract powered by 9 deterministic rules and two Claude models.

Stratensight is built on a deterministic, explicable, reproducible principle: every interpretive output the platform shows is the result of a layered audit, not a free-form LLM completion. This page documents the four layers that enforce that principle — the same layers that run on every analysis on every plan, with no gating.

Acronyms (C4, C5, GROUND-2, GROUND-5, GROUND-6, Option B) are kept as internal references for engineers reading the codebase; each is paired with a human-readable label on first occurrence and consolidated in the glossary at the end.

FOUR LAYERS

The contract, layer by layer

Each layer addresses a distinct failure mode of LLM-generated narrative. The layers stack: an output that passes Layer 1 still has to pass Layers 2, 3, and 4 before it reaches the user. None is optional.

LAYER 01

Critical Reader™ (Signal Integrity™) — 9 deterministic rules + 1 LLM auditor

WHAT IT DOES

Nine deterministic checks (CE1, CE2, CE3, CE4, CE5, S1, M_CAGR_LAST_YEAR_ARTIFACT, L_ACADEMIC_DOMINANCE, D_ABSTRACT_FILL_CRITICAL) audit every analysis BEFORE any narrative is generated. A second pass by a Claude Sonnet 4.6 LLM auditor surfaces contextual issues the rules cannot express. Critical issues can downgrade or block a verdict.

WHY IT MATTERS

No silent contradictions between scores and verdict. The audit runs on every analysis, every plan, with no gating — because scientific credibility cannot be fragmented.

SOURCE — backend/app/services/critical_reader.py:7-8
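
As an illustration, a deterministic rule in this layer can be sketched as a pure function from analysis fields to a list of issues. The rule name, severity values, and the specific contradiction below are hypothetical, not the actual CE* implementations:

```python
from dataclasses import dataclass

@dataclass
class Issue:
    rule: str
    severity: str   # "critical" issues can downgrade or block a verdict
    message: str

# Hypothetical rule in the spirit of the CE* checks: a strong verdict
# must not coexist with very low evidence certainty.
def check_verdict_certainty_coherence(verdict: str, evidence_certainty: str) -> list[Issue]:
    issues = []
    if verdict == "INVEST" and evidence_certainty == "VERY_LOW":
        issues.append(Issue(
            rule="CE_COHERENCE",
            severity="critical",
            message="INVEST verdict contradicts VERY_LOW evidence certainty",
        ))
    return issues
```

Because the rule is deterministic, the same analysis always produces the same issue list, which is what makes the audit reproducible.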

LAYER 02

C4 + C5 — Calibrated Language Layer (pre-generation directive + post-generation hedge validator)

WHAT IT DOES

C4 Evidence-Certainty Directive (pre-generation): every LLM system prompt receives a language register conditioned on evidence_certainty — VERY_LOW, LOW, MODERATE, or HIGH. When certainty is LOW or VERY_LOW, the LLM MUST use hedging vocabulary and MUST NOT use absolute language. C5 Hedge Validator (post-generation): scans the LLM output against those constraints. If validation fails, the system retries once at temperature 0.0; on a second failure, it falls back to a deterministic, pre-validated hedged template.

WHY IT MATTERS

Without this layer, an Executive Summary can read affirmatively ("the evidence demonstrates...") even when the badge shows Conditional / Low certainty. C4 + C5 close that asymmetry.

SOURCE — backend/app/services/_llm_hedging.py + _hedge_validator.py
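
A minimal sketch of the C4/C5 loop, assuming illustrative hedge and absolute word lists (the real vocabularies and validator live in _llm_hedging.py / _hedge_validator.py):

```python
import re

HEDGES = {"may", "suggests", "appears", "preliminary", "could"}      # illustrative
ABSOLUTES = {"demonstrates", "proves", "confirms", "guarantees"}     # illustrative

def validate_hedging(text: str, evidence_certainty: str) -> bool:
    """C5-style check: low-certainty narrative must hedge and avoid absolutes."""
    if evidence_certainty not in {"LOW", "VERY_LOW"}:
        return True
    words = set(re.findall(r"[a-z]+", text.lower()))
    return bool(words & HEDGES) and not (words & ABSOLUTES)

def generate_with_hedge_guard(generate, certainty: str, fallback_template: str) -> str:
    """Retry once at temperature 0.0, then fall back to a pre-validated template."""
    for temperature in (0.7, 0.0):
        text = generate(temperature)
        if validate_hedging(text, certainty):
            return text
    return fallback_template
```

The fallback template is itself hedged and pre-validated, so the worst case is deterministic, not absolute, language.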

LAYER 03

GROUND-2 — Grounding Validator (anti-hallucination whitelist enforcement layer)

WHAT IT DOES

A per-analysis whitelist of grounded facts (entities, numbers, years, geographies) is passed via GroundingContext. After LLM generation, validate_grounding rejects any output that introduces a fact absent from the whitelist. The wrapper enforce_grounding_with_retry retries at temperature 0.0 and then falls back to a deterministic template. Matching is accent-insensitive (NFKD) and uses word-boundary regexes.

WHY IT MATTERS

An LLM that invents an assignee name, a CPC code, or a citation count silently undermines every downstream interpretation. GROUND-2 stops hallucination at the validation step rather than relying on trust.

SOURCE — backend/app/services/_grounding_validator.py
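
The matching strategy can be sketched as follows. The number extraction and the simple two-argument API are simplifications of the real validate_grounding; only the NFKD folding and word-boundary matching are taken from the description above:

```python
import re
import unicodedata

def _fold(s: str) -> str:
    # NFKD decomposition + strip combining marks -> accent-insensitive matching
    return "".join(c for c in unicodedata.normalize("NFKD", s.lower())
                   if not unicodedata.combining(c))

def validate_grounding(output: str, whitelist: set[str]) -> bool:
    """Simplified sketch: reject output that asserts a number absent from the
    whitelist (the real validator also covers entities, years, geographies)."""
    allowed = {_fold(w) for w in whitelist}
    for number in re.findall(r"\b\d[\d.,]*\b", output):
        if _fold(number) not in allowed:
            return False
    return True

def mentions_entity(output: str, entity: str) -> bool:
    """Accent-insensitive, word-boundary entity match."""
    return re.search(rf"\b{re.escape(_fold(entity))}\b", _fold(output)) is not None
```

Folding both sides through NFKD means "Schnéider" and "Schneider" compare equal, while the word boundary prevents "2021" from matching inside "12021".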

LAYER 04

GROUND-5 — Refusal Rule (three-level abstention protocol: soft / strict / narrative-specific)

WHAT IT DOES

An explicit REFUSAL RULE is injected into LLM system prompts so the model refuses with a calibrated phrase, rather than inventing facts, when the dataset variables do not support an answer. Three levels: soft (personas, narrative_engine), strict (chat advisor, with anti-false-positive FR phrasing), and narrative-specific (the 11 narrative_sections functions and persona_engine). When a refusal is detected, both validators bypass their normal scrubbing.

WHY IT MATTERS

Hallucination prevention is not enough — the LLM must have a graceful exit when the data does not support the question. GROUND-5 makes refusal a first-class output, not a failure.

SOURCE — backend/app/services/_llm_refusal.py
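
Refusal detection can be sketched as a marker-phrase scan. The phrases below are invented placeholders; the real calibrated phrases per level are defined in _llm_refusal.py:

```python
# Hypothetical calibrated refusal phrases, one per abstention level.
REFUSAL_MARKERS = {
    "soft": "the available data does not allow a reliable answer",
    "strict": "les donnees disponibles ne permettent pas de repondre",
    "narrative": "insufficient evidence in this dataset to support",
}

def is_refusal(text: str) -> bool:
    """When the model abstains with a calibrated phrase, downstream
    validators (C5 hedging, GROUND-2 grounding) bypass normal scrubbing."""
    lowered = text.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS.values())
```

Treating the refusal as a recognized first-class output is what lets the validators wave it through instead of rejecting it as an empty or ungrounded answer.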

BRIDGE TO LAYER C

evidence_certainty — not just a badge, a verdict gate

Beyond labeling and hedging, evidence_certainty drives the Layer C Tier gate (Phase 5.3): the verdict surface adapts categorically to the certainty level. HIGH preserves the raw verdict (TIER_HIGH). MODERATE or LOW maps to a directional signal (TIER_MODERATE): INVEST → OPPORTUNITY_SIGNAL, MONITOR → MIXED_SIGNAL, EXPLORE → WEAK_SIGNAL, AVOID → NEGATIVE_SIGNAL. VERY_LOW withholds the verdict entirely and replaces it with INSUFFICIENT_DATA (TIER_LOW). See the Layer C section of the methodology page for the complete mapping.
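
The mapping is small enough to state directly in code. This sketch follows the text above verbatim; only the function and constant names are illustrative:

```python
TIER_MODERATE_MAP = {
    "INVEST": "OPPORTUNITY_SIGNAL",
    "MONITOR": "MIXED_SIGNAL",
    "EXPLORE": "WEAK_SIGNAL",
    "AVOID": "NEGATIVE_SIGNAL",
}

def apply_tier_gate(verdict: str, evidence_certainty: str) -> tuple[str, str]:
    """Subordinate the user-facing verdict to evidence_certainty."""
    if evidence_certainty == "HIGH":
        return "TIER_HIGH", verdict                         # raw verdict preserved
    if evidence_certainty in ("MODERATE", "LOW"):
        return "TIER_MODERATE", TIER_MODERATE_MAP[verdict]  # directional signal
    return "TIER_LOW", "INSUFFICIENT_DATA"                  # VERY_LOW: verdict withheld
```

Because the gate is a total function over the four verdicts and four certainty levels, there is no path on which a raw verdict leaks past a VERY_LOW certainty.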

SOURCE TAGGING

GROUND-6 — Provenance flagging on every LLM-readable fact

Inside the LLM prompt, every fact is tagged with its provenance. The model cannot accidentally treat a derived metric as a primary observation, nor invent a fact under the cover of a grounded one.

[grounded]

Fact present in the analysis dataset whitelist (entity, number, year, geography). Safe to assert affirmatively under certainty rules.

[derived]

Fact computed from grounded facts via deterministic transformation (e.g. CAGR from yearly counts). Must inherit the grounding of its inputs.

[absent]

Fact NOT in the whitelist and NOT derivable. The LLM must either refuse (GROUND-5) or hedge as preliminary signal (C4) — never assert as evidence.

SOURCE — backend/app/services/prompt_builder.py (SOURCE_TAG_GROUNDED / _DERIVED / _ABSENT)
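
A sketch of how a prompt builder could attach these tags. The function name and arguments are hypothetical; only the three tag strings come from the description above:

```python
def tag_fact(label: str, value, whitelist: set[str], derived: set[str]) -> str:
    """Prefix each prompt-visible fact with its provenance tag."""
    v = str(value)
    if v in whitelist:
        tag = "[grounded]"
    elif v in derived:
        tag = "[derived]"   # computed deterministically from grounded inputs
    else:
        tag = "[absent]"    # must be refused (GROUND-5) or hedged (C4)
    return f"{tag} {label}: {v}"
```

Since every fact the model can read carries one of the three tags, "no tag" is never a state the prompt can be in.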

UX CALIBRATION

Option B — Certainty × language consistency across the user journey

The four layers above keep individual LLM outputs grounded. Option B extends the same epistemic discipline to the deterministic templates surrounding them — decision narrative, key insight, and executive outlook — so the user reads a coherent register from badge to recommendation.

Three calibration targets

  • Decision Engine narrative — four templates conditioned by certainty (LOW / MODERATE / HIGH) × language (EN / FR), driven by a grade computed BEFORE narrative generation.
  • Key Insight — certainty and language propagated through narrative_sections so the headline interpretation never outruns the evidence.
  • Executive outlook (frontend) — deterministic copy in text.ts reflects the same certainty register across the explorer and analysis pages.

SOURCE — backend/app/services/_ux_calibration.py + decision_engine.py
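
The certainty × language template lookup can be sketched as a plain table. The template wording here is invented for illustration; the real copy lives in _ux_calibration.py and text.ts:

```python
# Hypothetical template table: certainty band x language -> register.
TEMPLATES = {
    ("HIGH", "EN"): "The evidence supports {verdict}.",
    ("HIGH", "FR"): "Les donnees soutiennent {verdict}.",
    ("MODERATE", "EN"): "The evidence tentatively points toward {verdict}.",
    ("MODERATE", "FR"): "Les donnees suggerent prudemment {verdict}.",
    ("LOW", "EN"): "Early signals hint at {verdict}; treat as preliminary.",
    ("LOW", "FR"): "Des signaux precoces evoquent {verdict}; a confirmer.",
}

def decision_narrative(verdict: str, certainty: str, language: str) -> str:
    """The grade is computed BEFORE narrative generation, so the chosen
    template can never outrun the evidence."""
    return TEMPLATES[(certainty, language)].format(verdict=verdict)
```

A deterministic lookup keyed on a pre-computed grade is what guarantees the badge and the surrounding copy never disagree.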

TWO AI MODELS

The stack — auditor and narrator

Stratensight uses two Claude models, each with a tightly scoped role. AI generates text only — never scores, never numbers. Scoring is deterministic Python, always.

Claude Sonnet 4.6

ROLE

Auditor (second pass of Layer 01, Signal Integrity™)

SCOPE

Reads the full analysis context (scores, metadata, source mode) and may surface up to 8 additional issues that the deterministic rules cannot express. Hard guardrails: allowed_values whitelist, ±0.5 float tolerance, 15-second timeout, 2048 max output tokens. Never invents a fact, never re-scores — flags only.
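
These guardrails can be sketched as a post-filter on each issue the auditor emits. Field names and severity values are hypothetical; the ±0.5 tolerance and the 8-issue cap come from the description above:

```python
ALLOWED_SEVERITIES = {"info", "warning", "critical"}   # hypothetical allowed_values

def accept_auditor_issue(issue: dict, known_scores: dict[str, float],
                         tolerance: float = 0.5, issues_seen: int = 0) -> bool:
    """Hard guardrails on LLM auditor output: whitelist fields and verify any
    score the auditor quotes against the deterministic value."""
    if issues_seen >= 8:                           # at most 8 additional issues
        return False
    if issue.get("severity") not in ALLOWED_SEVERITIES:
        return False
    quoted, name = issue.get("score"), issue.get("score_name")
    if quoted is not None and name in known_scores:
        if abs(quoted - known_scores[name]) > tolerance:   # +/-0.5 float tolerance
            return False
    return True
```

An issue that quotes a score the deterministic pipeline never produced is dropped outright, which is how "flags only, never re-scores" is enforced mechanically.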

Claude Haiku

ROLE

Narrative generation (personas, executive summaries, clusters, Q&A, narrative_sections)

SCOPE

Generates role-aware narrative TEXT only — never scores, never numbers. Always paired with C4 directive (pre-gen), C5 + GROUND-2 + GROUND-5 (post-gen). Always labeled "AI-generated insight" in UI. Silent fallback to deterministic templates when the API returns null or fails validation.

GLOSSARY

Terminology mapping

Internal acronyms used by Stratensight engineers, paired with their public labels and short definitions.

C4

HUMAN LABEL

Evidence-Certainty Directive (pre-generation)

WHAT

Block injected into LLM system prompts to constrain language register based on certainty level.

C5

HUMAN LABEL

Hedge Validator (post-generation)

WHAT

Scans LLM output for hedging vocabulary and absolute language. Pairs with C4.

GROUND-2

HUMAN LABEL

Grounding Validator

WHAT

Anti-hallucination whitelist enforcement after LLM generation.

GROUND-5

HUMAN LABEL

Refusal Rule

WHAT

Three-level abstention protocol (soft / strict / narrative-specific) injected into LLM prompts.

GROUND-6

HUMAN LABEL

Source Tagging

WHAT

Provenance flagging — every LLM-readable fact is tagged grounded / derived / absent.

Option B

HUMAN LABEL

UX Calibration

WHAT

Deterministic templates conditioned on certainty × language across decision narrative, key insight, and executive outlook.

evidence_certainty

HUMAN LABEL

Public-facing: "Confidence level"

WHAT

Backend variable VERY_LOW / LOW / MODERATE / HIGH, source for badge color and language register.

Intelligence Grade™

HUMAN LABEL

Public-facing reliability metric

WHAT

Backend: branded_scores.intelligence_grade. Drives downgrade rules and disclaimer thresholds.

USER GLOSSARY

What these tokens mean for your decision-making

The user-facing tokens below appear across decision narratives, persona insights, and score badges. The glossary above lists internal acronyms for engineers; this section is for everyone reading a Stratensight report. For threshold values and lifecycle weights, see /methodology.

Verdict

INVEST

Decision Engine verdict — strong convergent opportunity

Verdict

MONITOR

Decision Engine verdict — promising but incomplete

Verdict

EXPLORE

Decision Engine verdict — mixed signals, deeper analysis required

Verdict

AVOID

Decision Engine verdict — weak or negative signals

Lifecycle

Research

Lifecycle phase — early academic exploration

Lifecycle

Emerging

Lifecycle phase — first commercial interest

Lifecycle

Acceleration

Lifecycle phase — rapid growth, opportunity window

Lifecycle

Growth

Lifecycle phase — established technology

Lifecycle

Mature

Lifecycle phase — saturated, incumbent-dominated

Persona

investor

Persona role — strategic capital allocator

Persona

rd-engineer

Persona role — research / engineering decision-maker

Persona

patent-attorney

Persona role — IP legal practitioner

Persona

strategist

Persona role — corporate strategy lead

Persona

analyst

Persona role — market intelligence analyst

Persona

executive

Persona role — C-level decision-maker

Score tier

Momentum HIGH / MEDIUM / LOW

Score tier — filing velocity intensity

Score tier

OPEN / CONTESTED / CONCENTRATED / DOMINATED

Openness tier — competitive structure label

Score tier

Certainty LOW / MODERATE / HIGH (user-facing)

Public confidence band — drives badge color and LLM hedging register

Decision Engine

Layer C

Tier gate (Phase 5.3) — subordinates the user-facing verdict to evidence_certainty via categorical mapping

Tier output

OPPORTUNITY_SIGNAL

Directional label for INVEST under TIER_MODERATE (MODERATE or LOW certainty)

Tier output

MIXED_SIGNAL

Directional label for MONITOR under TIER_MODERATE (MODERATE or LOW certainty)

Tier output

WEAK_SIGNAL

Directional label for EXPLORE under TIER_MODERATE (MODERATE or LOW certainty)

Tier output

NEGATIVE_SIGNAL

Directional label for AVOID under TIER_MODERATE (MODERATE or LOW certainty)

Tier output

INSUFFICIENT_DATA

Verdict withheld under TIER_LOW (VERY_LOW certainty) — no signal can be defended

Sibling: /methodology/grade — how Stratensight rates evidence certainty and recommendation strength.
Parent: /methodology — the full scientific methodology behind every Stratensight verdict.

Stratensight provides patent intelligence signals, not legal opinions or freedom-to-operate assessments. Not a substitute for IP counsel.