DOCUMENTATION

Data Guide

Understand where Stratensight data comes from, how scores are computed, and what the system can and cannot do.

Supported data sources

Stratensight automatically detects your export format and maps columns.

Derwent Innovation

✅ Full support · High confidence

All 4 scores available with strong citation data

PatSnap

✅ Full support · High confidence

All 4 scores available

Questel Orbit Intelligence

✅ Full support · High confidence

Full 4 scores with FAMPAT family dedup. Covers 100+ countries worldwide.

PatentSight

✅ Full support · High confidence

Professional export with all 4 scores and family dedup

Espacenet / EPO

✅ Full support · High confidence

Official EPO database, recommended starting point

Google Patents

✅ Full support · High confidence

Free access, excellent for broad technology coverage

TotalPatent One

✅ Full support · High confidence

Standardized Assignee + Family ID + Application Date

Generic CSV

⚡ Basic support · Variable confidence

Any CSV with patent data — auto-detection maps columns

Required fields

FIELD NAME | DESCRIPTION
patent_id / Publication Number | Unique identifier for each patent
title | Patent title (used for clustering)
abstract | Abstract (for AI concept extraction)
filing_date | Filing date (for Momentum Index™ calculation)
assignee / Current Assignee | Patent owner (for Openness Score™ competitive analysis)
cpc_codes / CPC Classifications | Technology classification, i.e. patent technology categories (for Lifecycle Position™)

Auto-detection: Stratensight automatically detects your export format and maps columns. No manual configuration required.
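
To make the idea of column mapping concrete, here is a minimal sketch that matches export headers against the field names listed above. The alias table and the `map_columns` helper are hypothetical and built only from names that appear in this guide; Stratensight's actual detection logic is not documented here.

```python
# Hypothetical alias table: canonical field -> header spellings used in this guide.
ALIASES = {
    "patent_id": {"patent_id", "publication number"},
    "title": {"title"},
    "abstract": {"abstract"},
    "filing_date": {"filing_date", "application date"},
    "assignee": {"assignee", "current assignee"},
    "cpc_codes": {"cpc_codes", "cpc classifications"},
}


def map_columns(headers):
    """Return (mapping, missing): header -> canonical field, plus unmatched fields."""
    mapping = {}
    for header in headers:
        key = header.strip().lower()
        for field, names in ALIASES.items():
            if key in names:
                mapping[header] = field
                break
    # Canonical fields for which no header was recognized.
    missing = sorted(set(ALIASES) - set(mapping.values()))
    return mapping, missing
```

Running a check like this on your own export before uploading is one way to see in advance which fields are likely to be reported as missing.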

Optional fields — improve score accuracy

Including these fields improves Intelligence Grade™ accuracy.

forward_citations | Improves Momentum Index™ | Stronger signal
family_id | Enables patent family deduplication | Reduces noise
priority_date | Improves Lifecycle Position™ calculation | More precise staging
inventors | Enables inventor network analysis | Network signals
ipc_codes | Secondary classification support | Broader coverage

Self-audit — Signal Integrity™

Every analysis audits itself before it is presented. The Critical Reader™ layer surfaces mathematical inconsistencies, data-quality artifacts and scoring contradictions on every plan, with no gating.

9 DETERMINISTIC RULES

  • CE1 — Momentum vs YoY divergence
  • CE2 — Verdict vs AND-logic criteria
  • CE3 — White Space saturation impossibility
  • CE4 — CPC diversity contradiction
  • CE5 — Temporal span vs phase consistency
  • S1 — Source coverage anomaly
  • M_CAGR_LAST_YEAR_ARTIFACT — CAGR distortion from last-year filing lag
  • L_ACADEMIC_DOMINANCE — Lifecycle bias from academic fallback
  • D_ABSTRACT_FILL_CRITICAL — Abstract fill rate below clustering threshold

severity = critical

Mathematical or pipeline inconsistency. Verdict shown but should be reviewed before action.

severity = warning

Data-quality concern that may bias the signal.

severity = info

Legitimate downgrade by a Layer B guard. Surfaces the constraint, never blocks.

Why this layer exists: a verdict you cannot audit is a verdict you cannot trust. Stratensight shows the audit before the verdict, not after — on every analysis, every plan, with no gating. Read the full mechanics in the methodology page.

Layer C — Tier gate (Phase 5.3): adds a final coherence check on top of Layer B. The verdict shown to users is subordinated to evidence_certainty. With HIGH certainty, the raw verdict is preserved. With MODERATE or LOW certainty, the verdict is mapped to a directional signal (e.g. OPPORTUNITY_SIGNAL instead of INVEST). With VERY_LOW certainty, the verdict is fully withheld and replaced by INSUFFICIENT_DATA. See the methodology page for the complete Tier gate logic.
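
The Tier gate described above can be read as a small lookup. The sketch below is an illustration only: the `gate_verdict` function is hypothetical, the INVEST to OPPORTUNITY_SIGNAL pair is the one example this guide gives, and the `DIRECTIONAL_SIGNAL` fallback is a placeholder for mappings the guide does not list.

```python
# Only INVEST -> OPPORTUNITY_SIGNAL comes from this guide; the fallback label
# used for other verdicts is a placeholder, not a documented value.
DIRECTIONAL = {"INVEST": "OPPORTUNITY_SIGNAL"}


def gate_verdict(raw_verdict, evidence_certainty):
    """Subordinate the raw verdict to evidence_certainty, as Layer C is described."""
    if evidence_certainty == "HIGH":
        return raw_verdict  # raw verdict preserved
    if evidence_certainty in ("MODERATE", "LOW"):
        return DIRECTIONAL.get(raw_verdict, "DIRECTIONAL_SIGNAL")
    if evidence_certainty == "VERY_LOW":
        return "INSUFFICIENT_DATA"  # verdict fully withheld
    raise ValueError(f"unknown certainty level: {evidence_certainty!r}")
```

The methodology page remains the reference for the complete mapping.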

File format

CSV

Recommended
  • UTF-8 encoding
  • Comma or semicolon separator
  • First row = headers

Excel (.xlsx)

Supported
  • Single sheet
  • Headers in row 1
  • Up to 2 GB depending on your plan

JSON

Supported
  • Array of objects
  • Camel or snake_case keys
  • UTF-8 encoding
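
If you want to confirm that a CSV export meets the requirements above before uploading it, a short local check is enough. The sketch below assumes the first row holds the headers and picks the separator by counting commas and semicolons in that row; `read_patent_csv` is a hypothetical helper, not part of Stratensight.

```python
import csv
import io


def read_patent_csv(text):
    """Parse CSV text whose separator is a comma or a semicolon."""
    header_line = text.splitlines()[0]
    # Pick whichever candidate separator occurs more often in the header row.
    delimiter = ";" if header_line.count(";") > header_line.count(",") else ","
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))
```

To use it on a file, read the contents with `open(path, encoding="utf-8", newline="").read()`; a decoding error at that step indicates the export is not UTF-8.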

Critical fields for signal quality

A missing field does not prevent analysis. Stratensight displays a contextual warning and indicates precisely which score is affected.

CRITICAL · SCORE DEGRADED IF MISSING

filing_date | Momentum Index™
assignee | Openness Score™
title / abstract | Clustering + relevance

IMPORTANT · GRADE REDUCED IF MISSING

publication_date · cpc_codes · jurisdiction

OPTIONAL · ENRICHMENT ONLY

citations · family_id · legal_status

How to prepare your export

Export your search results as CSV or Excel from your patent search tool. Include as many fields as available. No configuration required.

1. Run your search in your patent database
2. Export results as CSV or Excel
3. Upload the file to Stratensight
4. The Source Detection Engine maps and normalizes columns automatically
5. Analysis starts, with results in 2–5 minutes

No dataset? Use the Query Engine

Type a technology keyword and Stratensight retrieves patent data automatically from open sources.

Recommended volume

The number of patents in your dataset directly affects score reliability.

VOLUME | RESULT
< 50 patents | Analysis impossible
50 – 200 | Directional scores, Intelligence Grade™ reduced
200 – 500 | Reliable analysis for niche technologies
500 – 3,000 | Optimal zone, all scores fully calibrated
> 10,000 | Comprehensive analysis, longer processing time
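
The volume table can be expressed as a simple lookup, sketched below. Two points are assumptions rather than documented behavior: where ranges share a boundary (200, 500), the value is assigned to the upper tier, and sizes between 3,001 and 10,000 return `None` because the table does not describe them. The `volume_tier` name is hypothetical.

```python
def volume_tier(n_patents):
    """Map a dataset size to the result described in the volume table."""
    if n_patents < 50:
        return "Analysis impossible"
    if n_patents < 200:
        return "Directional scores"
    if n_patents < 500:
        return "Reliable for niche technologies"
    if n_patents <= 3000:
        return "Optimal zone"
    if n_patents > 10000:
        return "Comprehensive analysis, longer processing time"
    return None  # 3,001 to 10,000 is not specified in the table
```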

Recommended time window

The filing date range in your export affects Momentum Index™ accuracy. Below 6 years, the score may underestimate actual innovation velocity.

Recent technology (< 5 years of activity) | 6–8 years back
Growing technology | 10–12 years back
Mature technology | 12–15 years back

Best practice: Use the filing date (Application Date), not the publication date. Publication dates lag by 12–18 months on average, which compresses the innovation curve and underestimates Momentum Index™.

How your data quality affects Intelligence Grade™

Intelligence Grade™ gates the confidence of all other scores. Below 45%, scores are flagged LOW CONFIDENCE.

DATA QUALITY | INTELLIGENCE GRADE™ | IMPACT
Questel Orbit / PatentSight full export | 85–99% | Full analysis: all 4 scores and Decision Engine™
Partial fields (no citations) | 65–80% | Core scores only
Generic CSV | 50–70% | Basic analysis
< 100 patents | 40–60% | Low confidence, directional only

QUERY ENGINE — TIME WINDOW

Time Window Selection

When using /explore (Query Engine), Stratensight automatically selects the optimal patent time window based on the technology lifecycle stage detected by AI. You can always override it manually.

AUTO MODE (default)

Stratensight AI estimates the lifecycle stage and suggests the window automatically. The active button is highlighted with an AI badge.

More mature = shorter window (old patents = noise, not signal)

MANUAL OVERRIDE

Click any pill button (1y · 3y · 5y · 10y · all) to override the AI suggestion. The display shows the selected year range and the source (AI suggested / user selected).

Format: YYYY – YYYY (X years — AI suggested / user selected)

LIFECYCLE STAGE | SUGGESTED WINDOW | RATIONALE
research | all | Very few patents: use all available history
emerging | 10y | Growing signal: wide window to capture trajectory
acceleration | 7y | Fast growth: focus on recent surge
growth | 5y | Active competition: recent filings most relevant
mature | 3y | Old patents = noise: only recent matters
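
The interaction between the AI suggestion and a manual override can be sketched as follows. The stage-to-window values come from the table above; the `select_window` function, its parameters, and the choice to compute the start year as `current_year - years` are assumptions made for illustration.

```python
# Suggested window in years per lifecycle stage; None stands for "all".
SUGGESTED_WINDOW = {
    "research": None,
    "emerging": 10,
    "acceleration": 7,
    "growth": 5,
    "mature": 3,
}


def select_window(stage, current_year, override=None):
    """Return (start_year, end_year, source); start_year is None for 'all'.

    override is None for auto mode, an int number of years, or the string "all".
    """
    if override is None:
        years, source = SUGGESTED_WINDOW[stage], "AI suggested"
    else:
        years, source = (None if override == "all" else override), "user selected"
    start = None if years is None else current_year - years
    return start, current_year, source
```

For example, `select_window("mature", 2024)` yields a 2021 to 2024 range labeled "AI suggested", while passing `override=10` widens it and switches the label to "user selected".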

Regional Coverage

Stratensight retrieves patent data from EPO OPS and Google Patents. Coverage varies by region and filing route.

International filings (EP, US, PCT-visible) | Well covered
Chinese international filings | Partially visible
Chinese domestic-only activity (CNIPA-only) | Not directly covered
Japanese and Korean filings | Well covered via PCT + EPO

Interpretation note: Directional signal remains useful across all sectors. Absolute volume may underestimate China-heavy sectors where domestic-only filings represent a significant share of activity. For maximum China coverage, upload your own export from Derwent, PatSnap, or Questel with CNIPA data included.

The quality of the decision depends on the quality of the dataset.

Open data provides directional signals. Uploaded datasets provide higher-confidence analysis.

When to be cautious with your analysis

Not all analyses carry the same weight. These conditions reduce signal certainty.

Dataset < 200 patents | Conservative signals: scores are directional only
Temporal window < 7 years | CAGR unreliable: insufficient baseline for trend measurement
Mixed CPC classes | Possible off-topic patents: clusters may not be coherent
Open data source (EPO, Google Patents) | Directional signal only: upload a professional dataset for higher confidence
Momentum N/A | Verdict is conservative: filing velocity cannot be measured

What this analysis does NOT capture

Stratensight analyses patent filing patterns only. The following dimensions are outside the analytical scope.

Market size | No revenue, sales, or TAM estimation
Revenue & profitability | Patent signals do not correlate with financial performance
Technology adoption | Filing activity reflects R&D intent, not market penetration
Cost structure | No manufacturing, licensing, or deployment cost modeling
Regulation & policy | Regulatory approvals, trade barriers, and subsidies are not captured

GUIDE 1

Which source for which objective

Your objective determines the right data source. Quick exploration and strategic decisions require different levels of data quality and coverage.

OBJECTIVE | RECOMMENDED SOURCE | INTELLIGENCE GRADE™
Quick signal on a technology | Explorer: open source query | 50–70%
Reliable strategic decision | Upload: premium dataset (Derwent, PatSnap, Questel) | 85–99%
Global US / EP / PCT landscape | Explorer: sufficient coverage via EPO OPS | 60–75%
Asia / emerging market analysis | Upload: required, because open sources miss domestic CN/IN filings | 80–95%
Reproducible, auditable analysis | Upload: dated export for full traceability | 85–99%
Technology monitoring / watch | Explorer: real-time open data, repeat periodically | 55–70%
Competitive benchmarking | Upload: full assignee data with normalized names | 80–95%
Due diligence / M&A context | Upload: Tier 1 source with citation + family data | 90–99%

Rule of thumb: Explorer is ideal for fast directional signals on Western markets. For any decision with material consequences, upload a professional dataset to reach Intelligence Grade™ above 80%.

GUIDE 2

Understanding CPC classification

The Cooperative Patent Classification (CPC) system is a hierarchical taxonomy of 250,000+ technology codes used by the EPO and USPTO to classify every patent.

CPC sections (A–H)

A Human Necessities

Agriculture, food, health

B Operations & Transport

Separating, shaping, vehicles

C Chemistry & Metallurgy

Organic chemistry, alloys

D Textiles & Paper

Weaving, papermaking

E Fixed Constructions

Building, mining

F Mechanical Engineering

Engines, pumps, weapons

G Physics

Optics, computing, control

H Electricity

Electronics, semiconductors

How to read a CPC code

Each level adds specificity. A broader code captures more patents; a narrower code isolates a precise technology.

H | Section | H = Electricity | Broadest level, 8 sections total
H01 | Class | H01 = Basic electric elements | ~130 classes
H01L | Subclass | H01L = Semiconductor devices | ~640 subclasses
H01L 29/00 | Main group | H01L 29 = Semiconductor device structures | Thousands of groups
H01L 29/66 | Subgroup | H01L 29/66 = FET-specific structures | 250,000+ leaf codes
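
The hierarchy above can be recovered mechanically from a full CPC symbol. The following sketch is an assumption-laden illustration: the regular expression covers only sections A to H as listed in this guide, and `parse_cpc` is a hypothetical helper, not a Stratensight function.

```python
import re

# Section letter, two-digit class, subclass letter, main group, subgroup.
CPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def parse_cpc(code):
    """Split a full CPC symbol such as 'H01L 29/66' into its hierarchy levels."""
    match = CPC_PATTERN.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not a full CPC symbol: {code!r}")
    section, cls, subclass, group, subgroup = match.groups()
    prefix = f"{section}{cls}{subclass}"
    return {
        "section": section,
        "class": f"{section}{cls}",
        "subclass": prefix,
        "main_group": f"{prefix} {group}/00",
        "subgroup": f"{prefix} {group}/{subgroup}",
    }
```

Truncating a code to a higher level (for instance, searching on the `subclass` value instead of the `subgroup`) is how a query is broadened to capture more patents.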

CPC examples by technology domain

DOMAIN | KEY CPC CODES | DESCRIPTION
AI / Machine Learning | G06N | Computing arrangements based on specific computational models
CRISPR / Gene Editing | C12N 15/11 | DNA or RNA fragments; modified forms thereof
Solid-State Batteries | H01M 10/0562 | Solid electrolytes for secondary cells
Autonomous Vehicles | G05D 1/02 | Control of position or course in two dimensions
Quantum Computing | G06N 10/00 | Quantum computing; quantum information processing
mRNA Therapeutics | A61K 48/00 | Medicinal preparations containing genetic material
Carbon Capture | B01D 53/62 | Carbon dioxide removal from gas mixtures
5G / 6G Networks | H04W 72/04 | Wireless resource management in multi-carrier systems

Why Stratensight uses CPC

CPC codes are language-independent, hierarchical, and examiner-assigned. They eliminate keyword ambiguity and provide consistent technology mapping across all patent offices. In Explorer, check the “View query” section of Source Coverage to see exactly which CPC codes were used.

GUIDE 3

Preparing your dataset

Follow these recommendations for the best possible Intelligence Grade™. The more complete your export, the higher the analytical confidence.

Required columns

These fields are mandatory for Stratensight to generate a valid analysis.

COLUMN | PURPOSE | SCORE IMPACT
Title | Patent title text. Used for AI clustering and topic extraction. | Clustering quality
Abstract | Full abstract text. Enables semantic analysis and concept mapping. | +20–30% cluster accuracy
CPC Codes | Cooperative Patent Classification codes. Technology taxonomy backbone. | Lifecycle Position™

Optional columns — improve score accuracy

Including these fields significantly improves Intelligence Grade™ and unlocks advanced analytics.

COLUMN | PURPOSE | BOOST
Filing Date | Application date. Core input for temporal analysis and trend calculation. | Momentum Index™ precision
Assignee | Patent owner / applicant. Required for competitive landscape analysis. | Openness Score™
Priority Date | Earliest priority filing date. Improves lifecycle staging accuracy. | Lifecycle Position™
Forward Citations | Number of times cited by later patents. Strengthens momentum signal. | Stronger Momentum signal
Family ID | Patent family identifier. Enables deduplication across jurisdictions. | Reduces noise
Inventors | Inventor names. Enables inventor network analysis. | Network signals
IPC Codes | International Patent Classification. Secondary classification fallback. | Broader coverage

Compatible sources

Stratensight auto-detects export format from these platforms. No manual column mapping required.

SOURCE | TYPICAL GRADE | NOTES
Derwent Innovation | 90–99% | Full fields including citations and family data
PatSnap | 88–97% | Complete export with assignee normalization
Questel Orbit | 90–99% | FAMPAT family dedup, 100+ countries
PatentSight | 88–96% | Professional export, family dedup included
TotalPatent One | 85–95% | Standardized assignee + Family ID
Espacenet | 60–75% | Free, good starting point, limited citation data
Google Patents | 55–70% | Free, broad coverage, variable assignee quality

How to improve Intelligence Grade™

Include Abstract field (improves clustering quality by 20–30%)
Include CPC codes (enables accurate Lifecycle Position™)
Include Filing Date, not just Publication Date (Momentum precision)
Include Assignee/Applicant (required for Openness Score™)
Include forward citations if available (boosts Momentum signal)
Include Family ID if available (enables deduplication, reduces noise)

OPTIMAL SIZE

200–3,000 patents for best score calibration

RECOMMENDED WINDOW

8–12 years of filing history for reliable Momentum

GUIDE 4

Interpreting scores

Each score measures a distinct dimension of the technology landscape. Understanding what HIGH, MEDIUM, and LOW mean for each is essential for correct interpretation.

Momentum Index™

Measures the velocity and acceleration of patent filing activity over time. Derived from CAGR, year-over-year trends, and citation weighting.

HIGH 65–100
Strong, sustained filing growth. Technology attracting increasing R&D investment from multiple actors. Example: Quantum Computing ~93
MEDIUM 35–64
Moderate activity. Filing rate is stable or showing early growth signals. Worth monitoring. Example: Solid-State Batteries ~58
LOW 0–34
Declining or stagnant filing activity. Technology may be mature, niche, or losing momentum. Example: Legacy lithography ~12
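
For readers unfamiliar with CAGR, the sketch below shows the standard compound annual growth rate formula alongside the band thresholds listed above. How Stratensight combines CAGR, year-over-year trends, and citation weighting into the final index is not specified in this guide, so the code does not attempt it; both function names are hypothetical.

```python
def cagr(first_year_count, last_year_count, n_years):
    """Compound annual growth rate of filings over n_years (standard definition)."""
    if first_year_count <= 0 or n_years <= 0:
        raise ValueError("need a positive starting count and a positive span")
    return (last_year_count / first_year_count) ** (1 / n_years) - 1


def momentum_band(score):
    """Classify a 0-100 Momentum Index value using the bands above."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 65:
        return "HIGH"
    if score >= 35:
        return "MEDIUM"
    return "LOW"
```

Because the formula depends only on the first and last counts, an incomplete final year can pull CAGR down sharply, which is why a lagging last year of filings distorts the result.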

Lifecycle Position™

Identifies the maturity phase of the technology based on filing patterns, growth trajectory, and actor concentration. Determines strategic timing for market entry.

Research | Early academic exploration. Few filings, high diversity. Entry cost is low.
Emerging | First commercial interest. Growing volume, early consolidation. Prime window for first-movers.
Acceleration | Rapid expansion. New entrants flooding in. High competition, high opportunity.
Growth | Established ecosystem. Clear leaders, stable dynamics. Best for strategic positioning.
Mature | Saturated domain. Declining novel filings. Incremental innovation only.

Openness Score™

Measures how concentrated or fragmented the competitive landscape is. Based on the Herfindahl-Hirschman Index (HHI) of patent assignees, transformed to a 0–100 scale where higher = more open.

OPEN 80–100 | Highly fragmented market. Hundreds of actors, no dominant player. Low barriers to entry.
CONTESTED 55–79 | Active competition. Multiple significant players but no single dominant force.
CONCENTRATED 30–54 | Few dominant actors control most filings. Market entry requires differentiation or licensing.
DOMINATED 0–29 | One or two players hold the vast majority. High barriers, high IP risk for new entrants.
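
The HHI itself is a standard measure: the sum of squared shares. The sketch below computes it from a list of assignee names and then applies the bands above. The transform to a 0–100 scale is an assumption (`100 * (1 - HHI)`), chosen only because it satisfies "higher = more open"; the guide does not state the actual formula, and all function names are hypothetical.

```python
from collections import Counter


def hhi(assignees):
    """Herfindahl-Hirschman Index on a 0-1 scale: sum of squared filing shares."""
    counts = Counter(assignees)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no assignees given")
    return sum((n / total) ** 2 for n in counts.values())


def openness_score(assignees):
    """ASSUMED transform: 100 * (1 - HHI), so higher means more open."""
    return 100 * (1 - hhi(assignees))


def openness_band(score):
    """Classify a 0-100 openness value using the bands above."""
    if score >= 80:
        return "OPEN"
    if score >= 55:
        return "CONTESTED"
    if score >= 30:
        return "CONCENTRATED"
    return "DOMINATED"
```

Under this assumed transform, two assignees with equal shares give an HHI of 0.5 and a score of 50, which falls in the CONCENTRATED band.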

Intelligence Grade™

Meta-score that evaluates the quality and completeness of the underlying dataset. Gates the confidence of all other scores. Below 45%, all scores are flagged LOW CONFIDENCE.

HIGH 70–100% | All fields populated, sufficient volume, good temporal coverage. All scores reliable. Decision Engine™ verdict carries full confidence.
MEDIUM 45–69% | Some fields missing or limited temporal depth. Core scores are valid but edge cases may be imprecise. Review outlier scores manually.
LOW 0–44% | Significant data gaps. Scores are indicative only. Do not use for strategic decisions without supplementary data.

GUIDE 5

Limits to know

Transparency is a core principle. Understand these limitations before making decisions.

SOURCE | LIMITATION | IMPACT ON ANALYSIS | SEVERITY
EPO OPS | 2,000 patent cap per query | Large domains may be undersampled. Momentum and Openness affected. | Medium
EPO OPS | 3–6 month indexing delay | Very recent filings missing. Short-term momentum may be understated. | Low
EPO OPS | 18-month indexation delay (full) | CAGR on short windows may appear negative while the market is actually growing. A warning flag is displayed. | Medium
Stratensight | Technology maturity priors | For well-known technologies (e.g. Wind Energy, Li-Ion), lifecycle may be adjusted to the industry consensus. A transparency flag is always displayed when a prior is applied. | Low
Stratensight | Output Intelligence flags | Signals cases where results require caution (CAGR indexation, lifecycle adjusted, short window). Flags are non-blocking and shown as an additional intelligence panel. | Low
Stratensight | Signal Summary (plain language) | Deterministic 3-line summary generated from verdict × lifecycle × momentum. No AI involved. Never replaces full score analysis; shown as novice-layer guidance only. | Low
Google Patents | Assignee auto-normalization varies | Corporate group precision varies. Openness Score™ may be imprecise. | Medium
Google Patents | No citation data in export | Citation-weighted Momentum unavailable. Score relies on volume only. | Medium
Espacenet | Limited bulk export capabilities | Manual export caps at 500 results. Insufficient for broad domains. | Medium
All open sources | No domestic CN/IN/TR filings | Asia and emerging markets underrepresented. Upload required for coverage. | High
OpenAlex (fallback) | Academic publications, not patents | Scores reflect research activity, not commercial IP strategy. | High
Any source | < 50 patents in dataset | Statistical reliability insufficient. All scores flagged LOW CONFIDENCE. | Critical

When to trust the verdict

Intelligence Grade™ ≥ 70% (HIGH) + uploaded dataset from a Tier 1 source (Derwent, PatSnap, Questel). All four scores are reliable and the Decision Engine™ verdict carries full analytical weight.

When results are directional only

Intelligence Grade™ between 45% and 69%. Explorer-based analysis. Dataset under 200 patents. Scores indicate direction but not magnitude. Use as a starting point, not a final answer.

When to upload your own data

When analyzing Asia/emerging markets. When reproducibility matters. When Intelligence Grade™ on Explorer is below 65%. When the EPO cap warning appears. For any strategic decision with material consequences.

Warning signals in a report

Academic source fallback banner (orange). EPO cap warning. Intelligence Grade™ below 45%. Coverage gap alert showing missing actors. Any of these should prompt verification with a premium data source before making decisions.

Building a Reliable Dataset

Intelligence Grade™ Layer detects 12 analytical tensions. Follow this checklist to maximize your analysis quality.

Pre-analysis checklist

Volume minimum: 100+ patents recommended
Time coverage: 5+ years recommended
Filing dates: less than 20% missing
CPC scope: neither too broad nor too narrow
Source: indicate EPO / Derwent / PatSnap / other
Regional bias: check CN proportion if relevant
Duplicates: enable family deduplication
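
The three quantitative items in the checklist (volume, time coverage, missing filing dates) are easy to verify locally before uploading. The sketch below is one possible check; `pre_analysis_checks` is a hypothetical helper, and measuring time coverage as the difference between the latest and earliest filing year is an assumption.

```python
from datetime import date


def pre_analysis_checks(filing_dates):
    """filing_dates: one entry per patent, a datetime.date or None if missing."""
    known = [d for d in filing_dates if d is not None]
    total = len(filing_dates)
    span_years = (max(known).year - min(known).year) if known else 0
    missing_share = (total - len(known)) / total if total else 1.0
    return {
        "volume_ok": total >= 100,                # 100+ patents recommended
        "time_coverage_ok": span_years >= 5,      # 5+ years recommended
        "filing_dates_ok": missing_share < 0.20,  # less than 20% missing
    }


# Illustrative data: 120 dated patents spread over 2015-2022, plus 10 undated.
sample = [date(2015 + i % 8, 1, 1) for i in range(120)] + [None] * 10
print(pre_analysis_checks(sample))
```

The remaining checklist items (CPC scope, source, regional bias, deduplication) depend on judgment or on export settings and are not covered by this check.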

Very small dataset (fewer than 30 patents)

Why it matters: Scores are statistically unreliable. A single outlier can shift the entire analysis.

How to fix: Broaden your search query, add broader CPC classes, or upload a larger export from a Tier 1 source.

Limited dataset (30–99 patents)

Why it matters: Directional signal is valid but statistical confidence improves significantly above 100 patents.

How to fix: Add broader keywords to your search. Consider combining multiple CPC codes.

Dataset covers less than 3 years

Why it matters: CAGR and Lifecycle signals require temporal depth. Short windows produce unreliable trend signals.

How to fix: Filter your export to include filings from at least 5 years ago. The Query Engine suggests appropriate time windows automatically.

No historical baseline (all recent filings)

Why it matters: Without historical comparison, momentum direction is speculative. The system cannot distinguish acceleration from emergence.

How to fix: Extend your date range to include pre-2020 filings. Historical context is essential for lifecycle accuracy.

Over 20% of patents lack filing dates

Why it matters: Missing dates create blind spots in temporal analysis. Lifecycle and Momentum scores lose precision.

How to fix: Re-export your dataset with complete filing date coverage. Most premium sources include dates by default.

High CN activity with open-source data

Why it matters: EPO and Google Patents capture international PCT filings but may miss domestic CNIPA-only activity.

How to fix: For comprehensive CN coverage, use Derwent Innovation or PatSnap and upload the export directly.

Growth lifecycle with Momentum below 20

Why it matters: This combination is analytically unusual. Common causes: limited recent filing coverage or overly broad query scope.

How to fix: Verify your query captures recent filings (2020–present) and refine your CPC scope.

High CAGR on fewer than 50 patents

Why it matters: Small samples amplify statistical noise. One abnormal filing period can distort the growth rate.

How to fix: Increase dataset size above 100 patents before trusting CAGR direction.

Open market score but concentrated key players

Why it matters: Accessibility and competitive equality are different. Low barriers coexist with established dominance.

How to fix: Analyze individual clusters separately to identify genuinely open sub-domains.

Low Intelligence Grade with INVEST or AVOID verdict

Why it matters: A strong verdict on weak data is risky. The direction may be correct but the conviction is premature.

How to fix: Improve dataset completeness: add missing dates, increase volume, use a richer source.

Mature technology with strong momentum

Why it matters: Unusual but analytically interesting. May signal a second innovation cycle, regulatory trigger, or disruptive variant.

How to fix: Investigate sub-domains and recent cluster formation to identify the source of renewed activity.

One actor holds over 75% of patents

Why it matters: The analysis reflects one player's IP strategy, not the broader technology ecosystem.

How to fix: Broaden your query scope or exclude the dominant assignee to reveal ecosystem dynamics.
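
The last tension, a single actor holding over 75% of patents, can be spotted in your own export before analysis. The sketch below assumes each record is a dictionary with an `assignee` key; both helper names are hypothetical, and the code illustrates the check rather than Stratensight's implementation.

```python
from collections import Counter


def dominant_actor(assignees, threshold=0.75):
    """Return (name, share) if one assignee holds more than `threshold` of patents."""
    counts = Counter(assignees)
    if not counts:
        return None
    name, n = counts.most_common(1)[0]
    share = n / len(assignees)
    return (name, share) if share > threshold else None


def without_assignee(records, name):
    """Drop one assignee's patents to inspect the rest of the ecosystem."""
    return [r for r in records if r.get("assignee") != name]
```

Note that this check works on raw names, so unnormalized assignee spellings (subsidiaries, abbreviations) would understate the dominant share.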

What Stratensight does NOT do

Transparency is a core principle. Here is what Stratensight explicitly does not claim to do.

No prediction

Stratensight does not predict the future. Scores reflect current and historical patent signals, not market forecasts.

No guesswork

Every score is deterministic and computed from explicit formulas. There is no hidden model, no opaque weighting, no proprietary black box.

No hallucination

AI is used for text interpretation only (Claude Haiku). Scores are never generated by AI. If a score cannot be computed, it is absent — never fabricated.

No investment advice

Verdicts (INVEST, MONITOR, EXPLORE, AVOID) are analytical signals based on patent data. They are not financial recommendations.

No complete market coverage

Patent data reflects internationally visible filing activity. Domestic-only filings (e.g. CNIPA) may be underrepresented in open sources.

Understanding the limits of your signal

Stratensight produces directional signals, not absolute truth. Here's what to keep in mind:

Coverage

EPO and Google Patents capture internationally visible filings. Domestic-only activity in China, Japan, or Korea may be underrepresented.

Scope dependency

Your scores reflect the patents in your dataset. A narrow query produces a narrow signal.

Momentum reliability

Requires at least 3–5 years of filing history. Short datasets produce N/A — not zero.

Intelligence Grade™

Every analysis automatically flags these limitations. Watch for LIMITED or FRAGILE badges.

Confidence is a measure of analytical reliability, not market probability

Intelligence Grade™ reflects the reliability of the analysis based on dataset quality, coverage sufficiency, temporal depth, and signal consistency. It is not a probability of commercial success, market prediction, or AI certainty.

Chinese domestic patent activity may be underrepresented in open sources. The directional signal may remain useful, but coverage is not complete.

Stratensight provides patent intelligence signals, not legal opinions or freedom-to-operate assessments. Not a substitute for IP counsel.