kōdōkalabs

The Semantic Shift: Google No Longer Reads Words, It Reads Concepts.

For years, the SEO playbook was built on the flawed premise of Keyword Density. Marketers would meticulously count the number of times a target phrase (“best CRM software”) appeared on a page, believing that repetition was the key to relevance.

This was the paradigm of the String-Matching Era.

Today, that approach is not just ineffective; it is actively detrimental. Google’s algorithms, powered by BERT, MUM, and advanced Vector Search, have moved far beyond simple string matching. The new reality is that the search engine is now a sophisticated reasoning engine that processes your content not as words, but as Concepts and Entities.

The critical metric you must optimize for is no longer Keyword Density; it is Entity Salience.

Entity Salience is the measure of how central a specific entity (a person, place, organization, or abstract concept) is to the overall meaning of a document, as determined by a Large Language Model (LLM).

If your business is serious about ranking in AI Overviews (SGE) and competing in complex B2B niches, you must shift your focus from simply using the right words to establishing the right conceptual authority.

This comprehensive guide will explain the fundamental engineering shift within Google, show you how to use Python and API tools to audit your competitors’ entity graphs, and provide an actionable strategy for maximizing your content’s salience.

Part 1: Entity Salience
To understand Entity Salience, we must first understand the technology underpinning modern search engines: Vector Embeddings.

The Old Way: TF-IDF and String Matching

In the older SEO model, relevance was calculated using algorithms like TF-IDF (Term Frequency-Inverse Document Frequency). This mathematical model determined a document’s importance based on how frequently a term appeared within it (Term Frequency) relative to how rare that term was across the entire web (Inverse Document Frequency).

The Problem: TF-IDF fails to understand nuance. A document about “Apple pie recipes” and a document about “Apple Inc. stock performance” would score similarly for the keyword “Apple” despite being semantically worlds apart.
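The ambiguity is easy to reproduce. The sketch below uses a textbook TF-IDF formula (term frequency times the natural log of inverse document frequency) on a three-document toy corpus; the corpus and the `tf_idf` helper are illustrative assumptions, not how any production search engine computes relevance.

```python
import math
from collections import Counter


def tf_idf(term, doc_tokens, corpus):
    """Textbook TF-IDF for one term in one tokenized document."""
    tf = Counter(doc_tokens)[term] / len(doc_tokens)
    docs_with_term = sum(1 for doc in corpus if term in doc)
    idf = math.log(len(corpus) / docs_with_term) if docs_with_term else 0.0
    return tf * idf


# Toy corpus (made up for illustration): two unrelated "apple" documents
# plus one document that never mentions the term.
corpus = [
    "apple pie recipe with fresh apple slices".split(),
    "apple stock price as apple earnings rise".split(),
    "hiking boot review for rocky trails today".split(),
]

pie_score = tf_idf("apple", corpus[0], corpus)
stock_score = tf_idf("apple", corpus[1], corpus)
print(pie_score == stock_score)  # the two documents are indistinguishable
```

Both documents use “apple” twice in seven tokens, so the formula has no way to tell the dessert from the company.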

The New Way: Embeddings and the Vector Space

The Generative AI models that power modern search have solved this problem by creating a Vector Space.

  1. Tokenization & Embedding: The LLM takes a word, phrase, or concept and converts it into a high-dimensional array of numbers—a vector.
  2. Semantic Proximity: This vector is placed in a giant mathematical space. Concepts that are similar are placed close together.
    • The vector for “Chief Financial Officer” will be close to the vector for “Corporate Finance” and “SaaS Unit Economics.”
    • The vector for “Hiking Boot” will be close to the vector for “Gore-Tex” and “Trailhead.”
  3. The Query: When a user types a query, Google converts the intent of that query into a vector.
  4. Retrieval: The engine no longer searches for matching keywords; it searches for documents whose vectors are mathematically closest to the query vector.

The Crux of GEO (Generative Engine Optimization): If your content’s vector sits closer to the expert cluster of concepts, you win. Raising Entity Salience is how you pull your content’s vector closer to the core of the expert concept.
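The retrieval step described above can be sketched with cosine similarity, a common way to measure how close two vectors are. The three-dimensional vectors below are hand-written for illustration; real embeddings have hundreds or thousands of dimensions and come from a model, and the document names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical document embeddings (made-up numbers, not model output).
documents = {
    "cfo_guide": [0.9, 0.1, 0.0],
    "saas_unit_economics": [0.6, 0.6, 0.1],
    "hiking_boot_review": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of a finance-related query.
query_vector = [0.85, 0.2, 0.05]

# Rank documents by closeness to the query, nearest first.
ranked = sorted(
    documents,
    key=lambda name: cosine_similarity(query_vector, documents[name]),
    reverse=True,
)
print(ranked)
```

No keyword is compared anywhere in this ranking; the finance documents win purely because their vectors point in nearly the same direction as the query.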

Part 2: The Difference Between Keywords and Entities

This is the most fundamental strategic confusion in modern SEO.

| Feature | Keywords (Old SEO) | Entities (New GEO) |
|---|---|---|
| Definition | A specific string of text. | A verified, real-world concept (a noun) that has a unique ID in a Knowledge Graph. |
| Focus | Frequency of occurrence. | Conceptual weight and relationship to other concepts. |
| LLM Behavior | Easily ignored or dismissed as fluff. | Used as anchors to build the generated answer. (They are the structural pillars.) |
| Optimization Goal | “Use the target keyword 3 times.” | “Mention all 10 key related entities.” |
| Example | CRM | Salesforce (Organization), HubSpot (Organization), Customer Relationship Management (Concept), ARR (Metric). |

Nouns Matter More: Every entity is a noun. When you use the right nouns and discuss them with the depth an expert would, you signal high Entity Salience. If you only write filler adjectives and verbs, the model cannot map your content to the knowledge graph.

The "Topical Authority" Definition

Topical Authority is simply high Entity Salience across an entire cluster of related entities. You achieve topical authority on “Cloud Computing” by having high salience for entities like “AWS,” “Serverless,” “Containerization,” “Kubernetes,” and “Latency.”
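One rough way to put a number on this definition is to check how many entities in a cluster reach a minimum salience on at least one of your pages. The sketch below is an assumption-heavy illustration: the `cluster_coverage` helper, the 0.05 threshold, the URLs, and the salience values are all made up, and the per-page entity lists are assumed to have the `name` and `salience` fields that an entity-analysis tool returns.

```python
def cluster_coverage(page_entities, cluster, min_salience=0.05):
    """Return (share of cluster entities covered, sorted list of missing ones).

    An entity counts as covered when at least one page reaches
    min_salience for it. The threshold is an arbitrary assumption.
    """
    best = {}
    for entities in page_entities.values():
        for entity in entities:
            name = entity["name"]
            best[name] = max(best.get(name, 0.0), entity["salience"])
    covered = [name for name in cluster if best.get(name, 0.0) >= min_salience]
    missing = sorted(set(cluster) - set(covered))
    return len(covered) / len(cluster), missing


# Hypothetical site data (salience values invented for illustration).
page_entities = {
    "/aws-guide": [
        {"name": "AWS", "salience": 0.40},
        {"name": "Latency", "salience": 0.06},
    ],
    "/k8s-intro": [
        {"name": "Kubernetes", "salience": 0.50},
        {"name": "Containerization", "salience": 0.20},
        {"name": "Serverless", "salience": 0.01},
    ],
}
cluster = ["AWS", "Serverless", "Containerization", "Kubernetes", "Latency"]

coverage, missing = cluster_coverage(page_entities, cluster)
print(coverage, missing)
```

Here “Serverless” is mentioned but only in passing, so it shows up as the gap in an otherwise well-covered “Cloud Computing” cluster.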

Part 3: Technical Entity Extraction: Auditing the Competition

We primarily leverage two approaches for entity extraction:

  • The Google Natural Language API (Preferred): This is Google’s own entity analysis service, which makes it the closest publicly available proxy for how Google models entities. For each entity it detects, it returns a salience score between 0 and 1.
  • Open Source NLP Libraries (spaCy/NLTK): Excellent for basic entity extraction (Named Entity Recognition – NER), but they do not provide the Knowledge Graph IDs or the Salience Score of the Google API.

Python Walkthrough: Extracting Entities via API

We will demonstrate the core logic using pseudocode based on the Google Cloud Natural Language API (a foundational tool for serious GEO work).

The goal is to analyze a competitor’s winning article and generate a prioritized list of entities we must cover.

# Conceptual Python script for entity extraction

import requests
from time import sleep  # Used to pause between retries

# -- Configuration (replace with your actual key) --
API_KEY = "YOUR_GOOGLE_CLOUD_API_KEY"
ENDPOINT = "https://language.googleapis.com/v1/documents:analyzeEntities"

# Entity types that hold values rather than concepts; dropped from the audit
VALUE_TYPES = {"NUMBER", "DATE", "PRICE", "PHONE_NUMBER", "ADDRESS"}


def analyze_document(text_content, max_retries=3):
    """
    Submits text to the Google NLP API to get entities and salience scores.
    """
    payload = {
        "document": {
            "content": text_content,
            "type": "PLAIN_TEXT",
            "language": "en",
        },
        "encodingType": "UTF8",
    }

    # API call with simple exponential backoff (waits 1s, then 2s, between attempts)
    for attempt in range(max_retries):
        try:
            response = requests.post(
                ENDPOINT,
                params={"key": API_KEY},
                json=payload,  # requests serializes the body and sets Content-Type
                timeout=30,
            )
            response.raise_for_status()  # Raise an exception for bad status codes
            return response.json().get("entities", [])
        except requests.RequestException as e:
            print(f"API request failed (attempt {attempt + 1}/{max_retries}): {e}")
            if attempt < max_retries - 1:
                sleep(2 ** attempt)
    return []


def extract_and_prioritize_entities(article_text):
    """
    Analyzes text and returns a list of entities sorted by salience score.
    """
    entities = analyze_document(article_text)

    # Keep concept-bearing types (PERSON, ORGANIZATION, LOCATION, EVENT, OTHER, etc.).
    # OTHER stays in: abstract concepts such as business metrics are usually typed OTHER.
    relevant_entities = [
        e for e in entities if e.get("type") not in VALUE_TYPES
    ]

    # Sort entities by the 'salience' score (a float between 0 and 1)
    sorted_entities = sorted(
        relevant_entities,
        key=lambda x: x.get("salience", 0.0),
        reverse=True,
    )

    # Print the top 10 entities and their scores
    print("\n--- Top 10 Entities by Salience Score ---")
    for rank, entity in enumerate(sorted_entities[:10], start=1):
        print(
            f"Rank {rank}: {entity['name']} (Type: {entity['type']}) "
            f"- Salience: {entity.get('salience', 0.0):.4f}"
        )

    return sorted_entities


# --- Example Usage ---
competitor_article = """
The shift to vector-based search, leveraging technologies like the Pinecone vector database,
has fundamentally changed how enterprise SEO is conducted. We analyzed the new BERT models,
which interpret content embeddings to move beyond simple keyword density. Our kōdōkalabs research
shows that the average B2B SaaS company must focus on entities like "Annual Recurring Revenue"
and "Customer Lifetime Value" rather than generic phrases. This change is driven by the
necessity of Information Gain to rank in Google's SGE environment.
"""

# Run the analysis (requires a valid API key)
# extract_and_prioritize_entities(competitor_article)
# Note: You would typically fetch the article content from a URL before running this.
# This script is provided for illustrative purposes of the core logic.

Part 4: The Actionable Tip: Increasing Entity Density in Old Content

The easiest, highest-ROI win in GEO is not writing new content, but surgically upgrading your existing, authoritative assets. You are not increasing word count; you are increasing information density.

Step 1: Identify Your Target Assets

Focus on pages that meet these criteria:

  • Striking Distance: Pages ranking between position #7 and #20 on Google.
  • High Authority: Pages with a good number of backlinks or high internal link equity.
  • High Intent: Pages targeting commercial or high-value informational queries.
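The three criteria above translate into a simple filter. In this sketch the page records, the field names, and the backlink threshold of 10 are hypothetical; in practice the data would come from your rank tracker and backlink tool, and “a good number of backlinks” depends on your niche.

```python
def is_upgrade_candidate(page, min_backlinks=10):
    """Check a page against the three criteria: position, authority, intent."""
    in_striking_distance = 7 <= page["position"] <= 20
    has_authority = page["backlinks"] >= min_backlinks  # threshold is an assumption
    high_intent = page["intent"] in {"commercial", "informational"}
    return in_striking_distance and has_authority and high_intent


# Hypothetical page inventory (values invented for illustration).
pages = [
    {"url": "/crm-comparison", "position": 9, "backlinks": 42, "intent": "commercial"},
    {"url": "/company-news", "position": 12, "backlinks": 3, "intent": "navigational"},
    {"url": "/vector-search-guide", "position": 3, "backlinks": 80, "intent": "informational"},
]

candidates = [page["url"] for page in pages if is_upgrade_candidate(page)]
print(candidates)
```

The guide already ranking at position 3 is excluded on purpose: it is outside the striking-distance band, so an entity upgrade has less room to move it.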

Step 2: The "Justification" Audit

Take the Entity Gap list (from Part 3) and review your old content. For every missing high-salience entity, ask: “Do I have the right to discuss this?”

  • If the entity is a concept (e.g., “Customer Lifetime Value”), you need to add a dedicated H3 section defining and discussing it.
  • If the entity is a person or organization (e.g., “The Founder of kōdōkalabs”), you need to reference them as the authority source.
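The Entity Gap list this step relies on can be sketched as a difference between two entity lists: what the competitor’s page covers and what yours covers. The `entity_gap` helper, the `min_salience` cutoff, and the sample values below are assumptions for illustration; each list is assumed to hold dictionaries with `name` and `salience` fields, as returned by an entity-analysis run on each page.

```python
def entity_gap(competitor_entities, own_entities, min_salience=0.01):
    """Entities the competitor covers that your page does not, highest salience first."""
    covered = {entity["name"].lower() for entity in own_entities}
    missing = [
        entity
        for entity in competitor_entities
        if entity["name"].lower() not in covered
        and entity["salience"] >= min_salience  # skip passing mentions
    ]
    return sorted(missing, key=lambda entity: entity["salience"], reverse=True)


# Hypothetical analysis results (salience values invented for illustration).
competitor = [
    {"name": "Customer Lifetime Value", "salience": 0.21},
    {"name": "Annual Recurring Revenue", "salience": 0.34},
    {"name": "Pinecone", "salience": 0.08},
]
own = [{"name": "pinecone", "salience": 0.15}]

gap = entity_gap(competitor, own)
print([entity["name"] for entity in gap])
```

Matching on lowercased names is a deliberate simplification; matching on Knowledge Graph IDs, where the tool provides them, would be more robust against spelling variants.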

Step 3: Strategic Entity Injection

Don’t just mention the entity once; integrate it contextually.

| Feature | Old SEO (Retrieval Era) | New GEO (Synthesis Era) |
|---|---|---|
| Primary Goal | Ranking #1 in organic links. | Winning the citation in the AI Snapshot. |
| Keyword Strategy | Keyword Density & Exact Match. | Entity Salience & Semantic Closeness. |
| Content Length | “The Ultimate Guide” (5,000 words of fluff). | High Information Density (Concise, data-rich). |
| Ranking Factor | Backlink Quantity (Votes). | Information Gain (Unique Value). |
| Structure | Long intros (“In this article we will…”). | BLUF (Bottom Line Up Front) / Direct Answers. |
| Target Audience | The Human Reader only. | The LLM (as the primary reader) & The Human. |
| Metric of Success | Organic Clicks / Sessions. | Brand Impressions / Share of Model (SoM). |

By replacing vague phrases with specific, named proper nouns, you give the LLM precise anchors for the knowledge graph. This is the essence of high E-E-A-T.

Step 4: Internal Linking via Entity Connection

Once you inject a new entity, use it to drive a contextual internal link.

  • Example: You add an H3 on “Pinecone Vector Databases” to your generic “Vector Search Guide.” Link the word Pinecone to the specific deep-dive article you wrote on that tool.

This process solidifies your topical cluster, showing the LLM that you are not just a page—you are a network of related expertise.
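A minimal sketch of this linking step, assuming the content is available as a plain string and the link is written as an HTML anchor. The `link_first_mention` helper and the `/guides/pinecone` URL are hypothetical, and a real CMS would need to skip text that is already inside a link or a heading.

```python
import re


def link_first_mention(text, entity, url):
    """Wrap only the first whole-word mention of an entity in an HTML link."""
    pattern = re.compile(r"\b" + re.escape(entity) + r"\b", re.IGNORECASE)
    return pattern.sub(
        lambda match: f'<a href="{url}">{match.group(0)}</a>',
        text,
        count=1,  # link once; repeated links to the same target add little
    )


paragraph = "Pinecone stores vectors. Pinecone scales well."
linked = link_first_mention(paragraph, "Pinecone", "/guides/pinecone")
print(linked)
```

Linking only the first mention keeps the paragraph readable while still connecting the entity to its deep-dive page.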

Part 5: The E-E-A-T Connection: Experience is an Entity

Google’s emphasis on E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness) perfectly aligns with the Vector Search model.

  • Experience (The “E”): This is often expressed via specific, proprietary entities.
    • Bad: “We know SEO is hard.”
    • High Salience: “After running 5,000 requests through the Google Search Console API this quarter, we observed a pattern in Data Segmentation.” (The specific API, the specific number—these are verifiable entities.)
  • Authority & Trust (The “A” and “T”): This is established by citing high-salience entities like organizations, official reports, and verified authors. The LLM is more likely to trust your content if it sees high-salience entities associated with research and governance.

Your Author Bio is a GEO Signal: Ensure every author has a robust, published bio that links their name (an entity) to their specific professional credentials (another set of entities, e.g., “Former CTO,” “PhD in NLP”).

Conclusion: Stop Counting Words, Start Mapping Concepts

The pivot from Keyword Density to Entity Salience is more than just a tactical change; it is a shift in mindset. It means moving from the language of marketing to the language of information science.

Keywords are for humans who are guessing. Entities are for machines that are reasoning.

By embracing the vector-based model, using Python tools for surgical competitor analysis, and intentionally increasing the density of authoritative entities in your content, you are engineering a structural advantage. You are making your content impossible for an LLM to ignore.

In the AI Era, your ability to articulate and connect concepts—your Entity Salience—is the most powerful ranking factor you have.

Ready to move from keyword guessing to entity engineering?
