Named Entity Recognition (NER)

Named Entity Recognition identifies and classifies named entities in text — people, organizations, locations, dates, monetary values, and more. It converts unstructured text into structured facts that systems can act on.

What Counts as a Named Entity?

"Apple released the iPhone 17 in San Francisco on September 9, 2025,
 with CEO Tim Cook presenting to 3,000 attendees."

Entities:
  Apple         → ORG  (organization)
  iPhone 17     → PRODUCT
  San Francisco → GPE  (geopolitical entity / city)
  September 9, 2025 → DATE
  Tim Cook      → PERSON
  3,000         → CARDINAL
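
One way to picture the "structured facts" is as records that carry a label plus character offsets into the original text. The sketch below is illustrative only: the `Entity` dataclass and the `locate` helper are hypothetical names, and the spans are found by plain string search on the example sentence, not by a real NER model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str
    label: str
    start: int  # character offset where the span begins
    end: int    # character offset one past the last character


def locate(sentence, annotations):
    """Attach character offsets to (text, label) pairs via first-match string search."""
    entities = []
    for surface, label in annotations:
        start = sentence.index(surface)  # raises ValueError if the span is absent
        entities.append(Entity(surface, label, start, start + len(surface)))
    return entities


sentence = (
    "Apple released the iPhone 17 in San Francisco on September 9, 2025, "
    "with CEO Tim Cook presenting to 3,000 attendees."
)
annotations = [
    ("Apple", "ORG"),
    ("iPhone 17", "PRODUCT"),
    ("San Francisco", "GPE"),
    ("September 9, 2025", "DATE"),
    ("Tim Cook", "PERSON"),
    ("3,000", "CARDINAL"),
]
entities = locate(sentence, annotations)
for ent in entities:
    print(f"{ent.start:>3}-{ent.end:<3} {ent.label:<9} {ent.text}")
```

Keeping offsets alongside the label lets downstream code highlight or link the exact span instead of searching for the string again.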

Standard Entity Types

Label	Description	Example
PERSON	People and fictional characters	Marie Curie, Sherlock Holmes
ORG	Companies, agencies, institutions	Google, WHO, NASA
GPE	Countries, cities, states	France, Tokyo, California
LOC	Non-GPE locations, landmarks	Mount Everest, the Amazon
DATE	Absolute and relative dates	June 2025, last Tuesday
TIME	Times of day	3:00 PM, midnight
MONEY	Monetary values	$4.5 billion, €200
PERCENT	Percentages	12.5%, three percent
PRODUCT	Named products	Model 3, iPhone 17
EVENT	Named events	World Cup 2026, the French Revolution

NER with spaCy

import spacy

nlp = spacy.load("en_core_web_sm")
text = """
In Q1 2025, Microsoft acquired Inflection AI for $650 million.
The deal was announced by CEO Satya Nadella at their Redmond headquarters.
"""
doc = nlp(text)

for ent in doc.ents:
    print(f"{ent.text:<30} {ent.label_:<12} {spacy.explain(ent.label_)}")

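# Typical output (exact entities depend on the model version; the small model
# may also tag "Q1 2025" as a DATE):
# Microsoft                      ORG          Companies, agencies, institutions, etc.
# Inflection AI                  ORG          Companies, agencies, institutions, etc.
# $650 million                   MONEY        Monetary values, including unit
# Satya Nadella                  PERSON       People, including fictional
# Redmond                        GPE          Countries, cities, states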

Visualize in a notebook:

from spacy import displacy
displacy.render(doc, style="ent", jupyter=True)

NER with Hugging Face Transformers

Transformer-based NER models generally score higher on standard benchmarks than lightweight pipelines such as spaCy's en_core_web_sm, at the cost of heavier compute:

from transformers import pipeline

ner = pipeline("ner", model="dslim/bert-base-NER", aggregation_strategy="simple")

text = "Elon Musk's SpaceX launched Starship from Boca Chica, Texas in May 2025."
results = ner(text)

for entity in results:
    print(f"{entity['word']:<20} {entity['entity_group']:<8} score: {entity['score']:.3f}")

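# Example output (scores are illustrative). This model only knows the four
# CoNLL-2003 types (PER, ORG, LOC, MISC), so "May 2025" is not tagged.
# Elon Musk            PER      score: 0.999
# SpaceX               ORG      score: 0.998
# Starship             MISC     score: 0.941
# Boca Chica           LOC      score: 0.997
# Texas                LOC      score: 0.998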
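
In practice the pipeline output is usually post-processed before use. The following is a minimal sketch, assuming records shaped like the dictionaries returned with aggregation_strategy="simple" (the real records also carry `start` and `end` offsets, omitted here). The `group_entities` helper and the 0.95 threshold are illustrative choices, not part of the library.

```python
from collections import defaultdict

# Sample records mirroring the example output; no model is loaded here.
results = [
    {"entity_group": "PER", "word": "Elon Musk", "score": 0.999},
    {"entity_group": "ORG", "word": "SpaceX", "score": 0.998},
    {"entity_group": "MISC", "word": "Starship", "score": 0.941},
    {"entity_group": "LOC", "word": "Boca Chica", "score": 0.997},
    {"entity_group": "LOC", "word": "Texas", "score": 0.998},
]


def group_entities(records, min_score=0.95):
    """Group entity strings by label, dropping predictions below min_score."""
    grouped = defaultdict(list)
    for record in records:
        if record["score"] >= min_score:
            grouped[record["entity_group"]].append(record["word"])
    return dict(grouped)


grouped = group_entities(results)
print(grouped)
# {'PER': ['Elon Musk'], 'ORG': ['SpaceX'], 'LOC': ['Boca Chica', 'Texas']}
```

A score threshold trades recall for precision, so the right value depends on how costly a wrong entity is in the downstream system.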

Fine-Tuning NER for a Custom Domain

Pre-trained models often miss domain-specific entities (drug names, legal clauses, custom product codes). Fine-tuning on an annotated in-domain dataset, even a small one, usually closes much of that gap:

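from transformers import (
    AutoModelForTokenClassification,
    AutoTokenizer,
    DataCollatorForTokenClassification,
    Trainer,
    TrainingArguments,
)

# 1. Prepare annotated data in CoNLL or IOB2 format.
#    label_list below is only an example; replace it with your own entity labels.
label_list = ["O", "B-DRUG", "I-DRUG", "B-DOSAGE", "I-DOSAGE"]

# 2. Load a base model with a fresh token-classification head
model_name = "bert-base-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(
    model_name,
    num_labels=len(label_list),
    id2label=dict(enumerate(label_list)),
    label2id={label: i for i, label in enumerate(label_list)},
)

# 3. Fine-tune
training_args = TrainingArguments(
    output_dir="./ner-model",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    eval_strategy="epoch",  # named evaluation_strategy in older transformers releases
    save_strategy="epoch",
    load_best_model_at_end=True,
)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,  # placeholder: your tokenized, label-aligned training split
    eval_dataset=eval_dataset,    # placeholder: your tokenized, label-aligned validation split
    data_collator=DataCollatorForTokenClassification(tokenizer),  # pads inputs and labels per batch
)
trainer.train()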
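
One step the outline above leaves implicit is label alignment: annotations are per word, but a subword tokenizer may split one word into several tokens. A common convention is to keep the label on the first subword and mask the rest with -100, which the PyTorch cross-entropy loss ignores by default. The sketch below assumes `word_ids` has the shape returned by `BatchEncoding.word_ids()` on a fast tokenizer; the `align_labels` helper and the sample ids are hypothetical.

```python
IGNORE_INDEX = -100  # label value skipped by the PyTorch cross-entropy loss by default


def align_labels(word_ids, word_labels):
    """Map word-level label ids onto subword tokens.

    word_ids: one entry per token; None for special tokens, otherwise the
              index of the word the token came from.
    word_labels: one label id per word.
    """
    aligned = []
    previous = None
    for word_id in word_ids:
        if word_id is None:
            aligned.append(IGNORE_INDEX)          # special tokens such as [CLS] or [SEP]
        elif word_id != previous:
            aligned.append(word_labels[word_id])  # first subword keeps the word's label
        else:
            aligned.append(IGNORE_INDEX)          # later subwords are masked out
        previous = word_id
    return aligned


# Hypothetical example: three words, the second split into two subwords.
word_ids = [None, 0, 1, 1, 2, None]
word_labels = [1, 0, 3]
aligned = align_labels(word_ids, word_labels)
print(aligned)  # [-100, 1, 0, -100, 3, -100]
```

Masking later subwords keeps the loss and the evaluation at the word level, which matches how the annotations were made.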

IOB2 Annotation Format

NER training data is commonly annotated with the IOB2 (Inside-Outside-Beginning) tagging scheme, in which every entity starts with a B- tag:

Token       Tag
──────────────────
Apple       B-ORG   ← Beginning of an ORG entity
Inc         I-ORG   ← Inside the same entity
released    O       ← Outside any entity
iPhone      B-PRODUCT
17          I-PRODUCT
in          O
California  B-GPE
.           O
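
To get entities back out of a tagged sequence, the tags can be decoded in a single pass. This is a minimal sketch, assuming parallel lists of tokens and tags; `iob2_to_spans` is a hypothetical helper, and it simply drops an I- tag that does not continue an open entity of the same type.

```python
def iob2_to_spans(tokens, tags):
    """Collect (entity_text, label) pairs from parallel token and IOB2 tag lists."""
    spans = []
    current = None  # (label, [tokens]) for the entity currently being built

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current is not None:
                spans.append(current)          # close the previous entity
            current = (tag[2:], [token])       # open a new one
        elif tag.startswith("I-") and current is not None and current[0] == tag[2:]:
            current[1].append(token)           # extend the open entity
        else:
            if current is not None:
                spans.append(current)          # "O" or a stray I- tag ends the entity
            current = None

    if current is not None:
        spans.append(current)                  # entity that runs to the end of the input
    return [(" ".join(words), label) for label, words in spans]


tokens = ["Apple", "Inc", "released", "iPhone", "17", "in", "California", "."]
tags = ["B-ORG", "I-ORG", "O", "B-PRODUCT", "I-PRODUCT", "O", "B-GPE", "O"]
spans = iob2_to_spans(tokens, tags)
print(spans)
# [('Apple Inc', 'ORG'), ('iPhone 17', 'PRODUCT'), ('California', 'GPE')]
```

Because B- always marks a new entity, two adjacent entities of the same type (B-ORG followed by B-ORG) stay separate, which is the main reason IOB2 is preferred over the older IOB1 variant.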

NER Applications in 2025

Automated news analysis — extract who did what to whom from thousands of articles per hour.

RAG pipeline enrichment — tag documents with entities before indexing so retrieval can filter by person, organization, or location.

Medical NLP — identify drug names, dosages, symptoms, and conditions from clinical notes (using specialized models like BioBERT or clinical spaCy models).

Contract analysis — extract party names, effective dates, and monetary terms from legal documents.

Resume parsing — identify skills, companies, job titles, and education entities for recruitment pipelines.
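
As an illustration of the RAG enrichment idea, entities extracted at indexing time can feed an inverted index that narrows the candidate documents before any vector search runs. The sketch below is a simplified, in-memory stand-in for whatever metadata filter a real vector store provides; `EntityIndex`, its methods, and the document ids are hypothetical, and the entities are hard-coded where a real pipeline would take them from an NER model.

```python
from collections import defaultdict


class EntityIndex:
    """Inverted index from (label, normalized entity text) to document ids."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, doc_id, entities):
        """Register the (text, label) entities found in one document."""
        for text, label in entities:
            self._postings[(label, text.casefold())].add(doc_id)

    def filter(self, label, text):
        """Return the ids of documents that mention the given entity."""
        return set(self._postings.get((label, text.casefold()), set()))


index = EntityIndex()
index.add("doc-1", [("Microsoft", "ORG"), ("Satya Nadella", "PERSON")])
index.add("doc-2", [("SpaceX", "ORG"), ("Texas", "GPE")])
index.add("doc-3", [("Microsoft", "ORG"), ("Redmond", "GPE")])

candidates = index.filter("ORG", "microsoft")
print(sorted(candidates))  # ['doc-1', 'doc-3']
```

Case-folding is the only normalization applied here; a production system would usually also resolve aliases such as "MSFT" or "Microsoft Corp." to one canonical entry.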

Current Benchmarks (2025)

Approximate published F1 scores on the CoNLL-2003 English test set (exact numbers vary with training setup and random seed):

BERT-large fine-tuned: ~92.8 F1
DeBERTa-v3-large: ~93.5 F1
Flair with stacked embeddings: ~93.2 F1
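
These scores are entity-level F1 values scaled to 0-100: a prediction counts as correct only if both its boundaries and its type match a gold entity exactly. A minimal sketch of that computation follows, assuming entities are given as (start, end, label) tuples; the `entity_f1` helper and the sample spans are hypothetical, and a corpus-level score would pool spans from all sentences (for example by adding a sentence id to each tuple).

```python
def entity_f1(gold, predicted):
    """Exact-match precision, recall, and F1 over sets of (start, end, label) spans."""
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)  # spans matching in boundaries and label
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1


# Hypothetical token-offset spans for one sentence.
gold = {(0, 2, "ORG"), (3, 5, "PRODUCT"), (6, 7, "GPE")}
predicted = {(0, 2, "ORG"), (3, 4, "PRODUCT"), (6, 7, "GPE"), (7, 8, "GPE")}

precision, recall, f1 = entity_f1(gold, predicted)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
# P=0.500 R=0.667 F1=0.571
```

Note that the partially correct PRODUCT span earns no credit, which is why entity-level F1 is stricter than token-level accuracy.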

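For production use cases that only need the four CoNLL-2003 types (PER, ORG, LOC, MISC), the dslim/bert-base-NER model on Hugging Face Hub is a solid out-of-the-box baseline. It does not tag dates, monetary values, or the other types in the table above, so those still require a model trained on a richer label set or a fine-tuned one.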