
Entity Extractor

Use cases

  • Optimising content for semantic search and the Knowledge Graph
  • Identifying missing entities compared to competitors
  • Building comprehensive topic coverage
  • Understanding what entities Google associates with a topic

Extracts named entities with spaCy NLP, using a selectable English model (en_core_web_sm, md, or lg). Smaller models run faster; larger models are generally more accurate.

Recognises 11 entity types: PERSON, ORG, GPE, LOC, PRODUCT, EVENT, WORK_OF_ART, LAW, LANGUAGE, NORP, and FAC.
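As a rough illustration of how extraction with a selectable model and an entity-type filter might look, here is a minimal sketch. It is not the app's actual source: the names `MODELS`, `keep_selected`, and `extract_entities` are hypothetical, and it assumes spaCy and the chosen English model are already installed.

```python
# The 11 labels listed above; all are standard labels in spaCy's English pipelines.
ENTITY_TYPES = frozenset({
    "PERSON", "ORG", "GPE", "LOC", "PRODUCT", "EVENT",
    "WORK_OF_ART", "LAW", "LANGUAGE", "NORP", "FAC",
})

# Hypothetical mapping from a size choice to a spaCy pipeline name.
MODELS = {"sm": "en_core_web_sm", "md": "en_core_web_md", "lg": "en_core_web_lg"}


def keep_selected(pairs, wanted=ENTITY_TYPES):
    """Keep only the (text, label) pairs whose label is in `wanted`."""
    return [(text, label) for text, label in pairs if label in wanted]


def extract_entities(text, size="sm", wanted=ENTITY_TYPES):
    """Run spaCy NER on `text` and return the filtered (text, label) pairs."""
    import spacy  # imported lazily so keep_selected works without spaCy installed

    nlp = spacy.load(MODELS[size])  # the model must have been downloaded beforehand
    doc = nlp(text)
    return keep_selected(((ent.text, ent.label_) for ent in doc.ents), wanted)
```

Calling `extract_entities(text, size="lg", wanted={"PERSON", "ORG"})` would then return only people and organisations. Labels such as DATE or MONEY, which spaCy also emits, are dropped by the filter.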

Supports text input, HTML (auto-strips scripts/styles/nav), and batch CSV/Excel processing.
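The HTML noise removal can be approximated with Python's standard library alone. The sketch below is an assumption about the general approach, not the app's own parser (which may well use a different library); `VisibleTextParser` and `html_to_text` are hypothetical names.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text while skipping anything inside script, style, or nav."""

    SKIP = {"script", "style", "nav"}

    def __init__(self):
        super().__init__()
        self.depth = 0   # number of skipped elements we are currently inside
        self.parts = []  # text fragments kept so far

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Keep non-empty text only when we are outside every skipped element.
        if self.depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the text of `html` without script, style, or nav content."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)
```

The cleaned string can then be passed to the entity extractor in place of the raw markup, so menu labels and inline JavaScript are not mistaken for entities.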

Streamlit App

Platform: Browser-based (no installation required)

Input: Text, HTML, or CSV/Excel for batch

Output: Entities by type with frequency counts (CSV)


Features

  • spaCy NLP with 3 model sizes (en_core_web_sm/md/lg)
  • 11 entity types recognised (PERSON, ORG, GPE, LOC, PRODUCT, etc.)
  • HTML parsing with noise removal (scripts, styles, nav)
  • Batch processing via CSV/Excel upload
  • Text truncation limit (100k chars) for memory efficiency
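Batch input and the 100k-character limit could be combined roughly as follows. This is a sketch under stated assumptions: it covers CSV only (Excel would need a third-party reader), the column name `text` is hypothetical, and the app's own truncation logic may differ.

```python
import csv
import io

MAX_CHARS = 100_000  # truncation limit stated in the feature list


def truncate(text, limit=MAX_CHARS):
    """Cut `text` to at most `limit` characters."""
    return text[:limit]


def read_batch(csv_file, text_column="text"):
    """Yield the truncated text of each row in an open CSV file.

    `text_column` is a hypothetical column name; missing cells yield "".
    """
    for row in csv.DictReader(csv_file):
        yield truncate(row.get(text_column) or "")


# Usage with an in-memory file standing in for an uploaded CSV.
upload = io.StringIO("url,text\nhttps://example.com,Apple opened a store in Paris.\n")
documents = list(read_batch(upload))
```

Truncating before the text reaches the NLP pipeline keeps memory use bounded regardless of how long an individual row is.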

How to use

  1. Select a spaCy model size based on speed vs accuracy needs
  2. Paste text/HTML or upload CSV/Excel for batch processing
  3. Select entity types to extract (filter by PERSON, ORG, etc.)
  4. Run extraction and review entities grouped by type
  5. Download full results CSV or aggregated entity counts
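The final step, aggregating entity counts into a CSV, might be sketched like this. The column names `entity`, `type`, and `count` are assumptions made for illustration; the app's real export format is not documented here.

```python
import csv
import io
from collections import Counter


def count_entities(pairs):
    """Count how often each (entity text, entity type) pair occurs."""
    return Counter(pairs)


def counts_to_csv(counts):
    """Return CSV text with one row per entity, most frequent first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["entity", "type", "count"])  # hypothetical column names
    for (text, label), n in counts.most_common():
        writer.writerow([text, label, n])
    return buffer.getvalue()
```

Keying the counter on the (text, type) pair rather than the text alone keeps the same string counted separately when it is tagged with different types.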
