
SERP N-gram Extractor

Use cases

  • Content gap analysis
  • Page title optimisation
  • Understanding SERP content patterns
  • Competitive content research

Fetches SERP results via the ValueSERP API and extracts page content using Trafilatura (unlimited timeout).

Generates bigrams via custom find_ngrams() using zip iteration.

Filters out NLTK English stopwords and uses collections.Counter for frequency analysis.

Normalises text with special character removal and lowercase conversion.
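The processing steps above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual source: the small `STOPWORDS` set is a stand-in for NLTK's English stopword list, the `normalise()` and `count_bigrams()` helpers are hypothetical names, and filtering stopwords before building bigrams is an assumption about the order of operations. Only the zip-based `find_ngrams()` idiom is taken from the description.

```python
import re
from collections import Counter

# Stand-in stopword list (assumption); the tool itself uses NLTK's English stopwords.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "for", "is", "on", "with"}


def normalise(text):
    """Lowercase the text and replace anything that is not a-z, 0-9, or whitespace."""
    return re.sub(r"[^a-z0-9\s]", " ", text.lower())


def find_ngrams(input_list, n):
    """Build n-grams by zipping n progressively shifted copies of the token list."""
    return list(zip(*[input_list[i:] for i in range(n)]))


def count_bigrams(text):
    """Normalise, drop stopwords, then count bigram frequencies (hypothetical helper)."""
    tokens = [t for t in normalise(text).split() if t not in STOPWORDS]
    return Counter(" ".join(pair) for pair in find_ngrams(tokens, 2))


sample = "Content gap analysis: the best content gap tools for SEO content gap research."
counts = count_bigrams(sample)
print(counts.most_common(1))  # [('content gap', 3)]
```

With real pages, the extracted text from every result would be concatenated before counting, so that frequencies reflect the SERP as a whole rather than a single page.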

Platform

Python script (requires Python 3.x)

Input

ValueSERP API key

Target search keyword

Geographic location

Device type (desktop/mobile)
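As a rough sketch of how these inputs might map onto SERP requests, the snippet below builds one request URL per results page. The endpoint and the parameter names (`api_key`, `q`, `location`, `device`, `page`) are assumptions based on ValueSERP's public search API and should be checked against its documentation; `build_serp_urls()` is a hypothetical helper, and no network call is made here.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the ValueSERP documentation.
VALUESERP_ENDPOINT = "https://api.valueserp.com/search"


def build_serp_urls(api_key, keyword, location, device="desktop", pages=1):
    """Return one request URL per results page (hypothetical helper)."""
    if device not in ("desktop", "mobile"):
        raise ValueError("device must be 'desktop' or 'mobile'")
    if not 1 <= pages <= 10:
        raise ValueError("pages must be between 1 and 10")
    urls = []
    for page in range(1, pages + 1):
        params = {
            "api_key": api_key,
            "q": keyword,
            "location": location,
            "device": device,
            "page": page,  # assumed pagination parameter
        }
        urls.append(f"{VALUESERP_ENDPOINT}?{urlencode(params)}")
    return urls


urls = build_serp_urls(
    "YOUR_API_KEY", "content gap analysis", "United Kingdom", device="mobile", pages=2
)
for url in urls:
    print(url)
```

Each URL could then be fetched with any HTTP client, and the result links passed on to Trafilatura for content extraction.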

Output

Three CSVs: content bigrams with frequency counts, title keywords (frequency > 1), SERP titles with URLs.
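A minimal sketch of how the three outputs could be produced with the standard library is shown below. The sample data, column headers, and the `write_rows()` helper are hypothetical; the only detail taken from the description is that title keywords are kept when their frequency is greater than 1. In-memory buffers are used here so the example runs without touching the file system.

```python
import csv
import io
from collections import Counter


def write_rows(handle, header, rows):
    """Write a header row followed by data rows to an open text handle."""
    writer = csv.writer(handle)
    writer.writerow(header)
    writer.writerows(rows)


# Hypothetical sample data standing in for real SERP results.
serp_results = [
    ("Content Gap Analysis Guide", "https://example.com/guide"),
    ("How to Run a Content Gap Analysis", "https://example.org/how-to"),
]
bigram_counts = Counter({"content gap": 5, "gap analysis": 4})

# Count words across all titles and keep only those seen more than once.
title_words = Counter(
    word for title, _url in serp_results for word in title.lower().split()
)
repeated = [(word, n) for word, n in title_words.most_common() if n > 1]

bigrams_csv, keywords_csv, titles_csv = io.StringIO(), io.StringIO(), io.StringIO()
write_rows(bigrams_csv, ["bigram", "count"], bigram_counts.most_common())
write_rows(keywords_csv, ["keyword", "count"], repeated)
write_rows(titles_csv, ["title", "url"], serp_results)

print(keywords_csv.getvalue())
```

To write real files, each buffer can be swapped for `open(path, "w", newline="")`, which is the form the `csv` module recommends for file output.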


Features

  • ValueSERP API integration for SERP fetching
  • Trafilatura content extraction (unlimited timeout)
  • Custom bigram generation via zip(*[input_list[i:] for i in range(n)])
  • NLTK English stopwords filtering
  • Counter frequency analysis across combined content
  • Location, device type, and results page selection (1-10)

How to use

  1. Enter your ValueSERP API key
  2. Input target keyword and select location
  3. Choose device type and number of results pages (1-10)
  4. Click Submit to fetch SERPs and extract content
  5. Trafilatura extracts text with unlimited timeout
  6. Review bigrams, title keywords, and extracted titles
  7. Download three CSV files for content planning
