Keyword Deduplication Tool

Use cases

Cleaning keyword lists before clustering

Removing near-duplicate keywords from exports

Consolidating keyword research from multiple sources

Uses rapidfuzz with the token_sort_ratio scorer and process.extractOne() to identify near-duplicate keywords at a similarity threshold of 99 (on a 0-100 scale).

Automatically detects file encoding via chardet (first 100,000 bytes).
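A minimal sketch of that detection step, assuming the uploaded file is available as raw bytes. The function name detect_encoding and the UTF-8 fallback are assumptions made for this example, not confirmed details of the tool.

```python
import chardet


def detect_encoding(raw: bytes, sample_size: int = 100_000) -> str:
    """Guess the encoding from the first `sample_size` bytes of a file."""
    result = chardet.detect(raw[:sample_size])
    # chardet may report no encoding (e.g. for empty input);
    # falling back to UTF-8 is an assumption of this sketch.
    return result["encoding"] or "utf-8"


raw = b"keyword,volume\nrunning shoes,900\n"
encoding = detect_encoding(raw)
text = raw.decode(encoding)
```

Sampling only the first 100,000 bytes keeps detection fast on large exports, at the cost of possibly missing unusual characters that appear later in the file.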

Keeps the variant with the highest search volume and exports both the processed and the dropped keywords for review.

Uses stqdm for progress tracking.

Streamlit App

Platform

Browser-based (no installation required)

Input

CSV or Excel file with keywords and search volumes

Encoding auto-detected via chardet

Output

Excel with deduplicated and dropped keywords

Features

I offer this as a managed service. You get the insights without touching the tool.

Tag keywords using substring matching against up to 7 classification columns.

Two-level eBay related search scraping with ECharts tree visualisations.

Assess keyword difficulty using allintitle, phrase match, and SERP clustering.

Monthly retainers or one-off projects. No lengthy reports that sit in a drawer.