Internet Archive Analyser
Use cases
Queries the Wayback Machine CDX server to analyse how a site evolved over time.
Tracks folder structure changes, HTTP status codes, frequently modified pages, and robots.txt history with diff-based version comparison.
Includes Plotly visualisations and CSV export.
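As a rough illustration of the CDX query, the sketch below builds a request URL for the public Wayback Machine CDX endpoint and parses its JSON output. It is not the tool's own code. The helper names `build_cdx_url` and `parse_cdx_json` are hypothetical, and the choice of fields is an assumption. The network request itself is left out so the example runs offline.

```python
from urllib.parse import urlencode

# Public Wayback Machine CDX endpoint.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(domain, from_year=None, to_year=None, limit=None):
    """Build a CDX query URL covering captures under a domain (hypothetical helper)."""
    params = {
        "url": domain,
        "matchType": "domain",  # include subdomains and all paths
        "output": "json",       # JSON array of rows, header row first
        "fl": "timestamp,original,statuscode,mimetype",  # assumed field selection
    }
    if from_year is not None:
        params["from"] = str(from_year)
    if to_year is not None:
        params["to"] = str(to_year)
    if limit is not None:
        params["limit"] = str(limit)
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def parse_cdx_json(rows):
    """Convert CDX JSON output into a list of dicts keyed by the header row."""
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


# Offline sample shaped like a CDX JSON response (values are illustrative).
sample = [
    ["timestamp", "original", "statuscode", "mimetype"],
    ["20200101000000", "http://example.com/", "200", "text/html"],
]
captures = parse_cdx_json(sample)
url = build_cdx_url("example.com", from_year=2015, to_year=2020, limit=1000)
```

The resulting URL can be fetched with any HTTP client. Large domains return many rows, so a `limit` or a narrower date range is usually sensible.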
Platform
Browser-based (no installation required)
Input
Domain name
Analysis settings
Output
Historical analysis with charts and CSV export
Features
- Folder structure evolution tracking (annual breakdown)
- HTTP status code analysis (1xx-5xx groupings over time)
- Frequently changed pages identification
- robots.txt timeline with diff-based version comparison
- Stacked line/bar chart visualisations (Plotly)
- Configurable filters (All files, HTML only, HTML + Images)
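The annual folder breakdown and status code groupings could be computed along these lines. This is a minimal sketch, not the tool's implementation. It assumes each capture is a dict with the CDX fields `timestamp`, `original`, and `statuscode`, and the function names are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def top_folder(url):
    """Return the first path segment as a folder, or "/" for root-level URLs."""
    path = urlsplit(url).path
    segments = [s for s in path.split("/") if s]
    if not segments:
        return "/"
    if len(segments) == 1 and not path.endswith("/"):
        return "/"  # a file directly under the root, not a folder
    return f"/{segments[0]}/"


def status_class(code):
    """Map a status code string such as "404" to its class, e.g. "4xx"."""
    if len(code) == 3 and code.isdigit():
        return f"{code[0]}xx"
    return "unknown"  # CDX rows can carry "-" instead of a status code


def summarise(captures):
    """Count captures per (year, folder) and per (year, status class)."""
    folders = Counter()
    statuses = Counter()
    for capture in captures:
        year = capture["timestamp"][:4]  # timestamps start with YYYY
        folders[(year, top_folder(capture["original"]))] += 1
        statuses[(year, status_class(capture["statuscode"]))] += 1
    return folders, statuses


# Illustrative captures, not real archive data.
captures = [
    {"timestamp": "20190301120000", "original": "http://example.com/blog/post-1", "statuscode": "200"},
    {"timestamp": "20190401120000", "original": "http://example.com/blog/post-2", "statuscode": "404"},
    {"timestamp": "20200101120000", "original": "http://example.com/about.html", "statuscode": "200"},
    {"timestamp": "20200201120000", "original": "http://example.com/shop/", "statuscode": "-"},
]
folders, statuses = summarise(captures)
```

Counters keyed by year translate directly into the rows of a stacked bar or line chart, with one series per folder or status class.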
How to use
1. Enter the domain you want to analyse
2. Select visualisation type and top folders count
3. Choose file type filter
4. Run the query against the Wayback Machine CDX server
5. Explore tabs: Folder Structure, Status Codes, Changed Pages, robots.txt
6. Compare robots.txt versions with highlighted diffs
7. Export filtered URL list as CSV
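The robots.txt comparison step can be approximated with Python's standard `difflib` module. The sketch below assumes the two versions have already been retrieved as text. The function name `diff_robots` and the sample contents are hypothetical, and the tool's own diff rendering may differ.

```python
import difflib


def diff_robots(old_text, new_text, old_label="older", new_label="newer"):
    """Return unified-diff lines between two robots.txt versions."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_label,
            tofile=new_label,
            lineterm="",  # inputs have no trailing newlines, so emit none
        )
    )


# Illustrative robots.txt versions, not real archive data.
old_version = "User-agent: *\nDisallow: /admin/"
new_version = "User-agent: *\nDisallow: /admin/\nDisallow: /search/"

changes = diff_robots(old_version, new_version)
for line in changes:
    print(line)
```

Lines prefixed with `+` were added in the newer version and lines prefixed with `-` were removed. Identical inputs produce an empty list.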
Want me to run this for you?
I offer this as a managed service. You get the insights without touching the tool.
Related Tools
Website Migration Tool
Migration: Automate redirect mapping during migrations using semantic URL matching.
Google Trends Forecasting
Reporting: Forecast search trends using NeuralProphet ML.
BCG Matrix Generator
Reporting: Generate BCG matrix visualisations from GA landing page reports with cascading folder analysis.
Let's work together
Monthly retainers or one-off projects. No lengthy reports that sit in a drawer.