
Sitemap URL Extractor

Use cases

  • SEO audits - get complete URL inventory
  • Site migrations - document all URLs before changes
  • Competitor analysis - understand site structure
  • Content inventory - catalog all pages
  • Indexation monitoring - compare sitemap vs indexed

Dual-strategy XML parsing: ElementTree with namespace handling, regex fallback (<loc>.*?</loc>).
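
A minimal sketch of what such dual-strategy parsing could look like, not the app's actual source. The function name `extract_locs` and the choice to read only each entry's direct `<loc>` child (so nested image or video `<loc>` elements are skipped) are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET
from html import unescape

# Fallback pattern from the description above; DOTALL lets a value span lines.
LOC_PATTERN = re.compile(r"<loc>(.*?)</loc>", re.DOTALL)


def local_name(tag: str) -> str:
    """Strip a '{namespace}' prefix, e.g. '{http://...}loc' -> 'loc'."""
    return tag.rsplit("}", 1)[-1]


def extract_locs(xml_text: str) -> list[str]:
    """Return <loc> values, preferring ElementTree and falling back to regex."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Malformed XML: recover what we can with the regex.
        return [unescape(m.strip()) for m in LOC_PATTERN.findall(xml_text)]

    locs = []
    for entry in root:          # <url> or <sitemap> entries
        for child in entry:     # direct children only
            if local_name(child.tag) == "loc" and child.text:
                locs.append(child.text.strip())
    return locs
```

Comparing local tag names avoids hard-coding the sitemap namespace URI, which some sites declare incorrectly or omit.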

Recursive queue-based processing of sitemap indexes.
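
The queue-based traversal might be sketched as follows. This is an illustration under assumptions: `crawl_sitemaps` and the injected `fetch` callable are hypothetical names, and the simple regex stands in for a full parser. Only the `processed_sitemaps` set name comes from the description on this page.

```python
import re
from collections import deque
from typing import Callable

LOC = re.compile(r"<loc>\s*(.*?)\s*</loc>", re.DOTALL)


def crawl_sitemaps(start_url: str, fetch: Callable[[str], str]) -> list[tuple[str, str]]:
    """Walk a sitemap (or sitemap index) and return (page_url, source_sitemap) rows.

    `fetch` is any function that takes a sitemap URL and returns its XML text.
    """
    queue = deque([start_url])
    processed_sitemaps: set[str] = set()
    rows: list[tuple[str, str]] = []

    while queue:
        sitemap_url = queue.popleft()
        if sitemap_url in processed_sitemaps:
            continue  # already handled; guards against cycles and repeats
        processed_sitemaps.add(sitemap_url)

        text = fetch(sitemap_url)
        locs = LOC.findall(text)
        if "<sitemapindex" in text:
            queue.extend(locs)  # child sitemaps are processed later
        else:
            rows.extend((loc, sitemap_url) for loc in locs)
    return rows
```

An explicit queue gives the effect of recursion without deep call stacks, and recording the source sitemap per row is one way to carry sitemap metadata into the output.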

Gzip decompression for .xml.gz files.
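
One way to handle compressed sitemaps, shown as a sketch rather than the app's code. The helper name `decode_sitemap_bytes` is hypothetical, and checking the gzip magic bytes instead of the file extension is an assumption. It covers servers that deliver a `.xml.gz` URL already decompressed.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def decode_sitemap_bytes(raw: bytes) -> str:
    """Decompress gzip payloads when present, then decode to text."""
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    # Sitemaps are required to be UTF-8; replace stray bytes rather than fail.
    return raw.decode("utf-8", errors="replace")
```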

Configurable request delay (0-5 seconds slider).
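
A sketch of how a delay and a custom user agent could be applied between requests, assuming the standard library's `urllib`. The function names and the injectable `open_fn` / `sleep_fn` parameters are hypothetical and exist only to keep the example testable. In a Streamlit app the delay value would typically come from a slider widget.

```python
import time
import urllib.request


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Attach the custom User-Agent header to a request."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_with_delay(urls, delay_seconds, user_agent,
                     open_fn=None, sleep_fn=time.sleep):
    """Fetch each URL in order, pausing `delay_seconds` between requests."""
    if not 0 <= delay_seconds <= 5:
        raise ValueError("delay_seconds must be between 0 and 5")
    open_fn = open_fn or urllib.request.urlopen

    bodies = {}
    for index, url in enumerate(urls):
        if index > 0 and delay_seconds > 0:
            sleep_fn(delay_seconds)  # no pause before the first request
        with open_fn(build_request(url, user_agent)) as response:
            bodies[url] = response.read()
    return bodies
```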

Duplicate prevention via processed_sitemaps set tracking.

Streamlit App

Platform

Browser-based (no installation required)

Input

Sitemap URL (index or individual .xml/.xml.gz)

Custom user agent string

Request delay: 0-5 seconds

Output

URL list with sitemap metadata (CSV, TXT, or Excel)


Features

  • Recursive sitemap index processing via queue
  • Dual parsing: ElementTree + regex fallback
  • Gzip decompression (.xml.gz support)
  • Duplicate sitemap prevention
  • Configurable delay slider (0-5 seconds)

How to use

  1. Enter sitemap URL in the sidebar
  2. Set custom user agent if needed
  3. Configure request delay (recommended for large sites)
  4. Review extracted URLs with metadata
  5. Download as CSV, TXT, or Excel (timestamped filenames)
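
For the download step, a timestamped CSV export could be produced along these lines. This is a sketch: the column names (`url`, `source_sitemap`), the filename pattern, and both function names are assumptions, since the page does not specify them.

```python
import csv
import io
from datetime import datetime


def timestamped_filename(prefix: str, extension: str, now=None) -> str:
    """Build a name like 'sitemap_urls_20240102_030405.csv'."""
    now = now or datetime.now()
    return f"{prefix}_{now.strftime('%Y%m%d_%H%M%S')}.{extension}"


def rows_to_csv(rows) -> str:
    """Serialize (url, source_sitemap) pairs to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["url", "source_sitemap"])
    writer.writerows(rows)
    return buffer.getvalue()
```

The resulting string and filename can then be handed to whatever download mechanism the front end provides.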
