Google (sitemap.xml) and Yahoo! (urllist.txt) Sitemap Generator for Windows

GSiteCrawler is an easy-to-use, completely free sitemap generator for Windows. It will take a listing of your website's URLs, let you edit the settings and generate Google and Yahoo! Sitemap files. However, the GSiteCrawler is very flexible and allows you to do a whole lot more than "just" that!

Capture URLs for your site using

  • a normal website crawl - emulating a Googlebot, looking for all links and pages within your website
  • an import of an existing Google Sitemap file
  • an import of a server log file
  • an import of any text file with URLs in it
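
The plain-text import expects nothing more than one absolute URL per line - the same format Yahoo! reads as urllist.txt. A minimal example (the domain is just a placeholder):

    http://www.example.com/
    http://www.example.com/about.html
    http://www.example.com/products/widgets.html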

The Crawler

  • does a text-based crawl of each page, even finding URLs in JavaScript
  • respects your robots.txt file
  • respects robots meta tags for index / follow
  • can run up to 15 crawlers in parallel
  • can be throttled with a user-defined wait-time between URLs
  • can be controlled with filters, bans, automatic URL modifications
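
Because the crawler honors the same rules as the search engine bots, you can steer it with an ordinary robots.txt file. A small sketch (the paths are placeholders):

    User-agent: *
    Disallow: /admin/
    Disallow: /cgi-bin/

On individual pages, the index / follow handling looks at the standard robots meta tag, for example:

    <meta name="robots" content="noindex, nofollow">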

With each page, it

  • checks the date (from the server or from a date meta-tag) and the size of the page
  • checks title, description and keyword tags
  • keeps track of the time required to download and crawl the page
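
In other words, the crawler reads the usual elements in each page's head section. A hypothetical example with the tags mentioned above (the exact name of the date meta-tag the program looks for is an assumption here):

    <head>
      <title>Widgets - Example Site</title>
      <meta name="description" content="An overview of our widget range.">
      <meta name="keywords" content="widgets, examples, catalogue">
      <meta name="date" content="2006-03-15">
    </head>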

Once the pages are in the database, you can

  • modify Google Sitemap settings like "priority" and "change frequency"
  • search for pages by URL parts, title, description or keyword tags
  • filter pages based on custom criteria - adjust their settings globally
  • edit, add and delete pages manually
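
The "priority" and "change frequency" settings correspond to the optional tags of the Sitemap protocol, so the values you can assign are the ones the protocol defines:

    priority:          0.0 to 1.0 (the protocol's default is 0.5)
    change frequency:  always, hourly, daily, weekly, monthly, yearly or never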

And once you have everything the way you want it, you can export it as

  • a Google Sitemap file in XML format (sitemap.xml), with or without the optional attributes like "change date", "priority" or "change frequency"
  • a text URL listing for other programs (or for use as a Yahoo! URL list, urllist.txt)
  • a simple RSS feed
  • Excel / CSV files with URLs, settings and attributes like title, description, keywords
  • a ROR (Resources of a Resource) XML file
  • a static HTML sitemap file (with relative or absolute paths)
  • a new robots.txt file based on your chosen filters
  • ... or almost any type of file you want - the export function uses a user-adjustable text-based template-system
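
For reference, a single entry in the exported sitemap.xml follows the Sitemap XML schema; with all optional attributes switched on it looks roughly like this (the URL and values are placeholders):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://www.example.com/products/widgets.html</loc>
        <lastmod>2006-03-15</lastmod>
        <changefreq>weekly</changefreq>
        <priority>0.8</priority>
      </url>
    </urlset>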

To give you more information about your site, it also generates

  • a general site overview with the number of URLs (total, crawlable, still in queue), the oldest URLs, etc.
  • a listing of all broken URLs linked in your site (or URLs that were otherwise inaccessible during the crawl)
  • an overview of your site's speed, with the largest pages, the slowest pages by total download time or by download speed (unusually server-intensive pages), and the pages with the most processing time (many links)
  • an overview of URLs leading to "duplicate content" - with the option of automatically disabling those pages for the Google Sitemap file