Google Sitemaps

 

 
         
 

Google Sitemaps - Shared Servers / Hosts

 
 

Instructions for Google's Sitemaps on a Shared Host

Shared hosting presents some problems for those who want to use Sitemaps for submission to Google.

The primary problem is the Python script that generates the sitemap. Allowing users to upload the script to a shared server and run it under their own control is a significant security risk. Dedicated servers and co-located servers are not a problem because the user is the owner and sole occupant of the server.

So, how does one generate the sitemap on a shared server?

 
   

The sitemap can easily be generated on the web designer's workstation. It can also be generated on the shared host, provided the ISP (us) allows you to run your own Python scripts (we don't; it is too great a security risk).

The short sitemap explanation:

 
     

Install Python on the workstation.
Follow Google's directions for creating a sitemap on the server, but perform those steps on the workstation. The only change is that once the map is generated, you upload the sitemap to the hosted site.

 
   

The longer sitemap explanation:

 
     

First, one needs to install Python. See: http://www.python.org/download/

Python.org provides the files, along with download and installation instructions. Install and test per their directions.

 

Second, one needs the Google Sitemap Generator code.
See: http://sourceforge.net/projects/goog-sitemapgen/
Windows users get the ZIP file. Mac users... well, pick the file type you are most familiar with.

Also, for more information see: http://code.google.com/projects.html There are other interesting projects there too.

Once you have the ZIP file, unzip it and find the files:
    example_config.xml
    sitemap_gen.py

Copy those to the web site's root folder on the local workstation.

You need to check the file sitemap_gen.py for a coding problem that may have been fixed by the time you read this. If it has not been corrected in your version, you need to fix it to generate correct sitemaps. You can open the file with Dreamweaver if you add '.py' files to its list of editable file types, or just use Notepad or any other editor. Python also installs an editor, so you may be able to right-click the file and open it there. The '.py' file type is plain text, so any simple text editor will work.

Search for the word 'replace'. You should find a line with the text: middle.replace(os.sep, '/')

The wrong code is just this: middle.replace(os.sep, '/')

The correct code is: middle = middle.replace(os.sep, '/')

Change the code if necessary and save the file. Reuse the corrected file for any other sites you map.
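Why the missing assignment matters: Python strings are immutable, so replace returns a new string and leaves the original untouched. The following is a minimal sketch of the difference, not an excerpt from sitemap_gen.py. The sample path is hypothetical, and a hard-coded backslash stands in for os.sep so the result is the same on any platform.

```python
# Python strings are immutable: str.replace() returns a NEW string.
# SEP stands in for os.sep on Windows (an assumption for this demo).
SEP = "\\"

def broken(middle):
    middle.replace(SEP, "/")            # result is thrown away
    return middle

def fixed(middle):
    middle = middle.replace(SEP, "/")   # result is kept
    return middle

sample = "images\\logos\\logo.gif"      # hypothetical relative path
print(broken(sample))   # images\logos\logo.gif
print(fixed(sample))    # images/logos/logo.gif
```

With the broken line, backslashes from Windows paths would be carried into the URLs, which is why the sitemap comes out wrong.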

Now the example_config.xml file needs to be edited. Save the file as a new file named config.xml.

In the config.xml file set up the site name and specify where the sitemap file is to be saved. This is done near line 30.

<site
base_url="http://www.cates-assoicates.net/"
store_into="sitemap.xml"
verbose="1"
>

The site name is the name Google will use to find the site. See Google's instructions for other details.

Google's instructions describe several methods for building the sitemap. We suggest you comment out all methods except the directory nodes method for this use.

<!-- ** MODIFY or DELETE **
"directory" nodes tell the script to walk the file system
and include all files and directories in the Sitemap.

Required attributes:
path - path to begin walking from
url - URL equivalent of that path

Optional attributes:
default_file - name of the index or default file for directory URLs
-->
<directory path="F:\Inet\site\www\" url="http://www.someDomain.com/" />

The path needs to be the literal path to the folder containing the site you want to map, that is, the folder that holds your site's root files. These are the files served at URLs such as http://www.someDomain.com/pagename.htm.

The URL, of course, needs to be the site address, http://www.someDomain.com/.
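To see how the path and url attributes relate, here is a minimal sketch of a directory walk that turns file paths into URLs. It is an illustration only, not the code inside sitemap_gen.py. The function name walk_to_urls is made up for the example, and it lists files only, whereas the generator also reports directories.

```python
import os

def walk_to_urls(path, url):
    """Map every file under `path` to a URL under `url`.

    Illustration only; not the code sitemap_gen.py uses.
    """
    urls = []
    for folder, _subdirs, files in os.walk(path):
        # Path of this folder relative to the starting folder.
        rel = os.path.relpath(folder, path)
        for name in files:
            middle = name if rel == "." else os.path.join(rel, name)
            # URLs always use forward slashes, whatever the OS uses.
            middle = middle.replace(os.sep, "/")
            urls.append(url + middle)
    return sorted(urls)
```

Called as walk_to_urls(r"F:\Inet\site\www", "http://www.someDomain.com/"), a file at F:\Inet\site\www\pagename.htm would come back as http://www.someDomain.com/pagename.htm. If the path points at the wrong folder, every URL in the sitemap will be wrong in the same way.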

Next, within the config.xml file, you need to create filters. The filter section is toward the end of the file. Filters prevent the program from mapping files that should not be submitted to Google. We suggest Dreamweaver users add at least these lines:

<filter action="drop" type="wildcard" pattern="*/_mmServerScripts/*" />
<filter action="drop" type="wildcard" pattern="*/_mm/*" />
<filter action="drop" type="wildcard" pattern="*/_notes/*" />
<filter action="drop" type="wildcard" pattern="*/Connections/*" />
<filter action="drop" type="wildcard" pattern="*/Library/*" />
<filter action="drop" type="wildcard" pattern="*/Templates/*" />
<filter action="drop" type="wildcard" pattern="*/*.LCK" />
<filter action="drop" type="wildcard" pattern="*/*.mno" />
<filter action="drop" type="wildcard" pattern="*/TMP*.asp" />

We also recommend you add these sitemap filters:

<filter action="drop" type="wildcard" pattern="*/*.css" />
<filter action="drop" type="wildcard" pattern="*/*.eot" />
<filter action="drop" type="wildcard" pattern="*/*.js" />
<filter action="drop" type="wildcard" pattern="*/*.ico" />
<filter action="drop" type="wildcard" pattern="*/*.txt" />
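If you want to try a pattern before running the generator, the sketch below checks a few URLs against some of the drop patterns listed above. It assumes the wildcard type behaves like shell-style matching in Python's fnmatch module, where * also matches slashes; check Google's instructions for the exact rules. The URLs are hypothetical, and this sketch matches case-sensitively.

```python
from fnmatch import fnmatchcase

# A few of the drop patterns from the lists above.
DROP_PATTERNS = ["*/_notes/*", "*/Templates/*", "*/*.LCK", "*/*.css", "*/*.js"]

def is_dropped(url):
    """Return True if any drop pattern matches the URL (case-sensitive)."""
    return any(fnmatchcase(url, pattern) for pattern in DROP_PATTERNS)

# Hypothetical URLs for illustration.
for url in [
    "http://www.someDomain.com/index.htm",
    "http://www.someDomain.com/_notes/index.htm.mno",
    "http://www.someDomain.com/styles/site.css",
]:
    print(url, "dropped" if is_dropped(url) else "kept")
```

The first URL is kept and the other two are dropped. The test run of the script remains the real check, since only the generator's output shows what Google will be sent.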

Once your filters are done, save the file, run the script in Test Mode, and then check the content of the output file sitemap.xml. You should see directories, html, asp, aspx, and all the other files you want indexed. You may want to exclude images generated by Fireworks™. If so, we recommend you place them in a separate folder and filter the folder.
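Reading a long sitemap.xml by eye is tedious, so a short script can list just the URLs it contains. This is a minimal sketch using Python's standard xml.etree module. The sample XML and its namespace are assumptions for illustration; the function ignores namespaces, so it should not matter which one your generator version writes.

```python
import xml.etree.ElementTree as ET

def list_locations(xml_text):
    """Return the text of every <loc> element, ignoring XML namespaces."""
    root = ET.fromstring(xml_text)
    locations = []
    for element in root.iter():
        # Namespaced tags look like "{namespace}loc"; keep the local name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "loc" and element.text:
            locations.append(element.text.strip())
    return locations

# Hypothetical sample; a real check would read sitemap.xml from disk.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.someDomain.com/</loc></url>
  <url><loc>http://www.someDomain.com/pagename.htm</loc></url>
</urlset>"""

for location in list_locations(SAMPLE):
    print(location)
```

To check a real file, read it in binary mode and pass the bytes to the same function, for example list_locations(open("sitemap.xml", "rb").read()). Any URL in the output that should not be indexed needs another filter.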

To run the script we often use a batch file with this code:

echo off
echo test ONLY mode
"C:\Program Files\Python24\python" sitemap_gen.py --config=config.xml --testing

The path to the Python executable ("C:\Program Files\Python24\python") may need to change to match the conditions on your workstation. The --testing option puts the process into test mode.

We also use a more complex script that reads whether or not to use test mode from the command line.

Test Mode Usage: runIndex.bat --testing
Omit '--testing' to run live. Live mode notifies Google that a new version of the sitemap has been generated. Be sure to upload the new file ASAP.

echo off
if "%1" == "--testing" echo Running in TEST MODE
if "%1" == "" echo LIVE MODE - Use runIndex.bat --testing
"C:\Program Files\Python24\python" sitemap_gen.py --config=config.xml %1

Run the batch file or Python command from the site's root folder on the workstation, which contains the config.xml and sitemap_gen.py files. This should generate the sitemap.xml file. Inspect it and adjust your filters if needed. Once you have the file generated as you want, upload it to the site.
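For workstations where a batch file is not an option (a Mac, for example), the same launcher can be sketched in Python. This is a hedged example, not part of Google's package: the function names are made up, and it assumes the Python interpreter running the launcher is also able to run sitemap_gen.py.

```python
import subprocess
import sys

def build_command(testing):
    """Build the command line for sitemap_gen.py, as the batch file does."""
    command = [sys.executable, "sitemap_gen.py", "--config=config.xml"]
    if testing:
        command.append("--testing")
    return command

def run(testing):
    """Run the generator from the current folder and return its exit code."""
    print("Running in TEST MODE" if testing else "LIVE MODE")
    return subprocess.call(build_command(testing))

# Show the command that a test-mode run would execute.
print(" ".join(build_command(True)))
```

Call run(True) for test mode or run(False) for a live run, from the folder that holds sitemap_gen.py and config.xml.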

Be sure to visit Google Sitemaps and register/submit your sitemap.

 
         
 

Remember

 
     

We support neither sitemaps nor the Python code, and we make no promises about what Python may or may not do to your computer. You are on your own.

If you find mistakes in our page, we will be glad to correct them and would appreciate your help.

 

 

©2005 Copyright Cates-Associates - Web Design by Dolphin Ad Design
ver 1.0