September 11 2008
Debunking Michael Gray’s ineffective Google inclusion tracking method
Michael Gray thinks he has figured out how to tell which parts of your Web site are not being crawled. He says to put a date stamp on your pages. Combined with something unique to your site (he suggests the site name), you’ll create a unique string to search for.
Then he suggests you wait two months.
Um…TWO MONTHS? PUH-leeze!
But let’s start with something else Michael says: “When Google took away the supplemental index last year ….”
WRONG! WRONG! WRONG!
Google did not “take away” the Supplemental Index last year. The Google Supplemental Index still exists and there ARE still ways to look at it through search results. But searching for date stamps on pages is NOT one of the ways to look at the Supplemental Index.
Why?
Your Supplemental page and my Supplemental page may not be equal. That is, your page may show up for a date stamp search and my page may not.
Why?
Because that’s just the way it works. However, if you combine a date stamp with the title of your site, you diminish the chances of finding your pages. Google doesn’t seem to return Supplemental Results pages for queries with more than 5 terms in them. If you follow Michael’s advice and use month+year, you only have three keywords left for the name of your site.
Which is not to say that month+year plus three terms will make a Supplemental Results page appear in a query. I’m just saying that so far, after running thousands of test queries for a year, I’ve not been able to get a page I believe to be in the Supplemental Index to appear in a 6+ term query.
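If you want to check the arithmetic before you commit to a date stamp format, here is a minimal sketch in Python. It only counts whitespace-separated terms in a query and compares the count to the 5-term ceiling I described above. That ceiling is my own observation from test queries, not a documented Google limit, and the site names below are made up for illustration.

```python
# Assumption: the 5-term ceiling comes from the author's test queries,
# not from any documented Google behavior.
OBSERVED_MAX_TERMS = 5


def build_datestamp_query(month, year, site_name):
    """Combine a month+year date stamp with the site name into one query."""
    return f"{month} {year} {site_name}"


def count_terms(query):
    """Count whitespace-separated terms in the query."""
    return len(query.split())


def within_observed_limit(query, max_terms=OBSERVED_MAX_TERMS):
    """True if the query has no more terms than the observed ceiling."""
    return count_terms(query) <= max_terms


# Hypothetical site names, used only to show the term budget.
short_query = build_datestamp_query("September", "2008", "Example Widget Shop")
long_query = build_datestamp_query(
    "September", "2008", "Example Widget Shop Online Store"
)

for query in (short_query, long_query):
    print(query, count_terms(query), within_observed_limit(query))
```

The month and year use up two terms, so a three-word site name lands exactly on the ceiling and anything longer goes over it.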
Your mileage may vary.
You should not need to wait 2 months to determine whether Google is crawling your pages, however. If you study your server logs then you’ll know how often Google fetches your pages.
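Here is a rough sketch of what studying your server logs could look like. It assumes your server writes the common "combined" access log format and that Googlebot identifies itself in the user agent string. The sample lines, IP addresses, and paths are invented for illustration, and a user agent string can be faked, so treat the output as a first pass only.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumption: logs are in the "combined" access log format.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?:GET|HEAD) (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_fetches(lines):
    """Map each requested path to the times a Googlebot user agent fetched it."""
    fetches = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "Googlebot" in match.group("agent"):
            when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
            fetches[match.group("path")].append(when)
    return dict(fetches)


# Invented sample lines; real logs would be read from a file.
SAMPLE_LOG = [
    '66.249.66.1 - - [01/Sep/2008:06:25:14 +0000] "GET /about.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
    '203.0.113.7 - - [02/Sep/2008:09:10:00 +0000] "GET /contact.html HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1)"',
    '66.249.66.1 - - [09/Sep/2008:07:02:41 +0000] "GET /about.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
]

fetch_times = googlebot_fetches(SAMPLE_LOG)
for path, times in fetch_times.items():
    print(path, len(times), "fetches, latest", max(times).date())
```

Pages that never show up in the result are the ones worth worrying about, and you can see that today instead of two months from now.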
But Google will fetch a page more often than it is recached. What you really want to know is how often your page is cached. Unfortunately, Google doesn’t always show you cache data for pages. I’ve noticed that page cache data seems to become unavailable right before it’s updated (but I could be mistaken).
If your page cache is updated rarely (updates are more than a month apart), you should see if you can grab random bits of text from the page and use them to pull that specific page up in a site query. If the page doesn’t appear for unique text in a site search, it’s almost certainly supplemental.
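Grabbing random bits of text can be sketched in a few lines of Python. The function below picks short runs of consecutive words from a page's text and wraps each one in a quoted site: query that you can paste into Google. The domain, the sample text, and the choice of four words per snippet are my assumptions for the example. I kept the snippets short only because of the long-query behavior I described earlier, and I can't promise that is the right length.

```python
import random


def snippet_queries(page_text, domain, words_per_snippet=4, how_many=3, seed=None):
    """Build site: queries from randomly chosen runs of consecutive words."""
    words = page_text.split()
    if len(words) < words_per_snippet:
        return []
    rng = random.Random(seed)
    last_start = len(words) - words_per_snippet
    # Sample distinct starting positions so no snippet is repeated.
    starts = rng.sample(range(last_start + 1), min(how_many, last_start + 1))
    queries = []
    for start in sorted(starts):
        phrase = " ".join(words[start:start + words_per_snippet])
        queries.append(f'site:{domain} "{phrase}"')
    return queries


# Hypothetical page text and domain, for illustration only.
sample_text = (
    "Our hand-built oak bookcases ship flat and assemble "
    "in minutes without any special tools"
)
sample_queries = snippet_queries(sample_text, "example.com", seed=1)
for query in sample_queries:
    print(query)
```

Run each query by hand and note whether the page comes back. If none of the snippets pull it up, that is the signal described above.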
And you don’t have to wait 2 months to learn that.
Written by Michael Martinez




