Posted Monday, September 20, 2004
I tend to spend a lot of time in various search engine forums and newsgroups answering questions related to search engines. One of the most common questions that seems to come up is along the lines of:
"My site was in Google yesterday and ranking well and today it's gone! What happened?"
"I made some changes to my site and Google picked them up, but now, it shows the old page again. Why would Google do this?"
and a number of other variations on the above two questions.
With the importance of Google these days, it's no wonder that situations like the above would have webmasters quite worried. After all, with all the reports of sites being penalized or banned, having your site completely disappear could be a bit troublesome.
Luckily, there is a very easy explanation for the above phenomenon, and it has been lovingly referred to as "Everflux". What exactly is everflux?
Well, "everflux" stems from Google's attempt to create the freshest possible index, and by fresh I mean up-to-date. To understand this, let's look first at Google's normal update cycle.
Generally, somewhere around the beginning of the month (although this can vary widely, as it has in the past couple of months), Google's primary spider (actually there is more than one primary spider, but for simplicity I'm going with the singular) heads out and begins to index the sites in its database. This process generally takes anywhere from 5 to 10 days. During this time, the spider indexes any new pages and re-indexes pages already in its index.
After this spidering process is complete, there is generally a delay of about two to three weeks before the results from this spidering are publicly available. During this period, which has affectionately been termed the "Google Dance", the results returned from Google tend to fluctuate a bit. This "dance" can last anywhere from 2 or 3 days up to about 1 week.
This is the normal cycle for Google, and it works quite well except for sites whose content changes frequently, such as news sites. With the current system, changes to a webpage or site take a minimum of 2 or 3 weeks to be reflected in the primary database, and can take up to 6 or even 7 weeks depending on when the changes were made. If changes were made in time for the monthly spidering, they would be reflected in a couple of weeks, but if they were made after the monthly spidering, the site would have to wait for the following month's spidering to be picked up, and it would end up taking much longer.
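To see where the 6 or 7 week figure comes from, here is a rough back-of-the-envelope sketch in Python. The function name and the 30-day cycle length are my own assumptions for illustration; the two to three week publishing delay is the figure described above. Nothing here reflects how Google actually schedules anything.

```python
# Illustrative arithmetic only; the 30-day cycle is an assumed round number.
def days_until_visible(days_until_next_crawl: int, publish_delay_days: int) -> int:
    """Days between changing a page and the change showing in the main index."""
    return days_until_next_crawl + publish_delay_days

# Best case: the change lands just as the monthly crawl begins,
# and the results go public after the shorter two-week delay.
best_case = days_until_visible(days_until_next_crawl=0, publish_delay_days=14)

# Worst case: the change just misses the crawl, so it waits about a
# month for the next one, then the longer three-week delay.
worst_case = days_until_visible(days_until_next_crawl=30, publish_delay_days=21)

print(best_case)   # 14 days, about 2 weeks
print(worst_case)  # 51 days, a little over 7 weeks
```

In other words, the wait depends almost entirely on where in the monthly cycle the change happens to fall.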
Even a two or three week delay is too long when dealing with breaking news and other current events. The solution? Google's "Freshbot".
Google's "Freshbot", as it has been termed, is a secondary spider that is constantly crawling the web. It crawls sites Google has found to be either news sites or other important sites that change on a constant basis. It also tends to find sites that have either recently changed or are brand new.
This secondary spider adds its findings not to the main database but to a temporary database. This temporary database is incorporated into the results returned from the primary (main) database, which allows Google to continue its normal update cycle while also returning very fresh and up-to-date content.
The confusion comes from the fact that this temporary database that is used by the Freshbot is, in effect, rewritten on a daily basis with the results from the latest round of spidering. This means that a page that was in the temporary database on one day may be completely missing the next.
This can cause a lot of confusion, as a new site could be found one day by the Freshbot and added to the temporary database, only to be overwritten and disappear the following day. The same goes for changes to a page that are found by the Freshbot and then revert to the old version within a day or two. This is simply the natural "flux" caused by this temporary database.
The good news is that these sites that are found and then disappear will almost always reappear permanently once the primary spider crawls them and they are added to the main index.
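For those who find it easier to follow in code, here is a toy model of the behavior described above, written in Python. It is only a sketch under my own assumptions: the dictionaries, the `search_results` function, and the example URLs are all hypothetical, and Google's real systems are certainly far more involved.

```python
# Toy model: a long-lived main index plus a temporary "fresh" index
# that is rebuilt from scratch each day. All names are hypothetical.
def search_results(main_index: dict, fresh_index: dict) -> dict:
    """Combine both indexes; a fresh entry overrides the main entry for a URL."""
    merged = dict(main_index)
    merged.update(fresh_index)
    return merged

# The main index only knows the old version of one page.
main_index = {"example.com/page": "old version"}

# Day 1: the Freshbot finds a brand new site and an updated page.
fresh_day1 = {"example.com/page": "new version", "newsite.example/": "first crawl"}
day1 = search_results(main_index, fresh_day1)

# Day 2: the temporary index is rewritten and neither page is in it.
fresh_day2 = {}
day2 = search_results(main_index, fresh_day2)

# After the next main crawl, both changes are in the main index for good.
main_index = {"example.com/page": "new version", "newsite.example/": "first crawl"}
after_main_crawl = search_results(main_index, {})

print("newsite.example/" in day1)              # True: the new site shows up
print("newsite.example/" in day2)              # False: it has vanished
print(day2["example.com/page"])                # old version: the change reverted
print("newsite.example/" in after_main_crawl)  # True: back permanently
```

Nothing was penalized or banned on day 2 in this model. The page simply was not in that day's temporary index.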
So, if this has happened, is happening, or does happen to you at some point, never fear. It is simply the Google "Everflux" phenomenon at work.
About the Author
John Buchanan is the author of the book "The Insider's Guide to Dominating The Search Engines", and a search engine optimization professional. Visit him at (http://www.se-secrets.com) for more information or with any questions.