There are basically three different dates associated with any "public" web page that’s available on the Internet:
1. The publication date - this is the date when a web page or a website is first uploaded to a public web server so that human beings and search spiders can find and read that page.
2. The discovery date - this is the date when search engine spiders first discover a web page on the Internet. Because Google has become so good at crawling fresh content, the date of first crawl can be the same as the actual publication date (#1).
3. The cache date - this is the date when a web page was last crawled by the search bot. While webmasters can use XML sitemaps to hint to search engines that a page on the site has changed, search bots are free to ignore that advice, and therefore the cache date may or may not be the same as the last modified date.
To give you an example, the publication date of this article is February 25, 2008 (it’s mentioned on the web page), the discovery date (when Google first crawled that page) is also February 25, 2008, but the cache date, or the day when Googlebot last crawled that page, is April 20, 2009.
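The sitemap hint mentioned under the cache date can be sketched in a few lines. This is a minimal illustration, assuming the standard `<lastmod>` element of the sitemaps.org protocol; the function name `sitemap_with_lastmod` and the www.example.com address are hypothetical, and search bots remain free to ignore the value.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_with_lastmod(page_url: str, last_modified: date) -> str:
    """Build a one-entry XML sitemap that declares a last-modified date.

    Hypothetical helper for illustration only; a real sitemap would
    usually list many URLs.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = page_url
    # <lastmod> is only a hint; crawlers decide when to revisit the page.
    ET.SubElement(url, "lastmod").text = last_modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(sitemap_with_lastmod("http://www.example.com/page", date(2008, 2, 25)))
```

The date is written in ISO format (YYYY-MM-DD), which is one of the forms the sitemap protocol accepts.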
Know The Publishing Date of Web Pages
Now in the above case, the author has himself indicated the publishing date of the web page, but in situations where the date is not specified (or you think the mentioned date is incorrect), here’s a simple hack to help you know when a web page or web domain was published on the Internet.
Step 1. Go to google.com and copy-paste the full URL of the web page in the search box along with the inurl: operator (e.g. inurl:www.example.com). Hit enter.
google.com/search?q=inurl:http://www.labnol.org/websites
Step 2. Now go to your browser’s address bar (Ctrl+L in Firefox or Alt+D in Internet Explorer) and copy-paste "&as_qdr=y15" at the end of the Google search URL. Hit Enter again.
google.com/search?q=inurl:http://www.labnol.org/websites&as_qdr=y15
Step 3. Google will load the results again and this time, you’ll see the actual publication date of the web page next to the title in Google search results as in this screenshot.
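If you would rather not edit the address bar by hand, the search URL from steps 1 and 2 can be assembled with a short script. This is only a sketch: the function name `build_date_search_url` is hypothetical, the `https://www.` prefix is an assumption (the examples above omit the scheme), and it relies on Google continuing to honor the `as_qdr` parameter as described in this article.

```python
from urllib.parse import urlencode


def build_date_search_url(page_url: str, years: int = 15) -> str:
    """Return a Google search URL for `page_url` using the inurl: operator,
    limited to the past `years` years via the as_qdr parameter."""
    query = {
        "q": f"inurl:{page_url}",   # step 1: inurl: operator plus the page URL
        "as_qdr": f"y{years}",      # step 2: the "&as_qdr=y15" suffix
    }
    # Keep ":" and "/" unescaped so the result looks like the hand-typed URL.
    return "https://www.google.com/search?" + urlencode(query, safe=":/")


if __name__ == "__main__":
    print(build_date_search_url("http://www.labnol.org/websites"))
```

Open the printed URL in a browser and look for the date next to the result title, as in step 3.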
Video Screencast: Know when a web page was published
Using the same trick, Google tells us that the MySpace.com domain appeared in Google around 31 March 2002 and Orkut on 12 Jan 2004, while Barack Obama created his Twitter account on 05 March 2007. The first publication date for Yahoo.com, Whitehouse.gov, CNN.com, Microsoft and other very old domains is reported as 31 Jan 2001, which is incorrect; that is probably a bug, because Google’s crawler database does include pages prior to that date, like this one.
These site publication dates may not be 100% accurate in all cases but they should be very close especially for new web pages and domains.