LBi: online brand positioning

Is Ask Jeeves scraping Google?

Posted on 29th July 2009 at 12:08 pm by Ian Macfarlane

Ask Jeeves has pages in its index which could only have been spidered by Googlebot. What is going on?

I was experimenting with User-Agents the other day and came across UserAgent.org – a site which simply displays your web-browser’s User-Agent string. I thought it might be interesting to look at which User-Agents the various search engines had used when they last spidered the site. Little did I expect to find this!
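For readers wondering how a page like this works, here is a minimal sketch of a User-Agent echo page. It is not UserAgent.org's actual code, just an illustration under the assumption of a plain WSGI application; the function name `user_agent_echo_app` is hypothetical. Because the page's content depends on who requested it, any copy a search engine stores reveals the User-Agent (and here also the IP address) of the spider that fetched it.

```python
def user_agent_echo_app(environ, start_response):
    """Hypothetical WSGI app: echo back the requester's User-Agent and IP."""
    # WSGI exposes request headers as HTTP_* keys in the environ dict.
    user_agent = environ.get("HTTP_USER_AGENT", "(none sent)")
    # REMOTE_ADDR is the address the request arrived from.
    remote_ip = environ.get("REMOTE_ADDR", "(unknown)")

    body = (
        f"User-Agent: {user_agent}\n"
        f"IP address: {remote_ip}\n"
    ).encode("utf-8")

    start_response(
        "200 OK",
        [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

To try it locally, the app could be served with the standard library's `wsgiref.simple_server.make_server` and visited in a browser.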

As expected, Google, Yahoo! and Bing simply displayed their standard User-Agents. For example, here’s the result when searching in Bing:

Bing results for UserAgent.org

Side note: Bing is gradually shifting away from msnbot 1.1 and is moving to msnbot 2.0.

However, something rather unexpected happened when searching for that site in the number four search engine, Ask Jeeves:

Ask Jeeves results for UserAgent.org

Eh? That’s Google’s User-Agent, Googlebot! At first, I wondered if Ask Jeeves was simply pretending to be Googlebot sometimes (perhaps to get around websites which block their spider or to detect cloaking). However, when looking at a page which shows the IP address that the request came from, the mystery deepened further:

Ask Jeeves results for UserAgent.org IP address

This page was fetched from the IP address 66.249.68.19. I immediately recognised this as one of Googlebot’s IP addresses (Google owns the entire IP range 66.249.64.0 to 66.249.95.255, and it’s a common Googlebot crawl source). Sure enough, a reverse DNS lookup on this IP address returned the following hostname:

crawl-66-249-68-19.googlebot.com

What does this mean? It means that this page must have been fetched by Google’s spiders, not those from Ask Jeeves. And it’s not just this site: there are many, many pages indexed by Ask Jeeves which were spidered from the same location.

Ask Jeeves results from multiple Googlebot IP addresses
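If you want to check a batch of cached IP addresses against the Google range quoted earlier, something like the following rough sketch would do it. It uses only the range stated in this post (66.249.64.0 to 66.249.95.255), not an authoritative list of Google's addresses, and the function names are my own.

```python
import ipaddress

# Range as quoted in this post; an assumption, not an official Google list.
RANGE_FIRST = ipaddress.IPv4Address("66.249.64.0")
RANGE_LAST = ipaddress.IPv4Address("66.249.95.255")


def in_quoted_google_range(ip):
    """Return True if the dotted-quad string falls inside the quoted range."""
    addr = ipaddress.IPv4Address(ip)
    return RANGE_FIRST <= addr <= RANGE_LAST


def quoted_range_as_cidr():
    """Express the quoted first/last addresses as CIDR block(s)."""
    networks = ipaddress.summarize_address_range(RANGE_FIRST, RANGE_LAST)
    return [str(network) for network in networks]


if __name__ == "__main__":
    for ip in ("66.249.68.19", "66.235.124.6"):
        print(ip, in_quoted_google_range(ip))
    print(quoted_range_as_cidr())
```

The range happens to collapse to a single CIDR block, which is handy if you would rather filter server logs by prefix.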

It gets even more peculiar – if you look at Ask Jeeves’ cached copy of UserAgent.org, it instead shows the page as having been fetched by the Ask Jeeves/Teoma spider, with the following User-Agent:

Mozilla/5.0 (compatible; Ask Jeeves/Teoma; +http://about.ask.com/en/docs/about/webmasters.shtml)

Also, sometimes you do indeed get Ask Jeeves results - for example, here’s exactly the same web page we saw earlier, after refreshing the search results page a few times:

Ask Jeeves Teoma spider

The IP address 66.235.124.6 resolves to the following Ask Jeeves crawler hostname:

crawler5006.ask.com

In other words, sometimes Ask Jeeves is displaying a page fetched by Googlebot, and sometimes it is displaying the page fetched by its own spider. Typically, the first time a particular request is made, you get the Googlebot-fetched page, and after that Ask Jeeves usually shows the copy it fetched itself.
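Since a User-Agent string is trivial to fake, the IP address is the more trustworthy clue. The two reverse lookups above can be sketched as a small helper. A few caveats: the domain-to-owner table is inferred only from the two hostnames seen in this post, the function names are hypothetical, and the forward-confirmation step (checking that the hostname resolves back to the same IP) is an extra safeguard I did not perform by hand above. The lookup functions are passed in as parameters so the logic can be tried without touching the network.

```python
import socket

# Inferred from the two hostnames in this post; an assumption, not a full list.
CRAWLER_DOMAINS = {
    "googlebot.com": "Google",
    "ask.com": "Ask Jeeves",
}


def reverse_lookup(ip):
    """Reverse DNS: IP address -> primary hostname."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname):
    """Forward DNS: hostname -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(hostname)[2]


def identify_crawler(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Return the crawler's owner, or None if it cannot be confirmed."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return None

    for domain, owner in CRAWLER_DOMAINS.items():
        if hostname.endswith("." + domain):
            # Forward-confirm: the claimed hostname must map back to this IP,
            # otherwise anyone controlling their own reverse DNS could spoof it.
            try:
                confirmed = ip in forward(hostname)
            except OSError:
                return None
            return owner if confirmed else None
    return None
```

Calling `identify_crawler("66.249.68.19")` with the default lookups performs live DNS queries, so the answer depends on what DNS returns at the time.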

So why is Ask Jeeves including Google-sourced pages? Aside from the somewhat crazy idea that they might actually be scraping Google’s cached pages, which I think we can dismiss, the most likely explanation is that Ask Jeeves and Google have some kind of agreement whereby Google is assisting its diminutive competitor with spidering – and quite possibly more than that.

According to paidContent.org, the advertising deal between Ask Jeeves and Google includes a provision for Google to help provide algorithmic search results to Ask Jeeves, in addition to the better-known advertising aspect of the deal.

If so, this discovery of Google-sourced search results could be the first real proof that Ask Jeeves is throwing in the algorithmic towel and giving up on its own search engine.

Note: This is particularly interesting in light of the recently announced Microsoft-Yahoo! deal.

See our follow-up post: Are we losing two of the top four search engines?

   

File under: google ask

©LBi. All rights reserved 2000-2010