Monday, December 1, 2008

How to get all the results available from your Google AJAX Search API application

If you haven't heard of it yet, the Google AJAX Search API is a pretty nifty little doo-dad. Basically, it allows you to add Google search functionality without having to divert visitors from your site, hack the Google infrastructure, or even deviate from your own formatting rules. It works like this: you build a control of some sort that will process user input, form a request URL, and get information back via JSONP, including web, local, news, video, image, blog, book, and even patent search results. Pretty slick, if you ask me, but one of the questions I often receive regarding the API goes something like this: "It's a cool concept, but how can I get more results?" You see, using it, you can currently only get up to 64 results (fewer for local and blog searches), and you can only retrieve those in blocks of 8.

So for today's adventure, I thought we could explore getting as many results as we possibly can from the Google AJAX Search API. But right off the bat, there is a disclaimer: I'm not going to tell you how to get around the limit of total results you can retrieve. The AJAX Search API is designed to provide rudimentary search functionality for the users of your website or other application, not for SEO, data mining, or even deep searching. And it does that intended task admirably. To try to bypass the total results limit is a violation of the service's Terms of Use and, I believe, generally unnecessary when using the API as intended.

Rather, what we're going to talk about is getting all the results that they will let us get in one fell swoop.

So, how do we do it? Well, first things first; we set up our searcher as normal. Because the default google.search.SearchControl is obfuscated, we're going to have to use a RAW searcher. If you don't know how to do this, you'll want to check out the documentation at http://code.google.com/apis/ajaxsearch/documentation/.

Once that's done, there are two methods of the searcher object that we're going to take advantage of: .setSearchCompleteCallback() and .gotoPage(). Details for both of these methods are provided in the class reference.

Essentially, the process, broken down into its component steps, is going to go like this:
  1. We set up the searcher's completion callback.
  2. We execute a search query.
  3. The query returns and executes the completion callback.
  4. The completion callback checks to see if there are more results.
  5. If there are more results to get, it calls gotoPage(...).
  6. Otherwise, it finishes processing the results.
So, are you ready for this? Here's the code, commented up so you can see what's going on.

// Initialize the searcher object, in this case a WebSearch.
window.gs = new google.search.WebSearch();
gs.setResultSetSize(google.search.Search.LARGE_RESULTSET);

// Set the search complete callback. Notice that we're using the searcher itself as the context. This is going to allow us to use "this" in the callback to refer to the searcher.
gs.setSearchCompleteCallback(gs, function(){

  // Set a handle to the cursor object that we're going to use repeatedly. The check further down allows for it being missing, so we guard for that here too.
  var cursor = this.cursor;

  // Create a new property on the searcher that we can stash results into so they don't disappear when we go to the next page. Start fresh on the first page of each search, or when there is no cursor.
  if (!this.allResults || !cursor || cursor.currentPageIndex == 0) { this.allResults = []; }

  // Add the new results to the other results
  this.allResults = this.allResults.concat(this.results);

  // Check to see if the searcher actually has a cursor object and, if so, if we're on the last page of results. If not...
  if (cursor && cursor.pages.length > cursor.currentPageIndex + 1) {

    // Go to the next page.
    this.gotoPage(cursor.currentPageIndex + 1);

  // Else, if there is no cursor object or we're on the last page...
  } else {

    // Loop through the results and...
    for (var i = 0; i < this.allResults.length; i++) {
      var result = this.allResults[i];

      // Plug them into the document where we want them.
      document.body.appendChild(result.html.cloneNode(true));
    }
  }
});
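
If you'd like to watch the paging logic run without hitting Google at all, here is a minimal sketch that drives the same callback against a stand-in searcher. FakeSearcher and collectAll are names I made up for illustration; they are not part of the API. The fake only mimics the members the callback touches (cursor.pages, cursor.currentPageIndex, results, execute, and gotoPage), and it calls back synchronously, whereas I'd expect the real searcher to call back asynchronously once each JSONP response arrives.

```javascript
// Hypothetical stand-in for a raw searcher: serves `total` fake results in pages of `pageSize`.
function FakeSearcher(total, pageSize) {
  this.total = total;
  this.pageSize = pageSize;
  this.results = [];
  this.cursor = undefined;
}

// Same shape as the real call: a context object plus the function to run in it.
FakeSearcher.prototype.setSearchCompleteCallback = function (context, fn) {
  this.onComplete = function () { fn.call(context); };
};

FakeSearcher.prototype.execute = function (query) {
  this.query = query;
  this.load(0);
};

FakeSearcher.prototype.gotoPage = function (page) {
  this.load(page);
};

// Fill in cursor and results for one page, then fire the completion callback.
FakeSearcher.prototype.load = function (page) {
  var pages = [];
  for (var start = 0; start < this.total; start += this.pageSize) {
    pages.push({ start: String(start), label: pages.length + 1 });
  }
  this.cursor = pages.length ? { pages: pages, currentPageIndex: page } : undefined;
  this.results = [];
  var first = page * this.pageSize;
  var last = Math.min(first + this.pageSize, this.total);
  for (var i = first; i < last; i++) {
    this.results.push({ title: this.query + ' #' + (i + 1) });
  }
  this.onComplete();
};

// The paging callback from the post, with the final DOM loop swapped for a done() hook.
function collectAll(searcher, done) {
  searcher.setSearchCompleteCallback(searcher, function () {
    var cursor = this.cursor;
    if (!this.allResults || !cursor || cursor.currentPageIndex === 0) {
      this.allResults = [];
    }
    this.allResults = this.allResults.concat(this.results);
    if (cursor && cursor.pages.length > cursor.currentPageIndex + 1) {
      this.gotoPage(cursor.currentPageIndex + 1);
    } else {
      done(this.allResults);
    }
  });
}

var fake = new FakeSearcher(64, 8); // 8 pages of 8 results
var collected = null;
collectAll(fake, function (all) { collected = all; });
fake.execute('adventure');
console.log(collected.length); // 64
```

Handing the finished array to a done() function instead of appending straight to document.body is just a design choice for the sketch; it keeps the collecting step separate from whatever you want to do with the results.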


Check it out in action here.

And there you have it! All 64 results from a Google AJAX Search API WebSearch, in one shot.

What I learned on today's adventure:
  1. I can use the Google AJAX Search API to get up to 64 search results for use on my site or in my application.
  2. How to use searcher.setSearchCompleteCallback and searcher.gotoPage to get all available results in one fell swoop.

11 comments:

geoffsmiths said...

Hi Jeremy,

First of all, thanks for your help so far with my Google API search issues.

You directed me to this post, and I was excited when I read that there is a solution for the issue of only getting 8 results.

But there is something else that I've noticed. I've been testing the page that you wrote, and I opened it in Firefox. It works perfectly! But as soon as I open the results page in Internet Explorer 7, I see a bunch of lines at first, and later on I see 8 results...

Do you know how that is possible?

My best regards and keep up the good work Jeremy!

Jeffrey from the Netherlands

jgeerdes said...

Wow, that was a good catch. I never thought about how much IE stinks when I mocked up that test page. At any rate, because of IE, I've adjusted the code a little bit. Basically, it amounts to manually cloning each result object, rather than simply concat'ing the results array to allResults.

So instead of this line:

this.allResults = this.allResults.concat(this.results);

We switch it out to this block:

for(var i=0;i<this.results.length;i++){
  var result = this.results[i];
  var allResult = {};
  for(var j in result){
    // Clone anything that can be cloned (DOM nodes); copy everything else as-is.
    try{allResult[j] = result[j].cloneNode(true);}
    catch(e){allResult[j] = result[j];}
  }
  this.allResults.push(allResult);
}


It's disgusting, I know, but such is IE!

See it in action here:

http://jgeerdes.home.mchsi.com/playground/getAllResultsFromJSAPI.html
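
If you'd rather keep that cloning step out of the callback body, it can be pulled into a small helper. This is only a sketch: cloneResult is a made-up name, and the object below is a stand-in with a pretend cloneNode, not a real search result. I kept the try/catch rather than testing typeof, since I wouldn't trust old IE to report DOM methods as functions.

```javascript
// Copy one result object, cloning every property that can be cloned like a DOM node.
function cloneResult(result) {
  var copy = {};
  for (var key in result) {
    try {
      copy[key] = result[key].cloneNode(true);
    } catch (e) {
      // Not a node (string, number, null, ...): copy the value as-is.
      copy[key] = result[key];
    }
  }
  return copy;
}

// Stand-in for a result: "html" pretends to be a DOM node, "title" is a plain string.
var original = {
  title: 'Example',
  html: {
    tag: 'div',
    cloneNode: function (deep) {
      return { tag: this.tag, deep: deep, cloneNode: this.cloneNode };
    }
  }
};

var copy = cloneResult(original);
console.log(copy.html !== original.html); // true
console.log(copy.title); // Example
```

Inside the callback, the loop then shrinks to pushing cloneResult(this.results[i]) onto allResults.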

Unknown said...

Hi,
The Google AJAX Search API is cool,
but it's not like "Yahoo BOSS" or the "MSN Live API". It is not for "deep search". For example, the Google API is limited to 64 results; in MSN or Yahoo there is no limit.

jgeerdes said...

amirr,

Thanks for the comments. Indeed, the Yahoo! BOSS and MSN Live APIs have some compelling features, but unfortunately, limitless results is not one of them. Both are actually limited to retrieving no more than 1,000 results. I found this in MSN's documentation (I've never been desperate enough to use MSN's API), and by testing with the Yahoo! BOSS API (I have developed with BOSS). You can run the same test; just use a start parameter of 1000 and check the response's start position. It will show 990. Use start=10000, and you'll get the same.

That said, I do appreciate the much larger results sets that both of these APIs return. In the case of the BOSS API, I also appreciate the flexibility in response formats (i.e., you can get results in either an xml or a json format). Combine BOSS's features with the Yahoo! Local API, which includes the ratings and customer comments, etc., for results, and you have a formidable package.

Even so, these two APIs do have one glaring shortcoming that the Google API addresses: neither offers any sort of pre-defined UI elements to allow developers to painlessly deploy their power on their own sites. This is where the Google API really shines. It allows even amateur developers to put basic search functionality on their site quickly and easily. Yes, I have seen some of the attempts to put a JS wrapper on, for example, the BOSS API, but they just don't compare to the ease and flexibility of the Google stuff.

vin said...

Wonderful write up! Thanks very much for the tip! (found it from your msg on the google ajax api group)

Prakash said...

Hi Jeremy,

I am new to the Google AJAX Search API. I need to limit my search by region and language, like region=UK and language=en. Kindly help me.
Thank you

Roch said...

Hi,
I don't understand how to integrate your method and pass an argument to the execute() function. Could you help me with that?

jgeerdes said...

@Prakash
Thanks for the question. Unfortunately, there is currently no way to limit searches by geography. You can, however, limit by language by calling, for example, searcher.setRestriction(google.search.Search.RESTRICT_EXTENDED_ARGS,{lr:'en'});

@Roch:
What do you mean, you're having trouble passing arguments to the execute method? You mean you want your search complete callback to have additional arguments, or you can't get the execute method to work correctly?

jbird said...

Hi Jeremy,

Fantastic post. I know it's been a while since you posted it but I'm going to ask anyway. I'm new to the google API but I'm using it to grab map results and put them in my customer's database (for booking shows for bands). I'm using the following snippet to get the result sets:

http://code.google.com/apis/ajax/playground/?exp=search#localsearch_with_markers

Can you see a way that can be modified to include the maximum number of results rather than 8 x4 pages?

Any help is greatly appreciated.
Cheers,
jay

jbird said...

Jeremy,

With a little help from your post and some brain activity by me I've figured it out - don't approve my previous comment unless you feel like it.

My band (red umbrella) has played in your town a few times!

Cheers,
Jay

jyotirani said...

Hi Jeremy,

Your code is a great help to me, as I had been looking around for 2 days to get this task done.

Thanks a Lot!!!