9 March, 2005 at 02:01

GDS Developer Search API


Requesting a Desktop Search

Your application requests a desktop search by sending an HTTP request that includes a &format=xml parameter to Google Desktop Search. For example, to search for “Google” you’d send something like:

http://127.0.0.1:4664/search&s=1ftR7c_hVZKYvuYS-RWnFHk91Z0?q=Google&format=xml

To break this down:

  • http://127.0.0.1:4664/ is the localhost address and GDS port.
  • search&s=1ftR7c_hVZKYvuYS-RWnFHk91Z0 is the search command and a security token.
  • ?q=Google is the query term(s) parameter.
    • If you want to search for more than one term, separate the terms with +s. For example, to search for both “Google” and “GDS”, use:
      ?q=Google+GDS
    • If you want to search for a specific phrase, separate the terms with +s and surround the phrase with %22s. For example, to search for the phrase “Google Desktop Search”, use:
      ?q=%22Google+Desktop+Search%22
      To search for the two phrases “Google Desktop Search” and “Copyright 2005”, use:
      ?q=%22Google+Desktop+Search%22+%22Copyright+2005%22

  • &format=xml specifies that the HTTP response returns the search results in XML format, as described in the next section.
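The encoding rules above can be sketched in Python. This is only an illustration: the helper name encode_query is hypothetical, and it assumes that any item containing a space should be treated as an exact phrase. It relies on the standard library's quote_plus, which turns spaces into + and double quotes into %22.

```python
from urllib.parse import quote_plus


def encode_query(*terms_or_phrases: str) -> str:
    """Encode search terms for the q= parameter (hypothetical helper).

    Items that contain a space are treated as exact phrases and are
    wrapped in double quotes, which quote_plus encodes as %22.
    """
    parts = []
    for item in terms_or_phrases:
        if " " in item:
            item = f'"{item}"'
        parts.append(quote_plus(item))
    # Separate the individual terms and phrases with +
    return "+".join(parts)


print("?q=" + encode_query("Google", "GDS"))
print("?q=" + encode_query("Google Desktop Search", "Copyright 2005"))
```

The two calls reproduce the multi-term and two-phrase examples given in the list above.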

Note that these requests only do a desktop search, not both a web and desktop search.

The search query URL, including your security token, is stored in the registry at:

HKEY_CURRENT_USER\Software\Google\Google Desktop\API\search_url

To use the example above, the stored query URL would be something like:

http://127.0.0.1:4664/search&s=1ftR7c_hVZKYvuYS-RWnFHk91Z0?q=

To create a query, all you have to do is append your query terms and the final &format=xml parameter to the stored query URL.
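A minimal sketch of that step in Python, assuming a Windows machine with GDS installed. The function names are hypothetical; the registry key and value name are the ones given above, and winreg is Python's standard registry module (available on Windows only, so it is imported lazily here).

```python
from urllib.parse import quote_plus

# Registry location from the text above (under HKEY_CURRENT_USER).
API_KEY_PATH = r"Software\Google\Google Desktop\API"


def read_stored_search_url() -> str:
    """Read the stored query URL, including the security token (Windows only)."""
    import winreg  # standard library, but only present on Windows

    with winreg.OpenKey(winreg.HKEY_CURRENT_USER, API_KEY_PATH) as key:
        value, _value_type = winreg.QueryValueEx(key, "search_url")
    return value


def build_search_url(stored_url: str, query: str) -> str:
    """Append the encoded query terms and the final &format=xml parameter."""
    return f"{stored_url}{quote_plus(query)}&format=xml"


# Using the example stored URL from the text instead of a live registry read:
example_stored_url = "http://127.0.0.1:4664/search&s=1ftR7c_hVZKYvuYS-RWnFHk91Z0?q="
print(build_search_url(example_stored_url, "Google"))
```

On a real installation you would pass read_stored_search_url() as the first argument rather than the example URL.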

By default, an HTTP search response will only return the first ten results. You can specify a larger number by appending the &num= parameter, followed by the maximum number of results you’d like returned, to your query. It is fine for the &num= value to be greater than the total number of search results; GDS returns only the results that exist, with no null “results” to pad out the response.

You can also specify at what point in the results the returned ones start. For example, if you’re using the default value of 10 returned results and want to get back results 11-20 instead of the default results 1-10, append the &start= parameter, followed by the position you want the results to start from. In this example, you’d specify &start=10 to indicate you want your returned results to start with the one after overall result 10. The &start= and &num= parameters can both be used in a single query.
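As a rough illustration of how &num= and &start= combine, the hypothetical helper below computes the parameters for a given page of results. It assumes pages are numbered from zero, so page 1 with the default size of 10 corresponds to results 11-20.

```python
def page_params(page: int, per_page: int = 10) -> str:
    """Return the &num= and &start= parameters for a zero-based page number.

    &start= is the number of results to skip, so page 1 of 10-result pages
    starts after overall result 10.
    """
    if page < 0 or per_page < 1:
        raise ValueError("page must be >= 0 and per_page must be >= 1")
    start = page * per_page
    return f"&num={per_page}&start={start}"


# Results 11-20 with the default page size:
print(page_params(1))
# Results 51-75 when asking for 25 results at a time:
print(page_params(2, per_page=25))
```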

Desktop Search Results

When it receives a search request with the &format=xml parameter, GDS returns results in an XML format. For example:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<results count="24945">
<result>
<category>file</category>
<id>46384</id>
<title>SDK Developer Documentation</title>
<url>C:\Documents and Settings\me\My Documents\developerguide.html~</url>
<time>127543068856350000</time>
<snippet>SDK Developer Documentation <b>Google</b> Desktop Search SDK Developer Guide For Users Download Plug-ins Desktop Search Forum For Developers</snippet>
<icon>/file.gif</icon>
<cache_url>http://127.0.0.1:4664/... </cache_url>
</result>

...

</results>

The one meta-results tag is <results>, whose count attribute holds the total number of items that matched the query; e.g. <results count="42">. This is the largest possible number of <result></result> entries in the <results> element. However, a single response returns at most the number of results given by &num= (10 if you don’t specify an &num= parameter with a larger value), so the number of returned <result></result> entries will be the smaller of the count value and that limit.

For example, let’s say the count value is 42, but you didn’t give an &num= argument in the query. Despite there being 42 items that matched the query, the XML response will only contain 10 <result></result> entries. If the count value was 6, which is smaller than the default value of 10, the XML response will only contain 6 <result></result> entries.

If you want to be sure to obtain all search results, your component will have to parse out the count value and then issue additional HTTP GDS search requests to retrieve that many results, using the &start= and &num= parameters.
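One way to retrieve every result is to page through the response until the count value is reached. The sketch below is an assumption-laden illustration, not GDS-provided code: fetch_all_results is a hypothetical name, and the HTTP call is passed in as a function so the loop can be shown against a fake in-memory server. A real fetcher could be something like lambda url: urllib.request.urlopen(url).read().

```python
import re
import xml.etree.ElementTree as ET
from typing import Callable, List


def fetch_all_results(base_url: str,
                      fetch: Callable[[str], bytes],
                      page_size: int = 10) -> List[ET.Element]:
    """Collect all <result> elements by issuing repeated paged requests."""
    results: List[ET.Element] = []
    start = 0
    while True:
        url = f"{base_url}&format=xml&num={page_size}&start={start}"
        root = ET.fromstring(fetch(url))
        total = int(root.get("count", "0"))  # total matches, from <results count="...">
        page = root.findall("result")
        results.extend(page)
        start += page_size
        # Stop once everything is retrieved, or if a page comes back empty.
        if not page or start >= total:
            break
    return results


def fake_fetch(url: str) -> bytes:
    """Stand-in for GDS: pretends 23 items matched the query."""
    num = int(re.search(r"&num=(\d+)", url).group(1))
    start = int(re.search(r"&start=(\d+)", url).group(1))
    items = "".join(f"<result><id>{i}</id></result>"
                    for i in range(start, min(start + num, 23)))
    return f'<results count="23">{items}</results>'.encode("utf-8")


# TOKEN is a placeholder for the security token in the stored query URL.
all_results = fetch_all_results("http://127.0.0.1:4664/search&s=TOKEN?q=Google", fake_fetch)
print(len(all_results))
```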

Each <result> may include the following fields, which may appear in any order. Which fields appear will depend on the result type; for example, the <from> entry should only show up in email or chat results. Each tag will contain a value; any tags not containing a value are not shown.

  • <category> contains the result’s type, one of:
    • email
    • chat
    • web
    • file
  • <id>  is the result’s GDS internal identifier.
  • <title>  is the result’s title, which varies depending on its type:
    • Web page: the page’s title.
    • Email: the message’s Subject:.
    • File: its filename.
    • Chat: a line from the chat.
  • <url> is the result’s URL. For files and web pages, this is the usual path to the result. For chats and email messages, Google Desktop Search generates a URL for the location where it has stored its cached copy.
  • <time> is the time value from the event that put this content into GDS. Usually this will be the time the content was indexed by GDS, but, for example, it could also be a file’s last modified time. The format is per the Windows FILETIME structure; the number of 100-nanosecond intervals since January 1, 1601 represented as a 64-bit number.
  • <snippet> is a snippet from the result’s content that contains at least one of the search terms.
  • <icon> is a Google Desktop Search-relative URL to an icon representing this result or its type. This will either be one of the standard Google Desktop Search result type icons (envelope for email, Word icon for a Word file, etc.) or a favicon obtained from a website.
  • <thumbnail> is a relative URL to the icon for this result at the GDS webserver.
  • <from> is the name of either the person an email message was from, or the other party in an instant message chat.
  • <cache_url> is the Google Desktop Search-relative URL of this result’s internal cache page.
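The fields above could be read into a dictionary along these lines. This is a sketch using Python's standard ElementTree; parse_result and filetime_to_datetime are hypothetical helper names. The <time> conversion follows the FILETIME definition given above (100-nanosecond intervals since January 1, 1601), and any inline markup such as <b> in the snippet is flattened to plain text.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta

FILETIME_EPOCH = datetime(1601, 1, 1)


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a FILETIME value to a naive datetime (microsecond precision)."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def parse_result(result: ET.Element) -> dict:
    """Map each field present in a <result> element to its text value."""
    fields = {}
    for child in result:
        # itertext() also collects text inside nested tags like <b>...</b>
        text = "".join(child.itertext()).strip()
        if text:
            fields[child.tag] = text
    if "time" in fields:
        fields["time"] = filetime_to_datetime(int(fields["time"]))
    return fields


sample = """<results count="24945">
<result>
<category>file</category>
<id>46384</id>
<title>SDK Developer Documentation</title>
<time>127543068856350000</time>
<snippet>SDK Developer Documentation <b>Google</b> Desktop Search SDK</snippet>
<icon>/file.gif</icon>
</result>
</results>"""

parsed = parse_result(ET.fromstring(sample).find("result"))
print(parsed["category"], parsed["title"], parsed["time"].isoformat())
```

Because fields may be absent depending on the result type, callers should use parsed.get("from") and similar lookups rather than assuming every key exists.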

If you want to experiment with queries to see what their XML format results look like, just do a desktop search from the browser. Then append &format=xml to the results page’s URL, hit Enter, and the same results will display in XML format in the browser. Two caveats: first, only the number of results that originally appeared on one results page will show up in the browser in XML. So, for example, if only 10 HTML format results show up in the browser on one results page, only 10 XML format results will appear in the browser, even if the count value is, say, 42. Second, your browser must include an XML viewer, which IE and Firefox do by default.

Finally, note that the XML results do not include the search terms. If your application wants to also make use of the search terms, as well as the results, it will have to keep track of what they were.


XML Event Schemas


The complete set of event schema definitions in XML:

<!-- Google Desktop Search Schemas -->
<schemas>
<!-- Generic indexable schema. The base parent schema of all other schemas -->
<schema name="Google.Desktop.Indexable" description="Indexable entity schema">
  <property name="content" description="Indexable content" type="VT_BSTR" required="true" />
  <property name="format" description="Mime type of the indexable content; text/plain, text/html are accepted types" type="VT_BSTR" required="true" />
  <property name="native_size" description="The size of the original native content (in bytes)" type="VT_UI8" />
  <property name="thumbnail" description="Thumbnail image of the content" type="VT_ARRAY" />
  <property name="thumbnail_format" description="Mime type of the thumbnail; image/gif, image/jpeg, image/png are accepted types" type="VT_BSTR" />
</schema>

<!-- Generic email schema. Inherits from Google.Desktop.Indexable -->
<schema name="Google.Desktop.Email" parent="Google.Desktop.Indexable" description="Generic Email schema">
  <property name="mail_header" description="Mail header of the e-mail message" type="VT_BSTR" />
  <property name="from" description="Sender of the email message" type="VT_BSTR" />
  <property name="subject" description="Subject of the email message" type="VT_BSTR" />
  <property name="to" description="Recipient(s) of the email message" type="VT_BSTR" />
  <property name="cc" description="Cc field of the email message" type="VT_BSTR" />
  <property name="bcc" description="Bcc field of the email message" type="VT_BSTR" />
  <property name="replyto" description="ReplyTo field of the email message" type="VT_BSTR" />
  <property name="received" description="Received time of the email message" type="VT_DATE" required="true" />
  <property name="folder_name" description="Folder name of the email message" type="VT_BSTR" />
</schema>

<!-- Generic file schema. Inherits from Google.Desktop.Indexable -->
<schema name="Google.Desktop.File" parent="Google.Desktop.Indexable" description="Generic File schema">
  <property name="uri" description="Uri of the file" type="VT_BSTR" required="true" />
  <property name="last_modified_time" description="Time of last modification" type="VT_DATE" required="true" />
  <property name="title" description="Title of the file" type="VT_BSTR" />
  <property name="author" description="Author of the file" type="VT_BSTR" />
</schema>

<!-- Generic IM schema. Inherits from Google.Desktop.Indexable -->
<schema name="Google.Desktop.IM" parent="Google.Desktop.Indexable" description="Generic Instant Message schema">
  <property name="message_time" description="Time the instant message was received" type="VT_DATE" required="true" />
  <property name="user_name" description="Name of the user (myself)" type="VT_BSTR" />
  <property name="buddy_name" description="Name of the other person in the conversation" type="VT_BSTR" required="true" />
  <property name="title" description="Title associated with the instant message(s)" type="VT_BSTR" />
  <property name="conversation_id" description="Identifier hint to group messages from the same conversation" type="VT_UI4" />
</schema>

<!-- Generic web page schema. Inherits from Google.Desktop.File -->
<schema name="Google.Desktop.WebPage" parent="Google.Desktop.File" description="Generic web page schema">
  <property name="bookmarked" description="Specifies if this web page is bookmarked" type="VT_BOOL" />
  <property name="interaction_period" description="Specifies the amount of time the user interacted with the web page" type="VT_DATE" />
</schema>

<!-- Generic media file schema. Inherits from Google.Desktop.File -->
<schema name="Google.Desktop.MediaFile" parent="Google.Desktop.File" description="Generic media file schema">
  <property name="width" description="Width of images and videos (in pixels)" type="VT_UI4" />
  <property name="height" description="Height of images and videos (in pixels)" type="VT_UI4" />
  <property name="data_rate" description="Average video bit rate for video files (in bits/sec)" type="VT_UI4" />
  <property name="bit_rate" description="Average audio bit rate in audio and video files (in bits/sec)" type="VT_UI4" />
  <property name="channels" description="Channel count for audio files" type="VT_UI4" />
  <property name="length" description="Time length of music and video files (in nanosec)" type="VT_UI8" />
  <property name="original_date" description="Original time stamp of media file from the media device" type="VT_DATE" />
  <property name="album_title" description="Album title for music files" type="VT_BSTR" />
  <property name="artist" description="Artist for media file" type="VT_BSTR" />
  <property name="genre" description="Genre/category for media file" type="VT_BSTR" />
  <property name="lyrics" description="Lyrics for music files" type="VT_BSTR" />
  <property name="track_number" description="Number of tracks for music files" type="VT_UI4" />
  <property name="comment" description="Comment about the audio file" type="VT_BSTR" />
  <property name="info_tip" description="Info tip as reported from the shell" type="VT_BSTR" />
  <property name="year_published" description="Year published for media" type="VT_UI4" />
</schema>

<!-- Generic text file schema. Inherits from Google.Desktop.File -->
<schema name="Google.Desktop.TextFile" parent="Google.Desktop.File" description="Generic text file schema" />

</schemas>
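Since each schema names a parent, the full set of properties for an event type comes from walking up the parent chain. The following sketch shows one way to collect the required properties for a schema. It assumes that required properties are inherited from parent schemas, uses a shortened excerpt of the definitions above (descriptions and optional properties mostly omitted), and the function name required_properties is hypothetical.

```python
import xml.etree.ElementTree as ET

# Abbreviated excerpt of the schema definitions, for illustration only.
SCHEMAS_XML = """
<schemas>
  <schema name="Google.Desktop.Indexable">
    <property name="content" type="VT_BSTR" required="true" />
    <property name="format" type="VT_BSTR" required="true" />
    <property name="native_size" type="VT_UI8" />
  </schema>
  <schema name="Google.Desktop.File" parent="Google.Desktop.Indexable">
    <property name="uri" type="VT_BSTR" required="true" />
    <property name="last_modified_time" type="VT_DATE" required="true" />
    <property name="title" type="VT_BSTR" />
  </schema>
  <schema name="Google.Desktop.TextFile" parent="Google.Desktop.File" />
</schemas>
"""


def required_properties(schemas_xml: str, schema_name: str) -> list:
    """List required property names for a schema, parent schemas first."""
    root = ET.fromstring(schemas_xml)
    by_name = {schema.get("name"): schema for schema in root.findall("schema")}
    required = []
    current = by_name.get(schema_name)
    while current is not None:
        own = [prop.get("name") for prop in current.findall("property")
               if prop.get("required") == "true"]
        required = own + required  # prepend so base-schema properties come first
        # A schema without a parent attribute ends the walk (lookup returns None).
        current = by_name.get(current.get("parent"))
    return required


print(required_properties(SCHEMAS_XML, "Google.Desktop.TextFile"))
```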
