Monday, December 21, 2009

How to Search Google Efficiently

Basic Boolean

Whenever you search for more than one keyword at a time, a search engine has a default method for handling those keywords. Will the engine search for both keywords or for either keyword? The answer is called the Boolean default; search engines can default to Boolean AND (it'll search for both keywords) or Boolean OR (it'll search for either keyword). Of course, even if a search engine defaults to searching for both keywords (AND), you can usually give it a special command to instruct it to search for either keyword (OR).

Google's Boolean default is AND; that means if you enter query words without modifiers, Google will search for all of them. If you search for:

XML Java "Web Services"

Google will return pages that contain all of these terms. If you want to specify that any one of the terms is acceptable, put an OR (in capital letters) between the items:

XML OR Java OR "Web Services"

If you want to require one term plus at least one of two or more other terms, group the alternatives with parentheses, like this:

XML (Java OR "Web Services")

This query searches for the word "Java" or the phrase "Web Services" along with the word "XML." A stand-in for OR borrowed from the computer programming realm is the | (pipe) character, as in:

XML (Java | "Web Services")

If you want to specify that a query item must not appear in your results, put a - (minus sign or dash) directly in front of it:

XML Java -"Web Services"

This will search for pages that contain both the words "XML" and "Java" but not the phrase "Web Services."
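If you build queries like these from a script, the rules above fit in a few small helpers. This is only a sketch: the function names (term, all_of, any_of, none_of) are made up for illustration, and the code just assembles the query text; it does not talk to Google.

```python
def term(text):
    """Wrap multi-word phrases in quotes so they are searched as exact phrases."""
    return f'"{text}"' if " " in text else text

def all_of(*terms):
    """Default AND: terms simply separated by spaces."""
    return " ".join(term(t) for t in terms)

def any_of(*terms):
    """Parenthesized group of alternatives joined with an uppercase OR."""
    return "(" + " OR ".join(term(t) for t in terms) + ")"

def none_of(*terms):
    """Excluded terms: a minus sign directly in front of each one."""
    return " ".join("-" + term(t) for t in terms)

print(all_of("XML", "Java", "Web Services"))                  # XML Java "Web Services"
print(f'{term("XML")} {any_of("Java", "Web Services")}')      # XML (Java OR "Web Services")
print(f'{all_of("XML", "Java")} {none_of("Web Services")}')   # XML Java -"Web Services"
```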

The Special Syntaxes

inurl:
If you include [inurl:] in your query, Google will restrict the results to documents containing that word in the url. For instance, [inurl:google search] will return documents that mention the word “google” in their url, and mention the word “search” anywhere in the document (in the url or not). Note there can be no space between the “inurl:” and the following word.
Putting “inurl:” in front of every word in your query is equivalent to putting “allinurl:” at the front of your query: [inurl:google inurl:search] is the same as [allinurl: google search].


allinurl:
If you start a query with [allinurl:], Google will restrict the results to those with all of the query words in the url. For instance, [allinurl: google search] will return only documents that have both “google” and “search” in the url.
Note that [allinurl:] works on words, not url components. In particular, it ignores punctuation. Thus, [allinurl: foo/bar] will restrict the results to pages with the words “foo” and “bar” in the url, but won’t require that they be separated by a slash within that url, that they be adjacent, or that they be in that particular word order. There is currently no way to enforce these constraints.
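To make the "words, not url components" point concrete, here is a toy model of that matching rule. It is an assumption-laden sketch, not Google's actual tokenizer: it treats any run of letters or digits as a word, ignores case, and uses hypothetical helper names (url_words, matches_allinurl) and example.com urls.

```python
import re

def url_words(text):
    """Split text into lowercase words, discarding all punctuation (assumed rule)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def matches_allinurl(query, url):
    """True when every word of the query appears somewhere in the url, in any order."""
    return url_words(query) <= url_words(url)

# "foo" and "bar" are both present, though not adjacent and in reverse order.
print(matches_allinurl("foo/bar", "http://example.com/bar/x/foo.html"))  # True
# "bar" is missing from this url.
print(matches_allinurl("foo/bar", "http://example.com/foo/"))            # False
```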

intext:
Searches only body text (i.e., ignores link text, URLs, and titles). There's an allintext: variation, but it doesn't mix well with other special syntaxes. While its uses are limited, intext: is perfect for finding query words that might be too common in URLs or link titles.
Eg: intext:"yahoo.com"

intext:html

cache:
The query [cache:] will show the version of the web page that Google has in its cache. For instance, [cache:www.google.com] will show Google’s cache of the Google homepage. Note there can be no space between the “cache:” and the web page url. If you include other words in the query, Google will highlight those words within the cached document. For instance, [cache:www.google.com web] will show the cached content with the word “web” highlighted. This functionality is also accessible by clicking on the “Cached” link on Google’s main results page.

link:
The query [link:] will list webpages that have links to the specified webpage. For instance, [link:www.google.com] will list webpages that have links pointing to the Google homepage. Note there can be no space between the “link:” and the web page url.

filetype:
Searches by filename suffix or extension. These usually, but not necessarily, correspond to different file types. I like to make this distinction, because searching for filetype:htm and filetype:html will give you different result counts, even though they're the same file type. You can even search for different page generators, such as ASP, PHP, CGI, and so forth, presuming the site isn't hiding them behind redirection and proxying. Google indexes several different Microsoft formats, including PowerPoint (PPT), Excel (XLS), and Word (DOC).
Eg: homeschooling filetype:pdf

"leading economic indicators" filetype:ppt

related:
The query [related:] will list web pages that are “similar” to a specified web page. For instance, [related:www.google.com] will list web pages that are similar to the Google homepage. Note there can be no space between the “related:” and the web page url.

info:
The query [info:] will present some information that Google has about that web page. For instance, [info:www.google.com] will show information about the Google homepage. Note there can be no space between the “info:” and the web page url.

define:
The query [define:] will provide a definition of the words you enter after it, gathered from various online sources. The definition will be for the entire phrase entered (i.e., it will include all the words in the exact order you typed them).

stocks:
If you begin a query with the [stocks:] operator, Google will treat the rest of the query terms as stock ticker symbols, and will link to a page showing stock information for those symbols. For instance, [stocks: intc yhoo] will show information about Intel and Yahoo. (Note you must type the ticker symbols, not the company name.)

site:
If you include [site:] in your query, Google will restrict the results to those websites in the given domain. For instance, [help site:www.google.com] will find pages about help within www.google.com. [help site:com] will find pages about help within .com urls. Note there can be no space between the “site:” and the domain.
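If you want to turn a site: query into a link you can open or share, a short helper can do it. This is a minimal sketch: the function name search_url is made up, and it assumes the familiar https://www.google.com/search?q=... url form. It also strips stray whitespace from the domain, since no space is allowed after “site:”.

```python
from urllib.parse import urlencode

def search_url(query, site=None):
    """Build a Google search url, optionally restricted with the site: syntax."""
    if site:
        # No space is allowed between "site:" and the domain.
        query = f"{query} site:{site.strip()}"
    # urlencode escapes spaces as "+" and ":" as "%3A".
    return "https://www.google.com/search?" + urlencode({"q": query})

print(search_url("help", site="www.google.com"))
# https://www.google.com/search?q=help+site%3Awww.google.com
```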

allintitle:
If you start a query with [allintitle:], Google will restrict the results to those with all of the query words in the title. For instance, [allintitle: google search] will return only documents that have both “google” and “search” in the title.

intitle:
If you include [intitle:] in your query, Google will restrict the results to documents containing that word in the title. For instance, [intitle:google search] will return documents that mention the word “google” in their title, and mention the word “search” anywhere in the document (in the title or not). Note there can be no space between the “intitle:” and the following word.
Putting [intitle:] in front of every word in your query is equivalent to putting [allintitle:] at the front of your query: [intitle:google intitle:search] is the same as [allintitle: google search].
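The equivalence between the “all” form and the per-word form is mechanical, so it can be written down as a tiny rewrite. The helper name expand_all is hypothetical, and the sketch assumes the query consists of nothing but an all... operator followed by plain words.

```python
def expand_all(query):
    """Rewrite 'allintitle: a b' as 'intitle:a intitle:b' (same for allinurl:)."""
    prefix, _, rest = query.partition(":")
    if not prefix.startswith("all"):
        raise ValueError("expected a query starting with an all... operator")
    single = prefix[len("all"):]  # "allintitle" -> "intitle"
    # The per-word form must have no space after the colon.
    return " ".join(f"{single}:{word}" for word in rest.split())

print(expand_all("allintitle: google search"))  # intitle:google intitle:search
print(expand_all("allinurl: google search"))    # inurl:google inurl:search
```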


daterange:
Limits your search to a particular date or range of dates that a page was indexed. It's important to note that the search is not limited to when a page was created, but when it was indexed by Google. So a page created on February 2 and not indexed by Google until April 11 could be found with a daterange: search for April 11. Note that daterange: works with Julian dates (a running count of days), not the Gregorian calendar dates we use every day.
Eg: "George Bush" daterange:2452389-2452389

neurosurgery daterange:2452389-2452389
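Working out Julian dates by hand is tedious, so here is a small conversion sketch. It relies on the fact that Python's date.toordinal() counts days from January 1 of year 1, and that adding 1721425 turns that count into the Julian day number; the function names are made up for illustration. By this arithmetic, the 2452389 used in the examples above corresponds to April 24, 2002.

```python
from datetime import date

# Offset between Python's proleptic Gregorian ordinal and the Julian day number.
JDN_OFFSET = 1721425

def to_julian_day(d):
    """Convert a calendar date to its (integer) Julian day number."""
    return d.toordinal() + JDN_OFFSET

def daterange(start, end):
    """Build a daterange: term covering start through end, inclusive."""
    return f"daterange:{to_julian_day(start)}-{to_julian_day(end)}"

day = date(2002, 4, 24)
print(to_julian_day(day))    # 2452389
print(daterange(day, day))   # daterange:2452389-2452389
```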

