


Introduction to Search Engines

You've heard the term "Search Engines" so many times, but do you know what it really means? Or better yet, do you know how to use the search engines to make your Web site more visible to potential customers? First, you must understand how search engines work before you can decide how to use them to your advantage.

Search engines are a primary way that people find information on the Internet. (Search engines are sometimes likened to the yellow or white pages of the Web.) Users request information from search engines and in return they receive a list of possible URLs that match their request. There are many kinds of search engines that provide Web site information based on many different kinds of criteria. A Web site with a good search engine listing may see a dramatic increase in traffic.

Everyone wants that good listing. Unfortunately, many Websites rank poorly in search engine results, or are not listed at all, because their owners fail to consider how search engines work.


How Search Engines Work
The term "search engine" is often used generically to describe both crawler-based search engines and human-powered directories. These two types of search engines gather their listings in radically different ways.

Crawler-Based Search Engines

Crawler-based search engines, such as Google, create their listings automatically. They "crawl" or "spider" the web, then people search through what they have found.

If you change your web pages, crawler-based search engines eventually find these changes, and that can affect how you are listed. Page titles, body copy and other elements all play a role.

Human-Powered Directories

A human-powered directory, such as the Open Directory, depends on humans for its listings. You submit a short description to the directory for your entire site, or editors write one for sites they review. A search looks for matches only in the descriptions submitted.

Changing your web pages has no effect on your listing. Things that are useful for improving a listing with a search engine have nothing to do with improving a listing in a directory. The only exception is that a good site, with good content, might be more likely to get reviewed for free than a poor site.

"Hybrid Search Engines" Or Mixed Results

In the web's early days, a search engine presented either crawler-based results or human-powered listings. Today, it is extremely common for both types of results to be presented. Usually, a hybrid search engine will favor one type of listing over another. For example, MSN Search is more likely to present human-powered listings from LookSmart. However, it also presents crawler-based results (as provided by Inktomi), especially for more obscure queries.

The Parts Of A Crawler-Based Search Engine

Crawler-based search engines have three major elements. First is the spider, also called the crawler. The spider visits a web page, reads it, and then follows links to other pages within the site. This is what it means when someone refers to a site being "spidered" or "crawled." The spider returns to the site on a regular basis, such as every month or two, to look for changes.

Everything the spider finds goes into the second part of the search engine, the index. The index, sometimes called the catalog, is like a giant book containing a copy of every web page that the spider finds. If a web page changes, then this book is updated with new information.

Sometimes it can take a while for new pages or changes that the spider finds to be added to the index. Thus, a web page may have been "spidered" but not yet "indexed." Until it is indexed -- added to the index -- it is not available to those searching with the search engine.

Search engine software is the third part of a search engine. This is the program that sifts through the millions of pages recorded in the index to find matches to a search and rank them in order of what it believes is most relevant.
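
The three parts described above can be sketched in a few lines of Python. This is only a toy illustration, not how any real search engine is built: the in-memory "web" (TOY_WEB), the function names, and the sample pages are all made up for the example, and the search step simply matches words without ranking them.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "/home": ("welcome to fox gardens landscaping", ["/services", "/contact"]),
    "/services": ("oriental landscaping and gardening services", ["/home"]),
    "/contact": ("contact fox gardens by phone", []),
    "/orphan": ("no page links here", []),
}

def spider(start_url, web):
    """Part 1: visit a page, read it, and follow its links to other pages."""
    seen, queue, pages = {start_url}, deque([start_url]), {}
    while queue:
        url = queue.popleft()
        text, links = web[url]
        pages[url] = text
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages

def build_index(pages):
    """Part 2: record, for each word, the set of pages that contain it."""
    index = {}
    for url, text in pages.items():
        for word in text.split():
            index.setdefault(word, set()).add(url)
    return index

def search(index, query):
    """Part 3: return the pages that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)

pages = spider("/home", TOY_WEB)
index = build_index(pages)
print(search(index, "landscaping"))
```

Note that "/orphan" never reaches the index, because no crawled page links to it. That is the same reason a real site with no inbound links may go undiscovered.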

How Search Engines Rank Web Pages
Search for anything using your favorite crawler-based search engine. Nearly instantly, the search engine will sort through the millions of pages it knows about and present you with ones that match your topic. The matches will even be ranked, so that the most relevant ones come first.

Of course, the search engines don't always get it right. Non-relevant pages make it through, and sometimes it may take a little more digging to find what you are looking for. But, by and large, search engines do an amazing job.

As WebCrawler founder Brian Pinkerton puts it, "Imagine walking up to a librarian and saying, 'travel.' They’re going to look at you with a blank face."

OK -- a librarian's not really going to stare at you with a vacant expression. Instead, they're going to ask you questions to better understand what you are looking for.

Unfortunately, search engines don't have the ability to ask a few questions to focus your search, as a librarian can. They also can't rely on judgment and past experience to rank web pages, in the way humans can.

So, how do crawler-based search engines go about determining relevancy, when confronted with hundreds of millions of web pages to sort through? They follow a set of rules, known as an algorithm. Exactly how a particular search engine's algorithm works is a closely-kept trade secret. However, all major search engines follow the general rules below.

Location, Location, Location...and Frequency

One of the main rules in a ranking algorithm involves the location and frequency of keywords on a web page. Call it the location/frequency method, for short.

Remember the librarian mentioned above? They need to find books to match your request of "travel," so it makes sense that they first look at books with travel in the title. Search engines operate the same way. Pages with the search terms appearing in the HTML title tag are often assumed to be more relevant than others to the topic.

Search engines will also check to see if the search keywords appear near the top of a web page, such as in the headline or in the first few paragraphs of text. They assume that any page relevant to the topic will mention those words right from the beginning.

Frequency is the other major factor in how search engines determine relevancy. A search engine will analyze how often keywords appear in relation to other words in a web page. Those with a higher frequency are often deemed more relevant than other web pages.
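
To make the location/frequency idea concrete, here is a minimal sketch in Python. Real ranking algorithms are trade secrets, so everything specific here is an assumption made for illustration: the function name, the weights (3.0 for a title match, 2.0 for a match near the top), and the 50-word cutoff for "near the top".

```python
def location_frequency_score(title, body, keyword, top_words=50):
    """Score one page for one keyword using location and frequency."""
    keyword = keyword.lower()
    title_words = title.lower().split()
    body_words = body.lower().split()

    score = 0.0
    # Location: keyword appears in the page title (hypothetical weight).
    if keyword in title_words:
        score += 3.0
    # Location: keyword appears near the top of the page (hypothetical weight).
    if keyword in body_words[:top_words]:
        score += 2.0
    # Frequency: share of the body's words that are the keyword.
    if body_words:
        score += body_words.count(keyword) / len(body_words)
    return score

pages = {
    "guide": ("Travel Guide", "travel tips for budget travel"),
    "blog": ("Our Blog", "we like food and sometimes travel"),
}
ranked = sorted(
    pages,
    key=lambda name: location_frequency_score(*pages[name], "travel"),
    reverse=True,
)
print(ranked)
```

The page with "travel" in its title and a higher share of matching words sorts first. A real engine would combine many more signals than these three.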



Major Search Engines

Why are these considered to be "major" search engines? Because they are either well-known or well-used.

For webmasters, the major search engines are the most important places to be listed, because they can potentially generate so much traffic.

For searchers, well-known, commercially-backed search engines generally mean more dependable results. These search engines are more likely to be well-maintained and upgraded when necessary, to keep pace with the growing web.

All search engines have the basic parts described above, but there are differences in how these parts are tuned. That is why the same search on different search engines often produces different results.

How do I get my site considered for a listing with a search engine?
Depending on the search engine, there are two common ways they can discover your site:

  • You hand submit a request for your Web site to be included or reviewed in their index.
  • Your site is referenced by another Web site that is already listed in a search engine's index.

Search engines discover your site when they crawl or spider the Web. This means that they automatically read Web page text and then follow most of the normal kinds of HTML links. They rely on Web site text to record the nature and content of the site.
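
As a rough illustration of "read the text, then follow the links," the sketch below uses Python's standard html.parser module to pull the visible text and the link targets out of one page. The class name and the sample page are made up for the example, and a real spider does far more (fetching pages, respecting site rules, handling malformed markup).

```python
from html.parser import HTMLParser

class LinkAndTextParser(HTMLParser):
    """Collects link targets and text from a single HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased from HTMLParser.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Keep only non-empty runs of text.
        if data.strip():
            self.text_parts.append(data.strip())

# Hypothetical sample page.
page = (
    "<html><body><h1>Fox Gardens</h1>"
    "<p>Oriental landscaping.</p>"
    '<a href="/services.html">Services</a> '
    '<a href="/contact.html">Contact</a>'
    "</body></html>"
)

parser = LinkAndTextParser()
parser.feed(page)
print(parser.links)
print(parser.text_parts)
```

The links list is what a spider would visit next, and the text is what it would record about the page.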

What can I do to be included in a search engine's index?
Although there are many different types of search engines, there are ways to refine your Web site so that it is picked up or ranked well by search engines in general. These key factors include:

  • Choosing your keywords carefully.
  • Thoroughly applying the keywords in the text of the site.
  • Testing the site's rankings and updating the site often.
  • Never trying to fool (spam) the search engines.
  • Providing quality content, which nothing replaces.
  • Submitting your site to the proper engines.


How do I choose my keywords?
Many automated search engines determine the relevancy of your Web page based on keywords used in the Web site text. Keywords are the two or three words that you use to identify your site to users. They are also the words in the HTML tags--specifically the META tags--used in your document. Meta tags apply to an entire document, and there are many of them. The ones most useful for search engines are the description and keywords tags. The description tag displays a description of your page when it is relevant to a search. The keywords tag provides keywords for the search engine to associate with your page. For example, here is a simple Web page header:

<HEAD>

<TITLE> Fox Gardens </TITLE>

<META name="description" content="An oriental landscaping company">

<META name="keywords" content="landscaping, oriental landscaping, oriental landscaping services, landscape, oriental gardening, gardening services">

</HEAD>

A search engine that displays your site would show:
Fox Gardens - An oriental landscaping company

The words you included in the keywords tag are a subset of the words you might choose to use. In this example, landscaping, oriental landscaping, and oriental landscaping services are samples of keywords. Be specific as you select them. Choosing the keyword landscaping alone, for example, does not narrow the search enough for you to be well ranked if the user searches on oriental landscaping. Choosing keywords is critical--one word can completely change the relevance ranking of your site.
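
To see how a program might read these tags, here is a small sketch using Python's standard html.parser module. It assumes a Fox Gardens header that contains both a description tag and a keywords tag. The class name and the "title - description" listing format are illustrative only, since each search engine decides for itself what to display.

```python
from html.parser import HTMLParser

class HeadParser(HTMLParser):
    """Collects the page title and the named META tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content") is not None:
            self.meta[attrs["name"].lower()] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

# Assumed sample header, modeled on the Fox Gardens example.
header = """
<HEAD>
<TITLE> Fox Gardens </TITLE>
<META name="description" content="An oriental landscaping company">
<META name="keywords" content="landscaping, oriental landscaping, oriental gardening">
</HEAD>
"""

parser = HeadParser()
parser.feed(header)

title = parser.title.strip()
description = parser.meta.get("description", "")
keywords = [k.strip() for k in parser.meta.get("keywords", "").split(",") if k.strip()]

listing = f"{title} - {description}"
print(listing)
print(keywords)
```

Running the same parser over your own pages is a quick way to confirm that the title, description, and keywords are actually present in the header.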

To find out the best keywords to use, go to your desired search site and search on the terms you think your desired audience will use.


How do I apply the keywords to my site?
The list below summarizes many of the known tips on using keywords to boost the relevancy of your Web page to search engines.

  • The most important place to use your keywords is in the title of your Web page. Some search engines don't search the text of the file, nor do they read meta tags.
  • Provide relevant information and keywords on the first page of your document.
  • If your title or page is a graphic, make sure that you place the title in the tag, since search engines cannot interpret words in graphics nor do they follow graphic links. Also use keyword meta tags for pages with sparse text. Not all engines review the content in these tags, but this increases the chances of your keywords being caught by the engine.
  • Apply the keywords in keyword meta-tags, including variations on the words, misspellings, plurals, and any other combinations. For example, use photograph, photography, photo, etc.
  • Use lower case for meta tag keywords. While some search engines are case sensitive, this will produce the most results without tripping a search engine's spam sensor.
  • Repeat the keywords in your content frequently and in various ways. Many of the engines weight the importance of your keywords based upon the frequency of their use. It is especially important that the first paragraphs of your document contain your title's keywords.
  • Use a clear description meta tag. (The description meta tag is used by search engines to describe your site in the listing.) For example: Rare old photographs for sale.
  • Some engines ignore the comment tag, others include it. So, using your keywords in your comments may be useful. Avoid repeatedly inserting your keywords in comments, however, as that would be considered spamming or trying to fool the search engine.
  • Search engines create relevancy ratings based on when keywords occur in your site. If you use tables your keywords will appear later in your documents rather than earlier. Use of frames also impacts search engine ratings.
  • Keywords that are embedded in JavaScript are usually ignored by search engines. Graphics and image maps are ignored by search engines.
  • Graphic-only links can't be followed by the search engine, so make sure you've provided an alternative route for the engine. Some sites submit a simple text site map to search engines to deal with this issue.
  • Symbols and dynamic content are not followed when a site is spidered.
  • If you'd like detailed information on how each search engine uses keywords, refer to the SearchEngineWatch website (www.searchenginewatch.com).
  • Include your site identification URL as the last keyword in your meta tags.
  • Quality counts! The quality and nature of the content you provide makes all of the difference to the audience you draw to your site. Some Websites are discovered and listed by search engines before the engines have reviewed their submission requests. They are discovered because other sites with high-quality content link to them.
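
A few of the tips above are easy to check automatically. The sketch below is a hypothetical checklist, not a tool any search engine provides: for each keyword it reports whether the keyword appears in the title, in the first paragraph, and in the keywords meta tag. The title and paragraph checks use simple substring matching, which is an assumption made to keep the example short.

```python
def audit_keywords(title, first_paragraph, meta_keywords, keywords):
    """Report where each keyword appears on a page."""
    title_l = title.lower()
    paragraph_l = first_paragraph.lower()
    # The keywords tag is a comma-separated list; compare entries exactly.
    meta_terms = [term.strip().lower() for term in meta_keywords.split(",")]

    report = {}
    for keyword in keywords:
        k = keyword.lower()
        report[keyword] = {
            "in_title": k in title_l,
            "in_first_paragraph": k in paragraph_l,
            "in_meta_keywords": k in meta_terms,
        }
    return report

report = audit_keywords(
    title="Rare Old Photographs",
    first_paragraph="We sell rare old photographs and prints.",
    meta_keywords="photograph, photography, photo",
    keywords=["photographs", "photo"],
)
for keyword, checks in report.items():
    print(keyword, checks)
```

Here "photographs" appears in the title and first paragraph but is missing from the keywords tag, which suggests adding the plural as one of the variations.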


How do I test my site's ranking?
The amount of time that it takes to index your site depends on the search engine. The quickest response time for your site to be indexed is about two days. Two days is extremely fast when you consider the enormous number of Websites that are submitted to search engines every day. With many of the other engines, your site can take 2-4 weeks to appear, if it appears at all. Some of the search engines offer you the opportunity to pay to have your site considered more quickly than the standard time period (called express submissions).

Search engines routinely update and change their criteria for listing. They also regularly discard sites from their index, so your listings require constant monitoring. Keywords for your industry also change often. We recommend that you check for the most popular or appropriate keywords for your site and resubmit about twice a month. This can benefit your site since fresh submissions are often higher in ranking than old ones. We suggest that you monitor every week or two and resubmit your pages at minimum every time you make significant changes to your site.

Can I fool search engines? (spamming)
Web designers have developed many ways to try to fool search engines into ranking their sites well and including them in their site index. Some of these methods include repeating keywords over and over again in the meta tags for the site (called stuffing), using colored text on the same color background for keywords and content, hiding text using cgi, using very tiny fonts, etc. These tactics are designed to bombard the search engine and deceive it about the quality and nature of the content.

These actions can seriously degrade the value of search engines. Search engines are savvy to these tactics and if they catch you engaging in these practices they will disallow your submissions to their index.
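
Before you submit, it can be worth checking your own keywords tag for accidental stuffing. The sketch below is a rough self-check only. The function name and the threshold of three repeats are assumptions, since search engines do not publish the rules their spam filters use.

```python
from collections import Counter

def repeated_keywords(meta_keywords, max_repeats=3):
    """Return the keywords that appear more than max_repeats times."""
    terms = [t.strip().lower() for t in meta_keywords.split(",") if t.strip()]
    counts = Counter(terms)
    return {term: n for term, n in counts.items() if n > max_repeats}

stuffed = "landscaping, landscaping, Landscaping, landscaping, gardening"
print(repeated_keywords(stuffed))
```

An empty result means no keyword exceeded the chosen threshold. It does not mean a search engine would accept the page.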

Where do I submit my site?
The search engines below are all excellent choices to start with.

Google
http://www.google.com

Voted four times Most Outstanding Search Engine by Search Engine Watch readers, Google has a well-deserved reputation as the top choice for those searching the web. The crawler-based service provides both comprehensive coverage of the web along with great relevancy. It's highly recommended as a first stop in your hunt for whatever you are looking for.

Google provides the option to find more than web pages, however. Using the links at the top of the search box on the Google home page, you can easily seek out images from across the web, follow discussions that are taking place on Usenet newsgroups, locate news information or perform product searching. Using the More link provides access to human-compiled information from the Open Directory (see below), catalog searching and other services.

Google is also known for the wide range of features it offers, such as cached links that let you "resurrect" dead pages or see older versions of recently changed ones. It offers excellent spell checking, easy access to dictionary definitions, integration of stock quotes, street maps, telephone numbers and more. See Google's help page for an entire rundown on some of these features. The Google Toolbar has also won a popular following for the easy access it provides to Google and its features directly from the Internet Explorer browser.

In addition to Google's unpaid editorial results, the company also operates its own advertising programs. The cost-per-click AdWords program places ads on Google as well as some of Google's partners. Similarly, Google is also a provider of unpaid editorial results to some other search engines. For a list of major partnerships, see the Search Providers Chart.

Google was originally a Stanford University project by students Larry Page and Sergey Brin called BackRub. By 1998, the name had been changed to Google, and the project jumped off campus and became the private company Google. It remains privately held today.

Yahoo
http://www.yahoo.com

Launched in 1994, Yahoo is the web's oldest "directory," a place where human editors organize Websites into categories. However, in October 2002, Yahoo made a giant shift to crawler-based listings for its main results. These came from Google until February 2004. Now, Yahoo uses its own search technology. Learn more in this recent review from our SearchDay newsletter, which also provides some updated submission details.

In addition to excellent search results, you can use tabs above the search box on the Yahoo home page to seek images, Yellow Page listings or use Yahoo's excellent shopping search engine. Or visit the Yahoo Search home page, where even more specialized search options are offered.

The Yahoo Directory still survives. You'll notice "category" links below some of the sites listed in response to a keyword search. When offered, these will take you to a list of Websites that have been reviewed and approved by a human editor.

It's also possible to do a pure search of just the human-compiled Yahoo Directory, which is how the old or "classic" Yahoo used to work. To do this, search from the Yahoo Directory home page, as opposed to the regular Yahoo.com home page. Then you'll get both directory category links ("Related Directory Categories") and "Directory Results," which are the top web site matches drawn from all categories of the Yahoo Directory.

Sites pay a fee to be included in the Yahoo Directory's commercial listings, though they must meet editor approval before being accepted. Non-commercial content is accepted for free. Yahoo's content acquisition program (CAP) also offers paid inclusion, where sites can pay to be included in Yahoo's crawler-based results. This doesn't guarantee ranking, Yahoo promises. The CAP program also brings in content from non-profit organizations for free.

Like Google, Yahoo sells paid placement advertising links that appear on its own site and which are distributed to others. Yahoo purchased Overture in October 2003.

Overture was called GoTo until late 2001. More about it can be found on the Paid Listings Search Engines page. Overture purchased AllTheWeb (see below) in March 2003 and acquired AltaVista (see below) in April 2003. Now Yahoo owns these, gained through its purchase of Overture.

Technology from AltaVista and AllTheWeb was combined with that of Inktomi, a crawler-based search engine that grew out of UC Berkeley and then launched as its own company in 1996, to make the current Yahoo crawler. Yahoo purchased Inktomi in March 2003.

Ask
http://www.ask.com

Ask Jeeves initially gained fame in 1998 and 1999 as being the "natural language" search engine that let you search by asking questions and responded with what seemed to be the right answer to everything.

In reality, technology wasn't what made Ask Jeeves perform so well. Behind the scenes, the company at one point had about 100 editors who monitored search logs. They then went out onto the web and located what seemed to be the best sites to match the most popular queries.

In 1999, Ask acquired Direct Hit, which had developed the world's first "click popularity" search technology. Then, in 2001, Ask acquired Teoma's unique index and search relevancy technology. Teoma was based upon the clustering concept of subject-specific popularity.

Today, Ask depends on crawler-based technology to provide results to its users. These results come from the Teoma algorithm, now known as ExpertRank.

Strongly Consider
The search engines below are other good choices to consider when searching the web.

AllTheWeb.com
http://www.alltheweb.com

Powered by Yahoo, you may find AllTheWeb a lighter, more customizable and pleasant "pure search" experience than you get at Yahoo itself. The focus is on web search, but news, picture, video, MP3 and FTP search are also offered.

AllTheWeb.com was previously owned by a company called FAST and used as a showcase for that company's web search technology. That's why you may sometimes hear AllTheWeb.com referred to as FAST or FAST Search. However, the search engine was purchased by search provider Overture (see above) in late April 2003, then later became Yahoo's property when Yahoo bought Overture. It no longer has a connection with FAST.

AOL Search
http://www.aol.com

AOL Search provides users with editorial listings that come from Google's crawler-based index. Indeed, the same search on Google and AOL Search will come up with very similar matches. So, why would you use AOL Search? Primarily because you are an AOL user. The "internal" version of AOL Search provides links to content only available within the AOL online service. In this way, you can search AOL and the entire web at the same time. The "external" version lacks these links. Why wouldn't you use AOL Search? If you like Google, many of Google's features such as "cached" pages are not offered by AOL Search.

HotBot
http://www.hotbot.com

HotBot provides easy access to the web's three major crawler-based search engines: Yahoo, Google and Teoma. Unlike a meta search engine, it cannot blend the results from all of these crawlers together. Nevertheless, it's a fast, easy way to get different web search "opinions" in one place.

HotBot's "choose a search engine" interface was introduced in December 2002. However, HotBot has a long history as a search brand before this date.

HotBot debuted in May 1996 and gained a strong following among serious searchers for the quality and comprehensiveness of its crawler-based results, which were provided by Inktomi at the time. It also caught the attention of experienced web users and techies, especially for the unusual colors and interface it continues to sport today.

HotBot gained more notoriety when it switched over to using Direct Hit's "clickthrough" results for its main listings in 1999. Direct Hit was then one of the "hot" search engines that had recently appeared. Unfortunately, the quality of Direct Hit's results couldn't match those of another "hot" player that had debuted at the same time, Google. HotBot's popularity began to drop.

Even worse, HotBot also suffered by being owned by Lycos (now Terra Lycos). Lycos had acquired HotBot when it purchased Wired Digital in October 1998. Lycos failed to make search a priority on its flagship Lycos site as well as HotBot through much of 1999 and 2000, as it focused instead on adding "portal" features. The company refocused on search in late 2001, making significant improvements to the Lycos site and, as noted, reworking the HotBot site at the end of 2002.

While search engines are still a primary method for drawing traffic, don't forget traditional media, newsgroup postings, web directories, and alternative forms that can sometimes be far more effective than search engines. You can also use personal communications to lists for your industry/area. This kind of approach produces less traffic, but the audience is pre-qualified.




Search Engine Optimization (SEO)

Search engine optimization (SEO) is the process of improving the volume and quality of traffic to a web site from search engines via "natural" ("organic" or "algorithmic") search results. Usually, the earlier a site is presented in the search results, or the higher it "ranks", the more searchers will visit that site. SEO can also target different kinds of search, including image search, local search, and industry-specific vertical search engines.

As a marketing strategy for increasing a site's relevance, SEO considers how search algorithms work and for what people search. SEO efforts may involve a site's coding, presentation, and structure, as well as fixing problems that could prevent search engine indexing programs from fully spidering a site. Other, more noticeable efforts may include adding unique content to a site, ensuring that content is easily indexed by search engine robots, and making the site more appealing to users. Another class of techniques, known as "Black hat" SEO or spamdexing, use methods such as link farms and keyword stuffing that tend to harm search engine user experience. Search engines look for sites that employ these techniques and may remove their listings.

The acronym "SEO" can also refer to "search engine optimizers", a term adopted by an industry of consultants who carry out optimization projects on behalf of clients, and by employees who perform SEO services in-house. Search engine optimizers may offer SEO as a stand-alone service or as a part of a broader marketing campaign. Because effective SEO may require changes to the HTML source code of a site, SEO tactics may be incorporated into web site development and design. The term "search engine friendly" may be used to describe web site designs, menus, content management systems and shopping carts that are easy to optimize.


Content courtesy of SearchEngineWatch, one of the most comprehensive Web sites providing detailed information on existing search engines.