Mac OS Journal

HTML: Above and Beyond
November 2000 || Volume 01, Issue 04

Arachnophobia: Overcoming the Fear of [Search Engine] Spiders.

Remember the old saying... "Build it... and they will come." Our question today is... "But will they find you?"

Hello fellow Mac Lovers! Welcome back! I trust your website is in tip-top shape now. Are you ready to generate some traffic to your site? You are? Good! Today we're going to talk about "Everything You Wanted to Know About the Search Engines (and a few things you probably didn't want to know!)"

By now, most of you have figured out that I only give you just enough information to wet your whistle... the rest is up to you. I've always believed that one learns better by doing... and research is a valuable and effective teacher. So, having said that, I'm going to just give you the "tip of the iceberg" today -- which will also help me keep this article under 2000 words! [Editor's Note: Yeah Right!]

OK... let's get started!

If you're not listed within the first two pages of the search engines you might as well not be listed at all! That's the cyberspace equivalent of the proverbial needle in a haystack. Did you know that more than 85% of surfers rely on Search Engines to find what they're looking for?

It's rare that anyone will look past the first 30 results (3 pages). This is perfectly understandable because the most relevant sites are always listed first. So if the surfer doesn't find what they want within the first 20 - 30 listings, they'll simply do a new search.

Every search engine wants the same thing: Websites that are filled with good, useful content. All search engines base their ranking methods on this criterion. This is a good thing... after all, you don't want the search engines filled up with a bunch of poor quality sites. You're looking for "quality" information.

Note: While there are ways to get listed with the Search Engines if you have a "framed" site, I won't be covering that here. If you'll recall, I advised against using frames, so if you do use frames, you'll have to do your own research.

What is a search engine?

Search engines use indexing software agents, more commonly referred to as robots or spiders. These agents are programmed to constantly "crawl" the Web in search of new or updated pages. They go from URL to URL, following links, with the goal of visiting every Web site they can reach.

When a spider crawls a site, it makes a record of the full text of every page within the site. Next, it continues on its journey to visit all of the external links. This is how search engines are able to find your site regardless of whether or not you register your URL with them. However, URL submission does speed up the process because it prompts the agent to visit and index your site instead of waiting for it to locate you.

Before we continue with how to get listed, let me give you some statistics...

  • The 9th GVU User Survey says that search engines are still being used by more than 85 percent of surfers to find things.
  • In a study released by ActivMedia Research in September 1999, Search Engine Positioning was ranked as the #1 website promotional method used by eCommerce sites.
  • Only 7% of Internet users look past the first three pages of search results.
  • A study, conducted by search company BrightPlanet, estimates that the "inaccessible" (invisible) part of the web is about 500 times larger than what search engines already provide access to. To put that another way, BrightPlanet estimates there are about 500 billion pages of information available on the web, and only 1/500 of that information can be reached with traditional search engines.
  • Google currently claims to have indexed (or to know about) 1 billion web pages, making it the largest crawler-based search engine. (Latest estimate of webpages on the World Wide Web: 1 to 2.1 billion)

Top Ways Websites are Discovered*

  • Banner ads: 1%
  • Targeted email: 1.2%
  • TV spots: 1.4%
  • "By accident": 2.1%
  • Magazine ads: 4.4%
  • Word-of-mouth: 20%
  • Random Surfing: 20%
  • Search Engines: 46%

*Source: IMT Strategies - imtstrategies.com

You can see that just getting your website listed in the search engines is not enough. In order to get significant traffic from the search engines, your website must be listed within the top 30 search results (preferably the top 20).

More good reading can be found here...

Internet Exceeds 2 Billion Pages: Did you know that a recent study estimates the size of the surface web to be 2 billion pages? (Cyveillance, July 10, 2000)

Invisible Web & Database Search Engines: The "Invisible" Web.

The Major Search Engines

While there are many more search engines than I've included in the list below, I recommend focusing on these major players: AltaVista, Google, WebCrawler, Lycos, Excite, Looksmart, AskJeeves, AOL NetFind, Netscape, HotBot, MSN, Go, Snap (now NBCi), Northern Light, Fast Search, Canada, Dogpile, FindWhat and Chubba. Check out The Top 100 Search Sites.

Get Global! Tell the Whole World Where to Find You.

Don't forget about the foreign search engines. These can be very beneficial if you have a website or product of global interest. Here are a few of the more popular ones... SurfChina, Yahoo Chinese, Excite Japan, Goo, Web.de, Altavista Germany, Virgilio, Arianna, Cade, Iguana, Explore-Mex.

Your Home Page Must Be Search Engine Friendly

Now.... repeat after me.... "My Home Page Needs Some Text"!

That Flash animation on your home page probably looks awesome -- but be warned -- it could be the death of your site on the search engines. Since search engines rely on readable and relevant content for their ranking determinations, keyword-rich body text is mandatory. So, if your site's text appears as a graphic, or if your site is filled with Flash animation graphics, you'll sabotage your chances for a prime ranking. It doesn't matter how good you look if no one sees you!

I recommend making a site map page with text links to every page on your site. Then, use this page to submit your site so you can be assured that all the links will be followed by the spiders.
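
If you'd rather not type a site map by hand, here is a minimal Python sketch that builds one. The page names and titles are hypothetical, made up for illustration; swap in your own. It only shows the idea of a page made of plain text links, nothing more.

```python
from html import escape

def build_site_map(pages):
    """Return a plain HTML page with one text link per (url, title) pair."""
    items = "\n".join(
        f'  <li><a href="{escape(url, quote=True)}">{escape(title)}</a></li>'
        for url, title in pages
    )
    return (
        "<html>\n<head><title>Site Map</title></head>\n<body>\n"
        "<h1>Site Map</h1>\n<ul>\n" + items + "\n</ul>\n</body>\n</html>\n"
    )

# Hypothetical pages, for illustration only.
pages = [
    ("index.html", "Home"),
    ("tires.html", "Snow Tires"),
    ("contact.html", "Contact Us"),
]
site_map_html = build_site_map(pages)
```

Save the result as something like sitemap.html and link to it from your home page.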

So, How Do We Do This?

Let's Start with Meta Tags, Keywords, and Key Phrases.

What's the difference between a keyword and a key phrase? It can mean the difference between being found among the first 20 or 30 listings or being buried under literally thousands of other listings. Most surfers learn very quickly that a search for "car+Toyota+celica+Vancouver" produces much better results than "car". The more specific your "key phrases" are, the better your chances of being found by someone who is looking for what you have!

Build rich "Key Phrases", not "Keywords": Most surfers search for a phrase rather than just a single word. Check your site stats and see for yourself.

This is one of those times when more is better. You could have a single web page and fill it with hundreds of key words and phrases but it probably won't get a high ranking on a search engine because the actual "relevance" of each individual key word is low. It is much better to have a hundred or more web pages on your site, with a few key words repeated three or four times on each. That way, the relevance of those particular key words is high for the specific page on which they appear. Better still is to have a key word/phrase in the page title, meta tags, the first heading and as the first "bold" word. Each page should also contain text which is relevant to the subject, (at least 100 words), otherwise a search engine is likely to ignore it or consider it as "spam". This is especially applicable to your Index page. If it contains only a heading and the words "click here to enter", for example, a search engine will not give it a high ranking and might ignore it completely!

Be careful not to make your text read like mumbo-jumbo. Some engines will assume that a phrase used fifteen times is more relevant to the document than a phrase used only five times. Search engines view density as a ratio (the number of times your key word/phrase appears on the page, relative to the total number of words on the page). The higher the key word/phrase density, the more relevant some engines rank the page. Click here to see a sample of meta tags.
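
To make the density ratio concrete, here is a small Python sketch. It assumes the simple definition given above (times the phrase appears, divided by total words on the page). Real search engines don't publish their formulas, so treat this as a rough self-check only. The function name and sample text are my own invention.

```python
import re

def keyword_density(text, phrase):
    """Occurrences of `phrase` divided by the total number of words in `text`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = re.findall(r"[a-z0-9']+", phrase.lower())
    if not words or not target:
        return 0.0
    n = len(target)
    # Slide a window of n words across the page and count exact matches.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)
    return hits / len(words)

# Hypothetical page text: 7 words, and "snow tires" appears twice.
sample = "Snow tires for sale: cheap snow tires"
density = keyword_density(sample, "snow tires")
```

Here the density works out to 2 divided by 7, or roughly 29%, which would read like mumbo-jumbo on a real page.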

The meta tags example shows many different types of meta tags, but you really only need the "title" tag plus the "description" and "keywords" meta tags. Combine these with your heading tags and "alt" attributes and you'll have great reading for those ravenous little spiders!

Use Effective <title> Descriptions

Be concise with no more than 20 words. Ideally keep around the average of 6 - 8 words.

In your keywords meta tag, be sure your title is the first key word/phrase, and try to include at least half of the keywords in your description. Do not repeat any word more than 6 - 8 times. And please don't "pad" keywords with words that have nothing to do with your content. (See more about this below.)

Make sure that you have links from the main page to every page on your site (deep submission), with a <title> tag and a description meta tag reflecting the content of each page. Always use an alt attribute for every image on your site and embed a keyword in it. Always use a heading tag <H1> that repeats the title of your page. You can capitalize the first letter of each word in your <title> but never capitalize in your description where it is not appropriate.
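
If you want to check a page against these rules before you submit it, here is a rough Python sketch using the standard html.parser module. The class name, function name and sample page are hypothetical. The checks simply mirror the advice above (title of 20 words or fewer, a description meta tag, an H1 that repeats the title, alt text on every image). It is not how any search engine actually scores a page.

```python
from html.parser import HTMLParser

class PageAudit(HTMLParser):
    """Collects the title, first-level heading, description and alt-less images."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.h1 = ""
        self.description = None
        self.images_missing_alt = 0
        self._in = None  # tag whose text we are currently collecting

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("title", "h1"):
            self._in = tag
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""
        elif tag == "img" and not attrs.get("alt"):
            self.images_missing_alt += 1

    def handle_endtag(self, tag):
        if tag == self._in:
            self._in = None

    def handle_data(self, data):
        if self._in == "title":
            self.title += data
        elif self._in == "h1":
            self.h1 += data

def audit(html):
    """Return a list of problems found, based on the tips in this column."""
    page = PageAudit()
    page.feed(html)
    page.close()
    problems = []
    title = page.title.strip()
    if not title:
        problems.append("missing <title>")
    elif len(title.split()) > 20:
        problems.append("title longer than 20 words")
    if page.description is None:
        problems.append("missing description meta tag")
    if page.h1.strip().lower() != title.lower():
        problems.append("<h1> does not repeat the title")
    if page.images_missing_alt:
        problems.append(f"{page.images_missing_alt} image(s) without alt text")
    return problems

# Hypothetical page: everything is in order except the image's alt text.
sample = """<html><head><title>Snow Tires in Vancouver</title>
<meta name="description" content="Snow tires for sale in Vancouver.">
</head><body><h1>Snow Tires in Vancouver</h1>
<img src="snow_tire.gif"></body></html>"""
problems = audit(sample)
```

An empty list means the page passed every check in this sketch.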

Tip: Name those HTML files "descriptively"

Name your html pages and images with the keywords you want to rank for, so the search engines will know that your "tires.html" with the "snow_tire.gif" is about tires.
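
Here is a tiny Python sketch for turning a key phrase into a file name in that style. The function name is hypothetical, and the underscore convention is just borrowed from the "snow_tire.gif" example above. Use whatever separator your site already uses.

```python
import re

def descriptive_filename(phrase, extension="html"):
    """Turn a key phrase into a lowercase, underscore-separated file name."""
    words = re.findall(r"[a-z0-9]+", phrase.lower())
    if not words:
        raise ValueError("phrase must contain at least one letter or digit")
    return "_".join(words) + "." + extension

page_name = descriptive_filename("Tires")
image_name = descriptive_filename("Snow Tire", "gif")
```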

What About Doorway/Crawler/Cloaked/Portal/Gateway Pages?

If you designed your pages to be search engine friendly (good content, rich meta tags, simple text navigation etc.) then you already have "natural" doorway pages. And you won't have to worry about new search engine policies that will be banning doorway pages. AltaVista will be introducing this policy soon and I'm sure others will follow shortly thereafter.

Tips:

  • Re-submit your website every two months.
  • Never put hidden (invisible) text on your website.
  • Never use misleading keywords in your Meta tags that have nothing to do with the site.
  • Never try scamming search engines by writing bogus site descriptions.

How to get Kicked Off the Search Engines

Spam isn't just for email!

Search engines can ban sites for months -- or even permanently -- and it's a lot easier to get banned by a search engine than it is to get a high ranking. Sometimes sites get banned for poor design and even for being hosted by free hosts.

Because free hosts are a haven for spammers who set up bogus sites and then bombard the search engines with worthless submissions, many search engines now block all submissions from sites like Geocities. So, use free hosts for your "personal" site, but get a professional host if you want search engines to index your business site.

Tip: Search engine policies and methods change frequently, so it's wise to try to keep up-to-date on their current policies.

Some search engines will not index a page with a high meta refresh rate. Go will not index pages with any redirection. And Google doesn't have to worry about meta refresh tags because its "link popularity" ranking system pretty much defeats spam attempts.

Two Popular Spamming Techniques Guaranteed to Leave You Out in the Cold:

  1. Invisible Text: This is where you place text on a page in the same color as the background, making it invisible to the viewer. Many search engines either refuse to index this text or, worse yet, will not index any page containing this sneaky little trick.
  2. Tiny Text: This is where you place text on a page in a very small font size. Pages that weigh in heavy in tiny text may be treated as spam, or the tiny text may not get indexed.

Possible Reasons You Weren't Indexed by the Spiders

You may unknowingly prevent your site from being indexed by using: JavaScript Links, Insufficient Site Links, Incorrect HTML, Virtual Domain Redirection, Meta Refresh, Meta Robots "noindex,nofollow" Tags or Free Website Hosts.
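
Two of these culprits, the meta refresh tag and the meta robots "noindex,nofollow" tag, are easy to spot with a script. Here is a minimal Python sketch using the standard html.parser module. The class and function names are hypothetical, and it only looks at meta tags. It won't catch JavaScript links, bad HTML or the other items on the list.

```python
from html.parser import HTMLParser

class IndexingBlockers(HTMLParser):
    """Records meta tags that may keep a spider from indexing the page."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = (attrs.get("name") or "").lower()
        http_equiv = (attrs.get("http-equiv") or "").lower()
        content = (attrs.get("content") or "").lower()
        if http_equiv == "refresh":
            self.found.append("meta refresh")
        if name == "robots":
            directives = {d.strip() for d in content.split(",")}
            for directive in ("noindex", "nofollow"):
                if directive in directives:
                    self.found.append(f"robots {directive}")

def find_blockers(html):
    """Return a list of the blocking meta tags found in `html`."""
    parser = IndexingBlockers()
    parser.feed(html)
    parser.close()
    return parser.found

# Hypothetical page head containing both kinds of blocker.
sample = """<head>
<meta name="robots" content="noindex,nofollow">
<meta http-equiv="refresh" content="0; url=new.html">
</head>"""
blockers = find_blockers(sample)
```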

Tip: Think Before You Submit

Before submitting your website to a major search engine, re-check the spelling of your meta tags and site descriptions. It takes a lot of work to have errors removed from search engines, so just do it right the first time! And whatever you do, don't make the mistake of submitting your site before it's "Search Engine Ready". You'll only confuse the robots and they will probably index your site incorrectly or give it a low ranking.

What is Deep Web Crawling (Deep Submission)?

Deep submission is nothing more than submitting each and every page of your site. However, if you have prepared your front page (or perhaps your second page) so that it is rich with content and links, most spiders will "deep index" your site.

Tip: Avoid banners, counters, cgi, Java or JavaScript, or place them AFTER your quality content. Do NOT use the refresh tag. It can trigger a search engine's spam penalty.

Site Submission Services

I have 2 words about these services... Don't Bother! Most of them just use commercially available software. With a little research and effort, you can do a much better job yourself.

Tip: If you use tables to lay out your pages, make sure your headline is the first thing read by putting it at the top of the page - outside of the table.

Great Tools to Help You Get Listed

These sites have lots of great tools and information:

Free Site Submission Services:

Note: It can take anywhere from two weeks to six months for your site to be indexed.

Check Your Link Popularity: LinksToYou.com. Just enter your URL and get a report on your site's ranking.

Search Engine Sizes: Take a look at the current coverage of the surface web.

Keyword Tools:

Site of the Day

Macinstein - Lots of great stuff and home of The Mac Only Search Engine! Be sure to check out Odigo Messenger.

Today's Peeve

Canada Post Corporation recently announced a new free Internet access program for "all Canadians".

"This service, paid for with our tax dollars, supports and promotes Microsoft to the detriment of other OS vendors: This Federally funded program is implicitly restricted to only those Canadians who buy their O/S from Microsoft. The 3Web site presents an explicit message: There is no connection software for Mac, Linux, nor, dare I guess, for BSD or OS/2 or..."

Read the article first... and you can read the announcement at Canada Post as well. I hope you will send them an email and tell them what you think about this blatant act of discrimination! (can you say "dictatorship"?)

Update: As usual, my biggest peeve is sites that are not Mac friendly.... Remember "The Living Letters"? Well it seems that they don't like Windows NT either. (Thanks to Jeff for this tip!)

Today's Alert

False Advertising? Misleading? Outright Lie?

The Blue Man Group is featured in a new Intel ad but when it comes to running their live show they rely on Macs. Read the story on Apple's site.

I've contacted Intel regarding this mess... it's been forwarded to the "proper people" within the corporation. While I haven't received word back yet, I did notice that the Blue Man Group has been removed from Intel's front page. Here's a screenshot that I took before its disappearance.

Well, I'm out of space so that's it for now. What will I be talking about next issue? HTML Editors! Don't forget to check out my other column, The Webmistress at Applelust.com.

Questions? Suggestions? Complaints? Proposals?

I'd love to hear from you, so please drop me a line at nancy@macosjournal.com or use the handy little form via the link below. See you next month! And don't forget about hosting... it's gotta be on a Mac... it's gotta be Itsamac.com -- simply the best! Don't forget to tell them who sent you!

Quote of the Day

Macs for productivity, Unix for stability, Windows for solitaire
-from Daniel Knight (lowendmac.com)

Nancy Johnson - nancy@macosjournal.com
Nancy's Page - Feedback Form
