Dec 20 2008
Each of the following components is critical to a site's ability to be
crawled, indexed, and ranked by search engine spiders. When properly used in the
construction of a website, these features give a site or page the best chance of
ranking well for targeted keywords.
Accessibility
An accessible site is one that ensures delivery of its content successfully
as often as possible. The functionality of pages, validity of HTML elements,
uptime of the site's server, and working status of site coding and components
all figure into site accessibility. If these features are ignored or faulty,
both search engines and users will select other sites to visit.
The biggest problems in accessibility that most sites encounter fit into the
following categories. Addressing these issues satisfactorily will avoid problems
getting search engines and visitors to and through your site.
- Broken Links - If an HTML link is broken, the contents
of the linked-to page may never be found. In addition, some surmise that
search engines lower the rankings of sites and pages with many broken
links.
- Valid HTML & CSS - Although arguments exist about the
necessity for full validation of HTML and CSS in accordance with W3C
guidelines, it is generally agreed that code must meet minimum
requirements of functionality and successful display in order to be spidered
and cached properly by the search engines.
- Functionality of Forms and Applications - If form
submissions, select boxes, JavaScript, or other input-required elements
block content from being reached via direct hyperlinks, search engines may
never find that content. Keep data that you want accessible to search engines
on pages that can be directly accessed via a link. In a similar vein, the
successful functionality and implementation of each of these pieces is
critical to a site's accessibility for visitors. A non-functioning page,
form, or code element is unlikely to receive much attention from visitors.
- File Size - With the exception of a select few
documents that search engines consider to be of exceptional importance, web
pages greater than 150K in size are typically not fully cached. This is done
to reduce index size, bandwidth, and load on the servers, and is important
to anyone building pages with exceptionally large amounts of content. If
it's important that every word and phrase be spidered and indexed, keeping
file size under 150K is highly recommended. As with any online endeavor,
smaller file size also means faster download speed for users - a worthy
metric in its own right.
- Downtime & Server Speed - The performance of your
site's server may have an adverse impact on search rankings and visitors if
downtime and slow transfer speeds are common. Invest in high quality hosting
to prevent this issue.
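Two of the checks above, broken links and file size, are easy to automate. The following is a minimal sketch, not a full crawler: `audit_page` and `known_pages` are hypothetical names, and `known_pages` stands in for a real check that would request each link and inspect the HTTP status. The 150K limit is the guideline quoted above.

```python
from html.parser import HTMLParser

SIZE_LIMIT_BYTES = 150 * 1024  # the 150K guideline from the text above


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def audit_page(html, known_pages):
    """Return (oversized, broken_links) for one page of a site.

    known_pages is a hypothetical set of internal paths that exist;
    a real audit would request each link and check the response.
    """
    collector = LinkCollector()
    collector.feed(html)
    oversized = len(html.encode("utf-8")) > SIZE_LIMIT_BYTES
    # Only internal links (starting with "/") are checked in this sketch.
    broken = [link for link in collector.links
              if link.startswith("/") and link not in known_pages]
    return oversized, broken
```

Running this over every page of a site before launch gives a quick list of links to repair and pages to trim or split.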
URLs, Title Tags & Meta Data
URLs, title tags, and meta tags all describe your site and page to visitors
and search engines. Keeping them relevant, compelling, and accurate is key to
ranking well. You can also use these areas as launching points for your
keywords, and indeed, successful rankings require their use.
The URL of a document should ideally be as descriptive and brief as possible.
If, for example, your site's structure has several levels of files and
navigation, the URL should reflect this with folders and subfolders. Individual
pages' URLs should also be descriptive without being overly lengthy, so that a
visitor who sees only the URL could have a good idea of what to expect on the
page. Several examples follow:
Comparison of URLs for a Canon PowerShot SD400 Camera
Amazon.com - http://www.amazon.com/gp/product/B0007TJ5OG/102-8372974-4064145?v=glance&n=502394&m=ATVPDKIKX0DER&n=3031001&s=photo&v=glance
Canon.com - http://consumer.usa.canon.com/ir/controller?act=ModelDetailAct&fcategoryid=145&modelid=11158
DPReview.com - http://www.dpreview.com/reviews/canonsd400/
With both Canon and Amazon, a user has virtually no idea what the URL
might point to. With DPReview's logical URL, however, it is easy to surmise
that a review of a Canon SD400 is the likely topic of the page.
In addition to the issues of brevity and clarity, it's also important to keep
URLs limited to as few dynamic parameters as possible. A dynamic parameter is a
part of the URL that provides data to a database so the proper records can be
retrieved, e.g. n=3031001, v=glance, fcategoryid=145, etc.
Note that both Amazon's and Canon's URLs contain three or more dynamic
parameters. In an ideal site, there should never be more than two. Search engine
representatives have confirmed on numerous occasions that URLs with more than
two dynamic parameters may not be spidered unless they are perceived as
significantly important (i.e. have many, many links pointing to them).
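Counting dynamic parameters can be done mechanically. Here is a minimal sketch using Python's standard `urllib.parse` module; the function names and the `MAX_DYNAMIC_PARAMS` constant are illustrative, and the limit of two is simply the guideline stated above.

```python
from urllib.parse import urlsplit, parse_qsl

MAX_DYNAMIC_PARAMS = 2  # guideline stated in the text above


def count_dynamic_params(url):
    """Count the name=value pairs in a URL's query string."""
    query = urlsplit(url).query
    # Repeated names (such as two "n=" pairs) are counted separately.
    return len(parse_qsl(query, keep_blank_values=True))


def is_spider_friendly(url):
    """True when the URL stays within the parameter guideline."""
    return count_dynamic_params(url) <= MAX_DYNAMIC_PARAMS
```

Applied to the three example URLs, the Canon address yields three parameters, the Amazon address six, and the DPReview address none.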
Well written URLs have the additional benefit of serving as their own anchor
text when copied and pasted as links in forums, blogs, or other online venues.
In the DPReview example, a search engine might see the URL
http://www.dpreview.com/reviews/canonsd400/ and give ranking credit to the page
for terms in the URL like dpreview, reviews, canon, sd, 400. The parsing and
breaking of terms is subject to the search engine's analysis, but the chance of
earning this additional credit makes writing friendly, usable URLs even more
worthwhile.
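To get a feel for which terms a URL exposes, a simple tokenizer is enough. This is a rough sketch only: how a search engine actually breaks a URL into terms is not public, and the `url_terms` function and its list of ignored tokens are assumptions made for illustration.

```python
import re
from urllib.parse import urlsplit


def url_terms(url):
    """Split a URL's host and path into lowercase candidate terms."""
    parts = urlsplit(url)
    text = f"{parts.netloc} {parts.path}".lower()
    # Runs of letters and runs of digits become separate tokens,
    # so "sd400" would split into "sd" and "400".
    tokens = re.findall(r"[a-z]+|[0-9]+", text)
    ignored = {"www", "com", "http", "https"}
    return [t for t in tokens if t not in ignored]


print(url_terms("http://www.dpreview.com/reviews/canonsd400/"))
# prints ['dpreview', 'reviews', 'canonsd', '400']
```

Note that this pattern leaves "canonsd" as one token. Separating it into "canon" and "sd" would need a dictionary of known words, which is one reason a URL such as /reviews/canon-sd400/ is easier for any parser to read.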
Title tags, in addition to their invaluable use in targeting keyword terms
for rankings, also help drive click-through rates (CTRs) from the results pages.
Most of the search engines will use a page's title tag as the blue link text and
headline for a result (see image below), and thus it is important to make titles
informative and compelling without being overly "salesy". The best title tags
will make the targeted keywords prominent, help brand the site, and be as clear
and concise as possible.
Examples and Recommendations for Title Tags
Page on Red Pandas from the Wellington Zoo:
- Current Title: Red Panda
- Recommended: Red Panda - Habitat, Features, Behavior | Wellington Zoo
Page on Alexander Calder from the Calder Foundation:
- Current Title: Alexander Calder
- Recommended: Alexander Calder - Biography of the Artist from the Calder Foundation
Page on Plasma TVs from Tiger Direct:
- Current Title: Plasma Televisions, Plasma TV, Plasma Screen TVs, SONY Plasma TV, LCD TV at TigerDirect.com
- Recommended: Plasma Screen & LCD Televisions at TigerDirect.com
For each of these, the idea behind the recommendations is to distill the
information into the clearest, most useful snippet while retaining the primary
keyword phrase as the first words in the tag. The title tag provides the first
impression of a web page and can either serve to draw the visitor in or compel
him or her to choose another listing in the results.
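The pattern behind these recommendations (primary keyword phrase first, a few descriptors, then the brand) can be captured in a small helper. This is a sketch of one possible convention, not a rule the search engines publish; `build_title` and `keyword_leads` are hypothetical names.

```python
def build_title(keyword, descriptors=(), brand=""):
    """Compose a title tag with the primary keyword phrase first."""
    title = keyword
    if descriptors:
        title += " - " + ", ".join(descriptors)
    if brand:
        title += " | " + brand
    return title


def keyword_leads(title, keyword):
    """True when the title opens with the primary keyword phrase."""
    return title.lower().startswith(keyword.lower())


print(build_title("Red Panda", ["Habitat", "Features", "Behavior"], "Wellington Zoo"))
# prints Red Panda - Habitat, Features, Behavior | Wellington Zoo
```

A check like `keyword_leads` is handy when titles are generated from templates, since it flags pages where the brand or boilerplate has crept in front of the keyword.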
Meta Tag Recommendations:
Meta tags once held the distinction of being the primary realm of SEO
specialists. Today, the use of meta tags, particularly the meta keywords tag,
has diminished to the point that search engines no longer use them in ranking
pages.
However, the meta description tag can still be of some importance, as several
search engines use this tag to display the snippet of text below the clickable
title link in the results pages.
In the image to the left, an illustration of a Google SERP (Search Engine
Results Page) shows the use of the meta description and title tags. It is on
this page that searchers generally make their decision as to which result to
click, and thus, while the meta description tag may have little to no impact on
where a page ranks, it can significantly impact the number of visitors the page
receives from search engine traffic. Note that meta tags are NOT always used on
the SERPs, but can be seen (at the discretion of the search engine) if the
description is accurate, well-written, and relevant to the searcher's query.
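To review what a results page is likely to show for a page, it helps to pull out the title and meta description together. The following is a minimal sketch using Python's standard `html.parser` module; `SnippetParser` and `serp_preview` are hypothetical names, and as noted above the engines may choose a different snippet.

```python
from html.parser import HTMLParser


class SnippetParser(HTMLParser):
    """Pulls the <title> text and meta description out of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def serp_preview(html):
    """Return (title, description) as they appear in the page source."""
    parser = SnippetParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.description.strip()
```

An empty description in the result means the search engine has nothing to work with and will build the snippet from page text instead.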
Search-Friendly Text
Making the visible text on a page "search-friendly" isn't complicated, but it
is an issue that many sites struggle with. Text styles that cannot be indexed by
search engines include:
- Text embedded in a Java Application or Macromedia Flash file
- Text in an image file - jpg, gif, png, etc
- Text accessible only via a form submit or other on-page action
If the search engines can't see your page's text, they cannot spider and
index that content for visitors to find. Thus, making search-friendly text in
HTML format is critical to ranking well and getting properly indexed. If you are
forced to use a format that hides text from search engines, try to use the right
keywords and phrases in headlines, title tags, URLs, and image/file names on the
page. Don't go overboard with this tactic, and never try to hide text (by making
it the same color as the background or using CSS tricks). Even if the search
engines can't detect this automatically, a competitor can easily report your
site for spamming and have you de-listed entirely.
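A quick way to approximate what a spider can read is to extract only the text nodes from a page's HTML. This sketch assumes a spider that reads plain HTML text and nothing else, which is a simplification; `VisibleText` and `indexable_text` are hypothetical names.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text nodes, skipping <script> and <style> contents."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def indexable_text(html):
    """Return the page text a plain HTML reader would see."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

If a headline exists only inside an image or a Flash file, it will be missing from the output, which is exactly the gap described above.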
Along with making text visible, it's important to remember that search
engines measure the terms and phrases in a document to extract a great deal of
information about the page. Writing well for search engines is both an art and a
science (as SEOs are not privy to the exact, technical methodology of how search
engines score text for rankings), and one that can be harnessed to achieve
better rankings.
In general, the following are basic rules that apply to optimizing on-page
text for search rankings:
- Make the primary term/phrase prominent in the document - Measurements
like keyword density are useless, but general frequency can help rankings.
- Make the text on-topic and high quality - Search
engines use sophisticated lexical analysis to help find quality pages, as
well as teams of researchers identifying common elements in high quality
writing. Thus, great writing can provide benefits to rankings, as well as
visitors.
- Use an optimized document structure - The best practice
is generally to follow a journalistic format wherein the document starts
with a description of the content, then flows from broad discussion of the
subject to narrow. The SEO benefits of this are arguable, but the format
also tends to produce the most readable and engaging informational document.
Obviously, in situations where this would be inappropriate, it's not
necessary.
- Keep text together - Many folks in SEO recommend using
CSS rather than table layouts in order to keep the text flow of the document
together and prevent the breaking up of text via coding. This can also be
achieved with tables - simply make sure that text sections (content, ads,
navigation, etc.) flow together inside a single table or row and don't have
too many "nested" tables that make for broken sentences and paragraphs.
Keep in mind that the text layout and keyword usage in a document no longer
carry high importance in search engine rankings. While the right structure and
usage can provide a slight boost, obsessing over keyword placement or layout
will provide little overall benefit.
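For those who still want a rough check of the first rule, the sketch below reports how often a phrase appears and whether it shows up early in the text. It is an illustration only: the `phrase_stats` name and the 50-word "lead" window are assumptions, not figures taken from any search engine.

```python
import re


def phrase_stats(text, phrase, lead_words=50):
    """Return (count, in_lead) for a phrase in a block of page text.

    count   - how many times the phrase appears (case-insensitive)
    in_lead - whether it appears within the first lead_words words
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    target = re.findall(r"[a-z0-9]+", phrase.lower())
    n = len(target)
    if n == 0:
        return 0, False
    # Start index of every position where the phrase matches word for word.
    positions = [i for i in range(len(words) - n + 1)
                 if words[i:i + n] == target]
    in_lead = any(p + n <= lead_words for p in positions)
    return len(positions), in_lead
```

In keeping with the caution above, treat the result as a sanity check that the topic is actually mentioned, not as a score to maximize.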
Information Architecture
The document and link structure of a website can provide benefits to search
rankings when performed properly. The keys to effective architecture are to
follow the rules that govern human usability of a site:
- Make Use of a Sitemap - It's wise to have the sitemap
page linked to from every other page in the site, or at the least from
important high-level category pages and the home page. The sitemap should,
ideally, offer links to all of the site's internal pages. However, if more
than 100-150 pages exist on the site, a wiser system is to create a sitemap
that will link to all of the category level pages, so that no page in a site
is more than 2 clicks from the home page. For exceptionally large sites,
this rule can be expanded to 3 clicks from the home page.
- Use a Category Structure that Flows from Broad > Narrow -
Start with the broadest topics as hierarchical category pages, then expand
to deep pages with specific topics. Using the most on-topic structure tells
search engines that your site is highly relevant and covers a topic
in-depth.
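The "clicks from the home page" rule can be checked with a breadth-first search over the site's internal links. Below is a minimal sketch; the `links` dictionary (page path mapped to the paths it links to) is a hypothetical input that a crawler would normally produce, and the default limit of two clicks comes from the sitemap guideline above.

```python
from collections import deque


def click_depths(links, home="/"):
    """Breadth-first search giving each page's click distance from home.

    links maps a page path to the list of pages it links to.
    """
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def pages_too_deep(links, home="/", max_clicks=2):
    """List pages further than max_clicks from home, or unreachable."""
    depths = click_depths(links, home)
    all_pages = set(links) | {t for targets in links.values() for t in targets}
    # Pages missing from depths cannot be reached from home at all.
    return sorted(p for p in all_pages
                  if depths.get(p, max_clicks + 1) > max_clicks)
```

Any page this reports is a candidate for a link from the sitemap or from a category page closer to the home page.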
For more information on segmenting document structure and link hierarchies,
see Dr. Garcia's excellent guide to on-topic analysis.
Canonical Issues & Duplicate Content
One of the most common and problematic issues for website builders,
particularly those with larger, dynamic sites powered by databases, is the issue
of duplicate content. Search engines are primarily interested in unique
documents and text, and when they find multiple instances of the same content,
they are likely to select a single one as "canonical" and display that page in
their results.
If your site has multiple pages with the same content, either through a
content management system that creates duplicates through separate navigation,
or because copies exist from multiple versions, you may be hurting those pages'
chances of ranking in the SERPs. In addition, the value that comes from anchor
text and link weight, through both internal and external links to the page, will
be diluted by multiple versions.
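Exact duplicates on your own site can be found by fingerprinting each page's text. The sketch below catches only copies that are identical after normalizing case and whitespace; search engines use more tolerant near-duplicate detection whose details are not public. The function names and the `pages` mapping of URL to extracted text are hypothetical.

```python
import hashlib
import re
from collections import defaultdict


def content_fingerprint(text):
    """Hash page text after collapsing case and whitespace."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicates(pages):
    """Group URLs whose text is identical after normalization.

    pages maps URL -> extracted page text.
    """
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[content_fingerprint(text)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]
```

Each group returned is a set of URLs competing with one another for the same content.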
The solution is to take any current duplicate pages and use a 301 redirect
to point all versions to a single, "canonical" edition of the content.
One very common place to look for this error is on a site's homepage -
oftentimes, a website will have the same content on http://www.url.com,
http://url.com, and http://www.url.com/index.html. That separation alone can
cause lost link value and severely damage rankings for the site's homepage. If
you find many links outside the site pointing to both the non-www and the www
version, it may be wise to use a 301 rewrite rule that redirects every page on
one version to its counterpart on the other.
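The homepage case can be expressed as a small URL-normalizing function that a server or application could consult before answering a request. This is a sketch under stated assumptions: it assumes the site is served only from the bare domain and its www form (it would wrongly prefix other subdomains), it treats index.html as the only default document, and `canonical_url` and `redirect_for` are hypothetical names. In practice the same rule is usually written in the web server's own rewrite configuration.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url, prefer_www=True):
    """Map www/non-www and /index.html variants to one address."""
    scheme, host, path, query, fragment = urlsplit(url)
    host = host.lower()
    if prefer_www and not host.startswith("www."):
        host = "www." + host
    elif not prefer_www and host.startswith("www."):
        host = host[len("www."):]
    # Treat /index.html as the same document as its directory.
    if path.endswith("/index.html"):
        path = path[:-len("index.html")]
    return urlunsplit((scheme, host, path or "/", query, fragment))


def redirect_for(url):
    """Return (301, target) if the request needs redirecting, else None."""
    target = canonical_url(url)
    return (301, target) if target != url else None
```

With this rule, http://url.com and http://www.url.com/index.html both resolve to http://www.url.com/, so links pointing at any variant end up crediting a single page.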
