Search Engine Optimization II

February 13, 2004 | View Comments (20) | Category: Web Mastering

Summary: Part II on Search Engine Optimization.

Some great ideas were passed around in the first article on search engine optimization, and after going over it I noticed a couple of items I had overlooked that would also be beneficial for everyone. So I guess it is time for Part II of the mini-series (I didn't even know one was starting ;).

Keyword Density

Earlier I mentioned that you should select one or two keywords per page to optimize for the search engines. The only technique I mentioned for doing this, however, was headers. Another subtle way to increase your keyword density per page is through title attributes.

The title attribute, used inside anchor (link) tags, is a great way to increase keyword density without getting in the way of users, and it lets the search engines know how important a keyword is to your site. I have only recently discovered their importance and begun using them more often.
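
For example, a link like this works the keyword phrase in without cluttering the visible text (the URL and title text here are made up):

    <a href="/archives/seo-part-one" title="search engine optimization tips">Part I on SEO</a>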

Tools

Two useful tools I have found for keeping track of how well your site is doing in Google are:

Content First

Putting the content at the beginning of your page can also play a huge role in how you rank in the search engines. A little-known fact about Google is that it only indexes the first 101 KB of a page and stops after that. So try to keep your pages slim and filled with great content; 101 KB is a lot for a single page anyway.

Guide the Robots
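
The standard way to do this is with a robots.txt file at the root of your site, which tells well-behaved crawlers what to skip (a minimal sketch; the directory names are made up):

    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /drafts/

Anything not disallowed is fair game, so your good content stays open to the Googlebot while the junk stays out of the index.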

Pretty URLs

I mentioned in the entry about The Roe the importance of good URLs to search engines. I recently read somewhere that Google will follow dynamic links, but only up to a certain number (sorry, I forget where I saw it); normal links/URLs, however, Google loves. Yesterday, Google decided to crawl 54,000 pages of The Roe in one day! This is typically known as being Google Bombed, so be careful, because it can grind a server to a halt and piss off your ISP if you are on a shared hosting plan. Fortunately, not too many sites exist today with that many pages that Google has not touched yet.
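
If your site runs on Apache, mod_rewrite is one way to serve the normal-looking URLs Google loves while keeping a dynamic script underneath (a minimal .htaccess sketch; the file and parameter names are made up):

    RewriteEngine On
    # Present /articles/123 to visitors and crawlers,
    # but run the dynamic query-string version behind the scenes
    RewriteRule ^articles/([0-9]+)$ article.php?id=$1 [L]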

A lot of these points were brought up in the comments on the first SEO entry, but I thought they were so important that they needed their own entry. Be careful about getting too caught up in understanding how Google works. The search engine has been doing some crazy things lately with how it ranks sites. One of my entries on this site went from 6 to 21 to 41 to not seen to 6 again in about three weeks. Confusing, but funny to watch.

Trackback URL: http://9rules.com/cgi-bin/mt/mt-tb.cgi/147

Comments

#1

Being hit hard by the Googlebot is not called being Google Bombed. A Google Bomb is something different:

Setting up a large number of Web pages with links that point to a specific Web site so that the site will appear near the top of a Google search when users enter the link text.

Jeremy Keith (http://adactio.com/)

#2

That's interesting, what you say about Google only indexing the first 101 KB of a page. That would explain a few anomalies I've noticed, where certain terms towards the end of a long page have not been indexed but items appearing earlier have.

How did you come by this information?

Joel (http://www.biroco.com/)

#3

Of course, Google bombing is something different.

The 101 KB limit is a well-known fact within the SEO community. It is also believed that Google indexes only about one hundred links per page.

See: http://www.seochat.com

dusoft (http://www.ambience.sk)

#4

The title attribute, used inside anchor (link) tags, is a great way to increase keyword density without getting in the way of users, and it lets the search engines know how important a keyword is to your site. I have only recently discovered their importance and begun using them more often.

Title attributes in link tags do make a big difference as far as Google is concerned. A friend of mine and I both have personal weblogs. On my blog, I have been working on a 'bios' section that goes into some background information about the people I talk about often (friends, family, etc.). Every time I would mention someone in a post, whether I would say "Misty" or just "my wife", I would link those words to that person's bio page.

Then I started adding their full name as a title attribute. Within a week, Google considered me the leading expert on my friend's name even though he had a site of his own that has been up for a much longer period of time.

KillAllDash9 (http://www.pulpblog.com)

#5

Dusoft -- "Well-known fact," you say. Well, I've certainly heard it before. I am more interested in how it is known. Have Google stated this? Excuse me for wishing to know what is fact and what is SEO folklore; it's my scientific education.

Joel (http://www.biroco.com/)

#6

Googlebombing: Ah, yes, you guys are right about that. So what should we call it when Google decides to attack your server and go after thousands or millions of pages in a couple of hours?

As for the 101 KB limit: it is mainly a theory, but check out how it came about. If you look at this page you will see that it weighs in at a whopping 143.19 KB. Google, in the search results, says it is only 101 KB, and if you compare the cached Google version of the same page you will see that it just stops after 101 KB.

Scrivs (http://www.9rules.com/whitespace/)

#7

Thanks Scrivs, that is indeed very interesting. I exclude my own pages from the Google cache, otherwise I might have noticed such a cut-off point.

Joel (http://biroco.com/)

#8

And that 101 KB limit is why boingboing is really annoying. Their archive pages go one month at a time, which usually makes them a bit under 1 MB... but the only search they have is Google search... so if you want to search for something that isn't in that first 101 KB, you're SOL.

And yeah, link titles work nicely, not just for Google, but for screen readers and for people who mouse over links as they read the site... so how come you're not using them, Paul? :-)

JC (http://thelionsweb.com/weblog)

#9

Well, I checked it out. I looked through quite a lot of Google results and couldn't find a single one above 101 KB, but found more than the statistical average at exactly 101 KB, and those were cut off in the cached version. Time to split a few long pages, I guess.

Joel (http://biroco.com/)

#10

So would a googlebomb be the reason I have 10 entries at a time from pr0n sites in my referrer logs?

Jeremy Flint (http://www.jeremyflint.com)

#11

JC: I am slowly working them in ;-P

Jeremy: That has been happening to everyone lately, it seems. I had to move my referrer list to a different place.

Scrivs (http://www.9rules.com/whitespace/)

#12

Jeremy -
It's called referer spam, and it's a real pain. It has nothing to do with running a blog, either. I work in a bank, and our primary website's error pages are set to email me whenever there's a 404 with a referer (so I can ask the linking individual to update the link, or do a server-side redirect myself). Until I filtered out a few specific strings ("members" was the biggest one), I was getting 150+ error emails a day.

JC (http://thelionsweb.com/weblog)

#13

If a page is over 101 KB, Google will crawl and cache the entire page, but it does so in quite a bizarre manner. The first 101 KB is stored in Google's cache as the page, but the rest of the page is also stored in 101 KB chunks that Google calls 'supplemental results'. Google's cached versions of these results contain the whole page!

This is quite hard to explain, so I would suggest searching for 'lesser known holidays' on Google and looking at the results for the domain suda.co.uk.

mjr

#14

mjr -- I'm not convinced that a "supplemental result" is what you say it is; Google "explains" it here:

http://www.google.com/help/interpret.html

I know for a fact that on pages of my own over 101 KB, terms near the bottom cannot be found with a Google search.

I guess the point is not to become over-reliant on Google. After reflection, I wonder whether I can be bothered to change the way I make certain pages just to suit its idiosyncrasies. I noticed some time ago that Google is not as reliable as one might hope, so I set up an Atomz search facility on my own site, which I have to say works splendidly, beyond all expectation. At the time, incidentally, I tested quite a few Movable Type site-search facilities and found them next to useless, which made me wonder whether those who display them have ever tried to use them. Try searching for the word "zen" on a Movable Type site and see what you find: plenty of pages with the word "dozen" or "citizen" on them.

Joel (http://www.biroco.com/)

#15

Joel, I haven't looked into it too much, but I know that if you search for a holiday listed at the bottom of the page I linked to, Google will find it as a supplemental result. I don't know why it works like this!

One way you can get around this 101 KB limitation is by gzipping pages for user agents that support it (all modern browsers, as far as I know). Google itself gzips all its content, and its cached version of the holidays page is 16 KB as opposed to the original's 183 KB. If you use PHP, there is a built-in function to do this for you.
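
For instance, a minimal sketch using PHP's ob_gzhandler output-buffer callback, which is the built-in I mean:

    <?php
    // ob_gzhandler checks the browser's Accept-Encoding header and
    // falls back to uncompressed output when gzip isn't supported
    ob_start('ob_gzhandler');
    ?>
    <html>
    ...the rest of the page as usual...
    </html>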

mjr

#16

mjr - I'm 99% sure Googlebot doesn't send Accept-Encoding: gzip headers, so while compression is a wonderful, wonderful thing, it's not going to help your Google results.

Pages really shouldn't be 100+ KB anyway.

JC (http://thelionsweb.com/weblog)

#17

JC, yeah, I realised the stupidity of that comment just after I'd posted it. Doh! Even if Googlebot did accept gzipped content, it would still uncompress it and stop at 101 KB anyway.

mjr

#18

I really enjoyed your article; it was useful!
I'm writing in Portuguese so you know it travelled far!

All the best

Marcelo Dantas

#19

As of last week (Feb 18, to be exact), Yahoo decoupled itself from Google. Yahoo now uses a newly devised proprietary algorithm (NOT Inktomi).

As yet, the search engine optimisation community is unclear on how to optimise for Yahoo; it will take some time, and some reverse engineering, to figure this out.

SEOs will now have to take Yahoo's algorithm into account as well as Google's. One difference we know of right away is that, while Google ignores the keyword meta tag, Yahoo takes account of it.

However, make sure that the words you insert here correspond to words that appear prominently on the page ... Yahoo filters out pages where the meta keywords are stuffed with spam.
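
For reference, the tag in question sits in the page's head and looks like this (the keyword list here is made up):

    <meta name="keywords" content="search engine optimization, keyword density">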

Michael Heraghty (http://www.michaelheraghty.com/findability.html)

#20

That is some good info to know. Thanks Michael.

Scrivs (http://www.9rules.com/whitespace/)
