Web Usability & Website Optimization

Web Usability and SEO
A blog on » Web Usability   » Website Accessibility   » Search Engine Optimization (SEO)

Sunday, June 18, 2006

Actual webpage Title tag and Google

Recently I was working on some internationalization issues on a website as part of search engine optimization and found some interesting facts. Until now I thought that Google keeps the title exactly as it appears in the source code of a webpage, and that the search results simply render it the way a browser would. A little googling on the subject showed I was wrong.
Where both a character entity and its literal output are valid (e.g. the output of ‘& r a q u o ;’ is ‘»’), Google stores the output character for any such entity found in the title text, not the entity itself.
Example for clarification:
Open http://www.w3c.es/ in a browser; the title of the page reads:
World Wide Web Consortium - Oficina Española
but when you look at the source code you will find it as:
World Wide Web Consortium - Oficina Espa& n t i l d e ;ola
That much is expected. Now search Google for “w3c Espanola” (without quotes); the result for this site shows the same title as the browser does:
World Wide Web Consortium - Oficina Española
But now look at the source code of this Google search results page: it contains the title text exactly as it is displayed, with the literal character rather than the entity:
World Wide Web Consortium - Oficina < b >Española< / b >
That means Google stores the decoded output of the title text found in the site's source code and shows that same text in the search results page.
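
The decoding step described above can be reproduced locally. This is only a minimal sketch using Python's standard `html` module; it says nothing about how Google actually implements it, and the variable names are my own. The entity is written here without the display spaces used elsewhere in this post.

```python
import html

# Title text as it appears in the page source (entity written without display spaces).
source_title = "World Wide Web Consortium - Oficina Espa&ntilde;ola"

# html.unescape resolves named and numeric character references to their characters.
decoded_title = html.unescape(source_title)

print(decoded_title)
```

The decoded string is what a browser shows in the title bar, and it matches what appears in the search result listing.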

But where only the entity is recommended and writing the literal character in the markup is not desirable (e.g. '& a m p ;' is recommended for '&'), Google respects that too and keeps the entity in the title text rather than its output.
Example for clarification:
Search for “&” (with or without quotes) in Google. The first result I got was barnesandnoble.com, whose title reads as follows both in the browser and in the search result listing:
Barnes & Noble.com - Home Page
Now look at the source code of the webpage; it reads:
Barnes& n b s p ;& a m p ;& n b s p ;Noble.com - Home Page
And now look at the source code of the Google search results page:
Barnes < b >& a m p ;< / b > Noble.com - Home Page

This clearly means Google holds ‘& a m p ;’ and not ‘&’, which is a sign of respecting the W3C recommendations. At the same time, Google has dropped the '& n b s p ;' that appears in the actual title text (source code) and replaced it with a plain space in its own version of the title, because both a space and its entity equivalent are valid in HTML. Google understands this and acts smartly.
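
One way to arrive at the same output is to decode every entity, turn non-breaking spaces into ordinary spaces, and then re-escape only the characters that must be escaped in HTML. The sketch below assumes exactly that pipeline; `normalize_title` is a hypothetical helper of mine, not anything Google has documented.

```python
import html

def normalize_title(raw_title: str) -> str:
    """Hypothetical helper: mimic the title text seen in the results page source."""
    # Decode all entities: "&nbsp;" becomes "\xa0" and "&amp;" becomes "&".
    decoded = html.unescape(raw_title)
    # Replace the non-breaking space character with an ordinary space.
    spaced = decoded.replace("\xa0", " ")
    # Re-escape only &, < and >, which should not appear literally in HTML text.
    return html.escape(spaced, quote=False)

normalized = normalize_title("Barnes&nbsp;&amp;&nbsp;Noble.com - Home Page")
print(normalized)
```

Under these assumptions the ampersand entity survives the round trip while the non-breaking spaces do not, which matches what the results page source shows.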

Hats off to Google!

This also suggests that Google counts an entity as one character rather than by the full length of the entity. E.g. Google will count ‘& a m p ;’ as 1 character and not 5.

Note: The above information is based on some research in Google and there may be variations. Spaces have been inserted into the character entities and some HTML tags shown above so that they display literally; every such entity and tag should be read with the spaces dropped.
Please post your comments. We would like to know your views and experience in this area.


Anonymous Reshmi said...

Thanks a lot for this piece of information. I had been pondering over this since a long time. Thanks for clearing my mind :)

5:05 AM  
