Google And Duplicate Content

I've been following the discussion about Google and mirrored
information for some time. It is "common knowledge" that Google
penalizes page rank when it determines that content is
duplicated somewhere else. In fact, I've read many experts
stating that there should be no duplicate domain names and no
duplicate content anywhere.

On the face of it the arguments appear to be sound. Google
obviously has several billion pages in its database and could,
it appears, easily determine if content is duplicated. It also
seems, again on the face of it, that it's reasonable to check
for duplicate content, as this is the "mark of a spammer" and
not necessary on the web with hyperlinking available. At least,
this is the common wisdom.

However, sometimes what seems reasonable and possible is not:
not by a long shot.

Let's begin with the technical side of things. You've got
domain x and domain y with exactly the same content. How on
earth would Google be able to figure that out? Let's say Google
had 3 billion pages in its database. To compare every page to
every page would be an enormous task - several quintillion
comparisons.
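
To put a number on that, here is a minimal sketch of the arithmetic. It assumes a naive approach in which each page is compared once against every other page; the function name is just for illustration. For 3 billion pages it comes to about 4.5 x 10^18 pairs, which is thousands of quadrillions.

```python
def pairwise_comparisons(n: int) -> int:
    """Number of distinct unordered page pairs among n pages: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# The article's hypothetical index size of 3 billion pages.
pages = 3_000_000_000
total = pairwise_comparisons(pages)

# Print in scientific notation so the magnitude is easy to read.
print(f"{pages:,} pages -> {total:.2e} pairwise comparisons")
```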

Now, if site x had page "page1" which linked to site y which
also had "page1", then it would be possible for Google to
determine the duplicate content. Conceivably, it could check
this out.

Not only is the task enormous, but the benefit is so tiny as to
be insignificant. Duplicate content does not, in any way, shape
or form, imply spamming. In actual fact, a duplicate site is
generally going to lower page rank of BOTH sites. Instead of
having 100 links to one site, there will presumably be 50 links
to one and 50 to another. This would tend (all things being
equal) to lower the page ranking of both sites. So Google gains
nothing by this incredible expenditure of resources.

There are several reasons for duplicate content which have
nothing to do with spamming. Sometimes the content is actually
duplicated, and sometimes it's just that there are several
different domains (at least the www and non-www versions) for
the same website.

Mirroring a site for load balancing - This is very common. The
purpose is to split up the traffic between two copies of the
site.

Mirroring for region - Sometimes site mirroring is done simply
to make it more efficient on the internet backbone itself. You
might put an identical copy of a site in Europe, for example,
to reduce traffic across the Atlantic, which should make it
faster in European countries.

Viral marketing - It's extremely common to allow other sites to
republish articles in return for a link.

Different domain names - Sometimes a site might be referenced
on many different domain names. You might want to allow the
.com, .net and .org versions of the name to all work the same,
you might allow for common misspellings or you might cover
different keywords (sewing-tips and sewing-secrets are examples
of possible combinations).

Different domain names for different markets - you might also
want to reference your site by different names in order to
target different markets. You could, for example, have a site
about search engine optimization and want to target both SEO
and web designers. Thus domain names like seo.com and
webdesign.com would make sense.

www - Any good webmaster knows his or her site needs to be
referenced with and without the www.

Okay, so what's the smart thing to do? Well, it is possible
that search engines do compare a limited number of pages to
check for duplication. They could certainly check if someone
reported something, and they might check directly linked pages
(although this is still a heck of a lot of overhead for very
little benefit).

Of course, Google and the other search engines can account for
a hefty percentage of the traffic received by a site. In fact,
sometimes the number can exceed 70 percent. So it's wise to
spend some time ensuring that you are totally clean when it
comes to search engine optimization. In other words, a
technician from any search engine should be able to examine
your site down to its smallest detail and find no evidence of
any kind of search engine spamming (attempting to get higher
rankings by unethical means). This is absolutely critical to a
site's survival for the long term.

Keeping that in mind, here's what I tend to do.

Multiple domains - Pointing multiple domains at the same site has
a tremendous number of advantages. Thus, I tend to follow the
advice given by others: take advantage of permanent redirection.
In other words, set up a redirection (a 301 status code) which
simply tells the browser "this page has moved, proceed to this
page, and the move is permanent." This tells the spider about the
redirection with no possibility of misunderstanding, yet allows
for the multiple domains.
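
As a rough illustration of the permanent redirect described above, here is a minimal sketch using Python's standard http.server module. The host names (www.example.com and its aliases), the port and the helper names are assumptions made up for this example; in practice a redirect like this is usually configured in the web server itself rather than written by hand.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical names, for illustration only.
CANONICAL_HOST = "www.example.com"
ALIAS_HOSTS = {"example.com", "example.net", "example.org"}


def redirect_target(host: str, path: str):
    """Return the canonical URL for an alias host, or None if no redirect is needed."""
    hostname = host.split(":")[0].lower()  # drop any port, ignore case
    if hostname in ALIAS_HOSTS:
        return f"http://{CANONICAL_HOST}{path}"
    return None


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = redirect_target(self.headers.get("Host", ""), self.path)
        if target is not None:
            # 301 = "moved permanently": browsers and spiders follow Location.
            self.send_response(301)
            self.send_header("Location", target)
            self.end_headers()
        else:
            # Request already uses the canonical host: serve the page.
            body = b"canonical site content\n"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Start the sketch server; call serve() to try it locally."""
    HTTPServer(("", port), RedirectHandler).serve_forever()
```

With this running, a request for /page1 on example.net would be answered with a 301 and a Location header pointing at http://www.example.com/page1, so every alias ends up at the one canonical address.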

Republished articles - I allow others to republish many of my
articles, and at this time I have records of over 10,000 of
them all over the internet on thousands of web sites. This is
not a problem, as these articles are sent in text format. The
webmaster must then drop this text into his site, which requires
some reformatting and shuffling around. Thus, the finished
articles may have the same text but the formatting is very, very
different. This is a highly respected method of gaining a large
number of incoming links: I give you something (an article,
i.e., content) and you give me something (a link back to my
site).

Mirroring - I haven't needed to do this yet, so I have no advice
as to what to do if a site requires actual, physical multiple
versions of itself. I would tend to just do it overtly (out in
the open) and not worry about it.


About the Author

Richard Lowe Jr. is the webmaster of Internet Tips And Secrets
at http://www.internet-tips.net - Visit our website any time to
read over 1,000 complete FREE articles about how to improve your
internet profits, enjoyment and knowledge.

