The issue of duplicate content is a thorny one that can affect sites' search rankings, and one that has even caught Google out in the past.
So it's good that the search giant's Adam Lasnik has written a post that throws some light on how it deals with the problem.
Duplicate content refers to blocks of text on the same site, or across domains, which are either exactly the same or very similar.
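To make "very similar" a little more concrete, here is a minimal Python sketch that scores two passages by the overlap of their three-word sequences (Jaccard similarity over word shingles). This is a common textbook technique, not a description of how Google detects duplicates; the function names and the shingle size of three are assumptions chosen for illustration.

```python
def shingles(text, size=3):
    """Return the set of overlapping word sequences of length `size`."""
    words = text.lower().split()
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(a, b, size=3):
    """Score two texts from 0.0 (no shared shingles) to 1.0 (identical sets)."""
    set_a, set_b = shingles(a, size), shingles(b, size)
    if not set_a and not set_b:
        return 1.0  # two empty texts are trivially identical
    return len(set_a & set_b) / len(set_a | set_b)


original = "the quick brown fox jumps over the lazy dog"
near_copy = "the quick brown fox leaps over the lazy dog"
unrelated = "search engines filter pages that repeat the same text"

print(jaccard_similarity(original, original))   # exact duplicate
print(jaccard_similarity(original, near_copy))  # one word changed
print(jaccard_similarity(original, unrelated))  # different content
```

An exact copy scores 1.0, a passage with one word changed scores somewhere in between, and unrelated text scores 0.0. Where to draw the line between "similar" and "duplicate" is a judgment call that this sketch does not attempt to settle.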
According to Lasnik, most instances are "unintentional or at least not malicious in origin", but in others, content is copied across domains in a bid to manipulate search rankings and generate more traffic.
Google focuses on the issue because it wants to provide better search results - users want to see a range of distinct pages when they search for something.
To provide this, search engines have to contend with spammers using invisible or irrelevant content, scraper sites that use other sites' content to obtain AdSense clicks, and a range of other rubbish.
Duplicate content penalties
According to Lasnik, duplicate content is normally dealt with by filtering adjustments - if your site has regular and printer versions of a webpage, then Google will list just one version of the page.
When the duplicate content is clearly an attempt to manipulate rankings, Google says it will "make appropriate adjustments in the indexing and ranking of the sites involved."
Here are some of Lasnik's tips for webmasters:
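* Block the version you don't want Google to index - disallow those directories or make use of regular expressions in your robots.txt file (in practice, Googlebot supports simple * and $ wildcard patterns rather than full regular expressions).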
* Take care when syndicating content - make sure other sites that use your content include a link back to the original source. Google will always show the version it thinks is most relevant to a given search.
* Use TLDs - adopt country-specific top-level domains for country-specific content, rather than subdirectories or subdomains.
* Use smaller boilerplates - minimise boilerplate repetition by including shorter summaries of copyright text at the bottom of pages, and linking to one page with more details.
* Don't worry about scraper sites - Lasnik says that webmasters shouldn't be too concerned about these sites, as they are "highly unlikely" to adversely affect your rankings.
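As an illustration of the first tip, the sketch below uses Python's standard urllib.robotparser module to check that a robots.txt rule really does block the printer version of a page. The /print/ directory, the example.com URLs and the is_blocked helper are hypothetical, and the standard-library parser does plain prefix matching, so the example sticks to a simple directory rule rather than wildcard patterns.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep crawlers out of the printer-friendly copies.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""


def is_blocked(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt rules disallow `url` for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# The printer version is blocked; the regular version stays crawlable.
print(is_blocked(ROBOTS_TXT, "https://example.com/print/article.html"))
print(is_blocked(ROBOTS_TXT, "https://example.com/articles/article.html"))
```

Running a check like this before deploying a robots.txt change is a cheap way to confirm that only the unwanted version is excluded, though a real crawler's interpretation of the file may differ from this parser's.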