
While doing some backend maintenance for Iowa Blogs today, I noticed something very peculiar: some of the bloggers have duplicate posts. After drilling down further, I noticed that all of them were on the Typepad platform. After analyzing the RSS feeds produced by Typepad in combination with FeedBurner, I found that the same post shows up under two different links: FeedBurner’s version and Typepad’s version.

Example from Mike Sansone’s recent post within his RSS Feed…

1. http://feeds.feedburner.com/~r/Converstations/~3/153408787/discovery-along.html

2. http://www.converstations.com/2007/09/discovery-along.html

Both of these links go to the same content. So what’s the problem? Well, I use the link to determine the uniqueness of a post before it’s processed. Since these links are different, the engine thinks they are different posts, when in fact they are not.

Now I am checking the link in combination with the publication date and content to determine uniqueness. I will push the new version this weekend and clean up all the duplicate entries, which account for 177 of the 4681 posts currently archived, or about 3.8% of the archived content.
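
For anyone curious what that kind of check might look like, here is a rough sketch in Python. It is not the actual Iowa Blogs engine: the function names, the dict layout of a post, the SHA-256 content hash, and the sample date and body text are all assumptions made for illustration. The idea is that a post counts as a duplicate if its link has been seen before, or if its publication date and content match a post already seen, even when the links differ.

```python
import hashlib
from datetime import datetime


def post_key(published, content):
    """Build a link-independent fingerprint from publication date and content."""
    # Collapse whitespace so trivial formatting differences do not change the hash.
    normalized = " ".join(content.split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return (published.isoformat(), digest)


def dedupe(posts):
    """Keep the first post seen per link and per (date, content) fingerprint."""
    seen_links = set()
    seen_keys = set()
    unique = []
    for post in posts:
        key = post_key(post["published"], post["content"])
        if post["link"] in seen_links or key in seen_keys:
            continue  # same link, or same date and content under another link
        seen_links.add(post["link"])
        seen_keys.add(key)
        unique.append(post)
    return unique


# Placeholder date and body text; only the two links come from the feed above.
posts = [
    {"link": "http://feeds.feedburner.com/~r/Converstations/~3/153408787/discovery-along.html",
     "published": datetime(2007, 9, 1, 12, 0), "content": "Same body text"},
    {"link": "http://www.converstations.com/2007/09/discovery-along.html",
     "published": datetime(2007, 9, 1, 12, 0), "content": "Same  body text"},
]
print(len(dedupe(posts)))
```

One trade-off in this sketch: an exact content hash will miss duplicates if the two feed versions differ by more than whitespace (for example, if one feed injects tracking markup into the body), so a real implementation might need a looser comparison.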
