Duplicate Blogs, Domain Mapping, Technorati

  1. This is a support question and this post inspired the matter:

    http://en.forums.wordpress.com/topic.php?id=13998&page&replies=28

    As I struggled to figure out what was going on between Google and my WordPress blog, I wondered whether Google might be punishing "duplicate content" in its search results. Then I realized that "boles.wordpress.com" is domain mapped to "urbansemiotic.com" -- yet both of those URLs are valid, and they used to be separate domains with unique content.

    Note: I do not know how the domain mapping redirect is being done on the WordPress side of the equation or what, if any, .htaccess settings are being used, but I'm sure WordPress.com would have told me if I had thought to ask them about it before now.

    I then began to wonder about Technorati, since I have several blogs there and both "boles.wordpress.com" and "urbansemiotic.com" are listed as claimed by me. The ping update times are always different for each, which has always struck me as strange since they're both supposed to be, in a way, serving the same content but without mirroring.

    I decided to write to Technorati support and ask if I should delete or even "de-claim" my "boles.wordpress.com" blog, since it points to urbansemiotic.com and I don't want to risk having duplicate content out there.

    Here is the interesting reply I just received from Technorati support:

    ***

    I have gone ahead and made the necessary adjustments so "boles.wordpress.com" is listed as a duplicate in our system, so you do not have multiple listings every time you post. You do not need to ping "boles.wordpress.com". Do not hesitate to contact us if you have any other questions. Thank you for using Technorati!

    ***

    My support question is this: When WordPress.com hosts a domain mapped blog and a Ping is sent out to the update services via Pingomatic -- is the Ping being sent for "boles.wordpress.com" or "urbansemiotic.com" or for both?

    I guess I should also send this inquiry to Support in case that is the proper channel for it.

    Thanks!

  2. IN ADDITION:

    As a general reminder, here's what Google says about Duplicate content:

    "Syndicate carefully: If you syndicate your content on other sites, Google will always show the version we think is most appropriate for users in each given search, which may or may not be the version you'd prefer. However, it is helpful to ensure that each site on which your content is syndicated includes a link back to your original article. You can also ask those who use your syndicated material to block the version on their sites with robots.txt."

    http://www.google.com/support/webmasters/bin/answer.py?answer=66359

    We do not control the “robots.txt” file for our blogs, but you can see what is in yours -- I believe we all have identical files -- by typing your domain and adding that file as in this example:

    http://urbansemiotic.com/robots.txt

    FYI: In Google's analysis of the "robots.txt" file for urbansemiotic.com, the following appears in the "Parsing Results" area:

    "Crawl-delay: 3600 Rule ignored by Googlebot"

  3. Duplicate content is most worrisome when you're dealing with Scrapers who steal your content and republish it on their site.

    I've since moved to a truncated RSS feed -- RSS feeds are the easiest way for someone to republish your content on their site -- and a lot of the Scraping has stopped because it's harder to pull original content if you aren't providing a full feed.

    My concern, however, is whether Google and other search engines are pulling my truncated RSS feed and using it as a Sitemap.

    Would a truncated RSS feed lead to incomplete search engine indexing or not?

  4. I'd like to know that, too. I use the <!--more--> tag on ALL of my posts. I'm concerned Google using my RSS as a sitemap is going to screw things up.

  5. Hi abbydonkrafts --

    I wonder, too, if I should go back to full RSS feeds to provide complete content.

    I don't use the more tag, but that's an excellent thing to think about if that method might somehow restrict full indexing.

  6. UPDATE:

    Barry checked the server logs -- THANK YOU, BARRY! -- and one, and only one, ping is being sent out for "urbansemiotic.com" and while that is great news to confirm, I still generally wonder how and why "boles.wordpress.com" is getting pinged if I'm not doing it and the WP.com servers aren't doing it.

    Oh, well -- I'm not worrying about it since Barry took the time to check it out for me.

  7. > I still generally wonder how and why "boles.wordpress.com" is getting pinged if I'm not doing it and the WP.com servers aren't doing it.

    only boles.wp.com is getting pinged, urbansemiotic.com is just a pointer to it.

  8. Hi options --

    This is what Barry said: "I checked the pingomatic logs and verified that for mapped domains we are only sending 1 ping with the mapped URL, not the .wordpress.com URL."

  10. I thought that by "why "boles.wordpress.com" *is getting pinged*" you meant *incoming* pings. Did you? Barry is obviously talking about *outgoing* ones.

  11. I've always been talking about outgoing pings, and about why updating services like Technorati report that boles.wordpress.com has been updated when no one is actively updating that URL.

    Incoming pings are interesting -- but why would any ping be sent to the boles.wordpress.com address and not the urbansemiotic.com address?
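
For readers following the outgoing-ping question above, here is a minimal sketch of what "one ping with the mapped URL" could look like. It assumes the conventional `weblogUpdates.ping` XML-RPC call that update services commonly accept. How WordPress.com actually builds its pings is not shown in this thread, so `DOMAIN_MAP`, `public_url`, `build_ping_payload`, and the blog title are all hypothetical names used only for illustration. Nothing is sent over the network; the payload is only built and inspected.

```python
import xmlrpc.client
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mapping table: internal host -> mapped (public) host.
DOMAIN_MAP = {"boles.wordpress.com": "urbansemiotic.com"}


def public_url(url, domain_map=DOMAIN_MAP):
    """Return the URL with its host swapped for the mapped domain, if any."""
    parts = urlsplit(url)
    host = domain_map.get(parts.netloc.lower(), parts.netloc)
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


def build_ping_payload(blog_title, blog_url):
    """Serialize a single weblogUpdates.ping call that names only the public URL."""
    return xmlrpc.client.dumps(
        (blog_title, public_url(blog_url)),
        methodname="weblogUpdates.ping",
    )


if __name__ == "__main__":
    # "Example blog" is a placeholder title, not taken from the thread.
    print(build_ping_payload("Example blog", "http://boles.wordpress.com/"))
```

In this sketch the `.wordpress.com` address never appears in the payload, which matches what Barry described seeing in the logs. It does not explain why an update service would still list the `.wordpress.com` address as updated.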

Topic Closed

This topic has been closed to new replies.
