Incoming links…Blog Scrapers?…stolen content


    Hello – My blog is only a few weeks old. Today I noticed a link under ‘Incoming Links’ in the blog stats section. When I clicked on it, it took me to someone else’s site where I found my post replicated in its entirety, photo and all, here:
    My original post is here:
    I did a WHOIS query and found the registrar is Public Domain Registry (but couldn’t see anything for hosting?) I just wrote an email to the registrar, PDR, to let them know about the copyright violation but I haven’t heard back from them. The site that poached my content seems to be filled with a lot of pharmaceutical garbage. Any thoughts? Is there a way to prevent this? Am I within my rights to demand that the registrar remove my content from that site? There is no contact information on the poaching site itself that I can decipher. This is very disheartening. Help!



    The registrar just handles the domain name; they have no control over the site’s content. Whois:

    The second one is the whois information for the server’s IP address. It’s the web hosting company. Send them a DMCA notice; they’re legally obliged to obey it. (The hosting company probably isn’t responsible for that site; one of their customers is.)
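
    As a rough illustration of the step above, here is a minimal Python sketch that pulls the organisation name and abuse contact out of whois text for an IP address. The sample text is made up (192.0.2.0/24 and example.net are reserved placeholder values), and the field names follow the ARIN-style "Key: value" layout; other registries label their fields differently, so treat the parsing as an assumption rather than a general solution.

```python
import re

# Hypothetical ARIN-style whois output for a server's IP address.
# Every value here is a placeholder, not real registration data.
SAMPLE_WHOIS = """\
NetRange:       192.0.2.0 - 192.0.2.255
CIDR:           192.0.2.0/24
OrgName:        Example Hosting Inc.
OrgAbuseEmail:  abuse@example.net
OrgTechEmail:   noc@example.net
"""


def whois_fields(text):
    """Parse 'Key: value' lines into a dict mapping key -> list of values."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^([A-Za-z][\w-]*):\s+(\S.*)$", line)
        if match:
            fields.setdefault(match.group(1), []).append(match.group(2).strip())
    return fields


def abuse_contacts(text):
    """Return email addresses found in fields whose name mentions 'abuse'."""
    contacts = []
    for key, values in whois_fields(text).items():
        if "abuse" in key.lower():
            for value in values:
                contacts.extend(re.findall(r"[\w.+-]+@[\w.-]+\.\w+", value))
    return contacts


fields = whois_fields(SAMPLE_WHOIS)
print(fields["OrgName"][0])
print(abuse_contacts(SAMPLE_WHOIS))
```

    The abuse address, when one is listed, is usually the sensible place to send a takedown notice if the host does not publish a dedicated copyright contact.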

    Please post back here to let us know what happens.


    Thanks for clarifying – I’ll do that right away. I’ll let you know how it goes.



    If this is successful, will it just remove your blog from their site, or will everyone else’s stolen content be removed as well? I am assuming that your actions will just remove the one blog and not their entire site; correct me if I am wrong :). Other people seem to have had their content stolen too, and it would be best to shut down the whole site…

    That would make the sploggers’ lives as difficult as possible, as it would take them time and effort to sign up all over again and place all their ads, instead of just removing the one blog that someone complained about and carrying on making money from their site…


    I just sent a DMCA notice – I’ll report back with the results. Fingers crossed.



    pkayski: that’s up to the hosting company. They’re only obliged to remove content specifically identified in a DMCA notice, which is a single post/page in this case. A hosting company has no way of knowing what content has been copied without permission otherwise. It is entirely possible (however unlikely) that the site’s owners have obtained permission to republish content, and simply made an error in this one case.

    But there’s DMCA, which is the law, and then there’s the hosting company’s terms of service, which is their contract with the customer. That web site might have broken that contract, in which case the hosting company might shut it down when they realise what’s going on. It’s up to them, and it depends on their arrangement with the customer.

    My guess would be: first complaint, they’ll take down the single post. A bunch of legitimate complaints and they’ll consider cancelling the customer’s account.
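
    Since the host only has to act on content that the notice specifically identifies, it helps to name the exact original and copied URLs. Below is a minimal Python sketch that fills in a plain-text notice. The function name, the wording, and the example URLs are all hypothetical; the numbered items are loosely modelled on the elements usually listed for a DMCA notice (identification of the work, location of the copy, contact details, good-faith and accuracy statements, signature), so check the host’s own instructions before relying on it.

```python
from textwrap import dedent


def build_takedown_notice(owner_name, owner_email, original_url, infringing_url):
    """Compose a plain-text takedown notice naming one specific copied page."""
    return dedent(f"""\
        To the designated copyright agent:

        1. Copyrighted work: my original post at {original_url}
        2. Infringing material: the copy at {infringing_url}
        3. Contact: {owner_name} <{owner_email}>
        4. I have a good-faith belief that the use described above is not
           authorized by the copyright owner, its agent, or the law.
        5. The information in this notice is accurate, and under penalty of
           perjury, I am the owner of the copyright involved (or authorized
           to act on the owner's behalf).

        Signed: {owner_name}
        """)


# Placeholder values only; substitute the real post and the real copy.
notice = build_takedown_notice(
    owner_name="A. Blogger",
    owner_email="blogger@example.com",
    original_url="https://blog.example.com/my-post",
    infringing_url="https://scraper.example/copied-post",
)
print(notice)
```

    One notice per copied page keeps things unambiguous: the host can see exactly which URL to remove.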



    Sounds reasonable tellyworth! Thanks for the info!



    I doubt the DMCA notice is going to do anything for you. Most sites of this nature (promoting pharmaceuticals) are located beyond the reach of U.S. law. They may be in an offshore location where you would have a great deal of trouble finding any information about the site, about who hosts it or owns it.

    They use your content to attract Google links and to make their site look more credible, like a real health site, to impress potential customers.

    If it is any consolation, Google seems to rank such sites rather low. It’s unlikely that anybody else is ever going to recognize your content there. In other words, it won’t have any real effect on anybody you are trying to write for. Google obviously can tell that the netblock where the server is located maps to an offshore site that presents such content.



    Here’s a quick exercise you can do to decide whether it is worth putting any more time into fussing over this. Run the utility called traceroute (Linux, Mac) or tracert (Windows) in your command prompt. There are webpages that can run traceroute for you. Google traceroute. It is really easy.

    When you run traceroute, you get a list of all the routers (gateways) that a TCP/IP packet passes through between you and the server. Typically, if I traceroute a site in the USA, that’s about 12-15 gateways, and if I traceroute a site overseas, the number may be as many as 25. Traceroute tells you the time that each gateway between you and the other server took to respond. Sometimes, the host you are trying to reach is set up not to respond to traceroute, but that’s not a problem, because you will be able to see the locations of intermediate routers.
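
    To make the hop list easier to read, here is a small Python sketch that parses saved traceroute output and counts the gateways. The sample output is invented (the hostnames and the 198.51.100.x / 203.0.113.x addresses are reserved placeholders) and assumes the usual Linux-style layout of "hop number, hostname, (IP), timings"; Windows tracert prints its columns differently, so this parser would need adjusting there.

```python
import re

# Hypothetical Linux-style traceroute output saved to a string.
SAMPLE_TRACEROUTE = """\
traceroute to scraper.example (203.0.113.7), 30 hops max, 60 byte packets
 1  gateway.home.example (192.168.1.1)  1.204 ms  1.102 ms  1.050 ms
 2  isp-edge.example.net (198.51.100.1)  9.871 ms  10.020 ms  9.954 ms
 3  * * *
 4  core1.sfo.example.net (198.51.100.77)  18.332 ms  18.410 ms  18.295 ms
 5  203.0.113.7 (203.0.113.7)  21.018 ms  20.877 ms  21.140 ms
"""

HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")       # leading hop number, then the rest
HOST_RE = re.compile(r"^(\S+) \(([\d.]+)\)")    # "hostname (ip)" at start of the rest


def parse_hops(output):
    """Return a list of (hop_number, hostname, ip); None where a hop gave no reply."""
    hops = []
    for line in output.splitlines():
        hop = HOP_RE.match(line)
        if not hop:
            continue  # skip the header line
        number, rest = int(hop.group(1)), hop.group(2)
        host = HOST_RE.match(rest)
        if host:
            hops.append((number, host.group(1), host.group(2)))
        else:
            hops.append((number, None, None))  # e.g. "* * *"
    return hops


hops = parse_hops(SAMPLE_TRACEROUTE)
print(f"{len(hops)} hops")
for number, name, ip in hops:
    print(number, name or "(no reply)", ip or "")
```

    Router hostnames often contain airport or city codes, which is what gives the rough sense of location mentioned above.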

    If running traceroute shows that this company is overseas, and it is not in an OECD country, forget it. You are wasting your time. It’s a lawless world, and you have to figure out what things are worth worrying about. Don’t waste your time on something you can do nothing about.



    That’s a good point madcupcake… but if they are using ads (Google) or hosts within America or any other OECD country, then steps can be taken that way, right?

    At the end of the day though, I guess people should learn not to click on ads from websites like that… But then, if people didn’t click on ads there would be no such thing as spam, ey?



    Is there really anything that hurts us when this happens? As far as I can tell, as soon as I post a blog, about 20 copies go up on these spam sites. But they usually just quote the first 200 or so words and then put a link to my page. I checked my traffic and about 10% of it comes from these spam pages. How exactly is this hurting me? I don’t think it is.

    Jason Dragon



    Yeah, depends how people look at it I suppose. Some see it as others making money from stuff they ‘created’ (or wrote I guess), and people feel entitled not to have others make money off the stuff they wrote… But I suppose, unless you have your blog on private, you have to be prepared for the pirates (like anyone else who creates creative content for money: movies, songs, etc.)


    I tracerouted the website that copied your post, and found that it is hosted in Concord, California, by a web hosting company that appears to host many such “problematic” websites, sites that have in the past been sources of spam and other such activities. I googled this webhost, and I even found that they were “recommended” by persons who wanted no interference from their hosting provider. The company has a large netblock, with all but 16 of the 254 addresses in a subnet. Most of them are not occupied by a host listening on port 80, a web server, but could be used for other, perhaps nefarious purposes.
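
    For anyone wondering where the figure of 254 comes from: it matches the number of usable host addresses in a /24 subnet. The Python sketch below shows the arithmetic with the standard `ipaddress` module. The 192.0.2.0/24 range is a reserved documentation block standing in for the real netblock, and treating the subnet as a /24 is an assumption based only on that 254 figure.

```python
import ipaddress

# Placeholder /24; the real netblock is not named in this thread.
subnet = ipaddress.ip_network("192.0.2.0/24")

# hosts() leaves out the network and broadcast addresses.
usable = list(subnet.hosts())
print(subnet.num_addresses)  # total addresses in the block
print(len(usable))           # usable host addresses

# "All but 16 of the 254" leaves this many with the company.
held = len(usable) - 16
print(held)


def in_netblock(ip, cidr):
    """Return True if the IP address falls inside the given CIDR block."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)


print(in_netblock("192.0.2.77", "192.0.2.0/24"))
print(in_netblock("198.51.100.5", "192.0.2.0/24"))
```

    A check like `in_netblock` is also roughly how a blocklist entry for an entire netblock works: one range covers every address in it.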

    The owner of the domain that is stealing your content owns 120 other domains. For $136 you can have a report run that will give you a complete listing of the domains he owns. My bet is that you will find he has hundreds of such businesses, and has spread them all over the world on different hosts.

    Don’t worry about this kind of thing. As I said before, such content is simply buried in Google, which knows very well how to sort out the wheat from the chaff. I’m sure that one technique Google uses is simply to identify the netblocks and IP addresses used for such activities, and rank them accordingly.

    You will get absolutely nowhere by trying to pursue legal channels, unless you want to spend a lot of money on this.


    Here is a forum where you can read some more about the “company” that is hosting your friend’s website:

    As you can see, they have been on blocklists for years. They are one of the few cases where an entire netblock is blocked by some recommendation, because they are notorious for spam, spyware, malware. Note the comment about “autopilot”, which suggests that this is simply a case of somebody offering a block of IPs to others.

    Anybody with $500 a month can rent a T1. $5,000 a month will get you a partial T3. That’s in a place like Concord. It’s possible that somebody is even renting out a block within a legitimate hosting facility. The hosting facility has no concern with them, because they are merely subleasing network access, not internet access. This company is its own ISP, and it can do what it wants. I’ll bet it’s much harder, in terms of business licensing and so forth, to open a hotdog stand in Concord, California, than to open an ISP.


    czechgreenways – thanks for all the information, I really appreciate it and intend to make good use of it – and thanks to everyone for their interest and comments. I’m quite green when it comes to blogging so it just caught me off guard to see my little innocuous breakfast crepe post posted amongst pharmaceutical spam. I realize it’s not doing me harm but it frustrates me on principle. I haven’t received a response from the hosting company yet but I fully intend to be as much of a thorn in their side as I can until they remove my content. From what you found out about them, it sounds like the powers that be should be having a look at them anyway. Surely there is a governing body to approach and make a formal complaint through.



    I hate to say this, but trying will only make you madder. First of all, there is no way to identify who is the “hosting company” in this case, because the law is designed for the case where you have a contract for a website, or post on someone else’s site. But in this case, there is no hosting company. You have a domain owner with a generic address and name that is obviously not real, and a netblock owner that doesn’t even have a website that functions. The netblock owner rents the IP and probably the server or a cabinet to put the server in, but you don’t really know. You would have to do a lot of legal discovery to find out, and by the time you did, they would move the content to their hosting facility in Panama or Latvia. If you google the name of the netblock owner, as I demonstrated, you get hundreds of links to forums in which they are cited as a source of spam and malware. I noticed that many of the IPs on their netblock were not hosting websites. They probably host a lot of bots, servers that are set up to mimic clients and retrieve materials. Unless you want to pay a lot of money to lawyers, you won’t get any satisfaction out of this. It is a lesson on how lawless the Internet is.



    This thread is a few years old. It looks like the guy’s clients are happy.

    He leaves them alone.



    Damn, czechgreenways, it’s all so full on, what you explained these guys are doing…

    It feels very organized/professional as well… Not just some kid in a basement using ads from Google to make a few bucks a week…

    Maybe one day there will be implementations to make this crap go away! Maybe when there are no crimes in real life, ey? hehe



    Just to clarify a few of czechgreenways’ comments:

    The web hosting company that runs the server in question is located in the USA. They may well have a permissive attitude to spam (and are legally entitled to host spammers if they wish), but they are still subject to the DMCA, and still legally obliged to remove content identified in a takedown notice.

    If the web hosting company ignores a valid DMCA notice, they are likely breaking the terms of contracts with their own upstream service providers. The next step in the process would be to issue a DMCA notice to those upstream providers.

    Another approach that can be taken in parallel is to make a complaint to the scraper site’s ad networks (usually AdSense). Those who have tried that approach say that it sometimes results in the scraper’s ad accounts being shut down.



    Here’s Google’s info on making a formal complaint about a site that runs AdSense with material that infringes copyright:
