How The Simpsons Tempted Me To The Dark Side

Disclaimer: In the following blog post, I’m going to mention doing things that aren’t legal.  I just want to clarify from the outset that I haven’t done these things, I’m not condoning these things, and I’m definitely not going to give step-by-step instructions on doing these things.  So if you came here looking for instructions of this nature, you’ll be disappointed.  Also, any comments that give/link to instructions or link to programs to do these things will be removed.

For the most part, I’m a law-abiding citizen.  I like staying within the legal lines.  My "criminal record" would be a boring read – if it weren’t nonexistent.  So when it comes to obtaining movies and TV shows to watch, it should be no surprise that I do things the legal way.  I stream from Netflix and Amazon VOD, record using my DVR, purchase DVDs, or borrow DVDs from the library.  I never, ever download videos in the less-than-legal ways that the copyright owners haven’t approved.  Recently, however, I was sorely tempted.

A couple of weeks ago, while walking through a local store, we saw the new line of Simpsons Lego minifigures.  Of course, my boys wanted them.  They didn’t care that they had never watched a single episode of The Simpsons or that they wouldn’t be able to tell which one was Bart and which one was Milhouse.  All they cared about was that these were new Lego minifigs.

I decided that perhaps the time had come to introduce my boys to The Simpsons.  I pulled out my DVD copy of The Simpsons: Season 1 (a present from B years ago and yet still shrink wrapped).  We watched the first episode and my boys were hooked.  They quickly got through the rest of the first season.

This was where we hit a wall.  How would we get the rest of the seasons for the boys to watch?  The Simpsons is no ordinary TV show.  It has been on the air for 25 years and has amassed five hundred and fifty episodes.  We could purchase each of the DVD sets for seasons 2 to 24, but that would cost over $460 – way too expensive for our bank account.  If Netflix had them available, we could stream them from there, but sadly there isn’t a single episode on there.  They aren’t available via Amazon Prime either.  Amazon’s VOD service has some of the episodes, but not all.

This leaves me with two legal options.  First, I could take them out from the library.  We actually wound up taking Season 2 out of the library, but only got to keep it for four days (two days plus a renewal time of two days).  That was only enough time to watch one of the four DVDs in the set.  We could have kept the set out longer and paid late fees ($0.25 per day), but at that rate we would have needed to pay $3 per season or $69 to watch the entire set.

Alternatively, I could subscribe to Netflix’s DVD-by-mail service for the duration of our Simpsons watching time.  Given that it would have taken us about 16 days to get through a set, we would have needed to subscribe to the DVD service for just over a year.  (This is assuming no downtime of needing to wait for the next disc to arrive.)  This would cost about $130 – even more than the library option.
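For anyone who wants to check my math, here is a rough sketch of the arithmetic behind those two options.  The figures are the ones quoted above; the names are just illustrative, and the count of 23 sets assumes seasons 2 through 24.  I've left the Netflix dollar figure out of the sketch since it depends on the plan price.

```python
# Rough cost comparison using the figures quoted in the post.
# Names are illustrative; 23 sets assumes seasons 2 through 24.

SETS = 24 - 2 + 1            # seasons 2..24 -> 23 DVD sets
DAYS_PER_SET = 16            # four discs, about four days per disc
LOAN_DAYS = 4                # two days plus one two-day renewal
LATE_FEE_PER_DAY = 0.25      # dollars


def library_late_fees():
    """Return (late fees per set, late fees for the whole run) in dollars."""
    late_days = DAYS_PER_SET - LOAN_DAYS
    per_set = late_days * LATE_FEE_PER_DAY
    return per_set, per_set * SETS


def rental_days_needed():
    """Days of back-to-back disc rentals needed, ignoring mailing time."""
    return DAYS_PER_SET * SETS


per_set, total = library_late_fees()
print(f"Library late fees: ${per_set:.2f} per set, ${total:.2f} overall")
print(f"Disc rental time: {rental_days_needed()} days")
```

Twelve late days per set at $0.25 a day gives the $3 per season, and 23 sets at 16 days each is where "just over a year" comes from.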

Clearly, there is no easy, inexpensive way to watch The Simpsons from the beginning to the present episodes.  Or is there?  While I haven’t actually done it myself, I do know in theory how to download items from less-than-legal locations.  If I really wanted to, it wouldn’t take me long to get rips of the DVDs on my computer for the boys and me to watch.  I might even be able to do it in such a fashion as to avoid detection by the companies that watch for people illegally sharing files.

Still, I might slip up and be found.  A fine of even $750 per episode (the minimum fine for a copyright violation) could still work out to over $350,000.  At that rate, we might as well buy all of the DVDs – a hundred times over.

On the other hand, there are sneakier ways of pirating material.  Take the library, for example.  Taking the DVD out from the library is completely legal.  Once it is time to return it, it goes back and we can’t watch it again unless we take it back out.  What if we ripped the DVD though?  We would then be able to watch the episodes at our leisure.  I could even assuage my conscience by telling myself that I’ll delete the episodes when we’re done with them and that it’s just an "extended library loan."  My chances of being caught doing this are virtually zero.

So what is stopping me?  My children.  I want to set an example for them.  If I believe that downloading copyrighted material without authorization is wrong, then what kind of lesson would I teach them if I bent my moral rules for the sake of convenience?  Sometimes doing the right thing is difficult.  Sometimes doing the right thing means going without something you really want.  It can be very easy to shrug off your morals and take "the quick and easy path."  The dark side did tempt me, yes, but I refused to give in.  We’ll watch The Simpsons the slower, but legal way of library rentals.  I just wish the content owners would license The Simpsons to Netflix so that my boys could view it in an easier, but still legal fashion.

NOTE: The "Simpsons DVD" image above was taken by me of our Season One DVD set.

Defeating BuzzMyFx Content Scrapers

My next post was going to be one about WordPress issues, but then something else came up.  That post will still go live on Wednesday.  Right now, I want to talk to you about content thieves and scrapers.

We had a run-in with some content scrapers two years ago.  That scraper took the content, but left the image links intact.  At the time, I showed how to defeat that particular variety of scraper.  This scraper, however, was trickier.

I’m not sure what the purpose of “BuzzMyFx” is beyond content hijacking.  If you “check” to see if your site is scraped by them (by going to YourSiteName.buzzmyfx.com), you might see that your site isn’t being scraped.  However, your mere act of checking will CAUSE them to start scraping your site.  Scraped sites have all content redirected through their servers.  Images, stylesheets, JavaScript files, and more all seem to pour through BuzzMyFx’s servers instead of yours.  What’s worse is that, since all links go to BuzzMyFx now, clicking on a link to another site causes that site’s content to be scraped as well.

It didn’t take long to deduce what was going on.  BuzzMyFx is a server-side scraper.  Imagine someone coming to your site under normal circumstances.  They tell their browser to load “www.MyWebSite.com”.  The browser then contacts the server hosting your site asking for that page.  The server gives the page to the browser, which shows it to them.  Simple, right?

BuzzMyFx adds an extra layer.  If you go to MyWebSite.BuzzMyFx.com, your browser goes to BuzzMyFx’s server first.  BuzzMyFx’s server then contacts your server (as if it was a browser) for the page.  Your server gives the page to the “BuzzMyFx browser” as it does to all other browsers requesting pages. BuzzMyFx then alters the page’s code to direct all links back to them.  They also add in their own StatCounter script and change ad code to give them the revenue instead of the site owner.  Finally, they give the changed version of the page to you.

Pretty scummy, right?  Of course, by doing this, they are committing massive copyright infringement at the very least.  At $750 – $150,000 per infringement, dozens of infringements per site scraped, and possibly hundreds of thousands of sites affected, this could land them on the hook for millions of dollars.  Then there are the problems encountered if they are using a trademarked logo/name without authorization.

So how do you stop them?

Thankfully, servers keep logs of every visit.  As you loaded this up to read this post, my server dutifully recorded information such as your IP address, where you were referred from, the current date and time, and what page you were loading up.  This happens at all websites you visit, but not all people know how to read the logs.  As a webmaster, I am well versed in reading server logs.

I loaded up their scraped version of my site while checking my server logs and there it was: 192.151.156.170.  That was the IP address doing the scraping.
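If you'd rather not eyeball the log, a few lines of Python can do the counting for you.  This is only a sketch: it assumes your server writes Apache's common or combined log format, and the sample lines below are made up for illustration (only the 192.151.156.170 address comes from my actual logs).

```python
import re
from collections import Counter

# Made-up sample lines in Apache's "combined" log format. Only the IP
# 192.151.156.170 is from the post; the rest is illustrative.
SAMPLE_LOG = """\
203.0.113.5 - - [01/Jan/2000:10:00:01 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"
192.151.156.170 - - [01/Jan/2000:10:00:02 +0000] "GET / HTTP/1.1" 200 5120 "-" "-"
192.151.156.170 - - [01/Jan/2000:10:00:03 +0000] "GET /style.css HTTP/1.1" 200 880 "-" "-"
198.51.100.9 - - [01/Jan/2000:10:00:04 +0000] "GET /about/ HTTP/1.1" 200 2310 "-" "Mozilla/5.0"
192.151.156.170 - - [01/Jan/2000:10:00:05 +0000] "GET /logo.png HTTP/1.1" 200 4096 "-" "-"
"""

# Client IP, two ignored fields, [timestamp], then "METHOD path ...".
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)')


def requests_per_ip(log_text):
    """Count how many requests each client IP made in the given log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


for ip, hits in requests_per_ip(SAMPLE_LOG).most_common():
    print(ip, hits)
```

An address that grabs the page plus every stylesheet and image within a second or two of you loading the scraped copy is a good candidate for the scraper.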

Next, I opened up my “.htaccess” file.  This is a special file on your web site that controls who can access your site and what they can and can’t see.  I added the following lines at the beginning:

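# Send every request from the scraper's IP address to the warning page.
RewriteEngine On
RewriteCond %{REMOTE_ADDR} ^192\.151\.156\.170$
# Skip the warning page itself so the redirect doesn't loop.
RewriteCond %{REQUEST_URI} !/content-thief\.html
RewriteRule ^(.*)$ /content-thief.html [R,L]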

Finally, I created a simple HTML page called “content-thief.html” with big, bold, red letters warning people that this was a scraped site and they should go to my real site.  (I didn’t link to my real site since the link would be altered, so I just spelled it out.)  You can go ahead and copy my “content-thief.html” page for your own usage.  Just be sure to change the site name to your own.

Unfortunately, BuzzMyFx has already cached some of my content, so the main page of my “BuzzMyFx-ed” site doesn’t show this warning.  Still, as their content expires and their server tries to grab the new content, it will be replaced by my warning.  (I went easy on them.  My initial reaction was to redirect them to some hard core pornography.  I didn’t want my name linked with that though.)

The other problem is that they can change their IP address, which will let them bypass this rule.  I can add their new IP address in, but it will be a constant effort to keep up with them.  Perhaps the best remedy would be for all affected site owners to contact the people who run this “service.”  Unfortunately, they’ve hidden who they are from WHOIS, but they can’t hide two things:  1) Their domain name is registered through eNom and 2) Their site is hosted by DataShack.net (I originally wrote CloudFlare.com here; see the first update below).  If we can’t get them to stop, we can always get their hosting and domain name cut off.
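To make keeping up with new addresses a little less tedious, the rules can be generated from a list instead of edited by hand.  The following is a minimal sketch, not something I run on my own server: the function name is made up, and the second address in the usage example is a placeholder from the documentation range rather than a real scraper.  It folds all of the addresses into a single RewriteCond pattern.

```python
import ipaddress


def build_block_rules(ip_addresses, warning_page="/content-thief.html"):
    """Return .htaccess lines that redirect the given IPv4 addresses
    to warning_page. The function name and defaults are illustrative."""
    escaped = []
    for raw in ip_addresses:
        ip = ipaddress.IPv4Address(raw)  # raises ValueError on a bad address
        escaped.append(str(ip).replace(".", r"\."))
    ip_pattern = "^(" + "|".join(escaped) + ")$"
    page_pattern = warning_page.replace(".", r"\.")
    return "\n".join([
        f"RewriteCond %{{REMOTE_ADDR}} {ip_pattern}",
        f"RewriteCond %{{REQUEST_URI}} !{page_pattern}",
        f"RewriteRule ^(.*)$ {warning_page} [R,L]",
    ])


# 203.0.113.7 is a placeholder address, not a known scraper.
print(build_block_rules(["192.151.156.170", "203.0.113.7"]))
```

Paste the output into your .htaccess file in place of the old rules each time the list changes.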

Here’s hoping this scraper menace ends soon so we can all get back to producing great content instead of trying to protect our content from being scraped.

UPDATE:  CloudFlare.com is denying being their host.  As Heather commented below, they say they are a “reverse proxy, pass-through security service.”  I’m guessing that BuzzMyFx is using CloudFlare to hide their server’s real IP address.  However, the IP address I obtained that was seizing my content (192.151.156.170) isn’t “hidden” at all.  That IP address comes from DataShack.net.  So focus communication on them, not CloudFlare.

UPDATE #2:  If you aren’t technically inclined enough to know how to fiddle with htaccess and/or FTP files to your server, but you are using WordPress, you can also use the WP-Ban plugin to keep them off your site.  This plugin lets you list IP addresses and even leave a specific message for those IP addresses to see.

UPDATE #3: According to Lazy Budget Chef, even if you manage to contact BuzzMyFx, they will try to sell you a domain protection package to “steal the blogger’s legal right to their blog, their log in credentials, mailing list, and other personal information.”  So even if you manage to contact these scrapers, don’t sign anything they give you!  You shouldn’t need to sign some form of contract for them to cease scraping – they should just stop.  Be very wary of these people.

UPDATE #4: It looks like we’ve won this battle.  BuzzMyFx seems to be down.  They could still flee to another hosting provider (or even the same one signed up under a different account) and start their service back up.  Even if they don’t come back, I’m sure other scrapers will take BuzzMyFx’s place.  Still, you need to take each victory as it comes.  Congratulations and thanks for helping take down this scraper, everyone!

NOTE: The “burglar” image above is by tzunghaor and is available from OpenClipArt.org.

The Intellectual Property of Tweets

Last week, GaltsGirl tweeted a question to her followers.  She asked "Are tweets entitled the same intellectual property courtesies as blog posts?"  My answer was "If I’m using someone’s tweet for something I usually ask first. That said, I don’t see it as the same as a blog post IP-wise."  Unfortunately, thanks to the limited nature of Twitter comments and my assumption that credit would always be given, this led to a bit of confusion.  While I cleared up that confusion on Twitter (or, at least, I hope I did), the interaction did inspire me to write about it at length.

Tweets versus Blog Posts

Part of the problem stemmed from my use of the phrase "I don’t see it as the same as a blog post IP-wise."  By this I meant that blog posts can be quoted without using the entire post.  If you quoted this article in a blog post of your own, you could say that I wrote:

Unfortunately, thanks to the limited nature of Twitter comments and my assumption that credit would always be given, this led to a bit of confusion.

While you should properly credit this quote, there would be no need to compensate me or even ask for my permission to use it.  After all, while this entire post is my intellectual property, a short quote falls under fair use.  However, if you "quoted" me by copying my entire article word-for-word, that would be copyright infringement.  Copying this entire post to your blog could result in DMCA takedown requests, legal threats if those were ignored, and even large fines if the entire affair proceeded to the courts.

A tweet, on the other hand, is usually too small to quote part of effectively.  To quote someone’s tweet, one usually has to use the entire thing.  This raises the question: If using an entire blog post without permission is copyright infringement, is using an entire tweet infringement as well?

RTs and Inviting Infringements

On the Twitter platform itself, I’d say that quoting someone’s tweets isn’t copyright infringement.  After all, Twitter itself gives a method for doing this: Retweets.  What about off of Twitter, though?  Is using someone’s tweet in a blog post, a book, or some other medium copyright infringement if explicit permission isn’t granted?

Let’s remove two obvious "fair use" cases immediately.  If the quote is used for news reporting purposes ("Lady Gaga tweeted to her followers…") or parody, then permission isn’t required.  It is good form to ask permission, of course, but it isn’t a requirement.

Let’s also assume that credit is given.  If credit isn’t given, then I might be willing to call it infringement.  If someone tweeted something so interesting, insightful, foolish, or otherwise useful to your larger project, it’s only fair that they should get credit for their words.  You wouldn’t quote a passage in a book without stating what book that passage came from.  Similarly, one should never quote a tweet without naming the user who tweeted it.

Beyond those cases, I have to admit that I’m torn.  I’ve blogged about how you can’t just take an image off of Google Images and use it however you like.  Grabbing someone’s tweet and sticking it in your post, at first glance, appears to be like grabbing a picture from Google Images and putting it in your post.  However, the effort invested in a single tweet hardly seems to compare to the effort invested in making an image.

More Flexible Copyright Law

I think this example highlights the need to reform copyright law (something I’ve written about before).  If copying a five-hundred-page book leads to a $750 fine, why would copying a one-hundred-forty-character tweet hold the same potential fine?  If copying an MP3 – which has a market value of $0.99 – leads to an $80,000 per song verdict, why would copying a tweet (market value of $0) lead to a similar fine?

In addition, profit motive should be considered when potential fines are calculated.  If the quoted tweet is used in a non-profit manner (say, in a blog post such as this one), then any "infringement" fees should be minimal.  If the quoted tweet was used in a for profit manner (say, a book titled "250 Great Tweets"), then infringement fees would be higher.

Protection of Public Statements

In the end, I consider tweets to be short public statements.  One can’t stand in front of a big crowd of people, say something, and assume that *NOBODY* is going to quote them.  Taking words out of context or not crediting them is unacceptable as is making money off of the tweet (in a non-news reporting, non-parody manner) without compensating the person.  However, on the scale of copyright infringement, using someone’s tweet without permission isn’t anywhere near as bad as taking an entire blog post without permission.

PostScript

During my Googling for this blog post, I ran into an article about a similar issue.  In this case, there was a lawsuit not over a tweet, but over a short quote from William Faulkner’s Requiem For A Nun.  Sony Pictures used a nine word (97 character) quote from it ("The past is never dead. It’s not even past.") in the movie Midnight in Paris.  The Faulkner estate worried that the use of the quote in the movie might confuse people into thinking there was a relationship between the estate and Sony Pictures.  Sony Pictures, meanwhile, decried the lawsuit as frivolous.  On the day that GaltsGirl posed her question, July 19th 2013, a ruling was handed down stating that such a short quote didn’t constitute copyright infringement.

A little closer to the topic at hand, I found a TechDirt story about a journalist who claimed her tweets were "off the record" and thus weren’t allowed to be repeated by anyone.  When someone questioned her on this, she threatened a lawsuit.  It doesn’t look like she ever went through with it, but she did see the inside of a courtroom when she was convicted of harassing a former boyfriend’s daughter by posting her private journals online.  (Apparently she thought "off the record" tweets couldn’t be reposted, but private journals could be.)

This, in turn, led to more articles, including a 2009 blog post by Mark Cuban, all questioning just how copyrightable tweets are.

Content Thieves and Malicious DMCA Takedowns

Pretty much anyone who has put content online has encountered it.  Someone takes your content and puts it on their own website.  They might be generating ad revenue from your content or they might just be trying to gather good content (as opposed to generating good content) so that their site looks good.

Whatever their reasons, their theft of your content has serious repercussions.  Beyond simple copyright theft, search engines can knock sites down if they see the same content on multiple websites.  If a content thief takes your content, it could mean that you actually get dinged in the search engine rankings.

Thankfully, content owners have a recourse via the Digital Millennium Copyright Act (DMCA).  The DMCA says that copyright owners who find their content online without permission can send a letter (DMCA takedown notice) to the person or company hosting the material.  The person/company then must take down the material.  Once they do, the person who put the content online can then either accept that the material was taken down or contest this takedown.

Once they contest it, it is a matter between the poster and the copyright owner.  The hosting company is off the hook and isn’t involved.  This is a good thing.  Were this not the case, the mere act of letting users put any content online would be too much of a lawsuit risk.  The Internet as a whole would grind to a halt.

Sadly, however, DMCA requests can be abused.  Recently, Retraction Watch, a blog run by Ivan Oransky and hosted on WordPress.com, found many of its postings gone.  After some investigating, it turned out that one of the subjects of those postings, a cancer researcher who was being investigated for fraud in his research and inaccuracies on his resume, wanted to improve his online reputation.

To improve his online reputation, the cancer researcher hired a company.  The company copied Retraction Watch’s content.  Then, they filed DMCA takedown notices with WordPress, claiming the content was their own.  WordPress complied and the content was deleted.  Now, Retraction Watch is trying to recover their lost content.

This is, understandably, very worrying.  Theoretically, false DMCA takedown requests constitute perjury.  Practically, though, there is no penalty for filing a false request.  How many more people will find their content gone via DMCA takedown because some person or company doesn’t like what they posted?  How many content thieves will steal content and then try to take the originals down to bolster their claim over the stolen content?

How can you protect yourself?  The best way is to always back everything up locally.  This way, even if you are struck with a malicious DMCA takedown notice, you won’t actually lose any content.  If you are running a self-hosted WordPress blog, there are many plugins that you can use to back up your database.  (I prefer WordPress Database Backup.)  If you are on WordPress.com or Blogger.com, this site has some recommendations.

Even if you think you can trust your host, it is a good idea to back up.  You never know when a malicious DMCA request will come your way and it is the best method of protecting yourself.

NOTE: The burglar image above is by tzunghaor and is available from OpenClipArt.org.

Glee-Coulton Copyright Commotion

It isn’t news that people seem to think that “on the Internet” equals “free for us to use in any way we see fit.”  It happened with NickMom.  It happened to Kristine and photos of her baby Cora who passed away from congenital heart disease.  It has even happened to both B and to myself with scrapers taking our content for their own uses.  This instance, however, is a bit bigger.

In 2006, a former computer programmer turned musician, Jonathan Coulton, wrote a cover of Sir Mix-A-Lot’s “Baby Got Back.”  He made some major alterations including turning it into a light acoustic, almost folksy, song, including duck quacks where curse words might be, and changing a “Mix-A-Lot” reference to “Johnny C.”

Everything seemed to be fine until a fan of Coulton’s spotted his song on iTunes.  Not under his name, mind you, but as a song for an upcoming episode of Glee.  Furthermore, he didn’t appear to be credited.  Many people figured that there was some kind of mistake, but then the song played exactly the same on the episode itself.  And I mean, exactly the same.  It had the same melody, the same duck quacks (barely heard like someone tried to “scrub them out” but failed), and even the changed “Johnny C” lyrics.  Furthermore, Glee is selling “their version” on iTunes without crediting or publicly acknowledging Coulton in any way.

Sound Cloud even put up a comparison.  You can put on some headphones and listen to the two together.  I did and couldn’t tell them apart.

Coulton tried to get in touch with them and was told that he should be thankful for the “exposure.”  You know, that massive exposure that one gets when a big television show on a major network steals one’s work and doesn’t credit one in any way, shape, or form.  They claimed that they were within their legal rights to do what they did.

The kicker:  They might just be.  It turns out that, thanks to the complicated twists and turns of copyright law, if Artist A makes a song, Artist B makes a derivative work of that song, and Big TV Show C uses Artist B’s version, they just have to pay Artist A.  Jonathan Coulton’s only possible legal avenue centers around the possibility that Glee took his exact audio tracks and used those instead of recreating them.

You see, at one point, Jonathan released his source tracks for a Creative Commons fundraiser.  Some people believe that Glee took these tracks and used them for their own version.  The problem here is that the license they were released under was Non-commercial.  This means that I could take them and release a version of me singing to the song, but I can’t sell that version or use it in a commercial work.

You know, like a television show.

This situation is still developing and it isn’t clear whether Jonathan Coulton will get any credit or payment from Glee.  Since Coulton’s song was copied, many other artists have come forward (or have had their previous claims publicized more) about Glee ripping them off as well.

A television show about underdogs whose only recourse is their singing skills stealing from other artists and using their mega-corporation’s legal might to make sure that they can get away with it?  I’m not sure if that’s irony, but it is extremely repugnant.  It almost makes me want to start watching Glee just so I can quit watching in protest.  (Almost, but not quite.)

Instead, I think I’ll buy a song or two from Jonathan Coulton’s shop.  In fact, if I buy his newly released cover of Glee’s cover of his cover of Baby Got Back (that is to say, his original version), he’ll donate the proceeds to charity.  A very classy move by Jonathan Coulton in response to Glee’s much-less-than-classy move.

The bottom line here is the same as pretty much every case that I detailed in the beginning of this post.  Had Glee offered to pay Jonathan Coulton for permission to use his arrangement, he likely would have agreed.  Had they asked politely, not offering payment but only credit, he still might have agreed.  However, to take the arrangement, give no payment or credit, and try to claim that this gives the artist exposure is flat out wrong.  When it comes to copyright, the rule of thumb is “Ask permission first”, not “Seek forgiveness, not permission.”
