Recovering From A WordPress Disaster

Let’s say your day is going pretty good.  You are sitting down to write a blog post and have a great idea too.  As you bring up your blog to check on an older post you see something strange.  Nothing.  As in no posts on your site.  Fighting back panic, you try to log into your admin panel.  Maybe you are successful and maybe you aren’t.  In either case, you find that you can’t bring these posts back.

Just this situation happened to SelfishMom last week.  All of her posts were gone and, while she could log in to her administrator account and see the posts there, nothing she could do could bring them back.  She feared that a hacker had gained control of her site.  I’m going to show just how I helped her bring her posts back – as well as what I could have done had things gone differently.

Just a warning: This is going to involve some intense MySQL queries.  They are very powerful, but can also be very confusing.  If you find yourself in this situation and don’t want to wrangle with MySQL, I can always help.  That help might come with an hourly rate, however.  You can contact me using my contact form on this site or message me on Twitter.

First, let’s launch phpMyAdmin.  Different web hosts set it up differently, so you might need to check with your host to see how to launch it.  Most web hosts let you launch cPanel by going to yoursite.com/cpanel.  You log in (with credentials given to you by your host), find the entry for phpMyAdmin, and launch it.

Once you are in phpMyAdmin, you can access your database directly.  Your database should be listed on the left-hand side.  Click on it.  (You might need to click on a + sign first to show the database.)  A series of MySQL tables will be shown.  These tables hold all of your WordPress data.

Find one of your WordPress tables.  It should be named something like "wp_users" or "wp_posts".  The "wp_" prefix might be different depending on your setup.  For the purposes of this post, I’m going to list all of the tables using a wp_ prefix.  If your tables use a different prefix, just substitute yours for wp_ in the following queries.

Let’s deviate from SelfishMom’s situation for a moment and suppose that she wasn’t able to log in at all.  How could she have reset her administrative password without having access to the administrative panel?  This is actually pretty easy via phpMyAdmin.  For brevity’s sake, and since they did such a good job on it, here’s WPExplorer’s tutorial on it.

Ok, now that we have a login, let’s address another concern: Hackers.  Did a hacker somehow gain control of SelfishMom’s site and make himself the administrator?  Click on the SQL tab in phpMyAdmin. A blank box will appear.  In here, type:

SELECT u.*, m.*
FROM wp_usermeta m, wp_users u
WHERE u.ID = m.user_id
AND m.meta_key LIKE '%user_level%' AND m.meta_value = 10

After you click Go, this will show you a listing of users who are set as administrators.  Ideally, you should see only ones that you have set up.  If you see any users there that you don’t recognize, those might be hacker accounts.  You can lower their access by noting the ID number in the listing.  (For the purposes of the query below, I’ll use the number 42.)  Typing in:

Update wp_usermeta Set meta_value = 0 Where meta_key like '%user_level%' and user_id = 42

and clicking Go will lower their access level to 0 (Basic access).  We could have deleted their account, but at this point I’d prefer to lower the access in case we need to use the account.

What if your account isn’t listed though?  This would mean you’ve definitely lost administrative access.  Let’s get that back for you.  Run the following query (replacing "admin" with your administrative username):

Select ID from wp_users where user_login = 'admin'

Make a note of your ID number.  (You’ll need it again later.)  Now run the following query.  In place of your ID number, I’ll use the number 2.

Update wp_usermeta Set meta_value = 10 Where meta_key like '%user_level%' and user_id = 2

There’s one more administrative access level to check.  Enter and run the following query:

SELECT u.*, m.*
FROM wp_usermeta m, wp_users u
WHERE u.ID = m.user_id
AND m.meta_key LIKE '%capabilities%' AND m.meta_value LIKE '%admin%'

Again, this should show only your administrator account.  If a mystery account shows up, revert it to basic access by noting the ID number and running the following.  (Again, I’m going to use 42 in my example.  Replace it with the actual ID number.)

Update wp_usermeta Set meta_value = 'a:0:{}' Where meta_key like '%capabilities%' and user_id = 42

If your account wasn’t listed, run the following query (substituting your administrative ID – obtained earlier – for the number 2):

Update wp_usermeta Set meta_value = 'a:1:{s:13:"administrator";b:1;}' Where meta_key like '%capabilities%' and user_id = 2
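
A note on that odd-looking meta_value: it is a PHP-serialized array, and the number after "s:" has to match the length of the role name that follows it ("administrator" is 13 characters).  If you ever need to type one of these by hand, a mismatch there will break the value.  Here is a minimal Python sketch of how the string is put together.  The function name is my own invention for illustration, not part of WordPress, and it only handles the simple "role name mapped to true" case used in this post.

```python
def php_serialize_capabilities(roles):
    """Build a PHP-serialized array mapping each role name to boolean true."""
    parts = []
    for role in roles:
        # PHP's s:N prefix counts bytes, not characters.
        byte_length = len(role.encode("utf-8"))
        parts.append(f's:{byte_length}:"{role}";b:1;')
    # a:N is the number of entries in the array.
    return f"a:{len(roles)}:{{{''.join(parts)}}}"


admin_value = php_serialize_capabilities(["administrator"])
no_roles_value = php_serialize_capabilities([])
print(admin_value)     # the value used to grant administrator access
print(no_roles_value)  # the "basic access" value with no roles at all
```

The two printed strings should match the values used in the two capabilities queries above.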

Now that we’ve sorted out administrative access, log into your WordPress Admin panel.  Keep phpMyAdmin open, though.  We’ll need it later.  Once you are in, look for your posts.  If they are there, then try to make them live.  If you can, then congratulations.  Your troubles should be over.  You might want to secure your WordPress site more, though.

If you can’t make your posts live, then there are two other possible problems.  The first possibility is that your database has grown so large that it is bumping against the limit your host set for it.  To see how large your database is, run the following query:

SELECT table_schema "Data Base Name", sum( data_length + index_length ) / 1024 / 1024 "Data Base Size in MB" 
FROM information_schema.TABLES GROUP BY table_schema ;

This should give you a listing of one or more databases with sizes.  If your database sizes are more than what your host provides, then unfortunately there is little to do.  You will need to contact your host to discuss your options.

If your database is under the limit, then most likely the database tables have been corrupted.  Don’t worry, though.  There is an easy fix.  At the top of the page, above the Browse tab, the server should be listed.  Next to that should be your database.  Click on the database’s name to see a listing of database tables.  Next, check the box next to each table relating to WordPress (all of the ones with the "wp_" or other prefix).  Finally, at the bottom of the page, click on the "With selected" drop down and select "Repair table."  The repair process should begin and, when it is done, you should see all your posts live again – just like SelfishMom did.

There is one possibility we didn’t cover yet, though.  What happens if, after you log in, you find that all your posts are gone?  While it is possible that they remain in the database somewhere and are recoverable, sadly this is too complex to cover here.  The best bet here is to have a good backup process in place and to restore your database from a known good backup.  You might lose a little bit of data in the process, but it’s better than losing everything.

I hope this has been an informative post on how you can recover your WordPress posts even under seemingly dire circumstances.  Hopefully, you’ll never need to use these techniques.  Of course, should you find yourself in this situation and need some help from someone well-versed in the ins and outs of WordPress and MySQL, feel free to contact me.

Defeating BuzzMyFx Content Scrapers

My next post was going to be one about WordPress issues, but then something else came up.  That post will still go live on Wednesday.  Right now, I want to talk to you about content thieves and scrapers.

We had a run-in with some content scrapers two years ago.  That scraper took the content, but left the image links intact.  At the time, I showed how to defeat that particular variety of scraper.  This scraper, however, was trickier.

I’m not sure what the purpose of “BuzzMyFx” is beyond content hijacking.  If you “check” to see if your site is scraped by them (by going to YourSiteName.buzzmyfx.com), you might see that your site isn’t being scraped.  However, your mere act of checking will CAUSE them to start scraping your site.  Scraped sites have all content redirected through their servers.  Images, stylesheets, JavaScript files, and more all seem to pour through BuzzMyFx’s servers instead of yours.  What’s worse is that, since all links go to BuzzMyFx now, clicking on a link to another site causes that site’s content to be scraped as well.

It didn’t take long to deduce what was going on.  BuzzMyFx is a server side scraper.  Imagine someone coming to your site under normal circumstances.  They tell their browser to load “www.MyWebSite.com”.  The browser then contacts the server hosting your site asking for that page.  The server gives the page to the browser which shows it to you.  Simple, right?

BuzzMyFx adds an extra layer.  If you go to MyWebSite.BuzzMyFx.com, your browser goes to BuzzMyFx’s server first.  BuzzMyFx’s server then contacts your server (as if it were a browser) for the page.  Your server gives the page to the “BuzzMyFx browser” as it does to all other browsers requesting pages.  BuzzMyFx then alters the page’s code to direct all links back to them.  They also add in their own StatCounter script and change ad code to give them the revenue instead of the site owner.  Finally, they give the changed version of the page to you.

Pretty scummy, right?  Of course, by doing this, they are committing massive copyright infringement at the very least.  At $750 – $150,000 per infringement, dozens of infringements per site scraped, and possibly hundreds of thousands of sites affected, this could land them on the hook for millions of dollars.  Then there are the problems encountered if they are using a trademarked logo/name without authorization.

So how do you stop them?

Thankfully, servers keep logs of every visit.  As you loaded this up to read this post, my server dutifully recorded information such as your IP address, where you were referred from, the current date and time, and what page you were loading up.  This happens at all websites you visit, but not all people know how to read the logs.  As a webmaster, I am well versed in reading server logs.

I loaded up their scraped version of my site while checking my server logs and there it was: 192.151.156.170.  That was the IP address doing the scraping.
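
If you would rather not eyeball the raw log, here is a rough Python sketch of the same idea: pull the IP address off the front of each log line and see which address made the requests while you were loading the scraped copy.  It assumes your server writes the common Apache "combined" log format, and the sample lines (dates, paths, and the second IP address) are invented for illustration.  Only the scraper’s IP address comes from my actual logs.

```python
import re
from collections import Counter

# Assumption: Apache "combined" format, where the client IP is the first field.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)"'
)


def count_requests_by_ip(lines):
    """Count how many logged requests came from each client IP address."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:  # silently skip lines that are not in the assumed format
            counts[match.group("ip")] += 1
    return counts


# Invented sample covering the few seconds in which the scraped page was loaded.
sample_log = [
    '192.151.156.170 - - [10/Jun/2013:09:15:01 -0400] "GET / HTTP/1.1" 200 5120 "-" "-"',
    '192.151.156.170 - - [10/Jun/2013:09:15:02 -0400] "GET /style.css HTTP/1.1" 200 900 "-" "-"',
    '203.0.113.7 - - [10/Jun/2013:09:15:03 -0400] "GET /about/ HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
]

top_ip, hits = count_requests_by_ip(sample_log).most_common(1)[0]
print(top_ip, hits)
```

On a busy site you would want to narrow the lines down to the moment you loaded the scraped page first, since plenty of legitimate visitors will be in the log too.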

Next, I opened up my “.htaccess” file.  This is a special file on your web site that controls who can access your site and what they can and can’t see.  I added the following lines at the beginning:

RewriteCond %{REMOTE_ADDR} ^192\.151\.156\.170$
RewriteCond %{REQUEST_URI} !/content-thief.html
RewriteRule ^(.*)$ /content-thief.html [R,L]

Finally, I created a simple HTML page called “content-thief.html” with big, bold, red letters warning people that this was a scraped site and they should go to my real site.  (I didn’t link to my real site since the link would be altered, so I just spelled it out.)  You can go ahead and copy my “content-thief.html” page for your own usage.  Just be sure to change the site name to your own.

Unfortunately, BuzzMyFx has already cached some of my content, so the main page of my “BuzzMyFx-ed” site doesn’t show this warning.  Still, as their content expires and their server tries to grab the new content, it will be replaced by my warning.  (I went easy on them.  My initial reaction was to redirect them to some hard core pornography.  I didn’t want my name linked with that though.)

The other problem is that they can change their IP address, which will let them bypass this rule.  I can add their new IP address in, but it will be a constant effort to keep up with them.  Perhaps the best remedy would be for all affected site owners to contact the people who run this “service.”  Unfortunately, they’ve hidden who they are from WHOIS, but they can’t hide two things:  1) Their domain name is registered through eNom and 2) Their site is hosted by DataShack.net (I originally wrote CloudFlare.com here; see the update below).  If we can’t get them to stop, we can always get their hosting and domain name cut off.
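
If the scraper does hop to a new address, the .htaccess rules above need one RewriteCond line per IP address, joined with mod_rewrite’s [OR] flag so that any one of them triggers the redirect.  Here is a small Python sketch that regenerates the block from a list of addresses.  The function name is my own, and the second IP address in the example is a made-up placeholder from a documentation range, not a real scraper.

```python
import re


def build_block_rules(ip_addresses, warning_page="/content-thief.html"):
    """Return .htaccess lines that redirect the listed IPs to the warning page."""
    if not ip_addresses:
        # With no IP conditions, the rule would redirect every visitor.
        raise ValueError("at least one IP address is required")
    lines = []
    for index, ip in enumerate(ip_addresses):
        pattern = "^" + re.escape(ip) + "$"  # escape the dots, anchor both ends
        # Every IP condition except the last one gets [OR].
        or_flag = " [OR]" if index < len(ip_addresses) - 1 else ""
        lines.append(f"RewriteCond %{{REMOTE_ADDR}} {pattern}{or_flag}")
    # Do not redirect requests for the warning page itself (avoids a loop).
    lines.append(f"RewriteCond %{{REQUEST_URI}} !{warning_page}")
    lines.append(f"RewriteRule ^(.*)$ {warning_page} [R,L]")
    return "\n".join(lines)


rules = build_block_rules(["192.151.156.170", "198.51.100.23"])
print(rules)
```

Paste the printed lines at the beginning of your .htaccess file in place of the three-line version, and test your site from your own browser afterwards to make sure you haven’t locked out regular visitors.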

Here’s hoping this scraper menace ends soon so we can all get back to producing great content instead of trying to protect our content from being scraped.

UPDATE:  CloudFlare.com is denying being their host.  As Heather commented below, they say they are a “reverse proxy, pass-through security service.”  I’m guessing that BuzzMyFx is using CloudFlare to hide their server’s real IP address.  However, the IP address I obtained that was seizing my content (192.151.156.170) isn’t “hidden” at all.  That IP address comes from DataShack.net.  So focus communication on them, not CloudFlare.

UPDATE #2:  If you aren’t technically inclined enough to know how to fiddle with htaccess and/or FTP files to your server, but you are using WordPress, you can also use the WP-Ban plugin to keep them off your site.  This plugin lets you list IP addresses and even leave a specific message for those IP addresses to see.

UPDATE #3: According to Lazy Budget Chef, even if you manage to contact BuzzMyFx, they will try to sell you a domain protection package to “steal the blogger’s legal right to their blog, their log in credentials, mailing list, and other personal information.”  So even if you manage to contact these scrapers, don’t sign anything they give you!  You shouldn’t need to sign some form of contract for them to cease scraping – they should just stop.  Be very wary of these people.

UPDATE #4: It looks like we’ve won this battle.  BuzzMyFx seems to be down.  They could still flee to another hosting provider (or even the same one signed up under a different account) and start their service back up.  Even if they don’t come back, I’m sure other scrapers will take BuzzMyFx’s place.  Still, you need to take each victory as it comes.  Congratulations and thanks for helping take down this scraper, everyone!

NOTE: The “burglar” image above is by tzunghaor and is available from OpenClipArt.org.

Cheating On Cable

Hi, I’m TechyDad and I’m a cheater.  I’ve been cheating for years, as has my wife.  My kids have been cheating too.

Confused?  Let me back up a bit.

With the rise of Internet video services, a lot of people have found that they aren’t reliant on cable TV for their video entertainment fix.  At first, there were simply short videos on sites like YouTube.  Entertaining, but no match for the ongoing half hour or hour long series that aired on cable.  Then came services like Netflix and Amazon Video On Demand with many series available to watch and YouTube channels dedicated to longer/ongoing shows.

At this point, many people decided that they didn’t need cable TV anymore.  They "cut the cord" and ditched cable.  Although more and more people were doing this, cable companies kept denying that cord cutting was a major trend.  They just couldn’t see how people could replace them with Internet video.  Although some cable executives have begun acknowledging the trend, to most of them cord cutters were a fringe group, easily ignored.

Now, however, the cable companies have identified a new threat:  Cord Cheaters.  By the sound of it, you might think this means people who get cable without paying for it.  Or, perhaps, it’s people who somehow manage to get premium channels when they’ve only paid for basic.  That’s not what this is referring to, however.  "Cord cheaters" are people who don’t use the cable company’s Pay Video On Demand features and instead pay companies like Netflix or Amazon VOD for video content.

According to DigitalSmiths Corp’s "Video Trends Discovery Report", only 27.1% of respondents have made purchases from the cable company’s VOD menu.   For comparison, 41.7% pay a Netflix monthly fee and 48.2% use a subscription over-the-top service.  This has cable companies worried.  They’re worried that money is flowing to other companies when it could be going to them.

Of course, the reason that this money isn’t flowing to them is that companies like Netflix are providing a better service.  There is more on Netflix for me and my kids to watch than on all of my cable provider’s VOD channels put together.  In addition, it works more smoothly, has a nicer interface, and costs less.  Is it any wonder that we "cheat" on our cable company with Netflix?

Even though this report is recent, some cable companies have already seen this coming.  They have tried taking "precautions" in the form of low usage caps and overage fees instead of improving their VOD services.  Time Warner Cable trialed caps as low as 5GB, but withdrew them when people complained about how low they were.  They later brought them back as an "optional service" where you would save $5 a month but get a 5GB cap.  Of course, every 1GB you went over cost you $1 so the savings were minimal, if any.
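
To put numbers on that, here is a quick back-of-the-envelope sketch in Python using the figures above ($5 discount, 5GB cap, $1 per 1GB over).  The function name is mine, and I’m assuming that a partial gigabyte over the cap gets billed as a full one, which may not match how the overage was actually metered.

```python
import math


def net_monthly_savings(usage_gb, cap_gb=5, discount=5.0, overage_per_gb=1.0):
    """Discount minus overage fees for one month (negative means you pay more)."""
    # Assumption: any partial gigabyte over the cap is billed as a whole one.
    over_gb = max(0, math.ceil(usage_gb - cap_gb))
    return discount - over_gb * overage_per_gb


for usage in (4, 10, 30):
    print(usage, "GB used:", net_monthly_savings(usage))
```

Stay under the cap and you keep the full $5.  At 10GB the discount is wiped out entirely, and at 30GB you are paying $20 more than you would have without the "savings" plan.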

Usage caps, especially low ones, mean that users who watch videos online will have a hard limit on how much online video they can watch.  Make the caps low enough and the overage fees high enough and Internet video becomes too expensive to use.  Even if a customer does use Internet video, the cable company winds up getting more money.  This is a win-win for the cable company, but a lose-lose for consumers.

Unfortunately, most customers don’t have much of a choice in their ISP.  In my case, my only choice of broadband is Time Warner Cable.  If they decided to implement low caps tomorrow, I’d have no recourse.  They wouldn’t have an incentive to provide me with better service because there would be no competition.

Still, the Internet video genie is out of the bottle and no amount of trickery from cable companies will get it shoved back in.  In fact, when you get right down to it, I find the term "cord cheater" to be insulting.  "Cheater" implies that I’m doing something wrong and possibly illegal by paying a company other than my cable company for video services.  I’m not.  Everything I’m doing is perfectly legal.  If the cable company doesn’t like it, then they need to compete with a better service, not scare tactics and rhetoric.

NOTE: The "HDTV" image above is by jgm104 and is available from OpenClipArt.org.

Don’t Take The Plagiarism Short Cut

It was bound to happen eventually.  We sat NHL down by the computer so he could type out sentences for his spelling words.  As I was preparing dinner, I looked over and saw one of his sentences.  Only there was a problem.  It was way too advanced a sentence for him to have written.  I asked him and he admitted to having looked up words on Dictionary.com to be sure of their meaning.  While he was there, he noticed that they had the words in sentences.  Just what he needed.  Copying them would save a lot of time, right?

I knew then and there that it was time to introduce NHL to another word: Plagiarism.  I told him that he couldn’t just copy someone else’s work and try to pass it off as his own.  First of all, the assignment was for him to write out sentences.  Copying someone else’s work is not fulfilling the assignment.  Secondly, the purpose of the assignment is to learn how to use the words that he is being introduced to.  Grabbing sentences from the web isn’t teaching him anything.  Third, stealing someone else’s work and passing it off as your own isn’t fair to the original author.  I asked NHL how he’d feel if someone stole something he wrote and told people they had written it.

Sadly, too many people don’t learn this lesson this young.  Some go through life thinking that passing someone else’s work off as their own is perfectly acceptable.  Others learn their lesson later in life when the consequences are more dire.  These consequences can range from public shaming to losing your job or being kicked out of school.

In a way, I’m glad that NHL tried to plagiarize so young, as this lesson is an important one to learn as early as possible.  Just like with Google Image Searches, text on the Internet is not free for the taking.  It can definitely be tempting, but you can’t just take text from Wikipedia or another source and use it in your own work.

NOTE: The "pen paper" image above is by aungkarns and is available from OpenClipArt.org.

Overwhelmed By A Hurricane Of Content

There are about a billion websites on the Internet.  Of those, millions are blogs.  Those blogs each produce anywhere from a dozen to a few thousand new posts every year.  Then there are the thousands of movies, TV shows, games, songs, books, and other forms of media released every year.  Just for good measure, mix into this the millions – if not billions – of status updates, photos, and videos published to social media sites such as Twitter, Facebook, and Google+.  It’s easy to see how we are drowning in content nowadays.

On this blog alone, I have over 1,200 posts published.  Many of them – I’m sure – are updates that would interest almost nobody.  Some might interest a small group.  A couple might actually interest many, many people.  If only people knew about them. 

The problem is that a good post can easily be lost within the swarm of other status updates, videos, and thousands of other posts.  It’s like trying to hear a cricket chirping… from across town… while a category 5 content hurricane is blowing.

There are many people who know many good ways of amplifying your volume.  The problem is that these take time and effort.  My problem is that I’ve got a day job.  This isn’t a "problem" per se – I really like my day job and in this economy there are lots of people who would love to have one.  The problem is that many hours of my day are dedicated to "doing the day job thing."  Subtract time to pay attention to the kids, do chores around the house, cook, etc, and I barely have time to write my blog posts – much less spend hours promoting them.  So I just keep chirping into that content hurricane hoping that someone hears me and likes what they read.

On the flip side, as a content consumer rather than content producer, it almost seems like there are never enough hours in the day to see everything I want to see.  My feed reader is hardly packed with hundreds of thousands of blogs and yet I rarely seem to be able to knock the number of unread items below triple digits.  When I started out on Twitter and was following only a few people, I would read every status update that was made.  Even when I took a day off of social media for Shabbat, I would go back in my timeline to where I left off and would spend some time catching up.  This just isn’t possible anymore.

If I spent my entire day reading blog posts, watching TV shows, looking at Instagram photos, reading status updates, and watching YouTube videos, I wouldn’t even scratch the surface of what I’d like to see.

One of my favorite movies of all time is Short Circuit.  In this movie, a military robot accidentally becomes alive and sentient.  Instead of wanting to destroy, however, Number 5 decides that all he wants is to live in peace and consume information.  In the sequel, Short Circuit 2, this is expanded upon when Number 5 – now called Johnny Five – goes to the city and discovers a book store.  He goes from book to book, flipping through them and absorbing their contents in seconds.  Though it is a big bookstore (for the 80’s), he is able to absorb all of the information rapidly.

I wonder what would happen if Johnny Five were to be released in the present day, however.  No matter how quickly he could flip through a 700 page novel, consume an RSS feed, watch a TV show or movie at extreme fast forward, or listen to songs, there would still be more to see.

The Internet brings what often seems to be unlimited content to you and this can be a blessing or a curse.  It is nearly impossible to be bored – boredom merely means that it is time to seek out new and interesting feeds/games/videos/etc.  On the flip side, you can feel left out when you are unable to keep up with all of the content that all of your friends are watching (even if said content is spread over your friends and they each aren’t watching it all).  Going back to the hurricane analogy, you are a fly buzzing about as the category 5 content hurricane blows.  Every time you think you have found some stability, another blog post or YouTube video or app comes out of nowhere to strike you.

Whether you are a cricket chirping or a fly buzzing – a creator trying to get your work viewed or a consumer trying to keep up with the latest content – it’s a dangerous and information packed world out there.  Stay safe.

Note: The "content storm" image above was created by combining the following images from OpenClipArt.org: Hurricane Symbol by TheByteMan, Generic Book by dniezby, Movie Camera by schoolfreeware, Music Icon by Minduka, Iphone 4 by Ts-Pc, Cutie Bird by Luen, and Cartoon TV by rg1024.
