Bye-bye PeoplePerHour.com

Want to see what seller frustration looks like? Just check out the reaction to a recent change instituted by PeoplePerHour.com.

When Sellers have received low ratings for their recently delivered work, they will be getting a warning email and if in the next reviews they fail to increase their average feedback rating over a minimum quality standard, then their selling privileges on PeoplePerHour will be suspended.

In the blog post announcing this change, the author claims that it will increase the quality of services provided by the sellers. Many sellers had something to say:

PeoplePerHour
Sellers speak out against new policy

My personal view on the matter is that freelance websites should rely on the buyers’ feedback to provide guidance to other buyers. Let the market decide which freelancers are worth hiring.

I decided to cut my losses on PeoplePerHour.com. I had two positive experiences and one bad experience. I reinvested all of my earnings back into marketing my profile. I left on a positive note; one of my projects was the inspiration for a product that is now in development.

Super Shortcuts Using Everything and Quick Cliq

Since I have been using Everything.exe as a portal into my daily files, I was slow to realize that I was missing an opportunity. Quick Cliq saves a few steps over continually hunting for a document. I don’t always think of using shortcuts to launch documents, even though that is a fundamental aspect of Graphical User Interfaces!

Another fundamental feature is the use of keyboard shortcuts. By combining these two obvious productivity enhancers with Quick Cliq, I now have handy super shortcuts for my active documents.

This is not ground-breaking at all. In fact, it is the first shortcut type defined in the Quick Cliq Help File! Nevertheless, if some variation of the following has not occurred to you, you can set it up in about five minutes. (Click the image for the full size.)

Super Shortcut
Click to see all of the steps

The important thing to remember is that you don’t have to use Everything.exe. Anytime you can right-click a document and get the Quick Cliq context menu, you can add that document to your super shortcut menu. However, if you have been relying on Everything.exe to find daily files, this should get you thinking about ways to save time and repetition.

Retrieving the Blog Content for Your eBook

After extracting your blog posts and pages from the WordPress database, you have a file that uses XML formatting to describe every element exported. Needless to say, you need to convert this file to a simple text format, so that you can copy and paste desired chapters or sections into your working draft.

XML Editors

If you are technically proficient, or infinitely patient, you can use a general-purpose XML editor to view and retrieve your blog content. For Windows users, XML Notepad 2007 provides a familiar view of your blog content: a tree structure consisting of nodes and sub-nodes. Working with this editor directly, you can retrieve each post that you want to add to your eBook. Linux users can probably find support for XML editing with the Emacs editor. Here is one (very technical) example: How to Use Emacs for XML Editing.
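If you would rather script the lookup than click through the tree, the same nodes can be walked in a few lines of Python. This is only a minimal sketch: the sample below is a tiny made-up stand-in for an export file, and it assumes Python 3.8 or later (for the `{*}` namespace wildcard) and that each post sits in an `item` node with `title`, `wp:post_type` and `content:encoded` sub-nodes.

```python
import xml.etree.ElementTree as ET

# Namespace used by RSS for full post bodies.
CONTENT_NS = "{http://purl.org/rss/1.0/modules/content/}"

# Made-up sample; a real export file has many more fields per item.
SAMPLE = """<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>First Post</title>
      <content:encoded><![CDATA[<p>Hello, world.</p>]]></content:encoded>
      <wp:post_type>post</wp:post_type>
    </item>
    <item>
      <title>About</title>
      <content:encoded><![CDATA[<p>About this blog.</p>]]></content:encoded>
      <wp:post_type>page</wp:post_type>
    </item>
  </channel>
</rss>"""

def list_items(xml_text):
    """Return (title, post_type, body) for every <item> node in the export."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        # "{*}" matches the wp: namespace whatever export version it names.
        post_type = item.findtext("{*}post_type", default="")
        body = item.findtext(CONTENT_NS + "encoded", default="")
        rows.append((title, post_type, body))
    return rows

for title, post_type, body in list_items(SAMPLE):
    print(f"{post_type}: {title}")
```

For a real file, read its text first and pass that to `list_items` instead of `SAMPLE`.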

If you take the time to create a special stylesheet, you can create a nice, readable list of posts, complete with links and images. For a great example of this, check out the efforts of Sacha Chua, who created an XSL Stylesheet to build one version of her blog posts from 2008. Do note, however, that she had to create a second stylesheet for the main list of posts.

Blogmogrifier

You can simplify the retrieval process by using a specialized program called Blogmogrifier. It’s a Windows-based desktop application, so your computer will need to be running Windows Vista, 7 or 8.

Here is a walk-through that shows you how to import one of your downloaded XML files, select categories and tags and output a text file with matching posts and pages. The file shown in the steps is from Sharon Hurley Hall‘s Get Paid to Write Online blog.

Step One: Download Blogmogrifier

To avoid confusion, please remember the following:

  • Blogmogrifier is just one tool inside of a larger program called Retrievem
  • Retrievem is just one of the many programs developed using a framework called ParserMonster
  • For simplicity, all ParserMonster programs display the ParserMonster shield logo, but use their own names
  • This means that you will be downloading and launching Retrievem.exe to get to Blogmogrifier

This version is currently in beta. It is a free download. You must run it on a Windows PC that uses XP, Vista, Windows 7 or Windows 8.

Click image to download Blogmogrifier

The link takes you to Copy.com, where you can simply click the Save button to bring up the dialog box shown below:

The Blogmogrifier Download Page
Be sure to click download it to your computer!

Click download it to your computer. The downloaded file is named Retrievem 3.exe. Put that file into any folder you like, as long as it is writable. Windows 7 and 8 do not allow ordinary applications to write into the \Program Files (x86) folder. However, your \Documents folder is acceptable.

Blogmogrifier is inside of Retrievem.exe

Step Two: Start Blogmogrifier

Double-click the Retrievem 3.exe icon. Retrievem is a portable application that does not require installation. You should see a splash screen for a few seconds before the Retrievem Dashboard appears.

The Retrievem dashboard

In the task list window, you’ll see two or more task icons. Click the one that looks like a purple ray gun, marked 1. This will select Blogmogrifier as the active task.

You have two ways to “prep” Blogmogrifier. The first way is to use the dashboard to specify where the XML files are located. The second way is to do that after clicking the Run Task button. The choice is yours; however, the first method has the advantage of remembering your settings the next time you run the application. So, let’s set things up from the dashboard. (I’ll briefly mention how to accomplish the same thing without using the dashboard.)

Step Three: Drag and drop XML Folder onto Dashboard

Using Windows Explorer, locate the folder where the WordPress XML files are located. Drag the entire folder onto the box marked 2. It is important that you drag folders only. Files will not be detected if dropped directly, even though they appear on the dashboard.

If the folder has any files, they will be listed in the box marked 3. In addition, a list of file types appears in the small box marked 4. Make sure these areas show the file(s) you want to import.

If you dragged the wrong folder, just click the red “X” to the right of the Paste Clipboard button and try again.

Step Four: Drag Output Folder onto Dashboard

By default, tasks send their output to the same folder where Retrievem.exe is stored. If you want to use a different folder, drag it onto the long, narrow box marked 5. Alternatively, you can use the Browse … button. However, the file dialog’s default behavior won’t let you choose a folder above your \Documents folder.

Step Four (continued): Run the Blogmogrifier Task

Click the Run Task button, marked 6. This saves your folder choices and displays the Blogmogrifier form.

Blogmogrifier XML list

If you selected a valid folder in Step Three, this is the screen you will see when you first open Blogmogrifier. Click on the file you wish to import into Blogmogrifier. Then click the Import tab.

If you skipped Step Three or didn’t pick a valid folder, you’ll be forced to use drag and drop to select a single file (not folders!), as shown below:

Blogmogrifier Drag and Drop
Drag and drop a single file

In this case, after you drop a file with the .XML extension, Blogmogrifier automatically switches to the Import tab. By the way, the Drag and Drop tab is not smart enough to tell the difference between a WordPress XML file and any other XML file. It merely examines the file type and either switches the tabs or displays an error if an XML file was not dropped.
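If you want to check a file yourself before dropping it, a short script can look past the extension. The sketch below is not how Blogmogrifier works; it is a hypothetical helper that assumes a WordPress export is an `rss` document whose `channel` contains a `wp:wxr_version` element, and it needs Python 3.8 or later for the `{*}` wildcard.

```python
import xml.etree.ElementTree as ET

def looks_like_wordpress_export(xml_text):
    """Guess whether XML text is a WordPress export rather than some other XML."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False  # not well-formed XML at all
    if root.tag != "rss":
        return False
    channel = root.find("channel")
    # WordPress exports carry a <wp:wxr_version> element inside <channel>.
    return channel is not None and channel.find("{*}wxr_version") is not None

# Made-up samples for illustration.
WORDPRESS = (
    '<rss version="2.0" xmlns:wp="http://wordpress.org/export/1.2/">'
    "<channel><wp:wxr_version>1.2</wp:wxr_version></channel></rss>"
)
OTHER = "<catalog><book>Not a blog</book></catalog>"

print(looks_like_wordpress_export(WORDPRESS))
print(looks_like_wordpress_export(OTHER))
```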

Step Five: Import File

On the Import tab, you will see what, if anything, was imported successfully. Click on the different options to view how many of each post type were imported. (In Sharon’s case, she chose to export only posts from her blog.)

Blogmogrifier Import
Successful import

The hyperlinks don’t work inside the Import tab. If you wish to review a link, select it like you would do in a text editor, copy it using CTRL-C and paste it into your web browser with CTRL-V.

Once you are satisfied that the content has been imported, click the Export tab.

The Import tab will be blank with greyed out controls if you attempted to import a non-WordPress XML file. A terse message alerts you to the problem:

Blogmogrifier import error

Step Six: Include Categories and Tags

If your XML file contains information about categories and tags, they will be displayed on the Export tab. You can click either or both of the green Include All buttons to toggle the selection of keywords. Whenever you do this, the button will change to red and display Include None, as shown in the image below:

Blogmogrifier categories and tags

These two buttons and the four possible choices are handy for when you want all or most of the keywords in a list. Otherwise, you can just click on the desired checkboxes, like this example:

Best Category
Let’s get the “Best of GPTWO!”

Step Seven: Export Content to Text File!

Click the Export Text button. The bottom of the form displays the path to the output file. This is a plain .TXT file, so you can open it in your favorite editor. The screen shots below show the first few lines of the output, where you can see helpful information such as a list of the keywords. You will see that the source file listed there is actually a temporary copy of your XML file. Such temporary copies are safe to delete.

Sharon’s eBook (Raw Content)

Sharon's Best Category
A couple more of Sharon’s posts
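For readers who like to see the idea behind Steps Six and Seven in code, here is a rough Python sketch of "keep only the posts that carry a chosen category or tag, then write them out as plain text". It is an illustration under assumptions, not Blogmogrifier's actual logic or output format: the sample posts are made up, categories and tags are assumed to be stored as `category` nodes inside each `item`, and the HTML stripping is deliberately crude.

```python
import re
import xml.etree.ElementTree as ET

CONTENT_NS = "{http://purl.org/rss/1.0/modules/content/}"

# Made-up sample posts, not taken from any real blog.
SAMPLE = """<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <item>
      <title>Pitching Editors</title>
      <category domain="category" nicename="best"><![CDATA[Best]]></category>
      <category domain="post_tag" nicename="pitching"><![CDATA[pitching]]></category>
      <content:encoded><![CDATA[<p>Keep the pitch <em>short</em>.</p>]]></content:encoded>
    </item>
    <item>
      <title>Site News</title>
      <category domain="category" nicename="news"><![CDATA[News]]></category>
      <content:encoded><![CDATA[<p>We moved hosts.</p>]]></content:encoded>
    </item>
  </channel>
</rss>"""

def export_matching(xml_text, wanted_keywords):
    """Return plain text for every item with at least one wanted category or tag."""
    wanted = {k.lower() for k in wanted_keywords}
    chunks = []
    for item in ET.fromstring(xml_text).iter("item"):
        keywords = {(c.text or "").lower() for c in item.findall("category")}
        if keywords & wanted:
            html = item.findtext(CONTENT_NS + "encoded", default="")
            text = re.sub(r"<[^>]+>", "", html)  # crude tag stripping for a sketch
            title = item.findtext("title", default="")
            chunks.append(f"{title}\n{text}\n")
    return "\n".join(chunks)

print(export_matching(SAMPLE, ["Best"]))
```

Writing the returned string to a .txt file gives you something you can paste into a working draft.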

Close Blogmogrifier

Click the End Task tab to close the Blogmogrifier form and return to the Retrievem dashboard.

Tips and Limitations

If you select all categories and / or all tags, every single keyword will be included in the output’s header.

Be sure to rename your output files if you intend to export different sections separately.

The Help tab has an explanation for each tab, as well as a few links back to this blog.

Version 3.12 of Retrievem offers Blogmogrifier as a very basic tool for retrieving your content from the WordPress XML file. A few enhancements, tweaks and new tools are planned. These changes may occur rapidly, depending on your feedback.

Each version of Retrievem has an expiration date. This limits the number of outdated copies in operation. Version 3.12 does not have a simple way for you to get the next version. The next version should address that. You’ll be able to get that by visiting this post after March 31, 2015 and clicking any of the download links, including this one.

In order to keep the tutorial as concise as possible, I’ve ignored much of the dashboard. I’m building an online resource to explain ParserMonster in general. Learn more about the dashboard and the rest of The ParserMonster Project.

Fieldnotes

Original Method: WordPress Backup Files

Convert WordPress Blog to e-Book
Backup WordPress Posts
Extracting Posts from WordPress Backup Files
Working with Extracted WordPress Blog Posts
Accessing the WordPress Database Posts Table
(While writing the last post in this list, I discovered a better way...)

How-to

Getting Started

Preparing Your Blog to eBook Categories
Extract All Content From Your Blog

New Method: WordPress Export Tool

Using the WordPress Exporter

Retrieve Selected Posts

Retrieving the Blog Content for Your eBook

Raw Content Retrieval

Extracting Your Blog

How to Extract All Content From Your Blog

Unless you have a massive blog or multiple contributors, the easiest path from blog to eBook starts with exporting all of your content and using a separate tool for retrieving your desired posts.

This short tutorial will show you how to do that. If you want to know about the other features of the WordPress Export Tool, please read Using the WordPress Exporter.

Step One: Select Export Tool

WordPress Tools Menu

Step Two: Choose All Content and Export!

Export All Content
Select All content then click the Download button

Step Three: Verify Downloaded File

Verify Download
Verify downloaded file

WordPress XML File
All done!

Summary

Your downloaded file should be easy to identify, because WordPress thoughtfully names each file after your blog. Since the timestamp only shows the date, you must rename this file if you will be doing multiple exports on the same day.
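If renaming by hand gets tedious, a small script can add the time of day to the name. This is a minimal sketch; the file name used here is hypothetical, since the real one depends on your blog and the export date.

```python
from datetime import datetime
from pathlib import Path

def timestamped_name(filename, when):
    """Insert an HHMMSS time before the extension so same-day exports stay distinct."""
    path = Path(filename)
    return f"{path.stem}_{when:%H%M%S}{path.suffix}"

# Hypothetical export name and time, for illustration only.
print(timestamped_name("myblog.wordpress.2015-01-15.xml",
                       datetime(2015, 1, 15, 9, 30, 5)))
```

To rename an actual file, pass the result to `Path(old_name).rename(new_name)`, using `datetime.now()` for the time.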

Head in the Clouds

One phrase in nearly every cloud storage provider’s copy goes something like this: treat the sync folder like any other folder on your hard drive.

I’ve taken them at their word…and the result is a constantly evolving file ecosystem. Let’s have a look, shall we?

Move Over, My Documents!
Folders With Benefits

Background

The story before the story: I lost a week’s worth of police work back in the day. Luckily, it was not any official mainframe data, just some case files on our Unit’s personal computer. Still, we had come to rely on that computer and the loss taught me a valuable lesson about backing up data.

The lesson was this: more frequent, more data, more places. We went from weekly to daily backups and I made sure to include more files to enable us to recover work in progress (not just completed reports). Finally, I went crazy with writable CDs, going from what I had thought was a sufficient three-disk rotation (Grandfather-Father-Son) to a monster eight-disk rotation based on the Tower of Hanoi Puzzle.
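For the curious, a Tower of Hanoi rotation is easy to compute. The sketch below follows the common textbook formulation, which may differ in detail from the schedule I actually used: the first disk is reused every second session, the next every fourth, and so on, with the last disk used once per full cycle.

```python
def hanoi_disk(session, disk_count):
    """Return the 0-based disk to use for a 1-based backup session."""
    if session < 1 or disk_count < 1:
        raise ValueError("session and disk_count must be positive")
    # One full cycle lasts 2**(disk_count - 1) sessions, then repeats.
    cycle = 2 ** (disk_count - 1)
    position = (session - 1) % cycle + 1
    # Count trailing zero bits: (p & -p) isolates the lowest set bit.
    return (position & -position).bit_length() - 1

# Sixteen sessions with four disks: the pattern repeats after eight.
print([hanoi_disk(day, 4) for day in range(1, 17)])
```

With eight disks the cycle is 128 sessions long, so the least-used disk holds a copy that can be more than four months old on a daily schedule.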

Freelancing

When I began freelancing, I adapted the Hanoi scheme to back up client websites. I was using a lot of CDs and often worried about their physical security. I had read about online file backups and checked them out.

Back then, Mozy was the only viable cloud service and I didn’t care for it. Once Dropbox came along, I finally jumped onboard the cloud bandwagon. Now, my only concern was whether unauthorized access to the files would breach client confidentiality. I solved that with TrueCrypt, a program that let me store all files inside an encrypted container.

Actually, I found out later that TrueCrypt was defeating the best features of Dropbox. Apparently, normal files only change slightly. Dropbox uploads just those changed bits, making the whole upload much faster. With a TrueCrypt file, the whole thing changed! So Dropbox always had to upload the whole file.
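To picture why that matters, here is a toy Python illustration of block-level change detection. It is not Dropbox's actual protocol; the block size and hashing scheme are assumptions chosen only to show the difference between a small edit and a file whose every byte changes.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative block size, not Dropbox's real value

def block_hashes(data):
    """Hash each fixed-size block of a byte string."""
    return [hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
            for i in range(0, len(data), BLOCK_SIZE)]

def changed_blocks(old, new):
    """Count blocks whose hash differs between two same-length versions."""
    return sum(a != b for a, b in zip(block_hashes(old), block_hashes(new)))

# Ten blocks of predictable sample data.
original = bytes(range(256)) * 160

# A small in-place edit: flip a single byte in the second block.
edited = original[:5000] + bytes([original[5000] ^ 0xFF]) + original[5001:]

# Stand-in for a file whose bytes all change between versions.
rewritten = bytes(b ^ 0xFF for b in original)

print(changed_blocks(original, edited))
print(changed_blocks(original, rewritten))
```

Only the blocks that differ would need to be uploaded, so the first case is cheap and the second is a full upload.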

Take Responsibility For Security

Naturally, I was not pleased with this news. I hunted around until I found CryptSync, a nifty open-source utility that syncs two folders in such a way that one of those folders is always encrypted. During my research, the main use cases involved placing the encrypted half of this pair into Dropbox!

That was great. In fact, the developer recognized the need for multiple such folder pairs and designed CryptSync so that any number of pairs could be synchronized. I duplicated my Dropbox model onto each new service that I’ve joined.

1,281.38 GB: Now What?

In keeping with the lesson learned from the Police Department Fiasco, I finally embraced the “more places” philosophy. Take a look at my System Tray:

Head in the Clouds
The Immortal Brain?

In addition to the visible folders shown at the beginning of this post, my main backup schedule is managed by a program called Duplicati. Though not strictly file storage, services like RoboForm and Evernote (not shown) are part of my overall cloud strategy. In fact, I use RoboForm to save the passwords to the desktop apps that connect to their respective cloud servers.

I have several, often competing, ideas on how to make use of these services. Here’s a brief description of each:

  • Virtual RAID: Like the hardware version of disk striping, I think about spreading multiple copies of files over two or more services. It’s too much work to keep track of files, so this is just in the idea stage.
  • Big Files on Big Servers: I have some ISO image files that would choke the smaller account limits. By moving them, along with videos and MP3s, to MediaFire, I’ll have plenty of room for the smaller files.
  • Task Oriented: This is being pushed aside as irrelevant. I used to reserve Box.com for archived client folders. The trouble with this strategy is that I have more space than clients! What a waste, right?

Some services are strictly optimized for backups. Duplicati is one. I think it is fine for what it does, but I really prefer to use folders on a day-to-day basis. It’s hard to describe what I mean, but I basically do not want to think about Folder A as a working folder, with its backup being sliced, diced and scattered in some hard-to-recover scheme. If these services remain robust, then all I need to know is that if my copy of Folder A disappears from my laptop, I can log onto service X and pull it back down.

Having said that, I suppose I will still appreciate Duplicati if my machine ever melted down. I’m just not sure how much work would be involved. Sadly, I treat the testing phase of file recovery like I treat those legal things one has to click before getting to the good stuff. I’m always optimistic that I can upgrade my computer on my own terms. Until then, I just enjoy the peace of mind that comes with having my data backed up on other people’s hardware.

Extracting Your Blog

Blogs and eBooks are two completely different beasts that just happen to live in the same digital jungle. One of the things exposed by converting between the two formats is that blog content, in its browser layout form, looks horrible in a PDF.

By design, most blog content is bite-sized, both visually and conceptually. While eBooks can emulate that, they also have the freedom to be dense tomes. Conventional wisdom claims that blogs generally cannot keep visitors’ attention with dense layout. (Look at this blog, for example. I’ve attempted to buck the norm with a denser layout and fewer visual breaks.)

Lorelle VanFossen on WordPress
Lorelle VanFossen from Lorelle on WordPress

Therefore, if you want total control over the eBook creation process, you are going to have to get comfortable with the idea of editing and cleaning things up, as Lorelle advised. If you have not read Preparing Your Blog to eBook Categories, do yourself a favor and check it out.

Let’s get your blog content out in raw format, with no restrictions on the layout. There is no reason to compromise your vision with inflexible software. With the right mix of general-purpose software and specialized tools, you can automate the drudgery, yet ably manage the task of converting your content to an eBook that you’ll love.

The general-purpose software includes a word processor and a spreadsheet. The specialized tool I will use is my own Windows desktop application called Retrievem. It has a built-in task, unimaginatively titled blog2ebook. You run that task, set up a few rules and in a few seconds, you’ll have a text file that contains your desired content. From there, you could import that file into a spreadsheet in order to keep track of which posts should be grouped together.

Originally, I was going to send the text file to a CSV file (comma separated values). Then, I was going to set up the columns that I used for my own project. However, that goes in the wrong direction; I wouldn’t want to impose my structure on your content. Besides, you may not feel like bothering with the spreadsheet approach.

Instead, just look at the text file. You will see the sections easily enough to cherry-pick what you need. If you are proficient with your word processor, you may prefer to paste the whole file into that software before editing.

The next post will actually step you through the process of extracting your content.

Preparing Your Blog to eBook Categories

Two concepts drive the strategy outlined in this article. First, no one method of categorization is superior to another. Second, let the tools do as much of the work as possible. Each of the following scenarios embraces these concepts. All you have to do is decide which scenario best describes your blog to eBook status.

Before looking at the scenarios, let us review the two concepts and briefly discuss comments, images and attachments.

Categories

You can segment your content based on any of the following criteria:

  • Title
  • Publication date
  • Post type (page, post, attachment)
  • Category
  • Tag
  • Any attribute, really

So, pick whatever combination makes sense. Keep in mind, though, your prep work will be easier if the combination includes only existing elements. Retrofitting your content may not be an option if your blog is still active; you don’t want to risk messing it up for your visitors or SEO.

A good idea is to create a rough outline of the eBook. Consider the posts or pages that will go into each section. Perhaps your current blog taxonomy—tags, categories and other groupings—will help you decide. WordPress.org has a technical discussion of Taxonomies.

Tools

WordPress Export

Obviously, WordPress itself is the best tool you have for preparing your content to make the trip from blog to eBook. Be sure you understand how the built-in WordPress Export tool works (and its limitations!). If you have a huge blog, consider limiting the amount of exported content. You might even plan ahead by deciding to perform multiple exports.

WordPress Posts Menu

Using the Posts menu, add categories and tags to your taxonomy as needed. By adding these ahead of time, you can use the bulk editing tools for pages and posts.

WordPress Posts Menu
Add categories and tags via the WordPress Posts Menu

WordPress Bulk Editor

The bulk editors are great time-savers. Not only can you filter your posts, you can also select the ones you wish to edit. In this instance, editing is limited to a few attributes, such as categories, tags, author and publication status.

WordPress Bulk Editor
Select posts to be edited

WordPress Bulk Editor
Bulk editing multiple posts

Deferred Preparation

You don’t have to mess around with the bulk editors. In fact, if your blog is still active, you may want to defer categorization until after you have exported the posts you want in your eBook. Don’t take risks with your visitors’ experience.

Again, if you have a huge blog, consider doing multiple exports to reduce the size of the export files. With the right file managing tool, you can work with multiple export files as easily as you can with a single, massive file. (Of course, the recommended tool is Retrievem!)

One of the neat things about the WordPress XML file format is that it can be used to create other types of files. For example, if you can get your blog information into a spreadsheet, you can play around with the titles, grouping, filtering and sorting them by categories. You can create new categories, rename others and generally treat your blog as the actual outline for your eBook!

(A future how-to article will show you some ideas for using spreadsheets to organize your blog to eBook projects.)
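In the meantime, here is one hedged way to get the basics into a spreadsheet-friendly file with Python. It is a sketch under assumptions: the sample item is made up, the column choice is arbitrary, categories and tags are assumed to be `category` nodes distinguished by a `domain` attribute, and the `{*}` wildcard needs Python 3.8 or later.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up sample; a real export has many items and more fields.
SAMPLE = """<rss version="2.0" xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <item>
      <title>Hello</title>
      <category domain="category" nicename="best"><![CDATA[Best]]></category>
      <category domain="post_tag" nicename="intro"><![CDATA[intro]]></category>
      <wp:post_date>2014-06-01 08:00:00</wp:post_date>
      <wp:post_type>post</wp:post_type>
    </item>
  </channel>
</rss>"""

def export_to_csv(xml_text):
    """Return CSV text with one row per item: title, date, type, categories, tags."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["title", "date", "type", "categories", "tags"])
    for item in ET.fromstring(xml_text).iter("item"):
        # Group the <category> nodes by their domain attribute.
        labels = {"category": [], "post_tag": []}
        for node in item.findall("category"):
            labels.setdefault(node.get("domain", ""), []).append(node.text or "")
        writer.writerow([
            item.findtext("title", default=""),
            item.findtext("{*}post_date", default=""),
            item.findtext("{*}post_type", default=""),
            "; ".join(labels["category"]),
            "; ".join(labels["post_tag"]),
        ])
    return out.getvalue()

print(export_to_csv(SAMPLE))
```

Save the returned text with a .csv extension and any spreadsheet program should open it, ready for sorting and filtering.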

Comments, Images and Attachments

Comments are what make a blog post come to life. You may wish to recapture some or all of the engagement related to the posts you add to your eBook. Of course, you get to decide whether to include images, but if you plan to add sourced material, be sure to keep track of the attributions. As for attachments, you will most likely be deciding whether or not to link to them.

Thanks to the WordPress Export tool, the technical bits will be available. Depending on your skill with other tools, organizing these extras will be easy, challenging or impossible. You must consider how much time you are willing to spend to recreate the blog. If you have teaching content, you’ll probably want your eBook to faithfully reproduce your lessons.

On the other hand, if you have a bunch of essays where the images were just added for the sake of esthetics (or catching eyeballs), you may not need the images.

Scenarios

Let’s consider some likely scenarios. Your blog may be active, undergoing changes or dead. Your desired eBook will either replace or supplement your blog content. That’s six possible scenarios. Your situation may not be among these six but the ideas should still be helpful.

Scenario 1: Active Blog, eBook to Supplement Posts

Creating an eBook is one way to deal with the invisibility of older posts. This is different from the eBook-for-email address offer, in that you’re culling existing content. That is not to say you couldn’t offer the eBook of old posts as an inducement, especially if you provide a good amount of time-saving information.

The more common strategy I have seen is to offer the eBook for sale to those who either wish to save time or just want a tangible collectible from their favorite author. Transparency is the key to making this work. Just be upfront about the choices available to the reader, especially if the content is being sold.

Whatever your motivation is for keeping both formats, your preparation should include adding a blurb to each post that will be going into the eBook. Think of it as advertising. At the very least, you’ll want to mention that the post is part of a collection. Add a link to your eBook download and you’re set!

A plugin that provides shortcodes for text snippets will be very handy for such blurbs. I use WebSimon Tables, but you could use anything that works for you.

Scenario 2: Active Blog, eBook to Replace Posts

I am not going to give advice about SEO. I don’t care about it, so my actions may seem reckless to those who do care. This scenario is the one I defaulted to when my previous web host crashed and burned. (Okay, I botched an upgrade and hosed my site.)

Whether you remove old posts all at once or little by little, the most important thing you can do is to decide to redirect the permalinks, rather than delete them. That blurb from Scenario 1 would be a good target for such redirects.
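How you set up redirects depends on your host. As one hedged example, if your site runs on Apache and you are allowed to edit its .htaccess file (both assumptions), a few lines of Python can turn a list of retired permalinks into redirect rules. The paths and the target address below are hypothetical.

```python
def redirect_rules(old_paths, target_url):
    """Build one Apache 'Redirect 301' line per retired permalink path."""
    return [f"Redirect 301 {path} {target_url}" for path in old_paths]

# Hypothetical permalinks and landing page, for illustration only.
retired = ["/2012/05/old-post/", "/2013/01/another-old-post/"]
for line in redirect_rules(retired, "https://example.com/ebook/"):
    print(line)
```

Paths that contain spaces would need quoting, which this sketch does not handle. A redirection plugin is another route if you prefer to stay inside WordPress.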

You’ll also want to think about customizing your 404 page for those links you decide not to redirect. Link rot can fertilize eBook downloads if you let wandering visitors know what happened to old blog posts. Be sure to include a link to the eBook.

Wacky 404

At last! My own wacky 404 page

Scenario 3: Evolving Blog, eBook to Supplement Posts

Again, consider your SEO ramifications before going nuts with categories. Your best bet is to defer preparation until you have an offline copy of the blog. Presumably, evolution simply means that you won’t be taking pains to keep interlinking old content. Or maybe you’re just lazy and don’t feel like embarking on Scenario 4…

Scenario 4: Evolving Blog, eBook to Replace Posts

I suspect that you’ll need to do some homework, whether or not you care about SEO. As with Scenario 2, think about how you want to handle the old permalinks. But, unlike replacing posts that may have been topically relevant, find out what you can expect from visitors encountering evidence of unrelated links.

This is one time where those bulk editing tools can come in handy. You’ll basically have three classes of categories, tags and other groupings:

  • The New Stuff
  • The Good Old Stuff
  • The Bad Old Stuff

Try to categorize the old stuff in such a way that it can be hidden from the readers who land on your blog looking for the new stuff. You can use plugins to hide categories from the various list pages generated by WordPress. List pages include Archives, Categories, Tags, Search, etc.

To hide your own pages, take a look at Page-List, for example. Once you understand how it works, you’ll be able to evaluate similar plugins for posts.

Scenario 5: Dead Blog, eBook to Supplement Posts

This is kind of silly, except where you may consider your blog to be inactive rather than dead. Perhaps the blog was the delivery medium for a course. If you no longer offer the course but still wish to share the content via eBook then consider these ideas:

  • Use the course outline as-is for your eBook chapters
  • Tag obsolete posts so that you can filter them out later, either to ignore or update
  • Create an “ignore” category and assign it to posts you want to skip

Scenario 6: Dead Blog, eBook to Replace Posts

All of the ideas from Scenario 5 can be used here. In addition, think hard about ignoring posts if your blog is going to be deleted. If you don’t already have an archive of old blog posts, you should at least store the posts as saved web pages. You never know when you’ll want to refer to them.

Summary

You should be ready to tackle your eBook before you even log into your WordPress site. Once you have a basic outline, you’ll have a better idea of how to prep your posts and pages. Don’t be too quick to add categories and tags. Also, be careful about handling old permalinks and discarding old content in its original format.

The safest bet is to defer all planning until you have a local copy of your blog. It means more work, but you won’t have to look for an Undo button! The Export tool built into WordPress makes retrieving your blog content a snap, no matter how you prep it. If things go wrong, just download another copy.

Using the WordPress Exporter

The built-in WordPress Exporter utility is a great tool for retrieving some or all of my blog content. Finally, I can describe a process that is potentially useful to others.

First of all, without getting too technical, the WordPress Exporter creates an RSS feed of the blog. The developers chose XML, a file format that simplifies the monumental task of describing a chunk of content – namely a blog post or blog page. If you use an RSS feed reader, you will appreciate the care taken to preserve the blog post layout.

I decided to start small. I set up the exporter to retrieve a single category of posts. I picked a category that had exactly one post. If I could manage the extraction of a single post, I figured that scaling up would be a simple matter of repetition.

The WordPress Exporter has three main choices for what to include in the export file:

  • All Content
  • Posts
  • Pages

Each choice reveals a second set of options that can be used to limit the amount of content exported. When I chose Posts, I saw these options:

WordPress Export Tool

WordPress Export offers many options

The ability to fine-tune the export makes this a great tool for a Blog to eBook project. With a bit of planning ahead of time, I imagine that I would save a lot of time by not having to sift through irrelevant posts.

When I clicked the Download Export File button, I received a tiny 8 KB file containing my one post. The next step is to extract that post and any other relevant information. Since my goal is to make this process useful to others, I will try to be as flexible as possible – probably grabbing more data than necessary. Stay tuned.
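Before extracting anything, it can help to confirm that the download really is an RSS-style XML file and that it holds the number of posts you expect. Here is a minimal sketch of that check in Python, using only the standard library. The function name `summarize_export` and the sample text are made up for illustration; a real export file is much longer and carries extra WordPress-specific tags.

```python
import xml.etree.ElementTree as ET


def summarize_export(xml_text):
    """Return (root tag, blog title, number of items) for an export file."""
    root = ET.fromstring(xml_text)
    channel = root.find("channel")
    if channel is None:
        # Not shaped like an RSS feed, so report what we can.
        return root.tag, "", 0
    blog_title = channel.findtext("title", default="")
    items = channel.findall("item")
    return root.tag, blog_title, len(items)


# Hypothetical stand-in for a one-post export file.
SAMPLE = """<rss version="2.0">
  <channel>
    <title>My Blog</title>
    <item><title>Only Post</title></item>
  </channel>
</rss>"""

root_tag, blog_title, item_count = summarize_export(SAMPLE)
print(root_tag, blog_title, item_count)
```

To try it on a real download, read the file into a string first (for example with `open(path, encoding="utf-8").read()`) and pass that string to the function.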

Accessing the WordPress Database Posts Table

When I first wrote about converting some of my blog posts to an e-book, I hadn’t planned on repeating the process. Since I’m including a tutorial this time around, I will find out if my biggest concern is valid:

The most important things I had to know were the order and type of data used to store a blog post.
This requirement is the main drawback to working directly with a file. If a future version of WordPress
changes the database structure, my parser would have to be updated, as well.

from Extracting Posts From WordPress Files

Although I used iThemes Security Plugin for the tutorial, I installed WP DB Backup so that I could compare its backup file to the one I created for my e-book.

WP DB Backup plugin

Visit WP DB Backup plugin

Regrouping

I quickly discovered two things: the table structure had indeed changed, and WP DB Backup and iThemes Security both extracted the same fields from the posts table. This meant that I could not use my old parsing pattern. On the other hand, at least I stood a chance of making a pattern that would work regardless of the plugin used to create the backup file.

Clearly, I needed to standardize my extraction procedure; otherwise, this project would be of no use to anyone else. For the morbidly curious, here is a snapshot of the two table structures:

Changed Tables

Don’t count on table columns staying the same!

As I was putting this together, I realized that I didn’t have a clue about why the structures were different. Rather than speculate, I hunted for the answer and found it deep within the WordPress Codex. If you examine the Changelog for the Post Table, you’ll notice that the category field was dropped in version 2.8:

WordPress Codex

The Posts table changes frequently
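One way to cope with a shifting table structure is to read the column order from the backup file itself instead of hard-coding it. The sketch below assumes the backup is a plain SQL dump that includes a CREATE TABLE statement with backtick-quoted column names; I have not checked that every backup plugin writes its file this way. The function name `table_columns`, the `wp_posts` table name (the default prefix) and the abbreviated sample dump are all illustrative.

```python
import re


def table_columns(sql_dump, table_name):
    """Return the column names from the CREATE TABLE statement for table_name."""
    pattern = re.compile(
        r"CREATE TABLE\s+(?:IF NOT EXISTS\s+)?`?"
        + re.escape(table_name)
        + r"`?\s*\((.*?)\)\s*(?:ENGINE|;)",
        re.IGNORECASE | re.DOTALL,
    )
    match = pattern.search(sql_dump)
    if match is None:
        return []
    columns = []
    for line in match.group(1).splitlines():
        # Column definitions start with a backtick-quoted name;
        # PRIMARY KEY and KEY lines do not, so they are skipped.
        found = re.match(r"\s*`(\w+)`", line)
        if found:
            columns.append(found.group(1))
    return columns


# Abbreviated, made-up sample; a real posts table has many more columns.
SAMPLE_DUMP = """
CREATE TABLE `wp_posts` (
  `ID` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `post_author` bigint(20) unsigned NOT NULL DEFAULT '0',
  `post_date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `post_content` longtext NOT NULL,
  `post_title` text NOT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
"""

columns = table_columns(SAMPLE_DUMP, "wp_posts")
print(columns.index("post_title"))  # position of the title field in each row
```

With the positions looked up by name, a dropped or added column changes the index but not the code that uses it.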

It is one thing to account for table structure changes. It is quite another thing to map the changes to a complex pattern. In fact, doing so might not be the best option. I came across an interesting post about importing posts and pages from one website to another. This gave me a new direction to explore.

Using the WordPress Exporter

I played around with the export tool provided by WordPress. Its output turns out to be a simple XML file! Of course, simple is relative. The exporter has three options: all content, posts or pages. The good news is that if you don’t want to bother with adding pages to your e-book, you can use the posts option.

At this point, I was ready to abandon the old pattern in favor of parsing the XML file. After all, the XML file is much cleaner than the raw data from the database. I would need to extract the title, publication date and content. I found the specific XML tags that identify these elements:

  • <title> and </title>
  • <pubDate> and </pubDate>
  • <content:encoded><![CDATA[ and ]]></content:encoded>
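As a rough illustration of how those three tags can be pulled out, here is a minimal Python sketch that uses the standard library's XML parser rather than a hand-built pattern. This is a swapped-in technique, not necessarily the method the next post uses. The function name `extract_posts` and the sample feed are made up; the `content` namespace URI is the standard one for the RSS content module.

```python
import xml.etree.ElementTree as ET

# Namespace that the content:encoded tag belongs to.
CONTENT_NS = {"content": "http://purl.org/rss/1.0/modules/content/"}


def extract_posts(xml_text):
    """Return a list of dicts holding title, pubDate and content per item."""
    root = ET.fromstring(xml_text)
    posts = []
    # Only look inside <item> elements; the channel has its own <title>.
    for item in root.iter("item"):
        posts.append({
            "title": item.findtext("title", default=""),
            "pubDate": item.findtext("pubDate", default=""),
            "content": item.findtext(
                "content:encoded", default="", namespaces=CONTENT_NS
            ),
        })
    return posts


# Hypothetical one-post export, trimmed to the tags of interest.
SAMPLE = """<rss version="2.0"
    xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>My Blog</title>
    <item>
      <title>Only Post</title>
      <pubDate>Thu, 28 Aug 2014 12:00:00 +0000</pubDate>
      <content:encoded><![CDATA[<p>Hello, eBook!</p>]]></content:encoded>
    </item>
  </channel>
</rss>"""

for post in extract_posts(SAMPLE):
    print(post["title"], "|", post["pubDate"])
    print(post["content"])
```

The parser unwraps the CDATA section on its own, so the content comes back as the post's HTML without the `<![CDATA[` and `]]>` markers.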

That is the topic of the next post.

How to Backup All Your WordPress Posts

Project update

Aug. 28, 2014:

The WordPress Exporter provides a better way to extract blog posts. I will use it for the rest of this project.


As part of the Blog-to-eBook Project, I will present a step-by-step procedure for acquiring your blog posts and pages. You will also gain the benefit of having a backup plan for your WordPress blog.

Plugins for backing up WordPress have many features. At the most basic, a good backup plugin will export your posts from the WordPress database on your web host. Once the posts have been extracted, you can save, download, email or copy them to a cloud service like Dropbox. As long as you can access the backup files and copy them to your local hard drive, you can use whatever plugin and storage scheme you’d like. (For the sake of clarity, I use the term posts only. WordPress considers pages and attachments to be posts as well, and they all get backed up. Be aware that only the links to attachments are backed up in the posts table. Depending on your chosen plugin, the actual attachments may be added to the backup file.)

For this tutorial, I used iThemes Security, a great plugin for securing and backing up WordPress installations. I set the backups to be emailed to me, so that I can easily download the attachments. (Plus, I don’t want to use up server space.)

To start, you have to install your chosen plugin. Once you have activated it, find the setting that allows you to configure the backups.

How to Backup WordPress Posts

Visit iThemes Security plugin page


Weirdly, the iThemes Security Backups tab emphasizes the Create Database Backup button when, in fact, your first step is to click the Adjust Backup Settings link.

iThemes Security backup tab

Do not click the button…yet

On the massive settings tab, the backup settings are about midway down. You have just three Backup Methods from which to choose. I selected Email Only. If you choose Save Locally Only, you’ll have to transfer the file via FTP. This might actually be necessary if your email chokes on huge attachments.

You should check the box for Zip Database Backups. Compressing the backup greatly reduces the size of the file you have to download. (See final image)

iThemes backup settings

Finally, set up scheduled backups. It doesn’t matter for this project, but if you are blogging actively, you may as well reap the benefits of current backups.

Enable scheduled backups

You may as well enable scheduled backups

Back on the main iThemes Security Backups tab, click the Create Database Backup button to generate a current backup. Get that file onto your hard drive so that you can begin the next step.

Now you can click the button


Here is the downloaded attachment. I opened it in 7-zip to show you the compression – the zip file is just over 20% of the original file’s size! (1.7 MB original vs. a 375 KB attachment)

7-zip Info screen

Nearly 80% compression ratio
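If you don't have 7-zip handy, the same comparison can be sketched with Python's standard `zipfile` module. The function name `compression_summary` is made up, and the demo builds a small in-memory archive of repetitive text as a stand-in for a real backup, so its ratio will not match the figures above.

```python
import io
import zipfile


def compression_summary(zip_source):
    """Return (original bytes, compressed bytes) totalled over an archive."""
    with zipfile.ZipFile(zip_source) as archive:
        infos = archive.infolist()
        original = sum(info.file_size for info in infos)
        compressed = sum(info.compress_size for info in infos)
    return original, compressed


# Stand-in data: repetitive text, loosely like a database dump.
DATA = "-- sample backup line\n" * 1000

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("backup.sql", DATA)

original, compressed = compression_summary(buffer)
print(f"zip contents are {compressed / original:.0%} of the original size")
```

For a downloaded attachment, pass the path to the zip file instead of the in-memory buffer.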


