My blog-to-ebook project is going to be an exercise in parsing and extracting. I briefly considered using Anthologize, a WordPress plugin that many people seem to love, but I want total control over the entire process. So I’ll begin by extracting my posts, sorting them, and picking out the ones that will go into the ebooks.
The WordPress database tables that store your posts, pages, comments and other data can be exported into a simple text file. In fact, that’s what happens when you perform a backup using a plugin such as WordPress Database Backup (WPDB).
I am taking full advantage of this. I instructed WPDB to email my backups to my Gmail account, so I can save any one of them to my hard drive and unzip it into a folder. For the unzipping I use 7-Zip, a free, open-source program that creates and manages archive files. After a bit of parsing and extracting, I end up with a spreadsheet of post titles, dates and the actual text.
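If you would rather script the two ends of that workflow than do them by hand, here is a rough Python sketch. It is not what I do (I use 7-Zip) and it rests on assumptions: that the emailed backup is a gzip-compressed SQL dump, and that you already have your posts as (title, date, text) rows. The function names `unpack_backup` and `write_spreadsheet` and the file names are made up for illustration, and the parsing step in between is left out.

```python
import csv
import gzip
import shutil
from pathlib import Path


def unpack_backup(archive_path, out_dir):
    """Decompress a gzip-compressed backup into out_dir and return the new path.

    Assumption: the backup is a single .gz file, e.g. "backup.sql.gz".
    """
    archive = Path(archive_path)
    target_dir = Path(out_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    # Drop the final ".gz" so "backup.sql.gz" becomes "backup.sql".
    target = target_dir / archive.with_suffix("").name
    with gzip.open(archive, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target


def write_spreadsheet(posts, csv_path):
    """Write (title, date, text) rows to a CSV file that a spreadsheet can open."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["title", "date", "text"])
        writer.writerows(posts)
```

A call such as `unpack_backup("backup.sql.gz", "unpacked")` would leave `unpacked/backup.sql` on disk, and `write_spreadsheet(rows, "posts.csv")` produces a file that any spreadsheet program should be able to open.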
I’ll explain the parsing and extracting next time. For now, here is a montage of the action: