What Does That Testimonial Even Mean?

I don’t have an Elevator Pitch. Those are so 20th Century. It’s more fun to wave my arms expansively, mutter some technical terms and then blurt, “BAM! Out comes this neat document.”

Seriously, though, I try not to pin myself down when I am offline. Just because the Web requires hyper-vertical niches doesn’t mean that a description of my work has to fit in a tweet-size box. In fact, marketing experts tell us that we will have a chance to explain all of the other things we do once the prospect has gotten more comfortable with us.

But what if somebody stumbles upon my testimonial page and sees something like this?
Excellent work ! Great communication ! Very pleased! Recommend to everybody !

I appreciate this client’s enthusiastic support and I don’t wish to imply that this testimonial is not as near and dear to my heart as any other hard-won praise. However, I don’t want the new visitor to do an eye-roll and bounce away. So, I link the testimonial to a page like this to draw the visitor in.

Roman had hundreds of automobile descriptions pasted from websites. He needed a way to automatically extract specific keywords from each description. Aliases had to map to the same keyword, and the output had to be in a single Word document. He also required the ability to add and delete keywords whenever he wanted.

I wrote a VBA macro that builds a tiny, dynamic keyword database from Roman’s supplied keyword file. The database is rebuilt each time the macro runs, so he can make changes and apply them immediately. All he had to change for me was to add a marker at the end of each description (we agreed on ###).
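For the curious, here is a rough Python sketch of the general idea. It is not the VBA macro itself. The keyword file layout (one "Keyword: alias, alias" entry per line) and the function names load_keywords and extract_keywords are assumptions made up for illustration. Only the ### marker comes from the real project.

```python
import re

def load_keywords(lines):
    """Map every alias (and the keyword itself) to its canonical keyword.

    Assumed file layout, one entry per line: "Keyword: alias one, alias two"
    """
    lookup = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        keyword, _, aliases = line.partition(":")
        keyword = keyword.strip()
        lookup[keyword.lower()] = keyword
        for alias in aliases.split(","):
            alias = alias.strip()
            if alias:
                lookup[alias.lower()] = keyword
    return lookup

def extract_keywords(text, lookup, marker="###"):
    """Return one list of canonical keywords per marker-terminated description."""
    results = []
    for description in text.split(marker):
        if not description.strip():
            continue  # nothing after the final marker
        lowered = description.lower()
        found = []
        for alias, keyword in lookup.items():
            # Whole-word match so "sunroof" does not fire inside a longer word
            hit = re.search(r"\b" + re.escape(alias) + r"\b", lowered)
            if hit and keyword not in found:
                found.append(keyword)
        results.append(found)
    return results

lookup = load_keywords(["Leather: leather seats", "Sunroof: moonroof"])
print(extract_keywords("Sedan with moonroof and leather seats. ###", lookup))
```

Because the lookup table is rebuilt from the keyword lines on every run, adding or deleting a keyword is just a matter of editing that file.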

I’ll let the collage tell the rest of the story.
Keyword Extraction with Predefined Database
Thanks, Roman!

Patterns of Behavior

You hear this all the time: “Just because you can do something doesn’t mean that you should.” Using the same tricks to solve all of your problems keeps you from finding better ways to accomplish those tasks. I’ll give you a technical example from my mad laboratory.
Image: Pretty Pattern, by Pablo Fernández via Compfight
I’m creating a script for Retrievem that extracts words from lists based on patterns. Much of Retrievem’s power comes from something called regular expressions. The main selling point, extracting email addresses, is a single regular expression:

^[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,6}$


Even without knowing regular expressions, you might be able to pick out the @ sign, and you can guess that the part before the sign is the mailbox name, while the part after it is the domain. Neat, right?
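If you want to poke at that expression yourself, here is a small Python sketch. It assumes case-insensitive matching (the pattern only lists capital letters) and one candidate per line, since the pattern is anchored at both ends. The function name extract_emails is invented for this example and says nothing about how Retrievem is built.

```python
import re

# The expression from the post, compiled case-insensitively (an assumption)
EMAIL = re.compile(r"^[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,6}$", re.IGNORECASE)

def extract_emails(lines):
    """Keep only the lines that are, in their entirety, an email address."""
    candidates = (line.strip() for line in lines)
    return [c for c in candidates if EMAIL.match(c)]

sample = ["roman@example.com", "not an email", "A.B+c@mail.example.org"]
print(extract_emails(sample))
```

Note that the {2,6} at the end rejects any address whose final domain label is longer than six letters.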

Well, pattern matching is what regular expressions are all about. However, a regular expression may not be the best way to match words, depending on what the patterns are. One type of pattern might be, “Show me all six-letter words that begin and end with vowels.” This is easy enough to do with a regular expression:

^[aeiou][a-z]{4}[aeiou]$


You can see the vowels in the brackets on both ends of the expression. Depending on the source word list, Retrievem would return a list with words like ALKYNE, ARGYLE, ASSURE, EQUINE and ETHYNE.

But what if the pattern is stricter? “Show me all six-letter words containing exactly two vowels, one at the beginning and one at the end.” Well, that’s still easy if you replace that [a-z] bit with [^aeiou], which matches anything except a vowel. Now, Retrievem throws away everything except ALKYNE, ARGYLE and ETHYNE (Y counts as a consonant here).
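Here is a quick Python sketch that runs both vowel patterns over the sample words. Lowercasing each word before matching is my assumption about how the capitalized word list meets the lowercase patterns. The helper name filter_words is made up for the example.

```python
import re

# Six letters, vowel at each end, any letters in between
LOOSE = re.compile(r"^[aeiou][a-z]{4}[aeiou]$")
# Same, but the four middle characters must not be vowels
STRICT = re.compile(r"^[aeiou][^aeiou]{4}[aeiou]$")

def filter_words(pattern, words):
    """Return the words whose lowercase form matches the pattern."""
    return [w for w in words if pattern.match(w.lower())]

words = ["ALKYNE", "ARGYLE", "ASSURE", "EQUINE", "ETHYNE", "BANANA"]
print(filter_words(LOOSE, words))   # ['ALKYNE', 'ARGYLE', 'ASSURE', 'EQUINE', 'ETHYNE']
print(filter_words(STRICT, words))  # ['ALKYNE', 'ARGYLE', 'ETHYNE']
```

One caveat: [^aeiou] also accepts digits and punctuation, so on a messy word list you might prefer to spell out the consonants instead.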
Image: Another Pretty Pattern, by Pablo Fernández via Compfight
I wanted my script to be even more flexible. Imagine that you’re trying to solve a cryptogram like this:

(Cryptogram image, from Puzzleland on Race2Hugo.net)

You could ask Retrievem to find all words that match the pattern for XSHHJYU, which is the first word in the puzzle. First of all, the regular expression is slightly different*:

^([a-z])([a-z])([a-z])(\3)([a-z])([a-z])([a-z])$


Secondly, the regular expression doesn’t take the rule of cryptograms into account. Each letter in the code stands for only one letter in the answer, and each letter in the answer is written with only one code letter. So, Retrievem will return useless suggestions like FREEZER and SETTLER, where the extra E’s can’t fit because the only repeated letter in XSHHJYU is the doubled H. What I needed was a stricter pattern matcher that looks at the letters as well as their positions in each word.
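To see the problem in action, here is a Python sketch that builds that kind of backreference expression from a code word and tests a few candidates. The function name pattern_regex is hypothetical, and this is only one way such an expression could be generated. Every position gets its own group, so a position number doubles as a group number.

```python
import re

def pattern_regex(cipher):
    """Build an expression where a repeated code letter becomes a backreference."""
    first_seen = {}  # code letter -> group number of its first occurrence
    parts = []
    for ch in cipher.lower():
        group_no = len(parts) + 1  # every element is wrapped in its own group
        if ch in first_seen:
            parts.append("(\\%d)" % first_seen[ch])
        else:
            first_seen[ch] = group_no
            parts.append("([a-z])")
    return "^" + "".join(parts) + "$"

expr = pattern_regex("XSHHJYU")
print(expr)  # ^([a-z])([a-z])([a-z])(\3)([a-z])([a-z])([a-z])$

for word in ["battled", "freezer", "settler"]:
    # All three match, even though only the first is a legal cryptogram answer
    print(word, bool(re.match(expr, word)))
```

The expression only demands that positions three and four agree. It never says that the other positions must differ from each other.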

I actually spent some time playing with regular expressions to solve this. The answer may be out there, but I decided to just add some code to Retrievem itself. Basically, it converts each word into a number in such a way that only words with the same numbers will satisfy the pattern. Simpler for me and the results are wonderful!

BATTLED, BATTLER, BEDDING, BELLOWS, BETTING, BHEESTY, BIGGEST, BILLOWY,
BULLACE, CADDISH, CAFFEIN, CALLING, CESSPIT, CHEERIO, CHEERLY, CHOOSER,
CIRRATE, CIRROSE, CIRROUS, CITTERN, CLOOTIE, COBBING, COBBLER, CODDING,
CODDLER, COFFRET, COLLAGE, CUTTING, CUTTLER, DABBLER, DAFFERY, DAFFING,
DAFFISH, DAGGERS, DALLIER, FADDISH, FADDISM, FADDIST, FALLING, FIBBERY,
FIDDLER, FISSURE, FISSURY, FITTAGE, FITTERS, FLOODER, FLOORED, FOGGILY,
FOGGISH, FOPPERY, FOPPISH, FOSSULA, FOSSULE, FREEDOM, FREEISH, FREESIA,
FUDDLER, FULLERY, FULLING, FULLISH, FUNNILY, FURRILY, FURRING, FURROWY,
FUSSILY, GALLEON, GARROTE, GOSSIPY, GREENLY, GROOVEY, HAPPIER, HAPPIFY,
HAPPILY, HELLBOX, HELLCAT, HELLDOG, HELLION, HERRING,…, WETTING, WETTISH,
WOBBLER, WOFFLER, WOPPISH, WORRIED, YAPPING, YELLOWS


Here are just a few of the 852 matches Retrievem found in my word file
* (Regex experts: the grouping of every letter is not necessary, but Retrievem needs to be able to use any position as a backreference and it was easier to just put each element into a group.)
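For anyone who wants to try the word-to-number idea, here is one possible Python sketch. The actual encoding inside Retrievem isn't shown in this post, so treat the details as assumptions: each letter is replaced by the order in which it first appears (1, 2, 3, ...), and those values are packed into a single base-27 integer. The names pattern_number and find_matches are invented for the example.

```python
def pattern_number(word):
    """Encode a word's letter pattern as one integer.

    Each letter becomes the rank of its first appearance, so XSHHJYU and
    BATTLED both turn into the sequence 1,2,3,3,4,5,6. Ranks run from 1 to
    26, so base 27 keeps every sequence (and every length) distinct.
    """
    first_seen = {}
    number = 0
    for ch in word.lower():
        rank = first_seen.setdefault(ch, len(first_seen) + 1)
        number = number * 27 + rank
    return number

def find_matches(cipher, words):
    """Return the words whose pattern number equals the code word's."""
    target = pattern_number(cipher)
    return [w for w in words if pattern_number(w) == target]

candidates = ["BATTLED", "FREEDOM", "FREEZER", "SETTLER", "WORRIED"]
print(find_matches("XSHHJYU", candidates))  # ['BATTLED', 'FREEDOM', 'WORRIED']
```

Two words get the same number only when their letters repeat in exactly the same positions, which is what rules out FREEZER and SETTLER.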

Final Hours for Free Software

By the way ...

It’s official. The Morpho Designs Software Library is live. The first product off the assembly line is Retrievem:


Text Processor

Retrievem is a user-friendly text extraction program. Instantly grab email addresses buried in hundreds of files. Need a list of web addresses? Extract them from web pages, bookmark files, spreadsheets and more!

Save time with Retrievem! Finding, copying and pasting data one by one takes too long! Let Retrievem do that for you, automatically!


