So you wanna learn Regex? - Part 6

Welcome to So You Wanna Learn Regex? Part 6. OK, I know I said part 5 would be the last part in the series, but I just had to work this one out and wanted to share. Remember, if you want more tutorials about regex, especially more advanced ones than the Mickey Mouse ones here, go bug Ben. He knows more about this than I ever will and I hear he has a blog...

In our last exercise, we looked at cleaning up some data scripts.

In this exercise, we are going to reformat a configuration file from .ini style to ColdSpring MapFactory style. Specifically, I'm integrating CFFormProtect into an application and I want the config to be managed in ColdSpring with the rest of my configurations. Sure, I could go flapping around with copy+paste, smashing keys, burning tendons, but that seems so Junior Programmerish, doesn't it?

Assume this set of declarations:

mouseMovement=1
usedKeyboard=1
timedFormSubmission=1
hiddenFormField=1
akismet=0
tooManyUrls=1
teststrings=1
projectHoneyPot=0
timedFormMinSeconds=5
timedFormMaxSeconds=3600
encryptionKey=JacobMuns0n
akismetAPIKey=
akismetBlogURL=
akismetFormNameField=
akismetFormEmailField=
akismetFormURLField=
akismetFormBodyField=
tooManyUrlsMaxUrls=6

What we want is to turn: mouseMovement=1 into: <entry key="mouseMovement"><value>1</value></entry>

Note we've split a string delimited by an equals sign into some XML nodes.
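For anyone who'd rather poke at this in a script before opening Eclipse, here is a rough sketch of the same split-and-wrap idea in Python. The names `INI_LINE` and `ini_to_entries` are made up for illustration, and the pattern is just one guess that happens to fit the lines above; it is not necessarily the one used in the Eclipse steps.

```python
import re

# Hypothetical pattern (an assumption, not taken from the post):
#   group 1 = the key, i.e. everything before the first "="
#   group 2 = the value, i.e. the rest of the line (may be empty)
INI_LINE = re.compile(r"^([^=\n]+)=(.*)$", re.MULTILINE)


def ini_to_entries(text):
    """Rewrite each key=value line as a MapFactory-style <entry> node."""
    return INI_LINE.sub(r'<entry key="\1"><value>\2</value></entry>', text)


if __name__ == "__main__":
    sample = "mouseMovement=1\nakismetAPIKey=\ntooManyUrlsMaxUrls=6"
    print(ini_to_entries(sample))
```

Empty values such as akismetAPIKey= still match, because group 2 is allowed to capture nothing.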

So as you know, we define this pattern in the gobbledegook of regular expressions. When read one chunk at a time, these actually make sense. We'll go through the exercise, then look at why it worked.

In Eclipse, perform the following:

[More]

So you wanna learn Regex? - Part 5

Welcome to So You Wanna Learn Regex? Part 5. This is our last part of this series, mostly because I don't know a whole lot more than this. If you want more tutorials about regex, go bug Ben. He knows more about this than I ever will and I hear he has a blog...

In our last exercise, we looked at a simple way to add cfqueryparam to a bunch of queries. This was accomplished by making a pattern consisting of 3 groups then using one of the groups to populate a literal string.

In this exercise, we are going to clean up some data scripts. Let's suppose you are generating database scripts and your script generator puts the primary key in there. For whatever reason, you want to remove this.

Assume this set of declarations:

INSERT INTO `memberchallenge` VALUES ('11', '1', '19', null, '2008-11-14 14:07:59', '2008-11-14 14:07:59', '1', '2008-11-14 14:07:59', '0');
INSERT INTO `memberchallenge` VALUES ('12', '2', '19', null, '2008-11-14 15:40:51', '2008-11-14 15:40:51', '1', '2008-11-14 15:40:51', '0');
INSERT INTO `memberchallenge` VALUES ('14', '5', '19', null, '2008-11-14 20:14:26', '2008-11-14 20:14:26', '5', '2008-11-14 20:14:26', '0');
INSERT INTO `memberchallenge` VALUES ('15', '1', '20', null, '2008-11-23 18:19:31', '2008-11-23 18:19:31', '1', '2008-11-23 18:19:30', '0');
INSERT INTO `memberchallenge` VALUES ('16', '2', '20', null, '2008-11-23 18:20:09', '2008-11-23 18:20:09', '1', '2008-11-23 18:20:09', '0');
INSERT INTO `memberchallenge` VALUES ('17', '1', '21', null, '2008-11-25 20:32:44', '2008-11-25 20:32:44', '1', '2008-11-25 20:32:44', '0');
INSERT INTO `memberchallenge` VALUES ('18', '2', '21', null, '2008-11-25 20:33:01', '2008-11-25 20:33:01', '1', '2008-11-25 20:33:01', '0');

What we want is to turn: INSERT INTO `memberchallenge` VALUES ('11', '1', '19', null, '2008-11-14 14:07:59', '2008-11-14 14:07:59', '1', '2008-11-14 14:07:59', '0'); into: INSERT INTO `memberchallenge` VALUES ('1', '19', null, '2008-11-14 14:07:59', '2008-11-14 14:07:59', '1', '2008-11-14 14:07:59', '0');

Note the first value in the VALUES statement has vanished. This would be the primary key in our dataload script.
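Outside Eclipse, the same trick can be sketched in a few lines of Python. This assumes every row starts with a single-quoted primary key followed by a comma and a space, as in the rows above; `FIRST_VALUE` and `drop_first_value` are hypothetical names, and the pattern is a guess rather than the one from the Eclipse walkthrough.

```python
import re

# Hypothetical pattern (an assumption, not taken from the post):
#   group 1 = "VALUES (" which we keep
#   then the first quoted value plus its trailing ", " which we throw away
FIRST_VALUE = re.compile(r"(VALUES \()'[^']*', ")


def drop_first_value(sql):
    """Remove the first quoted value from each VALUES (...) list."""
    return FIRST_VALUE.sub(r"\1", sql)


if __name__ == "__main__":
    row = "INSERT INTO `memberchallenge` VALUES ('11', '1', '19', null, '0');"
    print(drop_first_value(row))
```

If the script generator ever emits an unquoted key (say, a bare number), this pattern simply won't match that row and it is left untouched.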

So as you know, we define this pattern in the gobbledegook of regular expressions. When read one chunk at a time, these actually make sense. We'll go through the exercise, then look at why it worked.

In Eclipse, perform the following:

[More]

So you wanna learn Regex? - Part 4

Welcome to So You Wanna Learn Regex? Part 4. In our last exercise, we looked at a simple way to clean a whole bunch of strings. This was accomplished by making a pattern, then removing everything according to that pattern. This time we are going to add cfqueryparam to a query. Say, for example, that you have a junior developer who has been turned loose on her first application and she's done a good job, except that she didn't use cfqueryparam. You just found this out and the site has to go live in 10 minutes and you have 200 queries to fix. Do you:

  • a) Download the code to your laptop then pull the fire alarm to stall for time?
  • b) Start blasting your resume out on Monster.com?
  • c) Take a fistful of aspirin, knowing your forearms will ache in the morning?

If you answered d) none of the above, please keep reading.

Assume this set of declarations:

UPDATE plant
	SET 	Symbol = '#form.symbol#',
			SynonymSymbol = '#form.SynonymSymbol#',
     		ScientificNameWithAuthor = '#form.ScientificNameWithAuthor#',
     		CommonName = '#CommonName#',
     		Family = '#Family#'
WHERE PlantCode = '#form.plantCode#'

What we want is to turn: '#form.symbol#' into: <cfqueryparam value="#form.symbol#" cfsqltype="cf_sql_varchar">
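As a rough sketch of that swap outside Eclipse, here it is in Python. The pattern uses three groups (opening quote, the #variable#, closing quote) and reuses only the middle one. That shape is a guess, and `QUOTED_VAR` and `add_queryparam` are made-up names. It also assumes every value is a varchar, which a real query may not honor.

```python
import re

# Hypothetical pattern (an assumption, not taken from the post):
#   group 1 = opening single quote
#   group 2 = the ColdFusion variable, pound signs included
#   group 3 = closing single quote
QUOTED_VAR = re.compile(r"(')(#[^#']+#)(')")


def add_queryparam(sql):
    """Swap each '#var#' for a cfqueryparam tag (assumes varchar everywhere)."""
    return QUOTED_VAR.sub(
        r'<cfqueryparam value="\2" cfsqltype="cf_sql_varchar">', sql
    )


if __name__ == "__main__":
    print(add_queryparam("WHERE PlantCode = '#form.plantCode#'"))
```

Only group 2 appears in the replacement, so the surrounding single quotes are dropped along the way.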

So as you know, we define this pattern in the gobbledegook of regular expressions. When read one chunk at a time, these actually make sense. We'll go through the exercise, then look at why it worked.

In Eclipse, perform the following:

[More]

So you wanna learn Regex? - Part 3

Welcome to So You Wanna Learn Regex? Part 3.

In our last exercise, we looked at a simple way to wrap a function argument inside a new function. This was accomplished by making a pattern, defining a group and using a back reference. This time we will look at how to clean some strings.

Say, for example, that you run a website called The Health Challenge and you want to use some of your fine tax-dollar-funded research to deliver motivating messages to the members.

Well, you could just happen across Small Steps and just use their content. After all, it is in the public domain. So you happily cut a LARGE chunk of these from the web site, but now you have to clean them.

Assume this set of declarations:

(# 11)  	Avoid food portions larger than your fist.
(# 12) 	Mow lawn with push mower.
(# 13) 	Increase the fiber in your diet.
(# 17) 	Join an exercise group.
(# 20) 	Do yard work.
(# 24) 	Skip seconds.
(# 25) 	Work around the house.
(# 26) 	Skip buffets.
(# 29) 	Take dog to the park.
(# 30) 	Ask your doctor about taking a multi-vitamin.
....( 700 more lines)

What we want is to turn: (# 11) Avoid food portions larger than your fist. into: Avoid food portions larger than your fist.

See, we like the content, but we don't like the parentheticals or the whitespace. Do we flex our forearms in preparation for a copy/paste session? Do we call KeyboardsAreUs.com and have 2 fresh keyboards airdropped, knowing we'll wear out some keys? (If you said yes, please delete your hard drive and apply at KFC.)

Regular expressions are our friends. A Regex is a pattern matcher, and it can do stuff. We can see our code is repetitive and the pattern we want is: Get rid of the parentheticals and the extra whitespace. (Same stuff we'd do over and over via cut/paste/etc, isn't it? Though in a copy/paste, you are talking about 5 keystrokes per line times 700 lines. That is 3500 keystrokes, unless you type like me, in which case it would be nearly 4 million.)
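Here is a minimal sketch of that cleanup in Python, for trying outside Eclipse. It assumes every line starts with a prefix shaped like (# 11) followed by spaces or tabs. `PREFIX` and `strip_prefix` are hypothetical names, and the pattern is one guess, not necessarily what the Eclipse steps use.

```python
import re

# Hypothetical pattern (an assumption, not taken from the post):
#   "(#", optional whitespace, some digits, ")", then any spaces or tabs.
# [ \t]* is used instead of \s* so the match never runs onto the next line.
PREFIX = re.compile(r"^\(#\s*\d+\)[ \t]*", re.MULTILINE)


def strip_prefix(text):
    """Delete the leading "(# nn)" marker and the whitespace after it."""
    return PREFIX.sub("", text)


if __name__ == "__main__":
    tips = "(# 11)  \tAvoid food portions larger than your fist.\n(# 12) \tMow lawn with push mower."
    print(strip_prefix(tips))
```

Nothing is captured here; the whole match is replaced with an empty string, which is the "remove everything according to that pattern" move.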

So as you know, we define this pattern in the gobbledegook of regular expressions. When read one chunk at a time, these actually make sense. We'll go through the exercise, then look at why it worked.

In Eclipse, perform the following:

[More]

So you wanna learn Regex? - Part 2

Welcome to So You Wanna Learn Regex? Part 2. In our last exercise, we looked at a simple way to add a new attribute to an HTML tag. This was accomplished by making a pattern, defining a group and using a back reference. This time we will look at a slightly more complicated use case.

Assume this set of declarations:

product.setColor(arguments.color);
product.setSize(arguments.size);
product.setCondition(arguments.condition);
product.setRating(arguments.rating);
product.setReliability(arguments.reliability);
product.setNeedsBatteries(arguments.needsBatteries);

What we want is to turn: product.setColor(arguments.color); into: product.setColor( htmlEditFormat(arguments.color) );

Normally, this would be a forearm/wrist fatiguing flail on the keyboard, furiously cutting/pasting and generally flapping about. Not so with Regular Expressions. A Regex is a pattern matcher, and it can do stuff. We can see our code is repetitive and the pattern we want is: Take Everything Inside The Parentheses, and Wrap It In An htmlEditFormat() Function. (Same stuff we'd do over and over via cut/paste/etc, isn't it?)
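To see the group-and-back-reference idea without Eclipse, here is a small Python sketch. It assumes each call takes exactly one arguments.something value, as in the lines above; `SETTER_ARG` and `wrap_argument` are invented names and the pattern is a guess, not the post's own.

```python
import re

# Hypothetical pattern (an assumption, not taken from the post):
#   literal "(", then group 1 = "arguments." plus a name, then literal ")"
SETTER_ARG = re.compile(r"\((arguments\.\w+)\)")


def wrap_argument(code):
    """Wrap the captured argument in htmlEditFormat(), via back reference \\1."""
    return SETTER_ARG.sub(r"( htmlEditFormat(\1) )", code)


if __name__ == "__main__":
    print(wrap_argument("product.setColor(arguments.color);"))
```

The parentheses in the pattern are escaped because they are literal text; the unescaped pair is the group whose contents come back as \1.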

We can define this pattern in the gobbledegook of a regular expression. When read one chunk at a time, these actually make sense. We'll go through the exercise, then look at why it worked.

In Eclipse, perform the following:

[More]

So you wanna learn Regex? - Part 1

I've had a set of blog posts stewing in my brain for a while. Steve Nelson, last year, helped me out with a Regular Expression (Regex) and I made it a point to practice my Regex skills more. This series will show how to use Regular Expressions in Eclipse and we'll learn some helpful tips along the way. This series is for you if you are the kind of developer that reads Ben Nadel's blog posts containing Regular Expressions, and has no idea what the heck he is talking about. Seriously Ben, this is unintelligible to us mere mortals:

<cfset blogContent = reReplace( blogContent, "</?\w+(\s*[\w:]+\s*=\s*(""[^""]*""|'[^']*'))*\s*/?>", " ", "all" ) />
(It looks like a catnip crazed kitty went for a prance on a keyboard, doesn't it?) Enough guffaws and such. On with the learning.

Editor's Note:

Simply reading these blog posts isn't going to help you. Open Eclipse, and copy/paste this stuff into your find/replace dialog. You'll learn more, or your money back!

So, firstly we need a use case. Let's pretend we are going through some old code and looking to add an id attribute to each form input, matching its name.

Assume this set of declarations:

<input name="fred" value="willy" />
<input name="bill" value="mickey" />
<input name="erin" value="harry" />
<input name="baz" value="pissette" />

What we want is to turn: <input name="fred" value="willy" /> into: <input name="fred" id="fred" value="willy" />

Normally, this would be a forearm/wrist fatiguing flail on the keyboard, furiously cutting/pasting and generally flapping about. Not so with Regular Expressions. A Regex is a pattern matcher, and it can do stuff. We can see our code is repetitive and the pattern we want is: make a new attribute called 'id' and populate it with the value from the attribute 'name'... which is what we'd do over and over via cut/paste/etc.

We can define this pattern in the gobbledegook of a regular expression, of course, else I'd be writing this post about Cute LOLCats, not Cute Regexes, wouldn't I? We'll go through the exercise, then look at why it worked.
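If you want to try the idea in a script as well as in the find/replace dialog, here is a rough Python sketch. It assumes the name attribute is always double-quoted, as in the inputs above. `NAME_ATTR` and `add_id` are made-up names, and the pattern is one guess at it, not necessarily the Eclipse version.

```python
import re

# Hypothetical pattern (an assumption, not taken from the post):
#   group 1 = whatever sits between the quotes of name="..."
NAME_ATTR = re.compile(r'name="([^"]*)"')


def add_id(html):
    """Re-emit the name attribute, then add an id with the same value."""
    return NAME_ATTR.sub(r'name="\1" id="\1"', html)


if __name__ == "__main__":
    print(add_id('<input name="fred" value="willy" />'))
```

The back reference \1 is used twice in the replacement: once to put the name back, once to fill in the new id.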

In Eclipse, perform the following:

[More]