Using MongoDB Aggregation Framework to Filter Trello Cards

I'm helping prepare the CFSummit conference. We've organized the sessions on Trello and had a public voting session. It's time to start organizing the topics into a schedule.

In a conference schedule, it's important to know which sessions will be popular. It's desirable to ensure the most desirable sessions do not compete with each other. Thus, I wanted to pull out the sessions and organize the sessions by popularity.

The MongoDB Aggregation Framework

The MongoDB aggregation framework is a relatively new addition to the platform. Using this framework, you can group, sort, calculate and handle information gathering in the aggregate sense. Here's how I did this for the Trello Json data.

The Mongo Query

Exporting out of Trello gives a big JSON document with JSON members for each card. It turns out, in our case, all of the cards we want belong to a specific list. Once we pull the correct cards, we want to sort them by their votes. We'll end up with a sorted array of sessions by popularity. Here is the MongoDB query:

view plain print about
1db.cfsummit.aggregate([
2    {$project: { "cards": "$cards"}},
3    {$unwind: "$cards"},
4    {$match: {"cards.idList": {"$in": ["51c9aa15d0b4871a3e000075"]}}},
5    {$project: {"_id": 1, "name": "$cards.name", "members": "$cards.idMembers", "url": "$cards.url", "votes": "$cards.badges.votes"}},
6    {$sort: {votes:-1}}
7])

Explained Line by Line:

db.cfsummit.aggregate([

Notice the argument to the aggregate command is an array? This means you can organize a series of document transformations into steps. Each step will manipulate the document in some fashion. Let's look at our first step in the transformation:

{$project: { "cards": "$cards"}},

The first transformation is a $project command. Project (Pro-JECT), means to project a new way to view the data. In this case, I'm only interested in the cards node. The result of this document is a new document with basically only the cards member. You can write queries without $project, but I always do use it for 2 reasons. Firstly, reducing the size of the working document makes the query more efficient. The resulting projected document is smaller and can more easily be manipulated. The second reasons is I write my queries incrementally, so I only need to see, what I need to see. (Note the cards member is an array, this is important in the next step)

view plain print about
1"result" : [
2    {
3        "_id" : ObjectId("51ee98afaa17829291af81e0"),
4        "cards" : [
5            {
6                "id" : "51b0fbec94b2237145005a18",
7                "badges" : {
8                    "votes" : 0,
9                    "viewingMemberVoted" : false,
10                    "subscribed" : false,
11.....

{$unwind: "$cards"},

Now the card nodes is an array. I'm going to want to sort all of the matching cards by the votes parameter. I use an $unwind command to transform the cards array members into their own documents.

view plain print about
1"result" : [
2    {
3        "_id" : ObjectId("51ee98afaa17829291af81e0"),
4        "cards" : {
5            "id" : "51b0fbec94b2237145005a18",
6            "badges" : {
7                "votes" : 0,
8                "viewingMemberVoted" : false,
9                "subscribed" : false,
10...

Note, the cards member is no longer an array... this is important for grouping, which we will do later.

{$match: {"cards.idList": {"$in": ["51c9aa15d0b4871a3e000075"]}}}

Each of the cards we want to deal with belongs to listId: 51c9aa15d0b4871a3e000075. So we use the $match command to match the cards with the listId we are looking for. (Think of this like a where clause in SQL, if that is your background.

view plain print about
1"result" : [
2    {
3        "_id" : ObjectId("51ee98afaa17829291af81e0"),
4        "cards" : {
5            "id" : "51b0fbec94b2237145005a18",
6            "badges" : {
7                "votes" : 0,
8                "viewingMemberVoted" : false,
9                "subscribed" : false,
10...

{$project: {"_id": 1, "name": "$cards.name", "members": "$cards.idMembers", "url": "$cards.url", "votes": "$cards.badges.votes"}},

Now I have my sorted cards belonging to the correct list. I now want to set up the return data structure in a way that is most useful to me. In my case, I want the ID, Name of the Session, The Presenters, The Trello URL for the card and the Votes Received. We once again use a $project command to organize the data in the format we want. Note, I've used a dot delimited path to walk the JSON tree to the data member I want. Hence, the votes were in the Votes Node which is inside the Badges Node which is inside the Cards node.

{$sort: {votes:-1}}

Lastly, we need to sort the cards by their popularity. The $sort command takes a JSON object containing the nodes you want to sort by. We want most votes to appear first, so we assign a -1 to the votes column for descending sort. Changing this to 1, would sort the data in an ascending manner.

Final Data Result

view plain print about
1{
2    "result" : [
3        {
4            "_id" : "51ee98afaa17829291af81e0",
5            "name" : "Security Best Practices",
6            "members" : [
7                "51b0ff9bf9d2b2b94c0027bd"
8            ],
9            "url" : "https://trello.com/c/ITpzm0xS/15-security-best-practices",
10            "votes" : 45
11        },
12        {
13            "_id" : "51ee98afaa17829291af81e0",
14            "name" : "ColdFusion Object Oriented Advanced",
15            "members" : [
16                "519a266b522736c97000a224"
17            ],
18            "url" : "https://trello.com/c/1DV4Ud2Z/41-coldfusion-object-oriented-advanced",
19            "votes" : 43
20        },
21        {
22            "_id" : "51ee98afaa17829291af81e0",
23            "name" : "REST 101",
24            "members" : [
25                "50997edfdcb1ac3f1c00ac66"
26            ],
27            "url" : "https://trello.com/c/1oYg37pV/23-rest-101",
28            "votes" : 34
29        },
30....

Want More Information?

Learn more about the MongoDB Aggregation Framework at their documentation site. You can install MongoDB in very little time and start working with data.

Free Training

If you want a more structured training, 10Gen offers a 7 week online training class on MongoDB for free. The classes are very well done. Consider a class if you are Mongo-Curious.

Getting USB Device Drivers Working for HTC Android Development

I set up a new Eclipse environment today and wanted to use my HTC Thunderbolt for testing. Usually, the way this works is you right click on your project then select your manual run target of your phone. My HTC Thunderbolt was not recognized for some reason.

After digging around for a bit, I found the USB device driver provided by Google does not support some HTC phones out of the box. I have no idea why. However, fixing it is pretty simple.

All you have to do is update the device driver .inf file. It's pretty simple to do this. Here is what you do:

  1. Follow the steps here: http://developer.android.com/guide/developing/device.html to start the process (if you found this blog article, you have likely done this step already)
  2. If you are on Windows, you'll have to get the Microsoft specific USB driver at the Google Windows USB Driver link.
  3. Once you install the Google Windows USB Driver and follow the instructions on that page for your specific OS, your device will not be recognized.
  4. Use the Device Manager to find your phone. Right Click and choose properties, then choose the Details Tab. On the Details Tab, Change the Property selector to Hardware Ids. Write down the (4?) digits in the VID_1234 (where 1234 is likely different for you) and for PID_1234 (where once again 1234 is likely different for you) You will need them later. If this is confusing, check the screenshot at the bottom of this page.
  5. Use a text editor to open [Android SDK Root]\android-sdk\extras\google\usb_driver\android_winusb.inf
  6. Find the section [Google.NTx86] and copy the lines for the HTC Dream. Paste them and change the dream to your HTC phone model.
  7. Then, update the driver specific lines with the VID_1234 number and PID_1234 number you copied above. Mine looks like this:
    view plain print about
    1; HTC Thunderbolt
    2%SingleAdbInterface% = USB_Install, USB\VID_0BB4&PID_0CA4
    3%CompositeAdbInterface% = USB_Install, USB\VID_0BB4&PID_0CA4&MI_01
    4%SingleBootLoaderInterface% = USB_Install, USB\VID_0BB4&PID_0CA4
  8. Copy and paste this code for the [Google.NTamd64] section also.
  9. When finished, try the driver update once again and you should have better luck this time.

This should improve the situation. Hat tip to Kostya Vasilyev on the Android Developers mailing list for the idea.

What To Do When A Hard Drive Fails

When a hard drive crashes, you can lose all your data. Corrupt hard drives happen out of the blue and for seemingly no good reason. If your hard drive fails, what can you do?

One option is to call a hard drive recovery company. If your data is worth a lot of money to you, you can pay a forensic computer company to get the data off your hard drive. Before you write a check though, try a little Do-It-Yourself first.

What is going on inside the hard drive is a bunch of little platters spinning at high speed. When data is accessed or written to the disk, a little head (sort of like on a record player) moves to the right spot and does it's magic. The space between the head and the platter is very very tiny. Freezing the hard drive will shrink the head and the platter ever so slightly, often allowing you to read data.

Here is how I got the data off of a failed hard drive.

  1. Remove the hard drive from the computer.
  2. Place the hard drive inside of a zip top freezer bag. (don't buy a cheap bag.)
  3. Place the wrapped hard drive inside of ANOTHER zip top freezer bag. (yes, you need to do this) (see figure 1 below)
  4. Place the double wrapped hard drive in the coldest part of your freezer.
  5. Leave the hard drive in the freezer for 12 hours at least. You want it good and cold! (see figure 2 below)
  6. Once very chilled, install the hard drive in your computer and start pulling off data. Begin with the most valuable data.
  7. At some point, the hard drive will fail again. When it does, mark the last successfully copied data, pull out the hard drive, double wrap it again and stick it in the Chill Chest for another 12 hours.
  8. You may need to do this a number of times to get all the data you want, or until the hard drive stops working completely.

Double Wrapped Hard Drive

Hard Drive in the Freezer

How to fix a corrupted Microsoft Office File

My wife creates massive proposals using Microsoft Word and Microsoft Office. Today she had a Word document spontaneously get corrupted and lose hours of her changes. The Microsoft Word document would not open at all, not on her computer, or any of her co-worker's computers.

I offered to take a look at the document and see if I could recover any part. She sent me the document and I started googling around for 'how to recover docx files and found a number of paid utilities claiming to fix the situation. Not ready to spend money, and on a whim, I tried to open the document with the Open Office, open source word processing software. Guess what, it worked!

The corrupted document opened just fine with Open Office. I easily saved the document as a .doc file which opened just fine on my wife's computer in Microsoft Office.

Since the file is now a .doc format and not a .docx format, some of the formatting was munged. However, fixing formatting is a whole lot easier than re-crafting pages and pages of text, don't you agree?

The best part about this, is it took less than a minute and $0 to repair the file.

Here are the steps:

[More]

So you wanna convert Open Office Documents to Wiki format?

While working on some Model-Glue documentation we needed to convert an open office document to TracWiki format. I'm not fluent in TracWiki and I certainly didn't want to manually format 30-40 pages. Thankfully, there is an Open Office --> TracWiki conversion macro that will automate pretty much all of it. It helps if you used the Styles features of Open Office because the TracWiki will be cleaner, but no worries, it'll convert it anyways if you didn't. Here is how it works.

  1. Open the Open Office document you want to copy text from. We'll call it the Source document.
  2. Set the macro permissions to Medium or Low by going to the menu -> Tools -> Options -> Open Office.org -> Security -> Macro Security
  3. Download and open the Open Office Template Macro (ODT file)
  4. Remove all text in the newly opened ODT file.
  5. Highlight the text in your Source document, copy it, then paste it into the ODT file.
  6. Press the button with the correct format, the text will be converted to the correct wiki format and copied to the clipboard. (English is on page 2)
  7. Paste the text from the clipboard into the edit box in your wiki.

You now have converted Open Office document text to wiki format.

The conversion picked nearly everything. I manually cleaned up white space issues and also added in code blocks around all the source code sections, but I was very pleased with the amount of work handled by the OpenOfficeToTracWikiScript.

So you wanna learn Regex? - Part 6

Welcome to So You Wanna Learn Regex? Part 6. OK, I know I said part 5 would be the last part in the series, but I just had to work this one out and wanted to share. Remember, If you want more tutorials about regex, especially more advanced ones than the mickey mouse onces here, go bug Ben. He knows more about this than I ever will and I hear he has a blog...

In our last exercise, we looked at cleaning up some data scripts.

In this exercise, we are going to reformat a configuration file from .ini style to ColdSpring MapFactory style. Specifically, I'm integrating CFFormProtect into an application and I want the config to be managed in ColdSpring with the rest of my configurations. Sure, I could go flapping around with copy+paste, smashing keys, burning tendons, but that seems so Junior Programmerish, doesn't it?

Assume this set of declarations:

mouseMovement=1
usedKeyboard=1
timedFormSubmission=1
hiddenFormField=1
akismet=0
tooManyUrls=1
teststrings=1
projectHoneyPot=0
timedFormMinSeconds=5
timedFormMaxSeconds=3600
encryptionKey=JacobMuns0n
akismetAPIKey=
akismetBlogURL=
akismetFormNameField=
akismetFormEmailField=
akismetFormURLField=
akismetFormBodyField=
tooManyUrlsMaxUrls=6

What we want, is to turn:mouseMovement=1 into: <entry key="mouseMovement"><value>1</value></entry>

Note we've split a string delimted by an equals sign into some XML nodes.

So as you know, we define this pattern in the gobbledegook of regular expressions. When read one chunk at a time, these actually make sense. We'll go through the exercise, then look at why it worked.

In Eclipse, perform the following:

[More]

So you wanna learn Regex? - Part 5

Welcome to So You Wanna Learn Regex? Part 5. This is our last part of this series, mostly because I don't know a whole lot more than this. If you want more tutorials about regex, go bug Ben. He knows more about this than I ever will and I hear he has a blog...

In our last exercise, we looked at a simple way to add cfqueryparam to a bunch of queries. This was accomplished by making a pattern consisting of 3 groups then using one of the groups to populate a literal string.

In this exercise, we are going to clean up some data scripts. Let's suppose you are generating database scripts and your script generator puts the primary key in there. For whatever reason, you want to remove this.

Assume this set of declarations:

INSERT INTO `memberchallenge` VALUES ('11', '1', '19', null, '2008-11-14 14:07:59', '2008-11-14 14:07:59', '1', '2008-11-14 14:07:59', '0');
INSERT INTO `memberchallenge` VALUES ('12', '2', '19', null, '2008-11-14 15:40:51', '2008-11-14 15:40:51', '1', '2008-11-14 15:40:51', '0');
INSERT INTO `memberchallenge` VALUES ('14', '5', '19', null, '2008-11-14 20:14:26', '2008-11-14 20:14:26', '5', '2008-11-14 20:14:26', '0');
INSERT INTO `memberchallenge` VALUES ('15', '1', '20', null, '2008-11-23 18:19:31', '2008-11-23 18:19:31', '1', '2008-11-23 18:19:30', '0');
INSERT INTO `memberchallenge` VALUES ('16', '2', '20', null, '2008-11-23 18:20:09', '2008-11-23 18:20:09', '1', '2008-11-23 18:20:09', '0');
INSERT INTO `memberchallenge` VALUES ('17', '1', '21', null, '2008-11-25 20:32:44', '2008-11-25 20:32:44', '1', '2008-11-25 20:32:44', '0');
INSERT INTO `memberchallenge` VALUES ('18', '2', '21', null, '2008-11-25 20:33:01', '2008-11-25 20:33:01', '1', '2008-11-25 20:33:01', '0');

What we want, is to turn:INSERT INTO `memberchallenge` VALUES ('11', '1', '19', null, '2008-11-14 14:07:59', '2008-11-14 14:07:59', '1', '2008-11-14 14:07:59', '0'); into: INSERT INTO `memberchallenge` VALUES ('1', '19', null, '2008-11-14 14:07:59', '2008-11-14 14:07:59', '1', '2008-11-14 14:07:59', '0');

Note the first value in the VALUES statement has vanished.. this would be the primary key in our dataload script.

So as you know, we define this pattern in the gobbledegook of regular expressions. When read one chunk at a time, these actually make sense. We'll go through the exercise, then look at why it worked.

In Eclipse, perform the following:

[More]

So you wanna learn Regex? - Part 4

Welcome to So You Wanna Learn Regex? Part 4. In our last exercise, we looked at a simple way to clean a whole bunch of strings. This was accomplished by making a pattern, then removing everything according to that pattern. This time we are going to add cfqueryparam to a query. Say for example, that you have a junior developer who has been turned loose on her first application and she's done a good job, except for she didn't use cfqueryparam. You just found this out and the site has to go live in 10 minutes and you have 200 queries to fix. Do you:

  • a) Download the code to your laptop then pull the fire alarm to stall for time?
  • b) Start blasting your resume out on Monster.com?
  • c) Take a fistfull of aspirin, knowing your forearms will ache in the morning?

If you answered d) none of the above, please keep reading.

Assume this set of declarations:

UPDATE plant
	SET 	Symbol = '#form.symbol#',
			SynonymSymbol = '#form.SynonymSymbol#',
     		ScientificNameWithAuthor = '#form.ScientificNameWithAuthor#',
     		CommonName = '#CommonName#',
     		Family = '#Family#'
WHERE PlantCode = '#form.plantCode#'

What we want, is to turn: '#form.symbol#' into: <cfqueryparam value="#form.symbol#" cfsqltype="cf_sql_varchar">

So as you know, we define this pattern in the gobbledegook of regular expressions. When read one chunk at a time, these actually make sense. We'll go through the exercise, then look at why it worked.

In Eclipse, perform the following:

[More]

So you wanna learn Regex? - Part 3

Welcome to So You Wanna Learn Regex? Part 3.

In our last exercise, we looked at a simple way to wrap a function argument inside a new function. This was accomplished by making a pattern, defining a group and using a back reference. This time we will look at how to clean some strings.

Say for example, that you run a website called The Health Challenge and say for example, you wanted to use some of your fine tax dollar funded research to deliver motivating messages to the members.

Well, you could just happen across Small Steps and just use their content. After all, it is in the public domain. So you happily cut a LARGE chunk of these from the web site, but now you have to clean them.

Assume this set of declarations:

(# 11)  	Avoid food portions larger than your fist.
(# 12) 	Mow lawn with push mower.
(# 13) 	Increase the fiber in your diet.
(# 17) 	Join an exercise group.
(# 20) 	Do yard work.
(# 24) 	Skip seconds.
(# 25) 	Work around the house.
(# 26) 	Skip buffets.
(# 29) 	Take dog to the park.
(# 30) 	Ask your doctor about taking a multi-vitamin.
....( 700 more lines)

What we want, is to turn: (# 11) Avoid food portions larger than your fist. into: Avoid food portions larger than your fist. See, we like the content, we don't like the parentheticals nor the whitespace. Do we flex our forearms in preparation for a copy/paste session? Do we call KeyboardsAreUs.com and have 2 fresh keyboards airdropped, knowing we'll wear out some keys? (if you said yes, please delete your hard drive and apply at KFC.) Regular expressions are our friends. A Regex is a pattern matcher, and it can do stuff. We can see our code is repetitive and the pattern we want is: Get rid of the parentheticals and the extra whitespace. (Same stuff we'd do over and over via cut/paste/etc, isn't it? Though in a copy paste, you are talking about 5 keystrokes per line times 700 lines. That is 3500 keystrokes, unless you type like me, in which case it would be nearly 4 million.)

So as you know, we define this pattern in the gobbledegook of regular expressions. When read one chunk at a time, these actually make sense. We'll go through the exercise, then look at why it worked.

In Eclipse, perform the following:

[More]

So you wanna learn Regex? - Part 2

Welcome to So You Wanna Learn Regex? Part 2. In our last exercise, we looked at a simple way to add a new attribute to an HTML tag. This was accomplished by making a pattern, defining a group and using a back reference. This time we will look at a slightly more complicated use case.

Assume this set of declarations:

product.setColor(arguments.color);
product.setSize(arguments.size);
product.setCondition(arguments.condition);
product.setRating(arguments.rating);
product.setReliability(arguments.reliability);
product.setNeedsBatteries(arguments.needsBatteries);

What we want, is to turn: product.setColor(arguments.color); into: product.setColor( htmlEditFormat(arguments.color) );

Normally, this would be a forearm/wrist fatiguing flail on the keyboard, furiously cutting/pasting and generally flapping about. Not so with Regular Expressions. A Regex is a pattern matcher, and it can do stuff. We can see our code is repetitive and the pattern we want is: Take Everything Inside The Parenthesis, and Wrap It In A htmlEditFormat() Function. (Same stuff we'd do over and over via cut/paste/etc, isn't it?)

We can define this pattern in the gobbledegook defining a regular expression. When read one chunk at a time, these actually make sense. We'll go through the exercise, then look at why it worked.

In Eclipse, perform the following:

[More]