Pulling documents out of nested folders

As part of some consulting work, I had to extract documents from a large zip file. The zip file contained thousands of documents, and each one sat in its own directory structure, nested 6 to 11 directories deep. Doing this by hand would have been tedious and boring.

Also, the individual files were named with punctuation, spaces and other annoying bits. I wanted to clean those up as well.

I spent a few minutes hacking together a ColdFusion script to do the work for me.

Here is the code:

<!--- Query the root directory and recurse through it to get ALL directories, subdirectories and their contents --->
<cfdirectory directory="C:\Xcel\CandidateDocuments" name="dirQuery" action="list" recurse="true">

<!--- Filter the resulting directory query down to files only (drop the directory rows). --->
<cfquery dbtype="query" name="FilesOnly">
    SELECT * FROM dirQuery
    WHERE TYPE<>'Dir'
</cfquery>

<!--- Whip through the filtered query --->
<cfoutput query="FilesOnly">
    <!--- Skip the zip files, since we don't need them, and also anything without a 3 character file extension.
          The listlen() check skips names with no period at all, which would otherwise break the mid() call below. --->
    <cfif listlen( name, ".") GT 1 AND len(listlast( name, ".")) IS 3 AND NOT listlast( name, ".") IS 'zip'>

        <!---
            Pull out the name of the file minus the extension and remove all the annoying parts.
            Note, I am not using listFirst() here because I don't know if the document has a period in some inappropriate spot.
            Chopping 4 characters (the period plus the extension) is safe because the cfif above guarantees a 3 character extension.
        --->
        <cfset filename = ReReplace(mid(name, 1, len(name) - 4 ), "[^[:alnum:]]", "", "all")>
        <!--- Hold on to the extension of the file --->
        <cfset ext = listlast(name, ".")>
        <!--- Copy the file to another directory --->
        <cffile action="copy" destination="C:\Xcel\cleandocuments\" source="#directory#\#name#">
        <!--- And rename it --->
        <cffile action="rename" source="C:\Xcel\cleandocuments\#name#" destination="C:\Xcel\cleandocuments\DataCurl-#filename#.#ext#">

    </cfif>
</cfoutput>
    <!--- Clap hands, and use the time saved to post nonsense on blog --->
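
One caveat on the len(name) - 4 trick: it only works because the cfif restricts the loop to three-character extensions. If you need to keep files like .docx or .jpeg, here is a minimal sketch of how the name cleanup might be done for an extension of any length. The sample file name is made up for illustration and is not from the real data set. The sketch assumes the name contains a period and does not end with one.

```cfml
<!--- Hypothetical sample name, for illustration only --->
<cfset name = "Resume - J. Smith (final).docx">

<!--- Only proceed if there is something on both sides of a period --->
<cfif listLen(name, ".") GT 1 AND right(name, 1) NEQ ".">

    <!--- Everything after the last period is the extension, whatever its length --->
    <cfset ext = listLast(name, ".")>

    <!--- The base name is everything before that last period --->
    <cfset base = left(name, len(name) - len(ext) - 1)>

    <!--- Strip anything that is not a letter or a digit, same regex as above --->
    <cfset filename = reReplace(base, "[^[:alnum:]]", "", "all")>

    <!--- Outputs: DataCurl-ResumeJSmithfinal.docx --->
    <cfoutput>DataCurl-#filename#.#ext#</cfoutput>

</cfif>
```

Measuring the extension with len(ext) instead of hard-coding 4 keeps the "period in some inappropriate spot" protection, since only the last period is treated as the extension separator.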



There, wasn't that fun?
