Pulling documents out of nested folders
As part of some consulting work, I had to extract documents from a large zip file. The zip file contained thousands of documents, and each one sat in its own directory structure, nested 6 to 11 directories deep. Doing this by hand would be tedious and boring.
Also, the individual files were named with punctuation, spaces and other annoying bits. I wanted to clean those up as well.
I spent a few minutes hacking together a ColdFusion script to do the work for me.
Here is the code:
<cfdirectory directory="C:\Xcel\CandidateDocuments" name="dirQuery" action="list" recurse="true">

<!--- Filter the resulting directory query down to files only (drop the directory rows). --->
<cfquery dbtype="query" name="FilesOnly">
    SELECT * FROM dirQuery
    WHERE TYPE <> 'Dir'
</cfquery>

<!--- Whip through the filtered query --->
<cfoutput query="FilesOnly">
    <!---
        Skip the zip files, since we don't need them, and also anything without a 3-character file extension.
        The listLen() check guards against names with no period at all, which would break the mid() call below.
    --->
    <cfif listLen(name, ".") GT 1 AND len(listLast(name, ".")) IS 3 AND listLast(name, ".") NEQ "zip">

        <!---
            Pull out the name of the file minus the extension and remove all the annoying parts.
            Note: I am not using listFirst() here because I don't know if the document has a period in some inappropriate spot.
        --->
        <cfset filename = ReReplace(mid(name, 1, len(name) - 4), "[^[:alnum:]]", "", "all")>
        <!--- Hold on to the extension of the file --->
        <cfset ext = listLast(name, ".")>
        <!--- Copy the file to another directory --->
        <cffile action="copy" destination="C:\Xcel\cleandocuments\" source="#directory#\#name#">
        <!--- And rename it --->
        <cffile action="rename" source="C:\Xcel\cleandocuments\#name#" destination="C:\Xcel\cleandocuments\DataCurl-#filename#.#ext#">

    </cfif>
</cfoutput>
<!--- Clap hands, and use the time saved to post nonsense on blog --->
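To see what the cleanup step does to a single name, here is a minimal standalone sketch of the same expression. The sample file name is made up for illustration; it is not one of the real documents.

```cfml
<!--- Hypothetical sample name, made up for illustration --->
<cfset name = "Resume - John O'Brien (final).doc">
<!--- Same cleanup as in the loop: drop the 4-character ".doc" suffix, then strip every non-alphanumeric character --->
<cfset filename = ReReplace(mid(name, 1, len(name) - 4), "[^[:alnum:]]", "", "all")>
<!--- Hold on to the extension --->
<cfset ext = listLast(name, ".")>
<!--- Outputs: DataCurl-ResumeJohnOBrienfinal.doc --->
<cfoutput>DataCurl-#filename#.#ext#</cfoutput>
```

One thing to keep in mind: because everything except letters and digits is stripped, two different originals (say, a-b.doc and ab.doc) end up with the same cleaned name, so it is probably worth checking for clashes before trusting the output directory.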
There, wasn't that fun?