Pulling documents out of nested folders

As part of some consulting work, I had to extract documents from a large zip file. The archive contained thousands of documents, and each one sat at the bottom of its own directory tree, nested 6 to 11 directories deep. Doing this by hand would have been tedious and boring.

Also, the file names were full of punctuation, spaces and other annoying bits. I wanted to clean those up as well.

I spent a few minutes hacking together a ColdFusion script to do the work for me.

Here is the code:

<!--- Query the root directory and recurse through to get ALL directories, subdirectories and the contents thereof --->
<cfdirectory directory="C:\Xcel\CandidateDocuments" name="dirQuery" action="list" recurse="true">

<!--- Filter the resulting directory query down to files only by dropping every directory row. --->
<cfquery dbtype="query" name="FilesOnly">
   SELECT * FROM dirQuery
   WHERE TYPE<>'Dir'
</cfquery>

<!--- Whip through the filtered query --->
<cfoutput query="FilesOnly">
   <!--- Skip the zip files, since we don't need them, and anything without a 3 character file extension --->
   <cfif len(listlast( name, ".")) IS 3 AND NOT listlast( name, ".") IS 'zip'>

      <!---
         Pull out the name of the file minus the extension and remove all the annoying parts.
         Note: I am not using listFirst() here because the document name may have a period in some inappropriate spot.
         Subtracting 4 drops the period plus the 3 character extension checked above.
      --->

      <cfset filename = ReReplace(mid(name, 1, len(name) - 4 ), "[^[:alnum:]]", "", "all")>
      <!--- Hold on to the extension of the file --->
      <cfset ext = listlast(name, ".")>
      <!--- Copy the file to another directory --->
      <cffile action="copy" destination="C:\Xcel\cleandocuments\" source="#directory#\#name#">
      <!--- And rename it --->
      <cffile action="rename" source="C:\Xcel\cleandocuments\#name#" destination="C:\Xcel\cleandocuments\DataCurl-#filename#.#ext#">
   
   </cfif>
</cfoutput>
   <!--- Clap hands, and use the time saved to post nonsense on blog --->
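
If the name-cleaning step is needed in more than one place, it could be pulled into a small function. What follows is only a sketch under a few assumptions: the function name cleanFileName is made up for illustration, the script-style function syntax assumes ColdFusion 9 or later (or Lucee), and the rule of returning an empty string for names that should be skipped is my own convention, not something the script above does. It applies the same checks as the loop (3 character extension, no zip files) and also guards against names with no period at all, which would trip up the mid() call.

```cfml
<cfscript>
// Hypothetical helper, not part of the original script.
// Returns the cleaned base name plus the original extension,
// or an empty string when the file should be skipped.
function cleanFileName(required string name) {
    // Fewer than two period-delimited parts means no usable extension.
    if (listLen(arguments.name, ".") LT 2) {
        return "";
    }

    var ext = listLast(arguments.name, ".");

    // Same filter as the loop above: 3 character extensions only, no zips.
    if (len(ext) NEQ 3 OR ext EQ "zip") {
        return "";
    }

    // Everything before the final period, however many periods came earlier.
    var base = left(arguments.name, len(arguments.name) - len(ext) - 1);

    // Strip every character that is not a letter or a digit.
    base = reReplace(base, "[^[:alnum:]]", "", "all");

    // A name made up entirely of punctuation leaves nothing to keep.
    if (NOT len(base)) {
        return "";
    }

    return base & "." & ext;
}

writeOutput(cleanFileName("John Q. Public (resume).doc")); // JohnQPublicresume.doc
</cfscript>
```

One thing this sketch does not solve: two different source files can clean down to the same name, and the later copy would then collide with the earlier one. Checking fileExists() on the destination before copying would be one way to catch that.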



There, wasn't that fun?
