Chapter 27: Manipulating Files with VBA
IN THIS CHAPTER
• Getting a basic overview of VBA text file manipulation features
• Performing common file operations
• Opening a text file
• Displaying extended file information, such as details for media files
• Reading and writing a text file with VBA
• Exporting a range to HTML and XML format
• Zipping and unzipping files
• Using ActiveX Data Objects to import data
Performing Common File Operations
Many applications that you develop for Excel require working with external files. For example, you might need to get a listing of files in a directory, delete files, rename files, and so on. Excel, of course, can import and export several types of text files. In many cases, however, Excel's built-in text file handling isn't sufficient. For example, you might want to paste a list of filenames into a range, or export a range of cells to a simple HyperText Markup Language (HTML) file.
In this chapter, I describe how to use Visual Basic for Applications (VBA) to perform common (and not so common) file operations and work directly with text files.
Excel provides two ways to perform common file operations:
• Use traditional VBA statements and functions. This method works for all versions of Excel.
• Use the FileSystemObject object, which uses the Microsoft Scripting Library. This method works for Excel 2000 and later.
Some earlier versions of Excel also supported the use of the FileSearch object. That feature was removed, beginning with Excel 2007. If you execute an old macro that uses the FileSearch object, the macro will fail.
In the sections that follow, I discuss these two methods and present examples.
Using VBA file-related statements
The VBA statements that you can use to work with files are summarized in Table 27-1. Most of these statements are straightforward, and all are described in the Help system.
Table 27-1: VBA File-Related Statements
Command |
What It Does |
ChDir |
Changes the current directory. |
ChDrive |
Changes the current drive. |
Dir |
Returns a filename or directory that matches a specified pattern or file attribute. |
FileCopy |
Copies a file. |
FileDateTime |
Returns the date and time when a file was last modified. |
FileLen |
Returns the size of a file, in bytes. |
GetAttr |
Returns a value that represents an attribute of a file. |
Kill |
Deletes a file. |
MkDir |
Creates a new directory. |
Name |
Renames a file or directory. |
RmDir |
Removes an empty directory. |
SetAttr |
Changes an attribute for a file. |
The remainder of this section consists of examples that demonstrate some of the file manipulation commands.
A VBA function to determine whether a file exists
The following function returns True if a particular file exists and False if it doesn't exist. If the Dir function returns an empty string, the file couldn't be found, so the function returns False.
Function FileExists(fname) As Boolean
FileExists = Dir(fname) <> “”
End Function
The argument for the FileExists function consists of a full path and filename. The function can be either used in a worksheet or called from a VBA procedure. Here's an example:
MyFile = “c:udgeting2011 budget notes.docx”
Msgbox FileExists(MyFile)
A VBA function to determine whether a path exists
The following function returns True if a specified path exists and False otherwise:
Function PathExists(pname) As Boolean
‘ Returns TRUE if the path exists
On Error Resume Next
PathExists = (GetAttr(pname) And vbDirectory) = vbDirectory
End Function
The pname argument is a string that contains a directory (without a filename). The trailing backslash in the pathname is optional. Here's an example of calling the function:
MyFolder = “c:usersjohndesktopdownloads”
MsgBox PathExists(MyFolder)
The FileExists and PathExists functions are available on the CD-ROM. The filename is file functions.xlsm.
A VBA procedure to display a list of files in a directory
The following procedure displays (in the active worksheet) a list of files contained in a particular directory, along with the file size and date:
Sub ListFiles()
Dim Directory As String
Dim r As Long
Dim f As String
Dim FileSize As Double
Directory = “f:excelfilesudgeting”
r = 1
‘ Insert headers
Cells(r, 1) = “FileName”
Cells(r, 2) = “Size”
Cells(r, 3) = “Date/Time”
Range(“A1:C1”).Font.Bold = True
‘ Get first file
f = Dir(Directory, vbReadOnly + vbHidden + vbSystem)
Do While f <> “”
r = r + 1
Cells(r, 1) = f
‘Adjust for filesize > 2 gigabytes
FileSize = FileLen(Directory & f)
If FileSize < 0 Then FileSize = FileSize + 4294967296#
Cells(r, 2) = FileSize
Cells(r, 3) = FileDateTime(Directory & f)
‘ Get next file
f = Dir()
Loop
End Sub
Figure 27-1 shows an example of the output of the ListFiles subroutine.
VBA's FileLen function uses the Long data type. Consequently, it will return an incorrect size (a negative number) for files larger than about 2 gigabytes. The code checks for a negative value from the FileLen function, and makes an adjustment if necessary.
FIGURE 27-1: Output from the ListFiles procedure.
Notice that the procedure uses the Dir function twice. The first time (used with an argument), it retrieves the first filename found. Subsequent calls (without an argument) retrieve additional filenames. When no more files are found, the Dir function returns an empty string.
The companion CD-ROM contains a version of this procedure which allows you to select a directory from a dialog box. The filename is create file list.xlsm.
The Dir function also accepts wildcard file specifications in its first argument. To get a list of Excel files, for example, you could use a statement such as this:
f = Dir(Directory & “*.xl??”, vbReadOnly + vbHidden + vbSystem)
This statement retrieves the name of the first *.xl?? file in the specified directory. The wildcard specification returns a four-character extension that begins with XL. For example, the extension could be .xlsx, .xltx, or .xlam. The second argument for the Dir function lets you specify the attributes of the files (in terms of built-in constants). In this example, the Dir function retrieves filenames that have no attributes, read-only files, hidden files, and system files.
Table 27-2 lists the built-in constants for the Dir function.
Table 27-2: File Attribute Constants for the Dir Function
Constant |
Value |
Description |
vbNormal |
0 |
Files with no attributes. This is the default setting and is always in effect. |
vbReadOnly |
1 |
Read-only files. |
vbHidden |
2 |
Hidden files. |
vbSystem |
4 |
System files. |
vbVolume |
8 |
Volume label. If any other attribute is specified, this attribute is ignored. |
vbDirectory |
16 |
Directories. This attribute doesn't work. Calling the Dir function with the vbDirectory attribute doesn't continually return subdirectories. |
If you use the Dir function to loop through files and call another procedure to process the files, make sure that the other procedure doesn't use the Dir function. Only one “set” of Dir calls can be active at any time.
A recursive VBA procedure to display a list of files in nested directories
The example in this section creates a list of files in a specified directory, including all of its subdirectories. This procedure is unusual because it calls itself — a concept known as recursion.
Public Sub RecursiveDir(ByVal CurrDir As String, Optional ByVal Level As Long)
Dim Dirs() As String
Dim NumDirs As Long
Dim FileName As String
Dim PathAndName As String
Dim i As Long
Dim Filesize As Double
‘ Make sure path ends in backslash
If Right(CurrDir, 1) <> “” Then CurrDir = CurrDir & “”
‘ Put column headings on active sheet
Cells(1, 1) = “Path”
Cells(1, 2) = “Filename”
Cells(1, 3) = “Size”
Cells(1, 4) = “Date/Time”
Range(“A1:D1”).Font.Bold = True
‘ Get files
FileName = Dir(CurrDir & “*.*”, vbDirectory)
Do While Len(FileName) <> 0
If Left(FileName, 1) <> “.” Then ‘Current dir
PathAndName = CurrDir & FileName
If (GetAttr(PathAndName) And vbDirectory) = vbDirectory Then
‘store found directories
ReDim Preserve Dirs(0 To NumDirs) As String
Dirs(NumDirs) = PathAndName
NumDirs = NumDirs + 1
Else
‘Write the path and file to the sheet
Cells(WorksheetFunction.CountA(Range(“A:A”)) + 1, 1) = _
CurrDir
Cells(WorksheetFunction.CountA(Range(“B:B”)) + 1, 2) = _
FileName
‘adjust for filesize > 2 gigabytes
Filesize = FileLen(PathAndName)
If Filesize < 0 Then Filesize = Filesize + 4294967296#
Cells(WorksheetFunction.CountA(Range(“C:C”)) + 1, 3) = Filesize
Cells(WorksheetFunction.CountA(Range(“D:D”)) + 1, 4) = _
FileDateTime(PathAndName)
End If
End If
FileName = Dir()
Loop
‘ Process the found directories, recursively
For i = 0 To NumDirs - 1
RecursiveDir Dirs(i), Level + 2
Next i
End Sub
The procedure takes one argument, CurrDir, which is the directory being examined. Information for each file is displayed in the active worksheet. As the procedure loops through the files, it stores the subdirectory names in an array named Dirs. When no more files are found, the procedure calls itself using an entry in the Dirs array for its argument. When all of the directories in the Dirs array have been processed, the procedure ends.
Because the RecursiveDir procedure uses an argument, it must be executed from another procedure by using a statement like this:
Call RecursiveDir(“c:directory”)
The companion CD-ROM contains a version of this procedure that allows you to select a directory from a dialog box. The filename is recursive file list.xlsm.
Using the FileSystemObject object
The FileSystemObject object is a member of the Windows Scripting Host and provides access to a computer's file system. This object is often used in script-oriented Web pages (for example, VBScript and JavaScript) and can be used with Excel 2000 and later versions.
The Windows Scripting Host is sometimes used as a way to spread computer viruses and other malware. Consequently, the Windows Scripting Host may be disabled on some systems. Therefore, use caution if you're designing an application that will be used on many different systems.
The name FileSystemObject is a bit misleading because it actually includes a number of objects, each designed for a specific purpose:
• Drive: Represents a drive or a collection of drives.
• File: Represents a file or a collection of files.
• Folder: Represents a folder or a collection of folders.
• TextStream: Represents a stream of text that is read from, written to, or appended to a text file.
The first step in using the FileSystemObject object is to create an instance of the object. You can do this task in two ways: early binding and late binding.
The late binding method uses two statements, like this:
Dim FileSys As Object
Set FileSys = CreateObject(“Scripting.FileSystemObject”)
Note that the FileSys object variable is declared as a generic Object rather than as an actual object type. The object type is resolved at runtime.
The early binding method of creating the object requires that you set up a reference to the Windows Scripting Host Object Model. You do this by using Tools⇒References in the VBE (see Figure 27-2). After you've established the reference, create the object by using statements like these:
Dim FileSys As FileSystemObject
Set FileSys = CreateObject(“Scripting.FileSystemObject”)
Using the early binding method enables you to take advantage of the VBE's Auto List Members feature to help you identify properties and methods as you type. In addition, you can use the Object Browser (by pressing F2) to learn more about the object model.
FIGURE 27-2: Creating a reference to the Windows Script Host Object Model.
The examples that follow demonstrate various tasks using the FileSystemObject object.
Using FileSystemObject to determine whether a file exists
The Function procedure that follows accepts one argument (the path and filename) and returns True if the file exists:
Function FileExists3(fname) As Boolean
Dim FileSys As Object ‘FileSystemObject
Set FileSys = CreateObject(“Scripting.FileSystemObject”)
FileExists3 = FileSys.FileExists(fname)
End Function
The function creates a new FileSystemObject object named FileSys and then accesses the FileExists property for that object.
Using FileSystemObject to determine whether a path exists
The Function procedure that follows accepts one argument (the path) and returns True if the path exists:
Function PathExists2(path) As Boolean
Dim FileSys As Object ‘FileSystemObject
Set FileSys = CreateObject(“Scripting.FileSystemObject”)
PathExists2 = FileSys.FolderExists(path)
End Function
Using FileSystemObject to list information about all available disk drives
The example in this section uses FileSystemObject to retrieve and display information about all disk drives. The procedure loops through the Drives collection and writes various property values to a worksheet.
Figure 27-3 shows the results when the procedure is executed on a system with 12 drives. The data shown is the drive letter, whether the drive is “ready,” the drive type, the volume name, the total size, and the available space.
FIGURE 27-3: Output from the ShowDriveInfo procedure.
This workbook, named show drive info.xlsm, is available on the companion CD-ROM.
Sub ShowDriveInfo()
Dim FileSys As FileSystemObject
Dim Drv As Drive
Dim Row As Long
Set FileSys = CreateObject(“Scripting.FileSystemObject”)
Cells.ClearContents
Row = 1
‘ Column headers
Range(“A1:F1”) = Array(“Drive”, “Ready”, “Type”, “Vol. Name”, _
“Size”, “Available”)
On Error Resume Next
‘ Loop through the drives
For Each Drv In FileSys.Drives
Row = Row + 1
Cells(Row, 1) = Drv.DriveLetter
Cells(Row, 2) = Drv.IsReady
Select Case Drv.DriveType
Case 0: Cells(Row, 3) = “Unknown”
Case 1: Cells(Row, 3) = “Removable”
Case 2: Cells(Row, 3) = “Fixed”
Case 3: Cells(Row, 3) = “Network”
Case 4: Cells(Row, 3) = “CD-ROM”
Case 5: Cells(Row, 3) = “RAM Disk”
End Select
Cells(Row, 4) = Drv.VolumeName
Cells(Row, 5) = Drv.TotalSize
Cells(Row, 6) = Drv.AvailableSpace
Next Drv
‘Make a table
ActiveSheet.ListObjects.Add xlSrcRange, _
Range(“A1”).CurrentRegion, , xlYes
End Sub
Chapter 11 describes another method of getting drive information by using Windows API functions.
Displaying Extended File Information
The example in this section displays extended file properties for all files in a specified directory. The information that's available depends on the file type. For example, image files have properties such as Camera Model and Dimensions; audio files have properties such as Artist, Title, Duration, and so on.
The actual properties available depends on the version of Windows. Windows Vista supports 267 properties and Windows 7 supports even more. Here's a procedure that creates a list of file properties in the active worksheet:
Sub ListFileProperties()
Dim i As Long
Dim objShell As Object ‘IShellDispatch4
Dim objFolder As Object ‘Folder3
‘ Create the object
Set objShell = CreateObject(“Shell.Application”)
‘ Specify any folder
Set objFolder = objShell.Namespace(“C:”)
‘ List the properties
For i = 0 To 500
Cells(i + 1, 1) = _
objFolder.GetDetailsOf(objFolder.Items, i)
Next i
End Sub
Unfortunately, the property values aren't consistent across Windows versions. For example, in Windows 2000, the Title property is stored as number 11. In Windows XP, the Title property is stored as number 10. In Windows Vista, the Title property is number 21.
The FileInfo procedure, which uses the Windows Shell.Application object, follows. This procedure prompts for a directory using the GetDirectory function (not shown here) and then lists the first 41 properties of each file in the directory.
Sub FileInfo()
Dim c As Long, r As Long, i As Long
Dim FileName As Object ‘FolderItem2
Dim objShell As Object ‘IShellDispatch4
Dim objFolder As Object ‘Folder3
‘ Create the object
Set objShell = CreateObject(“Shell.Application”)
‘ Prompt for the folder
Set objFolder = objShell.Namespace(GetDirectory)
‘ Insert headers on active sheet
Worksheets.Add
c = 0
For i = 0 To 40
c = c + 1
Cells(1, c) = objFolder.GetDetailsOf(objFolder.Items, i)
Next i
‘ Loop through the files
r = 1
For Each FileName In objFolder.Items
c = 0
r = r + 1
For i = 0 To 40
c = c + 1
Cells(r, c) = objFolder.GetDetailsOf(FileName, i)
Next i
Next FileName
‘ Make it a table
ActiveSheet.ListObjects.Add xlSrcRange, _
Range(“A1”).CurrentRegion
End Sub
Figure 27-4 shows part of the output of this procedure, for a directory that contains MP3 audio files.
FIGURE 27-4: A table of information about the files in a directory.
This example uses late binding to create a Shell.Application object, so the objects are declared generically. To use early binding, use the VBE Tools⇒References command and create a reference to Microsoft Shell Controls and Automation.
This example, named file information.xlsm, is available on the companion CD-ROM.
Working with Text Files
VBA contains a number of statements that allow low-level manipulation of files. These Input/Output (I/O) statements give you much more control over files than Excel's normal text file import and export options.
You can access a file in any of three ways:
• Sequential access: By far the most common method. This type allows reading and writing individual characters or entire lines of data.
• Random access: Used only if you're programming a database application — something that's not really appropriate for VBA.
• Binary access: Used to read or write to any byte position in a file, such as storing or displaying a bitmap image. This access method is rarely used in VBA.
Because random and binary access files are rarely used with VBA, this chapter focuses on sequential access files, which are accessed sequentially. In other words, your code starts reading from the beginning of the file and reads each line sequentially. For output, your code writes data to the end of the file.
The method of reading and writing text files discussed in this book is the traditional data-channel approach. Another option is to use the object approach. The FileSystemObject object contains a TextStream object that can be used to read and write text files. The FileSystemObject object is part of the Windows Scripting Host. This scripting service is disabled on some systems because of the malware potential.
Opening a text file
VBA's Open statement (not to be confused with the Open method of the Workbooks object) opens a file for reading or writing. Before you can read from or write to a file, you must open it.
The Open statement is quite versatile and has a rather complex syntax:
Open pathname For mode [Access access] [lock] _
As [#]filenumber [Len=reclength]
• pathname: (Required) The pathname part of the Open statement is quite straightforward. It simply contains the name and path (optional) of the file to be opened.
• mode: (Required) The file mode must be one of the following:
• Append: A sequential access mode that either allows the file to be read or allows data to be appended to the end of the file.
• Input: A sequential access mode that allows the file to be read but not written to.
• Output: A sequential access mode that allows the file to be read or written to. In this mode, a new file is always created. (An existing file with the same name is deleted.)
• Binary: A random access mode that allows data to be read or written to on a byte-by-byte basis.
• Random: A random access mode that allows data to be read or written in units determined by the reclength argument of the Open statement.
• access: (Optional) The access argument determines what can be done with the file. It can be Read, Write, or Read Write.
• lock: (Optional) The lock argument is useful for multiuser situations. The options are Shared, Lock Read, Lock Write, and Lock Read Write.
• filenumber: (Required) A file number ranging from 1 to 511. You can use the FreeFile function to get the next available file number. (Read about FreeFile in the upcoming section, “Getting a file number.”)
• reclength: (Optional) The record length (for random access files) or the buffer size (for sequential access files).
Reading a text file
The basic procedure for reading a text file with VBA consists of the following steps:
1. Open the file by using the Open statement.
2. Specify the position in the file by using the Seek function (optional).
3. Read data from the file (by using the Input, Input #, or Line Input # statements).
4. Close the file by using the Close statement.
Writing a text file
The basic procedure for writing a text file is as follows:
1. Open or create the file by using the Open statement.
2. Specify the position in the file by using the Seek function (optional).
3. Write data to the file by using the Write # or Print # statement.
4. Close the file by using the Close statement.
Getting a file number
Most VBA programmers simply designate a file number in their Open statement. For example:
Open “myfile.txt” For Input As #1
Then you can refer to the file in subsequent statements as #1.
If a second file is opened while the first is still open, you'd designate the second file as #2:
Open “another.txt” For Input As #2
Another approach is to use VBA's FreeFile function to get a file handle. Then you can refer to the file by using a variable. Here's an example:
FileHandle = FreeFile
Open “myfile.txt” For Input As FileHandle
Determining or setting the file position
For sequential file access, you rarely need to know the current location in the file. If for some reason you need to know this information, you can use the Seek function.
Statements for reading and writing
VBA provides several statements to read and write data to a file.
Three statements are used for reading data from a sequential access file:
• Input: Reads a specified number of characters from a file.
• Input #: Reads data as a series of variables, with variables separated by a comma.
• Line Input #: Reads a complete line of data (delineated by a carriage return and/or linefeed character).
Two statements are used for writing data to a sequential access file:
• Write #: Writes a series of values, with each value separated by a comma and enclosed in quotes. If you end the statement with a semicolon, a carriage return/linefeed sequence is not inserted after each value. Data written with Write # is usually read from a file with an Input # statement.
• Print #: Writes a series of values, with each value separated by a Tab character. If you end the statement with a semicolon, a carriage return/linefeed sequence isn't inserted after each value. Data written with Print # is usually read from a file with a Line Input # or an Input statement.
Text File Manipulation Examples
This section contains a number of examples that demonstrate various techniques that manipulate text files.
Importing data in a text file
The code in the following example reads a text file and then places each line of data in a single cell (beginning with the active cell):
Sub ImportData()
Open “c:data extfile.txt” For Input As #1
r = 0
Do Until EOF(1)
Line Input #1, data
ActiveCell.Offset(r, 0) = data
r = r + 1
Loop
Close #1
End Sub
In most cases, this procedure won't be very useful because each line of data is simply dumped into a single cell. It would be easier to just open the text file directly by using File⇒Open.
Exporting a range to a text file
The example in this section writes the data in a selected worksheet range to a CSV text file. Excel, of course, can export data to a CSV file, but it exports the entire worksheet. This macro works with a specified range of cells.
Sub ExportRange()
Dim Filename As String
Dim NumRows As Long, NumCols As Integer
Dim r As Long, c As Integer
Dim Data
Dim ExpRng As Range
Set ExpRng = Selection
NumCols = ExpRng.Columns.Count
NumRows = ExpRng.Rows.Count
Filename = Application.DefaultFilePath & “ extfile.csv”
Open Filename For Output As #1
For r = 1 To NumRows
For c = 1 To NumCols
Data = ExpRng.Cells(r, c).Value
If IsNumeric(Data) Then Data = Val(Data)
If IsEmpty(ExpRng.Cells(r, c)) Then Data = “”
If c <> NumCols Then
Write #1, Data;
Else
Write #1, Data
End If
Next c
Next r
Close #1
End Sub
Notice that the procedure uses two Write # statements. The first statement ends with a semicolon, so a carriage return/linefeed sequence isn't written. For the last cell in a row, however, the second Write # statement doesn't use a semicolon, which causes the next output to appear on a new line.
I use a variable named Data to store the contents of each cell. If the cell is numeric, the variable is converted to a value. This step ensures that numeric data won't be stored with quotation marks. If a cell is empty, its Value property returns 0. Therefore, the code also checks for a blank cell (by using the IsEmpty function) and substitutes an empty string instead of a zero.
Figure 27-5 shows the contents of the resulting file, viewed in Windows Notepad.
This example and the example in the next section are available on the companion CD-ROM. The filename is export and import csv.xlsm.
FIGURE 27-5: This text file was generated by VBA.
Importing a text file to a range
The example in this section reads the CSV file created in the previous example and then stores the values beginning at the active cell in the active worksheet. The code reads each character and essentially parses the line of data, ignoring quote characters and looking for commas to delineate the columns.
Sub ImportRange()
Dim ImpRng As Range
Dim Filename As String
Dim r As Long, c As Integer
Dim txt As String, Char As String * 1
Dim Data
Dim i As Integer
Set ImpRng = ActiveCell
On Error Resume Next
Filename = Application.DefaultFilePath & “ extfile.csv”
Open Filename For Input As #1
If Err <> 0 Then
MsgBox “Not found: “ & Filename, vbCritical, “ERROR”
Exit Sub
End If
r = 0
c = 0
txt = “”
Application.ScreenUpdating = False
Do Until EOF(1)
Line Input #1, Data
For i = 1 To Len(Data)
Char = Mid(Data, i, 1)
If Char = “,” Then ‘comma
ActiveCell.Offset(r, c) = txt
c = c + 1
txt = “”
ElseIf i = Len(Data) Then ‘end of line
If Char <> Chr(34) Then txt = txt & Char
ActiveCell.Offset(r, c) = txt
txt = “”
ElseIf Char <> Chr(34) Then
txt = txt & Char
End If
Next i
c = 0
r = r + 1
Loop
Close #1
Application.ScreenUpdating = True
End Sub
The preceding procedure works with most data, but it has a flaw: It doesn't handle data that contains a comma or a quote character. But commas resulting from formatting are handled correctly. (They're ignored.) In addition, an imported date will be surrounded by number signs: for example, #2007-05-12#.
Logging Excel usage
The example in this section writes data to a text file every time Excel is opened and closed. In order for this example to work reliably, the procedure must be located in a workbook that's opened every time you start Excel. Storing the macro in your Personal Macro Workbook is an excellent choice.
The following procedure, stored in the code module for the ThisWorkbook object, is executed when the file is opened:
Private Sub Workbook_Open()
Open Application.DefaultFilePath & “excelusage.txt” For Append As #1
Print #1, “Started “ & Now
Close #1
End Sub
The procedure appends a new line to a file named excelusage.txt. The new line contains the current date and time and might look something like this:
Started 11/16/2010 9:27:43 PM
The following procedure is executed before the workbook is closed. It appends a new line that contains the word Stopped along with the current date and time.
Private Sub Workbook_BeforeClose(Cancel As Boolean)
Open Application.DefaultFilePath & “excelusage.txt” _
For Append As #1
Print #1, “Stopped “ & Now
Close #1
End Sub
A workbook that contains these procedures is on the companion CD-ROM. The file is named excel usage log.xlsm.
Refer to Chapter 19 for more information about event-handler procedures such as Workbook_Open and Workbook_BeforeClose.
Filtering a text file
The example in this section demonstrates how to work with two text files at once. The FilterFile procedure that follows reads a text file (infile.txt) and copies only the rows that contain a specific text string (“January”) to a second text file (output.txt).
Sub FilterFile()
Open ThisWorkbook.Path & “infile.txt” For Input As #1
Open Application.DefaultFilePath & “output.txt” For Output As #2
TextToFind = “January”
Do Until EOF(1)
Line Input #1, data
If InStr(1, data, TextToFind) Then
Print #2, data
End If
Loop
Close ‘Close all files
End Sub
This example, named filter text file.xlsm, is available on the companion CD-ROM.
Exporting a range to HTML format
The example in this section demonstrates how to export a range of cells to an HTML file. An HTML file, as you might know, is simply a text file that contains special formatting tags that describe how the information will be presented in a Web browser.
Why not use Excel's File⇒Save As command and choose the Web Page file type? The procedure listed here has a distinct advantage: It doesn't produce bloated HTML code. For example, I used the ExportToHTML procedure to export a range of 70 cells. The file size was 2.6KB. Then I used Excel's File⇒Save as Web Page command to export the sheet. The result was 15.8KB — more than six times larger.
But, on the other hand, the ExportToHTML procedure doesn't maintain all the cell formatting. In fact, the only formatting information that it produces is bold, italic, and horizontal alignment. However, the procedure is good enough for many situations, and it serves as the basis for additional enhancements.
Sub ExportToHTML()
Dim Filename As Variant
Dim TDOpenTag As String, TDCloseTag As String
Dim CellContents As String
Dim Rng As Range
Dim r As Long, c As Integer
‘ Use the selected range of cells
Set Rng = Application.Intersect(ActiveSheet.UsedRange, Selection)
If Rng Is Nothing Then
MsgBox “Nothing to export.”, vbCritical
Exit Sub
End If
‘ Get a file name
Filename = Application.GetSaveAsFilename( _
InitialFileName:=”myrange.htm”, _
fileFilter:=”HTML Files(*.htm), *.htm”)
If Filename = False Then Exit Sub
‘ Open the text file
Open Filename For Output As #1
‘ Write the tags
Print #1, “<HTML>”
Print #1, “<TABLE BORDER=0 CELLPADDING=3>”
‘ Loop through the cells
For r = 1 To Rng.Rows.Count
Print #1, “<TR>”
For c = 1 To Rng.Columns.Count
Select Case Rng.Cells(r, c).HorizontalAlignment
Case xlHAlignLeft
TDOpenTag = “<TD ALIGN=LEFT>”
Case xlHAlignCenter
TDOpenTag = “<TD ALIGN=CENTER>”
Case xlHAlignGeneral
If IsNumeric(Rng.Cells(r, c)) Then
TDOpenTag = “<TD ALIGN=RIGHT>”
Else
TDOpenTag = “<TD ALIGN=LEFT>”
End If
Case xlHAlignRight
TDOpenTag = “<TD ALIGN=RIGHT>”
End Select
TDCloseTag = “</TD>”
If Rng.Cells(r, c).Font.Bold Then
TDOpenTag = TDOpenTag & “<B>”
TDCloseTag = “</B>” & TDCloseTag
End If
If Rng.Cells(r, c).Font.Italic Then
TDOpenTag = TDOpenTag & “<I>”
TDCloseTag = “</I>” & TDCloseTag
End If
CellContents = Rng.Cells(r, c).Text
Print #1, TDOpenTag & CellContents & TDCloseTag
Next c
Print #1, “</TR>”
Next r
‘ Close the table
Print #1, “</TABLE>”
Print #1, “</HTML>”
‘ Close the file
Close #1
‘ Tell the user
MsgBox Rng.Count & “ cells were exported to “ & Filename
End Sub
The procedure starts by determining the range to export. This is based on the intersection of the selected range and the used area of the worksheet. This ensures that entire rows or columns aren't processed. Next, the user is prompted for a filename, and the text file is opened. The bulk of the work is done within two For-Next loops. The code generates the appropriate HTML tags and writes the information to the text file. The only complicated part is determining the cell's horizontal alignment. (Excel doesn't report this information directly.) Finally, the file is closed, and the user sees a summary message.
Figure 27-6 shows a range in a worksheet, and Figure 27-7 shows how it looks in a Web browser after being converted to HTML.
This example, named export to HTML.xlsm, is available on the companion CD-ROM.
FIGURE 27-6: A worksheet range, ready to be converted to HTML.
FIGURE 27-7: The worksheet data after being converted to HTML.
Exporting a range to an XML file
This example exports an Excel range to a simple XML data file. As you might know, an XML file uses tags to wrap each data item. The procedure in this section uses the labels in the first row as the XML tags. Figure 27-8 shows the range in a worksheet table, and Figure 27-9 shows the XML file displayed in a Web browser.
FIGURE 27-8: The data in this range will be converted to XML.
FIGURE 27-9: The worksheet data after being converted to XML.
Although Excel 2003 introduced improved support for XML files, even Excel 2010 can't create an XML file from an arbitrary range of data unless you have a map file (schema) for the data.
The ExportToXML procedure follows. You'll notice that it has a quite a bit in common with the ExportToHTML procedure in the previous section.
Sub ExportToXML()
Dim Filename As Variant
Dim Rng As Range
Dim r As Long, c As Long
‘ Set the range
Set Rng = Range(“Table1[#All]”)
‘ Get a file name
Filename = Application.GetSaveAsFilename( _
InitialFileName:=”myrange.xml”, _
fileFilter:=”XML Files(*.xml), *.xml”)
If Filename = False Then Exit Sub
‘ Open the text file
Open Filename For Output As #1
‘ Write the <xml> tags
Print #1, “<?xml version=””1.0”” encoding=””UTF-8”” standalone=””yes””?>”
Print #1, “<EmployeeList xmlns:xsi=””http://www.w3.org/2001/XMLSchema-instance””>”
‘ Loop through the cells
For r = 2 To Rng.Rows.Count
Print #1, “<Employee>”
For c = 1 To Rng.Columns.Count
Print #1, “<” & Rng.Cells(1, c) & “>”;
If IsDate(Rng.Cells(r, c)) Then
Print #1, Format(Rng.Cells(r, c), “yyyy-mm-dd”);
Else
Print #1, Rng.Cells(r, c).Text;
End If
Print #1, “</” & Rng.Cells(1, c) & “>”
Next c
Print #1, “</Employee>”
Next r
‘ Close the table
Print #1, “</EmployeeList>”
‘ Close the file
Close #1
‘ Tell the user
MsgBox Rng.Rows.Count - 1 & “ records were exported to “ & Filename
End Sub
This example, named export to XML.xlsm, is available on the companion CD-ROM.
You can open the exported XML file with Excel. When opening an XML file, you'll see the dialog box shown in Figure 27-10. If you choose the As an XML Table option, the file will be displayed as a table. Keep in mind that any formulas in the original table aren't preserved.
FIGURE 27-10: When opening an XML file, Excel offers three options.
Zipping and Unzipping Files
Perhaps the most commonly used type of file compression is the Zip format. Even Excel 2010 files are stored in the Zip format (although they don't use the .zip extension). A Zip file can contain any number of files, and even complete directory structures. The content of the files determines the degree of compression. For example, JPG image files and MP3 audio files are already compressed, so zipping these file types will have little effect on the file size.
The examples in this section are available on the companion CD-ROM. The files are named ‘zip files.xlsm' and ‘unzip a file.xlsm'.
Zipping files
The example in this section demonstrates how to create a Zip file from a group of user-selected files. The ZipFiles procedure displays a dialog box so that the user can select the files. It then creates a Zip file named compressed.zip in Excel's default directory.
Sub ZipFiles()
Dim ShellApp As Object
Dim FileNameZip As Variant
Dim FileNames As Variant
Dim i As Long, FileCount As Long
‘ Get the file names
FileNames = Application.GetOpenFilename _
(FileFilter:=”All Files (*.*),*.*”, _
FilterIndex:=1, _
Title:=”Select the files to ZIP”, _
MultiSelect:=True)
‘ Exit if dialog box canceled
If Not IsArray(FileNames) Then Exit Sub
FileCount = UBound(FileNames)
FileNameZip = Application.DefaultFilePath & “compressed.zip”
‘Create empty Zip File with zip header
Open FileNameZip For Output As #1
Print #1, Chr$(80) & Chr$(75) & Chr$(5) & Chr$(6) & String(18, 0)
Close #1
Set ShellApp = CreateObject(“Shell.Application”)
‘Copy the files to the compressed folder
For i = LBound(FileNames) To UBound(FileNames)
ShellApp.Namespace(FileNameZip).CopyHere FileNames(i)
‘Keep script waiting until Compressing is done
On Error Resume Next
Do Until ShellApp.Namespace(FileNameZip).items.Count = i
Application.Wait (Now + TimeValue(“0:00:01”))
Loop
Next i
If MsgBox(FileCount & “ files were zipped to:” & _
vbNewLine & FileNameZip & vbNewLine & vbNewLine & _
“View the zip file?”, vbQuestion + vbYesNo) = vbYes Then _
Shell “Explorer.exe /e,” & FileNameZip, vbNormalFocus
End Sub
Figure 27-11 shows the file selection dialog box generated by using the GetOpenFilename method of the Application object (see Chapter 12 for more information). This dialog box allows the user to select multiple files from a single directory.
The ZipFiles procedure creates a file named compressed.zip and writes a string of characters, which identify it as a Zip file. Next, a Shell.Application object is created, and the code uses its CopyHere method to copy the files into the Zip archive. The next section of the code is a Do Until loop, which checks the number of files in the Zip archive every second. This is necessary because copying the files could take some time, and, if the procedure ends before the files are copied, the Zip file will be incomplete (and probably corrupt). This loop slows the procedure considerably, but I haven't been able to figure out an alternative.
When the number of files in the Zip archive matches the number that should be there, the loop ends, and the user is presented with a message like the one shown in Figure 27-12. Clicking the Yes button opens a Windows Explorer window that shows the zipped files.
The ZipFiles procedure presented here was kept simple to make it easy to understand. The code does no error checking and is not very flexible. For example, there is no option to choose the Zip filename or location, and the current compressed.zip file is always overwritten without warning. It's certainly no replacement for the zipping tools built into Windows, but it's an interesting demonstration of what you can do with VBA.
FIGURE 27-11: This dialog box lets the user select the files to be zipped.
FIGURE 27-12: The user is informed when the Zip file is complete.
Unzipping a File
The example in this section performs the opposite function of the previous example. It asks the user for a ZIP filename and then unzips the files and puts them in a directory named Unzipped, located in Excel's default file directory.
Sub UnzipAFile()
Dim ShellApp As Object
Dim TargetFile
Dim ZipFolder
‘ Target file & temp dir
TargetFile = Application.GetOpenFilename _
(FileFilter:=”Zip Files (*.zip), *.zip”)
If TargetFile = False Then Exit Sub
ZipFolder = Application.DefaultFilePath & “Unzipped”
‘ Create a temp folder
On Error Resume Next
RmDir ZipFolder
MkDir ZipFolder
On Error GoTo 0
‘ Copy the zipped files to the newly created folder
Set ShellApp = CreateObject(“Shell.Application”)
ShellApp.Namespace(ZipFolder).CopyHere _
ShellApp.Namespace(TargetFile).items
If MsgBox(“The files was unzipped to:” & _
vbNewLine & ZipFolder & vbNewLine & vbNewLine & _
“View the folder?”, vbQuestion + vbYesNo) = vbYes Then _
Shell “Explorer.exe /e,” & ZipFolder, vbNormalFocus
End Sub
The UnzipAFile procedure uses the GetOpenFilename method to get the Zip file. It then creates the new folder and uses the Shell.Application object to copy the contents of the Zip file to the new folder. Finally, the user can choose to display the new directory.
Working with ADO
ADO (ActiveX Data Objects) is an object model that enables you to access data stored in a variety of formats (including common database formats and even text files). Importantly, this methodology allows you to use a single object model for all your data sources. ADO is currently the preferred data access methodology and shouldn't be confused with DAO (Data Access Objects).
This section presents a simple example that uses ADO to retrieve data from an Access database.
ADO programming is a very complex topic. If you need to access external data in your Excel application, you'll probably want to invest in one or more books that cover this topic in detail.
The ADO_Demo example retrieves data from an Access database named budget data.accdb. This database contains one table (named Budget). This example retrieves the data in which the Item field contains the text Lease, the Division field contains the text N. America, and the Year field contains 2008. The qualifying data is stored in a Recordset object, and the data is then transferred to a worksheet (see Figure 27-13).
FIGURE 27-13: This data was retrieved from an Access database.
Sub ADO_Demo()
‘ This demo requires a reference to
‘ the Microsoft ActiveX Data Objects 2.x Library
Dim DBFullName As String
Dim Cnct As String, Src As String
Dim Connection As ADODB.Connection
Dim Recordset As ADODB.Recordset
Dim Col As Integer
Cells.Clear
‘ Database information
DBFullName = ThisWorkbook.Path & “udget data.accdb”
‘ Open the connection
Set Connection = New ADODB.Connection
Cnct = “Provider=Microsoft.ACE.OLEDB.12.0;”
Cnct = Cnct & “Data Source=” & DBFullName & “;”
Connection.Open ConnectionString:=Cnct
‘ Create RecordSet
Set Recordset = New ADODB.Recordset
With Recordset
‘ Filter
Src = “SELECT * FROM Budget WHERE Item = ‘Lease' “
Src = Src & “and Division = ‘N. America' “
Src = Src & “and Year = ‘2008'”
.Open Source:=Src, ActiveConnection:=Connection
‘ Write the field names
For Col = 0 To Recordset.Fields.Count - 1
Range(“A1”).Offset(0, Col).Value = _
Recordset.Fields(Col).Name
Next
‘ Write the recordset
Range(“A1”).Offset(1, 0).CopyFromRecordset Recordset
End With
Set Recordset = Nothing
Connection.Close
Set Connection = Nothing
End Sub
This example (named simple ado example.xlsm), along with the Access database file (named budget data.accdb), is available on the companion CD-ROM. In addition, the CD-ROM contains an example of using ADO to query a CSV text file. The file named simple ado example2.xlsm uses a large CSV files named music_list.csv.