Mac OS X supports multiple device types that can be accessed as a disk. These include physical disks such as hard drives, FireWire and USB drives, multiple hard drives combined into a RAID, CDs and DVDs, various forms of flash memory (including thumb drives and the iPod Shuffle), network disks residing on a server, and virtual disks that can exist in memory or be derived from files on another filesystem. Even though disks can take many forms, from the user perspective, a disk is a disk is a disk. As long as disks store data, and files can be moved and copied between them without too much thought, that’s all any user really cares about, right?
To the system, however, two layers mediate between the operating system and disks:
Device drivers that translate standard system file access calls into a form understood by the disk
Filesystems that organize data on a drive into a form that can be accessed by a device driver
Mac OS X has device drivers, in the form of kernel extensions (kexts), for most of the devices that you will want to use as a disk, including FireWire drives, USB flash memory cards, and SCSI disks.
This chapter starts out by introducing the kinds of filesystems that Mac OS X supports and shows you how to examine and work with these filesystems. It then shows you how to work with disks—both physical and virtual—including partitioning disks and moving data safely from one disk to another.
Each device stores data in a form that makes sense for that device. A hard drive usually writes data to the platters using a series of sectors and tracks. A CD stores data in one continuous track that spirals from the inside of the disc out. A CompactFlash card simply holds data in a matrix of memory cells on the chip. For efficiency reasons, the data that makes up a file may be scattered across various sectors of a hard drive instead of being nicely organized in one lump. The role of a filesystem is to mediate between the world of the device where data resides and the world of the Finder where data shows up in an organized form as files and folders.
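To make the mediating role concrete, here is a minimal toy sketch in Python. It is not how HFS+ or any real filesystem lays out data; the block size, the `catalog` dictionary, and the function names are all invented for illustration. It only shows the idea that a file can live in scattered blocks while still being presented as one contiguous piece of data.

```python
# Toy model: a "disk" is a list of fixed-size blocks, and the filesystem
# keeps a catalog mapping each file name to the blocks that hold its data.
BLOCK_SIZE = 4  # bytes per block; tiny on purpose so the example is readable


def write_file(disk, catalog, name, data, free_blocks):
    """Store data in the offered free blocks, taking them in the given order."""
    chunks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    used = []
    for chunk in chunks:
        block = free_blocks.pop(0)
        disk[block] = chunk
        used.append(block)
    # Remember which blocks belong to the file and how long the file is.
    catalog[name] = (used, len(data))


def read_file(disk, catalog, name):
    """Reassemble a file from its blocks, hiding the scattering from the caller."""
    blocks, length = catalog[name]
    return b"".join(disk[b] for b in blocks)[:length]


disk = [b""] * 16
catalog = {}
# The free blocks are deliberately non-adjacent to mimic a fragmented disk.
write_file(disk, catalog, "notes.txt", b"hello, disk!", [9, 2, 13, 5])
print(catalog["notes.txt"][0])  # block numbers holding the file
print(read_file(disk, catalog, "notes.txt"))
```

The caller of `read_file` never sees block numbers, which is the same separation the Finder relies on when it shows files and folders.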
The primary filesystem used by Mac OS X is the Mac OS Extended filesystem, also known as HFS+. Introduced in Mac OS 8.1 and upgraded for Panther with journaling features, HFS+ allows long filenames with up to 255 Unicode characters, scales up to 2 TB of data on a filesystem, can handle 2 billion files, and allows for files up to 2 GB in size. In addition, each folder in an HFS+ filesystem can handle a maximum of 32,767 files. When HFS+ was first introduced, the size of these figures was way beyond what the state-of-the-art filesystems of the time could handle.
HFS+ should remain serviceable for the next few years, but it clearly won’t last forever. Apple, in its infinite wisdom, no doubt realizes this and is probably hard at work devising a new filesystem type for the future, something capable of keeping up with the rapidly growing file sizes we encounter when working with digital audio and video.
One of the quirks of HFS+ is that it is a case-preserving and case-insensitive filesystem. This means you can’t have files named Readme and README in the same directory. This is similar to the way filesystems work on Windows, but is different from traditional Unix case-sensitive filesystems, where you can have both a Readme and a README file in the same directory. In most cases, this isn’t a problem, because people don’t tend to place two files with the same name into a directory.
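The behavior can be modeled with a short Python sketch. This is only an illustration: the class name is hypothetical, and Python's `str.casefold()` stands in for the filesystem's own case-folding rules, which are not identical. The directory remembers each name exactly as typed (case-preserving) but compares names without regard to case (case-insensitive).

```python
class CasePreservingDirectory:
    """Toy directory: stores names as typed, compares them ignoring case."""

    def __init__(self):
        # Maps the case-folded name to the name as originally created.
        self._entries = {}

    def create(self, name):
        key = name.casefold()  # stand-in for the filesystem's folding rules
        if key in self._entries:
            raise FileExistsError(
                f"{name!r} collides with existing {self._entries[key]!r}"
            )
        self._entries[key] = name

    def lookup(self, name):
        """Return the stored spelling of the name, or None if absent."""
        return self._entries.get(name.casefold())

    def listing(self):
        return sorted(self._entries.values())


folder = CasePreservingDirectory()
folder.create("Readme")
print(folder.lookup("README"))  # finds the file, reports the stored spelling
try:
    folder.create("README")
except FileExistsError as err:
    print("refused:", err)
```

A case-sensitive filesystem would correspond to using the name itself as the dictionary key, in which case both files could coexist.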
Another difference between the HFS+ filesystem and most others is that HFS+ supports the concept of resource forks. Resource forks were used on the old Mac OS to store all sorts of metadata for a file, such as icons. Although resource forks were a good idea, they didn’t catch on with the rest of the world. Unix and Windows filesystems don’t have an equivalent concept, so Apple now recommends that applications that write files avoid using resource forks. However, the filesystem still supports resource forks, mainly so older applications that rely on them can run just fine. In all likelihood, many applications will continue to use resource forks to some degree or another. You can read more about resource forks in Chapter 8.
If something went wrong with the Mac OS X filesystem in earlier releases, a long and intensive fsck process would be run the next time the machine started up. For example, if the machine was powered off incorrectly or if the system crashed, it was pretty common to wait a long time at the initial gray boot screen. And if the filesystem truly got itself into a bad state, manual intervention was necessary to fix the filesystem.
Journaling, which was first introduced in Mac OS X Server 10.2.2 and has been enabled by default since Mac OS X Panther, implements a scheme that keeps the filesystem structure of your disk safe even in the face of an unexpected shutdown or system crash. When your Mac reboots, disk repairs are made as needed. The filesystem does this by keeping a continuous record of changes to the files on a disk in a journal file in a designated area of the disk. If a computer starts up and the disk is in an inconsistent state, the journal is used to quickly restore the disk to its previous known state. This record keeping does come with a slight performance overhead. It typically takes 10 to 15 percent longer to write small files with journaling than without it, but in most cases, the slight performance loss is well worth the safety gained. Journaling also allows other optimizations to be made in the disk I/O system, which more than make up for the performance penalty.
It is important to note that, in the face of unexpected shutdowns, journaling won’t necessarily protect the data being written to disk. By protecting the filesystem, journaling protects all the data that is already on your disk from being lost. Any changes you may have made to a document after its last save, for example, are most likely lost.
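The general idea behind journaling can be sketched in a few lines of Python. This is a toy model of the write-ahead technique in general, not the HFS+ on-disk journal format; the class, method, and record names are all hypothetical. Changes are recorded in the journal first, and at startup only the transactions that reached their commit record are applied, so a crash in the middle of an update leaves the structure consistent while the interrupted change itself is lost.

```python
class ToyJournaledVolume:
    """Toy model of a journaled catalog (file name -> list of blocks)."""

    def __init__(self):
        self.catalog = {}  # stands in for the on-disk filesystem structure
        self.journal = []  # ordered records: begin, set, commit

    def log_transaction(self, txid, changes, commit=True):
        """Record changes in the journal; commit=False simulates a crash."""
        self.journal.append(("begin", txid))
        for name, blocks in changes.items():
            self.journal.append(("set", txid, name, blocks))
        if commit:
            self.journal.append(("commit", txid))

    def replay(self):
        """At startup, apply only transactions that were fully committed."""
        committed = {rec[1] for rec in self.journal if rec[0] == "commit"}
        for rec in self.journal:
            if rec[0] == "set" and rec[1] in committed:
                _, _, name, blocks = rec
                self.catalog[name] = blocks
        self.journal.clear()


volume = ToyJournaledVolume()
volume.log_transaction(1, {"a.txt": [3, 4]})
# The second transaction never reaches its commit record (simulated crash).
volume.log_transaction(2, {"b.txt": [7]}, commit=False)
volume.replay()
print(volume.catalog)
```

After the replay, `a.txt` is present and `b.txt` is not, which mirrors the point above: the structure survives, but the change that was in flight does not.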
In the past 20 years of personal computing, a common theme with hard disks has been the issue of file fragmentation. Early hard drive formats were extremely susceptible to performance problems manifested over time because file data was often split and scattered across the hard drive. As files grew larger than their original allocation, the filesystem was forced to put parts of those files onto different sectors of the disk. Even more modern formats such as HFS+ can exhibit performance slowdowns over time as a disk is used more.
The following two optimizations to the HFS+ driver were introduced in Mac OS X Panther (v10.3) and apply when using a journaled filesystem:
When opened, if a file has more than eight fragments and is smaller than 20 MB in size, it is defragmented by simply moving the file to a new location on the drive where the file can be written in one contiguous block.
Over a period of time, the system keeps track of small files that are read frequently, but never written to. As the system learns which files are used most and which are least likely to change size, it moves them to the fastest part of the drive, where they can quickly be accessed. Files that don’t meet the requirements for being in this “hot zone” are moved out to ensure that enough room exists for the files that should be there.
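The two rules above can be expressed as a short Python sketch. Only the thresholds (more than eight fragments, smaller than 20 MB) come from the description; everything else is an assumption made for illustration, including the function names, the reading of 20 MB as 20 × 1024 × 1024 bytes, and the greedy "most reads first" policy for filling the hot zone. The real driver logic is certainly more involved.

```python
FRAGMENT_THRESHOLD = 8              # "more than eight fragments"
MAX_SIZE_BYTES = 20 * 1024 * 1024   # "smaller than 20 MB" (assumed binary MB)


def should_defragment_on_open(fragment_count, size_bytes):
    """Rule 1: relocate small, heavily fragmented files when they are opened."""
    return fragment_count > FRAGMENT_THRESHOLD and size_bytes < MAX_SIZE_BYTES


def pick_hot_files(stats, capacity):
    """Rule 2 (hypothetical policy): fill the hot zone with files that are
    read often and never written, most frequently read first.

    stats maps a file name to a (reads, writes, size_bytes) tuple.
    """
    candidates = [
        (reads, name, size)
        for name, (reads, writes, size) in stats.items()
        if writes == 0 and reads > 0
    ]
    candidates.sort(key=lambda c: (-c[0], c[1]))  # most reads first
    hot, used = [], 0
    for _reads, name, size in candidates:
        if used + size <= capacity:  # skip files that no longer fit
            hot.append(name)
            used += size
    return hot


print(should_defragment_on_open(9, 5 * 1024 * 1024))
print(pick_hot_files({"a": (10, 0, 5), "b": (50, 0, 5), "c": (99, 1, 5)}, 10))
```

In this sketch, file `c` is excluded despite its high read count because it has been written to, matching the "read frequently, but never written to" condition.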
This means—at least for most files most of the time—a separate defragmentation program isn’t needed. It also means you should always enable journaling on your drives so you can take advantage of these features. Fortunately, when installing Tiger, the default filesystem type is Mac OS Extended (Journaled), so you should be set.
These optimizations weren’t part of the official advertised feature set for Panther. They were, however, discovered by some programmers while reading the source code for the filesystem drivers available from the Darwin project (http://developer.apple.com/darwin).
In addition to HFS+, Mac OS X supports several other types of filesystems, each of which has its own unique characteristics:
The standard Mac filesystem prior to the release of Mac OS 8.1, HFS is primarily supported so that older disks can still be accessed.
A variant of the standard BSD Fast File System, UFS is a case-sensitive and case-preserving filesystem provided to ensure that applications needing a case-sensitive filesystem can be run. Some people recommend that Unix software developers use UFS, but experience has shown that case insensitivity isn’t the big problem many make it out to be. If you think you have a need for UFS, you’ll want to consider your decision carefully because, under Mac OS X, this filesystem doesn’t perform as well as HFS+.
UDF (Universal Disk Format) is the standard format for all DVD media, including video, DVD-ROM, DVD-RAM, and DVD-RW, as well as some writable CD formats.
ISO 9660 is the standard cross-platform file format for CD-ROM data discs.
Mac OS X also reads the format used by standard audio CDs.
FAT is the standard filesystem of MS-DOS and is widely used by Microsoft Windows. Mac OS X supports both the 16- and 32-bit variants of FAT.
Originally developed for Windows NT, NTFS is now the primary filesystem used in the PC world. Mac OS X provides some support for NTFS, but access is limited to read actions; Mac OS X can’t write to a locally mounted NTFS volume.