10 DISKS, PARTITIONING, AND GEOM

A sysadmin can’t overemphasize the importance of managing disks and filesystems. (Go ahead, try to emphasize it too much. I’ll wait.) Your disks contain your data, making reliability and flexibility paramount to the operating system. FreeBSD supports a variety of filesystems and has many different ways to handle them. In this chapter, we’ll consider the most common disk tasks every sysadmin performs.

First, let’s discuss the most important thing to remember about storage devices.

Disks Lie

Once upon a time, a sysadmin could make decisions about a disk based on the information it provided. You could plug in a hard drive and query it for the number of platters, cylinders, sectors, and more. Those days are long, long past. Yes, you can perform the same query and get an answer, but those answers don’t reflect any reality. Today, a disk is a magic box that regurgitates data on request. Some of those magic boxes contain spinning platters. Others lack moving parts. The magic boxes provide numbered sectors for storing bits and bytes. The relationship between those numbers and the contents of the box? That’s magic: inscrutable and unknowable.

In previous books, including earlier editions of this one, I’ve discussed the importance of proper data placement on the disk, but all of that knowledge is completely obsolete. If you still retain any of that knowledge, discard it in favor of something more useful, like the complete biographies of all the actors who appeared in any role in classic Doctor Who.

As far as disk design goes, the only thing you need to know about is logical block addressing (LBA). Each sector on a disk is assigned a number. Filesystems call disk sectors by number. That’s it. Anything beneath LBA is pure guesswork on your part.

Unfortunately, disks now have a new category of lies they tell: sector size.

Up through the 1990s, disk sector sizes varied from 128 bytes to 2KB. Even the original IBM PC could understand different sector sizes on floppy disks.

In the early 2000s, though, manufacturers settled on 512-byte sectors. Today’s hard drives are much larger, and the files are similarly larger. In the last few years, drives with 512-byte sectors have mostly been replaced by drives with 4,096-byte sectors, called 4K drives. This sector size makes more sense for the type of data we store today.

The problem is, operating systems like Windows XP know that a disk sector always has been, and always will be, 512 bytes. These operating systems won’t tolerate hard drives that report having 4KB sectors because everybody knows there’s no such thing. If you manufacture 4K drives, what do you do?

The same thing you always do.

You teach the hard drive to lie.

Best of all, different 4K drives lie in different ways. If the OS asks a drive its sector size, most drives state that they have 512-byte sectors. Drives that claim to have both 512-byte and 4KB sectors are probably 4K drives, struggling to tell the truth. Very few admit to having solely 4KB sectors. To complicate matters even more, some solid state drives have sectors as large as 8KB or 16KB, or they support multiple sector sizes.

Both of FreeBSD’s main filesystems must know the sector size of the underlying disk and the logical block address of that sector. If you use the wrong sector size on your disk, performance suffers. I could go into long detailed discussions of how this happens, but to keep it simple, always align partitions on even megabyte boundaries. You might waste a few bytes here and there, but that’s trivial compared to the truly appalling performance you’ll get from having a filesystem misaligned with the disk.
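To make the alignment advice concrete, here's a minimal sketch in plain shell arithmetic. The helper name align_up and its 512-byte default are my own choices for illustration, not part of any FreeBSD tool. It rounds a starting sector up to the next 1MB boundary.

```shell
# Round a starting sector up to the next 1 MiB boundary.
# Usage: align_up START_SECTOR [SECTOR_SIZE_IN_BYTES]
# align_up is a hypothetical helper name; sector size defaults to 512.
align_up() {
    start=$1
    sector_size=${2:-512}
    per_mib=$((1048576 / sector_size))      # sectors per 1 MiB
    echo $(( (start + per_mib - 1) / per_mib * per_mib ))
}

align_up 1064        # prints 2048
align_up 2048        # already aligned, so it prints 2048 again
align_up 1 4096      # with 4K sectors there are 256 per MiB: prints 256
```

You rarely need to do this math by hand, since gpart(8) accepts an alignment argument when adding partitions, but it shows how little space the rounding costs.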

Device Nodes

We touched briefly on device nodes in Chapter 4, but let’s consider them in more detail here. Device nodes are special files that represent a hardware device or an operating system feature. They’re used as logical interfaces to provide features to user programs. By using a command on a device node, sending information to a device node, or reading data from a device node, you’re telling the kernel to perform an action. If the device node represents a physical device, you’re acting on that device. These actions can be very different for different devices—writing data to disk is very different than writing data to a sound card. While you can expose device nodes anywhere, the standard device nodes exist in /dev.

Before you can work with a disk or disk partition, you must know its device name. FreeBSD disk device nodes come from the names of the device driver for that type of hardware. Device driver names, in turn, often come from the type of device and not the device’s role or function.

Table 10-1 shows the most common disk device nodes.

Table 10-1: Storage Device Nodes and Types

Device node	Man page	Description
/dev/ada*	ada(4)	ATA-style direct access disks (SATA, IDE, etc.)
/dev/cd*	cd(4)	Optical media drives (CD, Blu-Ray, etc.)
/dev/da*	da(4)	SCSI-style direct access disks (USB storage, SAS, etc.)
/dev/md*	md(4)	Memory disks
/dev/mmcsd*	mmcsd(4)	MMC and SD memory cards
/dev/nvd*	nvd(4)	NVM express drives
/dev/vtbd*	virtio_blk(4)	Virtio-based virtual machine disk
/dev/xbd*	xen(4)	Xen virtual disks

Many RAID controllers present their RAID containers as SCSI devices, so they show up as /dev/da device nodes. Others present their disks as “SCSI plus special vendor topping,” so they get special device node names such as /dev/raid (ATA RAID), /dev/mfid (certain LSI MegaRAID cards), and so on. Check the man page for your RAID controller to see the device node it presents.

The Common Access Method

The Common Access Method (CAM) is a standardized device driver architecture originally written to support the complex command set of 20th-century SCSI-2 disks. The idea was that standardizing based on this architecture would simplify writing device drivers. Only FreeBSD and DEC OSF/1 actually shipped with CAM, however, and each filled in the specification’s gaps differently.

FreeBSD 9 and later consolidates management of all physical disks that support CAM in the CAM interface. Use camcontrol(8) to gather information from disks and issue commands to them. The camcontrol(8) command has a variety of subcommands that let you issue instructions to hard drives.

What Disks Do You Have?

To identify a host’s storage devices, you can trawl /var/run/dmesg.boot looking for disk device nodes or see which filesystems are mounted and backtrack from there. But the easiest way to identify your storage is to have camcontrol(8) ask the CAM system what disks it sees. Let’s look at one of my test systems:

# camcontrol devlist
<ATA WDC WD1003FBYZ-0 1V03>        at scbus0 target 0 lun 0 (pass0,da0)
<ATA WDC WD1003FBYZ-0 1V03>        at scbus0 target 1 lun 0 (pass1,da1)
<ATA WDC WD1003FBYZ-0 1V03>        at scbus0 target 2 lun 0 (pass2,da2)
<ATA WDC WD1003FBYZ-0 1V03>        at scbus0 target 3 lun 0 (pass3,da3)

This output is broken up into three fields. The first gives the name of the device, as reported by the device itself. This is usually a vendor and the vendor’s model number.

The second section gives SCSI connection information. These drives aren’t actually SCSI drives—they’re SATA connections managed via CAM. But you now know which disk devices are plugged into which port on the SATA controller.

Finally, in parentheses, we have the SCSI device and what we probably want, the storage device node. This host has four disks, named da0, da1, da2, and da3.
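If you want only the device node names, say for a script, a small filter can pull them out of the parenthesized field. This is a sketch under the assumption that the output looks like the sample above; devlist_nodes is a hypothetical helper name, not a FreeBSD command.

```shell
# Print the storage device node named in each line of
# "camcontrol devlist" output read from standard input.
# devlist_nodes is a hypothetical helper name.
devlist_nodes() {
    # Keep the parenthesized list, split it on commas, drop the passN entry.
    sed -n 's/.*(\(.*\))$/\1/p' | tr ',' '\n' | grep -v '^pass'
}

# Sample lines copied from the output above. On a live host you could
# pipe in the real thing instead: camcontrol devlist | devlist_nodes
devlist_nodes <<'EOF'
<ATA WDC WD1003FBYZ-0 1V03>        at scbus0 target 0 lun 0 (pass0,da0)
<ATA WDC WD1003FBYZ-0 1V03>        at scbus0 target 1 lun 0 (pass1,da1)
EOF
```

Splitting on the comma before filtering means the filter doesn't care which order the two names appear in.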

Non-CAM Devices

Generally speaking, everything except proprietary RAID controllers and virtual disks supports CAM.

RAID controllers have usually embraced and extended the CAM protocol, for what the manufacturer thought was a good reason at the time. A protocol written in the early 1990s wasn’t sufficient for a 2010 RAID controller. These controllers usually have their own control programs. The RAID containers show up in devlist and some other camcontrol(8) subcommands.

Similarly, virtual disks don’t respond to CAM commands. There’s no disk to command there—you’re just writing blocks to a file. You can view the disk with camcontrol devlist, but that’s about it.

For most applications, I recommend using FreeBSD’s RAIDZ or GEOM RAID, rather than a hardware RAID controller.

The GEOM Storage Architecture

FreeBSD has an incredibly flexible storage infrastructure system called GEOM (short for “disk geometry”). GEOM lives between device driver nodes and the underlying hardware, handling data exchanged between them. From this position, GEOM can arbitrarily transform input/output requests.

GEOM is built out of kernel modules, called GEOM classes, that let you perform specific types of transformation or management. Disks have a GEOM class that lets the kernel put data on the disk. But if you want to encrypt your disks, that’s a GEOM class. Software-based RAID? A GEOM class. FreeBSD implements all storage modifications as GEOM classes.

GEOM classes are stackable. They use the output of one class as the input for another. You want to encrypt your hard drive and then mirror it to another hard drive? Sure! Stack an encryption module on top of your hard drive and then stack the drive-mirroring module on top of that. You want to mirror that drive across the network? Add that GEOM class to the stack. This flexible modularity makes GEOM one of FreeBSD’s most powerful features.

GEOM Autoconfiguration

When FreeBSD finds a new storage device, either at boot or when you plug a new drive in, the GEOM subsystem checks the device for known formats, like a master boot record, a BSD disklabel, or other metadata. GEOM also checks for physical identifiers, such as the disk’s serial number. This is called tasting.

When GEOM finds identifying information, it configures the device as that metadata dictates. If a disk’s metadata says, “I’m part of a mirror called garbage, along with two other disks,” GEOM looks for the other disks and assembles the mirror. If GEOM can identify a storage device by format, label, or other information, it starts the device, fires up an instance of the GEOM class, makes the appropriate device nodes, and performs any other configuration it understands.

If GEOM can’t identify any other metadata on the disk, such as on an unformatted and unpartitioned disk, GEOM creates the device node for the storage device and leaves it alone.

An instance of a GEOM class is called a geom. The gmirror(8) class makes disks mirror each other, but the specific pair of mirrored disks named garbage is a geom. Each disk in that mirror is also a geom.

GEOM vs. Volume Managers

Traditional volume managers expect you to do things their way, whether that makes sense for your environment and hardware or not. If the volume manager says that you create an encrypted disk mirror by encrypting the individual drives and then mirroring on top of them, that’s what you do. It might make more sense in your environment to mirror the drives and then encrypt them, but if that’s not what the volume manager does, too bad. Worse, some volume managers make poor choices and then implement fixes sideways to minimize the consequences of those decisions.

GEOM differs from volume managers in that it assumes you know what you’re doing. It gives you flexibility to arrange your storage in the manner that best fits your hardware and benefits your use case. GEOM classes let you easily insert new data transformations into your storage. You can’t, say, add an encryption layer into your commercial volume manager.

Volume managers cover the most common cases for hardware that existed at the time they were conceived. As time passes, though, that most common case becomes increasingly uncommon. People continue to use volume managers long after the hardware they were designed for becomes obsolete. GEOM lets you evolve your designs with your hardware, environment, and application.

FreeBSD includes two software suites that look much like volume managers: gvinum(8) and ZFS. Vinum was the FreeBSD volume manager in the 1990s, and while gvinum(8) reimplements it as a GEOM class, its use is strongly discouraged. ZFS is very powerful, as we’ll see in Chapter 12, but it does have the “do it our way” ethos of a volume manager.

While you can theoretically stack GEOM modules forever, you must consider your hardware resources. Mirroring a busy disk across a network can require a dedicated network interface and an otherwise empty cross-connect cable. Encrypting and decrypting data eats processor time and memory. GEOM doesn’t prevent you from thrashing your disks; it merely gives you new and interesting opportunities for doing so.

Providers, Consumers, and Slicers

Individual geoms are either consumers, providers, or both.

A provider offers services to another geom. If you’re mirroring two hard drives, the geoms for the hard drives provide the disks to the mirror. A provider usually has a device node, such as /dev/ada1p1.

A consumer uses the provider’s services. A disk-mirror geom consumes the underlying disk drives. The consumer part of a geom doesn’t need a device node.

A geom can be both a provider and a consumer—indeed, every geom in the middle of a stack must be both. A disk-mirror geom consumes the underlying physical storage media, but it provides a mirrored disk for the filesystem to live on.

FreeBSD treats all providers and consumers identically. A physical hard drive is just another provider, exactly like a mirror or encryption layer or import from the network. This characteristic lets you arbitrarily stack GEOM classes.

A GEOM class that subdivides a geom is called a slicer and is usually responsible for managing partitions. The GEOM class that handles master boot record (MBR) partitions is a slicer, as is the GUID Partition Table (GPT) class. We discussed both of these partitioning methods in Chapter 2, and we’ll go deeper into both in this chapter. Slicers must make sure that disk partitions don’t overlap and that the partitions conform to the rules of the partitioning scheme.

GEOM Control Programs

Many GEOM classes have a control program that lets you administer the module or interrogate the device. Some widely used classes use geom(8), while other classes use programs like gmirror(8) or geli(8). The disk GEOM class talks to the physical storage media and offers providers to upper layers. That’s a really commonly used class. Here, I interrogate a host to see what geoms of type disk it has and print out the information the disk offers the operating system.

   $ geom disk list
➊ Geom name: da0
   Providers:
   1. Name: da0
    ➋ Mediasize: 1000204886016 (932G)
    ➌ Sectorsize: 512
    ➍ Mode: r2w2e3
    ➎ descr: ATA WDC WD1003FBYZ-0
    ➏ lunname: ATA   WDC WD1003FBYZ-010FB0            WD-WCAW36478143
    ➐ lunid: 50014ee25e60dab5
    ➑ ident: WD-WCAW36478143
    ➒ rotationrate: 7200
    ➓ fwsectors: 63
       fwheads: 255

This hard drive provides a disk device called da0 ➊. The mediasize field gives its size in bytes and converts it to a more convenient 932GB ➋.

This disk claims to have a sector size of 512 bytes ➌. Many disks lie about their sector size. Check the drive manufacturer’s documentation to determine the actual sector size. Drives might offer a Stripesize value of 4,096 to indicate that they’re actually 4K drives.

A GEOM class’s mode looks an awful lot like file permissions ➍, but it’s really the number of GEOM classes reading from (r2) and writing to (w2) the device, plus the number of devices that have requested exclusive access to the device (e3).
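Here's a small sketch that pulls the three counters out of a mode string, in case you want to check them in a script. The helper name decode_mode is made up for this example.

```shell
# Split a GEOM access-count string such as r2w2e3 into its three counters.
# decode_mode is a hypothetical helper name.
decode_mode() {
    echo "$1" | sed -n 's/^r\([0-9][0-9]*\)w\([0-9][0-9]*\)e\([0-9][0-9]*\)$/read=\1 write=\2 exclusive=\3/p'
}

decode_mode r2w2e3    # prints: read=2 write=2 exclusive=3
decode_mode r0w0e0    # prints: read=0 write=0 exclusive=0
```

A mode of r0w0e0 means nothing currently has the provider open.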

The descr field ➎ offers the drive’s model number.

The lunname field ➏ gives the model number plus the serial number. Yes, it’s a combination of the descr and ident fields. The hard drive really, really wants you to believe this is its name and identifier.

The lunid ➐ gives the logical-unit-number (LUN) identifier, which describes how this drive attaches to this host.

The disk’s ident ➑ is the drive’s serial number.

The rotationrate ➒ tells us how fast this drive spins. It’s a 7,200 RPM disk. Nonspinning disks, like SSDs, have a rotationrate of 0.

The fwsectors and fwheads fields ➓ give us the drive geometry. These are examples of the lies mentioned in the beginning of this chapter. Even SSDs offer these values.

Some drives offer less information. Virtual disks offer almost no information, and anything they do say, you can’t trust. (While the VM system might say this disk offers 32,212,254,720 bytes in 512-byte sectors, who knows what the actual disk beneath the virtual disk has?)

GEOM Device Nodes and Stacks

Many sysadmin tools expect to run on a disk or disk partition. Unix-like systems offer disks and partitions as device nodes. GEOM offers device nodes so that these tools remain compatible.

Most active GEOM modules have their own directory in /dev. Device nodes within that directory represent the current providers of that module. The directory is often, but not always, named after the GEOM module using it. For example, the gmirror(8) class uses /dev/mirror.

The directory name might be changed to avoid ambiguity or overlaps. The glabel (GEOM label) class uses /dev/label. The /dev/gpt directory contains the labels stored on GPT partitions, while /dev/gptid contains the numerical identifiers integral to GPT partitions.

Some classes don’t create a directory and instead piggyback on existing devices. The gnop(8) class creates a new node right next to the node it’s attached to but appends .nop to the end of the device name.

Hard Disks, Partitions, and Schemes

While we discussed partitioning in Chapter 2, consider partitions from a disk drive perspective. The first possible SATA disk on our first SATA controller is called /dev/ada0. Subsequent disks are /dev/ada1, /dev/ada2, and so on. If you also have SAS disks, they’ll start their numbering over at 0.

Disks get further divided into partitions. Even average consumer-grade systems running Microsoft operating systems ship with multiple partitions on the hard drive. Sysadmins chop huge disk arrays into smaller, more manageable units with dedicated purposes—or perhaps they go the other way and merge multiple disks into one monster partition.

A partitioning scheme is the system for organizing partitions on a disk. The traditional master boot record (MBR) is one partitioning scheme. Old Apple and SPARC hardware have their own schemes. Today, the scheme used by most hardware and operating systems is GUID Partition Tables (GPT). Each scheme has its own requirements for boot blocks, hardware architecture, and partitions. This book discusses the MBR and GPT schemes, but you should be aware that other schemes exist.

Each disk partition gets its own device node, created by adding something to the end of the underlying device node name. Here, I look at the device nodes for a default FreeBSD install using UFS on a virtual disk:

# ls /dev/vtbd0*
/dev/vtbd0 /dev/vtbd0p1 /dev/vtbd0p2 /dev/vtbd0p3

We have a device node for the disk itself and then three others ending in p1, p2, and p3. What are those subdivisions? The p indicates that they’re GPT partitions. In a default install, p1 is the boot partition, p2 is the swap space, and p3 is the main filesystem.
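As a quick illustration of that naming convention, this sketch splits a GPT partition device name back into the disk name and the partition number. The helper split_gpt_node is hypothetical, and it only understands the p-plus-number suffix, not the extensions other schemes use.

```shell
# Split a GPT partition device name such as vtbd0p3 into disk and index.
# split_gpt_node is a hypothetical helper; it only understands the
# "p<number>" suffix, not other partitioning schemes' suffixes.
split_gpt_node() {
    name=${1#/dev/}
    echo "$name" | sed -n 's/^\(.*[0-9]\)p\([0-9][0-9]*\)$/disk=\1 index=\2/p'
}

split_gpt_node /dev/vtbd0p3   # prints: disk=vtbd0 index=3
split_gpt_node ada0p1         # prints: disk=ada0 index=1
split_gpt_node vtbd0          # prints nothing: this is a whole disk
```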

Each partitioning scheme has its own device node extensions. We’ll read about those later this chapter.

The Filesystem Table: /etc/fstab

FreeBSD, like most Unix-like operating systems, uses the file system table /etc/fstab to map on-disk partitions to filesystems and swap space. While ZFS doesn’t use /etc/fstab, every other FreeBSD filesystem can appear therein. Each partition in use appears on a separate line, along with mounting and management instructions.

/dev/gpt/rootfs  /       ufs     rw  2  1
/dev/gpt/swapfs  none    swap    sw  0  0
proc             /proc   procfs  rw  0  0

The first field gives the GEOM provider name. This might be a physical disk partition such as /dev/ada0p1 or perhaps a partition of a GEOM device node. The first two lines here offer device nodes under /dev/gpt. They’re GPT labels, which we’ll see later this chapter. Our third entry lists the word proc rather than a device node: it’s the procfs(5) virtual filesystem, which we’ll examine in Chapter 13.

The second field gives the directory where the filesystem is available, called the mount point. Every partition you can read or write files on is attached to a mount point, such as /usr, /var, and so on. A few special partitions, such as swap space (line 2 here), have a mount point of none. You can’t read or write usable files to the swap space because it’s not attached to the directory tree and because the system would overwrite those files when it swapped.

Next, we have the filesystem type. The first line shows a type of ufs, or Unix File System. The second line is defined as swap space, while the third is type procfs. Other types include cd9660 (CDs or CD images), nfs (Network File System mounts), and ext2fs (Linux filesystems). The filesystem table tells FreeBSD how to mount this partition. Chapter 13 discusses alternate filesystems.

The fourth field shows the mount(8) options used for this particular partition. Each filesystem has its own mount options, but here are a few that multiple filesystems use and that frequently appear in /etc/fstab:

ro The filesystem is mounted read-only. Not even root can write to it.

rw The filesystem is mounted read-write.

noauto FreeBSD won’t automatically mount the filesystem, either at boot or when using mount -a. This option is useful for removable media drives that might not have media in them at boot.

The fifth field is used to tell dump(8) what backup level is needed to back up this filesystem. Dump is largely obsolete these days; people perform file-level backup with tar(1) or use more advanced backup software, like Bacula (http://www.bacula.org/) or Tarsnap (https://www.tarsnap.com/).

The last field tells the FreeBSD boot process when to check filesystem integrity. All the partitions with the same number get checked in parallel with fsck(8). The root filesystem gets marked with a 1, meaning it’s checked first. Only the root filesystem should get a 1. Any other partitions should get a 2 or higher, meaning they get checked later. Swap, read-only media, and logical filesystems don’t require integrity checking, so they get set to 0.
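To tie the six fields together, here's a minimal sketch that labels each field of every entry it reads. The function name describe_fstab and the field labels are mine, chosen for illustration; the sample entries are the ones shown above.

```shell
# Label the six fields of each /etc/fstab entry read from standard input.
# Comment lines and short or blank lines are skipped. describe_fstab is a
# hypothetical helper name.
describe_fstab() {
    awk 'NF >= 6 && $1 !~ /^#/ {
        printf "device=%s mountpoint=%s type=%s options=%s dump=%s pass=%s\n",
               $1, $2, $3, $4, $5, $6
    }'
}

# On a live host, try: describe_fstab < /etc/fstab
describe_fstab <<'EOF'
# Device         Mountpoint  FStype  Options  Dump  Pass#
/dev/gpt/rootfs  /           ufs     rw       2     1
/dev/gpt/swapfs  none        swap    sw       0     0
EOF
```

Entries with a pass value of 0, like the swap line, are the ones that skip the integrity check at boot.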

FreeBSD configures all filesystems found in /etc/fstab at boot. As the system runs, though, the sysadmin can mount other filesystems. And she can unmount ones listed there. That leads to our next question . . .

What’s Mounted Now?

If not all filesystems are mounted automatically at boot, and if the sysadmin can add and remove mounted filesystems, how can you determine what’s mounted right now? Use mount(8) without any options to see all mounted filesystems.

# mount
/dev/gpt/rootfs on / (ufs, local, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)

This is a small UFS-based host. It has one disk partition and an instance of devfs(5) (see Chapter 13). The word local means that the partition is on a hard drive attached to this machine. The journaled soft-updates option is a UFS feature we’ll discuss in Chapter 11. If you’re using NFS or SMB to mount partitions, they’ll appear here.
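When a host has dozens of mounts, you might want to pick out only one filesystem type. Here's a sketch that filters mount(8) output by type. It assumes the line layout shown above and mount points without spaces; mounts_of_type is a hypothetical helper name.

```shell
# List the mount points of one filesystem type, reading mount(8) output
# from standard input. mounts_of_type is a hypothetical helper name.
mounts_of_type() {
    # Each line looks like: SOURCE on MOUNTPOINT (TYPE, option, ...)
    awk -v want="$1" '{
        fstype = $4
        gsub(/[(,)]/, "", fstype)
        if (fstype == want) print $3
    }'
}

# On a live host, try: mount | mounts_of_type ufs
mounts_of_type ufs <<'EOF'
/dev/gpt/rootfs on / (ufs, local, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
EOF
```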

More complicated hosts give larger results:

# mount
base/ROOT/default on / (zfs, local, noatime, nfsv4acls)
base/tmp on /tmp (zfs, local, noatime, nosuid, nfsv4acls)
base/usr/home on /usr/home (zfs, local, noatime, nfsv4acls)
base/usr/ports on /usr/ports (zfs, local, noatime, nosuid, nfsv4acls)
procfs on /proc (procfs, local)
devfs on /dev (devfs, local, multilabel)
--snip--

This host uses many ZFS datasets, each with its own mount point. The mount(8) output shows selected ZFS options, such as noatime and nfsv4acls.

At the end of this output, we have a procfs(5) entry and one for a devfs(5) mount. A working FreeBSD system needs devfs mounted at /dev or it won’t work very well or for very long.

Disk Labeling

At the lowest level, operating systems identify disks by their physical attachment to the system. Traditionally, the filesystem table says something like, “Use the disk attached at ATA port 3 as the /var/log filesystem.” This worked fine with less flexible hardware, but as hardware technology improved, such connections became much more flexible. If you assign drive roles based on the physical attachment, sometimes that attachment changes. I’ve had more than one mainboard explode at an inconvenient hour, forcing a desperate emergency replacement. Tracking which cable goes to which connector under such circumstances never goes well. In older versions of FreeBSD, you needed to “wire down” devices so that a specific disk always showed up as a specific device node. This is no longer needed.

Today, a sysadmin uses on-disk labels to refer to the disk by something other than the physical attachment. A label identifies an instance of a geom. Rather than telling FreeBSD that /var/www is on the disk attached to SATA port 2, you declare that /var/www is on the disk labeled website. While the former easily goes wrong, the latter is mostly immune to sleepy hardware techs. One disk can have several labels simultaneously, if they’re different types of label. FreeBSD automatically derives many labels from inherent disk characteristics; the sysadmin can define others.

Most label types have a dedicated device node directory. Each GPT partition has a globally unique identifier (GUID), and the autocreated labels for those partitions live in /dev/gptid. Disks get unique disk IDs based on their serial number, which get entries in /dev/diskid. Manually created GPT labels appear in /dev/gpt.

Use these labels as you would any other device name. If you label the disk ada5 as stuff1, you can partition the disk stuff1 into stuff1p1 and stuff1p2, use those partitions in configuration files, and more.

Not all labels come from GEOM. ZFS uses its own internal labeling method for filesystems and pools. You can also add labels to UFS filesystems.

Don’t let swapped SATA cables ruin your weekend. Label everything.

Viewing Labels

View labels with glabel(8), a shortcut for geom label. Here are parts of a label from a small virtual machine. The labels on real hardware can quickly become very complex.

   $ glabel list
➊ Geom name: ada0p1
   Providers:
➋ 1. Name: gptid/b9c0c7c5-5b66-11e7-8aec-080027739ff6
      Mediasize: 524288 (512K)
      Sectorsize: 512
   --snip--
➌ Consumers:
   1. Name: ada0p1
      Mediasize: 524288 (512K)
      Sectorsize: 512
      Stripesize: 0
      Stripeoffset: 20480
      Mode: r0w0e0

This host has a single geom ➊ on the disk partition /dev/ada0p1. It provides an appallingly long label based on the GPT partition ID ➋. We’ll see a bunch of information on the underlying disk, such as the number of sectors on the disk, the sector size, and other information you might see in geom disk list output. This information comes from the partition, however. The physical drive information is passed up from the underlying disk.¹

This drive has a single consumer ➌, the actual underlying partition. We’re at the very bottom of this simple GEOM stack, right up against the disk, so it’s consuming itself. If you add cryptographic layers or software RAID, you’ll see what other device this geom consumes.

Sample Labels

Here are some examples of the kinds of labels you’ll see on most FreeBSD systems.

Disk ID Labels

A physical machine offers labels not available on virtual machines.

Geom name: ada3
Providers:
1. Name: diskid/DISK-WD-WCAW36477141
--snip--

The drive ada3 provides a geom called diskid/DISK-WD-WCAW36477141. The diskid geom is named after the hard drive’s serial number, based on information provided by the drive. You can remove the disk from this machine and attach it to a completely different FreeBSD host, and that new host will generate the exact same disk ID label. Using the diskid label in your configurations guarantees that FreeBSD will use the exact disk you intend. Here’s how you might list partition 3 on this disk in /etc/fstab:

/dev/diskid/DISK-WD-WCAW36477141p3 /usr/local ufs rw 2 2

This disk could attach to the host as /dev/ada3 or /dev/ada300, and FreeBSD would still mount this partition as /usr/local.

The problem with disk ID labels is that they’re painful to read and more painful to type. I’m describing them because they can appear by default, but I’d encourage you to choose a different label. Eliminate these labels from your host by setting the tunable kern.geom.label.disk_ident.enable to 0 in /boot/loader.conf.

GPT GUID Labels

Every GPT partition includes a GUID. FreeBSD can treat the GUID as a label. Here, we see a GPT ID label for partition 1 on the disk attached as ada0:

   Geom name: ada0p1
   Providers:
➊ 1. Name: gptid/075e7b89-30ed-11e7-a386-002590dbd594
   --snip--

This disk partition is conveniently available as /dev/gptid/075e7b89-30ed-11e7-a386-002590dbd594 ➊. Much like disk serial numbers, GUIDs are integral to the partition. You can move the disk to another host and still get the same GPT ID. By using the GPT ID label in configurations like /etc/fstab, you guarantee that FreeBSD uses this particular partition, rather than partition 1, on whatever device happens to get assigned ada0 at system boot.

Using a GPT ID label makes sense when you have many automatically configured disks, such as large storage arrays. On smaller systems, though, the 128-bit GUID is annoyingly long. If you decide not to use these labels, remove them from your system by setting the tunable kern.geom.label.gptid.enable to 0 in /boot/loader.conf.
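If you manage these tunables with a script, something like the following sketch keeps the file free of duplicate entries. The helper set_tunable is my own invention, and it takes the file path as an argument so you can rehearse on a scratch file before pointing it at /boot/loader.conf.

```shell
# Set NAME="VALUE" in a loader.conf-style file, replacing any earlier
# setting of the same name. set_tunable is a hypothetical helper name.
set_tunable() {
    file=$1
    name=$2
    value=$3
    touch "$file"
    # Copy every line that doesn't already set this tunable
    awk -v n="$name=" 'index($0, n) != 1' "$file" > "$file.new"
    printf '%s="%s"\n' "$name" "$value" >> "$file.new"
    mv "$file.new" "$file"
}

# Rehearse on a scratch file rather than the real /boot/loader.conf
scratch=$(mktemp)
set_tunable "$scratch" kern.geom.label.disk_ident.enable 0
set_tunable "$scratch" kern.geom.label.gptid.enable 0
cat "$scratch"
rm -f "$scratch"
```

Both settings are read from /boot/loader.conf, so they take effect at the next boot.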

For most hosts, I recommend assigning GPT labels.

GPT Labels

GPT partitions let you manually assign a label name within the partition table. I highly recommend doing so whenever possible. Here’s a partition that I assigned a name:

   Geom name: ada2p1
   Providers:
➊ 1. Name: gpt/swap2
   --snip--

I’ve assigned the label swap2 ➊ to partition 1 on disk ada2. This label is physically stored on the disk partition. I can use this label in my configurations just like any other device name. Using manually assigned labels is much more manageable for small systems, as this /etc/fstab shows:

/dev/gpt/swap2 none swap sw 0 0

An assigned label is much more human-friendly than a long serial number or GUID. If you have the choice, I encourage you to label GPT partitions. We’ll assign labels when we partition disks.

GEOM Labels

In addition to spilling the standard labels on your system, the glabel(8) command lets you configure GEOM labels. A GEOM label is specific to FreeBSD’s GEOM infrastructure and appears in /dev/label. Create GEOM labels with the glabel label command, which takes the label name first and the device second. Here, I apply the GEOM label root to the GPT partition da0p1:

# glabel label root da0p1

There’s also a glabel create command, but those labels disappear at system reboot.

GEOM Withering

A provider can have multiple labels. One partition might have a label based on the disk ID of the underlying storage device (/dev/diskid/somethinglong), a GPT ID (/dev/gptid/somethingevenlonger), a manually assigned label (/dev/gpt/swap0), and a device node based on the underlying device’s attachment point (/dev/ada0p1). While any number of processes can look at a disk device simultaneously, many disk operations—such as mounting a partition—require exclusive, dedicated control of the device.

To prevent accessing geoms by multiple names, when you access a device by one label, the kernel removes the unused labels. This is called withering. If I, say, mount a swap partition using the GPT label /dev/gpt/swap0, all the other labels for that partition disappear from /dev. Anyone who tries to access the corresponding /dev/gptid partition will find that the device node is missing.

Once all exclusive locks on a device are removed, the kernel de-withers the other device labels. If I deactivate that swap space, the GPT ID and raw device name reappear.

The gpart(8) Command

Like many operating systems, FreeBSD once had specific partitioning tools for each partitioning scheme. Today, all disk partitioning functions, for MBR and GPT alike, are included in the gpart(8) program. Embedded devices with specialized storage might occasionally need older tools like fdisk(8) and bsdlabel(8), but gpart(8) works perfectly well for servers and desktops.

This common tool means you perform many functions the same way no matter which partitioning scheme you’re using. For example, no matter whether you’re working with the MBR or GPT scheme, you’ll need a way to indicate a particular partition. Both schemes let you indicate a partition with -i and the partition number.

Viewing and deleting partitions are great examples of common functions.

Viewing Partitions

Use gpart show to see a brief summary of all GPT and MBR partitions on a geom. Give the name of a geom as an argument to see only the partitions on that geom. The output from gpart show doesn’t look that different from fdisk(8) and other more traditional disk management tools. Here, I look at a storage device by its traditional device node, but I could use diskid or gptid or any other label:

   $ gpart show ada0
➊ =>        40  1953525088  ada0  GPT  (932G)
➋           40        1024     1  freebsd-boot  (512K)
➌         1064         984        - free -  (492K)
➍         2048     4194304     2  freebsd-swap  (2.0G)
➎      4196352  1949327360     3  freebsd-zfs  (930G)
➏   1953523712        1416        - free -  (708K)

The first column gives the first block in the partition; the second, the partition size in blocks. The third gives the partition number, while the fourth gives the partition type. (We’ll discuss partition types later this chapter: for the moment, just go with the flow.) At the end, we have the size of each entry in human-friendly units.

Our first entry begins on the disk’s sector number 40 and fills almost two billion sectors ➊. The third field shows that this isn’t a partition on the disk, but rather an entry for the entire disk. The fourth field gives the partitioning scheme used. This is a GPT disk. The entire disk is about 932GB.

The second entry also starts on sector 40, and it fills 1,024 sectors ➋. This is partition 1, and it’s of type freebsd-boot. If we want to boot off this disk, we need a boot loader on this partition.

The third entry begins on sector 1,064 and fills 984 sectors ➌. Why 1,064? The first partition started on sector 40 and filled 1,024 sectors, so everything before sector 1,064 (40 + 1,024) is already spoken for. This entry has no partition number, and its type is - free -. It isn’t a partition at all, but unallocated space that pushes the next partition out to sector 2,048, a 1MB boundary that keeps it aligned on disks with 4K sectors.

The fourth entry is swap space, according to the partition type ➍. It begins on sector 2,048, is 4,194,304 sectors long, and is partition 2. You’ll often see swap space near the beginning of a disk, a hangover from the days when partition placement on the disk impacted performance. If you’re using a virtual machine, however, putting the swap near the beginning of the disk leaves you room to expand a partition at the end of the disk.

The fifth entry is a FreeBSD ZFS filesystem, starting in sector 4,196,352 and going on for about 1.9 billion sectors ➎. This freebsd-zfs partition has our data.

The very end of the disk has 1,416 free sectors ➏. That isn’t enough room to grow the last partition while keeping it aligned to 1MB boundaries.
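If you want to check that walk-through yourself, a few lines of shell arithmetic will do it. This is only a sketch: the helper names next_start and sectors_to_kib are mine, and it assumes 512-byte sectors, which is what the sizes in this output imply.

```sh
#!/bin/sh
# Sector numbers are copied from the gpart show ada0 output above.
# Assumption: 512-byte sectors.
SECTOR_BYTES=512

# next_start START SIZE: the first sector after an entry
next_start() {
    echo $(( $1 + $2 ))
}

# sectors_to_kib SIZE: an entry's size in KB (1,024-byte units)
sectors_to_kib() {
    echo $(( $1 * SECTOR_BYTES / 1024 ))
}

next_start 40 1024        # 1064: the boot partition ends where the free gap begins
next_start 1064 984       # 2048: the gap ends where swap begins
next_start 2048 4194304   # 4196352: swap ends where ZFS begins
sectors_to_kib 1024       # 512: the boot partition in KB
sectors_to_kib 1416       # 708: the trailing free space in KB
```

Every entry starts exactly where the previous one ends, which is a handy sanity check when you’re reading an unfamiliar partition table.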

An MBR disk looks much like a GPT disk.

Other Views

Add command line flags to modify the output of gpart show.

You can assemble each partition’s device node from the underlying device name and the partition number. If you want to see the device node rather than the partition number, add the -p flag.

To replace the partition type with the partition label, use -l.

Here, I show both the device node and the labels on this disk:

$ gpart show -pl ada0
=>        40  1953525088    ada0  GPT  (932G)
          40        1024  ada0p1  gptboot0  (512K)
        1064         984          - free -  (492K)
        2048     4194304  ada0p2  swap0  (2.0G)
     4196352  1949327360  ada0p3  zfs0  (930G)
  1953523712        1416          - free -  (708K)

The partition number now contains complete device names, like ada0p3. Rather than the GPT partition type, you get the label applied to the GPT partition, such as swap0 and zfs0.

To see the human-hostile GPT partition type rather than the name FreeBSD presents, use -r. I mostly use this when examining disks from other operating systems. It’s possible that FreeBSD will label multiple partition types as being type ntfs; while that’s good enough for most uses, if I’m doing digital forensics, the precise partition type might be extremely important.

To see a more detailed description of your GPT partitions, use gpart list. This creates output much like glabel list or other GEOM class commands.

Removing Partitions

Maybe you screw up when creating your partitions and need to remove one. No, you haven’t created partitions yet, in either MBR or GPT, but the process you follow is the same either way. Delete partitions by number.

Take a look at the partition table in the previous section. We have partitions for boot, swap, and ZFS. Maybe you don’t want swap space on your boot drive. Remove that partition with the gpart delete command. Use the -i flag and the number of the partition you want to remove. The gpart show command said the swap space was partition 2. Let’s remove it.

# gpart delete -i 2 ada0
ada0p2 deleted

You can now resize your ZFS partition to use that space. How you resize a partition varies with the partitioning scheme.

Scheming Disks

No, not the sort of scheming where the disk deliberately lies to you. We’re talking about the disk’s partitioning scheme. Destruction is easier than creation, in both meatspace and with storage. Before you can partition a disk, you need to assign it a partitioning scheme.

Removing the Disk Partitioning Scheme

You could go through and painstakingly delete every partition on the disk and then obliterate the partitioning scheme. That’s a bunch of work, though. It’s much simpler to just trash the entire disk partition table.

You can’t erase a disk with mounted partitions. Unmount those partitions first, and remove them from any ZFS pools. Once the disk is truly unused, erase any existing partitioning table with gpart destroy.

# gpart destroy da3
da3 destroyed

If the command returns immediately, the disk had no partitions. It might have had a partition scheme, but no partitions. If you get a “device busy” error, either the disk is still in use or the disk has partitions. You could methodically delete all existing partitions with gpart delete and then destroy the partitioning scheme, but it’s easier to burn the existing scheme to the ground by adding -F.

# gpart destroy -F da3

This forcibly erases all partitions and the partitioning scheme. Running gpart show da3 will show that there’s no partition table. You can now create new disk partitions.

Assigning the Partitioning Scheme

Before you can create disk partitions, you need to mark the disk with the type of partitioning scheme you’ll be using. Use gpart create with the -s flag and the scheme, such as gpt or mbr. Here, I mark a disk as using the GPT scheme:

# gpart create -s gpt da3

Use gpart show to verify that the disk now has a GPT partition table. You can now add GPT partitions or recreate the partition table with MBR and add those partitions. But we’ll start by diving deep into GPT.

The GPT Partitioning Scheme

The GUID Partition Table, or GPT, is the modern standard for hard drive partitioning. This is the recommended standard for new installations. Always use the GPT partitioning scheme unless you have a deeply compelling reason not to, such as a lack of hardware support.

GPT supports disks up to 9.4ZB. One zettabyte is one billion terabytes. While our technology will eventually outgrow 9.4ZB, I expect GPT will last the rest of my career.
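Where does 9.4ZB come from? Here’s a back-of-the-envelope sketch. It assumes GPT’s 64-bit sector numbers and 512-byte sectors; the function name gpt_limit_zb is made up for this example, and I use awk because the number overflows 64-bit shell arithmetic.

```sh
#!/bin/sh
# gpt_limit_zb SECTOR_BYTES: largest addressable disk, in zettabytes (10^21 bytes)
# Assumption: sectors are numbered with 64 bits, so there can be 2^64 of them.
gpt_limit_zb() {
    awk -v s="$1" 'BEGIN { printf "%.1f\n", (2 ^ 64) * s / 1e21 }'
}

gpt_limit_zb 512    # prints 9.4
```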

FreeBSD’s GPT implementation currently supports 128 partitions. Each partition gets assigned a GUID, which is a 128-bit number displayed as 32 hexadecimal characters. While GUIDs aren’t guaranteed to be truly unique across all of civilization, they’re certainly going to be unique within your organization.

Most modern operating systems support GPT and its predecessor, the master boot record (MBR). MBR-based systems put partition records in the first sector on the disk. If a host supports only MBR, but the first sector of a disk contains something that isn’t an MBR, the system gets confused and might refuse to boot. The GPT scheme puts a protective master boot record (PMBR) in the first sector of every disk. The PMBR indicates that the disk contains one MBR partition of type GPT. The second sector contains the actual GUID Partition Table. GPT also puts a backup copy of the partition table on the last sector of the disk so you can more easily recover from damage.

GPT requires allocating a partition for bootstrap code. The PMBR boot code searches the disk for a FreeBSD boot partition. This boot partition must be larger than the boot code, smaller than 545KB, and reserved for the FreeBSD boot loader. FreeBSD has two GPT boot loaders, gptboot(8) and gptzfsboot(8). You must install one of these on the boot partition.

Use gptboot(8) to start UFS-based systems. At system boot, gptboot searches for a FreeBSD partition marked with the bootme or bootonce attributes.

Use gptzfsboot(8) on systems running ZFS.

Use gpart(8) and its many subcommands to view, create, edit, and destroy GPT partitions.

GPT Device Nodes

Each disk partition has a device node. GPT partition device nodes are an extension of the geom they’re built on, indicated by the letter p and the partition number. If you’ve created GPT partitions directly on the disk ada0, the first partition will be /dev/ada0p1, the second /dev/ada0p2, and so on.

Many systems put their partitions on an upper-layer geom. One of my systems uses SATA RAID and offers the disk as /dev/raid/r0. The partitions on this drive are /dev/raid/r0p1, /dev/raid/r0p2, and so on. You might also put partitions on a device by its GUID or disk ID, giving you partitions like /dev/diskid/DISK-WD-WCAW36477062p1.

GPT Partition Types

When you create a GPT partition, you must mark it with a partition type. The type indicates the partition’s intended use. FreeBSD makes decisions based on the partition types, so assign them correctly.

Strictly speaking, a partition type is another 128-bit GUID. FreeBSD marks GUIDs used as partition types with a leading exclamation point, such as !516e7cb5-6ecf-11d6-8ff8-00022d09712b. These partition types are common across all operating systems, but most OSs provide human-friendly names for these human-hostile GUIDs. This book uses the human-friendly names; check gpart(8) for the human-hostile ones.

The most common partition types you’ll see on a FreeBSD system include the following:

freebsd-boot FreeBSD boot loader

freebsd-ufs FreeBSD UFS filesystem

freebsd-zfs FreeBSD ZFS filesystem

freebsd-swap FreeBSD swap partition

efi An EFI system partition, used to boot from EFI

You might also see these GPT partition types. Don’t use them in modern FreeBSD, but know that their presence might help you identify just what that weird disk is and how to crack it open.

freebsd A GPT partition that’s divided into bsdlabel(8) partitions

freebsd-vinum A partition controlled by gvinum(8)

mbr A partition subdivided into MBR partitions

ntfs A partition containing a Microsoft NTFS filesystem

fat16, fat32 Partitions containing FAT

For a complete listing of recognized partition types, see gpart(8).

Creating GPT Partitions

Partitioning disks is easy: figure out which partitions you want, create them, and go. The tricky part is living with your partitioning. Before creating partitions, decide what you’re going to do with this disk. How much space do you have? How do you want to divide it? Before you start creating partitions, write down exactly what you want to achieve.

Here, I’m manually partitioning a 1TB disk for a UFS FreeBSD install. It’ll need a 512KB boot partition (type freebsd-boot) and 8GB for swap (type freebsd-swap). The other partitions will be type freebsd-ufs: 5GB for root, 5GB for /tmp, 100GB for /var, and the rest for /usr. I’ll label each partition for its intended role.

Create partitions with gpart(8). Use the -t flag to specify the partition type, -s to give the size, and -l to assign a GPT label to the new partition. I’ll start with the boot partition.

# gpart add -t freebsd-boot -l boot -s 512K da3
da3p1 added

Use gpart show to check your work. Add the -l flag to see the GPT label.

# gpart show -l da3
=>        40  1953525088  da3  GPT  (932G)
          40        1024    1  boot  (512K)
        1064  1953524064       - free -  (932G)

This disk has one partition, a 512K partition labeled boot. The command succeeded. Now add the swap space.

# gpart add -a 1m -t freebsd-swap -s 8g -l swap da3
da3p2 added

This command is much like the one to add the boot partition: we give the partition type, size, and label.

Hang on, though: what’s this -a 1m thing? The -a flag lets you set a partition alignment, enabling you to set where partitions can begin and end relative to the beginning of the disk. Remember back at the beginning of this chapter when I discussed that misaligning a filesystem with the physical sectors on a 4K disk could cause problems? The -a 1m tells gpart to create the partition on an even multiple of 1MB from the beginning of the disk. You’ll have some empty space between partitions 1 and 2, as we saw in “Viewing Partitions” earlier in this chapter, but that’s okay. That gives you room to change that partition to support UEFI if necessary (see “Unified Extensible Firmware Interface and GPT” later in this chapter).
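The rounding that alignment implies is easy to sketch. This is my own illustration of the arithmetic, not gpart’s code: align_up is a hypothetical helper, and it assumes 512-byte sectors, so 1MB is 2,048 sectors.

```sh
#!/bin/sh
# align_up SECTOR ALIGN_SECTORS: round SECTOR up to the next multiple of ALIGN_SECTORS
align_up() {
    echo $(( ($1 + $2 - 1) / $2 * $2 ))
}

ONE_MB_SECTORS=$(( 1024 * 1024 / 512 ))   # 2048 sectors of 512 bytes

# The boot partition ends at sector 1064, so an aligned partition starts at 2048.
align_up 1064 "$ONE_MB_SECTORS"   # prints 2048

# A sector that already sits on a boundary stays put.
align_up 2048 "$ONE_MB_SECTORS"   # prints 2048
```

The 984 sectors between 1,064 and 2,048 are the free gap you see after the boot partition.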

Retain that 1MB alignment as you create the 5GB root and /tmp partitions and the 100GB /var partition.

# gpart add -a 1m -t freebsd-ufs -s 5g -l root da3
da3p3 added
# gpart add -a 1m -t freebsd-ufs -s 5g -l tmp da3
da3p4 added
# gpart add -a 1m -t freebsd-ufs -s 100g -l var da3
da3p5 added

When you create the last partition, don’t give a size. This tells gpart to make the partition as large as possible.

# gpart add -a 1m -t freebsd-ufs -l usr da3
da3p6 added

You have partitioned the disk, and it’s ready for your install.
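If you’d rather see the whole plan before touching the disk, one option is a dry run that prints the commands instead of running them. The sketch below is exactly that: plan_commands is a hypothetical helper of mine, the plan format (type, label, size, with - meaning “use the rest”) is invented for this example, and it leaves out the boot partition because that one was created without -a 1m.

```sh
#!/bin/sh
# plan_commands DISK: read "type label size" lines on stdin and print
# the matching gpart add commands. Nothing here touches a disk.
plan_commands() {
    disk=$1
    while read -r type label size; do
        [ -z "$type" ] && continue
        if [ "$size" = "-" ]; then
            # No size: gpart makes the partition as large as possible.
            echo "gpart add -a 1m -t $type -l $label $disk"
        else
            echo "gpart add -a 1m -t $type -s $size -l $label $disk"
        fi
    done
}

plan_commands da3 <<'EOF'
freebsd-swap swap 8g
freebsd-ufs root 5g
freebsd-ufs tmp 5g
freebsd-ufs var 100g
freebsd-ufs usr -
EOF
```

Read the printed commands, compare them with what you wrote down, and only then run them by hand.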

Resizing GPT Partitions

On second thought, perhaps having a huge /usr partition isn’t wise. A /usr partition of 100GB or so would have all the room you might desire for operating system files, while leaving several hundred gigabytes for an isolated /home partition. I trust most of my users, but a few² are just the sort to dump /dev/random into a file until they absorb all available space. Here, I’ll resize /usr to create space for /home.

Use gpart resize to change the size of a partition. You must know the target partition’s partition number. Running gpart show da3 tells us that /usr is partition 6. Use the -i flag and the partition number to resize a partition.

# gpart resize -i 6 -s 100g -a 1m da3
da3p6 resized

Run gpart show to see the new partition size.

# gpart show da3
--snip--
   247465984   209715200    6  freebsd-ufs  (100G)
   457181184  1496343944       - free -  (714G)

This disk has 714GB free at the end. We can now create a spacious /home for all our troublesome users.
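Those sector numbers follow directly from the sizes we asked for. Here’s a rough sketch of the sums, assuming 512-byte sectors and that swap begins at sector 2,048 as on the earlier disk; gib_to_sectors and end_of_run are names I made up for this example.

```sh
#!/bin/sh
# gib_to_sectors GB: a size in GB (2^30 bytes) as a count of 512-byte sectors
gib_to_sectors() {
    echo $(( $1 * 1024 * 1024 * 1024 / 512 ))
}

# end_of_run START GB...: the sector following a run of back-to-back partitions
end_of_run() {
    pos=$1
    shift
    for gib in "$@"; do
        pos=$(( pos + $(gib_to_sectors "$gib") ))
    done
    echo "$pos"
}

gib_to_sectors 100               # 209715200, the size shown for partition 6
end_of_run 2048 8 5 5 100        # 247465984, where partition 6 starts
end_of_run 2048 8 5 5 100 100    # 457181184, where the free space starts
```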

Each partition is assigned specific sectors on the disk. You can’t increase the size of a partition unless free space sits directly after it. While this sample disk has a bunch of free space after partition 6, you can’t use it to increase the size of partitions 1 through 5. You must delete and recreate partitions.

Changing the size of a partition doesn’t change the size of the filesystem on that partition. Shrinking a partition with a filesystem will chop off part of the filesystem. Increasing the partition size won’t expand the filesystem. Both UFS and ZFS have tools to handle increased partition sizes, but you must handle that as a separate process.

Changing Labels and Types

You can modify a GPT partition’s type or GPT label with the gpart modify command. Give the partition number with -i. Use -l to give the new label. Here, I change the GPT label on partition 2 of disk vtbd0:

# gpart modify -i 2 -l rootfs vtbd0

Similarly, change the type of partition with -t:

# gpart modify -i 2 -t freebsd-zfs vtbd0

The disk’s GPT table now declares that partition 2 is labeled rootfs and is of type freebsd-zfs.

Booting on Legacy Hardware

Older hardware expects to see a master boot record at the start of the disk and won’t recognize a GPT partition table. FreeBSD uses a protective MBR (PMBR) to give legacy hardware a recognizable partition table and help that hardware boot a GPT-partitioned disk. A bootable disk formatted with GPT needs both a protective MBR and a GPT boot loader.

Install a PMBR with the gpart bootcode command and the -b flag. FreeBSD provides a PMBR as /boot/pmbr.

# gpart bootcode -b /boot/pmbr da3
bootcode written to da3

This disk will no longer confuse hosts that look for an MBR.

You also need a boot loader. UFS hosts need the gptboot boot loader, while ZFS hosts need gptzfsboot. For convenience, FreeBSD provides a copy of each in the /boot directory. These files aren’t what the disk boots from; they’re the copies of the boot loaders that match the installed version of FreeBSD. Install the selected boot loader with the -p flag to gpart bootcode. Use the -i option to tell gpart(8) which partition to copy the boot loader to. The sample disk we used in the last section had partition 1 as type freebsd-boot, so we’ll use that.

# gpart bootcode -p /boot/gptboot -i 1 da3
partcode written to da3p1

You can combine -p and -b into a single command.

Unified Extensible Firmware Interface and GPT

The Unified Extensible Firmware Interface (UEFI) is a newer standard for booting amd64 hardware without using BIOS emulation. FreeBSD 10 and later have early support for UEFI booting to UFS, while FreeBSD 11 can boot ZFS off of UEFI.

UEFI uses a partition of type efi, which must be 800KB or larger. On a new disk, assign the scheme with gpart create, and then add the efi partition with gpart add.

# gpart create -s gpt da0
# gpart add -t efi -s 800K da0

FreeBSD provides an efi partition image as /boot/boot1.efifat. Copy that to the new boot partition with dd(1).

# dd if=/boot/boot1.efifat of=/dev/da0p1

Partition the rest of the disk as you desire.

An efi partition is actually a FAT filesystem with a very specific directory hierarchy. Feel free to mount the file boot1.efifat and explore it.

Expanding GPT Disks

We’ve seen how to expand a partition, but what about a disk? Expanding disks often happens with virtual hosts. Expand a virtual disk, and gpart(8) will complain that the disk’s GPT is invalid. GPT and GEOM store information in the first and last sectors of the disk. Expanding a virtual disk means adding sectors. The new last sector will be empty. Create a new metadata block for the last sector with gpart recover.

# gpart recover vtbd0

You can now create or expand partitions on the expanded virtual disk.

Now that you have a handle on GPT partitions, let’s look at MBR and see why GPT seemed like such an improvement.

The MBR Partitioning Scheme

Old hardware, or new but small hardware, might need master boot record partitioning on its disks. Intel-style hardware has used MBR partitions for decades, and millions of devices running a plethora of operating systems use it. The MBR scheme works only on disks of 2TB or smaller. Larger disks must use GPT partitioning.
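The 2TB ceiling falls out of simple arithmetic. The sketch below assumes the usual explanation, which this chapter doesn’t spell out: an MBR entry records sector numbers in 32 bits, and sectors are 512 bytes. The name mbr_limit_bytes is hypothetical.

```sh
#!/bin/sh
# mbr_limit_bytes SECTOR_BYTES: the most an MBR can address
# Assumption: 32-bit sector numbers, so at most 2^32 = 4294967296 sectors.
mbr_limit_bytes() {
    echo $(( 4294967296 * $1 ))
}

mbr_limit_bytes 512                                             # 2199023255552 bytes
echo $(( $(mbr_limit_bytes 512) / 1024 / 1024 / 1024 / 1024 ))  # 2 (TB)
```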

What Is the Master Boot Record?

The master boot record (MBR) is a data structure that takes up the first 512 bytes of a traditional disk, also known as Sector 0. The MBR contains partition information and a boot loader to allow the BIOS to find the operating system. The term MBR might refer to the actual first sector on the disk or the partition scheme used by that format.

A master boot record describes four primary partitions, called slices in the BSD community. Each slice description includes the disk sectors included in the partition and the type of filesystem expected on that slice. If a disk has only one slice on it, the MBR still lists four slices, but three of those slices have no sectors assigned to them. While the MBR format supports a linked list of up to 20 extended partitions, FreeBSD doesn’t need them thanks to BSD labels.

One of the four primary slices is considered active. When the system powers on, the bootstrap code looks for the active slice and tries to boot it.

The MBR sector also contains bootstrap code. You don’t need to allocate space specifically for a boot loader. In FreeBSD, the bootstrap code finds and executes the kernel. FreeBSD includes two different boot loaders, mbr and boot0. The mbr loader is for a host with a single operating system. If you have multiple operating systems installed on your hardware, use the boot0 loader—or, better still, dedicate your host to FreeBSD and virtualize the other operating systems.

The main function of a slice is to contain a bsdlabel(8) partition.

BSD Labels

BSD existed before either the MBR or the IBM PC. BSD used its own disk partition format, called a disklabel. Now that labeling disks is much more common, disklabels are also called BSD labels or bsdlabels. (If you want to start a spirited discussion, ask a room of FreeBSD developers which is more correct.) BSD systems had several partitions including at least / (root), /usr, /var, /tmp, and swap space, plus separate partitions for whatever actual work the system did.

When BSD was ported to the i386 platform, they could have switched disks to using MBR partitions. With extended MBR partitions, one disk could have had up to 24 partitions. Disklabel partitions were embedded throughout the kernel, however, often in icky places that nobody dared touch. The porting group decided to treat an MBR slice as a BSD disk and to partition each slice with a BSD disklabel. Sysadmins needed to create MBR partitions and then nest disklabel partitions inside those MBR partitions.³

This worked but also made the word partition ambiguous. Does partition mean an MBR partition or a disklabel partition? FreeBSD dusted off the word slices for MBR partitions. Each MBR slice will have its own disklabel, listing the BSD partitions contained within the slice. If you come from a Linux or Microsoft Windows background, the MBR partitions you’re familiar with are called slices over here.

You can’t label slices or disklabel partitions. These formats have no space for labels. Instead, label the ZFS or UFS filesystem on the partition.

It’s possible to skip slicing a disk, instead installing a disklabel directly on the hard drive. Some hardware refused to boot from such disks, so they’re called dangerously dedicated. With the advent of GPT, dangerously dedicated disks aren’t really used any more.

MBR Device Nodes

Every disk, slice, and partition has a device node. The slice device node is an extension of the underlying disk, and the partition device node is an extension of the slice’s node. Here are the device nodes on disk ada0 of an MBR-based system:

/dev/ada0 /dev/ada0s1a /dev/ada0s1d
/dev/ada0s1 /dev/ada0s1b /dev/ada0s1e

The first subdivision of the disk is the slice. Device nodes indicate a slice with the letter s and a number from 1 to 4. The first slice is s1, the second is s2, and so on. Unused MBR partitions don’t get device nodes. Here, /dev/ada0s1 is slice 1 on the disk.

The second layer of subdivision is the disklabel partition inside the slice. Each partition has a unique device node name created by adding a letter to the slice’s device node. Here, we have four disklabel partitions, /dev/ada0s1a through /dev/ada0s1e. Traditionally, the node ending in a (/dev/ada0s1a) is the root partition, while the node ending in b (/dev/ada0s1b) is swap space.

Note that the list of device nodes doesn’t use the letter c. The c partition represents the entire slice. These days, you run disk partitioning tools on the slice entry rather than the disklabel for the slice.

Assign partitions d through h any way you like. A default disklabel can have up to seven usable partitions. With up to four slices on each drive, you can have up to 28 partitions on a drive. A disklabel can support up to 20 partitions, but you must indicate you want extra partitions when first creating the label.
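The naming and the counting can both be sketched in a couple of lines. These helpers, disklabel_node and usable_partitions, are hypothetical names for illustration only; they just restate the rules above.

```sh
#!/bin/sh
# disklabel_node DISK SLICE LETTER: assemble a device node name
disklabel_node() {
    echo "/dev/${1}s${2}${3}"
}

# usable_partitions SLOTS SLICES: every label gives up one slot to c
usable_partitions() {
    echo $(( ($1 - 1) * $2 ))
}

disklabel_node ada0 1 a    # /dev/ada0s1a, traditionally root
disklabel_node ada0 1 b    # /dev/ada0s1b, traditionally swap
usable_partitions 8 4      # 28 on a drive with four default disklabels
```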

MBR and Disklabel Alignment

Slices have their own disk sector and filesystem block alignment issues. Traditionally, MBR partitions end on a cylinder boundary. Cylinder boundaries don’t mean anything on modern hardware, but even newer drives provide them as a comforting lie for older or less capable hardware. If you create MBR partitions that don’t end on a cylinder boundary, and you put that disk in a machine that requires respecting cylinder boundaries, the machine will have some sort of nervous breakdown. A disk you slice today could theoretically find its way into an older system. FreeBSD therefore arranges slices so that they end on cylinder boundaries. Cylinder boundaries not only can but probably do conflict with 4K disk sector sizes. If nothing else, the MBR itself takes up the first cylinder, or sixty-three 512-byte sectors!
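To see why a slice that starts right after those 63 sectors clashes with 4K disks, check whether its byte offset divides evenly by 4,096. This is a minimal sketch with a made-up helper name, is_4k_aligned, assuming the disk reports 512-byte sectors.

```sh
#!/bin/sh
# is_4k_aligned SECTOR: does this 512-byte sector number fall on a 4K boundary?
is_4k_aligned() {
    if [ $(( $1 * 512 % 4096 )) -eq 0 ]; then
        echo yes
    else
        echo no
    fi
}

is_4k_aligned 63      # no: 63 * 512 = 32256 bytes, which isn't a multiple of 4096
is_4k_aligned 2048    # yes: exactly 1MB into the disk
```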

Fortunately, you rarely write to slice tables, and the performance of writing slice tables is rarely an issue. If you align your disklabel partitions within a slice to 1MB boundaries, you’ll lose a few sectors between the slice partition table and the disklabel partition, but you’ll have proper performance.

So: align disklabel partitions. Don’t align slices.

Creating Slices

Use gpart(8) to manage MBR slices. To create a slice, you need a partition type and a size. FreeBSD slices use type freebsd. If you don’t specify a size, gpart(8) uses all available space. On an empty disk, this dedicates the whole disk to a single slice.

Here, I erase the existing partitioning, tell the disk to use the MBR scheme, and create a single FreeBSD slice:

# gpart destroy -F ada3
# gpart create -s mbr ada3
# gpart add -t freebsd ada3
ada3s1 added

Run gpart show and you’ll see that this disk now has a single slice. Add the -p flag to see the slice’s device node.

# gpart show -p ada3
=>        63  1953525105    ada3  MBR  (932G)
          63  1953525105  ada3s1  freebsd  (932G)

Our slice ada3s1 is now ready for disklabel partitions.

To create multiple slices, specify a size with -s. A common configuration for small embedded systems is to put three slices on a disk. Two smaller slices contain different versions of the operating system, while the third contains any data. Here, starting again from an empty MBR partition table, I divide this 1TB disk into two 150GB slices and give the rest to a third slice:

# gpart add -s 150g -t freebsd ada3
ada3s1 added
# gpart add -s 150g -t freebsd ada3
ada3s2 added
# gpart add -t freebsd ada3
ada3s3 added

Removing Slices

Use gpart delete to remove unwanted slices. Give the slice number with -i. Here, I remove the third, larger slice from our multislice disk created in the last section:

# gpart delete -i 3 ada3
ada3s3 deleted

Activating Slices

The active slice is the one that the BIOS tries to boot. Set the active slice with gpart set and the -a active flag. Use -i to give the number of the active slice.

# gpart set -a active -i 1 ada3

Change which slice gets booted by setting a different active slice.

The boot disk also needs a boot loader. While the MBR boot loader is different from the GPT or UEFI boot loaders, it uses the same gpart(8) -b flag. FreeBSD provides a copy of the MBR boot loader as /boot/mbr.

# gpart bootcode -b /boot/mbr ada3

Slice 1 on disk ada3 is now bootable. Now that you’ve sliced your disk, you can create BSD labels inside the slices.

BSD Labels

Creating BSD label (or disklabel) partitions inside a slice is much like creating slices or GPT partitions. You must tell the storage device the scheme to be used, create and remove partitions until you’re satisfied with them, and install a boot loader.

Creating a BSD Label

Where GPT and MBR specifically provide space for partition tables, you must create a BSD label and write it to the beginning of the slice. As with any scheme, use -s and the name of the scheme. Install this scheme on the slice, not on the disk.

Suppose you want to create a BSD label on the slice ada3s1. Use the BSD scheme.

# gpart create -s bsd ada3s1
ada3s1 created

This is a default disklabel, with room for 8 disklabel partitions. You can increase the number of partitions, up to 20, by using the -n flag. Here, I create a label with room for 20 partitions on ada3s3, the large slice.

# gpart create -n 20 -s bsd ada3s3
ada3s3 created

There are no actual disklabel partitions on this slice; there’s merely a label that can contain disklabel partitions. Now that the label exists, you can create those partitions.

Creating BSD Label Partitions

Before blindly entering partitioning commands, plan how to partition the disk. Figuring things out on paper beforehand is much easier than figuring them out at the command line. I’m going to partition the first 150GB slice on this disk for UFS filesystems. This slice will get 5GB partitions for / (root), swap, and /tmp. The rest will go to /usr. Why no /var? I’ll dedicate the big slice, ada3s3, to /var. I don’t need to add a boot partition because MBR disks don’t need one.

To create a disklabel partition, you must specify the type with -t and the size with -s—exactly as you would for GPT partitions. FreeBSD UFS filesystems are of type freebsd-ufs. Let’s start with the root partition.

# gpart add -t freebsd-ufs -s 5g -a 1m ada3s1
ada3s1a added

To view this partition, you must give gpart show the slice device, not the disk device. Using the disk device displays the slices.

# gpart show ada3s1
=>        0  314572800  ada3s1  BSD  (150G)
          0       1985          - free -  (993K)
       1985   10485760       1  freebsd-ufs  (5.0G)
   10487745  304085055          - free -  (145G)

The third line of output shows our 5GB partition.

At the very beginning of this slice, we have 1,985 free blocks, or 993KB. I requested that the partition be aligned to 1MB boundaries, so gpart wasted a bit of space to meet that request. I’ll happily lose that 993KB, rather than halve the system’s performance.
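Why 1,985 blocks and not some rounder number? The figures fit if the alignment is measured from the start of the disk rather than the start of the slice. Here’s a sketch under that assumption, using the slice start of sector 63 from the earlier gpart show -p ada3 output; gap_to_alignment is a name I invented.

```sh
#!/bin/sh
# gap_to_alignment SLICE_START ALIGN_SECTORS: sectors to skip inside the slice
# so that the first partition lands on an absolute ALIGN_SECTORS boundary.
# Assumption: alignment is counted from sector 0 of the disk.
gap_to_alignment() {
    echo $(( ($2 - $1 % $2) % $2 ))
}

gap_to_alignment 63 2048           # 1985, since 63 + 1985 = 2048
echo $(( 1985 * 512 ))             # 1016320 bytes, which gpart rounds to 993K
```

This only accounts for the alignment gap. A slice that already began on a 1MB boundary would still need to leave room at its start for the label itself.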

Now create the swap partition of type freebsd-swap.

# gpart add -t freebsd-swap -s 5g -a 1m ada3s1
ada3s1b added

The 5GB /tmp comes next. Then, I dump the rest of the space into a partition for /usr by omitting the size.

# gpart add -t freebsd-ufs -s 5g -a 1m ada3s1
ada3s1d added
# gpart add -t freebsd-ufs -a 1m ada3s1
ada3s1e added

A gpart show reveals our disklabel partitions have wasted 63 blocks, or 32KB, at the end of the slice. Watch me not care.

These partitions are now ready to receive filesystems. We discuss UFS in Chapter 11.

Assigning Specific Partition Letters

On a traditional BSD label, the a partition is for the root filesystem, while b is for swap. The c partition represents the entire slice. This isn’t mandatory, but I recommend not using any of these letters for any other purpose.

Why is this important? I once added a hard drive to a server so that we had more space for a database. We moved the database software to partition a and the actual data to partition b.⁴ When I went on vacation a few months later, the system ran short on virtual memory. I got a call from a sysadmin who had found and activated the unconfigured swap space on the new drive—but now the database data was missing. Yes, the company lost several customers and many thousands of dollars of revenue, which is sad—but more importantly, it ruined one day of my vacation and cast a shadow over the rest. This was unacceptable.

Don’t bother fighting these traditions, especially on a decreasingly common disk format. Don’t use the letters a, b, or c for partitions other than those decreed by the Berkeley elders.

The gpart program is designed to work with partition numbers, not letters. When you’re creating disklabels, however, gpart add maps index numbers onto letters. Partition 1 is a, partition 2 is b, and so on. By specifying a partition index when you create the partition, you assign the letter to the partition.

If you don’t specify a partition number, gpart add assigns partition letters starting with a. You might assign your first partition number 18, but if you don’t specify a number for the next partition, it’ll wind up getting partition a. To avoid using a, b, or c, use a number for every partition you create. You can use letters only up to the number of slots the disklabel has. A standard disklabel can use only letters a through h, while a 20-partition label can use a through t.
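
If counting through the alphabet on your fingers gets old, a few lines of sh(1) can do it for you. This is only a sketch: index_to_letter is a hypothetical helper of my own, not part of gpart, and it assumes a label of at most 26 slots. The optional second argument is the number of slots in the label and defaults to 8, the standard disklabel size.

```shell
# Sketch: map a disklabel partition index to its letter (1 -> a, 2 -> b, ...).
# index_to_letter is a hypothetical helper for illustration, not part of gpart.
# Usage: index_to_letter INDEX [SLOTS]   (SLOTS defaults to 8, at most 26)
index_to_letter() {
    idx=$1
    slots=${2:-8}
    if [ "$idx" -lt 1 ] || [ "$idx" -gt "$slots" ]; then
        echo "index $idx does not fit in a ${slots}-slot label" >&2
        return 1
    fi
    # cut -c picks the idx-th character of the alphabet string.
    printf '%s\n' "abcdefghijklmnopqrstuvwxyz" | cut -c "$idx"
}

index_to_letter 1        # first slot of a standard label
index_to_letter 8        # last slot of a standard label
index_to_letter 18 20    # index 18 in a 20-partition label
index_to_letter 18 || echo "pick a bigger label or a smaller index"
```

The last call fails on purpose: index 18 has no letter in an eight-slot label, which is the limit described above.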

On my three-slice system, I want to put /var on ada3s3. I want to use a letter other than a, b, or c, so I randomly pick index 18. It’s almost exactly the same as the partition for /usr, but we’re adding it to a different slice.

# gpart add -t freebsd-ufs -a 1m -i 18 ada3s3
ada3s3r added

To see that disklabel partition, you’ll need to run gpart show ada3s3. Add -p to see the device name.

# gpart show -p ada3s3
=>         0  1324379505   ada3s3  BSD  (632G)
           0        1985           - free -  (993K)
        1985  1324376064  ada3s3r  freebsd-ufs  (632G)
  1324378049        1456           - free -  (728K)

What do you know? The 18th letter of our alphabet is R.

With partitions, we can start to look at filesystems.
