Discontiguous data file – the SG – I/O approach

The traditional approach, issuing the {lseek, write} pair of system calls three times in succession, works, of course, but at a rather large performance penalty: every system call involves a switch into the kernel and back, which is expensive. A far superior approach performance-wise is called scatter-gather I/O (SG-I/O, or vectored I/O). The relevant system calls are readv(2) and writev(2); these are their signatures:

#include <sys/uio.h>
ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
ssize_t writev(int fd, const struct iovec *iov, int iovcnt);

These system calls allow you to specify a bunch of segments to read or write in one shot; each segment describes a single I/O operation via a structure called iovec:

struct iovec {
    void *iov_base; /* Starting address */
    size_t iov_len; /* Number of bytes to transfer */
};

The programmer passes an array of segments describing the I/O operations to perform; this is precisely the second parameter, a pointer to an array of struct iovecs. The third parameter is the number of segments to process. The first parameter is obvious: the file descriptor representing the file upon which to perform the scattered read or gathered write.
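To make the three parameters concrete, here is a minimal sketch that writes two separate buffers with a single writev(2) call. The helper name write_two_segments and its header/body arguments are hypothetical, introduced only for illustration; they are not part of the book's example program.

```c
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper (not from the book's sources): write two
 * NUL-terminated strings held in separate buffers to fd with a single
 * writev(2) call. Returns the number of bytes written, or -1 on
 * failure (errno is set by writev). */
static ssize_t write_two_segments(int fd, const char *hdr, const char *body)
{
    struct iovec iov[2];

    /* Segment 0: the header. iov_base is a plain void *, so the
     * const qualifier has to be cast away; writev only reads it. */
    iov[0].iov_base = (void *)hdr;
    iov[0].iov_len = strlen(hdr);

    /* Segment 1: the body, written immediately after segment 0. */
    iov[1].iov_base = (void *)body;
    iov[1].iov_len = strlen(body);

    /* Second parameter: the array; third parameter: segment count. */
    return writev(fd, iov, 2);
}
```

A caller would invoke it as write_two_segments(fd, "hello, ", "world") and compare the return value against the combined length of the two strings.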

So, think about it: with readv(2), you can scatter the data read from a given file into several discontiguous buffers (and their sizes) you specify via the I/O vector pointer, and with writev(2), you can gather the data from several discontiguous buffers (and their sizes) you specify via the I/O vector pointer into a write to a given file; these types of multi-buffer I/O operations are thus called scatter-gather I/O! Note that it is the memory buffers that can be discontiguous; on the file, the transfer proceeds sequentially from the current file offset. Here is the really cool part: the system calls are guaranteed to process the segments in array order and atomically; that is, the data transferred by a single call is not intermingled with I/O issued on the same file by other processes. Again, though, watch out: the return value from readv(2) or writev(2) is the actual number of bytes read or written, and -1 on failure. It's always possible that an I/O operation performs less than the amount requested; this is not a failure, and it's up to the developer to check.
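As an illustration of the array-order behavior and of the return-value caveat, here is a minimal sketch of a single readv(2) call that fills two buffers. The helper name read_two_segments and its parameters are hypothetical, chosen only for this example.

```c
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper (illustration only): read up to
 * hdr_len + body_len bytes from fd with one readv(2) call.
 * The buffers are filled in array order: hdr first, then body.
 * Returns the total number of bytes read (possibly fewer than
 * requested), 0 at end of file, or -1 on failure. */
static ssize_t read_two_segments(int fd, void *hdr, size_t hdr_len,
                                 void *body, size_t body_len)
{
    struct iovec iov[2] = {
        { .iov_base = hdr,  .iov_len = hdr_len  },
        { .iov_base = body, .iov_len = body_len },
    };

    return readv(fd, iov, 2);
}
```

If fewer bytes are available than the two buffers can hold, the first buffer is filled before any byte lands in the second, and the return value reports the total; the caller has to compare it against hdr_len + body_len to detect a short read.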

Now, for our earlier data file example, let's look at the code that sets up and performs the discontiguous, ordered-and-atomic writes via writev(2):

static int wr_discontig_the_better_SGIO_way(int fd)
{
  struct iovec iov[6];
  int i=0;

  /* We don't want to call lseek of course; so we emulate the seek
   * by introducing segments that are just "holes" in the file. */

  /* A: {seek_to A_START_OFF, write gbufA for A_LEN bytes} */
  iov[i].iov_base = gbuf_hole;
  iov[i].iov_len = A_HOLE_LEN;
  i ++;
  iov[i].iov_base = gbufA;
  iov[i].iov_len = A_LEN;

  /* B: {seek_to B_START_OFF, write gbufB for B_LEN bytes} */
  i ++;
  iov[i].iov_base = gbuf_hole;
  iov[i].iov_len = B_HOLE_LEN;
  i ++;
  iov[i].iov_base = gbufB;
  iov[i].iov_len = B_LEN;

  /* C: {seek_to C_START_OFF, write gbufC for C_LEN bytes} */
  i ++;
  iov[i].iov_base = gbuf_hole;
  iov[i].iov_len = C_HOLE_LEN;
  i ++;
  iov[i].iov_base = gbufC;
  iov[i].iov_len = C_LEN;
  i ++;

  /* Perform all six discontiguous writes in order and atomically! */
  if (writev(fd, iov, i) < 0)
    return -1;
  /* Do note! As mentioned in Ch 19:
   * "the return value from readv(2) or writev(2) is the actual number
   * of bytes read or written, and -1 on failure. It's always possible
   * that an I/O operation performs less than the amount requested; this
   * is not a failure, and it's up to the developer to check."
   * Above, we have _not_ checked; we leave it as an exercise to the
   * interested reader to modify this code to check for and read/write
   * any remaining bytes (similar to this example: ch7/simpcp2.c).
   */
  return 0;
}
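The listing leaves the handling of short writes as an exercise. One possible shape for it is sketched below; the helper name writev_all is hypothetical, and this is an assumption about how one might approach the exercise, not the book's own solution. It retries writev(2) until every segment is written, advancing past whatever the previous call managed to transfer.

```c
#include <errno.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper (one possible approach to the exercise):
 * call writev(2) repeatedly until all iovcnt segments are written.
 * Note: the entries of the caller's iov array are modified.
 * Returns 0 on success, -1 on failure (errno is set by writev). */
static int writev_all(int fd, struct iovec *iov, int iovcnt)
{
    while (iovcnt > 0) {
        ssize_t n = writev(fd, iov, iovcnt);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }

        size_t left = (size_t)n;

        /* Skip every segment that was written completely. */
        while (iovcnt > 0 && left >= iov->iov_len) {
            left -= iov->iov_len;
            iov++;
            iovcnt--;
        }

        /* A partially written segment: resume from where it stopped. */
        if (iovcnt > 0) {
            iov->iov_base = (char *)iov->iov_base + left;
            iov->iov_len -= left;
        }
    }
    return 0;
}
```

In the function above, the call writev(fd, iov, i) could then be swapped for writev_all(fd, iov, i), at the cost of giving up the single-call guarantee whenever a retry is actually needed.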

The end result is identical to that of the traditional approach; we leave it to the reader to try it out and see. This is the key point: the traditional approach had us issuing a minimum of six system calls (3 x {lseek, write} pairs) to perform the discontiguous data writes into the file, whereas the SG-I/O code performs the very same discontiguous data writes with just one system call. This results in significant performance gains, especially for applications under heavy I/O workloads.

The interested reader, delving into the full source code of the previous example program (ch18/sgio_simple.c), will notice something that perhaps seems peculiar (or even just wrong): the blatant use of the controversial goto statement! The fact, though, is that the goto can be very useful in error handling: it performs the code cleanup required when exiting a deeply nested path within a function due to failure. Please check out the links provided in the Further reading section on the GitHub repository for more. The Linux kernel community has been quite happily using the goto for a long while now; we urge developers to look into appropriate usage of the same.
