Files

Leaf on Water. © Ike Lea, Lansing Community College.

Chapter 8 Overview

This chapter deals with how the data included in an image are stored and used and describes the parts of the file itself and how the files can be reduced for easy transmittal. Several file types are introduced to show the wide variety of files available and their specific applications.

File Basics

When the image is captured and converted to a set of numerical values, we need to secure the data and make them portable. This takes place at several points in the process, and because of the varying requirements within the system the method for holding the data changes. As discussed earlier, the first place the information has to become transportable is in the architecture of the chip. For the most part, we can consider this short-term storage, and this captured file requires conversion to binary, digital numbers.

Beyond the concept of a captured file are three other common file uses. Working files require a certain level of flexibility to allow creative manipulation or correction of the image. Next, the file is also the archive form for saving the data and making them permanent; this requires a file that you can write, record, and later reconstruct. Finally, certain files are designed for communicating the image information to an output device or for transfer.

While the captured file is not useful to us beyond its function within the camera and is changed from an analog form to a digital file, it displays the basic structure of all files. In order to form the bit map, we have a series of numbers that are registered within the grid of the image file. Another factor that is common to files is the numerical method for defining the value of the light captured. Although not in the originally captured image file, all other files must handle the raw or interpolated color generated from the illumination information in the captured file. Finally, a part of the file will be an algorithm that converts the data for future use.

Of these parts, the major difference between files is found in the algorithm for each file type. The algorithm consists of the standards, steps, and actions required to convert the numerical data into an image. Some of these components are universal, such as color and a binary numbering system (as discussed in Chapter 5). The non-universal parts are the image structure, bit order, compression, and applicability. Together, these non-universal portions of the file define the file format.

Binary: In a binary numbering system (base two), only two numbers make up all the values. Such a numbering system works exceptionally well in the computer environment, as the number 0 can apply to the power being off while the number 1 represents a charge. This base unit, 0 or 1, is a bit.

File format: The file format contains instructions on how to encode, save, and extract data in a file.

Bit Depth and Color

In Chapter 5, we discussed how colors are generated. A 24-bit color format uses 8 bits (256 distinct numbers) of value information for each of the three primary color channels (red, green, and blue, or RGB) and mixes them to form the image color.

This means that 3 bytes of image information will be generated for each pixel at a 24-bit depth. Some Web graphics use an 8-bit format limited to 256 separate colors, with no other mixtures possible. An adaptive palette can be applied to an 8-bit image so that the colors in the image file better communicate the photographer’s wishes or perceptions. Both capture and saving use up to 48-bit file formats, or up to 16 bits (2 bytes) per channel.
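The arithmetic behind 24-bit color can be sketched in a few lines of Python. This is only an illustration: the function names `pack_rgb` and `unpack_rgb` are hypothetical, and the red-green-blue byte ordering is an assumption rather than a rule of any particular file format.

```python
def pack_rgb(red, green, blue):
    """Pack three 8-bit channel values (0-255) into one 24-bit integer."""
    for value in (red, green, blue):
        if not 0 <= value <= 255:
            raise ValueError("each channel must fit in 8 bits (0-255)")
    # Assumed ordering: red in the high byte, blue in the low byte.
    return (red << 16) | (green << 8) | blue


def unpack_rgb(packed):
    """Split a 24-bit integer back into its (red, green, blue) channels."""
    return (packed >> 16) & 0xFF, (packed >> 8) & 0xFF, packed & 0xFF


levels_per_channel = 2 ** 8               # 256 distinct values per channel
total_colors = levels_per_channel ** 3    # every mixture of the three channels
orange = pack_rgb(255, 128, 0)
pixel_bytes = orange.to_bytes(3, "big")   # one pixel occupies 3 bytes
```

Here `total_colors` works out to 16,777,216 possible mixtures, compared with the 256 colors available in an 8-bit format.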

Beyond the bit depth of the file, another color consideration is the color metrics used to coordinate color. Of these, the Commission Internationale de L’Éclairage (CIE, International Commission on Illumination) L*a*b* color space is the most commonly used. This system defines a color by its lightness (L) together with its position on both the “a” axis (red–green) and the “b” axis (yellow–blue). Most files use this type of three-dimensional color space. In some files, embedded color preferences are also recorded and can be used to reconstruct the image as it is opened or output.

Many of today’s file formats include information about the image that is not part of the image’s pixel structure, referred to as the metadata. This information is important for tracing the history of the image (such as when the image was made or modified) or for control of the image function, such as an embedded color space or correction.

Commission Internationale de L’Éclairage (CIE): The color measuring and numbering system created by the CIE is based on human perception and is heavily used in digital imaging.

Metadata: Metadata are information attached or stored within a file format that is not part of the image.

Figure 8-1 The CIE L*a*b* color space is a model that has hues arranged around the equator. L is the lightness, with white at the top. The (a) axis runs from red to green, and the (b) axis runs from blue to yellow.

File Size and Detail

A common misconception is that the count of pixels on the sensor is the same as the size of the files that the sensor creates. File sizes and megapixel counts differ in two ways. The first is the way color is interpolated: as just mentioned, the file size varies with bit depth. Generally, 24-bit color generates 3 bytes of color data at each pixel, so the file will contain three times as many bytes as there are pixels on the sensor. Second, file sizes do not match the sensor size when the originally captured pixel count is resampled and enlarged. The file becomes larger because an interpolation algorithm adds pixels and then adjusts image quality for better appearance. One of the most common methods for this type of file expansion uses a mathematical technique called bicubic resampling, followed by edge enhancement. This resampling is applied to the captured file before the file is used or saved. While this type of enlargement creates larger files, it does not add detail to the image.
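The relationship between pixel count, bit depth, and file size can be worked through with a short Python sketch. The 3000 × 2000 sensor and the 4500 × 3000 enlargement are hypothetical figures chosen only for illustration, and the calculation ignores headers, metadata, and compression.

```python
def uncompressed_size_bytes(width, height, bits_per_channel=8, channels=3):
    """Bytes of pixel data for an uncompressed, interpolated color image."""
    bits_per_pixel = bits_per_channel * channels
    return width * height * bits_per_pixel // 8


# Hypothetical 6-megapixel sensor.
sensor_pixels = 3000 * 2000

size_24_bit = uncompressed_size_bytes(3000, 2000)                       # 8 bits per channel
size_48_bit = uncompressed_size_bytes(3000, 2000, bits_per_channel=16)  # 16 bits per channel

# The same capture resampled to 4500 x 3000: more pixels, no new detail.
size_resampled = uncompressed_size_bytes(4500, 3000)

print(sensor_pixels, size_24_bit, size_48_bit, size_resampled)
```

With these numbers, 6,000,000 pixels produce 18,000,000 bytes at 24 bits, 36,000,000 bytes at 48 bits, and 40,500,000 bytes after the enlargement.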

Resampling: Resampling is changing the image file to increase or decrease the number of pixels that are in an image. When a file is permanently reduced in size, data are lost; however, if the file is increased in size, no new detail is added to the image.

Bit Order

The bit order used with each byte in the image file has ramifications on how the file will be decoded and transmitted. Having a compatible bit order within the files is particularly important in relation to color bit-mapped images.
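A minimal Python sketch shows why a mismatched bit order matters: the same eight stored bits decode to different numbers depending on which end is read as most significant. The helper `reverse_bits` is a hypothetical name used only for this illustration.

```python
def reverse_bits(byte):
    """Return the value of an 8-bit byte when its bits are read in reverse order."""
    if not 0 <= byte <= 255:
        raise ValueError("expected a single byte (0-255)")
    return int(format(byte, "08b")[::-1], 2)


stored = 0b00000110              # 6 when the leftmost bit is the most significant
misread = reverse_bits(stored)   # 0b01100000 when the order is reversed
print(stored, misread)
```

A decoder that assumes the wrong order would read the value 6 as 96, so every tone in the image would be wrong even though no bits were lost.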

Compression

The type of compression used in the file is another variable. Compression reduces the file size by applying mathematical operations on the data within the file. Many camera systems compress the capture file before exporting the image for other uses, but in these cases the computer or camera system program is determining the important and unimportant data in the image and discarding or rewriting the data. The mathematics of the compression defines the two basic types of compression: lossless and lossy.

Lossless compression: If a file format can compress encoded data and then decompress the information in the data without degradation, then it is lossless; this is a reversible compression concept.

Lossy: Lossy is any type of file compression that shows a loss of data when opened after saving.

Lossless compression reduces the size of a file without the loss of data. This compression uses a method that stores a group of pixels as one piece of information when the same color value (R, G, B digital number) is present in adjacent pixels. Mathematically, this allows for a pure function, without any differences in the data before compression and after decompression. Because the compression method stores sameness in one locator for linearly aligned pixels, it is often referred to as pixel packing or run-length encoding. While there is no loss of data when decompressing with lossless compression, the gain from this model is limited, typically reaching a compression ratio of about 2:1, or a 50% reduction in file size.
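Run-length encoding can be sketched in a few lines of Python. This is a simplified illustration of the idea rather than the scheme used by any specific file format; the function names and the sample row of pixels are assumptions.

```python
def rle_encode(pixels):
    """Collapse runs of identical adjacent pixels into (count, value) pairs."""
    runs = []
    for pixel in pixels:
        if runs and runs[-1][1] == pixel:
            runs[-1][0] += 1          # same color as the previous pixel: extend the run
        else:
            runs.append([1, pixel])   # new color: start a new run
    return [(count, value) for count, value in runs]


def rle_decode(runs):
    """Expand (count, value) pairs back into the original pixel sequence."""
    pixels = []
    for count, value in runs:
        pixels.extend([value] * count)
    return pixels


RED, BLUE = (255, 0, 0), (0, 0, 255)
row = [RED] * 5 + [BLUE] * 3 + [RED]

encoded = rle_encode(row)     # three runs instead of nine pixels
restored = rle_decode(encoded)
```

Decoding returns exactly the nine original pixels, which is what makes the method lossless. A row with no repeated neighbors would gain nothing, which is one reason the achievable ratio depends on image content.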

The lossy method of compression has great file reduction potential. Lossy systems can reduce files by as much as a 300:1 compression ratio; that is, they can bring a file size down to about 0.3% of its original size. This model takes neighboring pixels of similar data and saves them as the same value in one group of data. As a simplified example, it might treat the color of a red apple and a red fire engine as the same color. The critical issue for lossy compression is how much loss is acceptable at decompression, based on the final file size.
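The idea of saving similar neighboring values as one value can be illustrated with a simple quantization sketch in Python. Real lossy formats are far more sophisticated; the step size of 16 and the sample values below are assumptions chosen only to make the effect visible.

```python
def quantize(values, step):
    """Snap each value to the middle of its step-wide bucket (irreversible)."""
    return [(value // step) * step + step // 2 for value in values]


# Five slightly different red-channel values from neighboring pixels.
original = [200, 203, 207, 210, 214]
stored = quantize(original, step=16)

distinct_before = len(set(original))
distinct_after = len(set(stored))
print(stored, distinct_before, distinct_after)
```

Five distinct values collapse to two, so runs of identical values become much longer and pack far more tightly, but the original differences cannot be recovered when the file is opened.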

Lossy compression utilizes a basic understanding that human perception will not notice small color changes when files are decompressed. This is taken a step further with a process called visually lossless compression. Though not heavily used, in visually lossless compression the algorithm discards a large portion of color information in a regular pattern while retaining the tonal information at each pixel.
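A rough Python sketch of this approach keeps the tonal value of every pixel while storing color for only every second pixel. The (tone, color_a, color_b) pixel layout, the function names, and the factor of 2 are all assumptions made for illustration, not the method of a particular format.

```python
def subsample_color(row, factor=2):
    """Keep tone for every pixel but color for only every `factor`-th pixel."""
    tones = [tone for tone, _, _ in row]
    colors = [(color_a, color_b) for _, color_a, color_b in row[::factor]]
    return tones, colors


def reconstruct(tones, colors, factor=2):
    """Rebuild pixels by sharing each stored color across `factor` neighbors."""
    return [(tone, *colors[index // factor]) for index, tone in enumerate(tones)]


# Four pixels as (tone, color_a, color_b); values are illustrative only.
row = [(100, 120, 130), (110, 122, 131), (50, 90, 200), (55, 92, 198)]

tones, colors = subsample_color(row)
rebuilt = reconstruct(tones, colors)

values_before = len(row) * 3
values_after = len(tones) + 2 * len(colors)
```

The rebuilt row keeps all four tones exactly, while each pair of neighbors shares one color; the stored values drop from 12 to 8 in this small example.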

Regardless of the type of compression, the effect of the compression will not be seen when the file is being compressed; it will be apparent only after decompression takes place and the file is opened. One must consider, then, how much image quality and data can be sacrificed for a reduction in file size.

File Types

The most important consideration for file choice is the application chosen for the image. The following list of file types is not complete but does provide a look at some of the common types and how they are used. All of these files are cross-platform compatible.

Purely Abstract #6. © Ian Macdonald-Smith.

RAW Files

RAW files are closest to the actually captured file. This file format records the filtered color at each pixel captured and quantized in the pixel’s filter color. A RAW file does not interpolate the color from adjacent pixels to form an image. The RAW format file records the light captured at each photosite in its filtered color at the number of bits directly from the sensor. The file cannot be viewed as an image without software to interpolate the color information and produce the expected image.
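The structure of such a file can be pictured with a small Python sketch in which each photosite holds a single value in its own filter color. The repeating red-green/green-blue layout used here is an assumption for illustration; actual filter arrangements vary by sensor and manufacturer.

```python
def filter_color(row, col):
    """Filter color over one photosite, assuming a repeating R-G / G-B layout."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"


# A 4 x 4 corner of the assumed sensor: one filter color per photosite.
mosaic = [[filter_color(r, c) for c in range(4)] for r in range(4)]
for line in mosaic:
    print(" ".join(line))

counts = {color: sum(line.count(color) for line in mosaic) for color in "RGB"}
```

Each photosite contributes one number rather than three, which is why conversion software must interpolate the two missing colors at every position before the file can be viewed as an image.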

RAW files are different for various capture systems because each system has proprietary image processing, and the file extensions will vary. For Nikon systems, the extension is .nef; for Canon, .crw; for Kodak, .eir; and for Leaf, .mos (for Mosaic), to name only a few of the file extensions in use. These files are not interchangeable and cannot be viewed through a different camera’s converter or viewer. For this reason, a plug-in is required for computer viewing or manipulation. This plug-in may be proprietary or a more universal software tool, such as a Photoshop® plug-in. At the time of this writing, interpolation of an image tends to be better using the camera manufacturer’s converters rather than through universal conversions found in post-capture software.

There are three major advantages to the RAW file type. First, it is able to handle large data capture and is not constrained by the need to interpolate color. Second, future software improvements can be applied to images stored earlier as RAW files. Finally, RAW files are smaller than interpolated color files. Even though RAW files use no compression, a 16-bit RAW file (2 bytes per photosite) is smaller than an interpolated 8-bit-per-channel file (3 bytes per pixel).

TIFF (Tagged Image File Format)

The name of this type of file refers to the metadata (i.e., the tag) written into the file. This portion of the file provides the user with information about what the image is and how to reconstruct it when it is opened or used. TIFF files have an extension of .tif. For many camera systems, TIFF files are the highest level of interpolated color image files available. For many years, the TIFF format has had wide acceptance in the graphics and printing industries. The tags on TIFF files make them exceptionally useful when working across platforms and in multiple software applications. This extends to being able to save from multiple software applications, including the ability to save channels and layers in programs such as Photoshop®.

Within most software applications, TIFF files open or save more slowly than file formats that are native to that program. TIFF files can be stored as compressed or uncompressed files and will handle large bit depths (12 or 16 bits per channel). They may be compressed by lossy or lossless compression, although lossless is more commonly used. Lempel–Ziv–Welch (LZW) compression is a lossless compression used in the Macintosh® platform; ZIP compression is used in both Macintosh and PC platforms. These compressions are particularly effective when there are large areas of the same color.
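The effect of large areas of the same color on lossless compression can be demonstrated with Python's standard `zlib` module, which implements the Deflate method that ZIP-style compression is based on. This is only a sketch on raw pixel bytes, not a TIFF writer, and the sample data are assumptions.

```python
import os
import zlib

# 10,000 identical red pixels (3 bytes each): a large flat area of one color.
flat_area = bytes([200, 30, 30]) * 10_000

# The same amount of random data: no repeated structure to exploit.
noisy_area = os.urandom(30_000)

flat_packed = zlib.compress(flat_area)
noisy_packed = zlib.compress(noisy_area)

print(len(flat_area), len(flat_packed))
print(len(noisy_area), len(noisy_packed))

# Lossless: decompression returns every original byte.
restored = zlib.decompress(flat_packed)
```

The flat area shrinks to a tiny fraction of its original size, while the random data barely compresses at all, and in both cases the original bytes are recovered exactly.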

PSD (Photoshop® Document)

A Photoshop® document is a vehicle developed for use with a specific piece of software. Although the .psd file type was created for the Photoshop® program, it is compatible with many other types of software. This allows these other applications to use a file that has been saved as a Photoshop® document, as it can be opened and manipulated to varying degrees. Unlike other file types mentioned in this chapter that were designed as methods to save data, the Photoshop® document type of file was intended as a processing-based vehicle.

JPEG (Joint Photographic Experts Group)

One of the most common file formats is JPEG, with the file extension .jpg. This file format is common on many cameras and on the World Wide Web. While the format can be lossless, it is most commonly used as a lossy compression file. The compression in JPEG can reduce the image files to very small sizes. It is important to know that every time a file is saved in JPEG, compression is applied to the file. This means that if a file is opened and then saved without any other change, the file will be compressed again. Thus multiple file saving in JPEG format continuously compresses the file with lossy compression.

The potentially small sizes give Web users the ability to move files quickly, and consumer-level imagers can store large numbers of images on any medium. The JPEG file algorithm varies the amount of color or value information that can be discarded. This can be done by switching or lowering the bit depth of the file. While the file format supports 24-bit color, it can also use 8-bit or lower color to achieve the smallest optimal size.

The Joint Photographic Experts Group is developing a new file format referred to as JPEG 2000. This file format will have several important features, including security potentials, prepress and fax document compatibility, use of motion imaging files, and Web language applications. JPEG 2000 is based on wavelet technology, which divides the data into frequency components and examines each component at a resolution matched to its scale.

Untitled. © Tim Meyer 2003.

GIF (Graphics Interchange Format)

Developed by CompuServe® in the 1980s and later widely adopted on the World Wide Web, this file format, with the extension .gif, allows efficient movement of files. It limits the image to 8-bit color (a total of 256 distinct colors), which is an extremely lossy reduction for photographs, and in doing so accomplishes a very large reduction of file size. Because of the color format used in GIF files, decompressed images will often exhibit banding or posterization of the tones and colors within the image. This format is useful for putting images on the World Wide Web but was not designed as a primary image file format.
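The banding that results from a limited palette can be sketched in Python by mapping each pixel to its nearest palette entry. The four-gray palette below is a deliberately tiny, hypothetical stand-in for a 256-color palette, and the nearest-color rule is one simple assumption about how colors are assigned.

```python
def nearest_palette_index(pixel, palette):
    """Index of the palette color closest to the pixel (squared RGB distance)."""
    def distance(index):
        return sum((p - q) ** 2 for p, q in zip(pixel, palette[index]))
    return min(range(len(palette)), key=distance)


# Tiny illustrative palette of four grays (a real 8-bit palette holds up to 256 colors).
palette = [(0, 0, 0), (85, 85, 85), (170, 170, 170), (255, 255, 255)]

# A smooth gray gradient of seven pixels.
gradient = [(v, v, v) for v in (0, 40, 80, 120, 160, 200, 240)]

indexed = [nearest_palette_index(pixel, palette) for pixel in gradient]
displayed = [palette[i] for i in indexed]
print(indexed)
```

Seven smoothly changing tones become four flat steps, which is the banding or posterization described above; each pixel is then stored as a single small index instead of three color bytes.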

EPS (Encapsulated PostScript)

Using the extension .eps, this file is designed for use in the graphic arts field. The format handles both bit maps and vector graphics. The vector graphics are used to support PostScript®, a format designed for various aspects of the printing industry, including typography and certain graphics applications. When the data are presented in a vector graphic form, the number of pixels does not confine the scale of the output, as pixels are not found in a vector graphic file. This function in EPS allows resizing without loss of information or pixelization of type and graphics. EPS also allows embedded bit maps, thus supporting the use of digital images in the files. The EPS files are used extensively in prepress applications.

PDF (Portable Document Format)

Portable Document Format files, with an extension of .pdf, were developed to use the features common in desktop graphics and imaging that use embedded fonts and images. These files are created by Adobe Acrobat®, which uses a unique method to send files to a printer; its raster image processor (RIP) simplifies graphic and color output to produce a file that can be accessed by a variety of users across several platforms.

Salt Lake. © Ike Lea, Lansing Community College.

Summary

  • The four types of files used in digital imaging are capture, working, archiving, and output/transfer files.
  • Features common to all types of files include the basic bit map structure, binary numbering system, and algorithm used to write and store the image.
  • File size is influenced by the pixel count of the sensor, color bit depth, and size interpolation.
  • Color varies in files and is important in file size and compression.
  • Three types of compression are used in image files: lossless, lossy, and visually lossless.
  • The many types of files include RAW, TIFF, PSD, JPEG, GIF, EPS, and PDF. Each has its own benefits and functions within digital imaging.

Glossary of Terms

Binary: In a binary numbering system (base two), only two numbers make up all the values. Such a numbering system works exceptionally well in the computer environment, as the number 0 can apply to the power being off while the number 1 represents a charge. This base unit, 0 or 1, is a bit.

Commission Internationale de L’Éclairage (CIE): The color measuring and numbering system created by the CIE is based on human perception and is heavily used in digital imaging.

File format: The file format contains instructions on how to encode, save, and extract data in a file.

Lossless compression: If a file format can compress encoded data and then decompress the information in the data without degradation, then it is lossless; this is a reversible compression concept.

Lossy: Lossy is any type of file compression that shows a loss of data when opened after saving.

Metadata: Metadata are information attached or stored within a file format that is not part of the image.

Resampling: Resampling is changing the image file to increase or decrease the number of pixels that are in an image. When a file is permanently reduced in size, data are lost; however, if the file is increased in size, no new detail is added to the image.
