4.1 Frequency Spectrum of Images

Since the computational model in the spectral domain is rooted in the frequency spectrum of the image, and the input image is generally given in the spatial domain, the transform from the spatial domain to the frequency domain should be considered first. Images can often be divided into two categories: natural images and man-made object images. Natural images include natural objects (animals, flowers, etc.) and landscapes (forests, rivers, beaches, mountains, etc.), and are commonly outdoor images. Man-made object images involve man-made objects, indoor scenes, streets, city views and so on. Whatever the image is, it always contains real signal content and meaning that distinguish it from a random signal. Therefore, its frequency spectrum has some unique statistical properties. In this section, we review how the frequency spectrum of an image is obtained, the properties of the spectrum, and its statistical regularities.

4.1.1 Fourier Transform of Images

To illustrate the computational model in the frequency domain, let us first review the properties of the image frequency spectrum. If I(x, y) is an M-by-N array obtained by sampling a continuous 2D image at equal intervals on a rectangular grid, then its discrete frequency spectrum is the array given by the 2D discrete Fourier transform (DFT):

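(4.1) $$F(u, v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} I(x, y)\, e^{-j 2\pi \left( \frac{ux}{M} + \frac{vy}{N} \right)}$$

and the inverse DFT is

(4.2) $$I(x, y) = \frac{1}{MN} \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} F(u, v)\, e^{j 2\pi \left( \frac{ux}{M} + \frac{vy}{N} \right)}$$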

where (x, y) are the coordinates of the sampling array, and u and v are the frequency indices along the x- and y-axes, respectively. Each spectral component at (u, v) is a complex number with real part $a_{u,v}$ and imaginary part $b_{u,v}$, so the coefficient of the Fourier transform can be represented in polar form as

(4.3) $$F(u, v) = a_{u,v} + j\, b_{u,v} = A(u, v)\, e^{j \varphi(u, v)}$$

$$A(u, v) = \sqrt{a_{u,v}^{2} + b_{u,v}^{2}}, \qquad \varphi(u, v) = \arctan\!\left( \frac{b_{u,v}}{a_{u,v}} \right)$$

where $A(u, v)$ is the amplitude spectrum and $\varphi(u, v)$ is the phase spectrum at (u, v) in 2D frequency space, with the quadrant of $\varphi(u, v)$ determined by the signs of $a_{u,v}$ and $b_{u,v}$. In the displayed spectra, the origin of frequency space is located at the centre of the transformed image. The amplitude and phase spectra (or real and imaginary spectra) collect the amplitude and phase values (or real and imaginary parts) of all components in 2D frequency space. Consequently, any given image has both an amplitude and a phase spectrum (or real and imaginary spectra) by Equations 4.1 and 4.3. Let the (M/2−1)th column and the (N/2−1)th row in frequency space be the vertical and horizontal axes that split the spectrum into four quarters. As the values at all pixels of the image are real numbers, the spectrum has conjugate symmetry about the origin: F(−u, −v) is the complex conjugate of F(u, v). For the amplitude spectrum, the symmetry appears between the top-right and lower-left quarters, and between the top-left and lower-right quarters. The phase spectrum has the same symmetry except that the signs are opposite. Figure 4.1 shows an original image of a real-world scene and its two spectra (amplitude and phase). Note that the amplitude spectrum is displayed on a logarithmic scale to facilitate visualization.
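The amplitude and phase spectra of Equation 4.3, and the conjugate symmetry of the spectrum of a real image, can be checked numerically. The following is a minimal sketch, assuming NumPy and a random array as a stand-in for a real grey-level image; the function name `amplitude_phase_spectra` is hypothetical and chosen only for illustration.

```python
import numpy as np

def amplitude_phase_spectra(image):
    """Return the centred amplitude and phase spectra of a real 2D array."""
    # fftshift moves the zero-frequency component to the centre of the array
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    amplitude = np.abs(spectrum)    # sqrt(a^2 + b^2)
    phase = np.angle(spectrum)      # four-quadrant arctangent of b / a
    return amplitude, phase

rng = np.random.default_rng(0)
image = rng.random((64, 64))        # stand-in for a grey-level image (assumption)

amplitude, phase = amplitude_phase_spectra(image)
log_amplitude = np.log1p(amplitude) # log scale, used only for display

# Conjugate symmetry on the unshifted spectrum: F(-u, -v) = conj(F(u, v)).
# Flipping both axes and rolling by one sample maps index (u, v) to (-u, -v).
spectrum = np.fft.fft2(image)
mirrored = np.roll(spectrum[::-1, ::-1], shift=(1, 1), axis=(0, 1))
is_conjugate_symmetric = np.allclose(mirrored, np.conj(spectrum))
```

Here `np.log1p` is one possible choice for the logarithmic display; any monotonic compression of the amplitude values would serve the same purpose.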

Figure 4.1 (a) Original image; (b) amplitude spectrum of image (a); (c) phase spectrum of image (a)


The properties of the amplitude and phase spectra are explained below. It is worth noting that, rather than evaluating Equation 4.1 directly, a fast algorithm for the DFT, the fast Fourier transform (FFT), is usually used to obtain the discrete frequency spectrum of an image, especially when the number of pixels (M × N) is large, since the FFT reduces the computational complexity by recursively splitting the transform into even- and odd-indexed terms. We will not explain it here; readers can find the FFT algorithm in any image processing book or in the MATLAB® manual.
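To make the relation between the DFT definition and the FFT concrete, the sketch below evaluates the 2D DFT directly as two matrix products and compares the result with NumPy's FFT. It assumes the common convention of an unnormalised forward transform with the factor 1/(MN) in the inverse, which is what `np.fft.fft2` and `np.fft.ifft2` use; the helper name `dft2_direct` is hypothetical.

```python
import numpy as np

def dft2_direct(image):
    """2D DFT evaluated from the definition, written in separable matrix form."""
    M, N = image.shape
    x = np.arange(M)
    y = np.arange(N)
    # W_M[u, x] = exp(-j 2 pi u x / M) and W_N[v, y] = exp(-j 2 pi v y / N)
    W_M = np.exp(-2j * np.pi * np.outer(x, x) / M)
    W_N = np.exp(-2j * np.pi * np.outer(y, y) / N)
    # F[u, v] = sum_x sum_y W_M[u, x] * I[x, y] * W_N[v, y]
    return W_M @ image @ W_N.T

rng = np.random.default_rng(0)
image = rng.random((8, 12))             # small array: the direct form is slow

direct = dft2_direct(image)
fast = np.fft.fft2(image)
same_spectrum = np.allclose(direct, fast)
recovered = np.fft.ifft2(fast).real     # inverse DFT returns the original array
```

The direct form costs on the order of MN(M + N) operations, whereas the FFT costs on the order of MN log(MN), which is why the FFT is preferred for large images.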

4.1.2 Properties of Amplitude Spectrum

The discrete amplitude spectrum of an image specifies how much of each sinusoidal component exists in the image [11]. Different images have different amplitude spectra, but in most cases the largest values (the brightest region in the display) occur at or near the origin (low-frequency components), as in Figure 4.1(b). Since the power spectrum of an image is the square of the amplitude spectrum at each discrete frequency component, the shape of the amplitude spectrum in Figure 4.1(b) means that most of the image's energy is concentrated in the low-frequency region. The amplitude spectrum does not provide any location information, since it only describes the distribution of spatial frequency components in the image, and each frequency component may be spread anywhere in the image [12]. Nevertheless, it gives some inherent statistical information that may be relevant for simple classification tasks [13, 14].

In the last century, many studies found that the average power spectrum of natural images has a particular regularity: as the frequency f changes from low to high, the power spectrum at each frequency component falls as $1/f^{\alpha}$ [15, 16], which satisfies:

(4.4) $$E\left[ A(f)^{2} \right] \propto \frac{1}{f^{\alpha}}$$

where $A(f)$ is the amplitude spectrum. In Equation 4.4, the location index (u, v) in frequency space is omitted for convenience. The symbol α is the power index of the frequency component, and E[·] denotes the expectation. Experimental studies found α ≈ 2 for the average power spectra of natural images, which corresponds to α ≈ 1 for the average amplitude spectra [15, 16]. For different image categories the index α varies, so it can coarsely serve as a signature of the scene category. A detailed statistical analysis of the average power spectrum at different orientations (features), for both natural and man-made object images, is given in [17]. The complete model, which takes orientation into account, is given by Equation 4.5.
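One way to estimate the power index α for a single image is to average the power spectrum over rings of constant radial frequency and fit a straight line in log-log coordinates. The sketch below is an illustration under stated assumptions, not the procedure used in [15, 16]: the ring binning, the frequency range and the function names are all choices made here, and the test input is a synthetic random-phase image built to have a known α.

```python
import numpy as np

def radial_power_profile(image):
    """Average the power spectrum over rings of (rounded) radial frequency."""
    M, N = image.shape
    power = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2
    u = np.arange(M) - M // 2               # frequency index along axis 0
    v = np.arange(N) - N // 2               # frequency index along axis 1
    ring = np.rint(np.hypot(u[:, None], v[None, :])).astype(int)
    sums = np.bincount(ring.ravel(), weights=power.ravel())
    counts = np.bincount(ring.ravel())
    max_ring = min(M, N) // 2
    freqs = np.arange(1, max_ring)          # skip the DC term (ring 0)
    return freqs, sums[1:max_ring] / counts[1:max_ring]

def estimate_alpha(image):
    """Least-squares slope of log(power) against log(f); alpha is minus the slope."""
    freqs, profile = radial_power_profile(image)
    slope, _ = np.polyfit(np.log(freqs), np.log(profile), 1)
    return -slope

def synthetic_power_law_image(size, alpha, rng):
    """Random-phase image whose power spectrum falls as 1/f**alpha (test input)."""
    k = np.fft.fftfreq(size) * size
    f = np.hypot(k[:, None], k[None, :])
    f[0, 0] = 1.0                           # avoid division by zero at DC
    white = np.fft.fft2(rng.standard_normal((size, size)))
    return np.fft.ifft2(white / f ** (alpha / 2)).real

rng = np.random.default_rng(0)
alpha_hat = estimate_alpha(synthetic_power_law_image(128, 2.0, rng))
alpha_white = estimate_alpha(rng.standard_normal((128, 128)))
```

For a single image the estimate is noisy; the regularity in Equation 4.4 refers to the expectation, so in practice the profile would be averaged over many images before fitting.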

(4.5) $$E\left[ A(f, \theta)^{2} \right] \approx \frac{A_s(\theta)}{f^{\alpha(\theta)}}$$

where $A_s(\theta)$ and $\alpha(\theta)$ are the amplitude value and the power index of the frequency component at orientation θ (orientation feature map), respectively. For three orientations θ (horizontal, oblique and vertical), averaging thousands of power spectra separately for natural images and man-made object images showed that Equation 4.5 fits all image categories, with different parameters $A_s(\theta)$ and $\alpha(\theta)$ [17]. For natural images, the power index $\alpha(\theta)$ is almost equal to 2 in any scene, regardless of orientation. For man-made object images, however, $\alpha(\theta)$ is less than 2 for the horizontal orientation and greater than 2 for the oblique orientation. The amplitude value $A_s(\theta)$ follows a similar law in both natural and man-made object environments: it is larger for the vertical and horizontal orientations, consistent with the observation in Section 3.7 that vertical and horizontal orientations are more frequent than oblique orientations in both environments [18, 19]. However, different categories of environment exhibit different shapes in their averaged power spectra. For example, a forest environment shows almost the same distribution at all orientations due to the diversity of tree leaves; in a coastal or beach environment, the horizontal orientation dominates the shape of the averaged power spectrum; and in man-made object scenes (city views or tall buildings) the vertical orientation dominates. These shapes can be employed as signatures of the environment in image recognition or scene comprehension [17].
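The orientation-dependent model of Equation 4.5 can be fitted in the same spirit by restricting the power spectrum to an angular wedge before averaging over rings. The sketch below is only illustrative and is not the procedure of [17]: the wedge width, the ring binning, the function name and, in particular, the angle convention (the angle of the frequency vector, measured from the axis-1 frequency direction and taken modulo 180 degrees) are assumptions made here. How these wedge angles map onto the "horizontal", "oblique" and "vertical" orientations of the text depends on that convention.

```python
import numpy as np

def fit_oriented_power_law(image, centre_deg, half_width_deg=22.5):
    """Fit power ~ A_s / f**alpha inside one orientation wedge of the spectrum."""
    M, N = image.shape
    power = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2
    u = np.arange(M) - M // 2               # frequency index along axis 0
    v = np.arange(N) - N // 2               # frequency index along axis 1
    ring = np.rint(np.hypot(u[:, None], v[None, :])).astype(int)
    angle = np.degrees(np.arctan2(u[:, None], v[None, :])) % 180.0
    # Angular distance to the wedge centre on a 180-degree circle
    distance = np.abs((angle - centre_deg + 90.0) % 180.0 - 90.0)
    max_ring = min(M, N) // 2
    inside = (distance <= half_width_deg) & (ring >= 1) & (ring < max_ring)
    sums = np.bincount(ring[inside], weights=power[inside], minlength=max_ring)
    counts = np.bincount(ring[inside], minlength=max_ring)
    freqs = np.arange(1, max_ring)
    profile = sums[1:] / counts[1:]         # mean power per ring within the wedge
    slope, intercept = np.polyfit(np.log(freqs), np.log(profile), 1)
    return np.exp(intercept), -slope        # (A_s, alpha) for this wedge

# Synthetic isotropic test input with power spectrum ~ 1/f**2 (assumption)
rng = np.random.default_rng(0)
k = np.fft.fftfreq(128) * 128
f = np.hypot(k[:, None], k[None, :])
f[0, 0] = 1.0
image = np.fft.ifft2(np.fft.fft2(rng.standard_normal((128, 128))) / f).real

fits = {theta: fit_oriented_power_law(image, theta) for theta in (0, 45, 90, 135)}
```

For this isotropic input all wedges should give similar parameters; an image dominated by one orientation would show a larger amplitude value in the corresponding wedge.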

When the image is taken from a scene near the observer, the high spatial frequency components in the image increase, and many details can be observed. Conversely, an image whose content is far away from the observer generally contains mostly low-frequency components, which reflects low resolution, because the edges of objects in the image become blurry and many details are lost. In image processing, different resolutions are often realized by different low-pass filters, which can coarsely reflect the distance between the observer and the test image. It is worth noting that here we only consider the view of a given image, not the visual field at different viewing distances, since the range of the visual field becomes larger with increasing distance.
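The link between low-pass filtering and reduced resolution can be sketched with an ideal low-pass filter applied in the frequency domain. This is one possible filter chosen for brevity, not necessarily the one used in the works cited; the function name `ideal_low_pass` and the cutoff, expressed in cycles per image, are assumptions.

```python
import numpy as np

def ideal_low_pass(image, cutoff):
    """Zero every frequency component whose radial index exceeds `cutoff`."""
    M, N = image.shape
    u = np.fft.fftfreq(M) * M               # integer frequency indices, axis 0
    v = np.fft.fftfreq(N) * N               # integer frequency indices, axis 1
    radius = np.hypot(u[:, None], v[None, :])
    spectrum = np.fft.fft2(image)
    spectrum[radius > cutoff] = 0           # symmetric mask keeps the result real
    return np.fft.ifft2(spectrum).real

# Test pattern: a coarse grating (2 cycles) plus a fine grating (20 cycles)
y = np.arange(64)
coarse = np.cos(2 * np.pi * 2 * y / 64)
fine = np.cos(2 * np.pi * 20 * y / 64)
image = np.tile(coarse + fine, (64, 1))

blurred = ideal_low_pass(image, cutoff=8)   # only the coarse grating survives
```

An ideal filter introduces ringing near sharp edges in real images; a Gaussian low-pass filter is a common smoother alternative.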

In general, the saliency map for the BS model in Section 3.1 is computed at a mid resolution, its size being one sixteenth of the original image. It has been reported that the shapes of the averaged amplitude spectra at different resolutions are very similar, although their frequency ranges differ [17].

4.1.3 Properties of the Phase Spectrum

Compared with the discrete amplitude spectrum (Figure 4.1(b)), the discrete phase spectrum (Figure 4.1(c)) looks random and uninformative. However, it specifies where each of the frequency components resides within the image [11], so it represents information related to local properties (form and position) of the image [12]. It has been shown that the second-order statistics of an image correspond to the information in the amplitude spectrum, while higher-order statistics correspond to the information in the phase spectrum [20]. Therefore, the phase spectrum holds important information about the location of objects in the image.

Two examples, in Figures 4.2 and 4.3, support this conclusion. Figure 4.2(a) is a picture of a warship in Pearl Harbor, USA. When we add white noise (SNR = 5 dB) to the phase spectrum of Figure 4.2(a) while maintaining its amplitude spectrum, the image recovered by the inverse DFT (Figure 4.2(b)) is completely destroyed. We cannot see any information about the warship in the recovered image, which is consistent with an early finding: a slight disturbance of the Fourier phase spectrum makes the image unrecognizable [21]. However, when we add white noise (SNR = 5 dB) to the amplitude spectrum while keeping the phase spectrum, the recovered image retains the main information of Figure 4.2(a), as in Figure 4.2(c).
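An experiment of this kind can be set up as in the following sketch. The text does not state how the noise was generated, so the noise model here is an assumption: zero-mean Gaussian noise whose power is the stated number of decibels below the mean squared value of the spectrum being perturbed, with the real part of the inverse DFT taken afterwards. The function names are hypothetical, and the visual outcome depends on these choices.

```python
import numpy as np

def add_white_noise(values, snr_db, rng):
    """Add zero-mean Gaussian noise at `snr_db` relative to mean(values**2)."""
    signal_power = np.mean(values ** 2)
    noise_power = signal_power / 10 ** (snr_db / 10)
    noise = rng.normal(0.0, np.sqrt(noise_power), size=values.shape)
    return values + noise

def reconstruct_with_noisy_spectrum(image, snr_db, target, rng):
    """Perturb either the phase or the amplitude spectrum, then invert."""
    spectrum = np.fft.fft2(image)
    amplitude, phase = np.abs(spectrum), np.angle(spectrum)
    if target == "phase":
        phase = add_white_noise(phase, snr_db, rng)
    elif target == "amplitude":
        amplitude = add_white_noise(amplitude, snr_db, rng)
    else:
        raise ValueError("target must be 'phase' or 'amplitude'")
    # The noise breaks conjugate symmetry, so only the real part is kept
    return np.fft.ifft2(amplitude * np.exp(1j * phase)).real

rng = np.random.default_rng(0)
image = rng.random((64, 64))    # stand-in; replace with a real grey-level image
from_noisy_phase = reconstruct_with_noisy_spectrum(image, 5.0, "phase", rng)
from_noisy_amplitude = reconstruct_with_noisy_spectrum(image, 5.0, "amplitude", rng)
```

With a photograph in place of the random stand-in, the two reconstructions can be displayed side by side and compared with the original.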

Figure 4.2 Effect of a noise-corrupted phase or amplitude spectrum on the reconstructed image: (a) original image; (b) image recovered from a phase spectrum corrupted by 5 dB noise; (c) image recovered from an amplitude spectrum corrupted by 5 dB noise


Figure 4.3 (a) and (b): the original images; (c) and (d): the recovered images after exchanging their amplitude spectra while keeping their respective phase spectra


Another example is shown in Figure 4.3(a)-(d). Figure 4.3(a) shows a sculpture of a soldier on horseback in a city square, and Figure 4.3(b) is the upper part of a church building. After computing their phase and amplitude spectra by Equations 4.1 and 4.3, we keep their phase spectra and exchange their amplitude spectra. The images recovered by the inverse Fourier transform (Equation 4.2) are displayed in Figure 4.3(c) and (d), from which we can see that the mismatched amplitude spectra only cause some interference between the two original images, without strongly affecting recognition of the image content. In other words, the phase spectrum holds the main components of image information.
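The amplitude-exchange experiment can be sketched as follows, assuming two grey-level images of the same size; the function name `swap_amplitude_spectra` is hypothetical, and random arrays stand in for the two photographs.

```python
import numpy as np

def swap_amplitude_spectra(image_a, image_b):
    """Exchange amplitude spectra while each image keeps its own phase spectrum."""
    if image_a.shape != image_b.shape:
        raise ValueError("both images must have the same shape")
    F_a = np.fft.fft2(image_a)
    F_b = np.fft.fft2(image_b)
    # Phase of a with amplitude of b, and vice versa
    hybrid_a = np.fft.ifft2(np.abs(F_b) * np.exp(1j * np.angle(F_a))).real
    hybrid_b = np.fft.ifft2(np.abs(F_a) * np.exp(1j * np.angle(F_b))).real
    return hybrid_a, hybrid_b

rng = np.random.default_rng(0)
image_a = rng.random((64, 64))  # stand-ins; replace with two real images
image_b = rng.random((64, 64))
hybrid_a, hybrid_b = swap_amplitude_spectra(image_a, image_b)
```

Because the amplitude spectrum of a real image is symmetric and its phase spectrum antisymmetric, each recombined spectrum is still conjugate symmetric, so the imaginary part discarded by `.real` is only rounding error.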
