8.4 Visual Attention for Image Retargeting

One popular application of visual attention models is image retargeting. The traditional way to resize an image is to scale it by uniform down-sampling. The problem with scaling is that salient objects become smaller, which degrades the viewing experience and loses detail. Image cropping is an alternative that preserves the ROI by discarding regions of no interest; its drawback is that the context information in the image is lost [51, 52]. To overcome the limitations of scaling and cropping, many content-aware image retargeting algorithms [53–63] have been proposed. These algorithms design a visual significance map that measures the visual importance of each pixel for the resizing operation. The map is generally composed of a gradient map, a saliency map and some high-level feature maps such as a facial map, a motion map and so on [53–63]. This section introduces a saliency-based image retargeting algorithm [60] that uses a saliency map computed in the compressed domain to measure the visual importance of image pixels for image resizing. The underlying visual attention model in the compressed domain [60] was introduced in Chapter 4.

8.4.1 Literature Review for Image Retargeting

In the past few years, various advanced image retargeting algorithms have been proposed. Avidan and Shamir proposed the popular image retargeting algorithm called seam carving [53]. A seam is defined as an eight-connected path of low-energy pixels running from top to bottom or from left to right across the image, containing exactly one pixel in each row (for a vertical seam) or each column (for a horizontal seam). Seam carving reduces the width (or height) of an image by removing the least important seams, where a gradient map is used to determine the importance of each pixel. Later, Rubinstein et al. extended this algorithm to video retargeting by introducing the forward energy method [54]. Some similar algorithms based on seam carving have also been designed [55, 61].
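
The seam definition above can be sketched in a few lines of Python. This is a minimal illustration of the basic idea in [53], not the authors' implementation: the function names are hypothetical, NumPy is assumed to be available, and the energy is taken to be the sum of absolute horizontal and vertical gradients.

```python
import numpy as np

def gradient_energy(gray):
    """Energy map: sum of absolute vertical and horizontal gradients."""
    gy, gx = np.gradient(np.asarray(gray, dtype=float))
    return np.abs(gx) + np.abs(gy)

def find_vertical_seam(energy):
    """Column index, per row, of the minimum-energy eight-connected seam."""
    h, w = energy.shape
    cost = np.asarray(energy, dtype=float).copy()
    # Dynamic programming: each pixel adds the cheapest of its three upper neighbours.
    for i in range(1, h):
        prev = cost[i - 1]
        left = np.concatenate(([np.inf], prev[:-1]))
        right = np.concatenate((prev[1:], [np.inf]))
        cost[i] += np.minimum(np.minimum(left, prev), right)
    # Backtrack from the cheapest pixel in the last row.
    seam = np.empty(h, dtype=int)
    seam[-1] = int(np.argmin(cost[-1]))
    for i in range(h - 2, -1, -1):
        j = seam[i + 1]
        lo, hi = max(j - 1, 0), min(j + 2, w)
        seam[i] = lo + int(np.argmin(cost[i, lo:hi]))
    return seam

def remove_vertical_seam(img, seam):
    """Delete one pixel per row, reducing the width by one."""
    h, w = img.shape[:2]
    mask = np.ones((h, w), dtype=bool)
    mask[np.arange(h), seam] = False
    return img[mask].reshape(h, w - 1, *img.shape[2:])
```

Calling `find_vertical_seam(gradient_energy(gray))` and then `remove_vertical_seam` once per seam narrows the image one column at a time; horizontal seams can be handled by transposing the image.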

Other advanced image retargeting algorithms have also been proposed. Wolf et al. introduced a video retargeting algorithm that solves a linear system to determine the new position of each pixel [56]. In that study, the visual importance of each image pixel is measured by a visual importance map composed of local saliency detection, face detection and motion detection. Ren et al. proposed an image retargeting algorithm based on global energy optimization, in which the saliency map and face detection are combined to determine the visual importance of each image pixel [57]. Jin et al. presented a content-aware image resizing algorithm that warps a triangular mesh over the image, treating salient line features and curved features as important regions [62]. Guo et al. suggested an image retargeting algorithm that uses saliency-based mesh parameterization [63].

Recently, Rubinstein et al. conducted a user study and found that combining multiple operators (such as seam carving, cropping and so on) gives better results than using a single operator in image retargeting [58]. In this study, the authors proposed a multioperator media retargeting algorithm which combines seam carving, scaling and cropping operators to resize images. The amount of resizing assigned to each operator is chosen to maximize the similarity between the input image and the retargeted image. In [59], Dong et al. introduced an image retargeting algorithm that combines seam carving and scaling. The authors used a bidirectional similarity function of image Euclidean distance, a dominant colour descriptor similarity and seam energy variation to determine the best number of seam carving operations.

All these image retargeting algorithms are implemented in the spatial domain, but images on the internet are typically stored in the compressed JPEG format. Compressed JPEG images are widely used in internet-based applications, since they reduce storage space and increase download speed. To extract features from compressed JPEG images, the existing image retargeting algorithms have to decompress them from the compressed domain into the spatial domain, and this full decompression is costly in both computation and time. It is therefore important to design efficient image retargeting algorithms that operate directly in the compressed domain. In this section, an image retargeting algorithm in the compressed domain is introduced, which is built on the saliency detection model in the compressed domain [60]. Multiple operators, namely block-based seam carving and image scaling, are used to perform image resizing.

8.4.2 Saliency-based Image Retargeting in the Compressed Domain

The image retargeting algorithm [60] uses the saliency detection model in the compressed domain, which is built based on feature contrast of intensity, colour and texture extracted from DCT coefficients. This computational model of visual attention was introduced in Chapter 4. In the algorithm [60], the saliency map extracted in the compressed domain is used to measure the visual importance of each 8 × 8 DCT block for image resizing. Thus, this image retargeting algorithm performs image resizing at the 8 × 8 block level. The multioperators, including block-based seam carving and image scaling, are utilized to perform image resizing. The number of removed block-based seams is determined by the defined texture homogeneity of images.

The image resizing operation steps in the algorithm [60] are as follows: (1) determine the number of block-based seam carving operations based on the defined image homogeneity; (2) use block-based seam carving to resize the original image; (3) use image scaling to resize the retargeted image from block-based seam carving to obtain the final retargeted image. Here are the details.

1. Block-based seam carving operation: Since the final saliency map from [60] is obtained at block level, each seam consists of connected blocks instead of connected pixels in the original image. The saliency map is calculated from 8 × 8 DCT blocks, so the final saliency map has only 1/64 as many elements as the original image has pixels, and each value in the final saliency map represents the saliency of one 8 × 8 DCT block. A block-based seam carving method is defined based on the forward energy [54] to determine the optimal block seams. Based on the saliency map, SM, obtained in the compressed domain [60], block-based seam carving uses the following dynamic programming technique to determine the optimal block-based seams.

(8.36) equation

where M(i, j) is the cumulative cost at position (i, j) of the saliency map, from which the optimal block-based seams are determined; CL(i, j), CU(i, j) and CR(i, j) are the costs incurred by the new neighbouring blocks that become adjacent once a seam is removed. These costs are calculated as

(8.37) equation
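
For orientation, the sketch below applies the standard forward-energy recurrence of [54] to a block-level saliency map. It assumes that Equations 8.36 and 8.37 take the usual form, M(i, j) = SM(i, j) + min{M(i−1, j−1) + CL(i, j), M(i−1, j) + CU(i, j), M(i−1, j+1) + CR(i, j)}, with CU(i, j) = |SM(i, j+1) − SM(i, j−1)|, CL(i, j) = CU(i, j) + |SM(i−1, j) − SM(i, j−1)| and CR(i, j) = CU(i, j) + |SM(i−1, j) − SM(i, j+1)|; the exact definitions in [60] may differ. The function names, the border replication and the treatment of the first row are illustrative assumptions.

```python
import numpy as np

def find_block_seam(sm):
    """Vertical block seam (one block column per block row) with forward-energy costs."""
    sm = np.asarray(sm, dtype=float)
    h, w = sm.shape
    # Neighbour values, with borders replicated (an assumption for edge blocks).
    left = np.pad(sm, ((0, 0), (1, 0)), mode="edge")[:, :-1]   # SM(i, j-1)
    right = np.pad(sm, ((0, 0), (0, 1)), mode="edge")[:, 1:]   # SM(i, j+1)
    up = np.vstack([sm[:1], sm[:-1]])                          # SM(i-1, j)
    c_u = np.abs(right - left)
    c_l = c_u + np.abs(up - left)
    c_r = c_u + np.abs(up - right)

    m = np.empty((h, w))
    back = np.zeros((h, w), dtype=int)   # column offset (-1, 0, +1) of the predecessor
    m[0] = sm[0] + c_u[0]                # first row has no predecessor (assumed form)
    for i in range(1, h):
        prev = m[i - 1]
        cand = np.stack([
            np.concatenate(([np.inf], prev[:-1])) + c_l[i],   # come from upper left
            prev + c_u[i],                                    # come from directly above
            np.concatenate((prev[1:], [np.inf])) + c_r[i],    # come from upper right
        ])
        back[i] = np.argmin(cand, axis=0) - 1
        m[i] = sm[i] + np.min(cand, axis=0)

    # Backtrack from the cheapest block in the last row.
    seam = np.empty(h, dtype=int)
    seam[-1] = int(np.argmin(m[-1]))
    for i in range(h - 1, 0, -1):
        seam[i - 1] = seam[i] + back[i, seam[i]]
    return seam

def block_seam_to_pixel_columns(seam, block=8):
    """Half-open pixel column range covered by each block of the seam."""
    return [(int(j) * block, (int(j) + 1) * block) for j in seam]
```

Because each entry of the saliency map stands for one 8 × 8 block, removing a block seam narrows the image by eight pixels in each block row, which is what `block_seam_to_pixel_columns` makes explicit.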

2. Adaptive image retargeting: The optimal block-based seams can be determined by Equations 8.36 and 8.37. As introduced previously, the proposed image retargeting algorithm first utilizes block-based seam carving to resize the image. Then image scaling is used to obtain the final retargeted images. This model proposes to use image homogeneity to decide the number of removed block-based seams. The number of removed block-based seams in dimension p (horizontal or vertical) can be calculated as

(8.38) equation

Here, Equation 8.38 relates four quantities in dimension p: the number of removed block seams; the texture homogeneity of the image, which is used to determine that number; the length of the original image (width or height); and the length of the retargeted image (width or height). The length of the retargeted image is decided by the size of the display screen of the client, based on the initial communication between the server and the client in real applications. As the proposed algorithm is based on DCT blocks and the size of DCT blocks is 8 × 8, the number 8 is used to calculate the number of removed block-based seams in Equation 8.38.
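
As a rough illustration of how the quantities in Equation 8.38 interact, the sketch below assumes that the number of removed block seams is the texture homogeneity times the required reduction in length, divided by the block size 8 and rounded down. This form is an assumption consistent with the description above rather than a transcription of the equation, and the function and parameter names are hypothetical.

```python
import math

def num_block_seams(homogeneity, orig_len, target_len, block=8):
    """Assumed form: floor(homogeneity * (orig_len - target_len) / block)."""
    if not 0.0 <= homogeneity <= 1.0:
        raise ValueError("homogeneity is assumed to be normalized to [0, 1]")
    if target_len > orig_len:
        raise ValueError("this sketch only handles size reduction")
    return math.floor(homogeneity * (orig_len - target_len) / block)

def remaining_scale(orig_len, target_len, n_seams, block=8):
    """Scaling factor still needed after n_seams block seams have been removed."""
    return target_len / (orig_len - block * n_seams)
```

For example, under this assumption a homogeneity of 0.5 and a reduction from 640 to 480 pixels gives 10 block seams (80 pixels), and scaling covers the rest of the reduction from 560 to 480 pixels.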

In this model, a measure of texture homogeneity is defined to determine the number of removed block seams. The texture homogeneity defined here depends on the spatial distribution and the connectedness of the energy (the saliency map is also regarded as the energy map). If the image energy is more centralized and connected, there are probably only one or a few small salient objects in the image with a simple background. In this case, more seam carving is used. However, with a more disconnected and decentralized energy distribution, the image may include one or several big salient objects, or the context of the image may be complex. In this case, more image scaling is used to resize the image, to preserve these salient objects or the context information.

The texture homogeneity of the image in dimension p (horizontal or vertical) can be computed as

(8.39) equation

where the two terms are the spatial variance of the energy pixels in dimension p and νp, the connectedness of the energy pixels in dimension p. In this study, Otsu's thresholding algorithm [64] is used to binarize the energy map into energy pixels (energy value 1) and non-energy pixels (energy value 0).
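
The binarization step can be illustrated with a small NumPy version of Otsu's method [64], which picks the threshold that maximizes the between-class variance of the histogram. The bin count and the function names are assumptions made for this sketch; a library routine would serve equally well.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Threshold that maximizes the between-class variance of the histogram."""
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(hist)              # number of samples at or below each bin
    w1 = w0[-1] - w0                  # number of samples above each bin
    cum = np.cumsum(hist * centers)
    valid = (w0 > 0) & (w1 > 0)       # both classes must be non-empty
    if not valid.any():
        raise ValueError("map is constant; no threshold separates two classes")
    mu0 = cum[valid] / w0[valid]
    mu1 = (cum[-1] - cum[valid]) / w1[valid]
    between = w0[valid] * w1[valid] * (mu0 - mu1) ** 2
    return centers[valid][np.argmax(between)]

def binarize_energy(saliency):
    """Energy pixels get value 1, non-energy pixels get value 0."""
    sal = np.asarray(saliency, dtype=float)
    return (sal > otsu_threshold(sal.ravel())).astype(np.uint8)
```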

To simplify the description, this part mainly demonstrates how to calculate the horizontal variance of the energy pixels. The calculation process of the vertical variance of the energy pixels is similar. The horizontal variance of the energy pixels in the image can be calculated as

(8.40) equation

(8.41) equation

where E(i, j) is the energy value for the position (i, j); P represents the summation of all the energy pixels in the image,

$$P = \sum_{i} \sum_{j} E(i, j)$$

In Equation 8.40, H is the expected value of the spatial location of the energy in the image. Thus, the horizontal variance of the energy pixels can be obtained for the image based on Equations 8.40 and 8.41. The variance is then normalized to [0, 1] using two reference cases of energy homogeneity: when all the energy pixels are concentrated in one square in the image, the energy homogeneity is the largest; when all the energy pixels are distributed uniformly over the image, the energy homogeneity is the lowest. The normalized value is the spatial variance term used in Equation 8.39.

The connectedness of the energy pixels in the saliency map is measured by the number of energy pixels in the neighbourhood of all energy pixels. For each dimension (horizontal or vertical) of the image, there are at most six neighbour pixels for each energy pixel. The other two neighbour pixels are from the other dimension and thus not considered. The connectedness of the energy pixel i for the dimension p can be computed as follows [65].

(8.42) equation

where Mi includes all six neighbour pixels around i, and an indicator function denotes whether the neighbour pixel z is an energy pixel or not.

The connectedness of the image energy in dimension p is obtained as the sum of the connectedness of all energy pixels in the image as

(8.43) equation

where K is the number of energy pixels in the image.
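
The connectedness measure can be sketched as below, under several assumptions that are not confirmed by the text: for the horizontal dimension the six neighbours are taken to be those in the two adjacent columns (the two neighbours in the same column being the ones excluded), each pixel's connectedness is the fraction of its six neighbours that are energy pixels, and the image-level value is the average over the K energy pixels. The function name is hypothetical.

```python
import numpy as np

def horizontal_connectedness(E):
    """Average fraction of energy pixels among six assumed neighbours."""
    E = np.asarray(E).astype(bool)
    K = int(E.sum())                     # number of energy pixels
    if K == 0:
        return 0.0
    h, w = E.shape
    padded = np.pad(E, 1, mode="constant", constant_values=False)
    counts = np.zeros((h, w), dtype=int)
    # Neighbours in the left and right columns, rows i-1, i and i+1.
    for di in (-1, 0, 1):
        for dj in (-1, 1):
            counts += padded[1 + di:1 + di + h, 1 + dj:1 + dj + w]
    return counts[E].sum() / (6.0 * K)
```

Under these assumptions an isolated energy pixel scores 0, while pixels inside a compact region score close to 1.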

The connectedness of the image energy can be obtained from Equations 8.42 and 8.43. Here νp is normalized between 0 and 1 based on the energy connectedness: when the energy pixels in the image are concentrated in a connected square, the image texture has the largest connectedness value for this number of energy pixels; when the energy pixels are distributed uniformly over the image, the image texture has the lowest connectedness value.

Therefore, the number of removed block-based seams for an image can be obtained according to Equations 8.38–8.43. After block-based seam carving has removed the optimal block-based seams, image scaling is applied to the result to obtain the final retargeted image. Some experimental results can be found in Figure 8.11. In this figure, the retargeted results from the algorithm [60] compare favourably with the others in terms of visual quality.
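
The two resizing operators can be illustrated in the pixel domain as follows. This is only a sketch for readability: the algorithm [60] works on DCT blocks in the compressed domain, whereas the code below operates on a decoded array, uses nearest-neighbour scaling as a stand-in for the scaling operator, and uses hypothetical function names.

```python
import numpy as np

def remove_block_seam(img, seam, block=8):
    """Remove one block (block x block pixels) from each block row of img."""
    h = img.shape[0]
    if h != len(seam) * block:
        raise ValueError("seam must have one entry per block row")
    strips = []
    for bi, bj in enumerate(seam):
        strip = img[bi * block:(bi + 1) * block]
        # Drop the pixel columns covered by block column bj in this block row.
        strips.append(np.delete(strip, np.s_[bj * block:(bj + 1) * block], axis=1))
    return np.concatenate(strips, axis=0)

def scale_width_nearest(img, new_w):
    """Nearest-neighbour horizontal scaling to new_w columns."""
    cols = (np.arange(new_w) * img.shape[1] / new_w).astype(int)
    return img[:, cols]
```

Applying `remove_block_seam` once per selected seam and then `scale_width_nearest` mirrors the order of operations described above: seam carving first, scaling second.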

Figure 8.11 Comparison of different image retargeting algorithms. The first column: the original images. The second to fifth columns: the retargeted images from [54, 56, 66] and [60] respectively. The width and height of the retargeted images are 75% of the width and height of the original images respectively. © 2012 IEEE. Reprinted, with permission, from Y. Fang, Z. Chen, W. Lin, C. Lin, ‘Saliency detection in the compressed domain for adaptive image retargeting’, IEEE Transactions on Image Processing, Sept. 2012


In sum, this section introduces an adaptive image retargeting algorithm based on a saliency detection model in the compressed domain [60]. The saliency map is used as the visual significance map to measure the visual importance of image pixels for this image retargeting algorithm. The multioperator operation including block-based seam carving and image scaling is utilized for image resizing. The new idea of texture homogeneity is defined to determine the number of removed block-based seams in this algorithm [60]. Experimental results show the advantages of the saliency map in the application of image retargeting.
