Chapter 2

Multimodal Biometric and Fusion Technology

This chapter presents the basic concepts of multimodal biometrics and the different fusion rules at the different fusion levels. The chapter is organized as follows. Section 2.1 will give the idea of multimodal biometrics and the advantages of using multiple biometrics. The concept of fusion and the different levels of fusion will be discussed in Sections 2.2 and 2.3. The different fusion rules will be described in Section 2.4. A comparison of the different fusion rules at the different fusion levels will be presented in Section 2.5. Section 2.6 will summarize the chapter.

2.1 Multimodal Biometric Authentication Technology

In multimodal biometric authentication, multiple modalities are used for person identification. Two or more biometric traits are gathered from a person and are used to identify him/her uniquely. The user is asked to present multiple samples for identification, and the final recognition decision is based on all or a combination of them.

Use of more than one modality has many advantages as well as challenges associated with it. Modalities such as iris and fingerprint are unique and accurate and play an important role in the overall decision making process. Use of multiple modalities allows more users to enroll in the system, because they can present alternate modalities if the ones asked for are not presentable at that moment due to an illness, an injury or any other difficulty. Multimodality also raises the bar for spoofing, since it is considerably more difficult to fake multiple modalities simultaneously [41]. The accuracy of unimodal systems is adversely affected by the input sample quality [9, 8]. Use of multiple modalities provides robustness against sample quality degradation and helps to improve the reliability of a biometric system.

While designing a multimodal biometric system, one must consider the type of data to be acquired (e.g., two-dimensional or three-dimensional), the type of recognition algorithm to be used on each data element (e.g., Principal Component Analysis (PCA) or Independent Component Analysis (ICA)), the output of an algorithm (the distance or error metric), how to combine these outputs and the level at which the combination should be performed. All of these are major challenges in a multimodal biometric authentication system.

2.2 Fusion of Multimodalities

Compared to a unimodal biometric authentication system, a multimodal system requires more storage space to store the different types of samples. The benefits of a multimodal system may get overshadowed if it takes inordinate amounts of storage space and, consequently, increasing amounts of time for sample collection, template matching and decision making. To overcome these limitations, a possible approach is to combine multiple modalities. This approach is called fusion. Fusion can be performed at various stages in a biometric authentication system. These stages are called fusion levels and are shown in Fig. 2.1.

  • Sensor Level Fusion: In sensor level fusion, outputs of the different sensors are fused together to form a new input which is used in further stages [41].
  • Feature Level Fusion: Features extracted from the different biometric modalities are combined to form a single feature vector in this level of fusion [41].
  • Match-score Level Fusion: After comparing the stored template and the query sample, the matcher produces a measure of their similarity called a match score. The match scores produced by the different matchers are combined in this level of fusion [41].
  • Decision Level Fusion: The decisions made by the different matchers are combined in this level of fusion to arrive at the final recognition decision [41].

A detailed discussion of the above mentioned fusion techniques is presented in the next section.

The fusion techniques help in reducing the total number of feature vectors from the different modalities, which otherwise would have increased the storage space requirements manifold. By exploiting the specialist capabilities of each classifier, a combined classifier may be built that provides better results than a single classifier. In other words, combining the different experts results in a system that can outperform the experts taken individually. This is especially true if the different experts are not correlated [3]. It also increases the degrees of freedom [52]. The concern about the time spent in processing multiple modalities can be addressed by operating on the modalities in parallel.


Figure 2.1: Block diagram of the different levels of fusion.

2.3 Fusion Levels

This section discusses the various levels of fusion. A typical biometric system is shown in Fig. 2.1 with the sensor, feature extraction, matcher and decision making modules. Each modality requires a separate set of these four modules. Information available at the output of each of these modules can be combined and passed on to the next module. This process of combining information is known as fusion. Fusion can be carried out at four different places or levels, as shown in Fig. 2.1. A brief description of each level of fusion, the issues that must be handled in order to fuse information at that level, the limitations of information fusion at that level, and the advantages and disadvantages are presented in the following.

2.3.1 Sensor Level Fusion

Sensor level fusion is performed at the earliest stage, before the feature extraction phase. The sensors capture a biometric trait such as a facial image, an iris image or a fingerprint. The main objective of sensor level fusion is to fuse these captured samples, as they contain the richest amount of information about that biometric trait. According to Zhang et al. [58], fusion should be carried out as early as possible.

Issues in Sensor Level Fusion: Typically, sensor level fusion can be done if the samples represent the same biometric trait [41]. For instance, multiple snapshots of a face can be taken by cameras positioned at different locations and then combined to form a composite 3D image. Sensor level fusion is more beneficial and useful in multi-sample systems [42, 6, 11], where multiple samples of the same biometric trait are taken and combined to form a composite sample. The samples to be fused must be compatible.

Existing Approaches to Sensor Level Fusion: Sensor level fusion involves the formation of a composite image or signal [41]. A 3D face image can be generated from multiple views of the same face in two dimensions [26]. Mosaicing [18] of fingerprints is performed in order to form a composite fingerprint from multiple dab prints: the user provides multiple dab prints of his fingerprint and they are combined to form a fingerprint mosaic. Bowyer et al. [6, 7] used multiple 2D images of the human face along with infrared and 3D images of the same. These images are used in seven different combinations, ranging from the individual images to all three types of images, and the experimental results show that the 2D+3D+infrared combination produces the smallest EER of less than 0.002%. In other words, the multi-sample fusion of facial images provides the most accurate recognition performance as compared to that of the individual images.

Advantages: As the sensor outputs contain raw data, the information content is the richest at this level of fusion. Such a rich amount of data can contain noise which may be due to inappropriate lighting conditions, background noise, and presence of dirt and sweat on the sensor surface. Noise reduces the quality of a biometric sample and will degrade the accuracy of the authentication system [9, 8] but the richness of information available at this level allows for application of sample quality improvement techniques without the loss of any information. Raw data is also the largest in size and fusion of such data reduces the total required storage space.

Limitations: Sensor level fusion is applicable only to multi-instance and multi-sample biometric systems which contain multiple instances or samples of the same biometric trait. Multimodal biometric systems require samples from the different modalities to be combined. Samples from the different modalities may not be compatible [6] and hence cannot be used for sensor level fusion. Moreover, most commercial-off-the-shelf (COTS) products do not provide access to sensor outputs, in which case performing sensor level fusion is not possible.

2.3.2 Feature Level Fusion

This is the second level of information fusion in a biometric authentication system. The feature extraction phase produces feature vectors which contain the second richest level of information after the raw data captured by biometric sensors. Storing feature vectors instead of raw data requires comparatively less storage space. Still, these feature vectors are high dimensional, and storing such high dimensional feature vectors from the different modalities requires more space than for a single modality. Using simple techniques like averaging or weighted averaging, feature level fusion can reduce the total number of feature vectors [41], provided they originate from the same feature extraction algorithm.

Issues in feature level fusion: Feature sets of the different modalities are generally not compatible, as each modality has its own unique features (e.g., fingerprints contain minutia points which are defined by their type and orientation, whereas irises are identified by their pattern and color information, which are scalar entities). This type of fusion technique can be applied only when the feature sets are compatible and/or closely synchronized [57]. Modalities like hand geometry and palmprint, or voice and the corresponding lip and facial movements, are closely coupled modalities and can be synchronized by fusion at this level [56, 10]. Biometric systems that use feature level fusion may suffer from the curse of dimensionality if they use simple concatenation of feature vectors as a fusion technique [39].

Existing approaches to feature level fusion: Feature level fusion is generally achieved by concatenation of feature vectors [36, 24, 39]. Feature vectors obtained from the feature extraction stages of the different modalities are simply concatenated to form the resultant feature vector, which may have a higher dimension than any of its constituent vectors. Xu et al. [57] propose a novel approach to feature level fusion of ear and face profile using Kernel Canonical Correlation Analysis (KCCA), which first maps the feature vectors into a higher dimensional space before applying correlation analysis to them.
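To make the concatenation step concrete, the following is a minimal sketch of feature level fusion by concatenation. The modalities, feature dimensions and the z-score normalization step are illustrative assumptions, not taken from the cited works.

```python
import numpy as np

# Hypothetical feature vectors produced by two separate feature
# extraction modules (dimensions are arbitrary for illustration).
face_features = np.random.rand(120)   # e.g., a 120-D face feature vector
ear_features = np.random.rand(40)     # e.g., a 40-D ear feature vector

def zscore(v):
    # Bring both modalities to a common scale before concatenation.
    return (v - v.mean()) / (v.std() + 1e-12)

# Serial concatenation: the fused vector is higher dimensional than
# either constituent, which is the source of the curse-of-dimensionality
# concern discussed in this section.
fused = np.concatenate([zscore(face_features), zscore(ear_features)])
print(fused.shape)  # (160,)
```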

Advantages: Feature vectors contain the second richest level of information after the raw biometric data. Feature level fusion allows for synchronization of closely coupled modalities [56, 10], such as face and lip movements while talking, or modalities like hand geometry and palmprint. This synchronization can avert spoof attacks, as accurately faking facial movements and voice at the same time is not easy. It can also be used as a liveness detection technique.

Limitations: Methods like concatenation are generally used for combining feature vectors [57, 10]. Such a concatenation results in an increase in the total number of vector dimensions, and the system may suffer from the curse of dimensionality and, thus, degraded recognition performance. Storing a larger dimensional feature vector also takes up valuable storage space. Feature concatenation also results in the noise from the fused modalities getting added up as well. Noise reduces the quality of a captured biometric sample and will degrade the accuracy of the authentication system [9, 8]. Most of the commercial-off-the-shelf (COTS) products do not provide feature vectors, in which case performing feature level fusion is not possible.

2.3.3 Match-score Level Fusion

Match-score level fusion is the third level of fusion in a typical biometric authentication system. The biometric matcher module produces match scores, which are an indicator of the similarity or dissimilarity between the input sample and the one stored in a database. Match scores constitute the third richest level of information after raw data and feature vectors, and match-score level fusion aims at combining these match scores and using the resultant score to make a final recognition decision. As these scores are readily available, match-score level fusion is the most effective level of fusion [53].

Issues in match-score level fusion: Not all biometric matchers output their scores in a fixed format. Some biometric matchers may output similarity scores while others may provide dissimilarity scores. The scores may also follow different numerical ranges and distributions. These scores are required to be mapped to a common numerical range, and normalization techniques are used to achieve that [20, 45, 46, 47, 33]. A detailed study of normalization techniques is done by Jain et al. [20].
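As an illustration of the normalization step, the sketch below implements two commonly used techniques; the example scores are hypothetical, and the full range of techniques is surveyed in [20].

```python
import numpy as np

raw_scores = np.array([0.31, 0.78, 0.55, 0.92])  # hypothetical matcher outputs

def min_max(scores):
    # Map scores onto [0, 1] using the observed extremes.
    lo, hi = scores.min(), scores.max()
    return (scores - lo) / (hi - lo)

def z_score(scores):
    # Center on the mean and scale by the standard deviation.
    return (scores - scores.mean()) / scores.std()

def to_similarity(dissimilarity_scores):
    # Convert normalized dissimilarity scores into similarity scores.
    return 1.0 - min_max(dissimilarity_scores)

print(min_max(raw_scores))
print(z_score(raw_scores))
```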

Existing approaches for match-score level fusion: Score level fusion is the most popular level of fusion, and many research works are devoted to it [30, 37, 45, 47, 46, 17, 13, 33, 3, 2, 58, 4, 23, 1, 34]. Once the scores are normalized, simple arithmetic rules like sum [47, 51, 34], weighted sum [31, 34], product [51], weighted product, product of probabilities, sum of probabilities and min-max score [47] can be used. Techniques like matcher weighting and user weighting are explored by Snelick et al. [46]. Matcher weighting techniques assign weights to the matchers based on the accuracy of the corresponding biometric trait, while user weighting techniques assign user-specific weights [50] to each user. These weights are determined empirically.

Advantages: Commercial-off-the-shelf (COTS) products give access to the match scores; hence, they are readily available for fusion [41]. Match scores are simple real numbers within a certain numeric range and thus can go through any numerical transformation. They also have the lowest level of data complexity, which makes them both easier to understand and easier to combine. Match scores are the smallest in size compared to raw data and feature vectors.

Limitations: Match scores are required to be normalized before fusion. The robustness of normalization techniques plays an important role in the robustness of the biometric system that uses it [20].

2.3.4 Decision Level Fusion

This is the fourth level of information fusion in a biometric authentication system. The sample for each biometric trait goes through separate sensor, feature extraction and matching modules. The matchers make independent local decisions about the identity of a user. Decision level fusion aims at fusing these local decisions to form a final (global) recognition decision that classifies the user as genuine or an impostor.

Issues in decision level fusion: This is the least rich level of information, as the amount of information available for fusion can be as little as binary 0s and 1s (1 indicating a match and 0 indicating a non-match) or match/non-match labels. Since the final decision is made on a certain combination of the decisions output by the different matchers, their configuration, sequential or parallel, plays an important role in the formation of the final decision as well as in the level of security provided by the system [4, 51, 1].

Existing approaches to decision level fusion: Existing approaches consist of simple rules like the AND rule [4, 51, 1], the OR rule [51, 1], and majority and weighted majority voting [41]. Bayesian fusion rules [37, 2] are also used, which aim at transforming the decision labels output by individual matchers into probability values. The behavior knowledge space method [38] is a decision level fusion method which makes use of a look-up table to decide the final label based on the labels output by the individual matchers.

Advantages: Fusion of decisions allows for the use of independent, off-the-shelf unimodal biometric authentication products. Products from different vendors can also be used without worrying about their compatibility. Also, the security level of the entire system can be adjusted as per the requirement by changing the configuration in which the matchers are connected [4, 51, 1]. Decision level fusion makes use of the smallest and most unambiguous piece of information for fusion, i.e., the matcher decision. As most commercial-off-the-shelf (COTS) products provide access only to the final decision, this is often the only possible and feasible level of fusion.

Limitations: Relative performances of the matchers must be taken into consideration while making the final decision. Weighted fusion rules must assign matcher weights based on the relative accuracy of the matchers, so that the most accurate matcher plays an important role in determining the final decision.

2.4 Different Fusion Rules

Fusion levels make use of fusion rules for combining the data available. A survey of the different types of fusion rules and the issues related to them is performed in this section. Fusion rules are broadly classified into two categories: 1) Fixed fusion rules and 2) Trained fusion rules.

The rules falling in these two categories are described in the next subsections.

2.4.1 Fixed fusion rules

These rules are said to be fixed rules because, when fusion is performed using them, no training takes place. It is a linear process and no feedback is taken from the outputs in order to improve or optimize the results.

The following terms are used in the mathematical expressions for the fusion rules.

f = fused score
d = final decision
M = number of matchers
xi = output score of the ith matcher
di = decision of the ith matcher

2.4.1.1 AND Rule

In the AND rule [4, 51, 1], the final decision is a conjunction of the individual matcher decisions, as given in Eq. (2.1):

d = d_1 \wedge d_2 \wedge \cdots \wedge d_M    (2.1)

The matchers can be imagined to form a serial combination: the result is a match if and only if all the inputs result in a match. Systems using the AND rule for fusion exhibit a high FRR but a very low FAR, even as low as 0%, which is why this type of fusion gives the highest level of security.

2.4.1.2 OR Rule

In the OR rule [51, 1], the final decision is a disjunction of the individual matcher decisions, as given in Eq. (2.2):

d = d_1 \vee d_2 \vee \cdots \vee d_M    (2.2)

The matchers can be imagined to form a parallel combination: the result is a match if any one of the inputs results in a match. Systems using the OR rule for fusion exhibit a higher FAR, which is why this type of fusion gives the lowest level of security. These systems, however, have a very low FRR, even as low as 0%.

2.4.1.3 Majority Voting

In this rule, the final decision is the one that is in the majority among all the matchers, as shown in Eq. (2.3):

d = 1 if \sum_{i=1}^{M} d_i > M/2, and d = 0 otherwise    (2.3)

If the majority of the matchers are reliable and accurate, then the final decision can be trusted; otherwise, an accurate matcher may fail to affect the final output decision in the presence of a majority of less accurate matchers.

2.4.1.4 Maximum Rule

The maximum rule [41] gives the final output score as the maximum of all the matcher scores, as given in Eq. (2.4):

f = \max(x_1, x_2, \ldots, x_M)    (2.4)

The accuracy of the matcher with the maximum match score will determine the accuracy of the final decision.

2.4.1.5 Minimum Rule

In the minimum rule [41], the final output score is the minimum of all the matcher scores, as shown in Eq. (2.5):

f = \min(x_1, x_2, \ldots, x_M)    (2.5)

The accuracy of the matcher with the minimum match score will determine the accuracy of the final decision.

2.4.1.6 Sum Rule

The sum rule [34, 51] is similar to the OR rule, but for match scores. The final score is the sum of the individual match scores, as given in Eq. (2.6):

f = \sum_{i=1}^{M} x_i    (2.6)

This rule has been shown to perform better than the product rule and Linear Discriminant Analysis (LDA) [40, 12]. The best combination results are obtained by applying the simple sum rule [34]. The sum rule is not significantly affected by probability estimation errors, which explains its superiority [23]. The ROC performance of the sum rule is always better than that of the product rule [49].

2.4.1.7 Product Rule

The product rule [51] is similar to the AND rule, but for match scores. The final score is the product of all the input scores, as given in Eq. (2.7); the rule performs as if the matchers are arranged in a series combination:

f = \prod_{i=1}^{M} x_i    (2.7)

This rule assumes independence between the individual modalities. If the scores are normalized to the range [0, 1], then the final match score can be no greater than the smallest of the match scores.

2.4.1.8 Arithmetic Mean Rule

In the arithmetic mean rule [51], the final fused score is the arithmetic mean of the match scores of all the matchers, as given in Eq. (2.8):

f = \frac{1}{M} \sum_{i=1}^{M} x_i    (2.8)

The match scores should be in the same numerical range, i.e., normalized, so that they can be averaged properly.
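All of the fixed rules above reduce to a few lines of code. The following is a minimal sketch, assuming M = 3 matchers whose scores are already normalized to [0, 1] and whose binary decisions are available; the values are hypothetical.

```python
import numpy as np

scores = np.array([0.82, 0.67, 0.91])   # normalized scores from M matchers
decisions = np.array([1, 1, 0])         # binary decisions (1 = match)
M = len(decisions)

d_and = int(decisions.all())                 # AND rule, Eq. (2.1)
d_or = int(decisions.any())                  # OR rule, Eq. (2.2)
d_majority = int(decisions.sum() > M / 2)    # majority voting, Eq. (2.3)

f_max = scores.max()        # maximum rule, Eq. (2.4)
f_min = scores.min()        # minimum rule, Eq. (2.5)
f_sum = scores.sum()        # sum rule, Eq. (2.6)
f_product = scores.prod()   # product rule, Eq. (2.7)
f_mean = scores.mean()      # arithmetic mean rule, Eq. (2.8)
```

A fused score f is then compared against a decision threshold, while the decision-level rules yield the final decision d directly.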

2.4.2 Trained Fusion Rules

The trained rules, unlike the fixed rules, contain parameters that are obtained, refined and optimized over a training period. Some systems have learning phases which are used to learn specific weights.

The following terms are used in the mathematical expressions for the fusion rules.

f = fused score
d = final decision
M = number of matchers
xi = output score for the ith matcher
di = decision of the ith matcher

2.4.2.1 Weighted Sum Rule

The weighted sum rule [31, 34] is a modification of the simple sum rule. In this rule, each score is multiplied by a weighting factor and then the sum is calculated as given in Eq. (2.9). Each matcher carries a weight Wi. Matcher weights are calculated heuristically and empirically; generally, the matcher weights are directly proportional to their accuracy: the more accurate a matcher is, the more weight it carries.

f = \sum_{i=1}^{M} W_i x_i    (2.9)

where \sum_{i=1}^{M} W_i = 1.

2.4.2.2 Weighted Product Rule

This rule is a modification of the simple product rule. In this rule, each score is weighted and then the product is calculated as given in Eq. (2.10). Each matcher carries some weight Wi. Matcher weights are calculated using heuristic methods and are directly proportional to the matchers' accuracy: the more accurate a matcher is, the more weight it carries.

where \sum_{i=1}^{M} W_i = 1.

2.4.2.3 User Weighting

In user weighting [19, 41], each user is assigned a specific weight for each modality. The final score is calculated by multiplying the assigned weights with each modality's score, as given in Eq. (2.11):

f = \sum_{i=1}^{M} W_i^{u} x_i    (2.11)

where W_i^{u} is the weight assigned to the ith modality for user u. User weights are calculated heuristically and through empirical measurements. This method provides an improvement over the matcher weighting method. Performance improvement is seen with reduced FRR as the user convenience and the user's habituation with the system increase. Automatic learning and update of system parameters help in reducing the errors associated with an individual, thus improving system performance [19]. Not all users have all of their traits up to an acceptable level/standard, and assigning equal weights to such traits will not exploit their full potential and may dampen their uniqueness; this can also cause their false rejection. User-specific weighting and user-specific thresholds will accommodate more people into the system, thus reducing the system's FRR.

2.4.2.4 Fisher Linear Discriminant (FLD)

Fisher's linear discriminant [2, 58] is a classification method that projects high-dimensional data onto a line and performs classification in this one-dimensional space. The projection maximizes the distance between the means of the two classes while minimizing the variance within each class. The Fisher Linear Discriminant (FLD) gives a projection matrix W that reshapes the scatter of a data set to maximize class separability, defined as the ratio of the between-class scatter matrix to the within-class scatter matrix. This projection defines features that are optimally discriminating. For that purpose, FLD finds the W that maximizes the criterion function defined in Eq. (2.12):

J(W) = \frac{|W^T S_b W|}{|W^T S_w W|}    (2.12)

where Sw and Sb are the within-class scatter matrix and the between-class scatter matrix, respectively. These two matrices are defined in Eq. (2.13) and Eq. (2.14):

S_w = \sum_{j=1}^{c} \sum_{i=1}^{N_j} (x_i^j - \mu_j)(x_i^j - \mu_j)^T    (2.13)

S_b = \sum_{j=1}^{c} (\mu_j - \mu)(\mu_j - \mu)^T    (2.14)

In Eq. (2.13) and (2.14), x_i^j is the ith sample of class j, µj is the mean of class j, c is the number of classes, Nj is the number of samples in class j, and µ is the mean of all classes.

2.4.2.5 Support Vector Machine (SVM)

SVM [25, 34, 3, 2, 14] is a binary classifier that maps the input vectors xi ∈ Rd to output labels yi ∈ {−1, +1}, for i = 1, 2, 3, . . . , l, where l is the number of patterns to be classified. The binary classifier maps the d-dimensional data points into the label space of two classes (genuine and impostor). The main objective of SVM is to find an optimal separating hyperplane that separates the dataset into two groups and maximizes the separation between them. The separating hyperplane could be linear or nonlinear [14]. A general form of SVM can be represented as

f(x) = \sum_{i \in \Omega} \alpha_i y_i K(x, x_i) + b    (2.15)

where αi are the Lagrange multipliers, Ω indexes the support vectors (those for which αi > 0), yi are the output labels, b is the bias term, K(x, xi) is the kernel function and x is the input vector to be classified.

The kernel function decides the nature of the separating surface. This function can be a polynomial (see Eq. (2.16)) or a Radial Basis Function (RBF) (see Eq. (2.17)):

K(x, x_i) = (x \cdot x_i + 1)^p    (2.16)

K(x, x_i) = \exp\left(-\frac{\|x - x_i\|^2}{2\sigma^2}\right)    (2.17)

The choice of kernel function depends upon the satisfaction of Mercer's conditions. It has been shown that the polynomial kernel performs better than the RBF kernel [2]. The final decision is made depending on whether f(x) is above or below a threshold.

An experimental comparison of many fusion techniques along with SVM is made in [2]; it shows an FAR in the range [1.07, 2.09]% and an FRR in the range [0.0, 1.5]%, making SVM the best classifier among those studied in [2]. These results are well within acceptable limits and, as can be seen, an FRR of 0% has been achieved.
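As a concrete illustration of SVM-based score fusion, the sketch below trains a binary classifier on pairs of normalized match scores. The training data, polynomial degree and use of scikit-learn are assumptions for illustration, not the experimental setup of [2].

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical training data: each row holds the match scores produced by
# two matchers for one access attempt; label 1 = genuine, 0 = impostor.
X_train = np.array([[0.91, 0.85], [0.88, 0.79], [0.83, 0.90],
                    [0.15, 0.22], [0.30, 0.18], [0.25, 0.33]])
y_train = np.array([1, 1, 1, 0, 0, 0])

# Polynomial kernel, in line with the observation above that the
# polynomial kernel outperformed the RBF kernel in [2].
clf = SVC(kernel="poly", degree=2)
clf.fit(X_train, y_train)

# Fuse a new pair of match scores directly into an accept/reject decision.
print(clf.predict([[0.80, 0.75]]))  # this point lies close to the genuine cluster
```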

2.4.2.6 Multi Layer Perceptron (MLP)

An MLP [55, 2] is a supervised learning neural network consisting of many layers of nodes or neurons, as shown in Fig. 2.2. The input layer is denoted by i and accepts inputs such as a particular pattern or a feature vector from the biometric system. The next layer is called a hidden layer, since it exists only between the input layer and the output layer. There can be more than one hidden layer, and they are denoted by j. The final layer is the output layer, denoted by k, which provides the outputs to the outside world.


Figure 2.2: Multilayer perceptron network.

All the layers are fully connected, meaning a node in one layer is connected to every node in the next layer. Each of these connections or branches carries a specific weight. The weight matrices are wij for the weights between the input and the hidden layer and Wjk for the weights between the hidden layer and the output layer. These weights are obtained during the training phase.

Input to each node in the input layer undergoes identity transformation, Ii = Oi where Ii is the input and Oi is the output of the ith node. Input Ij to the hidden layer is given by the following transformation.

I_j = \sum_i w_{ij} O_i    (2.18)

where wij is the weight of the branch between nodes i and j, and Oi is the output of the ith node of the input layer.

The output of the hidden layer is given by

O_j = f(I_j)    (2.19)

where Ij and Oj are the input and output of the jth node, respectively, and f(·) is the activation function of the node.

The input Ik to the output layer is given by the following transformation.

I_k = \sum_j W_{jk} O_j    (2.20)

where Wjk is the weight of the branch between nodes j and k, and Oj is the output of the jth node of the hidden layer.

Once these inputs are available at the output layer, they undergo similar transformation as given by Eq. (2.19).

The MLP is a supervised learning network, which means that the desired output is available and the MLP is trained to deliver an output close to the desired output (e.g., for learning the AND rule, when the input is two 0s, the desired output is a 0). During the training phase, many input patterns are presented to the MLP along with the desired outputs, and the weights wij and Wjk are adjusted in such a way that the actual output is as close to the desired output as possible. The learning phase is completed when the MLP can deliver an actual output very close to the desired output. For continuous adjustment of the weights wij and Wjk, the MLP uses a back-propagation algorithm [15, 44, 28]. This algorithm calculates the error, which is derived from the difference between the actual and desired outputs. This error is propagated backwards from the output layer until all the weights up to the input layer are adjusted. The weights are adjusted as follows:

w_{kj}(t+1) = w_{kj}(t) + \eta \delta_k O_j    (2.21)

where wkj(t + 1) and wkj(t) are the weights connecting nodes k and j at iterations (t + 1) and t, respectively, η is the learning rate and δk is the error term of the kth output node.

The error term δk is computed from the actual output (Ok) and the desired output (dk) as follows.

\delta_k = O_k (1 - O_k)(d_k - O_k)    (2.22)

This error is then back propagated through the entire network and the weights are adjusted as follows.

w_{ij}(t+1) = w_{ij}(t) + \eta \delta_j O_i, \qquad \delta_j = O_j (1 - O_j) \sum_k \delta_k W_{jk}    (2.23)
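The following sketch wires Eqs. (2.18)–(2.23) together for a single training iteration of a small network with a sigmoid activation. The layer sizes, learning rate and input values are illustrative assumptions.

```python
import numpy as np

def f(x):
    # Sigmoid activation used in Eq. (2.19); note f'(I) = O(1 - O).
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
w_ij = rng.normal(size=(2, 3))   # input -> hidden weights
W_jk = rng.normal(size=(3, 1))   # hidden -> output weights
eta = 0.5                        # learning rate

O_i = np.array([0.6, 0.4])       # input layer outputs (identity transform)
d_k = np.array([1.0])            # desired output

# Forward pass, Eqs. (2.18)-(2.20).
I_j = O_i @ w_ij                 # Eq. (2.18)
O_j = f(I_j)                     # Eq. (2.19)
I_k = O_j @ W_jk                 # Eq. (2.20)
O_k = f(I_k)                     # output-layer transformation

# Backward pass, Eqs. (2.21)-(2.23).
delta_k = O_k * (1 - O_k) * (d_k - O_k)         # Eq. (2.22)
delta_j = O_j * (1 - O_j) * (W_jk @ delta_k)    # back-propagated error
W_jk += eta * np.outer(O_j, delta_k)            # Eq. (2.21)
w_ij += eta * np.outer(O_i, delta_j)            # Eq. (2.23)
```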

2.4.2.7 Mixture-of-Experts (MOE)

The Mixture-of-Experts rule [25] consists of two layers, as shown in Fig. 2.3.

Layer of local experts: This layer consists of a number of local experts. Each local expert contains some classifiers, and these classifiers work on a single modality. Each classifier uses its own matching algorithms and comes up with a local score. These local scores are then passed along to the second layer.

Layer of gating network: This layer consists of a gating network [25]. This network combines the local scores into a final fused score, using a hard-switching fusion network, the sum and product rules in rule-based fusion, or support vector machines, multilayer perceptrons and binary decision trees in learning-based fusion. This score is then used to classify the user as genuine or an impostor.

The gating network may be one of the following.

1) Adaptive Hard-Switching Networks: The final fused score is given by Eq. (2.24):

f = \sum_{i=1}^{M} W_i x_i    (2.24)

In this mode of operation, Wi = 1 for a single selected matcher i and Wj = 0 for every other matcher j ≠ i. Effectively, the final score is the score of one of the matchers.


Figure 2.3: Block diagram of Mixture-of-Experts.

2) Adaptive Weighted Combination Networks: The final fused score is given by Eq. (2.25):

f = \sum_{i=1}^{M} W_i x_i    (2.25)

In this mode of operation, the classifier weights are nonzero. They fall in the range [0, 1] such that \sum_{i=1}^{M} W_i = 1.

The result is a weighted sum of all the classifier scores. This is termed a soft fusion method. Machine learning techniques are used to determine optimum values for the constants Wi.

3) Adaptive Nonlinear Fusion Networks: The previous two types of gating networks produce a linear separation between the impostor and genuine distributions (experimental results for these can be found in [25]). Adaptive nonlinear fusion networks instead make use of nonlinear classifiers, such as SVMs or decision-based neural networks, which gives more flexible decision boundaries.

In the SVM-based formulation, αj are Lagrange multipliers, S contains the indexes of the support vectors sj, B is a bias multiplier and K(s, sj) is a kernel function. The experimental results in [25] show this type of fusion to be the best in terms of performance.

2.4.2.8 Bimodal Fusion (BMF)

Bimodal fusion [10] fuses features from two modalities, hence the name. The two modalities go through different feature extraction phases, and the features obtained are then fused to form a combined feature vector. In this approach, multimodal fusion is based on late fusion and its variants; detailed discussions of this method can be found in [22, 35]. Late fusion, or fusion at the score level, involves combining the scores of the different classifiers, each of which has made an independent decision. The distinguishing factor here is the separation of the different modalities in the feature extraction phase.

2.4.2.9 Cross-Modal Fusion

In cross-modal fusion [10], the modalities go through a feature extraction phase, and the obtained features are then passed through another feature extraction technique in order to obtain cross-modal features. The cross-modal features are based on optimizing cross-correlations in a rotated cross-modal subspace, and the features are fused in that subspace. The individual combination methods used to extract features for cross-modal fusion try to capture the synchrony between the fused features. Therefore, features like facial movements and voice, which are closely synchronized, have generally benefited from such fusion techniques. Chetty et al. [10] explain a cross-modal technique using Latent Semantic Analysis (LSA) and Canonical Correlation Analysis (CCA). The LSA technique tries to extract the underlying semantic relationship between the face and audio vectors, while the CCA technique tries to find a linear mapping that maximizes the cross-correlation between the combined feature sets.

2.4.2.10 3-D Multimodal Fusion

The 3D multimodal fusion technique [10] requires 3D features. The 2D samples captured by the sensor are used to build a 3D model of that modality. Modalities such as face and fingerprint are prominently used in both their 2D and 3D forms. This requires appropriate supplementary information to construct 3D features from many 2D features, such as frontal and side views for 3D face generation, or sweep sensor outputs in the form of strips from the front and side views of a fingerprint. Various internal anatomical relationships between the modalities make it easy to find external and visible relationships between them from the sensor outputs; these include head movement during the pronunciation of a particular letter or word. This fusion technique can also help the system perform liveness checks by capturing and checking for such movements, and hence provide protection against spoof attacks, whether manual or computer generated.

2.4.2.11 Canonical Correlation Analysis (CCA), and Kernel Canonical Correlation Analysis (KCCA)

In these methods, the association between two feature vectors is measured. CCA is a classical multivariate method dealing with linear dependencies between feature vectors, and KCCA is a method that generalizes classical linear CCA to the nonlinear setting. These methods are described in the following.

Canonical Correlation Analysis (CCA): In this approach [57, 56], only two feature vectors are considered for fusion, and a correlation function is defined between them. CCA finds a pair of linear transformations, one for each of the vectors, that maximizes the correlation coefficient between the extracted features. CCA then finds additional pairs of variables that are maximally correlated, subject to the constraint that they are uncorrelated with the previous variables. This process is repeated until all correlation features of the two vectors are extracted. The canonical correlation features extracted in this way are used as discriminant information between the groups of feature vectors. The simplicity of using linear transformations leads to the disadvantage that this method is not able to identify all or most of the useful descriptors. CCA also suffers from the small sample size problem.

Kernel Canonical Correlation Analysis (KCCA): Kernels are functions or methods for mapping data into higher dimensional feature spaces. KCCA [57, 56] projects the features from lower dimensions into a higher dimensional Hilbert space [43]; in this higher dimensional space, the correlation coefficient between the two vectors is maximized. It is used to extract the nonlinear canonical correlation features associated with the modalities.
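To illustrate the CCA step, the sketch below uses scikit-learn's CCA to project two hypothetical feature sets into a correlated subspace and fuses the projections by concatenation. This mirrors the general idea only; it is not the KCCA method of [57], and the data are random placeholders.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(1)
ear = rng.normal(size=(100, 20))    # hypothetical ear feature vectors
face = rng.normal(size=(100, 30))   # hypothetical profile-face feature vectors

# Extract 5 pairs of maximally correlated canonical variates.
cca = CCA(n_components=5)
ear_c, face_c = cca.fit_transform(ear, face)

# Fuse in the correlated subspace, e.g., by concatenation.
fused = np.hstack([ear_c, face_c])  # shape (100, 10)
print(fused.shape)
```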

2.4.2.12 Simple and Weighted Average

Simple averaging [52] of the outputs is also called the Basic Ensemble Method (BEM). It is used when the individual classifiers produce a continuous output. It is assumed that the classifiers are mutually independent and that there is no bias toward any of the classifiers, i.e., the accuracy and efficiency of each classifier are the same; therefore, they all carry the same weight. The final score is calculated as in Eq. (2.26):

f = \frac{1}{M} \sum_{i=1}^{M} x_i    (2.26)

Weighted averaging [51] of the outputs is called the Generalized Ensemble Method (GEM). The assumption in BEM about the mutual independence of the classifiers and their unbiasedness is not a practical one: it is very natural for classifiers to have different accuracies, and they should thus carry different weights. The final score is given by Eq. (2.27):

f = \sum_{i=1}^{M} w_i x_i    (2.27)

where \sum_{i=1}^{M} w_i = 1.

2.4.2.13 Optimal Weighting Method (OWM)

OWM [52, 51] is an effective way of combining the classifiers linearly. But, there might exist nonlinear effects of classifier combination and interactions among the classifiers. Ignoring these interactions may result in inaccurate results.

2.4.2.14 Likelihood Ratio-Based Biometric Score Fusion

This method of fusion [33, 32] uses the likelihood ratio test [33, 41]. It is a form of density-based score level fusion. It requires the explicit estimation of genuine and imposter score densities [41]. Accuracy of the density-based fusion technique depends upon the accuracy with which the densities are calculated. The likelihood ratio test is as shown in the following.

According to the Neyman-Pearson theorem [33]

\psi(x) = \begin{cases} 1, & \text{if } \frac{f_{gen}(x)}{f_{imp}(x)} \geq \eta \\ 0, & \text{if } \frac{f_{gen}(x)}{f_{imp}(x)} < \eta \end{cases}    (2.28)

where,

X = [X1, X2, . . . , XM] : match scores of M different matchers

ƒgen(x) = conditional joint densities of the M matchers given the genuine class
ƒimp(x) = conditional joint densities of the M matchers given the imposter class
α = a given level, to be kept fixed, generally it is the FAR
η = decision threshold
H0 = hypothesis that the user is imposter
H1 = hypothesis that the user is genuine
ψ = statistical test for testing the hypotheses.

According to the Neyman-Pearson theorem, the optimal test to determine whether the user is a genuine user or an impostor, at a given FAR α, is the likelihood ratio test given by Eq. (2.29):

LR(x) = \frac{f_{gen}(x)}{f_{imp}(x)}    (2.29)

The threshold η is chosen such that the GAR is maximized. When the underlying densities are known, this rule guarantees that there is no other rule with a higher GAR. The densities ƒgen(x) and ƒimp(x) are calculated from the training set.
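A minimal sketch of density-based score fusion follows, using single Gaussian estimates for ƒgen and ƒimp; this is a simplification for illustration (the densities in [33] are estimated with mixture models), and the scores are hypothetical.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Hypothetical training match-score vectors (two matchers per attempt).
genuine = np.array([[0.90, 0.80], [0.85, 0.90], [0.80, 0.75], [0.95, 0.85]])
impostor = np.array([[0.20, 0.30], [0.30, 0.25], [0.15, 0.20], [0.25, 0.35]])

# Fit Gaussian densities for f_gen and f_imp to the training scores.
f_gen = multivariate_normal(genuine.mean(axis=0), np.cov(genuine.T))
f_imp = multivariate_normal(impostor.mean(axis=0), np.cov(impostor.T))

def likelihood_ratio_test(x, eta=1.0):
    # Accept H1 (genuine) when the likelihood ratio reaches eta, Eq. (2.28).
    return f_gen.pdf(x) / f_imp.pdf(x) >= eta

print(likelihood_ratio_test(np.array([0.88, 0.82])))  # point near the genuine cluster
```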

2.4.2.15 Borda Count Method

The Borda count method [30, 41] is used in rank level fusion [30]. In rank level fusion, each user is assigned a rank by all the matchers based on their relative similarity with the input query sample. This is useful in the identification scenario rather than in the verification scenario. Ranks give a relative ordering among the users, starting from the user with the highest rank to the one with the lowest rank. Unlike match scores, ranks do not give any details about the measure of similarity or dissimilarity between the input sample and the one stored in the database, but a ranking gives more detail than just a match or non-match decision. In the Borda count method, the ranks assigned to the users by the biometric matchers are summed up to form a statistic sk for user k (see Eq. (2.30)).

s_k = \sum_{j=1}^{R} r_{j,k}    (2.30)

where rj,k is the rank assigned to the kth user by the jth matcher, for j = 1, . . . , R and k = 1, . . . , M.

The ranks are revised based on this statistic sk for each user: the user with the smallest value of sk gets the highest rank. The Borda count method assumes that all the matchers have equal accuracies, which cannot always be true. Also, a simple summation of ranks will put users with average ranks on the same level as those that have scored the best rank for one matcher but the worst for another (e.g., a user with a rank of 3 from two matchers will have a final statistic sk of 6, but a user with a rank of 1 from the first matcher and 5 from the other will also have the same statistic of 6). Such a high discrepancy in a user's ranks provided by the different matchers clearly indicates a difference in the accuracies of those matchers, something that is not considered by the Borda count method.
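A minimal sketch of the Borda statistic follows; the rank matrix is hypothetical.

```python
import numpy as np

# ranks[j][k]: rank assigned to user k by matcher j (1 = best match).
ranks = np.array([
    [1, 3, 2],   # matcher 1
    [2, 1, 3],   # matcher 2
    [1, 2, 3],   # matcher 3
])

s = ranks.sum(axis=0)     # Borda statistic s_k, Eq. (2.30)
best = int(np.argmin(s))  # the lowest summed rank receives the top rank
print(s, "-> identified user:", best)  # [4 6 8] -> identified user: 0
```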

2.4.2.16 Logistic Regression Method

Logistic Regression [41] is a generalization of the Borda count method [30]. A weighted sum of ranks assigned to a user is performed for each user to form a statistic sk (see Eq. (2.31)).

s_k = \sum_{j=1}^{R} w_j r_{j,k}    (2.31)

where rj,k is the rank assigned to the kth user by the jth matcher and wj is the weight assigned to the jth matcher, for j = 1, . . . , R and k = 1, . . . , M.

The weights are determined in a training phase by the logistic regression method [48]. A matcher with higher accuracy is assigned a higher weight. This overcomes the drawback of the Borda count method related to the assumption that all matchers are of equal accuracy. However, the logistic regression method does require a training phase to calculate the weights accurately.

2.4.2.17 Kernel Fischer Discriminant Analysis (KFDA)

KFDA [17] is used to solve the problem of a nonlinear decision boundary between the genuine and impostor distributions, where conventional linear methods degrade the fusion accuracy. It can control the security level (FAR, FRR) through a threshold, as it does not fix a unique decision boundary the way SVM and other methods do.

The KFDA works as follows. The input feature vectors are first projected into a very large, possibly infinite, functional space via a nonlinear mapping, and Linear Discriminant Analysis (LDA) is applied there. The LDA projects the data onto a line, which makes it easier to discriminate linearly and maximizes the ratio of inter-class scatter to intra-class scatter.

2.4.2.18 Minimum Cost Bayesian Classifier

Minimum cost Bayesian classifier [2, 54] is used in decision level fusion. This works on the principle of minimization of the expected cost function called Bayes Risk. The mathematical formulation of the problem is shown in Eq. (2.32).

where i, j ∈ {0, 1}, Cij is the cost of deciding a = i when w = j is present, and w = 1 or 0 denotes the presence or absence of the claimed identity, respectively. Considering these four combinations of a and w, Eq. (2.32) becomes Eq. (2.33).

The posterior authentication probability is given in Eq. (2.34) using Bayes' theorem.

In Eq. (2.34), f(x|w) is a likelihood function. The joint density of local authentication probabilities is given in Eq. (2.35).

If P (w = 1) is replaced by g then Eq. (2.35) becomes Eq. (2.36).

Therefore,

P(w = 1 | x) = \frac{g f(x | w = 1)}{g f(x | w = 1) + (1 - g) f(x | w = 0)}    (2.37)

For n sensors, the total likelihood function is given in Eq. (2.38):

f(x | w) = \prod_{i=1}^{n} f_i(x_i | w)    (2.38)

where fi(xi|w) is the likelihood function of sensor i. The sensors are assumed to be independent, and thus the total likelihood function is the product of the likelihood functions of all the sensors. The value of a is decided with the help of the standard 0-1 cost function.

a = \begin{cases} 1, & \text{if } P(w = 1 | x) \geq P(w = 0 | x) \\ 0, & \text{otherwise} \end{cases}    (2.39)

The most important part in all these calculations is the accurate modeling of the likelihood function. The choice of a proper function is pivotal in deciding the quality of the probability fusion.

An experimental comparison of many fusion techniques along with the minimum risk Bayesian classifier is made in [2]; it shows an FAR in the range [0.63, 1.5]% and an FRR in the range [0.0, 1.75]%, making it among the best of the classifiers studied in [2]. These results are well within acceptable limits and, as can be seen, an FRR of 0% has been achieved.

2.4.2.19 Decision Tree

A decision tree [2, 27] consists of nodes and branches. The nodes are tests that are performed on a particular attribute of the data, and each leaf corresponds to a particular label. The path from the root node to a leaf node thus consists of a series of tests that classify the input data into a particular class. In other words, a decision tree gives a graphical representation of the classification process. Decision trees recursively partition the data points in an axis-parallel manner; at each node, the test splits the input on some discriminating feature and thus provides a natural way of feature selection.

2.5 Comparative Study of Fusion Rules at the Different Fusion Levels

In the previous sections, multimodal biometric fusion levels and fusion rules were reviewed. The observations on the different fusion levels and fusion rules are summarized in this section. Table 2.1 shows the different fusion rules and their applications at the different fusion levels.

A summary of observations on the state of the art in fusion levels, modalities and fusion rules is shown in Table 2.2. A comment on each work is also given in column 5 of Table 2.2.

The survey reveals that fusion at the match-score level is a favored choice for information fusion among researchers (refer to Table 2.2). This popularity could be attributed to the easy availability of match scores, the reduced complexity of fusing them and the speedup in the fusion process. Another benefit of using match-score level fusion is that biometric matchers can be chosen commercial-off-the-shelf (COTS), without any modifications, which in turn reduces system development cost.

Selecting an appropriate fusion rule depends upon many factors. As far as the level of security is concerned, the AND rule of fusion provides the highest level of security while OR rule provides the least.

Table 2.1: Fusion rules used at various fusion levels.


When a large number of training samples are available, use of SVMs is potentially time consuming. On the other hand, SVMs are unaffected by the dimensionality of feature vectors which makes them an attractive choice for fusing high-dimensional feature vectors. SVMs and minimum cost Bayesian classifier are some of the most accurate classifiers as compared to decision trees, FLD and MLP.

Table 2.2: Summary of state of the art fusion rules and fusion levels


From the existing literature, it can be observed that there is neither a fixed level of fusion nor a particular fusion rule that always gives the most accurate and reliable results along with an optimum level of computational, storage and temporal complexity. Five levels of fusion, a variety of biometric traits and more than two dozen fusion techniques provide plenty of opportunities for customization of a multimodal biometric authentication system. However, for all practical purposes, considering fusion complexity, reusability and efficiency, fusion at the match-score level would be the preferred choice of many system developers.

2.6 Summary

To meet the increasing demand for reliable, accurate and robust biometric-based authentication systems, multimodal biometrics have been advocated. A number of research works addressing the issues and challenges of multimodal biometrics have been reported in the recent literature. This chapter is an attempt to survey the latest state-of-the-art techniques in multimodal biometrics, and this work has been published in [21]. The literature across five fusion levels and more than two dozen fusion rules has been studied. This study should give the reader enough detail to make an appropriate choice of fusion level and fusion rules in building an accurate, robust and effective multimodal authentication system.

Bibliography

  • [1] Heikki Ailisto, Mikko Lindholm, Satu-Marja Makela, and Elena Vildjiounaite. Unobtrusive user identification with light biometrics. In Roope Raisamo, editor, Nordic Conference on Human-Computer Interaction 2004, pages 327–330. ACM, 2004. ISBN 1-58113-857-1. URL http://dblp.uni-trier.de/db/conf/nordichi/nordichi2004.html #AilistoLMV04.
  • [2] S. Ben-Yacoub, Y. Abdeljaoued, and E. Mayoraz. Fusion of face and speech data for person identity verification. IEEE Transactions on Neural Networks, 10(5):1065–1074, September 1999. doi: 10.1109/72.788647.
  • [3] S. Ben-Yacoub, J. Luttin, K. Jonsson, J. Matas, and J. Kittler. Audio-visual person verification. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1999, volume 1, pages –585 Vol. 1, 1999. doi: 10.1109/CVPR.1999.786997.
  • [4] F. Besbes, H. Trichili, and B. Solaiman. Multimodal biometric system based on fingerprint identification and iris recognition. In 3rd International Conference on Information and Communication Technologies: From Theory to Applications (ICTTA 2008), pages 1–5, April 2008. doi: 10.1109/ICTTA.2008.4530129.
  • [5] J. Bhatnagar, A. Kumar, and N. Saggar. A novel approach to improve biometric recognition using rank level fusion. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR’ 07), pages 1–6, 2007.
  • [6] K.W. Bowyer, K.I. Chang, P.J. Flynn, and Xin Chen. Face recognition using 2-d, 3-d, and infrared: Is multimodal better than multisample? In Proceedings of the IEEE, volume 94, pages 2000–2012, Nov. 2006. doi: 10.1109/JPROC.2006.885134.
  • [7] Kyong I. Chang, Kevin W. Bowyer, and Patrick J. Flynn. An evaluation of multimodal 2d+3d face biometrics. IEEE Transaction on Pattern Analysis Machine Intelligence, 27(4):619–624, 2005. ISSN 0162-8828. doi: http://dx.doi.org/10.1109/TPAMI. 2005.70.
  • [8] Y. Chen, S.C. Dass, and A.K. Jain. Localized iris image quality using 2-d wavelets. In International Conference on Biometric Authentication, pages 373–381, 2006.
  • [9] Yi Chen, Sarat Dass, and Anil Jain. Fingerprint quality indices for predicting authentication performance. In In: Proc. AVBPA, Springer LNCS-3546, pages 160–170, 2005.
  • [10] Girija Chetty and Michael Wagner. Audio-visual multimodal fusion for biometric person authentication and liveness verification. In Fang Chen and Julien Epps, editors, NICTA-HCS Net Multimodal User Interaction Workshop (MMUI 2005), volume 57 of CRPIT, pages 17–24, Sydney, Australia, 2005. ACS.
  • [11] Ming-Cheung Cheung, Man-Wai Mak, and Sun-Yuan Kung. Multi-sample data-dependent fusion of sorted score sequences for biometric verification. In International Conference on Acoustics, Speech, and Signal Processing, 2004., volume 5, pages 681–684, 2004.
  • [12] S. K. Dahel and Q. Xiao. Accuracy performance analysis of multimodal biometrics. In Proceedings of the 2003 IEEE Workshop on Information Assurance United States Military Academy, pages 170– 173. Information Assurance Workshop, 2003. IEEE Systems, Man and Cybernetics Society, 18-20 June 2003. doi: 10.1109/SMCSIA.2003.1232417.
  • [13] Sarat C. Dass, Karthik N, and Anil K. Jain. A principled approach to score level fusion in multimodal biometric systems. In In Proceedings 5th International Conference Audio- and Video-Based Biometric Person Authentication, pages 1049–1058, 2005.
  • [14] B. Gutschoven and P. Verlinde. Multi-modal identity verification using support vector machines (svm). In Proceedings of the Third International Conference on Information Fusion, 2000 (FUSION 2000), volume 2, pages THB3/3–THB3/8 vol.2, July 2000.
  • [15] Tzung-Pei Hong and Jyh-Jong Lee. Parallel neural learning by iteratively adjusting error thresholds. In Proceedings of International Conference on Parallel and Distributed Systems, 1998., pages 107–112, Dec 1998. doi: 10.1109/ICPADS.1998.741026.
  • [16] A. Humm, J. Hennebert, and R. Ingold. Combined handwriting and speech modalities for user authentication. IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans, 39(1):25–35, 2009.
  • [17] M. Ichino, H. Sakano, and N. Komatsu. Multimodal biometrics of lip movements and voice using kernel fisher discriminant analysis. In 9th International Conference on Control, Automation, Robotics and Vision (ICARCV ’06), pages 1–6, Dec. 2006. doi: 10.1109/ ICARCV.2006.345473.
  • [18] A. Jain and A. Ross. Fingerprint mosaicking. In IEEE International Conference on Acoustics, Speech, and Signal Processing, 2002. Proceedings. (ICASSP ’02), volume 4, pages IV–4064–IV–4067 vol.4, 2002. doi: 10.1109/ICASSP.2002.1004811.
  • [19] A. K. Jain and A. Ross. Learning user-specific parameters in a multibiometric system. In Proceedings of International Conference on Image Processing (ICIP’2002), volume 1, pages I: 57–60, Rochester, New York, September 22-25 2002.
  • [20] Anil Jain, Karthik Nandakumar, and Arun Ross. Score normalization in multimodal biometric systems. Pattern Recognition, 38 (12):2270–2285, December 2005.
  • [21] T. Joshi, S. Dey, and D. Samanta. Multimodal Biometrics: State of the Art in Fusion Techniques. International Journal of Biometrics (IJBM), 1(4):393–417, 2009.
  • [22] J. Kittler, G. Matas, K. Jonsson, and M. Sanchez. Combining evidence in personal identity verification systems. Pattern Recognition Letters, 18(9):845–852, 1997.
  • [23] Josef Kittler, Mohamad Hatef, Robert P. W. Duin, and Jiri Matas. On combining classifiers. IEEE Trans. Pattern Anal. Mach. Intell., 20(3):226–239, 1998. ISSN 0162-8828. doi: http://dx.doi.org/10.1109/34.667881 .
  • [24] A. Kumar and D. Zhang. Personal recognition using hand shape and texture. Image Processing, IEEE Transactions on, 15(8): 2454–2461, Aug. 2006. ISSN 1057-7149.
  • [25] S. Y. Kung and Man-Wai Mak. On consistent fusion of multimodal biometrics. In IEEE International Conference on Acoustics, Speech and Signal Processing, 2006. (ICASSP 2006), volume 5, pages V–1085 – V 1088, 2006.
  • [26] Hyun Cheol Lee, Eun Seok Kim, Gi Taek Hur, and Hee Young Choi. Generation of 3d facial expressions using 2d facial image. In ICIS ’05: Proceedings of the Fourth Annual ACIS International Conference on Computer and Information Science, pages 228–232, Washington, DC, USA, 2005. IEEE Computer Society. ISBN 0-7695-2296-3. doi: http://dx.doi.org/10.1109/ICIS.2005.68.
  • [27] Carsten Maple and Vitaly Schetinin. Using a bayesian averaging model for estimating the reliability of decisions in multimodal biometrics. In The First International Conference on Availability, Reliability and Security (ARES 2006), pages 929 – 935. IEEE Computer Society, April 2006. doi: http://dx.doi.org/10.1109/ARES.2006.141 .
  • [28] MLP. The multilayer perceptron classifier. http://europa.eu.int/en/comm/eurostat/research/supcom.95/16/result/node7.html , 1995.
  • [29] M. M. Monwar and M. Gavrilova. A robust authentication system using multiple biometrics. In Computer and Information Science, 2008.
  • [30] Md. Maruf Monwar and Marina Gavrilova. Fes: A system for combining face, ear and signature biometrics using rank level fusion. In Fifth International Conference on Information Technology: New Generations (ITNG 2008), pages 922–927, April 2008. doi: 10.1109/ITNG.2008.254.
  • [31] T. Nakagawa, I. Nakanishi, Y. Itoh, and Y. Fukui. Multi-modal biometrics authentication using on-line signature and voice pitch. In International Symposium on Intelligent Signal Processing and Communications (ISPACS 2006), pages 399–402, Dec. 2006. doi: 10.1109/ISPACS.2006.364913.
  • [32] Karthik Nandakumar, Yi Chen, A. K. Jain, and Sarat C. Dass. Quality-based score level fusion in multibiometric systems. In 18th International Conference on Pattern Recognition (ICPR 2006), volume 4, pages 473–476, 0-0 2006. doi: 10.1109/ICPR. 2006.951.
  • [33] Karthik Nandakumar, Yi Chen, Sarat C. Dass, and Anil Jain. Likelihood ratio-based biometric score fusion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2):342– 347, 2008. ISSN 0162-8828. doi: http://dx.doi.org/10.1109/TPAMI.2007.70796 .
  • [34] Michael Goh Kah Ong, Tee Connie, Andrew Teoh Beng Jin, and David Ngo Chek Ling. A single-sensor hand geometry and palmprint verification system. In WBMA ’03: Proceedings of the 2003 ACM SIGMM workshop on Biometrics methods and applications, pages 100–106, New York, NY, USA, 2003. ACM. ISBN 1-58113-779-6. doi: http://doi.acm.org/10.1145/982507.982526.
  • [35] N. Poh and J. Korczak. Hybrid biometric person authentication using face and voice features. In Proc. of Int. Conf. on Audio and Video-Based Biometric Person Authentication, pages 348–353, Halmstad, Sweden, June 2001.
  • [36] A. Rattani, D. R. Kisku, M. Bicego, and M. Tistarelli. Feature level fusion of face and fingerprint biometrics. In First IEEE International Conference on Biometrics: Theory, Applications, and Systems, 2007 (BTAS 2007)., pages 1–6, September 2007. doi: 10.1109/BTAS.2007.4401919.
  • [37] G. Richard, Y. Mengay, I. Guis, N. Suaudeau, J. Boudy, P. Lockwood, C. Fernandez, F. Fernandez, C. Kotropoulos, A. Tefas, Pitas, R. Heimgartner, P. Ryser, C. Beumier, P. Verlinde, S. Pigeon, G. Matas, J. Kittler, J. Biglin, Y. Abdeljaoued, E. Meurville, L. Besacier, M. Ansorge, G. Maitre, J. Luettin, S. Ben-Yacoub, B. Ruiz, K. Aldama, and J. Cortes. Multi modal verification for teleservices and security applications (m2vts). In IEEE International Conference on Multimedia Computing and Systems, 1999, volume 2, pages 1061–1064 vol.2, Jul 1999. doi: 10.1109/MMCS.1999.778659.
  • [38] F. Roli, G. Fumera, and J. Kittler. Fixed and trained combiners for fusion of imbalanced pattern classifiers. In Information Fusion, 2002. Proceedings of the Fifth International Conference on, volume 1, pages 278–284 vol.1, 2002. doi: 10.1109/ICIF.2002. 1021162.
  • [39] Arun Ross and Rohin Govindarajan. Feature level fusion using hand and face biometrics. In Proceedings of SPIE Conference on Biometric Technology for Human Identification II, volume 5779, pages 196–204, 2005.
  • [40] Arun Ross and Anil Jain. Information fusion in biometrics. Pattern Recogn. Lett., 24(13):2115–2125, 2003. ISSN 0167-8655. doi: http://dx.doi.org/10.1016/S0167-8655(03)00079-5.
  • [41] Arun A. Ross, Karthik Nandakumar, and Anil K. Jain. Handbook of Multibiometrics. Springer-Verlag New York, Inc. Secaucus, NJ, USA, 2006.
  • [42] S.A. Samad, D.A. Ramli, and A. Hussain. A multi-sample single-source model using spectrographic features for biometric authentication. In 6th International Conference on Information Communications & Signal Processing, 2007, pages 1–5, Dec. 2007. doi: 10.1109/ICICS.2007.4449710.
  • [43] Giovanni Sansone. Orthogonal Functions. Dover Publications, 1991.
  • [44] S. Seung. Multilayer perceptrons and backpropagation learning. http://hebb.mit.edu/courses/9.641/2002/lectures/lecture04.pdf, 2002.
  • [45] Chang Shu and Xiaoqing Ding. Multi-biometrics fusion for identity verification. In 18th International Conference on Pattern Recognition (ICPR 2006), volume 4, pages 493–496, 2006. doi: 10.1109/ICPR.2006.821.
  • [46] R. Snelick, U. Uludag, A. Mink, M. Indovina, and A. Jain. Large-scale evaluation of multimodal biometric authentication using state-of-the-art systems. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(3):450–455, March 2005. ISSN 0162-8828. doi: 10.1109/TPAMI.2005.57.
  • [47] Robert Snelick, Mike Indovina, James Yen, and Alan Mink. Multimodal biometrics: issues in design and testing. In Sharon L. Oviatt, Trevor Darrell, Mark T. Maybury, and Wolfgang Wahlster, editors, Proceedings of the 5th international conference on Multimodal interfaces (ICMI ’03), pages 68–72. ACM, 2003. ISBN 1-58113-621-8. URL http://dblp.uni-trier.de/db/conf/icmi/icmi2003.html#SnelickIYM03 .
  • [48] Ying So. A tutorial on logistic regression. http://www.ats.ucla.edu/stat/sas/library/logistic.pdf, 1995.
  • [49] K.-A. Toh, Xudong Jiang, and Wei-Yun Yau. Exploiting global and local decisions for multimodal biometrics verification. IEEE Transactions on Signal Processing, 52(10):3059–3072, Oct. 2004. ISSN 1053-587X. doi: 10.1109/TSP.2004.833862.
  • [50] Kar-Ann Toh and Wei-Yun Yau. Some learning issues in user-specific multimodal biometrics. In 8th Control, Automation, Robotics and Vision Conference, 2004 (ICARCV 2004), volume 2, pages 1268– 1273, December 2004. doi: 10.1109/ICARCV.2004. 1469028.
  • [51] Kar-Ann Toh and Wei-Yun Yau. Combination of hyperbolic functions for multimodal biometrics data fusion. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 34(2):1196–1209, April 2004. ISSN 1083-4419. doi: 10.1109/TSMCB.2003.821868.
  • [52] Kar-Ann Toh, Wei-Yun Yau, and Xudong Jian. A reduced multivariate polynomial model for multimodal biometrics and classifiers fusion. IEEE Transactions on Circuits and Systems for Video Technology, 14(2):224–233, February 2004.
  • [53] Brad Ulery, William Fellner, Peter Hallinan, Austin Hicklin, and Craig Watson. Studies of biometric fusion. Technical report, National Institute of Standards and Technology, 2006.
  • [54] K. Veeramachaneni, L. A. Osadciw, and P. K. Varshney. Adaptive multimodal biometric fusion algorithm. In SPIE Aerosense, 2003.
  • [55] Rong Wang and Bir Bhanu. Performance prediction for multimodal biometrics. In 18th International Conference on Pattern Recognition (ICPR’06), 2006.
  • [56] Yu Wang, Zhi-Chun Mu, Ke Liu, and Jun Feng. Multimodal recognition based on pose transformation of ear and face images. In Proceedings of the 2007 International Conference on Wavelet Analysis and Pattern Recognition, Beijing, China, November 2007. IEEE.
  • [57] Xiaona Xu and Zhichun Mu. Feature fusion method based on kcca for ear and profile face based multimodal recognition. In IEEE International Conference on Automation and Logistics, 2007, pages 620–623, Aug. 2007. doi: 10.1109/ICAL.2007.4338638.
  • [58] Wenchao Zhang, Shiguang Shan, Wen Gao, Yizheng Chang, Bo Cao, and Peng Yang. Information fusion in face identification. In Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004), volume 3, pages 950–953 Vol.3, Aug. 2004. doi: 10.1109/ICPR.2004.1334686.