Statistical analysis can be performed using a lattice structure, which is useful for analyzing the alternative tags that could be assigned to each word. The lattice holds the forward/backward scores from which per-token tag probabilities are derived. The HmmDecoder class's tagMarginal method returns an instance of the TagLattice class, which represents such a lattice.
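To make the idea of forward/backward scores concrete, here is a minimal sketch of how per-token tag probabilities can be derived from them for a tiny hidden Markov model. It is not LingPipe's implementation: the class name ForwardBackwardSketch, the method tagMarginals, and every probability in main are hypothetical values chosen only for illustration.

```java
import java.util.Arrays;

public class ForwardBackwardSketch {

    /**
     * Returns marginals[t][j]: the probability that position t carries tag j,
     * given the whole observation sequence.
     * start[j]    = P(first tag is j)
     * trans[i][j] = P(next tag is j | current tag is i)
     * emit[j][w]  = P(word w | tag j)
     */
    static double[][] tagMarginals(double[] start, double[][] trans,
                                   double[][] emit, int[] obs) {
        int n = obs.length;
        int k = start.length;
        double[][] marginals = new double[n][k];
        if (n == 0) {
            return marginals;
        }

        // Forward pass: fwd[t][j] = P(obs[0..t], tag at t is j).
        double[][] fwd = new double[n][k];
        for (int j = 0; j < k; j++) {
            fwd[0][j] = start[j] * emit[j][obs[0]];
        }
        for (int t = 1; t < n; t++) {
            for (int j = 0; j < k; j++) {
                double sum = 0.0;
                for (int i = 0; i < k; i++) {
                    sum += fwd[t - 1][i] * trans[i][j];
                }
                fwd[t][j] = sum * emit[j][obs[t]];
            }
        }

        // Backward pass: bwd[t][i] = P(obs[t+1..n-1] | tag at t is i).
        double[][] bwd = new double[n][k];
        Arrays.fill(bwd[n - 1], 1.0);
        for (int t = n - 2; t >= 0; t--) {
            for (int i = 0; i < k; i++) {
                double sum = 0.0;
                for (int j = 0; j < k; j++) {
                    sum += trans[i][j] * emit[j][obs[t + 1]] * bwd[t + 1][j];
                }
                bwd[t][i] = sum;
            }
        }

        // Multiply forward and backward scores, then normalize each row.
        for (int t = 0; t < n; t++) {
            double total = 0.0;
            for (int j = 0; j < k; j++) {
                marginals[t][j] = fwd[t][j] * bwd[t][j];
                total += marginals[t][j];
            }
            for (int j = 0; j < k; j++) {
                marginals[t][j] /= total;
            }
        }
        return marginals;
    }

    public static void main(String[] args) {
        // Hypothetical toy model: tags {0 = noun, 1 = verb},
        // words {0 = "bill", 1 = "tear"}.
        double[] start = {0.6, 0.4};
        double[][] trans = {{0.3, 0.7}, {0.8, 0.2}};
        double[][] emit = {{0.7, 0.3}, {0.2, 0.8}};
        int[] obs = {0, 1, 0};

        double[][] m = tagMarginals(start, trans, emit, obs);
        for (int t = 0; t < m.length; t++) {
            System.out.printf("word %d  noun=%.3f  verb=%.3f%n",
                              obs[t], m[t][0], m[t][1]);
        }
    }
}
```

Each row of the result sums to 1, which is why the scores for a token can be read as competing probabilities for its tags. The sketch multiplies raw probabilities, so a long sentence could underflow; a production decoder would normally rescale or work with logarithms.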
We can examine each token of the lattice using an instance of the ConditionalClassification class. In the following example, a loop calls the lattice's tokenClassification method to obtain the ConditionalClassification instance for each token.
We are using the same tokenList instance that we developed in the previous section:
TagLattice<String> lattice = decoder.tagMarginal(tokenList);
for (int index = 0; index < tokenList.size(); index++) {
    ConditionalClassification classification =
        lattice.tokenClassification(index);
    ...
}
The ConditionalClassification class has a score and a category method, both of which take a rank, where rank 0 is the most likely result. The score method returns the relative score at that rank, and the category method returns the corresponding category, which is the tag. Each token is displayed followed by its four highest-ranked score/tag pairs, as shown here:
System.out.printf("%-8s", tokenList.get(index));
for (int i = 0; i < 4; ++i) {
    double score = classification.score(i);
    String tag = classification.category(i);
    System.out.printf("%7.3f/%-3s ", score, tag);
}
System.out.println();
The output is shown as follows:
Bill      0.974/np    0.018/nn    0.006/rb    0.001/nps
used      0.935/vbd   0.065/vbn   0.000/jj    0.000/rb
the       1.000/at    0.000/jj    0.000/pps   0.000/pp$$
force     0.977/nn    0.016/jj    0.006/vb    0.001/rb
to        0.944/to    0.055/in    0.000/rb    0.000/nn
force     0.945/vb    0.053/nn    0.002/rb    0.001/jj
the       1.000/at    0.000/jj    0.000/vb    0.000/nn
manager   0.982/nn    0.018/jj    0.000/nn$   0.000/vb
to        0.988/to    0.012/in    0.000/rb    0.000/nn
tear      0.991/vb    0.007/nn    0.001/rb    0.001/jj
the       1.000/at    0.000/jj    0.000/vb    0.000/nn
bill      0.994/nn    0.003/jj    0.002/rb    0.001/nns
in        0.990/in    0.004/rp    0.002/nn    0.001/jj
two.      0.960/nn    0.013/np    0.011/nns   0.008/rb
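One way to use these scores is to pick out the tokens whose best tag is least certain. The following is a small sketch that works on the top-ranked scores copied from the output above. The class name AmbiguousTokens, the method lowConfidence, and the 0.95 threshold are assumptions made for this illustration and are not part of LingPipe.

```java
import java.util.ArrayList;
import java.util.List;

public class AmbiguousTokens {

    // Returns the indexes of tokens whose top-ranked score is below the threshold.
    static List<Integer> lowConfidence(double[] topScores, double threshold) {
        List<Integer> indexes = new ArrayList<>();
        for (int index = 0; index < topScores.length; index++) {
            if (topScores[index] < threshold) {
                indexes.add(index);
            }
        }
        return indexes;
    }

    public static void main(String[] args) {
        // Tokens and their rank-0 scores, copied from the output shown above.
        String[] tokens = {"Bill", "used", "the", "force", "to", "force", "the",
                           "manager", "to", "tear", "the", "bill", "in", "two."};
        double[] topScores = {0.974, 0.935, 1.000, 0.977, 0.944, 0.945, 1.000,
                              0.982, 0.988, 0.991, 1.000, 0.994, 0.990, 0.960};

        // 0.95 is an arbitrary cutoff chosen for this sketch.
        for (int index : lowConfidence(topScores, 0.95)) {
            System.out.println(index + " " + tokens[index]);
        }
    }
}
```

With this cutoff, the tokens reported are used, the first to, and the second force, which are the three positions where a second tag receives a noticeable share of the probability.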