Data-Driven Evaluation of Ontologies ◾ 259
The leaf nodes in Figure 9.23 are propositionalized attributes, and each internal
node represents the merging of all of its descendant leaf nodes. The taxonomy is
generated automatically by PAT learner and may not be easily readable for humans.
However, in many applications, taxonomies specified by human experts are
unavailable. Manual construction of taxonomies requires a great deal of domain expertise,
and for large data sets with many attributes and many values for each
attribute, manual generation of PATs is extremely tedious and not feasible in practice.
Given this drawback of manual construction, PATs generated automatically by PAT
learner are useful in constructing concise and accurate classifiers when used with PAT-DTL.
9.4.3.2 Experimental Results from PAT-NBL
Comparison with naive Bayes learner and model selection criteria—To assess
how PAT-NBL algorithms evaluate taxonomies, we conducted experiments on data
sets from the UCI Machine Learning Repository (Blake and Merz, 1998) with the
following learning algorithm settings:
1. Naive Bayes learner
2. PAT-NBL algorithm with conditional log likelihood (CLL; Friedman et al.,
1997) criterion
3. PAT-NBL algorithm with conditional minimum description length (CMDL)
criterion
4. PAT-NBL algorithm with conditional Akaike information criterion (CAIC,
a conditional version of Akaike information criterion; Akaike, 1973) repre-
sented as:
$$\mathrm{CAIC}(B \mid D) = -\mathrm{CLL}(B \mid D) + \mathrm{size}(B)$$
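As a rough illustration of how such a criterion could drive model selection, the sketch below reads CAIC as the negative conditional log likelihood plus a size penalty. The function names, the representation of posteriors as dictionaries, and the convention that the lowest score wins are assumptions made for this example. The size term is passed in as a plain number because Equation 9.4 is not reproduced in this section.

```python
import math

def conditional_log_likelihood(class_posteriors, labels):
    """Sum of log P(c_i | x_i) over the data set.

    class_posteriors: one dict per instance mapping class label -> P(class | x)
    labels: the true class label of each instance
    """
    return sum(math.log(p[c]) for p, c in zip(class_posteriors, labels))

def caic(cll, size):
    """CAIC(B|D) = -CLL(B|D) + size(B); size is supplied by the caller."""
    return -cll + size

def select_model(candidates):
    """Pick the candidate name with the lowest CAIC (assumed: lower is better).

    candidates: dict mapping a model name to a (cll, size) pair
    """
    return min(candidates, key=lambda name: caic(*candidates[name]))

# Hypothetical scores for two classifiers on the same data.
candidates = {
    "original attributes": (-120.0, 60),   # better fit, larger model
    "taxonomy cut":        (-125.0, 30),   # slightly worse fit, smaller model
}
best = select_model(candidates)
```

With these made-up numbers the smaller model wins, because its size saving outweighs the loss in conditional log likelihood.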
To compare the performance of the algorithms, we adapted the t-test with 10-fold
cross-validation. Table 9.9 shows classifier accuracy and tree sizes on UCI data sets
for NBL on the original attributes, and PAT-NBL with CLL, CMDL, and CAIC.
The results show that no single algorithm achieved the
highest accuracy on all data sets. As for the sizes of the generated classifiers (a measure
of compactness), PAT-NBL coupled with PAT learner (Kang and Sohn, 2009) usually
generated more concise naive Bayes classifiers. The size of a naive Bayes
classifier can be measured by Equation 9.4.
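One common way to run a t-test over 10-fold cross-validation is to pair the per-fold accuracies of two classifiers and test whether their mean difference is zero. The sketch below assumes that paired form. The text does not spell out which variant was used, and the function name and fold accuracies are hypothetical.

```python
import math
from statistics import mean, stdev

# Two-sided critical value of Student's t at alpha = 0.05 with 9 degrees
# of freedom (10 folds - 1).
T_CRIT_DF9 = 2.262

def paired_t_statistic(acc_a, acc_b):
    """t statistic for the per-fold accuracy differences of two classifiers."""
    if len(acc_a) != len(acc_b) or len(acc_a) < 2:
        raise ValueError("need two equally long lists with at least 2 folds")
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    d_mean, d_sd = mean(diffs), stdev(diffs)
    if d_sd == 0:
        # All folds differ by the same amount: no variance to test against.
        return 0.0 if d_mean == 0 else math.copysign(math.inf, d_mean)
    return d_mean / (d_sd / math.sqrt(len(diffs)))

# Hypothetical per-fold accuracies for two classifiers on the same 10 folds.
nbl     = [0.90, 0.88, 0.91, 0.87, 0.89, 0.90, 0.92, 0.88, 0.89, 0.91]
pat_nbl = [0.89, 0.88, 0.90, 0.88, 0.89, 0.91, 0.91, 0.87, 0.90, 0.90]

t = paired_t_statistic(nbl, pat_nbl)
significant = abs(t) > T_CRIT_DF9
```

Pairing by fold removes the variation that comes from some folds being harder than others, so only the difference between the two classifiers is tested.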
Figure 9.24 illustrates one of the propositionalized attribute taxonomies
generated for the UCI Balance Scale Weight and Distance data set using PAT learner (Kang
and Sohn, 2009). After the original data set (the relation is shown in Figure 9.24(a))
is propositionalized (Figure 9.24(b)), each propositionalized attribute is binary (true
or false). Using the class-conditional distribution of each attribute when it is true,
we can find the pair of attributes whose similarity is maximal (or whose divergence
is minimal). Based on the divergence measure, we repeated the hierarchical
agglomerative clustering step to generate the PAT shown in Figure 9.24(c).
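The procedure described above can be sketched in a few lines. This is a minimal illustration, not the PAT learner of Kang and Sohn (2009). It assumes Jensen-Shannon divergence as the divergence measure and add-one smoothing of the class counts, and it treats a merged node's class counts as the sum of its children's counts. The function names and toy rows are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def propositionalize(rows):
    """Turn each attribute=value pair into a binary proposition and
    count the class labels of the instances where it is true."""
    counts = {}
    for attrs, label in rows:
        for attr, value in attrs.items():
            counts.setdefault(f"{attr}={value}", Counter())[label] += 1
    return counts

def distribution(counts, classes):
    """Class-conditional distribution with add-one smoothing (assumption)."""
    total = sum(counts[c] for c in classes) + len(classes)
    return [(counts[c] + 1) / total for c in classes]

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    def kl(x, y):
        return sum(a * math.log(a / b) for a, b in zip(x, y) if a > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def build_taxonomy(rows):
    """Repeatedly merge the two least divergent nodes until one root remains.
    Leaves are proposition strings; internal nodes are (left, right) tuples."""
    classes = sorted({label for _, label in rows})
    nodes = propositionalize(rows)
    def divergence(pair):
        return js_divergence(distribution(nodes[pair[0]], classes),
                             distribution(nodes[pair[1]], classes))
    while len(nodes) > 1:
        a, b = min(combinations(nodes, 2), key=divergence)
        nodes[(a, b)] = nodes.pop(a) + nodes.pop(b)  # merged class counts
    return next(iter(nodes))

# Toy relation: one attribute "w" with three values and two classes.
rows = [
    ({"w": "lo"}, "L"), ({"w": "lo"}, "L"), ({"w": "mid"}, "L"),
    ({"w": "hi"}, "R"), ({"w": "hi"}, "R"),
]
tree = build_taxonomy(rows)
```

In the toy data, `w=lo` and `w=mid` both predict class `L`, so they are merged first and `w=hi` joins only at the root. Summing counts at a merge double-counts instances for which both propositions are true, which can happen when the merged propositions come from different attributes. A fuller implementation would recount from the data.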