4

On the Empirical Status of
Measurement Axioms: The
Case of Subjective Probability

Ernest W. Adams

University of California, Berkeley

This chapter contributes to the study of the empirical status of axioms in theories of fundamental measurement, following the lines of a previous study (Adams, Fagot, & Robinson, 1970) (see also Pfanzagl, 1968; Adams, 1974; and Manders, 1977). Here we will focus on theories of subjective probability representation, specifically considering variations on theories discussed in Krantz, Luce, Suppes, and Tversky (1971), section 5.2; Fine (1973), section IIB; Narens (1985), section 2.8e; and Roberts (1979), section 8.5, which in turn are variations and refinements on theories due to de Finetti (1937); Koopman (1940); and Savage (1954), section 3.2. To get an idea of our problems and approach, let us go quickly over the methodological and model-theoretic background.

The axioms in axiom systems for subjective probability fall into the following two categories: those that are necessary conditions for the existence of subjective probability representations (“necessary conditions of representability,” Manders, 1977) and those that are not logically necessary but that have to be added to obtain sufficient conditions or that serve other theoretical purposes. Most of the necessary conditions included in well known axiom systems have the form of purely universal laws, such as that the subjective probability ordering should be transitive.1 These laws are “empirically transparent,” not only because they can be directly tested for logical consistency with observational data, but because they can be inductively confirmed, either as exactly true or as “true in most instances” (cf. Adams, 1974; however, see the section on Other Data, in this chapter, for an important qualification of this). The nonnecessary conditions that enter into the axiom systems we are concerned with are never purely universal, and they seldom have such a transparently empirical character. Savage’s nonnecessary axiom P6’ (Savage, 1954, p. 38) is typical. It says that if an event B is subjectively less probable than another event C, then it is possible to partition the event space into a finite number of “mini-events” in such a way that the union of B with any one of these is still subjectively less probable than C. It is easily seen that any finite amount of data whatever on subjective probability orderings must be logically consistent with this law taken by itself. Moreover, although Savage gave a persuasive a posteriori justification of P6’, to be commented on in section 4.2, it is not of a form that is subject to standard experimental confirmation, and it is not clear what a particular instance of it would be.2 Given this, one may be led to think that P6’ and similar laws stand in no need of empirical justification, because they have no empirical content. Like continuity and Archimedean assumptions, they seem to be technical conditions that are sufficiently justified by “theoretical reasons of state” (i.e., they are needed to attain a theoretical objective such as that of proving a representation theorem).3 Nevertheless the appearance of empirical contentlessness is deceiving, as the following elementary model-theoretic considerations show.

Let us picture the space of subjective probability “systems” as filling the rectangle shown in Fig. 4.1, subregions of which correspond to systems that satisfy particular conditions. The most important special class is the class P of all systems satisfying the probabilistic representability hypothesis (i.e., that there exist subjective probability representations for these systems), which is shown as the smallest rectangle in the figure and is in turn divided into subregions P1 and P2. Now fix on a particular system of axioms for subjective probability representations. A subset of its axioms will be purely universal laws that are necessary conditions for representability, and the class of systems satisfying these laws can be depicted as filling the large rectangle U. This is shown containing P as a subregion, because all representable systems must satisfy these axioms. We will suppose that the remaining axioms of the system are not all necessary conditions for representability, but together with the universal axioms, they are sufficient. The class of systems satisfying these axioms is represented as filling circle N (“nonnecessary conditions”), which does not contain P, because these conditions are not necessary, but whose intersection with U is inside P, because the totality of the universal and the nonnecessary conditions is sufficient for representability. One other region, U’, depicts the class of systems satisfying not only the universal laws of the particular axiom system in question but all possible universal laws that are necessary conditions of representability.

[Fig. 4.1. Diagram of the space of subjective probability systems: the rectangle U of systems satisfying the purely universal axioms, its subregion U’ satisfying all universal necessary conditions of representability, the circle N of systems satisfying the nonnecessary conditions, and the region P of representable systems with subregions P1 and P2.]

Let us note the following among the various model-theoretic relationships depicted by Fig. 4.1. First, U’ is shown as a proper subregion of U. This means that the purely universal laws of our axiom system do not by themselves logically entail all necessary conditions of representability. In fact, it follows from a fundamental theorem of Scott and Suppes (1958) that no finite set of purely universal axioms can logically entail all such conditions, and, therefore, unless our axiom system is infinite, U’ must be a proper subset of U.4

The second point is that P is shown as a proper subregion of U’, which means that not even the infinite totality of purely universal necessary conditions for representability is sufficient for this. This is an obvious consequence of the Skolem–Löwenheim Theorem (Löwenheim, 1915), given that the necessary conditions we are considering are first-order, and, therefore, they must have models of too high a cardinality to be representable by real-valued probabilities. The same argument shows that the nonuniversal laws defining region N must include at least one axiom that is not first-order, like condition P6’.

The third point is that, whereas the intersection of U and N is a subregion of P, it is a proper subregion of it, which simply means that the totality of the axioms is sufficient for representability but not necessary. However, it is methodologically important that, for many of the axiom systems that have been put forward, it can be shown that all representable systems not satisfying the axioms are subsystems of ones that do (i.e., all systems in P1 can be ‘extended to’ ones in P2). If this is so, then it follows that, although the axioms are logically stronger than the representability hypothesis, nevertheless any observational data consistent with the representability hypothesis must also be consistent with the axioms. This point will be returned to in section 4.1, but for now let us note that, although this could be seen as reinforcing the “mere technicality” thesis, the nonnecessary axioms nevertheless stand in need of empirical justification, because they do in fact have substantive content.

Finally, and most importantly for present purposes, because the totality of the axioms is sufficient for representability, as shown by the fact that the intersection of U and N is inside region P, they must also entail the totality of necessary conditions for representability, because P is itself a subregion of U’. Hence, given that U’ is a proper subregion of U, the totality of the axioms must logically entail some purely universal laws that are not entailed by the universal axioms alone. This in turn implies that there can be observational data that would be consistent with the purely universal laws defining U but that would be inconsistent with the universal laws plus the nonnecessary conditions defining N.5 In other words, although it may be that, taken by themselves, the nonnecessary axioms have no empirical content, because they are logically consistent with any data whatever, nevertheless they can have “contributory empirical content” in the context of a larger axiom system, because there may be data consistent with the system with the axioms deleted but not with the entire system. Even if such axioms are not directly testable by observation, they cannot be justified solely on the grounds of technical convenience.

The axiomatizer’s dilemma is that, in order to attain the objective of proving a representation theorem, it is necessary to include nonnecessary axioms that have contributory empirical content and, hence, that require justification by more than “reasons of state,” although they may have no independent empirical content. In the remarks that conclude this chapter, we will essay a few inconclusive speculations on how such axioms may be justified, but our main objective is more modest. It is simply to identify the axioms requiring such justification. The point is that, although we know a priori that certain nonnecessary axioms must have contributory empirical content, most axiom systems include more than one nonnecessary axiom, and we may wish to know which of these requires special attention.

In fact, our study is somewhat more fine-grained. It can happen that certain axioms are empirically contentless, even in a contributory sense, relative to certain kinds of data but not to data of other kinds. Thus, in the case of subjective probability, there are at least three kinds of atomic formulas that might be considered as data. The most obvious “empirical facts” are data about the subjective probability ordering, such as that one event is subjectively less probable than another. However, the language of subjective probability includes at least two other sorts of atomic formulas: those expressing the fact that one event is a “subevent” of another, and those expressing the fact that two events are the same. Perhaps we are inclined to regard these as expressing “purely logical facts” that play no role in an empirical evaluation of a theory, but, as with the appearance of mere technicality, we will see that this too can be misleading. Furthermore, even when the appearance is correct, it may be useful to prove this, which may also help to explain why we should regard subjective probability relations as more empirical than the logical relations of subeventhood and of identity.

Now we are ready to turn to details, beginning with the specification of certain theories of subjective probability, of certain kinds of empirical data, and of what it is for such data to be consistent with one of these theories. Following that, we will state our main theorems on empirical equivalences and inequivalences between theories of subjective probability defined in terms of their consistency with data of certain kinds, which will allow us to identify the nonnecessary axioms in them that have contributory empirical content. A concluding section will make some speculative remarks about several methodological issues, including the justification of these axioms. Because similar theorems are proved in Adams, Fagot, and Robinson (1970), we will include only one sample proof, which will give some idea of the essential mathematical methods employed and also of the limitations that seem to bar the way to extensions to more complex theories. Also it should be stressed that the present study makes no pretense to exhaustiveness in its survey of theories of subjective probability. Certain representative ones are considered, but our objective is to illustrate themes and methods, not to apply them to all cases.

BASIC DEFINITIONS

The axiom systems to be considered apply to ordered boolean algebras (OBA), which will themselves be defined in stages. A boolean algebra of sets (BAS) is an ordered septuple BAS(K) = 〈K, ∩, ∪, –, ∅, V, ⊆〉 in which K is a set of subsets of some fixed nonempty set V, which is closed under intersection, union, and complementation with respect to V; ∩, ∪, and – are the intersection, union, and complementation operations; ∅ is the empty set; and ⊆ is the subset relation restricted to K. Obviously BAS(K) is uniquely determined by K. Two boolean algebras of sets that will be of special importance in what follows are the systems U = BAS([0, 1]) and U– = BAS((0, 1]). In the first of these, K is the set of all finite (possibly empty) unions of subintervals of the closed unit interval [0, 1], and, in the second, K is the set of finite unions of left-open, right-closed subintervals of (0, 1].

A boolean algebra of events, or more briefly a boolean algebra (BA), is any septuple 𝒜 = 〈K, ∩, ∪, –, ∅, V, ⊆〉 that is homomorphic to a boolean algebra of sets. Stone’s Representation Theorem (Stone, 1936) shows that BAs can also be characterized by first-order axioms, but the present characterization is simpler for our purposes.

A probability function for an arbitrary BA, 𝒜 = 〈K, ∩, ∪, –, ∅, V, ⊆〉, is a real-valued function p mapping K into [0, 1] and satisfying the standard Kolmogorov (1956) axioms: for all x and y in K, (a) if x ⊆ y, then p(x) ≤ p(y); (b) p(V) = 1; and (c) if x ∩ y ⊆ ∅ (i.e., if x and y are disjoint), then p(x ∪ y) = p(x) + p(y). Such a function is regular if p(x) = 0 only for x ⊆ ∅. Note that the two functions pU and pU–, which map the finite unions of subintervals of [0, 1] and (0, 1] into their total lengths, are probability functions for U and U– respectively, and, furthermore, pU– is regular.
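Where a small example may help, here is a minimal Python sketch (our illustration, not part of the original text) that checks Kolmogorov’s conditions (a)–(c) for a candidate function p on a finite boolean algebra of sets, with K taken to be the full power set of V and events represented as frozensets:

```python
from itertools import combinations

def powerset(V):
    # all subsets of V as frozensets: a finite boolean algebra of sets
    s = list(V)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def is_probability_function(K, V, p, tol=1e-9):
    # check Kolmogorov's conditions (a)-(c) for p: K -> [0, 1]
    if abs(p[frozenset(V)] - 1.0) > tol:                  # (b) p(V) = 1
        return False
    for x in K:
        for y in K:
            if x <= y and p[x] > p[y] + tol:              # (a) monotonicity
                return False
            if not (x & y) and abs(p[x | y] - (p[x] + p[y])) > tol:
                return False                              # (c) finite additivity
    return True

V = {1, 2, 3}
K = powerset(V)
p = {x: len(x) / len(V) for x in K}                       # uniform measure
print(is_probability_function(K, V, p))                   # True
```

Regularity can be checked in the same style: p is regular just in case p[x] == 0 holds only for the empty event.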

An ordered boolean algebra (OBA) is a system 〈𝒜, ≾〉 in which 𝒜 is a BA, and ≾ is a weak ordering of its domain. Specifically, if K is the domain of 𝒜, then, for all x, y, and z in K, the following hold: (a) either x ≾ y or y ≾ x; and (b) if x ≾ y and y ≾ z, then x ≾ z. A (set-theoretical) theory of OBAs is simply a set of these structures, and we will regard an individual OBA as a degenerate theory that is the singleton set that contains that OBA. These are the theories with whose empirical equivalences we are concerned, which will in turn shed light on the empirical status of the axioms characterizing them. Empirical equivalence will be defined after we have stated further axioms and related conditions that characterize theories of particular interest.

When the equivalence

x ≾ y if and only if p(x) ≤ p(y)

holds for all x and y in K, we will say that ≾ is generated by p and that p represents ≾.6 Obviously any probability function p for 𝒜 generates a unique ordering ≾p of K, and 〈𝒜, ≾p〉 is the corresponding OBA. Furthermore, for most OBAs 〈𝒜, ≾〉, there is at most one probability function that represents the weak ordering ≾. Two important OBAs are the systems OBA(U) and OBA(U–), whose orderings are represented by the length measures on U and U– respectively.

When the probabilities p(x) and p(y) that attach to elements in the domains of BAs are subjective, the corresponding ordering relations x ≾p y are of particular importance, because they are commonly assumed to furnish the observational data from which the subjective probabilities are inferred. In consequence, theories of subjective probability measurement are generally stated in terms of these orderings. Definition 1 formulates conditions on these orderings that figure in well known theories of this kind. Note that Savage’s axiom P6’ is not included among the conditions listed, which reflects the fact that it is a condition for what Savage (1954, p. 34) calls an “almost agreeing” representation. This is more general than an “agreeing” representation, which is what we here call simply a “representation.” This is commented on at greater length later.

DEFINITION 1. Let 𝒜 = 〈K, ∩, ∪, –, ∅, V, ⊆〉 be a BA, and let ≾ be a weak ordering of K.

1.1. 〈𝒜, ≾〉 is a Basic de Finetti Structure if, for all x, y, and z in K: (i) if x ⊆ y, then x ≾ y; (ii) if x and z are disjoint, and y and z are disjoint, then x ∪ z ≾ y ∪ z if and only if x ≾ y; and (iii) not V ≾ ∅.

1.2. 〈𝒜, ≾〉 is regular if x ≾ ∅ implies x ⊆ ∅ for all x in K.

1.3. Members x1, …, xn of K form an equal n–partition of an element x of K if all are disjoint, x1 ~ … ~ xn (where ~ is the equivalence relation generated by ≾), and x ~ x1 ∪ … ∪ xn. x is equal n–partitionable if it has an equal n–partition, and 〈𝒜, ≾〉 is uniformly equal n–partitionable if every x in K is equal n–partitionable.

1.4. 〈𝒜, ≾〉 is Koopman Archimedean if, for all x < y in K (i.e., for all x and y such that not y ≾ x), there exist, for some n, an equal n–partition V1, …, Vn of V and some m ≤ n such that x ≾ V1 ∪ … ∪ Vm ≾ y.

1.5. 〈𝒜, ≾〉 is probabilistically representable if there exists a probability function for 𝒜 that represents ≾.

The three conditions defining Basic de Finetti Structures are direct consequences of axioms originally due to de Finetti (1937) (cf. also Roberts, 1979, p. 387). These are either postulated, or they are direct consequences of postulates in all of the well known direct axiomatizations of subjective probability independent of preference. Moreover, these are all necessary conditions for the probabilistic representability of an OBA.
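As a concrete illustration (ours, and only a sketch), the conditions of Definition 1.1, together with the weak-ordering requirements, can be checked mechanically on a small finite algebra. Here K is a collection of frozensets with universe V, and leq(x, y) encodes x ≾ y:

```python
from itertools import combinations

def powerset(V):
    s = list(V)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def is_basic_de_finetti(K, V, leq):
    empty = frozenset()
    for x in K:
        for y in K:
            if not (leq(x, y) or leq(y, x)):      # weak ordering: connectedness
                return False
            if x <= y and not leq(x, y):          # (i) inclusion implies <=
                return False
            for z in K:
                if leq(x, y) and leq(y, z) and not leq(x, z):
                    return False                  # weak ordering: transitivity
                if not (x & z) and not (y & z):   # (ii) additivity for disjoint z
                    if leq(x | z, y | z) != leq(x, y):
                        return False
    return not leq(frozenset(V), empty)           # (iii) not V <= empty

V = {1, 2}
K = powerset(V)
leq = lambda x, y: len(x) <= len(y)               # order events by cardinality
print(is_basic_de_finetti(K, V, leq))             # True
```

Ordering the events of a finite power set by cardinality yields a Basic de Finetti Structure (it is represented by the uniform probability function); the brute-force check above runs in time cubic in |K|, so it is suited only to toy algebras.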

Sufficient conditions for probabilistic representability have been exhaustively studied (cf. Krantz et al., 1971, chapter 5, and Roberts, 1979, section 8.5). These can be obtained by adding certain nonnecessary axioms to those for Basic de Finetti Structures. One such set of conditions adds the requirements that the universe V should be equal n–partitionable for arbitrarily large n and that the system should satisfy the Koopman Archimedean condition (actually the second of these entails the first, but we want to keep the assumptions distinct in order to consider which adds empirical content to the theory). Luce and Suppes (1969) formulated another set of sufficient conditions by adding to the Basic de Finetti Axioms the requirement that the system should be uniformly equal 2–partitionable, together with a modified Archimedean axiom that can be replaced by the Koopman Archimedean condition. If the regularity requirement is added to either of the previous sets of sufficient conditions, we obtain sufficient conditions for representability by regular probability functions. Our concern is with the empirical status of nonnecessary axioms such as those noted above, which have to be added to obtain sufficient conditions for probabilistic representability.

Next we define various theories of OBAs, whose empirical equivalences relative to different kinds of data will be investigated.

DEFINITION 2. Theories of OBAs.

2.1. F is the theory of all Basic de Finetti Structures.

2.2. For n = 1, 2, …, En is the theory of all Basic de Finetti Structures that are uniformly equal n–partitionable.

2.3. V is the theory of all Basic de Finetti Structures whose universe elements are equal n–partitionable for arbitrarily large n.

2.4. For n = 1, 2, …, EnA is the theory of all OBAs in En that satisfy the Koopman Archimedean condition, and VA is the theory of all OBAs in V that satisfy this condition.

2.5. For n = 1, 2, …, EnR and EnAR are the theories of all regular OBAs in En and EnA respectively, and VR and VAR are the theories of all regular OBAs in V and VA respectively.

2.6. P and RP are respectively the theories of all OBAs that are probabilistically representable and of all OBAs that are representable by regular probability functions.

Given facts already noted (that all Basic de Finetti Structures that satisfy the Koopman Archimedean condition and that are either uniformly equal 2–partitionable or whose universe elements are equal n–partitionable for arbitrarily large n are probabilistically representable), it follows immediately that E2A ⊆ P and VA ⊆ P, whereas E2AR ⊆ RP and VAR ⊆ RP. Trivially also OBA(U) ⊂ P and OBA(U–) ⊂ RP. Obviously none of these subset relations can be reversed, because the conditions defining the theories E2A, VA, and OBA(U) are sufficient but not necessary for probabilistic representability, whereas those defining E2AR, VAR, and OBA(U–) are sufficient but not necessary for regular probabilistic representability. However, it does not follow that the nonnecessary conditions for probabilistic representability add empirical content to the theories involving them, because it is conceivable that finite data consistent with the weaker theories would also be consistent with these same theories augmented by these nonnecessary conditions. Let us now describe ‘elementary’ data; the section on Other Data will comment on the possibility of adding certain less elementary formulas to the data.

An (elementary) datum will be a relational formula of one of the following six forms:

1.  t1 = t2.

2.  t1 ≠ t2.

3.  t1 ⊆ t2.

4.  t1 ⊈ t2.

5.  t1 ≾ t2.

6.  not t1 ≾ t2.

Note that t1 and t2 are boolean terms. These terms are formed from the constant symbols ‘V’ and ‘∅’ and variables ‘x’, ‘y’, etc., combined by the binary operation symbols ‘∩’ and ‘∪’ and the unary operation symbol ‘–’. All data formed from these terms are atomic formulas or their negations, and we will call the atomic formulas positive data and call their negations negative data. Formulas of types (1) and (2) will be called identity data; those of types (3) or (4) will be called inclusion data; and ones of types (5) or (6) will be called ordinal data. Intuition suggests that ordinal data are the most empirically significant; however, because theories of subjective probability involve both identities and inclusions, we must consider data involving these relations to be prima facie relevant to tests of these theories. The results that follow will help to explain why identity in particular plays a special role in considerations of empirical adequacy.

The idea of a datum or set of data D being consistent with an OBA 〈𝒜, ≾〉 is defined in terms of a valuation f of the terms involved in the data in the domain K of the OBA. This is defined recursively, starting with variables ‘x’, ‘y’, …, whose values f(‘x’), f(‘y’), … are members of K, and the constants ‘V’ and ‘∅’, whose values are given by f(‘V’) = V (the universe element of K) and f(‘∅’) = ∅. Given these valuations, f(t1 ∩ t2) is defined as f(t1) ∩ f(t2), f(t1 ∪ t2) = f(t1) ∪ f(t2), and f(–t) = –f(t). Given a valuation f of the terms in D in K, a particular datum of form (1) holds in 〈𝒜, ≾〉 if f(t1) = f(t2); one of form (2) holds if f(t1) ≠ f(t2); and holding is defined analogously for data of forms (3)–(6). A set D of data is consistent with 〈𝒜, ≾〉 if there exists a boolean valuation of the terms in it such that all data in D hold in 〈𝒜, ≾〉.
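The recursive definitions of the valuation f and of holding transcribe almost literally into code. In the following sketch (ours; the nested-tuple encoding of terms, with tags like 'cap' and 'cup' and with 'empty' standing for the constant ‘∅’, is a hypothetical convention, not the chapter’s), holds tests whether a single datum of one of the forms (1)–(6) holds under a given valuation:

```python
# terms: a variable/constant name, or ('cap', t1, t2), ('cup', t1, t2), ('neg', t)

def evaluate(term, f, V):
    # the recursive valuation: variables via f, constants fixed, operations pointwise
    if isinstance(term, str):
        if term == 'V':
            return frozenset(V)
        if term == 'empty':
            return frozenset()
        return f[term]                            # f('x'), f('y'), ...
    op = term[0]
    if op == 'cap':
        return evaluate(term[1], f, V) & evaluate(term[2], f, V)
    if op == 'cup':
        return evaluate(term[1], f, V) | evaluate(term[2], f, V)
    if op == 'neg':
        return frozenset(V) - evaluate(term[1], f, V)

def holds(datum, f, V, leq):
    # datum kinds mirror forms (1)-(6): identity, inclusion, ordinal data
    kind, t1, t2 = datum
    a, b = evaluate(t1, f, V), evaluate(t2, f, V)
    if kind == 'eq':      return a == b
    if kind == 'neq':     return a != b
    if kind == 'subset':  return a <= b
    if kind == 'nsubset': return not (a <= b)
    if kind == 'leq':     return leq(a, b)
    if kind == 'nleq':    return not leq(a, b)

V = {1, 2, 3}
f = {'x': frozenset({1}), 'y': frozenset({1, 2})}
leq = lambda a, b: len(a) <= len(b)               # toy ordering by cardinality
print(holds(('leq', 'x', ('cup', 'y', 'empty')), f, V, leq))  # True
```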

Now we can define the consistency of data with theories of OBAs, which in turn permits us to define certain kinds of empirical equivalences between these theories.

DEFINITION 3. Let D be a finite set of data, and let T and T’ be two theories of OBAs.

3.1. D is consistent with T if there is an OBA in T with which D is consistent.

3.2. T is empirically at least as strong as T’ with respect to data of a given kind if any finite set of data of that kind that is consistent with T is also consistent with T’.

3.3. T and T’ are empirically equivalent with respect to data of a given kind if each is empirically at least as strong as the other with respect to that kind of data.

If adding an axiom to the conditions defining a theory T yields a theory T’ that is empirically stronger than T with respect to a given kind of data (positive or negative ordinal, inclusion, etc.), then that axiom has contributory empirical content in the context of the axioms defining T’ relative to the given kind of data, whereas, if T and T’ are empirically equivalent with respect to such data, the axiom can be treated as merely technical in its context, at least insofar as concerns empirical tests involving that sort of data. Now we are ready to inquire into the conditions that do add contributory content to the contexts in which they appear and those that do not.

EMPIRICAL EQUIVALENCES

Before giving technical results, we will state a result that justifies disregarding identity data in analyzing equivalences among theories of a wide variety, including all of those considered here.

THEOREM 1. Let T be a theory of OBAs that is closed under homomorphic images and their inverses; let D = E– ∪ E+ ∪ D’ be a set of data partitioned into subsets of negative identity data E–, positive identity data E+, and inclusion and ordinal data D’; and let I(E+) be the set of positive inclusion data t1 ⊆ t2 for which either t1 = t2 or t2 = t1 is in E+. Then D is consistent with T if and only if E– ∪ E+ is logically consistent, and I(E+) ∪ D’ is consistent with T.

Theorem 1, whose proof is omitted because it is similar to that of Theorem 2 of Adams, Fagot, and Robinson (1970, p. 400), tells us that, in considering the consistency of data (including positive and negative identities) with a theory of OBAs that is closed under homomorphisms (which includes all of the theories that concern us), we need only consider the purely logical question of whether the identity data by themselves are consistent (which they should be a priori) and the consistency of the nonidentity data (inclusion and ordinal) with the theory, provided the inclusion data are augmented to include all inclusions t1 ⊆ t2 and t2 ⊆ t1 that are derived from identities t1 = t2 in the data. A test of a theory of OBAs in terms of its consistency with data therefore divides into a purely logical part, having to do with the logical consistency of positive and negative identities in the data, and an empirical part, having to do with the theory’s consistency with ordinal and inclusion data. Therefore, because we are concerned with empirical questions, we may disregard identities henceforth and focus exclusively on data involving just inclusions and ordinal relations.
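The division of labor that Theorem 1 describes can be pictured in code. The sketch below (ours; it treats terms as opaque names, and so it ignores purely boolean identities such as x = x ∩ x, which a full treatment would also have to decide) checks the logical consistency of the identity data with a union-find structure and replaces each positive identity by the two inclusions of I(E+):

```python
def check_and_reduce(pos_ids, neg_ids, rest):
    # pos_ids: pairs (t1, t2) asserting t1 = t2; neg_ids: pairs asserting t1 != t2
    # rest: the inclusion and ordinal data D', passed through untouched
    parent = {}

    def find(t):                          # union-find with path halving
        parent.setdefault(t, t)
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for a, b in pos_ids:                  # merge the classes forced equal
        parent[find(a)] = find(b)
    # logically consistent iff no negative identity joins two merged terms
    ok = all(find(a) != find(b) for a, b in neg_ids)
    # I(E+): both inclusions for every positive identity
    inclusions = [(a, b) for a, b in pos_ids] + [(b, a) for a, b in pos_ids]
    return ok, inclusions + list(rest)

ok, data = check_and_reduce([('x', 'y')], [('x', 'z')], [('y', 'z')])
print(ok, data)   # True [('x', 'y'), ('y', 'x'), ('y', 'z')]
```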

Theorem 2 states our main results on empirical equivalences. As previously stated, only the proof of Part 2.1 will be sketched.

THEOREM 2. Empirical equivalences.

2.1. The following theories are empirically equivalent with respect to ordinal data: P, RP, OBA(U), OBA(U–), VA, VAR, and En, EnA, EnR, EnAR, for n = 2, 3, … .

2.2. The following theories are empirically equivalent with respect to ordinal and inclusion data: P, OBA(U), VA, En, and EnA, for n = 2, 3, … .

2.3. The following theories are empirically equivalent with respect to ordinal and inclusion data: RP, OBA(U–), VAR, and EnR and EnAR, for n = 2, 3, … .

2.4. P is empirically stronger than E1 and V with respect to ordinal data.

2.5. RP is empirically stronger than P with respect to ordinal and inclusion data.

Proof of 2.1

It will be shown that, for any finite set D of ordinal data, if D is consistent with OBA(U–), it is consistent with the other theories listed, and, if it is consistent with any of the other theories listed, then it is consistent with OBA(U–). From this it follows that OBA(U–) is empirically equivalent to these other theories with respect to ordinal data.

Suppose first that D is consistent with OBA(U–). Trivially OBA(U–) belongs to all of the theories listed except OBA(U); hence, D must be consistent with all of these theories except possibly OBA(U). That D is also consistent with OBA(U) follows from the fact that OBA(U) is an extension of OBA(U–), and, therefore, any data consistent with OBA(U–) are also consistent with OBA(U).

Now we will show in turn that, if D is consistent with VA, with OBA(U), with P, and with En, for n > 1, it must be consistent with OBA(U–). If this can be shown, it will follow that, if D is consistent with the other theories listed, it is consistent with OBA(U–), because each of these is stronger than at least one of VA, OBA(U), P, or En, for n > 1.

That ordinal data D consistent with VA must also be consistent with P and, therefore, with OBA(U–) follows from the theorem that all Basic de Finetti Structures whose universes are equal n–partitionable for arbitrarily large n and that satisfy the Koopman Archimedean axiom are probabilistically representable; hence, VA is logically stronger than P, and all data consistent with the former must be consistent with the latter.

Next suppose that D is consistent with OBA(U). Given this, there must be a valuation f(t) of the terms t in D in the domain of OBA(U) (the finite unions of subintervals of the closed unit interval [0, 1]) such that all data in D hold in OBA(U) under f. f generates a “reduced mapping” f’(t) of the terms in D in the domain of OBA(U–) in the following way. f(t) is a finite union I1 ∪ … ∪ In of closed, open, and half-closed subintervals of [0, 1]. Given any Ii, let the corresponding subinterval Ii’ of (0, 1] be the right-closure of the interior of Ii, and let f’(t) be the union I1’ ∪ … ∪ In’. Ordinal data in D can be interpreted as length inequalities, and the lengths of the intervals Ii equal those of the corresponding intervals Ii’; thus, it follows immediately that, if the data in D hold in OBA(U) under f, then they will also hold in OBA(U–) under f’; hence, D must also be consistent with OBA(U–).

To prove that data D consistent with P or with any En, for n > 1, must be consistent with OBA(U–), it will be useful to construct equivalent “boolean vector forms” of the terms involved in D. We may suppose that these terms are formed from n variables v1, …, vn and that intersections of these and their complements form “boolean atoms” aj = ±v1 ∩ … ∩ ±vn, j = 1, …, k, where ±vi is either vi or –vi for i = 1, …, n, and where k = 2^n is the number of atoms. Each term t is formally equivalent to a union of these atoms and can in turn be written in equivalent boolean vectorial form as follows:

t ~ t(a1)a1 + … + t(ak)ak,    (1)

where “+” is disjoint union; t(aj) is the universe V or the empty set ∅, according as aj formally entails t or not; and t(aj)aj = t(aj) ∩ aj. We will say that an atom aj “occurs” in this vectorial form if t(aj) = V or, equivalently, if aj formally entails t. In effect, (1) says that t is equivalent to the union of the atoms that occur in it.
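Computationally, whether an atom aj formally entails a term t is just a matter of boolean evaluation: aj entails t exactly when t comes out true under the 0/1 assignment to v1, …, vn that aj prescribes. A short sketch (ours) of the vector form of (1):

```python
from itertools import product

def entails(term, assignment):
    # boolean evaluation of a term under a True/False assignment to variables
    if isinstance(term, str):
        return assignment[term]
    op = term[0]
    if op == 'cap': return entails(term[1], assignment) and entails(term[2], assignment)
    if op == 'cup': return entails(term[1], assignment) or entails(term[2], assignment)
    if op == 'neg': return not entails(term[1], assignment)

def vector_form(term, variables):
    # one entry per atom a_j: True iff a_j formally entails the term
    atoms = list(product([False, True], repeat=len(variables)))
    return [entails(term, dict(zip(variables, signs))) for signs in atoms]

# t = v1 ∪ (–v1 ∩ v2) occurs in every atom where v1 or v2 holds
t = ('cup', 'v1', ('cap', ('neg', 'v1'), 'v2'))
print(vector_form(t, ['v1', 'v2']))   # [False, True, True, True]
```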

Now suppose that D is consistent with P. We will show that D must then be consistent with OBA(U) and, therefore, with OBA(U–), as we have just seen. Given the consistency of D with P, there is an OBA 〈𝒜, ≾〉 with probability representation p and a valuation f of the terms in D into the domain K of 𝒜 such that all of the formulas in D hold in 〈𝒜, ≾〉. Given that f maps the variables vi into K, it must also map the atoms aj into K, and, furthermore, because p is a probability function, the numbers p(f(a1)), …, p(f(ak)) form a probability distribution. This generates a mapping ϕ(t) of the terms t in D into the domain of OBA(U) as follows.

First, let s0 = 0; set sj = p(f(a1)) + … + p(f(aj)); and let ϕ(aj) = [sj–1, sj], for j = 1, …, k. The intervals ϕ(aj) partition the unit interval [0, 1], except for overlaps at their end points. Now rewrite each term t in D in the equivalent boolean vector form given by (1), and define

ϕ(t) = t(a1)ϕ(a1) ∪ … ∪ t(ak)ϕ(ak),

where t(aj) is now the unit interval or the empty interval, according as aj formally entails t or not. Given the mapping ϕ(t) defined in this way, it is routine to verify that all formulas in D must hold in OBA(U) under ϕ. Therefore, they are consistent with OBA(U) and also with OBA(U–).
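The construction of ϕ is a cumulative-sum construction, and a few lines suffice to carry it out numerically (again a sketch of ours; intervals are represented by endpoint pairs, and the overlaps at end points are harmless because only lengths matter for ordinal data):

```python
def atom_intervals(probs):
    # phi(a_j) = [s_{j-1}, s_j] with s_j the cumulative sums of the atom probabilities
    s, out = 0.0, []
    for pj in probs:
        out.append((s, s + pj))
        s += pj
    return out

def phi(occurs, intervals):
    # a term maps to the union of the intervals of the atoms occurring in it;
    # 'occurs' is a boolean list such as vector_form produces above
    return [iv for o, iv in zip(occurs, intervals) if o]

ivs = atom_intervals([0.25, 0.25, 0.5])
print(ivs)                              # [(0.0, 0.25), (0.25, 0.5), (0.5, 1.0)]
print(phi([True, False, True], ivs))    # [(0.0, 0.25), (0.5, 1.0)]
```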

To prove that an ordinal data set D consistent with any En for n > 1 must be consistent with OBA(U–), suppose that D is the union of negative ordinal data D– = {ri < si: i = 1, …, p} (where “r < s” abbreviates “not s ≾ r”) and positive ordinal data D+ = {ti ≾ ui: i = 1, …, q}, but that this union is inconsistent with OBA(U–). Rewriting the terms in D in boolean vectorial form, the data themselves get rewritten as follows:

ri(a1)a1 + … + ri(ak)ak < si(a1)a1 + … + si(ak)ak,    (Di–)

for i = 1, …, p, and

ti(a1)a1 + … + ti(ak)ak ≾ ui(a1)a1 + … + ui(ak)ak,    (Di+)

for i = 1, …, q.

for i = 1, …, q. These data can be reinterpreted as numerical inequalities, with “a1”, …, “ak now being interpreted as real variables; with ri(ak) reinterpreted as 1 or 0, according as aj formally entails ri or not; with “ + ” reinterpreted as the numerical addition, and the relations “<” and “Image” being reinterpreted as the numerical inequality signs. It is easily seen that, if the original data are inconsistent with P, then the reinterpreted data must be mathematically inconsistent. More exactly, if D is inconsistent with P, then the numerical inequalities have no solutions for which the reinterpreted values aj form a probability distribution.

Necessary and sufficient conditions for the solvability of systems of linear inequalities such as those above by probability distributions are given in Adams, Fagot, and Robinson (1970, pp. 390–391), following from standard theorems on solutions to systems of linear inequalities, and these in turn imply the following. If there does not exist a solution in probability distributions to the reinterpreted inequalities, then one of two things must be the case. Case (1): the set D– is nonempty (i.e., p > 0), and there exist nonnegative integers bi, i = 1, …, p, at least one of which is positive, and nonnegative integers ci, i = 1, …, q, such that

b1r1(aj) + … + bprp(aj) + c1t1(aj) + … + cqtq(aj) ≥ b1s1(aj) + … + bpsp(aj) + c1u1(aj) + … + cquq(aj)    (2)

for all j = 1, …, k. Case (2): the set D– is empty, and all of the inequalities (2) hold strictly (i.e., with “≥” replaced by “>”). We will show that, in Case (1), the original inequalities must be inconsistent with En for any n > 1, whereas the similar but simpler proof of inconsistency in Case (2) will not be given.

Suppose that a new list D’ of data items is written, with each item Di– now repeated in the list bi times and each item Di+ now repeated ci times. There would be at least one strict inequality in the list, because at least one bi is greater than 0. Given inequality (2), it would follow that each boolean atom aj would occur on the left side of inequalities in the new list at least as often as it did on the right side, and at least one of these inequalities would be strict. Now, if each aj occurred at most once on the left side of an inequality in the new list and at most once on the right, it would follow from the axioms for Basic de Finetti Structures (Definition 1.1) that these inequalities could be “added” to yield the following strict inequality:

x1 + … + xM < y1 + … + yN,    (3)

where x1, …, xM are the atoms occurring on the left sides of the inequalities in D’ and y1, …, yN are those occurring on the right sides.

However, this would be a single strict inequality in which each atom occurring on the right also occurred on the left, which would be trivially inconsistent with the axioms for Basic de Finetti Structures.

To get around the difficulty caused by the fact that the atoms aj may occur several times on either side of the inequalities in the new list D’, we use the possibility of equally partitioning them that is guaranteed in structures in En for n > 1. Clearly, for any structure in En, each aj must have equal m–partitions aj1, …, ajm for all powers m of n. Fixing on a particular power m, it also follows from the axioms of Basic de Finetti Structures that the ordinal data in the new list must hold if each aj is replaced by an arbitrary “submultiple” ajh in an equal m–partition of aj, and, furthermore, if m is high enough, these submultiples can be so chosen that each one occurs at most once on the left side of an inequality in the new list and at most once on the right. Moreover, they can still be so chosen that any one occurring on the right also occurs on the left. Given this, the inequalities can once again be “added” to yield a strict inequality of form (3) in which each atom occurring on the right also occurs on the left, which is inconsistent with the Basic de Finetti Structure axioms.

This concludes the proof.

CONCLUDING REMARKS

Results on Empirical Equivalences

Here we will draw attention to a few implications of the results stated in Theorem 2. Note first the way in which theories that are equivalent with respect to ordinal data alone (Theorem 2.1) partition into “regular” theories (condition R, Theorem 2.3) and “nonregular” ones (Theorem 2.2), when inclusion data are added. In virtue of Theorem 2.5, these two groups are not equivalent to one another with respect to ordinal and inclusion data. That the two groups are not equivalent with respect to both kinds of data shows that regularity conditions have contributory empirical content in their contexts when both kinds of data are considered but not when ordinal data alone are. Among other things, this shows that, contrary to what might be thought a priori, inclusion data can be regarded as empirical in the testing of at least some subjective probability theories.

Second, note the light that our results throw on the status of the Koopman Archimedean axiom (condition A). It has no contributory empirical content in the context of theory EnA for n > 1, the theory of Basic de Finetti Structures that are uniformly equal n–partitionable and that satisfy the Koopman Archimedean Axiom (Theorem 2.1), but it does in the context of VA, the theory of Basic de Finetti Structures whose universe elements are equal n–partitionable for arbitrarily large n and that satisfy the Archimedean condition (Theorems 2.1 and 2.4). This is interesting for the following reason. In a sense, VA is a theory of measurement of the subjective probabilities of events in which the universe and its equipartitions form a system of standards, like standard weights, with which all other events are compared to determine their probabilities, and in this context A does have contributory empirical content. In contrast, En is a theory of measurement without standards, or rather one in which every event can itself be regarded as a standard. Where all events can be regarded as standards, it seems that the Archimedean condition adds no contributory content. This fits in with results stated by Adams, Fagot, and Robinson (1970), where it was found that Archimedean conditions add no empirical content to other systems of measurement in which all objects are standards.

Third, note that none of the nonnecessary conditions considered (namely universal equipartitionability, equipartitionability of the universe element, and the Koopman Archimedean condition) adds empirical content to the pure hypothesis of probabilistic representability with respect to either ordinal or inclusion data. This means that they satisfy what can be regarded as a “minimum requirement of acceptability” for nonnecessary conditions in axiom systems in fundamental measurement theory. Given that some of these must be included, if a representation theorem is to be proved, at least they should not be so strong that empirical data could be found that are consistent with the representability hypothesis but inconsistent with these axioms. Interestingly, although it might seem that certain axioms that have been formulated to the effect that domains of structures are finite (Suppes and Zinnes, 1963) do violate our acceptability requirement, Adams, Fagot, and Robinson (1970, p. 407) showed that, in the theories they considered, such axioms actually do satisfy it so long as they do not stipulate a fixed finite cardinality for the domain. This reflects the fact that all of the representations considered are essentially linear, which means that there are no “incommensurables” to worry about such as can occur with nonlinear representations (e.g., as in the analytic representation of synthetic plane geometry). The concluding subsection will make further comments on nonlinear representations.

The final remark to make on Theorem 2 is that it shows how theories of subjective probability can be fitted into Scott and Suppes’ (1958) “representation” of theories of fundamental measurement, according to which these theories can be construed as classes of systems that are homomorphically embeddable in fixed, canonical relational systems. Here the fixed systems that we have considered are OBA(U) and OBA(U–), and our study suggests, if it does not prove, that these systems might play a role in subjective probability theory similar to that which the reals or the positive reals under addition play in interval and extensive measurement.

Justification of Nonnecessary Axioms with Contributory Empirical Content

We will focus on two of these: namely, Savage’s axiom P6’ and the equipartitionability axiom E2. (We may ignore the fact that none of the theories to which Theorem 2 applies involves Savage’s axiom; the points we will make do not depend on this.) Neither of these is a necessary condition for representability in any theory of which it is a part; both have contributory empirical content in their theoretical contexts, and both of them are consistent with any data whatever when considered by themselves outside of their theoretical contexts. Savage (1954) himself offered a persuasive argument for P6’:

It seems to me rather easier to justify the assumption of P6’…. Suppose, for example, that you yourself consider B < C, that is, that you would definitely rather stake a gain in your fortune on C than on B. Consider the partition of your world into 2^n events each of which corresponds to a particular sequence of n heads and tails, thrown by yourself, with a coin of your own choosing. It seems to me that you could easily choose such a coin and choose n sufficiently large so that you would continue to prefer to stake your gain on C, rather than on the union of B and any particular sequence of n heads and tails. (Savage, p. 38; this is essentially the quotation from Savage on p. 206 of Krantz et al., 1971).

Although this argument has something of an a priori flavor, given that the coin-tossing experiences on which it relies are not of a kind that can be subjected to standard sorts of experimental confirmation, it is clearly a posteriori. Obviously it is logically consistent to suppose that coins of the sort that Savage imagines do not exist and even that their existence would be contrary to laws of nature (imagine quantum laws of probability, according to which probabilities can only increase by discrete amounts). Thus, as is well known, although it may be that no data can be logically inconsistent with a law like P6’, nevertheless it can be inductively confirmed and possibly also disconfirmed. Therefore, there is no difficulty in principle about empirically justifying such laws, although the fact that this kind of justification seems not to be reducible to standard techniques of experimentation does show that these laws have a rather special scientific status. All we would argue is that they should actually be justified.

A priori it would seem to be even simpler to justify E2. Thus, to verify that any particular event x is equal 2–partitionable, it is only necessary to find two events y and z such that y ∩ z = ∅, y ∪ z = x, and y ~ z; further, to inductively confirm E2, it is only necessary to describe some systematic method for constructing these equipartitions. However, one feels uneasy about this, and the reason is easy to see. To establish that y and z equipartition x, we need to establish that they are subjectively equiprobable, not approximately but exactly, and one feels intuitively that that is beyond human capability. The problem here, however, is not with the form of E2 but rather with the fact that we have assumed that the subjective equiprobability of y and z can be conclusively established on the basis of observational data. In fact we feel the same doubts about the status of certain purely universal laws.

Consider the regularity axiom, that x ≾ ∅ implies x ⊆ ∅. We have assumed that it is possible to establish empirically that a subject regards an event x as no more likely than the impossible event ∅ and, therefore, that this law is “empirically transparent.” However, further consideration suggests that this must be an idealization, unless x itself is a logical contradiction. Furthermore, if we cannot really establish empirically that x ≾ ∅, then the regularity law cannot be regarded as empirical, and we must reconsider its justification and the justification of other idealizations like E2 de novo. Part of this reconsideration involves examining the equivalences of theories involving these laws with respect to different data, to which we turn next.

Other Data, and a Generalization of the Theory

Obviously there is no a priori reason for taking just the formulas so far considered as expressing the data relative to which theories of subjective probability are to be tested. We have just seen that we may have included too much, and it is also possible that we have included too little. We will comment at the end of this subsection on possible extensions of the data, but first we will say something about a simple reduction—that is, to take only negative ordinal formulas of the form x < y as ordinal data and not positive formulas of the form x ≾ y. In effect this assumes that a person is able to judge that x is strictly less in probability than y, but, if the person judges that x is no greater in probability than y, that can only be because he or she actually judges that x is less probable than y. This adapts the approach to interval and extensive measurement taken in Adams (1965) to subjective probability theory, and it fits neatly into our present framework. What we should do is consider empirical equivalences not with respect to all ordinal data but rather with respect just to negative ordinal data. This cannot be entered into in detail, but certain implications may be noted.

With respect to negative ordinal data only, the theory P of Basic de Finetti Structures that are probabilistically representable is equivalent to the more general theory Q of structures for which there exist representations satisfying the unidirectional law: if x < y, then p(x) < p(y), for some probability function p. However, the axioms for Basic de Finetti Structures are not necessary conditions for this kind of representability, and in fact the class of universal necessary conditions of representability in Q can be taken to be the subset of universal conditions of representability in P that are of the form –(ϕ & x1 < y1 & … & xn < yn), where ϕ is a conjunction of inclusion formulas. Neither the transitivity nor the monotonicity axiom is included in this class, and we only have the following weakened versions:

–(x < y & y < z & z < x)

and

–(x ∩ z ⊆ ∅ & y ∩ z ⊆ ∅ & x < y & (y ∪ z) < (x ∪ z)).

Roughly this is a theory that is falsifiable but that cannot make predictions, and it can begin to appear that, in order to attain predictive power, it is also necessary to include an element of idealization! How this idea can be formalized in a way that extends both the language and the data will be outlined below, although it will carry us far from our original concerns.

Adams (1965, p. 206) suggested that judgments of ordering relations are not simple observational data but rather that they are inductive inferences. In particular, the judgment that two events are equiprobable is an inference to the effect that an inequality between them cannot be established. A way of formalizing this idea involves generalizing to a two-sorted language of a kind described in section 5.3 of Adams and Carlstrom (1979), in which x < y is subscripted with a test variable t.7 In this formalism, x <t y expresses the fact that x would be judged less probable than y by test t. Then x < y can be construed as an abbreviation for (t)(x <t y), and x ~ y as an abbreviation for (t)(not x <t y & not y <t x). If we assume that tests are conclusive, then (t)(x <t y) follows if x <t y holds on any one test. On this assumption, x < y (i.e., (t)(x <t y)) can be regarded as “hard data.” However, that does not mean that the failure of a test t to establish x < y implies the failure of all tests to establish this; hence, x ~ y would at best be an inductive inference and not hard data on this interpretation. Thus, in our extended two-sorted language, the data are no longer simple atomic formulas and their negations but rather are abbreviations for certain general formulas. However, which generalizations are properly considered as data and which require more complicated kinds of justification is something only detailed analysis of the theory can establish, and it is not a matter of form alone. This is something that lies far beyond our present aims to enter into, and instead we will conclude with a few remarks on mathematical method.

Mathematical Methods and Their Limits

The reader will have noted that the key step in the proofs of the equivalences in Theorem 2 was the reduction of the question of the consistency of data with theory to the question of the solvability of systems of linear inequalities. This permits conditions for the solvability of these systems that are standard in the mathematical literature (cf. Goldman, 1956) to be translated into syntactic conditions that data on qualitative probability orderings must satisfy if quantitative probability functions are to exist representing them.8 What makes this reduction possible is the fact that the subjective probability representations we have been considering are linearizable, and the possibility of this kind of linearization defines the scope and limits of this approach to data analysis. We will make brief remarks on both the scope and the limits.

Given that most well known representations in the literature of fundamental measurement, including extensive, interval, and conjoint measurement, are essentially linear, our approach applies to those kinds of measurement as well, and in fact Adams, Fagot, and Robinson (1970) analyzed empirical equivalences (there called “data equivalences”) in these theories in a manner that parallels our present treatment of theories of subjective probability. The same approach can be taken with theories of a still wider class, including utility theories like the von Neumann-Morgenstern (1944) theory, or Hausner’s (1954) “multidimensional utilities” and certain sorts of “weak representations” like Holman’s (1969) “weak extensive measurement,” Savage’s (1954) “almost agreeing subjective probability” representation, and the different weak interval and extensive representations studied in Adams (1965).

Our method does not apply to representations that are essentially nonlinear, including Tversky’s (1967) polynomial conjoint representations, Jeffrey’s (1965, 1983) and Bolker’s (1967) mixture representation of utilities and subjective probabilities, or any “quantitative-analytic” representation of a “qualitative-synthetic” Euclidean geometry of dimension higher than 1 (cf. Hilbert, 1947). All of these cases involve what can be regarded as polynomial representations, in consequence of which qualitative data are translatable into inequalities between polynomials of degree higher than 1. The tantalizing thing is that Motzkin (1967) gave a beautiful generalization of standard conditions for the solvability of systems of linear inequalities, which applies to systems of polynomial inequalities. The problem that has so far resisted solution is that of translating Motzkin’s conditions into syntactic conditions that data must satisfy in order to translate into solvable systems of polynomial inequalities.9 It was this difficulty that ultimately led Domotor (1978) to his own amazingly ingenious approach to axiomatizing Jeffrey utilities. I regard the harnessing of Motzkin’s methods as one of the really interesting, open technical problems of fundamental measurement theory.

REFERENCES

Adams, E. W. (1965). Elements of a theory of inexact measurement. Philosophy of Science, 32, 205–228.

Adams, E. W. (1974). The logic of “almost all.” Journal of Philosophical Logic, 3, 3–17.

Adams, E. W. (1974). Model-theoretic aspects of fundamental measurement theory. In L. Henkin, J. Addison, C. C. Chang, W. Craig, D. Scott, & R. Vaught (Eds.), Proceedings of the Tarski Symposium (pp. 437–446). Providence, RI: American Mathematical Society.

Adams, E. W. (1986). Continuity and idealizability of approximate generalizations. Synthese, 439–476.

Adams, E. W., & Carlstrom, I. F. (1979). Representing approximate ordering and equivalence relations. Journal of Mathematical Psychology, 19 (2), 182–207.

Adams, E. W., Fagot, R. F., & Robinson, R. E. (1970). On the empirical status of axioms in theories of fundamental measurement. Journal of Mathematical Psychology, 7 (3), 379–409.

Bolker, E. D. (1967). A simultaneous axiomatization of utility and subjective probability. Philosophy of Science, 34, 333–340.

Davidson, D., Suppes, P., & Siegel, S. (1957). Decision making: An experimental approach. Stanford: Stanford University Press.

de Finetti, B. (1937). La prévision: Ses lois logiques, ses sources subjectives. (“Foresight: Its logical laws, its subjective sources.”) Annales de l’Institut Henri Poincaré, 7, 93–158.

Domotor, Z. (1978). Axiomatization of Jeffrey utilities. Synthese, 39, 165–210.

Goldman, A. J. (1956). Resolution and separation theorems for polyhedral convex sets. In H. W. Kuhn & A. W. Tucker (Eds.), Linear inequalities and related systems, Annals of Mathematics Study, 38, 41–51.

Hausner, M. (1954). Multidimensional utilities. In R. M. Thrall, C. H. Coombs, & R. L. Davis (Eds.), Decision processes (pp. 167–180). New York: Wiley.

Hilbert, D. (1947). The foundations of geometry. LaSalle, IL: Open Court.

Holman, E. W. (1969). Strong and weak extensive measurement. Journal of Mathematical Psychology, 6, 286–293.

Jeffrey, R. C. (1965). The logic of decision. New York: McGraw-Hill. Second Edition 1983, University of Chicago Press.

Kolmogorov, A. N. (1956). Foundations of the theory of probability. New York: Chelsea.

Koopman, B. O. (1940). The axioms and algebra of intuitive probability. Annals of Mathematics, 41, 269–292.

Krantz, D. H., Luce, R. D., Suppes, P., & Tversky, A. (1971). Foundations of measurement (Vol. 1). New York: Academic Press.

Löwenheim, L. (1915). Über Möglichkeiten im Relativkalkül. (“On possibilities in the calculus of relations.”) Mathematische Annalen, 76, 447–470.

Luce, R. D., & Galanter, E. (1963). Psychophysical scaling. In R. D. Luce, R. R. Bush, & E. Galanter (Eds.) Handbook of Mathematical Psychology, (Vol. 1, pp. 191–307). New York: Wiley.

Luce, R. D., & Suppes, P. (1969). Preference, utility, and subjective probability. In R. D. Luce, R. R. Bush, & E. Galanter (Eds.), Handbook of Mathematical Psychology (Vol. 3, pp. 249–410). New York: Wiley.

Manders, K. (1977). Necessary conditions of representability. Memorandum No. UCB/ERL/M77/3. Berkeley: College of Engineering, University of California.

Motzkin, T. S. (1967). Algebraic inequalities. In O. Shisha (Ed.), Inequalities (Vol. 1, pp. 199–203). New York: Academic Press.

Narens, L. (1985). Abstract measurement theory. Cambridge: MIT Press.

von Neumann, J., & Morgenstern, O. (1944). Theory of games and economic behavior. Princeton: Princeton University Press.

Pfanzagl, J. (1968). Theory of measurement. In cooperation with V. Baumann and H. Huber. New York: Wiley.

Roberts, F. S. (1979). Measurement theory. Reading: Addison-Wesley.

Robinson, A. (1963). Introduction to model theory and the metamathematics of algebra. Amsterdam: North-Holland.

Savage, L. J. (1954). The foundations of statistics. New York: Wiley.

Scott, D. (1965). Measurement structures and linear inequalities. Journal of Mathematical Psychology, 1, 233–247.

Scott, D. (1974). Completeness and axiomatizability in many-valued logic. In L. Henkin, J. Addison, C. C. Chang, W. Craig, D. Scott, & R. Vaught (Eds.), Proceedings of the Tarski Symposium (pp. 411–436). Providence, RI: American Mathematical Society.

Scott, D., & Suppes, P. (1958). Foundational aspects of theories of measurement. Journal of Symbolic Logic, 23, 113–128.

Stone, M. H. (1936). The theory of representations for Boolean algebras. Transactions of the American Mathematical Society, 40, 37–111.

Suppes, P., & Zinnes, J. L. (1963). Basic measurement theory. In R. D. Luce, R. R. Bush, & E. Galanter (Eds.), Handbook of Mathematical Psychology. (Vol. 1, pp. 1–76). New York: Wiley.

Tarski, A., & McKinsey, J. C. C. (1948). A decision method for elementary algebra and geometry. Santa Monica: RAND Corporation. Second revised edition, 1951, University of California Press.

Tversky, A. (1967). A general theory of polynomial conjoint measurement. Journal of Mathematical Psychology, 4, 1–20.

_________________

1These are universal laws on systems with operations, where the closure assumptions are not themselves purely universal. This complication will not concern us. Of course not all necessary conditions of representability need be purely universal. Certain versions of Archimedean conditions are necessary, and Manders (1977) considered very interesting classes of elementary “nontrivially necessary” conditions of representability, which are not of purely universal form. Intuitively these nontrivially necessary elementary axioms have the effect of “excluding infinitesimals,” but they deserve careful study both from the mathematical and the empirical point of view.

2It is worth noting that, although informal justifications have been offered, for instance for P6’ (see the section on Justification of Nonnecessary Axioms), systematic experimental studies of these sorts of axiom systems (cf. Davidson, Suppes, & Siegel, 1957) have only studied the purely universal laws these systems involve.

3For instance, Luce and Galanter (1963, p. 259) said, concerning two axioms in a theory of Pfanzagl (1959): “Axioms 1 and 3 are largely technical and need not be discussed.”

4Scott (1965) did indeed present an infinite set of universal laws that are necessary conditions of representability and that entail all such conditions (see also the related axioms in Adams, 1965). Here we consider just finite axiom systems, in part because it is not clear how one might inductively confirm an infinite totality of independent laws. However, it is an interesting question whether the class of necessary conditions of representability, including the nontrivially necessary axioms studied by Manders (footnote 1), is finitely axiomatizable. Footnote 8 comments further on Scott’s methods.

5Note that it is a special property of universal laws that they have contributory empirical content when added to other elementary axioms if and only if they are logically independent of those axioms. This can be viewed as a kind of “generalized independent testability property.”

6The representation defined here is not the only sort that has been considered in the literature. In particular, Savage (1954, p. 34) defined an almost agreeing probability function to be one in which only the “only if” direction of the biconditional holds—this could be viewed as an application to subjective probability of Holman’s (1969) approach—whereas an agreeing probability function is one satisfying the biconditional in both directions. The technical advantage of the unidirectional formulation is that it does not require the Archimedean condition. The distinction between agreeing and almost agreeing probability functions is closely connected to that between regular and “general” probability representations.

7Note that, if we change from tests t to occasions o, so that x <o y means that x is judged less probable than y on occasion o, then we move in the direction of theories of Stochastic Orderings such as are discussed in Roberts (1979).

8To the best of my knowledge, this method was developed independently by Adams (1965) and by Scott (1965) as a means of characterizing the classes of purely universal necessary conditions for representability in different fundamental measurement theories. Related methods are also used in the proof of Theorem 4.1 in Scott (1974).

9These conditions are based on the work of Abraham Robinson (1963), which, not surprisingly, traces back to Tarski’s decision method for elementary algebra (Tarski & McKinsey, 1948).
