3

Qualitative Axioms for Random-Variable Representation of Extensive Quantities

Patrick Suppes
Mario Zanotti

Stanford University

In the standard theory of fundamental extensive measurement, qualitative axioms are formulated that lead to a numerical assignment unique up to a positive similarity transformation. The central idea of the theory of random quantities is to replace the numerical assignment by a random-variable assignment. This means that each object is assigned a random variable. In the case of extensive quantities, the expectation of the random variable replaces the usual numerical assignment, and the distribution of the random variable reflects the variability of the property in question, which could be intrinsic to the object or due to errors of observation. In any case, the existence of distributions with positive variances is almost universal in the actual practice of measurement in most domains of science.

It is a widespread complaint about the foundations of measurement that too little has been written that combines the qualitative structural analysis of measurement procedures and the analysis of variability in a quantity measured or errors in the procedures used. In view of the extraordinarily large number of papers that have been written about the foundations of the theory of error, which go back to the eighteenth century with fundamental work already by Simpson, Lagrange, and Laplace, followed by the important contributions of Gauss, it is surprising that the two kinds of analysis have not received a more intensive consideration. Part of the reason is the fact that, in all of this long history, the literature on the theory of errors has been intrinsically quantitative in character. Specific distributional results have usually been the objective of the analysis, and the assumptions leading to such results have been formulated in quantitative probabilistic terms. This quantitative framework is also assumed in the important series of papers by Falmagne and his collaborators on random-variable representations for interval, conjoint, and extensive measurement (see Falmagne, 1976, 1978, 1979, 1980, 1985; Falmagne & Iverson, 1979; Iverson & Falmagne, 1985).

In light of this long history, we would certainly not want to claim that the various results presented in this chapter solve all the natural kinds of questions that have been in the air for some time. We do believe that we have taken a significant step toward combining in one analysis the qualitative structures characteristic of the foundations of measurement and the probabilistic structures characteristic of the theory of error or the theory of variability.

The approach to the distribution of the representing random variables of an object consists of developing, in the usual style of the theory of measurement, qualitative axioms concerning the moments of the distribution, which are represented as expectations of powers of the representing numerical random variable. The first natural question is whether or not there can be a well-defined qualitative procedure for measuring the moments. This is discussed in the first section. Section 2 presents the qualitative primitive concepts and Section 3 the axiom system. The representation theorem and its proof are given in Section 4.

1. VARIABILITY AS MEASURED BY MOMENTS

The approach to the distribution of the representing random variable of an object consists of developing, in the usual style of the theory of measurement, qualitative axioms concerning the moments of the distribution, which are represented as expectations of powers of the representing numerical random variable. The classic problem of moments in the theory of probability enters in an essential way in the developments to follow. We lay out in an explicit way the qualitative assumptions about moments that are made.

Before giving the formal developments, we address the measurement of moments from a qualitative standpoint. We outline here one approach without any claim that it is the only way to conceive of the problem. In fact, we believe that the pluralism of approaches to measuring probability is matched by that for measuring moments, for reasons that are obvious given the close connection between the two.

The one approach we outline here corresponds to the limiting relative-frequency characterization of probability, which we formulate here somewhat informally. Let s be an infinite sequence of independent trials with the outcome on each trial being heads or tails. Let H(i) be the number of heads on the first i trials of s. Then, relative to s,

$$P(\text{heads}) = \lim_{i \to \infty} \frac{H(i)}{i},$$

with the provision that the limit exists and that the sequence s satisfies certain conditions of randomness that need not be analyzed here. In practice, of course, only a finite initial segment of any such sequence is realized as a statistical sample. However, ordinarily in the case of probability, the empirical procedure encompasses several steps. In the approach given here, the first step is to use the limiting relative-frequency characterization. The second step is to produce and analyze a finite sample with appropriate statistical methods.

Our approach to empirical measurement of qualitative moments covers the first step but not the second of giving detailed statistical methods. Thus, let $a_0$ be an object of small mass of which we have many accurate replicas, so that we are assuming here that the variability in $a_0$ and its replicas $a_0^{(j)}$, $j = 1, 2, \ldots$, is negligible. Then we use replicas of $a_0$ to qualitatively weigh an object a. On each trial, we force an equivalence, as is customary in classical physics. Thus, on each trial i, we have

$$a \sim \{a_0^{(1)}, a_0^{(2)}, \ldots, a_0^{(m_i)}\}.$$

The set shown on the right we symbolize as $m_i a_0$. Then, as in the case of probability, we characterize $a^n$, the nth qualitative raw moment of a, by

$$a^n \sim \lim_{j \to \infty} \frac{1}{j} \sum_{i=1}^{j} m_i^n a_0,$$

but, in practice, we use a finite number of trials and use the estimate $\hat{a}^n$,

$$\hat{a}^n \sim \frac{1}{j} \sum_{i=1}^{j} m_i^n a_0,$$

and so also only estimate a finite number of moments. It is not to the point here to spell out the statistical procedures for estimating $a^n$. Our objective is only to outline how one can approach empirical determination of the qualitative raw moments.
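The finite-trial estimate can be illustrated numerically. What follows is only a minimal sketch, assuming the trial outcomes have already been recorded as the integers m_i (the number of replicas of a0 that balanced the object on trial i) and that a0 is treated as the unit; the function name `estimate_raw_moments` and the sample data are hypothetical and are not part of the chapter.

```python
from fractions import Fraction


def estimate_raw_moments(counts, max_order):
    """Return estimates of the raw moments of orders 0..max_order.

    counts[i] is m_i, the number of replicas of a0 that balanced the
    object on trial i. The nth estimate is (1/j) * sum(m_i ** n),
    expressed in units of a0 (an assumption of this sketch).
    """
    if not counts:
        raise ValueError("at least one trial is required")
    j = len(counts)
    return [Fraction(sum(m ** n for m in counts), j)
            for n in range(max_order + 1)]


# Hypothetical data: five weighings of the same object.
trials = [3, 4, 3, 5, 5]
moments = estimate_raw_moments(trials, 2)
print(moments)
```

Exact rational arithmetic is used so that the estimates can be compared without rounding; in this sketch the values are in units of a0 rather than normalized to the interval [0, 1].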

There is one important observation to deal with. The observed data, summarized in the integers $m_1, m_2, \ldots, m_j$, on which the computation of the moments is based, also constitute a histogram of the distribution. Why not estimate the distribution directly? When a distribution of a particular form is postulated, there need be no conflict in the two methods, and the histogram can be of further use in testing goodness of fit.

The reason for working with the raw moments is theoretical rather than empirical or statistical. Various distributions can be qualitatively characterized in terms of their raw moments in a relatively simple way, as the examples in the Corollary to the Representation Theorem show. Furthermore, general qualitative conditions on the moments are given in the Representation Theorem. Alternative qualitative approaches to characterizing distributions undoubtedly exist and as they are developed may well supersede the one used here.

We now turn to the formal developments. In proving the representation theorem for random extensive quantities in this section, we apply a well-known theorem of Hausdorff (1923) on the one-dimensional moment problem for a finite interval.

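HAUSDORFF’S THEOREM. Let $\mu_0, \mu_1, \mu_2, \ldots$ be a sequence of real numbers. Then a necessary and sufficient condition that there exist a unique probability distribution F on [0, 1] such that $\mu_n$ is the nth raw moment of the distribution F, that is to say,

$$\mu_n = \int_0^1 t^n \, dF(t), \qquad n = 0, 1, 2, \ldots, \tag{1}$$

is that $\mu_0 = 1$ and all the following inequalities hold:

$$\mu_\nu - \binom{k}{1}\mu_{\nu+1} + \binom{k}{2}\mu_{\nu+2} - \cdots + (-1)^k \mu_{\nu+k} \ge 0 \qquad \text{for } k, \nu = 0, 1, 2, \ldots. \tag{2}$$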

A standard terminology is that a sequence of numbers $\mu_n$, $n = 0, 1, 2, \ldots$, is completely monotonic iff Inequalities (2) are satisfied; in more compact binomial notation, $\mu^\nu (1 - \mu)^k \ge 0$ for $k, \nu = 0, 1, 2, \ldots$, where after expansion each power $\mu^j$ is read as $\mu_j$ (for detailed analysis of many related results on the problem of moments, see Shohat & Tamarkin, 1943).

It is important to note that we do not need an additional separate specification of the domain of definition of the probability distribution in Hausdorff’s theorem. The necessary and sufficient conditions expressed in the Inequalities (2) guarantee that all the moments lie in the interval [0, 1], and so this may be taken to be the domain of the probability distribution without further assumption.
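Hausdorff's condition can be checked mechanically on a finite prefix of a candidate moment sequence. The following is a sketch under stated assumptions: the function names are hypothetical, and because only finitely many inequalities are tested, a positive result is necessary but not sufficient for complete monotonicity of an infinite sequence. The test sequence 1/(n + 1) consists of the raw moments of the uniform distribution on [0, 1].

```python
from fractions import Fraction
from math import comb


def hausdorff_difference(mu, nu, k):
    """Alternating sum: sum over i of (-1)^i * C(k, i) * mu[nu + i]."""
    return sum((-1) ** i * comb(k, i) * mu[nu + i] for i in range(k + 1))


def is_completely_monotonic(mu):
    """Check the Hausdorff inequalities for every nu, k with nu + k < len(mu).

    Only a finite prefix is available, so True means that no tested
    inequality fails, not that the whole sequence is a moment sequence.
    """
    n = len(mu)
    return all(hausdorff_difference(mu, nu, k) >= 0
               for nu in range(n) for k in range(n - nu))


# Raw moments of the uniform distribution on [0, 1]: 1 / (n + 1).
uniform = [Fraction(1, n + 1) for n in range(8)]

# A hypothetical sequence that increases from mu_1 to mu_2, so it fails.
increasing = [Fraction(1), Fraction(1, 2), Fraction(3, 4)]

print(is_completely_monotonic(uniform))
print(is_completely_monotonic(increasing))
```

For a genuine moment sequence each alternating sum equals the integral of t^ν (1 − t)^k with respect to F, which is why none of them can be negative.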

2. QUALITATIVE PRIMITIVES FOR MOMENTS

The idea, then, is to provide a qualitative axiomatization of the moments for which a qualitative analogue of Inequalities (2) obtains and then to show that the qualitative moments have a numerical representation that permits one to invoke Hausdorff’s theorem. Thus, the qualitative structure begins first with a set G of objects. These are the physical objects or entities to which we expect ultimately to associate random variables. More precisely, we expect to represent the selected extensive attribute of each object by a random variable. However, in order to get at the random variables, we must generate from G a set of entities that we can think of as corresponding to the raw moments and mixed moments of the objects in G. To do that, we must suppose that there is an operation · of combining so that we can generate elements $a^n = a^{n-1} \cdot a$, which, from a qualitative point of view, will be thought of as corresponding to the raw moments of a. It is appropriate to think of this operation as an operation of multiplication, but it corresponds to multiplication of random variables, not to multiplication of real numbers. We shall assume as axioms that the operation is associative and commutative. It should not, however, be assumed to be distributive with respect to disjoint union (which corresponds to numerical addition), as can be seen from the following random-variable counterexample, given in Gruzewska (1954). Let $X_1$, $X_2$, $X_3$ be three random variables, where

[Display omitted in source: the distributions of $X_1$, $X_2$, and $X_3$.]

Then

[Display omitted in source: the computed distributions of $L_1$ and $L_2$.]

(The computations make clear the assumptions of independence made.) As can be seen, L1 and L2 have different distributions, although

E(X1(X2 + X3)) = E(X1X2 + X1X3).
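The failure of distributivity at the level of distributions can be reproduced with a small exact computation. The sketch below does not use Gruzewska's variables; it assumes three hypothetical independent Bernoulli(1/2) variables, and it assumes that the expanded form is read as a sum of two independently formed products, so that X1 enters through two independent copies. Under these assumptions the two expressions have equal expectations but different distributions.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)
BERNOULLI = {0: HALF, 1: HALF}  # hypothetical common distribution


def distribution(func, *marginals):
    """Exact distribution of func applied to independent arguments."""
    dist = defaultdict(Fraction)
    for outcome in product(*(m.items() for m in marginals)):
        values = [v for v, _ in outcome]
        prob = Fraction(1)
        for _, p in outcome:
            prob *= p
        dist[func(*values)] += prob
    return dict(dist)


def mean(dist):
    """Expectation of a finite distribution given as {value: probability}."""
    return sum(v * p for v, p in dist.items())


# Product form: a single X1 multiplies the sum X2 + X3.
l1 = distribution(lambda x1, x2, x3: x1 * (x2 + x3),
                  BERNOULLI, BERNOULLI, BERNOULLI)

# Expanded form: the two products are built from independent factors,
# so X1 appears as two independent copies (x1 and y1).
l2 = distribution(lambda x1, x2, y1, x3: x1 * x2 + y1 * x3,
                  BERNOULLI, BERNOULLI, BERNOULLI, BERNOULLI)

print(l1, mean(l1))
print(l2, mean(l2))
```

The design choice of enumerating all outcomes with exact fractions keeps the comparison free of sampling error, at the cost of being practical only for small finite distributions.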

We turn now to the explicit definition of a semigroup that contains the associative and commutative axioms of multiplication.

DEFINITION 1. Let A be a nonempty set, G a nonempty set, · a binary operation on A, and 1 an element of G. Then $\mathfrak{A} = (A, G, \cdot, 1)$ is a commutative semigroup with identity 1 generated by G iff the following axioms are satisfied for every a, b, and c in G.

1.   If $a \in G$, then $a \in A$.

2.   If $s, t \in A$, then $(s \cdot t) \in A$.

3.   Any member of A can be generated by a finite number of applications of Axioms 1 and 2 from elements of G.

4.   a · (b · c) = (a · b) · c.

5.   a · b = b · a.

6.   1 · a = a.

Note that, because of the associativity axiom, we omit parentheses from here on. Note, further, that, on the basis of Axiom 3, we think of elements of A as finite strings of elements of G. Intuitively the elements of A are qualitative mixed moments. Furthermore, because the product operation · is associative and commutative, we can always write the mixed moments in a standard form involving powers of the generators. For example, $a \cdot a \cdot a \cdot c \cdot a \cdot b \cdot c = a^4 \cdot b \cdot c^2$. This expression is interpreted as the qualitative mixed moment consisting of the fourth raw moment of a times the first one of b times the second one of c. We denote this semigroup by A.
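The reduction of a string of generators to standard form amounts to counting occurrences. The following is an illustrative sketch only, assuming generators are written as single-character strings and the identity as "1"; the helper names `standard_form`, `multiply`, and `show` are hypothetical.

```python
from collections import Counter


def standard_form(string):
    """Map a finite string of generators to its table of exponents.

    The identity element "1" contributes nothing, in line with Axiom 6.
    """
    return Counter(g for g in string if g != "1")


def multiply(s, t):
    """Semigroup product: exponents of shared generators add."""
    return s + t


def show(moment):
    """Render a mixed moment as powers of generators in sorted order."""
    parts = [g if e == 1 else f"{g}^{e}" for g, e in sorted(moment.items())]
    return "*".join(parts) if parts else "1"


m = standard_form(["a", "a", "a", "c", "a", "b", "c"])
print(show(m))
```

Because a table of exponents ignores the order and grouping of the factors, associativity and commutativity hold automatically in this representation.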

Our last primitive is a qualitative ordering of moments. As usual, we will denote it by $\succsim$. The first question concerns the domain of this relation. For purposes of extensive measurement, it is useful to assume that the domain is the family of all finite subsets of the semigroup A. We may state this as a formal definition:

DEFINITION 2. Let A be a nonempty set and $\succsim$ a binary relation on $\mathcal{F}$, the family of all finite subsets of A. Then $\mathfrak{A} = (A, \mathcal{F}, \succsim)$ is a weak extensive structure iff the following axioms are satisfied for every B, C, and D in $\mathcal{F}$:

1.   The relation $\succsim$ is a weak ordering of $\mathcal{F}$.

2.   If $B \cap D = C \cap D = \emptyset$, then $B \succsim C$ iff $B \cup D \succsim C \cup D$.

3.   If $B \neq \emptyset$, then $B \succ \emptyset$.

Superficially the structure just defined looks like a familiar structure of qualitative probability, but in fact it is not. The reason is that because A is an infinite set, we cannot assume $\mathcal{F}$ is closed under complementation, because that would violate the assumption that the subsets in $\mathcal{F}$ are finite.

An important conceptual point is that we do require the ordering in magnitude of different raw moments. One standard empirical interpretation of what it means to say that the second raw moment, $a^2$, is less than the first, $a^1$, was outlined previously. A formal point, appropriate to make at this stage, is to contrast the uniqueness result we anticipate for the representation theorem with the usual uniqueness up to a similarity (i.e., multiplication by a positive constant) for extensive measurement. We have, in the present setup, not only the extensive operation but also the semigroup multiplication for forming moments; therefore, the uniqueness result is absolute (i.e., uniqueness in the sense of the identity function). Given this strict uniqueness, the magnitude comparison of $a^m$ and $a^n$ for any natural numbers m and n is not a theoretical problem. It is of course apparent that any procedure for measurement of moments, fundamental or derived, will need to satisfy such strict uniqueness requirements in order to apply Hausdorff’s or other related theorems in the theory of moments.

Within $\mathcal{F}$ we may define what it means to have n disjoint copies of $B \in \mathcal{F}$:

$$1B = B,$$

$$(n + 1)B \sim nB \cup B',$$

where $nB \cap B' = \emptyset$, $B' \sim B$, and $\sim$ is the equivalence relation defined in terms of the basic ordering $\succsim$ on $\mathcal{F}$. Axiom 3 of Definition 3 will simply be the assumption that such a $B'$ always exists, and so $nB$ is defined for each n. It is essential to note that this standard extensive or additive recursive definition is quite distinct from the one for moments $a^n$ given earlier.

3. AXIOM SYSTEM FOR QUALITATIVE MOMENTS

Our goal is to provide axioms on the qualitative raw moments such that we can prove that object a can be represented by a random variable $X_a$, and the nth raw moment $a^n$ is represented by the nth power of $X_a$ (i.e., by $X_a^n$).

For convenience, we shall assume the structures we are dealing with are bounded in two senses. First, the set G of objects will have a largest element 1, which intuitively means that the expectation of the random variables associated with the elements of G will not exceed that of 1. Moreover, we will normalize things so that the expectation associated with $X_1$ is 1. This normalization shows up in the axiomatization as 1 acting as the identity element of the semigroup. Second, because of the condition arising from the Hausdorff theorem, this choice means that all of the raw moments are decreasing in powers of n (i.e., if $m \le n$, then $a^n \precsim a^m$). Obviously the theory can be developed so that the masses are greater than 1, and the moments become larger with increasing n. This is the natural theory when the probability distribution is defined on the positive real line. As might be expected, the conditions are simpler for the existence of a probability distribution on a finite interval, and this is also realistic from a methodological standpoint. The exponential notation for qualitative moments $a^n$ is intuitively clear, but it is desirable to have the following formal recursive definition:

$$a^0 = 1,$$

$$a^n = a^{n-1} \cdot a,$$

in order to have a clear interpretation of $a^0$.

Before giving the axiom system, we must discuss more fully the issue of what will constitute a qualitative analogue of Hausdorff’s condition, Inequality (2).

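We have only an operation corresponding to addition and not to subtraction in the qualitative system; thus, for k an even number, we rewrite this inequality solely in terms of addition as follows:

$$\mu_\nu + \binom{k}{2}\mu_{\nu+2} + \cdots + \mu_{\nu+k} \ge \binom{k}{1}\mu_{\nu+1} + \binom{k}{3}\mu_{\nu+3} + \cdots + \binom{k}{k-1}\mu_{\nu+k-1}, \tag{3}$$

and a corresponding inequality for the case in which k is odd. In the qualitative system, the analogue to Inequality (3) must be written in terms of union of sets as follows for k even:

$$a^\nu \cup \binom{k}{2}a^{\nu+2} \cup \cdots \cup a^{\nu+k} \succsim \binom{k}{1}a^{\nu+1} \cup \binom{k}{3}a^{\nu+3} \cup \cdots \cup \binom{k}{k-1}a^{\nu+k-1}. \tag{4}$$

When k is odd,

$$a^\nu \cup \binom{k}{2}a^{\nu+2} \cup \cdots \cup \binom{k}{k-1}a^{\nu+k-1} \succsim \binom{k}{1}a^{\nu+1} \cup \binom{k}{3}a^{\nu+3} \cup \cdots \cup a^{\nu+k}. \tag{5}$$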

There are several remarks to be made about this pair of inequalities. First of all, we can infer that, for $a \prec 1$, as opposed to $a \sim 1$, the moments are a strictly decreasing sequence (i.e., $a^\nu \succ a^{\nu+1}$). Second, the meaning of such terms as $\binom{k}{2}a^{\nu+2}$ was recursively defined earlier, with the recursion justified by Axiom 3 below. It is then easy to see that the unions indicated in Inequalities (4) and (5) are of disjoint sets. On the basis of the earlier terminology, we can then introduce the following definition. A qualitative sequence $a^0, a^1, a^2, a^3, \ldots$ is qualitatively completely monotonic iff Inequalities (4) and (5) are satisfied.

DEFINITION 3. A structure $\mathfrak{A} = (A, \mathcal{F}, G, \succsim, \cdot, 1)$ is a random extensive structure with independent objects (the elements of G) iff the following axioms are satisfied for a in G, s and t in A, k, m, m′, n, and n′ natural numbers, and B and C in $\mathcal{F}$:

1.   The structure $(A, \mathcal{F}, \succsim)$ is a weak extensive structure.

2.   The structure $(A, G, \cdot, 1)$ is a commutative semigroup with identity 1 generated by G.

3.   There is a $C'$ in $\mathcal{F}$ such that $C' \sim C$ and $C' \cap B = \emptyset$.

4.   Archimedean. If $B \succ C$, then, for any D in $\mathcal{F}$, there is an n such that

$$nB \succsim nC \cup D.$$

5.   Independence. Let mixed moments s and t have no common objects:

a.    If $m1 \succsim ns$ and $m'1 \succsim n't$, then $mm'1 \succsim nn'(s \cdot t)$.

b.    If $m1 \precsim ns$ and $m'1 \precsim n't$, then $mm'1 \precsim nn'(s \cdot t)$.

6.   The sequence $a^0, a^1, a^2, \ldots$ of qualitative raw moments is qualitatively completely monotonic.

The content of Axiom 1 is familiar. What is new here is, first of all, Axiom 2, in which the commutative semigroup, as mentioned earlier, is used to represent the mixed moments of a collection of objects. Axiom 3 is needed in order to make the recursive definition of $(n + 1)B$ well defined as given earlier. The special form of the Archimedean axiom is the one needed when there is no solvability axiom, as discussed in Section 3.2.1 of Krantz, Luce, Suppes, and Tversky (1971). The dual form of Axiom 5 is just what is needed to prove the independence of the moments of different objects, which means that the mixed moments factor in terms of expectation. Note that it is symmetric in $\succsim$ and $\precsim$. The notation used in Axiom 5 involves both disjoint unions, as in $m1$, and the product notation for mixed moments, as in $(s \cdot t)$. Axiom 6 formulates the qualitative analogue of Hausdorff’s necessary and sufficient condition as discussed above.

4. REPRESENTATION THEOREM AND PROOF

REPRESENTATION THEOREM. Let $\mathfrak{A} = (A, \mathcal{F}, G, \succsim, \cdot, 1)$ be a random extensive structure with independent objects. Then there exists a family $\{X_B, B \in \mathcal{F}\}$ of real-valued random variables such that:

(i)   every object a in G is represented by a random variable $X_a$ whose distribution is on [0, 1] and is uniquely determined by its moments,

(ii)   the random variables $\{X_a, a \in G\}$ are independent,

(iii)   for a and b in G, with probability one, $X_{a \cdot b} = X_a \cdot X_b$,

(iv)   $E(X_B) \ge E(X_C)$ iff $B \succsim C$,

(v)   if $B \cap C = \emptyset$, then $X_{B \cup C} = X_B + X_C$,

(vi)   if $B \neq \emptyset$, then $E(X_B) > 0$,

(vii)   $E(X_1^n) = 1$ for every n.

Moreover, any function $\varphi$ from Re to Re such that $\{\varphi(X_B), B \in \mathcal{F}\}$ satisfies (i)–(vii) is the identity function.

PROOF. First, we have, by familiar arguments from Axioms 1, 3, and 4, the existence of a numerical assignment $\varphi$. For any B in $\mathcal{F}$ we define the set $S_B$ of numbers:

$$S_B = \left\{ \frac{m}{n} : m1 \succsim nB \right\}. \tag{6}$$

It is easy to show that $S_B$ is nonempty and has a greatest lower bound, which we use to define $\varphi$:

$$\varphi(B) = \text{g.l.b. } S_B.$$
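The construction of the numerical assignment as a greatest lower bound can be made concrete in a toy numerical model. This sketch assumes that the set used in the definition consists of the ratios m/n for which m copies of 1 are at least as large as n copies of B, and it replaces the qualitative comparison with a hypothetical hidden magnitude `value` for B; the function name is likewise hypothetical.

```python
from fractions import Fraction
from math import ceil


def glb_approximation(value, max_n):
    """Approximate the greatest lower bound using denominators 1..max_n.

    `value` stands in for the unknown magnitude of B relative to the
    unit 1 (an assumption of this sketch). For each n, the least m with
    m copies of 1 at least as large as n copies of B is ceil(n * value),
    so the smallest admissible ratio with denominator n is that m over n.
    """
    return min(Fraction(ceil(n * value), n) for n in range(1, max_n + 1))


true_value = Fraction(2, 7)
coarse = glb_approximation(true_value, 3)
fine = glb_approximation(true_value, 7)
print(coarse, fine)
```

Allowing larger denominators can only lower the minimum, and every admissible ratio stays at or above the hidden magnitude, which is the sense in which the greatest lower bound recovers the numerical value.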

It is then straightforward to show that, for B and C in $\mathcal{F}$,

$\varphi(B) \ge \varphi(C)$ iff $B \succsim C$;

if $B \cap C = \emptyset$, then $\varphi(B \cup C) = \varphi(B) + \varphi(C)$;

if $B \neq \emptyset$, then $\varphi(B) > 0$.

(7)

Second, it follows from Axiom 2 that

$$1^n = 1, \tag{8}$$

whence

$$\varphi(1) = 1. \tag{9}$$

From Axiom 6, we infer that, for any object a in G, the numerical sequence

$$1, \varphi(a), \varphi(a^2), \varphi(a^3), \ldots$$

satisfies Inequalities (2) and, hence, determines a unique probability distribution for a, which we represent by the random variable $X_a$. Furthermore, the expectation function E is defined by $E(X_a^n) = \varphi(a^n)$. The independence of mixed moments s and t that have no common object is derived from Axiom 5 by the following argument that uses the sets $S_B$ defined in (6) and their symmetric analogue sets $T_B$ defined below. From 5a, we have at once, if

$$\frac{m}{n} \in S_s \quad \text{and} \quad \frac{m'}{n'} \in S_t,$$

then

$$\frac{mm'}{nn'} \in S_{s \cdot t},$$

whence

$$\varphi(s)\varphi(t) \ge \varphi(st). \tag{10}$$

(10)

Correspondingly, in order to use Axiom 5b, we define

$$T_B = \left\{ \frac{m}{n} : m1 \precsim nB \right\}.$$

Each set $T_B$ is obviously nonempty and has a least upper bound. We need to show that

$$\text{l.u.b. } T_B = \text{g.l.b. } S_B = \varphi(B).$$

Suppose, by way of contradiction, that

$$\text{l.u.b. } T_B < \text{g.l.b. } S_B.$$

(That the weak inequality $\le$ must hold is evident.) Then there must exist integers m and n such that

$$\text{l.u.b. } T_B < \frac{m}{n} < \text{g.l.b. } S_B.$$

Thus, we have, from the left-hand inequality,

$$m1 \succ nB,$$

and from the right-hand one,

$$m1 \prec nB,$$

which together contradict the weak order properties of $\succsim$. However, from the definition of $T_B$, we may then also infer

$$\varphi(s)\varphi(t) \le \varphi(st),$$

which, together with (10), establishes that

$$\varphi(s)\varphi(t) = \varphi(st).$$

The previous argument establishes (i). We next want to show that, with probability one,

$$X_{a \cdot b} = X_a \cdot X_b.$$

We do this by showing the two random variables have identical nth moments for all n. If $a \neq b$, we have independence of $X_a$ and $X_b$ by the argument given previously:

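$$E(X_{a \cdot b}^n) = \varphi((a \cdot b)^n) = \varphi(a^n \cdot b^n) = \varphi(a^n)\varphi(b^n) = E(X_a^n)E(X_b^n) = E(X_a^n X_b^n) = E((X_a \cdot X_b)^n),$$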

which also establishes (ii) by obvious extension. If a = b, we have the following:

$$E(X_{a \cdot a}^n) = \varphi((a \cdot a)^n) = \varphi(a^{2n}) = E(X_a^{2n}) = E((X_a \cdot X_a)^n),$$

which completes the proof of (iii). For the empty set, because ϕ(∅) = 0,

$$X_\emptyset = 0,$$

and for B a nonempty set in $\mathcal{F}$, define

$$X_B = \sum_{s \in B} X_s. \tag{11}$$

Each $s \in B$ is a multinomial moment; thus, $X_B$ is a polynomial in the random variables $X_a$, with a in some string $s \in B$. Such a random variable is clearly a Borel function, and so its distribution is well defined in terms of the joint product distribution of the independent random variables $X_a$. Parts (iv)–(vi) of the theorem then follow at once from (7) and (11), and (vii) from (8) and (9).

Finally, the uniqueness of the representation follows from the fact that $\varphi(\emptyset) = 0$ and $\varphi(1) = 1$. □

If we specialize the axioms of Definition 3 to qualitative assertions about distributions of a particular form, we can replace Axiom 6 on the complete monotonicity of the sequence of qualitative moments of an object by much simpler conditions. In fact, we know of no simpler qualitative way of characterizing distributions of a given form than by such qualitative axioms on the moments. The following corollary concerns such a characterization of the uniform, Bernoulli, and beta distributions on [0, 1], where the beta distribution is restricted to integer-valued parameters α and β.

COROLLARY TO REPRESENTATION THEOREM. Let $\mathfrak{A} = (A, \mathcal{F}, G, \succsim, \cdot, 1)$ be a structure satisfying Axioms 1–5 of Definition 3, and, for any a in G, assume $a \precsim 1$.

I.   If the moments of an object a for n ≥ 1 satisfy

$$(n + 1)a^n \sim 2a,$$

then $X_a$ is uniformly distributed on [0, 1].

II.   If the moments of an object a for n ≥ 1 satisfy

$$a^n \sim a,$$

then $X_a$ has a Bernoulli distribution on [0, 1].

III.   If the moments of an object a for n ≥ 1 satisfy

$$(\alpha + \beta + n)a^{n+1} \sim (\alpha + n)a^n,$$

where α and β are positive integers, then $X_a$ has a beta distribution on [0, 1].

PROOF. We only give the proof for the Bernoulli distribution in any detail. First, we use the hypothesis $a^n \sim a$ to verify Inequalities (4) and (5). For k even,

[Display omitted in source: Inequality (4) under the hypothesis $a^n \sim a$.]

certainly holds, and similarly for k odd,

[Display omitted in source: Inequality (5) under the hypothesis $a^n \sim a$.]

which shows that Axiom 6 of Definition 3 is satisfied, and so the unique numerical function ϕ of the Representation Theorem exists, with

$$\varphi(a) = \varphi(a^n) = p$$

for p some real number such that 0 < p ≤ 1, and the distribution is uniquely determined by the moments. The moment-generating function for the Bernoulli distribution with parameter p is, for $-\infty < t < \infty$,

$$\psi(t) = (1 - p) + pe^t,$$

and so, for every n ≥ 1, the nth derivative of ψ with respect to t is equal to p at t = 0, which completes the proof.

In the case of the beta distribution, we just show how the recursion is derived. Verification of Inequalities (4) and (5) is routine but tedious. The moment-generating function of the beta distribution is not easy to work with, but by direct integration, we have as follows:

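$$E(X^n) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)} \int_0^1 x^{\alpha + n - 1}(1 - x)^{\beta - 1}\,dx = \frac{\Gamma(\alpha + \beta)\,\Gamma(\alpha + n)}{\Gamma(\alpha)\,\Gamma(\alpha + \beta + n)} = \frac{\alpha(\alpha + 1)\cdots(\alpha + n - 1)}{(\alpha + \beta)(\alpha + \beta + 1)\cdots(\alpha + \beta + n - 1)}.$$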

Using this last expression, it is easy to derive that

$$(\alpha + \beta + n)E(X^{n+1}) = (\alpha + n)E(X^n).$$
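The recursion can be checked exactly for particular integer parameters. The sketch below assumes the standard closed form for the raw moments of a beta distribution, namely the product of (α + i)/(α + β + i) for i from 0 to n − 1; the function names and the parameter values are hypothetical choices for illustration.

```python
from fractions import Fraction


def beta_raw_moment(alpha, beta, n):
    """E(X^n) for Beta(alpha, beta) as a product of n rational factors."""
    moment = Fraction(1)
    for i in range(n):
        moment *= Fraction(alpha + i, alpha + beta + i)
    return moment


def satisfies_recursion(alpha, beta, max_n):
    """Check (alpha + beta + n) E(X^(n+1)) == (alpha + n) E(X^n) for n <= max_n."""
    return all(
        (alpha + beta + n) * beta_raw_moment(alpha, beta, n + 1)
        == (alpha + n) * beta_raw_moment(alpha, beta, n)
        for n in range(max_n + 1)
    )


print(satisfies_recursion(2, 3, 10))
print(beta_raw_moment(1, 1, 4))  # uniform case: alpha = beta = 1
```

With α = β = 1 the product telescopes to 1/(n + 1), which matches the moments of the uniform distribution on [0, 1].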

Finally, the uniform distribution on [0, 1] is a special case of the beta distribution, namely when α = β = 1. □

Note that a Bernoulli distribution of Xa implies that all the probability weight is attached to the end points of the interval, so that, if p is the parameter of the distribution, as in standard notation, then

$$E(X_a) = (1 - p) \cdot 0 + p \cdot 1 = p.$$

We remark informally that some other standard distributions with different domains may also be characterized qualitatively in terms of moments. For example, the normal distribution on (−∞, ∞) with mean equal to zero and variance equal to one is characterized as follows:

$$a^0 \sim 1,$$

$$a^1 \sim \emptyset,$$

$$a^2 \sim 1,$$

$$a^{2(n+1)} \sim (2n + 1)a^{2n} \qquad \text{for } n \ge 1.$$
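As a numerical counterpart to this characterization, the even raw moments of the standard normal distribution can be generated from the recursion and compared with the familiar double-factorial values. This is a sketch only; the function names are hypothetical, and the comparison covers just the even moments appearing in the recursion.

```python
def normal_even_moments(max_n):
    """Even raw moments E(X^(2n)), n = 0..max_n, of the standard normal,
    built from the recursion E(X^(2(n+1))) = (2n + 1) * E(X^(2n))."""
    moments = [1]
    for n in range(max_n):
        moments.append((2 * n + 1) * moments[-1])
    return moments


def double_factorial_odd(n):
    """(2n - 1)!! = 1 * 3 * ... * (2n - 1), with the empty product equal to 1."""
    result = 1
    for odd in range(1, 2 * n, 2):
        result *= odd
    return result


even_moments = normal_even_moments(4)
print(even_moments)
```

The recursion started at n = 0 reproduces the second moment 1 as well, so the separate condition on the second moment is consistent with it.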

ACKNOWLEDGMENTS

We are indebted to Duncan Luce for helpful comments on earlier drafts of this chapter. The basic idea of a qualitative theory of moments was presented by the first author some years ago for the special case of the uniform distribution at the 1980 meeting in Madison, Wisconsin of the Society for Mathematical Psychology.

REFERENCES

Falmagne, J.-C. (1976). Random conjoint measurement and loudness summation. Psychological Review, 83, 65–79.

Falmagne, J.-C. (1978). A representation theorem for finite random scale systems. Journal of Mathematical Psychology, 18, 52–72.

Falmagne, J.-C. (1979). On a class of probabilistic conjoint measurement models: Some diagnostics properties. Journal of Mathematical Psychology, 19, 73–88.

Falmagne, J.-C. (1980). A probabilistic theory of extensive measurement. Philosophy of Science, 47, 277–296.

Falmagne, J.-C. (1985). Elements of psychophysical theory. New York: Oxford University Press.

Falmagne, J.-C., & Iverson, G. (1979). Conjoint Weber laws and additivity. Journal of Mathematical Psychology, 20, 164–183.

Gruzewska, H. M. (1954). L’Arithmétique des variables aléatoires (The arithmetic of random variables). Cahiers Rhodaniens, 6, 9–56.

Hausdorff, F. (1923). Momentprobleme für ein endliches Intervall (Moment problem for a finite interval). Mathematische Zeitschrift, 16, 220–248.

Iverson, G., & Falmagne, J.-C. (1985). Statistical issues in measurement. Mathematical Social Sciences, 10, 131–153.

Krantz, D. H., Luce, R. D., Suppes, P., & Tversky, A. (1971). Foundations of measurement (Vol. 1). New York: Academic Press.

Shohat, J. A., & Tamarkin, J. D. (1943). The problem of moments. New York: American Mathematical Society.
