CHAPTER 3

Metrics

This chapter discusses metric spaces [S, d] in which S is a subset of Rⁿ or a subset of a grid, as well as metric spaces in which S is a family of such subsets. It also discusses metrics on pictures in which distances depend on the pixel values.

3.1 Basics About Metrics

Measurement requires a metric space. In this section, we summarize facts about metric spaces that are relevant to digital geometry. The definition of a metric (distance function) based on properties M1 through M3 was given in Section 1.2.1.

3.1.1 The Euclidean metric

We first consider the metric that is used in Euclidean geometry: the distance between two points is equal to the length of the straight line segment defined by the two points. This metric will allow us to define arc lengths, angles, and areas. Digital geometry is often concerned with estimates of such quantities.

We assume a Cartesian coordinate system on Rⁿ. (We treat the n-dimensional case, because this allows us to discuss the 2D and 3D cases at the same time.) Let p = (x₁, x₂, …, x_n) and q = (y₁, y₂, …, y_n) ∈ Rⁿ (n ≥ 1); then the following is true:

$$d_e(p,q) = \sqrt{(x_1-y_1)^2 + (x_2-y_2)^2 + \cdots + (x_n-y_n)^2}$$

The function d_e is the Euclidean metric, and Eⁿ = [Rⁿ, d_e] is the n-dimensional Euclidean space.

It is easily seen that d_e satisfies M1 and M2; we now prove that it satisfies M3. Let r = (z₁, z₂, …, z_n) be a third point. For i = 1, …, n, let a_i = z_i – x_i and b_i = y_i – z_i so that a_i+b_i = y_i – x_i. From the Minkowski inequality

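$$\sqrt{\sum_{i=1}^{n}(a_i+b_i)^2} \;\le\; \sqrt{\sum_{i=1}^{n}a_i^2} + \sqrt{\sum_{i=1}^{n}b_i^2} \qquad (3.1)$$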

it follows that d_e(p,q) ≤ d_e(p,r)+d_e(r,q). The Minkowski inequality follows from the Schwarz inequality

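$$\left|\sum_{i=1}^{n} a_i b_i\right| \;\le\; \sqrt{\sum_{i=1}^{n} a_i^2}\,\sqrt{\sum_{i=1}^{n} b_i^2} \qquad (3.2)$$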

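where the a_i and b_i are reals and n ≥ 1. A proof of Equation 3.2 is as follows. For any real t, we have

$$f(t) = \sum_{i=1}^{n}(a_i t + b_i)^2 = t^2\sum_{i=1}^{n}a_i^2 + 2t\sum_{i=1}^{n}a_i b_i + \sum_{i=1}^{n}b_i^2 \;\ge\; 0$$

Because this inequality holds for any t, it follows that the discriminant of f(t) is not strictly positive. This means that the following is true and is equivalent to Equation 3.2:

$$\left(\sum_{i=1}^{n}a_i b_i\right)^2 - \left(\sum_{i=1}^{n}a_i^2\right)\left(\sum_{i=1}^{n}b_i^2\right) \;\le\; 0$$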

3.1.2 Norms and Minkowski metrics

Euclidean spaces are often introduced as normed spaces rather than metric spaces. A norm always defines a metric, and a metric defines (at least) a seminorm. Norms can also be related to the metrics studied in digital geometry.

Let [S,+, ·, R] be an n-dimensional vector space over R (see Section 1.2.4). A norm ‖·‖ assigns to any p ∈ S a nonnegative real number ‖p‖ that satisfies the following properties for all p,q ∈ S and all a ∈ R:

N1: ‖p‖ = 0 iff p = (0,…,0) (identity).

N2: ‖a · p‖ = |a| · ‖p‖ (homogeneity).

N3: ‖p + q‖ ≤ ‖p‖ + ‖q‖ (the triangle inequality: triangularity).

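For example, let S = Rⁿ, p = (x₁, …, x_n) ∈ Rⁿ, and let the following be true¹:

$$\|p\|_m = \left(\sum_{i=1}^{n}|x_i|^m\right)^{1/m} \ (m \ge 1), \qquad \|p\|_\infty = \max_{1 \le i \le n}|x_i|$$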

These functions have properties N1 through N3 on the vector space [Rⁿ,+, ·, R].

Let ‖·‖ be a norm on [S,+, ·, R]. It can be easily verified that

$$d(p,q) = \|p - q\| \qquad (3.3)$$

defines a metric on S. Evidently, the norm ‖·‖₂ defines the Euclidean metric d_e on [Rⁿ,+, ·, R].

A metric defined by a norm using Equation 3.3 also has the following properties:

M4: d(p+r, q+r) = d(p, q) for p,q,r ∈ S (translation-invariance).

M5: d(a · p, a · q) = |a| · d(p, q) for p,q ∈ S and a ∈ R (homogeneity).

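The norms ‖·‖_m (m ≥ 1 or m = ∞) define the Minkowski metrics L_m on Rⁿ; ‖·‖_m is therefore called a Minkowski norm. Note that L₂ = d_e. Evidently we have

$$L_m(p,q) = \left(\sum_{i=1}^{n}|x_i - y_i|^m\right)^{1/m}, \qquad L_\infty(p,q) = \max_{1 \le i \le n}|x_i - y_i|$$

where p = (x₁, x₂, …, x_n) and q = (y₁, y₂, …, y_n). All the Minkowski metrics L_m have properties M1 through M5. The following can also be shown:

$$L_{m_1}(p,q) \le L_{m_2}(p,q) \quad \text{for all } 1 \le m_2 \le m_1 \le \infty \qquad (3.4)$$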

It is easily verified that two 2D grid points p₁ and p₂ are 4-adjacent iff L₁(p₁, p₂) = 1 and 8-adjacent iff L_∞(p₁, p₂) = 1. Similarly, two 3D grid points p₁ and p₂ are 6-adjacent iff L₁(p₁, p₂) = 1 and 26-adjacent iff L_∞(p₁,p₂) = 1.
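The Minkowski metrics and the adjacency characterizations above can be checked numerically. The following Python sketch is only an illustration: the function names `minkowski`, `is_4_adjacent`, and `is_8_adjacent` are hypothetical (they do not come from the text), and m = ∞ is passed as `math.inf`.

```python
import math

def minkowski(p, q, m):
    """Minkowski distance L_m between two coordinate tuples of equal length.

    m is a real number >= 1, or math.inf for the maximum metric.
    """
    diffs = [abs(x - y) for x, y in zip(p, q)]
    if m == math.inf:
        return max(diffs)
    return sum(d ** m for d in diffs) ** (1.0 / m)

def is_4_adjacent(p, q):
    # In 2D this is 4-adjacency; in 3D the same test gives 6-adjacency.
    return minkowski(p, q, 1) == 1

def is_8_adjacent(p, q):
    # In 2D this is 8-adjacency; in 3D the same test gives 26-adjacency.
    return minkowski(p, q, math.inf) == 1

p, q = (2, 1), (6, 3)
# The values decrease as m grows, in line with Equation 3.4.
print(minkowski(p, q, 1), minkowski(p, q, 2), minkowski(p, q, math.inf))
```

The same function works for grid points and for real coordinates; only the adjacency tests assume integer coordinates.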

Let [S,d] be a metric space, and assume that [S,+, ·, R] is an n-dimensional vector space with additive identity o. Let the following be true:

$$\|p\| = d(p, o) \qquad (3.5)$$

If d also satisfies M4 and M5 on [S,+, ·, R], Equation 3.5 defines a norm on S. If d does not satisfy M5, the function derived from d by Equation 3.5 need not be a norm, but it is a seminorm, which has properties N1, N3, and

N2^* (upper boundedness).

3.1.3 Scalar products and angles

A norm often allows us to define a scalar product, which in turn allows us to define angular values. A metric allows us to define a seminorm, a weak scalar product, and angular values.

A scalar product is a symmetric, positive definite, such that

and linear, such that

mapping of S² into R. Let be a seminorm or norm on an n-dimensional vector space [S,+, ·, R], and let the following be true:

(3.6)

A seminorm or norm always defines a weak scalar product in this way. It is positive definite and symmetric, and it is a scalar product iff it is linear.

For example, the norm does not define a scalar product this weak scalar product is not linear. There are grid points p₁ and q₁ such that , and there are grid points p₂ and q₂ such that . In general, the linearity of a scalar product implies the following:

and

These are not true for all norms.

Vectors p,q ∈ S are called orthogonal (notation: p⊥q) with respect to a weak scalar product 〈·, ·〉 iff 〈p,q〉 = 0. For example, for the Euclidean space [R²,d_e], the norm ‖·‖₂, and the scalar product 〈·, ·〉, we have

$$\langle p,q\rangle = x_1x_2 + y_1y_2 = \|p\|_2 \cdot \|q\|_2 \cdot \cos\eta$$

where p = (x₁, y₁), q = (x₂, y₂), p ≠ o, q ≠ o, and η is the (smaller) angle between the vectors p = $\vec{op}$ and q = $\vec{oq}$. It follows that p⊥q iff cos η = 0 iff η = 90°.

We say that a weak scalar product satisfies the generalized Schwarz inequality on S iff the following is true:

For example, the weak scalar products defined by the metrics d₄ and d₈ on R² satisfy the generalized Schwarz inequality on R².

Suppose the weak scalar product defined by a metric d satisfies the generalized Schwarz inequality on S. Following [540], we can define an angular value

(3.7)

where p ≠ q, q ≠ r; see Figure 3.1. With the generalized Schwarz inequality, we have the following:

FIGURE 3.1 Illustration of angular value H(p, q, r). The two paths drawn between the points illustrate ways of measuring the (shortest) distances d(p, q) and d(r, q); the sketch in the figure resembles a path in metric d₄.

Proposition 3.1

H(p, q, r) is always in the interval [−1,1].

In the Euclidean space [R²,d_e], we have H(p,q,r) = cos η, where η is the (smaller) angle between the vectors p – q = $\vec{qp}$ and r – q = $\vec{qr}$.

3.1.4 Integer-valued metrics

The Minkowski metrics can obviously have noninteger values, even for pairs of grid points; for example, in Figure 3.2, we have d_e(p,r) = √5. The measurements used in digital geometry are often based on integer-valued metrics.

FIGURE 3.2 p = (2,1), q = (6,3), and r = (3,3).

We recall the definitions of the floor, ceiling, and nearest integer functions for all real a:

⌊a⌋, the largest integer less than or equal to a

⌈a⌉, the smallest integer greater than or equal to a

[a], the nearest integer to a if it is unique, and otherwise

For any function d : S×S → R, we can define ⌊d⌋ by ⌊d⌋(p, q) = ⌊d(p, q)⌋ and similarly for ⌈d⌉ and [d]. However, even if d is a metric, these integer-valued functions may not be metrics. For example, we have the following:

Proposition 3.2

⌊d_e⌋ and [d_e] are not metrics on Z².

Proof Let p = (2,3), q = (−1, −1), and r = (0,0). Then ⌊d_e⌋(p,q) = 5, but ⌊d_e⌋(p,r) = 3 and ⌊d_e⌋(r, q) = 1, so property M3 is not satisfied. For [d_e], use, for example, p = (1,1) and q and r as before.

It follows that ⌊d_e⌋ and [d_e] are not metrics on Rⁿ or Zⁿ for n ≥ 2. Interestingly, ⌈d_e⌉ is a metric. In fact, we have the following:

Theorem 3.1

If d is a metric, ⌈d⌉ is also a metric.

Proof Let p,q,r ∈ S, a = d(p,q), b = d(q,r), and c = d(p,r). We show that ⌈d⌉ has properties M1 through M3.

M1: For a ≥ 0, we have ⌈a⌉ = 0 iff a = 0, so that ⌈d⌉(p, q) = 0 iff d(p, q) = 0 iff p = q. Furthermore, a ≥ 0 implies ⌈a⌉ ≥ 0.

M2: Because d(p,q) = d(q,p), we also have ⌈d⌉(p,q) = ⌈d⌉(q,p).

M3: We have a+b ≥ c, because d is a metric on S. First assume that a or b is an integer, for example, a = n; then ⌈a⌉ + ⌈b⌉ = n + ⌈b⌉ = ⌈n + b⌉ ≥ ⌈c⌉. Now assume that both a and b are not integers, so that ⌈a⌉ = ⌊a⌋ + 1 and ⌈b⌉ = ⌊b⌋ + 1. Because ⌊a⌋ + ⌊b⌋ + 2 ≥ ⌈a + b⌉ ≥ ⌈c⌉, it follows that ⌈a⌉ + ⌈b⌉ ≥ ⌈c⌉.

In the example in Figure 3.2, we have ⌈d_e⌉(p,q) = 5, ⌈d_e⌉(p,r) = 3, and ⌈d_e⌉(r,q) = 3.
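The counterexamples of Proposition 3.2 and the statement of Theorem 3.1 can be replayed on grid points. The sketch below is illustrative only; the helper names are hypothetical. It works with the squared Euclidean distance, which is an exact integer for grid points, so no floating-point rounding is involved. For grid points, the nearest integer to the Euclidean distance is always unique, so the tie rule for [a] never comes into play here.

```python
import math

def sq_dist(p, q):
    """Squared Euclidean distance between grid points (an exact integer)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def floor_de(p, q):
    return math.isqrt(sq_dist(p, q))

def ceil_de(p, q):
    s = sq_dist(p, q)
    r = math.isqrt(s)
    return r if r * r == s else r + 1

def round_de(p, q):
    # (r + 1/2)^2 = r^2 + r + 1/4 is never an integer, so the nearest
    # integer to sqrt(s) is r iff s <= r^2 + r, and r + 1 otherwise.
    s = sq_dist(p, q)
    r = math.isqrt(s)
    return r if s <= r * r + r else r + 1

def violates_m3(d, p, q, r):
    """True iff the triangle inequality d(p,q) <= d(p,r) + d(r,q) fails."""
    return d(p, q) > d(p, r) + d(r, q)

p, q, r = (2, 3), (-1, -1), (0, 0)
print(violates_m3(floor_de, p, q, r))       # floor: 5 > 3 + 1
print(violates_m3(round_de, (1, 1), q, r))  # nearest integer: 3 > 1 + 1
print(violates_m3(ceil_de, p, q, r))        # ceiling: 5 <= 4 + 2
```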

Definition 3.1

An integer-valued metric d on a set S is called regular iff, for all p,q ∈ S such that d(p, q) ≥ 2, there always exists an r ∈ S (r ≠ p and r ≠ q) such that d(p,q)= d(p,r)+d(r,q).

It is not hard to show that d is regular iff, for all distinct p, q ∈ S, there exists an r ∈ S such that d(p, r) = 1 and d(p, q) = 1+d(r, q). ⌈d_e⌉ is a regular integer-valued metric on R² but not on Z². For example, let p and q be grid points that differ by 4 in one coordinate and by 3 in another coordinate. The distance (both d_e and ⌈d_e⌉) between p and q is 5, but there is no r ∈ Z² at distance 1 from p and 4 from q; such a real point would have to lie on the segment pq, but it cannot then have integer coordinates.
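The regularity criterion can be tested by direct search. The following sketch is an illustration under a stated assumption: for the metrics used here (the rounded-up Euclidean metric and the city-block metric |Δx| + |Δy|), any grid point r at distance 1 from p lies among the eight grid points surrounding p, so only those candidates are examined. The function names are hypothetical.

```python
import math

def ceil_de(p, q):
    """Euclidean distance between grid points, rounded up to an integer."""
    s = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    r = math.isqrt(s)
    return r if r * r == s else r + 1

def city_block(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def unit_step_points(p, q, d):
    """Grid points r with d(p, r) == 1 and d(p, q) == 1 + d(r, q).

    Assumption: candidates are limited to the 3x3 block around p.
    """
    candidates = [(p[0] + dx, p[1] + dy)
                  for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                  if (dx, dy) != (0, 0)]
    return [r for r in candidates
            if d(p, r) == 1 and d(p, q) == 1 + d(r, q)]

p, q = (0, 0), (4, 3)
print(unit_step_points(p, q, ceil_de))     # empty: no suitable grid point exists
print(unit_step_points(p, q, city_block))  # two suitable grid points
```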

Integer-valued metrics are of special interest in picture analysis. It can be shown that a finite metric space is isomorphic to the distance space on a graph (see Chapter 4) iff the metric is regular; this implies that d₄ and d₈ are regular. Integer-valued metrics will be discussed further in Sections 3.2 and 3.4.

3.1.5 Restricting and combining metrics

From Exercise 8 in Section 1.3, we know that [Zⁿ,+, ·, R] is not a vector space, because it is not closed under scalar multiplication a · p (property V0). The algebraic structure [Zⁿ,+, ·, Z] is an example of a unitary module: it satisfies properties V0 through V8 with respect to a ring of scalars that has a multiplicative identity.

Proposition 3.3

If [S, d] is a metric space and A is a nonempty subset of S, then [A, d] is also a metric space. If d is not a metric on A, d is also not a metric on any set S containing A.

Proof If M1 through M3 hold for d on S, they also hold for d on any subset of S. The definition of a metric space requires that A be nonempty.

In particular, metrics on Rⁿ define metrics on Zⁿ, because Zⁿ is a subset of Rⁿ. For example, the Minkowski metrics define metric spaces on the sets Z² and Z³ of all 2D or 3D grid points, and they define metric spaces on rectangular grids G_m,n for all m, n (see Equation 1.1) or cuboidal grids G_l,m,n for all l,m,n, because these grids (using the grid point model) are finite subsets of Z² or Z³.

There are ways of combining two metrics d′, d′′ on a set S so that the resulting function d is a metric on S. For example:

(i) A linear combination of two metrics, d(p, q) = a · d′(p, q) + b · d′′(p, q), where a, b ∈ R and a, b > 0, is a metric.

(ii) The maximum of two metrics, d(p, q) = max{d′(p, q),d′′(p, q)}, is a metric.

On the other hand, the product or minimum of two metrics is not necessarily a metric.
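A small finite example makes these statements concrete. In the sketch below, the three-point set S and the two metrics d1 and d2 are hypothetical choices made only for illustration: each metric is obtained by placing the points a, b, c at different positions on the real line. The helper `is_metric` checks M1 through M3 by brute force.

```python
from itertools import product

def is_metric(points, d):
    """Brute-force check of M1 through M3 on a finite set of points."""
    for p, q in product(points, repeat=2):
        if d(p, q) < 0 or (d(p, q) == 0) != (p == q):  # M1
            return False
        if d(p, q) != d(q, p):                          # M2
            return False
    return all(d(p, q) <= d(p, r) + d(r, q)             # M3
               for p, q, r in product(points, repeat=3))

# Two metrics on S = {a, b, c}, each induced by positions on the real line.
pos1 = {"a": 0, "b": 1, "c": 4}
pos2 = {"a": 0, "b": 3, "c": 4}

def d1(p, q):
    return abs(pos1[p] - pos1[q])

def d2(p, q):
    return abs(pos2[p] - pos2[q])

S = list(pos1)
combinations = {
    "2*d1 + 0.5*d2": lambda p, q: 2 * d1(p, q) + 0.5 * d2(p, q),
    "max(d1, d2)": lambda p, q: max(d1(p, q), d2(p, q)),
    "min(d1, d2)": lambda p, q: min(d1(p, q), d2(p, q)),
    "d1 * d2": lambda p, q: d1(p, q) * d2(p, q),
}
for name, d in combinations.items():
    print(name, is_metric(S, d))
```

For these particular choices, the minimum gives distances 1, 1, and 4 and the product gives 3, 3, and 16 between the pairs (a, b), (b, c), and (a, c), so both violate M3.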

3.1.6 Boundedness

The Minkowski metrics on Rⁿ are examples of unbounded metric spaces [S, d], where S = Rⁿ is an infinite set, and the distances between points in S can exceed any finite bound. Any metric space [S,d] on a unitary module S satisfying the homogeneity property M5 is necessarily unbounded. We now give some examples of bounded metric spaces.

We first give a degenerate example. Let S be a nonempty set, and define

$$d_b(p,q) = \begin{cases} 0 & \text{if } p = q \\ 1 & \text{otherwise} \end{cases}$$

It can easily be verified that [S, d_b] is a metric space, and it is evidently bounded. The function d_b is called the binary metric. The norm ‖p‖ = d_b(p,o), defined as in Equation 3.5, satisfies N2^* but not N2.

If [S, d] is an unbounded metric space,

$$d'(p,q) = \frac{d(p,q)}{1 + d(p,q)}$$

defines a metric d′ on S, and [S, d′] is a bounded metric space.

We now give a more detailed example using the mapping that takes p = (x, y) ∈ R² into p* = (x*, y*), where the following are true:

(3.8)

For any p, p* is contained in a disk of radius 2. Indeed, for o = (0,0) we have the following:

(Note: c is defined in this equation.) This is true because and thus c < 1. Thus the mapping defined by Equation 3.8 is one-to-one from R² onto the open disk of radius 2. Any (x,y) on the circle with center o and radius 4/3 is mapped into (3x/4, 3y/4), which is on the circle with center o and radius 1. Hence any point p farther than 4/3 from o is mapped into a point p^* in the open annulus defined by the circles of radii 1 and 2. (The function (arctan(x), arctan(y)) is another example of a continuous one-to-one mapping from R² into a bounded set—the open square (–π/2,π/2)² in this case.)

Bounded distances between points p,q ∈ R² can now be defined using the distances, for any metric d on R², between p^* and q^* in the disk of radius 2. In other words, for any metric space [R²,d] we define the following:

$$d^*(p,q) = d(p^*, q^*)$$

If d is a metric on R²,so is d^*, and d^*(p, q) < 4 for all p,q ∈ R². For example, for the integer-valued metric (see Theorem 3.1), all distances between points p, q ∈ R² are integers in the set {0,1,2,3,4}. We sometimes have note that this does not contradict the triangularity property. We can change the cardinality of the range of d^* by increasing or decreasing the parameter 2 in Equation 3.8. Such metrics might be of interest for classifying pixels (pairs of integers) using finite numbers of distance values.

3.1.7 The topology induced by a metric

A metric induces a topology defined by a family of open or closed sets; this section briefly addresses such issues. For a more extensive discussion of topologic subjects, see Chapter 6.

For any metric space [S, d], any p ∈ S, and any ε > 0, let the following be true:

$$U_\varepsilon(p) = \{q : q \in S \wedge d(p,q) < \varepsilon\}$$

U_ε(p) is called the (open) ε-neighborhood of p in S; evidently p ∈ U_ε(p). The family of all ε-neighborhoods defines a basis of a topology and allows us to generate open sets by taking (finite or infinite) unions of ε-neighborhoods. The Euclidean metric d_e on Rⁿ defines the Euclidean topology. For the binary metric d_b, we have U_ε(p) = {p} for 0 < ε ≤ 1 and U_ε(p) = S for ε > 1.

Definition 3.2

p ∈ S is a frontier point of A ⊆ S iff, for any ε > 0, U_ε(p) contains points of A as well as points of the complement S \ A. The frontier ∂A of A consists of all frontier points of A.

For example, the frontier of a disk is a circle. The interior A° of A is A \ ∂A, and the closure A^• of A is A ∪ ∂A. A is closed iff A = A^• and open iff A = A°. The empty set ∅ and the set S are both closed and open.

The interior of A is the largest open set contained in A, and the closure of A is the smallest closed set that contains A. A set is closed iff its complement is open.

A ⊆ S is bounded iff A ⊆ U_ε(p) for some p ∈ S and some ε > 0. (A bounded set need not be of finite cardinality.) A is called compact iff, whenever it is contained in the union of a set of open sets, it is contained in a finite union of these sets. The Heine-Borel-Lebesgue theorem [112] says that a subset of Rⁿ is compact in the topology defined by any Minkowski metric on Rⁿ iff it is bounded and closed. A continuous real-valued function defined on a compact set always has a minimum and a maximum on that compact set.

Two metrics d and d′ on S are called topologically equivalent iff a subset of S is open with respect to d iff it is open with respect to d′. For example, the Minkowski metrics on Rⁿ are all topologically equivalent. A bounded metric on an infinite subset of Rⁿ (see the examples in Section 3.1.6) can be topologically equivalent to an unbounded metric. For example, the bounded metric d_e(p, q)/[1+d_e(p, q)] is topologically equivalent to the unbounded metric d_e; the ball of radius r in the unbounded metric corresponds to the ball of radius r/(1+r) in the bounded metric, and the ball of radius s < 1 in the bounded metric corresponds to the ball of radius s/(1 – s) in the unbounded metric.
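The correspondence between balls in the two metrics is easy to try out numerically. The following sketch wraps an arbitrary metric d into d/(1 + d) and converts radii in both directions; the names `bounded`, `to_bounded_radius`, and `to_unbounded_radius` are hypothetical, and `math.dist` is used as the Euclidean metric.

```python
import math

def bounded(d):
    """Return the bounded metric d' = d / (1 + d) derived from a metric d."""
    def d_prime(p, q):
        value = d(p, q)
        return value / (1.0 + value)
    return d_prime

d_e = math.dist           # Euclidean metric on coordinate tuples
d_bounded = bounded(d_e)  # all values lie in the interval [0, 1)

def to_bounded_radius(r):
    """Radius in the bounded metric that corresponds to radius r."""
    return r / (1.0 + r)

def to_unbounded_radius(s):
    """Radius in the unbounded metric that corresponds to s (0 <= s < 1)."""
    return s / (1.0 - s)

p, q = (0.0, 0.0), (3.0, 4.0)
r = 6.0
# q lies in the ball of radius r around p in both metrics.
print(d_e(p, q) < r, d_bounded(p, q) < to_bounded_radius(r))
```

Because x/(1 + x) is strictly increasing for x ≥ 0, membership in a ball of radius r under d is equivalent to membership in the ball of radius r/(1 + r) under d′, which is why the two metrics generate the same open sets.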

3.1.8 Distances between sets

Any metric d on a set S can be extended to a Hausdorff metric on the family of all nonempty compact subsets A,B of S by defining

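$$d(A,B) = \max\left\{\sup_{p \in A}\,\inf_{q \in B} d(p,q),\; \sup_{p \in B}\,\inf_{q \in A} d(p,q)\right\} \qquad (3.9)$$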

where sup and inf denote the least upper bound and greatest lower bound, respectively. (For compact sets A and B we can replace inf and sup with min and max, respectively.) Figure 3.3 shows an example of this metric in which d is the Euclidean metric d_e.

FIGURE 3.3 The Hausdorff distance between A and B.

The definition of the Hausdorff metric can be broken up into steps. We first define the closest distance from p ∈ S to T ⊆ S using the following:

$$d(p,T) = \inf_{q \in T} d(p,q) \qquad (3.10)$$

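Let A, B ⊂ S; let h_p(B) = d(p,B) for all p ∈ A; let h_p(A) = d(p, A) for all p ∈ B; and define the following:

$$h_A(B) = \sup_{p \in A} h_p(B), \qquad h_B(A) = \sup_{p \in B} h_p(A)$$

Then we have the Hausdorff metric of Equation 3.9:

$$d(A,B) = \max\{h_A(B),\, h_B(A)\}$$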

In Figure 3.3, we have and , so that d(A,B) = h_A(B).
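For finite point sets, which are compact, the stepwise definition translates directly into code with min and max in place of inf and sup. The sketch below is an illustration only. The function names are hypothetical, and `directed(A, B, d)` is taken to be the largest distance from a point of A to the set B, which is an assumption about which direction the symbol h_A(B) denotes.

```python
import math

def dist_point_set(p, T, d):
    """Closest distance d(p, T) from a point p to a finite nonempty set T."""
    return min(d(p, q) for q in T)

def directed(A, B, d):
    """Largest value of d(p, B) over all p in A."""
    return max(dist_point_set(p, B, d) for p in A)

def hausdorff(A, B, d=math.dist):
    """Hausdorff distance between finite nonempty point sets A and B."""
    return max(directed(A, B, d), directed(B, A, d))

A = [(0, 0), (1, 0)]
B = [(0, 0), (1, 0), (5, 0)]
print(directed(A, B, math.dist))  # every point of A also lies in B
print(directed(B, A, math.dist))  # the outlier (5, 0) is far from A
print(hausdorff(A, B))
```

The example also shows the sensitivity to single points: adding the one point (5, 0) to B raises the Hausdorff distance from 0 to 4.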

An alternative way of defining the Hausdorff metric makes use of the definition of an (open)ε-neighborhood of a set. We define that

Let h_q be defined by a metric d on S;ε > 0; A ⊂ S; and be the Minkowski sum (see Section 1.2.12). Then, if A and B are nonempty compact subsets of S, we have the following:

Figure 3.4 illustrates this method of defining a Hausdorff distance. U_ε(B) is a dilation of B; dilations will be studied in Chapter 15. Note that, for D = U_ε(o), we have D = D where D is the mirror set used in the definition of dilation; see Section 1.2.12. (If A and B are compact and we dilate by a closed ball of radius ε instead of an open ball, h_A(B) is defined by min instead of inf.)

FIGURE 3.4 Left: B (a simple polygon) is completely contained in U_ε₁(A). Right: A is not completely contained in U_ε₂(B), showing that d(A, B) > ε₂.

The Hausdorff distance is not a metric in the family of all nonempty bounded subsets of S. For example, consider the closed unit square [0,1]² = [0,1]×[0,1] and the open unit square (0,1)² = (0,1)×(0,1) in the Euclidean topology of the plane.² Then d_e([0,1]², (0,1)²) = 0, but the sets are not identical, so property M1 of a metric is violated.

A generalized metric satisfies the axioms of an ordinary metric but can also have value ∞. The Hausdorff distance is a generalized metric in the family of closed sets (bounded or not). We can also include the empty set; in the definition of Hausdorff distance, we replace an empty supremum with 0 and an empty infimum with ∞ so that the empty set is at distance 0 from itself but at distance ∞ from any nonempty set.

The Hausdorff metrics are based on maximum distances between sets; a single point (an “outlier”) in a set can strongly influence these distances. Distances between sets defined by set-theoretic differences are less sensitive to single points. The symmetric difference between two subsets A, B of a set S is as follows:

$$A \,\Delta\, B = (A \setminus B) \cup (B \setminus A)$$

Let the following be true:

Let S be any nonempty set, and let ℘_fin(S) be the family of all finite subsets of S.

Proposition 3.4

d_sym and d′_sym are metrics on ℘_fin(S).

Proof Let A,B,C ∈ ℘_fin(S), a = d_sym(A,B), b = d_sym(B,C), and c = d_sym(A, C). We show that d_sym has properties M1 through M3.

M1: a = 0 iff AΔB = ∅ iff A = B.

M2: Because AΔB = BΔA, we have symmetry.

M3: Let A = A₁ ∪ D ∪ E ∪ F, B = B₁ ∪ D ∪ E ∪ G, and C = C₁ ∪ E ∪ F ∪ G (see Figure 3.5). Let a₁ = card(A₁), d = card(D), and so forth. It follows that a = a₁+b₁+f+g, b = b₁+c₁+d+f, and c = a₁+c₁+d+g. This shows that a+b = a₁+2b₁+c₁+d+2f+g ≥ c.

FIGURE 3.5 Intersections between three sets.

For d′_sym, M1 and M2 follow by arguments similar to those for d_sym. Regarding M3, let a = d′_sym(A,B), b = d′_sym(B,C), and c = d′_sym(A,C), and consider the intersections of A, B, and C as before. Let h = b₁+f+g, k = d+e+1, and i = a₁+c₁ – b₁. Then the following are true:

Let a₂ = a₁+h, b₂ = c₁+h, and c₂ = i+h. Because h ≥ 0 and 0 ≤ i ≤ a₁+c₁, we have 0 ≤ c₂ ≤ a₂+b₂. Together with k ≥ 1, this allows us to show that a+b – c ≥ 0 so that a+b ≥ c.

d_sym is the L₁ distance of the characteristic functions of sets; this provides a direct proof that it is a metric. For d′_sym, we could also derive Proposition 3.4 from the more general fact that, for any metric d, the function d′ defined by d′(p,q) = d(p, q)/[1+d(p, q)] is a topologically equivalent bounded metric; note that our proof covers the general case.
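The counting argument in the proof of M3 can be tried on small finite sets. The sketch below assumes that d_sym(A, B) is the cardinality of the symmetric difference AΔB, which is what the count a = a₁+b₁+f+g in the proof indicates; the function names are hypothetical. It also compares this value with the sum of absolute differences of the two characteristic functions over a finite universe.

```python
def d_sym(A, B):
    """Cardinality of the symmetric difference of two finite sets."""
    return len(set(A) ^ set(B))

def characteristic_distance(A, B, universe):
    """Sum over the universe of |chi_A(x) - chi_B(x)|."""
    return sum(abs((x in A) - (x in B)) for x in universe)

A, B, C = {1, 2, 3}, {2, 3, 4}, {4, 5}
universe = A | B | C

print(d_sym(A, B), d_sym(B, C), d_sym(A, C))
print(d_sym(A, C) <= d_sym(A, B) + d_sym(B, C))  # triangle inequality M3
print(all(d_sym(X, Y) == characteristic_distance(X, Y, universe)
          for X in (A, B, C) for Y in (A, B, C)))
```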

3.2 Grid Point Metrics

In this section, we discuss integer-valued metrics that are related to the grid point adjacency models defined in Sections 2.1.3 and 2.1.4. We also discuss methods of defining neighborhoods and closeness; grid point metrics that approximate the Euclidean metric; paths, geodesics, and intrinsic distances; and distances between sets of grid points.

3.2.1 Basic grid point metrics

Let p,q ∈ R², p = (x₁, y₁), q = (x₂, y₂), and define the following:

$$d_4(p,q) = |x_1 - x_2| + |y_1 - y_2|$$

Then [R², d₄] is a metric space; in fact, d₄ is the Minkowski metric L₁. We call d₄ the city-block metric or Manhattan metric because, when we restrict it to Z², d₄(p,q) is the minimal number of isothetic unit-length steps from p to q; it resembles a shortest walk in a city with streets that are laid out in an orthogonal grid pattern. In the example in Figure 3.2, we have d₄(p,r) = 3, d₄(p, q) = 6, and d₄(r, q) = 3.

Let p,q ∈ R², p = (x₁, y₁), and q = (x₂, y₂), and define the following:

$$d_8(p,q) = \max\{|x_1 - x_2|,\, |y_1 - y_2|\}$$

Then [R²,d₈] is a metric space; in fact, d₈ is the Minkowski metric L_∞. We call d₈ the chessboard metric because, when we restrict it to Z², d₈(p, q) is the minimal number of moves from p to q by a king on a chessboard. In the example in Figure 3.2, we have d₈(p,r) = 2, d₈(p,q) = 4, and d₈(r,q) = 3.

Let S ⊆ R², let o = (0,0) ∈ S, and let d be a metric on S. The set {p : p ∈ S ∧ d(p, o) ≤ 1} is called a unit disk in [S, d]. Figure 3.6 shows the unit disks in R² for the metrics d₄, d_e, and d₈.

FIGURE 3.6 The city block, Euclidean, and chessboard unit disks in the real plane.

For any grid point p, the smallest neighborhood of p in [Z², d_α] (α ∈ {4,8}) is defined by the following:

$$N_\alpha(p) = \{q : q \in \mathbb{Z}^2 \wedge d_\alpha(p,q) \le 1\}$$

The notations d₄ and d₈ are suggested by the fact that N₄(p) – {p} has cardinality 4 and N₈(p) – {p} has cardinality 8. Using Equation 3.4, we have the following:

Theorem 3.2

d₈(p,q) ≤ d_e(p,q) ≤ d₄(p,q) ≤ 2 · d₈(p,q) for all p,q ∈ R².

The last inequality is an easy consequence of the definitions of d₈ and d₄: without loss of generality, let d₈(p,q) = |x₁ − x₂|; then d₄(p,q) ≤ 2 · |x₁ − x₂| = 2 · d₈(p,q).
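The chain of inequalities in Theorem 3.2 can be spot-checked numerically. The sketch below is illustrative: it takes d₄ as the sum and d₈ as the maximum of the absolute coordinate differences, uses the points of Figure 3.2 as sample data, and allows a small tolerance `eps` because the Euclidean distance is computed in floating point. The function names are hypothetical.

```python
import math

def d4(p, q):
    """City-block distance: sum of absolute coordinate differences."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def d8(p, q):
    """Chessboard distance: maximum absolute coordinate difference."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def d_e(p, q):
    return math.dist(p, q)

def theorem_3_2_holds(p, q, eps=1e-12):
    """Check d8 <= d_e <= d4 <= 2*d8 for one pair of points."""
    return (d8(p, q) <= d_e(p, q) + eps
            and d_e(p, q) <= d4(p, q) + eps
            and d4(p, q) <= 2 * d8(p, q))

# Points of Figure 3.2.
p, q, r = (2, 1), (6, 3), (3, 3)
for a, b in ((p, r), (p, q), (r, q)):
    print(d8(a, b), round(d_e(a, b), 3), d4(a, b), theorem_3_2_holds(a, b))
```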

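Let p,q ∈ R³, p = (x₁, y₁, z₁), and q = (x₂, y₂, z₂), and define the following:

$$d_6(p,q) = |x_1 - x_2| + |y_1 - y_2| + |z_1 - z_2|, \qquad d_{26}(p,q) = \max\{|x_1 - x_2|,\, |y_1 - y_2|,\, |z_1 - z_2|\}$$

We also define the following:

$$d_{18}(p,q) = \max\{d_{26}(p,q),\, \lceil d_6(p,q)/2 \rceil\}$$

(This definition is equivalent to the one based on 18-paths; see Theorem 3.6.) For any grid point p, the smallest neighborhood of p in [Z³, d_α] (α ∈ {6, 18, 26}) is defined by the following:

$$N_\alpha(p) = \{q : q \in \mathbb{Z}^3 \wedge d_\alpha(p,q) \le 1\}$$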

Note that N_α(p) – {p} has cardinality α for α = 6, 18, and 26. Analogous to Theorem 3.2 and using Equation 3.4, we have the following (see also Exercise 3 in Section 3.5):

Theorem 3.3

d₂₆(p,q) ≤ d_e(p,q) ≤ d₆(p,q) ≤ 3 · d₂₆(p,q) for all p,q ∈ R³, and d₂₆(p,q) ≤ d₁₈(p,q) ≤ d_e(p,q) for all p,q ∈ Z³ such that d_e(p,q) ≠ √3.

Proof d₂₆(p,q) ≤ d₁₈(p,q) follows from the definition of d₁₈. For d₁₈(p,q) ≤ d_e(p,q), we need only to show that ⌈d₆(p,q)/2⌉ ≤ d_e(p,q), because L_∞(p,q) = d₂₆(p,q) ≤ d_e(p,q) = L₂(p,q).

If 0 < d₆(p,q) < 1, then ⌈d₆(p,q)/2⌉ = 1 and ⌈d₆(p,q)/2⌉ > d₆(p,q) ≥ d_e(p,q). If p,q ∈ Z³ and d₆(p, q) = 0 or d₆(p, q) = 1, we have ⌈d₆(p,q)/2⌉ = d₆(p, q) = d_e(p, q).

If 1 < d₆(p, q), then ⌈d₆(p,q)/2⌉ < d₆(p, q). If p,q ∈ Z³ and d₆(p, q) = 2, we have d_e(p,q) = 2 or d_e(p,q) = √2, so that ⌈d₆(p,q)/2⌉ = 1 < d_e(p,q). If p,q ∈ Z³ and d₆(p, q) = 3, we have d_e(p, q) = 3, d_e(p, q) = √5, or d_e(p, q) = √3, so that ⌈d₆(p,q)/2⌉ = 2 > d_e(p, q) in the latter case. However, if p,q ∈ Z³ and d₆(p, q) ≥ 4, then ⌈d₆(p,q)/2⌉ ≤ d_e(p, q).
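The case analysis in this proof can be cross-checked by exhaustive search over a small cube of grid points. The sketch below rests on an assumption that should be stated loudly: it takes d₁₈(p, q) = max{d₂₆(p, q), ⌈d₆(p, q)/2⌉}, which is the form the proof suggests. The function names are hypothetical. Comparisons use squared distances so that only integer arithmetic is involved.

```python
import math
from itertools import product

def d6(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def d26(p, q):
    return max(abs(a - b) for a, b in zip(p, q))

def d18(p, q):
    # Assumed form of d18; see the lead-in.
    return max(d26(p, q), math.ceil(d6(p, q) / 2))

def exceptional_squared_distances(radius=4):
    """Squared Euclidean distances at which d18 exceeds d_e on the grid."""
    o = (0, 0, 0)
    exceptions = set()
    for q in product(range(-radius, radius + 1), repeat=3):
        squared = sum(c * c for c in q)
        if d18(o, q) ** 2 > squared:  # d18 > d_e, compared via squares
            exceptions.add(squared)
    return exceptions

print(exceptional_squared_distances())  # only the squared distance 3 remains
```

Under the stated assumption, the search finds violations of d₁₈ ≤ d_e only at Euclidean distance √3, that is, for grid points that differ by 1 in all three coordinates.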

3.2.2 Neighborhoods and degrees of closeness

The ε-neighborhood U_ε(p) (see Section 3.1.7) is defined for any metric space [S,d], any p ∈ S, and any ε > 0. In some cases we have U_ε(p) = {p}; for example, this is true for ε ≤ 1 and any of the metrics [d_e], d₄, and d₈ on Z² or for the binary metric.

If the range of d is countable so that U_ε is of interest only for discrete values of ε (e.g., for metrics on a grid G that is a subset of Z², Z³, Z₂, or Z₃), we use the notation e-neighborhood instead of ε-neighborhood. The e-neighborhoods for three metrics in the 2D grid cell model are illustrated in Figure 3.7; see Figure 3.6 for the corresponding disks in the real plane. The metrics d₄, d_e, and d₈ are translation-invariant; hence the sets U_ε(p) have identical “shapes” for all p.

FIGURE 3.7 The e-neighborhoods for e = 1,2,3,4, and 5 in the 2D grid cell model defined by the city block (left), Euclidean (middle), and chessboard (right) metrics.

For any metric d on a grid G, there is an interval of values e > 0 such that U_e(p) contains as few grid points as possible in addition to p itself. This minimal set of grid points is called the smallest (nontrivial) neighborhood N(p) of p with respect to d. (In the grid cell model, we use the notation N(c), where c is a cell.) For example, for d_α (α ∈ {4,6,8,18,26}), the smallest neighborhoods N_α(p) defined in Section 3.2.1 are obtained for 1 < e ≤ 2. For e = 1, we have U₁(p) = {q ∈ Z² : d_α(q,p) < 1} = {p} for any of these d_α.

With fuzzy geometry (see Section 1.2.10), we can define the degree of closeness of two points p and q of a metric space [S, d] as a monotonically nonincreasing function of the distance between p and q. For example, we can define c(p, q) = 1/[1+d(p, q)]. It follows that 0 < c(p, q) ≤ 1 for all p, q; hence, for any p, c(p, q) defines a fuzzy subset μ_p of S \ {p} that we can think of as a fuzzy neighborhood of p.

Degrees of closeness between pixels or voxels p and q in a picture P can be defined using monotonically nonincreasing functions of the absolute difference between P(p) and P(q). For example, we can define c′(p,q) = 1/(|P(p) – P(q)| + 1). Note that, for any p and q, we have 1/(G_max + 1) ≤ c′(p,q) ≤ 1 so that c′(p,q) defines a fuzzy subset μ′_p of the picture. We can also define degrees of closeness between pixels or voxels that depend on both the distance between them and the absolute difference between their values. In Section 3.4, we will define a metric on a picture in which the distance from p to q depends on the sums of the pixel or voxel values along paths from p to q.

3.2.3 Approximations to the Euclidean metric

We saw in Figure 3.6 that the set of points within a given d₄ or d₈ distance from a given point is a square. These distances depend on direction; their “disks” are not good approximations to Euclidean disks. If we restrict d₄ and d₈ to Z² the set of grid points q such that d₄(p, q) ≤ k is a diagonally oriented square (a “diamond”) of odd diagonal length 2k + 1 centered at p, and the set of grid points q such that d₈(p,q) ≤ k is an upright square of odd side length 2k + 1 centered at p; see Figure 3.7.

Better approximations to Euclidean disks can be obtained by combining d₄ and d₈, for example, by taking the following:

$$d(p,q) = \max\left\{d_8(p,q),\; \tfrac{2}{3}\, d_4(p,q)\right\} \qquad (3.11)$$

It is not hard to see that the set of grid points q such that d(p, q) ≤ k is the intersection of the upright square d₈(p, q) ≤ k with the diamond d₄(p, q) ≤ 3k/2; this intersection is an upright octagon. Varying the coefficient of d₄ in Equation 3.11 causes the shape of the octagon to vary between a diamond and a square. The octagon can be made arbitrarily close to regular by choosing the coefficient appropriately.

Metrics with disks that are “hexagons” can be defined by using a modification of the standard orthogonal grid in which, for example, the odd-numbered rows are shifted half a unit to the right; see Figure 3.8. This is equivalent to working with an unshifted grid but treating a grid point (i, j) on an odd-numbered row as having the six neighbors (i ± 1, j), (i, j ± 1), and (i + 1, j ± 1) and a grid point on an even-numbered row as having the six neighbors (i ± 1, j), (i, j ± 1), and (i − 1, j ± 1).

FIGURE 3.8 Modification of the standard grid in which the odd-numbered rows are shifted half a unit to the right.

In the hexagonal grid shown in Figure 3.8, we can introduce a coordinate system by using any two of the three axes shown on the right, for example, x and y. We can reach any grid point p from the origin by making a (positive, negative, or zero) integer number u of moves in the +x direction and then an integer number v of moves in the +y direction; the resulting (u, v) are the coordinates of p. We will use these coordinates in the remainder of this discussion.

The x,y, and z axes divide the plane into six sextants that we number counterclockwise beginning at the+ x axis; see the figure in Exercise 2 in Section 1.3. It is easily verified that the signs of the (u, v) coordinates of the points lying in these sextants can be characterized as shown in Table 3.1. Note that the z-axis is the locus of points such that u+v = 0.

TABLE 3.1

Signs of the coordinates in the six sextants of the hexagonal grid.

The hexagonal distance between two points p and q of the hexagonal grid is the minimum number of unit moves in the x and y directions needed to go from p to q. If p = (i,j) and q =(h, k), it can be shown that this number is given as follows:

(The signum function sgn(a) is 1 if a ≥ 0 and 0 otherwise.)

Proposition 3.5

d_h is a metric on Z².

Proof Positive definiteness and symmetry are easily verified. To prove triangularity, assume without loss of generality that the three grid points are (0,0), (i,j), and (h, k), and consider all possible values of the signs of i, j, h, k, (i – h), and (j – k).

It can be shown that d_h((i,j), (h, k)) is also equal to the following:

It can also be shown that hexagonal coordinates (u, v) are related to Cartesian coordinates (i,j) by the following for j even:

and by the following for j odd:

Obviously, is the integer-valued metric that best approximates d_e. However, “incremental” algorithms for distance computation on a grid (see Section 3.4) normally use local neighborhoods; this makes it easy to compute metrics such as d₄, d_h,or d₈ or octagonal metrics, but not . For a method of computing a good approximation to d_e, see Section 3.4.3.

We conclude this section by describing a general method of defining approximations to Euclidean distance by counting moves in different directions (e.g., isothetic moves, diagonal moves). Let p,q ∈ Z², and let ρ be a sequence of king’s moves from p to q. Let l_{a,b}(ρ) = ma + nb, where m is the number of isothetic moves and n the number of diagonal moves in ρ, and let the following be true:

$$d_{a,b}(p,q) = \min_{\rho}\; l_{a,b}(\rho) \qquad (3.12)$$

where the minimum is taken over all sequences ρ of king’s moves from p to q. d_{a,b} is called the (a, b) chamfer distance (or weighted distance) from p to q. Chamfer distances that closely approximate Euclidean distance can be defined by appropriately choosing a and b. If the following is true,

$$0 < a \le b \le 2a \qquad (3.13)$$

the (a, b) chamfer distance d_{a,b} is a metric [738], which also defines a norm ‖p‖ = d_{a,b}(p, o) (the distance of p from the origin o = (0,0)). This metric is a nonnegative linear combination of d₄ and d₈. Convex linear combinations of d₄ and d₈ also give useful chamfer distances; for example, (d₄+2d₈)/3 is the (3,4) chamfer distance (see Exercise 12 in Section 3.5).
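The definition by cheapest sequences of king’s moves can be compared with a closed form. The sketch below is illustrative and makes two assumptions: the weights satisfy 0 < a ≤ b ≤ 2a, and, under that condition, a cheapest sequence can be found inside a small window around p and q, so the search (Dijkstra’s algorithm, a technique chosen here for illustration) is limited to that window. The function names are hypothetical.

```python
import heapq

def chamfer_closed_form(p, q, a, b):
    """Closed form of the (a, b) chamfer distance, assuming 0 < a <= b <= 2a."""
    dx, dy = abs(p[0] - q[0]), abs(p[1] - q[1])
    diagonal = min(dx, dy)              # moves that change both coordinates
    isothetic = max(dx, dy) - diagonal  # moves that change one coordinate
    return a * isothetic + b * diagonal

def chamfer_by_search(p, q, a, b, margin=2):
    """Cost of a cheapest sequence of king's moves from p to q.

    Dijkstra's algorithm, restricted to a window around p and q.
    """
    x_lo, x_hi = min(p[0], q[0]) - margin, max(p[0], q[0]) + margin
    y_lo, y_hi = min(p[1], q[1]) - margin, max(p[1], q[1]) + margin
    best = {p: 0}
    heap = [(0, p)]
    while heap:
        cost, u = heapq.heappop(heap)
        if u == q:
            return cost
        if cost > best[u]:
            continue  # stale heap entry
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if dx == 0 and dy == 0:
                    continue
                v = (u[0] + dx, u[1] + dy)
                if not (x_lo <= v[0] <= x_hi and y_lo <= v[1] <= y_hi):
                    continue
                step = b if dx != 0 and dy != 0 else a
                if cost + step < best.get(v, float("inf")):
                    best[v] = cost + step
                    heapq.heappush(heap, (cost + step, v))
    return None

p, q = (0, 0), (3, 1)
print(chamfer_closed_form(p, q, 3, 4), chamfer_by_search(p, q, 3, 4))
```

With a = 3 and b = 4, the closed form equals d₄ + 2d₈, which is three times the combination (d₄+2d₈)/3; more generally, under the stated assumption the closed form can be written as (b − a)·d₄ + (2a − b)·d₈.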

[761] formulated necessary and sufficient conditions for a 2D chamfer distance d to define a norm ‖p‖ = d(p, o) on Z².

We can similarly define 3D chamfer distances d_a,b,c where a, b, and c correspond to moves in which only one coordinate changes (isothetic moves), two coordinates change, and all three coordinates change, and we can obtain good approximations to Euclidean distance by appropriately choosing a, b, and c.

Generalized chamfer distances can be defined using additional types of moves that are not necessarily moves between 8-neighbors or 26-neighbors.

3.2.4 Paths, geodesics, and intrinsic distances

A sequence ρ of grid points (p₀, p₁,…, p_n) such that p₀ = p, p_n = q, and p_{i+1} is α-adjacent to p_i (0 ≤ i ≤ n − 1) is called an α-path of length n from p to q; p and q are called the endpoints of ρ.

Proposition 3.6

If ρ is a shortest α-path from p to q, the p_is must all be distinct, and nonconsecutive p_is cannot be α-adjacent.

Proof If we had p_h = p_k with h < k, (p₀,…, p_h, p_{k+1},…, p_n) would be a shorter α-path with the same endpoints. Similarly, if p_h were α-adjacent to p_k where h < k and k – h > 1, (p₀,…, p_h, p_k,…, p_n) would be a shorter α-path.

An α-path is called an α-geodesic if no shorter α-path with the same endpoints exists.

Proposition 3.7

If (p₀,…, p_n) is an α-geodesic, (p_h,…, p_k) is an α-geodesic for all 0 ≤ h ≤ k ≤ n.

Proof If (q₀,…, q_m) were a shorter α-path from q₀ = p_h to q_m = p_k, (p₀,…, p_{h−1}, q₀,…, q_m, p_{k+1},…, p_n) would be a shorter α-path from p₀ to p_n.

Theorem 3.4

The length of a shortest α-path from p to q is d_α(p, q).

Proof We give the proof in 2D for α = 4; the proofs for other cases are similar. If the length of the path is 1 (e.g., the path is (p, q)), p and q are 4-adjacent, so d₄(p,q) = 1. We proceed by induction on the shortest length. Let (p₀, p₁,…, p_n) be a shortest 4-path from p = p₀ to q = p_n; then, using Proposition 3.7, (p₀, p₁,…, p_{n−1}) is a shortest 4-path from p to p_{n−1}, so d₄(p, p_{n−1}) = n − 1 by the induction hypothesis. Because q is 4-adjacent to p_{n−1}, we have d₄(p_{n−1}, q) = 1 so that, by the triangle inequality, d₄(p,q) ≤ (n − 1) + 1 = n. Suppose that d₄(p, q) = m < n; then we can easily construct a 4-path of length m from p to q. For example, suppose p = (i,j) and q = (h,k) where i ≤ h and j ≤ k; the argument is similar if i ≥ h and/or j ≥ k. Because d₄(p, q) = m, we have (h − i) + (k − j) = m, and we can construct a 4-path from (i,j) to (h,k) by first increasing the first coordinate by 1 until it reaches h and then increasing the second coordinate by 1 until it reaches k; this 4-path has length (h − i) + (k − j) = m < n, which is contrary to our assumption that a shortest 4-path from (i, j) to (h, k) has length n. Hence d₄(p, q) = n.

It follows that an α-path ρ of length n is an α-geodesic iff the d_α-distance between the endpoints of ρ is n.
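Theorem 3.4 can be checked numerically on small examples. The following Python sketch (the function names are ours and are not part of the text) finds the length of a shortest 4- or 8-path by breadth-first search on the unbounded grid and compares it with the closed forms d₄(p,q) = |x₁ − x₂| + |y₁ − y₂| and d₈(p,q) = max{|x₁ − x₂|, |y₁ − y₂|}.

```python
from collections import deque

# Offsets of the 4- and 8-neighbors of a grid point.
N4 = [(1, 0), (-1, 0), (0, 1), (0, -1)]
N8 = N4 + [(1, 1), (1, -1), (-1, 1), (-1, -1)]


def d4(p, q):
    """City block distance between grid points p and q."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def d8(p, q):
    """Chessboard distance between grid points p and q."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def shortest_path_length(p, q, offsets):
    """Length of a shortest path from p to q whose steps use the given offsets.

    Breadth-first search visits grid points in order of increasing path
    length, so the first time q is reached its label is the shortest length.
    """
    dist = {p: 0}
    queue = deque([p])
    while queue:
        cur = queue.popleft()
        if cur == q:
            return dist[cur]
        for dx, dy in offsets:
            nxt = (cur[0] + dx, cur[1] + dy)
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
```

For instance, `shortest_path_length((0, 0), (3, 2), N4)` agrees with `d4((0, 0), (3, 2))`, and the same holds for `N8` and `d8`.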

In Euclidean space, there is a unique shortest arc between any two points p and q, namely the straight line segment pq. In a grid, there can be many shortest α-paths between two grid points, and these paths need not be digital straight line segments (see Section 2.3.4 and Chapter 9). In what follows, we consider only the 2D cases α = 4 and 8, and we assume that p_i (0 ≤ i ≤ n) has coordinates (x_i, y_i).

Proposition 3.8

The following properties of a 4-path ρ are equivalent:

(a) ρ is a 4-geodesic.

(b) ρ cannot turn right (or left) twice in succession; left and right turns must alternate.

(c) x₀ ≤ x₁ ≤ … ≤ x_n (or all ≥), and y₀ ≤ y₁ ≤ … ≤ y_n (or all ≥).

(d) |x₀ – x_n| + |y₀ – y_n| = n.

Proof To see that (a) implies (b), suppose that ρ made two successive turns in the same direction:

(The argument in other cases is analogous.) Then the subpath of ρ from p_r to p_s has length s – r, but there is a horizontal path from p_r to p_s of length s – r − 2, so Proposition 3.7 is violated, and ρ cannot be a 4-geodesic.

We next show that (b) implies (c). Suppose the initial direction of ρ is horizontal toward the right and its first turn (if any) is a left turn; the proofs in other cases are analogous. Let the turns be at p_{n₁}, p_{n₂}, … where 0 < n₁ < n₂ < … < n; then x₀ < x₁ < … < x_{n₁}, and y₀ = y₁ = … = y_{n₁}. After the first turn, ρ is headed vertically upward; thus x_{n₁} = x_{n₁+1} = … = x_{n₂}, and y_{n₁} < y_{n₁+1} < … < y_{n₂}. By (b), the second turn must be a right turn, after which ρ is again horizontal and headed rightward, so x_{n₂} < x_{n₂+1} < … < x_{n₃}, y_{n₂} = y_{n₂+1} = … = y_{n₃}, and so on, proving (c). Note that, by (c), if we take any p_m on ρ as origin, the subpaths (p₀,…, p_m) and (p_m,…, p_n) must lie in a pair of opposite quadrants.

Next we prove that (c) implies (d). At each step along a 4-path, either x or y (but not both) changes by 1. Hence, if (c) holds (e.g., x₀ ≤ x₁ ≤ … ≤ x_n and y₀ ≤ y₁ ≤ …. ≤ y_n and similarly in the other cases), the number of steps at which the xs increase and the number at which the ys increase must add up to n, which implies (d).

Finally we show that (d) implies (a). Any 4-path from (x₀, y₀) to (x_n, y_n) must have length at least |x₀ – x_n| + |y₀ – y_n|, because a coordinate can change by only 1 at each step, and only one coordinate can change at a time. If (d) holds, this length is n, and because ρ has length n, its length is the shortest possible, thus proving (a).

Proposition 3.9

The following properties of an 8-path ρ are equivalent:

(a) ρ is an 8-geodesic.

(b) x₀ < x₁ < … < x_n (or all >), or y₀ < y₁ < … < y_n (or all >).

(c) max{|x₀ – x_n|, |y₀ – y_n|} = n.

Proof The x and y coordinates can each change by at most 1 at each step along an 8-path. Hence, to achieve |x₀ – x_n| = n, successive x_is must all differ by 1 in the same direction (i.e., x₀ < x₁ < … < x_n [or all >]), which proves that (c) implies (b). Conversely, x₀ < x₁ < … < x_n means that the successive x_is must differ by 1 in the same direction so that |x₀ – x_n| = n; thus (b) implies (c).

On the other hand, because ρ has length n, we must have |x₀ – x_n| ≤ n and |y₀ – y_n| ≤ n; an 8-path of length n cannot involve coordinate changes of more than n, because each coordinate changes by at most 1 at each step. Any 8-path from p₀ to p_n must have length of at least max{|x₀ – x_n|, |y₀ – y_n|}. If (c) holds (e.g., |x₀ – x_n| = n), the max is n; thus ρ (which has length n) is a shortest 8-path, proving (a).

Conversely, if (c) fails, then |x₀ – x_n| < n and |y₀ – y_n| < n. Suppose x₀ ≤ x_n, y₀ ≤ y_n, and x_n – x₀ ≤ y_n – y₀; the argument is analogous if any of these ≤s is ≥. Then ((x₀,y₀), (x₀+1,y₀+1),…, (x_n, y₀+(x_n – x₀)), (x_n, y₀+(x_n – x₀)+1),…, (x_n, y_n)) is an 8-path from (x₀, y₀) to (x_n, y_n) of length y_n – y₀ < n, so ρ is not a shortest 8-path, thus completing the proof.
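The endpoint criteria used in these proofs give a constant-time geodesic test once a sequence is known to be an α-path: by the corollary of Theorem 3.4, a path of length n is an α-geodesic iff the d_α-distance between its endpoints is n. The following Python sketch applies this test; the function names are ours, and a path is assumed to be given as a list of (x, y) pairs.

```python
def is_alpha_path(path, alpha):
    """True if consecutive points of `path` are alpha-adjacent (alpha is 4 or 8)."""
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        dx, dy = abs(x1 - x0), abs(y1 - y0)
        if alpha == 4 and dx + dy != 1:
            return False
        if alpha == 8 and max(dx, dy) != 1:
            return False
    return True


def is_alpha_geodesic(path, alpha):
    """True if `path` is an alpha-path whose length equals the d_alpha
    distance between its endpoints."""
    if not is_alpha_path(path, alpha):
        return False
    n = len(path) - 1                      # length of the path
    (x0, y0), (xn, yn) = path[0], path[-1]
    if alpha == 4:
        return abs(x0 - xn) + abs(y0 - yn) == n
    return max(abs(x0 - xn), abs(y0 - yn)) == n
```

A staircase such as `[(0, 0), (1, 0), (1, 1), (2, 1)]` passes the test for α = 4, whereas the path `[(0, 0), (0, 1), (1, 1), (1, 0)]`, which turns twice in the same direction, does not.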

These results imply that digital straight line segments (see Chapter 9) are geodesics. It is not hard to show that the only possible turns in an 8-geodesic are 45° right and left turns in alternation.

If S is an α-connected set of grid points (see Chapter 4), for any p,q ∈ S, there exists an α-path ρ =(p₀, p₁,…, p_n) from p₀ = p to p_n = q such that the p_is are all in S. The length of a shortest such path is called the intrinsic α-distance in S from p to q. The ordinary α-distance from p to q will sometimes be called extrinsic to contrast it with intrinsic α-distance.³

Proposition 3.10

For all p,q ∈ S, d_α(p,q) is at most the intrinsic α-distance in S from p to q.

Proof The length of a shortest α-path that lies in S cannot be less than the length of an unrestricted shortest α-path. □
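The intrinsic α-distance can be computed by a breadth-first search that is confined to S. The Python sketch below is an illustration only: the function name and the representation of S as a set of (x, y) pairs are our assumptions.

```python
from collections import deque


def intrinsic_distance(S, p, q, alpha=4):
    """Length of a shortest alpha-path from p to q that stays inside S.

    Returns None if p and q are not alpha-connected in S.
    """
    offsets = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    if alpha == 8:
        offsets += [(1, 1), (1, -1), (-1, 1), (-1, -1)]
    if p not in S or q not in S:
        raise ValueError("both endpoints must belong to S")
    dist = {p: 0}
    queue = deque([p])
    while queue:
        cur = queue.popleft()
        if cur == q:
            return dist[cur]
        for dx, dy in offsets:
            nxt = (cur[0] + dx, cur[1] + dy)
            # Only grid points of S may be used as intermediate points.
            if nxt in S and nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    return None
```

For the U-shaped set S = {(0,0), (0,1), (0,2), (1,2), (2,2), (2,1), (2,0)}, the intrinsic 4-distance from (0,0) to (2,0) is 6, whereas the extrinsic distance d₄ is 2, in agreement with Proposition 3.10.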

We will see in Chapter 13 that a set S of grid points is digitally convex iff any two points of S are the endpoints of a digital straight line segment that is contained in S. Because a digital straight line segment is a geodesic, it follows that, if S is digitally convex, the intrinsic α-distance in S from p to q is equal to d_α(p,q) for all p,q ∈ S.

3.2.5 Distances between sets

Integer-valued metrics d on a grid, such as , d₄, and d₈ in 2D, define Hausdorff metrics in the family of all finite subsets of the grid; see Section 3.1.8. For any such d, any grid point p, and any finite set of grid points S, the distance from p to S is h_p(S) = min d(p,q) where the min is taken over all q ∈ S. Evidently, h_p(S) = 0 iff p ∈ S. For example, for the p, q, and r in Figure 3.2, let A = {p, q} and B = {q,r}; then d₄(A, B) = d₄(p,r) = d₄(r,q) = 3, d₈(A, B) = d₈(p,r) = 2, and (A,B) = (p,r) = (r,q) = 3. Similarly, for the sets A and B in Figure 3.3, we have d₄(A,B) = h_p(B) = 8 > h_q(A) = 6 (h_p and h_q with respect to d₄) and d₈(A,B) = h_p(B) = h_q(A) = 5 (h_p and h_q with respect to d₈).

A Hausdorff metric can be used to measure the distance between the frontiers of the inner and outer Jordan digitizations of a set; see Figure 2.29 for an example. Section 3.1.7 allows us to complete Definition 2.8: the inner Jordan digitization (S) of a set S ⊆ Rⁿ is actually the union of all n-cells (for grid resolution h > 0 and n = 2 or n = 3) that are contained in the interior of set S.

ALGORITHM 3.1 Algorithm for calculating the Hausdorff distance between two subsets A and B of an m×n grid.

1. Calculate a distance field F(A) in an array of size m×n.

2. Calculate a distance field F(B) in an array of size m×n.

3. Let a be the maximum value in F(A) at all positions belonging to B.

4. Let b be the maximum value in F(B) at all positions belonging to A.

5. H(A,B) = max{a,b}.

Theorem 3.5

For any compact set S ⊂ R² such that (S) ≠ ø, the d₄ or d₈ Hausdorff distance between the (polygonal) frontiers (S) and (S) is at least 1/h.

Proof Let p be an arbitrary grid vertex on (S). Thus p cannot be on (S), because the frontier of (S) never intersects the frontier of (S) if S is a nonempty compact subset of R². The d₄ (or d₈) distance from p to any point q on (S) is at least 1/h. It follows that

and thus ; this is analogous for d₈.

Finally, we discuss algorithms for calculating the Hausdorff distance between two finite sets A,B ⊂ G_m,n of grid points (see Algorithm 3.1). We assume that m and n are the dimensions of the smallest isothetic rectangle that contains A ∪ B.

We first describe a brute-force approach. For every point in A, calculate the minimum distance to a point in B, and, for every point in B, calculate the minimum distance to a point in A. Take the maxima of these two sets of distances; then the maximum of the two maxima is the desired Hausdorff distance. The points of A and B can be located by scans of G_m,n (see Section 1.1.3); on one scan, we find all of the points p in A, and, for each p, we scan G_m,n again to find all of the points in B and calculate the distances from p to these points. If card(A) and card(B) are O(mn), this brute-force algorithm takes O(m²n²) computation steps.

A much more efficient algorithm is given as Algorithm 3.1. For any S ⊂ G_m,n, the distance field F(S) is an array of size m×n such that F(S)(p) = h_p(S); in particular, F(S)(p) = 0 iff p ∈ S. It can be shown (see Section 3.4.2 for grid metrics and Section 3.4.3 for the Euclidean metric) that F(S) can be calculated in O(mn) computation steps for any Minkowski metric on G_m,n. This allows us to calculate the Hausdorff distance in O(mn) computation steps by computing distance fields for A and B and scanning each of these fields.
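The two approaches can be compared in a short Python sketch. It is an illustration under stated assumptions and not the method of Section 3.4.2: the distance field is computed here by a multi-source breadth-first search over the m×n rectangle (which also takes O(mn) steps and yields d₄ or d₈ distances because shortest 4- and 8-paths between points of a rectangle can stay inside it), and all function names are ours. Grid points are (x, y) pairs with 0 ≤ x < m and 0 ≤ y < n.

```python
from collections import deque

N4 = [(1, 0), (-1, 0), (0, 1), (0, -1)]
N8 = N4 + [(1, 1), (1, -1), (-1, 1), (-1, -1)]


def distance_field(S, m, n, offsets):
    """F(S)(p) = h_p(S) for every grid point p = (x, y) of the m x n grid."""
    F = [[None] * n for _ in range(m)]
    queue = deque()
    for x, y in S:                       # all points of S start at distance 0
        F[x][y] = 0
        queue.append((x, y))
    while queue:
        x, y = queue.popleft()
        for dx, dy in offsets:
            u, v = x + dx, y + dy
            if 0 <= u < m and 0 <= v < n and F[u][v] is None:
                F[u][v] = F[x][y] + 1
                queue.append((u, v))
    return F


def hausdorff(A, B, m, n, offsets):
    """Steps 1 to 5 of Algorithm 3.1."""
    FA = distance_field(A, m, n, offsets)
    FB = distance_field(B, m, n, offsets)
    a = max(FA[x][y] for x, y in B)      # largest distance from a point of B to A
    b = max(FB[x][y] for x, y in A)      # largest distance from a point of A to B
    return max(a, b)


def hausdorff_brute_force(A, B, d):
    """Brute-force version: O(card(A) * card(B)) evaluations of the metric d."""
    a = max(min(d(p, q) for q in A) for p in B)
    b = max(min(d(p, q) for q in B) for p in A)
    return max(a, b)
```

For A = {(0,0), (1,1)} and B = {(4,3), (0,1)} on a 5×4 grid, both functions give the Hausdorff distance 5 for d₄ and 3 for d₈.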

3.3 Grid Cell Metrics

In Sections 3.1.2 and 3.1.3, we defined metric-related concepts such as norms, scalar products, and angular values in n-dimensional vector spaces. In this section, we apply these concepts to the n-dimensional unitary modules defined by the grid cell model. Results involving grid cell adjacency models easily translate into results involving the isomorphic grid point adjacency models.

3.3.1 Basic grid cell metrics

We first consider the 2D grid cell model. Let d be a metric defined on the set Z² of grid points. For any two 2-cells c₁ and c₂, we define ∂(c₁, c₂) by the value of d for the center points of c₁ and c₂; ∂ is thus a metric on the set of 2-cells. When d = d₄, we call this metric ∂₁, and, when d = d₈, we call it ∂₀. Evidently, ∂₁(c₁, c₂) ≤ 1 iff c₁ ∩ c₂ contains (at least) one 1-cell, and ∂₀(c₁, c₂) ≤ 1 iff c₁ ∩ c₂ contains (at least) one 0-cell; the subscript indicates the dimension of the cells that have to be contained in the intersection. Smallest neighborhoods in the grid cell model are denoted by the following:

The 2D (grid cell) incidence model also includes 1-cells and 0-cells; it was illustrated in Figure 2.3, and in a more abstract way in Figure 2.13. The set of centers of the 2-, 1-, and 0-cells of this model is the grid with grid constant θ = 0.5. For any metric d on this grid and any two cells b and c of the model, ∂(b, c) is defined by the value of d for the center points of b and c; thus ∂ is a metric on the cells of the incidence model. For example (see Figure 2.3), the 2-cell with its center at (1,2) and the 0-cell at (1.5, 4.5) are at Euclidean distance √26/2 ≈ 2.55, city block distance 3, and chessboard distance 5/2.
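The numbers in this example can be reproduced by applying the three metrics to the cell centers. The Python sketch below assumes that the 0-cell is centered at (1.5, 4.5), which is the reading consistent with the grid constant 0.5 and with the stated city block and chessboard distances; the function name is ours.

```python
import math


def cell_distances(b, c):
    """Euclidean, city block, and chessboard distances between two cells
    that are identified with their center points b and c."""
    dx, dy = abs(b[0] - c[0]), abs(b[1] - c[1])
    return math.hypot(dx, dy), dx + dy, max(dx, dy)


# 2-cell centered at (1, 2) and 0-cell centered at (1.5, 4.5).
euclidean, city_block, chessboard = cell_distances((1, 2), (1.5, 4.5))
```

Here the coordinate differences are 0.5 and 2.5, so the city block distance is 3, the chessboard distance is 5/2, and the Euclidean distance is √6.5 = √26/2.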

Metrics on the set of 3-cells or on the 3D incidence model can be defined as they were in the 2D case by identifying cells with their centers. For any two 3-cells c₁ and c₂, ∂(c₁, c₂) is defined by the value of a grid point metric d for the center points of c₁ and c₂; thus ∂ is a metric on the set of 3-cells. The metrics defined in this way by d₆, d₁₈, and d₂₆ will be denoted by ∂₂, ∂₁, and ∂₀, respectively. Evidently, ∂₂(c₁, c₂) ≤ 1 iff c₁ ∩ c₂ contains (at least) one 2-cell, ∂₁(c₁, c₂) ≤ 1 iff c₁ ∩ c₂ contains (at least) one 1-cell, and ∂₀(c₁, c₂) ≤ 1 iff c₁ ∩ c₂ contains (at least) one 0-cell. The smallest neighborhoods are as follows:

Figure 3.9 shows the (smallest) neighborhoods of 3-cells at d₆-, d₁₈-, and d₂₆-distance ≤ 1 from a given 3-cell; they are “balls” of 3-cells of radius 1. A general definition of such metrics will be given in Section 3.3.2 for the n-dimensional grid cell model .

FIGURE 3.9 Balls of 3-cells of radius 1 with respect to d₆ (left), d₁₈ (middle), and d₂₆ (right).

3.3.2 Seminorms

An n-cell (n ≥ 2) is an n-dimensional grid (hyper)cube with edges of length 1 with its center at a grid point p ∈ Zⁿ. [,+, ·, Z] is a unitary module. We identify cells by their centers; hence, if p is the center of n-cell c ∈ , k ·c is the n-cell with its center at k · p, and –c is the n-cell with its center at –p.

Let the following be true (0 ≤ i <n):

B_i is a subset of the frontier of the n-dimensional cube [−1,+ 1]ⁿ. For example, B₀ = {−1,+ 1}ⁿ is the set of all vertices of this cube. We always have o = (0,…,0) ∉ B_i. Because i <n, we have x_k ≠ 0 for at least one coordinate k.

Let c∈ be an n-cell, let 0 ≤α <n, and let

(3.14)

where is the Minkowski sum defined in Section 1.2.12. Then the following is true:

For example, for n = 3, we have card (η₀(c)) = 27. Two n-cells c₁ and c₂ are called α-neighbors iff c₁ ∈ η_α(c₂). This defines the n-dimensional grid cell adjacency models.

Let the following be true for 0 ≤α <n and t ≥ 0:

Let ||c||_α be the smallest t such that c ∈ , where c ∈ . Here, ||c||_α is called the α-value of c. For example, for n = 2, we have ||c||₀ = max{|x|, |y|} and ||c||₁ = |x|+|y| where x and y are the coordinates of the center of c.

Theorem 3.6

Let c ∈ ,c =(x₁,…, x_n), and 0 ≤α <n. Then the following is true:

Proof By induction on t, it can be shown that the following is true:

Consequently, we have c_α = t iff the following is true:

For α = 0 and α = n−1, we obtain the following norms:

This coincides with the Minkowski norm ; see Section 3.1.2. We also obtain the following:

This coincides with the Minkowski norm . For 1 ≤α ≤ n − 2, k = n – α, and c =(k − 1, k, …, k), we have c_α = n (recall that n ≥ 2), and thus we have the following:

Thus, for 1 ≤ α ≤ n − 2, the α-value ||·||_α is not a norm but rather a seminorm, which also satisfies ||–c||_α = ||c||_α for all n-cells c and all 0 ≤ α < n. As in Equation 3.3, we define the following for all n-cells c₁ and c₂:

(3.15)

For example, for n = 3 and α = 1, ∂₁ is identical to the metric d₁₈ defined in Section 3.2.1.

Theorem 3.7

∂_α (0 ≤ α < n) is a regular metric on the set of n-cells.

Proof An α-path of length m − 1 from c₁ to c_m ≠ c₁ is a sequence of n-cells (c₁, c₂,…, c_m−1, c_m) such that c_i is an α-neighbor of c_i₊₁ and c_i ≠ c_i₊₁ (1 ≤ i ≤ m − 1). First, we show that c_α is equal to the length of an α-geodesic (c₁, c₂,…, c_m) from the origin o to c = c_m. Let b_i₊₁ = c_i₊₁ – c_i (1 ≤ i ≤ m − 1).

Then we have that

and let c = b₂+b₃+ …+b_m so that c ∈ and c_α ≤ m − 1. On the other hand, for any c ∈ , there exists an α-path of length m − 1 from o to c so that the length of an α-geodesic from o to c is at most c_α.

We have shown that, for any c₁ ≠ c₂, ∂_α(c₁, c₂) = ||c₂ – c₁||_α = m − 1 ≥ 1 is the length of an α-geodesic from c₁ to c₂. Let (b₁,b₂,…, b_m) be an α-geodesic from b₁ = o to b_m = c₂ – c₁; then (b₁+c₁, b₂+c₁,…, b_m+c₁) is an α-geodesic from c₁ to c₂. Let c₃ = b₂+c₁; then ∂_α(c₁, c₃) = 1 and ∂_α(c₁, c₂) = 1+∂_α(c₃, c₂), because (b₂+c₁,…, b_m+c₁) is an α-geodesic from c₃ to c₂. □

Theorem 3.6 shows that the metrics∂_α satisfy the following:

(3.16)

This complements Theorems 3.2 and 3.3.

3.3.3 Scalar products and angles

The seminorms || · || _α define weak scalar products 〈 ·, · 〉 _α (see Section 3.1.3), which satisfy the generalized Schwarz inequality on Zⁿ. It follows that they define angular values H_α(c₁, c₂, c₃) (see Equation 3.7). Following Proposition 3.1, H_α(c₁, c₂, c₃) is always in the range of the arccos function.

Angular values can be used to characterize 3-cell configurations. We say that c₂ is between c₁ and c₃ according to the α-metric (notation: (c₁, c₂, c₃)_α) iff || c₁ – c₃ || _α = || c₁ – c₂ || _α + || c₂ – c₃ || _α. We call c₁, c₂, and c₃ α-cogeodetic iff they are contained in an α-geodesic; this is equivalent to (c₁, c₂, c₃)_α, (c₁, c₃, c₂)_α, or (c₂, c₁, c₃)_α.

Conjecture 3.1

If H_α(c₁, c₂, c₃) = −1, then (c₁, c₂, c₃)_α; if H_α(c₁, c₂, c₃) = 1, then (c₂, c₁,c₃)_α.

Some values of H₀ and H₁ for n = 2 are given in Tables 3.2 and 3.3. In both tables, o =(0,0) and c =(13,8) are fixed 2-cells, and the values H₀(o,b,c) or H₁(o,b,c) are given by the positions of the variable 2-cell b. From these examples, it is clear that (c₁, c₂, c₃)_α does not imply H_α(c₁, c₂, c₃) = −1. H₁(o,b,c) = 0 indicates a position for which o – b and c – b are orthogonal with respect to the metric ∂_α.

TABLE 3.2

Rounded angular values H₀(o,b,c) at cell b. Positions with nonpositive values are shaded.

TABLE 3.3

Rounded angular values H₁(o,b,c) at cell b.

We saw in Section 3.3.2 that, if α = 0 or α = n−1, ||·||_α is a norm. Hence we always have 〈c,c〉₀ = ||c||₀² and 〈c,c〉_n−1 = ||c||²_n−1, but, for 1 ≤ α ≤ n−2, there exist n-cells c such that 〈c,c〉_α < ||c||²_α. Because 〈b,c〉_α = −〈b, –c〉_α and ||–c||_α = ||c||_α for all n-cells b and c (0 ≤ α < n), the angular values are symmetric, as we see in Tables 3.2 and 3.3.

The weak scalar products 〈·, ·〉_α are not homogeneous; for any α (0 ≤ α < n), there exist pairs of cells b₁ and b₂ and c₁ and c₂ such that 2 · 〈b₁, b₂〉_α < 〈2b₁, b₂〉_α and 2 · 〈c₁, c₂〉_α > 〈2c₁, c₂〉_α. It follows that these products are not linear.

3.4 Metrics on Pictures

3.4.1 Value-weighted distance

Let P be a picture with pixel or voxel values that have been divided by G_max so that they are in the range [0,1]. Let p and q be pixels or voxels of P, and let ρ =(p₀,…, p_n) be an α-path from p = p₀ to q = p_n. We define the value-weighted length of ρ as follows:

l_P(ρ) = Σ_{i=1}^{n} ½ (P(p_{i−1}) + P(p_i))

We define the value-weighted distance d_P(p,q) as min_ρ l_P(ρ) where the min is taken over all α-paths ρ from p to q.

Because the reversal of a path from p to q is a path from q to p and the concatenation of a path from p to q with a path from q to r is a path from p to r, d_P is symmetric and satisfies the triangle inequality, and evidently it is nonnegative. However, d_P is not positive definite (metric property M1); for example, if p and q are α-adjacent and P(p) = P(q) = 0, we have d_P(p,q) = 0 even though p ≠ q. On the other hand, if P(p) ≠ 0, we have d_P(p, q) ≠ 0 for any q ≠ p, because any α-path from p to q must go from p to some p′ ≠ p, and the pair (p,p′) contributes a nonzero quantity ½ (P(p) + P(p′)) to the sum. Thus d_P is a metric if we restrict it to the set of pixels or voxels with values that are nonzero or even if we restrict it to pairs (p, q) with values that are not both zero. (The latter is not a restriction of d_P to a set of pixels or voxels, but it justifies our studying the [value-weighted] distance from non-0s to 0s in the next section.)
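Because each step from a pixel to an α-adjacent pixel p′ contributes the nonnegative quantity ½ (P(p) + P(p′)), the value-weighted distance is a shortest-path problem with nonnegative edge weights. The Python sketch below uses Dijkstra's algorithm, which is our choice of technique and is not prescribed by the text; the function name, the indexing P[y][x], and the restriction of paths to the picture's rectangle are likewise assumptions.

```python
import heapq


def value_weighted_distance(P, p, q, alpha=4):
    """d_P(p, q) for a 2D picture P with values in [0, 1].

    P is a list of rows, indexed as P[y][x]; p and q are (x, y) pairs.
    """
    height, width = len(P), len(P[0])
    offsets = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    if alpha == 8:
        offsets += [(1, 1), (1, -1), (-1, 1), (-1, -1)]
    best = {p: 0.0}
    heap = [(0.0, p)]
    while heap:
        d, cur = heapq.heappop(heap)
        if cur == q:
            return d
        if d > best.get(cur, float("inf")):
            continue                      # outdated heap entry
        x, y = cur
        for dx, dy in offsets:
            u, v = x + dx, y + dy
            if 0 <= u < width and 0 <= v < height:
                # Each step contributes half the sum of the two pixel values.
                nd = d + 0.5 * (P[y][x] + P[v][u])
                if nd < best.get((u, v), float("inf")):
                    best[(u, v)] = nd
                    heapq.heappush(heap, (nd, (u, v)))
    return float("inf")
```

On a picture whose values are all 1, the result coincides with d_α; on the one-row picture with values (1, 0, 1), the distance between the two end pixels is ½ + ½ = 1; and two adjacent pixels of value 0 are at distance 0, which illustrates why d_P is not positive definite.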

When we restrict d_P to 〈P〉 (the set of pixels or voxels with values that are 1), we evidently have l_P(ρ) = n and d_P(p,q) = d_α(p,q) for all p,q ∈ 〈P〉. In the next sections, we will study 4- and 8-distances from the pixels of 〈P〉 to the set 〈P̄〉 of pixels with value 0 in a 2D binary picture.

3.4.2 Distance transforms

Let P be a binary picture in which 〈P〉 (the set of 1s) and 〈P̄〉 (the set of 0s) are proper subsets of the grid. For any grid metric d_α, the d_α distance transform of P associates with every pixel p of 〈P〉 the d_α distance from p to 〈P̄〉.⁴ The d_α distance transform of the set of gray pixels in the picture shown on the left in Figure 3.10 is shown for α = 4 in the middle and for α = 8 on the right.

FIGURE 3.10 Distance transforms: Left: Picture. Center: d₄ transform. Right: d₈ transform.

We will assume in the rest of this section that the pixels outside of a rectangular region G all have value 0. We will now show that the d₄ or d₈ distance transform of P can be computed by performing a series of local operations while scanning G twice. (A local operation gives each pixel p a new value that depends only on the old values of the neighbors of p.)

For any p ∈ G, let B(p) (“before”) be the set of pixels (4- or 8-) adjacent to p that precede p when G is scanned row by row from top to bottom and each row is scanned from left to right (see Section 1.1.3); thus, if p has coordinates (x,y), B(p) contains (x,y + 1) and (x − 1, y), and if we use 8-adjacency, it also contains (x − 1, y + 1) and (x + 1, y + 1). Let A(p) (“after”) be the remaining (4- or 8-) neighbors of p.

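Let the following be true, where 〈P̄〉 denotes the set of pixels with value 0 and 〈P〉 the set of pixels with value 1:

f₁(p) = 0 if p ∈ 〈P̄〉, and f₁(p) = min{f₁(q) + 1 : q ∈ B(p)} if p ∈ 〈P〉.

We can compute f₁(p) for every pixel in G in a single left-to-right, top-to-bottom scan of G, because for each p, f₁ has already been computed for all of the qs in B(p). (If p is on the top row or in the left column of G, some of these qs are outside G, but we know that f₁ = 0 for these qs because they are in 〈P̄〉.)

Now let the following be true:

f₂(p) = min{f₁(p), f₂(q) + 1 : q ∈ A(p)}.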

We can compute f₂ (p) for every pixel in G in a single right-to-left, bottom-to-top scan of G, because for each p, f₂ has already been computed for all of the qs in A(p) or is known because they are outside of G.
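The two scans can be sketched in Python as follows. The sketch assumes the recurrences f₁(p) = min{f₁(q) + 1 : q ∈ B(p)} for pixels of value 1 (and f₁(p) = 0 otherwise) and f₂(p) = min{f₁(p), f₂(q) + 1 : q ∈ A(p)}, with all pixels outside G having value 0. The function name and the array indexing P[r][c] (row r counted from the top, column c from the left) are ours.

```python
def two_scan_distance_transform(P, alpha=4):
    """d_4 or d_8 distance transform of the binary picture P (list of rows)."""
    rows, cols = len(P), len(P[0])
    # Neighbors that precede p in a top-to-bottom, left-to-right scan.
    before = [(-1, 0), (0, -1)]
    if alpha == 8:
        before += [(-1, -1), (-1, 1)]
    # The remaining neighbors are the mirror images of those in `before`.
    after = [(-dr, -dc) for dr, dc in before]

    f = [[0] * cols for _ in range(rows)]

    def value(r, c):
        # Pixels outside G have value 0 and hence distance 0.
        return f[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    # First scan: f holds f1 after this loop.
    for r in range(rows):
        for c in range(cols):
            if P[r][c] != 0:
                f[r][c] = 1 + min(value(r + dr, c + dc) for dr, dc in before)

    # Second scan (bottom to top, right to left): f holds f2 afterward.
    for r in range(rows - 1, -1, -1):
        for c in range(cols - 1, -1, -1):
            f[r][c] = min(f[r][c],
                          1 + min(value(r + dr, c + dc) for dr, dc in after))
    return f
```

For a 3×3 picture of 1s, both the 4- and the 8-adjacency version give the value 2 at the central pixel and 1 elsewhere. Updating the array in place is a design choice: during the second scan each entry still holds f₁ until it is visited, and the entries in A(p) already hold f₂.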

The f₁s that use 4- and 8-adjacency are shown in Figure 3.11 for the picture on the left in Figure 3.10. The f₂s are not shown in Figure 3.11, because they are the same as the d₄s (d₈s) shown in Figure 3.10, as we see from the following:

FIGURE 3.11 The first stage in computing the distance transform of the binary picture on the left in Figure 3.10. Left: d₄ transform. Right: d₈ transform.

Theorem 3.8

f₂(p) = d(p, 〈P̄〉) for all p ∈ G where d = d₄ for the 4-adjacency version of the algorithm and d = d₈ for the 8-adjacency version.

Proof Evidently, if f₂(p) = 0, p must be in 〈P̄〉. Suppose f₂(p) = d(p, 〈P̄〉) for all p such that f₂(p) < n, and let f₂(p) = n > 0. Then either some q ∈ A(p) has f₂(q) = n − 1 or else f₁(p) = n, which implies that some q ∈ B(p) has f₁(q) = n − 1. In either case, using the induction hypothesis, d(q, 〈P̄〉) ≤ n − 1 so that d(p, 〈P̄〉) ≤ n. If d(p, 〈P̄〉) < n, we must have d(q, 〈P̄〉) < n − 1 for some neighbor q of p. For this neighbor, we must have either f₁(q) or f₂(q) < n − 1, which implies that f₂(p) < n, thereby contradicting our assumption that f₂(p) = n. □

Note that a two-pass distance transform algorithm is valid for any chamfer distance that satisfies Montanari’s inequalities 3.13, not just for d₄ and d₈.

Let P⁽¹⁾ = P, and, for k = 1,2,…, let P^(k+1) be the integer-valued picture in which P^(k+1)(p) = 0 if P(p) = 0, and otherwise P^(k+1)(p) = min P^(k)(q) + 1, where the min is taken over the pixels q that are α-adjacent to p. It is not hard to see that, for all k ≤ d_α(p, 〈P̄〉), we have P^(k)(p) = k and, for all k ≥ d_α(p, 〈P̄〉), we have P^(k)(p) = d_α(p, 〈P̄〉). Let D_α = max{d_α(p, 〈P̄〉) : p ∈ 〈P〉}; we call D_α the α-radius of 〈P〉. Then, for any k ≥ D_α, we have P^(k)(p) = d_α(p, 〈P̄〉) for all p ∈ 〈P〉 so that P^(k) is the d_α distance transform of P. Note that computing the d_α distance transform in this way requires performing local operations at every pixel during D_α − 1 scans of P to successively compute P⁽²⁾, P⁽³⁾,…, P^(D_α), whereas the method used in Theorem 3.8 requires performing local operations during only two scans of P.
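The iterative construction of P⁽²⁾, P⁽³⁾, … can be sketched in Python as follows. This is an illustration under the same assumption as before, namely that pixels outside the array have value 0; the function name is ours. The function also returns the number of scans that changed at least one value, so that it can be compared with the two scans needed in Theorem 3.8.

```python
def iterated_distance_transform(P, alpha=4):
    """Return (transform, scans), where `transform` is the d_alpha distance
    transform of the binary picture P and `scans` is the number of
    iterations that changed at least one pixel value."""
    rows, cols = len(P), len(P[0])
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if alpha == 8:
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]

    def value(pic, r, c):
        # Pixels outside the array are 0s.
        return pic[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    current = [[1 if v != 0 else 0 for v in row] for row in P]   # P^(1)
    scans = 0
    while True:
        # P^(k+1)(p) = 0 on the 0s, else 1 + min of P^(k) over the neighbors.
        nxt = [[0 if P[r][c] == 0 else
                1 + min(value(current, r + dr, c + dc) for dr, dc in offsets)
                for c in range(cols)] for r in range(rows)]
        if nxt == current:
            return current, scans
        current = nxt
        scans += 1
```

For a 5×5 picture of 1s, the central pixel has distance 3 from the 0s, and two scans change values, in agreement with the count D_α − 1.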

3.4.3 The Euclidean distance transform

The d₄ and d₈ distance transforms of Section 3.4.2 are easy to compute, but, as we saw in Section 3.2.3, d₄ and d₈ are not good approximations to Euclidean distance. We will now describe Danielsson’s algorithm [229] for computing a distance transform in which the distances differ from Euclidean distance by at most a fraction of the grid constant.

To each pixel p = (x,y) of P, we assign a pair of integers (f(x), f(y)) that is initially (0,0) if p ∈ 〈P̄〉 (the set of 0s) and (D, D) if p ∈ 〈P〉 (the set of 1s), where D is greater than the diameter of P (the greatest distance between any two pixels of P). We then scan P and update the (f(x), f(y)) values as described in Algorithm 3.2. In the min computations, we pick the pair (u,v) for which u²+v² is smaller; if they are equal, we pick the one for which u is smaller. Note that, in both sets of scans, the values of (f(x), f(y)) are first modified by a single comparison with a vertical neighbor and then by a set of comparisons with left and right horizontal neighbors.

ALGORITHM 3.2 Danielsson’s algorithm for calculating the Euclidean distance transform.

1. For each row of P (from top to bottom), replace each (f(x), f(y)) (from left to right) with

min((f(x),f(y)), ((f(x),f(y-1))+(0, 1)));

then replace each (f(x), f(y)) (from left to right) with

min((f(x),f(y)), ((f(x-1),f(y))+(1, 0)));

then replace each (f(x), f(y)), except the rightmost one (from right to left), with

min((f(x),f(y)), ((f(x+1),f(y))+(1, 0))).

2. For each row of P—except the bottom row (from bottom to top)—replace each (f(x), f(y)) (from left to right) with

min((f(x),f(y)), ((f(x),f(y+1))+(0, 1)));

then replace each (f(x), f(y)) (from left to right) with

min((f(x),f(y)), ((f(x-1),f(y))+(1, 0)));

then replace each (f(x), f(y)), except the rightmost one (from right to left), with

min((f(x),f(y)), ((f(x+1),f(y))+(1, 0))).

When the scans are complete, the value of f(x) at p should be the difference between the x coordinates of p and the nearest pixel q of 〈P̄〉 (the set of 0s), and the value of f(y) should be the difference between their y coordinates. The Euclidean distance between p and q would then be √(f(x)² + f(y)²). In fact, we will see in the next paragraph that the (f(x), f(y)) values are not always exactly equal to the nearest-pixel coordinate differences. Figure 3.12 shows the (f(x), f(y)) values for the pixels in the gray area in Figure 3.10; in this simple example, the values are all correct.

FIGURE 3.12 Computation of the Euclidean distance transform. Left: the pair of (final) values at a pixel of 〈P〉 are its x and y coordinate differences from the nearest pixel of 〈P̄〉. Right: corresponding values of the Euclidean distance, rounded to two decimal places.
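The scans of Algorithm 3.2 can be sketched in Python as follows. This is our reading of the algorithm and not a definitive implementation: the function names and the indexing P[r][c] (row r from the top) are assumptions, the vertical comparison is skipped for the first row of each pass because it has no previously scanned row, and the sentinel D is taken to be the sum of the picture's dimensions, which exceeds its diameter.

```python
import math


def danielsson_edt(P):
    """Return an array of pairs (f_x, f_y): approximate coordinate differences
    from each pixel of the binary picture P to a nearest pixel of value 0."""
    rows, cols = len(P), len(P[0])
    D = rows + cols                      # greater than the diameter of P
    f = [[(0, 0) if P[r][c] == 0 else (D, D) for c in range(cols)]
         for r in range(rows)]

    def better(a, b):
        # Smaller u^2 + v^2 wins; ties are broken by the smaller u.
        return min(a, b, key=lambda t: (t[0] * t[0] + t[1] * t[1], t[0]))

    def horizontal_scans(r):
        for c in range(1, cols):                 # left to right
            u, v = f[r][c - 1]
            f[r][c] = better(f[r][c], (u + 1, v))
        for c in range(cols - 2, -1, -1):        # right to left
            u, v = f[r][c + 1]
            f[r][c] = better(f[r][c], (u + 1, v))

    def vertical_scan(r, source_row):
        for c in range(cols):
            u, v = f[source_row][c]
            f[r][c] = better(f[r][c], (u, v + 1))

    for r in range(rows):                        # step 1: top to bottom
        if r > 0:
            vertical_scan(r, r - 1)
        horizontal_scans(r)
    for r in range(rows - 2, -1, -1):            # step 2: bottom to top
        vertical_scan(r, r + 1)
        horizontal_scans(r)
    return f


def distances(f):
    """Euclidean distances corresponding to the pairs computed above."""
    return [[math.hypot(u, v) for u, v in row] for row in f]
```

For a picture with a single 0 at row 2, column 3, every pixel (r, c) receives the pair (|c − 3|, |r − 2|), so the distances are exact in this simple case.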

To see how the distances computed by Danielsson’s algorithm can be incorrect, consider circles of radius a centered at (x − 1, y) and (x, y − 1) (see Figure 3.13). Let q =(x – a − 1, y) and s = (x, y – a − 1), and let r be the point where the circles intersect. If q, r, and s are in 〈P̄〉 (the set of 0s), the algorithm gives value a + 1 to the pixel p at (x,y), because its neighbors (x – 1,y) and (x,y − 1) are at distance a from q and s, respectively; however, p is actually at distance b < a + 1 from r. Indeed, as we see from Figure 3.13, the distance a from (x, y − 1) to r is the hypotenuse of a right triangle with legs that are b/√2 and (b/√2) − 1; thus a² = b²/2 + (b²/2) + 1 – 2b/√2 = b² – √2·b + 1, so that the following is true:

b = 1/√2 + √(a² − 1/2), and the error is (a + 1) − b.

It can be shown that this is a worst case; note that even in this case the error is only a fraction of the grid constant.

FIGURE 3.13 A worst case for the Euclidean distance transform.

3.4.4 Medial axes

For any grid metric d_α, any pixel p, and any k ≥ 0, let P^(k)(p) = {q : d_α(p,q) ≤ k}; thus P⁽⁰⁾(p) = {p} ⊂ P⁽¹⁾(p) ⊂ P⁽²⁾(p) ⊂ …. We recall that, for α = 4, P^(k)(p) is a diagonally oriented square centered at p, and, for α = 8, P^(k)(p) is an upright square centered at p. If p ∈ 〈P〉 and k < d_α(p, 〈P̄〉), evidently P^(k)(p) ⊆ 〈P〉, and all of the P^(k)(p)s contain p, so 〈P〉 is the union of all balls P^(k)(p) with p ∈ 〈P〉 and k < d_α(p, 〈P̄〉).

In the d_α distance transform of P, each pixel p of 〈P〉 has value d_α(p, 〈P̄〉). Evidently, the largest ball centered at p that is contained in 〈P〉 is not contained in the largest such ball centered at q for any neighbor q of p iff p ∈ M_α(〈P〉). Hence 〈P〉 is the union of these largest balls for all p ∈ M_α(〈P〉). The picture in which the value of p is d_α(p, 〈P̄〉) if p ∈ M_α(〈P〉) and 0 otherwise is called the d_α medial axis transform (MAT) of 〈P〉.

Definition 3.3

We say that p belongs to the medial axis M_α(〈P〉) of 〈P〉 if d_α(p, 〈P̄〉) is a local maximum of the d_α distance transform of P (i.e., d_α(q, 〈P̄〉) ≤ d_α(p, 〈P̄〉) for all α-neighbors q of p).

The medial axis transforms for the distance transforms of Figures 3.10 and 3.12 are shown in Figure 3.14. Note that the pixels of M_α(〈P〉) are centrally located in 〈P〉, so they constitute a kind of “skeleton” of 〈P〉; however, this skeleton may not be connected even if 〈P〉 is simply connected, and it may be two pixels thick if 〈P〉 has even width. Methods of constructing thin connected skeletons will be discussed in Section 16.3.

FIGURE 3.14 Medial axis transforms for the distance transforms of Figures 3.10 and 3.12. Left: d₄. Center: d₈. Right: approximate d_e (values shown to one decimal place).
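Given a distance transform, the medial axis transform follows directly from Definition 3.3: a pixel with a positive value is kept iff none of its α-neighbors has a larger value. The Python sketch below takes the distance transform as input; the function name is ours, and neighbors outside the array are assumed to be 0s.

```python
def medial_axis_transform(dt, alpha=4):
    """MAT of a distance transform `dt` (list of rows): the value of dt at
    local maxima with respect to alpha-neighbors, and 0 elsewhere."""
    rows, cols = len(dt), len(dt[0])
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if alpha == 8:
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]

    def value(r, c):
        # Pixels outside the array are 0s and therefore have distance 0.
        return dt[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    mat = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if dt[r][c] == 0:
                continue                  # pixels with value 0 are not on the axis
            if all(value(r + dr, c + dc) <= dt[r][c] for dr, dc in offsets):
                mat[r][c] = dt[r][c]
    return mat
```

For the distance transform of a 3×3 square of 1s (value 2 at the center and 1 elsewhere), the d₈ medial axis consists of the central pixel only, whereas the d₄ medial axis also contains the four corner pixels, because a corner has no 4-neighbor with a larger value.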

For any p ∈ 〈P〉 and any grid metric d_α, let D_α(p) be the largest ball P^(k)(p) centered at p that is contained in 〈P〉. If p ∈ M_α(〈P〉), D_α(p) can be α-adjacent to 〈P̄〉 at only one pixel. For example, let 〈P〉 be a “vertical strip” of even width; then M_α(〈P〉) is of width 2, and the largest balls “touch” the strip’s border on only one side, at a single border pixel. Thus p ∈ M_α(〈P〉) implies only that there is at least one shortest α-path from p to 〈P̄〉. The original definition of M(〈P〉) described the process of constructing M(〈P〉) in terms of a “grass fire” ignited along the border of 〈P〉 and defined M(〈P〉) as the locus of points at which the grass fire meets itself. However, as we have seen, a pixel p on the medial axis is not necessarily characterized by two shortest α-paths from p to 〈P̄〉. (This is true also in the continuous case, for example, for a parabolic set, where the endpoint of the medial axis is at distance 0 only from itself.) On the other hand, if there are at least two shortest α-paths from p to 〈P̄〉, then p is on the medial axis.

Our definitions of the distance transform and the medial axis transform assumed that P is a binary picture. We conclude this section by mentioning several generalizations of the medial axis transform to multivalued pictures.

In the SPAN [7], P is approximated by a set of maximal “disks” (e.g., squares) in which the pixel values are “homogeneous,” and the generalized medial axis is the set of centers of the disks. If P is binary, the disks have constant value 1, the approximation is exact, and the set of centers of the disks is M(P). (A medial axis based on fuzzy disks [fuzzy sets with membership values that depend only on distance from an origin] is described in [792].)

In the GRAYMAT [644], the gray-weighted length of a path is defined as proportional to the sum of the values of the pixels on the path (see Section 3.4.1), and the generalized medial axis is the set of pixels p that do not lie on any minimal-length path from any other pixel to the set of 0s.

The GRADMAT [1118] assigns a score to a pixel p by summing the gradient magnitudes (the maximal rates of change) of the pixel values at pairs of pixels that have p as their midpoint; the generalized medial axis consists of pixels that have high scores. Note that, in a binary picture, the gradient of the pixel values is nonzero only at the frontier of 〈 P 〉 .

A definition of the medial axis based on morphologic operations will be given in Section 15.6.3; this definition too applies to multivalued pictures.

3.5 Exercises

1. Let p = (x₁, y₁, z₁) and q =(x₂, y₂, z₂) be points in R³, and let the following be true:

Prove that d_t is not a metric on R³ but rather that it is a metric on the subset {p : p =(x,y,z) ∈ R³ ∧ z ≥ 0}. We call it the forest metric, because it corresponds to the distance in which moves are of the form “climb down the first tree, walk to the second tree, and climb up the second tree.”

2. Prove that d has properties M1 through M3 on S iff it has the following two properties:

(i) For all p,q ∈ S, d(p, q)= 0 iff p = q.

(ii) For all p,q,r ∈ S, d(p,r) ≤ d(q,p)+d(q,r).

3. Let [S,d] be a metric space. A sequence {p_i}_{i = 0,1,2,…} of points of S is called convergent iff there is a p ∈ S such that, for any ε > 0, there is an i₀ ∈ N such that p_i ∈ U_ε(p) for all i ≥ i₀. It is not hard to see that p must be unique; it is called the limit point of the sequence. Show that a sequence p_i = (x_i, y_i) ∈ R² (i = 0,1,2,…) is convergent in [R²,L_m] (1 ≤ m ≤ ∞) iff it is convergent in [R²,d_e].

4. Is the sequence 1, −2, 3, −4, … convergent in [R,d_e]? Are the sequences

convergent in [R²,d_e]? Are they convergent if we use the binary metric d_b? If so, what are their limit points?

5. Let C be a disk of integer radius centered at a grid point, and let D be the frontier of the union of the grid cells that are contained in C. Prove that the Hausdorff distance between the frontiers of C and D is equal to the grid constant.

6. Prove that d₄ and d₈ are metrics on Z².

7. Prove that, on a (k+1)×(k+1) grid, d_e – d₈ can be as great as (√2 − 1)k ≈ 0.41k and d₄ – d_e can be as great as (2 – √2)k ≈ 0.59k. Prove that, for the “octagonal” distance d (Equation 3.11), we have |d_e – d| ≤ (√5/2 − 1)k ≈ 0.12k.

8. Let ||p||_m be the norms defined in Section 3.1.2. Prove that, for any p ∈ R², we have ||p||₂ ≤ ||p||₁ ≤ √2 · ||p||₂, ||p||_∞ ≤ ||p||₂ ≤ √2 · ||p||_∞, and ||p||_∞ ≤ ||p||₁ ≤ 2 · ||p||_∞. Express these inequalities in terms of the metrics d₄, d_e and d₈ on R². For R³, prove that d₂₆(p, q) ≤ d_e(p, q) ≤ √3 · d₂₆(p, q) and d_e(p, q) ≤ d₆(p, q) ≤ √3 · d_e(p, q).

9. Define “hyperoctagonal” distances d by combining d₆ with d₁₈ or d₂₆, and find bounds on |d_e – d| on a (k+ 1)×(k+ 1) grid.

10. The 3D grid is composed of grid planes z = k, where k is an integer. Construct a modified grid in which odd-numbered grid planes are shifted half a unit in the+ x and+ y directions. Each point in this grid can be regarded as having 12 neighbors: four on its own plane and four on each of the planes above and below it. Define a “d₁₂” metric on this grid in analogy with the d_h metric on the hexagonal 2D grid defined in Section 3.2.3.

11. Prove that d_1,1 = d₈ and d_1,∞ = d₄.

12. Prove that the following is true:

Prove that the chamfer distance d_1,b that best approximates d_e has b = (1/√2) + √(√2 − 1) ≈ 1.351 and that, for this d_1,b, we have the following:

This optimal b is close to 4/3; we therefore get a good approximation to 3d_e by using a = 3 and b = 4 (“(3,4) chamfer distance”). (Other simple combinations of basic moves can be used to give even better approximations to Euclidean distance.)

13. Find bounds for |d_e – d_a,b,c| on a (k+ 1)×(k+ 1)×(k+ 1) grid where d_a,b,c is a 3D chamfer distance, and find values for a, b, and c that minimize these bounds.

14. A knight in chess can move two steps in an isothetic direction and one step in a perpendicular direction. For any p, q ∈ Z², let d_k(p, q) be the minimum number of knight’s moves required to go from p to q. Prove that d_k is a metric on Z².
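Small values of d_k can be tabulated by breadth-first search, which may help in forming conjectures for Exercise 14. The Python sketch below is an illustration, not part of the original text. Because Z² is infinite, the search is confined to a window around p and q; the window margin of 4 is an assumption of this sketch that is taken to be large enough for shortest paths, not a result from the chapter.

```python
from collections import deque

# The eight knight moves: two steps in one isothetic direction, one in the other.
KNIGHT_MOVES = [(1, 2), (2, 1), (-1, 2), (-2, 1),
                (1, -2), (2, -1), (-1, -2), (-2, -1)]

def knight_distance(p, q, margin=4):
    """Minimum number of knight moves from p to q, found by BFS.

    The search is restricted to the bounding box of p and q, enlarged
    by `margin` cells on every side (an assumption of this sketch).
    """
    tx, ty = q[0] - p[0], q[1] - p[1]          # translate so that p is the origin
    lo_x, hi_x = min(0, tx) - margin, max(0, tx) + margin
    lo_y, hi_y = min(0, ty) - margin, max(0, ty) + margin
    dist = {(0, 0): 0}
    queue = deque([(0, 0)])
    while queue:
        x, y = queue.popleft()
        if (x, y) == (tx, ty):
            return dist[(x, y)]
        for dx, dy in KNIGHT_MOVES:
            nxt = (x + dx, y + dy)
            inside = lo_x <= nxt[0] <= hi_x and lo_y <= nxt[1] <= hi_y
            if inside and nxt not in dist:
                dist[nxt] = dist[(x, y)] + 1
                queue.append(nxt)
    raise RuntimeError("target not reached; increase margin")

if __name__ == "__main__":
    for target in [(2, 1), (1, 1), (1, 0), (2, 2)]:
        print(target, knight_distance((0, 0), target))
```

Note that adjacent grid points are far apart in this metric: a 4-neighbor of p cannot be reached in fewer than three moves, because every knight move changes the parity of x + y.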

15. Define a concept of intrinsic distance for fuzzy subsets. (Hint: Define the length of a path (p₁, …, p_n) as the sum of f(μ(p_i)) where f is a monotonic function that maps 0 into ∞ and 1 into 0.)

16. An integer-valued picture P is called α-smooth iff |P(p) – P(q)| ≤ 1 whenever p and q are α-neighbors. Prove that the d_α distance transform of a binary picture P is the lowest-valued α-smooth picture that has value 0 at all pixels of .
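The smoothness property in Exercise 16 can be observed experimentally for α = 4. The Python sketch below is an illustration, not the chapter's own algorithm. It uses a common formulation of the two-scan d₄ distance transform (a forward raster scan followed by a backward scan) and assumes that the picture is a rectangular list of lists containing at least one 0. The function names are hypothetical.

```python
def d4_transform(picture):
    """Two-scan d4 distance transform of a binary picture.

    Each pixel receives its d4 distance to the nearest 0-pixel.
    Assumes a rectangular picture with at least one 0.
    """
    rows, cols = len(picture), len(picture[0])
    inf = rows + cols  # larger than any d4 distance inside the grid
    dist = [[0 if picture[r][c] == 0 else inf for c in range(cols)]
            for r in range(rows)]
    # Forward scan: propagate distances from the upper and left neighbors.
    for r in range(rows):
        for c in range(cols):
            if r > 0:
                dist[r][c] = min(dist[r][c], dist[r - 1][c] + 1)
            if c > 0:
                dist[r][c] = min(dist[r][c], dist[r][c - 1] + 1)
    # Backward scan: propagate distances from the lower and right neighbors.
    for r in reversed(range(rows)):
        for c in reversed(range(cols)):
            if r < rows - 1:
                dist[r][c] = min(dist[r][c], dist[r + 1][c] + 1)
            if c < cols - 1:
                dist[r][c] = min(dist[r][c], dist[r][c + 1] + 1)
    return dist

def is_4_smooth(values):
    """True if values at 4-neighboring pixels differ by at most 1."""
    rows, cols = len(values), len(values[0])
    for r in range(rows):
        for c in range(cols):
            if r + 1 < rows and abs(values[r][c] - values[r + 1][c]) > 1:
                return False
            if c + 1 < cols and abs(values[r][c] - values[r][c + 1]) > 1:
                return False
    return True

if __name__ == "__main__":
    picture = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
    transform = d4_transform(picture)
    print(transform, is_4_smooth(transform))
```

Such a check only supports the claim on particular inputs; the exercise still requires a proof that no lower-valued 4-smooth picture exists.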

17. Give a “two-scan” algorithm for computing the value-weighted distance from every non-0 to the set of 0s in a multivalued picture.

18. Give “two-scan” algorithms for computing 3D distance transforms for d₆, d₁₈, and d₂₆.

19. Give an algorithm for computing a 3D Euclidean distance transform.

20. Prove that p ∈ M_α(P) iff p does not lie on a shortest α-path from any q ≠ p ∈ to .

21. Define algorithms analogous to those in Theorem 3.8 for reconstructing the binary picture P (that is, its set of 1s) from the medial axis transform of P.

22. An oval is a bounded closed convex subset of R²; it is said to be proper if it has interior points. Two sets do not overlap iff their interiors are disjoint. Let M be a proper oval, and let n(M) denote the maximum number of nonoverlapping translates of M that can be arranged so as not to be disjoint from M. Prove that 7 ≤ n(M) ≤ 9, and show that n(M) = 7 if M is a disk and n(M) = 9 if M is a square.

23. Design an efficient algorithm for calculating the intrinsic diameter of a simple polygon or polyhedron and the intrinsic distances between pairs of its vertices.

24. Prove that the intrinsic (4- or 8-)diameter of a connected set of pixels S (the greatest intrinsic [4- or 8-]distance between any two pixels of S) is at most half the total (4- or 8-)perimeter of S (the sum of the [4- or 8-]perimeters of all of the frontiers of S).

3.6 Commented Bibliography

There are many textbooks about metric, normed, and Hilbert spaces; see, for example, [144, 490, 1071]. For relationships between topologies and metrics, see [352]. For a survey of publications about digital metrics, see [720].

Metrics on Z² were studied in [922], which is the source of Proposition 3.2. The integer-valued metrics d₄ and d₈, the “octagonal” metrics obtained by combining d₄ and d₈, and the “hexagonal” metric d_h were all introduced in [922]; an improved treatment of d_h was given in [672]. For characterizations of d₄ and d₈, see [411, 718, 719, 723, 850] and [284]; for rounded Euclidean distance, see [851]; for additional results about octagonal distances, see [243, 244, 752].

Metric d₁₈ was defined and studied in [785]. [365] also calculates d₁₈ and counts the number of all minimum-length 18-paths between the origin and a grid point (i, j, k) ∈ Z³. For other references about numbers of paths, see [235, 241, 365, 922].

The n-dimensional case (see Section 3.3) was treated in [540]. The metrics ∂_α on , 0 ≤ α < n (see Theorem 3.6 and Equation 3.15) define balls (all n-cells at distances ≤ r ∈ N from the origin) and spheres (all n-cells at distance = r ∈ N from the origin). [242] studied the volumes of these balls and the “surface areas” of these spheres (the numbers of n-cells contained in the ball or sphere).

For metrics defined by arbitrary neighborhood sequences, see [233, 239, 1146, 1147]. Chamfer distances in arbitrary dimensions were popularized in [103]. See [104, 108, 182, 236, 237, 238, 239, 241, 245, 526, 701, 785, 852, 1037] for related work. For criteria for optimizing chamfer distances, see [66, 67, 107, 156, 1093, 1114]. Distance functions were used to define “continuous” functions on pictures in [767, 903]. For linear metrics on discrete sets, see [247]. For metric-preserving transforms, see [233].

For geodesic distances, see [619]; for geodesic distances on fuzzy subsets, see [91]. Much of the material in Section 3.2.4 is from [889]. Average distances in digital sets are studied in [904], and metric bases in the grid are studied in [724]. [525] estimates distances between borders of components of voxels by calculating geodesics that contain only border voxels.

See [63] for properties of the Hausdorff metric on compact sets. The algorithm discussed in Section 3.2.5 was proposed in [989]. For a linear-time algorithm for calculating the Hausdorff distance between convex polygons, see [52]. The maximum Euclidean distance between two finite planar sets can be calculated in O(n log n) time [1063]. Algorithms for calculating Hausdorff distances (defined by arbitrary Minkowski metrics) between finite planar sets are discussed in [457]. Distances between sets (as in Proposition 3.4) were studied in [556] under the name measures of correspondence. Minimization of the Hausdorff distance between a bounded set S ⊂ Rⁿ and a set M ⊂ Zⁿ defines the Hausdorff digitization of S; see [131, 1116]. See [458] for more about distances between pictures. Hausdorff distances between fuzzy sets (or multilevel pictures) are studied in [179, 181, 901].

For a generalization of the concept of a distance transform, see [897]. The two-scan algorithm for computing the d₄ and d₈ transforms was introduced in [921], which also introduced the digital medial axis (under the name “distance skeleton”). This was further explored in [933]. For fast computation of distance transforms, see [833]. Generalized distance transforms are discussed in [1119]. For computing distance and distance-related transforms in nonrectangular domains, see [818]. For constrained distances, see [1095]. For 3D distance transforms and their uses, see [1040]. For other references about distance transforms, see [520, 929, 977]. [224] contains a detailed review of distance transforms and also covers algorithmic and application aspects.

The generalization of distance transforms to multivalued pictures (using the value-weighted distance to the set of pixels that have value 0) was studied in [644]. For the computation of such transforms, see [880]; for other weighted distances, see, for example, [92, 829, 1011, 1054, 1057]. For fuzzy distance transforms, see [945].

Algorithms for computing Euclidean distance transforms are discussed in [103, 125, 225, 229, 304, 306, 647, 754, 982, 983, 987]. Figure 3.13 and the accompanying discussion follows [229]. Other references for Euclidean distance transforms are [115, 263, 608, 717, 745, 831, 832, 1010, 1097]. For the 3D case, see [834, 1056, 1068].

For the medial axis (also called the “symmetric axis”) and its mathematical theory, see [93, 160, 707]. For other references about digital medial axes, see [2, 28, 34, 40, 85, 197, 199, 200, 213, 322, 349, 476, 521, 634, 686, 774, 777, 953, 972, 956, 975, 1038, 1142]. For medial axes for chamfer and Euclidean distances, see [39, 41, 42, 303, 359, 845, 846, 953, 984, 1094, 1133].

For the “knight’s distance” (Exercise 14), see [240, 244]. Exercise 22 is from [394], and Exercise 24 is from [886].

¹ The norm is defined not only for integer m but for any real number m ≥ 1.

² In the mathematics literature, a unit circle or unit sphere is traditionally defined as having radius 1 (i.e., diameter 2), whereas a unit square or unit cube is (inconsistently!) defined as having side 1.

³ Intrinsic distance is sometimes called “geodesic distance,” but, to avoid confusion with the noun “geodesic,” we will not use that term.

⁴ The distance transform is essentially the same as the distance field F(); see Section 3.2.5.
