Many systems behave unpredictably and randomly, or at least seem to. Some appear random because we lack complete information about them, such as a coin toss or golden autumn leaves fluttering and falling in the wind. Other systems, though perfectly deterministic and well defined, are random because of their intrinsic, probabilistic nature, such as radioactive particle decay and the measurement of quantum systems.
We consider several simple random systems in this chapter so as to be familiar with random sampling and distributions needed in the next two chapters. First we introduce random numbers and model particle decay by random sampling. We also discuss random walks in 1D and 2D, as well as a stochastic model for Brownian motion, a simple but important phenomenon. Finally we introduce Monte Carlo integration to evaluate multidimensional integrals including electrostatic potential energies.
The first step in representing a random problem is the generation of random numbers. At first glance, this appears paradoxical: on the one hand, the digital computer is as deterministic a machine as it gets, yet on the other hand we want it to produce random numbers. This is a genuine dilemma: we cannot generate truly random numbers on the computer. We can, however, generate a sequence of pseudo-random numbers that mimics the properties of a series of true random numbers.
The generation of pseudo-random numbers usually exploits bit overflow in integer multiplication at the low level. For our purposes, Python has a random number library that provides several random distributions. For example, the random() function returns numbers uniformly distributed in the interval x ∈ [0, 1).
We can repeat a random sequence by initializing it with the seed function seed(n), where n is an identifier, typically an integer.
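A minimal sketch of both functions, using Python's built-in random module (the seed value 11 is an arbitrary choice): seeding with the same value reproduces the same pseudo-random sequence.

```python
import random as rnd

rnd.seed(11)                             # fix the seed for reproducibility
a = [rnd.random() for i in range(3)]     # three numbers uniform in [0, 1)

rnd.seed(11)                             # same seed, same sequence
b = [rnd.random() for i in range(3)]

print(a == b)                            # True: the pseudo-random sequence repeats
```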
Truly random numbers have no correlation between them. Suppose we have a sequence of random numbers, {x}. Let us define a joint probability P(x, x′) as the probability of finding x′ after x. No correlation means that the chance of finding x′ must not depend on x before it. In other words,
As a result, a correlation test between two random numbers is
Similarly, the moment tests are defined as
We can perform the tests numerically by generating a sequence of N random numbers {xi} and calculating the following averages
Table 10.1 shows the results of moments and correlation of various order k. A million random numbers (N = 10⁶) were drawn from random() in each case.1 The values match their theoretical predictions to within the statistical uncertainty of order 1/√N ∼ 10⁻³.
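The moment and correlation tests can be sketched as follows, with a smaller sample (N = 10⁵) than Table 10.1 and an arbitrary seed; for a uniform distribution on [0, 1) the moments should approach 1/(k + 1) and the correlations 1/4, to within roughly 1/√N.

```python
import random as rnd

rnd.seed(1)
N = 10**5
x = [rnd.random() for i in range(N)]

def moment(x, k):      # kth moment <x^k>; theory: 1/(k+1) for uniform [0, 1)
    return sum(t**k for t in x)/len(x)

def corr(x, k):        # correlation <x_i x_(i+k)>; theory: (1/2)^2 = 1/4
    return sum(x[i]*x[i+k] for i in range(len(x) - k))/(len(x) - k)

for k in (1, 2, 3):
    print(k, moment(x, k), corr(x, k))
```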
Radioactive nuclear decay is an intrinsically random process because it occurs via quantum transitions which are probabilistic in nature as seen in Section 8.4. Let N be the number of nuclei alive at time t, and let −ΔN be the number of nuclei that will decay between t and t + Δt. Then, −ΔN will be proportional to the existing nuclei and the time step as
where λ is the decay constant related to the lifetime of the unstable nuclei as τ = 1/λ (Exercise E2.7).
We can interpret the LHS of Eq. (10.6) as the probability of decay in one step, p = λΔt, a constant. To simulate the decay process, we draw a uniform random number x in [0, 1) for each undecayed nucleus and compare it to p. If x < p, the particle decays; otherwise, the particle lives on. The code segment for each step is (import random as rnd)
deltaN = 0
for i in range(N):
    x = rnd.random()
    if (x < p):
        deltaN = deltaN + 1
N = N - deltaN
We show in Figure 10.1 representative results for p = 0.02. The overall trend resembles an exponential decay law as expected. There are clear fluctuations in the number of nuclei decaying in each step (Figure 10.1, right), which follows a Poisson distribution. Their cumulative effect, however, yields a relatively smooth curve of the remaining particles (Figure 10.1, left). The solution thus obtained by random sampling is an example of Monte Carlo simulation.
In our simulation, the step size Δt is related to p and λ through p = λΔt. In other words, the probability p implicitly determines the time scale of the decay process. If the decay rate λ is large, Δt will be small, and vice versa.
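The decay loop can be wrapped into a short, runnable simulation; the parameters below (10⁴ nuclei, p = 0.02, 100 steps) and the seed are illustrative choices, not those of the book's program.

```python
import random as rnd

rnd.seed(5)
N0, p, nsteps = 10000, 0.02, 100    # initial nuclei, decay probability per step, steps
N = N0
remaining = [N]
for step in range(nsteps):
    deltaN = 0
    for i in range(N):
        if rnd.random() < p:        # this nucleus decays during the step
            deltaN += 1
    N = N - deltaN
    remaining.append(N)

# the trend should follow N0(1 - p)^n, close to the exponential law N0 exp(-pn)
print(remaining[50], N0*(1 - p)**50)
```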
Like radioactive decay, a random walk is another simple model defined in a straightforward way. A walker takes one unit step in one unit time, but the direction of each step is randomly distributed, uniformly or otherwise. Many practical applications are related to features of random walks, including diffusive processes, Brownian motion (Section 10.3), and transport phenomena.
In one-dimensional random walks, there are only two possible directions, left or right. Let the unit step be l, so each step has a displacement of ±l. If a walker starts from the origin, the position x, and its square x2, after n steps are
where si is distributed according to
subject to p + q = 1.
Figure 10.2 shows the results of ten separate walkers, each moving ten steps with p = q = 1/2 and l = 1. The paths for different walkers show considerable variation (left panel). Because they move left or right with equal probabilities, the average position should be zero statistically. The solid line represents the average position of the walkers at each step number n. It fluctuates around zero with no clear trend. The average position squared (right panel) shows an upward trend, though there is fluctuation.
Let N be the number of walkers, and 〈x(n)〉 and 〈x2(n)〉 be the average values of x(n) and x2(n) of the walkers after n steps, respectively. In the limit N ≫ 1, the position follows a binomial distribution (Figure 10.3), and the distribution near the center resembles a Gaussian for large steps n (Exercise E10.3). The average values approach (see Exercise E10.4)
As a measure of spread (dispersion), we can calculate the uncertainty as 〈(Δx)2〉 = 〈x2(n)〉 − 〈x(n)〉2. Given the expressions (10.9), we have
The relationship is linear in step number n. It is an important result from random walks, and is typical of random motion like Brownian motion. For continuous time, t ∝ n, and the linear spread is 〈(Δx)2〉 = Ct, C being some constant. For p = q = 1/2, 〈(Δx)2〉 = 〈x2(n)〉, so Figure 10.2 (right) shows the approximate linear trend. We can interpret the square root of the spread, √〈(Δx)2〉 ∝ √t, as the mean distance the walkers can travel in time t. To travel twice as far, we would have to wait four times as long.
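The linear spread 〈x2(n)〉 = n for p = q = 1/2 and l = 1 can be checked numerically; the ensemble size, step count, and seed below are arbitrary choices.

```python
import random as rnd

rnd.seed(2)
N, nmax, p = 2000, 50, 0.5           # walkers, max steps, probability of a right step
x2sum = [0.0]*(nmax + 1)
for walker in range(N):
    x = 0
    for n in range(1, nmax + 1):
        x += 1 if rnd.random() < p else -1   # unit step left or right
        x2sum[n] += x*x

for n in (10, 25, 50):
    print(n, x2sum[n]/N)             # <x^2(n)> should be close to n
```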
We can generalize 1D random walks to 2D via the displacement vector with components x and y. We need to determine x and y in some random fashion.
Unlike the 1D random walk, we have more freedom in choosing the directions. There are many possibilities, including:
Figure 10.4 shows four walkers starting from the origin and moving with unit step size and uniform angular distribution (isotropic). The individual walks vary greatly as seen before in 1D random walks. We have to characterize an ensemble of walkers statistically. For instance, the average position and position squared are defined accordingly as ensemble averages,
These averages will differ according to the chosen model. But, the general linear relationship (10.10) still holds, 〈(Δr)2〉 ∝ n. We shall see an actual example with continuous time next in Brownian motion.
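A minimal sketch of the isotropic (uniform-angle) model, checking that 〈r2〉 ≈ n for unit steps; the ensemble size and seed are arbitrary.

```python
import random as rnd
import math

rnd.seed(3)
N, nmax = 2000, 40                       # walkers, steps per walker
r2sum = 0.0
for walker in range(N):
    x = y = 0.0
    for n in range(nmax):
        theta = 2*math.pi*rnd.random()   # isotropic direction, unit step
        x += math.cos(theta)
        y += math.sin(theta)
    r2sum += x*x + y*y

print(r2sum/N)                           # should be close to nmax
```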
Brownian motion refers to the random motion of macroscopic particles, such as fine dust grains in liquids observed under a microscope, similar to Figure 10.4. It behaves in many respects like random walks, but is a real phenomenon. Brownian motion is of great importance in physics and biophysical systems, and is often the foundation for understanding more complex phenomena in such systems. We model Brownian motion below.
Consider a particle suspended in a liquid and moving with velocity v. Intuitively we would expect the particle to slow down due to drag (Section 3.2), and eventually to stop moving. But such particles observed in Brownian motion never stop moving. Rather, the motion carries on with random fluctuations.
This means there has to be some random force, F(t), besides drag. Since the speed is usually small, we can model the drag force as a viscous linear force (3.9), −b1v. The equation of motion including both forces is
where m is the mass, and b1 the linear drag coefficient. This is known as the Langevin equation. It is just Newton’s second law with random forces.
Both forces in Eq. (10.12) come from interactions with the same liquid, but why do we separate them? The reason has to do with different time scales. Roughly, we can think of the drag as representing some average force due to collective effects on a larger time scale, and the random force due to collisions with individual atoms and molecules on a smaller time scale [76].
Clearly, F(t) must be rapidly fluctuating and very complex. Its precise form cannot be known, given the large number of atoms moving about. As a simplification, we model F(t) as a series of stochastic “kicks”
where Fi is the strength of the kick at ti. Each kick is represented by a Dirac delta function δ(t − ti), as in the kicked rotor (Figure S:5.1). The effect is to deliver an impulse to the particle at each kick. In general, both the strength Fi and the time ti of the kicks are random.
Between kicks, there is only the drag force in Eq. (10.12), and the solutions can be generalized from the 1D results (3.12) and (3.13),
where b = b1/m, and r0 and v0 are the initial position and velocity. Note that 1/b has the dimension of time, so we can regard it as the larger time scale associated with drag stated earlier.
Let vi be the velocity right before the kick at ti+1. The velocity immediately after the kick at ti+1 is increased to
With the δ-kick model, Eq. (10.12) becomes a discrete linear map between kicks.
The model outlined above admits arbitrary distributions for Fi and ti; we now need to specify them to go forward. For simplicity, we assume Fi has a constant magnitude and a random direction (this is not too crucial, see Project P10.2). The times ti should be random, but we also expect that after a kick, the probability of not being kicked again should decrease with time. We therefore model the interval between kicks as an exponential distribution,
This is a nonuniform distribution discussed in the next section (see also Section 10.B). In Eq. (10.16), τ is the average interval between kicks, τ = ∫₀^∞ t P(t) dt. It represents the smaller time scale of the rapidly fluctuating F(t), and should be small relative to 1/b, i.e., τ ≪ 1/b.
Here is our strategy for simulating Brownian motion. For each particle, sample the time interval to the next kick according to Eq. (10.16), and randomly orient Fi. Propagate the position and velocity via the kick-free motion (10.14) up to the kick, then increment the velocity right after the kick from Eq. (10.15). Repeat the above steps.
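The steps above can be sketched as a simplified 2D stand-in for the book's program; the parameters (1/b = 10, τ = 2, unit kick strength, 500 particles) and the helper name mean_r2 are our own assumptions.

```python
import random as rnd
import math

def mean_r2(tmax, N=500, b=0.1, tau=2.0, kick=1.0, seed=4):
    """Ensemble average <r^2> at time tmax for delta-kicked 2D Brownian particles."""
    rnd.seed(seed)
    r2sum = 0.0
    for part in range(N):
        x = y = vx = vy = 0.0
        t = 0.0
        while t < tmax:
            dt = -tau*math.log(1.0 - rnd.random())  # exponential interval, Eq. (10.16)
            kicked = (t + dt < tmax)
            if not kicked:
                dt = tmax - t                       # stop exactly at tmax, no final kick
            e = math.exp(-b*dt)                     # drag-only motion between kicks
            x, y = x + vx*(1.0 - e)/b, y + vy*(1.0 - e)/b
            vx, vy = vx*e, vy*e
            t += dt
            if kicked:                              # fixed-strength kick, random direction
                phi = 2*math.pi*rnd.random()
                vx, vy = vx + kick*math.cos(phi), vy + kick*math.sin(phi)
        r2sum += x*x + y*y
    return r2sum/N

print(mean_r2(50.0), mean_r2(100.0))   # roughly doubles: linear spread at large t
```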
Figure 10.5 shows snapshots from animation by Program 10.2 using this strategy. The time scales are 1/b = 10 and τ = 2 (arbitrary units). Note the different length scales between the two rows. All particles are “dropped” at the origin with zero initial velocity. In the beginning (t ≲ 10), the size of the drop grows linearly with time, √〈r2〉 = Ct. After that, the drop spreads at a slower square-root rate, √〈r2〉 ∝ √t, the same as random walks (10.10).
The different power scalings are more clearly seen in Figure 10.6, showing the square of the distance (10.11), averaged over 1000 independent particles (an ensemble). We see two regimes, t ≲ t1 = 10 and t ≳ t2 = 30, where the curve is roughly a straight line with different slopes (note the log-log scale). The slope is about 2 as expected below t1, and almost 1 above t2. Between t1 and t2, there is a transition region connecting the two regimes. This different behavior at small and large time scales is predicted by a statistical model (see Eqs. (10.39) and (10.40), Section 10.A).
Physically, the small stochastic forces have little effect over a short time period, and we expect the motion to be linear. We can see this from Eq. (10.14) as well, since r − r0 ≈ v0t for bt ≪ 1. Over longer periods of time, the stochastic forces become more important since they supply energy to the particle, so the motion resembles random walks with the same scaling (10.10).
We have determined the power laws for the distance of travel in Brownian motion. What about the constant C or the average speed? We suspect that they should depend on the strength and frequency of the kicks. Indeed they do, and are related to temperature discussed in Chapter 11. Further investigation of Brownian motion is left to Project P10.2.
We have just described simulations of select simple problems with random numbers. It turns out that problem solving by random sampling is a powerful method and, for some complex problems, is the best method available. It falls into the category of so-called Monte Carlo methods. Below we illustrate this method with an electrostatic problem.
Let us consider the electrostatic potential energy, U, between two charged spheres of radii r1 and r2, respectively, separated by a distance d between their centers on the x axis (d > r1 + r2, nonoverlapping). We can find U by
where k is the electrostatic constant in Eq. (7.1), ρ1 and ρ2 the charge densities in spheres 1 and 2, respectively. Equation (10.17) is a six-dimensional integral over the volumes of the spheres.
For arbitrary charge densities, Eq. (10.17) must be evaluated numerically (Project P10.3). We discussed grid-based numerical integration in Section 8.A. This approach becomes inefficient for higher-dimensional multiple integrals like Eq. (10.17) because the number of grid points grows exponentially with the number of dimensions. It can be expensive if this type of integral must be calculated repeatedly during the course of a simulation.
Monte Carlo integration by random sampling can be useful in such cases. Its accuracy is, in principle, independent of the dimensionality. We briefly discuss this method below.
Let us consider the weighted average of a function f(x) defined as
where P(x) is the weighting function. We assume it to be a normalized probability distribution, ∫P(x) dx = 1.
We divide the space into small intervals as before (Figure 8.18) and convert Eq. (10.18) into a sum
If we randomly sample the points, P(xj)Δx tells us the probability with which the points should fall in the interval [xj, xj+1]. Let nj be the number of points in that interval, and we have
Substituting Eq. (10.20) into (10.19), we obtain
We have turned the group sum over j into an individual sum over i in Eq. (10.21). Once we have chosen a weighting function P(x), we can compute the weighted average 〈f〉, and the integral (10.18). Let us choose a weighting function as
so that x is uniformly distributed in [a, b]. With this P(x), Eq. (10.18) becomes
It follows that the integral is
The notation 〈f〉mc reminds us that the weighted average is to be calculated subject to Monte Carlo sampling, a uniform distribution in this case. We can generalize Eq. (10.25) to n-dimensional multiple integrals of the type
where V is the hyper volume over which the integration is to be done. Monte Carlo integration in this manner boils down to sampling the points uniformly within the volume of integration.
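A one-dimensional sketch of Eq. (10.25) with uniform sampling; the test integral ∫₀^π sin x dx = 2, the sample size, and the seed are our own illustrative choices.

```python
import random as rnd
import math

rnd.seed(6)
a, b, N = 0.0, math.pi, 100000
fsum = 0.0
for i in range(N):
    x = a + (b - a)*rnd.random()     # uniform sample in [a, b]
    fsum += math.sin(x)

I = (b - a)*fsum/N                   # I ~ (b - a)<f>_mc, as in Eq. (10.25)
print(I)                             # exact value is 2
```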
We apply Monte Carlo integration to the evaluation of the electrostatic potential energy U in Eq. (10.17). If we assume uniform charge densities, the result is known, U = kQ1Q2/d, where Q1 and Q2 are the charges on the spheres. This is the energy required to bring the two charges in from infinity. The Monte Carlo results are listed in Table 10.2, obtained from Program 10.3.
The main effort is to generate two points in a trial, one inside each sphere. This is done by sample().
def sample(r):    # return a random point inside a sphere
    while True:
        x, y, z = rnd.random(), rnd.random(), rnd.random()
        x, y, z = r*(x+x-1), r*(y+y-1), r*(z+z-1)    # map to [-r, r]
        if (x*x + y*y + z*z <= r*r): break
    return x, y, z
The function samples points inside a sphere of radius r uniformly by the rejection method. A point at coordinates (x, y, z) is randomly selected within a cube of sides 2r containing the sphere. If the point is inside the sphere, it is selected and returned. Otherwise, it is rejected and a new point is generated until one is found inside. The main program simply evaluates the integrand at these points.
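As a quick sanity check of sample() (reproduced here so the snippet is self-contained), uniformly sampled points in a sphere of radius R should satisfy 〈x〉 = 0 by symmetry and 〈r2〉 = 3R²/5; the radius, sample size, and seed below are arbitrary.

```python
import random as rnd

def sample(r):                      # random point inside a sphere, rejection method
    while True:
        x, y, z = rnd.random(), rnd.random(), rnd.random()
        x, y, z = r*(2*x - 1), r*(2*y - 1), r*(2*z - 1)   # map cube to [-r, r]
        if x*x + y*y + z*z <= r*r:
            return x, y, z

rnd.seed(9)
R, N = 2.0, 50000
pts = [sample(R) for i in range(N)]
xbar = sum(p[0] for p in pts)/N                           # should vanish by symmetry
r2bar = sum(p[0]**2 + p[1]**2 + p[2]**2 for p in pts)/N
print(xbar, r2bar)                  # theory: <x> = 0, <r^2> = 3R^2/5 = 2.4
```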
The Monte Carlo results are in satisfactory agreement with the exact value kQ1Q2/d for d = 4, even for N in the low hundreds. As N increases, the result stabilizes at the exact value. We expect the relative error to scale as 1/√N, so at the largest N = 10⁴ in Table 10.2, the error should be about 1%. We can easily control the relative error in Monte Carlo integration. Furthermore, we can improve the final result by cumulatively averaging different runs with proper weighting because they are independent.2 Doing this with the seven runs in Table 10.2, we obtain 0.2501, the closest to the exact value yet. The overall convergence of the Monte Carlo approach is good with increasing N in this example. Its performance and simplicity cannot be matched by nested, grid-based integrals.
The integrand in the above example is monotonic and smooth. Like any numerical integration scheme, Monte Carlo integration suffers from ill-behaving integrands. The problem is particularly acute if the integrand is highly oscillatory or strongly peaked. Excessive oscillations cause large fluctuations in random sampling. Sharp peaks mean that most contributions to the integral come from small localized regions, making uniform sampling inefficient, and often inadequate.
We can address the problem by the technique of importance sampling. Suppose we want to integrate f(x) which is not smooth. We attempt to separate it into two parts as
The new function g(x) is assumed to be smoother than the original integrand f(x). Like in Eq. (10.18), P(x) is a probability distribution function, and Eq. (10.27) is the weighted average of g(x) under P(x). In other words, we have
We stress that 〈g〉mc in Eq. (10.28) is calculated with xi distributed according to probability P(x). This is the basic idea of importance sampling. For a given integrand f(x), we should choose P(x) such that the modified integrand g(x) becomes as smooth as possible. The distribution P(x) guides us to sample the important regions. The integration volume does not appear explicitly in Eq. (10.28). It is implicitly taken into account by the sampling process. Of course, if we choose a uniform probability distribution as in Eq. (10.22), then Eq. (10.28) is reduced to (10.25).
As an example, let us evaluate the following integral
The integrand is strongly peaked at the origin, where we expect most contributions to come from. At the same time, it also oscillates. But relatively speaking, the more pressing problem is to deal with the sharp exponential factor, not the cosine oscillation. Therefore, we choose an exponential probability function as P(x) = exp(−x), which is already normalized.
The modified integrand is g(x) = f(x)/P(x) = cos(x). The integral (10.29) becomes
The value 〈cos(x)〉mc is to be calculated with the exponential probability distribution. The following code illustrates the procedure.
import numpy as np
import random as rnd

f, N = 0., 1000
for i in range(N):
    x = -np.log(rnd.random())    # exponential distribution
    f += np.cos(x)
print('MC =', f/N)
One trial run gave a result 0.517, about 3% error, consistent with the number of sampling points N. Without importance sampling, the error would be unacceptably large even for a reasonably large N, making the result meaningless (see Exercise E10.7).
The key to the above code is the statement x = −ln(random()), which transforms uniform random numbers in [0, 1) into a sequence satisfying the exponential distribution in (0, ∞). It ensures that the small-x region, where most of the contribution comes from, is sampled more frequently. We discuss this and general nonuniform sampling in Section 10.B.
Chapter summary
We discussed a few simple random problems, including radioactive particle decay by random sampling, and random walks in 1D and 2D. We constructed a stochastic model for Brownian motion, and compared simulation results with a statistical treatment. We introduced Monte Carlo integration and importance sampling methods for obtaining nonuniform distributions.
The basic usage of the intrinsic Python random library was illustrated through the selected problems.
E10.1 | (a) Test the uniformness of random number generation. Generate N = 104 samples and plot the distribution as a histogram of 100 bins (see Program S:9.1).
(b) Write a program to calculate the moments and correlation listed in Table 10.1. |
E10.2 | (a) Suppose someone prefers white socks to red socks. On each shopping trip, he opens packages, sealed black boxes each containing a pair of white or red socks, until he finds a pair of white socks. Then he stops, and must buy all the opened packages. After many such trips, will he have an oversupply of white socks relative to red ones? Prove it one way or the other by random sampling.
(b) The SOS game (stick or switch, similar to the Monty Hall game show Let’s Make a Deal) is a game of chance and works as follows. The host hides a gold coin in one of four boxes. You pick a box first, and the host opens an empty box. You decide whether to stick with the box you picked, or switch to one of the two remaining boxes. Which way gives you a better chance of winning? Simulate the SOS game by random sampling. Tabulate the results for N trials. Choose a large enough N so that you can draw definitive conclusions. Assume on average half of the players switch and the other half stick. How much should the host charge the players per game in order to break even? |
E10.3 | (a) Generate random walk distributions analogous to Figure 10.3 for various p values, e.g., 0.3, 0.6, 0.9, etc. Use enough walkers so the fluctuations are small. Show that as the maximum step number nmax increases, the distribution becomes sharper and more Gaussian-like.
(b) Two random walkers start at the same position. What is the probability that they will meet again after walking 20 steps each? Assume equal probability going left or right. Simulate the process numerically. Use enough trials such that the event occurs about 100 times. How does the numerical result compare? |
E10.4 | Prove the relations (10.9). For 〈x2(n)〉, group the terms as
Compute the averages of the two terms on the RHS. The first term is independent of direction, and the double sum in the second term can be evaluated independently. |
E10.5 | Calculate the spread in 2D random walks
for the following models (see Section 10.2): (a) random walks on a lattice; and (b) random directions. For part (b), introduce z = Σ zj, zj = xj + iyj = exp(iθj), and calculate 〈r2〉 = 〈|z|2〉. Separate the equal and unequal indices in the sums as in Exercise E10.4. |
E10.6 | (a) Consider the n-fold integral
show that (b) Calculate I(m, n) using Monte Carlo integration. First choose a fixed m = 3 and n = 10, and vary the number of sampling points from N = 10 to 106 by factors of 10. Compare the Monte Carlo and analytic results. Plot the results on a semilog N scale, and discuss convergence. Next, fix N, e.g., N = 105, and calculate the integral for n = 10 and m = 0 to 5. What do you expect for n = 20? Try it. |
E10.7 | Calculate the integral (10.29) by Monte Carlo integration by uniform sampling x in [0, 8]. Compare the accuracy with the exponential distribution by importance sampling for the same N (Section 10.4). |
E10.8 | (a) Show that the transform for the sine angular distribution is given by θ = cos−1(1 − 2x),
and for the Lorentzian distribution by y = tan(π(x − 1/2)). (b) Generate these distributions, plot them as histograms (see Program S:9.1). Use enough samplings so the fluctuations are small. Compare with the exact results. (c) A Gaussian distribution can be generated using a pair of random variables as y1 = √(−2 ln x1) cos(2πx2), y2 = √(−2 ln x1) sin(2πx2). This is known as the Box-Muller method. Because x1 and x2 are independent, the pair y1 and y2 are also independent, and both values can be used. Produce a histogram as above, and verify that it yields a Gaussian distribution. Find the standard deviation. |
P10.1 | Simulate 1D random walks. Construct a program that calculates the individual path of a single walker, as well as the averages 〈x(n)〉 and 〈x2(n)〉 for N walkers as a function of step number n.
Write a standalone subroutine walknsteps() as the workhorse. It should be similar to the function of the same name in Program 10.1. The function should assume arbitrary step size and probability p. At each step, draw a random number. If it is less than p, the walk is to the right; otherwise it is to the left. The main program should call walknsteps() N times for a given maximum step number nmax. Compute the averages 〈x(n)〉 and 〈x2(n)〉 after N walks. (a) Assume unit step size l = 1, and choose nmax ∼ 50 and N ∼ 100. Vary p values, e.g., 0.2, 0.5, 0.9, etc. Predict and sketch 〈x(n)〉 and 〈x2(n)〉 in each case. Plot and discuss the actual results. (b)* Modify walknsteps() to allow the step size l to vary between some minimum and maximum limits, l ∈ [a, b]. You can draw a random l in this range using uniform(a, b), which is equivalent to a + (b − a) × r, where r is a random number in [0, 1). Set reasonable limits, e.g., [0, 2]. Plot the results, and discuss differences and similarities with the above results. |
P10.2 | Let us investigate Brownian motion in this project. First explore Program 10.2 with animation on, change some parameters such as F, b, etc., but keep τ < 1/b.
(a) Modify the program to calculate 〈r2〉. You can disable animation. Add a function to the Brownian-motion class that returns 〈r2〉 as a function of time. The average 〈r2〉 at an instant can be obtained from r2 = np.sum(self.r*self.r)/self.N. The number of particles N should be 1000 or more to obtain good statistics. Append this to a list, and plot it as shown in Figure 10.6. (b) Obtain 〈v2〉 in a similar way, and plot it as a function of time. Discuss the results. What is the temperature kT from the equipartition theorem (11.41) (set m = 1)? (c) Reduce τ by a factor of two, and repeat the calculations for 〈r2〉 and 〈v2〉. Do the same with the kick strength. Discuss the results. Is the temperature higher or lower in each case? Why? (d) Change the exponential distribution for ti to a uniform distribution between 0 and 2τ, and repeat the calculations. Also, instead of a constant F, assume it follows a Gaussian distribution (see Eq. (S:10.4)). Sample it using rnd.gauss() in Python (see also Exercise E10.8), with a mean and standard deviation equal to the default F value. Discuss how these results differ from earlier ones. (e)* Generalize the program to the 3D case. The position and velocity should now be N × 3 arrays. The force should be distributed uniformly in all directions. One way is, for a given magnitude F, to sample the angle φ uniformly in [0, 2π], and then sample θ as a sine distribution in Exercise E10.8. Another method is to randomize by an Euler rotation (see Eq. (S:12.125)). Compute 〈r2〉 and 〈v2〉 and compare with 2D results. Optionally visualize 3D Brownian motion using VPython. |
P10.3 | Solve the following problems by Monte Carlo integration. In each case, obtain the results to three digits of accuracy. Assume arbitrary units.
(a) Calculate the center of mass of a hemisphere of radius r with uniform mass density. Compare with the analytic result. (b) Two spheres of radii r1 and r2 are separated by a distance d between the centers. Each carries a unit charge. The charge densities obey a linear relationship of the form ρi = Ci(1 − r/ri), with i = 1 and 2, and Ci are constants. Assume r1 = 1 and r2 = 2. Determine the constants C1 and C2 analytically or numerically via Monte Carlo integration. Find the electrostatic potential energy between the spheres for d = 4. Discuss and compare your results with two uniformly charged spheres (Table 10.2). (c) Consider the same setup as above, but the charge densities have dipole angular distributions (Figure 7.24) ρi = Ci sin θi, where θi is the polar angle relative to central axis connecting the spheres. Predict the energy of this configuration relative to part (b) above, as well as to the uniformly charged spheres. Calculate the actual result, and compare the three systems. |
P10.4 | The hit-or-miss method can be used to find an area or an integral.
(a) Find the approximate value of π by this method. Imagine a unit circle enclosed in a square board of side length 2, and we throw N darts randomly at the board. Afterwards we inspect the board and count the number of hits inside the circle, which is proportional to the area. We can calculate π as π ≈ 4 × (hits)/N. Simulate the dart game. For each throw, generate two random numbers x and y in [−1, 1], and record the number of hits within the unit circle. Vary N, and calculate π. How many throws are needed to obtain π to two digits of accuracy consistently? How many throws to match the accuracy of 22/7? How about 355/113?3 (b) Design a Monte Carlo program to calculate the integral ∫f(x)dx by the hit-or-miss method (see Section 10.B). Test your code with ∫sin x dx, and compare the accuracy with the weighted average method. |
We briefly describe a statistical method for Brownian motion where the unknown force in the Langevin equation (10.12) is absorbed in the thermal properties of a system. We restrict ourselves to 1D motion.
Let us multiply both sides of Eq. (10.12) by x and divide by m, obtaining
where b = b1/m as before. For convenience, we introduce s = xv, so that
Substituting s and xdv/dt above into the RHS and LHS of Eq. (10.31), respectively, we have
To eliminate F(t), let us take statistical (ensemble) averages on both sides of Eq. (10.33) to obtain
with 〈s〉 = 〈xv〉 denoting the ensemble average of s.
Because x and F are not correlated, the last term in Eq. (10.34) is zero since 〈xF〉 = 〈x〉〈F〉 = 0. In thermal equilibrium, the first term can be obtained from the so-called equipartition theorem (11.41), which in 1D is
where k is the Boltzmann constant and T the temperature.
Using Eq. (10.35), we can simplify Eq. (10.34) to
It yields the following solution, assuming 〈s〉 = 0 initially at t = 0,
Recalling s = xv = (1/2)d(x2)/dt from Eq. (10.32), we can obtain 〈x2〉 as
subject to the initial condition 〈x2〉 = 0 at t = 0. This is the central result.
It seems that we have managed to eliminate the rapidly fluctuating force F(t) in Eq. (10.38). But, its effect is hidden in temperature T. The strength (impulse) and frequency of F(t) clearly depend on the temperature.
Still, the solution lets us obtain the limiting behavior of 〈x2〉 at different time scales. For small t ≪ 1/b, we can expand exp(−bt) ≈ 1 − bt + b2t2/2, and we have
For large t ≫ 1/b, we can drop the second term in the bracket of Eq. (10.38) altogether, so that
This linear scaling is the same as in random walks (10.10) and other diffusive processes.
The need for nonuniform probability distributions arises frequently in practical situations, as demonstrated by several examples discussed so far. A number drawn from a nonuniform distribution is still random, but after many drawings, the accumulated distribution follows the desired nonuniform shape. The general approach is to generate a nonuniform distribution from a uniform one.
Let us denote the nonuniform variable by y, and the uniform variable by x. We wish to generate a certain distribution py(y) via a transformation of x. We will discuss two methods below.
In certain cases, it is possible to obtain a nonuniform distribution analytically via the inverse transform method. The basic premise is that there exists a transformation, say y = G(x), which transforms the uniform distribution px(x) to the nonuniform distribution py(y).
Since probability is conserved, we have the following relationship
The LHS of Eq. (10.41) gives the probability in the y-domain from y to y + dy. Because it is generated from x to x + dx one-to-one by G(x), it must be equal to the probability in the x-domain – the RHS of Eq. (10.41).
Because px(x) = 1 from Eq. (10.1), we can integrate Eq. (10.41) on both sides to obtain
In obtaining Eq. (10.42) we have dropped the || signs for convenience.
To the extent that the integral in Eq. (10.42) is analytically solvable for F(y), and subsequently, the inverse of F(y) exists, then we have found the transform
where G = F−1 is the inverse function of F.
As an example, let us obtain the exponential (Poisson) distribution
Though this is also a standard distribution, we choose it for its simplicity. We find the integral from Eq. (10.42) as
from which we can obtain the inverse transform
This shows that the logarithm of a uniform distribution produces an exponential distribution. The two functions are just the inverse of each other in this case.
We have used this distribution in the exponential integral (10.29) in Section 10.4 with λ = 1, which controls the mean. Basically, the −ln x function compresses x ∼ 1 toward y ∼ 0 and stretches x ∼ 0 toward y ∼ ∞, so the resulting distribution is an exponential, like radioactive nuclear decay (Figure 10.1). More examples are given in Exercise E11.8.
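The inverse transform above can be sketched in a few lines of code; the function name and the check on the sample mean are my own, not from the text. Sampling y = −ln(x)/λ from a uniform x produces an exponential distribution with mean 1/λ:

```python
import math, random as rnd

rnd.seed(1)                         # reproducible sequence

def sample_exponential(lam=1.0):
    # inverse transform: y = -ln(x)/lam maps uniform x to an exponential
    # use 1 - random() so the argument of log is in (0, 1], avoiding log(0)
    return -math.log(1.0 - rnd.random())/lam

samples = [sample_exponential(1.0) for _ in range(100000)]
mean = sum(samples)/len(samples)    # should approach 1/lam = 1
```

With λ = 1 this is exactly the distribution used for the exponential integral (10.29).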
Often, it is not possible to find the inverse function analytically. For example, one of the most useful distributions, the Gaussian distribution, cannot be generated by the inverse transform method (though it is possible using two variables, see Exercise E10.8). Sometimes the distribution itself cannot be expressed as a simple, compact function. Rather, it may be in the form of discrete data like a lookup table. In such cases, we can use the rejection method which always works.
The idea is similar to the hit-or-miss method (Project P10.4). Let H be greater than or equal to the maximum value of py(y) in the range y ∈ [a, b]. We put a box stretching from the lower-left corner [a, 0] to the upper-right corner [b, H]. The sampling procedure is as follows (as usual, x is uniform in [0, 1)):
1. Draw a candidate y = a + (b − a)x, uniform in [a, b].
2. Draw a second random number x′ and form p = Hx′, uniform in [0, H].
3. If p ≤ py(y), accept y; otherwise reject it and return to step 1.
Each successful sampling requires at least two random numbers x. After many samplings, the number of y values accepted is proportional to py(y), the desired distribution. The rejection method also requires the values of py(y), in either functional or numerical form, and its upper limit H. The closer H is to the actual maximum of py(y), the more efficient the rejection method. The efficiency of the rejection method can be improved if a better comparison function closer to the actual py(y), rather than the box, is suitably constructed.
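The procedure above can be sketched as follows; the target distribution py(y) = (3/2)y² on [−1, 1] is a hypothetical example of my own, chosen only to make the acceptance test concrete:

```python
import random as rnd

a, b, H = -1.0, 1.0, 1.5            # box [a, b] x [0, H]; H >= max of py

def py(y):
    return 1.5*y*y                  # normalized target on [-1, 1]

def sample_rejection():
    while True:                     # repeat until a candidate is accepted
        y = a + (b - a)*rnd.random()    # candidate, uniform in [a, b]
        if H*rnd.random() <= py(y):     # accept with probability py(y)/H
            return y

rnd.seed(2)
ys = [sample_rejection() for _ in range(50000)]
mean_y2 = sum(y*y for y in ys)/len(ys)  # exact <y^2> = 3/5 for this py
```

Note that py() could just as well be a lookup table with interpolation; the acceptance test is the same.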
Program listing 10.1: Random walk in 2D (walk2d.py)
import numpy as np, random as rnd
import matplotlib.pyplot as plt

def walknsteps(n):                          # walk n steps
    x, y = np.zeros(n+1), np.zeros(n+1)
    for i in range(n):
        phi = rnd.random()*2*np.pi          # random in [0, 2π]
        x[i+1] = x[i] + np.cos(phi)
        y[i+1] = y[i] + np.sin(phi)
    return x, y

nstep, N = 20, 4                            # steps, num of walkers
col = ['k', 'r', 'g', 'b', 'c', 'm']        # color codes
plt.subplot(111, aspect='equal')
for i in range(N):
    x, y = walknsteps(nstep)
    plt.quiver(x[:-1], y[:-1], x[1:]-x[:-1], y[1:]-y[:-1], scale=1,
               scale_units='xy', angles='xy', color=col[i%len(col)])
plt.plot([0], [0], 'y*', markersize=16)
plt.xlabel('x'), plt.ylabel('y')
plt.show()
The function walknsteps() simulates one walker taking n steps of unit length, in random directions φ ∈ [0, 2π), starting from the origin. The main program plots the paths of N walkers with quiver(), which draws each path as a chain of arrows having the same components as the steps.
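To connect this program to the diffusive scaling (10.10), we can average the squared end-to-end distance over many walkers; for unit steps in random directions, 〈R²〉 grows linearly with the number of steps. The ensemble size below is my own choice for this sketch:

```python
import numpy as np, random as rnd

def walknsteps(n):              # same walker as in Program 10.1
    x, y = np.zeros(n+1), np.zeros(n+1)
    for i in range(n):
        phi = rnd.random()*2*np.pi
        x[i+1] = x[i] + np.cos(phi)
        y[i+1] = y[i] + np.sin(phi)
    return x, y

rnd.seed(0)
nstep, walkers = 50, 4000
R2 = []
for _ in range(walkers):
    x, y = walknsteps(nstep)
    R2.append(x[-1]**2 + y[-1]**2)
meanR2 = sum(R2)/walkers        # expect roughly nstep for unit steps
```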
Program listing 10.2: Brownian motion (brownian.py)
 1  import numpy as np, random as rnd
 2  import matplotlib.pyplot as plt
 3  import matplotlib.animation as am
 4
 5  class BrownianMotion:                     # Brownian motion class
 6      def __init__(self, N=400, F=0.005, b=0.1, tau=2., h=0.5):
 7          self.N, self.F, self.b, self.tau, self.h = N, F, b, tau, h
 8          self.r, self.v = np.zeros((N, 2)), np.zeros((N, 2))
 9          self.t = -np.log(np.random.rand(N))*tau    # initial kick times
10
11      def move(self, r, v, dt):             # move between kicks
12          edt = np.exp(-self.b*dt)
13          return r + v*(1-edt)/self.b, v*edt
14
15      def iterate(self):                    # advance one step
16          r, v, t, h, F = self.r, self.v, self.t, self.h, self.F   # alias
17          for i in range(self.N):
18              if t[i] > h:                  # no kick within current step
19                  dt = h                    # dt = time to end of step
20              else:                         # suffers kicks before end of step
21                  tot, dt = 0., t[i]        # tot = time to last kick
22                  while t[i] <= h:
23                      r[i], v[i] = self.move(r[i], v[i], dt)
24                      phi = rnd.random()*2*np.pi         # apply kick
25                      v[i] += [F*np.cos(phi), F*np.sin(phi)]
26                      tot += dt
27                      dt = -np.log(rnd.random())*self.tau   # sample dt
28                      t[i] += dt            # next kick
29                  dt = h - tot              # dt = to end of current step
30              r[i], v[i] = self.move(r[i], v[i], dt)     # no kick, just move
31              t[i] -= h
32
33  def updatefig(*args):                     # update figure
34      bm.iterate()
35      plot.set_data(bm.r[:,0], bm.r[:,1])   # update data
36      return [plot]                         # return plot object
37
38  bm = BrownianMotion()                     # create Brownian model
39  fig = plt.figure()
40  plt.subplot(111, aspect='equal')
41  plot = plt.plot(bm.r[:,0], bm.r[:,1], 'o')[0]          # create plot object
42  ani = am.FuncAnimation(fig, updatefig, interval=10, blit=True)   # animate
43  plt.xlim(-1., 1.), plt.ylim(-1., 1.), plt.show()
Program 10.2 simulates Brownian motion according to the stochastic model in Section 10.3. We use object-oriented programming in this case since the model is self-contained and there is little interaction with the main program. We define a Brownian-motion class consisting of an ensemble of N independent particles. The __init__() function is called when an object is created. In this case, it initializes the default parameters, sets the initial position and velocity vectors to zero, and samples the times of the initial kicks using the NumPy function np.random.rand(N) (line 9), which generates an array of N random numbers at once. The function move() calculates the position and velocity under the drag force only between kicks from Eq. (10.14).
The next module, iterate(), advances the solution forward by one step h. For each particle in the ensemble, we check whether it receives any kick within this step. That information is stored in t[i] (ti), which indicates the time of its next kick from the start of the current step. If ti > h, it means there is no kick within the current step. So Δt is set to h, the time to the end of the current step, and the particle will be moved according to Eq. (10.14) after the if-else statement.
On the other hand, if ti ≤ h, there will be at least one kick. So we move the particle with move() to obtain the position and velocity just before ti. Then we sample a randomly oriented force and increase the velocity by the impulse received (10.15). The variable tot keeps track of the time elapsed from the beginning of the step to the last kick. Now we sample from an exponential distribution (10.46) the interval to the next kick (line 27), and add it to ti. If this ti is still within the current step, we repeat the above treatment, until the next sampled kick falls after the current step. We then set Δt to the remaining time in the current step, which is the difference between the step size h and the time of the last kick, tot.
After the if-else block, the particle is moved to the end of the current step (start of the next step), and ti is adjusted accordingly.
The updatefig() function calls the iterate() method of a given object to update the x and y positions of the particles for animation. The plot object is returned as a list (line 36) required by the animation function.
The main program makes an instance of Brownian-motion objects bm with default parameters. It also creates a plot object for Matplotlib animation using updatefig() at regular intervals (10 ms) to update the figure as in Program 8.1. The blit option makes drawing faster.
Because updatefig() neither receives nor returns any data variables, the object-oriented approach is more convenient here: all data and computation (methods) are contained within the Brownian-motion object.
Program listing 10.3: Electrostatic energy (mcint.py)
import numpy as np, random as rnd

def sample(r):                  # return a random point inside a sphere
    while True:
        x, y, z = rnd.random(), rnd.random(), rnd.random()
        x, y, z = r*(x+x-1), r*(y+y-1), r*(z+z-1)   # map to [-r, r]
        if (x*x + y*y + z*z <= r*r): break
    return x, y, z

r1, r2, d = 1.0, 2.0, 4.0       # radii and separation, d > r1 + r2
f, V, N = 0., (4*np.pi/3.)**2*(r1*r2)**3, 100

for i in range(N):
    x1, y1, z1 = sample(r1)
    x2, y2, z2 = sample(r2)
    f += 1./np.sqrt((x1-x2-d)**2 + (y1-y2)**2 + (z1-z2)**2)

print('MC =', V*f/N, 'exact =', V/d)
This program calculates the electrostatic potential energy between two spheres of uniform charge densities (ρ1 = ρ2 = 1, and electrostatic constant k = 1 in Eq. (10.17)). The function sample() returns a uniformly distributed point inside a sphere by the rejection method as explained in the text. The main program draws N points inside the spheres and computes the weighted average according to Eq. (10.17). The final result is multiplied by the product of the two spheres' volumes, V = (4π/3)²(r1r2)³.
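The statistical error of such Monte Carlo averages shrinks like 1/√N. A minimal sketch with a simpler one-dimensional integrand (my own example, not from the program above) illustrates the trend:

```python
import numpy as np

rng = np.random.default_rng(3)
exact = 2*(np.sqrt(2.0) - 1.0)      # integral of 1/sqrt(x+1) over [0, 1)
errs = []
for N in (100, 10000):
    x = rng.random(N)               # N uniform samples in [0, 1)
    errs.append(abs(np.mean(1.0/np.sqrt(x + 1.0)) - exact))
# with 100x more samples the error should drop by roughly a factor of 10
```

The same scaling holds for the six-dimensional integral in Program 10.3, which is why N = 100 gives only a rough estimate there.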
1NumPy also has a random library that is especially useful for generating arrays of random distributions. For example, see the use of np.random.rand(N) in Program 10.2.
2This is true to the extent the pseudo-random numbers are uncorrelated, of course.
3These were the ratios used to approximate π by the fifth century Chinese mathematician Zu Chongzhi.