Download Mathematical Modeling A Comprehensive Introduction Pdf

12/21/2020



$= 2e / (3 \tau \tau_f) = 2 r_1 r_2 e / 3$. Correspondingly, the velocity correlation is found to be given by
\[
\langle \tilde{v}(t)\, \tilde{v}(t') \rangle = \frac{2e}{3}\, \frac{r_1 e^{r_2 |t - t'|} - r_2 e^{r_1 |t - t'|}}{r_1 - r_2}. \tag{8.128}
\]
8.5 Non-Markovian Stochastic Differential Equations
Fig. 8.2. The first row shows the normalized velocity correlation function $C_v$ implied by the non-Markovian (solid line) and Markovian (dashed line) velocity models. The velocity time scale is $\tau = 1$, and the ratio $\tau_f / \tau$ is given in the plots. The second row shows the derivative of the normalized velocity correlation function for the non-Markovian (solid line) and Markovian (dashed line) velocity models.
8.5.4 The Relevance of Memory Effects
What is the difference between the Markovian velocity model (8.89) and the non-Markovian velocity model (8.97) regarding the process statistics these models provide, that is, how relevant are memory effects? To address this question, let us compare the acceleration and velocity correlations of both models.
Velocity Correlations. The velocity correlation functions (8.102) and (8.112) are illustrated in Fig. 8.2. The Markovian and non-Markovian models behave similarly: there is only a very minor difference between the curves for the range $\tau_f / \tau \le 0.2$ considered (the condition $\tau_f / \tau \le 0.25$ follows from the definitions of $r_1$ and $r_2$). The area below the curves is equal to $\tau = 1$ for all the cases considered. The effect of the model on the velocity correlation is seen more clearly in the derivative of the normalized velocity correlations.
8 Stochastic Evolution
Fig. 8.3. The normalized acceleration correlation function $C_a$ implied by the non-Markovian velocity model is shown in the upper row. The corresponding function $C_a^*$ is shown in the lower row for the non-Markovian velocity model (solid line) and the Markovian velocity model (dashed line). The velocity time scale is $\tau = 1$, and $\tau_f / \tau$ is given in the plots.
Regarding the velocity correlation implied by the non-Markovian and Markovian velocity models, respectively, we find
\[
\frac{dC_v}{d(r/\tau)} = r_1 r_2 \tau\, \frac{r}{|r|}\, \frac{e^{r_2 |r|} - e^{r_1 |r|}}{r_1 - r_2}, \qquad \frac{dC_v}{d(r/\tau)} = -\frac{r}{|r|}\, e^{-|r|/\tau}. \tag{8.129}
\]
The corresponding features of the derivatives of the normalized velocity correlation functions are also shown in Fig. 8.2. The non-Markovian velocity model implies a smoothly changing velocity correlation, which has a continuous derivative. This behavior is supported by consequences of the Navier-Stokes equations (Sawford 1991, Pope 1994). In contrast, the Markovian velocity model leads to a velocity correlation with a derivative that jumps at r = 0. This unphysical behavior is implied by the neglect of acceleration correlations (the neglect of a nonzero $\tau_f$).
Acceleration Correlations. The normalized acceleration correlation function $C_a$ that is implied by the non-Markovian velocity model is shown in Fig. 8.3 for different $\tau_f / \tau$, where $\tau = 1$. Such curves cannot be obtained from the Markovian velocity model because the normalized acceleration correlation function cannot be defined for this case: see the discussion of Eq. (8.107). We see that the condition $|C_a(t, r)| \le 1$ (see the corresponding discussion regarding $C_v$) for the normalized correlation function of an equilibrium process is satisfied. Because the integral acceleration time scale $T_a$ is equal to zero according to Eq. (8.118), the normalized correlation function $C_a$ takes negative values. These features can be compared with the consequences of the Markovian velocity model by considering the following normalized acceleration correlation:
\[
C_a^*(t, r) = \frac{\langle \tilde{a}(t)\, \tilde{a}(t + r) \rangle}{2e / (3 \tau^2)}. \tag{8.130}
\]
Here, $\tau_f$ in the definition (8.115) of $C_a$ is replaced by $\tau$. The corresponding curves are shown in Fig. 8.3 for both velocity models. For the non-Markovian velocity model, the $C_a^*$ curves behave similarly to $C_a$. It may be seen that $C_a^*$ of the non-Markovian velocity model converges to $C_a^*$ of the Markovian velocity model in the limit $\tau_f \to 0$. The main difference between the velocity models lies in the acceleration variance $\langle \tilde{a}^2(t) \rangle$, which is $2e / (3 \tau \tau_f)$ for the non-Markovian model and infinite for the Markovian model.
Summary. These observations can be summarized as follows. Neglecting memory effects, that is, using the Markovian velocity model instead of the non-Markovian one, corresponds to the assumption that the characteristic correlation time $\tau_f$ of stochastic forces is negligibly small. This approach is equivalent to considering velocities over time steps that are large compared to $\tau_f$, i.e., the real process is described only asymptotically in this case. The Markovian model provides velocity correlations that are very close to those implied by the non-Markovian velocity model. On the other hand, no attempt is made to represent acceleration correlations in a physically correct way: these correlations are only represented such that the integral over the acceleration correlation function is equal to zero, as required for a variable that represents the derivative of an equilibrium process (see the discussion related to Eq. (8.108)).
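The comparison of velocity correlations in this subsection can be checked numerically. The following Python sketch is only an illustration under stated assumptions: it takes $r_1$ and $r_2$ to be the two real roots with $r_1 r_2 = 1/(\tau \tau_f)$ and $r_1 + r_2 = -1/\tau_f$ (the sum is inferred from the statement that the area below $C_v$ equals $\tau$; it is not quoted from the text), and all function names are hypothetical.

```python
import math

def roots(tau, tau_f):
    """Assumed roots: r1 * r2 = 1/(tau*tau_f), r1 + r2 = -1/tau_f."""
    if tau_f / tau >= 0.25:
        raise ValueError("distinct real roots require tau_f / tau < 0.25")
    disc = math.sqrt(1.0 - 4.0 * tau_f / tau)
    return (-1.0 + disc) / (2.0 * tau_f), (-1.0 - disc) / (2.0 * tau_f)

def cv_non_markov(r, tau, tau_f):
    """Normalized velocity correlation of the non-Markovian model."""
    r1, r2 = roots(tau, tau_f)
    a = abs(r)
    return (r1 * math.exp(r2 * a) - r2 * math.exp(r1 * a)) / (r1 - r2)

def cv_markov(r, tau):
    """Normalized velocity correlation of the Markovian model."""
    return math.exp(-abs(r) / tau)

def right_slope(f, h=1e-6):
    """One-sided finite-difference slope of f at r = 0+."""
    return (f(h) - f(0.0)) / h

def area(f, r_max=40.0, n=40000):
    """Trapezoidal approximation of the integral of f over [0, r_max]."""
    h = r_max / n
    total = 0.5 * (f(0.0) + f(r_max))
    for i in range(1, n):
        total += f(i * h)
    return total * h

if __name__ == "__main__":
    tau, tau_f = 1.0, 0.1
    nm = lambda r: cv_non_markov(r, tau, tau_f)
    mk = lambda r: cv_markov(r, tau)
    print("slope at 0+ (non-Markovian):", right_slope(nm))
    print("slope at 0+ (Markovian):    ", right_slope(mk))
    print("areas:", area(nm), area(mk))
```

For $\tau = 1$ and $\tau_f / \tau = 0.1$, the non-Markovian slope at $r = 0$ should be close to zero while the Markovian slope is close to $-1$, which is the jump in the derivative discussed above; both areas should be close to $\tau$.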

8.6 Summary
The goal of this chapter was to develop a methodological basis for modeling the evolution of stochastic processes. This requires answers to the questions raised in the introduction: which type of equations governs PDFs, how these equations can be solved, which type of equations governs stochastic processes, and how stochastic equations can be determined for the modeling of any given case. Let us summarize the features observed.
PDF Evolution Equation. Our starting point was the most general equation for the evolution of a PDF, the Kramers-Moyal equation (8.8),
\[
\frac{\partial f(x, t)}{\partial t} = \sum_{n=1}^{\infty} \left( -\frac{\partial}{\partial x} \right)^{n} D^{(n)}(x, t)\, f(x, t). \tag{8.131}
\]
The Kramers-Moyal equation represents an identity. We did not use any physical principle; we only assumed that the PDF f(x, t) and the Kramers-Moyal coefficients $D^{(n)}(x, t)$ exist. The Kramers-Moyal equation implies Pawula's theorem, which shows that there are two possibilities: we can either work with an equation that involves an infinite number of Kramers-Moyal coefficients $D^{(n)}(x, t)$, or we can work with the Fokker-Planck equation (8.21),
\[
\frac{\partial f(x, t)}{\partial t} = -\frac{\partial\, D^{(1)}(x, t)\, f(x, t)}{\partial x} + \frac{\partial^2 D^{(2)}(x, t)\, f(x, t)}{\partial x^2}, \tag{8.132}
\]
which involves only the first two Kramers-Moyal coefficients. Neglecting $D^{(n)}$ with $n \ge 3$ is justified if the stochastic process considered has a continuous sample path, that is, if jump processes (i.e., processes involving instantaneous unbounded changes) are not considered. We applied this assumption in the following. We also assumed that the coefficients $D^{(n)}(x, t)$ depend only on x and t. This corresponds to the assumption that the stochastic process considered is a Markovian process (a process for which the present state determines the future evolution).
Solutions of the Fokker-Planck Equation. An important question is how we can solve the Fokker-Planck equation (8.132). It was shown that this equation can be solved analytically if we consider the specific Fokker-Planck equation (8.34),
\[
\frac{\partial f(x, t)}{\partial t} = -\frac{\partial}{\partial x} \Big\{ \big[ F(t) + G(t) \left( x - \langle X \rangle \right) \big] f(x, t) \Big\} + \frac{\partial^2 D(t)\, f(x, t)}{\partial x^2}. \tag{8.133}
\]
The significant difference between the general Fokker-Planck equation (8.132) and Eq. (8.133) is that $D^{(1)}$ is a linear function of x in Eq. (8.133), and $D^{(2)}$ is independent of x. It turns out that the solution to Eq. (8.133) is given by a normal PDF integrated over the initial condition,
\[
f(x, t) = \int \frac{1}{\sqrt{2 \pi \beta}} \exp\left\{ -\frac{(x - \alpha)^2}{2 \beta} \right\} f(x_0, t_0)\, dx_0. \tag{8.134}
\]
Here, the model parameters $\alpha$ and $\beta$ are functions of t, and $\alpha$ also depends on $x_0$. Asymptotically (i.e., for $t \to \infty$), $\alpha$ and $\beta$ relax to the mean $\langle X \rangle$ and variance $\langle \tilde{X}^2 \rangle$ of the process considered. Then, the PDF f(x, t) becomes independent of the initial PDF $f(x_0, t_0)$: f(x, t) is then given by a normal PDF with mean $\langle X \rangle$ and variance $\langle \tilde{X}^2 \rangle$.
Stochastic Process Equations. Instead of asking how the PDF of a stochastic process evolves, we may ask how the underlying stochastic process evolves in time. Generalizing the stochastic difference equations considered in Chap. 6, we considered the model (8.55) for the evolution of the stochastic process X(t),
\[
\frac{dX}{dt}(t) = a(X, t) + b(X, t)\, \frac{dW}{dt}(t). \tag{8.135}
\]
This approach leads to the question of which PDF transport equation is implied by the stochastic model (8.135). To answer this question we calculated the Kramers-Moyal coefficients that are implied by Eq. (8.135), which resulted in
\[
D^{(1)}(x, t) = a(x, t), \tag{8.136a}
\]
\[
D^{(2)}(x, t) = \frac{1}{2}\, b^2(x, t), \tag{8.136b}
\]
\[
D^{(3)}(x, t) = D^{(4)}(x, t) = \cdots = D^{(\infty)}(x, t) = 0. \tag{8.136c}
\]
Hence, the evolution equation for the PDF related to the stochastic model (8.135) is a Fokker-Planck equation with coefficients specified through Eqs. (8.136). By using the coefficient relations we see that the Fokker-Planck equation (8.133) corresponds to the stochastic model
\[
\frac{dX}{dt}(t) = F(t) + G(t) \left( X(t) - \langle X \rangle \right) + \sqrt{2 D(t)}\, \frac{dW}{dt}. \tag{8.137}
\]
Hence, a linear stochastic model has a PDF that is a normal PDF integrated over the initial condition. The most important advantage of stochastic equations is that they can be used to represent Fokker-Planck equations that cannot be solved analytically. Such PDF evolution equations can be solved by Monte Carlo simulation, that is, by the numerical solution of equivalent stochastic equations (see Chap. 6).
Application to Modeling. How can we determine stochastic process equations for the modeling of any case? The stochastic differential equation (8.135) can be used for the modeling of any nonlinear process. On the other hand, Eq. (8.135) describes a Markovian stochastic process, and this assumption is often not rigorously satisfied (most real processes are non-Markovian). This raises the question of whether it is suitable to model a non-Markovian process in terms of a Markovian stochastic differential equation. To address this question we compared in Sect. 8.5 a non-Markovian with a Markovian velocity model: the more accurate non-Markovian velocity model (which may be seen to represent reality) was used as a reference model to evaluate the performance of the less accurate Markovian velocity model (which represents an approximate model for the real process). It was shown that the Markovian velocity model is not incorrect but only less complete than the non-Markovian model. The non-Markovian model describes processes that take place over the time scale $\tau_f$ (over which accelerations change) and over the time scale $\tau$ (over which velocities change). The Markovian model, in contrast, describes only processes that take place over $\tau$. The performance of the Markovian model is acceptable regarding the processes that are described: this model provides velocity correlations that are very close to the correlations implied by the non-Markovian velocity model. It is often only possible to model a part of all the processes observed in reality (there are often processes that take place over a variety of time scales, which vary over orders of magnitude). Applying a Markovian model that accurately describes a certain part of these processes and neglects other (smaller-scale) processes often turns out to be the most convenient choice.
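The Monte Carlo idea mentioned in this summary can be illustrated with a small Python sketch. It is not the book's implementation: it assumes the linear model (8.137) with F = 0 and constant G and D, starts all sample paths at X = 0 so that $\langle X \rangle = 0$ can be treated as known, and uses a simple Euler-Maruyama step. The parameter values and function names are hypothetical. The sample variance is compared with the solution of $d\beta/dt = 2 G \beta + 2 D$ for $\beta(0) = 0$.

```python
import math
import random

def simulate_variance(G=-1.0, D=0.5, t_end=3.0, dt=0.01, n_paths=5000, seed=1):
    """Euler-Maruyama estimate of the variance of X at time t_end.

    Assumed model: dX/dt = G * (X - <X>) + sqrt(2 D) dW/dt with F = 0.
    All paths start at X = 0, so <X> = 0 is used as a known constant.
    """
    rng = random.Random(seed)
    n_steps = int(round(t_end / dt))
    noise_scale = math.sqrt(2.0 * D * dt)  # std. dev. of sqrt(2 D) * dW per step
    xs = [0.0] * n_paths
    for _ in range(n_steps):
        xs = [x + G * x * dt + noise_scale * rng.gauss(0.0, 1.0) for x in xs]
    mean = sum(xs) / n_paths
    return sum((x - mean) ** 2 for x in xs) / (n_paths - 1)

def exact_variance(G, D, t):
    """Solution of d(beta)/dt = 2 G beta + 2 D, beta(0) = 0, constant G != 0."""
    return (D / G) * (math.exp(2.0 * G * t) - 1.0)

if __name__ == "__main__":
    print("Monte Carlo variance:", simulate_variance())
    print("analytical variance: ", exact_variance(-1.0, 0.5, 3.0))
```

The two numbers should agree up to sampling noise and the time-discretization error of the Euler-Maruyama step; for a nonlinear a(X, t), where no closed-form PDF is available, only the simulation part would remain usable.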
8.7 Exercises
8.2.1 Show the consistency of the Fokker-Planck equation (8.21) by integrating this equation over the sample space from negative to positive infinity.
8.2.2 Consider the Fokker-Planck equation (8.21).
a) Calculate the asymptotic solution to this equation (this solution has the property $\partial f / \partial t = 0$).
b) Provide an example for $D^{(1)}$ and $D^{(2)}$ that leads to a PDF that approaches zero for $|x| \to \infty$.
c) Provide an example for $D^{(1)}$ and $D^{(2)}$ that leads to a PDF that diverges for positive or negative x with $|x| \to \infty$.
8.2.3 Consider the Fokker-Planck equation (8.21). Specify this equation (determine the coefficients of the Fokker-Planck equation) so that
\[
f(x) = \frac{1}{\sqrt{2 \pi}\, \sigma} \exp\left\{ -\frac{(x - \mu)^2}{2 \sigma^2} \right\}
\]
represents an asymptotic solution. The normal PDF parameters $\mu$ and $\sigma$ are considered to be constant. Hint: you may assume that $D^{(2)}$ is constant.
8.2.4 Consider Eq. (8.25) for the mean $\langle X \rangle$, which is implied by the Fokker-Planck equation (8.21). Try to solve this equation for the case that $D^{(1)}(x, t) = -a x^2$, where a is any constant.
8.2.5 Consider Eq. (8.32) for the variance $\langle \tilde{X}^2 \rangle$, which is a consequence of the Fokker-Planck equation (8.21).
a) Solve this equation for the case that $D^{(1)}(x, t) = -a x$ and $D^{(2)}$ is constant. Here, a is any constant.
b) Find the asymptotic variance according to this equation.
c) Explain the relevance of this result regarding the determination of model parameters.
8.2.6 Consider the Fokker-Planck equation (8.21).
a) Follow the approach used in Sect. 8.2.3 to find the evolution equation for the central moment of third order $\langle \tilde{X}^3 \rangle$.
b) Solve this equation for the case that $D^{(1)}(x, t) = -a x$ and $D^{(2)}$ is constant. Here, a is any constant.
8.3.1 Consider the Fokker-Planck equation (8.21).
a) Follow the explanations in Sect. 8.3.1 to show that the Fokker-Planck equation (8.21) applies to the two-point PDF $f(x, t; x', t') = \langle \delta[x - X(t)]\, \delta[x' - X(t')] \rangle$,
\[
\frac{\partial f(x, t; x', t')}{\partial t'} + \frac{\partial\, D^{(1)}(x', t')\, f(x, t; x', t')}{\partial x'} - \frac{\partial^2 D^{(2)}(x', t')\, f(x, t; x', t')}{\partial x'^2} = 0.
\]
b) Apply the definition $f(x, t; x', t') = \langle \delta[x - X(t)]\, \delta[x' - X(t')] \rangle$ to show that the correlation function $\langle \tilde{X}(t) \tilde{X}(t') \rangle$ is defined by
\[
\langle \tilde{X}(t)\, \tilde{X}(t') \rangle = \iint \left( x - \langle X \rangle \right) \left( x' - \langle X' \rangle \right) f(x, t; x', t')\, dx\, dx'.
\]
c) Use the definition of $\langle \tilde{X}(t) \tilde{X}(t + r) \rangle$, where r is any non-negative time, and the Fokker-Planck equation for the two-point PDF f(x, t; x', t + r) to show that $\langle \tilde{X}(t) \tilde{X}(t + r) \rangle$ satisfies the equation
\[
\frac{d \langle \tilde{X}(t)\, \tilde{X}(t + r) \rangle}{dr} = \langle \tilde{X}(t)\, \tilde{D}^{(1)}\big( X(t + r), t + r \big) \rangle.
\]
8.3.2 Consider Eq. (8.40) for the conditional PDF $f(x, t \mid x_0, t_0)$. Calculate the asymptotic solution to this equation (which has the property $\partial f / \partial t = 0$). Assume for simplicity that G and D are constant and F = 0.
8.3.3 Consider the model (8.42) for the conditional PDF $f(x, t \mid x_0, t_0)$. Show that the consistency with the initial condition (8.41) requires that $\alpha$ and $\beta$ have the initial values $\alpha(t_0) = x_0$ and $\beta(t_0) = 0$.
8.3.4 Consider Eqs. (8.48) for $\alpha$ and $\beta$,
\[
\frac{d\alpha}{dt} = G(t) \left( \alpha - \langle X \rangle \right) + F(t), \qquad \frac{d\beta}{dt} = 2\, G(t)\, \beta + 2\, D(t).
\]
a) Show that the following expressions are the solutions of these equations and satisfy the initial values $\alpha(t_0) = x_0$ and $\beta(t_0) = 0$.
\[
\alpha = x_0 + \int_{t_0}^{t} \exp\left\{ \int_{t_0}^{t - s + t_0} G(r)\, dr \right\} \Big( G(s) \left[ x_0 - \langle X \rangle \right] + F(s) \Big)\, ds,
\]
\[
\beta = 2 \int_{t_0}^{t} \exp\left\{ 2 \int_{t_0}^{t - s + t_0} G(r)\, dr \right\} D(s)\, ds.
\]
b) Specify the solutions for the case that G and D are independent of t and F = 0.
8.3.5 Consider the Fokker-Planck equation (8.34) for the case that D = 0.
a) Find the function $\beta(t)$ for this case.
b) Find the solution f(x, t) to the Fokker-Planck equation for this case.
c) Specify the solution f(x, t) obtained in b) for the case that the initial PDF is given by $f(x_0, t_0) = \delta(x_0 - \alpha_0)$, where $\alpha_0$ is a given nonrandom value.
d) Interpret the result obtained in c).
8.3.6 We consider an instantaneous emission of a substance from a point source, that is, the emission of a mass M at time zero at a fixed position H. The substance diffuses along the y direction. The mean substance concentration is given by C(y, t) = M f(y, t). Here, f(y, t) refers to the PDF for finding a parcel at time t at a position y (see Sect. 6.3.3). The concentration C is described by the diffusion equation (D is a constant diffusion coefficient)
\[
\frac{\partial C(y, t)}{\partial t} = D\, \frac{\partial^2 C(y, t)}{\partial y^2}.
\]
a) Specify the initial concentration $C(y_0, 0)$ for this case.
b) Calculate the solution to the diffusion equation based on the solution of the Fokker-Planck equation (8.34).
8.4.1 Consider the correlation function (8.87) and variance (8.88).
a) Show that Eq. (8.87) agrees with the consequence (8.33) of the general Fokker-Planck equation.
b) Show that Eq. (8.88) agrees with the consequence (8.32) of the general Fokker-Planck equation.
8.4.2 Suppose you want to use the linear stochastic model (8.73) to model a given case.
a) How can the parameters D and $\tau$ of the model (8.73) be determined in terms of measured statistics?
b) How can evidence be provided for the suitability of modeling a certain case in terms of a linear stochastic model?

8.4.3 The mixing of species in water or air can be described by the model
\[
\frac{d\varphi}{dt} = -\frac{1}{\tau} \left( \varphi - \langle \varphi \rangle \right) + c\, \frac{dW}{dt}.
\]
Here, $\varphi$ refers to the instantaneous mass fraction, which is bounded by zero and one according to its definition, i.e., $0 \le \varphi \le 1$. Here, $\langle \varphi \rangle$ is the mean value of $\varphi$, $\tau$ is a characteristic mixing time scale, c is a parameter, t is time, and dW / dt refers to the derivative of a Wiener process. For simplicity, we assume that $\tau$, c, and $\langle \varphi \rangle$ are constants.
a) Use the stochastic mixing model to derive the equation for the variance $\langle \tilde{\varphi}^2 \rangle$. Solve this equation.
b) Use the stochastic mixing model to derive the corresponding equation for the standardized species mass fraction $\Phi = (\varphi - \langle \varphi \rangle) / \langle \tilde{\varphi}^2 \rangle^{1/2}$.
c) Use the equation for $\Phi$ to discuss the consequences of applying a zero model parameter c. Relate this discussion to the solution of the variance equation.
d) Use the stochastic mixing model to discuss the disadvantage of using a nonzero c. Hint: consider the property $0 \le \varphi \le 1$ of $\varphi$.
8.4.4 Continue with exercise 8.4.3. The PDF $f(\theta, t)$, which is related to the stochastic model considered, is given by
\[
f(\theta, t) = \int f(\theta, t \mid \theta', t')\, f(\theta', t')\, d\theta'.
\]
Here, $f(\theta, t \mid \theta', t')$ and $f(\theta', t')$ refer to the conditional PDF and initial PDF, respectively.
a) Provide the evolution equation and initial condition for the conditional PDF $f(\theta, t \mid \theta', t')$.
b) Solve this PDF evolution equation. Provide all the model parameters of $f(\theta, t \mid \theta', t')$ as explicit functions of time.
c) Calculate $f(\theta, t)$ for the case that $f(\theta', t') = \delta(\theta' - \langle \varphi \rangle)$.
d) Describe qualitatively the evolution of $f(\theta, t)$ obtained in this way.
8.4.5 Consider the stochastic population model discussed in Sect. 6.5,


\[
\frac{dP}{dt} = P \left( 1 - P \right) \left( \mu + \sigma\, \frac{dW}{dt} \right).
\]
a) Use Eqs. (8.25) and (8.32) to obtain the equations for $\langle P \rangle$ and $\langle \tilde{P}^2 \rangle$.
b) Show that the discrete Eqs. (6.94) and (6.96) derived in Chap. 6 imply for $\Delta t \to 0$ the same equations for the mean and variance of P.
c) Present the corresponding equation for the PDF f(p, t).
d) Explain why the equation obtained in c) cannot be solved analytically.
8.5.1 Consider the Markovian velocity model (8.89).
a) Use the velocity model to show that the acceleration correlation function is given for all values of r by the expression
\[
\langle \tilde{a}(t)\, \tilde{a}(t + r) \rangle = -\frac{2e}{3 \tau^2}\, e^{-|r|/\tau} + \frac{4e}{3 \tau}\, \delta(r).
\]
b) Show that the integral of this correlation over $0 \le r < \infty$ is equal to zero.
8.5.2 Show the validity of
\[
\int_0^{\infty} \langle \tilde{a}(t)\, \tilde{a}(t + r) \rangle\, dr = 0
\]
for any variable a that represents the derivative of an equilibrium process. Hint: perform the integration over $\tilde{a}(t + r) = d\tilde{v}(t + r) / dr$ directly.
8.5.3 Consider the velocity correlation function (8.109), which is implied by the non-Markovian velocity model. Calculate the limit $\tau_f \to 0$ of this velocity correlation function to recover the normalized velocity correlation function (8.102) of the Markovian velocity model.
8.5.4 Consider the acceleration correlation function (8.114) implied by the non-Markovian velocity model. Calculate the limit $\tau_f \to 0$ of this acceleration correlation function to recover the acceleration correlation function (8.107) of the Markovian velocity model.
8.5.5 The velocity model (8.89) implies the following model for the position x defined by dx / dt = v,
\[
\frac{dx}{dt} = \langle v \rangle + F(t).
\]
Here, $\langle v \rangle$ is constant, and the stochastic force F(t) is defined by
\[
F(t) = \tilde{v}(0)\, e^{-t/\tau} + \sqrt{\frac{4e}{3 \tau}} \int_0^{t} e^{-(t - s)/\tau}\, \frac{dW}{ds}(s)\, ds.
\]
a) Show that this position model agrees with the velocity model (8.89).
b) Explain why the position model represents a non-Markovian model.
c) Calculate under equilibrium conditions the mean and correlation of the stochastic force F(t). You may follow the explanations in Sect. 8.5.1.
d) Under which condition does this non-Markovian position model reduce to a Markovian position model? To answer this question you have to specify the force F(t). Use the relation $e = 3 \nu / \tau$ between e and the kinematic viscosity $\nu$ (see Sect. 10.5). Neglect the first term in the F(t) expression, which is justified under equilibrium conditions.
9 Deterministic Multivariate Evolution
The discussions of deterministic evolution in Chap. 7 focused on modeling the evolution of one variable (such as heat, mass, the position of a body, or a population density). Considering such relatively simple problems is helpful for a basic understanding of the structure and the range of applicability of equations for typical problems. However, only a narrow range of problems can be described in this way: the analysis of most real problems requires the consideration of the joint evolution of several variables. This is required, for example, for the interaction of biological species and for motions of bodies or fluids in three-dimensional space. To deal with such cases we extend here the concepts used for the modeling of mechanical and population ecology processes in Chap. 7 to the modeling of the joint evolution of several variables. We will continue with the consideration of global properties that change in time but not in space, i.e., partial differential equations that describe the evolution of processes in space will not be considered. The mathematics of models for the evolution of such processes can be formulated in terms of linear and nonlinear systems of coupled ordinary differential equations. Section 9.1 explains the motivation for developing mathematical models for the multivariate evolution of processes. Section 9.2 prepares the discussions in the following sections by explaining techniques for the solution and analysis of coupled systems of ordinary differential equations. Sections 9.3 and 9.4 extend the discussion in Chap. 7. Section 9.3 describes the modeling of basic population ecology processes (the competition for food and predator-prey interactions). The modeling of mechanical motions will be considered in Sect. 9.4, where the pendulum equation used in Chap. 3 will be solved. Section 9.5 illustrates the problem of dealing with the fluid dynamics equations derived in Chap. 10 by considering a simple model for atmospheric motions. Section 9.6 summarizes the basic features of the modeling approaches presented in this chapter.
S. Heinz, Mathematical Modeling, DOI 10.1007/978-3-642-20311-4_9, © Springer-Verlag Berlin Heidelberg 2011
Fig. 9.1. A solution y1(t) of the Lorenz equations (9.1) combined with the model parameter R = 28. The thick line and the thin line present solutions for the initial values (y10, y20, y30) = (5, 5, 5) and (y10, y20, y30) = (5.01, 5, 5), respectively.
9.1 Motivation
Weather Forecasting. Weather forecasting is of crucial relevance. Weather warnings are used to protect life and property. Temperature and precipitation forecasts are highly relevant to agriculture. In everyday life, weather forecasts are used to decide what to wear on a particular day. Weather forecasting has to be performed on the basis of numerical solutions of complicated equations (systems of nonlinear coupled partial differential equations) that involve several variables (such as the three velocity components in space and temperature). To see basic features of such systems of partial differential equations (like the weather predictability), it is helpful to consider highly simplified approximations to these equation systems, such as the Lorenz (1963) model. The latter model is given by the equations
\[
\frac{dy_1}{dt} = 10\, (y_2 - y_1), \tag{9.1a}
\]
\[
\frac{dy_2}{dt} = y_1 (R - y_3) - y_2, \tag{9.1b}
\]
\[
\frac{dy_3}{dt} = y_1 y_2 - \frac{8}{3}\, y_3. \tag{9.1c}
\]
Here, $y_1$ measures the strength and direction of atmospheric circulation, and $y_2$ and $y_3$ measure the horizontal and vertical temperature variation, respectively. The parameter that essentially controls the dynamics of this equation system is R, which is proportional to the vertical temperature difference. Two solutions to these equations, which differ by a minor difference of the initial value for $y_1$, are shown in Fig. 9.1 (details about the Lorenz equations (9.1) and their numerical solutions can be found in Sect. 9.5). This figure illustrates that solutions of the Lorenz equations reveal a complicated behavior. Also, small variations of initial conditions may result in completely different solutions, which indicates that long-range weather forecasting may be impossible.
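The sensitivity to initial conditions described here can be reproduced with a few lines of Python. The sketch below is only an illustration: it integrates Eqs. (9.1) with R = 28 for the two initial values quoted in the caption of Fig. 9.1, using a classical fourth-order Runge-Kutta step. The step size, the time horizon, and the function names are assumptions made for this example and are not taken from the text.

```python
def lorenz_rhs(y, R=28.0):
    """Right-hand side of the Lorenz equations (9.1a)-(9.1c)."""
    y1, y2, y3 = y
    return (10.0 * (y2 - y1), y1 * (R - y3) - y2, y1 * y2 - (8.0 / 3.0) * y3)

def rk4_step(f, y, dt):
    """One classical fourth-order Runge-Kutta step for dy/dt = f(y)."""
    k1 = f(y)
    k2 = f(tuple(yi + 0.5 * dt * ki for yi, ki in zip(y, k1)))
    k3 = f(tuple(yi + 0.5 * dt * ki for yi, ki in zip(y, k2)))
    k4 = f(tuple(yi + dt * ki for yi, ki in zip(y, k3)))
    return tuple(
        yi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    )

def y1_history(y0, t_end=30.0, dt=0.005):
    """Return the list of y1 values along the trajectory starting at y0."""
    n_steps = int(round(t_end / dt))
    y = tuple(y0)
    history = [y[0]]
    for _ in range(n_steps):
        y = rk4_step(lorenz_rhs, y, dt)
        history.append(y[0])
    return history

def max_separation(t_end=30.0, dt=0.005):
    """Largest |y1 difference| between the two nearby initial conditions."""
    a = y1_history((5.0, 5.0, 5.0), t_end, dt)
    b = y1_history((5.01, 5.0, 5.0), t_end, dt)
    return max(abs(p - q) for p, q in zip(a, b))

if __name__ == "__main__":
    print("initial difference in y1: 0.01")
    print("maximum difference in y1:", max_separation())
```

Although the two runs start only 0.01 apart in $y_1$, the maximum difference over the integration interval becomes comparable to the size of the solution itself, which is the behavior Fig. 9.1 is meant to illustrate.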
Questions Considered. The analysis of solutions of the Lorenz equations (9.1) leads to questions like:
• How can we determine different (chaotic and nonchaotic) solution regimes?
• How can we analyze the influence of variations of initial conditions?
• How can we characterize the asymptotic behavior of solutions?
The Lorenz equations represent only one example of the many problems that have to be described by (linear and nonlinear) systems of coupled equations for several variables. Thus, from a more general point of view there are questions like:
• How can we formulate laws for the multivariate evolution of several variables?
• Do all multivariate evolution equations have (convergent numerical) solutions?
• How can we analytically study multivariate evolution equations?
These and other questions will be addressed in this chapter.
9.2 Systems of First-Order Differential Equations Techniques for the solution of linear systems and the analysis of basic features of solutions to nonlinear systems of first-order ordinary differential equations will be described in this section. This discussion will provide an appropriate basis for the developments to be performed in the following sections of this chapter.
9.2.1 Linear Systems of First-Order Differential Equations
Equations Considered. The analysis of linear systems of ordinary differential equations is helpful for two reasons: many problems (like the linearized pendulum equation: see Sect. 9.4.2) can be solved by linear equations, and linear equation systems can be used to analyze the solution features of nonlinear equation systems (like the Lotka-Volterra equations and Lorenz equations: see Sects. 9.3.3 and 9.5.2, respectively). Thus, let us consider the following linear equation system,
\[
\frac{dy_1}{dt} = a_{11} y_1 + a_{12} y_2, \tag{9.2a}
\]
\[
\frac{dy_2}{dt} = a_{21} y_1 + a_{22} y_2. \tag{9.2b}
\]
Here, a11, a12, a21, and a22 are constants. The solution of the equation system (9.2) requires initial values y1(0) = y10 and y2(0) = y20, where y10 and y20 are considered to be given parameters.
Relation to Second-Order Equations. A good way to find the solutions y1(t) and y2(t) to the equation system (9.2) is to exploit the relationship between this equation system and second-order differential equations. This relationship can be derived in the following way,
\[
\begin{aligned}
\frac{d^2 y_1}{dt^2} &= a_{11} \frac{dy_1}{dt} + a_{12} \frac{dy_2}{dt} = a_{11} \frac{dy_1}{dt} + a_{12} (a_{21} y_1 + a_{22} y_2) \\
&= a_{11} \frac{dy_1}{dt} + a_{12} a_{21} y_1 + a_{22} \left( \frac{dy_1}{dt} - a_{11} y_1 \right) = (a_{11} + a_{22}) \frac{dy_1}{dt} - (a_{11} a_{22} - a_{12} a_{21})\, y_1.
\end{aligned} \tag{9.3}
\]
In the first line, Eq. (9.2b) was applied to replace $dy_2 / dt$. In the second line, we used Eq. (9.2a) to replace $a_{12} y_2$. In the same way we find for $d^2 y_2 / dt^2$
\[
\begin{aligned}
\frac{d^2 y_2}{dt^2} &= a_{21} \frac{dy_1}{dt} + a_{22} \frac{dy_2}{dt} = a_{21} (a_{11} y_1 + a_{12} y_2) + a_{22} \frac{dy_2}{dt} \\
&= a_{22} \frac{dy_2}{dt} + a_{12} a_{21} y_2 + a_{11} \left( \frac{dy_2}{dt} - a_{22} y_2 \right) = (a_{11} + a_{22}) \frac{dy_2}{dt} - (a_{11} a_{22} - a_{12} a_{21})\, y_2.
\end{aligned} \tag{9.4}
\]
Equations (9.3) and (9.4) can also be written
\[
\frac{d^2 y_1}{dt^2} + b\, \frac{dy_1}{dt} + c\, y_1 = 0, \tag{9.5a}
\]
\[
\frac{d^2 y_2}{dt^2} + b\, \frac{dy_2}{dt} + c\, y_2 = 0, \tag{9.5b}
\]
where the following abbreviations are applied,
\[
b = -(a_{11} + a_{22}), \qquad c = a_{11} a_{22} - a_{12} a_{21}. \tag{9.6}
\]
Equations (9.5a) and (9.5b) represent the same equation. Different solutions $y_1(t)$ and $y_2(t)$ of these equations are obtained by applying the initial values $y_1(0) = y_{10}$ and $y_2(0) = y_{20}$, and the initial derivatives $dy_1 / dt\,(0) = y'_{10}$ and $dy_2 / dt\,(0) = y'_{20}$ that are provided through Eq. (9.2),
\[
y'_{10} = a_{11} y_{10} + a_{12} y_{20}, \tag{9.7a}
\]
\[
y'_{20} = a_{21} y_{10} + a_{22} y_{20}. \tag{9.7b}
\]
Equations (9.5) correspond to the second-order equation (7.45). Accordingly, the solution of Eq. (7.45) derived in Chap. 7 can be used for the solution of the equation system (9.5), as will be shown in the next paragraph. Before doing this we will show that the relationship between a linear system of first-order equations and a linear second-order differential equation also can be used to write a linear second-order equation in terms of a system of first-order differential equations.
The latter can be seen by considering Eq. (7.45),
\[
a\, \frac{d^2 y}{dt^2} + b\, \frac{dy}{dt} + c\, y = 0. \tag{9.8}
\]
We set y1 = y and y2 = dy / dt. Differentiation of y1 and y2 then provides
dy1 dy ï€½ ï€½ y2 , dt dt
(9.9a)
dy 2 d 2 y b dy c c b ï€½ 2 ï€½ï€ ï€ y ï€½ ï€ y1 ï€ y 2 , dt a dt a a a dt
(9.9b)
where Eq. (9.8) was applied. Hence, the second-order differential equation (9.8) can be represented as the system (9.9) of first-order equations. The initial values required for the solution of Eq. (9.9) are given by the initial values y1(0) = y(0) and y2(0) = dy / dt(0) that complete the second-order equation (9.8).
Solution of the Equation System. I. The solutions to Eqs. (9.2) can be derived via the solution (7.57) of the corresponding second-order equations (9.5). For a = 1 we obtain according to Eq. (7.57)
$$ y_1 = \frac{y'_{10} - r_2\, y_{10}}{r_1 - r_2}\, e^{r_1 t} - \frac{y'_{10} - r_1\, y_{10}}{r_1 - r_2}\, e^{r_2 t}, \tag{9.10a} $$
$$ y_2 = \frac{y'_{20} - r_2\, y_{20}}{r_1 - r_2}\, e^{r_1 t} - \frac{y'_{20} - r_1\, y_{20}}{r_1 - r_2}\, e^{r_2 t}. \tag{9.10b} $$
The initial derivatives y'10 and y'20 are given through Eqs. (9.7). The eigenvalues r1 and r2 are determined by the characteristic equation
$$ 0 = r^2 - (a_{11} + a_{22})\, r + a_{11}a_{22} - a_{12}a_{21} = (a_{11} - r)(a_{22} - r) - a_{12}a_{21}, \tag{9.11} $$
which follows from the use of b = −(a11 + a22) and c = a11 a22 − a12 a21 in the characteristic equation $r^2 + b\,r + c = 0$. Hence, the eigenvalues r1 and r2 are
$$ r_1 = r_S + r_D, \qquad r_2 = r_S - r_D, \tag{9.12} $$
where rS and rD are given by
$$ r_S = \frac{a_{11} + a_{22}}{2}, \tag{9.13a} $$
$$ r_D = \frac{1}{2}\sqrt{(a_{11} + a_{22})^2 - 4\,(a_{11}a_{22} - a_{12}a_{21})} = \frac{1}{2}\sqrt{(a_{11} - a_{22})^2 + 4\,a_{12}a_{21}}. \tag{9.13b} $$
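The eigenvalue formulas (9.12) and (9.13) can be evaluated directly. The following is a minimal sketch, not part of the original text: the function names and the coefficient values are illustrative assumptions, and `cmath.sqrt` is used so that a negative discriminant yields complex eigenvalues.

```python
import cmath

def eigenvalues_2x2(a11, a12, a21, a22):
    """Return (r1, r2) as r_S + r_D and r_S - r_D, following Eqs. (9.12)-(9.13)."""
    r_s = (a11 + a22) / 2
    # cmath.sqrt also covers a negative discriminant (complex eigenvalues).
    r_d = cmath.sqrt((a11 - a22) ** 2 + 4 * a12 * a21) / 2
    return r_s + r_d, r_s - r_d

def characteristic_polynomial(r, a11, a12, a21, a22):
    """Right-hand side of the characteristic equation (9.11)."""
    return r * r - (a11 + a22) * r + (a11 * a22 - a12 * a21)

# Illustrative coefficients (an assumption, chosen only for the demonstration).
coeffs = (-1.0, 0.5, 0.5, -1.0)
r1, r2 = eigenvalues_2x2(*coeffs)
# Both eigenvalues should make the characteristic polynomial vanish.
residuals = [abs(characteristic_polynomial(r, *coeffs)) for r in (r1, r2)]
print(r1, r2, residuals)
```

The eigenvalues are returned as complex numbers even when they are real; their imaginary parts are then zero.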
Solution of the Equation System. II. Let us directly solve the equation system (9.2) to check the validity of the solutions (9.10). Such a solution can be found efficiently by making use of vector and matrix notation. We write Eq. (9.2) as
$$ \frac{d\mathbf{y}}{dt} = \mathbf{A}\,\mathbf{y}. \tag{9.14} $$
Here, the vector y and the matrix A are given by
$$ \mathbf{y} = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}, \qquad \mathbf{A} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}. \tag{9.15} $$
To solve Eq. (9.14) we assume according to Eq. (9.10) an exponential solution,
$$ \mathbf{y} = \mathbf{c}\, e^{r t}, \tag{9.16} $$
where the constant vector c is given by
$$ \mathbf{c} = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}. \tag{9.17} $$
The use of Eq. (9.16) in Eq. (9.14) then results in
$$ r\,\mathbf{c}\, e^{r t} = \mathbf{A}\,\mathbf{c}\, e^{r t}. \tag{9.18} $$
Upon cancelling the nonzero exponential function we obtain
$$ (\mathbf{A} - r\,\mathbf{I})\,\mathbf{c} = \mathbf{0}. \tag{9.19} $$
Here, I is the 2×2 identity matrix,
$$ \mathbf{I} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{9.20} $$
The identity matrix I has the property I c = c. The inverse matrix of A − r I exists if the determinant det(A − r I) is nonzero. In this case, Eq. (9.19) admits only the trivial solution c = 0. Thus, the condition to obtain nontrivial solutions c is given by det(A − r I) = 0, i.e.,
$$ \det(\mathbf{A} - r\,\mathbf{I}) = \begin{vmatrix} a_{11} - r & a_{12} \\ a_{21} & a_{22} - r \end{vmatrix} = 0. \tag{9.21} $$
The latter constraint can also be written
$$ 0 = (a_{11} - r)(a_{22} - r) - a_{12}a_{21} = r^2 - (a_{11} + a_{22})\, r + a_{11}a_{22} - a_{12}a_{21}. \tag{9.22} $$
The solution of this quadratic equation for r reveals that the two eigenvalues r1 and r2 obtained in this way agree with the eigenvalues r1 and r2 given by Eq. (9.12). To specify the solution (9.16) we have to use the eigenvalues r1 and r2 in Eq. (9.19) to determine the corresponding eigenvectors $\mathbf{c}^{(1)}$ and $\mathbf{c}^{(2)}$. This constraint provides the equations
$$ \begin{pmatrix} a_{11} - r_1 & a_{12} \\ a_{21} & a_{22} - r_1 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}^{(1)} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \tag{9.23a} $$
$$ \begin{pmatrix} a_{11} - r_2 & a_{12} \\ a_{21} & a_{22} - r_2 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}^{(2)} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \tag{9.23b} $$
The conditions for the first eigenvector $\mathbf{c}^{(1)}$ can be written
$$ 0 = (a_{11} - r_1)\, c_1^{(1)} + a_{12}\, c_2^{(1)}, \tag{9.24a} $$
$$ 0 = a_{21}\, c_1^{(1)} + (a_{22} - r_1)\, c_2^{(1)}. \tag{9.24b} $$
Equation (9.24a) can be used to express $c_2^{(1)}$ in terms of $c_1^{(1)}$,
$$ c_2^{(1)} = -\frac{a_{11} - r_1}{a_{12}}\, c_1^{(1)}. \tag{9.25} $$
The use of this expression in Eq. (9.24b) provides a relation for $c_1^{(1)}$,
$$ 0 = \left[ a_{21} - \frac{1}{a_{12}}\,(a_{11} - r_1)(a_{22} - r_1) \right] c_1^{(1)} = -\frac{1}{a_{12}} \left[ (a_{11} - r_1)(a_{22} - r_1) - a_{12}a_{21} \right] c_1^{(1)}. \tag{9.26} $$
The eigenvalue r1 solves the characteristic equation (9.22). Thus, the bracket term in the latter relation is equal to zero, which means that there is no constraint on $c_1^{(1)}$. Hence, we may assume that $c_1^{(1)} = c_1$, where c1 is an open parameter. The eigenvector $\mathbf{c}^{(1)}$ can then be written $\mathbf{c}^{(1)} = c_1 \mathbf{u}_1$, where
$$ \mathbf{u}_1 = \begin{pmatrix} 1 \\ -\dfrac{a_{11} - r_1}{a_{12}} \end{pmatrix}. \tag{9.27} $$
The corresponding condition for the second eigenvector $\mathbf{c}^{(2)}$ can be derived in the same way. According to Eq. (9.23b) we have
$$ 0 = (a_{11} - r_2)\, c_1^{(2)} + a_{12}\, c_2^{(2)}, \tag{9.28a} $$
$$ 0 = a_{21}\, c_1^{(2)} + (a_{22} - r_2)\, c_2^{(2)}. \tag{9.28b} $$
Using the relation for $c_1^{(2)}$ implied by Eq. (9.28a) in the second relation leads to
$$ 0 = \left[ -a_{21}\,\frac{a_{12}}{a_{11} - r_2} + a_{22} - r_2 \right] c_2^{(2)} = \frac{1}{a_{11} - r_2} \left[ (a_{11} - r_2)(a_{22} - r_2) - a_{12}a_{21} \right] c_2^{(2)}. \tag{9.29} $$
This relation does not imply a condition on $c_2^{(2)}$ because the eigenvalue r2 satisfies the characteristic equation (9.22). By setting $c_2^{(2)} = c_2$, where c2 is an open parameter, we find by means of Eq. (9.28a)
$$ c_1^{(2)} = -\frac{a_{12}}{a_{11} - r_2}\, c_2. \tag{9.30} $$
Hence, $\mathbf{c}^{(2)}$ can be written $\mathbf{c}^{(2)} = c_2 \mathbf{u}_2$, where
$$ \mathbf{u}_2 = \begin{pmatrix} -\dfrac{a_{12}}{a_{11} - r_2} \\ 1 \end{pmatrix}. \tag{9.31} $$
By accounting for the two possible exponential solutions we can write the solution of the equation system (9.14) as
$$ \mathbf{y} = c_1 \mathbf{u}_1\, e^{r_1 t} + c_2 \mathbf{u}_2\, e^{r_2 t}, \tag{9.32} $$
which means according to Eqs. (9.27) and (9.31) that
$$ \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = c_1 \begin{pmatrix} 1 \\ -\dfrac{a_{11} - r_1}{a_{12}} \end{pmatrix} e^{r_1 t} + c_2 \begin{pmatrix} -\dfrac{a_{12}}{a_{11} - r_2} \\ 1 \end{pmatrix} e^{r_2 t}. \tag{9.33} $$
This solution has to satisfy the initial conditions, which implies two equations for c1 and c2,
$$ y_{10} = c_1 - \frac{a_{12}}{a_{11} - r_2}\, c_2, \qquad y_{20} = -\frac{a_{11} - r_1}{a_{12}}\, c_1 + c_2. \tag{9.34} $$
The use of the second relation for c2 in the first condition, and the use of the first relation for c1 in the second relation implies
$$ y_{10} = c_1 - \frac{a_{12}}{a_{11} - r_2} \left[ y_{20} + \frac{a_{11} - r_1}{a_{12}}\, c_1 \right] = c_1 \left( 1 - \frac{a_{11} - r_1}{a_{11} - r_2} \right) - \frac{a_{12}}{a_{11} - r_2}\, y_{20}, \tag{9.35a} $$
$$ y_{20} = -\frac{a_{11} - r_1}{a_{12}} \left[ y_{10} + \frac{a_{12}}{a_{11} - r_2}\, c_2 \right] + c_2 = c_2 \left( 1 - \frac{a_{11} - r_1}{a_{11} - r_2} \right) - \frac{a_{11} - r_1}{a_{12}}\, y_{10}. \tag{9.35b} $$
Correspondingly, c1 and c2 are given by
$$ c_1 = \frac{y_{10}\,(a_{11} - r_2) + a_{12}\, y_{20}}{r_1 - r_2} = \frac{y'_{10} - r_2\, y_{10}}{r_1 - r_2}, \tag{9.36a} $$
$$ c_2 = \frac{a_{11} - r_2}{a_{12}\,(r_1 - r_2)} \left[ a_{12}\, y_{20} + (a_{11} - r_1)\, y_{10} \right] = \frac{a_{11} - r_2}{a_{12}}\, \frac{y'_{10} - r_1\, y_{10}}{r_1 - r_2}, \tag{9.36b} $$
where the definitions (9.7) of initial derivatives are used as abbreviations. With these expressions we can write the solution of Eq. (9.14)
$$ \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \frac{y'_{10} - r_2\, y_{10}}{r_1 - r_2} \begin{pmatrix} 1 \\ -\dfrac{a_{11} - r_1}{a_{12}} \end{pmatrix} e^{r_1 t} - \frac{y'_{10} - r_1\, y_{10}}{r_1 - r_2} \begin{pmatrix} 1 \\ -\dfrac{a_{11} - r_2}{a_{12}} \end{pmatrix} e^{r_2 t}. \tag{9.37} $$
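The closed-form solution (9.37) can be evaluated numerically and checked against the initial conditions. The following is a minimal sketch under stated assumptions: real and distinct eigenvalues, a12 ≠ 0, and a hypothetical function name `solve_linear_2x2` that does not appear in the text.

```python
import math

def solve_linear_2x2(a11, a12, a21, a22, y10, y20, t):
    """Evaluate Eq. (9.37); assumes real, distinct eigenvalues and a12 != 0."""
    r_s = (a11 + a22) / 2
    r_d = math.sqrt((a11 - a22) ** 2 + 4 * a12 * a21) / 2
    r1, r2 = r_s + r_d, r_s - r_d
    dy10 = a11 * y10 + a12 * y20            # initial derivative, Eq. (9.7a)
    k1 = (dy10 - r2 * y10) / (r1 - r2)      # prefactor of the e^{r1 t} term
    k2 = (dy10 - r1 * y10) / (r1 - r2)      # prefactor of the e^{r2 t} term
    e1, e2 = math.exp(r1 * t), math.exp(r2 * t)
    y1 = k1 * e1 - k2 * e2
    y2 = -(a11 - r1) / a12 * k1 * e1 + (a11 - r2) / a12 * k2 * e2
    return y1, y2

# Illustrative coefficients and initial values (assumptions for the demo).
A = (-1.0, 2.0, 2.0, -1.0)
y0 = (1.0, 0.5)
print(solve_linear_2x2(*A, *y0, 0.0))   # reproduces the initial values
print(solve_linear_2x2(*A, *y0, 0.3))
```

Evaluating the function at t = 0 is a quick consistency check, because Eq. (9.37) must return (y10, y20) there.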
Consistency of Solutions. This solution provides y1 as given by Eq. (9.10a). To show that y2 given by Eq. (9.37) agrees with Eq. (9.10b) we do the following: The initial value y20 and initial derivative y'20 implied by Eq. (9.37) are given by
$$ y_{20} = \frac{P - Q}{r_1 - r_2}, \qquad y'_{20} = \frac{r_1 P - r_2 Q}{r_1 - r_2}. \tag{9.38} $$
Table 9.1 Cases considered for the illustration of solutions of the linear equation system (9.42).
Case                                                  a11    a12    a21    r1       r2
Case 1: Real unequal eigenvalues of the same sign     −1     0.5    0.5    −0.5     −1.5
Case 2: Real unequal eigenvalues of opposite sign     −1     2      2      1        −3
Case 3: Real equal eigenvalues                        −1     0      0      −1       −1
Case 4: Complex eigenvalues                           −1     1      −1     −1+i     −1−i
Case 5: Pure imaginary eigenvalues                    0      1      −1     i        −i
Here, we used the abbreviations
$$ P = -\frac{a_{11} - r_1}{a_{12}}\,(y'_{10} - r_2\, y_{10}), \qquad Q = -\frac{a_{11} - r_2}{a_{12}}\,(y'_{10} - r_1\, y_{10}). \tag{9.39} $$
The use of P = (r1 − r2) y20 + Q and Q = P − (r1 − r2) y20 according to the first relation in Eq. (9.38) enables us to write
$$ y'_{20} = \frac{r_1 \left[ Q + (r_1 - r_2)\, y_{20} \right] - r_2\, Q}{r_1 - r_2} = \frac{r_1 P - r_2 \left[ P - (r_1 - r_2)\, y_{20} \right]}{r_1 - r_2}. \tag{9.40} $$
These two expressions for y'20 can be used to obtain
$$ y'_{20} - r_1\, y_{20} = Q = -\frac{a_{11} - r_2}{a_{12}}\,(y'_{10} - r_1\, y_{10}), \tag{9.41a} $$
$$ y'_{20} - r_2\, y_{20} = P = -\frac{a_{11} - r_1}{a_{12}}\,(y'_{10} - r_2\, y_{10}). \tag{9.41b} $$
The combination of these relations with Eq. (9.37) recovers the solution (9.10b).
9.2.2 Features of Solutions of Linear First-Order Systems
Example. Let us illustrate some characteristic features of solutions of the linear equation system (9.2). For simplicity we assume that a11 = a22,
$$ \frac{dy_1}{dt} = a_{11}\, y_1 + a_{12}\, y_2, \tag{9.42a} $$
$$ \frac{dy_2}{dt} = a_{21}\, y_1 + a_{11}\, y_2. \tag{9.42b} $$
According to Eq. (9.12), the eigenvalues r1 and r2 are then given by the relations
$$ r_1 = a_{11} + \sqrt{a_{12}a_{21}}, \qquad r_2 = a_{11} - \sqrt{a_{12}a_{21}}. \tag{9.43} $$
Fig. 9.2. The temporal evolution of y1 and y2 in the y1-y2 phase plane according to the linear equation system (9.42); (a), (b), (c), (d), and (e) show the evolution of y1(t) and y2(t) for the five cases given in Table 9.1, respectively. Several initial values of y1 and y2 are considered.
Five sets of model parameters a11, a12, and a21 are specified in Table 9.1. The cases considered correspond to five characteristic types of the eigenvalues r1 and r2. The solutions of the equation system (9.42) were obtained numerically. The evolution of y1 and y2 in time is shown in Fig. 9.2 in the y1-y2 phase plane for several initial values y10 and y20. Such curves can be seen as the trajectory of a particle moving with a velocity dy / dt = A y.
Analytical Solutions. The analytical solutions for the five cases considered are given by Eq. (9.32). To prepare the discussion of specific cases we will calculate the solution for the first three cases. Cases 4 and 5 are not involved here: these cases require rewritings of the solution (9.32) to have real-valued solutions (see Chap. 7). To cover the first three cases we set a12 = a21 = ε. Here, ε represents a non-negative parameter. For the cases 1, 2, and 3 we have the values ε = (0.5, 2, 0), respectively. According to Eq. (9.43), the eigenvalues are then given by
$$ r_1 = -1 + \varepsilon, \qquad r_2 = -1 - \varepsilon. \tag{9.44} $$
Equations (9.27) and (9.31) imply for the eigenvectors
$$ \mathbf{u}_1 = \begin{pmatrix} 1 \\ -\dfrac{a_{11} - r_1}{a_{12}} \end{pmatrix} = \begin{pmatrix} 1 \\ -\dfrac{-\varepsilon}{\varepsilon} \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad \mathbf{u}_2 = \begin{pmatrix} -\dfrac{a_{12}}{a_{11} - r_2} \\ 1 \end{pmatrix} = \begin{pmatrix} -\dfrac{\varepsilon}{\varepsilon} \\ 1 \end{pmatrix} = \begin{pmatrix} -1 \\ 1 \end{pmatrix}. \tag{9.45} $$
According to Eq. (9.36), the coefficients c1 and c2 are given by
$$ c_1 = \frac{a_{11}\, y_{10} + a_{12}\, y_{20} - r_2\, y_{10}}{r_1 - r_2} = \frac{-y_{10} + \varepsilon\, y_{20} + (1 + \varepsilon)\, y_{10}}{2\varepsilon} = \frac{y_{20} + y_{10}}{2}, \tag{9.46a} $$
$$ c_2 = \frac{a_{11} - r_2}{a_{12}}\, \frac{a_{11}\, y_{10} + a_{12}\, y_{20} - r_1\, y_{10}}{r_1 - r_2} = \frac{\varepsilon}{\varepsilon}\, \frac{-y_{10} + \varepsilon\, y_{20} + (1 - \varepsilon)\, y_{10}}{2\varepsilon} = \frac{y_{20} - y_{10}}{2}. \tag{9.46b} $$
Thus, the solution (9.32) reads
$$ \mathbf{y} = c_1 \mathbf{u}_1\, e^{r_1 t} + c_2 \mathbf{u}_2\, e^{r_2 t} = \frac{y_{20} + y_{10}}{2} \begin{pmatrix} 1 \\ 1 \end{pmatrix} e^{-(1 - \varepsilon)\, t} + \frac{y_{20} - y_{10}}{2} \begin{pmatrix} -1 \\ 1 \end{pmatrix} e^{-(1 + \varepsilon)\, t}. \tag{9.47} $$
In terms of y1 and y2, this solution can be written
$$ y_1 = \frac{y_{20} + y_{10}}{2}\, e^{-(1 - \varepsilon)\, t} - \frac{y_{20} - y_{10}}{2}\, e^{-(1 + \varepsilon)\, t}, \tag{9.48a} $$
$$ y_2 = \frac{y_{20} + y_{10}}{2}\, e^{-(1 - \varepsilon)\, t} + \frac{y_{20} - y_{10}}{2}\, e^{-(1 + \varepsilon)\, t}. \tag{9.48b} $$
To see the relation between y2 and y1, which determines the trajectory in the y1-y2 phase plane, we consider the sum and the difference of these two expressions,
$$ y_2 + y_1 = (y_{20} + y_{10})\, e^{-(1 - \varepsilon)\, t}, \qquad y_2 - y_1 = (y_{20} - y_{10})\, e^{-(1 + \varepsilon)\, t}. \tag{9.49} $$
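The text states that the solutions of the system (9.42) were obtained numerically, without naming a method. As a hedged illustration, the sketch below integrates Eq. (9.42) for a11 = −1 and a12 = a21 = ε with a classical fourth-order Runge-Kutta scheme (an assumption, not necessarily the method used for Fig. 9.2) and compares the result with the analytical solution (9.48). The function names, step count, and initial values are illustrative.

```python
import math

def rhs(y, a11, eps):
    """Right-hand side of Eq. (9.42) with a12 = a21 = eps."""
    y1, y2 = y
    return (a11 * y1 + eps * y2, eps * y1 + a11 * y2)

def rk4(f, y0, t_end, n_steps):
    """Classical fourth-order Runge-Kutta integration from t = 0 to t_end."""
    h = t_end / n_steps
    y = tuple(y0)
    for _ in range(n_steps):
        k1 = f(y)
        k2 = f(tuple(yi + 0.5 * h * ki for yi, ki in zip(y, k1)))
        k3 = f(tuple(yi + 0.5 * h * ki for yi, ki in zip(y, k2)))
        k4 = f(tuple(yi + h * ki for yi, ki in zip(y, k3)))
        y = tuple(yi + h / 6 * (p + 2 * q + 2 * r + s)
                  for yi, p, q, r, s in zip(y, k1, k2, k3, k4))
    return y

def analytic(y10, y20, eps, t):
    """Eqs. (9.48a) and (9.48b), valid for a11 = -1."""
    s = 0.5 * (y20 + y10) * math.exp(-(1 - eps) * t)
    d = 0.5 * (y20 - y10) * math.exp(-(1 + eps) * t)
    return (s - d, s + d)

# Case 1 of Table 9.1 (eps = 0.5); the initial values are arbitrary choices.
eps, y10, y20, t_end = 0.5, 1.0, -0.5, 2.0
numeric = rk4(lambda y: rhs(y, -1.0, eps), (y10, y20), t_end, 200)
exact = analytic(y10, y20, eps, t_end)
print(numeric, exact)
```

Repeating the comparison with ε = 2 covers case 2 of Table 9.1 in the same way.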
Case 1. The first case considers two real unequal eigenvalues of the same sign. For this case (ε = 0.5), the solution (9.47) reads
$$ \mathbf{y} = c_1 \mathbf{u}_1\, e^{-0.5\, t} + c_2 \mathbf{u}_2\, e^{-1.5\, t} = e^{-0.5\, t} \left( c_1 \mathbf{u}_1 + c_2 \mathbf{u}_2\, e^{-t} \right). \tag{9.50} $$
Examples for the evolution of y1 and y2 in time are shown in Fig. 9.2a for different initial values. All trajectories are attracted by the equilibrium solution (0, 0). For two real unequal eigenvalues that are positive one finds the opposite feature that all trajectories increase their distance to (0, 0). The trajectories are aligned with the eigenvectors. There is, however, a difference in the behavior of trajectories. The term $c_2 \mathbf{u}_2\, e^{-t}$ in Eq. (9.50) is small compared to $c_1 \mathbf{u}_1$ for sufficiently large t. Thus, the trajectories tend toward u1 before they reach the equilibrium point (0, 0).
The trajectories are characterized by Eq. (9.49), where ε = 0.5. For t → ∞ we find y2 = ±y1, which agrees with the eigenvectors (9.45).
Case 2. The second case considers two real unequal eigenvalues of opposite sign. The solution (9.47) reads for this case
$$ \mathbf{y} = c_1 \mathbf{u}_1\, e^{t} + c_2 \mathbf{u}_2\, e^{-3 t} = e^{t} \left( c_1 \mathbf{u}_1 + c_2 \mathbf{u}_2\, e^{-4 t} \right). \tag{9.51} $$
Examples for the evolution of y1 and y2 are shown in Fig. 9.2b for different initial values. The behavior of trajectories can be explained by considering the solution (9.51). As in the first case, trajectories tend toward u1 because the u2 term in Eq. (9.51) becomes negligible compared to the u1 term for large t. The difference from the first case is that the solution tends (due to the positive eigenvalue) to infinity after approaching u1. The relation between y1 and y2 is determined by Eq. (9.49), where ε = 2. For t → ∞ this relation provides y2 = ±y1 in agreement with the eigenvectors (9.45).
Case 3. The third case considers two real equal eigenvalues. The solution for this case can be found by considering the limit ε → 0 of Eq. (9.47),
$$ \mathbf{y} = \left[ \frac{y_{20} + y_{10}}{2} \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \frac{y_{20} - y_{10}}{2} \begin{pmatrix} -1 \\ 1 \end{pmatrix} \right] e^{-t} = \begin{pmatrix} y_{10} \\ y_{20} \end{pmatrix} e^{-t}. \tag{9.52} $$
Examples for the evolution of y1 and y2 in time are shown in Fig. 9.2c. Expression (9.52) explains the difference from cases 1 and 2: trajectories do not tend toward u1 because the bracket term is independent of time. The path of trajectories can be derived from Eq. (9.52),
$$ y_2 = \frac{y_{20}}{y_{10}}\, y_1. \tag{9.53} $$
Hence, every trajectory lies on a straight line through the origin.
Case 4. The fourth case considers two complex eigenvalues. The eigenvectors are also complex. Figure 9.2d shows that all trajectories tend toward the equilibrium point (0, 0). The simplest way to see in which way the equilibrium is established is to analyze the consequences of the equation system (9.42) directly,
$$ \frac{dy_1}{dt} = -y_1 + y_2, \tag{9.54a} $$
$$ \frac{dy_2}{dt} = -y_1 - y_2. \tag{9.54b} $$
We multiply Eq. (9.54a) by 2 y1 and Eq. (9.54b) by 2 y2, and we take the sum of both equations,
$$ \frac{d}{dt} \left( y_1^2 + y_2^2 \right) = -2 \left( y_1^2 + y_2^2 \right). \tag{9.55} $$
This relation was obtained by making use of the identity $dy_1^2 / dt = 2\, y_1\, dy_1 / dt$, and a corresponding relation for y2. The solution of this equation is given by
$$ y_1^2 + y_2^2 = \left( y_{10}^2 + y_{20}^2 \right) e^{-2 t}. \tag{9.56} $$
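The intermediate step behind Eqs. (9.55) and (9.56) can be written out explicitly. In the sketch below, the radius symbol ρ is introduced here only for illustration and does not appear in the original text.

```latex
\begin{aligned}
2 y_1 \frac{dy_1}{dt} + 2 y_2 \frac{dy_2}{dt}
  &= 2 y_1 (-y_1 + y_2) + 2 y_2 (-y_1 - y_2) \\
  &= -2 y_1^2 + 2 y_1 y_2 - 2 y_1 y_2 - 2 y_2^2
   = -2 \left( y_1^2 + y_2^2 \right), \\
\rho(t) &= \sqrt{y_1^2 + y_2^2} = \rho(0)\, e^{-t}.
\end{aligned}
```

The mixed terms cancel, so the squared distance from the origin obeys a single linear equation, and the distance itself decays like e^{−t}.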
Therefore, the trajectories represent circles with a radius that decreases in time, i.e., the trajectories are spirals.
Case 5. The fifth case considers pure imaginary eigenvalues. Figure 9.2e shows that the trajectories are given by circles in this case. Evidence for these trajectories can be obtained by the consideration of the equation system (9.42) for this case,
$$ \frac{dy_1}{dt} = y_2, \tag{9.57a} $$
$$ \frac{dy_2}{dt} = -y_1. \tag{9.57b} $$
As for the fourth case, we multiply Eq. (9.57a) by 2 y1, Eq. (9.57b) by 2 y2, and we take the sum of both equations,
$$ \frac{d}{dt} \left( y_1^2 + y_2^2 \right) = 0. \tag{9.58} $$
Hence, the trajectories are indeed circles,
$$ y_1^2 + y_2^2 = y_{10}^2 + y_{20}^2. \tag{9.59} $$
The critical point (0, 0) is called a center.
Summary. The features of solutions of systems of first-order linear differential equations can be summarized in the following way (Boyce and DiPrima 2009). There are three possibilities for the evolution of trajectories:
a) Trajectories approach the equilibrium point as t → ∞. This behavior is seen if the eigenvalues are real and negative or complex with negative real part. Such a system is called asymptotically stable.
b) Trajectories remain bounded but they do not approach the origin. This behavior appears if the eigenvalues are pure imaginary. Such a system is called stable.
c) Trajectories become unbounded as t → ∞. Such a behavior is seen if at least one eigenvalue is positive or if the eigenvalues have a positive real part. Such a system is called unstable.
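The three possibilities listed in the summary can be turned into a small classification routine. This is a minimal sketch: the function name `classify`, the tolerance, and the fallback label are assumptions made for the illustration, and degenerate situations (for example a zero eigenvalue) are deliberately left unclassified. The five parameter sets are those of Table 9.1 with a22 = a11.

```python
import cmath

def classify(a11, a12, a21, a22, tol=1e-12):
    """Classify the equilibrium (0, 0) of dy/dt = A y from the eigenvalues."""
    r_s = (a11 + a22) / 2
    r_d = cmath.sqrt((a11 - a22) ** 2 + 4 * a12 * a21) / 2
    roots = [r_s + r_d, r_s - r_d]
    largest_real = max(r.real for r in roots)
    if largest_real > tol:
        return "unstable"                 # possibility c)
    if largest_real < -tol:
        return "asymptotically stable"    # possibility a)
    if all(abs(r.real) <= tol and abs(r.imag) > tol for r in roots):
        return "stable"                   # possibility b): pure imaginary
    return "not covered by the three cases"

# (a11, a12, a21) for cases 1 to 5 of Table 9.1, with a22 = a11.
table_9_1 = [(-1, 0.5, 0.5), (-1, 2, 2), (-1, 0, 0), (-1, 1, -1), (0, 1, -1)]
labels = [classify(a11, a12, a21, a11) for a11, a12, a21 in table_9_1]
for case, label in enumerate(labels, start=1):
    print(case, label)
```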
9.2.3 Analysis of Nonlinear Equation Systems
Nonlinear Equation System. The analysis of nonlinear equation systems is more difficult than the analysis of linear systems because nonlinear systems can hardly be solved analytically. To illustrate the way of analyzing the behavior of nonlinear systems, let us consider an equation system that will be used also for the discussion of population ecology processes, see Sect. 9.3,
$$ \frac{dy_1}{dt} = y_1 \left( a_1 + b_1 y_1 + c_1 y_2 \right), \tag{9.60a} $$
$$ \frac{dy_2}{dt} = y_2 \left( a_2 + b_2 y_2 + c_2 y_1 \right). \tag{9.60b} $$
Here, a1, b1, c1 and a2, b2, c2 are any positive or negative constants. One approach to analyze the nonlinear equation system (9.60) will be discussed in the following: The idea is to determine the equilibrium points and to analyze the solution sufficiently close to the equilibrium points such that the nonlinear equation system can be approximated by a linear equation system that can be solved.
Equilibrium Points. An equilibrium solution (Y1, Y2) of Eqs. (9.60) is defined by Y1 and Y2 values so that dy1 / dt = dy2 / dt = 0. Therefore, equilibrium solutions are defined by the conditions
$$ 0 = y_1 \left( a_1 + b_1 y_1 + c_1 y_2 \right), \tag{9.61a} $$
$$ 0 = y_2 \left( a_2 + b_2 y_2 + c_2 y_1 \right). \tag{9.61b} $$
The equation system (9.60) provides four equilibrium points, which are given by
$$ (Y_1, Y_2) = (0, 0), \qquad (Y_1, Y_2) = \left( 0, -\frac{a_2}{b_2} \right), \qquad (Y_1, Y_2) = \left( -\frac{a_1}{b_1}, 0 \right), \qquad (Y_1, Y_2) = \left( \frac{a_2 c_1 - a_1 b_2}{b_1 b_2 - c_1 c_2}, \frac{a_1 c_2 - a_2 b_1}{b_1 b_2 - c_1 c_2} \right). \tag{9.62} $$
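The four equilibrium points of Eq. (9.62) can be checked numerically by inserting them into the right-hand sides of Eq. (9.60). The following is a minimal sketch; the function names and the parameter values are assumptions chosen only for the illustration, and b1, b2, and b1 b2 − c1 c2 are assumed to be nonzero.

```python
def equilibria(a1, b1, c1, a2, b2, c2):
    """Return the four equilibrium points of Eq. (9.62)."""
    det = b1 * b2 - c1 * c2   # assumed nonzero for the fourth point
    return [
        (0.0, 0.0),
        (0.0, -a2 / b2),
        (-a1 / b1, 0.0),
        ((a2 * c1 - a1 * b2) / det, (a1 * c2 - a2 * b1) / det),
    ]

def rhs(y1, y2, a1, b1, c1, a2, b2, c2):
    """Right-hand sides of Eqs. (9.60a) and (9.60b)."""
    return (y1 * (a1 + b1 * y1 + c1 * y2),
            y2 * (a2 + b2 * y2 + c2 * y1))

# Illustrative parameter values (an assumption, not taken from the text).
params = (1.0, -1.0, -0.5, 1.0, -1.0, -0.5)
points = equilibria(*params)
# Each residual pair should vanish up to rounding errors.
residuals = [rhs(Y1, Y2, *params) for Y1, Y2 in points]
for point, residual in zip(points, residuals):
    print(point, residual)
```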
The validity of the first three equilibrium points can be easily seen. The last equilibrium point ensures that both parenthesis terms are equal to zero. The validity of this claim may be proven by using this point in the parenthesis terms of Eqs. (9.61a) and (9.61b), respectively,
$$ a_1 + b_1 y_1 + c_1 y_2 = a_1 + b_1\, \frac{a_2 c_1 - a_1 b_2}{b_1 b_2 - c_1 c_2} + c_1\, \frac{a_1 c_2 - a_2 b_1}{b_1 b_2 - c_1 c_2} = \frac{a_1 (b_1 b_2 - c_1 c_2) + b_1 (a_2 c_1 - a_1 b_2) + c_1 (a_1 c_2 - a_2 b_1)}{b_1 b_2 - c_1 c_2} = 0, \tag{9.63a} $$
$$ a_2 + b_2 y_2 + c_2 y_1 = a_2 + b_2\, \frac{a_1 c_2 - a_2 b_1}{b_1 b_2 - c_1 c_2} + c_2\, \frac{a_2 c_1 - a_1 b_2}{b_1 b_2 - c_1 c_2} = \frac{a_2 (b_1 b_2 - c_1 c_2) + b_2 (a_1 c_2 - a_2 b_1) + c_2 (a_2 c_1 - a_1 b_2)}{b_1 b_2 - c_1 c_2} = 0. \tag{9.63b} $$
Near-Equilibrium Equation System. The next step is to study the behavior of small deviations of the solution from the equilibrium points. Small deviations from the equilibrium points are defined by setting
$$ y_1 = Y_1 + v_1, \qquad y_2 = Y_2 + v_2. \tag{9.64} $$
Here, (Y1, Y2) represent the coordinates of any equilibrium point, and the functions (v1, v2) are small deviations from this equilibrium point. By replacing y1 and y2 in Eq. (9.60) by the latter expressions we obtain
$$
\begin{aligned}
\frac{dv_1}{dt} &= (Y_1 + v_1) \left[ a_1 + b_1 (Y_1 + v_1) + c_1 (Y_2 + v_2) \right] = Y_1 \left( a_1 + b_1 Y_1 + c_1 Y_2 \right) \\
&\quad + v_1 \left[ a_1 + b_1 Y_1 + c_1 Y_2 + b_1 Y_1 \right] + v_2\, c_1 Y_1 + v_1 (b_1 v_1 + c_1 v_2),
\end{aligned}
\tag{9.65a}
$$
$$
\begin{aligned}
\frac{dv_2}{dt} &= (Y_2 + v_2) \left[ a_2 + b_2 (Y_2 + v_2) + c_2 (Y_1 + v_1) \right] = Y_2 \left( a_2 + b_2 Y_2 + c_2 Y_1 \right) \\
&\quad + v_1\, c_2 Y_2 + v_2 \left[ a_2 + b_2 Y_2 + c_2 Y_1 + b_2 Y_2 \right] + v_2 (b_2 v_2 + c_2 v_1).
\end{aligned}
\tag{9.65b}
$$
The first terms on the right-hand sides of these relations are zero because Y1 and Y2 are equilibrium solutions. The last terms are quadratic in v1 and v2. The latter terms can be neglected because v1 and v2 are assumed to be small. In this way, we obtain a linear equation system for v1 and v2,
$$ \frac{dv_1}{dt} = v_1 \left[ a_1 + b_1 Y_1 + c_1 Y_2 + b_1 Y_1 \right] + v_2\, c_1 Y_1, \tag{9.66a} $$
$$ \frac{dv_2}{dt} = v_1\, c_2 Y_2 + v_2 \left[ a_2 + b_2 Y_2 + c_2 Y_1 + b_2 Y_2 \right]. \tag{9.66b} $$
This linear equation system can be solved in terms of the solutions provided in Sect. 9.2.1.
Generalization. The linearization of the nonlinear equation system described in the preceding paragraph can be applied to any nonlinear equation system. The latter fact can be demonstrated by considering the nonlinear system
$$ \frac{dy_1}{dt} = F_1(y_1, y_2), \tag{9.67a} $$
$$ \frac{dy_2}{dt} = F_2(y_1, y_2), \tag{9.67b} $$
where F1 and F2 can be any functions of y1 and y2. The Taylor expansion of F1 and F2 at an equilibrium point (Y1, Y2) is then given by
$$ \frac{dy_1}{dt} = F_1(Y_1, Y_2) + \frac{\partial F_1}{\partial y_1}(Y_1, Y_2)\, (y_1 - Y_1) + \frac{\partial F_1}{\partial y_2}(Y_1, Y_2)\, (y_2 - Y_2), \tag{9.68a} $$
$$ \frac{dy_2}{dt} = F_2(Y_1, Y_2) + \frac{\partial F_2}{\partial y_1}(Y_1, Y_2)\, (y_1 - Y_1) + \frac{\partial F_2}{\partial y_2}(Y_1, Y_2)\, (y_2 - Y_2), \tag{9.68b} $$
where nonlinear powers of y1 − Y1 and y2 − Y2 are neglected (which is justified for sufficiently small deviations from the equilibrium point). The functions F1 and F2 are equal to zero at the equilibrium points, i.e., we have F1(Y1, Y2) = F2(Y1, Y2) = 0.
Hence, we obtain a linear equation system in y1 − Y1 and y2 − Y2,
$$ \frac{d}{dt} \begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \end{pmatrix} = \begin{pmatrix} \dfrac{\partial F_1}{\partial y_1}(Y_1, Y_2) & \dfrac{\partial F_1}{\partial y_2}(Y_1, Y_2) \\ \dfrac{\partial F_2}{\partial y_1}(Y_1, Y_2) & \dfrac{\partial F_2}{\partial y_2}(Y_1, Y_2) \end{pmatrix} \begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \end{pmatrix}. \tag{9.69} $$
By calculating the partial derivatives involved, this equation system can be used to recover the equation system (9.66), where v1 = y1 − Y1 and v2 = y2 − Y2.
Application. Examples for the application of this approach will be discussed in Sect. 9.3. These examples illustrate the benefits of such linear stability analyses. However, such analyses are not always successful. The latter is the case if the linear system that characterizes the system behavior in the neighborhood of an equilibrium point has two pure imaginary eigenvalues such that the trajectories are closed curves (ellipses). In that case, small disturbances given by nonlinear terms generate positive or negative real parts of the complex eigenvalues. Depending on the sign of these real parts, the nonlinear system may be asymptotically stable or unstable. Therefore, the analysis of the corresponding linear system does not allow one to decide in this case whether or not the nonlinear system is asymptotically stable. A procedure for handling this problem will be discussed in Sect. 9.4 where Liapunov's second method is explained.
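The statement that the matrix in Eq. (9.69) recovers the coefficients of Eq. (9.66) can be checked numerically. The sketch below approximates the partial derivatives of the right-hand sides of Eq. (9.60) by central differences and compares them with the coefficients read off from Eq. (9.66). The function names, the step size h, and the parameter values are assumptions made for this illustration.

```python
def rhs(y1, y2, p):
    """F1 and F2 of Eq. (9.67) for the model (9.60); p = (a1, b1, c1, a2, b2, c2)."""
    a1, b1, c1, a2, b2, c2 = p
    return (y1 * (a1 + b1 * y1 + c1 * y2), y2 * (a2 + b2 * y2 + c2 * y1))

def jacobian_fd(y1, y2, p, h=1e-6):
    """Central-difference approximation of the matrix in Eq. (9.69)."""
    f_plus, f_minus = rhs(y1 + h, y2, p), rhs(y1 - h, y2, p)
    g_plus, g_minus = rhs(y1, y2 + h, p), rhs(y1, y2 - h, p)
    return [[(f_plus[0] - f_minus[0]) / (2 * h), (g_plus[0] - g_minus[0]) / (2 * h)],
            [(f_plus[1] - f_minus[1]) / (2 * h), (g_plus[1] - g_minus[1]) / (2 * h)]]

def jacobian_eq_9_66(Y1, Y2, p):
    """Coefficient matrix read off from Eqs. (9.66a) and (9.66b)."""
    a1, b1, c1, a2, b2, c2 = p
    return [[a1 + 2 * b1 * Y1 + c1 * Y2, c1 * Y1],
            [c2 * Y2, a2 + 2 * b2 * Y2 + c2 * Y1]]

# Illustrative parameters and the equilibrium point (0, -a2/b2) = (0, 1).
p = (1.0, -1.0, -0.5, 1.0, -1.0, -0.5)
J_numeric = jacobian_fd(0.0, 1.0, p)
J_formula = jacobian_eq_9_66(0.0, 1.0, p)
print(J_numeric)
print(J_formula)
```

Because the right-hand sides are quadratic, the central differences agree with the exact derivatives up to rounding errors.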
9.3 Population Ecology: Species Interactions
As a first application of the mathematical concepts presented in Sect. 9.2, let us consider the modeling of the multivariate evolution of several populations. This problem will be addressed such that the concepts presented for a single population in Chap. 7 are extended by the consideration of the interaction of several species. The discussions in Chap. 7 showed that there is no unique law of population ecology, but (depending on the definition of the population density function) there are many possibilities for formulating equations for population dynamics. In the following, we will consider modeling approaches that extend the logistic growth model for a single population.
9.3.1 Multivariate Population Dynamics Equations
Multivariate Evolution. The following discussion of some basic features of the multivariate evolution of populations will be based on the nonlinear equation system (9.60), which was analyzed mathematically in Sect. 9.2.3,
$$ \frac{dy_1}{dt} = y_1 \left( a_1 + b_1 y_1 + c_1 y_2 \right), \tag{9.70a} $$
$$ \frac{dy_2}{dt} = y_2 \left( a_2 + b_2 y_2 + c_2 y_1 \right). \tag{9.70b} $$
The structure of this model enables the consideration of a variety of types of species interactions. We will discuss two examples in the following.
Competition for Food. As a first example, let us consider the competition for food by two species that do not prey on each other. An example is given by two species of fish (bluegill and redear) in a pond. The equations considered for this case are given by
$$ \frac{dy_1}{dt} = y_1 \left( |a_1| - |b_1|\, y_1 - |c_1|\, y_2 \right), \tag{9.71a} $$
$$ \frac{dy_2}{dt} = y_2 \left( |a_2| - |b_2|\, y_2 - |c_2|\, y_1 \right). \tag{9.71b} $$
For the case that there is no interaction between species, i.e., c1 = c2 = 0, these equations represent logistic growth models for y1 and y2. The interaction terms that involve c1 and c2 appear here with negative coefficients. In this way, we model the food reduction for one species due to the food consumption of the other species. This model will be analyzed in Sect. 9.3.2.
Predator-Prey Interactions. As a second example, we consider predator-prey interactions (e.g., foxes and rabbits in a closed forest). We assume that y1 refers to the prey, and y2 refers to the predator. The equations for this case are given by
$$ \frac{dy_1}{dt} = y_1 \left( |a_1| - |b_1|\, y_1 - |c_1|\, y_2 \right), \tag{9.72a} $$
$$ \frac{dy_2}{dt} = y_2 \left( -|a_2| + |c_2|\, y_1 \right). \tag{9.72b} $$
The prey equation (9.72a) has the same structure as Eq. (9.71a): we have a logistic model with an interaction term that is proportional to c1. A nonzero c1 accounts for the reduction of prey due to predators. On the other hand, the predator equation (9.72b) differs from Eq. (9.71b). The predator will die out in the absence of the prey. The consideration of a self-limiting factor (i.e., a nonzero b2) does not make sense in this scenario. For a nonzero coefficient c2, the positive last term describes the increase of the predator population due to the consumption of prey. Equations (9.72) represent the famous Lotka-Volterra equations, which are extended here by the consideration of the self-limiting contribution related to the use of a nonzero b1. This model will be analyzed in Sect. 9.3.3.
9.3.2 Competition for Food
Model Considered. For an analysis of the equation system (9.71) it is helpful to consider more specific equations. The parameters b1 and b2 normalize the other model parameters: see Eq. (9.70). Thus, we may set b1 = b2 = −1 (the negative sign is considered according to Eq. (9.71)). The equilibrium values (9.62) show that the second and third equilibrium values of Y1 and Y2 are given then by a1 and a2, respectively. We will assume that a1 = a2 = 1, which corresponds to the consideration of normalized population values. Regarding c1 and c2 we assume that c1 = c2 = −d, where d is a non-negative number. The model considered is then given by
$$ \frac{dy_1}{dt} = y_1 \left( 1 - y_1 - d\, y_2 \right), \tag{9.73a} $$
$$ \frac{dy_2}{dt} = y_2 \left( 1 - y_2 - d\, y_1 \right). \tag{9.73b} $$
The equations look similar, which leads to the question regarding the difference between them. To address this question we calculate the ratio between y2 and y1,
$$
\begin{aligned}
\frac{d}{dt} \frac{y_2}{y_1} &= \frac{1}{y_1} \frac{dy_2}{dt} - \frac{y_2}{y_1^2} \frac{dy_1}{dt} = \frac{y_2}{y_1} \left( 1 - y_2 - d\, y_1 \right) - \frac{y_2}{y_1^2}\, y_1 \left( 1 - y_1 - d\, y_2 \right) \\
&= (1 - d)\, \frac{y_2}{y_1} \left( y_1 - y_2 \right) = (1 - d) \left( 1 - \frac{y_2}{y_1} \right) y_2.
\end{aligned}
\tag{9.74}
$$
This relation shows that y2 / y1 is constant (in particular, we have y2 / y1 = y20 / y10) under two conditions: for d = 1 and for the case that y1 and y2 have the same initial condition. The conclusion for the latter case can be seen, for example, by writing Eq. (9.74) in a discrete formulation. Such a representation shows that there is never a change of y2 / y1. The latter two cases will be considered first because they allow analytical solutions of the nonlinear equation system (9.73).
Equal Initial Values. For the case y20 = y10, Eq. (9.74) implies y2 / y1 = y20 / y10 = 1, which means y2 = y1. According to Eq. (9.73a), y1 is then determined by
$$ \frac{dy_1}{dt} = y_1 \left( 1 - (1 + d)\, y_1 \right). \tag{9.75} $$
This equation is a logistic equation. In particular, this equation corresponds to the logistic equation (7.88) by setting L = 0, τ = 1, and K = 1 / (1 + d). According to the solution (7.101) of the logistic equation, the solution of Eq. (9.75) is given by
$$ y_1 = \frac{1 / (1 + d)}{1 - \left( 1 - \dfrac{1 / (1 + d)}{y_{10}} \right) e^{-t}}. \tag{9.76} $$
The equilibrium solution of Eq. (9.76) is given by Y1 = 1 / (1 + d). According to y2 = y1, the equilibrium point for this case is given by
$$ (Y_1, Y_2) = \left( \frac{1}{1 + d}, \frac{1}{1 + d} \right). \tag{9.77} $$
Equal Competition. We also have a proportionality y2 / y1 = y20 / y10 for d = 1 where we have an equal competition (because the parenthesis terms in Eqs. (9.73) are equal). According to Eq. (9.73a), y1 is then determined by the equation
$$ \frac{dy_1}{dt} = y_1 \left( 1 - \left[ 1 + \frac{y_{20}}{y_{10}} \right] y_1 \right). \tag{9.78} $$
The logistic equation (7.88) corresponds to the latter equation if L = 0, τ = 1, and K = 1 / (1 + y20 / y10). By using the solution (7.101) of the logistic equation, we find the solution of Eq. (9.78) to be given by
$$ y_1 = \frac{1 / (1 + y_{20} / y_{10})}{1 - \left( 1 - \dfrac{1 / (1 + y_{20} / y_{10})}{y_{10}} \right) e^{-t}}. \tag{9.79} $$
The equilibrium solution that is implied by this expression is Y1 = 1 / (1 + y20 / y10). According to y2 = y1 y20 / y10, the equilibrium point for this case is
$$ (Y_1, Y_2) = \left( \frac{1}{1 + y_{20} / y_{10}}, \frac{y_{20} / y_{10}}{1 + y_{20} / y_{10}} \right) = \left( \frac{y_{10}}{y_{10} + y_{20}}, \frac{y_{20}}{y_{10} + y_{20}} \right). \tag{9.80} $$
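The equal-competition results can be checked by integrating the model (9.73) numerically for d = 1. The sketch below uses a classical fourth-order Runge-Kutta scheme (an assumption; the text does not name a numerical method) with the initial values y10 = 0.2 and y20 = 0.1 quoted in the caption of Fig. 9.3, and compares the result with Eqs. (9.79) and (9.80). The function names and step counts are illustrative.

```python
import math

def rhs(y1, y2, d):
    """Right-hand sides of Eqs. (9.73a) and (9.73b)."""
    return (y1 * (1 - y1 - d * y2), y2 * (1 - y2 - d * y1))

def integrate(y10, y20, d, t_end, n_steps):
    """Classical fourth-order Runge-Kutta integration from t = 0 to t_end."""
    h = t_end / n_steps
    y1, y2 = y10, y20
    for _ in range(n_steps):
        k1 = rhs(y1, y2, d)
        k2 = rhs(y1 + 0.5 * h * k1[0], y2 + 0.5 * h * k1[1], d)
        k3 = rhs(y1 + 0.5 * h * k2[0], y2 + 0.5 * h * k2[1], d)
        k4 = rhs(y1 + h * k3[0], y2 + h * k3[1], d)
        y1 += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        y2 += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return y1, y2

def y1_equal_competition(y10, y20, t):
    """Analytical solution (9.79), valid for d = 1."""
    K = 1 / (1 + y20 / y10)
    return K / (1 - (1 - K / y10) * math.exp(-t))

y10, y20 = 0.2, 0.1
y1_num, y2_num = integrate(y10, y20, 1.0, 3.0, 3000)
y1_late, y2_late = integrate(y10, y20, 1.0, 40.0, 4000)
print(y1_num, y1_equal_competition(y10, y20, 3.0))
print(y1_late, y2_late)   # compare with Eq. (9.80): (2/3, 1/3)
```

For these initial values, Eq. (9.80) gives the equilibrium point (2/3, 1/3), and the ratio y2 / y1 should stay at 0.5 throughout.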
Linear Stability Analysis. We have to use linear stability analysis to study the behavior of the nonlinear equation system (9.73) for the cases of unequal initial values and d ≠ 1. From Eqs. (9.62), the equilibrium values of this model are
$$ (Y_1, Y_2) = (0, 0), \qquad (Y_1, Y_2) = (0, 1), \qquad (Y_1, Y_2) = (1, 0), \qquad (Y_1, Y_2) = \left( \frac{1 - d}{1 - d^2}, \frac{1 - d}{1 - d^2} \right) = \left( \frac{1}{1 + d}, \frac{1}{1 + d} \right). \tag{9.81} $$
Therefore, there are four potential equilibrium states: both species will disappear, only one of the species will survive, or there is a coexistence of both species. For a non-negative d we find that the Y1 and Y2 values are bounded by zero and one, i.e., 0 ≤ Y1 ≤ 1 and 0 ≤ Y2 ≤ 1. In the following, we will analyze the solution behavior in the vicinity of the four equilibrium points (9.81) by making use of the linear stability analysis approach presented in Sect. 9.2.
• (Y1, Y2) = (0, 0): the linear equation system (9.69) reads for this case
$$ \frac{d}{dt} \begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \end{pmatrix}. \tag{9.82} $$
According to Eqs. (9.12), we have for this case the eigenvalues r1 ï€½ 1,
r2 ï€½ 1,
(9.83)
Thus, the solution is unstable at the equilibrium point (Y1, Y2) = (0, 0). ï‚· (Y1, Y2) = (0, 1): the linear equation system (9.69) reads for this case d ïƒ¦ y1 ï€ Y1 ïƒ¶ ïƒ¦1 ï€ d ïƒ·ï€½ïƒ§ ïƒ§ dt ïƒ§ïƒ¨ y 2 ï€ Y2 ïƒ·ïƒ¸ ïƒ§ïƒ¨ ï€ d
0 ïƒ¶ïƒ¦ y1 ï€ Y1 ïƒ¶ ïƒ·. ïƒ·ïƒ§ ï€ 1ïƒ·ïƒ¸ïƒ§ïƒ¨ y 2 ï€ Y2 ïƒ·ïƒ¸
(9.84)
The eigenvalues are given by r1 ï€½ ï€
d d ï€« 1ï€ , 2 2
r2 ï€½ ï€
d d ï€ 1ï€ . 2 2
(9.85)
For the case 1 ï€ d / 2 > 0, the eigenvalues are given by r1 = 1 ï€ d and r2 = ï€1. For the case 1 ï€ d / 2 < 0, the eigenvalues are given by r1 = ï€1 and r2 = 1 ï€ d. It is up to us which eigenvalue we called r1 and r2. Thus, we can use r1 ï€½ 1 ï€ d ,
r2 ï€½ ï€1.
(9.86)
Depending on the value of d, the eigenvalue r1 can be negative or positive. Thus, the solution can be asymptotically stable or unstable at this equilibrium point.

• (Y1, Y2) = (1, 0): the linear equation system (9.69) reads for this case

$$\frac{d}{dt}\begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \end{pmatrix} = \begin{pmatrix} -1 & -d \\ 0 & 1 - d \end{pmatrix}\begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \end{pmatrix}. \tag{9.87}$$

The eigenvalues are given by

$$r_1 = -\frac{d}{2} + \left|1 - \frac{d}{2}\right|, \qquad r_2 = -\frac{d}{2} - \left|1 - \frac{d}{2}\right|. \tag{9.88}$$

These are the same eigenvalues as found for (Y1, Y2) = (0, 1). Correspondingly, the eigenvalues are again given by

$$r_1 = 1 - d, \qquad r_2 = -1, \tag{9.89}$$
i.e., the system behavior is the same as in the vicinity of (Y1, Y2) = (0, 1).

• (Y1, Y2) = (Y, Y), where Y = 1 / (1 + d): the equation system (9.69) now reads

$$\frac{d}{dt}\begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \end{pmatrix} = \begin{pmatrix} 1 - (2 + d)Y & -dY \\ -dY & 1 - (2 + d)Y \end{pmatrix}\begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \end{pmatrix}. \tag{9.90}$$

The eigenvalues are given by

$$\begin{aligned} r_1 &= 1 - (2 + d)Y + dY = 1 - 2Y = \frac{1 + d - 2}{1 + d} = \frac{-1 + d}{1 + d}, \\ r_2 &= 1 - (2 + d)Y - dY = 1 - 2(1 + d)Y = 1 - 2 = -1. \end{aligned} \tag{9.91}$$
Fig. 9.3. The competition of two species y1 and y2 for food. The solutions y1 and y2 of the equation system (9.73) are shown in (a), (b), and (c) as functions of time t for d = (0.5, 1, 1.5), respectively. The initial conditions y10 = 0.2 and y20 = 0.1 are used. The phase plane evolution of y1 and y2 is shown for several initial conditions in (d), (e), and (f) for the cases d = (0.5, 1, 1.5), respectively. The dashed line in (e) is not a realizable trajectory: it only gives an orientation regarding the equilibrium points for this case.
As given for the equilibrium point (Y1, Y2) = (0, 1), the eigenvalue r1 can be negative or positive depending on the value of d. Consequently, the solution can be asymptotically stable or unstable at this equilibrium point. Illustration. We will assume that there are two populations initially such that the initial values y10 and y20 are nonzero. Then, the equilibrium point (0, 0), which is characterized by two positive eigenvalues, can never be realized. Hence, it is impossible that both populations disappear. For equal initial values y20 = y10 we find the solution (9.76) for y1 and we have y2 = y1. The equilibrium solution is (Y1, Y2) = (1 / (1 + d), 1 / (1 + d)). For unequal initial values we find features that are illustrated in Fig. 9.3 for d = (0.5, 1, 1.5) and several initial conditions. For a relatively weak competition (d < 1), we find the development of a coexistence (Y1, Y2) = (1 / (1 + d), 1 / (1 + d)) between both species. This result agrees with the conclusions of linear stability analysis: the equilibrium points (0, 1) and (1, 0) are
characterized by one positive eigenvalue, whereas the coexistence point (Y, Y) is characterized by two negative eigenvalues. The equal competition case with d = 1 does still allow a coexistence of species, but the initial values matter in this case: the species with the higher initial value will have a higher equilibrium value (see Fig. 9.3b). The solution for y1 is given for this case by the logistic function (9.79), and we have y2 = y1 y20 / y10. The equilibrium point is given by Eq. (9.80). For a relatively strong competition (d > 1) we find that one species disappears whereas the other species achieves a maximum value. In particular, we find that the species with the higher initial value will survive. This observation is also supported by linear stability analysis: the coexistence point (Y, Y) has one positive eigenvalue, and the equilibrium points (0, 1) and (1, 0) have two negative eigenvalues.
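The stability conclusions for the competition model can be checked numerically. The following minimal Python sketch assumes that Eq. (9.73) has the form dy1/dt = y1 (1 − y1 − d y2), dy2/dt = y2 (1 − y2 − d y1), which is the form consistent with the matrices in Eqs. (9.82), (9.84), (9.87), and (9.90). The function names and the use of a classical fourth-order Runge-Kutta scheme are illustrative choices and not part of the text.

```python
import math

def rhs(y, d):
    """Right-hand side of the competition system (assumed form of Eq. (9.73))."""
    y1, y2 = y
    return (y1 * (1.0 - y1 - d * y2), y2 * (1.0 - y2 - d * y1))

def jacobian_eigenvalues(Y1, Y2, d):
    """Eigenvalues r_S +/- r_D of the linearized system at the point (Y1, Y2)."""
    a11, a12 = 1.0 - 2.0 * Y1 - d * Y2, -d * Y1
    a21, a22 = -d * Y2, 1.0 - 2.0 * Y2 - d * Y1
    r_s = 0.5 * (a11 + a22)
    disc = r_s * r_s - (a11 * a22 - a12 * a21)
    # The discriminant is non-negative at all four equilibrium points;
    # max() only guards against tiny negative rounding errors.
    r_d = math.sqrt(max(disc, 0.0))
    return (r_s + r_d, r_s - r_d)

def rk4_solve(y0, d, t_end, dt=0.01):
    """Integrate from t = 0 to t_end with classical Runge-Kutta steps."""
    y = tuple(y0)
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(y, d)
        k2 = rhs((y[0] + 0.5 * dt * k1[0], y[1] + 0.5 * dt * k1[1]), d)
        k3 = rhs((y[0] + 0.5 * dt * k2[0], y[1] + 0.5 * dt * k2[1]), d)
        k4 = rhs((y[0] + dt * k3[0], y[1] + dt * k3[1]), d)
        y = (y[0] + dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0,
             y[1] + dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0)
    return y

for d in (0.5, 1.5):
    Y = 1.0 / (1.0 + d)
    print("d =", d)
    for point in ((0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (Y, Y)):
        print("  eigenvalues at", point, ":", jacobian_eigenvalues(point[0], point[1], d))
    # Initial values of Fig. 9.3: y10 = 0.2, y20 = 0.1
    print("  long-time state:", rk4_solve((0.2, 0.1), d, t_end=100.0))
```

Under these assumptions, the weak-competition run (d = 0.5) should settle at the coexistence point (1 / (1 + d), 1 / (1 + d)), whereas the strong-competition run (d = 1.5) should settle at (1, 0), the state in which only the species with the higher initial value survives.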
9.3.3 Predator-Prey Interaction

Model Considered. The predator-prey equations (9.72) will be also analyzed by considering more specific equations. In correspondence to Eq. (9.73) we apply a1 = 1 and a2 = −1. We assume an equal amount of interaction by setting c1 = −4 and c2 = 4. In addition, we assume that b1 = −e, where e is a non-negative parameter. With these assumptions, the equation system considered is given by

$$\frac{dy_1}{dt} = y_1\,(1 - e\,y_1 - 4 y_2), \tag{9.92a}$$

$$\frac{dy_2}{dt} = y_2\,(-1 + 4 y_1). \tag{9.92b}$$

The specific relevance of e variations can be seen by considering

$$\frac{dy_2}{dy_1} = \frac{dy_2 / dt}{dy_1 / dt} = \frac{y_2}{y_1}\,\frac{-1 + 4 y_1}{1 - e\,y_1 - 4 y_2}. \tag{9.93}$$
This equation is a separable equation for e = 0, which means that this equation can be solved.

Zero Self-Limitation. First, let us find the analytical solution to the nonlinear equation system (9.92) for the case e = 0. Relation (9.93) can be written then

$$\frac{1 - 4 y_2}{y_2}\,dy_2 - \frac{-1 + 4 y_1}{y_1}\,dy_1 = 0. \tag{9.94}$$

The integration of both sides provides

$$\ln|y_2| - 4 y_2 + \ln|y_1| - 4 y_1 = C. \tag{9.95}$$
Here, C is a constant that is determined through the initial conditions (by setting t = 0 on the left-hand side). Unfortunately, it is impossible to use this relation for the calculation of y2 as an explicit function of y1. The relevance of Eq. (9.95) is that this relation describes a closed curve (see, for example, the illustration of this case in Fig. 9.4d). The existence of a closed curve means that we have a stable solution for e = 0. Hence, nonzero e values describe deviations from a stable state. Interestingly, a closed curve is also found for any other parameter values than those used in Eq. (9.92), provided these parameters have the same signs and e = 0.

Linear Stability Analysis. We again have to use linear stability analysis to understand the behavior of the nonlinear equation system (9.92) for the case e ≠ 0. According to Eq. (9.62), there are three potential equilibrium points for this system (the equilibrium point (0, −a2 / b2) in Eq. (9.62) cannot be realized),

$$(Y_1, Y_2) = (0, 0), \qquad (Y_1, Y_2) = \left(\frac{1}{e},\, 0\right), \qquad (Y_1, Y_2) = \left(\frac{4}{16},\, \frac{4 - e}{16}\right) = \left(\frac{1}{4},\, \frac{1}{4}\left(1 - \frac{e}{4}\right)\right). \tag{9.96}$$

Correspondingly, it is possible that both species will be extinct, or only the prey survives, or there is a coexistence of both species. The equilibrium values Y1 and Y2 of the coexistence point are non-negative if e ≤ 4. The setting e = 4 does recover the second equilibrium point (1 / 4, 0). For simplicity, we do not consider this case e = 4, that is, we consider variations 0 ≤ e < 4. The solution behavior of Eqs. (9.92) in the vicinity of the three equilibrium points (Y1, Y2) reveals the following features:

• (Y1, Y2) = (0, 0): the linear equation system (9.69) reads for this case

$$\frac{d}{dt}\begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \end{pmatrix}. \tag{9.97}$$
According to Eqs. (9.12), the eigenvalues are given by

$$r_1 = 1, \qquad r_2 = -1. \tag{9.98}$$

Correspondingly, the solution is unstable at the equilibrium point (Y1, Y2) = (0, 0).

• (Y1, Y2) = (1 / e, 0): the linear equation system (9.69) reads for this case

$$\frac{d}{dt}\begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \end{pmatrix} = \begin{pmatrix} -1 & -\dfrac{4}{e} \\ 0 & -1 + \dfrac{4}{e} \end{pmatrix}\begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \end{pmatrix}. \tag{9.99}$$

The eigenvalues are provided by

$$r_1 = -1 + \frac{2}{e} + \frac{2}{e} = -1 + \frac{4}{e}, \qquad r_2 = -1 + \frac{2}{e} - \frac{2}{e} = -1. \tag{9.100}$$
The eigenvalue r1 is positive due to the condition 0 ≤ e < 4. Thus, the solution is unstable at the equilibrium point (Y1, Y2) = (1 / e, 0).

• (Y1, Y2) = (1/4, [1 − e / 4] / 4): the equation system (9.69) now reads

$$\frac{d}{dt}\begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \end{pmatrix} = \begin{pmatrix} 1 - 2 e Y_1 - 4 Y_2 & -4 Y_1 \\ 4 Y_2 & -1 + 4 Y_1 \end{pmatrix}\begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \end{pmatrix}. \tag{9.101}$$
By adopting the definitions of Y1 and Y2 the latter equation system can be written in the following way,

$$\frac{d}{dt}\begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \end{pmatrix} = \begin{pmatrix} 1 - \dfrac{e}{2} - \left(1 - \dfrac{e}{4}\right) & -1 \\ 1 - \dfrac{e}{4} & 0 \end{pmatrix}\begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \end{pmatrix} = \begin{pmatrix} -\dfrac{e}{4} & -1 \\ 1 - \dfrac{e}{4} & 0 \end{pmatrix}\begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \end{pmatrix}. \tag{9.102}$$

Therefore, the eigenvalues are given by

$$\begin{aligned} r_1 &= -\frac{e}{8} + \frac{1}{2}\sqrt{\frac{e^2}{16} - 4\left(1 - \frac{e}{4}\right)} = -\frac{1}{2}\left(\frac{e}{4} - \sqrt{\left(\frac{e}{4} + 2\right)^2 - 8}\right), \\ r_2 &= -\frac{e}{8} - \frac{1}{2}\sqrt{\frac{e^2}{16} - 4\left(1 - \frac{e}{4}\right)} = -\frac{1}{2}\left(\frac{e}{4} + \sqrt{\left(\frac{e}{4} + 2\right)^2 - 8}\right). \end{aligned} \tag{9.103}$$
Depending on the e variation, 0 ≤ e / 4 < 1, there are three cases of eigenvalues. A first case is given for e = 0, which means that we have two complex eigenvalues with zero real parts. As discussed above, this case corresponds to a stable solution. For a nonzero e, we have to distinguish the cases for which the square root is real or imaginary. The square root becomes zero for $e / 4 = \pm 8^{1/2} - 2$. Due to the variation 0 ≤ e / 4 < 1 considered, only the value $e / 4 = 8^{1/2} - 2 = 0.8284$ can be realized. Correspondingly, we may have two cases in addition to the case e = 0. For the case 0 < e / 4 < 0.8284, we have two complex eigenvalues with negative real part. Thus, the solution is asymptotically stable. For the case 0.8284 < e / 4 < 1 we have a real square root. The eigenvalues r1 and r2 are always negative for this case. Hence, the solution is again asymptotically stable.

Illustration. An illustration of solutions of the equation system (9.92) is given in Fig. 9.4 for the cases e / 4 = (0, 0.6, 0.9). We consider nonzero initial population densities y10 and y20. This assumption implies that the equilibrium solutions (0, 0) and (1 / e, 0), which are both characterized by one positive eigenvalue, can never be realized. Consequently, no population will disappear, which means that there will be a coexistence between both populations. For a zero self-limitation (Fig. 9.4a, d), we observe cyclic variations of the predator and prey populations: a decrease (increase) of prey leads after a delay time to an increase (decrease) of predators. An equilibrium state cannot be established in this way. In the phase plane, the latter behavior corresponds to a closed curve that surrounds the center.
Fig. 9.4. Interactions between a predator y2 and prey y1. The solutions y1 and y2 of the equations (9.92) are shown in (a), (b), and (c) as function of time t for the cases e / 4 = (0, 0.6, 0.9), respectively, where y10 = y20 = 0.5. The corresponding phase plane evolution of y1 and y2 is shown for several initial conditions in (d), (e), and (f), where e / 4 = (0, 0.6, 0.9), respectively. The dots show the equilibrium points.
The y1-y2 curves follow Eq. (9.95), and the center is located at (1/4, 1/4), see Eq. (9.96). For a nonzero but relatively weak self-limitation 0 < e / 4 < 0.8284 (see Fig. 9.4b, e), the prey curve shows oscillations that are damped out due to the self-limitation (which appears as a sink term in Eq. (9.92a)). Due to the coupling with y1, the predator curve also shows damped oscillations. The damping implies in the phase plane an asymptotically stable solution. For the case of a relatively strong self-limitation 0.8284 < e / 4 < 1 (see Fig. 9.4c, f), the damping does not allow oscillations anymore. After the first minimum (maximum), y1 (y2) realizes the equilibrium value. It is interesting to see that the increasing damping reduces the Y2 coordinate of the coexistence point (Y1, Y2) = (1/4, [1 − e / 4] / 4), whereas the Y1 coordinate is unaffected. The latter fact is a consequence of Eq. (9.92b), which fixes the stationary value Y1 = 1/4. A modification of this equation (e.g., by the addition of a positive term proportional to y2 in the parenthesis term) would lead to different features.
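The three eigenvalue cases and the behavior of the solutions can be reproduced with a short computation. The following Python sketch evaluates Eq. (9.103) for the values e / 4 = (0, 0.6, 0.9) used in Fig. 9.4, checks that the left-hand side of Eq. (9.95) stays constant along a numerical solution for e = 0, and follows a solution for e / 4 = 0.6 to its equilibrium. The function names, step size, and the classical Runge-Kutta scheme are assumptions made for illustration only.

```python
import cmath
import math

def rhs(y, e):
    """Right-hand side of Eqs. (9.92a) and (9.92b)."""
    y1, y2 = y
    return (y1 * (1.0 - e * y1 - 4.0 * y2), y2 * (-1.0 + 4.0 * y1))

def coexistence_eigenvalues(e):
    """Eigenvalues (9.103) at the coexistence point (1/4, [1 - e/4]/4)."""
    root = cmath.sqrt((e / 4.0 + 2.0) ** 2 - 8.0)   # may be imaginary
    return (-0.5 * (e / 4.0 - root), -0.5 * (e / 4.0 + root))

def invariant(y):
    """Left-hand side of Eq. (9.95) for positive y1 and y2."""
    return math.log(y[1]) - 4.0 * y[1] + math.log(y[0]) - 4.0 * y[0]

def rk4_solve(y0, e, t_end, dt=0.005):
    """Integrate Eqs. (9.92) from t = 0 to t_end with classical Runge-Kutta steps."""
    y = tuple(y0)
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(y, e)
        k2 = rhs((y[0] + 0.5 * dt * k1[0], y[1] + 0.5 * dt * k1[1]), e)
        k3 = rhs((y[0] + 0.5 * dt * k2[0], y[1] + 0.5 * dt * k2[1]), e)
        k4 = rhs((y[0] + dt * k3[0], y[1] + dt * k3[1]), e)
        y = (y[0] + dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0,
             y[1] + dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0)
    return y

for ratio in (0.0, 0.6, 0.9):           # values of e / 4 used in Fig. 9.4
    print("e/4 =", ratio, "eigenvalues:", coexistence_eigenvalues(4.0 * ratio))

start = (0.5, 0.5)                      # initial values of Fig. 9.4a-c
end = rk4_solve(start, 0.0, t_end=20.0)
print("change of the invariant for e = 0:", invariant(end) - invariant(start))
print("long-time state for e/4 = 0.6:", rk4_solve(start, 2.4, t_end=100.0))
```

For e = 0 the eigenvalues should be purely imaginary and the invariant should change only at the level of the integration error. For e / 4 = 0.6 the solution should approach the coexistence point (1/4, [1 − e / 4] / 4) = (0.25, 0.1).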
Fig. 9.5. An illustration of a pendulum.
9.4 Mechanical Motions: The Pendulum

Let us consider next evolution principles for vector processes in mechanics. In continuation of the explanation of laws for mechanical processes in Chap. 7 we will focus the discussion in this chapter on the application of Newton’s Laws of Motion. In particular, we will extend the discussion of one-dimensional harmonic oscillator motions in Chap. 7 by considering now the motions of a pendulum. The results obtained in this way were used in Chap. 3 regarding the discussion of the measurement of time.

9.4.1 Pendulum Equations

Newton’s Laws of Motion. Contrary to population ecology we have a sound mathematical basis for the modeling of mechanical processes given by Newton’s Laws of Motion, which were discussed in Chap. 7. Mechanical motions of macroscopic bodies that move with velocities much smaller than the speed of light can be described by Newton’s Second Law given by Eq. (7.33),

$$\frac{d^2 \mathbf{x}}{dt^2} = \frac{\mathbf{F}}{m}. \tag{9.104}$$
Here, x = (x1, x2, x3) is the position vector of any body, F = (F1, F2, F3) is the force acting on the body, and m is the mass of the body. The use of this equation for the calculation of pendulum motions will be demonstrated in the following.

Undamped Pendulum Equation. An illustration of the pendulum considered is given in Fig. 9.5. A mass m is attached to one end of a rigid, but weightless, supported rod of length r. The rod is free to rotate in one plane. The angle α(t) is the angle of displacement from the vertical. The force that drives the pendulum is given by the gravity force Fg = m g. Here, g denotes the gravity acceleration. Other forces are not involved regarding the undamped pendulum motion. The x-y coordinate system applied is shown in Fig. 9.5. In correspondence to the analysis of the spring-mass system we assume that the downward direction y is the positive direction. The equations that govern the motion of the pendulum (the changes of the x(t) and y(t) coordinates of the pendulum) are given by Newton’s Second Law (9.104),

$$\frac{d^2 x}{dt^2} = 0, \tag{9.105a}$$

$$\frac{d^2 y}{dt^2} = \frac{F_g}{m} = g. \tag{9.105b}$$
To take advantage of the fact that the pendulum moves along a circle with constant radius r, it is helpful to switch to polar coordinates given by the radius r and the angle of displacement α. According to the illustration in Fig. 9.5, the relations that relate (x, y) and (r, α) are given by

$$\sin\alpha = \frac{x}{r}, \qquad \cos\alpha = \frac{y}{r}. \tag{9.106}$$

The first-order and second-order derivatives of x(t) and y(t) that are implied by these relations are given by (r is constant)

$$\frac{dx}{dt} = r\cos\alpha\,\frac{d\alpha}{dt}, \qquad \frac{d^2 x}{dt^2} = -r\sin\alpha\left(\frac{d\alpha}{dt}\right)^2 + r\cos\alpha\,\frac{d^2\alpha}{dt^2}, \tag{9.107a}$$

$$\frac{dy}{dt} = -r\sin\alpha\,\frac{d\alpha}{dt}, \qquad \frac{d^2 y}{dt^2} = -r\cos\alpha\left(\frac{d\alpha}{dt}\right)^2 - r\sin\alpha\,\frac{d^2\alpha}{dt^2}. \tag{9.107b}$$
The use of these relations in Newton’s Second Law equations (9.105) then implies

$$-r\sin\alpha\left(\frac{d\alpha}{dt}\right)^2 + r\cos\alpha\,\frac{d^2\alpha}{dt^2} = 0, \tag{9.108a}$$

$$-r\cos\alpha\left(\frac{d\alpha}{dt}\right)^2 - r\sin\alpha\,\frac{d^2\alpha}{dt^2} = g. \tag{9.108b}$$

Equation (9.108a) can be used to replace the quadratic first-order derivative by the second-order derivative of α,

$$\left(\frac{d\alpha}{dt}\right)^2 = \frac{\cos\alpha}{\sin\alpha}\,\frac{d^2\alpha}{dt^2}. \tag{9.109}$$

The use of this relation in Eq. (9.108b) leads then to

$$\left[\cos\alpha\,\frac{\cos\alpha}{\sin\alpha} + \sin\alpha\right]\frac{d^2\alpha}{dt^2} = \frac{\cos^2\alpha + \sin^2\alpha}{\sin\alpha}\,\frac{d^2\alpha}{dt^2} = \frac{1}{\sin\alpha}\,\frac{d^2\alpha}{dt^2} = -\frac{g}{r}, \tag{9.110}$$
where the Pythagorean identity was applied. Hence, the equation of motion for the undamped pendulum reads

$$\frac{d^2\alpha}{dt^2} = -\frac{g}{r}\sin\alpha. \tag{9.111}$$

Damped Pendulum Equation. In general, the pendulum will be also affected by a damping force, which reduces the pendulum velocity due to the air resistance. In correspondence to the analysis of the spring-mass system we assume that this damping force is proportional to the pendulum velocity dα / dt. Hence, we extend Eq. (9.111) in the following way,

$$\frac{d^2\alpha}{dt^2} = -\frac{1}{\tau}\,\frac{d\alpha}{dt} - \frac{g}{r}\sin\alpha. \tag{9.112}$$
For the characteristic damping time scale we use Stokes’ Law (see Sect. 3.3.3),

$$\tau = \frac{m}{6\pi\mu\,r_P}. \tag{9.113}$$

Here, μ refers to the dynamic viscosity, and rP is the radius of the spherical mass. The damping contribution appears in Eq. (9.112) with a negative sign because this term reduces the pendulum velocity dα / dt (the damping term implies that dα / dt becomes smaller for a positive dα / dt). As in the discussion of damping in the spring-mass system, the structure of the damping term applied here represents only one reasonable assumption among several possible choices. The pendulum equation that results from the use of Eq. (9.113) then reads

$$\frac{d^2\alpha}{dt^2} + \frac{6\pi\mu\,r_P}{m}\,\frac{d\alpha}{dt} + \frac{g}{r}\sin\alpha = 0. \tag{9.114}$$
Normalized Damped Pendulum Equation. The use of nondimensional variables is helpful because the number of model parameters involved in the equation can be reduced. We introduce the nondimensional time $t_* = t / (r / g)^{1/2}$, such that Eq. (9.114) reads

$$\frac{d^2\alpha}{dt_*^2} + \frac{6\pi\mu\,r_P}{m}\sqrt{\frac{r}{g}}\,\frac{d\alpha}{dt_*} + \sin\alpha = 0. \tag{9.115}$$

By introducing the nondimensional dynamic viscosity $\mu_* = \mu\,(r^3 / g)^{1/2} / m$, which can be seen as an inverse Reynolds number, this equation can be written

$$\frac{d^2\alpha}{dt_*^2} + 6\pi\mu_*\,\frac{r_P}{r}\,\frac{d\alpha}{dt_*} + \sin\alpha = 0. \tag{9.116}$$
We introduce the nondimensional variable d = 6 π rP / r to simplify the writing of this equation. Then, the damped pendulum equation, which does now only depend on one parameter (the product d μ*), is given by

$$\frac{d^2\alpha}{dt_*^2} + d\,\mu_*\,\frac{d\alpha}{dt_*} + \sin\alpha = 0. \tag{9.117}$$
Nonlinear Equation System. The second-order differential equation (9.117) can also be represented as an equation system. We set y1 = α and y2 = dα / dt*. The differentiation of y1 and y2 provides then

$$\frac{dy_1}{dt_*} = \frac{d\alpha}{dt_*} = y_2, \tag{9.118a}$$

$$\frac{dy_2}{dt_*} = \frac{d^2\alpha}{dt_*^2} = -d\,\mu_*\,\frac{d\alpha}{dt_*} - \sin\alpha = -d\,\mu_*\,y_2 - \sin y_1. \tag{9.118b}$$

The initial values for y1 and y2 are given by y10 = α(0) and y20 = dα / dt*(0).
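The first-order form (9.118) is convenient for numerical integration. The following minimal Python sketch integrates the system with classical Runge-Kutta steps; the names `pendulum_rhs`, `rk4_solve`, and `damping` (which stands for the product d μ*) are hypothetical and chosen for illustration. As a plausibility check it uses a small initial angle: without damping, sin α ≈ α, so the motion is close to a harmonic oscillation with period 2π in the nondimensional time t*.

```python
import math

def pendulum_rhs(y, damping):
    """Eqs. (9.118a) and (9.118b); `damping` stands for the product d * mu_star."""
    y1, y2 = y
    return (y2, -damping * y2 - math.sin(y1))

def rk4_solve(y0, damping, t_end, n_steps):
    """Integrate from t* = 0 to t* = t_end with n_steps classical Runge-Kutta steps."""
    dt = t_end / n_steps
    y = tuple(y0)
    for _ in range(n_steps):
        k1 = pendulum_rhs(y, damping)
        k2 = pendulum_rhs((y[0] + 0.5 * dt * k1[0], y[1] + 0.5 * dt * k1[1]), damping)
        k3 = pendulum_rhs((y[0] + 0.5 * dt * k2[0], y[1] + 0.5 * dt * k2[1]), damping)
        k4 = pendulum_rhs((y[0] + dt * k3[0], y[1] + dt * k3[1]), damping)
        y = (y[0] + dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0,
             y[1] + dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0)
    return y

# Small initial angle, no damping: after t* = 2*pi the state is close to the start.
print(rk4_solve((0.01, 0.0), 0.0, 2.0 * math.pi, 2000))
# Same start with d * mu_star = 0.2: the angle has decayed after the same time.
print(rk4_solve((0.01, 0.0), 0.2, 2.0 * math.pi, 2000))
```

For larger initial angles the undamped period is longer than 2π, so this check is only meaningful in the small-angle limit.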
9.4.2 Linear Stability Analysis

Let us analyze the nonlinear equation system (9.118) by adopting the linear stability analysis approach described in Sect. 9.2.3.

Equilibrium Points. First, we have to determine the equilibrium solutions of Eq. (9.118). Such equilibrium solutions have to satisfy the equations

$$0 = y_2, \tag{9.119a}$$

$$0 = -d\,\mu_*\,y_2 - \sin y_1. \tag{9.119b}$$

These two equations are solved by y1 = ± n π and y2 = 0, where n = 0, 1, 2, …. However, there is no need to consider all these equilibrium points. We do only have to consider the two physical equilibrium solutions (0, 0) and (π, 0). Due to physical reasons we expect that the first equilibrium point (0, 0) is asymptotically stable and the second equilibrium point (π, 0) is unstable.

First Equilibrium Point: Linear Stability Analysis. According to Eq. (9.69), the linear equation system that describes the pendulum motion close to the first equilibrium point (0, 0) reads

$$\frac{d}{dt_*}\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & -d\,\mu_* \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}. \tag{9.120}$$

By differentiating dy1 / dt* = y2 and using dy2 / dt* = −y1 − d μ* y2, we find in terms of the original variables y1 = α and y2 = dα / dt* the equation

$$\frac{d^2\alpha}{dt_*^2} = -\alpha - d\,\mu_*\,\frac{d\alpha}{dt_*}. \tag{9.121}$$
Hence, the analysis of this case corresponds to the consideration of the linearized pendulum equation (9.117) where sin α is approximated by α, which is justified for sufficiently small initial angles of displacement. The equation system (9.120) represents a specific case of the linear equation system (9.2). The eigenvalues of Eq. (9.120) are, therefore, given by Eq. (9.12),

$$r_1 = r_S + r_D, \qquad r_2 = r_S - r_D, \tag{9.122}$$

where rS and rD are given by

$$r_S = -d\,\mu_* / 2, \qquad r_D = \sqrt{(d\,\mu_* / 2)^2 - 1}. \tag{9.123}$$

The effect of damping is relatively small in general. Therefore, we may assume that d μ* < 2. For this case we find rD = i rD*, where the real number rD* is

$$r_{D*} = \sqrt{1 - (d\,\mu_* / 2)^2}. \tag{9.124}$$
Both eigenvalues have a negative real part rS if there is a nonzero damping, i.e., d μ* ≠ 0. The discussion at the end of Sect. 9.2.2 showed that such a system is asymptotically stable. The solution to the linear equation system (9.120) can be found by making use of the fact that Eq. (9.121) represents a specific case of the homogeneous linear second-order differential equation (7.45). According to Eq. (7.69), the solution of Eq. (9.121) is then given by

$$\begin{aligned} \alpha(t_*) &= e^{r_S t_*}\left\{\frac{\sin(r_{D*} t_*)}{r_{D*}}\,\alpha'_0 + \left[\cos(r_{D*} t_*) - r_S\,\frac{\sin(r_{D*} t_*)}{r_{D*}}\right]\alpha_0\right\} \\ &= \alpha_0\,e^{r_S t_*}\left\{\cos(r_{D*} t_*) + \frac{\alpha'_0 - r_S\,\alpha_0}{r_{D*}\,\alpha_0}\,\sin(r_{D*} t_*)\right\}. \end{aligned} \tag{9.125}$$
Here, α0 refers to the initial angle of displacement α(0), and α'0 is the initial value of dα / dt*. The latter relation can be rewritten by defining an angle δ by

$$\tan\delta = \frac{\alpha'_0 - r_S\,\alpha_0}{r_{D*}\,\alpha_0}. \tag{9.126}$$

The use of this relation then enables the following rewriting of Eq. (9.125),

$$\begin{aligned} \alpha(t_*) &= \left[\cos(r_{D*} t_*) + \sin(r_{D*} t_*)\tan\delta\right] e^{r_S t_*}\,\alpha_0 \\ &= \frac{\cos(r_{D*} t_*)\cos\delta + \sin(r_{D*} t_*)\sin\delta}{\cos\delta}\,e^{r_S t_*}\,\alpha_0 = \frac{\cos(r_{D*} t_* - \delta)}{\cos\delta}\,e^{r_S t_*}\,\alpha_0. \end{aligned} \tag{9.127}$$
By replacing rS and rD* according to their definitions (9.123) and (9.124) we find

$$\alpha(t_*) = \frac{\cos\left(\sqrt{1 - (d\,\mu_* / 2)^2}\;t_* - \delta\right)}{\cos\delta}\,e^{-d\,\mu_* t_* / 2}\,\alpha_0, \tag{9.128}$$

where the angle δ is given by

$$\delta = \arctan\left(\frac{\alpha'_0 / \alpha_0 + d\,\mu_* / 2}{\sqrt{1 - (d\,\mu_* / 2)^2}}\right). \tag{9.129}$$
The solution of the equations (9.120) is then given by y1 = α and y2 = dα / dt*.

Second Equilibrium Point: Linear Stability Analysis. At the second equilibrium point (π, 0) Eqs. (9.69) imply the linear equation system

$$\frac{d}{dt_*}\begin{pmatrix} y_1 - \pi \\ y_2 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & -d\,\mu_* \end{pmatrix}\begin{pmatrix} y_1 - \pi \\ y_2 \end{pmatrix}. \tag{9.130}$$

Here, dy1 / dt* was replaced by d(y1 − π) / dt*. To solve this equation system we use the equivalent second-order equation

$$\frac{d^2(\alpha - \pi)}{dt_*^2} = \alpha - \pi - d\,\mu_*\,\frac{d(\alpha - \pi)}{dt_*}, \tag{9.131}$$

where y1 = α and y2 = dα / dt* are used. The latter equation can be obtained in the same way as Eq. (9.121). The eigenvalues can be written

$$r_1 = -d\,\mu_* / 2 + \sqrt{1 + (d\,\mu_* / 2)^2}, \qquad r_2 = -d\,\mu_* / 2 - \sqrt{1 + (d\,\mu_* / 2)^2}. \tag{9.132}$$
The eigenvalue r1 is always positive (also for zero damping), i.e., the solution near the second equilibrium point is unstable. According to Eq. (7.57), the solution of Eq. (9.131) is given for this case of two unequal real eigenvalues by

$$\alpha - \pi = \frac{\alpha'_0 - r_2\,(\alpha_0 - \pi)}{r_1 - r_2}\,e^{r_1 t_*} - \frac{\alpha'_0 - r_1\,(\alpha_0 - \pi)}{r_1 - r_2}\,e^{r_2 t_*}. \tag{9.133}$$
The solutions of Eqs. (9.130) follow then from y1 = α and y2 = dα / dt*. Regarding the discussion of the evolution in the y1-y2 phase plane below it is interesting to consider the consequences of setting α'0 = r2 (α0 − π), such that the first term in Eq. (9.133) disappears. By differentiating the resulting expression we find

$$\frac{d\alpha}{dt_*} = -r_2\,\frac{\alpha'_0 - r_1\,(\alpha_0 - \pi)}{r_1 - r_2}\,e^{r_2 t_*} = r_2\,(\alpha - \pi), \tag{9.134}$$

which means that y2 = r2 (y1 − π) for all t. Similarly, we find for α'0 = r1 (α0 − π)

$$\frac{d\alpha}{dt_*} = r_1\,\frac{\alpha'_0 - r_2\,(\alpha_0 - \pi)}{r_1 - r_2}\,e^{r_1 t_*} = r_1\,(\alpha - \pi), \tag{9.135}$$

which means that y2 = r1 (y1 − π) for all t. Hence, the y1-y2 phase plane figure will involve the two linear functions y2 = r2 (y1 − π) and y2 = r1 (y1 − π) in the vicinity of the point (π, 0) for both the damped and undamped pendulum.
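The closed-form results of the linear analysis can be cross-checked numerically. The Python sketch below evaluates the first form of Eq. (9.125) (which avoids the division by α0), compares it with a Runge-Kutta solution of the linear system (9.120), and evaluates the eigenvalues (9.132). The function names and the parameter `damping` (standing for d μ*) are illustrative assumptions; the closed form is only used for d μ* < 2.

```python
import math

def linear_solution(t, alpha0, dalpha0, damping):
    """Closed-form solution (9.125) of Eq. (9.121), valid for damping < 2."""
    r_s = -0.5 * damping
    r_d = math.sqrt(1.0 - (0.5 * damping) ** 2)     # r_D* of Eq. (9.124)
    return math.exp(r_s * t) * (alpha0 * math.cos(r_d * t)
                                + (dalpha0 - r_s * alpha0) / r_d * math.sin(r_d * t))

def saddle_eigenvalues(damping):
    """Eigenvalues (9.132) at the second equilibrium point (pi, 0)."""
    root = math.sqrt(1.0 + (0.5 * damping) ** 2)
    return (-0.5 * damping + root, -0.5 * damping - root)

def rk4_linear(alpha0, dalpha0, damping, t_end, n_steps):
    """Integrate the linear system (9.120) with classical Runge-Kutta steps."""
    def f(y):
        return (y[1], -y[0] - damping * y[1])
    dt = t_end / n_steps
    y = (alpha0, dalpha0)
    for _ in range(n_steps):
        k1 = f(y)
        k2 = f((y[0] + 0.5 * dt * k1[0], y[1] + 0.5 * dt * k1[1]))
        k3 = f((y[0] + 0.5 * dt * k2[0], y[1] + 0.5 * dt * k2[1]))
        k4 = f((y[0] + dt * k3[0], y[1] + dt * k3[1]))
        y = (y[0] + dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0,
             y[1] + dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0)
    return y

exact = linear_solution(5.0, 0.1, 0.0, 0.2)
numeric = rk4_linear(0.1, 0.0, 0.2, 5.0, 5000)[0]
print("closed form:", exact, "numerical:", numeric)

r1, r2 = saddle_eigenvalues(0.2)
print("slopes of the two lines through (pi, 0):", r1, r2)
```

The two printed angles should agree to within the integration error. The eigenvalues r1 and r2 are the slopes of the two straight lines y2 = r1 (y1 − π) and y2 = r2 (y1 − π) discussed above; their product equals −1 because the determinant of the matrix in Eq. (9.130) is −1.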
Fig. 9.6. An illustration regarding the calculation of the potential energy of a pendulum. The pendulum is lifted above its minimal position by the distance r (1 − cos α).
9.4.3 Nonlinear Stability Analysis

Lyapunov’s Second Method. The use of linear stability analysis leads to very helpful conclusions, but the suitability of the assumption of linear processes in the vicinity of equilibrium points is often not very clear. A nice way to overcome this problem by the analysis of the nonlinear equation system (9.118) was developed by Lyapunov. This approach, which will be presented in the following, is known as Lyapunov’s second method (Lyapunov’s first method refers to the method of linearization of a nonlinear equation along an orbit). The basic idea of Lyapunov’s approach is the consideration of orbits (i.e., trajectories in the phase plane) that are characterized by decreasing values of a non-negative function (which is called the Lyapunov function). The trajectory and its Lyapunov function will change until the Lyapunov function reaches the value zero. The position of the trajectory in the phase plane at which the Lyapunov function is equal to zero characterizes an equilibrium point. Hence, the asymptotic stability of nonlinear equation systems can be shown by proving the existence of a Lyapunov function that decreases to zero.

Pendulum Lyapunov Function. The most natural choice for the pendulum Lyapunov function is the total energy E defined by

$$E = m\,g\,r\,(1 - \cos\alpha) + \frac{1}{2}\,m\,r^2\left(\frac{d\alpha}{dt}\right)^2. \tag{9.136}$$
The first contribution represents the potential energy (the work done in lifting the pendulum above its minimal position: see the illustration in Fig. 9.6). The second contribution is the kinetic energy of the pendulum. To simplify the analysis below we use the nondimensional time $t_* = t / (r / g)^{1/2}$, and we introduce the nondimensional energy E* = E / (m g r),

$$E_* = 1 - \cos\alpha + \frac{1}{2}\left(\frac{d\alpha}{dt_*}\right)^2 = 1 - \cos y_1 + \frac{1}{2}\,y_2^2, \tag{9.137}$$

where the definitions y1 = α and y2 = dα / dt* are applied. The total energy E* has two relevant properties. The first property is that E* is non-negative,

$$E_* \ge 0. \tag{9.138}$$
The case E* = 0 can only appear if y2 = 0 and y1 = 2 n π, where n = 0, ±1, ±2, … (such that cos y1 = 1). Hence, E* = 0 for all the asymptotically stable equilibrium positions. The second property of E* is the inequality

$$\frac{dE_*}{dt_*} = \sin y_1\,\frac{dy_1}{dt_*} + y_2\,\frac{dy_2}{dt_*} = y_2\sin y_1 - y_2\,(d\,\mu_*\,y_2 + \sin y_1) = -d\,\mu_*\,y_2^2 \le 0. \tag{9.139}$$

The derivatives of y1 and y2 are replaced here according to Eqs. (9.118).

Nonlinear Stability Analysis. By excluding equilibrium points as initial points we find, therefore, the following results of this discussion.

• The damped pendulum motion is characterized by a decreasing energy E* except for the case that y2 = 0. There are two possibilities to find this case. First, we have y2 = 0 at the equilibrium points y1 = ± n π with n = 0, 1, 2, …. However, the equilibrium points can only be realized asymptotically. Second, y2 = 0 at the points on the left and right side at which the pendulum reverses the direction. However, E* continues to decrease after passing these points. Thus, the damped pendulum motion is characterized by trajectories with dE* / dt* < 0 that approach the asymptotically stable equilibrium points with y2 = 0 and y1 = 2 n π, where n = 0, ±1, ±2, … (because E* = 0 at these points: see Eq. (9.137)). Hence, the other equilibrium points (such as (−π, 0), (π, 0), (3π, 0), …) are asymptotically unstable.

• The undamped pendulum motion (d μ* = 0) is characterized by a constant value E* > 0 (asymptotically stable equilibrium positions are not considered as initial points). The curves satisfy the equation

$$1 - \cos y_1 + \frac{1}{2}\,y_2^2 = E_*. \tag{9.140}$$
The type of curve depends on the value of E*. For relatively small E* values we have relatively small y1 and y2. By approximating cos y1 by its Taylor series about y1 = 0 up to the quadratic term, $\cos y_1 \approx 1 - y_1^2 / 2$, Eq. (9.140) becomes

$$y_1^2 + y_2^2 = 2 E_*. \tag{9.141}$$

This equation describes a circle centered at (0, 0) with radius $(2 E_*)^{1/2}$. For larger E* values we have to consider the curve formula

$$y_2 = \pm\sqrt{2\,(E_* + \cos y_1 - 1)}, \tag{9.142}$$
which is implied by Eq. (9.140). Closed curves correspond to stable cyclic motions about the equilibrium point. Closed curves must include the possibility that y2 = dα / dt* = 0 such that the pendulum can reverse the direction. The function (9.142) shows that the value y2 = 0 can be realized as long as E* ≤ 2 (otherwise E* is always larger than 1 − cos y1). Therefore, open curves (corresponding to values y1 = α that do always increase: see the illustration in the next subsection) are found if E* > 2. The case E* = 2 is a specific case that separates closed and open curves. For this case we have

$$y_2 = \pm\sqrt{2\,(2 + \cos y_1 - 1)}. \tag{9.143}$$
This curve is called the separatrix for the undamped pendulum motion.
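The two properties of the Lyapunov function E* can be illustrated with a short computation. The Python sketch below records E* along numerical solutions of Eqs. (9.118) and classifies undamped curves by the value of E* as described above. The function names, the tolerance used to recognize the separatrix value E* = 2, and the Runge-Kutta scheme are assumptions made for this illustration.

```python
import math

def energy(y):
    """Nondimensional energy E* of Eq. (9.137)."""
    return 1.0 - math.cos(y[0]) + 0.5 * y[1] ** 2

def rhs(y, damping):
    """Eqs. (9.118); `damping` stands for the product d * mu_star."""
    return (y[1], -damping * y[1] - math.sin(y[0]))

def energy_history(y0, damping, t_end, n_steps):
    """Values of E* after each classical Runge-Kutta step, starting with E*(0)."""
    dt = t_end / n_steps
    y = tuple(y0)
    history = [energy(y)]
    for _ in range(n_steps):
        k1 = rhs(y, damping)
        k2 = rhs((y[0] + 0.5 * dt * k1[0], y[1] + 0.5 * dt * k1[1]), damping)
        k3 = rhs((y[0] + 0.5 * dt * k2[0], y[1] + 0.5 * dt * k2[1]), damping)
        k4 = rhs((y[0] + dt * k3[0], y[1] + dt * k3[1]), damping)
        y = (y[0] + dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0,
             y[1] + dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0)
        history.append(energy(y))
    return history

def curve_type(e_star, tol=1e-9):
    """Classify an undamped phase-plane curve by its constant energy E*."""
    if abs(e_star - 2.0) <= tol:
        return "separatrix"
    return "closed" if e_star < 2.0 else "open"

start = (math.pi / 2.0, math.pi / 3.0)
undamped = energy_history(start, 0.0, 20.0, 10000)
damped = energy_history(start, 0.2, 20.0, 10000)
print("undamped: spread of E* =", max(undamped) - min(undamped))
print("damped: E* falls from", damped[0], "to", damped[-1])
for y20 in (math.pi / 3.0, math.sqrt(2.0), 2.0 * math.pi / 3.0):
    print("y20 =", y20, "->", curve_type(energy((math.pi / 2.0, y20))))
```

With the initial angle y10 = π / 2, the energy is E* = 1 + y20² / 2, so that y20 = 2^(1/2) gives exactly the separatrix value E* = 2, smaller y20 give closed curves, and larger y20 give open curves.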
9.4.4 Pendulum Motions

Nonlinear pendulum motions are illustrated in Fig. 9.7 as functions of time and in the y1-y2 phase plane. Typical features of these motions will be discussed next and compared to the conclusions of stability theory.

Undamped Pendulum. The undamped pendulum is characterized by two sorts of areas: areas that are bounded from below and above by separatrices (indicated by the differently shaded areas in Fig. 9.7c), and the remaining areas. The shaded areas are characterized by closed curves in the y1-y2 phase plane corresponding to cyclic pendulum motions: see the y1(t) curve for y20 = π / 3. The system behavior is very different outside the shaded areas: we have here open curves in the y1-y2 phase plane that are related to a steady increase of y1 = α: see the y1 curves related to y20 = 2 π / 3 and y20 = 3 π / 2. The separatrices separate these two behaviors. The separatrices are closed, but the corresponding y1(t) curve does not show cyclic variations anymore: see the curve y1(t) that results from $y_{20} = 2^{1/2}$.

Comparison with Stability Theory. These observations agree with the consequences of linear and nonlinear stability theory. The local phase plane features are explained by the linear stability theory. The two linear functions y2 = r2 (y1 − π) and y2 = r1 (y1 − π) are found in the vicinity of the point (π, 0): y2 = r1 (y1 − π) is the increasing function, and y2 = r2 (y1 − π) is the decreasing function. The global phase plane features are explained by the nonlinear stability theory. The curve shapes correspond to the conclusions reported in Sect. 9.4.3 as a consequence of analyzing the Lyapunov function E*. For relatively small E* values we find circles, the separatrix obtained for E* = 2 is described by Eq. (9.143), and for large values of E* we find open curves.

Fig. 9.7. Pendulum motions. Solutions y1 = α of the nonlinear pendulum equation system (9.118) are shown as functions of time t* in (a) and (b) for the undamped (d μ* = 0) and damped pendulum (d μ* = 0.2), respectively. The curves start at y10 = π / 2. The initial values y20 have the values given in the figures; (c) and (d) illustrate undamped and damped pendulum motion, respectively, for a variety of initial conditions in the y1-y2 phase plane. Separatrices are indicated by dashed lines. The differently shaded areas indicate areas enclosed by separatrices.

Damped Pendulum. The phase plane for the damped pendulum is differently organized. All the space is divided into areas that are enclosed by separatrices: the differently shaded areas in Fig. 9.7d are surrounded by other areas that are also enclosed by separatrices. The calculation of separatrices is not as simple as for the undamped pendulum because the choice of initial values for the trajectories is not obvious. In particular, the lowest separatrix has to be calculated such that it ends in (0, 0). This curve is determined by the initial values (3 π, −3.574). The middle separatrix between π and 3 π can be calculated by the initial values (5 π, −3.574). The other separatrices follow from symmetry conditions: The middle separatrix between −π and π follows from the initial value (−3 π, 3.574), and the highest separatrix follows from the initial value (3 π, 3.574). Curves that begin inside the lower shaded area in Fig. 9.7d are attracted by the equilibrium point (0, 0), whereas curves that begin inside the upper shaded area are attracted by the equilibrium point (2 π, 0). Correspondingly, curves that begin in other areas are attracted by different asymptotically stable equilibrium points. For example, the lowest solid curve in Fig. 9.7d is attracted by (−2 π, 0), and the highest solid line is attracted by (4 π, 0).
9 Deterministic Multivariate Evolution
Comparison with Stability Theory. These observations also agree with the conclusions of stability theory. According to linear stability theory, the linear functions $y_2 = r_2 (y_1 - \pi)$ and $y_2 = r_1 (y_1 - \pi)$ are found at $(\pi, 0)$: $y_2 = r_1 (y_1 - \pi)$ is the increasing function, and $y_2 = r_2 (y_1 - \pi)$ is the decreasing function. According to nonlinear stability analysis, all trajectories approach the asymptotically stable equilibrium points with $y_2 = 0$ and $y_1 = 2 n \pi$, where $n = 0, \pm 1, \pm 2, \ldots$ Thus, the other equilibrium points $(-\pi, 0)$, $(\pi, 0)$, $(3\pi, 0)$, $\ldots$ are unstable.
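The classification of undamped pendulum motions by the value of $E^*$ can be checked with a short calculation. The sketch below assumes that the Lyapunov function of Sect. 9.4.3 (not reproduced in this section) has the dimensionless form $E^* = y_2^2/2 + 1 - \cos y_1$; this form is an assumption, chosen because it reproduces the separatrix value $E^* = 2$ for the initial data $y_{10} = \pi/2$, $y_{20} = 2^{1/2}$ quoted above. The function names are illustrative.

```python
import math


def pendulum_energy(y1, y2):
    """Assumed dimensionless energy E* = y2^2/2 + 1 - cos(y1)."""
    return 0.5 * y2**2 + 1.0 - math.cos(y1)


def classify(y10, y20, tol=1e-12):
    """Classify an undamped trajectory by comparing E* with the separatrix value 2."""
    e_star = pendulum_energy(y10, y20)
    if abs(e_star - 2.0) <= tol:
        return "separatrix"
    return "closed" if e_star < 2.0 else "open"


if __name__ == "__main__":
    y10 = math.pi / 2  # starting angle used in Fig. 9.7a
    for y20 in (math.pi / 3, math.sqrt(2), 2 * math.pi / 3, 3 * math.pi / 2):
        e_star = pendulum_energy(y10, y20)
        print(f"y20 = {y20:.4f}: E* = {e_star:.4f} -> {classify(y10, y20)}")
```

Under this assumption, the four initial values of Fig. 9.7a fall into the three groups described in the text: one cyclic motion, one separatrix, and two motions with steadily increasing angle.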
9.5 Fluid Dynamics: Lorenz's Weather

Next, let us analyze Lorenz's weather. The latter term refers to a simple model for the explanation of convection, which is relevant to atmospheric motions (the motion of the atmosphere is forced by the latitudinal imbalance of solar heating) and many technical applications (e.g., the design of heat exchangers). The consideration of convection here continues the discussion of simple convection models in Sect. 7.2. The model equations considered represent a simplification of the complicated partial differential equations of fluid dynamics derived in Chap. 10, which are implied by Newtonian mechanics. A main feature of the fluid dynamics equations is that these nonlinear equations generate deterministic chaos (chaotic solutions). It will be shown below that Lorenz's weather model is characterized by the same feature. From a methodological point of view, the equations considered extend the previous analyses to the case of three coupled equations.
9.5.1 The Lorenz Equations

Lorenz's Equations. Following the studies of Saltzman (1962), Lorenz (1963) suggested the following equations to investigate basic features of convection,

\begin{align}
\frac{dy_1}{dt} &= Pr\,(y_2 - y_1), \tag{9.144a}\\
\frac{dy_2}{dt} &= y_1 (R - y_3) - y_2, \tag{9.144b}\\
\frac{dy_3}{dt} &= y_1 y_2 - b\,y_3. \tag{9.144c}
\end{align}

These equations represent a highly simplified model for (Rayleigh-Bénard) convection. Lorenz's equations can also be seen as a toy model for weather, which can be used for explaining the limitations of long-range weather forecasting (see, e.g., Gleick 1987, Lorenz 2006, Baines 2008, and Boyce & DiPrima 2009).
Fig. 9.8. An illustration of convection according to Lorenz's equations. Fluid flow is considered in a single cell (i.e., in one box). The flow is heated from below and cooled from above, and there are slippery nonconducting side walls. For a sufficiently large temperature difference the warmer fluid rises, cools down at the top, and moves downwards. This results in a steady convective motion. The strength and direction of this circulation is measured by the variable $y_1$.

Figure 9.8 shows an illustration of the case considered. The model variables have the following meaning. The strength and direction of the circulation is measured by $y_1$, $y_2$ measures the horizontal temperature variation, and $y_3$ measures the vertical temperature variation. The model parameter $Pr$ is the Prandtl number (the ratio of diffusivities of momentum and heat), and $b$ is defined by $b = 4 / (1 + a^2)$. The parameter $a$ is a horizontal wavenumber for the convection cells, and $b$ measures the width-to-height ratio of the convection layer. The most relevant model parameter is $R$, which is proportional to the vertical temperature difference (the driving force of the system). In particular, $R$ is defined by $R = Ra / Rc$, where $Ra$ refers to the Rayleigh number and $Rc$ refers to the critical value of the Rayleigh number (the Rayleigh number that is required for the onset of convection). A large value of $R$ implies a large thermal forcing of motion.

Lorenz Model Considered. To simplify the relatively complicated analysis of Lorenz's equations (9.144) we will follow the studies of Saltzman and Lorenz by specifying $a^2 = 1/2$ so that $b = 8/3$. We also specify $Pr = 10$, which is a realistic value for water. The equation system that results from these assumptions reads

\begin{align}
\frac{dy_1}{dt} &= 10\,(y_2 - y_1), \tag{9.145a}\\
\frac{dy_2}{dt} &= y_1 (R - y_3) - y_2, \tag{9.145b}\\
\frac{dy_3}{dt} &= y_1 y_2 - \frac{8}{3}\,y_3. \tag{9.145c}
\end{align}
These equations will be analyzed in the following in dependence on $R \ge 0$, which controls the amount of thermal forcing.
9.5.2 Linear Stability Analysis

Equilibrium Points. Linear stability analysis has to be applied to derive analytical conclusions regarding the nonlinear equation system (9.145). Such analysis requires the calculation of equilibrium points. These points are defined by

\begin{align}
0 &= 10\,(y_2 - y_1), \tag{9.146a}\\
0 &= y_1 (R - y_3) - y_2, \tag{9.146b}\\
0 &= y_1 y_2 - \frac{8}{3}\,y_3. \tag{9.146c}
\end{align}

Equation (9.146a) implies $y_2 = y_1$. Hence, the other two conditions can be written

\begin{align}
0 &= y_1 (R - y_3 - 1), \tag{9.147a}\\
0 &= y_1^2 - \frac{8}{3}\,y_3. \tag{9.147b}
\end{align}

The first way to satisfy Eq. (9.147a) is given by $y_1 = 0$, which implies $y_2 = y_3 = 0$. The second possibility to satisfy Eq. (9.147a) is $y_3 = R - 1$. This setting implies that $y_2 = y_1 = [8 (R - 1)/3]^{1/2}$, or $y_2 = y_1 = -[8 (R - 1)/3]^{1/2}$. Hence, we have three equilibrium points given by

\begin{align}
P_1 &= (0,\ 0,\ 0), \notag\\
P_2 &= \left(\sqrt{8 (R - 1)/3},\ \sqrt{8 (R - 1)/3},\ R - 1\right), \tag{9.148}\\
P_3 &= \left(-\sqrt{8 (R - 1)/3},\ -\sqrt{8 (R - 1)/3},\ R - 1\right). \notag
\end{align}
The properties of these equilibrium points depend on $R$. For $R < 1$, the only real equilibrium point is given by $P_1$. For $R > 1$, there are three real equilibrium points. For $R = 1$ the three equilibrium points coincide at $(0, 0, 0)$.

Linear Stability Analysis. The linear equation system in the neighborhood of any equilibrium point $(Y_1, Y_2, Y_3)$ can be obtained by generalizing the equation system (9.69) to the three-dimensional case,

\[
\frac{d}{dt}
\begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \\ y_3 - Y_3 \end{pmatrix}
=
\left.
\begin{pmatrix}
\partial F_1/\partial y_1 & \partial F_1/\partial y_2 & \partial F_1/\partial y_3 \\
\partial F_2/\partial y_1 & \partial F_2/\partial y_2 & \partial F_2/\partial y_3 \\
\partial F_3/\partial y_1 & \partial F_3/\partial y_2 & \partial F_3/\partial y_3
\end{pmatrix}
\right|_{y_1 = Y_1,\ y_2 = Y_2,\ y_3 = Y_3}
\begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \\ y_3 - Y_3 \end{pmatrix}.
\tag{9.149}
\]

Here, $F_1$, $F_2$, and $F_3$ represent the right-hand sides of the three equations (9.145), respectively. For Eqs. (9.145) we find

\[
\frac{d}{dt}
\begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \\ y_3 - Y_3 \end{pmatrix}
=
\begin{pmatrix}
-10 & 10 & 0 \\
R - Y_3 & -1 & -Y_1 \\
Y_2 & Y_1 & -8/3
\end{pmatrix}
\begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \\ y_3 - Y_3 \end{pmatrix}.
\tag{9.150}
\]
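The equilibrium points (9.148) and the matrix in Eq. (9.150) can be verified numerically. The following is a minimal sketch using NumPy; the function names and the finite-difference step `h` are illustrative choices, not part of the text. It checks that the right-hand sides of Eqs. (9.145) vanish at each equilibrium point and compares the analytical Jacobian with a central-difference approximation.

```python
import numpy as np

B = 8.0 / 3.0   # parameter b of Eqs. (9.145)
PR = 10.0       # Prandtl number of Eqs. (9.145)


def lorenz_rhs(y, R):
    """Right-hand sides F1, F2, F3 of Eqs. (9.145)."""
    y1, y2, y3 = y
    return np.array([PR * (y2 - y1), y1 * (R - y3) - y2, y1 * y2 - B * y3])


def equilibrium_points(R):
    """Real equilibrium points according to Eq. (9.148)."""
    points = [np.zeros(3)]
    if R > 1.0:
        q = np.sqrt(B * (R - 1.0))
        points.append(np.array([q, q, R - 1.0]))
        points.append(np.array([-q, -q, R - 1.0]))
    return points


def jacobian(Y, R):
    """Matrix of Eq. (9.150) evaluated at the point Y = (Y1, Y2, Y3)."""
    Y1, Y2, Y3 = Y
    return np.array([[-PR, PR, 0.0],
                     [R - Y3, -1.0, -Y1],
                     [Y2, Y1, -B]])


def numerical_jacobian(Y, R, h=1e-6):
    """Central-difference approximation of the Jacobian (for cross-checking)."""
    J = np.zeros((3, 3))
    for k in range(3):
        e = np.zeros(3)
        e[k] = h
        J[:, k] = (lorenz_rhs(Y + e, R) - lorenz_rhs(Y - e, R)) / (2.0 * h)
    return J


if __name__ == "__main__":
    R = 28.0
    for P in equilibrium_points(R):
        print("point:", P)
        print("  max |F| at point:", np.max(np.abs(lorenz_rhs(P, R))))
        print("  max Jacobian mismatch:",
              np.max(np.abs(jacobian(P, R) - numerical_jacobian(P, R))))
```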
First Equilibrium Point. The linear equation system for the dynamics near the first equilibrium point $P_1 = (Y_1, Y_2, Y_3) = (0, 0, 0)$ is then given by

\[
\frac{d}{dt}
\begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \\ y_3 - Y_3 \end{pmatrix}
=
\begin{pmatrix}
-10 & 10 & 0 \\
R & -1 & 0 \\
0 & 0 & -8/3
\end{pmatrix}
\begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \\ y_3 - Y_3 \end{pmatrix}.
\tag{9.151}
\]

The eigenvalues of this equation system follow from the extension of Eq. (9.21) to the three-dimensional case considered,

\begin{align}
0 &=
\begin{vmatrix}
-10 - r & 10 & 0 \\
R & -1 - r & 0 \\
0 & 0 & -8/3 - r
\end{vmatrix}
= -(10 + r)(1 + r)(8/3 + r) + 10 R\,(8/3 + r) \notag\\
&= -(8/3 + r)\left[(10 + r)(1 + r) - 10 R\right]
= -(8/3 + r)\left[r^2 + 11 r - 10 (R - 1)\right]. \tag{9.152}
\end{align}

This cubic equation for $r$ has three roots that are determined by $8/3 + r = 0$ and the condition that the bracket term is equal to zero,

\begin{align}
r_1 &= -\frac{8}{3}, \notag\\
r_2 &= -\frac{11}{2} + \sqrt{\frac{121}{4} + 10 (R - 1)} = -\frac{11}{2}\left(1 - \sqrt{1 + \frac{40}{121}(R - 1)}\right), \tag{9.153}\\
r_3 &= -\frac{11}{2} - \sqrt{\frac{121}{4} + 10 (R - 1)} = -\frac{11}{2}\left(1 + \sqrt{1 + \frac{40}{121}(R - 1)}\right). \notag
\end{align}
The eigenvalues $r_1$ and $r_3$ are always negative. The sign of $r_2$ depends on $R$: we have $r_2 < 0$ if $R < 1$, and $r_2 > 0$ if $R > 1$.

Second and Third Equilibrium Points. I. We can combine the stability analysis for the second and third equilibrium point by writing Eq. (9.150) as

\[
\frac{d}{dt}
\begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \\ y_3 - Y_3 \end{pmatrix}
=
\begin{pmatrix}
-10 & 10 & 0 \\
1 & -1 & -s \sqrt{8 (R - 1)/3} \\
s \sqrt{8 (R - 1)/3} & s \sqrt{8 (R - 1)/3} & -8/3
\end{pmatrix}
\begin{pmatrix} y_1 - Y_1 \\ y_2 - Y_2 \\ y_3 - Y_3 \end{pmatrix}.
\tag{9.154}
\]

The settings $s = 1$ and $s = -1$ correspond to the consideration of $P_2$ and $P_3$, respectively. The eigenvalues of this system are determined by the condition

\[
0 =
\begin{vmatrix}
-10 - r & 10 & 0 \\
1 & -1 - r & -s \sqrt{8 (R - 1)/3} \\
s \sqrt{8 (R - 1)/3} & s \sqrt{8 (R - 1)/3} & -8/3 - r
\end{vmatrix}.
\tag{9.155}
\]
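Table 9.2 Solutions $y = r + a/3$ of the reduced cubic equation (9.159). The case $\alpha > 0$ implies $D > 0$. The last row provides the formula for the calculation of $\varphi$.

\[
\begin{array}{l|l|l|l}
 & \alpha < 0:\ D \le 0 & \alpha < 0:\ D > 0 & \alpha > 0:\ D > 0 \\ \hline
y_1 & -2E \cos\dfrac{\varphi}{3} & -2E \cosh\dfrac{\varphi}{3} & -2E \sinh\dfrac{\varphi}{3} \\[2ex]
y_2 & -2E \cos\dfrac{\varphi + 2\pi}{3} & E\left[\cosh\dfrac{\varphi}{3} + i\sqrt{3}\,\sinh\dfrac{\varphi}{3}\right] & E\left[\sinh\dfrac{\varphi}{3} + i\sqrt{3}\,\cosh\dfrac{\varphi}{3}\right] \\[2ex]
y_3 & -2E \cos\dfrac{\varphi + 4\pi}{3} & E\left[\cosh\dfrac{\varphi}{3} - i\sqrt{3}\,\sinh\dfrac{\varphi}{3}\right] & E\left[\sinh\dfrac{\varphi}{3} - i\sqrt{3}\,\cosh\dfrac{\varphi}{3}\right] \\[2ex]
\varphi & \cos\varphi = \dfrac{\beta}{2E^3} & \cosh\varphi = \dfrac{\beta}{2E^3} & \sinh\varphi = \dfrac{\beta}{2E^3}
\end{array}
\]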
The expansion of the determinant provides the condition

\begin{align}
0 &= -(10 + r)\left[(1 + r)(8/3 + r) + 8 (R - 1)/3\right] + 10\left[8/3 + r - 8 (R - 1)/3\right] \notag\\
&= -(8/3 + r)\left[(10 + r)(1 + r) - 10\right] - \frac{8 (R - 1)}{3}\left[10 + 10 + r\right] \notag\\
&= -r\,(8/3 + r)\left[r + 11\right] - \frac{8 (R - 1)}{3}\left[20 + r\right] \notag\\
&= -r\left(r^2 + \frac{8 + 33}{3}\,r + 11 \cdot \frac{8}{3} + \frac{8 (R - 1)}{3}\right) - \frac{160 (R - 1)}{3} \notag\\
&= -\left[r^3 + \frac{41}{3}\,r^2 + \frac{8 (R + 10)}{3}\,r + \frac{160 (R - 1)}{3}\right]. \tag{9.156}
\end{align}

This eigenvalue equation, which is equivalent to $0 = r^3 + \frac{41}{3} r^2 + \frac{8 (R + 10)}{3} r + \frac{160 (R - 1)}{3}$, is independent of $s$. Hence, this equation is the same for the equilibrium points $P_2$ and $P_3$.

Solutions of Cubic Equations. Before analyzing the consequences of the last equation, let us briefly review the solutions of cubic equations. We consider the equation

\[
0 = r^3 + a\,r^2 + b\,r + c.
\tag{9.157}
\]

Here, $a$, $b$, and $c$ are any real coefficients. By introducing $y = r + a/3$, Eq. (9.157) can be written as a reduced equation that does not contain a quadratic term,

\begin{align}
0 &= \left(y - \frac{a}{3}\right)^3 + a\left(y - \frac{a}{3}\right)^2 + b\left(y - \frac{a}{3}\right) + c \notag\\
&= y^3 - 3 y^2\,\frac{a}{3} + 3 y \left(\frac{a}{3}\right)^2 - \left(\frac{a}{3}\right)^3 + a\,y^2 - 2 a\,y\,\frac{a}{3} + a \left(\frac{a}{3}\right)^2 + b\,y - b\,\frac{a}{3} + c \notag\\
&= y^3 + y\left[3\left(\frac{a}{3}\right)^2 - 6\left(\frac{a}{3}\right)^2 + b\right] - \left(\frac{a}{3}\right)^3 + 3\left(\frac{a}{3}\right)^3 - b\,\frac{a}{3} + c. \tag{9.158}
\end{align}
A more convenient way is to write this equation as

\[
0 = y^3 + \alpha\,y + \beta,
\tag{9.159}
\]

where $\alpha$ and $\beta$ are defined by

\[
\alpha = b - 3\left(\frac{a}{3}\right)^2, \qquad
\beta = 2\left(\frac{a}{3}\right)^3 - b\,\frac{a}{3} + c.
\tag{9.160}
\]

The three solutions of Eq. (9.159) are given in Table 9.2 in dependence on

\[
D = \left(\frac{\alpha}{3}\right)^3 + \left(\frac{\beta}{2}\right)^2, \qquad
E = \frac{\beta}{|\beta|}\,\sqrt{\frac{|\alpha|}{3}}.
\tag{9.161}
\]
Here, $D$ is called the discriminant. The solutions of the original Eq. (9.157) can be obtained by means of the relation $r = y - a/3$. Table 9.2 shows that there are two possibilities: there can be either three real solutions or one real solution and two conjugate complex solutions. The two solution regimes are separated at $D = 0$: for $D \le 0$ we have three real solutions, and for $D > 0$ we have one real and two complex solutions.

Pure Imaginary Eigenvalues. The two complex solutions may have positive or negative real parts. Regarding the evaluation of stability it is relevant to know for which $R$ the real parts of complex roots are equal to zero (because we know then for which $R$ the real parts of complex roots are positive and negative). This specific case of pure imaginary roots is given under the conditions that $c = a\,b$ and $b > 0$. To prove the requirement of the first condition $c = a\,b$ we write Eq. (9.157) for this case as

\[
0 = r^3 + a\,r^2 + b\,r + a\,b = (r + a)(r^2 + b).
\tag{9.162}
\]

This representation reveals the roots $r_1 = -a$ and $r_{2,3} = \pm i\,b^{1/2}$. The comparison of these roots with the solutions presented in Table 9.2 shows that $E \cosh(\varphi/3)$ or $E \sinh(\varphi/3)$, which represent the real parts of $y_2$ and $y_3$ for a negative or positive sign of $\alpha$, respectively, must be equal to $a/3$. These values imply zero real parts of $r_2$ and $r_3$ according to $r = y - a/3$. The second condition $b > 0$ is a requirement to have a discriminant $D > 0$, which is needed for the existence of complex solutions. We have to calculate $D$ to show the correctness of this claim. For $c = a\,b$, $\alpha$ and $\beta$ are given by

\[
\alpha = b - 3\left(\frac{a}{3}\right)^2, \qquad
\beta = 2\left(\frac{a}{3}\right)^3 + 2 b\,\frac{a}{3}.
\tag{9.163}
\]

Hence, the discriminant $D$ is given by

\[
D = \left(\frac{\alpha}{3}\right)^3 + \left(\frac{\beta}{2}\right)^2
= \left(\frac{b}{3} - \left(\frac{a}{3}\right)^2\right)^3 + \left(\left(\frac{a}{3}\right)^3 + b\,\frac{a}{3}\right)^2.
\tag{9.164}
\]
The evaluation of this expression provides

\begin{align}
D &= \left(\frac{b}{3}\right)^3 - 3\left(\frac{b}{3}\right)^2\left(\frac{a}{3}\right)^2 + 3\,\frac{b}{3}\left(\frac{a}{3}\right)^4 - \left(\frac{a}{3}\right)^6 + \left(\frac{a}{3}\right)^6 + 2 b\left(\frac{a}{3}\right)^4 + b^2\left(\frac{a}{3}\right)^2 \notag\\
&= \left(\frac{b}{3}\right)^3 + 6\left(\frac{b}{3}\right)^2\left(\frac{a}{3}\right)^2 + 9\,\frac{b}{3}\left(\frac{a}{3}\right)^4
= \frac{b}{3}\left(\frac{b}{3} + 3\left(\frac{a}{3}\right)^2\right)^2. \tag{9.165}
\end{align}

Therefore, $D > 0$ under the condition that $b > 0$.

Second and Third Equilibrium Points. II. The eigenvalue equation (9.156) can be analyzed on the basis of the solutions of cubic equations described in the preceding two paragraphs. Let us prepare this discussion by the calculation of two characteristic values of $R$. A first characteristic value $R_1$ is the value that separates three real solutions from one real solution and two conjugate complex solutions. $R_1$ is determined by $D = (\alpha/3)^3 + (\beta/2)^2 = 0$. For our case, $\alpha$ and $\beta$ are given by

\begin{align}
\alpha &= \frac{8 (R + 10)}{3} - 3\left(\frac{41}{9}\right)^2 = \frac{8}{3}\,R - \frac{961}{27}, \notag\\
\beta &= 2\left(\frac{41}{9}\right)^3 - \frac{8 (R + 10)}{3}\,\frac{41}{9} + \frac{160 (R - 1)}{3} = \frac{1112}{27}\,R + \frac{10402}{729}. \tag{9.166}
\end{align}
The use of the latter relations in $D = (\alpha/3)^3 + (\beta/2)^2$ provides the expression

\begin{align}
D(R) &= \left(\frac{8}{9}\,R - \frac{961}{81}\right)^3 + \left(\frac{556}{27}\,R + \frac{5201}{729}\right)^2
= \frac{16}{729}\left[32\left(R - \frac{961}{72}\right)^3 + \left(139\,R + \frac{5201}{108}\right)^2\right], \notag\\
\frac{729}{512}\,D(R) &= R^3 - 3\,\frac{961}{72}\,R^2 + 3\,\frac{961^2}{72^2}\,R - \frac{961^3}{72^3} + \frac{139^2}{32}\,R^2 + \frac{139 \cdot 5201}{16 \cdot 108}\,R + \frac{5201^2}{32 \cdot 108^2} \notag\\
&= R^3 + \frac{54{,}119}{96}\,R^2 + \frac{15{,}245}{16}\,R - \frac{36{,}885}{16}. \tag{9.167}
\end{align}

The solution of cubic equations described above shows that the equation $D(R) = 0$ has two negative real roots and one positive real root. The negative roots can be disregarded because we consider $R \ge 0$. The positive real root, which is $R_1$, is given by

\[
R_1 = -2\,\frac{\beta}{|\beta|}\,\sqrt{\frac{|\alpha|}{3}}\,
\cos\left(\frac{1}{3}\arccos\left(\frac{|\beta|\,3^{3/2}}{2\,|\alpha|^{3/2}}\right) + \frac{2\pi}{3}\right) - \frac{a}{3} \approx 1.3456.
\tag{9.168}
\]

Here, $\alpha = b - 3 (a/3)^2$ and $\beta = 2 (a/3)^3 - a\,b/3 + c$, where $a$, $b$, and $c$ are the coefficients of $R^2$, $R$, and the last term in the cubic equation (9.167), respectively. The validity of this value may be seen by proving that $R_1 = 1.3456$ implies $D = 0$.
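The value of $R_1$ can be cross-checked in two independent ways. The sketch below first evaluates the discriminant $D(R)$ directly from the coefficients of the eigenvalue equation (9.156) and locates its sign change by bisection; it then evaluates the closed-form expression (9.168) with the coefficients of the cubic (9.167). The function names, the bracketing interval $[1, 2]$, and the tolerance are illustrative assumptions.

```python
import math


def discriminant(R):
    """D = (alpha/3)^3 + (beta/2)^2 for the eigenvalue equation (9.156)."""
    a = 41.0 / 3.0
    b = 8.0 * (R + 10.0) / 3.0
    c = 160.0 * (R - 1.0) / 3.0
    alpha = b - 3.0 * (a / 3.0) ** 2
    beta = 2.0 * (a / 3.0) ** 3 - a * b / 3.0 + c
    return (alpha / 3.0) ** 3 + (beta / 2.0) ** 2


def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def r1_closed_form():
    """Eq. (9.168), using the coefficients of the cubic (9.167)."""
    a, b, c = 54119.0 / 96.0, 15245.0 / 16.0, -36885.0 / 16.0
    alpha = b - 3.0 * (a / 3.0) ** 2
    beta = 2.0 * (a / 3.0) ** 3 - a * b / 3.0 + c
    E = math.copysign(math.sqrt(abs(alpha) / 3.0), beta)
    phi = math.acos(abs(beta) * 3.0 ** 1.5 / (2.0 * abs(alpha) ** 1.5))
    return -2.0 * E * math.cos(phi / 3.0 + 2.0 * math.pi / 3.0) - a / 3.0


if __name__ == "__main__":
    print("R1 from bisection of D(R):", bisect(discriminant, 1.0, 2.0))
    print("R1 from Eq. (9.168):      ", r1_closed_form())
```

Both routes should agree with the value quoted in Eq. (9.168) to the digits given there.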
Table 9.3 The linear stability properties of the Lorenz equations (9.145).

\begin{tabular}{l|l|l}
 & $P_1$ & $P_2$ and $P_3$ \\ \hline
$0 \le R < 1$ & asymptotically stable: 3 real negative roots & do not exist \\
$1 < R < R_1$ & unstable: 2 real negative roots, 1 real positive root & asymptotically stable: 3 real negative roots \\
$R_1 < R < R_2$ & unstable: 2 real negative roots, 1 real positive root & asymptotically stable: 1 real negative root, 2 complex roots with negative real parts \\
$R_2 < R$ & unstable: 2 real negative roots, 1 real positive root & unstable: 1 real negative root, 2 complex roots with positive real parts
\end{tabular}
A second characteristic value $R_2$ of $R$ is the value that separates complex roots with negative and positive real parts. In terms of the notation of the general cubic equation (9.157), this value is determined by the condition $c = a\,b$. By adopting $a$, $b$, and $c$ according to the eigenvalue equation (9.156), the condition for $R_2$ reads

\[
\frac{41}{3}\,\frac{8 (R_2 + 10)}{3} = \frac{160 (R_2 - 1)}{3}.
\tag{9.169}
\]

This equation is solved by

\[
R_2 = \frac{470}{19} \approx 24.7368.
\tag{9.170}
\]
Second and Third Equilibrium Points. III. The stability behavior near $P_2$ and $P_3$ can now be determined with reference to $R_1$ and $R_2$. We find the following features (see also the summary of linear stability features in Table 9.3):
a) $1 < R < R_1$: All the terms in Eq. (9.156) are positive for $R > 1$. Hence, all real solutions have to be negative. The discriminant $D = (\alpha/3)^3 + (\beta/2)^2$ increases according to Eq. (9.167), $dD/dR > 0$, and we know that $D = 0$ at $R_1$. Thus, $D < 0$ for $1 < R < R_1$, which means that there are three real negative roots for $1 < R < R_1$. Hence, the solution near $P_2$ and $P_3$ is asymptotically stable.
b) $R_1 < R < R_2$: For $R_1 < R$ we have one real root, which has to be negative because all real roots must be negative, and two conjugate complex roots. The complex roots arise from imaginary contributions that appear in addition to the negative real parts of these roots. The real parts of these roots are negative when $R < R_2$. Hence, the solution near $P_2$ and $P_3$ is asymptotically stable in this regime, too.
c) $R_2 < R$: For $R$ values in this regime we have one negative real root and two conjugate complex roots. The complex roots now have positive real parts because of $R_2 < R$. Hence, the solution near $P_2$ and $P_3$ is unstable in this regime.
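The three regimes a) to c) and the behavior of $P_1$ can be illustrated by computing the eigenvalues of the matrix in Eq. (9.150) numerically. The following is a minimal sketch using `numpy.linalg.eigvals`; the sample values of $R$ (0.5, 1.2, 10, 28) are arbitrary picks from each regime, and the function names are illustrative.

```python
import numpy as np

B = 8.0 / 3.0
R2 = 470.0 / 19.0  # critical value from Eq. (9.170)


def jacobian(Y, R):
    """Matrix of Eq. (9.150) evaluated at the point Y = (Y1, Y2, Y3)."""
    Y1, Y2, Y3 = Y
    return np.array([[-10.0, 10.0, 0.0],
                     [R - Y3, -1.0, -Y1],
                     [Y2, Y1, -B]])


def eig_p1(R):
    """Eigenvalues of the linearized system near P1 = (0, 0, 0)."""
    return np.linalg.eigvals(jacobian((0.0, 0.0, 0.0), R))


def eig_p2(R):
    """Eigenvalues near P2 (identical to those near P3); requires R > 1."""
    q = np.sqrt(B * (R - 1.0))
    return np.linalg.eigvals(jacobian((q, q, R - 1.0), R))


if __name__ == "__main__":
    for R in (0.5, 1.2, 10.0, 28.0):
        print(f"R = {R}: P1 eigenvalues {np.sort_complex(eig_p1(R))}")
        if R > 1.0:
            print(f"         P2/P3 eigenvalues {np.sort_complex(eig_p2(R))}")
    print("largest real part at R = R2:", np.max(eig_p2(R2).real))
```

At $R = R_2$ the largest real part should vanish up to rounding, which corresponds to the pure imaginary pair $\pm i\,b^{1/2}$ of Eq. (9.162).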
9.5.3 Deterministic Chaos

Equations Considered. The illustration of characteristic features of the Lorenz equations (9.145) will be focused on the case $R_2 < R$. A discussion of solution properties for other cases can be found, e.g., in Sparrow (1982). In particular, we use $R = 28$, which means we consider the equations

\begin{align}
\frac{dy_1}{dt} &= 10\,(y_2 - y_1), \tag{9.171a}\\
\frac{dy_2}{dt} &= y_1 (28 - y_3) - y_2, \tag{9.171b}\\
\frac{dy_3}{dt} &= y_1 y_2 - \frac{8}{3}\,y_3. \tag{9.171c}
\end{align}
The reason for the consideration of these equations is that this case, for which we do not have asymptotically stable solutions, is the most interesting one. It will be shown below that the solutions exhibit a chaotic behavior in this case. This model may be considered as a highly simplified model for turbulence that is described by the Navier-Stokes equations. The condition $R_2 < R$ appears here in analogy to the condition for the onset of turbulence that the Reynolds number must be above a critical Reynolds number.

Numerical Solution. Due to the nonlinear terms involved, the nonlinear equation system (9.171) can only be solved numerically. For doing this we write these equations as a system of difference equations

\begin{align}
y_1^{n+1} &= y_1^n + 10\,\Delta t \left(y_2^n - y_1^n\right), \tag{9.172a}\\
y_2^{n+1} &= y_2^n + \Delta t \left(y_1^n (28 - y_3^n) - y_2^n\right), \tag{9.172b}\\
y_3^{n+1} &= y_3^n + \Delta t \left(y_1^n y_2^n - \frac{8}{3}\,y_3^n\right), \tag{9.172c}
\end{align}

where $n = 0, 1, 2, \ldots$ Starting from the initial values $(y_1^0, y_2^0, y_3^0) = (y_{10}, y_{20}, y_{30})$, these equations describe the evolution of $y_1$, $y_2$, $y_3$ in time $t = n\,\Delta t$. The initial data $(y_{10}, y_{20}, y_{30}) = (5, 5, 5)$ will be applied here (except for the study of the influence of varying initial data described below). Equations (9.172) introduce a parameter: the time interval $\Delta t$. This time interval is considered to be sufficiently small in order to produce solutions of Eqs. (9.172) that are independent of $\Delta t$. In that case, the solutions of Eqs. (9.172) are seen as solutions of the differential equation system (9.171). The effect of different $\Delta t$ settings is illustrated in Fig. 9.9. This figure shows that the solutions of Eqs. (9.172) do not become independent of $\Delta t$ for $\Delta t$ variations over seven orders of magnitude ($\Delta t = 10^{-3}$ to $\Delta t = 10^{-10}$). It is not easy to consider the effect of smaller $\Delta t$ values because such simulations are expensive.
Fig. 9.9. Solutions $y_1(t)$ of the Lorenz equations (9.172) in dependence on the time interval $\Delta t$, where $(y_{10}, y_{20}, y_{30}) = (5, 5, 5)$. The thick lines present the results for $\Delta t = 10^{-3}, 10^{-5}, 10^{-7}, 10^{-9}$ in (a)-(d), respectively; the thin lines show the results for $\Delta t = 10^{-4}, 10^{-6}, 10^{-8}, 10^{-10}$ in (a)-(d), respectively. The simulation for $\Delta t = 10^{-10}$, e.g., requires 11.1 hours on a Pentium(R) 4 CPU 3.2 GHz personal computer with 1 GB memory. To improve the clarity of these plots, $y_1$ is only shown from $t = 20$ to $t = 40$ in Figs. 9.9c-d. For $t < 20$, the thick and thin lines in these figures hardly show any difference.

The conclusion that the solutions of Eqs. (9.172) depend on the time interval $\Delta t$ is surprising because this observation differs from the behavior of many other differential equations. It is relevant to note that this conclusion does not depend on the simple numerical scheme (9.172) used to solve the Lorenz equations (9.171): the result is the same for a variety of more advanced numerical schemes (see Yao 2007, 2010, Yao & Hughes 2008, and Liao 2009). Hence, it may be impossible to obtain a unique solution of the Lorenz equations (at least, no such solution that is independent of $\Delta t$ has been reported so far). The Lorenz equations (9.145) have to be considered, therefore, as a guideline for the construction of numerical schemes (one possible numerical scheme is given by Eqs. (9.172)) that are defined in conjunction with a specific choice of $\Delta t$. This feature of the Lorenz equations also poses questions about the reliability of numerical solutions of the fluid dynamics equations (the Navier-Stokes equations): see Yao (2007, 2010).
Fig. 9.10. The evolution of variables determined by the Lorenz equations (9.172) in several phase planes, where $\Delta t = 10^{-3}$ and $(y_{10}, y_{20}, y_{30}) = (5, 5, 5)$. (a), (c), and (e) show the evolution from $t = 0$ to $t = 20$; (b), (d), and (f) show the evolution from $t = 10{,}000$ to $t = 10{,}020$.

Phase Plane Evolution. The phase plane evolution of $y_1$, $y_2$, and $y_3$ is shown in Fig. 9.10 for $\Delta t = 10^{-3}$, which will be used in the following. These trajectories never cross each other because the system never exactly repeats itself. There is no convergence to any asymptotically stable state: the trajectories from $t = 10{,}000$ to $t = 10{,}020$ are very similar to the trajectories from $t = 0$ to $t = 20$. The long-term behavior of the phase plane trajectories shown in these plots is called the Lorenz attractor, noted for the butterfly shape of the $y_1$-$y_3$ trajectory and the owl mask shape of the $y_2$-$y_3$ trajectory.
Fig. 9.11. Solutions $y_1(t)$ of the Lorenz equations (9.172): the influence of variations of initial conditions, where $\Delta t = 10^{-3}$. The thick lines in these figures show the result for $(y_{10}, y_{20}, y_{30}) = (5, 5, 5)$. The thin lines in (a), (b), and (c) result from the initial data $(5.01, 5, 5)$, $(5.001, 5, 5)$, and $(5.0001, 5, 5)$, respectively.

The Butterfly Effect. The Lorenz equations are not only sensitive to variations of the time step used to integrate these equations. These equations are also very sensitive to minor variations of the initial conditions, as demonstrated in Fig. 9.11. In fact, every difference in initial data will result in different solutions. Figure 9.11 demonstrates that the larger the difference of initial data, the sooner there will be a difference between solutions. Lorenz accidentally discovered this sensitivity of solutions to perturbations of initial data when he restarted the numerical integration of equations with rounded-off data values. The sensitive dependence of solutions on initial conditions is called the Butterfly Effect because Lorenz compared this dependence with the effect of a butterfly on the weather: he asked 'Does the flap of a butterfly's wings in Brazil set off a tornado in Texas?'. The Lorenz equations are clearly a highly simplified version of equations that can be used for weather forecasting. However, they may explain why long-term weather predictions are simply impossible: the nonlinear interactions of variables involved in such equations imply a high sensitivity to perturbations, and perturbations always have to be taken into account (initial data are never known exactly).

Probability Density Functions. The Lorenz equations are deterministic, but they produce output that looks like random data. It is, therefore, a reasonable idea to study the probability for finding certain solution values.
This question will be addressed by considering the probability density function (PDF) of $y_1$, $y_2$, and $y_3$ values. The latter PDFs are denoted by $f_1(x_1)$, $f_2(x_2)$, and $f_3(x_3)$, where $x_1$, $x_2$, and $x_3$ represent the sample space variables of $y_1$, $y_2$, and $y_3$, respectively. The Lorenz equations were solved with $\Delta t = 10^{-3}$ up to $t = 25$, $t = 100$, and $t = 1000$. For $t \le 15$ the PDFs are still heavily affected by randomness: there are very sharp peaks that are difficult to resolve. The state at $t = 25$ corresponds to a state at which the PDF is relatively smooth but still significantly affected by the initial data. The state at $t = 100$ represents the asymptotic state. Evidence for this conclusion is provided by the PDF at $t = 1000$, which does not show any observable difference from the PDF at $t = 100$. $10^6$ solutions were generated by using varying initial conditions for $y_1$ (with an equal distance) between 4.995 and 5.005. The PDFs were calculated as filtered PDFs according to the explanations in Chap. 4. A filter interval equal to 2 was used to obtain smooth PDF curves.

The results are shown in Fig. 9.12 for $t = 25$, $t = 100$, and $t = 1000$. The PDFs at $t = 25$ are characterized by several modes (4, 5, and 3 modes regarding the $f_1$, $f_2$, and $f_3$ curves). These PDF structures show that the initial data considered (which only involve variations of $y_1$ values) may excite a spectrum of different motions. At $t = 100$ (i.e., in the asymptotic stage) we observe a smoothing between these modes: different modes merge. The latter leads to the development of a central mode, but the outer modes are still present. The PDF curves seen here are clearly different from the features of velocity and temperature PDFs for the unstably stratified atmospheric boundary layer (only the $f_3$ PDF shows some similarities at $t = 100$), but the $f_1$, $f_2$, and $f_3$ curves are similar in the sense that they represent a superposition of several distinct motions.

Fig. 9.12. The probability density functions (PDFs) $f_1(x_1)$, $f_2(x_2)$, and $f_3(x_3)$ related to $y_1$, $y_2$, and $y_3$ values, respectively. The upper row shows these PDFs at $t = 25$, and the lower row shows the PDFs at $t = 100$ (dashed line) and $t = 1000$ (solid line).
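The ensemble procedure described above can be sketched in a few lines of NumPy. This is a reduced illustration under stated assumptions, not the calculation behind Fig. 9.12: the ensemble has 2000 members instead of $10^6$, and a plain normalized histogram with bin width 2 is used in place of the filtered PDF of Chap. 4, which is not reproduced here. The function names are illustrative.

```python
import numpy as np


def ensemble_at(t_end, n_members=2000, dt=1e-3, R=28.0):
    """Integrate Eqs. (9.172) for equally spaced initial y1 values in [4.995, 5.005]."""
    y1 = np.linspace(4.995, 5.005, n_members)
    y2 = np.full(n_members, 5.0)
    y3 = np.full(n_members, 5.0)
    for _ in range(round(t_end / dt)):
        # Tuple assignment evaluates all right-hand sides with the old values.
        y1, y2, y3 = (y1 + 10.0 * dt * (y2 - y1),
                      y2 + dt * (y1 * (R - y3) - y2),
                      y3 + dt * (y1 * y2 - 8.0 / 3.0 * y3))
    return y1, y2, y3


def histogram_pdf(samples, bin_width=2.0):
    """Normalized histogram (a stand-in for the filtered PDF); returns bin centers and densities."""
    lo = bin_width * np.floor(samples.min() / bin_width)
    hi = bin_width * np.ceil(samples.max() / bin_width)
    if hi == lo:
        hi = lo + bin_width
    edges = np.arange(lo, hi + 0.5 * bin_width, bin_width)
    density, edges = np.histogram(samples, bins=edges, density=True)
    return 0.5 * (edges[:-1] + edges[1:]), density


if __name__ == "__main__":
    y1, y2, y3 = ensemble_at(25.0)
    centers, f1 = histogram_pdf(y1)
    for x, f in zip(centers, f1):
        print(f"x1 = {x:6.1f}: f1 = {f:.4f}")
```

With such a small ensemble the histogram is noisy, so it should only be read as a qualitative indication of the multi-modal structure discussed in the text.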
9.6 Summary

Let us summarize the observations made in this chapter regarding the extension of laws for one variable to the multivariate case of several interacting variables. This will be done by addressing the questions posed at the end of Sect. 9.1, i.e., the questions about the formulation of laws and the use of such equations.

Formulation of Multivariate Laws. Newton's Laws of Mechanics apply to the case of several variables. Thus, there is no question about the laws for mechanical processes of macroscopic bodies that move with velocities much slower than the speed of light. With regard to the laws for population dynamics we have another case because there is no unique formulation of such processes (see the discussions in Chap. 7). In this case, we follow the spirit of formulating such laws by using empirical modifications of single-variable equations. The particular question with regard to both the laws for population ecology and mechanical processes is how it is possible to use such equations for several variables. This problem is much more complicated than the use of equations for only one variable. Let us summarize the findings obtained regarding this question.

Numerical Solution of Multivariate Equations. The numerical solution of equations represents a general methodology for using evolution equations in order to study the features of processes. This approach works if it is possible to find convergent solutions. A convergent solution represents a solution that is independent of variations of small time intervals used in the numerical scheme. For most equations it is possible to find such convergent solutions, but this is not always the case. An example for the latter case was given here by the Lorenz equations. So far, a convergent solution has not been reported for these equations. This finding has implications for practical problems.
The Lorenz equations represent a highly simplified version of the Navier-Stokes equations that are used to calculate fluid dynamics processes. With regard to most applications it is very expensive to prove the convergence of solutions to the Navier-Stokes equations (such simulations may need several years). Thus, solutions of the Navier-Stokes equations are calculated by adopting a relatively small time step. Then, there is the question of whether such solutions represent convergent solutions, which is not the case for the simple Lorenz model derived from the Navier-Stokes equations. Analytical Study of Multivariate Equations. It is hardly possible to integrate multivariate nonlinear coupled equations. Usually, analytical conclusions can only be derived by means of linear stability analysis. Such analyses are very helpful, as demonstrated for the examples given in this chapter. We obtain insight in this way that can hardly be obtained by numerical simulations. It would be scarcely possible, e.g., to use numerical solutions for accurate calculations of the critical number R2 that separates nonchaotic and chaotic solutions of the Lorenz equations. On the
other hand, linear stability theory does not represent an alternative to numerical solutions, because the overall features of nonlinear equations cannot be studied in this way. The analysis of nonlinear equations is only possible under very special conditions. An example was given here by the discussion of the application of Lyapunov's second method. Such methods are applicable if there is a way to find conserved variables, as, e.g., the total energy of a process.
9.7 Exercises

9.2.1 Consider the linear equation system

\[
\frac{d}{dt}\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}
= \begin{pmatrix} -1 & 2 \\ 4 & 1 \end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}.
\]

The initial values are given by $y_1(0) = y_{10}$ and $y_2(0) = y_{20}$, where $y_{10}$ and $y_{20}$ are any parameters.
a) Determine the solutions $y_1(t)$ and $y_2(t)$ in dependence on $y_{10}$ and $y_{20}$.
b) Which relation between the initial values $y_{10}$ and $y_{20}$ is required such that $y_2$ is a linear function of $y_1$ that disappears asymptotically? Find the corresponding linear function $y_2 = y_2(y_1)$.
c) Which relation between the initial values $y_{10}$ and $y_{20}$ is required such that $y_2$ is a linear function of $y_1$ that goes to infinity asymptotically? Find the corresponding linear function $y_2 = y_2(y_1)$.
9.2.2 Consider the linear equation system d ïƒ¦ y1 ïƒ¶ ïƒ¦ 1 ï€ A(1 ï€« A) ïƒ¶ ïƒ¦ y1 ïƒ¶ ïƒ§ ïƒ·ï€½ïƒ§ ïƒ·ïƒ· ïƒ§ïƒ§ ïƒ·ïƒ· . A dt ïƒ§ïƒ¨ y 2 ïƒ·ïƒ¸ ïƒ§ïƒ¨ ï€ 2 ïƒ¸ ïƒ¨ y2 ïƒ¸ The initial values are y1(0) = 1 and y2(0) = ï€1, and A is any parameter. a) Find the solutions y1(t) and y2(t) to this initial value problem. b) For which range of A values do the solutions become zero as t ï‚® ï‚¥?
9.2.3 Consider the nonlinear equation system d ïƒ¦ y1 ïƒ¶ ïƒ¦1 ï€ y2 ïƒ¶ ïƒ¦ y1 ïƒ¶ ïƒ§ ïƒ·ï€½ïƒ§ ïƒ· ïƒ§ ïƒ·. dt ïƒ§ïƒ¨ y2 ïƒ·ïƒ¸ ïƒ§ïƒ¨1 ï€ y1 ïƒ·ïƒ¸ ïƒ§ïƒ¨ y2 ïƒ·ïƒ¸ a) Find the equilibrium points of this equation system. b) Which of the equilibrium points will be realized? Apply linear stability analysis to address this question.
9.2.4 Consider the nonlinear equation system
$$\frac{d}{dt}\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} y_1 & y_2 \\ y_2 & y_1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}.$$
a) Find the equilibrium points of this equation system. b) Which of the equilibrium points will be realized? Apply linear stability analysis to address this question. c) Apply the solutions y1(t) and y2(t) to the nonlinear equation system to address again the question about the realization of equilibrium solutions. Hint: you may use the equation system to derive and solve equations for y1 + y2 and y1 − y2, respectively.
9.2.5 A specific form of Duffing's nonlinear spring model reads (Wiggins 2010)
$$\frac{d}{dt}\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 - y_1^2 & 0 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}.$$
a) Find the equilibrium points of this equation system. b) Show that two equilibrium points represent centers. c) Use the linear equation systems near the centers to find the shape of trajectories in the y1-y2 phase plane. Hint: you may calculate the ratio dy1 / dy2 = (dy1 / dt) / (dy2 / dt) and solve the resulting separable equation. d) Explain the type of equation obtained for trajectories in c).
9.3.1 Consider the following modification of the competition for food dynamics (9.73) discussed in Sect. 9.3.2 (d is a non-negative number),
$$\frac{dy_1}{dt} = y_1 \left(1 - y_1 - d\, y_2\right), \qquad \frac{dy_2}{dt} = y_2 \left(1 - d\, y_1\right).$$
a) Find the equilibrium points of this equation system. b) Determine the stability behavior of solutions near the equilibrium points in dependence on the model parameter d. c) The coexistence equilibrium point depends on the value of d. For which range of d do we find a non-negative coexistence equilibrium point? d) Consider the range of values of d determined in c). Which equilibrium point will be realized asymptotically?
9.3.2 Consider the following modified Lotka-Volterra equations for the prey y1 (food fish) and predators y2 (sharks): see Allen (2007),
$$\frac{dy_1}{dt} = y_1 \left(A - f - B\, y_2\right), \qquad \frac{dy_2}{dt} = y_2 \left(-C - f + D\, y_1\right).$$
Here, A, B, C, D, and f are non-negative constants. The parameter f models a prey reduction due to fishing. How does f affect a coexistence of species?
9.3.3 Infectious diseases such as measles, mumps, rubella, and chickenpox are modeled by involving three groups of individuals (Kermack & McKendrick 1927, Anderson & May 1979a, 1979b, Anderson 1982, Fulford et al. 1997, Edelstein-Keshet 2005, Allen 2007). The total population N, which is considered to be constant, is subdivided into susceptible (S), infective (I), and removed (R) classes: N = S(t) + I(t) + R(t). Susceptible refers to individuals not infected but who are capable of contracting the disease and becoming infective. Infective refers to individuals who are infected and infectious. Removed refers to individuals who have had the disease and have definitely recovered, who are permanently immune, or are isolated until recovery. A very simple epidemic model (the SI model) assumes R = 0 and relates S and I by
$$\frac{dS}{dt} = -\frac{\beta}{N}\, S I, \qquad \frac{dI}{dt} = \frac{\beta}{N}\, S I.$$
Here, β is a positive constant of proportionality. a) Use the relation N = S + I to derive a closed equation for I. Compare this equation with differential equations considered in Chap. 7. Which type of equation is the equation for I? b) Solve the differential equation for I. c) Calculate the asymptotic values of S and I for large values of t. Explain the meaning of the result obtained.
9.3.4 A modification of the equations described in exercise 9.3.3 is given by the following SIS model,
$$\frac{dS}{dt} = -\frac{\beta}{N}\, S I + \gamma I, \qquad \frac{dI}{dt} = \frac{\beta}{N}\, S I - \gamma I.$$
Here, β and γ are positive constants of proportionality. We have again the relation N = S + I. a) Explain the relevance of a nonzero γ. b) Follow the approach in exercise 9.3.3 to derive a closed equation for I. c) Solve the differential equation for I. d) Calculate S and I for large values of t for the cases that β > γ and β ≤ γ, respectively.
9.3.5 A modification of the equations described in exercise 9.3.3 is given by the following SIR model,
$$\frac{dS}{dt} = -\frac{\beta}{N}\, S I, \qquad \frac{dI}{dt} = \frac{\beta}{N}\, S I - \gamma I, \qquad \frac{dR}{dt} = \gamma I.$$
Here, β and γ are positive constants of proportionality. This model implies the relation N = S(t) + I(t) + R(t).
a) Explain the difference between the assumptions reflected in this model and the SIS model. b) Due to the relation N = S(t) + I(t) + R(t) we can focus on the dynamics of S(t) and I(t). Use the model considered to derive an equation for the derivative dI / dS. c) Use this equation to calculate I = I(S). Rewrite this equation by using the abbreviations y = β (I − I0) / (γ N), x = 1 − S / S0, and R = β S0 / (γ N). Here, I0 and S0 refer to the initial values of I and S, respectively. d) An epidemic occurs if y = y(x) increases from its initial value zero to a local maximum. Under which condition can this happen? Determine the critical point of x and the maximum of y. Hint: consider the fact that 0 ≤ x ≤ 1 because S is a decreasing function by dS / dt = −β S I / N. e) No epidemic occurs if y is a decreasing function of x. Show under which condition this is the case.
9.3.6 A modification of the equations described in exercise 9.3.3 is given by the following SIRS model,
$$\frac{dS}{dt} = -\frac{\beta}{N}\, S I + \nu R, \qquad \frac{dI}{dt} = \frac{\beta}{N}\, S I - \gamma I, \qquad \frac{dR}{dt} = \gamma I - \nu R.$$
Here, β, γ, and ν are positive constants of proportionality. This model implies the relation N = S(t) + I(t) + R(t). a) We can focus on the dynamics of S(t) and I(t) because R is determined via the relation N = S(t) + I(t) + R(t). Use the SIRS model considered to derive a closed equation system for S*(t) = N − S(t) and I(t). b) Determine the equilibrium points implied by the equations for S* and I. c) Under which condition for the model parameters β, γ, and ν involved do we find positive equilibrium values for S(t) and I(t)? d) Consider the case that the parameter condition derived in c) is satisfied. Which of the equilibrium points will be realized?
9.4.1 Consider the total energy E = m g r (1 − cos α) + m r² (dα / dt)² / 2. Use the fact that E is constant for the undamped pendulum to derive the differential equation for the undamped pendulum. Hint: differentiate E.
9.4.2 The undamped nonlinear pendulum equation d²α / dt*² + sin α = 0 combined with the initial conditions α(0) = α0 and dα / dt*(0) = 0 can be used to find an exact expression for the pendulum period TP, which is the time required for the pendulum bob to swing through one complete cycle and return to its original position. This expression for TP reads (Boyce & DiPrima 2009)
$$T_P = 4 \sqrt{\frac{r}{g}} \int_0^{\pi/2} \frac{d\theta}{\sqrt{1 - \sin^2(\alpha_0 / 2)\, \sin^2\theta}}.$$
The integral represents an elliptic integral of the first kind. a) Consider the case that the initial angle α0 is very small. Calculate the integral by using the approximation (1 − x)^(−1/2) ≈ 1 + x / 2. Hint: use the integral ∫ sin² x dx = x / 2 − (1 / 4) sin 2x. b) For which initial angles α0 is the influence of α0 on TP smaller than 1%?
9.4.3 Consider the undamped nonlinear pendulum equation (9.142), this means y2 = ± 2^(1/2) (E* + cos y1 − 1)^(1/2). a) Write this formula in an explicit dependence on any initial conditions y10 and y20. b) Explain under which conditions the positive and negative signs in the formula for y2 have to be used, respectively. c) For which y20 do we always find open curves in the y1-y2 phase plane?
9.4.4 We consider a spring-mass system that involves two coupled masses m1 and m2: see the corresponding illustration. The masses can move in one direction. Their positions are x1 and x2. According to Newton's Second Law, this spring-mass system can be described by the equation system (Haberman 1977)
$$m_1 \frac{d^2 x_1}{dt^2} = k\,(x_2 - x_1 - L), \qquad m_2 \frac{d^2 x_2}{dt^2} = -k\,(x_2 - x_1 - L).$$
Here, k is the spring constant, and L is the unstretched length of the spring. a) The center of mass is defined by z = (m1 x1 + m2 x2) / (m1 + m2). Find an equation for z and solve it. b) Consider the spring stretching y = x2 − x1 − L. Derive an equation for y. c) Compare the y equation with the undamped spring-mass system equation d²y / dt² + k y / m = 0: see Eq. (7.42). Explain your observations. d) Solve the y equation. Hint: use the results derived in Sect. 7.3.3.
9.4.5 Consider again the spring-mass system given in exercise 9.4.4. a) Solve the equations for x1(t) and x2(t) for initial conditions chosen such that the initial values of the derivatives of z and y are zero (z'0 = y'0 = 0) and initial values of z and y that are given by z0 = 0 and y0 = 1. Hint: use the y and z solutions derived in exercise 9.4.4. b) Calculate the positions x1(t) and x2(t) for the case that m1 → ∞. Explain why the spring-mass system motion obtained in this way makes sense. c) Calculate the positions x1(t) and x2(t) for the case that m1 → 0. Explain why the spring-mass system motion obtained in this way makes sense.
9.5.1 Consider the Lorenz equations (9.144). a) Calculate the equilibrium points by accounting for variable values of the model parameters Pr and b. b) Show that the characteristic number R2, which separates nonchaotic and chaotic solutions of the Lorenz equations, is given by
$$R_2 = Pr\, \frac{Pr + b + 3}{Pr - b - 1}.$$
Hint: follow the explanations in Sect. 9.5.2.
9.5.2 Consider the expression for R2 given in exercise 9.5.1. a) Show the effect of increasing values of b on R2. b) For which Pr values is R2 positive? c) Consider Pr values so that R2 is positive. For growing Pr values, R2 decreases, it attains a minimum, and it increases. Find the minimum of R2.
9.5.3 O. Rössler (1976) analyzed the following equation system (b represents a non-negative model parameter)
$$\frac{dy_1}{dt} = -y_2 - y_3, \qquad \frac{dy_2}{dt} = y_1 + 0.2\, y_2, \qquad \frac{dy_3}{dt} = 0.2 + y_3\,(y_1 - b).$$
a) Show that there exist two potential equilibrium points, which are given by P = (0.2 Y3, −Y3, Y3). Here, Y3 = (5 / 2) [b ± (b² − 4 / 25)^(1/2)]. b) What are the conditions to have no equilibrium point, one equilibrium point, and two equilibrium points? c) Which behavior of solutions do you expect for the case that there is no equilibrium point?
9.5.4 Consider the Rössler equations given in exercise 9.5.3. a) Determine the linear equation system near the equilibrium points. b) Show that the characteristic equation, which characterizes the behavior of the linear equation system obtained in a), is given by
$$0 = r^3 + 0.5 \left( b - 0.4 \mp \sqrt{b^2 - 4/25} \right) r^2 + \left( 2.4\, b + 1 \pm 2.6 \sqrt{b^2 - 4/25} \right) r \mp \sqrt{b^2 - 4/25}.$$
The upper (lower) sign in this equation refers to the positive (negative) sign in Y3 = (5 / 2) [b ± (b² − 4 / 25)^(1/2)]. c) Show that the condition for the critical value of b that implies two pure imaginary eigenvalues is given by the equation
$$0 = \sqrt{b - 0.4} \left( \sqrt{b\,(b^2 - 0.12) - 0.016} \mp \sqrt{b\,(b^2 - 0.12) + 0.016} \right).$$
Hint: follow the explanations in Sect. 9.5.2.
d) What is the conclusion of the latter equation regarding the critical value of b that implies two pure imaginary eigenvalues? e) It was found for the Lorenz equations that the critical value of b that implies two pure imaginary eigenvalues separates complex eigenvalues with positive and negative real parts (nonchaotic and chaotic solutions of the Lorenz equations). Does the critical value of b determined in d) have the same property?
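The formulas quoted in exercises 9.5.3 and 9.5.4 can be checked numerically before working through the algebra. The following Python sketch is only an illustration under stated assumptions: the function names (`equilibria`, `rhs`, `det_jacobian_minus_r`, `char_poly`) and the test value b = 0.5 are choices made here, not part of the text. It evaluates the right-hand side of the Rössler equations at the two candidate equilibrium points and compares det(J − r I), computed directly from the Jacobian J, with the cubic polynomial of exercise 9.5.4 b).

```python
import math

def equilibria(b):
    """Candidate equilibrium points P = (0.2 Y3, -Y3, Y3) for b >= 0.4."""
    s = math.sqrt(b * b - 4.0 / 25.0)
    points = []
    for sign in (+1, -1):            # +1: upper sign, -1: lower sign in Y3
        y3 = 2.5 * (b + sign * s)
        points.append((sign, (0.2 * y3, -y3, y3)))
    return points

def rhs(y, b):
    """Right-hand side of the Roessler equations of exercise 9.5.3."""
    y1, y2, y3 = y
    return (-y2 - y3, y1 + 0.2 * y2, 0.2 + y3 * (y1 - b))

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b_, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b_ * (d * i - f * g) + c * (d * h - e * g)

def det_jacobian_minus_r(y, b, r):
    """det(J - r I), with J the Jacobian of rhs at the point y."""
    y1, _, y3 = y
    return det3([[-r, -1.0, -1.0],
                 [1.0, 0.2 - r, 0.0],
                 [y3, 0.0, y1 - b - r]])

def char_poly(b, sign, r):
    """Cubic polynomial of exercise 9.5.4 b); sign = +1 is the upper sign."""
    s = math.sqrt(b * b - 4.0 / 25.0)
    return (r ** 3 + 0.5 * (b - 0.4 - sign * s) * r ** 2
            + (2.4 * b + 1.0 + sign * 2.6 * s) * r - sign * s)

b = 0.5  # assumed test value; any b >= 0.4 gives real equilibrium points
for sign, point in equilibria(b):
    residual = max(abs(v) for v in rhs(point, b))
    # det(J - r I) equals minus the monic cubic, so the sum should vanish
    mismatch = max(abs(det_jacobian_minus_r(point, b, r) + char_poly(b, sign, r))
                   for r in (-1.0, 0.3, 2.0))
    print(sign, point, residual, mismatch)
```

Both printed residuals should be at the level of rounding errors. Such a check confirms the formulas but does not replace the derivations asked for in the exercises.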
9.5.5 Consider the Rössler equations in exercise 9.5.3 combined with b = 0.4. a) Use the results given in exercises 9.5.3 and 9.5.4 to find the equilibrium point and the roots r of the characteristic equation. Explain the stability behavior of solutions near the equilibrium point. b) Show the validity of the findings obtained in a) in terms of y1-y2, y1-y3, and y2-y3 phase plane plots. Solve the Rössler equations up to t = 40 to obtain these figures. Use the initial values (y10, y20, y30) = (0.25, −1.05, 0.95) and Δt = 10^(−4) for the numerical solution corresponding to the numerical scheme (9.172) used for the solution of the Lorenz equations.
9.5.6 Consider the Rössler equations in exercise 9.5.3 combined with b = 0.5. a) Use the results given in exercises 9.5.3 and 9.5.4 to find the first equilibrium point and the related roots r of the characteristic equation. What do the results obtained mean regarding the stability behavior of solutions near the first equilibrium point? b) Show the validity of the findings obtained in a) by y1-y2, y1-y3, and y2-y3 phase plane plots. Solve the Rössler equations up to t = 40 to obtain these figures. Use the initial values (y10, y20, y30) = (0.35, −2.1, 1.9) and Δt = 10^(−4) for the numerical solution corresponding to the numerical scheme (9.172) used for the solution of the Lorenz equations. c) Find the second equilibrium point and the related roots r of the characteristic equation. Explain the meaning of the results obtained regarding the stability behavior of solutions near the second equilibrium point. d) Show the validity of the findings obtained in c) by means of y1-y2, y1-y3, and y2-y3 phase plane plots. Solve the Rössler equations up to t = 100 to obtain these figures. Use the initial values (y10, y20, y30) = (0, −0.4, 0.4) and Δt = 10^(−4) for the numerical solution.
10 Stochastic Multivariate Evolution
The discussions of stochastic methods in previous chapters were related to the consideration of a single random variable. This approach is appropriate to explain the basic structure of evolution equations for stochastic processes and their PDFs. On the other hand, most applications cannot be handled on the basis of methods that describe the evolution of single variables. Real processes usually take place in the three-dimensional physical space, and they often involve several variables. Examples are given by flow phenomena (the three-dimensional atmospheric wind field that interacts with the temperature), chemical reactor processes (involving a variety of chemical species in three-dimensional reactors), and the competition of several population densities in areas with varying food resources. To prepare the application of stochastic methods to such cases we will extend now the methods developed in Chap. 8 to the case of several random variables. In fact, the methods to be described in this chapter are applicable to a wide range of realistic problems. More detailed descriptions of corresponding applications can be found elsewhere (Pope 2000, Roekaerts 2002, Heinz 2003, Fox 2003, Givi 2006). From a mathematical point of view, the discussion here reveals a relationship between partial differential equations and stochastic ordinary differential equations, which is very helpful for the solution of complicated partial differential equations. Section 10.1 explains the motivation for considering joint processes of several random variables. Joint PDFs that do not evolve will be considered in Sects. 10.2 and 10.3: Sect. 10.2 explains the definition of joint PDFs and Sect. 10.3 presents the normal model for joint PDFs. Joint PDFs that evolve in time will be considered in Sects. 10.4 and 10.5. The concepts for the description of the evolution of a single-variable PDF (and the corresponding stochastic process) will be extended to the several-variable case in Sect. 10.4. 
Section 10.5 explains the application of such equations to the modeling of molecular and fluid motion. Section 10.6 summarizes the basic observations made in this chapter.
S. Heinz, Mathematical Modeling, DOI 10.1007/978-3-642-20311-4_10, © Springer-Verlag Berlin Heidelberg 2011
10.1 Motivation
Fluid Dynamics. As an example, let us consider the motion of fluids (e.g., atmospheric motions) in order to illustrate the need for methods for the calculation of the evolution of several random variables. The prediction of fluid flow requires the calculation of the mean velocity Ui(x, t) of molecules, which represents the ith component (i = 1, 3) of the fluid velocity at the position x = (x1, x2, x3) at time t. It will be shown in Sect. 10.5 that the fluid velocity Ui(x, t) and fluid mass density ρ(x, t) have to satisfy a coupled system of partial differential equations, which represent the conservation of mass and momentum,
$$\frac{D\rho}{Dt} + \rho\, \frac{\partial U_m}{\partial x_m} = 0, \tag{10.1a}$$
$$\frac{DU_i}{Dt} + \frac{1}{\rho} \frac{\partial \rho\, \sigma_{im}}{\partial x_m} = 0. \tag{10.1b}$$
Here, σim(x, t) refers to the variance of molecular velocities, this means σim = ⟨vi vm⟩ (see Sect. 10.5). We use the sum convention for repeated subscripts, this means we have for example
$$\frac{\partial U_m}{\partial x_m} = \frac{\partial U_1}{\partial x_1} + \frac{\partial U_2}{\partial x_2} + \frac{\partial U_3}{\partial x_3}. \tag{10.2}$$
The total derivative (or substantial or material derivative) of any property Q(x, t) (we may set, for example, Q = ρ or Q = Ui) is defined by
$$\frac{DQ}{Dt} = \frac{\partial Q}{\partial t} + U_m \frac{\partial Q}{\partial x_m}. \tag{10.3}$$
The meaning of DQ / Dt can be seen by considering the property Q at x = x(t). Here, x(t) is a point that follows the fluid velocity Ui, i.e., x(t) is determined by
$$\frac{dx_i(t)}{dt} = U_i\big(x(t), t\big). \tag{10.4}$$
The total derivative DQ / Dt at x = x(t) reads
$$\frac{DQ\big(x(t), t\big)}{Dt} = \frac{\partial Q\big(x(t), t\big)}{\partial t} + U_m\big(x(t), t\big) \frac{\partial Q\big(x(t), t\big)}{\partial x_m} = \frac{\partial Q\big(x(t), t\big)}{\partial t} + \frac{\partial Q\big(x(t), t\big)}{\partial x_m} \frac{dx_m(t)}{dt} = \frac{dQ\big(x(t), t\big)}{dt}. \tag{10.5}$$
The last line makes use of Eq. (10.4). Hence, DQ / Dt represents the total change of the property Q in time at a point x(t) moving with the fluid velocity Ui.
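The statement of Eq. (10.5) can be illustrated in one dimension. The following Python sketch is a minimal example under assumptions made only for illustration: a constant velocity U, a hypothetical field Q(x, t) = sin(x − 0.2 t), and a finite-difference step h. It compares the material derivative of Eq. (10.3) with the rate of change of Q seen by a point that moves according to Eq. (10.4).

```python
import math

U = 0.5  # constant fluid velocity (assumed for illustration)

def Q(x, t):
    """Hypothetical scalar field used only for this check."""
    return math.sin(x - 0.2 * t)

def dQ_dt(x, t):
    """Partial derivative of Q with respect to t at fixed x."""
    return -0.2 * math.cos(x - 0.2 * t)

def dQ_dx(x, t):
    """Partial derivative of Q with respect to x at fixed t."""
    return math.cos(x - 0.2 * t)

def material_derivative(x, t):
    """DQ/Dt according to Eq. (10.3) in one dimension."""
    return dQ_dt(x, t) + U * dQ_dx(x, t)

def path(t, x0=1.0):
    """Solution of dx/dt = U, Eq. (10.4), starting at x0."""
    return x0 + U * t

def rate_along_path(t, h=1e-6):
    """Central-difference estimate of dQ(x(t), t)/dt along the moving point."""
    return (Q(path(t + h), t + h) - Q(path(t - h), t - h)) / (2.0 * h)

t = 2.0
print(material_derivative(path(t), t), rate_along_path(t))
```

The two printed numbers should agree up to the finite-difference error, which is what the last equality in Eq. (10.5) expresses.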
Closure Problem. Equations (10.1) are unclosed because the variance σim of molecular velocities is unknown. This is not a minor problem, because σim determines the velocity change DUi / Dt according to Eq. (10.1b). The variance σim has to satisfy a conservation equation, too (see Sect. 10.5),
$$\frac{D\sigma_{ij}}{Dt} + \frac{1}{\rho} \frac{\partial \rho \langle v_i v_j v_m \rangle}{\partial x_m} + \frac{\partial U_i}{\partial x_m}\, \sigma_{mj} + \frac{\partial U_j}{\partial x_m}\, \sigma_{mi} = -\frac{2}{T} \left( \sigma_{ij} - \frac{\sigma_{kk}}{3}\, \delta_{ij} \right). \tag{10.6}$$
Here, ⟨vi vj vm⟩ is the triple correlation of molecular velocities, T is a characteristic relaxation time scale, and δij refers to the Kronecker delta (which is zero for i ≠ j and one for i = j). This equation is again unclosed because the triple correlation is unknown. It would be possible to continue in this way by considering an equation for the triple correlation. However, this equation does again contain an unknown correlation of higher order, and this applies to all such equations. The solution of this closure problem requires a model that explains the evolution of all moments of molecular velocities, which define the joint PDF of the three molecular velocity components. Therefore, we need a model for the evolution of this joint PDF. Such a model will be presented in Sect. 10.5.
Questions Considered. Hence, we have to extend the methods for the analysis and modeling of single random variables to the description of properties of several random variables. In particular, we need answers to the following questions:
• How can we extend concepts for the data analysis of single random variables to concepts for the data analysis of joint random variables?
• How can we extend usual PDF models for single random variables (for example, the normal PDF model) to the case of several variables?
• How can we extend PDF equations for the description of the evolution of single-variable PDFs to equations for the evolution of joint PDFs of several variables?
The first two questions will be considered in Sects. 10.2 and 10.3 for the case of two random variables by focusing on the data analysis. The last question will be addressed in Sects. 10.4 and 10.5 with focus on the modeling of several-variable processes.
10.2 Data Analysis Concepts for Joint Random Variables How can we extend concepts for the data analysis of single random variables to concepts for the data analysis of joint random variables? First of all, this requires the definition of a joint PDF, this means the explanation of how a joint PDF can be obtained from measurements. It will be also helpful to extend the definitions of a single-variable PDF and its moments introduced in Chap. 4 by the consideration of correlations, which corresponds to the introduction of conditional means. These
questions will be addressed in this section by considering PDFs of two random variables X and Y, which may have values between negative and positive infinity. The concepts to be developed can be straightforwardly extended to the case of many variables. Such multidimensional joint PDFs will be considered in Sect. 10.4.1 in the context of the discussion of evolution equations for joint PDFs.
10.2.1 Joint Probability Density Functions
Joint PDF. In extension of the definition f(x) = ⟨δ(x − X)⟩ of the PDF of a single random variable X, we define the joint PDF of two variables X and Y by
$$f(x, y) = \langle \delta(x - X)\, \delta(y - Y) \rangle. \tag{10.7}$$
The joint PDF f(x, y) has the properties
$$\int f(x, y)\, dy = \int \langle \delta(x - X)\, \delta(y - Y) \rangle\, dy = \langle \delta(x - X) \rangle = f(x), \tag{10.8a}$$
$$\int f(x, y)\, dx = \int \langle \delta(x - X)\, \delta(y - Y) \rangle\, dx = \langle \delta(y - Y) \rangle = f(y). \tag{10.8b}$$
The first rewriting of the left-hand sides makes use of the definition (10.7) of the joint PDF f(x, y). The second rewriting applies the normalization property of delta functions. The PDFs f(x) and f(y) of single variables are called marginal PDFs. As shown in exercise 10.2.1, other typical properties of the joint PDF f(x, y) are
$$f(x, y) \ge 0, \tag{10.9a}$$
$$f(-\infty, y) = f(\infty, y) = f(x, -\infty) = f(x, \infty) = 0, \tag{10.9b}$$
$$\iint f(x, y)\, dx\, dy = 1, \tag{10.9c}$$
$$\iint g(x, y)\, f(x, y)\, dx\, dy = \langle g(X, Y) \rangle, \tag{10.9d}$$
where g(x, y) is any function of x and y. The knowledge of the joint PDF f(x, y) enables the calculation of the probability for joint events a ≤ X ≤ b and c ≤ Y ≤ d,
$$P(a \le X \le b,\ c \le Y \le d) = \int_c^d \left[ \int_a^b f(x, y)\, dx \right] dy. \tag{10.10}$$
The validity of this relation can be seen by using the definition (10.7) of f(x, y),
$$\int_c^d \left[ \int_a^b f(x, y)\, dx \right] dy = \int_c^d \left[ \int_a^b \left\langle \frac{d\theta(x - X)}{dx}\, \frac{d\theta(y - Y)}{dy} \right\rangle dx \right] dy = \big\langle \big( \theta(b - X) - \theta(a - X) \big) \big( \theta(d - Y) - \theta(c - Y) \big) \big\rangle. \tag{10.11}$$
Here, the delta functions were replaced by derivatives of theta functions according to δ(x − X) = dθ(x − X) / dx. Relation (10.10) can be specified for the case
$$P(x \le X \le x + dx,\ y \le Y \le y + dy) = \int_y^{y + dy} \left[ \int_x^{x + dx} f(\hat{x}, \hat{y})\, d\hat{x} \right] d\hat{y}, \tag{10.12}$$
where dx and dy are infinitesimal intervals. In the first order of approximation we can replace f(x̂, ŷ) in the integral by f(x, y). Then, Eq. (10.12) provides
$$P(x \le X \le x + dx,\ y \le Y \le y + dy) = f(x, y)\, dx\, dy. \tag{10.13}$$
Hence, f(x, y) determines the probability to find X and Y in infinitesimal intervals at x and y.
Independence. The joint PDF f(x, y) becomes simpler for the specific case of independent random variables, this means for the case that there is no effect of one variable on the other variable. For this case, the joint PDF can be written
$$f(x, y) = \langle \delta(x - X) \rangle\, \langle \delta(y - Y) \rangle = f(x)\, f(y). \tag{10.14}$$
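The probability of a joint event and its factorization for independent variables can be estimated from samples. The following Python sketch is a minimal illustration, assuming a standard normal X, a uniform Y on [0, 1] generated independently of X, and arbitrarily chosen interval bounds a, b, c, d; the helper name `indicator` is hypothetical. The ensemble average of the product of indicator functions corresponds to the right-hand side of Eq. (10.11).

```python
import random

def indicator(lo, value, hi):
    """theta(hi - value) - theta(lo - value): one inside [lo, hi], zero outside."""
    return 1.0 if lo <= value <= hi else 0.0

random.seed(1)
n = 200_000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]      # samples of X
ys = [random.uniform(0.0, 1.0) for _ in range(n)]    # samples of Y, independent of X

a, b, c, d = -0.5, 1.0, 0.2, 0.7                     # assumed interval bounds

# Joint probability as the mean of the product of indicators, cf. Eq. (10.11)
p_joint = sum(indicator(a, x, b) * indicator(c, y, d) for x, y in zip(xs, ys)) / n
# Marginal probabilities of the single events
p_x = sum(indicator(a, x, b) for x in xs) / n
p_y = sum(indicator(c, y, d) for y in ys) / n

print(p_joint, p_x * p_y)
```

For independent samples the two printed values should differ only by statistical noise, which reflects the factorization in Eq. (10.14).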
The consideration of independent variables simplifies analyses significantly. The concepts of independent and uncorrelated random variables (variables with a zero correlation coefficient rXY, see Sect. 2.3.1) are similar but different. Independent variables are always uncorrelated: the correlation coefficient rXY = 0. However, the converse is not true in general: uncorrelated variables do not have to be independent. An example for the latter case is the following: Let X be uniformly distributed on [−1, 1] and Y = X². The calculation of the correlation coefficient then shows that both variables are uncorrelated. However, X determines Y, and Y restricts X to at most two values. Hence, X and Y are not independent variables.
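The example of X uniformly distributed on [−1, 1] and Y = X² can be reproduced with a few lines of Python. This is a sketch under assumptions made here: the sample size, the seed, and the condition |X| > 0.5 used to expose the dependence are illustrative choices. The sample correlation coefficient should be close to zero, whereas the mean of Y changes markedly once information about X is used (theoretically from 1/3 to 7/12).

```python
import random

def correlation(xs, ys):
    """Sample correlation coefficient r_XY."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / (var_x * var_y) ** 0.5

random.seed(0)
xs = [random.uniform(-1.0, 1.0) for _ in range(200_000)]
ys = [x * x for x in xs]                  # Y is fully determined by X

r_xy = correlation(xs, ys)                # close to zero: uncorrelated
mean_y = sum(ys) / len(ys)                # unconditional mean of Y
tail = [y for x, y in zip(xs, ys) if abs(x) > 0.5]
mean_y_tail = sum(tail) / len(tail)       # mean of Y given |X| > 0.5

print(r_xy, mean_y, mean_y_tail)
```

The clear difference between the two means shows that Y depends on X although the correlation coefficient vanishes.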
10.2.2 Conditional Probability Density Functions
Conditional PDF. The joint PDF f(x, y) determines the probability to find X and Y in infinitesimal intervals at x and y. However, a slightly different question arises relatively often: what is the probability to find values of one variable (e.g., y) for a fixed value of the other variable (e.g., x)? See the discussion in Sect. 10.2.3. Information regarding this question is given by the joint PDF f(x, y), but f(x, y) does not represent a PDF for y (the integral over y does not result in one: see Eq. (10.8a)). Therefore, the joint PDF is rescaled so that the rescaled PDF integrates to one. This rescaled PDF is given by
$$f(y \mid x) = \frac{f(x, y)}{f(x)}. \tag{10.15}$$
The PDF f(y | x) is called the PDF of y conditioned on x (or simply conditional PDF): it describes the probability to find y values under the condition that X = x. The integral of the conditional PDF f(y | x) over y is equal to one,
$$\int f(y \mid x)\, dy = 1, \tag{10.16}$$
which follows from f(y | x) = f(x, y) / f(x) and the property (10.8a) of f(x, y). The conditional PDF of independent variables, for which we have f(x, y) = f(x) f(y), is equal to the corresponding unconditional PDF, f(y | x) = f(y).
Conditional Mean. The conditional PDF can be used to define a conditional mean. With regard to any function g(x, y), this relation reads
$$\int g(x, y)\, f(y \mid x)\, dy = \langle g(X, Y) \mid x \rangle. \tag{10.17}$$
By using the definition f(y | x) = f(x, y) / f(x), this relation also can be written
$$\int g(x, y)\, f(x, y)\, dy = \langle g(X, Y) \mid x \rangle\, f(x). \tag{10.18}$$
The consistency of this relation can be seen by integrating it over x,
$$\int \langle g(X, Y) \mid x \rangle\, f(x)\, dx = \iint g(x, y)\, f(x, y)\, dy\, dx = \langle g(X, Y) \rangle. \tag{10.19}$$
The last expression follows from the property (10.9d) of joint PDFs. Hence, the integral over the conditional mean multiplied with the probability to find x equals the unconditional mean.
Conditional Mean Calculation. Equation (10.17) can be used to calculate a conditional mean, but this requires the joint PDF f(x, y) and the integration over y. This can be avoided by performing the integration over y in Eq. (10.17),
$$\langle g(X, Y) \mid x \rangle = \frac{1}{f(x)} \int g(x, y)\, \langle \delta(x - X)\, \delta(y - Y) \rangle\, dy = \frac{1}{f(x)} \left\langle g(X, Y) \int \delta(x - X)\, \delta(y - Y)\, dy \right\rangle = \frac{1}{f(x)}\, \langle g(X, Y)\, \delta(x - X) \rangle. \tag{10.20}$$
The first rewriting of the conditional mean is obtained by replacing the joint PDF f(x, y) in the conditional PDF f(y | x) = f(x, y) / f(x) by its definition (10.7). In the second step, the averaging is extended over the whole integral, and the sifting property of delta functions is used, such that g(X, Y) can be written in front of the integral. The last expression results from the normalization condition for delta functions. The relation between a conditional mean and a conditional PDF can be seen by setting g(X, Y) = δ(y − Y). For this case, Eq. (10.20) becomes
$$\langle \delta(y - Y) \mid x \rangle = \frac{1}{f(x)}\, \langle \delta(y - Y)\, \delta(x - X) \rangle = \frac{f(x, y)}{f(x)} = f(y \mid x). \tag{10.21}$$
These rewritings follow from the definitions of a joint PDF f(x, y) and conditional PDF f(y | x). Thus, the conditional PDF represents a conditional mean.
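In practice, the delta function in Eq. (10.20) has to be replaced by something that can be evaluated from a finite sample. The following Python sketch uses bins of finite width for this purpose; this discretization, the function name `conditional_means`, and the test data (X uniform on [0, 1), Y = X² plus normal noise) are assumptions made for illustration and are not taken from the text. Within each bin, the conditional mean is the sum of g over the samples in the bin divided by the number of samples in the bin, and the bin probabilities play the role of f(x) dx.

```python
import math
import random

def conditional_means(xs, gs, x_min, x_max, n_bins):
    """Bin-wise estimates of <g | x> and of the probability to find X in each bin."""
    width = (x_max - x_min) / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, g in zip(xs, gs):
        k = math.floor((x - x_min) / width)
        if 0 <= k < n_bins:              # ignore samples outside [x_min, x_max)
            sums[k] += g
            counts[k] += 1
    probs = [c / len(xs) for c in counts]
    means = [s / c if c > 0 else None for s, c in zip(sums, counts)]
    return means, probs

random.seed(2)
n = 100_000
xs = [random.random() for _ in range(n)]                 # X uniform on [0, 1)
ys = [x * x + random.gauss(0.0, 0.1) for x in xs]        # Y = X^2 + noise

means, probs = conditional_means(xs, ys, 0.0, 1.0, 10)

# Discrete analog of Eq. (10.19): summing <Y | x> times the bin probability
# recovers the unconditional mean <Y>.
recombined = sum(m * p for m, p in zip(means, probs) if m is not None)
mean_y = sum(ys) / n
print(means)
print(recombined, mean_y)
```

The bin-wise means should follow the curve x², and the recombined value should agree with the unconditional mean up to rounding errors, because all samples fall into one of the bins.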
10.2.3 Application to Optimal Modeling
Optimal Models. Typical problems involving two random variables were considered in Chap. 2. We considered a set of (Xi, Yi) data, where i = 1, N. The problem was to find a model yM(x) that agrees as well as possible with the given data. The particular problem was to find a model yM(x) that minimizes the least-squares error
$$E^2 = \frac{1}{N} \sum_{i=1}^{N} \big( Y_i - y_M(X_i) \big)^2. \tag{10.22}$$
The objective here is not to make any modifications of the approach presented in Chap. 2, but to present the findings obtained in Chap. 2 in terms of properties of random variables.
Error Definition. In terms of the notation applied here, the least-squares error can be written as a mean value,
$$E^2 = \big\langle [Y - y_M(X)]^2 \big\rangle. \tag{10.23}$$
According to Eq. (10.19), the least-squares error E² also can be written in terms of a conditional mean,
$$E^2 = \int \big\langle [Y - y_M(X)]^2 \mid x \big\rangle\, f(x)\, dx = \int \big\langle [Y - y_M(x)]^2 \mid x \big\rangle\, f(x)\, dx. \tag{10.24}$$
The last expression accounts for the condition X = x. The advantage of using the conditional mean is that the error E² is now related to the function yM(x), which has to be calculated.
Minimal Error. Which model function yM(x) could minimize the least-squares error? The last expression in Eq. (10.24) represents the mean value of the nonnegative numbers ⟨[Y − yM(x)]² | x⟩. Therefore, the minimum value of E² is given if ⟨[Y − yM(x)]² | x⟩ becomes minimal. This conditional mean can be written
$$\big\langle [Y - y_M(x)]^2 \mid x \big\rangle = \big\langle [\,Y - \langle Y \mid x \rangle + \langle Y \mid x \rangle - y_M(x)\,]^2 \mid x \big\rangle = \big\langle [\,Y - \langle Y \mid x \rangle\,]^2 \mid x \big\rangle + \big\langle [\,\langle Y \mid x \rangle - y_M(x)\,]^2 \mid x \big\rangle + h(x). \tag{10.25}$$
The first rewriting inserts ⟨Y | x⟩ − ⟨Y | x⟩ = 0. The second rewriting results from distributing the quadratic term, where the function h(x) is given by
$$h(x) = 2\, \big\langle [\,Y - \langle Y \mid x \rangle\,]\, [\,\langle Y \mid x \rangle - y_M(x)\,] \mid x \big\rangle. \tag{10.26}$$
A closer look at h(x) shows that h(x) = 0,
$$h(x) = 2\, [\,\langle Y \mid x \rangle - y_M(x)\,]\, \big\langle [\,Y - \langle Y \mid x \rangle\,] \mid x \big\rangle = 2\, [\,\langle Y \mid x \rangle - y_M(x)\,]\, [\,\langle Y \mid x \rangle - \langle Y \mid x \rangle\,] = 0. \tag{10.27}$$
The expression ⟨Y | x⟩ − yM(x) is unaffected by the condition, which results in the first expression. The second expression follows from distributing the conditional mean. Therefore, Eq. (10.25) can be written
$$\big\langle [Y - y_M(x)]^2 \mid x \big\rangle = \big\langle [\,Y - \langle Y \mid x \rangle\,]^2 \mid x \big\rangle + [\,\langle Y \mid x \rangle - y_M(x)\,]^2. \tag{10.28}$$
The model function yM(x) affects only the last term, which is non-negative. Thus, the conditional mean ⟨[Y − yM(x)]² | x⟩ becomes minimal if the last term disappears, this means if
$$y_M(x) = \langle Y \mid x \rangle. \tag{10.29}$$
This expression is relevant: (i) it explains how an optimal model function yM(x) can be calculated on the basis of measured data (without making use of any model assumptions), (ii) it provides a basis for the optimization of model function types considered (see the explanations in the next paragraph), and (iii) it enables the calculation of yM(x) on the basis of a model for the joint PDF f(x, y) of X and Y (see Sect. 10.3).
Optimal Linear Model. The usual way to address optimization problems is the attempt to transform the data such that a linear model can be used,
$$y_M(x) = a x + b. \tag{10.30}$$
The model parameters a and b can be calculated in terms of Eq. (10.29). By using yM(x) = ⟨Y | x⟩ and the definition (10.20) of conditional means, Eq. (10.30) multiplied by f(x) can be written
$$\langle Y\, \delta(x - X) \rangle = (a x + b)\, f(x). \tag{10.31}$$
We take the integral over x to obtain a condition for b,
$$\langle Y \rangle = a\, \langle X \rangle + b. \tag{10.32}$$
By replacing the parameter b in Eq. (10.31) by this condition we obtain
$$\langle Y\, \delta(x - X) \rangle = \langle Y \rangle\, f(x) + a\, (x - \langle X \rangle)\, f(x). \tag{10.33}$$
The term ⟨Y⟩ f(x) can be combined with the left-hand side. With the fluctuation $\tilde{Y} = Y - \langle Y \rangle$ this gives
$$\langle \tilde{Y}\, \delta(x - X) \rangle = a\, (x - \langle X \rangle)\, f(x). \tag{10.34}$$
The multiplication of this expression by x ï€ <X> and integration over x then provides a condition for the model parameter a, ~ ~ ~ a X 2 ï€½ ïƒ² ï€¨x ï€ X ï€© Y ï¤ ( x ï€ X ) dx ï€½ ïƒ² ï€¨X ï€ X ï€© Y ï¤ ( x ï€ X ) dx ~ ~~ ï€½ ï€¨X ï€ X ï€© Y ïƒ² ï¤ ( x ï€ X ) dx ï€½ X Y .
(10.35)
Here, we used the sifting property and normalization condition of delta functions. By combing yM(x) = = a x + b with Eqs. (10.32) and (10.35) we get
~~ XY y M ( x) ï€½ Y | x ï€½ Y ï€« ~ 2 ï€¨x ï€ X X
ï€©ï€½
~ Y ï€« rXY Y 2
1/ 2
xï€ X . ~ 1/ 2 X2
(10.36)
This result recovers Eq. (2.47). The last expression applies the correlation coefficient, which was already defined in Chap. 2, rXY
~~ XY ï€½ ~ 1/ 2 ~ 2 X2 Y
1/ 2
.
(10.37)
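The use of the latter expression for $y_M(x)$ in Eq. (10.23) results in the following minimal least-squares error

$$\begin{aligned}
E^2 &= \left\langle [Y - y_M(X)]^2 \right\rangle = \left\langle \left[ \tilde{Y} - \tilde{X}\, \frac{\langle \tilde{X} \tilde{Y} \rangle}{\langle \tilde{X}^2 \rangle} \right]^2 \right\rangle \\
&= \langle \tilde{Y}^2 \rangle - 2\, \frac{\langle \tilde{X} \tilde{Y} \rangle^2}{\langle \tilde{X}^2 \rangle} + \frac{\langle \tilde{X} \tilde{Y} \rangle^2}{\langle \tilde{X}^2 \rangle^2}\, \langle \tilde{X}^2 \rangle \\
&= \langle \tilde{Y}^2 \rangle - \frac{\langle \tilde{X} \tilde{Y} \rangle^2}{\langle \tilde{X}^2 \rangle} = \langle \tilde{Y}^2 \rangle\, (1 - r_{XY}^2).
\end{aligned} \tag{10.38}$$

This expression recovers Eq. (2.53) for $E^2$; see the related discussion in Chap. 2.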
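The optimal linear model can be illustrated numerically. The following is a minimal sketch, assuming NumPy and a hypothetical synthetic data set (the function name `optimal_linear_model` and all numerical values are illustrative assumptions, not part of the text). It evaluates a, b, and $r_{XY}$ from sample moments according to Eqs. (10.32), (10.35), and (10.37), and compares the error (10.38) with the directly evaluated mean squared deviation.

```python
import numpy as np

def optimal_linear_model(x, y):
    """Return (a, b, r, err2) for the least-squares line y_M(x) = a*x + b."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xf = x - x.mean()          # fluctuation X~ = X - <X>
    yf = y - y.mean()          # fluctuation Y~ = Y - <Y>
    var_x = np.mean(xf**2)     # <X~^2>
    var_y = np.mean(yf**2)     # <Y~^2>
    cov_xy = np.mean(xf * yf)  # <X~ Y~>
    a = cov_xy / var_x                   # Eq. (10.35)
    b = y.mean() - a * x.mean()          # Eq. (10.32)
    r = cov_xy / np.sqrt(var_x * var_y)  # Eq. (10.37)
    err2 = var_y * (1.0 - r**2)          # Eq. (10.38)
    return a, b, r, err2

# Hypothetical data: a noisy straight line (values chosen for illustration only).
rng = np.random.default_rng(0)
x = rng.normal(2.0, 1.5, size=10_000)
y = 0.8 * x + 1.0 + rng.normal(0.0, 0.5, size=x.size)

a, b, r, err2 = optimal_linear_model(x, y)
direct_err2 = np.mean((y - (a * x + b))**2)  # <[Y - y_M(X)]^2> evaluated directly
print(a, b, r, err2, direct_err2)
```

For sample moments, the two error values agree up to rounding, because the algebra leading to Eq. (10.38) holds for any averaging operation.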
10.3 The Joint Normal Probability Density Function Model

Let us now address the question of how joint PDFs can be modeled. We will consider here the extension of the normal PDF model for single variables, which represents the most relevant PDF model for unbounded variables, to the case of two correlated random variables. The extension to the many-variable case will be described in the context of Fokker-Planck equations (see Sect. 10.4).
10.3.1 The Joint Normal Probability Density Function Model

Joint Normal PDF. The joint normal PDF $f(x, y)$ of two random variables X and Y can be defined by

$$\begin{aligned}
f(x, y) &= \frac{1}{2\pi \sqrt{(1 - r_{XY}^2)\, \langle \tilde{X}^2 \rangle \langle \tilde{Y}^2 \rangle}}\, \exp\left\{ -\frac{\hat{x}^2 + \hat{y}^2 - 2 r_{XY}\, \hat{x} \hat{y}}{2\, (1 - r_{XY}^2)} \right\} \\
&= \frac{1}{2\pi \sqrt{(1 - r_{XY}^2)\, \langle \tilde{X}^2 \rangle \langle \tilde{Y}^2 \rangle}}\, \exp\left\{ -\frac{(\hat{y} - r_{XY}\, \hat{x})^2 + (1 - r_{XY}^2)\, \hat{x}^2}{2\, (1 - r_{XY}^2)} \right\}.
\end{aligned} \tag{10.39}$$
The second line represents a convenient rewriting of the first line, which will be used below. To represent these expressions efficiently we applied here the nondimensional variables

$$\hat{x} = \frac{x - \langle X \rangle}{\langle \tilde{X}^2 \rangle^{1/2}}, \qquad \hat{y} = \frac{y - \langle Y \rangle}{\langle \tilde{Y}^2 \rangle^{1/2}} \tag{10.40}$$

as abbreviations. The correlation coefficient $r_{XY}$ is given by Eq. (10.37). By defining normalized random variables

$$\hat{X} = \frac{\tilde{X}}{\langle \tilde{X}^2 \rangle^{1/2}}, \qquad \hat{Y} = \frac{\tilde{Y}}{\langle \tilde{Y}^2 \rangle^{1/2}} \tag{10.41}$$

in analogy to Eqs. (10.40), we find the correlation coefficient $r_{XY}$ to be given by

$$r_{XY} = \langle \hat{X} \hat{Y} \rangle. \tag{10.42}$$
Due to $|r_{XY}| \le 1$ we have $1 - r_{XY}^2 \ge 0$, i.e., the variance in Eqs. (10.39) is nonnegative. The model (10.39) does satisfy the consistency conditions (10.8), see exercise 10.3.1.

Moments. An efficient way to present the moments of the joint PDF $f(x, y)$ is to do this in terms of the normalized random variables (10.41). The moments can be calculated by multiplying the PDF $f(x, y)$ with the corresponding variables and integrating. Similar to the properties of a single-variable normal PDF, it is found that the third-order and fifth-order (and all other odd-numbered) central moments are equal to zero,

$$\begin{aligned}
&\langle \hat{X}^3 \rangle = \langle \hat{X}^2 \hat{Y} \rangle = \langle \hat{X} \hat{Y}^2 \rangle = \langle \hat{Y}^3 \rangle = 0, \\
&\langle \hat{X}^5 \rangle = \langle \hat{X}^4 \hat{Y} \rangle = \langle \hat{X}^3 \hat{Y}^2 \rangle = \langle \hat{X}^2 \hat{Y}^3 \rangle = \langle \hat{X} \hat{Y}^4 \rangle = \langle \hat{Y}^5 \rangle = 0.
\end{aligned} \tag{10.43a}$$
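The even-numbered normalized central moments are functions of the correlation coefficient $r_{XY}$. For example, the fourth-order and sixth-order central moments are given by

$$\begin{aligned}
&\langle \hat{X}^4 \rangle = \langle \hat{Y}^4 \rangle = 3, &\quad& \langle \hat{X}^6 \rangle = \langle \hat{Y}^6 \rangle = 15, \\
&\langle \hat{X}^3 \hat{Y} \rangle = \langle \hat{X} \hat{Y}^3 \rangle = 3\, r_{XY}, &\quad& \langle \hat{X}^5 \hat{Y} \rangle = \langle \hat{X} \hat{Y}^5 \rangle = 15\, r_{XY}, \\
&\langle \hat{X}^2 \hat{Y}^2 \rangle = 1 + 2\, r_{XY}^2, &\quad& \langle \hat{X}^4 \hat{Y}^2 \rangle = \langle \hat{X}^2 \hat{Y}^4 \rangle = 3 + 12\, r_{XY}^2, \\
& &\quad& \langle \hat{X}^3 \hat{Y}^3 \rangle = 9\, r_{XY} + 6\, r_{XY}^3.
\end{aligned} \tag{10.43b}$$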
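The moment relations (10.43) can be verified numerically. The following is a minimal sketch, assuming NumPy; the function name `normalized_moment` and the value of $r_{XY}$ are illustrative assumptions. Instead of integrating the PDF (10.39) directly, it represents the standardized variables as $\hat{X} = U$ and $\hat{Y} = r_{XY} U + (1 - r_{XY}^2)^{1/2} V$ with independent standard normal U and V, and evaluates the averages by Gauss-Hermite quadrature, which is exact for polynomial integrands of this degree.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def normalized_moment(m, n, r, order=20):
    """<X^m Y^n> for standardized jointly normal X, Y with correlation r."""
    nodes, weights = hermegauss(order)         # weight function exp(-t^2 / 2)
    weights = weights / np.sqrt(2.0 * np.pi)   # normalize to a standard normal PDF
    u, v = np.meshgrid(nodes, nodes, indexing="ij")
    w = np.outer(weights, weights)             # product weights for (U, V)
    x = u
    y = r * u + np.sqrt(1.0 - r**2) * v        # Y built from independent U and V
    return float(np.sum(w * x**m * y**n))

r = 0.39  # hypothetical correlation coefficient
print(normalized_moment(2, 2, r), 1 + 2 * r**2)
print(normalized_moment(4, 2, r), 3 + 12 * r**2)
print(normalized_moment(3, 3, r), 9 * r + 6 * r**3)
print(normalized_moment(2, 1, r))  # odd-order moment, expected to vanish
```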
For independent variables, for which we have $r_{XY} = 0$, Eqs. (10.43) recover the corresponding results for single normally distributed variables. Equations (10.43) can be used to decide whether any joint PDF is normal or not (see, e.g., the discussion of this question regarding the Brownian motion model in Sect. 6.4.2). For the case that all the conditions (10.43) implied by a joint normal PDF are satisfied, we can conclude that the joint PDF considered represents a normal PDF. Why is this conclusion valid? It is possible that another joint PDF implies moments that agree with some of the relations considered here (this PDF may also imply zero third-order and fifth-order moments), but it is impossible that another joint PDF implies moments that agree with all the 22 conditions (10.43).

Conditional PDF. Equation (10.39) can be used for writing the joint PDF as

$$f(x, y) = \frac{1}{\sqrt{2\pi\, (1 - r_{XY}^2)\, \langle \tilde{Y}^2 \rangle}}\, \exp\left\{ -\frac{(\hat{y} - r_{XY}\, \hat{x})^2}{2\, (1 - r_{XY}^2)} \right\} f(x), \tag{10.44}$$
where $f(x)$ is given by

$$f(x) = \frac{1}{\sqrt{2\pi\, \langle \tilde{X}^2 \rangle}}\, \exp\left\{ -\frac{\hat{x}^2}{2} \right\}. \tag{10.45}$$
Comparison of Eq. (10.44) with the definition of the conditional PDF $f(y | x)$, i.e.,

$$f(x, y) = f(y | x)\, f(x), \tag{10.46}$$

shows that the conditional PDF $f(y | x)$ is given by

$$f(y | x) = \frac{1}{\sqrt{2\pi\, (1 - r_{XY}^2)\, \langle \tilde{Y}^2 \rangle}}\, \exp\left\{ -\frac{(\hat{y} - r_{XY}\, \hat{x})^2}{2\, (1 - r_{XY}^2)} \right\}. \tag{10.47}$$
The conditional PDF $f(y | x)$ integrates to unity, $\int f(y | x)\, dy = 1$, see exercise 10.3.2. Considered as a function of $\hat{y}$, $f(y | x)$ represents a normal PDF with mean $r_{XY}\, \hat{x}$ and variance $1 - r_{XY}^2$, which is divided by $\langle \tilde{Y}^2 \rangle^{1/2}$. Hence, we have the relation

$$\int \hat{y}\, f(y | x)\, d\hat{y} = \frac{r_{XY}\, \hat{x}}{\langle \tilde{Y}^2 \rangle^{1/2}}. \tag{10.48}$$
Conditional Mean and PDF. In terms of $f(y | x)$ we can obtain all conditional moments. First of all, we are interested in the conditional mean, which is given by

$$\langle Y | x \rangle = \int y\, f(y | x)\, dy. \tag{10.49}$$
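The conditional mean can be calculated by writing Eq. (10.48) as

$$\int \frac{y - \langle Y \rangle}{\langle \tilde{Y}^2 \rangle^{1/2}}\, f(y | x)\, \frac{dy}{\langle \tilde{Y}^2 \rangle^{1/2}} = \frac{r_{XY}\, \hat{x}}{\langle \tilde{Y}^2 \rangle^{1/2}}. \tag{10.50}$$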
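By using the definition (10.49) we then find for the conditional mean

$$\langle Y | x \rangle = \langle Y \rangle + r_{XY}\, \langle \tilde{Y}^2 \rangle^{1/2}\, \hat{x} = \langle Y \rangle + r_{XY}\, \langle \tilde{Y}^2 \rangle^{1/2}\, \frac{x - \langle X \rangle}{\langle \tilde{X}^2 \rangle^{1/2}}, \tag{10.51}$$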
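where $\hat{x}$ is used according to its definition (10.40). This expression for the conditional mean enables us to write the conditional PDF $f(y | x)$ given by Eq. (10.47) in a very convenient way. To prepare this representation we write

$$\hat{y} - r_{XY}\, \hat{x} = \frac{y - \langle Y \rangle}{\langle \tilde{Y}^2 \rangle^{1/2}} - \frac{\langle Y | x \rangle - \langle Y \rangle}{\langle \tilde{Y}^2 \rangle^{1/2}} = \frac{y - \langle Y | x \rangle}{\langle \tilde{Y}^2 \rangle^{1/2}}, \tag{10.52}$$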
where the definition of $\hat{y}$ and expression (10.51) for the conditional mean are applied. The use of this relation in Eq. (10.47) leads to the conclusion that

$$f(y | x) = \frac{1}{\sqrt{2\pi\, (1 - r_{XY}^2)\, \langle \tilde{Y}^2 \rangle}}\, \exp\left\{ -\frac{(y - \langle Y | x \rangle)^2}{2\, (1 - r_{XY}^2)\, \langle \tilde{Y}^2 \rangle} \right\}. \tag{10.53}$$
Therefore, the conditional PDF represents a normal PDF with mean $\langle Y | x \rangle$ and variance $(1 - r_{XY}^2)\, \langle \tilde{Y}^2 \rangle$. Thus, the deviations $Y - \langle Y | x \rangle$ from the conditional mean are normally distributed with zero mean and variance $(1 - r_{XY}^2)\, \langle \tilde{Y}^2 \rangle$. This means that the deviations $Y - \langle Y | x \rangle$ are independent of x.

Statistical Formulation of Optimal Models. In Sect. 10.2.3 we analyzed the consequences of considering a linear conditional mean, which leads to the global variance (10.38). Evidence for the suitability of considering such a mean and variance was not provided, which leads to the question of whether there is any conditional PDF that has such a mean and variance, and whether it is reasonable to consider such a PDF. These questions are answered by the findings of the previous paragraph. We see that the assumption of a joint normal PDF, which is certainly reasonable, implies a linear conditional mean. The global variance (10.38) is also supported by this PDF, see exercise 10.3.3.
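The linear conditional mean and the x-independent conditional variance can be checked by sampling. The following is a minimal sketch, assuming NumPy; the means, standard deviations, correlation coefficient, and bin width are hypothetical values chosen for illustration. It draws jointly normal samples, selects those with X in a narrow bin around a value x0, and compares the sample statistics of Y in that bin with Eq. (10.51) and with the variance appearing in Eq. (10.53).

```python
import numpy as np

rng = np.random.default_rng(1)
mean = np.array([1.0, -2.0])   # <X>, <Y> (hypothetical)
sx, sy, r = 2.0, 0.5, 0.6      # standard deviations of X and Y, and r_XY (hypothetical)
cov = np.array([[sx**2, r * sx * sy],
                [r * sx * sy, sy**2]])
x, y = rng.multivariate_normal(mean, cov, size=400_000).T

x0, dx = 2.0, 0.1              # condition on X in a narrow bin around x0
in_bin = np.abs(x - x0) < dx / 2
cond_mean_mc = y[in_bin].mean()   # sample estimate of <Y | x0>
cond_var_mc = y[in_bin].var()     # sample estimate of the conditional variance

cond_mean_th = mean[1] + r * sy * (x0 - mean[0]) / sx   # Eq. (10.51)
cond_var_th = (1.0 - r**2) * sy**2                      # variance in Eq. (10.53)
print(cond_mean_mc, cond_mean_th)
print(cond_var_mc, cond_var_th)
```

Repeating the comparison for other values of x0 should change the conditional mean linearly while leaving the conditional variance unchanged, up to sampling noise.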
10.3.2 Data Analysis

There is often the question of whether a joint normal PDF can be applied to model the joint PDF of given X and Y data. This question cannot be answered by considering only the marginal PDFs of X and Y. For example, it is incorrect to conclude that two variables have a joint normal PDF even if the marginal PDFs of both X and Y are normal PDFs: the mixed moments may differ from Eq. (10.43b).

Joint Normal PDF Features. To address this question, it is helpful to know the characteristic features of a normal joint PDF. The best way to illustrate the joint PDF features is to consider isolines $f(x, y) = f$ in the $\hat{x}$-$\hat{y}$ plane, where f is a constant. In this case, we can write Eq. (10.39) as

$$\begin{aligned}
\hat{x}^2 + \hat{y}^2 - 2 r_{XY}\, \hat{x} \hat{y} &= -2\, (1 - r_{XY}^2)\, \ln\left( 2\pi f \sqrt{(1 - r_{XY}^2)\, \langle \tilde{X}^2 \rangle \langle \tilde{Y}^2 \rangle} \right) \\
&= -(1 - r_{XY}^2)\, \ln\left( 4\pi^2 f^2\, (1 - r_{XY}^2)\, \langle \tilde{X}^2 \rangle \langle \tilde{Y}^2 \rangle \right) = -(1 - r_{XY}^2)\, \ln[(1 - r_{XY}^2)\, C],
\end{aligned} \tag{10.54}$$

where the constant C is defined by $C = 4\pi^2 f^2 \langle \tilde{X}^2 \rangle \langle \tilde{Y}^2 \rangle$. The meaning of this relation can be better seen by introducing the variables

$$x' = \frac{\hat{x} + \hat{y}}{\sqrt{2}}, \qquad y' = \frac{-\hat{x} + \hat{y}}{\sqrt{2}}. \tag{10.55}$$
The (x', y')-coordinate system is obtained by rotating the $(\hat{x}, \hat{y})$-coordinate system by an angle of 45°. The variables $\hat{x}$ and $\hat{y}$ are related to x' and y' by

$$\hat{x} = \frac{x' - y'}{\sqrt{2}}, \qquad \hat{y} = \frac{x' + y'}{\sqrt{2}}. \tag{10.56}$$

In terms of the latter expressions we can write the left-hand side of Eq. (10.54) as

$$\begin{aligned}
\hat{x}^2 + \hat{y}^2 - 2 r_{XY}\, \hat{x} \hat{y} &= \frac{1}{2} \left[ (x' - y')^2 + (x' + y')^2 - 2 r_{XY}\, (x' + y')(x' - y') \right] \\
&= \frac{1}{2} \left[ 2 x'^2 + 2 y'^2 - 2 r_{XY}\, (x'^2 - y'^2) \right] = (1 - r_{XY})\, x'^2 + (1 + r_{XY})\, y'^2,
\end{aligned} \tag{10.57}$$

such that Eq. (10.54) reads

$$(1 - r_{XY})\, x'^2 + (1 + r_{XY})\, y'^2 = -(1 - r_{XY}^2)\, \ln[(1 - r_{XY}^2)\, C]. \tag{10.58}$$
Fig. 10.1. Isolines of the joint normal PDF $f(x, y)$. The (x', y')-coordinate system is obtained by rotating the $(\hat{x}, \hat{y})$-coordinate system by a 45° angle. The isoline $f(x, y) = f$ is an ellipse in the (x', y')-system. Here, a is the semimajor axis, and b is the semiminor axis.
The latter equation can be written

$$\frac{x'^2}{a^2} + \frac{y'^2}{b^2} = 1. \tag{10.59}$$

Here, the parameters a and b are given by

$$a = \sqrt{\frac{-(1 - r_{XY}^2)\, \ln[(1 - r_{XY}^2)\, C]}{1 - r_{XY}}} = \sqrt{-\ln[(1 - r_{XY}^2)\, C]}\; \sqrt{1 + r_{XY}}, \tag{10.60a}$$

$$b = \sqrt{\frac{-(1 - r_{XY}^2)\, \ln[(1 - r_{XY}^2)\, C]}{1 + r_{XY}}} = \sqrt{-\ln[(1 - r_{XY}^2)\, C]}\; \sqrt{1 - r_{XY}}. \tag{10.60b}$$
The relevance of Eq. (10.59) is that this relation represents an ellipse equation: see the illustration in Fig. 10.1. This ellipse equation involves two specific cases. For $r_{XY} = 0$ we find $a = b = [-\ln C]^{1/2}$, which means that the ellipse becomes a circle. The second case is given for $r_{XY} = \pm 1$: for $r_{XY} \to +1$ we have a line along the x' axis, and for $r_{XY} \to -1$ we have a line along the y' axis (see exercise 10.3.4).

Filtered Joint PDF Calculation. How can we numerically calculate the joint PDF $f(x, y)$ to test the suitability of model assumptions? In extension of the calculation of marginal PDFs we calculate the filtered joint PDF $f_\Delta(x, y)$ by

$$f_\Delta(x, y) = \frac{1}{\Delta x\, \Delta y}\, \frac{\Delta N_{xy}}{N}. \tag{10.61}$$

Here, $\Delta N_{xy}$ is the number of (X, Y) realizations that are found in x and y intervals centered at x and y. This means that $\Delta N_{xy}$ refers to the number of (X, Y) realizations for which X and Y satisfy the conditions

$$x - \frac{\Delta x}{2} \le X \le x + \frac{\Delta x}{2} \quad \text{and} \quad y - \frac{\Delta y}{2} \le Y \le y + \frac{\Delta y}{2}. \tag{10.62}$$
It is relevant to note that X and Y are not any random values, but they represent a joint event (they are measured at the same time).
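The filtered joint PDF (10.61) amounts to a normalized two-dimensional histogram. The following is a minimal sketch, assuming NumPy; the function name `filtered_joint_pdf`, the grid range, and the synthetic input data are illustrative assumptions. The counts $\Delta N_{xy}$ are obtained with `numpy.histogram2d`, and N is taken as the total number of realizations, including any that fall outside the grid.

```python
import numpy as np

def filtered_joint_pdf(x, y, dx, dy, x_range, y_range):
    """Filtered joint PDF of Eq. (10.61) on a regular grid of bin centers."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_edges = np.arange(x_range[0], x_range[1] + dx / 2, dx)
    y_edges = np.arange(y_range[0], y_range[1] + dy / 2, dy)
    counts, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges])  # Delta N_xy
    pdf = counts / (dx * dy * x.size)          # Delta N_xy / (dx dy N)
    x_centers = 0.5 * (x_edges[:-1] + x_edges[1:])
    y_centers = 0.5 * (y_edges[:-1] + y_edges[1:])
    return pdf, x_centers, y_centers

# Hypothetical joint events: standardized, correlated normal pairs (X, Y).
rng = np.random.default_rng(3)
n, r = 50_400, -0.26
u = rng.standard_normal(n)
w = r * u + np.sqrt(1.0 - r**2) * rng.standard_normal(n)

pdf, xc, yc = filtered_joint_pdf(u, w, 0.2, 0.2, (-4.0, 4.0), (-4.0, 4.0))
total_mass = pdf.sum() * 0.2 * 0.2   # close to one if few samples fall outside the grid
print(pdf.shape, total_mass)
```

Each (X, Y) pair enters the histogram as one joint event, which is why x and y must be passed as paired arrays of equal length.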
Fig. 10.2. Scatter plots of jointly normal and non-normal random variables. Here, $\hat{u}$, $\hat{w}$, and $\hat{T}$ are standardized velocity components u and w and temperature T under neutral conditions, see the explanations given in Sect. 4.5. The correlation coefficient $r_{uw} = -0.26$ in (a) and (c), whereas $r_{uT} = 0.39$ in (b) and (d). The scatter plots in (a) and (b) show jointly normally distributed random variables, and the scatter plots in (c) and (d) are obtained from measurements described in Sect. 4.5. The solid lines represent isolines of jointly normally distributed variables. The outer and inner isolines correspond to $f \approx 0.02$ and $f \approx 0.1$, respectively.

Scatter Plots of Normal Variables. A good way to illustrate joint PDFs is to consider scatter plots of joint PDF isolines. Such scatter plots can be obtained by presenting all (X, Y) positions for which the joint PDF $f_\Delta(x, y)$ has a certain value (or is found inside a certain interval). An example for this way of looking at the joint PDF is given in Figs. 10.2a-b. These figures show scatter plots of jointly normally distributed random numbers. These examples are set up according to Figs. 10.2c-d. Therefore, the variables are called $\hat{u}$, $\hat{w}$, and $\hat{T}$ ($\hat{u}$ and $\hat{w}$ refer to velocity components, and $\hat{T}$ refers to the temperature). The mean of these variables is zero and the variance is one, which means that we consider standardized random variables. Figures 10.2a-b differ by their correlation coefficients, which have values that agree with the values in Figs. 10.2c-d. The joint PDF was calculated by using $\Delta x = \Delta y = 0.2$. As used for Figs. 10.2c-d, a total number of 50,400 random numbers was considered. The outer isolines correspond exactly to the constant value $f = 40 / (N\, \Delta x\, \Delta y) \approx 0.02$, and the inner isolines correspond exactly to the constant value $f = 200 / (N\, \Delta x\, \Delta y) \approx 0.1$. Figures 10.2a-b show that the scatter plots obtained in this way agree very well with the isolines, which were calculated according to Eq. (10.59).

Scatter Plots of Non-Normal Variables. Such scatter plots of joint PDFs can be used to test the suitability of modeling measured data by a joint normal PDF. An illustration of this approach is given in Figs. 10.2c-d. These figures show joint PDFs of measured velocities and temperatures that were used in Sect. 4.5 to study marginal PDFs derived from measurements. We see here the joint $\hat{u}$-$\hat{w}$ PDF and the joint $\hat{u}$-$\hat{T}$ PDF for a neutral stratification. These joint PDFs have been calculated in the same way as the joint PDFs in Figs. 10.2a-b. The marginal PDFs of u and w shown in Fig. 4.17 reveal that both PDFs can be described very well by a normal PDF. Hence, the scatter plot in Fig. 10.2c agrees very well with the corresponding plot in Fig. 10.2a, which means that the joint $\hat{u}$-$\hat{w}$ PDF can be described very well by a joint normal PDF. As may be seen in Fig. 4.17, the marginal temperature PDF can be described only approximately by a normal PDF. Hence, the scatter plot in Fig. 10.2d also shows deviations from the corresponding joint normal PDF features given in Fig. 10.2b. Nevertheless, regarding the usual lack of alternatives it is still reasonable to describe the joint $\hat{u}$-$\hat{T}$ PDF by a joint normal PDF.
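The isolines used for such comparisons follow directly from Eqs. (10.59) and (10.60). The following is a minimal sketch, assuming NumPy; the function names `isoline_semi_axes` and `joint_normal_pdf` are illustrative assumptions. It computes the parameters a and b for a given isoline value f and checks that the points where the ellipse crosses the x' and y' axes, mapped back with Eq. (10.56), indeed have the PDF value f according to Eq. (10.39).

```python
import numpy as np

def isoline_semi_axes(f, r, var_x=1.0, var_y=1.0):
    """Parameters (a, b) of the isoline f(x, y) = f, Eqs. (10.60a) and (10.60b)."""
    C = 4.0 * np.pi**2 * f**2 * var_x * var_y
    log_term = -np.log((1.0 - r**2) * C)   # must be positive for the isoline to exist
    if log_term <= 0.0:
        raise ValueError("f is not below the maximum of the joint normal PDF")
    return np.sqrt(log_term * (1.0 + r)), np.sqrt(log_term * (1.0 - r))

def joint_normal_pdf(xh, yh, r, var_x=1.0, var_y=1.0):
    """Joint normal PDF of Eq. (10.39) in terms of the nondimensional xh, yh."""
    norm = 2.0 * np.pi * np.sqrt((1.0 - r**2) * var_x * var_y)
    return np.exp(-(xh**2 + yh**2 - 2.0 * r * xh * yh) / (2.0 * (1.0 - r**2))) / norm

r = -0.26                        # correlation coefficient quoted for Figs. 10.2a and 10.2c
a, b = isoline_semi_axes(0.02, r)

# Point (x', y') = (a, 0) and point (x', y') = (0, b), mapped back via Eq. (10.56).
f_on_x_axis = joint_normal_pdf(a / np.sqrt(2.0), a / np.sqrt(2.0), r)
f_on_y_axis = joint_normal_pdf(-b / np.sqrt(2.0), b / np.sqrt(2.0), r)
print(a, b, f_on_x_axis, f_on_y_axis)
```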
10.3.3 Application to Random Walk Modeling

Let us consider the modeling of a random walk (see Sect. 6.3) to illustrate the application of the concepts introduced above. We consider a random variable (e.g., the position of any object) that is initially normally distributed. In each time step, the variable changes by the addition of a normally distributed contribution, which is independent of previous values of the random variable (it is worth emphasizing that the result to be obtained below can be extended to the case of jointly normally distributed variables that are correlated: see exercise 10.3.6). Hence, the random variable considered represents at every time a sum of independent normally distributed random numbers. The question related to this problem is to find the PDF of the variable considered at any time, which means the PDF of a sum of independent and normally distributed random numbers. This question, which requires the use of joint PDF concepts due to the need to consider simultaneously the various random variables involved in the sum, will be addressed in the following.

Sum of Two Variables. First, let us consider the sum of two random variables with any statistical properties. In particular, we consider one variable $X_1$ with a marginal PDF $f_1(x_1)$, and another variable $X_2$ with a marginal PDF $f_2(x_2)$. The joint PDF of both variables is given by $f_{12}(x_1, x_2)$. Our objective is to calculate the PDF $f(z)$ of the sum $Z = X_1 + X_2$. For doing this it is helpful to consider the distribution function $F(z)$, which enables the calculation of the PDF by means of $f(z) = dF / dz$. The distribution function $F(z)$ of the sum of two variables can be related to the joint PDF $f_{12}(x_1, x_2)$, which is considered to be known, by the relation

$$F(z) = \int_{-\infty}^{\infty} \int_{-\infty}^{z - x_1} f_{12}(x_1, x_2)\, dx_2\, dx_1. \tag{10.63}$$
Evidence for the validity of this relation can be obtained in the following way,

$$\begin{aligned}
F(z) &= \int_{-\infty}^{\infty} \int_{-\infty}^{z - x_1} \left\langle \frac{d\theta(x_1 - X_1)}{dx_1}\, \frac{d\theta(x_2 - X_2)}{dx_2} \right\rangle dx_2\, dx_1 \\
&= \int_{-\infty}^{\infty} \left\langle \frac{d\theta(x_1 - X_1)}{dx_1} \left[ \theta(z - x_1 - X_2) - \theta(-\infty - X_2) \right] \right\rangle dx_1 \\
&= \left\langle \int_{-\infty}^{z - X_2} \frac{d\theta(x_1 - X_1)}{dx_1}\, dx_1 \right\rangle \\
&= \langle \theta(z - X_2 - X_1) - \theta(-\infty - X_1) \rangle = \langle \theta(z - X_2 - X_1) \rangle.
\end{aligned} \tag{10.64}$$

The first line applies the definition of $f_{12}(x_1, x_2)$. The integration with regard to $x_2$ is performed in the second line. The third line accounts for $\theta(-\infty - X_2) = 0$ and the fact that the integral is only nonzero if $x_1 \le z - X_2$. The brackets now have to enclose the whole integral because the upper bound is a random number. The integration with regard to $x_1$ is performed in the fourth line, where $\theta(-\infty - X_1) = 0$ is used. The last expression represents $P(X_1 + X_2 \le z)$, which is the definition of $F(z)$. The corresponding PDF can be obtained by differentiating $F(z)$,

$$f(z) = \frac{dF(z)}{dz} = \int_{-\infty}^{\infty} f_{12}(x_1, z - x_1)\, dx_1. \tag{10.65}$$
For the case that $X_1$ and $X_2$ are independent, the last formula reads

$$f(z) = \int_{-\infty}^{\infty} f_1(x_1)\, f_2(z - x_1)\, dx_1. \tag{10.66}$$
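The convolution integral (10.66) can be evaluated numerically for any pair of marginal PDFs. The following is a minimal sketch, assuming NumPy; the function name `pdf_of_sum`, the uniform grid, and the choice of two uniform PDFs on [0, 1] are illustrative assumptions. For this choice the exact result is the triangular PDF on [0, 2], which gives a simple check of the quadrature.

```python
import numpy as np

def pdf_of_sum(f1, f2, z, x1):
    """Approximate Eq. (10.66) by a Riemann sum on the uniform grid x1."""
    dx = x1[1] - x1[0]
    # Row k of the integrand holds f1(x1) * f2(z_k - x1) for all grid points x1.
    integrand = f1(x1)[None, :] * f2(z[:, None] - x1[None, :])
    return integrand.sum(axis=1) * dx

def uniform_pdf(x):
    """PDF of a variable uniformly distributed on [0, 1] (illustrative choice)."""
    return ((x >= 0.0) & (x <= 1.0)).astype(float)

x1 = np.linspace(-1.0, 3.0, 4001)   # integration grid with spacing 0.001
z = np.array([0.5, 1.0, 1.5])
f_z = pdf_of_sum(uniform_pdf, uniform_pdf, z, x1)
print(f_z)   # to be compared with the triangular PDF values 0.5, 1.0, 0.5
```

Because the uniform PDF is discontinuous, the Riemann sum is only accurate to the order of the grid spacing; smooth PDFs such as the normal PDF converge much faster.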
Sum of Two Independent Normal Variables. Next, let us apply the definition (10.66) of $f(z)$ for the case that $X_1$ and $X_2$ are independent normally distributed random variables. The use of the normal PDF expression (4.72) results in

$$f(z) = \frac{1}{2\pi\, \sigma_1 \sigma_2} \int_{-\infty}^{\infty} \exp\left\{ -\frac{(x_1 - \mu_1)^2}{2 \sigma_1^2} - \frac{(z - x_1 - \mu_2)^2}{2 \sigma_2^2} \right\} dx_1. \tag{10.67}$$
Here, $\mu_1$ and $\sigma_1$ are the mean and standard deviation of $X_1$, and $\mu_2$ and $\sigma_2$ are the mean and standard deviation of $X_2$. By introducing $y = x_1 - \mu_1$ and replacing the integration over $x_1$ by an integration over y, $f(z)$ is given by

$$\begin{aligned}
f(z) &= \frac{1}{2\pi\, \sigma_1 \sigma_2} \int_{-\infty}^{\infty} \exp\left\{ -\frac{y^2}{2 \sigma_1^2} - \frac{(\hat{z} - y)^2}{2 \sigma_2^2} \right\} dy \\
&= \frac{1}{2\pi\, \sigma_1 \sigma_2} \int_{-\infty}^{\infty} \exp\left\{ -\frac{1}{2 \sigma_1^2 \sigma_2^2} \left[ \sigma_2^2\, y^2 + \sigma_1^2\, (\hat{z} - y)^2 \right] \right\} dy,
\end{aligned} \tag{10.68}$$
where the abbreviation $\hat{z} = z - \mu_1 - \mu_2$ is applied. We rewrite the bracket term to prepare the integration,

$$\begin{aligned}
\sigma_2^2\, y^2 + \sigma_1^2\, (\hat{z} - y)^2 &= (\sigma_1^2 + \sigma_2^2)\, y^2 - 2 \sigma_1^2\, \hat{z}\, y + \sigma_1^2\, \hat{z}^2 \\
&= (\sigma_1^2 + \sigma_2^2) \left( y - \frac{\sigma_1^2\, \hat{z}}{\sigma_1^2 + \sigma_2^2} \right)^2 + \sigma_1^2\, \hat{z}^2 \left( 1 - \frac{\sigma_1^2}{\sigma_1^2 + \sigma_2^2} \right) \\
&= (\sigma_1^2 + \sigma_2^2) \left( y - \frac{\sigma_1^2\, \hat{z}}{\sigma_1^2 + \sigma_2^2} \right)^2 + \frac{\sigma_1^2 \sigma_2^2\, \hat{z}^2}{\sigma_1^2 + \sigma_2^2}.
\end{aligned} \tag{10.69}$$

Therefore, $f(z)$ reads

$$f(z) = \frac{1}{2\pi\, \sigma_1 \sigma_2} \exp\left\{ -\frac{\hat{z}^2}{2\, (\sigma_1^2 + \sigma_2^2)} \right\} \int_{-\infty}^{\infty} \exp\left\{ -\frac{\sigma_1^2 + \sigma_2^2}{2 \sigma_1^2 \sigma_2^2} \left( y - \frac{\sigma_1^2\, \hat{z}}{\sigma_1^2 + \sigma_2^2} \right)^2 \right\} dy. \tag{10.70}$$

The integration can be performed by introducing the variable

$$s = \sqrt{\frac{\sigma_1^2 + \sigma_2^2}{2 \sigma_1^2 \sigma_2^2}} \left( y - \frac{\sigma_1^2\, \hat{z}}{\sigma_1^2 + \sigma_2^2} \right). \tag{10.71}$$

By replacing y by s in Eq. (10.70) we obtain

$$f(z) = \frac{1}{2\pi\, \sigma_1 \sigma_2} \sqrt{\frac{2 \sigma_1^2 \sigma_2^2}{\sigma_1^2 + \sigma_2^2}}\, \exp\left\{ -\frac{\hat{z}^2}{2\, (\sigma_1^2 + \sigma_2^2)} \right\} \int_{-\infty}^{\infty} e^{-s^2}\, ds. \tag{10.72}$$

The integral over $\exp(-s^2)$ is $\pi^{1/2}$ according to Eq. (4.70). Therefore, Eq. (10.72) reduces to

$$f(z) = \frac{1}{\sqrt{2\pi\, (\sigma_1^2 + \sigma_2^2)}}\, \exp\left\{ -\frac{(z - \mu_1 - \mu_2)^2}{2\, (\sigma_1^2 + \sigma_2^2)} \right\}, \tag{10.73}$$
where $\hat{z} = z - \mu_1 - \mu_2$ is used. This expression shows that the PDF of the sum of two independent normally distributed variables is normal with mean $\mu_1 + \mu_2$ and variance $\sigma_1^2 + \sigma_2^2$. This observation can be summarized by the conclusion

$$X_1 \sim N(\mu_1, \sigma_1^2) \ \text{and} \ X_2 \sim N(\mu_2, \sigma_2^2) \ \Rightarrow \ X_1 + X_2 \sim N(\mu_1 + \mu_2,\, \sigma_1^2 + \sigma_2^2). \tag{10.74}$$

$N(\mu, \sigma^2)$ refers to a normal PDF with mean $\mu$ and variance $\sigma^2$. The notation applied here means that $X_1$, $X_2$, and $X_1 + X_2$ are normally distributed with the means and variances specified by the corresponding N.

Sum of Independent Normal Variables. The result (10.74) obtained for two independent normally distributed random variables $X_1$ and $X_2$ can be extended to the case of any number of independent normally distributed random variables. By considering the two numbers considered before as one number and adding another number, we find that the sum of three independent normally distributed variables is again normally distributed. Correspondingly, we can conclude (i = 1, N)

$$X_i \sim N(\mu_i, \sigma_i^2) \ \Rightarrow \ \sum_{i=1}^{N} X_i \sim N\left( \sum_{i=1}^{N} \mu_i,\ \sum_{i=1}^{N} \sigma_i^2 \right). \tag{10.75}$$
Therefore, the PDF of the sum of N independent normally distributed random variables represents a normal PDF. Its mean is given by the sum of all means, and its variance is given by the sum of all variances. The conclusion (10.75) can be used for deriving a corresponding conclusion for the distribution of the mean value of N independent normally distributed random variables. By replacing $X_i$ by $X_i / N$ we find that

$$X_i \sim N(\mu_i, \sigma_i^2) \ \Rightarrow \ \frac{1}{N} \sum_{i=1}^{N} X_i \sim N\left( \frac{1}{N} \sum_{i=1}^{N} \mu_i,\ \frac{1}{N^2} \sum_{i=1}^{N} \sigma_i^2 \right). \tag{10.76}$$
Hence, the PDF of mean values is normally distributed, where the mean is given by the mean of all means involved and the variance is given by the mean of all variances involved divided by N. The latter results were applied in Sects. 6.2 and 6.3 for modeling a random walk (for determining the evolution of the position PDF in time).
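The conclusions (10.75) and (10.76) translate into very little code. The following is a minimal sketch, assuming NumPy; the function names `sum_of_normals` and `mean_of_normals` and the parameter values are illustrative assumptions. It computes the mean and variance of the sum and of the mean value, and compares the former with a sample of independent normal variables.

```python
import numpy as np

def sum_of_normals(mu, sigma):
    """Mean and variance of sum_i X_i for independent X_i ~ N(mu_i, sigma_i^2), Eq. (10.75)."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    return float(mu.sum()), float(np.sum(sigma**2))

def mean_of_normals(mu, sigma):
    """Mean and variance of (1/N) sum_i X_i, Eq. (10.76)."""
    n = len(mu)
    m, v = sum_of_normals(mu, sigma)
    return m / n, v / n**2

mu = [1.0, -0.5, 2.0]     # hypothetical means
sigma = [0.5, 1.0, 2.0]   # hypothetical standard deviations

rng = np.random.default_rng(2)
samples = rng.normal(mu, sigma, size=(200_000, 3))   # independent columns X_1, X_2, X_3
total = samples.sum(axis=1)

print(sum_of_normals(mu, sigma), (total.mean(), total.var()))
print(mean_of_normals(mu, sigma))
```

The sample check only confirms the mean and variance; normality of the sum itself is the content of the derivation leading to Eq. (10.73).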
10.4 The Fokker-Planck Equation

After considering the normal model for the joint PDF in the previous section, let us now consider the modeling of the evolution of any PDF. This question will be addressed by generalizing the Fokker-Planck equation (8.21) for the PDF of one random variable to the case of any number of random variables. The question of how the Fokker-Planck equation is related to stochastic differential equations for the corresponding random variables will be discussed, too.
10.4.1 Definition of Multivariate Probability Density Functions

The generalization of the Fokker-Planck equation (8.21) to an equation for the joint PDF of a vectorial stochastic process $X(t) = \{X_1(t), X_2(t), \cdots, X_N(t)\}$ requires a relevant first step: the definition of a multivariate PDF $f(x, t)$. The most efficient way of doing this is the use of theta and delta functions for several variables.

Multivariate Theta and Delta Functions. In Sect. 4.2.2 we introduced theta and delta functions of one variable. For a vectorial process $X(t) = \{X_1(t), X_2(t), \cdots, X_N(t)\}$, the corresponding theta and delta functions are given by

$$\theta(x - X(t)) = \theta(x_1 - X_1(t))\, \theta(x_2 - X_2(t)) \cdots \theta(x_N - X_N(t)), \tag{10.77a}$$

$$\delta(x - X(t)) = \delta(x_1 - X_1(t))\, \delta(x_2 - X_2(t)) \cdots \delta(x_N - X_N(t)). \tag{10.77b}$$
Hence, multivariate theta and delta functions are products of all the theta and delta functions of single variables.

Multivariate PDFs. The last expression provides the basis for the definition of a multivariate PDF. By averaging (10.77b), the joint PDF $f(x, t)$ can be defined by

$$f(x, t) = \langle \delta(x - X(t)) \rangle. \tag{10.78}$$
The brackets refer to the mean value defined by Eq. (4.1). The latter definition generalizes the definition (4.29) of the PDF of a single variable. In terms of the normalization property of delta functions we find that this definition satisfies the normalization condition for the joint PDF $f(x, t)$,

$$\int f(x, t)\, dx = \left\langle \int \delta(x - X(t))\, dx \right\rangle = \langle 1 \rangle = 1. \tag{10.79}$$
Here, $dx = dx_1\, dx_2 \cdots dx_N$ represents a multivariate differential given by the product of all differentials involved. Two-point PDFs can be defined correspondingly. For example, the two-point PDF $f(x, t; x', t')$ for having joint events (x, t) and (x', t') is defined by

$$f(x, t; x', t') = \langle \delta(x - X(t))\, \delta(x' - X(t')) \rangle. \tag{10.80}$$
The one-point PDF $f(x, t)$ can be recovered from this definition,

$$f(x, t) = \int f(x, t; x', t')\, dx'. \tag{10.81}$$

The validity of this relation can be seen by using the definition (10.80) of the two-point PDF $f(x, t; x', t')$,

$$f(x, t) = \int \langle \delta(x - X(t))\, \delta(x' - X(t')) \rangle\, dx' = \langle \delta(x - X(t)) \rangle. \tag{10.82}$$
A PDF $f(x, t | x', t')$ conditioned on $X(t') = x'$ can be defined in correspondence to the definition (8.37) for a single-variable PDF,

$$\begin{aligned}
f(x, t | x', t') &= \frac{f(x, t; x', t')}{f(x', t')} = \frac{\langle \delta(x - X(t))\, \delta(x' - X(t')) \rangle}{\langle \delta(x' - X(t')) \rangle} \\
&= \langle \delta(x - X(t)) \,|\, X(t') = x' \rangle = \langle \delta(x - X(t)) \,|\, x', t' \rangle.
\end{aligned} \tag{10.83}$$

In terms of this definition the one-point PDF $f(x, t)$ can be written

$$f(x, t) = \int f(x, t | x', t')\, f(x', t')\, dx'. \tag{10.84}$$
This representation will be used in Sect. 10.4.3 for the derivation of solutions to the Fokker-Planck equation.
10.4.2 The Fokker-Planck Equation

Fokker-Planck Equation. Let us consider an N-dimensional stochastic vector process $X(t) = \{X_1(t), X_2(t), \cdots, X_N(t)\}$. This process is assumed to be Markovian and to have a continuous sample path. The extension of Eq. (8.21) to an equation for the joint PDF $f(x, t)$ of the process X(t) reads

$$\frac{\partial f(x, t)}{\partial t} = -\frac{\partial D_i(x, t)\, f(x, t)}{\partial x_i} + \frac{\partial^2 D_{ij}(x, t)\, f(x, t)}{\partial x_i\, \partial x_j}. \tag{10.85}$$
Here, the sum convention is applied, which means that the sum is taken over repeated subscripts. Equation (10.85) represents the Fokker-Planck equation for several variables (Fokker 1914, Planck 1917). Its coefficients $D_i$ and $D_{ij}$ are given by the vectorial generalizations of $D^{(1)}$ and $D^{(2)}$ given by Eq. (8.22),

$$D_i(x, t) = \lim_{\Delta t \to 0} \frac{1}{\Delta t} \left\langle X_i(t + \Delta t) - X_i(t) \,|\, x, t \right\rangle, \tag{10.86a}$$

$$D_{ij}(x, t) = \lim_{\Delta t \to 0} \frac{1}{2 \Delta t} \left\langle [X_i(t + \Delta t) - X_i(t)]\, [X_j(t + \Delta t) - X_j(t)] \,|\, x, t \right\rangle. \tag{10.86b}$$
The conditional means refer to the condition X(t) = x. Equation (10.85) has the structure of a diffusion equation. The coefficient $D_i$ represents a drift coefficient and $D_{ij}$ is a diffusion coefficient. The coefficient $D_{ij}$ has two relevant properties, which are a consequence of its definition (10.86b). The first property is that $D_{ij}$ is symmetric, which means $D_{ij} = D_{ji}$. The second property is that $D_{ij}$ is positive semidefinite, which means $D_{ij}$ is non-negative definite. This property of $D_{ij}$ can be shown by multiplying the definition (10.86b) with arbitrary real nonvanishing vectors $c_i$ and $c_j$, which results in

$$\begin{aligned}
D_{ij}\, c_i c_j &= \lim_{\Delta t \to 0} \frac{1}{2 \Delta t} \left\langle [X_i(t + \Delta t) - X_i(t)]\, c_i\, [X_j(t + \Delta t) - X_j(t)]\, c_j \,|\, x, t \right\rangle \\
&= \lim_{\Delta t \to 0} \frac{1}{2 \Delta t} \left\langle \left( [X_i(t + \Delta t) - X_i(t)]\, c_i \right)^2 \,|\, x, t \right\rangle \ge 0.
\end{aligned} \tag{10.87}$$

Usually, it is assumed that $D_{ij}$ is positive definite, which means

$$D_{ij}\, c_i c_j > 0. \tag{10.88}$$
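Whether a given coefficient matrix satisfies condition (10.88) can be checked numerically. The following is a minimal sketch, assuming NumPy; the function name `is_positive_definite` and the matrix entries are illustrative assumptions. It uses the standard fact that a symmetric real matrix is positive definite exactly when all of its eigenvalues are positive, and it also evaluates the quadratic form $D_{ij}\, c_i c_j$ for one arbitrary nonvanishing vector.

```python
import numpy as np

def is_positive_definite(D, tol=0.0):
    """True if the symmetric matrix D satisfies D_ij c_i c_j > 0 for all c != 0."""
    D = np.asarray(D, dtype=float)
    if not np.allclose(D, D.T):
        raise ValueError("D must be symmetric")
    return bool(np.all(np.linalg.eigvalsh(D) > tol))   # eigenvalues of a symmetric matrix

D = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.3],
              [0.0, 0.3, 0.5]])    # hypothetical diffusion coefficient matrix
c = np.array([1.0, -2.0, 0.5])     # arbitrary real nonvanishing vector
quadratic_form = c @ D @ c         # D_ij c_i c_j with the sum convention

print(is_positive_definite(D), quadratic_form)
print(is_positive_definite([[1.0, 2.0], [2.0, 1.0]]))   # symmetric but indefinite
```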
The inverse matrix of $D_{ij}$ will exist for this case (Pope 2000), which is relevant to solutions of the Fokker-Planck equation. A positive definite matrix has positive eigenvalues, as may be seen in the following way (Ortega 1987): Suppose that $\lambda$ is an eigenvalue of the matrix $D_{ij}$ and $c_j$ is a corresponding real nonvanishing eigenvector, which means $D_{ij}\, c_j = \lambda\, c_i$. Multiplication of both sides with $c_i$ provides $D_{ij}\, c_i c_j = \lambda\, c_i c_i$. Therefore, we find $\lambda = D_{ij}\, c_i c_j / (c_i c_i) > 0$. The existence of positive eigenvalues is a necessary and sufficient condition for a positive definite matrix $D_{ij}$. For positive eigenvalues we find that the three principal invariants of $D_{ij}$ (one of the invariants is the determinant det(D) of $D_{ij}$) have to be positive: see exercise 10.4.1. On the other hand, three positive principal invariants imply that the matrix $D_{ij}$ has to be positive definite.

Consistency Constraint. The consistency of the Fokker-Planck equation can be proven by integrating Eq. (10.85) over the sample space x,

$$\int \frac{\partial f(x, t)}{\partial t}\, dx = -\int \frac{\partial D_i(x, t)\, f(x, t)}{\partial x_i}\, dx + \int \frac{\partial^2 D_{ij}(x, t)\, f(x, t)}{\partial x_i\, \partial x_j}\, dx. \tag{10.89}$$
The left-hand side of Eq. (10.89) vanishes: we can write the time derivative in front of the integral, and $f(x, t)$ is normalized to one. The terms on the right-hand side can be treated by invoking the Divergence Theorem. This theorem states the following (Stewart 2006): Let E be a simple solid region and let S be the boundary surface of E, given with positive (outward) orientation. Let L be a vector field whose component functions have continuous partial derivatives on an open region that contains E. Then

$$\int_S L \cdot dS = \int_E \frac{\partial L_i}{\partial x_i}\, dx. \tag{10.90}$$

The Divergence Theorem can be applied to the right-hand side of Eq. (10.89) by setting $L_i = D_i f$ and $L_i = \partial (D_{ij} f) / \partial x_j$, respectively. By considering an infinite domain, the integrals on the right-hand side of Eq. (10.89) will vanish if $L_i$ is zero at the surface. Therefore, the consistency of the Fokker-Planck equation (10.85) requires the assumption that the PDF $f(x, t)$ and its derivatives vanish for $|x| \to \infty$.
Mean Equations. By multiplying the Fokker-Planck equation (10.85) with $x_k$ and integrating over the sample space we obtain

$$\int x_k\, \frac{\partial f(x, t)}{\partial t}\, dx = -\int x_k\, \frac{\partial D_i(x, t)\, f(x, t)}{\partial x_i}\, dx + \int x_k\, \frac{\partial^2 D_{ij}(x, t)\, f(x, t)}{\partial x_i\, \partial x_j}\, dx. \tag{10.91}$$
To enable the rewriting of the right-hand side and to prepare the use of Eq. (10.90) we write this equation as

$$\begin{aligned}
\frac{\partial}{\partial t} \int x_k\, f(x, t)\, dx = &-\int \frac{\partial x_k\, D_i(x, t)\, f(x, t)}{\partial x_i}\, dx + \int \frac{\partial x_k}{\partial x_i}\, D_i(x, t)\, f(x, t)\, dx \\
&+ \int \frac{\partial}{\partial x_i} \left[ x_k\, \frac{\partial D_{ij}(x, t)\, f(x, t)}{\partial x_j} \right] dx - \int \frac{\partial x_k}{\partial x_i}\, \frac{\partial D_{ij}(x, t)\, f(x, t)}{\partial x_j}\, dx.
\end{aligned} \tag{10.92}$$
The integral on the left-hand side is equal to the mean $\langle X_k \rangle$. The validity of the right-hand side can be seen by distributing the derivatives by $x_i$ involved in the first and third terms. For $\partial x_k / \partial x_i$ we find $\partial x_k / \partial x_i = \delta_{ki}$. Here, $\delta_{ik}$ is the Kronecker symbol, which has the properties $\delta_{ik} = 1$ for i = k and $\delta_{ik} = 0$ for $i \ne k$. By accounting for $\partial x_k / \partial x_i = \delta_{ki}$, three of the four terms on the right-hand side of Eq. (10.92) can be written as integrals over the surface S according to Eq. (10.90). We assume that the corresponding terms disappear for $|x| \to \infty$ so that Eq. (10.92) reads

$$\frac{\partial \langle X_k \rangle}{\partial t} = \delta_{ki} \int D_i(x, t)\, f(x, t)\, dx = \int D_k(x, t)\, f(x, t)\, dx. \tag{10.93}$$

The last expression is implied by the fact that $\delta_{ki}$ is only nonzero for k = i. The right-hand side represents the mean value $\langle D_k \rangle$. Hence we find

$$\frac{d \langle X_k \rangle}{dt} = \langle D_k \rangle. \tag{10.94}$$
The partial derivative by t can be replaced here by the regular derivative because $\langle X_k \rangle$ and $\langle D_k \rangle$ are only functions of t. Hence, $\langle D_k \rangle$ determines the transport of means $\langle X_k \rangle$. For that reason $D_k$ is called a drift coefficient.

Variance Equations. The variance equations can be obtained by multiplying the Fokker-Planck equation (10.85) with $x_k x_n$ and integrating over x,

$$\int x_k x_n\, \frac{\partial f(x, t)}{\partial t}\, dx = -\int x_k x_n\, \frac{\partial D_i(x, t)\, f(x, t)}{\partial x_i}\, dx + \int x_k x_n\, \frac{\partial^2 D_{ij}(x, t)\, f(x, t)}{\partial x_i\, \partial x_j}\, dx. \tag{10.95}$$

This equation can be also written

$$\frac{\partial \langle X_k X_n \rangle}{\partial t} = I_1 + I_2. \tag{10.96}$$
We wrote here the partial derivative by t in front of the integral and applied the definition of $\langle X_k X_n \rangle$. The symbols $I_1$ and $I_2$ refer to the first and second integral on the right-hand side of Eq. (10.95), respectively. To calculate $I_1$ we write

$$\begin{aligned}
I_1 &= -\int \frac{\partial x_k x_n\, D_i(x, t)\, f(x, t)}{\partial x_i}\, dx + \int \frac{\partial x_k x_n}{\partial x_i}\, D_i(x, t)\, f(x, t)\, dx \\
&= \int (\delta_{ki}\, x_n + \delta_{ni}\, x_k)\, D_i(x, t)\, f(x, t)\, dx = \langle X_n D_k \rangle + \langle X_k D_n \rangle.
\end{aligned} \tag{10.97}$$

The first rewriting identifies an integral over a derivative (the first term on the right-hand side), which disappears. The next rewriting accounts for $\partial (x_k x_n) / \partial x_i = \delta_{ki}\, x_n + \delta_{ni}\, x_k$. The definition of means and the property of the Kronecker symbol $\delta_{ki}$ to be nonzero only for k = i are used for obtaining the final expression. The integral $I_2$ can be calculated correspondingly,

$$\begin{aligned}
I_2 &= \int \frac{\partial}{\partial x_i} \left[ x_k x_n\, \frac{\partial D_{ij}(x, t)\, f(x, t)}{\partial x_j} \right] dx - \int \frac{\partial x_k x_n}{\partial x_i}\, \frac{\partial D_{ij}(x, t)\, f(x, t)}{\partial x_j}\, dx \\
&= -\int (\delta_{ki}\, x_n + \delta_{ni}\, x_k)\, \frac{\partial D_{ij}(x, t)\, f(x, t)}{\partial x_j}\, dx \\
&= -\int \frac{\partial (\delta_{ki}\, x_n + \delta_{ni}\, x_k)\, D_{ij}(x, t)\, f(x, t)}{\partial x_j}\, dx + \int \frac{\partial (\delta_{ki}\, x_n + \delta_{ni}\, x_k)}{\partial x_j}\, D_{ij}(x, t)\, f(x, t)\, dx \\
&= \int (\delta_{ki}\, \delta_{nj} + \delta_{ni}\, \delta_{kj})\, D_{ij}(x, t)\, f(x, t)\, dx = \langle D_{kn} \rangle + \langle D_{nk} \rangle = 2\, \langle D_{nk} \rangle.
\end{aligned} \tag{10.98}$$

The last expression applies the symmetry of $D_{nk}$. The combination of Eq. (10.96) with these expressions for $I_1$ and $I_2$ then leads to the variance equation

$$\frac{d \langle X_k X_n \rangle}{dt} = \langle X_n D_k \rangle + \langle X_k D_n \rangle + 2\, \langle D_{nk} \rangle, \tag{10.99}$$
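where the partial derivative with respect to $t$ was replaced by the ordinary derivative. Instead of considering equations for second-order moments, it is more convenient to derive equations for the variance
$$\langle \tilde{X}_k \tilde{X}_n \rangle = \big\langle (X_k - \langle X_k \rangle)(X_n - \langle X_n \rangle) \big\rangle = \langle X_k X_n \rangle - \langle X_k \rangle \langle X_n \rangle. \tag{10.100}$$
By differentiating this variance expression we obtain
$$\frac{d \langle \tilde{X}_k \tilde{X}_n \rangle}{dt} = \frac{d \langle X_k X_n \rangle}{dt} - \frac{d \langle X_k \rangle \langle X_n \rangle}{dt} = \frac{d \langle X_k X_n \rangle}{dt} - \frac{d \langle X_k \rangle}{dt} \langle X_n \rangle - \frac{d \langle X_n \rangle}{dt} \langle X_k \rangle. \tag{10.101}$$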
The use of Eqs. (10.94) and (10.99) then implies the following variance equations,
$$\begin{aligned} \frac{d \langle \tilde{X}_k \tilde{X}_n \rangle}{dt} &= \langle X_n D_k \rangle + \langle X_k D_n \rangle + 2 \langle D_{nk} \rangle - \langle D_k \rangle \langle X_n \rangle - \langle D_n \rangle \langle X_k \rangle \\ &= \langle \tilde{X}_n \tilde{D}_k \rangle + \langle \tilde{X}_k \tilde{D}_n \rangle + 2 \langle D_{nk} \rangle, \end{aligned} \tag{10.102}$$
where the variance expression (10.100) is used for obtaining the last expression. The variance of one component is given by setting $k = n$. We have $\langle D_{kk} \rangle \geq 0$ as a consequence of the definition (10.86b) of $D_{kn}$. Hence, variances are produced by $\langle D_{kn} \rangle$: a nonzero $D_{kn}$ causes a diffusion process (the width of the PDF increases). For that reason $D_{kn}$ is called a diffusion coefficient. An equilibrium state may be reached asymptotically if the first two terms on the right-hand side of Eq. (10.102) appear with a negative sign, i.e., if these terms model a dissipation of variance.
Correlations. The Fokker-Planck equation (10.85) can be used to calculate the correlation between $X_i(t)$ and $X_j(t')$. We assume that $t \leq t' = t + r$, where $r$ is any non-negative time. By following the derivation of the corresponding correlation (8.33) for the case of one variable (see exercise 8.3.1), the correlation of $X_i(t)$ and $X_j(t + r)$ is found to be determined by the equation (see also Eq. (10.124))
$$\frac{d \langle \tilde{X}_i(t)\, \tilde{X}_j(t + r) \rangle}{dr} = \big\langle \tilde{X}_i(t)\, \tilde{D}_j(\mathbf{X}(t + r), t + r) \big\rangle. \tag{10.103}$$
Thus, the correlation is unaffected by the diffusion coefficient $D_{ij}$, i.e., correlations are not produced, but they relax according to the model provided by $D_j$.
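As a brief illustration of Eqs. (10.94) and (10.102), consider a single variable with a linear drift and a constant diffusion coefficient. The relaxation time $\tau$ and the constant $D_0$ are assumptions introduced only for this example:
$$D(x) = -\frac{x}{\tau}, \quad D_{11} = D_0 \quad \Longrightarrow \quad \frac{d \langle X \rangle}{dt} = -\frac{\langle X \rangle}{\tau}, \qquad \frac{d \langle \tilde{X}^2 \rangle}{dt} = -\frac{2}{\tau} \langle \tilde{X}^2 \rangle + 2 D_0.$$
Here the drift contribution enters with a negative sign, so the variance relaxes to the stationary value $\langle \tilde{X}^2 \rangle = D_0 \tau$, at which the production of variance by diffusion balances its dissipation.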
10.4.3 A Solution to the Fokker-Planck Equation
Equation Considered. Let us illustrate the application of the Fokker-Planck equation (10.85) and demonstrate characteristic solution properties by considering an example that enables the derivation of an analytical solution. The equation considered is a vectorial generalization of Eq. (8.34) for a single variable,
$$\frac{\partial f(\mathbf{x}, t)}{\partial t} = -\frac{\partial}{\partial x_i} \big[ G_i(t) + G_{ik}(t) (x_k - \langle X_k \rangle) \big] f(\mathbf{x}, t) + \frac{\partial^2 D_{ij}(t) f(\mathbf{x}, t)}{\partial x_i \partial x_j}. \tag{10.104}$$
The drift coefficient $D_i$ is a linear function of the variables $\mathbf{x}$, which may be seen as a first-order Taylor series of $D_i$. The inclusion of $\langle X_k \rangle$ in Eq. (10.104) defines $G_{ik}$ as the coefficient that controls the intensity of fluctuations about the mean $\langle X_k \rangle$. This linear model for $D_i$ is well suited for the characterization of near-equilibrium processes. The diffusion coefficient $D_{ij}$ is assumed to be only a function of time,
which is a convenient choice with regard to many applications. $D_{ij}$ is assumed to be positive definite. Equation (10.104) will be combined with the assumption of natural boundary conditions, that is, $f(\mathbf{x}, t) \to 0$ as $|\mathbf{x}| \to \infty$.
Solution Approach. Solutions $f(\mathbf{x}, t)$ to the Fokker-Planck equation (10.104) depend on the initial PDF $f(\mathbf{x}', t')$, which has to be provided. The influence of the initial PDF can be treated separately from the solution of the Fokker-Planck equation, which is very helpful for using solutions for a variety of initial PDFs. This can be achieved by using Eq. (10.84), which represents the one-point PDF $f(\mathbf{x}, t)$ in terms of the PDF conditioned on the initial condition $\mathbf{X}(t') = \mathbf{x}'$,
$$f(\mathbf{x}, t) = \int f(\mathbf{x}, t \,|\, \mathbf{x}', t')\, f(\mathbf{x}', t')\, d\mathbf{x}'. \tag{10.105}$$
The idea of this approach is to calculate a general expression for the conditional PDF $f(\mathbf{x}, t \,|\, \mathbf{x}', t')$ independent of the initial PDF $f(\mathbf{x}', t')$, and then to calculate the PDF $f(\mathbf{x}, t)$ by the integration in Eq. (10.105). But how can we calculate $f(\mathbf{x}, t \,|\, \mathbf{x}', t')$? In terms of Eq. (10.105), the Fokker-Planck equation (10.104) can be written
$$0 = \int \left\{ \frac{\partial f(\mathbf{x}, t \,|\, \mathbf{x}', t')}{\partial t} + \frac{\partial}{\partial x_i} \big[ G_i(t) + G_{ik}(t) (x_k - \langle X_k \rangle) \big] f(\mathbf{x}, t \,|\, \mathbf{x}', t') - \frac{\partial^2 D_{ij}(t) f(\mathbf{x}, t \,|\, \mathbf{x}', t')}{\partial x_i \partial x_j} \right\} f(\mathbf{x}', t')\, d\mathbf{x}'. \tag{10.106}$$
Hence, the conditional PDF $f(\mathbf{x}, t \,|\, \mathbf{x}', t')$ also has to satisfy the Fokker-Planck equation (10.104), i.e., it has to satisfy the equation
$$\frac{\partial f(\mathbf{x}, t \,|\, \mathbf{x}', t')}{\partial t} = -\frac{\partial}{\partial x_i} \big[ G_i(t) + G_{ik}(t) (x_k - \langle X_k \rangle) \big] f(\mathbf{x}, t \,|\, \mathbf{x}', t') + \frac{\partial^2 D_{ij}(t) f(\mathbf{x}, t \,|\, \mathbf{x}', t')}{\partial x_i \partial x_j}. \tag{10.107}$$
Eq. (10.83) provides the initial condition for the conditional PDF $f(\mathbf{x}, t \,|\, \mathbf{x}', t')$,
$$f(\mathbf{x}, t' \,|\, \mathbf{x}', t') = \frac{\big\langle \delta(\mathbf{x} - \mathbf{X}(t'))\, \delta(\mathbf{x}' - \mathbf{X}(t')) \big\rangle}{\big\langle \delta(\mathbf{x}' - \mathbf{X}(t')) \big\rangle} = \delta(\mathbf{x} - \mathbf{x}')\, \frac{\big\langle \delta(\mathbf{x}' - \mathbf{X}(t')) \big\rangle}{\big\langle \delta(\mathbf{x}' - \mathbf{X}(t')) \big\rangle} = \delta(\mathbf{x} - \mathbf{x}'), \tag{10.108}$$
where the sifting property of delta functions is used.
Conditional PDF Calculation. The conditional PDF is a normal PDF for the single-variable case (see Sect. 8.3.2). Therefore, we may assume that $f(\mathbf{x}, t \,|\, \mathbf{x}', t')$ is also given by a normal PDF (an N-dimensional normal PDF for our case),
$$f(\mathbf{x}, t \,|\, \mathbf{x}', t') = \frac{1}{(2\pi)^{N/2} \sqrt{\det(\beta)}} \exp\left\{ -\frac{1}{2}\, \beta^{-1}{}_{ij}\, (x_i - \alpha_i)(x_j - \alpha_j) \right\}. \tag{10.109}$$
Here, $\alpha_i$ are the mean values and $\beta_{ij}$ represent the elements of the variance matrix, which is positive definite and symmetric ($\beta_{ij} = \beta_{ji}$). Therefore, the inverse matrix $\beta^{-1}{}_{ij}$ exists, and it is symmetric, i.e., $\beta^{-1}{}_{ij} = \beta^{-1}{}_{ji}$. Another way of looking at the assumption (10.109) is the following: we ask under which conditions a normal PDF can be a solution of a Fokker-Planck equation. To prove the suitability of the assumption (10.109) we have to show that Eq. (10.109) can satisfy Eq. (10.107). This is shown in exercise 10.4.4. It is found that the model parameters $\alpha_i$ and $\beta_{ij}$ have to satisfy the equations
$$\frac{d \alpha_i}{dt} = G_i + G_{ik} (\alpha_k - \langle X_k \rangle), \tag{10.110a}$$
$$\frac{d \beta_{ij}}{dt} = G_{ik} \beta_{kj} + G_{jk} \beta_{ki} + 2 D_{ij}. \tag{10.110b}$$
The initial conditions for $\alpha_i$ and $\beta_{ij}$ are given by
$$\alpha_i(t') = x'_i, \tag{10.111a}$$
$$\beta_{ij}(t') = 0. \tag{10.111b}$$
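The parameter equations (10.110) with the initial conditions (10.111) can be integrated numerically. The following is a minimal sketch, not a definitive implementation: it assumes constant coefficients, $G_i = 0$ and a constant mean $\langle X_k \rangle$, and it uses a classical fourth-order Runge-Kutta step. The function name `integrate_alpha_beta` and all numerical values are hypothetical.

```python
import numpy as np

def integrate_alpha_beta(G, D, x_init, mean_x, t_end, n_steps=2000):
    """Integrate Eqs. (10.110) from the initial conditions (10.111).

    Assumptions of this sketch: G_i = 0, and G_ik, D_ij, <X_k> constant in time.
    """
    alpha = np.array(x_init, dtype=float)        # Eq. (10.111a): alpha(t') = x'
    beta = np.zeros((alpha.size, alpha.size))    # Eq. (10.111b): beta(t') = 0
    dt = t_end / n_steps

    def rhs(a, b):
        d_alpha = G @ (a - mean_x)               # Eq. (10.110a) with G_i = 0
        d_beta = G @ b + b @ G.T + 2.0 * D       # Eq. (10.110b)
        return d_alpha, d_beta

    for _ in range(n_steps):                     # classical RK4 step
        k1a, k1b = rhs(alpha, beta)
        k2a, k2b = rhs(alpha + 0.5 * dt * k1a, beta + 0.5 * dt * k1b)
        k3a, k3b = rhs(alpha + 0.5 * dt * k2a, beta + 0.5 * dt * k2b)
        k4a, k4b = rhs(alpha + dt * k3a, beta + dt * k3b)
        alpha = alpha + dt * (k1a + 2 * k2a + 2 * k3a + k4a) / 6.0
        beta = beta + dt * (k1b + 2 * k2b + 2 * k3b + k4b) / 6.0
    return alpha, beta

# Hypothetical two-variable example with uncoupled relaxation.
G = np.array([[-1.0, 0.0], [0.0, -2.0]])
D = np.array([[0.5, 0.0], [0.0, 0.5]])
alpha, beta = integrate_alpha_beta(G, D, x_init=[2.0, -1.0],
                                   mean_x=np.zeros(2), t_end=1.0)
```

For this uncoupled example the results can be compared with the closed-form solutions $\alpha_1 = 2 e^{-t}$, $\alpha_2 = -e^{-2t}$, $\beta_{11} = (1 - e^{-2t})/2$ and $\beta_{22} = (1 - e^{-4t})/4$.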
Means and Variances Implied by the Fokker-Planck Equation. Next, let us have a look at the means and variances of $f(\mathbf{x}, t)$ that are implied by the Fokker-Planck equation (10.104). The simplest way to obtain these equations is to specialize the general Eqs. (10.94) and (10.102), which are valid for every Fokker-Planck equation,
$$\frac{d \langle X_k \rangle}{dt} = G_k, \tag{10.112a}$$
$$\frac{d \langle \tilde{X}_k \tilde{X}_n \rangle}{dt} = G_{km} \langle \tilde{X}_m \tilde{X}_n \rangle + G_{nm} \langle \tilde{X}_m \tilde{X}_k \rangle + 2 D_{nk}. \tag{10.112b}$$
Equations (10.112) are similar to Eqs. (10.110) for the parameters $\alpha_k$ and $\beta_{kn}$ of the conditional PDF. To see the difference, we apply Eqs. (10.110) and (10.112) for deriving the following equations
$$\frac{d (\alpha_k - \langle X_k \rangle)}{dt} = G_{kn} (\alpha_n - \langle X_n \rangle), \tag{10.113a}$$
$$\frac{d \big( \beta_{kn} - \langle \tilde{X}_k \tilde{X}_n \rangle \big)}{dt} = G_{km} \big( \beta_{mn} - \langle \tilde{X}_m \tilde{X}_n \rangle \big) + G_{nm} \big( \beta_{mk} - \langle \tilde{X}_m \tilde{X}_k \rangle \big). \tag{10.113b}$$
The coefficient $G_{km}$ is usually provided with a negative sign to model a relaxation of fluctuations. For this case, $\alpha_k$ and $\beta_{kn}$ relax to the means and variances of $f(\mathbf{x}, t)$:
the stationary values of $\alpha_k$ and $\beta_{kn}$, for which the left-hand sides of Eqs. (10.113) are zero, are given by
$$\alpha_k = \langle X_k \rangle, \tag{10.114a}$$
$$\beta_{kn} = \langle \tilde{X}_k \tilde{X}_n \rangle. \tag{10.114b}$$
For this case, the conditional PDF $f(\mathbf{x}, t \,|\, \mathbf{x}', t')$ is independent of $\mathbf{x}'$ because its parameters are independent of $\mathbf{x}'$. Equation (10.105) reveals that the PDF $f(\mathbf{x}, t)$ is then equal to the conditional PDF $f(\mathbf{x}, t \,|\, \mathbf{x}', t')$. Therefore, the unconditional PDF $f(\mathbf{x}, t)$, which may have any shape initially, relaxes asymptotically (independent of the initial conditions) to a normal PDF.
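The relaxation of a non-normal initial PDF described in this section can be checked numerically by evaluating Eq. (10.105) on a grid. The sketch below is only an illustration for a single variable; it assumes $G_1 = 0$ (so that the mean stays constant), a constant negative relaxation coefficient and a constant diffusion coefficient, for which Eqs. (10.110) and (10.111) have simple closed-form solutions. The function names and the bimodal initial PDF are hypothetical.

```python
import numpy as np

def normal_pdf(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def unconditional_pdf(x, x_init, f_init, G, D, t):
    """Evaluate Eq. (10.105) on a uniform grid for one variable.

    Assumes a drift G * (x - <X>) with constant G < 0 and constant D.
    """
    dx_init = x_init[1] - x_init[0]
    mean = np.sum(x_init * f_init) * dx_init             # constant mean <X>
    alpha = mean + (x_init - mean) * np.exp(G * t)       # solves (10.110a), (10.111a)
    beta = -(D / G) * (1.0 - np.exp(2.0 * G * t))        # solves (10.110b), (10.111b)
    # Conditional normal PDF (10.109): rows index x, columns index x'.
    cond = normal_pdf(x[:, None], alpha[None, :], beta)
    return cond @ f_init * dx_init                       # integral over x'

x = np.linspace(-10.0, 10.0, 801)
dx = x[1] - x[0]
f_init = 0.5 * normal_pdf(x, -2.0, 0.25) + 0.5 * normal_pdf(x, 2.0, 0.25)
G, D, t = -1.0, 0.5, 1.0
f = unconditional_pdf(x, x, f_init, G, D, t)

norm = np.sum(f) * dx
mean = np.sum(x * f) * dx
var = np.sum((x - mean) ** 2 * f) * dx
var_init = np.sum(x ** 2 * f_init) * dx                  # initial mean is zero by symmetry
# Variance implied by Eq. (10.112b) for constant coefficients.
var_expected = var_init * np.exp(2.0 * G * t) - (D / G) * (1.0 - np.exp(2.0 * G * t))
```

The variance computed from the superposition agrees with the solution of the variance equation (10.112b), and for larger `t` the result approaches a normal PDF with variance $-D/G$ regardless of the bimodal start.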
10.4.4 Stochastic Differential Equations
Stochastic Differential Equations. In analogy to the discussion of the relationship between the Fokker-Planck equation (8.21) and the stochastic differential equation (8.55) for a single variable, let us now consider the corresponding relationship for several variables. For the case of an N-dimensional stochastic process $\mathbf{X}(t) = \{X_1(t), X_2(t), \cdots, X_N(t)\}$ we generalize the Markovian stochastic equation (8.55) by the equation
$$\frac{dX_i}{dt}(t) = a_i(\mathbf{X}, t) + b_{ik}(\mathbf{X}, t)\, \frac{dW_k}{dt}(t). \tag{10.115}$$
Here, the coefficients $a_i(\mathbf{X}(t), t)$ and $b_{ik}(\mathbf{X}(t), t)$ are any deterministic functions of $\mathbf{X}(t)$ and $t$. The normally distributed vectorial process $dW_k / dt$ is characterized by
$$\left\langle \frac{dW_k}{dt}(t) \right\rangle = 0, \tag{10.116a}$$
$$\left\langle \frac{dW_k}{dt}(t)\, \frac{dW_n}{dt}(t') \right\rangle = \delta_{kn}\, \delta(t - t'). \tag{10.116b}$$
Relation (10.116a) corresponds to Eq. (8.57). Relation (10.116b) corresponds to Eq. (8.60) for $k = n$. For $k \neq n$ this relation means that $dW_k / dt$ is uncorrelated with $dW_n / dt$. The process $dW_k / dt$ is assumed to be independent of $\mathbf{X}(t_0)$. Because the change of the stochastic process $\mathbf{X}(t)$ is fully determined by $a_i(\mathbf{X}(t), t)$, $b_{ik}(\mathbf{X}(t), t)$, and $dW_k / dt$, the equation system (10.115) describes the evolution of $\mathbf{X}(t)$ as a Markov process: the future of the statistical properties of $\mathbf{X}(t)$ is fully determined by the present state.
Stochastic Difference Equations. The representation of the stochastic differential equation (10.115) as a stochastic difference equation is relevant, e.g., to the numerical solution of Eq. (10.115) and to the derivation of the relationship to the Fokker-Planck equation (10.85): see the discussion in the next paragraph. To address this question we integrate Eq. (10.115) from $t$ to $t + \Delta t$,
$$X_i(t + \Delta t) - X_i(t) = \int_t^{t + \Delta t} a_i(\mathbf{X}(s), s)\, ds + \int_t^{t + \Delta t} b_{ik}(\mathbf{X}(s), s)\, \frac{dW_k}{ds}(s)\, ds, \tag{10.117}$$
where $\Delta t$ is a sufficiently small time interval. As for the single-variable case we use the Itô definition of stochastic integration, i.e., we approximate $a_i(\mathbf{X}(s), s)$ and $b_{ik}(\mathbf{X}(s), s)$ by their values at the lower bound $t$. Then, Eq. (10.117) can be written
$$X_i(t + \Delta t) - X_i(t) = a_i(\mathbf{X}(t), t)\, \Delta t + b_{ik}(\mathbf{X}(t), t)\, \Delta W_k(t), \tag{10.118}$$
where $\Delta W_k(t)$ is defined by
$$\Delta W_k(t) = \int_t^{t + \Delta t} \frac{dW_k}{ds}(s)\, ds = W_k(t + \Delta t) - W_k(t). \tag{10.119}$$
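The difference equation (10.118) can be used directly as a time-stepping rule, which is commonly known as the Euler-Maruyama scheme. The following sketch assumes that the increments $\Delta W_k$ are independent normal variables with zero mean and variance $\Delta t$, in line with the properties (10.120). The function name `euler_maruyama`, the array layout, and the linear test problem are assumptions made for illustration only.

```python
import numpy as np

def euler_maruyama(a, b, x0, dt, n_steps, rng):
    """Advance an ensemble of paths with Eq. (10.118).

    a(x, t) returns an array of shape (n_paths, N);
    b(x, t) returns an array of shape (n_paths, N, N).
    Both are evaluated at the start of each step (Ito definition).
    """
    x = np.array(x0, dtype=float)
    t = 0.0
    for _ in range(n_steps):
        # Wiener increments: zero mean, variance dt, uncorrelated components.
        dW = rng.normal(0.0, np.sqrt(dt), size=x.shape)
        x = x + a(x, t) * dt + np.einsum("pik,pk->pi", b(x, t), dW)
        t += dt
    return x

# Hypothetical test problem: a_i = -x_i and b_ik = sqrt(2) * delta_ik.
rng = np.random.default_rng(0)
n_paths, N = 20000, 2
drift = lambda x, t: -x
noise = lambda x, t: np.broadcast_to(np.sqrt(2.0) * np.eye(N), (x.shape[0], N, N))
x_end = euler_maruyama(drift, noise, np.zeros((n_paths, N)),
                       dt=0.01, n_steps=500, rng=rng)
sample_mean = x_end.mean(axis=0)
sample_var = x_end.var(axis=0)
```

For this test problem the stationary variance of each component is one, so `sample_var` should be close to one up to sampling and time-step errors.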
The properties of $\Delta W_k(t)$ can be derived in terms of Eq. (10.116),
$$\langle \Delta W_k(t) \rangle = \int_t^{t + \Delta t} \left\langle \frac{dW_k}{ds}(s) \right\rangle ds = 0, \tag{10.120a}$$
$$\begin{aligned} \langle \Delta W_k(t)\, \Delta W_n(t') \rangle &= \int_t^{t + \Delta t} \int_{t'}^{t' + \Delta t} \left\langle \frac{dW_k}{ds}(s)\, \frac{dW_n}{ds'}(s') \right\rangle ds'\, ds \\ &= \delta_{kn} \int_t^{t + \Delta t} \int_{t'}^{t' + \Delta t} \delta(s' - s)\, ds'\, ds = \delta_{kn} \int_t^{t + \Delta t} \int_{t'}^{t' + \Delta t} \frac{d\theta(s' - s)}{ds'}\, ds'\, ds \\ &= \delta_{kn} \int_t^{t + \Delta t} \big[ \theta(t' + \Delta t - s) - \theta(t' - s) \big]\, ds = \delta_{kn}\, \Delta t \begin{cases} 1 & \text{if } t = t' \\ 0 & \text{if } t \neq t' \end{cases}. \end{aligned} \tag{10.120b}$$
Here, $t'$ changes by $\Delta t$ as does $t$, that is, $t' = t + k\, \Delta t$, where $k = 0, \pm 1, \pm 2, \ldots$. Thus, the integral in the last line is only nonzero and equal to $\Delta t$ if $t' = t$, which explains the final result of Eq. (10.120b).
Relationship to Fokker-Planck Equation. The question about the relationship to the Fokker-Planck equation (10.85) can be addressed by calculating the first two coefficients of the Kramers-Moyal equation (which has to be written for the case of many variables). According to Eq. (10.86), these coefficients become
$$D_i(\mathbf{x}, t) = \lim_{\Delta t \to 0} \frac{1}{\Delta t} \big\langle X_i(t + \Delta t) - X_i(t) \,\big|\, \mathbf{x}, t \big\rangle, \tag{10.121a}$$
$$D_{ij}(\mathbf{x}, t) = \lim_{\Delta t \to 0} \frac{1}{2 \Delta t} \big\langle [X_i(t + \Delta t) - X_i(t)]\, [X_j(t + \Delta t) - X_j(t)] \,\big|\, \mathbf{x}, t \big\rangle. \tag{10.121b}$$
The use of Eq. (10.118) in these expressions leads to
$$D_i(\mathbf{x}, t) = \lim_{\Delta t \to 0} \frac{1}{\Delta t} \big\langle a_i(\mathbf{X}(t), t)\, \Delta t + b_{ik}(\mathbf{X}(t), t)\, \Delta W_k(t) \,\big|\, \mathbf{x}, t \big\rangle = a_i(\mathbf{x}, t), \tag{10.122a}$$
$$\begin{aligned} D_{ij}(\mathbf{x}, t) &= \lim_{\Delta t \to 0} \frac{1}{2 \Delta t} \big\langle [a_i(\mathbf{X}(t), t)\, \Delta t + b_{ik}(\mathbf{X}(t), t)\, \Delta W_k(t)]\, [a_j(\mathbf{X}(t), t)\, \Delta t + b_{jn}(\mathbf{X}(t), t)\, \Delta W_n(t)] \,\big|\, \mathbf{x}, t \big\rangle \\ &= \frac{1}{2}\, b_{ik}(\mathbf{x}, t)\, b_{jn}(\mathbf{x}, t)\, \delta_{kn} = \frac{1}{2}\, b_{ik}(\mathbf{x}, t)\, b_{jk}(\mathbf{x}, t), \end{aligned} \tag{10.122b}$$
where the properties (10.120) of $\Delta W_k$ are used. The corresponding calculation of higher-order coefficients of the multivariate Kramers-Moyal equation leads to the same conclusion as obtained for the single-variable case: all these coefficients are zero because they are of higher order in $\Delta t$. Thus, the stochastic Eq. (10.115) uniquely implies a Fokker-Planck equation that determines the PDF evolution. However, a Fokker-Planck equation does not fully determine a stochastic differential equation in general. For $N$ variables, Eq. (10.122b) provides $N (N + 1) / 2$ equations for $N^2$ elements of $b_{ij}$ (e.g., for $N = 6$ there are only 21 equations for 36 elements of $b_{ij}$). Therefore, the coefficients of the stochastic Eq. (10.115) are only uniquely determined by the Fokker-Planck coefficients $D_i$ and $D_{ij}$ if $b_{ij}$ is assumed to be symmetric so that only $N (N + 1) / 2$ elements of $b_{ij}$ have to be determined.
Correlations. It was shown in the previous paragraph that the stochastic differential equation (10.115) is consistent with the Fokker-Planck equation (10.85) with regard to the one-point statistics (i.e., the PDF, means and variances), but the corresponding consistency regarding the correlation dynamics is not demonstrated in this way. To address this question we use the stochastic differential equation for deriving an equation for the correlation function of $X_i(t)$ and $X_j(t + r)$, where $r$ is any non-negative time. For doing this we consider
$$\begin{aligned} \frac{d \langle \tilde{X}_i(t)\, X_j(t + r) \rangle}{dr} &= \left\langle \tilde{X}_i(t)\, \frac{dX_j}{dt}(t + r) \right\rangle \\ &= \big\langle \tilde{X}_i(t)\, a_j(\mathbf{X}(t + r), t + r) \big\rangle + \left\langle \tilde{X}_i(t)\, b_{jk}(\mathbf{X}(t + r), t + r)\, \frac{dW_k}{dt}(t + r) \right\rangle. \end{aligned} \tag{10.123}$$
The last equation arises from the use of Eq. (10.115). The noise term $dW_k / dt\,(t + r)$ is independent of $\mathbf{X}(t)$ and $\mathbf{X}(t + r)$ because only noise at times before $t$ and $t + r$ can affect $\mathbf{X}(t)$ and $\mathbf{X}(t + r)$, respectively. Thus, the last term is zero and we obtain
$$\frac{d \langle \tilde{X}_i(t)\, \tilde{X}_j(t + r) \rangle}{dr} = \big\langle \tilde{X}_i(t)\, \tilde{a}_j(\mathbf{X}(t + r), t + r) \big\rangle. \tag{10.124}$$
Here, Xj and aj are replaced by the corresponding fluctuations because the means of Xj and aj do not affect the result. The result obtained generalizes Eq. (8.72) to the case of several variables, and it recovers Eq. (10.103) if Dj = aj is taken into account. In this way, the consistency between the stochastic differential equation (10.115) and Fokker-Planck equation (10.85) is also demonstrated regarding the implied correlation dynamics.
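The statement of this section that a diffusion coefficient $D_{ij}$ does not fix $b_{ij}$ uniquely can be illustrated numerically. The sketch below constructs two different matrices $b$ that reproduce the same $D$ through Eq. (10.122b): a symmetric one, obtained from the eigen-decomposition of $2D$, and a lower-triangular one, obtained from a Cholesky factorization. The matrix `D` is a hypothetical example, and the Cholesky factor is only one of many possible non-symmetric choices.

```python
import numpy as np

# Hypothetical symmetric, positive definite diffusion coefficient D_ij.
D = np.array([[2.0, 0.6],
              [0.6, 1.0]])

# Symmetric choice: matrix square root of 2 D via its eigen-decomposition.
eigval, eigvec = np.linalg.eigh(2.0 * D)
b_sym = eigvec @ np.diag(np.sqrt(eigval)) @ eigvec.T

# Non-symmetric alternative: lower-triangular Cholesky factor of 2 D.
b_chol = np.linalg.cholesky(2.0 * D)

# Both satisfy D_ij = b_ik b_jk / 2, Eq. (10.122b).
D_from_sym = 0.5 * b_sym @ b_sym.T
D_from_chol = 0.5 * b_chol @ b_chol.T
```

Both choices lead to the same Fokker-Planck equation, although the individual sample paths generated with the same noise realization would differ.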
10.5 Molecular and Fluid Motion
The mathematical modeling of molecular and fluid motion is a problem that is relevant to a huge variety of processes in nature (e.g., atmospheric dynamics) and technology (e.g., reactor chemistry). Actually, we consider only one process: the motion of molecules of a fluid. The difference between the terms molecular and fluid motion is given by the scale considered. With regard to molecular motion we are interested in an understanding of elementary processes with a typical length scale of about $10^{-9}$ m, whereas the consideration of fluid dynamics means to look at processes with a typical length scale of about $10^{-3}$ m. Fluid dynamic variables represent means of molecular variables. For example, the mean molecular velocity is equal to the fluid dynamic velocity. Therefore, we will derive here the equations for fluid motion as the moment equations that are implied by a stochastic model for the molecular motion. From a mathematical point of view, the goal of this section is to illustrate the application of stochastic differential equations and the Fokker-Planck equation. In particular, the goals are to show:
• the typical structure of stochastic differential equations for a real problem,
• the problem related to the numerical solution of such stochastic equations,
• the use of analysis tools for deriving moment equations,
• the typical closure problem of moment equations,
• a consistent and systematic way to develop closed moment equations,
• ways to assess the range of validity of different moment equations.
The focus here is on the modeling problem, that is, the derivation of closed equations for molecular and fluid motion. Unfortunately, the equations obtained cannot be solved analytically, and numerical solutions turn out to be extremely expensive. Interested readers may find more information about solutions of these equations elsewhere (Pope 2000, Heinz 2003, 2004, Fox 2003, Givi 2006, Jenny et al. 2010).
The modeling problem will be considered here in its simplest form, that is, without accounting for additional variables (like mass fractions of chemical species) or forces (like the gravity force). Such modifications, which may be relevant to applications, can be taken into account by following the methodology presented below.
10.5.1 Molecular Motion Model
Stochastic Molecular Motion Model. Attention will be restricted here to the case of monatomic fluids, which do not have internal degrees of freedom (rotational or vibrational energy). The molecules are assumed to move independently. This corresponds to the consideration of a perfect gas. The state of each molecule is completely described by its position $x_i^*$ and velocity $V_i^*$. Here, the subscript $i = 1, 2, 3$ indicates the three position and velocity components in physical space. The equations considered for $x_i^*$ and $V_i^*$ are given by (Heinz 2003, 2004, 2007)
$$\frac{dx_i^*}{dt} = V_i^*, \tag{10.125a}$$
$$\frac{dV_i^*}{dt} = -\frac{V_i^* - U_i}{\tau} + \sqrt{\frac{4 e}{3 \tau}}\, \frac{dW_i}{dt}. \tag{10.125b}$$
Here, $dW_i / dt$ is the derivative of a Wiener process. This model involves three parameters: $U_i$ is the mean molecular velocity, $e$ is the specific kinetic energy (it has the dimension of a squared velocity), and $\tau$ is the characteristic time scale of molecular fluctuations. All three parameters may depend on time $t$ and the position of a molecule (the model parameters are functions of the position $\mathbf{x}$ in physical space, where $\mathbf{x}$ is replaced by $\mathbf{x}^*(t)$ in Eqs. (10.125)). The application of this model requires only the definition of the time scale $\tau$, because $U_i$ and $e$ can be calculated from molecular properties (by taking the mean over velocities and squared velocity fluctuations). An external force is not considered here for simplicity. The model considered represents an extension of the Brownian motion model (6.59). A difference is given by the inclusion of the mean velocity $U_i$ in the drift term here, which is assumed to be zero in the Brownian motion model (6.59). Equations (10.125) can be solved via Monte Carlo simulation, which enables the calculation of all relevant variables (like $U_i$) as means over particle properties. Nevertheless, this approach is computationally very expensive (Jenny et al. 2010). It is usually more convenient to consider equations for moments, which can be solved with lower computational cost.
Moment Equations Implied by Stochastic Model. Similar to the analysis of the Brownian motion model (6.59) we can study the consequences of the stochastic model (10.125) for statistical particle properties, that is, we can calculate the evolution of the mean particle position, velocity, and variances in time. However, our main interest here is in fluid dynamics, i.e., the properties of the fluid at a fixed position and time. Such fluid dynamics properties are given by the fluid mass density $\rho(\mathbf{x}, t)$ and fluid velocity $U_i(\mathbf{x}, t)$. Equations for $\rho(\mathbf{x}, t)$ and $U_i(\mathbf{x}, t)$ can be derived as a consequence of the Fokker-Planck equation that is implied by
the stochastic molecular model (10.125). However, these derivations are relatively lengthy. Therefore, only the resulting equations are presented here. All the details of how these equations can be obtained are given in the appendix of this section (see Sect. 10.5.3). All the equations presented in this paragraph are exact consequences of the stochastic model (10.125). The evolution equations for $\rho(\mathbf{x}, t)$ and $U_i(\mathbf{x}, t)$ can be presented efficiently in terms of the substantial derivative $DQ / Dt = \partial Q / \partial t + U_m\, \partial Q / \partial x_m$ (see the discussion of this derivative in Sect. 10.1), where $Q(\mathbf{x}, t)$ can be any variable. The equations for $\rho(\mathbf{x}, t)$ and $U_i(\mathbf{x}, t)$ that are implied by the stochastic molecular model (10.125) can then be written
$$\frac{D \rho}{Dt} + \rho\, \frac{\partial U_m}{\partial x_m} = 0, \tag{10.126a}$$
$$\frac{D U_i}{Dt} + \frac{1}{\rho} \frac{\partial \rho\, d_{im}}{\partial x_m} + \frac{2}{3 \rho} \frac{\partial \rho\, e}{\partial x_i} = 0. \tag{10.126b}$$
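There are two unknown variables in the last equation: the kinetic energy $e(\mathbf{x}, t)$ and deviatoric stress $d_{ij}(\mathbf{x}, t)$. Both $e$ and $d_{ij}$ are related to the variance of molecular velocities. In particular, $e$ represents the isotropic variance contribution, and $d_{ij}$ is the anisotropic variance contribution (see the corresponding explanations in Sect. 10.5.3). The stochastic model (10.125) implies an equation for the variance, which can be used to derive the following equations for $e$ and $d_{ij}$,
$$\frac{D e}{Dt} + \frac{1}{2 \rho} \frac{\partial \rho\, \overline{v_i v_i v_m}}{\partial x_m} + d_{mi}\, \frac{\partial U_i}{\partial x_m} + \frac{2}{3}\, e\, \frac{\partial U_i}{\partial x_i} = 0, \tag{10.127a}$$
$$\begin{aligned} \frac{D d_{ij}}{Dt} &+ \frac{1}{\rho} \frac{\partial \rho\, \overline{(v_i v_j - v_k v_k\, \delta_{ij} / 3)\, v_m}}{\partial x_m} + \frac{\partial U_i}{\partial x_m} \left( d_{mj} + \frac{2}{3}\, e\, \delta_{mj} \right) + \frac{\partial U_j}{\partial x_m} \left( d_{mi} + \frac{2}{3}\, e\, \delta_{mi} \right) \\ &- \frac{2}{3}\, \delta_{ij}\, \frac{\partial U_k}{\partial x_m} \left( d_{mk} + \frac{2}{3}\, e\, \delta_{mk} \right) = -\frac{2}{\tau}\, d_{ij}. \end{aligned} \tag{10.127b}$$
The $d_{ij}$ equation provides zero on both sides if we set $i = j$ and take the sum over $j$. Equations (10.127) again contain unknowns given by the terms that involve three velocity fluctuations $v_i$ (the triple correlation). An equation for these triple correlations can also be derived from the stochastic model. This equation reads
$$\begin{aligned} \frac{D\, \overline{v_i v_j v_k}}{Dt} &+ \frac{1}{\rho} \frac{\partial \rho\, \overline{v_i v_j v_k v_m}}{\partial x_m} - \frac{1}{\rho} \frac{\partial \rho\, \overline{v_i v_m}}{\partial x_m}\, \overline{v_j v_k} - \frac{1}{\rho} \frac{\partial \rho\, \overline{v_j v_m}}{\partial x_m}\, \overline{v_i v_k} - \frac{1}{\rho} \frac{\partial \rho\, \overline{v_k v_m}}{\partial x_m}\, \overline{v_i v_j} \\ &+ \frac{\partial U_i}{\partial x_m}\, \overline{v_m v_j v_k} + \frac{\partial U_j}{\partial x_m}\, \overline{v_m v_i v_k} + \frac{\partial U_k}{\partial x_m}\, \overline{v_m v_i v_j} = -\frac{3}{\tau}\, \overline{v_i v_j v_k}. \end{aligned} \tag{10.128}$$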
Discussion of Moment Equations. It is interesting that Eqs. (10.126) for the fluid mass density $\rho(\mathbf{x}, t)$ and the fluid velocity $U_i(\mathbf{x}, t)$ are equal to equations that follow from several molecular motion equations, for example the Boltzmann equation (Heinz 2003, 2004, Jenny et al. 2010). The same applies to the left-hand sides of Eqs. (10.127) and (10.128). Hence, the influence of the stochastic molecular model (10.125) considered appears only on the right-hand sides of the $d_{ij}$ Eq. (10.127b) and the triple correlation equation (10.128). Other molecular motion models provide right-hand sides of the $d_{ij}$ and triple correlation equations that have the same structure (Jenny et al. 2010). The equation system (10.126)-(10.128) of coupled fluid dynamics equations is still unclosed due to the appearance of the term with four velocity fluctuations $v_i$ in Eq. (10.128).
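The Monte Carlo solution of the molecular model (10.125) mentioned in this section can be sketched for the simplest possible setting. The example below assumes a spatially homogeneous fluid, so that only the velocity equation (10.125b) has to be advanced and $U_i$, $e$ and $\tau$ can be prescribed as constants; in a general simulation $U_i$ and $e$ would have to be evaluated from the particles themselves. All numerical values are hypothetical, and the time stepping uses the simple explicit difference form of the stochastic equation.

```python
import numpy as np

rng = np.random.default_rng(1)
tau, e_model = 0.5, 1.5                    # hypothetical time scale and kinetic energy
U_model = np.array([1.0, 0.0, 0.0])        # hypothetical uniform mean velocity
n_particles, dt, n_steps = 20000, 0.01, 500

# All particles start at the mean velocity (zero initial variance).
V = np.tile(U_model, (n_particles, 1))
noise_amplitude = np.sqrt(4.0 * e_model / (3.0 * tau))
for _ in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt), size=V.shape)
    # Difference form of Eq. (10.125b): relaxation toward U_i plus noise.
    V = V - (V - U_model) / tau * dt + noise_amplitude * dW

# Fluid dynamic variables as means over particle properties.
U_est = V.mean(axis=0)
e_est = 0.5 * ((V - U_est) ** 2).sum(axis=1).mean()    # e = mean(v_i v_i) / 2
```

After several time scales $\tau$ the particle ensemble recovers the prescribed mean velocity and kinetic energy up to sampling and time-step errors, which indicates that the noise amplitude in Eq. (10.125b) is consistent with the definition of $e$.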
10.5.2 Fluid Dynamics Equations
Next, let us consider how it is possible to overcome the closure problem of Eqs. (10.126)-(10.128) described in the previous paragraph, i.e., how closed equations for fluid dynamics can be derived.
Algebraic Model for Fourth-Order Correlations. To close Eqs. (10.126)-(10.128) we need a model for the unknown fourth-order velocity correlations in the triple correlation equation (10.128). A corresponding closure model can be obtained by assuming that the velocity PDF can be approximated by a joint normal PDF, which leads (in generalization of Eqs. (10.43b) for the fourth-order correlations of a bivariate normal distribution) to the following parametrization of fourth-order central velocity moments,
$$\overline{v_i v_j v_k v_m} = \overline{v_i v_j}\; \overline{v_k v_m} + \overline{v_i v_k}\; \overline{v_j v_m} + \overline{v_i v_m}\; \overline{v_j v_k}. \tag{10.129}$$
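The factorization (10.129) can be checked by sampling from a joint normal distribution. The sketch below is only an illustration: the two-component covariance matrix `C` is hypothetical, and the agreement holds up to the sampling error of the estimated fourth moment.

```python
import numpy as np

rng = np.random.default_rng(2)
C = np.array([[1.0, 0.5],
              [0.5, 2.0]])      # hypothetical velocity covariance (second moments)
v = rng.multivariate_normal(np.zeros(2), C, size=400_000)

i, j, k, m = 0, 0, 1, 1         # component indices of the fourth moment
# Sampled fourth-order moment of the velocity fluctuations.
lhs = np.mean(v[:, i] * v[:, j] * v[:, k] * v[:, m])
# Right-hand side of Eq. (10.129), built from second moments only.
rhs = C[i, j] * C[k, m] + C[i, k] * C[j, m] + C[i, m] * C[j, k]
```

Other index combinations can be tested in the same way by changing `i`, `j`, `k` and `m`.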
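This approximation represents a reasonable assumption for all fluids that are not too far from an equilibrium state. It is relevant to see that this assumption affects only the evolution of triple correlations, which are small for fluids that are close to an equilibrium state. The gradient of fourth-order correlations required in the triple correlation equation (10.128) is then given by
$$\begin{aligned} \frac{1}{\rho} \frac{\partial \rho\, \overline{v_i v_j v_k v_m}}{\partial x_m} &= \frac{\partial\, \overline{v_i v_j}}{\partial x_m}\, \overline{v_k v_m} + \frac{\partial\, \overline{v_i v_k}}{\partial x_m}\, \overline{v_j v_m} + \frac{\partial\, \overline{v_j v_k}}{\partial x_m}\, \overline{v_i v_m} \\ &+ \frac{1}{\rho} \frac{\partial \rho\, \overline{v_k v_m}}{\partial x_m}\, \overline{v_i v_j} + \frac{1}{\rho} \frac{\partial \rho\, \overline{v_j v_m}}{\partial x_m}\, \overline{v_i v_k} + \frac{1}{\rho} \frac{\partial \rho\, \overline{v_i v_m}}{\partial x_m}\, \overline{v_j v_k}. \end{aligned} \tag{10.130}$$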
The use of this expression in Eq. (10.128) for velocity triple correlations leads to a closed equation for triple correlations,
$$\begin{aligned} \frac{D\, \overline{v_i v_j v_k}}{Dt} &+ \frac{\partial\, \overline{v_i v_j}}{\partial x_m}\, \overline{v_k v_m} + \frac{\partial\, \overline{v_i v_k}}{\partial x_m}\, \overline{v_j v_m} + \frac{\partial\, \overline{v_j v_k}}{\partial x_m}\, \overline{v_i v_m} \\ &+ \frac{\partial U_i}{\partial x_m}\, \overline{v_m v_j v_k} + \frac{\partial U_j}{\partial x_m}\, \overline{v_m v_i v_k} + \frac{\partial U_k}{\partial x_m}\, \overline{v_m v_i v_j} = -\frac{3}{\tau}\, \overline{v_i v_j v_k}. \end{aligned} \tag{10.131}$$
In this way we have derived a closed system of fluid dynamics equations given by Eqs. (10.126), (10.127), and (10.131).
Algebraic Model for Third-Order Correlations. The cost of simulations can be reduced by using Eq. (10.131) for the derivation of an algebraic approximation for the triple correlations. We assume that the substantial derivative and the terms that contain triple correlations multiplied with velocity gradients can be neglected in comparison to the other terms,
$$\overline{v_i v_j v_k} = -\frac{\tau}{3} \left( \frac{\partial\, \overline{v_i v_j}}{\partial x_m}\, \overline{v_k v_m} + \frac{\partial\, \overline{v_i v_k}}{\partial x_m}\, \overline{v_j v_m} + \frac{\partial\, \overline{v_j v_k}}{\partial x_m}\, \overline{v_i v_m} \right). \tag{10.132}$$
This model implies for the triple correlations in the energy equation
$$\begin{aligned} \overline{v_i v_i v_m} &= -\frac{2 \tau}{3} \left[ \frac{\partial e}{\partial x_n}\, \overline{v_m v_n} + \frac{\partial\, \overline{v_i v_m}}{\partial x_n}\, \overline{v_i v_n} \right] = -\frac{2 \tau}{3}\, \overline{v_i v_n}\, \frac{\partial \big( e\, \delta_{im} + \overline{v_i v_m} \big)}{\partial x_n} \\ &= -\frac{2 \tau}{3} \left( \frac{2}{3}\, e\, \delta_{in} + d_{in} \right) \frac{\partial}{\partial x_n} \left( \frac{5}{3}\, e\, \delta_{im} + d_{im} \right), \end{aligned} \tag{10.133}$$
where the variances $\overline{v_i v_m} = 2 e\, \delta_{im} / 3 + d_{im}$ are represented by the specific kinetic energy $e = \overline{v_i v_i} / 2$ and deviatoric stress $d_{im}$ (see Sect. 10.5.3). In this case, we have a closed system of fluid dynamics equations given by Eqs. (10.126) and (10.127) combined with Eqs. (10.132) and (10.133), respectively.
Algebraic Model for Second-Order Correlations. The fluid dynamics equations can be further simplified by the derivation of an algebraic model for $d_{ij}$. This model can be obtained by neglecting $D d_{ij} / Dt$, the gradients of triple correlations, and the anisotropy contributions $d_{ij}$ in the parenthesis terms of Eq. (10.127b),
$$\frac{2}{3}\, e \left( \frac{\partial U_i}{\partial x_j} + \frac{\partial U_j}{\partial x_i} - \frac{2}{3} \frac{\partial U_k}{\partial x_k}\, \delta_{ij} \right) = -\frac{2}{\tau}\, d_{ij}. \tag{10.134}$$
This relation can be written more efficiently by introducing the shear rate tensor
$$S_{ij} = \frac{1}{2} \left( \frac{\partial U_i}{\partial x_j} + \frac{\partial U_j}{\partial x_i} \right) \tag{10.135}$$
and the related deviatoric shear rate tensor
$$S_{ij}^d = S_{ij} - \frac{1}{3}\, S_{nn}\, \delta_{ij} = \frac{1}{2} \left( \frac{\partial U_i}{\partial x_j} + \frac{\partial U_j}{\partial x_i} - \frac{2}{3} \frac{\partial U_n}{\partial x_n}\, \delta_{ij} \right), \tag{10.136}$$
which is the deviation of the shear rate tensor from its isotropic part. According to its definition, $S_{ij}^d$ has the property $S_{ii}^d = 0$. In terms of the definition of $S_{ij}^d$, the algebraic model (10.134) for $d_{ij}$ can be written
$$d_{ij} = -2\, \frac{\tau e}{3}\, S_{ij}^d = -2 \nu\, S_{ij}^d. \tag{10.137}$$
The last expression introduces the diffusion coefficient
$$\nu = \frac{\tau e}{3}, \tag{10.138}$$
which is called the kinematic viscosity.
Fluid Dynamics Equations. The use of the approximation (10.137) simplifies the fluid dynamics equations significantly. Equations (10.126) now read
$$\frac{D \rho}{Dt} = -\rho\, S_{ii}, \tag{10.139a}$$
$$\frac{D U_i}{Dt} = \frac{2}{\rho} \frac{\partial \rho\, \nu\, S_{im}^d}{\partial x_m} - \frac{2}{3 \rho} \frac{\partial \rho\, e}{\partial x_i}. \tag{10.139b}$$
In the first equation we applied $S_{ii} = \partial U_i / \partial x_i$, which follows from Eq. (10.135). The second equation is often written in terms of the viscosity $\mu = \rho\, \nu$. The energy equation is given by Eq. (10.127a) combined with Eq. (10.133) for $\overline{v_i v_i v_m}$. In $\overline{v_i v_i v_m}$ we have to neglect the anisotropy contributions $d_{ij}$ in the parenthesis terms to be consistent with the approximations used in the $d_{ij}$ equation. We have
$$\overline{v_i v_i v_m} = -\frac{20\, \tau e}{27} \frac{\partial e}{\partial x_m} = -\frac{20}{9}\, \nu\, \frac{\partial e}{\partial x_m} = -2 \nu\, \frac{\gamma}{Pr} \frac{\partial e}{\partial x_m}. \tag{10.140}$$
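The numerical factors in Eq. (10.140) can be traced step by step. This is only a check of the arithmetic: it uses the isotropic parts of the two parenthesis terms in Eq. (10.133), the definition $\nu = \tau e / 3$ of Eq. (10.138), and the monatomic value $\gamma = 5/3$ of the ratio of specific heats, with $Pr$ denoting the Prandtl number:
$$\frac{2 \tau}{3} \cdot \frac{2 e}{3} \cdot \frac{5}{3} = \frac{20\, \tau e}{27} = \frac{20}{9}\, \nu, \qquad 2\, \frac{\gamma}{Pr} = \frac{20}{9} \;\Longrightarrow\; Pr = \frac{9 \gamma}{10} = \frac{3}{2} \quad \text{for } \gamma = \frac{5}{3}.$$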
The last form presents this expression in its standard formulation. Here, $Pr$ is the Prandtl number, and $\gamma = 1 + 2 / f$ is the ratio of specific heats, where $f$ counts the degrees of freedom. Monatomic gases have $f = 3$ degrees of freedom such that $\gamma = 5/3$. Therefore, we have $Pr = 3/2$ for the case considered (Jenny et al. 2010). By using Eq. (10.140) for the triple correlation and Eq. (10.137) for $d_{ij}$ we find the energy equation to be given by
$$\frac{D e}{Dt} = \frac{1}{\rho} \frac{\partial}{\partial x_m} \left( \rho\, \nu\, \frac{\gamma}{Pr} \frac{\partial e}{\partial x_m} \right) + 2 \nu\, S_{mi}^d\, \frac{\partial U_i}{\partial x_m} - \frac{2}{3}\, e\, \frac{\partial U_i}{\partial x_i}. \tag{10.141}$$
The last two terms of this equation can be rewritten in terms of the definitions of $S_{ij}$ and $S_{ij}^d$. We use again $S_{ii} = \partial U_i / \partial x_i$. The term involving $S_{mi}^d$ can be written
$$S_{mi}^d\, \frac{\partial U_i}{\partial x_m} = \frac{1}{2} \left( \frac{\partial U_i}{\partial x_m} + \frac{\partial U_m}{\partial x_i} - \frac{2}{3} \frac{\partial U_k}{\partial x_k}\, \delta_{im} \right) S_{mi}^d = S_{mi}^d\, S_{mi}^d, \tag{10.142}$$
which is a consequence of symmetry properties and the fact that $S_{ii}^d = 0$. Thus, the energy equation can be written as
$$\frac{D e}{Dt} = \frac{1}{\rho} \frac{\partial}{\partial x_m} \left( \rho\, \nu\, \frac{\gamma}{Pr} \frac{\partial e}{\partial x_m} \right) + 2 \nu\, S_{mi}^d\, S_{im}^d - \frac{2}{3}\, e\, S_{ii}. \tag{10.143}$$
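The tensor relations used in this derivation can be verified numerically for an arbitrary velocity gradient. The sketch below evaluates the definitions (10.135) to (10.138) and checks the identity (10.142); the velocity gradient `grad_U` and the values of $\tau$ and $e$ are hypothetical.

```python
import numpy as np

# Hypothetical velocity gradient: grad_U[i, j] = dU_i / dx_j.
grad_U = np.array([[0.2, 1.0, 0.0],
                   [0.3, -0.5, 0.4],
                   [0.0, 0.1, 0.6]])
tau, e = 1.0e-3, 1.5                         # hypothetical time scale and kinetic energy

S = 0.5 * (grad_U + grad_U.T)                # shear rate tensor, Eq. (10.135)
S_d = S - np.trace(S) / 3.0 * np.eye(3)      # deviatoric part, Eq. (10.136)
nu = tau * e / 3.0                           # kinematic viscosity, Eq. (10.138)
d = -2.0 * nu * S_d                          # deviatoric stress model, Eq. (10.137)

lhs = np.einsum("mi,im->", S_d, grad_U)      # S^d_mi dU_i/dx_m
rhs = np.einsum("mi,mi->", S_d, S_d)         # S^d_mi S^d_mi
viscous_heating = 2.0 * nu * rhs             # second term on the right of Eq. (10.143)
```

Because the viscous heating term is a sum of squares multiplied by a positive viscosity, it cannot become negative for any velocity gradient.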
Equations (10.139) combined with this energy equation represent a closed equation system. The equations can be presented in different ways by using the relations between the kinetic energy $e$, the pressure $p$ and the temperature $T$,
$$e = \frac{3 p}{2 \rho} = \frac{3}{2}\, R\, T. \tag{10.144}$$
Here, $R$ refers to the gas constant.
Navier-Stokes Equations. Equations (10.139) combined with Eq. (10.143) represent the Navier-Stokes equations, where the ratio $\gamma / Pr$ is chosen according to the fluid considered. The value $\gamma = 5/3$ is the correct value for monatomic gases, but the Prandtl number value $Pr = 3/2$ derived here as a consequence of the simple molecular model (10.125) needs adjustment (for most gases measurements show a more or less constant Prandtl number value $Pr = 2/3$). Which influences cause changes of the fluid dynamic variables? The mass density $\rho$ is changed by the dilatation $S_{ii}$, which measures compressibility. Changes of the fluid velocity $U_i$ are caused by two effects: molecular diffusion (the first term on the right-hand side) and kinetic energy (or pressure) gradients (the last term): a decreasing pressure in the $x_i$ direction (i.e., a negative pressure gradient) implies a positive acceleration $D U_i / Dt$ of the fluid in this direction. Changes of the kinetic energy can be caused by three effects. The first effect is given by molecular diffusion (the first term on the right-hand side). The second effect is given by viscous heating (the second term). This contribution is never negative. It arises from the conversion of kinetic energy into heat. The third effect is due to compressibility (the last term). Analytical solutions of the Navier-Stokes equations can only be found under very specific conditions. Numerical solutions of the Navier-Stokes equations turn out to be extremely expensive if the fluid considered is turbulent, which is the usual case (Pope 2000). Therefore, studies of fluid properties on the basis of these equations are usually very complicated. A simple illustration of characteristic properties of the Navier-Stokes equations was given in Chap. 9 by the discussion of the Lorenz equations and their chaotic solutions.
10.5.3 Appendix: Implications of the Stochastic Molecular Model

This section shows how the equations of fluid dynamics (10.126), (10.127), and (10.128) can be derived from the stochastic molecular motion model (10.125).

Joint PDF and Conditional PDF. The joint PDF for molecular positions $x_i^*(t)$ and velocities $V_i^*(t)$ involved in the molecular model (10.125) is defined by

\[
f(w, x, t) = \langle \delta(x^*(t) - x)\, \delta(V^*(t) - w) \rangle,
\tag{10.145}
\]
where x and w refer to the sample space positions and velocities, respectively. The brackets denote an ensemble average. To define fluid dynamic variables at fixed positions x we need the conditional PDF

\[
F(w, x, t) = \frac{\langle \delta(x^*(t) - x)\, \delta(V^*(t) - w) \rangle}{\langle \delta(x^*(t) - x) \rangle}.
\tag{10.146}
\]
The delta function δ(x*(t) − x) involved here is proportional to the instantaneous molecular mass density, which is defined by

\[
\rho^*(x, t) = M\, \delta(x^*(t) - x).
\tag{10.147}
\]
The integration of Eq. (10.147) over x shows that M = ∫ ρ*(x, t) dx. Hence, M is the total mass of molecules within the domain considered. The mean molecular mass density is given by averaging Eq. (10.147),

\[
\rho(x, t) = \langle \rho^*(x, t) \rangle.
\tag{10.148}
\]
By applying Eqs. (10.147) and (10.148), the conditional PDF can be written

\[
F(w, x, t) = \frac{1}{\rho(x, t)} \langle \rho^*(x, t)\, \delta(V^*(t) - w) \rangle.
\tag{10.149}
\]
Hence, the joint PDF f(w, x, t) and the conditional PDF F(w, x, t) are related by f(w, x, t) = ρ(x, t) F(w, x, t) / M. Expression (10.149) shows that F(w, x, t) integrates to one, that is, ∫ F(w, x, t) dw = 1.

Fluid Dynamic Variables. Integrations over F(w, x, t) provide fluid dynamic variables at a fixed position x and time t. For any function Q of velocities we find

\[
\begin{aligned}
\int Q(w)\, F(w, x, t)\, dw &= \frac{1}{\rho(x, t)} \int Q(w)\, \langle \rho^*(x, t)\, \delta(V^*(t) - w) \rangle\, dw \\
&= \frac{1}{\rho(x, t)} \int \langle \rho^*(x, t)\, Q(V^*(t))\, \delta(V^*(t) - w) \rangle\, dw \\
&= \frac{1}{\rho(x, t)} \langle \rho^*(x, t)\, Q(V^*(t)) \rangle = \overline{Q(V)}(x, t).
\end{aligned}
\tag{10.150}
\]
The first line makes use of the definition of F(w, x, t). In the second line, Q(w) is written inside the brackets and replaced by Q(V*(t)) according to the sifting property of delta functions. The normalization property of delta functions is used in the third line. The last expression introduces an abbreviation that reflects the physical meaning of the preceding expression: in this way we calculate the mass-density weighted mean over velocities at a fixed position x and time t. Examples for the use of Eq. (10.150) are given by the following definitions of the first three velocity moments,

\[
\overline{V_i}(x, t) = \int w_i\, F(w, x, t)\, dw = U_i(x, t),
\tag{10.151a}
\]

\[
\overline{V_i V_j}(x, t) = \int w_i w_j\, F(w, x, t)\, dw,
\tag{10.151b}
\]

\[
\overline{V_i V_j V_k}(x, t) = \int w_i w_j w_k\, F(w, x, t)\, dw.
\tag{10.151c}
\]
Relation (10.151a) relates the integral to the mean molecular velocity Ui, which is used in the stochastic molecular model formulation.

Fokker-Planck Equation. Equations for the fluid dynamic variables (10.151) can be found as a consequence of the stochastic molecular model (10.125). The equation for the joint PDF f(w, x, t), which is implied by the stochastic molecular model, is given by

\[
\frac{\partial f}{\partial t} = -\frac{\partial w_m f}{\partial x_m} - \frac{\partial}{\partial w_m} \left[ -\frac{w_m - U_m}{\tau}\, f \right] + \frac{2e}{3\tau} \frac{\partial^2 f}{\partial w_m \partial w_m}.
\tag{10.152}
\]
By multiplying this equation by M and using the relation between the joint PDF and conditional PDF, f(w, x, t) = ρ(x, t) F(w, x, t) / M, we find

\[
\frac{\partial \rho F}{\partial t} + \frac{\partial \rho w_m F}{\partial x_m} = \frac{\partial}{\partial w_m} \left[ \frac{w_m - U_m}{\tau}\, \rho F + \frac{2e}{3\tau} \frac{\partial \rho F}{\partial w_m} \right].
\tag{10.153}
\]
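The velocity part of the Fokker-Planck equation (10.152) can be explored numerically. The following is a minimal sketch, not the book's algorithm: it assumes a spatially homogeneous case with constant U, e, and τ, and it assumes that the velocity equation behind Eq. (10.152) has the linear form dV = −(V − U)/τ dt + (4e/(3τ))^{1/2} dW, which is the stochastic equation whose drift and diffusion coefficients match Eq. (10.152). The function name, the parameter values, and the Euler-Maruyama time stepping are illustrative choices.

```python
import numpy as np

def simulate_velocities(U, e, tau, n_particles=20000, dt=0.01, n_steps=1000, seed=0):
    """Euler-Maruyama integration of dV = -(V - U)/tau dt + sqrt(4e/(3 tau)) dW.

    Assumption: U, e, and tau are constants (homogeneous case).
    """
    rng = np.random.default_rng(seed)
    U = np.asarray(U, dtype=float)
    V = np.tile(U, (n_particles, 1))      # every particle starts at the mean velocity
    b = np.sqrt(4.0 * e / (3.0 * tau))    # noise coefficient; b**2 / 2 = 2e / (3 tau)
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt), size=V.shape)
        V += -(V - U) / tau * dt + b * dW
    return V

e, tau = 1.5, 1.0
V = simulate_velocities(U=[1.0, 0.0, 0.0], e=e, tau=tau)
v = V - V.mean(axis=0)                    # velocity fluctuations
cov = v.T @ v / len(v)                    # sample variance matrix
print("sample variance matrix:\n", cov)
print("target diagonal value 2e/3 =", 2.0 * e / 3.0)
```

After several relaxation times τ, the sample variance matrix should be close to (2e/3) δij, so that half its trace recovers the prescribed e. The agreement is limited by the sampling error and by the finite time step.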
Mass Density Equation. The integration of the latter equation over the velocity sample space w implies an equation for the mean mass density,

\[
\frac{\partial \rho}{\partial t} + \frac{\partial \rho U_m}{\partial x_m} = \int \frac{\partial}{\partial w_m} \left[ \frac{w_m - U_m}{\tau}\, \rho F + \frac{2e}{3\tau} \frac{\partial \rho F}{\partial w_m} \right] dw = 0.
\tag{10.154}
\]
We used here $\overline{V_i}(x, t) = U_i(x, t)$ and the normalization property of F. The terms on the right-hand side do not contribute because they are integrals over derivatives of terms that vanish at infinity. By distributing the spatial derivative we find

\[
\frac{D\rho}{Dt} + \rho \frac{\partial U_m}{\partial x_m} = 0,
\tag{10.155}
\]
which corresponds to Eq. (10.126a). We applied the definition of the substantial derivative DQ / Dt = ∂Q / ∂t + Um ∂Q / ∂xm (see Sect. 10.1), where Q(x, t) can be any variable.

Velocity Equation. An equation for the mean velocity can be derived by multiplication of Eq. (10.153) with wi and integration over the velocity sample space,

\[
\begin{aligned}
\frac{\partial \rho U_i}{\partial t} + \frac{\partial \rho \overline{V_i V_m}}{\partial x_m} &= \int w_i \frac{\partial}{\partial w_m} \left[ \frac{w_m - U_m}{\tau}\, \rho F + \frac{2e}{3\tau} \frac{\partial \rho F}{\partial w_m} \right] dw \\
&= -\int \left[ \frac{w_i - U_i}{\tau}\, \rho F + \frac{2e}{3\tau} \frac{\partial \rho F}{\partial w_i} \right] dw = 0,
\end{aligned}
\tag{10.156}
\]
where integration by parts is applied. This equation can be rewritten by splitting $\overline{V_i V_m}$ into contributions due to the mean velocity Ui and deviations vi = Vi − Ui from the mean velocity,

\[
\overline{V_i V_m} = U_i U_m + \overline{v_i v_m}.
\tag{10.157}
\]
The last term represents the variance of the velocity distribution. The consistency of this relation may be seen by expanding the product on the left-hand side according to Vi = Ui + vi and using $\overline{v_i} = 0$. The combination of Eq. (10.156) with Eq. (10.157) leads to the following mean velocity equation,

\[
\frac{\partial \rho U_i}{\partial t} + \frac{\partial \rho U_i U_m}{\partial x_m} + \frac{\partial \rho \overline{v_i v_m}}{\partial x_m} = 0.
\tag{10.158}
\]
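For any function Q(x, t) we have the relation

\[
\frac{\partial \rho Q}{\partial t} + \frac{\partial \rho Q U_m}{\partial x_m} = Q \left[ \frac{\partial \rho}{\partial t} + \frac{\partial \rho U_m}{\partial x_m} \right] + \rho \left[ \frac{\partial Q}{\partial t} + U_m \frac{\partial Q}{\partial x_m} \right] = \rho \frac{DQ}{Dt}.
\tag{10.159}
\]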
The first bracket term does not contribute here because of Eq. (10.154). By setting Q = Ui and using the last relation, we can write the velocity equation as

\[
\frac{DU_i}{Dt} + \frac{1}{\rho} \frac{\partial \rho \overline{v_i v_m}}{\partial x_m} = 0.
\tag{10.160}
\]
To prepare the use of approximations (see Sect. 10.5.2) it is helpful to split the variance into two contributions, $\overline{v_i v_m} = (2e/3)\, \delta_{im} + d_{im}$. Here, the kinetic energy e and the deviatoric stress dij, which has the property dii = 0, are given by

\[
e = \frac{1}{2} \overline{v_i v_i}, \qquad d_{ij} = \overline{v_i v_j} - \frac{2}{3}\, e\, \delta_{ij}.
\tag{10.161}
\]
The resulting equation for Ui is then equal to Eq. (10.126b),

\[
\frac{DU_i}{Dt} + \frac{2}{3\rho} \frac{\partial \rho e}{\partial x_i} + \frac{1}{\rho} \frac{\partial \rho d_{im}}{\partial x_m} = 0.
\tag{10.162}
\]
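The split of the variance into an isotropic part and a deviatoric part according to Eq. (10.161) can be illustrated with sample data. The sketch below assumes equal-mass velocity samples taken at one fixed position, so that the mass-density weighted mean reduces to a plain arithmetic mean over the samples. The function name and the synthetic samples are hypothetical and serve only as an illustration.

```python
import numpy as np

def energy_and_deviatoric(V):
    """Return (e, d) following Eq. (10.161) from an (n, 3) array of velocity samples.

    Assumption: all samples carry equal mass, so means are arithmetic means.
    """
    V = np.asarray(V, dtype=float)
    v = V - V.mean(axis=0)                 # fluctuations v_i = V_i - U_i
    var = v.T @ v / V.shape[0]             # sample estimate of the variance matrix
    e = 0.5 * np.trace(var)                # kinetic energy e = (1/2) mean(v_i v_i)
    d = var - (2.0 / 3.0) * e * np.eye(3)  # deviatoric stress d_ij
    return e, d

# Synthetic anisotropic samples (illustrative values only).
rng = np.random.default_rng(1)
samples = rng.normal(loc=[1.0, 0.0, 0.0], scale=[1.0, 2.0, 3.0], size=(10000, 3))
e, d = energy_and_deviatoric(samples)
print("e =", e)
print("trace of d =", np.trace(d))
```

By construction, the trace of d vanishes up to rounding errors, which is the property dii = 0 stated above.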
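Variance Equation. Equation (10.160) is unclosed due to the appearance of the variance $\overline{v_i v_m}$. To derive an equation for $\overline{v_i v_m}$ we multiply Eq. (10.153) by (wi − Ui) (wj − Uj) and integrate over the velocity sample space. Let us separately calculate the right-hand side (RHS) and left-hand side (LHS) of this equation,

\[
\begin{aligned}
\mathrm{RHS} &= \int (w_i - U_i)(w_j - U_j) \frac{\partial}{\partial w_m} \left[ \frac{w_m - U_m}{\tau}\, \rho F + \frac{2e}{3\tau} \frac{\partial \rho F}{\partial w_m} \right] dw \\
&= -\int \left( \delta_{im} (w_j - U_j) + \delta_{jm} (w_i - U_i) \right) \left[ \frac{w_m - U_m}{\tau}\, \rho F + \frac{2e}{3\tau} \frac{\partial \rho F}{\partial w_m} \right] dw \\
&= -\frac{2}{\tau} \int (w_i - U_i)(w_j - U_j)\, \rho F\, dw + \frac{2e}{3\tau} \int (\delta_{im} \delta_{jm} + \delta_{jm} \delta_{im})\, \rho F\, dw \\
&= -\frac{2}{\tau}\, \rho\, \overline{v_i v_j} + \frac{4e}{3\tau}\, \rho\, \delta_{ij},
\end{aligned}
\tag{10.163}
\]

where integration by parts is applied. The term on the left-hand side is given by

\[
\begin{aligned}
\mathrm{LHS} &= \int (w_i - U_i)(w_j - U_j) \left[ \frac{\partial \rho F}{\partial t} + \frac{\partial \rho U_m F}{\partial x_m} + \frac{\partial \rho (w_m - U_m) F}{\partial x_m} \right] dw \\
&= \frac{\partial \rho \overline{v_i v_j}}{\partial t} + \frac{\partial \rho U_m \overline{v_i v_j}}{\partial x_m} + \frac{\partial \rho \overline{v_i v_j v_m}}{\partial x_m} + \rho \frac{\partial U_i}{\partial x_m} \overline{v_m v_j} + \rho \frac{\partial U_j}{\partial x_m} \overline{v_m v_i} \\
&= \rho \frac{D \overline{v_i v_j}}{Dt} + \frac{\partial \rho \overline{v_i v_j v_m}}{\partial x_m} + \rho \frac{\partial U_i}{\partial x_m} \overline{v_m v_j} + \rho \frac{\partial U_j}{\partial x_m} \overline{v_m v_i}.
\end{aligned}
\tag{10.164}
\]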
This rewriting is obtained by first pulling the derivatives outside such that they apply to the entire integral, and then adding corrections (given by the terms that involve velocity gradients). The last expression results from the use of Eq. (10.159). The combination of Eqs. (10.163) and (10.164) then leads to the variance equation

\[
\frac{D \overline{v_i v_j}}{Dt} + \frac{1}{\rho} \frac{\partial \rho \overline{v_i v_j v_m}}{\partial x_m} + \frac{\partial U_i}{\partial x_m} \overline{v_m v_j} + \frac{\partial U_j}{\partial x_m} \overline{v_m v_i} = -\frac{2}{\tau} \left( \overline{v_i v_j} - \frac{2e}{3}\, \delta_{ij} \right).
\tag{10.165}
\]
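The relaxation term on the right-hand side of Eq. (10.165) can be examined in isolation. The following sketch assumes a spatially homogeneous state, in which all spatial derivatives in Eq. (10.165) vanish and D/Dt reduces to d/dt, and a constant τ. Under these assumptions the trace of the equation gives de/dt = 0, and the deviatoric part decays as dij(t) = dij(0) exp(−2t/τ). The explicit Euler time stepping, the function name, and the initial values are illustrative choices.

```python
import numpy as np

def relax_variance(R0, tau, t_end, n_steps=2000):
    """Integrate dR/dt = -(2/tau) (R - (2e/3) I) with e = trace(R)/2 (explicit Euler).

    Assumption: homogeneous case, i.e. Eq. (10.165) without spatial derivatives.
    """
    R = np.array(R0, dtype=float)
    dt = t_end / n_steps
    identity = np.eye(3)
    for _ in range(n_steps):
        e = 0.5 * np.trace(R)
        R = R - dt * (2.0 / tau) * (R - (2.0 * e / 3.0) * identity)
    return R

# Illustrative anisotropic initial variance matrix.
R0 = np.array([[2.0, 0.3, 0.0],
               [0.3, 1.0, 0.0],
               [0.0, 0.0, 0.6]])
tau, t_end = 1.0, 1.0
R_num = relax_variance(R0, tau, t_end)

# Analytic solution: isotropic part is conserved, deviatoric part decays.
e0 = 0.5 * np.trace(R0)
d0 = R0 - (2.0 * e0 / 3.0) * np.eye(3)
R_exact = (2.0 * e0 / 3.0) * np.eye(3) + d0 * np.exp(-2.0 * t_end / tau)
print("max deviation from analytic solution:", np.max(np.abs(R_num - R_exact)))
```

The comparison shows the role of τ as a relaxation time: anisotropy is removed on the time scale τ/2, while the kinetic energy e is left unchanged by the relaxation term.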
To prepare the use of approximations we split the variance into an isotropic and a deviatoric part, $\overline{v_i v_j} = (2e/3)\, \delta_{ij} + d_{ij}$. For e, Eq. (10.165) implies the equation

\[
\frac{De}{Dt} + \frac{1}{2\rho} \frac{\partial \rho \overline{v_i v_i v_m}}{\partial x_m} + \frac{\partial U_i}{\partial x_m} \left( d_{mi} + \frac{2}{3}\, e\, \delta_{mi} \right) = 0.
\tag{10.166}
\]
This equation corresponds to the energy equation (10.127a). Equation (10.127b) for dij can be obtained by differentiating the dij definition (10.161) and replacing the total derivatives of the variance and e according to Eqs. (10.165) and (10.166),

\[
\begin{aligned}
\frac{D d_{ij}}{Dt} &= -\frac{1}{\rho} \frac{\partial \rho \overline{v_i v_j v_m}}{\partial x_m} - \frac{\partial U_i}{\partial x_m} \left( d_{mj} + \frac{2}{3}\, e\, \delta_{mj} \right) - \frac{\partial U_j}{\partial x_m} \left( d_{mi} + \frac{2}{3}\, e\, \delta_{mi} \right) - \frac{2}{\tau}\, d_{ij} \\
&\quad + \frac{2}{3}\, \delta_{ij} \left[ \frac{1}{2\rho} \frac{\partial \rho \overline{v_k v_k v_m}}{\partial x_m} + \frac{\partial U_k}{\partial x_m} \left( d_{mk} + \frac{2}{3}\, e\, \delta_{mk} \right) \right] \\
&= -\frac{1}{\rho} \frac{\partial \rho \overline{(v_i v_j - v_k v_k \delta_{ij} / 3)\, v_m}}{\partial x_m} - \frac{\partial U_i}{\partial x_m} \left( d_{mj} + \frac{2}{3}\, e\, \delta_{mj} \right) - \frac{\partial U_j}{\partial x_m} \left( d_{mi} + \frac{2}{3}\, e\, \delta_{mi} \right) \\
&\quad + \frac{2}{3}\, \delta_{ij} \frac{\partial U_k}{\partial x_m} \left( d_{mk} + \frac{2}{3}\, e\, \delta_{mk} \right) - \frac{2}{\tau}\, d_{ij}.
\end{aligned}
\tag{10.167}
\]

Triple Correlation Equation. The last two equations are unclosed due to the term that involves three velocity components. To derive an equation for this triple correlation we multiply Eq. (10.153) by (wi − Ui) (wj − Uj) (wk − Uk) and integrate this equation over the sample space. The right-hand side of this equation reads

\[
\begin{aligned}
\mathrm{RHS} &= \int (w_i - U_i)(w_j - U_j)(w_k - U_k) \frac{\partial}{\partial w_m} \left[ \frac{w_m - U_m}{\tau}\, \rho F + \frac{2e}{3\tau} \frac{\partial \rho F}{\partial w_m} \right] dw \\
&= -\int \left( \delta_{im} (w_j - U_j)(w_k - U_k) + \delta_{jm} (w_i - U_i)(w_k - U_k) + \delta_{km} (w_i - U_i)(w_j - U_j) \right) \left[ \frac{w_m - U_m}{\tau}\, \rho F + \frac{2e}{3\tau} \frac{\partial \rho F}{\partial w_m} \right] dw \\
&= -\frac{3}{\tau} \int (w_i - U_i)(w_j - U_j)(w_k - U_k)\, \rho F\, dw = -\frac{3}{\tau}\, \rho\, \overline{v_i v_j v_k},
\end{aligned}
\tag{10.168}
\]

where integration by parts is used. The corresponding left-hand side of the equation considered is given by

\[
\begin{aligned}
\mathrm{LHS} &= \int (w_i - U_i)(w_j - U_j)(w_k - U_k) \left[ \frac{\partial \rho F}{\partial t} + \frac{\partial \rho U_m F}{\partial x_m} + \frac{\partial \rho (w_m - U_m) F}{\partial x_m} \right] dw \\
&= \frac{\partial \rho \overline{v_i v_j v_k}}{\partial t} + \frac{\partial \rho U_m \overline{v_i v_j v_k}}{\partial x_m} + \frac{\partial \rho \overline{v_i v_j v_k v_m}}{\partial x_m} + \rho \frac{DU_i}{Dt} \overline{v_j v_k} + \rho \frac{DU_j}{Dt} \overline{v_i v_k} + \rho \frac{DU_k}{Dt} \overline{v_i v_j} \\
&\quad + \rho \frac{\partial U_i}{\partial x_m} \overline{v_m v_j v_k} + \rho \frac{\partial U_j}{\partial x_m} \overline{v_m v_i v_k} + \rho \frac{\partial U_k}{\partial x_m} \overline{v_m v_i v_j}.
\end{aligned}
\tag{10.169}
\]
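The first three terms appear as a consequence of applying the derivatives to the entire integral, and the remaining terms are corrections. By using Eq. (10.159) for the first two terms and Eq. (10.160) for the substantial derivatives of velocities we find the expression

\[
\begin{aligned}
\mathrm{LHS} &= \rho \frac{D \overline{v_i v_j v_k}}{Dt} + \frac{\partial \rho \overline{v_i v_j v_k v_m}}{\partial x_m} - \frac{\partial \rho \overline{v_i v_m}}{\partial x_m} \overline{v_j v_k} - \frac{\partial \rho \overline{v_j v_m}}{\partial x_m} \overline{v_i v_k} - \frac{\partial \rho \overline{v_k v_m}}{\partial x_m} \overline{v_i v_j} \\
&\quad + \rho \frac{\partial U_i}{\partial x_m} \overline{v_m v_j v_k} + \rho \frac{\partial U_j}{\partial x_m} \overline{v_m v_i v_k} + \rho \frac{\partial U_k}{\partial x_m} \overline{v_m v_i v_j}.
\end{aligned}
\tag{10.170}
\]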
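The combination of this expression with the RHS (10.168) then implies

\[
\begin{aligned}
&\frac{D \overline{v_i v_j v_k}}{Dt} + \frac{1}{\rho} \frac{\partial \rho \overline{v_i v_j v_k v_m}}{\partial x_m} - \frac{1}{\rho} \frac{\partial \rho \overline{v_i v_m}}{\partial x_m} \overline{v_j v_k} - \frac{1}{\rho} \frac{\partial \rho \overline{v_j v_m}}{\partial x_m} \overline{v_i v_k} - \frac{1}{\rho} \frac{\partial \rho \overline{v_k v_m}}{\partial x_m} \overline{v_i v_j} \\
&\quad + \frac{\partial U_i}{\partial x_m} \overline{v_m v_j v_k} + \frac{\partial U_j}{\partial x_m} \overline{v_m v_i v_k} + \frac{\partial U_k}{\partial x_m} \overline{v_m v_i v_j} = -\frac{3}{\tau}\, \overline{v_i v_j v_k},
\end{aligned}
\tag{10.171}
\]

which agrees with the triple correlation equation (10.128).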
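As a short worked illustration of the structure of Eq. (10.171), consider the special case of a spatially homogeneous state with constant τ (an assumption made here only for illustration). All spatial derivatives then vanish, the substantial derivative reduces to an ordinary time derivative, and the triple correlation equation becomes

\[
\frac{d \overline{v_i v_j v_k}}{dt} = -\frac{3}{\tau}\, \overline{v_i v_j v_k}
\qquad \Longrightarrow \qquad
\overline{v_i v_j v_k}(t) = \overline{v_i v_j v_k}(0)\, e^{-3t/\tau}.
\]

Under the same assumption, Eq. (10.165) gives a decay of the deviatoric stress dij with the rate 2/τ. In this special case the third-order moments therefore relax faster than the anisotropy of the second-order moments.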
10.6 Summary

The methodological basis for the modeling of distributions of random variables and the evolution of PDFs and stochastic processes was presented for one random variable in Chaps. 4, 6, and 8. In this chapter, we extended these concepts to the case of joint random variables. Let us summarize the features observed regarding the extension of data analysis concepts, PDF modeling concepts, and concepts for describing the evolution of PDFs and stochastic processes.

Extension of Data Analysis Concepts. The characterization of the properties of several random variables differs from the analysis of single variable properties by the need to account for correlations (uncorrelated variables can be treated like single variables). An efficient way to account for such correlations is the use of conditional PDFs, which are rescaled joint PDFs, and related conditional means. The advantage of these concepts was demonstrated regarding the optimization of models considered in Chap. 2: a conditional mean was shown to represent an optimal model, yM(x) = <Y | x>. In addition to the approach presented in Chap. 2, this relation enables the development of optimal models by the calculation of the conditional mean on the basis of data, that is, without the use of any modeling concepts. Other illustrations of the benefits of conditional moments can be found, e.g., in Klimenko & Bilger (1999) with regard to turbulent combustion problems.
Extension of PDF Modeling Concepts. Which modeling concepts can be used to describe the joint statistics of correlated variables? This is a nontrivial question, because many PDF types for single variables cannot be extended straightforwardly to the case of several variables. Here, the most relevant case was considered: it was shown that the normal PDF model for a single variable can be extended to a joint normal PDF model for several random variables that accounts correctly for any correlations. The applicability of this concept to any case considered can be proven in two ways: by showing that the normal PDF moment relations (10.43) are satisfied, or by demonstrating that scatter plots of the joint PDF agree with the consequence of a joint normal PDF (given by the elliptical shape of isolines in the (x', y')-coordinate system). The first way is helpful for showing that models (like the Brownian motion model considered in Sect. 6.4) have a joint normal PDF. The second way is usually applied for analyzing the joint PDF of real data (like the atmospheric velocity and temperature statistics discussed in Sect. 4.5). The joint normal PDF model often provides the basis for modeling concepts. This was illustrated here by means of two examples. First, it was shown that the joint normal PDF model justifies the use of linear optimal models. Second, the formulation of a random walk as a sum of jointly normally distributed contributions was shown to represent a sound model: it implies a random walk process that remains normally distributed as it evolves in time, which is the typical feature of a diffusion process.

Extension of PDF Evolution Concepts. How is it possible to extend concepts for the evolution of PDFs of single variables to the case of several variables? It was shown that the Fokker-Planck equation for the PDF evolution and the stochastic differential equation discussed in Chap. 8 can be extended to the case of several variables. As in the single-variable case, there exists a unique relationship between the Fokker-Planck equation and the stochastic differential equation provided the coefficient of the noise term in the stochastic equation is a symmetric matrix. This relationship is helpful for the numerical Monte Carlo solution of diffusion-type partial differential equations that cannot be properly solved on the basis of other solution techniques. Consistent with the corresponding finding for single variables, it was shown that the Fokker-Planck equation for several variables can be solved analytically if linear dynamics of random variables are considered. The application of the PDF evolution equation presented here to the modeling of fluid dynamics in Sect. 10.5 illustrated the typical structure of stochastic models for a real problem and the typical problems related to the calculation of the solution of such equations: the numerical solution of the PDF evolution equation via Monte Carlo simulation is computationally expensive, and moment evolution equations are unclosed due to the appearance of higher-order correlations. A consistent and systematic solution for such closure problems was demonstrated by the derivation of closure models that are based on the PDF evolution equation.
10.7 Exercises

10.2.1 Use the definition f(x, y) = <δ(x − X) δ(y − Y)> of a joint PDF to show that every joint PDF f(x, y) of unbounded variables x and y has the following properties. Here, g(x, y) can be any function of x and y.

\[
\begin{aligned}
&\text{a)}\ f(x, y) \ge 0, \\
&\text{b)}\ f(-\infty, y) = f(\infty, y) = f(x, -\infty) = f(x, \infty) = 0, \\
&\text{c)}\ \iint f(x, y)\, dx\, dy = 1, \\
&\text{d)}\ \iint g(x, y)\, f(x, y)\, dx\, dy = \langle g(X, Y) \rangle.
\end{aligned}
\]

10.2.2 Consider the definition of the conditional mean

\[
\langle g(X, Y) \mid x \rangle = \frac{1}{f(x)} \langle g(X, Y)\, \delta(x - X) \rangle.
\]

Specify this definition for the case that X and Y are independent variables.

10.2.3 Consider the optimal model yM(x) = <Y | x>, which was derived in Sect. 10.2.3. Assume that the joint PDF f(x, y) of any data set is available as the result of measurements.
a) Explain how the optimal model yM(x) can be calculated on this basis.
b) Explain the difference between this approach for developing an optimal model and the approach applied in Chap. 2 to find an optimal model.

10.2.4 A stochastic model for Y, which provides the correct mean <Y> and conditional mean <Y | x>, is given by

\[
Y = \langle Y \rangle + r_{XY}\, \langle \tilde{Y}^2 \rangle^{1/2}\, \frac{X - \langle X \rangle}{\langle \tilde{X}^2 \rangle^{1/2}}.
\]

a) Calculate $\langle \tilde{Y}^2 \rangle$ and $\langle \tilde{X} \tilde{Y} \rangle$ on the basis of this stochastic model for Y.
b) Use the results for $\langle \tilde{Y}^2 \rangle$ and $\langle \tilde{X} \tilde{Y} \rangle$ to explain under which condition the model for Y can represent a reasonable model.

10.3.1 Consider the model (10.39) for the joint PDF f(x, y).
a) Integrate f(x, y) to show that ∫ f(x, y) dy = f(x).
b) Use this result to explain why the condition ∫ f(x, y) dx = f(y) is satisfied.

10.3.2 The conditional PDF f(y | x) is given by Eq. (10.47). Considered as a function of $\hat{y}$, f(y | x) is a normal PDF with mean $r_{XY}\, \hat{x}$ and variance $1 - r_{XY}^2$, which is divided by $\langle \tilde{Y}^2 \rangle^{1/2}$. Use this fact and the known properties of a normal PDF to show that ∫ f(y | x) dy = 1.
10.3.3 Consider the conditional PDF f(y | x) given by Eq. (10.53).
a) According to Eq. (10.19), the global variance is defined by multiplying the conditional variance $\langle [Y - \langle Y \mid x \rangle]^2 \mid x \rangle$ with the PDF f(x) and integrating over the sample space x. Show that this global variance is equal to the global variance $\langle [Y - \langle Y \mid x \rangle_{x=X}]^2 \rangle$ considered in Sect. 10.2.3 (see the error (10.38)).
b) Calculate the conditional variance $\langle [Y - \langle Y \mid x \rangle]^2 \mid x \rangle$ and the global variance $\langle [Y - \langle Y \mid x \rangle_{x=X}]^2 \rangle$ as functions of rXY.
c) Explain why the conditional variance $\langle [Y - \langle Y \mid x \rangle]^2 \mid x \rangle$ is found to be equal to the global variance $\langle [Y - \langle Y \mid x \rangle_{x=X}]^2 \rangle$.

10.3.4 Consider the ellipse equation x'² / a² + y'² / b² = 1. Here, a and b are given by Eq. (10.60). Specify the ellipse equation for rXY → 1 and rXY → −1.

10.3.5 The table shows the correlation coefficients of velocity components (u and v are horizontal velocities and w is the vertical velocity) and the temperature T. The data were obtained by measurements for different stabilities in the atmospheric surface layer (see the discussion in Sect. 4.5).

                   ruv      ruw      rvw      ruT      rvT      rwT
Stable Case:      −0.66    −0.04    −0.11     0.84    −0.71     0.02
Neutral Case:     −0.01    −0.26     0.18     0.39     0.18    −0.18
Unstable Case:     0.16    −0.13    −0.02     0.01    −0.11     0.50
a) Identify one case that is basically characterized by horizontal motions. Explain your reasoning.
b) Identify one case that indicates significant upward motions of warm air. Explain your reasoning.
c) Explain for each of the three cases considered which variables have to be accounted for in a stochastic model that characterizes the most basic features of the flow considered.

10.3.6 According to Eq. (10.65), the PDF f(z) of the sum Z = X + Y of any two random variables X and Y is given by f(z) = ∫ f(x, z − x) dx.
a) The joint PDF f(x, y) is assumed to be the normal PDF of two correlated variables X and Y. Show that f(z) is given for this case by

\[
f(z) = \frac{1}{\sqrt{2\pi \left( \langle \tilde{Y}^2 \rangle + 2 \langle \tilde{X} \tilde{Y} \rangle + \langle \tilde{X}^2 \rangle \right)}} \exp\left\{ -\frac{\left( z - \langle X \rangle - \langle Y \rangle \right)^2}{2 \left( \langle \tilde{Y}^2 \rangle + 2 \langle \tilde{X} \tilde{Y} \rangle + \langle \tilde{X}^2 \rangle \right)} \right\}.
\]
b) Replace in the f(z) formula the statistics of X and Y by the statistics of Z. Explain the meaning of this rewriting.
c) A nonzero correlation $\langle \tilde{X} \tilde{Y} \rangle$ may lead to a lower variance of Z than given for the case of uncorrelated variables X and Y (for which we have $\langle \tilde{X} \tilde{Y} \rangle = 0$). How is it possible to understand this observation?

10.3.7 Consider two independent random variables X and Y, which are uniformly distributed on the interval [0, 1].
a) Calculate the PDF of the sum Z = X + Y.
b) Compare the result obtained in a) with the conclusions of Sect. 4.4.3 (see Fig. 4.14). What will be the PDF of a sum of a large number of independent variables that are uniformly distributed on the interval [0, 1]?

10.4.1 The principal invariants of the symmetric matrix Dij are the following ones:

\[
\begin{aligned}
I &= D_{ii}, \\
II &= \frac{1}{2} \left[ D_{ii} D_{nn} - D_{in} D_{ni} \right], \\
III &= \frac{1}{6} D_{ii} D_{nn} D_{kk} - \frac{1}{2} D_{ii} D_{kn} D_{nk} + \frac{1}{3} D_{in} D_{nk} D_{ki} = \det(D).
\end{aligned}
\]

a) Calculate the three principal invariants in principal axes as functions of the eigenvalues λ1, λ2, and λ3. Show that I, II, and III are positive if the eigenvalues λ1, λ2, and λ3 are positive.
b) The three principal invariants and eigenvalues are related via the cubic characteristic equation λ³ − I λ² + II λ − III = 0. Use this equation to show that the eigenvalues are positive if I, II, and III are positive.

10.4.2 Show for any matrix βij(t) the validity of the relation

\[
\frac{d \beta^{-1}{}_{ij}}{dt} = -\beta^{-1}{}_{ik}\, \frac{d \beta_{kn}}{dt}\, \beta^{-1}{}_{nj},
\]

which will be applied in exercise 10.4.4. Hint: differentiate $\beta_{kn}\, \beta^{-1}{}_{nj} = \delta_{kj}$.

10.4.3 Show for any symmetric matrix βij(t) the validity of the relation

\[
\frac{1}{\det(\beta)} \frac{d \det(\beta)}{dt} = \beta^{-1}{}_{ik}\, \frac{d \beta_{ki}}{dt},
\]

which will be used in exercise 10.4.4. The validity of the latter relation can be shown by considering ∂f(x, t) / ∂t of the joint normal PDF

\[
f(x, t) = \frac{1}{(2\pi)^{N/2} \sqrt{\det(\beta)}} \exp\left\{ -\frac{1}{2}\, \beta^{-1}{}_{ij}\, (x_i - \alpha_i)(x_j - \alpha_j) \right\}
\]

and integrating ∂f(x, t) / ∂t over the sample space x. Here, αi(t) are the mean values and βij(t) represent the elements of the symmetric variance matrix. Hint: you have to use the relation shown in exercise 10.4.2.
10.4.4 Consider the Fokker-Planck equation (10.107) for the conditional PDF f(x, t | x', t') combined with the initial condition f(x, t' | x', t') = δ(x − x').
a) Calculate the partial derivatives of the conditional PDF (10.109), which appear in Eq. (10.107). Hint: use the relations shown in exercises 10.4.2 and 10.4.3 to simplify the analyses in b) and c).
b) Show the conditions under which the conditional PDF (10.109) satisfies the Fokker-Planck equation (10.107).
c) Show the conditions under which the conditional PDF (10.109) satisfies the initial condition f(x, t' | x', t') = δ(x − x').

10.4.5 Consider the relationship between the Fokker-Planck equation (10.85) and stochastic differential equation (10.115) discussed in Sect. 10.4.4.
a) Use this relationship to determine the evolution equation for means that is implied by the stochastic differential equation (10.115).
b) Use this relationship to find the evolution equation for variances that is implied by the stochastic equation (10.115). Write the model parameters in this equation in dependence on the stochastic process X(t) and t.

10.4.6 Consider the relationship between the Fokker-Planck equation (10.85) and stochastic differential equation (10.115) discussed in Sect. 10.4.4.
a) Explain for which purpose it is particularly helpful to use the stochastic differential equation (10.115).
b) Explain for which purpose it is particularly helpful to apply the Fokker-Planck equation (10.85).

10.5.1 Consider the stochastic velocity model (10.125). We assume that Ui, e, and τ are constants.
a) Find the equation for the velocity variance $\langle \tilde{V}_i^*(t)\, \tilde{V}_k^*(t) \rangle$.
b) Solve this variance equation.
c) Explain the characteristic features of velocity variances as t → ∞.

10.5.2 Consider the stochastic velocity model (10.125). We assume that Ui, e, and τ are constants.
a) Find the equation for velocity correlations $\langle \tilde{V}_i^*(t)\, \tilde{V}_k^*(t + r) \rangle$, where r is any non-negative time.
b) Solve the equation for velocity correlations $\langle \tilde{V}_i^*(t)\, \tilde{V}_k^*(t + r) \rangle$.
c) Calculate the velocity correlations of unequal velocity components (that is, for i ≠ k) for t → ∞ by referring to the results obtained in exercise 10.5.1.
10.5.3 The stochastic model (10.125) for molecular velocities provides uncoupled equations for the components of velocity. On the other hand, the variance equations (10.165), which are implied by the stochastic molecular velocity model (10.125), predict couplings between all velocity components (i.e., nonzero cross variances). Let us assume that the velocity variances were isotropic at any initial time: explain why the variances may become anisotropic after some time (what is the reason for the development of couplings between different velocity components?).

10.5.4 The table shows fourth-order moments of wind velocity components (u and v are horizontal velocities, and w is the vertical velocity) measured in the atmospheric surface layer for a neutral stratification. 50,400 sample values are available: see the description of these measurements in Sect. 4.5. The corresponding value found by using the normal parametrization (10.129) of fourth-order moments is given in the <>N column. The ratio of fourth-order moments to the corresponding normal parametrization value (10.129) is shown in the <> / <>N column.

= 0.9241 = −0.1175 = 0.0886 = −0.0323 −0.0260 = 0.2612 0.2295 <v³ w> = 0.0270 0.0281 <v² w²> = 0.0428 0.0374 value.

c) Do these data provide support for the normal parametrization (10.129), which was used for the derivation of fluid dynamics equations?

10.5.5 Consider the stochastic velocity model (10.125). Assume that the positions xi* and velocities Vi* are combined to a six-dimensional vector Z = (x*, V*). In matrix notation, Eqs. (10.125) can then be written

\[
\frac{dZ}{dt} = a + G\, (Z - \langle Z \rangle) + b\, \frac{dW}{dt}.
\]

Here, a is a six-dimensional vector, and G and b are 6 × 6 matrices.
a) Specify a, G, and b according to the equation system (10.125).
b) What are the requirements for the coefficients a, G, and b under which the solution approach for Fokker-Planck equations, which was described in Sect. 10.4.3, can be used for the calculation of the velocity-position joint PDF f(w, x, t) related to Eq. (10.125)?

10.5.6 Consider the stochastic molecular velocity model (10.125). The asymptotic change dVi* / dt can be considered to be small compared to the right-hand side of Eq. (10.125b). The asymptotic velocities Vi* = dxi* / dt are described for this case by the equation

\[
\frac{dx_i^*}{dt} = U_i + \sqrt{\frac{4 e \tau}{3}}\, \frac{dW_i}{dt}.
\]

The position PDF f(x, t), which is related to x*(t), and the conditional PDF f(x, t | x', t') are related by f(x, t) = ∫ f(x, t | x', t') f(x', t') dx'.
a) Determine f(x, t | x', t') by applying the solution approach for a Fokker-Planck equation described in Sect. 10.4.3. It is assumed that U, e, and τ are constants. Simplify f(x, t | x', t') as much as possible by using the expressions obtained for the parameters of f(x, t | x', t').
b) Determine the asymptotic conditional PDF f(x, t | x', t') as t → ∞.
c) Calculate the corresponding asymptotic position PDF f(x, t).
References
Abbott, I. H. & Von Doenhoff, A. E. 1959 Theory of Wing Sections: Including a Summary of Airfoil Data. Dover Publ., New York.
Abramowitz, M. & Stegun, I. A. 1984 Pocketbook of Mathematical Functions. Verlag Harri Deutsch, Thun, Frankfurt/Main.
Allen, L. J. S. 2003 An Introduction to Stochastic Processes with Application to Biology. Pearson Prentice Hall, Upper Saddle River, NJ.
Allen, L. J. S. 2007 An Introduction to Mathematical Biology. Pearson Prentice Hall, Upper Saddle River, NJ.
Anderson, R. M. (Editor) 1982 The Population Dynamics of Infectious Diseases: Theory and Applications. Chapman and Hall, London, New York.
Anderson, R. M. & May, R. M. 1979a Population Biology of Infectious Diseases. Part I. Nature 280, 361–367.
Anderson, R. M. & May, R. M. 1979b Population Biology of Infectious Diseases. Part II. Nature 280, 455–461.
Baines, P. G. 2008 Lorenz, EN 1963: Deterministic Nonperiodic Flow. J. Atmos. Sci. 20, 130–141. Prog. Phys. Geog. 32, 475–480.
Bakker, V. J., Doak, D. F., Roemer, G. W., Garcelon, D. K., Coonan, T. J., Morrison, S. A., Lynch, C., Ralls, K. & Shaw, R. 2009 Incorporating Ecological Drivers and Uncertainty into a Demographic Population Viability Analysis for the Island Fox. Ecol. Monogr. 79, 77–108.
Boyce, W. E. & DiPrima, R. C. 2009 Elementary Differential Equations and Boundary Value Problems. Ninth Edition, Wiley, Hoboken, NJ.
Brannan, J. R. & Boyce, W. E. 2007 Differential Equations: An Introduction to Modern Methods and Applications. Wiley, Hoboken, NJ.
Brohan, P., Kennedy, J. J., Harris, I., Tett, S. F. B. & Jones, P. D. 2006 Uncertainty Estimates in Regional and Global Observed Temperature Changes: A New Dataset from 1850. J. Geophys. Res. 111, D12106/1–21.
S. Heinz, Mathematical Modeling, DOI 10.1007/978-3-642-20311-4, © Springer-Verlag Berlin Heidelberg 2011
Buchanan, J. R. 2008 An Undergraduate Introduction to Financial Mathematics. Second Edition, World Scientific Publ., Singapore.
Buckingham, E. 1914 On Physically Similar Systems; Illustrations of the Use of Dimensional Equations. Phys. Rev. 4, 345–376.
Bulmer, M. G. 1979 Principles of Statistics. Second Edition, Dover Publ., New York.
Chenlo, F., Moreira, R., Pereira, G. & Bello, B. 2004 Kinematic Viscosity and Water Activity of Aqueous Solutions of Glycerol and Sodium Chloride. Eur. Food Res. Technol. 219, 403–408.
Chu, C. R., Parlange, M. B., Katul, G. G. & Albertson, J. B. 1996 Probability Density Functions of Turbulent Velocity and Temperature in the Atmospheric Surface Layer. Water Resour. Res. 32, 1681–1688.
Darwin, C. R. 1859 On the Origin of Species by Means of Natural Selection. Murray, London.
Del Grosso, V. A. & Mader, C. W. 1972 Speed of Sound in Pure Water. J. Acoust. Soc. Am. 52, 1442–1446.
Du, S., Wilson, J. D. & Yee, E. 1994 Probability Density Functions for Velocity in the Convective Boundary Layer, and Implied Trajectory Models. Atmos. Environ. 28, 1211–1217.
Durbin, P. A. 1983 Stochastic Differential Equations and Turbulent Dispersion. NASA Reference Publ. 1103, NASA Glenn Research Center, Ohio.
Edelstein-Keshet, L. 2005 Mathematical Models in Biology. SIAM, Philadelphia.
Einstein, A. 1905 Über die von der Molekular-Kinetischen Theorie der Wärme Geforderte Bewegung von in Ruhenden Flüssigkeiten Suspendierten Teilchen. Ann. Phys. (Leipzig) 17, 549–560.
Feigenbaum, M. J. 1978 Quantitative Universality for a Class of Nonlinear Transformations. J. Stat. Phys. 19, 25–52.
Fokker, A. D. 1914 Die Mittlere Energie Rotierender Elektrischer Dipole im Strahlungsfeld. Ann. Physik 43, 810–820.
Fox, R. O. 2003 Computational Models for Turbulent Reacting Flows. Cambridge Series in Chemical Engineering, Cambridge Univ. Press, Cambridge.
Fulford, G., Forrester, P. & Jones, A. 1997 Modelling with Differential and Difference Equations. Cambridge Univ. Press, Cambridge.
Gardiner, C. W. 1983 Handbook of Stochastic Methods. Springer-Verlag, Berlin, Heidelberg, New York.
Gillies, G. T. 1997 The Newtonian Gravitational Constant: Recent Measurements and Related Studies. Rep. Prog. Phys. 60, 151–225.
Giordano, F. R., Weir, M. D. & Fox, W. P. 2003 Mathematical Modeling. Third Edition, Brooks/Cole-Thomson, Pacific Grove, CA.
Givi, P. 2006 Filtered Density Function for Subgrid Scale Modeling of Turbulent Combustion. AIAA Journal 44, 16–23.
Gleick, J. 1987 Chaos: Making a New Science. Viking, New York.
Haberman, R. 1977 Mathematical Models: Mechanical Vibrations, Population Dynamics, and Traffic Flow. Prentice-Hall, Englewood Cliffs, NJ.
Haworth, D. C. 2010 Progress in Probability Density Function Methods for Turbulent Reacting Flows. Prog. Energ. Combust. 36, 168–259.
Heinz, S. 2003 Statistical Mechanics of Turbulent Flows. Springer-Verlag, Berlin, Heidelberg, New York.
Heinz, S. 2004 Molecular to Fluid Dynamics: The Consequences of Stochastic Molecular Motion. Phys. Rev. E 70, 036308/1–11.
Heinz, S. 2007 Unified Turbulence Models for LES and RANS, FDF and PDF Simulations. Theor. Comput. Fluid Dyn. 21, 99–118.
Jaynes, E. T. 1957 Information Theory and Statistical Mechanics. Phys. Rev. 106, 620–630.
Jenny, P., Torrilhon, M. & Heinz, S. 2010 A Solution Algorithm for the Fluid Dynamic Equations Based on a Stochastic Model for Molecular Motion. J. Comput. Phys. 229, 1077–1098.
Jernigan, J. B. & Kodaman, M. F. 2001 An Investigation of the Utility and Accuracy of the Table of Speed and Stopping Distances Specified in the Code of Virginia (Final Report). Virginia Transportation Research Council, Charlottesville, Virginia, 1–25.
Kapitza, S. P. 1996 The Phenomenological Theory of World Population Growth. Uspekhi Fizicheskikh Nauk 39, Russian Acad. Sci. 57–71.
Kermack, W. O. & McKendrick, A. G. 1927 Contributions to the Mathematical Theory of Epidemics. 1. Proceed. Royal Soc. 115A, 700–721.
Klimenko, A. Y. & Bilger, R. W. 1999 Conditional Moment Closure for Turbulent Combustion. Prog. Energ. Combust. 25, 595–687.
Kloeden, P. E. & Platen, E. 1992 Numerical Solution of Stochastic Differential Equations. Applications of Mathematics 23 (Stochastic Modeling and Applied Probability), Springer-Verlag, Berlin, Heidelberg, New York.
Kormondy, E. J. 1996 Concepts of Ecology. Prentice Hall, New York.
Kramers, H. A. 1940 Brownian Motion in a Field of Force and the Diffusion Model of Chemical Reactions. Physica 7, 284–304.
Langevin, P. 1908 Sur la Théorie du Mouvement Brownien. Comptes Rendus Acad. Sci. Paris 146, 530–533.
Langhaar, H. L. 1951 Dimensional Analysis and Theory of Models. Wiley, New York.
L'Ecuyer, P. 1994 Uniform Random Number Generation. Ann. Oper. Res. 53, 77–120.
Li, Z. & Wang, H. 2003 Drag Force, Diffusion Coefficient, and Electric Mobility of Small Particles. II. Application. Phys. Rev. E 68, 061207/1–13.
Liao, S. 2009 On the Reliability of Computed Chaotic Solutions of Non-Linear Differential Equations. Tellus 61A, 550–564.
Lorenz, E. N. 1963 Deterministic Nonperiodic Flow. J. Atmos. Sci. 20, 130–141.
Lorenz, E. N. 2006 Reflections on the Conception, Birth, and Childhood of Numerical Weather Prediction. Annu. Rev. Earth Pl. Sc. 34, 37–45.
Lubbers, J. & Graaff, R. 1998 A Simple and Accurate Formula for the Sound Velocity in Water. Ultrasound Med. Biol. 24, 1065–1068.
Luhar, A. K., Hibberd, M. F. & Hurley, P. J. 1996 Comparison of Closure Schemes Used to Specify the Velocity PDF in Lagrangian Stochastic Dispersion Models for Convective Conditions. Atmos. Environ. 30, 1407–1418.
Malthus, T. R. 1798 An Essay on the Principle of Population. J. Johnson, London.
May, R. M. 1975 Biological Populations Obeying Difference Equations: Stable Points, Stable Cycles, and Chaos. J. Theor. Biol. 51, 511–524.
May, R. M. (Editor) 1976 Theoretical Ecology: Principles and Applications. Blackwell, Oxford.
Meadows, D. H., Meadows, D. L., Randers, J. & Behrens III, W. W. 1972 The Limits to Growth: A Report for the Club of Rome’s Project on the Predicament of Mankind. Universe Books, New York.
Meadows, D. L., Behrens III, W. W., Meadows, D. H., Naill, R. F., Randers, J. & Zahn, E. K. O. 1974a Dynamics of Growth in a Finite World. Wright-Allen Press, MA.
Meadows, D. H., Meadows, D. L., Randers, J. & Behrens III, W. W. 1974b The Limits to Growth: A Report for the Club of Rome’s Project on the Predicament of Mankind. Second Edition, Universe Books, New York.
Meadows, D. H., Randers, J. & Meadows, D. L. 2004 Limits to Growth: The 30-Year Update. Chelsea Green Publ., White River Junction, VT.
Millikan, R. A. 1910 The Isolation of an Ion, a Precision Measurement of its Charge, and the Correction of Stokes’ Law. Science 32, 436–448.
Millikan, R. A. 1917 A New Determination of e, N, and Related Constants. Philos. Mag. 34, 1–31.
Millikan, R. A. 1923 The General Law of Fall of a Small Spherical Body Through a Gas, and its Bearing Upon the Nature of Molecular Reflection From Surfaces. Phys. Rev. 22, 1–23.
Minier, J.-P. & Peirano, E. 2001 The PDF Approach to Turbulent Polydispersed Two-Phase Flows. Phys. Rep. 352, 1–214.
Moyal, J. E. 1949 Stochastic Processes and Statistical Physics. J. Roy. Stat. Soc. (London) B11, 150–210.
Murray, J. D. 2002 Mathematical Biology: I. An Introduction, Third Edition, Springer-Verlag, Berlin, Heidelberg.
Nagle, R. K., Saff, E. B. & Snider, A. D. 2008 Fundamentals of Differential Equations. Seventh Edition, Pearson Education, Boston, MA.
Newton, I. 1687 Philosophiæ Naturalis Principia Mathematica. Royal Society of London, UK.
Ortega, J. M. 1987 Matrix Theory: A Second Course. Plenum Press, New York.
Pawula, R. F. 1967 Approximation of Linear Boltzmann Equation by Fokker-Planck Equation. Phys. Rev. 162, 186–188.
Planck, M. 1917 Über Einen Satz der Statistischen Dynamik und Seine Erweiterung in der Quantentheorie. Sitzber. Preuß. Akad. Wiss., 324–341.
Pope, S. B. 1994 Lagrangian PDF Methods for Turbulent Flows. Annu. Rev. Fluid Mech. 26, 23–63.
Pope, S. B. 2000 Turbulent Flows. Cambridge University Press, Cambridge, UK.
Rainey, R. H. 1967 Natural Displacement of Pollution From the Great Lakes. Science 155, 1242–1243.
Rayner, N. A., Parker, D. E., Horton, E. B., Folland, C. K., Alexander, L. V., Rowell, D. P., Kent, E. C. & Kaplan, A. 2003 Global Analyses of SST, Sea Ice and Night Marine Air Temperature Since the Late Nineteenth Century. J. Geophys. Res. 108, 4407–4443.
Risken, H. 1984 The Fokker-Planck Equation. Springer-Verlag, Berlin, Heidelberg, New York.
Roekaerts, D. 2002 Reacting Flows and Probability Density Function Methods. In: Closure Strategies for Turbulent and Transitional Flows, Edited by Launder, B. E. & Sandham, N. D., Cambridge Univ. Press, Cambridge, Chap. 10, 328–337.
Rössler, O. E. 1976 An Equation for Continuous Chaos. Phys. Lett. 57A, 397–398.
Ross, S. M. 2010 A First Course in Probability. Eighth Edition, Pearson Prentice Hall, Upper Saddle River, NJ.
Saltzman, B. 1962 Finite Amplitude Free Convection as an Initial Value Problem – I. J. Atmos. Sci. 19, 329–341.
Samuelson, P. A. 1939 Interaction Between the Multiplier Analysis and the Principle of Acceleration. Rev. Econ. Statistics 21, 75–78.
Sawford, B. L. 1991 Reynolds Number Effects in Lagrangian Stochastic Models of Turbulent Dispersion. Phys. Fluids A3, 1577–1586.
Scheiner, S. M. & Willig, M. R. 2005 Developing Unified Theories in Ecology as Exemplified with Diversity Gradients. Am. Nat. 166, 458–469.
Scheiner, S. M. & Willig, M. R. 2008 A General Theory of Ecology. Theor. Ecol. 1, 21–28.
Seinfeld, J. H. & Pandis, S. N. 2006 Atmospheric Chemistry and Physics: From Air Pollution to Climate Change. Second Edition, Wiley, New York.
Shannon, C. H. 1948 A Mathematical Theory of Communication. Bell System Tech. J. 27, 379–423.
Society of Fire Protection Engineers 1995 SFPE Handbook of Fire Protection Engineering. Second Edition, National Fire Protection Association, Edited by Beyler, C. L., Custer, R. L. P., Walton, W. D., Watts, J. M., Drysdale, D., Hall, J. R., Dinenno, P. J., Quincy, MA.
Sparrow, C. 1982 The Lorenz Equations: Bifurcations, Chaos, and Strange Attractors. Applied Mathematical Sciences 41, Springer-Verlag, New York, Heidelberg, Berlin.
Stewart, J. 2006 Calculus: Concepts and Contexts. Third Edition, Brooks/Cole-Thomson, Belmont, CA.
Stokes, G. 1851 On the Effect of the Internal Friction of Fluids on the Motion of Pendulums. Transactions of the Cambridge Philosophical Society 9, 8–106.
Tans, P. 2008 Trends in Atmospheric Carbon Dioxide. NOAA/ESRL, Global Monitoring Division, U.S. Department of Commerce, National Oceanic and Atmospheric Administration, Earth System Research Laboratory, Boulder, CO, http://www.esrl.noaa.gov/gmd/ccgg/trends/.
Turchin, P. 2001 Does Population Ecology Have General Laws? Oikos 94, 17–26.
Turchin, P. 2003 Complex Population Dynamics: A Theoretical/Empirical Synthesis. Princeton Univ. Press, Princeton and Oxford.
Turner, G. M. 2008 A Comparison of The Limits to Growth With 30 Years of Reality. Global Environ. Chang. 18, 397–411.
Tyagi, M., Jenny, P., Lunati, I. & Tchelepi, H. A. 2008 A Lagrangian, Stochastic Modeling Framework for Multi-Phase Flow in Porous Media. J. Comput. Phys. 227, 6696–6714.
U.S. Dept. of Energy 2008 Annual Energy Report 2007, Washington DC.
Verhulst, P. F. 1838 Notice sur la Loi que la Population Poursuit dans son Accroissement. Correspondance Mathématique et Physique 10, 113–121.
Wiggins, S. 2010 Introduction to Applied Nonlinear Dynamical Systems and Chaos. Texts in Applied Mathematics, Second Edition, Springer-Verlag, New York.
World Almanac Books 2010 The World Almanac and Book of Facts 2010. World Almanac Books, New York.
XFOIL (http://web.mit.edu/drela/Public/web/xfoil/).
Yafia, R. 2009 Dynamics and Numerical Simulations in a Production and Development of Red Blood Cells Model with one Delay. Commun. Nonlinear Sci. 14, 582–592.
Yao, L.-S. 2007 Is a Direct Numerical Simulation of Chaos Possible? A Study of a Model Nonlinearity. Int. J. Heat Mass Tran. 50, 2200–2207.
Yao, L.-S. & Hughes, D. 2008 Comment on 'Computational Periodicity as Observed in a Simple System' by Edward N. Lorenz (2006). Tellus 60A, 803–805.
Yao, L.-S. 2010 Computed Chaos or Numerical Errors. Nonlinear Anal. Modell. Control 15, 109–126.
Author Index
Abbot, I. H., 105 Abramowitz, M., 138-140, 179, 262-264 Albertson, J. B., X, 150-151 Alexander, L. V., 3, 24 Allen, L. J. S., X, 191, 234, 269, 281, 385-386 Anderson, R. M., 386 Baines, P. G., 370 Bakker, V. J., 239 Behrens III, W. W., 278-279 Bello, B., 87 Bilger, R. W., 433 Boyce, W. E., IX, 251, 254, 269, 289, 347, 370, 387 Brannan, J. R., IX Brohan, P., 3, 24 Buchanan, J. R., X Buckingham, E., 80 Bulmer, M. G., 134 Chenlo, F., 87 Chu, C. R., X, 150-151 Coonan, T. J., 239 Darwin, C. R., 287 Del Grosso, V. A., 87 DiPrima, R. C., IX, 251, 254, 269, 289, 347, 370, 387
Doak, D. F., 239 Du, S., 128 Durbin, P. A., 218, 221, 232 Edelstein-Keshet, L., IX, 183, 191, 204, 269, 386 Einstein, A., 225 Feigenbaum, M. J., 191 Fokker, A. D., 301, 411 Folland, C. K., 3, 24 Forrester, P., IX, 169, 205, 386 Fox, R. O., 391, 421 Fox, W. P., 111, 113 Fulford, G., IX, 169, 205, 386 Garcelon, D. K., 239 Gardiner, C. W., 221, 299, 302, 311 Gillies, G. T., 93 Giordano, F. R., 111, 113 Givi, P., 391, 421 Gleick, J., 191, 370 Graaff, R., 87 Haberman, R., IX, 388 Harris, I., 3, 24 Haworth, D. C., X Heinz, S., 87, 123, 159, 241, 317, 391, 421-422, 424, 426 Hibberd, M. F., 150, 158 Horton, E. B., 3, 24
Hughes, D., 379 Hurley, P. J., 150, 158 Jaynes, E. T., 128 Jenny, P., X, 421-422, 424, 426 Jernigan, J. B., 8 Jones, A., IX, 169, 205, 386 Jones, P. D., 3, 24 Kapitza, S. P., 272, 292 Kaplan, A., 3, 24 Katul, G. G., X, 150-151 Kennedy, J. J., 3, 24 Kent, E. C., 3, 24 Kermack, W. O., 386 Klimenko, A. Y., 433 Kloeden, P. E., X Kodaman, M. F., 8 Kormondy, E. J., 277 Kramers, H. A., 299 Langevin, P., 225 Langhaar, H. L., 113-114 L'Ecuyer, P., 126 Li, Z., 94 Liao, S., 379 Lorenz, E. N., 336, 370 Lubbers, J., 87 Luhar, A. K., 150, 158 Lunati, I., X Lynch, C., 239 Mader, C. W., 87 Malthus, T. R., 267 May, R. M., 191, 386 McKendrick, A. G., 386 Meadows, D. L., 277-279 Meadows, D. H., 277-279 Millikan, R. A., 94 Minier, J.-P., X Moreira, R., 87 Morrison, S. A., 239 Moyal, J. E., 299 Murray, J. D., 269
Nagle, R. K., IX Naill, R. F., 278 Newton, I., 256 Ortega, J. M., 412 Pandis, S. N., 221 Parker, D. E., 3, 24 Parlange, M. B., X, 150-151 Pawula, R. F., 301 Peirano, E., X Pereira, G., 87 Planck, M., 301, 411 Platen, E., X Pope, S. B., 326, 391, 412, 421, 427 Rainey, R. H., 254-255 Ralls, K., 239 Randers, J., 277-279 Rayner, N. A., 3, 24 Risken, H., 299, 311 Roekaerts, D., 391 Roemer, G. W., 239 Rössler, O. E., 389 Ross, S. M., 126, 145, 161 Rowell, D. P., 3, 24 Saff, E. B., IX Saltzman, B., 370 Samuelson, P. A., 166, 182 Sawford, B. L., 326 Scheiner, S. M., 287 Seinfeld, J. H., 221 Shannon, C. H., 128 Shaw, R., 239 Snider, A. D., IX Soc. Fire Protec. Engineers, 87 Sparrow, C., 378 Stegun, I. A., 138-140, 179, 262-264 Stewart, J., 134, 163, 412 Stokes, G., 94 Tans, P., 3, 24 Tchelepi, H. A., X Tett, S. F. B., 3, 24
Torrilhon, M., 421-422, 424, 426 Turchin, P., 269, 287 Turner, G. M., X, 278-280 Tyagi, M., X U.S. Dept. of Energy, 4 Verhulst, P. F., 269 Von Doenhoff, A. E., 105 Wang, H., 94 Weir, M. D., 111, 113 Wiggins, S., 385
Willig, M. R., 287 Wilson, J. D., 128 World Almanac Books, 6-7, 16, 35, 273-274 XFOIL, 104 Yafia, R., 185 Yao, L.-S., 379 Yee, E., 128 Zahn, E. K. O., 278
Subject Index
Boldface page numbers indicate principal references. acceleration, of mass, 78, 256-257 principle, 166 acceleration correlation, Markovian velocity model, 321, 334 non-Markovian velocity model, 322-323, 334 acceleration model, 317-324 implied velocity model, 318 advantage of modeling, 22-30, 32 angle of attack, 76, 103, 105, 107, 113 asymptotically stable, 347, 354-355, 358-359, 363-370, 377-378, 380 atmospheric boundary layer, 116, 382 circulation, 336 CO2 concentration, 24-30, 32, 35, 37-38, 40, 51-59, 62-63, 71, 279 dynamics, 421 gases, 22 raindrop size distribution, 134 stability, 150 surface layer, 150, 436, 439 turbulence models, 150 attractor, 380
Bessel’s correction, 147, 164 beta PDF, see PDF Boltzmann equation, 424 Boltzmann’s constant, 112, 231 boundary, conditions, 299, 304, 416 effects, 222-224 partially absorbing, 244 partially reflecting, 244 totally absorbing, 223, 244 totally reflecting, 222-223, 244 braking, acceleration, 90-91 distance, 88-90 Brownian motion, 93, 224-232, 241, 245, 250, 314, 401, 422, 434 continuous statistics, 229-232 discrete statistics, 227-229 model, 225-227 relation to diffusion model, 226 Buckingham’s Theorem, 77-82 business-as-usual scenario, 279-280 butterfly, effect, 381 shape, 380
carrying capacity, 233, 268-269, 276-277, 281-286 Central Limit Theorem, 147-149, 154 characteristic equation, 177, 261, 323, 339, 341, 389-390, 437 characteristic time scale, 75, 194 damping, 93 memory, 284-285, 292 oscillations, 260 population, 192, 233, 281, 284 relaxation, 226, 230, 314, 317, 393 velocity, 111, 226, 251, 318, 422 Chebyshev error, 41-43, 46, 69-70 closure problem, 393, 421, 424, 434 collapse, 248, 276-277, 279-280, 283 competition for food, 269, 335, 351-356, 385 comprehensive technology scenario, 279-280 compressibility, 102, 427 concentration, 221-224, 244, 254, 288-289, 332 see also atmospheric CO2 conditional, mean, 299, 305, 393, 396-398, 402-403, 411, 433, 435 normal PDF, 401-403 PDF, 305-309, 331, 333, 395-397, 416-418, 428-429, 433, 435-440 PDF equation, 305-306, 416 PDF model, 306, 416-417 variance, 436 conservation, of mass, 251, 392 of momentum, 256, 392 of population, 267 of stress, 393 consumption expenditure, 166 convection, 250, 370-371 Rayleigh-Bénard, 370
conversion period, 171-174, 202 correlation, 213 coefficient, 49-52, 55, 65, 71, 395, 399-401, 405, 436 function, 213-218, 243-245, 331-334 random variables, 210 correlation implied by, diffusion model, 219 Fokker-Planck equation, 304, 415 linear stochastic difference equation, 213-218 stochastic differential equation, 313, 315-327, 420-421 stochastic logistic model, 236-238 cubic equation, 374-377, 437 polynomial, 14, 32 damping, constant, 94, 109-110, 259, 290 effect, 97, 226, 265-266, 364 force, 93-96, 98, 108-109, 112, 259, 287, 362 frequency, 265 time scale, see characteristic damping time scale data, analysis, 141-150, 158-159, 393-399, 403-406, 433 transformations, 3-10, 31 decorrelation, 237 delay, 175, 185, 207, 277-278, 280, 293, 358 logistic model, 280-286 delayed response, 248, 266, 277 delta function, 121-123, 127, 160, 235, 243, 281, 297, 299, 306, 311, 316, 321, 394-399, 410, 416, 428-429 design conditions, 85-88, 110-111 deterministic chaos, 370, 378-382
difference equation, and differential equation, 191-199 first-order linear, 167-175, 194-195 first-order linear stochastic, 210-218 first-order nonlinear, 186-191 nonlinear stochastic, 419 second-order linear, 175-185, 196-199 second-order linear stochastic, 226 differential equation, first-order linear, 251-255 first-order linear system, 337-347 first-order nonlinear system, 347-350 Markovian stochastic, 309-316 nonlinear stochastic, 418-421 non-Markovian stochastic, 317-327 second-order linear, 260-263, 338 diffusion, 207-210, 218-224, 250, 304, 427, 434 coefficient, 219-226, 230-231, 304, 314, 317, 332, 411, 415, 426 equation, 208, 218-224, 226, 229-232, 234, 241, 243-245, 411 dimension, 77-78 dimensional analysis, 77-88 dimensionally correct equation, 77-80 dispersion, 208 distribution function, 118-121, 124, 126, 268, 272, 407 drag force, 93 Duffing’s nonlinear spring model, 385 eigenvalue, 177-178, 183-184, 197, 200, 261-262, 293, 339-347, 350, 354-377, 389-391, 412, 437 eigenvector, 340-341, 345-346, 412 Einstein relation, 230-231 energy consumption, 2, 4-5, 15-16, 31, 51-52, 70, 72, 75 entropy, 128-129, 149-150, 155, 161
epidemic models, SI model, 386 SIR model, 386-387 SIRS model, 387 SIS model, 386 equilibrium solution, difference equations, 169, 175, 187, 189, 215 equation systems, 337-348 Lorenz equations, 372 pendulum equation, 363 population ecology, 348, 353, 357 error function, 132-134, 244 extrapolation, 14, 16, 21-22, 32 Feigenbaum number, 191 filtered PDF, 141-144, 151, 164, 243, 296, 381-382, 404-406 flatness, 52-53, 60, 132, 137, 139, 151-163, 276, 290 fluctuation, 47 fluid dynamics, 392-393, 421-433 see also Lorenz equations Fokker-Planck (for one variable), and stochastic differential equation, 311-313 equation, 301 equation for correlation, 304 equation for mean, 302-303 equation for variance, 303-304 solution, 304-309 Fokker-Planck (for several variables), and stochastic differential equations, 419-421 equation, 411 equation for correlations, 415 equation for fluid dynamics, 429 equation for means, 413 equation for variances, 413-415 solution, 415-418
fourth-order SML PDF, 155-158, 164 frequency, 78, 92, 98, 111, 265 frictional force, 93 Froude number, 112, 114 fundamental theorem of probability, first, 145-147, 159 second, 147-150, 159 gamma, function, 135-139, 162 PDF, see PDF global error, 41-42, 70 temperature anomaly, 24-30, 51-52, 67-68, 70, 72 warming, 2-3, 22-32, 40, 67-68, 278 Gompertz model, 292 government expenditure, 166 gravity, 76-77, 98, 107, 111-112 acceleration, 76, 91, 95-96, 101, 109-113, 258, 289, 360 force, 76, 91, 95-96, 98, 100, 108, 111-112, 289, 360 greenhouse effect, 22 ground concentration, 223-224 growth, 266-276 chaotic, 191 cyclic, 191 exponential, 17-18, 267, 283, 292 logistic, 18-19, 269, 292, 351 rate, 17-19, 182, 204, 269 stable, 190 harmonic oscillator, 259-260, 264-266, 290, 360 Hassel model, 205 Heaviside function, 119 homogeneous equation, 80, 167, 171, 175-176, 187, 259-260, 364 Hooke’s Law, 258-259
incomplete gamma and beta functions, 138, 140 independence, 395 integral time scale, acceleration correlation, 321, 323 velocity correlation, 321-322 interest, 169-175 compound, 171-173, 202 simple, 170-171, 202 interpolation, 16, 21-22, 32, 191 inverse transformation method, 126 Kapitza’s model, 272-276, 291-292 Kepler’s, First Law, 5-6 Second Law, 5-6 Third Law, 5-8, 15, 32, 40, 51-52, 57-58, 63, 70, 88-93, 108, 256 kinetic energy, 10, 90, 111, 317, 366, 422-423, 425, 427, 430 Kirchhoff’s Second Law, 290 Kramers-Moyal, coefficients, 299, 312, 329 equation, 297-304, 309, 312, 328, 419-420 kurtosis, 132 Law of Large Numbers, 145, 147 laws, 286 of mechanics, 92, 166, 191, 196, 248, 267, 287, 360 of population ecology, 166, 191, 248, 267, 287, 350 of probability theory, 191, 295 multivariate, 337, 383 least-absolute-deviations error, 41, 44-45, 69 least-squares error, 41-42, 45-46, 47, 53, 55, 60, 64, 66, 69-74, 115, 273-274, 285, 397, 399
lift, 76 coefficient, 79-88, 101-108, 113 force, 76-85, 101-110, 113, 257 linear model, see linear polynomial loan repayment, 173-175, 202 logistic model, 269-272, 350-353 application to world population, see world population centered, 18-19, 35-36, 72, 271 delay, see delay logistic model discrete, 188-191, 205 solution, 269-271 stochastic, 233-242 Lorenz equations, 336-337, 370-383, 389-390, 427 Lorenz’s weather, 370 Lotka-Volterra equations, 257, 337, 351, 385 Lyapunov function, 366, 368 Lyapunov’s, first method, 366 second method, 366, 384 Mach number, 102-103, 104-107 Malthusian Law, 17, 267-268 marginal, PDF, 394, 403-407 propensity to consume, 166, 181 Markovian stochastic equations, linear, 314-316 nonlinear, 309-311 Markov process, 299, 310, 418 mathematical modeling, problems, 2-3, 421 process, 1 mean, 47, 120 memory, 215 effects, 283, 292, 299, 325-327 function, 280-282, 292-293 mixing, 139, 148, 333
model, development, 10-15, 19, 27, 32 error, 41-46, 55 evaluation, 5-10, 16-22, 32, 242, 280 exponential function, 17-18, 66-67 power function, 6, 26-28, 64-66 modeling concepts, 21-22, 159, 434 molecular motion model, 422-424, 440 moment, 118, 120 central, 118, 125, 131, 151, 160, 227, 272, 276, 331, 400-401 Monte Carlo simulation, 210, 216-218, 329, 422, 434 national income, see Samuelson’s national income model Navier-Stokes equations, 94, 108, 326, 378-379, 383, 427 neutral conditions, 150, 154, 405 Newton’s, First Law, 256-257 Law of Cooling, 251-254, 287-288 Law of Gravitation, 8, 93 Laws of Motion, 255-266, 360 Second Law, 98, 256-257, 258-259, 289, 360-361, 388 Third Law, 76, 256-257 noise model, 42, 43, 47-52, 57, 69, 210, 220, 240-241 nondimensional products, 79-85, 95, 97, 101, 110 non-Markovian, see differential equations normal PDF, 137-139, 144, 147-163, 208-222, 233, 237, 240-243, 275-276, 306, 309-310, 328-330, 391, 393, 416-418, 434-437 bivariate, 227, 424 joint, 227, 399-409, 424 model, 127-134
one-point statistics, 210-213, 216, 218, 234-236, 242, 311-313, 410-411, 416, 420 optimal, exponential function model, 64-67 linear model, 47-58, 398-399, 402-403 model, 21, 39-74, 397-399 power function model, 64-68 quadratic model, 59-63 orbital period, 6-8, 15-16, 91-93 oscillation, 13, 15, 32, 95-96, 154, 160, 180-191, 248-249, 255-266, 276-287, 359 owl mask shape, 380 partial differential equation, 251, 266, 335-336, 370, 391-392, 434 Pawula Theorem, 300-301 PDF (probability density function), beta, 138-141, 148-149, 159, 162-163, 246 Cauchy, 272 conditional, see conditional PDF definition (marginal PDF), 117-126, 297 definition (multivariate PDF), 410 gamma, 134-139, 154, 159, 162-163 joint PDF, 227, 394-411, 424, 428-429, 433-436, 440 logistic, 272 modeling concepts, 127-141 normal, see normal PDF uniform, 124-129, 131, 139, 148-149, 160-161, 395, 437 PDF of processes, Brownian motion model, 227 diffusion model, 219 Fokker-Planck equation, 301 Fokker-Planck equation (8.34), 306
linear stochastic difference equation, 212 linear stochastic differential equation, 314-315 Lorenz equations, 381-382 stochastic differential equation, 312 stochastic logistic model, 235 pendulum, 93-101, 108-111, 186-187, 257, 337, 360-370, 387-388 linear damped equation, 98-99 motions, 368-370 nonlinear damped equation, 98, 362 nonlinear undamped equation, 360-362 period, 95-97, 100-101 seconds pendulum, 101 phase plane, 344-345, 355, 358-359, 365-369, 380, 385, 388, 390 planetary motion, 5-8, 31, 256 PODF (population density function), Kapitza’s model, 272-276, 291 logistic, 272-276, 290 polynomial, exact model, 12-15, Lagrangian form, 10-11 linear, 3-10, 18-22, 24, 33-34 modeling, 19-22 quadratic, 6, 11, 19-22, 25, 34 reduced model, 14 population, basic scenarios, 248-249, 276-277 dynamics and modeling, 16-22, 31-32, 188-193, 232-242, 246-249, 266-286 ecology, 266-286, 350-359 Prandtl number, 371, 426-427 predator-prey interactions, 351, 356-359 predictability, 127-128 private investment, 166
quadratic model, see polynomial random, variable, 416-418 independent variables, 395 joint variables, 394-395, 403-406 random walk, 218-221, 243, 406-409 random number generation, 126 Rayleigh number, 371 reaction distance, 88-89 realization, 215, 224, 234, 243-246, 296 explanation, 209 reasonable model, 21 red blood cell production, 183-185, 204 relative error, 5 relaxation time, see characteristic time scale Reynolds number, 94, 102-107, 112, 114, 362, 378 Ricker model, 204-205 Rössler equations, 389-390 Samuelson’s national income model, 166, 181-182, 200 scaling, 205, 220, 240 scatter plots, 405-406, 434 Schaefer’s model, 291 Schwarz’s inequality, 50, 300 seconds pendulum, see pendulum self-limitation, 266-276, 351, 356, 358-359 self-similar process, 272, 292 separatrices, 368-369 separatrix, 368-369 shape factor, 103, 106 similarity, 84-88, 107, 110 skewness, 52-53, 60, 132, 137, 139, 151-159, 162, 163, 238
SML (statistically most-likely) PDF, 127-129, 155-161, 164 solution of equations, analytical, 194-195, 197-198 chaotic, 191, 378-381, 389-390 difference equation, 167-169 differential equation, 252-253 numerical, 195, 199, 378 speed of sound, 76, 86-87 spring, constant, 258, 388 force, 258-259 mass system, 109, 257-260, 290, 361-362, 388 stability, 150, 187, 189 stability analysis, linear, 350, 353-358, 363-365, 368-370, 372-377, 384-385 nonlinear, 366-370 stabilized world scenario, 279-280 stable, conditions, 150-160, 436 equilibrium, 186-187, 189, 204-205, 347, 357-358, 367 stably stratified, 116 standard, deviation, 48-49, 124-125, 130-137, 140-147, 158-161, 213-217, 236, 242-246, 272-275, 290, 408 gravitational parameter, 7, 40, 92 Stefan-Boltzmann’s Law of Radiation, 288 step function, 119, 121 stochastic integration, Itô definition, 311, 419 Stratonovich definition, 311 stochastic process, 115-116, 207, 209-211, 213, 220, 240-242, 295-302, 309-313, 317, 327-329, 392, 410, 418, 433, 438
Stokes’ Law, 93-94, 98, 108, 112, 231, 259, 362 problem, 112 stratified atmospheric boundary layer, 116, 150-158, 382 sum of normal variables, 147, 212, 407-409 superflatness, 151 superskewness, 151 T-CO2 relation, 29-30, 36-38, 58-59 temperature data (atmosphere), 116, 150-158, 405-406 theta function, 119, 410 thin airfoil, curve, 106, 108, 113 theory, 105 time to extinction, 238 PDF, 238-240, 246 trajectory, 218, 344-346, 355, 366, 380 transfer, heat, 249-254 mass, 249-251, 254-255 turbulence, 105, 150, 232, 378
variance, definition, 47-49, 118 effect of filtering, 142 effect of sample number, 145-146 fluid dynamics closure problem, 392-393 PDF and PODF models, 131, 136, 272 velocity correlation, Markovian velocity model, 320-321 non-Markovian velocity model, 322 velocity data (atmosphere), 150-158, 405-406, 439 velocity model, Markovian, 317-320 non-Markovian, 317-320 vehicular stopping distance, 8-10, 15, 31, 34, 61-62, 70, 73, 75, 88-91, 108 viscosity, dynamic, 76, 78, 93, 96, 110-114, 231, 362, 426 kinematic, 78, 85-87, 102-104, 109, 334, 426
unbiased estimates, 146-147 uncertainty, 127-128, 155 uniform PDF, see PDF unstable, conditions, 150-160, 436 equilibrium, 186-187, 189, 347, 350, 354-357, 363-367, 370, 377 U.S. population, 1, 16-22, 71-72, 192-193, 277
weather forecasting, 336, 370, 381 Wiener process, continuous (one variable), 310-311 continuous (several variables), 418-419 discrete (one variable), 220-221 World3 model, 277-280, 285 world population, 35-36, 268, 273-276, 277, 285-286
About the Author
Dr. Stefan Heinz is a Professor of Mathematics at the University of Wyoming. He holds a Ph.D. in Physics from the Heinrich-Hertz Institute, Berlin. His research interests are in mathematical modeling, multiscale processes, stochastic analysis, Monte Carlo simulations, computational fluid dynamics, turbulence, combustion, and multiphase flows. He has authored more than seventy refereed publications and the textbook Statistical Mechanics of Turbulent Flows (Springer, 2003). For more than ten years he has taught a variety of courses: calculus, probability, ordinary, partial, and stochastic differential equations, applied mathematics, and deterministic and stochastic mathematical modeling. In 2007 his teaching was recognized with the College of Arts and Sciences Extraordinary Merit in Teaching Award. In 2008 he was honored as Adjunct Professor of Mechanical Engineering. He has held visiting professor appointments at ETH Zurich, Delft Technical University, and the National Center for Atmospheric Research (NCAR) at Boulder.

AN INTRODUCTION TO MATHEMATICAL MODELING
Edward A. Bender
University of California, San Diego
A Wiley-Interscience Publication
JOHN WILEY & SONS
New York / Chichester / Brisbane / Toronto
Copyright © 1978 by John Wiley & Sons, Inc.
All rights reserved. Published simultaneously in Canada. No part of this book may be reproduced by any means, nor transmitted, nor translated into a machine language without the written permission of the publisher.
Library of Congress Cataloging in Publication Data
Bender, Edward A., 1942-
An introduction to mathematical modeling.
"A Wiley-Interscience publication."
Bibliography: p.
Includes index.
1. Mathematics-1961- 2. Mathematical models. I. Title.
QA37.2.B44 511'.8 77-23840
ISBN 0-471-02951-3
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
PREFACE
This book is designed to teach students how to apply mathematics by formulating, analyzing, and criticizing models. It is intended as a first course in applied mathematics for use primarily at an upper division or beginning graduate level. Some course suggestions are given near the end of the preface. The first part of the book requires only elementary calculus and, in one chapter, basic probability theory. A brief introduction to probability is given in the Appendix. In Part II somewhat more sophisticated mathematics is used.

Although the level of mathematics required is not high, this is not an easy text: Setting up and manipulating models requires thought, effort, and usually discussion; purely mechanical approaches usually end in failure. Since I firmly believe in learning by doing, all the problems require that the student create and study models. Consequently, there are no trivial problems in the text and few very easy ones. Often problems have no single best answer, because different models can illuminate different facets of a problem.

Discussion of homework in class by the students is an integral part of the learning process; in fact, my classes have spent about half the time discussing homework. I have also encouraged (or insisted) that homework be done by students working in groups of three or four. We have usually devoted one class period to a single model, both those worked out in the text and those given as problems. I have also required students to report on a model of their own choosing, the amount of originality required depending on the level of the student.

Except for Chapter 6, each section of the text deals with the application of a particular mathematical technique to a range of problems. This lets the students focus more on the modeling. My students and I have enjoyed the variety provided by frequent shifts from one scientific discipline to another. This structure also makes it possible for the teacher to rearrange and delete material as desired; however, Chapter 1 and Section 2.1 should be studied first. Chapter 1 provides a conceptual and philosophical framework. The discussions and problems in Section 2.1 were selected to get students started in mathematical modeling.

Most of the material in this book describes other people's models, frequently arranged or modified to fit the framework of the text, but hopefully without doing violence to the original intentions of the model. I believe all the models deal with questions of real interest: There are no 'fake' models created purely to illustrate a mathematical idea, and there are no models that have been so sanitized that they have lost contact with the complexities of the real world. Since I've selected the models, they reflect my interests and knowledge. For this I make no apology: caveat emptor.

The models have been chosen to be brief and to keep scientific background at a minimum. While this makes for a more lively and accessible text, it may give the impression that modeling can be done without scientific training and that modeling never leads to involved studies. I thought seriously about counteracting this by adding a few chapters, each one devoted to a specific model. Unable to find a way to do this without sacrificing 'learning by doing,' I abandoned the idea.

Course suggestions. On an undergraduate level, the text can be used at a leisurely pace to fill an entire year. It may be necessary to teach some probability theory for Chapter 5, and you may wish to drop Chapter 10. More variety can be obtained by using the text for part of a year and then spending some time on an in-depth study of some additional models, with guest lecturers from the appropriate scientific disciplines if possible. Another alternative is to spend more time on simulation models after Section 5.2 if a computer is available for groups of students to develop their own in-depth models.

Acknowledgments. Particular thanks are due to Norman Herzberg for his many suggestions on the entire manuscript. My students have been invaluable in pointing out discussions and problems that were too muddled or terse to understand. I owe thanks to a variety of people who have commented on parts of the manuscript, suggested models, and explained ideas to me.

I'd appreciate hearing about any errors, difficulties encountered, suggestions for additional material, or anything else that might improve future editions of this book.

EDWARD A. BENDER
La Jolla, California
August 1977
CONTENTS

1. WHAT IS MODELING
   1.1. Models and Reality
   1.2. Properties of Models
   1.3. Building a Model
   1.4. An Example
   1.5. Another Example
        Problems
   1.6. Why Study Modeling?

PART 1. ELEMENTARY METHODS

2. ARGUMENTS FROM SCALE
   2.1. Effects of Size
        Cost of Packaging
        Speed of Racing Shells
        Size Effects in Animals
        Problems
   2.2. Dimensional Analysis
        Theoretical Background
        The Period of a Perfect Pendulum
        Scale Models of Structures
        Problems

3. GRAPHICAL METHODS
   3.1. Using Graphs in Modeling
   3.2. Comparative Statics
        The Nuclear Missile Arms Race
        Biogeography: Diversity of Species on Islands
        Theory of the Firm
        Problems
   3.3. Stability Questions
        Cobweb Models in Economics
        Small Group Dynamics
        Problems

4. BASIC OPTIMIZATION
   4.1. Optimization by Differentiation
        Maintaining Inventories
        Geometry of Blood Vessels
        Fighting Forest Fires
        Problems
   4.2. Graphical Methods
        A Bartering Model
        Changing Environment and Optimal Phenotype
        Problems

5. BASIC PROBABILITY
   5.1. Analytic Models
        Sex Preference and Sex Ratio
        Making Simple Choices
        Problems
   5.2. Monte Carlo Simulation
        A Doctor's Waiting Room
        Sediment Volume
        Stream Networks
        Problems
        A Table of 3000 Random Digits

6. POTPOURRI
        Desert Lizards and Radiant Energy
        Are Fair Election Procedures Possible?
        Impaired Carbon Dioxide Elimination
        Problems

PART 2. MORE ADVANCED METHODS

7. APPROACHES TO DIFFERENTIAL EQUATIONS
   7.1. General Discussion
   7.2. Limitations of Analytic Solutions
   7.3. Alternative Approaches
   7.4. Topics Not Discussed

8. QUANTITATIVE DIFFERENTIAL EQUATIONS
   8.1. Analytical Methods
        Pollution of the Great Lakes
        The Left Turn Squeeze
        Long Chain Polymers
        Problems
   8.2. Numerical Methods
        Towing a Water Skier
        A Ballistics Problem
        The Heun Method
        Problems

9. LOCAL STABILITY THEORY
   9.1. Autonomous Systems
   9.2. Differential Equations
        Theoretical Background
        Frictional Damping of a Pendulum
        Species Interaction and Population Size
        Keynesian Economics
        More Complicated Situations
        Problems
   9.3. Differential-Difference Equations
        The Dynamics of Car Following
        Problems
   9.4. Comments on Global Methods
        Problem

10. MORE PROBABILITY
        Radioactive Decay
        Optimal Facility Location
        Distribution of Particle Sizes
        Problems

APPENDIX. SOME PROBABILISTIC BACKGROUND
   A.1. The Notion of Probability
   A.2. Random Variables
   A.3. Bernoulli Trials
   A.4. Infinite Event Sets
   A.5. The Normal Distribution
   A.6. [...]
   A.7. Least Squares
   A.8. The Poisson and Exponential Distributions

REFERENCES

A GUIDE TO MODEL TOPICS

INDEX
AN INTRODUCTION TO MATHEMATICAL MODELING

CHAPTER 1
WHAT IS MODELING

1.1. MODELS AND REALITY

The theoretical and scientific study of a situation centers around a model, that is, something that mimics relevant features of the situation being studied. For example, a road map, a geological map, and a plant collection are all models that mimic different aspects of a portion of the earth's surface. The ultimate test of a model is how well it performs when it is applied to the problems it was designed to handle. (You cannot reasonably criticize a geological map if a major highway is not marked on it; however, this would be a serious deficiency in a road map.) When a model is used, it may lead to incorrect predictions. The model is often modified, frequently discarded, and sometimes used anyway because it is better than nothing. This is the way science develops.

Here we are concerned exclusively with mathematical models, that is, models that mimic reality by using the language of mathematics. Whenever we use 'model' without a modifier, we mean 'mathematical model.' What makes mathematical models useful? If we 'speak in mathematics,' then
1. We must formulate our ideas precisely and so are less likely to let implicit assumptions slip by.
2. We have a concise 'language' which encourages manipulation.
3. We have a large number of potentially useful theorems available.
4. We have high speed computers available for carrying out calculations.

There is a trade-off between items 3 and 4: Theory is useful for drawing general conclusions from simple models, and computers are useful for drawing specific conclusions from complicated models. Since the thought habits needed in formulating models are quite similar in the two cases, it matters little what sort of models we use; consequently, I have felt free to neglect computer based models purely for personal pedagogical reasons. There are some references to a computer in Section 5.2 where Monte Carlo simulation is discussed and, to a lesser extent, in Section 8.2 where numerical solutions to differential equations are discussed.

Mathematics and physical science each had important effects on the development of the other. Mathematics is starting to play a greater role in the development of the life and social sciences, and these sciences are starting to influence the development of mathematics. This sort of interaction is extremely important if the proper mathematical tools are going to be developed for the various sciences. S. Bochner (1966) discusses the hand-in-hand development of mathematics and physical science. Some people feel that there is something deeper going on than simply an interaction leading to the formulation of appropriate mathematical and physical concepts. E. P. Wigner (1960) discusses this.

1.2. PROPERTIES OF MODELS
We begin with a definition based on the previous discussion: A mathematical model is an abstract, simplified, mathematical construct related to a part of reality and created for a particular purpose. Since a dozen different people are likely to come up with a dozen different definitions, don't take this one too seriously; rather, think of it as a crude starting point around which to build your own understanding of mathematical modeling.

We now have a problem: To fully appreciate the general discussion in the next two sections you should look at some concrete examples like those in Sections 1.4 and 1.5; however, you will need some abstract background to appreciate the examples fully. I suggest reading the remainder of the chapter through quickly and then coming back to this point and rereading more carefully.

As far as a model is concerned, the world can be divided into three parts:

1. Things whose effects are neglected.
2. Things that affect the model but whose behavior the model is not designed to study.
3. Things the model is designed to study the behavior of.

The model completely ignores item 1. The constants, functions, and so on, that appear in item 2 are external and are referred to as exogenous variables (also called parameters, input, or independent variables). The things the model seeks to explain are endogenous variables (also called output or dependent variables). The exogenous-endogenous terminology is used in some areas of modeling. The input-output terminology is used in areas of modeling where the model is viewed as a box into which we feed information and from which we obtain information. The parameter-independent-dependent terminology is the standard mathematical usage.

Suppose we are hired by a firm to determine what the level of production should be to maximize profits. We would construct a model that enables us to express profits (the dependent variable) in terms of the level of production, the market situation, and whatever else we think is relevant (the independent variables). Next we would measure all the independent variables except the level of production and use the model to determine which value of the level of production gives the greatest profit.

Now let's look at things from the point of view of an economist who is seeking to explain the amount of goods firms produce. A two-part model could be constructed: Firms seek to maximize profits, and profits can be determined as sketched in the previous paragraph. In this model profits become an internal variable (of no interest except for the machinations of the model), and level of production changes from an independent to a dependent variable.

These three categories (neglected, input, and output) are important in modeling. If the wrong things are neglected, the model will be no good. If too much is taken into consideration, the resulting model will be hopelessly complex and probably require incredible amounts of data. Sometimes, in desperation, a modeler neglects things not because he thinks they are unimportant, but because he cannot handle them and hopes that neglecting them will not invalidate the conclusions. A. Jensen (1966) discusses the development of a model for safety-at-sea problems. The main difficulty in formulating the model was to determine what types of encounters between ships were dangerous, that is, to separate items 1 and 2. He found this to be hard even with the aid of nautical experts. (If you want to know the answer, you'll have to read the article.)

Proper choice of dependent variables (i.e., output) is essential; we must seek to explain the things we can explain. Often this choice is relatively clear, as in the example involving the economist who wished to explain the level of production of a firm. Sometimes we need to be careful; for example, we could explain profits in terms of level of production, but not conversely as we might naively try to do, since we were asked to determine the best level of production.

Since different models make different types of simplifying assumptions, there is usually no single best model for describing a situation. R. Levins
(1968, p. 7) observed that 'it is not possible to maximize simultaneously generality, realism, and precision.' In the social sciences one is often content with a statement that something will increase; precision has been sacrificed for realism and (hopefully) generality. Simulation models usually try for precision and realism but sacrifice generality. These three trade-offs should become clearer after you have studied some actual models.

Definitions of the variables and their interrelations constitute the assumptions of the model. We then use the model to draw conclusions (i.e., to make predictions). This is a deductive process: If the assumptions are true, the conclusions must also be true. Hence a false prediction implies that the model is wrong in some respect. Unfortunately things are usually not this clear-cut. We know our model is only an approximation, so we cannot expect perfect predictions. How can we judge a model in this case? A conclusion derived from a crude model is not very believable, especially if other models make contrary predictions. A result is robust if it can be derived from a variety of different models of the same situation, or from a rather general model. A prediction that depends on very special assumptions for its validity is fragile. The cruder the model, the less believable its fragile predictions.

You may notice that we have talked about conclusions, not explanations. Can a model provide explanations? This is a somewhat philosophical question, and different people have different notions of what constitutes an explanation. Let us grant that, in some sense, models can provide explanations. A decision about the validity of a model is usually based on the accuracy of its predictions. Unfortunately, two different models may make the same predictions but offer different explanations. How can this be? We can think of the situation we are modeling as being a 'black box' which outputs something for every input. ('Something' can be no output.) A model makes correct predictions if it outputs the model equivalent of the black box output whenever the model equivalent of the black box input is fed in. The mechanism is irrelevant when dealing with predictions, but the nature of the mechanism is the heart of an explanation.

Although there is usually a situation in which two different models lead to different predictions, we may not be able to determine which prediction is correct. For example, a model of a politician can be constructed by assuming that his behavior is (1) motivated by concern for his fellow man or (2) motivated by a desire for public office. In many situations these two models lead to identical or very similar predictions. It may be difficult to make contradictory predictions that can be checked. Another example for those familiar with simple circuits is the mathematical equivalence between perfect springs and perfect LC circuits. Although the underlying mathematics is identical, no one would seriously suggest that Hooke's law for springs 'explains' the circuit's behavior.
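The spring and LC circuit equivalence can be written out explicitly. The following is a short sketch using standard textbook notation, none of which appears in the text itself: $m$ is the mass on the spring, $k$ the spring constant, $x$ the displacement, $L$ the inductance, $C$ the capacitance, and $q$ the charge on the capacitor.

```latex
% Perfect spring (Hooke's law, no friction):
m \frac{d^2 x}{dt^2} + k x = 0,
\qquad x(t) = A \cos(\omega t + \varphi), \quad \omega = \sqrt{k/m}.

% Perfect LC circuit (no resistance):
L \frac{d^2 q}{dt^2} + \frac{1}{C}\, q = 0,
\qquad q(t) = A \cos(\omega t + \varphi), \quad \omega = \frac{1}{\sqrt{LC}}.

% The correspondence x <-> q, m <-> L, k <-> 1/C maps one equation onto the other.
```

Both systems therefore make the same kind of prediction (undamped sinusoidal oscillation), even though the mechanisms behind them are entirely different.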
We have been talking about an ideal modeler. When any of us approaches a problem, we do so in a limited, biased fashion. The more open-minded, communicative, and creative we can be, the better our model is likely to be. The following poem illustrates the problems that can arise.

The Blind Men and the Elephant

It was six men of Indostan
To learning much inclined,
Who went to see the Elephant
(Though all of them were blind),
That each by observation
Might satisfy his mind.

The First approached the Elephant,
And happening to fall
Against his broad and sturdy side,
At once began to bawl:
'God bless! but the Elephant
Is very like a wall!'

The Second, feeling of the tusk,
Cried, 'Ho! what have we here
So very round and smooth and sharp?
To me 'tis mighty clear
This wonder of an Elephant
Is very like a spear!'

The Third approached the animal,
And happening to take
The squirming trunk within his hands,
Thus boldly up and spake:
'I see,' quoth he, 'the Elephant
Is very like a snake!'

The Fourth reached out an eager hand,
And felt about the knee.
'What most this wondrous beast is like
Is mighty plain,' quoth he;
''Tis clear enough the Elephant
Is very like a tree!'

The Fifth who chanced to touch the ear,
Said: 'E'en the blindest man
Can tell what this resembles most;
Deny the fact who can,
This marvel of an Elephant
Is very like a fan!'

The Sixth no sooner had begun
About the beast to grope,
Than, seizing on the swinging tail
That fell within his scope,
'I see,' quoth he, 'the Elephant
Is very like a rope!'

And so these men of Indostan
Disputed loud and long,
Each in his own opinion
Exceeding stiff and strong.
Though each was partly in the right
And all were in the wrong!

John Godfrey Saxe (1816-1887)
Reprinted in Engineering Concepts Curriculum Project (1971)
1.3. BUILDING A MODEL

Model building involves imagination and skill. Giving rules for doing it is like listing rules for being an artist; at best this provides a framework around which to build skills and develop imagination. It may be impossible to teach imagination. I won't try, but I hope this book provides an opportunity for your skills and imagination to grow. With these warnings, I present an outline of the modeling process.

1. Formulate the Problem. What is it that you wish to know? The nature of the model you choose depends very much on what you want it to do.

2. Outline the Model. At this stage you must separate the various parts of the universe into unimportant, exogenous, and endogenous. The interrelations among the variables must also be specified.

3. Is It Useful? Now stand back and look at what you have. Can you obtain the needed data and then use it in the model to make the predictions you want? If the answer is no, then you must reformulate the model (step 2) and perhaps even the problem (step 1). Note that 'useful' does not mean reasonable or accurate; they come in step 4. It means: If the model fits the situation, will we be able to use it?

4. Test the Model. Use the model to make predictions that can be checked against data or common sense. It is not advisable to rely entirely on common sense, because it may well be wrong. Start out with easy predictions; don't waste time on involved calculations with a model that may be no good. If these predictions are bad and there are no mathematical errors, return to step 2 or step 1. If these predictions are acceptable, they should give you some feeling for the accuracy and range of applicability of the model. If they are less accurate than you anticipated, it is a good idea to try to understand why, since this may uncover implicit or false assumptions.
At this point the model is ready to be used. Don't go too far; it is dangerous to apply the model blindly to problems that differ greatly from those on which it was tested. Every application should be viewed as a test of the model.

You may not be able to carry out step 2 immediately, because it is not clear what factors can be neglected. Furthermore, it may not be clear how accurately the exogenous variables need to be determined. A common practice is to begin with a crude model and rough data estimates in order to see which factors need to be considered in the model and how accurately the exogenous variables must be determined.

Some models may require no data. If a model makes the same prediction regardless of the data, we are not getting something for nothing because this prediction is based on the assumptions of the model. To some extent, the distinction between data and assumptions is artificial. In an extreme case, a model may be so specialized that its data are all built into the assumptions.

Sometimes step 4 may be practically impossible to carry out. For example, how can we test a model of nuclear war? What do we do if we have two models of a nuclear war and they make different predictions? This can easily happen in fields of study that lack the precisely formulated laws found in the physical sciences. At this point experience is essential: not experience in mathematics but experience in the field being modeled.

Even if predictions can be tested, the testing may be expensive to carry out and may require training in a particular field of experimental science. Since the absence of experimental verification leaves the modeling process incomplete, I have given test results whenever I have been able to obtain them.

1.4. AN EXAMPLE
We discuss models for the long term growth of a population in order to illustrate some of the ideas of the two previous sections. We want to predict how a population will grow numerically over a few generations. This is the problem (step 1 in Section 1.3).

Let the exogenous (independent) variables be the net reproduction rate $r$ per individual, the time $t$, and the size of the population at $t = 0$. The net reproduction rate is the birth rate minus the death rate. In other words, it is the fractional rate of change of the population size: $(dN/dt)/N$. There is only one endogenous (dependent) variable, the size of the population at time $t$, which we denote by $N(t)$. We also refer to $r$ as the net growth rate. To obtain a simple model, we ignore time lag effects; that is, we assume that only the present value of $N$ and its derivatives are relevant in determining the future values of $N$. (This will lead to a differential equation.) If the fraction of the population that is of reproductive age varies with $t$, this can be a very poor assumption. Let's also assume that the net reproduction rate $r$ is a constant. This gives us a rather crude model with the basic relationship

$$\frac{1}{N}\frac{dN}{dt} = r. \tag{1}$$

The model would certainly be useful if it fits the real world (step 3). The solution of (1) is $N(t) = N(0)e^{rt}$. Unless $r = 0$, the population will eventually either die out ($r$ negative) or grow to fill the universe ($r$ positive). Reasonable behavior of the population size is a very fragile prediction of the model. This casts serious doubt on the validity of using a constant net reproduction rate for predicting long term growth. This approach to a model illustrates an important point: Study the behavior of your model in limiting cases (in this case as time gets very long, i.e., as $t \to \infty$). Our test of the model (step 4) for long term growth indicates that it must be rejected; however, it may be useful for short term predictions. Unfortunately, we specifically asked for long term predictions.

Clearly the growth rate of a population will depend on the size of the population because of such effects as exhaustion of the food supply. If the population becomes very large, we can expect the death rate to exceed the birth rate. Let's translate this into mathematics. We replace the net reproduction rate $r$ in (1) by $r(N)$, which is a strictly decreasing function of $N$ for large $N$ and becomes negative when $N$ is very large. Thus

$$\frac{1}{N}\frac{dN}{dt} = r(N). \tag{2}$$
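The limiting behavior of models (1) and (2) can be checked numerically. The following is a minimal sketch, not part of the original text: the function names, the Euler step size, and the particular linear form chosen for the net growth rate (zero at a population of 1000) are all illustrative assumptions.

```python
import math


def exponential_population(n0, r, t):
    """Closed-form solution N(t) = N(0) * exp(r * t) of model (1)."""
    return n0 * math.exp(r * t)


def r_linear(n, r_max=0.5, n_zero=1000.0):
    """Hypothetical net growth rate for model (2).

    It decreases with N, is positive for small N, negative for large N,
    and equals zero at N = n_zero. The linear form is an assumption.
    """
    return r_max * (1.0 - n / n_zero)


def simulate_density_dependent(n0, r_of_n, t_end, dt=0.01):
    """Integrate dN/dt = N * r(N), model (2), with a simple Euler scheme."""
    n = n0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        n += dt * n * r_of_n(n)
    return n


if __name__ == "__main__":
    # Model (1): any nonzero r eventually gives extinction or explosion.
    for r in (-0.1, 0.0, 0.1):
        sizes = [exponential_population(100.0, r, t) for t in (0, 50, 200)]
        print("r =", r, "->", sizes)

    # Model (2): small and large starting populations are followed over time.
    for start in (10.0, 5000.0):
        final = simulate_density_dependent(start, r_linear, 100.0)
        print("N(0) =", start, "-> N(100) =", final)
```

With this assumed rate, populations started below and above 1000 both settle near 1000, whereas the constant-rate model gives a steady population only in the special case where r is exactly zero.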
We've now redone step 2. The model is less useful than the previous one, because obtaining the exact form of $r(N)$ will be hard, perhaps even impossible. However, rough estimates can be obtained, so let's see what can be done with them. On to step 4. It can be shown that $N(t)$ approaches $N_0$, the solution of $r(N_0) = 0$, as time passes. This is a robust prediction, since we made very few assumptions concerning the nature of the function $r(N)$. Because the model was constructed to predict an upper limit for the size of a population, it is not surprising that it does so.

The cycle of steps 4, 2, and 3 can be repeated, since the model described by (2) has many drawbacks. For one thing, the population can only move closer to $N_0$ in the future. A real population often overshoots the steady state size $N_0$, and even steady state populations fluctuate slightly in size because of the somewhat random nature of births and deaths. One way to eliminate the first objection is to introduce time lags. For example, if the death rate $m$ is not age dependent and the birth rate changes from zero to a constant $b$ at age $p$, we could replace (1) by

$$\frac{dN}{dt} = -mN(t) + bN(t - p). \tag{3}$$

The parameter $p$ is called a time lag. Of course, we could make $m$ and $b$ functions of $N(t)$, $N(t - p)$, or some weighted average of $N$ on the interval $[t - p, t]$. To allow for random fluctuations we must replace our deterministic model by a random one.

Another drawback is the assumption that it makes sense to talk about $r(N)$. If the age or sex ratios in a population are changing, this may be nonsense. To overcome this objection it is necessary to split the population into subpopulations based on age and sex. Demographic models are designed in this way: In a typical model time is broken up into discrete units such as 5 year periods, men are ignored, and women are divided into age classes separated by a single time unit. For each age class $i$ there is no longer simply a net birth rate but a death rate $m_i$ and a birth rate $b_i$ for female children. The number of newborn girls at time $t + 1$ is $N_0(t + 1) = \sum_i b_i N_i(t)$, and the number of women in class $i + 1$ is $N_{i+1}(t + 1) = (1 - m_i)N_i(t)$, the number surviving from class $i$ at time $t$. Linear algebra is a natural tool for handling this model. Demographers frequently assume that $b_i$ and $m_i$ are independent of $N$, because they are interested in relatively short term predictions.

Seasonality may be important for short term models, since in many species births occur during a particular season and death rates are also dependent on the time of year. An explicit time dependence must be built into $r(N)$ to allow for seasonal effects. In a long term model encompassing many years we could probably avoid this complication by averaging birth and death rates over an entire year.
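The age-class demographic model described in this section can be stepped forward with a few lines of code. The sketch below is illustrative only: the function names and the three-class rates are invented for the example, and the assumption that survivors of the oldest class simply leave the model is mine, since the text does not say how the last class is handled.

```python
def project_one_step(counts, birth_rates, death_rates):
    """Advance the female age-class counts by one time unit.

    counts[i] plays the role of N_i(t), birth_rates[i] of b_i,
    and death_rates[i] of m_i.
    """
    # Newborn girls: sum of b_i * N_i(t) over all age classes.
    newborn = sum(b * n for b, n in zip(birth_rates, counts))
    # Survivors of class i: (1 - m_i) * N_i(t).
    survivors = [(1.0 - m) * n for m, n in zip(death_rates, counts)]
    # Survivors of class i enter class i + 1. Assumption of this sketch:
    # survivors of the oldest class are dropped from the model.
    return [newborn] + survivors[:-1]


def project(counts, birth_rates, death_rates, steps):
    """Apply project_one_step repeatedly and return the whole history."""
    history = [list(counts)]
    for _ in range(steps):
        history.append(project_one_step(history[-1], birth_rates, death_rates))
    return history


if __name__ == "__main__":
    # Hypothetical data for three age classes.
    start = [100.0, 80.0, 60.0]
    b = [0.0, 1.2, 0.4]
    m = [0.1, 0.2, 1.0]
    for row in project(start, b, m, steps=5):
        print(row)
```

Each step is a linear map of the vector of class sizes, which is why linear algebra is the natural tool: the same update could be written as multiplication by a single matrix built from the birth and death rates.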
I hope this discussion makes it clear that we can't formulate an adequate model unless we know (1) what we hope to obtain from the model and (2) how complicated a model we are willing to tolerate. The latter is practically the same as how much data we are willing to supply, since complexity and data demands usually grow simultaneously.

1.5. ANOTHER EXAMPLE
The manager of a large commercial printing company asks your advice on how many salespeople to employ. Qualitatively, more salespeople will increase sales overhead, while fewer salespeople may mean losing potential customers. Thus there should be some optimum number. By 'salespeople' I don't mean clerks, but people who travel, selling a company's products to other businesses; however, these ideas could be applied to salesclerks, too. This problem has been adapted from A. A. Brown et al. (1956). The original paper goes into greater depth than the following discussion and is well worth reading.

The problem as stated is unanswerable. What are the production limitations of the company? What are the goals of management? Maximum profit? Maximum 'empire' with satisfactory profit? Something else? Unless these and similar questions are very clearly answered, recommendations may be quite inaccurate. A better approach would be to provide a description of the consequences of sales forces of various sizes. This would leave the final decision up to management, which is as it should be.

To determine what effect a sales force will have, we must know what salespeople accomplish. Thus we can try to determine how salespeople spend their time and what results they obtain as a consequence of spending their time in that way. As long as salespeople need to be studied, we may as well ask: What is the best way (in terms of obtaining sales) for them to spend their time? We can then advise management on (1) how to obtain the greatest return from their sales force, and (2) the impact various sizes of sales forces will have on sales. This tentatively completes step 1.

Notice that we have changed the original problem considerably. We were asked, 'How many salespeople should be employed?' Instead, we are going to answer two other questions which we formulated at the end of the previous paragraph. Actually the questions need further refinement.
For example, different salespeople have different abilities, and their territories are probably different. The question how salespeople should spend their time contains a trap, because it invites us to ignore these variations. Again, if we change the size of the sales force, we can change the total geographical area covered, the effort expended per customer, or both. Thus the question on the consequences of various sizes of sales forces also contains potential traps. Clearly step 1 hasn't been completed; however, the best idea is probably to proceed and to realize that in studying a real situation we will eventually need to return to step 1 and formulate the questions more precisely in a way that depends greatly on the particular printing company being studied.

The major factor that will affect how much time a salesperson spends on a customer is what the salesperson can hope to gain. Observations indicate that businesses normally place most of their printing orders with one company. Hence we can classify customers as 'in hand' or 'potential.' The former need to be held, and the latter need to be converted. In addition, we can classify customers according to how much money they have to spend. As an approximation we can assume (but it should be checked) that holding and conversion probabilities are independent of size. By running an experiment with the salespeople, or possibly by examining records if we are lucky, we can obtain an idea of how conversion and holding probabilities vary with the amount of time per week devoted to a customer. From this we can decide how a salesperson should spend their time, because one additional hour per month should produce the same expected gain in revenue regardless of which customer it is spent with. (If you don't see this, don't worry, I've omitted some details. Try rereading it after Chapter 4.) This completes steps 2 and 3 for the first part of the problem. We don't have the data to carry out step 4, but it should be relatively straightforward.

The decision on how a salesperson should divide his time together with the data on holding and conversion probabilities and data on the sizes of orders various businesses place will determine gross revenue as a function of number of salespeople. (Think about why this is so.)

The above outline indicates how we can attack the problem posed by management (remember: How many salespeople should we employ?). The answer will consist of

1. A statement of how best to divide up a salesperson's time as a function of the number and type of customers being dealt with.
2. A table of expected gross income as a function of number of salespeople, assuming that the sales districts are divided up evenly.

The model building will not be complete until we actually collect the data and make predictions. As soon as we do this, we'll find that the data permit only rough estimates for items 1 and 2. Thus we should give some estimate of the range of the numbers: If we have n salespeople, the gross sales will be expected to be between X and Y dollars. We could also try to anticipate a question management is likely to raise: We can't make salespeople divide their time just the way you recommend. Besides, salespeople and customers are individuals. How sensitive are your recommendations to all this?
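The remark that one additional hour should produce the same expected gain with every customer can be illustrated with a small allocation sketch. Everything specific here is an assumption made for the example: the saturating response curve, the customer data, and the function names are hypothetical, and the greedy hour-by-hour rule is a standard technique for diminishing-returns problems rather than the method of the original paper.

```python
import math


def expected_revenue(hours, order_size, saturation_hours):
    """Hypothetical expected revenue from one customer.

    Order size times a holding/conversion probability that rises with
    the hours spent but with diminishing returns (an assumed form).
    """
    return order_size * (1.0 - math.exp(-hours / saturation_hours))


def allocate_hours(customers, total_hours):
    """Hand out hours one at a time to the customer who gains most.

    customers is a list of (order_size, saturation_hours) pairs.
    For diminishing-returns curves this leaves the gain from one more
    hour roughly equal across all customers who receive any time.
    """
    hours = [0] * len(customers)
    for _ in range(total_hours):
        gains = [
            expected_revenue(h + 1, size, sat) - expected_revenue(h, size, sat)
            for h, (size, sat) in zip(hours, customers)
        ]
        best = max(range(len(customers)), key=gains.__getitem__)
        hours[best] += 1
    return hours


if __name__ == "__main__":
    # Hypothetical customers: (order size in dollars, hours to saturate).
    customers = [(2000.0, 5.0), (1000.0, 5.0), (500.0, 2.0)]
    print(allocate_hours(customers, 20))
```

If one customer offered a larger gain from the next hour than another, moving an hour between them would raise total expected revenue, which is why the marginal gains must be about equal at the best allocation.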
This example illustrates the importance of formulating the problem. The problem as given was hard or impossible to solve. By breaking it down and changing the goal (a tabulation of number of salespeople versus expected sales rather than simply an optimal number of salespeople), it became more approachable.

C. C. Lin and L. A. Segel (1974, Ch. 1) discuss applied mathematics and present two further examples. You may enjoy reading their chapter to obtain a somewhat different viewpoint. The first two chapters of J. Crank (1962) are also interesting reading. Chapters 2 and 3 of C. A. Lave and J. G. March (1975) present an interesting discussion of modeling.
PROBLEMS
Some of the problems in this book lead you step by step through the developÂ ment of a model and thus resemble the mathematics problems you have seen in other courses ; however, many problems are closer to real life : They are vaguely stated, have multiple answers (models), or are open ended. I strongly recommend working in small groups on the problems to bring out various ideas and evaluate them critically. 1.
Suppose people enter the elevators in a skyscraper at random during the morning rush. The result will be several elevators stopping on each floor to discharge one or two passengers each.
(a) (b) (c)
Discuss schemes for improving the situation. How could improvement be measured ? How could you model the situation to decide what scheme to adopt ?

2. In the text we discussed models for the growth of a single population. Discuss models for the growth of two interacting populations. This problem has been phrased very vaguely, and before working on it at home decide on a more concrete situation (or situations) in class.

3. How far can a migrating bird fly without food?

4. If all five employees can run all six machines in your shop, how should you decide whom to assign to which job?

5. Discuss the differences and similarities in models of urban vehicular traffic that you would construct to deal with the following problems. To what extent could one model be used to handle problems it wasn't designed for? Consider each case separately. Don't try to set up detailed models, just discuss your general approach.
(a) You are working for a citizens' committee which wants to convince the city council to ban private vehicles in the city because of pollution.
(b) The city council has asked you as a traffic engineering expert to study the possibilities of speeding up traffic flow by changing traffic signal times, setting up one-way streets, and anything else you can think of that will help the traffic problem, not upset the voters, and not cost much to implement.
(c) Since your recent efforts have won you a reputation, the city council has given you a contract to study the feasibility of banning private vehicles and taxis in most of the city as a means of reducing atmospheric and noise pollution, but in a fashion that won't interfere greatly with the mobility of the populace. Since this is a thorny problem with many conflicting goals (a political hornets' nest), the city fathers have told you to give them a straightforward recommendation so they can avoid the onus of decision making.

6. Unless you have been extremely lucky, you have had a large class in a poorly designed lecture hall.
(a) What are some criteria to be considered in designing a large lecture hall?
(b) One criterion is legibility of material written on the boards. Construct a model of legibility as a function of the distance your seat is from the board and the angle at which you look at the board. What will the curves of constant legibility look like on a floor plan? How can you test this prediction? Try it. Does this suggest shaping the back of the hall differently than is usually done? How?
(c) Can mathematical modeling help with any other criteria besides the one mentioned in (b)? Try to pick a criterion from among these possibilities and develop a model for it.
You may wish to look at A. A. Bartlett (1973) after working on this problem.

7. A common technique when no models are available is to collect data, try to fit curves, and then treat the curves as if they were a model or even an explanation. Discuss. Would you have faith in predictions made from such models? Explain. Two commonly misused techniques are factor analysis and linear regression. For a delightful spoof of the former, see J. S. Armstrong (1967).

8. One of the simplest models of population growth is the logistic equation

$$\frac{dN}{dt} = rN\left(1 - \frac{N}{K}\right).$$

(a) Interpret $r$ and $K$. Discuss the model.
(b) Suppose you were given census data for a population (i.e., a table of date versus population size). How could you test the fit of the logistic model to the data? Remember that $r$ and $K$ are not given.
(c) E. G. Leigh (1971, p. 124) quotes the following data from the U.S. Census Bureau on the growth of the U.S. population and from Gause on the growth of a population of the one-celled animal Paramecium aurelia. How well does the logistic model fit the data?

U.S. population

Year    N x 10^-6
1790      3.93
1810      7.24
1830     12.87
1850     23.19
1870     39.82
1890     62.95
1910     91.97
1930    122.78
1950    150.70
1970    208.0

Paramecium aurelia

Day     N
 1        2
 2        7
 3       25
 4       68
 5      168
 6      138
 7      190
10      122
11      280
12      260
13      300

(d) Can you suggest better models for the growth of the two populations given above? 'Better' is a vague word. It could mean simpler, fitting the data more accurately, having a firmer biological and sociological foundation, and so on.
1.6. WHY STUDY MODELING?

Why not always deal with the real world instead of studying models? Modeling can avoid or reduce the need for costly, undesirable, or impossible experiments with the real world, as the following problems illustrate:

1. What is the most efficient way to divide the fuel between the stages of a multistage rocket?
2. What would be the effect of a very bad nuclear reactor accident?
3. How large a meteor was needed to produce Meteor Crater in Arizona?
In trying to 'explain' the world, modeling is essential. Scientific theories are models and are frequently mathematical models. Every scientist from the purest to the most applied must know how to use such models whether he calls them that or not. For anyone planning to use mathematical models, an understanding of how to go back and forth between the world we live in and the world of mathematics is essential. This is the crux of mathematical modeling and this is what I hope this course will help you learn to do. It is neither science nor mathematics, but rather how to put them together. Science and mathematics courses are essential (you need something to put together), and this is no substitute for them.
PART 1. ELEMENTARY METHODS

CHAPTER 2. ARGUMENTS FROM SCALE
In this chapter we consider arguments based on proportionality. For example, if you make a scale model of an object with a scale of $1 : l$, surface area will have a scale of $1 : l^2$ and the volume a scale of $1 : l^3$. Models using this sort of idea are discussed in the first section. The second section is based on the observation that physical laws remain the same if the units of measurement are changed.
2.1. EFFECTS OF SIZE

Cost of Packaging
Consider a product like flour, detergent, or jam, which is packaged in containers of various sizes. You've probably noticed that larger packages of such products usually cost less per ounce. This is often attributed to savings in the cost of packaging and handling. Is this in fact the major cause or are there other important factors? We try to see where this idea leads by constructing a simple model.

The cost of a product is the endogenous variable. We are interested in seeing how it varies with the exogenous variable, size. Cost clearly depends on competition and the scale of the business. We neglect these factors and concentrate on expenses due to materials and handling. Since we are neglecting some important factors (name some), the resulting predictions will be crude. In addition, there are various constants involved which we do not even pretend to evaluate.

Let's begin by studying the wholesale cost, that is, the price the retailer pays for the product. This is a sum of several costs plus various profit markups by middlemen. Since profit markups are usually in terms of percentages, we can absorb them in constants later; for example, a 30% markup multiplies constants by 1.30. The main costs that enter the wholesale price are:

1. Cost of producing the product, $a$.
2. Cost of packaging the product, $b$.
3. Cost of shipping the product, $c$.
4. Cost of the packaging material, $d$.

We will consider each of these in turn. It is reasonable to assume that $a$ is proportional to the amount of the good being produced. We write this as $a \propto w$, which is read '$a$ is proportional to the weight $w$.'

The costs of packaging depend on how long it takes to fill the package, how long it takes to close the package, and how long it takes to load the package into a box for shipping. The first time is probably nearly proportional to the volume (hence the weight), while the latter two times are probably about the same for all sizes of packages in a reasonable range. Thus $b \approx fw + g$ for some positive constants $f$ and $g$. (The symbol $\approx$ means 'approximately equal to.')

Shipping charges may depend on both weight and volume. Since volume is proportional to weight for filled packages, we have $c \propto w$.

The cost of the packaging material is more complicated. It depends on the costs the package manufacturer must meet. Thus we must consider $a$, $b$, $c$, and $d$ for the package manufacturer. We neglect $d$; that is, we neglect the cost of the containers for the material from which the final packages are made. From the analysis we have just completed, the cost per package depends on the weight and volume of the package. If the range of packages we are considering is not too large, it is reasonable to assume that the packaging material is the same for all sizes of packages. Therefore the amount of material per package (hence the weight of a package) is proportional to the area of the surface to be covered. The volume per package is proportional to either the surface area or the volume of the package, depending on whether the packaging is shipped collapsed (like cardboard) or preformed (like glass). Therefore the expenses per package of the package supplier are $hw + kS + m$, for constants $h \geq 0$, $k > 0$, and $m > 0$, where $S$ is the surface area. Except for a markup, this is the cost $d$ to the packager.

We now use a scale argument to reduce everything to one independent variable, weight. Let us assume that the various packages are roughly geometrically similar. The volume is nearly proportional to the cube of a linear dimension, and the surface area is nearly proportional to the square of a linear dimension: $v \propto l^3$ and $S \propto l^2$. Hence $S \propto v^{2/3}$. Since $v \propto w$, we have

$$S \propto w^{2/3}.$$

Thus the wholesale cost per ounce is

$$\frac{\text{Cost}}{w} = \frac{a + b + c + d}{w} = n + p\,w^{-1/3} + \frac{q}{w} \tag{1}$$
for positive constants $n$, $p$, and $q$. From this we see that the cost per ounce decreases as the size of the package increases, in agreement with the observation made at the start of this discussion.

Can we make any interesting predictions? Given three different costs and weights, we could solve for $n$, $p$, and $q$ in (1) and use the results to predict the prices for packages of other sizes. Because of the crudity of our model, it is unlikely that our equation will fit very well. We should not take the exact form of (1) too seriously. Another way to fit a curve, which allows for inaccuracies, is the method of least squares. For this to be a reasonable test of the model, we should have more data points than parameters. Since (1) involves three constants, we should have more than three values for the cost and weight of a single product. This is hard to obtain because of the limited number of different-sized packages in which a particular product is available. Therefore we need a different approach. The cost per ounce decreases at a rate

$$r = -\frac{d(\text{cost}/w)}{dw} = \frac{p}{3w^{4/3}} + \frac{q}{w^{2}}. \tag{2}$$
This is a decreasing function of $w$. Thus the increase in the rate of savings per ounce is less when the package is larger. We can also compute the rate of total savings:

$$rw = \frac{p\,w^{-1/3}}{3} + q\,w^{-1}.$$
It is also a decreasing function of $w$. The consumer is not likely to understand this. We can make a statement like (2) in simpler terms: In purchasing prepackaged products, doubling the size of the package purchased tends to result in greater savings per ounce when the packages are small than when they are large. You can prove this by taking the difference of (1) at $w$ and $2w$ and verifying that it is a decreasing function of $w$. We have said 'tends to' because the model is crude.

These predictions seem to rely heavily on the exact form of (1). Actually qualitative predictions like these are usually quite robust. It would be desirable to derive them from a more general model if we wished to pursue the model more seriously, but I don't know how to do this and I don't think that the problem is worth the effort.

This discussion concerned wholesale prices. What about retail prices? The retailer's costs depend on wholesale prices and handling and storage costs. As above, the latter two costs are of the form $Hw + M$. If the wholesaler sets his price at a fixed percentage above his costs, then we again obtain an equation of the form (1). The conclusions we reached above are therefore valid for retail prices too. In Problem 1 you are asked to study the model further and test it against actual data.
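
The qualitative claims drawn from (1) can be checked numerically. The following is a minimal Python sketch, not part of the original discussion: the values of n, p, and q are arbitrary positive constants chosen only for illustration, and the function names are hypothetical. It evaluates the cost per ounce from (1) at a sequence of doubled weights and checks that both the cost per ounce and the saving obtained by doubling the package size shrink as w grows.

```python
def cost_per_ounce(w, n=1.0, p=2.0, q=3.0):
    """Equation (1): cost/w = n + p*w**(-1/3) + q/w.

    n, p, q are illustrative positive constants, not values from the text.
    """
    return n + p * w ** (-1.0 / 3.0) + q / w


def doubling_saving(w, **constants):
    """Saving per ounce obtained by buying a package of weight 2w instead of w."""
    return cost_per_ounce(w, **constants) - cost_per_ounce(2 * w, **constants)


# Package weights in ounces, each twice the previous one.
weights = [1, 2, 4, 8, 16, 32]
costs = [cost_per_ounce(w) for w in weights]
savings = [doubling_saving(w) for w in weights]

for w, c, s in zip(weights, costs, savings):
    print(f"w = {w:2d}  cost/oz = {c:.3f}  saving from doubling = {s:.3f}")
```

Changing n, p, and q to other positive values should leave both columns decreasing, which is the sense in which the doubling rule does not depend on the particular constants.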
Speed of Racing Shells
In the college sport of crew racing the best times vary from class to class. Why? Can we advise a coach how to adjust the shells so that he can pit his teams against each other on an equal basis in practice? This model is adapted from T. A. McMahon's article (1971) and deals with data for men only.

Racing shells are boats propelled by oarsmen in sporting contests. They hold one, two, four, or eight oarsmen and are built to certain specifications. Figure 1 is a rough diagram of a racing shell.

Figure 1. (a) Top view. (b) Cross section of center. l = length; b = beam; A = cross-sectional area.

For an eight-man crew there is a lightweight category and a heavyweight category. Heavyweight oarsmen average about 86 kilograms, and lightweight oarsmen about 73 kilograms. This gives five classes. (There are others which we ignore because of a lack of data.) McMahon observed that there is a rather consistent difference between the best times of the various classes. Table 1 lists the information he presented on best times for 2000 meter races in four international competitions. The eight-man entry is the heavyweight time. McMahon also states that the time of an eight-man heavyweight crew is about 5% better than the time of an eight-man lightweight crew. We want to explain all this.

Table 1. Times of Racing Crews in Four Meets

Number of men    I       II      III     IV
8                5.87    5.92    5.82    5.73
4                6.33    6.42    6.48    6.13
2                6.87    6.92    6.95    6.77
1                7.16    7.25    7.28    7.17

Rather than present the underlying assumptions of the model in one ad hoc package, we develop them as we proceed. A shell is propelled by the power of the oarsmen and retarded by the drag of the water. The balance of these two forces determines the speed of the shell, hence its time in the race. We assume

1. The only drag force on the shell is due to skin friction and this force is proportional to $Sv^2$, where $S$ is the wetted surface area and $v$ is the velocity.

The expression for the skin friction drag given in the assumption is obtained from hydrodynamics. The power $P$ required to maintain velocity $v$ is, by definition, equal to the drag force times the velocity. Hence $P \propto Sv^3$, and so $v \propto (P/S)^{1/3}$. We assume

2. The oarsmen in the shell all have the same weight and the same constant power output for the entire course of the race.

It follows that $v$ is constant, except for the brief period when the shells are starting up. Hence the course time $t$ is proportional to $v^{-1}$, and so

$$t \propto \left(\frac{S}{P}\right)^{1/3}. \tag{3}$$
We now consider the time difference between the heavyweight and lightweight eight-man crews. We want to explain it and then see if we can find a way to redesign the shells so that the two classes will be more nearly equal. The subscripts H and L denote heavyweight and lightweight, respectively. From (3) we obtain

$$\frac{t_L}{t_H} = \left(\frac{S_L}{S_H}\right)^{1/3}\left(\frac{P_H}{P_L}\right)^{1/3}. \tag{4}$$
We must say something about power output and wetted surface area if we are going to explain the 5% edge of the heavyweight team. Unfortunately power output information is not obtainable; however, we know that the ratio of the weights of heavyweight and lightweight oarsmen is about 86 kilograms/73 kilograms = 1.18. Therefore we try to relate power and weight. Sustained power output depends on such factors as lung volume (actually lung surface area, but this is proportional to volume because the lungs consist of small cells whose size is independent of the size of the person) and muscle volume. For similarly proportioned people, these are proportional to the total weight. Hence we can expect power output to be proportional to the weight $w$ of an oarsman times the number of oarsmen. Since $w_H/w_L = 1.18$ and both shells have eight oarsmen, $P_H/P_L = 1.18$. Combining this with (4),

$$\frac{t_L}{t_H} = (1.18)^{1/3}\left(\frac{S_L}{S_H}\right)^{1/3} \approx 1.06\left(\frac{S_L}{S_H}\right)^{1/3}. \tag{5}$$

If we make the rough assumption that $S_L = S_H$, then (5) comes close to the 5% observed difference. Actually the surface area for a loaded heavyweight shell is slightly greater than that for a lightweight shell. When this is taken into account, the 6% edge in (5) decreases slightly. We haven't predicted the edge precisely, but we have explained why it is in the neighborhood of 5%.

How can the shells be redesigned to achieve equality? For fixed power output we obtain $t \propto S^{1/3}$ from (3). To change the time we must change the wetted surface area of the loaded shell. Let the subscripts $p$ and $r$ denote the present and redesigned shells, respectively. Then

$$\frac{t_r}{t_p} = \left(\frac{S_r}{S_p}\right)^{1/3}.$$

The lightweight crews will have times about equal to those of the heavyweight crews if $t_r/t_p = 0.95$. By the above equation, $S_r/S_p = 0.86$. In words, the wetted surface area of a loaded lightweight shell should be decreased by about 14%, or we could slow the heavyweights down by an increase of wetted area of about 16% (1/0.86 = 1.16).
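
The arithmetic behind the heavyweight edge and the redesign can be reproduced in a few lines. This is a small illustrative Python sketch, not taken from McMahon's article; the variable names are my own, and it assumes, as the text does, that power is proportional to oarsman weight and that $S_L = S_H$ in (4).

```python
# Average oarsman weights quoted in the text (kilograms).
heavy_kg = 86.0
light_kg = 73.0

# Assumption from the text: power is proportional to weight,
# so P_H / P_L equals w_H / w_L for two eight-man crews.
power_ratio = heavy_kg / light_kg

# Equation (4) with the rough assumption S_L = S_H:
# t_L / t_H = (P_H / P_L) ** (1/3).
time_ratio = power_ratio ** (1.0 / 3.0)

# Redesign: t_r / t_p = (S_r / S_p) ** (1/3), so a target time ratio
# of 0.95 requires a wetted-area ratio of 0.95 ** 3.
target_time_ratio = 0.95
area_ratio = target_time_ratio ** 3

print(f"P_H/P_L = {power_ratio:.2f}")
print(f"t_L/t_H = {time_ratio:.3f}")
print(f"S_r/S_p = {area_ratio:.2f}")
print(f"1/(S_r/S_p) = {1.0 / area_ratio:.2f}")
```

Because time depends only on the cube root of the power and area ratios, a difference of roughly 18% in power translates into a time difference of only about 6%, while a 5% change in time requires a much larger change in wetted area.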
We now compare the times of various-sized shells by expressing the endogenous variable, course time, in terms of the exogenous variable, team size. To do this we have to relate $S$ and $P$ to the size of the team. If assumption 2 is extended to all oarsmen in all shells, the power will be proportional to the number of oarsmen $n$. Hence (3) reduces to

$$t \propto \left(\frac{S}{n}\right)^{1/3}. \tag{6}$$

We need some information about the relative sizes of the various shells so that we can compute $S$. The information in Table 2 was presented by McMahon as evidence for the assumption:

3. The shells are geometrically similar, and their loaded weights are proportional to $n$. Furthermore, the submerged parts of the loaded shells are also geometrically similar.

Table 2. Shell Design Parameters

n    l        b        l/b     weight/n
8    18.28    0.610    30.0    14.7
4    11.75    0.574    21.0    18.1
2     9.76    0.356    27.4    13.6
1     7.93    0.293    27.0    16.3

Note: l = length; b = beam.

The variation in the 'l/b' and 'weight/n' columns shows that this is a rather crude assumption, but it is about the best we can do, since a table of wetted surface areas is not available. The volume of water displaced by a shell is proportional to its total weight. This volume is also proportional to $lA$. By assumption 3, weight is proportional to the number of oarsmen $n$, and $A \propto l^2$. Thus

$$n \propto lA \propto l^3. \tag{7}$$
The values of $l$ and $n$ listed in Table 2 do not satisfy $n \propto l^3$. Therefore the similarity assumption is wrong. What can we do about it? For the sake of continuity, we postpone discussing this problem.

The total submerged surface area is proportional to $l$ times the submerged perimeter of cross section $A$ in Figure 1. By assumption 3, this perimeter is proportional to $A^{1/2}$, which is in turn proportional to $l$. Thus $S \propto l^2$. From (7) we obtain $S \propto n^{2/3}$, and so (6) becomes

$$t \propto n^{-1/9}. \tag{8}$$

This yields the prediction: Times are proportional to the number of oarsmen raised to the power $-1/9$.
We can test this prediction by graphing $t$ versus $n$ in some fashion. It is much easier to see if points are close to a straight line than it is to see if they are close to a curve. For this reason relationships like (8) are usually plotted on what is called log-log paper. It gives the effect of plotting $\log n$ against $\log t$, which equals $c - (\log n)/9$ if (8) is correct. If you do this, you will discover that the points come close to lying on a straight line of slope $-1/9$ as predicted. For a least squares curve fit, see Section A.7, especially page 237.

We are in an awkward situation: the prediction in (8) has been verified, but the intermediate result in (7) is wrong. One possible explanation for this is that the central portions of the shells (which displace most of the water, hence are the most important) obey the similarity assumptions better than the ends of the shells. I do not have the data to check this possibility. This central length $\lambda$ and the cross section enter into the calculations for volume and surface area. A reasonable rough approximation is that volume and surface area are proportional to $\lambda^3$ and $\lambda^2$, respectively. The previous calculations can then be carried out with $\lambda$ replacing $l$.

We can give a more robust argument that leads to (8). The volume of the submerged portion of the shell is proportional to the weight of the loaded shell by Archimedes' law. The weight is very nearly proportional to $n$. Hence the volume is very nearly proportional to $n$. Since the shells are all approximately the same shape, the surface area is nearly proportional to the $2/3$ power of the volume. Hence $S$ is nearly proportional to $n^{2/3}$. By (6), $t \propto (n^{2/3}/n)^{1/3} = n^{-1/9}$. The important point in this argument is that surface area tends to remain proportional to the $2/3$ power of the volume, even when the shape varies somewhat from shell to shell. Thus we do not need the exact similarity assumption 3.
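
The log-log test of (8) can also be carried out numerically instead of on log-log paper. The sketch below is an illustration only: it fits a straight line by ordinary least squares to log t versus log n using the sixteen times in Table 1, which is one reasonable way to estimate the slope and is not necessarily the procedure of Section A.7. The variable names are my own.

```python
import math

# Best times from Table 1, keyed by the number of oarsmen.
times = {
    8: [5.87, 5.92, 5.82, 5.73],
    4: [6.33, 6.42, 6.48, 6.13],
    2: [6.87, 6.92, 6.95, 6.77],
    1: [7.16, 7.25, 7.28, 7.17],
}

# Build the log-log data: x = log n, y = log t, one point per race.
xs, ys = [], []
for n, crew_times in times.items():
    for t in crew_times:
        xs.append(math.log(n))
        ys.append(math.log(t))

# Ordinary least squares for the line y = intercept + slope * x.
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

print(f"fitted slope    = {slope:.3f}")
print(f"predicted slope = {-1.0 / 9.0:.3f}")
```

The fitted slope can then be compared directly with the predicted value of -1/9; the intercept plays the role of the constant c in the text.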
Size Effects in Animals
Why do animals have the proportions they do? You may have noticed that larger animals tend to have stockier bodies and relatively heavier legs. For instance, a deer is not a scale model of an elephant even if we neglect superfluous things like the head and the pelt. Why is the largest bird much smaller than a large mammal? Why can fleas jump so high relative to their size? (Is this the basis of flea circuses?)

Various people have applied proportionality arguments to biology. The books by N. Rashevsky (1960, pp. 251-275) and J. Maynard Smith (1968, pp. 6-17) contain a variety of examples from which the following discussion was adapted. You may also wish to read J. B. S. Haldane (1928). K. Schmidt-Nielsen's book (1972) is worth reading, but only a small part of it deals with scaling problems.

We want to study how the size of a quadruped affects its locomotion and the proportions of its body and limbs. The only locomotion question we consider is jumping. J. Maynard Smith (1968, p. 12) has observed that the height to which a jumping mammal can leap seems to be nearly independent of its size. In particular, he notes that a jerboa (a mouselike rodent) and a kangaroo can jump about equally high. We want to obtain some idea of what this may mean. If you wish a fuller exposition of movement, see the books mentioned above.

The structure of animals is quite complex, and so it is easy to build very involved models. Rather than becoming lost in a morass of complicated, uninterpretable results, we use very crude models. At a couple of critical points we'll unfortunately have to rely on some results from elasticity theory.

We now study how the dimensions of the body (trunk) of an animal are related to its weight. As a crude approximation, we think of the trunk of the animal as a flexible beam supported at the ends by the legs. Flexible beams have been well studied in elasticity theory, so there are results ready for our use. If a beam of length $l$, vertical thickness $t$, and cross-sectional area $A$ is subjected to a uniform load $F$ while its end points are held fixed, a result from elasticity theory states that the maximum deflection $\delta$ satisfies

$$\delta \propto \frac{Fl^3}{t^2 A}.$$

The force $F$ is due mainly to the weight of the trunk, which is roughly proportional to $lA$. Using this we see that

$$\frac{\delta}{l} \propto \frac{l^3}{t^2}, \tag{9}$$

where $\delta/l$ is the relative sagging. It is reasonable to suppose that there exists some physically determined upper limit to $\delta/l$ above which the animal's trunk will be cripplingly deformed. Some dog breeds (e.g., St. Bernard) may be at this limit. When $\delta/l$ is much below this limit, body material is being used unnecessarily for support. It is reasonable to suppose that such an inefficient use of body material is eliminated by evolution. Hence we treat $\delta/l$ as a constant. From (9) we obtain

$$t \propto l^{3/2}; \tag{10}$$

that is, larger animals have relatively thicker trunks. Rashevsky (1960, vol. 2, p. 263) has plotted $\log t$ against $\log l$ and found fair agreement with (10). The mass $m$ of the trunk is roughly proportional to $lA$. Since most animals have roughly similar cross sections, $A \propto t^2$ and so $m \propto lt^2$. Thus $m \propto l^4$ by (10). Combining these observations gives

$$l \propto m^{1/4}, \qquad t \propto m^{3/8}, \qquad \frac{t}{l} \propto m^{1/8}. \tag{11}$$
Interpret these results.

How does limb size vary with body weight? Our model here is even cruder than the previous one. The leg bones must be strong enough to withstand the bending strain put on them when the animal moves. From elasticity theory, the ability of a bone to withstand a force is proportional to its cross-sectional area $A_b$. Force equals mass times acceleration. For slow moving animals, acceleration is mostly due to gravity. For fast moving animals, accelerations are still about equal because they depend on the rate of muscle contraction, which has about the same maximum value in all species. Thus the force applied is proportional to the mass of the animal, and so $A_b \propto m$. If $d$ is the diameter of the leg bone, $A_b \propto d^2$ and so $d \propto m^{1/2}$. Note that, if everything remained in proportion for animals of different sizes, we would have $d \propto m^{1/3}$. Hence our model predicts that bone diameter increases faster than proportionally; that is, the legs of larger animals are relatively thicker than the legs of smaller animals.

How does the height an animal can jump depend on its size? To jump a height $h$ an animal of mass $m$ must do an amount of work proportional to $mh$. This work is accomplished by the muscles as the legs move from a crouched position at the start to a stretched position just before leaving the ground. The work that can be done by a muscle is proportional to its volume $V_m$, and so $mh$ is proportional to $V_m$. Thus

$$h \propto \frac{V_m}{m}. \tag{12}$$

If we make the plausible assumption that $V_m$ is proportional to $m$, the total mass of the animal, we obtain $h$ = constant from (12). However, it also seems plausible to assume that the cross-sectional area of the muscle is proportional to $A_b$, the cross-sectional area of the leg bones. Since $A_b \propto m$, it follows from (12) that $\lambda$, the length of the muscle, is proportional to $h$. Since $\lambda$ increases with size, this leads to the conclusion that $h$ increases with size.

Which approach is wrong and why? Actually, neither is correct. Rather than make 'plausible assumptions' in a naive fashion, we need to look at the situation structurally: What is it that determines the size of leg muscles? If the muscles are too strong, they will cause the leg bones or joints to break. A plausible but somewhat technical bioengineering argument leads to the conclusion that, if bone breakage is the major consideration, $V_m \propto A_b$. Thus (12) becomes

$$h \propto \frac{A_b}{m}. \tag{13}$$

If we accept our earlier conclusion that $A_b \propto m$, we obtain $h$ = constant. This conclusion was based on the idea that the importance of leg bone cross section derived from supporting the animal; however, we see from (13) that for jumping mammals the importance of leg bone cross section may derive from the height the animal wishes to jump. It would be interesting to study a table of $h$, $A_b$, $V_m$, and $m$ for jumping mammals. I have been unable to locate such data.

In fact, not many data are available to test our size effect models. Of course, one can always measure photographs or actual animals. Perhaps you'd like to do it. Besides the graphical data given by Rashevsky mentioned earlier, T. A. McMahon (1973) presents further graphical data, and D. D. Davis (1962) notes that in domestic cats and lions structures associated with locomotion satisfy mass relationships of the form $w \propto m^r$, where $r \geq 1$, while structures associated with metabolism have $r < 1$. W. R. Stahl and J. Y. Gummerson (1967) analyzed five species of primates (tamarins, squirrel monkeys, vervet monkeys, macaques, and baboons). Among their results are the following 95% confidence estimates for $r$ in $x \propto m^r$.

x                              r
Trunk height                   0.26-0.29
Chest circumference            0.35-0.38
Thoracic width                 0.27-0.35
Midshaft humerus diameter      0.39-0.45

It was not clear to me what 'trunk height' meant. The first two measurements fit the $l$ and $t$ results in (11) quite well, but the thoracic width does not fit the $t \propto m^{3/8}$ prediction. The humerus diameter measurement leads to a value of $r$ in $A_b \propto m^r$ considerably less than the predicted value of 1.

PROBLEMS

1. This problem relates to the model of the cost of packaging. The conclusion drawn from (1) that costs per ounce for larger packages are less holds for the data given at the end of this problem, but this is a relatively crude result. Equation (2) cannot be checked, because we cannot compute derivatives, only differences. Moreover, the rule on doubling the size of a package cannot usually be checked, since manufacturers tend to package products in odd sizes. We want a more flexible form of the doubling rule, and so we shall derive a finite difference analog of (2).

(a) Let $w_1 < w_2 < w_3$ be the weights of various-sized packages of a packaged product and $c_1$, $c_2$, and $c_3$ the costs per ounce of the packaged product. Derive the following result.

$$\frac{c_1 - c_2}{w_2 - w_1} > \frac{c_2 - c_3}{w_3 - w_2}.$$

Why is this analogous to the statement that $r$ is a decreasing function of $w$?

(b) The following data was collected at random in a supermarket in 1972. Test the result given in (a). The samples in each group came from the same store at the same time and were of the same brand. The packages within a group appeared similar except for the 12 and 32 ounce ketchup bottles. The former was labeled 'wide mouth' and the latter was labeled 'jug.' It may be relevant that the 5 and 10 pound bags of flour were on a shelf marked 'new low price.' The data in each table is taken from a single brand.

Powdered Milk
Ketchup Ounces
Quarts
$ 0.29 0.26 0.36 0.57
12 14 20 32
Flour Pounds
$
2 5 10
0. 1 5 0.25 0.45
8 15 29
0.49 1 .09 1 . 59 2. 1 9
3 8 14 20
Tomato Sauce Ounces
$
0.27 0.39 0.85
Detergent Powder Pounds 3 5 10
$
Ounces
$
1 4 11
0.8 1 1 .29 2.52

(c) To test the model further it would be desirable to make additional predictions that could be checked against the data. Can you make a testable prediction analogous to the statement that $rw$ is a decreasing function of $w$? Can you obtain any other qualitative predictions from the model, which can be tested with the data?

2. Can you think of any data that it would be reasonable to try to obtain and which would allow you to improve the model of the speed of racing shells?

3. T. A. McMahon has suggested that, if the lightweight eight-man shell were a scale model of the heavyweight eight-man shell when loaded [i.e., if the dimensions had the ratio $1 : (1.18)^{1/3}$], the 5% edge would be eliminated. Do you agree with this? Why? (Recall that we needed a ratio of redesigned to present surface areas of 0.86.)
4.
Smaller mammals and birds have faster heart rates than larger ones. If we assume that evolution has determined the best rate for each, why isn't there one single best rate ? Is there a model that leads to a correct rule relating heart rates ? A warm-blooded animal uses large quantities of energy in order to maintain body temperature, because of heat loss through its body surfaces. Since cold-blooded animals require very little energy when they are resting, the major energy drain on a resting warmÂ blooded animal seems to be maintenance of body temperature. Let's explore a model based on this idea. The amount of energy available is roughly proportional to blood flow through the lungs-the source of oxygen. Assuming the least amount of blood needed is circulated, the amount of available energy will equal the amount used. (a )
Set up a model relating body weight to basal (resting) blood flow through the heart. Use the data below to check your model. There are many animals for which pulse rate data is available but not blood flow data. Set up a model that relates body weight to basal pulse rate. What sort of assumptions do you need to make about hearts ? How could they be checked ? Use the data below to check your model. (c ) Discuss the discrepencies that arise in testing your models in (a ) and
(b)
After
(b).
working on the model you may wish to read M. Kleiber ( 1 9 6 1 , Ch. 1 0, especially pp. 1 99ï¿½209). It would be good if someone did this and reported on it.
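One way to check a pulse-rate model against the mammal data tabulated below is to fit a power law, pulse = c * weight^b, by least squares on logarithms and compare the fitted exponent b with the exponent your model predicts. The following is only a sketch: the name `fit_power_law` is hypothetical, and where the table gives a range the midpoint is used, which is an assumption rather than part of the data.

```python
import math

# (mammal, weight in kg, pulse in beats per minute), transcribed from the
# Altman and Dittmer table in this problem.
# Assumption: a tabulated range such as 20-25 is replaced by its midpoint.
MAMMALS = [
    ("Shrew", 0.0035, 782), ("Bat", 0.006, 588), ("Mouse", 0.017, 500),
    ("Hamster", 0.103, 347), ("Kitten", 0.117, 300), ("Rat", 0.252, 352),
    ("Guinea pig", 0.437, 269), ("Rabbit", 1.34, 251), ("Opossum", 2.7, 187),
    ("Seal", 22.5, 100), ("Goat", 33, 81), ("Sheep", 50, 75),
    ("Swine", 100, 70), ("Horse", 415, 44.5), ("Cattle", 500, 49.5),
    ("Elephant", 2500, 37.5),
]


def fit_power_law(weights, rates):
    """Least-squares fit of log(rate) = log(c) + b*log(weight); returns (c, b)."""
    xs = [math.log(w) for w in weights]
    ys = [math.log(r) for r in rates]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b = sxy / sxx                        # slope on the log-log plot
    c = math.exp(y_mean - b * x_mean)    # prefactor recovered from the intercept
    return c, b


c, b = fit_power_law([w for _, w, _ in MAMMALS], [p for _, _, p in MAMMALS])
print(f"pulse is roughly {c:.0f} * weight^{b:.2f}")
```

Because the rates may not be basal, the fitted exponent should be read as a rough check and not as a precise test of the model.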
Data on Mammals (Altman and Dittmer, 1964, pp. 234-235)

Mammal        Weight (kilograms)    Pulse (beats per minute)
Shrew         0.003-0.004           782
Bat           0.006                 588
Mouse         0.017                 500
Hamster       0.103                 347
Kitten        0.117                 300
Rat           0.252                 352
Guinea pig    0.437                 269
Rabbit        1.34                  251
Opossum       2.2-3.2               187
Seal          20-25                 100
Goat          33                    81
Sheep         50                    70-80
Swine         100                   60-80
Horse         380-450               34-55
Cattle        500                   46-53
Elephant      2,000-3,000           25-50

Note: Rates may not be basal.

Data on Humans (Spector, 1956, p. 279)

Age    Weight (kilograms)    Pulse (beats per minute)    Blood flow through heart (deciliters per minute)
5      18                    96                          23
10     31                    90                          33
16     66                    60                          52
25     68                    65                          51
47     72                    72                          40
33     70                    68                          43
60     70                    80                          46

Data on Some Mammals (Spector, 1956, p. 279)

Mammal    Weight (kilograms)    Blood flow through heart (deciliters per minute)
Rabbit    4.1                   5.3
Goat      24                    31
Dog       16                    22
Dog       12                    12
Dog       6.4                   11

Data on Small Birds (Altman and Dittmer, 1964, p. 235)

Bird           Weight (grams)    Pulse (beats per minute)
Hummingbird    4                 615
Wren           11                450
Canary         16                514
Sparrow        28                350
Dove           130               135

Data on Large Birds (Altman and Dittmer, 1964, p. 235)

Bird       Weight (grams)    Pulse (beats per minute)
Gull       388               401
Chicken    1,980             312
Vulture    8,310             199
Turkey     8,750             93
Ostrich    80,000            65

Note: Rates may not be basal.

5. In Gulliver's Travels, the Lilliputians decided to feed Gulliver 1728
times as much food as a Lilliputian ate. They reasoned that, since Gulliver was 12 times their height, his volume was $12^3 = 1728$ times the volume of a Lilliputian and so he required 1728 times the amount of food one of them ate. Why was their reasoning wrong? What is the correct answer?
6. When you hear something, how does the apparent intensity vary with the actual intensity? What about brightness, weight, and so on? In the nineteenth century Weber formulated a law stating that the just noticeable difference (jnd) in signal intensity is proportional to the intensity of the signal. The constant of proportionality varies from 0.003 for pitch to 0.2 for salinity. Fechner took Weber's law and assumed that all jnd's were psychologically equal for a given type of stimulus. This led to the Weber-Fechner law relating psychological intensity $S$, measured in jnd's, to physical intensity $F$: $S = g(F)$.
(a) Show that Weber's law states that, if $S_1$ and $S_2$ differ by 1 jnd, $\log F_1$ and $\log F_2$ differ by some constant $k$. Derive the Weber-Fechner law: If $S_1$ and $S_2$ differ by an integral number $N$ of jnd's, $\log F_1$ and $\log F_2$ differ by $kN$. Conclude that $k\,g(F) = \log F + c$.
(b) Sound loudness and star brightness are both measured in logarithms of energy (decibels and magnitude). Why is this done? Conclude from the Weber-Fechner law that, if $F_1/F_2 = F_3/F_4$, then $S_1 - S_2 = S_3 - S_4$; that is, the apparent intensities as measured by a person seem to differ by the same amount. This result is usually fairly accurate for intermediate values of intensity but is often inaccurate at extremes. However, Weber's law is usually fairly accurate over the entire range. How can this be? (Find the hidden assumption in Fechner's derivation.)
(c) Stevens discovered that equal ratios of physical intensity correspond to equal ratios of psychological intensity; that is,
    $F_1/F_2 = F_3/F_4$ if and only if $S_1/S_2 = S_3/S_4$.
Let $s_j = \log S_j$ and $f_j = \log F_j$. Suppose $f_2 = f_1 + \delta$ and $f_4 = f_3 + \delta$. Letting $\delta \to 0$, show that $ds/df$ is a constant. Describe the function $S = g(F)$ in Stevens' law.
(d) How can Weber's law and Stevens' law both be nearly true?
Various psychology texts discuss the subject of this problem, for example, E. Fantino and G. S. Reynolds (1975, pp. 220-226). For a more extended discussion see S. S. Stevens (1974, Ch. 1). A. Rapoport (1976) discusses this problem and other topics in mathematical psychology.
7. Atmospheric drag is roughly proportional to $Sv^2$, where $S$ is surface area and $v$ is speed, for many common objects (e.g., moving cars and falling bodies).
(a) If $v$ is the terminal velocity of a falling object, show that $v^2 \propto m^{1/3}$ for similarly proportioned objects.
(b) Show that on collision with the ground the kinetic energy per unit area that must be converted into some other form of energy is proportional to $m^{2/3}$.
(c) Discuss the effect of falling on animals of various sizes. Remember that larger animals have larger bones.
2.2. DIMENSIONAL ANALYSIS
Dimensional analysis is a tool of the physical sciences. It is based on the observation that physical quantities have dimensions associated with them and that physical laws remain unaltered when the fundamental units for measuring the dimensions are changed. For example, the area of a rectangle is the base times the height regardless of whether we measure in feet or meters as long as the units of area are $(\text{feet})^2$ or $(\text{meters})^2$, respectively. Dimensional analysis alone will not give the exact form of a function, but it can lead to a significant reduction in the number of variables. As a result, it may be much easier to prepare tables of a function experimentally.
A related usage of dimensional analysis is the design of scale models: It helps you face the problem of how to scale the physical parameters of the system so that predictions can be made for the real problem by analyzing the behavior of the scale model.
The examples presented here are adapted from L. I. Sedov (1959). The first book on the subject was written by P. W. Bridgman (1931). J. F. Douglas (1969) gives a recent, standard, elementary introduction to the subject. If you would like to read a text containing problems with solutions, see H. L. Langhaar (1951). S. J. Kline (1965) presents a critical introduction to dimensional analysis and related topics.
Theoretical Background
The basic physical dimensions are usually mass, length, and time. We denote them by $M$, $L$, and $T$. Since we can measure velocity in feet per second, it has the dimension of length/time. We express this by saying that the dimension of velocity is $L/T$. By Newton's law, force equals $d(mv)/dt$, where $m$ is mass, $v$ velocity, and $t$ time. Hence it has the dimension of $mv/t$, which is $M(L/T)/T = MLT^{-2}$. If all the terms in an equation have the same dimension, we say that the equation is dimensionally homogeneous. By our definition of the dimension of force, we have made Newton's law dimensionally homogeneous. Consider Newton's law of gravitation:
    $F = \dfrac{G m_1 m_2}{r^2}$,    (14)
where $G$ is a universal constant, $m_1$ and $m_2$ are the masses of two bodies, and $r$ is the distance between them. We have just determined that the dimension of the left hand side is $MLT^{-2}$. The dimension of $m_1 m_2 / r^2$ is $M^2 L^{-2}$. The two sides of the equation apparently have different dimensions. Actually, the value of the constant $G$ depends on the units of measurement and so is also given a dimension. To make the law of gravitation dimensionally homogeneous, the dimension of $G$ must be
    $\dfrac{MLT^{-2}}{M^2 L^{-2}} = M^{-1} L^3 T^{-2}$.
By assigning dimensions to variables and constants in this way, we can make all the laws of physics dimensionally homogeneous. This is not as surprising as it may sound. Almost everyone is aware to some extent that it is not correct to compare things that have different dimensions. The basic theorem of dimensional analysis is the Buckingham pi theorem. It can be stated as follows.
THEOREM. An equation is dimensionally homogeneous if and only if it can be put in the form
    $f(\pi_1, \pi_2, \ldots) = 0$,
where $f$ is some function and $\pi_1, \pi_2, \ldots$ are dimensionless products (and quotients) of the variables and constants appearing in the original equation. Not all dimensionless products need to be included in the list $\pi_1, \pi_2, \ldots$. Only a set from which all others can be formed by multiplication and division is needed.
It can be shown that the number of products in the list $\pi_1, \pi_2, \ldots$ need not exceed the number of variables and physical constants in the original equation. As an example of the theorem we return to the law of gravitation (14). Consider a product of the form
    $\pi = G^a m_1^b m_2^c r^d F^e$,
where the exponents $a$, $b$, $c$, $d$, and $e$ are arbitrary. The dimension of this product is
    $(M^{-1} L^3 T^{-2})^a M^b M^c L^d (MLT^{-2})^e = M^{b+c+e-a} L^{3a+d+e} T^{-2(a+e)}$.
From this we see that $\pi$ is dimensionless if and only if
    $b + c + e - a = 0$,  $3a + d + e = 0$,  $a + e = 0$.
We can choose $a$ and $b$ arbitrarily. Then $c = 2a - b$, $d = -2a$, and $e = -a$. Since $(a, b) = a(1, 0) + b(0, 1)$, all dimensionless products can be obtained from the two cases $(a, b) = (1, 0)$ and $(a, b) = (0, 1)$. These give
    $\pi_1 = \dfrac{G m_2^2}{r^2 F}$,  $\pi_2 = \dfrac{m_1}{m_2}$.
Buckingham's theorem tells us that any homogeneous equation involving only the values of $G$, $m_1$, $m_2$, $r$, and $F$ can be put in the form $f(\pi_1, \pi_2) = 0$. For example, the law of gravitation is of this form, since it can be written as $\pi_1 \pi_2 - 1 = 0$. Note that we had to include $G$, even though it is a universal constant. Everything that can enter into the function must be included.
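The bookkeeping in this example can be checked mechanically by representing each dimension as a triple of exponents of M, L, and T and adding exponents when quantities are multiplied. The sketch below is illustrative only: the helper name `dimension` and the numerical values in the final check are assumptions chosen for the example.

```python
from fractions import Fraction

# Dimensions as (M, L, T) exponent triples, as assigned in the text.
DIMS = {
    "G": (-1, 3, -2),
    "m1": (1, 0, 0),
    "m2": (1, 0, 0),
    "r": (0, 1, 0),
    "F": (1, 1, -2),
}


def dimension(product):
    """Return the (M, L, T) exponents of a product given as {name: exponent}."""
    total = [Fraction(0)] * 3
    for name, power in product.items():
        for i, d in enumerate(DIMS[name]):
            total[i] += Fraction(power) * d
    return tuple(total)


# The dimension forced on G by the law of gravitation: F r^2 / (m1 m2).
print(dimension({"F": 1, "r": 2, "m1": -1, "m2": -1}))

# The two dimensionless products found above.
pi1 = {"G": 1, "m2": 2, "r": -2, "F": -1}
pi2 = {"m1": 1, "m2": -1}
print(dimension(pi1), dimension(pi2))

# Numerical check that the law of gravitation reads pi1 * pi2 = 1.
# The masses and distance are arbitrary illustrative values in SI units.
G, m1, m2, r = 6.674e-11, 5.0, 3.0, 2.0
F = G * m1 * m2 / r**2
pi_product = (G * m2**2 / (r**2 * F)) * (m1 / m2)
print(pi_product)
```

Exact rational arithmetic is used for the exponents so that fractional powers, which arise in other problems, would be handled without rounding.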
Two comments should be made about the mechanics of doing dimensional analysis. First, it is not always evident what should be included in the list of relevant physical variables and constants. The only sure guide is good intuition. Second, if you have had some linear algebra, you should recognize the procedure we went through to obtain $\pi_1$ and $\pi_2$: We found a basis for the two-dimensional subspace of $R^5$ that makes the exponents of $M$, $L$, and $T$ in $\pi$ equal to zero. Such a basis was given by $(a, b, c, d, e) = (1, 0, 2, -2, -1)$ and $(a, b, c, d, e) = (0, 1, -1, 0, 0)$. This procedure works in general: Find the exponents of $M$, $L$, and $T$ in terms of the exponents of the variables and constants appearing in $\pi$; then find a basis for the null space of these exponents. Each basis vector determines one of the dimensionless products $\pi_i$ mentioned in the Buckingham pi theorem.
By formalizing the above idea we can obtain a proof of the Buckingham pi theorem. Here is a sketch for those who are familiar with linear algebra. Let $X_1, X_2, \ldots, X_k$ be the physical quantities we are studying. Define
    $f(X_1^{a_1} X_2^{a_2} \cdots X_k^{a_k}) = (a_1, a_2, \ldots, a_k)$.
This sets up a natural one-to-one correspondence between products of powers of $X_i$ and the elements of $R^k$. We can replace each $X_i$ by its dimensions and define another map $d$ like $f$ but this time into $R^n$. (Usually $n = 3$ for $M$, $L$, $T$.) Consider $df^{-1}$. It is a linear transformation from $R^k$ to $R^n$. Let $b_1, \ldots, b_j$ be a basis for the null space and extend it to a basis $b_1, \ldots, b_k$ for $R^k$. Define $\pi_i = f^{-1}(b_i)$. We can express $X_i$ as products of powers of the $\pi_i$, since the $b_i$ form a basis for $R^k$. Hence any physical law expressed in terms of $X_i$ can be expressed in terms of the $\pi_i$. For every $m > j$ there is a change in the units of measurement which changes $\pi_m$ but leaves the other $\pi_i$ unchanged. Since the laws of physics are assumed to be independent of the units of measurement, the law we are considering must be independent of $\pi_m$. Thus it depends only on $\pi_1, \ldots, \pi_j$, and these are all dimensionless since $b_1, \ldots, b_j$ lie in the null space of $df^{-1}$.
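The null-space recipe described above can be carried out with exact rational arithmetic. The following sketch applies it to the law of gravitation. The function name `null_space` and the ordering of the columns are choices made for this illustration: placing $G$ and $m_1$ last makes their exponents the free parameters, which matches the choice of $a$ and $b$ as arbitrary in the text. Any other basis of the same null space would serve equally well.

```python
from fractions import Fraction


def null_space(rows):
    """Return a basis of {v : A v = 0} for a small rational matrix A (list of rows)."""
    a = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(a[0])
    pivots = []   # pivot column of each reduced row, in order
    r = 0         # index of the next row to receive a pivot
    for c in range(n_cols):
        if r == len(a):
            break
        pivot = next((i for i in range(r, len(a)) if a[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column, so it is a free variable
        a[r], a[pivot] = a[pivot], a[r]
        lead = a[r][c]
        a[r] = [x / lead for x in a[r]]
        for i in range(len(a)):
            if i != r and a[i][c] != 0:
                factor = a[i][c]
                a[i] = [x - factor * y for x, y in zip(a[i], a[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        v = [Fraction(0)] * n_cols
        v[free] = Fraction(1)
        for row, p in zip(a, pivots):
            v[p] = -row[free]   # solve for each pivot variable
        basis.append(v)
    return basis


# Columns: exponents of m2, r, F, G, m1 (G and m1 last, so they are free).
NAMES = ["m2", "r", "F", "G", "m1"]
DIMENSION_MATRIX = [
    [1, 0, 1, -1, 1],    # exponent of M contributed by each quantity
    [0, 1, 1, 3, 0],     # exponent of L
    [0, 0, -2, -2, 0],   # exponent of T
]

basis = null_space(DIMENSION_MATRIX)
for v in basis:
    print({name: str(x) for name, x in zip(NAMES, v)})
```

The first basis vector corresponds to $G m_2^2 / (r^2 F)$ and the second to $m_1 / m_2$, the two products obtained by hand.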
The Period of a Perfect Pendulum
Legend has it that Galileo's interest in motion began when he observed a hanging lamp in the Pisa cathedral swinging back and forth. This is an example of a pendulum. How fast does a pendulum swing? How does the period of the swing vary with the length? The weight? The angle of swing? We consider a pendulum in which all the mass is concentrated at a distance $l$ from a perfect pivot and there are no frictional forces. From observation or theory it can be determined that the motion of a frictionless pendulum is periodic with some period $t$. Since we want to derive a formula for $t$, it is our (only) endogenous variable.
What quantities should enter into such a formula? In other words, what are our exogenous variables? The length $l$ of the pendulum, the mass $m$ of the pendulum, the acceleration $g$ due to gravity, and the maximum angle $\theta$ the pendulum makes with the vertical appear to form a complete list. (Since gravity is involved, you may wonder why $G$ and the radius of the earth are not on the list. The only effect of gravity is to provide a force equal to $mg$ acting on the pendulum. Both $m$ and $g$ are on our list.) We now show that all dimensionless products can be formed from
    $\pi_1 = \dfrac{t^2 g}{l}$,  $\pi_2 = \theta$.
The procedure is the same as the one we just used for the law of gravitation. We know the dimensions of $m$, $l$, and $t$. The acceleration $g$ has dimension $LT^{-2}$. Since an angle is measured by the ratio of arc length to radius, the angle $\theta$ is dimensionless. Thus the product $\pi = m^a g^b t^c l^d \theta^e$ has dimension $M^a L^{b+d} T^{c-2b}$. The exponents vanish if and only if $a = 0$, $b + d = 0$, and $c - 2b = 0$. It follows that we can choose $b$ and $e$ arbitrarily and that $a = 0$, $c = 2b$, and $d = -b$. We obtain $\pi_1$ from $(b, e) = (1, 0)$ and $\pi_2$ from $(b, e) = (0, 1)$. Note that $m$ does not appear, because no other quantity has $M$ in its dimension. Since the period of a pendulum is a physical law, Buckingham's theorem applies. Solving $f(\pi_1, \pi_2) = 0$ for $\pi_1$ gives $\pi_1 = h(\pi_2)$ for some function $h$. Therefore
    Period $= t = k(\theta) \sqrt{l/g}$,    (15)
where $k^2 = h$. The exact form of the function $k(\theta)$ must be determined by other means. It turns out to be an elliptic integral and very nearly equal to $2\pi$ when $\theta$ is small.
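To see how close $k(\theta)$ stays to $2\pi$, one can evaluate it numerically. The sketch below relies on a standard result that is not derived in the text: the complete elliptic integral in the pendulum period can be computed from the arithmetic-geometric mean, which gives $k(\theta) = 2\pi / \mathrm{agm}(1, \cos(\theta/2))$ for a maximum angle $\theta$ below $\pi$. The function names and the default value of `g` are assumptions made for this example.

```python
import math


def agm(a, b, tol=1e-14):
    """Arithmetic-geometric mean of two positive numbers."""
    while abs(a - b) > tol * a:
        a, b = (a + b) / 2, math.sqrt(a * b)
    return a


def k(theta):
    """Dimensionless factor in t = k(theta) * sqrt(l / g); theta in radians, 0 <= theta < pi."""
    return 2 * math.pi / agm(1.0, math.cos(theta / 2))


def period(length, theta, g=9.81):
    """Period in seconds of a frictionless pendulum; g = 9.81 m/s^2 is an assumed value."""
    return k(theta) * math.sqrt(length / g)


for degrees in (1, 10, 45, 90):
    theta = math.radians(degrees)
    print(degrees, k(theta) / (2 * math.pi))
```

The printed ratios show that the small-angle value $2\pi$ is a good approximation for modest swings, while at a maximum angle of 90 degrees the period is about 18 percent longer.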
Scale Models of Structures
Suppose you are an engineer and wish to study how a structure you've designed will hold up. Since theoretical analysis of a complicated structure is likely to be impossible, it is convenient to study a scale model. How should you design the model and how should the observations you make on it be translated into predictions about the real structure? We answer these questions here.
Unless they are greatly deformed, most structures can be reasonably approximated by assuming that they are built of materials that are elastic and isotropic. (These are technical terms.) The important physical consequence of this assumption is that, except for specifying shapes and forces, we need only two parameters to determine the changes in shape (called deformations). One is Poisson's ratio $\sigma$, which is dimensionless. (It is the ratio of the percentage changes in the dimensions of a bar perpendicular and parallel to a compressive force.) The other is Young's modulus $E$. (It is the ratio of the compressive force per unit area to the percentage change in the parallel dimension.) The dimension of Young's modulus is $ML^{-1}T^{-2}$. The important thing is not how $\sigma$ and $E$ are defined, but rather the fact that as far as deformations are concerned they are the only relevant inherent properties of the material(s).
What are the relevant variables and physical constants? The endogenous variables are the deformations $\delta$ of the structure. Our structure has some characteristic length $l$ by which we can relate all lengths of the scale model to those of the real structure. The specific gravity (weight per unit volume) $\gamma$ may also be important. Weight per unit volume is density times acceleration due to gravity and so has dimension $ML^{-2}T^{-2}$. $E$ and $\sigma$ have already been mentioned. Finally there are the forces $F$ which are loading the structure at various points. Our list of relevant quantities is $l$, $\sigma$, $\gamma$, $E$, $F$, and $\delta$. Actually, all of these except $l$ should be subscripted to indicate that there may be several different materials and a variety of forces. All dimensionless products can be formed from the products
    (16)
and $\delta_k / l$. Each $\delta_k$ is determined by $\sigma_i$, $\gamma_i$, $E_i$, $F_i$, and $l$. It follows from this and Buckingham's theorem that $\delta_k / l$ is a function of the various products in (16). Therefore, if the quantities in (16) are the same for the scale model and the real structure, all deformations will be scaled according to the scaling of $l$.
We must therefore keep $\sigma_i$ the same for the materials in the scale model and the real structure. The easiest way to do this is to use the same materials in both cases. Then all the $E_i$ and $\gamma_i$ will be the same for the real structure and the scale model. From the third relation in (16) it follows that the two values of $l$ must be the same, so the model is the same as the real structure. How can we get around this? It is the density of a material that is constant; the specific gravity $\gamma$ varies with the gravitational field. If we could adjust $\gamma$ by changing the gravitational field, this would adjust $l$. How can we change the gravitational field? Since acceleration due to gravity is like any other acceleration, we can effectively increase 'gravity' by using a centrifuge. This technique is actually used. Suppose the ratio of the scale model to the real $l$ is $1 : r$. By the third expression in (16), the centrifuge must produce an acceleration $r$ times that due to gravity. By the last expression, the model forces should be $r^{-2}$ times the real forces. As a check of what we have been doing, let's look at a force $F$ due to weight. It equals $mg$ and so is proportional to $\gamma l^3$. This changes by a factor of $r \cdot r^{-3} = r^{-2}$ as desired.
There is another approach to building scale models. The specific gravity of the materials is important only because it determines forces on the structure due to gravity. If we expand the list $F_j$ to include these forces as well, we can neglect the specific gravities, hence the third type of ratio in (16). The fourth ratio tells us that forces must be proportional to $l^2$; however, if the gravitational field is unchanged (no centrifuge), the gravitational forces will vary as $l^3$. (Why?) To compensate for this the scale model of the structure can be loaded at various points with weights equal to the difference between these two quantities. This may make it necessary to measure forces at many points, but it eliminates the centrifuge. Without dimensional analysis these ideas for building scale models would have been hard to discover.
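The centrifuge argument can be checked with a few lines of arithmetic. The sketch below assumes that the third and fourth ratios in (16) take the forms $\gamma l / E$ and $F / (E l^2)$. Both are dimensionless and lead to the conclusions stated above, but their exact form is an assumption made here, as is the function name `model_requirements`.

```python
def model_requirements(r):
    """Scaling factors (model value / real value) for a 1:r model made of the same materials.

    Assumes sigma and E are unchanged and that gamma*l/E and F/(E*l**2)
    must match between the model and the real structure.
    """
    length = 1.0 / r
    # gamma * l / E fixed with E fixed: gamma must scale as 1 / length,
    # so the centrifuge must multiply the acceleration by this factor.
    acceleration = 1.0 / length
    # F / (E * l**2) fixed with E fixed: forces must scale as length**2.
    force = length ** 2
    return {"length": length, "acceleration": acceleration, "force": force}


factors = model_requirements(10.0)
# A force due to weight is proportional to gamma * l**3.
weight_factor = factors["acceleration"] * factors["length"] ** 3
print(factors, weight_factor)
```

For a 1:10 model the centrifuge must supply ten times the acceleration of gravity, and weight forces then scale by the same factor of 1/100 as the applied loads.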
PROBLEMS
1. This problem relates to the pendulum model. We want to include frictional effects.
(a) Suppose that the frictional force is due primarily to air and is proportional to $v^2$ with a constant of proportionality $K$. The value of $K$ depends on the shape of the pendulum. Let $r$ be the time required for the pendulum to reach half its initial amplitude $\theta$. Argue physically that $v$ is determined by $m$, $l$, $g$, $K$, $\theta$, and the elapsed time. Show that
(b) Deduce a similar result if the frictional force is proportional to $v$.
(c) Using the results of (a) and (b), describe an experiment for deciding which (if either) of the assumptions about the dependence of the frictional force on $v$ is correct. Hint: Consider a pendulum with a hollow weight which can be filled.
2. Why do stringed musical instruments have strings of different lengths and thicknesses? The fundamental frequencies of vibrations of strings of similar material depend primarily on length $l$, mass per unit length $\mu$, and tension (force) $F$ on the string.
(a) Derive the formula for the fundamental frequency $\omega$ for similar materials:
    $\omega \propto \dfrac{\sqrt{F/\mu}}{l}$.
(b) In terms of the above result, explain the structure of a nylon string, six-string guitar. There are several structural constraints imposed on the instrument. Design and playing considerations dictate that the strings must be of the same length and cannot have either too large or too small a diameter, and impose upper and lower limits on the tensions in the strings. The frequency of the low string is only one-fourth the frequency of the high string. There is of course no need to explain these facts. The following properties of the guitar should be explained. When playing the guitar, different notes are obtained by using the fingers to shorten the length of various strings. Tuning is accomplished by adjusting the tension on the strings. The strings vary in thickness and in the material of which they are made. Roughly speaking there are three thicknesses ($T_1 < T_2 < T_3$) and two materials, nylon ($N$) and steel-wrapped nylon ($S$). The strings, from highest frequency to lowest frequency, are $NT_1$, $NT_2$, $NT_3$, $ST_1$, $ST_2$, and $ST_3$.
(c) If you are familiar with the structure of another stringed instrument, interpret it as much as possible using the ideas in (a) and (b).
(d) One of my students (R. T. Oberndorf) collected data to check the formula in (a). He used guitar strings with an arrangement for changing the tension and the length. He found that for a given string $\omega l$ was constant to within the accuracy of his measurements when $F$ was held fixed. The same was true for $\omega / \sqrt{F}$ with $l$ held fixed. However, when he used various strings but fixed $l$ and $F$, he found that $\omega \sqrt{\mu}$ was not constant. The largest deviations occurred with the thinnest strings and the highest tension, $\omega$ being higher than predicted. Suggest some possible explanations.
(e) Let's take the material of the string into account. We assume that the material is elastic and isotropic. Thus we need only consider Poisson's ratio $\sigma$ and Young's modulus $E$. See the scale models of structures model in this section for a brief discussion of $\sigma$ and $E$. Show that
    $\omega = K\!\left(\dfrac{E l^2}{F}, \sigma\right) \dfrac{\sqrt{F/\mu}}{l}$.
Use Oberndorf's result to show that $K$ depends only on $\sigma$ for guitar strings.
3. How long should you roast a turkey? Cookbooks usually give directions in the form: 'Set the oven to $T_0$ degrees and allow $n$ minutes per pound for cooking.' For turkey, which can range in weight from about 7 pounds to about 30 pounds, a range of roasting times may be given. In this case, one cookbook recommends cooking for 15 to 25 minutes per pound, the longer time to be used for smaller birds. We study this in a problem adapted from S. J. Kline (1965).
(a) A piece of meat is cooked when its minimum internal temperature reaches a certain value dependent on the type of meat and the desired doneness. Let the cooking time $t$ be the endogenous variable. Present an argument to show that the exogenous variables are the difference in temperature $\Delta T_m$ between the raw meat and the oven, the difference in temperature $\Delta T_c$ between the cooked meat and the oven, some characteristic dimension $l$ of the meat, and some measure $K$ of the ability of the meat to conduct heat. The usual measure of ability to conduct heat is thermal conductivity, which is the amount of energy crossing a unit cross-sectional area per second divided by the temperature gradient perpendicular to the area. Hence $K$ is measured in
    [energy/(area x time)] / [degrees/length].
The dimension of energy is $ML^2T^{-2}$. Temperature is measured in energy per unit volume.
(b) Determine the dependence of cooking time on the weight for similar pieces of meat for which $\Delta T_m$ and $\Delta T_c$ are the same.
(c) Discuss the accuracy of the cookbook rule. Comment on the rule for turkeys.
4.
Waves seem to roll in at a beach in a regular fashion, but their speed seems to vary from place to place and, perhaps, from time to time. Why? Does something similar happen out at sea as well? We discuss wave motion in a perfect fluid; that is, a fluid with no viscosity or compression. Let the endogenous variable be the velocity $v$ of a wave.
(a) Argue that the exogenous variables are acceleration $g$ due to gravity, the density $\rho$ of the liquid, the length $\lambda$ of the wave, the height $h$ of the wave, and the depth $d$ of the liquid.
(b) When the height of a wave is small compared to its length, it is known that we can approximate the equations of motion by equations that do not contain $h$. Conclude that we can ignore $h$ in this case.
(c) Show that
    $v = \sqrt{\lambda g}\, f\!\left(\dfrac{\lambda}{d}\right)$.
(d) Show that $v$ is nearly proportional to $\sqrt{\lambda g}$ when $d$ is large compared with $\lambda$. Thus wave speed at sea varies with the wavelength.
(e) Suppose we want to build a scale model to study the effect waves on the open ocean have on boats. How should everything be scaled? If all linear dimensions are scaled by a factor of $r$, what happens to the time it takes a wave to travel the length of the boat? Hint: You may also wish to refer back to the model dealing with scale models of structures.
(f) When $d$ is small compared with $\lambda$, the bottom interferes with the wave, so that $\lambda$ is practically irrelevant. Show that $v$ is nearly proportional to $\sqrt{dg}$ in this case. The British government has used this result to obtain depth surveys in certain remote coastal areas. Two pictures were taken of the same region at slightly different times so that wave speed could be measured (R. Carson, 1961, p. 109).
CHAPTER 3
GRAPHICAL METHODS
3.1. USING GRAPHS IN MODELING
Graphs can be very useful in modeling if you are aware of their uses and limitations. Since many people expect either too much or too little from them, we discuss their uses and limitations before going into specific models.
People can take in an entire picture rather quickly and then deduce consequences by using their geometric intuition. It follows that graphs should be useful in conveying information. Those wonderful analog computers people carry in their skulls can rapidly locate certain patterns in visually presented data. One of the easiest to spot is a straight line. For this reason a variety of forms of graph paper (rectangular, polar, log-log, normal probability, etc.) are marketed so that plotted data will appear linear if the anticipated relationship exists.
Graphs are most useful in conveying qualitative relationships or approximate data which involve only a few variables. A graphical approach to a problem is most likely to be useful when not much information is available or when it is given in a rather imprecise form. Analytical methods are usually more appropriate when more precise information is available. In complex simulation models, graphs are frequently used to illustrate the qualitative behavior of several time varying endogenous variables simultaneously. This helps one obtain a qualitative feel for the behavior of a complicated simulation model.
So far we have talked about graphs primarily as a way of presenting data. Now let's consider some major roles graphs play in model formulation. Since our imagination is limited to three dimensions, graphical representations of the interrelations of more than three variables are not directly useful. However, it is often possible to graph a function with most variables held fixed and then determine how the graph will change when one of the fixed variables is changed. This is the heart of the geometric approach to comparative statics which is discussed in Section 3.2. The differential calculus approach parallels the geometric arguments and provides a firm foundation for making statements when any number of variables is involved. The basic problem of comparative statics can be stated as follows: How does the equilibrium point of a system move when certain exogenous variables are changed? For example, how will the output of a firm be affected by a higher tax rate?
Graphical methods are also useful in studying stability questions. The analytical treatment of local and global stability theory is not easy. Therefore it is desirable to use graphical methods whenever possible to suggest and perhaps prove results. Section 3.3 touches on this approach. For a treatment of the problems of stability theory from an analytical viewpoint see Chapter 9.
As a glance at the figures in this chapter shows, the intersections of curves are of major importance in comparative statics. This is because they determine the equilibrium points. A subtler observation is that slopes of curves play a central role in stability questions. The slope of a curve is a rate, and rates play a crucial role in stability theory.
Finally, graphical arguments are useful in optimization problems, especially if the model is not quantitative. Since this straddles Chapters 3 and 4, I've decided to put it in Section 4.2.
3.2. COMPARATIVE STATICS
The Nuclear Missile Arms Race
The United States and the U.S.S.R. both feel that they require a certain minimum number of intercontinental ballistic missiles (ICBMs) to avoid 'nuclear blackmail.' The idea is to ensure that enough missiles will survive a sneak attack so that 'unacceptable damage' can be inflicted on the attacker. Given this philosophy, it is claimed by some and denied by others that the introduction of antiballistic missiles (ABMs) and/or multiple warheads on each missile (MIRVs) will cause both nations to increase their stock of missiles. Is this true? What about making missiles less vulnerable to attack by hardening silos or building missile firing submarines? The wrong answers to these questions could have drastic consequences. Who is right? Obviously we cannot hope to settle the debate. However, a simple graphical model can shed some light on the problems involved and hopefully help lead to more intelligent debate. The following discussion is adapted from T. L. Saaty (1968, pp. 22-25). We deal with two countries which we call country 1 and country 2. Let $x$ and $y$ be the number of missiles possessed by countries 1 and 2,
respectively. We treat $x$ and $y$ as real numbers. Of course they are actually integers; but since they are large, the relative errors introduced by treating them as real numbers will be small; for example, the percentage difference between 500 and 500.5 is quite small. For the time being we assume that all missiles are the same and are equally protected.

From the above discussion it follows that there exist continuous, increasing functions $f$ and $g$ such that country 1 feels safe if and only if $x > f(y)$, and country 2 feels safe if and only if $y > g(x)$. These functions are plotted in Figure 1. The shaded region is the area in which armaments are stable, since both countries feel they have sufficient weapons to prevent a sneak attack. We consider questions such as: Does such a region actually exist? What effect do such things as ABMs, MIRVs, and so on, have on the point $A = (x_m, y_m)$?

Figure 1. Country 1 introduces ABMs. A = initial status (shaded area stable); B = country 1 protects its missiles; C = country 1 protects its cities. Axes show number of missiles.

First we show that the solid curves in Figure 1 are qualitatively correct. Let's look at things from the point of view of country 1. A certain number of missiles $x_0$ is needed to inflict what is considered unacceptable damage on country 2. When country 2 has no missiles, country 1 requires $x_0$. We show that for any $r > 0$ the curve $x = f(y)$ crosses the line $y = rx$. It suffices to show that there is a function $x(r)$ such that, whenever $x \ge x(r)$ and $y = rx$, country 1 believes that it has enough missiles so that the number surviving a sneak attack by country 2 will be able to inflict unacceptable damage on country 2. In other words, country 1 wants to be practically certain of at least $x_0$ of its missiles surviving a sneak attack by country 2.

Suppose that $y = rx$. To destroy the most missiles, country 2 should aim about $r$ missiles at each of country 1's missiles. Since a warhead may fail to reach and destroy its target, there is some probability, $p(r) > 0$, that a given missile belonging to country 1 will survive a sneak attack. Thus country 1 can expect $x\,p(r)$ missiles to survive. For large enough $x$ this will exceed $x_0$ by an amount large enough to allow for uncertainties. This completes the proof that the curves intersect. Thus the curve $x = f(y)$ starts at $(x_0, 0)$ and curves upward with a slope increasing to $\infty$. By a symmetry argument, $y = g(x)$ has the form shown, with a slope decreasing to 0. Two such curves meet at exactly one point, which we call $(x_m, y_m)$, the minimum stable values for $x$ and $y$. This analysis applies to all the situations discussed below, so there is always a unique minimum stable point. We want to know how its position in each situation compares with the original $(x_m, y_m)$.

Suppose the missiles of country 1 are made less vulnerable to sneak attack by the use of hardened silos, ABM protection, or some other means. This increases the probability $p(r)$ that any given missile belonging to country 1 will survive a sneak attack. Hence the curve $f(y)$ moves to the left with the point $x_0$ fixed. The shape of the curve is altered somewhat in the process. The new curve is shown dashed in Figure 1. We can see that both countries require fewer missiles for stability.

Suppose that country 1 protects its cities by some device such as ABMs. Country 2 now requires more than $y_0$ missiles to inflict unacceptable destruction on country 1. Thus the curve $g(x)$ moves upward, as shown by the curve marked x - x - x in Figure 1. Both countries require more missiles for stability.
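The graphical argument can be checked numerically once a survival probability is chosen. The following is only a minimal sketch under assumptions that are not in the text: it takes $p(r) = s^r$, where the hypothetical parameter `survive_one` ($s$) is the chance that a missile survives a single attacking warhead, and it treats "feels safe" as the expected number of survivors reaching $x_0$ exactly, ignoring the allowance for uncertainty. The function names are likewise illustrative.

```python
def required_missiles(enemy, need, survive_one):
    """Smallest own force x whose expected survivors reach `need`.

    Assumed model (not from the text): the enemy aims enemy/x warheads
    at each missile, and a missile survives with probability
    survive_one ** (enemy / x).
    """
    def expected_survivors(x):
        return x * survive_one ** (enemy / x)

    lo = hi = float(need)
    while expected_survivors(hi) < need:   # bracket the answer from above
        hi *= 2.0
    for _ in range(200):                   # bisection; survivors increase with x
        mid = 0.5 * (lo + hi)
        if expected_survivors(mid) < need:
            lo = mid
        else:
            hi = mid
    return hi


def minimum_stable_point(x0, y0, s1, s2, rounds=200):
    """Alternate x = f(y), y = g(x), starting from (x0, y0), until the values settle."""
    x, y = float(x0), float(y0)
    for _ in range(rounds):
        x = required_missiles(y, x0, s1)   # the curve x = f(y)
        y = required_missiles(x, y0, s2)   # the curve y = g(x)
    return x, y


if __name__ == "__main__":
    base = minimum_stable_point(100, 100, 0.5, 0.5)
    hardened = minimum_stable_point(100, 100, 0.8, 0.5)  # country 1 hardens its silos
    print("initial status:", base)
    print("country 1 protects its missiles:", hardened)
```

Under these assumptions, raising `s1` moves both coordinates of the minimum stable point toward the origin, which is the qualitative conclusion drawn from the dashed curve in Figure 1.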
What happens if multiple warheads are installed? This situation is more complicated than the previous two. Suppose country 1 replaces the single warheads on each of its missiles with $N$ warheads. It will then require that fewer of its missiles survive a sneak attack. (The number required is about $x_0/N$.) Thus $x = f(y)$ moves to the left as in Figure 2. Country 2 will be faced with $N$ times as many warheads in a sneak attack, so from its point of view the scale of the $x$ axis has changed by about a factor of $N$, as shown in Figure 2. It appears that country 2 will require more missiles, and country 1 will require fewer; however, this depends on the detailed shape of the curves. Therefore probabilistic models should be used instead of, or in conjunction with, graphical ones. This would require us to make more precise assumptions regarding the capabilities of the missiles, so we do not go into it here.

Figure 2. Country 1 introduces MIRVs. Axes show number of missiles.
It seems unreasonable to assume that country 2 will not also develop and deploy multiple warheads if country 1 does. Therefore we should analyze the situation in which both countries deploy multiple warheads. There are two conflicting effects:

1. Since the axes measure missiles, the points $[f(0), 0]$ and $[0, g(0)]$ will move toward the origin, tending to decrease $(x_m, y_m)$.
2. $f(y)$ becomes more horizontal and $g(x)$ becomes more vertical, tending to increase $(x_m, y_m)$.
We cannot decide without further information which effect will dominate. T. L. Saaty (1968, p. 24) presents an analytical model which leads to the conclusion that both countries will require many more missiles.

In the above discussion, we assumed that all missiles were the same. This is unrealistic. If we drop this assumption, each country will change its strategy by aiming different numbers of missiles at the various enemy missiles. Among the possible targetings, one makes the expected surviving firepower a minimum. This targeting gives the curves for Figure 1, and the analysis proceeds as before.

You may be interested in the article by K. Tsipis (1975a), which contains a discussion of the technology behind ultra-accurate MIRVs.

Biogeography: Diversity of Species on Islands
The diversity of species varies considerably from place to place, even when the habitats appear to be the same. Conservationists have argued that the size of a region is important for diversity, and so they often favor a few large wilderness areas rather than many tiny ones. The subject is far from understood. We study one corner of it briefly.

The world is broken into patches of differing habitats. Often a habitat a species finds acceptable is surrounded by a large expanse of unacceptable territory. Examples are alpine meadows, farm woodlots, lakes, game preserves, and islands. The following discussion is confined to islands; however, most of the ideas and results apply to other types of isolated habitats. The material is adapted from R. H. MacArthur and E. O. Wilson (1967, Ch. 3), which treats the subject in much greater depth.

Studies have indicated that the size of an island is an important factor in determining the number of species the island is likely to contain. Also, islands closer to the mainland tend to contain a greater variety of species than more isolated islands. It seems reasonable that the effects of migration of species and extinction of species (on islands) can account for this. We develop this idea and briefly consider some of its consequences.

A species can become established on an island only by migrating to it and prospering there. An organism migrates by flying, being carried, drifting on currents, and so on. Since a population on an island is relatively small, it can die out because of random variations in the environment. As a result we expect the list of species present on an island to change much faster than the list of species present on the mainland.

This is somewhat vague. Does a flock of migrating birds that stops on the island for a day or a season become established and then die out? Even if a species 'intends' to stay on the island, we are still faced with the problem of what we mean by 'become established': if the island is too small to support a large population, the species will always be on the verge of extinction. When is a species established in this case? Since we are dealing with a fairly crude model, we can afford to ignore these problems. A more refined model would have to come to grips with them.

If we completely understood all the aspects of the situation (e.g., the biological, geographical, and meteorological), we could determine the probability of a particular species composition being present on the island at a given future time. These would be tremendously complex calculations involving vast quantities of data, and this approach would be hopeless.
Let's combine practically all the endogenous variables into one measure: the total number of species present on the island. It seems reasonable to suppose that this should vary around some average number of species in a steady state situation. We discuss this average. For a discussion of transient behavior see R. H. MacArthur and E. O. Wilson (1967, Ch. 3) or E. O. Wilson and W. H. Bossert (1971, Ch. 4).

When the number of species present on the island is in equilibrium, migration and extinction cancel out numerically; that is, the rate of migration of new species to the island equals the rate of extinction of species already on the island. These rates depend in a complicated way on the species present, the season, and many other factors. If we regard a year as a short period of time, seasonality will present no problem. In this sort of crude averaging over many species, which species are actually present probably doesn't matter much. Therefore it makes sense to talk about rates in a crude way independent of which species are actually present on the island.

In Figure 3 are plotted the number $N$ of species on the island versus the migration and extinction rates. The two smaller graphs illustrate the effect of distance from the mainland and the effect of island size. We discuss the reasons for the shapes and positions of the curves.

Figure 3. Migration and extinction curves for islands. (a) Typical curves. (b) Effect of distance. (c) Effect of size.

Let's consider the extinction rate curves. When more species are present on the island, the chances that at least one species will become extinct in a given time are greater. Hence the extinction rate curves have a positive slope. Since extinction rates depend only on the island and the species present, the extinction curve is not affected by the distance from the mainland. However, we can expect that a species is more likely to die out on a small island because the lack of space keeps the population lower. Thus the extinction rate curves shift upward as the islands become smaller.

Why do the migration rate curves have a negative slope? The migration rate relates to species not present on the island. The greater the diversity on the island, the smaller the pool of potential migrating species on the mainland. Hence the chances of migration decrease as the number of species on the island increases. Migration rates depend on the distance of the island from the mainland and on the size of the island. The rates decrease with distance, because any given organism is less likely to reach the island. The rates increase with island size because (1) an organism has a larger land area as a target and (2) an organism is more likely to be able to establish itself on a larger island.

It follows from Figure 3 that the number of species present increases with island size and decreases with distance from the mainland. This is not so surprising, since we practically put these results in as initial assumptions.

We can say something about species turnover by looking at the graphs a bit more. Note that the equilibrium extinction rate (which equals the equilibrium migration rate) is greater for near islands than for far islands. Hence the species composition for two islands of equal size should change more rapidly on the island closer to shore. If the effect of island size on migration rate is not too great, we can similarly conclude that the species composition changes faster on small islands than on large islands. Since small islands have fewer species at equilibrium than large islands, this effect should be quite noticeable.

There is some data supporting the conclusion that species turnover is relatively and absolutely more rapid on smaller islands. R. H. MacArthur and E. O. Wilson (1967, pp. 52-54) discuss the results of two botanical surveys of some small islands off the Florida Keys. The first survey (1904) was conducted quickly and so may be incomplete. Since the 1916 survey was quite complete, the species present in 1904 and absent in 1916 give some measure of the turnover rate. Unfortunately the data involve only six islands, two of which are very small.

R. H. MacArthur and E. O. Wilson (1967, pp. 55-60) also report on some results of R. Patrick. She suspended glass slides in a spring in Pennsylvania and counted the number of diatoms of various species that were present. The glass slides can be thought of as islands. Four experiments were done two times each: A glass slide with an area of either 12 or 25 square millimeters was placed in the water for either 1 or 2 weeks. The slides submerged for 1 week had more species present than those submerged for 2 weeks. We can explain this apparent contradiction by observing that as a barren area becomes more populated the interaction between species may cause extinction. This was not allowed for in our model. Clearly care must be taken in modeling islands that are far from equilibrium. Because of this, we do not consider the 1 week data further. We can check out two predictions using the 2 week data:

1. Larger area implies more species: The smaller slides had 24 and 21 species, and the larger had 29 and 28.
2. Smaller area implies a higher migration rate: The migration rate may be reflected somewhat in the differences in the species composition of the slides. (Why?) Seven species appeared on one but not both of the smaller slides. For the larger slides the number was one.

Theory of the Firm
You are the manager of a firm which produces, among many other items, 'zowies.' How can you decide on a level of production? The price of the main raw material for your zowies is going to increase. Perhaps you can pass some of the cost on to your customers. How much? Can you pass on enough to make it worthwhile to continue manufacturing zowies? Quantitative results are hard to obtain because data collection is extremely difficult; however, we can obtain a qualitative picture of the situation fairly easily.

In the usual theory of the firm it is assumed that the manager of the firm has complete information, that his decisions are carried out, and that he acts so as to maximize the profits of the firm. There is an ongoing debate about the usefulness of these assumptions, but we don't want to get into that here. If you are interested in the subject see R. M. Cyert and J. G. March (1963, pp. 5-16) for a discussion of both sides of the question. In addition to the above assumptions, we generally assume (as is often done in economic theory) that the functions with which we are dealing are well behaved; that is, they are continuous and usually differentiable.

The theory of the firm is discussed in most textbooks on mathematical economics. There also exist books devoted exclusively to the topic, such as K. J. Cohen and R. M. Cyert (1965). Consult such sources if you wish to see the ideas in this example developed further.

For simplicity we assume that the firm produces only one product, so that we can speak unambiguously of the level of production. It is measured in units per time period, where a time period can be a day, a month, or any other convenient interval. We want to find a way to determine the level of production, so that we can discuss the influence of changing costs and prices on the production level.

Suppose the production of the firm is at some equilibrium level. Since profits are being maximized, the additional cost that would be incurred in raising production slightly is equal to the additional gross income that would be obtained by marketing these additional units of the product. You should convince yourself that this is simply a restatement of the calculus theorem that the function Total gross income - Total cost has maxima and minima where its derivative vanishes. The additional cost required to produce one additional unit is called the marginal cost, and the additional income is called the marginal income. In general, both marginal cost and marginal income are functions of the level of production.

We have shown that marginal cost equals marginal income at equilibrium. This equality could imply that the profit is a minimum instead of a maximum. How can we distinguish one from the other? If we move away from the equilibrium, profits must decrease. Thus the marginal cost curve must lie above the marginal income curve for higher production levels and below it for lower production levels. This is shown in Figure 4, where the horizontal axis is the quantity produced per unit time. This is the basic result with which we work.

Figure 4. Marginal cost and income curves. Axes show quantity produced per unit time and dollars per unit time.

Although marginal cost and marginal income may seem to be straightforward concepts, they can be a bit fuzzy. During a short period of time (the short term), wages and the cost of raw materials are fixed costs, because they have been contracted for; consequently, they do not enter into marginal calculations. From a slightly longer point of view, they are both variable costs and so enter into the marginal costs. Equipment depreciation is a fixed cost; but maintenance, fuel, and replacement costs enter into the marginal calculations. Since our marginal curves vary with how long a view we take, the optimum level of output may depend on the length of time we want to consider. In the following discussion we make the vague assumption that the manager is concerned with the firm's profits over a reasonably long time interval. As long as we don't try to make any detailed applications, we can afford to be vague.

What effect will taxation have on production? If the firm is required to pay a lump sum tax independent of production (e.g., a property tax), the marginal curves will not be affected. Hence the production level will be unchanged. If the firm is required to pay a tax that depends on the level of production (e.g., an income tax or a value-added tax), the result will depend on whether or not the tax is passed on to the consumer. If it is not passed on, the marginal cost curve will rise. We have shown that the marginal cost curve intersects the marginal income curve from below at a maximum. It follows that the new intersection will be to the left of the old one. Therefore the production level will decrease. What will happen if the tax is passed on to the consumer? In this case both marginal curves will move upward by an amount equal to the tax per unit of production, and the production level will be unchanged.

The above result on taxation can be generalized considerably, and we can state another result on the income side:
The optimum level of production moves in the opposite direction from the marginal cost and moves in the same direction as the marginal income.
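The taxation argument can be illustrated with a small numerical sketch. The linear marginal curves, their coefficients, and the tax of 1.6 per unit below are hypothetical choices made only for illustration; the text assumes nothing about the curves beyond good behavior and the marginal cost curve crossing the marginal income curve from below.

```python
def optimum_output(marginal_cost, marginal_income, q_low=0.0, q_high=1000.0):
    """Bisection for the output where marginal cost equals marginal income.

    Assumes marginal cost lies below marginal income at q_low and above it
    at q_high, i.e. the cost curve crosses the income curve from below.
    """
    for _ in range(100):
        q_mid = 0.5 * (q_low + q_high)
        if marginal_cost(q_mid) < marginal_income(q_mid):
            q_low = q_mid      # producing more still adds to profit
        else:
            q_high = q_mid     # producing more would reduce profit
    return 0.5 * (q_low + q_high)


# Hypothetical curves (dollars per unit as a function of output q).
def mc(q):
    return 2.0 + 0.05 * q      # marginal cost rises with output


def mi(q):
    return 10.0 - 0.03 * q     # marginal income falls with output


TAX = 1.6                      # hypothetical tax per unit produced

base = optimum_output(mc, mi)
tax_absorbed = optimum_output(lambda q: mc(q) + TAX, mi)
tax_passed_on = optimum_output(lambda q: mc(q) + TAX, lambda q: mi(q) + TAX)

if __name__ == "__main__":
    print("no tax:", base)
    print("tax absorbed by the firm:", tax_absorbed)
    print("tax passed on to consumers:", tax_passed_on)
```

With these curves, raising only the marginal cost lowers the optimum output, while raising both curves by the same amount leaves it where it was. A lump sum tax does not appear in either marginal curve, so it cannot move the intersection.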
Convince yourself that this is true by giving a graphical argument.

Suppose the price of raw materials increases. This raises the marginal cost, so the production level tends to decrease. Decreased production may cause consumers to drive up the cost (per unit) of the product, thereby increasing the producer's marginal income. Consequently the level of production will rise. Since the product now costs more, the amount purchased by consumers will probably be less. Thus the increase in cost will not be quite enough to push production back to its original level. We discuss this in terms of supply and demand curves.

In industries where the number of firms is large, it is reasonable to suppose that the price per unit of product does not depend on the amount any single firm produces. In this case the marginal income curve is horizontal. The marginal cost curve is then the supply curve for the firm's product, since at a price $p$ the firm produces the quantity $Q$ at which the marginal cost equals $p$. Since the marginal income curve is horizontal, our earlier discussion shows that the supply curve must have a positive slope to ensure stability. This agrees with the intuitive notion that higher selling prices lead to greater production.

The demand curve is the amount of the product that will be purchased at a given price. Usually demand falls as price increases. Figure 5 shows typical supply and demand curves. At equilibrium, the quantity purchased must equal the quantity sold. Hence the intersection of the supply and demand curves gives the equilibrium values of price and quantity.

Figure 5. Supply and demand curves. Increased marginal costs shift supply curve upward to dashed position. Axes show quantity $Q$ (horizontal) and price (vertical).
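The intersection of the supply and demand curves, and the way it moves when the supply curve shifts upward, can be computed directly once the curves are given. The sketch below assumes straight-line curves, demand $p = a - bQ$ and supply $p = c + dQ$; the coefficients and function names are hypothetical and serve only to make the geometry of Figure 5 concrete.

```python
def equilibrium(a, b, c, d):
    """Intersection of demand p = a - b*q with supply p = c + d*q."""
    q = (a - c) / (b + d)
    return q, a - b * q


def price_pass_through(a, b, c, d, shift):
    """Fraction of an upward supply shift that appears in the equilibrium price."""
    _, p_old = equilibrium(a, b, c, d)
    _, p_new = equilibrium(a, b, c + shift, d)
    return (p_new - p_old) / shift


if __name__ == "__main__":
    # Same supply curve, two hypothetical demand curves of different steepness.
    flat = price_pass_through(a=10.0, b=0.01, c=2.0, d=0.05, shift=1.0)
    steep = price_pass_through(a=60.0, b=0.50, c=2.0, d=0.05, shift=1.0)
    print("nearly flat demand curve:", flat)
    print("steep demand curve:", steep)
```

For straight lines the fraction works out to $b/(b+d)$, so it depends only on the two slopes: it is small when the demand curve is nearly flat and close to 1 when the demand curve is steep.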
From Figure 5 we can see how much of the increased marginal costs will be passed on to the consumer. The dashed curve shows the supply curve (marginal cost curve) after the marginal costs have increased. The flatter the demand curve, the greater the fraction of the increase the producer must absorb. What does a flat demand curve mean? It indicates that consumer buying patterns are very sensitive to price. Thus, if consumer buying patterns are insensitive to price, you can pass most of your increased expenses on to the consumer.

What about the theory of a firm that produces several products? It is better to study such a situation using tools from calculus. However, our graphical analysis indicates the sort of results we can expect to find in this case.

PROBLEMS
Problems 1 to 5 deal with the arms race model.

1. Suppose that both countries install $N$ warheads in each missile and that the new warheads are as effective as the old ones. Show that both countries will require more warheads.

2. Suppose a country is able to retarget missiles in flight so as to aim for missiles that previous warheads have failed to destroy. Discuss the effect.

3. Various criteria have been used to evaluate proposed changes in missile systems. Try to evaluate the changes discussed in the text and the problems on the basis of (a) economics (cost) and (b) amount of radioactivity released in the event of a war.

4. There are aspects of the armaments race that become important only when a country is not as heavily armed as the United States and the U.S.S.R. When a country is just developing a nuclear strike force, it may be able to inflict heavy damage with a first strike but may be incapable of a retaliatory strike.
   (a) Develop a model and use it to explain 'preventive war.' Can you apply the model to the People's Republic of China?
   (b) Can you model the early years of the missile race?
   This is a rather unclear area, so class discussion may lead to a variety of ideas. You may wish to consult M. D. Intriligator (1973).

5. The United States and the U.S.S.R. signed an arms limitation agreement in May 1972. The number of offensive missiles allowed each country is limited, with a trade-off formula for land-based versus submarine-based missiles. There is no limitation on the use of multiple warheads or on improving missile technology. Each country is limited to two ABM sites of 100 missiles each. One site is for protection of the capital city and the other for protection of an ICBM site.
   (a) Discuss this agreement in light of the models presented here. Include any relevant later agreements in the discussion. Politics is more complicated than our simple model, so you will have to weigh various factors that might affect the model's validity.
   (b) How can the model be improved to help in answering (a)?

6. Will a group of small islands have more or fewer species per island than an isolated small island? Assume that all the islands are about the same distance from the mainland and the same size.

7. Discuss what happens in the model dealing with the theory of the firm if the marginal cost curve does not intersect the marginal income curve.

8. In the short term, ordinary wages are a fixed cost and overtime wages are a marginal cost.
   (a) Explain the previous statement.
   (b) Show that the marginal cost curve has a discontinuity at the level of production corresponding to full usage of labor without overtime.
   (c) What effect will this have on the results developed in the model of production by a firm?
3.3. STABILITY QUESTIONS

Cobweb Models in Economics
We consider the dynamics of supply and demand when there is a fairly constant time lag in production as, for example, in agriculture. It has been observed that there are fairly regular price fluctuations in such situations. This situation was studied by economists in the 1920s and 1930s. The problem contrasts sharply with the theory of the firm in Section 3.2, where we ignored time. The following discussion is adapted from M. Ezekiel (1937/8).

When a commodity is marketed, the selling price is determined by the demand curve. This price is one of the factors producers use in determining how to alter production. In a 'pure' situation, they produce the amount on the supply curve that corresponds to the present price. (Supply and demand curves are discussed more fully in the theory of the firm model in Section 3.2. There we were interested in the intersection point of the curves.) Thus (see Figure 6), if the amount of potatoes produced in year 1 is $q_1$, the price per bushel will be $p_1$. As a result, farmers will decide to produce the amount $q_2$ in year 2, the market will set a price $p_2$ per bushel for this crop, and so on. Because of the picture, this idea is referred to as the cobweb theorem. In practice one does not know the supply and demand curves, but the above model predicts that the demand curve can be obtained by plotting $(q_n, p_n)$ and the supply curve by plotting $(q_n, p_{n-1})$.

Figure 6. The cobweb model. Axes show quantity $Q$ (horizontal) and price (vertical).

How realistic is this model? The existence of a supply curve assumes that producers can control output perfectly. This is not true in the agricultural sector where weather is very important, but it may be a reasonable approximation. If the supply and demand curves move erratically, the model will be upset. Changes in prices for other goods the supplier may produce, sudden changes in demand (e.g., the sale of wheat by the United States to the U.S.S.R. in 1972), and sudden changes in supply (e.g., crop blights) may cause this to happen. If the suppliers have some understanding of price fluctuations, they will not raise production levels much in spite of higher prices. However, this does not wreck the model. In this case the supply curve will be nearly independent of price near the equilibrium price, but the model will still apply. It predicts small fluctuations in supply and a rapid approach to stability. Plot this.

Ezekiel presented the material on U.S. potato production contained in Table 1. He obtained it from the Bureau of Agricultural Economics.
Table 1. Potato Production in the United States

Year   10^4 acres   Bushels/acre   10^6 bushels   Farm price   Deflated price
1921   360          90             325            114          121
1922   395          106            419            69           68
1923   338          108            366            92           93
1924   311          124            384            71           71
1925   281          106            296            166          162
1926   281          114            322            136          140
1927   318          116            370            108          113
1928   350          122            427            57           59
1929   302          110            332            132          142
1930   310          110            341            92           116
1931   347          111            384            46           68
1932   355          106            376            39           62
1933   341          100            342            82           114
1934   360          113            406            45           57
1935   355          109            386            60           74
1936   306          108            330            111          132
Discuss what should be used as 'quantity' and what should be used as 'price' in a cobweb plot and construct the plot. Should the model be modified because the yield per acre is not constant? What about the effect of population growth during the 15 year period? What about the effect of the Depression? Clearly there is a lot of noise (i.e., disturbances we can't hope to take into account in a simple model) in the data. Thus we should see if the data fit the model better than a random set of data would. Can you propose a method for doing this?

From the supply and demand curves near equilibrium it is easy to make a prediction concerning stability. If the negative of the demand curve's slope exceeds the slope of the supply curve, there will be instability; if it is less, stability. Convince yourself of this. Demand for some agricultural products is rather inflexible. When production is sensitive to price, the model predicts instability. The government can attempt to eliminate this by controlling production or prices. The former causes the supply curve to become vertical (or nearly so) above (and/or below) certain ranges of quantity. This keeps the instability from growing further. (Draw a graph to convince yourself.) What is the effect of price control? For a further discussion of cobweb models see N. S. Buchanan (1939) and, for a recent generalization, M. S. Mudahar and R. H. Day (1974).
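The stability rule can be tried out by iterating the cobweb directly. The sketch below assumes straight-line curves, demand $p = a - bq$ and supply $p = c + dq$, so that $b$ is the negative of the demand slope and $d$ is the supply slope; the coefficients, the starting quantity, and the function names are hypothetical and are not taken from Table 1.

```python
def cobweb_path(q_start, a, b, c, d, years):
    """Iterate the cobweb for linear curves.

    Demand: p = a - b*q (this year's crop sets this year's price).
    Supply: p = c + d*q (this year's price sets next year's crop).
    Returns the list of (quantity, price) pairs, one per year.
    """
    path = []
    q = q_start
    for _ in range(years):
        p = a - b * q            # price read off the demand curve
        path.append((q, p))
        q = (p - c) / d          # next crop read off the supply curve
    return path


def deviations(path, q_equilibrium):
    """Distance of each year's quantity from the equilibrium quantity."""
    return [abs(q - q_equilibrium) for q, _ in path]


if __name__ == "__main__":
    # Both cases have equilibrium quantity (a - c) / (b + d) = 100.
    stable = cobweb_path(120.0, a=10.0, b=0.03, c=2.0, d=0.05, years=4)
    unstable = cobweb_path(120.0, a=10.0, b=0.05, c=2.0, d=0.03, years=4)
    print("b < d:", deviations(stable, 100.0))
    print("b > d:", deviations(unstable, 100.0))
```

In the linear case each year's deviation from equilibrium is $b/d$ times the previous one, with alternating sign, so the path spirals inward when $b < d$ and outward when $b > d$.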
Phase Planes
The previous model dealt with the stability of a difference equation. A similar procedure is used for differential equations. This requires the notion of a phase plane, which is also used in Chapter 9. Suppose we are dealing with the two equations

$$x' = f(x, y), \qquad y' = g(x, y). \tag{17}$$

At each point $(x, y)$ in the $x$-$y$ plane we can plot a vector proportional to $(x', y')$. This is called the direction field of (17). To graph a solution of (17) we then start at an initial point and follow a path parallel to the direction field. (Since the direction field varies from point to point, the path is usually curved.) The speed is determined by the magnitude of the vector tangent to the path at that point. If we start at a point with $f = g = 0$, we will not move from it. Such points are called equilibrium points.

Since we have only crude information about $f$ and $g$, our phase plane diagrams cannot be this detailed. To answer stability questions it is often sufficient to plot the two curves $f = 0$ and $g = 0$ and indicate roughly the vectors $(x', y')$ in the neighborhood of these curves. The intersections of the curves are the equilibrium points of (17). The curve $f = 0$ divides space into two regions such that $x' > 0$ in one and $x' < 0$ in the other. If you determine which region is which for $f = 0$, and likewise for $g = 0$, the rest will be easy. The vectors cross $f = 0$ vertically, and the direction will be upward if and only if $g > 0$. Similarly, they cross $g = 0$ horizontally, and the direction will be rightward if and only if $f > 0$. See Figure 7 for an example.

In plotting $f = 0$ and $g = 0$, it is helpful to determine the slopes of the curves. This can be done by implicit differentiation: For $f = 0$,

$$\frac{dy}{dx} = -\frac{\partial f/\partial x}{\partial f/\partial y},$$

and similarly for $g = 0$. It is important to remember that the partial derivatives for the slope of $f = 0$ are evaluated at values of $x$ and $y$ at which $x$ is at equilibrium; that is, $x' = 0$. (This is important in determining the sign of $\partial f/\partial x$ in Problem 4a.) The partial derivatives also help decide which region corresponds to $f > 0$ and which to $f < 0$: $f > 0$ to the right of (or above) $f = 0$ if and only if $\partial f/\partial x > 0$ (or $\partial f/\partial y > 0$).
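Following a path through the direction field is easy to mimic numerically when $f$ and $g$ are known. The sketch below uses small Euler steps, which is a swapped-in numerical technique and not part of the graphical method, and a hypothetical pair $f(x, y) = y$, $g(x, y) = -x - y$ chosen only because its single equilibrium point at the origin is stable.

```python
import math


def follow_path(f, g, x, y, dt=0.01, steps=2000):
    """Trace a solution of x' = f(x, y), y' = g(x, y) by small Euler steps.

    Each step moves a short distance along the direction-field vector
    (f, g) at the current point.
    """
    path = [(x, y)]
    for _ in range(steps):
        dx, dy = f(x, y), g(x, y)
        x, y = x + dt * dx, y + dt * dy
        path.append((x, y))
    return path


# Hypothetical example system with its only equilibrium at the origin.
def f(x, y):
    return y            # f = 0 on the x axis, so paths cross it vertically


def g(x, y):
    return -x - y       # g = 0 on the line y = -x, crossed horizontally


if __name__ == "__main__":
    path = follow_path(f, g, 1.0, 0.0)
    x_end, y_end = path[-1]
    print("distance from equilibrium at the end:", math.hypot(x_end, y_end))
```

Starting at $(1, 0)$, where $f = 0$ and $g < 0$, the first step is straight down, as the rule for crossing $f = 0$ predicts, and the path then spirals in toward the equilibrium point.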
Small-Group Dynamics
You wish to set up a local committee to help elect a candidate to office. What keeps a group together and working? Does more work improve a task-oriented group or harm it? Very little mathematical modeling has been done in this area and, unfortunately, the following is rather crude and lacking in practical advice.

We want to study the stability and comparative statics of a group which has a required activity imposed from the outside (a task). The model is taken from H. Simon (1952), who based it on a nonmathematical model proposed by G. C. Homans (1950). There are four basic functions of time:

I(t), the intensity of interaction among the group members.
F(t), the level of friendliness among the group members.
A(t), the amount of activity within the group.
E(t), the amount of activity imposed on the group by the external environment.

The variables can be treated as averages over all group members or as some overall measure for the entire group. We regard I, F, and A as endogenous variables and E as an exogenous variable which we generally treat as being constant.

To make the concepts more concrete, let's consider an example. The imposed activity E is the laying in of firewood. The group may be engaged in this for wages, or they may be friends preparing for winter. The various activities A include locating wood sources, sawing logs, stacking logs, and setting up a football pool. Note that some activities may not be directed toward the externally imposed task. G. C. Homans (1950, p. 101) says, 'By our definition interaction takes place when the action of one man sets off the action of another.' 'Action' here refers to activity, so that activity is required for interaction, but not conversely: a person can work alone. The many situations in our example that involve interaction include discussing where to obtain wood, working opposite ends of a saw while cutting logs, passing wood from one person to another in stacking, and conversing idly. Some of the interaction is necessary, but a lot of it can be reduced considerably. The same is true of activity, as any efficiency expert knows; however, this may involve changes in habit patterns and so require more time.

There are three relations on which the model is based:

1. I(t) depends on A(t) and F(t) in such a way that it increases if either A or F does. The adjustment is practically instantaneous.
2. F(t) depends on I(t). It tends to increase when it is too low for the present level of interaction and to decrease if there is not enough interaction to sustain its present level. This adjustment requires time, and the rate of adjustment is greater when the discrepancy between present and equilibrium levels is greater.
3. A(t) depends on F(t) and E(t). It tends to increase when it is too low for the present level of F or E and to decrease when it is too high. This adjustment requires time, and the rate of adjustment is greater when the discrepancy between present and equilibrium levels is greater.
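Criticize the assumptions. These assumptions can be turned into equations:

$$I(t) = r(A, F), \qquad \frac{\partial r}{\partial A} > 0, \quad \frac{\partial r}{\partial F} > 0, \tag{18a}$$

$$F'(t) = s(I, F), \qquad \frac{\partial s}{\partial I} > 0, \quad \frac{\partial s}{\partial F} < 0, \tag{18b}$$

$$A'(t) = \psi(A, F; E), \qquad \frac{\partial \psi}{\partial A} < 0, \quad \frac{\partial \psi}{\partial F} > 0, \quad \frac{\partial \psi}{\partial E} > 0. \tag{18c}$$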
The reasoning behind $\partial s/\partial F < 0$ and $\partial\psi/\partial A < 0$ deserves an explanation. The same idea applies to both cases. Let's consider $\psi$. If A, F, and E are at some level, $\psi = A'$ will be determined. If we now increase A, we will either reduce the pressure for A to increase (if $\psi > 0$) or increase the pressure for A to decrease (if $\psi < 0$). In either case $\partial\psi/\partial A < 0$. By substituting (18a) into (18b) we obtain

$$F'(t) = \varphi(A, F), \qquad \frac{\partial \varphi}{\partial A} = \frac{\partial s}{\partial I}\,\frac{\partial r}{\partial A} > 0. \tag{19}$$
This equation says that a high level of A tends to cause F to increase. The effect of a high level of F is ambiguous: It may tend to cause F to increase or decrease. The statement that $\partial\varphi/\partial F > 0$ can be interpreted as: 'The greater the friendliness, the faster it tends to increase (or the slower it tends to decrease, if it is decreasing).' While this may be true at some points in the A-F plane, it is unlikely to be true when F is large because of limits on friendliness. We assume that $\partial\varphi/\partial F < 0$ everywhere.

The curves $\psi = 0$ and $\varphi = 0$ are plotted in Figure 7. The slope of the curves is positive, since, for example, on the curve $\psi = 0$, $dF/dA = -(\partial\psi/\partial A)/(\partial\psi/\partial F) > 0$. The slope of the $\psi = 0$ curve is increasing, because we assume a saturation effect: When A and F are both large and $A' = 0$, a fairly large increase in F is required to balance a small increase in A. In other words, the group tends to resist increases in activity more when it is already quite active. Discuss the curve $\varphi = 0$. Verify the general shape of the direction field shown in the figure. It can be seen that the upper equilibrium point is stable and that the lower one is unstable.

[Figure 7. Dynamics in the activity-friendliness plane. Horizontal axis: A; vertical axis: F.]

We now consider the effect of changing E. We have

$$\Delta\psi \approx \frac{\partial\psi}{\partial A}\,\Delta A + \frac{\partial\psi}{\partial F}\,\Delta F + \frac{\partial\psi}{\partial E}\,\Delta E.$$

Since $\Delta\psi = 0$ on the curve $\psi = 0$, it follows from (18c) that, when $\Delta A = 0$, $\Delta E$ and $\Delta F$ have opposite signs. Thus the $\psi = 0$ curve moves downward as E increases. Hence the equilibrium levels of A and F are increasing functions of E.

If the $\psi = 0$ curve moves sufficiently far up, it will no longer intersect the $\varphi = 0$ curve, and so there will be no equilibrium point. In this case the group will not continue to exist. Consequently it is possible that a group will break up if externally imposed activity falls below a certain level.
P R O B LE M S
1. Discuss modifications of the cobweb model when there is a time lag of more than 1 year in production, for example, raising hogs. The prices for hogs and corn (the principal feed for hogs) oscillate, and there is a fairly good correlation when they are offset a bit. Explain.
2. The demand for new graduates in various fields fluctuates. How should your department adapt its graduate program to help stabilize the situation? This problem is purposely very vague in hopes of generating a discussion based on reasonable models. Don't forget that feasibility is important. Engineering departments have gone through at least two cycles.
3. Discuss the group interaction model when $\partial\varphi/\partial F > 0$ for small F.

4. Suppose that two species are in competition. Let the number of members of the first species in the population be x and the number of the second be y. Assume that the environment is fairly constant.

(a) Show that it is reasonable biologically to suppose that there exists a curve y = f(x) of negative slope such that species 1 increases if and only if (x, y) lies below the curve.
(b) State the corresponding assumption for species 2.
(c) Show that the equilibrium points are the intersection points of the curves, the point (0, 0), the point (f(0), 0), and the corresponding point for species 2.
(d) Determine the stability of the various possible equilibria.

5. You are called upon to advise an underdeveloped country on methods for increasing per-capita income. This problem briefly considers two difficulties you may encounter. It is an economics theory result that per-capita income is greater when accumulated capital per capita is greater. The idea is that, under suitable assumptions, since more capital is available it is used to help improve production. Do you think this applies to underdeveloped countries? What happens if capital is invested abroad or foreign capital is brought in? Let's assume that the economics theory result still applies. By definition, the capital accumulated in a year equals income (i.e., production) minus consumption.

(a) Fractional rates of growth are defined in the same way as net growth rates in biology: x'(t)/x(t). We denote the fractional rate of growth of x by x*. Let K stand for total capital and P for total population. Show that per-capita income is increasing if and only if K* > P*.
(b) One could suppose that K* and P* depend on per-capita income. Argue this point. Supposing it to be true, plot K* and P* as functions of per-capita income and show that intersections of the curves correspond to equilibria. How can you determine stability?
(c) In each of the following cases, discuss the shape of the K* and P* curves near the given income level and use (b) to explain why these effects can keep per-capita income from increasing.
(i) Rising expectations: At a certain income level, savings decrease because people try to mimic more affluent societies.
(ii) Population explosion: At a certain income level, improved sanitation and diet reduce the death rate, but the birth rate takes much longer to fall because it is the result of custom.
(d) That's the background for showing the ministers of the country some of the problems they face and what is going on. Now, advise them.

See P. A. Neher (1971, Ch. 8) and J. C. G. Boot (1967, Ch. 11) for further discussion.
CHAPTER 4

BASIC OPTIMIZATION
Determining what must be maximized (or minimized) is usually a major problem in formulating an optimization model. For example, the theory of the firm assumes that managers behave so as to maximize profit; but it has been suggested in recent years that they maximize a utility function, which includes size of staff and other items in addition to profit. Another example is provided by time sharing algorithms for computers. (A time sharing algorithm is an algorithm used by a computer to decide which of many waiting jobs to run and how long to let it run before interrupting it temporarily to run other jobs.) What should be minimized? Among the myriad of possible functions are $\max f(w, r)$ and $\sum f(w, r)$, where w = waiting time and r = running time. Waiting time refers to total time elapsed between submission and completion of a job. There are many possibilities for f, such as f = w and f = w/r.

The first section of this chapter deals with optimization problems, using the result from elementary calculus that, except for boundary points and points without derivatives, f' = 0 at the extrema of f. The second section contains some models involving graphical optimization.

4.1. OPTIMIZATION BY DIFFERENTIATION
Maintaining Inventories

As a management consultant you are being asked for advice on production and warehousing policies. Where should you begin? One problem is the trade-off between storage space costs and setup costs for frequent small production line runs. In deciding how large an inventory of finished goods to maintain, a firm concerns itself with such things as cost of storage, setup expenses for a production run, discounts for bulk orders of raw materials, and orders lost as a result of lack of inventory. Because of the random nature of the time and size of orders, a probabilistic model is the most natural. We use a deterministic one, since the results are substantially the same if a firm receives many orders. For a fuller discussion of inventory problems, see R. L. Ackoff and M. W. Sasieni (1968), from which this model is adapted. See also the book by G. Hadley and T. M. Whitin (1963).

What should we optimize? We minimize the cost to the firm, subject to the constraint that all orders be filled. The only variable the manufacturer can control is the time between production runs. To begin with, we assume that the only costs the manufacturer adjusts by changing the production schedule are setup costs for production and storage costs for finished goods. It is reasonable to assume that, when the production line is operating, it produces finished goods at a constant rate k per unit time. There is a cost c to set up the line at the beginning of a production run. This consists of profits lost by not using the production line for manufacturing at this time, various fixed costs, and any additional material and salaries that may be required. When the production line is not dedicated to the particular good we are interested in, we assume it can be used profitably for other work. We assume that the storage costs of the finished product are s per item per unit time, independent of the quantity stored. (This is reasonable if warehouse space can be used for other goods.) Finally, we approximate the discrete arrival of orders by a continuous arrival at a constant rate r per unit time. Discuss these assumptions and consider ways in which the model can be made more realistic. Remember that it is essential that the parameters in the model be determined if the model is to be of any use, and that this determination may be quite expensive for a complex model.

Let T be the length of time between one production run and the next. If t is the length of a production run, $kt = rT$; that is, goods produced equal goods sold during a cycle. Hence $t = rT/k$. If you graph inventory versus time from 0 to T, it rises from 0 to $(k - r)t$ with slope $k - r$ and falls from $(k - r)t$ to 0 with slope $-r$. The area under the triangular curve is $A = (k - r)tT/2$ and is in units of items × time. Convince yourself that the storage cost is $sA$. Thus we want to minimize the cost measured per unit time,

$$C = \frac{c + sA}{T} = \frac{c + s(k - r)tT/2}{T} = \frac{c}{T} + \frac{s(k - r)rT}{2k}. \tag{1}$$

Differentiating with respect to T and setting the derivative equal to zero, we obtain $c/T^2 = s(k - r)r/2k$. From the form of (1) it is clear that C becomes infinite if T decreases to zero or increases to infinity, hence this extreme value of C is a minimum. Thus the optimum values for T and t are

$$T = \sqrt{\frac{2ck}{rs(k - r)}}, \qquad t = \sqrt{\frac{2cr}{ks(k - r)}}.$$
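The closed-form optimum can be checked numerically. The following is a minimal Python sketch, not part of the original model: it assumes the cost per unit time has the form $C(T) = c/T + s(k - r)rT/(2k)$, with setup cost c, storage cost s, production rate k, and order rate r, and it uses made-up parameter values. The function names `cost_per_unit_time` and `optimal_cycle` are hypothetical.

```python
import math

def cost_per_unit_time(T, c, s, k, r):
    """Setup cost c spread over a cycle of length T, plus average storage cost."""
    return c / T + s * (k - r) * r * T / (2 * k)

def optimal_cycle(c, s, k, r):
    """Closed-form minimizer of the cost above; requires k > r > 0."""
    T = math.sqrt(2 * c * k / (r * s * (k - r)))
    t = r * T / k  # length of the production run, from k*t = r*T
    return T, t

# Hypothetical parameter values, chosen only for illustration.
c, s, k, r = 500.0, 0.2, 100.0, 40.0
T_opt, t_opt = optimal_cycle(c, s, k, r)

# Crude check: scan cycle lengths from 0.5*T_opt to 1.5*T_opt in 1% steps
# and confirm that no grid point beats the closed-form optimum.
grid = [T_opt * (0.5 + 0.01 * i) for i in range(101)]
best_on_grid = min(grid, key=lambda T: cost_per_unit_time(T, c, s, k, r))
print(T_opt, t_opt, best_on_grid)
```

A grid scan is used instead of a library optimizer so that the check depends on nothing beyond the standard library.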
It is not obvious a priori that the optimal time varies as the square root of the setup cost and inversely as the square root of the storage cost per unit time.

We now consider storage costs for raw materials. Let's assume that there is only one raw material and that the precise amount needed is delivered at the beginning of the run. Let s' be the storage cost per unit time for enough raw material to produce one item of output. Convince yourself that the cost per unit time is

$$C = \frac{c + s(k - r)tT/2 + s'(rT)t/2}{T} = \frac{c}{T} + \frac{[s(k - r) + s'r]\,rT}{2k}.$$

Setting the derivative equal to zero, we obtain $c/T^2 = [s(k - r) + s'r]r/2k$. Thus the optimum values of T, t, and C are

$$T = \sqrt{\frac{2ck}{r[s(k - r) + s'r]}}, \qquad t = \sqrt{\frac{2cr}{k[s(k - r) + s'r]}}, \qquad C = \sqrt{\frac{2cr[s(k - r) + s'r]}{k}}. \tag{2}$$
Since the model is only approximate and since we probably cannot determine the independent variables very accurately, it is important to have some idea of the cost incurred by making these errors. If T is replaced by $\alpha T$, it is easy to show that the value of C is $(\alpha + \alpha^{-1})/2$ times the optimal value. For example, a 50% underestimate of T (i.e., $\alpha = 0.5$) increases C by 25%, while a 50% overestimate increases C by only about 8%, the same amount as a 33% underestimate would. We draw two conclusions from this. First, an error in choosing T does not change costs greatly unless the chosen value of T is quite far from the optimal value. Second, it is better to err on the high side than on the low side. Since the storage costs are the hardest to estimate and since T varies inversely with the storage costs [this follows from (2) and the fact that k > r], this suggests that underestimates of storage costs are better than overestimates. What we have done in this paragraph is an example of sensitivity analysis. Characterizing models as fragile or robust is a very crude form of sensitivity analysis.
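The percentages quoted for errors in T can be reproduced with a few lines of Python. This is only an illustrative sketch: it assumes the cost has the form C(T) = c/T + bT for some positive constant b, so that the cost at $\alpha$ times the optimal T is $(\alpha + 1/\alpha)/2$ times the minimum cost. The function name `cost_ratio` is hypothetical.

```python
def cost_ratio(alpha):
    """Cost at alpha * T_opt relative to the minimum cost: (alpha + 1/alpha) / 2."""
    return (alpha + 1.0 / alpha) / 2.0

# A 50% underestimate, a 33% underestimate, no error,
# a 50% overestimate, and a 100% overestimate of T.
for alpha in (0.5, 2.0 / 3.0, 1.0, 1.5, 2.0):
    extra = 100.0 * (cost_ratio(alpha) - 1.0)
    print(f"alpha = {alpha:.3f}: cost up by {extra:.1f}%")
```

The ratio is unchanged when $\alpha$ is replaced by $1/\alpha$, which is why a 50% overestimate ($\alpha = 3/2$) and a 33% underestimate ($\alpha = 2/3$) cost the same.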
We can use the results in (2) to determine how much warehouse space our company requires. (How is this done?) If this differs from the amount of space we now have, we should either get rid of excess space or acquire additional space. This is fine in the long run, but what do we do in the short run, that is, the period of time before we can change our warehouse space? Since the cost of the warehouse is fixed in the short run, s and s' should be zero. (See the discussion of the theory of the firm in Section 3.2 for an explanation of fixed and variable costs.) How can we determine the best short run plan? As pointed out at the beginning of this paragraph, if we knew the storage costs, we could use (2) to determine how much space is required. This suggests that we assign fake costs to make storage space needed equal to storage space available. The easiest way to do this is to replace s and s' by $s\sigma$ and $s'\sigma$, where $\sigma$ is the factor that we have to scale costs by and s and s' are long run costs. (You should be able to show that this simply has the effect of multiplying the needed storage space computed from (2) by a factor of $\sigma^{-1/2}$.)
The situation with a bulk order of raw material is more complicated. Suppose a bulk order shipment consists of enough raw material to produce N finished items. For simplicity we assume that N is such that $p = N/rT$ is an integer; that is, a raw material order lasts for p production cycles. (You may wish to study the model when p is not an integer.) The amount of raw material on hand is plotted in Figure 1 over p production cycles. The area under the curve is $N(pT - T + t)/2$. Combining this with $N = prT$ and $T - t = (k - r)T/k$, we see that the storage cost per unit time is

$$\frac{s'[N - (T - t)r]}{2} = s'\left(\frac{N}{2} - \frac{(k - r)Tr}{2k}\right).$$

Combining this with (1) we obtain the total cost per unit time:

$$C = \frac{c}{T} + \frac{rT(k - r)(s - s')}{2k} + \frac{s'N}{2}.$$

[Figure 1. Raw material on hand during p production cycles of length T. Horizontal axis: time, marked at 0, T, 2T, ..., (p - 1)T, pT; vertical axis: raw material on hand, starting at N.]
If $s \le s'$, the best strategy is to make T as large as possible, that is, p = 1. When $s > s'$ we obtain the optimum values

$$T = \sqrt{\frac{2ck}{r(k - r)(s - s')}}, \qquad t = \sqrt{\frac{2cr}{k(k - r)(s - s')}}, \qquad C = \sqrt{\frac{2cr(k - r)(s - s')}{k}} + \frac{s'N}{2}. \tag{3}$$
This can be compared with the optimum nonbulk values given by (2) after a correction term is subtracted from the optimum bulk cost due to lower costs for raw materials. If the cost of materials is lower per finished unit by b when the manufacturer orders in bulk, the correction term will be rb. Note that bulk ordering leads to longer production runs, the ratio of times being

$$\sqrt{1 + \frac{s'k}{(s - s')(k - r)}}.$$
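The ratio of run lengths can be verified against the two closed-form cycle lengths. The sketch below is illustrative only. It assumes the nonbulk optimum $T = \sqrt{2ck/(r[s(k - r) + s'r])}$ and the bulk optimum $T = \sqrt{2ck/(r(k - r)(s - s'))}$, valid when s > s', and it ignores the requirement that N/rT be an integer. The parameter values and function names are hypothetical, and `s_raw` stands for s'.

```python
import math

def T_nonbulk(c, s, s_raw, k, r):
    """Optimal cycle length when raw material is delivered once per run."""
    return math.sqrt(2 * c * k / (r * (s * (k - r) + s_raw * r)))

def T_bulk(c, s, s_raw, k, r):
    """Optimal cycle length with bulk ordering; only meaningful when s > s_raw."""
    return math.sqrt(2 * c * k / (r * (k - r) * (s - s_raw)))

def ratio_formula(s, s_raw, k, r):
    """Closed-form ratio of bulk to nonbulk cycle lengths."""
    return math.sqrt(1 + s_raw * k / ((s - s_raw) * (k - r)))

# Hypothetical parameter values, chosen only for illustration.
c, s, s_raw, k, r = 500.0, 0.2, 0.05, 100.0, 40.0
direct = T_bulk(c, s, s_raw, k, r) / T_nonbulk(c, s, s_raw, k, r)
print(direct, ratio_formula(s, s_raw, k, r))
```

Because T and t are proportional (t = rT/k), the same ratio applies to the production run lengths.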
We have not discussed the possibility of allowing the warehouse to run out of finished goods and then back-ordering. This eliminates some storage costs at the expense of possibly losing some customer good will, hence some orders. Various approaches have been suggested. The following is adapted from B. L. Schwartz (1966). Most firms can expect to gain and lose customers at a fairly regular rate. At equilibrium the rate of loss and the rate of gain must be equal. What happens to these rates if the fraction f of delayed orders is increased? Since there will be more disgruntled customers, the rate of loss will increase. We assume that new customers are still gained at the same rate. This probably won't be true if f changes markedly, since bad reputations spread; however, it seems reasonable if f changes only slightly. The simplest model incorporating these ideas is

$$a(1 - f)N + bfN = \text{constant},$$

where the constant is the rate at which new customers are gained, N is the number of customers, a is the probability of losing a customer whose order is filled promptly, and b is the probability of losing a customer whose order is delayed. Since r is proportional to N, it follows that r(f) is proportional to $1/[a(1 - f) + bf]$, and so

$$r(f) = \frac{r_0}{1 + f(b - a)/a}$$

for some constant $r_0$. The storage costs must be reduced to reflect the fact that less storage space and time are used when $f \ne 0$. You should be able to show that $s(k - r)tT/2$ is replaced by $s(k - r)(1 - f)^2 tT/2$. This has the effect of replacing s by $s(1 - f)^2$ in the formulas obtained above for the optimum values of T, t, and C. Also, these values are now functions of f. When the selling price is p, the profit per unit time is $pr(f) - C(f)$. The optimum value of f can be determined by maximizing this function. Even in the simplest case this is quite messy. When the production line is so fast that we can approximate $(k - r)/k$ by 1, things are simplified a bit. Try it.
Geometry of Blood Vessels

The blood vessel system of higher animals is so extensive that evolution has probably optimized its structure. How much of the structure can we explain in this way? First, we need to know what is being minimized or maximized by optimization. We can say that the cost to the organism is minimized, but then we must say what we mean by 'cost.' This depends on the specific problem, so we'll put it off for the time being.

Let's study the branching of vessels. For simplicity we consider only the case in which a vessel splits into two vessels, each of which carries equal amounts of blood. For the general situation of unequal-sized branches, see R. Rosen (1967, Ch. 3), from which this model is adapted. Any reasonable model can be expected to lead to the conclusion that all three vessels lie in a plane, since otherwise we could shorten the lengths of all three simultaneously by making them planar, surely a saving for the animal. Structural considerations may prohibit this, but it is a reasonably accurate assumption, since sharp changes in direction are seldom required by structural constraints. By symmetry, the two smaller branches should have equal radii r' and flow rates f' and make the same angle $\theta$ with the larger vessel. Let r and f = 2f' be the radius and flow rate of the larger vessel. See Figure 2.

The organism has a 'cost' associated with maintaining vessels and overcoming resistance in pumping blood. This cost per unit length is some function C(r, f). Since we wish to minimize this, r and r' are determined as functions of f by

$$\frac{\partial C(r, f)}{\partial r} = 0 \quad\text{and}\quad \frac{\partial C(r', f/2)}{\partial r'} = 0. \tag{4}$$

We also wish to choose $\theta$ to minimize the cost associated with the three vessels in the branch. If the vessels have lengths L, L', and L'', we wish to minimize

$$C = C(r, f)L + C(r', f')L' + C(r', f')L''.$$
[Figure 2. Arterial blood flow. Flow rates, f and f'; vessel radii, r and r'; branching angle, $\theta$.]
A slight change in L to $L + \Delta L$ results in a decrease in both L' and L'' by an amount equal to $\Delta L \cos\theta$ plus a term on the order of $(\Delta L)^2$. Draw a picture and convince yourself of this. Since C' = 0 at a minimum, $\Delta C$ must be on the order of $(\Delta L)^2$ or smaller. Hence

$$C' = C(r, f) - 2C(r', f')\cos\theta = 0 \tag{5}$$

at an extreme point. This must be a minimum, since we can clearly increase the cost by increasing L so that $\theta$ approaches $\pi$. Since r and r' are determined by (4), this gives an expression for $\theta$.

Let's consider a specific form for C. The work needed to overcome resistance in a rigid pipe with flow rate f and radius r is $kf^2/r^4$ per unit length, where k depends on the nature of the fluid. Vessel maintenance may depend on the space occupied by the vessel, the inner surface area of the vessel (where most of the wear may occur), the volume of the cells making up the vessel, or some combination of these. The first two give a cost per unit length proportional to $r^2$ and $r$, respectively. The third depends on how the thickness of the vessel wall varies with r. If it is proportional to r, the cost per unit length is proportional to $r^2$. If it is constant, the cost per unit length is proportional to r. In order to include all these possibilities for vessel maintenance in some simple fashion, we consider a contribution of the form $Kr^a$, where $1 \le a \le 2$. The total cost per unit length is thus $kf^2/r^4 + Kr^a$. By (4) we have

$$\frac{f^2}{r^{a+4}} = \frac{(f')^2}{(r')^{a+4}} = \kappa, \tag{6}$$

where $\kappa = aK/4k$. Thus $C(r, f) = Ar^a$, with $A = K(a + 4)/4$, and, since f = 2f' gives $(r/r')^{a+4} = 4$, Equation 5 yields

$$\cos\theta = \frac{(r/r')^a}{2} = 2^{(a-4)/(a+4)}.$$

Since $2 \ge a \ge 1$, it follows that $37^\circ \le \theta \le 49^\circ$. As far as I know, this has not been tested. However, it is known not to hold at the capillary level. If you are interested in obtaining some data, the illustrations in F. H. Netter (various dates) could be measured. I've been told that his drawings are quite accurate.

By using (6) plus the known radii of the aorta and capillaries we can determine the number of branchings between a capillary and the aorta in an organism: If there are n branchings, by (6) the ratio of the aortic radius to the capillary radius equals $4^{n/(a+4)}$. Rosen gives an approximate value of $10^3 \approx 4^5$ for this ratio in dogs. Hence $n \approx 5(a + 4)$, which ranges from 25 to 30. Since the number of capillaries equals $2^n$, there are between $3 \times 10^7$ and $10^9$ capillaries. An empirical estimate cited by Rosen is $10^9$.
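The numerical ranges for the branching angle and the capillary count are easy to recompute. The following Python sketch is only an illustration: it assumes the relations $\cos\theta = 2^{(a-4)/(a+4)}$ and (aortic radius)/(capillary radius) $= 4^{n/(a+4)}$ with $1 \le a \le 2$, and the radius ratio of $10^3$ attributed to Rosen. The function names are hypothetical.

```python
import math

def branching_angle_deg(a):
    """Branching angle theta in degrees, from cos(theta) = 2 ** ((a - 4) / (a + 4))."""
    return math.degrees(math.acos(2.0 ** ((a - 4.0) / (a + 4.0))))

def branchings(radius_ratio, a):
    """Solve 4 ** (n / (a + 4)) = radius_ratio for the number of branchings n."""
    return (a + 4.0) * math.log(radius_ratio) / math.log(4.0)

for a in (1.0, 2.0):
    n = branchings(1e3, a)
    # Each branching doubles the number of vessels, so there are 2**n capillaries.
    print(f"a = {a}: theta = {branching_angle_deg(a):.1f} deg, "
          f"n = {n:.1f}, capillaries = {2.0 ** n:.1e}")
```

Treating n as a real number rather than an integer is a simplification; rounding n changes the capillary count by at most a factor of 2.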
Fighting Forest Fires

Your state forestry service wants to reduce the financial and environmental costs of forest fires. How can they do this? What is the best way to reduce the cost of forest fires within the limits of present fire control methods? The following is an adaptation of a model presented by G. M. Parks (1964) for determining the size of an optimal fire fighting force. Another possibility that needs serious consideration is increasing the effort spent on detection; however, we ignore it here.

'The best way' is interpreted to mean the least costly way. This means we must assign costs for the burned area and the injuries and deaths of fire fighters. The first cost is very difficult to assess; outdoorsmen, lumbermen, and city dwellers are likely to assign quite different costs. In California in 1963 'current practice [assigned] . . . values from $25 to upwards of $2,000 per acre.' What about the second cost? Since more fire fighters mean quicker control of a fire, there is less chance per fighter for injury; furthermore, fire fighters are assumed to receive monetary compensation. Therefore we do not consider the cost of injuries and deaths.

Let B(t) be the area burnt by time t, where time is measured from t = 0 at time of detection. We assume that the fire has stopped when B'(t) = 0.
Let $T_a$ be the time the fire is first attacked and $T_c$ the time it is brought under control. Thus $T_c$ is the least t > 0 such that B'(t) = 0. Let x be the size of the fire fighting force (assumed constant from $T_a$ to $T_c$). The costs for fighting a particular fire are:

$C_b$, the cost per acre of fire (burnt acreage plus cleanup expenses).
$C_x$, the cost in support and salary per fire fighter per unit time.
$C_s$, one-shot costs per fire fighter (such as transportation to and from the site).
$C_t$, costs per unit time, while the fire is burning, for maintaining the organization on an emergency basis, redirecting traffic, and so on.

(Note that we are implicitly assuming that all the C are constants.) The total cost is

$$C = C_b B(T_c) + C_x x (T_c - T_a) + C_s x + C_t T_c.$$
To minimize C as a function of x, we must determine B(t). We assume that each fire fighter reduces the burning rate of the fire at a constant rate E, that is, B''(t) decreases by E. Thus

$$B'(t) = b(t), \quad\text{for } t < T_a, \tag{7a}$$

$$B'(t) = b(t) - E(t - T_a)x, \quad\text{for } T_a \le t \le T_c, \tag{7b}$$

where b(t) is to be determined. Parks simply assumes that b(t) is a linear function of t. We can derive this from the crude assumption that the fire is spreading circularly at a uniform rate: The perimeter is proportional to b(t) and the rate of change of the perimeter is a constant. Thus $b(t) = G + Ht$. Criticize the model.
To find $T_c$ we set B'(t) = 0 in (7b) and obtain

$$T_c = T_a + \frac{G + HT_a}{Ex - H}.$$

Note that Ex > H is required if the fire is ever going to be stopped. We now integrate (7) to obtain

$$B(T_a) = B(0) + GT_a + \frac{HT_a^2}{2}, \qquad B(T_c) = B(T_a) + \frac{(G + HT_a)^2}{2(Ex - H)}.$$
For convenience let $b_a = b(T_a) = G + HT_a$ and $z = x - H/E$, the number of fire fighters above the bare minimum. By combining the above results we obtain

$$C = C_0 + C_s z + \left[\frac{HC_x}{E} + C_t + \frac{C_b b_a}{2}\right]\frac{b_a}{Ez}, \tag{8}$$

where $C_0$ is a constant. Setting the derivative with respect to z equal to zero, we obtain the optimal value:

$$x^* = b_a\sqrt{\frac{C_b + 2C_t/b_a + 2HC_x/(Eb_a)}{2C_s E}} + \frac{H}{E}. \tag{9}$$

The values of $C_x$, $C_s$, and $C_t$ can be determined for a region; the values of $C_b$ can be tabulated for various types of forests; the values of H and E can be tabulated by type of forest and wind conditions; and $b_a$ can be determined on the spot. Then (9) can be applied. It is unlikely that this would be done by the forest service; however, (9) could be used to make general recommendations to forestry officials. Parks has done this. He obtained numerical estimates and concluded that 102 of the 139 fires in the Plumas National Forest in California in 1959 were undermanned. In particular, the model predicts that the four fires that burned over 300 acres each would have burned less than 100 acres each with proper manning.

There are problems with relying on (9), even if we believe that the model is correct and are able to reach some agreement on estimates for the various costs. It is still necessary to know $b_a$, H, and E. Unfortunately $b_a$ tends to be underestimated because that makes the lookout appear more alert, while H and E are dependent on so many factors that good estimates in a particular situation may be hard to obtain even if tables are prepared ahead of time. How sensitive are (8) and (9) to such errors? The graph of C versus x in Figure 3 shows that underestimating x* by a large amount is more expensive than overestimating it. The critical variables are H and E, since errors here can shift us into the untenable position of fielding fewer than H/E fire fighters. We could improve the situation somewhat by tabulating H/E instead of H and E separately. (Of course we also need either H or E as well, but this way we are spared the necessity of dividing two uncertain quantities to obtain the critical quantity H/E.)

[Figure 3. Firefighting cost as a function of manpower. Horizontal axis: x, with H/E marked; vertical axis: C.]

What is H/E? It is the number of fire fighters needed to keep a fire from spreading at a faster rate, that is, enough fire fighters so that B'(t) is a constant. Not only does this sound hard to measure, it sounds impossible. Surely the number of fire fighters must depend on the size of the fire. According to the model the number of such fire fighters is independent of the size of the fire. Before we accept the model it would be a good idea to check this counterintuitive prediction, since H/E plays such a crucial role in determining x*. As far as I know, this hasn't even been noted, much less explored.
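The optimal force size can be compared with a direct evaluation of the cost. The Python sketch below is a rough illustration under stated assumptions rather than Parks's own computation. It assumes a linear burning rate b(t) = G + Ht, a total cost of the form $C_b B(T_c) + C_x x(T_c - T_a) + C_s x + C_t T_c$, and the closed form $x^* = H/E + b_a\sqrt{(C_b + 2C_t/b_a + 2HC_x/(Eb_a))/(2C_sE)}$ with $b_a = G + HT_a$. All parameter values and function names are hypothetical.

```python
import math

def total_cost(x, G, H, E, Ta, Cb, Cx, Cs, Ct, B0=0.0):
    """Cost of fielding x fighters; requires E * x > H so the fire is controlled."""
    ba = G + H * Ta                      # burning rate when the attack starts
    tau = ba / (E * x - H)               # T_c - T_a, time needed to gain control
    B_Ta = B0 + G * Ta + H * Ta**2 / 2   # area burnt before the attack starts
    B_Tc = B_Ta + ba**2 / (2 * (E * x - H))
    return Cb * B_Tc + Cx * x * tau + Cs * x + Ct * (Ta + tau)

def optimal_force(G, H, E, Ta, Cb, Cx, Cs, Ct):
    """Closed-form optimal number of fire fighters."""
    ba = G + H * Ta
    inner = (Cb + 2 * Ct / ba + 2 * H * Cx / (E * ba)) / (2 * Cs * E)
    return H / E + ba * math.sqrt(inner)

# Hypothetical parameter values, chosen only for illustration.
params = dict(G=1.0, H=0.5, E=0.1, Ta=2.0, Cb=100.0, Cx=10.0, Cs=50.0, Ct=20.0)
x_star = optimal_force(**params)
for x in (0.5 * x_star, x_star, 1.5 * x_star):
    print(f"x = {x:.2f}: cost = {total_cost(x, **params):.1f}")
```

With these made-up numbers the bare minimum H/E is 5 fighters, so all three force sizes printed above are feasible. Comparing the costs at 0.5 x* and 1.5 x* gives a feel for the asymmetry between undermanning and overmanning.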
PROBLEMS

1. Returning to the blood vessel model developed above, do you think Rosen's data on the number of capillaries is strong evidence for the cost function $C = Ar^2$? Why? Propose further tests for the theory that evolutionary pressure has led to minimal total cost and that the cost per unit length is $C(r, f) = Ar^a$, with $1 \le a \le 2$. How can a be estimated?

2. Suppose you wish to get from one place to another in the rain by traveling in a straight line. How fast should you walk (or run) to stay as dry as possible? The following model is due to B. L. Schwartz and M. A. B. Deakin (1973).

(a) Let's approximate a person by a rectangular prism (a box) with a ratio of areas given by front : side : top = $1 : \eta : \varepsilon$. Assume that the rain's velocity is (w, W, -1) and that the person's is (v, 0, 0), where the z coordinate is vertical upward. Show that the amount of rain hitting the person per unit time is proportional to $|w - v| + \varphi$, where $\varphi = |W|\eta + \varepsilon$, a constant.
(b) Show that, if $\varphi > w$, you should run as fast as possible and that otherwise you should run with v = w or as fast as possible, whichever is slower. This has a simple interpretation in terms of keeping your front and back dry. What is it?
(c) Criticize the model. Can you improve it? How do the new and old predictions compare?
3. Suppose you are an advisor to a congressperson who wishes to develop legislation to regulate commercial fishing so that the fish populations will be preserved. To advise him or her you need to become familiar with the economic aspects of the problem. This material is adapted from C. W. Clark (1973). Let N be the size of the fish population. For simplicity, assume a selling price of p per fish, independent of the quantity sold.

(a) Argue that the harvest cost per fish c(N) is a decreasing function of N and that, if there are no fishing regulations, we can expect the fish population to be at the level $N_f$ where $p = c(N_f)$. Cost includes salaries, fuel, income lost because capital is tied up in boats, and so on.
(b) Suppose we assume a simple reproduction model: N' = g(N). Show that a reasonable shape for g is a concave arc passing through N = 0 and N = N*, the maximum population that can maintain itself when there is no fishing. Show that maximum sustained yield is obtained at $N_m$, the solution of $g'(N_m) = 0$. What is the yield?
(c) What does $N^* \le N_f$ say about the economic feasibility of fishing? What about $N^* \ge N_f$?
(d) Suppose the fish population is to be maintained at the most profitable level. Call this $N_p$. Show that profits are given by $P(N) = g(N)[p - c(N)]$, and that $N_p$ is the solution of $P'(N_p) = 0$. What can you say about the relative sizes of $N_f$, $N^*$, $N_m$, and $N_p$?
(e) Under what conditions is it economically best to drive the species to extinction by fishing? Hint: Perhaps the left hand zero of g(N) should be at a point to the right of zero, since a dispersed population below a certain critical level may not be able to come together to reproduce. If extinction is not economically feasible, is legislation a good idea anyway? Explain. What about fishing in international waters, for example, whaling?
(f) Can you improve the model? What if p depends on harvest size?
(g) Apply the above ideas to buffalo hunting (previous century), deer hunting (present day), tree farming, and anything else you'd care to.
Notes: A graphical approach to parts of the above problem may be helpful. See Chapter 3 and Section 4.2. Fisheries have been studied extensively. Among the journals devoted to the subject are Fishery Bulletin and Transactions of the American Fisheries Society. See also C. W. Clark (1976).

4. In designing a multistage rocket, how would you decide on the number and size of the various stages? By having multiple stages, unneeded fuel containers can be discarded, thus reducing the amount of mass that must be accelerated for the rest of the flight. Unfortunately there is a cost: Additional motors are needed so that each stage will have an engine, and this adds to the weight until the motor is discarded. Clearly some compromise should provide the biggest payload (or longest flight) for the money. For simplicity we assume that cost is proportional to weight. Therefore we maximize the terminal velocity for a given initial mass and a given payload mass. Again for simplicity let us neglect the effect of gravity. (The crude assumptions we are making can be removed, but then the optimization problem may require a computer.) We need the physical fact that the mass $m$ and the velocity $v$ of a rocket with constant exhaust velocity $v_e$ are related at any time by

$$m \exp(v/v_e) = \text{constant},$$

when gravity and air resistance are neglected. The constant changes each time the rocket drops a stage. (For those who wish to derive the result, it is simply a conservation-of-momentum argument: $m\,\Delta v + v_e\,\Delta m = 0$.) To begin with, let's find the optimal division between stages, given that we are to use $n$ stages and the payload counts as a stage. Let

$M_i$ be the mass of the entire rocket (including fuel) when the ith stage begins to fire.
$F_i$ be the mass of the fuel in the ith stage.
$C_i$ be the mass of the fuel casing in the ith stage.
$R_i$ be the mass of the rocket motor and other support in the ith stage.
By assumption, $M_1$ and $M_n$ are given, $F_n = C_n = 0$, and we can assume that $R_n = 0$ by absorbing it in the payload.

(a) Show that the terminal velocity is

$$v_f = v_e \sum_{i=1}^{n-1} \log \frac{M_i}{M_i - F_i}.$$

(b) Using (a), show that, if $M_j$ is such that for given values of $M_{j+1}$ and $M_{j-1}$

$$\log \frac{M_{j-1}}{M_{j-1} - F_{j-1}} + \log \frac{M_j}{M_j - F_j}$$

is a maximum and, if this holds for $2 \le j \le n - 1$, the rocket maximizes $v_f$. Remark: This uses an important idea in maximization: A solution that is locally a maximum is often globally a maximum. In this instance, if the division of mass $M_{j-1} - M_{j+1}$ between stages $j - 1$ and $j$ is the best possible for all $j$, the entire rocket is the best possible.

(c) We assume that $C_i \propto F_i$ and $R_i \propto M_i$, with constants of proportionality independent of $i$ for $1 \le i \le n - 1$. Discuss. Use this to conclude that $F_i = aM_i - bM_{i+1}$ for some $a$ and $b$ and thus express $\log[M_i/(M_i - F_i)]$ in terms of $M_i$ and $M_{i+1}$.

(d) Using (c), reduce the expression in (b) to a function of the single variable $M_j$. Show that it is a maximum when

$$\frac{M_j}{M_j - F_j} = \frac{M_{j-1}}{M_{j-1} - F_{j-1}}.$$

Conclude that $v_f$ is a maximum when $M_i/(M_i - F_i)$ is constant for $1 \le i \le n - 1$. Interpret in terms of $\Delta v$ for each stage.

(e) How can you determine the optimum value for $n$, the number of stages? How does the reliability change as the number of stages increases? What can you do about this and how does it affect the model?

(f) Can you propose a more realistic model which can be analyzed easily?

(g) What additional factors would you take into account if you were actually attempting to design a multistage rocket?
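As a numerical companion to parts (c) and (d), the following sketch evaluates the terminal velocity for a rocket with two powered stages plus payload and searches for the best intermediate mass $M_2$. The proportionality constants (casing equal to 10% of fuel, motor equal to 5% of stage mass), the masses 1000 and 10, and the function names are all assumptions made up for illustration; they do not come from the text.

```python
import math

def terminal_velocity(masses, a, b, v_e=1.0):
    """v_f = v_e * sum(log(M_i / (M_i - F_i))) with F_i = a*M_i - b*M_{i+1}.

    `masses` lists M_1, ..., M_n; the last entry is the payload."""
    total = 0.0
    for m_i, m_next in zip(masses, masses[1:]):
        fuel = a * m_i - b * m_next
        total += math.log(m_i / (m_i - fuel))
    return v_e * total

def best_middle_mass(m1, m3, a, b, steps=20000):
    """Grid search for the M_2 that maximizes v_f when n = 3."""
    best_m2, best_v = None, -math.inf
    for k in range(1, steps):
        m2 = m3 + (m1 - m3) * k / steps
        v = terminal_velocity([m1, m2, m3], a, b)
        if v > best_v:
            best_m2, best_v = m2, v
    return best_m2, best_v

# Hypothetical constants: C_i = 0.1 * F_i and R_i = 0.05 * M_i, so that
# M_i = F_i + C_i + R_i + M_{i+1} gives F_i = a*M_i - b*M_{i+1}.
A = 0.95 / 1.1
B = 1.0 / 1.1

m2, v = best_middle_mass(1000.0, 10.0, A, B)
print(m2, v, math.sqrt(1000.0 * 10.0))
```

For these made-up numbers the search settles near the geometric mean of $M_1$ and $M_3$, which is what the equal-ratio condition in (d) suggests; a finer grid or other constants can be tried by changing the arguments.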
5. A troubleshooter spends a lot of time flying in his private plane to various industrial plants which he helps out. He wishes to spend the least amount of time possible traveling. Where should he live? Of course, you need data. What data do you need? You should set up a model so that data collection is feasible. How would you change your approach if he used commercial airlines?

6. What is the best strategy for a swimming fish to adopt if it wishes to travel with the least expenditure of energy? (This 'wish' is not conscious, but rather a result of natural selection.) Since the motions involved in swimming increase the drag on a fish to about three times its value when the fish is gliding, it is to the fish's advantage to keep swimming time down. This leads to burst swimming (D. Weihs, 1973, 1974). Fish that are heavier than water can alternate between swimming upward and gliding downward. We study the simplest case of this discussed by D. Weihs (1973). We assume that the fish attempts to move with a constant velocity $v$. (Other assumptions are possible, but this seems fairly reasonable, and we can handle it.) Let $D$ be the drag on the gliding fish at this velocity and $kD$ the drag on the swimming fish. Let $W$ be the net weight of the fish in water, $\alpha$ the angle of downward glide, and $\beta$ the angle of upward swimming. Thus we're assuming that the fish travels along a path which, when viewed from the side, has a sawtooth appearance. We assume that the energy used by the fish per unit time above and beyond that required simply to stay alive is proportional to the force it exerts in moving.

(a) Criticize the assumptions.

(b) Show that $W \sin\alpha = D$ and that the swimming force is $kD + W \sin\beta$.

(c) Show that the ratio of energy in the burst mode to energy for continuous horizontal swimming to go from a point A to another point B is

$$\frac{k \sin\alpha + \sin\beta}{k \sin(\alpha + \beta)}.$$

(d) It has been found empirically that $\tan\alpha \approx 0.2$. What is the best value for $\beta$? How much energy does the fish save? How important is it that the fish estimate $\beta$ accurately? (We should answer this because it may be unrealistic to expect accurate estimates.)

(e) Criticize the model.

(f) Suppose the fish wishes to swim from A to B in a given time. Construct a model. Drag is roughly proportional to $v^2$. The energy per unit time (power) used to overcome drag in swimming is nearly proportional to $v^3$.
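The energy ratio in part (c) of the swimming-fish problem can be explored numerically. The sketch below scans the climb angle $\beta$ for the value that minimizes the ratio, taking $k = 3$ (the text's "about three times") and $\tan\alpha = 0.2$; the function names and the grid resolution are illustrative assumptions, and a plain grid search stands in for any analytic treatment.

```python
import math

def energy_ratio(alpha, beta, k):
    """Burst-mode energy divided by continuous-swimming energy."""
    return (k * math.sin(alpha) + math.sin(beta)) / (k * math.sin(alpha + beta))

def best_beta(alpha, k, steps=100000):
    """Scan 0 < beta < pi/2 and return (beta, ratio) at the smallest ratio."""
    best_b, best_r = None, math.inf
    for i in range(1, steps):
        beta = (math.pi / 2) * i / steps
        r = energy_ratio(alpha, beta, k)
        if r < best_r:
            best_b, best_r = beta, r
    return best_b, best_r

ALPHA = math.atan(0.2)   # empirical glide angle quoted in part (d)
K = 3.0                  # swimming drag is about three times gliding drag

beta_opt, ratio_opt = best_beta(ALPHA, K)
print(math.degrees(beta_opt), ratio_opt)

# Sensitivity: how much does the ratio change if beta is off by 10 degrees?
for offset in (-10.0, 10.0):
    print(offset, energy_ratio(ALPHA, beta_opt + math.radians(offset), K))
```

The last loop addresses the question of how accurately the fish must estimate $\beta$: comparing the printed ratios with the minimum shows how flat the curve is near the optimum.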
7. Two firms Y and Z are competing for a market. If Y spends $y$ per unit time on advertising and Z spends $z$, we could expect that Y's share of the market in the long run is a function of the fraction of the total advertising attributable to Y; that is, $f[y/(y + z)]$ for some function $f$. If the two firms are similar, Z's share of the market will be $f[z/(y + z)]$.

(a) Criticize the above suggestion.

(b) Show that, for $0 \le x \le 1$, $f(x) + f(1 - x) = 1$ and $f'(x) = f'(1 - x)$.

(c) Assuming the above, how should Y and Z act so as to maximize profit, assuming there is neither tacit nor explicit collusion between the two firms? How reliable is the prediction? You can assume that all costs and the function $f$ are known.

This problem was adapted from R. G. Murdick (1970, Ch. 2).

8. What is the optimum number of years a company should keep trucks in its fleet before buying new ones? This can lead to many complications as the model becomes more and more realistic. Begin with a very simple model in which the main factor is rising maintenance costs. You can work up to as complicated a model as you feel the situation warrants.
4.2. GRAPHICAL METHODS

For the reasons given in Section 3.1, this section is limited to qualitative problems with few variables. The idea is simple: We wish to maximize a function $f$ like 'fitness' or 'happiness,' subject to certain constraints. The constraints and the curve $f$ = constant are plotted, and the point where $f$ is maximized is read from the graph. When the problem can be stated in clear, quantitative terms, more sophisticated methods such as Lagrange multipliers and mathematical programming are used.
A Bartering Model

Suppose two people have two goods which they wish to use in bartering with each other. What can we say about the situation? We assume there is some satisfaction associated with various mixes of the goods, and each person wishes his or her satisfaction to be as great as possible. For example, if I have 25 inches of French bread and you have 20 ounces of wine and it is lunch time, we will probably be able to work out a trade in which both of us will be better off. (Don't suggest simply 'sharing'; that's frowned upon in economic models.) Can we say more about this? Let's consider another situation. Suppose I have 2 yards of one fabric and you have 2 yards of another. We may not wish to do any trading unless we switch ownership completely, because anything else would lead to rather small pieces of fabric. Can a model explain both situations?

We begin with the concept of indifference curves. I may say that as far as I am concerned 10 inches of bread and 4 ounces of wine together are just as good as 6 inches and 10 ounces. We say that I'm indifferent between (10, 4) and (6, 10). The set of points that I consider to be indifferent to (10, 4) form a set which is usually a curve. It is called an indifference curve. Several of my indifference curves are sketched in Figure 4. Can you explain the shape? A curve further toward the upper right contains points of greater satisfaction to me. (Why?) Thus I want our bartering to lead to a point on a curve far toward the upper right.

Figure 4  Two types of indifference curves (panels labeled Bread and Fabric).

Now let's put your indifference curves and my indifference curves together. I've done this in Figure 5 for bread and wine. Note carefully the labeling of the axes: Altogether there are 25 inches of bread and 20 ounces of wine, and any point in the rectangle describes some division of the bread and wine between the two of us. Now suppose I agree to stay on one of my indifference curves. How can you maximize your satisfaction? The answer is simple: Choose a point where one of your indifference curves is tangent to mine. Another way of viewing this is to say that, if our indifference curves are not tangent at the point we have selected, there is another point where neither of us is worse off and at least one of us is better off. Hence we should stay on points of tangency. This is the bargaining path, which is shown dotted in Figure 5. It starts on my indifference curve containing (25, 0) and yours containing (0, 20), because neither of us will agree to be worse off after trading. Where on the curve we end up depends on our bargaining abilities. (Various people have attempted to be more specific.) Figure 5 is called an Edgeworth box.

Figure 5  An Edgeworth box: our joint indifference curves. Dotted line is bargaining path. (Axes labeled My bread and Your bread.)
What about the yard goods case? Here the indifference curves have a different shape, so that the points of tangency give minima instead of maxima. Thus we do better at the boundary.

What if we are trading more than two goods? For three goods we can still picture the situation: There are indifference surfaces, but the points of tangency still form a curve. This is true for any number of goods. We can put this result in a somewhat surprising form: Suppose Bill and Mary are trading and I know their preferences. If Bill tells me how much of one of the goods he has settled for, I can then say, 'Unless you have settled for the following amounts of the remaining goods, you and Mary can arrange a trade that would be better for both of you.'

This model has several drawbacks. First, to make it quantitative requires a great amount of experimental work gathering data; however, psychologists have collected data of this sort in past experiments. Second, the indifference curves may shift with time; I may be more interested in wine after haggling with you for a while. Third, I may derive satisfaction from how well or how poorly I feel you are doing. Can you think of other objections?

Do you think these ideas on bargaining would be useful in bilateral trade negotiations between the United States and Japan? In arms limitation talks between the United States and the U.S.S.R.? Discuss your reasons.
Changing Environments and Optimal Phenotype

Why do some animals have only a few quite distinct forms for different situations (e.g., queen, drone, and worker forms among honeybees), while others exhibit a whole range of variation (e.g., variation in the size of many plants with the climate)? Suppose a habitat consists of two distinct types of environments. Examples are: oak trees versus maple trees (relevant for plant eating insects); warm versus cool weeks (relevant for insects producing more than one generation per year); and the nest versus the outdoors (relevant for some social insects with castes like ants). Assume that the animal or plant spends most of its life in only one of the two environments and that for developmental or genetic reasons the organism can end up having one of several phenotypes. We want a model that explains why some organisms have markedly different phenotypes in different environments while others do not. The following ideas are adapted from R. Levins (1968, Ch. 2). See E. O. Wilson and W. H. Bossert (1971, pp. 73-77) for related material.

We begin with the idea of fitness. In vague terms, the fitness of an individual is a measure of its expected success. This could be measured in terms of the extent to which an individual's genes survive and spread in future generations or, for social insects with a single queen, the survival and reproduction of the nest. Thus fitness could be measured by the expected number of descendants at some future time. Even this is rather vague, because fitness is a very slippery concept to try to grasp precisely. We can allow it to remain vague as long as we are aware that we are doing so, because we only wish to make crude qualitative predictions. Since we can't obtain the data that would be required by a quantitative model anyway, it is pointless to attempt to formulate such a model.

The essential property we demand is that the fitness down to the nth generation is the product of the fitnesses at each generation. Suppose the fitness of an individual in the first environment is $W_1$, and in the second $W_2$. If the fraction of time spent in the first environment is $p$, the fitness after $n$ generations is

$$\left( W_1^{p}\, W_2^{1-p} \right)^{n}. \qquad (10)$$

We wish to maximize (10), subject to the constraint that the fitnesses $W_1$ and $W_2$ are actually possible. The shaded regions in Figure 6 indicate fitnesses of biologically possible individuals. The regions are called fitness sets. On the left, the environments are sufficiently similar so that an intermediate individual A can do well in both. In contrast, the intermediate individual B on the right does poorly because the environments are too dissimilar.
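A crude numerical illustration of maximizing (10) is sketched below. Instead of a shaded region, each fitness set is replaced by three hypothetical phenotypes (two specialists and one intermediate form); the fitness values and names are invented for illustration and are not taken from the text or its figures.

```python
def long_run_fitness(w1, w2, p):
    """Per-generation value of (10): W1**p * W2**(1 - p)."""
    return (w1 ** p) * (w2 ** (1.0 - p))

def best_phenotype(fitness_set, p):
    """Name of the phenotype in `fitness_set` that maximizes (10)."""
    return max(fitness_set,
               key=lambda name: long_run_fitness(*fitness_set[name], p))

# Hypothetical (W1, W2) pairs. In the "similar" case the intermediate form
# does well in both environments; in the "dissimilar" case it does poorly.
SIMILAR = {
    "specialist_1": (1.0, 0.5),
    "generalist": (0.8, 0.8),
    "specialist_2": (0.5, 1.0),
}
DISSIMILAR = {
    "specialist_1": (1.0, 0.3),
    "generalist": (0.5, 0.5),
    "specialist_2": (0.3, 1.0),
}

for p in (0.2, 0.4, 0.6, 0.8):
    print(p, best_phenotype(SIMILAR, p), best_phenotype(DISSIMILAR, p))
```

With these invented numbers the intermediate form wins for middling values of $p$ when the environments are similar, while in the dissimilar case the best choice switches directly from one specialist to the other as $p$ increases.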
Figure 6  Fitness sets. (a) Similar environments. (b) Dissimilar environments. Intermediate individuals such as A and B occur only in the case of similar environments. (Each panel also shows curves with $W_1^{p} W_2^{1-p}$ constant.)
To maximize (10) we simply plot curves on which $W_1^{p} W_2^{1-p}$ is constant and note that the optimum individual occurs at the point where such a curve is tangent to the fitness set. The curve has a shape similar to that of the hyperbola $xy = c$. As $p$ varies, the curves on which (10) is constant vary in shape. When the two environments are similar, the optimum varies smoothly with $p$. In dissimilar environments, there may be a sudden jump from the specialist C (in Figure 6b) to the specialist D as $p$ increases, completely avoiding the poor generalist B. Examples of both situations occur. You should be able to think of many examples of the former, for example, variation in thickness of coat in furbearing animals with climate. Here's an example of the latter: Some species of butterflies mimic other species that are distasteful to predators. There is a species in South America that mimics different species in different parts of its range. An organism with the phenotype of a compromise mimic would be poorly protected.

Let's consider caste formation in ants. The first environment is the nest defense milieu, and the second is the nest maintenance milieu. In Figure 7 is plotted a soldier (S), a worker (W), and two possible generalists. If defense and maintenance were sufficiently different so that G is the best possible generalist, there would be evolutionary pressure toward caste formation. If G' were possible, castes would be unlikely to form. If defense were rare, evolution might lead to the castes G' and W. Note that we haven't discussed the shape of the fitness curves in connection with Figure 7. It's rather tricky; in fact, this whole subject is a bit tricky. You may want to work on it.

Figure 7  Caste formation. Soldier, S; worker, W; generalists, G and G'.

E. O. Wilson (1975, pp. 306-309) presents another approach to caste formation which we discuss briefly. It involves some simple probability theory. Suppose we have a list of the possible castes and a list of the situations (e.g., repel an attack, forage) that a colony must deal with. A colony cannot fail too often and still survive. Various castes contribute more to success in a particular situation than others do. Let $P_{ij}$ be the probability that caste $i$ will fail to deal with problem $j$. We assume that the castes contribute independently to success, so that $P_{1j} P_{2j} \cdots$ is the probability that problem $j$ will not be dealt with successfully by the colony. One way to limit failures is to require that

$$P_{1j} P_{2j} \cdots \le M_j \qquad (11)$$

for all $j$.
Clearly $P_{ij}$ depends on the number of members in caste $i$. The simplest assumption is, again, independence:

$$P_{ij} = p_{ij}^{\,n_i}, \qquad (12)$$

where $n_i$ is the number of individuals in caste $i$. If $c_i$ is the cost of producing and maintaining a member of caste $i$ averaged over the individual's lifetime, we can describe the colony's problem as follows:

Minimize: $\sum_i c_i n_i$  (13a)
Subject to: $n_i \ge 0$  (13b)
And: $\sum_i n_i \log p_{ij} \le \log M_j$.  (13c)
87
[The last expression comes from combining ( 1 1 ) and ( 1 2).J This is an example of a problem in a field in which a variety of textbooks exist. This model has several drawbacks. The major ones are probably the (highly unrealistic ?) assumptions of independence leading to ( 1 1 ) and ( 1 2). Also, the constraints in ( 1 1 ) may not be an appropriate way to define not failing too often. Some of the difficulties can perhaps be avoided by redefining terms. Others require revisions that would destroy the linearity of ( 1 3c). Can you suggest ways to improve the model ? Let's illustrate the model by considering a simple case involving only two possible castes. Introduce two axes which indicate the number of members in each caste. Constraints (1 3b) limit us to the first quadrant. Each of the constraints ( 1 3c) requires that we look above a line of slope - log p i dlog PiZ and a given intercept. Figure illustrates a possible conÂ figuration with four problems. Since + is constant on straight lines of slope picking out the point in the shaded region that produces a minimum is fairly easy to do graphically. You should be able to describe a method. Note that it is possible to obtain a solution in which not all castes actually exist ; that is, i for some This is as it should be. While these models are still quite crude, there is hope that this approach may shed light on why some species of social insects have more castes than others and why the energy of a colony is divided between castes in the way
linear programming,
8 Clnl cznz
-ci /cz,
n 0 =
i.
F i g u re 8 A linear caste formation model . Inequalities ( l 3b) and ( l 3c) hold in the shaded region.
88
B A S I C O PT I M IZAT I O N
why
that it is. (The of sociality in insects is an interesting question which is beginning to be answered. See E. O. Wilson (1975, pp. 41 5-4 1 8) for a discussion.) P R O B LE M S
1.
Two college administrators are trying to decide on an admissions policy so as to obtain the 'best' possible students for their college. They each have different ideas on how important various traits are in a good student. Can you suggest a theoretical plan for helping them? A practical one? What if three administrators are involved? Note: The time and money required for extensive testing are not available; only the administrators and their opinions are available.

2. Let's consider a bread and wine problem different from the one in the text.

(a) Suppose I am buying lunch, wine costs 10 cents per ounce, bread costs 5 cents per inch, and I have $1 to spend. If I know my indifference curves, how can I determine what to buy?

(b) Suppose the price of wine rises to 13 cents. What will happen to the amount of wine I buy? The amount of bread?
3. How do wages affect the amount of time a person works? An individual wants both income and leisure time. Hence he or she is willing, up to a point, to work longer when the hourly wage is higher. As the wage becomes higher, however, an income saturation effect occurs and the worker may wish to work somewhat less time as the wage rate increases, thereby increasing both leisure time and income. A reverse effect may occur if the wage is low, since a person often desires a certain level of income and may more readily sacrifice leisure to attain it if wages are raised slightly. Can we cut through this complexity to decide if, as an employer, it is better for you financially to offer overtime or higher wages?

(a) Using the coordinates hours per day and dollars per day, plot indifference curves for a worker. What is the shape of such a curve? What does the slope mean?

(b) For a particular hourly wage rate a straight line through the origin gives hours worked versus wages received. Why? Describe geometrically how to measure the number of hours a worker would choose to work if he or she were given the freedom to choose (e.g., a self-employed person such as a lawyer or a plumber). As the hourly rate varies, the optimum point varies. Describe and interpret the locus.
Hint:
(c) Discuss the effect of overtime.

(d) Is it better from the employer's viewpoint (i.e., maximum number of hours per employee for a given total wage) to raise wages or to raise overtime pay? Why?

(e) Instead of considering a single worker, carry out the above analysis for the entire work force potentially available to the employer.

For further discussion of this topic see K. J. Cohen and R. M. Cyert (1965, Ch. 5).

4.
Suppose you are faced with the problem of how to adjust traffic signals for rush hour traffic. What is the best way to do it? This problem is adapted from D. C. Gazis and R. B. Potts (1965). We suppose that at $t = 0$ there is no line at the signal in either direction. At the end of the problem, try to decide how important and how realistic this assumption is. Cars arrive at the signals at rates $q_N(t)$ from the north and $q_E(t)$ from the east. The signal can handle cars at a rate $k$. Let $Q_N(t)$ and $Q_E(t)$ be the integrals of $q_N$ and $q_E$ from 0 to $t$.

(a) Show that $T$, the earliest possible time the intersection can be cleared, is determined by the equation

$$Q_N(T) + Q_E(T) = kT.$$

(b) Let $f_N(t)$ be the flow of the north cars through the intersection at time $t$. Define $F_N$, $f_E$, and $F_E$ in the obvious fashion. What relationships can you discover among the four functions just defined? Interpret the area between the curves $Q_N$ and $F_N$ in terms of delay time.

(c) Show that the total delay time at the intersection is a minimum if and only if both directions are cleared simultaneously at time $T$. Determine $T$. What is the best form for $F_N$?

(d) Suppose the rush hour traffic starts to arrive earlier from the north so that $q_N(t)$ is large when $t$ is small but $q_E(t)$ is small when $t$ is small. Consider other situations, too.

(e) Discuss improvements and generalizations for the model. You need not limit yourself to graphical methods. Among the problems you could consider are flows from all four directions, lost time when signals change, unequal rates of flow (the parameter $k$) in different directions.
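The clearing-time equation in part (a) of the traffic problem usually has to be solved numerically once arrival rates are specified. The sketch below integrates two arrival-rate functions with the trapezoid rule and locates the first time at which cumulative arrivals fall back to $kT$. The exponential arrival rates, the capacity $k = 1$, and all function names are assumptions chosen only to make the example concrete.

```python
import math

def cumulative(q, t, steps=400):
    """Trapezoid approximation of the integral of q from 0 to t."""
    if t <= 0.0:
        return 0.0
    h = t / steps
    total = 0.5 * (q(0.0) + q(t))
    for i in range(1, steps):
        total += q(i * h)
    return total * h

def clearing_time(q_north, q_east, k, t_max=10.0, scan=200, tol=1e-9):
    """First T > 0 with Q_N(T) + Q_E(T) = k*T, or None if none is found.

    Scans forward until the backlog Q_N + Q_E - k*t stops being positive,
    then refines the crossing by bisection."""
    def excess(t):
        return cumulative(q_north, t) + cumulative(q_east, t) - k * t

    lo = None
    for i in range(1, scan + 1):
        t = t_max * i / scan
        if excess(t) > 0.0:
            lo = t                      # a line of cars still exists
        elif lo is not None:
            hi = t
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                if excess(mid) > 0.0:
                    lo = mid
                else:
                    hi = mid
            return 0.5 * (lo + hi)
    return None

# Hypothetical rush hour: arrivals start at twice the capacity and decay.
K = 1.0
T = clearing_time(lambda t: 1.2 * math.exp(-t),
                  lambda t: 0.8 * math.exp(-t), K)
print(T)
```

For these particular rates the total arrivals are $2(1 - e^{-T})$, so the printed value can be checked against the equation $2(1 - e^{-T}) = T$.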
5.
In industrial chemical processes, yield is frequently highly dependent on temperature and pressure, but these are limited in range by technological and economic considerations. The amount of impurities also depends on temperature and pressure. Describe a graphical approach for obtaining maximum yield when one impurity cannot exceed a certain value. Do the same for several impurities. This idea is discussed in B. Noble (1971).
6.
Consider the following model of political behavior. There are three voters, two issues, and two politicians. Suppose the positions taken on the issues can be represented as points on a plane and that the indifference curves of each voter are circles centered about the, to him, ideal position.

(a) Show that the politician who declares his positions last can ensure himself at least two of the three votes.

(b) Can you construct a more realistic model? What are its political implications? How much faith do you have in the predictions? Why?

See R. D. McKelvey (1973) for further discussion.
CHAPTER 5

BASIC PROBABILITY

Most of the models in this book are deterministic. Stochastic models are discussed here and in Chapter 10. Here we use only basic discrete probabilistic concepts, but more sophisticated concepts, such as the central limit theorem, are needed in Chapter 10. The Appendix contains a terse discussion of the probabilistic concepts required. It can serve as a refresher or as a reference for a more leisurely classroom discussion.

5.1. ANALYTICAL MODELS

Sex Preference and Sex Ratio

Some people have expressed concern about the possibility of a population markedly altering its sex ratio (number of males divided by number of females) because of preferences for children of a particular sex. This could be a real problem if intrauterine sex determination is coupled with abortion or if infanticide is practiced. To what extent can a population affect the sex ratio purely by means of birth control, including abortion which is not related to the sex of the fetus? The following discussion is based on L. A. Goodman (1961).

Let's ignore multiple births to make the analysis easier. They are sufficiently rare that the effect on the model will be quite small. We must say something about the chances that a healthy baby born to a given couple will be a girl or a boy. This may vary from couple to couple. One can give a reasonable biological argument that it does not depend on the sex of the children already born to the couple. There are data indicating that sex is slightly related to the age of the couple. Since this is not easily incorporated in a model and since it has only a slight effect, we ignore it.
The major problem is: How many children is a couple able to bear? This is a thorny problem. We ignore it completely in the following discussion and consider it briefly in Problem 1. Our assumptions can be summarized as follows:

1. There exists a probability $p_i$ that a child born to the ith couple will be male and a probability $q_i = 1 - p_i$ that it will be female. The value of $p_i$ is not a function of the sex of the other children of the couple and cannot be adjusted by the couple.
2. Each birth leads to exactly one child.
3. A couple can have as many children as desired.

In view of assumption 3, a couple can have additional children if a child should die any time after it is born. Hence we can ignore deaths in childhood and interpret $p_i$ as being the probability that a child who is born and survives through childhood is a male. After reading the following discussion, comment on the realism of the assumptions and try to determine what effect they have on the conclusions. In particular, Problem 1 asks for a discussion of a model in which assumption 3 is replaced by an upper bound on the number of children per couple.

In technical terms, the model proposed treats sexes of children born to a couple as Bernoulli trials. We wish to study the value of $\mu$, the fraction of males in the population. Let $F_i$ be a random variable equal to the number of females born to the ith couple, let $M_i$ be the number of males, and set $N_i = F_i + M_i$. Then

$$\mu = E\!\left( \frac{\sum M_i}{\sum N_i} \right) \approx \frac{\sum E(M_i)}{\sum E(N_i)}, \qquad (1)$$

where $E$ denotes expectation. Approximating the expectation of the ratios by the ratio of the expectations, as was done in (1), is quite accurate for large populations. (If you have had a course in mathematical probability theory, you might like to prove it.) In view of assumption 1, the expected fraction of boys born to the ith couple will be $p_i$. Hence $E(M_i) = p_i E(N_i)$, and there is no way a couple can change the expected fraction of boys born to it. From (1) we have

$$\mu \approx \frac{\sum p_i E(N_i)}{\sum E(N_i)}. \qquad (2)$$

It follows from (2) that the population can cause a change in $\mu$ only by introducing a correlation between $p_i$ and $E(N_i)$. When there is no sex preference, it is reasonable to assume that $E(N_i)$ and $p_i$ are uncorrelated. In this case the right side of (2) equals the average of $p_i$ over all couples.
What values of $\mu$ are possible? Since (2) is a weighted average of the $p_i$, the value of $\mu$ must lie between $\min p_i$ and $\max p_i$. Because of assumption 3, the population working as a whole can approach any value within these limits. Also, the population working individually can approximate any $\lambda$ between $\min p_i$ and $\max p_i$: A couple continues to have children as long as the fraction of males in the $n$ children they already have does not differ from $\lambda$ by more than $n^{-1/3}$. In general, the closer some $p_i$ are to $\lambda$, the closer we can expect $\mu$ to approximate $\lambda$. The choice of $n^{-1/3}$ is somewhat arbitrary. We want a function that tends to encourage couples with $p_i$ close to $\lambda$ to have many children. Since the fraction of children that are males tends to differ by something on the order of $n^{-1/2}$ for random reasons, we want a function that is large compared to $n^{-1/2}$ for large values of $n$. The function $n^{-1/3}$ is such a function.

Using (2) we argued that $\min p_i \le \mu \le \max p_i$. There is an error in this argument: (1) is an approximation that is accurate only for large populations, and so only the approximation to $\mu$ lies between $\min p_i$ and $\max p_i$. To see that $\mu$ need not lie within these limits, consider a population consisting of a single couple using the rule, 'Stop after one child if the first child is a boy, otherwise have two children.' Set $p_1 = p$ and $1 - p = q$. The possible sequences of children are M, FM, and FF, and their probabilities are $p$, $qp$, and $q^2$, respectively. Thus

$$\mu = 1 \cdot p + \tfrac{1}{2} qp + 0 \cdot q^2 = p + \tfrac{1}{2} pq > p.$$

When $p = \tfrac{1}{2}$ this equals 0.625. Now suppose there are $k$ couples all using the same rule and all with $p = \tfrac{1}{2}$. The expected sex ratio for $k$ = 1, 2, 3, 4, 5 is 0.625 (as just computed), 0.563, 0.541, 0.530, and 0.524, respectively. Thus the approach to $\tfrac{1}{2}$ is fairly slow.

The above discussion shows what can be achieved as values for $\mu$. What will be achieved if each couple independently pursues a plan based on a desire for children of a given sex? Many plans are possible. Three examples are:

1. A couple may continue to bear children until they have a child of the desired sex.
2. They may continue bearing until they have a child who is not of the desired sex.
3. Plan 2 may be modified by the requirement that there be at least one child of the desired sex.

Plans may vary from couple to couple, complicating matters tremendously. We assume that the entire population follows the same plan and that boys are desired.
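The expected fractions of males quoted earlier for $k$ couples who all follow the rule 'stop after one boy, otherwise have two children' with $p = \tfrac{1}{2}$ can be checked by direct enumeration. The sketch below is one way to do this, using exact fractions; the function and variable names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# (probability, boys, children) for one couple with p = 1/2 under the rule
# "stop after one child if it is a boy, otherwise have two children":
# sequences M, FM, and FF.
OUTCOMES = [
    (Fraction(1, 2), 1, 1),
    (Fraction(1, 4), 1, 2),
    (Fraction(1, 4), 0, 2),
]

def expected_male_fraction(k):
    """Exact expectation of (total boys) / (total children) for k couples."""
    total = Fraction(0)
    for combo in product(OUTCOMES, repeat=k):
        prob = Fraction(1)
        boys = children = 0
        for pr, m, n in combo:
            prob *= pr
            boys += m
            children += n
        total += prob * Fraction(boys, children)
    return total

for k in range(1, 6):
    print(k, round(float(expected_male_fraction(k)), 3))
```

The enumeration grows as $3^k$, so it is only practical for small $k$; for large populations the approximation (2) is the natural tool.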
If $p$ is the probability of success (or failure) in repeated Bernoulli trials, the expected waiting time until the first success (or failure) is

$$\sum_{n \ge 0} n (1 - p)^{n-1} p = -p \frac{d}{dp} \sum_{n \ge 0} (1 - p)^n = \frac{1}{p}.$$

Hence $E(N_i) = 1/p_i$ for plan 1 and $E(N_i) = 1/q_i$ for plan 2. For plan 3 we have either of the patterns boy(s)-girl or girl(s)-boy for order of birth of children. The first involves the birth of a boy and so has probability $p_i$ and an expected number of births $1 + 1/q_i$. The second case is similar. Thus

$$E(N_i) = p_i \left( 1 + \frac{1}{q_i} \right) + q_i \left( 1 + \frac{1}{p_i} \right) = \frac{1}{p_i q_i} - 1.$$

From this it is easy to compute approximations to $\mu$ by using (2). For plan 1 we obtain the harmonic mean of the $p_i$. Since the arithmetic mean exceeds the harmonic mean, this $\mu$ is less than random. This is due to the fact that high $p_i$ is correlated with low $E(N_i)$. Similarly, plan 2 leads to a higher $\mu$ than random. What happens in plan 3 depends on the distribution of the $p_i$. Up to this point we have not needed any such information about the $p_i$. This is good, because they cannot be computed. See also page 217.
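To see the three plans side by side, the approximation (2) can be evaluated for a small made-up population. In the sketch below the list of $p_i$ values is purely hypothetical (as the text notes, real values cannot be computed), and the expected family sizes are the ones derived above.

```python
def mu_from_plan(p_values, expected_children):
    """Approximation (2): sum of p_i * E(N_i) divided by sum of E(N_i)."""
    num = sum(p * expected_children(p) for p in p_values)
    den = sum(expected_children(p) for p in p_values)
    return num / den

# E(N_i) as a function of p_i for each plan, with boys desired.
PLANS = {
    "plan 1": lambda p: 1.0 / p,                       # stop at first boy
    "plan 2": lambda p: 1.0 / (1.0 - p),               # stop at first girl
    "plan 3": lambda p: 1.0 / (p * (1.0 - p)) - 1.0,   # at least one of each
}

P_VALUES = [0.45, 0.50, 0.55]   # hypothetical couples, symmetric about 1/2

results = {name: mu_from_plan(P_VALUES, rule) for name, rule in PLANS.items()}
for name, mu in results.items():
    print(name, round(mu, 5))
```

With $p_i$ values chosen symmetrically about $\tfrac{1}{2}$, plan 1 gives a value slightly below the average of the $p_i$, plan 2 a value slightly above, and plan 3 lands on the average; an asymmetric list of $p_i$ would shift the plan 3 result.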
M a k i n g S i m p l e C h o i ces
What mental processes occur (possibly subconsciously) when you make a simple decision, like choosing the longer of two lines? No one really knows, and the models in this area are plagued by oversimplification; for example, a process can be assumed to be identical from trial to trial, or a relationship can be assumed to be linear, even though these assumptions are known to be only rough approximations. The following model, while no exception, illustrates some interesting ideas. It is adapted from R. J. Audley (1960). Another problem is the existence of several equally good (or bad) models for the same situation. See R. R. Bush and F. Mosteller (1959).

We wish to model an experimental situation in which a subject is required to choose between two simple alternatives, for example, which of two nearly equal lines is longer. The alternatives are called A and B, and the correct choice is A. We assume that the subject makes a sequence of choices implicitly (either consciously or subconsciously) and that these determine the final choice. Specifically, we assume

1. There are parameters $\alpha$ and $\beta$ such that during a small time interval of length $\Delta t$ implicit choice A occurs with probability $\alpha\,\Delta t$ and implicit choice B with probability $\beta\,\Delta t$. These events are independent.
2. A final choice is made after a run of $K$ identical implicit choices, and it equals the implicit choice that was just chosen $K$ successive times.
We consider only the two simplest cases of the model: $K = 1, 2$. It would be more appropriate to treat $K$ as a parameter, but this would lead to more involved mathematics. Assumption 1 implies that the next implicit choice is A with probability $p = \alpha/(\alpha + \beta)$. Let $q = 1 - p$. It follows that the probability of a string of implicit choices consisting of $a$ A's and $b$ B's in some given order is

$$p^a q^b. \tag{3}$$

In an interval of length $\Delta t$,

$$\Pr\{\text{choice}\} = \Pr\{A \text{ or } B\} = \Pr\{A\} + \Pr\{B\} - \Pr\{A \text{ and } B\} = (\alpha + \beta)\,\Delta t - \alpha\beta(\Delta t)^2,$$

by assumption 1. This describes what is called a Poisson process with parameter $\lambda = \alpha + \beta$. The properties of such a process are well known. In particular, the mean time between implicit responses is $1/\lambda$ and the probability of exactly $n$ implicit responses during a time interval of length $t$ is

$$P_n(t) = \frac{(\lambda t)^n e^{-\lambda t}}{n!}. \tag{4}$$

The Poisson process is discussed in the Appendix. It also appears briefly in the radioactive decay example in Chapter 10. We can use $K$, $p$, and $\lambda$ as the basic parameters instead of $K$, $\alpha$, and $\beta$, because $\alpha = p\lambda$ and $\beta = (1 - p)\lambda$. All the probability distributions can be parameterized by $p$ and $K$ if they are looked at as functions of $\lambda t$ rather than of $t$. Hence $p$ and $K$ determine the shape of distributions, and $\lambda$ determines the time scale.

When $K = 1$, the subject makes only one implicit choice, and this is his final choice. The probability of a response by time $t$ is $1 - P_0(t)$, which is $1 - e^{-\lambda t}$ by (4). Audley notes that this does not agree with experimental results.

When $K = 2$, the subject alternates between A and B in his implicit choices until he finally makes two identical implicit choices. We begin by studying such strings of choices. Let $P_A(n)$ be the probability that there were exactly $n$ implicit responses and the final choice was A, and let $P_A$ be the probability that the final choice was A. The probability of an $n$-long alternating string ending in B is $(pq)^m$ if $n = 2m$ and $q(pq)^m$ if $n = 2m + 1$. (This allows for the case $n = 0$.) Hence, for $k > 0$,

$$P_A(2k) = p^2 (pq)^{k-1}, \qquad P_A(2k+1) = p(pq)^k,$$
$$P_A = p^2 \sum_m \left[(pq)^m + q(pq)^m\right] = p^2\,\frac{1 + q}{1 - pq}. \tag{5}$$
Since $P_A$ can be determined from experimental data, we have a way of estimating $p$. To estimate $\lambda$, some information involving time is required. Since means can usually be estimated fairly well from data, a mean time is a reasonable choice. Let $L_A$ be the time to final choice given that the choice is A, and let $L$ be the time to final choice regardless of whether it is A or B. As usual, we use notation like $\bar{L}$ to denote the mean of $L$, and $E(L)$ to denote the expected value of $L$. Since the mean time between implicit responses is $1/\lambda$, we have

$$E(L_A) = \frac{1}{\lambda P_A} \sum_n n P_A(n), \qquad E(L) = \frac{1}{\lambda} \sum_n n \left[P_A(n) + P_B(n)\right].$$

By using $\sum_n n x^n = x/(1 - x)^2$, it is an easy matter to evaluate these sums:

$$\sum_n n P_A(n) = \frac{p^2 (2 + 3q - pq^2)}{(1 - pq)^2},$$

and so

$$E(L_A) = \frac{2 + 3q - pq^2}{(1 + q)(1 - pq)\lambda}, \tag{6a}$$
$$E(L) = \frac{2 + pq}{(1 - pq)\lambda}. \tag{6b}$$

An interesting consequence of (6a) is that $r = E(L_A)/E(L_B)$ decreases from about 5/4 to about 4/5 as $p$ increases from 0 to 1. To see this it suffices to study

$$f(p) = \frac{2 + 3q - pq^2}{(1 + q)(1 - pq)} = \frac{2}{1 - pq} + \frac{q}{1 + q},$$

since $r = f(p)/f(q)$. You should work out the details. Another way to describe the behavior of $r$ is

The mean time to final choice is longer for the less likely choice, but it never exceeds the other mean time by more than about 25%.
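The behavior of $r$ can be checked numerically. The following Python sketch is only an illustration (the function names `f` and `ratio` are chosen here, not taken from Audley); it evaluates $f$ from (6a) on a grid of values of $p$ and forms $r = f(p)/f(q)$.

```python
def f(p):
    """lambda * E(L_A) from equation (6a), with q = 1 - p."""
    q = 1.0 - p
    return (2.0 + 3.0 * q - p * q * q) / ((1.0 + q) * (1.0 - p * q))


def ratio(p):
    """r = E(L_A) / E(L_B) = f(p) / f(q); lambda cancels."""
    return f(p) / f(1.0 - p)


if __name__ == "__main__":
    for i in range(0, 11):
        p = i / 10
        print(f"p = {p:.1f}   r = {ratio(p):.3f}")
```

The end points give $r = 2.5/2 = 1.25$ at $p = 0$ and $r = 2/2.5 = 0.8$ at $p = 1$, which is where the "about 25%" in the statement above comes from.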
We are now ready to compare the model with experimental results. Audley notes that very few suitable data are available and bases his major test of the model on the work of V. A. C. Henmon (1911). We rely exclusively on his work; see Audley's paper for further discussion. In his experiments Henmon displayed two vertical lines, one of which was slightly longer than the other. Half of the time the subject was required to choose the longer line, and half of the time, the shorter line. The subject was also asked to express a degree of confidence in the choice. The lines were displayed until a judgment was made. In a single series the subject was required to make 50 judgments. From three subjects 1000 judgments each were taken, and 500 each were taken from another seven subjects. Unfortunately, only the data from the first three subjects are presented in a fashion that makes it possible to plot the number of decisions against time to decision (Henmon's Table II), the curve that would provide the most detailed test of the model. However, $P_A$, $\bar{L}$, $\bar{L}_A$, and $\bar{L}_B$ can be determined for all subjects by using his Tables I and IV. These are presented in Table 1.

Table 1. Choice Model Parameters for 10 Subjects

| Subject | $P_A$ | $\bar{L}$ | $\bar{L}_A$ | $\bar{L}_B$ | $\bar{L}_B/\bar{L}_A$ | $E(L_A)$ | $E(L_B)$ |
|---|---|---|---|---|---|---|---|
| Bl | 0.820 | 1021 | 992 | 1154 | 1.16 | 1009 | 1079 |
| Br | 0.774 | 609 | 610 | 603 | 0.98 | 601 | 635 |
| H | 0.832 | 775 | 770 | 797 | 1.03 | 765 | 822 |
| A | 0.686 | 303 | 305 | 300 | 0.98 | 300 | 311 |
| B | 0.778 | 535 | 536 | 530 | 0.98 | 528 | 558 |
| C | 0.689 | 642 | 652 | 621 | 0.95 | 635 | 658 |
| D | 0.798 | 1044 | 1043 | 1046 | 1.00 | 1030 | 1096 |
| E | 0.778 | 1095 | 1046 | 1268 | 1.21 | 1081 | 1144 |
| F | 0.696 | 583 | 606 | 531 | 0.87 | 577 | 598 |
| G | 0.742 | 909 | 899 | 938 | 1.04 | 898 | 941 |

Note: Times are given in milliseconds.
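The last two columns of Table 1 can be approximated from the first two by inverting (5) for $p$ and then using (6b) for $\lambda$. The sketch below is one way to do this and is not necessarily the procedure Audley used: it assumes $K = 2$, solves (5) by bisection (an assumption made here, relying on $P_A$ increasing with $p$), and the names `prob_final_a`, `f`, and `fit_subject` are hypothetical.

```python
def prob_final_a(p):
    """P_A from equation (5) for K = 2."""
    q = 1.0 - p
    return p * p * (1.0 + q) / (1.0 - p * q)


def f(p):
    """lambda * E(L_A) from equation (6a)."""
    q = 1.0 - p
    return (2.0 + 3.0 * q - p * q * q) / ((1.0 + q) * (1.0 - p * q))


def fit_subject(P_A, L_mean):
    """Estimate p and lambda from observed P_A and mean time L_mean,
    then return (p, lambda, E(L_A), E(L_B))."""
    lo, hi = 0.0, 1.0
    for _ in range(60):                      # bisection on equation (5)
        mid = 0.5 * (lo + hi)
        if prob_final_a(mid) < P_A:
            lo = mid
        else:
            hi = mid
    p = 0.5 * (lo + hi)
    q = 1.0 - p
    lam = (2.0 + p * q) / ((1.0 - p * q) * L_mean)   # equation (6b)
    return p, lam, f(p) / lam, f(q) / lam


if __name__ == "__main__":
    p, lam, e_la, e_lb = fit_subject(0.820, 1021)    # subject Bl
    print(f"p = {p:.3f}, lambda = {lam:.5f} per ms")
    print(f"E(L_A) = {e_la:.0f} ms, E(L_B) = {e_lb:.0f} ms")
```

For subject Bl this gives values within a millisecond or so of the tabulated 1009 and 1079; small differences are to be expected because the inputs in Table 1 are rounded.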
We have taken A to be right and B to be wrong. Note that the value of $\bar{L}_B/\bar{L}_A$ for some of the subjects is less than 1, a contradiction to the theory. Some of the ratios are so close to 1 that the deviation is not significant, but the ratio for subject F is extremely low. Perhaps it can be explained by assuming that the value of $K$ varied from series to series. You are asked to discuss this idea in the problems. After using $P_A$ and $\bar{L}$ to estimate $p$ and $\lambda$ using (5) and (6b), the values of $E(L_A)$ and $E(L_B)$ were computed by (6a) and its analog for $E(L_B)$.

Audley has fitted curves to the more detailed data (Henmon's Table II) for subjects Bl and Br. To do this he introduced a third parameter: a short time lag during which the subject prepares to make implicit decisions. It is then necessary to ignore the decisions made before the lag, because they occur before the subject is 'ready.' Without a time lag the fit is poor, but with a lag of 0.40 seconds for Bl and 0.34 seconds for Br the fit is good. I have not been able to obtain as good a fit for H as can be obtained for Br and Bl. Since there are so few data for each subject (four numbers), I think that the poor fit of the model is a sign of serious deficiencies; however, I'm not able to suggest a better model.

A related model has been proposed by Estes and Bower and extended by W. Kintsch (1963) to include a Poisson process for implicit response times. Assume there are five states: S, iA, iB, fA, and fB, standing for starting, implicit A and B, and final A and B. The subject makes a decision to move from one state to another. The possibilities are

    S → iA,   S → iB,   iA ⇄ iB,   iA → fA,   iB → fB.

Show that, if the probabilities of S → iA, iA → fA, and iB → iA are all equal, this reduces to Audley's model. Kintsch discusses primarily the case in which the probabilities of iA → fA and iB → fB are equal.

One problem that neither of these models deals with is the possibility of unconscious bias of the subject toward the right line or the left line. Another is the possibility that it is harder to choose the smaller than the larger, or vice versa. Either of these could lead to a mixing of models with different parameter values. Furthermore, data from different sessions with the same subject may have different parameter values. Any mixing like this could give rise to problems in fitting the data. Henmon's tabulations make it impossible to check all this out; however, he does note that there is a slight difference in reactions to the shorter line versus reactions to the longer.
PROBLEMS
1. Discuss the sex preference model when each couple can have no more than C children.

2. In this problem you'll consider ways of adapting Audley's model to fit Henmon's data more accurately. If you become very involved in this,
it would be a good idea to read Henmon's paper. Henmon obtained the following data for subjects Bl, Br, and H. He asked them to express a degree of confidence in their. choice ranging from a (perfectly confident) to d (doubtful). Confidence in choice
Subject Bl
0.966 0.841 0.653 0.480
a b c d
753 1 045 131 1 1612
Subject H
Subject Br
557 0.9 5 1 560 987 0.944 596 1 205 0.836 635 1499 0.6 1 5 624
(a)
574 1 669 0.972 606 0.853 596 0.563
638 722 789 850
699 777 814
The simplest modifications o f Audley's model may b e either to choose a different fixed value for or to allow p, or K to vary while the other two are fixed. What do you think of this idea (before we actually examine it against the data) ? (b) Argue that, if p, and are all fixed, the accuracy of a decision depends only weakly on the speed with which it is made. How does this fit with the data ? A decision corresponds to a mixture of A's and B's followed by identical symbols (either A or B). is approximately and that is an inÂ ) Argue that creasing function of and a decreasing function of p. (d ) Show that, if p and are fixed and is variable, longer decision times are associated with greater accuracy. What if only varies ? Only p ? Which of these predictions seem reasonable in view of the data ? Why ? Can you propose a specific model which can be tested against Henmon's data ? If you could have helped Henmon design his experiments, what would you have suggested he do differently in the actual running of the eKperiment and in the compilation of the data ?
K
K Hint: K PA/PB K A
A,
A,
(c
(P/ql - l K
L
A
(e)
3. Develop the model of Kintsch, Estes, and Bower mentioned on page 98 with the equality assumption made by Kintsch. Compare the model with the data given above and in the text. Compare it with Audley's model with $K = 2$. Which seems to be better? Why? Can you suggest additional experiments that would be useful in testing the models?

4. Many colleges and universities are faced with a problem regarding tenured positions. To attract a good, young faculty, the prospects for tenure must be high, but to allow for adaptation, the percentage of tenured positions should not be too high. What is the best strategy? The following material is adapted from an article by J. G. Kemeny (1973). For our purposes let us distinguish three positions:

1, assistant professor (first appointment).
2, assistant professor (second appointment).
t, tenure.

Positions 1 and 2 each normally last for 3 years, and position t lasts for an average of about 30 years. Since these times are multiples of 3 years, we will take 3 years as the time unit. Let $p_1$ denote the probability of going from 1 to 2, $p_2$ the probability of going from 2 to t (given that the step from 1 to 2 has been made), and $q_t$ the probability of leaving a tenured position (death, retirement, move to another institution) during a 3 year interval.

(a) Show that the probability of achieving tenure is $r = p_1 p_2$.
(b) Show that the fraction of faculty that has tenure in an equilibrium (i.e., steady state) situation is
$$\rho = \frac{p_1 p_2}{q_t(1 + p_1) + p_1 p_2}.$$
Hint: Let $x$, $y$, and $z$ be the number of faculty in positions 1, 2, and t, respectively. Show that $E(y) = p_1 E(x)$ and $E(z) = (1 - q_t)E(z) + p_2 E(y)$.
(c) Conclude that, when $r$ is fixed, $\rho$ is a minimum when $p_1 = 1$. Interpret this as a policy proposal.
(d) Kemeny estimates that $q_t$ is roughly 0.15. Tabulate $\rho$ versus $r$ for $p_1 = 1$. How sensitive is the tabulation to variations in $p_1$? Comment on the proposal in (c) in light of this.
(e) Incorporate appointments to the tenure level from outside and resignations from the assistant professor levels in the model. Hint: Look at flows of people as suggested in (b).
(f) Discuss the model. Is it realistic? Have important psychological factors been neglected? What psychological effect is the proposal in (c) likely to have on assistant professors? What would you recommend? Why?
5. In Section 3.2 the nuclear missile arms race was discussed qualitatively. This problem and the next one deal with a simple quantitative model discussed by T. L. Saaty (1968, pp. 22-25) and R. H. Kupperman and H. A. Smith (1972). See also K. Tsipis (1975). Suppose a country has $M$ missiles which are being attacked by $W$ warheads, each of which has a probability $p$ of destroying the missile it is attacking. Suppose further that the behavior of the warheads is independent.

(a) Show that, if the $i$th missile is attacked by $w_i$ warheads, then $W = \sum_i w_i$ and the expected number of surviving missiles is
$$S = \sum_{i=1}^{M} (1 - p)^{w_i}.$$
(b) Show that the above expression is a minimum when the values of $w_i$ are as nearly equal as possible. Interpret this in terms of strategy. Conclude that
$$S = M\left\{\frac{W}{M}\right\}(1 - p)^{1 + [W/M]} + M\left(1 - \left\{\frac{W}{M}\right\}\right)(1 - p)^{[W/M]} = M(1 - p)^{[W/M]}\left(1 - p\left\{\frac{W}{M}\right\}\right) \approx M(1 - p)^{W/M},$$
where $[x]$ is the largest integer not exceeding $x$ and $\{x\} = x - [x]$ is the fractional part of $x$.
(c) Why is the variance of the expected value $S$ important? Can you say anything useful about the value of the variance? With additional assumptions?

6.
In the following discussion, use the results of the previous problem. To make the discussion uniform, assume that a retaliatory force of $S = 100$ surviving missiles is desired and that $p = 0.5$.

(a) Suppose there are two equal countries (so $W = M$). Determine the minimum $M$ required for stability.
(b) Suppose ABMs are installed to protect the defender's missiles. Why will this lead to a decrease in $p$? Plot $M$ as a function of $p \le 0.5$. Discuss policy implications. Don't forget to take into account the limitations of the model. What if the attacker has ABMs that can protect its cities? (Consider $S$.)
(c) Suppose both countries introduce MIRVs with $t$ warheads per missile. Discuss modifications in the formula for $S$ and the desired value for $S$. It is fairly reasonable to assume that $p$ is directly proportional to the cube root of the strength of the warhead and that this is proportional to the weight. It follows that $p(t) \approx p/t^{1/3}$. (Why?)
(d) Suppose there are three equal nuclear superpowers and each wishes to have a retaliatory force survive a coordinated attack by the other two powers. Discuss.

7. Have you ever noticed how children at a playground or people at a party form groups of various sizes? What sort of patterns are present?
This problem deals with the equilibrium size distribution of freely forming groups and was adapted from J. S. Coleman and J. James We assume that there is a collection of people who are free to join in groups as they choose. Examples are pedestrians, children playing, and shoppers. We wish to explain the size distribution of the groups. Five sets of data are given in the accompanying table. The fi rst column
(1961).
I
III
II
IV
V
1 1486 316 306 .305 276 2 694 1 4 1 132 1 44 229 3 195 44 47 50 61 4 37 5 102 25 12 3 5 10 4 6 1 0 0 1 0 indicates the size of the group, and the remaining five columns refer to the five different groups observed by James. Data set I refers to pedestrians, data set II to shoppers, data sets III and IV to children at playgrounds, and data set V to people on a beach. The entries in the ith row are the number of groups of size i in each of the five samples. (a )
(b)
N G
G
Let be the total number of people present, the total number of groups, and the number of groups with exactly i members. Show that = I and = I Suppose that in a very small time interval of length M single people (i.e., groups of size 1) join groups with probability per person, that the group joined is chosen at random, and that people leave groups and become single with probability M per person. Assume that people act independently of each other (in the probability theory sense of ' independent '). Show that the expected net flow rate of groups from the collection of groups of size i to the collection of groups of size i is because groups of size i break up and groups of size i grow. Show that this must be zero at equilibrium, that is, although flow occurs, the flow is zero. = Show that at Let = Interpret and show that Using this and = conclude equilibrium = (This is called a truncated where = that = i ( Poisson.) Note that only the ratio is important, rather than the actual values of a and Would this be true if we were concerned with a non equilibrium situation ? Why ? We need a formula for in terms of the data. Show that = and use this to fit the model to the five examples given
Gi Gi N
iGiÂ·
a L'lt
fJ
net
+1
+1
fJ(i + l)Gi + 1 - aGI(GJG)
PiI Pi 1. Pi GJG. i I di! Pi 1, (p ajfJ) P Pi Aij Pi! eA - I1), A P I ajfJ. ajfJ fJ. A (d) NjG Aj(l - e - A)
(c)
M O N T E C A R LO S I M U LATI O N
1 03
above. How good is the fit ? (If you are familiar with the chi-square test, you may wish to use it.) Another way to fit the model is to estimate A using A = for example, A = Is this a better idea than that in (d) ? A worse idea ? Why ? (f) Suggest further tests of the model besides the simple fitting of the data that you have done. Criticize the model. Can you justify proposing a model more complicated than the one developed here on the basis of the data ? Why ? Develop an alternate model by replacing ' the group joined is chosen at random ' in with ' the person associated with is chosen at random.' Introduce and A = Show that = = 1 - A, and = (A log (1 - A). Which model provides a better fit to the data ?
(e)
(i + l)Pi + dpi;
2P 2/Pj.
(g)
qi qjAi - qj
(b)
qi iGjN qj rt./{3. G/N - l)/A =
You may wish to look at J. E. Cohen (1971).
5.2. MONTE CARLO SIMULATION
When a probabilistic model cannot be analyzed analytically, Monte Carlo simulation is often used. The basic idea is to construct a deterministic model based on the probabilistic one by choosing particular values for the random variables according to the assumed distributions for them. Many such models are constructed, and statistical information is collected about the various dependent variables. This information is used to estimate parameters of the distributions of the dependent variables. If you don't have access to a computer, that's no reason to skip this section.

For example, suppose a 'fair' coin is tossed 100 times. How many heads can we expect? The following is an algorithm for a Monte Carlo simulation of this problem.

1. Input N, the number of trials. Carry out steps 2 thru 4 N times.
2. Set HEADS to 0. Carry out step 3 100 times.
3. Choose X such that Pr{X = 0} = Pr{X = 1} = 1/2. Set HEADS to HEADS + X.
4. Record the value of HEADS.
5. Analyze the data collected.

For this illustration, the analysis in step 5 will consist of determining the mean and variance of the number HEADS.
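The five steps translate almost line for line into a short program. The following Python sketch is an illustration only: the function name `coin_toss_trials`, the fixed seed, and the use of the sample variance (`statistics.variance`) are choices made here, since the text does not say which variance estimator was used.

```python
import random
import statistics


def coin_toss_trials(n_trials, tosses=100, rng=None):
    """Steps 1-4: return the recorded value of HEADS for each trial."""
    rng = rng or random.Random()
    recorded = []
    for _ in range(n_trials):            # step 1: repeat steps 2-4 N times
        heads = 0                        # step 2: set HEADS to 0
        for _ in range(tosses):
            x = rng.randint(0, 1)        # step 3: Pr{X=0} = Pr{X=1} = 1/2
            heads += x
        recorded.append(heads)           # step 4: record HEADS
    return recorded


if __name__ == "__main__":
    results = coin_toss_trials(1000, rng=random.Random(12345))
    # step 5: analyze the data collected
    print("mean     =", statistics.mean(results))
    print("variance =", statistics.variance(results))
```

With a large number of trials the printed values should be near the theoretical mean 50 and variance 25; with only 10 trials they can wander considerably, the variance especially.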
I ran the algorithm on a computer three times each for N = 10, 100, and 1000. The values of the mean and variance were:

| N | Mean | Variance | Mean | Variance | Mean | Variance |
|---|---|---|---|---|---|---|
| 10 | 51.2 | 14.8 | 49.1 | 14.5 | 49.6 | 38.2 |
| 100 | 49.7 | 26.2 | 49.6 | 21.5 | 49.0 | 21.4 |
| 1000 | 49.7 | 25.5 | 50.0 | 23.4 | 49.5 | 26.0 |
Note the greater variability in the estimates for the mean and variance when $N$ is small. The theoretical values of the mean and variance are exactly 50 and 25. How accurate are the estimates of the parameters of a distribution? Answering this question and obtaining more accurate estimates without an excessive number of trials are major problems in Monte Carlo simulation, but we only touch on them here. Given $\varepsilon$ and $\delta$ greater than zero, we can obtain an estimate $\bar{S}$ of the parameter $S$ such that

$$\Pr\{|\bar{S} - S| > \delta\} < \varepsilon,$$

provided the number of trials $N$ is sufficiently large. Determination of $N$ before simulation is usually very difficult; however, post hoc estimates can be made as follows. Assume that, when several estimates of $S$ are obtained by simulation, they are drawn from a normal distribution with mean $S$. (This is probably not true, but often it is not too unrealistic.) If $m$ estimates $S_j$ have been obtained, $\bar{S} = \sum S_j/m$ is an estimate of $S$, and the variance of the estimate is given by

$$\sigma^2 = \frac{\sum (S_j - \bar{S})^2}{m(m - 1)}.$$
If we apply this to the coin tossing problem we obtain the following estimates, the first $\bar{S}$-$\sigma$ pair referring to the mean and the latter referring to the variance. The value of $m$ is 3.

| N | Mean: $\bar{S}$ | Mean: $\sigma$ | Variance: $\bar{S}$ | Variance: $\sigma$ |
|---|---|---|---|---|
| 10 | 50.0 | 0.6 | 22.5 | 7.9 |
| 100 | 49.4 | 0.3 | 23.0 | 1.6 |
| 1000 | 49.7 | 0.2 | 25.0 | 0.8 |
| True | 50 | | 25 | |
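The post hoc estimate is a two-line computation. The Python sketch below (the name `pooled_estimate` is chosen here for illustration) applies the formulas for $\bar{S}$ and $\sigma$ to the three N = 10 runs reported earlier for the mean (51.2, 49.1, 49.6) and for the variance (14.8, 14.5, 38.2).

```python
import math


def pooled_estimate(estimates):
    """Return (S_bar, sigma) for m independent estimates S_j, where
    sigma^2 = sum((S_j - S_bar)^2) / (m (m - 1))."""
    m = len(estimates)
    s_bar = sum(estimates) / m
    sigma_sq = sum((s - s_bar) ** 2 for s in estimates) / (m * (m - 1))
    return s_bar, math.sqrt(sigma_sq)


if __name__ == "__main__":
    mean_est, mean_sigma = pooled_estimate([51.2, 49.1, 49.6])
    var_est, var_sigma = pooled_estimate([14.8, 14.5, 38.2])
    print(f"mean:     S_bar = {mean_est:.1f}, sigma = {mean_sigma:.1f}")
    print(f"variance: S_bar = {var_est:.1f}, sigma = {var_sigma:.1f}")
```

Rounded to one decimal place, these inputs give the N = 10 row of the table: 50.0 and 0.6 for the mean, 22.5 and 7.9 for the variance.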
The estimate for the mean happens to be the most accurate when $N = 10$. This is just chance; the best estimate we can give is 49.7. In addition to these ideas for measuring the accuracy of estimates, there is a theoretical result which can be used to obtain an idea of how many more trials we'll need: After $N$ trials, the error in the estimate of a parameter is often roughly proportional to $1/\sqrt{N}$.

How can we generate the random choice required in step 3 of the coin tossing algorithm? Since a computer is (hopefully) a deterministic device, we cannot actually generate random numbers. However, almost every computer center has a subroutine which can produce a number between 0 and 1 each time it is called, and it does so in such a way that the entire sequence appears to have been sampled from the interval [0, 1) using a uniform distribution. If a computer is not available, a table of random digits can be used: Simply start somewhere in the table, write a decimal point, and copy after it the next few digits in the table. This gives a random number drawn from the uniform distribution on [0, 1). A brief table of random digits appears at the end of this chapter.

Using uniformly distributed random variables, one can generate random variables according to any distribution. For example, if $X$ is distributed uniformly on (0, 1), the largest integer in $kX$ is distributed uniformly on the set $\{0, 1, 2, \ldots, k - 1\}$. In general, if $F$ is a distribution function, $F^{-1}(X)$ is a random variable with distribution function $F$. Since a table of $F^{-1}$ can be constructed ahead of time, it is a relatively easy matter to choose random variables with the distribution function $F$. These ideas are discussed more fully in Section A.6. Here I'll content myself with two simple examples.

The exponential distribution is given by $\Pr\{T > t\} = e^{-kt}$ for $t \ge 0$. Suppose $k = 2$. Then $F(t) = 1 - e^{-2t}$, and so $F^{-1}(x) = -\tfrac{1}{2}\log(1 - x)$. We generate five random values of $T$ by using three-digit numbers from the table at the end of this chapter, starting with the first entry in the table:

| X (table entry) | 0.554 | 0.218 | 0.826 | 0.340 | 0.244 |
|---|---|---|---|---|---|
| T (exponential) | 0.404 | 0.123 | 0.874 | 0.201 | 0.140 |
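The inverse-function method is easy to try on a computer. The following Python sketch is an illustration with function names chosen here; it applies $F^{-1}(x) = -\log(1 - x)/k$ to uniform values for the exponential case and takes the largest integer in $kx$ for the discrete uniform case.

```python
import math
import random


def exponential_from_uniform(x, k=2.0):
    """Invert F(t) = 1 - exp(-k t): return t = -log(1 - x) / k for x in [0, 1)."""
    return -math.log(1.0 - x) / k


def discrete_uniform_from_uniform(x, k):
    """Largest integer in k*x, uniform on {0, 1, ..., k-1} for x in [0, 1)."""
    return int(k * x)


if __name__ == "__main__":
    # A value read from a random digit table, as in the text
    print(round(exponential_from_uniform(0.554), 3))
    # Values from the computer's own uniform generator on [0, 1)
    rng = random.Random(2024)
    print([round(exponential_from_uniform(rng.random()), 3) for _ in range(5)])
    print([discrete_uniform_from_uniform(rng.random(), 6) for _ in range(5)])
```

Because `random.Random.random()` returns values in [0, 1), the argument of the logarithm is never zero and `int(k * x)` never reaches `k`.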
Let's look at the uniform distribution on $\{0, 1, \ldots, k - 1\}$. In this case

$$F(t) = \begin{cases} 0 & \text{if } t < 0, \\ ([t] + 1)/k & \text{if } 0 \le t \le k - 1, \\ 1 & \text{if } t > k - 1, \end{cases}$$

where $[y]$ is the largest integer in $y$. Hence $F^{-1}(x) = [kx]$, as mentioned earlier. (There is a slight error in the definition of $F^{-1}$ at points $x$ for which $kx$ is an integer. Theoretically this is irrelevant, since these values occur with zero probability. Practically, the formula is correct because the uniform distribution comes from [0, 1) instead of [0, 1].)

A Doctor's Waiting Room
You've probably experienced a long wait for a doctor. Why does this happen? This problem is simple enough that a fairly realistic model can be analyzed theoretically using techniques of queuing theory. I plan to take advantage of the simplicity of the problem to work through a Monte Carlo simulation by hand, using the table of random numbers at the end of this chapter.

On a normal day, Dr. Smock has his receptionist schedule one patient every 10 minutes from 9:30 A.M. to 11:50 A.M. and from 1:10 P.M. to 4:00 P.M., except that two patients are scheduled for 9:30 A.M. and no patients are scheduled for 10:40 A.M. or 2:40 P.M. Starting at 9:30 A.M. he works until all the morning patients have been treated, takes a lunch break, and then works until all the afternoon patients have been treated. Subject to the limitation that his lunch break is always at least 45 minutes, he sees the first afternoon patient at 1:10 P.M. or as soon afterward as possible. One week Dr. Smock's nurse was asked to time the patients' visits. She divided them into 'short,' 'medium,' and 'long,' according to the doctor's directions, and collected the data shown below.

| Visit | Time Range (minutes) | Average Length (minutes) | Percentage of Total Visits |
|---|---|---|---|
| Short | 3-7 | 5 | 38 |
| Medium | 7-15 | 11 | 47 |
| Long | 16-30 | 20 | 15 |

She also noted that the doctor spent 1 minute between patients and took 10 minute coffee breaks at 10:40 A.M. and 2:40 P.M., or as soon after these times as there was a break between patients. The receptionist observed that about 10% of the appointment times were not filled because of late cancellations and patients who failed to appear. Unfortunately, she did not notice if there was any bias toward certain times of day. No information is available on late arrivals, but the receptionist thought that patients usually arrived on time. That's the data we have to work with. Suppose we could have designed the data collection ourselves. What would you have asked for?

Before setting up the model it is interesting to note that according to the table Dr. Smock spends an average of 10 minutes with each patient he sees. Allowing for the 1 minute between patients and the 10% unfilled appointments, this works out to a full day for the doctor on the average.
Now we need to model the waiting room somehow. Various possibilities exist for modeling the amount of time a patient spends with the doctor. One of the simplest is to limit all visits to 5, 11, or 20 minutes each. Suggest others. I am going to use the following simulation and repeat it several times to generate data for several typical days. Criticize it and suggest improvements.

1. For each of the patient arrival times during the day, choose a random digit. If the digit equals zero, the patient doesn't arrive.
2. For each patient that arrives, choose a two-digit random number. If the number is at most 37, Dr. Smock sees the patient for 5 minutes. If the number lies between 38 and 84 inclusive, he sees the patient for 11 minutes. Otherwise he sees the patient for 20 minutes.
3. Using the results of the two previous steps and information about Dr. Smock's behavior we can put the doctor's day together.
I used the following method to model a day. On a sheet of paper for the day I had one row for each patient slot and five columns labeled 'time in,' 'empty,' 'type,' 'see Dr.,' and 'time out.' The 'time in' column was filled with the various times allowed for appointments, namely, 9:30, 9:30, 9:40, 9:50, . . . , 4:00. I then filled in the next column by reading a random digit from the table starting at the beginning of line 01 and using step 1. As a result, the 11:20, 1:20, 2:50, and 3:00 slots were empty. I then read the table two digits at a time to carry out step 2 for slots that were not empty. I obtained the following sequence of visits (short, medium, long, and - for empty): smsmmsmlsms-sml, lunch, m-lmssmsm--msmsss. As a result, the first 9:30 patient saw the doctor from 9:30 to 9:35, and the second saw him from 9:36 to 9:47, giving the 9:40 patient a brief wait. Continuing in this fashion to fill out the last columns, I found that the 10:50 patient didn't see the doctor until 11:07 because the doctor was running late and didn't have his coffee break until 10:56. As a result there were two patients in the waiting room very briefly at 11:10. The 11:20 cancellation allowed the doctor to catch up and even have a 3 minute break at 11:36. The afternoon was slightly slower, and the occurrence of two cancellations right after coffee break time allowed the break to run for 25 minutes.

To obtain some idea of how typical this was, I decided to model a second day. I picked up in the table of random numbers at the point I had left off at the end of the first day: the twenty-fourth entry on line 03. Although there was only one morning cancellation, things were a bit slow because of a large number of short visits. The afternoon was busier, with two patients in the waiting room twice, once for a quarter of an hour when the 3:40 patient arrived.
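The hand bookkeeping for the morning can be mimicked in a few lines of Python. This is a sketch under assumptions that are read off the worked day rather than stated as rules: the coffee break is taken as soon as the doctor is between patients at or after 10:40 even if someone is waiting, and the 1 minute gap also follows the break. Lunch and the afternoon are left out, and all names (`draw_visits`, `simulate_session`, `MORNING`) are hypothetical. Times are minutes after midnight.

```python
import random

GAP = 1                                   # minutes between patients
COFFEE_LEN = 10                           # length of a coffee break
DURATION = {"s": 5, "m": 11, "l": 20}     # visit lengths from step 2

# 9:30 twice, then every 10 minutes to 11:50, skipping 10:40
MORNING = [570, 570] + [t for t in range(580, 720, 10) if t != 640]


def draw_visits(n_slots, rng):
    """Steps 1 and 2: '-' for an empty slot, else 's', 'm' or 'l'."""
    visits = []
    for _ in range(n_slots):
        if rng.randrange(10) == 0:        # random digit equals zero
            visits.append("-")
            continue
        number = rng.randrange(100)       # two-digit number 00..99
        visits.append("s" if number <= 37 else "m" if number <= 84 else "l")
    return "".join(visits)


def simulate_session(arrivals, visits, coffee_at):
    """Step 3 for one session: return (arrival, start, end) per patient seen."""
    idle_from = ready = arrivals[0]
    coffee_taken = False
    log = []
    for arrive, kind in zip(arrivals, visits):
        if kind == "-":
            continue
        start = max(ready, arrive)
        if not coffee_taken and coffee_at <= start:
            # Assumed rule: break begins once the doctor is free at/after coffee_at
            break_start = max(idle_from, coffee_at)
            idle_from = break_start + COFFEE_LEN
            ready = idle_from + GAP
            start = max(ready, arrive)
            coffee_taken = True
        end = start + DURATION[kind]
        log.append((arrive, start, end))
        idle_from, ready = end, end + GAP
    return log


def hhmm(t):
    return f"{t // 60}:{t % 60:02d}"


if __name__ == "__main__":
    # The morning sequence obtained by hand in the text
    for arrive, start, end in simulate_session(MORNING, "smsmmsmlsms-sml", 640):
        print(hhmm(arrive), hhmm(start), hhmm(end), "wait", start - arrive)
    # A fresh random morning
    print(draw_visits(len(MORNING), random.Random(7)))
```

Run on the hand-drawn morning sequence, this reproduces the times quoted above: the second 9:30 patient is seen from 9:36 to 9:47, the long visit ends at 10:56, and the 10:50 patient is first seen at 11:07.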
You may wish to model additional days and compare them with these two. If you think the waiting room tends to be rather empty and that the doctor would not like to have stretches during which he must wait for patients to arrive, you might like to adjust things by changing the scheduling. Scheduling two patients at times like 9:30 (already done), 9:40, 1:10, and 1:20 tends to build up a queue in the waiting room so that cancellations will not leave Dr. Smock at loose ends. You may wish to let the doctor work longer hours (an average of 30 minutes) to handle three extra patients, or you may wish to drop some appointments to make up for the additional ones, for example, 11:50, 3:50, and 4:00.

Sediment Volume
What happens when suspended particles settle? Do they attract each other? Slide after contact? It turns out that these things affect the density of the sediment. Thus we can obtain information about settling in an indirect fashion by studying the sediment's density. But how can such measurements be interpreted? We need a method for computing the density of the sediment under various assumptions. That's the purpose of this model. We are interested in the fraction of volume occupied in a typical portion of the sediment, and we avoid the surface of the sediment where the fraction of volume occupied is not a well-defined concept. This model is adapted from M. J. Vold (1959, 1959a).

For simplicity we assume that the particles in suspension are all spheres of the same size. Clearly the volume depends on whether the particles attract each other, cohere on contact, slide on contact, or repel each other. The last case can be eliminated, since we are assuming that the suspension settles. Which of the other cases occur? If attraction or sliding takes place, to what extent does it occur?

We cannot simulate the behavior of the entire suspension at once, but we can simulate the particles sequentially. Thus we can imagine a sediment into which we let particles settle one at a time. This may be a reasonable assumption if the suspension is fairly dilute. Discuss. Another problem we encounter is that in the real situation there are many more particles than we can possibly hope to include in the model. Although the model will have many fewer particles than a real life situation, it must have enough to avoid large random fluctuations and to avoid 'edge effects' due to the bottom and sides of the container. After we propose the model, discuss whether you think there are enough particles. Can the question be answered by computation instead of on heuristic or philosophical grounds? We treat the case of attraction and cohesion and leave sliding as a problem.
Since the nature of the attractive force isn't specified, let's assume that it is zero when the distance between the centers of the particles exceeds $\lambda r$, $r$ being the radius of a particle, and that it is infinite when the distance is less than $\lambda r$ and no other particle is closer to the settling particle. This is an unrealistic assumption, but it makes the modeling much easier and should give some measure of the attractive force. Discuss the effects of this crude assumption. When $\lambda = 2$, the model reduces to the case of cohesion. Why? The Monte Carlo simulation proceeds as follows.
1. Choose a size and shape of cylindrical container, the radius $r$ of the particles, and the number of particles. Repeat step 2 once for each particle.
2. Select a random point on the upper surface of the container and simulate a particle settling from this point until it comes to rest against another particle or on the bottom of the container. Record its location.
3. Gather the desired statistics.

The container is chosen to be cylindrical for simplicity. Since we are not interested in overflow, the container is chosen to be arbitrarily deep. We can easily set $r = 1$ and simply adjust the size of the container. Steps 2 and 3 require further explanation.

The easiest way to keep track of a particle is probably by the location of its center, say with three coordinates $(x, y, z)$, where $x$ and $y$ are in the horizontal plane and $z$ increases upward. The point $(x_0, y_0)$ at which a particle is dropped should be chosen randomly (this is the Monte Carlo part) by using uniform distributions on $x$ and $y$. When a new particle is dropped, it ends up at a position $(x', y', z')$ determined as follows. For each previous particle $(x, y, z)$ with

$$(x - x_0)^2 + (y - y_0)^2 \le (\lambda r)^2,$$

find $z_0$ such that

$$(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 = (\lambda r)^2$$

and choose the particle at $(x, y, z)$ that gives a maximum $z_0$. Then $(x', y', z')$ is the point on the line segment joining $(x, y, z)$ and $(x_0, y_0, z_0)$ that is a distance $2r$ from $(x, y, z)$. If no particle is ever close enough, the new particle will settle to the bottom. You should convince yourself that this is correct.

The statistics we gather in step 3 will be the fraction of volume occupied by the particles. Since the upper surface of the sediment is not level, we take a cross section of the sediment well below the surface. This requires some numerical experimentation. I chose particles of radius 1 in a container of such a shape that $(x, y)$ for the centers of the particles would be in a square of side 14. Thus the cross-sectional area of the container is $A = 16^2 - 4 + \pi$, and the fraction of the volume occupied by $n$ spheres in a section of container of height $h$ is $4\pi n/3Ah$. With 300 particles and $\lambda = 2$, I found that the volume fraction for $h$ = 10, 15, 20, 25, 30, 40, and 50 was 0.163, 0.151, 0.147, 0.152, 0.150, 0.123, and 0.099, respectively. Hence it seemed reasonable to assume that $h = 20$ was well below surface effects but still large enough so that the volume fraction would not be much influenced by the flat bottom. I then made three runs each for various values of $\lambda$. Quite a while after these computations were done, N. P. Herzberg suggested that the possibility of difficulties with the bottom is indicated by the volume fraction for $h = 10$ and that these could be avoided by taking a slice between, say, $h = 10$ and $h = 25$. Since the old program was gone, I decided to leave things as they were.
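The settling rule can be sketched in a few lines of Python. This is only an illustrative sketch, not the program used for the runs reported here (which no longer exists). The function names, the choice of $z = 0$ for the flat bottom, the default square of side 14 for the drop points, and the convention of counting a sphere in the slice when its center lies at or below height $h$ are all assumptions made for the example.

```python
import math
import random


def settle(particles, x0, y0, lam=2.0, r=1.0):
    """Resting centre of a sphere dropped down the vertical line through (x0, y0).

    particles: list of (x, y, z) centres already in the sediment.
    lam:       attraction range in units of r (the text's lambda, lam >= 2).
    """
    reach2 = (lam * r) ** 2
    best = None  # (z0, captor) with the highest capture height z0 found so far
    for (x, y, z) in particles:
        d2 = (x - x0) ** 2 + (y - y0) ** 2
        if d2 <= reach2:
            # Height at which the falling centre first comes within lam*r.
            z0 = z + math.sqrt(reach2 - d2)
            if best is None or z0 > best[0]:
                best = (z0, (x, y, z))
    if best is None:
        return (x0, y0, r)  # assumption: the flat bottom is the plane z = 0
    z0, (x, y, z) = best
    t = 2.0 / lam  # move to distance 2r from the captor: 2r / (lam * r)
    return (x + t * (x0 - x), y + t * (y0 - y), z + t * (z0 - z))


def simulate(n_particles, lam=2.0, side=14.0, r=1.0, seed=None):
    """Drop n_particles one at a time at uniform random points of a square."""
    rng = random.Random(seed)
    particles = []
    for _ in range(n_particles):
        x0, y0 = rng.uniform(0.0, side), rng.uniform(0.0, side)
        particles.append(settle(particles, x0, y0, lam, r))
    return particles


def volume_fraction(particles, h, side=14.0, r=1.0):
    """Fraction of the bottom slice of height h occupied by spheres."""
    # Container cross section: the square of centres widened by r all round.
    area = side ** 2 + 4.0 * side * r + math.pi * r ** 2
    n = sum(1 for (_, _, z) in particles if z <= h)  # assumed counting rule
    return 4.0 * math.pi * r ** 3 * n / (3.0 * area * h)
```

For example, `volume_fraction(simulate(300, lam=2.0, seed=1), 20.0)` gives one estimate for the $h = 20$ slice. Because the random numbers and the counting convention differ from the original program, the values need not reproduce the figures quoted in the text.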
λ       Volume Fraction    σ
2.00    0.154530           0.009813
2.25    0.131340           0.003850
2.50    0.107535           0.013117
2.75    0.095632           0.003937
3.00    0.075931           0.008904
A downward trend in the volume fraction is quite visible. [Vold's simulations (1959) led to volume fractions slightly smaller than mine, but this may be due to the flat bottom.] She also determined the number of spheres contacting a given sphere and found that the average was very nearly 2. What does this mean? Experimental results give a volume fraction of about 0.125 for glass spheres in nonpolar liquids and about 0.64 in polar liquids. How could this data be interpreted in terms of the models discussed here? (See also Problem 1.)

Stream Networks
Is there any regularity in stream networks? Some geomorphologists believe that many of the features of stream networks are random. In particular, are the branching patterns random? It would be nice to know, since if we found that they were non-random we could look for an explanation (or at least the geomorphologists could). What do we mean by 'random' in this context? We use one idea of random adapted from A. E. Scheidegger (1970, Sec. 5.33).

First we need some definitions. A drainage basin consists of a stream (or river) network and the area it drains. A stream network is a stream together with all the streams that flow into it above the point at which we are considering the stream. A link is the portion of a stream between two junctions or between a junction and a source. A stream network is almost always made up of a set of links joined so that at each junction only two links flow together to form a third. (The rare occasions when more than two streams meet simultaneously can be resolved, but we won't go into that complication here.) See Figure 1.

[Figure 1. Two extreme examples of stream networks, (a) and (b).]

The Strahler order of a stream link is defined as follows. Links that start at a source are of order 1. If two links of orders $A$ and $B$ flow into a third link of order $C$, then $C$ equals $A + 1$ if $A = B$ and equals the maximum of $A$ and $B$ otherwise. See Figure 1. A segment is a stretch of river over which the order doesn't change. Let $n_i$ be the number of segments of order $i$. Thus $n_2 = 1$ in Figure 1a and $n_2 = 4$ in Figure 1b. Horton's law of stream numbers is an empirical relationship which states that $n_i/n_{i+1}$ is nearly independent of $i$. For streams in the United States, this approximate constant (whatever that means) is about 3.5 according to Scheidegger. However, the data of L. B. Leopold et al. (1964, p. 142) for the entire United States, presented in Table 2, does not agree with this. If stream networks tend to be fairly linear as in Figure 1a or rather bushy as in Figure 1b, this law is not valid. (Compute $n_i$ and $n_i/n_{i+1}$ in these cases.) It has been suggested that the result can be explained by assuming that stream networks are random. We model this idea following Liao and A. E. Scheidegger (see A. E. Scheidegger, 1970).

Table 2. Number of Stream Links of Various Orders in the United States.

Order   Number     Average length (miles)   n_i/n_{i+1}   Example
10            1    1800                                   Mississippi
 9            8     777                     8.0           Columbia
 8           41     338                     5.1           Gila
 7          200     147                     4.9           Allegheny
 6          950      64                     4.8
 5         4200      28                     4.4
 4        18000      12                     4.3
 3        80000       5.3                   4.4
 2       350000       2.3                   4.4
 1      1570000       1                     4.5

Source: L. B. Leopold et al. (1964).

The only geometric property of a stream network we have introduced is the pattern of connection among the links; lengths and curvatures have been omitted. Given the number of sources, there is only a finite number of different drainage networks. Those with four sources are shown in Figure 2. These patterns of connection are known mathematically as plane planted binary trees ('trees' because of shape, 'plane' because they are drawn on a flat surface, 'planted' because the link at which we have cut the network is distinct from all others and can be used to plant the tree, and 'binary' because of the bifurcation at each node as we move upstream). Since this is the only type of tree we care about here, we call them simply 'trees.'

[Figure 2. The 5 seven-node plane planted binary trees and their seven-digit Łukasiewicz sequences: 0010111, 0001111, 0011011, 0100111, 0101011.]

It is not hard to show that a tree with $n$ sources has $2n - 1$ nodes and $2n - 1$ links (including the link at which we have cut the network for study, that is, the link furthest downstream). To study $n_i/n_{i+1}$ we want to average over all trees with $n$ sources, or at least over a reasonable number of randomly generated $n$-source trees; that is, each of the trees with $n$ sources is equally likely to be chosen. Since Horton's law is formulated for stream networks of fair size, we want $n$ to be fairly large. When $n$ is about 100, there are about $10^{56}$ trees, far too many to generate all of them. Thus we need a way to generate and store a random tree in a digital computer. Fortunately this mathematical problem has a fairly simple solution due to Łukasiewicz. We imagine traveling along the tree so that each link is traversed exactly once upstream and exactly once downstream. We start upstream on the link used to plant the tree, use the following rules, and stop when we return downstream on the cut link.
1. Go upstream if possible.
2. If a choice is possible, go upstream on the right hand branch.
3. When a node that is not a source is encountered while going upstream, record a zero.
4. When a source is encountered, record a one.
This process is illustrated in Figure 2. It is possible to reconstruct the tree from the string of zeroes and ones:

1. Draw the planted link.
2. If the next digit is a zero, draw a bifurcating node and proceed upstream on the right hand branch.
3. If the next digit is a one, draw a source and proceed downstream until an untraversed upstream link is found. Go up it.
You should convince yourself that this algorithm does indeed work. A string of zeroes and ones corresponds to a stream network with $n$ sources if and only if it possesses two properties:

1. The number of ones in each proper initial segment never exceeds the number of zeroes.
2. The total number of ones equals $n$, and the total number of zeroes equals $n - 1$.
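The correspondence between trees and strings can be sketched as follows. This is an illustrative sketch only; the representation of a tree as nested pairs (a source is `None`, a bifurcating node is a pair `(right, left)`) and the function names are assumptions chosen for the example, not notation from the text.

```python
def is_tree_sequence(seq):
    """Check properties 1 and 2 for a list of 0/1 digits."""
    zeros = ones = 0
    last = len(seq) - 1
    for i, d in enumerate(seq):
        if d == 0:
            zeros += 1
        else:
            ones += 1
        # Property 1 applies to every initial segment except the whole string.
        if i < last and ones > zeros:
            return False
    # Property 2: n ones and n - 1 zeroes for some n >= 1.
    return ones == zeros + 1


def encode(tree):
    """Traverse the tree, right hand branch first, recording 0 at nodes and 1 at sources."""
    if tree is None:
        return [1]
    right, left = tree
    return [0] + encode(right) + encode(left)


def decode(seq):
    """Rebuild the nested-pair tree from a valid string of digits."""
    if not is_tree_sequence(seq):
        raise ValueError("not the sequence of a tree")

    def build(pos):
        if seq[pos] == 1:
            return None, pos + 1          # a source ends this branch
        right, pos = build(pos + 1)       # go up the right hand branch first
        left, pos = build(pos)            # then the untraversed left branch
        return (right, left), pos

    tree, _ = build(0)
    return tree
```

For instance, `decode([0, 0, 1, 1, 0, 1, 1])` returns `((None, None), (None, None))`, the bushy tree with four sources, and `encode` applied to that result gives back the same string.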
The second requirement just says that there are $n$ sources and $n - 1$ internal nodes. The first requirement ensures that as we go downstream we never return to the link at which we have cut the network before the last step. Since properties 1 and 2 are necessary and sufficient and since all trees are obtained exactly once in this way, it suffices to generate sequences satisfying properties 1 and 2 randomly. A method for doing this is discussed in Problem 2.

Given an internal representation of a tree, we need a way to find $n_i$. This can be done as follows. We list the nodes in the order first reached by traveling around the tree as described above. Each node refers to the link immediately downstream from it. Construct two sequences, LORDER and ORDER: the first refers to the order associated with the left hand branch and the other refers to the actual order. We proceed in order through the sequence $L$ of zeroes and ones which represent the tree. If $L_I = 0$, do nothing; if $L_I = 1$:

1. Record 1 in ORDER$_I$ and LORDER$_I$ and set ORDERNOW to 1.
2. Find the nearest preceding LORDER$_J$ which is blank (i.e., $J < I$, $J$ = maximum, and LORDER$_J$ is blank) and do the following for $K = I - 1, I - 2, \ldots, J + 1$.
   a. Record in each blank ORDER$_K$ the maximum of LORDER$_K$ and ORDERNOW if LORDER$_K \ne$ ORDERNOW and record 1 + LORDER$_K$ if LORDER$_K$ = ORDERNOW.
   b. Set ORDERNOW equal to the value of ORDER$_K$ just recorded.
3. Set LORDER$_J$ equal to ORDERNOW.
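The bookkeeping above can be sketched with a stack in place of the backward search for a blank LORDER entry. This is an equivalent formulation chosen for brevity, not the procedure exactly as stated; the names `strahler`, `first`, and `segments` are assumptions made for the example. The sketch also counts segments, using the rule that a link begins a new segment when it starts at a source or when the two branches feeding it have equal order.

```python
from collections import Counter


def strahler(seq):
    """Orders of the links of a tree given as its string of 0/1 digits.

    Position i refers to the link immediately downstream of the i-th node
    reached in the traversal. Returns (orders, segments) where segments[k]
    is the number of segments of order k.
    """
    order = [None] * len(seq)
    first = [None] * len(seq)   # order of the branch completed first (LORDER)
    segments = Counter()
    stack = []                  # bifurcating nodes whose order is still unknown
    for i, d in enumerate(seq):
        if d == 0:
            stack.append(i)
            continue
        order[i] = now = 1      # a source: its link has order 1
        segments[1] += 1
        while stack:            # travel downstream from the source
            k = stack[-1]
            if first[k] is None:
                first[k] = now  # first branch done; the other is untraversed
                break
            if first[k] == now:
                order[k] = now + 1          # equal orders meet: a new segment
                segments[order[k]] += 1
            else:
                order[k] = max(first[k], now)
            now = order[k]
            stack.pop()
    return order, dict(segments)
```

For the string 0011011 this gives orders `[3, 2, 1, 1, 2, 1, 1]` with segment counts `{1: 4, 2: 2, 3: 1}`, while the string 0001111 gives `[2, 2, 2, 1, 1, 1, 1]` with `{1: 4, 2: 1}`.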
Work your way through some examples and try to see why this method works. Note that in this Monte Carlo simulation the main problem is constructing algorithms for handling the pictorially simple concepts of tree and order in a digital computer.

We have one problem left: How do we identify segments? This is fairly easy. When we are computing the order of a link, it will be a new segment if it is a source, or if the orders of both branches feeding in are equal; otherwise it will belong to a segment containing either the left or right branch. It is useful to keep a sequence SEGMENT that notes which links are the furthest upstream link of some segment.

I generated random stream networks using the above ideas and found a result similar to that obtained by Liao and Scheidegger: For fixed $i$, the value of $n_i/n_{i+1}$ increases slowly with $n$ to about 4.0. When $n_{i+1} \ge 15$, the expected value of the ratio appears to exceed 3.8. Do you think this is evidence in favor of the random stream network model or against it? Why? Can you suggest other tests? See A. E. Scheidegger (1970, Ch. 5) for further discussion.

S. B. Barker et al. (1973) made some studies of the branching structure of real trees. They counted all the branches on an apple tree and on a birch tree. For the apple tree they found that $n_i/n_{i+1}$ was about 4.35, and for the birch tree it was about 4.00. Does this look random?

It would be a good idea to try a different approach to the idea of what a random network is, if we can think of one. One possibility is discussed in the problems. M. J. Woldenberg (1969) discusses yet another approach to understanding stream networks and criticizes the claim that $n_i/n_{i+1}$ is independent of $i$. His method is an adaptation of the geoeconomic marketing model called central place theory. See S. Plattner (1975) for a discussion.

Trees and other graphs are useful tools for some types of modeling problems. You may enjoy reading F. S. Roberts (1976, Ch. 3).
PROBLEMS
1. Construct a Monte Carlo simulation model for sediment volume when the particles are allowed to slide downward in settling. Can you explain the volume fraction for polar solvents by this model?

2. We want to choose sequences of zeroes and ones satisfying properties 1 and 2 in the stream network example.

(a) Show that, if a sequence satisfies property 2, exactly one 'rotation' of it will satisfy property 1. A rotation of $d_1, d_2, \ldots, d_m$ is a sequence $d_{1+i}, d_{2+i}, \ldots, d_{m+i}$, where $d_j = d_k$ with $1 \le k \le m$ and $j - k$ a multiple of $m$.
(b) Use (a) to construct an algorithm for rotating a sequence satisfying property 2 to obtain one that satisfies property 1.
(c) We now want an algorithm for randomly choosing $k$ positions from $m$ in such a way that each of the $\binom{m}{k}$ possibilities is equally likely. Find one.
(d) Combine the above to produce a complete algorithm for randomly generating strings of zeroes and ones that represent trees.
3. A manufacturing plant is trying to decide whether to increase the number of loading docks for trucks. Truck arrival at the docks is not uniform during the working day.

(a) Describe how you would set up a Monte Carlo model to help management decide how many loading docks to have. Remember that it must be reasonable to collect the data. You should work the model out to the point where you could carry out the simulation if data were supplied.
(b) Discuss in class what factors could lead to nonuniform arrival rates. Choose a specific situation that leads to nonuniformity and hypothesize some reasonable arrival rates. (Note that for the number of docks to be about right, as it presumably is, the number of arrivals per day should average somewhat less than the loading docks could handle by working steadily. Why?) Choose a particular Monte Carlo simulation method from (a), hypothesize reasonable data, divide up the work, and do the simulation by hand. During the next class period pool your results so as to answer management's question.

4. How many comets are there in the solar system? What is the rate of loss of comets from the solar system? The following model deals with the number of 'long period' comets in the solar system and follows J. M. Hammersley (1961). An interesting feature is that, although we usually think of the laws of planetary motion as a classic example of a deterministic system, Monte Carlo simulation is useful. This is because the number of comets is large. We had a similar situation in the sedimentation problem.

A long period comet is a comet that goes well beyond the orbit of Jupiter, and by 'comet' we mean a long period comet. If we measure the energy $E$ of an object orbiting the sun in such a way that it is zero when resting at an infinite distance, by one of Kepler's laws, the period of the orbit equals $(-CE/m)^{-3/2}$, where $m$ is the mass of the object and the constant $C$ depends only on the gravitational constant and the mass of the sun. If $E \ge 0$, the object will escape from the solar system.

(a) What can cause $E$ to change? The main influence is the gravitational field of Jupiter. Discuss others.
(b) If we set $z_i = -CE/m$, where $E$ is the energy after the $i$th pass by Jupiter's orbit, the change $\Delta z_i$ can be treated as a random number with a distribution depending on Jupiter and the sun but not on $m$. Approximate this by a normal distribution with mean zero. How could you check this approximation? [See R. H. Kerr (1961).] Show that, up to scaling, the lifetime of a 'random' comet is given by

$$\sum_{i=0}^{T-1} z_i^{-3/2},$$

where $z_i > 0$ for $1 \le i \le T - 1$, $z_T \le 0$, and $\Delta z_i$ has a normal distribution with mean zero and variance one. What is the scale factor?
(c) Describe a Monte Carlo model for obtaining information about the distribution of lifetimes of comets, when time and $z_0$ are measured in whatever units were necessary for scaling.
(d) If most comets wander into the solar system from outside, as is believed by some astronomers, what is a reasonable value for $z_0$? Should we neglect $z_0^{-3/2}$ in (b)? Why?
(e) How could we estimate the total number of comets in the solar system, assuming losses and gains are equal and (d) holds? Hammersley obtained an estimate of about 2 million comets.
(f) Suppose all comets were formed within the solar system when it came into being. Discuss changes in (d) and (e).
5. We consider another way to approach randomness in stream networks. The idea is that the topography is random. Imagine a portion of a plane covered with squares. We think of the edge of each square as a possible stream link. Water might flow from or through any given vertex to an adjacent vertex. See Figure 3. This idea was suggested by a discussion in L. B. Leopold et al. (1964, p. 419).

[Figure 3. Choosing random stream networks on a grid. (a) Portion of grid. (b) Randomly generated links on this portion of grid. Problems have arisen at A, B, and C.]

(a) Given a vertex $v$, choose an adjacent vertex $w$ at random and allow the water to flow from $v$ to $w$. Be careful. We can't do this if we've previously decided to let water flow from $w$ to $v$. Bifurcating sources and 'lost' rivers must be avoided. See A and B in Figure 3. How could you implement this on a computer? What about the possibility of water flowing in a closed loop such as C in Figure 3? Can you handle this by allowing lakes or by somehow stopping it by clever programming?
(b) Change the model in (a) so that each vertex is assigned an altitude and water runs down to the lowest adjacent vertex. What problems arise in implementation?
(c) Discuss biasing the two models just suggested to allow for a general overall slope to the land.
(d) Since four edges meet at the vertex of a square, we can expect to have some vertices where three streams join to form a fourth. This can be avoided by using a hexagonal (honeycomb) pattern instead of a square pattern.
(e) Criticize the model.
(f) Perhaps some students can actually implement a Monte Carlo model. If this is going to be done, discuss the practical details carefully beforehand. Among the things you will need to consider are: Which model(s) will be implemented? How should the model be stored? How big should the model be? How can the order of a link be determined? How can the segments be identified? Exactly what data, if any, are needed? Don't forget the problems mentioned in (a) and (b).
A Table of 3000 Random Digits

01  55421 88263 40244 60613 18750 09668 67045
02  21661 65304 89606 67132 56488 75977 93311
03  77254 57610 76372 92693 08168 45645 96331
04  03803 63025 94237 33227 51828 07254 96652
05  29005 68581 18068 71414 93529 03790 17147
06  90086 72725 85496 36015 19475 79306 88066
07  48786 42078 66302 79185 47917 31532 59264
08  01312 06015 96224 42768 22830 78005 17433
09  90897 96649 85718 42458 18222 68868 36204
10  11433 10412 53251 08366 26673 89379 27952
11  74500 34547 78695 98961 50370 12118 80601
12  01710 94533 38266 42999 85821 12617 98876
13  98325 93297 87417 79283 13082 73321 08108
14  91318 54562 90536 39274 26757 04007 76649
15  65640 33035 47348 50884 71729 31237 96000
16  33578 71492 89085 24821 58763 03745 50706
17  38934 90627 93619 12976 74853 36562 52889
18  63994 37135 04933 28191 71590 16916 89009
19  56799 35247 78481 70048 75596 96136 09513
20  12726 49439 33920 67668 25313 05208 07753
21  34081 69899 92802 81144 52246 20404 66428
22  83547 15593 24422 56988 07032 16541 80267
23  62794 95699 55102 57232 04292 24619 00792
24  95447 93642 41265 11687 85266 95769 85657
25  26596 38328 75787 79328 64024 81217 14914
26  74519 73834 73701 61159 75618 10719 23249
27  76702 12394 98323 11486 65591 66169 61371
28  93398 25450 41967 89708 93328 08532 17663
29  03921 70788 45139 50713 83241 46227 81250
30  07876 78832 93503 46088 28554 49913 56826
31  17597 12602 71925 63115 51767 13525 65363
32  28348 46747 05225 11003 99959 69238 13750
33  57790 22390 75625 05258 14261 27013 10094
34  13233 96412 29753 95187 60401 53309 16058
35  35809 47147 66631 87135 39573 98117 12344
36  99902 47164 61113 79916 65611 28481 05621
37  12851 76785 25019 79805 01740 68627 82308
38  87584 17122 15362 56795 18723 54025 13867
39  21627 33387 94307 34270 22996 79509 97534
40  39124 97154 28543 15167 98577 22030 37310
41  83985 65741 00115 66382 02337 01885 26932
42  72642 09689 88779 68543 64174 27344 38379
43  86351 00215 97630 62359 24386 52426 87404
44  78675 13948 23670 20818 41693 69965 45507
45  13744 07743 55507 62664 04571 78498 05944
46  71582 87153 45222 95055 30583 88348 92666
47  80380 39093 97093 68003 00416 76429 04361
48  99964 70393 24149 23608 58032 39520 16090
49  05032 42931 69890 80165 13916 71993 25752
50  58991 74921 38536 68391 72232 85406 95680
51  07667 26870 48732 42076 86542 33490 49293
52  40078 77005 00604 53344 21916 31700 72849
53  30787 46512 89824 81494 04148 74399 03683
54  27095 59999 79940 23254 28226 46871 11524
55  47394 01133 87725 45405 91783 60142 24679
56  64478 56998 91421 61692 83308 23590 73162
57  20095 81826 77211 42919 56828 53315 23430
58  29785 41130 48891 69755 06426 33279 89180
59  06122 50707 70290 74073 82102 40049 49514
60  95826 83455 41687 28490 31137 55658 19873
61  33419 47261 13998 42627 70392 75443 75939
62  13127 42437 39921 97912 60053 75764 04210
63  17970 44384 95134 01034 51693 83968 41619
64  02440 09677 25867 50480 55276 39445 86379
65  01902 33280 69006 57137 75395 58215 16067
66  83708 61287 95269 63918 66823 85887 47487
67  43366 45811 45506 02740 12387 35925 69605
68  28400 81384 56051 49615 17959 91881 07447
69  10878 67992 50896 20390 28689 02029 27049
70  35304 33948 64811 09205 00181 59797 53427
71  74794 04070 13049 78158 40274 18380 31390
72  50612 11495 56502 37454 15523 17100 29111
73  90297 95935 31036 83853 91422 14307 66632
74  07048 79736 76495 68263 22727 72509 52840
75  70827 68071 70123 09804 84209 64910 73477
76  34161 49740 02489 00271 66229 66429 53530
77  13889 95558 55047 99000 21703 34104 03878
78  90726 42834 45339 56711 56299 35935 45020
79  54383 76347 29876 19497 84310 96346 51867
80  94345 29276 07885 14461 64927 41423 09201
81  72425 54109 47783 67259 68498 69107 15027
82  79981 59796 78249 05050 68335 25702 25771
83  83129 35323 59702 12961 22452 71264 86662
84  09583 04316 57908 37926 10256 73089 79661
85  52392 87142 65066 58787 76981 91372 72138
86  66641 47752 48858 56250 61530
CHAPTER 6

POTPOURRI

The models presented here use a variety of elementary methods that didn't fit conveniently into the earlier chapters.

Desert Lizards and Radiant Energy
Lizards in arid regions make use of radiant energy (direct sunlight, reflected sunlight, and infrared radiation from the ground), conduction of heat through contact with the ground and with rocks, and convection to adjust their body temperature. Because of the high reflectivity of the sand (about one-third of the sunlight is reflected) and the heat of the sand, one could suppose that reflected sunlight and infrared radiation are nearly as important as direct sunlight. This model, which is adapted from K. S. Norris (1967), studies the question.

Since we wish to compare the relative amounts of energy hitting the lizard, its actual shape is not likely to be very important. Since symmetry usually simplifies computations, we assume that the lizard is a sphere of radius $r$ whose center is a distance $h$ above the sand. We assume that the sun is directly overhead and has an energy $E$ per unit area per unit time. We consider the ratio of reflected sunlight to direct sunlight. The energy per unit time due to direct sunlight is $\pi r^2 E$.

To study the reflected light we take advantage of the symmetry by setting up a polar coordinate system on the sand with its center directly below the lizard. A side view is shown in Figure 1. The fraction of light reflected from the sand at point $P$ that reaches the lizard depends on the distance $\rho$, the angle $\varphi$, and the angular diameter of the lizard as seen from $P$. As a first approximation, let's suppose that the intensity of the reflected light is independent of $\varphi$. Then the fraction of light hitting the lizard will nearly equal the fraction of the hemisphere of radius $R$ centered at $P$ that lies within the lizard, where $R = (\rho^2 + h^2)^{1/2}$ is the distance from $P$ to the center of the lizard. This fraction is nearly

$$\frac{\pi r^2}{2\pi R^2} = \frac{r^2}{2(\rho^2 + h^2)}. \tag{1}$$

[Figure 1. Side view of a spherical lizard at high noon.]

The total amount of sand surface between $\rho$ and $\rho + d\rho$ is $2\pi\rho\,d\rho$. Since the area directly under the lizard is shaded and since about one-third of the incident light is reflected, it follows from the above discussion that the amount of reflected light reaching the lizard from the sand up to a distance $x$ away is nearly

$$\int_r^x \frac{E}{3}\,\frac{r^2}{2(\rho^2 + h^2)}\,2\pi\rho\,d\rho = \frac{\pi r^2 E}{6}\log\left(\frac{x^2 + h^2}{r^2 + h^2}\right). \tag{2}$$

Dividing this by the direct energy we obtain

$$\frac{\text{Reflected}}{\text{Direct}} = \frac{1}{6}\log\left(\frac{x^2 + h^2}{r^2 + h^2}\right). \tag{3}$$

As $x$ becomes large, (3) approaches infinity. What is wrong? One objection is that we have treated the desert as a flat, barren plain, which is certainly not correct. Suppose that topography and brush begin to interfere seriously with the reflected light at a distance between $5r$ and $500r$. If $h$ is at most $2r$, the value of (3) will be between 30 and 200% for $x$ between $5r$ and $500r$. This answer appears to be quite reasonable, and there is no need to determine very accurately when brush and topography become important unless we want very accurate estimates of (3).

A completely different objection is that the intensity of reflected sunlight does indeed depend significantly on the angle of reflection. 'Significantly' may be rather misleading here, since at a distance of about $500r$ we have $\varphi = 0.2^\circ$ and so no reflection at angles less than $0.2^\circ$ is sufficient dependence to limit reflected sunlight to a reasonable value. Let's consider the general situation. We can allow for dependence on the angle $\varphi$ by introducing a function $f(\varphi)$ multiplying the integrand in (2). This function should vanish at $\varphi = 0$ and achieve a maximum of 1 at $\varphi = 90^\circ$. I have been unable to find an empirical estimate of $f$. One possible function is the sine. It has the right general form and, since $\sin\varphi = h/(\rho^2 + h^2)^{1/2}$, leads to an integral which can be easily evaluated:

$$\int_r^x \frac{E}{3}\,\frac{r^2}{2(\rho^2 + h^2)}\,\frac{h}{(\rho^2 + h^2)^{1/2}}\,2\pi\rho\,d\rho = -\frac{\pi E r^2 h}{3(\rho^2 + h^2)^{1/2}}\Bigg|_r^x \to \frac{\pi E r^2 h}{3(r^2 + h^2)^{1/2}} \quad \text{as } x \to \infty. \tag{4}$$

Thus we have

$$\frac{\text{Reflected}}{\text{Direct}} \le \frac{h}{3(r^2 + h^2)^{1/2}}, \tag{5}$$

which is bounded above by $\tfrac{1}{3}$. This result is of the same order of magnitude as the result obtained previously. We could consider other forms for $f$ and other values for $x$. In the end we would probably find that for anything reasonable the ratio of reflected to direct sunlight was at least 20%, a significant amount of energy. Infrared radiation probably behaves in a similar fashion, hence reflected sunlight and infrared radiation are important factors in a lizard's heat balance.

Attempts have been made to use these crude results to study what happens as parameters vary, but this can be dangerous. To see this let's consider what happens when a lizard adjusts $h$ by bending its legs. By differentiating with respect to $h$ it is easy to see that the right hand side of (3) is a decreasing function of $h$ and the right hand side of (5) is an increasing function of $h$. Thus our model is not good enough to tell us whether the lizard becomes warmer or cooler when it raises itself. Actually the lizard will probably become cooler because of an important effect that has not been mentioned: A thin layer of hot air is found on the surface of the sand. If you wish another example of the difficulties that arise from not knowing $f$, consider the following. Will a lizard in a bowl-shaped depression in the sand be warmer or cooler than an identical lizard on the flat sand?
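The two estimates of the reflected-to-direct ratio are easy to evaluate numerically. The following is a minimal sketch under the assumptions of the model above (flat sand, one-third reflectivity, spherical lizard); the function names are hypothetical and lengths are measured in units of $r$ by default.

```python
import math


def ratio_uniform(x, h, r=1.0):
    """Reflected/direct ratio when reflection is independent of angle,
    counting sand out to distance x."""
    return math.log((x * x + h * h) / (r * r + h * h)) / 6.0


def ratio_sine(x, h, r=1.0):
    """Reflected/direct ratio with the sine as the angular factor,
    counting sand out to distance x."""
    return (h / 3.0) * (1.0 / math.sqrt(r * r + h * h)
                        - 1.0 / math.sqrt(x * x + h * h))


def ratio_sine_limit(h, r=1.0):
    """Limit of ratio_sine as x grows without bound; never exceeds 1/3."""
    return h / (3.0 * math.sqrt(r * r + h * h))


if __name__ == "__main__":
    for h in (1.0, 1.5, 2.0):
        print(h, ratio_uniform(5.0, h), ratio_uniform(500.0, h),
              ratio_sine_limit(h))
```

Tabulating a few values of $h$ this way shows the opposite trends discussed in the text: the first estimate falls as the lizard raises itself while the second rises.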
Are Fair Election Procedures Possible?

In a mathematical model we normally use mathematics to study approximately the behavior of a real situation. In this example we consider a different type of question: What can we deduce about a situation that satisfies certain conditions? This is the axiomatic approach of pure mathematics: Make certain assumptions and see where they lead. The problem is to choose reasonable assumptions which lead to interesting conclusions. One of the earliest and most successful examples of the axiomatic method in science is Newtonian mechanics. This approach has also been used in sociology and economics. A particularly successful example is utility theory. See R. D. Luce and H. Raiffa (1958) for a discussion. J. F. Nash (1950) applied the theory to show that with some additional axioms one can conclude that there is a unique 'fair' trade in two-person bargaining. J. G. Kemeny and J. L. Snell (1962, Ch. 2) showed how certain axioms lead to a unique measure of the distance between individual preferences.

Here we study elections. Our goal is to prove that there is no fair way to run an election between several candidates. This is known as the Arrow impossibility theorem. This version differs slightly from that of K. J. Arrow (1962, Ch. 8). I've selected this particular example because it is easy to present, is somewhat surprising, and conveys the flavor of the axiomatic method. For a discussion of these topics see F. S. Roberts (1976, Chs. 7 and 8).

We need to say what we mean by a fair election procedure; but before we can do that, we must say what an election is. Letters like $x$, $y$, and $z$ denote candidates, and letters like $i$ and $j$ denote voters. A ranking (also called an ordering) is a relation $\supset$, read 'is preferred to,' satisfying

1. For all $x$ and $y$, exactly one of $x \supset y$, $y \supset x$, and $x = y$ (read '$x$ and $y$ are tied') is true.
2. For all $x$, $x = x$.
3. For all $x$, $y$, and $z$, if $x \supseteq y$ and $y \supseteq z$, then $x \supseteq z$ with $x = z$ if and only if $x = y$ and $y = z$.

Here $x \supseteq y$ means that $x \supset y$ or $x = y$. We assume that each voter has ranked the candidates, and we use $(x \supseteq y)_i$ to denote the ranking given by voter $i$. An election procedure is a rule for deducing a ranking, denoted simply $x \supseteq y$, from all the individual rankings.
Note that an election is not just a choice of the top candidate, but rather a ranking of all the candidates. If the procedure is fair, we will obtain a complete ranking from a procedure that gives the top candidate; for example, to find the second ranking candidate we apply the procedure to find the top candidate when the winner is removed. This can be formally justified on the basis of axiom 3 below.

Rather than specify exactly what constitutes a fair election procedure, I'll list some conditions (axioms) an election procedure must satisfy if it is fair. You may wish to add others, but you are not allowed to remove any of the following five. After listing them, I'll discuss them.

1. All conceivable rankings by the voters are actually possible.
2. If $(x \supseteq y)_i$ for all $i$, then $x \supseteq y$ with equality if and only if $(x = y)_i$ for all $i$.
3. If in two different elections each voter ranks $x$ and $y$ the same, then the election outcomes between $x$ and $y$ are the same; that is, if for all $i$, $(x \supseteq y)_i$ if and only if $(x \ge y)_i$, then $x \supseteq y$ if and only if $x \ge y$. Here $\ge$ denotes the other election.
4. If there are two elections such that $(x \supseteq y)_i$ implies $(x \ge y)_i$ for all $i$ and if also $x \supset y$, then $x > y$.
5. There is no $i$ such that invariably $x \supseteq y$ if and only if $(x \supseteq y)_i$.
The first condition says that the election procedure must be able to deal with all cases. The second axiom simply states that a unanimous desire of the voters is respected by the election procedure. Axiom 3 says that how two candidates rank relative to each other in the election depends only on how the voters rank them relative to each other and not on how they rank relative to other candidates. Thus inserting other candidates won't change the election ranking of $x$ relative to $y$. Axiom 4 states that, if $x$ does at least as well compared to $y$ in a later ranking by the voters as he did in the present ranking, and if he beat $y$ in the present election, he'll beat $y$ in the later election. In other words, if your relative position improves in the eyes of all the voters, it will improve in the election results. The final assumption says that there is no dictator.

We can manipulate these axioms in a variety of ways to reach conclusions. In fact, it can be shown that axiom 3 follows from the rest. (You might like to try to prove this.) The manipulations we are interested in are those that lead to a proof of the following impossibility theorem.

THEOREM. No election procedure for more than two candidates satisfies axioms 1 through 5. Hence a fair election procedure is impossible if there are at least three candidates.
PROOF. We show that axioms 1 through 4 imply that there is a dictator. Note that, if we have an election procedure for $N$ candidates, we can obtain one for $N - 1$ candidates by introducing a dummy $N$th candidate which all the voters are assumed to rank lowest. It is easy to show that, if assumptions 1 through 4 hold for the original procedure, they hold for the derived procedure.
A set $V$ of voters will be called decisive for $x$ against $y$ if, when all voters in the set agree on ranking $x$ at least equal to $y$, then $x \geq y$ regardless of how the remaining voters rank $x$ and $y$; furthermore, we require that in this case $x = y$ implies that $(x = y)_i$ for all $i$ in $V$. At least one decisive set exists for all $x$ and $y$: all the voters are decisive by axioms 1 and 2. Note that by axiom 4 we can check if a set is decisive just by looking at an election with $(x \geq y)_i$ for all $i$ in $V$ and $(y > x)_i$ for all $i$ not in $V$.

We show that for some $x$ and $y$ there is a single voter who is decisive. Suppose that this is not true and let $V$ be the smallest decisive set. Then $V$ has at least two voters in it, and so we can split it into two nonempty, disjoint sets of voters $V_1$ and $V_2$. Let $z$ be another candidate and consider an election in which

$$(x \geq y > z)_i \text{ for } i \text{ in } V_1, \qquad (z > x \geq y)_i \text{ for } i \text{ in } V_2, \qquad (y > z > x)_i \text{ for } i \text{ not in } V. \tag{6}$$

If $x > z$, then $V_1$ is decisive for $x$ and $z$, contradicting the minimality of $V$. Thus $z \geq x$. Since $V$ is decisive for $x$ and $y$, it follows from (6) that $x \geq y$. Thus $z \geq y$. Hence $V_2$ is decisive for $z$ and $y$, contradicting the minimality of $V$. (One has to be careful to check out the cases where equality occurs. I won't bother because it clutters up the proof and I only want to give you the flavor of this type of argument.) Thus $V$ contains a single voter, say $i$.

We have shown that for the two candidates $x$ and $y$, if $(x \geq y)_i$, then $x \geq y$. Let $z$ be a third candidate. Now suppose that $(x \geq y > z)_i$. Consider the election when $(y > z > x)_j$ for all $j \neq i$. By axiom 2, $y > z$, and by decisiveness, $x \geq y$. Hence $x > z$. By axiom 3 we can ignore $y$ and note that, if $(x > z)_i$ and $(z > x)_j$ for all $j \neq i$, then $x > z$. Hence $i$ is decisive for $x$ and $z$. Let $w$ be a candidate distinct from $x$ and $z$. By a parallel argument we can show that $i$ is decisive for $w$ and $z$. This shows that $i$ is decisive for every pair; that is, $i$ is a dictator. This completes the proof.
How does this work out in practice? Suppose a contract administrator sends contract proposals (candidates) to experts (voters) for ranking and then determines a final ranking (election). Although he may not weigh the opinions of the experts equally, we hope that his ranking procedure will be fair. The theorem says that this is impossible, and the administrator may not actually be aware of this fact. What axiom is he violating? It is unlikely to be either 2 or 5. Since 3 follows from the other axioms, he must be violating 1 or 4. In other words, either the administrator cannot produce a ranking in all cases (such situations could be handled by obtaining additional voters) or the ranking of other proposals influences how he decides to rank proposals $x$ and $y$ relative to each other.
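The second possibility, that the ranking of other proposals influences the relative ranking of $x$ and $y$, can be seen in a small sketch. The procedure used here, a Borda count, is not discussed in the text; it is chosen only as a familiar point-scoring rule, and the two five-voter elections are hypothetical. Each voter ranks $x$ against $y$ the same way in both elections, yet the outcome between $x$ and $y$ flips, so this procedure fails the requirement in axiom 3.

```python
def borda_scores(ballots):
    """Borda count: on a ballot with n candidates, the candidate in
    position p (0 = most preferred) earns n - 1 - p points."""
    scores = {}
    for ballot in ballots:
        n = len(ballot)
        for position, candidate in enumerate(ballot):
            scores[candidate] = scores.get(candidate, 0) + (n - 1 - position)
    return scores


def prefers(ballot, a, b):
    """True if this voter ranks a above b."""
    return ballot.index(a) < ballot.index(b)


# Two hypothetical elections; each ballot lists candidates best first.
election_1 = [("x", "y", "z")] * 3 + [("y", "z", "x")] * 2
election_2 = [("x", "y", "z")] * 3 + [("y", "x", "z")] * 2

# Every voter ranks x against y the same way in both elections.
same_views = all(
    prefers(b1, "x", "y") == prefers(b2, "x", "y")
    for b1, b2 in zip(election_1, election_2)
)

scores_1 = borda_scores(election_1)
scores_2 = borda_scores(election_2)
print(same_views)   # whether the voters' x-versus-y opinions agree
print(scores_1)     # y finishes ahead of x here
print(scores_2)     # x finishes ahead of y here
```

Only the position of $z$ on two ballots differs between the elections, which is exactly the kind of influence the last sentence of the paragraph above describes.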
Impaired Carbon Dioxide Elimination

It is relatively easy to measure the concentrations (via partial pressures) of various gases in the air exhaled and inhaled by a person. Thus this could lead to a diagnostic test, if we know how to interpret the data. In 1922 Haldane asserted that carbon dioxide (CO2) elimination by the lungs is generally unchanged by a mismatch between blood flow and ventilation because increased elimination in overventilated areas compensates for decreased elimination in underventilated areas. (We call this an imbalanced lung.) Consequently impaired CO2 elimination has been considered to be diagnostic of some sort of blockage in the body's gas exchange system. J. W. Evans, P. D. Wagner, and J. B. West (1974) reexamined the question and found that Haldane was wrong: Unequal ventilation rates cause reduced CO2 elimination. We develop a version of their model here.

Lungs function as follows. Air is drawn into the body, humidified, and pulled into little sacs in the lungs called alveoli. Here capillaries exchange CO2 and oxygen (O2) with the air, which is then exhaled and new air drawn in. If the blood flow around each alveolus were proportional to the volume of air in the alveolus, we would have a balanced lung. We want to compare CO2 exchange in balanced and imbalanced lungs.

How much CO2 is lost from the blood? At equilibrium the blood can hold a certain amount $C(P)$ of CO2 per unit volume when the partial pressure of CO2 in the air is $P$. As CO2 leaves the blood, $P$ increases and the concentration in the blood decreases toward $C(P)$. For lack of better information, we assume that equilibrium is reached. Unfortunately $P$ also increases as the blood absorbs oxygen, because $P$ is proportional to the fraction of the air that is CO2. It follows from the way carbohydrates and fats are used that over the long term the amount of CO2 eliminated is about 80% of the amount of O2 taken in. If we assume that this is true for a single breath, we will have a constraint for the entire lung. This does not seem to be enough to give a manageable model; therefore we assume that this 80% ratio holds for each alveolus for each breath. As you can see we are making a lot of unwarranted assumptions which may leave our conclusions on rather shaky ground. However, if after all these simplifying assumptions balanced and imbalanced lungs behave differently, it should be safe at least to conclude that Haldane was wrong.

Let's introduce some mathematical notation. We consider an individual alveolus first. Let the subscript $i$ denote inspired and $e$ denote expired. Let $P(x)$ denote the partial pressure of $x$. If we measure partial pressure in units so that atmospheric pressure is 1, then

$$\sum_x P_i(x) = 1 \quad \text{and} \quad \sum_x P_e(x) = 1. \tag{7}$$
The change in the amount of $x$ is (with suitable units)

$$V_i P_i(x) - V_e P_e(x), \tag{8}$$

where $V$ denotes the volume of air. Applying this to the gases,

$$\text{CO}_2 \text{ lost} = V_e P_e(\text{CO}_2) - V_i P_i(\text{CO}_2), \tag{9a}$$
$$\text{O}_2 \text{ gained} = V_i P_i(\text{O}_2) - V_e P_e(\text{O}_2), \tag{9b}$$
$$0 = V_i P_i(\text{other}) - V_e P_e(\text{other}), \tag{9c}$$

where the last equation is based on the fact that CO2 and O2 are the only gases exchanged in significant amounts. (Humidification occurs earlier.) We must combine (7) and (9) with the CO2/O2 ratio of 0.8 to obtain information about CO2 exhaled, but in some simple form because we eventually will have to apply the result to all the alveoli and we can't measure individual volumes. Clearly the total volume change is 20% of the O2 uptake. The CO2 loss is 80% of the O2 uptake, which is $(V_i - V_e)/0.2$ by the previous sentence. By (9a), $4(V_i - V_e) = V_e P_e(\text{CO}_2) - V_i P_i(\text{CO}_2)$. Dropping the CO2 in the $P$ and rearranging,

$$V_i = \frac{V_e(4 + P_e)}{4 + P_i},$$

and so by (9a),

$$\text{CO}_2 \text{ lost} = \frac{4V_e(P_e - P_i)}{4 + P_i}. \tag{10}$$

The object of all this is to compare balanced lungs with lungs in which air flow and blood flow are mismatched. Hence we need to supplement (10) with an equation involving blood flow. Let $C(P)$ be the concentration of CO2 in the blood when the partial pressure of CO2 in the air is $P$ and equilibrium has been reached. Then for a quantity $Q$ of blood passing by the alveolus and starting with a CO2 concentration $C_0$,

$$\text{CO}_2 \text{ lost} = Q[C_0 - C(P_e)], \tag{11}$$
if (a) the CO2 balance in the air and blood reaches equilibrium before expiration and (b) the blood doesn't move (so that the blood coming by the alveolus at the start reaches the same CO2 concentration as the blood coming by at the end). We've already decided to assume (a), but (b) doesn't look like a very good assumption. We should probably replace (11) by some sort of integral because blood is flowing by continuously. Since we can't handle this, we'll use (11) as an approximation, with $Q$ equal to the quantity of blood flowing by in one breath.
We can now equate (10) and (11), the two expressions for CO2 lost. The resulting equation can be solved for $P_e$, which can be substituted in (10) to obtain an expression for CO2 lost which depends on $V_e$, $Q$, $P_i$, and $C_0$. The last two variables do not depend on the alveolus, and the first two enter only as a ratio $Q/V_e$ except for a factor of $V_e$ multiplying the entire expression. This sounds like a good approach, since changes in the ratio measure imbalance in the lung, and the $V_e$ for the various alveoli add to a constant, the total volume of air exhaled.

Let's carry out the plan. Let $g(x)$ be the solution to the equation

$$\frac{4(g - P_i)}{4 + P_i} = x[C_0 - C(g)], \tag{12}$$

where we think of $x$ as $Q/V_e$ for applications. Letting the subscript $a$ indicate a particular alveolus, the total CO2 lost equals

$$\sum_a \frac{4V_{ea}(P_{ea} - P_i)}{4 + P_i} = \frac{4\left[\sum_a V_{ea}\, g(Q_a/V_{ea}) - P_i \sum_a V_{ea}\right]}{4 + P_i}.$$

How does this change when total blood flow and total expired volume are held fixed? This is the question we must answer. Since $P_i$ and $\sum V_{ea}$ are constants, it suffices to consider $\sum V_{ea}\, g(Q_a/V_{ea})$. For convenience, let's measure volume so that $\sum V_{ea} = 1$, and let's define the new variable $x_a = Q_a/V_{ea}$. In a balanced lung, $x_a$ is constant. Hence showing that a balanced lung is more efficient at eliminating CO2 is equivalent to showing that

$$\sum_a V_{ea}\, g(x_a) < g\Bigl(\sum_a V_{ea}\, x_a\Bigr), \tag{13}$$

where the $x_a$ are not all equal and the $V_{ea}$ are positive numbers summing to 1. (You should convince yourself that this is what we need to do.)

Suppose $x_a$ takes on only two values. Use Figure 2 to convince yourself that (13) holds if $g'' < 0$. Once this is done, it is fairly easy to prove inductively that $g'' < 0$ implies (13).

We turn our attention to $g''$. For the partial pressures associated with CO2 in the lungs, $C(P)$ is nearly linear. Using this approximation, we can solve (12) for $g(x)$. [If solving (12) were impossible, we could use implicit differentiation to study $g''$ via (12).] Let $K = 4/(4 + P_i)$ and define $A$ and $B$ by $C(g) - C_0 = Ag - B$. Since $C$ is an increasing function, $A > 0$. Equation (12) becomes

$$Kg - KP_i = Bx - Agx,$$

and so

$$g = \frac{KP_i + Bx}{K + Ax}.$$
[Figure 2. A graphical proof of (13) when $x_a$ takes on two values.]

Thus

$$g'' = -2AK(B - AP_i)(K + Ax)^{-3}.$$
By the definition of $A$ and $B$, $B - AP_i = C_0 - C(P_i)$, which is positive since the blood gives up CO2. Thus $g'' < 0$.

We've shown that imbalanced lungs have impaired CO2 elimination, but this is based on some rather crude assumptions. Should we believe the result? First let's ask another question: Should we continue to accept Haldane's statement? Obviously not. In view of the present model it appears unlikely that his statement is correct, because it asserts that an equality holds, which is a very fragile prediction. However, inequalities are usually robust predictions. This by no means proves that our conclusion will stand up under improvement of the model, but it indicates that it is highly likely. I have looked at what I consider the two worst assumptions, the validity of (11) and the 80% ratio for each alveolus, but I don't see any reasonable way to improve them. Do you have any ideas?
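The inequality (13) can be checked numerically for the linearized model. The following is only a sketch: the values of $P_i$, $A$, and $B$ are hypothetical, chosen so that $A > 0$ and $B - AP_i > 0$ as the argument requires, and the two-alveolus lung is an illustration rather than data.

```python
def expired_pressure(x, p_i, a, b):
    """g(x) = (K*P_i + B*x) / (K + A*x), the solution of (12) when C is linear."""
    k = 4.0 / (4.0 + p_i)
    return (k * p_i + b * x) / (k + a * x)


def weighted_g(volumes, ratios, p_i, a, b):
    """Sum over alveoli of V_ea * g(x_a); the volumes are assumed to sum to 1."""
    return sum(v * expired_pressure(x, p_i, a, b) for v, x in zip(volumes, ratios))


# Hypothetical parameter values (assumptions, not measurements).
P_I, A, B = 0.0, 1.0, 1.0

volumes = [0.5, 0.5]       # two alveoli with equal expired volumes
balanced = [1.0, 1.0]      # x_a = Q_a / V_ea is the same in both
imbalanced = [0.5, 1.5]    # same total blood flow: sum of V_ea * x_a is still 1

g_balanced = weighted_g(volumes, balanced, P_I, A, B)
g_imbalanced = weighted_g(volumes, imbalanced, P_I, A, B)
print(g_balanced, g_imbalanced)
```

With these values the imbalanced lung gives the smaller sum, as (13) predicts; other choices satisfying $A > 0$ and $B - AP_i > 0$ can be tried by changing the three constants.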
PROBLEMS

1. This problem is based on H. M. Cundy (1971) and J. Higgins (1971). Suppose that you once owned a reel type tape recorder with a counter that counts revolutions of the take-up reel. Now you've replaced it with a recorder whose counter counts revolutions of the runoff reel. All your information concerning locations of songs on your tapes is now useless unless you can convert one counter value into the other.
Develop a method for doing this. Show how to construct a table for a given tape if you know the number of revolutions required to empty the reel and also the number of revolutions required to half empty the reel. (The half empty point is fairly easy to measure because the take-up and runoff reels will appear identical.) Do not assume that the thickness of the tape is known.

2. This problem was suggested by G. Levary (1956). A businessman is overstocked on a slow moving item. He wishes to mark down the price so that his overstock can be sold off to release money and space for other merchandise. What should he do? For uniformity we'll introduce the following notation:

   $L$    list price of slow item.
   $L^*$  proposed sale price.
   $S$    number of slow moving items sold per year.
   $N$    number of normal stock turnovers per year.
   $p$    profit margin, that is, (net profits)/(total costs).

Consider the following questions and any others that come to mind. How many slow items should be retained? How low can the sale price be and still leave the merchant better off? If this problem is easy for you, here are some suggestions for complicating things. What if $p$ can be higher on slow moving items because most people don't stock them? What about the effect of random fluctuations in demand? A Poisson model may provide a reasonable fit for the number of customers requesting a particular item during a time interval of some given length.
3. This problem is based on F. Metelli (1974). Certain mosaics of opaque colors give rise to the impression of transparency. We limit ourselves to shades of gray. With each shade one can associate a reflectance equal to the fraction of incoming light that is reflected. The range from black paper to white paper is about 4 to 80%. The left hand side of Figure 3 shows a mosaic made from four pieces with reflectances $\alpha_i$. Under appropriate conditions it will appear to be two rectangular sheets which have been superimposed. The smaller sheet will appear to be semitransparent, transmitting a fraction $\beta$ of the incoming light. One necessary condition for apparent transparency is that the edge effects match up: discontinuities or even angles at a supposed boundary destroy the illusion of transparency. (Note that the central vertical line in Figure 3 is unbent where it crosses the boundary of the inner rectangle.) What conditions must the $\alpha_i$, $i = 1, 2, 3, 4$, satisfy? How can we determine $\alpha_5$ and $\beta$ in terms of them? How would you test the model to see if the conditions are necessary? Sufficient? There are various interpretations for $\beta$ which in turn lead to various formulas for $\alpha_2$ and $\alpha_3$. Consider

$$\alpha_2 = (1 - \beta)\alpha_5 + \beta\alpha_1,$$
$$\alpha_2 = (1 - \beta)\alpha_5 + \beta^2\alpha_1,$$
$$\alpha_2 = (1 - \beta)\alpha_5 + \beta^2\alpha_1\left[1 + (1 - \beta)\alpha_1\alpha_5 + (1 - \beta)^2\alpha_1^2\alpha_5^2 + \cdots\right] = (1 - \beta)\alpha_5 + \frac{\beta^2\alpha_1}{1 - (1 - \beta)\alpha_1\alpha_5},$$

[Figure 3. The mosaic on the left can be interpreted as the superposition of a semitransparent sheet on a bicolored opaque sheet. The four pieces are labeled $\alpha_1$, $\alpha_2$, $\alpha_3$, $\alpha_4$.]
and any others that seem worth looking at. Which are correct? Use it (or them) to answer the earlier questions.

4. Why do animals form herds? One obvious suggestion is protection against predators. What advantages does herding give to animals that always flee? Herding may reduce the chances of detection and capture per prey animal in the herd, and being near the middle of the herd may offer additional protection. Herding may also provide for improved detection of predators while grazing. Let's consider these by comparing a herd animal with a solitary animal in an open environment such as the African veldt. These ideas are adapted from I. Vine (1971) and H. R. Pulliam (1973). V. E. Brock and R. H. Riffenburgh (1959) discuss schooling of fish.

(a) Let $D$ be the distance at which a predator can be expected to detect a circular herd of $n$ individuals and let $d$ be the distance for a solitary animal. Argue that the chances of the herd being detected versus an isolated individual being detected are given by $D^2/d^2$ if the animals involved are placed at random on the veldt. Of course this doesn't happen; instead, the predator roams in search of prey. In this case can the relevant ratio be $D/d$? Explain.

(b) It is crucial to have an estimate for $D/d$. See Problem 1.5.6b. Show that $D/d \propto n^r$, with $0 \leq r \leq \tfrac{1}{2}$, may be a reasonable assumption. What can you say about $r$? What if predators detect prey by smell instead of sight?

(c) Suppose that animals mill around randomly within the herd. When is being in a herd safer than being isolated?

(d) In some herds, the animals push toward the center, with the result that some animals always end up on the perimeter. If a predator captures only animals that are on the perimeter, when is it safer to be on the perimeter of a herd than to be isolated?

(e) Criticize the following model and then develop it or an alternative model. A predator must get within some critical distance of a prey animal undetected in order to win the chase and make a kill; otherwise, the prey will escape. By looking up at random a grazing animal has some probability $p$ of detecting the predator before it reaches the critical distance. Since one member of a herd can alarm the entire herd, a herd has a much better chance of escaping than an isolated individual. What is the probability that a herd will detect an approaching predator in time? There are some complications:

   (i) Not every herd member acts as a sentinel at the same time. (In some harems only the male performs sentinel duty, in some mixed herds some peripheral animals act as sentinels, etc.)
   (ii) If the predator approaches a large herd from a side opposite a sentinel, that sentinel won't spot the predator in time to alarm the herd.

(f) Taking (e) into account, return to (c) and (d). When herding is beneficial, what limits the size of herds? When is herding not beneficial?

(g) Can you add anything else to the subject of this problem?

5. The following is well known in traffic flow theory; see, for example, W. D. Ashton (1966, p. 18). Consider cars traveling along a roadway in one direction. Let $k$ be the concentration of cars (e.g., the number of cars per 100 feet of roadway) and let $q$ be the rate of flow (e.g., cars per minute).

[Figure 4. The fundamental diagram of traffic flow: $q$ (vertical axis) plotted against $k$ (horizontal axis).]

(a) Argue that $q$ and $k$ are related as shown in Figure 4.

(b) Various implicit assumptions were needed in (a). State as many important ones as you can think of explicitly and defend and/or criticize them.

(c) Figure 4 is called a fundamental diagram or a flow concentration curve. Translate as many of the following as you can into traffic flow terms such as 'speed on an empty roadway': (1) the values of $K$ and $Q$ such that $(K, Q)$ is the highest point on the curve; (2) the slope of the line tangent to the curve at $(0, 0)$; (3) the slope of the line tangent to the curve at $(k, q)$; (4) the slope of the line connecting $(0, 0)$ and a point $(k, q)$ on the curve. Hint: If you don't know what slopes measure, note that they have the same units as $q/k$.

(d) Does the above help organize and clarify traffic flow concepts for you? What questions does it raise that may lead to further investigations and deeper understanding? In other words, what use is the fundamental diagram?
6. When you view an object using only one eye, you can detect a change in the brightness of the object if the change exceeds a certain threshold. (See Problem 2.1.6.) Normally you use both eyes. Suppose we fool the brain by exposing the eyes to separate but apparently identical scenes whose brightnesses can be varied independently. A study of the thresholds in this situation may give information about binocular vision. This is what T. E. Cohn and D. J. Lasley (1976) did. They placed subjects in front of a device that exposed the eyes as described above. The subject reported pairs of left and right intensity changes $(E_L, E_R)$ that resulted in just noticeable changes in the apparently single object. Cohn and Lasley plotted these points for various subjects and found that they lie roughly on the ellipse $E_L^2 + E_R^2 + KE_LE_R = S^2$, where $S$ depends on the subject and $K \approx 0.6$. There is a fair amount of scatter in the data. You will now consider various possible explanations for the data.

(a) Suppose that only the total intensity change matters. By 'total' we mean either $|E_L + E_R|$ or $|E_L| + |E_R|$. Describe the graphs Cohn and Lasley could expect to obtain.

(b) Suppose all that matters is that the change in at least one eye exceeds the threshold. Describe the graphs.

(c) Combine the ideas in (a) and (b): It suffices to have the change in at least one eye or the change in both ($|E_L + E_R|$ or $|E_L| + |E_R|$) exceed the threshold. Describe the graphs.

(d) Cohn and Lasley proposed the following mechanism. The brain notes the sum and difference of $E_L$ and $E_R$ and combines them in some fashion to obtain a single parameter which must exceed a threshold. They suggest a weighted sum of squares: $(E_L + E_R)^2 + T(E_L - E_R)^2$. The value $T \approx \tfrac{1}{2}$ gives the ellipses mentioned earlier. Compare the graphs in (c) and (d). They fit the published data about equally well.

(e) Where do we go from here? Can we decide between the models in (c) and (d) somehow, or decide that both are wrong?
PART 2
MORE ADVANCED METHODS

CHAPTER 7
APPROACHES TO DIFFERENTIAL EQUATIONS

7.1. GENERAL DISCUSSION
Many phenomena can be described in a general way by saying that rates of change of the endogenous variables depend on past and present values of the variables. These situations lead to models involving differential and difference equations. The population models discussed in Section 1.4 are of this type: Equations (1) and (2) in Chapter 1 are differential equations, and (3) in Chapter 1 is a differential difference equation. Models in the physical sciences frequently include force, which involves the second derivative of position with respect to time: $F = d(m\,dx/dt)/dt$, where $F$ is force, $m$ is mass, and $x$ is position. The basic equations of electromagnetic theory are formulated in terms of partial differential equations. Thus the study of physical phenomena forces one to deal with differential equations. Economics and sociology also deal with differential equations from time to time. See the marriage model in Problem 8.1.4 and the Keynesian model in Section 9.2 for examples.

Because of the importance of differential equations, the next two chapters are devoted to models involving ordinary differential equations. The rest of this chapter discusses some of the philosophy of studying differential equations and describes the topics covered and omitted in the next two chapters.
7.2. LIMITATIONS OF ANALYTICAL SOLUTIONS

It is usually best to solve the equations of a model exactly if the exact solution has a reasonable form. We call this an analytical solution of the model. If we find an analytical solution, we can often easily obtain information about the model that would otherwise be difficult or impossible to acquire.

The analytical approach has two severe limitations. The main one is that it may not be possible to solve the equations analytically, since the solutions of most equations cannot be found except numerically. Second, even if an analytical solution exists, it will not yield the desired information easily unless it is in a useful form. For example, it is not easy to see how $\sin x$ behaves for large values of $x$ by considering the Taylor series expansion

$$\sin x = x - \frac{x^3}{6} + \frac{x^5}{120} - \cdots.$$
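The difficulty with the Taylor series for $\sin x$ can be seen numerically. The sketch below is an illustration only; the function name `sin_taylor` and the sample points are assumptions made for the example. It sums the first few terms of the series and compares the result with the library value of the sine.

```python
import math


def sin_taylor(x, terms):
    """Sum the first `terms` terms of x - x**3/6 + x**5/120 - ..."""
    total = 0.0
    term = x  # the k = 0 term
    for k in range(terms):
        total += term
        # Ratio of consecutive terms: -x**2 / ((2k + 2)(2k + 3)).
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total


for x in (0.5, 20.0):
    print(x, sin_taylor(x, 3), math.sin(x))
```

Near zero three terms already agree closely with the sine, while at $x = 20$ the same three terms give a number in the tens of thousands, although $|\sin x|$ never exceeds 1. The series is correct, but its truncations say little about behavior for large $x$.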
Nevertheless, analytical solutions are usually quite useful when they can be obtained. Models in this category are discussed in Section 8.1.

7.3. ALTERNATIVE APPROACHES
Since the analytical approach is often impossible or impractical, approximate methods are employed. These are roughly of two types: quantitative and qualitative. We usually put borderline cases in the latter category. What do we mean by these categories? Roughly speaking, 'quantitative' refers to numbers and 'qualitative' refers to shape, for example, 'What is the value of $y(5)$?' versus 'Is $y(t)$ periodic?' The following discussion should help to clarify this.

If you are interested in quantitative results, a computer is practically a necessity. The usual method for obtaining numerical information is to approximate the differential equations by difference equations and solve the latter. This sounds much easier than it is. We'd like a method that doesn't take a lot of computer time but gives a fairly accurate answer. We'd also like to know how accurate the computer's answer is. (It is important to remember that the computer's output is only an approximation. I know of one researcher who insisted on abandoning a model because the solution to his differential equation had small oscillations. They were present because of the method that was used in the computer center's differential equations package, but he insisted that the computer had solved his equation and that was that.) What we'd like and what we get may be two very different things. Very few computer centers provide differential equations packages that give error estimates, so you have to be a bit of a numerical analyst and try to obtain them yourself, if it can be done. We won't be concerned with numerical methods per se, but Section 8.2 contains some models for which numerical methods are useful.

In preliminary studies, when the data are very crude, or when the real situation is complicated, semiquantitative or qualitative statements are useful. Examples of such statements are

1. $f(t)e^{-2t}$ approaches a limit as $t \to \infty$.
2. For sufficiently large $t$, $x(t) > 0$.
3. $(x, y, z)$ eventually approaches arbitrarily closely each point in $D$ as $t \to \infty$.
4. $f(t)$ is bounded.
What are the advantages of such lack of precision over analytical and quantitative results? Because of the lack of precision, the model often need not be specified precisely. Thus we can often make robust statements about entire classes of models. This is useful in preliminary studies and in situations where the complexity precludes more accurate descriptions.

Even if we have a specific model, we may wish to study the effect of certain parameters on the solution, for example, the effect of the amplitude and the length of the string on the period of a perfect pendulum:

$$l\theta'' = -g\sin\theta, \qquad \theta(0) = A, \qquad \theta'(0) = 0. \tag{1}$$

In this case we can eliminate $l$ and $g$ by the change in variable $t = \tau(l/g)^{1/2}$ and solve the resulting equation. The period turns out to be given by what is known as an incomplete elliptic integral of the first kind:

$$2\left(\frac{2l}{g}\right)^{1/2}\int_0^A (\cos\theta - \cos A)^{-1/2}\,d\theta. \tag{2}$$

Since elliptic integrals have been studied extensively, quite a bit of information can be extracted from (2). Suppose we incorporate frictional effects by adding a term to the right hand side of (1) which depends on $\theta'$. The analytical techniques collapse. If we know the precise form of the term that is being added to (1), we can conduct a time consuming numerical investigation. For a qualitative approach, see Section 9.2.

While some applications of qualitative methods to physics and biology are classic, the power of qualitative methods in modeling is just beginning to be realized. R. Thom's (1975) discussion of catastrophe theory has stirred up considerable interest.
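As an illustration of the information that can be extracted from (2), the period can be evaluated numerically. The sketch below is not from the text. It assumes the standard substitution $\sin(\theta/2) = \sin(A/2)\sin\varphi$, which removes the singularity of the integrand at $\theta = A$ and turns (2) into $4(l/g)^{1/2}\int_0^{\pi/2}(1 - k^2\sin^2\varphi)^{-1/2}\,d\varphi$ with $k = \sin(A/2)$. The function name, the default values, and the midpoint rule are illustrative choices.

```python
import math


def pendulum_period(amplitude, length=1.0, gravity=9.81, steps=2000):
    """Period of the frictionless pendulum (1) for a release angle `amplitude`
    in radians, from the transformed form of integral (2), midpoint rule."""
    k = math.sin(amplitude / 2.0)
    h = (math.pi / 2.0) / steps
    total = 0.0
    for n in range(steps):
        phi = (n + 0.5) * h
        total += h / math.sqrt(1.0 - (k * math.sin(phi)) ** 2)
    return 4.0 * math.sqrt(length / gravity) * total


small_angle = 2.0 * math.pi * math.sqrt(1.0 / 9.81)  # familiar small-amplitude period
for amplitude in (0.01, 0.5, math.pi / 2.0):
    print(amplitude, pendulum_period(amplitude) / small_angle)
```

The printed ratios show the period growing with the amplitude $A$, while the length enters only through the factor $(l/g)^{1/2}$, which is the kind of parameter dependence the paragraph above has in mind.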
7.4. TOPICS NOT DISCUSSED

Partial differential equations arise when we study variations of a function with regard to two or more parameters simultaneously. Except in the physical sciences, it is difficult to build models of this level of complexity without their becoming so complex that nothing can be done without a computer. Most exceptions seem to be based on physical analogies. Two very important partial differential equations are
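$$\text{Wave motion:} \qquad \frac{\partial^2 u}{\partial x^2} = a\,\frac{\partial^2 u}{\partial t^2}, \qquad a > 0.$$

$$\text{Heat equation:} \qquad \frac{\partial^2 u}{\partial x^2} = a\,\frac{\partial u}{\partial t}, \qquad a > 0.$$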
Equations like the first arise in the study of vibrating strings and membranes, and of electromagnetic, sound, and water waves. Equations like the second arise in the study of diffusion phenomena such as heat transfer, the spread of epidemics, and the change in gene frequencies in a population. Because sophisticated methods and/or extensive computer time are usually required to deal with partial differential equations, we avoid them.

Suppose we can relate the present state of a system to the state of the system at one or more previous times. The resulting equation is usually a difference equation. For example, suppose that female unicorns live for exactly 4 years and produce exactly one female offspring in their second and third years. Let $V(t)$ be the number of female unicorns at the end of year $t$. The number just born in year $t$ is $V(t) - V(t - 1)$, and they die in year $t + 4$ after bearing offspring in years $t + 2$ and $t + 3$. Thus

$$\begin{aligned} V(t) &= V(t - 1) - [V(t - 4) - V(t - 5)] + [V(t - 2) - V(t - 3)] + [V(t - 3) - V(t - 4)] \\ &= V(t - 1) + V(t - 2) - 2V(t - 4) + V(t - 5). \end{aligned}$$
This is an example of a linear constant coefficient difference equation. Models containing difference equations are designed to produce this type of equation because it is analytically tractable. Unfortunately they are often unrealistic. Attempts to add realism generally result in intractable equations which must be studied numerically. For these reasons as well as my own preferences, I've omitted difference equation models.

The analytical intractability of difference equations is not wholly the result of neglect by mathematicians. Simple difference equations can have stranger solutions than simple differential equations, so that both analytical methods and qualitative methods are harder to develop. Although the differential equation $N' = rN(1 - N/K)$, where $r$ and $K$ are constants, has a very simple solution, the corresponding difference equation

$$N(t+1) = N(t) + rN(t)\left[1 - \frac{N(t)}{K}\right]$$

is quite complicated. See R. M. May (1975). This richness of behavior may be useful in modeling when lots of computer time is available. However, it could prove embarrassing: a model with too many possibilities is often worse than a model with too few.

In modeling populations the way we did unicorns, it is usually quite unrealistic to cut things up neatly into years. Attempts to avoid this often lead to integrals as a way of averaging over a period of time. Thus differential and difference equation models are closely related to integral equation models, another advanced topic that is not discussed here.
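The contrast between the logistic differential equation and its difference analogue is easy to see numerically. The sketch below is illustrative only: the function names and the values of $r$, $K$, and $N(0)$ are assumptions chosen for the demonstration, not values from the text.

```python
import math


def logistic_ode(n0, r, K, t):
    """Closed-form solution of N' = r N (1 - N/K) with N(0) = n0."""
    return K / (1.0 + (K / n0 - 1.0) * math.exp(-r * t))


def logistic_difference(n0, r, K, steps):
    """Iterate N(t+1) = N(t) + r N(t) (1 - N(t)/K); return the whole orbit."""
    values = [n0]
    for _ in range(steps):
        n = values[-1]
        values.append(n + r * n * (1.0 - n / K))
    return values


if __name__ == "__main__":
    K, n0 = 1.0, 0.1
    for r in (0.5, 2.2, 2.8):  # illustrative growth rates
        orbit = logistic_difference(n0, r, K, 200)
        tail = ", ".join(f"{n:.4f}" for n in orbit[-4:])
        print(f"r = {r}: ODE limit {logistic_ode(n0, r, K, 200.0):.4f}, "
              f"last difference-equation values {tail}")
```

For small $r$ both models settle at $K$. For larger $r$ the differential equation still approaches $K$ monotonically, while the difference equation oscillates or wanders irregularly, which is the richer behavior the text attributes to difference equations.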
CHAPTER 8

QUANTITATIVE DIFFERENTIAL EQUATIONS

8.1. ANALYTICAL METHODS
In this section we consider models that lead to differential equations that have explicit solutions. With the partial exception of the ballistics model in Section 8.2, the examples were chosen to illustrate a variety of models, not to illustrate methods for solving differential equations.

Pollution of the Great Lakes
Industrialized nations are beginning to face the problems of water pollution. Once pollution of a river is stopped, the river will clean itself fairly rapidly if the pollution has not caused extreme damage. Lakes present a problem, because a polluted lake contains a considerable amount of water which must somehow be cleaned. The only presently feasible method is to rely on natural processes. How long does this take? In particular, how long would it take to clean up the Great Lakes?

Pollution affects a lake in many complex ways. Some compounds such as DDT enter biological systems and move up the food chain. Since DDT is very soluble in fat, it concentrates in the fatty tissue of higher predators and is hard to remove from the biosphere. Some pollutants move rather freely in and out of the food chain. The behavior of phosphorus lies somewhere between these two extremes. (In one sense, phosphorus is not a pollutant, since it occurs naturally; however, excessive amounts can trigger algae blooms, and it is then considered a pollutant.) Still other pollutants, like oil spills, may be only slightly involved in the food chain. Extensive pollution can cause irreversible damage and even 'kill' a lake.

The main cleanup mechanism is the relatively straightforward natural process of gradually replacing the water in the lake. In addition, other processes such as sedimentation and decay may be important. If we consider all these facets of the problem now, the discussion will go on and on and the resulting model will probably be hopelessly complex. Therefore we present the model first and discuss its validity later.

Figure 1 shows the Great Lakes. The numbers will be explained shortly. The basic idea is to regard the flow in the Great Lakes as a standard perfect mixing problem. We ignore biological action, sedimentation, and so on, and assume that all the pollutants are simply dissolved in the water. This model is adapted from R. H. Rainey (1967). We make the following assumptions:

1. Rainfall and evaporation balance each other, and so the average rates of inflow and outflow are equal.
2. These average rates do not vary much seasonally.
Figure 1 The Great Lakes. The figures indicate the number of years required to drain the lakes if outflow is unchanged and inflow stops.
These should be good approximations. In addition, we make the following rather questionable assumptions:

3. When water enters the lake, perfect mixing occurs, so that the pollutants are uniformly distributed.
4. Pollutants are not removed from the lake by decay, sedimentation, or any other mechanism except outflow.
5. Pollutants flow freely out of the lake; they are not retained the way DDT is.
By these assumptions, the net change in total pollutants during the time interval $\Delta t$ is

$$V\,\Delta P_l = (P_i - P_l)\,r\,\Delta t + o(\Delta t),$$

where $V$ is the volume of the lake, $P_l$ is the pollution concentration in the lake, $P_i$ is the pollution concentration in the inflow to the lake, $r$ is the rate of flow, and $o(\Delta t)$ denotes a function of $\Delta t$ such that $o(\Delta t)/\Delta t$ goes to zero as $\Delta t$ goes to zero. Dividing this equation by $\Delta t$ and letting $\Delta t$ approach zero we obtain the differential equation

$$\frac{dP_l}{dt} = \frac{(P_i - P_l)\,r}{V}. \tag{1}$$

Since this is a first order linear equation, we easily solve it to obtain

$$P_l(t) = e^{-t/\tau}\left[P_l(0) + \frac{1}{\tau}\int_0^t P_i(x)\,e^{x/\tau}\,dx\right], \tag{2}$$

where $\tau = V/r$. The numbers in Figure 1 are Rainey's values of $\tau$ for the various lakes, measured in years. He does not give a value for Huron. Using (2) and the data given in Figure 1 it is easy to determine the effect of various pollution abatement schemes if the model is reasonable. We do not include Lake Ontario in the discussion, because about 84% of its inflow comes from Erie, a source of pollution which can be controlled only indirectly. [The modifications required in (1) and the resulting time estimates are considered in Problem 1.] The fastest possible cleanup will occur if all pollution inflow ceases. This means that $P_i = 0$. In this case (2) leads to the simple expression

$$P_l(t) = P_l(0)\,e^{-t/\tau}. \tag{3}$$
From this we can read off how long it would take to reduce pollution to a given percentage of its present level. The following figures in years were obtained in this fashion.

Lake         50%    20%    10%     5%
Erie           2      4      6      8
Michigan      21     50     71     92
Superior     131    304    435    566
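The table follows from (3): with no pollution inflow, the time to reach a fraction $p$ of the present level is $t = \tau \log(1/p)$. The sketch below shows the arithmetic. Figure 1 is not reproduced in this text, so the drain times in `ASSUMED_TAU_YEARS` are assumptions, chosen because they reproduce the table above; the function names are likewise illustrative.

```python
import math

# Assumed drain times tau = V/r in years. These are NOT read from Figure 1;
# they were chosen because they reproduce the table in the text.
ASSUMED_TAU_YEARS = {"Erie": 2.6, "Michigan": 30.8, "Superior": 189.0}


def cleanup_time(tau, fraction_left):
    """Years until P_l(t)/P_l(0) equals fraction_left when P_i = 0, from (3)."""
    if not 0.0 < fraction_left <= 1.0:
        raise ValueError("fraction_left must lie in (0, 1]")
    return tau * math.log(1.0 / fraction_left)


def cleanup_table(taus, fractions=(0.50, 0.20, 0.10, 0.05)):
    """Cleanup times rounded to whole years for each lake and target fraction."""
    return {lake: [round(cleanup_time(tau, f)) for f in fractions]
            for lake, tau in taus.items()}


if __name__ == "__main__":
    for lake, row in cleanup_table(ASSUMED_TAU_YEARS).items():
        print(f"{lake:10s}", row)
```

Each halving of the pollution level costs the same $\tau \log 2$ years, so the times grow only logarithmically as the target percentage shrinks.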
Fortunately, the pollution in Superior is quite low at the present time.

We have built a very much simplified model. How much faith can we put in the times we have just obtained? To answer this question we must examine the validity of assumptions 3, 4, and 5. We begin with the perfect mixing assumption. If a lake has only one source and one outlet, water tends to move from the source to the outlet in a pipeline fashion without mixing. Hence the cleanup time is shortened for the main part of the lake. (However, slow moving portions have much longer cleanup times.) This effect cannot push the times much below $\tau = V/r$, because a cleanup requires the replacement of nearly all the water in the lake. The value of $\tau$ is rather large for Michigan and Superior. Conclusion: Our assumption of perfect mixing may be far off, but this error is not likely to allow cleanup times much below $\tau$ and will probably lead to longer cleanup times for some semistagnant regions in the lake.

We discuss assumptions 4 and 5 in connection with two important pollutants: DDT and phosphorus. Mercury behaves like DDT in many ways, so the discussion applies to it as well.

Studies indicate that DDT and several other chlorinated hydrocarbons take a long time to break down into harmless compounds. Sufficient concentrations of DDT can have bad effects on the health of many organisms and even cause death. Unfortunately, DDT is almost impossible to remove from the biosphere. It dissolves readily in body fat, and so an organism retains most of the DDT it ingests. This causes the chemical to reach greater concentrations in higher predators. These animals are rather large and so are not likely to be swept out of the lake with the outflow unless they choose to leave. When an organism dies, most of its body fat is consumed by other organisms, so most of the DDT remains in the biosphere. As a result of all this, we can expect DDT to stay in the biota of a lake for an extended period of time. The main factor removing DDT from a lake may be its very slow breakdown into less noxious compounds, but consumption of fish by birds of prey and humans may be important.
Mercury behaves somewhat like DDT; however, it is an element and so does not decay. As a result, it is lost slowly due to sedimentation, outflow, and the removal of fish by birds and humans.

Phosphorus behaves differently. Large amounts of it are present in human wastes and in many fertilizers and detergents. The presence of excessive quantities of this element can cause algae blooms. These are sudden population explosions of algae as a result of which the lake may look like pea soup. Then the algae die and settle to the bottom, and much of the phosphorus is removed in this fashion. Unfortunately, some of this removal is only temporary, since decay processes return the phosphorus to the lake water. The phosphate inflow to Lake Erie was about 75 tons daily in 1967, but the outflow was only about 25 tons (K. Sperry, 1967). Thus phosphorus was building up in the lake. The concentration may have been increasing, or the lake may have been losing 50 tons of phosphate per day in sediment. If the former is correct, cutting the inflow of phosphorus to 25 tons would only have led to an equilibrium situation. If the latter is correct, the phosphates on the bottom may reenter the biosphere and aggravate cleanup problems in the future.

Conclusion: For persistent pollutants like DDT the estimated cleanup times may well be too low. For other pollutants it is not clear how assumptions 4 and 5 affect the cleanup times.

Summary: The time estimates we derived may be low for some pollutants and high for others. The values of $\tau$ given in Figure 1 probably provide rough lower bounds for the cleanup times of persistent pollutants.
The Left Turn Squeeze
Have you ever found yourself in a car trapped near the curb with the rear end of a bus moving slowly and ominously toward you as the bus turns to the left? It can be a hair raising experience. How far to the right will the bus move? This model is adapted from J. Baylis (1973).

The situation is shown in Figure 2. We assume that the wheels do not slide sideways in turning. Since the rear axle is fixed, $FR$ is tangent to the path of $R$. The angle between $FR$ and the direction of the roadway is called $\varphi$, the length of $FR$ is $l$, the length of $RT$ is $h$, the width of the bus is $2w$, the turning angle of the front wheels is $\theta$, and the speed of the bus is $v$. We must specify where the speed of the bus is measured. (To see this note that, if $\theta = 90^\circ$ and the wheels don't slide sideways, the bus will move in a circle around $R$.) Let $v$ be the speed of $F$. The values of $\varphi$, $\theta$, and $v$ are functions of time. Since we are interested only in the locus of $U$, we can take $v$ to be any function of time. We set $v = 1$.
Figure 2 The bus turning left. Dotted line is path of R.
How can we describe the bus's motion? We sketch the derivation of the relevant equations, and you can fill in the details. By looking at the front end of the bus, we see that in a time interval $dt$ the turning displaces the point $F$ a distance $\sin\theta\,(v\,dt) = \sin\theta\,dt$ perpendicular to $FR$ and a distance $\cos\theta\,dt$ parallel to $FR$. Looked at from the path of $R$, the displacement of $F$ perpendicular to $FR$ is $l\,d\varphi$, and the displacement parallel to $FR$ depends on the path of $R$. Thus we have the basic equation relating $\varphi$, $\theta$, and $t$:

$$l\,d\varphi = \sin\theta\,dt. \tag{4}$$
We now turn our attention to the motion of $U$. Let us first compute the leftward displacement of $F$: $x(t) = \int_0^t \sin(\theta + \varphi)\,dt$. (We have used $v = 1$.) Using (4) and $\varphi = 0$ at $t = 0$, we obtain

$$x(\varphi) = l\int_0^{\varphi} \frac{\sin(\theta + \varphi)}{\sin\theta}\,d\varphi. \tag{5}$$

The displacement of $U$ is now easily found: the rightward displacement of $T$ is $(h + l)\sin\varphi - x$, and so the rightward displacement of $U$ is

$$f(\varphi) = w\cos\varphi - w + (h + l)\sin\varphi - x. \tag{6}$$

Setting $f'(\varphi) = 0$, using (5), and multiplying by $\sin\theta$, we obtain

$$[(h + l)\cos\varphi - w\sin\varphi]\sin\theta - l\sin(\theta + \varphi) = 0. \tag{7}$$
The general plan is to solve (7) for $\varphi$, use (5) to compute $x$, and then use (6) to compute the maximum displacement. To do this we need a relationship between $\theta$ and $\varphi$. Usually it is easiest if something is constant. Clearly $\varphi$ cannot be constant, since the bus turns. Two possibilities are:

1. $\theta$ is constant: the driver keeps the front wheels turned at a constant angle relative to the bus.
2. $\theta + \varphi = \alpha$, a constant: the driver keeps the front wheels aimed in a constant direction relative to the roadway.
Possibility 1 is more realistic than possibility 2, but neither is perfectly correct. We consider both, because by comparing the results we should be able to obtain some idea of how accurate our conclusions are. Suppose that $\theta$ is constant. Solving (7) and integrating (5) we obtain

$$\cot\varphi = \frac{w + l\cot\theta}{h}, \tag{8a}$$

$$x = \frac{l[\cos\theta - \cos(\theta + \varphi)]}{\sin\theta}, \tag{8b}$$

and the maximum displacement is

$$f = [(w + l\cot\theta)^2 + h^2]^{1/2} - (w + l\cot\theta). \tag{8c}$$
[The last equation is easily obtained by substituting (8b) into (6), expanding $\cos(\theta + \varphi)$, and recalling that the maximum of $A\cos\varphi + B\sin\varphi$ is $(A^2 + B^2)^{1/2}$; so we don't need (8a) for (8c).] Using (8) I obtained Table 1, based on the estimates $l = 16$, $h = 10$, and $w = 4$. The last row will be explained later.

Table 1 Maximum Displacement with $\theta$ Constant.

$\theta$                    20°    30°    40°    50°    60°    70°
$\varphi$ (degrees)          12     18     23     30     37     46
$f$ (feet)                  1.0    1.5    2.1    2.7    3.4    4.2
$\bar{\alpha}$ (degrees)     26     39     52     65     79     93
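Table 1 can be recomputed from (8a) and (8c) as a check on the algebra. This is a minimal sketch under the text's estimates $l = 16$, $h = 10$, $w = 4$; the function names are illustrative and not from the original.

```python
import math


def max_displacement_constant_theta(theta_deg, l=16.0, h=10.0, w=4.0):
    """Return (phi in degrees, f in feet) from (8a) and (8c), theta held constant."""
    theta = math.radians(theta_deg)
    a = w + l / math.tan(theta)   # the recurring term w + l cot(theta)
    phi = math.atan2(h, a)        # (8a): cot(phi) = a / h
    f = math.hypot(a, h) - a      # (8c)
    return math.degrees(phi), f


def displacement(phi, theta, l=16.0, h=10.0, w=4.0):
    """Equation (6) with x taken from (8b); both angles in radians."""
    x = l * (math.cos(theta) - math.cos(theta + phi)) / math.sin(theta)
    return w * math.cos(phi) - w + (h + l) * math.sin(phi) - x


if __name__ == "__main__":
    for theta_deg in (20, 30, 40, 50, 60, 70):
        phi_deg, f = max_displacement_constant_theta(theta_deg)
        print(f"theta = {theta_deg} deg: phi = {phi_deg:.1f} deg, f = {f:.2f} ft")
```

Evaluating `displacement` at the angle returned by (8a) gives the same value as (8c), which confirms the remark in the text that (8c) follows from the maximum of $A\cos\varphi + B\sin\varphi$.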
Now let's consider the case in which $\theta + \varphi = \alpha$, a constant. Substituting $\theta = \alpha - \varphi$ into (7) we have, after rearranging,

$$[(h + l)\sin\alpha - w\cos\alpha]\cos 2\varphi - [(h + l)\cos\alpha + w\sin\alpha]\sin 2\varphi + (h - l)\sin\alpha + w\cos\alpha = 0.$$

Further rearranging gives

$$C\sin(2\varphi - \delta) = D,$$

where

$$C = [(h + l)^2 + w^2]^{1/2}, \qquad D = (h - l)\sin\alpha + w\cos\alpha, \qquad \sin\delta = \frac{(h + l)\sin\alpha - w\cos\alpha}{C},$$

where $-90^\circ < \delta < 90^\circ$. Solving for $\varphi$,

$$\varphi = \frac{1}{2}\left[\arcsin\left(\frac{D}{C}\right) + \arcsin\left(\frac{E}{C}\right)\right], \tag{9a}$$

where $C$ and $D$ are as before and $E = (h + l)\sin\alpha - w\cos\alpha$.
Integrating (5),

$$x = l\sin\alpha\,\log\left[\frac{\tan(\alpha/2)}{\tan((\alpha - \varphi)/2)}\right]. \tag{9b}$$

Using (9) and (6) with $l = 16$, $h = 10$, and $w = 4$ as in Table 1, we obtain Table 2. The last row will be explained shortly.

Table 2 Maximum Displacement with $\theta + \varphi$ Constant.

$\alpha$                    20°    30°    40°    50°    60°    70°
$\varphi$ (degrees)           7     11     15     18     22     26
$f$ (feet)                  0.8    1.1    1.5    1.9    2.3    2.7
$\bar{\theta}$ (degrees)     16     24     33     41     49     57

How can we compare the two tables? After all, different things are constant in the two cases. A rough average value of $\alpha$ can be computed for Table 1 by noting that $\alpha$ varies between $\theta$ and $\theta + \varphi$ as $\varphi$ varies between 0 and its optimum value. Thus we set $\bar{\alpha} = \theta + \varphi/2$. Likewise for Table 2, $\bar{\theta} = \alpha - \varphi/2$. Interpolating in Table 1 with the $\bar{\theta}$ of Table 2 used as $\theta$, or doing the similar thing with the tables interchanged and using $\bar{\alpha}$ instead of $\bar{\theta}$, we see that the estimates of $f$ are within about 20% of each other. This suggests that a table of $\bar{\theta}$ (or $\bar{\alpha}$) versus the maximum $f$ will be about the same for almost any method of turning. How could you test this idea?

Thus we conclude that the rear end of a bus turning left moves about 1½ feet to the right, or more if the driver makes a sharp turn.
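Table 2 can be recomputed in the same spirit from (9a), (9b), and (6). The sketch below assumes the sign convention $E = (h + l)\sin\alpha - w\cos\alpha$, which is the one that follows from expanding (7) and that reproduces the tabulated angles; the function name is illustrative.

```python
import math


def max_displacement_constant_alpha(alpha_deg, l=16.0, h=10.0, w=4.0):
    """Return (phi in degrees, f in feet) when theta + phi = alpha is constant."""
    alpha = math.radians(alpha_deg)
    c = math.hypot(h + l, w)
    d = (h - l) * math.sin(alpha) + w * math.cos(alpha)
    e = (h + l) * math.sin(alpha) - w * math.cos(alpha)  # assumed sign, see lead-in
    phi = 0.5 * (math.asin(d / c) + math.asin(e / c))    # (9a)
    x = l * math.sin(alpha) * math.log(
        math.tan(alpha / 2) / math.tan((alpha - phi) / 2))  # (9b)
    f = w * math.cos(phi) - w + (h + l) * math.sin(phi) - x  # (6)
    return math.degrees(phi), f


if __name__ == "__main__":
    for alpha_deg in (20, 30, 40, 50, 60, 70):
        phi_deg, f = max_displacement_constant_alpha(alpha_deg)
        mean_theta = alpha_deg - phi_deg / 2
        print(f"alpha = {alpha_deg} deg: phi = {phi_deg:.1f} deg, "
              f"f = {f:.2f} ft, mean theta = {mean_theta:.0f} deg")
```

Printing the mean wheel angle alongside $f$ makes it easy to set the two turning strategies side by side, as the comparison of the tables suggests.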
Long Chain Polymers
Our booming synthetic fabric industry relies on chemical reactions that produce long chain organic polymers. Thus it is important to understand the nature, speed, and end products of polymerization reactions. We study one type of reaction here and another in Problem 3. The material is adapted from C. Tanford (1961, Ch. 9).
We need some background in chemistry. A simple reaction of the type we wish to study is

(anhydride ring RCH-CO-O-CO-NH) + H-(NH-RCH-CO)_n-R'  →  H-(NH-RCH-CO)_{n+1}-R' + CO2,

where $n \ge 0$ and the radical R' provides the mechanism for the reaction by breaking open the anhydride ring. We write the reaction symbolically:

$$A + M_n \longrightarrow M_{n+1} + \mathrm{CO_2}. \tag{10}$$
The compound $M_n$ is called a polymer of length $n$. For fixed temperature and pressure, the rate of a chemical reaction like (10) depends on the probability of a collision between an $A$ molecule and an $M_n$ molecule. This is proportional to the product of their concentrations, which is written $[A][M_n]$. Thus the rate of reaction (10) is $k_n[A][M_n]$, where the rate constant $k_n$ is practically the same for all $n$ because the reaction mechanism is the same. We assume $k_n = k$ for all $n$.

So much for background. A typical process starts with a concentration $a(0)$ of $A$ and a concentration $m_0(0)$ of $M_0$ (which is simply R'H). How does the system evolve? To begin with, since the concentration of R' does not change, we have the conservation equation

$$\sum_{n=0}^{\infty} m_n(t) = m_0(0), \tag{11}$$

where $m_n(t)$ is $[M_n]$ at time $t$. From (10) we have

$$\frac{dm_0}{dt} = -k\,a(t)\,m_0(t), \tag{12a}$$

$$\frac{dm_n}{dt} = k\,a(t)[m_{n-1}(t) - m_n(t)], \qquad n \ge 1, \tag{12b}$$

$$\frac{da(t)}{dt} = -k\,a(t)\sum_{n=0}^{\infty} m_n(t). \tag{12c}$$
Combining (11) and (12c), we obtain

$$\frac{da(t)}{dt} = -k\,m_0(0)\,a(t),$$

which has the solution

$$a(t) = a(0)\,e^{-\lambda t}, \qquad \lambda = k\,m_0(0). \tag{13a}$$

We can simplify (12a) and (12b) by defining a variable $y$ such that

$$dy = k\,a(t)\,dt, \qquad y = 0 \text{ at } t = 0, \tag{13b}$$

for (12a) and (12b) can then be rewritten as

$$\frac{dm_0(y)}{dy} = -m_0(y), \tag{14a}$$

$$\frac{dm_n(y)}{dy} = m_{n-1}(y) - m_n(y), \qquad n \ge 1. \tag{14b}$$

These equations are easily solved inductively to obtain

$$\frac{m_n(y)}{m_0(0)} = \frac{e^{-y}y^n}{n!},$$

a Poisson distribution with parameter $y$. (Do it.) Thus the mean chain length is $y$ and the variance of the length is also $y$. We now use (13) to determine $y$ as a function of $t$:

$$y(t) = \int_0^t k\,a(0)\,e^{-\lambda t}\,dt = \frac{a(0)}{m_0(0)}\left(1 - e^{-\lambda t}\right). \tag{15}$$
How can we produce polymers of some desired length $l$? Setting $y = l$ we obtain

$$t = -\frac{\log[1 - m_0(0)\,l/a(0)]}{k\,m_0(0)}. \tag{16}$$

Since the Poisson distribution can be approximated by a normal distribution when $y$ is large, about 95% of the lengths lie between $l - 2\sqrt{l}$ and $l + 2\sqrt{l}$. Note that altering reaction conditions like temperature and pressure only affects the time $t$ that we let the reaction run and has no effect on the distribution of final chain lengths.

Let's examine briefly what happens if we relax the assumption that $k_n = k$. If $k_n$ is a decreasing function of $n$, the reaction proceeds more slowly than expected as time goes on, because the polymers are becoming longer. Also, the final distribution of chain lengths is more peaked than a Poisson distribution, because the shorter chains increase in length faster than the longer chains. You should be able to explain what happens when $k_n$ is an increasing function of $n$.
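The Poisson result can be checked numerically by integrating (12a) to (12c) directly and comparing with $e^{-y}y^n/n!$. The sketch below uses a hand-written fourth order Runge-Kutta step and truncates the chain lengths at `n_max`; the function names, the truncation, and the parameter values are all assumptions made for illustration, not part of the original text.

```python
import math


def simulate_polymerization(a0, m00, k, t_end, n_max=40, steps=2000):
    """Integrate (12a)-(12c) with RK4, keeping chain lengths 0..n_max.

    Returns (a(t_end), [m_0, ..., m_n_max]). Truncating at n_max is an
    approximation that is harmless while the mean length stays well below it.
    """
    def rhs(state):
        a, m = state[0], state[1:]
        da = -k * a * m00                      # (12c) combined with (11)
        dm = [-k * a * m[0]]                   # (12a)
        dm += [k * a * (m[n - 1] - m[n]) for n in range(1, len(m))]  # (12b)
        return [da] + dm

    state = [a0, m00] + [0.0] * n_max
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs([s + 0.5 * h * d for s, d in zip(state, k1)])
        k3 = rhs([s + 0.5 * h * d for s, d in zip(state, k2)])
        k4 = rhs([s + h * d for s, d in zip(state, k3)])
        state = [s + h * (d1 + 2 * d2 + 2 * d3 + d4) / 6
                 for s, d1, d2, d3, d4 in zip(state, k1, k2, k3, k4)]
    return state[0], state[1:]


def poisson_fraction(y, n):
    """Predicted m_n / m_0(0) = exp(-y) y^n / n!."""
    return math.exp(-y) * y ** n / math.factorial(n)


def time_for_mean_length(length, a0, m00, k):
    """Reaction time from (16) that gives mean chain length `length`."""
    return -math.log(1.0 - m00 * length / a0) / (k * m00)


if __name__ == "__main__":
    a0, m00, k, t_end = 10.0, 1.0, 1.0, 0.5    # illustrative values only
    a, m = simulate_polymerization(a0, m00, k, t_end)
    y = (a0 / m00) * (1.0 - math.exp(-k * m00 * t_end))   # (15)
    for n in range(8):
        print(n, f"{m[n] / m00:.6f}", f"{poisson_fraction(y, n):.6f}")
```

Making `k` depend on `n` inside `rhs` is a small change, which gives a direct way to explore the question the text poses about increasing or decreasing $k_n$.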
How can we use these results in chemical engineering? We can use (16) to determine the optimum values for $m_0(0)$ and $a(0)$. To make the reaction run as fast as possible, both $m_0(0)$ and $a(0)$ should be large. Since there is an upper limit to the possible combined concentrations of $A$ and $M_0$ (only so much will fit in a given volume) we obtain an inequality:

$$0 \le a(0) \le f(m_0(0)), \tag{17}$$

where $f' < 0$. (Why?) As already noted, the larger $a(0)/m_0(0)$ is, the faster the reaction proceeds. Since fast reactions save time, increasing $r = a(0)/m_0(0)$ increases the number of batches we can process. Unfortunately, when we stop the reaction the concentration of $A$ remaining will be

$$\alpha = m_0(0)(r - l). \tag{18}$$

Thus, if we cannot reclaim the remaining $A$ or if the reclamation expense increases with quantity, a larger $r$ will increase our expenses. By studying the details of plant operation we can construct a cost function depending on $t$, $m_0(0)$, $r$, and $\alpha$, where $t$ and $\alpha$ are given by (16) and (18). We can then minimize this subject to (17). In this way it is possible to reduce costs considerably over what they might be for a naive approach to plant design.
PROBLEMS
1. This problem relates to the pollution of Lake Ontario.

(a) Use the subscript $e$ to refer to Erie, the subscript $o$ to refer to Ontario, and the subscript $i$ to refer to non-Erie inflow to Ontario. Show that (1) should be replaced by

$$V_o\frac{dP_o}{dt} = r_e P_e + r_i P_i - (r_e + r_i)P_o.$$

(b) Using the fact that about five-sixths of the inflow of Ontario is the outflow from Erie, deduce that

$$P_o(t) = e^{-t/\tau}\left\{P_o(0) + \frac{1}{6\tau}\int_0^t [5P_e(x) + P_i(x)]\,e^{x/\tau}\,dx\right\}.$$

(c) Assuming that all pollution inflow to Erie and Ontario ceases except for the uncontrollable flow from Erie to Ontario, compute the 50 and 5% cleanup times for Ontario. To do this, you will need to know how the pollution level of Erie compares with that of Ontario. No data are available on this, but Erie seems to be more polluted. Try various values for $P_e(0)/P_o(0)$.

(d) In the model we discussed the effect of various types of pollutant behavior on cleanup times. If necessary, reconsider this for Ontario.
2. This problem deals with simple compartment models in physiology. See D. S. Riggs (1963) for further discussion, especially his Sec. 6-14 which treats problems of fitting curves to such models.

(a) Treat the blood as a compartment containing a substance being removed by a physiological mechanism. What sort of equations could describe the concentration of the substance as a function of time? We need simple models. How can they be tested?

(b) Let's be specific and assume that the removal is being done by the kidneys. In this case the rate of removal is usually proportional to the amount of the substance passing through a kidney per unit time. Construct a simple model based on concentrations.

(c) The substance in (b) is a drug whose concentration should lie between 2 and 5 milligrams per 100 cubic centimeters. If the drug is taken internally, about 60% is quickly absorbed and most of the remainder is lost. In about 8 hours the body of an average person eliminates about 50% of the drug. A normal adult has about 5 liters of blood. Design a dosage program for the drug.

(d) Most drugs are taken orally and require time to be absorbed by the blood. At the same time the drug is being removed by the kidneys. Model the situation. Here is some data on drugs taken from J. V. Swintosky (1956). The first drug is sulfapyridine, and the second is sodium salicylate. An O indicates oral administration, and an I indicates intravenous administration [to which (a) should apply]. The column headed 'grams' gives the initial dosage, and the other columns indicate the concentration in the blood at various times after administration. How well does your model fit? Could you explain any discrepancies?
Concentration (milligrams/cubic centimeters)
Administration
o o
o
2
4
6
8
10
12
24
Grams
hour
hours
hours
hours
hours
hours
hours
hours
4.0
2 ,3
ï¿½. . 7
3.6
3. 0
4.0
1.8
2.8
3.9
3. 5
1.8
3.8
3. 4
2. 6
2.1
1 .8
3.7
3.3
2. 7
2. 3 12. 5
2. 0 2.6
2. 2
10
5. 0
14.4
15. 7
10
39.4
31.4
24. 2
1 6. 2
20
56. 7
43.0
35. 2
26.6
For a further discussion of drug kinetics see R. E. Notari ( 1 9 7 1). (e) General anesthetics are usually administered through the lungs. What factors do you think are important in modeling anesthetic concentration in the blood ? Outline a model. The rate of absorption through the lungs may vary considerably from one substance to another. An anesthetist monitors an anesthetized patient to decide how to adjust the flow of anesthetic. Do you think the absorption rate should be taken into account ? Explain. 3.
Another sort of polymerization reaction is called simple reaction of this sort is
condensation.
A
Study [Mn(t)J. Warning: Counting reactions is a bit tricky ; don't count Mk + M n and M n + M k . Also, beware of Mn + M n, because [Mn] 2
counts each collision twice. 4.
At what age are your friends going to be marrying most rapidly ? I S ? 20 ? 25 ? 3 0 ? What factors cause people to marry ? Sociologists and psycholoÂ gists generally believe that peer group behavior plays a maj or role. Can we model this ? The following attempt is adapted from G. Hernes (1972). (a) It is assumed that a person's chances of marrying in some small time interval are proportional to f.t and to the fraction of people in the person's age group that are already married e ) This is based on the idea that there is overt and covert peer group pressure to marry. Show that this leads to the differential equation
/).t
mt.
m' cm l - m). =
(b) (c) (d)
(
Solve the equation. The model may be criticized for a variety of reasons ; for example, it assumes that all people feel the same pressure to marry regardless of individual and age as long as the fraction of the peer group that is married is the same. Discuss the model critically. Suppose = How can this help the model ? What is the soluÂ tion to the differential equation ? In terms of properties of determine what fraction of people in your age class will eventually marry. Hernes finds that
c c(t).
log
c(t),
[c(t)] abt log k, b =
<
1,
e(t)
gives a rather good fit, but a variety o f other forms for may do just as well. Can you suggest properties a good i s likely to have ? (e) We have ignored the problem caused by the fact that, since met) was zero when your peer group was younger, the differential equation predicts that it will remain zero. How can we get around this ? Remember that we are trying to provide a model that will roughly fit the situation. Discuss how to handle the fact that people are not identical. Can this be incorporated in somehow ? (We could expect the average value of to decrease with time as those who are more likely to marry do so,) A. J. Coale ( 1 9 7 1 ) found that, by making a linear transformation of the age axis, x and a scale transformation of the proportion married axis, y = mlm( (0), a curve was obtained that was closely fitted by
e(t)
(f)
e(t)
c
(g)
=
at b, -
How does this fit in with the previous discussion ? (Coale used data from a variety of countries ; Hernes used data from a U.S. census.) K. C. Land ( 1 9 7 1 ) discusses a Poisson model for divorce.
5. How long does it take an object to fall from a great height? You may need some or all of the following facts:

1. The drag force on similarly shaped objects depends on the density of the air $\rho$, the velocity of the object $v$, the speed of sound $c$, and a characteristic dimension of the object $d$.
2. The velocity of sound depends on the pressure $p$ and density $\rho$ of the air.
3. If $h$ is the height above the ground, $dp = -g\rho\,dh$, where $g$ is acceleration due to gravity.
4. Pressure satisfies $p \propto \rho T$, where $T$ is temperature in degrees Kelvin.
5. The force of gravity is $mg$, where $g \propto r^{-2}$ and $r$ is the distance from the object to the center of the earth. The radius of the earth is about 4000 miles.
Before plunging in blindly and trying to build a model that uses all these facts, you had better consider just what it is you want to know. The problem is rather vague : How great a height ? How accurate an answer ? Of course you may decide you need all these facts and some
6.
additional ones besides. Whatever you decide, come up with a reasonable method for obtaining an answer of some sort. What is the best way for our company to run its advertising campaign ? A variety of models has been developed to study the effects of advertising on consumer behavior by people who do marketing research. The more elaborate models often allow for more than one type of consumer behavior, each type having at least two and sometimes several constants to estimate. Obviously one can fit data better with complicated models, but frequently one such complicated model is about as good as another. This is a delicate,), can you develop other simple models with some reasonable notion of independence that fit the data as well as Coleman's model ?
8.2. NUMERICAL METHODS
In this section we are not concerned with how the actual numerical solution of a problem is carried out, but rather with models that lead to a need for numerical solutions. A variety of numerical methods exists in the literature, and most computing centers have at least one package for solving differential equations numerically. If you wish or need to write your own package, a simple numerical technique is given at the end of this chapter.

Towing a Water Skier
You may have noticed that a water skier tends to slow down when the boat towing him turns. Two factors influence this: (1) For the same amount of power, a turning boat travels slower than a boat moving on a straight course, and (2) the skier tends to follow a shorter path than the boat. Can we model the situation?

Let's look at the skier first. If he drops the tow rope, he will lose speed very rapidly because of the drag of the water. Thus the skier always moves practically along the line of the tow rope unless he can do something to affect his direction of motion. He can exert some control through the position in which he holds his skis in the water. To avoid this rather grave complication, we assume that the skier does the easiest thing and keeps his skis pointed toward the boat. Thus we have created a skier whose rope is always taut (because of the drag of the water) and who always moves in the direction of the rope. Let the tow rope length be $l$, the coordinates of the rear of the boat be $[x(t), y(t)]$, and the coordinates of the skier be $[r(t), s(t)]$. By considering the length of the rope and the direction of motion of the skier we obtain two separate equations:

$$l^2 = (r - x)^2 + (s - y)^2, \tag{19a}$$

$$\frac{s'(t)}{r'(t)} = \frac{s - y}{r - x}. \tag{19b}$$
We manipulate these two equations to obtain a set of two first order equations for $r(t)$ and $s(t)$. By differentiating (19a) with respect to $t$, clearing fractions in (19b), and rearranging each of them, we obtain two equations in $r'$ and $s'$:

$$2(r - x)r' + 2(s - y)s' = 2(r - x)x' + 2(s - y)y', \qquad (s - y)r' - (r - x)s' = 0. \tag{20}$$

Solving for $r'$ and $s'$ and using (19a), we obtain

$$r' = \frac{x'(r - x)^2 + y'(r - x)(s - y)}{l^2}, \qquad s' = \frac{y'(s - y)^2 + x'(r - x)(s - y)}{l^2}. \tag{21}$$
Before we can solve (21) we must model the motion of the boat. Knowing the boat's course is enough to let us determine the skier's course: If we know $y$ as a function of $x$, multiplying (21) by $dt/dx$ gives a set of two differential equations which can be solved numerically for $r$ and $s$ as functions of $x$. This gives us the path of the skier parametrically in terms of $x$. We now determine his speed in terms of the boat's speed $v$. The $x$ component of the boat's velocity is $v/[1 + y'(x)^2]^{1/2}$ by basic calculus and geometry. Thus the skier's speed is

$$\sqrt{r'(t)^2 + s'(t)^2} = \sqrt{r'(x)^2 + s'(x)^2}\;x'(t) = \frac{v\sqrt{r'(x)^2 + s'(x)^2}}{[1 + y'(x)^2]^{1/2}}.$$
Hence the skier's speed at any time equals the boat's speed at that time multiplied by some function of the paths of the skier and the boat. Alternatively, we can solve (21) under the assumption that the boat's speed always equals 1. We will obtain the path of the skier and, by the argument just given, a 'speed' for the skier which is equal to the skier's true speed divided by the boat's true speed. This enables us to treat the problem of the boat's speed as a completely separate issue. Since it is a complicated hydrodynamic problem, we do not attempt to solve it. Consequently we obtain only a partial solution to the problem we started out with; however, the full solution will be easy to find if and when we obtain information on the speed of a speedboat making a turn.

We could try all sorts of paths for a turn. The simplest to program is a circular arc, and this is a reasonable path. By defining

$$x(t) = Bl\cos\left(\frac{t}{Bl}\right) \quad\text{and}\quad y(t) = Bl\sin\left(\frac{t}{Bl}\right), \tag{22}$$

I obtained a circular course with radius equal to $B$ rope lengths and a speed of 1. I decided it would be interesting to note how far the angle of the rope deviated from a line straight back from the boat. By substituting (22) into (21) and integrating numerically I found that with $B = 1$, a very sharp turn, the speed of the skier dropped markedly: After a 90° turn by the boat his speed was 67% of the boat's and his angle with the line of the boat was 47°. After a full 180° turn the figures were 45% and 63°. By the time the radius of the turn was twice the tow rope length the situation had improved considerably: The skier's speed was still 86% of the boat's speed after a 180° turn, and his angle was only 30°. The changes were fastest at the start of the turn; in fact, after 45° the skier's speed had already dropped to 92%, and his angle was 23°. With a turn of radius four times the tow rope length the speed change was negligible: still 96% of the boat's speed after 180°. The tow rope's angle with the line of the boat was only 14°.

The lesson is quite clear: To keep up a water skier's speed be sure the radius of your turn is at least twice the tow rope length. A radius four or more times the tow rope length results in almost no loss in the speed of the skier except for a possible loss due to the boat slowing in the turn. Alternatively, the skier can maintain his speed by pointing his skis somewhat outward from the direction of the turn so that he does not move in the direction of the rope. The analysis of this situation appears complicated.
NUMERICAL METHODS
We seem to have completed the problem. This was my reaction until I examined the data a bit more closely. It is reproduced here.

              B = 1               B = 2               B = 4
$\theta$    $\varphi$   $w$     $\varphi$   $w$     $\varphi$   $w$
  0°           0°    1.00          0°    1.00          0°    1.00
 15°          13°    0.97         12°    0.97          9°    0.98
 30°          23°    0.91         19°    0.94         13°    0.97
 45°          31°    0.85         23°    0.92         14°    0.97
 60°          38°    0.78         26°    0.90         14°    0.96
 75°          43°    0.72         27°    0.88         14°    0.96
 90°          47°    0.67         28°    0.88         14°    0.96
105°          51°    0.62         29°    0.87         14°    0.96
120°          54°    0.58         29°    0.87         14°    0.96
135°          57°    0.54         30°    0.87         14°    0.96
150°          59°    0.51         30°    0.86         14°    0.96
165°          61°    0.48         30°    0.86         14°    0.96
180°          63°    0.45         30°    0.86         14°    0.96

The angle the boat has turned through is $\theta$, the water skier's angle with the boat is $\varphi$, and his speed divided by the boat's is $w$. (Incidentally, finding the formula for $\varphi$ is a nontrivial problem. You should do it.) Note that $w$ appears to depend only on $\varphi$. Let's prove this for any motion and compute the function $w(\varphi)$. For simplicity we move the coordinate system so that at $t = 0$ the boat is at the origin and its direction of motion is along the $x$ axis. Hence we have at $t = 0$,
$$x = y = 0, \qquad x' = 1, \qquad y' = 0, \qquad w^2 = (r')^2 + (s')^2.$$

By (21), $r' = r^2/l^2$ and $s' = rs/l^2$. Hence

$$w^2 = \frac{r^4}{l^4} + \frac{r^2 s^2}{l^4} = \frac{r^2}{l^2}, \qquad \cos\varphi = \frac{r}{l},$$

and so $w = \cos\varphi$. Such a simple formula is unlikely to depend on more than a simple geometric argument. Can you find one?
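The tabulated angles can be spot-checked numerically. The sketch below rests on an assumption that is not spelled out in this section: if the skier always moves along the rope and the boat follows a circle of radius $B$ rope lengths, the rope angle should satisfy $d\varphi/d\theta = 1 - B\sin\varphi$ with $\varphi(0) = 0$. The function names and the Runge-Kutta stepper are illustrative choices, and the relation $w = \cos\varphi$ is the one derived above.

```python
import math

def rope_angle(B, theta_deg, steps=2000):
    """Integrate dphi/dtheta = 1 - B*sin(phi), phi(0) = 0, with classical RK4.

    ASSUMPTION: this ODE is our own reading of the towing model (skier moves
    along the rope, boat on a circle of radius B rope lengths); it is not
    quoted from the text. Returns phi in radians.
    """
    def f(phi):
        return 1.0 - B * math.sin(phi)

    h = math.radians(theta_deg) / steps
    phi = 0.0
    for _ in range(steps):
        k1 = f(phi)
        k2 = f(phi + 0.5 * h * k1)
        k3 = f(phi + 0.5 * h * k2)
        k4 = f(phi + h * k3)
        phi += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return phi

def skier_row(B, theta_deg):
    """Return (rope angle in degrees, speed ratio w = cos(phi)) for one table entry."""
    phi = rope_angle(B, theta_deg)
    return math.degrees(phi), math.cos(phi)

if __name__ == "__main__":
    for B in (1, 2, 4):
        for theta in (45, 90, 180):
            phi_deg, w = skier_row(B, theta)
            print(f"B={B}  theta={theta:3d}  phi={phi_deg:5.1f} deg  w={w:.2f}")
```

Comparing the printed rows with the table is a quick way to test whether the assumed equation for $\varphi$ is the right one.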
QUANTITATIVE DIFFERENTIAL EQUATIONS

A Ballistics Problem
During World War II, mathematicians were asked to construct tables for gunners relating angle to range. Bombardiers required similar information. How was this done? In this case the model is fairly straightforward, and the emphasis is on the mathematics, in contrast to most other models we have studied.

We wish to construct a model of the motion of an object under the influence of gravity and air resistance. This material is adapted from T. v. Karman and M. Biot (1940, pp. 139-143). We ignore the complications due to lifting forces and possible rotation of the object. Hence the only forces involved are a downward force of $mg$ and a drag force opposite the direction of motion of $mf(v)$, where $m$ is the mass of the object, $v = |\mathbf{v}|$ is the magnitude of its velocity, and $g$ is the acceleration due to gravity. In an $xy$ coordinate system with the positive $y$ axis directed downward, we can write this as a vector equation: $\mathbf{v}' = (0, g) - [f(v)/v]\,\mathbf{v}$. Over a fairly large practical range, $f(v)$ is nearly proportional to $v^2$.

We let $\theta$ be the angle between $\mathbf{v}$ and the $x$ axis and resolve the acceleration into components parallel and perpendicular to $\mathbf{v}$. To do this we need to know the value of $\mathbf{v}'$ in the two directions. Since $\mathbf{v} = (v\cos\theta, v\sin\theta)$,

$$\mathbf{v}' = (\cos\theta, \sin\theta)\,v' + (-v\sin\theta, v\cos\theta)\,\theta',$$

the parallel component is simply $v'$ and the perpendicular component is $v\theta'$. Resolving the acceleration due to gravity into components parallel and perpendicular to $\mathbf{v}$ and using the fact that drag acts parallel to $\mathbf{v}$, we obtain

$$v' = g\sin\theta - f(v), \tag{23a}$$
$$v\theta' = g\cos\theta. \tag{23b}$$

Multiplying (23a) by $vg\cos\theta$, dividing by (23b), and rearranging, we obtain

$$g\,\frac{d(v\cos\theta)}{d\theta} = -v f(v),$$

an equation we cannot solve analytically unless $f$ has some special form. If we assume that $f(v) = kv^2$, we obtain

$$\frac{g\,dv_x}{v_x^3} = -\frac{k\,d\theta}{\cos^3\theta},$$

where $v_x = v\cos\theta$, the component of $\mathbf{v}$ in the $x$ direction. Hence

$$v_x^{-2} = \frac{2k}{g}\int \cos^{-3}\theta\,d\theta.$$
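Suppose that $\mathbf{v} = (v_0, 0)$ when $\theta = 0$. Carrying out the integration and rearranging,

$$v_x = r(\theta), \qquad v_y = r(\theta)\tan\theta, \qquad v = \frac{r(\theta)}{\cos\theta}, \tag{24a}$$

where

$$r(\theta) = v_0\left[1 + \frac{k v_0^2}{g}\left(\frac{\sin\theta}{\cos^2\theta} + \log\frac{1 + \sin\theta}{\cos\theta}\right)\right]^{-1/2}. \tag{24b}$$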
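We now integrate these velocity equations to obtain the path of the object. Let the origin be at $\theta = 0$. Using in succession the chain rule, $v_x = v\cos\theta$, and (23b), we have

$$\frac{dx}{d\theta} = \frac{v_x}{\theta'} = \frac{v\cos\theta}{g\cos\theta/v} = \frac{v^2}{g}. \tag{25}$$

Similarly,

$$\frac{dy}{d\theta} = \frac{v_y}{\theta'} = \frac{v\sin\theta}{g\cos\theta/v} = \frac{v^2\tan\theta}{g}. \tag{26}$$

Combining (24a) with (25) and (26) we obtain the coordinates parametrically in terms of $\theta$:

$$[x(\theta), y(\theta)] = \left[\int_0^\theta \frac{r(\theta)^2\,d\theta}{g\cos^2\theta},\ \int_0^\theta \frac{r(\theta)^2\sin\theta\,d\theta}{g\cos^3\theta}\right]. \tag{27}$$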
Since $r(\theta)$ is given by (24b), the integrations in (27) can be carried out numerically. An alternative approach is to solve the original differential equations directly by numerical methods. This is more sensitive to numerical errors, because the original equations are linked second order equations while (27) simply involves two disjoint integrals.

We can supplement (27) by obtaining time information. Using (24a) to eliminate $v$ in (23b), $r(\theta)\theta' = g\cos^2\theta$. Thus

$$t = \int_0^\theta \frac{r(\theta)\,d\theta}{g\cos^2\theta}. \tag{28}$$

Since our time origin is at $\theta = 0$, we must integrate back from 0 in (27) and (28) to obtain the initial position for a projectile fired upward.
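As an illustration of how the integrals in (27) and (28) might be evaluated, here is a minimal sketch using a composite Simpson rule. The function names, the value of `G`, and the choice of Simpson's rule are assumptions made for this example; the text only says that the integrations can be done numerically. With $k = 0$ the results should reduce to the drag-free values $x = v_0^2\tan\theta/g$, $y = v_0^2\tan^2\theta/(2g)$, and $t = v_0\tan\theta/g$, which gives a simple check.

```python
import math

G = 9.81  # assumed value of g in m/s^2; any consistent units will do

def r(theta, v0, k, g=G):
    """Horizontal velocity r(theta) from (24b), for drag f(v) = k*v**2."""
    s, c = math.sin(theta), math.cos(theta)
    bracket = 1.0 + (k * v0 ** 2 / g) * (s / c ** 2 + math.log((1.0 + s) / c))
    # For strongly negative theta and large k*v0**2 the bracket can become
    # nonpositive; math.sqrt then raises ValueError (no such state exists).
    return v0 / math.sqrt(bracket)

def simpson(func, a, b, n=400):
    """Composite Simpson rule with n (even) subintervals; also works for b < a."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def trajectory_point(theta, v0, k, g=G):
    """Return (x, y, t) at velocity angle theta via (27) and (28).

    y is measured downward, as in the text; theta < 0 integrates back
    toward the upward part of the flight.
    """
    x = simpson(lambda a: r(a, v0, k, g) ** 2 / (g * math.cos(a) ** 2), 0.0, theta)
    y = simpson(lambda a: r(a, v0, k, g) ** 2 * math.sin(a) / (g * math.cos(a) ** 3), 0.0, theta)
    t = simpson(lambda a: r(a, v0, k, g) / (g * math.cos(a) ** 2), 0.0, theta)
    return x, y, t

if __name__ == "__main__":
    for k in (0.0, 0.01):
        print(k, trajectory_point(math.pi / 4, v0=10.0, k=k))
```

Calling `trajectory_point` with a negative angle corresponds to integrating back from 0, as described above for a projectile fired upward.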
PROBLEMS

1. Consider the left turn squeeze model in Section 8.1.

(a) Discuss in class how you could take steps toward answering the question raised at the end of the model by using a computer: How can we show that the results are not very sensitive to the form of $\theta(t)$ for reasonable methods of turning? (Or perhaps, discover that they are.) Be specific.

(b) If the class has access to a computer, implement the plan formulated in (a).

2. In this problem the question is: How can we formulate a model that does not require an excessive amount of computer time? Most galaxies appear to be fairly flat disks with the stars moving about a common center like a huge swarm of planets or asteroids. Nearly all the mass of the galaxy is in the central region, because the stars there are much closer together. Some astronomical photographs (A. Toomre and J. Toomre, 1973) show pairs of galaxies which appear to have collided, or at least passed close to one another and caused large streamers of stars to be pulled out. How could you test this idea using a mathematical model? Recall Newton's law of gravity, $F = Gm_1m_2/r^2$, is directed along the line between two bodies, and Newton's basic law is $F = ma$. See A. Toomre and J. Toomre (1972, 1973) afterward if you want to see how they did it.
3.
This problem is adapted from M. S. Bartlett (1972). Can we construct a simple model of the spread of epidemics? We take as our example measles, a prevalent childhood disease before vaccinations became available. The incubation time is ½ week. During this time a child seems normal but is able to infect others. After this time the child is isolated until recovery, at which point he or she is immune. Roughly speaking, measles outbreaks have been more severe during alternate years.

(a) Construct a simple differential equation model allowing for three categories: susceptible, infective, and isolated/recovered. Allow for an influx of new susceptibles due to births. Assume an infective makes contact with members of the population at random and infects a contacted susceptible with probability $p$.

(b) Construct a simple difference equation model.

In what follows use the differential equation model, the difference equation model, or both.

(c) Show that your model has some sort of cyclic behavior. If it doesn't, fix it, because measles outbreaks definitely tend to occur in a cyclic pattern.

(d) Estimate the parameters in your model to fit the ½ week incubation and 2 year cycle observations. Do the parameter values appear realistic?

(e) Measles outbreaks are seasonal (60% below average in summer and 60% above average in winter), but if you've constructed a model of the sort I expected, a slight change in the parameters in (d) will cause the period to differ slightly from 2 years and so the peak will drift from season to season. What can be done? Most children make contact with more children during the school year than during vacation. Use this to fix up the model by introducing a seasonal variation in $p$. How much variation is required? Does this amount seem reasonable?

(f) Can you allow for contact between school districts?

(g) How much faith do you have in the model? What are its faults? Can you suggest improvements?

The following data from Bartlett's article may be useful.
Annual measles deaths in London (1647-1660)

Year     1647  1648  1649  1650  1651  1652  1653
Deaths      5    92     3    33    33    62     8

Year     1654  1655  1656  1657  1658  1659  1660
Deaths     52    11   153    15    80     6    74

Mean time between epidemics for some towns in England and Wales (1940-1956)

Population (thousands)    Time between outbreaks (weeks)
        1046                          73
         658                         106
         415                          92
         269                          93
         180                          94
         113                          80
          66                          74
          22                          86
          18                          92
          12                          79
          11                          98
           7                         199
           4                         105
4. Organisms have internal oscillations, like circadian rhythm, which have natural periods, like 24 hours, and are sustained by the organism itself. What mechanisms make such cycles possible? It seems natural to look for an explanation in terms of chemical reactions. This model is adapted from J. Maynard Smith (1968, pp. 108-115). One of the simplest biochemical reactions that seems likely to offer an explanation is

1. A gene catalyzes messenger RNA (mRNA) production.
2. The mRNA leaves the nucleus and catalyzes the production of a protein.
3. A portion of the protein enters the nucleus and combines reversibly with the gene to form a product which does not produce mRNA.

Let $M$ be the concentration of mRNA and $P$ the concentration of protein. For simplicity we assume that there are many cells in the organism that produce this protein, and so many copies of the relevant gene are present. Let $G$ be the fraction of genes that are active, that is, not combined with the protein.

(a) The rate of the reaction

Gene + protein → inactive

is proportional to the product $GP$, and the rate of the reaction

Inactive → gene + protein

is proportional to $1 - G$. Show that the value of $G$ at equilibrium is

$$G = \frac{1}{1 + aP}$$

for some $a > 0$. (You may wish to look at Problem 9.2.8.)

(b) Proteins and mRNA both decay. Defend the equations

$$\frac{dM}{dt} = \frac{b}{1 + aP} - cM, \qquad \frac{dP}{dt} = eM - fP$$

for some positive $a$, $b$, $c$, $e$, and $f$. Show that by suitably rescaling $M$, $P$, and $t$ we can rewrite them as

$$m' = \frac{1}{1 + p} - \alpha m, \qquad p' = m - \beta p, \tag{29}$$

for some positive $\alpha$ and $\beta$.

(c) It can be shown that (29) does not lead to sustained oscillations. In fact, no simple chemical reactions do; see Problem 9.2.8 and also J. S. Griffith (1968). One possible solution is to take into account the fact that it takes time for molecules to travel between the nuclei (where the genes are) and the sites where the protein is synthesized. Incorporate this into (29).

(d) Do the equations developed in (c) have sustained oscillations?

5.
Walt Disney studios once filmed a simulated chain reaction which took place as follows. A large number of cocked mousetraps was placed on the floor of a bare room. Each trap was specially built so that when it was sprung it would throw two ping pong balls into the air. Flying ping pong balls that landed on unsprung traps would spring the traps and thereby set more balls flying. The reaction was started by tossing a single ping pong ball into the room. How should the simulation be designed so that the duration of the chain reaction will be reasonable? The audience must be able to see it, but it shouldn't last too long. The following treatment is adapted from G. F. Carrier (1966, pp. 2-6).

There are three obvious ways to influence the duration of the simulation: Change (1) the flight time of the balls, (2) the number of traps per square foot, or (3) the size of the room (keeping the number of traps per square foot the same by simultaneously changing the total number of traps). We consider each of these separately.

It can be observed that the flight times of the balls for a given brand of trap are nearly the same. We assume for simplicity that they're identical. After hitting a trap, very few balls are able to rebound enough to hit another trap with enough force to spring it. Thus a ball that hits a sprung trap or an unsprung trap becomes dead in most cases. We assume that this always happens. A ball that hits the bare floor may or may not rebound enough to be able to set off a trap; it depends on the flooring material. At any rate, there is a probability $p$ that a random ball will land on a trap with enough force to spring it (if it is still cocked). The value of $p$ depends only on how far apart the traps are and on the nature of the floor. (The latter is a fourth variable which we can adjust. You should convince yourself that this would have the same effect as changing the spacing of the traps.)

(a) Criticize the various assumptions we have made. What sorts of errors do they introduce into our predictions?

(b) Argue that the duration of the simulation is nearly proportional to the flight time of a ball. What advantages and disadvantages do you see in trying to adjust the duration by adjusting the flight time?
From now on, we use the flight time of a ball as the unit of time measurement.

(c) Let $t$ be the length of time from the start of the simulation until $b$ balls are in the air together, where $b$ is much less than the total number of balls. Show that approximately $(2p)^t = b$, and so $t = \log b/\log 2p$. Consider two rooms in which the number of traps per square foot is the same, but one room is $b$ times as large as the other. Show that the difference in the length of the simulations is about $\log b/\log 2p$. What advantages and disadvantages are there to adjusting the length of the simulation in this way? To what extent can you change the duration of the 'middle' range, say the time to go from 5% sprung traps to 90% sprung traps? Discuss adjusting mousetrap density.

So far our discussion has dealt primarily with small $t$. Large $t$ is harder. Intermediate $t$ can be handled fairly easily. The rest of this problem is devoted to it.
(d) If there are $N$ balls in flight at time $n$ and $U$ unsprung traps out of a total of $M$, show that the probability of having exactly $2B$ balls in flight at time $n + 1$, given that $T$ of the traps are hit, is

$$P(B) = \binom{U}{B}\left(\frac{T}{M}\right)^B\left(1 - \frac{T}{M}\right)^{U-B},$$

where $\binom{U}{B}$ is the binomial coefficient '$U$ choose $B$', the number of ways to choose $B$ objects from a set of $U$. Using the approximation that, if $N$ is small compared to $M$, no trap is hit by more than one of the $N$ balls, show that the probability that $T$ traps will be hit is approximately

$$H(T) = \binom{N}{T}p^T(1 - p)^{N-T}.$$

(e) Describe a Monte Carlo simulation for the mousetrap demonstration. What inaccuracies have been introduced by our approximations?

(f) Both $P$ and $H$ can be approximated quite accurately by normal distributions for large values of $U$ and $N$. The means and variances are

            Mean       Variance
For $H$     $pN$       $Np(1 - p)$
For $P$     $TU/M$     $UT(M - T)/M^2$

Let's consider the middle range of the experiment when $U$ and $N$ are both large. Show that, if $N_n$ is the average number of balls in the air at time $n$, approximately,

$$N_{n+1} = \frac{2pN_nU_n}{M}. \tag{30}$$

Since $U_{n+1} = U_n - N_{n+1}/2$, this can be solved recursively for $N_n$ and $U_n$, but we can't see what's going on very well just by looking at (30).

(g) Write $f(n)$ for the fraction of unsprung traps at time $n$ and show that (30) becomes

$$f(n) - f(n + 1) = 2p[f(n - 1) - f(n)]f(n). \tag{31}$$
We approximate (31) by a differential equation in hopes of obtaining an easier problem. Replace $f(n + 1)$ and $f(n - 1)$ by their first degree Taylor polynomials about $n$. Show that this leads to $f'(n) = 2pf'(n)f(n)$, and so $f'(n) = 0$, a poor approximation. This means we need higher degree Taylor polynomials.

(h) Use quadratic Taylor polynomials to obtain the approximation

$$f'(n) + \tfrac{1}{2}f''(n) = p[2f'(n) - f''(n)]f(n),$$

and so

$$[pf(n) + \tfrac{1}{2}]f''(n) = [2pf(n) - 1]f'(n). \tag{32}$$

(i) Can you describe the solution to (32)? You cannot obtain an analytical solution, but (32) can be integrated once to obtain

$$f' = 2f - \frac{2}{p}\log(2pf + 1) + C.$$

(j) Using (31), (32), or some other device, find a way to answer the following questions. About how long does it take the simulation to go from $f(n) = 0.95$ to $f(n) = 0.1$? How large would you make $p$? Why?
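One possible "other device" is simply to iterate the recurrence (31) on a computer. The sketch below is only an illustration: the starting values (`f_start`, `first_drop`), the stopping rules, and the function names are assumptions chosen for the example, not part of the problem.

```python
def unsprung_fractions(p, f_start=1.0, first_drop=1e-4, max_steps=10_000):
    """Iterate (31) in the form f(n+1) = f(n) - 2p [f(n-1) - f(n)] f(n).

    ASSUMPTION: the reaction is seeded by taking f(0) = f_start and
    f(1) = f_start - first_drop; both values are illustrative.
    Returns the list [f(0), f(1), f(2), ...].
    """
    fs = [f_start, f_start - first_drop]
    while len(fs) < max_steps:
        prev, cur = fs[-2], fs[-1]
        nxt = cur - 2.0 * p * (prev - cur) * cur
        if nxt >= cur or nxt <= 0.0:
            break  # no further change is resolvable, or the model has left (0, 1]
        fs.append(nxt)
    return fs

def steps_between(fs, hi=0.95, lo=0.1):
    """Number of time steps from the first f <= hi to the first f <= lo.

    Returns None if the sequence never gets down to lo.
    """
    start = next((i for i, f in enumerate(fs) if f <= hi), None)
    end = next((i for i, f in enumerate(fs) if f <= lo), None)
    if start is None or end is None:
        return None
    return end - start

if __name__ == "__main__":
    for p in (0.6, 0.8, 1.0):
        fs = unsprung_fractions(p)
        print(p, round(fs[-1], 3), steps_between(fs))
```

Because `steps_between` returns `None` when the lower threshold is never reached, running it for several values of $p$ also shows whether a given target fraction is attainable at all under (31).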
THE HEUN METHOD

In case you have access to a computer but not to a library routine for solving differential equations, here is the Heun method for solving a system of first order equations of the form

$$y_i' = f_i(x, y_1, \ldots, y_n) = f_i(x, \mathbf{y}).$$

To take a single step of size $h$, set

$$\mathbf{y}^* = h\mathbf{f}(x, \mathbf{y}(x)) + \mathbf{y}(x), \qquad \bar{\mathbf{y}} = h\mathbf{f}(x + h, \mathbf{y}^*) + \mathbf{y}^*,$$

and

$$\mathbf{y}(x + h) = \tfrac{1}{2}[\mathbf{y}(x) + \bar{\mathbf{y}}].$$

A check on the accuracy is provided by $\mathbf{y}^* - \mathbf{y}(x + h)$, which can be expected to be greater than the actual error. A better check is provided by using two values of $h$, since the error in integrating from $x = a$ to $x = b$ is roughly proportional to $h^2$. Thus, by using values of $h$ differing by a factor of 2, we obtain two estimates for $\mathbf{y}$, and their difference should be about three times the error obtained by using the estimate based on the smaller step size.
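The step just described translates almost line for line into code. The following is a minimal sketch in plain Python; the names `heun_step` and `heun_solve` and the list-of-floats representation of $\mathbf{y}$ are choices made for this example.

```python
import math

def heun_step(f, x, y, h):
    """One Heun step for the system y' = f(x, y); y is a list of floats."""
    slope0 = f(x, y)
    y_star = [yi + h * si for yi, si in zip(y, slope0)]        # y* = h f(x, y) + y
    slope1 = f(x + h, y_star)
    y_bar = [ysi + h * si for ysi, si in zip(y_star, slope1)]  # ybar = h f(x+h, y*) + y*
    return [0.5 * (yi + ybi) for yi, ybi in zip(y, y_bar)]     # y(x+h) = (y + ybar)/2

def heun_solve(f, a, b, y0, n):
    """Integrate from x = a to x = b in n equal steps; returns y(b)."""
    h = (b - a) / n
    x, y = a, list(y0)
    for _ in range(n):
        y = heun_step(f, x, y, h)
        x += h
    return y

if __name__ == "__main__":
    # Test problem y' = y, y(0) = 1, whose exact value at x = 1 is e.
    f = lambda x, y: [y[0]]
    coarse = heun_solve(f, 0.0, 1.0, [1.0], 100)[0]
    fine = heun_solve(f, 0.0, 1.0, [1.0], 200)[0]
    print("difference of the two estimates:", abs(coarse - fine))
    print("actual error at the smaller h:  ", abs(fine - math.e))
```

On this test problem the ratio of the two printed numbers can be compared with the factor of three mentioned above.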
CHAPTER 9

LOCAL STABILITY THEORY

If you wish a fuller discussion of the theoretical background than that presented here, consult a textbook. Some introductory differential equations textbooks contain a chapter or two on qualitative methods. F. Brauer and J. A. Nohel (1969) treat the general theory and discuss some specific problems.
9.1. AUTONOMOUS SYSTEMS

Suppose we are dealing with a system in which time is the independent variable. Absolute time may or may not appear. If absolute time appears, we are dealing with a historical system. If absolute time is irrelevant, the system is autonomous. Another way of looking at this is that the dependent variables are functions only of differences in time. Suppose someone gives you money each day starting with $1 today, $2 tomorrow, $3 the following day, and so on. Let the amount for day $n$ be $M(n)$. Since you receive $n$ dollars on the $n$th day, $M(n) = n$, a historical system. However, $M(n) = M(n - 1) + 1$, an autonomous system. Thus the distinction between historical and autonomous systems is sometimes artificial.

Here we are concerned only with the stability of autonomous systems and limit most of our discussion to systems with two first order equations involving two endogenous variables. This makes our discussion simpler, allows for two-dimensional diagrams, and still permits us to consider a variety of interesting models. Most of the mathematical ideas can be generalized to systems of higher order equations with several endogenous variables.
Suppose there is no time delay. Let the endogenous variables be $x = x(t)$ and $y = y(t)$. Since the equations are first order, we assume that they have been solved for $x'$ and $y'$ in terms of $x$ and $y$, giving

$$x' = f(x, y), \qquad y' = g(x, y), \tag{1}$$

where for the moment we do not say much about the functions $f$ and $g$. Time can be completely eliminated from (1) by dividing one equation by the other to give

$$\frac{dy}{dx} = \frac{g(x, y)}{f(x, y)}. \tag{2}$$

We can plot the solutions of the first order differential equation (2) in the $xy$ plane. This is called the phase plane. Furthermore, an arrow can be attached to each curve indicating the direction of motion along the curve with time. This picture contains all the information in (1) except the rate of motion along the curves. (For $n$ equations in $n$ endogenous variables you can imagine the curves as lying in $n$-dimensional phase space.)

The division of (1) to give (2) cannot be carried out if $f(x, y) = 0$ for some values of $x$ and $y$. If $g(x, y) \neq 0$, the curve is vertical. If $g(x, y)$ is also zero, $(x, y)$ is called an equilibrium point. A solution that starts at an equilibrium point can never move, since $x' = y' = 0$ by (1). Such solutions are plotted simply as points.

By going from (1) to (2) we obtain a convenient way of representing solutions graphically. Also, (2) is usually more analytically tractable than (1). However, the loss of the time variable presents difficulties when we study stability questions.

There are two types of qualitative questions we can ask about the paths of solutions in the phase plane. If a solution starts near an equilibrium point, will it move toward the equilibrium point or away from it and in what manner? Questions of this type are dealt with in the subject area known as stability in the small or local stability. The second type of question does not assume that we start near an equilibrium point. It concerns what is known as stability in the large or global stability, a more difficult mathematical topic than local stability theory. I discuss this area briefly in Section 9.4.

Global behavior is more varied than local behavior. For two first order equations the possibilities include divergence, convergence to an equilibrium point, periodicity, and convergence to a limit cycle. A limit cycle is a periodic solution such that a solution which starts nearby will approach it. (In the phase plane, a periodic solution appears as a simple closed curve.) In higher dimensions (i.e., three or more first order equations), global behavior is much more varied and much less understood.
9.2. DIFFERENTIAL EQUATIONS

Theoretical Background
The basic idea in local stability theory of differential equations is to approximate the system (1) by two linear first order differential equations near an equilibrium point. Suppose that $(x_0, y_0)$ is an equilibrium point; that is,

$$f(x_0, y_0) = g(x_0, y_0) = 0. \tag{3}$$

We want to approximate $f$ and $g$ near the point $(x_0, y_0)$. Recall that for a function of a single variable, say $h(x)$, we can obtain a fairly good approximation near $x_0$ by using $h(x_0) + h'(x_0)(x - x_0)$ instead of $h(x)$. The same idea can be used with functions of two variables: We can approximate $f(x, y)$ near $(x_0, y_0)$ by

$$f(x_0, y_0) + \frac{\partial f(x_0, y_0)}{\partial x}(x - x_0) + \frac{\partial f(x_0, y_0)}{\partial y}(y - y_0).$$

Here $\partial f(x_0, y_0)/\partial x$ denotes the partial derivative of $f$ evaluated at the point $(x_0, y_0)$, that is,

$$\left.\frac{\partial f}{\partial x}\right|_{x = x_0,\ y = y_0}.$$

To avoid cumbersome notation we denote this partial derivative by $f_x$. The meanings of $f_y$, $g_x$, and $g_y$ should be obvious. Thus we have

$$u' \approx f_x u + f_y v, \qquad v' \approx g_x u + g_y v, \tag{4}$$

where $u = x - x_0$, $v = y - y_0$, and $f_x$, $f_y$, $g_x$, and $g_y$ denote the partial derivatives of $f$ and $g$ evaluated at $(x_0, y_0)$. If we assume that the approximate equalities in (4) are exact, the equations can easily be solved. The solution of this homogeneous, linear system gives information about the local stability of the solutions of (1). Since our object is not to derive mathematical results, we merely state the following theorem which can be found in almost any differential equations textbook that discusses local stability theory.
THEOREM. If $(x_0, y_0)$ is an equilibrium point for the system (1), define the real numbers $b$, $c$, and $d$ by

$$b = \frac{f_x + g_y}{2}, \qquad c + di = b + \sqrt{b^2 - (f_x g_y - g_x f_y)},$$

where $i = \sqrt{-1}$.

If $c < 0$, the equilibrium point is stable; that is, solutions starting nearby move closer. If $c > 0$, the equilibrium point is unstable; that is, solutions starting nearby move further away. Furthermore, the distance from the equilibrium point behaves roughly like $Ke^{ct}$. If $c = 0$, additional tests will be needed to determine the nature of the equilibrium point. Necessary and sufficient conditions for $c < 0$ are $b < 0$ and $f_x g_y > g_x f_y$.

If $d \neq 0$, the solutions near the equilibrium point spiral about it in a roughly elliptical fashion with a period approximately equal to $2\pi/d$. The amplitude of the oscillation increases or decreases, depending on the sign of $c$. If $d = 0$, there is no oscillation.
Typical phase plane diagrams are illustrated in Figure 1 where it is assumed that $(x_0, y_0)$ lies in the first quadrant.

[Figure 1. Phase plane diagrams near equilibrium points. Panels: $c < 0$, $d = 0$; $c < 0$, $d \neq 0$; $c > 0$, $d = 0$; $c > 0$, $d \neq 0$.]
For those familiar with linear algebra, we note that $c$ is the maximum of the real parts of the eigenvalues of the matrix $\|\partial f_i/\partial x_j\|$, where $x_1 = x$, $x_2 = y$, $f_1 = f$, and $f_2 = g$. Stated in this way, the stability result is valid for a system of $n$ first order equations in $n$ endogenous variables, but the nature of the oscillations is more complicated.

When we made the assumption that (4) is exact, we constructed a model of (1). Since the condition $c = 0$ is fragile, it is reasonable to suppose that we could not easily decide between stability and instability if $c = 0$. This is indeed the case. We do not study this situation here. The condition $d = 0$ is equivalent to $b^2 - (f_x g_y - g_x f_y) \geq 0$, which can be put into the form

$$d = 0 \text{ if and only if } (f_x - g_y)^2 + 4f_y g_x \geq 0 \tag{5}$$

by a little algebra. In particular, no oscillation occurs if $f_y g_x \geq 0$. Since $d = 0$ actually corresponds to an inequality, the case of oscillation versus nonoscillation is not fragile.
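The quantities in the theorem are easy to compute once the four partial derivatives at the equilibrium point are known. The sketch below is one way to do it; the function name, the returned labels, and the tolerance `tol` used to decide when $c$ or $d$ counts as zero are assumptions made for illustration.

```python
import cmath

def classify_equilibrium(fx, fy, gx, gy, tol=1e-12):
    """Compute b, c, d from the theorem and describe the equilibrium point.

    fx, fy, gx, gy are the partial derivatives of f and g at (x0, y0).
    Returns (c, d, stability, oscillates).
    """
    b = (fx + gy) / 2.0
    # cmath.sqrt returns the principal root, so d >= 0 here.
    z = b + cmath.sqrt(b * b - (fx * gy - gx * fy))
    c, d = z.real, z.imag
    if c < -tol:
        stability = "stable"
    elif c > tol:
        stability = "unstable"
    else:
        stability = "undetermined (c = 0)"
    return c, d, stability, abs(d) > tol

if __name__ == "__main__":
    # A lightly damped oscillator: u' = v, v' = -9.81 u - 0.2 v
    print(classify_equilibrium(0.0, 1.0, -9.81, -0.2))
    # A saddle: u' = u, v' = -v
    print(classify_equilibrium(1.0, 0.0, 0.0, -1.0))
```

When `oscillates` is true, the approximate period from the theorem is `2 * math.pi / d`.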
Frictional Damping of a Pendulum

Friction slows a pendulum down. It also changes its period. Will you need to allow for this change in designing a pendulum clock? If so, how?

We want to study the motion of a pendulum in an attempt to understand mathematically how frictional forces slow it down. These forces arise from the motion of the pendulum in the air, water, or whatever medium it is suspended in. In contrast to this, the motion of a frictionless pendulum is periodic. Using different methods, we studied the period of a frictionless pendulum in Section 2.2.

Consider a pendulum as shown in Figure 2. Since our primary interest is in damping due to friction, we make several simplifying assumptions whose removal is discussed briefly after the model is analyzed.

1. All the weight is concentrated as a point of mass $m$ at the end of a piece of wire of length $l$. (If $l$ is replaced by the distance from the pivot to the center of mass, the following results will remain valid.)
2. The wire does not stretch or wrap around its pivot, and so the length $l$ is independent of the angle of the pendulum.
3. There is no wind, shaking, and so on, that can disturb the motion of the pendulum.
Let the angle of the pendulum be $\theta = \theta(t)$, where $\theta = 0$ is the rest position of the pendulum. The gravitational force acting on the pendulum is $mg$. It is partially balanced by tension in the wire. The resultant is the force $-mg\sin\theta$ acting on the pendulum along its direction of motion.

[Figure 2. A pendulum.]

The only other force affecting the motion of the pendulum is the frictional force we are examining. Empirical studies show that a function that depends only on the velocity gives a good approximation to such forces. Since the velocity of the pendulum depends on its angular velocity $\omega$, we assume that the frictional force is of the form $-r(\omega)$, where $r$ is a differentiable function. This is a retarding force, and so $r(\omega)$ has the same sign as $\omega$. In particular, $r(0)$ is both nonnegative and nonpositive, and therefore is zero. We postpone further assumptions concerning the nature of $r$ until they are needed. Newton's laws give

$$ml\theta'' = -mg\sin\theta - r(\omega). \tag{6}$$
We now show that our model predicts that friction causes the pendulum to slow down. Since $\omega = \theta'$, we can rewrite (6) as

$$\theta' = \omega, \tag{7a}$$
$$\omega' = -\frac{g}{l}\sin\theta - \frac{r(\omega)}{ml}. \tag{7b}$$

These equations are in the form (1), with $x = \theta$ and $y = \omega$. We set (7) equal to zero to find the equilibrium points. From (7a) we have $\omega = 0$. Since $r(0) = 0$, we deduce from (7b) that $\theta$ is a multiple of $\pi$. Because of the periodicity of the sine function, we need only consider $\theta = 0$ and $\theta = \pi$. The latter case corresponds to the pendulum being straight up. It is left as an exercise to show that this equilibrium point is unstable.
We have $(\theta_0, \omega_0) = (0, 0)$. The partial derivatives are

$$(\theta')_\theta = 0, \qquad (\theta')_\omega = 1, \qquad (\omega')_\theta = -\frac{g}{l}, \qquad (\omega')_\omega = -\frac{r'(0)}{ml}. \tag{8}$$

Since $r(\omega)$ is an increasing function near $\omega = 0$, it is reasonable to assume that $r'(0) > 0$. (This really is an assumption; consider $r(\omega) = \omega^3$.) In the theorem we have

$$b = -\frac{r'(0)}{2ml}, \qquad c + di = b + \sqrt{b^2 - \frac{g}{l}}.$$

Since $\sqrt{b^2 - g/l} < |b|$, $c < 0$. It follows that the motion of a pendulum is locally stable; that is, it dies out. We see that $d$ is nonzero if $b^2 < g/l$, which can be rewritten as $r'(0) < 2m\sqrt{gl}$. Hence the pendulum oscillates if $r'(0) < 2m\sqrt{gl}$ and does not oscillate if $r'(0) > 2m\sqrt{gl}$. In the latter case the frictional forces are very large, and it is as if the pendulum were moving in molasses.

How does the change in the period of the pendulum compare with the damping? From the theorem, the pendulum will oscillate at about half its initial amplitude after a time $t$, where $e^{ct} = \tfrac{1}{2}$. Hence $t = \log_e(2)/|b|$. This requires about $t/(2\pi/d)$ oscillations of the pendulum. Hence the pendulum loses half its amplitude after about

$$\frac{\log 2}{2\pi}\,\frac{d}{|b|} = 0.11032\,\frac{d}{|b|} = 0.11032\sqrt{\frac{g}{lb^2} - 1} \tag{9}$$

oscillations. Call this number $n$. Squaring and rearranging, we obtain $g/(lb^2) = 82n^2 + 1 \approx 82n^2$. The ratio of the period of the pendulum to the period of a frictionless pendulum is

$$\left(1 - \frac{lb^2}{g}\right)^{-1/2} \approx 1 + \frac{lb^2}{2g} \approx 1 + \frac{1}{164n^2}. \tag{10}$$

Thus the period increases by about $0.61/n^2$ percent, a very small change. This prediction can be tested experimentally. Since a pendulum takes quite a long time to slow down in air, $n$ is large in this case. It follows that the effect of friction on the period is quite small. If this were not so, the period of a pendulum would depend on barometric pressure and pendulum clocks would not keep accurate time.
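The arithmetic behind (9) and (10) can be checked with a few lines of code. In this sketch the function names and the sample values of $b$, $g$, and $l$ are illustrative assumptions; only the formulas come from the discussion above.

```python
import math

def half_amplitude_oscillations(b, g=9.81, l=1.0):
    """Number of swings n for the amplitude to halve, from (9).

    Requires b**2 < g/l, i.e. the oscillating case of the theorem.
    """
    return (math.log(2.0) / (2.0 * math.pi)) * math.sqrt(g / (l * b * b) - 1.0)

def period_increase_percent(b, g=9.81, l=1.0):
    """Exact percentage increase in the period over the frictionless case, from (10)."""
    return 100.0 * ((1.0 - l * b * b / g) ** -0.5 - 1.0)

if __name__ == "__main__":
    b = -0.01  # hypothetical damping value, in 1/s
    n = half_amplitude_oscillations(b)
    print("n =", n)
    print("exact increase (%):", period_increase_percent(b))
    print("0.61 / n**2 (%):   ", 0.61 / n ** 2)
```

For light damping the last two printed values should nearly agree, which is the content of the approximation $0.61/n^2$.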
It is possible to replace (6) by the more general equation $ml\theta'' = f(\theta, \omega)$, where we assume only that $f_\omega$ and $f_\theta$ are both negative near zero as they are in (8). Since $f_\theta = -mg$ is a good approximation, our conclusions are unchanged.
Species Interaction and Population Size

Interaction between species lies at the heart of ecology. Some claim that these interactions cause the nearly cyclic fluctuations observed in some populations. Others claim that other factors are responsible. What can a simple mathematical model contribute to the debate?

Since our theorem allows only two endogenous variables, we assume only two species are interacting. Let $x$ be the number of organisms in the first species and $y$ the number of the second. There are three basic types of interaction between species:

1. The first species preys on the second (either direct predation or as a parasite).
2. Both species compete for more or less the same limited resources (e.g., plants competing for sunlight).
3. The two species live in a symbiotic relationship with each other (e.g., nitrogen fixing bacteria on the roots of peas and beans).
Predation is discussed below. Competition and symbiosis are treated sketchily, and the details left as exercises.

The assumption of autonomy implies that the environment of the species is constant except for factors whose change depends only on the number of organisms of the two types. Because of the form of (1), no time lag can be used. Since species require time to reproduce, the absence of a time lag may be a serious deficiency. Furthermore, the past history of the population determines the age mix and general physical condition of the present population. It is an open question how serious a restriction avoiding the past is. If it is serious, difference equations or mixed differential difference equations will be needed. See R. M. May (1973) for a relevant discussion.

Let $x$ be the number of predators and $y$ the number of prey. It is intuitively clearer to think in terms of the net growth rates of the two species:

$$\text{Predator: } \frac{x'}{x} = r(x, y), \qquad \text{Prey: } \frac{y'}{y} = s(x, y). \tag{11}$$

The equilibrium condition is then $r = s = 0$. A historically important special case is the Volterra-Lotka equations:

$$r(x, y) = a + by, \qquad s(x, y) = c + dx. \tag{12}$$

J. G. Kemeny and J. L. Snell (1962, Ch. 3) discuss this special model.
We deal with a more general model in which $r$ and $s$ are only vaguely specified. To be able to say something about the stability of the system, we must make some assumptions about $r$ and $s$. If the population size of species 1 does not affect the population growth of species 2, then $s_x = 0$. Similarly, if species 1 does not affect its own population growth through crowding, resource exhaustion, and so on, we will have $r_x = 0$. Because the absence of an effect leads to zero for the partial derivative, an indication of how the species affect one another gives us information about the signs of the various partial derivatives. We may be able to make educated guesses about their relative magnitudes as well. Actual data collection is very difficult at best.

What effect does a change in predator population have on the net growth rate of the prey? Since predators consume prey, $s_x < 0$. If the predator population increases, there will be less food per predator, and so $r_x < 0$. Another way to decrease the number of prey per predator is to reduce the prey population, so we expect $r_y > 0$. The sign of $s_y$ is harder to determine. As the prey population increases in the absence of predation, the net growth rate should decrease, because the species is now moving into less favorable parts of the environment. However, if the prey increase while the predators do not, there will be less predator pressure per individual of the prey population, which would lead to an increasing net growth rate. These two effects tend to cancel out. Note that, if the predators are prey at a higher level in the food chain, the argument just given for $s_y$ also applies to $r_x$. We have reached the following conclusions:

$r_x < 0$ or $r_x \approx 0$,   $r_y > 0$,   $s_x < 0$,   $s_y \approx 0$. (13)
We interpret $s_y \approx 0$ to mean that we can neglect $s_y$ compared to the other partial derivatives. Suppose there is an equilibrium point $(x_0, y_0)$ at which neither species has vanished. From (11) we have $r = s = 0$. The question of the existence of solutions to this equation is discussed later. Writing $f = xr$ and $g = ys$ for the right sides of (11), it is easily seen that at equilibrium $f_x = x r_x$, $f_y = x r_y$, and so on. From the theorem and (5) we have

$a = \frac{x r_x + y s_y}{2}$,   $b = xy(r_x s_y - r_y s_x)$. (14)
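As a rough numerical companion to (14), the sketch below evaluates the Jacobian of $x' = x\,r(x, y)$, $y' = y\,s(x, y)$ at an interior equilibrium and applies the usual trace and determinant test for a two-variable linear system. It works with the unnormalized trace $x r_x + y s_y$ rather than $a$, so it does not depend on how $a$ is scaled. The function name `local_stability` and the sample derivative values are hypothetical; the values were picked only to respect the sign pattern in (13).

```python
def local_stability(x, y, rx, ry, sx, sy):
    """Classify an interior equilibrium of x' = x*r(x, y), y' = y*s(x, y).

    x, y are the equilibrium populations; rx, ry, sx, sy are the partial
    derivatives of the net growth rates r and s evaluated at that point.
    """
    # Jacobian entries of (x*r, y*s) at a point where r = s = 0.
    fx, fy = x * rx, x * ry
    gx, gy = y * sx, y * sy
    trace = fx + gy              # x*rx + y*sy
    det = fx * gy - fy * gx      # x*y*(rx*sy - ry*sx)
    stable = trace < 0 and det > 0
    # Complex eigenvalues (oscillation) occur when trace^2 < 4*det.
    oscillates = trace * trace < 4 * det
    return trace, det, stable, oscillates


# Hypothetical derivative values: rx < 0, ry > 0, sx < 0, sy = 0.
trace, det, stable, oscillates = local_stability(
    x=1.0, y=1.0, rx=-0.1, ry=0.5, sx=-0.4, sy=0.0
)
print(f"trace={trace:.3f} det={det:.3f} stable={stable} oscillates={oscillates}")
```

With these sample values the predator's weak self-limitation makes the trace negative, while the interspecies terms $r_y$ and $s_x$ dominate the determinant.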
From (13) we see that $a$ is negative and $b > 0$. Thus we have local stability. If the environment is rather homogeneous, the self-limiting effect of species 2 will not come into play until the environment is nearly saturated. In this case we have $s_y > 0$, and instability is possible. Hence heterogeneity of the environment increases stability. See R. M. May (1972) and M. L. Rosenzweig (1971).

Do we have oscillation? From (5) we see that the answer is yes if and only if

$(x r_x - y s_y)^2 + 4xy\, r_y s_x < 0$. (15)
This holds if $r_x \approx 0$. Roughly speaking, (15) says that the interspecies effects on net birth rates are greater than the intraspecies effects. This certainly appears to be true in some situations. Unless we are willing to make a statement stronger than (13) we can't really say much more; however, this is quite a lot considering the vagueness of (13).

We now turn our attention to the existence of equilibrium points $(x_0, y_0)$. The arguments used to derive (13) did not use the assumption that we were at equilibrium, so we drop the assumption that the partial derivatives in (13) are evaluated at equilibrium. Another point is that, in deriving $s_y \approx 0$, we used the fact that the predator was severely limiting the prey; let's relax this by allowing $s_y < 0$ at low predator densities. The following discussion is entirely nongraphical. In simple cases such as this one, a graphical discussion may be preferable. You are asked to provide this in Problem 1.

To begin with, there are the trivial equilibrium points associated with $x = 0$ (no predators). We put them aside and look for equilibrium points with predators. Hence $x > 0$ and $y > 0$. Suppose that predators can survive if there are enough prey and if the number of predators does not exceed some critical value $x_m$. Since $r_y > 0$, we can solve $r(x, y) = 0$ for a unique $y = y(x)$, $x \le x_m$. By implicit differentiation

$\frac{dy}{dx} = -\frac{r_x}{r_y}$,

which is positive by (13). If predators cannot live in the absence of prey, we have $y(x) > 0$ for $0 < x \le x_m$. Substituting $y(x)$ in $s(x, y)$ we find

$\frac{ds}{dx} = s_x + s_y \frac{dy}{dx}$,
which is negative because of (13).

Now we can show that a unique equilibrium exists under certain conditions. We have just shown that $s(x, y(x))$ is a strictly decreasing function of $x$. Therefore an equilibrium point will exist if and only if $s(x, y(x))$ is positive for small $x$ and negative for large $x$. If such a point exists, it will be unique because $s(x, y(x))$ is strictly decreasing. To say that $s(x, y(x)) > 0$ means that we must remove prey to keep the prey population from increasing
beyond $y(x)$. This is likely to be true for small $x$ and false for large $x$, which is just what we want. If the predator population is kept rather low (small $x_m$) by exogenous forces such as hunting by humans, the prey may be able to provide food for the predator and still increase. In that case we cannot reach the portion of the curve $y(x)$ where $s(x, y(x)) \le 0$. Discuss this situation.

Let's relate our conclusions to the real world. The main result of our study is a model which proposes a mechanism for maintaining stability in a world that is changing. If the environment varies a lot in a time period comparable to that in which $x$ and $y$ move significantly toward equilibrium, our results will be useless. However, infrequent changes can be viewed as occasional displacements from equilibrium; for example, a change in the environment actually shifts the location of the equilibrium point by changing the functions $r$ and $s$; an infrequent epidemic changes the value of $x$ or $y$ but leaves the equilibrium point unchanged. If these displacements are not too large, our use of local stability theory shows that the system will tend to return to equilibrium. If the system possesses global stability, even large displacements will be damped out. See R. M. May (1973).

Most natural systems involve many predator and prey species. If we introduce one variable for each species, much more than the vague conditions in (13) will be needed to study stability. What can be done about this? If the prey species are sufficiently alike, we can lump them together as if they were one species. Likewise for the predators. In this way it may be possible to apply our conclusions to a system involving more than two species. Since the model would only make predictions about the size of the lumped species population, the individual populations may fluctuate wildly.

How can we gather data to test the model? Except in the physical sciences or in carefully controlled experiments, it is usually difficult to estimate first derivatives and nearly impossible to estimate higher derivatives. Therefore we should not try to verify (11) and (13) directly. To check the model we need some predictions that can feasibly be tested. The model predicts that the population sizes will exhibit damped oscillations with nearly constant periods if they are disturbed from equilibrium. If (4) is treated as an equality and solved, it can be shown that the relative maxima of $u$ and $v$ differ by a constant phase. As a result, we predict that the predator and prey cycles will be out of phase with one another by about the same amount from cycle to cycle. Of course, random disturbances cause variations, so neither of these predictions is perfect. We now have two predictions which it may be feasible to check: nearly constant period and nearly constant phase shift. In the last section of their article N. S. Goel et al. (1971) briefly discuss some experiments that have been done to check the model. The predictions are usually correct.
It is difficult to test the model on natural populations. The best-known candidate is the system consisting of the Canadian lynx and the snowshoe hare. Since it undergoes wild fluctuations, a global result is needed. However, the hare population fluctuates in the absence of lynx predators, so a simple lynx-hare model is wrong. Furthermore, the relative phases of the lynx and hare fluctuations seem wrong (Gilpin, 1973). A more promising model may be some sort of three-way system involving hares, vegetation, and (exogenously) the weather. For further discussion of this problem see L. B. Keith (1963).

We consider competition and symbiosis briefly. An important factor in the competition situation is the existence of an equilibrium point. This is discussed in Problem 3.3.3. For competition all the partial derivatives are negative. There will be stability if and only if $r_x s_y > s_x r_y$. Roughly speaking, this says that each species inhibits its own expansion more than the competing species does.

We now turn our attention to symbiosis. Assume that, if one species somehow increases, it will help the other to increase too. This means that $r_y$ and $s_x$ are positive. It follows from (5) that there is no oscillation, because the left side of (15) is then positive. Suppose we increase species 1 by a small percentage. Ignoring self-limitation, this is essentially the same as decreasing species 2 by the same percentage. Thus we expect $x r_x \approx -y r_y$ unless species 1 tends to limit itself. Self-limitation makes $x r_x$ an even larger negative number, and so $|x r_x| > y r_y$ in this case. If a similar result holds for the second species, the equilibrium will be stable, because stability is equivalent to $r_x s_y > s_x r_y$.
Keynesian Economics

J. M. Keynes's revolutionary work, The General Theory of Employment, Interest and Money, has had a profound effect on economic theory and practice, the latter beginning with Roosevelt's New Deal politics during the U.S. Depression of the 1930s. Here we study a crude bare bones model adapted from G. Gandolfo (1971). Let's begin with a list of variables that relate to the national economy:

C, desired level of consumption.
I, desired level of capital investment.
D, total demand for goods.
Y, national income.
L, desired amount of money to be held as cash on hand.
M, amount of money available.
R, cost of money (interest rate).
I'm sure you can add to the list, but we have enough for the time being. You may wish to come back later and add more.

Before discussing these variables, a word about measurement is worthwhile. Some of these quantities may be hard to measure, partly because they are imprecisely defined and partly because it is not clear what units we should use. Although lack of precision is a serious problem, we ignore it here because we can construct a model without resolving it and because attempting to eliminate it would involve us in deep economic considerations. We want to measure our variables in real terms, whatever this slippery phrase means. Economists use constant dollars, that is, dollars deflated to some standard year such as 1950. We avoid problems by assuming that our variables are somehow measured in constant dollars (except for $R$, which is a ratio of constant dollars). Note that $R$ is negative if the rate charged by moneylenders is less than the rate of inflation.

All this lack of precision is really a serious problem. If it is not resolved, two people may mean different things by the same terms and so the discussion of models will become hopelessly muddled. This may be part of the problem at the present time. People are arguing over whether or not the current (1974) combination of high unemployment and inflation (called stagflation) shows that Keynesian models cannot be used.

Back to our model. Capital investment over a period of time increases the efficiency of labor. To avoid this thorny problem, we deal with a short term model, that is, one in which the change in total capital investment is not significant. Technological development creates a similar problem which we also avoid by using a short term model.

Having said what we won't try to do, let's see what we can do. Our list of variables is too long to handle easily, so we need to know which are exogenous (independent), which are endogenous (dependent), and which we can ignore. Unfortunately, to do this sort of thing directly can be very difficult; it is often easier to sneak up on it through discussion. At equilibrium we will have $D = Y$ and $L = M$; that is, what we want is what we have. Since we are concerned with disequilibrium, the quantities $D - Y$ and $L - M$ are important. Since excessive demand for money drives up the interest rate and excessive demand for goods causes production to increase, we assume that

$R' = r(L - M)$,   $r'(0) > 0$,   $r(0) = 0$,
$Y' = y(D - Y)$,   $y'(0) > 0$,   $y(0) = 0$. (16)
This suggests that it would be nice if we could take $R$ and $Y$ as the basic variables, influencing their own growth through (16). Can we relate $L$, $M$, $D$, and $Y$ to $R$ and $Y$? Of course, $Y$ presents no problem: $Y = Y$.
Since $M$ is determined by the government, it is an exogenous variable. We assume that it is constant for the purposes of studying stability; however, it is interesting to ask how changes in $M$ influence the equilibrium value of $Y$; that is, what is the sign of $Y_M = \partial Y / \partial M$ at equilibrium? Back to this later.

What about $L$? It seems reasonable to assume that $L_R < 0$ and $L_Y > 0$, since people want to hold less cash as interest rates rise and the country needs more cash for transactions as national income rises. This is far from an explicit functional relationship, but we'll see how far we can go with it. This approach worked fairly well in the previous model.

To study the partial derivatives of $D$, it is convenient to break it into two parts: $D = C + I$. The value of $C_R$ should be zero or negative, since higher interest rates should, if anything, be an inducement to save. Can you defend the assumption $C_Y > 0$? What about $I_R < 0$ and $I_Y \ge 0$? It follows that $D_Y > 0$ and $D_R < 0$. In summary,

$C_R \le 0$,   $L_R < 0$,   $D_R < 0$,
$C_Y > 0$,   $L_Y > 0$,   $D_Y > 0$. (17)

We now compute the partial derivatives of $r$ and $y$ at equilibrium:

$r_R = r'(0) L_R < 0$,   $r_Y = r'(0) L_Y > 0$,
$y_R = y'(0) D_R < 0$,   $y_Y = y'(0)(D_Y - 1)$. (18)

The sign of $y_Y$ cannot be determined from (17). The stability conditions in the theorem are $r_R + y_Y < 0$ and $r_R y_Y > r_Y y_R$. A sufficient, but not necessary, condition for this to hold is $y_Y \le 0$. In words,
If the sensitivity of total demand to changes in the national income is less than unity, our Keynesian model is locally stable.

(In economics 'sensitivity' is a term for a partial derivative; the sensitivity of $A$ to $B$ is the amount $A$ changes when $B$ changes one unit.)

When does the proposition stated above apply? We must have $C_Y < 1$ to ensure $D_Y < 1$. What does $C_Y > 1$ mean? It says that, as income increases, desired consumption increases even faster; an unlikely possibility except in underdeveloped countries where it can cause severe problems. (See Problem 3.3.5.) We can't use this argument on $C + I$, because consumers and investors do not consult each other. However, $I$ may not be at all sensitive to $Y$; it is $Y'$ that we can expect $I$ to depend upon, since changes in $Y$ stimulate additional investment (if $Y$ increases) or liquidation (if $Y$ decreases). As a first approximation, $I_Y = 0$, and so the hypothesis of the proposition is satisfied.
Now suppose that we have stability. How will the equilibrium move in response to government adjustment of the money supply? At equilibrium, $L = M$ and $D - Y = 0$. Using the chain rule to differentiate these with respect to $M$ we obtain

$L_R R_M + L_Y Y_M = 1$   and   $D_R R_M + (D_Y - 1) Y_M = 0$.

Thus

$R_M = \frac{D_Y - 1}{\Delta}$   and   $Y_M = -\frac{D_R}{\Delta}$, (19)

where

$\Delta = (D_Y - 1) L_R - D_R L_Y$.
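The comparative statics can be checked numerically. Differentiating $L = M$ and $D - Y = 0$ with respect to $M$ gives the linear system $L_R R_M + L_Y Y_M = 1$, $D_R R_M + (D_Y - 1) Y_M = 0$, and the sketch below solves it by Cramer's rule. The function name `money_supply_response` and the sample sensitivities are hypothetical; the numbers were chosen only to satisfy the signs in (17) with $D_Y < 1$.

```python
def money_supply_response(L_R, L_Y, D_R, D_Y):
    """Return (delta, R_M, Y_M) for the two-equation Keynesian sketch.

    Solves  L_R*R_M + L_Y*Y_M = 1  and  D_R*R_M + (D_Y - 1)*Y_M = 0
    by Cramer's rule, where delta is the determinant of the system.
    """
    delta = (D_Y - 1.0) * L_R - D_R * L_Y
    if delta == 0:
        raise ValueError("delta is zero; the linear system is singular")
    R_M = (D_Y - 1.0) / delta   # change in interest rate per unit of M
    Y_M = -D_R / delta          # change in national income per unit of M
    return delta, R_M, Y_M


# Hypothetical sensitivities: L_R < 0, L_Y > 0, D_R < 0, 0 < D_Y < 1.
L_R, L_Y, D_R, D_Y = -2.0, 0.5, -1.0, 0.8
delta, R_M, Y_M = money_supply_response(L_R, L_Y, D_R, D_Y)
print(f"delta={delta:.3f} R_M={R_M:.3f} Y_M={Y_M:.3f}")
```

Making `D_R` smaller in magnitude while holding the other inputs fixed shrinks `Y_M`, which illustrates why a weak response of demand to the interest rate limits what monetary adjustment can do in this model.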
Comparing (18) and (19), we see that the stability condition $r_R y_Y > r_Y y_R$ is equivalent to $\Delta > 0$. Since we are assuming stability, $\Delta > 0$. By (17) and (19), $Y_M > 0$. If the hypothesis in the proposition is true, $R_M < 0$.

Government often tries to influence national income by adjusting $M$, for example, by making more money available when unemployment is high. (Making money available is not simply a matter of running the printing presses; this only leads to inflation with little change in the money supply as measured in real dollars. In the United States the Federal Reserve Board changes the percentage of cash reserves that member banks must hold.) What effect does this have? Since $Y_M > 0$, this should increase national income. Because of our assumption that we are dealing with the short term, national income can increase only by an increase in labor. Hence unemployment should decrease. The size of the change depends on the change in $M$ and the size of $Y_M$. If $D_R$ is small, we see from (19) that changing $M$ may not be a very effective way to fight large scale unemployment.

Governments try other methods of influencing the economy, which may be more effective than controlling $M$. Can you change the model to allow for government control of $R$? What about direct attempts to influence $D$ through deficit spending? Can you extend the model to allow for effects of taxation? Taxation can influence $C$ and $I$ by redistributing $Y$ and by providing tax incentives for investment.
More Complicated Situations

B. Noble (1971, Ch. 6) presents two engineering applications: one in hydrodynamics and the other in chemical engineering. I have limited the material in this section to two first order equations. T. von Kármán and M. A. Biot (1940, pp. 249-255) use two second order equations to discuss the stability of an airplane. L. S. Pontryagin (1962, pp. 213-220) uses three equations to discuss the stability of a steam engine governor. N. Rashevsky (1964, Part IV) uses various numbers of equations to discuss endocrine systems. He assumes the equations are linear. Instead, one can apply local stability theory to equations of a fairly general form.
PROBLEMS

Problems 1 and 2 deal with the predator-prey model, but do not use local stability theory.

1. Study the existence of equilibria in the predator-prey model graphically by plotting the two curves $x' = 0$ and $y' = 0$. Limit yourself to $x > 0$ and $y > 0$, and use (13) to help determine slopes.

2. The gypsy moth caterpillar causes considerable damage to trees. Consider a predator-prey model in which the prey is the gypsy moth and the predator is one of several parasitic wasps that attack gypsy moth caterpillars. Since the wasp larvae feed on gypsy moth caterpillars, killing the caterpillar also kills the wasp larvae. A spray program is instituted for gypsy moth caterpillars, using a general purpose insecticide. Suppose that the result is an increase in the death rate of gypsy moths and wasps by an amount $p$ independent of the number present.

(a) Is this a reasonable approximation? Why?

(b) Using the results of the previous problem, predict the effect of the moth control program on the equilibrium size of the wasp population. Show that more data are needed to predict the effect on the moth population.

(c) Let $x_0$ and $y_0$ be the solutions of the equations $r(x, y) - p = 0$ and $s(x, y) - p = 0$. Compute $dx_0/dp$ and $dy_0/dp$ and show that they have the same signs as $s_y - r_y$ and $r_x - s_x$, respectively, without using (b).

(d) Use (c) and (13) to verify the graphical conclusions derived in (b).

(e) Suppose that the wasps have little effect on the size of the gypsy moth population. This is probably the case when the gypsy moth population suddenly explodes. (Why?) Show that in this case spraying will cause the gypsy moth population to decrease.

(f) Suppose that the gypsy moth is limited by the parasite rather than by intraspecies competition. This is probably the case when the gypsy moth population is fairly stable. (Why?) Show that in this case spraying will cause the gypsy moth population to increase.
This model has rather interesting implications for insecticide usage policies. The following experience agrees with the prediction in (f). In 1868 the cottony cushion scale insect was introduced into the United States from Australia and began to attack citrus groves. The ladybird beetle was introduced afterward as a predator to control the pest. When the citrus industry later tried to use DDT to reduce the scale population further, the number of pests actually increased (N. S. Goel et al., 1971). Because the gypsy moth population undergoes wild swings, I have doubts about the accuracy of the above predictions. However, the model does indicate some problems that must be considered in planning a control program.

(g) The following data refer to percentages of true fish in catches brought into the port of Fiume, Italy. The remainder of the catch (sharks, rays, etc.) were primarily predators which feed on true fish. Can you explain the data? Note that during World War I, which ended in 1918, the amount of fishing was below peacetime levels. The data come from M. Braun (1975), who obtained them from the work of U. d'Ancona.

Year           1914  1915  1916  1917  1918  1919  1920  1921  1922  1923
True fish (%)    88    79    78    79    64    73    84    84    85    89

3. Develop the symbiosis model for species interaction.
4. The Keynesian model involves a variety of functions. Can you describe some of the graphs associated with them? In particular, what does the $Y$-$R$ phase plane look like? You need to graph $D = Y$ and $L = M$ in the $Y$-$R$ plane.

5. (a) Suppose we replace (16) in the Keynesian economics model by the more general equations $Y' = y(D, Y)$ and $R' = r(L, M)$. What can you say about the form of $y$ and $r$? Do our conclusions remain valid?

(b) In the Keynesian model we could include sensitivity of investors to changes in $Y$ and $R$, that is, $I(Y, R, Y', R')$. Can you say anything useful about such a model?

6. In this problem we consider the armaments of two antagonistic countries or blocs. Suppose that (1) provides an adequate description of the amount of armaments $x$ and $y$ of the two antagonists. Allow for maintenance costs and the pressure for higher or lower armament levels provided by the opponent's arms level. Discuss the behavior of the model. Can you interpret negative values for $x$ and $y$? You may have to introduce a new definition for $x$ and $y$ in place of armament level. Perhaps something like the level of aggressiveness would work. How much faith do you have in the predictions you have made?
The linear form of this model was introduced by Richardson. He showed that it provided a good fit to European data from 1909 to the outbreak of World War I. See L. F. Richardson (1960) or T. L. Saaty (1968, pp. 46-48) for further discussion.

7. Apply the methods of this section to the group dynamics model of Section 3.3.

8. If various chemicals are reacting in a closed system (i.e., nothing can be removed or added), reactions often stop before any of the chemicals are completely exhausted. Can this stable equilibrium be explained simply in terms of the basic model for chemical reactions? [By 'basic model' I mean the mass action model developed below in (b).] Let the various chemicals present be denoted by $X_1, X_2, \ldots$. Suppose that $m_1$ molecules of $X_1$, $m_2$ molecules of $X_2$, and so on, can react to produce $n_1$ molecules of $X_1$ plus $n_2$ molecules of $X_2$, and so on. We assume that the reaction is reversible. This is written in the form

$\sum_i m_i X_i \rightleftharpoons \sum_i n_i X_i$.

In (a) through (c), we assume that this is the only reaction that is occurring.

(a) Let $C_i(t)$ be the concentration of chemical $X_i$ at time $t$. Show that $C_i(t) = C_i(0) + (n_i - m_i)x(t)$, where $x(t)$ is some function independent of $i$. How can $x(t)$ be interpreted?

(b) Suppose that a reaction can occur only if $m_i$ molecules of $X_i$ all collide with one another simultaneously. Conclude that the forward reaction ($\rightarrow$) proceeds at the rate

$k_f \prod_i C_i(t)^{m_i}$,

where $k_f$ is a constant called the rate constant for the forward reaction. Let $k_b$ be the rate constant for the backward reaction. Show that the equation for the reaction is

$x'(t) = k_f \prod_i C_i(t)^{m_i} - k_b \prod_i C_i(t)^{n_i}$.
(c) Conclude that the chemicals are in equilibrium if and only if

$k_f \prod_i C_i(t)^{m_i} = k_b \prod_i C_i(t)^{n_i}$. (20)

Often several reactions occur at once. We want to show that the equilibrium points determined by (20) are locally stable. Only two simultaneous reactions are considered because of the limitations imposed in this section; however, the approach and results hold for any number of reactions.

(d) Repeat the analysis in (a) through (c) assuming that the two reactions are

$\sum_i m_i X_i \rightleftharpoons \sum_i n_i X_i$   and   $\sum_i p_i X_i \rightleftharpoons \sum_i q_i X_i$,

with rate constants $k_f$, $k_b$ and $r_f$, $r_b$ associated with these reactions; introduce $x(t)$ and $y(t)$ so that

$C_i(t) = C_i(0) + (n_i - m_i)x(t) + (q_i - p_i)y(t)$,
$x'(t) = k_f \prod_i C_i(t)^{m_i} - k_b \prod_i C_i(t)^{n_i}$,
$y'(t) = r_f \prod_i C_i(t)^{p_i} - r_b \prod_i C_i(t)^{q_i}$.

(e) Denote the four products, including the rate constants, appearing in the above formulas by $\Pi(m)$, $\Pi(n)$, $\Pi(p)$, and $\Pi(q)$, respectively. Show that equilibrium occurs if and only if $\Pi(m) = \Pi(n)$ and $\Pi(p) = \Pi(q)$.

(f) Write $x' = f(x, y)$ and $y' = g(x, y)$. Show that at equilibrium

$f_x = -\Pi(m) \sum_i \frac{(m_i - n_i)^2}{C_i(t)}$,
$f_y = -\Pi(m) \sum_i \frac{(m_i - n_i)(p_i - q_i)}{C_i(t)}$,
$g_x = -\Pi(p) \sum_i \frac{(m_i - n_i)(p_i - q_i)}{C_i(t)}$,
$g_y = -\Pi(p) \sum_i \frac{(p_i - q_i)^2}{C_i(t)}$.

(g) Show that $f_x + g_y$ is negative. Use the Cauchy-Schwarz inequality

$\left(\sum_i w_i^2\right)\left(\sum_i z_i^2\right) \ge \left(\sum_i w_i z_i\right)^2$

to show that $f_x g_y - f_y g_x \ge 0$.

(h) Discuss the behavior of the reactions near equilibrium.
D. Shear (1967) establishes global stability under fairly general conditions.

9. Apply the methods of this section to the graduate student model in Problem 3.3.2.

10. In this problem you will study models for gonorrhea epidemics. For more material on epidemics see N. T. J. Bailey (1976). Gonorrhea is spread by sexual intercourse, takes 3 to 7 days to incubate, and can be cured by the use of antibiotics. Furthermore, there is no evidence that a person ever develops immunity.

(a) Let $x$ be the fraction of men who are infected and let $f$ be the fraction of men who are promiscuous. Let $X$ and $F$ be the corresponding quantities for women. Discuss the model

$x' = -ax + b(f - x)X$,
$X' = -AX + B(F - X)x$,

where $a$, $b$, $A$, and $B$ are constants. Interpret the constants.

(b) What are the equilibrium points of this model? Which ones are stable? Provide phase plane sketches. You should find that the number $(a/bf)(A/BF)$ is critical. When will there be a continual epidemic?

(c) Interpret and discuss the effects of changes in the frequency of promiscuous intercourse, the fraction of the population (of either sex) that is promiscuous, and the speed of curing infections.

(d) What advice would you give to public health officials who wished to stem a gonorrhea epidemic in an affluent country like the United States? In a place like Hong Kong?

(e) Develop a model like the above for a population of male homosexuals. Such a model may be applicable to diseases not linked to sex, for example, measles and typhoid. See Problem 8.1.3.

(f) Develop a less specific model; for example,

$x' = g(x, X)$   and   $X' = G(x, X)$,

with minimal assumptions about $g$ and $G$.

(g) Can you apply any of the above ideas to diseases that require two hosts? An example is malaria, which is transmitted by mosquitoes.
9.3. DIFFERENTIAL DIFFERENCE EQUATIONS
We now briefly consider equations involving both derivatives and time lags. As in the previous section, we expand the equation(s) around an equilibrium point to obtain a homogeneous linear approximation. These approximations can be studied with Laplace transforms. We describe an alternative approach which is equivalent to this but does not require a knowledge of Laplace transforms. For simplicity assume that there is only one equation in one endogenous variable. Write Taylor's theorem in the form

$f(t + \tau) = \sum_{n=0}^{\infty} \frac{(\tau D)^n}{n!} f(t) = e^{\tau D} f(t)$, (21)

where $D$ stands for $d/dt$. We could use this, for example, to rewrite $f'(t) = b f(t - 1) - m f(t)$ as $(D - b e^{-D} + m) f(t) = 0$. In this way any homogeneous linear differential difference equation can be replaced by an infinite order differential equation $L(D) f(t) = 0$, where the function $L$ is a polynomial in $D$ and $e^{\tau D}$ for various values of $\tau$. If the equation were of finite order, the general solution would be a linear combination of solutions of the form $t^n e^{rt}$, where $r$ is a root of $L(r) = 0$ of multiplicity greater than $n$. The stability of the equation could then be determined by looking at the roots of $L(r) = 0$. (Section 9.2 dealt with quadratic $L$, because eliminating $v$ and $v'$ from (4) leads to one second order differential equation.) This method also works for the infinite order equation. Since $L(r) = 0$ is a transcendental equation, studying its roots is often very difficult. A computer may be essential. There are usually an infinite number of roots, so it is fairly likely that at least one will have a positive real part. Hence local instability is common.

I can't resist the side remark that (21) can also be used to derive numerical integration and differentiation formulas. For examples see L. P. Ford (1955, Ch. 8).
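Since the text notes that a computer may be essential for locating roots of $L(r) = 0$, here is a minimal sketch for the example $f'(t) = b f(t - 1) - m f(t)$, whose characteristic function is $L(r) = r - b e^{-r} + m$. It applies Newton's method in complex arithmetic. The helper names and the parameter values $b = 1$, $m = 2$ are hypothetical choices for illustration. Newton's method finds one root per starting guess, so a full stability check would have to search the complex plane for the other roots as well.

```python
import cmath


def char_fn(r, b, m):
    """Characteristic function L(r) = r - b*exp(-r) + m."""
    return r - b * cmath.exp(-r) + m


def char_fn_prime(r, b, m):
    """Derivative L'(r) = 1 + b*exp(-r)."""
    return 1 + b * cmath.exp(-r)


def newton_root(r0, b, m, tol=1e-12, max_iter=100):
    """Newton iteration for a root of L(r) = 0 starting from r0."""
    r = complex(r0)
    for _ in range(max_iter):
        step = char_fn(r, b, m) / char_fn_prime(r, b, m)
        r -= step
        if abs(step) < tol:
            return r
    raise RuntimeError("Newton iteration did not converge")


# Hypothetical parameter values; the real starting guess finds the real root.
b, m = 1.0, 2.0
root = newton_root(0.0, b, m)
print(f"root = {root.real:.6f}, residual = {abs(char_fn(root, b, m)):.2e}")
print("contributes a decaying term" if root.real < 0 else "contributes a growing term")
```

Starting guesses with nonzero imaginary part can be passed to `newton_root` in the same way to hunt for complex roots, although convergence from an arbitrary guess is not guaranteed.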
The Dynamics of Car Following

Traffic flow has become the subject of mathematical modeling in recent years. Three authors who discuss it are W. D. Ashton (1966), F. A. Haight (1963), and L. J. Pignataro (1973). Sometimes cars are considered individually, and systems of equations or probabilistic models are developed. At other times traffic is treated as a fluid, and hydrodynamic techniques are used. Among the topics considered in traffic flow are the motion of traffic on the open road, bottlenecks, and effects of intersections.
How do drivers in a line of cars behave? There is a limit to how fast a driver can react, but too much delay in reacting causes collisions. Are the delays in drivers' reactions near the danger level? The model is adapted from R. Herman et al. (1959) and R. E. Chandler et al. (1958), which use the Laplace transform method. That approach is adapted for use as a student project by E. A. Bender and L. P. Neuwirth (1973). Related material appears in the first part of J. Almond (1965).

The driver of a car cannot directly control the speed of the vehicle. Instead, he or she controls its acceleration. Thus we expect to derive a formula for the acceleration as a function of the driver's sensitivity and the stimulus of the environment. Historically the model has been taken to be of the form

Acceleration = Sensitivity $\times$ Stimulus. (22)

Since we have not defined what we mean by either 'sensitivity' or 'stimulus,' the above formula has no content. Rather than attempt to give meaning to the terms 'sensitivity' and 'stimulus,' we consider directly the physical factors that enter into the driver's reaction. The driver's reaction (acceleration) depends on what he or she senses in the environment. The things that can be perceived most easily are the car's speed, its speed relative to other cars in the line, and the space between the car and adjacent cars. As an approximation, we suppose that the only relevant car is the one directly ahead of the driver. If $x_n$ denotes the position of the $n$th car, we can write

Acceleration $= x_n'' = f(x_n', x_{n-1}' - x_n', x_{n-1} - x_n)$. (23)

In order to proceed it is necessary to say something about the nature of $f$. Experimentation seems to indicate that the most important factor is the relative velocity. To begin with we construct the simplest possible model using this: Acceleration is directly proportional to the relative velocity. There is a delay, called the reaction time, between a change in the environment and the driver's response. It has been observed to be of the same order of magnitude as the time it takes the vehicle to cover the distance between it and the car ahead. Hence we expect the reaction time to be an important variable. To check this we compare the resulting model with one lacking a reaction time. Let $\tau_n$ be the reaction time of the $n$th driver. The above discussion leads to the basic equation

$x_n''(t + \tau_n) = \lambda_n [x_{n-1}'(t) - x_n'(t)]$, (24)

where $\lambda_n$ is a constant measuring the strength of the $n$th driver's response. Chandler et al. (1958) conducted an experiment on the General Motors
test track in which one driver followed another at what was considered to be a minimum safe distance. Equation 24 gave a good fit for most of the drivers when statistical methods were used to estimate Ii and T . These parameter estimates are given in Table
1.
Ta b l e 1
Driver Reaction Parameters . A
T
D river Number
2 3 4 5 6 7 8
(seconds)
(sec - I )
rA-
1 .4 l .0 1.5 1.5 1 .7 1.1 2.2 2.0
0 . 74 0 . 44 0 . 34 0 . 32 0.38 0. 1 7 0.32 0.23
l . 04 0 . 44 0.51 0.48 0.65 0.19 0 . 70 0.46
Source : Chandler et a l . ( 1 958).
We can rewrite (24) in operator notation as

$$\left(1 + \frac{D e^{\tau_n D}}{\lambda_n}\right) v_n = v_{n-1},$$

where $v = x'$ is the velocity of the car. Using the subscript 0 to denote the lead car, we have

$$\left[\prod_{k=1}^{n} \left(1 + \frac{D e^{\tau_k D}}{\lambda_k}\right)\right] v_n = v_0. \tag{25}$$

We get our stability information from this equation. To apply local stability theory we assume that $v_0(t)$ is given and that a stable particular solution $v_{np}(t)$ exists and has been determined. The existence of such a solution is a global problem. Local stability theory can only tell us whether a driver's behavior stabilizes or becomes wilder when he deviates slightly from $v_{np}(t)$. The general solution of the linear equation (25) is the particular solution plus the general solution of the homogeneous equation. To study the homogeneous equation, we must find the roots of

$$\prod_{k=1}^{n} \left(1 + \frac{r e^{\tau_k r}}{\lambda_k}\right) = 0. \tag{26}$$
The following fact is proved by Herman et al. (1959).

The roots of $z e^z + C = 0$ all have negative real parts if and only if $0 < C < \pi/2$. If in addition $0 < C \le 1/e$, the root with the largest real part is real.

Setting $z = \tau_k r$ and $C = \tau_k \lambda_k$ transforms each factor of (26) to the form $z e^z + C = 0$. This proves that the motion of the nth car is stable if and only if $\tau_k \lambda_k < \pi/2 \approx 1.57$ for $1 \le k \le n$. For a long time interval, the root with the largest real part contributes the dominant exponential term to the solution of (24). Hence the oscillatory part is highly damped if $\tau_k \lambda_k \le 1/e \approx 0.368$ for $1 \le k \le n$. All the drivers in Table 1 satisfy the stability criterion, but only one of them satisfies $\tau_k \lambda_k \le 0.368$.

From the preceding discussion, we see that a slight change in speed propagates down the line of cars, traveling from one car to the next after $\tau_k$ seconds. This can be viewed as a wave moving down the line of cars. From this point of view we can ask another question related to stability: What happens to the amplitude of this wave as it propagates down the line of cars? Each car individually may be stable, but the wave may increase in amplitude as it moves, thus leading to instability. A fluid dynamics model predicts the formation of a shock wave of acceleration or deceleration which either dies out or builds up to a maximum amplitude as it moves along the line of cars. We do not deal with this here. For a discussion see any of the books mentioned above. See also Problem 1 for a discussion of this stability question.

Let us compare the results involving time delay with a model in which reactions are instantaneous; that is, $\tau_k = 0$. Equation 26 reduces to

$$\prod_{k=1}^{n} \left(1 + \frac{r}{\lambda_k}\right) = 0.$$

The roots of this are $r = -\lambda_k$. Thus the model without time lags is always stable and nonoscillatory. As the minimum $\lambda_k$ increases, the roots become more negative. This increases stability, because the general solution is a linear combination of terms like $t^m e^{-\lambda_k t}$. The situation in the time delay model is just the reverse of this: Stability tends to decrease as $\lambda_k$ increases. Time delays are obviously important.

Equation (24) is a rather severe specialization of (23). Let us consider (23) and see how much we have to specialize it to obtain reasonable results. We assume there are only two cars and that the lead car has a constant velocity $v_0$. For simplicity we drop the subscript 1.
Because we have considered only stability of autonomous systems, we must eliminate the explicit time dependence of the positions. Let us adopt the convention that distance is measured from $x_0(0)$. Instead of the absolute position $x$ of the second car, we consider the separation between the two cars. It is given by $s = t v_0 - x$. Since $x' = v_0 - s'$ and $x'' = -s''$, Equation (23), with the reaction time $\tau$ included, becomes

$$-s''(t + \tau) = f[-s'(t) + v_0,\; s'(t),\; s(t)].$$

At the equilibrium separation $s_e$ we have $s' = 0$. Hence $f(v_0, 0, s_e) = 0$. Suppose that this equation has a unique root. Expanding $f$ about this point and neglecting terms beyond the linear ones, we have

$$D^2 e^{\tau D} u \approx (f_1 - f_2) Du - f_3 u,$$

where $u = s - s_e$ and the partial derivatives $f_1$, $f_2$, and $f_3$ are evaluated at $(v_0, 0, s_e)$. Hence we must look at the roots of

$$r^2 e^{\tau r} = (f_1 - f_2) r - f_3.$$

This can be rewritten as

$$e^z = \frac{\alpha z + \beta}{z^2}, \tag{27}$$

where $\alpha = \tau (f_1 - f_2)$, $\beta = -\tau^2 f_3$, and $z = \tau r$.
Since experiments indicate that the dominant effect is due to relative velocity, it is reasonable to suppose that $\alpha$ is negative (not positive, since $s$ measures separation). If the velocities are held fixed and the separation is increased, we can expect the driver to accelerate to close the gap. Hence, $\beta$ will be negative.

The study of this model cannot be completed, because we do not know what the roots of (27) are. If we want to proceed further, we should first try to study (27) analytically. If this fails, we can turn to numerical study using a computer. In view of the data in Table 1, it is reasonable to carry out such calculations with $\alpha$ near $-\tfrac{1}{2}$. Since the effect of $\beta$ is probably less than the effect of $\alpha$, it is reasonable to take $\beta$ to be nearer to zero than $\alpha$.

Although (27) seems to be a general study of the problem, it has a severe limitation: We assumed that $f$ is differentiable. If a model builder is not careful, he can easily let this sort of assumption slip by, since most functions are in some sense 'well behaved.' Our assumption that $f$ is differentiable near equilibrium implies that, except for sign, a driver responds in the same manner to a small negative relative velocity as he does to a small positive relative velocity. Actually, the acceleration and deceleration responses may be quite different because of the driver's psychology, the design of the vehicle, or both. This has been studied by G. F. Newell (1962). If the main difference is in reaction time, the waves of acceleration and deceleration will move down the line at different speeds. It seems reasonable that deceleration will move faster. If the lead car first accelerates and then decelerates to its original speed, the two waves will eventually cancel each other out. However, if the deceleration occurs first, the acceleration wave will lag further and further behind the deceleration wave. This may provide an explanation for some of the mysterious slowdowns that occur on freeways.
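The stability thresholds for the delay model (24) can be explored numerically. The following Python sketch is only an illustration under stated assumptions: it treats a single follower behind a lead car at constant speed, so that the velocity deviation $u$ satisfies $u'(t) = -\lambda u(t - \tau)$, it assumes $u = 1$ for $t \le 0$, and it uses a crude Euler step. The function names `classify`, `simulate`, and `late_amplitude` are hypothetical and do not come from the cited papers.

```python
import math

# (tau in seconds, lambda in 1/sec) for the eight drivers of Table 1.
TABLE_1 = [(1.4, 0.74), (1.0, 0.44), (1.5, 0.34), (1.5, 0.32),
           (1.7, 0.38), (1.1, 0.17), (2.2, 0.32), (2.0, 0.23)]


def classify(tau, lam):
    """Classify a driver by C = tau * lam using the fact about z*e^z + C = 0."""
    c = tau * lam
    if c <= 1 / math.e:
        return "stable, dominant root real"
    if c < math.pi / 2:
        return "stable, oscillatory"
    return "unstable"


def simulate(tau, lam, t_end=60.0, dt=0.01):
    """Euler integration of u'(t) = -lam * u(t - tau), assuming u = 1 for t <= 0.

    Returns the values of u on the grid t = 0, dt, 2*dt, ...
    """
    lag = round(tau / dt)          # delay expressed in grid steps
    steps = round(t_end / dt)
    u = [1.0]
    for i in range(steps):
        delayed = u[i - lag] if i >= lag else 1.0
        u.append(u[i] - dt * lam * delayed)
    return u


def late_amplitude(u, fraction=0.2):
    """Largest |u| over the final part of the run."""
    tail = u[int(len(u) * (1 - fraction)):]
    return max(abs(x) for x in tail)


if __name__ == "__main__":
    for number, (tau, lam) in enumerate(TABLE_1, start=1):
        print(number, round(tau * lam, 2), classify(tau, lam))
    for lam in (1.0, 2.0):         # tau = 1, so C = 1.0 and C = 2.0
        print(lam, late_amplitude(simulate(1.0, lam)))
```

With $\tau = 1$, the run with $\lambda = 1$ oscillates and dies out, while the run with $\lambda = 2$ (beyond $\pi/2$) grows, in line with the criterion above.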
PROBLEMS

1. We want to study the amplitude of a disturbance as it moves along a line of cars. For simplicity we assume that the acceleration of the first car is proportional to $\sin(\omega t)$. This is a mathematically convenient assumption which provides nonzero acceleration with no net change in velocity. It is not as restrictive as it appears at first, because we can expand $v_0(t)$ in a Fourier series and, by linearity, add the solutions obtained for each term separately.

(a) Use (25) and the fact that $v_0(t) - v_0(0)$ is the real part of $A e^{i\omega t}$ to show that $v_n(t) - v_0(0)$ is the real part of

$$A e^{i\omega t} \prod_{k=1}^{n} \left(1 + \frac{i\omega e^{i\omega \tau_k}}{\lambda_k}\right)^{-1}.$$

Hint: Induct on n.

(b) Suppose that all drivers are the same, so that $\tau_k = \tau$ and $\lambda_k = \lambda$ for all k. Deduce that the amplitude of the disturbance decreases as n increases if and only if

$$\left|1 + \frac{i\omega e^{i\omega\tau}}{\lambda}\right| > 1.$$

(c) Show that the above holds for all $\omega$ if and only if it holds as $\omega \to 0$, and that this yields the condition $1/(2\tau\lambda) > 1$.

(d) Discuss this result in connection with Table 1.

2. Experiments indicate that, when the separation of the vehicles varies greatly, a more accurate model is provided by replacing $\lambda_n$ in (24) by $\mu_n/[x_n(t) - x_{n-1}(t)]$, where $\mu_n$ is a constant. Discuss the local stability of this model.

The following problems are phrased rather generally. Be as specific as you must to obtain results about stability, but try to avoid unnecessary assumptions.

3. (a) Model the growth of a single population. Allow a time delay due to the need to mature before being able to reproduce.

(b) Same as in (a), but this time allow a time delay due to identical life spans for all members of the population. What about accidental death?

(c) Combine (a) and (b) if possible.

(d) Consider a herbivore model with a time delay built in to allow for plant recovery and perhaps delay(s) associated with the herbivore life cycle, as in (a) and (b).

4. Discuss the problem of controlling the temperature in a room as a function of how long it takes the heating unit to respond to the thermostat. For example, forced air heaters respond quickly, while steam radiators take a fairly long time.
9.4. COMMENTS ON GLOBAL METHODS

As already remarked, I consider this topic very briefly. I hope that you will get the flavor of the subject from this short discussion so that you will have some idea of the sort of problems these tools can attack.

In the physical sciences, conservation laws play an important role. A conservation law can be associated with some systems of differential equations by introducing a quantity whose time derivative is zero. For example, if $x'' = f(x)$, define

$$E(t) = \frac{(x')^2}{2} - \int_0^x f(u)\,du. \tag{28}$$

Then $dE/dt = [x'' - f(x)]x' = 0$, and so $E(t)$ is constant. In other words, if the force acting on an object depends only on the position of the object, we can define an energy $E$ which is conserved. Let $f$ be a restoring force; that is, $f(x)$ and $x$ have opposite signs. Since

$$\frac{(x')^2}{2} \ge 0 \quad\text{and}\quad -\int_0^x f(u)\,du \ge 0,$$

it follows from (28) that both of these are bounded by $E$ and that $E \ge 0$. Thus the speed $|x'|$ is bounded. If the integrals $\int_0^{+\infty} f(u)\,du$ and $\int_0^{-\infty} f(u)\,du$ are infinite, the position $x$ is also bounded.
The pendulum equation (6) has the form $\theta'' = f(\theta) - h(\omega)$, where $\omega = \theta'$, $h$ is a frictional force, and $f$ is a restoring force provided $-\pi < \theta < \pi$. The sign of $h(\omega)$ is the same as the sign of $\omega$. Consider $E(t)$ defined by (28) with $x = \theta$. By the previous paragraph, we have $E(t) \ge 0$. However, $E(t)$ is not constant, since

$$E'(t) = [\theta'' - f(\theta)]\,\omega = -h(\omega)\,\omega \le 0,$$

with equality if and only if $\omega = 0$. Thus $E(t)$ decreases toward 0 as $t \to \infty$.
Mathematically we say that there is global stability. Physically we say that energy loss due to friction causes the pendulum to slow down. F. Brauer (1972) discusses the motion of a pendulum when time is allowed to enter the differential equation explicitly.

The van der Pol equation,

$$u'' + \mu(u^2 - 1)u' + u = 0, \qquad \mu > 0, \tag{29}$$

is one of the classic limit cycle problems. It arises in the study of sustained nonlinear oscillations in vacuum tubes. You should verify that the only equilibrium point is $u = 0$ and that it is unstable and oscillatory. If we approximate $\sin\theta$ by $\theta$ in the damped pendulum model in Section 9.2, it will look very much like the van der Pol equation, but $r(\theta')$ will be replaced by the term $\mu(u^2 - 1)u'$. Intuitively, if $u^2 > 1$, this term will act like a frictional force and cause damping, while if $u^2 < 1$, it will act to increase $|u'|$. Consequently $u(t)$ approaches a limit cycle. Diagrams of the limit cycle for various values of $\mu$ are given by W. E. Boyce and R. C. DiPrima (1969, p. 418).

The Poincaré-Bendixson theorem can be used to prove the intuitive result of the last paragraph. It can be stated as follows.
THEOREM. If there is a bounded region $D$ in the x-y plane such that any solution to the system

$$x' = f(x, y) \quad\text{and}\quad y' = g(x, y)$$

that starts in the region remains in it, the region contains either a stable equilibrium point or a limit cycle.
Warning: This theorem does not generalize to three dimensions. To apply the theorem to (29), set $x = u$, $y = u'$, $f(x, y) = y$, and $g(x, y) = -x - \mu(x^2 - 1)y$. Determining a region $D$ that satisfies the theorem is not easy. You may wish to try it.

The study of global stability and limit cycles is more relevant in the life and social sciences than the study of conserved quantities. Although limit cycles are fairly common in models having nonlinear equations, they cannot occur if all the equations are linear. Hence, extreme caution should be used in modeling an essentially nonlinear phenomenon by means of a linear approximation. This is fine for studying local behavior, but it is a dangerous practice if global results are desired.

For some biological applications of global methods see J. Cronin (1977), R. M. May (1973) and T. Pavlidis (1973). For some economic applications see G. Gandolfo (1971, pp. 375-385, 421-465). Although mathematical psychology seems to be a fertile field for such methods, I am not aware of any such applications.

PROBLEM

1. We return to the predator-prey model in Section 9.2. See the discussion there. We do not wish to assume all of (13). Which do you think are the weakest assumptions? Set up some reasonable conditions to ensure that for some point $(x^*, y^*)$ on $r(x, y) = 0$ the region

$$D = \{(x, y) \mid 0 \le x \le x^* \text{ and } 0 \le y \le y^*\}$$

satisfies the Poincaré-Bendixson theorem. Draw conclusions from this.
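The limit cycle of the van der Pol equation (29) discussed in this section can be seen numerically. The sketch below is an illustration, not part of the original treatment: it assumes the substitution $x = u$, $y = u'$, integrates with a hand-written classical fourth-order Runge-Kutta step, and reports the largest $|u|$ over the last quarter of the run. The function name `van_der_pol_amplitude` and the default step size are hypothetical choices.

```python
def van_der_pol_amplitude(mu, u0, t_end=100.0, dt=0.01):
    """Integrate u'' + mu*(u^2 - 1)*u' + u = 0 from u(0) = u0, u'(0) = 0.

    Returns the largest |u| seen during the final quarter of the run,
    which approximates the amplitude of the limit cycle once transients die out.
    """
    def deriv(x, y):
        # x' = y,  y' = -x - mu*(x^2 - 1)*y
        return y, -x - mu * (x * x - 1.0) * y

    x, y = u0, 0.0
    steps = round(t_end / dt)
    peak = 0.0
    for i in range(steps):
        k1x, k1y = deriv(x, y)
        k2x, k2y = deriv(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
        k3x, k3y = deriv(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
        k4x, k4y = deriv(x + dt * k3x, y + dt * k3y)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        y += dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        if i >= (3 * steps) // 4:
            peak = max(peak, abs(x))
    return peak


if __name__ == "__main__":
    # One start inside the cycle and one outside it, both with mu = 1.
    print(van_der_pol_amplitude(1.0, 0.1))
    print(van_der_pol_amplitude(1.0, 4.0))
```

Starting inside or outside the cycle, the amplitude settles near the same value, which is what the intuitive argument about damping for $u^2 > 1$ and growth for $u^2 < 1$ suggests.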
CHAPTER 10

STOCHASTIC MODELS

You may wish to refer to the Appendix since it contains a summary of the probabilistic concepts used in this chapter. Unlike most of the earlier material, this discussion definitely requires a bit more background than 2 years of college mathematics. However, I couldn't resist the temptation to add these models, and I think they can be read with profit even if you don't fully understand the mathematics.

Radioactive Decay
The basic premise of the elementary theory of radioactive decay is that atoms have no 'memory'; that is, the probability that an atom will decay during a given time interval depends only on the length of the interval and the number of neutrons and protons in the atom. In some situations, such as a chain reaction, an atom changes by absorbing a particle given off by another atom. When this doesn't happen, the decay of one atom does not affect the surrounding atoms. We consider only this case. It follows that the average rate of decay at time $t$ is proportional to the total number $N(t)$ of undecayed atoms remaining. When $N(t)$ is large, it is reasonable to expect that most radioactive samples behave pretty much like the average. This leads to the deterministic model $N'(t) = -rN(t)$, where $r$ is the rate of decay. The solution to this equation is

$$N(t) = N_0 e^{-rt}. \tag{1}$$

This is fine as an approximation when the number of atoms is large, but when $N(t)$ is small, the predictions of (1) are nonsense. For example, if $N_0 = 5$, when $t = 2/r$ we have $N(t) = 5/e^2 \approx \tfrac{2}{3}$. Two-thirds of an atom is nonsense. Can we construct a model that doesn't yield such nonsense?

Consider a single atom. Let $T$ be a random variable equal to the length of time we must wait for the atom to decay. The basic assumption that an atom has no memory means that, if we have waited $x$ minutes and the atom has not decayed, our estimate of how much longer we must wait is the same as if we had just started to observe. In mathematical language this can be written
$$\Pr\{T \ge t + x \mid T \ge x\} = \Pr\{T \ge t\}.$$

If $G = 1 - F$, where $F$ is the distribution function for $T$, we can rewrite this as $G(t + x)/G(x) = G(t)$. In other words, $G(t + x) = G(x)G(t)$. It is well known that this implies that $G(t)$ is $e^{-\lambda t}$ for some $\lambda > 0$. We prove this under the assumption that $F$ is differentiable at 0. The derivative of $G(t)$ is

$$G'(t) = \lim_{x \to 0} \frac{G(t + x) - G(t)}{x} = G(t)G'(0),$$

since $G(t + x) = G(x)G(t)$ and $G(0) = 1 - F(0) = 1$. With $\lambda = -G'(0)$, we obtain the desired result. The distribution function $F(t) = 1 - e^{-\lambda t}$ is called the exponential distribution and is associated with 'memoryless' situations.

The probability that an atom has not decayed by time $t$ is just $1 - F(t) = G(t)$, which is (1) with $N_0 = 1$. This is not surprising. Since $G(t)$ is the probability that any given atom has not decayed by time $t$, $N_0 G(t)$ is the expected number of undecayed atoms at time $t$. Thus $\lambda$ is the decay rate, and (1) is just
the average path of the decay process.

How closely is the average path followed? Associate with the ith atom a random variable $Y_i = Y_i(t)$ which is 1 if the atom is undecayed at time $t$ and 0 otherwise. Then $\Pr\{Y_i = 1\} = G(t)$. The $Y_i$ are independent by our assumption that the decay rate of an atom is independent of its surroundings. Hence the random variable

$$Y = Y_1 + Y_2 + \cdots + Y_{N_0}$$

has mean $\mu$ and variance $\sigma^2$ where

$$\mu = \mu_1 + \mu_2 + \cdots = N_0 G(t) = N_0 e^{-\lambda t}, \qquad \sigma^2 = \sigma_1^2 + \sigma_2^2 + \cdots = N_0 G(t)[1 - G(t)]. \tag{2}$$

Since $\sigma$ provides a measure of typical deviation from the mean, $\sigma/\mu$ gives a measure of the typical percentage error involved in using (1). It is known as the coefficient of variation. By (2),

$$\frac{\sigma}{\mu} = \sqrt{\frac{1 - G(t)}{N_0 G(t)}},$$

which is small provided $N_0 G(t)$, the expected number of undecayed atoms at time $t$, is large. A gram of matter contains more than $10^{21}$ atoms, so (1) is usually a very good approximation.
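The comparison between the average path (1) and the random behavior of a small sample can be tried on a computer. The Python sketch below is an illustration only: the function names are hypothetical, the decay rate is written `lam`, and each atom's lifetime is drawn from the exponential distribution with the standard library's `random.expovariate`.

```python
import math
import random


def expected_undecayed(n0, lam, t):
    """Average path (1): expected number of undecayed atoms at time t."""
    return n0 * math.exp(-lam * t)


def coefficient_of_variation(n0, lam, t):
    """sigma/mu from (2), with G(t) = exp(-lam * t)."""
    g = math.exp(-lam * t)
    return math.sqrt((1.0 - g) / (n0 * g))


def simulate_undecayed(n0, lam, t, trials, seed=0):
    """Count, in each trial, how many of n0 atoms have lifetime greater than t."""
    rng = random.Random(seed)
    counts = []
    for _ in range(trials):
        counts.append(sum(1 for _ in range(n0) if rng.expovariate(lam) > t))
    return counts


if __name__ == "__main__":
    n0, lam, t = 5, 1.0, 2.0            # the example N0 = 5, t = 2/r
    counts = simulate_undecayed(n0, lam, t, trials=20000)
    print("average path:", expected_undecayed(n0, lam, t))
    print("simulated mean:", sum(counts) / len(counts))
    print("coefficient of variation:", coefficient_of_variation(n0, lam, t))
```

Every simulated count is a whole number of atoms, while their average stays close to $5/e^2$. For five atoms the coefficient of variation exceeds 1, so individual samples stray far from the average path.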
There are some cases in which the coefficient of variation $\sigma/\mu$ may be significant. When a new radioactive isotope is produced in a particle accelerator, the number of atoms $N_0$ may be relatively small. This causes problems in estimating $\lambda$. In population biology, growth models like (1) are used. The population size may sometimes be sufficiently small for random fluctuations to be important.

Is it possible to obtain an exponential decay curve when we have a mixture of atoms with different decay rates? The answer to this question is no. Suppose we start out with a mixture of things that are decaying exponentially at various rates $\lambda$. If $F(l)$ is the fraction of the original mixture with $\lambda \le l$, the expected amount of the mixture undecayed at time $t$ is

$$\mu(t) = \int e^{-\lambda t}\,dF(\lambda),$$

which can be shown to have the form $e^{-kt}$ if and only if $F(l)$ equals 0 for $l < k$ and 1 for $l \ge k$. You may wish to try it.
Optimal Facility Location

Suppose you are faced with the problem of finding the best locations for certain facilities. To be specific and simple, consider fire stations in a large, uniform city with rectangular blocks. How can you measure the relative merit of a siting plan? How can you find the one that is best or close to best? This model is adapted from R. C. Larson and K. A. Stevenson (1972).

Suppose $t$ is the travel time between a station and a fire. We assume that, as $t$ increases, the situation deteriorates. Thus if siting plan A locates stations so that every point can be reached at least as quickly as in siting plan B, then A is at least as good as B. What happens if some points take longer to reach under A and others take longer under B? Various possibilities exist; for example, we could compare the average travel times or the average of the square of the travel times. We assume there exists some function $u(t)$, called the utility, and that we want to maximize the average of $u(t)$ over all points in the city. [Utility theory is discussed in many books. I recommend R. D. Luce and H. Raiffa (1958). For the two cases just mentioned we could take $u(t) = -t$ and $u(t) = -t^2$.] What do you think of this assumption? If you don't like it, can you suggest a useful alternative? Since we're going to assume $u(t)$, think about the question of what $u(t)$ should be.

By the assumption of uniformity of the city, $t$ is roughly a function of travel distance $s$. (It is only roughly a function, because turning corners may slow the trucks down.) Let's write $f(s) = -u(t)$. Then we wish to minimize the average value of $f$.
Let's assume that the city streets form a rectangular grid and set up coordinate axes parallel to the streets. The travel distance between $(x_1, y_1)$ and $(x_2, y_2)$ is

$$|x_1 - x_2| + |y_1 - y_2|.$$

Prove this. Suppose that there are $n$ stations and that the city area is $nA$. The optimal solution is to divide the city into $n$ equal diamonds with a station at the center of each. Of course, the geometry of the city may prevent this, in which case the best siting won't be as good as the estimate we're working out. If the area of each of the diamonds is $A$, the region of such a diamond is given by

$$D = \{(x, y) : |x| + |y| \le \sqrt{A/2}\},$$

and the average value of $f$ is given by

$$A^{-1} \iint_D f(|x| + |y|)\,dx\,dy = A^{-1} \int_0^{\sqrt{A/2}} f(s)\,4s\,ds. \tag{3}$$
Now suppose that the stations are distributed at random. Since we should easily be able to do better than random, this gives us an upper bound on what the average value of $f$ is. If this is close to (3), we can conclude that laborious attempts at optimization will be practically useless; but if it differs considerably from (3), we can conclude that care needs to be taken in siting the stations. We must compute the expected value of the average value of $f$.

What is the probability that the distance between a random point in the city and the nearest station is at most $s$? This is the same as the probability that a station will lie in the diamond-shaped region of area $\alpha = 2s^2$ surrounding the point. If a station is placed at random, it will lie outside a region of area $\alpha$ with probability $1 - \alpha/nA$, since the total area of the city is $nA$. Thus the probability that no station will lie in the region is

$$\left(1 - \frac{\alpha}{nA}\right)^n \approx e^{-\alpha/A}.$$

It follows that the probability that the closest station will lie at a distance between $s$ and $s + ds$ is

$$\frac{d[1 - (1 - \alpha/nA)^n]}{ds}\,ds \approx \frac{d(1 - e^{-\alpha/A})}{ds}\,ds = \frac{4s}{A}\,e^{-2s^2/A}\,ds,$$

since $\alpha = 2s^2$. We should average $f(s)$ times the probability over the entire area of the city. This leads to an integral. Unfortunately, the approximation we have just given is poor when $\alpha$ is a significant fraction of the total area of the city. If $f(s)$ does not grow exponentially with $s$, it will not matter because the integrand will be small. The analog to (3) is approximately

$$A^{-1} \int_0^\infty f(s)\,e^{-2s^2/A}\,4s\,ds, \tag{4}$$

provided the integrand becomes insignificant when $2s^2$ approaches the total area of the city. This condition is satisfied for the $f$ we consider, provided $n$ is greater than about 5. (You should check this out when we are studying a particular $f$.) A partial check on our mathematics to date is provided by the fact that (3) and (4) both have the value $c$ when $f(s) = c$.
Suppose we wish to minimize the average travel time. We set $f(s) = vs$, where $v^{-1}$ is velocity. [Actually, travel time grows slower than linearly with $s$ over much of the range of $s$ for fire engines in New York City (P. Kolesar, 1975; P. Kolesar et al., 1975).] Since the travel distances for random siting tend to be longer than for the best siting, it follows that the ratio between random siting and best siting travel averages will be less than what we obtain. From (3) we have

$$A^{-1} \int_0^{\sqrt{A/2}} 4vs^2\,ds = \frac{v\sqrt{2A}}{3}, \tag{5}$$

and from (4),

$$A^{-1} \int_0^\infty e^{-2s^2/A}\,4vs^2\,ds = v \int_0^\infty e^{-2s^2/A}\,ds = v\sqrt{\frac{\pi A}{8}}, \tag{6}$$

where the first equality is due to integration by parts and the second to the formula

$$\int_0^\infty e^{-\alpha x^2}\,dx = \frac{1}{2}\sqrt{\frac{\pi}{\alpha}}.$$

The ratio of (6) to (5) is $3\sqrt{\pi}/4 = 1.329$; that is, a random siting is about one-third worse than the best possible siting.

We can try other functions for $f(s)$. By the discussion of the forest fire model in Section 4.1, it seems reasonable to assume that $f$ is a quadratic function of time with nonnegative coefficients. Hence $f(s) = as^2 + bs + c$ with $a, b, c \ge 0$ gives an upper bound. We obtain from (3) and (4), respectively,

$$\frac{aA}{4} + \frac{b\sqrt{2A}}{3} + c \quad\text{and}\quad \frac{aA}{2} + \frac{b\sqrt{2\pi A}}{4} + c.$$
You should be able to fill in the details. The largest value of the ratio occurs when $b = c = 0$. The ratio then equals 2. Thus careful siting is more significant for a quadratic $f$ than for a linear $f$.

The above results suggest that the siting of fire stations cannot be improved very much over a quick commonsense siting. Since (3) provides a lower bound on what can be achieved, any given siting can always be checked against the ideal fairly easily. Perhaps you have already raised the objection that for something like fire fighting any improvement in siting is important. I agree, but remember that we are using a model based on an idealized city, so our results are only approximate. Hence the best siting for an idealized city is probably not the best siting for a real city. We can expect the two to be close but, if two site plans I and II are such that I is a bit better than II in the ideal city, it may well be that II is a bit better than I in the real city.

You may wish to think about this a bit more: How can the model be made more realistic? What data should you collect to help decide where fire stations should be located in a real city? How would you go about determining sites? How is this affected by the fact that many fire stations already exist?
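The averages (3) and (4) are easy to check numerically for any particular $f$. The sketch below is a hedged illustration: the helper names `integrate`, `best_average`, and `random_average` are hypothetical, the integrals are evaluated with a composite Simpson rule, and the infinite upper limit in (4) is replaced by an assumed cutoff of $12\sqrt{A}$, beyond which the factor $e^{-2s^2/A}$ is negligible for polynomial $f$.

```python
import math


def integrate(g, lo, hi, n=20000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3


def best_average(f, area):
    """Average of f for the ideal diamond siting, formula (3)."""
    return integrate(lambda s: f(s) * 4 * s, 0.0, math.sqrt(area / 2)) / area


def random_average(f, area, cutoff=12.0):
    """Approximate average of f for random siting, formula (4), truncated."""
    upper = cutoff * math.sqrt(area)
    return integrate(
        lambda s: f(s) * math.exp(-2 * s * s / area) * 4 * s, 0.0, upper
    ) / area


if __name__ == "__main__":
    A = 1.0
    linear = lambda s: s          # f(s) = v*s with v = 1
    quadratic = lambda s: s * s   # f(s) = a*s^2 with a = 1, b = c = 0
    print(random_average(linear, A) / best_average(linear, A))
    print(random_average(quadratic, A) / best_average(quadratic, A))
```

The first ratio reproduces $3\sqrt{\pi}/4$ and the second reproduces the value 2 quoted for the purely quadratic case. Other choices of $a$, $b$, and $c$ can be substituted to see how the ratio varies between these extremes.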
Distribution of Particle Sizes

If you observe the size distribution of particles in clay, material ground in a mill, or pebbles on a beach, you will probably notice that the distributions tend to have a similar shape. This suggests the existence of a common underlying principle. I would like to know what it is, so I'll make a proposal, model it, and test the model against the data. A successful model won't prove my proposal, but at least it will make it seem more likely.

It seems reasonable to suppose that particle size has been determined by a large number of small random events. Because of the central limit theorem, it is natural to look for a normal distribution. Unfortunately, the distribution of particle sizes tends to be skewed and so cannot be normal. Two main distribution laws have been proposed:

Log normal law:
$$F(x) \propto \int_{-\infty}^{\log x} e^{-(t - \mu)^2/2\sigma^2}\,dt, \tag{7}$$

Rosin's law:
$$1 - F(x) \propto e^{-\alpha x^n}.$$

We discuss the log normal law here. The log normal distribution is discussed by J. Aitchison and J. A. C. Brown (1963) and applied by them to a variety of economic problems. The derivation given below is similar to B. Epstein's (1947). A more recent discussion of the particle size problem, with references to the literature, is given by G. V. Middleton (1970).
People have tried to fit other curves to a variety of size data, for example, the relative biomass of various species in a region, relative sizes of cities, and sizes of words. (The biomass is the weight of the organisms.) Since these data are discrete, they are usually rearranged so that the items are in order of size. We then seek a model that predicts the (relative) size of the nth item in the list. See J. E. Cohen (1966), B. Mandelbrot (1965), and H. Simon (1955) for examples.

The size distribution of particles is assumed to be the result of many small changes which we will call (breakage) events. An example of an event is a wave hitting the shore. Nothing may happen during the event, or several particles may be broken and abraded. This is such a general framework that we can say very little about it. In probability theory the basic tools for handling a long sequence of random events are limit theorems. We would like to use a limit theorem here if possible. To apply such theorems it is necessary to know (1) that no single event has a big effect, (2) that the events are more or less independent, and (3) that the events combine in a simple fashion. The first condition certainly seems reasonable when averaged over all particles. What about the second condition? Independence is closely related to the idea that knowing the past history of a particle is of no help in predicting what will happen to it. If the particles are made of two very different materials like wood and glass, this is not likely to be true. If the material is all fairly similar, this seems to be a fairly reasonable assumption. We assume that the material is all fairly similar. The third condition is rather vague. I don't see any way to sharpen it without saying more than should be stated.

What we do now is try to describe the erosion procedure and see where it leads us. Let $N_k(x)$ be the number of particles of size at most $x$ after the kth random breakage event. We haven't yet said what we mean by 'size.' It could be volume, weight, a characteristic linear dimension, and so on. Let's postpone making a choice until it is useful to do so.

In the radioactive decay example we saw that, when we dealt with a large number of particles, the number of undecayed particles was close to the average number. Although this sort of behavior is quite common, it is often hard to prove that it is occurring in some particular case. It seems reasonable to suppose that it holds in the present situation, but it does not seem easy to prove; therefore we simply assume that it is true. Let the average number of particles of size at most $x$ be $M_k(x) = E(N_k(x))$. We study this as if it were an exact distribution.

Let $B_k(y \mid x)$ be the average number of particles of size at most $y$ that we expect to obtain from a particle of size $x$ during the kth breakage event. It follows from our independence assumption that $B_k$ does not depend on any property of an individual particle except its present state. We assume that the size of the particle contains all the information about the state of a random particle relevant to breakage. Of course, this is not correct, since a long, thin particle is more likely to break than a round particle of the same size. Hence this is really an assumption that we can just look at the average behavior of particles of a given size. It is easy to show that

$$M_k(y) = \int_0^\infty B_k(y \mid x)\,M_{k-1}'(x)\,dx. \tag{8}$$
As it stands, (8) is too general for us to try to apply a limit theorem. The following is the key assumption: The breakage event is independent of scale. This means that $B_k(y \mid x)$ depends only on the ratio $y/x$. This is not always a reasonable assumption. Many breakage events tend to favor the breakage of larger particles. In crushing, smaller particles are protected because their larger neighbors bear the brunt of the crushing. If particles are broken by some sort of throwing action, a scale argument shows that the smaller particles are less likely to break: The strength of a rock tends to vary with its cross section. The energy expended on a rock varies either with the cross section or with the weight, depending on the situation. If it varies with weight, the ratio energy/strength increases with size, and so larger rocks are more likely to break. These arguments indicate that our model may tend to overestimate the number of large particles.

Setting $B_k(y \mid x) = C_k(y/x)$, we can rewrite (8) in the form

$$M_k(y) = \int_0^\infty C_k\!\left(\frac{y}{x}\right) M_{k-1}'(x)\,dx. \tag{9}$$

Let $X_k$ and $Y_k$ be random variables with distribution functions proportional to $M_k$ and $C_k$, respectively. If (9) is normalized by dividing both sides by $M_k(\infty)$, the result is the formula for the distribution function of the product $Y_k X_{k-1}$ of two independent random variables. Hence $X_k = Y_k X_{k-1}$, which leads to

$$X_k = Y_k Y_{k-1} \cdots Y_2 Y_1 X_0.$$

Since the $Y_i$ are independent and no single event has a large effect, it follows from the central limit theorem that $\log X_k$ tends to be normally distributed for large $k$. Thus

$$\Pr\{X_k \le x\} = \Pr\{\log X_k \le \log x\} \approx \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\log x} e^{-(t - \mu)^2/2\sigma^2}\,dt. \tag{10}$$

The parameters $\mu$ and $\sigma^2$ are the mean and variance of $\log X_k$, not $X_k$.
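The passage from the product $X_k = Y_k \cdots Y_1 X_0$ to the log normal law (10) can be illustrated by simulation. The sketch below rests on assumptions made only for illustration: $X_0 = 1$, each breakage factor is drawn uniformly from $(0.5, 1)$, and the helper names are hypothetical. It compares the skewness of the simulated sizes with the skewness of their logarithms.

```python
import math
import random


def simulate_log_sizes(k, samples, seed=0):
    """Return log X_k for many particles, with X_0 = 1 and X_j = Y_j * X_{j-1}.

    Each Y_j is drawn uniformly from (0.5, 1); this choice is an assumption
    made for illustration, not part of the model in the text.
    """
    rng = random.Random(seed)
    logs = []
    for _ in range(samples):
        log_x = 0.0
        for _ in range(k):
            log_x += math.log(rng.uniform(0.5, 1.0))
        logs.append(log_x)
    return logs


def mean(values):
    return sum(values) / len(values)


def skewness(values):
    """Sample skewness: third central moment divided by variance^(3/2)."""
    m = mean(values)
    var = mean([(v - m) ** 2 for v in values])
    third = mean([(v - m) ** 3 for v in values])
    return third / var ** 1.5


if __name__ == "__main__":
    logs = simulate_log_sizes(k=50, samples=5000)
    sizes = [math.exp(v) for v in logs]
    print("skewness of log X_k:", skewness(logs))
    print("skewness of X_k:", skewness(sizes))
```

The logarithms come out nearly symmetric, as a normal distribution requires, while the sizes themselves are strongly skewed to the right, which is the qualitative behavior reported for particle size data.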
Now let's return to the problem of what we mean by 'size.' It doesn't matter whether we mean weight, a linear dimension, or a similar measure, because all powers of a log normally distributed random variable are also log normally distributed. Let's prove this. Suppose that $X$ is log normally distributed with distribution function (7). Then

$$\Pr\{X^r \le y\} = \Pr\{X \le y^{1/r}\} \propto \int_{-\infty}^{(\log y)/r} e^{-(x - \mu)^2/2\sigma^2}\,dx \propto \int_{-\infty}^{\log y} e^{-(t - r\mu)^2/2(r\sigma)^2}\,dt,$$

where $t = rx$. We have shown that replacing $X$ by $X^r$ changes $(\mu, \sigma)$ to $(r\mu, |r|\sigma)$.

Statistics are often collected by passing particles through a sieve and tabulating the percentage by weight that passes through sieves with various mesh sizes. Our model describes particles of different sizes by number, not by weight. We must find out how to connect these results. We show that
the distribution by weight is log normal if and only if the distribution by number is. To study this, we need a formula for the moments of the log normal distribution which is defined by (10). We have
A(x ; fl, 0') Jr xr N(x ; fl , 0') dx 0
=
=
er log (x)N(x ; fl, 0') dx er/1 + (ra) 2/ 2 Jro N (x ; fl + r0'2 , 0') dx, Jr 0
sInce
x. We can state the above result more compactly in the form Jro x rN (x ; fl, 0') dx er/1 + ( ra ) 2 /2 A(y ; fl + r0' 2 , 0') . (1 1 ) With y and r 1 , 2 the mean and variance of the log normal distribution can be obtained from ( 1 1 ) : e/21 +a22/2 , Mean Variance f3 ex (e a 1). where
t
=
log
=
= 00
=
= ex =
=
=
-
We are now in a position to compare distribution by weight and distribution by number. Suppose $\Lambda(y; \mu, \sigma)$ describes the distribution by number. If the particles are all roughly the same shape, setting $r = 3$ in (11), we obtain a function proportional to the distribution by weight. Hence, distribution by weight is log normal with the distribution function $\Lambda(x; \mu + 3\sigma^2, \sigma)$. Let $\alpha_w$ and $\beta_w$ be the mean and the variance of this distribution. We can easily express the mean $\alpha_n$ of the distribution by number using $\alpha_w$ and $\beta_w$:

$$\alpha_n = e^{\mu + \sigma^2/2} = e^{(\mu + 3\sigma^2) + \sigma^2/2}\, e^{-3\sigma^2} = \alpha_w \left(1 + \beta_w/\alpha_w^2\right)^{-3}.$$
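The moment formula (11) and the relation between the means by number and by weight can be checked numerically. The following is a rough sketch, not part of the original derivation: the parameter values are arbitrary illustrative choices, and the helper names are hypothetical. Moments are approximated with a midpoint rule in the variable $t = \log x$.

```python
import math


def normal_density(t, mu, sigma):
    """Density of the normal distribution with mean mu, std. deviation sigma."""
    return math.exp(-((t - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def lognormal_moment(r, mu, sigma, steps=20000):
    """Approximate the r-th moment of the log normal law by the midpoint
    rule in t = log x, where x**r becomes exp(r*t)."""
    lo, hi = mu - 12 * sigma, mu + 12 * sigma
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        t = lo + (i + 0.5) * h
        total += math.exp(r * t) * normal_density(t, mu, sigma)
    return total * h


mu, sigma = -0.5, 0.6  # illustrative values, not taken from the text

# Mean and variance by number, from (11) with y = infinity and r = 1, 2.
alpha_n = math.exp(mu + sigma ** 2 / 2)
beta_n = alpha_n ** 2 * (math.exp(sigma ** 2) - 1)

# The distribution by weight has parameters (mu + 3*sigma**2, sigma).
mu_w = mu + 3 * sigma ** 2
alpha_w = math.exp(mu_w + sigma ** 2 / 2)
beta_w = alpha_w ** 2 * (math.exp(sigma ** 2) - 1)

# Recover the mean by number from the weight statistics alone.
alpha_n_from_weight = alpha_w * (1 + beta_w / alpha_w ** 2) ** -3

print(alpha_n, lognormal_moment(1, mu, sigma))
print(beta_n, lognormal_moment(2, mu, sigma) - lognormal_moment(1, mu, sigma) ** 2)
print(alpha_n, alpha_n_from_weight)
```

Each printed pair should agree to many digits. The mean by weight can also be checked as the ratio of the fourth to the third moment by number.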
How does the model fit the real world? It fits some data remarkably well and fails at other times. The data in Table 1 are taken from G. Herdan (1953, p. 130), who in turn took them from an article by S. Berg in a Danish journal. The percentage by weight of clay particles not exceeding a certain size (measured in micrometers) was tabulated. The plot on log probability paper should be a straight line. Using a least squares fit we obtain $\mu = -0.377$ and $\sigma = 1.47$ when the logarithms in (7) are taken to the base $e$. The third column in Table 1 shows that the fit is very good.
Table 1. Distribution of Clay Particle Sizes by Weight Percent

Size ≤ (micrometers)    True    Fitted
0.106                   10.0    10.2
0.147                   14.9    14.7
0.25                    24.6    24.6
0.38                    36.4    34.4
0.65                    48.3    48.5
0.96                    57.5    59.0
1.41                    67.6    68.8
2.15                    77.5    78.1
3.25                    87.3    85.5

Source: G. Herdan (1953, p. 130).
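The fitted column of Table 1 can be reproduced from the two quoted parameters. The sketch below assumes, as the text states, that $\mu = -0.377$ and $\sigma = 1.47$ refer to natural logarithms of the size in micrometers. The function name is hypothetical, and the normal distribution function is evaluated through `math.erf`.

```python
import math


def lognormal_cdf_percent(size, mu, sigma):
    """Percent by weight of particles with size <= `size` under a log
    normal law whose natural-log parameters are mu and sigma."""
    z = (math.log(size) - mu) / sigma
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))


MU, SIGMA = -0.377, 1.47  # least squares values quoted in the text
sizes = [0.106, 0.147, 0.25, 0.38, 0.65, 0.96, 1.41, 2.15, 3.25]
true_pct = [10.0, 14.9, 24.6, 36.4, 48.3, 57.5, 67.6, 77.5, 87.3]

fitted_pct = [lognormal_cdf_percent(s, MU, SIGMA) for s in sizes]
for size, true, fitted in zip(sizes, true_pct, fitted_pct):
    print(f"{size:6.3f}  {true:5.1f}  {fitted:5.1f}")
```

Because the parameters are quoted to three significant figures, the computed values may differ from the printed column in the last digit.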
Table 2. Distribution of Sand Grain Sizes by Weight Percent

Size ≤ (millimeters)    True     Fitted
0.074                    3.1      1.0
0.104                    5.8      4.5
0.147                   12.9     14.2
0.208                   28.5     32.9
0.295                   56.1     57.6
0.417                   79.6     79.4
0.589                   94.1     92.6
0.833                   99.5     98.1
1.17                    99.93    99.6

Source: G. H. Otto (1939).
The material in Table 2 was taken from G. H. Otto (1939), who obtained it by studying a sand dune in Palm Springs, California. In this case $\mu = -1.33$ and $\sigma = 0.551$. The fit is not quite as good. If you are interested in more data, you might try the article by G. M. Friedman (1958). I have not checked to see how well his data can be described by a log normal distribution.

PROBLEMS

1. When steel tapes are used to measure distance, alignment can be a problem. For example, suppose we use a 100 foot long steel tape to measure the distance between two points a fraction of a mile apart. It is unlikely that we will be able to measure along a straight line connecting the points; instead we will probably zigzag slightly. As a result, the measured distance will exceed the actual distance. The following model of the situation was adapted from B. Noble (1971, Sec. 13.6).

(a) Suppose that the error in aligning the $k$th usage of a tape of length $L$ is $e_k$ (Figure 1). Show that, if the distance between the two points in question is about $nL$, the distance is overestimated by approximately

$$\delta = \frac{1}{2L} \sum_{k=1}^{n} e_k^2.$$

Figure 1. Errors in aligning a measuring tape.

(b) What reasonable assumptions can you make about the distribution of $e_k$ to obtain information about the distribution of $\delta$? Can you apply the central limit theorem?

2. The following problem was adapted from B. Noble (1971, Sec. 15.3). Suppose you are asked to decide whether or not to install a traffic signal at a pedestrian crosswalk. To arrive at an answer you need to know how long a person can expect to wait before a gap in the traffic provides enough time to cross. The only data you can expect to obtain are physical information about the street and the rate of traffic flow in cars per hour. How can this be used? For simplicity, we assume that for most of the problem the traffic all moves in the same direction.

(a) It has been found experimentally that the process of car arrival at a given point on a road can be approximated fairly well by a memoryless (Poisson) process. Show that, if the average number of cars passing the point per unit time is $\lambda$, the probability that no cars will pass during a given interval of length $t$ is $p = e^{-\lambda t}$.

(b) Show that the expected waiting time for a gap of size at least $t$ is roughly $t/p$. Is this estimate high or low? How accurate is it?

(c) Children walk at a rate of about 3.5 feet per second. If we wish the expected waiting time for a child to be at most 1 minute, obtain an estimate for the maximum permissible flow rate $\lambda_{\max}$ in cars per hour as a function of street width $D$. Noble gives

$$\lambda_{\max} = \frac{29{,}000\,(2.322 - \log_{10} D)}{D},$$

which has been adopted by the Joint Committee of the Institute of Traffic Engineers and the International Association of Chiefs of Police. How accurate is the estimate in (c)?

(d) What if the assumption in (a) is incorrect because of saturation of the roadway or because of the presence of traffic signals up the road?

(e) Discuss the situation in which traffic is moving in both directions.
3. How far apart can we expect the ends of a randomly thrown string to fall? J. L. Synge (1970) presents an interesting discussion of unsuccessful attempts to model this situation. The following is adapted from L. E. Clarke (1971), who wrote an article in response to Synge's. We assume that the string is made of $n$ small stiff pieces of length $l$, where the angle between adjacent pieces is a random variable depending on $l$. Then we allow $l \to 0$. Let the location of one end of the string be the origin and let the farther end of the $k$th segment be at the point $\sum_{i \le k} P_i$. Let $P_i = (X_i, Y_i)$ and give it a physical interpretation.

(a) Show that the expected value of the square of the distance between the ends of the string equals

$$E(S^2) = nl^2 + 2 \sum_{i<j} E(X_i X_j + Y_i Y_j).$$

(b) Argue that we can assume

$$E(P_{i+1} \mid P_i, P_{i-1}, \ldots) = E(P_{i+1} \mid P_i),$$

and also that $E(P_{i+1} \mid P_i = (l, 0)) = (ql, 0)$, for some $q = q(l) < 1$. Show that it is reasonable to suppose that $q(0) = 1$ and that $m = -q'(0) > 0$ is a measure of flexibility. (You should picture what is happening: We are considering shorter and shorter lengths of string of some fixed thickness.) Is it reasonable to assume that $q'(0)$ exists as we have just tacitly done?

(c) Show that $E(P_{i+1} \mid P_i) = qP_i$.

(d) Show that, for $i < j$, $E(P_j \mid P_i) = q^{j-i} P_i$ and that

$$E(X_i X_j + Y_i Y_j) = E(q^{j-i} X_i^2 + q^{j-i} Y_i^2) = q^{j-i} l^2.$$

Combine this with (a) to obtain

$$E(S^2) = l^2 n + 2l^2\, \frac{q\,(q^n - nq + n - 1)}{(1 - q)^2}.$$

Hint: $\sum_{i<j} q^{j-i} = \sum_{i \le n} (n - i)\, q^i$.

(e) Fix the value of $r = nl$, the length of the string, and let $l \to 0$. Show that $E(S^2/r^2) \to g(mr)$, where

$$g(t) = \frac{2(e^{-t} + t - 1)}{t^2}.$$

(f) Does the result appear reasonable? I recommend that the class design and carry out an experiment to test the model. How difficult is it to throw a string at random? Do you have problems with the string tending to stick to itself? With centrifugal force when the string is thrown?

These sorts of models are closely related to random walks. The result in (d) is applicable to the problem of determining the lengths of long chain polymers. See C. Tanford (1961, Sec. 9).
4. You are the manager of a delicatessen. Certain items that you stock are highly perishable.

(a) The pastries you buy from the wholesale bakery must be ordered 1 day ahead and can be kept only 1 day. How should you determine the size of your order?

(b) Your competitor has less stringent standards than you, so he keeps pastries for 2 days. What is his optimal ordering policy? If you both have the same costs and wish to make the same profit, how will your prices compare? Will they differ substantially? How is your answer affected by the volume of business you and your competitor do?

Note: you must make a variety of assumptions to do this problem. Discuss them.

5. This problem is adapted from H. M. Finucan (1976). Sometimes we must choose a variable $x$ which is stochastically related to another variable $y$. Penalties for $y > y_0$ and $y < y_0$ may be substantially different. For example, suppose you wish to jump across a stream. Let $x$ be the amount of effort used, $y$ the distance of your jump, and $y_0$ the width of the stream. In a plant with automatic packaging, $x$ may be the length of time a chute filling a container is open, $y$ the weight of the product entering the container, and $y_0$ the minimum acceptable weight. Here we model a situation that is different from these two. When steel beams are made by continuous hot-rolling, they are cut twice. The first cut is a rough cut as the beam emerges from the rollers. The second is a precise cut of the cool beam. The length $y$ of the cooled rough cut beam is approximately normally distributed with mean $x$ and variance $S^2$. The machinery is calibrated in terms of $x$. $S^2$ is measurable and cannot be changed except by changing the mill machinery and/or operating procedures; therefore we consider it fixed and known. If the length of the cool beam exceeds $y_0$, it is cut to the length $y_0$; if the length is less than $y_0$, it is rejected.

(a) Define

$$e(t) = \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2} \quad \text{and} \quad E(z) = \int_z^\infty e(t)\, dt.$$

Show that

$$P(x) = \Pr\{y \ge y_0\} = E\!\left(\frac{y_0 - x}{S}\right),$$

and that the average length of cold steel needed to produce one beam is $W(x) = x/P(x)$. Conclude that the extreme values of $W$ are given by the solutions to

$$\frac{E(z)}{e(z)} = \frac{y_0}{S} - z, \quad \text{where } x = y_0 - Sz.$$

(b) Describe a procedure for computing the value of $x$ that minimizes $W(x)$. Finucan cites $y_0 = 30$ feet and $S = 2$ feet as a typical example. Show that the optimal value for $x$ is 33 feet 11 inches. (Use a table of $e(z)$ and $E(z)$, or a table of $E(z)/e(z)$ if you have one.)

(c) Suppose undersized beams can be cut to length $u_0$ and used. Assume that $y_0 - u_0$ is much larger than $S$. Discuss a model.

(d) Can you suggest improvements in the model? Other applications? Develop a model for the packaging example cited at the beginning of the problem.
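For readers without a table of $E(z)/e(z)$, the condition in Problem 5(a) can be solved numerically. The sketch below is one possible procedure, not Finucan's: it bisects on $z$ using `math.erfc` for the normal tail. The function names and the bracket $[-6, 0]$ are assumptions chosen for cases where $y_0/S$ is well above 1.

```python
import math


def e(z):
    """Standard normal density."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)


def E(z):
    """Integral of e(t) from z to infinity (upper tail)."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def W(x, y0, S):
    """Average length of steel per accepted beam, x / P(x)."""
    return x / E((y0 - x) / S)


def optimal_mean(y0, S, lo=-6.0, hi=0.0, tol=1e-10):
    """Bisect on z for the root of E(z)/e(z) = y0/S - z; return x = y0 - S*z.

    The bracket [lo, hi] is an assumption. On z < 0 the function h below
    is decreasing, so it has at most one root there.
    """
    def h(z):
        return E(z) / e(z) + z - y0 / S

    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) > 0:
            lo = mid
        else:
            hi = mid
    return y0 - S * (lo + hi) / 2


x_opt = optimal_mean(30.0, 2.0)
feet, inches = divmod(round(x_opt * 12), 12)
print(f"x = {x_opt:.3f} feet, about {feet} feet {inches} inches")
```

Comparing `W(x_opt, 30.0, 2.0)` with values of `W` at nearby settings gives a quick check that the root is a minimum rather than a maximum.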
APPENDIX

SOME PROBABILISTIC BACKGROUND

This appendix contains a hasty survey of the probability theory needed in the text. It can be used as a review for those who have had some probability theory. For those who have not had any, it can be used as an adjunct to lectures on the subject.

A.1. THE NOTION OF PROBABILITY
If I toss a fair coin, what are the chances that it will come up heads? We expect to see 50% heads in the long run and so write $\Pr\{\text{heads}\} = \tfrac{1}{2}$. This is read, 'The probability of the event "the coin lands heads up after this toss" equals $\tfrac{1}{2}$'; however, we shorten it to, 'The probability of heads equals $\tfrac{1}{2}$.' What happens when we don't know the probability from a priori considerations? For example, what is the probability that a newborn baby will be a boy? We need to say very carefully what we mean. The fraction of newborn children who have been males in recent years has been 0.514 in the United States. Therefore we could say that the probability of a male child is 0.514 if the expectant mother is American. However, if you told me that she is a black American, I would recommend changing the probability to 0.506, since this is the observed fraction when the mother is a black American. What's going on? The population I'm looking at has changed from all babies recently born to American women to all babies recently born to black American women. Note that both these populations are drawn from the past; as in all of science I'm assuming that the future will resemble the past. Although these considerations are essential for applications, they should not enter into the theoretical framework of probability theory to which we now turn our attention. The problem of estimating probabilities, to which I've alluded above, comes up again in the last paragraph of Section A.5.

DEFINITION. Let $\mathcal{E}$ be a finite set and let Pr be a function from $\mathcal{E}$ to the nonnegative real numbers such that

$$\sum_{e \in \mathcal{E}} \Pr\{e\} = 1.$$

(Note the braces instead of parentheses for the function.) We call $\mathcal{E}$ the event set, the elements of $\mathcal{E}$ the simple events, and $\Pr\{e\}$ the probability of the simple event $e$.
As an illustration, consider tossing a fair coin twice. The outcomes can be denoted by the obvious notation HH, HT, TH, and TT. We can think of these as simple events and write $\mathcal{E} = \{\text{HH}, \text{HT}, \text{TH}, \text{TT}\}$. Also, $\Pr\{e\} = \tfrac{1}{4}$ for each $e \in \mathcal{E}$. As another illustration, suppose that we toss the coin until a head occurs or until we have completed two tosses. Then the simple events can be denoted by 1, 2, and F, meaning a head at the first toss, a head at the second toss, and a failure to obtain a head. These correspond, respectively, to H, TH, and TT in the previous notation. We have

$$\mathcal{E} = \{1, 2, F\}, \qquad \Pr\{1\} = \tfrac{1}{2}, \qquad \Pr\{2\} = \Pr\{F\} = \tfrac{1}{4}.$$

Note that the simple events in both examples are mutually exclusive and exhaustive; that is, exactly one occurs. This is the case in all interpretations of simple events. If we had tossed a coin twice in the last example, we could think of event 1 as being the occurrence of either of the two simple events, HH and HT. We would write this as $1 = \{\text{HH}, \text{HT}\}$. Thus we would write

$$\Pr\{\text{HH}, \text{HT}\} = \Pr\{1\} = \tfrac{1}{2}$$

and read the left side as the probability of either HH or HT occurring. More generally,

DEFINITION. For any subset $S$ of $\mathcal{E}$ we define $\Pr\{S\}$ to be the sum of $\Pr\{e\}$ over all $e \in S$ and refer to it as the probability that a simple event in $S$ will occur or, briefly, the probability that $S$ will occur.
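We can estimate $\Pr\{S\}$ by sampling from $\mathcal{E}$: if $N$ simple events $e$ are observed and $N_S$ of them lie in $S$, then $N_S/N$ is an estimate of $\Pr\{S\}$.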
DEFINITION. The conditional probability of $A$ given $B$ is defined to be $\Pr\{A \cap B\}/\Pr\{B\}$ and is denoted by $\Pr\{A \mid B\}$. The sets of events $A$ and $B$ are called independent if

$$\Pr\{A \cap B\} = \Pr\{A\} \Pr\{B\}.$$

Conditional probability is interpreted as the probability that $e \in A$ given that $e \in B$. We can think of this as restricting our attention to $B$: If we estimate probability by counting, as described earlier, we will estimate the probability that an event in $B$ lies in $A$ by $N_{A \cap B}/N_B$. Since this equals $(N_{A \cap B}/N)/(N_B/N)$, we see that the definition of conditional probability agrees with the notion of restricting our attention to the events in $B$. We can think of independence as follows. Knowing that $e \in B$ gives no information about whether or not $e \in A$, since

$$\Pr\{A \mid B\} = \frac{\Pr\{A \cap B\}}{\Pr\{B\}} = \Pr\{A\}, \tag{1}$$

by the definitions of conditional probability and independence. By symmetry, the roles of $A$ and $B$ can be interchanged.

PROBLEMS
1. Prove that $\Pr\{A \cup B\} = \Pr\{A\} + \Pr\{B\} - \Pr\{A \cap B\}$.

2. Two dice are thrown. All that matters is the sum of the two values. Formulate this in a probabilistic framework.

3. We are looking at U.S. coins minted in the 1960s. Our interest is in denomination, date, and mint. Discuss some things we could consider and cast them all in the appropriate terminology, assuming that a simple event corresponds to observing a single coin. To begin with, what is $\mathcal{E}$? Does it help to know the number of each type of coin that was minted? Why?
A.2. RANDOM VARIABLES

We're frequently not interested in simple events but only some real-valued function of them; for example, the number of heads in 100 tosses of a coin. A natural choice for the set of simple events is the $2^{100}$ possible sequences of heads and tails, but the function we wish to study takes on only 101 values, a considerable reduction from $2^{100}$. The value of such a function depends on which simple event occurs, so it is a variable. Since it depends on something that is random, it is a random variable. Thus we have

DEFINITION. A random variable is a real-valued function defined on $\mathcal{E}$.

It is conventional to use capital letters for random variables. Instead of the functional notation $X(e)$, one frequently writes simply $X$ and talks about the value of $X$. The function $\Pr\{X \le x\}$ is called the (cumulative) distribution function for $X$ and is important in discussing continuous probabilities. (See Section A.4.) By our convention regarding Pr {statement}, it equals the sum of $\Pr\{e\}$ over all elementary events $e$ with $X(e) \le x$. We are often interested in what values $X$ is likely to take on; for example, if we toss our coin 100 times and count the number of heads, how many do we expect? How close to this estimate can we expect to be? We now introduce two important concepts relating to these questions.

DEFINITION. The expectation or expected value of $X$ is given by

$$E(X) = \sum_{e \in \mathcal{E}} X(e) \Pr\{e\},$$

and the variance of $X$ is given by

$$\sigma^2(X) = \sum_{e \in \mathcal{E}} [X(e) - E(X)]^2 \Pr\{e\}.$$

[Note that in the definition functional notation is used correctly; i.e., $X$ should not be replaced by $X(e)$ at any of its occurrences.] The expectation is the average value of $X$. If we make lots of observations and compute the average value of $X$, it will approximate $E(X)$. The average value of $X$ over a series of observations is denoted by $\bar{X}$. Since $\bar{X}$ is easily determined, we have a good way to estimate $E(X)$. Thus, if $E(X)$ completely determined $\Pr\{X = x\}$, we'd have a method for estimating whatever we wanted about $X$. We see examples of this later. The variance is a measure of how much we can expect values of $X$ to deviate from $E(X)$: it is the average value of $[X(e) - E(X)]^2$; that is, $\sigma^2(X) = E([X - E(X)]^2)$.
(This suggests that we can approximate $\sigma^2(X)$ by $\overline{(X - \bar{X})^2}$. This is true, but a better estimate is given by this number times $n/(n - 1)$. We won't go into the reason here.) The larger the value of $\sigma^2(X)$, the more spread out the values of $X$ tend to be. Stated another way, if the variance is small, then $X$ is not likely to deviate far from $E(X)$. The following theorem makes this precise. The proof is left as a problem.

THEOREM. Chebyshev's inequality. Whenever $c > 0$,

$$\Pr\{|X - E(X)| > c\} \le \frac{\sigma^2(X)}{c^2}.$$

In words, the probability that $X$ differs from its expected value by more than $c$ does not exceed its variance divided by $c^2$. Note that the theorem is useless if $c^2 < \sigma^2(X)$.
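Some basic properties of expectation and variance are

$$\begin{aligned}
E(X) &= \sum_x x \Pr\{X = x\}, \\
E(aX + bY) &= aE(X) + bE(Y), \qquad E(a) = a, \\
\sigma^2(X) &= E(X^2) - E(X)^2, \\
\sigma^2(aX + b) &= a^2 \sigma^2(X), \qquad \sigma^2(a) = 0, \qquad \sigma^2(X) \ge 0.
\end{aligned} \tag{2}$$

We prove the linearity of expectation, the formula $E(a) = a$, and the formula $\sigma^2(X) = E(X^2) - E(X)^2$. You do the others. We have

$$E(aX + bY) = \sum_e [aX(e) + bY(e)] \Pr\{e\} = a \sum_e X(e) \Pr\{e\} + b \sum_e Y(e) \Pr\{e\} = aE(X) + bE(Y)$$

and

$$E(a) = \sum_e a \Pr\{e\} = a.$$

For the variance formula,

$$\sigma^2(X) = E([X - E(X)]^2) = E(X^2 - 2E(X)X + E(X)^2) = E(X^2) - 2E(X)E(X) + E(X)^2 = E(X^2) - E(X)^2.$$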
The notions of independence and conditionality can of course be carried over to random variables. Thus we say that $X$ and $Y$ are independent if

$$\Pr\{X = x \text{ and } Y = y\} = \Pr\{X = x\} \Pr\{Y = y\}$$

for all $x$ and $y$. In other words, the events $X = x$ and $Y = y$ must be independent for all $x$ and $y$. Hence knowing the value of $X$ gives no information about the value of $Y$, and vice versa. The conditional expectation is defined by

$$E(X \mid Y = y) = \sum_x x \Pr\{X = x \mid Y = y\}. \tag{3}$$

In other words, it is the average value of $X$ on the set of events for which $Y(e) = y$. Although this is a function of $y$, it is often abbreviated $E(X \mid Y)$. Note that $E(E(X \mid Y))$ is simply $E(X)$, because $E(E(X \mid Y))$ is obtained by multiplying (3) by $\Pr\{Y = y\}$ and summing over $y$, which by simple manipulation reduces to $E(X)$. The importance of independence is reflected in the following theorem.
THEOREM. If $X$ and $Y$ are independent random variables,

$$\begin{aligned}
E(XY) &= E(X)E(Y), \\
\sigma^2(X + Y) &= \sigma^2(X) + \sigma^2(Y), \\
E(X \mid Y) &= E(X).
\end{aligned} \tag{4}$$

We prove these. We have

$$E(XY) = \sum_e X(e)Y(e) \Pr\{e\} = \sum_{x,y} xy \Pr\{X = x \text{ and } Y = y\} = \sum_{x,y} xy \Pr\{X = x\} \Pr\{Y = y\} = E(X)E(Y),$$

$$\begin{aligned}
\sigma^2(X + Y) &= E((X + Y)^2) - (E(X + Y))^2 \\
&= E(X^2 + 2XY + Y^2) - [E(X) + E(Y)]^2 \\
&= \sigma^2(X) + \sigma^2(Y) + 2E(XY) - 2E(X)E(Y) \\
&= \sigma^2(X) + \sigma^2(Y),
\end{aligned}$$

and, by (1),

$$E(X \mid Y) = \sum_x x \Pr\{X = x \mid Y = y\} = \sum_x x \Pr\{X = x\} = E(X).$$
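The first two statements of the theorem can be verified by brute force on a small event set. The sketch below uses two fair dice (an example chosen here, not taken from the text) and exact rational arithmetic. It also shows that the variance formula can fail when the summands are not independent. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Event set: ordered outcomes of two fair dice, each with probability 1/36.
events = list(product(range(1, 7), repeat=2))
pr = {e: Fraction(1, 36) for e in events}


def E(f):
    """Expectation of the random variable f over the event set."""
    return sum(f(e) * pr[e] for e in events)


def var(f):
    """Variance, computed directly from the definition."""
    m = E(f)
    return sum((f(e) - m) ** 2 * pr[e] for e in events)


def X(e):
    return e[0]  # value of the first die


def Y(e):
    return e[1]  # value of the second die (independent of X)


def XY(e):
    return e[0] * e[1]


def S(e):
    return e[0] + e[1]  # X + Y, independent summands


def D(e):
    return 2 * e[0]  # X + X, summands that are NOT independent


print(E(XY), E(X) * E(Y))       # equal
print(var(S), var(X) + var(Y))  # equal
print(var(D), var(X) + var(X))  # differ: independence matters
```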
PROBLEMS

1. Complete the proof of (2).

2. Prove Chebyshev's inequality by showing that

$$\sigma^2(X) \ge c^2 \Pr\{|X - E(X)| > c\}.$$

3. The notion of independence is extended to several sets by requiring that for any subcollection $A, B, \ldots, C$ of the sets

$$\Pr\{A \cap B \cap \cdots \cap C\} = \Pr\{A\} \Pr\{B\} \cdots \Pr\{C\}.$$

Describe independence for several random variables and show that, if $X_1, \ldots, X_n$ are independent,

$$E\Big(\prod X_i\Big) = \prod E(X_i), \qquad \sigma^2\Big(\sum X_i\Big) = \sum \sigma^2(X_i).$$

What else can you say about the situation?

4. If $X$ and $Y$ are independent random variables with $\Pr\{X = x\} = f(x)$ and $\Pr\{Y = y\} = g(y)$, show that

$$\Pr\{X + Y = z\} = \sum f(x) g(z - x),$$

the sum ranging over all $x$ for which $f(x) \ne 0$.

5. (a) Establish Bayes' formula:

$$\Pr\{A \mid B\} = \frac{\Pr\{A\} \Pr\{B \mid A\}}{\Pr\{B\}}.$$

(b) Suppose that a diagnostic test has been developed that detects a particular disease 98% of the time when it is actually present and incorrectly 'detects' it 5% of the time when it is not present. If 1% of the population has the disease, show that the probability an individual has the disease when the test says that he does is

$$\frac{(0.01)(0.98)}{(0.01)(0.98) + (0.99)(0.05)} \approx 0.17.$$

In other words, about 83% of the detections are incorrect.
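The arithmetic in Problem 5(b) is easy to get wrong by hand, so a short check may help. The sketch below evaluates Bayes' formula for the stated rates. The function and parameter names are hypothetical labels for the three numbers in the problem.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Pr{disease | positive test} from Bayes' formula."""
    true_pos = prior * sensitivity                 # Pr{disease and positive}
    false_pos = (1 - prior) * false_positive_rate  # Pr{no disease and positive}
    return true_pos / (true_pos + false_pos)


p = posterior(prior=0.01, sensitivity=0.98, false_positive_rate=0.05)
print(f"Pr{{disease | positive}} = {p:.3f}")
print(f"fraction of detections that are incorrect = {1 - p:.3f}")
```

Varying `prior` shows how strongly the answer depends on how rare the disease is.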
A.3. BERNOULLI TRIALS

Consider an experiment made up of a repeated number of independent identical trials each having two outcomes; for example, coin tossing. These are called Bernoulli trials. Since Bernoulli trials are important, I'll discuss some of their basic properties. We designate the outcomes of the trials by S and F for success and failure and let $p$ be the probability that trial $i$ ends in success. A typical simple event is a sequence containing some number $s$ of successes and some number $f$ of failures in some order. Since the trials are independent, probabilities multiply, and so $p^s(1 - p)^f$ is the probability of the simple event, given that exactly $s + f$ trials are performed. Let $S_n$ be a random variable equal to the number of successes in the first $n$ trials. We want to study $\Pr\{S_n = k\}$. Let $\binom{n}{k}$, read '$n$ choose $k$,' denote the number of ways to choose $k$ locations in an $n$ long sequence. Then

$$\Pr\{S_n = k\} = \binom{n}{k} p^k (1 - p)^{n-k}. \tag{5}$$

The numbers $\binom{n}{k}$ are the well-studied binomial coefficients. Their values turn out to be

$$\binom{n}{k} = \frac{n(n - 1) \cdots (n - k + 1)}{1 \cdot 2 \cdots k}.$$

To study $S_n$ it is convenient to introduce random variables that reflect the independence of the trials. Define random variables $X_i$ by $X_i = 1$ if the $i$th trial succeeds, and $X_i = 0$ otherwise. Then $S_n = X_1 + \cdots + X_n$, and the $X_i$ are independent. One easily computes $E(X_i) = p$ and

$$\sigma^2(X_i) = p(1 - p)^2 + (1 - p)(0 - p)^2 = p(1 - p).$$

By Problem A.2.3, $E(S_n) = np$ and $\sigma^2(S_n) = np(1 - p)$.

How long must we wait for our first success? We have a problem here because there may be no success in the first $n$ trials. To overcome this, we do computations with $n$ fixed and then let $n \to \infty$. The answer is the expected value of a random variable that equals $k$ if and only if the first success occurs on trial $k$. Hence we obtain

$$\sum k \Pr\{S_{k-1} = 0 \text{ and } X_k = 1\} = \sum k \Pr\{S_{k-1} = 0\} \Pr\{X_k = 1\} = \sum k (1 - p)^{k-1} p = p\, \frac{d}{dq} \sum q^k,$$

where $q = 1 - p$ and the sums range from $k = 1$ to $k = n$. Evaluating the last sum and letting $n \to \infty$, we find that the expected waiting time for the first success equals $1/p$. Since the trials after the first success are independent of the trials leading up to the first success, we see that the expected waiting time for the $j$th success is $j/p$.

PROBLEMS
1. Let $W_1$ be a random variable equal to the number of Bernoulli trials until the first success.

(a) Show that $\Pr\{W_1 = n\} = q^{n-1} p$.

(b) What is $\sigma^2(W_1)$?

2. Let $W_k$ be a random variable equal to the number of Bernoulli trials until the $k$th success.

(a) Show that

$$\Pr\{W_k = n\} = \binom{n-1}{k-1} q^{n-k} p^k.$$

(b) What is $\sigma^2(W_k)$? Hint: Look at $X_1 + X_2 + \cdots + X_k$, where the $X_i$ are independent and have the same distribution as $W_1$.

3. The circuitry in my hand calculator has a probability of failure equal to $p$ per hour of use, independent of how long I have used it. How long can I expect the calculator to work before it fails?

4. In situations like that in the previous problem, circuits can be duplicated. Then failure does not occur until both copies of the circuit have failed. Let $T$ be the time to failure.

(a) Show that

$$\Pr\{T = n\} = \Pr\{\max(X, Y) = n\},$$

where $X$ and $Y$ are independent and identically distributed with the same distribution as $W_1$.

(b) Show that $\Pr\{T = n\} = q^{2n-2}(1 - q^2)$, first by using (a) and second by expressing $T$ as $W_1$ for some Bernoulli trials.
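The waiting-time argument of this section (fix $n$, sum, then let $n \to \infty$) can be watched numerically. The sketch below evaluates the partial sums of $k(1-p)^{k-1}p$ for an arbitrary illustrative value $p = 0.2$. The function name is hypothetical.

```python
def expected_wait_partial(p, n):
    """Sum of k * (1-p)**(k-1) * p for k = 1..n, the contribution to the
    expected waiting time from a first success on one of the first n trials."""
    q = 1 - p
    return sum(k * q ** (k - 1) * p for k in range(1, n + 1))


p = 0.2  # illustrative success probability
for n in (10, 50, 200):
    print(n, expected_wait_partial(p, n))
print("limit 1/p =", 1 / p)
```

The partial sums increase toward $1/p$. The shortfall for small $n$ reflects the chance that no success has occurred in the first $n$ trials.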
A.4. INFINITE EVENT SETS
Very often we want to allow an infinite event space. In this case it may be difficult to start out with elementary events. For example, consider the situation in which all the real numbers in the interval between 0 and 1 are equally likely to be chosen. We cannot assign a nonzero probability to any number, for we should then be obliged to assign the same probability to all numbers in the interval, and then the sum of the probabilities would be infinite. However, if each number has zero probability of being chosen, the sum of the probabilities will be zero. The way out of this difficulty is to ignore individual numbers and simply assign a probability to the event that the number chosen lies between $x$ and $y$. Thus we could start out with a definition of Pr as a function on the subsets of $\mathcal{E}$ having certain properties like $\Pr\{A\} \ge 0$, $\Pr\{\mathcal{E}\} = 1$, and $\Pr\{A \cup B\} = \Pr\{A\} + \Pr\{B\} - \Pr\{A \cap B\}$. This approach leads to complications. A simpler but limited approach is to work with random variables and use $\Pr\{X \le x\}$ as the basic concept. This will satisfy our needs.

DEFINITION. If $F(x)$ is a real-valued monotonic function satisfying

$$\lim_{x \to +\infty} F(x) = 1 \quad \text{and} \quad \lim_{x \to -\infty} F(x) = 0,$$

we call $F(x)$ the distribution function for the random variable $X$ and write $\Pr\{X \le x\} = F(x)$. If $f(x) = F'(x)$ exists, we call it the density function for $X$.

Roughly speaking, $f(x)\, dx$ is the probability that $X$ lies between $x$ and $x + dx$. By a suggestive abuse of terminology $f(x)\, dx$ is called the probability that $X = x$. Consider the example

$$F(x) = \begin{cases} 0 & \text{for } x \le 0, \\ 1 & \text{for } x \ge 1, \\ x & \text{for } 0 \le x \le 1. \end{cases}$$

It follows that $X$ lies in the interval between 0 and 1, since

$$\Pr\{X \le 0\} = F(0) = 0,$$

and

$$\Pr\{X > 1\} = 1 - \Pr\{X \le 1\} = 1 - F(1) = 0.$$

Furthermore, if $0 \le x \le y \le 1$,

$$\Pr\{x < X \le y\} = F(y) - F(x) = y - x.$$

Thus the probability that $X$ lies in the interval $(x, y]$ equals the length of the interval. We also have $f(x) = 1$. This is the uniform distribution on the interval $[0, 1]$ mentioned in the first paragraph of this section.
Consider the example $F(x) = b_i$ for $a_i \le x < a_{i+1}$ and $i = 0, 1, 2, \ldots, n$, where $a_0 = -\infty$, $a_{n+1} = +\infty$, and $b_n = 1$. If $b_i = p_1 + \cdots + p_i$, then $\Pr\{x < X \le y\} = 0$ whenever $a_i \le x \le y < a_{i+1}$. If $\delta > 0$ is small, $\Pr\{a_i - \delta < X \le a_i\} = b_i - b_{i-1} = p_i$. Letting $\delta \to 0$, we see that, in some sense, $\Pr\{X = a_i\} = p_i$. Thus the step function $F$ corresponds to a discrete distribution like those discussed in Section A.2. Thus the present framework provides a generalization of the ideas introduced in Section A.2; however, to carry out the generalization we shall need some additional concepts, and the whole thing will appear rather theoretical. The main idea to keep in mind is that $\sum$ is replaced by $\int$ and $\Pr\{X = x\}$ is replaced by $f(x)\, dx$. The analogy between sums and integrals suggests that we define

$$E(X) = \int_{-\infty}^{+\infty} x f(x)\, dx.$$

This has two drawbacks: First, we want to replace $X$ by a function of $X$ to obtain a more general definition (this is easy), and second, $f(x)$ may not exist (this is more serious). To begin with, we write

$$E(g(X)) = \int_{-\infty}^{+\infty} g(x) f(x)\, dx. \tag{6}$$

Integrating by parts with $u = g$ and $dv = f\, dx$ we have

$$E(g(X)) = g(x)F(x) \Big|_{-\infty}^{+\infty} - \int_{-\infty}^{+\infty} g'(x) F(x)\, dx. \tag{7}$$

This looks like a good definition for expectation, since $f$ does not appear. Unfortunately the two terms in (7) may both be infinite. To avoid this problem we have

DEFINITION. The expectation of $g(X)$ is given by

$$E(g(X)) = \lim_{t \to \infty} \left[ g(t)F(t) - \int_{-\infty}^{t} g'(x) F(x)\, dx \right].$$
If $f(x)$ exists, this reduces to (6).

Now there is a question of consistency that we should consider. Let the random variable $Y$ be defined by $Y = g(X)$. We ought to have $E(Y) = E(g(X))$. Is this the case? Suppose that $g$ is monotonic increasing. We have

$$\Pr\{Y \le y\} = \Pr\{X \le g^{-1}(y)\} = F(g^{-1}(y)).$$

Hence

$$E(Y) = \lim_{u \to \infty} \left[ uF(g^{-1}(u)) - \int_{-\infty}^{u} F(g^{-1}(y))\, dy \right]$$

by the definition. Setting $t = g^{-1}(u)$ and $x = g^{-1}(y)$, we have

$$E(Y) = \lim_{t \to \infty} \left[ g(t)F(t) - \int_{-\infty}^{t} g'(x) F(x)\, dx \right] = E(g(X)),$$

which is what we had hoped for. The variance of $X$ is defined to be $E([X - E(X)]^2)$.

We need to be able to handle more than one random variable simultaneously. Thus we introduce a function $F(x_1, \ldots, x_n)$ which is identified with

$$\Pr\{X_1 \le x_1, \ldots, X_n \le x_n\}. \tag{8}$$

Then $f = \partial^n F / \partial x_1 \cdots \partial x_n$. We require that $F \to 1$ as the $x_i \to +\infty$, $F \to 0$ as the $x_i \to -\infty$, and $f \ge 0$. The last condition can be phrased purely in terms of $F$ to allow for the case in which $f$ does not exist. For example, when $n = 1$, we require that $F(x) - F(x^*) \ge 0$ whenever $x \ge x^*$, and, when $n = 2$, we require that

$$F(x, y) - F(x^*, y) - F(x, y^*) + F(x^*, y^*) \ge 0$$

whenever $x \ge x^*$ and $y \ge y^*$. The $n = 1$ case corresponds to the statement that the integral of $f(t)$ from $x^*$ to $x$ is nonnegative, and the $n = 2$ case corresponds to the statement that the integral of $f(t, u)$ over the rectangle $[x^*, x] \times [y^*, y]$ is nonnegative. This can be generalized. From the joint distribution function $F(x_1, \ldots, x_n)$ we can compute various marginal distribution functions, that is, probabilities like (8) in which one or more of the $X_i$ have been deleted. For example, given $F(x, y)$ as the joint distribution function for $X$ and $Y$, the distribution functions for $X$ and $Y$ are $\lim_{y \to +\infty} F(x, y)$ and $\lim_{x \to +\infty} F(x, y)$, respectively. You should be able to show that the density function for $X$ is given by $\int_{-\infty}^{+\infty} f(x, y)\, dy$.

Of course, expectation is given by

$$E(g(X_1, \ldots, X_n)) = \int \cdots \int g(x_1, \ldots, x_n)\, f(x_1, \ldots, x_n)\, dx_1 \cdots dx_n,$$
which can be rephrased in terms of $F$ by using $n$-fold integration by parts. Conditional expectation and independence also parallel Section A.2. For example,

$$E(X \mid Y) = \int_{-\infty}^{+\infty} x f(x, y)\,dx,$$

and we say that $X$ and $Y$ are independent if $f(x, y) = g(x)h(y)$ for some functions $g$ and $h$. In this case we can choose $g$ and $h$ to be the density functions for $X$ and $Y$. There is the old problem of replacing density functions by distribution functions. You may like to try doing this. (The idea for independence is to compute the probability that $(X, Y)$ lies within a rectangle.)

We prove the linearity property of expectation given by (2) and leave it to you to show that (4) and the rest of (2) also generalize. For simplicity assume $f(x, y)$ exists. Then

$$E(aX + bY) = \iint (ax + by) f(x, y)\,dx\,dy = a \int x \left[\int f(x, y)\,dy\right] dx + b \int y \left[\int f(x, y)\,dx\right] dy = aE(X) + bE(Y).$$
PROBLEMS

1. Give the proofs asked for in the text.

2. If $X$ and $Y$ are independent random variables with density functions $f$ and $g$, show that $Z = X + Y$ has density function

$$h(z) = \int_{-\infty}^{+\infty} f(x)\,g(z - x)\,dx.$$

3. Suppose that you are running a business in a service industry where demand fluctuates. (Examples include freight hauling and telephone repair.) Suppose that the wage rate is $r$ dollars per hour and the overtime rate is $s$. You contract with employees for a total of $N$ hours at the wage rate and fill any unsatisfied demand by paying overtime wages. Let $X$ be a random variable equal to the number of service hours demanded.

(a) If $X$ has a density function $f(x)$, show that your expected wage costs are

$$rN + s \int_N^{\infty} (x - N) f(x)\,dx.$$

(b) Show that this is a minimum when $N$ is chosen so that $\Pr\{X > N\} = r/s$.

(c) Deduce the result in (b) without assuming that $X$ has a density function.
A.5. THE NORMAL DISTRIBUTION

DEFINITION. The normal distribution with mean $\mu$ and variance $\sigma^2$ is given by

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right),$$

where $\exp(z) = e^z$.

You should verify the claims implicit in this definition; that is,

$$\int f(x)\,dx = 1, \qquad \int x f(x)\,dx = \mu, \qquad \int (x - \mu)^2 f(x)\,dx = \sigma^2.$$

You may need a table of integrals. For a normally distributed random variable $X$, the standard deviation $\sigma$ provides a measure of deviation from $\mu$ that is more precise than Chebyshev's inequality, namely,

$$\Pr\{|X - \mu| \le c\sigma\} = \frac{1}{\sqrt{2\pi}} \int_{-c}^{c} e^{-x^2/2}\,dx. \tag{9}$$
You should prove this. The importance of the normal distribution stems from the fact that sums of random variables tend to be normally distributed. Consequently experimental errors are often roughly normally distributed, because they are the sum of many small effects. For biological traits such as size, the effects of genes seem often to be roughly multiplicative, and so the logarithm of size tends to be normally distributed within the adult population of a species. These vague statements can be made mathematically precise. The result is known as the central limit theorem or, more accurately, central limit theorems, since there is more than one. We consider a simple one.

THEOREM. Suppose that $X_1, X_2, \ldots$ are independent random variables. Let $S_n = X_1 + \cdots + X_n$. Suppose

$$\max_{1 \le i \le n} \frac{\sigma^2(X_i)}{\sigma^2(S_n)} \to 0 \quad \text{as } n \to \infty. \tag{10}$$

Define $Z_n = [S_n - E(S_n)]/\sigma(S_n)$ and let $F_n$ be the distribution function for $Z_n$. Then for every $z$,

$$\lim_{n \to \infty} F_n(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2}\,dt. \tag{11}$$
Since $\sigma^2(S_n) = \sum \sigma^2(X_i)$ by (4), assumption (10) ensures that as $n \to \infty$ no single $X_i$ makes a significant contribution to the variance of $S_n$. Conclusion (11) essentially says that $Z_n$ tends to be normally distributed when $n$ is large. The Bernoulli trials of Section A.3 provide a simple illustration of the theorem. In this case the $X_i$ are independent, identically distributed random variables. Thus $\sigma^2(S_n) = n\sigma^2(X_i)$, and so (10) holds and conclusion (11) applies.
A refinement of this result can be used to obtain asymptotic information about the binomial coefficients, because of (5).

Another important property of the normal distribution is that, if $X_1, \ldots, X_n$ are independent and normally distributed with means $\mu_i$ and variances $\sigma_i^2$, then $X_1 + \cdots + X_n$ is also normally distributed [with mean $\mu_1 + \cdots + \mu_n$ and variance $\sigma_1^2 + \cdots + \sigma_n^2$ by (2) and (4)]. It suffices to prove this for $n = 2$, since the rest follows easily by induction. By Problem A.4.2,

$$\frac{1}{2\pi\sigma_1\sigma_2} \int_{-\infty}^{+\infty} \exp\left(-\frac{(t - \mu_1)^2}{2\sigma_1^2} - \frac{(x - t - \mu_2)^2}{2\sigma_2^2}\right) dt$$

is the density function for $X_1 + X_2$. Using the identity

$$(At + B)^2 + (Ct + D)^2 = (A^2 + C^2)\left(t + \frac{AB + CD}{A^2 + C^2}\right)^2 + \frac{(BC - AD)^2}{A^2 + C^2}$$

with $A = 1/\sigma_1$, $B = -\mu_1/\sigma_1$, $C = 1/\sigma_2$, and $D = (\mu_2 - x)/\sigma_2$, we have

$$\frac{1}{\sqrt{2\pi(\sigma_1^2 + \sigma_2^2)}} \exp\left(-\frac{(x - \mu_1 - \mu_2)^2}{2(\sigma_1^2 + \sigma_2^2)}\right),$$

which turns out to be the density function for a normal distribution with the correct mean and variance.

To change the subject, suppose that we wish to estimate some number $m$. It may be the expected value of some random variable, and our estimation procedure may be Monte Carlo simulation. It may be a physical constant
and our estimation procedure may be experimental measurement. At any rate, after $n$ trials we obtain estimates $x_i$ of $m$. It seems reasonable to take $\bar{x} = \sum x_i/n$ as an estimate for $m$. How accurate can we expect it to be? Suppose that the $x_i$ are obtained from independent observations where the distribution function is $F$ and the mean and variance are $m$ and $s^2$, respectively. Let $X_i$ be independent random variables with distribution function $F$. Then by (2) and (4),

$$E\left(\frac{\sum X_i}{n}\right) = m, \qquad \sigma^2\left(\frac{\sum X_i}{n}\right) = \frac{s^2}{n}.$$

By the central limit theorem, $\sum X_i/n$ is approximately normally distributed with mean $m$ and variance $s^2/n$, and so by (9)

$$\Pr\left\{\left|\frac{\sum X_i}{n} - m\right| \le \frac{cs}{\sqrt{n}}\right\} \approx \frac{1}{\sqrt{2\pi}} \int_{-c}^{c} e^{-x^2/2}\,dx.$$

Thus we expect our error to decrease like the reciprocal of the square root of the number of trials: to halve the error we need about four times as many trials. See the introductory part of Section 5.2 for further discussion.
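The $1/\sqrt{n}$ behavior can be seen in a small experiment. The following is only a sketch under assumptions that are not in the text: the quantity estimated is $m = E(U^2) = 1/3$ for $U$ uniform on $[0, 1]$, the seed is arbitrary, and the helper name `mean_and_standard_error` is made up for illustration. It reports $\bar{x}$ together with the estimate $s/\sqrt{n}$ of its standard deviation.

```python
import math
import random


def mean_and_standard_error(samples):
    """Return the sample mean and the estimated standard error s / sqrt(n)."""
    n = len(samples)
    mean = sum(samples) / n
    # Unbiased sample variance, used as the estimate of s^2.
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return mean, math.sqrt(variance / n)


if __name__ == "__main__":
    rng = random.Random(1)  # fixed (arbitrary) seed so a run can be repeated
    for n in (100, 400, 1600, 6400):
        # Illustrative target: m = E(U^2) = 1/3 for U uniform on [0, 1].
        draws = [rng.random() ** 2 for _ in range(n)]
        mean, err = mean_and_standard_error(draws)
        print(f"n={n:5d}  estimate={mean:.4f}  standard error={err:.4f}")
```

Each fourfold increase in $n$ should roughly halve the reported standard error, which is the square-root law in action.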
PROBLEMS

1. Show that $X$ is normally distributed with mean 0 and variance 1 if and only if $\sigma X + \mu$ is normally distributed with mean $\mu$ and variance $\sigma^2$.

2. Suppose that $X$ is normally distributed with mean $\mu$ and variance $\sigma^2$. Sketch the density function for $X$.
A.6. GENERATING RANDOM NUMBERS

In Section 5.2 I briefly discussed the generation of random numbers and provided a table of 3000 random digits. I'll treat the subject further here. There are two distinct approaches to automatically generating random numbers. The first is physical: A device is used to produce 'noise' which is then translated into numbers. Examples include a noise tube and a pointer which is spun. The second method, which is the topic of this section, is to use a mathematical procedure to generate numbers which appear to be random. Numbers created in this way are not truly random, because they are produced in a repeatable manner. In fact, the numbers produced by such methods cycle, but the period of any decent method is so large as to present no problem.

The idea is to devise a function $f$ that maps the integers between 0 and $M$ onto themselves and then, starting with $x_0$, compute $x_1 = f(x_0)$, $x_2 = f(x_1)$, .... Hopefully this will go through most of the integers between 0 and $M$ in some seemingly random fashion. One can then use a function $g$ to obtain random numbers of any desired sort. One objection to this procedure is that, if I tell you a random number $x_n$, then you can tell me its successors. This can be avoided by using certain digits of $x_n$ to produce the random number and using other digits of $x_n$ to compute $x_{n+1}$.

Here is a method for producing random numbers between 0 and 999 on a hand calculator. [For a discussion of this and many other methods for generating and testing random numbers, see D. Knuth (1969).] Choose any eight-digit number $x$ ending in 1, 3, 7, or 9. (Leading digits may be zeroes.) Define $f(x)$ to be the rightmost five digits of $x$ times 963 and use the leftmost three digits of $x$ (considering $x$ to be an eight-digit number) as the random number. This can be simplified by replacing $x$ by $x/10^5$:

1. Choose an eight-digit number $x_0$ of the form $d_1d_2d_3.d_4d_5d_6d_7d_8$, where $d_8$ is 1, 3, 7, or 9.

2. Define $r_n$ to be the integer part of $x_n$ and define $x_{n+1}$ to be the fractional part of $x_n$ multiplied by the number 963.
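The two steps above can be sketched in a few lines of Python. This is an illustration, not a recommended generator: the function name `hand_calculator_randoms` is invented here, and the state is held as the integer $10^5 x_n$ (an implementation choice, not something the text prescribes) so that no floating-point rounding enters.

```python
def hand_calculator_randoms(seed, count, multiplier=963):
    """Yield `count` numbers r_0, r_1, ... between 0 and 999.

    `seed` is the eight-digit number d1 d2 ... d8 written without the
    decimal point, so x_0 = 000.12347 is passed as 12347.
    """
    if not 0 <= seed < 10**8:
        raise ValueError("seed must have at most eight digits")
    if seed % 10 not in (1, 3, 7, 9):
        raise ValueError("the last digit must be 1, 3, 7, or 9")
    scaled = seed  # holds 10**5 * x_n as an exact integer
    for _ in range(count):
        yield scaled // 100_000                   # r_n: integer part of x_n
        scaled = (scaled % 100_000) * multiplier  # x_{n+1}: fractional part times 963


if __name__ == "__main__":
    print(list(hand_calculator_randoms(12347, 4)))
```

Because the fractional part is at most 0.99999, the product with 963 always fits in eight digits, which is why the scheme suits a hand calculator.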
To illustrate, $x_0 = 0.12347$ leads to the sequence $x_1 = 118.90161$, $x_2 = 868.25043$, $x_3 = 241.16409$, and so on. The first four random numbers are 000 118 868 241.

In Section 5.2 it was pointed out that, if $X$ is uniformly distributed on $[0, 1]$, then $Y = F^{-1}(X)$ has the distribution function $F$. To see this note that, since $F$ is monotonic,

$$\Pr\{Y \le y\} = \Pr\{F(Y) \le F(y)\} = \Pr\{X \le F(y)\},$$

which equals $F(y)$, since $X$ is uniformly distributed on $[0, 1]$. Since $F^{-1}$ is not easily computed for the normal distribution, a table or the central limit theorem should be used. To use the latter, simply generate a sequence of random numbers and apply the theorem to them. For example, if $X_1, \ldots, X_n$ are generated to be uniformly distributed on $[0, 1]$,

$$\left(X_1 + \cdots + X_n - \frac{n}{2}\right)\sqrt{\frac{12}{n}}$$

is approximately normally distributed with mean 0 and variance 1. A convenient and almost certainly large enough value for $n$ is 12. Here is a
simple table based on $F^{-1}$ for the normal distribution. I recommend using it when doing calculations by hand or on a hand calculator.

Y1 \ Y2    0     1     2     3     4     5     6     7     8     9
0, 1      0.00  0.02  0.05  0.08  0.10  0.13  0.15  0.18  0.20  0.23
2, 3      0.25  0.28  0.31  0.33  0.36  0.39  0.41  0.44  0.47  0.50
4, 5      0.52  0.55  0.58  0.61  0.64  0.67  0.70  0.74  0.77  0.81
6, 7      0.84  0.88  0.92  0.95  0.99  1.04  1.08  1.13  1.18  1.23
8, 9      1.28  1.34  1.41  1.48  1.56  1.64  1.8   1.9   2.1   2.3

It is used as follows. Generate two random digits $Y_1$ and $Y_2$. If $Y_1 = Y_2 = 0$, reject the pair and try again. Find $Y_1$ in the leftmost column and $Y_2$ in the top row. Read off the number $X$, changing its sign if $Y_1$ is odd. This is normally distributed with mean 0 and variance 1. Hence $\sigma X + \mu$ is normally distributed with mean $\mu$ and variance $\sigma^2$.
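Both ways of producing normal variates described in this section, the sum of uniform numbers and the table lookup, can be sketched as follows. The function names are invented for illustration, the table values are copied from the table above, and Python's `random.Random` stands in for whatever source of uniform numbers is actually available.

```python
import math
import random

# Rows are selected by Y1 // 2 and columns by Y2.
HALF_NORMAL_TABLE = [
    [0.00, 0.02, 0.05, 0.08, 0.10, 0.13, 0.15, 0.18, 0.20, 0.23],
    [0.25, 0.28, 0.31, 0.33, 0.36, 0.39, 0.41, 0.44, 0.47, 0.50],
    [0.52, 0.55, 0.58, 0.61, 0.64, 0.67, 0.70, 0.74, 0.77, 0.81],
    [0.84, 0.88, 0.92, 0.95, 0.99, 1.04, 1.08, 1.13, 1.18, 1.23],
    [1.28, 1.34, 1.41, 1.48, 1.56, 1.64, 1.8, 1.9, 2.1, 2.3],
]


def table_normal(y1, y2):
    """Look up a standard normal value from two random digits.

    Returns None for the rejected pair (0, 0).
    """
    if y1 == 0 and y2 == 0:
        return None
    value = HALF_NORMAL_TABLE[y1 // 2][y2]
    return -value if y1 % 2 == 1 else value  # odd Y1 flips the sign


def sum_of_uniforms_normal(rng, n=12):
    """Approximate standard normal value from n uniform numbers on [0, 1]."""
    total = sum(rng.random() for _ in range(n))
    return (total - n / 2) * math.sqrt(12 / n)


if __name__ == "__main__":
    rng = random.Random(0)  # arbitrary seed, for repeatability only
    print(table_normal(rng.randrange(10), rng.randrange(10)))
    print(sum_of_uniforms_normal(rng))
```

With $n = 12$ the scale factor $\sqrt{12/n}$ equals 1, so the second method reduces to adding twelve uniform numbers and subtracting 6, which is why 12 is such a convenient choice.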
A.7. LEAST SQUARES

The racing shell model in Section 2.1 predicts a relationship of the form $h(x) = Cx^{-1/9}$ where $x$ is the number of oarsmen and $h(x)$ is the best possible time in a race. Of course this is only approximate, since shell designs do not quite fit the model we proposed. Furthermore, we can only estimate the best possible times by using data which may be biased by such things as nonideal team performance, currents, and winds. Thus we obtain for various values of $x$ (namely 1, 2, 4, and 8) estimates $y$ for $h(x)$. What value of $C$ gives the best fitting curve? How good is the exponent $-\frac{1}{9}$, that is, what is the best fitting curve of the form $Cx^m$? In general, we have a function $h(x)$ depending on certain parameters and we have estimates $y_i$ of $h(x_i)$. We wish to determine the best values for the parameters. What should we do?

To make any progress, we need to make some additional assumptions. Let's start with a simple situation and then return to the racing shell problem. In Section 2.2 we predicted that for a perfect pendulum in a fixed gravitational field, the period is $\tau = C(\theta)\sqrt{l}$ where $\theta$ is angle of swing, $l$ is length, and $C$ is an unknown function. Let's test this by constructing pendulums of various lengths, starting them swinging at some fixed angle $\theta_0$, and
measuring the period. We can then plot $\tau$ versus $\sqrt{l}$ and see if we obtain a straight line. Of course, there will be errors in measuring $\tau$ and $l$, and in setting the angle of swing equal to $\theta_0$. (The fact that the pendulum is not perfect can probably be neglected. See Section 9.2.) From another point of view, we are making errors in estimating $C(\theta_0)\sqrt{l}$, both by measuring at the wrong point ($\theta_0$ and $l$ in error) and by measuring $\tau$ incorrectly. This suggests that after many repetitions with a given $l$ we might obtain estimates which are normally distributed about the predicted value $C(\theta_0)\sqrt{l}$. In other words, $\tau(l)$ is normally distributed with mean $C(\theta_0)\sqrt{l}$ and unknown variance $\sigma^2(l)$. We make measurements for various $l_i$ and thereby obtain pairs $(l_i, \tau_i)$ where $\tau_i$ is sampled from a normal distribution with mean $C(\theta_0)\sqrt{l_i}$ and variance $\sigma_i^2 = \sigma^2(l_i)$. What is the best estimate for $C(\theta_0)$? We can interpret 'best' to mean 'estimate which maximizes the probability of being close to the observed values.' Let $\varepsilon_i > 0$ be very small. The probability that a sampled value would be within $\varepsilon_i$ of $\tau_i$ is approximately

$$\frac{2\varepsilon_i}{\sqrt{2\pi}\,\sigma_i} \exp\left(-\frac{(\tau_i - C(\theta_0)\sqrt{l_i})^2}{2\sigma_i^2}\right).$$

If the observations are independent, we may multiply this probability for various values of $i$ to obtain the joint probability. If the $\varepsilon_i$ and $\sigma_i$ are independent of the parameter $C(\theta_0)$, this joint probability will be a maximum when

$$L = \sum_i \frac{(\tau_i - C(\theta_0)\sqrt{l_i})^2}{2\sigma_i^2}$$

is a minimum. We can find $C(\theta_0)$ to minimize this by setting $\partial L/\partial C(\theta_0)$ equal to zero and solving for $C(\theta_0)$. This approach is stated in general form in the following theorem.

THEOREM. Least squares. If $Y_i$ are independent, normally distributed random variables with means $h(x_i)$ and variances $\sigma_i^2$ independent of $h$, then the probability of each $Y_i$ simultaneously being within $\varepsilon_i$ of $y_i$ is maximized by selecting the function $h$ for which

$$\sum_i \frac{(y_i - h(x_i))^2}{\sigma_i^2} \tag{12}$$

is a minimum.
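For the pendulum example that precedes the theorem, the minimization can be carried out explicitly. The following worked step is a sketch that writes $C$ for $C(\theta_0)$ and treats the $\sigma_i$ as known constants, which is an assumption of the sketch rather than something one can usually arrange:

```latex
\frac{\partial L}{\partial C}
  = -\sum_i \frac{\sqrt{l_i}\,\bigl(\tau_i - C\sqrt{l_i}\bigr)}{\sigma_i^2} = 0
\quad\Longrightarrow\quad
C = \frac{\sum_i \tau_i \sqrt{l_i}/\sigma_i^2}{\sum_i l_i/\sigma_i^2}.
```

If all the $\sigma_i$ are equal they cancel, leaving $C = \sum_i \tau_i\sqrt{l_i} \big/ \sum_i l_i$, the slope of the least squares line through the origin in a plot of $\tau$ against $\sqrt{l}$.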
Usually the assumptions of the theorem cannot be verified (in fact, they are usually incorrect), and the variances cannot be estimated. The usual procedure is to apply the theorem anyway and assume that all the variances are equal. Thus we minimize

$$\sum_i (y_i - h(x_i))^2 \tag{13}$$

in most cases. This is what would be done in the pendulum problem discussed before the theorem.

Let's apply the theorem to the racing shell problem. Let $Y_i$ be the best observed time for a shell with $x_i$ men. Of course, we cannot hope to verify the hypotheses of the theorem or estimate $\sigma_i$. We make the usual assumption that the theorem holds and the $\sigma_i$ are equal: We assume that the $Y_i$ are independent normally distributed random variables with means $Cx_i^{-1/9}$ and equal variances. We wish to minimize (13) where $h(x_i) = Cx_i^{-1/9}$ and $y_i$ is an observed best time. By setting the partial derivative with respect to $C$ equal to zero we obtain

$$\sum_i x_i^{-1/9}\left(y_i - Cx_i^{-1/9}\right) = 0. \tag{14}$$

This is a linear equation in $C$, so it is easily solved when the values of $x_i$ and $y_i$ are known.

Instead of looking at $x$ versus $y$ as in (14), we can consider $\log x$ versus $\log y$ as suggested in Section 2.1. Then $y_i$ in the theorem is the logarithm of the time, $x_i$ is the logarithm of the number of men, and $h(x_i) = \log C - x_i/9$. However, we have already set $y_i$ equal to the time and $x_i$ equal to the number of men. We will keep this notation rather than the notation of the theorem. Thus we wish to minimize

$$\sum_i \left[\log y_i - K + \frac{\log x_i}{9}\right]^2,$$

where $K = \log C$. In this case we've assumed that $\log Y_i$ is normally distributed with mean $\log C - (\log x_i)/9$ and variance independent of $i$. This is inconsistent with our assumptions about $Y_i$ leading to (14). Setting the partial derivative with respect to $K$ equal to zero we obtain

$$\sum_i \left[\log y_i - K + \frac{\log x_i}{9}\right] = 0. \tag{15}$$
Equations (14) and (15) give different values for $C$ (see the accompanying table). Which is correct? Probably neither one, since our assumptions about $Y_i$ and $\log Y_i$ are assuredly wrong; however, both give fairly good fits to the data and the fits are about the same.

Now suppose that we want to fit the exponent as well, that is, find the best $h(x)$ having the form $Cx^{-r}$. In this case, the second method is preferable. This is not for any theoretical reason, but simply because it is much easier to find the values of $C$ and $r$ that minimize (13) in this case. The equations for $K = \log C$ and $r$ are

$$\sum_i \left[\log y_i - K + r\log x_i\right] = 0, \qquad \sum_i \left[\log y_i - K + r\log x_i\right]\log x_i = 0. \tag{16}$$
The following table compares values obtained by the various methods. The data comes from Table 1 in Chapter 2. Since different races may be run under different conditions, it was not clear how I should interpret 'best time.' Should I do separate fits for each of the four races? A fit to the average of the best times of the four races? A fit to the overall best time? I fit the average best time and the overall best time. Once $C$ and $r$ have been determined using (14), (15), or (16), it is possible to compute $h(x_i)$. This I have also done. Note that the fit is fairly good, and the estimates for $r$ via (16) support the model's prediction that $r = \frac{1}{9}$.

              Average Best Time                 Overall Best Time
          Observed  (14)   (15)   (16)      Observed  (14)   (15)   (16)
C                   7.44   7.35   7.29                7.31   7.21   7.21
r                   1/9    1/9    0.104               1/9    1/9    0.111
x = 1       7.22    7.44   7.35   7.29        7.16    7.31   7.21   7.21
x = 2       6.88    6.89   6.81   6.78        6.77    6.77   6.68   6.68
x = 4       6.34    6.38   6.30   6.31        6.13    6.27   6.18   6.18
x = 8       5.84    5.91   5.83   5.88        5.73    5.80   5.72   5.72
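The logarithmic fits (15) and (16) are easy to reproduce. The sketch below uses the observed times from the table above; the function names are invented for illustration, and the closed forms are the standard solutions of the linear equations (15) and (16) rather than a procedure spelled out in the text.

```python
import math


def fit_fixed_exponent(xs, ys, exponent=1 / 9):
    """Solve (15): K is the mean of log y_i + exponent * log x_i. Returns C = e^K."""
    k = sum(math.log(y) + exponent * math.log(x) for x, y in zip(xs, ys)) / len(xs)
    return math.exp(k)


def fit_power_law(xs, ys):
    """Solve (16) for K = log C and r in log y = K - r log x. Returns (C, r)."""
    u = [math.log(x) for x in xs]
    v = [math.log(y) for y in ys]
    n = len(xs)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    s_uv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    s_uu = sum((a - u_bar) ** 2 for a in u)
    r = -s_uv / s_uu       # from the second equation of (16)
    k = v_bar + r * u_bar  # from the first equation of (16)
    return math.exp(k), r


if __name__ == "__main__":
    oarsmen = [1, 2, 4, 8]
    average_best = [7.22, 6.88, 6.34, 5.84]
    print("C from (15):", round(fit_fixed_exponent(oarsmen, average_best), 2))
    c, r = fit_power_law(oarsmen, average_best)
    print("C, r from (16):", round(c, 2), round(r, 3))
```

Working with logarithms turns the power law into a straight line, so (16) is ordinary linear regression of $\log y$ on $\log x$; that is the practical reason the second method is easier when the exponent is unknown.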
A.8. THE POISSON AND EXPONENTIAL DISTRIBUTIONS

Two closely related distributions are the Poisson, a discrete distribution given by $\Pr\{X = k\} = e^{-\lambda}\lambda^k/k!$, and the exponential, a continuous distribution given by $\Pr\{T \le t\} = 1 - e^{-\nu t}$. The Poisson has mean and variance $\lambda$; the exponential has mean $1/\nu$ and variance $1/\nu^2$. Prove it. The exponential is associated with waiting times between rare events, and the Poisson with the number of rare events in a given time interval. The following examples illustrate this.

1. Suppose we distribute $N\lambda$ items into $N$ boxes. Let $X$ be the number of items in the $i$th box. If the items are distributed independently and each box is equally likely to be chosen, $\Pr\{X = k\} \to e^{-\lambda}\lambda^k/k!$ as $N \to \infty$.

2. Suppose that in a small time interval $\Delta t$ an event has probability $\nu\,\Delta t$ of occurring, independent of what has happened in the past. The waiting time $T$ between two successive occurrences is exponentially distributed.

3. Closely related to this is failure of a product. If the probability of failure in the time interval $\Delta t$ is $\nu\,\Delta t$ given that the product hasn't failed up to that time, the waiting time to failure is exponentially distributed.

4. Let's return to example 2. Let $X$ be the number of occurrences of the event between $t$ and $t + T$. Then $X$ is Poisson distributed with $\lambda = T\nu$, where $\nu$ is the parameter of the exponential distribution in example 2.
These examples merit more discussion. We can think of example 1 as a Bernoulli trial situation. If an item is placed in the $i$th box, this is a success. We then have

$$\Pr\{X = k\} = \binom{N\lambda}{k} q^{N\lambda - k} p^k,$$

where $p = 1/N$ and $q = 1 - p$. The claim in example 1 follows from

$$\binom{N\lambda}{k} N^{-k} \to \frac{\lambda^k}{k!} \quad \text{and} \quad q^{N\lambda - k} \to e^{-\lambda} \quad \text{as } N \to \infty.$$

Hence the Poisson is a limiting case of Bernoulli trials. The exponential is obtained similarly as a limit. In example 2, $T$ is simply the waiting time to the first success. Consider a situation in which the time between Bernoulli trials is $\Delta t$ and the probability of success is $p = \nu\,\Delta t$. The probability of a first success at time $\Delta t[T/\Delta t]$ is $q^{[T/\Delta t]}p$. (The square brackets here denote 'largest integer not exceeding.') For small $\Delta t$ this is approximately $\nu e^{-\nu T}\Delta t$. Hence $f(T) = \nu e^{-\nu T}$.

The relationship between the exponential and Poisson distributions asserted in example 4 is easily proved: In each time interval $\Delta t$, the probability of success is $\nu\,\Delta t$, so after $N$ time intervals the probability of $k$ successes is $\binom{N}{k} q^{N-k} p^k$. Setting $\Delta t = T/N$ and letting $N \to \infty$, we obtain the desired result.
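Both limits in this discussion can be checked numerically. The sketch below is an illustration with invented function names and arbitrarily chosen parameter values ($\lambda = 2$, $k = 3$, $\nu = 2$, $T = 1.5$); it compares the Bernoulli-trial expressions with their claimed Poisson and exponential limits.

```python
import math


def binomial_pmf(n, k, p):
    """Probability of exactly k successes in n Bernoulli trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(lam, k):
    """Poisson probability e^(-lam) * lam^k / k!."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def first_success_density(nu, t, dt):
    """q^[t/dt] * p / dt with p = nu * dt: the discrete stand-in for f(t)."""
    p = nu * dt
    return (1 - p) ** math.floor(t / dt) * p / dt


if __name__ == "__main__":
    lam, k = 2, 3
    for n_boxes in (10, 100, 10_000):
        # N * lam items, each landing in the chosen box with probability 1 / N.
        print(n_boxes, binomial_pmf(n_boxes * lam, k, 1 / n_boxes), poisson_pmf(lam, k))
    nu, t = 2.0, 1.5
    for dt in (1e-1, 1e-3, 1e-5):
        print(dt, first_success_density(nu, t, dt), nu * math.exp(-nu * t))
```

As the number of boxes grows, or as $\Delta t$ shrinks, the two printed columns should approach each other.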
REFERENCES

Numbers at the end of a reference refer to chapters and sections in which the item is mentioned. An asterisk indicates that a book is not very specialized and may be of particular interest to you.

Ackoff, R. L. and M. W. Sasieni (1968). Fundamentals of Operations Research. Wiley. 4.1.
Aitchison, J. and J. A. C. Brown (1963). The Lognormal Distribution with Special Reference to Its Uses in Economics. Cambridge University Press. 10.
Almond, J. (1965). Proceedings of the Second International Symposium on the Theory of Road Traffic Flow, London, 1963. The Organisation for Economic Co-operation and Development, Paris. 9.3.
Altman, P. L. and D. M. Dittmer, eds. (1964). Biology Data Book. Federation of American Societies for Experimental Biology. 2.1.
Armstrong, J. S. (1967). Derivative of theory by means of factor analysis or Tom Swift and his electric factor machine. Amer. Stat. 21: 17-21. Reprinted in R. L. Day and L. J. Parsons, eds. (1971). Marketing Models. Intext: 413-421. 1.5.
Arrow, K. J. (1963). Social Choice and Individual Values (2nd ed.). Cowles Foundation Monograph 12. Wiley. 6.
*Ashton, W. D. (1966). The Theory of Road Traffic Flow. Methuen. 9.3.
Audley, R. J. (1960). A stochastic model for individual choice behavior. Psychol. Rev. 67: 1-15. Reprinted in R. D. Luce, R. R. Bush, and E. Galanter, eds. (1963). Readings in Mathematical Psychology. Vol. 1. Wiley: 263-277. 5.1.
Bailey, N. T. J. (1976). The Mathematical Theory of Infectious Diseases and Its Applications (2nd ed.). Hafner. 9.2.
Barker, S. B., G. Cumming, and K. Horsfield (1973). Quantitative morphometry of the branching structure of trees. J. Theor. Biol. 40: 33-43. 5.2.
Bartlett, A. A. (1973). The Frank C. Walz lecture halls: A new concept in the design of lecture auditoria. Amer. J. Phys. 41: 1233-1240. 1.5.
Bartlett, M. S. (1972). Epidemics. In J. M. Tanur et al., eds. Statistics: A Guide to the Unknown. Holden-Day: 66-76. 8.2.
*Bass, F. M. et al., eds. (1961). Mathematical Models and Methods in Marketing. Irwin.
Baylis, J. (1973). The mathematics of a driving hazard. Math. Gaz. 57: 23-26. 8.1.
Bender, E. A. and L. P. Neuwirth (1973). Traffic flow: Laplace transforms. Amer. Math. Mon. 80: 417-423. 9.3.
*Bochner, S. (1966). The Role of Mathematics in the Rise of Science. Princeton University Press. 1.1.
*Boot, J. C. G. (1967). Mathematical Reasoning in Economics and Management Science. Prentice-Hall. 3.3.
Boyce, W. E. and R. C. DiPrima (1969). Elementary Differential Equations and Boundary Value Problems. Wiley. 9.4.
Brauer, F. (1972). The nonlinear simple pendulum. Amer. Math. Mon. 79: 348-355. 9.4.
Brauer, F. and J. A. Nohel (1969). Qualitative Theory of Ordinary Differential Equations. Benjamin. 9.0.
Braun, M. (1975). Differential Equations and Their Applications. Springer-Verlag. 9.2.
Bridgman, P. W. (1931). Dimensional Analysis. Yale University Press. 2.2.
Brock, V. E. and R. H. Riffenburgh (1959). Fish schooling: A possible factor in reducing predation. J. Cons. Perm. Int. Explor. Mer 25: 307-317. 6.
Brown, A. A., F. T. Hulswit, and J. D. Kettelle (1956). A study of sales operations. Oper. Res. 4: 296-308. Reprinted in R. L. Day and L. J. Parsons (1971, pp. 3-16). 1.5.
Buchanan, N. S. (1939). A reconsideration of the cobweb theorem. J. Polit. Econ. 47: 67-81. Reprinted in R. V. Clemence, ed. (1950). Readings in Economic Analysis. Vol. 1. Addison-Wesley: 46-60. 3.3.
Bush, R. R. and F. Mosteller (1959). A comparison of eight models. In R. R. Bush and W. K. Estes, eds. Studies in Mathematical Learning Theory. Stanford University Press: 293-307. Reprinted in P. F. Lazarsfeld and N. W. Henry (1966, pp. 335-349). 5.1.
*Carrier, G. F. (1966). Topics in Applied Mathematics. Vol. 1. (Notes by N. D. Fowkes.) Mathematical Association of America. 8.2.
Carson, R. L. (1961). The Sea around Us (rev. ed.). Oxford University Press. 2.2.
Chandler, R. E., R. Herman, and E. W. Montroll (1958). Traffic dynamics: Studies in car following. Oper. Res. 6: 165-184. 9.3.
Clark, C. W. (1973). The economics of overexploitation. Science 181: 630-634. 4.1.
Clark, C. W. (1976). Mathematical Bioeconomics. Wiley-Interscience. 4.1.
Clarke, L. E. (1971). How long is a piece of string? Math. Gaz. 55: 404-407. 10.
Coale, A. J. (1971). Age patterns of marriage. Popul. Stud. (London) 25: 193-214. 8.1.
Cohen, J. E. (1966). A Model of Simple Competition. Harvard University Press. 10.
Cohen, J. E. (1971). Casual Groups of Monkeys and Men: Stochastic Models of Elemental Social Systems. Harvard University Press. 5.1.
*Cohen, K. J. and R. M. Cyert (1965). Theory of the Firm: Resource Allocation in a Market Economy. Prentice-Hall. 3.2, 4.2.
Cohn, T. E. and D. J. Lasley (1976). Binocular vision: Two possible central interactions between signals from two eyes. Science 192: 561-563. 6.
*Coleman, J. S. (1964). Introduction to Mathematical Sociology. Macmillan. 8.1.
Coleman, J. S. and J. James (1961). The equilibrium size distribution of freely-forming groups. Sociometry 24: 36-45. 5.1.
Crank, J. (1962). Mathematics and Industry. Oxford University Press. 1.5.
Cronin, J. (1977). Some mathematics of biological oscillations. SIAM Rev. 19: 100-138. 9.4.
Cundy, H. M. (1971). Getting it taped. I. Math. Gaz. 55: 43-47. 6.
Cyert, R. M. and J. G. March (1963). A Behavioral Theory of the Firm. Prentice-Hall. 3.2.
Davis, D. D. (1962). Allometric relationships in lions vs. domestic cats. Evolution 16: 505-514. 2.1.
Day, R. L. and L. J. Parsons, eds. (1971). Marketing Models: Quantitative Applications. Intext.
Douglas, J. F. (1969). An Introduction to Dimensional Analysis. Pitman. 2.2.
*Engineering Concepts Curriculum Project (1971). The Man-Made World. McGraw-Hill. A good high school text. 1.2.
Epstein, B. (1947). The mathematical description of certain breakage mechanisms leading to the logarithmico-normal distribution. J. Franklin Inst. 244: 471-477. 10.
Evans, J. W., P. D. Wagner, and J. B. West (1974). Conditions for reduction of pulmonary gas transfer by ventilation-perfusion inequality. J. Appl. Physiol. 36: 533-537. 6.
Ezekiel, M. (1937-1938). The cobweb theorem. Quart. J. Economics 52: 255-280. Reprinted in G. Haberler, ed. (1944). Readings in Business Cycle Theory. Irwin. 3.3.
Fantino, E. and G. S. Reynolds (1975). Introduction to Contemporary Psychology. Freeman. 2.1.
Finucan, H. M. (1976). The silver anniversary of an optimization result in rolling-mill practice. Oper. Res. 24: 373-377. 10.
Ford, L. P. (1955). Differential Equations (2nd ed.). McGraw-Hill. 9.3.
Friedman, G. M. (1958). Determination of sieve-size distribution from thin-section data for sedimentary petrological studies. J. Geol. 66: 394-416. 10.
*Gandolfo, G. (1971). Mathematical Methods and Models in Economic Dynamics. American Elsevier. 9.2, 9.4.
Gazis, D. C. and R. B. Potts (1965). The over-saturated intersection. In J. Almond (1965, pp. 221-237). 4.2.
Gilpin, M. E. (1973). Do hares eat lynx? Amer. Nat. 107: 727-730. 9.2.
Goel, N. S., S. C. Maitra, and E. W. Montroll (1971). On the Volterra and other nonlinear models of interacting populations. Rev. Mod. Phys. 43: 231-276. Reprinted (1971) as a monograph under the same title. Academic Press. 9.2.
Goodman, L. A. (1961). Some possible effects of birth control on the human sex ratio. Ann. Hum. Genet. 25: 75-81. Reprinted in P. F. Lazarsfeld and N. W. Henry (1968, pp. 311-317). 5.1.
Griffith, J. S. (1968). Mathematics of cellular control processes. I. Negative feedback to one gene. J. Theor. Biol. 20: 202-208. 8.2.
Hadley, G. and T. M. Whitin (1963). Analysis of Inventory Systems. Prentice-Hall. 4.1.
*Haight, F. A. (1963). Mathematical Models for Traffic Flow. Academic Press. 9.3.
Haldane, J. B. S. (1928). On being the right size. In J. B. S. Haldane, ed. Possible Worlds. Harper. Reprinted in J. R. Newman (1956, pp. 952-957). 2.1.
Hammersley, J. M. (1961). On the statistical loss of long-period comets from the solar system. II. In J. Neyman (1961, pp. 17-78). 5.2.
Henmon, V. A. C. (1911). The relation of the time of a judgment to its accuracy. Psychol. Rev. 18: 186-201. 5.1.
Herdan, G. (1953). Small Particle Statistics. Elsevier. 10.
Herman, R., E. W. Montroll, R. B. Potts, and R. W. Rothery (1959). Traffic dynamics: Analysis of stability in car following. Oper. Res. 7: 86-103. 9.3.
Hernes, G. (1972). The process of entry into first marriage. Amer. Sociol. Rev. 37: 173-182. 8.1.
Higgins, J. (1971). Getting it taped. II. Math. Gaz. 55: 47-48. 6.
Homans, G. (1950). The Human Group. Harcourt Brace. 3.3.
Intriligator, M. D. (1973). Strategy and arms races, an application of ordinary differential equations to problems of national security. In P. J. Knopp and G. H. Meyer (1973, pp. 253-377 and parts of 291-298). 3.2.
Jensen, A. (1966). Safety-at-sea problems. In D. B. Hertz and J. Melese, eds. Proceedings of the Fourth International Conference on Operational Research. Wiley-Interscience: 362-370. 1.2.
Karman, T. v. and M. A. Biot (1940). Mathematical Methods in Engineering. McGraw-Hill. 8.2, 9.2.
Keith, L. B. (1963). Wildlife's Ten-Year Cycle. University of Wisconsin Press. 9.2.
Kemeny, J. G. (1973). What every college president should know about mathematics. Amer. Math. Mon. 80: 889-901. 5.1.
*Kemeny, J. G. and J. L. Snell (1962). Mathematical Models in the Social Sciences. Ginn. 9.2.
Kerr, R. H. (1961). Perturbations of cometary orbits. In J. Neyman (1961, pp. 149-152). 5.2.
Kintsch, W. (1963). A response time model for choice behavior. Psychometrika 28: 27-32. 5.1.
Kleiber, M. (1961). The Fire of Life: An Introduction to Animal Energetics. Wiley. 2.1.
*Kline, S. J. (1965). Similitude and Approximation Theory. McGraw-Hill. 2.2.
Knopp, P. J. and G. H. Meyer, eds. (1973). Proceedings of a Conference on the Application of Undergraduate Mathematics in the Engineering, Life, Managerial and Social Sciences. Georgia Institute of Technology.
Knuth, D. E. (1969). The Art of Computer Programming. II: Seminumerical Algorithms. Addison-Wesley. A.6.
Kolesar, P. (1975). A model for predicting average fire engine travel times. Oper. Res. 33: 603-613. 10.
Kolesar, P., W. Walker, and J. Hausner (1975). Determining the relation between fire engine travel times and travel distances in New York City. Oper. Res. 33: 614-627. 10.
Kupperman, R. H. and H. A. Smith (1972). Strategies of mutual deterrence. Science 176: 18-23. 5.1.
Land, K. C. (1971). Some exhaustible Poisson process models of divorce by marriage cohort. J. Math. Sociol. 1: 213-232. 8.1.
Langhaar, H. L. (1951). Dimensional Analysis and the Theory of Models. Wiley. 2.2.
Larson, R. C. and K. A. Stevenson (1972). On insensitivities in urban redistricting and facility location. Oper. Res. 20: 595-612. 10.
*Lave, C. A. and J. G. March (1975). An Introduction to Models in the Social Sciences. Harper and Row. 1.5.
*Lazarsfeld, P. F. and N. W. Henry, eds. (1968). Readings in Mathematical Social Science. M.I.T. Press.
*Leigh, E. G. Jr. (1971). Adaptation and Diversity. Freeman. 1.5.
Leopold, L. B., M. G. Wolman, and J. P. Miller (1964). Fluvial Processes in Geomorphology. Freeman. 5.2.
Levary, G. (1956). A pocket-sized case study in operations research concerning inventory markdown. J. Oper. Soc. Amer. 4: 738-740. 6.
Levins, R. (1968). Evolution in Changing Environments. Princeton University Press. 1.2, 4.2.
Lin, C. C. and L. A. Segel (1974). Mathematics Applied to Deterministic Problems in the Natural Sciences. Macmillan. 1.5.
Luce, R. D. and H. Raiffa (1958). Games and Decisions. Wiley. 6, 10.
MacArthur, R. H. and E. O. Wilson (1967). The Theory of Island Biogeography. Princeton University Press. 3.2.
Mandelbrot, B. (1965). Information theory and psycholinguistics. In B. B. Wolman, ed. Scientific Psychology. Basic Books. Reprinted with additions in P. F. Lazarsfeld and N. W. Henry (1968, pp. 350-368). 10.
*Martin, M. J. C. and R. A. Denison (1970). Case Exercises in Operations Research. Wiley.
May, R. M. (1972). Limit cycles in prey-predator communities. Science 177: 900-902. 9.2.
May, R. M. (1973). Stability and Complexity in Model Ecosystems. Princeton University Press. 9.2, 9.4.
May, R. M. (1975). Biological populations obeying difference equations: Stable points, stable cycles and chaos. J. Theor. Biol. 51: 511-524. 7.4.
*Maynard Smith, J. (1968). Mathematical Ideas in Biology. Cambridge University Press. 2.1, 8.2.
McKelvey, R. D. (1973). Some theorems on electoral equilibrium under two-candidate competition. In P. J. Knopp and G. H. Meyer (1973, pp. 1-20). 4.2.
McMahon, T. A. (1971). Rowing: A similarity analysis. Science 173: 349-351. 2.1.
McMahon, T. A. (1973). Size and shape in biology. Science 179: 1201-1204. 2.1.
Metelli, F. (1974). The perception of transparency. Sci. Amer. 230(4): 90-98. 6.
Middleton, G. V. (1970). Generation of the log-normal frequency distribution in sediments. In M. A. Romanova and O. V. Sarmanov, eds. Topics in Mathematical Geology. (Translated from the Russian.) Consultants Bureau: 34-42. 10.
Mudahar, M. S. and R. H. Day (1974). A generalized cobweb model for an agricultural sector. Mathematics Research Center Technical Summary Report 1453. University of Wisconsin, Madison. 3.3.
Murdick, R. G. (1970). Mathematical Models in Marketing. Intext. 4.1.
Nash, J. F., Jr. (1950). The bargaining problem. Econometrica 18: 155-162. Reprinted in K. J. Arrow, ed. (1971). Selected Readings from Econometrica. Vol. 2. M.I.T. Press: 204-211. 6.
Neher, P. A. (1971). Economic Growth and Development. Wiley. 3.3.
Netter, F. H. The Ciba Collection of Medical Illustrations. Several volumes and dates. Ciba Pharmaceutical. 4.1.
Newell, G. F. (1962). Theories of instability in dense highway traffic. J. Oper. Res. Soc. Jap. 5(1): 9-54. 9.3.
Newman, J. R., ed. (1956). The World of Mathematics. Vol. 2. Simon and Schuster.
Neyman, J., ed. (1961). Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, 1960. Vol. 3. University of California Press.
*Noble, B. (1971). Applications of Undergraduate Mathematics in Engineering. Macmillan. 4.2, 9.2, 10.
Norris, K. S. (1967). Color adaptation in desert reptiles and its thermal relationships. In W. W. Milstead, ed. Lizard Ecology: A Symposium. University of Missouri Press: 162-229. 6.
Notari, R. E. (1971). Biopharmaceutics and Pharmacokinetics: An Introduction. Marcel Dekker. 8.2.
Otto, G. H. (1939). A modified logarithmic probability graph for the interpretation of mechanical analyses of sediments. J. Sediment. Pet. 9: 62-76. 10.
Parks, G. M. (1964). Development and application of a model for suppression of forest fires. Manage. Sci. 10: 760-766. 4.1.
Pavlidis, T. (1973). Biological Oscillators: Their Mathematical Analysis. Academic Press. 8.2, 9.4.
*Pignataro, L. J. (1973). Traffic Engineering Theory and Practice. Prentice-Hall. 9.3.
Plattner, S. (1975). Rural market networks. Sci. Amer. 232(5): 66-79. 5.2.
Pontryagin, L. S. (1962). Ordinary Differential Equations. (Translated from the Russian.) Addison-Wesley. 9.2.
Pulliam, H. R. (1973). On the advantages of flocking. J. Theor. Biol. 38: 419-422. 6.
Rainey, R. H. (1967). Natural displacement of pollution from the Great Lakes. Science 155: 1242-1243. 8.1.
*Rapoport, A. (1976). Directions in mathematical psychology. I. Amer. Math. Mon. 83: 85-106. 2.1.
Rashevsky, N. (1960). Mathematical Biophysics. Vol. 2 (3rd ed.). Dover. 2.1.
Rashevsky, N. (1964). Some Medical Aspects of Mathematical Biology. C. C. Thomas. 9.2.
Richardson, L. F. (1960). Arms and Insecurity. N. Rashevsky and E. Trucco, eds. Boxwood Press and Quadrangle Books. See also, Mathematics of war and foreign politics. In J. R. Newman (1956, pp. 1240-1253). 9.2.
Riggs, D. S. (1963). The Mathematical Approach to Physiological Problems. Williams and Wilkins. 8.1.
*Roberts, F. S. (1976). Discrete Mathematical Models. Prentice-Hall. 5.2.
*Rosen, R. (1967). Optimality Principles in Biology. Plenum. 4.1.
Rosenzweig, M. L. (1971). Paradox of enrichment: Destabilization of exploitation ecosystems in ecological time. Science 171: 385-387. This article resulted in interchanges between Rosenzweig and others. See Science 175: 562-565; 177: 902-904. 9.2.
Saaty, T. L. (1968). Mathematical Models of Arms Control and Disarmament. Wiley. 3.2, 5.1, 9.2.
Scheidegger, A. E. (1970). Theoretical Geomorphology (2nd ed.). Springer-Verlag. 5.2.
Schmidt-Nielsen, K. (1972). How Animals Work. Cambridge University Press. 2.1.
Schwartz, B. L. (1966). A new approach to stockout penalties. Manage. Sci. 12B: 538-544. 4.1.
Schwartz, B. L. and M. A. B. Deakin (1973). Walking in the rain, reconsidered. Math. Mag. 46: 272-276. 4.1.
Sedov, L. I. (1959). Similarity and Dimensional Methods in Mechanics. (Translated from the Russian 4th ed.) Academic Press. 2.2.
Shear, D. (1967). An analog of the Boltzmann H-theorem (a Liapunov function) for systems of coupled chemical reactions. J. Theor. Biol. 16: 212-228. 9.2.
Simon, H. A. (1952). A formal theory of interaction in social groups. Amer. Sociol. Rev. 17: 202-211. Reprinted in H. A. Simon (1957, pp. 99-114). 3.3.
Simon, H. A. (1955). On a class of skew distribution functions. Biometrika 42: 425-440. Reprinted in H. A. Simon (1957, pp. 145-164). 10.
*Simon, H. A. (1957). Models of Man Social and Rational. Wiley.
Spector, W. S., ed. (1956). Handbook of Biological Data. Saunders. 2.1.
Sperry, K. (1967). Water and air pollution: Two reports on cleanup efforts. Science 158: 351-355. 8.1.
Stahl, W. R. and J. Y. Gummerson (1967). Systematic allometry in five species of adult primates. Growth 31: 21-24. 2.1.
Stevens, S. S. (1974). Psychophysics. G. Stevens, ed. Wiley. 2.1.
Swintosky, J. V. (1956). Illustrations and pharmaceutical interpretations of first order drug elimination rate from the bloodstream. J. Amer. Pharm. Assoc. 45: 395-400. 8.2.
Synge, J. L. (1970). The problem of the thrown string. Math. Gaz. 54: 250-260. 10.
Tanford, C. (1961). Physical Chemistry of Macromolecules. Wiley. 8.1.
Thom, R. (1975). Structural Stability and Morphogenesis. (Translated from the French.) Benjamin. 7.3.
Toomre, A. and J. Toomre (1972). Galactic bridges and tails. Astrophys. J. 178: 623-666. 8.2.
Toomre, A. and J. Toomre (1973). Violent tides between galaxies. Sci. Amer. 229(6): 38-48. 8.2.
Tsipis, K. (1975). Physics and calculus of countercity and counterforce nuclear attacks. Science 187: 393-397. 5.1.
Tsipis, K. (1975a). The accuracy of strategic missiles. Sci. Amer. 233(1): 14-23. 3.2.
Vidale, M. L. and H. B. Wolfe (1957). An operations-research study of sales response to advertising. Oper. Res. 5: 370-381. Reprinted in R. L. Day and L. J. Parsons (1971, pp. 29-42) and in F. M. Bass et al. (1961, pp. 363-374). 8.1.
Vine, I. (1971). Risk of visual detection and pursuit by a predator and the selective advantage of flocking behavior. J. Theor. Biol. 30: 405-422. 6.
Vold, M. J. (1959). A numerical approach to the problem of sediment volume. J. Colloid Sci. 14: 168-174. 5.2.
Vold, M. J. (1959a). Sediment volume and structure in dispersions of anisometric particles. J. Phys. Chem. 63: 1608-1612. 5.2.
Weihs, D. (1973). Mechanically efficient swimming techniques for fish with negative buoyancy. J. Mar. Res. 31: 194-204. 4.1.
Wigner, E. P. (1960). The unreasonable effectiveness of mathematics in the natural sciences. Comm. Pure Appl. Math. 13: 1-14. 1.1.
Wilson, E. O. (1975). Sociobiology. Belknap. 4.2.
*Wilson, E. O. and W. H. Bossert (1971). A Primer of Population Biology. Sinauer Associates. 3.2, 4.2.
Woldenberg, M. J. (1969). Spatial order in fluvial systems: Horton's laws derived from mixed hexagonal hierarchies of drainage basin areas. Geol. Soc. Amer. Bull. 80: 97-112. 5.2.
A GUIDE TO MODEL TOPICS

Models are grouped into major categories which are capitalized and grouped by affinity. Italicized numbers refer to chapters and sections that discuss a subject. Other numbers refer to problems dealing with the subject.

ASTRONOMY
colliding galaxies 8.1.2
number of comets 5.2.4

CHEMISTRY
chemical engineering 4.2.5
polymer formation 8.1, 8.1.3
reaction stability 9.2.8
sediment volume 5.2, 5.2.1

EARTH SCIENCES
particle sizes 10
reflected energy in the desert 6
sediment volume 5.2, 5.2.1
stream networks 5.2, 5.2.5
waves 2.2.4

PHYSICS
ballistics 8.2
falling from a height 2.1.7, 8.1.5
heat flow 2.2.3
motion of a pendulum 2.2, 2.2.1, 9.2
radioactive decay 10
radioactivity and mousetraps 8.2.5
throwing strings 10.3
vibrating strings 2.2.2
water skiing 8.2
wave motion 2.2.4

ENGINEERING
chemical engineering 4.2.5, 8.1, 8.1.3
particle sizes 10
rocket design 4.1.4
scale models 2.2, 2.2.4
thermostatic control 9.3.4

TRAFFIC
car following 9.3, 9.3.1, 9.3.2
elevators 1.5.1
flow 6.5
left turn squeezes 8.1, 8.1.1
signals 4.2.4, 10.2
urban streets 1.5.5

PSYCHOLOGY AND PSYCHOPHYSICS
binocular brightness perception 6.6
perception of transparency 6.3
simple choices 5.1, 5.1.2, 5.1.3
Weber-Fechner and Stevens laws 2.1.6

HUMAN PHYSIOLOGY AND MEDICINE
drug excretion 8.1.2
epidemics 8.1.3, 9.2.10
impaired CO2 elimination 6
sex ratios 5.1, 5.1.1
speed of racing shells 2.1, 2.1.2, 2.1.3

BIOLOGY OF ORGANISMS
blood vessel design 4.1, 4.1.1
circadian rhythm 8.2.4
desert lizards and radiant energy 6
feeding Gulliver 2.1.5
how far can a bird fly? 1.5.3
insecticides 8.1.7, 9.2.2
size effects 2.1, 2.1.4, 2.1.5, 2.1.7
swimming by fish 4.1.6

BIOLOGY OF POPULATIONS
castes 4.2
herd formation 6.4
optimal phenotype 4.2
population dynamics (one species) 1.4, 1.5.8, 9.3.3
population dynamics (two species) 1.5.2, 3.3.4, 9.2, 9.2.1-9.2.3, 9.3.3, 9.4.1
species diversity 3.2, 3.2.6

APPLIED ECOLOGY
insecticide usage 9.2.2
Great Lakes pollution 8.1, 8.1.1
regulating fishing 4.1.3

POLITICAL SCIENCE
arms races 3.2, 3.2.1-3.2.5, 5.1.5, 5.1.6, 9.2.6
fair elections 6
winning votes 4.2.6

SOCIOLOGY
getting married 8.1.4
group dynamics 3.3, 3.3.3, 9.2.7
group size 5.1.7
sex preference 5.1, 5.1.1

ECONOMICS OF A FIRM
advertising 4.1.7, 8.1.6
employees 1.5, 1.5.4, 3.2.8, 4.2.3
facility location 10
inventories 4.1, 6.2, 10.4
theory of prices and production 3.2, 3.2.7
transportation 4.1.8, 5.2.3

OTHER ECONOMICS
bartering 4.2
cobweb models 3.3, 3.3.1
cost of packaging 2.1, 2.1.1
cutting beams 10.5
forest fires 4.1
Keynesian theory 9.2, 9.2.5
optimum fish harvesting 4.1.3
optimum location 4.1.5, 10
underdevelopment 3.3.5
what to buy 4.2.2

UNIVERSITIES
lecture hall design 1.5.6
student body quality 4.2.1
student body size 3.3.2, 9.2.9
tenure 5.1.4

MISCELLANEOUS
a doctor's waiting room 5.2
fighting forest fires 4.1
measuring lengths 10.1
positioning recording tapes 6.1
roasting turkeys 2.2.3
running in the rain 4.1.2
speed of racing shells 2.1, 2.1.2, 2.1.3
stringed instrument design 2.2.2
throwing strings 10.3
INDEX

Advertising, 80, 159
Airplane, stability of, 187
Anatomy and physiology, biological rhythms, 168
  blood vessel optimization, 71
  capillaries, number of, 73
  comparative, blood flow, 31
    body proportions, 26
    falling, 34
    food needed by Gulliver, 33
    jumping, 27, 28
    optimal phenotype, 84
  drug excretion, 156
  endocrine systems, 188
  lung efficiency, 127
  trees, random branching of, 115
  see also Psychophysics
Arms race, 189
  ICBMs and, 45, 56, 100, 101
Astronomy, 116, 166
Ballistics, 164
Bartering, 81
Bayes' formula, 223
Beam, cutting of hot rolled, 215
  deflection of, 27
Bernoulli trials, 224, 238
Binomial coefficients, 224
Bioeconomics, 77
Birds, migration of, 12
Blood, see Anatomy and physiology
Box, Edgeworth, 83
Breakage of particles, 208
Business, see Firms
Capillaries, number of, 73
Cartography, waves and water depth, 43
Castes, insect, 85
Central Limit Theorem, 209, 230, 233
Central Place Theory, 115
Chain reaction, 169
Chebyshev's inequality, 221
Chemical engineering, 89, 152, 157, 187
Chemical reactions, stability of, 169, 190
Choices, simple, 94, 98
Circadian rhythms, 168
Clocks, pendulum, 177
Cobweb model of supply and demand, 57
Colleges, see Universities
Comets, 116
Committee behavior, 60
Compartment model of drug excretion, 156
Competition, interspecific, 64, 184
Conservation of fish, 77
Conservation laws, 199
Cooking times, 42
Cost, marginal, 53
Countries, underdeveloped, 64
Curve fitting, 21, 26, 44, 211, 234
Curves, indifference, 83
Cycles, biological, 168, 180, 184, 188, 201
  limit, 174, 200
  supply and demand, 55
Decision making, simple, 94, 98
Demand curves, 55
Demography, 9, 14, 91
Density function, 226
Deserts, reflected sunlight in, 121, 122
Difference equations, richness of, 142
Differential equations, numerical method for, 171
Direction field, 60
Distribution function, 220, 226
  exponential, 203, 213, 237
  log normal, 207
  normal, 170, 209, 230, 233
  Poisson, 95, 102, 154, 158, 237
  Rosin's law, 207
Doctor's waiting room, 106
Dosage, drug, 156
  insecticide, 159
Drag force, 23, 34, 164
Drugs, 156
Dunes, sand, 211
Ecology, Great Lakes pollution, 144
  species diversity and habitat size, 49
  see also Population growth
Economics, Keynesian, 184, 189
Economy, national, 184, 189
Edgeworth box, 83
Elections, fair, 124
Elevators, 12
Employers, see Firms
Employees, see Firms
Engineering, chemical, 89, 152, 157, 187
Epidemics, 166, 192
Equilibrium point, 174
Equipment turnover, 81
Events, independent, 219
  simple, 218
Expectation, 220, 227
Exponential distribution, 203, 213, 237
Facility location, optimum, 79, 204
  social, 87
Falling, 34, 158
Fire, forest, 73
Fire station location, 204
Firms, advertising, 80, 159
  equipment turnover, 81
  general theory, 52
  inventory maintenance, 66, 131, 215
  loading docks, 115
  optimum location of, 79, 204
  overstock sales, 131
  package filling, 216
  packaging costs, 19
  production run length, 66
  sales force size, 10
  salesperson effectiveness, 11
  wages, 57, 88
Fish, optimum swimming of, 79
  schooling of, 132
Fishing, regulation of, 77
  type of catch, 189
Fission, nuclear, 169
Fitness of organisms, 84
Fitness sets, 84
Flow, blood, 31, 71
  resistance to, 72
Forest fire, 73
Galaxies, colliding, 166
Gonorrhea epidemics, 192
Governor, steam engine, 188
Graphs, uses of, 44
Gravitation, 35, 39, 158
Great Lakes, pollution of, 144
Groups, dynamics of, 60
  peer pressure and marriage, 157
  size distribution of, 101
Growth rate, net, 8
Gunnery tables, 164
Gypsy moth control, 188
Herd formation, 132
Heun method, 171
Income, marginal, 53
Independent events, 219
Indifference curves and surfaces, 81, 83
Insecticide, dosage of, 159
  host-parasite systems and, 188
Insects, castes of, 85
Inventory maintenance, 66, 131, 215
Keynesian economics, 184, 189
Lakes, pollution of, 144
Laplace transforms, 193
Least squares, see Curve fitting
Lecture hall design, 13
Limit cycles, 174, 200
Linear algebra, 9, 37, 177
Linear approximation, bad effect of, 201
Linear programming, 87
Lizards, body temperature of, 121
Loading docks, 115
Location, optimum, 79, 204
Log normal distribution, 207
Lotka-Volterra equations, 181
Lungs, 127
Lynx-hare cycles, 184
Macroeconomics, 64, 184, 189
Malaria, 192
Marriage, 157
Measles, 157, 192
Medicine, diagnosis, 127, 223
  drug excretion, 156
  epidemics, 166, 192
  lung efficiency, 127
Missiles, see Arms race; Ballistics; and Rockets
Model, best does not exist, 3
  compartment, 156
  mathematical, definition, 2
    usefulness of, 1
  need for, 14
  predictions, fragile and robust, 4, 123, 130, 152, 177
    see also Sensitivity analysis
  scale, 38, 43
  variables, careful choice of, 3
    types of, 2, 3
Modeling process, changing problem, 10
  example, 8, 10
  implicit assumptions, 1, 197
  references, 12
  theory of, 6
Money, government control of, 187
Monte Carlo simulation, 103
  accuracy of estimates, 104, 231
Music, stringed instruments, 40
Normal approximation, 170, 209, 230
Normal distribution, 230
  random generation of, 233
Nuclear reaction, 169
Numbers, random, 105, 118, 232
Packaging costs, 19
Particle size distribution, 207
Pedestrian crosswalks, 213
Pendulum, damping of, 40, 177
  period of damped, 179
  period of perfect, 37
Phase plane, 60, 174
Phenotype, optimal, 84
Physiology, see Anatomy and physiology
Place theory, central, 115
Poincaré-Bendixson theorem, 200
Poisson distribution, 95, 154, 158, 237
  truncated, 102
Poisson's ratio, 39, 41
Politics, buck passing, 10, 13
  candidates and platforms, 90
  disarmament, 56
  fair elections, 124
  preventative war, 56
Pollution of the Great Lakes, 144
Polymerization, 152, 157, 215
Population growth, competition between species, 64, 184
  demography, 9, 14, 91
  host-parasite, 180, 188
  one species, 8, 14, 198
  predator-prey, 180, 188, 201
  symbiosis, 184, 189
Predation, 180, 188, 201
Probability, conditional, 219
Psychology, 94, 98
  see also Psychophysics
Psychophysics, perception of intensity, 33
  vision, 131, 134
Queues, 106
Racing shells, 22, 236
Radioactive decay, 202
Random numbers, generation, 105, 232
  table, 118
Reaction, chain, 169
Rhythms, biological, 168
Rocket, 78
Rosin's law, 207
Running in rain, 76
Salesperson effectiveness, 11
Sand dunes, 211
Scale models, 38, 43
Schools, see Universities
Sediment volume, 108
Sensitivity analysis, 11, 68, 75, 80
Sex ratio, human, 91, 217
Signals, traffic, 89, 213
Simple events in probability, 218
Sociobiology, herd formation, 132
  insect castes, 85
Sociology, group size distribution, 101
  marriage rate, 157
  sex preference, 91
Species diversity and habitat size, 49
Species interaction, see Ecology; Population growth
Stability, global, 174, 199
  graphs used in study of, 45
  local, 174, 175
  references to other models, 187
Standard deviation, 230
Statics, comparative, 45
Steel beams, cutting hot rolled, 215
Stream networks, 110, 117
String, randomly thrown, 214
Structures, strength of, 38
Students, admission of, 64, 68
  demand for graduating, 64
Sunlight, reflected in deserts, 121, 122
Supply and demand, 55, 57, 64
Symbiosis, 184, 189
Systems, autonomous, 173
Tape recorder reel revolution counters, 130
Taylor polynomials, 171, 193
Thermostats, 199
Time lags, 9, 57
Traffic flow, car following, 193
  flow-concentration curve, 133
  fundamental diagram, 133
  left turn squeeze, 148
  pedestrian crosswalks, 213
  signals, 89
  urban, 12
Trees, plane planted binary, 111
  random branching of, 115
Unemployment, 187
Universities, admissions policy, 64, 88
  demand for graduates, 64
  faculty tenure, 99
  lecture hall design, 13
Utility, mathematical theory of, 204
Van der Pol equation, 200
Variable, random, 220
Variables in models, 2, 3, 6
Variance, 220
Variation, coefficient of, 203
Vision, 131, 134
Volterra-Lotka equations, 181
Wages, 57, 88
Waiting room, 106
Water skiing, 160
Waves, water, 42
Young's modulus, 39, 41