Journal of Mathematical Psychology 44, 408–463 (2000)
doi:10.1006/jmps.1999.1260, available online at http://www.idealibrary.com
Stochastic Dynamic Models of Response
Time and Accuracy: A Foundational Primer
Philip L. Smith
University of Melbourne, Parkville, Victoria, Australia
A large class of statistical decision models for performance in simple information processing tasks can be described by linear, first-order, stochastic
differential equations (SDEs), whose solutions are diffusion processes. In such
models, the first passage time for the diffusion process through a response
criterion determines the time at which an observer makes a decision about
the identity of a stimulus. Because the assumptions of many cognitive models
lead to SDEs that are time inhomogeneous, classical methods for solving
such first passage time problems are usually inapplicable. In contrast, recent
integral equation methods often yield solutions to both the one-sided and the
two-sided first passage time problems, even in the presence of time inhomogeneity.
These methods, which are of particular relevance to the cognitive modeler, are
described in detail, together with illustrative applications. © 2000 Academic Press
Theories of how human subjects make decisions in simple perceptual and
cognitive tasks often propose some form of sequential sampling mechanism to
explain the patterns of response time (RT) and accuracy that are observed in
empirical data. Indeed, for some researchers, the study of such mechanisms addresses
one of the most fundamental questions in psychology, namely, how the central nervous
system translates perception into action and how this translation depends on the intentions and expectations of the individual. From this perspective, the study of simple
decisions helps illuminate the nexus of perception, thought, and action and thus has
implications for our understanding of the diverse perceptual and cognitive phenomena
in which such decisions are involved.
Like signal detection theory (SDT) (Green & Swets, 1966), the theory of sequential
sampling mechanisms starts from the premise that simple perceptual and cognitive
The idea for this article emerged from a seminar on response time models at Indiana University during
the fall of 1996 in which I participated, and I thank Richard Shiffrin, Jerry Busemeyer, and Jim Townsend
for encouraging me to write it. Background work was carried out during a sabbatical at the Institute
for Mathematical Behavioral Sciences at the University of California, Irvine, in the same year, and I thank
Duncan Luce for the support and hospitality of the Institute during this time. My thanks also to Jerry
Busemeyer and Michael Rudd for their careful reading of an earlier version of the manuscript. Preparation
of this article was supported in part by Australian Research Council Grant A79802778.
Address correspondence and reprint requests to Philip L. Smith, Department of Psychology, University
of Melbourne, Parkville, Vic. 3052, Australia. E-mail: philip@sherman.psych.unimelb.edu.au.
0022-2496/00 $35.00
Copyright 2000 by Academic Press
All rights of reproduction in any form reserved.
decisions are statistical in nature. That this is so follows from the widely held assumption that sensory and cognitive systems are inherently noisy. Thus, a formal model
of simple decisions typically consists of a set of representational assumptions, which
specify how stimulus properties are encoded statistically in the central nervous
system, and a set of process assumptions, which specify how this noisy information
is used to arrive at a decision. Also in common with SDT, the idea that sensory
representations are noisy and time-varying leads naturally to the view that simple
decisions involve a smoothing or filtering operation, and this in turn leads to the
presumption that they are performed by an averaging or integration device.
Where SDT and the theory of sequential sampling models diverge is in their
assumptions about the accrual of stimulus information. Whereas SDT assumes a
fixed sampling interval, sequential sampling models assume that the interval is
variable and depends on the statistical properties of the signal itself. Rather than
taking a sample of stimulus information of predetermined size, such models assume
that the decision mechanism samples until a criterion quantity of information
needed for a response is obtained. Because this quantity varies with the statistics of
the sample, the time needed to acquire it is also variable. Typically, this acquisition
time is identified with the decision time component of RT. Among the many
researchers who have investigated the formal properties of these models are Ashby
(1983), Audley and Pike (1965), Busemeyer and Townsend (1992, 1993), Diederich
(1995, 1997), Edwards (1965), Emerson (1970), La Berge (1962), Laming (1968),
Link (1975, 1978, 1992), Link and Heath (1975), Luce and Green (1972), Pike
(1966, 1968), Ratcliff (1978, 1981), Smith and Vickers (1988, 1989), Townsend and
Ashby (1983), Vickers (1970; Vickers, Caudrey, 6 Willson, 1971), and Viviani (1979a,
1979b). Reviews of this literature may be found in Vickers (1979), Townsend and
Ashby (1983), and Luce (1986).
In practice, the study of a given sequential sampling model reduces to the study
of a stochastic process, or processes, that represents the accumulated information
available to the decision mechanism at a given time. This process (for the moment
defined only in the singular) will be denoted X(t). Formally, X(t) is a random
variable defined on the probability space of all possible sequences of accumulated
information at time t, the term ``information'' being used here in a sense that is
only loosely related to its more technical, statistical meaning. Specifically, at each
time t, t ∈ T, where T denotes the set of indices for which the process is defined,
X(t) takes on a random value x, x ∈ X. The set X, of possible values of accumulated
information, is the state space of the process. The set of possible time indices, T,
and the set of possible information states, X, may independently be either discrete
or continuous, depending on the assumptions of the model. This article will be
concerned exclusively with continuous-time, continuous state-space models. The
interested reader is referred to the preceding references for other possibilities.
Two steps are required to specify the quantitative properties of a model, once the
set of process assumptions that determine X(t) are given. The first step is to specify
the probabilistic character of the process X(t). These properties are expressed via its
transition distribution, F(x, t | y, τ), which is defined as

F(x, t | y, τ) = P[X(t) ≤ x | X(τ) = y].   (1)
This distribution gives the probability that the accumulated information at time t
is less than or equal to x, given that its value at some earlier time τ was y. The
second step is to solve the first passage time problem for X(t). Usually, the statistics
of X(t) most of interest in applications are the probability that the accumulated
information will eventually reach or exceed a particular level, together with the time
required for this to happen. Depending on the application, the one-sided or two-sided first passage time problem may be of greatest interest.
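In the simplest case, X(t) is a Brownian motion with constant drift μ and diffusion coefficient σ, and the transition distribution (1) reduces to the Gaussian form F(x, t | y, τ) = Φ((x − y − μ(t − τ))/(σ√(t − τ))), where Φ is the standard normal distribution function. The following minimal sketch (parameter values chosen arbitrarily, for illustration only) checks this closed form against a direct Monte Carlo estimate:

```python
import math

import numpy as np

def wiener_transition_cdf(x, t, y, tau, mu, sigma):
    """Closed-form F(x, t | y, tau) for Brownian motion with drift mu and
    diffusion coefficient sigma: Gaussian with mean y + mu*(t - tau) and
    variance sigma**2 * (t - tau)."""
    z = (x - y - mu * (t - tau)) / (sigma * math.sqrt(t - tau))
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

rng = np.random.default_rng(1)
mu, sigma = 0.5, 1.0          # illustrative drift and diffusion coefficient
y, tau, x, t = 0.2, 0.0, 1.0, 1.0
# X(t) - X(tau) is itself Gaussian, so sample it directly rather than path-wise
samples = y + mu * (t - tau) + sigma * math.sqrt(t - tau) * rng.standard_normal(200_000)
estimate = np.mean(samples <= x)
exact = wiener_transition_cdf(x, t, y, tau, mu, sigma)
```

Note that in this homogeneous special case the distribution depends on t and τ only through the difference t − τ, in agreement with Eq. (4) below.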
The one-sided problem involves a single random variable T, defined as

T = inf{t : X(t) ≥ a};   X(t₀) < a.   (2)
That is, T is the time at which the accumulated information first reaches the value
a, where it is assumed that the process starts at time t = t₀ with an initial value
X(t₀) = x₀ < a. We will take t₀ = 0 and x₀ = 0 without further comment, except
where greater clarity is obtained by indicating the values of the initial conditions
in an explicit way. Depending on the assumptions of the model, x₀ may either be
fixed or random with a prescribed distribution. For x₀ fixed we stipulate that
P[X(t₀) = x₀] = 1. Mathematically, a is an absorbing barrier or boundary for X(t),
which is identified psychologically with the decision criterion adopted by the subject
in an experimental task. The two-sided problem involves a pair of random variables,
T₁ and T₂, defined as

T₁ = inf{t : X(t) ≥ a₁; X(τ) > a₂ for all τ < t}
T₂ = inf{t : X(t) ≤ a₂; X(τ) < a₁ for all τ < t},   (3)

a₂ < X(t₀) < a₁.

The variables T₁ and T₂ denote, respectively, the time required for the accumulated
information first to exceed a₁ or to fall below a₂, given that the other boundary has
not been crossed already. Again, the values a₁ and a₂ may be interpreted as decision
criteria associated with competing responses. To ensure that T₁ and T₂ are well defined,
we adopt the convention that the infimum of the empty set is infinity. With this convention, the definition (3) implies that either T₁ = ∞ or T₂ = ∞ for each realisation of X(t).
The statistics of interest for the one-sided problem are P[T < ∞], the probability
that X(t) exceeds a in finite time, and G_T(t), the associated first passage time distribution. If T is not finite with probability one, then G_T(t) will be a defective distribution;
that is, its total probability mass will be less than one. The corresponding statistics
for the two-sided problem are P[T₁ < T₂], the probability that the process crosses
the boundary a₁ before crossing the boundary a₂, and G₁(t) and G₂(t), the first
passage time distributions for the random variables T₁ and T₂. The functions G₁(t)
and G₂(t) represent the joint distributions of the events that a boundary crossing
occurs at or before time t and that the first boundary crossed is a₁ or a₂, respectively. If the random variable T is defined as

T = min(T₁, T₂),

the time of the first boundary crossing, independent of the boundary involved, then

G_T(t) = G₁(t) + G₂(t),
because first boundary crossings at a₁ and a₂ are mutually exclusive events. For the
continuous-time processes of interest here, T₁, T₂, and T possess density functions,
denoted g₁(t), g₂(t), and g_T(t), respectively. Specifically, g_T(t) = (d/dt) G_T(t), with
the other densities being defined similarly. When we wish to show the dependency
of these densities on the absorbing boundaries and on the initial conditions we will
write g₁(a₁, t | x₀, t₀) and g₂(a₂, t | x₀, t₀) for the two-barrier case and g(a, t | x₀, t₀)
for the single-barrier case.
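These quantities are straightforward to estimate by simulation. The following sketch (an Euler scheme with illustrative, arbitrarily chosen parameter values, not a general-purpose solver) estimates P[T₁ < T₂] for a constant-drift, constant-variance process and compares it with the closed form available in that special case. The decomposition G_T(t) = G₁(t) + G₂(t) holds by construction, because the two crossing events are mutually exclusive on every simulated path.

```python
import math

import numpy as np

def two_sided_fpt(mu, sigma, a1, a2, n_paths=10_000, dt=1e-3, t_max=10.0, seed=2):
    """Euler approximation of dX = mu dt + sigma dB started at X(0) = 0, run
    until absorption at a1 (> 0) or a2 (< 0).  Returns, for each path, whether
    the upper boundary was crossed first, and the crossing time."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_paths)
    alive = np.ones(n_paths, dtype=bool)
    hit_upper = np.zeros(n_paths, dtype=bool)
    times = np.full(n_paths, np.inf)       # infinite time = never absorbed
    t = 0.0
    while alive.any() and t < t_max:
        x[alive] += mu * dt + sigma * math.sqrt(dt) * rng.standard_normal(alive.sum())
        t += dt
        up = alive & (x >= a1)
        down = alive & (x <= a2)
        hit_upper[up] = True
        times[up | down] = t
        alive &= ~(up | down)
    return hit_upper, times

hit_upper, times = two_sided_fpt(mu=0.5, sigma=1.0, a1=1.0, a2=-1.0)
p_upper = hit_upper.mean()                 # estimate of P[T1 < T2]
# Closed form for the constant-drift Wiener process (scale-function argument):
p_exact = (1 - math.exp(1.0)) / (math.exp(-1.0) - math.exp(1.0))
```

The empirical G₁(t) is then the proportion of paths with hit_upper true and crossing time at most t, and similarly for G₂(t).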
To provide some intuitive content for these probabilistic ideas, Figs. 1 and 2
show some simple model schemes, in which the preceding quantities arise naturally
from theoretical considerations. The model in Fig. 1 provides a possible framework
for modeling simple RT, Go-No Go RT, or threshold increment detection.¹
Conceptually, it consists of two main parts: an encoding stage and a decision stage.
The encoding stage, shown in the box on the left, represents the properties of
some particular sensory system, memory system, or systems of both kinds acting in
concert. Its output is a time-dependent information function, denoted μ(t). This
function represents the instantaneous, encoded value of those attributes of the
stimulus that are relevant to the decision task at hand. Although the exact substrate
of this function is usually left unspecified, it is typically assumed to correspond to
the level of activation in some neural pathway or pathways. To account for the
behavioral variability that is ubiquitous in simple decision tasks, the information
function is subject to continuous statistical perturbation. This perturbation is
represented in the figure by the function W(t).
The second part of the model, the decision stage, is shown in the box on the right
side of Fig. 1. To make a decision about the stimulus, the instantaneous values of
the noise-perturbed information function are summed or accumulated until the
criterion, a, is exceeded. The accumulated information is represented by the stochastic
process X(t). As described previously, the decision time in the model is T, the first
passage time of X(t) through the level a. Versions of this model have been considered
by, among others, Diederich (1995), Emerson (1970), Pacut (1977, 1980), and
Smith (1995).
To model choice RT, or discrimination between similar stimuli, the scheme
shown in Fig. 1 can be extended in various ways. Two possibilities are shown in
Fig. 2. In both, the information function is assumed to take on both positive and
negative values. In cognitive models, this is often accomplished by the device of
comparing sampled values of the information function to a decision referent c, prior
¹ Threshold increment detection can be modeled using either a single criterion or a pair of criteria.
Dual-criterion models (e.g., Link, 1978) associate a criterion with each of the responses ``Signal'' and
``Noise'' and assume that information is sampled until one of the two criteria is exceeded. Single-criterion
models associate a criterion with the signal response only and assume that a noise response is emitted
by default if the signal criterion is not exceeded in some predetermined period of time. Well-behaved
psychometric functions can be obtained with single-criterion models only if the asymptotic probability
of a signal response is less than unity; that is, if P[T < ∞] < 1. Typically, this means that single-criterion
models will be appropriate only when stimulus duration is limited, as occurs when stimuli are backwardly masked. When weak, response-terminated stimuli are used, dual-criterion models are needed to
obtain psychometric functions with the required properties. These properties can often be characterized
by considering the boundary behavior of the accumulation process (e.g., Karlin & Taylor, 1981,
pp. 226–250) to ascertain whether absorption at a boundary in finite time is a sure event.
FIG. 1. Model for simple and Go-No Go RT. The time-dependent information function μ(t) is
perturbed by white noise W(t) and accumulated. A response is emitted when the accumulated decision
stage activation X(t) exceeds a criterion a.
to accumulation (e.g., Link, 1975, 1992; Ratcliff, 1978; Smith & Vickers, 1988). The
accumulation process then operates on the difference function, μ(t) − c, rather than
on the information function directly. The two panels of Fig. 2 show two ways in
which the accumulation process may operate in this situation.
In the model of Fig. 2a, the difference function μ(t) − c is split into positive and
negative halves. This is represented in the figure by a pair of opposite-signed, half-wave rectifiers, whose outputs are [μ(t) − c]⁺ = max(μ(t) − c, 0) and [μ(t) − c]⁻
= −min(μ(t) − c, 0), respectively. These functions are perturbed by independent
sources of noise, W₁(t) and W₂(t), to yield two independent accumulation functions, X₁(t) and X₂(t). These functions race one another, with the response being
determined by which of the two criteria, a₁ or a₂, is first exceeded. Derivation of
response time statistics for this model involves a pair of one-sided first passage time
problems.² In a variant of this scheme, a single noisy information function, μ(t) + W(t),
is split into positive and negative halves, again by the device of comparing it to a
referent. The statistical properties of the accumulation function are then induced by the
action of the pair of rectifiers operating on the noisy information function directly.
Models of this kind belong to the class that Smith and Vickers (1988) called parallel
integrators, a class which includes the recruitment model (La Berge, 1962), the
Poisson parallel counter model (Pike, 1966, 1968; Townsend & Ashby, 1983),
and the continuous state space accumulator model (Smith & Vickers, 1988, 1989;
Vickers, 1970, 1979).
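The race embodied in Fig. 2a can be made concrete in a few lines of simulation code. In the sketch below, the noise variance is fixed at unity and all parameter values are illustrative assumptions; each accumulator receives one rectified half of μ − c plus its own independent noise, and the response is determined by the first accumulator to reach its criterion.

```python
import math

import numpy as np

def parallel_race(mu, c, a1, a2, n_paths=5_000, dt=1e-3, t_max=10.0, seed=3):
    """Race between two one-sided accumulators driven by the rectified halves
    [mu - c]+ and [mu - c]-, each perturbed by independent unit white noise.
    Returns the winning response (1, 2, or 0 for none) and the decision time
    for each simulated trial."""
    rng = np.random.default_rng(seed)
    drift1 = max(mu - c, 0.0)              # [mu - c]+
    drift2 = max(-(mu - c), 0.0)           # [mu - c]-
    x1 = np.zeros(n_paths)
    x2 = np.zeros(n_paths)
    response = np.zeros(n_paths, dtype=int)
    rt = np.full(n_paths, np.inf)
    alive = np.ones(n_paths, dtype=bool)
    t = 0.0
    while alive.any() and t < t_max:
        n = alive.sum()
        x1[alive] += drift1 * dt + math.sqrt(dt) * rng.standard_normal(n)
        x2[alive] += drift2 * dt + math.sqrt(dt) * rng.standard_normal(n)
        t += dt
        win1 = alive & (x1 >= a1)
        win2 = alive & (x2 >= a2) & ~win1  # simultaneous crossings go to r1
        response[win1], response[win2] = 1, 2
        rt[win1 | win2] = t
        alive &= ~(win1 | win2)
    return response, rt

response, rt = parallel_race(mu=1.5, c=0.5, a1=1.0, a2=1.0)
p_r1 = np.mean(response == 1)              # r1 should dominate when mu > c
```

With μ > c only the first channel has positive drift, so r1 dominates; derivation of the exact RT distributions requires the pair of one-sided first passage time problems described in the text.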
In the model of Fig. 2b, the noise-perturbed difference function μ(t) + W(t) − c
drives a single accumulation process, X(t), whose average rate of change is either
positive or negative, depending on the stimulus presented. To make a decision, the
subject sets a pair of criteria, a₁ and a₂, on the accumulated information axis and
responds according to which of the events X(t) ≥ a₁ or X(t) ≤ a₂ first occurs.
Derivation of response time statistics for this model requires solution of a two-sided
² The model of Fig. 2a makes the somewhat artificial assumption that noise is localized to stages of
processing occurring after rectification. Relaxation of this assumption typically leads to models in which
the noise across channels is correlated. Ways to deal with correlated noise in diffusion process models
are described in the last section of this article. First passage time problems for such processes are not
particularly tractable, except in special cases.
FIG. 2. Models for choice RT. (a) Parallel channels model. The function μ(t) is compared to a
sensory referent c to produce a signed information function μ(t) − c. This is split into positive and
negative parts [μ(t) − c]⁺ and [μ(t) − c]⁻ by the action of a pair of half-wave rectifiers. The rectified
functions are perturbed by independent sources of white noise, W₁(t) and W₂(t), and accumulated as
separate evidence totals, X₁(t) and X₂(t). The response is r₁ or r₂, depending on which of the events
X₁(t) ≥ a₁ or X₂(t) ≥ a₂ first occurs. (b) Two-barrier single channel model. The signed information
function μ(t) − c is perturbed by a single noise source W(t) and accumulated as a single signed total
X(t). The response is r₁ or r₂ depending on whether X(t) ≥ a₁ or X(t) ≤ a₂ occurs first.
first passage time problem. Models that conform to this scheme include the various
random walks (Ashby, 1981; Edwards, 1965; Heath, 1981; Laming, 1968; Link,
1975, 1992; Link & Heath, 1975; Stone, 1960) and their continuous-time counterparts, the diffusion process models (Busemeyer & Townsend, 1992, 1993; Heath,
1992). The parallel diffusion process model of Ratcliff (1978, 1981) combines the
characteristics of both of the models in Fig. 2.
Models of simple decisions that have been proposed in the literature typically
have made two important simplifying assumptions about the accumulation process
X(t). These are that it is (a) time-homogeneous and (b) an independent-increments
process. A time-homogeneous process is one whose transition distribution satisfies
the relation
P[X(t) ≤ x | X(τ) = y] = P[X(t − τ) ≤ x | X(0) = y],   (4)
for all τ < t. In such a process, the conditional probability of a transition from state
y to state x depends only on the interval t − τ and not on the time at which the
transition occurs. An independent-increments process is one whose transition
distribution satisfies the relation
P[X(t) ≤ x | X(τ) = y] = P[X(t) ≤ x − y | X(τ) = 0].   (5)
In such a process, the conditional probability of a transition from state y to state
x depends only on the difference between the states, x − y, and not on the initial
state y. In other words, an independent-increments process is spatially homogeneous
(e.g., Chung & Williams, 1983, p. 12).
The assumption that X(t) is time-homogeneous is equivalent to the assumptions
that (a) the information function μ(t) in Figs. 1 and 2 is constant and (b) the noise
function W(t) is a (strictly) stationary stochastic process; that is, none of its
statistics change with time. The assumption that X(t) is an independent-increments
process is entailed by the frequently made assumption that the decision process is
a perfect integrator; that is, accumulation of information is neither bounded nor
subject to decay. While both of these assumptions may be defensible, at least in
some settings, there are other occasions in which assumptions of greater dynamical
complexity are warranted.
Frequently, a modeler may wish to assume that the characteristics of the stimulus
information change with time. Such changes may arise for a number of reasons.
Most obviously, stimulus information may be available only for a limited time, as
occurs under conditions of tachistoscopic presentation. Formal models for performance in this situation, which embody a single, abrupt, stimulus-driven change in
the accumulation process, were described by Heath (1981) and Ratcliff (1980).
More generally, continuous, smooth changes in the accumulation process may arise
internally, through the action of mechanisms that encode the stimulus. Following
de Lange (1952, 1954, 1958), the psychophysical properties of the early stages of
visual processing are often modeled as a set of low-pass linear filters (e.g., Busey &
Loftus, 1994; Sperling & Sondhi, 1968; Watson, 1986). Within this framework, time-dependent variations in the statistics of the stimulus representation are assumed to
arise because of the phase and frequency response characteristics of the filters. A
model combining linearly filtered stimulus encoding with a stochastic accumulation
process was first suggested by Heath (1992).
Smooth changes in the dynamics of stimulus encoding may also arise if the
underlying mechanism is selectively sensitive to changes in the stimulus input rather
than to steady state intensity levels. Psychophysically transient systems of this kind
have been identified in both vision (e.g., Legge, 1978; Tolhurst, 1975a, 1975b) and
audition (e.g., Abeles & Goldstein, 1972; Gerstein, Butler, & Erulkar, 1968) and
their properties used to explain performance in models of RT by Burbeck and Luce
(1981), Burbeck (1985), and Rouder (2000). Finally, such changes may occur
because of shifts in a subject's attention during the course of a trial. Depending on
the nature of the task, such changes may be thought of as a set of discrete, punctate
transitions, as proposed in the decision model of Diederich (1997) or as smoothly
varying changes in the rate of accumulation, as suggested by the attention gating
model of Sperling and colleagues (Reeves & Sperling, 1986; Sperling & Weichselgartner,
1995). Regardless of how such variations are produced, however, they imply time-inhomogeneity in the accumulation process X(t).
Violations of the independent-increments assumption occur when there is state
dependence in the accumulation process; indeed, this is the substance of the definition in (5). State dependencies of this kind arise when the signal-to-noise ratio
(SNR) of the accumulation process is bounded, as occurs in imperfect or leaky
accumulators. In leaky accumulators, the growth of activation in the decision
system caused by the presence of a stimulus is opposed by a tendency for activation
to decay at a rate that is proportional to its current level. Models with stochastic,
SNR-bounded accumulation have been considered by Busemeyer and Townsend
(1992, 1993), Diederich (1995), Pacut (1977, 1980), and Rudd and Brown (1997).
A model combining all of the preceding dynamic attributes, namely, linearly filtered
stimulus encoding, change- and level-sensitive mechanisms, and leaky stochastic
accumulation, was proposed by Smith (1995, 1998a).
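The simplest leaky accumulator of this kind is an Ornstein-Uhlenbeck process, dX = (μ − γX) dt + σ dB, in which the decay term −γX opposes growth in proportion to the current activation. The sketch below (an Euler scheme with illustrative parameter values) checks the standard result that, unlike a perfect integrator, such a process settles to a bounded mean μ/γ and variance σ²/(2γ):

```python
import math

import numpy as np

def leaky_accumulator(mu, gamma, sigma, n_paths=10_000, dt=1e-3, t_end=5.0, seed=4):
    """Euler approximation of the leaky (Ornstein-Uhlenbeck) accumulator
    dX = (mu - gamma*X) dt + sigma dB, started at X(0) = 0.  Returns the
    terminal values X(t_end) across independent paths."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_paths)
    for _ in range(int(round(t_end / dt))):
        x += (mu - gamma * x) * dt + sigma * math.sqrt(dt) * rng.standard_normal(n_paths)
    return x

x_end = leaky_accumulator(mu=1.0, gamma=2.0, sigma=1.0)
# Stationary moments of the OU process: mean mu/gamma, variance sigma**2/(2*gamma)
mean_est, var_est = x_end.mean(), x_end.var()
```

Because the asymptotic moments are bounded, the signal-to-noise ratio of the accumulated information is bounded as well; a perfect integrator (γ = 0) has no such bound.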
The preceding paragraphs have considered a number of examples from the areas
of perception and performance in which violations of either spatial homogeneity or
temporal homogeneity, or both, arise. Many of these considerations also apply,
either implicitly or explicitly, to mathematical models of memory. Although the
simplest versions of most memory models are static (e.g., Gillund & Shiffrin, 1984;
Humphreys, Bain, & Pike, 1989; Murdock, 1982) and make predictions that can be
characterized using signal detection theory alone, the theoretical framework that
accompanies them is usually rich enough to support more complex dynamic predictions if they are required (e.g., Ratcliff, 1978). Dynamic behavior of this kind arises
naturally, for example, with the addition of assumptions that specify the time
course of the interaction between a probe stimulus and the memory system. Indeed,
such dynamics are an explicit part of some network models of memory, such as
interactive activation models (Coltheart & Rastle, 1994; McClelland & Rumelhart,
1981). In general, such models postulate time-dependent and asymptotically bounded
information accrual, which results in accumulation functions that are both spatially
and temporally inhomogeneous.
STOCHASTIC INFORMATION ACCRUAL AS A DIFFUSION PROCESS
This article is concerned with diffusion process models of information accrual;
that is, continuous-time, continuous sample path Markov processes. Although there
are other ways in which information accrual in continuous time may be modeled,
the most important among them from a biological perspective being point-process
models (e.g., Diederich, 1995; Luce & Green, 1972; Hildreth, 1979; McGill, 1963;
Rudd, 1996; Schwartz, 1992; Smith & Van Zandt, in press; Townsend & Ashby,
1983), diffusion processes are of particular interest because they include an important class of continuous-time Gaussian processes, which provide a natural, dynamic
generalization of the Gaussian signal detection models that form the mainstay of
much psychological theorizing. Furthermore, Gaussian approximations may be
appropriate to model molar accumulation behavior even when it is assumed that
the fine-grained behavior of the system can be described as a point process, as, for
example, when modeling the number of active fibers in a neural relay. In this situation, a possible model for the accumulated impact of successive spike discharges on
a post-synaptic membrane potential is as Poisson shot noise, which becomes
approximately Gaussian at high intensities of the incident point process (e.g.,
Papoulis, 1991, pp. 629–635).³ Variants of this idea appear in Link (1992), McGill
(1967), Rudd (1996), and Smith (1998b), among others.
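This point is easy to illustrate numerically. In the sketch below (unit impulses with an exponentially decaying response; the rate and time constant are illustrative assumptions), Campbell's theorem gives the shot-noise mean λτ and variance λτ/2 for the kernel h(t) = exp(−t/τ), and the skewness shrinks as the event rate λ grows, so that the amplitude distribution is approximately Gaussian at high rates:

```python
import math

import numpy as np

def shot_noise(rate, tau=0.05, dt=1e-3, t_end=2.0, n_trials=4_000, seed=5):
    """Samples of Poisson shot noise: each event of a Poisson process of
    intensity `rate` injects a unit impulse that then decays as exp(-t/tau).
    Returns the amplitude at t_end for n_trials independent trials."""
    rng = np.random.default_rng(seed)
    decay = math.exp(-dt / tau)
    x = np.zeros(n_trials)
    for _ in range(int(round(t_end / dt))):
        x = x * decay + rng.poisson(rate * dt, size=n_trials)
    return x

x = shot_noise(rate=500.0)
# Campbell's theorem for this kernel: mean = rate*tau, variance = rate*tau/2
skewness = float(np.mean(((x - x.mean()) / x.std()) ** 3))
```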
The rest of the article considers in turn the two problems described previously:
first, how to characterize the stochastic properties of the accumulation process X(t);
second, how to solve the one-sided and two-sided first passage time problems for
X(t) when it is spatially and temporally inhomogeneous. The classical approach to
first passage time problems (e.g., Cox & Miller, 1965, Ch. 5; Feller, 1971, Ch. 10)
treats them as boundary value problems in the theory of partial differential equations and yields tractable solutions only in the time-homogeneous case. A second
approach that has been applied to cognitive problems is to approximate the process
using a finite-state Markov chain and to compute the first passage time statistics for
the approximating process using spectral methods, as was done by Busemeyer and
Townsend (1992, 1993) and Diederich (1995, 1997). This method yields computationally efficient solutions for spatially homogeneous and inhomogeneous problems,
but only in the time homogeneous or piecewise time homogeneous cases. In contrast,
recent developments in the applied probability literature have led to integral equation
representations of the solution of the one-sided and two-sided first passage time
problems that can often yield tractable results even in the presence of time inhomogeneity. These methods, which provide high-accuracy numerical approximations to
the first passage time distributions, are of particular relevance to the cognitive
modeler and form the basis of the procedures described here.
Classically, F(x, t | y, {), the transition distribution of a diffusion process X(t), is
shown to satisfy a pair of partial differential equations known as the backward and
forward (or Fokker-Planck) equations. Solution of the relevant equation in the
presence of an initial condition yields the transition distribution; solution in the
presence of appropriate boundary conditions yields the first passage time distribution. These methods are described in Cox and Miller (1965, Ch. 5), Feller (1971,
Ch. 10), and Karlin and Taylor (1981, Ch. 15). Applications of these methods in a
cognitive setting are described by Ratcliff (1978, 1980) and Smith (1990). A more
modern approach to the study of diffusions is via stochastic differential equations
(SDEs). Such an approach has the advantage of providing a characterization of the
process that is more direct and intuitive than that afforded by partial differential
equations. Its disadvantage is that a rigorous treatment of SDEs requires the analytic
framework of measure theory and is thus unsuitable for a brief, self-contained presentation such as this. Fortunately, however, the SDE approach to the important class of
³ Pooling of activation across positive and negative channels of the kind found in the human visual
system may be modeled as the difference of a pair of Poisson shot noise processes. At high intensities,
this process will also approximate a Gaussian process. Such processes may provide a theoretical link
between the underlying neural mechanisms and the diffusion process models described in this article, but
no attempt is made here to develop these properties in detail. The interested reader is referred to Rudd
(1996) and Rudd and Brown (1997) for highly developed models which link patterns of neural firing,
diffusion processes, and early vision.
Gaussian processes can be motivated heuristically in a fairly simple way, by considering the limits of sums. This is the approach I adopt here. Useful introductions to
the theory of SDEs may be found in Arnold (1974), Bhattacharya and Waymire
(1990), Gardiner (1985), and Karlin and Taylor (1981). Rigorous measure-theoretic
treatments are contained in Ethier and Kurtz (1986), Karatzas and Shreve (1991),
Protter (1990), and Revuz and Yor (1994).
We wish to characterize a class of stochastic accumulation processes that is
described by the SDE
dX(t) = μ(X(t), t) dt + σ(X(t), t) dB(t).   (6)
In this equation, dX(t) is the random change in the process X(t) that occurs in a
small time interval dt. We seek to interpret this equation as the limit, in some
suitable sense, of the discrete time difference equation
X(t + Δt) − X(t) = μ(X(t), t) Δt + σ(X(t), t) ΔB(t),   (7)

in which ΔB(t) = B(t + Δt) − B(t) is a small Gaussian-distributed perturbation of
order √Δt. Symbolically, we write ΔB(t) ∼ √Δt. The requirement that ΔB(t)
approaches zero more slowly than Δt is designed to ensure that the stochastic
variation in X(t) is preserved in the limit. We ascribe the following properties
to ΔB(t):

E[ΔB(t)] = 0
E[ΔB²(t)] = Δt   (8)
Cov[ΔB(t + Δt), ΔB(t)] = 0.
Equation (8) stipulates that (a) ΔB(t) is a zero-mean Gaussian process; (b) its
variance in any interval Δt equals the length of the interval; and (c) the increments
in successive nonoverlapping intervals are independent.
The physical intuition that underlies the description of X(t) in (7) is as follows:
The change ΔX(t) in any interval Δt consists of two parts, one deterministic,
the other stochastic. The magnitudes of the two parts depend on the functions
μ(X(t), t) and σ²(X(t), t), respectively. Both parts may depend on the state, x, and
on the time, t, in which case X(t) is spatially and temporally inhomogeneous. Because
ΔB(t) is Gaussian, the increment ΔX(t) will be Gaussian with mean μ(X(t), t) Δt
and variance σ²(X(t), t) Δt. The resulting process X(t) may be thought of as a
realization of the trajectory in state space of a system whose underlying smooth
dynamics are prescribed by the function μ(X(t), t), but which is subject to repeated
shocks or statistical perturbations. For a given set of initial conditions, the actual
trajectory of the system will depend jointly on the factors that determine its smooth
dynamics and on the unique realization of the sequence of perturbations ΔB(tᵢ),
i = 1, 2, ... .
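The properties in (8) are easy to verify numerically for the discrete increments ΔB(t) = √Δt · Z, with Z a standard normal deviate, which is also exactly the recipe used to simulate (7) in practice. A brief check (the step size and the drift and diffusion values in the final lines are arbitrary illustrations):

```python
import math

import numpy as np

rng = np.random.default_rng(6)
dt = 0.01
n = 200_000
# Gaussian increments of order sqrt(dt), as stipulated below Eq. (7)
dB = math.sqrt(dt) * rng.standard_normal(n)

mean_dB = dB.mean()                             # Eq. (8a): zero mean
var_dB = dB.var()                               # Eq. (8b): variance equal to dt
lag1_corr = np.corrcoef(dB[:-1], dB[1:])[0, 1]  # Eq. (8c): uncorrelated increments

# A single Euler step of Eq. (7) for constant (illustrative) mu and sigma:
mu, sigma, x0 = 0.3, 1.2, 0.0
x1 = x0 + mu * dt + sigma * dB[0]
```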
FIG. 3. Solution to Eq. (6) as a diffusion process X(t), with initial condition X(t₀) = x₀. The value
of X(t) is a random variable with distribution function F(x, t | x₀, t₀), transition density f(x, t | x₀, t₀),
and mean m(t; t₀). The transposed normal curve suggests the properties of the transition density function
graphically.
Within this framework, (6) may be thought of as describing the trajectory of X(t)
when the interval between successive shocks becomes very small, as suggested by
the sample paths shown in Fig. 3. To interpret (6) as the limit of (7) as Δt goes to
zero requires us to give meaning to quantities of the form W(t) = lim_{Δt → 0} ΔB(t)/Δt.
However, the stipulation that ΔB(t) ∼ √Δt precludes convergence in any ordinary
sense (e.g., Karlin & Taylor, 1981, p. 341). Indeed, the prescriptions contained in
(8b) and (8c) imply that Cov[W(t), W(τ)] = δ(t − τ), the Dirac delta function,
which does not exist as a function in any usual sense. Nevertheless, a generalized
function interpretation may be given to W(t). Recalling (a) that the power spectrum
of a stochastic process is the Fourier transform of its autocorrelation function and (b)
that the Fourier transform of the Dirac delta function is a constant, W(t) may be interpreted as a Gaussian process whose power spectrum is constant; that is, as one in
which there is equal power at all frequencies. Such a process is known as white noise
(Wong & Hajek, 1985, pp. 109–115).
As a process, white noise cannot exist in any real physical sense. Nevertheless, it
provides a convenient and widely used approximation when describing the dynamics
of physical systems that are additively perturbed by broad-spectrum Gaussian noise.
Indeed, as noted by Karatzas and Shreve (1991), the development of a rigorous
stochastic calculus by Itô in the 1940s was motivated by the need to understand
systems of this kind. Here we use only the minimum of this apparatus needed to
provide a characterization of the distributional properties of X(t). The interested
reader is referred to the previous references for further details.
Under appropriate smoothness conditions on the functions μ(x, t) and σ(x, t),
the solution X(t) to (6) will be a continuous sample path Markov process. Given
certain technical restrictions on the process X(t), a sufficient condition for it to have
continuous sample paths is that it satisfy the Dynkin condition⁴

$$\lim_{h\to 0} \frac{1}{h}\, P[\,|X(t+h) - X(t)| > \varepsilon \mid X(t) = x\,] = 0$$
for all ε > 0. Loosely, the probability of large jumps in X(t) in any interval goes to
zero with the size of the interval. In particular, this condition excludes jump
processes, the canonical example of which is the Poisson process.⁵ The resulting
process is fully characterized by its infinitesimal moments, μ(x, t) and σ²(x, t), which
are defined as

$$\mu(x, t) = \lim_{h\to 0} \frac{1}{h}\, E[X(t+h) - X(t) \mid X(t) = x], \qquad
\sigma^2(x, t) = \lim_{h\to 0} \frac{1}{h}\, E[\{X(t+h) - X(t)\}^2 \mid X(t) = x]. \tag{9}$$
These functions, which are known as the drift and diffusion coefficients of the
process, respectively, are the infinitesimal (rate) equivalents of the mean and
variance of the increment process in (7).
As written, the SDE in (6) includes both linear and nonlinear processes of
arbitrary order. Here we confine ourselves to the general, linear, first-order SDE

$$dX(t) = [\mu(t) + b(t)\,X(t)]\,dt + [\sigma(t) + c(t)\,X(t)]\,dB(t), \tag{10}$$

in which μ(t), b(t), σ(t), and c(t) are given continuous functions of time, which
includes the possibility that they may be constant or zero. Of the various special
cases of (10), the most important is that in which c(t) is zero, i.e.,

$$dX(t) = [\mu(t) + b(t)\,X(t)]\,dt + \sigma(t)\,dB(t). \tag{11}$$

In this equation the diffusion term is either constant or may depend on time but
is independent of state. The importance of this equation is that its solutions are
Gaussian processes and, unlike the solutions of the more general equation (10),
such solutions may be obtained by elementary methods. In general, SDEs are
⁴ The requirements are that X(t) is right continuous and possesses left limits. A process is right
continuous if lim_{t↓τ} X(t) = X(τ) for all τ. It possesses left limits if lim_{t↑τ} X(t) exists for all τ > 0. A strong
Markov process with these properties that satisfies the Dynkin condition is a diffusion process. Most
processes of interest in applications either satisfy these conditions or can be realized as processes of this
kind. In particular, any strong Markov process which is continuous in probability (i.e., for which
lim_{t→τ} P[|X(t) − X(τ)| > ε] = 0 for any τ ≥ 0, ε > 0) has an equivalent version which satisfies these
conditions. A criterion for the Dynkin condition to hold is that the infinitesimal moment condition
lim_{h→0} E[|X(t+h) − X(t)|^p | X(t) = x]/h = 0 holds uniformly in x for some p > 2 (Karlin & Taylor,
1981, p. 165).

⁵ Processes consisting of the superposition of a continuous process and a jump process are called Lévy
processes. Such processes are discussed by Protter (1990).
usually written in the differential form used in (10) and (11) rather than the more
familiar form involving derivatives because of the difficulty in assigning meaning to
expressions of the form dX/dt when the function X(t) is not of finite variation.⁶
Before proceeding, we distinguish two further special cases of (11). These are

$$dX(t) = \mu(t)\,dt + \sigma\,dB(t) \tag{12}$$

and

$$dX(t) = [\mu(t) - \gamma X(t)]\,dt + \sigma\,dB(t), \tag{13}$$

which are obtained by setting σ(t) = σ, a constant, and b(t) = 0 or b(t) = −γ,
respectively. Equations (12) and (13) generate, respectively, the two diffusion
processes that historically have been of the greatest importance in both theory and
applications, namely the Brownian motion (or Wiener) process and the
Ornstein–Uhlenbeck (OU) process.⁷ These processes are also the ones that have
found the greatest application in sensory and cognitive modeling (e.g., Rudd &
Brown, 1997; Busemeyer & Townsend, 1992, 1993; Diederich & Busemeyer, 1995;
Emerson, 1970; Heath, 1992; Ratcliff, 1978, 1981; Reed, 1973; Smith, 1995, 1998a).
They may be interpreted as describing the dynamics of a perfect and a leaky
integrator, respectively. Detailed accounts of the use of the OU process to model
leaky integration may be found in Busemeyer and Townsend (1992) and Smith (1995).
We attempt to give meaning to (11) using integration by parts. Proceeding by
analogy with the deterministic case (Karlin & Taylor, 1981, pp. 345–346), we
introduce the integrating factor exp[−∫₀ᵗ b(τ) dτ] and write (11) in the form

$$e^{-\int_0^t b(\tau)\,d\tau}\,[dX(t) - b(t)\,X(t)\,dt] = e^{-\int_0^t b(\tau)\,d\tau}\,[\mu(t)\,dt + \sigma(t)\,dB(t)].$$

The left-hand side of this equation can be recognized as an exact differential:

$$d\!\left[X(t)\,e^{-\int_0^t b(\tau)\,d\tau}\right] = e^{-\int_0^t b(\tau)\,d\tau}\,[\mu(t)\,dt + \sigma(t)\,dB(t)].$$

Both sides of the equation may therefore be integrated from 0 to t and the result
rearranged to yield

$$X(t) = \int_0^t e^{\int_\tau^t b(s)\,ds}\,\mu(\tau)\,d\tau + \int_0^t e^{\int_\tau^t b(s)\,ds}\,\sigma(\tau)\,dB(\tau), \tag{14}$$
where, as stipulated previously, the initial condition P[X(0)=0]=1 has been
assumed. Equation (14) may be interpreted as the output of a linear system with
⁶ The variation of a process is defined in the following way: Let Π⁽ᵐ⁾ = [t₀, t₁, ..., t_m] be a partition
of the compact interval [0, t]. The total variation of X(t) on [0, t] is defined as sup Σᵢ₌₁ᵐ |X(tᵢ) − X(tᵢ₋₁)|,
where the supremum is taken over all possible subdivisions of [0, t] with m arbitrarily large. A process
is said to be of finite variation if its total variation is finite. A fundamental property of the sample paths
of diffusion processes is that, with probability one, their total variation on compact intervals is infinite.

⁷ For historical reasons, SDEs of the form (13) are known as Langevin equations (see, e.g., Gardiner,
1985; van Kampen, 1992).
impulse response function b(t) (cf. Norman, 1981) and input μ(t) + σ(t) dB(t). That
is, the output is the superposition of outputs obtained by independently supplying
as inputs the deterministic function μ(t) and the temporally modulated white noise
process σ(t) dB(t).
To interpret the stochastic integral on the right of (14) we use a second integration
by parts to get

$$\int_0^t e^{\int_\tau^t b(s)\,ds}\,\sigma(\tau)\,dB(\tau) = \sigma(t)\,B(t) + \int_0^t e^{\int_\tau^t b(s)\,ds}\,[\sigma(\tau)\,b(\tau) - \sigma'(\tau)]\,B(\tau)\,d\tau. \tag{15}$$
In this form, the integral on the right of (15) may be seen to be a linear function
of the Brownian motion process B(t), which is the canonical independent-increments
diffusion process (Karatzas & Shreve, 1991). By virtue of the independent-increments
property, the functional central limit theorem ensures that B(t) is Gaussian
(Bhattacharya & Waymire, 1990, pp. 20–24) and thus can be fully characterized by
its first two moments, namely,

$$E[B(t)] = 0, \qquad \mathrm{Cov}[B(t)\,B(\tau)] = \min(t, \tau),$$

and, in particular, Var[B(t)] = t (cf. (10)). As a linear functional of a Gaussian
process, the stochastic integral in (15) will also be Gaussian, and X(t), as a sum of
deterministic and stochastic parts, will inherit this property also. We seek to
ascertain the mean, variance, and transition distribution of X(t).
Before proceeding, we observe that the special case b(t) = −γ, σ(t) = σ in (14)
and (15) yields a linear functional representation of the OU process (Bhattacharya
& Waymire, 1990, p. 581; Karlin & Taylor, 1981, p. 345)

$$X(t) = \int_0^t e^{-\gamma(t-\tau)}\,\mu(\tau)\,d\tau + \sigma\left[B(t) - \gamma \int_0^t e^{-\gamma(t-\tau)}\,B(\tau)\,d\tau\right], \tag{16}$$

and, trivially, when γ = 0, of the (time-inhomogeneous) Brownian motion process

$$X(t) = \int_0^t \mu(\tau)\,d\tau + \sigma B(t). \tag{17}$$
To motivate subsequent formal manipulations, we seek to ascertain the distributional
properties of the process X(t) in (16) heuristically, as the limit of sums. An
analogous, but more laborious, argument could be constructed for the more general
process (15) but would have little additional heuristic value. To this end, we
consider a special case of the difference equation (7)

$$\Delta X_i = \tilde{\mu} - \tilde{\gamma} X_{i-1} + \tilde{\sigma} W_i,$$

in which 0 < γ̃ < 1 and Wᵢ, i = 1, 2, ..., is a sequence of independent and identically
distributed Gaussian random variables with E(Wᵢ) = 0 and Var(Wᵢ) = 1. This
equation may be written

$$X_i = \tilde{\mu} + (1 - \tilde{\gamma})\,X_{i-1} + \tilde{\sigma} W_i$$

and solved recursively to yield

$$X_n = \sum_{i=0}^{n-1} (1 - \tilde{\gamma})^i\,\tilde{\mu} + \sum_{i=0}^{n-1} (1 - \tilde{\gamma})^i\,\tilde{\sigma} W_i. \tag{18}$$
To obtain moments when the interval between successive increments is arbitrarily
small, we arrange the passage to the limit such that n = t/Δ, σ²Δ = σ̃², γΔ = γ̃, and
μΔ = μ̃. With this substitution, (18) becomes

$$X_n = \Delta\mu \sum_{i=0}^{n-1} (1 - \Delta\gamma)^i + \sqrt{\Delta}\,\sigma \sum_{i=0}^{n-1} (1 - \Delta\gamma)^i\,W_i. \tag{19}$$

Taking expectations in (19), and using the expression for the partial sum of a
geometric series (convergence of which is guaranteed by the condition on γ̃), yields
the expected value

$$E[X_n] = \Delta\mu \sum_{i=0}^{n-1} (1 - \Delta\gamma)^i
= \Delta\mu \left[\frac{1 - (1 - \Delta\gamma)^n}{1 - (1 - \Delta\gamma)}\right]
= \frac{\mu}{\gamma}\left[1 - \left(1 - \frac{\gamma t}{n}\right)^{\!n}\,\right].$$

The standard identity lim_{n→∞} (1 + x/n)ⁿ = eˣ can then be applied to give

$$E[X_t] = \lim_{\Delta\to 0} E[X_n] = \frac{\mu}{\gamma}\,[1 - e^{-\gamma t}]. \tag{20}$$
The variance is obtained in a similar way,

$$\mathrm{Var}(X_n) = E\left[\left\{\sqrt{\Delta}\,\sigma \sum_{i=0}^{n-1} (1 - \Delta\gamma)^i\,W_i\right\}^2\right]
= \Delta\sigma^2 \sum_{i=0}^{n-1} (1 - \Delta\gamma)^{2i}\,E[W_i^2],$$

where the vanishing of the cross-product terms occurs by virtue of the fact that
Cov(Wᵢ Wⱼ) = 0, i ≠ j. As before, the series may be summed to give

$$\mathrm{Var}(X_n) = \Delta\sigma^2 \left[\frac{1 - (1 - \Delta\gamma)^{2n}}{1 - (1 - \Delta\gamma)^2}\right]
= \sigma^2 \left[\frac{1 - (1 - \gamma t/n)^{2n}}{2\gamma - \Delta\gamma^2}\right].$$

By a similar passage to the limit to that which yielded (20),

$$\mathrm{Var}(X_t) = \lim_{\Delta\to 0} \mathrm{Var}(X_n) = \sigma^2 \left\{\frac{1 - e^{-2\gamma t}}{2\gamma}\right\}. \tag{21}$$
For the special case in which γ = 0, a similar, but simpler, calculation gives

$$E[X_t] = \mu t, \qquad \mathrm{Var}(X_t) = \sigma^2 t. \tag{22}$$

Equations (20)–(22) give the mean and variance of the OU process and the
Brownian motion process as limits of sums for the time-homogeneous case, μ(t) = μ.
From these pairs of expressions, an important difference between the two processes
is apparent, whose psychological significance was first discussed by Busemeyer and
Townsend (1992). For the Brownian motion process, the time-dependent SNR is
E²[X_t]/Var(X_t) = (μ²/σ²) t; that is, the separation between signal and noise grows
unboundedly with t. In contrast, for the OU process E²[X_t]/Var(X_t) → 2μ²/(γσ²);
that is, the SNR grows to an asymptotic limit. This difference between the two
processes is a reflection of their characterizations as perfect and leaky integrators,
respectively.
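These limiting results are easy to check by simulation. The sketch below, with illustrative parameter values of our own choosing (μ = γ = σ = 1), iterates the discrete OU scheme of Eq. (7) over many paths and compares the Monte Carlo mean and variance at t = 2 with Eqs. (20) and (21); the last line shows the OU SNR approaching its asymptote 2μ²/(γσ²):

```python
import math
import random

random.seed(2)

# Illustrative (assumed) parameter values
mu, gamma, sigma = 1.0, 1.0, 1.0
t_max, dt, n_paths = 2.0, 0.01, 5000
n_steps = int(round(t_max / dt))

finals = []
for _ in range(n_paths):
    x = 0.0
    for _ in range(n_steps):
        # Eq. (7) with drift mu - gamma * x: the discrete OU scheme
        x += (mu - gamma * x) * dt + sigma * random.gauss(0.0, math.sqrt(dt))
    finals.append(x)

m = sum(finals) / n_paths
v = sum((x - m) ** 2 for x in finals) / (n_paths - 1)

m_theory = (mu / gamma) * (1.0 - math.exp(-gamma * t_max))                  # Eq. (20)
v_theory = sigma**2 * (1.0 - math.exp(-2.0 * gamma * t_max)) / (2 * gamma)  # Eq. (21)
print(m, m_theory)  # simulated vs. theoretical mean
print(v, v_theory)  # simulated vs. theoretical variance
print(m_theory**2 / v_theory, 2 * mu**2 / (gamma * sigma**2))  # SNR vs. asymptote
```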
Implicit in the formal manipulations used to obtain (20) and (21) as a limit of sums
was the assumption that an exchange of limit and integral of the form lim_{n→∞} E[X_n]
= E[lim_{n→∞} X_n] was possible.⁸ A rigorous justification for these manipulations is
provided by the Itô calculus, which also yields a way to obtain the moments of the
limit processes in (14) and (16) directly. To show the parallels between the results
obtained using these methods and those obtained from the limit of sums, we
consider the solution to the SDE (13) for the time-homogeneous OU process
(μ(t) = μ):

$$X(t) = \int_0^t e^{-\gamma(t-\tau)}\,\mu\,d\tau + \sigma \int_0^t e^{-\gamma(t-\tau)}\,dB(\tau). \tag{23}$$

An important property of the stochastic integral on the right in (23) is that, when
interpreted in an Itô sense, it preserves the martingale property of the integrator

⁸ The preceding construction suggests, but does not prove, that the sample paths of the limit process
X_t are continuous. To obtain a continuous-sample path process with the requisite properties requires a
more elaborate construction than the one given here. Details may be found in Karlin and Taylor (1975,
pp. 371–378) or Karatzas and Shreve (1991, pp. 56–59).
B(t).⁹ In general, a stochastic process Z(t) is said to be a martingale if E[|Z(t)|]
< ∞ for all t, and E[Z(t) | F_τ] = Z(τ), τ < t, where F_t is the sigma-field generated
by the process {Z(τ), 0 ≤ τ < t}. Loosely, F_t represents all of the information that
can be observed about Z(t) up to the time t. Specifically, for the Brownian motion
process, B(t), the martingale property implies that E[B(t) | B(0) = 0] = 0. Because
the stochastic integral preserves the martingale character of B(t),

$$E\left[\int_0^t e^{-\gamma(t-\tau)}\,dB(\tau)\,\Big|\,B(0)=0\right] = 0,$$

and therefore

$$E[X(t)] = \int_0^t e^{-\gamma(t-\tau)}\,\mu\,d\tau = \frac{\mu}{\gamma}\,(1 - e^{-\gamma t}).$$

In other words, the expected value of X(t) is the output of a linear system with
impulse response function −γ to a constant input μ. This result agrees with that
obtained from a limit of sums in (19).
To obtain the variance of X(t) we use the fundamental Itô isometry (Chung &
Williams, 1983, p. 27; Karatzas & Shreve, 1991, pp. 137–138)

$$E\left[\left\{\int_0^t X(\tau)\,dB(\tau)\right\}^2\right] = \int_0^t E[X^2(\tau)]\,d\tau. \tag{24}$$

The mean-square (L²-norm) convergence of this equation is a crucial step in
establishing the existence of the stochastic integral in a rigorous way. The equation
itself may be derived by considering the limits of sums and from the properties
dB(t) ~ √dt, E[dB²(t)] = dt, and Cov(dB(t) dB(τ)) = 0. Applying (24) to the
stochastic integral in (23) yields
$$\mathrm{Var}(X(t)) = \sigma^2\,E\left[\left\{\int_0^t e^{-\gamma(t-\tau)}\,dB(\tau)\right\}^2\right]
= \sigma^2 \int_0^t e^{-2\gamma(t-\tau)}\,d\tau
= \frac{\sigma^2}{2\gamma}\,[1 - e^{-2\gamma t}], \tag{25}$$

⁹ Stochastic integrals may be defined with respect to integrators other than Brownian motion. The
class of integrators for which the Itô integral is defined are called semimartingales. Such a process can
be decomposed into a sum of a finite variation process and a continuous local martingale. The latter are
processes which may fail globally to be martingales (e.g., because they are unbounded), but which
exhibit the martingale property at each of an increasing sequence of stopping times. The Itô integral with
respect to such processes preserves the local martingale property of the integrator. See Karatzas and
Shreve (1991) or Protter (1990) for details.
again in agreement with the result obtained in (21) as a limit of sums. The preceding
method may be applied straightforwardly to obtain the moments for the general
Gaussian process in (14).
From a knowledge of the first two moments of X(t) and the fact that it is
Gaussian, the transition distribution (or, equivalently, the transition density) of
X(t) may be written down immediately. The transition density f(x, t | y, τ) is
defined as

$$f(x, t \mid y, \tau) = \frac{d}{dx}\,F(x, t \mid y, \tau). \tag{26}$$
The properties of this density are suggested graphically in Fig. 3. For later reference
we write down the transition densities for the time-inhomogeneous Brownian
motion and OU processes. To this end, we let the function m_X(t; τ) denote the
time-dependent mean for a process starting at time τ evaluated at time t. The
transition density for the inhomogeneous Brownian motion process in (12) is

$$f(x, t \mid y, \tau) = \frac{1}{\sqrt{2\pi\sigma^2(t-\tau)}}\,\exp\left[-\frac{(x - y - m_X(t;\tau))^2}{2\sigma^2(t-\tau)}\right] \tag{27}$$

with

$$m_X(t;\tau) = \int_\tau^t \mu(s)\,ds.$$

The corresponding density for the inhomogeneous OU process in (13) is

$$f(x, t \mid y, \tau) = \sqrt{\frac{\gamma}{\pi\sigma^2\,[1 - e^{-2\gamma(t-\tau)}]}}\;
\exp\left\{-\frac{\gamma\,[x - y\,e^{-\gamma(t-\tau)} - m_X(t;\tau)]^2}{\sigma^2\,[1 - e^{-2\gamma(t-\tau)}]}\right\}, \tag{28}$$

with

$$m_X(t;\tau) = \int_\tau^t e^{-\gamma(t-s)}\,\mu(s)\,ds.$$

These densities will be used subsequently in the derivation of solutions to the
one-sided and two-sided first passage time problems for X(t).
The general, linear SDE in (10), despite its superficial similarity to the Gaussian
equation (11), cannot be dealt with in the same elementary way. Attempts to do so
result in integrals containing terms of the form X(t) dB(t), in which both the integrator
and the integrand are stochastic, and these cannot be reduced to a Gaussian functional
by the device of twice integrating by parts. Rather, the key to solving equations of the
form (10) is the Itô transformation formula, which is the stochastic counterpart of the
change of variables formula from ordinary calculus. Specifically, let Z(t) be a stochastic
process that satisfies the SDE

$$dZ(t) = \mu(Z(t), t)\,dt + \sigma(Z(t), t)\,dB(t),$$

and let f(z, t) be a continuous function of two variables which is twice-differentiable
in its first argument and once-differentiable in its second. Let f′_z and f″_zz denote,
respectively, the first and second partial derivatives of f with respect to z, and let f′_t
denote the first partial derivative with respect to time. Then the transformed process
f(Z(t), t) satisfies the SDE

$$df(Z(t), t) = [\,f'_z(Z(t), t)\,\mu(Z(t), t) + f'_t(Z(t), t) + \tfrac{1}{2} f''_{zz}(Z(t), t)\,\sigma^2(Z(t), t)\,]\,dt
+ f'_z(Z(t), t)\,\sigma(Z(t), t)\,dB(t). \tag{29}$$

Formally, (29) is derived by expanding f(Z(t), t) in a truncated Taylor series
around (Z(t), t) and discarding terms of order dB(t) dt, (dt)², and higher. The key
to performing this expansion in a stochastic setting is the fact that dB²(t) ~ dt. This
yields

$$f(Z(t+dt), t+dt) = f(Z(t), t) + f'_z(Z(t), t)\,dZ(t) + f'_t(Z(t), t)\,dt + \tfrac{1}{2} f''_{zz}(Z(t), t)\,d^2 Z(t)$$

or

$$df(Z(t), t) = f'_z(Z(t), t)\,[\mu(Z(t), t)\,dt + \sigma(Z(t), t)\,dB(t)]
+ f'_t(Z(t), t)\,dt + \tfrac{1}{2} f''_{zz}(Z(t), t)\,\sigma^2(Z(t), t)\,dt,$$

because d²Z(t) ~ σ²(Z(t), t) dt. This result differs from the one that would be obtained
if Z(t) were deterministic by the addition of the term f″_zz(Z(t), t) σ²(Z(t), t) dt/2,
reflecting the inclusion in the expansion of the term d²B(t). The presence of this term
means that the transformation rules for SDEs differ from those of ordinary calculus.

In the Appendix the Itô transformation rule is used to obtain the solution of the
general linear SDE (10) and of the related homogeneous equation

$$dX(t) = b(t)\,X(t)\,dt + c(t)\,X(t)\,dB(t). \tag{30}$$
Specifically, let X(0) be the (random) initial value of X(t) and define the auxiliary
function

$$U(t) = \exp\left[\int_0^t b(\tau)\,d\tau + \int_0^t c(\tau)\,dB(\tau) - \tfrac{1}{2}\int_0^t c^2(\tau)\,d\tau\right]. \tag{31}$$

The solution of the homogeneous equation (30) may then be written

$$X(t) = X(0)\,U(t). \tag{32}$$
The solution of the general equation (10) is

$$X(t) = U(t)\left\{X(0) + \int_0^t \frac{1}{U(\tau)}\,[\mu(\tau)\,d\tau + \sigma(\tau)\,dB(\tau) - c(\tau)\,\sigma(\tau)\,d\tau]\right\}. \tag{33}$$
From a modeling perspective, it is important to know why one would choose the
representation (10) instead of (11) (or vice versa). The SDE (11), in which the
diffusion term may depend on time, but not on state, is appropriate when the noise
in the decision stage is exogenous or input noise, that is, noise arising because of
statistical variation in the stimulus encoding process, as distinct from noise intrinsic
to the decision stage itself. The possibility for time-inhomogeneity in the drift and
diffusion terms expressed by (11) allows for various forms of nonstationarity in the
encoding process, whose mean and variance may vary jointly as a function of time.
One possibility of theoretical interest that can be represented in this way is
Poisson-like noise (cf. Link, 1992), which is obtained by setting σ(t) = √μ(t) in (11).
With this constraint, the stimulus-driven increments to X(t) form a Gaussian process
whose time-dependent mean and variance are equal. Another possibility is
Weber-like noise, which is obtained by setting σ(t) = μ(t). This represents a Gaussian
process in which the standard deviation (rather than the variance) of the increment
process grows in proportion to its mean.
An alternative perspective to that which relegates all of the noise in the system
to the input side is to view the decision stage as analogous to an audio amplifier,
in which noise grows in proportion to signal strength or, in this case, in proportion
to the level of activation in the decision stage. In these circumstances, the dynamics
of the system are likely to be represented more appropriately by a variant of (10),
in which the variance of the increments to X(t) is determined endogenously rather
than exogenously, by the instantaneous value of X(t). The homogeneous equation
(30) describes a system of this kind, in which the noise is purely endogenous. The
general linear equation (10) that is solved by (33) represents a system in which
exogenous, stimulus-driven noise and endogenous decision-stage noise both contribute
to overall system variability.
Although the representation in (33) is attractive in its generality, a number of
complications attend its use in cognitive models. First, as noted previously, the
process X(t) in (33) is in general not Gaussian. Higher moments of X(t) may be
obtained using a method described in Gardiner (1985, pp. 112–113), but no general
closed-form expression for the transition density of the process appears to exist.
This limitation is unimportant if the process X(t) is used to represent a
time-dependent signal detection model, in which the sample size is determined by
factors external to the accumulation process, as knowledge of the moments suffices to
determine the required statistics. However, it is relevant to models in which a first
passage time problem must be solved, which require an explicit expression for the
transition density, as described in the following section. One important special case
for which an explicit representation of the transition density exists is the
homogeneous process (32). Although the process X(t) in this equation is not Gaussian,
it is immediate from (31) and (32) and the previous characterization of
independent-increments processes that a Gaussian process is obtained by the transformation
log[X(t)/X(0)] = log U(t). Indeed, from the martingale property of the stochastic
integral and the Itô isometry (24), the mean and variance of log[X(t)/X(0)] are

$$m_X(t; 0) = \int_0^t b(\tau)\,d\tau - \tfrac{1}{2}\int_0^t c^2(\tau)\,d\tau$$

and

$$v_X(t; 0) = \int_0^t c^2(\tau)\,d\tau,$$

respectively, where the notation is that of (27) and (28). This means that the
process X(t) is lognormal (i.e., a random variable whose logarithm is Gaussian)
with transition density

$$f(x, t \mid y, \tau) = \frac{1}{x\sqrt{2\pi v_X(t;\tau)}}\,\exp\left\{-\frac{[\log(x/y) - m_X(t;\tau)]^2}{2 v_X(t;\tau)}\right\}. \tag{34}$$
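A quick Monte Carlo check of this lognormal characterization, with arbitrarily chosen constant coefficients b and c: simulating the homogeneous equation (30) by the Euler scheme, the sample mean and variance of log[X(t)/X(0)] should approach m_X(t; 0) = (b − c²/2)t and v_X(t; 0) = c²t:

```python
import math
import random

random.seed(3)

# Homogeneous equation (30) with constant (illustrative) coefficients b and c
b, c, x0 = 0.1, 0.3, 1.0
t_max, dt, n_paths = 1.0, 0.005, 5000
n_steps = int(round(t_max / dt))

logs = []
for _ in range(n_paths):
    x = x0
    for _ in range(n_steps):
        # Euler step for dX = b X dt + c X dB
        x *= 1.0 + b * dt + c * random.gauss(0.0, math.sqrt(dt))
    logs.append(math.log(x / x0))

m = sum(logs) / n_paths
v = sum((u - m) ** 2 for u in logs) / (n_paths - 1)
print(m, (b - 0.5 * c**2) * t_max)  # mean of log[X(t)/X(0)]
print(v, c**2 * t_max)              # variance of log[X(t)/X(0)]
```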
A second complication involves the existence of competing interpretations of the
stochastic integral in the solution to the SDE (10). The difficulty in finding an
explicit representation for the process X(t) that solves this equation arises from the
problem of giving meaning to stochastic integrals such as

$$Z(t) = \int_0^t X(\tau)\,dB(\tau),$$

in which both the integrator and the integrand are stochastic. The martingale
property that was used to derive moments of the processes X(t) in (23) and (32)
arises from the presumption that all stochastic integrals are interpreted in a
particular way. According to this Itô interpretation, the integral is constructed as a
limit of sums in the following manner: Let Π⁽ᵐ⁾ = [t₀, t₁, ..., t_m] be a partition of the
interval [0, t] such that 0 = t₀ < t₁ < ··· < t_m = t. The Itô integral is defined to be
the limit in mean square as the mesh of Π⁽ᵐ⁾ goes to zero of the sum

$$\int_0^t X(t)\,dB(t) = \lim_{\max(t_{i+1}-t_i)\to 0}\; \sum_{i=0}^{m-1} X(t_i)\,[B(t_{i+1}) - B(t_i)].$$

A crucial part of this definition is the fact that the integrand in the interval
(t_i, t_{i+1}] is defined to equal X(t_i), its value at the left end point of the interval. In
other words, the integrand is nonanticipating. The martingale-preserving property of
the stochastic integral requires that it be defined in this way.
An alternative definition of the stochastic integral was proposed by Stratonovich
(e.g., Gardiner, 1985; Karatzas & Shreve, 1991; Karlin & Taylor, 1981) in which it
is defined instead to be the limit of sums of the form

$$\int_0^t X(t) \circ dB(t) = \lim_{\max(t_{i+1}-t_i)\to 0}\; \sum_{i=0}^{m-1} \tfrac{1}{2}\,[X(t_{i+1}) + X(t_i)]\,[B(t_{i+1}) - B(t_i)].$$
(The composition symbol "∘" is a standard notation for the Stratonovich integral.)
In this version of the integral, the value of the integrand in the interval (t_i, t_{i+1}]
is taken to be the average of its values at the right and left endpoints. It was shown
by Wong and Zakai (Karlin & Taylor, 1981; Wong & Hajek, 1985) that this
version of the integral arises naturally when the integral is defined "pathwise," by
approximating the irregular sample path of B(t) by a finite-variation Gaussian
process. This is equivalent to constructing the integral as a limit of sums using as
integrator a nonwhite Gaussian noise; that is, a broad-spectrum noise process,
B⁽ᵐ⁾(t), in which successive values of the process are not required to be delta
correlated, but for which the autocorrelation function Cov[B⁽ᵐ⁾(t_{i+1}) B⁽ᵐ⁾(t_i)] goes
to zero with the mesh of Π⁽ᵐ⁾. As discussed by Gardiner (1985) and van Kampen
(1992), the Stratonovich interpretation of the stochastic integral, although it fails to
preserve the martingale property of the noise process, may be more relevant to
modeling real systems, in which total noise power is finite. Further, the Stratonovich
integral, although it is defined for a more restricted range of integrands than is the
Itô integral, also has the advantage that it transforms according to the familiar rules
of calculus (cf. (29)).
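The practical difference between the two definitions is easy to exhibit for the integrand X(t) = B(t), for which both integrals have closed forms: the Itô integral of B dB is (B²(t) − t)/2, whereas the Stratonovich integral is B²(t)/2, as ordinary calculus would suggest. A sketch on a single simulated path (grid size and horizon are arbitrary choices):

```python
import math
import random

random.seed(4)

# One Brownian path on a fine grid of [0, t]
t, n = 1.0, 10000
dt = t / n
B = [0.0]
for _ in range(n):
    B.append(B[-1] + random.gauss(0.0, math.sqrt(dt)))

# Left-endpoint (Ito) and averaged-endpoint (Stratonovich) sums
ito = sum(B[i] * (B[i + 1] - B[i]) for i in range(n))
strat = sum(0.5 * (B[i] + B[i + 1]) * (B[i + 1] - B[i]) for i in range(n))

print(ito, 0.5 * B[n]**2 - 0.5 * t)  # Ito integral: (B(t)^2 - t)/2
print(strat, 0.5 * B[n]**2)          # Stratonovich integral: B(t)^2/2
```

The midpoint sum telescopes exactly to B²(t)/2 on any partition, while the left-endpoint sum differs from it by approximately t/2, the accumulated quadratic variation term.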
Despite these differences in the definitions of the two versions of the integral,
there is a systematic relationship between the Itô and the Stratonovich solutions of
the SDE (10). Specifically, the solution of the general SDE

$$dX(t) = \mu(X(t), t)\,dt + \sigma(X(t), t) \circ dB(t),$$

in which the stochastic integral is interpreted in the Stratonovich sense, is equivalent
to the solution of the modified SDE

$$dX(t) = [\mu(X(t), t) + \tfrac{1}{2}\,\sigma(X(t), t)\,\sigma'_x(X(t), t)]\,dt + \sigma(X(t), t)\,dB(t),$$

in which the integral is interpreted in the Itô sense. That is, the Itô solution and the
Stratonovich solution differ by the presence or absence in the drift coefficient of a
correction term that is equal to the product of half the diffusion coefficient and its
derivative with respect to its space coordinate. Because of the martingale-preserving
properties of the Itô integral, calculations are usually carried out using Itô integrals.
The Stratonovich solution, if desired, may be obtained by adding a correction term
to the original equation in this way. Details may be found in Karlin and Taylor
(1981).
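The correction term can be exhibited numerically for the purely endogenous diffusion dX = cX dB (an illustrative special case of (30) with b(t) = 0). Here σ(x) = cx, so the correction is ½σσ′_x = ½c²X, and the closed-form solutions are exp(cB(t) − c²t/2) under the Itô interpretation and exp(cB(t)) under the Stratonovich interpretation. The sketch runs an Euler (Itô) scheme and a Heun-type midpoint (Stratonovich) scheme on the same noise sequence:

```python
import math
import random

random.seed(5)

# dX = c X dB with sigma(x) = c x; the constant c is an illustrative choice
c, t, n, x0 = 0.5, 1.0, 10000, 1.0
dt = t / n
dB = [random.gauss(0.0, math.sqrt(dt)) for _ in range(n)]
B_t = sum(dB)

x_ito = x_strat = x0
for db in dB:
    x_ito *= 1.0 + c * db                        # Euler scheme (Ito)
    pred = x_strat * (1.0 + c * db)              # predictor for the Heun
    x_strat += 0.5 * c * (x_strat + pred) * db   # (midpoint) Stratonovich scheme

print(x_ito, math.exp(c * B_t - 0.5 * c**2 * t))  # Ito solution exp(cB - c^2 t/2)
print(x_strat, math.exp(c * B_t))                 # Stratonovich solution exp(cB)
print(math.log(x_strat / x_ito), 0.5 * c**2 * t)  # drift correction c^2 t / 2
```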
Evidently, the behavior ascribed to the solution process X(t) may differ markedly,
depending on whether the Itô or the Stratonovich interpretation of the stochastic
integral is chosen. Unfortunately, the level of specificity required to choose between
these competing interpretations in a principled way is unlikely to be present in most
cognitive models, in which the noise process is hypothesized and unobservable. In
contrast, no such arbitrariness exists in relation to the restricted equation (11), in
which the diffusion coefficient is spatially homogeneous, because in this situation
the correction term σ′_x(X(t), t) is zero, and the Itô and Stratonovich solutions
coincide.
THE FIRST PASSAGE TIME PROBLEM FOR TIME-INHOMOGENEOUS
DIFFUSION PROCESSES
The Single-Barrier Case
This section takes up the second of the two questions considered in this article,
namely, the solution of the first passage time problem for a diffusion process in the
presence of possible time inhomogeneities in the drift and diffusion coefficients of
the defining SDE. The one-sided and two-sided problems will be considered in turn.
We formulate the solution to this problem in a fairly general setting, in which the
absorbing barrier(s) are not constrained to be fixed, but may vary smoothly with
time. There are two reasons for formulating the problem in this general way. The
first is that there may be a theoretical justification in some applications for assuming
that the decision criteria used by subjects vary systematically with time. Dynamic
variations in criteria of this kind have been proposed in various settings by a
number of researchers, including Busemeyer and Rapoport (1988), Hockley and
Murdock (1987), and Viviani (1979a, 1979b).
The second reason for considering diffusion processes bounded by time-varying
absorbing barriers is that a large class of time-inhomogeneous problems can be
reduced to the homogeneous case by an appropriate transformation of the state
space and/or the time scale. This approach, which was described by Ricciardi
(1976) and Ricciardi and Sato (1983), was used in RT models described by Heath
(1992) and Smith (1995). The essence of the transformation approach is that it
seeks to reformulate problems involving time-inhomogeneous processes such as
the OU process in (16) into equivalent problems involving time-homogeneous
processes by the removal of the integrated drift term. Although we consider this
approach briefly here, we prefer to work with the inhomogeneous process directly.
The basic probabilistic setup is depicted in Fig. 4. As before, X(t) is a diffusion
process defined on a state space R = (−∞, ∞), the real numbers, with t ∈ [0, ∞),
the positive real line. As discussed in the previous section, we assume that X(t) has
a known transition density f(x, t | y, τ). We define a(t) to be a smooth (specifically,
twice-differentiable) absorbing barrier, as shown in Fig. 4, and assume that the
process starts in state x₀ at time t₀ with probability one. For convenience we
assume x₀ < a(t₀). This is the case that arises most naturally in cognitive settings,
so it is the only one we consider explicitly. The solution to the converse problem
x₀ > a(t₀) can be obtained by analogy with the one described here. We define
g[a(t), t | x₀, t₀] = g_T(t), the first passage time density for X(t) through the
time-varying absorbing boundary a(t). In this notation the subscript denoting the
random variable is omitted, but the conditional dependency on the initial time,
state, and absorbing barrier is made explicit.
The basic tool for the analysis of such processes is a simple and intuitive renewal
equation, attributed by Durbin (1971) to Fortet (1943). To obtain this equation we
consider a decomposition of the probability density function of the sample path of
an unconstrained process that originates at (x₀, t₀) and passes through a point a(t)
on the absorbing barrier at time t, as shown in Fig. 4. By definition, the probability
density of sample paths passing through this point is f[a(t), t | x₀, t₀]. Because
FIG. 4. Renewal representation of the one-sided first passage time problem for a diffusion process
X(t) through a time-dependent absorbing barrier a(t). Equation (35) gives a decomposition of
f[a(t), t | x₀, t₀], the probability density associated with a point on the absorbing barrier a(t) at time
t. The time of the first barrier crossing is τ, τ ≤ t. After crossing at a(τ) the process makes a transition
to a(t) in time t − τ. In this figure, the symbol t does double duty to denote the time index for the
process and the arbitrary point at which Eq. (35) is evaluated.
x_0 < a(t_0), all sample paths passing through this point must have crossed the
barrier at least once, at or before time t. Let the time of the first barrier crossing
be τ, τ ≤ t. After crossing the barrier at τ, the process must subsequently have made
a transition from a(τ) to a(t) in the interval t − τ.^10 The number of boundary
crossings subsequent to the first is immaterial. By virtue of the (strong) Markov
property of X(t) the probability densities associated with the portions of the sample
path before and after τ are g[a(τ), τ | x_0, t_0] and f[a(t), t | a(τ), τ], respectively,
and the joint density of the event {T = τ, X(t) = a(t) | X(t_0) = x_0} is
g[a(τ), τ | x_0, t_0] × f[a(t), t | a(τ), τ]. Because this decomposition of the sample
paths passing through (a(t), t) is exhaustive and mutually exclusive, the probability
density associated with paths passing through this point may be obtained by
summing across values of τ to yield
    f[a(t), t | x_0, t_0] = ∫_{t_0}^t g[a(τ), τ | x_0, t_0] f[a(t), t | a(τ), τ] dτ.    (35)
Further discussion of this equation and its ramifications may be found in Durbin
(1971) and van Kampen (1992, pp. 307-311).
Equation (35) is formally a Volterra equation of the first kind, in which the
unknown first passage time density g appears only under the integral sign.
Although this equation is in principle soluble, the problem in attempting to work
with it directly is that the kernel of the equation is (weakly) singular; that is,
^10 When deriving (35) it is technically more correct to consider the probability density function, not
of the point a(t), but of a(t)⁺ = lim_{ε↓0} [a(t) + ε]. Then τ is always strictly less than t and the transition
density f[a(t), t | a(τ), τ] is always well defined. But as the event τ = t is of probability zero, the additional
rigor of this refinement does little more than burden the notation and obscure the essential simplicity
of the relationship expressed by (35).
lim_{τ→t} f[x, t | y, τ] = δ(x − y), the Dirac delta function. The consequence of this
divergence is, as discussed by Durbin (1971), that any numerical method that
attempts to approximate the integral in (35) directly, as the limit of sums, will be
inherently unstable. To circumvent this problem, recent research has sought to find
ways to transform the equation to remove the singularity and thus allow the
development of stable approximation methods. Most notably, Ricciardi and
coworkers (e.g., Buonocore, Nobile, & Ricciardi, 1987; Buonocore, Giorno, Nobile, &
Ricciardi, 1990; Giorno, Nobile, Ricciardi, & Sato, 1989; Ricciardi, Sacerdote, &
Sato, 1983, 1984) have considered a variety of ways for stably transforming this
equation that lead to practical computational methods. The most tractable and
efficient of these methods, and hence the one that is most likely to be of value to
cognitive modelers, was derived by Buonocore et al. (1987, 1990). This approach
forms the basis of the methods described here.
We follow Buonocore et al. (1987) in showing how to transform (35) into a
Volterra integral equation of the second kind, in which the kernel goes to zero with
t − τ. We then describe a discrete time analogue of the resulting equation which
lends itself readily to the development of an efficient numerical algorithm. Following
Buonocore et al. (1987), we present these results in the form of a lemma followed by
a theorem. In their work, they derived an expression for the kernel of the transformed
equation that is somewhat more general than is needed for practical computational
problems. Accordingly, we present a version of their results which is specialized for
these applications. The reader is referred to the original article for further details.
Lemma 1. Define

    φ[a(t), t | y, τ] = ∂F[a(t), t | y, τ]/∂t,    (36)

where F[a(t), t | y, τ] is the transition distribution in (1) evaluated at the point a(t).
Then

    g[a(t), t | x_0, t_0] = −2φ[a(t), t | x_0, t_0]
        + 2 ∫_{t_0}^t g[a(τ), τ | x_0, t_0] φ[a(t), t | a(τ), τ] dτ.    (37)
Proof. Recalling the definition of the transition density in (26), we integrate
over the state variable in the renewal equation (35) from a(t) to ∞ and then
exchange the order of integration to obtain

    1 − F[a(t), t | x_0, t_0] = ∫_{t_0}^t g[a(τ), τ | x_0, t_0] ∫_{a(t)}^∞ f[x, t | a(τ), τ] dx dτ
                             = ∫_{t_0}^t g[a(τ), τ | x_0, t_0] {1 − F[a(t), t | a(τ), τ]} dτ.

Differentiating this equation with respect to t, making use of the elementary
relation

    d/dt ∫_{t_0}^t h(t, τ) dτ = h(t, t) + ∫_{t_0}^t ∂h(t, τ)/∂t dτ

and the definition (36), yields

    −φ[a(t), t | x_0, t_0] = g[a(t), t | x_0, t_0] − F[a(t), t | a(t), t] g[a(t), t | x_0, t_0]
        − ∫_{t_0}^t g[a(τ), τ | x_0, t_0] φ[a(t), t | a(τ), τ] dτ.

By virtue of the relation^11

    lim_{τ→t} F[a(t), t | a(τ), τ] = 1/2,

the second term on the right-hand side reduces to −g[a(t), t | x_0, t_0]/2. The resulting
equation may then be rearranged to yield

    g[a(t), t | x_0, t_0] = −2φ[a(t), t | x_0, t_0]
        + 2 ∫_{t_0}^t g[a(τ), τ | x_0, t_0] φ[a(t), t | a(τ), τ] dτ,

as asserted by the preceding lemma.
Theorem 1. Define

    Ψ[a(t), t | y, τ] = φ[a(t), t | y, τ] + k(t) f[a(t), t | y, τ],    (38)

where k(t) is an arbitrary, continuous function of time, defined on [0, ∞), and
φ[a(t), t | y, τ] is as defined in Lemma 1. Then

    g[a(t), t | x_0, t_0] = −2Ψ[a(t), t | x_0, t_0]
        + 2 ∫_{t_0}^t g[a(τ), τ | x_0, t_0] Ψ[a(t), t | a(τ), τ] dτ.    (39)
^11 This limit relation is attributed by Buonocore et al. (1987) to Fortet (1943). The property is rather
unintuitive because it asserts that as the interval t − τ becomes small, P[X(t) > X(τ) | X(τ) = x]
approaches P[X(t) < X(τ) | X(τ) = x] for all x in the interior of the state space of X(t), regardless of
its infinitesimal moments. This property arises because in the SDE (6) and its various special cases, the
deterministic change μ(X(t), t) dt is of order dt, whereas the stochastic perturbation σ(X(t), t) dB(t) is of
order √dt. Because the stochastic part of dX(t) goes to zero more slowly than the deterministic part,
the limiting transition distribution is determined by the (symmetrical) stochastic term only.
Remark. Equations (37) and (39) are Volterra integral equations of the second
kind, in which the unknown first passage time density g[a(t), t | x_0, t_0] is defined
jointly in terms of its values at preceding times, t_0 ≤ τ < t, and the values of a known
kernel function (φ[a(t), t | y, τ] or Ψ[a(t), t | y, τ], respectively). The purpose of
the greater generality of the kernel in (39) is that the arbitrariness of the function
k(t) may be used to ensure that the kernel goes to zero with t − τ. We derive
expressions for the kernel function for specified diffusion processes subsequently.
Proof. Substituting the definition of Ψ[a(t), t | y, τ] from (38) into the right-hand
side of (39) yields

    g[a(t), t | x_0, t_0] = −2φ[a(t), t | x_0, t_0]
        + 2 ∫_{t_0}^t g[a(τ), τ | x_0, t_0] φ[a(t), t | a(τ), τ] dτ
        + 2k(t) { −f[a(t), t | x_0, t_0]
                  + ∫_{t_0}^t g[a(τ), τ | x_0, t_0] f[a(t), t | a(τ), τ] dτ }.

The sum of the first and second terms on the right-hand side of this equation is
g[a(t), t | x_0, t_0], by Lemma 1. The integral in braces in the last line may be
recognized as the right-hand side of the renewal equation (35) and is thus equal to
f[a(t), t | x_0, t_0], by virtue of that equation. The entire expression in braces is
therefore zero. Therefore the left- and right-hand sides of the equation are equal,
which completes the proof.
With k(t) chosen in a manner to be determined subsequently,
lim_{τ→t} Ψ[a(t), t | a(τ), τ] = 0. Under these circumstances, as discussed by
Buonocore et al. (1987), the integral in (39) may be approximated numerically by a
sum of the form

    g[a(t_0 + Δ), t_0 + Δ | x_0, t_0] = −2Ψ[a(t_0 + Δ), t_0 + Δ | x_0, t_0]

    g[a(t_0 + kΔ), t_0 + kΔ | x_0, t_0] = −2Ψ[a(t_0 + kΔ), t_0 + kΔ | x_0, t_0]
        + 2Δ Σ_{j=1}^{k−1} g[a(t_0 + jΔ), t_0 + jΔ | x_0, t_0]
            × Ψ[a(t_0 + kΔ), t_0 + kΔ | a(t_0 + jΔ), t_0 + jΔ],
        k = 2, 3, ....    (40)

These expressions, which may be implemented easily on a personal computer,
converge to the true first passage time density in (39) as the interval of approximation,
FIG. 5. Renewal representation for the two-sided first passage time problem for a diffusion process
X(t) through a pair of time-dependent absorbing barriers a_1(t) and a_2(t). Equations (41a) and (41b) give
decompositions of f[a_1(t), t | x_0, t_0] and f[a_2(t), t | x_0, t_0], the probability densities associated with
points on the upper and lower barriers, a_1(t) and a_2(t), at time t. The first barrier crossing is at a_1(τ)
or a_2(τ) at time τ, τ ≤ t. The process then makes a transition to either a_1(t) or a_2(t) in time t − τ.
Δ, becomes small.^12 The reader is referred to the original article for a convergence
proof.
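As a concrete illustration of the recursion (40), the sketch below (Python with NumPy; the function name and parameter values are my own, not from the article) computes the first passage time density of a Wiener process with constant drift through a linear barrier. The kernel used is the one the article derives for this process later, in Eq. (57). Because crossing a(t) = a_0 + bt is equivalent to the drift-reduced process crossing the constant barrier a_0, the output can be checked against the closed-form Wald (inverse Gaussian) density with drift μ − b.

```python
import numpy as np

def fpt_density_wiener(mu, sigma, a0, b, T, n):
    """Recursion (40) for a Wiener process X(t) with drift mu, diffusion
    coefficient sigma**2, and X(0) = 0, through the linear barrier
    a(t) = a0 + b*t, on the grid t_k = k*Delta with Delta = T/n."""
    delta = T / n
    t = delta * np.arange(1, n + 1)
    a = a0 + b * t                                   # barrier a(t_k)

    def f(x, t1, y, t0):
        # free Gaussian transition density of the process (cf. Eq. (27))
        v = sigma ** 2 * (t1 - t0)
        return np.exp(-(x - y - mu * (t1 - t0)) ** 2 / (2.0 * v)) / np.sqrt(2.0 * np.pi * v)

    def kernel(x, t1, y, t0):
        # Psi[a(t), t | y, tau] of Eq. (57); a'(t) = b for a linear barrier
        return 0.5 * f(x, t1, y, t0) * (b - mu - (x - y - mu * (t1 - t0)) / (t1 - t0))

    g = np.zeros(n)
    g[0] = -2.0 * kernel(a[0], t[0], 0.0, 0.0)
    for k in range(1, n):
        # 2*Delta times the sum over earlier grid points; the j = k term is
        # absent because the kernel vanishes as tau -> t
        corr = np.dot(g[:k], kernel(a[k], t[k], a[:k], t[:k]))
        g[k] = -2.0 * kernel(a[k], t[k], 0.0, 0.0) + 2.0 * delta * corr
    return t, g
```

Because both endpoint values of the integrand vanish (g at t_0 and the kernel at τ = t), the sum behaves like a composite trapezoidal rule, which is why no special treatment of the endpoints is needed.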
The Two-Barrier Case
The renewal approach described in the preceding section may also be applied to
the solution of the first passage time problem for a diffusion process constrained by
a pair of time-varying absorbing boundaries, as shown in Fig. 5. Buonocore et al.
(1990) showed that the same approach to regularizing the kernel of the integral
equation can be applied to the two-barrier problem to obtain a system of equations
that provides the basis for a stable numerical approximation. Our presentation
summarizes the main results from their article. To stress the similarity in the
approaches to the one- and the two-barrier problems, we again present the results
in the form of a lemma followed by a theorem.
The basis for the solution of the two-barrier problem is a pair of simultaneous
renewal equations, rather than the single equation considered previously. The
probabilistic foundation for these equations is suggested by Fig. 5. As in the previous
section, we let f(x, t | y, τ) denote the free transition density of the diffusion process
^12 For homogeneous Wiener and OU processes with time-varying boundaries, Buonocore et al. (1987)
showed that convergence of (40) requires that the boundary a(t) be twice-differentiable, i.e., be of class
C²[0, ∞). The equivalent formulation for time-inhomogeneous processes with constant boundaries
requires that m_X(t; τ), the mean of X(t) in (27) or (28), be of class C²[0, ∞). In either instance, this
degree of smoothness is required to obtain convergence of Ψ[a(t), t | a(τ), τ] by L'Hôpital's rule in the
final step of Theorem 4. A model that fails to exhibit these smoothness requirements is the double
half-wave rectifier model of Fig. 2a. There the functions m_X(t; τ) are only C¹[0, ∞) because the
derivatives of [μ(t) − c]⁺ and [μ(t) − c]⁻ have one or more points of discontinuity. This problem can be
circumvented by approximating these latter functions by functions with derivatives that are everywhere
continuous.
X(t) and let g_1[a_1(t), t | x_0, t_0] and g_2[a_2(t), t | x_0, t_0] denote the first passage time
densities of X(t) through the absorbing barriers a_1(t) and a_2(t), respectively, where
P[X(t_0) = x_0] = 1 and a_2(t_0) < x_0 < a_1(t_0).
We consider a decomposition of f[a_1(t), t | x_0, t_0], the probability density of a
point on the boundary a_1(t) at time t, given that the process started at the point
x_0 < a_1(t_0) at time t_0. Evidently, the process must have made at least one boundary
crossing at some time τ ≤ t. As shown in Fig. 5, this may occur in one of two ways.
The process may first cross the upper boundary a_1(τ) at time τ and then make a
transition to the point a_1(t) in the interval t − τ. The probability densities associated
with the portions of the sample path before and after the first boundary crossing
will then be g_1[a_1(τ), τ | x_0, t_0] and f[a_1(t), t | a_1(τ), τ], respectively. Alternatively,
the process may first cross the lower boundary a_2(τ) at time τ and then subsequently
make a transition to a point on the upper barrier a_1(t) at time t. The probability
densities associated with the two portions of the sample path will then be
g_2[a_2(τ), τ | x_0, t_0] and f[a_1(t), t | a_2(τ), τ]. Because this decomposition according to
time of first boundary crossing is exhaustive and mutually exclusive, and by virtue of
the strong Markov character of X(t), the following relationship holds for sample paths
crossing the upper boundary at time t:
    f[a_1(t), t | x_0, t_0] = ∫_{t_0}^t g_1[a_1(τ), τ | x_0, t_0] f[a_1(t), t | a_1(τ), τ] dτ
        + ∫_{t_0}^t g_2[a_2(τ), τ | x_0, t_0] f[a_1(t), t | a_2(τ), τ] dτ.    (41a)
An analogous decomposition of f[a_2(t), t | x_0, t_0], the probability density of a
point a_2(t) on the lower boundary at time t, gives a second equation
    f[a_2(t), t | x_0, t_0] = ∫_{t_0}^t g_2[a_2(τ), τ | x_0, t_0] f[a_2(t), t | a_2(τ), τ] dτ
        + ∫_{t_0}^t g_1[a_1(τ), τ | x_0, t_0] f[a_2(t), t | a_1(τ), τ] dτ.    (41b)
In principle, these equations may be solved simultaneously to yield the pair of
unknown first passage time densities g_1[a_1(t), t | x_0, t_0] and g_2[a_2(t), t | x_0, t_0]
but, as in the one-barrier case, the singularity of f(x, t | y, τ) as τ approaches t
precludes the development of a stable numerical approximation scheme based
directly on these equations. Instead, a transformation is sought which removes the
singularity. We have the following lemma.
Lemma 2. Define φ[a_i(t), t | y, τ], i = 1, 2, as in Lemma 1. Then

    g_1[a_1(t), t | x_0, t_0] = −2φ[a_1(t), t | x_0, t_0]
        + 2 ∫_{t_0}^t g_1[a_1(τ), τ | x_0, t_0] φ[a_1(t), t | a_1(τ), τ] dτ
        + 2 ∫_{t_0}^t g_2[a_2(τ), τ | x_0, t_0] φ[a_1(t), t | a_2(τ), τ] dτ    (42a)

and

    g_2[a_2(t), t | x_0, t_0] = 2φ[a_2(t), t | x_0, t_0]
        − 2 ∫_{t_0}^t g_1[a_1(τ), τ | x_0, t_0] φ[a_2(t), t | a_1(τ), τ] dτ
        − 2 ∫_{t_0}^t g_2[a_2(τ), τ | x_0, t_0] φ[a_2(t), t | a_2(τ), τ] dτ.    (42b)
Proof. Integrating (41a) from a_1(t) to ∞ and (41b) from −∞ to a_2(t), followed
by an exchange in the order of integration, gives

    1 − F[a_1(t), t | x_0, t_0] = ∫_{t_0}^t g_1[a_1(τ), τ | x_0, t_0] {1 − F[a_1(t), t | a_1(τ), τ]} dτ
        + ∫_{t_0}^t g_2[a_2(τ), τ | x_0, t_0] {1 − F[a_1(t), t | a_2(τ), τ]} dτ    (43a)

and

    F[a_2(t), t | x_0, t_0] = ∫_{t_0}^t g_2[a_2(τ), τ | x_0, t_0] F[a_2(t), t | a_2(τ), τ] dτ
        + ∫_{t_0}^t g_1[a_1(τ), τ | x_0, t_0] F[a_2(t), t | a_1(τ), τ] dτ.    (43b)

We differentiate these equations with respect to t, making use of the limit relations

    lim_{τ→t} F[a_1(t), t | a_1(τ), τ] = 1/2
    lim_{τ→t} F[a_2(t), t | a_2(τ), τ] = 1/2
    lim_{τ→t} F[a_1(t), t | a_2(τ), τ] = 1
    lim_{τ→t} F[a_2(t), t | a_1(τ), τ] = 0.    (44)
The derivative of (43a) is

    −φ[a_1(t), t | x_0, t_0]
        = g_1[a_1(t), t | x_0, t_0] − g_1[a_1(t), t | x_0, t_0] F[a_1(t), t | a_1(t), t]
        − ∫_{t_0}^t g_1[a_1(τ), τ | x_0, t_0] φ[a_1(t), t | a_1(τ), τ] dτ
        + g_2[a_2(t), t | x_0, t_0] − g_2[a_2(t), t | x_0, t_0] F[a_1(t), t | a_2(t), t]
        − ∫_{t_0}^t g_2[a_2(τ), τ | x_0, t_0] φ[a_1(t), t | a_2(τ), τ] dτ,
which, by virtue of the first and third equalities of (44), may be simplified and
rearranged to give

    g_1[a_1(t), t | x_0, t_0] = −2φ[a_1(t), t | x_0, t_0]
        + 2 ∫_{t_0}^t g_1[a_1(τ), τ | x_0, t_0] φ[a_1(t), t | a_1(τ), τ] dτ
        + 2 ∫_{t_0}^t g_2[a_2(τ), τ | x_0, t_0] φ[a_1(t), t | a_2(τ), τ] dτ,
which is (42a). Proceeding in a similar manner with (43b) yields

    φ[a_2(t), t | x_0, t_0]
        = g_2[a_2(t), t | x_0, t_0] F[a_2(t), t | a_2(t), t]
        + ∫_{t_0}^t g_2[a_2(τ), τ | x_0, t_0] φ[a_2(t), t | a_2(τ), τ] dτ
        + g_1[a_1(t), t | x_0, t_0] F[a_2(t), t | a_1(t), t]
        + ∫_{t_0}^t g_1[a_1(τ), τ | x_0, t_0] φ[a_2(t), t | a_1(τ), τ] dτ,

which, by virtue of the second and fourth equalities of (44), reduces to

    g_2[a_2(t), t | x_0, t_0] = 2φ[a_2(t), t | x_0, t_0]
        − 2 ∫_{t_0}^t g_1[a_1(τ), τ | x_0, t_0] φ[a_2(t), t | a_1(τ), τ] dτ
        − 2 ∫_{t_0}^t g_2[a_2(τ), τ | x_0, t_0] φ[a_2(t), t | a_2(τ), τ] dτ,

which is (42b), thus proving the lemma.
Equations (42a) and (42b) are Volterra integral equations of the second kind, in
which the unknown first passage time densities are defined in terms of their values
at preceding times τ < t and the kernel function φ(x, t | y, τ). We transform these
equations to yield new integral equations whose kernels can be chosen in a way
that they go to zero as τ → t.
Theorem 2. Set

    Ψ[a_1(t), t | y, τ] = φ[a_1(t), t | y, τ] + k_1(t) f[a_1(t), t | y, τ]    (45a)

and

    Ψ[a_2(t), t | y, τ] = φ[a_2(t), t | y, τ] + k_2(t) f[a_2(t), t | y, τ].    (45b)

Then

    g_1[a_1(t), t | x_0, t_0] = −2Ψ[a_1(t), t | x_0, t_0]
        + 2 ∫_{t_0}^t g_1[a_1(τ), τ | x_0, t_0] Ψ[a_1(t), t | a_1(τ), τ] dτ
        + 2 ∫_{t_0}^t g_2[a_2(τ), τ | x_0, t_0] Ψ[a_1(t), t | a_2(τ), τ] dτ    (46a)

and

    g_2[a_2(t), t | x_0, t_0] = 2Ψ[a_2(t), t | x_0, t_0]
        − 2 ∫_{t_0}^t g_1[a_1(τ), τ | x_0, t_0] Ψ[a_2(t), t | a_1(τ), τ] dτ
        − 2 ∫_{t_0}^t g_2[a_2(τ), τ | x_0, t_0] Ψ[a_2(t), t | a_2(τ), τ] dτ.    (46b)
Proof. First consider (46a). Evaluating the right-hand side of this equation
using the definition of Ψ[a_1(t), t | y, τ] in (45a) yields

    g_1[a_1(t), t | x_0, t_0] = −2{φ[a_1(t), t | x_0, t_0] + k_1(t) f[a_1(t), t | x_0, t_0]}
        + 2 { ∫_{t_0}^t g_1[a_1(τ), τ | x_0, t_0] φ[a_1(t), t | a_1(τ), τ] dτ
              + k_1(t) ∫_{t_0}^t g_1[a_1(τ), τ | x_0, t_0] f[a_1(t), t | a_1(τ), τ] dτ }
        + 2 { ∫_{t_0}^t g_2[a_2(τ), τ | x_0, t_0] φ[a_1(t), t | a_2(τ), τ] dτ
              + k_1(t) ∫_{t_0}^t g_2[a_2(τ), τ | x_0, t_0] f[a_1(t), t | a_2(τ), τ] dτ }.

The integrals with coefficients k_1(t) on the right-hand side of this equation may be
recognized as the terms on the right-hand side of the renewal equation (41a). By
virtue of the renewal equation, the sum of these terms, together with their
coefficients, is k_1(t) f[a_1(t), t | x_0, t_0]. With this substitution the preceding equation
simplifies to

    g_1[a_1(t), t | x_0, t_0] = −2φ[a_1(t), t | x_0, t_0]
        − 2k_1(t) f[a_1(t), t | x_0, t_0] + 2k_1(t) f[a_1(t), t | x_0, t_0]
        + 2 { ∫_{t_0}^t g_1[a_1(τ), τ | x_0, t_0] φ[a_1(t), t | a_1(τ), τ] dτ
              + ∫_{t_0}^t g_2[a_2(τ), τ | x_0, t_0] φ[a_1(t), t | a_2(τ), τ] dτ }.
The sum of the second and third terms in this equation is zero; the sum of the
remaining terms is g_1[a_1(t), t | x_0, t_0] by Lemma 2. The left-hand side therefore
equals the right-hand side, which verifies (46a).
Equation (46b) is proved analogously. The right-hand side of the equation is
expanded using the definition of Ψ[a_2(t), t | y, τ] in (45b) to yield
    g_2[a_2(t), t | x_0, t_0] = 2{φ[a_2(t), t | x_0, t_0] + k_2(t) f[a_2(t), t | x_0, t_0]}
        − 2 { ∫_{t_0}^t g_1[a_1(τ), τ | x_0, t_0] φ[a_2(t), t | a_1(τ), τ] dτ
              + k_2(t) ∫_{t_0}^t g_1[a_1(τ), τ | x_0, t_0] f[a_2(t), t | a_1(τ), τ] dτ }
        − 2 { ∫_{t_0}^t g_2[a_2(τ), τ | x_0, t_0] φ[a_2(t), t | a_2(τ), τ] dτ
              + k_2(t) ∫_{t_0}^t g_2[a_2(τ), τ | x_0, t_0] f[a_2(t), t | a_2(τ), τ] dτ }.

The integrals with coefficients k_2(t) are the terms on the right-hand side of the
renewal equation (41b). Summing these terms together with their coefficients gives
k_2(t) f[a_2(t), t | x_0, t_0], thereby reducing the equation to

    g_2[a_2(t), t | x_0, t_0] = 2φ[a_2(t), t | x_0, t_0]
        + 2k_2(t) f[a_2(t), t | x_0, t_0] − 2k_2(t) f[a_2(t), t | x_0, t_0]
        − 2 { ∫_{t_0}^t g_1[a_1(τ), τ | x_0, t_0] φ[a_2(t), t | a_1(τ), τ] dτ
              + ∫_{t_0}^t g_2[a_2(τ), τ | x_0, t_0] φ[a_2(t), t | a_2(τ), τ] dτ }.

In this equation, the second and third terms again sum to zero; the remaining terms
sum to g_2[a_2(t), t | x_0, t_0] by Lemma 2. The left- and right-hand sides of (46b) are
thus equal, which proves the theorem.
On the assumption that the kernel functions Ψ[a_i(t), t | y, τ], i = 1, 2, may be
chosen in such a way that they go to zero as τ → t, the integrals in (46a) and (46b)
may be approximated numerically by sums and the resulting equations solved
simultaneously as follows:

    g_1[a_1(t_0 + Δ), t_0 + Δ | x_0, t_0] = −2Ψ[a_1(t_0 + Δ), t_0 + Δ | x_0, t_0]

    g_1[a_1(t_0 + kΔ), t_0 + kΔ | x_0, t_0] = −2Ψ[a_1(t_0 + kΔ), t_0 + kΔ | x_0, t_0]
        + 2Δ Σ_{j=1}^{k−1} g_1[a_1(t_0 + jΔ), t_0 + jΔ | x_0, t_0]
            × Ψ[a_1(t_0 + kΔ), t_0 + kΔ | a_1(t_0 + jΔ), t_0 + jΔ]
        + 2Δ Σ_{j=1}^{k−1} g_2[a_2(t_0 + jΔ), t_0 + jΔ | x_0, t_0]
            × Ψ[a_1(t_0 + kΔ), t_0 + kΔ | a_2(t_0 + jΔ), t_0 + jΔ],
        k = 2, 3, ...    (47a)

    g_2[a_2(t_0 + Δ), t_0 + Δ | x_0, t_0] = 2Ψ[a_2(t_0 + Δ), t_0 + Δ | x_0, t_0]

    g_2[a_2(t_0 + kΔ), t_0 + kΔ | x_0, t_0] = 2Ψ[a_2(t_0 + kΔ), t_0 + kΔ | x_0, t_0]
        − 2Δ Σ_{j=1}^{k−1} g_1[a_1(t_0 + jΔ), t_0 + jΔ | x_0, t_0]
            × Ψ[a_2(t_0 + kΔ), t_0 + kΔ | a_1(t_0 + jΔ), t_0 + jΔ]
        − 2Δ Σ_{j=1}^{k−1} g_2[a_2(t_0 + jΔ), t_0 + jΔ | x_0, t_0]
            × Ψ[a_2(t_0 + kΔ), t_0 + kΔ | a_2(t_0 + jΔ), t_0 + jΔ],
        k = 2, 3, ...    (47b)

As in the single-barrier case, these equations may be evaluated straightforwardly on
a personal computer. A numerical study of their convergence properties may be
found in Buonocore et al. (1990).
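A sketch of the coupled recursions (47a) and (47b) in Python with NumPy follows (the function name and parameter values are my own, not from the article). It treats the simplest case, a Wiener process with constant drift between two constant barriers; the kernel is the Wiener kernel derived later in the article (Eq. (57)) with a'(t) = 0, evaluated at each barrier in turn. With x = y = a_i the bracketed factor is identically zero, so the self-kernels vanish and the coupling is carried entirely by the cross-barrier terms.

```python
import numpy as np

def fpt_densities_two_barrier(mu, sigma, a1, a2, T, n):
    """Recursions (47a)-(47b) for a Wiener process X(t) with drift mu,
    diffusion coefficient sigma**2, X(0) = 0, and constant absorbing
    barriers a2 < 0 < a1."""
    delta = T / n
    t = delta * np.arange(1, n + 1)

    def f(x, t1, y, t0):
        # free Gaussian transition density of the process (cf. Eq. (27))
        v = sigma ** 2 * (t1 - t0)
        return np.exp(-(x - y - mu * (t1 - t0)) ** 2 / (2.0 * v)) / np.sqrt(2.0 * np.pi * v)

    def kernel(x, t1, y, t0):
        # Wiener kernel of Eq. (57) with a'(t) = 0, evaluated at barrier x
        return 0.5 * f(x, t1, y, t0) * (-mu - (x - y - mu * (t1 - t0)) / (t1 - t0))

    g1, g2 = np.zeros(n), np.zeros(n)
    g1[0] = -2.0 * kernel(a1, t[0], 0.0, 0.0)
    g2[0] = 2.0 * kernel(a2, t[0], 0.0, 0.0)
    for k in range(1, n):
        s1 = (np.dot(g1[:k], kernel(a1, t[k], a1, t[:k]))
              + np.dot(g2[:k], kernel(a1, t[k], a2, t[:k])))
        s2 = (np.dot(g1[:k], kernel(a2, t[k], a1, t[:k]))
              + np.dot(g2[:k], kernel(a2, t[k], a2, t[:k])))
        g1[k] = -2.0 * kernel(a1, t[k], 0.0, 0.0) + 2.0 * delta * s1
        g2[k] = 2.0 * kernel(a2, t[k], 0.0, 0.0) - 2.0 * delta * s2
    return t, g1, g2
```

Integrating g_1 and g_2 over the grid gives the two response probabilities, which can be checked against the classical absorption probability for Brownian motion with drift.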
THE KERNEL OF THE INTEGRAL EQUATION

In this section we see how to obtain the unknown functions k_i(t) in Theorems
1 and 2 in such a way that the kernels Ψ[a_i(t), t | a_i(τ), τ] vanish as τ → t.
Together with the transition density f(x, t | y, τ), these functions may be used in
(40) and (47) to obtain the first passage time densities for X(t) numerically. We
proceed to characterize the kernel function for those diffusions whose transition
densities are Gaussian or which can be made Gaussian by appropriate transformation.
This class of processes includes the general Gaussian process (14) and its
various special cases, as well as the lognormal process (32). To this end, we seek
to characterize those processes that can be transformed into a standard Brownian
motion (Wiener) process by appropriate transformation of the time and state
coordinates. A constructive proof of the conditions for the existence of such a
transformation was provided by Cherkasov (1957) and recast into a convenient form for
applications by Ricciardi (1976). A slightly more succinct statement of this result
may be found in Ricciardi and Sato (1983). We record their result in the form of
a theorem.
Theorem 3 (Ricciardi & Sato, 1983). Let X(t) be a diffusion process satisfying
the SDE (6) with drift μ(x, t) and diffusion coefficient σ²(x, t). Let
σ²_x(x, t) = ∂σ²(x, t)/∂x and σ²_t(x, t) = ∂σ²(x, t)/∂t be the first partial derivatives of the
diffusion coefficient with respect to its state and time coordinates, respectively. If
there exists a pair of functions c_1(t) and c_2(t) such that

    μ(x, t) = σ²_x(x, t)/4 + (σ(x, t)/2) { c_1(t)
        + ∫^x [c_2(t) σ²(y, t) + σ²_t(y, t)] / σ³(y, t) dy },    (48)

then there exists a coordinate transformation, X(t) → X*(t*), of the form

    x* = Ψ(x, t),
    t* = Φ(t),    (49)

such that X*(t*) = B(t*) is a standard Brownian motion. This transformation is

    x* = Ψ(x, t) = exp[ −(1/2) ∫^t c_2(s) ds ] ∫^x dy/σ(y, t)
        − (1/2) ∫^t c_1(s) exp[ −(1/2) ∫^s c_2(z) dz ] ds,

    t* = Φ(t) = ∫_{t_0}^t exp[ −∫^s c_2(z) dz ] ds.    (50)
Remark. The crucial requirement for the existence of this transformation is the
existence of a pair of functions c_1(t) and c_2(t) of the time coordinate alone that
relate the drift and diffusion coefficients in the manner indicated. Under this
transformation, the new state coordinate x* is a function jointly of the old state
coordinate and the old time coordinate; the new time coordinate is a function of the old
time coordinate alone. Under the transformation X(t) → B(t*), the transition
density of the process will be given by (27) with m_X(t; τ) = 0 and σ² = 1.
Probabilistically, Theorem 3 asserts that the identity

    P[X(t) ≤ x | X(τ) = y] = P[B(t*) ≤ x* | B(τ*) = y*]

holds with an appropriate choice of coordinates. By the previous remark

    (∂/∂x*) P[B(t*) ≤ x* | B(τ*) = y*] = f_B(x*, t* | y*, τ*)
        = [1/√(2π(t* − τ*))] exp[ −(x* − y*)² / (2(t* − τ*)) ],
where a subscript on the transition density has been introduced to identify the
process. Let Ψ'_x(x, t) and Ψ'_t(x, t) denote the first partial derivatives of Ψ(x, t)
with respect to its first and second arguments, respectively. From the preceding
equations,

    f_X(x, t | y, τ) = (∂/∂x) P[X(t) ≤ x | X(τ) = y]
        = (∂/∂x*) P[B(t*) ≤ x* | B(τ*) = y*] (dx*/dx)
        = f_B[Ψ(x, t), Φ(t) | Ψ(y, τ), Φ(τ)] Ψ'_x(x, t)
        = [1/√(2π[Φ(t) − Φ(τ)])] exp{ −[Ψ(x, t) − Ψ(y, τ)]² / (2[Φ(t) − Φ(τ)]) } Ψ'_x(x, t).    (51)

Equation (51) provides an expression, in (x, t) coordinates, for the transition
density of a process X(t) that satisfies the relation (48).
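The identity (51) is easy to verify numerically for a specific process. The sketch below (Python with NumPy; the parameter values are arbitrary choices of mine) does so for the time-homogeneous OU process, using the closed-form Ψ and Φ derived in the Examples section, and compares the result of (51) with the Gaussian OU transition density computed directly.

```python
import numpy as np

# Time-homogeneous OU process dX = (mu - gamma*X) dt + sigma dB(t)
mu, gamma, sigma = 0.8, 1.5, 1.2

def Psi(x, t):
    # state transformation of Eq. (50) for this process
    return (np.exp(gamma * t) * x - (mu / gamma) * (np.exp(gamma * t) - 1.0)) / sigma

def Phi(t):
    # time transformation of Eq. (50) for this process
    return (np.exp(2.0 * gamma * t) - 1.0) / (2.0 * gamma)

def f_from_51(x, t, y, s):
    # right-hand side of Eq. (51): Brownian density in (x*, t*) times dPsi/dx
    d_phi = Phi(t) - Phi(s)
    psi_x = np.exp(gamma * t) / sigma
    return (np.exp(-(Psi(x, t) - Psi(y, s)) ** 2 / (2.0 * d_phi))
            / np.sqrt(2.0 * np.pi * d_phi)) * psi_x

def f_direct(x, t, y, s):
    # Gaussian OU transition density computed directly (cf. Eq. (28))
    e = np.exp(-gamma * (t - s))
    m = y * e + (mu / gamma) * (1.0 - e)
    v = sigma ** 2 * (1.0 - e ** 2) / (2.0 * gamma)
    return np.exp(-(x - m) ** 2 / (2.0 * v)) / np.sqrt(2.0 * np.pi * v)

xs = np.linspace(-2.0, 2.0, 41)
max_err = np.max(np.abs(f_from_51(xs, 0.7, 0.3, 0.2) - f_direct(xs, 0.7, 0.3, 0.2)))
```

The two expressions agree to floating-point precision, which is a useful sanity check before either is used inside the kernel computations that follow.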
By virtue of (51), one may work in whichever of the (x, t) or (x*, t*) coordinate
sets is most convenient and express the results in the other coordinates by
transformation. Buonocore et al. (1987) worked in the (x*, t*) coordinate set and provided
explicit expressions for the kernel of the time-homogeneous Brownian motion and
OU processes only. This was subsequently extended to a wider class of time-homogeneous
processes by Giorno, Nobile, Ricciardi, and Sato (1989). These methods
were used by Heath (1992) and Smith (1995) to remove time inhomogeneities in
the drifts of Brownian motion and OU processes, respectively.^13 Although this
method is often easy to implement computationally, one of its limitations, as pointed
out by Gutiérrez Jáimez, Román Román, and Torres Ruiz (1995), is that the image
of the absorbing boundary a(t) in transformed coordinates is

    a*(t*) = Ψ(a(t), t) = Ψ[a(Φ⁻¹(t*)), Φ⁻¹(t*)].

To obtain an explicit expression for the absorbing barrier in the new coordinate
space requires that the function Φ(t) that maps the old time coordinate to the new
time coordinate be invertible. As Gutiérrez Jáimez et al. (1995) noted, there are
cases of interest in applications where this property does not apply. In these cases,
Φ(t) must be inverted numerically and the application of the method becomes
cumbersome. Accordingly, in the remainder of this section we use the methods of
Gutiérrez Jáimez et al. (1995) and work in the original coordinate set. The
following theorem and associated preamble are adapted from results proved in their
article.
^13 In Smith (1995) a heuristic argument was used to justify this transformation. It can be made
rigorous by using the Itô calculus to show that the SDE which is satisfied by the transformed process
is the defining equation for a zero-drift OU process.
We use the definition (36) to obtain an expression for φ[a(t), t | y, τ] and,
thereby, an expression for the kernel function Ψ[a(t), t | y, τ]. By (51), the transition
distribution of X(t) in (x, t) coordinates is

    F(x, t | y, τ) = [1/√(2π[Φ(t) − Φ(τ)])] ∫_{−∞}^{Ψ(x, t)} exp{ −[s − Ψ(y, τ)]² / (2[Φ(t) − Φ(τ)]) } ds,

which, by an obvious change of variables, becomes

    F(x, t | y, τ) = (1/√(2π)) ∫_{−∞}^{z(x, t)} exp(−ζ²/2) dζ,

where

    z(x, t) = [Ψ(x, t) − Ψ(y, τ)] / √(Φ(t) − Φ(τ)).

By definition (36),

    φ[a(t), t | y, τ] = d/dt { (1/√(2π)) ∫_{−∞}^{z(a(t), t)} exp(−ζ²/2) dζ },

where the expression on the right is the total derivative of the integral, considered
as a function of t. This evaluates to

    φ[a(t), t | y, τ] = (1/√(2π)) exp{ −[Ψ(a(t), t) − Ψ(y, τ)]² / (2[Φ(t) − Φ(τ)]) }
        × { 2[Φ(t) − Φ(τ)][Ψ'_x(a(t), t) a'(t) + Ψ'_t(a(t), t)]
            − [Ψ(a(t), t) − Ψ(y, τ)] Φ'(t) } / { 2[Φ(t) − Φ(τ)]^{3/2} }.

Multiplying the right-hand side of this expression by Ψ'_x(a(t), t)/Ψ'_x(a(t), t) and
making use of the definition of f_X(x, t | y, τ) in (51) gives

    φ[a(t), t | y, τ] = f_X[a(t), t | y, τ] { a'(t) + Ψ'_t(a(t), t)/Ψ'_x(a(t), t)
        − [Ψ(a(t), t) − Ψ(y, τ)] Φ'(t) / (2[Φ(t) − Φ(τ)] Ψ'_x(a(t), t)) }.    (52)

As a consequence of this equation, we have the following easy theorem.
Theorem 4 (Adapted from Gutiérrez Jáimez et al., 1995). Let φ[a(t), t | y, τ]
be as defined in (36). Then with

    k(t) = −(1/2) [ a'(t) + Ψ'_t(a(t), t)/Ψ'_x(a(t), t) ],    (53)

the limit relation

    lim_{τ→t} Ψ[a(t), t | a(τ), τ]
        = lim_{τ→t} { φ[a(t), t | a(τ), τ] + k(t) f_X[a(t), t | a(τ), τ] } = 0    (54)

holds.

Remark. This theorem provides the conditions required for the vanishing of
Ψ[a(t), t | a(τ), τ], the kernel of the integral equation in Theorems 1 and 2, as
τ → t. The condition (53) is that k(t) be defined as a function of the absorbing
barrier(s) and of the ratio of the first partial derivatives with respect to time and
with respect to state of the coordinate transformation that maps the particular
diffusion process to a standard Brownian motion. The conditions for the existence
of this latter transformation are given by Theorem 3.
Proof. Let

    h(t, τ) = a'(t) + Ψ'_t(a(t), t)/Ψ'_x(a(t), t)
        − [Ψ(a(t), t) − Ψ(a(τ), τ)] Φ'(t) / (2[Φ(t) − Φ(τ)] Ψ'_x(a(t), t)) + k(t).

As f_X[a(t), t | a(τ), τ] is singular at τ = t, the vanishing of the kernel
Ψ[a(t), t | a(τ), τ] in (54) requires that

    h(t, t) = lim_{τ→t} h(t, τ) = 0.

With k(t) defined by (53) this implies

    lim_{τ→t} { a'(t) + Ψ'_t(a(t), t)/Ψ'_x(a(t), t)
        − [Ψ(a(t), t) − Ψ(a(τ), τ)] Φ'(t) / ([Φ(t) − Φ(τ)] Ψ'_x(a(t), t)) } = 0.

Equivalently, multiplying both sides of this equation by Ψ'_x(a(t), t)/Φ'(t), and noting
from the definitions in (50) that for finite t this quantity is never singular and never
zero, yields

    lim_{τ→t} { [Ψ'_x(a(t), t) a'(t) + Ψ'_t(a(t), t)] / Φ'(t)
        − [Ψ(a(t), t) − Ψ(a(τ), τ)] / [Φ(t) − Φ(τ)] } = 0.    (55)

But

    [Ψ(a(t), t) − Ψ(a(τ), τ)] / [Φ(t) − Φ(τ)]
        = { [Ψ(a(t), t) − Ψ(a(τ), τ)] / (t − τ) } × { (t − τ) / [Φ(t) − Φ(τ)] },

which in the limit τ → t is (d/dt)[Ψ(a(t), t)] / Φ'(t), which is the first term on the
left-hand side of (55). The function h(t, τ) is therefore zero at τ = t, which means
that at this point the kernel Ψ[a(t), t | a(t), t] = h(t, t) f_X[a(t), t | a(t), t] is an
indeterminate form 0 · ∞. Repeated application of L'Hôpital's rule shows this to be
zero, thus proving the theorem.
Corollary. As Δ → 0, the numerical approximations (40), (47a), and (47b),
with kernel function(s)

    Ψ[a(t), t | y, τ] = (f_X[a(t), t | y, τ]/2) { a'(t) + Ψ'_t(a(t), t)/Ψ'_x(a(t), t)
        − [Ψ(a(t), t) − Ψ(y, τ)] Φ'(t) / ([Φ(t) − Φ(τ)] Ψ'_x(a(t), t)) },    (56)

converge to the first passage time densities g[a(t), t | x_0, t_0], g_1[a_1(t), t | x_0, t_0],
and g_2[a_2(t), t | x_0, t_0]. The kernels for (47a) and (47b) are obtained from (56) with
a(t) set equal to a_1(t) and a_2(t), in turn.

Proof. This follows immediately on substituting (52) and (53) into the definition
(38). It provides an expression for the kernel that is convenient for applications.
EXAMPLES

The time-inhomogeneous Brownian motion (Wiener) process. We consider the
kernel of the integral equation for the diffusion process that satisfies the SDE (12),
with drift μ(x, t) = μ(t), diffusion coefficient σ²(x, t) = σ², and t_0 = 0, x_0 = 0. With
these values of the drift and diffusion coefficients, (48) is satisfied with c_1(t) = 2μ(t)/σ
and c_2(t) = 0. Substituting these values in (50) shows that the transformation that
maps the time-inhomogeneous Brownian motion to a standard Brownian motion is

    Ψ(x, t) = (1/σ) [ x − ∫^t μ(s) ds ]

and

    Φ(t) = t.

In this example, the mapping of the time coordinate is the identity; that is, the
transformation is of the state coordinate only. We have

    Ψ'_t(a(t), t) = −μ(t)/σ;    Ψ'_x(a(t), t) = 1/σ;    Φ'(t) = 1.

Using these functions in (56) then yields

    Ψ[a(t), t | y, τ] = (f[a(t), t | y, τ]/2) { a'(t) − μ(t)
        − [ a(t) − y − ∫_τ^t μ(s) ds ] / (t − τ) },    (57)
with f[x, t | y, τ] given by (27). A simple calculation shows that
lim_{τ→t} Ψ[a(t), t | a(τ), τ] = 0, as required. Indeed, when a(t) = a (constant) and
μ(t) = μ, Ψ[a(t), t | a(τ), τ] = 0 for all t, τ, τ ≤ t. Under these circumstances, the
kernel of the integral equation vanishes uniformly, so the integral on the right-hand
side of (39) is zero for all t. The first passage time density for a time-homogeneous
Brownian motion process through a constant absorbing barrier is therefore

    g(a, t | 0, 0) = −2Ψ(a, t | 0, 0) = (a/t) f(a, t | 0, 0).

We have thus recovered a well-known result. First passage times for a homogeneous
Brownian motion process through a constant boundary follow a Wald (or inverse
Gaussian) distribution (e.g., Karlin & Taylor, 1975, p. 363), whose density function is
related to the transition density of the unrestricted process in the manner shown. An
analogous result was derived by Buonocore et al. (1987) for a zero-drift Brownian
motion through a linear boundary. By an appropriate change of measure on the
underlying process, effected via an application of the Girsanov theorem (Karatzas &
Shreve, 1991, pp. 196-197), these results may be seen to be equivalent.
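The Wald result can also be checked against a direct Monte Carlo simulation. The sketch below (Python with NumPy; the parameter values and seed are arbitrary choices of mine) simulates Euler paths of a homogeneous Wiener process and compares the empirical mean first passage time with the theoretical Wald mean a/μ. Euler time stepping misses some within-step crossings, so only approximate agreement should be expected.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, a = 1.0, 1.0, 1.0            # drift, noise scale, constant barrier
dt, n_paths, t_max = 1e-3, 20000, 10.0

x = np.zeros(n_paths)                   # all paths start at X(0) = 0
fpt = np.full(n_paths, np.nan)          # recorded first passage times
alive = np.ones(n_paths, dtype=bool)    # paths that have not yet crossed
t = 0.0
while t < t_max and alive.any():
    t += dt
    n_alive = int(alive.sum())
    x[alive] += mu * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_alive)
    crossed = alive & (x >= a)
    fpt[crossed] = t
    alive &= ~crossed

mean_fpt = np.nanmean(fpt)              # Wald theory gives E[T] = a/mu = 1 here
```

A handful of paths may still be uncrossed at t_max; they are left as NaN and excluded from the mean, which introduces a small censoring bias in addition to the discretization bias.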
The time-inhomogeneous OU process. We obtain the kernel of the integral
equation for the diffusion process that satisfies the SDE (13), with drift
μ(x, t) = μ(t) − γx and diffusion coefficient σ²(x, t) = σ². Equation (48) in Theorem 3 is
satisfied with c_1(t) = 2μ(t)/σ and c_2(t) = −2γ. By (50), for t_0 = 0, x_0 = 0, the
transformation that maps the process to a standard Brownian motion is

    Ψ(x, t) = e^{γt} x/σ − (1/σ) ∫^t e^{γs} μ(s) ds,

    Φ(t) = (1/2γ) [e^{2γt} − 1].

Unlike the previous example, for the OU process both the time and the state
coordinates change under the indicated mapping. For this choice of functions we
have

    Ψ'_t(a(t), t) = e^{γt} [γa(t) − μ(t)]/σ;    Ψ'_x(a(t), t) = e^{γt}/σ;    Φ'(t) = e^{2γt}.

Equation (56) then yields

    Ψ[a(t), t | y, τ] = (f[a(t), t | y, τ]/2) { a'(t) + γa(t) − μ(t)
        − [ 2γ / (1 − exp[−2γ(t − τ)]) ]
          × [ a(t) − e^{−γ(t−τ)} y − ∫_τ^t e^{−γ(t−s)} μ(s) ds ] },    (58)
with f(x, t | y, τ) given by (28). For the time-homogeneous case μ(t) = μ (constant), this reduces to
\[
\Psi[a(t), t \mid y, \tau] = \frac{f[a(t), t \mid y, \tau]}{2}
\left\{ a'(t) + \gamma a(t) - \mu
- \frac{2 \exp[-\gamma(t - \tau)]}{1 - \exp[-2\gamma(t - \tau)]}
\left[ \exp[\gamma(t - \tau)](\gamma a(t) - \mu) - (\gamma y - \mu) \right] \right\}.
\tag{59}
\]
This expression for the kernel of the integral equation was derived by Buonocore et al. (1987) using a slightly different method. As in the preceding example, an easy calculation shows that lim_{τ→t} Ψ[a(t), t | a(τ), τ] = 0. Buonocore et al. also showed that the kernel vanishes uniformly for a hyperbolic absorbing boundary, in which case a simple, closed-form expression for the one-sided first passage time density can be found in a manner similar to that obtained for the Brownian motion process in the previous example. The interested reader is referred to their article for details. It should be noted that in the most common applications of the preceding equations the absorbing boundary (or boundaries) will be constant, and a'(t) = 0.
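The vanishing of the kernel as τ → t is easy to confirm numerically. The sketch below (Python; parameter values are illustrative) evaluates (59) for a constant boundary, so that a'(t) = 0, taking for f the standard Gaussian transition density of the time-homogeneous OU process; Eq. (28) lies outside this excerpt, so that form is stated here as an assumption.

```python
import numpy as np

def ou_transition_density(x, t, y, tau, mu, gamma, sigma):
    """Gaussian transition density of the time-homogeneous OU process
    dX = (mu - gamma X) dt + sigma dB (assumed standard form of Eq. 28)."""
    d = t - tau
    m = y * np.exp(-gamma * d) + (mu / gamma) * (1 - np.exp(-gamma * d))
    v = sigma ** 2 * (1 - np.exp(-2 * gamma * d)) / (2 * gamma)
    return np.exp(-(x - m) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)

def ou_kernel(a, t, y, tau, mu, gamma, sigma):
    """Kernel (59) for a constant boundary a, so that a'(t) = 0."""
    d = t - tau
    f = ou_transition_density(a, t, y, tau, mu, gamma, sigma)
    bracket = (gamma * a - mu
               - 2 * np.exp(-gamma * d) / (1 - np.exp(-2 * gamma * d))
               * (np.exp(gamma * d) * (gamma * a - mu) - (gamma * y - mu)))
    return 0.5 * f * bracket

# With y = a(tau) = a, the kernel should shrink to zero as tau -> t.
deltas = np.array([1e-1, 1e-2, 1e-3, 1e-4])
vals = np.abs([ou_kernel(1.0, 1.0, 1.0, 1.0 - d, mu=0.5, gamma=1.0, sigma=1.0)
               for d in deltas])
print(vals)  # decreasing toward zero
```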
The lognormal process. As a final example of the use of these methods we obtain the kernel for a diffusion process satisfying the SDE (30) with μ(x, t) = μ(t)x, σ²(x, t) = σ²(t)x², t₀ = 0, and X(0) ∈ R⁺ = (0, ∞). With these coefficients, the conditions of Theorem 3 are satisfied with c₁(t) = 2μ(t)/σ(t) and c₂(t) = −2σ'(t)/σ(t). Equation (50) shows that the transformation that maps this process to a standard Brownian motion is
\[
\Psi(x, t) = \log x - \int_0^t \mu(s)\,ds + \tfrac{1}{2} \int_0^t \sigma^2(s)\,ds;
\qquad
\Phi(t) = \int_0^t \sigma^2(s)\,ds.
\]
For these expressions we have
\[
\Psi_t'(a(t), t) = -\mu(t) + \frac{\sigma^2(t)}{2}; \qquad
\Psi_x'(a(t), t) = \frac{1}{a(t)}; \qquad
\Phi'(t) = \sigma^2(t).
\]
Substitution of these functions in (56) yields a kernel of the form
\[
\Psi[a(t), t \mid y, \tau] = \frac{f[a(t), t \mid y, \tau]}{2}
\left\{ a'(t) - \mu(t)\,a(t)
- \sigma^2(t)\,a(t)\,
\frac{\log a(t) - \log y - \int_\tau^t \mu(s)\,ds}{\int_\tau^t \sigma^2(s)\,ds} \right\},
\tag{60}
\]
with f(x, t | y, τ) given by (34). For the special case of σ(t) = σ (constant), we recover the simpler expression derived by Gutiérrez Jáimez et al. (1995), namely,
\[
\Psi[a(t), t \mid y, \tau] = \frac{f[a(t), t \mid y, \tau]}{2}
\left\{ a'(t) - \mu(t)\,a(t)
- a(t)\,
\frac{\log a(t) - \log y - \int_\tau^t \mu(s)\,ds}{t - \tau} \right\}.
\tag{61}
\]
A calculation similar to that made in the two previous examples shows that for y = a(τ) these functions vanish as τ approaches t, as required.
MULTIVARIATE EXTENSIONS
Application of the techniques described in the previous section will yield most of the statistics that are of interest in cognitive models. First passage time densities obtained in this way may be integrated numerically using the trapezoidal method or Simpson's rule (e.g., Dahlquist & Björck, 1974) to obtain values of the first passage time distributions G_T(t), G₁(t), and G₂(t). These distributions may be evaluated at large values of t to estimate the absorption probabilities P[T < ∞], for the single-barrier case, and P[T₁ < T₂], for the two-barrier case. When multiple processes Xᵢ(t), i = 1, 2, ..., n, are involved, as occurs in the model of Fig. 2, the first passage time statistics of the resulting multivariate process are easily evaluated if the SDEs that define the constituent processes are uncoupled and the driving Brownian motion processes Bᵢ(t) are independent. Multivariate processes of this kind commonly arise in independent, parallel race models, in which the random variable of greatest interest is usually
\[
T = \min\{T_1, T_2, \ldots, T_n\},
\tag{62}
\]
the time of the first-finishing process. If the random variables T₁, ..., Tₙ are each defined by an equation of the form (2) or (3), then we have the following geometrical interpretation of the random variable T: Let D₁ ⊂ Rⁿ be defined by
\[
D_1 = \{x : -b_i < x_i < a_i;\; i = 1, \ldots, n\},
\]
where −∞ ≤ −bᵢ < aᵢ ≤ ∞, and assume that X(0) ∈ D₁. When the aᵢ and bᵢ are all finite, D₁ is the interior of an n-dimensional hypercube. The random variable T then describes the time at which the vector-valued process Xᵀ(t) = [X₁(t), ..., Xₙ(t)] first exits from the region D₁. (The superscript T in this notation represents matrix transposition.) Psychologically, this event corresponds to the time at which one of the coordinate accumulation processes Xᵢ(t) first exceeds its associated criterion or criteria. If gᵢ(t) denotes the marginal first passage time density for the ith process, with g_T^{(n)} the first passage time density of T in (62), then we have the familiar expression
\[
g_T^{(n)}(t) = \sum_{i=1}^{n} g_i(t) \prod_{\substack{j=1 \\ j \neq i}}^{n} \left[1 - G_j(t)\right],
\tag{63}
\]
(e.g., Ratcliff, 1978). Expressions of this form may be evaluated straightforwardly
using the methods of the preceding section.
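As an illustration, the sketch below (Python; racer parameters are illustrative choices, not taken from the text) evaluates (63) for a race among three independent Wald racers, using the closed-form Wald density derived earlier for each gᵢ(t) and the trapezoidal rule, as the text suggests, for the distribution functions G_j(t). Because each racer finishes with probability 1, the density of the winning time should itself integrate to 1.

```python
import numpy as np

def wald_density(t, a, mu, sigma=1.0):
    # First passage density of drift-mu Brownian motion through barrier a.
    return (a / t) * np.exp(-(a - mu * t) ** 2 / (2 * sigma ** 2 * t)) \
        / np.sqrt(2 * np.pi * sigma ** 2 * t)

t = np.linspace(1e-6, 80.0, 100_001)
dt = t[1] - t[0]

# Illustrative (barrier, drift) pairs for the three racers.
params = [(1.0, 0.9), (1.2, 0.8), (0.8, 1.1)]
g = np.array([wald_density(t, a, mu) for a, mu in params])

# Distribution functions G_j(t) by cumulative trapezoidal integration.
G = np.cumsum((g[:, 1:] + g[:, :-1]) * dt / 2, axis=1)
G = np.hstack([np.zeros((len(params), 1)), G])

# Eq. (63): density of T = min(T1, T2, T3).
g_T = sum(g[i] * np.prod([1 - G[j] for j in range(len(params)) if j != i],
                         axis=0)
          for i in range(len(params)))

total = ((g_T[1:] + g_T[:-1]) * dt / 2).sum()
print(total)  # approximately 1: the winner's time is a proper density
```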
When the processes Xᵢ(t) are coupled, as occurs in models with correlated noise, the transition distribution of the free process, X(t), may still be ascertained easily, but solution of the first passage time problem is appreciably more difficult. Under these circumstances, the evolution of X(t) is described by the vector-valued counterpart of (6),
\[
dX(t) = C(X(t), t)\,dt + D(X(t), t)\,dB(t),
\tag{64}
\]
where X(t) and B(t) are n- and m-dimensional random processes, respectively. The drift term, C(X(t), t), is an n-dimensional vector that is jointly a function of the n state variables Xᵢ(t) and time t. The diffusion term, D(X(t), t), is an n × m matrix, each column of which is a function of the set of state variables and time. That m need not equal n in this equation is because the dimensionality of the process and the dimensionality of the superposed noise perturbations in general may be different. As was the case with the pair of scalar-valued SDEs (6) and (11), the most important special case of (64) is the one in which C(X(t), t) is linear in X(t) and D(X(t), t) is independent of state:
\[
dX(t) = [\mu(t) + A(t)\,X(t)]\,dt + \sigma(t)\,dB(t).
\tag{65}
\]
Here μ(t), A(t), and σ(t) are (n × 1), (n × n), and (n × m) matrix-valued functions, respectively. The function A(t) in this equation is a time-dependent coupling matrix, which represents statistical dependencies between the elements of X(t); μ(t) is a stimulus-dependent forcing function (which in general is time-inhomogeneous); and σ(t) is a time-dependent dispersion matrix, which represents statistical dependencies among components of the noise process.
As in the case of the scalar-valued equation (11), the solution of (65) is relatively straightforward. Let the (n × n) matrix function Λ(t) be the fundamental solution of the homogeneous, deterministic, first-order linear differential system
\[
\Lambda'(t) = A(t)\,\Lambda(t); \qquad \Lambda(0) = I.
\]
As described in Hirsch and Smale (1974), a routine procedure for obtaining solutions to systems of equations of this kind is to find a coordinate transformation that uncouples the equations so that the matrix of the linear operator represented by A(t) is diagonal. The transformed system of equations can then be solved on an element-by-element basis and the results reexpressed in the old coordinates by inversion of the diagonalizing transformation. Once Λ(t) has been obtained in this way, the solution to (65) may be expressed in terms of Λ(t) and its inverse Λ⁻¹(t) as follows (Karatzas & Shreve, 1991, pp. 354–355),
\[
X(t) = \Lambda(t) \int_0^t \Lambda^{-1}(\tau)\,\mu(\tau)\,d\tau
+ \Lambda(t) \int_0^t \Lambda^{-1}(\tau)\,\sigma(\tau)\,dB(\tau),
\tag{66}
\]
where as usual we have assumed that P[X(0) = 0] = 1. That (66) solves (65) may be shown by an application of the multivariate form of the Itô transformation rule (e.g., Karlin & Taylor, 1981, p. 372). Evidently, the minimum requirement to ensure that the process X(t) in (66) is nondegenerate is that the matrix function Λ(t) be invertible for all t. A more detailed characterization of how the properties of X(t) depend on A(t) and σ(t) and a characterization of the conditions under which X(t) possesses a unique, stationary distribution may be found in Karatzas and Shreve (1991). In general, the process X(t) will have a multivariate normal distribution with mean
\[
m(t; \tau) = \Lambda(t) \int_\tau^t \Lambda^{-1}(s)\,\mu(s)\,ds
\tag{67}
\]
and time-dependent variance-covariance function
\[
v(t; \tau) = \Lambda(t)
\left\{ \int_\tau^t \Lambda^{-1}(s)\,\sigma(s)\,\sigma^T(s)\,[\Lambda^{-1}(s)]^T\,ds \right\}
\Lambda^T(t).
\tag{68}
\]
In the special case in which A(t) = A and σ(t) = σ the latter expression simplifies to
\[
v(t; \tau) = \exp(At)
\left[ \int_\tau^t \exp(-As)\,\sigma\sigma^T \exp(-A^T s)\,ds \right]
\exp(A^T t).
\tag{69}
\]
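Expression (69) is easy to check numerically. The sketch below (Python; the coupling matrix A and dispersion matrix σ are illustrative choices, not values from the text) evaluates the covariance integral by the trapezoidal rule, computing the matrix exponentials through an eigendecomposition of A. Differentiating (69) shows that, for a stable A, v(t; 0) approaches the stationary covariance satisfying the Lyapunov equation A v + v Aᵀ + σσᵀ = 0, so the residual of that equation at large t provides a check.

```python
import numpy as np

def expm(M, s):
    # exp(M s) via eigendecomposition; adequate for the diagonalizable A here.
    w, V = np.linalg.eig(M)
    return (V @ np.diag(np.exp(w * s)) @ np.linalg.inv(V)).real

# Illustrative stable coupling and dispersion matrices (not from the text).
A = np.array([[-1.0, 0.5], [0.2, -1.5]])
S = np.array([[1.0, 0.0], [0.3, 0.8]])   # sigma
Q = S @ S.T

# Eq. (69): v(t; 0) = exp(At) [ int_0^t exp(-As) Q exp(-A^T s) ds ] exp(A^T t)
t, n = 8.0, 4000
s = np.linspace(0.0, t, n + 1)
ds = s[1] - s[0]
integrand = np.array([expm(A, -u) @ Q @ expm(A.T, -u) for u in s])
integral = (integrand[1:] + integrand[:-1]).sum(axis=0) * ds / 2
v = expm(A, t) @ integral @ expm(A.T, t)

# Stationarity check: A v + v A^T + Q should be near zero for large t.
residual = np.abs(A @ v + v @ A.T + Q).max()
print(residual)
```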
As described by Smith (1996), the preceding equations provide a dynamic, multivariate generalization of SDT, or equivalently, of the General Recognition Theory of Ashby and Townsend (1986). These equations may be used to model performance in situations in which the observer's decision is based on a sampling interval of fixed duration. When the decision time depends on the observer sampling to a criterion, the decision time will be a random variable whose distribution is obtained by solving a first passage time problem for X(t). In principle, a renewal equation representation similar to (35) or (41) may be established (van Kampen, 1992, p. 311). The basis of this equation is suggested by Fig. 6. For concreteness, we assume that Xᵀ(t) = [X₁(t), X₂(t)] is a bivariate Gaussian process, although the derivation we give applies in higher dimensions also. Let the transition density of X(t) be f(x, t | y, τ). The precise form of this function will be determined by the mean and covariance functions in (67) and (68) with A(t), μ(t), and σ(t) appropriately specified.
Let a(s) be a closed curve in R² parameterized by arc length and let D₁ be the interior region of a(s), with X(0) = x₀ ∈ D₁. Let D₂ = R² − D₁ be the complement of this region, where D₂ includes the points on the boundary a(s). We define g[a(s), t | x₀, t₀] to be the first passage time density of the process X(t) through the boundary a(s), where we stipulate as usual that P[X(0) = x₀] = 1. Let x be a point on the boundary a(s). Then by an argument analogous to that used in the derivation of (35) we have
\[
f[x, t \mid x_0, t_0]
= \oint_{a(s)} \int_{t_0}^{t} g[a(s), \tau \mid x_0, t_0]\,
f[x, t \mid a(s), \tau]\,d\tau\,ds.
\tag{70}
\]
FIG. 6. Escape problem for a bivariate diffusion process X(t) from a closed region bounded by the curve a(s). A general renewal representation of the first passage time density is given by Eq. (70).
The outer integral in this equation is a contour integral evaluated around the boundary of the region D₁.
Despite the similarities in the renewal equation representations for the scalar and vector cases, an important difference between (35) or (41) and (70) should be noted. Equations (35) and (41) involved, respectively, a single equation and a system of two equations, whereas (70) involves a system whose dimension is infinite. Such a system may be approximated by a system of finite dimension by approximating the boundary of D₁ by a series of l linear segments Δa(i), i = 1, 2, ..., l, and replacing the contour integral in (70) with a sum. A system of l simultaneous equations is then obtained with xᵢ ∈ Δa(i), which is in principle soluble for the first passage time density g[a(s), t | x₀, t₀]. However, the high dimensionality of this system will make any numerical procedure that is based upon it computationally expensive.
Under conditions identified by di Crescenzo, Ricciardi, Giorno, and Nobile (1991), first passage time problems involving multidimensional diffusion processes can sometimes be reduced to problems in a single dimension. This method can be applied to first passage time problems involving escape from a region bounded by an open (n − 1)-dimensional, time-dependent surface in Rⁿ or by a pair of such surfaces. Under appropriate conditions, first passage problems of this form can be reduced to problems that involve the passage time for a one-dimensional process through a curve, ã(t), or a pair of such curves, ã₁(t) and ã₂(t). One model that can be dealt with using these methods is a stochastic, dynamic generalization of the integration model of Kinchla (1969, 1974), which has been investigated subsequently by various authors (e.g., Shaw, 1982; Smith, 1998a; see also Macmillan & Creelman, 1991). This model, which describes the detection of redundant signals in
multidimensional displays, assumes that the observer monitors a set of n signal sources, the statistical characteristics of which are described by a set of Gaussian strength variables, Xᵢ, whose means depend on whether or not a source contained a signal. The integration model assumes that the observer sums these variables to create a composite decision variable Σᵢ Xᵢ, which is compared to a criterion a to determine whether a detection response is made. Formally,
\[
P(S) = P\left[\, \sum_{i=1}^{n} X_i \geq a \right],
\]
where P(S) denotes the probability of a detection or ``Signal'' response.
A dynamic generalization of this model can be obtained by assuming that the Xᵢ are coordinate processes of a multidimensional diffusion process and that the observer samples from the display for t_s time units. The observer responds ``Signal'' if and only if the sum of the coordinate processes Xᵢ(t) exceeds the criterion during this interval:
\[
P(S) = P\left[ \inf\left\{ t : \sum_{i=1}^{n} X_i(t) \geq a \right\} \leq t_s \right].
\]
A two-boundary version of this model may be obtained by assuming that the sum of the coordinate processes is compared to a referent, c(t), and that the observer responds ``Signal'' or ``Noise'' depending on which of the boundaries a₁ or a₂ is first exceeded by the process Σᵢ Xᵢ(t) − c(t). Here we state without proof the main results of di Crescenzo et al. (1991). For simplicity, we give results for a process in R² only. Proofs and a generalization to processes in Rⁿ may be found in the original article.
We seek to identify conditions under which the first passage time density of the scalar-valued transformed process Z(t) = ξ[X(t)] through an absorbing barrier ã(t) is the same as that of X(t) through a curve a(t) in R², where a(t) is defined as follows: Assume that ξ(x) is monotone in x₁ and invertible, with inverse η(z, x₂). That is, z = ξ(x₁, x₂) implies x₁ = η(z, x₂), and vice versa. Let a(t) = {(x₁, x₂): x₁ = η(ã(t), x₂)}. Let the bivariate diffusion process X(t) be defined by the SDE (64), with transition density
\[
f[x, t \mid y, \tau] = \frac{\partial^2}{\partial x_1\,\partial x_2}
P[X_1(t) \leq x_1,\; X_2(t) \leq x_2 \mid X(\tau) = y].
\]
Let the transition density of the transformed process Z(t) be u(z, t | y, τ). This density is obtained by integrating the transition density of X(t) over the region −∞ < x₁ ≤ η(z, x₂), −∞ < x₂ < ∞, and then taking the derivative with respect to z to obtain:
\[
u(z, t \mid y, \tau) = \int_{-\infty}^{\infty}
f[\eta(z, x_2), x_2, t \mid y, \tau]\,
\frac{\partial \eta(z, x_2)}{\partial z}\,dx_2.
\tag{71}
\]
Now let g*[η(ã(t), x₂), x₂, t | x₀, 0] dx₂ dt be the probability that X(t) first crosses the boundary a(t) in the interval (t, t + dt) through a linear element da(t) at the
point (η(ã(t), x₂), x₂). The bivariate process X(t) will satisfy the following renewal equation
\[
f[x, t \mid x_0, 0] = \int_0^t \int_{-\infty}^{\infty}
g^*[\eta(\tilde{a}(\tau), x_2), x_2, \tau \mid x_0, 0]\,
f[x, t \mid \eta(\tilde{a}(\tau), x_2), x_2, \tau]\,dx_2\,d\tau.
\tag{72}
\]
This equation is a variant of (70), except that the curve a(t) is open instead of closed and, in addition, is permitted to depend on time.
By definition,
\[
g[a(t), t \mid x_0, 0] = \int_{-\infty}^{\infty}
g^*[\eta(\tilde{a}(t), x_2), x_2, t \mid x_0, 0]\,dx_2.
\tag{73}
\]
That is, the marginal first passage time density of X(t) through a(t) is obtained by integrating g* over all points in R² at which a boundary crossing may occur. Equation (72) may be transformed into a scalar-valued integral equation in z by integrating both sides of the equation over the region −∞ < x₁ ≤ η(z, x₂), −∞ < x₂ < ∞, and then differentiating with respect to z, as indicated in (71). After exchanging the order of integration the resultant renewal equation is
\[
u[z, t \mid x_0, 0] = \int_0^t \int_{-\infty}^{\infty}
g^*[\eta(\tilde{a}(\tau), x_2), x_2, \tau \mid x_0, 0]\,
u[z, t \mid \eta(\tilde{a}(\tau), x_2), x_2, \tau]\,dx_2\,d\tau.
\tag{74}
\]
If and only if u[z, t | y, τ] has the representation
\[
u[z, t \mid y, \tau] = \tilde{u}[z, t \mid \zeta, \tau],
\tag{75}
\]
where ζ = ξ(y), then the term u[z, t | η(ã(τ), x₂), x₂, τ] may be taken out from under the inner integral in (74), by virtue of the fact that the set of points (η(ã(τ), x₂), x₂) falls on a locus of constant ζ, to give
\[
u[z, t \mid x_0, 0] = \int_0^t g[a(\tau), \tau \mid x_0, 0]\,
\tilde{u}[z, t \mid \tilde{a}(\tau), \tau]\,d\tau,
\tag{76}
\]
where the definition (73) has been used to eliminate the integral over x₂ in the final step. Equation (76) is a scalar-valued renewal equation which can be evaluated using the methods described previously (e.g., (40)). Analogous expressions may be developed for the first exit time from a region bounded by a pair of time-dependent curves a₁(t), a₂(t), each of which satisfies the conditions on a(t) described previously. In this case, the resultant renewal equation representation can be reduced to a pair of simultaneous scalar-valued equations that can be solved using the method of (47).
The conditions required to reduce a multidimensional first passage time problem to a problem in one dimension are, first, that the image of the curve a(t) under the
mapping ξ be invertible and, second, that the transition density u has the representation (75), in which the form of the density depends only on the value of ζ = ξ[(y₁, y₂)] and not on the values of y₁ and y₂ individually. The simplest and most tractable case in which these conditions are met is that of the first passage time for a Brownian motion process through a constant linear boundary. This model is a realization of the dynamic integration model described previously. We illustrate the condition (75) for the simplest case of an independent, unit-variance, bivariate Brownian motion with zero drift. Applications to homogeneous (nonzero drift) Brownian motion and OU processes are described in di Crescenzo et al. (1991).
In this case, X(t) has transition density
\[
f[x, t \mid y, \tau] = \frac{1}{2\pi(t - \tau)}
\exp\left[ -\frac{\sum_{i=1}^{2} (x_i - y_i)^2}{2(t - \tau)} \right].
\]
The image of x under the transformation is z = ξ(x) = x₁ + x₂, so with ã(t) = a (constant), the absorbing boundary is the set of points a(t) ⊂ R², a(t) = {(η(a, x₂), x₂)} = {(a − x₂, x₂)}. Substituting for z in the previous equation, completing squares in the exponent, and using the result in (71) gives
\[
u(z, t \mid y, \tau) = \frac{1}{2\pi(t - \tau)}
\int_{-\infty}^{\infty}
\exp\left\{
-\frac{1}{2} \left[ \frac{2x_2 - (z - y_1 + y_2)}{\sqrt{2(t - \tau)}} \right]^2
-\frac{1}{2} \left[ \frac{z - (y_1 + y_2)}{\sqrt{2(t - \tau)}} \right]^2
\right\} dx_2,
\]
by virtue of the fact that ∂η(z, x₂)/∂z = 1. This expression may be recognized as the integral of a product of independent Gaussian densities in 2x₂ and z, respectively, each with standard deviation √(2(t − τ)). We may therefore integrate over x₂ to obtain the marginal density of z:
\[
u(z, t \mid y, \tau) = \frac{1}{2\sqrt{\pi(t - \tau)}}
\exp\left\{
-\frac{1}{2} \left[ \frac{z - (y_1 + y_2)}{\sqrt{2(t - \tau)}} \right]^2
\right\}.
\]
Since ζ = y₁ + y₂, we see that u(z, t | y, τ) has a representation of the form ũ(z, t | ζ, τ), as required.
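The reduction can also be checked by simulation. The sketch below (Python; barrier, step size, and sampling interval are illustrative choices) estimates the detection probability P(S) for the zero-drift bivariate case in two ways: by monitoring the sum X₁(t) + X₂(t) of two independent Brownian paths, and by monitoring the one-dimensional process √2·B(t), which has the same law as the sum. The two estimates should agree up to Monte Carlo error, since the discretization bias is common to both.

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_steps, dt = 10_000, 500, 0.002   # sampling interval t_s = 1
a = 1.0                                     # criterion (illustrative)

# Two independent coordinate processes; decision variable is their sum.
dB = rng.standard_normal((2, n_paths, n_steps)) * np.sqrt(dt)
sum_path = np.cumsum(dB[0] + dB[1], axis=1)
p_2d = (sum_path.max(axis=1) >= a).mean()

# Equivalent one-dimensional process Z(t) = X1(t) + X2(t) ~ sqrt(2) B(t).
dZ = rng.standard_normal((n_paths, n_steps)) * np.sqrt(2 * dt)
z_path = np.cumsum(dZ, axis=1)
p_1d = (z_path.max(axis=1) >= a).mean()

print(p_2d, p_1d)  # the two detection probabilities nearly coincide
```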
Remark. Because of the requirement that the transformation z = ξ(x) be invertible, this method cannot be used to reduce the problem of escape from a closed region, as described in (70), to a problem in one dimension. When certain strong symmetry conditions are met, the problem of escape from a closed region in Rⁿ can sometimes be reduced to a problem in one dimension by considering the Euclidean distance process Z(t) = √(Σᵢ₌₁ⁿ Xᵢ²(t)). For the Brownian motion and OU processes, the Euclidean distance processes are known as the Bessel process and the radial OU process, respectively, both of which have well-defined transition densities on the positive real line R⁺ = (0, ∞) (Karlin & Taylor, 1975,
pp. 365–371; 1981, pp. 333–338). The first exit time for a zero-drift process from a closed, spherical region in Rⁿ is an example of a problem that can be analyzed in this way. Unfortunately, the processes of greatest interest in cognitive applications are those in which the drift term is nonzero (i.e., μ(t) ≠ 0), and in these cases, the symmetry conditions required to reduce the problem to a single dimension are absent.
In principle, first passage time problems for diffusion processes may always be approximated using a discrete time, discrete state space representation, in which the problem is formulated as a first passage time problem for a finite state Markov chain (e.g., Karlin & Taylor, 1975; Bhattacharya & Waymire, 1990). First passage time distributions for such processes may be obtained even in the presence of time inhomogeneity, but their solution requires a matrix multiplication at each time step and is thus computationally expensive. Results similar to those described here for reducing a multidimensional first passage time problem to a problem in one dimension appear to have been obtained independently by Ashby and Schwartz (1996). Other relevant results are given in di Crescenzo, Giorno, Nobile, and Ricciardi (1995).
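The Markov chain approximation just mentioned is easy to sketch. The code below (Python; grid spacing, drift, and barriers are illustrative choices, not from the text) discretizes a constant-drift diffusion on a state grid, propagates the state distribution with one matrix multiplication per time step, holds the two end states absorbing, and reads off the two-barrier absorption probabilities together with the mass still in transit.

```python
import numpy as np

# Diffusion parameters and grid (illustrative values).
mu, sigma = 0.5, 1.0
dx = 0.05
dt = dx ** 2 / sigma ** 2              # matches the variance per step
x = np.arange(-1.0, 1.0 + dx / 2, dx)  # barriers at -1 and +1
n = len(x)                             # 41 states; endpoints absorbing

p_up = 0.5 * (1 + mu * dx / sigma ** 2)
P = np.zeros((n, n))
P[0, 0] = P[-1, -1] = 1.0              # absorbing barriers
for i in range(1, n - 1):
    P[i, i + 1] = p_up
    P[i, i - 1] = 1 - p_up

# Start with all mass at x = 0 and propagate: one matrix product per step.
pi = np.zeros(n)
pi[n // 2] = 1.0
for _ in range(4000):                  # about 10 time units
    pi = pi @ P

p_upper, p_lower = pi[-1], pi[0]       # two-barrier absorption probabilities
in_transit = pi[1:-1].sum()
print(p_upper, p_lower, in_transit)
```

For these parameters the upper absorption probability approaches the classical Brownian motion value (1 − e^{θb})/(e^{−θa} − e^{θb}) ≈ 0.731, with θ = 2μ/σ² and a = b = 1, illustrating both the accuracy of the scheme and its cost of one matrix multiplication per step.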
SUMMARY
Stochastic accumulation processes are important elements of many models in
sensory and cognitive psychology, their role being to provide a theoretical foundation from which response time and accuracy predictions may be derived. These
processes represent an essential link between observed performance, which is
inherently probabilistic, and the underlying psychological mechanisms from which
it arises. In the literature, the accumulation processes that have been proposed have
been of two main kinds. The first, exemplified by SDT, assumes that the sampling
time is determined by factors external to the information sample; the second,
exemplified by the large and varied class of sequential sampling models, assumes
that the sampling time is determined by the statistical properties of the sample itself.
These two sorts of model lead, respectively, to the study of the free transition
distribution of the unbounded accumulation process and to the study of its first
passage time statistics.
This article has investigated the foundations of a class of stochastic accumulation
models that can be formulated as Markov processes in continuous time and
continuous state space. The dynamics of information accrual in these models are
represented by first-order, linear SDEs, whose solutions are diffusion processes. In
general, the assumptions embodied in the defining SDEs of such models result in
processes that are both temporally and spatially inhomogeneous. This article has
provided a characterization of accumulation processes that either are Gaussian or
can be made Gaussian by transformation of the time scale and state space. Methods
were described for obtaining the time-dependent distribution of the unbounded
accumulation process and for obtaining the first passage time distributions through
either one or two absorbing barriers, which may themselves vary with time. For
processes in one dimension, it was possible to give a fairly complete characterization both of the free transition distribution of the process and of its first passage
time distribution. For processes in more than one dimension, a fairly complete
characterization of the unbounded accumulation process was again possible, but
the first passage time problems that arise in relation to processes of this kind are
far less tractable. In important special cases, multidimensional first passage time
problems can be reduced to equivalent problems in a single dimension. The conditions under which such a reduction is possible and the kinds of psychological
processes that might be represented by models of this kind were described.
APPENDIX
We use the Itô calculus to obtain the solutions to the SDEs (10) and (30). The method of solution is adapted from Gardiner (1985, pp. 112–113).
Homogeneous case. We show that X(t) = X(0) U(t) solves (30), with X(0) the random initial value of X(t), X(0) ∈ R⁺ = (0, ∞), and U(t) defined as in (31). We write (30) in the form
\[
dX(t) = [b(t)\,dt + c(t)\,dB(t)]\,X(t),
\tag{A1}
\]
and consider the function
\[
Y(t) = f[X(t)] = \log X(t).
\tag{A2}
\]
For functions of a single variable the Itô transformation formula (29) may be written in the form
\[
df(X(t)) = f'(X(t))\,dX(t) + \tfrac{1}{2} f''(X(t))\,(dX(t))^2.
\tag{A3}
\]
We recall that when working with stochastic differentials the following order relations hold:
\[
[dB(t)]^2 \sim dt; \qquad dB(t)\,dt \sim 0; \qquad (dt)^2 \sim 0.
\tag{A4}
\]
With f(x) defined as in (A2) we have f'(x) = 1/x and f''(x) = −1/x². Applying the transformation formula (A3) to (A2) therefore yields
\[
dY(t) = \frac{dX(t)}{X(t)} - \frac{[dX(t)]^2}{2X^2(t)}.
\]
We substitute for dX(t) from (A1), noting that the order relations yield [dX(t)]² = c²(t) X²(t) dt, to obtain
\[
dY(t) = b(t)\,dt + c(t)\,dB(t) - \tfrac{1}{2} c^2(t)\,dt.
\]
This equation is integrable, i.e.,
\[
Y(t) = \int_0^t b(\tau)\,d\tau + \int_0^t c(\tau)\,dB(\tau)
- \tfrac{1}{2} \int_0^t c^2(\tau)\,d\tau + Y(0),
\]
which may be combined with the definition (A2) to give
\[
X(t) = X(0) \exp\left[ \int_0^t b(\tau)\,d\tau + \int_0^t c(\tau)\,dB(\tau)
- \tfrac{1}{2} \int_0^t c^2(\tau)\,d\tau \right],
\]
which is (32).
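The closed form (32) can be checked against a direct Euler-Maruyama simulation of (A1) driven by the same Brownian increments; the two paths should agree to within the scheme's strong-order discretization error. The coefficient functions below are illustrative choices, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(7)
n, T = 50_000, 1.0
dt = T / n
t = np.arange(n) * dt

# Illustrative coefficients for dX = [b(t) dt + c(t) dB(t)] X(t).
b = 0.4 + 0.2 * np.sin(t)
c = 0.25 * np.ones(n)

dB = rng.standard_normal(n) * np.sqrt(dt)

# Euler-Maruyama path of (A1).
x = np.empty(n + 1)
x[0] = 1.0
for k in range(n):
    x[k + 1] = x[k] + (b[k] * dt + c[k] * dB[k]) * x[k]

# Exact solution (32) evaluated with the same increments.
exact = x[0] * np.exp(np.cumsum(b * dt + c * dB - 0.5 * c ** 2 * dt))

rel_err = abs(x[-1] - exact[-1]) / exact[-1]
print(rel_err)  # small; shrinks further as dt decreases
```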
Inhomogeneous case. We show that X(t) as defined in (33) solves (10), with U(t) defined as before. We seek a solution of the form
\[
X(t) = Z(t)\,U(t),
\tag{A5}
\]
where the function Z(t) is to be determined. Before proceeding, we use the Itô transformation rule to obtain the differential of the function U(t). To this end, we write this function as U(t) = f[V(t)] = exp V(t), where V(t) is the exponent in (31). The Itô rule (A3) applied to this function yields
\[
dU(t) = dV(t)\,e^{V(t)} + \tfrac{1}{2}\,[dV(t)]^2\, e^{V(t)}
= [b(t)\,dt + c(t)\,dB(t)]\,U(t).
\]
Also, the order relations (A4) yield
\[
[dU(t)]^2 = c^2(t)\,U^2(t)\,dt.
\]
Having established these preliminary facts, we write (A5) in the form
\[
Z(t) = X(t)[U(t)]^{-1}.
\]
The chain rule form of the Itô formula (Karatzas & Shreve, 1991, p. 150) applied to this function gives
\[
dZ(t) = dX(t)[U(t)]^{-1} + X(t)\,d[U(t)]^{-1} + dX(t)\,d[U(t)]^{-1}.
\]
(The stochastic chain rule is derived using a procedure similar to that used to obtain (29): The product of two functions is expanded in a Taylor series, discarding terms of order (dt)² and above, and the result expressed in differential form.) The one-variable form of the Itô formula (A3) applied to the function [U(t)]⁻¹ gives
\[
d[U(t)]^{-1} = -\frac{dU(t)}{U^2(t)} + \frac{[dU(t)]^2}{U^3(t)}.
\]
Combining these results, we obtain
\[
dZ(t) = \frac{dX(t)}{U(t)} + [X(t) + dX(t)]
\left\{ -\frac{dU(t)}{U^2(t)} + \frac{[dU(t)]^2}{U^3(t)} \right\}
= \frac{dX(t)}{U(t)} + [X(t) + dX(t)]
\left\{ \frac{c^2(t)\,dt - b(t)\,dt - c(t)\,dB(t)}{U(t)} \right\},
\]
where the definitions of dU(t) and [dU(t)]² obtained previously have been used in the second equality.
Substituting the definition of dX(t) from the SDE (10) into the preceding equation and expanding using the order relations (A4) then yields
\[
dZ(t) = \frac{\mu(t)\,dt + \sigma(t)\,dB(t) - c(t)\,\sigma(t)\,dt}{U(t)}.
\]
That is, the coefficients of X(t) on the right-hand side vanish. Both sides of this equation may therefore be integrated and the result combined with (A5) to obtain
\[
X(t) = U(t) \left\{ X(0) + \int_0^t \frac{1}{U(\tau)}
\left[ \mu(\tau)\,d\tau + \sigma(\tau)\,dB(\tau)
- c(\tau)\,\sigma(\tau)\,d\tau \right] \right\},
\]
which is (33).
REFERENCES
Abeles, M., & Goldstein, M. H. (1972). Response of single units in the primary auditory cortex of the cat to tones and to tone pairs. Brain Research, 42, 337–352.
Arnold, L. (1974). Stochastic differential equations: Theory and applications. New York: Wiley.
Ashby, F. G. (1983). A biased random walk model for two choice reaction times. Journal of Mathematical Psychology, 27, 277–297.
Ashby, F. G., & Schwartz, W. (1996). A stochastic version of General Recognition Theory. Journal of Mathematical Psychology, 40, 366. [Paper presented at the 29th Annual Meeting of the Society for Mathematical Psychology.]
Ashby, F. G., & Townsend, J. T. (1986). Varieties of perceptual independence. Psychological Review, 93, 154–179.
Audley, R. J., & Pike, A. R. (1965). Some alternative stochastic models of choice. British Journal of Mathematical and Statistical Psychology, 18, 207–225.
Bhattacharya, R. N., & Waymire, E. C. (1990). Stochastic processes with applications. New York: Wiley.
Buonocore, A., Giorno, V., Nobile, A. G., & Ricciardi, L. (1990). On the two-boundary first-crossing-time problem for diffusion processes. Journal of Applied Probability, 27, 102–114.
Buonocore, A., Nobile, A. G., & Ricciardi, L. M. (1987). A new integral equation for the evaluation of first-passage-time probability densities. Advances in Applied Probability, 19, 784–800.
Burbeck, S. L. (1985). A physiologically motivated model for change detection in audition. Journal of Mathematical Psychology, 29, 106–121.
Burbeck, S. L., & Luce, R. D. (1982). Evidence from auditory simple reaction times for both change and level detectors. Perception & Psychophysics, 32, 117–133.
Busemeyer, J., & Rapoport, A. (1988). Psychological models of deferred decision making. Journal of Mathematical Psychology, 32, 91–134.
Busemeyer, J., & Townsend, J. T. (1992). Fundamental derivations from decision field theory. Mathematical Social Sciences, 23, 255–282.
Busemeyer, J., & Townsend, J. T. (1993). Decision field theory: A dynamic-cognitive approach to decision making in an uncertain environment. Psychological Review, 100, 432–459.
Busey, T. A., & Loftus, G. R. (1994). Sensory and cognitive components of visual information processing. Psychological Review, 101, 446–469.
Cherkasov, I. D. (1957). On the transformation of the diffusion process to a Wiener process. Theory of Probability and its Applications, 2, 373–377.
Chung, K. L., & Williams, R. J. (1983). Introduction to stochastic integration. Boston: Birkhäuser.
Coltheart, M., & Rastle, K. (1994). Serial processing in reading aloud: Evidence for dual-route models of reading. Journal of Experimental Psychology: Human Perception and Performance, 20, 1197–1211.
Cox, D. R., & Miller, H. D. (1965). The theory of stochastic processes. London: Chapman & Hall.
Dahlquist, G., & Björck, Å. (1974). Numerical methods. Englewood Cliffs, NJ: Prentice-Hall.
de Lange, H. (1952). Experiments on flicker and some calculations on an electrical analogue of the foveal systems. Physica, 18, 935–950.
de Lange, H. (1954). Relationship between critical flicker frequency and a set of low frequency characteristics of the eye. Journal of the Optical Society of America, 44, 380–389.
de Lange, H. (1958). Research into the dynamic nature of the human fovea-cortex systems with intermittent and modulated light. I. Attenuation characteristics with white and colored light. Journal of the Optical Society of America, 48, 777–784.
Di Crescenzo, A. G., Giorno, V., Nobile, A. G., & Ricciardi, L. M. (1995). On a symmetry-based constructive approach to probability densities for two-dimensional diffusion processes. Journal of Applied Probability, 32, 316–336.
Di Crescenzo, A., Ricciardi, L. M., Giorno, V., & Nobile, A. G. (1991). On the reduction to one dimension of first-passage-time problems for diffusion processes. Journal of Mathematical and Physical Sciences, 25, 599–611.
Diederich, A. (1995). Intersensory facilitation of reaction time: Evaluation of counter and diffusion coactivation models. Journal of Mathematical Psychology, 39, 197–215.
Diederich, A. (1997). Dynamic stochastic models for decision making under time constraints. Journal of Mathematical Psychology, 41, 260–274.
Durbin, J. (1971). Boundary-crossing probabilities for the Brownian motion and Poisson processes and techniques for computing the power of the Kolmogorov-Smirnov test. Journal of Applied Probability, 8, 431–453.
Edwards, W. (1965). Optimal strategies for seeking information: Models for statistics, choice reaction time, and human information processing. Journal of Mathematical Psychology, 2, 312–329.
Emerson, P. L. (1970). Simple reaction time with Markovian evolution of Gaussian discriminal processes. Psychometrika, 35, 99–109.
Ethier, S. N., & Kurtz, T. G. (1986). Markov processes: Characterization and convergence. New York: Wiley.
Feller, W. (1971). An introduction to probability theory and its applications (Vol. II, 2nd ed.). New York: Wiley.
Fortet, R. (1943). Les fonctions aléatoires du type de Markoff associées à certaines équations linéaires aux dérivées partielles du type parabolique. Journal de Mathématiques Pures et Appliquées.
Gardiner, C. W. (1985). Handbook of stochastic methods (2nd ed.). Berlin: Springer-Verlag.
Gerstein, G. L., Butler, R. A., & Erulkar, S. D. (1968). Excitation and inhibition in cochlear nucleus. I. Tone-burst stimulation. Journal of Neurophysiology, 31, 526–536.
Gillund, G., & Shiffrin, R. M. (1984). A retrieval model for both recognition and recall. Psychological Review, 91, 1–67.
Giorno, V., Nobile, A. G., Ricciardi, L. M., & Sato, S. (1989). On the evaluation of first-passage-time probability densities via non-singular integral equations. Advances in Applied Probability, 21, 200–236.
Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. New York: Wiley.
Gutiérrez Jáimez, R., Román Román, P., & Torres Ruiz, F. (1995). A note on the Volterra integral equation for the first-passage-time probability. Journal of Applied Probability, 32, 635–648.
Heath, R. A. (1981). A tandem random-walk model for psychological discrimination. British Journal of Mathematical and Statistical Psychology, 34, 76–92.
Heath, R. A. (1992). A general nonstationary diffusion model for two choice decision making. Mathematical Social Sciences, 23, 283–309.
Hildreth, J. D. (1979). Bloch's law and a Poisson counting model for simple reaction time to light. Perception & Psychophysics, 26, 153–162.
Hirsch, M. W., & Smale, S. (1974). Differential equations, dynamical systems, and linear algebra. New York: Academic Press.
Hockley, W. E., & Murdock, B. B. (1987). A decision model for accuracy and response latency in recognition memory. Psychological Review, 94, 341–358.
Humphreys, M. S., Bain, J. D., & Pike, R. (1989). Different ways to cue a coherent memory system: A theory for episodic, semantic, and procedural tasks. Psychological Review, 96, 208–223.
Karatzas, I., & Shreve, S. E. (1991). Brownian motion and stochastic calculus (2nd ed.). New York: Springer-Verlag.
Karlin, S., & Taylor, H. M. (1975). A first course in stochastic processes. New York: Academic Press.
Karlin, S., & Taylor, H. M. (1981). A second course in stochastic processes. Orlando: Academic Press.
Kinchla, R. A. (1969). Temporal channel uncertainty in detection: A multiple observations analysis. Perception & Psychophysics, 16, 802–811.
Kinchla, R. A. (1974). Detecting target elements in multi-element visual arrays: A confusability model. Perception & Psychophysics, 15, 149–158.
La Berge, D. A. (1962). A recruitment theory of simple behavior. Psychometrika, 27, 149–163.
Laming, D. R. J. (1968). Information theory of choice-reaction times. London: Academic Press.
Legge, G. (1978). Sustained and transient mechanisms in human vision: Temporal and spatial properties. Vision Research, 18, 69–82.
Link, S. W. (1975). The relative judgment theory of two choice response time. Journal of Mathematical Psychology, 12, 114–135.
Link, S. W. (1978). The relative judgment theory of the psychometric function. In J. Requin (Ed.), Attention & Performance VII (pp. 619–630). Hillsdale, NJ: Erlbaum.
Link, S. W. (1992). The wave theory of similarity and difference. Hillsdale, NJ: Erlbaum.
Link, S. W., & Heath, R. A. (1975). A sequential theory of psychological discrimination. Psychometrika, 40, 77–105.
Luce, R. D. (1986). Response times. New York: Oxford Univ. Press.
Luce, R. D., & Green, D. M. (1972). A neural timing theory for response times and the psychophysics of intensity. Psychological Review, 79, 14–57.
Macmillan, N. A., & Creelman, C. D. (1991). Detection theory: A user's guide. New York: Cambridge Univ. Press.
McClelland, J. L., & Rumelhart, D. E. (1981). An interactive activation model of context effects in letter perception: Part I. An account of basic findings. Psychological Review, 88, 375–407.
McGill, W. (1967). Neural counting mechanisms and energy detection in audition. Journal of Mathematical Psychology, 4, 351–376.
Murdock, B. B. (1982). A theory for the storage and retrieval of item and associative information. Psychological Review, 89, 609–626.
Norman, M. F. (1981). Lectures on linear system theory. Journal of Mathematical Psychology, 23, 1–81.
Pacut, A. (1977). Some properties of threshold models of reaction latency. Biological Cybernetics, 28, 63–72.
Pacut, A. (1980). Mathematical modeling of reaction latency: The structure of the models and its motivation. Acta Neurobiologiae Experimentalis, 40, 199–215.
Papoulis, A. (1991). Probability, random variables, and stochastic processes (2nd ed.). New York: McGraw-Hill.
Pike, A. R. (1966). Stochastic models of choice behaviour: Response probabilities and latencies of finite Markov chain systems. British Journal of Mathematical and Statistical Psychology, 19, 15–32.
Pike, A. R. (1968). Latency and relative frequency of response in psychophysical discrimination. British Journal of Mathematical and Statistical Psychology, 21, 161–182.
Protter, P. (1990). Stochastic integration and differential equations. Berlin: Springer-Verlag.
Ratcliff, R. (1978). A theory of memory retrieval. Psychological Review, 85, 59–108.
Ratcliff, R. (1980). A note on modeling accumulation of information when the rate of accumulation changes over time. Journal of Mathematical Psychology, 21, 178–184.
Ratcliff, R. (1981). A theory of order relations in perceptual matching. Psychological Review, 88, 552–572.
Reed, A. V. (1973). Speed-accuracy trade-off in recognition memory. Science, 181, 574–576.
Reeves, A., & Sperling, G. (1986). Attention gating in short-term visual memory. Psychological Review, 93, 180–206.
Revuz, D., & Yor, M. (1994). Continuous martingales and Brownian motion (2nd ed.). Berlin: Springer-Verlag.
Ricciardi, L. M. (1976). On the transformation of diffusion processes into the Wiener process. Journal of Mathematical Analysis and Applications, 54, 185–199.
Ricciardi, L. M., Sacerdote, L., & Sato, S. (1983). Diffusion approximation and first passage time problem for a model neuron. II. Outline of a computational method. Mathematical Biosciences, 64, 29–44.
Ricciardi, L. M., Sacerdote, L., & Sato, S. (1984). On an integral equation for first-passage-time probability densities. Journal of Applied Probability, 21, 302–314.
Ricciardi, L. M., & Sato, S. (1983). A note on the evaluation of first-passage-time probability densities. Journal of Applied Probability, 20, 197–201.
Rouder, J. (2000). Assessing the roles of change discrimination and luminance integration: Evidence for a hybrid race model of perceptual decision making in luminance discrimination. Journal of Experimental Psychology: Human Perception and Performance, 26, 359–378.
Rudd, M. E. (1996). A neural timing model of visual threshold. Journal of Mathematical Psychology, 40, 1–29.
Rudd, M. E., & Brown, L. G. (1997). A model of Weber and noise gain control in the retina of the toad Bufo marinus. Vision Research, 37, 2433–2453.
Schwarz, W. (1989). A new model to explain the redundant signals effect. Perception & Psychophysics, 46, 498–500.
Shaw, M. L. (1982). Attending to multiple sources of information: I. The integration of information in decision making. Cognitive Psychology, 14, 353–409.
Smith, P. L. (1990). A note on the distribution of response times for a random walk with Gaussian increments. Journal of Mathematical Psychology, 34, 445–459.
Smith, P. L. (1995). Psychophysically principled models of visual simple reaction time. Psychological Review, 102, 567–593.
Smith, P. L. (1996). Dynamic signal detection models driven by white noise integrals. Journal of Mathematical Psychology, 40, 369. [Paper presented at the 29th Annual Meeting of the Society for Mathematical Psychology.]
Smith, P. L. (1998a). Bloch's law predictions from diffusion process models of detection. Australian Journal of Psychology, 50, 139–147.
Smith, P. L. (1998b). Attention and luminance detection: A quantitative analysis. Journal of Experimental Psychology: Human Perception and Performance, 24, 1–29.
Smith, P. L., & Van Zandt, T. (in press). Time-dependent Poisson counter models of response latency in simple judgment. British Journal of Mathematical and Statistical Psychology.
Smith, P. L., & Vickers, D. (1988). The accumulator model of two-choice discrimination. Journal of Mathematical Psychology, 32, 135–168.
Smith, P. L., & Vickers, D. (1989). Modeling evidence accumulation with partial loss in expanded judgment. Journal of Experimental Psychology: Human Perception and Performance, 15, 797–815.
Sperling, G., & Sondhi, M. M. (1968). Model for visual luminance discrimination and flicker detection. Journal of the Optical Society of America, 58, 1133–1145.
Sperling, G., & Weichselgartner, E. (1995). Episodic theory of the dynamics of spatial attention. Psychological Review, 102, 503–532.
Stone, M. (1960). Models for choice-reaction time. Psychometrika, 25, 251–260.
Tolhurst, D. J. (1975a). Reaction times in the detection of gratings by human observers: A probabilistic mechanism. Vision Research, 15, 1143–1149.
Tolhurst, D. J. (1975b). Sustained and transient channels in human vision. Vision Research, 15, 1151–1155.
Townsend, J. T., & Ashby, F. G. (1983). The stochastic modeling of elementary psychological processes. Cambridge, UK: Cambridge Univ. Press.
van Kampen, N. G. (1992). Stochastic processes in physics and chemistry (rev. ed.). Amsterdam: Elsevier.
Vickers, D. (1970). Evidence for an accumulator model of psychophysical discrimination. Ergonomics, 13, 37–58.
Vickers, D. (1979). Decision processes in visual perception. London: Academic Press.
Vickers, D., Caudrey, D., & Willson, R. J. (1971). Discriminating between the frequency of occurrence of two alternative events. Acta Psychologica, 35, 151–172.
Viviani, P. (1979a). Choice reaction times of temporal numerosity. Journal of Experimental Psychology: Human Perception and Performance, 5, 157–167.
Viviani, P. (1979b). A diffusion model for discrimination of temporal numerosity. Journal of Mathematical Psychology, 19, 108–136.
Watson, A. B. (1986). Temporal sensitivity. In K. R. Boff, L. Kaufman, & J. P. Thomas (Eds.), Handbook of perception and human performance (Vol. 1, pp. 6.1–6.43). New York: Wiley.
Wong, E., & Hajek, B. (1985). Stochastic processes in engineering systems. New York: Springer-Verlag.
Received: January 8, 1998