Behavioral Ecology doi:10.1093/beheco/arl008, Advance Access publication 19 June 2006

Learning rules for optimal selection in a varying environment: mate choice revisited

Edmund J. Collins, John M. McNamara, and David M. Ramsey

Centre for Behavioural Biology and Department of Mathematics, University of Bristol, University Walk, Bristol BS8 1TW, United Kingdom, and Institute of Mathematics and Computer Science, Wrocław University of Technology, Wybrzeże Wyspiańskiego 27, PL-50-370 Wrocław, Poland

The quality of a chosen partner can be one of the most significant factors affecting an animal's long-term reproductive success. We investigate optimal mate choice rules in an environment where there is both local variation in the quality of potential mates within each local mating pool and spatial (or temporal) variation in the average quality of the pools themselves. In such a situation, a robust rule that works well across a variety of environments will confer a significant reproductive advantage. We formulate a full Bayesian model for updating information in such a varying environment and derive the form of the rule that maximizes expected reward in a spatially varying environment. We compare the theoretical performance of our optimal learning rule against both fixed threshold rules and simpler near-optimal learning rules and show that learning is most advantageous when both the local and environmental variances are large. We consider how optimal simple learning rules might evolve and compare their evolution with that of fixed threshold rules using genetic algorithms as minimal models of the relevant genetics. Our analysis points up the variety of ways in which a near-optimal rule can be expressed. Finally, we describe how our results extend to the case of temporally varying environments.

Key words: Bayesian model, evolution of learning rules, genetic algorithm, sequential search, sexual selection, spatially varying environment. [Behav Ecol 17:799–809 (2006)]

The quality of a chosen partner can be one of the most significant factors affecting an animal's long-term reproductive success. As a result, the performance of different mate choice rules has been, and continues to be, of central interest to behavioral ecologists (Janetos 1980; Parker 1983; Real 1990; Simao and Todd 2002; Hutchinson and Halupka 2004; Wong and Candolin 2005). Here we investigate optimal mate choice rules in a varying environment. We focus on modeling the case where only one sex is discriminating. This restricted focus enables us to clearly set out the principles underlying the structure and evolution of robust rules and the logic of why and when learning can be advantageous. Thus, we aim to combine precise quantitative results with qualitative motivation. Moreover, we expect the intuition and insight developed to apply to more general optimal selection problems and also to carry over to more realistic game theoretic mate selection models where both sexes are choosy. For simplicity, we refer throughout to the choosy sex as female; our analysis applies equally well when the roles are reversed (Real 1990). We assume that each individual female encounters a possibly unlimited sequence of potential mates, whose qualities are assumed to be chosen at random from a given (local) distribution, and we assume that costs of some form are associated with searching (Alatalo et al. 1988; Milinski and Bakker 1992).
Under these assumptions, at any given time the female will have an acceptance threshold that determines whether or not she accepts the current male. Previous mate choice models have in general assumed that each female knew (i.e., was fully adapted to) the local distribution of male qualities, which in turn implicitly assumed that this local distribution remained constant from generation to generation (Janetos 1980; Parker 1983; Real 1990). For this setting, Real (1990) gives a comprehensive treatment of both sequential search rules and "best-of-n" rules with costs. He shows that a sequential search rule is optimal in terms of expected reward and will always dominate the best-of-n rule when the possible number of encounters is unlimited.

In reality, the distribution of male qualities may differ from place to place and from season to season (Svensson and Sinervo 2004). A mate choice rule that is optimal when qualities have one distribution may perform very badly when they have another (Mazalov et al. 1996; Luttbeg 2002). A female that inherits a robust rule that works well across a variety of environments will thus have a significant reproductive advantage. There may be even more advantage in an adaptive rule that learns from experience to respond optimally to the local distribution, and there is experimental evidence that female choice is affected by the quality of previously encountered males (Bakker and Milinski 1991; Milinski and Bakker 1992; Downhower and Lank 1994; Collins 1995). Real (1990) recognized both the appropriateness of, and the difficulties inherent in, a Bayesian approach to this adaptive choice model but did not address its solution. Since then, there has been only limited progress in characterizing and evaluating the form of the optimal rule in the presence of variability and uncertainty. Apart from some simulation comparisons (Mazalov et al. 1996; Luttbeg 2002; Hutchinson and Halupka 2004), there are few clear theoretical results (Dombrovsky and Perrin 1994; Mazalov et al. 1996). This paper is the first to give a full Bayesian treatment of the optimal learning rule for mate choice in a varying environment.

Our model incorporates 2 sources of variability that may lead a female to be uncertain about the quality of a given male relative to others she might meet—variation of individual male qualities within the local pool of potential mates (residual variability) and variation of the average quality of the pools themselves, say from place to place in a given season or from season to season (environmental variability). The types of rules allowed under our model range from rules that specify a completely fixed acceptance threshold, or rules where the acceptance threshold may change with each encounter but only in some predetermined way that is independent of the actual qualities encountered (though it may, e.g., depend on the number of encounters), through to much more complex fully adaptive or learning rules where the acceptance threshold at any encounter depends crucially on the female's own past experience. Within this framework, we are able to fully characterize and quantify the optimal (learning) rule.
Perhaps more significantly, we are able to draw clear qualitative conclusions about how the relative strengths of the 2 sources of variation affect the relative success of each type of rule and also how these relative strengths affect the trade-off between the complexity of a rule and its expected reward in the presence of evolutionary selection pressures. Thus, our emphasis is on optimality rather than on heuristics (Simao and Todd 2002; Hutchinson and Halupka 2004). More complicated models are possible that allow for game theoretic aspects, perhaps due to both sexes being choosy or to the distribution of male qualities changing over the mating season as females remove the best males (McNamara and Collins 1990; Collins and McNamara 1993; Johnstone 1997; Bergstrom and Real 2000), or that allow for courtship (Wong and Candolin 2005) or imperfect observation of male qualities, so that female choice rules need to take into account the costs and benefits of time spent in further assessment or reassessment of previously encountered males (Luttbeg 1996; Fawcett and Johnstone 2003b). We do not address these issues here.

We start by specifying the details of our model and developing a clearer intuition of the advantages and disadvantages of rules with a fixed acceptance threshold, gaining insight from the asymmetric shape of the plot of expected net gain against threshold. We formulate a full Bayesian model for updating information in a varying environment and derive the form of the corresponding optimal learning rule that maximizes expected reward in a spatially varying environment. We explore when learning is valuable—either theoretically valuable or valuable in a more practical setting that takes into account robustness and adaptability under evolutionary pressures. Thus, we first identify the conditions that might favor learning under our model by comparing the theoretical performance of our optimal learning rule against both fixed threshold rules and simpler near-optimal learning rules. We then consider how optimal simple learning rules might evolve and compare their evolution with that of fixed threshold rules using genetic algorithms (GAs) as minimal models of the relevant genetics. Our analysis points up the variety of ways in which a near-optimal rule can be expressed. Finally, we describe how our results extend to the case of temporally varying environments.

MODELS FOR MATE CHOICE IN A VARYING ENVIRONMENT

Our starting point is a model similar to that of previous authors (Janetos 1980; Parker 1983; Real 1990). Each potential mate is characterized by a value representing his quality or reproductive fitness, and this value can be immediately recognized by the female. The female encounters a sequence of males with individual fitness values $x_1, x_2, x_3, \ldots$, where there is no limit to the potential number of encounters. She sequentially accepts or rejects each male according to his fitness value and cannot recall previously rejected males. The female stops searching if and when she finally accepts a male. We assume that individual male fitness values can be modeled as independent observations from the $N(\mu, \gamma^2)$ distribution, that is, the normal distribution with mean $\mu$ and variance $\gamma^2$. For clarity of presentation, we assume the time between encounters, measured in appropriate units, has constant value 1.
Finally, we assume that the female incurs a fixed positive search cost c for each time unit that elapses up to and including the time a male is finally chosen, reflecting time or energetic costs incurred. The overall net fitness gain to the female is the fitness value of the male eventually chosen minus the total search costs.

Previous authors have used this basic model to study optimal mate choice rules for females in a constant environment, where each female in each generation independently encounters males with fitness values drawn from the same $N(\mu, \gamma^2)$ distribution with known values of $\mu$ and $\gamma^2$. We extend this model to study optimal mate choice rules in environments where the value of $\mu$ experienced by different females can vary, either within each generation or across generations.

We define a spatially varying environment to be one in which the value of $\mu$ varies from female to female within a given generation, according to a normal distribution with mean $\mu_0$ and variance $\sigma_0^2$. The values of $\gamma^2$, $\mu_0$, and $\sigma_0^2$ are assumed to be known and constant from generation to generation. We can gain insight into the spatially varying model by interpreting the average fitness values seen by females as being influenced by factors specific to the local environment or patch, so that $\mu$ represents a local average male fitness value. Indeed, we will sometimes speak of males "in a given location," though this is by no means the only interpretation of how this variability might occur.

We define a temporally varying environment to be one in which the value of $\mu$ is the same for all females within a given generation but varies from generation to generation according to a normal distribution. Again we write $\mu_0$ and $\sigma_0^2$ for the mean and variance of this distribution, and we assume that the values of $\gamma^2$, $\mu_0$, and $\sigma_0^2$ are known and constant from generation to generation. The temporally varying model describes cases where the average male fitness value is essentially the same across locations in a given generation, but where this average value changes appreciably from generation to generation because of seasonal factors that affect all locations equally.

In both these models, we interpret $\mu_0$ as the long-run average male fitness value, we interpret $\sigma_0^2$ as the environmental variance (the variance of $\mu$ about this value $\mu_0$, either within generations or across generations according to the context), and we interpret $\gamma^2$ as the residual variance (the variability of local male fitness values around their local average). In many cases, quantities of interest will depend linearly on $\mu_0$ and depend on $\sigma_0^2$ and $\gamma^2$ only through the ratios $c/\sigma_0$, $c/\gamma$, or $\sigma_0/\gamma$. In the numerical examples used to illustrate our results, we therefore mainly focus on fixed values for the parameters $\mu_0$ and c (taking $\mu_0 = 5$ and $c = 0.1$) and a fixed range of values for the parameters $\sigma_0^2$ and $\gamma^2$ ($\sigma_0 = 0.5, 1, 2$ and $\gamma = 0.5, 1, 2$), giving an appropriate range of illustrative values. Finally, we note that many of our qualitative results remain the same if we relax the distributional assumptions to allow distributions with roughly the same shape as the normal or if we assume mortality or discounted costs (Real 1990). Similarly, the analysis is essentially unchanged if we allow the times between encounters to be continuous random variables drawn from some given distribution.
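To fix ideas, the search process just described is straightforward to simulate. The sketch below is our own illustration, not code from the paper: the names (simulate_search, rule, and so on) are hypothetical, and a "rule" is simply any callable that maps the history of observed male values to an accept/reject decision.

```python
import random

def simulate_search(rule, mu0=5.0, sigma0=1.0, gamma=1.0, c=0.1,
                    max_encounters=10_000, rng=random):
    """Simulate one female in a spatially varying environment.

    The local mean mu is drawn from N(mu0, sigma0^2); male fitness values
    are then drawn from N(mu, gamma^2). The net gain is the value of the
    accepted male minus c per encounter (encounters take unit time).
    """
    mu = rng.gauss(mu0, sigma0)          # local mean, unseen by the female
    history = []
    for n in range(1, max_encounters + 1):
        x = rng.gauss(mu, gamma)         # fitness of the nth male
        history.append(x)
        if rule(history):
            return x - c * n             # overall net fitness gain
    return history[-1] - c * max_encounters  # safety cap, not in the model

# Example: a fixed threshold rule, accept iff the current male's value >= T.
fixed_threshold = lambda history, T=5.5: history[-1] >= T
print(simulate_search(fixed_threshold))
```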
FIXED THRESHOLD RULES IN CONSTANT AND VARYING ENVIRONMENTS

A mate choice rule that is particularly attractive because of its simplicity is a fixed threshold rule with threshold T, of the form: accept a prospective mate with fitness value x if and only if

$$x \ge T. \qquad (1)$$

In particular, Real (1990) showed that a fixed threshold rule is optimal when the environment is constant and the value of $\mu$ is fixed (corresponding to the case $\sigma_0 = 0$ in our model). Here we review the constant environment case in order to develop insight into how the performance of these fixed threshold rules is affected when the environment varies.

In a constant environment, the mate choice problem is stationary in that the expected net future gain of continuing under an optimal rule is constant, independent of the current stage n and the current and previous observed male values. This follows from the fact that the values of successive males are independently drawn from the same known distribution and that recall is not allowed. Denote the expected net future gain under an optimal rule by $V^*$. As with any similar sequential search model, an optimal female action at each stage is to accept the current prospective male if and only if the reward from mating is greater than or equal to the expected net future gain under an optimal rule. It follows that the optimal rule in a constant environment is a fixed threshold rule and that the optimal threshold $T^*$ at each stage is exactly equal to $V^*$.

In our model, we have specifically excluded recall of previously rejected males as an option. However, because the optimal rule uses a fixed threshold, we see that in a constant environment, the recall of previously rejected potential mates would be of no benefit; a male deemed unacceptable when first encountered would have a fitness value below the fixed threshold and thus would always be unacceptable in the future.

An explicit expression for the expected overall net fitness gain V(T) under a fixed threshold rule with threshold T is

$$V(T) = \mu + \gamma\,\frac{\phi\big((T-\mu)/\gamma\big)}{1-\Phi\big((T-\mu)/\gamma\big)} - \frac{c}{1-\Phi\big((T-\mu)/\gamma\big)}, \qquad (2)$$

where $\phi$ and $\Phi$ are, respectively, the probability density function and distribution function of the N(0, 1) distribution. Here $1-\Phi((T-\mu)/\gamma)$ is the probability that a randomly chosen male will exceed the threshold T, and $1/(1-\Phi((T-\mu)/\gamma))$ is the expected search time (i.e., the number of males encountered up to and including the one accepted). Figure 1 shows typical plots of V(T) for a range of c values; in each case, the graph takes value $\mu - c$ for T small, increases slowly with increasing T, and then falls steeply for larger values of T. Because $T^* = V^* = V(T^*)$, the values of $T^*$ and $V^*$ can be read off from the graph as the coordinates of the point of intersection with the line of unit slope through the origin.

Figure 1. Plot of the expected overall net fitness gain V(T) against the fixed threshold T, for a range of values of the search cost c. For clarity, the negative gains are omitted. For each value of c, the maximum value of V(T) occurs at the value $T^*$ corresponding to the point of intersection with the line of unit slope through the origin. The parameter values are $\mu = 5$, $\gamma = 1$, and c = (from bottom left) 1, 0.5, 0.1, 0.01, 0.001.

To motivate the shape of the graph of V(T), note that V(T) is the difference between the expected value of the male eventually accepted and the expected cost of searching.
Both of these terms are increasing functions of the threshold T. For T small, the expected value of the male accepted is close to $\mu$, and the expected search time is close to 1. As T increases, the expected value of the male accepted also increases, initially at a faster rate than the expected search time (and hence the expected search cost), so V(T) increases. Eventually, the rate of increase of the expected search time (and cost) becomes so large that the expected cost of using a higher threshold starts to outweigh the increase in the expected value of the male accepted, until finally the costs dominate the expected return. The important feature of this graph is its emphatically asymmetric shape: it is quite flat for values of T less than the optimal value $T^*$ but falls very steeply for values of T greater than $T^*$. The expected overall net gain for a female using a fixed threshold less than $T^*$ is close to that under an optimal rule, even if T is appreciably less than $T^*$. However, her expected overall net gain drops sharply if she uses a fixed threshold significantly greater than $T^*$. We would therefore expect selective pressure on females using a fixed threshold to lead them to err on the low side rather than the high side in choosing the threshold value.

When male values come from the $N(\mu, \gamma^2)$ distribution, $V^*$ and $T^*$ are given explicitly by

$$V^* = T^* = \mu + \gamma\,\Psi^{-1}(c/\gamma), \qquad (3)$$

where $\Psi(x) = \phi(x) - x\,(1 - \Phi(x))$ for real values of x (Real 1990). Thus, the fixed threshold used by the optimal rule is of the form

$$T^* = \mu + d^*. \qquad (4)$$

Here $\mu$ is the mean male fitness value and $d^* = \gamma\,\Psi^{-1}(c/\gamma)$ is a relative threshold representing how good a male has to be, relative to the known mean, to be acceptable. Note that the optimal fixed threshold $T^*$ and hence $V^*$ both increase linearly with $\mu$ for fixed $\gamma$ and c.

Consider 2 females following optimal rules in 2 environments that differ only in the mean male fitness value, say $\mu_1$ and $\mu_2$. Although the optimal thresholds $T_1^* = \mu_1 + \gamma\Psi^{-1}(c/\gamma)$ and $T_2^* = \mu_2 + \gamma\Psi^{-1}(c/\gamma)$ would differ, the relative thresholds and the search time distribution would be the same, so the observed pattern of searching and acceptance would look exactly the same in both cases. However, the actual thresholds used and the expected value of the mate actually obtained would differ by an amount $\mu_1 - \mu_2$, reflecting the difference in the overall mean fitness values. Moreover, if by mistake a female in an environment where the mean was $\mu_2$ used the fixed threshold $T_1^*$ that was optimal for an environment where the mean was $\mu_1$, then we see from the shape of the graph of V(T) that she would do only slightly worse than the optimal if $\mu_1$ was less than $\mu_2$, because then $T_1^* < T_2^*$; but she would do very much worse if $\mu_1$ was significantly greater than $\mu_2$, as then $T_1^*$ would be significantly greater than $T_2^*$.
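Equations 2 and 3 above are easy to evaluate numerically. The following sketch is our own illustration using only the Python standard library: it computes V(T) directly and recovers $T^*$ by bisection on $\Psi$, which is strictly decreasing; all names are ours.

```python
import math

def phi(x):   # standard normal pdf
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def Phi(x):   # standard normal cdf
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))

def V(T, mu, gamma, c):
    """Expected overall net gain under a fixed threshold rule (Equation 2)."""
    z = (T - mu) / gamma
    p_accept = 1.0 - Phi(z)              # chance a random male clears T
    return mu + gamma * phi(z) / p_accept - c / p_accept

def psi(x):   # Psi(x) = phi(x) - x*(1 - Phi(x)), strictly decreasing
    return phi(x) - x * (1.0 - Phi(x))

def optimal_threshold(mu, gamma, c, lo=-10.0, hi=10.0):
    """Solve Psi(z) = c/gamma by bisection, giving T* = mu + gamma*z (Eq. 3)."""
    target = c / gamma
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if psi(mid) > target:            # Psi decreasing: root lies right
            lo = mid
        else:
            hi = mid
    z = 0.5 * (lo + hi)
    return mu + gamma * z

T_star = optimal_threshold(mu=5.0, gamma=1.0, c=0.1)
print(T_star, V(T_star, 5.0, 1.0, 0.1))  # both ~5.90, since T* = V(T*)
```

With $\mu = 5$, $\gamma = 1$, and $c = 0.1$, both printed values are approximately 5.90, reflecting the fixed-point property $T^* = V(T^*)$ that Figure 1 displays graphically.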
BAYESIAN MODEL FOR A VARYING ENVIRONMENT

Now consider a female searching for a mate in a spatially or temporally varying environment. Under the varying environment model, the local male fitness values are drawn from the $N(\mu, \gamma^2)$ distribution, where the local variance $\gamma^2$ is known but the value of the local mean $\mu$ is unknown. Before the female encounters her first male, her only information about $\mu$ is that it has been randomly drawn from the $N(\mu_0, \sigma_0^2)$ distribution, where $\mu_0$ and $\sigma_0^2$ are known. Initially, her best estimate is to assume that $\mu$ takes the value $\mu_0$. However, with each male she encounters, she gains information that she can use to update her estimate of $\mu$ and, more generally, her overall uncertainty about the value of $\mu$. This process of updating can be precisely quantified by modeling the process in a Bayesian framework.

Under the Bayesian model, the distribution of $\mu$ across different locations or generations can be interpreted as a subjective prior distribution, representing the female's initial beliefs about the unknown value of $\mu$ in her locality. Our assumptions imply that this distribution is $N(\mu_0, \sigma_0^2)$, where $\mu_0$ is now interpreted as the known prior mean of this subjective distribution and $\sigma_0^2$ is interpreted as the prior variance (so $1/\sigma_0^2$ is the prior precision, giving a measure of how closely her beliefs are concentrated around the mean). Each time the female encounters another male, Bayes theorem specifies how her beliefs prior to the encounter should be updated and modified to give a posterior distribution that incorporates the information provided by the newly observed male fitness value.

Consider a female who has just encountered her nth male. Denote the fitness values observed so far by $x_1, \ldots, x_n$, with mean $\bar{x}_n = (x_1 + \cdots + x_n)/n$. Standard results (DeGroot 1970) show that her posterior distribution is the $N(\mu_n, \sigma_n^2)$ distribution, so her information state can be summarized by the posterior mean $\mu_n$ and the posterior variance $\sigma_n^2$ (or posterior precision $1/\sigma_n^2$), where for $n = 1, 2, 3, \ldots$

$$\mu_n = b_n \mu_0 + (1 - b_n)\,\bar{x}_n \quad\text{and}\quad \frac{1}{\sigma_n^2} = \frac{1}{\sigma_0^2} + \frac{n}{\gamma^2}, \qquad (5)$$

and the weights $b_n$ have the form

$$b_n = \frac{\gamma^2/\sigma_0^2}{\gamma^2/\sigma_0^2 + n}. \qquad (6)$$

We see that the actual value of $\mu_n$ changes randomly at each encounter, depending, through $\bar{x}_n$, on the value of $x_n$ observed. However, the uncertainty about $\mu$, represented by $\sigma_n^2$, decreases deterministically with each extra observation because $1/\sigma_n^2$ increases by $1/\gamma^2$ irrespective of the observed values $x_1, \ldots, x_n$. The weight $b_n$ given to the initial estimate $\mu_0$ also decreases deterministically with each extra observation, again independently of the observed values $x_1, \ldots, x_n$. Note that the weight depends only on the value of n and the relative values of the residual (local) variance $\gamma^2$ and the environmental variance $\sigma_0^2$ and not on their absolute values. As the local variability $\gamma^2$ decreases relative to $\sigma_0^2$, less weight is given to $\mu_0$ and more weight is given to the data. This is to be expected because a smaller $\gamma^2$ means the local data are less variable and hence a more reliable indicator of $\mu$. Indeed, if $\gamma^2$ was very small, then every male in a location would have roughly the same value, and the first male value observed would be a much better indicator of local male values than $\mu_0$. Alternatively, if $\sigma_0^2$ was very small, then there would be very little variation in the value of $\mu$ seen by different females, and the problem would reduce to the setting of a constant environment with known mean $\mu_0$.

OPTIMAL MATE CHOICE IN A SPATIALLY VARYING ENVIRONMENT

In a spatially varying environment, the male fitness values encountered by an individual female are independent observations from the $N(\mu, \gamma^2)$ distribution, where the value of $\mu$ varies independently from female to female (or location to location) according to the $N(\mu_0, \sigma_0^2)$ distribution. Because the randomness affects each female in the population independently of the other population members, it is appropriate to base fitness measures on the overall net fitness gain to each individual female (Houston and McNamara 1999). Thus, we define an optimal mate choice rule in a spatially varying environment to be one that maximizes the expected net fitness gain, where this expectation is taken first over the male values experienced for each value of $\mu$ and then over the distribution of $\mu$ across locations.

Recall that an optimal action for a female at each stage is to accept the current prospective male if and only if the reward from mating is greater than or equal to the expected net future gain under an optimal rule. Consider a female who has just encountered her nth male with observed fitness value $x_n$ and is deciding whether to accept or reject this potential mate. We saw in the previous section that, having observed $x_n$, her current state of information is characterized by the values $\mu_n$ and $\sigma_n^2$. Define $V^*(\mu_n, \sigma_n^2)$ to be the expected future net fitness gain if she continues searching under an optimal rule, starting from the state $(\mu_n, \sigma_n^2)$. Then, the optimal action at stage n is:

accept $x_n$ if and only if $x_n \ge T^*(\mu_n, \sigma_n^2)$, where $T^*(\mu_n, \sigma_n^2) = V^*(\mu_n, \sigma_n^2)$. $\qquad (7)$

Following DeGroot (1970, p. 336–41), the optimal threshold for fixed c can be expressed as $\mu_n$ plus a term that depends only on $\sigma_n^2$, while Equation 5 shows that, for fixed $\gamma^2$ and $\sigma_0^2$, $\sigma_n^2$ itself depends only on n. Thus, for fixed c, $\gamma^2$, and $\sigma_0^2$, there is a corresponding sequence of fixed values $d(0), d(1), d(2), \ldots$, such that the dependence of the optimal threshold and optimal future net fitness gain on the current information state $(\mu_n, \sigma_n^2)$ has the simple form

$$V^*(\mu_n, \sigma_n^2) = T^*(\mu_n, \sigma_n^2) = \mu_n + d(n), \qquad (8)$$

where $d(n)$ depends on n but not on $\mu_0$ or $x_1, \ldots, x_n$ and hence not on $\mu_n$. DeGroot (1970) gives an iterative procedure for numerically calculating $d(0), d(1), d(2), \ldots$, but there is no simple explicit expression for the sequence of values. We call the rule defined by these thresholds an optimal learning rule. In particular, because the female starts the process with an initial prior distribution $N(\mu_0, \sigma_0^2)$, the overall expected net fitness gain under an optimal rule is given by

$$V^*(\mu_0, \sigma_0^2) = \mu_0 + d(0). \qquad (9)$$

In the form given by Equation 8 above, the female compares $x_n$ with a threshold in which the value of $\mu_n$ already incorporates the information provided by $x_n$. From Equations 5 and 6, we can express $\mu_n$ in terms of $x_n$ and $\mu_{n-1}$ and hence rewrite the inequality $x_n \ge \mu_n + d(n)$ in the equivalent form $x_n \ge \mu_{n-1} + d(n)\,\sigma_{n-1}^2/\sigma_n^2$. This provides an equivalent rule for the optimal action at the nth stage of the form:

accept $x_n$ if and only if $x_n \ge \mu_{n-1} + e(n)$, $\qquad (10)$

where $e(n) = d(n)\,\sigma_{n-1}^2/\sigma_n^2$. Again, because $\sigma_n^2$ changes deterministically with n, e(n) depends only on n (for given c, $\gamma^2$, and $\sigma_0^2$), and both d(n) and e(n) can be shown to converge to the same limit $\gamma\Psi^{-1}(c/\gamma)$ as n becomes large. Figure 2 displays the typical form of e(n) for a range of environmental variances.

Figure 2. Plot of the relative threshold e(n) used by the optimal learning rule, against n, for a range of values of the environmental variance $\sigma_0^2$. The parameter values are $c = 0.1$, $\gamma^2 = 1$ (so $\gamma\Psi^{-1}(c/\gamma) = 0.90$), and $\sigma_0$ = (from bottom left) 0.5, 1, 2. At $n = 10$, the values of e(n) in the figure lie between 0.91 and 0.92.
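To make the updating concrete, the posterior recursion of Equations 5 and 6 and the equivalent-form rule of Equation 10 can be sketched in a few lines of Python. This is a minimal illustration in our own notation; we do not reproduce DeGroot's iterative computation of d(n). Instead, the example call supplies the constant limiting value $\gamma\Psi^{-1}(c/\gamma) \approx 0.90$ (for $c = 0.1$, $\gamma = 1$), which turns the sketch into one of the simple learning rules discussed later rather than the exactly optimal rule.

```python
import random

def posterior_search(e, mu0=5.0, sigma0=1.0, gamma=1.0, c=0.1, rng=random):
    """Search using Equation 10: accept x_n iff x_n >= mu_{n-1} + e(n).

    e is a function n -> relative threshold e(n). Supplying the exactly
    optimal e(n) (DeGroot 1970) gives the optimal learning rule; a constant
    value gives a simple learning rule.
    """
    mu_local = rng.gauss(mu0, sigma0)        # true local mean, unseen
    a = gamma**2 / sigma0**2                 # prior "pseudo-sample size"
    mu_post, n, xbar = mu0, 0, 0.0
    while True:
        x = rng.gauss(mu_local, gamma)
        n += 1
        if x >= mu_post + e(n):              # threshold uses mu_{n-1}
            return x - c * n                 # net fitness gain
        xbar += (x - xbar) / n               # running mean of observations
        b = a / (a + n)                      # weight b_n from Equation 6
        mu_post = b * mu0 + (1 - b) * xbar   # posterior mean, Equation 5

# Constant relative threshold at the limit gamma * Psi^{-1}(c/gamma) ~ 0.90:
print(posterior_search(e=lambda n: 0.90))
```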
Consider now how the action specified by the optimal learning rule changes in practice over a sequence of encounters. From Equation 10, the optimal action for the female at the nth encounter is to compare the new observation $x_n$ with a threshold composed of the previous Bayes estimate of the mean (the posterior mean $\mu_{n-1}$) plus the relative threshold e(n). Before her first encounter, her estimate of the mean is $\mu_0$, and the value of the relative threshold e(1) is set somewhat high (Figure 2). The learning process enables her to take advantage of cases where the local value of $\mu$ is higher than the long-run average male fitness value $\mu_0$ and not rush to accept the first male whose value, though high, may well be exceeded in later encounters. However, as she starts to encounter further males, the weights $b_n$ ensure that the posterior mean adjusts rapidly toward the observed mean-to-date $\bar{x}_{n-1}$, whereas at the same time, the relative threshold e(n) falls rapidly to the value $\gamma\Psi^{-1}(c/\gamma)$ that is optimal in a constant environment (see Equation 3). By this stage, the threshold has adjusted to a value that would be optimal in a constant environment with the same values of $\gamma$ and c but where the local mean $\mu$ had a value equal to $\mu_{n-1}$. This enables the female to adapt quickly in cases where the local value of $\mu$ is lower than the long-run average male fitness value $\mu_0$ and not incur unnecessary costs by continuing with a threshold that is too high.

The threshold defined by the optimal learning rule is free to rise or fall during a search, depending on the values $x_1, x_2, x_3, \ldots$ experienced by the searcher. However, because the female only accepts mates of value higher than her current threshold, each time she decides to continue searching, the rejected male must have had a relatively low fitness value. This in turn feeds into her posterior mean $\mu_n$, and it can be shown analytically that the expected value of the threshold used falls over time, where the expectation is over females still searching at that stage. Thus, the current posterior mean and the current threshold of a female still searching at the nth stage will also be relatively low. Figure 3 gives some indication of how thresholds fall on average as searching continues. For each n, what is plotted is the average threshold used by all females still searching at the nth stage, where the expectation is taken first with respect to the (relatively low) male values that must have been experienced to date and then with respect to the $N(\mu_0, \sigma_0^2)$ distribution for the local mean $\mu$. Although the comment of Real (1990) that "No simple, smooth behaviour can be predicted" certainly does apply to the changing thresholds of any fixed focal female, we see that these average thresholds do fall smoothly and predictably as the number of males encountered increases.

Figure 3. Plot of the expected threshold for the optimal learning rule against the number of prospective mates seen, for a range of values of the environmental variance $\sigma_0^2$. The parameter values are $\mu_0 = 5$, $c = 0.1$, $\gamma = 1$, and $\sigma_0$ = (from left) 0.5, 1, 2.

Indeed, as n increases, we see from Equations 5 and 6 that more and more weight is given to the increasing amount of local data. Moreover, in the Bayesian context, the local value of $\mu$ will eventually be learned as more and more observations are taken, and the problem then reduces to the known (constant) environment model discussed above. This is exactly reflected in the behavior of the thresholds specified by the optimal learning rule: as n tends to infinity, the weight given to the observations tends to 1, $\bar{x}_n$ tends to $\mu$, and d(n) tends to $\gamma\Psi^{-1}(c/\gamma)$, so that the optimal threshold tends to $\mu + \gamma\Psi^{-1}(c/\gamma)$, agreeing with the threshold (Equation 3) that is optimal for the corresponding constant environment model.
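The decline of the average threshold among still-searching females is easy to reproduce by simulation, in the spirit of Figure 3. The sketch below is our own construction; for simplicity it again uses a constant relative threshold e in place of the optimal e(n), which does not affect the selection effect being illustrated: any one female's threshold can rise or fall, but the average over females still searching falls with n.

```python
import random

def average_thresholds(n_females=20_000, n_stages=10, e=0.90,
                       mu0=5.0, sigma0=1.0, gamma=1.0, rng=random):
    """Average threshold mu_{n-1} + e over females still searching at stage n."""
    sums = [0.0] * n_stages
    counts = [0] * n_stages
    for _ in range(n_females):
        mu = rng.gauss(mu0, sigma0)
        a = gamma**2 / sigma0**2
        mu_post, xbar = mu0, 0.0
        for n in range(1, n_stages + 1):
            sums[n - 1] += mu_post + e       # threshold offered at stage n
            counts[n - 1] += 1
            x = rng.gauss(mu, gamma)
            if x >= mu_post + e:
                break                        # she accepts and stops searching
            xbar += (x - xbar) / n
            b = a / (a + n)
            mu_post = b * mu0 + (1 - b) * xbar
    return [s / c for s, c in zip(sums, counts) if c]

print(average_thresholds())  # the averages fall with the stage index
```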
Finally, we can gain some insight by comparing the optimal learning rule with more simplistic rules. One estimate of the local mean $\mu$, based only on the long-run average value and ignoring the local observations, would be $\mu_0$. Another estimate, based only on the current local observations $x_1, \ldots, x_n$ and ignoring the long-run average, would be the local observed average $\bar{x}_n = (x_1 + \cdots + x_n)/n$. Analogy with the constant environment model suggests 2 simple but extreme possible mate choice rules: 1) accept $x_n$ if and only if $x_n \ge \mu_0 + d'$ and 2) accept $x_n$ if and only if $x_n \ge \bar{x}_n + d''$, where $d'$ and $d''$ are appropriately chosen relative thresholds. The first rule ignores the data and compares $x_n$ with an absolute standard $\mu_0$. It might fail to take advantage of good locations or lead to a prolonged and costly search in poor locations. The second rule is a purely relative rule, comparing $x_n$ only with the female's own experience and ignoring the long-run average. It might be misled into poor decisions by unusually high or low early observations. Both rules have their drawbacks, and we can see from Equations 5 and 10 that the optimal rule depends on a weighted average of these 2 simple extreme estimates, where the Bayesian analysis precisely identifies the optimal weights.

THE REPRODUCTIVE ADVANTAGE OF LEARNING

In a spatially varying environment, the optimal learning rule specified by Equation 7 or 10 gives greater expected reward than any other rule. It certainly does better than any fixed threshold rule, but on the other hand, it is also much more complex than a fixed threshold rule. There may be real costs associated with, say, the increased neurological capacity required to successfully implement more complex rules. Although such costs are not explicitly modeled here, their possible presence prompts us to compare the reproductive advantage of the optimal learning rule with that of the best fixed threshold rule, so as to identify what type of conditions favor learning in a spatially varying environment.

Table 1. Expected reward using the optimal learning rule and the best fixed threshold rule for a range of values of the environmental variance $\sigma_0^2$ and the residual variance $\gamma^2$. The parameter values are $\mu_0 = 5$ and $c = 0.1$.

                    Optimal learning rule              Best fixed threshold rule
             σ0 = 0.5    σ0 = 1    σ0 = 2       σ0 = 0.5    σ0 = 1    σ0 = 2
  γ = 0.5      5.14       5.09      5.06          5.02       4.91      4.90
  γ = 1        5.80       5.71      5.65          5.73       5.28      4.92
  γ = 2        7.43       7.32      7.19          7.39       7.06      5.97

The numerical results set out in Table 1 show that, for fixed values of the residual variance $\gamma^2$, the expected reward under the optimal learning rule decreases as the environmental variance $\sigma_0^2$ increases. Conversely, for fixed values of $\sigma_0^2$, it increases as $\gamma^2$ increases. The reward from using the best fixed threshold rule behaves in a similar way: it falls with $\sigma_0^2$ for fixed $\gamma^2$, and it increases with $\gamma^2$ for fixed $\sigma_0^2$. Moreover, the difference between the expected reward under the optimal learning rule and that under the best fixed threshold rule increases appreciably as $\sigma_0^2$ increases. The expected reward under the optimal learning rule was calculated using Expression 9, and the other values were evaluated by repeated numerical integration.
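Expected rewards of the two rule families can also be estimated by plain Monte Carlo rather than numerical integration. The sketch below is our own illustration for the hardest cell of Table 1 ($\sigma_0 = 2$, $\gamma = 2$); the thresholds T = 7.0 and d = 2.5 are hand-picked for illustration, not the optimizing values behind the table, so the printed means will only roughly track the tabled figures.

```python
import random, statistics

def net_gain(threshold_fn, mu0=5.0, sigma0=2.0, gamma=2.0, c=0.1,
             cap=100_000, rng=random):
    """Net gain of one female; threshold_fn(history) -> current threshold."""
    mu = rng.gauss(mu0, sigma0)
    history = []
    for n in range(1, cap + 1):
        x = rng.gauss(mu, gamma)
        if x >= threshold_fn(history):
            return x - c * n
        history.append(x)
    return history[-1] - c * cap       # safety cap only; not part of the model

def fixed(T):
    return lambda history: T

def simple_learning(d, mu0=5.0, sigma0=2.0, gamma=2.0):
    a = gamma**2 / sigma0**2                      # prior pseudo-sample size
    def threshold(history):
        n = len(history)                          # males rejected so far
        if n == 0:
            return mu0 + d
        b = a / (a + n)                           # weight b_n (Equation 6)
        return b * mu0 + (1 - b) * statistics.mean(history) + d
    return threshold

reps = 5_000
for name, rule in [("fixed, T = 7.0", fixed(7.0)),
                   ("simple learning, d = 2.5", simple_learning(2.5))]:
    gains = [net_gain(rule) for _ in range(reps)]
    print(name, round(statistics.mean(gains), 2))
```

The learning rule's advantage here comes almost entirely from locations where the drawn local mean is low, where the fixed rule pays a large search cost before (if ever) accepting.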
Overall, we see that the optimal learning rule has the greatest advantage when both the residual and environmental variances are relatively large. Figure 4 compares the performance of the 2 types of rules in terms of the percentage increase in expected reward from using the optimal learning rule rather than the best fixed threshold rule, for a range of values of $\sigma_0^2$ and $\gamma^2$. Although the absolute values of the expected rewards increase with $\gamma^2$ and decrease with $\sigma_0^2$, the percentage increases do not behave in quite the same way, as seen, for example, in the plotted values at $\sigma_0 = 1$. This is because the baseline (the reward from the fixed threshold rule) also increases, but at a different rate from that for the optimal learning rule. However, a similar overall picture emerges—learning is most advantageous when the environmental variance and the residual variance are both large.

Figure 4. Percentage increase in expected reward from using the optimal learning rule rather than the best fixed threshold rule, for a range of values of the environmental variance $\sigma_0^2$ and the residual variance $\gamma^2$. The parameter values are $\mu_0 = 5$; $c = 0.1$; $\sigma_0 = 0.5, 1, 2$; and $\gamma = 0.5, 1, 2$.

To gain more insight into why and when learning is advantageous, it is helpful to consider what happens for extreme values of the variance parameters. When the environmental variance $\sigma_0^2$ is very small, the environment is relatively constant, and the mean is roughly the same in all locations, so (Real 1990) the optimal fixed threshold will perform well throughout. Conversely, if the environmental variance $\sigma_0^2$ is very large, then the mean value of the prospective mates is much more variable across locations, and it is advantageous to use the observed values seen during the search to provide a much better estimate of the local mean value. If the local residual variance $\gamma^2$ is very small, there is very little variation in the male values around the local mean. A rule that accepts the first male encountered is then optimal, as it minimizes cost without sacrificing potential reward, and under this rule the expected fitness of the mate accepted is equal to the local mean. Indeed, it does not matter much which male is chosen because all males in a given locality are of similar quality. Conversely, when $\gamma^2$ is very large, male values show high local variability; a fixed threshold rule will still do well when $\sigma_0^2$ is small, but when $\sigma_0^2$ is large, it may fail to take advantage of the local possibility for high rewards, and there is then strong selection pressure for rules that learn. This explains why the optimal learning rule has significantly greater advantage only when both the residual and environmental variances are large.

The complexity of the optimal learning rule leads us to investigate whether there are other learning rules that retain its reproductive advantage but are significantly less complex. With this in mind, we define the class of simple learning rules to be the set of rules of the form: accept $x_n$ if and only if $x_n \ge \mu_{n-1} + d$ for some fixed relative threshold d. Such rules retain the capacity of the optimal learning rule to learn the value of the local mean from the observed male values, using this information in the form of the Bayes estimate $\mu_{n-1}$.
However, they use a simple fixed relative threshold d instead of the more complicated optimal relative threshold e(n) that changes deterministically at each stage. Note that the initial value of the Bayes estimate is just the long-run average male fitness value $\mu_0$, and each updating involves only the new observation and the value of the ratio of the environmental variance $\sigma_0^2$ to the residual variance $\gamma^2$. Because $\mu_0$, $\sigma_0^2$, and $\gamma^2$ are all assumed to remain constant from generation to generation, the population might more easily evolve a simple learning rule that was adapted to these known values. In our computations, the simple learning rule did nearly as well as the optimal learning rule—its performance was within 1% of that of the optimal learning rule across the full parameter range, though the difference increased very slightly as $\sigma_0$ increased. Thus, the numerical results and comments above apply equally to comparing the best simple learning rule with the best fixed threshold rule. This will be particularly relevant when we discuss the evolution of rules.

We now look at the thresholds themselves and, in particular, how the best fixed threshold and the fixed relative threshold of the best simple learning rule depend on the (relative) values of the environmental variance $\sigma_0^2$ and the residual variance $\gamma^2$. Recall, from Figure 1, that a fixed threshold can perform very badly if it is even slightly higher than the optimal value for its local environment. To give a good average performance across the full distribution of $\mu$ values, an inflexible fixed threshold must balance accepting lower than optimal values in cases when $\mu$ is large against incurring prohibitively high search costs when $\mu$ is small. We see from Figure 5a that the value of the best fixed threshold has the following properties: 1) it is below the long-run average male fitness value (here $\mu_0 = 5$) for a range of $\sigma_0$ values when $\gamma$ is small because there is nothing to learn, 2) it increases significantly with $\gamma$ for each fixed $\sigma_0$, and 3) it decreases slightly as $\sigma_0$ increases for each fixed $\gamma$, to guard against a low local value of $\mu$, where the decrease is more pronounced for small values of $\gamma$ than for larger values. Similarly, the value of the fixed relative threshold for the best simple learning rule increases significantly with $\gamma$ for each fixed $\sigma_0$ but changes very little with $\sigma_0$ for each fixed $\gamma$ (increasing, but only relatively slowly, as $\sigma_0$ increases).

Figure 5. (a) Dependence of the best fixed threshold on $\sigma_0$ and $\gamma$. (b) Dependence of the fixed relative threshold for the best simple learning rule on $\sigma_0$ and $\gamma$. In both cases, the plots are over the range $0.5 \le \gamma \le 2$, with parameter values $\mu_0 = 5$, $c = 0.1$, and $\sigma_0 = 0.5, 1, 2$.

The fixed relative threshold of a simple learning rule is itself a compromise between the different stage-dependent values of the optimal relative threshold e(n) (see Figure 2). In the cases considered here, e(n) is decreasing in n and quickly approaches its limiting value of $\gamma\Psi^{-1}(c/\gamma)$, so the best value of d is always slightly above this limit.
For comparison with the fixed threshold values in Figure 5a, note that a simple learning rule with relative threshold d starts off with initial acceptance threshold $\mu_0 + d$. After that, its exact value changes randomly in response to the observations. Taking $\mu_0 = 5$ here, we find that the best simple learning rule starts off initially with a higher threshold than the best fixed threshold rule, for all values of $\gamma$ and $\sigma_0$. This enables it to take advantage of situations where the local value of $\mu$ is high. However, the learning component ($\mu_{n-1}$) allows it to adjust quickly, reduce its threshold, and avoid excessive search costs in cases where the observed male fitness values indicate that the local value of $\mu$ is low. This is well illustrated in Figure 6, where we see that the average search time under the best fixed threshold rule increases dramatically in locations where $\mu$ is low. Thus, in terms of the average search time, the best fixed threshold rule does only slightly better than the optimal learning rule in locations where $\mu$ is high but is strikingly worse than the optimal learning rule in locations where $\mu$ is low.

Figure 6. Plot of the average search time against the local value of $\mu$ for the optimal learning rule and the best fixed threshold rule. The rules used for illustration were the optimal rules for the parameter values $\mu_0 = 5$, $c = 0.1$, $\sigma_0 = 2$, and $\gamma = 2$.

EVOLUTION OF RULES

We have analyzed optimal learning rules, but can rules of this type easily evolve? To investigate this, we simulated the evolution of learning rules using a GA. A GA can be regarded as a minimal model of the genetics rather than a realistic representation (Axelrod 1987). In each simulation, we followed a population of size 250 for 6000 generations. Crossover and mutation were allowed, and rules were coded using a Gray code, so mutations were always to neighboring thresholds. Generally convergence had occurred within the first 1000 generations, and the results shown are averages over the last 5000 of the 6000 generations. Details can be found in Ramsey (1994), but these details are not important because results were robust to parameters such as the mutation rate and the fraction of the population replaced each generation.

We first considered a constant environment. Here we took a rule to be described by a single parameter: the acceptance threshold. In this case, the average population fitness rapidly converges to near maximum values. In particular, evolved rules easily outperformed the rule of choosing the first male encountered when the variance $\gamma^2$ was appreciably large. Evolved thresholds, however, fluctuated more markedly and tended to be significantly below the optimal threshold. This is consistent with Figure 1: there is low selection pressure below the optimal threshold but high selection pressure to reduce the threshold when it is above the optimal threshold. We also considered these fixed threshold rules in a spatially varying environment. Evolved thresholds decreased as $\sigma_0^2$ increased and increased as $\gamma^2$ increased, as theory predicts (see Figure 5). Thresholds were again below the optimal fixed threshold.

To allow learning in a spatially varying environment, we evolved rules whose form is motivated by the simple learning rules considered above. In particular, from Equations 7 and 8, an nth male of quality $x_n$ is accepted under the optimal learning rule if and only if $x_n \ge \mu_n + d(n)$, where, by rewriting Equation 5, $\mu_n$ is given by

$$\mu_n = \frac{q\mu_0 + n(1-q)\bar{x}_n}{q + n(1-q)}, \quad\text{where}\quad q = \frac{\gamma^2/\sigma_0^2}{\gamma^2/\sigma_0^2 + 1}.$$

With this in mind, we specify a rule by 3 parameters $m_0$, $d$, and $r$, where $0 < r < 1$. Under a rule with these parameters, the nth male with quality $x_n$ is accepted if and only if $x_n \ge m_n + d$, where

$$m_n = \frac{r m_0 + n(1-r)\bar{x}_n}{r + n(1-r)}.$$

Thus, the parameter $m_0$ plays the role of a prior estimate of the overall mean $\mu_0$, $d$ represents a fixed relative threshold, and $r$ plays the role of $q$, reflecting the relative weight given to the prior mean $m_0$ and the observed mean $\bar{x}_n$ and thus the speed of learning. A sketch of this 3-parameter rule is given below.
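To make the parameterization concrete, here is a sketch (our own; all names hypothetical) of a single search under the 3-parameter rule. The GA itself—population of 250, Gray coding, crossover, and mutation (Ramsey 1994)—is not reproduced; one would simply evolve the triple $(m_0, d, r)$ against the average of this net gain across females.

```python
import random

def three_param_rule_search(m0, d, r, mu_local, gamma=2.0, c=0.1, rng=random):
    """Run one search with the evolvable rule (m0, d, r).

    Accept the nth male x_n iff x_n >= m_n + d, where
    m_n = (r*m0 + n*(1-r)*xbar_n) / (r + n*(1-r)); r in (0, 1) plays the
    role of q and sets the speed of learning.
    """
    total, n = 0.0, 0
    while True:
        x = rng.gauss(mu_local, gamma)
        n += 1
        total += x
        xbar = total / n
        m_n = (r * m0 + n * (1 - r) * xbar) / (r + n * (1 - r))
        if x >= m_n + d:
            return x - c * n       # net fitness gain for this female

# A female in a poor location (mu = 3) under an illustrative parameter triple:
print(three_param_rule_search(m0=5.0, d=2.5, r=0.5, mu_local=3.0))
```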
We considered values of $\gamma^2$ and $\sigma_0^2$ lying in the range {0.25, 0.5, 1, 2, 4}. We evolved rules repeatedly for each of the 25 combinations of these parameters. When such rules evolve, there is initially strong selection pressure against high values of d. Convergence is slower than for fixed threshold rules, but there was convergence by generation 1000. As noted above, the theory predicts that when $\gamma^2$ is small, it does not matter much which male is chosen because they are all of similar quality, and the selection pressure is low provided achieved thresholds are sufficiently low. For the evolution of rules, this neutrality means that there may be significant drift in evolved parameters, and it may be difficult for the more complex rules to home in on exactly the right combinations of rule parameters. Taken together with the continuing effects of mutation and crossover, this means there may still be noticeable variation and possibly underperformance within a population using complex rules, even when it is close to convergence. When $\gamma^2$ is large, the theory predicts that the performance of the best fixed threshold rule will be comparable to that of the optimal learning rule when $\sigma_0^2$ is small but may be significantly worse when $\sigma_0^2$ is large, as there is then strong selection pressure for rules that learn. Figure 7 clearly illustrates these points for large $\gamma^2$, with low selection pressure leading to evolved fixed threshold rules actually doing slightly better than evolved simple learning rules when $\sigma_0^2$ is small but doing significantly worse as $\sigma_0^2$ increases.

Figure 7. Comparison of the average reward obtained under the evolved fixed threshold rules and the evolved simple learning rules for a range of values of the environmental variance $\sigma_0^2$. The rules were evolved using environmental parameter values $\mu_0 = 5$, $c = 0.1$, $\gamma = 2$ and (a) $\sigma_0 = 0.5$, (b) $\sigma_0 = 1$, and (c) $\sigma_0 = 2$. The results shown are averages over the last 5000 of 6000 generations.

In all cases, regardless of the strength of selection, there was high variability in parameters and correlations in parameters across successive runs with the same parameter values. When $\sigma_0^2$ is low, we see from Figure 8a that $m_0$ and $d$ are highly negatively correlated. This correlation can be understood as follows. In this case, the environment is approximately constant. Thus, learning is not valuable, and fixed threshold rules perform well. A learning rule with $r = 1$ is just a fixed threshold rule with threshold $m_0 + d$. This threshold can be achieved by different combinations of the 2 parameters $m_0$ and $d$. In particular, a range of values of $m_0$ can achieve the optimal threshold provided that as $m_0$ is increased $d$ is decreased to compensate. Of course, the behavior observed under a rule depends on how the threshold changes with experience. Thus, high variability in $m_0$ and $d$ does not necessarily imply high variability in what would be observed. When the environmental variance $\sigma_0^2$ is large, Figure 8b shows that there is a high negative correlation between the values of $r$ and $d$ used by members of the evolved population.
The intuition is that the female needs to be very responsive to the locally observed values because the mean of the distribution varies so much from location to location. For example, if $\mu$ is low and the relative threshold $d$ is high, then the female may take a very long time to accept a male; she will incur considerable search costs unless she quickly lowers her threshold in response to encounters with poor-quality males, and this responsiveness corresponds to a low value of $r$. However, if $d$ is lower, the female does not need to be so responsive. Thus, there are a variety of different ways of achieving near-optimal behavior, ranging from adapting fast with a high relative threshold to adapting more slowly but with a lower relative threshold, and a good simple learning rule must use a low $r$ value if its $d$ value is high and vice versa. Our results indicate that different populations could have found different ways of solving the same problem, for example, using different combinations of values of $r$ and $d$, and these different $r$ and $d$ values will result in observable differences in behavior. A corollary of this is that observed differences in female behavior between populations do not necessarily imply that females are facing different problems or even using different types of rule.

Figure 8. Correlations between the evolved parameters for simple learning rules, across 20 successive runs. (a) Correlations between $m_0$ and $d$ for rules evolved using environmental parameter values $\mu_0 = 5$, $c = 0.1$, $\gamma = 2$, and $\sigma_0 = 0.5$. (b) Correlations between $r$ and $d$ for rules evolved using environmental parameter values $\mu_0 = 5$, $c = 0.1$, $\gamma = 1$, and $\sigma_0 = 2$.

TEMPORALLY VARYING ENVIRONMENTS

When fluctuations are temporal as opposed to spatial, all females in the population experience the same value of $\mu$ in a given year. Thus, if $\mu$ varies (i.e., $\sigma_0^2 > 0$), all females are subject to the same environmental fluctuations, and natural selection maximizes geometric mean fitness (Lewontin and Cohen 1969). To define the fitness of a particular rule, let $V(\mu)$ be the expected payoff to a female using the rule when the environmental mean is $\mu$. Then, the geometric mean fitness of the rule is $G = \exp(g)$, where

$$g = E\{\log V(\mu)\}. \qquad (11)$$

The expectation in Equation 11 is an average over possible values of $\mu$. Because of the logarithm in Expression 11, geometric mean fitness is highly sensitive to low values of $V(\mu)$. This means that it is more important to avoid poor performance for some values of $\mu$ than to achieve good performance for other values. We can therefore expect the optimal learning rule to be conservative, avoiding risks so as to perform reasonably in all years.
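A direct way to read Equation 11 is to estimate $g$ by averaging $\log V(\mu)$ over draws of $\mu$. The sketch below is our own illustration, using the fixed threshold rule (whose $V(\mu)$ is given by Equation 2) for concreteness; the flooring step is our addition, since the logarithm requires $V(\mu) > 0$.

```python
import math, random, statistics

def V_fixed(T, mu, gamma=2.0, c=0.1):
    """V(mu) for a fixed threshold rule, from Equation 2."""
    z = (T - mu) / gamma
    p = 0.5 * math.erfc(z / math.sqrt(2))              # 1 - Phi(z)
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    return mu + (gamma * pdf - c) / p

def geometric_mean_fitness(T, mu0=5.0, sigma0=1.0, reps=50_000, rng=random):
    """G = exp(E[log V(mu)]) of Equation 11, by Monte Carlo over years."""
    logs = []
    for _ in range(reps):
        v = V_fixed(T, rng.gauss(mu0, sigma0))
        # Flooring a (rare) nonpositive payoff keeps the logarithm defined;
        # such years are then heavily penalized, which is precisely the
        # conservatism the geometric mean criterion imposes.
        logs.append(math.log(max(v, 1e-6)))
    return math.exp(statistics.mean(logs))

# Compare thresholds: the log criterion punishes any appreciable chance of
# a bad-year collapse, favoring more conservative (lower) thresholds than
# the arithmetic-mean criterion would.
for T in (5.5, 6.0, 6.5):
    print(T, round(geometric_mean_fitness(T), 3))
```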
In this situation, it is impractical to find optimal learning rules under this fitness criterion. Instead, we evolved rules using the GA. When fixed threshold rules are evolved, we see from Figure 9 that the resulting thresholds are significantly lower than in a spatially varying environment with the same parameters, especially for high $\sigma_0^2$. This is as expected because low thresholds reduce the risk of a very low payoff in bad years (low $\mu$).

Figure 9. Comparison of the average thresholds used by fixed threshold rules evolved in spatially varying environments with those evolved in temporally varying environments. The rules were evolved using environmental parameter values $\mu_0 = 5$, $c = 0.1$, and $\gamma = 2$ and (a) $\sigma_0 = 0.5$, (b) $\sigma_0 = 1$, and (c) $\sigma_0 = 2$.

However, when simple learning rules are evolved, there is surprisingly little difference from the spatially varying case. The exception is that $r$ is significantly lower. This is so that in a bad year the threshold is lowered quickly in response to encounters with poor-quality males, reducing the risk of a long and costly search.

DISCUSSION AND COMPARISON WITH PREVIOUS MODELS

We introduce a model for optimal mate choice in situations where the observed values of prospective mates are subject to 2 types of variation—variation of individual observations about the (local) mean and variation of the local mean about some long-run average value. Each female observes prospective mates in a single location, rather than the multiple patches of the model of Hutchinson and Halupka (2004). In contrast to the "full knowledge" models of Real (1990) and Janetos (1980), we assume that the local mean is unknown and can vary from location to location or season to season. Under our model, the local male population is possibly infinite and is observed sequentially with no possibility of recall, so the best-of-n rule (Janetos 1980) is not relevant, and finite population models (Dombrovsky and Perrin 1994) are excluded. The value of each male encountered is observed exactly, obviating the need for reinspection or a comparative Bayes approach (Luttbeg 1996, 2002; Fawcett and Johnstone 2003b). We emphasize an optimality framework with a single choosy sex, rather than a game theoretic framework (McNamara and Collins 1990; Johnstone 1997; Bergstrom and Real 2000; Fawcett and Johnstone 2003a), and our optimality criterion is based on a single measure combining both the value of the mate chosen and the search costs, as opposed to simply maximizing the value of the mate (Mazalov et al. 1996) or maximizing the probability of choosing the best mate (Dombrovsky and Perrin 1994).

In a constant environment, where a fixed threshold rule is known to be optimal (Real 1990), we show that there will be strong selective pressure against females that use thresholds higher than this optimal value, but there may be only weak selective pressure against females that use lower than optimal thresholds. In a varying environment, the long-run performance of a mate choice rule must be evaluated in terms of its average performance over the whole range of environmental conditions and not just by its performance in a single fixed environment. We show that the optimal rule is a learning rule, under which the acceptance threshold is composed of the current estimate of the mean plus a relative threshold that depends only on the number of males encountered to date, and we demonstrate how learning enables the female to take advantage of the local information through the changing weight given to the local mean. We compare the performance of the optimal learning rule with both fixed threshold rules and simpler learning rules, analyzing the effect of different levels of variability on the performance of these rules. We show that the relative advantage of learning is greatest when both residual and environmental variances are large because this results in high local variability about a very variable mean and therefore strong selection pressure for learning.
Although the acceptance threshold of an individual focal female may vary unpredictably over her search (Real 1990), we show that under the optimal learning rule, the expected threshold used falls smoothly as a function of the number of males encountered, when averaged over females still searching at that stage. Results for temporally varying environments are not presented in detail as they are qualitatively similar to those for spatially varying environments. However, they do indicate even greater selective pressure against setting thresholds too high.

We also break new ground in quantitatively and qualitatively exploring the evolution of these different rules, again looking at the effect of different levels of variability. The simulated evolutions indicate that evolved fixed thresholds will be below the optimal fixed threshold in both constant and spatially varying environments, reflecting the selective pressure against erring on the high side. In spatially varying environments, the evolved thresholds decrease as the environmental variance increases and increase as the residual variance increases. For simple learning rules, our results indicate high variability in the evolved rule parameters for different members of the population, emphasizing the variety of ways in which a near-optimal rule can be expressed.

Overall, Mazalov et al. (1996) comes perhaps closest to our analysis in spirit and content. They consider a finite horizon sequential learning model in which local male values have a normal distribution with unknown mean (and possibly variance). They derive an optimal learning rule, in some ways similar to ours in structure, based on updating estimates of the local mean and variance, and compare it with the fully adapted rule that would be optimal if the parameters were known. However, despite the common elements, our treatment differs from theirs in several crucial respects. Firstly, their model has no search costs, so their criterion for an optimal rule is just that it maximizes the expected value of the male obtained, and expected search times do not influence the choice of an optimal rule. Our optimality criterion combines both rewards and costs into a complete overall measure of a rule's performance. This leads us to very different conclusions about the properties of an optimal rule and the relative performance of fixed and learning rules in a varying environment.

To illustrate the effect of the difference in optimality criterion, consider the following oversimplified analysis of how a fixed rule fully adapted to a given local mean would perform relative to an optimal learning rule in an environment with a different local mean when the horizon is large. For the example computed in Mazalov et al. (1996), very roughly speaking, the fixed rule has a similar performance to their learning rule when the actual local mean is much lower than the value to which the rule is adapted (essentially because it may eventually accept a relatively high-quality male, though with small probability on each encounter) but performs significantly worse than the learning rule when the local mean is higher than the value to which the rule is adapted (now it may accept a relatively low-quality male).
In a similar situation, under our model and optimality criterion, the discussion earlier in this paper indicates just the opposite: the fixed rule performs much worse than our learning rule when the actual local mean is much lower than the value to which it is adapted (it is penalized by the long search time that results from an over-high threshold) but performs similarly to the optimal learning rule when the local mean is higher than the adapted value (its low search cost may compensate for the lower value of the male accepted).

A second crucial difference is that Mazalov et al. (1996) assume that females have no prior knowledge of the parameters of the local distribution of male quality, so that parameter estimation is based purely on the observed values. This has the particular disadvantage that, under their model, the female is not allowed to accept the first male encountered, however high his value, but is constrained to use this value as an initial estimate of the local mean against which to judge the next encounter, updating as appropriate. We give a fully Bayesian treatment of inference for the local mean, assuming that females have adapted to the relevant long-run parameters.

Finally, we note that our comparisons of different types of rules are made under the assumption that recall is not allowed; the advantage of learning may be even greater if recall of previously rejected prospective mates is possible. In a constant environment, the local mean is effectively known, and the observations are independent given that mean, so they carry no extra information about it, and a learning rule can do no better than a fixed threshold rule. In a varying environment, however, the observations do carry information about the local mean, and a learning rule can exploit this information by lowering its threshold in response to low observed male fitness values. In such cases, a prospective mate rejected earlier may later seem much more attractive. A fixed threshold rule can never take advantage of this information: a prospective mate that was unacceptable at one point in time will always have a value below the fixed threshold and so will never become acceptable, whatever the indications that the local mean is unexpectedly low. In particular, a learning rule will be favored over a fixed threshold rule when recall is reliable and the costs of recall and search are very low, as for a female bird choosing a mate from a male lek.

We thank Alasdair Houston, Barney Luttbeg, and Peter Todd for their comments on earlier drafts of this paper. John McNamara acknowledges the support of the Leverhulme Trust. We also thank the editor and 2 anonymous referees for their helpful and perceptive comments.

REFERENCES

Alatalo RV, Carlson A, Lundberg A. 1988. The search cost in mate choice of the pied flycatcher. Anim Behav 36:289–91.
Axelrod R. 1987. The evolution of strategies in the iterated prisoner's dilemma. In: Davis L, editor. Genetic algorithms and simulated annealing. Los Altos, CA: Morgan Kaufmann Publishers Inc. p 32–41.
Bakker TCM, Milinski M. 1991. Sequential mate choice and the previous male effect in sticklebacks. Behav Ecol Sociobiol 29:205–10.
Bergstrom CT, Real LA. 2000. Towards a theory of mutual mate choice: lessons from two-sided matching. Evol Ecol Res 2:493–508.
Collins EJ, McNamara JM. 1993. The job-search problem with competition: an evolutionarily stable dynamic strategy. Adv Appl Probab 25:314–33.
Collins SA. 1995. The effect of recent experience on female choice in zebra finches. Anim Behav 49:479–86.
DeGroot MH. 1970. Optimal statistical decisions. New York: McGraw-Hill.
Dombrovsky Y, Perrin N. 1994. On adaptive search and optimal stopping in sequential mate choice. Am Nat 144:355–61.
Downhower JF, Lank DB. 1994. Effect of previous experience on mate choice by female mottled sculpins. Anim Behav 47:369–72.
Fawcett TW, Johnstone RA. 2003a. Mate choice in the face of costly competition. Behav Ecol 14:771–9.
Fawcett TW, Johnstone RA. 2003b. Optimal assessment of multiple cues. Proc R Soc Lond B Biol Sci 270:1637–43.
Houston AI, McNamara JM. 1999. Models of adaptive behaviour. Cambridge, UK: Cambridge University Press.
Hutchinson JMC, Halupka K. 2004. Mate choice when males are in patches: optimal strategies and good rules of thumb. J Theor Biol 231:129–51.
Janetos AC. 1980. Strategies of female mate choice: a theoretical analysis. Behav Ecol Sociobiol 7:107–12.
Johnstone RA. 1997. The tactics of mutual mate choice and competitive search. Behav Ecol Sociobiol 40:51–9.
Lewontin RC, Cohen D. 1969. On population growth in a randomly varying environment. Proc Natl Acad Sci USA 62:1056–60.
Luttbeg B. 1996. A comparative Bayes tactic for mate assessment and choice. Behav Ecol 7:451–60.
Luttbeg B. 2002. Assessing the robustness and optimality of alternative decision rules with varying assumptions. Anim Behav 63:805–14.
Mazalov V, Perrin N, Dombrovsky Y. 1996. Adaptive search and information updating in sequential mate choice. Am Nat 148:123–37.
McNamara JM, Collins EJ. 1990. The job-search problem as an employer-candidate game. J Appl Probab 28:815–27.
Milinski M, Bakker TCM. 1992. Costs influence sequential mate choice in sticklebacks, Gasterosteus aculeatus. Proc R Soc Lond B Biol Sci 250:229–33.
Parker GA. 1983. Mate quality and mating decisions. In: Bateson P, editor. Mate choice. Cambridge, UK: Cambridge University Press. p 141–66.
Ramsey DM. 1994. Models of evolution, interaction and learning in sequential decision processes [dissertation]. Bristol, UK: Department of Mathematics, University of Bristol.
Real L. 1990. Search theory and mate choice. I. Models of single-sex discrimination. Am Nat 136:376–404.
Simao J, Todd PM. 2002. Modelling mate choice in monogamous mating systems with courtship. Adapt Behav 10:113–36.
Svensson EI, Sinervo B. 2004. Spatial scale and temporal component of selection in side-blotched lizards. Am Nat 163:726–34.
Wong BBM, Candolin U. 2005. How is female mate choice affected by male competition? Biol Rev Camb Philos Soc 80:559–71.