AN EFFICIENCY APPROACH TO CHOICE SET GENERATION by David Scrogin, Richard Hofler, Kevin Boyle, and J. Walter Milon Corresponding author: David Scrogin Department of Economics University of Central Florida Orlando, FL 32816-1400 (407) 823-4129 voice (407) 823-3269 fax [email protected] October 20, 2004 AN EFFICIENCY APPROACH TO CHOICE SET GENERATION DAVID SCROGIN, RICHARD HOFLER, KEVIN BOYLE, AND J. WALTER MILON Proper formation of individual choice (or consideration) sets is an ongoing concern of applied researchers. For destination choice analyses, criteria defined by travel time, distance, or site quality have a behavioral appeal, yet are fundamentally limited by their unidimensional scope. To remedy their shortcomings in a framework consistent with individual optimizing behavior, we develop a multi-variate econometric frontiers approach to individual choice set formation. Estimating stochastic frontier models relating costly trip inputs to utility generating outputs, choice sets are formed with respect to the deviation of sites from the efficient frontier. The composition of the choice sets, efficiency of site choice and probability of selection, and estimates of consumer surplus are contrasted with results obtained under deterministic choice set criteria. Key words: Choice sets, destination choice, discrete choice models, stochastic frontier models. ___________________________ David Scrogin is assistant professor, Department of Economics, University of Central Florida. Richard Hofler is professor, Department of Economics, University of Central Florida. Kevin Boyle is Libra Professor of Environmental Economics, Department of Resource Economics and Policy, University of Maine. J. Walter Milon is Provost’s Distinguished Research Professor, Department of Economics, University of Central Florida. An earlier version of the paper was presented at the 2004 meetings of the American Agricultural Economics Association. We thank Glenn Harrison, Shelby Gerking, Michael Caputo, and Mark Dickie for helpful comments. The data collection was funded by the Maine Agricultural and Forest Experiment Station and the Maine Department of Inland Fisheries and Wildlife. The views in this paper do not necessarily represent those of the Maine Department of Inland Fisheries and Wildlife or its personnel. All errors and interpretations are the responsibility of the authors. AN EFFICIENCY APPROACH TO CHOICE SET GENERATION Proper formation of individual choice (or consideration) sets is an ongoing concern of applied researchers. While factorial design algorithms and efficiency statistics assist in choice set design for stated choice experiments (e.g., conjoint analysis and multi-attribute contingent valuation), such tools elude the analyst of revealed preference (RP) data. Instead, the individual decision maker’s true choice set is not directly observed. As a result, particular care must be exercised in constructing choice sets for RP analyses. The task is further confounded when the underlying universal choice set–such as recreational sites, houses, or shopping locations–comprises a large number of feasible alternatives. Several recent studies of destination choice have proposed criteria defined by distance or travel time (trip inputs) and destination quality (trip outputs) for generating choice sets in situations where the universal choice set contains hundreds of sites (Parsons and Hauber; Whitehead and Haab; Hicks and Strand). Similar to McFadden’s random draws approach, these deterministic criteria avoid the computational burden and memory requirements associated with jointly modeling the formation of the individual’s choice set and the discrete choice (Manski; Swait and Ben-Akiva; Haab and Hicks, 1997; Swait). Furthermore, the rules seemingly possess a behavioral appeal: sites that are sufficiently distant from the point of trip origin or deficient in particular aspects may, one might argue, be excluded from the analysis. The common finding of the studies is that random utility model (RUM) coefficients and nonmarket use-values tend to be robust with deterministic criteria as the estimation results are similar to those obtained with the universal choice sets. This has led to the perception that sites failing to satisfy a given criterion may be excluded without cause for concern. 1 Yet a closer examination reveals the shortcomings of deterministic choice set rules. In particular, because sites that fail to satisfy a rule defined by trip inputs (or site quality) are discarded without regard to their respective quality (or trip inputs), the rule will fail to accommodate potential tradeoffs between inputs and attainable site quality. As a result, highquality sites may be discarded if the analyst selects a conservative input rule, while readily accessible sites may be discarded if a conservative output rule is selected. Furthermore, the quality of sites attainable with a given amount of inputs may vary sizably across the sites retained in the choice sets used for estimation. The prudent analyst may be troubled, however, by assuming individuals consider all such sites rather than those providing the greatest quality for a given amount of inputs (i.e., time and distance traveled). In light of these issues, we develop a multi-variate efficient frontiers approach (Aigner, Lovell, and Schmidt) for generating individual choice sets from large universal choice sets, such as those commonly confronted in destination choice research. Using an econometric framework complementing the probabilistic approaches to choice set formation examined by Swait and Ben-Akiva, Haab and Hicks (1997), and Swait, individual-level stochastic frontier models relating costly trip inputs to utility generating outputs are estimated to measure the relative efficiency of sites comprising the universal set. Choice sets are then formed under an efficiency criterion, which exclude sites that deviate sufficiently from the efficient frontier. The approach complements the travel cost frontier model of Smith, Palmquist, and Jakus and the minimum utility threshold model of choice set formation proposed by Klein and Bither. Coupled with its consistency with individual optimizing behavior and ease of implementation in the discrete choice framework, the frontiers approach may serve as a useful tool for choice set generation across a broad range of applications. 2 Additional background is provided in the next section. The site choice frontier function and efficiency criteria for choice set generation are developed in section three. Choice set composition, site choice efficiency, and estimated RUMs produced with the efficiency approach are contrasted with results produced with deterministic criteria in the case of recreational angler site choice. The implications for benefits estimation are addressed in section four. Section five concludes. The Choice Set Problem To motivate the choice set problem and development of the efficiency approach, we begin in the conventional way by assuming individuals derive utility from site quality and a numeraire good but are constrained by income (see e.g., Haab and McConnell). Quality is obtained by traveling a known distance or length of time to any of n sites in the universal set J. Furthermore, quality is treated as individual-specific and assumed to vary over sites. Assuming utility maximization, individual i’s problem may be written: (1) where V(.) is the indirect utility function, qj,i denotes the quality of site j attainable by the individual, yi is income, and pj,i is the implicit price of site access.1 The objective of the choice analyst is generally to recover estimates of the parameters of V(.) for testing hypotheses about individual preferences and to measure the non-market values of changes in the choice set composition (e.g., willingness-to-pay for changes in site quality). 3 To proceed requires specification of the indirect utility function and individual choice sets. While the former have been defined rather exclusively as linear-in-parameters, debate is ongoing regarding the proper definition of the choice sets. As a starting point, it is commonly assumed that an individual’s true choice set on a given choice occasion contains those alternatives considered by the individual (see Thill; Haab and Hicks, 1997, 2000). However, given imperfect individual level information–including the subset of sites in J that individuals are cognizant of, their familiarity with these sites, and constraints limiting the viability of some of the sites as choices–the true choice set is not perfectly observable. Further, as 2n - 1 choice sets may be formed from a universal set containing n alternatives, the complexity of properly specifying the choice set is compounded as n increases. The choice of choice set is therefore a dilemma to be resolved by the analyst as its misspecification will yield misleading information about individual preferences in the form of biased estimates of the parameters of the indirect utility function (Swait and Ben-Akiva; Horowitz; Haab and Hicks, 1997, 2000; Swait). Given that individual choice sets are unobservable, two ‘second-best’ approaches for choice set formation have emerged in the literature (see Thill and Haab and Hicks, 2000, for comprehensive reviews).2 The first, deterministic approach entails excluding sites from the choice sets used for estimation if they fail to satisfy a specific rule defined by the analyst. Examples in the destination choice literature include the maximum travel time and distance criteria examined by Parsons and Hauber, Whitehead and Haab, and Hicks and Strand, and in early work by Black, whereby sites located beyond a maximum travel time or distance threshold are discarded.3 The second, behavioral approach follows from Manski’s two-stage 4 model of choice set formation, whereby the probability an individual selects a given site is conditional upon the site appearing in the choice set. The probability that the site appears in the choice set is conditioned upon individual characteristics and then modeled jointly with the probability of site choice. Empirical applications to date include Swait and Ben-Akiva, Haab and Hicks (1997), and Swait. Whether to adopt simple deterministic criteria or to more rigorously develop a formal model of choice set formation will depend primarily upon the dimension of the universal choice set, the availability of individual information specific to choice set formation, and the analyst’s technical expertise. The deterministic approach is ad hoc as the analyst must decide upon both the exclusion criterion and the stringency with which it will be applied. In contrast, the behavioral approach is consistent with utility maximization and by design provides insight on the decision making process that is unobtainable with the deterministic approach (Manski; Thill).4 However, behavioral models of choice set formation become computationally intractable if the universal choice set contains more than a few alternatives (Haab and Hicks, 1997; Swait). Furthermore, empirical implementation of the behavioral approach requires that the instrument used for data collection be designed to elicit the specific individual information necessary to estimate the choice set component of the model, yet this may be infeasible in many circumstances. These information requirements also undermine the use of the behavioral approach with many existing RP datasets. In contrast, deterministic criteria are convenient to implement and computationally tractable regardless of the size of the universal choice set. They also require no additional data beyond that required of a standard model of destination choice (i.e., travel costs and site characteristics). 5 As destination choice analysts frequently confront universal choice sets containing large numbers of alternative locations (e.g., dozens or hundreds of sites), deterministic criteria have prevailed for generating choice sets in empirical applications. Several recent studies have examined the sensitivity of destination choice models to various deterministic criteria. Parsons and Hauber evaluated a rule defined by maximum travel time in modeling recreational angler site choice in Maine. Excluding sites in thirty minute increments from the trip origin, they found RUM coefficient and benefit estimates to be insensitive to the rule. Whitehead and Haab evaluated criteria defined both by trip inputs and outputs in studying angler site choice in the southeast United States. The input was defined by maximum trip distance, and two definitions of minimum attainable site quality were evaluated: the historic (five-year) daily catch rate of fish and an individual-specific catch rate generated from an estimated Poisson regression model (see e.g., McConnell, Strand, and Blake-Hedges). With sites excluded in sixty mile increments, the findings of Parsons and Hauber were supported. Similarly, for both site quality measures the estimated RUM coefficients rapidly converged upon those obtained with the universal choice set. Most recently, Hicks and Strand evaluated the maximum time and distance rules relative to ‘familiar’ choice sets constructed from individual responses to survey questions in studying Maryland beach visitation. In both cases, the estimates rapidly converged upon those obtained with the universal set, a result that may be attributed to a high degree of correlation between distance and travel time. In selecting deterministic choice set criteria, it is natural to question whether tradeoffs exist between attainable site quality and trip inputs. In many instances, quality may reasonably be expected to vary with the time and distance traveled. For example, to the 6 extent that public lands and waters within city boundaries are of lesser quality for any of a number of recreational activities (e.g., fishing and site seeing) compared to the surrounding rural countryside, attainable site quality will vary directly with travel time and distance as one moves outward from the city center. The exceptions occur if quality is homogenous over sites or the closest sites to the trip origin happen to be the highest quality sites. If attainable site quality is correlated with the time or distance traveled, then concern arises about the merits of deterministic rules for choice set generation. Specifically, if a rule is defined solely by the spatial or temporal proximity of sites to the trip origin, then there will be a tendency to discard high-quality sites if attainable site quality increases with the time or distance traveled. Similarly, if a rule is defined solely by site quality, then convenient and readily accessible sites will tend to be dropped. The implication is that deterministic rules may be inconsistent with utility maximization if the excluded sites dominate the sites retained in the choice sets used for estimation. Furthermore, the quality of sites attainable with a given quantity of trip inputs may vary sizably within the choice sets. However, the prudent analyst should be troubled by assuming that individuals consider all such sites rather than those that provide the greatest quality for a given amount of inputs. Efficient Frontiers and Individual Choice Sets To remedy the shortcomings of deterministic criteria while maintaining the hypothesis that trips require costly inputs to yield utility generating outputs, we develop a multi-variate efficient frontiers approach to choice set formation. Stochastic frontier models (Aigner, Lovell, and Schmidt; Kumbhakar and Lovell) are estimated in order to identify the efficient 7 site frontier for individual trip-takers. Measuring the distance of sites from the efficient frontier–referred to here as site efficiency–choice sets are then generated by retaining sites that satisfy a specific efficiency criterion. The stringency of the criterion is varied to evaluate its effects on the choice set composition, site choice probabilities, and consumer surplus. Smith, Palmquist, and Jakus (SPJ) is the only prior study to have employed frontier models for travel demand analysis. Their approach estimated a deterministic frontier model (Farrell) relating the costs of accessing sites for recreational angling to the average catch rate of fish and indicators of water quality. Our approach complements SPJ, however, we implement the more flexible stochastic specification of the frontier model at the level of the individual trip-taker in a discrete choice framework. As with applications in productivity analysis, frontier models are intended as tools for the analyst rather than models of individual behavior. Furthermore, the models do not identify the causes of ‘inefficient’ behavior. However, the appeal of the frontier models for choice set generation rests in their computational tractability relative to probabilistic choice set models (Swait and Ben-Akiva; Haab and Hicks,1997; Swait) and ease of implementation with large universal choice sets. Begin by assuming that trip inputs and site quality are continuous and non-negative. For consistency with the neoclassical production function and the deterministic frontier function of SPJ, attainable site quality is hypothesized to be positively correlated with travel time and trip distance. Defining qj,i as the quality of site j attainable by individual i given the trip input xj,i (time or distance), the individual’s efficient site frontier function is written: (2) 8 where is a positive parameter.5 Empirically, (2) will likely fail to hold perfectly in most, if not all, situations. Although the maximum attainable level of site quality may be expected to increase as one ventures further from home, the number of lesser quality sites may also increase. Those sites providing the greatest (least) quality for a given amount of inputs are the most efficient (inefficient) sites in the universal choice set J. To distinguish between sites on the basis of efficiency, the frontier function is weighted by a site efficiency term : (3) Site j lies on the frontier of efficient sites if inefficient within J as = 1, while the site is considered increasingly . As, all else constant, utility will increase as the level of quality increases, it follows that the utility attainable at relatively efficient sites will be greater than that attainable at a relatively inefficient sites. To also allow for random variation in the relation between site quality and trip inputs from the perspective of the analyst, an error term is included in (3). The resulting stochastic site frontier function is written: (4) And the logarithmic transformation of (4) is given by: (5) where uj,i . Specifying the functional relation between qj,i and xj,i and the 9 distributions of u and , maximum likelihood may be used to recover efficiency values ( Typically, and the site ’s ) indicating the proximity of the sites to the efficient frontier. , while u may be assigned any of is assumed to be distributed as several distributions (e.g., truncated normal, half normal, and exponential). Individual choice sets may be defined in terms of the efficiency of sites in the universal choice set J relative to site frontier (2). One possibility is to define a lower bound by weighting (2) by some value k (where 0 < k < 1, ) and then retain those sites located between the frontier and the lower bound. However, this ‘uniform’ approach is problematic as it ignores potential heterogeneity in the frontier function and site efficiency distribution across trip-takers. We instead define the lower bound at the individual level by weighting the frontier function by a linear combination of the mean and standard deviation of the individual’s site efficiency distribution: ( ). The parameter calibrates the location of the lower bound relative to the frontier and, hence, the composition of the choice set. The efficiency criterion for choice set generation is formally stated: (6) where For individual , site if: represents the choice set retained for estimation. The criterion states that sites are excluded from if their quality, conditional upon the inputs required to access the sites, is less than the lower bound. In general, J as 10 , while converges upon the efficient frontier as . Given that the utility attained with a given level of trip inputs will be greater at relatively efficient sites than at relatively inefficient sites, the lower bound defining the minimum admissible level of site efficiency is equivalent to a minimum utility threshold (Klein and Bither). As increases (decreases), the minimum level of efficiency required for sites to appear in the choice set increases (decreases), and as a result, the minimum utility attainable at the sites comprising the choice set increases (decreases). The composition of the choice sets may be evaluated with efficiency scores, which are calculated from the efficiency values of the sites contained in the universal choice set. As site efficiency is independent of the choice set criterion, the composition of the choice sets may be compared with the composition of choice sets generated from alternative approaches (e.g., travel time, distance, and site quality criteria). The efficiency score for site j is defined: (7) Efficiency Scorej = The scores of highly efficient (inefficient) sites will be positive (negative) and relatively large in absolute value, while sites possessing approximately average levels of efficiency will have efficiency scores of approximately zero. The mean efficiency score of a given choice set may be used as a gauge of its composition. A mean efficiency score of zero indicates the choice set contains both efficient and inefficient sites relative to the mean site. This would be expected, for example, with the universal set of sites. In general, if a choice set is comprised of relatively efficient (inefficient) sites, then its mean efficiency score will be positive (negative). In contrast, the mean site efficiency scores of choice sets generated from simple 11 deterministic criteria are expected to be approximately equal to zero due to the tendency of these criteria to exclude sites independently of their efficiency values. The characteristics of individual choices over the sites comprising their choice sets may also be evaluated in terms of the efficiency scores (7) and with site efficiency rankings (Jensen). The ratio of a choice set’s mean efficiency score to the efficiency score of the chosen site measures the efficiency of the individual’s site choice relative to the mean site. Formally, the relative efficiency score of the site chosen by individual i is defined: Relative Efficiency Score (8) where = denotes the efficiency value of the chosen site.6 For a given choice set, the site selected by the individual is relatively efficient (inefficient) the smaller (larger) the value of (8). In general, if the efficiency score of the chosen site exceeds the mean efficiency score of the choice set, then (8) will be less than one; otherwise, the relative efficiency of the site is no greater than the mean level of site efficiency. With regard to the efficiency criterion (6), as increases through its range, dropping relatively inefficient sites in the process, the mean efficiency score of an individual’s choice set will increase, yet the efficiency score of the chosen site will remain constant. As a result, (8) will increase as increases. Efficiency rankings (Jensen) are also used to examine the relative efficiency of individual choices. These are constructed by placing the 12 sites contained in the choice set in descending order by their efficiency values and then assigning consecutive integers (1, 2, 3, ..., ) to the ordered sites. The ratio of the rank of the chosen site to the number of sites in the choice set may be used as a second measure of the relative efficiency of individual choice. The relative efficiency rank of the site chosen by individual i is defined: (9) Relative Efficiency Rank = If the individual selected the most efficient site, then (9) is equal to 1/ for all choice sets generated under the efficiency criterion (6), while a value of one indicates that the least efficient site was chosen. In general, as relatively inefficient sites are excluded according to the efficiency criterion (i.e., as is increased), the denominator of (9) will decrease, and as a result, the relative efficiency rank will approach one for a chosen site of a given rank. In contrast, as deterministic choice set rules discard sites without regard to their efficiency characteristics, the relative efficiency rank of the chosen site is expected to remain approximately constant as the stringency of the rule is varied. Model Specification The modeling of individual choice sets and site choice is performed in two stages. In the first-stage, the stochastic frontier models relating site quality and trip inputs are estimated separately for each individual in the sample. Choice sets are then generated with the efficiency criterion (6) evaluated across a large range of values of the calibrating 13 parameter . The probability of site choice is then modeled in the second stage. The deterministic portion of the first-stage frontier model is specified as double-log: . This specification is the benchmark in the production frontiers literature and permits convenient calculation of and its interpretation as an efficiency measure (Aigner, Lovell, and Schmidt; Kumbhakar and Lovell). The random error term, is assumed to be distributed as , , and several distributions were examined for the log-transformed efficiency term, u . The stochastic frontier model with truncated normal site efficiency is presented here.7 The log-likelihood function of the model is written: (10) where , and is the standard normal cumulative distribution function. Given this specification of the stochastic site frontier model, the efficiency value of site j may be written for individual i as: (11) where and are defined: (12) 14 And, the empirical expression of the efficiency criterion (6) is written: (13) For individual where ln( )= site if: . For estimation of the second-stage models of site choice, the utility derived by individual i from visiting site j is expressed: (14) where v(.) is the observable, deterministic portion of utility, and is the unobservable, random utility component. The probability of choosing site j from the set may be written: (15) Distributional assumptions about yield specific RUMs. As with the frontier models, alternative specifications were investigated.8 McFadden’s conditional logit model, which results from assuming the random utility component is iid type-I extreme value, was selected for this application. The conditional logit site choice probability is written: (16) 15 And the log-likelihood function of the conditional logit is written: (17) where zj,i equals one if individual selects site and is zero otherwise. To evaluate the effects of the efficiency criterion on the estimated RUM coefficients, choice sets are generated across the range of values defining and the RUM is then estimated at each value. The Data The empirical application is angler choices of inland fishing sites in Maine during the 19992000 openwater season. The data were obtained through mail surveys using the method of Salant and Dillman. The surveys elicited the lakes and ponds visited, the quantity of each of ten species of fish caught, and socioeconomic characteristics of the anglers. The sample used here contains 1,081 Maine residents who took 5,556 single-day trips to 483 sites.9 For estimation of the first-stage site choice frontier models, the trip input is defined as the round-trip travel time between the centroid of the respondent’s home zip-code and the site access points. One-way travel time averages 1.25 hours and ranges from about zero to five hours (Table 1). Converted to natural logarithms, the input is referenced by Ln(Time). Site quality is defined as an individual-specific expected daily catch rate of salmon, brook trout, lake trout, and brown trout. The expected catch rate is generated by estimating specieslevel negative binomial count data models (see Cameron and Trivedi) relating catch to site attributes and angler characteristics and then summing the fitted values over the species (see 16 Scrogin et al.).10 Converted to natural logarithms, the catch rate is referenced by Ln(Quality). For estimation of the site choice models, the deterministic indirect utility function v(.) is specified as linear-in-parameters and linear-in-variables. Five independent variables comprise v(.). These include the four species-specific expected catch rate variables (Salmon, Brook Trout, Lake Trout and Brown Trout) and a travel cost price proxy (Travel Cost) defined as the sum of explicit round-trip travel costs ($0.33/mile) and the opportunity cost of travel time, valued at one-third of the estimated hourly wage rate. Estimation Results Choice Set Frontier Models The site choice frontier model was estimated separately for each individual in the sample over the 483 sites comprising the universal choice set. Results reported in Table 2 indicate that expected catch increases significantly with travel time on average. As the estimated coefficients may be interpreted as elasticities, the results indicate that a one hundred percent increase in one-way travel time (about 1.25 hours) yields about a fifteen percent increase in expected daily catch on average. More than seventy-six percent of the coefficients are positive and significant, and only two percent are negative and significant.11 Table 2 also presents summary statistics on the estimated site efficiency estimates ( average site is relatively efficient ( deviation ( ’s). Overall, the = 0.87), but the relative size of the average standard = 0.25) and range of values (0.06 – 0.99) indicates the efficiency estimates vary sizably within the sample. 17 Analysis of the Choice Sets Having estimated the first-stage site choice frontier models and efficiency statistics, choice sets are formed with the efficiency criterion (6) for estimation of the second-stage site choice models. As the site efficiency estimates are obtained independently from the choice sets, the composition of the choice sets may be compared to the composition of choice sets produced with alternative approaches. To demonstrate, travel-time dependent choice sets were also constructed (Parsons and Hauber; Hicks and Strand). The left half of Table 3 reports results with the efficiency rule, and the right half reports results with the deterministic rule. Considering the former, the 483 site universal choice set (J) is obtained for all individuals by reducing the calibrating parameter to = -3.0 standard deviations below the mean site efficiency. Alternatively, increasing the parameter to = 2.0 standard deviations above the mean site efficiency yields the minimum feasible choice set, containing two sites. Evaluated at the mean, , the average choice set contains approximately 231 sites ( = 17.1). With the deterministic rule, J is obtained by including all sites within five hours, and the minimum feasible choice set is obtained by excluding all sites exceeding eighteen minutes. Evaluated at the mean (1.25 hours), the average choice set contains about fifty more sites ( = 277) than that obtained with the efficiency approach, and the spread in the choice set size around the mean is about five times larger ( = 79.0). The composition of the choice sets is evaluated in terms of the mean of the site efficiency scores (7). With the site efficiency criterion, the mean efficiency score increases from approximately zero at the universal set to beyond 2.0 as 18 increases (column 6). In contrast, the choice set composition remains approximately constant on average as sites are excluded by the deterministic travel time rule, with the average efficiency score reaching a maximum value of only 0.13 (column 13). This invariance is due to the deterministic rule discarding both high-quality, efficient sites and low-quality, inefficient sites as the permissible level of travel time decreases. Analysis of Site Choice The relative efficiency of individual choice and the probability of site selection are evaluated next. As shown in the first row of Table 4, with the universal choice set the absolute and relative efficiency ranks of the chosen sites (9) average 156 and 0.32, respectively, and the relative efficiency scores (8) average approximately zero. Overall, some anglers selected the most efficient sites in their choice sets (Min = 1), while others selected relatively inefficient sites (Max n*). Comparing the rules, the relative efficiency of site choice differs considerably on average as sites are excluded. With the site efficiency rule, as approaches its maximum (2.0) the relative efficiency rank (column 5) approaches its maximum (1.0), and the relative efficiency score increases beyond 3.0 (column 6). This indicates that as increases, the chosen sites become increasingly inefficient within the choice sets relative to the universal choice set. In contrast, with the deterministic rule the relative efficiency ranks remain approximately constant as sites are excluded (column 11), and the relative efficiency score deviates only slightly above that observed with the universal set on average (column 12). 19 The estimated RUM coefficients are examined graphically in Figures 1 and 2. Overall, fifty-three of the fifty-five RUM coefficients estimated with the efficiency rule (i.e., five coefficients estimated at each of the eleven values of ) were significant (0.01 level), while all fifty-five of the coefficients estimated with the deterministic rule were significant (see Appendices D and E for numerical results). Considering the former, the estimated Travel Cost coefficients are negative as expected but approach zero as increases (Figure 1), indicating that the marginal effect of travel cost on the probability of site choice increases as the relative efficiency of the sites comprising the choice set increases. In contrast, the Travel Cost coefficients estimated with the deterministic rule remain virtually constant as the permissible travel time is varied. This is attributed to the invariance of both the composition of the choice sets, as indicated by the constancy of the mean efficiency scores (Table 3), and the relative location of the chosen sites within the choice sets, as indicated by the constancy of the relative efficiency ranks and relative efficiency scores (Table 4). As shown in Figure 2, the estimated catch rate coefficients trend downward in all cases as increases and become negative for at least two of the four species (e.g., Brook Trout and Brown Trout) at relatively conservative values (i.e., as > 0). This is because increases and the resulting proportion of efficient sites in the choice sets increases, the relative efficiency of the chosen sites decreases as shown in Table 4 by the increasing relative efficiency scores and ranks. The quality of the chosen sites is, therefore, less than the quality attainable at other sites requiring the same travel costs, and as a result, the probability of site choice is estimated to decrease as expected catch increases. Alternatively, the catch rate coefficients estimated with deterministic choice sets are insensitive to the level of travel time. 20 The explanation for this finding is that the composition of the choice sets and the relative efficiency of the chosen sites remain approximately constant as the permissible level of travel time is varied as indicated by the invariance of the mean efficiency scores (Table 3) and the relative efficiency scores and relative efficiency ranks (Table 4). To summarize, the invariance of the RUM coefficients estimated with deterministic choice sets is consistent with the findings of Parsons and Hauber, Whitehead and Haab, and Hicks and Strand. In contrast, the parameter estimates converge at a much slower rate with choice sets generated with the efficiency approach. However, as the efficiency criterion evaluated at a specific value of the calibrating parameter ( ) is consistent with a minimum utility threshold, the behavior of the estimated coefficients of the indirect utility function could be explained in terms of the exogenously determined site efficiency statistics. In adopting the efficiency criterion, a natural question follows regarding the value that should be assigned to . As with other choices made in model implementation, the choice of ultimately rests with the analyst. Furthermore, an appropriate value of will likely be application-specific so that sensitivity analyses similar to those performed here (Tables 3-4) may be necessary to guide the decision. However, as behavior models of destination choice typically assume–and empirical observation commonly supports–that, all else constant, site quality (travel cost) is positively (negatively) related to individual utility, the decision regarding an appropriate value of this occurs when may be guided by economic theory. In the present study, is assigned a value that is no greater than zero. 21 Implications for Benefits Measurement A common use of RUMs in destination choice research is to estimate the non-market benefits associated with exogenous changes in some or all of the available sites. With public resources, such as the water bodies and adjoining lands considered here, changes in policy may affect site quality or accessibility. To mitigate environmental degradation or stock depletion, state and federal agencies often impose regulations upon particular activities in which the public may engage–directly or indirectly affecting site quality–or instead restrict public access through site closures. As the welfare effects of policy changes necessarily depend upon the composition of the sites comprising the choice sets, the choice set criterion serves a crucial role in calculating non-market values. In evaluating the effects of the choice set on non-market benefit estimates, we focus upon a controversial set of five large lakes contained in the ‘China Lakes’ region of Maine. While the lakes are among the state’s most coveted coldwater-fisheries, agricultural and forestry production coupled with residential land developments have led to increased levels of eutrophication and toxic loading in their waters (see Parsons, Plantinga, and Boyle for a detailed discussion). The policy scenario we consider is the hypothetical closure of the five lakes (thirteen site access points) to recreational anglers. The monetary value of the change in welfare experienced by individual i from the closure of one or more sites is measured by the compensating variation (CVi ). For the conditional logit model CVi is defined: (18) 22 where MUy represents the marginal utility of income, and n* and n** denote, respectively, the number of sites in the choice set before and after the site closures. Estimated values of CVi are obtained by replacing MUy with the Travel Cost coefficient from an estimated RUM and the v(.) with their estimated values (see e.g., Haab and McConnell, pp. 226-232). As with the site efficiency analysis (Tables 3-4), we begin in Table 5 with the universal choice set (row 1). Here, the thirteen access points appear in one hundred percent of the individual choice sets. The mean welfare loss for closure of the five lakes is calculated to be $0.32 on average (columns 4 and 9). With the site efficiency rule, the number of affected sites in the choice set declines to less than one on average as increases (column 2), however, the mean welfare loss per trip rises sharply. Given that the percentage of individuals affected by the site closures falls as rises (column 3), the results indicate that those who are affected experience large welfare losses. In comparison, the number of affected sites contained in the deterministic choice sets is considerably larger on average, especially at relatively conservative values of travel time (column 7). The percentage of anglers affected by the site closures is also notably larger than with the site efficiency approach, with the results again being amplified as the size of J* decreases (column 8). Given the insensitivity of the RUMs estimated with deterministic choice sets, the mean welfare loss is invariant across all but the most conservative levels of travel time (column 9). When considered at the individual level, welfare losses estimated with the site efficiency approach are larger than those resulting with the deterministic approach–and in some cases strikingly so. However, considering the standard deviations of the estimates (columns 5 and 10), the findings indicate there is greater heterogeneity within the sample when evaluated with choice sets generated from the site efficiency approach. 23 Conclusions Debate remains ongoing about properly constructing individual choice sets for discrete choice analysis of revealed preference data. To accommodate universal choice sets containing hundreds or thousands of sites while maintaining the hypothesis that trips require costly inputs to produce utility generating outputs, this study developed an efficiency approach to the choice set problem in the context of destination choice that complements the travel cost frontier model developed by Smith, Palmquist, and Jakus and the minimum utility threshold model of Klein and Bither. The site efficiency approach has several advantages over deterministic criteria and probabilistic models of choice set formation. First, because the approach is consistent with individual optimizing behavior it overcomes the ‘dominance’ problems associated with excluding sites according to simple deterministic rules. Second, in contrast to probabilistic models of choice set formation, empirical implementation of site efficiency models requires no additional data beyond that required for a defensible revealed preference model of destination choice, and estimation remains feasible with large universal choice sets. On a final note, as the production frontier model employed in this study has an economic dual–the efficient cost frontier–our approach may readily be extended for cost and pricing analyses. As the choice set issue remains open in the literature on residential housing choice (e.g., Banzhaf and Smith) and given that the number and types of homes for sale in a given market at any particular time may be quite large, hedonic cost frontier models may be an invaluable tool for choice set generation in future analyses of housing demand. 24 References Aigner, D.J., C.A.K. Lovell, and P. Schmidt. “Formulation and Estimation of Stochastic Frontier Production Function Models.” Journal of Econometrics 6(1977):21-37. Black, W. C. “Choice-set Definition in Patronage Modeling.” Journal of Retailing 60(1984):63-85. Banzhaf, H.S., and V.K. Smith. “Meta Analysis in Model Implementation: Choice Sets and the Valuation of Air Quality Improvements.” Discussion Paper 03-61, Resources for the Future, 2003. Cameron, A.C., and P.K. Trivedi. Regression Analysis of Count Data. Cambridge: Cambridge University Press, 1998. Farrell, M.J. “The Measurement of Productive Efficiency.” Journal of the Royal Statistical Society Series A 120(1957):253-281. Haab, T.C., and R. Hicks. “Accounting for Choice Set Endogeneity in Random Utility Models for Recreational Demand.” Journal of Environmental Economics and Management 34(1997):127-147. Haab, T.C., and R.L Hicks. “Choice Set Considerations in Models of Recreation Demand: History and Current State of the Art.” Marine Resource Economics 14(2000):271-281. Haab, T.C., and K.E. McConnell. Valuing Environmental and Natural Resources. Cheltenham, UK: Edward Elgar Publishing Limited, 2002. Hicks, R.L., and I.E. Strand. “The Extent of Information: Its Relevance for Random Utility Models.” Land Economics 76(2000):374-385. Horowitz, J. L. “Modeling the Choice of Choice Set in Discrete-choice Random-utility Models.” Environment and Planning 23(1991):1237-1246. 25 Jensen, U. “Is it Efficient to Analyse Efficiency Rankings?” Empirical Economics 25(2000):189208. Klein, N. M., and S. W. Bither. “An Investigation of Utility-Directed Cutoff Selection.” Journal of Consumer Research 14(1987):240-256. Kumbhakar, S.C., and C.A.K. Lovell. Stochastic Frontier Analysis. Cambridge: Cambridge University Press, 2000. Manski, C. F. “ The Structure of Random Utility Models.” Theory and Decision 8(1977) 229-254. McConnell, K.E., I.E. Strand, and L. Blake-Hedges. “Random Utility Models of Recreational Fishing: Catching Fish Using a Poisson Process.” Marine Resource Economics 10(1995):247-261. McFadden, D. “Modelling the Choice of Residential Location.” in A. Karlqvist, L. Lundquist, F. Snickars, and J Weibull (Eds.), Spatial Interaction Theory and Planning Models. Amsterdam: North-Holland, 1978, pp. 75-96. Parsons, G., and B. Hauber. “Spatial Boundaries and Choice Set Definition in a Random Utility Model of Recreation Demand.” Land Economics 74(1998):32-48. Parsons, G. and M. J. Kealy. “Randomly Drawn Opportunity Sets in a Random Utility Model of Lake Recreation.” Land Economics 68(1992):93-106. Parsons, G.R., A.J. Plantinga, K.J. Boyle. “Narrow Choice Sets in a Random Utility Model of Recreation Demand.” Land Economics 76(2000):86-99. Paterson, R., D. Scrogin, K. Boyle, D. McNeish. “Maine Open Water Fishing Survey: Summer 1999.” Maine Agricultural and Forest Experiment Station Paper No. 493, University of Maine, 2001. 26 Rabe-Hasketh, S., A. Skrondal, and A. Pickles. “Estimation of Generalized Linear Models.” The Stata Journal 2(2002):1-21. Salant, P., and D. Dillman. How to Conduct Your Own Survey. New York: John Wiley and Sons, Inc., 1994. Scrogin, D., K. Boyle, G. Parsons, A.J. Plantinga. “Effects of Regulations on Expected Catch, Expected Harvest, and Site Choice of Recreational Anglers.” American Journal of Agricultural Economics 86(2004):963-974. Smith, V.K., R.B. Palmquist, and P. Jakus. “Combining Farrell Frontier and Hedonic Travel Cost Models for Valuing Estuarine Quality.” Review of Economics and Statistics 73(1991):694699. Swait, J. “Choice Set Generation within the Generalized Extreme Value Family of Discrete Choice Models.” Transportation Research B 35( 2001):643-666. Swait, J., and M. Ben-Akiva. “Empirical Test of a Constrained Choice Discrete Model: Mode Choice in Sao Paulo, Brazil.” Transportation Research B 21(1987):103-15. Thill, J. C. “Choice Set Formation for Destination Choice Modelling.” Progress in Human Geography 16(1992):361-382. Train, K. “Mixed Logit Models for Recreation Demand.” in C. Kling, and J. Herriges (Eds.), Valuing the Environment using Recreation Demand Models. New York: Elgar, 1997, pp. 140-163. Whitehead, J.C., and T.C. Haab. “Southeast Marine Recreational Fishing Statistical Survey: Distance and Catch Based Choice Sets.” Marine Resource Economics 14(1999):283-298. 27 Footnotes 1. The access price pj,i is usually defined by explicit and, perhaps, implicit travel costs (e.g., fuel expenditures and the opportunity cost of time, respectively). The former are a function of the round-trip distance between the trip departure point and the destination; the latter are usually expressed as a function of travel time and income. 2. Others have defined individual choice sets by the collection of sites visited by all individuals. When the size of this set proved overwhelming (i.e., in terms of memory requirements or computational burden), McFadden’s random draws approach was adopted to reduce the choice set dimension without jeopardizing the consistency of the estimated RUM coefficients (see e.g., Parsons and Kealy). Alternatively, the choice set dimension may be reduced through some aggregation scheme, whereby multiple sites are combined through a weighting of their attributes to form a single, common site (Parsons, Plantinga, and Boyle). 3. Deterministic rules may be considered directly from (1). With travel time and distance rules, sites are excluded if they are deemed excessively far from the trip origin. Since pj,i is a function of round-trip distance and the opportunity cost of travel time, a rule defined solely by trip inputs is equivalent to a rule defined by travel costs. Hence, if a site is relatively costly to access, then all else constant, the utility derived from visitation will be sufficiently low for the site to be excluded from estimation. Similarly, with rules defined by site quality, and assuming utility increases as quality increases, sites will be dropped if, all else constant, their quality–and thus the utility derived from visitation–is sufficiently low. 4. Thill (p. 362) emphasizes the potential inconsistencies of deterministic approaches with utility maximization more emphatically by stating, “The analyst takes a major risk of 28 mis-specifying the choice set, primarily because its specification derives only remotely from the behavioural processes of choice-set formation.” 5. Although quality is represented here by a single, continuous variable, it may be considered a multi-dimensional vector of site attributes. 6. The relative efficiency score could instead be written as the reciprocal of (8). However, the statistic diverges to infinity as the mean efficiency score approaches zero (e.g., the universal choice set). 7. Models were also estimated with u distributed as half-normal and exponential and the deterministic relation specified as semi-log, linear, and quadratic. Results are reported in Appendix C. 8. Random parameters (or mixed) logit models (see e.g., Train) were estimated by Gaussian-Hermite and adaptive quadrature methods in Stata version 8.2 (Rabe-Hasketh, Skrondal, and Pickles). While the adaptive method is computationally more efficient, run time with the smallest data set used in this study (see Appendix D) and four random parameters exceeded one hundred hours. Nested logit models were also considered, yet these require the analyst to place a common and potentially arbitrary structure upon the individual decision process. Although the mixed and nested models relax the IIA assumption of conditional logit, their empirical implementation is often problematic. However, the general conclusions of this study are expected to be upheld by the alternative models. 9. By focusing on single-day trips, additional qualitative trip outputs associated with multiple-day trips may be eschewed. Details on the survey instruments, sample, and responses are contained in Paterson, Scrogin, Boyle, and McNeish. 29 10. Because an angler’s catch of individual species of fish–and, hence, aggregate catch–is unknown a priori, the catch variables appear as expected values (see e.g., McConnell, Strand, and Blake-Hedges). The expected catch models were specified as zeroinflated negative binomial as it outperformed Poisson, negative binomial, and zero-inflated Poisson models in all cases. Variable definitions, summary statistics, and estimation results are reported in Appendices A and B. 11. The proceeding analysis was also performed by restricting the sample to the subset of individuals for whom the estimated travel time coefficients were positive. Overall, the estimated RUM coefficients were not sensitive to the sample. Estimation results are available in an appendix upon request. 30 Table 1. Variable Definitions and Summary Statistics Variable Definition Mean Standard Deviation Ln(Quality) Natural log of expected catch per day -1.02 1.04 Ln(Time) Natural log of round-trip travel time 0.76 0.70 Salmon Expected catch of salmon per day 0.15 0.25 Brook Trout Expected catch of brook trout per day 0.30 0.49 Lake Trout Expected catch of lake trout per day 0.06 0.12 Brown Trout Expected catch of brown trout per day 0.07 0.12 Travel Cost 0.325 x (Round trip miles) + time cost 51.99 33.40 N = 2,683,548 Stochastic Frontier Models Random Utility Models Notes: The total number of observations (N) is equal to the number of sites (483) multiplied by the sample size (5,556). The aggregate and species-specific expected catch rates are obtained from estimated zero-inflated negative binomial models relating reported catch to site and angler characteristics. Variable definitions, summary statistics, and estimation results are reported in Appendices A and B. The time cost component of Travel Cost is defined as one-third of the hourly wage; hourly values of the wage are derived from respondent annual income and a 2,000 hour work year (40 hours per-week x 50 weeks per-year). 31 Table 2. Estimated Site Efficiency Frontier Models Averaged Over Trips Number of Trips: 5,556 Number of Sites: 483 Truncated Normal Site Efficiency Frontier Models Average Coefficient Estimates Ln(Time) 0.146* (0.040) Intercept -0.904* (0.208) Summary of Ln(Time) Coefficients % Positive and Significant 76.6 % Negative and Significant 2.1 % Insignificant 21.3 Summary of Site Efficiency Estimates Mean Standard Deviation Estimated Site Efficiency: 0.87 0.25 0.06 0.99 Estimated One-Sided Error: 0.22 0.47 1.02e-07 2.84 Min Max Notes: The dependent variable is the natural logarithm of expected daily catch. The number of estimated models is equal to the number of trips (5,556), and the number of observations per model is equal to the number of sites contained in the universal choice set (483). The reported coefficient estimates are averages of the individual coefficient estimates. Standard errors are reported in parentheses. * denotes significance at the 0.01 level. The reported percentages are based upon two-tailed t-tests of the null Ho: = 0 conducted at the 0.05 level. Approximately seventy percent of the intercepts were significant. 32 Table 3. Choice Set Characteristics: Efficiency vs. Deterministic Criteria Site Efficiency Choice Sets Mean Standard Sites Deviation Min Max Travel-Time Dependent Choice Sets Mean Efficiency Scores (7) -3.0 483 0 483 483 -2.5 482.81 1.09 476 483 0.001 -2.0 481.96 2.24 475 483 -1.5 470.46 11.91 435 -1.0 386.54 15.88 -0.5 297.39 0.0 Travel Time N -0.00001 2,683,548 Mean Standard Sites Deviation Min Max Mean Efficiency Scores (7) N 5.0 483 0 483 483 -0.00001 2,683,548 2,682,481 4.5 482.82 0.68 475 483 -0.0001 2,682,543 0.005 2,677,775 4.0 479.38 14.05 331 483 -0.002 2,663,410 483 0.05 2,613,897 3.5 472.37 30.08 214 483 -0.01 2,624,501 336 448 0.32 2,147,590 3.0 460.50 47.75 123 483 -0.01 2,558,536 20.25 217 337 0.64 1,652,302 2.5 438.66 69.14 55 483 -0.01 2,437,168 228.92 16.99 175 274 0.91 1,271,870 2.0 398.99 87.65 31 470 -0.02 2,216,797 0.5 167.10 14.66 127 209 1.15 928,380 1.5 321.67 92.35 15 445 -0.02 1,787,184 1.0 94.71 11.23 72 136 1.45 526,191 1.0 209.21 78.38 7 317 -0.02 1,162,368 1.5 37.72 9.42 16 74 1.80 209,584 0.5 69.07 34.67 3 146 0.07 383,732 2.0 13.52 5.97 2 33 2.03 75,131 0.3 35.86 20.49 2 85 0.13 199,241 33 Table 4. Relative Efficiency of Individual Choices: Efficiency vs. Deterministic Criteria Site Efficiency Choice Sets Efficiency Ranks Travel-Time Dependent Choice Sets Mean Min Max Relative Site Efficiency Ranks (9) Relative Efficiency Scores (8) Travel Time Efficiency Ranks Mean Min -3.0 156.42 1 483 0.32 -0.00002 5.0 156.42 1 483 0.32 -0.00002 -2.5 156.42 1 483 0.32 0.001 4.5 156.36 1 483 0.32 -0.0002 -2.0 156.42 1 483 0.32 0.01 4.0 154.89 1 483 0.32 -0.003 -1.5 156.36 1 478 0.33 0.07 3.5 151.95 1 483 0.32 -0.01 -1.0 154.45 1 434 0.40 0.50 3.0 147.47 1 475 0.32 -0.01 -0.5 144.83 1 331 0.49 1.01 2.5 140.34 1 466 0.32 -0.02 0.0 129.43 1 260 0.57 1.44 2.0 127.93 1 463 0.32 -0.02 0.5 106.48 1 205 0.64 1.82 1.5 103.02 1 403 0.32 -0.04 1.0 71.02 1 128 0.75 2.30 1.0 67.33 1 298 0.32 -0.04 1.5 33.27 1 72 0.88 2.85 0.5 24.01 1 123 0.35 0.11 2.0 13.14 1 33 0.97 3.22 0.3 13.21 1 67 0.36 0.21 34 Max Relative Site Efficiency Ranks (9) Relative Efficiency Scores (8) Table 5. Welfare Losses for Site Closures: Efficiency vs. Deterministic Criteria Site Efficiency Choice Sets Travel-Time Dependent Choice Sets Welfare Losses Mean Sites Closed Percent of Anglers Affected Mean per Trip Standard Deviation -3.0 13 100 0.32 -2.5 13 100 -2.0 13 -1.5 Welfare Losses Travel Time Mean Sites Closed Percent of Anglers Affected Mean per Trip Standard Deviation 0.51 5.0 13 100 0.32 0.51 0.32 0.51 4.5 13 100 0.32 0.51 100 0.32 0.51 4.0 13 100 0.32 0.51 12.98 100 0.32 0.51 3.5 12.91 100 0.32 0.51 -1.0 12.28 99.81 0.36 0.56 3.0 12.77 99.07 0.32 0.51 -0.5 7.90 93.99 0.40 0.65 2.5 12.61 98.06 0.32 0.51 0.0 3.26 60.22 0.37 0.69 2.0 12.11 96.21 0.32 0.51 0.5 1.30 28.31 0.31 0.73 1.5 11.79 94.26 0.32 0.51 1.0 0.56 14.62 0.69 3.32 1.0 9.85 86.03 0.34 0.52 1.5 0.28 11.19 6.52 41.18 0.5 3.76 39.96 1.21 1.85 2.0 0.09 6.85 32.19 182.64 0.3 2.16 30.62 -0.73 1.28 35 Efficiency Choice Sets Horizontal Axis: Calibrating Parameter ( ) Vertical Axis: Travel Cost Coefficient Travel-Time Dependent Choice Sets Horizontal Axis: One-Way Travel Time Vertical Axis: Travel Cost Coefficient Figure 1. Sensitivity of Estimated Travel Cost Coefficients: Efficiency vs. Deterministic Criteria 36 Efficiency Choice Sets Horizontal Axis: Calibrating Parameter ( ) Vertical Axis: Expected Catch Coefficients Travel-Time Dependent Choice Sets Horizontal Axis: One-Way Travel Time Vertical Axis: Expected Catch Coefficients Figure 2. Sensitivity of Estimated Expected Catch Coefficients: Efficiency vs. Deterministic Criteria 37 Appendix A. Variable Definitions and Summary Statistics Standard Variable Name Definition Mean Deviation I = 1,081 Angler Characteristics Age Angler age 40.58 12.91 Male 1 if angler is male; 0 if angler is female 0.88 0.32 Ln(Days) Natural logarithm of single day trips 1.09 0.95 Targeted 1 if angler targeted the species; 0 otherwise: Salmon 0.36 0.48 Brook Trout 0.37 0.48 Lake Trout 0.26 0.44 Brown Trout 0.23 0.42 J = 483 Fishery Characteristics Species Not Present 1 if species not known to be present by managing agency; 0 otherwise: Salmon 0.57 0.50 Brook Trout 0.25 0.44 Lake Trout 0.74 0.44 Brown Trout 0.70 0.46 Species Not Abundant 1 if species not known to be abundant by managing agency; 0 otherwise: Salmon 0.11 0.31 Brook Trout 0.32 0.47 Lake Trout 0.05 0.22 Brown Trout 0.10 0.30 38 Appendix A continued Stocked Species 1 if lake or pond stocked with species; 0 otherwise: Salmon 0.26 0.44 Brook Trout 0.32 0.47 Lake Trout 0.08 0.28 Brown Trout 0.20 0.40 Surface area of lake or pond in acres scaled by 1,000 2.21 6.32 Depth Depth of lake or pond in feet, scaled by 10 2.18 1.66 Elevation Elevation of lake or pond above sea level scaled by 100 5.45 4.52 Acres Water Type 1 1 if oligotrophic water (low productivity; deep secchi disk readings); 0 otherwise 0.28 0.45 Water Type 2 1 if eutrophic water (high productivity; shallow secchi disk readings); 0 otherwise 0.34 0.47 No Live Bait 1 if no-live-bait regulation at lake or pond; 0 otherwise 0.13 0.34 Catch and Release 1 if catch-and-release regulation at lake or pond; 0 otherwise 0.01 0.12 39 Appendix B. Estimated Zero-Inflated Negative Binomial Models of Expected Catch Expected Catch Salmon Brook Trout Lake Trout Brown Trout Log(Days) 0.80* (0.07) 0.63* (0.07) 0.79* (0.09) 0.81* (0.09) Targeted 1.92* (0.16) 2.31* (0.16) 2.31* (0.21) 2.23* (0.20) Age 0.02* (0.00) -0.02* (0.01) 0.00 (0.01) -0.01 (0.01) Male -0.38 (0.23) 0.35 (0.23) -0.54*** (0.32) 0.63*** (0.33) Acres 0.00 (0.00) -0.01** (0.00) 0.02** (0.01) 0.00 (0.01) Depth -0.04 (0.07) 0.00 (0.03) -0.02 (0.08) -0.14 (0.11) Elevation 0.04* (0.02) 0.05* (0.02) 0.00 (0.03) 0.05 (0.03) Water Type 1 0.34 (0.17) -0.53* (0.17) 0.59*** (0.32) 0.13 (0.35) Water Type 2 -0.21 (0.22) -0.16 (0.17) 0.17 (0.32) 0.15 (0.19) No Live Bait -0.39 (0.30) -0.28 (0.19) 0.04 (0.44) -1.12*** (0.60) Species Not Present -0.20 (0.37) -0.05 (0.35) -0.47 (0.40) -0.27 (0.52) Species Not Abundant -0.57* (0.31) -0.99* (0.21) 0.18 (0.55) 0.02 (0.55) Stocked Species -0.36* (0.20) 0.12 (0.15) -0.19 (0.24) 0.40 (0.46) Catch and Release 0.04 (0.46) -1.12** (0.48) -0.15 (0.52) -0.37 (0.76) Constant -2.37 (0.44) -1.20* (0.38) -2.60* (0.59) -2.85* (0.63) Modified Negative Binomial 40 Appendix B Continued Logit of Positive Catch Species Not Present 4.51* (0.80) 3.12* (0.54) 2.07* (0.46) 1.73* (0.56) Species Not Abundant 2.28* (0.75) 1.64* (0.48) 1.80** (0.74) 2.06* (0.69) Age -0.03* (0.01) -0.05* (0.02) -0.02 (0.01) -0.03*** (0.02) Male -1.97* (0.75) -0.35 (0.46) -0.64 (0.49) -0.17 (0.61) 0.36* (0.12) 0.69* (0.13) 0.47* (0.22) 0.52*** (0.27) Mean 1.48 1.80 0.61 0.61 Standard Deviation 3.09 3.20 1.48 1.51 Minimum 0.00 0.01 0.01 0.00 Maximum 34.89 27.74 19.51 19.58 Sample Size 1,310 1,305 1,227 1,213 -1,366.74 -2,022.56 -1,185.18 -1,141.60 Dispersion Parameter Log( ) Prediction Statistics Log-Likelihood Notes: The dependent variable is the reported catch per visited site during the open-water fishing season. *, **, and *** denote significance at the 0.01, 0.05, and 0.10 level, respectively. 41 Appendix C. Comparison of Estimated Site Efficiency Frontier Models Truncated Normal Site Efficiency Number of Trips: 5,556 Number of Sites: 483 DoubleLog LinearLog 0.147* (0.041) Travel Time Travel Time Squared Half-Normal Site Efficiency DoubleLog LinearLog --- 0.146* (0.041) 0.043* (0.012) --- --- --- -1.103* (0.217) 0.547* (0.059) % Positive and Significant 76.6 % Insignificant % Negative and Significant Exponential Site Efficiency DoubleLog LinearLog --- 0.147* (0.041) 0.043* (0.012) --- --- --- -1.062* (0.285) 0.540* (0.117) 78.5 76.6 18.3 20.9 1.4 0.6 Linear Quadratic Linear Quadratic 0.086* (0.027) --- --- --- --- Linear Quadratic 0.086* (0.027) --- 0.086* (0.027) --- --- --- --- --- --- 0.043* (0.012) --- 0.006* (0.002) --- 0.006* (0.002) --- --- --- 0.006* (0.002) 0.500* (0.063) 0.552* (0.113) 0.493* (0.123) 0.552* (0.113) -1.103* (0.217) 0.547* (0.059) 0.500* (0.063) 0.552* (0.113) 75.5 80.3 75.5 80.3 78.5 76.6 75.5 80.3 78.5 21.3 21.5 21.3 21.5 18.3 20.9 21.3 21.5 18.3 20.9 2.1 3.0 2.1 3.0 1.4 0.6 2.1 3.0 1.4 0.6 Coefficient Estimates Log(Travel Time) Intercept Coefficient Summary† Notes: The dependent variable in the linear, quadratic, and semi-log models is defined as the sum of the salmon, brook trout, lake catch, and brown trout catch expectations generated from zero-inflated negative binomial (ZINB) models (see Appendix A). The dependent variable in the double-log models is the natural logarithm of the sum of the catch expectations. Coefficient estimates and standard errors (in parentheses) are obtained by averaging the coefficients over the 5,556 estimated stochastic frontier models. * denotes significance at the 0.01 level. † Significance based upon two-tailed t-tests ( = 0.05) of the null hypothesis Ho : = 0. While not reported here, approximately 96 percent of the intercepts were significant ( = 0.05) with the levels and semi-log specifications; about 70 percent were significant with the double-log specifications. 42 Appendix D. Conditional Logit Estimation Results: Efficiency Choice Sets Pseudo R2 N -3.0 -0.153* 2.432* 0.616* 3.147* 2.838* 0.326 2,683,548 -2.5 -0.153* 2.432* 0.614* 3.157* 2.829* 0.326 2,682,481 -2.0 -0.153* 2.431* 0.607* 3.146* 2.810* 0.326 2,677,775 -1.5 -0.153* 2.425* 0.603* 3.143* 2.768* 0.323 2,613,897 -1.0 -0.148* 2.277* 0.529* 3.047* 2.385* 0.311 2,147,590 -0.5 -0.142* 1.784* 0.104** 2.704* 1.168* 0.310 1,652,302 0.0 -0.136* 0.860* -0.676* 2.178* -0.182 0.324 1,271,870 0.5 -0.129* -0.466* -2.379* 1.184* -0.792* 0.360 928,380 1.0 -0.109* -4.353* -5.192* 0.014 -3.746* 0.432 526,191 1.5 -0.063* -13.663* -15.809* -5.639* -12.754* 0.643 209,584 2.0 -0.031* -21.760* -22.457* -5.504* -20.147* 0.857 75,131 Notes: The choice sets were generated with the stochastic frontier model with the site efficiency component assumed to be distributed as truncated normal and the random error assumed to be normally distributed. * and ** denote significance at the 0.01 and 0.05 level, respectively. 43 Appendix E. Conditional Logit Estimation Results: Deterministic Choice Sets Travel Time Pseudo R2 N 5.0 -0.153* 2.432* 0.616* 3.147* 2.838* 0.326 2,683,548 4.5 -0.153* 2.432* 0.616* 3.147* 2.838* 0.326 2,682,543 4.0 -0.153 2.432* 0.616* 3.147* 2.838* 0.325 2,663,410 3.5 -0.153* 2.432* 0.616* 3.147* 2.838* 0.323 2,624,501 3.0 -0.153* 2.432* 0.616* 3.147* 2.838* 0.319 2,558,536 2.5 -0.153* 2.432* 0.616* 3.148* 2.838* 0.312 2,437,168 2.0 -0.153* 2.432* 0.616* 3.148* 2.839* 0.298 2,216,797 1.5 -0.152* 2.433* 0.617* 3.167* 2.841* 0.267 1,787,184 1.0 -0.144* 2.444* 0.623* 3.309* 2.848* 0.199 1,162,368 0.5 -0.036* 2.099* 0.638* 5.008* 2.992* 0.077 383,732 0.3 0.062* 2.674* 0.756* 4.655* 3.280* 0.114 199,241 Notes: * denotes significance at the 0.01 level. 44
© Copyright 2026 Paperzz