Measuring Matching Using Reported Revenues and Expenses

November, 2016

Sudipta Basu*
William M. Cready**
Wonsun Paek***

Abstract

Dichev and Tang (2008) provide initial evidence on the degree to which expenses are appropriately matched with revenues in the population of larger Compustat covered firms. They approach the issue from the perspective that expenses are advanced to produce revenues. Conceptually, however, matching is defined as the identification of expenses attributable to recognized revenues. That is, expense recognition depends on revenue recognition, rather than revenues being derived from expenses. In this analysis, we revisit the core questions considered in Dichev and Tang regarding matching efficacy over time as well as matching’s relations with earnings variability and persistence, employing measures consistent with this conceptual notion that expense is matched to revenue. Our alternative measures portray a somewhat different picture than that conveyed by the primary measure employed by Dichev and Tang. Our analysis indicates that a sizable decline in matching has taken place since the turn of the century. The decline in this period for their measure is comparatively modest. We also find only limited evidence of a decline in matching prior to the turn of the century, while their metric suggests that matching declined substantially between 1980 and 1994. We also find that, in the cross-section, matching is negatively rather than positively related to earnings persistence. Finally, our analysis indicates that prior to the early 1980s, expenses are, on average, commonly over-matched to revenues.

Keywords: Matching principle, Expense Recognition, Earnings Variability, Persistence.

JEL Classification: G10, M4

* Professor of Accounting and Johnson Senior Research Fellow, Fox School of Business, Temple University. [email protected]
** Adolf Enthoven Professor of Accounting, Jindal School of Management, University of Texas at Dallas.
[email protected]
*** Professor of Accounting, Sungkyunkwan University. [email protected]

1. Introduction

Matching and revenue realization are central to the income statement perspective of financial reporting. The matching principle recommends that the costs associated with the generation of revenues should be recognized as expenses in the same period that the revenues are recognized. Consequently, better matching should result in more accurate measures of profitability and thereby improve earnings quality. However, a recent analysis by Dichev and Tang (2008) estimates that matching has declined substantially during 1964 to 2003 and that this decline is closely related to the general decline in accounting earnings quality over much of this same time period (Collins et al. 1997 and Francis and Schipper 1999).

We revisit these inferences by studying how to best measure matching using total expense and revenue data. In particular, we argue that matching measures should be consistent with the core principle that expenses are matched to recognized revenues, an approach that we show is substantively different from the Dichev and Tang (DT henceforth) perspective that matching reflects “entities that continually advance expenses hoping to reap revenues and earnings” (p. 1427).

We identify two specific measures of matching consistent with the conceptual perspective that expense recognition depends upon revenue recognition. The first measure is the percentage of contemporaneous expense that is determined by contemporaneous revenue (MEXP%). We estimate MEXP% by regressing expense on revenue, then using the estimated coefficient on revenue to determine the amount of revenue-based expense being recognized.
In the terminology of DT, this measure directly reflects the degree to which expenses are not “scattered” to other periods.¹ The second measure is the variation in expense that is explained by lead and lagged revenues incremental to the variation explained by contemporaneous revenues (MISM). This measure directly targets mismatching by evaluating the degree to which variables that should not explain expense under perfect matching (i.e., lead and lag revenue) do explain it.

Using these measures, we revisit the core issues considered in DT: (1) the degree to which matching has changed over time; and (2) the relation between matching and two core earnings properties, earnings variability and earnings persistence.

Our measures are consistent with the general conclusion in DT that matching is worse in recent years than it was 40 years ago. In particular, for the typical firm around 98% of expense is matched to revenue at the start of our sample period (1964-1973) while less than 91% of expense is matched to revenue at the end of the period (2004-2013). However, our analysis suggests that much of this decline is a very recent phenomenon. There is no reliable indication of a decline in MEXP% prior to 2000. MISM, on the other hand, does increase over most of the time period examined, with the sharpest rise occurring in the 1990s. In contrast, the primary measure employed in DT, the coefficient on contemporaneous expense in a regression of revenue on contemporaneous, lead, and lag expense, declines considerably between 1980 and 1994 but does not decline any further afterwards. Such a difference in timing underscores that it matters how one measures matching. It also raises questions about how closely one should connect changes in matching with the broad decline in earnings quality taking place over this same general time period since the existing evidence (e.g. Collins et al.
1997 and Francis and Schipper 1999) suggests that this decline began well before the turn of the century, which is before the earliest indication of the emergence of a broad decline in matching based on our MEXP% measure.

¹ Alternatively, if expense recognition is entirely unrelated to revenues, then the estimated coefficient on revenues should be 0, meaning that this percentage will also be 0, and this will hold irrespective of how little or how much variation is present in either expense or revenue.

Interestingly, we also find that median MEXP% values often exceed 100% early in our sample period. That is, the amount of expense recognized as a linear function of revenue exceeds the amount of expenses being recognized in total. Such values reflect over-matching of expenses to revenues. For example, if a firm recognizes bad debt expense as 3% of revenue when the correct level should be 2% of revenue, then the expense is being over-matched with revenue. Nominally, such over-recognition indicates higher matching. Substantively, however, such over-matching reflects poor rather than high quality matching. Consequently, our evidence indicates that over the past 50 years, the financial reporting environment has moved from a regime typified by the over-matching of expenses to revenues to one dominated by under-matching.

In the cross-section, we find that MEXP% is negatively related to earnings variability while MISM is positively related to earnings variability over our entire sample time period (1964-2013). There is no evidence that these relations change in any systematic fashion over this period. Hence, consistent with the analyses provided in DT, matching is inversely related with earnings variability. However, we do not find evidence that matching is positively related with earnings persistence. In fact, higher MEXP% is associated with lower earnings persistence and higher MISM with higher earnings persistence.
A possible explanation for this unexpected relation is that matching is more easily achieved with respect to transitory revenue events relative to permanent or persistent revenue events. That is, it seems likely that the costs associated with period-specific revenues are also more likely to be period-specific, and hence, easily matched to revenues in that same period.

2. Relevant Literature

While matching has long played a central role in conceptual frameworks for financial reporting, particularly with respect to how one determines the expense component of periodic income (e.g. Paton and Littleton 1940), prior to DT it had not been addressed empirically in a comprehensive manner. The DT analysis focuses on “poor matching,” which is the difference between recognized expense and (unobservable) perfectly matched expense. The paper argues that as the level of “poor matching” increases: (1) the correlation between revenues and expenses decreases; (2) the volatility of earnings increases; and (3) the persistence of earnings decreases.² While we consider these three relations in our analysis, the subsequent literature largely focuses on DT’s approaches to assess how “poor matching” introduces error into the correlation between revenues and expenses. And, while DT report analyses of the correlation between revenue and expense, their primary approach to assessing the impact of “poor matching” is based on annual cross-sectional regressions of revenues on lead, lag, and contemporaneous expense as follows:

Revenue_{i,t} = b0 + b1*Expense_{i,t-1} + b2*Expense_{i,t} + b3*Expense_{i,t+1} + e_{i,t}    (1)

where revenue is annual revenue for firm i in year t and expense is the annual expense in year t.
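As a rough illustration of how a regression of the form of equation (1) can be estimated, the following Python sketch fits one cross-section by ordinary least squares. The data are synthetic, the function name `dt_regression` is hypothetical, and NumPy is assumed to be available; this is a minimal sketch under those assumptions, not DT's actual estimation code.

```python
import numpy as np

def dt_regression(revenue, expense_lag, expense, expense_lead):
    """OLS of revenue on lagged, current, and lead expense, as in equation (1).

    Returns the coefficient vector (b0, b1, b2, b3).
    """
    X = np.column_stack([np.ones_like(expense), expense_lag, expense, expense_lead])
    coef, *_ = np.linalg.lstsq(X, revenue, rcond=None)
    return coef

# Hypothetical cross-section of 200 firms; all values are made up for illustration.
rng = np.random.default_rng(0)
n = 200
expense_lag = rng.uniform(0.5, 1.5, n)
expense = expense_lag + rng.normal(0, 0.1, n)
expense_lead = expense + rng.normal(0, 0.1, n)
# Revenue is constructed so that only current expense matters (true b2 = 1.05).
revenue = 0.02 + 1.05 * expense + rng.normal(0, 0.02, n)

b0, b1, b2, b3 = dt_regression(revenue, expense_lag, expense, expense_lead)
print(f"b1={b1:.3f}  b2={b2:.3f}  b3={b3:.3f}")
```

Because the synthetic revenue depends only on current expense, the estimated b2 should land close to 1.05 and the lag and lead coefficients close to zero.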
DT argue that the magnitude of b2 in (1) declines with “poor matching” and suggest that the coefficients on the lead and lag expense terms in (1) increase with “poor matching.” They document that the coefficient on current expenses has declined considerably over time while the coefficient on lagged expenses has increased. They interpret these coefficient shifts as indicating that a substantial decline in matching occurred between 1967 and 2003.

² In a largely descriptive exercise, Table 4 of DT presents over time changes in the levels of earnings, expense, and revenue volatilities, focusing on how earnings volatility has increased substantially while expense and revenue volatility has not. As covariance between expense and revenue is the reconciling metric between earnings variability and the variability of its components (expense and revenue), this analysis implicitly addresses covariance change. Covariance and covariance change are also the central drivers of the MEXP% based analyses presented in this paper. However, MEXP% explicitly captures covariance changes, making it much more amenable to rigorous empirical analysis, which is not possible with the comparative evaluations presented in DT. (An added challenge here is that the high level of the covariance between revenue and expenses means that comparatively small shifts in expense or revenue volatility that are not offset by a change in covariance have sizable relative impacts on earnings volatility. In particular, based on the Table 4 values reported in DT, a 10% increase in the variance of either revenue or expenses, which is roughly what happens between the first and last year of the DT analysis, if not offset by a covariance shift, would increase residual income variation by well over 100%.)

Donelson, Jennings, and McInnis (2011) employ the DT equation (1) approach to assess the underlying factors that contributed to the decline in matching documented in DT.
Decomposing the estimated coefficients, they show that the decline in matching documented in DT is largely attributable to a marked increase in the frequency and magnitudes of special items from 1967 to 2005. They argue that this increase in special items is driven by increasing economic uncertainty, which suggests that the DT decline in matching is mostly attributable to a rise in the sorts of economic conditions that produce costs that are difficult to match ex ante with revenues, rather than to any substantive shift in financial reporting standards.³ Relatedly, Srivastava (2014) finds that, because they are more intangible-intensive, newer firms exhibit lower levels of matching because most of their costs are expensed immediately, thereby reducing the overall level of matching of the Compustat population.

³ Basu (1997) argues that asset write-downs frequently arise from over-estimation of assets’ useful lives, which causes mismatching with revenues generated by the asset. As opposed to this conditional conservatism, unconditional conservatism such as immediate expensing of R&D and advertising effectively underestimates assets’ useful lives, so that some expenditures are not correctly matched to future revenues they help generate.

He and Shan (2016) use the DT measure to assess matching over time across 42 countries. They find that matching has declined worldwide from 1991 to 2010. They also find that matching varies across countries on many dimensions. However, they do not find any evidence of a connection between matching and IFRS adoptions. Bushman, Lerman and Zhang (2016) employ the adjusted R2 from (1) as a more direct measure of the random error component of expense recognition to measure matching. They do not find evidence of an incremental relation, above and beyond a shared trend, between matching and the association between cash flows and accruals, which they show declines markedly between 1964 and 2014.⁴

⁴ Dechow (1994) argues that revenue recognition and matching cause recognition of accrual revenues and expenses to adjust operating cash flows and induce a contemporaneous negative correlation between accruals and cash flows.

Instead of directly measuring expense as a function of revenue, Prakash and Sinha (2013) evaluate matching in the context of deferred revenue by examining profit margins. They find that profit margins are lower in periods when deferred revenues increase and higher in periods when they decrease. This pattern is consistent with the expenses on deferred revenue being recognized before the revenue is recognized. Like our MEXP% measure, they employ estimated linear equation parameters to evaluate matching efficacy. And, as is seen in the next section, profit margin poses a challenge for such parameter-based perspectives. In the case of Prakash and Sinha, the analysis implicitly assumes that profit margins based on correctly matched expenses are similar for deferred and non-deferred revenue items. In the case of MEXP%, we rescale the estimated parameter to remove the impact of profit margin variation.

3. Measuring Matching

3.1 Conceptual Matching

In devising our approaches to measuring matching, we begin with the conceptual notion that revenues are first recognized by period and then expenses are matched to these recognized revenues. At a linear level, this perspective suggests the following equation:

Expense_{i,t} = b0 + b1*Revenue_{i,t}    (2)

However, the above expression is overly simplistic since it assumes that b0 and b1 are constant across firms and over time. That is, it implies a financial reporting process that operates with universal constants b0 and b1. Financial reporting policy and application, however, are far more idiosyncratic.
That is, b0 and b1 vary across firms and vary within firms over time. Hence, a more accurate linear equation starting point is:

Expense_{i,t} = b0_{i,t} + b1_{i,t}*Revenue_{i,t}    (2a)

While it accurately captures matching as a product of the financial reporting process, (2a) is not useful empirically absent restrictions on b0 and b1. That is, for estimation purposes b0 and b1 must be restricted to be the same across firms or over time or across both firms and over time. Such restrictions, however, move the empirical design away from the conceptually rich idiosyncratic notion of expense recognition reflected in (2a) and toward the more sterile notion of expense recognition reflected in (2).

Importantly, however, (2a) is not entirely distinct from (2). Indeed, the two equations can be more directly linked by replacing b0_{i,t} and b1_{i,t} in (2a) with their averages (cross-sectional, time-series, or pooled) as follows:

Expense_{i,t} = b0 + b1*Revenue_{i,t} + e_{i,t}    (2b)

where bolding indicates that the item is a (weighted) average value of a parameter. (2b), unlike (2a) but like (2), is empirically estimable. But crucially it also introduces error. Rearranging (2b) to solve for this error gives:

e_{i,t} = b0_{i,t} - b0 + (b1_{i,t} - b1)*Revenue_{i,t}    (3)

That is, a portion of the residual error arising in estimations following the general form of (2b) is due to suppressed variation in idiosyncratic parameter values.⁵

⁵ The conceptually expected variation in coefficient parameter values here differentiates this setting from other similar conceptualizations of relations among financial performance variables such as those between accruals and cash flows found in Dechow and Dichev (2002). In their analysis, the independent variable parameter values should be the same both across firms and over time. Unlike the matching setting, there is no conceptual reason in their analysis for parameter values to vary over time or across firms.

Hence, the implications of the error
term magnitude in (2b) based estimations are inherently ambiguous. Empirically, the random “scattering” of expenses to periods in a manner totally unrelated with revenues (i.e., poor matching) will increase this error term. But, the amount of idiosyncratic variation in b0 and b1 also increases it. And such variation certainly does not reflect poor matching. Indeed, it reflects good matching in that it is a product of expense recognition that is accurately capturing idiosyncratic performance differences across firms and within firms over time. Consequently, the level of residual error from (2b) estimations as well as measures derived or impacted by it such as the regression R-square and the correlation coefficient between revenue and expense are highly problematic candidates for measuring matching and changes in matching.

3.2 Parameter Based Estimates of Matching

Alternatively, estimations of (2b) can provide unbiased measures of the average values of b0 and b1. And, these two parameters do say something about matching. Specifically, the level of b0 reflects the amount of expense that is impacting income in the period that is unconnected with current revenue. When b0 is high the level of unmatched expense is high. When it is low the level of unmatched expense is low. Similarly, b1 reflects the level of expenses that are being matched to revenues as a percentage of revenues. So, when b1 is high, more expense is being matched to each dollar of revenue. In our analysis we focus on b1 as a reasonable starting point for measuring matching.

The preceding interpretations of b0 and, in particular, b1 do, however, depend on one additional factor: profit margin. In general, higher profit margins reflect lower expense. And, ceteris paribus, if expenses are lower, then b1 and b0, which are both measures of matching, will also be lower. Hence, profit margin represents a crucial confounding factor when employing either of these measures to evaluate matching.
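The profit-margin confound can be illustrated with two hypothetical firms whose expenses are exactly proportional to revenue, so that matching is perfect by construction, but whose margins differ. The figures below are made up, and the rescaling by average revenue over average expense is only a sketch of the percentage-of-expense idea described in the Introduction, assuming NumPy is available.

```python
import numpy as np

# Hypothetical asset-scaled revenue for five years (made-up numbers).
revenue = np.array([0.8, 0.9, 1.0, 1.1, 1.2])

# Two firms whose expenses are perfectly matched to revenue by construction,
# but with different profit margins (10% versus 40%).
expense_low_margin = 0.9 * revenue
expense_high_margin = 0.6 * revenue

def slope_and_share(rev, exp):
    """Return b1 from a regression of expense on revenue, and b1 rescaled
    by average revenue / average expense (share of expense tied to revenue)."""
    b1, b0 = np.polyfit(rev, exp, 1)   # polyfit returns [slope, intercept]
    return b1, b1 * rev.mean() / exp.mean()

b1_low, share_low = slope_and_share(revenue, expense_low_margin)
b1_high, share_high = slope_and_share(revenue, expense_high_margin)
print(f"low margin:  b1={b1_low:.2f}, share of expense={share_low:.2f}")
print(f"high margin: b1={b1_high:.2f}, share of expense={share_high:.2f}")
```

Both firms are perfectly matched, yet the raw slope differs with the margin (about 0.9 versus 0.6), while the slope expressed as a share of expense is about 1 for both.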
We address this problem by expressing b1 as a percentage of expense rather than using its native percentage of revenue scale. That is, we measure the percentage of expense recognized as a linear function of contemporaneous revenue for a firm as:

MEXP% = b1*(Average Revenue/Average Expense)

Our empirical analysis employs firm-specific time-series regressions estimated over ten-year windows. So, for our purposes average revenue and average expense are determined over the same ten-year windows used to produce the estimated value of b1.

While MEXP% is a highly intuitive and straightforward measure, it is not without drawbacks. In particular, while it declines with under-recognition of correctly matched expense (or, in the terminology of DT, the scattering of expenses to other periods), it increases with over-recognition of expense as a function of revenue. Such over-matching can happen when a firm inflates the correctly matched expense in the interests of being conservative. For instance, suppose a firm routinely determines bad debt expenses as 3% of sales when such expenses are only expected to be 2% of sales. The 3% number is conservative because it is an over-accrual of expense. But it also (artificially) increases the amount of expense recognized as a linear function of revenue.⁶ Consequently, this measure is better interpreted as reflecting the degree to which correctly matched expenses are not being scattered to other time periods, offset by the degree to which expenses are being over-matched to revenues.

⁶ Importantly, over-recognition of expense in a period does not in and of itself lead to over-matching. The over-recognition must be tied to the magnitude of current period revenues as well.

3.3 Measuring Mismatching

As an alternative to focusing on how much expense is being matched to revenues, our second measure addresses the level of mismatching of expenses that is occurring. That is, instead
of focusing on how well matching works, one can instead identify specific situations where matching is unambiguously not working. The lead and lag expense terms in equation (1), as developed by DT, reflect this perspective. They focus on the lead and, in particular, the lag expense coefficients as measures of mismatching. In fact, DT identify the lag coefficient as reflecting accounting conservatism in that it connects lagged expense with current revenues.⁷ More generally, such coefficients reflect expenses from other periods that are associated with current period revenues. Hence, they are expenses that are arguably being recognized in the wrong time period.

There are, however, two significant drawbacks to the DT approach. First, at a conceptual level matching dictates that expense, not revenue, is the appropriate dependent variable. Expenses are determined based on (depend upon) recognized revenues in a period. Second, as we will elaborate on below, it is actually far from clear that coefficient magnitudes here are the best route for assessing mismatching in this context.

We introduce mismatching estimation into (2b) by supplementing the current revenue variable with one period lead and lag revenue variables as follows:

Expense_{i,t} = h0 + h1*Revenue_{i,t-1} + h2*Revenue_{i,t} + h3*Revenue_{i,t+1} + e_{i,t}    (4)

where bolding again indicates that the estimated coefficient is an average value. In this form, h1 and h3 represent how (on average) revenues in adjacent periods uniquely explain expenses in period t incremental to the explanatory power of period t revenues. What is far less clear, however, is how to appropriately interpret the estimated coefficient values in (4) in terms of what they are saying about matching efficacy or lack thereof. Consider, as a general example, h1.
It reflects the change in period t expense associated with a marginal (or unexpected given revenues in t and t+1) unit change in period t-1 revenues holding period t and t+1 revenues constant. Consequently, the overall level of mismatched expense depends as much on the magnitude of this incremental change in revenue as it does on the magnitude of h1. That is, even if h1 is large its impact may be quite small if the associated incremental revenue changes are expected to be comparatively tiny.

⁷ Lee (2012) explores some of the estimation properties of the DT lagged revenue estimator. He also suggests that a more natural measure is the coefficient on lagged revenues in a regression of expense on lag, contemporaneous, and lead revenue. One view of the measures we present here is that they build on Lee’s perspectives of measuring matching and conservatism.

A related concern stems from the possibility that expenses are over-matched to revenue as discussed in the prior section. Such over-matching necessarily leads to reversals/offsets in other periods. So, if such a reversal occurs in the next period then h1 is decreased (and h2 is increased as is seen in the prior section). Or, if expenses are being deferred in t based on (correctly anticipated) period t+1 revenues (essentially a reversal in advance) then h3 is decreased. These sorts of over-matching activities lead to lower rather than higher h1 and h3 values.⁸

Given the difficulties in clearly interpreting the implications of the coefficients in (4) as measures of overall mismatching magnitude, we focus instead on the total amount of expense that is explained by the introduction of the adjacent period revenue terms.
That is, we examine the reduction in residual error obtained by supplementing (2b) with the lead and lag revenue variables as follows:

MISM = MSE(2b) - MSE(4)

where MSE(2b) is the mean squared error from estimating model (2b) on a given set of data and MSE(4) is the mean squared error from estimating model (4) over this same set of data.

A particular advantage of MISM is that it reflects mismatching attributable to scattering of expenses as well as over-recognition of expenses. In particular, while scattering manifests itself in the form of positive adjacent period revenue coefficients and over-recognition manifests itself in the form of negative adjacent period revenue coefficients, the impact on MSE(4) is not dependent on coefficient signs. MISM’s limitation, however, is that it only addresses mismatching effects that arise in or are reconciled in immediately adjacent time periods.

⁸ It is even possible that (short term) over-matching is the dominant effect, in which case h1 will be negative. And, in fact, we do observe negative h1 values in some years in our empirical analyses. (The negative values for the lead expense coefficient reported in some years by DT are also broadly consistent with this effect, although the precise interpretation of their coefficients is clouded by its disconnection from the idea that under conceptual matching expenses are appropriately modeled as a function of revenues.)

3.4 Dichev and Tang Reverse Regression Measure

At this point it is relevant to revisit the DT equation (1) matching measure: the coefficient on contemporaneous expense in a regression of revenue on lead, lag, and contemporaneous expense. In evaluating this measure, it is particularly important to recognize that it is inconsistent with the core matching principle that expenses are matched to revenues since it places expense(s) in the independent variable position.
Consequently, from a conceptual matching perspective it is better understood as a reverse regression approach in which matching is measured based on d2 from the following regression of revenues on expenses:

Revenue_t = d0 + d1*(Expense_{t-1} + e_{t-1}) + d2*(Expense_t + e_t) + d3*(Expense_{t+1} + e_{t+1})    (5)

As the conceptual matching roles of the lead and lag expense terms in this regression are rather unclear and their inclusion clouds the core reverse regression properties of the approach, we focus instead on the simpler regression of revenues on contemporaneous expenses (e.g. Sivakumar and Waymire, 2003) where expenses are measured with error (due to mismatching). That is:

Revenue_{i,t} = g0 + g1*(Expense_{i,t} + e_{i,t})    (6)

As noted in Dichev and Tang, measurement error in expense, e_{i,t}, biases g1 toward zero. Hence, in the reverse regression specification, the magnitude of the coefficient on estimated expense declines as random variation in expense (poor matching) increases.

The key drawback to the Dichev and Tang approach is seen when we rearrange (6) to correspond with the conceptual notion that matching drives the dependence of expenses on revenues, not revenues on expenses:

Expense_{i,t} = -g0/g1 + (1/g1)*(Revenue_{i,t} + e_{i,t})    (6a)

This rearrangement shows that higher (lower) estimated g1 values imply that expenses are less (more) determined by revenues. That is, apart from measurement error, g1 is inversely related to the degree to which expenses are being matched to revenues, which in turn suggests that the evidence that g1 declines over time reported in Dichev and Tang can actually be interpreted as implying that matching has improved over time. So, absent adopting the counterintuitive assumption that changes in matching impact the level of random error without affecting the level of expenses being matched to revenues, the implications of this metric for matching are inherently ambiguous.

4. Empirical Analyses

Our empirical analysis focuses on two error-based measures along the lines proposed by DT and the two measures of matching/non-matching developed in the previous section, MEXP% and MISM. The two DT-based measures are: DMSE, the direct regression mean squared error, which corresponds to the mean squared error from estimations of equation (2b); and DTM, the primary DT matching measure, which is the estimated coefficient on contemporaneous expense from equation (5). The analysis is conducted in two stages. In the first stage, we investigate the behavior over time of these measures. In the second, we explore the degree to which cross-sectional variation in matching is related to earnings persistence and earnings variability.

Both analyses use rolling ten-year windows of pooled annual firm-level data. Consistent with DT, revenue and expenses for a firm in these rolling pools are scaled by the book value of the firm’s assets. We follow Francis et al. (2004) and estimate the above matching measures for each firm in each rolling window. We restrict the sample of firms to the 500 largest non-financial firms listed on Compustat in the start year of each rolling window. We exclude firms for which revenue or expense data are unavailable in any of the subsequent nine years. The first rolling window we examine ends in 1973 (start year of 1964), while the last window ends in 2013. Hence, we have forty-one rolling ten-year windows in total and every tenth window is non-overlapping.

Two of the four matching measures we examine (MEXP% and DTM) are based on regression parameter values that follow directly from equations (2b) and (5) from the prior section. We estimate these parameters by firm in a given window using OLS regression. The two mean squared error based measures, DMSE and MISM, rely on mean squared error (MSE) calculated as

MSE_{i,w} = Sum of Squared Errors_{i,w}/(n_{i,w} - p)
where sum of squared errors is the sum of the squared residuals from the firm-specific regression of interest, n_{i,w} is the number of observations for firm i in rolling window w, and p is the number of parameters estimated in the associated regression specification (p is 2 for estimations of equation (2b) and 4 for estimations of equation (4)). Since MSE declines as explanatory power increases, higher values of MISM indicate that adjacent period revenues have greater explanatory power for current period expense (i.e., worse matching).

Table 1 reports descriptive statistics for the primary variables employed in our analyses. As expected, revenues at 87.2% of asset value, on average, slightly exceed expenses at 82.7% of asset value. The four matching measures (Winsorized at the 2% and 98% levels) all exhibit comparatively high degrees of variability, reflecting the short time series used in estimating them. DMSE averages 1.111, but its median is only 0.195, indicating that it is highly skewed. The average of the DTM metric of 0.975 is similar to the 0.957 average for 1967-2003 determined using ten-year averages of the yearly cross-sectional estimates reported in Table 3 of DT. The median measure of 1.016, however, is slightly higher than the mean. MEXP% averages 97.2% (median is 99%), which indicates that, on average, 97.2% of a firm’s expenses are matched to revenues via the estimated coefficient on revenue or, equivalently, that only 2.8% of firm expenses are not matched to revenues.⁹ MISM averages 0.140, indicating that adjacent period revenues are explaining some portion of current period expense. Median MISM, however, is only 0.005, suggesting that the high mean value for MISM is attributable to a small subset of firms.¹⁰

In addition to the four primary measures, Table 1 reports values for three more matching metrics: D_SALES, LGREV, and LDREV (also Winsorized at the 2% and 98% levels).
D_SALES is, as expected, smaller than MEXP% as it expresses expense recognition as a percentage of revenues instead of a percentage of expenses. Mean and median LDREV are positive, indicating that expenses are generally being recognized in advance of associated revenues more than they are being deferred in anticipation of future revenues. In contrast, mean LGREV is negative while its median is zero. Hence, on average higher values of LGREV lower next period expenses. One explanation for this relation is that expenses are being over-recognized as a function of revenue in the current period, leading to a reversal of such over-recognized expenses in the next period.

Table 2 reports means of within-window correlations among the four primary matching measures. DMSE is negatively associated with DTM and MEXP%, consistent with an inverse relation between unexplained variation in expenses and the values of these two metrics. The 34.7% value for MEXP%, however, is much higher than the 15.8% level obtained for DTM. Moreover, DTM is negatively correlated with MEXP%, which is consistent with the reverse regression interpretation of its value. That is, DTM declines as the magnitude of the underlying direct relation between expense as a function of revenue increases. DTM is also positively correlated with MISM. Collectively, these relations indicate that DTM is the least consistently aligned matching measure.

⁹ These percentages reflect how much expense is being matched to revenues; they do not necessarily speak to whether these expenses are being correctly matched to revenues.

¹⁰ In general, all of the results we report in the subsequent analysis of the over-time patterns in these matching measures are robust to using rolling window medians rather than means. The one exception is the analyses of MISM. While MISM medians exhibit the same sorts of patterns as their means, differences in MISM medians across windows generally lack significance.
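The four primary measures can be sketched for a single firm-window as follows. This is a minimal illustration rather than the authors' code: the data are synthetic, the function names `ols` and `matching_measures` are hypothetical, NumPy is assumed to be available, and the sketch assumes that one extra year of data is available at each end of the window to supply the lag and lead values.

```python
import numpy as np

def ols(y, *regressors):
    """OLS with an intercept; returns (coefficients, MSE) with MSE = SSE / (n - p)."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    n, p = X.shape
    return coef, float(resid @ resid) / (n - p)

def matching_measures(revenue, expense):
    """Measures for one firm-window. Inputs hold the window years plus one
    extra year at each end (an assumption) for the lag and lead terms."""
    r_lag, r_cur, r_lead = revenue[:-2], revenue[1:-1], revenue[2:]
    e_lag, e_cur, e_lead = expense[:-2], expense[1:-1], expense[2:]
    coef_2b, mse_2b = ols(e_cur, r_cur)              # equation (2b)
    _, mse_4 = ols(e_cur, r_lag, r_cur, r_lead)      # equation (4)
    coef_5, _ = ols(r_cur, e_lag, e_cur, e_lead)     # equation (5)
    return {
        "MEXP%": coef_2b[1] * r_cur.mean() / e_cur.mean(),
        "MISM": mse_2b - mse_4,
        "DMSE": mse_2b,
        "DTM": coef_5[2],
    }

# Made-up asset-scaled data: 10 window years plus one year on each side.
rng = np.random.default_rng(1)
revenue = rng.uniform(0.8, 1.2, 12)
expense = np.empty(12)
expense[0] = 0.9 * revenue[0]
# Part of each year's expense is tied to the prior year's revenue.
expense[1:] = 0.8 * revenue[1:] + 0.1 * revenue[:-1]

m = matching_measures(revenue, expense)
print({k: round(float(v), 4) for k, v in m.items()})
```

In this constructed example the lagged-revenue term accounts for all of the variation that current revenue leaves unexplained, so MSE(4) is essentially zero and MISM is essentially equal to DMSE.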
4.1 Matching Over Time Analysis

Figures 1 through 5 present graphs of the mean value of the four matching measures we consider by (rolling window-end) year as well as rolling-window values for D_SALES, which is the underlying driver for the MEXP% measure. As these are rather short time-series estimations, they are prone to extreme outcomes. Hence, we focus on cross-sectional mean values within rolling windows after Winsorizing at 2% and 98%. We take these means as reflective of expense recognition exhibited by the typical firm in each rolling window period.

All five graphs are consistent with matching having declined between 1965 and 2013. They differ considerably, however, in terms of when this decline manifests itself. Figure 1 suggests that DMSE was relatively constant until around 1980 (as these are end-years of ten-year windows, the reported point estimates reflect the average effect over the ten years ending with the last year in each rolling period). DMSE then increases fairly steadily through the end of the sample period. Interestingly, by the end of the time period its value is around five times its pre-1980 value. The DMSE graph also exhibits a sharp spike in 2009. This spike is consistent with a sharp decline in matching taking place in, and immediately after, the financial crisis year.11

Figure 2 presents the graph of the DTM metric and it suggests a rather different pattern than what we see in Figure 1 and the figures that follow. DTM declines over the first half of the sample period, exhibiting a particularly sharp drop in the years before 1992. It then oscillates around its 1992 levels in the following years, rising from 1993 to 2000, declining from 2001 to 2009, and then rising from 2010 forward.
If we adopt the DT interpretation of this metric (i.e., that it is primarily capturing random error in expense recognition), then these patterns suggest that a sharp decline in matching occurred between the mid-1970s and the early 1990s, but this decline was partially reversed over the later 1990s only to reverse again in the 2000s.12

In contrast to DTM, the D_SALES and MEXP% graphs presented in Figures 3 and 4 indicate that matching was rather stable in the earliest time periods, but then worsened sharply at the end of the sample period. Interestingly, in the 1975-1982 period, MEXP% exceeds 100%. That is, in the time periods covered by these rolling windows (roughly 1970 to 1982) expenses are apparently being over-matched to revenues. Such “too high” values can be evidence of poor matching. And, they likely partially account for the decline in the DTM measure seen in Figure 2 for this time period. However, this effect is somewhat mechanical. Ceteris paribus, the DTM measure moves inversely with D_SALES (which is the primary driver of MEXP%). Consequently, while DTM declines when D_SALES values rise, it also increases when D_SALES values fall, but falling D_SALES values typically imply deterioration, not improvement, in matching.13

11 The spike does not persist beyond 2010 due in large part to the exit of firms exhibiting extreme values in 2008-2010. If we limit the sample to only those firms still present in 2011, the spike largely disappears.

12 The pairwise correlation between the values for our measure and the ten-year averages of the annual cross-sectional DTM estimates reported in Table 3 of Dichev and Tang (2008) is 0.913. Hence, our alternative approach to generating the DTM coefficient is not the source of the difference in pattern between DTM and the other matching metrics we examine.

The Figure 5 graph of MISM, like D_INT and D_SALES, shows no large decline in matching (or, equivalently, increase in mismatching of expenses to adjacent period revenues) in the initial time period of the study. However, MISM increases steadily thereafter. Relative to the patterns exhibited by D_SALES and MEXP%, this increase begins earlier, in the early 1990s, and peaks in 2009, before declining somewhat at the very end of the sample period.14 The 2009 extreme peak, however, reflects that the financial crisis increased the level of expenses unexplained by concurrent revenue, and that this increase was accompanied by an increase in the degree to which adjacent period revenues explain expenses.

Table 3 presents a more formal evaluation of the over-time patterns in the various matching metrics. We base this evaluation on the five non-overlapping windows that span our 1964-2013 time period. Hence, the first window covers 1964-1973 while the final window covers 2004-2013. Panel A of Table 3 reports the mean values for DMSE, DTM, MEXP% and MISM in each window. Not surprisingly, these values reflect the same general patterns seen in the figures. DMSE increases uniformly over the five windows, DTM declines sharply in the 1984-1993 window where it is also lowest, MEXP% declines sharply in the 2004-2013 window, while MISM, like DMSE, increases over the five windows, particularly so in the final two windows relative to the earlier windows. Panels B through E of the table report t-statistics of differences in means between rolling windows for each of the four matching measures. Differences are determined as earlier window means deducted from later means.
For DMSE, reported in Panel B, all of these differences are positive and significant at the 0.05 level, consistent with the rising trend in DMSE seen in Figure 1 and Panel A. For DTM, reported in Panel C, the t-statistics for the differences between the means in the first two windows (1964-1973 and 1974-1983) and the final three windows are quite large in magnitude, ranging from -9.05 to -12.90, indicating that the drop in DTM seen in Figure 2 after 1983 is highly significant. However, the difference in DTM between the 1984-1993 and 1994-2003 windows is positive and significant, suggesting that matching recovered during 1994-2003. For MEXP%, reported in Panel D, the only significant differences are for the 2004-2013 window, where the t-statistics range between -3.91, for the difference with the 1994-2003 time period, and -5.57, for the difference with the 1974-1983 time period. These differences support the picture presented in Figure 4 that MEXP% remains relatively stable until the turn of the century before deteriorating. For MISM, reported in Panel E, the t-statistics for the differences between the last two windows and each of the first three windows range from 2.05 to 4.60. These results confirm the statistical significance of the increase in MISM that takes place in the final twenty years covered by Figure 5. However, the differences between the 1984-1993 mean and the earlier 1964-1973 and 1974-1983 means are significant as well (t-statistics of 3.40 and 2.84), indicating that MISM actually starts rising before 1984.

13 In fact, a likely contributing explanation for the absence of a more sizable decline in the DTM measure in the final portion of the sample period is the decline in D_SALES taking place in this same period.

14 Median values for MISM actually decline until the late 1990s before rising substantially over the final 15 years.
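The cross-window comparisons in Table 3 rest on two simple steps: Winsorizing the firm-level estimates within a window and testing the difference between two window means. The sketch below illustrates both. The text does not state which form of the two-sample t-statistic is used, so the unequal-variance (Welch) form here is an assumption, and the function names are hypothetical.

```python
import numpy as np

def winsorize(values, lower_pct=2.0, upper_pct=98.0):
    """Clip values to the given lower and upper percentiles."""
    values = np.asarray(values, dtype=float)
    lo, hi = np.percentile(values, [lower_pct, upper_pct])
    return np.clip(values, lo, hi)

def window_mean_diff_t(later, earlier):
    """t-statistic for (later window mean - earlier window mean).
    Assumption: unequal-variance (Welch) standard error."""
    later = np.asarray(later, dtype=float)
    earlier = np.asarray(earlier, dtype=float)
    diff = later.mean() - earlier.mean()
    se = np.sqrt(later.var(ddof=1) / later.size
                 + earlier.var(ddof=1) / earlier.size)
    return diff / se
```

Following the sign convention stated for Table 3, the earlier window mean is deducted from the later one, so a positive statistic for DMSE or MISM and a negative statistic for MEXP% both point toward weaker matching in the later window.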
4.2 Firm Compustat Age and Matching

Srivastava (2014) finds that much of the temporal change in measures of accounting quality, including DTM, between 1970 and 2009 is attributable to changes in the set of firms for which such measures are estimated. In particular, when the sample is restricted to older established firms, observed declines in earnings quality are far less pronounced. We examine the robustness of the Table 3 results to this firm age effect by restricting the sample in each of the five non-overlapping windows employed in Table 3 to only those firms in a window that are listed on Compustat prior to 1974.

Table 4 presents our analysis of these restricted samples. Panel A reports mean values for these restricted samples while Panels B through E report t-statistics for mean differences by matching metric. In this restricted sample, consistent with Srivastava, the decline in DTM is smaller relative to the decline observed in the overall sample. DTM falls to 0.946 in the 2004-2013 period, as compared to the 0.907 value reported in Table 3 for this same period. Similarly, MEXP% declines to 93.7% as compared to 90.9% in the full sample. However, the increases in DMSE and MISM are actually slightly higher for this sample. DMSE increases to 2.125 in the 2004-2013 period as opposed to 2.051 for this same period in the full sample, while MISM increases to 0.219 in the 2004-2013 period as opposed to its full sample value of 0.197. In the Panels B to E examinations of changes over time in matching, the results are broadly consistent with those reported in Table 3.
In terms of the significant differences identified in Table 3, the only comparable differences that lack significance for the restricted sample examined in Table 4 are: (1) DMSE (Panel B) in the 1994-2003 period as compared with the 1984-1993 period; (2) DTM (Panel C) in the 2004-2013 period as compared with the 1994-2003 period; (3) MISM (Panel E) in the 1994-2003 and 2004-2013 periods as compared with the 1984-1993 time period. Among these statistically insignificant differences, only the failure to find significant increases in MISM after 1983 represents a potentially substantive departure from the Table 3 overall sample results. Specifically, these results are consistent with the increase in MISM occurring largely prior to 1993 while the Table 3 analysis suggests that it continues to increase until 2003. However, inspection of the actual estimated MISM values in Panel A casts considerable doubt on such an interpretation. The 2004-2013 MISM value is 0.219, which is nearly double the 0.118 value reported for the 1984-1993 window. Hence, loss in power due to declining sample size seems a more plausible explanation for these insignificant MISM differences in Table 4.

4.3 Variation Attributable to Coefficient Restrictions

In section 3.2, we discuss how suppressed idiosyncratic parameter variation in empirical estimations of the degree to which contemporaneous revenues explain expense on average confounds the interpretation of DMSE and DMSE-related measures. While it is impossible to fully assess the impact of such suppressed variation on DMSE, it is possible to obtain some insights about the likely severity of this issue by evaluating how much DMSE changes as we move from a model where parameter estimates are more restricted to one where they are less restricted.
We conduct such an assessment by comparing the error levels based on our firm-specific estimations with the errors that arise when we estimate equation (1b) using a single pooled regression for each rolling window. That is, we restrict all firms in a rolling window to have the same slope and intercept.15 On a firm-by-firm basis, we then determine the average of the squared errors ($\text{ASE}_{i,wt}$) based on these commonly determined slope and intercept values. $\text{DIFSE}_{i,wt}$, the difference between $\text{ASE}_{i,wt}$ and $\text{MSE(1b)}_{i,wt}$, reflects the amount of error introduced at the firm level when individually estimated firm parameters are replaced with these pooled cross-sectional based estimators.

Panel A of Table 5 reports median values for DIFSE for the five non-overlapping ten-year windows in our sample. DIFSE more than doubles between 1964-1973 and 2004-2013, rising from 0.973 to 2.053. The t-statistic for this difference, reported in Panel B, is 3.93. Hence, restricting coefficients to their pooled cross-sectional values introduces increasing amounts of error into the overall pooled estimations of expense on revenue during 1964-2013. Moreover, such error is purely the result of imposing a counterfactual restriction on the model parameters and so cannot be viewed as evidencing poor matching. Indeed, it may well reflect good matching since we should expect such cross-sectional variation when firms perform differently.

15 We also conducted the analysis using annual cross-sectional regressions rather than pooling observations across years, which is the approach employed in DT. The median error levels for these annual regressions differed very little from those for the pooled regression. Hence, we only report values for the pooled specification here.

An alternative approach to measuring the error effect of restricting parameters to their cross-sectional pooled levels is to scale the error by the total amount of error that can possibly be explained.
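As a rough illustration of the DIFSE comparison and of the scaling just mentioned, the sketch below fits one pooled line for all firms in a window, evaluates each firm's average squared error under the pooled coefficients (ASE), and deducts the firm's own MSE. The function names and the dictionary layout of `panel` are hypothetical, and no 10^3 scaling is applied here.

```python
import numpy as np

def fit_line(x, y):
    """Return (intercept, slope) from a simple OLS fit of y on x."""
    slope, intercept = np.polyfit(x, y, 1)
    return intercept, slope

def difse_by_firm(panel):
    """panel maps firm id -> (revenue, expense) arrays for one rolling window.
    Returns ASE, firm-specific MSE, DIFSE = ASE - MSE, and DIFSE / ASE."""
    pooled_rev = np.concatenate([np.asarray(r, float) for r, _ in panel.values()])
    pooled_exp = np.concatenate([np.asarray(e, float) for _, e in panel.values()])
    a_pool, b_pool = fit_line(pooled_rev, pooled_exp)  # common intercept and slope

    results = {}
    for firm, (rev, exp) in panel.items():
        rev = np.asarray(rev, float)
        exp = np.asarray(exp, float)
        # Average squared error under the pooled (restricted) coefficients.
        ase = np.mean((exp - a_pool - b_pool * rev) ** 2)
        # Firm-specific regression error, MSE = SSE / (n - 2).
        a_i, b_i = fit_line(rev, exp)
        mse = np.sum((exp - a_i - b_i * rev) ** 2) / (len(exp) - 2)
        difse = ase - mse
        results[firm] = {"ASE": ase, "MSE": mse,
                         "DIFSE": difse, "DIFSE/ASE": difse / ase}
    return results
```

In this sketch, two firms that each match expense to revenue perfectly but at different expense ratios both show a positive DIFSE, which is the sense in which the restriction, rather than poor matching, generates the error.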
We do this by dividing DIFSE by ASE to determine the percentage of ASE that is attributable to the parameter restriction. These values are also reported in Panel A of Table 5 and range from a low of 46.4% during 1994-2003 to a high of 69.2% during 1964-1973. The magnitudes of these percentages suggest that parameter restriction contributes substantially to the residual error in pure cross-sectional estimations such as those employed in DT. However, unlike DIFSE, the values of these percentages do not rise over time. Indeed, the 2004-2013 value of 55.7% is significantly (0.05 level) lower than the 1964-1973 and 1974-1983 values of 69.2% and 62.8%. So, while parameter variation is increasingly important in absolute terms, its relative contribution to the overall level of error has fallen in more recent years. And, a decline in matching quality is a plausible candidate explanation for such a decline.

4.4 Lead and Lag Revenue Coefficients

While the lead and lag revenue coefficients (LDREV and LGREV) from (3) are problematic as measures of overall matching magnitude or efficacy, they are not without interest in terms of describing some of the underlying factors that are contributing to how well matching is working in a given time period or firm. Positive LGREV and LDREV values are broadly consistent with under-matching in that they identify the expenses recognized in the current period that are clearly attributable to other periods. In contrast, negative LGREV and LDREV values indicate that expenses in the current period are moving inversely with adjacent period revenues. That is, expenses are being shifted out of the current period and into adjacent periods. While one mechanism by which such inverse relations can arise, particularly with respect to LGREV, is over-matching, negative LGREV and LDREV coefficients do not necessarily imply over-matching.
For example, an expense recognition function of the form:

$$\text{Expense}_t = K - c_1\,\text{Rev}_{t+1} - c_2\,\text{Rev}_{t-1} \qquad (7)$$

where K is a positive constant, $c_1 > 0$, and $c_2 > 0$, gives rise to negative LGREV and LDREV values but has no implications for the relation between period t revenue and expense. Here, the expense reductions associated with lead and lag revenue are pulled from the constant term K. That is, higher values of $c_1$ and $c_2$ cause K to be higher (to cover them). Alternatively, if we replace K with $c\,\text{Rev}_t$ in (7), then the value of c must increase to cover them (which is over-matching).

Figure 6 presents the mean values for the firm-specific estimates of the LDREV and LGREV coefficients by year. LGREV is negative for the first several rolling windows (i.e., the 1964-1974 period), but then turns positive, increasing in value through the next several years and peaking in 1982. It then falls back below zero again in 1987, where it remains for the rest of the sample period. The shifting back and forth between positive and negative values suggests that both under-matching and contrarian expense shifting are co-occurring, with direct shifting dominating in the early years and contrarian shifting becoming a major factor in more recent years. LDREV, on the other hand, is positive for every rolling window year except 2012.

Table 6 presents more formal analyses of median LDREV and LGREV for the five non-overlapping ten-year windows in our analysis. Panel A reports means as well as the percentage of the annual estimates that are positive for each of the five windows. LDREV is, on average, positive in all five windows. It is significant at the 0.05 level in all but the 2004-2013 window. It is positive more often than it is negative in all windows, but the highest percentage positive value is only 57.6%, which occurs during 1964-1973. So, a substantial minority of firms exhibit negative LDREV values.
LGREV is negative in all windows except 1974-1983, when it is positive (significant at the 0.05 level). The negative values in the last three windows are all statistically significant (0.05 level). These values are all consistent with the shifts in LGREV observed in Figure 6. Panels B and C report t-statistics for the differences in rolling window means for LDREV (Panel B) and LGREV (Panel C). For LDREV the only significant (0.05 level) differences are that the 1984-1993 mean of 0.042 exceeds the 1974-1983 mean of 0.017 as well as the 2004-2013 mean of 0.006. For LGREV the 1974-1983 mean of 0.010 is greater than all of the other window means (all of which are negative).

4.5 Cross-Sectional Analyses

In this section, we examine the relations between earnings variability, earnings persistence and matching in the cross-section. DT propose that, as matching efficacy declines, earnings variability increases and earnings persistence declines. Both of these observations stem from the notion that poor matching adds noise to earnings by spreading out costs. We examine these two relations using our alternative measures of matching and mismatching: MEXP% and MISM. We do not consider DMSE as its validity depends on parameter variation not playing a confounding role as a design-induced source of error, and there is no plausible way to completely control for such variation in the cross-section. And, we know from Table 5 that such variation is a major source of error, the size of which increases over time. DMSE is also arguably a tautological metric, since it measures the variability of the expense component of earnings. We do not consider DTM because it is conceptually problematic, and empirically it seems out of step with the other matching measures. Finally, we also consider the cross-sectional relevance of the LDREV and LGREV estimates from equation 4 for earnings variability and persistence.
Since the implications of these two parameters differ substantively depending on their sign, we consider their positive and negative values separately. Hence, we split the set of LDREV estimates into two variables: (1) LDREV+, which equals LDREV when LDREV > 0, and is 0 otherwise; and (2) LDREV-, which equals LDREV when LDREV < 0, and is 0 otherwise. We perform a similar division of the LGREV estimates as well to generate LGREV+ and LGREV-.

We examine both earnings variability (Evar) and earnings persistence (Persist) using regression estimations of the following linear form:

$$\text{Evar}_{i,rt} \ \text{or} \ \text{Persist}_{i,rt} = d_0 + d_1 M_{i,rt} + \sum_j d_j\,\text{controls}_{i,j,rt} + e_{i,rt} \qquad (8)$$

M represents a given firm-specific measure of matching. In the LDREV and LGREV versions of this model they enter in pairs. For example, in the case of LDREV, we replace $d_1 M$ in (8) with $d_{1a}$ LDREV+ and $d_{1b}$ LDREV-. We employ decile ranks of all variables in estimating equation (8) where the ranking is done within rolling windows.16 Consequently, explained variation in the model is due entirely to cross-sectional within-year variation and not to economy-wide trends or commonalities in the earnings quality or matching measures we examine.17

16 LDREV and LGREV are divided based on coefficient sign after ranking.

17 The ten-year time-series regressions we employ are, in particular, quite noisy. Hence, it is important to avoid overweighting extreme observations associated with them, as such observations are almost certainly due to noise. As an alternative to decile ranks, we also trimmed the data as a means of eliminating such extreme observations. The trimmed results are broadly consistent with those reported based on decile ranks.

The set of control variables employed in some of our estimations of (8) are: (i) log(bp), defined as the natural logarithm of the book-to-price ratio of equity (CEQ/(CSHO*PRCC_F)), to control for underlying risk or future investment opportunity (Smith and Watts 1992; Fama and French 1992); (ii) log(mv), defined as the natural logarithm of equity market value (CSHO*PRCC_F), to control for firm size in market value (Fama and French 1992); (iii) log(assets), defined as the natural logarithm of total assets (AT), to control for firm size in book value (Dechow and Dichev 2002; Francis et al. 2004); (iv) σ(sale), defined as the standard deviation of total sales (SALE) for the rolling ten-year period, to control for operating volatility (Dechow and Dichev 2002; Francis et al. 2004); and (v) log(ocycle), defined as the natural logarithm of operating cycle, i.e., months in accounts receivable plus months in inventory ((RECT/SALE) + (INVT/COGS)), to control for credit and collection policy (Dechow 1994; Dechow and Dichev 2002; Francis et al. 2004).

4.5.1 Earnings Variability

Table 7 presents estimations of equation (8) for earnings variability, both without (Panel A) and with (Panel B) control variables. Estimates are provided for each rolling window and for the five windows pooled together. In the Panel A estimations, MEXP% exhibits a consistent significant negative relation with earnings variability while MISM exhibits a consistent significant positive relation with earnings variability. Hence, consistent with DT, earnings variability is inversely associated with matching efficacy as captured by MEXP% and MISM. These core relations differ little across windows. For MEXP% the estimated coefficients vary between -0.392 during 2004-2013 and -0.580 during 1974-1983. For MISM they vary between 0.134 during 2004-2013 and 0.254 during 1974-1983. Introduction of control variables (Panel B) has minimal impact on the Panel A MEXP% and MISM relations. Estimated magnitudes shrink somewhat, but remain significant at the 0.05 level or better in all cases except MISM during 2004-2013. The LDREV and LGREV relations are also of some interest here.
Positive LDREV values are positively associated with earnings variability in the pooled overall estimation (significant at the 0.05 level), but negative values are negatively associated with variability (also significant at the 0.05 level). These divergent directional effects indicate that as the magnitude of the LDREV coefficient increases in absolute terms, earnings variability also increases. In the case of LGREV, however, the estimated relation is negative regardless of the sign of LGREV (both coefficients are significant at the 0.05 level). So, as LGREV increases, earnings variability decreases unconditionally. However, the inclusion of control variables raises the LGREV+ coefficient from -0.062 to -0.007, which lacks statistical significance. Hence, decreasing the absolute magnitude of negative LDREV coefficients (i.e., moving them toward 0) is associated with lower earnings variability, which is consistent with near 0 LDREV values reflecting low levels of mismatching. But, increasing them beyond 0 has little impact on variability or possibly even decreases it further. A possible explanation for this latter relation stems from the fact that positive LDREV coefficients reflect recognition of expense in advance of the associated revenue. If the objective of such advanced expense recognition is to smooth earnings, then higher positive side LDREV values may reflect such smoothing, which by design decreases earnings variability.

4.5.2 Earnings Persistence

Table 8 presents estimations of equation (8) for earnings persistence. The MEXP% and MISM relations here are rather surprising. In the overall estimations, MEXP% is negatively associated with earnings persistence and MISM is positively associated with it. That is, matching efficacy is negatively, not positively, associated with the level of earnings persistence exhibited by firms in the cross-section.
A possible explanation for these unexpected relations is that persistence also depends on the frequency with which firms experience one-time or short-lived earnings shocks. The short-lived nature of such shocks seems likely to inhibit the ability of firms to shift expenses associated with them to other time periods, giving rise to lower levels of MISM and higher levels of MEXP%. Hence, MISM is lower and MEXP% is higher in high transitory shock (low inherent persistence) firms than in low transitory shock (high inherent persistence) firms.

The LGREV and LDREV relations with persistence are also of interest here. The reported coefficients are negative in all windows in both panels, and significant (0.05 level) in most windows. Hence, higher values of LGREV and LDREV, irrespective of sign, are associated with lower earnings persistence. Interestingly, this general unconditional relation implies that firms with highly negative values of LGREV (or LDREV) exhibit higher levels of earnings persistence than firms with near 0 values. Over-matching of expense, in particular, is a plausible explanation for this. Consider a firm that experiences a purely transitory revenue shock of T in a period where the appropriately matched expense for this revenue is a*T, but the firm actually recognizes expenses of (a+k)*T, leading to a reversal of k*T in the next period. If k > 0 (i.e., there is over-matching), then the transitory one-period income of (1-a)*T is now spread over two periods, with (1-a-k)*T recognized in the first period and k*T recognized in the second. That is, over-matching causes transitory single period earnings shocks to positively persist into the subsequent period.

5. Conclusion

As a fundamental principle, matching aims to appropriately assess the cost of recognized revenue in a period as a measure of current period performance (e.g., Dichev 2008; Basu and Waymire 2010).
Hence, it is a core concept for any substantive notion of the profitability of a firm’s transactions and is a financial reporting attribute of considerable empirical interest. Existing archival empirical evidence on matching, however, is limited. Moreover, much of the existing evidence relies on measures developed in Dichev and Tang (2008) that are derived from the perspective that matching involves the advancing of expenditures to generate revenues. That is, expense recognition precedes revenue recognition. Conceptually, however, matching is the recognition of expenses associated with recognized revenues. That is, expense recognition follows, rather than precedes, revenue recognition. We examine whether this distinction matters by revisiting the DT study using measures consistent with the conceptual perspective that expenses are matched to revenues as opposed to revenues being matched to expenses. More generally, our study provides a more conceptually grounded empirical perspective of matching efficacy over the past fifty years as well as how matching efficacy is related to core earnings attributes.

Our primary comprehensive measure of overall matching efficacy is the percentage of a firm’s expense that, on average, is directly attributable to revenue based on how the firm’s expenses and revenues covary over time. This measure is remarkably high, averaging 97.2% during 1964 to 2013. And, through the year 2000 it does not deviate greatly from this average. In fact, its value is 97.9% in the opening period of our analysis (1964-1973) and its value for the ten-year period ending in 2000 is 98.7%. After 2000, however, it declines markedly, reaching a nadir of 88.6% in the ten-year period ending in 2009 (the primary financial crisis year). And, it is only 90.9% in the final time period we study (the ten years ending in 2013). In contrast, the primary over-time matching measure in DT suggests that matching declined markedly during the 1980s and early 1990s.
Its level in 2013 of 0.907 is actually slightly higher than the low of 0.891 it reaches for the ten-year period ending in 1994. These values suggest that matching efficacy is little changed since 1994.

Our second measure of matching addresses the amount of mismatching attributable to adjacent period revenues. It is a partial measure in that it does not capture mismatching to non-adjacent periods. Consistent with the percentage of matched expense measure, it indicates that mismatching is higher in the closing years of our analysis relative to earlier years. It is also particularly high in the financial crisis time period.

Our analysis also identifies the over-matching of expenses to revenues as a key factor in understanding matching. Over-matching is the recognition of a greater amount of expense than appropriate as a function of revenues. And, during 1975-1982, over-matching plays a dominant role in expense recognition as the percentage of expense being recognized based on revenue actually exceeds 100%. In fact, if 100% is taken as a benchmark level that one should expect to observe under perfect matching, then the 1975-2000 time period reflects a shift from a period where expense recognition is barely dominated by over-matching to one where expense recognition is barely dominated by under-matching. Over-matching is also a factor in the positive cross-sectional relation between mismatching and earnings persistence. Over-matching spreads earnings shocks into other periods through expense reversals, thereby enhancing (in appearance) their permanence.

Finally, we employ our measures to evaluate how earnings variability and earnings persistence vary with our measure of matching and our measure of mismatching in the cross-section. Consistent with Dichev and Tang as well as underlying theory, we find that earnings variability moves inversely with the percentage of expense attributable to revenue and increases with mismatching.
However, we do not find the expected positive linkage between percentage of expense attributable to revenue and earnings persistence or the expected negative relation between earnings persistence and mismatching. Over-matching, as discussed in the prior paragraph, is one possible explanation for not finding these relations. Another is that the persistence exhibited by revenue shocks is directly connected to how difficult it is to match expenses to such shocks. Hence, high persistence shocks give rise to lower matching simply because it is more difficult to appropriately match expenses to such shocks.

REFERENCES

Basu, S. 1997. The conservatism principle and the asymmetric timeliness of earnings. Journal of Accounting and Economics 24 (1): 3-37.

Basu, S. 2003. Discussion of “Enforceable accounting rules and income measurement by early 20th century railroads.” Journal of Accounting Research 41 (2): 433-444.

Basu, S. and G. B. Waymire. 2010. Sprouse’s What-You-May-Call-Its: Fundamental insight or monumental mistake? Accounting Historians Journal 37 (1): 121-148.

Bushman, R. M., A. Lerman, and X. F. Zhang. 2016. The changing landscape of accrual accounting. Journal of Accounting Research 54 (1): 41-78.

Collins, D. W., E. Maydew, and I. Weiss. 1997. Changes in value-relevance of earnings and book values over the past forty years. Journal of Accounting and Economics 24 (1): 39-67.

Dechow, P. M. 1994. Accounting earnings and cash flows as measures of performance: The role of accounting accruals. Journal of Accounting and Economics 18 (1): 3-42.

Dechow, P. M. and I. D. Dichev. 2002. The quality of accruals and earnings: The role of estimation errors. The Accounting Review 77 (Supplement): 35-59.

Dichev, I. D. 2008. On the balance sheet-based model of financial reporting. Accounting Horizons 22 (4): 453-470.

Dichev, I. D. and V. W. Tang. 2008. Matching and the changing properties of accounting earnings over the last 40 years. The Accounting Review 83 (6): 1425-1460.
Donelson, D. C., R. Jennings, and J. McInnis. 2011. Changes over time in the revenue-expense relation: Accounting or economics? The Accounting Review 86 (3): 945-974.

Fama, E. F. and K. R. French. 1992. The cross-section of expected stock returns. Journal of Finance 47 (2): 427-465.

Francis, J., R. LaFond, P. M. Olsson, and K. Schipper. 2004. Costs of equity and earnings attributes. The Accounting Review 79 (4): 967-1010.

Francis, J. and K. Schipper. 1999. Have financial statements lost their relevance? Journal of Accounting Research 37 (2): 319-352.

He, W. and Y. Shan. 2016. International evidence on the matching between revenues and expenses. Contemporary Accounting Research forthcoming.

Lee, J. 2011. Measuring reporting conservatism using the Dichev-Tang (2008) model. Unpublished working paper, Singapore Management University. SSRN: http://ssrn.com/abstract=2100319.

Paton, W. A. and A. C. Littleton. 1940. An Introduction to Corporate Accounting Standards. American Accounting Association.

Prakash, R. and N. Sinha. 2013. Deferred revenues and the matching of revenues and expenses. Contemporary Accounting Research 30 (2): 517-548.

Sivakumar, K. N. and G. B. Waymire. 2003. Enforceable accounting rules and income measurement by early 20th century railroads. Journal of Accounting Research 41 (2): 397-432.

Smith, C. W. and R. L. Watts. 1992. The investment opportunity set and corporate financing, dividend, and compensation policies. Journal of Financial Economics 32 (3): 263-292.

Srivastava, A. 2014. Why have measures of earnings quality changed over time? Journal of Accounting and Economics 57 (2-3): 196-217.

Table 1
Descriptive Statistics

Variables             Mean   Std. Dev.    Median   Maximum   Minimum
Revenue              0.873       0.594     0.773     3.507     0.073
Expense              0.828       0.585     0.724     3.427     0.058
Matching Measures
DMSE                 1.020       3.221     0.193    72.312     0.000
DTM                  0.976       0.246     1.017     1.568    -0.181
MEXP%                97.2%       19.0%     99.0%    173.3%    -18.1%
MISM                 0.136       0.770     0.005    15.906    -2.083
Other Measures
D_SALES              0.902       0.161     0.931     1.386    -0.027
LDREV                0.025       0.180     0.011     1.023    -0.833
LGREV               -0.015       0.182     0.000     0.596    -1.136
Evar                 0.026       0.027     0.018     0.337     0.001
Persist              0.402       0.362     0.420     1.808    -0.558

Revenue is annual revenue (SALE) divided by average assets (AT).

Expense is annual expense, excluding extraordinary items (SALE-IB), divided by average assets (AT).

DMSE is the mean squared error (× 10^3) from regressing expense on contemporaneous revenue. Regressions are firm-specific time-series, estimated over 10-year rolling windows.

DTM is the coefficient on contemporaneous expense from regressing revenue on lead, lag, and contemporaneous expense. Regressions are firm-specific time-series, estimated over 10-year rolling windows.

D_SALES is the coefficient on revenue from regressing expense on contemporaneous revenue. Regressions are firm-specific time-series, estimated over 10-year rolling windows.

MEXP% is D_SALES*Average(Revenue/Expense), where Revenue/Expense is averaged over the 10 years of data used to estimate D_SALES.

MISM is the mean squared error (× 10^3) from regressing expense on contemporaneous revenue less the mean squared error (× 10^3) from regressing expense on lead, lag, and contemporaneous revenue. Both regressions are firm-specific time-series, estimated over 10-year rolling windows.

LDREV is the coefficient on lead revenue in regressions of expense on lead, lag, and contemporaneous revenue. Regressions are firm-specific time-series, estimated over 10-year rolling windows.

LGREV is the coefficient on lag revenue in regressions of expense on lead, lag, and contemporaneous revenue. Regressions are firm-specific time-series, estimated over 10-year rolling windows.

Evar is the standard deviation of a firm’s income before extraordinary items (IB), determined for 10-year rolling windows.
Persist is the persistence of earnings before extraordinary items (IB), measured as the coefficient on lagged income before extraordinary items estimated from firm-specific time-series regressions of IB on the lag of IB. Income is divided by average assets (AT).

Table 2
Correlations Among Matching Measures: 1964-2013

Variables           DMSE         DTM          MEXP%
DTM                -0.158
(p-value)          (<.0001)
MEXP%              -0.349       -0.280
(p-value)          (<.0001)     (<.0001)
MISM                0.195        0.040       -0.032
(p-value)          (<.0001)     (<.0015)     (<.0018)

Variable definitions are provided in Table 1. Reported correlations are averages of rolling 10-year window Spearman correlations. P-values are based on these average values.

Table 3
Matching in the 1964 to 2013 Time Period

Panel A: Mean Values For Every 10th Rolling Window

Matching Measures   1964-1973   1974-1983   1984-1993   1994-2003   2004-2013
                    Window      Window      Window      Window      Window
DMSE                0.180       0.384       0.943       1.484       2.051
DTM                 1.080       1.063       0.895       0.943       0.907
MEXP%               97.9%       98.8%       98.2%       97.0%       90.9%
MISM                0.034       0.043       0.097       0.229       0.197

Panel B: t-statistics for Cross-Window Mean Differences in DMSE

                    Baseline Window
Deducted Window     1974-1983   1984-1993   1994-2003   2004-2013
1964-1973           5.39        8.48        8.84        8.48
1974-1983                       5.84        7.28        7.48
1984-1993                                   3.15        4.67
1994-2003                                               2.14

Panel C: t-statistics for Cross-Window Mean Differences in DTM

                    Baseline Window
Deducted Window     1974-1983   1984-1993   1994-2003   2004-2013
1964-1973           -1.78       -12.98      -10.25      -11.14
1974-1983                       -11.88      -9.05       -10.10
1984-1993                                   2.88        0.63
1994-2003                                               -2.04

Panel D: t-statistics for Cross-Window Differences in MEXP%

                    Baseline Window
Deducted Window     1974-1983   1984-1993   1994-2003   2004-2013
1964-1973           1.04        0.25        -0.88       -4.93
1974-1983                       -0.56       -1.72       -5.57
1984-1993                                   -0.97       -4.67
1994-2003                                               -3.91

Panel E: t-statistics for Cross-Window Differences in MISM

                    Baseline Window
Deducted Window     1974-1983   1984-1993   1994-2003   2004-2013
1964-1973           1.15        3.40        4.86        3.54
1974-1983                       2.84        4.60        3.32
1984-1993                                   3.04        2.05
1994-2003                                               -0.53

This table reports measures reflecting how well expenses are matched to revenues over ten-year periods. Panel A reports mean firm-level values for the four primary matching measures we consider based on firm-specific time-series regressions conducted over the specified ten-year window. Measure definitions are provided in Table 1. The sample of firms for each window consists of the 500 largest non-financial firms at the start of the window with available data for the entire 10-year period. Panels B through E of the table report t-statistics for between-window comparisons (baseline window less deducted window) of the means of each of the four matching measures. Significant t-statistics (0.05 level) are bolded.

Table 4
Mean Values of Matching Measures for Firms with Pre-1974 Compustat Start Dates

Panel A: Means

Window              DMSE     DTM      MEXP%    MISM     N
1964-1973 Window    0.180    1.080    97.9%    0.034    500
1974-1983 Window    0.384    1.064    98.5%    0.044    485
1984-1993 Window    1.058    0.901    97.9%    0.118    380
1994-2003 Window    1.135    0.959    98.6%    0.140    323
2004-2013 Window    2.125    0.946    93.7%    0.219    194

Panel B: t-statistics for Cross-Window Mean Differences in DMSE

                    Baseline Window
Deducted Window     1974-1983   1984-1993   1994-2003   2004-2013
1964-1973           5.33        7.97        6.55        4.90
1974-1983                       5.86        5.02        4.37
1984-1993                                   0.42        2.59
1994-2003                                               2.34

Panel C: t-statistics for Cross-Window Mean Differences for DTM

                    Baseline Window
Deducted Window     1974-1983   1984-1993   1994-2003   2004-2013
1964-1973           -1.62       -11.06      -8.41       -6.40
1974-1983                       -10.12      -7.34       -5.66
1984-1993                                   2.99        1.84
1994-2003                                               -0.54

Panel D: t-statistics for Cross-Window Mean Differences for MEXP%

                    Baseline Window
Deducted Window     1974-1983   1984-1993   1994-2003   2004-2013
1964-1973           0.77        -0.06       0.63        -2.50
1974-1983                       -0.62       0.06        -2.88
1984-1993                                   0.57        -2.24
1994-2003                                               -2.68

Panel E: t-statistics for Cross-Window Mean Differences for MISM

                    Baseline Window
Deducted Window     1974-1983   1984-1993   1994-2003   2004-2013
1964-1973           1.25        3.72        2.70        2.12
1974-1983                       3.23        2.43        2.00
1984-1993                                   0.49        1.13
1994-2003                                               0.83

This table reports measures reflecting how well expenses are matched to revenues over ten-year periods for those firms in each period that are listed on Compustat prior to 1974. Panel A reports mean firm-level values for the four primary matching measures we consider based on firm-specific time-series regressions conducted over the specified ten-year window. Measure definitions are provided in Table 1. The initial sample of firms for each window consists of the 500 largest non-financial firms at the start of the window with available data for the entire 10-year period. Panels B through E of the table report t-statistics for between-window comparisons (baseline window less deducted window) of the means of each of the four matching measures. Significant t-statistics (0.05 level) are bolded.

Table 5
Residual Error Variation Attributable to Imposing Common Regression Parameter Values in the Cross-section

Panel A: Mean Values For Every 10th Rolling Window

Matching Measure    1964-1973   1974-1983   1984-1993   1994-2003   2004-2013
                    Window      Window      Window      Window      Window
DIFSE               0.973       1.047       1.114       1.678       2.053
DIFSE/ASE           0.692       0.628       0.581       0.464       0.557

Panel B: t-statistics for Cross-Window Mean Differences for DIFSE

                    Baseline Window
Deducted Window     1974-1983   1984-1993   1994-2003   2004-2013
1964-1973           0.49        0.52        3.30        3.93
1974-1983                       0.25        2.93        3.65
1984-1993                                   1.81        2.63
1994-2003                                               1.19

Panel C: t-statistics for Cross-Window Mean Differences for DIFSE/ASE

                    Baseline Window
Deducted Window     1974-1983   1984-1993   1994-2003   2004-2013
1964-1973           -5.17       -10.84      -3.10       -6.36
1974-1983                       -7.50       -2.11       -3.23
1984-1993                                   -5.15       -1.06
1994-2003                                               4.12

This table reports mean values for DIFSE and DIFSE/ASE, where ASE is a firm’s average residual squared error (x 10^3) from a regression of expense on revenue that is estimated using pooled cross-sectional data (from all 500 firms over all 10 years in a given rolling window) and DIFSE is ASE - DMSE (DMSE, the firm-specific mean squared error (x 10^3), is defined in Table 1).
The sample of firms for each ten-year window consists of the 500 largest non-financial firms at the start of the window with available data for the entire 10-year period. Panels B and C of the table report t-statistics for between-window comparisons (baseline window less deducted window) of mean DIFSE and mean DIFSE/ASE, respectively. Significant t-statistics (0.05 level) are bolded.

Table 6
Lead and Lag Revenue Coefficients by Rolling Window

Panel A: Mean Values For Every 10th Rolling Window

Measures            1964-1973   1974-1983   1984-1993   1994-2003   2004-2013
                    Window      Window      Window      Window      Window
LDREV               0.023       0.017       0.042       0.023       0.006
(% Positive)        (57.6%)     (54.8%)     (53.8%)     (55.4%)     (51.2%)
LGREV               -0.010      0.010       -0.022      -0.018      -0.033
(% Positive)        (44.0%)     (52.2%)     (51.4%)     (51.2%)     (47.2%)

Panel B: t-statistics for Cross-Window Mean Differences for LDREV

                    Baseline Window
Deducted Window     1974-1983   1984-1993   1994-2003   2004-2013
1964-1973           -0.91       1.60        -0.01       -1.53
1974-1983                       2.07        0.59        -1.05
1984-1993                                   1.39        -2.38
1994-2003                                               1.31

Panel C: t-statistics for Cross-Window Mean Differences for LGREV

                    Baseline Window
Deducted Window     1974-1983   1984-1993   1994-2003   2004-2013
1964-1973           -1.13       -0.87       2.86        -1.97
1974-1983                       -2.98       -2.91       -3.66
1984-1993                                   0.28        -0.79
1994-2003                                               1.09

This table reports mean values of firm-specific estimates of the lead (LDREV) and lagged (LGREV) coefficients from firm-specific regressions of expense on lead, lag, and contemporaneous revenues. The sample of firms for each ten-year window consists of the 500 largest non-financial firms at the start of the window with available data for the entire 10-year period. Panels B and C of the table report t-statistics for between-window comparisons (baseline window less deducted window) of mean LDREV and mean LGREV, respectively. Means significantly different from 0 and medians significantly different from 50% at the 0.05 level are bolded in Panel A. T-statistic values significant at the 0.05 level are bolded in Panels B and C.
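
The measure definitions in Table 1 and the lead-lag regressions summarized in Table 6 can be illustrated for a single firm and window with a minimal sketch. It is not the authors' code. Several choices are assumptions made only for illustration: the function names `ols` and `matching_measures`, the use of interior years so that both a lead and a lag exist, estimating both regressions on those same years, and a degrees-of-freedom-adjusted mean squared error, SSR/(n - k).

```python
def ols(y, xs):
    """OLS of y on an intercept plus the columns in xs.
    Returns (coefficients, intercept first; MSE = SSR / (n - k), an assumption)."""
    n = len(y)
    cols = [[1.0] * n] + [list(c) for c in xs]
    k = len(cols)
    # Augmented normal equations [X'X | X'y]
    a = [[sum(ci[t] * cj[t] for t in range(n)) for cj in cols]
         + [sum(ci[t] * y[t] for t in range(n))] for ci in cols]
    # Gauss-Jordan elimination with partial pivoting
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[p] = a[p], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [v - f * w for v, w in zip(a[r], a[i])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    resid = [y[t] - sum(b * c[t] for b, c in zip(beta, cols)) for t in range(n)]
    return beta, sum(e * e for e in resid) / (n - k)


def matching_measures(revenue, expense):
    """Firm-window measures from asset-scaled revenue and expense series.
    Uses years 1..T-2 so that lead and lag revenue are both available (assumption)."""
    idx = range(1, len(revenue) - 1)
    exp = [expense[t] for t in idx]
    rev = [revenue[t] for t in idx]
    lead = [revenue[t + 1] for t in idx]
    lag = [revenue[t - 1] for t in idx]
    (_, d_sales), mse_contemp = ols(exp, [rev])
    (_, _, ldrev, lgrev), mse_leadlag = ols(exp, [rev, lead, lag])
    avg_ratio = sum(r / e for r, e in zip(rev, exp)) / len(rev)
    return {
        "D_SALES": d_sales,                            # coefficient on revenue
        "DMSE": 1000 * mse_contemp,                    # MSE x 10^3
        "MEXP%": d_sales * avg_ratio,                  # D_SALES * Average(Revenue/Expense)
        "MISM": 1000 * (mse_contemp - mse_leadlag),    # MSE reduction from lead and lag
        "LDREV": ldrev,                                # coefficient on lead revenue
        "LGREV": lgrev,                                # coefficient on lag revenue
    }


if __name__ == "__main__":
    # Hypothetical 12-year series; expense loads on current and next-year revenue
    revenue = [1.0, 1.3, 0.9, 1.6, 1.2, 1.1, 1.8, 1.4, 2.0, 1.5, 2.3, 1.9]
    last = len(revenue) - 1
    expense = [0.05 + 0.9 * revenue[t] + 0.04 * revenue[min(t + 1, last)]
               for t in range(len(revenue))]
    for name, value in matching_measures(revenue, expense).items():
        print(name, round(value, 4))
```

With the degrees-of-freedom adjustment assumed here, MISM can be negative when lead and lag revenue add little explanatory power, which would be consistent with the negative minimum reported for MISM in Table 1.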
Table 7
Matching and Earnings Variability

Panel A: Estimates Without Control Variables

Matching Measures; 1964-1973 Window and 1974-1983 Window (coeff, std. err for each window)
MEXP%     0.045 -0.580 0.036 -0.448
MISM      0.046 0.047 0.198 0.254
LDREV+    -0.077 0.051 0.017 0.049
LDREV-    -0.056 0.046 -0.093 0.049
LGREV+    0.046 0.020 0.048 -0.186
LGREV-    0.051 -0.121 0.047 -0.150

1984-1993 Window (coeff, std. err): 0.041 -0.531 0.045 0.201 0.049 0.048 -0.086 0.047 -0.087 0.049 0.046 -0.225
1994-2003 Window (coeff, std. err): 0.042 -0.464 0.046 0.182 0.049 0.179 -0.039 0.046 -0.054 0.049 0.046 -0.269
2004-2013 Window (coeff, std. err): 0.044 -0.392 0.047 0.134 0.047 0.100 -0.087 0.047 0.001 0.048 0.046 -0.220
Overall (coeff, std. err): 0.019 -0.483 0.021 0.194 0.022 0.054 0.021 -0.072 0.022 -0.062 0.021 -0.197

Panel B: Estimates With Control Variables Included in the Model

Matching Measures; 1964-1973, 1974-1983, 1984-1993, 1994-2003, and 2004-2013 Windows and Overall (coeff, std. err for each)
MEXP%     0.044 -0.396 0.034 -0.372 0.044 -0.406 0.042 -0.411 0.042 -0.421 0.019 -0.535
MISM      0.037 0.035 0.038 0.043 0.072 0.042 0.018 0.117 0.109 0.182 0.137 0.125
LDREV+    0.049 0.035 0.041 0.045 0.043 0.019 0.106 0.079 0.151 0.175 0.133 0.134
LDREV-    -0.032 0.044 -0.093 0.034 -0.075 0.039 -0.057 0.044 -0.106 0.043 -0.073 0.018
LGREV+    0.047 0.038 0.034 0.008 0.040 0.022 0.046 0.003 0.044 -0.007 0.019 -0.133
LGREV-    0.045 -0.128 0.031 -0.214 0.038 -0.216 0.043 -0.268 0.043 -0.224 0.018 -0.320

This table reports coefficient estimates from regressions of earnings variability, measured by firm over ten-year windows, on various measures reflecting how expenses are matched with revenues for a firm over these same ten-year rolling windows. Variable definitions for the matching measures and earnings variability are provided in Table 1. Panel B control variables are average log of book-to-price ratio of equity (CEQ/(CSHO*PRCC_F)); average log of market value (CSHO*PRCC_F); average log of asset value (AT); sales variability measured as the standard deviation of a firm’s sales (SALE) over the ten-year window; and average log of operating cycle ((RECT/SALE) + (INVT/COGS)). Separate models are estimated for each matching measure (LDREV+ and LDREV- are treated as a single measure, as are LGREV+ and LGREV-). LDREV+ equals LDREV when LDREV > 0, and is 0 otherwise; LDREV- equals LDREV when LDREV < 0, and is 0 otherwise. LGREV+ equals LGREV when LGREV > 0, and is 0 otherwise; LGREV- equals LGREV when LGREV < 0, and is 0 otherwise. The first five columns report estimates for each of the five non-overlapping ten-year windows in our sample. The overall column pools all of the observations from these five windows. All data in these regressions are measured using decile ranks formed within windows. Standard errors are reported in parentheses. In the case of the overall estimation these errors are cluster-adjusted by window.

Table 8
Matching and Earnings Persistence

Panel A: Estimates Without Control Variables

Matching Measures; 1964-1973 Window and 1974-1983 Window (coeff, std. err for each window)
MEXP%     0.044 -0.105 0.045 -0.175
MISM      -0.019 0.043 -0.007 0.043
LDREV+    0.047 -0.114 0.049 -0.215
LDREV-    0.044 -0.153 0.047 -0.248
LGREV+    0.047 -0.241 0.046 -0.125
LGREV-    0.047 -0.303 0.046 -0.192

1984-1993 Window (coeff, std. err): 0.018 0.044 0.042 0.095 0.046 -0.172 0.047 -0.114 0.047 -0.142 -0.078 0.047
1994-2003 Window (coeff, std. err): 0.044 -0.101 0.070 0.045 -0.088 0.049 -0.050 0.047 -0.045 0.048 -0.011 0.049
2004-2013 Window (coeff, std. err): 0.045 -0.154 0.043 0.132 0.046 -0.261 0.047 -0.225 0.046 -0.226 0.048 -0.173
Overall (coeff, std. err): 0.021 -0.103 0.020 0.054 0.021 -0.170 0.021 -0.158 0.021 -0.156 0.022 -0.151

Panel B: Estimates With Control Variables Included in the Model

Matching Measures; 1964-1973, 1974-1983, 1984-1993, 1994-2003, and 2004-2013 Windows and Overall (coeff, std. err for each)
MEXP%     0.056 -0.199 0.051 0.005 0.053 -0.078 0.048 -0.142 0.050 -0.097 0.023 -0.155
MISM      0.006 0.045 0.007 0.043 0.046 0.048 0.047 0.044 0.020 0.104 0.142 0.059
LDREV+    0.059 -0.158 0.050 -0.176 0.052 -0.068 0.052 -0.266 0.048 -0.168 0.023 -0.168
LDREV-    0.055 -0.173 0.049 -0.107 0.051 -0.028 0.051 -0.256 0.049 -0.144 0.023 -0.169
LGREV+    0.053 -0.259 0.048 -0.150 0.051 -0.007 0.052 -0.196 0.050 -0.163 0.023 -0.224
LGREV-    0.055 -0.308 0.047 -0.095 0.051 0.016 0.053 -0.160 0.051 -0.155 0.023 -0.259

This table reports coefficient estimates from regressions of earnings persistence, measured by firm over ten-year windows, on various measures reflecting how expenses are matched with revenues for a firm over these same ten-year rolling windows. Variable definitions for the matching measures and earnings persistence are provided in Table 1. Panel B control variables are average log of book-to-price ratio of equity (CEQ/(CSHO*PRCC_F)); average log of market value (CSHO*PRCC_F); average log of asset value (AT); sales variability measured as the standard deviation of a firm’s sales (SALE) over the ten-year window; and average log of operating cycle ((RECT/SALE) + (INVT/COGS)). Separate models are estimated for each matching measure (LDREV+ and LDREV- are treated as a single measure, as are LGREV+ and LGREV-). LDREV+ equals LDREV when LDREV > 0, and is 0 otherwise; LDREV- equals LDREV when LDREV < 0, and is 0 otherwise. LGREV+ equals LGREV when LGREV > 0, and is 0 otherwise; LGREV- equals LGREV when LGREV < 0, and is 0 otherwise. The first five columns report estimates for each of the five non-overlapping ten-year windows in our sample. The overall column pools all of the observations from these five windows.
All data in these regressions are measured using decile ranks formed within windows. Standard errors are reported in parentheses. In the case of the overall estimation these errors are cluster-adjusted by window.

Figure 1. Mean Firm-Specific Mean Squared Error (DMSE): 1964-2013
[Figure: DMSE (vertical axis) by Rolling Year End, 1973-2013 (horizontal axis)]
This figure reports mean values for DMSE for rolling windows covering data from 1964-2013. DMSE is the mean squared error (x 10^3) from regressing expense on contemporaneous revenue. Regressions are firm-specific time-series, estimated over 10-year windows.

Figure 2. Mean Dichev and Tang Coefficient (DTM): 1964-2013
[Figure: DTM (vertical axis) by Rolling Year End, 1973-2013 (horizontal axis)]
This figure reports mean values for DTM for rolling windows covering data from 1964-2013. DTM is the coefficient on contemporaneous expense from regressing revenue on lead, lag, and contemporaneous expense. Regressions are firm-specific time-series, estimated over 10-year rolling windows.

Figure 3. Mean Direct Regression Revenue Coefficient (D_SALES): 1964-2013
[Figure: D_SALES (vertical axis) by Rolling Year End, 1973-2013 (horizontal axis)]
This figure reports mean values for D_SALES for rolling windows covering data from 1964-2013. D_SALES is the estimated coefficient on revenue from regressing expense on contemporaneous revenue. Regressions are firm-specific time-series, estimated over 10-year windows.

Figure 4. Mean % of Expense Explained by Revenue (MEXP%): 1964-2013
[Figure: MEXP% (vertical axis) by Rolling Year End (horizontal axis)]
This figure reports mean values for MEXP% for rolling windows covering data from 1964-2013. MEXP% is the percentage of a firm’s expense over the rolling window that is attributable to revenue over this same window, based on the estimated revenue coefficient (D_SALES) obtained by regressing expense on revenue. Regressions are firm-specific time-series, estimated over 10-year windows.

Figure 5. Mean Variation Explained by Lead and Lag Sales (MISM): 1964-2013
[Figure: MISM (vertical axis) by Rolling Year End, 1973-2012 (horizontal axis)]
This figure reports mean values for MISM for rolling windows covering data from 1964-2013. MISM is the mean squared error (x 10^3) from regressing expense on contemporaneous revenue less the mean squared error (x 10^3) from regressing expense on lead, lag, and contemporaneous revenue. Both regressions are firm-specific time-series, estimated over 10-year rolling windows.

Figure 6. Mean Coefficients on Lead (LDREV) and Lag (LGREV) Sales: 1964-2013
[Figure: Coefficient Value (vertical axis) for the LDREV and LGREV series by Rolling Year End, 1973-2013 (horizontal axis)]
This figure reports mean values for the coefficients on lead (LDREV) and lag (LGREV) revenue in regressions of expense on lead, lag, and contemporaneous revenue. Regressions are firm-specific time-series, estimated over 10-year rolling windows.
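
The rolling-window means plotted in the figures can be sketched as follows, using D_SALES (Figure 3) as the firm-level measure. This is an illustrative sketch under stated assumptions, not the authors' procedure. The function names and the `panel` layout (`{firm: {year: (revenue, expense)}}`) are hypothetical, and the selection of the 500 largest non-financial firms at the start of each window is omitted. Firms without data for every year of a window are skipped.

```python
from statistics import mean


def d_sales(expense, revenue):
    """Slope from regressing expense on contemporaneous revenue (with intercept).
    Assumes revenue is not constant within the window."""
    mr, me = mean(revenue), mean(expense)
    sxx = sum((r - mr) ** 2 for r in revenue)
    return sum((r - mr) * (e - me) for r, e in zip(revenue, expense)) / sxx


def rolling_windows(first_year, last_year, length=10):
    """All (start, end) pairs of `length` consecutive years within the sample."""
    return [(end - length + 1, end)
            for end in range(first_year + length - 1, last_year + 1)]


def mean_d_sales_by_window(panel, first_year, last_year, length=10):
    """Cross-firm mean of firm-specific D_SALES, keyed by the window's end year."""
    out = {}
    for start, end in rolling_windows(first_year, last_year, length):
        years = range(start, end + 1)
        estimates = []
        for series in panel.values():
            if all(y in series for y in years):  # require complete data in the window
                rev = [series[y][0] for y in years]
                exp = [series[y][1] for y in years]
                estimates.append(d_sales(exp, rev))
        if estimates:
            out[end] = mean(estimates)
    return out


if __name__ == "__main__":
    # Hypothetical two-firm panel with exact expense-to-revenue ratios of 0.9 and 0.8
    panel = {
        "A": {y: (1.0 + 0.1 * (y - 1964), 0.9 * (1.0 + 0.1 * (y - 1964)))
              for y in range(1964, 1976)},
        "B": {y: (2.0 + 0.05 * (y - 1964), 0.8 * (2.0 + 0.05 * (y - 1964)))
              for y in range(1964, 1976)},
    }
    for end_year, value in mean_d_sales_by_window(panel, 1964, 1975).items():
        print(end_year, round(value, 3))
```

Keying each window by its final year mirrors the "Rolling Year End" axis of the figures: a 1964-2013 sample yields windows ending in 1973 through 2013.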