A VALIDATOR'S GUIDE TO MODEL RISK MANAGEMENT
BEST PRACTICES FOR COST-EFFECTIVE REGULATORY COMPLIANCE
Release 1.0, January 2017

TABLE OF CONTENTS

MODEL VALIDATION: IS THIS SPREADSHEET A MODEL?
VALIDATING VENDOR MODELS – SPECIAL CONSIDERATIONS
PREPARING FOR MODEL VALIDATION – IDEAS FOR MODEL OWNERS
4 QUESTIONS TO ASK WHEN DETERMINING MODEL VALIDATION SCOPE
PERFORMANCE TESTING: BENCHMARKING VS BACKTESTING
VALIDATING MODEL INPUTS – HOW MUCH IS ENOUGH?
CONTRIBUTORS
ABOUT RISKSPAN

MODEL VALIDATION: IS THIS SPREADSHEET A MODEL?

As model validators, we frequently find ourselves in the middle of debates between spreadsheet owners and enterprise risk managers over the question of whether a particular computing tool rises to the level of a "model." To the uninitiated, the semantic question, "Is this spreadsheet a model?" may appear to be largely academic and inconsequential. But its ramifications are significant, and getting the answer right is of critical importance to model owners, to enterprise risk managers, and to regulators.

STAKEHOLDERS OF MODEL VALIDATION

In the most important respects, the incentives of these stakeholder groups are aligned. Everybody has an interest in knowing that the spreadsheet in question is functioning as it should and producing accurate and meaningful outputs. Appropriate steps should be taken to ensure that every computing tool does this, regardless of whether it is ultimately deemed a model.

But classifying something as a model carries with it important consequences related to cost and productivity, as well as overall model risk management. It is here that incentives begin to diverge. Owners and users of spreadsheets in particular are generally inclined to classify them as simple applications or end-user computing (EUC) tools whose reliability can (and ought to) be ascertained using testing measures that do not rise to the level of the formal model validation procedures required by regulators.[1] These formal procedures can be both expensive for the institution and onerous for the model owner. Models require meticulous documentation of their approach, economic and financial theory, and code. Painstaking statistical analysis is frequently required to generate the necessary developmental evidence, and further cost is then incurred to validate all of it.

Enterprise risk managers and regulators, who do not necessarily feel these added costs and burdens, may be inclined to err on the side of classifying spreadsheets as models "just to be on the safe side." But incurring unnecessary costs is not a prudent course of action for a financial institution (or any institution), and producing more model validation reports than necessary can have other unintended, negative consequences. Model validations pull model owners away from their everyday work, adversely affecting productivity and, sometimes, quality of work. Virtually every model validation report identifies issues that must be reviewed and addressed by management. Too many unnecessary reports containing findings that are comparatively unimportant can bury enterprise risk managers and distract them from the most urgent findings.

[1] In the United States, most model validations are governed by one of the following sets of guidelines: 1) OCC 2011-12 (institutions regulated by the OCC), 2) FRB SR 11-7 (institutions regulated by the Federal Reserve), and 3) FHFA Advisory Bulletin 2013-07 (Fannie Mae, Freddie Mac, and the Federal Home Loan Banks). These documents have much in common, and the OCC and FRB guidelines are identical to one another.
DEFINITION OF A MODEL

So what, then, are the most important considerations in determining which spreadsheets are in fact models that should be subject to formal validation procedures? OCC and FRB guidance on model risk management defines a model as follows:[2]

A quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.

The same guidance refers to models as having three components:

1. An information input component, which delivers assumptions and data to the model
2. A processing component, which transforms inputs into estimates
3. A reporting component, which translates the estimates into useful business information

This definition and guidance leave managers with some latitude. Financial institutions employ many applications that apply mathematical concepts to defined inputs in order to generate outputs. But the existence of inputs, outputs, and mathematical concepts alone does not necessarily justify classifying a spreadsheet as a model.

Note that the regulatory definition of a model includes the concept of quantitative estimates. The term quantitative estimate implies a level of uncertainty about the outputs. If an application generates outputs about which there is little or no uncertainty, then one can argue that the output is not a quantitative estimate but, rather, simply a defined arithmetic result. While quantitative estimates typically result from arithmetic processes, not every defined arithmetic result is a quantitative estimate.

For example, a spreadsheet that sums the known balances of ten bank accounts as of a given date, even if it is supplied by automated feeds and performs the summations in a completely lights-out process, likely would not rise to the level of a model requiring validation because it is performing a simple arithmetic function; it is not generating a quantitative estimate.[3] In contrast, a spreadsheet that projects what the sum of the same ten bank balances will be as of a given future date (based on assumptions about interest rates, expected deposits, and decay rates, for example) generates quantitative estimates and would therefore qualify as a model requiring validation. Management and regulators would want comfort that the assumptions used by this spreadsheet model are reasonable and that they are being applied and computed appropriately.

[2] See footnote 1.
[3] Management would nevertheless want assurance that such an application was functioning correctly. This, however, can be achieved via means less intrusive than a formal model validation process, such as conventional auditing, SOX reviews, or EUC quality gates.
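To make the distinction concrete, here is a minimal Python sketch (all figures and function names are hypothetical) contrasting the two spreadsheets described above: the first function returns a defined arithmetic result, while the second layers assumptions on top of the arithmetic and therefore produces a quantitative estimate.

```python
# Minimal sketch (hypothetical figures) contrasting a defined arithmetic
# result with a quantitative estimate. Names and assumptions are
# illustrative, not any particular institution's spreadsheet logic.

def current_total(balances):
    """Defined arithmetic result: the sum of known balances.

    There is only one defensible answer, so this is not a model."""
    return sum(balances)


def projected_total(balances, annual_rate, monthly_deposit, decay_rate,
                    months=12):
    """Quantitative estimate: projects the future total using assumed
    interest, deposit, and decay rates. Different reasonable assumptions
    yield different answers, which is what makes this a model."""
    total = sum(balances)
    for _ in range(months):
        total *= 1 + annual_rate / 12   # assumed interest accrual
        total += monthly_deposit        # assumed new deposits
        total *= 1 - decay_rate         # assumed monthly attrition
    return total


balances = [12_500.00, 8_200.00, 30_000.00]          # known as of today
print(current_total(balances))                       # exact and verifiable
print(projected_total(balances, 0.02, 500.0, 0.01))  # an estimate
```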
IS THIS SPREADSHEET A MODEL?

We have found the following questions to be particularly enlightening in helping our clients determine whether a spreadsheet should be classified as 1) a model that transforms inputs into quantitative estimates or 2) a non-model spreadsheet that generates defined arithmetic results.

QUESTION 1: DOES THE SPREADSHEET PRODUCE A DEMONSTRABLY "RIGHT" ANSWER?

A related question is whether benchmarking yields results that are comparable, as opposed to exactly the same. If spreadsheets designed by ten different people can reasonably be expected to produce precisely the same result (because there is only one generally accepted way of calculating it), then the result probably does not qualify as a quantitative estimate and the spreadsheet probably should not be classified as a model.

Example 1 (Non-Model): Mortgage amortization calculator. Ten different applications would be expected to transform the same loan amount, interest rate, and term information into precisely the same amortization table. A spreadsheet that differed from this expectation would be considered "wrong." We would not consider this output to be a quantitative estimate and would be inclined to classify such a spreadsheet as something other than a model.

Example 2 (Model): Spreadsheet projecting the expected UPB of a mortgage portfolio in 12 months. Such a spreadsheet would likely need to apply and incorporate prepayment and default assumptions. Different spreadsheets could compute and apply these assumptions differently, without any one necessarily being recognized as "wrong." We would consider the resulting UPB projections to be quantitative estimates and would be likely to classify such a spreadsheet as a model.

Note that the spreadsheets in both examples tell their users what a loan balance will be in the future. But only the second example layers economic assumptions on top of its basic arithmetic calculations. Economic assumptions can be subjected to verification after the fact, which relates to our second question:

QUESTION 2: CAN THE SPREADSHEET'S OUTPUT BE BACK-TESTED?

Another way of stating this question would be, "Is back-testing required to gauge the accuracy of the spreadsheet's outputs?" This is a fairly unmistakable indicator of a forward-looking quantitative estimate. A spreadsheet that generates forward-looking estimates is almost certainly a model and should be subjected to formal model validation. Back-testing would not be of any particular value in our first (non-model) example, above, as the spreadsheet is simply calculating a schedule. In our second (model) example, however, back-testing would be an invaluable tool for judging the reliability of the prepayment and default assumptions driving the balance projection.

QUESTION 3: IS THE SPREADSHEET SIMPLY APPLYING A DEFINED SET OF BUSINESS RULES?

Spreadsheets are sometimes used to automate the application of defined business rules in order to arrive at a prescribed course of action. This question is a corollary to the first question about whether the spreadsheet produces output that is, by definition, "correct." Examples of business-rule calculators are spreadsheets that determine a borrower's eligibility for a particular loan product or loss mitigation program. Such spreadsheets are also used to determine how much of a haircut to apply to various collateral types based on defined rules. These spreadsheets do not generate quantitative estimates, and we would not consider them models subject to formal regulatory validation.
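As an illustration, here is a minimal sketch of a business-rule calculator of the kind described above; the eligibility thresholds are invented for illustration, not any actual program's criteria. Every input combination maps to exactly one prescribed outcome, so the output is a defined result rather than a quantitative estimate.

```python
# Minimal sketch of a business-rule calculator; thresholds are invented
# for illustration, not any actual program's criteria.

def loan_program_eligibility(fico, ltv, dti):
    """Applies defined business rules. Every input combination maps to
    exactly one prescribed outcome, so no quantitative estimate (and
    hence no model, in the regulatory sense) is involved."""
    if fico >= 680 and ltv <= 0.80 and dti <= 0.43:
        return "eligible"
    if fico >= 620 and ltv <= 0.95 and dti <= 0.45:
        return "eligible with mortgage insurance"
    return "ineligible"


print(loan_program_eligibility(fico=700, ltv=0.75, dti=0.40))  # eligible
```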
SHOULD I VALIDATE THIS SPREADSHEET?

All spreadsheets that perform calculations should be subject to review. Any spreadsheet that produces incorrect or otherwise unreliable outputs should not be used until its errors are corrected. Formal model validation procedures, however, should be reserved for spreadsheets that meet certain criteria. Subjecting non-model spreadsheets to model validation unnecessarily drives up costs and dilutes the findings of bona fide model validations by cluttering enterprise risk management's radar with an unwieldy number of formal issues requiring tracking and resolution.

Spreadsheets should be classified as models (and validated as such) when they produce forward-looking estimates that can be back-tested. This excludes simple calculators that do not rely on economic assumptions, as well as applications that merely apply business rules to produce outputs that can be definitively identified, before the fact, as "right" or "wrong." We believe that the systematic application of these principles will alleviate much of the tension between spreadsheet owners, enterprise risk managers, and regulators as they work together to identify those spreadsheets that should be subject to formal model validation.

VALIDATING VENDOR MODELS – SPECIAL CONSIDERATIONS

Many of the models we validate on behalf of our clients are developed and maintained by third-party vendors. These validations present a number of complexities that are less commonly encountered when validating "home-grown" models. These often include:

1. Inability to interview the model developer
2. Inability to review the model code
3. Inadequate documentation
4. Lack of developmental evidence and data sets
5. Lack of transparency into the impact of custom settings

Notwithstanding these challenges, the OCC's Supervisory Guidance on Model Risk Management (OCC 2011-12) specifies that "Vendor products should nevertheless be incorporated into a bank's broader model risk management framework following the same principles as applied to in-house models, although the process may be somewhat modified." The extent of these modifications depends on the complexity of the model and the cooperation afforded by the model's vendor. We have found the following general principles and practices to be useful.

VALIDATING VENDOR MODELS

Vendor documentation is not a substitute for model documentation. Documentation provided by model vendors typically includes user guides and other materials designed to help users navigate applications and make sense of outputs. These documents are written for a diverse group of model users and are not designed to identify and address particular model capabilities specific to the purpose and portfolio of an individual bank. A bank's model documentation package should delve into its specific implementation of the model, as well as the following:

- Discussion of the model's purpose and specific application, including business and functional requirements achieved by the model
- Discussion of model theory and approach, including algorithms, calculations, formulas, functions, and programming
- Description of the model's structure
- Identification of model limitations and weaknesses
- Comprehensive list of inputs and assumptions, including their sources
- Comprehensive list of outputs and reports and how they are used, including downstream systems that rely on them
- Description of testing (benchmarking and back-testing)

Because documentation provided by the vendor is likely to include few, if any, of these items, it falls to the model owner (at the bank) to generate this documentation. While some of these items (specific algorithms, calculations, formulas, and programming, for example) are likely to be deemed proprietary and will not be disclosed by the vendor, most of these components are obtainable and should be requested and documented. Model documentation should also clearly lay out all model settings (e.g., knobs) and the justification for the use of (or departure from) vendor default settings.
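One lightweight way to meet this last expectation is a settings inventory maintained alongside the model documentation. The sketch below is only illustrative; all setting names and values are hypothetical, not an actual vendor's parameters.

```python
# Hypothetical settings inventory for a vendor model. All setting names
# and values are invented for illustration.

model_settings = [
    {
        "setting": "prepay_tuning_multiplier",
        "vendor_default": 1.00,
        "value_in_use": 1.15,
        "justification": "Calibrated to the bank's own 2014-2016 "
                         "prepayment experience; see back-test memo.",
    },
    {
        "setting": "severity_curve",
        "vendor_default": "national",
        "value_in_use": "national",
        "justification": "Vendor default retained; portfolio is "
                         "geographically diversified.",
    },
]

# Flag departures from vendor defaults so each one is documented.
for s in model_settings:
    flag = "DEPARTURE" if s["value_in_use"] != s["vendor_default"] else "default"
    print(f"{s['setting']} [{flag}]: {s['justification']}")
```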
Testing results should be requested of the vendor. OCC 2011-12 states that "Banks should expect vendors to conduct ongoing performance monitoring and outcomes analysis, with disclosure to their clients, and to make appropriate modifications and updates over time." Many vendors publish the results of their own internal testing of the model. For example, a prepayment model vendor is likely to publish back-testing results comparing the model's forecasts for certain loan cohorts against actual, observed prepayments. An automated valuation model (AVM) vendor might publish the results of testing comparing the property values it generates against sales data. If a model's vendor does not publish this information, model validators should request it and document the response in the model validation report. Where available, this information should be obtained and incorporated into the model validation process, along with a discussion of its applicability to the data the bank is modeling. Model validators should attempt to replicate the results of these studies, where feasible, and use them to enhance their own independent benchmarking and back-testing activities.

Developmental evidence should be requested of the vendor. OCC 2011-12 directs banks to "require the vendor to provide developmental evidence explaining the product components, design, and intended use." This should be incorporated into the bank's model documentation. Where feasible, model validators should also ask model vendors to provide information about the data sets that were used to develop and test the model.

Contingency plans should be maintained. OCC 2011-12 cites the importance of a bank's having "as much knowledge in-house as possible, in case the vendor or the bank terminates the contract for any reason, or if the vendor is no longer in business. Banks should have contingency plans for instances when the vendor model is no longer available or cannot be supported by the vendor." For simple applications whose inner workings are well understood and replicable, a contingency plan may be as simple as a Microsoft Excel workbook. This requirement can pose a significant challenge, however, for banks that purchase off-the-shelf asset-liability and market risk models and do not have the in-house expertise to quickly and adequately replicate these models' complex computations. Situations such as this argue for the implementation of reliable challenger models, which not only assist in meeting benchmarking requirements but can also function as a contingency backup.

Consult the model risk management group when procuring any application that might be classified as a "model." In a perfect world, model validation considerations would be contemplated as part of the procurement process. An agreement to provide developmental evidence, testing results, and cooperation with future model validation efforts would ideally figure into the negotiations before the purchase of any application is finalized. Unfortunately, our experience has shown that banks often acquire what they think of as a simple third-party application, only to be informed after the fact, by either a regulator or the model risk management group, that they have in fact purchased a model requiring validation. A model vendor, particularly one not inclined to think of its product as a "model," may not be as responsive to requests for development and testing data after the sale if those items were not made a condition of the sale.
It is therefore prudent for procurement departments to maintain open lines of communication with model risk management groups so that the right questions can be asked and requirements established prior to acquisition.

PREPARING FOR MODEL VALIDATION – IDEAS FOR MODEL OWNERS

Though not its intent, model validation can be disruptive to model owners and others seeking to carry out their day-to-day work. We have performed enough model validations over the past decade to have learned how cumbersome the process can be for business unit model owners and others we inconvenience with what at times must feel like an endless barrage of touch-point meetings, documentation requests, and other questions relating to modeling inputs, outputs, and procedures. We recognize that the only thing these business units did to deserve this inconvenience was to devise or procure a methodology for systematically improving how something gets estimated.

In some cases, the business owner of an application tagged for validation may view it simply as a calculator or other tool, and not as a "model." And in some cases we agree with the business owner. But in every case, the system under review has been designated as a model requiring validation either by an independent risk management department within the institution or (worse) by a regulator, and so the validation project must be completed. As with so many things in life, when it comes to model validation preparation, an ounce of prevention goes a long way. Here are some ideas model owners might consider for making their next model validation a little less stressful.

OVERALL MODEL DOCUMENTATION

Among the first questions we ask at the beginning of a model validation is whether the model has been validated before. In reality, however, we can make a fairly reliable guess about the model's validation history simply by reading the model owner's documentation. A comprehensive set of documentation that clearly articulates the model's purpose, its inputs' sources, how it works, what happens to the outputs, and how the outputs are monitored is an almost sure sign that the model in question has been validated multiple times. In contrast, it is generally apparent that a model is being validated for the first time when our initial request for documentation yields one or more of the following:

- An 800-page user guide from the model's vendor, but no internally developed documentation or procedures
- Incomplete (or absent) lists of model inputs, with little or no discussion of how inputs and assumptions are obtained, verified, or used in the model
- No discussion of the model's limitations
- Perfunctory monitoring procedures, such as, "The outputs are reviewed by an analyst for reasonableness"
- Vague (or absent) descriptions of the model's outputs and how they are used
- Change logs with just one or two entries

No one likes to write model documentation, and there never seems to be enough time to do it. Compounding this challenge is the fact that model validations frequently seem to occur at the most inopportune moments for model owners. A bank's DFAST models, for example, often undergo validation while the business owners who use them are busy preparing the bank's DFAST submission. This is not the best time to be tweaking documentation and assembling data for validators. Documentation would ideally be prepared during periods of lower operational stress.
Model owners can accomplish this by anticipating and staying ahead of requests from model risk management, independently generating documentation for all their models that satisfies the following basic criteria:

- Identifies the model's purpose, including its business and functional requirements, and who is responsible for using and maintaining the model
- Comprehensively lists and justifies the model's inputs and assumptions
- Describes the model's overall theory and approach, i.e., how the model goes about transforming the inputs and assumptions into reliable outputs (including VBA or other computer code if the model was developed in house)
- Lays out the developmental evidence supporting the model
- Identifies the limitations of the model
- Explains how the model is controlled: who can access it, who can change it, and what approvals are required for different types of changes
- Comprehensively identifies and describes the model's outputs, how they are used, and how they are tested

Any investment of time beforehand to incorporate the items above into the model's documentation will pay dividends when the model validation begins. Being able to simply hand this information over to the validators will likely save model owners hours of attending follow-up meetings and fielding requests. Additional suggestions for getting the model's inputs and outputs in order follow below.

MODEL INPUTS

All of the model's inputs and assumptions need to be explicitly spelled out, along with their relevance to the model, their source(s), and any processes used to determine their reliability. Simply emailing an Excel file containing the model and referring the validator to the 'Inputs' tab is probably going to result in more meetings, more questions, and more time siphoned out of the model owner's workday by the validation team. A useful approach for consolidating inputs and assumptions that might be scattered around different areas of the model involves the creation of a simple table that captures everything a validator is likely to ask about each of the model's inputs and assumptions:

| Input/Assumption | Location (screen/tab) | Source | Purpose | How Verified |
|---|---|---|---|---|
| 2-Yr/10-Yr Swap Rates | 'YldCrv' tab | Bloomberg | Forecast secondary mortgage rates | Weekly spot check of two random values against TradeWeb |
| CPR Curve | Prepayment screen | AD-Co | Forecast prepayments | Back-test study provided by vendor |
| Input 2 | … | … | … | … |
| Assumption 2 | … | … | … | … |

Systematically capturing all of the model's inputs and assumptions in this way enables the validators to quickly take inventory of what needs to be tested without having to subject the model owner to a time-consuming battery of questions designed to make sure they haven't missed anything.

MODEL OUTPUTS

Being prepared to explain to the validator all the model's outputs individually, and how each is used in reporting and downstream applications, greatly facilitates the validation process. Accounting for all the uses of every output becomes more complicated when outputs are used outside the business unit, including as inputs to another model. At the discretion of the institution's model risk management group, it may be sufficient to limit this exercise to uses within the model owner's purview and to reports provided to management. As with inputs, this can be facilitated by a table:

| Output | Location (screen/tab) | Report(s) containing it | Output Purpose | Benchmarking/Back-testing Procedure |
|---|---|---|---|---|
| 1-day VaR | 'Output' tab | Daily VaR Report | Capture potential daily loss with 99% confidence | Back-testing of VaR vs. actual P/L over 12-month look-back |
| Market Value | Position screen | Portfolio Summary | Report current portfolio value | Benchmarked monthly against pricing service |
| Option-Adjusted Spread | Risk Metrics screen | Risk Report | Discount rate determination | Benchmarked semi-annually against Bloomberg |
| Duration | Risk Metrics screen | Rate Shock Summary | Measure asset sensitivity to interest rate changes | Benchmarked semi-annually against Bloomberg |
| Convexity | Risk Metrics screen | Rate Shock Summary | Measure asset sensitivity to interest rate changes | Benchmarked semi-annually against Bloomberg |
| … | … | … | … | … |
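As a concrete illustration of the first row in the table above, the following sketch counts VaR exceptions over a look-back window. The P/L and VaR figures are hypothetical, and a production back-test would typically supplement the raw exception count with a statistical test such as Kupiec's proportion-of-failures test.

```python
# Minimal sketch of a 99% one-day VaR back-test by exception counting.
# The P/L and VaR series are hypothetical.

def var_exceptions(daily_pnl, daily_var):
    """Count days on which the realized loss exceeded the reported VaR.

    daily_pnl: realized profit/loss per day (losses are negative)
    daily_var: the 99% one-day VaR for that day (a positive loss amount)
    """
    return sum(1 for pnl, var in zip(daily_pnl, daily_var) if -pnl > var)


daily_pnl = [-1.2, 0.8, -0.3, -2.9, 1.1, -0.6]  # $MM, hypothetical
daily_var = [2.5, 2.5, 2.4, 2.6, 2.5, 2.4]      # $MM, hypothetical

n = len(daily_pnl)
breaches = var_exceptions(daily_pnl, daily_var)
print(f"{breaches} exception(s) in {n} days; ~{0.01 * n:.2f} expected at 99%")
```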
Outputs that directly impact financial statements are especially important. Model validators are likely to give these outputs particular scrutiny, and model owners would do well to be prepared to explain not only how such outputs are computed and verified, but also how the audit trails surrounding them are maintained. To the extent that outputs are subjected to regular benchmarking, back-testing, or sensitivity analyses, the results of these exercises should be gathered as well.

A SERIES OF SMALL INVESTMENTS

A model owner might look at these suggestions and conclude that they amount to a lot of work just to get ready for a model validation. We agree. Bear in mind, however, that the model validator is almost certain to ask for these things at some point during the validation, when, chances are, the model owner would rather have the flexibility to do her real job. Making a series of small time investments to assemble these items well in advance of the validators' arrival will not only make the validation more tolerable for model owners, but will likely improve the overall modeling process as well.

4 QUESTIONS TO ASK WHEN DETERMINING MODEL VALIDATION SCOPE

Model risk management is a necessary undertaking for which model owners must prepare on a regular basis. Model risk managers frequently struggle to strike an appropriate cost-benefit balance in determining whether a model requires validation, how frequently a model needs to be validated, and how detailed subsequent and interim model validations need to be. The extent to which a model must be validated is a decision that affects many stakeholders in terms of both time and dollars. Everyone has an interest in knowing that models are reliable, but bringing the time and expense of a full model validation to bear on every model, every year is seldom warranted. Under what circumstances will a limited-scope validation do, and what should that validation look like? We have identified four considerations that can inform your decision on whether a full-scope model validation is necessary:

1. What about the model has changed since the last full-scope validation?
2. How have market conditions changed since the last validation?
3. How mission-critical is the model?
4. How often have manual overrides of model output been necessary?

WHAT CONSTITUTES A MODEL VALIDATION

Comprehensive model validations[1] consist of three main components: conceptual soundness, ongoing monitoring and benchmarking, and outcomes analysis and back-testing.[2] A comprehensive validation encompassing all these areas is usually required when a model is first put into use. Any validation that does not fully address all three of these areas is by definition a limited-scope validation.
[1] In the United States, most model validations are governed by the following sets of guidelines: 1) OCC 2011-12 (institutions regulated by the OCC) and 2) FRB SR 11-7 (institutions regulated by the Federal Reserve). These guidelines are effectively identical to one another. Model validations at government-sponsored enterprises, including Fannie Mae, Freddie Mac, and the Federal Home Loan Banks, are governed by FHFA Advisory Bulletin 2013-07, which, while different from the OCC and Fed guidance, shares many of the same underlying principles.

[2] Comprehensive validations of 'black box' models developed and maintained by third-party vendors are therefore problematic because the mathematical code and formulas are not typically available for review; in many cases a validator can only hypothesize the cause-and-effect relationships between the inputs and outputs based on a reading of the model's documentation.

Ideally, regular comprehensive validations are supplemented by limited-scope validations and outcomes analyses on an ongoing, interim basis to ensure that the model performs as expected.

KEY CONSIDERATIONS FOR MODEL VALIDATION

There is no one-size-fits-all test for determining when a comprehensive validation is necessary and when a limited-scope review will suffice. Beyond the obvious time and cost considerations, model validation managers would benefit from asking themselves at least four questions in making this determination:

QUESTION 1: WHAT ABOUT THE MODEL HAS CHANGED SINCE THE LAST FULL-SCOPE VALIDATION?

Many models layer economic assumptions on top of arithmetic equations. Most models consist of three principal components:

1. Inputs (assumptions and data)
2. Processing (the underlying mathematics and code that transform inputs into estimates)
3. Output reporting (processes that translate estimates into useful information)

Changes to either of the first two components are more likely to require a comprehensive validation than changes to the third. A change that materially impacts how the model output is computed, either by changing the inputs that drive the calculation or by changing the calculations themselves, is more likely to merit a comprehensive review than a change that merely affects how the model's outputs are interpreted.

For example, say a model assigns a credit rating to a bank's counterparties on a 100-point scale, and the requirements the bank establishes for a counterparty are driven by how the model rates it. Say the bank lends to counterparties that score between 90 and 100 with no restrictions, between 80 and 89 with pledged collateral, and between 70 and 79 with delivered collateral, and does not lend to counterparties scoring below 70. Consider two possible changes to the model:

1. A change in the model's calculations that results in what used to be a 65 now being a 79.
2. A change in the grading scale that results in a counterparty rated 65 now being deemed creditworthy.

While the second change impacts only the interpretation of model output and may require a limited-scope validation to determine whether the amended grading scale is defensible, the first change is almost certain to require that the validator go deeper 'under the hood' to verify that the model is working as intended. Assuming that the inputs did not change, the first type of change may be the result of changes to assumptions (e.g., weighting schemes) or simply the correction of a calculation error. The second is a change to the reporting component, for which a comparison of the model's forecasts to those of challenger models, along with back-testing against historical data, may be sufficient for validation.
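A minimal sketch of this example (with hypothetical weights and thresholds) helps locate where each type of change occurs: the first change alters the processing component that computes the score, while the second alters only the scale that interprets it.

```python
# Minimal sketch of the counterparty example above; the weights and
# financial inputs are hypothetical.

def score(financials, weights):
    """Processing component: transforms inputs into a 0-100 score.
    Changing `weights` changes the calculation itself and would likely
    trigger a full-scope validation."""
    return sum(weights[k] * financials[k] for k in weights)


def lending_terms(points, scale):
    """Reporting component: interprets the score. Changing `scale` may
    require only a limited-scope review of the new scale's defensibility."""
    for floor, terms in scale:
        if points >= floor:
            return terms
    return "do not lend"


scale = [(90, "no restrictions"),
         (80, "pledged collateral"),
         (70, "delivered collateral")]
weights = {"capital_ratio": 40, "liquidity": 35, "earnings": 25}
financials = {"capital_ratio": 0.9, "liquidity": 0.8, "earnings": 0.7}

points = score(financials, weights)                # 81.5
print(points, "->", lending_terms(points, scale))  # pledged collateral
```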
Not every change that affects model outputs necessarily requires a full-scope validation. The insertion of recently updated economic forecasts into a recently validated model may require only a limited set of tests to demonstrate that changes in the model estimates are consistent with the new economic forecast inputs. The magnitude of the impact on output also matters: a change to several input parameters that materially moves model output is more likely to require a full validation.

QUESTION 2: HOW HAVE MARKET CONDITIONS CHANGED SINCE THE LAST VALIDATION?

Even models that do not change at all require periodic, full-scope validations when macroeconomic conditions or other external factors call one or more of the model's underlying assumptions into question. The 2008 global financial crisis is a perfect example. Mortgage credit and prepayment models prior to 2008 were built on assumptions that appeared reasonable and plausible based on market observations up to that point. Statistical models based solely on historical data before, during, or after the crisis are likely to require full-scope validations as their underlying datasets are expanded to capture a more comprehensive array of observed economic scenarios.

It does not always have to be bad news in the economy that instigates model changes requiring full-scope validations. The federal funds rate has been hovering near zero since the end of 2008. With a period of gradual and sustained recovery potentially on the horizon, many models are beginning to incorporate rising interest rates into their current forecasts. These foreseeable model adjustments will likely require more comprehensive validations geared toward verifying that model outputs are appropriately sensitive to the revised interest rate assumptions.

QUESTION 3: HOW MISSION-CRITICAL IS THE MODEL?

The more vital a model's outputs are to financial statements or mission-critical business decisions, the greater the need for frequent and detailed third-party validations. Model risk is amplified when model outputs inform reports provided to investors, regulators, or compliance authorities. Particular care should be taken when deciding whether to only partially validate models with such high-stakes outputs. Models whose outputs are used for internal strategic planning are also important. That said, some models are more critical to a bank's long-term success than others. Ensuring the accuracy of the risk algorithms used for DFAST stress testing is more imperative than ensuring the accuracy of a model that predicts wait times in a customer service queue. Consequently, DFAST models, regardless of their complexity, are likely to require more frequent full-scope validations than models whose results undergo less scrutiny.

QUESTION 4: HOW OFTEN HAVE MANUAL OVERRIDES OF MODEL OUTPUT BEEN NECESSARY?

A final issue to consider is the use of manual overrides of the model's output. In cases where expert opinion is permitted to supersede model outputs on a regular basis, more frequent full-scope validations may be necessary in order to determine whether the model is performing as intended.
Counterparty credit scoring models, cited in our earlier example, are frequently subject to manual overrides by human underwriters to account for new or other qualitative information that cannot be processed by the model. The decision of whether to revise or re-estimate a model is frequently a function of how often such overrides are required and what their magnitude tends to be. Models whose outputs are frequently overridden should be subjected to more frequent full-scope validations. And models that are revised as a result of numerous overrides should also likely be fully validated, particularly when the revision includes significant changes to input variables and their respective weightings.

FULL OR PARTIAL MODEL VALIDATION

Model risk managers need to perform a delicate balancing act to ensure that an enterprise's models are sufficiently validated while keeping to a budget and not overly burdening model owners. In many cases, limited-scope validations are the most efficient means to this end. Such validations allow for the continuous monitoring of model performance without bringing in a Ph.D. with a full team of experts to opine on a model whose conceptual approach, inputs, assumptions, and controls have not changed since its last full-scope validation. While gray areas abound and the question of full versus partial validation needs to be addressed on a case-by-case basis, the four basic considerations outlined above can inform and facilitate the decision. Incorporating these considerations into your model risk management policy will greatly simplify the decision of how detailed your next model validation needs to be. An informed decision to perform a partial model validation can ultimately save your business the time and expense of executing a full model validation.

PERFORMANCE TESTING: BENCHMARKING VS BACKTESTING

When someone asks you what a model validation is, what is the first thing you think of? If you are like most, you immediately think of performance metrics: the quantitative indicators that tell you not only whether the model is working as intended, but also how its performance and accuracy hold up over time and compare to alternatives. Performance testing is the core of any model validation and generally consists of the following components:

- Benchmarking
- Back-testing
- Sensitivity analysis
- Stress testing

Sensitivity analysis and stress testing, while critical to any model validation's performance testing, will be covered in a future article. This post focuses on the relative virtues of benchmarking versus back-testing, seeking to define what each is, when and how each should be used, and how to make the best use of the results of each.

BENCHMARKING

Benchmarking compares the model being validated to some other model or metric. The type of benchmark employed will vary, as all model validation performance testing does, with the nature, use, and type of model being validated. Because of the performance information it provides, benchmarking should be employed in some form whenever a suitable benchmark can be found.

CHOOSING A BENCHMARK

Choosing what kind of benchmark to use within a model validation can sometimes be a daunting task. Like all testing within a model validation, the kind of benchmark to use depends on the type of model being tested.
Benchmarking takes many forms and may entail comparing the model's outputs to:

- The model's previous version
- An externally produced model
- A model built by the validator
- Other models and methodologies considered by the model developers but not chosen
- Industry best practice
- Thresholds and expectations of the model's performance

One of the most common benchmarking approaches is to compare a new model's outputs to those of the version it is replacing. It remains very common throughout the industry for models to be replaced due to deteriorating performance, a change in risk appetite, new regulatory guidance, the need to capture new variables, or the availability of new sets of information. In these cases, it is important not only to document but also to demonstrate that the new model performs better and does not exhibit the issues that triggered the old model's replacement.

Another common benchmarking approach compares the model's outputs to those of an external "challenger" model (or one built by the validator) that serves the same objective using the same data. This approach is likely to return more apt output comparisons than benchmarking against an older version, which may be out of date, because the challenger model is developed and updated with the same data as the champion model.

Another benchmark set that can be used for model validation comprises other models or methodologies reviewed by the model developers as possibilities for the model being validated but ultimately not used. As a best practice, model developers should always list any alternative methodologies, theories, or data omitted from the model's final version. Additionally, model validators should always leverage their experience and understanding of current industry best practices, along with any analysis previously completed on similar models. Model validators can then use these alternatives as benchmarks for the model being validated.

Model validators have multiple, distinct ways to incorporate benchmarking into their analysis. The use of the different types of benchmarking discussed here should be based on the type of model, its objective, and the validator's best judgment. If a model cannot reasonably be benchmarked, the validator should record why not and discuss the resulting limitations of the validation.

BACK-TESTING

Back-testing is used to measure model outcomes. Here, instead of measuring performance by comparison to another model, the validator specifically measures whether the model is both working as intended and accurate. Back-testing can take many forms, depending on the model's objective. As with benchmarking, back-testing should be part of every full-scope model validation to the extent possible.

WHAT BACK-TESTS TO PERFORM

As a form of outcomes analysis, back-testing provides quantitative metrics that measure the performance of a model's forecasts, the accuracy of its estimates, or its ability to rank-order risk. For instance, if a model produces forecasts for a given variable, back-testing involves comparing the model's forecast values against actual outcomes, thus indicating its accuracy. A related function of back-testing evaluates the ability of a given model to adequately measure risk. This risk can take any of several forms, from the probability that a given borrower will default to the likelihood of a large loss on a given trading day.
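For the forecasting case, a back-test can be as simple as comparing forecast values against realized outcomes and summarizing the error. The sketch below uses hypothetical CPR figures.

```python
# Minimal back-test sketch for a forecasting model: compare forecasts to
# realized outcomes and summarize the error. All values are hypothetical.

def back_test(forecasts, actuals):
    errors = [f - a for f, a in zip(forecasts, actuals)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n  # mean absolute error
    bias = sum(errors) / n                 # signed average error
    return mae, bias


forecast_cpr = [6.1, 6.4, 7.0, 7.3]  # model forecasts (% CPR)
actual_cpr = [5.8, 6.6, 7.4, 7.9]    # observed outcomes (% CPR)

mae, bias = back_test(forecast_cpr, actual_cpr)
print(f"MAE: {mae:.2f}; bias: {bias:+.2f}")
# A persistent negative bias (under-forecasting) would merit follow-up
# even if the MAE looks acceptable.
```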
To back-test a model's ability to capture risk exposure, it is important first to collect the right data. To back-test a probability of default model, for example, data would need to be collected containing cases where borrowers actually defaulted in order to test the model's predictions.

Back-testing models that assign borrowers to various risk levels necessitates some special considerations. Back-testing these and other models that seek to rank-order risk involves looking at the model's performance history and examining its ability to rank and order the risk accurately. This can involve analyzing both Type I (false positive) and Type II (false negative) statistical errors against the true positive and true negative rates for a given model. Common statistical tests used for this type of back-testing analysis include, but are not limited to, the Kolmogorov-Smirnov (KS) statistic, the Brier score, and the Receiver Operating Characteristic (ROC) curve.

BENCHMARKING VS. BACKTESTING

Back-testing measures a model's outcomes and accuracy against real-world observations, while benchmarking measures those outcomes against those of other models or metrics. Some overlap exists when benchmarking includes comparing how well different models' outputs back-test against real-world observations. This overlap sometimes leads people to conclude, mistakenly, that model validations can rely on just one method. In reality, back-testing and benchmarking should ideally be performed together in order to bring their individual benefits to bear in evaluating the model's overall performance. The decision, optimally, should not be whether to create a benchmark or to perform back-testing; rather, it should be what form both benchmarking and back-testing should take.

While benchmarking and back-testing are complementary exercises that should not be viewed as mutually exclusive, their outcomes sometimes appear to produce conflicting results. What should a model validator do, for example, if the model appears to back-test well against real-world observations but does not benchmark particularly well against similar models' outputs? What about a model that returns results similar to those of benchmark models but does not back-test well? In the first scenario, the model owner can derive a measure of comfort from the knowledge that the model performs well in hindsight. But the owner also runs the very real risk of being "out on an island" if the model turns out to be wrong. The second scenario affords the comfort of company in the model's projections. But what if the models are all wrong together?

Scenarios in which benchmarking and back-testing do not produce complementary results are not common, but they do happen. In these situations, it becomes incumbent on model validators to determine whether back-testing results should trump benchmarking results (or vice versa), or whether they should simply temper one another. The course to take may be dictated by circumstances. For example, a model validator may conclude that macroeconomic indicators are changing to the point that a model that back-tests favorably is not an advisable tool because it is not tuned to expected forward-looking conditions. This could explain why a model that back-tests favorably remains a benchmarking outlier: the benchmark models may be taking into account what the subject model is missing.
On the other hand, there are scenarios in which it is reasonable to conclude that back-testing results trump benchmarking results. After all, most firms would rather have an accurate model than one that lines up with all the others.

As this discussion shows, benchmarking and back-testing can produce distinct or similar metrics depending on the model being validated. While those differences or similarities can sometimes be significant, both benchmarking and back-testing provide critical, complementary information about a model's overall performance. So when approaching a model validation and determining its scope, the choice should be what form of benchmarking and back-testing needs to be done, rather than whether one needs to be performed instead of the other.

VALIDATING MODEL INPUTS – HOW MUCH IS ENOUGH?

In some respects, the OCC 2011-12/SR 11-7 mandate to verify model inputs could not be more straightforward: "Process verification … includes verifying that internal and external data inputs continue to be accurate, complete, consistent with model purpose and design, and of the highest quality available." From a logical perspective, this requirement is unambiguous and non-controversial. After all, the reliability of a model's outputs can be no better than the quality of its inputs. From a functional perspective, however, it raises practical questions about how much work needs to be done in order to consider a particular input "verified."

Take the example of a Housing Price Index (HPI) input assumption. Suppose the modeler obtains the HPI assumption from the bank's finance department, which purchases it from an analytics firm. What is the model validator's responsibility? Is it sufficient to verify that the HPI input matches the data of the finance department that supplied it? If not, is it enough to verify that the finance department's HPI data matches the data provided by its analytics vendor? If not, is it necessary to validate the analytics firm's model for generating HPI assumptions?

It depends. Just as model risk increases with greater model complexity, higher uncertainty about inputs and assumptions, broader use, and larger potential impact, input risk increases with input complexity and uncertainty. The risk of any specific input also rises as model outputs become increasingly sensitive to it.

VALIDATING MODEL INPUTS: BEST PRACTICES

So how much validation of model inputs is enough? As with the management of other risks, the level of validation or control should be dictated by the magnitude and impact of the risk. Like so much else in model validation, no one-size-fits-all approach applies to determining the appropriate level of validation of model inputs and assumptions. In addition to cost-benefit considerations, model validators should weigh at least four factors in mitigating the risk that input and assumption errors lead to inaccurate outputs:

- Complexity of inputs
- Manual manipulation of inputs from the source system prior to input into the model
- Reliability of source systems
- Relative importance of the input to the model's outputs (i.e., sensitivity)

CONSIDERATION 1: COMPLEXITY OF INPUTS

The greater the complexity of the model's inputs and assumptions, the greater the risk of errors.
For example, a complex yield curve with multiple data points is inherently subject to greater risk of inaccuracy than a binary "yes/no" input. In general, the more complex an input is, the more scrutiny it requires, and the "further back" a validator should look to verify its origin and reasonableness.

CONSIDERATION 2: MANUAL MANIPULATION OF INPUTS FROM THE SOURCE SYSTEM PRIOR TO INPUT INTO THE MODEL

Input data often requires modification from the source system to facilitate entry into the model. More handling and manual modification increases the likelihood of error. For example, if a position input is manually copied from Bloomberg and then manually reformatted to enable uploading to the model, there is a greater likelihood of error than if the position input is extracted automatically via an API. The accuracy of the input should be verified in either case, but the more manual handling and manipulation of data that occurs, the more comprehensive the testing should be. In this example, more comprehensive testing would likely take the form of a larger sample size. In addition, the controls over the processes used to extract, transform, and load data from a source system into the model affect the risk of error. More mature and effective controls, including automation and reconciliation, decrease the likelihood of error and therefore likely call for lighter verification procedures.

CONSIDERATION 3: RELIABILITY OF SOURCE SYSTEMS

More mature and stable source systems generally produce more consistently reliable results. Conversely, newer systems and those that have produced erroneous results increase the risk of error. The results of previous validations of inputs, whether from prior model validations or from third parties, including internal audit and compliance, can be used as an indicator of the reliability of information from source systems and the magnitude of input risk. The greater the number of issues identified, the greater the risk, and the deeper a validator should drill into the fundamental sources of the data.

CONSIDERATION 4: OUTPUT SENSITIVITY TO INPUTS

No matter how reliable an input's source system is deemed to be, or how much manual manipulation an input is subjected to, perhaps the most important consideration is the individual input's power to affect the model's outputs. Returning to our original example, if a 50 percent change in the HPI assumption has only a negligible impact on the model's outputs, then a quick verification against the report supplied by the finance department may be sufficient. If, however, the model's outputs are extremely sensitive to even small shifts in the HPI assumption, then additional testing is likely warranted, perhaps even including a validation of the analytics vendor's HPI model (along with all of its inputs).

A COST-EFFECTIVE MODEL INPUT VALIDATION STRATEGY

When it comes to verifying model inputs, there is no theoretical limit to the lengths to which a model validator can go. Model risk managers, who do not have unlimited time or budgets, would benefit from applying practical limits to validation procedures, using a risk-based approach to determine the most cost-effective strategies for ensuring that models are sufficiently validated. Applying the considerations listed above on a case-by-case basis will help validators appropriately define and scope model input reviews in a manner commensurate with sound risk management principles.
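Returning to the HPI example, a minimal sensitivity check might shock the input and compare the proportional impact on the output. The loss function below is a hypothetical stand-in for the model under validation, with invented figures; the larger the measured impact, the deeper the input verification should go.

```python
# Minimal sensitivity sketch: shock the HPI assumption and measure the
# impact on output. The loss function is a hypothetical stand-in for the
# model under validation, not an actual credit model.

def expected_loss(hpi_growth, balance=100.0):
    """Illustrative only: losses decline as home-price growth rises."""
    base_loss_rate = 0.02
    return balance * max(base_loss_rate - 0.1 * hpi_growth, 0.0)


base = expected_loss(hpi_growth=0.03)
shocked = expected_loss(hpi_growth=0.03 * 1.5)  # 50% shock to the input

impact = abs(shocked - base) / base
print(f"Output moves {impact:.0%} for a 50% shock to the HPI assumption")
# A negligible impact may justify a light check against the finance
# department's report; a large one argues for tracing the input all the
# way back to the vendor's HPI model.
```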
CONTRIBUTORS

Tim Willis is an experienced model validator and mortgage industry analyst with extensive experience consulting for financial institutions of various sizes. He has managed projects validating many different financial models and has developed business requirements for a range of technology solutions throughout the mortgage asset value chain. Tim is the author of numerous mortgage industry benchmarking and market research studies and regularly oversees consulting engagements for banks, loan servicers, the GSEs, and U.S. Government agencies. He has held previous positions at First American Corporation, Fannie Mae, and KPMG Consulting. Tim holds a Master of Business Administration from George Washington University and a Bachelor of Arts from Brigham Young University.

Chris Welsh has five years of experience in the financial services industry, including two years in mortgage banking. His areas of expertise include complex problem solving, quantitative analytics, and model validation. Prior to joining RiskSpan, Chris was an analyst with the Global Index Group at NASDAQ OMX, where his responsibilities included the design and modeling of custom index funds. He has also authored and published blogs, economic research reports, and industry pieces. He is proficient in Microsoft Excel (VBA) and R programming and has recently completed a base programming certification in SAS. He also has over 10 years of engineering experience, including project management duties. Chris holds dual master's degrees in Banking & Financial Services Management from Boston University and Physics from Clark University (Worcester, MA), as well as a bachelor's degree in Physics/Mathematics from Wheeling (WV) Jesuit College.

Steve Sloan, Director, CPA, CIA, CISA, CIDA, has extensive experience in the professional practices of risk management and internal audit, collaborating with management and audit committees to design and implement the infrastructure needed to obtain the required assurances over risk and controls. He prescribes a disciplined approach, aligning stakeholders' expectations with leading practices, to maximize the return on investment in risk functions. Steve holds a Bachelor of Science from Pennsylvania State University and holds multiple certifications.

Nick Young is a Quantitative Modeling Manager at RiskSpan with more than eight years of experience as a quantitative analyst and economist. At RiskSpan, Nick performs model development, validation, and governance on a wide variety of models, including those used for capital planning, reserve/impairment, credit origination and management, and CCAR/DFAST stress testing. Prior to joining RiskSpan, Nick worked at U.S. Bank, where he was a senior model risk manager with governance, model validation, and model development responsibilities for the Bank's Basel II Pillar 1 A-IRB models, Pillar II ICAAP models, credit origination and account management models, market risk value-at-risk (VaR) models, reserve and impairment models, and CCAR stress testing. He has a PhD (ABD) in economics from American University, an MA in economics from American University, and dual BAs in economics and political science, magna cum laude, with history and music minors, from the University of Oklahoma.

ABOUT RISKSPAN

Founded by industry veterans, RiskSpan offers end-to-end solutions for data management, risk management analytics, and visualization on a highly secure, fast, and fully scalable platform that has earned the trust of the industry's largest firms.
Combining the strengths of subject matter experts, quantitative analysts, and technologists, the RiskSpan platform integrates a range of data sets, both structured and unstructured, with off-the-shelf analytical tools to provide you with powerful insights and a competitive advantage. We empower our clients to make accurate, data-driven decisions based on facts and analysis. We make data beautiful.

CONTACT

Email Shawn Eliav at [email protected] to be connected with our experts.