
Exploitability/Priority Index Rating Systems
(Approaches, Value, and Limitations)
Carsten Eiram, Chief Research Officer, Risk Based Security
2013/07/02
Table of Contents
About Risk Based Security
Mission
Background
Discriminators
Introduction
Microsoft Exploitability Index
Advantages and Disadvantages
Prior publications discussing the Exploitability Index
Past Experience With Exploitability Index Changes
Arguing Exploitability Index Accuracy Based on Number of Changes
Statistics
Concerns and possible improvements
Adobe Systems Priority Ratings
Advantages and Disadvantages
Statistics
CVSS Exploitability Temporal Metric
Advantages and Disadvantages
Conclusion
About Risk Based Security
Mission
To equip clients with the technology and customized risk-based consulting solutions to turn security data into information and information into a competitive advantage.
Background
Risk Based Security, Inc., incorporated in 2011, was established to better support the users and contributors of the Open Security Foundation (OSF) with the technology to turn security data into a competitive advantage.

The OSF's wealth of historical data, combined with the interactive dashboards and analytics offered by Risk Based Security, provides a first-of-its-kind risk identification and security management tool.
Risk Based Security further complements the data analytics with risk-focused consulting services to address industry-specific information security and compliance challenges.
Discriminators
Risk Based Security offers a full set of analytics and user-friendly dashboards designed specifically to identify security risks by industry.

Risk Based Security is the only company that offers its clients a fully integrated solution – real-time information, analytical tools and purpose-based consulting.
Unlike other security information providers, Risk Based Security offers companies
comprehensive insight into data security threats and vulnerabilities most relevant to their
industry.
Introduction
In late 2008, Microsoft included a new rating system dubbed the "Exploitability Index" in their security bulletins as a complement to their existing severity ratings.

The intent of this new rating system was to improve risk assessment of released security updates and the prioritization of their installation by providing additional granularity. Supposedly, this was in response to requests from IT administrators in enterprises, who wanted to make well-informed patch management decisions in cases where it was not an option to simply install all security updates at once.
In early 2012, Adobe adopted a similar "Priority Rating" system as a complement to their existing severity ratings. Instead of taking Microsoft's approach of performing a technical exploitability assessment of each vulnerability, Adobe based their system on the likelihood of exploits surfacing, judged from the historical prevalence of exploits being developed for a product after patch release.
Another approach taken by CVSS (Common Vulnerability Scoring System) since its initial
release is the “Exploitability” temporal metric, which is based on the current availability of PoCs
and/or exploits. It should be noted, however, that this metric was never intended as some form of
prioritization index, but is nevertheless being used as such in some organizations.
Now, more than four years since the inception of Microsoft’s Exploitability Index, it seems
appropriate to review the value of this rating system as well as briefly discuss the other similar
initiatives to determine if these rating systems are meeting the goal of easing prioritization of
security update installation.
Microsoft Exploitability Index
In October 2008, Microsoft introduced the Exploitability Index [1] as a complement to their existing severity ratings.

The Exploitability Index is "a rating system that helps customers identify the likelihood that a specific vulnerability would be exploited within the first 30 days after bulletin release", and "each vulnerability rating is based on a thorough review by the MSRC Engineering team, as well as close cooperation with a number of key partners" [2].
The approach taken by Microsoft is essentially to perform an in-depth technical analysis of a vulnerability in order to determine how likely it is for an attacker to create a reliable, working exploit within the first 30 days of the bulletin's release.

The result is a rating from 1 to 3 for any vulnerability with a severity of "Important" or "Critical".
Rating | Definition
1 | Exploit code likely (originally: "Consistent exploit code likely") within first 30 days of bulletin release.
2 | Exploit code would be difficult to build (originally: "Inconsistent exploit code likely") within first 30 days of bulletin release.
3 | Exploit code unlikely (originally: "Functioning exploit code unlikely") within first 30 days of bulletin release.
In May 2011, Microsoft improved [3] upon the rating system by taking into account the added security mitigations and features built into the latest versions of their software. The result was a split rating: one rating is assigned for the latest version of the affected software and another for older releases.
[1] http://technet.microsoft.com/en-us/security/cc998259.aspx
[2] http://blogs.technet.com/b/msrc/archive/2011/05/05/exploitability-index-improvements-amp-advance-notification-service-for-may-2011-bulletin-release.aspx
[3] http://blogs.technet.com/b/msrc/archive/2011/05/05/exploitability-index-improvements-amp-advance-notification-service-for-may-2011-bulletin-release.aspx
A snippet from the Microsoft December 2012 Exploitability Index [4] illustrates this split.

It should also be noted that the "Key Notes" field may contain various relevant information from the security bulletin, e.g. details on a vulnerability having been publicly disclosed or currently being exploited in targeted attacks.
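To make the structure of such an index entry concrete, the following is a minimal sketch of how a post-May 2011 entry with split ratings and key notes could be represented in code. The class and field names are hypothetical illustrations for this paper, not part of any format Microsoft publishes, and the sample values are made up.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical representation of a single Exploitability Index entry after
# the May 2011 change: one rating for the latest software release and one
# for older releases. Ratings: 1 = exploit code likely, 2 = exploit code
# would be difficult to build, 3 = exploit code unlikely (all within the
# first 30 days of bulletin release).
@dataclass
class ExploitabilityIndexEntry:
    bulletin_id: str                       # e.g. "MSyy-nnn" (placeholder)
    cve_id: str                            # e.g. "CVE-yyyy-nnnn" (placeholder)
    rating_latest_software: Optional[int]  # 1-3, or None if the latest release is unaffected
    rating_older_software: Optional[int]   # 1-3, or None if older releases are unaffected
    key_notes: str = ""                    # e.g. "publicly disclosed", "used in targeted attacks"

# Illustrative values only - not taken from an actual bulletin.
entry = ExploitabilityIndexEntry("MSyy-nnn", "CVE-yyyy-nnnn",
                                 rating_latest_software=2,
                                 rating_older_software=1,
                                 key_notes="publicly disclosed")
```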
Advantages and Disadvantages
One of the primary advantages of a technical exploitability index is that it provides system administrators with an actual technical evaluation of the given vulnerability and how feasible it is to exploit it in a reliable manner. This added granularity certainly provides value, as severity ratings by themselves reflect a worst-case impact only; vulnerabilities unlikely to be exploitable for code execution are still rated as such - and for good reason, since it is important for severity ratings to reflect both the proven and potential impact of a vulnerability.
An exploitability score, therefore, makes it clear whether a vulnerability is only theoretically exploitable or can actually be exploited. It is, however, important to note that one researcher's non-exploitable issue may be another's exploitable vulnerability.
This leads to one of the major disadvantages of this approach: the resources and skills required to correctly assess the exploitability of a vulnerability. While some vulnerabilities are straightforward to analyze, others are very complex. By adopting a technical approach like Microsoft's, a vendor commits to spending significant resources and either employing skilled analysts with a solid understanding of exploitation or outsourcing the work. If customers cannot have faith in the ratings - either because the vendor's analysts do not have the required competencies, play it too safe, or downplay too many issues - the rating system loses its value.
[4] http://technet.microsoft.com/en-us/security/bulletin/ms12-dec
Prior publications discussing the Exploitability Index
Microsoft’s Exploitability Index has been discussed on a number of occasions.
Shortly after Microsoft announced the index, Immunity published a white paper [5], "A Bounds Check on the Microsoft Exploitability Index", discussing the concept of the index and its value. The conclusion of the white paper was that the index did add value when making risk assessments, but while the concept of the Exploitability Index was thoroughly discussed, Microsoft's implementation of it was not analysed to determine whether it was providing optimal value.
In late 2009, an article [6] titled "Microsoft correctly predicts reliable exploits just 27% of the time" was published by Gregg Keizer, discussing the value of the index based on statistics published by Microsoft. The article highlighted that Microsoft only correctly "predicted" in 27% of the cases whether a vulnerability with a score of "1" was exploited within 30 days. In said article, Andrew Storms, director of security operations at nCircle Network Security, asked: "So what's the point?", indicating that the Exploitability Index rating system was pointless based on the low accuracy of Microsoft's "predictions" and further concluding that: "With these numbers of false positives, we are in no better place than we were prior to the index, in respect to granularity".
What the article and Andrew Storms failed to keep in mind is that the purpose of the Exploitability Index is not to predict the future. It is a technical index that evaluates how feasible it is, from a technical standpoint, for an attacker to figure out a given vulnerability and have a working, reliable exploit ready within 30 days of the security bulletin's release. Similar concerns were raised by Dustin Trammell in a blog post [7] in response to the article.
While the article misses the point when referring to Microsoft's ratings as predictions and concluding that the index is not useful due to the number of false positives, it did touch on another very important point that relates to the value of the Exploitability Index. According to the article, Microsoft admitted to using a conservative approach during the assessment process. This raises the questions: "How conservative? Are they too concerned about being proven wrong? And is it to a point where it perhaps negatively impacts the value of the index?".
[5] http://www.microsoft.com/en-us/download/details.aspx?id=4537
[6] http://www.reuters.com/article/2009/11/04/urnidgns852573c400693880002576630073ead-idUS186777206820091104
[7] http://blog.dustintrammell.com/2009/11/05/microsoft-exploitability-index/
Past Experience With Exploitability Index Changes
While it is not possible to conclusively answer how concerned Microsoft is about being proven wrong, personal experience suggests that it definitely is a concern.
In September 2010, Microsoft published MS10-063 [8] for a vulnerability in the Uniscribe Font Parsing Engine with the Exploitability Index rating initially set to "2". While analysing the fix, it quickly became apparent to me that the vulnerability was in fact reliably exploitable [9], and I provided details [10] of this to Microsoft.

This eventually led to a reassessment [11] of the Exploitability Index rating, raising it to "1" - supposedly the first time a rating was increased. It was definitely not my impression that Microsoft took this correction lightly; instead they saw it as a major failure. If Microsoft and/or their external partners are overly concerned about being proven wrong, this could be reflected in the ratings and a higher number of "1" scores being assigned than appropriate.
Arguing Exploitability Index Accuracy Based on Number of Changes
Microsoft has argued for the accuracy of the Exploitability Index by referring to the number of times it has been required to update a score - either lowering it or raising it. An example from July 2009 [12]: "The Microsoft Exploitability Index has also proven an effective and reliable resource to help customers better assess risk. Of the 140 Exploitability Index ratings Microsoft provided from October 2008 to June 2009, only one had to be modified — a 99 percent reliability rate". However, this method of assessing the index is flawed for a number of reasons.
Firstly, researchers would rarely approach Microsoft to argue that a score is too high. While it is
easy to prove that a vulnerability can indeed be reliably exploited (“just” create a reliable exploit),
it is not as simple to prove that it is not possible. What researchers can prove is that they are not
able to create a reliable exploit with their technical skills and allotted time; that does not mean
someone more skilled and/or with more time (or just luck) cannot do so.
[8] http://technet.microsoft.com/en-us/security/bulletin/MS10-063
[9] https://twitter.com/carsteneiram/status/25120911041
[10] http://secunia.com/blog/137/
[11] http://news.softpedia.com/news/Microsoft-Reassesses-the-Exploitability-of-Two-Vulnerabilities-157816.shtml
[12] http://www.microsoft.com/en-us/news/press/2009/jul09/07-27blackhat09pr.aspx
If researchers cannot exploit a vulnerability rated as "1", they may for good reason assume that Microsoft knows something that they do not and see no reason to dispute the rating. Ultimately, there is hardly any incentive for researchers to approach Microsoft in an attempt to get a score lowered - only raised.
Secondly, as Microsoft admits to a conservative scoring, a disproportionate number of
vulnerabilities may be rated as “1”. If that is the case, it would only be possible to raise the score
further for a very small number of vulnerabilities.
All in all, this means that one would expect to only see very few ratings being updated.
Statistics
As discussed, evaluating the value of Microsoft's Exploitability Index by considering the scores as predictions is also a flawed approach. Even if no exploit surfaced for a vulnerability within 30 days of the security bulletin's release, it does not mean that the vulnerability could not have been exploited - researchers or attackers may simply have prioritized other vulnerabilities instead.
Proving the accuracy of the scores would require extensive analysis of each vulnerability in order to compare the scores against the technical characteristics of the vulnerability. Such an analysis is outside the scope of this paper, and even if done, it would only be an indication of which vulnerabilities this researcher would have been capable of exploiting - it would not rule out that others could fare better, as previously discussed.
It is, however, possible to gauge the value and granularity provided by the Exploitability Index by counting the number of security bulletins and vulnerabilities with a score of "1". For the Exploitability Index to provide the expected value, this number should not be too high. If it is, the index is not providing sufficient granularity or value when assessing patch prioritization.
In 2012, Microsoft published 83 security bulletins. Of these, 80.7% covered one or more
vulnerabilities with an exploitability score of “1”. System administrators would, therefore, have to
give around 4/5 of all patches the highest priority when patching.
Drilling down into the 83 security bulletins, the total number of vulnerabilities covered is 198 [13]. The share of vulnerabilities with a score of "1" is 74.7%, i.e. roughly 3 out of every 4 reported vulnerabilities. Taking into account Microsoft's additional granularity, where separate scores are assigned to the latest version of a product and to older releases, the percentages are 79.2% for older releases and 73.9% for the latest releases - not a significant difference.
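To illustrate how such percentages are derived, the following sketch counts rating-"1" entries over a hypothetical list of (bulletin, CVE, rating) records. The record format and sample values are invented for this example; the real figures above come from the published 2012 bulletins.

```python
# Sketch: share of bulletins with at least one rating-1 vulnerability,
# and share of rating-1 vulnerabilities overall. Sample data is made up.
records = [
    # (bulletin_id, cve_id, exploitability_rating)
    ("bulletin-A", "CVE-A-1", 1),
    ("bulletin-A", "CVE-A-2", 3),
    ("bulletin-B", "CVE-B-1", 2),
    ("bulletin-C", "CVE-C-1", 1),
]

all_bulletins = {bulletin for bulletin, _, _ in records}
bulletins_with_1 = {bulletin for bulletin, _, rating in records if rating == 1}
vulns_with_1 = [rating for _, _, rating in records if rating == 1]

print(f"Bulletins with a rating-1 vulnerability: {len(bulletins_with_1) / len(all_bulletins):.1%}")
print(f"Vulnerabilities rated 1: {len(vulns_with_1) / len(records):.1%}")
```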
Naturally, the Exploitability Index rating should be used in conjunction with the severity ratings and not standalone when determining proper patch prioritization, i.e. a security update with a "Critical" severity rating and an Exploitability Index rating of "1" should be prioritized higher than another with the same severity rating but an Exploitability Index rating of "2". Likewise, a security update with an "Important" severity rating and an Exploitability Index rating of "1" should be prioritized lower than one with a "Critical" severity rating and the same Exploitability Index rating.
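A minimal sketch of this combined ordering - severity first, Exploitability Index rating second - could look as follows; the severity encoding and function name are illustrative and not part of Microsoft's guidance.

```python
# Sketch: order updates by severity first, then by Exploitability Index
# rating (1 is most urgent). The severity encoding below is illustrative.
SEVERITY_ORDER = {"Critical": 0, "Important": 1, "Moderate": 2, "Low": 3}

def patch_priority_key(update):
    """Lower tuples sort first, i.e. should be patched earlier."""
    return (SEVERITY_ORDER[update["severity"]], update["exploitability_index"])

updates = [
    {"id": "update-A", "severity": "Important", "exploitability_index": 1},
    {"id": "update-B", "severity": "Critical", "exploitability_index": 2},
    {"id": "update-C", "severity": "Critical", "exploitability_index": 1},
]

# Prints update-C, update-B, update-A - matching the ordering described above.
for update in sorted(updates, key=patch_priority_key):
    print(update["id"])
```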
The number of security bulletins with a "Critical" severity rating - the highest possible - totals 35, i.e. 42.2% of all released security bulletins. Of these 35 "Critical" security bulletins, all but one covered one or more vulnerabilities with an Exploitability Index rating of "1". That means 97.1% of all critical security bulletins should be given the highest possible priority.
Concerns and possible improvements
With 97.1% of all critical security bulletins and 80.7% of all security bulletins covering one or more vulnerabilities with an Exploitability Index rating of "1", it is clear that while the concept of Microsoft's Exploitability Index definitely has value, in practice it provides little added granularity.
Microsoft describes [14] their exploitability rating of "1" in more detail in the Exploitability Index documentation.
[13] The number of vulnerabilities is based on the number of CVEs, for lack of a better way of counting them in this case. It should be noted that some CVEs may be assigned for more than a single vulnerability.
[14] http://technet.microsoft.com/en-us/security/cc998259.aspx
As accurately concluded in the Immunity paper: “An exploitability index is only as good as the
security researchers backing it. The validity of the Microsoft Exploitability Index, a seemingly
technical subject, in the end boils down to the human element: the people doing the research.”
Clearly, Microsoft needs to consider enhancing the current implementation. The first step is to evaluate whether they are being too conservative with their scores, or whether that many vulnerabilities really are consistently exploitable.

If the former, Microsoft should consider being less concerned with being proven wrong and rate vulnerabilities more realistically than conservatively, though this may not be a popular solution, since their "main measure for success is to make sure [they] avoid rating something in the index 'lower' than it actually should be once under full public review" [15].
The reasoning is that a too-low rating would mean Microsoft's customers had been exposed to more risk than communicated by the Exploitability Index, and while this makes sense, being too conservative clearly decreases or even eliminates the value added to the risk assessment process if the vast majority of patches are rated as "1". With too many bulletins rated as "1", customers may also end up becoming desensitized to the ratings, which may eventually be ignored altogether.
If the latter, Microsoft could make the Exploitability Index more valuable by e.g. shortening the considered window from bulletin release to expected exploit availability (e.g. from 30 days to 14 days). Alternatively, Microsoft could look into adding further granularity by adopting an additional metric similar to those of Adobe and/or CVSS. Both of these approaches are briefly discussed next.
[15] http://blogs.technet.com/b/ecostrat/archive/2008/11/13/one-month-analysis-exploitability-index.aspx
Adobe Systems Priority Ratings
In March 2012, Adobe introduced their "Priority Rating" concept as a complement to the existing severity ratings.
Similar to Microsoft’s concept with the Exploitability Index, the idea is to provide customers with
more granularity to improve risk assessment and patch prioritization. The approach used by
Adobe, however, differs from Microsoft’s in that Adobe does not specifically evaluate the
technical details of a given vulnerability in order to assess its exploitability. Instead, Adobe
assigns priority ratings based on how prevalent attacks historically are against the affected
software.
The following table [16] illustrates Adobe's three priority ratings and their definitions.

Rating | Definition
Priority 1 | This update resolves vulnerabilities being targeted, or which have a higher risk of being targeted, by exploit(s) in the wild for a given product version and platform. Adobe recommends administrators install the update as soon as possible (for instance, within 72 hours).
Priority 2 | This update resolves vulnerabilities in a product that has historically been at elevated risk. There are currently no known exploits. Based on previous experience, we do not anticipate exploits are imminent. As a best practice, Adobe recommends administrators install the update soon (for instance, within 30 days).
Priority 3 | This update resolves vulnerabilities in a product that has historically not been a target for attackers. Adobe recommends administrators install the update at their discretion.

[16] http://www.adobe.com/support/security/severity_ratings.html
Priority ratings are not assigned per vulnerability. Instead, a single priority rating is set for all vulnerabilities covered by a bulletin on a per product/platform combination basis, as illustrated by an Adobe advisory such as APSB13-11 [17].

This allows Adobe to have priority ratings reflect whether attacks have historically occurred more often against e.g. the Windows version of Flash Player compared to the Linux version. This in turn allows customers to make a prioritization decision on a per-platform basis and not just a per-product basis. In the referenced advisory, Adobe recommends a faster response time for the Windows platform (suggesting installation within 72 hours) than for the Linux platform (suggesting within one month). Ultimately, it would, of course, be up to the individual organization to determine an appropriate response time for each priority rating.
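As a small sketch of how a per product/platform priority rating could drive patching deadlines, consider the following; the dictionary structure, the chosen ratings, and the mapping of ratings to time windows are illustrative assumptions loosely based on Adobe's definitions, not data from a specific bulletin.

```python
from datetime import timedelta

# Sketch: per (product, platform) priority ratings and the suggested
# installation window for each rating. Values are illustrative only.
PRIORITY_BY_PRODUCT_PLATFORM = {
    ("Flash Player", "Windows"): 1,   # historically heavily targeted
    ("Flash Player", "Linux"): 2,     # historically targeted less often
}

# Response windows loosely following Adobe's definitions:
# priority 1 -> within 72 hours, priority 2 -> within 30 days,
# priority 3 -> at the administrator's discretion (no fixed window).
RESPONSE_WINDOW = {1: timedelta(hours=72), 2: timedelta(days=30), 3: None}

def suggested_window(product, platform):
    rating = PRIORITY_BY_PRODUCT_PLATFORM.get((product, platform), 3)
    return rating, RESPONSE_WINDOW[rating]

print(suggested_window("Flash Player", "Windows"))  # (1, datetime.timedelta(days=3))
print(suggested_window("Flash Player", "Linux"))    # (2, datetime.timedelta(days=30))
```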
Advantages and Disadvantages
The advantage for Adobe of following this approach is that it does not require spending large amounts of resources to thoroughly investigate not only the cause of each vulnerability, but also the follow-up impact and any obstacles there may be to actually exploiting it fully and in a reliable manner.
For customers, the concept of this rating system also adds value to the risk assessment process, as it provides dynamic scores that allow focusing on the applications, versions, and even platforms that are currently most popular with attackers. Just because a vulnerability may technically be possible to exploit in a reliable manner, it does not necessarily mean that attackers will do so if the application is deemed uninteresting, e.g. due to limited ROI.
[17] http://www.adobe.com/support/security/bulletins/apsb13-11.html
However, the disadvantage of this approach is that neither Adobe's severity nor priority ratings factor in the technical characteristics of a vulnerability, i.e. whether it is indeed exploitable. As the ratings only focus on worst-case impact and on which applications and platforms are historically commonly exploited, they do not reflect the feasibility of actually exploiting a given vulnerability. As such, customers may find themselves prioritizing vulnerabilities that are exploitable only in theory over vulnerabilities that may be straightforward to exploit.
Statistics
In 2012, 8 out of the 24 released bulletins with priority ratings had at least one product with a priority rating of 1 (33.3%). Looking at the 19 advisories with a "Critical" severity, 8 were given a priority rating of 1 (42.1%). Without looking closer at the accuracy of the scores, these percentages at least seem to provide better granularity and value for patch prioritization than the high percentages of "1" scores for Microsoft's Exploitability Index ratings.
Reviewing Adobe's 2012 priority scores, it seems that they currently predominantly score Flash Player for Windows and Adobe Reader/Acrobat 9.x as "1". Other versions of Flash Player, Shockwave Player, and ColdFusion are generally scored as "2", and applications like Photoshop, Illustrator, and similar are scored as "3".

These scores do immediately seem in line with what one would expect based on the applications and platforms that are most commonly attacked. As pointed out earlier, since the rating system is dynamic, it allows Adobe to change default scores in the future at will, e.g. raising Photoshop from "3" to "2" if attackers suddenly focus more on that specific application.
CVSS Exploitability Temporal Metric
Long before Microsoft's Exploitability Index and Adobe's Priority Rating, CVSS (Common Vulnerability Scoring System) introduced a metric to provide similar granularity in risk assessment: the "Exploitability (E)" temporal metric, which "measures how complex the process is to exploit the vulnerability in the target system" [18] by looking at the current state of available code. This means that e.g. publicly available exploit code results in a higher score than that of a vulnerability without any publicly available code. The easier the vulnerability is to exploit using publicly available code, the higher the ultimate CVSS score.
The metric has four different values, with a fifth value, "Not Defined (ND)", which does not affect the resulting CVSS score.
Metric Value | Description
Unproven (U) | No exploit code is available, or an exploit is entirely theoretical.
Proof-of-Concept (POC) | Proof-of-concept exploit code or an attack demonstration that is not practical for most systems is available. The code or technique is not functional in all situations and may require substantial modification by a skilled attacker.
Functional (F) | Functional exploit code is available. The code works in most situations where the vulnerability exists.
High (H) | Either the vulnerability is exploitable by functional mobile autonomous code, or no exploit is required (manual trigger) and details are widely available. The code works in every situation, or is actively being delivered via a mobile autonomous agent (such as a worm or virus).

[18] http://www.first.org/cvss/faq
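The effect of the Exploitability metric on the final score can be illustrated with the CVSS v2 temporal equation; the helper below is a sketch using the metric weights from the CVSS v2 specification and is not an official implementation.

```python
# Sketch: CVSS v2 temporal score, showing how the Exploitability (E) metric
# scales the base score. Weights follow the CVSS v2 specification.
EXPLOITABILITY = {"U": 0.85, "POC": 0.90, "F": 0.95, "H": 1.00, "ND": 1.00}
REMEDIATION_LEVEL = {"OF": 0.87, "TF": 0.90, "W": 0.95, "U": 1.00, "ND": 1.00}
REPORT_CONFIDENCE = {"UC": 0.90, "UR": 0.95, "C": 1.00, "ND": 1.00}

def temporal_score(base_score, e="ND", rl="ND", rc="ND"):
    score = base_score * EXPLOITABILITY[e] * REMEDIATION_LEVEL[rl] * REPORT_CONFIDENCE[rc]
    return round(score, 1)

# A base score of 9.3 drops to 7.3 while only proof-of-concept code exists
# and an official fix is available, but rises to 8.1 once a widely delivered,
# fully working exploit ("H") appears.
print(temporal_score(9.3, e="POC", rl="OF", rc="C"))  # 7.3
print(temporal_score(9.3, e="H", rl="OF", rc="C"))    # 8.1
```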
Advantages and Disadvantages
The main advantage of this approach is that it cuts out all the "what ifs" and guesstimates of whether or not a vulnerability may be exploitable. The metric does not require any technical skills nor in-depth analysis to generate. If exploit code is available, the vulnerability is clearly exploitable; if it is used by malware, vulnerable applications are clearly being targeted, etc.
This temporal metric certainly provides great value from a patch prioritization and risk
assessment standpoint by making it obvious what the current publicly known threat is to a
vulnerable application.
However, the main disadvantage is that this is a purely reactive metric that does not allow organizations to be proactive. Also, the lack of information about available exploit code does not mean that it is not privately available and being used to compromise systems. Organizations preferring this approach would do well to ensure that they implement a quick and efficient response process.
The organization would, furthermore, be required either to thoroughly track the discussion and publication of new PoC and exploit code themselves, or to acquire the services of a reliable vulnerability intelligence provider that tracks such information and updates it in a timely fashion. This ensures that the organization has a proper understanding of the threatscape at all times and can react quickly to prioritize patches accordingly.
Conclusion
It is clear that each of these rating systems - if implemented correctly - provides additional value to patch prioritization and risk assessment efforts using different approaches. The metrics have value standalone, but certainly complement each other greatly.
In a perfect world, it would be very interesting to see a vendor combine all three ratings in addition to the worst-case impact of a given vulnerability (severity rating), i.e. rating how likely the vulnerability is to be successfully and reliably exploited from a technical standpoint (exploitability index rating), how likely attackers are to actually target the affected product/platform (priority rating), and the current public availability of PoCs and/or exploit code (CVSS Exploitability temporal metric).
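As a thought experiment only, such a combination could be captured in a single record with a simple ordering heuristic, as sketched below. All names, the field layout, and the weighting order are hypothetical; no vendor currently publishes a combined rating like this.

```python
from dataclasses import dataclass

SEVERITY_ORDER = {"Critical": 0, "Important": 1, "Moderate": 2, "Low": 3}
CVSS_E_ORDER = {"H": 0, "F": 1, "POC": 2, "U": 3, "ND": 3}

# Hypothetical combined record carrying all four data points discussed above.
@dataclass
class CombinedRating:
    severity: str              # vendor severity rating (worst-case impact)
    exploitability_index: int  # 1-3, technical feasibility of exploitation
    priority_rating: int       # 1-3, likelihood the product/platform is targeted
    cvss_exploitability: str   # current public exploit availability (E metric)

def urgency_key(rating: CombinedRating):
    """Lower tuples indicate more urgent patching; the ordering of the
    components is an illustrative policy choice, not an established standard."""
    return (
        CVSS_E_ORDER[rating.cvss_exploitability],  # known public exploits first
        SEVERITY_ORDER[rating.severity],           # then worst-case impact
        rating.exploitability_index,               # then technical feasibility
        rating.priority_rating,                    # then historical targeting
    )
```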
Of all the vendors, Microsoft currently seems to be the one coming closest to this. They provide a standard severity rating and their Exploitability Index (requiring some changes, though, as discussed in this paper). While not available as metrics, Microsoft at the same time provides the "Key Notes" section in the Exploitability Index and FAQs in the security bulletins that discuss whether or not a vulnerability has been previously disclosed and/or is currently being exploited, along with the "What systems are primarily at risk from the vulnerability?" section in security bulletins that discusses the types of systems most at risk.
However, even implementing only one of these types of priority/exploitability ratings still provides extra value to end users of the security bulletins when doing risk assessment in order to prioritize their patching efforts, as long as they clearly understand the challenges and limitations of each rating type.
The approaches used by Adobe and CVSS are easier for most software vendors to implement and support than the exploitability index rating system offered by Microsoft, which requires greater commitment and resources. Therefore, only software vendors who are prepared to go all-in when it comes to technical root cause, impact, and exploitation analysis should consider an actual exploitability index.
Hopefully, more major vendors will follow suit and provide one or more of these types of ratings
along with their standard severity ratings in the future. The more data made available to system
administrators, the better risk management decisions they can make.