
5th International Conference on Future-Oriented Technology Analysis (FTA) - Engage today to shape tomorrow
Brussels, 27-28 November 2014
KEEPING SCORE: BETTER POLICY THROUGH IMPROVED
PREDICTIVE ANALYSIS
Regina Joseph
Sibylink
Prinsestraat 64, 2513 CE, Den Haag/The Hague, The Netherlands; [email protected]
Abstract
Threat and risk assessment in foreign policy remains mostly stuck in a 20th century rut: failures
in foreseeing such catalytic events as the fall of the Berlin Wall, the Arab Spring, and even
Russia’s intervention in Ukraine result as much from the lack of foresight accountability as they
do from an entrenched dependency on gatekeepers and experts already within the system.
Strategic foresight in the hands of even the best futurists becomes blinkered when trapped in the
bubble of sclerotic, hierarchical and reactive government structures. Attempts to address such
weaknesses can bog down further in a dialectic of false positives and false negatives—a major
obstacle in undertaking such current initiatives as the European External Action Service’s
harmonization efforts to link global situation rooms and crisis centres.
But rigorous, scientifically tested platforms of binary prediction (the first generation of web-based
computational futures research), such as the Good Judgment Project, have now arrived. Good
Judgment embraces populism through a highly refined crowd-sourcing technique; replaces the
static method of think tank reports with graphically visualized web-based dashboards; uses
gamification techniques to engage and incentivize volunteers in a demanding and competitive
tournament environment; and displays continuously elicited forecasts on short-term, mid-term and
long-range questions from Superforecasters, who have maintained consistently high accuracy
over three years under experimental conditions. With such new tools comes the potential
to transform how foresight is integrated into political decision-making.
Led by three principal investigators, University of Pennsylvania/Wharton professors Dr. Philip
Tetlock and Dr. Barbara Mellers and University of California, Berkeley professor Dr. Donald Moore,
Good Judgment fuses cognitive psychology and behavioural science with computational and
scientific rigor, thus setting both quantitative and qualitative standards in identifying who is
capable of better foresight; it also now has a three-year track record of effectively improving
forecasting accuracy through specific training and algorithmic aggregation techniques.
Keywords: Good Judgment Project, binary prediction, crowd-sourcing, gamification, tournament, Superforecasters
Introduction
1.1 The Roots of Today’s Geopolitical Forecasting Advances
Ernest Hemingway once wrote of the process of bankruptcy as happening gradually, then
suddenly [9], which also serves as an apt description of the accelerating pace of technical and
computational modalities applied to predictive geopolitical forecasting. Data-driven foresight
methods for identifying emerging threats or events are quickly becoming a significant and novel
force influencing policy—especially those that combine open source information and the wisdom
of crowds.
Scientific and quantitative approaches to forecasting have their earliest roots at the turn of the
20th century in fields of natural observation; inductive reasoning about future weather conditions
propelled the burgeoning science of meteorology [1, 13, 18], and the pioneering work of the
polymath anthropologist Francis Galton in statistics led to the concepts of correlation, linear
regression and regression to the mean—central features of current computational forecasting [4, 6, 7].
The inaccuracy of the old canard, “you can’t predict the weather,” reflects just how far
meteorology has come since the discovery of thermodynamics in the 19th century [13]. In 1901,
Cleveland Abbe, an American meteorologist, was the first to encourage mathematical
approaches to forecasting [1]—a proposal which was quickly taken up first by the Norwegian
scientist Vilhelm Bjerknes in 1904 [13], who advanced qualitative diagnostics through
observational research, and then by English meteorologist Lewis Fry Richardson, who in 1922
published the first rigorous algorithms for prognostic calculation [1, 18]. Forecasting accuracy
then climbed slowly for the rest of the century before experiencing a hockey stick-like jump in
the 1990s; today, four-day weather forecasts are as accurate as one-day forecasts were 30
years ago, and extreme weather events can be predicted five to seven days in advance,
whereas 20 years ago any significant forecasting accuracy could only be achieved one day
ahead of time [28].
Around the same time that Abbe, Bjerknes and Richardson attempted to improve weather prediction,
Francis Galton was merging his ideas regarding the prognostic qualities of statistics and the
democratic nature of popular judgment. In 1907, he published his ground-breaking “wisdom of
the crowd” article, relaying the outcomes of a public contest in which 800 average citizens
competed to correctly guess the weight of a slaughtered and dressed ox [6, 7]. Galton posited
an equivalence between the average competitor’s fitness to judge the ox’s weight and to judge
the merits of the political issues on which he voted [6, 7]; the result introduced the psychological
dimension of vox populi, leading to a predictive outcome accurate to within one percent of the
real value [7]. Since Galton’s observations, the harnessing of groups to forecast events and
conditions has spread far beyond judging livestock at the country fair; for decades, government
agencies, universities, banks, corporations and think tanks have assembled subject matter
experts to predict outcomes, especially within the economic and geopolitical arenas. But it
wasn’t until the last decade that crowd-sourced environments, now multiplying with speed,
became globalized and populist.
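The statistical mechanism Galton observed is simple to demonstrate. The following minimal simulation is purely illustrative: the Gaussian noise model and its spread are assumptions rather than Galton’s data, and only the true dressed weight of roughly 1,198 pounds comes from his article [7].

```python
import random
import statistics

# Illustrative only: each guess is modelled as the true value plus
# independent Gaussian noise; this noise model is an assumption,
# not a reconstruction of Galton's actual entries.
random.seed(42)
TRUE_WEIGHT = 1198  # pounds, the dressed weight of Galton's ox [7]
guesses = [random.gauss(TRUE_WEIGHT, 75) for _ in range(800)]

crowd_estimate = statistics.median(guesses)  # Galton used the middlemost value
mean_individual_error = statistics.mean(abs(g - TRUE_WEIGHT) for g in guesses)

print(f"crowd estimate: {crowd_estimate:.0f} lb, "
      f"off by {abs(crowd_estimate - TRUE_WEIGHT):.1f} lb")
print(f"average individual error: {mean_individual_error:.1f} lb")
```

Even though typical individual guesses here err by dozens of pounds, the middlemost estimate usually lands within a fraction of a percent of the true value, echoing Galton’s one-percent result.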
Technology lies behind the aggressive and temporally compressed expansion of meteorological
accuracy and crowd-sourced prediction vehicles: weather forecasting could not have achieved
its exponential strides in accuracy without geospatial satellite technology [13, 28]; and human
networks generating massive statistical data volumes could not advance without Internet linkage
and computers providing both data processing power (which augments the prognostic effect)
and the means for information collection (which provides diagnostic foundations). These twinned
roots behind the advances in geopolitical forecasting seem especially pertinent today, as climate
becomes increasingly central to predicting geopolitical events and as the uncertainty around
both puts ever more pressure on policy-makers to develop greater accuracy in strategic foresight.
1.2 Modern Technical Foresight Efforts
Machine-readable data sets and an emphasis on statistical forecasting [10, 19, 27] led to several
government-sponsored artificial intelligence (AI, or computational methodology) efforts in
forecasting conflict during the Cold War, which fizzled out after sub-optimal results [19]. But
academic progress in the field during the 1990s and 2000s compelled US government agencies
such as the Defense Advanced Research Projects Agency (DARPA),
and like-minded counterparts in Europe, such as Austria’s Federal Ministry of Education, Science
and Culture [27], to refocus on technical geopolitical forecasting [19, 27].
In tandem with the boost in machine-led political forecasting work in the last decade, two social
science researchers have helped shape and push the frontier of geopolitical forecasting by
addressing the psychological dimension Galton instinctively knew was at work in crowd-sourced
accuracy over 100 years ago.
Nobel laureate Daniel Kahneman’s work in cognitive biases and his distinction between System
1 thinking (fast, emotional and intuitive) and System 2 thinking (slow, logical and deliberate)
elucidates the behavioural impulses that affect decisions, not the least of which include policy
choices [12]. The prevalence of System 1 thinking over System 2 may be interpreted as a key
reason behind the dismal record of the dismal science of economics—evidenced most recently
by the Organisation for Economic Co-operation and Development’s (OECD) post mortem of
economic projections in the period from 2007-2012, in which not a single economic growth
forecast proved correct [5, 16].
On a more granular level, Philip Tetlock’s ground-breaking and definitive research [24, 26] on
geopolitical forecasting deflates the notion that experts maintain an edge in accuracy: his
assessment spanning 20 years of more than 80,000 predictions made by 284 expert political
forecasters demonstrated forecast outcomes barely better than chance. Furthermore, in an
ego-driven effort to preserve status by playing to an audience, the most recognized pundits often
fare worse than Tetlock’s memorable analogue of “dart-throwing chimpanzees” [24]. Without
post-forecast evaluative metrics, publicly visible forecasters are prone to overconfidence in their
judgments. The lack of accountability in forecasting, Tetlock suggests, is a driver of poor
performance.
According to Tetlock’s research, some forecasters, however, are much better than others.
Borrowing from the Greek poet Archilochus’ comparison of hedgehogs and foxes (via the British
philosopher Isaiah Berlin), Tetlock observed the forecasting accuracy of generalists—who, like
foxes, know “many things”—versus the often inferior ability of experts—who, like hedgehogs,
know “one big thing” [24].
Tetlock’s findings piqued the interest of Intelligence Advanced Research Projects Activity
(IARPA), a division of the US Office of the Director of National Intelligence. In 2011, in an effort
to answer whether it is possible to predict economic and geopolitical outcomes using social
science methods, IARPA launched the four-year Aggregative Contingent Estimation (ACE)
Program (http://www.iarpa.gov/index.php/research-programs/ace) to “dramatically enhance the
accuracy, precision, and timeliness of intelligence forecasts for a broad range of event types
through the development of enhanced techniques that elicit, weight and combine the judgement
of many intelligence analysts.”
Fashioning ACE as a tournament, IARPA invited Tetlock to form a team and pitted that team,
known as the Good Judgment Project, against four other multi-disciplinary research teams. Each
team was allowed to choose its forecasters as it saw fit, and IARPA posed hundreds of
geopolitical questions to all the experimental subjects across the five teams. Examples of
questions included “Will Greece leave the Eurozone before X date?”; “Will there be a lethal
confrontation between China and Japan in the East China Sea before X date?”; “Will the IMF
provide a new loan to Egypt before X date?” and other policy-relevant queries. Each team could
employ its own algorithmic analysis, forecasting environment and experimental conditions [29].
As a control, IARPA used a group of active government intelligence analysts who forecasted on
the same questions as the teams. The goal of the tournament was to beat the control group’s
accuracy by 20 percent in Year 1, rising to 50 percent by Year 4.
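Operationally, “beating the control group by 20 percent” denotes a relative accuracy margin. A minimal sketch, assuming accuracy is measured with a mean-squared-error variant of the Brier score described later under Methodological approach (all numbers below are invented for illustration):

```python
def mean_brier(forecasts, outcomes):
    """Mean Brier score over binary questions (lower is better).

    forecasts -- probabilities assigned to each event occurring
    outcomes  -- 1 if the event occurred, 0 otherwise
    """
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

# Invented numbers, purely for illustration.
control = mean_brier([0.6, 0.7, 0.2], [1, 0, 0])  # 0.23
team = mean_brier([0.8, 0.3, 0.1], [1, 0, 0])     # ~0.047

improvement = (control - team) / control  # share of the control group's error removed
print(f"relative improvement over control: {improvement:.0%}")  # 80%
```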
Unexpectedly, of the five teams, the Good Judgment Project’s best method beat the Year 4 goal
of 50 percent in Year 1 and won decisively over the control group, results which led IARPA to
defund the other four teams (which did not come close to Good Judgment’s success) after Year 2
and to build the remaining two years of the experiment around Good Judgment [15]. Taking its cue
from Tetlock’s body of research that demonstrates the importance of keeping score when
making futures forecasts, the Good Judgment Project in its fourth and final experimental year
continues to push the boundaries of current understanding about prognostic accuracy.
Unsurprisingly, the success of the Good Judgment Project has attracted press as well as public
and private sector interest. Moreover, it has begun to attract copycats wishing to trade on the
recent and growing interest in technical forecasting and prediction markets. While imitation is the
sincerest form of flattery, several aspects of Good Judgment’s rigorously scientific composition
will make it extremely difficult, expensive and time-consuming to match.
Methodological approach
Over 2000 forecasters were recruited by Tetlock and the research teams at University of
Pennsylvania (where he and Mellers serve as professors) and University of California, Berkeley
(where Tetlock and Mellers taught previously and where fellow principal investigator Dr. Donald
Moore continues to teach) for the Good Judgment Project in Year 1 through requests for
participants placed in policy- and research-relevant blogs, journals and fora [11]. These
experimental subjects had an average age of 35 [29] and most had careers ranging widely
across academia, political science and the private sector [29].
To refine this crowd-sourced environment well beyond that of any existing entity, Good
Judgment developed a state-of-the-art vetting system: extensive demographic and
psychographic data are collected on each forecaster, including IQ tests, political knowledge
tests, and personality tests to determine open-mindedness and the relative nature of each
forecaster’s fox-to-hedgehog quotient—data points that are renewed through collection each
year at the start of every forecasting season. In the first year, each forecaster was randomly
assigned to one of four experimental conditions: individual prediction in isolation; individual
prediction with a window view of what others were predicting; team prediction; or a prediction
market (where forecasters “buy and sell” on event probability based on determining the positions
of their fellow forecasters) [29]. In addition, each forecaster was assigned to one of three training
conditions: no training at all, probability training, or scenario training [29]. Over the
course of the ACE program, the Good Judgment Project has developed and iteratively refined
training materials built on the work of Tetlock, Mellers, Moore and the research team to assist
forecasters in cognitive de-biasing as well as better predictive practice.
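A minimal sketch of this crossed random assignment, with condition labels taken from the description above (everything else is an illustrative assumption, not the project’s actual code):

```python
import random

# Hypothetical condition labels based on the description above.
ELICITATION = ["isolated", "window_view", "team", "prediction_market"]
TRAINING = ["none", "probability", "scenario"]

def assign_conditions(forecaster_ids, seed=0):
    """Randomly assign each forecaster to one elicitation condition
    and one training condition, as in a crossed experimental design."""
    rng = random.Random(seed)
    return {fid: (rng.choice(ELICITATION), rng.choice(TRAINING))
            for fid in forecaster_ids}

assignments = assign_conditions(range(2000))
print(assignments[0])  # e.g. ('window_view', 'scenario')
```

A production design would normally block the randomization to keep the cell sizes balanced; unconstrained random choices are used here only for brevity.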
In strategic policy development today, analysts often make forecasting judgments using “fuzzy”
language like “possible,” “probable” and “likely,” modified by adjectives such as “most” and “least”
or adverbs like “moderately.” This is true of much qualitative futures work and think tank
reports—the popularity of which is exemplified not least by the preponderance of such methods
in prior FTA conference submissions [https://ec.europa.eu/jrc/en/event/site/fta2014/previouseditions].
Good Judgment definitively breaks with that tradition, requiring forecasters to assign
numerical probabilities to events on a continuously elicited basis; in Good Judgment’s binary
predictive environment (question outcomes are defined as 0 or 1), forecasters are scored using
Brier scoring rules—the sum of the squared differences between the estimated probability and
the actual outcome (0 or 1) [2]. To determine accuracy by addressing variance decomposition of
the continuously elicited forecasts, Good Judgment’s research team developed algorithms and
principles for forecast aggregation: in Year 1, forecasts were weighted and averaged using
either a weighted mean or a weighted median; older forecasts were down-weighted using
exponential decay; and aggregated forecasts were transformed to push them away from 0.5 and
towards more extreme values [29]. Since Year 1, Good Judgment has consistently and
iteratively refined its statistical algorithms and aggregation methods to improve its ability to
model forecasting decision processes and accuracy [15].
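A minimal sketch of that Year 1 pipeline on a single binary question, with an invented decay half-life and extremizing exponent (Good Judgment estimated such parameters from data [29]):

```python
def aggregate(forecasts, ages, half_life=3.0, a=2.0):
    """Combine individual probability forecasts on one binary question.

    forecasts -- each forecaster's latest probability that the event occurs
    ages      -- age of each forecast in days
    half_life -- days for a forecast's weight to halve (invented value)
    a         -- extremizing exponent (invented value)
    """
    # Exponential decay: down-weight older forecasts.
    weights = [0.5 ** (age / half_life) for age in ages]
    p = sum(w * f for w, f in zip(weights, forecasts)) / sum(weights)
    # Extremizing: push the aggregate away from 0.5 and towards 0 or 1.
    return p ** a / (p ** a + (1 - p) ** a)

print(aggregate([0.6, 0.7, 0.55], ages=[0, 1, 6]))  # ~0.75
```

The final step implements the transformation away from 0.5; it compensates for the way simple averaging across many forecasters tends to pull the aggregate towards uninformative middle values.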
For their forecasts and participation, experimental subjects received nothing more than an
Amazon gift card worth USD 150-250. Retention success within such a demanding experience
lies partly in the gamified nature of a competitive tournament environment. Furthermore, Good
Judgment enabled forecasters to see where they ranked in accuracy compared to others by
publishing and regularly updating a leaderboard. The effects of a game environment on intrinsic
motivation and engagement are well known [3] and its application in the quantitative arena of
Good Judgment has been salutary in reducing attrition, especially among a special class of
forecasters known as Superforecasters [20, 29], who rank in the top 2% of accuracy among the
thousands of the experiment’s forecasters (full disclosure: in addition to being on staff with Good
Judgment, I am also a Superforecaster). In Years 3 and 4, Superforecasters have been
aggregated into Super teams, creating a meta-tournament within the tournament.
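A minimal sketch of the leaderboard ranking and top-2% skimming described here (forecaster names and scores are invented):

```python
def leaderboard(scores):
    """Rank forecasters by mean Brier score (lower is better)."""
    return sorted(scores.items(), key=lambda item: item[1])

def superforecasters(scores, top_fraction=0.02):
    """Skim off the most accurate fraction of forecasters."""
    ranked = leaderboard(scores)
    cutoff = max(1, int(len(ranked) * top_fraction))
    return [name for name, _ in ranked[:cutoff]]

# Invented scores, purely for illustration.
scores = {f"forecaster_{i}": 0.10 + 0.0004 * i for i in range(2000)}
print(len(superforecasters(scores)))  # 40, i.e. the top 2% of 2,000
```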
Gamification is augmented through the constantly evolving graphical user interface of the Good
Judgment environment. Forecasters, whether in the survey condition or in the prediction market,
enter their predictions into a bespoke-programmed “dashboard” that facilitates communication
and information exchange among individuals and teams, lending a social media component to
Good Judgment’s closed environment. Chronological comment history is displayed, along
with visualizations such as graphs charting forecast history to assist individual and team
analysis.
Question formulation constitutes a central aspect of Good Judgment’s success, which was built
on answering questions with short-term (days and months) and mid-term (1-3 years) temporal
scope. In Years 1 and 2, IARPA supplied questions with a focus on rigor and relevance; the fact
that questions must be policy relevant but also resolvable with irreducible certainty led to an
early trade-off favouring rigor but reducing policy relevance [25]. In Year 3, once Good
Judgment became the sole contender in the ACE program, IARPA ceded the role of question
formulation to Good Judgment’s research team. Since then, question formulation has addressed
policy relevance by introducing Bayesian question clusters, which permit rigorous questions to
serve as directional diagnostics for less rigorous but more policy-relevant issues.
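A stylized illustration of how such a cluster can work: a rigorously resolvable question serves as evidence that updates the probability of a broader, harder-to-score policy issue via Bayes’ rule (the pairing, prior and likelihoods below are invented):

```python
def bayes_update(prior, p_e_given_h, p_e_given_not_h):
    """Posterior probability of hypothesis H once evidence E is observed."""
    numerator = p_e_given_h * prior
    return numerator / (numerator + p_e_given_not_h * (1 - prior))

# H: broad policy issue, e.g. "relations between two states deteriorate sharply".
# E: rigorous cluster question, e.g. "a specific naval incident occurs by date X".
posterior = bayes_update(prior=0.20, p_e_given_h=0.50, p_e_given_not_h=0.10)
print(f"P(H | E) = {posterior:.2f}")  # 0.56: the narrow question moves the broad one
```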
Black-swan critiques still remain a central problem for binary prediction tournaments like Good
Judgment: short- and mid-term foresight can desensitize forecasters to long-term or extreme/fat-tail
risks [21, 25]. To address this, Good Judgment’s question formulation team is currently
experimenting with the development of “rolling” continuous elicitation questions that require
forecasters to consider queries with much longer timelines (5-10+ years).
Results, discussion and implications
In less than four years, the Good Judgment Project has answered the ACE program’s primary
question unequivocally in the affirmative: economic and geopolitical outcomes can be predicted
using social science methods.
Attaining superiority in policy-relevant technical geopolitical forecasting hinges on three critical
factors: tracking, training, and teaming [15, 25].
Tracking forecasts begins with rigorously defined questions and outcome conditions, since good
question formulation is “the bread that feeds good forecasting” (Horowitz, 2014). Good
forecasting performance is then best achieved through systematically keeping score, which is
essential in reducing forecaster overconfidence. Continuous feedback on scoring retains forecaster
engagement and motivation; skimming off the top performers (what Good Judgment calls
Superforecasters) and aggregating them into elite teams enhances accuracy further still.
Contrary to expectations of Superforecasters eventually regressing to the mean, the individual
and group accuracy of Super teams actually increased and outperformed all other conditions in
the first three years [25], confirming that the best performers are not merely the beneficiaries of
luck. Forecasting is a skill.
Training efforts over the last three years confirm that forecasting is a skill that can be learned.
Prior to training, however, screening prospective forecasters for actively open-minded System 2
thinking, fluid intelligence, and political knowledge can boost forecasting performance by 10-15
percent [25]. Once selected to compete in a forecasting tournament, training in probabilistic
thinking, cognitive de-biasing and reducing groupthink behavior can deliver up to a 10 percent
improvement in forecasting performance [25].
Teaming and competitive forecasting in prediction markets outperform individual forecasting.
Team collaboration sharpens forecasters, improving performance by 10-20 percent [25].
Aggregating individuals into teams goes hand in hand with aggregating forecasts: over-weighting
smart, open-minded forecasters and applying “extremizing transformations” can compensate
for the conservatism of aggregate forecasts [15, 29].
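A hedged sketch of that combination, deriving weights from each forecaster’s historical Brier score (an illustrative choice; Good Judgment’s published models are more elaborate [15, 29]):

```python
def weighted_extremized(forecasts, past_brier, a=2.0):
    """Aggregate one question's forecasts, over-weighting forecasters with
    better (lower) historical Brier scores, then extremizing the result.

    Inverse-Brier weights and the exponent a are illustrative assumptions,
    not Good Judgment's published parameters.
    """
    weights = [1.0 / (b + 1e-6) for b in past_brier]
    p = sum(w * f for w, f in zip(weights, forecasts)) / sum(weights)
    return p ** a / (p ** a + (1 - p) ** a)

# Two historically sharp forecasters outvote a weaker third.
print(weighted_extremized([0.70, 0.75, 0.40], past_brier=[0.12, 0.15, 0.45]))  # ~0.82
```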
The exciting and astonishingly accurate results yielded so far by the Good Judgment Project
cannot obscure the fact that more work in the technical geopolitical forecasting field lies ahead.
Merging Good Judgment’s human capacity with artificial neural networking and supercomputing
could be one path forward in pushing the forecasting frontier, but the benefits are not yet clear.
More significant is the need to come to grips with inevitable forecasting
failures and the challenge of balancing them against successes. As Tetlock identified in a 2014
symposium [25], it would be a grave mistake to assume the existence of a perfect
“Nostradamus-like” forecasting solution; rather, it is more useful to view the Good Judgment
Project as a superlative pair of eyeglasses that can transform average 20/20 vision. Those
glasses won’t provide perfect 20/0 visual acuity, but they certainly will deliver a significant
improvement in quality of life if they can raise the average to an eagle-eye’s standard of 20/5. As
Tetlock puts it, “Optometry trumps prophecy” [25].
Press reports of Good Judgment’s tangible results coincide with (and in a few cases are
spurring) the rise of new entrants in the forecasting marketplace, especially those in prediction
markets. Outlets like PredictIt, Prediki and the American Civics Exchange (whose acronym spells
ACE) bear some superficial structural similarities—in certain cases due to licensing of
third-party prediction market back-end systems. Qualitative analysis outlets like boutique think
tanks and scenario-based futures consultancies are attempting to use more quantitative and
probabilistic language in their assessments. But the scientific data, tournament development
experience, aggregation models, training material feedback, and most importantly, pool of
Superforecasters whose accuracy has been consistently vetted over four years give Good
Judgment an edge that no other entity in the world can match, unless it is prepared to invest
an equal amount of time and resources.
And yet, whether or not such advances in forecasting approaches can sufficiently catalyse
policymaking is uncertain. As Tetlock notes:
The long and the short of the story is that it's very hard for professionals and executives to maintain their status if they
can't maintain a certain mystique about their judgment. If they lose that mystique about their judgment, that's
profoundly threatening. My inner sociologist says to me that when a good idea comes up against entrenched
interests, the good idea typically fails. But this is going to be a hard thing to suppress. Level playing field forecasting
tournaments are going to spread. They're going to proliferate. They're fun. They're informative. They're useful in both
the private and public sector. There's going to be a movement in that direction. How it all sorts out is interesting. To
what extent is it going to destabilize the existing pundit hierarchy? To what extent is it going to destabilize who the big
shots are within organizations? [23]
Conclusions
The exponentially greater impact of “fat-tail” risks in our uncertain and deeply linked world—
whether through climate change, terrorism, accident or worse—demands better futures analysis
beyond the status quo of vague verbiage, foresight programs with high costs but low-to-zero
accountability, and “gurus-du-jour.” The rise of the semantic web and a widening asymmetry
between predictive analytics research and innovation in the US and virtually everywhere else
pose both opportunity and challenge: an opportunity to harness new tools for conducting
verifiable and accountable futures-oriented analysis; and a challenge in whether such tools will
be taken up by global actors disadvantaged by the growing technological gap and in most need
of better foresight.
Governments should seek to expand accountability metrics as a part of their strategic foresight
practice, and tournaments such as those run by the Good Judgment Project can provide optimal
environments for these efforts.
References
1. Abbe C (1901) The physical basis of long-range weather forecasts, Monthly Weather Review, 29, 551-561
2. Brier GW (1950) Verification of forecasts expressed in terms of probability, Monthly Weather Review, 78, 1-3
3. Deci EL, Koestner R, Ryan RM (1999) A meta-analytic review of experiments examining the effects of extrinsic rewards on intrinsic motivation, Psychological Bulletin, Vol. 125 (6), 627-668. http://psycnet.apa.org/journals/bul/125/6/627
4. Forrest DW (1974) Francis Galton: The Life and Work of a Victorian Genius. Elek, London
5. Friedman W (2013) Fortune Tellers: The Story of America’s First Economic Forecasters. Princeton University Press, Princeton
6. Galton F (1907) One Vote, One Value, Nature, No. 1948, Vol. 75, 414. http://galton.org/essays/1900-1911/galton-1907-vote-value.pdf
7. Galton F (1907) Vox Populi, Nature, No. 1949, Vol. 75, 450-451. http://galton.org/essays/1900-1911/galton-1907-vox-populi.pdf
8. Gerner DJ, Schrodt PA, Yilmaz O (2009) Conflict and Mediation Event Observations (CAMEO) Codebook. http://eventdata.psu.edu/data.dir/cameo.html
9. Hemingway E (1926) The Sun Also Rises. Scribners, New York
10. Hopple GW, Andriole SJ, Freedy A (1984) National Security Crisis Forecasting and Management. Westview, Boulder
11. Horowitz MC, Tetlock PE (2012) Trending Upward, Foreign Policy. http://www.foreignpolicy.com/articles/2012/09/06/trending_upward
12. Kahneman D (2011) Thinking, Fast and Slow. Farrar, Straus and Giroux, New York
13. Lynch P (2008) The origins of computer weather prediction and climate modelling, Journal of Computational Physics, 227. http://www.elsevierscitech.com/emails/physics/climate/the_origins_of_computer_weather_prediction.pdf
14. Mandel DR, Barnes A (2014) Accuracy of forecasts in strategic intelligence, Proceedings of the National Academy of Sciences, Vol. 111, Issue 30, 10984-10989. http://www.pnas.org/content/111/30/10984.full
15. Mellers B, Ungar L, Baron J, Ramos J, Gurcay B, Fincher K, Scott SE, Moore D, Atanasov P, Swift SA, Murray T, Stone E, Tetlock PE (2014) Psychological strategies for winning a geopolitical forecasting tournament, Psychological Science, Vol. 25 (5), 1106-1115. http://pss.sagepub.com/content/25/5/1106.long
16. OECD (2014) OECD forecasts during and after the financial crisis: a post mortem, OECD Economics Department Policy Notes, No. 23, February 2014. http://www.oecd.org/eco/outlook/OECD-Forecast-post-mortem-policy-note.pdf
17. Pawlak P, Ricci A (eds) (2014) Crisis Rooms: Towards a Global Network? European Union Institute for Security Studies
18. Richardson LF (1922) Weather Prediction by Numerical Process. Cambridge University Press, Cambridge
19. Schrodt PA, Yonamine J, Bagozzi BE (2012) Data-based computational approaches to forecasting political violence. In: Subrahmanian VS (ed) Handbook of Computational Approaches to Counterterrorism. Springer, New York. http://www.benjaminbagozzi.com/uploads/1/2/5/7/12579534/data-based-computational-approahes-to-forecasting-political-violence.pdf
20. Spiegel A (2014) So You Think You’re Smarter Than a CIA Agent, NPR. http://www.npr.org/blogs/parallels/2014/04/02/297839429/-so-you-think-youre-smarter-than-a-cia-agent
21. Taleb NN (2013) Fat Tails and Antifragility: Lectures on Probability, Risk and Decisions in the Real World, freely available web book. www.fooledbyrandomness.com
22. Taleb NN, Tetlock PE (2013) On the Difference between Binary Prediction and True Exposure, with Implications for Forecasting Tournaments and Prediction Markets, Social Science Research Network. http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2284964
23. Tetlock PE, How to win at forecasting, interview with introduction by Daniel Kahneman, Edge.org. http://edge.org/conversation/win-at-forecasting
24. Tetlock PE (2005) Expert Political Judgment: How Good Is It? How Can We Know? Princeton University Press, Princeton
25. Tetlock PE (2014) Presentation at the Good Judgment Project question formulation symposium, American University, May 2014
26. Tetlock PE, Mellers B (2014) Judging political judgment, Proceedings of the National Academy of Sciences, Vol. 111, Issue 32, 11574-11575. http://www.pnas.org/content/111/32/11574.full
27. Trappl R (ed) (2006) Programming for Peace: Computer-Aided Methods for Conflict Resolution and Prevention. Springer, The Netherlands
28. Wall M (2014) Weather report: forecasts improving as climate gets wilder, BBC News, 25 September web edition. http://www.bbc.co.uk/news/business-29256322
29. Ungar L, Mellers B, Satopaa V, Baron J, Tetlock PE, Ramos J, Swift S (2012) The Good Judgment Project: A Large Scale Test of Different Methods of Combining Expert Predictions, AAAI Fall Symposium Series. https://www.aaai.org/ocs/index.php/FSS/FSS12/paper/view/5570/5871