Maximizing Incremental Validity


It has always been in the interest of law
enforcement agencies to hire honest, hardworking folks.

Before modern credibility assessment methods
were integrated with police selection, agencies
prided themselves on their ability to identify
patterns of inappropriate behavior and unwanted
characteristics.

Traditional pre-employment screening
modalities: paper and pencil; background
investigation; interview(s).
Challenges we face in today’s hiring
environment.
The selection process, while sometimes robust in
nature, has yet to produce a system whereby we can
validate and predict all outcomes.
One problem with measuring effective police
behavior centers on the difficult nature of quantifying
“quality” while at the same time taking into account
the multiple dimensions of police work (Frank, et al.,
2008).
Most professions have a very clear mandate: college
professors educate, car salesmen sell cars, and airline
pilots fly planes. The fire department, the occupation
considered to be most similar to the police, also has a
very clear mandate: to prevent fires and to
extinguish as quickly and safely as possible those fires
they could not prevent (Skolnick & Fyfe, 1993).
Numerous studies indicate that the traditional answer
to the question of the police mandate, preventing and
controlling crime, is inadequate, as police officers
spend only 10% to 20% of their time on crime-related
activities (Scott, 1981; Wilson, 1968).
A Few Thoughts
While the need for policing has remained constant, the
process whereby officers are selected is rapidly changing.
Demands on police officers in the past 30 years have
grown dramatically with the increasing threats to social
order and personal security.
Police psychologists and those involved in credibility
assessment play an integral role in the
screening/selection process. Essentially, we are in the
risk assessment business.
As someone involved in the screening process, you should
be intimately involved in policy-making.
[Chart: categories 20-23, 24-30, 31-40, 41-50; annotated “Standard Deviation” and Σ=36]

If you had ever smoked marijuana or used
drugs (self-reported).

Any arrest for DWI/DUI.

Physical appearance (Height/Weight).

Poor character or associates that had
criminal histories.






Today, instead of “ever used drugs,” the
language is now, “We understand that it is
almost improbable for anyone not to have
used drugs at some time in their past.”
Example: CIA = 1 year
Some departments use a 3-year window (where
do you fall?).
Driving record should reflect prudence and
maturity.
Must never have been convicted of or received
community supervision for a Class A
Misdemeanor or Felony.
Must not have been convicted of or received
community supervision for a Class B
Misdemeanor within the past 10 years.
• Detection and Deterrence
• Credibility assessment tool that adds incremental
validity to investigative and evidentiary decisions
and risk assessment activities.
• Gathers intelligence that would otherwise be
unavailable
• Despite polygraph’s continued use, research fails to
capture the essence of why government agencies place
such trust in an instrument whose validity is continually
scrutinized across contexts (Krapohl, n.d.).
• Yet, with renewed intensity since September 11, 2001,
there have been enormous efforts expended by
governments and universities to continue the
century-old development of an accurate deception test
based on sound scientific principles (Hu, Hegeman,
Landry, & Rosenfeld, 2012).
• Published studies on the validity of polygraph
techniques report accuracy rates ranging from 70% to
90%, with confidence levels of 95%.

Context of Screening
Public demand (security blanket) – There is an inverse
relationship with public demand/perceived efficacy of
polygraph.
Contribution to Research - Many of the current
screening practices are the result of applied studies
utilizing applicant, examiner and testing format
information.
Enhances Incremental Validity – A predictor's ability to
explain an outcome, beyond all other predictors (Sage,
2017).
Rite of Passage
New Screening Research Suggests the Consideration of
Base Rate
What the heck is a base rate??
If we are conducting a specific issue test for our police
department or for an attorney, it would be the prior
probability the subject is actually Guilty of the crime. If
we are screening (i.e., PCSOT or public safety
pre-employment), it is still the prior probability of Guilt,
but can also be regarded as the proportion of the
examinees who are lying to one or more of the test
questions (Handler, 2017).
In a security screening setting, we hope the base rate of
espionage, terrorism, or sabotage is very, very low
(Handler, 2016).




Meehl and Rosen (1955) – sensitivity, specificity & base
rates
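◦ Sensitivity: how well the test detects Guilt (the proportion of
Guilty examinees who fail the test).
◦ Specificity: how well the test detects Innocence (the proportion
of Innocent examinees who pass the test).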
In the area of psychometrics essentially all tests have error.
The base rate can become so extreme that simply predicting
the base rate extracts all of the possible information from
the situation.
Learning how to account for, manage, and take advantage of
base rates helps understand the test results.
◦ Helps improve Information Gain.
◦ Helps understand the test result.
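The point about extreme base rates can be sketched in a few lines. This is only an illustration under stated assumptions: the 90% sensitivity and specificity and the three base rates are illustrative values, and the function names are hypothetical. It compares the overall accuracy of the test with the accuracy of ignoring the test and always calling the more common outcome.

```python
def overall_accuracy(sensitivity, specificity, base_rate):
    """Proportion of all examinees the test classifies correctly."""
    return sensitivity * base_rate + specificity * (1.0 - base_rate)


def majority_guess_accuracy(base_rate):
    """Accuracy of skipping the test and always predicting the more common class."""
    return max(base_rate, 1.0 - base_rate)


# Illustrative values only: a test that is 90% accurate with both groups.
for br in (0.50, 0.10, 0.02):
    with_test = overall_accuracy(0.90, 0.90, br)
    guess = majority_guess_accuracy(br)
    print(f"BR={br:.2f}  test={with_test:.2f}  always-guess-majority={guess:.2f}")
```

At a base rate of .02, always guessing "Innocent" is right 98% of the time, which is more often than the 90% test. That is the sense in which the base rate alone can carry most of the available information.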



1000 applicants: 500 Innocent and 500 Guilty.
The test is 90% accurate with both.
With equal base rates, your confidence in the test outcomes,
negative predictive value and positive predictive value (NPV &
PPV), directly mirrors the accuracy of the test.
Contingency Table with Equal Accuracy and Equal Base Rate of Guilt
& Innocence

Ground Truth          Pass Test    Fail Test    Totals
Innocent              450 (TN)     50 (FP)      500
Guilty                50 (FN)      450 (TP)     500
Totals                500          500          1000
Outcome Confidence    0.9 (NPV)    0.9 (PPV)
(NPV & PPV)



Here the target of the screening test is a
relatively rare event that occurs in only 10%
of the people (base rate, BR, of Guilt = .10).
The test has accuracy of 90% with both
Innocent & Guilty subjects.
Confidence in a “pass test” outcome is
extremely high (NPV = .99) but your
confidence in a “fail test” outcome is poor
(PPV = .50).
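The two scenarios (equal base rates, then BR of Guilt = .10) can be worked through with a short sketch. The function name `predictive_values` is hypothetical. The arithmetic is the standard definition: PPV is true positives over all fail-test outcomes, and NPV is true negatives over all pass-test outcomes.

```python
def predictive_values(sensitivity, specificity, base_rate):
    """Return (PPV, NPV) for a test applied at a given base rate of Guilt."""
    tp = sensitivity * base_rate                   # Guilty, fail test
    fn = (1.0 - sensitivity) * base_rate           # Guilty, pass test
    tn = specificity * (1.0 - base_rate)           # Innocent, pass test
    fp = (1.0 - specificity) * (1.0 - base_rate)   # Innocent, fail test
    ppv = tp / (tp + fp)   # confidence in a "fail test" outcome
    npv = tn / (tn + fn)   # confidence in a "pass test" outcome
    return ppv, npv


for br in (0.50, 0.10):
    ppv, npv = predictive_values(0.90, 0.90, br)
    print(f"BR={br:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

With the same 90% accuracy, moving the base rate from .50 to .10 drops PPV from .90 to .50 while NPV rises from .90 to .99, matching the figures on the slides.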


“Beauty is in the eyes of the beholder…”
◦ Extreme base rates can affect confidence in
outcome (NPV & PPV).
◦ There is no such thing as a perfect test.
◦ We can make thoughtful estimates of our
base rates.
We can also think about our testing goals
and adjust the base rates to achieve those
goals. But how?
We can use additional techniques as tools to adjust the
base rates.
Depends on testing goals.
There is a cost.
We should conduct a cost/benefit analysis first to
maximize utility.
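One way to picture "adjusting the base rate" is successive hurdles: the group that fails one tool has a higher base rate of Guilt going into the next. The sketch below makes two assumptions that are not from the source: the second tool is also 90% accurate, and the two tools err independently of each other, which may not hold in practice. The function name is hypothetical.

```python
def base_rate_after_fail(prior, sensitivity, specificity):
    """Base rate of Guilt among examinees who fail a tool (Bayes' rule)."""
    tp = sensitivity * prior
    fp = (1.0 - specificity) * (1.0 - prior)
    return tp / (tp + fp)


starting_base_rate = 0.10

# First hurdle: 90% accurate with both groups (the example used earlier).
after_first = base_rate_after_fail(starting_base_rate, 0.90, 0.90)

# Second hurdle: hypothetical second tool, assumed 90% accurate and
# assumed independent of the first.
after_second = base_rate_after_fail(after_first, 0.90, 0.90)

print(f"start={starting_base_rate:.2f}  "
      f"after first={after_first:.2f}  after second={after_second:.2f}")
```

Under these assumptions the base rate among those who fail both tools moves from .10 to .50 to .90. Each extra tool has a cost, so whether the gain is worth it is the cost/benefit question raised above.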
Most of the studies examining policing and
selection criteria find a relationship between
personality traits and negative predictors of
police performance and officer success (e.g.,
problem officers and poor performance)
(Sanders, 2003; White, 2008).
While polygraph screening outcomes hold
weight relative to the selection process, police
agencies have traditionally placed emphasis on
psychological traits.

The predictive validity of criteria commonly used
to screen applicants is a problem in the research
on police candidate selection. Police agencies
need to evaluate their various screening
methodologies in the multiple-hurdles approach
to police candidate selection.

The purpose of this quantitative study was to
examine whether the two sets of variables,
demographic profiles and pre-academy polygraph
screening results, were significant predictors of
police cadet attrition and training performance.

The odds of completing the academy
were 61.4% lower for a person that had
a polygraph result of “inconclusive or
deception indicated”, compared to a
person with a polygraph result of “no
deception indicated”.

The odds of completing the academy
decreased by 8.5% for every 1-year
increase in age.
The odds of a person with prior military service
completing the academy were 3.63 times greater than
the odds of completion for a person without prior
military service (a 263% increase in odds).
The odds of academy completion were 64.6% lower for
those with “some college” compared to those with a
different level of education (i.e., high school or GED,
Associate’s degree, or Bachelor’s degree).
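The percentages above are changes in odds, not in probability. A small sketch shows how they relate to odds ratios. The function names and the 80% baseline completion rate are hypothetical and used only for illustration.

```python
def percent_change_in_odds(odds_ratio):
    """Convert an odds ratio to a percent change in odds."""
    return (odds_ratio - 1.0) * 100.0


def odds_ratio_from_percent_change(percent_change):
    """Convert a percent change in odds back to an odds ratio."""
    return 1.0 + percent_change / 100.0


def apply_odds_ratio(baseline_probability, odds_ratio):
    """Shift a baseline probability by an odds ratio; return the new probability."""
    odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Prior military service: odds ratio 3.63 is a 263% increase in odds.
print(f"{percent_change_in_odds(3.63):.0f}% increase")

# Polygraph result: 61.4% lower odds corresponds to an odds ratio of 0.386.
polygraph_or = odds_ratio_from_percent_change(-61.4)
print(f"odds ratio = {polygraph_or:.3f}")

# Hypothetical 80% baseline completion rate, for illustration only.
print(f"completion probability = {apply_odds_ratio(0.80, polygraph_or):.2f}")
```

With the hypothetical 80% baseline, 61.4% lower odds takes the completion probability to roughly 61%, not to 80% minus 61.4 points. The size of the drop in probability depends on where the baseline sits.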
Cross-classification Table of Academy Completion Status by
Polygraph Test Result

                                                      Academy Completion Status
Polygraph Result                                      Unsuccessful   Successful   Total
No Deception Indicated    Count                       21             100          121
                          % within Polygraph Result   17.4%          82.6%        100.0%
                          Std. Residual               -1.3           .7
Inconclusive or           Count                       24             50           74
Deception Indicated       % within Polygraph Result   32.4%          67.6%        100.0%
                          Std. Residual               1.7            -.9
Total                     Count                       45             150          195
                          % within Polygraph Result   23.1%          76.9%        100.0%

Note: χ²(1) = 5.88; p = .015.
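The statistics in the cross-classification table can be recomputed from the four cell counts. The sketch below is a plain Pearson chi-square with standardized residuals; it is not necessarily the software or procedure used in the study. The p-value line uses the identity p = erfc(sqrt(χ²/2)), which holds only for 1 degree of freedom.

```python
from math import erfc, sqrt

# Cell counts taken from the table above.
observed = {
    "No Deception Indicated": {"Unsuccessful": 21, "Successful": 100},
    "Inconclusive or Deception Indicated": {"Unsuccessful": 24, "Successful": 50},
}


def chi_square_with_residuals(table):
    """Pearson chi-square and standardized residuals for a two-way table."""
    row_totals = {row: sum(cols.values()) for row, cols in table.items()}
    col_totals = {}
    for cols in table.values():
        for col, count in cols.items():
            col_totals[col] = col_totals.get(col, 0) + count
    grand_total = sum(row_totals.values())

    chi2 = 0.0
    residuals = {}
    for row, cols in table.items():
        for col, count in cols.items():
            expected = row_totals[row] * col_totals[col] / grand_total
            residuals[(row, col)] = (count - expected) / sqrt(expected)
            chi2 += (count - expected) ** 2 / expected
    return chi2, residuals


chi2, residuals = chi_square_with_residuals(observed)
p_value = erfc(sqrt(chi2 / 2.0))  # valid for 1 degree of freedom only

print(f"chi-square(1) = {chi2:.2f}, p = {p_value:.3f}")
for (row, col), z in residuals.items():
    print(f"{row} / {col}: std. residual = {z:+.1f}")
```

Run on these counts, the sketch gives χ²(1) = 5.88 and p = .015, with residuals of -1.3, +0.7, +1.7 and -0.9, in line with the table.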
Incremental Validity (fancy word for
increasing the predictive ability of our testing)
Conscientiousness has been correlated with high levels
of job performance.
As a construct it relates to the degree of organization,
control, and motivation one holds in goal-directed
behavior (Sarchione et al., 1998).
Individuals exhibiting high levels of conscientiousness
tend to be organized, reliable, hard-working,
self-governing, thorough, persevering, and tend to have a
great amount of integrity.
Is this not what we are pursuing?
DQ (-2) __________ Intermediate Spectrum (-1, 0, +1) __________ FA (+2)
Conventional police polygraph and psychological appraisal outcomes fall into
two specific categories. The first, and perhaps most utilized, outcome is
disqualification, whereby the applicant fails to meet established thresholds (e.g.,
successive hurdles). The second is full acceptance (conditional job
offer). While the decision to remove applicants for “face value” findings such as
serious crime and pathology is common sense, little is known regarding “the grey
space,” which often serves as an impetus for disqualification and/or
non-selection. Police agencies often rely on undefined characteristics
that eliminate an otherwise qualified applicant. This approach largely
undermines the efficacy of the screening process. The Park Credibility Assessment
Screening Continuum is an exploratory theory which posits the inclusion of an
Intermediate Spectrum outcome, advancing a new approach to police candidate
assessment. It suggests that positive, previously unexploited
relationships exist among credibility assessment results, psychological
profiles, and cognitive interviews, and argues that incremental validity can be
maximized by incorporating a broad yet robust scoring
continuum. By integrating credibility assessment scores with police psychologist
findings, it is possible to produce a broader, holistic score which ultimately
enhances incremental validity.
Avoiding a porous screening process
Major Nidal Malik Hasan was sentenced to death
for killing 13 people and wounding 32 others in a
2009 shooting rampage at Fort Hood.
Omar Mateen, 29, of Fort
Pierce, Florida
An American-born man employed with G4S Secure
Solutions who'd pledged allegiance to ISIS gunned down 49
people early Sunday at a nightclub in Orlando, the deadliest
mass shooting in the United States and the nation's worst
terror attack since 9/11.
Terrorism

functional Magnetic Resonance Imaging
(fMRI)

Brain Fingerprinting (P300)

EyeDetect
• fMRI works by exploiting the fact that the nucleus of a hydrogen
atom behaves like a small magnet. Using the phenomenon of
nuclear magnetic resonance (NMR), the hydrogen nuclei can be
manipulated so that they generate a signal that can be mapped
and turned into an image (UC San Diego School of Medicine,
2015).
• Published functional MRI (fMRI) data on the brain activity during
deception indicates that, on a multi-subject group level, a lie is
distinguished from truth by increased prefrontal and parietal
activity.
• These findings are theoretically important; however, their
applied value will be determined by the accuracy of the
discrimination between single deceptive and truthful responses
in individual subjects (Langleben, Loughead, Bilker, Ruparel,
Childress, Busch, & Gur, 2005).
• The term ‘‘brain fingerprinting’’ is based on the defining
feature of matching something on the person of the suspect
with something from the crime scene (Farwell, Richardson, &
Richardson, 2003).
• The P300 wave is a positive deflection in the human
event-related potential. It is most commonly elicited in an "oddball"
paradigm when a subject detects an occasional "target"
stimulus in a regular train of standard stimuli. The P300 wave
only occurs if the subject is actively engaged in the task of
detecting the targets (Picton, 1992).
[Figure: CIA Real Life Study brain responses, Information-Absent Subject]
[Figure: FBI Agent Study brain responses, Information-Present Subjects]
_____________________________________________________________
Outcome                         Number    Percent
Correct positives               19        100%
Correct negatives               2         100%
Total correct determinations    21        100%
False positives                 0         0%
False negatives                 0         0%
Indeterminates                  0         0%
Accuracy                        21/21     100%
Error rate                      0/21      0%
_____________________________________________________________
Note: Error rate was 0%; determinations were 100%
accurate, with no false negatives, false positives, or
indeterminates. Countermeasures had no effect. Median
statistical confidence for determinations was 99.9%.
Cognitive load can be defined as a multidimensional
construct representing the load that performing a
particular task imposes on the learner’s cognitive system
(Paas & van Merriënboer, 1994).
While cognitive load has importance relative to criminal
interrogation (confession), in police screening contexts
we must continually evaluate the concept of cognitive
load due to the nuances of simultaneously examining
multiple issues.
To what degree does cognitive load impact our ability as
examiners to effect an accurate testing process?