The moderating effects of sample type as evidence of the effects of

Psychology Science, Volume 48, 2006 (3), p. 313-335
The moderating effects of sample type as evidence of the effects of faking
on personality scale correlations and factor structure
KEVIN M. BRADLEY1 & NEIL M. A. HAUENSTEIN
Abstract
Motivational differences as a function of sample type (applicants versus incumbents)
have frequently been suspected of causing meaningful differences in the psychometric properties of personality inventories due to the effects of faking. In this quantitative review,
correlations among the Big Five personality constructs were estimated and sample type was
examined as a potential moderator of the personality construct inter-correlations. The resulting subgroup meta-analytic correlation matrices were factor-analyzed, and the second order
factor solutions for job incumbents and job applicants were compared. Results of the metaanalyses indicate frequent, but small moderating effects. The second order factor analyses
indicated that the observed moderation had little effect on the congruence of factor loadings.
Together, the results are consistent with the position that faking is of little practical consequence in selection settings.
Key words: personnel selection, faking, sample type
1
Kevin Michael Bradley, Human Resources Research Organization, and Neil M. A. Hauenstein, Virginia
Tech.
This manuscript is based on the first author’s doctoral dissertation, completed under the guidance of the
second author. Portions of this paper were presented at the 19th annual meeting of the Society of Industrial /
Organizational Psychology, Chicago, Illinois.
The authors would like to acknowledge Roseanne Foti, Kevin Carlson, John Donovan, and Jack Finney. We
would also like to express our gratitude to the numerous researchers who shared their data.
Correspondence concerning this paper should be addressed to the first author at Human Resources Research
Organization, 66 Canal Center Plaza, Suite 400, Alexandria, Virginia 22314-1591. Electronic mail may be
sent to [email protected].
314
K.M. Bradley & N.M.A. Hauenstein
Despite the commonsense notion that there is a relationship between personality and job
performance (Kluger & Tikochinsky, 2001), the predictive accuracies of personality test
scores have never reached the levels implied by commonsense (Hurtz & Donovan, 2000).
Questions continue to be raised that this failure to meet expectations is because job applicants present themselves in an overly favorable manner by either deliberately enhancing
their positive attributes and/or denying negative attributes, or through a less calculated selfdeceptive enhancement (Dunnette, Carlson, McCartney, & Kirchner, 1962; Paulhus & Reid,
1991; Rosse, Stecher, Miller, & Levin, 1998). This overly favorable responding is referred
to, alternatively, as faking, dissimulation, response distortion, socially-desirable responding
(SDR), and impression management (IM).
A common strategy used in the debate about the effects of faking is to compare various
indicators of the psychometric quality of personality measures in applicant and incumbent
samples (Michaelis & Eysenck, 1971; Robie, Zickar, & Schmit, 2001; Schmit & Ryan, 1993;
D. Smith & Ellingson, 2002; Smith, Hanges, & Dickson, 2001; Van Iddekinge, Raymark,
Eidson, & Putka, 2003; Weekley, Ployhart, & Harold, 2003; Zickar, Gibby, & Robie, 2004).
However, the findings of these studies are inconsistent in that sometimes applicant / incumbent differences are found and sometimes they are not found. The current study attempts to
provide greater closure on this debate by testing sample type (i.e., incumbents versus applicants) as a moderator of meta-analytic estimates of the inter-correlations among common
personality constructs. Furthermore, these meta-analytic correlation matrices were subjected
to second order factor analyses2 as a way to calibrate the practical effects of any observed
sample type moderation.
There are a number of lines of evidence that indicate applicants present themselves more
favorably than is warranted. First, there is anecdotal evidence wherein the job application
process is frequently used as an example of a situation likely to foster impression management (see, for example, Bozeman & Kacmar, 1997, p. 12; Jones & Pittman, 1982, p. 245;
Leary & Kowalski, 1990, p. 38; Tedeschi & Melburg, 1984, p. 41). Second, empirical evidence clearly shows job applicants present more favorable personality profiles than nonapplicants (Green, 1951; Heron, 1956; Hough, 1998; Robie et al., 2001; Rosse et al., 1998;
Smith et al., 2001; Stewart, 1997). Third, Donovan, Dwight, and Hurtz (2003) used the
Randomized-Response Technique to survey deceptive behaviors among job-applicants.
Results from that study indicated 32% of job-applicants exaggerated personal traits to make
themselves look better; 47% exaggerated attributes such as dependability or reliability; and,
62% de-emphasized negative attributes when applying for a job. Although fewer applicants
in the Donovan et al. (2003) study reported giving responses that were completely made-up
(15%), it is evident that applicants do embellish the truth.
2
Much of the empirical evidence in the debate about faking is based on first order factor analyses. The current study reports the results of second order factor analyses. To avoid confusion, we will use the phrase
“second order” where appropriate.
Moderation by sample type
315
Faking and scale inter-correlations and factor structure
Although the effect of faking on mean personality scale scores is well established (e.g.,
Heron, 1956; Hough, 1998; Robie et al., 2001; Rosse et al., 1998; Smith et al., 2001), it is
not clear that faking affects other measurement properties, including the magnitude of the
inter-correlations among personality dimensions and the resultant factor structure. Several
studies found no differences in the psychometric quality of personality measures as a function of sample type (Ellingson, Smith, and Sackett, 2001; Smith & Ellingson, 2002; Smith et
al., 2001). For example, Smith et al. (2001) using the HPI examined the factor structure of
personality across three groups of respondents: students, job incumbents, and job applicants.
They found that the five-factor model fit each sample of data when examined independent of
the other samples. Further, when the factor structure, factor loadings, factor variances and
covariances, and error variances were constrained to be equal across groups, there was not a
significant decrement in model fit. Smith et al. (2001) concluded that the five-factor model is
robust with respect to testing context, and that investigations of response dynamics among
job applicants ought to move beyond investigations of personality structure.
In contrast, there are several studies that provide evidence that faking affects psychometric properties beyond scale means. Conceptual arguments drawn from the impression management (IM) literature suggest that personality scale inter-correlations should be inflated by
SDR potentially resulting in misleading factor analysis solutions. Specifically, IM leads
people to claim desirable images while downplaying negative images (Leary & Kowalski,
1990). In the context of personality tests, this would manifest itself in an increased likelihood
of endorsing favorable attributes and a decreased likelihood of endorsing negative attributes,
resulting in a polarization of responses. At the extreme, personality profiles would fail to
represent the traits intended to be measured; rather, personality scale scores would reflect
little more than a social desirability response set (Edwards, 1957). Although there is little
evidence of such an extreme effect of social desirability as a response set (Block, 1965),
there is evidence that SDR inflates the inter-correlations among trait dimensions resulting in
the extraction of fewer factors than when compared to honest responders (Douglas, McDaniel, & Snell, 1996; Ellingson, Sackett, & Hough, 1999; Frei, 1998).
A less extreme argument is that scale inter-correlations are inflated by faking, but the inflation is not strong enough to affect the number of extracted factors. Weekley et al. (2003)
compared the factor structure for incumbents and applicants and found that while the factor
form was the same, the magnitude of the factor loadings differed across groups. In fact,
when only 25% of the factor loadings were constrained to be equal (a test of partial invariance), the fit of the multiple-groups model was still poor. Weekley et al. (2003) concluded
that the factorial validity of personality inventories is adversely affected in applicant settings.
A less popular position is that faking produces more factors than when compared to honest responders. Schmit and Ryan (1993) argued that the selection setting may activate cognitive schemas among applicants that differ from the cognitive schemas activated among persons who complete the inventory on a voluntary basis. Applicants wish to convey workrelated competence relative to other applicants, and therefore might operate according to an
ideal-employee frame-of-reference. Incumbents, on the other hand, may enact a strangerdescription frame-of-reference, where they communicate basic information as they would
during an initial meeting with a stranger (Schmit & Ryan, 1993, p. 967). They compared the
factor structure of a big-five measure in applicants and students and found a five-factor solu-
316
K.M. Bradley & N.M.A. Hauenstein
tion fit the student sample. However, they found a sixth “ideal employee” factor in the applicant sample. Schmit and Ryan (1993) concluded the five-factor model may not be the appropriate level of specificity required for prediction in job applicant settings.
To summarize, there are currently four positions regarding the effects of faking on scale
inter-correlations and factor structure: 1) Faking does not affect scale inter-correlations and
factor structure, 2) SDR inflates scale inter-correlations to the point that fewer factors than
should be are extracted, 3) SDR inflates scale inter-correlations but this inflation only results
in biased estimates of the factor loadings, and 4) frame-of-reference responding by applicants produces more factors than should be extracted, and by extension this argument implies
that faking likely reduces the average scale inter-correlations relative to when responders are
being honest.
Methods and analyses used to study faking
Empirical evidence is complicated by the fact that different methods are used to study
faking and there are also disagreements about statistical analyses. A common strategy in
faking research is to manipulate instruction sets using either between- or within-group designs. These instructional sets often take the form of an “Honest” response set (i.e., “Please
present yourself as you truly see yourself when responding to this personality inventory”), a
“Fake Good” response set (i.e., “Please present yourself so that you come across as an extremely virtuous person”), and/or a “Desirable Job Applicant” response set (i.e., “Please
present yourself so that you would maximize your chances of being selected by the hiring
organization”). Such “directed faking” studies have found manipulated faking response sets
inflate dimension inter-correlations and degrade the construct-validity of personality inventories (Douglas, et. al., 1996; Ellingson, et. al., 1999; Frei, 1998).
It has been argued that directed faking studies are not an accurate reflection of the selfpresentation process that test-takers utilize in natural (job screening) settings (Griffin, Hesketh, & Grayson, 2004; Hough, 1998; Hough & Ones, 2001). Therefore, the generalizability
of results from directed faking studies has been called into question. As an alternative to the
directed faking study, many inquiries have relied on social-desirability, impression management, or lie scales as indicators of SDR or faking. In investigations of the effect of SDR on
the factor structure of personality measures, samples are often split into high and low faking
groups on the basis of social-desirability scale scores. Subsequently, the factor structure of
the personality measure in high and low faking groups is compared. Ellingson et al. (2001)
brought together four large data sets; in each data set, the sample was split into those respondents presenting a high versus a low degree of socially desirable responding. Across the four
samples, the results were consistent in showing that the factor structure was not meaningfully influenced by socially desirable responding.
However, the reliance on social desirability scales as indicators of self-presentation dynamics in personality testing has also been questioned. Stark, Chernyshenko, Chan, Lee, and
Drasgow (2001) noted that previous research efforts had failed to examine the possibility
that the psychometric properties of social desirability scales could be influenced by situational factors (such as the purpose of testing). Using Item Response Theory and Differential
Item Functioning to examine the measurement properties of social desirability scales in
applicants and non-applicants, they found that social desirability scales are markedly influ-
Moderation by sample type
317
enced by testing conditions. They went on to conclude that generalizing results of inquiries
that utilize high and low faking groups (as defined by impression management or social
desirability scale scores) across populations is not warranted. Instead, Stark et al. argued that
a more fruitful approach is to compare the psychometric properties for naturally occurring
groups, such as incumbents and applicants.
In regards to statistical analyses, Smith and his colleagues (Smith et al., 2001; Smith &
Ellingson, 2002) have suggested that studies finding differences may have drawn erroneous
conclusions due in part to a failure to account for a lack of multivariate normality in the data.
Contrarily, Weekley et al. (2003) suggest Smith et al. (2001) might have failed to find differences because Smith et al. used fit indices that were potentially inappropriate. Yet, Smith
et al. (2001) used the same fit indices in their research as have been used by other researchers examining differences across sample types, including Weekley et al. (2003) in the very
study wherein the claim is made that the fit indices utilized by Smith are inappropriate.
To summarize, it appears that the best way to study faking is to use naturally occurring
groups, and not to rely on lie scales to differentiate “honest” responders from “fakers”. Furthermore, it is clear that the use of confirmatory analytic strategies adds a layer of complexity to the understanding of the results.
Current study
The current study uses meta-analysis to address the moderating effects of incumbents
versus applicants on scale inter-correlations and second order factor solutions. Examining
the meta-analytic correlations generated from applicant and incumbent samples is consistent
with the recommendation to study faking in naturally occurring groups, and to do so without
relying on social desirability scales (Stark et. al, 2001). If any of the three positions that
argue faking has meaningful effects on scale inter-correlations and factor structure are valid,
then sample type will moderate the meta-analytic estimates of scale inter-correlations.
However, the criteria used to detect moderation in meta-analyses are not stringent. To
further calibrate the practical effects of potential sample type moderation, we factor analyzed
the meta-analytic correlation matrices (cf. Viswesvaran & Ones, 1995). Based on the conflicting evidence regarding the effects of faking, we did not expect that the number of second
order constructs extracted from incumbent samples would differ from the number extracted
from applicant samples. Instead, our focus was on whether the moderating effect of sample
type was strong enough to diminish the congruence of the factor loadings. Exploratory factor
analysis is more appropriate for assessing congruence across all possible factor loadings, and
the use of exploratory factor analysis has the added advantage of avoiding the aforementioned controversies surrounding the use of confirmatory analytic strategies.
318
K.M. Bradley & N.M.A. Hauenstein
Method
Study identification and inclusion criteria
A computerized database search of the terms “personality or temperament or dispositions” and “job performance or occupational success” was completed in September 2001,
using the PsycLit, National Technical Information Service, and ERIC databases. Studies
were also identified by hand-searching the 1991 through 2002 volumes of Journal of Applied
Psychology, Personnel Psychology, Human Performance, Journal of Business and Psychology, Journal of Occupational and Organizational Psychology, Academy of Management
Journal, Journal of Management, Journal of Organizational Behavior, Journal of Vocational Behavior, and Educational and Psychological Measurement. A less inclusive search
was conducted of Leadership Quarterly (1995 – 2002), Administrative Science Quarterly
(1991 – 1997), Organizational Behavior and Human Decision Processes (1991 – 1992; 2000
– 2001), and International Journal of Selection and Assessment (1998 – 2001). A manual
search of all studies published in the Validity Information Exchange of Personnel Psychology was also conducted. Next, programs from the 1996 through 2002 Annual Conferences of
the Society for Industrial and Organizational Psychology were searched to identify additional studies to include in the current review. Finally, a number of test publishers, applied
researchers, assessment specialists, and consultants were contacted in order to locate unpublished technical reports and unpublished data collected in conjunction with selection and
validation projects completed by these firms.
Authors of studies failing to report complete information were contacted, and any supplemental information provided was coded. If only significant correlations were reported and
the author could not provide additional data, the available correlations were included in the
meta-analyses. In some cases it was unclear if the data presented in a manuscript overlapped
with data that had previously been presented in another manuscript. Authors were contacted
for clarification on this matter; if the author did not provide a definitive response, the first
author of this manuscript made the final decision. If a sample of data appeared in multiple
studies, only one of those studies was included.
Studies were only included in the meta-analyses if they: a) were conducted in a workplace setting; b) were reported in English; and, c) presented sufficient information to compute a bivariate correlation between at least two of the Big Five personality constructs. We
used the Hough and Ones (2001) taxonomy to categorize the personality scales into Big Five
factors.
The original intent was to conduct as comprehensive a meta-analysis as would be possible. To this end, studies that reported at least one correlation between big five indicators
were to be included in the meta-analyses. When these analyses were conducted, the distribution of population correlations was extremely variable (e.g., SDρ values were large). Much of
this variability was traced to differences in the population correlations as estimated by distinct personality inventories. For example, based on the studies identified in this effort, the
sample weighted observed correlation between Extraversion and Conscientiousness was
estimated as, alternatively, r = 0.32 (including only studies that utilized the NEO-FFI), r =
0.21 (studies that utilized the California Psychological Inventory), and r = 0.00 (studies that
utilized the 16PF). This finding underscored the difficulty in categorizing extant personality
scales into a taxonomy such as the Big Five.
Moderation by sample type
319
In addition to the finding that correlations between personality constructs are influenced
by the operational definition of those personality attributes, personality inventories were
disproportionately represented in incumbent and applicant samples. For example, among the
studies identified initially, the MMPI and the MMPI-2 appeared in 19 studies and 15 of those
(79%) utilized an applicant sample. On the other hand, the NEO-PI, the NEO-PI-R, and the
NEO-FFI were used in a total of 60 studies: seven (12%) were studies of job applicants. The
confound between sample type and inventory, combined with the evident moderating effect
of inventory on population correlations, led us to choose to conduct meta-analyses that examined the inter-correlations for specific Big Five instruments. Only two Big Five instruments, the NEO-FFI and the HPI, provided enough samples in both subgroups to test the
moderating effects of sample type. There were four and three applicant samples for the
NEO-FFI and HPI, respectively.
It should be noted that for the HPI data, 10 out of the 30 applicant inter-scale correlations
came from Smith and Ellingson (2002) who used a version of the HPI that measured each
Big 5 trait using the 20 items with the greatest homogeneity for each factor.
Coding of sample type
The incumbent samples consisted of already employed individuals who were voluntarily
participating in research studies conducted by their employer or by an external researcher.
All of the NEO-FFI “applicant” data were from registrants with either an unemployment
office or a recruiting firm, and they completed the NEO-FFI during the registration process.
Participants in these samples might be better characterized as “job seekers” than job applicants, given they responded to the NEO-FFI without reference to a specific job. However,
these job seekers exhibited inflated Big 5 scale mean scores as typically seen with applicant
samples, and were higher than the incumbent scale means, clearly suggesting the motivation
of these job seekers was similar to that of job applicants. Furthermore, faking is usually
attributed to socially desirable responding but there is also the issue of faking in a manner
consistent with the stereotypical expectations of the job in question (Ones, Viswesvaran, &
Reiss, 1996). The fact that these job seekers were not applying to a specific job title suggests
that any moderation due to sample type is caused only by greater socially desirable responding on the part of job seekers. The applicant samples for the HPI were from applicants applying to specific positions in the hiring organization. As expected, the applicant means for the
most socially desirable scales (Adjustment and Prudence) on the HPI were significantly
higher than the corresponding incumbent means.
Meta-analytic method and computation of correlation coefficients
For each unique sample in a study, Pearson bivariate correlations between indicators of
each personality construct were recorded. The artifacts of sampling error and scale unreliability (where possible) were corrected for in the meta-analyses. The weighted average correlation was used in the estimation of sampling error variance (Hunter & Schmidt, 1994).
For the NEO-FFI, internal consistency estimates reported in the studies were used to produce
an average coefficient alpha weighted by sample size. The weighted coefficient alphas for
320
K.M. Bradley & N.M.A. Hauenstein
incumbents (applicants) were: .81 (.79) for Neuroticism, .75 (.74) for Extraversion, .69 (.72)
for Openness, .72 (.66) for Agreeableness, and .80 (.87) for Conscientiousness. The
weighted average alpha coefficients for the incumbent samples using the HPI were: .80 for
Neuroticism, .69 for Extraversion, .67 for Openness, .60 for Agreeableness, and .63 for
Conscientiousness. We were unable to ascertain reliability estimates for the applicant samples using the HPI, and the technical manuals for the HPI do not report reliabilities for applicant samples, therefore, we did not correct for scale unreliability for the HPI meta-analyses.
The reliability data from the NEO-FFI studies suggest that sample type is not a strong
moderator of scale reliability. The difference in the reliability estimates were no more than
.03 for three of the five scales. For the two scales with the larger discrepancies, reliability
was greater for incumbents for Agreeableness, and reliability was greater for applicants for
Conscientiousness.
We chose not to correct for range restriction for a number of reasons. In the end, the primary reason that incumbent parameter estimates were not corrected for range restriction was
the lack of evidence that incumbent sample variances were restricted relative to the incumbent populations in the primary studies. The main reason that applicant parameter estimates
were not corrected for range restriction was the unavailability of unrestricted applicant population variances in primary studies and published norms.
Methods for testing moderator effects
Tests for moderation used two criteria. First, the traditional 75% rule was used (Hunter &
Schmidt, 1990). The initial meta-analysis for both personality tests collapsed over sample
type. In this overall meta-analysis, no moderation by sample type was concluded for any
scale where sampling error accounted for at least 75% of the variance in the distribution of
observed correlations. For those scales where less than 75% of the variance was accounted,
moderation by sample type was tested using the second moderator criterion suggested by
Hunter & Schmidt (1990). That is, a moderator is present if the average correlation varies as
a function of the moderator variable, and if the average of the corrected variances within the
categories of the moderator variable are less than the corrected variance when the moderator
variable is ignored.
Factor analyses
Following the derivation of the meta-analytic correlation matrices, the resulting 5-by-5
matrices were factor analyzed with the principal axis factor extraction method using SAS.
Squared multiple correlations served as initial communality estimates. Scree plots and parallel analysis were used to determine the appropriate number of factors to extract (Montanelli
& Humphreys, 1976). Following the extraction of factors, a Harris-Kaiser oblique rotation
was conducted. A Harris-Kaiser power coefficient of .25 was utilized in the oblique rotation.
This value tends toward the identification of a solution that enhances simple structure.
Moderation by sample type
321
Results
Results of the NEO-FFI are presented in Table 1. The Neuroticism-Conscientiousness
overall correlation (ρ = -.47) was the only relationship where no moderation was indicated
based on the 75% criterion. The Neuroticism-Agreeableness pairing (overall ρ = -.34) failed
the second criterion for moderation in that the average variance of the subgroups (Average
SDρ = .085) was greater than the variance (SDρ = .080) when collapsing over sample type.
The other eight pairings of scales met the technical criterion for moderation by sample
type, but the moderation effects were trivial to small in strength. The three relationships with
the strongest moderation all involved Openness: Extraversion-Openness (applicant ρ = .31,
SDρ = .00; incumbent ρ = .20, SDρ = .00), Neuroticism-Openness (applicant ρ = -.17, SDρ =
.04; incumbent ρ = -.07, SDρ = .14), and Openness-Conscientiousness (applicant ρ = .03,
SDρ = .02; incumbent ρ = -.05, SDρ = .12). For these three scale pairings, the correlation was
stronger in the applicant sample for two pairings, and the sign of the relationship switched in
the third pairing. As to the five other trait pairings where moderation was weaker, the relationships were stronger in the applicant sample for Neuroticism-Extraversion (applicant ρ = .53, SDρ = .00; incumbent ρ = -.49, SDρ = .06), and Openness-Agreeableness (applicant ρ =
.14, SDρ = .00; incumbent ρ = .10, SDρ = .08). In contrast, the relationships were stronger in
the incumbent sample for Agreeableness-Conscientiousness (applicant ρ = .26, SDρ = .13;
incumbent ρ = .31, SDρ = .12), Extraversion-Agreeableness (applicant ρ = .31, SDρ = .09;
incumbent ρ = .35, SDρ = .10), and Extraversion-Conscientiousness (applicant ρ = .41, SDρ
= .00; incumbent ρ = .43, SDρ = .08)
Although moderation by sample type was detected in eight of the ten trait pairs, the largest absolute difference between the applicant and incumbent correlations was .11, and the
absolute differences of five of the eight trait pairings was .05 or less. It is clear the moderating effect of sample type on the NEO-FFI is at most a small effect, without a biasing direction in term of the relationships being stronger in one sample type versus the other.
For the HPI, the Openness-Agreeableness overall correlation (r = .17; for the HPI, we
use r instead of ρ because the HPI correlations were not corrected for scale unreliability) was
the only relationship where no moderation was indicated based on the 75% criterion. The
Neuroticism-Openness pairing (overall r = -.14) failed the second criterion for moderation
because the overall correlation did not change as a function of sample type. The Neuroticism-Extraversion trait pair (r = -.05) also failed the second criterion for moderation in that
the average variance of the subgroups (Average SDr = .12) was not less than the variance
(SDr = .12) when collapsing over sample type.
Seven trait pairs showed moderation by sample type, but all the differences were again
trivial to small. Four of these correlations were stronger among applicants, including: the
correlation between Neuroticism and Agreeableness (applicant r = -.35, SDr = .12; incumbent r = -.30, SDr = .18); the correlation between Neuroticism and Conscientiousness (applicant r = -.44, SDr = .10; incumbent r = -.39, SDr = .12); the correlation between Extraversion
and Conscientiousness (applicant r = -.17, SDr = .02; incumbent r = -.14, SDr = .19); and,
the correlation between Openness and Conscientiousness (applicant r = .08, SDr = .00; incumbent r = .00, SDr = .12).
Table 1:
Meta-analysis Results for Correlations Between Predictors for NEO-FFI
322
K.M. Bradley & N.M.A. Hauenstein
Note: k = number of studies; N = total sample size; r = weighted average observed correlation; 90% CIL = Lower 90% confidence limit of observed
correlation estimate; 90% CIU = Upper 90% confidence limit of observed correlation estimate; σ2OBS = variance in observed correlations; σ2SE =
variance attributable to sampling error; % σ2OBS: SE = percentage of observed variance attributable to sampling error; σ2 = variance corrected for
sampling error; ρ = weighted average observed correlation corrected for unreliability; SDρ = standard deviation of correlations corrected for sampling error and scale unreliablility; 90% CVL = Lower 90% credibility limit; 90% CVU = Upper 90% credibility limit.
Moderation by sample type
323
Table 2:
Meta-analysis Results for Correlations Between Predictors for HPI
324
K.M. Bradley & N.M.A. Hauenstein
Note: k = number of studies; N = total sample size; r = weighted average observed correlation; 90% CIL = Lower 90% confidence limit of observed
correlation estimate; 90% CIU = Upper 90% confidence limit of observed correlation estimate; σ2OBS = variance in observed correlations; σ2SE =
variance attributable to sampling error; % σ2OBS: SE = percentage of observed variance attributable to sampling error; σ2 = variance corrected for
sampling error; SDr = standard deviation of correlations corrected for sampling error; 90% CVL = Lower 90% credibility limit; 90% CVU = Upper
90% credibility limit.
Moderation by sample type
325
326
K.M. Bradley & N.M.A. Hauenstein
Three correlations were stronger in incumbent samples: the correlation between Extraversion and Openness (incumbent r = .38, SDr = .11; applicant r = .35, SDr = .05); the correlation between Extraversion and Agreeableness (incumbent r = .22, SDr = .05; applicant r =
.15, SDr = .00); and, the correlation between Openness and Agreeableness (incumbent r =
.18, SDr = .02; applicant r = .13, SDr = .02). The absolute difference between subgroup
correlations were less than or equal to .08. As such, there is evidence of sample type acting
as a moderator of seven of the ten trait parings for the HPI, but as seen with the NEO-FFI the
magnitude of the moderating effect is generally small.
For both personality instruments, moderation effects were small, but they were slightly
stronger for the NEO-FFI. However, it is not surprising that the NEO-FFI moderation is
stronger because only the NEO-FFI scales were corrected for scale unreliability. The correction for unreliability slightly amplifies the magnitude of the differences between the relationships as a function of sample type. Using the weighted average observed correlations to
assess moderation when using the NEO-FFI produces subgroup difference magnitudes that
are similar to those reported for the HPI.
Second order factor analyses
The purpose of the second order factor analyses was to assess the congruence of the factor loadings as a function of sample type. Both subgroup correlation matrices (See Table 3
and Table 4) for both personality measures were factor analyzed. For both the applicant and
incumbent matrices based on the NEO-FFI, parallel analysis and inspection of scree plots
indicated that retaining two factors would be sufficient to account for the correlation matrices.
The extraction of two factors from both subgroups resulted in a reproduced correlation
matrix that provided reasonable estimates of the observed correlation matrices. The overall
root mean squared residual (RMSR) was .04 for both subgroups. Marcoulides and Hershberger (1997) suggest that RMSR values less than .05 indicate a strong fit between the data and
the factor analysis model. The rotated factor pattern and factor structure matrices for the
NEO-FFI are presented in Table 5. Pattern matrix coefficients represent loadings of the
personality traits on each latent factor, controlling for other latent factors. Pattern matrix
coefficients are analogous to standardized regression coefficients relating the latent factor to
each personality trait. Structure matrix coefficients represent correlations between the personality traits and the latent factors.
Examination of the pattern matrices for the NEO-FFI shows similarities between the applicant and incumbent factor loadings. Examining the 10 pairs of pattern matrix coefficients
(five personality variables by two latent factors) in Table 5, it can be seen that the average
difference between corresponding (incumbent versus applicant) factor loadings is .067. Five
of the 10 coefficients differed by .04 or less, two coefficients differed by more than .05 but
less than .10, and three were between .11 and .13. The resulting factor structures were compared using the coefficient of congruence (Burt, 1948), as reported in formula one of
Guadagnoli and Velicer (1987). This index, computed for each latent factor, indicates the
similarity between the pattern of factor loadings across two samples. Mulaik (1972) reported
that it is common to accept two factors as equivalent when the congruence coefficient asso-
Moderation by sample type
327
Table 3:
NEO-FFI Meta-analytic Correlation Matrices Utilized in Exploratory Factor Analyses
N
Neuroticism
Extraversion
Openness to Experience
Agreeableness
Conscientiousness
-.49
-.07
-.36
-.47
E
-.53
.20
.35
.43
O
-.17
.31
.10
-.05
A
-.33
.31
.14
.31
C
-.48
.41
.03
.26
-
Note: N = Neuroticism; E = Extraversion; O = Openness to Experience; A = Agreeableness; C = Conscientiousness. Incumbent subgroup correlations appear below the
diagonal; applicant subgroup correlations appear above the diagonal. Correlations
corrected for scale unreliability.
Table 4:
HPI Meta-analytic Correlation Matrices Utilized in Exploratory Factor Analyses
Neuroticism
Extraversion
Openness to Experience
Agreeableness
Conscientiousness
N
.02
-.14
-.30
-.39
E
.01
.38
.22
-.14
O
-.14
.35
.18
.00
A
-.35
.15
.13
.25
C
-.44
-.17
.08
.27
-
Note: N = Neuroticism; E = Extraversion; O = Openness to Experience; A = Agreeableness; C = Conscientiousness. Incumbent subgroup correlations appear below the
diagonal; applicant subgroup correlations appear above the diagonal. Correlations
not corrected for scale unreliability.
Table 5:
NEO-FFI Factor Pattern and Factor Structure Matrices for Factor Analyses Based on Incumbent
and Applicant Meta-Analytic Correlation Matrices
Personality Trait
Neuroticism
Extraversion
Openness to Experience
Agreeableness
Conscientiousness
Incumbents
Pattern
Structure
F1
F2
F1
F2
-.67
-.02
-.67
-.27
.58
.20
.65
.41
-.04
.38
.10
.36
.46
.10
.49
.27
.67
-.14
.62
-.11
Applicants
Pattern
Structure
F1
F2
F1
F2
-.64
-.10
-.70
-.49
.45
.33
.65
.60
-.08
.47
.21
.42
.35
.13
.43
.35
.67
-.11
.61
.30
328
K.M. Bradley & N.M.A. Hauenstein
Table 6:
HPI Factor Pattern and Factor Structure Matrices for Factor Analyses Based on Incumbent and
Applicant Meta-Analytic Correlation Matrices
Personality Trait
Neuroticism
Extraversion
Openness to Experience
Agreeableness
Conscientiousness
Incumbents
Pattern
Structure
F1
F2
F1
F2
-.57
-.03
-.57
-.11
-.09
.58
-.01
.57
.07
.51
.14
.52
.41
.27
.44
.33
.57
-.14
.55
-.07
Applicants
Pattern
Structure
F1
F2
F1
F2
-.61
-.04
-.62
-.10
-.10
.56
-.04
.55
.12
.47
.17
.48
.44
.19
.47
.24
.61
-.13
.59
-.07
ciated with those two factors is greater than .90. The coefficients of congruence between
incumbents and applicants were .99 for both factor one and factor two. Based on the coefficients of congruence, there is strong evidence of factorial equivalence between the incumbent and the applicant data.
The only evidence that sample type mattered for the NEO-FFI was seen in the structure
matrix. The correlations between each trait and the two latent factors are less differentiated
in the applicant sample than the incumbent sample. The end result is that the two extracted
factors are more highly correlated in the applicant samples (r = .61) than the incumbent
samples (r = .38). This difference may matter when undertaking more elaborate construct
validation efforts, but it is unlikely to matter in terms of criterion-related validity.
As with the NEO-FFI, for both the applicant and incumbent matrices based on the HPI,
parallel analysis and inspection of scree plots indicated that retaining two factors would be
sufficient to account for the correlation matrices.
The overall root mean squared residual (RMSR) was .05 for incumbents and .06 for applicants. Although the fit is not quite as strong as with the HPI, the level of fit is still good.
As with the NEO-FFI, the similarities between the factor loadings in the pattern matrix for
the HPI (Table 6) are apparent. The average difference between factor loadings is .033. The
difference between corresponding factor loadings is .05 or less for nine of the 10 factor
loadings, with the tenth factor loading showing a difference of .08. The coefficients of correspondence were .99 for both latent factors, indicating equivalence in the factor structure
across the two subgroups. Unlike the NEO-FFI, the correlation between the latent constructs
for the HPI did not vary much as a function of sample type, and the incumbent correlation (r
= .13) was slightly larger than the applicant correlation (r = .10).
Discussion
Based on our analyses of two popular measures of personality used for selection, the results are consistent with the position that testing context (incumbent versus applicant) has a
small, practically unimportant, influence on either the inter-correlations among big five
traits, and no effect on the higher order factor loadings of big five measures. These findings
are consistent with the more general perspective that, other than inflating scale mean scores,
Moderation by sample type
329
faking does not meaningfully affect psychometric properties of personality test scores. Given
these findings, it is not surprising then that a recent meta-analysis that examined the effects
of sample type on the predictive accuracy of personality test scores also found little evidence
of moderation by sample type (Bradley & Hauenstein, 2004).
Although interpretation of the second order factor structures of the different measures
was not a goal of the current study, the findings of a two factor solution for both measures is
consistent with recent analyses of the higher order structure of the big five (Digman, 1997; J.
Hogan & Holland, 2003; Markon, Krueger, & Watson, 2005; Ones, Viswesvaran, & Dilchert, 2005). However, it is interesting that the factor structures were somewhat different for
the two measures. For the HPI, Emotional Stability (the opposite pole of Neuroticism), and
Conscientiousness loaded on factor 1, and Extraversion and Openness to Experience loaded
on factor 2. Agreeableness loaded modestly on each factor, but more strongly on factor one.
Factor one is consistent with the growth orientation factor uncovered in Digman’s (1997)
higher-order factor analysis of the big five. In addition, factor one appears to represent what
J. Hogan and Holland (2003) refer to as the “getting ahead” personality factor that is important in occupational settings. Factor two resembles the socialization factor uncovered by
Digman (1997), as well as the “getting along” personality factor that is thought to benefit
individuals’ workplace standing (J. Hogan & Holland, 2003).
The findings for the NEO-FFI can also be interpreted in terms of the notion of the getting
ahead and getting along factors, however, extraversion also cross-loaded in the NEO-FFI
solution--more so than agreeableness. Perhaps the NEO-FFI measure of extraversion is more
saturated with dominance than the HPI measure of extraversion. Regardless the reason(s) for
the differing patterns of factor loadings, more important for the current study is the fact that
the factor loadings were highly congruent across sample types for both measures. The findings of congruence, in spite of the fact that the factor loadings for the two instruments were
meaningfully different, supports the conclusion that although sample type frequently moderates the inter-correlations between big five traits, the moderating effects of sample type are
too small to be of practical significance.
Limitations
The main limitations of this study were the number of applicant samples available for the
analyses, and the lack of reliability estimates for the applicant samples using the HPI. The
latter problem prevented correcting for scale unreliability in the HPI meta-analyses. Of the
two limitations, the number of applicant samples is of greater concern. Obviously, greater
faith in the results would be warranted if the number of applicant samples were larger. That
being said, the sample sizes are large in the applicant studies and the conclusions are the
same for both personality measures, both of which suggest that availability of more applicant
samples is unlikely to affect the conclusions. As mentioned in the results, not correcting the
meta-analytic estimates for scale unreliability resulted in a slight underestimation of the
magnitudes (but not the frequency) of the moderating effects of sample type for the HPI.
However, given the results for the NEO-FFI, it appears unlikely that correcting the HPI
estimates for scale unreliability would affect our conclusions.
330
K.M. Bradley & N.M.A. Hauenstein
Future research directions and conclusion
Although it is likely that faking has few if any practical effects on the reliability, validity,
and ultimately the utility of personality test scores, at the applicant level, it is clear that differences in the motivation to fake responses to any non-cognitive selection instrument has
the potential to affect who is hired and who is not hired. In particular, there likely will be
“honest responding” applicants who are not hired in the typical selection context, but who
would have been hired if all applicants responded in an honest manner. As such, future research on faking of personality tests likely would be better served by focusing on the applicant-level issues (e.g., Donovan et. al., 2003) as opposed to studying the effects of faking on
the psychometric properties of personality test scores.
Another issue made salient by this research is the need for further refinement of taxonomies for classifying personality scales according to constructs. The taxonomic effort of
Hough and Ones (2001) is impressive, and is an important development in the furthering of
personality research. However, such taxonomies are not without their shortcomings. In the
course of this meta-analytic review, it became clear that the correlation among personality
constructs was meaningfully influenced by the operational definition of those constructs.
Indeed, the correlation among personality constructs was more strongly influenced by personality inventory than by sample type. In the same vein, Anderson and Ones (2003) found
that personality measures contain enough inventory specific variance that convergent validities across inventories can be overwhelmed. Such findings indicate that qualitative and
quantitative reviews of personality research would benefit from further refinement of the
process of sorting personality measurement scales into construct categories. This position is
echoed by Barrett, Miguel, Hurd, Lueke, and Tan (2003) who argue that meta-analytic research on personality constructs is not useful to practitioners because of the limitations of the
scale to construct taxonomies.
Taken altogether, the evidence indicates that research on faking should not be the focus
of research on the use of personality measures for selection (Ones et al., 1996). There are
more important issues regarding the use of personality constructs for selection purposes
(Barrick & Mount, 2005). For example, J. Hogan and Holland (2003) provided evidence that
personality is more strongly related to conceptually similar criteria than to general “overall
performance” criteria (see also Pulakos, Borman, & Hough, 1988). Research that includes
more reliable criterion measures as well as criterion measures that are conceptually aligned
with predictor measures will likely advance our understanding of the benefits of personality
in personnel selection.
An alternative direction for future research that might address the relatively low predictive accuracy of personality inventories would be to consider using social cognitive models
of personality (Mischel & Shoda, 1998; Cervone, 2004). Such models argue that knowledge
structures should replace traits as the structure of personality, and that personality can only
be understood in a given context (e.g., work).
Research in the domain of personality and selection will evolve in different directions.
Regardless of the direction of this evolution, the evidence presented here and in other research has highlighted that while applicants are more likely to fake than incumbents, the
effects of such faking on the psychometric properties of personality test scores are a less
important issue than other, more pressing questions regarding personality and personnel
selection.
Moderation by sample type
331
References
References marked with an asterisk indicate studies included in the meta-analyses.
* Abbott, S. L. (1996). Forming and enacting job perceptions: An investigation of personality
determinants and job performance outcomes. Unpublished doctoral dissertation, New York
University.
Anderson, N., & Ones, D. S. (2003). The construct validity of three entry level personality
inventories used in the UK: Cautionary findings from a multiple-inventory investigation.
European Journal of Personality, 17(Suppl 1), S39-S66.
* Arthur, W., Jr and Graziano, W. G. (1996). The five-factor model, conscientiousness, and
driving accident involvement. Journal of Personality, 64, 593-618.
Barrett, G. V., Miguel, R. F., Hurd, J. M., Lueke, S. B., & Tan, J. A. (2003). Practical issues in
the use of personality tests in police selection. Public Personnel Management, 32, 497Barrick, M. R., & Mount, M. K. (2005). Yes, personality matters: Moving on to more important
matters. Human Performance, 18, 359-372.
* Beaty, J. C., Jr., Cleveland, J. N., & Murphy, K. R. (2001). The relation between personality
and contextual performance in strong versus weak situations. Human Performance, 14, 125148.
* Bing, M. N. and Burroughs, S. M. (2001). The predictive and interactive effects of equity
sensitivity in teamwork-oriented organizations. Journal of Organizational Behavior, 22, 271290.
* Bishop, N. B., Barrett, G. V., Doverspike, D., Hall, R. J., Svyantek, D. J. (1999, May). Big five
and selection: Factors impacting responses and validities. Poster presented at the 14th Annual
Conference of the Society for Industrial and Organizational Psychology, Atlanta, GA.
Block, J. (1965). The Challenge of Response Sets. New York: Appleton-Century-Crofts.
* Boudreau, J. W., Boswell, W. R., Judge, T. A. (2001). Effects of personality on executive
career success in the United States and Europe. Journal of Vocational Behavior, 58, 53-81.
Bozeman, D. P. & Kacmar, K. M. (1997). A cybernetic model of impression management
processes in organizations. Organizational Behavior and Human Decision Processes, 69, 930.
Bradley, K. M., & Hauenstein, N. M. A. (2004). Do incumbent samples overestimate validities in
applicant settings? Paper presented at the 19th annual meeting of the Society of Industrial /
Organizational Psychology, Chicago, Illinois.
Burt, C. L. (1948). The factorial study of temperamental traits. British Journal of Psychology,
Statistical Section, 1, 178-203.
* Caligiuri, P. M. (2000). The Big Five personality characteristics as predictors of expatriate's
desire to terminate the assignment and supervisor-rated performance. Personnel Psychology,
53, 67-88.
Cervone, D. (2004). The architecture of personality. Psychological Review, 111, 183-204.
* Chan, D. & Schmitt, N. (2002). Situational judgment and job performance. Human
Performance, 15, 233-254.
* Conte, J. M. & Jacobs, R. R. (1999, May). Temporal and personality predictors of absence and
lateness. Paper presented at the 14th Annual Conference of the Society for Industrial and
Organizational Psychology, Atlanta, GA.
* Crant, J. M. (1995). The Proactive Personality Scale and objective job performance among real
estate agents. Journal of Applied Psychology, 80, 532-537.
* Deary, I. J., Blenkin, H., Agius, R. M., Endler, N. S., Zealley, H., & Wood, R. (1996). Models
of job-related stress and personal achievement among consultant doctors. British Journal of
Psychology, 87, 3-29.
332
K.M. Bradley & N.M.A. Hauenstein
* Deluga, R. J. & Mason, S. (2000). Relationship of resident assistant conscientiousness,
extraversion, and positive affect with rated performance. Journal of Research in Personality,
34, 225-235.
Digman, J. M. (1997). Higher-order factors of the big five. Journal of Personality and Social
Psychology, 73, 1246-1256.
Donovan, J. J., Dwight, S. A., & Hurtz, G. M. (2003). An assessment of the prevalence, severity,
and verifiability of entry-level applicant faking using the randomized response technique.
Human Performance, 16, 81-106.
Douglas, E. F., McDaniel, M. A., & Snell, A. F. (August, 1996). The validity of non-cognitive
measures decays when applicants fake. Academy of Management Proceedings.
Dunnette, M. D., McCartney, J., Carlson, H. C., & Kirchner, W. K. (1962). A study of faking
behavior on a forced-choice self-description checklist. Personnel Psychology, 15, 13-24.
Edwards, A. L. (1957). The Social Desirability Variable in Personality Assessment and Research.
New York: Dryden.
Ellingson, J. E., Sackett, P. R., & Hough, L. M. (1999). Social desirability corrections in
personality measurement: Issues of applicant comparison and construct validity. Journal of
Applied Psychology, 84, 155-166.
Ellingson, J. E., Smith, D. B., & Sackett, P. R. (2001). Investigating the influence of social
desirability on personality factor structure. Journal of Applied Psychology, 86, 122-133.
* Erez, A. & Judge, T. A. (2001). Relationship of core self-evaluations to goal setting,
motivation, and performance. Journal of Applied Psychology, 86, 1270-1279.
* Fogarty, G. J., Machin, M. A., Albion, M. J., Sutherland, L. F., Lalor, G. I., & Revitt, S. (1999).
Predicting occupational strain and job satisfaction: The role of stress, coping, personality, and
affectivity variables. Journal of Vocational Behavior, 54, 429–452.
Frei, R. L. (1998). Fake this test! Do you have the ability to raise your score on a service
orientation inventory? (Doctoral dissertation, University of Akron, 1998). Dissertation
Abstracts International, 58, 5681.
* Gellatly, I. R. & Irving, P. G. (2001). Personality, autonomy, and contextual performance of
managers. Human Performance, 14, 231-245.
* George, J. M. & Zhou, J. (2001). When openness to experience and conscientiousness are
related to creative behavior: An interactional approach. Journal of Applied Psychology, 86,
513-524.
Green, R. F. (1951). Does a selection situation induce testees to bias their answers on interest and
temperament tests? Educational and Psychological Measurement, 11, 503-515.
Griffin, B., Hesketh, B., & Grayson, B. (2004). Applicants faking good: Evidence of item bias in
the NEO PI-R. Personality and Individual Differences, 36, 1545-1558.
* Griffin, M. A. (2001). Dispositions and work reactions: a multilevel approach. Journal of
Applied Psychology, 86, 1142-1151.
Guadagnoli, E. & Velicer, W. (1987). A comparison of pattern matching indices. Multivariate
Behavioral Research, 26, 323-343.
* Hense, R. L., III (2000). The Big Five and contextual performance: Expanding personenvironment fit theory. Unpublished doctoral dissertation, University of South Florida.
Heron, A. (1956). The effects of real-life motivation on questionnaire response. Journal of
Applied Psychology, 40, 65-68.
* Hogan Assessment Systems (undated). [Hogan Personality Inventory Validation Studies].
Unpublished raw data.
Hogan, J. & Holland, B. (2003). Using theory to evaluate personality and job-performance
relations: A socioanalytic perspective. Journal of Applied Psychology, 88, 100-112.
Hough, L. M. (1998). Effects of intentional distortion in personality measurement and evaluation
of suggested palliatives. Human Performance, 11(2/3), 209-244.
Moderation by sample type
333
Hough, L. M. & Ones, D. S. (2001). The structure, measurement, validity, and use of personality
variables in industrial, work, and organizational psychology. In N. R. Anderson, D. S. Ones,
H. K. Sinangil, & C. Viswesvaran (Eds.), Handbook of Work Psychology (pp. 233-277). New
York, NY: Sage.
Hunter, J. E. & Schmidt, F. L. (1990). Methods of Meta-Analysis: Correcting error and bias in
research findings. Newbury Park, CA: Sage.
Hunter, J. E. & Schmidt, F. L. (1994). Estimation of sampling error variance in the meta-analysis
of correlations: Use of average correlation in the homogeneous case. Journal of Applied
Psychology, 79, 171-177.
* Hunthausen, J. M. (2000). Predictors of task and contextual performance: Frame-of-reference
effects and applicant reaction effects on selection system validity. Unpublished doctoral
dissertation, Portland State University.
Hurtz G. M., & Donovan J. J., (2000). Personality and job performance: The Big Five revisited.
Journal of Applied Psychology, 85, 869-879. OL
* Jacobs, R. R., Conte, J. M., Day, D. V., Silva, J. M., & Harris, R. (1996). Selecting bus drivers:
Multiple predictors, multiple perspectives on validity, and multiple estimates of utility.
Human Performance, 9, 199-217.
* Johnson, D. L. (1993). Competitiveness and performance in the workforce: Hierarchical factor
analysis of managerial competitiveness, achievement motivation, and the big five personality
dimensions (Doctoral dissertation, Iowa State University, 1993). Dissertation Abstracts
International, 53, 3821.
Jones, E. E., & Pittman, T. S. (1982). Toward a general theory of strategic self-presentation. In J.
Suls (Ed.), Psychological perspectives on the self. Hillsdale, NJ: Erlbaum.
* Judge, T. A. & Cable, D. M. (1997). Applicant personality, organizational culture, and
organization attraction. Personnel Psychology, 50, 359-394.
* Judge, T. A. & Heller, D. (2002). The Dispositional Sources of Job Satisfaction: An integrative
test. In R. Ilies (Chair), Dispositional influences on work-related attitudes. Symposium
presented at the 17th Annual Conference of the Society for Industrial and Organizational
Psychology, Toronto, ON, Canada.
Kluger, A. N. & Tikochinsky, J. (2001). The error of accepting the “Theoretical” null hypothesis:
The rise, fall, and resurrection of commonsense hypotheses in psychology. Psychological
Bulletin, 127, 408-423.
Leary, M. R. & Kowalski, R. M. (1990). Impression Management: A literature review and twocomponent model. Psychological Bulletin, 107, 34-47.
* Lin, T. R., Doyle, T. F., & Howard, J. M. (1990, June). The prospective employee potential
inventory: A validation study with school bus drivers. Proceedings of the 1990 International
Personnel Management Association Assessment Council Conference on Personnel
Assessment, 14, 152-160.
Markon, K. E., Krueger, R. F., & Watson, D. (2005). Delineating the structure of normal and
abnormal personality: An integrative hierarchical approach. Journal of Personality and Social
Psychology, 88, 139-157.
* Miller, R. L., Griffin, M. A., & Hart, P. M. (1999). Personality and organizational health: The
role of conscientiousness. Work and Stress, 13, 7-19.
Mischel, W. & Shoda, Y. (1998). Reconciling processing dynamics and personality dispositions.
Annual Review of Psychology, 49, 229-258.
Marcoulides, G. A. & Hershberger, S. L. (1997). Multivariate Statistical Methods: A first course.
Mahwah, NJ: Lawrence Earlbaum.
Michaelis, W. & Eysenck, H. J. (1971). The determination of personality inventory factor
patterns and inter-correlations by changes in real-life motivation. Journal of Genetic
Psychology, 118, 223-234.
334
K.M. Bradley & N.M.A. Hauenstein
* Molitor, D. D. (1998). An examination of the effects of personality and job satisfaction on
multiple non-workrole organizational behaviors. Unpublished doctoral dissertation, Iowa
State University.
Montanelli, R. G. & Humphreys, L. (1976). Latent roots of random data correlation matrices with
squared multiple correlations on the diagonal: A Monte Carlo study. Psychometrika, 41, 341348.
* Morrison, K. A. (1997). Personality Correlates of the Five-Factor Model For A Sample of
Business owners/managers: Associations with scores on self-monitoring, Type A behavior,
locus of control, and subjective well-being. Psychological Reports, 80, 255-272.
Mulaik, S. A. (1972). The Foundations of Factor Analysis. New York: McGraw-Hill.
Ones, D. S. & Viswesvaran, C. (1998). The effects of social desirability and faking on personality
and integrity assessment for personnel selection. Human Performance, 11, 245-269.
Ones, D. S., Viswesvaran, C., & Dilchert, S. (2005). Personality at work: Awareness and
correcting misconceptions. Human Performance, 18, 389-404.
Ones, D. S., Viswesvaran, C., & Reiss, A. (1996). Role of social desirability in personality testing
for personnel selection: The red herring. Journal of Applied Psychology, 81, 660-679.
Paulhus, D. L. & Reid, D. B. (1991). Enhancement and denial in socially desirable responding.
Journal of Personality and Social Psychology, 60, 307-317.
Pulakos, E. D., Borman, W. C., & Hough, L. M. (1988). Test Validation for Scientific
Understanding: Two demonstrations of an approach to studying predictor-criterion linkages.
Personnel Psychology, 41, 703-716.
Robie, C., Zickar, M. J., & Schmit, M. J. (2001). Measurement equivalence between applicant
and incumbent groups: An IRT analysis of personality scales. Human Performance, 14, 187207.
Rosse, J. G., Stecher, M. D., Miller, J. L., & Levin, R. A. (1998). The impact of response
distortion on preemployment personality testing and hiring decisions. Journal of Applied
Psychology, 83, 634-644.
Schmit, M. J. & Ryan, A. M. (1993). The big five in personnel selection: Factor structure in
applicant and nonapplicant populations. Journal of Applied Psychology, 78, 966-974.
* Sinclair, R. R. & Michel, R. P. (2001, April). A construct-oriented approach to modeling entrylevel job performance. Poster presented at the 16th Annual Conference of the Society for
Industrial and Organizational Psychology, San Diego, CA.
* Smith, D. B. & Ellingson, J. E. (2002). Substance versus style: A new look at social desirability
in motivating contexts. Journal of Applied Psychology, 87, 211–219.
Smith, D. B., Hanges, P. J., & Dickson, M. W. (2001). Personnel selection and the five-factor
model: Reexamining the effects of applicant’s frame of reference. Journal of Applied
Psychology, 86, 304-315.
Stark, S., Chernyshenko, O. S., Chan, K. –Y., Lee, W. C., & Drasgow, F. (2001) Effects of the
testing situation on item responding: Cause for concern. Journal of Applied Psychology, 86,
943-953.
Stewart, G. L. (1997, August). Applicants versus incumbents: Assessing the impact of validation
design on personality research. Paper presented at the Academy of Management Annual
Meeting, Boston, MA.
Tedeschi, J. T. & Melburg, V. (1984). Impression management and influence in the organization.
Research in the Sociology of Organizations, 3, 31-58.
* Thoms, P., Moore, K. S., & Scott, K. S. (1996). The relationship between self-efficacy for
participating in self-managed work groups and the big five personality dimensions. Journal of
Organizational Behavior, 17, 349-362.
Moderation by sample type
335
* Tokar, D. M. & Swanson, J. L. (1995). Evaluation of the correspondence between Holland's
vocational personality typology and the five-factor model of personality. Journal of
Vocational Behavior, 46, 89-108.
* Tull, K. T. (1997). The effects of faking behavior on the prediction of sales performance using
the Guilford Zimmerman Temperament Survey and the NEO Five Factor Inventory.
Unpublished doctoral dissertation, University of Akron.
Van Iddekinge, C. H., Raymark, P. H., Eidson, C. E., & Putka, D. J. (2003). Applicant-incumbent
differences on personality, integrity, and customer service measures. Poster presented at the
18th Annual Conference of the Society for Industrial and Organizational Psychology,
Orlando, FL.
* Van Scotter, J. R. (1996). Evidence for the usefulness of task performance, job dedication, and
interpersonal facilitation as components of overall performance. Unpublished doctoral
dissertation, University of Florida.
* Vasilopoulos, N. L., Sass, M. D., Shipper, F., & Story, A. L. (2002, August). Target personality
and rater agreement - implications for 360-degree feedback. Paper presented at the Academy
of Management Annual Meeting, Denver, CO.
Viswesvaran, C., & Ones, D. S. (1995). Theory testing: Combining psychometric meta-analyses
and structural equation modeling. Personnel Psychology, 48, 865-886.
* Wanberg, C. R., Kammeyer-Mueller, J. D. (2000). Predictors and outcomes of proactivity in the
socialization process. Journal of Applied Psychology, 85, 373-385.
Weekley, J. A., Ployhart, R. E., & Harold, C. M. (2003). Personality and situational judgment
tests across applicant and incumbent settings. Poster presented at the 18th Annual Conference
of the Society for Industrial and Organizational Psychology, Orlando, FL.
* Williams, R. W. (1998). Using personality traits to predict the criterion space (Doctoral
dissertation, Union Institute, 1998). Dissertation Abstracts International, 59, 902.
* Willock, J., Deary, I. J., McGregor, M. M., Sutherland, A., Edwards-Jones, G., Morgan, O.,
Dent, B., Grieve, R., Gibson, G., & Austin, E. (1999). Farmers' attitudes, objectives,
behaviors, and personality traits: The Edinburgh study of decision making on farms. Journal
of Vocational Behavior, 54, 5-36.
* Witt, L. A. (2002). The interactive effects of extraversion and conscientiousness on
performance. Journal of Management, 28, 835-851.
* Woolard, C. & Brown, R. D. (1999). Moderation of personality test validity: An extension and
replication of Barrick and Mount (1993). Poster presented at the 14th Annual Conference of
the Society for Industrial and Organizational Psychology, Atlanta, GA.
* Zellars, K. L. & Perrewé, P. L. (2001) Affective personality and the content of emotional social
support: Coping in organizations. Journal of Applied Psychology, 86, 459-467.
Zickar, M. J., Gibby, R. E., & Robie, C. (2004). Uncovering faking samples in applicant,
incumbent, and experimental data sets: An application of mixed-model item response theory.
Organizational Research Methods, 7, 168-191.