
16th Annual Society for Industrial and Organisational Psychology of South Africa (SIOPSA) Conference
22-23 July 2014, CSIR, PRETORIA
Cognitive assessment in
multicultural contexts –
exploring the use of
indigenous artefacts for
improved cultural fairness
Nomfusi Bekwa &
Marié de Beer
[email protected]
[email protected]
Overview of the presentation
• Introduction
• Background
• Objective and motivation
• Literature and theories on cognitive assessment
• Conceptualising new items for cognitive assessment
• Description of the empirical study
– Sample
– Measuring instruments
• Results
• Practical application
• Conclusions and implications for future research
Introduction
• Assessment in the complex South African multilingual and multicultural society is compounded by socio-economic and educational differences between various sub-groups. “The use of psychometric tests in personnel selection has been regarded with an extraordinary degree of suspicion and scepticism. This is especially true when selection occurs in respect of a diverse applicant group.” (Theron, 2007, p. 102).
• At the 1995 Psychometrics conference (which in time became the SIOPSA conference), Dr Blade Nzimande voiced the following concerns:
– Testing in South Africa has been fundamentally shaped by apartheid – has this basic paradigm changed significantly?
– Can psychometric testing grapple with new realities such as affirmative action?
– Testing will have to look at potential – not just actually existing skills
– Whilst testing must take international developments into account, it must ultimately be grounded in the broader social and economic objectives of the society in which it operates.
Background
• Psychometric testing in South Africa has mainly followed international trends (Foxcroft, 1997), but according to Claassen (1997), it cannot be investigated in isolation without taking the country’s political, economic, and social history into account.
• “Unfamiliarity with the stimulus material in western IQ tests is only one of possible cultural factors that may affect performance of African test-takers when diagrammatic, non-verbal intelligence tests, such as the CPM or SPM are used to assess general cognitive ability.” (Wicherts et al., 2010, p. 141)
• “Many colleagues have criticised eurocentric assessment models and have called for research on approaches that would more satisfactorily take a developing (African) country perspective into account. Is this a viable idea whose time has come?” (Maree, 2010, p. 230)
• Anxiety, and specifically test anxiety, has been researched for many years. While it is a complex syndrome, it is considered one of the most common reactions to stress (Sarason, 1984), involving elements of uncomfortable physical arousal, emotional sensation, and cognitive thoughts (Hong, 1988).
• Society has become more competitive and emphasizes success in academic life. “As the need for knowledge and professionalism increases, there is a need for assessment of the individual, which in turn requires more and more tests aimed to measure, classify, and sort in order to enter university or acquire jobs.” (Lufi & Darliuk, 2005, p. 237).
• “Anxiety appears because the individual knows that judgment is used to assess his or her performance” (Lufi & Darliuk, 2005, p. 237).
• Educationally and socioeconomically disadvantaged individuals are
often at a further disadvantage when standard cognitive tests are used,
because:
– tests typically include content representing crystallized abilities (i.e. language
proficiency, scholastic content or educational material)
– which is influenced by prior learning experiences (Claassen, 1997; Foxcroft, 1997; Van
de Vijver, 1997, 2002).
• Theron (2007) asked whether it is possible to assure selection fairness through the choice of instruments, and whether it is possible to avoid biased measures and adverse impact.
• In the United States, Melba Vasquez, President of the APA (American Psychological Association), indicated in her President’s column of January 2011 that the following would be key focus areas during her term as president (Vasquez, 2011):
– Reducing discrimination and enhancing diversity
– Addressing educational disparities
– Focusing on applying our science and practice to the advancement of society
– Addressing the challenges of the changing demographics
– Applying psychological knowledge to address the grand challenges of society.
These aims are also worthy ones to pursue within the South African
context. One of the challenges is to ensure fair and unbiased
(cognitive) assessments to all citizens – in line with the requirements of
the Employment Equity Act of 1998.
Indigenous art used and celebrated around the world
Aims of the study
The aim of the study is threefold:
• To evaluate the utility of African artefacts and cultural symbols as inspiration for items measuring general nonverbal figural reasoning ability.
• To evaluate the psychometric properties of such new format items.
• To compare the results (total score) obtained on the new item formats
with that of another measure of general non-verbal figural reasoning
using the more traditional format items.
Theorists and tests of cognitive assessment
Description of the study
The study was conducted in two phases:
Item development phase
• Research journey of its own – photos of art objects, traditional
dresses, beadwork, flea market photos, etc.
• Identifying patterns and colour themes for new items
• Writing new items
• Subject expert evaluation of items
• Colour blind evaluation of selected items
Item evaluation phase
• Using new items to collect data
(quantitative responses and qualitative feedback)
• Item analysis
Conceptualising new items for cognitive
assessment
• It is important to provide a clear understanding of the domain of
interest
• Eductive vs. Reproductive - Fluid vs. Crystallised intelligence
• The ability to draw meaning out of confusion vs. ability to recall
acquired information (Raven, 2002)
• Eductive – fluid ability – Gf: relevant to tests that require adaptation
to new situations, problem solving, pattern recognition, abstract
reasoning (Cattell, 1963; Gregory, 2007; Raven, 2002)
Conceptualising new items for cognitive
assessment
• Same principles used for items – set of geometric figures with one
figure missing
• Alternatives of possible answers given in a multiple choice format
• Used African art and cultural artefacts to transform the patterns,
shapes and look of the items, with colour added
• The inspirations: African material prints, art, decorations, beadwork,
paintings, etc.
Conceptualising new items for cognitive
assessment
• Nonverbal figural items
• Limited to 5 basic colours: Blue; Orange/Yellow; Red; Green and
Brown
Examples
Photographs – Ms Nomfusi Bekwa
Design and method
Research design: exploratory sequential mixed-method cross-sectional survey
• QUAL > QUANT
The study was conducted using the following steps:
• Conceptualize: Identifying the purpose
• Operationalize: Drafting items, obtaining feedback (qualitative)
from culture experts and colour blind evaluation of items
• Piloting items: administer draft items to smaller group, check
instructions, review/modify items, administer to larger sample
• Item analysis
Sample
A convenience sample of 946 participants was used.
The participants were part of a group undergoing a funded
career-related training and guidance programme.
The programme is an ongoing initiative for skills development and
job creation for the youth and entails training and guidance in
character development, life skills, skills development, practical
work, soft skills, etc.
Sample
• Age: ranged from 18 to 36
• Gender: female (50%) and male (49%)
• Language: Xhosa (38%), Afrikaans (16%) and Zulu (11%) being in
the majority; while all the other languages were below 10%
• Education: Grade 12 (68%); Grade 11 (22%); and Grade 10 (6%)
• Province: GP (40%); WC (31%); EC (19%) and FS (10%)
Measuring instruments
New items
• 200 items developed and administered
• Nonverbal figural items
• Administered by computer
• Group administration
• Six types of item formats:
• Blocks [figure series, 2-pairing, 2x2, 3x3] – Circle/Wheel – Triangle
• Same question – change positioning of question mark
Measuring instruments
Learning Potential Computerised Adaptive Test (LPCAT)
• Dynamic test aimed at addressing some of the challenges such as
fairness, item bias and reducing of test duration (De Beer, 2005,
2010)
• Uses the test-train-retest approach to measure learning potential
• Nonverbal figural reasoning items based on fluid ability
• Coefficient alpha internal consistency reliability scores range
from 0.925 to 0.987 (De Beer, 2005, 2010)
• Predictive validity for academic results on average between 0.3 and
0.6
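The internal-consistency figures above are coefficient (Cronbach's) alpha values. As a minimal sketch of how the statistic is computed (the response matrix below is illustrative, not LPCAT data):

```python
# Minimal sketch of Cronbach's coefficient alpha, the internal-consistency
# statistic reported for the LPCAT. The response data below are illustrative.

def cronbach_alpha(scores):
    """scores: one list of item scores per person (all the same length)."""
    k = len(scores[0])                       # number of items
    def var(xs):                             # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)
    item_vars = [var([person[i] for person in scores]) for i in range(k)]
    total_var = var([sum(person) for person in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Five persons answering four dichotomous (0/1) items:
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(responses)   # → 0.8 for this toy matrix
```

Alpha rises toward 1 as items covary more strongly relative to their individual variances, which is why the high LPCAT values indicate consistent measurement.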
Procedure
Item development: Data collection
• Collection of photos of different artefacts
• 1st draft – 40 items – preliminary feedback on the appropriateness of
the symbols, colours etc.
• 2nd draft – 24 items – additional feedback and to check instructions
Item evaluation: Data collection
• Tests administered to 946 participants over two weeks
• Proper scheduling to ensure that the testing sessions fit the
structured programme of the participants
• Computerised testing used
• Consent forms signed for each of the tests
Data analysis
• Qualitative feedback and comments reviewed
• Item analysis (CTT and Rasch)
• Correlations between new item totals and LPCAT
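The classical (CTT) part of the item analysis reduces, in essence, to per-item p-values (proportion correct) and corrected item-total point-biserial correlations. A minimal sketch with a made-up 0/1 response matrix:

```python
# Sketch of classical item analysis: p-values (proportion correct) and
# corrected item-total point-biserial correlations (item vs. total-minus-item).
# The 0/1 response matrix is made up for illustration.

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def item_analysis(resp):
    n, k = len(resp), len(resp[0])
    stats = []
    for i in range(k):
        item = [person[i] for person in resp]
        rest = [sum(person) - person[i] for person in resp]  # corrected total
        stats.append({"p": sum(item) / n, "r_pb": pearson(item, rest)})
    return stats

resp = [[1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
stats = item_analysis(resp)   # p-values: 0.8, 0.6, 0.4, 0.2
```

High p-values flag easy items (as found for this sample), while positive point-biserials confirm that each item discriminates in the same direction as the rest of the test.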
Qualitative results
• Shweshwe print - very African. Sotho to be exact
• “...interesting mix of shapes, colours and patterns, exciting, familiarity
of some patterns, looking forward to go to the next page to see what’s
in store”
• “ethnic colours, modern African theme and Afrocentric”
• Colour of life
• Kept me captivated, wanted to finish all the questions by all means …
Exercised my brain by challenging and difficult questions.
• Very stressful but enjoyed them.
Person item map – all items
Rasch item analysis
Person reliability 0.96
Item reliability 0.99
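For context, the Rasch (one-parameter logistic) model behind these reliability figures treats the probability of a correct response as a function of the difference between person ability and item difficulty on a common logit scale; a minimal sketch with illustrative values:

```python
import math

# Sketch of the Rasch (1PL) model: P(correct) depends only on the gap between
# person ability (theta) and item difficulty (b), both in logits.

def rasch_probability(theta, b):
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# When ability equals difficulty, the probability is exactly 0.5:
p_match = rasch_probability(0.5, 0.5)    # → 0.5
# An able person facing an easy item answers correctly most of the time:
p_easy = rasch_probability(2.0, 0.0)     # ≈ 0.88
```

Because persons and items share one scale, a person-item map can show both distributions side by side, which is how the mismatch between item difficulty and this sample's ability becomes visible.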
Descriptive results of P-values for all items
Descriptive results for item type P-values
Correlation results
Learning points and practical application
• The item types using African art and artefacts were generally very
positively received by participants in the initial pilot study.
• Item difficulty values indicate a slight skewness – items generally easy for
the sample group (mostly matriculated). Item difficulty could be more
aligned for individuals at mid-secondary or lower levels.
• Correlation with the other cognitive measure supports construct validity (Gf – fluid
ability measured)
• Further developments with similar items deemed feasible and appropriate
– for the SA context but also internationally.
Limitations
• Only African and Coloured participants in the sample group
• No criterion data available to evaluate concurrent or predictive
validity
• Sample of convenience – not representative of any particular group.
• Items administered and analysed only to obtain preliminary data on
the feasibility of these new items
Way forward and further research
• In automatic item generation (AIG), generic and specific item characteristics are used in models to create items that are similar in content and equivalent in psychometric properties, and which can thus be developed en masse on specified principles and used interchangeably in test administration.
• Irvine (2002, 2014) identified radical and incidental characteristics of items – radical features affect the difficulty level of an item when changed, while incidental features change superficial characteristics only, which should not affect the difficulty level or other psychometric properties of the item compared to the original model format.
• Using computer technology and algorithms to automate this process potentially makes available an infinite number of items that can be generated in real time – which addresses issues of item and test security in the online assessment environment.
• Initial investigations into the potential use of these approaches in computerized adaptive tests (CATs) have been positive (Bejar et al., 2013) and continue to draw the attention of scientists and practitioners.
• According to Luecht (2013), AIG offers three distinct advantages for CAT: lower cost without loss of quality; improved item writing and item calibration; and a lower need for pilot testing every item.
• Using item models avoids the labour-intensive item-writing process and automates many of the details required to produce items once the item model has been formulated and calibrated.
• The advantage of the present project is that empirical item data are now available to serve as a baseline for future AIG-based CAT development.
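To make the radical/incidental distinction concrete, here is a minimal, hypothetical AIG item-model sketch: matrix size and the number of transformation rules are treated as radicals (they drive difficulty), while colour theme and shape family are incidentals (surface variation only). All names and value sets are illustrative assumptions, not the study's actual model.

```python
import itertools
import random

# Hypothetical AIG item model for figural matrix items (illustrative only).
# Radicals (change difficulty): matrix size, number of transformation rules.
# Incidentals (surface only): colour theme, shape family.
RADICALS = {"size": [2, 3], "rules": [1, 2, 3]}
INCIDENTALS = {"colour": ["blue", "orange", "red", "green", "brown"],
               "shape": ["triangle", "circle", "block"]}

def generate_items(n, seed=0):
    rng = random.Random(seed)
    # Each radical combination defines a difficulty "family"; incidentals are
    # sampled freely, yielding surface-distinct but equivalent items.
    families = list(itertools.product(RADICALS["size"], RADICALS["rules"]))
    items = []
    for i in range(n):
        size, rules = families[i % len(families)]
        items.append({
            "id": i,
            "size": size,                                   # radical
            "rules": rules,                                 # radical
            "colour": rng.choice(INCIDENTALS["colour"]),    # incidental
            "shape": rng.choice(INCIDENTALS["shape"]),      # incidental
        })
    return items

bank = generate_items(12)
```

Even this toy model's six radical families, combined with free incidental sampling, yield an effectively unbounded pool of interchangeable variants per family – the property that makes AIG attractive for online test security.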
References
Cattell, R.B. (1963). Theory of fluid and crystallized intelligence: A critical experiment. Journal of
Educational Psychology, 54(1), 1-22.
De Beer, M. (2005). Development of the Learning Potential Computerised Adaptive Test (LPCAT). South African Journal of Psychology, 35(4), 717-747.
De Beer, M. (2010). Longitudinal predictive validity of a Learning Potential test. Journal of Psychology
in Africa, 20(2), 225-232.
Gregory, R.J. (2007). Psychological testing: History, principles and applications (5th ed.). Boston:
Pearson International Edition.
Irvine, S.H. (2002). The foundations of item generation for mass testing. In S.H. Irvine & P.C. Kyllonen (Eds.), Item generation for test development (pp. 3-34). London: Lawrence Erlbaum Associates Publishers.
Irvine, S.H. (Ed.). (2014). Tests for recruitment across cultures. Amsterdam: IOS Press BV.
Lufi, D., & Darliuk, L. (2005). The interactive effect of test anxiety and learning disabilities among
adolescents. International Journal of Educational Research, 43, 236-249.
Maree, K. (2010). Assessment in psychology in the 21st century – a multi-layered endeavour. South
African Journal of Psychology, 40(3), 229-233.
Raven, J. (2000). Psychometrics, cognitive ability, and occupational performance. Review of
Psychology, 7(1-2), 51-74.
Raven, J. (2002). Spearman’s Raven legacy. Testing International, 12(2), 7-10.
Theron, C. (2007). Confessions, scapegoats and flying pigs: psychometric testing and the law. SA
Journal of Industrial Psychology, 33(1), 102-117.
Wicherts, J.M., Dolan, C.V., Carlson, J.S., & Van der Maas, H.L.J. (2010). Raven’s test performance
of sub-Saharan Africans: Average performance, psychometric properties, and the Flynn effect.
Learning and Individual Differences, 29, 135-151.