Faculty of Sciences

Mechanisms of masked semantic priming: A meta-analysis

Eva Van den Bussche

Master dissertation submitted to obtain the degree of Master of Statistical Data Analysis

Promoter: Prof. Dr. Vansteelandt
Department of Applied Mathematics and Computer Science
Academic year 2007 – 2008

The author and the promoter give permission to consult this master dissertation and to copy it or parts of it for personal use. Any other use falls under the restrictions of copyright, in particular the obligation to explicitly mention the source when using results of this master dissertation.

Eva Van den Bussche

Foreword

This master dissertation is part of a doctoral project in collaboration with Bert Reynvoet and Wim Van den Noortgate. The collection of the data, the statistical analyses and the writing of the dissertation were completed by the student. The student wishes to thank all researchers who contributed to this meta-analysis by providing us with the statistical information needed to compute effect sizes. We especially want to thank Kenneth Forster, Andrea Kiesel, Sachiko Kinoshita and Carsten Pohl for sending us unpublished data, and Tom Verguts for reading a previous version of this paper and for his useful comments.

Table of contents

Abstract
Introduction
Method
Semantic categorization
  Results
  Discussion
Lexical decision and naming
  Results
  Discussion
General Discussion
References

Abstract

Whether unconscious information can influence our behavior, and to what extent, has been a topic of heavy debate throughout the history of psychology. A frequently used method to study subliminal processing is the masked priming paradigm. The present meta-analyses focus on studies using this paradigm. Our aim was twofold: first, we wanted to assess the magnitude of subliminal semantic priming across the literature and shed light on whether subliminal primes can be processed up to a semantic level. Second, we aimed to assess the influence of potential moderators on semantic priming effects. We found that significant priming emerges across the literature, indicating that unconsciously presented information can influence our behavior. Furthermore, significant priming is observed under circumstances where a non-semantic interpretation cannot fully explain the observed priming effects. This implies that unconsciously presented information can be genuinely semantically processed. Our quantitative review also shows that several moderators influence the strength of the observed priming effects: when the experimental context creates the opportunity to establish automatic stimulus-response mappings, the (non-semantic) processing of the primes is enhanced and priming effects are boosted.

Introduction

Can unconsciously presented information influence our behavior? This has been one of the most controversial questions in the history of psychology and remains an intriguing and heavily debated issue today (e.g. Merikle & Daneman, 1998; Sidis, 1898). Numerous researchers have tried to answer it by presenting stimuli below the "limen", or threshold of conscious perception, and assessing whether these subliminal stimuli influence us. Subliminal perception is inferred when such an invisible stimulus is able to affect our emotions, our thoughts or our behavior. Over the years some extraordinary claims have been made about the power of subliminal perception. Probably the most famous example is the experiment conducted by James Vicary in 1957.
He claimed that an experiment in which moviegoers were repeatedly shown invisible 3-millisecond advertisements for Coca-Cola and popcorn (the messages "drink coca-cola" and "eat popcorn") significantly increased product sales. Based on his claims, the CIA produced a report in 1958 that led to subliminal advertising being banned in the US. This report suggested that certain individuals can, at certain times and under certain circumstances, be influenced to act abnormally without awareness of the influence. A few years later, after several failed attempts to replicate Vicary's study, he himself admitted that the original study was "a gimmick" and that he had falsified the data. Despite this confession, his study retained its massive impact on the public (Pratkanis, 1992), and subliminal advertising is still being used today. For example, in 2007 a McDonald's logo was flashed very briefly during an episode of "Iron Chef America", a popular cooking show. Even though the producers of the show, Food Network, claimed that it was a technical error and not a subliminal message, it stirred public opinion.

Luckily, the phenomenon of subliminal perception has also been extensively studied in the scientific community. One of the earliest demonstrations was reported in 1898 by Sidis. He found that when subjects were shown cards with a letter or number on them at such a distance that they claimed they were unable to see what was on the cards, they nonetheless performed above chance when guessing the identity of the cards. However, providing convincing empirical evidence for subliminal processing and developing a reliable paradigm to study it has proven to be a challenging task, which explains why the research history of subliminal perception is marked by skepticism and critique. And although the critical questions changed throughout the years, the controversial issue of unconscious perception never ceased to intrigue us and remains a debated topic (see Kouider & Dehaene, 2007, for an extensive review).

For almost three decades, a frequently used method to study the impact of subliminal information on our behavior has been the masked priming paradigm. In this method, the impact of a masked stimulus on the processing of a subsequent target is examined. For instance, Marcel (1983) found that target words were processed faster when they were preceded by a semantically related prime word (e.g. cat - dog), even when these primes were rendered subliminal by masking them and presenting them for only a very short duration. Although Marcel's results were first described as "startling" and "counterintuitive", successful replications of subliminal semantic priming accumulated throughout the years (e.g. Balota, 1983; Forster & Davis, 1984; Fowler, Wolford, Slade & Tassinary, 1981; Greenwald, Klinger & Liu, 1989). But as the pile of supporting evidence grew, so did the skepticism. Holender (1986) dismissed the then-available evidence for masked priming as contaminated by a variety of problems, such as a lack of reliability and poor assessment of whether stimuli were actually presented subliminally. This demonstration of serious methodological flaws cast great doubt on the existence of subliminal semantic processing.

By the mid-1990s, the development of new and stronger paradigms of subliminal priming prompted a renewed interest in subliminal semantic activation. In 1998, Dehaene and colleagues asked subjects to classify numbers between 1 and 9 as smaller or larger than 5.
Two types of trials could be distinguished: congruent trials (e.g. 1 - 3), where prime and target evoked the same response, and incongruent trials (e.g. 1 - 8), where prime and target evoked opposite responses. They found that congruent trials were responded to faster than incongruent trials, a phenomenon referred to as the "response congruency effect". Moreover, their study was the first to use brain-imaging techniques, which showed that subliminal primes elicited neural activity in the motor cortex, implying that when subjects participate in a semantic categorization task with visible targets, they also unconsciously apply the task instructions to the subliminal primes. The authors concluded that subliminal primes are processed in a series of stages, including semantic processing: the ability of a subliminal prime to facilitate subsequent categorization of a target belonging to the same semantic category implies that this prime has been unconsciously categorized and thus has been processed up to a semantic level. This and other demonstrations, all using more sound methodological approaches (see also Draine & Greenwald, 1998; Greenwald, Draine & Abrams, 1996), removed the stigma of untrustworthiness from subliminal priming: nowadays, the existence of subliminal perception is largely acknowledged. However, the debate has progressed beyond existence claims. Ever since Dehaene et al. (1998) interpreted their findings as proof of semantic processing of subliminal information, the depth of processing of subliminal stimuli has been the topic of heavy controversy.

In 2001, Damian shed new light on the findings of Dehaene and colleagues. In the study of Dehaene et al., the prime stimuli were also shown as target stimuli. However, Damian found that when the prime stimuli were not presented as targets, and hence were never responded to, the response congruency effect disappeared. According to Damian, this failure to obtain a congruency effect for primes never presented as targets implies that subjects learn to associate stimuli directly with the appropriate responses, creating automatized stimulus-response (S-R) mappings which bypass semantic access. When a prime is also presented as a target, it can automatically activate the corresponding response without the need to access semantics first, which provides an alternative, non-semantic explanation of the response congruency effect obtained by Dehaene et al. (1998). However, when a stimulus is never responded to (as is the case for the primes in Damian's study), such an S-R mapping cannot be established, which explains the absence of a response congruency effect in his study (see Abrams & Greenwald, 2000, for a similar claim). This explanation is, however, not in line with several other studies that did report response congruency effects for primes that were never presented as targets (e.g. Dell'Acqua & Grainger, 1999; Klauer, Eder, Greenwald & Abrams, 2007; Naccache & Dehaene, 2001; Reynvoet, Gevers & Caessens, 2005). These findings are difficult to explain without assuming that the subliminal primes were semantically processed. To date, the debate on whether subliminal semantic priming reflects genuine semantic processing of the subliminal information or rather the formation of automatic stimulus-response mappings remains unresolved. The divergent research results also gave rise to the supposition that several other factors moderate the emergence of subliminal priming effects.
Kouider and Dehaene (2007) recently provided an excellent historical overview of the literature on masked priming, but to date the field lacks a quantitative review of the available data assessing the overall effect size and investigating the influence of potential moderators on subliminal semantic priming effects. Therefore, the aim of the present study is to statistically combine published and unpublished data using meta-analytic techniques to shed light on unresolved issues. First, by estimating and testing the overall effect size of subliminal semantic priming in the literature, we want to establish whether the literature to date provides clear evidence in favor of semantic processing of subliminal information. Second, we aim to assess the influence of potential moderators on subliminal semantic priming effects. Since a unique aspect of meta-analyses is that they allow a statistical examination of the moderating effects of study characteristics, this could shed light on the conditions under which subliminal semantic processing can or cannot be observed, indicating the limitations and possibilities of subliminal processing.

By addressing these two overarching aims, this meta-analysis will make a valuable contribution to the field of subliminal priming research: (1) from a theoretical viewpoint, it will help clarify when and to which depth subliminal primes are processed, which has implications for the currently competing theories. (2) The present study might also guide future research: important gaps in the literature may be identified, and the identification of significant moderators and interactions between them might produce a new flow of research. (3) Identifying which factors moderate subliminal semantic priming could also have important methodological consequences: it can indicate which factors are crucial to take into account when designing a subliminal priming experiment.

We now turn to a more detailed description of the different potential moderators considered in this meta-analysis. Although this list of variables consists of the most plausible moderators of subliminal semantic priming, it is not meant to be exhaustive, since we could only include those moderators that are frequently reported in the literature on subliminal semantic priming. For example, the luminance (or brightness) of the stimuli and the exact task instructions the subjects received might also influence the observed priming effects, but since detailed information on these variables is rarely reported in the subliminal priming literature, we could not include them in our meta-analysis.

Task

Traditionally, three standard tasks are used to examine subliminal semantic priming. In a semantic categorization task subjects are required to decide whether a visible target belongs to a particular semantic category. For example, subjects are asked to categorize numbers as smaller or larger than 5. A subliminal prime precedes the target and belongs either to the same semantic category as the target (i.e. congruent trial, e.g. 1 - 3) or to the opposite semantic category (i.e. incongruent trial, e.g. 1 - 8). The priming effect, also called the response congruency effect, is manifested as a faster and/or more accurate response to congruent trials as compared to incongruent trials. In a standard lexical decision task subjects are presented with a mixture of words and pseudowords as targets and their task is to indicate whether the target is a word or not.
Preceding these targets are subliminal primes, which can either be semantically related (e.g. doctor - nurse) or unrelated (e.g. butter - nurse) to the target words. The priming effect here is expressed as a faster and/or more accurate response to semantically related prime-target pairs compared to unrelated pairs. A naming task is very similar to a lexical decision task, with the distinction that the subject has to read the target aloud instead of deciding whether it is a word or nonword. The priming effect is defined in the same way as it was for the lexical decision task.

It has been consistently claimed that the size of the priming effects depends on the task used in the experiment. For example, Lucas (2000) found that semantic priming effects obtained using naming tasks were smaller than those obtained with lexical decision tasks. It has also been reported that priming effects obtained with semantic categorization tasks are stronger than those observed with lexical decision tasks (e.g. Grainger & Frenck-Mestre, 1998). The latter observation can be explained by the fact that categorization requires access to semantic information whereas naming and lexical decision do not. Furthermore, priming effects in a semantic categorization task are biased by response congruency (e.g. Forster, 2004): on congruent trials, prime and target belong to the same semantic category and require the same response, whereas on incongruent trials prime and target belong to different categories requiring opposite responses. If the subjects indeed also apply the task instructions to the primes (Dehaene et al., 1998), then it could well be that the obtained priming effect does not originate at the level of the semantic category, but rather at the response level. This confound is absent in naming and lexical decision tasks, where both the related and unrelated primes belong to the same semantic category and thus require a similar response.

Prime novelty

As already described above, whether or not primes are also presented as targets could be an important moderator of subliminal priming effects. For example, Dehaene et al. (1998) used primes that were also presented as targets (generally referred to as repeated primes). In contrast, Damian (2001) used primes that were never shown as targets (referred to as novel primes). The opposite nature of their results indicates that prime novelty should be considered a putatively important factor able to influence the emergence of subliminal priming effects.

Category size

Some researchers only found semantic priming effects when the stimuli presented to the subjects originated from a small category (e.g. body parts, numbers, months, etc.). When subjects had to judge whether words were animals or non-animals (Forster, 2004; Forster, Mohan & Hector, 2003), no priming effect for this large category of animals was observed. This suggests that category size might be a moderating factor.

Target set size

Contrary to Forster et al. (2003), several other researchers did find priming effects for primes from a large category (e.g. Klauer et al., 2007; Van den Bussche & Reynvoet, 2007), indicating that category size might interact with other moderators of subliminal semantic priming effects. Using a large category, Kiesel, Kunde, Pohl and Hoffmann (2006) did obtain priming effects, but only when they presented a large number of targets to the subjects. When a limited number of targets was presented, no priming effects emerged.
Their results indicate that another factor potentially influencing priming effects is target set size (i.e. the number of targets presented to the subjects). These findings are in line with Abrams (2008), who also did not find priming for novel primes when the target set was small.

The nature of primes and targets

In subliminal semantic priming experiments, the primes and targets presented to the subjects might take various forms, such as digits (e.g. 1, 2), number words (e.g. one, two), words, pictures, Chinese symbols or letters. An extensive body of naming research illustrates the differences between the processing of word and picture stimuli (see Glaser, 1992, for a review). For visually presented words there is plenty of evidence that it is possible to name them without semantic mediation, whereas pictures cannot be named if their meaning is not activated first (e.g. Theios & Amrhein, 1989). The same argument has been made for the discrepancy between the processing of words and symbols: just like pictures, symbols like digits and Chinese characters require semantic mediation to be named (e.g., Perfetti & Tan, 1998; Tan, Hoosain & Peng, 1995; Tan, Hoosain & Siok, 1996; Ziegler, Ferrand, Jacobs, Rey & Grainger, 2000). Although this all-or-none distinction might be too strict, it seems reasonable to suspect that the processing of words and number words on the one hand and pictures and symbols on the other hand differs. The former will predominantly rely on non-semantic processing, whereas semantic mediation is more likely for the latter. This discrepancy implies that the nature of the primes and targets used in a subliminal priming task might act as potential moderators of priming effects. For example, Kiesel et al. (2006) wondered why they were unable to find significant priming for novel primes with a target set of four words, whereas this has been regularly observed for four novel digit targets.

Prime duration and SOA

One of the defining characteristics of subliminal priming is the fact that the prime has to be presented below the threshold for conscious perception. A first means to ensure this is to present the primes for a very short period of time. Several studies have found that semantic priming effects increase with prime duration (e.g. Holcomb, Reder, Misra & Grainger, 2005; Klauer et al., 2007). Vorberg, Mattler, Heinecke, Schmidt & Schwarzbach (2004) concluded that the same is true for the stimulus onset asynchrony (SOA, i.e. the time interval between the onset of the prime and the onset of the target).

Masking

A second precaution taken to ensure subliminal presentation of a prime is the masking of that prime. In the research field of subliminal semantic priming three visual masking techniques are common: backward masking, where a mask succeeds the prime stimulus; forward masking, where the mask precedes the prime, which is then immediately replaced by the target; and a combination of both forward and backward masking. A visual mask can contain a series of symbols (e.g. #### or %#%#), letters or scrambled patterns. The goal of these masks is to eliminate the visual image of the prime immediately by replacing it with a new image. It has been suggested that the nature of the masking procedure can influence subliminal priming effects. Klauer et al. (2007), for example, reported that more severe masking conditions might have contributed to an observed decrease in subliminal semantic priming in their study (see also Van Opstal, Reynvoet & Verguts, 2005a).
Visibility measures

As the subliminal nature of the primes is the defining feature of subliminal semantic priming, one should provide some report of whether this criterion was met. If no measure assessing prime visibility is reported, we have no way of knowing whether the primes were indeed presented subliminally. If this failure to report a prime visibility measure reflects the fact that primes were actually presented above the consciousness threshold, we would expect these studies to also report stronger priming effects, because increased prime visibility is associated with stronger priming effects (e.g. Greenwald et al., 1996; Holcomb et al., 2005). If this is the case, whether or not a study measured prime visibility might be a moderator of subliminal priming.

In general, the different measures of prime awareness used in the literature can be subdivided into two classes (Merikle & Reingold, 1992): a first class assessing a subjective threshold, where conscious awareness is indexed by the subjects' self-reports (e.g. asking the subjects after the experiment whether they were aware of the primes or not); and a second class administering an objective threshold, where conscious awareness is indexed by a measure of the subjects' discriminative abilities, such as a forced-choice absent-present decision or categorization task (e.g. asking the subjects to categorize the subliminal primes instead of the targets). This objective measure provides a lower threshold for conscious awareness, leading to more conservative evaluations of when subliminal perception occurs (e.g. Cheesman & Merikle, 1986). Regardless of whether either of these methods is considered adequate and reliable (in fact this is strongly debated, see for example Merikle & Reingold, 1992), we can assume that if the nature of the measure used to assess prime visibility moderates the observed priming effects, more stringent (i.e. objective) methods would lead to reduced priming effects compared to more liberal (i.e. subjective) methods.

Finally, some of the studies reporting an objective visibility test also calculate a direct measure of prime visibility (the d' measure). This d' measure is a sensitivity measure based on signal detection theory (Greenwald et al., 2003); a computational sketch is given at the end of this Introduction. Subjects are generally presented with the same stimuli as during the priming experiment and are asked to perform the same task as before, but now on the primes instead of the targets. A mean d' value of 0 indicates zero visibility of the primes, and the larger a positive d' value, the more visible the primes were. When d' values are small and not significantly different from 0, they are considered to indicate that unawareness of the primes was guaranteed. Again, if increased prime awareness induces stronger priming effects, this measure should turn out to be a moderator of subliminal priming effects.

Study features

We also included two study features. First, the sample size (N) of the studies was considered as a potential moderator of subliminal priming effects. If publication or reporting bias is present in the subliminal priming literature (see below), we expect that smaller samples would lead to stronger priming effects. Second, we also looked at the impact of the population investigated by the studies. This study feature was included because it has been suggested that conscious semantic priming effects increase with age (Laver & Burke, 1993).
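Although the exact computation is not spelled out in the reviewed studies, d' is, in standard signal detection theory, the difference between the z-transformed hit rate and false-alarm rate obtained in such a prime-classification test. The following minimal Python sketch illustrates this; the function name and the correction for extreme rates are our own choices rather than a procedure taken from any particular study.

```python
from statistics import NormalDist

def d_prime(hits: int, misses: int, false_alarms: int, correct_rejections: int) -> float:
    """Sensitivity d' = z(hit rate) - z(false-alarm rate). A log-linear correction
    (add 0.5 to each count) keeps the rates away from exactly 0 or 1."""
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# Example: a hypothetical prime-classification test with 40 trials per prime category
print(round(d_prime(22, 18, 19, 21), 2))  # approximately 0.18, i.e. close to zero
```

A d' close to zero, as in this made-up example, is the pattern the reviewed studies take as evidence that the primes were effectively invisible.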
Method

Literature search

Following the recommendations of Lipsey and Wilson (2001), five procedures were used to retrieve published and unpublished studies. First, we performed database searches on PsycINFO, Web of Science, PubMed and ScienceDirect (which also covers Dissertation Abstracts International). We used the search string (semantic or associative) and (priming or prime) and (masked or subliminal or unconscious or automatic). We used language and publication date as additional search parameters: only studies published in English, French, Dutch or German between January 1998 and December 2006 were considered for the present meta-analysis. We searched for studies published since 1998, because Kouider and Dehaene (2007) mention the publication of Dehaene and colleagues in Nature in 1998 as a marker of the reinstatement of the study of non-conscious processing using a new and reliable paradigm of subliminal priming, which no longer suffered from the earlier methodological criticisms. Furthermore, in their publication Dehaene and colleagues (1998) claimed for the first time that subliminal primes are semantically processed. Second, the tables of contents of journals that commonly publish articles in this area were searched for the same period (e.g. Advances in Cognitive Psychology, Cognition, Journal of Experimental Psychology: Human Perception and Performance). Third, we examined the reference sections of relevant literature reviews for additional citations. Fourth, we checked the reference sections of all potentially qualifying articles for additional citations. Finally, we contacted established subliminal priming researchers and requested any relevant published and unpublished studies.

Inclusion and exclusion criteria

We used the following criteria to select studies for inclusion in the meta-analysis:

1. The relation between the prime and target should be of a direct semantic nature (for example, cat - dog) in the visual domain. Thus, studies investigating phonological priming, morphological priming, orthographic priming, negative priming, repetition priming, cross-language priming, stem-completion, mediated priming, action priming and auditory priming were excluded.

2. The primes have to be presented subliminally (i.e. beneath the consciousness threshold). Thus, studies where the prime was presented for 100 ms or more, where a task had to be performed on the prime, where the subjects were explicitly aware of the presence of the primes or where a study phase preceded the priming task, were excluded.

3. This meta-analysis focuses solely on studies employing the most standard experimental tasks in priming: semantic categorization, lexical decision and naming. Studies using a different task (i.e. rapid serial visual presentation, scrambled sentence tasks, Stroop tasks, flanker tasks or double tasks) or studies where no experimental priming task was present (e.g. questionnaire studies, reviews or theoretical articles) were excluded.

4. Only studies using a standard priming procedure, where subjects are instructed to respond as fast as possible to the target which is preceded by a subliminal prime and where response time to the targets is the crucial dependent variable, were considered. Studies whose procedure deviated from this standard procedure (i.e. cueing procedures, response window procedures or mood induction procedures) were excluded.

5. We also limited the present meta-analysis to centrally presented single word or symbol primes.
Thus, studies using lateralized presentation of stimuli, parafoveal primes, sentence primes, color primes, ambiguous primes, multiple primes, video primes, odor primes, music primes, context primes or two-letter strings were excluded. 6. We focused exclusively on studies that tested a healthy sample containing more than one subject. Studies testing clinical populations without including a control group or casestudies were excluded. 7. Finally, only studies were included which contained sufficient statistical information to compute an effect size. If insufficient information was present in the article, the corresponding author was contacted to obtain the necessary data. When after contacting the author twice this was unsuccessful, the study was excluded from the meta-analysis. An overview of the number of studies excluded based on these seven criteria is shown in Figure 1. We note that this figure shows the most prominent reason used to exclude a study (since several studies had to be excluded for multiple reasons). These criteria resulted in the selection of 37 studies published between 1998 and 2006 and eight unpublished studies. 15 Figure 1. Flowchart illustrating the number of published articles omitted based on the seven inclusion criteria. b/c = because. Coding Procedure and reliability We used a standard coding system to rate each study or (when present) the different conditions within a study. Table 1 lists the putative moderators coded for each study or condition within a study, the operationalization of each moderator and relevant descriptive statistics describing the distribution of the moderators. To limit the number of levels of the categorical moderators some categories were grouped, as described in Table 1. Intercoder agreement for the moderators was established by having two independent reviewers code all the moderators for a randomly selected 30% of the studies included in the meta-analysis. To assess intercoder reliability we calculated the intraclass correlation coefficient for continuous variables and kappa coefficients for categorical variables. The intraclass correlation coefficients ranged from .996 (for the target set size) to 1.0 (for all other continuous variables). The kappa coefficients ranged from .744 (for the nature of the visibility measure) to 1.0 (for all other categorical variables). The high intercoder agreement is due to the fact that coding the moderators was very straightforward. Disagreements were resolved by discussion, and the final coding reflects the consensus between the two coders. Table 1 Operationalization and descriptive statistics for potential moderators for the semantic categorization and lexical decision/naming conditions separately Moderator Value Coding description and criteria Population 1 = students 2 = adults Categorical variable representing whether the sample consisted of students or adults. Sample size Continuous Task 1 = lexical decision 2 = naming Continuous variable representing the sample size of the condition. Categorical variable representing whether a lexical decision task or a naming task was used. Primes 1 = symbols 2 = words 3 = both Prime novelty 1 = repeated primes 2 = novel primes 3 = both Targets 1 = symbols 2 = words 3 = both Category size 1 = small 2 = large Set size Continuous Categorical variable representing whether the prime stimuli were symbols (i.e. digits, letters, pictures or Chinese symbols), words (i.e. words or number words) or both (i.e. mixture of digits and number words). 
Categorical variable representing whether the prime stimuli were repeated primes (i.e. primes that were also presented as targets), novel primes (i.e. primes that were never presented as targets) or both. Categorical variable representing whether the target stimuli were symbols (i.e. digits, letters, pictures or Chinese symbols), words (i.e. words or number words) or both (i.e. mixture of digits and number words). Categorical variable representing whether the stimuli came from a large category (i.e. animals, objects, positive and negative words, adjectives or a combination of various stimulus categories) or from a small category (i.e. months, numbers below 10, farm animals, types of dogs, body parts, letters or words related to power or sex). Continuous variable representing the number of targets presented to the subjects. Descriptive statistics for semantic categorization conditions k = 87 Students: k = 62 Adults: k = 25 k = 87 M = 20, SD = 12.1, range = 6-80 k = 87 Symbols: k = 32 Words: k = 37 Both: k = 18 k = 87 Repeated primes: k = 40 Novel primes: k = 43 Both: k = 4 k = 87 Symbols: k = 34 Words: k = 37 Both: k = 16 k = 87 Small: k = 44 Large: k = 43 k = 87 M = 19, SD = 22.8, range = 4-90 Descriptive statistics for lexical decision and naming conditions k = 48 Students: k = 38 Adults: k = 10 k = 48 M = 37, SD = 29.1, range = 11-132 k = 48 Lexical decision: k = 39 Naming: k = 9 k = 48 Symbols: k = 2 Words: k = 46 k = 48 Repeated primes: k = 2 Novel primes: k = 46 k = 48 Symbols: k = 9 Words: k = 39 k = 40 Small: k = 2 Large: k = 38 k = 40 M = 117, SD = 109.1, range = 6-320 Moderator Value Coding description and criteria Prime duration Continuous SOA Continuous Masking 1 = BM and FM 2 = BM 3 = FM 4 = None Visibility measured 0 = no 1 = yes Continuous variable representing how long the primes were presented to the subjects (in milliseconds). Continuous variable representing how long the stimulus onset asynchrony (SOA) was (i.e. time interval between prime onset and target onset, in milliseconds). Categorical variable representing whether the condition used both backward and forward masking (BM and FM; i.e. mask presented before and after the prime), only backward masking (BM; i.e. mask presented after the prime), only forward masking (FM; i.e. mask presented before the prime) or no masking (prime not masked). Dichotomous variable representing whether or not the visibility of the primes was measured in the condition. Nature of visibility measure 1 = objective 2 = subjective 3 = both d’ Continuous Categorical variable representing how the visibility of the primes was measured in the condition: either objectively (i.e. an objective test of prime visibility), subjectively (i.e. a subjective test of prime visibility), or both. A measure often used to test prime visibility is the d’ measure, which is a sensitivity measure based on signal detection theory. Subjects receive the same stimuli and are asked to perform the same task as before but now on the primes instead of the targets. A d’ value of 0 reflects chancelevel visibility of the primes. 
Descriptive statistics for semantic categorization conditions k = 87 M = 42, SD = 70.7, range = 17-72 k = 87 M = 96, SD = 35.0, range = 41-273 Descriptive statistics for lexical decision and naming conditions k = 48 M = 51, SD = 17.5, range = 13-84 k = 38 M = 91, SD = 67.0, range = 33-340 k = 87 BM and FM: k = 70 BM: k = 1 FM: k = 16 k = 48 BM and FM: k = 26 BM: k = 5 FM: k = 13 None: k = 4 k = 87 No: k = 14 Yes: k = 73 k = 73 Objective: k = 56 Subjective: k = 10 Both: k = 7 k = 58 M = 0.19, SD = 0.20, range = -0.06-0.66 k = 48 No: k = 21 Yes: k = 27 k = 27 Objective: k = 19 Subjective: k = 6 Both: k = 2 k=4 M = 0.05, SD = 0.04, range = -0.01-0.08 Note. k = number of effect sizes in the category, M = mean, SD = standard deviation. 19 Meta-analytic procedures In most cases, each study contained multiple conditions (e.g. multiple experiments within a study and/or manipulations of potential moderators within a single experiment). Only conditions that met all inclusion criteria described above were included in the metaanalysis. The 45 studies were included because at least one of their conditions met all inclusion criteria. In total, these 45 studies yielded 135 separate conditions. Because of the important discrepancy in the possible origins of the observed priming effects between semantic categorization tasks on the one hand and lexical decision and naming tasks on the other hand (see above) we will make a strict distinction between these two. A second reason to separate them was the fact that lexical decision and naming conditions have very different features than semantic categorization conditions, making them difficult to compare. This can be seen in Tables 2 and 3. For example, almost all lexical decision/naming conditions used word primes and targets, novel primes and large stimulus categories. Therefore, the 135 conditions were divided into two groups: one group containing all semantic categorization conditions (22 studies comprising 87 separate conditions) and the other group containing all lexical decision and naming conditions (24 studies comprising 48 separate conditions). Note that one study contained both semantic categorization and lexical decision/naming conditions, and was therefore included in both groups. Two separate meta-analyses were performed on these two groups, always using exactly the same procedure described below. Tables 2 and 3 provide an overview and description in terms of the putative moderators of all conditions included in the semantic categorization (Table 2) and lexical decision/naming (Table 3) meta-analyses. Table 2 Effect sizes ( θˆ jk ), standard errors (SEjk) and the study characteristics for the semantic categorization conditions 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 Bodner & Dypvik (2005)a Bodner & Dypvik (2005)b Bodner & Dypvik (2005)c Bodner & Dypvik (2005)d Bodner & Dypvik (2005)e Bodner & Dypvik (2005)f Bodner & Dypvik (2005)g Bodner & Dypvik (2005)h Bodner & Dypvik (2005)i Bodner & Dypvik (2005)j Bueno & Frenck-Mestre (2002) Damian (2001)a Damian (2001)b Damian (2001)c Damian (2001)d Damian (2001)e Dehaene et al. (1998)a Dehaene et al. (1998)b Dell'Acqua & Grainger (1999)a Dell'Acqua & Grainger (1999)b Forster et al. (2003)a Forster et al. (2003)b Forster et al. (2003)c Forster (2004) Kiesel et al. (2006)a Kiesel et al. (2006)b Kiesel et al. (2006)c Kiesel et al. (2006)d Kiesel et al. (2006)e Kiesel et al. (2006)f Kinoshita et al. (2006)a* Kinoshita et al. (2006)b* Koechlin et al. 
(1999)a Koechlin et al. (1999)b Kunde et al. (2003)a Kunde et al. (2003)b Kunde et al. (2003)c Kunde et al. (2003)d N Population Primes 20 20 20 20 23 24 44 57 20 20 48 16 16 16 16 16 12 9§ 18 18 54 54§ 54 22 11 11§ 11§ 12 12§ 12§ 24 28 25 24 12 12§ 12 12§ 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 1 1 1 1 1 1 2 2 2 2 2 2 1 1 1 1 2 2 2 2 2 2 1 1 1 1 1 1 1 1 2 2 2 2 2 2 3 3 1 1 2 2 2 2 2 2 2 2 2 2 1 2 3 1 3 3 1 1 Prime novelty 3 3 3 3 1 1 1 1 1 1 2 1 1 2 2 1 1 1 2 2 2 2 2 2 1 2 2 1 2 2 1 1 1 1 1 2 1 2 Targets 2 2 1 1 1 1 1 1 1 1 2 2 2 2 2 2 3 3 2 2 2 2 2 2 2 2 2 2 2 2 1 2 3 1 3 3 1 1 Category size 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 1 1 2 2 2 2 2 1 2 2 2 2 2 2 1 1 1 1 1 1 1 1 Set size 6 6 6 6 8 8 8 8 8 8 40 12 12 12 12 12 4 4 84 84 90 90 90 20 40 40 40 4 4 4 8 8 4 4 4 4 4 4 Prime duration 50 50 42 42 42 42 42 42 42 42 57 43 43 43 43 43 43 43 17 17 55 55 41 55 43 43 43 43 43 43 53 53 66 66 43 43 29 29 SOA Masking 50 50 42 42 42 42 42 42 42 42 71 72 72 72 72 72 114 114 134 134 55 55 41 55 115 115 115 115 115 115 53 53 132 132 115 115 101 101 3 3 3 3 3 3 3 3 3 3 1 1 1 1 1 1 1 1 1 1 3 3 3 3 1 1 1 1 1 1 3 3 1 1 1 1 1 1 Visibility measured 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 0 0 0 0 1 1 1 1 1 1 0 0 1 1 1 1 1 1 Obj/ Subj 2 2 2 2 2 2 2 2 3 3 1 1 1 1 1 1 1 . 1 1 . . . . 1 1 1 1 1 1 . . 2 2 1 1 1 1 d’ θˆjk (SEjk) . . . . . . . . . . . .06 .05 .08 .02 .12 . . .05 .05 . . . . .17 .17 .17 .24 .24 .24 . . . . .29 .29 .33 .33 1.17 (0.32) 0.13 (0.24) 1.75 (0.41) 0.42 (0.25) 1.14 (0.29) 0.37 (0.22) 1.33 (0.22) 1.00 (0.17) 1.53 (0.37) 1.64 (0.39) 0.43 (0.15) 0.62 (0.30) 0.68 (0.31) -0.15 (0.27) 0.00 (0.27) 0.60 (0.30) 1.76 (0.59) 1.31 (0.61) 0.53 (0.27) 0.58 (0.28) 0.03 (0.36) 0.48 (0.14) 0.08 (0.15) 1.62 (0.14) 0.36 (0.35) 0.52 (0.37) 0.78 (0.41) 1.02 (0.43) 0.28 (0.33) -0.14 (0.32) 1.22 (0.30) 1.45 (0.30) 1.53 (0.33) 1.04 (0.28) 1.43 (0.51) 1.02 (0.43) 1.13 (0.45) 0.48 (0.35) 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 Kunde et al. (2003)e Kunde et al. (2003)f Kunde et al. (2005)a Kunde et al. (2005)b Naccache & Dehaene (2001)a Naccache & Dehaene (2001)b Naccache & Dehaene (2001)c Naccache & Dehaene (2001)d Pohl et al. (2004)a* Pohl et al. (2004)b* Pohl et al. (2004)c* Pohl et al. (2004)d* Pohl et al. (2004)e* Pohl et al. (2004)f* Pohl et al. (2004)g* Pohl et al. (2004)h* Pohl et al. (2005)a* Pohl et al. (2005)b* Pohl et al. (2005)c* Pohl et al. (2005)d* Pohl et al. (2005)e* Pohl et al. (2006)a* Pohl et al. (2006)b* Pohl et al. (2006)c* Pohl et al. (2006)d* Pohl et al. (2006)e* Pohl et al. (2006)f* Pohl et al. (2006)g* Pohl et al. (2006)h* Pohl et al. (2006)i* Pohl et al. (2006)j* Reynvoet et al. (2002)a Reynvoet et al. (2002)b Reynvoet et al. (2002)c Reynvoet et al. (2002)d Reynvoet et al. (2002)e Reynvoet et al. (2002)f Rouibah et al. (1999) Théoret et al. 
(2004) Van den Bussche & Reynvoet (2006)a* N Population Primes 24 24§ 16 16 18 18§ 18 18§ 18 18§ 18§ 17 17§ 16 16§ 16§ 11 11§ 11§ 11§ 11§ 12 12§ 12§ 12§ 12 12§ 12§ 12§ 12 12§ 16 16§ 16 16§ 16 16§ 80 6 24 2 2 2 2 1 1 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 3 1 Prime novelty 1 2 1 2 1 2 1 2 1 2 2 1 2 1 2 2 1 2 2 2 2 1 2 2 2 1 2 2 2 1 2 1 2 1 2 1 2 2 1 1 Targets 3 3 3 3 3 3 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 3 1 Category size 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 2 1 1 Set size 4 4 4 4 4 4 4 4 4 4 4 40 40 40 40 40 40 40 40 40 40 4 4 4 4 40 40 40 40 40 40 6 6 6 6 6 6 60 8 4 Prime duration 43 43 72 72 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 28 28 28 28 28 28 28 28 28 28 57 57 57 57 57 57 49 43 33 SOA Masking 115 115 144 144 114 114 114 114 115 115 115 115 115 115 115 115 115 115 115 115 115 100 100 100 100 100 100 100 100 100 100 114 114 114 114 114 114 273 114 50 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 Visibility measured 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 1 1 Obj/ Subj 1 1 1 1 3 3 3 3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 . . . . . . . 3 1 d’ θˆjk (SEjk) .22 .22 .66 .66 .60 .60 .01 .01 .08 .08 .08 .07 .07 .05 .05 .05 .05 .05 .05 .05 .05 .35 .35 .35 .35 .08 .08 .08 .08 .64 .64 . . . . . . . . .46 0.82 (0.25) 0.55 (0.23) 1.92 (0.50) 2.06 (0.53) 0.88 (0.31) 0.70 (0.29) 0.98 (0.32) 0.76 (0.29) 0.00 (0.25) 0.19 (0.25) -0.12 (0.25) 0.70 (0.30) 0.04 (0.26) 0.83 (0.33) -0.32 (0.28) -0.11 (0.27) -0.12 (0.34) 0.00 (0.34) 0.15 (0.34) 0.60 (0.38) -0.31 (0.35) 1.28 (0.48) 0.07 (0.32) 0.39 (0.34) -0.11 (0.32) 0.95 (0.41) 1.05 (0.43) 0.66 (0.37) -0.20 (0.32) 1.88 (0.61) 0.75 (0.38) 1.64 (0.45) 1.35 (0.40) 1.51 (0.43) 1.03 (0.35) 1.72 (0.47) 0.62 (0.30) 0.82 (0.13) 1.55 (1.03) 1.37 (0.31) 22 Van den Bussche & Reynvoet (2006)b* Van den Bussche & Reynvoet (2006)c* Van den Bussche & Reynvoet (2006)d* Van Opstal et al. (2005a)a Van Opstal et al. (2005a)b Van Opstal et al. (2005b)a Van Opstal et al. (2005b)b Van Opstal et al. (2005b)c Van Opstal et al. (2005b)d 79 80 81 82 83 84 85 86 87 N Population Primes 24§ 24 24§ 16 16§ 23 23§ 23 23§ 1 1 1 2 2 2 2 2 2 1 2 2 1 1 3 3 3 3 Prime novelty 2 1 2 1 1 1 2 1 2 Targets 1 2 2 1 1 3 3 3 3 Category size 1 2 2 1 1 1 1 1 1 Set size 4 12 12 4 4 4 4 4 4 Prime duration 33 33 33 33 33 33 33 33 33 SOA Masking 50 50 50 100 100 100 100 100 100 1 1 1 1 1 1 1 1 1 Visibility measured 1 1 1 1 1 1 1 1 1 Obj/ Subj 1 1 1 1 1 1 1 1 1 d’ θˆjk (SEjk) .46 -.06 -.06 .18 .18 .00 .00 .15 .15 1.10 (0.28) 1.31 (0.31) 0.25 (0.22) 0.73 (0.31) 1.11 (0.37) 0.77 (0.26) 0.27 (0.22) 1.80 (0.38) 1.76 (0.38) Note. Effect sizes ( θˆ jk ) that are positive indicate positive priming effects. N = sample size. SOA = Stimulus Onset Asynchrony. Obj/Subj: the nature of the visibility measure. d’ = objective measure of prime visibility. Population: 1 = students, 2 = adults. Primes: 1 = symbols, 2 = words, 3 = both. Prime novelty: 1 = repeated primes, 2 = novel primes, 3 = both. Targets: 1 = symbols, 2 = words, 3 = both. Category size: 1 = small, 2 = large. Masking: 1 = backward and forward masking, 2 = backward masking, 3 = forward masking. Visibility measured: 0 = no, 1 = yes. Obj/Subj: 1 = objective, 2 = subjective, 3 = both. 
a…j : different conditions within a study * unpublished data § (Part of the) same sample used as in a previous condition of the same study 23 Table 3 Effect sizes ( θˆ jk ), standard errors (SEjk) and the study characteristics for the lexical decision and naming conditions 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 Alameda et al. (2003)a Alameda et al. (2003)b Alameda et al. (2003)c Bajo et al. (2003)a Bajo et al. (2003)b Bodner & Masson (2003)a Bodner & Masson (2003)b Bodner & Masson (2003)c Bourassa & Besner (1998)a Bourassa & Besner (1998)b Brown & Besner (2002) Dell'Acqua & Grainger (1999) Devlin et al. (2004) Devlin et al. (2006) Finkbeiner & Caramazza (2006)a Finkbeiner & Caramazza (2006)b Forster & Hector (2005)a* Forster & Hector (2005)b* Forster & Hector (2005)c* Forster & Hector (2005)d* Forster & Hector (2005)e* Forster & Hector (2005)f* Grossi (2006)a Grossi (2006)b Kamphuis et al. (2005)a Kamphuis et al. (2005)b Kiefer (2002) Kiefer & Brendel (2006)a Kiefer & Brendel (2006)b Kiefer & Brendel (2006)c Kiefer & Brendel (2006)d Kiefer & Spitzer (2000)a Kiefer & Spitzer (2000)b Kolanczyk & Pawlowska-Fusiara (2002) N Pop Task Primes 50 50 50 30 30§ 100 50§ 40 130 132 72 19§ 11 11 18 20 20 20§ 18 18§ 20 20§ 20 20 20 20§ 24 16 16§ 16 16§ 20 20§ 40 1 1 1 1 1 1 1 1 1 1 1 1 2 2 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 1 2 2 1 2 2 1 1 1 1 1 1 2 1 1 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 Prime novelty 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 2 2 2 2 2 2 2 2 Targets 1 1 1 1 1 2 2 2 2 2 2 1 2 2 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 Category size 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 . . . . . . 2 2 1 1 2 2 2 2 2 2 2 2 Set size 50 50 50 30 30 200 200 200 80 80 96 72 56 56 46 38 . . . . . . 200 200 18 18 320 320 320 320 320 320 320 16 Prime duration 84 57 84 50 75 45 45 45 40 40 34 17 33 33 53 53 42 42 55 55 69 69 50 50 40 40 34 34 34 34 34 50 50 75 SOA Masking 98 71 98 64 89 45 45 45 80 340 284 134 33 33 53 66 . . . . . . 50 50 67 67 67 67 200 67 200 67 200 . 1 1 1 1 1 3 3 3 2 2 1 1 3 3 1 1 1 1 1 1 1 1 3 3 2 2 1 1 1 1 1 1 1 4 Visibility measured 0 0 0 0 0 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 Obj/ Subj . . . . . 2 1 2 2 2 . 1 1 1 2 2 1 1 1 1 1 1 1 1 . . 3 1 1 1 1 1 1 1 d’ θˆ jk (SEjk) . . . . . . . . . . . .07 . . . . . . . . . . .07 .08 . . -.01 . . . . . . . 0.89 (0.17) 0.34 (0.15) 0.70 (0.16) 0.28 (0.19) -0.07 (0.19) 0.76 (0.12) 0.94 (0.18) 0.69 (0.18) 0.37 (0.09) 0.33 (0.09) 0.35 (0.12) 0.71 (0.28) 0.45 (0.36) 0.45 (0.36) 0.57 (0.28) 0.48 (0.25) 0.48 (0.25) 0.22 (0.24) 0.27 (0.26) 0.53 (0.27) 0.22 (0.24) 0.43 (0.25) 0.81 (0.28) 0.59 (0.26) -0.24 (0.24) -0.13 (0.24) 1.09 (0.28) 0.96 (0.34) 0.46 (0.29) 0.98 (0.35) 1.10 (0.36) 1.00 (0.30) 1.07 (0.31) 0.36 (0.17) 24 McBride, Kripps & Forster (2003)* Perea & Lupker (2003)a Perea & Lupker (2003)b Perea & Rosa (2002a)a Perea & Rosa (2002a)b Perea & Rosa (2002a)c Perea & Rosa (2002b)a Perea & Rosa (2002b)b Perea & Rosa (2002b)c Perea & Rosa (2002b)d Ruz et al. (2003) Sassenberg & Moskowitz (2005) Sebba & Forster (1989)* Zhou et al. (1999) 35 36 37 38 39 40 41 42 43 44 45 46 47 48 N Pop Task Primes 20 120 36 48 48 44 32 32 32 32 45 15 45 29 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 1 Prime novelty 2 2 2 2 2 2 2 2 2 2 2 2 2 2 Targets 2 2 2 2 2 2 2 2 2 2 2 2 2 1 Category size . 2 2 2 2 2 2 2 2 2 2 2 . 2 Set size . 120 120 24 24 24 66 66 66 66 45 6 . 
45 Prime duration 67 40 40 66 66 83 66 83 66 83 13 50 50 57 SOA Masking . 80 80 66 66 83 66 83 66 83 . 50 . 57 1 1 1 3 3 4 3 3 3 3 2 4 1 4 Visibility measured 0 0 0 0 0 0 0 0 0 0 1 0 0 0 Obj/ Subj . . . . . . . . . . 3 . . . d’ θˆ jk (SEjk) . . . . . . . . . . . . . . 0.49 (0.25) 0.31 (0.09) 1.03 (0.22) 0.31 (0.15) 0.32 (0.15) 0.31 (0.16) 0.39 (0.19) 0.45 (0.19) 0.02 (0.18) 0.85 (0.22) 0.31 (0.16) 0.80 (0.34) -0.15 (0.15) 0.41 (0.20) Note. Effect sizes ( θˆ jk ) that are positive indicate positive priming effects. N = sample size. SOA = Stimulus Onset Asynchrony. Obj/Subj: the nature of the visibility measure. d’ = objective measure of prime visibility. Pop = population. Pop: 1 = students, 2 = adults. Primes: 1 = symbols, 2 = words. Task: 1 = lexical decision, 2 = naming. Prime novelty: 1 = repeated primes, 2 = novel primes. Targets: 1 = symbols, 2 = words. Category size: 1 = small, 2 = large. Masking: 1 = backward and forward masking, 2 = backward masking, 3 = forward masking, 4 = none. Visibility measured: 0 = no, 1 = yes. Obj/Subj: 1 = objective, 2 = subjective, 3 = both. a…f : different conditions within a study * unpublished data § (Part of the) same sample used as in a previous condition of the same study 25 All studies included in the meta-analyses used a single-group repeated measures design, where the same subject is observed under multiple treatment conditions, and thus serves as his or her own control (Morris & DeShon, 2002). Specifically, in the present studies each subjects’ reaction time (RT, in milliseconds) is measured on related or congruent trials (i.e. prime and target are related or congruent to each other, e.g. cat - dog) and on unrelated or incongruent trials (i.e. prime and target are unrelated or incongruent to each other, e.g. cat pot). For these single-group designs, the widespread method of calculating effect sizes for independent-groups designs (where the outcome is measured at a single point in time and is compared across independent groups that received different treatments, see Hedges, 1981; 1982), can no longer be applied and can not be compared with the measures used here to calculate effect sizes. In a single-group repeated measures design, for each subject a difference can be calculated between the mean RT on the unrelated/incongruent trials and the mean RT on the related/congruent trials. To express the priming effect for a certain condition, we will use the mean of these differences, divided by the standard deviation of the differences (Gibbons, Hedeker & Davis, 1993): θˆ jk = MD SDD where θˆjk is the estimated effect size for a certain condition j in study k 1 , MD is the difference between the mean RTs on the unrelated/incongruent trials and the related/congruent trials (unrelated/incongruent - related/congruent) and SDD is the sample deviation of the differences. In general, studies always report the means of the observed priming effects, allowing the calculation of the numerator, but the standard deviations are often unavailable. Therefore, we always estimated the effect sizes from reported test statistics. For all conditions included in the present meta-analyses we were able to obtain the repeated measures t test or F test. 
Since we are dealing with single-group repeated measures data, where one group of subjects is measured on two conditions (an unrelated/incongruent condition and a related/congruent condition), the paired one-sample t-test, for example, is calculated as:

$t = \frac{M_D - \delta}{SD_D / \sqrt{n}}$

where $\delta$ is the expected population mean difference (which equals 0 under the null hypothesis).

Footnote 1. The standard symbol used to describe the sample estimator of the effect size is d. However, to avoid all confusion with d', the objective measure of prime visibility mentioned in this study, we will use the symbol $\hat{\theta}$ to denote the sample statistic.

From this, the test statistics can be transformed into a repeated measures effect size using the following conversion formula (Rosenthal, 1991):

$\hat{\theta}_{jk} = \frac{M_D}{SD_D} = \frac{t}{\sqrt{n}}$

or

$\hat{\theta}_{jk} = \frac{\sqrt{F}}{\sqrt{n}}$

where n is the number of subjects in the experiment. Since the square root of the F value does not indicate the direction of the difference, we specified whether the effect size was positive or negative based on the pattern of means. A positive effect size can be interpreted as a positive priming effect (i.e. a faster response to related/congruent prime-target pairs compared to unrelated/incongruent prime-target pairs).

We also estimated the sampling error (SE) for each effect size before conducting the meta-analyses. The inverse of the estimates of the sampling variance of the observed effect sizes will be used afterwards in our meta-analyses to weight the observed effect sizes. For single-group repeated measures designs, the following variance formula has been proposed (Morris & DeShon, 2002):

$SE_{jk}^2 = \left(\frac{1}{n}\right)\left(\frac{n-1}{n-3}\right)\left(1 + n\hat{\theta}_{jk}^2\right) - \frac{\hat{\theta}_{jk}^2}{[c(df)]^2}$

where n is the number of subjects in the experiment and the bias function c(df) is approximated by (Hedges, 1982):

$c(df) = 1 - \frac{3}{4(n-2)}$

Tables 2 and 3 provide an overview of the effect sizes and corresponding sampling errors for the individual semantic categorization (Table 2) and lexical decision/naming (Table 3) conditions separately.

After estimating the effect size ($\hat{\theta}_{jk}$) and its corresponding sampling error ($SE_{jk}$) for each condition j from study k included in the meta-analysis, we can conduct the actual meta-analyses. The fact that subjects were nested within conditions and conditions were nested within studies indicates that three potential sources of variance were present. In a typical meta-analysis there are two sources of variance: (1) sampling variance, i.e. the differences between the observed effect sizes and the population effects estimated in the respective studies; and (2) between-study variance, i.e. systematic differences between the (population) effect sizes from different studies. However, in the present meta-analyses, we also have a third source of variance: (3) between-condition, within-study variance, i.e. systematic differences between the effect sizes from different conditions within the same study. To be able to take these three levels of variance into account, we used the multilevel meta-analysis approach (see Van den Noortgate & Onghena, 2003). Raudenbush and Bryk (1985) showed that a meta-analysis can be considered a special case of a multilevel analysis, except that aggregated data instead of raw data are used. Following the multilevel research tradition, we started from a random-effects model (REM), where studies are not seen as direct replications of each other, but rather as a random sample from a population of studies.
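Before turning to the model, the conversion and variance formulas above can be made concrete with a short Python sketch that turns a reported repeated measures t or F statistic into an effect size and its sampling variance. This is not the code used for the meta-analyses reported here (those were run in SAS, see below); the function names and the example values are purely illustrative.

```python
import math

def effect_size_from_t(t: float, n: int) -> float:
    """Repeated-measures effect size theta = M_D / SD_D = t / sqrt(n) (Rosenthal, 1991)."""
    return t / math.sqrt(n)

def effect_size_from_f(f: float, n: int, positive: bool) -> float:
    """Effect size from a repeated-measures F statistic; the sign has to be supplied
    from the pattern of means, because sqrt(F) is always positive."""
    theta = math.sqrt(f) / math.sqrt(n)
    return theta if positive else -theta

def sampling_variance(theta: float, n: int) -> float:
    """Sampling variance of a single-group repeated-measures effect size
    (Morris & DeShon, 2002), with the bias function c(df) of Hedges (1982)."""
    c = 1.0 - 3.0 / (4.0 * (n - 2))
    return (1.0 / n) * ((n - 1) / (n - 3)) * (1.0 + n * theta ** 2) - theta ** 2 / c ** 2

# Illustration with made-up numbers: a condition with n = 20 subjects reporting t = 2.5
theta = effect_size_from_t(2.5, 20)
se = math.sqrt(sampling_variance(theta, 20))
print(round(theta, 2), round(se, 2))  # 0.56 and roughly 0.25
```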
The REM without moderators is shown below:

$\hat{\theta}_{jk} = \beta_0 + V_k + U_{jk} + e_{jk}$

where $\hat{\theta}_{jk}$ is the observed effect size for condition j within study k; $\beta_0$ is the overall mean effect size, across all conditions and studies; $V_k$ refers to the random deviation of the effect in study k from the overall effect; $U_{jk}$ refers to the deviation of the effect for condition j in study k from the mean effect in study k; and $e_{jk}$ is the residual due to sampling fluctuation, indicating the deviation of the observed effect size from the population effect size for condition j in study k. All three error terms, $V_k$, $U_{jk}$ and $e_{jk}$, are assumed to be independently normally distributed with zero mean. Note that the sampling variance for each condition has been estimated before conducting the meta-analyses (see above). Thus, only $\beta_0$, the overall mean effect size, $\sigma_V^2$, the between-study variance component, and $\sigma_U^2$, the within-study variance component, are estimated in the meta-analysis.

This REM can then be extended by including moderators. The REM with two moderator variables, one study characteristic $X_1$ and one condition characteristic $X_2$, is shown below:

$\hat{\theta}_{jk} = \beta_0 + \beta_1 X_{1k} + \beta_2 X_{2jk} + V_k + U_{jk} + e_{jk}$

where $\beta_1$ is the estimated effect of the moderator at the between-study level and $\beta_2$ is the estimated effect of the moderator at the between-condition, within-study level. To estimate the parameters, maximum likelihood estimation was used, as implemented in the MIXED procedure of SAS (Littell, Milliken, Stroup, Wolfinger & Schabenberger, 2006). In this approach, the estimates from the individual studies are automatically weighted by the reciprocal of the variance of the observed effect sizes, which in our case is the sum of the sampling variance, the between-condition, within-study variance and the between-study variance. In this way, more precise observed effect sizes have a greater impact on the results. Furthermore, this kind of variance weighting accounts for both sample size and study design (Hedges & Olkin, 1985).

To determine whether the effect sizes were homogeneous (i.e. whether they could be treated as estimates of a common effect size), we can investigate whether the variance between the observed effect sizes is larger than what can be expected based on sampling variance alone. If this is the case, the effect sizes are heterogeneous and the presence of moderators is likely. To test this, we used a likelihood ratio test in which the model with the between-study variance component ($\sigma_V^2$) is compared with a model without this between-study variance component. The difference between the two deviance scores (the deviance being defined as -2 times the log-likelihood) follows a 50-50 mixture of the χ² distributions with 0 and 1 degrees of freedom (Raudenbush & Bryk, 2002; Snijders & Bosker, 1999), making it possible to determine whether significant variance is present at the between-study level. The same procedure can be used to test the significance of the between-condition, within-study variance ($\sigma_U^2$). If these tests indicate that significant between-study and/or between-condition, within-study variance is present, the presence of moderators of the effect sizes is likely. However, the multilevel meta-analysis model does not assume that all heterogeneity between studies and between conditions will be explained by the included moderators (Van den Noortgate & Onghena, 2003).
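To make the estimation procedure more tangible, the sketch below fits the intercept-only three-level model by maximum likelihood, treating the sampling variances as known, and performs the likelihood ratio test for the between-study variance component using the 50-50 chi-square mixture just described. This is only an illustrative re-implementation under our own assumptions (Python with numpy/scipy, L-BFGS-B optimization, invented function names); the analyses reported in this dissertation were carried out with the SAS MIXED procedure.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

def neg_loglik(params, effects, variances, study_ids, with_study_var=True):
    """Negative ML log-likelihood of the random-effects model
    theta_jk = beta0 + V_k + U_jk + e_jk, with known sampling variances for e_jk."""
    beta0 = params[0]
    sigma2_v = params[1] if with_study_var else 0.0  # between-study variance
    sigma2_u = params[-1]                            # between-condition, within-study variance
    ll = 0.0
    for k in np.unique(study_ids):
        idx = study_ids == k
        resid = effects[idx] - beta0
        # Conditions of the same study share the study effect V_k (off-diagonal sigma2_v);
        # the condition effect and the sampling error only add to the diagonal.
        cov = sigma2_v * np.ones((idx.sum(), idx.sum())) + np.diag(sigma2_u + variances[idx])
        _, logdet = np.linalg.slogdet(cov)
        ll -= 0.5 * (logdet + resid @ np.linalg.solve(cov, resid) + idx.sum() * np.log(2 * np.pi))
    return -ll

def fit(effects, variances, study_ids, with_study_var=True):
    effects, variances = np.asarray(effects, float), np.asarray(variances, float)
    study_ids = np.asarray(study_ids)
    x0 = [effects.mean(), 0.05, 0.05] if with_study_var else [effects.mean(), 0.05]
    bounds = [(None, None)] + [(1e-8, None)] * (len(x0) - 1)
    return minimize(neg_loglik, x0, args=(effects, variances, study_ids, with_study_var),
                    method="L-BFGS-B", bounds=bounds)

def lr_test_between_study(effects, variances, study_ids):
    """Likelihood ratio test for sigma2_v; the deviance difference is referred to a
    50-50 mixture of chi-square distributions with 0 and 1 degrees of freedom."""
    full = fit(effects, variances, study_ids, with_study_var=True)
    reduced = fit(effects, variances, study_ids, with_study_var=False)
    lr = 2 * (reduced.fun - full.fun)   # difference in deviances (-2 log-likelihood)
    return lr, 0.5 * chi2.sf(lr, df=1)  # the chi-square(0) half contributes no tail mass
```

Fed with the observed effect sizes, their squared standard errors and study identifiers (e.g. from Table 2), such a routine estimates $\beta_0$, $\sigma_V^2$ and $\sigma_U^2$ in the same spirit as the SAS analysis, with the inverse-variance weighting arising implicitly from the likelihood.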
Inherent to the data at hand in the present meta-analyses is the fact that the same sample is often used across multiple conditions within a study, so that certain samples contributed more than one effect size to the meta-analyses. More specifically, 46 of the 135 included conditions used a sample already used in a previous condition. This implies that within studies a substantial part of the effect sizes were not independent. Rosenthal (1991) noted that the inclusion of multiple effect sizes from the same sample treats dependent results as independent, in effect weighting each study according to the number of effect sizes it produces. If there is anything unusual or unrepresentative about studies contributing more than one effect size because of the reuse of the same sample, this can bias the outcome of the meta-analysis. To take this into account, we should ideally conduct a multivariate analysis in which the sampling covariance is also included. However, since the data necessary to calculate this sampling covariance (i.e. estimates of the covariances between the effect sizes within a study) were reported neither in the studies included in the present meta-analyses nor elsewhere in the literature, this was not feasible. Because the number of conditions reusing a previous sample was substantial, we decided not to restrict each sample to contributing only one effect size to the meta-analyses. However, to study the impact of this dependency on the outcome patterns of the meta-analyses, we conducted several sensitivity analyses (Greenhouse & Iyengar, 1994). For both the semantic categorization and the lexical decision/naming meta-analyses, all analyses were redone twice, each time including only one randomly selected effect size per sample. This allows us to investigate the influence of ignoring the dependency between the effect sizes on the observed patterns of results. Furthermore, since the two random samples are drawn from a large dataset, they are still sufficiently large to allow adequate estimation of the (co)variance components.

Another important issue to be addressed is the potential presence of reporting and publication bias. Significant results are more likely to be reported and published (Begg, 1994). If a meta-analysis is limited to published studies, there is a risk that it will lead to biased conclusions. Therefore, a first method to counter publication bias is to include as many unpublished studies as possible. In the present meta-analyses eight unpublished studies (i.e. 18%) comprising 37 conditions (i.e. 27%) were included. More specifically, five unpublished studies (i.e. 23%) comprising 29 conditions (i.e. 33%) were included in the semantic categorization meta-analysis, and three unpublished studies (i.e. 12.5%) comprising eight conditions (i.e. 17%) were included in the lexical decision and naming meta-analysis. Next to including as many unpublished studies as possible, it remains of great importance to attempt to identify potential publication bias. The most important means of identifying publication bias is sample size: small studies produce highly variable effect size estimates, which implies that aberrant values that occur by chance are much farther from the true mean effect size for small studies than for large studies.
Thus, if publication bias only selects those studies that report positive and statistically significant effect sizes, regardless of the sample size, then the effect sizes from the small samples, in general, are likely to be more positive than those from larger studies (Begg, 1994). This would lead to a negative correlation between sample size and effect size: larger effect sizes for small, less reliable studies and smaller effect sizes for large, more reliable studies. A first method we used to examine this bias is the construction of "funnel graphs", which plot sample size versus effect size (Light & Pillemer, 1984). If no bias is present, this plot should be shaped like a funnel with the spout pointing up (a broad range of points for the more variable small studies at the bottom and a narrow range of points for the large studies at the top). The mean effect size should be similar regardless of sample size: when a vertical line is drawn through the mean effect size, the graph should be symmetrical on either side of the line. However, if publication bias is present the funnel will be skewed. A second method we used to detect publication bias was to include whether the study was published or unpublished as a moderator in the model and to test its influence on the effect sizes.

Semantic Categorization

Results

Descriptive statistics

The literature search identified 22 studies that met all inclusion criteria and used a semantic categorization task in at least one of their conditions. In total, 87 conditions using a semantic categorization task could be extracted from these 22 studies. All conditions studied either an adult or a student sample, with a mean sample size of 20. Primes and targets were either symbols, words or a mixture of both. Primes were presented on average for 42 ms with an average SOA of 96 ms. The conditions presented subjects with novel or repeated primes or a mixture of both. On average subjects received 19 targets. In half of the conditions the stimuli came from a small category, while in the other half stimuli from a large category were used. Most of the conditions used backward and forward masking of the prime and the majority of the conditions assessed prime visibility. Two thirds of the conditions calculated a d′ measure to study prime perceptibility. Table 1 provides detailed descriptive statistics for the 87 semantic categorization conditions across the various potential moderators.

Overall mean effect size and effect size heterogeneity

Table 2 provides the observed effect sizes and corresponding sampling errors for the 87 semantic categorization conditions. Across all conditions, the mean effect size for the random effects model was 0.79 (k = 87, 95% CI = 0.59 – 0.99). Figure 2 graphically displays all observed effect sizes ordered by size and the overall mean effect size with their 95% confidence intervals. A likelihood ratio test comparing the model with the between-study variance with a model without the between-study variance component showed that significant variance was present at the between-study level (Var = 0.15, χ²(1) = 18.0, p < .0001; strictly speaking, a 50-50 mixture of the χ² distributions for 0 and 1 degrees of freedom was used, but we simply refer to this as χ²(1)). In addition, within studies we found significant differences between conditions (Var = 0.10, χ²(1) = 27.5, p < .0001). Thus, the presence of moderators of the effect sizes is likely.
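A hedged sketch of how such a boundary test on a variance component can be evaluated is shown below (Python; the inputs are the likelihood ratio statistics reported above, and the 50-50 mixture simply halves the χ²(1) tail probability):

    from scipy.stats import chi2

    def mixture_p_value(lrt):
        """P-value for a variance component tested on its boundary:
        a 50-50 mixture of chi-square(0) and chi-square(1)."""
        return 0.5 * chi2.sf(lrt, df=1) if lrt > 0 else 1.0

    # Likelihood ratio statistics reported above (differences in -2 log likelihood)
    print(mixture_p_value(18.0))   # between-study variance -> p < .0001
    print(mixture_p_value(27.5))   # between-condition, within-study variance -> p < .0001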
To be able to assess how much of the variance was situated at each of the three levels in the meta-analysis, we used the median sampling variance to calculate the total amount of variance (as the sum of the between-study variance, the within-study, between-condition variance and the median sampling variance). This strategy was used because the sampling variance for each condition was estimated before conducting the meta-analyses (see above) and these sampling variances vary depending on the study and the condition within the study. This implies that we cannot readily determine how much of the variance is located at the third level. Using the median sampling variance across all conditions solves this problem. The median amount of sampling variance accounted for 29% of the total variance (median Var = 0.10, mean Var = 0.13, range = 0.02 – 1.06). Thus, the total variance for the median study was 0.35 (0.15 + 0.10 + 0.10). Based on this, we could then calculate the percentage of variance situated at the between-study level (42%) and at the condition level (29%). According to the 75% rule of Hunter and Schmidt (1990), if less than 75% of the variance is due to sampling error (or other artifacts), it is likely that between-study and/or between-condition, within-study variances are substantial and the influence of potential moderators should be examined. Based on this rule, we can again conclude that the effect sizes are heterogeneous, which suggests that there may be moderators that can account for the variability in effect sizes.

Figure 2. Observed effect sizes for the semantic categorization conditions ordered by size of the effect size and the overall mean effect size (in bold) with their 95% confidence intervals.

Regression models with one moderator variable

We first examined random effect regression models where each of the 13 putative moderators, as described in Table 1, was entered separately. Table 4 provides a detailed overview of these 13 random effect regression models. We first provide a detailed description of the significant moderators and afterwards list the non-significant moderators.

Nature of primes. The nature of the primes proved to be a significant moderator of the observed effect sizes. Pairwise comparisons indicated that the average effect size for word primes (0.48) was significantly smaller than the average effect sizes for symbol primes (0.91) and for a mixture of symbol and word primes (1.15) (respectively t(51.2) = 2.89, p = .006 and t(28.6) = 3.44, p = .002). However, statistical significance tests of the regression coefficients (t-tests) against the null mean showed that, even though the average effect size for word primes was significantly smaller, the average effect sizes for all three kinds of primes were significantly different from zero. The moderator "nature of the primes" was able to explain 22% of the variance (R² = .22; note that this proportion of explained variance does not correspond to the "traditional" R² obtained with linear regression; the R² here is calculated across the three levels and using the median sampling variance).

Prime novelty. Prime novelty was also a significant moderator of the observed effect sizes. The average effect size for novel primes was significantly smaller than the average effect size for repeated primes (respectively 0.54 and 1.07, t(84) = 5.15, p < .0001).
However, significance tests of the regression coefficients showed that, even though the average effect size for novel primes was significantly smaller, the average effect sizes for novel primes, repeated primes and the mixture of both were all significantly different from zero. The moderator "prime novelty" was able to explain 26% of the variance.

Nature of targets. A third significant moderator of the observed effect sizes was the nature of the targets. The average effect size for word targets (0.45) was significantly smaller than the average effect sizes for symbol targets (0.96) and for a mixture of symbol and word targets (1.15) (respectively t(34.4) = 3.50, p = .001 and t(33.5) = 3.79, p = .0006). However, significance tests of the regression coefficients showed that, even though the average effect size for word targets was significantly smaller, the average effect sizes for all three kinds of targets were significantly different from zero. The moderator "nature of the targets" was able to explain 27% of the variance.

Category size. Category size emerged as a strong moderator of the observed effect sizes. The average effect size for conditions that used stimuli from a large category was significantly smaller than the average effect size for small categories (respectively 0.35 and 1.08, t(14.8) = 6.90, p < .0001). Again, significance tests of the regression coefficients indicated that, even though the average effect size for large categories was significantly smaller, the average effect sizes for small and large categories were both significantly different from zero. The moderator "category size" was able to explain 42% of the variance.

d′. Fifty-eight of the 87 conditions calculated a d′ measure to assess prime visibility. This d′ measure proved to be a significant moderator of the observed effect sizes, with a larger d′ (i.e. higher prime visibility) indicating a stronger effect (β = 1.04, F(1, 54.1) = 8.09, p = .006). The moderator "d′" was able to explain 20% of the variance. This significant positive effect of d′ on the effect sizes indicates that when primes are more visible, the effect sizes (and thus the observed priming effects) are larger. To assess whether a significant effect size was still present when the visibility of the primes is zero, we tested whether the intercept, where the d′ value equals zero, was still significant. This was indeed the case (β = 0.44, t(14.3) = 3.73, p = .002), indicating that significant priming was observed even when prime awareness was absent.
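As an illustration of what the fitted d′ model reported above implies, the sketch below plugs the reported intercept and slope into the regression equation (Python; the chosen d′ values are hypothetical examples):

    # Effect size predicted by the single-moderator d' model: theta = 0.44 + 1.04 * d'
    intercept, slope = 0.44, 1.04

    for d_prime in (0.0, 0.2, 0.5):          # hypothetical visibility levels
        predicted = intercept + slope * d_prime
        print(f"d' = {d_prime:.1f} -> predicted effect size {predicted:.2f}")

At d′ = 0 the prediction reduces to the intercept, which is why the intercept test speaks to priming in the complete absence of prime awareness.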
The moderators population, sample size, target set size, prime duration, SOA, masking, whether visibility was measured and the nature of the visibility measure did not have a significant effect on the observed effect sizes. More details are provided in Table 4.

Table 4
Random effect regression models with one moderator for semantic categorization conditions

Moderator (category)       k    β           95% CI          F (df)            p        Model R²
1 N                        87   -0.0007     -0.01, 0.01     0.02 (1, 36.5)    .89      0.00
2 Population               87                               1.81 (1, 29.9)    .19      0.02
   Students                62   0.71****    0.48, 0.94
   Adults                  25   0.98****    0.64, 1.32
3 Primes                   87                               7.39 (2, 39.4)    .002     0.22
   Symbols                 32   0.91****    0.67, 1.15
   Words                   37   0.48***     0.26, 0.69
   Both                    18   1.15****    0.83, 1.47
4 Prime novelty            87                               14.21 (2, 65.8)   <.0001   0.26
   Repeated                40   1.07****    0.87, 1.27
   Novel                   43   0.54****    0.35, 0.73
   Both                    4    0.70**      0.21, 1.18
5 Targets                  87                               9.51 (2, 39.3)    .0004    0.27
   Symbols                 34   0.96****    0.73, 1.19
   Words                   37   0.45***     0.25, 0.65
   Both                    16   1.15****    0.84, 1.46
6 Category size            87                               47.63 (1, 14.8)   <.0001   0.42
   Small                   44   1.08****    0.93, 1.23
   Large                   43   0.35***     0.21, 0.49
7 Set size                 87   -0.005      -0.01, 0.001    2.69 (1, 30.4)    .11      0.10
8 Prime duration           87   0.01        -0.003, 0.03    2.69 (1, 31.8)    .11      0.05
9 SOA                      87   0.0004      -0.004, 0.005   0.03 (1, 21.4)    .86      0.00
10 Masking                 87                               0.18 (2, 18.8)    .84      0.00
   BM and FM               70   0.76****    0.52, 1.00
   BM                      1    0.82        -0.24, 1.88
   FM                      16   0.93**      0.44, 1.42
11 Visibility measured     87                               1.03 (1, 21.4)    .32      0.01
   No                      14   0.98***     0.57, 1.39
   Yes                     73   0.73****    0.50, 0.96
12 Obj/Subj                73                               1.77 (2, 18.2)    .20      0.00
   Objective               56   0.65***     0.41, 0.89
   Subjective              10   0.86**      0.32, 1.40
   Both                    7    1.26***     0.67, 1.885
13 d′                      58   1.04**      0.32, 1.76      8.09 (1, 54.1)    .006     0.20

Note: k = number of effect sizes in the category. β = regression coefficients. The regression coefficients for the categorical variables can be interpreted as the mean effect sizes for each category. Model R² refers to the proportion of the explained total variance across the 3 levels, calculated using the median sampling variance. **** p < .0001. *** p < .001. ** p < .01. * p < .05.

Regression models with two-way interactions

After identifying the significant moderators in the regression models with one moderator, the two-way interactions between the four theoretically most interesting moderators (nature of primes, prime novelty, category size and target set size) were examined by entering the main effects of two moderators and the interaction between these two in a model (no other moderators were added to these models). None of the two-way interactions between these variables were significant (p-values ranging from .20 to .93). Because various moderator categories contained only a few observations and because interpretability would become increasingly complicated, no higher-order interactions were investigated.

Multiple regression models

To get a first overview of the relations between the moderators, Pearson correlations between two continuous moderators, Pearson correlations between a dichotomous and a continuous moderator, and odds ratios (Mantel-Haenszel statistic) between two dichotomous moderators were computed. To be able to interpret the correlations between all included moderators and to compute odds ratios, the categorical variables were reduced to dichotomous variables by eliminating the level with the least observations (this procedure was used only for these correlational analyses). More specifically, for the variables nature of the primes, nature of the targets, prime novelty and the nature of the visibility measure we eliminated the level "Both". For the variable masking we eliminated the level "Backward Masking". As a result, 62 of the 87 conditions were retained for these correlational analyses. The correlations and odds ratios are shown in Table 5.
This table also reports the variance inflation factors (VIF) for the moderators. Calculating the VIF is a method of detecting the severity of multicollinearity between the moderators. Different rules of thumb regard VIF values greater than 4 to 10 as indicative of severe multicollinearity (O'Brien, 2007). The VIF values exceeded this cut-off for the moderators nature of the primes (VIF = 69.09), nature of the targets (VIF = 24.12) and prime duration (VIF = 21.39). It makes sense that there is substantial overlap between the nature of the primes and targets, since almost all studies use the same kind of primes and targets within a condition. Because the nature of the primes is theoretically a key variable in the present meta-analyses, we decided to omit the variables nature of the targets and prime duration from the multiple moderator analyses. By eliminating these two variables, all remaining VIF values fell well below the cut-off (VIF range 1.10 – 2.88). The observed overlap between certain variables also indicates that we should be careful when interpreting the results from the regression models with one moderator variable, because confounding was present.

Since no two-way interactions proved to be significant, and to enhance interpretability, only main effects of the moderators were included in the multiple regression models. In a first step we examined a multiple model including the four theoretically most interesting moderators: nature of the primes, prime novelty, category size and target set size. Only prime novelty and category size reached statistical significance in this model (respectively F(2, 72.7) = 9.56, p = .0002 and F(1, 21.4) = 10.95, p = .003). The nature of the primes and target set size did not reach significance (respectively F(2, 31.1) = 0.19, p = .83 and F(1, 23.8) = 2.09, p = .16). The moderator variables of this model (k = 87) explained 46% of the total variance between observed effect sizes (R² = .46). In a second step all models with two or three of these four moderators were examined. The models' R² ranged from .24 to .48. The optimal model in terms of R² where all included moderators were significant was a multiple model containing only the moderators prime novelty and category size. Both moderators in this model were highly significant (F(2, 76.2) = 9.00, p = .0003 for prime novelty and F(1, 20.4) = 24.13, p < .0001 for category size). This model (k = 87) had an R² of .48.

We also used a backward elimination strategy, starting from a full model containing all moderators and then eliminating non-significant moderators from the model step by step based on the size of their p-value (the moderator with the highest p-value was eliminated first, and so on). Because the moderators d′ and the nature of the visibility measure could only be computed for a reduced number of effect sizes (respectively k = 58 and k = 72), and because in the estimation of the model parameters conditions are only included if data are available for all moderators in the model (conditions with missing values are excluded), these two variables were not included in the full model, allowing a first model selection based on a larger set of conditions. The R² for the full model (k = 87) was .48, and in this full model only the moderators prime novelty and category size remained significant (respectively F(2, 69) = 9.95, p = .0002 and F(1, 20.8) = 5.15, p = .03).
The backward strategy eliminated the following sequence of non-significant moderators: population, sample size, SOA, target set size, masking, nature of primes and whether the visibility was measured. So finally we are again left with the model containing only prime novelty and category size. When a different starting point was chosen in the backward strategy, the same optimal model was always reached.

In a final step we added d′ to the multiple model containing the four key moderators (nature of the primes, prime novelty, category size and target set size), because it proved to be a significant moderator in the models with one moderator. Prime novelty and d′ reached statistical significance in this model (respectively F(1, 35.4) = 20.98, p < .0001 and F(1, 46.3) = 4.07, p = .05). (Note that for the multiple models containing d′ (k = 58), prime novelty lost a degree of freedom because the 58 conditions in these analyses all used either novel (k = 34) or repeated (k = 24) primes, but none used a mixture of both.) The nature of the primes, category size and target set size did not reach significance (respectively F(2, 13.5) = 0.55, p = .59, F(1, 13.4) = 1.18, p = .30 and F(1, 30.6) = 2.22, p = .15). This model (k = 58) had an R² of .52. Again, all models with two, three or four of these five moderators were examined. The models' R² ranged from .16 to .55. The optimal model in terms of R² where all included moderators were significant was a model containing the moderators prime novelty, category size and d′. All moderators in this model were significant (F(1, 37.5) = 18.25, p = .0001 for prime novelty, F(1, 10.8) = 8.69, p = .01 for category size and F(1, 43.8) = 4.21, p = .05 for d′). This model (k = 58) had an R² of .53.

Table 5
Pearson correlations, odds ratios and variance inflation factors (VIF) among the moderators for the semantic categorization conditions

1 N: Population -.26*; Primes .06; Prime novelty -.03; Targets .07; Category size -.18; Set size .32*; Prime duration .31*; SOA -.66***; Masking .72***; Visibility measured -.35**; Obj/Subj .61***; d′ -.17; Effect size .06. VIF = 3.87
2 Population: Primes 1.29; Prime novelty 0.73; Targets 1.29; Category size 0.67; Set size -.17; Prime duration -.12; SOA .25*; Masking /; Visibility measured /; Obj/Subj /; d′ .20; Effect size -.006. VIF = 2.81
3 Primes: Prime novelty 2.11; Targets 208.00***; Category size 10.00***; Set size .21; Prime duration .40**; SOA .04; Masking 0.79; Visibility measured 0.84; Obj/Subj /; d′ -.58***; Effect size -.43**. VIF = 69.09
4 Prime novelty: Targets 2.78; Category size 5.20**; Set size .35**; Prime duration -.10; SOA .19; Masking 0.45; Visibility measured 0.91; Obj/Subj /; d′ -.07; Effect size -.55***. VIF = 1.12
5 Targets: Category size 28.80***; Set size .42**; Prime duration .16; SOA .08; Masking 0.79; Visibility measured 1.93; Obj/Subj /; d′ -.66***; Effect size -.50***. VIF = 24.12
6 Category size: Set size .50***; Prime duration -.34**; SOA .27*; Masking 0.14*; Visibility measured 10.64**; Obj/Subj /; d′ -.38*; Effect size -.62***. VIF = 5.23
7 Set size: Prime duration -.15; SOA .08; Masking .17; Visibility measured -.09; Obj/Subj -.26; d′ -.28; Effect size -.29*. VIF = 2.59
8 Prime duration: SOA -.07; Masking .29*; Visibility measured -.62***; Obj/Subj .33*; d′ -.36*; Effect size .14. VIF = 21.39
9 SOA: Masking -.71***; Visibility measured .18; Obj/Subj -.45**; d′ -.11; Effect size -.20. VIF = 3.05
10 Masking: Visibility measured 0.09**; Obj/Subj /; d′ /; Effect size .18. VIF not computable
11 Visibility measured: Obj/Subj /; d′ /; Effect size -.36*. VIF not computable
12 Obj/Subj: d′ /; Effect size .28*. VIF not computable
13 d′: Effect size .44**. VIF = 2.09

Note. To allow the calculation of the VIF, all categorical moderators were reduced to dichotomous variables by eliminating the level with the least observations. Values for pairs of dichotomous moderators are odds ratios (printed in bold in the original table); all other values are Pearson correlations. Some correlations could not be calculated (marked with /): the correlation between d′ and masking, because all studies reporting a d′ value used the same masking procedure (BM and FM); the correlation between d′ and whether visibility was measured or not, because all studies that reported a d′ measure used a visibility measure; and the correlation between d′ and the nature of the visibility measure, because all studies reporting a d′ value used an objective visibility test. Some odds ratios could not be calculated (marked with /) when one of the cells of the cross-table contained no observations.
Because of these missing correlations, the VIF for masking, whether visibility was measured and the nature of the visibility measure could not be computed. *** p < .001. ** p < .01. * p < .05.

Publication bias and dependency

A first strategy used to identify possible publication bias was the construction of a funnel graph plotting the sample size against the observed effect size for the conditions in the meta-analysis. Figure 3 shows this funnel graph. Visual inspection of this graph suggests that it is rather symmetrical around the mean effect size (0.79) and does not appear to be heavily skewed. This is confirmed by the non-significant correlation between sample size and effect size (r = .001, p = .99), suggesting that no strong publication bias is present. Additional arguments showing that publication bias did not play a crucial role in the present meta-analysis are (1) the fact that sample size did not prove to be a significant moderator (p = .89), which indicates that smaller samples did not report larger effect sizes, and (2) that whether the study was published or unpublished also did not significantly moderate the observed effect sizes: the average effect size for published conditions did not significantly differ from the average effect size for unpublished conditions (t(13.9) = 1.41, p = .18).

Sensitivity analyses were used to study the impact of the dependency of part of the conditions included in the meta-analysis. As explained above, certain samples contributed more than one effect size to the meta-analyses. More specifically, 41.4% of the included conditions (36 of 87 conditions, indicated in Table 2) used a sample already used in a previous condition. Using two random selection procedures, each sample was restricted to be included only once (and thus to contribute only one effect size) in the meta-analyses. Using these two new samples (k = 51), all analyses were redone. The overall mean effect sizes for the random effects models were now 0.78 for the first random selection and 0.80 for the second random selection. The regression models with one moderator showed a similar pattern of results, with the same moderators reaching significance: nature of the primes, F(2, 29.1) = 6.94, p = .003 for the first random selection and F(2, 39.7) = 4.50, p = .02 for the second random selection; prime novelty, F(2, 39.5) = 6.14, p = .005 for the first random selection and F(2, 35.1) = 6.65, p = .004 for the second random selection; category size, F(1, 4.41) = 24.33, p = .006 for the first random selection and F(1, 44) = 27.50, p < .0001 for the second random selection; nature of the targets, F(2, 27.8) = 7.76, p = .002 for the first random selection and F(2, 41.6) = 4.54, p = .02 for the second random selection; d′, F(1, 20.9) = 9.37, p = .006 for the first random selection and F(1, 23.7) = 3.26, p = .08 for the second random selection. As before, none of the two-way interactions were significant and the optimal multiple model contained prime novelty and category size. This indicates that the influence of the dependency between the effect sizes is limited: when the dependency is eliminated, a highly similar pattern of results emerges.

Figure 3. Funnel plot of the semantic categorization conditions (sample size plotted against the observed effect sizes).
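A funnel check of this kind is easy to reproduce from a table of per-condition sample sizes and effect sizes. The sketch below (Python) assumes a hypothetical file funnel_data.csv with columns n and effect_size; it plots the funnel and computes the sample size-effect size correlation:

    import pandas as pd
    import matplotlib.pyplot as plt
    from scipy.stats import pearsonr

    # Hypothetical input: one row per condition, with its sample size and effect size
    data = pd.read_csv("funnel_data.csv")          # columns: n, effect_size

    r, p = pearsonr(data["n"], data["effect_size"])
    print(f"correlation between sample size and effect size: r = {r:.3f}, p = {p:.3f}")

    plt.scatter(data["effect_size"], data["n"])
    plt.axvline(data["effect_size"].mean(), linestyle="--")  # unweighted mean as a rough reference line
    plt.xlabel("Observed effect size")
    plt.ylabel("Sample size")
    plt.show()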
Discussion

The literature search identified 22 studies comprising 87 conditions using a semantic categorization task. The overall mean effect size for the random effects model was 0.79. Significant variance was present at the between-study and the between-condition, within-study levels. Indeed, in the regression models with one moderator, several variables were identified that significantly moderated the effect sizes for semantic categorization conditions. However, it also became clear that these regression models with one moderator were confounded, because severe overlap was present between certain variables. After eliminating this multicollinearity, analyses with multiple moderator variables showed that the combination of prime novelty and category size was able to explain almost half of the variance present in the effect sizes. Adding the d′ measure (thereby reducing the number of studies) further improved the model slightly. From this, we can infer that prime novelty, category size and, additionally, the d′ measure are the most prominent moderators of priming from semantic categorization tasks.

First, prime novelty proved to be a strong moderator of the effect sizes: primes that were not presented as targets (i.e. novel primes) elicited significantly smaller priming effects compared to primes that were presented as targets (i.e. repeated primes). This observation supports the claim that non-semantic S-R links are formed when repeated primes are used, which enhances priming effects (e.g. Damian, 2001). However, it also becomes clear that even though priming for novel primes is diminished, it is still significant across the literature. Second, category size also moderated the observed effect sizes: priming effects were significantly smaller for large categories compared to small categories. This is in line with the research of Forster and colleagues (Forster, 2004; Forster, Mohan & Hector, 2003). However, whereas Forster's research did not obtain any priming for stimuli from large categories, this meta-analysis across the literature shows that, in general, subliminal stimuli from large categories are able to trigger significant priming effects. Finally, the d′ measure of prime visibility moderated the effect sizes for those studies that reported this objective visibility test (k = 58): effect sizes were smaller for studies reporting a lower d′ value, indicating less prime awareness. Intuitively, this makes sense: when the primes become more visible, they are able to exert a stronger influence, increasing the observed priming effects. However, our results also show that even when prime visibility was zero, a significant effect size emerged, indicating that, across the literature, significant priming effects are still observed without any prime awareness. Publication bias and the dependency between parts of the data did not prove to have a strong effect on the results.

The observation that significant priming was observed even under circumstances where the priming effects were reduced (for novel primes, large categories and zero prime visibility) increases the likelihood of primes being processed up to a semantic level under these circumstances. However, Kunde, Kiesel and Hoffmann (2003) proposed an alternative, non-semantic hypothesis which could also explain why novel subliminal primes can elicit priming effects. According to this theory, when subjects receive a semantic categorization task (e.g. categorize numbers as smaller or larger than 5), they will consciously prepare themselves for that task by forming action triggers for the stimuli they might receive during the experiment (e.g. numbers between 1 and 9).
According to Kunde et al., this can also be managed for novel primes: subjects will also prepare themselves for stimuli they might not consciously receive in the experiment (e.g. even if only 1, 4, 6 and 9 are presented to the subjects as targets, it is likely that the adjacent unseen numbers 2, 3, 7 and 8 are also included in the formed action-trigger set). These prepared action triggers will then enable them later on in the task to quickly and automatically associate each stimulus with the appropriate response. Thus, when novel primes were included in the initial action trigger set, they can automatically trigger the adequate response and evoke priming effects, without the need to be semantically processed. However, the formation of an action trigger set intuitively seems limited by the number of stimuli and the size of the stimulus category. It seems unlikely that subjects would be able to form action triggers for all possible members of a large category such as animals or objects. Thus, this theory has some difficulty explaining why we observe significant priming across the literature for large categories and large target sets. However, in order to completely eliminate this alternative, non-semantic explanation of subliminal priming, we can look at naming and lexical decision tasks: when subjects are instructed to name targets or classify them as words or non-words, the same action triggers are formed for related and unrelated primes, since they always evoke the same response.

Lexical decision and Naming

Results

Descriptive statistics

The literature search identified 24 studies that met all inclusion criteria and used a lexical decision or a naming task in at least one of their conditions. In total, 48 conditions using a lexical decision or a naming task could be extracted from these 24 studies. All conditions studied either an adult or a student sample, with a mean sample size of 37. The majority of the conditions used a lexical decision task. Primes and targets were almost always words. Primes were presented on average for 51 ms with an average SOA of 91 ms. Almost all conditions used novel primes. On average the subjects received 117 targets. Almost all conditions presented stimuli from a large category. Most of the conditions used backward and forward masking of the prime and more than half of the conditions assessed prime visibility. Only four conditions calculated a d′ measure to study prime perceptibility. Table 1 provides detailed descriptive statistics for the 48 lexical decision and naming conditions across the various potential moderators.

Overall mean effect size and effect size heterogeneity

Table 3 provides the observed effect sizes and corresponding sampling errors for the 48 lexical decision and naming conditions. Across all conditions, the mean effect size for the random effects model was 0.46 (k = 48, 95% CI = 0.34 – 0.58). Figure 4 graphically displays all observed effect sizes ordered by size and the overall mean effect size with their 95% confidence intervals. A likelihood ratio test comparing the model with the between-study variance with a model without the between-study variance component showed that significant variance was present at the between-study level (Var = 0.05, χ²(1) = 6.4, p = .005). Thus, the presence of moderators of the effect sizes is likely. Although the variance at the between-study level is significant, the amount of variance (0.05) seems rather small. However, note that we assumed that the study effect sizes are normally distributed.
If we look at the range of the middle 95% of this distribution of study effect sizes based on this variance (0.46 ± 1.96 × √0.05 = [0.02; 0.90]), this range is quite substantial. Within studies we found no significant differences between conditions (Var = 0.01, χ²(1) = 1.6, p = .10).

To be able to assess how much of the variance was situated at each of the three levels in the meta-analysis, we used the median sampling variance to calculate the total amount of variance. The median amount of sampling variance accounted for 44% of the total variance (median Var = 0.05, mean Var = 0.06, range = 0.01 – 0.13). Thus, the total variance for the median study was 0.11 (0.05 + 0.01 + 0.05). Based on this, we could then calculate the percentage of variance situated at the between-study level (44%) and at the within-study, between-condition level (12%). According to the Hunter and Schmidt 75% rule, we can again conclude that the effect sizes are heterogeneous, which suggests that there may be moderators that can account for the variability in effect sizes.

Figure 4. Observed effect sizes for the lexical decision and naming conditions ordered by size of the effect size and the overall mean effect size (in bold) with their 95% confidence intervals.

Regression models with one moderator variable

We first examined random effect models where each putative moderator as described in Table 1 was entered separately. We note that potential moderators could only be entered in the model if sufficient observations were present for the continuous moderators, or if each category of the dichotomous moderators was represented by sufficient observations to be able to draw valid conclusions. Four moderators could therefore not be examined because of severe restrictions in range. Because only two conditions used symbol primes, only two conditions used repeated primes, only two conditions used a small category and only four conditions calculated a d′ measure, these four moderators (nature of the primes, prime novelty, category size and d′) were not included as potential moderators. Table 6 provides a detailed overview of the regression models for the 10 individual moderators for the lexical decision and naming conditions. We first provide a detailed description of the significant moderators and afterwards list the non-significant moderators.

Population. The nature of the population studied in the lexical decision and naming conditions proved to be a significant moderator of the observed effect sizes. The average effect size for conditions using a student sample was significantly lower than the average effect size for adults (respectively 0.39 and 0.78, t(36.6) = 2.60, p = .01). However, significance tests of the regression coefficients showed that the average effect sizes for students and adults were both significantly different from zero. The moderator "population" was able to explain 15% of the variance.

Target set size. Target set size emerged as a strong moderator of the observed effect sizes, with a larger target set (i.e. more targets presented to the subjects) indicating a stronger effect (β = 0.002, F(1, 24.3) = 25.24, p < .0001). The moderator "target set size" was able to explain 43% of the variance. To assess whether a significant effect size was still present when very small target set sizes are used, we tested whether the intercept, where the target set size (theoretically) equals zero, was still significant.
This was indeed the case (β = 0.26, t(15.3) = 4.27, p = .0006), indicating that significant priming was observed even when the target set size was minimal.

Visibility measured. Whether the conditions included a test to assess prime visibility was also a significant moderator of the observed effect sizes. The average effect size for conditions that did not include a visibility test was significantly smaller than the average effect size for conditions that did conduct a visibility test (respectively 0.32 and 0.59, t(18.9) = 2.41, p = .03). Significance tests of the regression coefficients again showed that the average effect sizes for conditions that used a visibility test and for conditions that did not were both significantly different from zero. Whether visibility was measured was able to explain 12% of the variance.

The moderators sample size, task, nature of targets, prime duration, SOA, masking and the nature of the visibility measure did not have a significant effect on the observed effect sizes. More details are provided in Table 6.

Table 6
Random effect regression models with one moderator for lexical decision and naming conditions

Moderator (category)       k    β           95% CI            F (df)            p        Model R²
1 N                        48   -0.003      -0.006, 0.0002    3.23 (1, 22.2)    .09      0.00
2 Population               48                                 6.78 (1, 36.6)    .01      0.15
   Students                38   0.39****    0.28, 0.51
   Adults                  10   0.78****    0.51, 1.04
3 Task                     48                                 0.22 (1, 32.6)    .64      0.00
   Lexical decision        39   0.47****    0.34, 0.60
   Naming                  9    0.41***     0.19, 0.63
4 Targets                  48                                 0.00 (1, 18.9)    .99      0.00
   Symbols                 9    0.46**      0.19, 0.73
   Words                   39   0.46****    0.32, 0.60
5 Set size                 40   0.002****   0.001, 0.003      25.24 (1, 24.3)   <.0001   0.43
6 Prime duration           48   0.0009      -0.005, 0.007     0.10 (1, 34.7)    .75      0.00
7 SOA                      38   -0.0002     -0.001, 0.001     0.08 (1, 14.3)    .79      0.00
8 Masking                  48                                 1.26 (3, 21.7)    .31      0.01
   BM and FM               26   0.50****    0.34, 0.67
   BM                      5    0.18        -0.13, 0.49
   FM                      13   0.53***     0.30, 0.76
   None                    4    0.47**      0.16, 0.77
9 Visibility measured      48                                 5.82 (1, 18.9)    .03      0.12
   No                      21   0.32***     0.16, 0.47
   Yes                     27   0.59****    0.43, 0.74
10 Obj/Subj                27                                 0.54 (2, 17.5)    .59      0.00
   Objective               19   0.63****    0.44, 0.83
   Subjective              6    0.49**      0.24, 0.74
   Both                    2    0.59*       0.17, 1.01

Note: k = number of effect sizes in the category. β = regression coefficients. The regression coefficients for the categorical variables can be interpreted as the mean effect sizes for each category. Model R² refers to the proportion of the explained total variance across the 3 levels, calculated using the median sampling variance. **** p < .0001. *** p < .001. ** p < .01. * p < .05.

Regression models with two-way interactions

Since three of the four most interesting moderators (nature of primes, prime novelty and category size) could not be included in the lexical decision and naming meta-analysis due to severe restrictions in range, the two-way interactions between the moderators that proved to be significant in the regression models with one moderator (population, target set size and whether the visibility was measured) were examined. None of the two-way interactions between these variables were significant (p-values ranging from .60 to .74). Because various moderator categories contained only a few observations and because interpretability would become increasingly complicated, no higher-order interactions were investigated.

Multiple regression models

To get a first overview of the relations between the moderators, Pearson correlations between the moderators (excluding nature of the primes, prime novelty, category size and d′ for the reasons mentioned above) were computed.
As in the semantic categorization meta-analysis, the categorical variables were reduced to dichotomous variables by eliminating the level(s) with the least observations. More specifically, for the variable nature of the visibility measure we eliminated the level "Both". For the variable masking we eliminated the levels "Backward Masking" and "None". As a result, 38 of the 48 conditions were retained for these correlational analyses. The correlations and VIF values are shown in Table 7. The VIF values exceeded the cut-off for the moderators nature of the targets (VIF = 10.52), target set size (VIF = 6.85), masking (VIF = 7.74) and whether visibility was measured (VIF = 6.49). Because masking is theoretically the least interesting variable in the present meta-analyses, we decided to omit it from the multiple regression models. By eliminating masking, all remaining VIF values fell well below the cut-off (VIF range 1.45 – 3.67). The observed overlap between certain variables again indicates that we should be careful when interpreting the results from the regression models with one moderator variable, because confounding was present.

Since no two-way interactions proved to be significant, and to enhance interpretability, only main effects of the moderators were included in the multiple models. In a first step we examined a multiple regression model including the three moderators that were significant in the regression models with one moderator (population, target set size and whether the visibility was measured) and the nature of the targets. Only target set size reached statistical significance in this model (F(1, 16) = 13.95, p = .002). Population, whether the visibility was measured and the nature of the targets did not reach significance (respectively F(1, 22.5) = 0.13, p = .72, F(1, 9.51) = 0.34, p = .57 and F(1, 10.4) = 2.60, p = .14). This model (k = 40) had an R² of .44. In a second step all models with two or three of these four moderators were examined. The models' R² ranged from .11 to .48. The optimal model in terms of R² where all included moderators were significant was the model containing only the moderator target set size (F(1, 24.3) = 25.24, p < .0001). This model (k = 40) had an R² of .43.

We also used a backward elimination strategy, starting from a full model containing all moderators and then eliminating non-significant moderators from the model step by step based on the size of their p-value. Because the moderator "nature of the visibility measure" could only be computed for a reduced number of effect sizes (k = 27), it was not included in the full model. The R² for the full model (k = 38) was .41, and in this full model only the moderator target set size remained significant (F(1, 10.6) = 9.44, p = .01). The backward strategy eliminated the following sequence of non-significant moderators: SOA, task, population, sample size, whether the visibility was measured, nature of the targets and prime duration. So finally we are again left with the optimal model containing only target set size. When a different starting point was chosen in the backward strategy, the same optimal model was always reached.
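The backward elimination procedure used here is easy to express in code. The sketch below (Python) uses an ordinary weighted least-squares fit as a simplified stand-in for the multilevel model and assumes numerically coded moderators, so it only illustrates the selection logic, not the actual SAS analyses:

    import pandas as pd
    import statsmodels.formula.api as smf

    def backward_eliminate(data, outcome, moderators, weights, alpha=0.05):
        """Drop the moderator with the largest p-value until all remaining ones are significant."""
        remaining = list(moderators)
        while remaining:
            formula = f"{outcome} ~ " + " + ".join(remaining)
            fit = smf.wls(formula, data=data, weights=weights).fit()
            pvals = fit.pvalues.drop("Intercept")
            worst = pvals.idxmax()
            if pvals[worst] <= alpha:
                return remaining, fit
            remaining.remove(worst)          # eliminate the least significant moderator
        return remaining, None

    # Hypothetical usage: df has one row per condition with numerically coded moderators
    # selected, model = backward_eliminate(df, "effect_size",
    #                                      ["population", "set_size", "visibility_measured"],
    #                                      weights=1 / df["sampling_variance"])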
Table 7
Pearson correlations, odds ratios and variance inflation factors (VIF) among the moderators for the lexical decision and naming conditions

1 N: Population -.41*; Task .002; Targets -.007; Set size -.20; Prime duration .08; SOA .04; Masking .11; Visibility measured -.46**; Obj/Subj .52*; Effect size -.12. VIF = 1.84
2 Population: Task /; Targets /; Set size .66***; Prime duration -.40*; SOA .21; Masking 0.47; Visibility measured 8.57; Obj/Subj /; Effect size .41*. VIF = 3.37
3 Task: Targets 0.005***; Set size -.52**; Prime duration .13; SOA -.09; Masking 0.21; Visibility measured 0.35; Obj/Subj 17.00*; Effect size -.17. VIF = 3.52
4 Targets: Set size .50**; Prime duration -.20; SOA .04; Masking /; Visibility measured 2.88; Obj/Subj 0.06*; Effect size .09. VIF = 10.52
5 Set size: Prime duration -.48**; SOA .28; Masking -.20; Visibility measured .64***; Obj/Subj -.41; Effect size .66***. VIF = 6.85
6 Prime duration: SOA -.21; Masking .16; Visibility measured -.54***; Obj/Subj .16; Effect size -.23. VIF = 3.28
7 SOA: Masking -.49**; Visibility measured -.07; Obj/Subj -.31; Effect size .10. VIF = 1.66
8 Masking: Visibility measured 0.78; Obj/Subj 2.60; Effect size -.007. VIF = 7.74
9 Visibility measured: Obj/Subj /; Effect size .37*. VIF = 6.49
10 Obj/Subj: Effect size -.04. VIF not computable

Note. To allow the calculation of the VIF, all categorical moderators were reduced to dichotomous variables by eliminating the level with the least observations. Values for pairs of dichotomous moderators are odds ratios (printed in bold in the original table); all other values are Pearson correlations. Some odds ratios could not be calculated (marked with /) when one of the cells of the cross-table contained no observations. Because all studies reporting an objective or subjective visibility test included a visibility measure, the VIF for the nature of the visibility measure could not be computed. *** p < .001. ** p < .01. * p < .05.
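Because the VIF plays a central role in these multicollinearity checks, a compact way to compute it is sketched below (Python; for standardized predictors the VIFs are the diagonal elements of the inverse of the predictor correlation matrix, and the moderator matrix used here is a hypothetical stand-in for the dichotomized moderators):

    import numpy as np

    def variance_inflation_factors(X):
        """VIF for each column of a predictor matrix X (n observations x p moderators)."""
        corr = np.corrcoef(X, rowvar=False)      # p x p correlation matrix of the moderators
        return np.diag(np.linalg.inv(corr))      # VIF_j = j-th diagonal element of its inverse

    # Hypothetical usage with a small, partly collinear moderator matrix
    rng = np.random.default_rng(0)
    X = rng.normal(size=(38, 4))
    X[:, 3] = X[:, 0] + 0.1 * rng.normal(size=38)   # build in some collinearity
    print(variance_inflation_factors(X).round(2))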
Publication bias and dependency

Again, a funnel graph was constructed, which is depicted in Figure 5. Visual inspection of this graph suggests that the funnel is rather symmetrical around the mean effect size (0.46) and does not appear to be heavily skewed. This is confirmed by the non-significant correlation between sample size and effect size (r = -.13, p = .37), suggesting that no strong publication bias is present. Additional arguments showing that publication bias did not play a crucial role in the present meta-analysis are (1) the fact that sample size did not prove to be a significant moderator (p = .09), which indicates that smaller samples did not report larger effect sizes, and (2) that whether the study was published or unpublished also did not significantly moderate the observed effect sizes: the average effect size for published conditions did not significantly differ from the average effect size for unpublished conditions (t(17.7) = 1.52, p = .15).

Sensitivity analyses were again used to study the impact of the dependency of part of the conditions included in the meta-analysis. Ten of the 48 conditions (20.8%, indicated in Table 3) used a sample already used in a previous condition. Using two random selection procedures, each sample was restricted to be included only once in the meta-analyses. Using these new samples (k = 38), all analyses were redone. The overall mean effect sizes for the random effects models were now 0.46 for both the first and the second random selection. The regression models with one moderator showed an identical pattern of results with the same moderators reaching significance: population, F(1, 36) = 5.91, p = .02 for the first random selection and F(1, 36) = 3.87, p = .06 for the second random selection; target set size, F(1, 31) = 17.73, p = .0002 for the first random selection and F(1, 31) = 14.35, p = .0007 for the second random selection. The moderator whether visibility was measured was no longer significant in the first random selection (F(1, 14.5) = 3.10, p = .10), but did approach significance in the second random selection (F(1, 14.8) = 3.87, p = .07). As before, none of the two-way interactions were significant and the optimal model contained only target set size. This indicates that the influence of the dependency between the effect sizes is limited: when the dependency is eliminated, a highly similar pattern of results emerges.

Figure 5. Funnel plot of the lexical decision and naming conditions (sample size plotted against the observed effect sizes).

Discussion

The literature search identified 24 studies comprising 48 conditions using either a lexical decision or a naming task. The overall mean effect size for the random effects model was 0.46. Significant variance was present at the between-study level. Indeed, in the regression models with one moderator, several variables were identified that significantly moderated the effect sizes for lexical decision and naming studies. However, it also became clear that these regression models with one moderator were confounded, because severe overlap was present between certain variables. After eliminating this multicollinearity, multiple regression analyses showed that a model containing only target set size was the optimal model, able to explain almost half of the variance present in the effect sizes. From this, we can conclude that target set size is the most prominent moderator of priming from lexical decision and naming tasks: the effect sizes significantly decreased for studies using a smaller target set size. This concurs with the findings of Abrams (2008) and Kiesel et al. (2006), who found that priming effects for novel primes from a large category could only be observed for large target sets, but not for small target sets.

The decreased priming for small target sets could be due to the fact that when a small number of targets is used, each target will have at least one unique feature distinguishing it from the other targets. Subjects very quickly realize that paying attention to these specific subword elements of the targets (e.g. first letter, last letter, adjacent vowels) is sufficient to respond to them correctly, without the need to semantically process the targets. This decreases the chance that the prime will influence the categorization of the target and thus diminishes the observed priming effects. For large target sets, however, subjects can no longer rely on subword elements of the targets and are forced to semantically process them, which could explain the stronger priming effects for large target sets. An alternative explanation for the moderating effect of target set size might be that for small target sets subjects are able to store each target in short-term memory, so that they can recall the adequate response to the targets from memory without the need to semantically process them, leading to decreased priming. However, since only a limited number of targets can be stored in short-term memory, for large target sets subjects are forced to process each target anew because the targets cannot be recalled from memory, which leads to a stronger influence of the primes on the targets, and thus to increased priming (Roelofs, 2001). Still, our results also show that even when the target set size approaches zero, a significant effect size emerged, indicating that, across the literature, significant priming effects are still observed with small target set sizes.

We note that four key variables, namely the nature of the primes, prime novelty, category size and the d′ measure, could not be included in the meta-analysis because (certain levels of) these variables were underrepresented in the literature.
Therefore, caution is warranted in interpreting the results of this meta-analysis of lexical decision and naming conditions, since the influence of several potentially important variables could not be tested. Publication bias and the dependency between parts of the data did not prove to have a strong effect on the results.

General Discussion

The aim of the present meta-analyses was twofold: first, we wanted to assess the (magnitude of the) overall effect of subliminal semantic priming in the literature. By doing this separately for semantic categorization conditions and for lexical decision and naming conditions, we wanted to establish whether the subliminal priming literature to date supports the claim of semantic processing of subliminal information, especially when this semantic processing is unbiased by response congruency, as is the case in lexical decision and naming tasks. Second, we aimed to assess the influence of potential moderators on semantic priming effects for the different tasks.

Our meta-analyses show that across the research conducted between 1998 and 2006 significant positive priming has been reported for both semantic categorization studies and lexical decision and naming studies (as indicated by strong positive overall effect sizes). Priming effects for semantic categorization conditions proved to be moderated primarily by prime novelty and category size. Priming effects for lexical decision and naming conditions proved to be moderated primarily by target set size. However, and most importantly, our meta-analyses indicate that across the literature significant priming emerges under various circumstances. The fact that (a) across the literature significant priming is observed in the context of lexical decision and naming, unconfounded by response effects, and (b) across the literature significant priming in semantic categorization is observed even under those circumstances where a non-semantic interpretation is difficult to sustain (novel primes, large stimulus categories), strongly supports the hypothesis that semantic information can be extracted from subliminal information and influence subsequent behavior. This observation is in line with the claims of Dehaene et al. (1998).

However, our results also indicate that the observed priming for semantic categorization studies, where priming is biased by stimulus-response effects, is larger than the priming obtained in lexical decision and naming studies. Furthermore, within the semantic categorization studies, the stronger priming effects for repeated primes and small categories provide evidence that non-semantic S-R mappings can easily be formed under these circumstances. These observations indicate that S-R mappings enhance the priming effects, which is in line with the claims of Damian (2001) and Abrams and Greenwald (2000). To summarize, these meta-analyses imply that under certain circumstances (semantic categorization with repeated primes and/or small categories) S-R effects cause at least a significant part of the observed priming effects. However, even when the influence of such S-R mappings is minimized or eliminated (e.g. in lexical decision and naming tasks or in semantic categorization tasks with novel primes and/or large categories), significant semantic priming is observed.
These findings reconcile the two reigning theories regarding subliminal semantic priming: when given a chance, automatic stimulus-response effects will clearly boost the obtained priming effects (Abrams & Greenwald, 2000; Damian, 2001). However, when S-R influences are minimized or avoided, subliminal information can be genuinely semantically processed (e.g. Klauer et al., 2007; Van den Bussche & Reynvoet, 2007). This latter finding is in line with a body of evidence from recent brain imaging studies, which show that cerebral regions associated with a semantic level of representation are activated by subliminal primes. For example, Kiefer and Brendel (2006) used a lexical decision task in which subjects decided whether targets, preceded by novel primes, were words or pseudowords. They reported that the N400 (an ERP component thought to reflect lexical or semantic integration of information) was modulated by subliminal semantic priming, even when prime awareness was controlled carefully. Devlin, Jamison, Matthews and Gonnerman (2004) showed that masked prime-target pairs which shared meaning (e.g. idea - notion) reduced neural activity in the left middle temporal gyrus, a region thought to be involved in the semantic processing of words and objects.

In addition, the discrepancy between the enhanced priming effects under circumstances that stimulate the possibility of forming non-semantic S-R mappings for the primes and the decreased, although still significant, priming effects under circumstances where the development of such S-R links is unlikely, is something researchers should keep in mind when designing future subliminal priming studies. For example, when one aims to investigate the semantic processing of subliminal information, a design should be chosen that excludes the influence of S-R effects as much as possible, since they will confound the obtained effects. Furthermore, the two separate meta-analyses for semantic categorization on the one hand and lexical decision and naming on the other hand not only show that the overall effect size is smaller for lexical decision and naming studies, but also that a different set of variables moderates priming from semantic categorization tasks (primarily prime novelty and category size) and from lexical decision and naming tasks (primarily target set size). This again affirms our decision to strictly separate these two kinds of tasks. It could also suggest that differential mechanisms might lie at the basis of priming obtained with semantic categorization and priming obtained with lexical decision and naming: for semantic categorization, stronger priming is observed for repeated primes and small categories, which suggests that when S-R links can be formed, priming effects will be boosted. In contrast, for lexical decision and naming, stronger priming is observed for large target sets. This could be due to the fact that for large target sets subjects can no longer rely on unique subword features of the targets to process them, or to capacity limitations of working memory (or perhaps a combination of both), which forces subjects to always semantically process the targets, allowing the primes to exert a larger influence on the targets and consequently evoking larger priming effects. However, caution is warranted when directly comparing the two meta-analyses: in lexical decision and naming studies, researchers usually examine the same combinations of the moderators.
As a consequence, we were unable to enter some of our key potential moderators in the meta-analysis for lexical decision and naming due to severe restrictions in range. Therefore, it is difficult to compare this meta-analysis with the one for semantic categorization, where we were able to include all potential moderators. As mentioned above, the lexical decision and naming studies usually investigate the same combinations of the moderators, leading to gaps in this research field: almost all these studies used novel primes, word primes, stimuli from a large category and large target sets. Future research should address these missing combinations of the moderators. For example, it would be interesting to test for lexical decision and naming tasks, which are independent of response effects, whether using repeated primes and/or small categories would also boost the observed priming effects, as is the case for semantic categorization studies, which are influenced by S-R effects.

Finally, our meta-analyses clearly indicate the importance of including a visibility test in subliminal priming experiments, and preferably the inclusion of an objective d' measure to critically assess prime awareness, as indicated by the significant effect of the d' measure in the semantic categorization studies. Studying the impact of unconscious information on our behavior automatically assumes that we strive to guarantee the unconscious nature of that information. In order to do this, some assessment of prime awareness is indispensable. As only four lexical decision or naming studies included such a d' measure, the findings of the present meta-analyses hopefully urge researchers to include this objective visibility measure in the future.

Our present meta-analyses have a few important limitations.
1) Because of the gaps in the present behavioral research and the confounding between several moderators (e.g. almost two thirds of the studies using a large category also used novel primes), we were not able to compute higher order interactions between the moderators. Furthermore, we did not obtain significant two-way interactions between the moderators. Of course, it could be that these interactions simply do not exist. However, since recent literature has provided some evidence suggesting that interactions between the moderators are present (e.g. an interaction between category size and target set size, Kiesel et al., 2006), it also seems plausible that the lack of power due to the often severe restrictions in range for certain levels of the moderators caused the absence of interaction effects.
2) Publication bias and the partial dependency of the effect sizes did not prove to have a strong impact on the results of the meta-analyses. However, we can not completely rule out the possibility that they did slightly influence the results, which could for example have led to slightly biased effect size estimates.
3) One could argue that certain studies included in the meta-analyses did not efficiently control prime awareness (e.g. studies using longer prime durations or SOAs, studies that did not report a visibility test or reported only a subjective visibility test, studies that did not use a masking procedure). However, several of these variables never showed a significant impact on the observed effect sizes (masking, prime duration, SOA and the nature of the visibility test).
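For readers unfamiliar with the objective awareness measure discussed above, the sketch below shows the standard signal-detection computation of d' from a prime-visibility test; the trial counts are invented purely for illustration and do not come from any study in the meta-analyses.

    from scipy.stats import norm

    # Hypothetical prime-visibility test: subjects judge on each trial whether a prime was present.
    hits, misses = 52, 48                       # prime-present trials
    false_alarms, correct_rejections = 50, 50   # prime-absent trials

    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)

    # d' is the difference between the z-transformed hit and false-alarm rates;
    # values near zero indicate chance-level prime visibility.
    d_prime = norm.ppf(hit_rate) - norm.ppf(fa_rate)
    print(f"d' = {d_prime:.2f}")

In practice, extreme rates of 0 or 1 are usually corrected (e.g. with a log-linear adjustment) before the z-transformation, since they would otherwise yield infinite values.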
For the two variables that did have a significant influence on the effect sizes (whether visibility was measured and the d' measure), our findings indicate that even under the strictest conditions (when a visibility measure was included and when d' was equal to zero), significant priming was observed.

We can conclude from these meta-analyses that across the research conducted on subliminal priming between 1998 and 2006, a considerable amount of subliminal priming has emerged. Thus, can unconsciously presented information influence our behavior? Yes! Furthermore, our quantitative review confirms that subliminal information can be semantically processed and influence our subsequent behavior. Thus, can unconsciously presented information be processed up to a semantic level? Yes! However, our study also shows that when the experimental context creates the opportunity to establish non-semantic S-R mappings for the unconscious primes, the (non-semantic) processing of the primes will be enhanced and hence the priming effects will be additionally boosted.

So do our results also suggest that receiving subliminal messages such as "drink coca-cola" and "eat popcorn" or a McDonald's logo will unconsciously stimulate us to go out and buy these products? Difficult to say! Our results show that behavior can be influenced unconsciously in a controlled experimental laboratory context, where the immediate impact of the subliminal information on the subsequent behavior is measured. However, it is difficult to directly translate these results to everyday situations, where the delay between the subliminal message and its influence on behavior is much longer. Improving our understanding of the time course of subliminal activation, and assessing exactly how short-lived this activation is, will be essential in order to evaluate its potential impact on our behavior and decisions.

References

References marked with an asterisk indicate studies included in the meta-analysis.

Abrams, R.L. (2008). Influence of category size and target set size on unconscious priming by novel words. Experimental Psychology, 55, 189-194.
Abrams, R.L., & Greenwald, A.G. (2000). Parts outweigh the whole (word) in unconscious analysis of meaning. Psychological Science, 11, 118-124.
* Alameda, J.R., Cuetos, F., & Brysbaert, M. (2003). The number 747 is named faster after seeing Boeing than after seeing Levi's: Associative priming in the processing of multidigit Arabic numerals. The Quarterly Journal of Experimental Psychology, 56A, 1009-1019.
* Bajo, M.T., Puerta-Melguizo, M.C., & Macizo, P. (2003). The locus of semantic interference in picture naming. Psicológica, 24, 31-55.
Balota, D.A. (1983). Automatic semantic activation and episodic memory encoding. Journal of Verbal Learning and Verbal Behavior, 22, 88-104.
Begg, C.B. (1994). Publication bias. In H. Cooper & L.V. Hedges (Eds.), The handbook of research synthesis (pp. 399-410). New York: Russell Sage Foundation.
* Bodner, G.E., & Dypvik, A.T. (2005). Masked priming of number judgments depends on prime validity and task. Memory & Cognition, 33, 29-47.
* Bodner, G.E., & Masson, M.E.J. (2003). Beyond spreading activation: An influence of relatedness proportion on masked semantic priming. Psychonomic Bulletin & Review, 10, 645-652.
* Bourassa, D.C., & Besner, D. (1998). When do nonwords activate semantics? Implications for models of visual word recognition. Memory & Cognition, 26, 61-74.
* Brown, M., & Besner, D. (2002). Semantic priming: On the role of awareness in visual word recognition in the absence of an expectancy. Consciousness and Cognition, 11, 402-422.
* Bueno, S., & Frenck-Mestre, C. (2002). Rapid activation of the lexicon: A further investigation with behavioral and computational results. Brain and Language, 81, 120-130.
Cheesman, J., & Merikle, P.M. (1986). Distinguishing conscious from unconscious perceptual processes. Canadian Journal of Psychology, 40, 343-367.
* Damian, M.F. (2001). Congruity effects evoked by subliminally presented primes: Automaticity rather than semantic processing. Journal of Experimental Psychology: Human Perception and Performance, 27, 154-165.
* Dehaene, S., Naccache, L., Le Clec'H, G., Koechlin, E., Mueller, M., Dehaene-Lambertz, G., van de Moortele, P.F., & Le Bihan, D. (1998). Imaging unconscious semantic priming. Nature, 395, 597-600.
* Dell'Acqua, R., & Grainger, J. (1999). Unconscious semantic priming from pictures. Cognition, 73, 1-15.
* Devlin, J.T., Jamison, H.L., Gonnerman, L.M., & Matthews, P.M. (2006). The role of the posterior fusiform gyrus in reading. Journal of Cognitive Neuroscience, 18, 911-922.
* Devlin, J.T., Jamison, H.L., Matthews, P.M., & Gonnerman, L.M. (2004). Morphology and the internal structure of words. Proceedings of the National Academy of Sciences, 101, 14984-14988.
Draine, S., & Greenwald, A. (1998). Replicable unconscious semantic priming. Journal of Experimental Psychology: General, 127, 286-303.
* Finkbeiner, M., & Caramazza, A. (2006). Now you see it, now you don't: On turning semantic interference into facilitation in a Stroop-like task. Cortex, 42, 790-796.
* Forster, K.I. (2004). Category size effects revisited: Frequency and masked priming effects in semantic categorization. Brain and Language, 90, 276-286.
Forster, K.I., & Davis, C. (1984). Repetition priming and frequency attenuation in lexical access. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 680-698.
* Forster, K.I., & Hector, J. (2005). [Semantic priming as a function of SOA]. Unpublished raw data.
* Forster, K.I., Mohan, K., & Hector, J. (2003). The mechanics of masked priming. In S. Kinoshita & S.J. Lupker (Eds.), Masked priming: The state of the art (pp. 3-37). New York: Psychology Press.
Fowler, C., Wolford, G., Slade, R., & Tassinary, L. (1981). Lexical access with and without awareness. Journal of Experimental Psychology: General, 110, 341-362.
Gibbons, R.D., Hedeker, D.R., & Davis, J.M. (1993). Estimation of effect size from a series of experiments involving paired comparisons. Journal of Educational Statistics, 18, 271-279.
Glaser, W.R. (1992). Picture naming. Cognition, 42, 61-105.
Grainger, J., & Frenck-Mestre, C. (1998). Masked priming by translation equivalents in bilinguals. Language and Cognitive Processes, 13, 601-623.
Greenhouse, J.B., & Iyengar, S. (1994). Sensitivity analysis and diagnostics. In H. Cooper & L.V. Hedges (Eds.), The handbook of research synthesis (pp. 383-398). New York: Russell Sage Foundation.
Greenwald, A.G., Abrams, R.L., Naccache, L., & Dehaene, S. (2003). Long-term semantic memory versus contextual memory in unconscious number processing. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 235-247.
Greenwald, A.G., Draine, S.C., & Abrams, R.L. (1996). Three cognitive markers of unconscious semantic activation. Science, 273, 1699-1702.
Greenwald, A.G., Klinger, M.R., & Liu, T.J. (1989). Unconscious processing of dichoptically masked words. Memory & Cognition, 17, 35-47.
* Grossi, G. (2006). Relatedness proportion effects on masked associative priming: An ERP study. Psychophysiology, 43, 21-30.
Hedges, L.V. (1981). Distribution theory for Glass's estimator of effect size and related estimators. Journal of Educational Statistics, 6, 107-128.
Hedges, L.V. (1982). Estimation of effect size from a series of independent experiments. Psychological Bulletin, 92, 490-499.
Hedges, L.V., & Olkin, I. (1985). Statistical methods for meta-analysis. Orlando, FL: Academic Press.
Holcomb, P.J., Reder, L., Misra, M., & Grainger, J. (2005). The effects of prime visibility on ERP measures of masked priming. Cognitive Brain Research, 24, 155-172.
Holender, D. (1986). Semantic activation without conscious identification in dichotic listening, parafoveal vision, and visual masking: a survey and appraisal. Behavioral and Brain Sciences, 9, 1-23.
Hunter, J.E., & Schmidt, F.L. (1990). Methods of meta-analysis: Correcting error and bias in research findings. Newbury Park, CA: Sage.
* Kamphuis, J.H., De Ruiter, C., Janssen, B., & Spiering, M. (2005). Preliminary evidence for an automatic link between sex and power among men who molest children. Journal of Interpersonal Violence, 20, 1351-1365.
* Kiefer, M. (2002). The N400 is modulated by unconsciously perceived masked words: Further evidence for an automatic spreading activation account of N400 priming effects. Cognitive Brain Research, 13, 27-39.
* Kiefer, M., & Brendel, D. (2006). Attentional modulation of unconscious "automatic" processes: Evidence from event-related potentials in a masked priming paradigm. Journal of Cognitive Neuroscience, 18, 184-198.
* Kiefer, M., & Spitzer, M. (2000). Time course of conscious and unconscious semantic brain activation. NeuroReport, 11, 2401-2407.
* Kiesel, A., Kunde, W., Pohl, C., & Hoffmann, J. (2006). Priming from novel masked stimuli depends on target set size. Advances in Cognitive Psychology, 2, 37-45.
* Kinoshita, S., Mozer, M.C., & Forster, K.I. (2006). Modulation of congruence effects with masked primes in the parity decision task. Manuscript submitted for publication.
Klauer, C.K., Eder, A.B., Greenwald, A.G., & Abrams, R.L. (2007). Priming of semantic classifications by novel subliminal prime words. Consciousness and Cognition, 16, 63-83.
* Koechlin, E., Naccache, L., Block, E., & Dehaene, S. (1999). Primed numbers: Exploring the modularity of numerical representations with masked and unmasked semantic priming. Journal of Experimental Psychology: Human Perception and Performance, 25, 1882-1905.
* Kolanczyk, A., & Pawlowska-Fusiara, M. (2002). Automatic correction or controlled processing of affective priming. Polish Psychological Bulletin, 33, 37-46.
Kouider, S., & Dehaene, S. (2007). Levels of processing during non-conscious perception: A critical review of visual masking. Philosophical Transactions of the Royal Society B, 362, 857-875.
* Kunde, W., Kiesel, A., & Hoffmann, J. (2003). Conscious control over the content of unconscious cognition. Cognition, 88, 223-242.
* Kunde, W., Kiesel, A., & Hoffmann, J. (2005). On the masking and disclosure of unconscious elaborate processing. A reply to Van Opstal, Reynvoet, and Verguts (2005). Cognition, 97, 99-105.
Laver, G.D., & Burke, D.M. (1993). Why do semantic priming effects increase in old age? A meta-analysis. Psychology and Aging, 8, 34-43.
Light, R.J., & Pillemer, D.B. (1984). Summing up: The science of reviewing research. Cambridge, MA: Harvard University Press.
Lipsey, M.W., & Wilson, D.B. (2001). Practical meta-analysis. Newbury Park, CA: Sage.
Littell, R.C., Milliken, G.A., Stroup, W.W., Wolfinger, R.D., & Schabenberger, O. (2006). SAS® system for mixed models (2nd ed.). Cary, NC: SAS Institute Inc.
Lucas, M. (2000). Semantic priming without association: A meta-analytic review. Psychonomic Bulletin & Review, 7, 618-630.
Marcel, A.J. (1983). Conscious and unconscious perception: Experiments on visual masking and word recognition. Cognitive Psychology, 15, 197-237.
* McBride, K.A., Kripps, J.H., & Forster, K.I. (2003). [Phonological and semantic priming in deaf and hearing subjects]. Unpublished raw data.
Merikle, P.M., & Daneman, M. (1998). Psychological investigations of unconscious perception. Journal of Consciousness Studies, 5, 5-18.
Merikle, P.M., & Reingold, E.M. (1992). Measuring unconscious perceptual processes. In R.F. Bornstein & T.S. Pittman (Eds.), Perception without awareness (pp. 55-80). New York: Guilford Press.
Morris, S.B., & DeShon, R.P. (2002). Combining effect size estimates in meta-analysis with repeated measures and independent-groups designs. Psychological Methods, 7, 105-125.
* Naccache, L., & Dehaene, S. (2001). Unconscious semantic priming extends to novel unseen stimuli. Cognition, 80, 223-237.
O'Brien, R.M. (2007). A caution regarding rules of thumb for variance inflation factors. Quality & Quantity, 41, 673-690.
* Perea, M., & Lupker, S.J. (2003). Does judge activate COURT? Transposed-letter similarity effects in masked associative priming. Memory & Cognition, 31, 829-841.
* Perea, M., & Rosa, E. (2002a). Does the proportion of associatively related pairs modulate the associative priming effect at very brief stimulus-onset asynchronies? Acta Psychologica, 110, 103-124.
* Perea, M., & Rosa, E. (2002b). The effects of associative and semantic priming in the lexical decision task. Psychological Research, 66, 180-194.
Perfetti, C.A., & Tan, L.H. (1998). The time course of graphic, phonological, and semantic activation in visual Chinese character identification. Journal of Experimental Psychology: Learning, Memory and Cognition, 24, 101-118.
* Pohl, C., Kiesel, A., Kunde, W., & Hoffmann, J. (2004). [Impact of target size and concept category on the effect of masked primes]. Unpublished raw data.
* Pohl, C., Kiesel, A., Kunde, W., & Hoffmann, J. (2005). [Impact of target size and concept category on the effect of masked primes]. Unpublished raw data.
* Pohl, C., Kiesel, A., Kunde, W., & Hoffmann, J. (2006). Early and late selection in unconscious information processing. Manuscript submitted for publication.
Pratkanis, A.R. (1992). The cargo-cult science of subliminal persuasion. Skeptical Inquirer, 16, 260-272.
Raudenbush, S.W., & Bryk, A.S. (1985). Empirical Bayes meta-analysis. Journal of Educational Statistics, 10, 78-98.
Raudenbush, S.W., & Bryk, A.S. (2002). Hierarchical linear models (2nd ed.). London: Sage.
* Reynvoet, B., Caessens, B., & Brysbaert, M. (2002). Automatic stimulus-response associations may be semantically mediated. Psychonomic Bulletin & Review, 9, 107-112.
Reynvoet, B., Gevers, W., & Caessens, B. (2005). Unconscious primes activate motor codes through semantics. Journal of Experimental Psychology: Learning, Memory and Cognition, 31, 991-1000.
Roelofs, A. (2001). Set size and repetition matter: Comment on Caramazza and Costa (2000). Cognition, 80, 291-298.
Rosenthal, R. (1991). Meta-analytic procedures for social research. Newbury Park, CA: Sage.
* Rouibah, A., Tiberghien, G., & Lupker, S.J. (1999). Phonological and semantic priming: Evidence for task-independent effects. Memory & Cognition, 27, 422-437.
* Ruz, M., Madrid, E., Lupiáñez, J., & Tudela, P. (2003). High density ERP indices of conscious and unconscious semantic priming. Cognitive Brain Research, 17, 719-731.
* Sassenberg, K., & Moskowitz, G.B. (2005). Don't stereotype, think different! Overcoming automatic stereotype activation by mindset priming. Journal of Experimental Social Psychology, 41, 506-514.
* Sebba, D., & Forster, K.I. (1989). [Mediated priming with masked primes]. Unpublished raw data.
Sidis, B. (1898). The psychology of suggestion. New York, NY: Appleton.
Snijders, T.A.B., & Bosker, R.J. (1999). Multilevel analysis: An introduction to basic and advanced multilevel modeling. London: Sage.
Tan, L.H., Hoosain, R., & Peng, D. (1995). Role of early presemantic phonological code in Chinese character identification. Journal of Experimental Psychology: Learning, Memory and Cognition, 21, 43-54.
Tan, L.H., Hoosain, R., & Siok, W.K. (1996). Activation of phonological codes before access to character meaning in written Chinese. Journal of Experimental Psychology: Learning, Memory and Cognition, 22, 865-882.
Theios, J., & Amrhein, P.C. (1989). Theoretical analysis of the cognitive processing of lexical and pictorial stimuli: Reading, naming, and visual and conceptual comparisons. Psychological Review, 96, 5-24.
* Théoret, H., Halligan, E., Kobayashi, M., Merabet, L., & Pascual-Leone, A. (2004). Unconscious modulation of motor cortex excitability revealed with transcranial magnetic stimulation. Experimental Brain Research, 155, 261-264.
* Van den Bussche, E., & Reynvoet, B. (2006). [The influence of attention on masked priming effects]. Unpublished raw data.
Van den Bussche, E., & Reynvoet, B. (2007). Masked priming effects in semantic categorization are independent of category size. Experimental Psychology, 54, 225-235.
Van den Noortgate, W., & Onghena, P. (2003). Multilevel meta-analysis: A comparison with traditional meta-analytical procedures. Educational and Psychological Measurement, 63, 765-790.
* Van Opstal, F., Reynvoet, B., & Verguts, T. (2005a). Unconscious semantic categorization and mask interactions: An elaborate response to Kunde et al. (2005). Cognition, 97, 107-113.
* Van Opstal, F., Reynvoet, B., & Verguts, T. (2005b). How to trigger elaborate processing? A comment on Kunde, Kiesel, and Hoffmann (2003). Cognition, 97, 89-97.
Vorberg, D., Mattler, U., Heinecke, A., Schmidt, T., & Schwarzbach, J. (2004). Invariant time-course of priming with and without awareness. In C. Kaernbach, E. Schröger & H. Müller (Eds.), Psychophysics Beyond Sensation: Laws and Invariants of Human Cognition (pp. 273-290). Hillsdale, NJ: Lawrence Erlbaum Associates.
* Zhou, X., Marslen-Wilson, W., Taft, M., & Shu, H. (1999). Morphology, orthography, and phonology in reading Chinese compound words. Language and Cognitive Processes, 14, 525-565.
Ziegler, J.C., Ferrand, L., Jacobs, A.M., Rey, A., & Grainger, J. (2000). Visual and phonological codes in letter and word recognition: Evidence from incremental priming. The Quarterly Journal of Experimental Psychology, 53A, 671-692.