Relationship Between Dean's Letter Rankings and Later Evaluations by Residency Program Directors

Stephen J. Lurie, Department of Family Medicine, University of Rochester School of Medicine and Dentistry, Rochester, New York, USA
David R. Lambert, Department of Medicine, University of Rochester School of Medicine and Dentistry, Rochester, New York, USA
Tana A. Grady-Weliky, Department of Psychiatry, University of Rochester School of Medicine and Dentistry, Rochester, New York, USA

Background: It is not known how well dean's letter rankings predict later performance in residency. Purpose: To assess the accuracy of dean's letter rankings in predicting clinical performance during internship. Method: Participants were medical students who graduated from the University of Rochester School of Medicine and Dentistry in the classes of 2003 and 2004. In their dean's letter, each student was ranked as either "Outstanding" (upper quartile), "Excellent" (second quartile), "Very good" (lower two quartiles), or "Good" (lowest few percent of the class). We compared these dean's letter rankings against the results of questionnaires sent to program directors 9 months after graduation. Results: The response rate to the questionnaire was 58.9% (109 of 185 eligible graduates). There were no differences in response rate across the four dean's letter ranking categories. Program directors rated students in the top two categories of dean's letter rankings significantly higher than those in the very good group, and students in all three of these groups were rated significantly higher than those in the good group, F(3, 105) = 13.37, p < .001. Students in the very good group were the most variable in their ratings by program directors, with many receiving ratings as high as those of students in the upper two groups. There were no differences by gender or specialty. Conclusion: Dean's letter rankings are a significant predictor of later performance in internship among graduates of our medical school. Students in the bottom half of the class are the most likely either to underperform or to overperform in internship.

Copyright © 2007 Lawrence Erlbaum Associates, Inc.
Teaching and Learning in Medicine, 19(3), 251–256

This article was previously presented at the Northeast Group on Educational Affairs Spring Meeting, March 3, 2006, and the 12th Annual Ottawa International Conference on Medical Education, May 21, 2006. Correspondence may be sent to Stephen J. Lurie, University of Rochester School of Medicine and Dentistry, 601 Elmwood Avenue, Box 601, Rochester, NY 14642, USA.
E-mail: Stephen [email protected]

The dean's letter provides a summary of students' performance in medical school and remains a key piece of information that program directors use to assess candidates for residency. Partially in response to historical findings that such letters may be variable in their quality1–6 and accuracy,7 the Association of American Medical Colleges (AAMC) has stressed that the dean's letter should be a letter of accurate evaluation rather than of recommendation.8 To further emphasize this point, the AAMC has recently recommended changing the name of the "dean's letter" to the "Medical Student Performance Evaluation," which should contain comparative information about how the student performed relative to their classmates.8 In practice, program directors have been found to prefer a ranking system that goes beyond simple pass/fail descriptions.9 It also appears that program directors and dean's letter writers can agree about the relative ranking of students when it is based on accurate information.10 Similarly, a study from 1991 found that undergraduate grades were highly predictive of program directors' rankings.11 There have been no recent studies, however, of how well dean's letter rankings predict performance in residency.

It is also unclear whether program directors are able to interpret evaluative terms such as superior, outstanding, excellent, and very good without an explicit key that describes the relationship between such terms and a quantitative assessment of students' ranking relative to their classmates.4 Similarly, it is not known how often program directors may disagree with these rankings, and if so, whether they tend to overestimate or underestimate students' anticipated strengths and/or weaknesses in internship.

We studied these questions in two recent graduating classes from the University of Rochester School of Medicine and Dentistry, for whom we obtained survey data from program directors regarding the graduates' performance during internship. We assessed agreement between the dean's letter ranking and program directors' global assessments. We also examined whether there were any systematic differences by specialty. Finally, we examined the dean's letter rankings of students for whom the program directors felt that the dean's letter had either underestimated or overestimated their abilities.

Method

Participants

Participants were 184 medical students who graduated from the University of Rochester School of Medicine and Dentistry in the classes of 2003 and 2004.

Measures

The final sentence of the dean's letter described each student as either an "outstanding" (upper quartile), "excellent" (second quartile), "very good" (lower two quartiles), or "good" (a small number of students at the bottom of the lowest quartile) candidate for residency training. Students were assigned to these descriptors based on their grades during the mandatory longitudinal ambulatory clerkship (completed during Years 1 and 2) and the six mandatory 3rd-year clinical clerkships (Internal Medicine, Neurology, Obstetrics/Gynecology, Pediatrics, Psychiatry, and Surgery). Grades were then weighted by the number of weeks of the clerkship and by the grade distribution of the entire class in that clerkship. For example, a grade of Honors in a long clerkship that does not give many Honors grades carries more weight than a similar grade in a shorter clerkship that gives many Honors grades. Our dean's letter also provides a guide to interpreting these rankings, with approximately 20% of students in the outstanding group, 25% in the excellent group, and 50% to 55% in the very good group. Less than 5% are in the good group.

Within 1 year of graduation we sent a 15-item survey to internship program directors, asking them to rate the graduates on a number of general clinical, interpersonal, and professional skills (see Figure 1). Items 1 through 12 were scaled on a 4-point scale, whereas Item 13 was scaled on a 7-point scale. Program directors were also asked (Items 14 and 15) whether the dean's letter had overstated or understated the graduates' strengths or weaknesses.

Figure 1. Postgraduation survey.
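The exact weighting formula is not given in the article; the following minimal Python sketch is only a hypothetical illustration of a composite consistent with the weighting scheme described above, in which each clerkship grade is weighted by the clerkship's length and by how uncommon that grade was in the class. All grade values, clerkship lengths, and class fractions below are invented.

```python
from dataclasses import dataclass

# Hypothetical illustration only: the article states that grades were weighted by
# clerkship length and by the class-wide grade distribution, but does not give the
# actual formula. All names and numbers below are invented.

GRADE_POINTS = {"Honors": 3, "High Pass": 2, "Pass": 1}

@dataclass
class ClerkshipGrade:
    clerkship: str         # clerkship name
    weeks: int             # length of the clerkship in weeks
    grade: str             # grade this student received
    class_fraction: float  # fraction of the class that received this grade

def composite_score(records):
    """Weight each grade by clerkship length and by how uncommon the grade was,
    so Honors in a long clerkship that awards few Honors counts the most."""
    total_weeks = sum(r.weeks for r in records)
    weighted = sum(
        r.weeks * GRADE_POINTS[r.grade] * (1.0 - r.class_fraction)
        for r in records
    )
    return weighted / total_weeks

student = [
    ClerkshipGrade("Internal Medicine", 8, "Honors", 0.20),
    ClerkshipGrade("Psychiatry", 4, "Honors", 0.45),
    ClerkshipGrade("Surgery", 8, "Pass", 0.50),
]
print(f"Composite score: {composite_score(student):.2f}")
```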
Statistical Analysis

We first used factor analysis to determine the best method of grouping the survey items for further analysis. We then used analysis of variance (ANOVA) to compare program directors' assessments across the four dean's letter ranking groups. An ANOVA was also used to compare dean's letter ranking groups across specialties. We used the Duncan multiple range test to assess post hoc differences among ranking groups wherever there was a significant overall F value. Because of the relatively small number of students for whom program directors felt the dean's letter had under- or overestimated the students' abilities, we used simple descriptive statistics to assess relationships between dean's letter ranking groups and program directors' assessments. Chi-square tests were used to assess relationships between categorical variables. All analyses were performed with SAS version 9.1 (SAS Institute, Cary, NC). Our study was determined to be exempt from review by our Institutional Review Board.
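The analyses described above were run in SAS 9.1. Purely as an illustrative sketch (not the authors' code), the fragment below shows how the same kinds of tests, a one-way ANOVA across the four ranking groups and a chi-square test of response rate by group, could be set up in Python on invented data; Tukey's HSD is used in place of the Duncan multiple range test, which is not assumed to be available in these libraries.

```python
import numpy as np
from scipy.stats import f_oneway, chi2_contingency
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Invented example data: mean program-director ratings (1-4 scale) for a handful
# of graduates in each dean's letter ranking group.
ratings = {
    "outstanding": [3.8, 3.6, 3.9, 3.7],
    "excellent":   [3.7, 3.5, 3.8, 3.4],
    "very good":   [3.6, 2.9, 3.8, 2.5],
    "good":        [2.4, 2.1, 2.6],
}

# One-way ANOVA across the four ranking groups.
f_stat, p_value = f_oneway(*ratings.values())
print(f"ANOVA: F = {f_stat:.2f}, p = {p_value:.4f}")

# Post hoc pairwise comparisons (Tukey's HSD standing in for Duncan's test).
scores = np.concatenate(list(ratings.values()))
groups = np.concatenate([[name] * len(vals) for name, vals in ratings.items()])
print(pairwise_tukeyhsd(scores, groups))

# Chi-square test of survey response rate by ranking group
# (rows = responded / did not respond, columns = ranking groups; counts invented).
response_table = np.array([[22, 27, 52, 8],
                           [14, 19, 40, 3]])
chi2, p, dof, _ = chi2_contingency(response_table)
print(f"Chi-square: chi2 = {chi2:.2f}, df = {dof}, p = {p:.2f}")
```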
Results

Postgraduation Questionnaire

We received 109 responses from program directors for the 185 graduates (response rate = 58.9%). There was no significant difference in response rate across the four dean's letter ranking groups, χ2 = 1.79, df = 3, n = 185, p = .69.

Factor analysis of the first 13 items initially appeared consistent with a two-factor solution: the first eigenvalue was 8.9, the second was 1.06, and the third was 0.57. The first two factors collectively accounted for 77% of the covariance between responses. The results of varimax rotation for the two-factor solution are displayed in Table 1. Although the item about "interpersonal skills" appears to reside on a separate factor, we found that this item had a very high correlation (0.82) with the mean of the other items and thus appears to share substantial variance with them. Furthermore, even if this item were analyzed separately, as a single item we anticipate that it would have relatively low reliability.

Table 1. Two-Factor Solution to Postgraduation Questionnaire Items

Item                                           Factor 1   Factor 2
Overall impression                                0.92       0.18
Clinical reasoning and patient management         0.89       0.05
Lifelong learning skills                          0.87       0.13
Suitability for career in clinical practice      0.87       0.18
Suitability for career in academic medicine      0.87       0.09
Bedside skills                                    0.86       0.23
Leadership skills                                 0.85       0.06
General fund of knowledge                         0.85       0.02
Teaching skills                                   0.82       0.11
Which description best fits this resident?       0.81       0.16
Personal qualities                                0.72       0.38
Attentiveness to psychosocial issues              0.71       0.44
Interpersonal skills                              0.04       0.96

A single 13-item scale was highly reliable, with a Cronbach's alpha of 0.94. Application of the Spearman–Brown prophecy formula reveals that a shorter questionnaire with only 4 items would have a reliability of 0.8. Nonetheless, for the purpose of further analysis, we combined all 13 items into a single overall quality score.
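For reference, these are the standard formulas underlying the reliability figures reported above (a sketch only; the 13-item alpha of .94 and the 4-item projection are the values given in the text, and n denotes the ratio of new to old test length):

```latex
% Cronbach's alpha for a k-item scale, and the Spearman-Brown prophecy formula
% for a test shortened (or lengthened) by a factor n = k_new / k_old.
\[
  \alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} \sigma_i^{2}}{\sigma_X^{2}}\right),
  \qquad
  \rho_{\mathrm{new}} = \frac{n\,\rho_{\mathrm{old}}}{1 + (n-1)\,\rho_{\mathrm{old}}}.
\]
% Worked check with the values reported in the text:
\[
  n = \tfrac{4}{13}, \quad \rho_{\mathrm{old}} = .94
  \;\Rightarrow\;
  \rho_{\mathrm{new}} = \frac{(4/13)(.94)}{1 + (4/13 - 1)(.94)} \approx .83,
\]
% which is consistent with the reported reliability of roughly 0.8 for a 4-item form.
```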
Relationship of Dean's Letter Ranking Groups to Program Directors' Assessment

An ANOVA revealed significant differences in program directors' assessments across the four dean's letter ranking groups, F(3, 105) = 13.37, p < .001. Comparisons between groups revealed that the means of students in the outstanding and excellent groups were statistically similar but that both were significantly higher than the mean of the very good group. The mean of students in the good group was significantly lower than that of the other three groups. As shown in Figure 2, there was also a marked trend toward increasing variability across the ranking groups; students in the outstanding group were less variable in their ratings than students in the excellent group, who in turn were less variable than those in the very good group. As a secondary analysis, we also computed factor scores for the three items that loaded on the second factor in our analysis, which appears to reflect interpersonal abilities. These results were virtually identical, F(3, 99) = 8.49, p < .001, with the mean of the good group significantly lower than that of the other three groups.

Figure 2. Relationship between rankings by internship directors and dean's letter groupings.

Disagreements Between Dean's Letter Ranking Group and Program Directors' Assessments

We asked program directors whether the dean's letter had either overestimated or underestimated the student's strengths or weaknesses. Among the 104 students for whom we received responses to the item about strengths, there were 19 for whom the program directors felt the dean's letter had been inaccurate. Of these 19 students, 14 had been described as very good in the dean's letter. Within this group, seven dean's letters were said to have overestimated the students' strengths, whereas seven were said to have underestimated them. Of the remaining 5 students, 1 was in the outstanding group and 4 were in the excellent group. For 3 of these 5 students (including the student in the outstanding group), the dean's letter was said to have underestimated their strengths.

There was a similar pattern among the 97 students for whom we received responses to the item about weaknesses. Of the 10 students whose weaknesses were felt to have been inaccurately reported, 9 were in the very good group; for 3 of these, the dean's letter was felt to have overestimated their weaknesses, whereas for the other 6 it was reported to have underestimated their weaknesses.

We then focused on this very good group to further explore the relationship between program directors' overall rankings and their perceptions of the accuracy of the dean's letter among these students. Program directors assigned significantly lower overall scores to students for whom the dean's letter was perceived to have overestimated their strengths than they did to students for whom the dean's letter was either accurate or had underestimated their strengths (group means = 1.9, 3.3, and 3.6, respectively), F(2, 48) = 18.52, p < .001. Findings were similar for weaknesses: program directors assigned significantly lower overall scores to students for whom the dean's letter was perceived to have underestimated their weaknesses than they did to students for whom the dean's letter was either accurate or had overestimated their weaknesses (group means = 2.17, 3.2, and 3.8, respectively), F(2, 44) = 7.55, p = .002. For 4 of the 7 students for whom the dean's letter was reported to have understated their weaknesses, it was also reported to have overstated their strengths.

Relationship Between Dean's Letter Ranking Groups, Program Director Assessments, Specialty, and Gender

We subdivided graduates' residency fields into the seven mutually exclusive categories of Family Medicine, Internal Medicine, Obstetrics/Gynecology, Pediatrics, Psychiatry, Surgery, and Other. We found no significant differences between specialties in overall program directors' ratings, F(6, 99) = 0.99, p = .45. When comparing the seven possible residency choices against the four dean's letter ranking categories, however, the number of observations in most of the 28 resulting cells was too small to draw reliable conclusions. Nonetheless, there did not appear to be any obvious trends in these results (not shown). There were no differences between men and women in program directors' ratings, F(1, 108) = 0.26, p = .62. Similarly, there were no significant differences in the proportions of men and women in the four dean's letter ranking categories, χ2 = 3.24, df = 3, n = 185, p = .35.

Discussion

We find that dean's letter ranking groups at our institution are significant predictors of program directors' evaluations of performance during internship. Survey responses were received from more than 50 different program directors, thus supporting the validity of the dean's letter ranking groups for predicting performance in a range of clinical settings and specialties. We also found that overall program directors' ratings did not differ by specialty, which argues against any bias resulting from students in different Medical Student Performance Evaluation (MSPE) categories differentially entering various specialties.

This general conclusion must be tempered, however, by the finding that many students in the excellent and very good groups received ratings from their program directors as high as those of students in the outstanding group. As a group, students in these lower tiers of dean's letter rankings are more variable in their program directors' ratings than are those in the upper tier. It appears that some of the students in these lower dean's letter ranking groups are capable of displaying the same level of clinical skills as those in the outstanding group but may need either more time or a change in environment to do so. Other students in these lower two groups, however, continue to receive less favorable evaluations during internship, thus continuing the pattern they displayed in medical school. By contrast, it was rare for a student who received a rank of outstanding to receive low scores from their program directors. Thus, although a rank of outstanding seems to guarantee that students will perform well during internship, prediction becomes somewhat less certain for those in the excellent and very good groups. These conclusions are supported by the fact that dean's letters for students in the very good group were the most likely to be perceived by program directors as having misrepresented the students' strengths or weaknesses. By contrast, the few students in the lowest, or good, group receive consistently low evaluations from their program directors, thus supporting the validity of this ranking.

In their survey of internal medicine residency program directors about problem residents, Yao and Wright13 found that the most commonly reported difficulties were insufficient medical knowledge, poor clinical judgment, and inefficient use of time. Presumably such difficulties would also be identifiable during medical school clerkships and reflected both in MSPE ratings and program directors' evaluations.
Incidentally, although many items on our program director questionnaire reflect the competencies described by the Accreditation Council for Graduate Medical Education (ACGME),14 program directors in our study appeared to view clinical competence along a single dimension, which was only marginally differentiated from interpersonal skills. This finding is similar to the results reported by Silber et al.,15 who found that responses to a global rating form did not differentiate the six ACGME competencies but rather separated into the two factors of medical knowledge and interpersonal skills. Although it may in principle be possible to predict interns' levels of skill in the individual ACGME competencies, this will require the development of more sophisticated assessment tools.

Our study has several limitations. First, it was conducted at a single medical school, and thus the results may not necessarily reflect those at other institutions. Nonetheless, we point out that our dean's letter ranking groups were based on clerkship grades compiled over a range of settings, venues, and evaluators with a standardized formula and thus should be generalizable to other schools with similar clerkships and grading systems. We also point out that questionnaires were received from program directors in more than 50 different programs and thus reflect a national sample of these respondents. Second, our response rate was somewhat low at 59%. There were no systematic differences in response rates for students in the different dean's letter ranking categories, however, suggesting that responses were not biased by how well the students had performed in medical school. Third, it is possible that program directors were biased by their previous knowledge of the contents of the dean's letter. For several reasons, however, we do not believe that their responses were significantly influenced by this information. First, the instructions made no mention of the dean's letter. Rather, we stated that we were interested in the performance of our graduates during internship. Second, the questions about the dean's letter came only at the end of the questionnaire. Thus, we doubt that program directors were cued to think about the dean's letter as they were completing the first part of the questionnaire. Indeed, a few program directors later complained that they had to stop and look for our dean's letter in their files before completing those items. Finally, our questions asked specifically about how the graduates had performed in the course of their internship. We strongly suspect that program directors responded on the basis of nearly a year of lived experience with these individuals, rather than on the basis of a letter that they may not have read for nearly a year.

In summary, we find that dean's letter ranking groups are a significant predictor of program directors' evaluations during internship across a range of training programs. Students in the lower half of the dean's letter rankings are the most variable in their performance during internship, with many receiving very high evaluations from program directors. A few students in the lower half of the dean's letter rankings were also perceived as having been overrated in the dean's letter. Thus, this appears to represent a heterogeneous group for whom prediction of later performance is less certain. Further study will be needed to develop more precise predictors of later performance among this group.

References
1. Hunt DD, MacLaren C, Scott C, Marshall SG, Braddock CH, Sarfaty S. Follow-up study of the characteristics of dean's letters. Academic Medicine 2001;76:727–33.
2. Leiden LI, Miller GD. National survey of writers of dean's letters for residency applications. Journal of Medical Education 1986;61:943–53.
3. Hunt DD, MacLaren CF, Scott CS, Chu J, Leiden LI. Characteristics of dean's letters in 1981 and 1992. Academic Medicine 1993;68:905–11.
4. Ozuah PO. Variability in deans' letters. Journal of the American Medical Association 2002;288:1061.
5. Toewe CH II, Golay DR. Use of class ranking in deans' letters. Academic Medicine 1989;64:690–1.
6. Yager J, Strauss GD, Tardiff K. The quality of deans' letters from medical schools. Journal of Medical Education 1984;59:471–8.
7. Edmond M, Roberson M, Hasan N. The dishonest dean's letter: An analysis of 532 dean's letters from 99 U.S. medical schools. Academic Medicine 1999;74:1033–5.
8. Association of American Medical Colleges. A guide to the preparation of the Medical Student Performance Evaluation. Available at: www.aamc.org/members/gsa/mspeguide.pdf. Accessed May 21, 2007.
9. Provan JL, Cuttress L. Preferences of program directors for evaluation of candidates for postgraduate training. Canadian Medical Association Journal 1995;153:919–23.
10. Hunt DD, MacLaren CF, Carline J. Comparing assessments of medical students' potentials as residents made by the residency directors and deans at two schools. Academic Medicine 1991;66:340–4.
11. Blacklow RS, Goepp CE, Hojat M. Class ranking models for deans' letters and their psychometric evaluation. Academic Medicine 1991;66(Suppl):S10–2.
12. Lurie SJ, Nofziger A, Meldrum S, Mooney C, Epstein RE. Temporal and group-related trends in peer assessment amongst medical students. Medical Education, in press.
13. Yao DC, Wright SM. National survey of internal medicine residency program directors regarding problem residents. Journal of the American Medical Association 2000;284:1099–104.
14. Accreditation Council for Graduate Medical Education. Accreditation Council for Graduate Medical Education Outcome Project. Available at: http://www.acgme.org/outcome/comp/compMin.asp. Accessed March 9, 2006.
15. Silber CG, Nasca TJ, Paskin DL, Eiger G, Robeson M, Veloski JJ. Do global rating forms enable program directors to assess the ACGME competencies? Academic Medicine 2004;79:549–56.

Final revision received on November 13, 2006