Improving the Accuracy of Raters: Direct Observation Workshop

Approaches to Rater Training

A. Rater Error Training (RET)

The major goal of this type of training is to improve accuracy by decreasing common "rater errors," or rater biases. There are several types of rater biases:

1. Leniency error: the tendency to give all residents good ratings. This is the most common error in most training programs.

2. Severity error: the tendency to give all residents low or poor ratings. When is the last time you saw this?

3. Halo error: the failure to discriminate the performance of a resident across the different dimensions of competency. For example, a resident is perceived to have outstanding knowledge and presentation skills, which leads the attending to give high marks in all dimensions of competence. This is a very common error with rating scales (evaluation forms): residents tend to get high marks in history-taking and physical examination skills even though a) the attending rarely, if ever, actually observes the resident performing these skills, and b) we know from multiple studies that we stink at history-taking and physical examination skills.

4. First impression error: providing a rating based on a first impression and failing to account for subsequent performance.

5. Friendship bias: the tendency to give good ratings because of friendship ties.

RET usually involves exercises designed to get raters to provide greater variability in their ratings. Participants are given the definitions of the common rater errors and are then shown examples of actual ratings demonstrating each type of error. Discussion and feedback are usually included as part of the training. You could potentially use ratings from your own program to provide examples of the common rating errors (one way to surface such examples is sketched below). As noted in your annotated bibliography, RET appears to be modestly effective in reducing halo and leniency error, but when used alone may actually decrease rater accuracy!
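If you do want to mine your own evaluation data for teaching examples, the two easiest errors to surface are leniency (a rater's average score sits near the top of the scale) and halo/range restriction (a rater's scores barely vary across competency dimensions). The Python sketch below is one illustrative way to flag both; the file name, column names, 9-point scale, and cutoffs are assumptions made for this example, not part of the workshop materials.

# Hypothetical sketch: flag possible leniency and halo/range restriction in program ratings.
# Assumes a CSV with a "rater" column plus one column per competency, each rated 1-9.
# File name, column names, and thresholds are illustrative only.
import pandas as pd

COMPETENCIES = ["patient_care", "medical_knowledge", "communication",
                "professionalism", "practice_based_learning", "systems_based_practice"]

ratings = pd.read_csv("faculty_ratings.csv")  # hypothetical file: one row per completed evaluation form

# Possible leniency: a rater whose average rating sits well above the scale midpoint (5 of 9).
mean_by_rater = ratings.groupby("rater")[COMPETENCIES].mean().mean(axis=1)
possible_leniency = mean_by_rater[mean_by_rater > 7.5]

# Possible halo / range restriction: almost no spread across the dimensions on the same form.
spread_per_form = ratings[COMPETENCIES].std(axis=1)
mean_spread_by_rater = spread_per_form.groupby(ratings["rater"]).mean()
possible_halo = mean_spread_by_rater[mean_spread_by_rater < 0.5]

print("Review for possible leniency:\n", possible_leniency.round(2))
print("Review for possible halo or range restriction:\n", possible_halo.round(2))

De-identified output from a script like this can supply the "examples of actual ratings" that RET exercises call for without singling out individual residents during the session.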
B. Performance Dimension Training (PDT)

Simply stated, this type of training is designed to familiarize the raters, in this case your faculty, with the appropriate performance dimensions used in your evaluation system. Although PDT alone probably does not improve rater accuracy, it is a critical structural element for all rater training programs. Definitions for each dimension of performance, or competency, should be reviewed with all evaluators. The dimensions of performance in residency training are defined by the six new general competencies promoted by the Accreditation Council for Graduate Medical Education (ACGME). In addition to defining the competencies for your faculty, faculty should also be given the opportunity to interact with the definitions to improve their understanding. This can be accomplished with review of actual resident performance, "paper cases," or videotape.

C. Frame of Reference Training (FOR)

This type of training specifically targets accuracy in rating. The steps for FOR in a residency training program would be:

a) Participants are given descriptions of each dimension of competence and are then instructed to discuss what they believe are the qualifications needed for each dimension.
b) Participants are given clinical vignettes describing critical incidents of performance ranging from unsatisfactory to average to outstanding (the frame of reference).
c) Participants use the vignettes to provide ratings on a behaviorally anchored rating scale.
d) The session trainer then provides feedback on what the "true" ratings should be, along with an explanation for each rating.
e) The training session wraps up with an important discussion of the discrepancies between the participants' ratings and the "true" ratings.

The most difficult aspect of FOR is setting the actual performance standards. As you can see, FOR is really an extension of PDT: it involves establishing appropriate ratings for various levels of performance. Hauenstein (Performance Appraisal, 1998) makes one additional important point with regard to the actual target scores: the goal should be to "produce reasonable target scores without being overly concerned that the target scores represent truth in the abstract sense." In training programs, we should be able to agree on definitions and target ratings for the basic dimensions of clinical competence.

D. Behavioral Observation Training (BOT)

Observation skills are critical to effective and accurate ratings. While RET and FOR training are more focused on the judgmental processes involved in ratings, BOT is focused on improving the detection, perception, and recall of actual performance. There are two main strategies to improve observation. The first is simply to increase the number of observations, that is, to sample more of actual performance. This improves recall of performance and, in essence, gives the rater multiple opportunities to practice observation. The second strategy is to provide some form of observational aid that raters can use to record observations. Some call these aids "behavioral diaries." In a sense, the mini-CEX form is an immediate "behavioral diary" for recording a rating of an observation (a minimal sketch of such a diary entry appears after the observation rules below).

We believe there is an additional component of BOT in residency training. Observation of clinical skills requires that the attending appropriately "prepare" for the observation, position himself or herself correctly to observe a particular skill, minimize interaction with the resident and patient, and avoid distractions. Preparation means determining what it is you wish to accomplish during the actual observation. For example, suppose you plan to perform a mini-CEX of the physical examination skills of an intern caring for a newly diagnosed hypertensive patient. What are the appropriate components of a physical exam for a hypertensive patient? How do you need to position yourself in order to ensure proper technique is used by the intern? How and when will you confirm (if you deem it necessary) the physical findings? This preparation helps to maximize the value of the observation and reinforces the need for the attending to consider the appropriate definitions of the performance dimension of interest.

Simple rules for observation:

1. Correct positioning. As the rater, try to avoid being in the line of sight of either the patient or the resident. Use the principle of triangulation: the resident and patient face each other across the desk, and the attending observes from a third point off to the side. [Original handout diagram: desk with resident (R) and patient (P) facing each other and attending (A) positioned apart from both.]

2. Avoid being intrusive. Don't interject or interrupt if at all possible. Once you interject yourself into the resident-patient interaction, the visit is permanently altered. However, there will be times when you need to interject yourself at some point in the visit, for example to correct misinformation from the resident.

3. Minimize interruptions. Let your staff know you will be with the resident for 5-10 minutes, avoid taking routine calls, etc.

4. Be prepared. Know before you enter the room what your goals are for the observation session. For example, if you are observing a physical exam, have the resident present the history first; then you will know what the key elements of the PE should be.
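As a concrete illustration of a "behavioral diary," the sketch below shows one possible way to capture a structured observation note at the point of care. Everything here (the field names, the dimension labels, the 9-point rating, the example content) is an assumption made for the illustration; it is not the official mini-CEX form.

# Hypothetical sketch of a simple "behavioral diary" entry for direct observation.
# Field names, dimensions, and the rating scale are illustrative assumptions only.
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

@dataclass
class ObservationEntry:
    resident: str
    observer: str
    encounter_date: date
    dimension: str                        # e.g., "physical examination"
    preparation_goal: str                 # what the observer planned to assess
    behaviors_observed: List[str] = field(default_factory=list)
    behaviors_missed: List[str] = field(default_factory=list)
    rating: Optional[int] = None          # e.g., 1-9, recorded immediately after the visit
    feedback_given: str = ""

# Example use, loosely following the hypertension mini-CEX example in the text
# (the clinical details are invented for illustration):
entry = ObservationEntry(
    resident="Intern A",
    observer="Dr. B",
    encounter_date=date.today(),
    dimension="physical examination",
    preparation_goal="Key exam elements in newly diagnosed hypertension",
    behaviors_observed=["measured blood pressure in both arms", "performed fundoscopic exam"],
    behaviors_missed=["did not listen for renal bruits"],
    rating=6,
    feedback_given="Reviewed the exam elements specific to new hypertension.",
)

A stack of entries like this also lets a program periodically check which dimensions of performance have not yet been observed for a given resident, which is the "observational deficiency" check discussed later in the annotated bibliography.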
Performance Dimension Training Exercise

The purpose of this exercise is for your group to develop the definitions for a dimension of clinical competency. The dimension we will focus on today is counseling. Counseling is an important component of the new ACGME general competency of Patient Care. The ACGME will be looking for evidence that training programs have developed appropriate methods to measure the success of the curriculum and the competency of individual residents in the general category of Patient Care.

Counseling situation: A resident needs to counsel a patient about starting a new medication at the end of a clinic visit. What criteria will you use to judge the counseling performance of this resident? In other words, define the essential components the resident should specifically include in the counseling session with a patient starting a new medication.

With your group, define the components of an effective counseling session:

Knowledge (e.g., What should the patient be told? What should the patient be asked?)
Skills (e.g., How should the questions be asked? How should the information be presented?)
Attitudes (e.g., How should the resident interact with the patient?)

Developing a Checklist Form

Counseling Session: Starting a New Therapy

Resident Name: ____________________________________________ Date: ___________________

Components for a Checklist

Knowledge:
1.
2.
3.
4.
5.
Other:

Skills:
1.
2.
3.
4.
5.
Other:

Attitudes:
1.
2.
3.
4.
5.
Other:

Overall Rating of Counseling (circle one): Poor   Marginal   Good   Excellent   Outstanding
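Once your group has filled in the components, one option is to encode the completed worksheet as a simple checklist that can be tallied right after an observed counseling session. The sketch below shows the idea in Python; the example items are placeholders for whatever components your group defines (they are not the "answers" to the exercise), and the tally function is a hypothetical illustration rather than a validated instrument.

# Hypothetical sketch: tallying a counseling checklist built from the exercise above.
# The items below are illustrative placeholders; substitute your group's components.
CHECKLIST = {
    "Knowledge": [
        "Explains the reason for the new medication",
        "Reviews common side effects and what to do about them",
    ],
    "Skills": [
        "Uses open-ended questions to assess understanding",
        "Asks the patient to repeat back the key instructions",
    ],
    "Attitudes": [
        "Invites and answers the patient's questions respectfully",
    ],
}

OVERALL_SCALE = ["Poor", "Marginal", "Good", "Excellent", "Outstanding"]

def tally_session(items_done: set) -> dict:
    """Report, per category, how many checklist items were observed."""
    summary = {}
    for category, items in CHECKLIST.items():
        done = sum(1 for item in items if item in items_done)
        summary[category] = f"{done}/{len(items)} items observed"
    return summary

# Example use after an observed counseling session:
observed = {
    "Explains the reason for the new medication",
    "Invites and answers the patient's questions respectfully",
}
print(tally_session(observed))
print("Overall rating (circle one):", " / ".join(OVERALL_SCALE))

A paper version works just as well; the point is that each checklist item names a single observable behavior, which keeps the rating anchored to what was actually seen.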
Annotated Bibliography

Why Faculty Need to Observe Clinical Skills: Examples from Physical Diagnosis

» Wiener S, Nathanson M. Physical examination. Frequently observed errors. JAMA. 1976; 236: 852-855.

This article confirmed what previous authors had demonstrated in the 1960s: the physical examination skills of house staff suffered from multiple errors. The authors did not quantify the number of observed errors by house staff, but they did classify the most common errors into the five categories shown below:

Technique
1. Poor ordering and organization of the exam
2. Defective or no equipment
3. Improper manual technique or use of instrument
4. Performance of the examination when not appropriate
5. Poor bedside etiquette leading to patient discomfort, embarrassment, or overt hostility

Omission
1. Failure to perform part of the examination

Detection
1. Missing a sign that is present
2. Reporting detection of a sign that is not present
3. Interpreting normal physiological or anatomic variation as abnormal
4. Misidentifying a sign after detection

Interpretation
1. Failure to understand the meaning in pathophysiologic terms of a sign
2. Lack of knowledge of or use of confirming signs
3. Lack of knowledge of the value of a sign in confirming/refuting a diagnosis

Recording
1. Forgetting a finding and not recording it
2. Illegible handwriting, obscure abbreviations, incomplete recording
3. Recording a diagnosis and not the sign detected

Comment: Problems with clinical skills have "plagued" training programs for decades. Despite the pleas of numerous educators over time, adequate assessment of history-taking, physical examination, and communication skills remains sub-optimal in most residency programs today.

» Wray NP, Friedland JA. Detection and correction of house staff error in physical diagnosis. JAMA. 1983; 249: 1035-37.

This study sought to quantify the number of errors committed by house staff. Disturbingly, residents committed at least one error in 58% of the patients they examined, and interns committed at least one error in 62% of their patients. The gold standard was an "expert" faculty member. The majority were errors of omission (72% of all errors).

Comment: Further confirmation of the scope of the problem. This article also highlights the importance of faculty observation: through direct observation, faculty can correct deficiencies in "real time."

» Mangione S, Nieman LZ. Cardiac auscultatory skills of Internal Medicine and Family Practice trainees: A comparison of diagnostic proficiency. JAMA. 1997; 278: 717-22.

Thirty-one training programs with 453 residents and 88 medical students participated in this trial. All of the participants listened to 12 cardiac sounds taken directly from actual patients, then completed a multiple-choice questionnaire. On average, the residents identified only 20% of the sounds correctly. Internal medicine residents were only slightly better than family practice residents. Level of training had little to no effect on correct identification.

Comment: This well-done study documented highly deficient auscultatory skills among a large group of trainees. It does not answer the question of the best teaching method or the role of direct observation by faculty in auscultatory skills. However, considered in the context of the studies by Wiener and Wray, these results suggest that faculty can address and correct problems and errors in technique through direct observation.

Faculty Skill in the Observation of Clinical Skills

» Kroboth FJ, Hanusa BH, Parker S, et al. The inter-rater reliability and internal consistency of a clinical evaluation exercise. J Gen Intern Med. 1992; 7: 174-9.

This study examined the reliability of the traditional clinical evaluation exercise (CEX). Each of 32 interns was observed twice, with two raters present for each patient interaction. Faculty completed a standardized rating scale form at the end of the exercise. Overall, inter-rater agreement was poor on the three main domains of competence assessed: history-taking, physical examination, and cognitive skills.

Comment: This study examined the reliability of performance observations by faculty of interns during an inpatient CEX. Validity was not assessed. Interestingly, prior experience of the faculty member with the CEX did not lead to better reliability.

» Noel GL, Herbers JE, Caplow MP, Cooper GS, Pangaro LN, Harvey J. How well do Internal Medicine faculty members evaluate the clinical skills of residents? Ann Intern Med. 1992; 117: 757-765.

In this study, 203 faculty members participated in a trial designed to assess faculty rating skills in the CEX. One group of 69 participants received a brief educational intervention consisting of a 15-minute videotape explaining the purpose of the CEX and the need for detailed observation and feedback. All 203 faculty viewed two CEX case simulations on tape. For each taped clinical scenario, a resident was trained to omit, or perform improperly, important aspects of the history and/or physical examination. Enough "errors" were inserted so that the resident should be rated as marginal. Accuracy scores were calculated for each attending.
The overall accuracy of the faculty cohorts ranged from only 32% for those using an open-ended evaluation form to approximately 60% for faculty given a structured evaluation form. Regarding overall ratings of competence, over 50% of the faculty rated each of the two scenarios as satisfactory or superior. Use of the 15-minute instructional videotape did not improve accuracy.

Comment: This study is important because it is one of the few that investigated the accuracy of actual observation skills. Despite use of a structured form for a carefully staged, standardized videotaped clinical encounter, overall accuracy was at best 60%. More concerning are the high overall competency ratings given by the majority of the faculty for clinical scenarios specifically designed to depict marginal performance. This study also helps to highlight the important distinction between reliability and accuracy/validity: highly reliable ratings of competence are essentially useless if they fail to possess reasonable levels of accuracy and validity. Although an attempt was made to provide a subgroup of the faculty with some degree of "rater training," the educational intervention was brief and not designed to improve observation skills. Thus, we cannot extrapolate from this study any significant information about rater training programs, other than to note that more research in this area is desperately needed.

» Kalet A, Earp JA, Kowlowitz V. How well do faculty evaluate the interviewing skills of medical students? J Gen Intern Med. 1992; 7: 499-505.

This study utilized an Objective Structured Clinical Examination (OSCE) with second-year medical students to examine the reliability and accuracy of faculty ratings. Videotapes were made of 21 of 159 total encounters for a) review by the original faculty member who had observed the OSCE in person and b) comparison with ratings by other faculty and outside "experts." With regard to accuracy, the faculty scores after observing the OSCE averaged 80% correct (range 41-100%). Intra-rater agreement (faculty who observed the OSCE and then rated the videotape of the same encounter) was quite poor (Pearson correlation coefficients 0.12 to 0.54, depending on the domain of competence rated). Inter-rater agreement between faculty and outside experts was equally poor, with correlation coefficients ranging from 0.11 to 0.37.

Comment: This study with a group of medical students is consistent with the findings of Noel et al. Although accuracy was better (using a checklist for process variables only), reliability was quite poor. Rater training was not part of the study.

» Marin JA, Reznick RK, Rothman A, Tamblyn RM, Regehr G. Who should rate candidates in an objective structured clinical examination? Acad Med. 1996; 71: 170-75.

This study compared ratings of history-taking sessions on an OSCE among candidates for Canada's qualifying examination. Candidates were evaluated by three different raters, all using the same checklists: a physician rater, a standardized patient observer, and a standardized patient rating from recall. The gold standard was a panel of three "expert" physicians who rated the candidates using identical checklists. The physician rater was less likely than the standardized patients to deviate from the ratings of the expert panel.

Comment: This study highlights several points.
First, direct observation of clinical skills by faculty physicians is important and in some instances may be more reliable than ratings by standardized patients, depending on the purpose of the evaluation and the gold standard used. It is also important to note that a standard checklist was used for the ratings; the checklist helped to frame the observation for these raters. Noel et al. also found that a more detailed, specific rating form improved reliability in their study of the CEX (see above).

Rater Training: A Few Lessons from Industry

» Murphy KR, Balzer WK. Rater errors and rating accuracy. J Appl Psych. 1989; 74: 619-24.

Using meta-analysis, this study sought to determine the relationship between rating errors and rating accuracy. Some in the field of performance assessment have argued that the absence of rater errors indicates better accuracy in rating performance. The three main rating errors are the halo effect, leniency, and range restriction. Halo error is the failure to discriminate among the different dimensions of clinical competence, usually because a rating in one domain affects the ratings in all other domains. Leniency error is the tendency to give everyone good ratings regardless of actual performance. Range restriction is simply the tendency to use a "restricted" portion of the scale (e.g., giving a resident all 5's, or, more likely in medicine, all 9's on a 9-point scale; the latter is a potential example of all three errors!). This study found only a weak correlation between rating errors and rating accuracy.

Comment: This is an important finding, given that other work in the field has found that rater training focused specifically on reducing rater errors may actually reduce accuracy.

» Murphy KR, Garcia M, Kerkar, Martin C, Balzer WK. Relationship between observational accuracy and accuracy in evaluating performance. J Appl Psych. 1982; 67: 320-25.

This study examined the relationship between observational accuracy and performance rating accuracy. The first step in any rating process should involve observation of the performance dimension of interest (e.g., physical exam skills); ideally, accurate observations lead to better judgments when rating performance. The study also highlights the four separate components of rating accuracy used by industrial psychologists (definitions adapted from Murphy et al.):

1. Elevation: accuracy of the average rating, over all ratees, given by a rater. A rater whose overall average is closer to the true average score is more accurate than one whose average rating is far from the true score.

2. Differential elevation: the component associated with the average rating for each ratee across all performance dimensions. A rater with good differential elevation will correctly rank order ratees on the basis of their overall performance.

3. Stereotype accuracy: the component associated with the average rating given to each performance dimension across all ratees. A rater with good stereotype accuracy will correctly assess relative strengths across the different performance dimensions (e.g., clinical judgment vs. interviewing skills vs. knowledge).

4. Differential accuracy: the component of accuracy that reflects the rater's sensitivity to differences among ratees in their patterns of performance (e.g., across settings such as ward versus clinic).

This study used videotapes of a teaching encounter as the unit of performance to be evaluated.
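For readers who want to see how these four components can be separated numerically, the sketch below applies the classic Cronbach-style decomposition to the difference between one rater's scores and the "true" scores. The exact formulas vary somewhat across papers, so treat this as an illustration of the idea (smaller values mean greater accuracy), not a reproduction of the scoring used by Murphy and colleagues; the example numbers are invented.

# Hypothetical sketch of the Cronbach-style decomposition behind the four accuracy
# components listed above. Smaller values = more accurate; example data are invented.
import numpy as np

def accuracy_components(ratings: np.ndarray, true_scores: np.ndarray) -> dict:
    """ratings and true_scores are (n_ratees, n_dimensions) arrays for one rater."""
    d = ratings - true_scores                        # rating minus true score
    grand = d.mean()                                 # overall miscalibration
    ratee_dev = d.mean(axis=1) - grand               # per-ratee deviation from overall
    dim_dev = d.mean(axis=0) - grand                 # per-dimension deviation from overall
    residual = d - d.mean(axis=1, keepdims=True) - d.mean(axis=0, keepdims=True) + grand
    return {
        "elevation": grand ** 2,
        "differential_elevation": float(np.mean(ratee_dev ** 2)),
        "stereotype_accuracy": float(np.mean(dim_dev ** 2)),
        "differential_accuracy": float(np.mean(residual ** 2)),
    }

# Example: one rater scoring 3 residents on 4 dimensions, versus "true" scores.
rater = np.array([[7, 8, 7, 8], [6, 7, 6, 7], [8, 8, 8, 9]], dtype=float)
truth = np.array([[5, 7, 4, 6], [6, 6, 5, 7], [7, 8, 6, 8]], dtype=float)
print(accuracy_components(rater, truth))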
The main finding was that frequency ratings (accurate recording of observed behaviors) and performance evaluations probably involve, to some degree, different cognitive processes. Performance evaluation involves, in the words of the authors, "complex, abstract judgments about the quality of performance." The other finding was that accuracy in observation had only a modest association with performance rating accuracy; likewise, "errors" in observation were more likely to lead to less accurate performance ratings.

Comment: This study highlights the multiple and complex components of accuracy. This early study also showed only a modest link between accuracy in observing behaviors and accuracy in actual performance ratings. The results in essence "foreshadow" the results seen in the study by Noel et al. of the CEX (see above). First, less accurate observations during the CEX led to more "inaccuracies" in the overall performance rating; second, accurately recorded behaviors are not always "correctly" incorporated into the complex task of an overall performance rating.

» McIntyre RM, Smith DE, Hassett CE. Accuracy of performance ratings as affected by rater training and perceived purpose of rating. J Appl Psych. 1984; 69: 147-56.

One of the early studies using videotape to train raters with FOR (frame of reference training). FOR contains the following elements:

a) Participants are given job descriptions and are then instructed to discuss what they believe are the qualifications needed for the job.
b) Participants are given job (e.g., clinical) vignettes describing critical incidents of performance ranging from unsatisfactory to average to outstanding (the frame of reference).
c) Participants use the vignettes to provide ratings on a behaviorally anchored rating scale.
d) The session trainer then provides feedback on what the "true" ratings should be, along with an explanation for each rating.
e) The training session wraps up with an important discussion of the discrepancies between the participants' ratings and the "true" ratings.

This study also examined the impact of the purpose of the rating: ratings meant to provide feedback for improvement versus ratings used for a hiring decision (a "high-stakes" evaluation). Participants who completed the FOR training demonstrated greater accuracy. Interestingly, the group that received only error-avoidance training (see the study above) actually did worse with regard to accuracy. The purpose of the rating did not make a difference in this study. Several caveats need to be noted: first, the improvement in ratings was modest at best; second, the target group was college students rating a videotaped lecture.

Comment: Although we can learn from such studies, we cannot directly extrapolate this type of training to the more complex task of rating clinical performance across multiple domains. We clearly need more rigorous research on rater training programs for teaching faculty.

» Hauenstein NMA. Training raters to increase the accuracy of appraisals and the usefulness of feedback. In: Smither JW, editor. Performance Appraisal. San Francisco: Jossey-Bass; 1998. p. 419-21.

» Woehr DJ, Huffcutt AI. Rater training for performance appraisal: a quantitative review. J Occupational Organizational Psych. 1994; 67: 189-205.

One particular form of rater training is behavioral observation training (BOT).
As stated by Hauenstein, BOT "is designed to improve the detection, perception, and recall of performance behaviors." This type of training is therefore particularly pertinent to faculty development in clinical competence evaluation. BOT has two main components. The first is to encourage the rater to increase the number of observations, or "increase the sampling of behaviors." Along with enhanced sampling comes training to avoid observational errors; the focus is on accurate recall of performance behavior. As an example, the mini-CEX is in essence an attempt to promote more direct observation by structuring the purpose and form of the observation. The second key component is to encourage raters to use memory aids ("aides-mémoire") to document witnessed behaviors. The purpose of this diary or log is to help the rater track which dimensions of performance have actually been observed. The rater can periodically assess which dimensions of performance have NOT been observed and make plans to correct these "observational deficiencies." Experts suggest that raters define in advance what, how many, and how frequently specific performance behaviors will be observed. The mini-CEX is one tool that can help raters with these decisions. BOT has been found to improve rater accuracy, and a higher number of observations also appears to improve accuracy.

Comment: We believe BOT-type training is highly relevant to residency training. Lack of direct observation is already a well-recognized problem, and research is critically needed to define the optimal approach to BOT in medical training. Existing tools such as the mini-CEX and OSCEs are well suited to this task. Most BOT programs described in the psychology literature did not involve more than several hours of training time. This is a reasonable time commitment for those faculty serving a key role in the evaluation process in residency programs.

References: Resident Clinical Skills
(Compiled by Richard Hawkins, MD; Director, USUHS Clinical Simulation Center)

Beckman HB, Frankel RM. The Use of Videotape in Internal Medicine Training. J Gen Intern Med 1994;9:S17-S21.
Burdick WP, Friedman Ben-David M, Swisher L, Becher J, Magee D, McNamara R, Zwanger M. Reliability of Performance-based Clinical Skill Assessment of Emergency Medicine Residents. Acad Emerg Med 1996;3:1119-23.
Chalabian J, Garman K, Wallace P, Dunnington G. Clinical Breast Evaluation Skills of House Officers and Students. Am Surg 1996;6_:840-5.
Chalabian J, Dunnington G. Do Our Current Assessments Assure Competency in Clinical Breast Evaluation Skills? Am J Surg 1998;175:497-502.
Day SC, Grosso LJ, Norcini JJ, Blank LL, Swanson DB, Horne MH. Residents' Perception of Evaluation Procedures Used by Their Training Program. J Gen Intern Med 1990;5:421-6.
Duffy DF. Dialogue: The Core Clinical Skill. Ann Intern Med 1998;128:139-41.
Dupras DM, Li JTC. Use of an Objective Structured Clinical Examination to Determine Clinical Competence. Acad Med 1995;70:1029-34.
Eggly S, Afonso N, Rojas G, Baker M, Cardozo L, Robertson RS. An Assessment of Residents' Competence in Delivery of Bad News to Patients. Acad Med 1997;72:397-9.
Elliot DL, Hickam DH. Evaluation of Physical Examination Skills: Reliability of Faculty Observers and Patient Instructors. JAMA 1987;258:3405-8.
Fletcher RH, Fletcher SW. Has Medicine Outgrown Physical Diagnosis? Ann Intern Med 1992;117:786-7.
Fox RA, Clark CLI, Scotland AD, Dacre JE. A Study of Pre-registration House Officers' Clinical Skills. Med Educ 2000;34:1007-12.
Hawkins R, Gross R, Gliva-McConvey G, Haley H, Beuttel S, Holmboe E. Use of Standardized Patients for Teaching and Evaluating the Genitourinary Examination Skills of Internal Medicine Residents. Teach Learn Med 1998;10:65-8.
Herbers JE, Noel GL, Cooper GS, Harvey J, Pangaro LN, Weaver MJ. How Accurate Are Faculty Evaluations of Clinical Competence? J Gen Intern Med 1989;4:202-8.
Hilliard RI, Tallett SE. The Use of an Objective Structured Clinical Examination with Postgraduate Residents in Pediatrics. Arch Pediatr Adolesc Med 1998;152:74-8.
Holmboe ES, Hawkins RE. Methods for Evaluating the Clinical Competence of Residents in Internal Medicine: A Review. Ann Intern Med 1998;129:42-8.
Johnson JE, Carpenter JL. Medical House staff Performance in Physical Examination. Arch Intern Med 1986;146:937-41.
Johnston BT, Boohan M. Basic Clinical Skills: Don't Leave Teaching to the Teaching Hospitals. Med Educ 2000;34:692-9.
Joorabchi B, Devries JM. Evaluation of Clinical Competence: The Gap Between Expectation and Performance. Pediatrics 1996;97:179-84.
Kalet A, Earp JA, Kowlowitz V. How Well Do Faculty Evaluate the Interviewing Skills of Medical Students? J Gen Intern Med 1992;7:499-505.
Kern DC, Parrino TA, Korst DR. The Lasting Value of Clinical Skills. JAMA 1985;254:70-6.
Klass D, De Champlain A, Fletcher E, King A, Macmillan M. Development of a Performance-based Test of Clinical Skills for the United States Medical Licensing Examination. Federation Bulletin 1998;85:177-84.
Kroboth FJ, Kapoor W, Brown FH, Karpf M, Levey GS. A Comparative Trial of the Clinical Evaluation Exercise. Arch Intern Med 1985;145:1121-3.
Kroboth FJ, Hanusa BH, Parker S, Coulehan JL, Kapoor WN, Brown FH, Karpf M, Levey GS. The Inter-rater Reliability and Internal Consistency of a Clinical Evaluation Exercise. J Gen Intern Med 1992;7:174-9.
Lane JL, Gottleib RP. Structured Clinical Observations: A Method to Teach Clinical Skills with Limited Time and Financial Resources. Pediatrics 2000;105(4):__
Lee KC, Dunlop D, Dolan NC. Do Clinical Breast Examination Skills Improve during Medical School? Acad Med 1998;73:1013-9.
Li JTC. Assessment of Basic Examination Skills of Internal Medicine Residents. Acad Med 1994;69:296-9.
Mangione S, Peitzman SJ, Gracely E, Nieman LZ. Creation and Assessment of a Structured Review Course in Physical Diagnosis for Medicine Residents. J Gen Intern Med 1994;9:213-8.
Mangione S, Burdick WP, Peitzman SJ. Physical Diagnosis Skills of Physicians in Training: A Focused Assessment. Acad Emerg Med 1995;2:622-9.
Mangione S, Nieman LZ. Cardiac Auscultatory Skills of Internal Medicine and Family Practice Trainees: A Comparison of Diagnostic Proficiency. JAMA 1997;278:717-22.
Mangione S, Peitzman SJ. Revisiting Physical Diagnosis during the Medical Residency: It is Time for a Logbook – and More. Acad Med 1999;74:467-9.
Mangrulkar RS, Judge RD, Stern DT. A Multimedia CD-ROM Tool to Improve Residents' Cardiac Auscultation Skills. Acad Med 1999;74:572.
Marin JA, Reznick RK, Rothman A, Tamblyn RM, Regehr G. Who Should Rate Candidates in an Objective Structured Clinical Examination? Acad Med 1996;71:170-5.
Noel GL, Herbers JE, Caplow MP, Cooper GS, Pangaro LN, Harvey J. How Well Do Internal Medicine Faculty Members Evaluate the Clinical Skills of Residents? Ann Intern Med 1992;117:757-65.
Peterson MC, Holbrook JH, Hales DV, Smith NL, Staker LV. Contributions of the History, Physical Examination, and Laboratory Investigation in Making Medical Diagnoses. West J Med 1992;156:163-5.
Petrusa ER, Blackwell TA, Ainsworth MA. Reliability and Validity of an Objective Structured Clinical Examination for Assessing the Clinical Performance of Residents. Arch Intern Med 1990;150:573-7.
Pfeiffer C, Madray H, Ardolino A, Willms J. The Rise and Fall of Students' Skill in Obtaining a Medical History. Med Educ 1998;32:283-8.
Poenaru D, Morales D, Richards A, O'Connor M. Running an Objective Structured Clinical Examination on a Shoestring Budget. Am J Surg 1997;173:538-41.
Ramsey PG, Curtis R, Paauw DS, Carline JD, Wenrich MD. History-taking and Preventive Medicine Skills among Primary Care Physicians: An Assessment Using Standardized Patients. Am J Med 1998;104:152-8.
Remmen R, Derese A, Scherpbier A, Denekens, Hermann I, van der Vleuten C, Van Royen P, Bossaert L. Can Medical Schools Rely on Clerkships to Train Students in Basic Clinical Skills? Med Educ 1999;33:600-5.
Sachdeva AK, Loiacono LA, Amiel GE, Blair PG, Friedman M, Roslyn JJ. Variability in the Clinical Skills of Residents Entering Training Programs in Surgery. Surgery 1995;118:300-9.
Schechter GP, Blank LL, Godwin HA, LaCombe JA, Novack DH, Rosse WF. Refocusing on History-taking Skills During Internal Medicine Training. Am J Med 1996;101:210-6.
Schwartz RW, Donnelly MB, Sloan DA, Johnson SB, Strodel WE. The Relationship Between Faculty Ward Evaluations, OSCE and ABSITE as Measures of Surgical Intern Performance.
Sloan DA, Donnelly MB, Johnson SB, Schwartz RW, Strodel WE. Assessing Surgical Residents' and Medical Students' Interpersonal Skills. J Surg Res 1994;57:613-8.
Sloan DA, Donnelly MB, Schwartz RW, Strodel WE. The Objective Structured Clinical Examination: The New Gold Standard for Evaluating Postgraduate Clinical Performance. Ann Surg 1995;222:735-42.
Stillman PL, Swanson DB, Smee S, et al. Assessing Clinical Skills of Residents with Standardized Patients. Ann Intern Med 1986;105:762-71.
Stillman P, Swanson D, Regan MB. Assessment of Clinical Skills of Residents Utilizing Standardized Patients: A Follow-up Study and Recommendations for Application. Ann Intern Med 1991;114:393-401.
Stillman PL, Regan MB, Swanson DB, Case S, McCahan J, Feinblatt J, Smith SR, Willms J, Nelson DV. An Assessment of the Clinical Skills of Fourth Year Students at Four New England Medical Schools. Acad Med 1990;65:320-6.
Suchman A, Markakis K, Beckman HB, Frankel R. A Model of Empathic Communication in the Medical Interview. JAMA 1997;277:678-82.
Todd IK. A Thorough Pulmonary Exam and Other Myths. Acad Med 2000;75:50-1.
Turnbull J, Gray J, MacFadyen J. Improving In-Training Evaluation Programs. J Gen Intern Med 1998;13:317-23.
Van Thiel J, Kraan HF, van der Vleuten C. Reliability and Feasibility of Measuring Medical Interviewing Skills: The Revised Maastricht History-taking and Advice Checklist. Med Educ 1991;25:224-9.
Warf BC, Donnelly MB, Schwartz RW, Sloan DA. The Relative Contributions of Interpersonal and Specific Clinical Skills to the Perception of Global Clinical Competence. J Surg Res 1999;86:17-23.
Wiener S, Nathanson M. Physical Examination. Frequently Observed Errors. JAMA 1976;236:852-5.
Williamson PR, Smith RC, Kern DE, Lipkin M, Barker LR, Hoppe RB, Florek J. The Medical Interview and Psychosocial Aspects of Medicine. J Gen Intern Med 1992;7:235-42.
Woolliscroft JO, Stross JK, Silva J. Clinical Competence Certification: A Critical Appraisal. J Med Educ 1984;59:799-805.
Woolliscroft JO, Howell JD, Patel BP, Swanson DB. Resident-Patient Interactions: The Humanistic Qualities of Internal Medicine Residents Assessed by Patients, Attending Physicians, Program Supervisors, and Nurses. Acad Med 1994;69:216-24.
Wray NP, Friedland JA. Detection and Correction of House staff Error in Physical Diagnosis. JAMA 1983;249:1035-7.