III.2 Educational Evaluation
Mrs. Megha Gokhe, TSCER

EVALUATION TESTS

The essay test refers to any written test that requires an examinee to write a sentence, a paragraph or longer passages, and that demands a subjective judgement about its quality and completeness when it is scored.

THE ESSAY QUESTION
The distinctive features of the essay question are:
1) No single response can be considered the only correct answer;
2) The examinee is permitted freedom of response;
3) The answers vary;
4) The answer calls for a comprehensive explanation.

TYPES
1) Extended response
2) Restricted response

Overall, test questions can be divided into:
1. Essay Type (Recall Type)
   a. Discussion or extended response;
   b. Short answer or restricted response;
   c. Oral

For example:

In Geography
Q1. What is a phase of the moon?
Q2. What are the factors causing pollution of oceanic waters?
Q3. Describe volcanic activities.

In Economics
Q1. What are the main indicators of economic development?
Q2. Explain the major objectives of GATT.
Q3. Write the functions of the WTO.

In English
Q1. Write a brief 250-word composition on the following topic:
1. _________ if trees could talk.

In Science
Q1. Define the classification of organisms. List the kingdoms with examples.
Q2. Describe the structure of the human brain.

In History
Q1. What is the prehistoric period?
Q2. How was the Harappan civilization discovered?

OBJECTIVE TYPE TEST
It refers to any written test that requires the examinee to select the correct answer from among one or more of several alternatives, or to supply a word or two, and that demands an objective judgement when it is scored.

OBJECTIVE TYPE TEST ITEM
The most important criterion of an objective type test item is that it can be objectively scored: the scoring will not vary from examiner to examiner.

2.
Objective Type
A) Supply-type items
   a) Short answer:
      1) Single word, symbol, formula
      2) Multiple words or phrases
   b) Completion (fill in the blanks)
B) Selection-type items (Recognition type)
   a) Alternate response: true-false, yes-no, right-wrong;
   b) Multiple choice;
   c) Matching.
C) Context-dependent type items
   a) Pictorial
   b) Interpretative

Examples:
a. Which of the following is the formula for finding the number of subsets of a given set?
   A. 2^n   B. 2 + n   C. n^2   D. 2n
b. The number of subsets of the set {1, 2, 3} is:
   A. 3   B. 6   C. 8   D. 9

THE MULTIPLE CHOICE ITEM (THE MC ITEM)
The MC item consists of two parts:
1. the stem, which contains the problem;
2. the responses or options, i.e. a list of suggested answers.

QUESTION VARIETY
Stem: Which one of the following men invented the telephone?
Responses or options: a. Marconi b. Edison c. Bell d. Faraday e. Morse

INCOMPLETE VARIETY
Stem: The telephone was invented by _____.
Responses: a. Marconi b. Edison c. Bell d. Faraday e. Morse

Variations of the MC Format
1. Correct answer
2. Best answer
3. Analogy type
4. Numeric series type
5. Substitution type
6. Absurdities type
7. Reverse type

EDUCATIONAL EVALUATION PAPER 3.2
BLOOM'S TAXONOMY
Presented by Prof. Megha D. Gokhe

TAXONOMIES OF EDUCATIONAL OBJECTIVES
The taxonomy of educational objectives is basically a classification scheme.

PURPOSES
a) To establish the accuracy of communication.
b) To reduce vagueness.
c) To become a means of more precise communication.
d) To establish a common understanding.
e) To be a great help in clearly defining objectives.

Three-fold Division of Instructional Objectives
Cognitive - (the recall or recognition of knowledge and the development of intellectual abilities and skills).
Affective - (changes in interests and values, and the development of appreciations).
Psychomotor - (development of manipulative or motor skills).

Cognitive Domain
1) Knowledge
It involves the recall of specifics and universals, of methods and processes, or of a pattern, structure or setting.
a) Knowledge of terminology and facts.
b) Knowledge of conventions, trends and sequences, classifications and categories, criteria, methodology.

2) Comprehension
It represents the lowest level of understanding and includes:
a) Translation of facts, principles and theories.
b) Interpretation
c) Extrapolation

3) Application
The use of abstractions in particular and concrete situations. The abstraction may be in the form of general ideas, rules and theories.
a) Making generalisations from facts
b) Diagnosis of weaknesses
c) Application

4) Analysis
The breakdown of a communication into its constituent elements.
a) An analysis of elements
b) The establishment of relationships
c) The formulation of principles

5) Synthesis
The putting together of elements and parts so as to form a whole.
a) The production of a unique communication.
b) The suggestion of a new plan
c) The establishment and derivation of a set of abstract relations

6) Evaluation
This includes:
a) Quantitative and qualitative judgements about the value of material
b) Judgements in terms of internal evidence

Affective Domain
It includes those objectives which deal with attitudes, values, interests and appreciation.
1) Receiving (attending): This means that the learner should be sensitized to the existence of certain phenomena and stimuli.
a) Awareness of the stimulus or phenomenon.
b) Willingness in the learner to receive it.
c) Controlled and selective attention of the learner.
2) Responding: This is concerned with responses that go beyond merely attending to phenomena.
a) The learner's acquiescence in responding
b) The learner's willingness to respond
c) The learner's satisfaction in responding

3) Valuing: This objective concerns the worth of a thing, phenomenon or behaviour, which is the result of the individual's own valuing and assessment.
a) Acceptance of a value
b) Preference for a value
c) Commitment to, or conviction in regard to, a certain point of view.

4) Organization: For situations where more than one value is relevant, the need arises for:
a) The organization of the values into a system
b) The determination of the interrelationships among them
c) The establishment of the dominant value

5) Characterization by a value or value complex: At this level, the already existing values are organized into an internally consistent system and control the behaviour of the individual, who attains an integration of his beliefs and attitudes into a total philosophy.

Psychomotor Domain
"The psychomotor domain includes those objectives which deal with manual and motor skills."
1) Imitation of an action or performance.
2) Manipulation of an act. This includes differentiating among various movements and selecting the proper one.
3) Precision in reproducing a given act. This includes accuracy, proportion and exactness in performance.
4) Articulation among different acts. This includes co-ordination, sequence and harmony among acts.
5) Naturalisation: Here a pupil's skill attains its highest level of proficiency, performing an act with the least expenditure of psychic energy. The act becomes so automatic that it is attended to unconsciously.

Thus the behaviour of a child is governed by his development in three domains: cognitive, affective and psychomotor.

A GOOD MEASURING INSTRUMENT
Like a good tool or machine, a good measuring instrument must meet certain minimum requirements.
The essential characteristics and requirements of a good measuring instrument are: planning, validity, reliability, objectivity, discriminating power, adequacy, practicality, comparability and utility.

Planning
- Adequate planning
- Design of the instrument
- Different weightages to be given to the different aspects of the tool
- A proper blueprint

Reliability
Defined as:
- The degree of consistency among test scores.
- The degree of consistency with which the test measures what it does measure.
- A test score is called reliable when we have reasons for believing it to be stable and trustworthy.

There are many reasons why a pupil's test score may vary:
a) Trait instability: The characteristics we measure may change over a period of time.
b) Sampling error: The particular questions we ask in order to infer a person's knowledge may affect his score.
c) Administrative error: Any change in direction, timing or amount of rapport with the test administrator may cause score variability.
d) Scoring error: Inaccuracies in scoring a test paper will affect the scores.
e) Other factors: Such things as health, motivation, degree of fatigue of the pupil, and good or bad luck in guessing may cause score variability.

VALIDITY
Defined as the accuracy with which a test measures what is relevant. It should always be remembered that:
a) Validity is an inclusive term.
b) Validity is a matter of degree.
c) Validity is specific rather than general.

Types of Validity
1) Content validity
2) Concurrent validity
3) Predictive validity
4) Construct validity

Objectivity
A test is objective when the scorer's personal judgement does not affect the scoring. Objectivity in a test makes for the elimination of the biased opinion or judgement of the person who scores it. In an objective test, test items can readily be scored as right or wrong.
The true-false type, the alternate response type, the matching type and the multiple-choice type test items are highly objective, while essay type items are highly subjective.

The objectivity of a test can be increased by:
a) Using more objective type test items
b) Preparing a marking scheme or a scoring key
c) Setting realistic standards
d) Asking two independent examiners to evaluate the test and using the average of their two scores as the final score

Discriminating Power
The basic function of all educational measurement is to place individuals on a defined scale in accordance with differences in their achievements. Such a function implies a high discriminating power on the part of a test. Since tests are made up of separate items, it is clear that each item must have this quality to a maximum degree if the total test is to possess it. This quality of a test directly affects its validity.

Adequacy
A measuring instrument should be adequate, i.e. balanced and fair.
The highest mountain in the world is Mt. Everest.

Practicality
Practicality is an important criterion for assessing the value of a test; it depends upon a number of factors:
a) Ease of administration
b) Ease of scoring
c) Ease of interpretation
d) Economy

Comparability
A test possesses comparability when scores resulting from its use can be interpreted in terms of a common base that has a natural or accepted meaning. It requires:
1) Availability of equivalent forms of the test.
2) Availability of adequate norms.

Utility
A test possesses utility to the extent to which it satisfactorily serves a definite need in the situation in which it is used.

THAKUR SHYAMNARAYAN COLLEGE OF EDUCATION & RESEARCH
Prof. Megha Gokhe

Criterion-Referenced Test
A criterion-referenced test (CRT) is meant to measure the achievement of an examinee on a certain domain, to find out his level of achievement in that domain.
It has little to do with the achievement level of other examinees. In the words of Gronlund, N. E. (1985), a criterion-referenced test is "a test designed to provide a measure of performance that is interpretable in terms of a clearly defined and delimited domain of learning tasks."

Characteristics
1. Its main objective is to measure the student's achievement of curriculum-based skills. Performance may be reported as:
2. the number of correct items;
3. the percent of correct items;
4. a derived score based on correct items and other factors.

Uses
1. To identify the master learners and non-master learners in a class.
2. To find out the level of attainment of the various objectives of instruction.
3. To find out the level at which a particular concept has been learnt.
4. To achieve better placement of concepts at different grade levels.

Norm-Referenced Test
A norm-referenced test is used primarily for comparing the achievement of an examinee to that of a large representative group of examinees at the same grade level. The representative group is known as the "norm group". Bormuth (1970) writes that a norm-referenced test is designed "to measure the growth in a student's attainment and to compare his level of attainment with the levels reached by other students in the norm group."

Characteristics
1. Its basic objective is to measure the student's achievement of curriculum-based skills.
2. It is prepared for a particular grade level.
3. It is administered after instruction.
4. It is used for forming homogeneous or heterogeneous class groups.
5. It classifies achievement as above average, average, or below average for a given grade.

Uses
1. To get a reliable rank ordering of the pupils with respect to the achievement we are measuring;
2. To identify the pupils who have mastered the essentials of the course more than others;
3. To select the best of the applicants for a particular programme;
4.
To find out how effective a programme is in comparison to other possible programmes.

ANECDOTAL RECORDS
These are records of specific incidents: factual descriptions of important and meaningful events or behaviour of students on informal occasions.

Characteristics
- A factual description of what happened, when it happened, and under what circumstances the behaviour occurred.
- The interpretation and recommended action should be noted separately from the description.
- Each anecdotal record should contain a record of a single incident.
- The incident recorded should be one that is considered significant to the pupil's growth and development.

Advantages
- If properly used, they provide a factual record of an observation of a single, significant incident in the pupil's behaviour.
- They record critical incidents of spontaneous behaviour.
- They provide the teacher with objective descriptions.
- They are very useful for young children who are unable to take pencil-and-paper tests.
- They direct the teacher's attention to a single pupil.
- They provide for a cumulative record of growth and development.

Limitations
- They tend to be less reliable than other observational tools, as they tend to be less formal and systematic.
- They are time-consuming to write.
- It is difficult for the observer to maintain objectivity when recording the incident observed.
- Observers tend to record only undesirable incidents and neglect positive ones.

Making Anecdotal Records Effective
- Restrict observations to those aspects of behaviour which cannot be evaluated by other means.
- Concentrate on only one or two behaviours.
- Observation should be selective.

REVISED TAXONOMY
Background: From the late 1950s into the early 1970s, there were attempts in the US to dissect and classify the varied domains of human learning: cognitive (knowing, head), affective (feeling, heart) and psychomotor (doing, hand/body).
The resulting efforts yielded a series of taxonomies in each area. A taxonomy is really just a word for a form of classification. These taxonomies deal with the varied aspects of human learning and are arranged hierarchically, proceeding from the simplest functions to those that are more complex. The material below is a simple overview of the newer version of the cognitive domain. There are many valuable discussions of the development of the varied taxonomies and examples of their usefulness and application in teaching.

The Cognitive Domain: Below are the two primary existing taxonomies of cognition. The first, entitled Bloom's, is based on the original work of Benjamin Bloom and others as they attempted in 1956 to define the functions of thought, coming to know, or cognition. This taxonomy is over 50 years old. The second is the more recent adaptation: the redefined work of one of Bloom's former students, Lorin Anderson, working with one of his partners in the original work on cognition, David Krathwohl. It is labelled Anderson and Krathwohl. The group redefining Bloom's original concepts worked from 1995 to 2000. It was assembled by Anderson and Krathwohl and included people with expertise in the areas of cognitive psychology, curriculum and instruction, and educational testing, measurement, and assessment.
As you will see, the primary differences are not just in the listings or rewordings from nouns to verbs, in the renaming of some of the components, or even in the repositioning of the last two categories. The major difference in the updated version is the more useful and comprehensive addition of how the taxonomy intersects and acts upon different types and levels of knowledge: factual, conceptual, procedural and metacognitive.

Taxonomies of the Cognitive Domain

Bloom's Taxonomy (1956):

1. Knowledge: Remembering or retrieving previously learned material. Verbs that relate to this function: know, identify, relate, list, define, recall, memorize, repeat, record, name, recognize, acquire.

2. Comprehension: The ability to grasp or construct meaning from material. Verbs: restate, identify, illustrate, locate, discuss, interpret, report, describe, draw, recognize, review, represent, explain, infer, differentiate, express, conclude.

3. Application: The ability to use learned material, or to implement material in new and concrete situations. Verbs: apply, organize, practice, relate, employ, calculate, develop, restructure, show, translate, interpret, exhibit, use, demonstrate, dramatize, operate, illustrate.

4. Analysis: The ability to break down or distinguish the parts of material into its components so that its organizational structure may be better understood. Verbs: analyze, compare, probe, inquire, examine, contrast, categorize, differentiate, investigate, detect, survey, classify, deduce, experiment, scrutinize, discover, inspect, dissect, discriminate, separate.

5. Synthesis: The ability to put parts together to form a coherent or unique new whole. Verbs: compose, produce, design, assemble, create, prepare, predict, modify, plan, invent, formulate, collect, set up, generalize, document, combine, propose, develop, arrange, construct, organize, originate, derive, write, tell, relate.

6. Evaluation: The ability to judge, check, and even critique the value of material for a given purpose. Verbs: judge, assess, compare, evaluate, conclude, measure, deduce, argue, decide, choose, rate, select, estimate, validate, consider, appraise, value, criticize, infer.

Anderson and Krathwohl's Taxonomy (2000):

1. Remembering: Retrieving, recalling, or recognizing knowledge from memory. Remembering is when memory is used to produce definitions, facts, or lists, or to recite or retrieve material.

2. Understanding: Constructing meaning from different types of functions, be they written or graphic messages, or activities like interpreting, exemplifying, classifying, summarizing, inferring, comparing, and explaining.

3. Applying: Carrying out or using a procedure through executing or implementing. Applying relates and refers to situations where learned material is used through products like models, presentations, interviews or simulations.

4. Analyzing: Breaking material or concepts into parts, determining how the parts relate or interrelate to one another or to an overall structure or purpose. Mental actions included in this function are differentiating, organizing, and attributing, as well as being able to distinguish between the components or parts. When one is analyzing, he/she can illustrate this mental function by creating spreadsheets, surveys, charts, diagrams, or graphic representations.

5. Evaluating: Making judgments based on criteria and standards through checking and critiquing. Critiques, recommendations, and reports are some of the products that can be created to demonstrate the processes of evaluation. In the newer taxonomy, evaluating comes before creating, as it is often a necessary part of the precursory behaviour before creating something. Remember that this one has changed places with the last category of the original taxonomy.

6. Creating: Putting elements together to form a coherent or functional whole; reorganizing elements into a new pattern or structure through generating, planning, or producing. Creating requires users to put parts together in a new way, or synthesize parts into something new and different: a new form or product. This process is the most difficult mental function in the new taxonomy. This one used to be #5 in Bloom's, known as synthesis.

Table 1.1 Bloom vs. Anderson/Krathwohl: visual comparison of the two taxonomies

Bloom et al., 1956      Anderson & Krathwohl et al., 2000
Evaluation              Create
Synthesis               Evaluate
Analysis                Analyze
Application             Apply
Comprehension           Understand
Knowledge               Remember

One of the things that differentiates the new model from the 1956 original is that it lays out the components so they can be considered and used. And while the levels of knowledge were indicated in the original work (factual, conceptual, and procedural), these were never fully understood or used by teachers, because most of what educators were given in training consisted of a simple chart with the listing of levels and related accompanying verbs. The full breadth of Handbook I and its recommendations on types of knowledge were rarely discussed in any instructive way. Nor were teachers in training generally aware of any of the criticisms of the original model. The updated version has added "metacognitive" to the array of knowledge types.

Here are the intersections as the processes impact the levels of knowledge. Using a simple cross-impact grid or table like the one below, one can easily match activities and objectives to the types of knowledge and to the cognitive processes as well.

The Knowledge Dimensions vs. the Cognitive Processes:

                1. Remember  2. Understand  3. Apply  4. Analyze  5. Evaluate  6. Create
Factual
Conceptual
Procedural
Metacognitive

Knowledge Dimensions Defined:

Factual Knowledge is knowledge that is basic to specific disciplines. This dimension refers to essential facts, terminology, details or elements students must know or be familiar with in order to understand a discipline or solve a problem in it.

Conceptual Knowledge is knowledge of classifications, principles, generalizations, theories, models, or structures pertinent to a particular disciplinary area.

Procedural Knowledge refers to information or knowledge that helps students to do something specific to a discipline, subject, or area of study. It also refers to methods of inquiry, very specific or finite skills, algorithms, techniques, and particular methodologies.

Metacognitive Knowledge is the awareness of one's own cognition and particular cognitive processes. It is strategic or reflective knowledge about how to go about solving problems and cognitive tasks, including contextual and conditional knowledge and knowledge of self.

Source: Anderson, L. W., & Krathwohl, D. R., et al. (2000). A Taxonomy for Learning, Teaching, and Assessing: A Revision of Bloom's Taxonomy of Educational Objectives. Allyn & Bacon.

Rating Scales
Rating scales resemble check lists but are used when finer discriminations are required.

Uses of Rating Scales
1. They measure specified outcomes.
2. They evaluate procedures (playing an instrument, typing, cooking), products (typed letters, written themes), and personal-social development.
3. They help teachers to rate their students periodically on various characteristics, e.g. punctuality, enthusiasm, cheerfulness, co-operativeness.
4. They may be used by a pupil to rate himself.
5.
A teacher can make use of them for evaluating the effectiveness of an instructional procedure, teaching-learning strategy, tactics and aids.

Advantages of Rating Scales
- They can be used with a large number of students.
- They are very adaptable and flexible.
- They are efficient and economical.
- They are comprehensive in the amount of information they yield.

Types of Rating Scales
- Numerical
- Graphic
- Descriptive graphic
- Ranking

Numerical Rating Scale
This is one of the simplest types of rating scales.
- The rater simply marks a number that indicates the extent to which a characteristic or trait is present.
- The trait is presented as a statement with associated values.

Direction: Encircle the appropriate number showing the extent to which the pupil exhibits skill in questioning.
Key: 5 = outstanding; 4 = above average; 3 = average; 2 = below average; 1 = unsatisfactory.
Skill:
i. Questions were specific.
ii. Questions were relevant to the topic discussed.
iii. Questions were grammatically correct, etc.

Graphic Rating Scale
As in the case of the numerical rating scale, the rater is required to assign some value to a specific trait. This time, however, instead of using predetermined scale values, the ratings are made in graphic form: a position anywhere along a continuum.

Direction: Rate each characteristic listed below along the continuum from 1 to 5. You may mark a point between the scale values.
Were the illustrations used interesting?
1 (Too little)   2 (Little)   3 (Adequate)   4 (Much)   5 (Too much)

Descriptive Graphic Scale
This type of scale is generally the most desirable type to use.
Direction: As shown for the graphic rating scale.
While preparing a blackboard summary, how was the penmanship?
- Legible, beautiful, uniform size and slant
- Normally readable, good-looking, fluent motion
- Illegible, bad-looking, tends to draw outlines

Ranking
The rater, instead of assigning a numerical value to each student with regard to a characteristic, ranks a given set of individuals from high to low on the characteristic being rated.

Sources of Error in Rating Scales
- Ambiguity
- The personality of the rater
- The attitude of the rater
- Lack of opportunity for adequate observation

Check List
A check list consists of a listing of steps, activities or behaviours which the observer records when an incident occurs. A check list enables the observer to note only whether or not a trait or characteristic is present.

Advantages of Check Lists
- They are adaptable to most subject-matter areas.
- They are useful in evaluating those learning activities that involve a product, a process and some aspects of personal-social adjustment.
- They are most useful for evaluating those processes that can be subdivided into a series of clear, distinct, separate actions.
- They provide a simple method of recording observations.
- They objectively evaluate traits or characteristics.
- They may be used for evaluating the interests, attitudes and values of the learner.

Directions: Listed below are a series of characteristics related to health practices. Check those characteristics which are applicable to students.
Examples:
- Expresses each item in clear, simple language;
- Avoids negative statements wherever possible;
- Makes sure that each item is clearly true or false;
- Reviews the items independently.

Factors Influencing Reliability

Method
The method used in obtaining data on reliability affects the reliability coefficient, both of stability and of equivalence.

Interval
With any method involving two testing occasions, the longer the interval of time between the two test administrations, the lower the coefficient will tend to be.

Test Length
Adding equivalent items makes a test more reliable.
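The effect of test length on reliability can be quantified with the Spearman-Brown prophecy formula, a standard result in classical test theory (the formula itself is not given in these notes, so this is a supplementary sketch): if a test with reliability r is lengthened k times with equivalent items, the predicted reliability is kr / (1 + (k - 1)r).

```python
def spearman_brown(r_original, length_factor):
    """Predicted reliability when a test is lengthened `length_factor`
    times by adding equivalent items (Spearman-Brown prophecy formula)."""
    k, r = length_factor, r_original
    return (k * r) / (1 + (k - 1) * r)

# A 20-item test with reliability 0.60, doubled to 40 equivalent items:
doubled = spearman_brown(0.60, 2)    # rises toward 0.75
# Halving the same test (k = 0.5) lowers the predicted reliability:
halved = spearman_brown(0.60, 0.5)

print(f"doubled test: {doubled:.2f}")
print(f"halved test:  {halved:.2f}")
```

The formula assumes the added items are truly equivalent in content and difficulty, which is exactly the condition stated above ("adding equivalent items").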
Speed
A test is considered a pure speed test if everyone who reaches an item gets it right, but no one has time to finish all the items.

Group Homogeneity
A test is more reliable when applied to a group of students with a wide range of ability than to one with a narrow range of ability.

Difficulty of the Items
Tests that are too easy or too difficult for a group will tend to be less reliable, because the differences among the students on such tests are narrow.

Objectivity of Scoring
The more subjectively a measure is scored, the lower its reliability. Ordinarily, objective type tests are more reliable than subjective or essay type tests.

Ambiguous Wording of Items
When questions are interpreted in different ways at different times by the same students, the test becomes less reliable.

Inconsistency in Test Administration
Inconsistency in test administration, such as deviations in timing, procedure, instructions, etc., fluctuations in the interest and attention of the pupils, shifts in emotional attitude, etc., makes a test less reliable.

Optional Questions
If optional questions are given, the same students may not attempt the same items on a second administration; thereby the reliability of the test is reduced.

Factors Affecting Validity

Unclear Directions
Unclear: State whether the following statements are true or false. Give reasons.
Clear: State whether each of the following statements is true or false. If false, give reasons.

Reading Vocabulary
If the pupils' reading vocabulary is poor, they fail to respond to an item even if they know the answer.

Difficult Sentence Construction
If a sentence is so constructed as to be difficult to understand, students will be unnecessarily confused, which will affect the validity of the test.

Poorly Constructed Test Items
Poorly constructed test items reduce the validity of a test.
Use of Inappropriate Items
With objective type items, the pupil's power of organising matter cannot be judged.

Medium of Expression
English as the medium of instruction and response creates many serious problems for non-English medium students.

Difficulty Level of Items
Test items that are too easy or too difficult will not discriminate among pupils.

Influence of Extraneous Factors
Extraneous factors, like the style of expression, legibility, mechanics of grammar, handwriting, length of the answer, method of organising the matter, etc., influence the validity of a test.

Inappropriate Time Limit
In a speed test, if no time limit is given, the results will be invalidated.

Inadequate Coverage
Essay type items generally fail to cover a vast portion of the content. Inadequate sampling lowers validity.

Inadequate Weightage
Inadequate weightage to subtopics, objectives or the various forms of questions calls into question the validity of a test.

Guessing
Sometimes students, because of their inability to understand a test item, guess and respond.

Relation Between Validity and Reliability
i) Validity is sometimes defined as truthfulness, while reliability is sometimes defined as trustworthiness.
ii) In order that a test be valid, it must first of all be reliable; yet a measure might be very consistent but not accurate.
iii) Neither validity nor reliability is an all-or-nothing quality; each is a matter of degree.
iv) Since a single test may be used for many different purposes, there is no single validity index for a test.
v) Validity includes reliability. A classroom test should be both consistent and relevant; this combination of characteristics is called validity.
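Point ii) above, that a measure can be very consistent without being accurate, can be illustrated numerically. The sketch below uses wholly invented scores for eight pupils (not real data) and the Pearson correlation as a rough index: the correlation between two administrations of the same test estimates reliability, while the correlation with an external criterion estimates validity. The test shown is highly reliable yet has low validity.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical scores for eight pupils (invented for illustration):
first_administration  = [12, 15, 11, 18, 14, 10, 16, 13]
second_administration = [13, 15, 10, 18, 15, 11, 16, 12]
criterion_measure     = [48, 50, 52, 49, 47, 51, 50, 53]  # e.g. an external rating

reliability = pearson_r(first_administration, second_administration)
validity    = pearson_r(first_administration, criterion_measure)

print(f"reliability estimate: {reliability:.2f}")  # high: scores are consistent
print(f"validity estimate:    {validity:.2f}")     # low: scores miss the criterion
```

The test scores barely move between administrations (consistent, hence reliable), but they bear little relation to the criterion (not relevant, hence of low validity), which is exactly why reliability is a necessary but not sufficient condition for validity.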