How do I apply assessment theory to assessment practice?
Getting started
Orientation
- What are the main concepts in assessment theory?
- What are the different assessment methods?
- What is the difference between formative and summative assessment?
- What is the difference between reliability and validity in assessment?
- What are rubrics and how do I use them?
- What is the University policy on assessment?
Assessing Knowledge, Skills and Attitudes
Assessment is often focused on determining knowledge. However, there is more to being a competent health professional or scientist than having sound theoretical knowledge. Students and trainees must also demonstrate that they can apply their knowledge, ultimately being able to perform appropriately without supervision in the workplace. It is ability at this final stage that is most difficult to measure. Assessments that primarily test knowledge (e.g. examinations) are of limited value in predicting what a graduate will do as an independent practitioner. Students will also have to work with patients in a variety of settings, so it may be appropriate to teach professional attitudes and to assess students in this area.
Students may, therefore, be assessed on: what they know (the cognitive domain); the skills they have learned (the psychomotor domain); and the attitudes they have been taught (the affective domain). It is important to assess students at the appropriate level. For example, if the learning outcomes for a course specify that students will be able to recall basic facts, then the assessment must test their ability to recall basic facts. Likewise, if the learning outcomes specify that students will be able to perform a clinical skill expertly, then the assessment must test expert ability.
Knowledge of the different levels of learning in the three domains (cognitive, psychomotor and affective) can help to ensure that assessments test students at the appropriate level, i.e. that the assessment fits the levels of learning specified in the learning outcomes.
Cognitive Domain
The cognitive domain has to do with knowledge and understanding. Students can evidence different levels of learning and achievement in this domain. Remembering (factual recall) is considered to be the lowest level of learning; the ability to create new knowledge is considered to be the highest level of learning in this domain.
Bloom's taxonomy of the cognitive domain (highest level first):

| Level | Description (typical verbs) |
| --- | --- |
| Create | Reorganise elements into a new pattern, structure or purpose (generate, plan, produce, forecast, develop, invent, improve, prepare) |
| Evaluate | Come to a conclusion about something based on standards or criteria (check, critique, judge, conclude, appraise, prioritise, evaluate) |
| Analyse | Subdivide content into meaningful parts and relate the parts (differentiate, organise, attribute, inspect, categorise, contrast) |
| Apply | Use procedures to solve problems or complete tasks (execute, implement, translate, calculate, solve, demonstrate, adapt, practise, construct) |
| Understand | Construct new meaning by mixing new material with existing ideas (interpret, exemplify, classify, summarise, infer, compare, explain, paraphrase) |
| Remember | Retrieve pertinent facts from long-term memory (recognise, recall, describe, list, name, identify) |
Psychomotor Domain
Much of the assessment of clinically based students or trainees occurs in the clinical setting - workplace-based assessment - with judgements based on observation of performance in an effort to determine what a student can do. Dave's taxonomy for the psychomotor domain can be useful in determining the level of learning to be evidenced for psychomotor skills.
Dave's taxonomy of the psychomotor domain (highest level first):

| Level | Description |
| --- | --- |
| Naturalisation | Performing the skill automatically, with ease, at a consistently high level |
| Articulation | Coordinating a series of actions, achieving harmony and internal consistency |
| Precision | Refining, becoming more exact; few errors are apparent |
| Manipulation | Being able to perform certain actions by following instructions and practising |
| Imitation | Observing a demonstration and copying it (under close supervision of an instructor); performance may be of low quality |
Miller's Triangle is another useful concept for determining the types of assessment required in the assessment of competence. The triangle represents the different levels of learning that might be demonstrated by students. Notice, however, that the triangle does not specify a range of abilities within "knows how"; this is where the levels of learning in the psychomotor domain are useful.
Miller's Triangle represents four ascending levels of clinical ability: Knows, Knows How, Shows How and Does.
Miller, G. E. (1990). The assessment of clinical skills/competence/performance. Academic Medicine, 65(9), S63-67.
Affective Domain
Finally, we might engage students in instruction intended to change their attitudes. Instruction of this sort raises many questions about the attitudes we want our students to have, and about the ethics of teaching to change an attitude; we are not going to deal with those issues here. We are simply suggesting that, if the aim of instruction is attitudinal change, then there needs to be a way to assess that change.
Levels of learning in the affective domain (highest level first):

| Level | Description |
| --- | --- |
| Internalise or characterise values | Adopt a belief system or philosophy |
| Organise or conceptualise values | Reconcile internal conflicts and develop a value system |
| Value | Attach values and express personal opinions |
| Respond | React and actively participate |
| Receive | Be open to experience; be willing to hear |
Selecting Assessment Methods: What, When and How to Measure
Assessment methods include:
- Essays;
- Short-answer questions;
- Multiple-choice questions (single best answer and extended matching);
- Self-assessment / peer assessment;
- Learning portfolios;
- Case studies and projects;
- Structured oral vivas;
- Structured objective practical assessments e.g. OSCE;
- Direct observation of professional, technical or clinical practice;
- Case/record review.
Different assessment methods are appropriate for testing different attributes. Multiple-choice questions can test broadly across the curriculum and are useful for assessing factual recall and, potentially, problem solving. Although difficult to write, they have the advantage of high reliability and ease of marking. However, they can cue students to the correct response, and for that reason other forms of assessment, such as short-answer questions or essays, may be preferred. These written assessments assess at the "Knows" or "Knows How" level of Miller's Triangle.
Direct observation of practice should form the basis of judgements of workplace performance. A range of structured workplace-based assessment tools has been developed in an attempt to make this form of assessment reliable. Widely used methods include the mini clinical evaluation exercise (mini-CEX), direct observation of procedural skills (DOPS), objective structured assessment of technical skills (OSATS), and multi-source feedback.
Before choosing methods, ask yourself the following questions:
- What is the purpose of the assessment?
- What form of assessment will align with the intended learning outcomes?
- How will feedback on the assessment be provided to students?
Video: Fiona Kelly talks about using a peer assessment tool with her students.
Video: Wayne Hazel talks about assessment in the clinical setting.
Formative and Summative Assessment
Assessments can be either formative or summative.
- Formative assessment guides further learning, so it includes feedback on areas of strength and weakness. Because formative assessment provides information on how learning is proceeding, it can be used both to improve individual student learning and to improve teaching.
- Summative assessment usually occurs at the end of a course, and the results are used to grade students and to determine whether they have achieved the required competencies and standards.
Think of formative assessment as 'how am I doing?' and summative assessment as 'how did I do?'
Reliability and Validity
A reliable assessment is one in which the same students would consistently achieve the same results or in which different teachers would grade in the same way. What we mean here is that the assessment has been written in such a way that expectations of what is required (assessment criteria) are clear, marking schemes have been defined and questions are unambiguous. When we achieve these aims, we remove the possibility that a feature of the assessment itself might produce an inconsistent result.
In practice, reliability can be improved by:
- Ensuring that instructions and questions are clear and unambiguous;
- Ensuring the time allowed for the assessment is appropriate;
- Ensuring that the number of assessment items is sufficient to give a reliable measure of a student's ability;
- Developing a marking scheme with explicit criteria; and
- Checking marks (moderation), double marking (two assessors), re-marking a subset (sampling), or carrying out multiple assessments over time, especially if more than one marker is involved and/or the assessment method entails a degree of subjectivity (e.g. an essay) or professional judgement (e.g. a complex clinical or workplace competence); a simple agreement check for double marking is sketched below.
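Where two assessors double mark, one simple consistency check is a chance-corrected agreement statistic such as Cohen's kappa. The following is a minimal sketch, not part of the original guidance; the markers, grades and scripts are entirely hypothetical.

```python
# Minimal sketch: chance-corrected agreement (Cohen's kappa) between
# two markers who have double-marked the same eight scripts.
# The grade data below are hypothetical.
from collections import Counter

def cohens_kappa(marks_a, marks_b):
    """Agreement between two markers, corrected for chance agreement."""
    n = len(marks_a)
    observed = sum(a == b for a, b in zip(marks_a, marks_b)) / n
    freq_a, freq_b = Counter(marks_a), Counter(marks_b)
    expected = sum(freq_a[g] * freq_b[g] for g in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

marker_1 = ["A", "B", "B", "C", "A", "B", "C", "B"]
marker_2 = ["A", "B", "C", "C", "A", "B", "B", "B"]
print(f"kappa = {cohens_kappa(marker_1, marker_2):.2f}")  # kappa = 0.60
```

A kappa close to 1 suggests the marking scheme is being applied consistently; a low value is a prompt to clarify the criteria or re-mark a sample.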
A valid assessment is an assessment that measures what it is meant to measure. For example, if we have a learning outcome such as, "Students will be able to critically analyse the evidence on community immunisation programmes" then an assessment that measures factual recall is not appropriate.
In practice, the following points help to ensure validity:
- Include a representative sample of the course content (content validity);
- Measure progress toward the intended outcome of the course, for example the attributes/knowledge that the student should be developing (construct validity, the construct being 'what it is to be a good nurse/doctor' etc.); and
- Have sufficient predictive ability to determine how well the student will transfer her/his knowledge into the workplace (criterion validity: the criteria provide a measure of how the student may be expected to perform outside the relative safety of the teaching environment); a simple estimate of this is sketched below.
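One common way to estimate criterion validity is to correlate assessment scores with a later measure of workplace performance. The sketch below is illustrative only: the paired data are made up, and it uses the Pearson correlation from Python's standard statistics module (available from Python 3.10).

```python
# Minimal sketch: criterion (predictive) validity estimated as the
# correlation between assessment scores and later workplace ratings.
# All data below are hypothetical.
import statistics

exam_scores       = [62, 75, 58, 81, 70, 66]
workplace_ratings = [3.1, 4.0, 2.8, 4.4, 3.6, 3.3]

# Pearson's r: values near 1 suggest the assessment predicts workplace
# performance well; values near 0 suggest it does not.
r = statistics.correlation(exam_scores, workplace_ratings)
print(f"Estimated criterion validity: r = {r:.2f}")
```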
A teacher might review their assessment practices for a number of reasons, e.g. consistently poor learning outcomes. The review may lead to a revision of the assessment and/or to changes in the teaching and learning activities.
Reliability and validity of direct observation of student performance can be improved by using a structured set of criteria, with clear descriptions for each of these, as well as a description of the expected standard for each performance.
Standard setting
A standard is a score or a criterion that acts as a boundary, such as a 'pass/fail' mark. Educational achievement generally exists on a continuum and is influenced by the type of assessment, time, location and other factors, so a degree of judgement is required.
There are two types of standard:
- Relative standards compare those being assessed; for example, scholarships may be awarded to the 10 applicants with the highest grades, regardless of what those grades are. We call this norm-referenced assessment. It is useful when a set number or percentage of students is to be selected (e.g. to award scholarships or to select students for entry to a programme), but it is less helpful for the assessment of health professionals, as it can lead to competition between learners.
- Absolute standards compare the grade of the student against a set criterion; for example, a student may need to answer 50% of the examination questions correctly in order to pass, regardless of how many other students achieve 50% or more. We call this criterion-based assessment. It is the most common form of assessment in health professions' education; it assumes simply that students who meet the criteria pass the assessment (a short sketch contrasting the two types follows this list).
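As an illustration only, here is a minimal sketch contrasting the two kinds of standard; the student names, scores, cut score and quota are all hypothetical.

```python
# Minimal sketch: absolute (criterion-based) vs relative (norm-referenced)
# standards applied to the same hypothetical scores.
scores = {"Aroha": 78, "Ben": 64, "Chen": 49, "Dana": 55, "Eli": 71}

# Absolute standard: pass everyone at or above a fixed cut score.
CUT_SCORE = 50
passed = [name for name, s in scores.items() if s >= CUT_SCORE]

# Relative standard: select the top N candidates, whatever their scores.
QUOTA = 2
selected = sorted(scores, key=scores.get, reverse=True)[:QUOTA]

print("Criterion-based pass list:", passed)    # all who reached 50
print("Norm-referenced selection:", selected)  # top 2 only
```

Note that under the relative standard a student may score well above the cut score yet still not be selected, which is why norm-referencing suits selection decisions rather than competence decisions.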
The validity of standards is dependent upon:
- The method used; for example, the method must be appropriate for the purpose, and easily explained and implemented; and
- The standard-setters, who must understand the purpose of the test and know the course content.
Rubrics
A rubric is a set of rules or instructions that can be attached to a formative or summative assessment task. A rubric lists the criteria associated with the various grades for a piece of work. The criteria for each grade describe the characteristics and qualities needed to complete the particular task or assignment to the standard for that grade.
For example, the criteria for achieving an A grade might make reference to "evidence of critical thinking" and "evidence of independent research that led to insight". The criteria for achieving a C grade might make reference to "clear and logical presentation" and "use of the resources provided as part of the course". The rubric makes clear that more is required to achieve an A (see also Krathwohl on Bloom's Revised Taxonomy in Taking it further).
Rubrics are important for two main reasons:
- A rubric makes the criteria for marking explicit. The educator - and anyone else involved in the assessment (a second marker, for example) - can clearly see the criteria associated with the different grades.
- Students can be guided by the rubric as they complete the assignment. The rubric will help students to understand what is expected of them and it will help to communicate high expectations to the students; a simple sketch treating a rubric's levels as data follows this list.
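The following minimal sketch treats a rubric's levels as data so that the grade boundaries are explicit for markers and students alike. It loosely mirrors the portfolio rubric shown further below, but the criterion names and scores are invented for illustration.

```python
# Minimal sketch: a rubric's levels as data. The boundaries mirror the
# portfolio rubric below (5 Distinction, 4 Good Pass, 3 Pass, 0-2 Fail);
# the criteria and scores are hypothetical.
LEVELS = [(5, "Distinction"), (4, "Good Pass"), (3, "Pass"), (0, "Fail")]

def level_for(score):
    """Return the named level for a numeric criterion score."""
    for cutoff, label in LEVELS:
        if score >= cutoff:
            return label

scores = {
    "Frequency of participation": 4,
    "Interactive participation": 5,
    "Use of reference material": 3,
}
for criterion, score in scores.items():
    print(f"{criterion}: {score}/5 -> {level_for(score)}")
```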
There is an excellent website that will provide you with a full explanation of rubrics together with templates for creating your own rubrics.
The following rubric, from an Obstetrics and Gynaecology course, makes evident the criteria against which work submitted for a portfolio is judged.
Part 1 - Participation and contribution to online discussion

| Criteria | 5 Distinction | 4 Good Pass | 3 Pass | 0-2 Fail |
| --- | --- | --- | --- | --- |
| Frequency of discussion board participation | Regular, frequent participation. | Acceptable to regular participation - presence in well over half of discussion topics. | Contributes to at least half of discussion topics - acceptable level of participation. | Student only posts 1-2 times throughout course. |
| Interactive participation with group | Responds to and contributes further information to posts from others. Critically evaluates others' opinions and formulates ongoing research questions. | Interacts with others frequently but does not add critical evaluation to body of knowledge. Some support for opinions offered. | Occasional responding and interaction but no additional clarification presented in posts. Will agree or disagree when prompted. | No responding or interaction with posts from other members. |
| Draws on reference material in contributions | Posts are referenced. Suggests other reference and resource material. Points out gaps in the literature. | All comments referenced but no addition of readings. | References main points. | No referencing of posts. |
| Evidence of facilitative style of participation | Contributes posts which stimulate relevant, ongoing discussion and acknowledges interdisciplinary needs in the area. | Enables others to contribute on at least two occasions. | At least one other interaction from other students. | Posts too infrequently to enable interaction from other students. |

Part 2 - Application of knowledge to professional practice

| Criteria | 5 Distinction | 4 Good Pass | 3 Pass | 0-2 Fail |
| --- | --- | --- | --- | --- |
| Provides examples of application of knowledge within clinical practice | Evidence of critical application of knowledge to clinical case work. Clearly justifies choice of practice. Clear rationale for clinical decision making. Demonstrates interdisciplinary approach. | Examples of application of knowledge given and correctly applied to the clinical setting. Specific reference to clinical case work. | Few examples of knowledge application provided but little demonstration of appropriate content knowledge. | No example of knowledge application provided. |
| Provides examples of change in clinical care | Clear evidence of application of learned course work to present and future changes for clinical care. Articulate presentation of how to set in place changes/new protocols etc. Clarification of how this will improve outcomes for patients/clients. | Good discussion of potential changes in clinical practice and their implementation but no clarification of how this may benefit patients/clients. | Some suggestions made regarding changes in practice but no discussion of how changes will be implemented. | Not able to demonstrate how course work may affect present or future clinical care. |
University of Auckland Policy
University of Auckland, Assessment of Student Learning Policy
You can evaluate your assessment practices against the University of Auckland policy on assessing student learning.
University of Auckland, Te Reo Māori in Teaching, Learning and Assessment Policy
This policy details the commitment of the University to recognising and promoting Te Reo Māori as an official language of New Zealand, and its use in the teaching, learning and assessment activities of the University.
Action
Satisfactory performance in Assessing Learning and Providing Feedback might be evidenced by applying assessment theory to the design and implementation of assessments, and by showing that you manage assessment according to University policy. This section has covered both of these criteria.
- If you've worked through this section on assessment theory, you may be ready to start an ePortfolio record evidencing the ways in which you have put the concepts in this section into practice in your assessment.
Taking it further
Armstrong, R. J. (Ed.). (1970). Developing and Writing Behavioural Objectives. Tucson, Arizona: Educational Innovators Press.
Dave's taxonomy for the psychomotor domain was first presented at a conference in Berlin in 1967 and is reported in this book.
Epstein, R. M. (2007). Assessment in Medical Education. The New England Journal of Medicine, 356(4), 387-396.
This article provides a conceptual framework for and a brief update on commonly used and emerging methods of assessment. The article also discusses the strengths and limitations of each method, and identifies several challenges in the assessment of physicians’ professional competence and performance.
Hamdy, H. (2006). Blueprinting for the Assessment of Health Care Professionals. The Clinical Teacher, 3(3), 175-179.
This article discusses the "blueprint approach" to assessment construction. This approach indicates that a process of assessment needs to be conducted according to a replicable plan. This fundamental procedure, as a precursor to test construction and item choice, ensures that test content is mapped carefully against learning objectives to produce a ‘valid examination’.
Krathwohl, D. R. (2002). A Revision of Bloom's Taxonomy: An Overview. Theory into Practice, 41(4), 212-218.
The taxonomy of educational objectives is a framework for classifying statements of what we expect or intend students to learn as a result of instruction. The framework was conceived as a means of facilitating the exchange of test items among faculty at various universities in order to create banks of items, each measuring the same educational objective. This article looks at Bloom's original framework and presents and explains a revised framework.
Leinster, S. J. (2009). Workplace-Based Assessment as an Educational Tool: Guide Supplement 31.2 - Viewpoint. Medical Teacher, 31(11), 1032.
This guide provides a starting point for addressing the challenges of conducting workplace-based assessment and setting up effective feedback systems in clinical settings. It concentrates on using workplace-based assessment as a formative tool. It does not address the additional problems involved in obtaining consistency of judgement between raters and cases across geographically disparate sites, which is the main weakness in using workplace-based assessment for summative decisions.
Miller, G. E. (1990). The assessment of clinical skills/competence/performance. Academic Medicine, 65(9), S63-S67.
This is the article with Miller's Triangle for the assessment of clinical practice.
Norcini, J., & Burch, V. (2007). Workplace-Based Assessment as an Educational Tool: AMEE Guide No. 31. Medical Teacher, 29(9), 855-871.
There has been concern that trainees are seldom observed, assessed, and given feedback during their workplace-based education. This has led to increasing interest in a variety of formative assessment methods that require observation and offer the opportunity for feedback. This article reviews some of the literature on the efficacy and prevalence of formative feedback, describes the common formative assessment methods, characterises the nature of feedback, examines the effect of faculty development on its quality, and summarises the challenges still faced.
Workplace Based Assessment, The London Deanery
This is an online module that provides theoretical and practical advice on carrying out workplace based assessments.
A teaching philosophy can help you to reflect on how and why you teach. If you don't have a teaching philosophy, you might want to consider writing one. You can take a look at What makes a good teacher? to get started. If you already have a teaching philosophy, you might want to reflect on how the work that you are doing here fits with that philosophy.