Assessing PBL
Why Paper Tests Are Not Enough
- PBL involves all three learning domains
- Skills cannot be assessed on paper
- Dispositions cannot be assessed on paper
- Group collaboration cannot be assessed on paper
- Performance must be observed
Three Domains to Assess
Cognitive (knowledge)
- Concept maps
- Written and oral responses
- Traditional tests (when appropriate)
Psychomotor (skills)
- Written and oral responses
- Performance ratings
- Self-ratings
- Observation rubrics
Affective (dispositions)
- Observation
- Display of work
- Reflection writing
- Self-assessment
Tools for PBL Assessment
- Scoring rubrics
- Checklists
- Rating scales
- Performance assessments
- Portfolios
- Self-assessment forms
- Peer assessment forms
PBL assessment is fundamentally different from assessing lecture-based teaching. A teacher who relies only on traditional assessment for PBL misses much of what PBL produces.
A teacher who masters PBL assessment captures what students actually learned. A teacher who does not may give grades that do not reflect actual learning.
Why paper-pencil tests fall short
Paper tests measure some things well. They can measure recall of facts. They can measure comprehension of concepts. They can measure some application.
What they cannot measure:
- Skill performance. Does the student actually do the skill, or just describe it?
- Collaboration ability. Can they work effectively with others?
- Process quality. Did they go through the steps properly, or just produce the right answer?
- Dispositions. Did they develop scientific attitude, problem-solving mindset, learner agency?
- Real-world application. Can they transfer the learning to new situations?
PBL builds these things. Paper tests do not measure them. Assessment must match what the method produces.
PBL produces artifacts. These can be assessed through performance methods rather than paper tests alone.
Three domains of learning
Cognitive domain. Knowledge and intellectual skills.
Psychomotor domain. Physical skills and procedures.
Affective domain. Attitudes, values, dispositions.
PBL touches all three. Real PBL assessment must cover all three.
Assessing the cognitive domain
The cognitive domain is the most familiar.
Concept maps
Students draw concept maps showing how ideas connect. This reveals their understanding of relationships, not just isolated facts.
A good concept map reveals:
- What concepts the student knows.
- How they connect those concepts.
- What hierarchy they perceive.
- What gaps exist.
Concept maps are powerful but underused in many classrooms. They suit PBL well.
Written and oral responses
Students answer questions in writing or speech. The questions can be:
- Traditional. “What is X? Why does Y happen?”
- Application. “How would you use this concept in a new situation?”
- Analysis. “Compare your conclusion to the textbook explanation.”
- Synthesis. “What new question does this raise?”
- Evaluation. “How confident are you in your conclusion, and why?”
Higher-order questions test more than recall. They show how deeply students understood.
Traditional tests
Even in PBL, traditional tests have a place. They can verify factual knowledge that the inquiry produced.
But traditional tests should be one part of assessment, not the whole. Paper tests alone are insufficient.
Assessing the psychomotor domain
Skills require performance assessment, not just description.
Written and oral responses about skills
Students can describe what they did. This shows they understood the skill conceptually. But it does not show they can do the skill.
A student who can describe how to use a microscope may or may not be able to use one. A student who can describe how to balance an equation may or may not be able to balance one.
Written/oral responses are useful but not sufficient for psychomotor assessment.
Performance ratings
The teacher (or another evaluator) rates the student’s actual performance. The rating uses specific criteria.
For a microscopy task:
- Setup (1-5): Did the student prepare the slide correctly?
- Focus (1-5): Did they focus the microscope appropriately?
- Observation (1-5): Did they observe carefully and record findings?
- Cleanup (1-5): Did they handle equipment properly afterward?
The rating produces a numerical score. The score reflects actual performance rather than description.
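As a minimal sketch of how the four 1-5 ratings above might be totaled, the Python below validates each rating and sums them. The function name `score_performance` and the dictionary layout are illustrative assumptions, not part of any standard tool.

```python
# Hypothetical sketch: criterion names follow the microscopy example above.
MICROSCOPY_CRITERIA = ("setup", "focus", "observation", "cleanup")


def score_performance(ratings, criteria=MICROSCOPY_CRITERIA, low=1, high=5):
    """Check that every criterion has a rating in range, then return (total, maximum)."""
    missing = [name for name in criteria if name not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    for name in criteria:
        value = ratings[name]
        if not low <= value <= high:
            raise ValueError(f"{name} rating {value} is outside {low}-{high}")
    total = sum(ratings[name] for name in criteria)
    return total, high * len(criteria)


total, maximum = score_performance(
    {"setup": 4, "focus": 3, "observation": 5, "cleanup": 4}
)
print(f"{total}/{maximum}")  # 16/20
```

Reporting the per-criterion ratings alongside the total keeps the feedback specific, since a 16/20 alone does not say which step was weak.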
Self-ratings
Students rate their own performance using the same criteria.
Self-rating builds metacognitive skills. Students think about their own performance. They identify strengths and weaknesses.
Self-ratings should be compared to teacher ratings. Discrepancies are educational. A student who rates themselves higher than the teacher does may need to develop better self-awareness. A student who rates themselves lower may need confidence-building.
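The comparison described above can be sketched in a few lines of Python. The helper name `compare_ratings` and the threshold of 2 points are assumptions chosen for illustration; a teacher would pick whatever gap seems worth a conversation.

```python
def compare_ratings(self_ratings, teacher_ratings, threshold=2):
    """Flag criteria where the self-rating and teacher rating differ by `threshold` or more."""
    flags = {}
    for criterion, teacher_score in teacher_ratings.items():
        gap = self_ratings[criterion] - teacher_score
        if gap >= threshold:
            flags[criterion] = "self-rating higher than teacher's"
        elif gap <= -threshold:
            flags[criterion] = "self-rating lower than teacher's"
    return flags


# Illustrative ratings on the 1-5 microscopy criteria.
student = {"setup": 5, "focus": 4, "observation": 2, "cleanup": 4}
teacher = {"setup": 3, "focus": 4, "observation": 4, "cleanup": 4}

for criterion, note in compare_ratings(student, teacher).items():
    print(f"{criterion}: {note}")
# setup: self-rating higher than teacher's
# observation: self-rating lower than teacher's
```

The flags are prompts for discussion, not verdicts: they only show where the two views of the same performance diverge.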
Observation rubrics
The teacher uses a rubric to evaluate observable behavior. The rubric has specific criteria with descriptors at different levels.
For a collaboration rubric:
| Criterion | Excellent (4) | Good (3) | Adequate (2) | Needs Work (1) |
|---|---|---|---|---|
| Listening | Listens fully and engages | Listens most of the time | Sometimes listens | Rarely listens |
| Contributing | Contributes thoughtful ideas | Contributes when asked | Contributes minimally | Does not contribute |
| Supporting | Encourages and supports peers | Supports peers sometimes | Neutral | Discourages peers |
| Resolving conflict | Mediates well | Handles conflict adequately | Avoids conflict | Creates conflict |
The teacher observes during PBL and rates each criterion. The total score, out of 16 for this rubric, summarizes collaboration skill.
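One way to turn the observed rubric levels into a total is sketched below in Python. The level labels and point values come from the table above; the names `LEVEL_POINTS` and `rubric_total` are hypothetical.

```python
# Level labels and points as in the collaboration rubric above.
LEVEL_POINTS = {"Excellent": 4, "Good": 3, "Adequate": 2, "Needs Work": 1}
COLLABORATION_CRITERIA = ("Listening", "Contributing", "Supporting", "Resolving conflict")


def rubric_total(observed_levels):
    """Convert the level observed for each criterion to points and sum them."""
    points = {name: LEVEL_POINTS[observed_levels[name]] for name in COLLABORATION_CRITERIA}
    maximum = max(LEVEL_POINTS.values()) * len(COLLABORATION_CRITERIA)
    return points, sum(points.values()), maximum


points, total, maximum = rubric_total({
    "Listening": "Excellent",
    "Contributing": "Good",
    "Supporting": "Good",
    "Resolving conflict": "Adequate",
})
print(f"{total}/{maximum}")  # 12/16
```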
Assessing the affective domain
Dispositions are the hardest to assess. They involve attitudes, values, and habitual behaviors.
Observation
Affective qualities show in behavior. A teacher who watches carefully sees:
- Curiosity (does the student ask questions?)
- Persistence (do they continue when stuck?)
- Open-mindedness (do they consider alternative views?)
- Skepticism (do they question evidence?)
- Honesty (do they report honestly even when uncomfortable?)
- Cooperation (do they work well with peers?)
Observation captures these. The teacher can use a rubric or just take notes during the unit.
Display of work
What students choose to display says something about their dispositions.
A student who proudly displays incomplete or sloppy work may not value quality. A student who agonizes over the display shows attention to quality.
A student who displays only their successes may avoid showing failures. A student who displays both successes and failures shows openness about learning.
These are dispositional cues.
Reflection writing
Students write about their own thinking and feeling during the project.
Questions for reflection:
- What did I learn?
- What surprised me?
- What did I find difficult?
- How did I respond to difficulties?
- What would I do differently next time?
- How did I feel about the project?
- What disposition (curiosity, persistence, etc.) did I rely on?
Honest reflection reveals dispositions. A student who reflects superficially shows less developed metacognition. A student who reflects deeply shows strong metacognitive development.
Self-assessment
Students assess their own dispositions.
A self-assessment form might ask:
- How curious was I during this project? (1-5)
- How persistent was I when stuck? (1-5)
- How open was I to peer ideas? (1-5)
- How honest was I in reporting findings? (1-5)
- How well did I work with my group? (1-5)
Self-assessment is imperfect. Students may overrate themselves. But the act of self-assessing builds awareness, which is itself an affective development.
A debate-related example
Debating involves:
- Cognitive skills (constructing arguments).
- Psychomotor skills (delivering speech, using gestures).
- Affective qualities (composure under pressure, respect for opponents).
Written debate reveals cognitive skills. It does not reveal speaking skills. It does not reveal composure.
To assess all three, the teacher must observe a live debate, using criteria for delivery, response under pressure, and treatment of opponents.
The same logic applies to PBL. Some aspects can be assessed on paper. Others require observation.
Specific tools
Scoring rubrics
A rubric specifies criteria and levels of performance. The teacher rates each criterion using the rubric. The result is a structured, transparent assessment.
Rubrics work well for performance assessment. They can be developed for any task.
Checklists
A checklist lists specific things students should have done. The teacher checks off what was done.
Checklists are simpler than rubrics. They are useful for tasks with clear yes/no criteria.
Rating scales
A rating scale gives a numerical rating for each criterion. It is similar to a rubric but often simpler, because it usually omits the written descriptors for each level.
Rating scales suit many performance assessments.
Performance assessments
Any assessment that involves observing actual performance is a performance assessment. Lab experiments, presentations, debates, and PBL artifacts all qualify.
Performance assessments are essential for PBL. They capture what tests cannot.
Portfolios
A portfolio is a collection of student work over time. It shows growth, not just the current state.
For PBL, a portfolio might include:
- Initial problem statement.
- Research notes.
- Drafts of analyses.
- Final artifacts.
- Reflection writing.
- Self-assessment.
The portfolio gives a fuller picture than any single artifact. It shows the process alongside the product.
Self-assessment forms
Students rate themselves on specific criteria.
Peer assessment forms
Students rate each other. This adds perspectives.
Peer assessment requires careful structure to avoid favoritism or bias. Specific criteria, anonymous ratings, and aggregation across multiple peers all help.
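A minimal Python sketch of anonymous aggregation follows. It assumes ratings arrive as `(rater, ratee, score)` tuples and uses the median so that one unusually harsh or generous peer has less influence. Both the data layout and the choice of median are illustrative assumptions.

```python
from statistics import median


def aggregate_peer_ratings(ratings):
    """Return {ratee: median score}, dropping rater identity and any self-ratings."""
    scores_by_ratee = {}
    for rater, ratee, score in ratings:
        if rater == ratee:
            continue  # a student's rating of themselves is not peer assessment
        scores_by_ratee.setdefault(ratee, []).append(score)
    return {ratee: median(scores) for ratee, scores in scores_by_ratee.items()}


# Illustrative 1-5 ratings of group contribution.
ratings = [
    ("A", "B", 4), ("C", "B", 5), ("D", "B", 2), ("B", "B", 5),
    ("A", "C", 3), ("B", "C", 4),
]
print(aggregate_peer_ratings(ratings))  # {'B': 4, 'C': 3.5}
```

Only the aggregated score per student is returned, so the result can be shared without revealing who gave which rating.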
Putting it together: a PBL assessment plan
A teacher planning PBL assessment might design:
Cognitive assessment:
- A short paper test on key concepts (10% of grade).
- A concept map showing how concepts connect (15% of grade).
- Written analysis of the problem (15% of grade).
Psychomotor assessment:
- Performance rubric for investigation (15% of grade).
- Performance rubric for the artifact created (20% of grade).
- Self-rating on skills used (5% of grade).
Affective assessment:
- Observation rubric on collaboration (10% of grade).
- Reflection writing (5% of grade).
- Peer assessment of group contribution (5% of grade).
Total: 100%. Multiple methods. Three domains covered.
This plan captures what PBL produces. A teacher who designs such a plan assesses fairly. A teacher who uses only one or two methods misses much.
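The arithmetic behind such a plan can be sketched in Python. The weights are the percentages listed above; the component keys, the `final_grade` helper, and the sample scores are hypothetical.

```python
# Weights (percent of grade) from the plan above, grouped by domain.
PLAN = {
    # Cognitive: 40%
    "paper_test": 10, "concept_map": 15, "written_analysis": 15,
    # Psychomotor: 40%
    "investigation_rubric": 15, "artifact_rubric": 20, "skills_self_rating": 5,
    # Affective: 20%
    "collaboration_rubric": 10, "reflection_writing": 5, "peer_assessment": 5,
}
assert sum(PLAN.values()) == 100  # the weights must account for the whole grade


def final_grade(scores):
    """Combine component scores (each 0-100) into one weighted grade out of 100."""
    return sum(weight * scores[name] for name, weight in PLAN.items()) / 100


# Illustrative scores for one student, each expressed as a percentage.
scores = {
    "paper_test": 80, "concept_map": 90, "written_analysis": 70,
    "investigation_rubric": 85, "artifact_rubric": 90, "skills_self_rating": 100,
    "collaboration_rubric": 75, "reflection_writing": 80, "peer_assessment": 60,
}
print(final_grade(scores))  # 82.25
```

Each component is first converted to a percentage, so a 16/20 rubric score would enter as 80 before weighting.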
The challenge of fairness
PBL assessment is fairer than paper tests in some ways but less fair in others.
Fairer: Assesses what students actually learned, including skills and dispositions paper tests miss.
Less fair: Subjective judgment is involved. Different teachers may rate the same performance differently.
This problem is often left unaddressed, but it matters. To make PBL assessment fair:
1. Use clear rubrics. Specific criteria reduce subjectivity.
2. Use multiple raters. When possible, have two teachers rate. Discuss differences.
3. Be transparent. Show students the rubric in advance. They know what is being assessed.
4. Allow self-correction. If a student receives a poor rating, give them a chance to improve and be reassessed.
5. Aggregate methods. No single method is perfect. Combining multiple methods produces fairer overall grades.
A teacher who attends to fairness produces respected assessment. A teacher who relies on intuition without structure produces complaints and challenges.
Cognitive (knowledge), psychomotor (skills), affective (dispositions)
Cognitive: concept maps, written/oral responses, traditional tests.
Psychomotor: performance ratings, observation rubrics, self-ratings.
Affective: observation, displays, reflection writing, self-assessment.
Each domain requires different methods. Paper-pencil tests work only for some cognitive content. Real PBL assessment uses multiple methods to capture all three domains.