Critical Thinking Assessment Report - March 2015
Excerpts from the WASC Interim Report
Investigation (AY 2012-13). The assessment began its three-year cycle with the establishment of the Critical Thinking Learning Community, composed of faculty from across the disciplines. The learning community was initially charged with defining what critical thinking means at Cal Poly. Working with “The Delphi Report,” [1] the learning community identified five traits that should be accounted for when assessing for critical thinking:
- Trait 1: Purpose
- Trait 2: Analysis of Problem/Issue
- Trait 3: Credibility of Sources/Source Material
- Trait 4: Conclusions/Solutions
- Trait 5: Self-Assessment
The plan was to assess for critical thinking via written argumentative papers collected from students in 100-level, GE Area A3 courses (Reasoning, Argumentation, and Writing) and from students in 400-level, discipline-specific courses. The overall intention was to examine cross-sectional differences between students taking courses at these different levels. In spring 2014, this work was given to Professor Brenda Helmbrecht of the English Department.
Evaluation (AY 2013-14). Over 700 student papers from two GE Area A3 courses (ENGL 145 and ENGL 149, both of which require prior completion of ENGL 134) and 600 papers from 400-level courses in five colleges (CAED, CLA, OCOB, CAFES, and CENG) were collected. To determine whether the instructors’ assignments elicited argumentative writing, Professor Helmbrecht collected and reviewed the assignments in advance. Nearly every assignment was deemed acceptable for the assessment project.
The learning community developed a five-point critical thinking rubric (appendix B.2) based on the five traits identified above, on the language in “The Delphi Report,” and on the Cal Poly University Writing Rubric, which was developed for the University Learning Objectives Assessment Project that ran from 2008 to 2011. The critical thinking rubric is analytic in the sense that it assesses for five traits separately as opposed to giving an artifact a single holistic score. The rubric scores range from 0 for “Poor/No Attainment” to 4 for “Superior Attainment.”
The rubric was tested and refined on two separate occasions by using essays from the pool. Professor Helmbrecht finalized the rubric with the assistance of Matt Luskey, the Writing Coordinator for the Center for Teaching, Learning, and Technology, and Dawn Janke, the Director of the Writing and Rhetoric Center.
Assessing for Trait 5 proved somewhat challenging, as most academic papers do not require self-assessment, yet this trait was deemed an essential component of critical thinking by the learning community. As such, instructors were asked to include a short reflection with the assignment, using the following language:
When submitting your paper, please include a typed, one-page (minimum) “Writer’s Memo” wherein you reflect on the choices you made as you wrote your essay. What do you see as the strengths and weaknesses of your essay? What process did you go through to write the essay? Please address anything that can help your reader better understand the approach you took when composing your essay.
Essays whose assignments did not adhere to this language were not scored on Trait 5.
A scoring session with 29 readers from across campus was led by Professor Helmbrecht and Professor Josh Machamer, chair of the General Education Governance Board, on June 27, 2014. The readers comprised faculty members who had submitted their students’ work for assessment, members of the learning community, members of the Academic Assessment Council, and other interested faculty members. Readers were each paid $200 for their participation.
In preparation for the scoring session, faculty members read and scored three essays in advance, so that they could become familiar with the rubric’s language. Their scores were collected upon arrival and used for norming purposes. After discussing each of the three essays, readers scored and discussed a fourth.
Sampling took the form of a random selection of entire course sections; in most cases, every essay from a selected section was scored.
After norming and sampling, readers were split into two groups: one assessed the GE Area A3 papers and the other assessed the 400-level work. Assessing the papers in two rooms helped guard against the scoring bias that might have resulted from reading an essay written by a first-year student back-to-back with an essay written by a senior.
During the four-hour scoring session, a total of 268 essays were each scored twice: 96 from ENGL 145, 50 from ENGL 149, and 122 from 400-level courses. Notably, each essay was accompanied by its assignment.
Analysis of Results. A summary of critical findings is provided here; additional analysis is included in appendix B.3.
As described above, each student paper was read twice. However, there were sometimes sizeable discrepancies between the two resulting scores, and the correlation coefficients (one measure of inter-rater reliability) were generally quite low (below .6), as illustrated in Table 2.
Table 2 - Correlation Coefficients

| Trait 1 Purpose | Trait 2 Analysis | Trait 3 Credibility/Source | Trait 4 Conclusions | Trait 5 Self-Assessment |
|---|---|---|---|---|
| .195 | .271 | .226 | .265 | .338 |
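For context, coefficients like these can be computed directly from the paired reader scores. The following is a minimal sketch, not the project’s actual analysis script; the data and variable names are hypothetical:

```python
# Inter-rater reliability, measured as the Pearson correlation between
# the two readers' scores on one trait (hypothetical example data).
from scipy.stats import pearsonr

reader1 = [2, 3, 1, 4, 2, 3, 2, 0, 3, 2]
reader2 = [3, 3, 2, 2, 1, 3, 4, 1, 2, 2]

r, p = pearsonr(reader1, reader2)
print(f"correlation coefficient: {r:.3f}")
# Coefficients below .6, as in Table 2, indicate weak agreement between readers.
```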
To adjust for these discrepancies, the decision was made to remove any pair of scores that differed by more than one point (e.g., a 2/4 split) and to average the two scores for the remaining papers. For example, a 1.5 indicates that the student’s paper received a 1 and a 2 on a single trait, and a score of 2 indicates that the paper received a 2 and a 2, any 1/3 splits having been removed.
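This adjustment is simple to express in code. A minimal sketch, assuming the two readers’ scores for one trait are held as pairs (the data here are hypothetical):

```python
# Drop score pairs that differ by more than one point, then average the rest.
score_pairs = [(1, 2), (2, 2), (1, 3), (3, 4), (0, 2)]  # hypothetical (reader1, reader2) pairs

kept = [(a, b) for a, b in score_pairs if abs(a - b) <= 1]
averages = [(a + b) / 2 for a, b in kept]

print(averages)  # [1.5, 2.0, 3.5] -- the 1/3 and 0/2 splits were removed
```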
The tables and graphs show the percentage distributions of these average scores for the five traits by course level; the three lowest and the two highest score categories were each grouped together. Sample sizes are given in the column headings; they vary by trait because of the removal of papers with discrepant scores on that trait.
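The grouped percentage distributions can be reproduced from the averaged scores. A sketch using pandas, with hypothetical data and bin edges chosen to match the score categories in the tables below:

```python
import pandas as pd

# Hypothetical averaged scores (0-4 in half-point steps) for one trait and course.
scores = pd.Series([1.0, 1.5, 2.0, 2.5, 2.5, 3.0, 3.5, 4.0, 2.5, 2.0])

# Bins matching the tables: 0-1, 1.5, 2, 2.5, 3, 3.5-4.
bins = [-0.01, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0]
labels = ["0-1", "1.5", "2", "2.5", "3", "3.5-4"]
dist = pd.cut(scores, bins=bins, labels=labels).value_counts(normalize=True).sort_index() * 100

print(dist.round(2))  # percentage of papers in each score category
```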
Distribution of Scores for Trait 1 - Purpose

| Score | ENGL 145 (n = 81) | ENGL 149 (n = 42) | 400-level (n = 111) |
|---|---|---|---|
| 0-1 | 3.70 | 2.38 | 3.60 |
| 1.5 | 11.11 | 2.38 | 9.01 |
| 2 | 4.94 | 14.29 | 15.32 |
| 2.5 | 41.98 | 35.71 | 28.83 |
| 3 | 17.28 | 26.19 | 25.23 |
| 3.5-4 | 20.99 | 19.05 | 18.02 |
| Average | 2.62 | 2.70 | 2.61 |
Distribution of Scores for Trait 2 - Analysis of Problem

| Score | ENGL 145 (n = 80) | ENGL 149 (n = 36) | 400-level (n = 115) |
|---|---|---|---|
| 0-1 | 8.75 | 5.56 | 6.09 |
| 1.5 | 13.75 | 11.11 | 13.91 |
| 2 | 10.00 | 19.44 | 19.13 |
| 2.5 | 38.75 | 36.11 | 30.43 |
| 3 | 15.00 | 13.89 | 17.39 |
| 3.5-4 | 13.75 | 13.89 | 13.04 |
| Average | 2.41 | 2.42 | 2.02 |
Distribution of Scores for Trait 3 - Credibility of Source

| Score | ENGL 145 (n = 81) | ENGL 149 (n = 42) | 400-level (n = 107) |
|---|---|---|---|
| 0-1 | 6.17 | 2.38 | 17.76 |
| 1.5 | 23.46 | 16.67 | 17.76 |
| 2 | 17.28 | 26.19 | 19.63 |
| 2.5 | 27.16 | 33.33 | 28.04 |
| 3 | 17.28 | 16.67 | 8.41 |
| 3.5-4 | 8.64 | 4.76 | 8.41 |
| Average | 2.25 | 2.29 | 2.07 |
Distribution of Scores for Trait 4 - Conclusions

| Score | ENGL 145 (n = 78) | ENGL 149 (n = 37) | 400-level (n = 107) |
|---|---|---|---|
| 0-1 | 17.95 | 8.11 | 17.76 |
| 1.5 | 17.95 | 16.22 | 17.76 |
| 2 | 17.95 | 35.14 | 19.63 |
| 2.5 | 20.51 | 21.62 | 28.04 |
| 3 | 15.38 | 13.51 | 8.41 |
| 3.5-4 | 10.26 | 5.41 | 8.41 |
| Average | 2.12 | 2.15 | 2.00 |
Distribution of Scores for Trait 5 - Self-Assessment

| Score | ENGL 145 (n = 74) | ENGL 149 (n = 39) | 400-level (n = 92) |
|---|---|---|---|
| 0-1 | 12.16 | 23.08 | 16.30 |
| 1.5 | 25.68 | 30.77 | 8.70 |
| 2 | 20.27 | 17.95 | 18.48 |
| 2.5 | 22.97 | 23.08 | 20.65 |
| 3 | 12.16 | 2.56 | 17.39 |
| 3.5-4 | 6.76 | 2.56 | 18.48 |
| Average | 2.09 | 1.72 | 2.33 |
[Graph 8: percentage distributions of average scores by trait and course level, as tabulated above]
Chi-square tests do not reveal any statistically significant differences in the distribution of scores across course levels until Trait 5 (p < .001), where the ENGL 149 students, almost all CENG majors, scored generally lower. Standard deviations are roughly 0.7; sample sizes are around 80, 40, and 110 for the three course groups, with fewer scored papers for Trait 5; and standard errors are around 0.08. The average Trait 5 score in the 400-level courses was 2.00 for CENG students and 2.44 for non-CENG students, indicating that the Trait 5 gap for the CENG majors narrowed somewhat between ENGL 149 and the 400-level courses.
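A chi-square test of this kind operates on the raw counts rather than the percentages. The sketch below is illustrative only; the counts are reconstructed (with rounding) from the Trait 5 percentages and sample sizes tabulated above:

```python
# Chi-square test of independence: does the Trait 5 score distribution
# differ across the three course groups?
import numpy as np
from scipy.stats import chi2_contingency

# Approximate counts per score category (columns: 0-1, 1.5, 2, 2.5, 3, 3.5-4),
# reconstructed from the Trait 5 percentages above.
counts = np.array([
    [9, 19, 15, 17, 9, 5],    # ENGL 145, n = 74
    [9, 12, 7, 9, 1, 1],      # ENGL 149, n = 39
    [15, 8, 17, 19, 16, 17],  # 400-level, n = 92
])

chi2, p, dof, expected = chi2_contingency(counts)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.4f}")
```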
Combining students across the class levels, a repeated-measures ANOVA (analysis of variance) compared the scores on the five traits: Trait 1 was significantly higher than all the other traits; Trait 2 was significantly higher than Traits 3, 4, and 5; and Trait 3 was significantly higher than Traits 4 and 5.
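A repeated-measures ANOVA of this form can be run with statsmodels on a long-format table holding one averaged score per student per trait. The sketch below is not the project’s actual analysis; the data and column names are hypothetical:

```python
# Repeated-measures ANOVA: do mean scores differ across the five traits?
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Hypothetical long-format data: one averaged score per student per trait.
data = pd.DataFrame({
    "student": sorted([1, 2, 3] * 5),
    "trait":   ["T1", "T2", "T3", "T4", "T5"] * 3,
    "score":   [3.0, 2.5, 2.5, 2.0, 2.0,
                2.5, 2.5, 2.0, 2.0, 1.5,
                3.5, 3.0, 2.5, 2.5, 2.0],
})

result = AnovaRM(data, depvar="score", subject="student", within=["trait"]).fit()
print(result)  # omnibus F-test; pairwise trait comparisons would follow as post-hoc tests
```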
Improvement (AY 2014-15). To kick off this phase of the assessment, the provost sponsored a faculty development opportunity in the form of a Fall 2014 visit by Dr. Peter A. Facione, author of “The Delphi Report” on critical thinking, and his colleague and co-author, Dr. Carol Ann Gittens. They held a general session on critical thinking along with two discipline-specific workshops.
Planning is now under way for the presentation of the assessment results at a spring series of meetings with the deans, associate deans, and Academic Senate, as well as a joint meeting of the faculty members in Communication Studies, English, and Philosophy who are responsible for teaching foundation-level critical thinking skills. The latter is especially important, as it is intended to address a structural problem whereby GE faculty members who teach in the same area but reside in different departments do not meet to discuss their common concerns and responsibilities. It is also intended to begin an ongoing review of the GE objectives and criteria, which were established in 2000 and have not been revised since.
These meetings are intended to promote an engagement with the results, of course, but also to prepare the ground for a multiday summer workshop on course and assignment design for critical thinking, which will be organized by the Center for Teaching, Learning, and Technology.
Success of Actions Taken. Because the critical thinking assessment project was the first of its kind at Cal Poly, it has always been regarded as a pilot. Although the results should establish a critical thinking benchmark for graduating seniors, there is still much to consider before the next campus assessment of critical thinking:
- It became clear that assignment design is an essential factor in assessing for critical thinking. Some assignments provide students with a great deal of structure and guidelines, whereas others are more open-ended and give students room to respond in idiosyncratic ways. As such, some of the results could be an artifact of the assignment design, and assignments that explicitly build critical thinking into their outcomes may elicit better responses from students. Therefore, it seems prudent to work with the Center for Teaching, Learning, and Technology to offer workshops to help faculty build critical thinking into their assignments and rubrics with greater intentionality.
- Working with a more standardized assessment tool in future critical thinking assessment efforts may prove advantageous. The variance in the assignments makes assessment more challenging, so perhaps embedding standardized assignments into classes and/or working with the results of the Writing Proficiency Exam should be explored.
- Better understanding where critical thinking happens in the curriculum, as well as where it could happen, seems essential. Determining how critical thinking is scaffolded after the GE Area A3 courses is key to ensuring that students continue to develop their skills throughout their education.
- Triangulating the results of this assessment, the Collegiate Learning Assessment, and the National Survey of Student Engagement will help flesh out the campus’s understanding of students’ critical thinking skills.
[1] Peter A. Facione, “Critical Thinking: A Statement of Expert Consensus for Purposes of Educational Assessment and Instruction” (Millbrae, CA: California Academic Press, 1990). Also known as “The Delphi Report,” it articulates the findings of a two-year effort to make a “systematic inquiry into the current state of CT and CT assessment.” The report can be found at https://assessment.aas.duke.edu/documents/Delphi_Report.pdf.