A Look at Academic Standards
- The case for standards is not obvious
- Administrators must make the case for standards
- Rejecting standards requires retreating to comparative evaluation
- Standards are fair
- Standards are effective
- Standards are fundamentally different from norms
- Standards are fair and focused on proficiency
- Norms represent constantly changing rules of the game
- Standards are more rigorous than norms
- The administrator must make the case for standards to a skeptical public
- Assessment is the key to influencing every other element of classroom performance
- Performance assessment is the best way to assess student proficiency
- The administrator must be the architect of collaboration in assessment design and evaluation
- The administrator must focus professional development on assessment quality
A Case for Standards
As Reeves (2002) informs us, academic standards are part of every public school in the nation. An increasing number of public and private schools in the United States and throughout the world have changed their approach to assessing student performance, moving from the tradition of comparing students to one another to comparing students to academic standards. However, the reason to implement standards is not that legislation or resolutions require them but that they are the fairest way to assess student performance.
The challenges in implementing standards include standards that are either too vague or hyper-specific (Marzano, Kendall, and Cicchinelli, 1998). Almost all of the state standards documents are too long, requiring more time than is available in the school calendar. Moreover, for many teachers, parents, and students, standards have become inextricably linked to standardized tests. If the latter are disliked, the former are blamed (Kohn, 1999) (p. xvi).
There are only two ways to evaluate the quality of student work. It can be compared either to:
- An objective standard or
- The work of other students
The most obvious manifestation of the comparison of students to one another is the bell curve (more properly, the normal distribution), the basis of what are commonly known as “norms.” The reference point is typically the average of student performance. Many report cards continue to refer to a grade of C as average. There are two problems with using the bell curve or any other evaluation system that compares student performance to the average:
- It is inaccurate, and
- It is unfair
Comparative evaluation systems are inaccurate because they purport to show that a student is proficient, when the only accurate statement that can be made is whether the student is similar to other students or not. Saying a student is an “average writer” does not reveal if the student writes well. Saying a student is an “above average mathematician” does not disclose if the student is ready to advance. Comparative evaluation is unfair because it denies opportunity to a student who has earned it. If a student has met every requirement for academic proficiency but is nevertheless “below average” because other students have performed at a higher level, it is unfair to deny that student the opportunity to advance.
Using academic standards eliminates the fiction of claiming that the student who has performed better than 51 percent of their peers is successful while the student who has performed better than 49 percent is unsuccessful. When standards are used rather than comparisons, it is possible that both of those students are proficient, or that neither is. Students should not be told that they are proficient merely because they beat other students. The point of assessing student performance should be accurate assessment of proficiency, not meaningless comparison and competition (p. xvii).
The Only Two Ways to Assess Human Performance
In fields ranging from Olympic competition to the admissions office at Swarthmore, the decision system rests on the premise that there are only a very few winners, and the job of the judges is to separate the winners from the losers, if even by a fraction of a second, grade point, or test score. This method of evaluation is so common that some people accept it as universal practice.
If every airline pilot trainee in a class meets the standards for navigation, weather, and air traffic control, we do not object to calling all of them proficient. If none of the pilots is proficient, none receives wings. A potential pilot is not proficient merely because they are competitive; they must be a proficient pilot. We do not tell a prospective pilot that they are unsuccessful because another pilot had a higher score on the air traffic control test; both are proficient if they have achieved the standard (p. 4).
Comparison of human performance to a standard is the only appropriate way to evaluate the achievement of students, and of educators as well. Where an error would pose great risk to the public and to individuals, as in the licensing of pilots, the standards-based approach is what is used. Thus, whether our perspective is protecting the rights of the public or protecting the rights of the individual, a standards-based approach is society’s preferred method of operation.
These two methods of assessing human performance are mutually exclusive, and the contrast between them is stark. Students are either compared to a clear objective standard or they are compared to one another; there is no third alternative. Rejecting standards as the method for student evaluation leaves only an evaluation system that is based on comparison of one student to another, a system that is inconsistent, unfair, and ineffective.
Although every public school in the nation has a curriculum that is theoretically governed by state academic content standards, the reality is considerably more complex. In most schools, a significant number of faculty members have heard of standards, but an equal or greater number regard standards as an administrative imperative rather than a reﬂection of the school’s fundamental values and educational principles. In a growing number of schools, there is organized opposition to standards, with the aggravation teachers and students feel toward standardized tests directed toward the entire standards movement (p. 5).
Are Standards Really New?
Although critics have attempted to label the educational standards movement a passing fad, standards are in fact about as old as Socrates. In the Lyceum, the issue was not merely whether Glaucon got the better of his opponents in an argument, but whether any argument met the teacher’s challenges on the basis of logic and truth. At the dawn of the Enlightenment, scholars argued that the scientiﬁc method was a better way to test a hypothesis than a popularity contest among medieval superstitions. Galileo did not ask, “Is the theory a little bit better than the others?” but instead “Does the theory conform to observable facts?” In other words, use of educational standards is not a passing fad, or even a particularly new development.
Although critics have compared standards to “new math,” “whole language,” or any other reform that they disliked, applying clear and objective expectations for student performance is hardly a novel idea. As historian Diane Ravitch (2000) has extensively documented in her studies of more than a century of self-proclaimed educational reforms, there is little new under the sun. In particular, the assertion that the standards movement has directly resulted in extraordinary and unprecedented demands upon children is a claim unsupported by the evidence.
An article titled “A Crime Against Children: The Scourge of Homework in Our Schools” was not a product of the anti-standards movement of the twenty-first century, but a lead article in the Ladies’ Home Journal at the dawn of the twentieth century. Indeed, there was probably a Cro-Magnon adolescent complaining about homework, a Neanderthal parent complaining about excessive testing, and a fresh-faced Homo erectus school administrator who despaired of the lack of time to accomplish all the demands in the school day. The leaders of our present era must not be taken in by superficial disparagement of the new or dismissive contempt for the old. The only qualities on which the matter of educational standards should be judged are fairness and effectiveness, and on that basis they stand the test of time (p. 6).
The Language of Fair Standards
For standards to meet the test of fairness, they must be consistent. Here some states fall far short of the mark, particularly when they use the vocabulary of standards to describe a comparative expectation. The requirement that “students will read a fourth grade level text, compare it to another text on a similar topic, and accurately recall details, similarities, and differences in the two texts” is a standard (p. 8). The requirement that “students will score at or above the 51st percentile” is not a standard. The imprimatur of “standard” on the cover of a document containing such a statement does not transform a comparative statement such as a percentile rank into a standard. The language of comparison, with its percentiles, quartiles, stanines, and averages, inevitably implies a statement of student proficiency on the basis of standards that it cannot support. A report that student achievement is “above average” when the essential question is whether the performance is proficient is not credible.
Adding Value to Standards
Just as in a game, clarity in articulating academic content standards is preferable to ambiguity. Unfortunately, the political process by which standards were established in many states rendered the documents full of equivocation, imprecise words, and a mysterious threshold for success. It is difficult to fix any political process at the state level that is inherently ambiguous, but school administrators can add value to standards by taking the state documents and recreating them with greater precision, focus, and prioritization. The concept of “power standards,” as Reeves calls them, is a process of adding value to the standards districts and schools have been given.
Rather than going on the defensive over a flawed set of state standards, administrators can honestly admit that every set of state standards and the tests accompanying them have some flaws. In some cases, there are too many standards; in others the standards are vague, or hyper-specific, or inappropriate. Because perfection is not an option in creating standards and because defense of inadequate standards looks silly, administrators should not attempt to defend what has been given to them but rather articulate a set of alternatives.
One alternative is to reject standards and return to a comparative method of student evaluation. With such a choice, the fairness that a standards-based approach to education offers students and the public is forfeited. The other alternative is to add value to standards with focus and prioritization, collaborating with educators and school administrators to get the most out of standards rather than retreating to an alternative that is unfair.
Fairness to the Public
The final issue to consider in this discussion of standards and fairness is the obligation to look beyond the boundaries of the classroom. There is clearly an obligation for a school to be fair to the individual student, and standards represent the best way to provide feedback on student performance that is clear, consistent, and fair. But the obligation for fairness extends beyond the classroom: to teachers in future grades, administrators in future schools, and above all the communities in which students live. If a student is said to “meet standards” in reading or mathematics, it has to be said with integrity.
Failure to be accurate in this regard leads to frustration on the part of students, parents, and future teachers and misallocation of resources thanks to rescheduling and course failures. Just as students have the right to expect fair treatment in assessment of their work, so also the community of parents, employers, and educators in other schools has the right to expect accuracy in a description of student achievement.
A significant part of the continuing crisis of confidence in public education can be attributed to the difference between what educators say a student can do, as documented with report cards and diplomas, and what students can actually do, as observed at school and in the home. The choice is not whether to be fair and accurate in an assessment comparing students to a standard but rather the timing of this assessment. Problems can be found proactively, by identifying needs and implementing interventions, or they can be found passively, by reacting only after a child has faced severe academic trauma (p. 9).
Research on Standards Implementation
Researchers from different educational, theoretical, and political perspectives have determined that academic standards are effective. Documenting dramatic improvement in schools around the nation, Schmoker (2001) found that one consistent prerequisite for success was a focus on speciﬁc academic areas and identiﬁcation with absolute clarity of what successful performance looks like. This is consistent with other authors who have studied schools containing students from a spectrum of demographic backgrounds (Haycock, 1998; Reeves, 2000a).
Although critics take issue with his choice of subjects and his term “cultural literacy,” E. D. Hirsch, Jr., has made a signiﬁcant contribution (1996) to the standards debate by reminding us of long-term observations in Europe in which schools with deﬁned curriculum objectives signiﬁcantly outperformed schools without them. In particular, he notes that the impact of clear curriculum deﬁnition affords not only educational excellence but also equity of opportunity. Economically deprived immigrant children, he reported, fared far better in schools in which expectations are absolutely clear and consistent from one school to the next. His ﬁndings are consistent with researchers on this side of the Atlantic.
There should be nothing surprising about research that indicates that when students and teachers focus more on an objective (as does Innovationism), they are more successful than when they do not. Nor should it be surprising that standards alone provide a framework for focus, while a comparative system simply encourages teachers to try harder and beat the other schools. Comparative systems do not recognize that, as important as effort and perception of ability may be, the only real question is: Is the work proficient when compared to an objective standard?
In considering the effectiveness of standards implementation, there are three essential considerations:
- Student performance
- Teaching practice, and
- Administration behavior
The most important statistic in evaluating student performance is the percentage of students who are performing at the proficient level or higher. In circumstances where a large proportion of students are meeting standards at the proficient level, it may be more appropriate to measure the percentage of students who are performing above the proficient level. In any case, it is essential that a single metric be selected and then measured consistently. Although teaching variables are rarely evaluated, it is quite possible to measure them with fairness and accuracy (p. 12).
The key to effective measurement of the professional practices of teachers is consistency and precision. Sincerity or attitude cannot be measured, but the frequency with which a teacher uses standards-based assessments in the classroom can be. Teacher collaboration, and the percentage of agreement among teachers evaluating the same piece of student work, can also be measured. A principal can measure the frequency with which faculty meetings focus on student achievement issues and can track teacher excellence in implementation of standards.
It is more important to measure a few things frequently than to measure a lot of things infrequently. Annual standards assessment is more effective than state standards that are ignored by students and teachers. Effective standards implementation requires administrators to identify a few variables, measure them frequently, and reliably make instructional decisions that reflect the results of those measurements (p. 12).
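The percentage of agreement among teachers scoring the same student work can be computed in a few lines. The following is only a sketch: the four-point rubric, the teacher labels, and the scores are hypothetical, and simple pairwise percent agreement is one of several possible agreement measures, not one the source prescribes.

```python
from itertools import combinations

def percent_agreement(scores_by_teacher):
    """Percent of (teacher pair, work sample) comparisons with identical rubric scores.

    scores_by_teacher maps a teacher label to a list of rubric scores,
    one per work sample, in the same sample order for every teacher.
    """
    agree = 0
    total = 0
    for scores_a, scores_b in combinations(scores_by_teacher.values(), 2):
        for a, b in zip(scores_a, scores_b):
            total += 1
            if a == b:
                agree += 1
    return 100.0 * agree / total

# Hypothetical data: three teachers score the same five essays on a 1-4 rubric.
ratings = {
    "teacher_a": [3, 2, 4, 1, 3],
    "teacher_b": [3, 2, 3, 1, 3],
    "teacher_c": [3, 1, 4, 1, 3],
}

print(f"Agreement: {percent_agreement(ratings):.1f}%")
```

Tracking this figure over successive scoring sessions gives a principal one concrete, repeatable measure of collaboration.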
Standards and Norms
Nine out of ten U.S. senators support academic standards, as do a similar proportion of members of the House of Representatives. The legislatures of forty-nine of the fifty states have endorsed statewide academic standards; the fiftieth, Iowa, requires standards for every district. If standards are so widespread and common, then why is it necessary to make the case for them? The problem is twofold: the association of academic standards with standardized tests diverts the conversation, and the imperfect and inconsistent manner in which standards have been implemented at the classroom and local school district levels in the past has caused a backlash against the entire notion of academic standards (p. 13).
Accordingly, standards cannot be implemented without ﬁrst making the fundamental case for the superiority of this method of assessing student performance. As stated, student work must be compared either to objective performance criteria – a standard – or to the average work of other students – a norm. The advantages of standards over norms are overwhelming, but the superiority is not obvious.
The most typical reason for unfairness in any game is that the rules change in the middle of it. Students, along with their teachers and school leaders, remain engaged in a task if the goal is clear and they perceive that their individual efforts help them move toward that goal. Good academic standards produce clear and unambiguous goals. By contrast, the goal established by a norm-referenced test is to be “above average” – a goal that changes with each administration of the test.
Pursuing the moving average creates two problems. First, it fails to recognize progress that is made by struggling students and exceptionally challenged schools. If a student in the sixth grade, for example, progresses from a third grade reading level to a fifth grade reading level in a single year, this should be recognized as progress, and at the same time it should be recognized that the child has not yet achieved the sixth grade standard (p. 14).
The most effective models of standards implementation recognize a continuum of performance, ranging from failure to meet the standard, to progressing toward the standard, to proficiency, and then to exemplary performance. The student can be recognized as “progressing” even as his failure to read on the sixth grade level is acknowledged. If it is only recorded that the student failed to meet the average of other sixth graders, the record offers no helpful challenge, only the news that the child is not above average.
Second, consider the case of a student who is already above average and in fact has entered every grade reading above that grade level. Using norm-referenced tests is a prescription for complacency for this child, providing only bland confirmation of what he already knows. A standards-based assessment, by contrast, acknowledges that this student has achieved proficiency and at the same time challenges him to reach the next level of performance.
Merely beating other students is not a standard; achieving a specific level of exemplary performance is a meaningful challenge for the student and the teacher. Even this high-achieving student, however, loses motivation if the definition of success is constantly changing. Exemplary performance, like proficiency in the standard itself, is not a mystery; nor is it a function of the performance of other students. It is a clear and challenging goal to which every student can aspire. The variable is the hard work of students and teachers, not successful guesswork or the defeat of other hard-working students.
Standards Are Cooperative; Norms Are Competitive
The effectiveness of cooperative learning strategies is a matter of settled research (Walters, 2000). Nevertheless, there is a prevailing ethic that we live in a competitive world and that for children to succeed they must learn competition in the classroom. Standards actually promote successful competition, but not by pitting one student against the other (p. 15). Rather, standards promote success by building teamwork and successful reinforcement among members of a learning team. Listen to the most competitive employers in your community and throughout the world. Their most frequent request is for employees who are literate and able to work cooperatively in a team. Using cooperative learning is not antithetical to competition. Cooperative teamwork is actually essential for successful competition. The challenge, however, is to use competition in the right context.
If students believe that success in the classroom is a function of whom they beat rather than the standard that they achieve, they lurch between aspiring to mediocrity and feeling disappointed even when successful (see: Academic Behavior and Academic Motivation). If students work in a standards-based classroom, they know that it is possible for every student to succeed. With diligence and focus, with cooperation and mutually reinforced learning, all students can achieve a standard. This is the way the best athletic coaches and orchestra leaders get a group of people with divergent abilities to succeed together.
Students are most successful not when they relax knowing that their efforts, though falling short of a standard, are sufficient to beat the other students in the room. Nor do students succeed if they believe that no matter how hard they work, their efforts are never quite good enough to beat another student in the class. Students succeed academically when they know that their success is a direct result of their hard work. Moreover, the success that they most celebrate is the success of the entire class. Thus the standards-based classroom has incentives for group success, for students who help one another, and for students who are willing to be vulnerable enough to ask one another for assistance (p. 16).
The Implications of Cooperation on Grouping and Tracking
The necessity of cooperation among students often leads to discussion of grouping and tracking. Through ﬂexible grouping, students can be grouped by similar ability when they are building common skills such as learning multiplication or phonics. The same students, however, can work with an entire class or a group of students with differing abilities when the discussion turns to prediction, evaluation, or comparison.
A similarly balanced approach can be applied to discussion of tracking. If the ultimate goal is to reduce tracking in high school, where some students are given an opportunity for college while other students are systematically assigned to courses that are inconsistent with postsecondary educational opportunity, then it is necessary to recognize academic needs in middle school and elementary grades and address deficiencies. For example, to give a student the opportunity in ninth grade for academic rigor and no tracking, it may be necessary to give this same student additional intensive literacy instruction in eighth grade. The eighth grade grouping of students who need literacy skills is necessary, therefore, to “de-track” students in the ninth grade. If students are not grouped in the eighth grade, the result is likely to be segregation of students not only in the ninth grade but throughout the remainder of their academic and employment careers (p. 17).
The discussion of academic standards must not be sidetracked by the partisans or opponents of grouping. There are times when it is necessary to group students according to their ability and other times when it is strategic to group students of differing ability. The common concern is not the label but rather the extent to which grouping advances academic achievement. By focusing on the goal and embracing a flexible strategy, educators can take maximum advantage of the effectiveness offered by a cooperative learning strategy and avoid succumbing to the false assurance of norm-based test results. It is not necessary to know who beat whom but only to know the percentage of students who meet or exceed standards.
Standards Measure Proﬁciency; Norms Measure Speed
The difference in assessing standards compared to assessing norms is that proﬁciency precedes speed. Consider the example of state writing requirements, in which students are expected to prepare a multiple-paragraph persuasive, analytical, or expository essay. The performance standards are clear: the paper must adhere to the conventions of English grammar, spelling, and punctuation; be well-organized; include appropriate topic sentences and transitions; and have supporting illustrations and evidence that are linked to a clear central theme. Not a single state writing standard, however, expresses the requirement that the writing be performed quickly. There is no differentiation between accomplishing these standards on the ﬁrst draft or the third draft (p. 18).
The only issue is the assessment of the standards. In norm-referenced tests, by contrast, the typical requirement for standardized test conditions includes a common constraint on time. Therefore the norm-referenced test assesses not only student proficiency but also the speed with which the student processes the information. Perhaps there are tasks in which speed is important, and where that is the case one should expect a standards document to make such a specification. A technology standard might, for example, specify that a student be able to keyboard at forty words per minute with 98 percent accuracy. There is not a standard, however, that requires students to “solve a quadratic equation” or “complete a writing process” quickly. A true standards-based test focuses on proficiency, not speed.
The typical objection to de-emphasis of speed is the fear that students will abuse freedom, as well as the expectation that the real world requires speed. A concern is that if a student has no time limit they could take all day or all week to complete the assignment. Time doesn’t stand still while a student takes an infinite number of minutes to complete an assessment. Rather, if most students complete a writing assignment in forty-five minutes, then a reasonable time limit in a standards-based test might be ninety minutes. In this way, students who needed only an extra ten or twenty minutes are able to become proﬁcient and the test data do not inaccurately label them as non-proficient. For those students who are unable to complete the task in ninety minutes, the problem is probably far deeper than the task at hand, perhaps involving inability to read the assessment or to address the task at all. These students require intervention and additional instruction, not merely more time for the task.
In the real world, the quality model in the most advanced enterprises asks workers, engineers, attorneys, or executives to submit work, get feedback, and then improve it (p. 19). This cycle of feedback and improvement is what leads to quality, not the expectation that work is completed once irrespective of quality and then the worker proceeds to the next task. A successful enterprise begins with a demand for quality rather than speed, confident that as quality improves, speed will be achieved in due course. However, the enterprise fully understands that speed without quality is not a prescription for success.
Standards Are Challenging; Norms Are Dumbed Down
The use of the average, the staple of norm-referenced tests, is a formula for mediocrity. It allows students who fail to meet standards to become inappropriately complacent by claiming that they are above average whether or not they meet the standard. There are a number of states, such as Virginia and New York, that were initially alarmed when a large number of their students did not fare well on standards-based tests. Indeed, a chorus of critics announced that if 80 percent of students failed to achieve proficiency in a standard, it must be the test or the standard that is at fault. The logic of this allegation rests on the false presumption that 50 percent of students (those above average) should automatically be successful. In fact, it does not surprise classroom teachers to read a standard that is rarely applied and infrequently tested and discover that 80 percent of students do not meet it.
With student achievement, however, the answer is neither to shoot the messenger nor to subvert the message. In fact, the first reaction to the news that 70 percent of students do not meet standards should be a positive one, as it offers clear evidence that the standard is more rigorous than the norm. In a system that blindly applauds the above-average student, 20 percent of the students (those between the 50th and 70th percentiles) would be complacent in their apparent victory over their below-average colleagues, while failing to notice that they were not proficient (p. 20). This is precisely what happens when honor roll students in middle school, comfortable in their relative merit when compared to other students, run into trouble in high school because the expectations are dramatically different from what they had expected.
The rigor of standards compared to the mediocrity of norms is played out in a number of ways every day in schools. Teachers can examine the scores of any class and find a student who is in the 55th percentile on a test of verbal ability, and then examine the written work of the student and fail to find an expository essay that meets the state standard. Similarly, teachers can find students who achieve a 60th percentile score in a norm-referenced math test but fail to find evidence that the student is able to apply that mathematical knowledge to the data interpretation typically required by social studies and science standards. The existence of standards makes it clear that being above average is not as important as meeting a standard.
The impact of academic rigor in standards should be considered when analyzing improvement measurements. The most important metric in a standards-based system is the percentage of students who meet or exceed standards on a specific performance task, that is, who score at or above the proficient level of performance. The move of the average score from the 55th to the 60th percentile may or may not imply an improvement in the percentage of students who are proficient. Conversely, percentile scores can be stagnant, yet the percentage of students who are proficient can rise. Schools will, as a matter of statistical fact, never have more than 50 percent of any distribution above the median (the 50th percentile), but it is possible for all students to meet a standard. Thus at the same time that standards represent a more rigorous level of achievement than norms, standards also open the door of success to far more students than do norm-referenced tests (p. 21).
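The difference between the two metrics can be illustrated with a small calculation. This is a sketch under stated assumptions: the fall and spring scores and the cut score of 65 are invented for illustration, and "above the median" stands in for a norm-referenced "above average" judgment.

```python
from statistics import median

def percent_meeting(scores, cut_score):
    """Percent of students scoring at or above a fixed proficiency cut score."""
    return 100.0 * sum(s >= cut_score for s in scores) / len(scores)

def percent_above_median(scores):
    """Percent of students strictly above the group's own median (never more than 50)."""
    m = median(scores)
    return 100.0 * sum(s > m for s in scores) / len(scores)

# Hypothetical scores for the same ten students in fall and spring.
fall = [52, 58, 61, 64, 66, 70, 73, 75, 81, 90]
spring = [62, 66, 68, 71, 72, 74, 78, 80, 85, 93]
CUT_SCORE = 65  # assumed proficiency threshold

for label, scores in (("fall", fall), ("spring", spring)):
    print(label,
          f"meeting standard: {percent_meeting(scores, CUT_SCORE):.0f}%",
          f"above median: {percent_above_median(scores):.0f}%")
```

In this invented data the share of students above the median cannot move, because the reference point moves with the group, while the share meeting the fixed standard rises as every student improves.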
Standards Are Complicated; Norms Are Simple
Many people like norms because they appear to encapsulate, in a single number, the achievement of a student. A single composite score is used to represent the ability of a student, his rank among other students, and perhaps his potential as well. Achievement of standards, by contrast, is not amenable to description with a single number.
In fact, standards invite complexity. School leaders in a standards-based environment may report that 82 percent of students are proﬁcient readers, but only 42 percent are proﬁcient in writing; 93 percent are proﬁcient in math computation, but only 55 percent are proﬁcient in mathematical analysis and problem solving. This is, to be sure, much more complicated than saying that the average fourth grade score places a group of students in the 52nd percentile nationally.
The fundamental purpose of assessment and accountability is to improve teaching and learning, so a ranking system is not very helpful. Teachers require the complexity of a standards-based report to know that, in this example, they need to focus more on problem solving. The success in math computation may point to teaching techniques that can be applied to the area that is deficient. Reading well might not be enough, and the literacy emphasis might need to be expanded to include writing as well. A single norm-referenced score would never provide such helpful insight. Still, a balance is needed, as test data can be notoriously complex. Teachers and administrators need enough data to modify instruction, but not so much information that it is overwhelming.
Data reports must personally influence the people who make the decisions that improve performance. For example, if a teacher does not perceive that his instructional decisions influence student achievement, then test data will not influence that teacher’s decisions (p. 22). This is particularly true if the data are not related to the teacher’s decisions. For example, if the teacher believes that the students who fail also fail to come to school, then it is necessary to analyze the data by attendance, producing a list of students who have 90 percent or greater attendance but are nevertheless failing. If the teacher feels that the students who are failing are all in someone else’s class, then it is necessary to analyze the data individually by teacher. Each increment of data analysis is a step toward personal responsibility. Without it, the data are abstract and irrelevant to daily decision making.
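The two analyses described above, by attendance and by teacher, can be sketched in a few lines. This is a minimal illustration only: the `StudentRecord` fields, the student and teacher labels, and the sample data are all hypothetical, and only the 90 percent attendance threshold comes from the text.

```python
from dataclasses import dataclass


@dataclass
class StudentRecord:
    name: str               # hypothetical student label
    teacher: str            # hypothetical teacher label
    attendance_rate: float  # fraction of school days attended, 0.0 to 1.0
    meets_standard: bool    # True if the student is proficient on the standard


def attending_but_failing(records, min_attendance=0.90):
    """Names of students at or above min_attendance who do not meet the standard."""
    return [r.name for r in records
            if r.attendance_rate >= min_attendance and not r.meets_standard]


def not_proficient_by_teacher(records):
    """Count of students not meeting the standard, grouped by teacher."""
    counts = {}
    for r in records:
        if not r.meets_standard:
            counts[r.teacher] = counts.get(r.teacher, 0) + 1
    return counts


# Invented sample data for illustration.
records = [
    StudentRecord("Student A", "Teacher 1", 0.97, False),
    StudentRecord("Student B", "Teacher 1", 0.62, False),
    StudentRecord("Student C", "Teacher 2", 0.95, True),
    StudentRecord("Student D", "Teacher 2", 0.91, False),
]

print(attending_but_failing(records))      # ['Student A', 'Student D']
print(not_proficient_by_teacher(records))  # {'Teacher 1': 2, 'Teacher 2': 1}
```

In this invented data set, two of the three failing students attend regularly, and failing students appear in both classes, so neither the attendance explanation nor the "someone else's class" explanation would survive the breakdown.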
Standards Address Causes; Norms Display Effects
Causes need to be understood, not merely effects. Because standards address a range of student achievement and behavior, they make the relationship between cause and effect easier to understand. When teachers are asked to identify the most important standards, they usually list a combination of the academic requirements and the behaviors that reflect the components of student success (p. 23).
By analyzing the achievement of these students, it can be understood if the cause for poor science achievement is:
- A lack of science content knowledge by the student
- Misalignment of curriculum between the science class and the assessment taken by the student
- The inability of the student to write a lab report properly, or
- Failure of the student to have the time management and organizational skills associated with being a successful student in any class
Whereas a science score on a norm-referenced test merely announces a result and a rank, the report of the student’s achievement of various standards illuminates the entire picture of student achievement. Decisions can be made which will ultimately improve learning only if there is enough understanding of which standards the student has and has not achieved.
The Impulse Toward Ranking
If the case for standards is so clear-cut, then why is there so much institutional and individual resistance to applying academic standards? Why do parents in particular have difﬁculty with the notion of achieving a standard rather than ranking students against one another? For a large number of parents (including those most likely to be active in school affairs) the bell curve has been their friend. Using norm-referenced comparative data validates them as parents and announces to the world that their child is a success, at least compared to the child in the norming group. Because the traditional purpose of testing has been merely to announce a result rather than to improve learning, there is little or nothing to be gained from any test that yields bad news.
Therefore, confronted with the choice between a norm-referenced test that is encouraging and within the parental comfort zone and a standards-based test that suggests all is not well, parents understandably prefer the former. Moreover, successful parents have themselves defined success by ranking. A’s are not enough; their child must aspire to be valedictorian (p. 24). Challenging grades are not enough; their child must have quality points to allow the possibility of a 7.5 average on a 4-point scale. From an early age, children witness their parents ask not “Is my child proficient?” but rather “Is she the best in the class?”
The Appeal of Ranking and Norms
The impulse toward ranking happens not only among affluent and competitive parents. Children themselves quickly sort out the world into those on top, those in the middle, and those left behind. The way we post scores and rank students conﬁrms this early pre-disposition. There is a fundamental problem, however, that must be confronted. Ranking is not an accurate measure of student achievement; achievement of standards is far superior.
Consider the case of the aspiring valedictorian. The premise of chasing the graduating class microphone is that the student who receives a 7.458 grade-point average is superior to the student who receives a 7.457 grade-point average. This is the classic error of a distinction without a difference. If both students are exemplary performers (that is, not merely the best in the school, but capable of demonstrating performance that meets not just academic standards but explicit standards for exemplary performance as well), then both deserve recognition as exemplary. If, by contrast, both are at the top of the class but neither of them completed a research paper with the appropriate research citation that state standards required, then the valedictorian’s victory is hollow. The trophy should read “superior but not proficient,” a truth that will elude proud parents but become evident soon enough during the first year of college.
Breaking the Cycle: The Value of Standards for Parents Who Love Ranking
Administrators have an obligation not merely to set policy but to communicate the values, vision, and ideals on which policy is based (p. 25). For parents and students, the answer lies in constructive use of assessment. The first principle of assessment is that the purpose of testing is not to rate, rank, sort, and humiliate students or parents, but rather to improve teaching and learning. The role of the school is not to announce a judgment but to coach improvement.
This is one reason frequent measurement of a few standards is so important. Parents do not embrace the value of standards on the basis of an annual report. Rather, they and their children must see, every time they cross the threshold of the school door, evidence that each month students are getting better and better. The percentage of students achieving a rigorous standard in writing, reading comprehension, mathematical problem solving, or other academic area of particular interest to the school is growing higher and higher each month.
Changing Grading Patterns in the Standards-Based School
Before addressing any change in something so tradition-bound and emotionally sensitive as grading, the leader must first address what does not change. Parents and community members must first receive assurance that, at least at the secondary level, the use of letter grades and high school transcripts remains intact. Any documentation of student performance with respect to academic standards is an addition to, not a replacement of, the traditional transcript. In the culture of some communities, it may be necessary to offer the same reassurance to elementary school parents.
The reassurance of stability is offered not because letter grades are the ideal way to assess student performance; it is offered because standards are too important to be caught in the cross-fire over letter grades in the national culture wars. For this reason, letter grades should continue to be used even though many recognize that they are unrelated to student achievement. The key is to supplement the letter grades with a report that is meaningful, fair, and consistent. The Standards Achievement Report (Reeves, 2000b, 2002c) allows teachers to amplify and explain the meaning of a letter grade (p. 26).
In addition to making the evidentiary case for standards, the particular concerns of competitive parents of a high-achieving student should be addressed. These parents need to know that there is value in unpleasant truth. If it is to be stated that their child does not meet a standard, then it should be stated at the same time that the child can meet the standard with additional work, and that the grading system in place rewards and recognizes accomplishment of the standard, not the rate at which it is achieved. This requires abolishing the average as a means of determining a grade in a standards-based school.
The consequence for failing to achieve a standard is the opportunity for detailed feedback, more work, and ultimate success. The grade in a standards-based school is not the average of work done throughout the semester, but an accurate representation of the performance of the student at the end.
The average is a fixture in most grading systems, and many secondary schools have even institutionalized it, using computer programs that require use of the arithmetic mean, or average, of grades throughout each quarter, semester, and year (p. 27). This is strange in a standards-based school, particularly since every middle school mathematics standard in the nation requires sixth and seventh grade students to understand that the mean is not always the best measure of central tendency. This is why they must learn about the median and mode; the average does not always represent the data accurately.
Consider two students, Stewart and Maria. Stewart comes to school fresh from summer camp and complacently strolls through the semester with these weekly scores: 85, 85, 85, 85, 85, 85, 85, 85, and 85. The average is not difficult to calculate, and Stewart happily settles for his “gentleman’s B.” Maria struggles for everything she has learned and turns in this performance: 50, 60, 65, 70, 80, 85, 90, 90, and 90. Maria’s average of a little over 75 will, depending on the grading scale, allow her to take home a C or D on her report card if the teacher is slavishly devoted to the average, even though any fair observer would note that she is a better mathematician and a more responsive student than Stewart.
This is not a subjective judgment or an expression of sympathy. It is a statistically rigorous examination of a set of data in which the average fails to explain the results accurately. If we look at the data objectively and evaluate Maria’s proficiency, she will receive either the same grade as Stewart (a B) or – preferably, in the mind of many teachers – the average of her last three assessments (an A). Should we commit such heresy, you can expect to see Stewart’s parents at the next school board meeting. “Our son,” they will indignantly declare, “was proﬁcient all semester long, while THAT GIRL [pointing to Maria] was only proﬁcient for the last few weeks.” They conclude with the wounded expression of playground anger, “That’s not fair!”
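The arithmetic behind Stewart and Maria can be worked through directly. The sketch below uses the weekly scores given above; the helper name `final_performance` and the choice of a three-assessment window are illustrative assumptions that mirror the "last three assessments" option mentioned in the text, not a prescribed grading rule.

```python
from statistics import mean, median, mode

# Weekly scores as given in the text.
stewart = [85, 85, 85, 85, 85, 85, 85, 85, 85]
maria = [50, 60, 65, 70, 80, 85, 90, 90, 90]


def final_performance(scores, last_n=3):
    """Mean of the most recent last_n scores (hypothetical end-of-term measure)."""
    return mean(scores[-last_n:])


# The mean penalizes Maria for her early struggles.
print(round(mean(stewart), 1), round(mean(maria), 1))        # 85 75.6

# The median and mode, which middle school standards also teach, tell a different story.
print(median(maria), mode(maria))                            # 80 90

# Performance at the end of the term: Maria has overtaken Stewart.
print(final_performance(stewart), final_performance(maria))  # 85 90
```

On the mean, Maria trails Stewart by almost ten points; on her most recent work she leads him by five. Which number lands on the report card depends entirely on whether the grade is meant to describe the whole journey or the proficiency reached at its end.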
Administrators should work with teachers to give them a clear and unambiguous response to use if they are challenged for abandoning the use of averages. A good response would be to say that the school, district, and state are all standards-based, which means that student work is compared to a standard, and not compared to the work of other students (p. 28). It should then be noted that the district or school will be happy to discuss a student’s work relative to the standards in place, but not relative to another student’s work.
It is that clear, blunt, and simple: either a student’s work is compared to standards or it is compared to that of other students. It cannot be both ways. Although universal acceptance and popularity will not follow, the case for academic standards has to be based on the twofold appeal of fairness and rigor. Some parents will always prefer ranking to standards, just as some teachers always prefer grading as an exercise in mysterious judgment to an objective achievement of standards.
The role of administrators and teachers is not to achieve popularity or universal acceptance, but to articulate the values that represent the core beliefs of stakeholders. There is a core norms belief statement, rarely inscribed in documents but frequently carried out in practice, that states, “Some kids get it and some kids don’t, and we’re here to validate the social hierarchy that existed long before these kids came to our school.”
The core belief statement that guides standards-based academics is quite different. It contains certain immutable principles. To begin with, student success is not the result of luck, genetic determinism, or discovery of a mystery known only to a select few. Success in this school is the result of achieving standards through honest evaluation, diligent work, and exceptional effort. Our standards are never a secret; successful accomplishment of those standards can be achieved by every single student. When you leave this school, we will not announce who beat whom; rather, we will celebrate your accomplishments and those of every student who attained and surpassed our clear and unchanging standards (p. 30).
Standards-Based Performance Assessment: The Key to Standards Implementation
As Reeves (2002) wrote, state standards include expectations ranging from factual recall and declarative knowledge through complex requirements that entail demonstration of student proficiency in ways that a typical multiple-choice test cannot possibly address (Marzano, Kendall, and Cicchinelli, 1998; Wiggins, 1995, 1997). Although state tests may evaluate student performance on a few state standards, comprehensive and meaningful assessment of academic standards can only take place in the classroom. It is therefore essential that classroom assessment be clearly linked to the standards, in content and complexity. The vast majority of schools require significant improvement in the quality of classroom assessment. Neither administrators nor teachers can offer a compelling alternative to the flaws of standardized testing if the assessments used in the classroom, school, and district do not generate valid and reliable alternative assessments that are clearly related to the academic requirements of the state standards (p. 31).
Performance Assessment: The Key to Standards Implementation
However persuasive the case for academic content standards may be, there is a strong core of resistance to standards. Much of it is based on associating standards with standardized testing (Lemann, 1999). Other scholars have taken issue not so much with the tests themselves but with how test data are misinterpreted and misused (Popham, 2000; Wiggins, 1997). Administrators have a responsibility to listen carefully, but not necessarily capitulate, to the opposition to standards on the part of their faculty and among other stakeholders.
It can be acknowledged that standards have flaws, that many tests associated with standards have flaws, and that some of the uses to which test information is put are deeply flawed. The remedy for these concerns, however, is neither to reject standards nor to glumly accept standardized tests.
Recognition should be built among faculty, parents, and students that classroom-level and building-level assessment can be fair, effective, and related to academic standards. Whereas large-scale state tests may be limited in scope and provide untimely feedback to students, the best way to address these concerns is with effective performance assessment in the classroom.
Performance assessment has some distinctive characteristics that large-scale state tests almost never have. First, as the name implies, performance assessment requires the student to demonstrate performance of a standard, not merely to answer questions about a standard. The performance might be an essay; laboratory demonstration; application of mathematical knowledge; or analysis of a combination of maps, historical letters, and treaties. The thinking and analysis required in these performance tasks is far more rigorous than the guesswork entailed in many typical multiple-choice tests (p. 32). More important, the results of these performances are immediately available to the teacher and student and thus can be used to improve student performance.
Although there are many activities that bear the label “performance assessment” and that claim the appellation “standards-based,” the best standards-based performance assessments follow a consistent format that includes, at a minimum, standard, scenario, performance tasks, and a scoring guide (rubric) for each task.
Relationship to Standards
Because of the proliferation of standards, many curriculum and assessment documents make the bald claim that they are standards-based. The simple question, “To which specific state standards does this activity relate?” is frequently met with stony silence by the purveyors of these documents. However obvious the question may be, the answers are not equally obvious. In fact, activities accumulate in the classroom as much from tradition and popularity as from their relationship to academic requirements.
As standards are used in an increasing number of schools, some traditional activities persist, and a leap of imagination and logic is required to relate those activities to standards. In other cases, traditional activities unrelated to standards and new standards-based activities have been piled on top of one another, guaranteeing only heavier backpacks for children and greater superficiality in the classroom, but failing to produce a set of coherent tasks that are directly related to state standards (p. 33).
Once the question of standards has been addressed, the second issue that must be considered is whether the standards addressed by this task are the most important ones. Because standards are so voluminous and broad in most states, it is possible to find some standard that relates, however remotely, to virtually any activity. A simple relationship to a standard, therefore, is not a sufficient justification for any classroom activity. Rather, it must be considered whether the activity under review is related to the so-called “power standards,” those that the administration and faculty have determined are most essential because they possess the qualities of endurance, leverage, and readiness for the next level of instruction.
The scenario is the real-world context within which a performance task takes place. Like adults, students prefer to do meaningful work. The scenario creates meaning and conveys great respect to students who otherwise might regard tasks as busywork or utterly unrelated to their lives.
A scenario addresses the questions, “Why is this important?” and “How might I really use this knowledge in the years to come?” The Constitution is learned, for example, not merely because this requirement appears in the social studies standards but because each student has rights that deserve protection. We can only assert and protect the rights we understand. Geometry is learned because it can be used to design gardens, rooms, houses, and schools. Science is learned because the same hypothesis-testing procedures of the laboratory can be used to challenge popular but inaccurate wisdom on matters ranging from personal health to social issues.
Developing a scenario for performance tasks helps to engage students in the assessment and also to improve the thinking of teachers. After years of addressing an academic subject as a set of abstract skills, the intellectual discipline required to ask and answer the question “Why in the world do we do this anyway?” is valuable (p. 34).
Application of an academic skill to realistic circumstances promotes deep understanding not only for the students but also for the teacher who designs and administers the assessment.
Standards-based performance assessments include a range of performance tasks, from those that must be completed prior to approaching the standard to others directly related to a demonstration of proficiency in the standard, through tasks that are designed solely for enrichment and that allow students to demonstrate proficiency far beyond the standard (see: Teaching & Learning Platforms). There is a great deal of emphasis in many schools on the concept of differentiated instruction. This reflects the reality of many a classroom, in which two dozen students represent quite a range of skills, from those two or three grade levels below the current class to those several grade levels above it (see: Peer-based Learning). Properly applied, differentiated instruction maintains the same standard of performance but allows variation in time and instructional strategy.
For differentiated instruction to be effective, it must be used in conjunction with differentiated assessment. Differentiated assessment does not mean that less is expected of students who start at a lower level or who process information more slowly. Rather, differentiated assessment means that the same accomplishment for some students may be several incremental steps toward a single task rather than completion of one task (see: Mastery Learning). A good rule of thumb is that each standards-based performance assessment should have a minimum of four tasks. Depending on the level of differentiation required in your school, more than four tasks may be required.
Scoring Guides (Rubrics) for Each Task
There is some debate in the assessment community over whether performance assessment should receive single scores (holistic scoring) or whether each task or each dimension of an assessment should receive a separate score (analytical scoring) (p. 35). Since the first principle of assessment is the improvement of teaching and learning, feedback is necessary that is as specific as possible.
The feedback should commend the students on areas of proficient or exemplary work and simultaneously point the student and teacher toward those areas that require improvement. Only an analytical scoring guide, with a separate score awarded to the student after each task, can do that. In this way, one student might blaze through the first three tasks of a performance assessment and finally slow down on the fourth task, where the challenge is greater and the work more difficult; another student might repeat the first task three times before proceeding to the second task.
The chaos that this creates in the classroom is constructive chaos: students work on various tasks because they are focusing on their area of greatest need. Avoiding this type of chaos is a false hope that only substitutes one kind of chaos for another. When assessment fails to differentiate, insisting that students all proceed through the same tasks at the same pace and engaging in the fantasy that whole-group instruction addresses uniform needs for all students, the external appearance of order is gained in exchange for the internal chaos of students who are either bored because the pace is too slow or paralyzed with fear because they do not have the opportunity to perfect the skills necessary to proceed to the next task. By creating a scoring guide for each performance task and offering students feedback immediately after each task rather than after completing the entire assessment, the teacher automatically differentiates instruction and assessment in the manner most appropriate for the students (see: Assessment Rubrics).
Creating a Collaborative Environment
Administrators may fear that their role in standards-based performance assessment must be limited because they cannot possibly have the depth of subject-matter knowledge for each grade level that the classroom teacher possesses. It is true that school leaders do not need to have expertise on everything from the Pyramids to Pythagoras, but administrators have to be experts at two qualities that must pervade every school: fairness and collaboration (p. 37).
Haycock, K. (1998). Good teaching matters: How well qualified teachers can close the gap. Thinking K-16, Summer 1998, pp. 1-16.
Kohn, A. (1999). The Schools Our Children Deserve: Moving Beyond Traditional Classrooms and “Tougher Standards.” Boston: Houghton Mifflin.
Lemann, N. (1999). The big test: The secret history of the American meritocracy. New York: Farrar, Straus and Giroux.
Marzano, R. J., Kendall, J. S., and Cicchinelli, L. F. (1998). What Americans Believe Students Should Know: A Survey of U.S. Adults. Aurora, Colorado: Mid-continent Regional Educational Laboratory.
Popham, W. J. (2000). Testing! Testing! What every parent should know about school tests. Boston: Allyn and Bacon.
Ravitch, D. (2000). Left Back: A Century of Failed School Reforms. New York: Simon & Schuster.
Reeves, D. B. (2001a). Accountability in action: A blueprint for learning organizations. Denver: Advanced Learning Press.
Reeves, D. B. (2001b). “If You Hate Standards, Learn to Love the Bell Curve.” Education Week, June 11, p. 48.
Reeves, D. B. (2002). The Leader’s Guide to Standards: A Blueprint for Educational Equity and Excellence. San Francisco, CA: Jossey-Bass, A Wiley Imprint.
Schmoker, M. (2001). The Results Fieldbook: Practical Strategies from Dramatically Improved Schools. Alexandria, Va.: Association for Supervision and Curriculum Development.
Walters, L. S. (2000). Putting cooperative learning to the test. Harvard Education Letter, 16 (3)
Wiggins, G. (1995). Assessing student performance. San Francisco: Jossey-Bass.
Wiggins, G. (1997). Executive assessment. San Francisco: Jossey-Bass.