Mandated Assessment of Educational Outcomes

The report which follows was approved by the Association’s Committee on College and University Teaching, Research, and Publication and adopted by the Association’s Council in June 1991.


Background

Toward the end of the 1980s, “state-mandated” assessment of higher education, including attempts to measure learning outcomes for undergraduates, came to the forefront of attention in several states.1 One recent study indicates that about a dozen states reported “serious efforts” to develop assessment measures in 1987; within two years, twenty-seven states reported either legislative- or board-mandated assessment initiatives in place. Still other states indicated that formal assessment policies were under study at that time, and only eight reported “nothing in place and nothing immediately planned.” Of the twenty-seven formal initiatives reported, eighteen were implemented under state board policies requiring institutions separately to develop assessment plans congruent with their own mission, while the others involved actual legislation mandating assessment as part of an overall educational reform package, leaving details of implementation to higher education officials. Four states required “across-the-board cognitive-outcomes testing”—that is, statewide mandated assessment instruments—and of these, all but one had been in place for some time. Three states had considered but rejected such tests, and even in the case of previously established basic skills tests for entering college or university students, only four states employed a common measurement instrument.2

Although so far only a minority of states has employed common test instruments, the assessment movement itself is generally conceded, by both supporters and critics, to be thriving. The August 1986 report of the Task Force on College Quality, “Time for Results: The Governors’ 1991 Report on Education,” sets forth six recommendations, calling for: (1) a clear definition of institutional roles and missions in each state; (2) a reemphasis on undergraduate instruction, especially in research universities; (3) the development in both public and private institutions of “multiple measures to assess undergraduate student learning,” with the results to be used to evaluate institutional and program quality; (4) the adjustment of funding formulas in the public sector to provide incentives for the improvement of undergraduate education; (5) a renewed commitment to access for all socio-economic groups to higher education; and (6) the requirement by accrediting agencies that information about undergraduate student outcomes be used as one basis for reaccreditation.

Several responses to the governors’ report endorse assessment in broad principle while suggesting certain steps for accomplishing its implementation. The National Association of State Universities and Land-Grant Colleges has issued a “Statement of Principles on Student Outcomes Assessment” (November 1, 1988), which stresses the improvement of student learning and performance, reliance on “incentives rather than regulations or penalties” in the process, the importance of faculty involvement, definitions of “quality indicators” appropriate to the diversity of purposes and programs in American higher education, multiple methods of assessment rather than recourse to a standardized test, the need to avoid imposing heavy costs for assessment on either state agencies or institutions, and the linkage of assessment programs to strategic planning and program review. The Commission on Institutions of Higher Education of the North Central Association, at its meeting of October 27, 1989, approved a “Statement on Assessment and Student Academic Achievement,” which seems on balance to confirm existing principles of institutional autonomy and the primary responsibility of faculty for assessment. Nonetheless, it appears that no statement has adequately defined the nature of the faculty role in overseeing either the establishment or the subsequent use of an assessment process. This report attempts to accomplish that end.

Public discussion has not always made clear whether assessment instruments are designed simply to provide a diagnostic tool for self-improvement or also to furnish a basis for budgetary allocations. It is worthy of note, however, that, at the time the Committee on College and University Teaching, Research, and Publication undertook its review, at least one state reserved the right, beginning in 1990, to return to the state treasury up to 2 percent of its appropriation for any institution failing to comply with the statute mandating assessment, and that conversely another state promised an additional increment of up to 5 percent in public funds to cooperating institutions, based on indicators of institutional quality as determined through assessment.

The American Association of University Professors has long recognized that the practical difficulties of evaluating student learning do not relieve the academic profession of its obligation to attempt to incorporate such evaluation into measures of teaching effectiveness. The Association’s 1975 Statement on Teaching Evaluation contains the following comments on “student learning”:

Evaluation of teaching usually refers to the efforts made to assess the effectiveness of instruction. The most valid measure is probably the most difficult to obtain, that is, the assessment of a teacher’s effectiveness on the basis of the learning of his or her students. On the one hand, a student’s learning is importantly influenced by much more than an individual teacher’s efforts. On the other, measures of before-and-after learning are difficult to find, control, or derive comparisons from. From a practical point of view, the difficulties of evaluating college teaching on the basis of changes in student performance limit the use of such a measure. The difficulties, however, should not rule out all efforts to seek reliable evidence of this kind.

It is also important to note that many of the measures proposed by proponents of mandated assessment—examinations of various kinds, essays, student portfolios, senior theses and comprehensive examinations, performances and exhibitions, oral presentations, the use of external examiners—have been in place for many years. So have certain standardized tests that have become widely accepted for specific ends, such as the SAT or ACT for purposes of admission to undergraduate work, or the GRE and the LSAT for admission to post-baccalaureate programs. Other indicators favored by proponents of assessment, such as alumni satisfaction and job placement, have been used in recurring academic program reviews, some of which have been undertaken through institutional initiatives, others of which have been mandated by state agencies. As a general rule it is safe to observe that undergraduates in American postsecondary education, and their academic programs, are more intensively and perhaps more frequently evaluated than are those in postsecondary education anywhere else in the world. If many of the aforementioned measures have long been in place, the question naturally arises: What is different about the call for mandated assessment in its present form, and why is it seen as necessary by many policy makers, including some within the higher education community itself?

The present assessment movement is in part a response to increased demands for public accountability and in part a by-product of various national reports on the state of higher education in the late 1980s which criticized both growing research emphases in the nation’s colleges and universities and the quality of undergraduate education. The previously cited governors’ report states that, despite “obvious successes and generous funding,” “disturbing trends” are evident in both objective and subjective studies of educational effectiveness. The report complains that “not enough is known about the skills and knowledge of the average college graduate,” and that the decline in test scores and the frustrations voiced about the readiness of graduates for employment exist in a climate of institutional indifference to educational effectiveness. The opening paragraph of “Time for Results” charges: “Many colleges and universities do not have a systematic way to demonstrate whether student learning is taking place. Rather, learning—and especially developing abilities to utilize knowledge—is assumed to take place as long as students take courses, accumulate hours, and progress ‘satisfactorily’ toward a degree.” Such allegations indicate that the call for an increased emphasis on assessment not only will be increasingly linked to subsequent budgetary increments or decrements, but also will, though often inexplicitly, be accompanied by external pressures on the internal academic decision-making processes of colleges and universities.

Critics of mandated assessment have questioned whether the premise of assessment proponents, namely, that higher education has been generously funded in recent years, can withstand the light of scrutiny. Those individuals—board members, administrators, faculty members, students, and staff—who are more directly in touch on a daily basis with the working realities of campus life have noted that support has not kept pace with growth. They see an increased reliance on part-time or short-term faculty, starved scientific and technical laboratories, the deferment of routine maintenance costs, the growth of academic support staffs at a rate that outstrips the number of new tenure-eligible faculty positions, and patterns of funding that follow enrollment trends without regard to the relative priority of subjects in those very liberal arts in which undergraduate unpreparedness has been decried by national study commissions. Under these circumstances, critics question the relevance and importance of assessment when access to higher education has been expanded without a corresponding expansion in the base of support.

In the remainder of this report we examine Association policies that provide a historical context for considering the subject of assessment, and then consider specific issues related to present discussions of mandated assessment. After this review we conclude with a set of recommendations that we believe should govern discussions of the implementation of assessment procedures on particular campuses.

Applicable Association Policies

The Association’s long-standing principles relating to academic freedom and tenure, and to college and university government, provide a broad and generally accepted context within which to treat the question of assessment in higher education. These principles are embodied in a number of documents from which the earlier-cited Statement on Teaching Evaluation derives its own more specific applicability. The joint 1940 Statement of Principles on Academic Freedom and Tenure sets forth in its preamble the principle that “academic freedom in its teaching aspect is fundamental to the protection of the rights of the teacher in teaching and of the student to freedom in learning” and goes on to stipulate that “teachers are entitled to freedom in the classroom in discussing their subject.”3 The 1966 Statement on Government of Colleges and Universities spells out those areas of institutional life requiring joint effort, and those falling within the primary responsibility of the governing board, the president, and the faculty, respectively. It also contains a concluding section on student status.4

The direct implications of mandated assessment for academic freedom and tenure have not yet become a centerpiece of public discussion. Proponents of mandated assessment argue that the impact of assessment instruments on the conduct of individual classes not only has been, but will remain, negligible in the extreme, and that the complaints that such instruments will force individual faculty members to “teach to the test” misrepresent both the purpose and the techniques of assessment in the crudest and most reductive terms. They deny that mandated assessment would ever be based on a single quantitative instrument. While it is true that, thus far, the assessment movement does not appear to have resulted in any overt infringement of academic freedom as traditionally understood, the question remains whether, in more subtle ways, assessment may begin to shape the planning and conduct of courses. We believe, moreover, that the demand for mandated assessment is related to recent calls for the “post-tenure” review of faculty performance, inasmuch as an increased public demand for “accountability” on the part of colleges and universities, and real or alleged dissatisfaction with their internal processes of decision making, are common themes underlying both movements. This possible interplay between two movements which heretofore have been treated as distinct will be commented upon further below and, in our view, ought to be a continuing subject of discussion and review by the Association.

Our remaining comments in this section focus on the Statement on Government as a frame of reference for considering mandated assessment. The statement is premised on the interdependence of governing board, administration, and faculty. It emphasizes the institutional self-definition within which these constituent groups work together, giving recognition to the fact that American higher education is not a unitary system but rather contains many diverse institutions. While unequivocal in its position that “the faculty has primary responsibility for such fundamental areas as curriculum, subject matter and methods of instruction, research, faculty status, and those aspects of student life which relate to the educational process,” the statement allows for the possibility that external bodies with jurisdiction over the institution may set “limits to realization of faculty advice.” We interpret this proviso to refer to those areas—for example, the allotment of fiscal resources within a statewide system—which are primarily the responsibility of “other groups, bodies, and agencies.” It may, for example, be the function of a state agency, and legitimately so, to determine (and provide reasonable grounds to the affected institution for so determining) that the establishment of a new professional school on a particular campus cannot be justified in terms of existing resources. This finding is quite a different matter from external action designed to force particular internal revisions in an existing educational program.5

As we see it, the question of most fundamental interest in terms of long-standing Association policy is the extent to which, despite disclaimers by its proponents, the mandatory assessment movement thus far has tended to represent a form of external state intrusion, bypassing the traditional roles of governing board, administration, and faculty, as well as both duplicating and diminishing the role of the independent regional accrediting bodies. A derivative question is whether mandated assessment, to the extent that it is driven by the felt need to compare institutions, requires measures of quantification applicable to all those institutions, and thus tends to diminish the autonomy and discourage the uniqueness of individual campuses. We take up this question as the first of several issues in the section that follows.

Some Specific Assessment Issues

1.     Institutional Diversity

The manner in which institutional self-identity is defined varies both within and between the private and public sectors, though the historical basis for such distinctions may have been eroded to some extent in recent years. Thus private institutions may seem to have a greater freedom from direct state intrusion, but in states with scholarship or tuition-assistance programs offered without regard to whether the student is attending a public or a private college or university, it is increasingly difficult for all but the most prestigious “independents” to operate without some attention to state policy. Within public systems most state colleges and universities reflect different missions in their degree programs and student clientele. Many of the proponents of assessment link it to this fact and call for clearer and more distinct descriptions of roles and missions: a particular state assessment plan may indeed specify that Campus A is not necessarily expected to follow the same procedure for assessment as Campus B, and the plan may therefore seem to endorse the principle of institutional diversity.

Such lip service to diversity, however, obscures some serious issues. Even a reasoned recognition of institutional differences of the sort that mandated assessment plans claim to recognize may result in the stifling of growth and development at an institution in the process of change, or the favoring of one campus with a particular set of goals over another in the same system. The governors’ report makes it clear that “universities that give high priority to research and graduate instruction” will be the object of particularly close scrutiny in the assessment of undergraduate educational outcomes, thus raising the question of whether one implication of institutional definition is the homogenization of different campuses within a particular system, and whether there will not be a de facto ascription of superior value to those institutions that have remained devoted primarily to undergraduate teaching. It is doubtless unwise and undesirable for all institutions in a state system to aspire to research university status, but it is equally unwise and undesirable as a matter of social policy to depreciate, directly or indirectly, the research mission of those campuses capable of carrying it out.

Proponents of mandated assessment have also undercut their own assertions of respect for institutional differences by pointing to specific institutions as exemplars of “good practice.” They do not pause to consider whether a model devised at one type of institution, for example, a small Roman Catholic liberal arts college or a middle-sized state institution initially founded for the purpose of teacher training and now embracing an expanded purpose, is necessarily—or properly—transferable to other kinds of institutions. Nor do they pause to note that successful assessment tools may be successful not because of their intrinsic merit, but because the ambience and scale of a particular campus already guarantee those conditions of teaching and learning conducive to such an assessment. The findings and conclusions of a study of student progress at a primarily residential four-year liberal arts college with a high rate of successful degree completion may tell us little or nothing about a large urban campus with significant dropout rates and a greater number of student transfers. Either of these institutions might be more appropriately compared to a peer institution in another state than to another institution that happens to be in the same state. To encourage institutions to develop their own instruments for assessment does not necessarily mean that the outcomes of the various assessment instruments will be properly acknowledged as logical extensions of institutional differences, since, as we have already said, higher education agencies tend to want the kinds of data that facilitate comparisons among institutions.

2.     Skills Versus Values

Although any mandated assessment plan might be resisted as an effort to increase external political control over colleges and universities, or dismissed as a cynical public relations ploy, we have no doubt that many supporters and practitioners of mandated assessment are motivated by legitimate and well-intentioned concern for educational quality. But their motivations are diversely grounded and sometimes mutually exclusive. Some educational and political leaders, viewing with alarm the decline in standardized test scores nationwide, tend to focus their attention on the need for colleges and universities to certify that students have attained certain basic educational skills. Others profess primary concern for the student’s acquisition of moral values. For them, assessment presents itself as an additional means for achieving curricular change of the sort called for in various books and national reports published in the second half of the 1980s. Though inevitably these goals overlap—indeed, most in either group would probably state their belief in the importance of the purposes espoused by the other—they cannot be reached by similar means or tested by the same instruments. Indeed, as a practical matter, it is not even clear that in a time of budgetary constraints they can both be realized as a part of the same agenda. Given such competing demands, it is likely that mandated assessment will force a change in curriculum, not in order to produce a better-educated student, but to enhance the “measurability” of the outcomes.

As a general rule, those standardized measurement instruments that are the easiest to replicate are the least valid in any context other than the assessment of basic skills. As we have already noted, colleges and universities already employ such standardized tests as the ACT and the SAT to assist them in determining the admissibility of prospective undergraduate students, just as graduate and professional programs employ a variety of other standardized tests to measure undergraduate preparation at their respective entry levels. What lends this process some degree of credibility—despite the well-recognized misuse of these instruments when they are devised without proper regard for persons of varying cultural backgrounds—is that the interpretation of results is usually tailored to the mission of the particular institution. Test scores, if they are used in conjunction with other evaluative instruments such as a student’s standing as a graduating senior in high school, may argue for admission to one institution if not to another. This capacity to differentiate among students is the particular genius of American higher education: that in its diversity of institutional purposes, it offers a flexible response to different student needs and abilities while ensuring that access to higher education remains a proper part of education for the citizenry of a democratic society. Historically, faculty members and administrators have assumed that test measurements are retrospective, determining admissibility on the basis of the student’s demonstrated pre-collegiate or pre-professional skills.

Standardized outcomes testing in general education directed to the acquisition of values as well as facts is another matter, and it is worth noting that neither of the major national testing services has yet succeeded in devising a standardized general education examination that satisfies either the institutions or the test designers themselves. Though proponents of mandated assessment may wish to pay tribute both to the acquisition of skills and to the acquisition of a broadened general education, it is unlikely that the resources that would be required for a basic improvement in skills alone at the collegiate level could also be devoted to enhancing the environment necessary for the transmission of diverse content and the development of critical thinking so central to the preparation of students in the liberal arts. The costs of conducting assessment drain funds away from other institutional needs, such as smaller classes, reduced dependence on part-time faculty, and adequate numbers of full-time faculty members to staff both graduate and undergraduate instruction—needs which are at loggerheads, too, with external demands for remediation at the college level. Assessment carried on without proper attention to the incompatibility of the two goals—the attainment of skills and the learning of values—can have only one result: shifting the burden of blame to the faculty, just as in the K–12 system many teachers and principals are operating under a state mandate for reform without adequate funds to implement it and are now under pressure to show measurable results.

Under these conditions, values are likely to be subordinated to skills, and the quality of higher education as higher education, rather than as remediation, will suffer. When the mission of a particular institution dictates that scores be used to determine placement in basic skills courses so as to compensate for inadequate prior preparation at the primary and secondary levels, then the funding of such instruction needs to be provided at a level that protects the viability of instruction appropriate to undergraduates who are ready for a college-level curriculum. Under present conditions, such additional resources as are available would be better devoted to remedying inadequate student preparation than to attempting to assess it.

3.     Assessment in the Major Field of Study

Most faculty members agree on the importance of assessing systematically a student’s competence in the major, as is shown by the multiplicity of forms of assessment that many departments employ. Yet even in this disciplinary context the range of possible student options after graduation makes it unlikely that an externally mandated assessment instrument would do anything more than gauge the lowest common vocational denominator. The major is properly regarded as a vehicle for deepening the student’s general education and for sharpening the student’s independent research and study skills, and thus standardized assessment of achievement in the major field raises precisely the same objections it does in general education.

Learning for its own end—for the purpose of developing breadth, intellectual rigor, and habits of independent inquiry—is still central to the educational enterprise; it is also one of the least measurable of activities. Whereas professional curricula are already shaped by external agencies, such as professional accrediting bodies and licensing boards, the liberal arts by contrast are far more vulnerable to intrusive mandates from other quarters; for example, the governors’ report professes to find evidence of program decline “particularly in the humanities.” To be sure, even in the liberal arts a student’s accomplishment in the major can be measured with relative objectivity by admissions procedures at the graduate and professional level that include GRE scores as one of the bases for judgment. But a student majoring in English may wish to pursue a career in editing, publishing, journalism, or arts administration (to name only a few); a political science major may have in mind a career in state or local government or in the U.S. State Department. Either of them may have chosen his or her major simply out of curiosity, or perhaps out of a desire to be a well-educated citizen before going to law school or taking over the family business.

For these reasons we suggest that the success of a program in the major field of study is best evaluated not by an additional layer of state-imposed assessment but by the placement and career satisfaction of the student as he or she enters the world of work. Whereas imposed assessment measurements will at best—and rightly—attract faculty cynicism and at worst lead to “teaching to the test,” no responsible faculty member will ignore the kinds of informed evaluation of a program available through a candid interchange with a graduating senior or recent graduate.

4. “Value-Added” Measures

Despite their occasional disclaimers, proponents of mandated assessment frequently desire quantifiable outcome data based on a comparison of students’ entrance and exit performance at a postsecondary institution, or their performance at entrance and at the beginning of the junior year, before their attention turns primarily to their work in the major. Our concerns are two: (a) whether the data, based on what are sometimes called “value-added” measures, realistically reflect the diverse structures of American higher education and the different kinds of student involvement in it; and (b) whether value-added measures are sound even in narrow quantitative terms.

Perhaps the crudest form of value-added testing involves the administration of an identical general-education examination to the same body of students twice during their college careers. Even if—which we doubt—the acquisition of knowledge could be measured by the mere repetition of an earlier test, the uncritical implementation of value-added measures is quite simply unsustainable in light of the increasingly migratory, part-time, and drop-in–drop-out patterns of many American undergraduates. Like debates over what constitutes the one true curriculum or reading list, value-added measures ignore the fact that any system which presupposes a particular pace or place for student learning is at best applicable to a diminishing, and in some cases relatively elite, proportion of the student population.

We do not believe that the most important, or even useful, kind of learning that takes place at any level of education is readily quantifiable or results from the accumulation of facts by rote. Yet such an emphasis is implied in value-added measures, since the words themselves betray the assumption that one must add something measurable to something else in order to evaluate educational outcomes.

5. Accountability Versus Self-Improvement; or, Does Involving the Faculty in the Process Make It All Right?

Although, as we noted earlier, proponents of assessment have argued that the purpose of assessment is to provide diagnostic tools for self-improvement, both institutional and personal, in some cases direct budgetary consequences may ensue not only from the choice of noncompliance over compliance, but also from the results of the assessment itself. A vivid example of the slippery slope down which higher education could descend can be found in those segments of the K–12 system in which standardized tests have been employed to appraise curricular and teaching effectiveness and to group children by presumed intellectual level. What emerges as the end result of such a process is no longer an educational matter but rather a policy issue external to the schools, with the resulting data being interpreted by persons not necessarily expert in primary and secondary education, and the faculty harboring deep-seated feelings of disenfranchisement in the process.

Proponents of mandated assessment might respond that the historic position of faculties in higher education sufficiently guarantees the continuing primacy of the faculty in the assessment process. If, the argument runs, faculty members develop and administer the assessment instruments, and these are used primarily for pedagogic self-improvement, then what can the objection be?

The second of these points—as to whether pedagogic, curricular, and thus institutional, self-improvement is really the primary reason for mandated assessment—has already been questioned. We have seen sufficient evidence that such a call for self-improvement does not take place in a fiscal or policy vacuum. Most faculty members, in our experience, are perfectly willing to undertake a periodic look at their own effectiveness. Increasing numbers of institutions have been developing programs to devise incentives for such self-examination, which we regard as a continuing faculty responsibility. But self-examination is best conducted in a climate free of external constraint or threats, however vaguely disguised.

The nub of the problem lies, as it has throughout this report, not so much in the noun assessment as in the modifier mandated. If, indeed, proponents of assessment want to express support for measures of student progress that are based on principles of sound instruction—papers, essay examinations, theses, special projects, or performances or exhibitions—then an informed dialogue between the institution’s representatives and the public may be usefully carried on. But if mandated assessment presupposes instruments that move further in the direction of greater standardization and quantification, then the adoption of such instruments requires not only the participation of testing experts, but also an involvement by faculty members and administrators in the development of discipline-specific or general-education versions of such tests.

The fact that faculty members, rather than external agencies, select or even participate in the design of the test instrument does not substantially diminish the problems of standardization and reductionism inherent in the process of developing a reliable test instrument. And in view of the political forces that drive such demands, the assertion that the faculty can oversee or even control the design is of little meaning if the requesting agency wants to accumulate data susceptible of statistical formulation and translatable into budgetary decisions.

We have already implied that one academic, as opposed to budgetary, consequence of mandated assessment is “teaching to the test,” a pressure on faculty members to transmit to their students easily testable nuggets of information rather than broader conceptual issues and methods of reasoning. We must also acknowledge that for some faculty members a move toward standardized outcomes measurement might in fact represent a tempting relief from the exigencies of grading papers and essay examinations. An unwelcome result for both higher education and the public would be the exodus of better faculty members to other careers (as has already happened in certain segments of the K–12 system for much the same reason) or, at the very least, to other campuses not yet infatuated with “value-added” assessment measures.

Furthermore, faculty participation in the development of mandated assessment instruments—a sine qua non if some degree of faculty control were to be exerted over the process—would represent yet one more burden added to the existing teaching, research, and public service responsibilities that faculty members already carry. The added burden might be acceptable if it contributed to furthering the central academic purposes of a college or university, but, as we have sought to show, mandated assessment is not likely to achieve that result.

To reconnect the discussion briefly with the earlier mention of post-tenure reviews, we believe that both mandated systems of post-tenure review and mandated procedures for assessment, even if they involve faculty collaboration, are strikingly similar both in their demands and in their adverse practical outcome. In both cases it can be said that faculty members have been evaluating each other and assessing their students’ learning outcomes for many years. In both cases the faculty is informed that it can participate in, or even control, the procedures and thus retain its traditional role of primary responsibility, whether over faculty status or over academic programs. But in both cases the faculty is also being told that the very instruments it has devised in the past are no longer sufficient to ensure either faculty quality or student learning, and that new mandated instruments are needed to satisfy public demands for greater accountability. Thus the logic of mandated assessment requires that faculty judgment be superseded if some agency external to the campus deems the need for public accountability not to have been met.

Proponents of mandated assessment cannot have it both ways. Either the purpose of mandated assessment is the improvement of teaching and learning in an atmosphere of constructive cooperation, or it shifts the responsibility for educational decisions into the hands of political agencies and others who are not only removed from, but, by the nature of their own training and biases, not versed in, the purposes and processes of higher education.

Recommended Standards for Mandated Assessment

American higher education has generally encouraged frequent assessment of student learning. The recent movement to mandate such assessment differs, however, in that it emphasizes evaluation of overall instructional and programmatic performance rather than individual student achievement. The American Association of University Professors recognized in its Statement on Teaching Evaluation that assessment of student learning outcomes may provide the most valid measure—though also the most difficult to obtain reliably—for the evaluation of teaching effectiveness. The Association has also recognized that such assessment is the responsibility of the faculty, whose primary role in curriculum and instruction has been set forth in the Statement on Government of Colleges and Universities.

Where assessment of student learning is mandated to ensure instructional and programmatic quality, the faculty responsibility for the development, application, and review of assessment procedures is no less than it is for the assessment of individual student achievement. Since the Statement on Teaching Evaluation was first formulated, increased public attention has been turned toward various plans for externally mandated assessments of learning outcomes in higher education. Some of the plans have been instituted on short notice and with little or no participation by faculty members who, by virtue of their professional education and experience, are the most qualified to oversee both the details and the implications of a particular plan. Often these plans are the result of external political pressures, and may be accompanied by budgetary consequences, favorable or unfavorable, depending on the actual outcomes the mandated schemes purport to measure.

The Association believes that the justification for developing any assessment plan in a given case, whether voiced by a legislative body, the governing board, or one or more administrative officers, must be accompanied by a clear showing that existing methods of assessing learning are inadequate for accomplishing the intended purposes of a supplementary plan, and that the mandated procedures are consistent with effective performance of the institutional mission. The remaining question involves the principles and derivative policies that should prevail when agencies external to colleges and universities—state legislatures, regional and professional accreditation bodies, and state boards of higher education—insist that assessment take place. We believe that the following standards should be observed:

  1. Central to the mission of colleges and universities is the teaching-learning relationship into which faculty members and their students enter. All matters pertinent to curricular design, the method and quality of teaching, and the assessment of the outcome in student learning must be judged by how well they support this relationship.
  2. Public agencies charged with the oversight of higher education, and the larger public and the diverse constituencies that colleges and universities represent, have a legitimate stake in the effectiveness of teaching and learning. Their insistence that colleges and universities provide documented evidence of effectiveness is appropriate to the extent that such agencies and their constituencies do not: (a) make demands that significantly divert the energies of the faculty, administration, or governing board from the institution’s primary commitment to teaching, research, and public service; or (b) impose additional fiscal and human burdens beyond the capacity of the responding institution to bear.
  3. Because experience demonstrates the unlikelihood of achieving meaningful quantitative measurement of educational outcomes for other than specific and clearly delimited purposes, any assessment scheme must provide certain protections for the role of the faculty and for the institutional mission as agreed upon by the faculty, administration, and governing board, and endorsed by the regional accrediting agency.

    Specifically:

    (a) The faculty should have primary responsibility for establishing the criteria for assessment and the methods for implementing it.

    (b) The assessment should focus on particular, institutionally determined goals and objectives, and the resulting data should be regarded as relevant primarily to that purpose. To ensure respect for diverse institutional missions, it is important that uniform assessment procedures not be mandated across a statewide system for the purpose of comparing institutions within the system. For a further development of this point, see (f) below.

    (c) If externally mandated assessment is to be linked to strategic planning or program review, the potential consequences of that assessment for such planning and review should be clearly stated in advance, and the results should be considered as only one of several factors to be taken into account in budgetary and programmatic planning.

    (d) The assessment process should employ methods adequate to the complexity and variety of student learning experiences, rather than rely on any single method of assessment. To prevent assessment itself from making instruction and curriculum rigid, and to ensure that assessment is responsive to changing needs, the instruments and procedures for conducting assessment should be regularly reviewed and appropriately revised by the faculty. We suggest the following considerations with respect to both quantitative and qualitative measures:

    (i)   Quantitative performance measures exhibit two specific dangers. First, reliable comparisons between disparate programs, or within individual programs over time, demand narrow and unchanging instruments, and thus may discourage necessary curricular improvement and variety. Second, even where such instruments are ordinarily available or responsive to changing curricula (as with certification and graduate record examinations), they may be unreflective of diverse purposes even within a single discipline or field of study. Thus, such instruments should not be used as the exclusive means of assessment.

    (ii)  Qualitative performance measures are often pedagogically superior to quantitative tests. These measures include such devices as capstone courses, portfolios, exhibitions, senior essays, demonstrations and work experiences, and the use of external examiners. The use of these measures, however, is costly and implies a curricular decision to shift additional resources to evaluate outcomes rather than to improve student learning. Hence, adoption of such procedures should include a review of costs and benefits compared to other curricular options such as greater investment in the support of first- and second-year students.

    (e) If a state agency mandates assessment, the state should bear the staffing and other associated costs of the assessment procedure, either directly or in the form of a supplemental budgetary allocation to the campus for the purpose.

    (f) If comparative data from other institutions are required for purposes of assessment, the faculty should have primary responsibility for identifying appropriate peer units or peer institutions for those purposes, and (as with program planning referred to in [c] above) the results of that assessment should be only one of the several factors in arriving at such comparisons.

    (g) Externally mandated assessment procedures are not appropriate for the evaluation of individual students or faculty members and should not be used for that purpose.

Notes

1 Throughout this document the word “mandated” usually implies an external mandate, that is, one imposed by an agency outside the college or university. On occasion, however, institutional governing boards (as opposed to a superboard or coordinating board) or even administrative officers may themselves deliver such a mandate. In such instances the roles of the respective parties should be defined in terms of the broad principles of the Association’s Statement on Government of Colleges and Universities, discussed below.

2 Statistical information and general commentary on trends are here drawn from Peter Ewell, Joni Finney, and Charles Leath, “Filling in the Mosaic: The Emerging Pattern of State-Based Assessment,” AAHE Bulletin 42 (April 1990): 3–5.

3 AAUP, Policy Documents and Reports, 9th ed. (Washington, D.C., 2001), 3.

4 Ibid., 217–23.

5 Under the heading “Joint Effort,” the “Statement on Government” (Policy Documents and Reports, 218) adds: “Special considerations may require particular accommodations: (1) a publicly supported institution may be regulated by statutory provisions, and (2) a church-controlled institution may be limited by its charter or bylaws. When such external requirements influence course content and manner of instruction or research, they impair the educational effectiveness of the institution” (emphasis added).