The Developmental Evaluation of School Improvement Networks
Donald J. Peurach1, Joshua L. Glazer2, and Sarah Winchell Lenhoff3
Abstract
The national education reform agenda has rapidly expanded to include attention to continuous improvement research in education. The purpose of this analysis is to propose a new approach to “developmental evaluation” aimed at building a foundation for continuous improvement in large-scale school improvement networks, on the argument that doing so is essential to producing the intellectual capital needed to replicate effective practices and desired outcomes throughout these networks. We begin by developing a rationale for developmental evaluation, both to illuminate the need and to discuss its coordination with other forms of evaluation. We continue by proposing a logic of developmental evaluation to support analyzing networks as learning systems. We then use that logic to structure a framework for developmental evaluation to support evaluators, network executives, and other stakeholders in analyzing and strengthening the foundation for continuous improvement in a given network. Our analysis suggests that building a foundation for continuous improvement among a large number of networks is an educational reform agenda unto itself, one that must be supported and sustained if these networks are to succeed at the level expected under current accountability regimes.
1University of Michigan, Ann Arbor, MI, USA
2George Washington University, DC, USA
3Education Trust–Midwest, Royal Oak, MI, USA
Corresponding Author: Donald J. Peurach, School of Education, University of Michigan, 610 E. University, Ann Arbor, MI 48109, USA. Email: [email protected]
Keywords
evaluation, developmental evaluation, impact evaluation, best practice, intellectual capital, educational reform, innovation, knowledge production, networks, organizational learning, replication, scale, sustainability
This is a conceptual analysis that addresses a practical problem: improving the production, use, and management of intellectual capital in the service of large-scale education reform.1 By intellectual capital, we mean the practical, useable knowledge needed to coordinate and improve the performance of students, teachers, and school leaders in large numbers of schools. This intellectual capital is captured in formal resources (e.g., manuals, tools, digital media, and other artifacts); in individuals and relationships among them; and in relationships among schools and agencies, organizations, and constituents in their environments.
The analysis is grounded in a high leverage context: school improvement networks. These are new types of educational systems in which a central, “hub” organization collaborates with “outlet” schools to enact schoolwide improvement programs (Peurach & Glazer, 2012). Some of these networks operate outside of the K-12 governance structure: for example, networks operated by comprehensive school reform providers, charter management organizations, and education management organizations. Some operate within the K-12 governance structure: for example, “turnaround zones” in which newly constituted agencies centrally coordinate improvement efforts in large numbers of low performing schools.
Over the past 20 years, these networks have benefited from billions of dollars in public and philanthropic investment. Though some enlist functional schools seeking to transform existing capabilities, support is weighted heavily toward networks enlisting underperforming public schools and newly created charter schools, either to improve schools serving large populations of at-risk students or to create alternatives.
Given the accountability pressures under which they have emerged, the legitimacy and viability of school improvement networks are tightly tied to demonstrating levels of student performance on impact evaluations that many institutionalized educational systems have long struggled to obtain. Yet complex problems in schools, weak cause–effect knowledge on which to base programs, fledgling hub organizations, and turbulent environments interact to greatly reduce the prospects of quickly building large-scale networks of schools able to support students any better than institutionalized public schools (Berends, Bodilly, & Kirby, 2002; Center for Research on Education Outcomes,
2009; Cohen, Peurach, Glazer, Gates, & Goldin, 2014; Education Sector, 2009; Lake, Dusseault, Bowen, Demeritt, & Hill, 2010; Peurach, 2011).
Increasing recognition of the complexity and uncertainty of large-scale, systemic school improvement has been instrumental in motivating the Institute of Education Sciences (IES, 2013) to engage external evaluators, network executives, and other stakeholders in the continuous improvement of educational systems, including school improvement networks. This IES initiative is complemented by increasing use of design-based research to support the development of effective educational interventions (Anderson & Shattuck, 2012; Penuel, Fishman, Cheng, & Sabelli, 2011) and by emerging efforts to support continuous improvement in practice-focused educational networks (Bryk, Gomez, & Grunow, 2010).
New support for continuous improvement marks an advance in the national reform agenda, beyond a primary focus on impact to a complementary focus on producing, using, and managing the intellectual capital needed to demonstrate impact. However, in the case of school improvement networks, one problem is that there is little to suggest that hubs and schools have the foundation—the essential strategies and operational supports—to manage that which is to be continuously improved: intellectual capital. A second problem is that there is little to suggest that external evaluators, network executives, and other stakeholders are prepared to collaborate in improving intellectual capital in these novel and emerging systems.
The purpose of this analysis is to take up these two problems. We do so by proposing an approach to “developmental evaluation” aimed at establishing a foundation for continuous learning and improvement in school improvement networks. We begin by developing a rationale for developmental evaluation as a complement both to impact evaluation and to other improvement-focused evaluation strategies. We continue with a logic of developmental evaluation to support thinking and reasoning about networks as learning systems. We then use that logic to structure a framework for developmental evaluation to support evaluators, network executives, and other stakeholders both in critically analyzing networks as learning systems and in strengthening their foundation for continuous learning and improvement.
Our analysis provides a novel perspective on school improvement networks as systems that produce, use, and manage the intellectual capital needed to improve education for many poor and at-risk students. Indeed, our analysis suggests that building a foundation for continuous improvement among a growing population of school improvement networks is an educational reform agenda unto itself, one that must be supported, sustained, and carefully managed if networks are to succeed at the level expected under current accountability regimes.
Rationale for Developmental Evaluation
We begin with a critical analysis of conventional means of evaluating school improvement networks. Our argument is that a disconnect between the conditions that would support success on widely required impact evaluations (on one hand) and the complex and uncertain conditions under which these networks actually operate (on the other) warrants commensurate support for developmental evaluation aimed at building a foundation for continuous learning and improvement within these networks.
Impact Evaluation, Its Goals, and Its Logic
What distinguishes school improvement networks from other large-scale reform strategies is that they take schools (rather than students) as the unit of intervention: not just their formal roles, structures, and technologies but also the teachers and leaders in schools, their individual capabilities and motivations, and their collective capabilities and culture.
The intervention, itself, is a comprehensive model for establishing and improving schoolwide operations. These models typically include complex organizational blueprints for structuring and restructuring schools, as well as designs for the practice (i.e., the day-to-day work) of leaders, teachers, and students. Enacting these designs for practice, in turn, requires intellectual capital: practical, useable knowledge as retained and elaborated in material, digital, and other resources; as manifest in individual teachers, leaders, external coaches, and their communities of practice; and as shared through practice-based learning opportunities. This intellectual capital is increasingly recognized as an essential resource for effecting coordinated improvements in leadership, instruction, and student achievement (Aladjem & Borman, 2006; Camburn, Rowan, & Taylor, 2003; Cohen & Ball, 2007; DeArmond, Gross, Bowen, Demeritt, & Lake, 2012; Rowan, Correnti, Miller, & Camburn, 2009a, 2009b).
These networks are emerging in policy contexts that hold them increasingly accountable for quickly establishing program impact.2 Rationales for doing so include assuring due diligence in the use of formidable public and private investment; stimulating competition for students and funding based on effectiveness; and recognizing the potential consequences of these networks for the lives of many students, teachers, and school leaders (for better or worse).
Impact evaluation typically has two goals (Raudenbush, 2007; Slavin & Fashola, 1998). The first is to identify a “treatment effect” evidenced by a positive, statistically significant difference in outcomes between students in
participating and nonparticipating schools, with more rigorous evaluations seeking to establish a causal relationship between the treatment (i.e., the schoolwide model) and outcomes. The second is to identify whether the treatment effect can be replicated beyond early adopters and in a broader pool of schools.
Replicable treatment effects are increasingly examined using a four-stage, “tiered evidence” sequence that culminates with impact evaluation.3 Each stage marks an increase in available funding, the number of participating schools, the standards of evidence, and, thus, the costs and sophistication of evaluation. Each stage also marks movement from formative to summative evaluation: that is, from evaluations that inform the incremental improvement of the schoolwide model to evaluations that determine its replicable effectiveness. A combination of issues (e.g., funding cycles, the need to ensure due diligence, and the desire to capitalize quickly on investments) often interacts to drive the evaluation sequence along a 7- to 14-year timeline4 (the stage durations, tallied after the list below, sum to this window):
1. Evaluate a proposed program for its use of scientifically based research or other sources of “best practice” (1-2 years, preimplementation);
2. Implement in one or a small number of schools to establish “proof of concept,” with success evidenced via descriptive and other qualitative studies (1-3 years);
3. Increase the installed base of schools and use more rigorous research methods (e.g., matched-comparison designs) to examine the magnitude and statistical significance of program effects on student outcomes (2-4 years);
4. Further increase the installed base and use even more rigorous methods (e.g., quasi-experimental designs, randomized control trials, and meta-analyses) to further examine the magnitude and significance of effects (3-5 years).
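Summing the minimum and maximum stage durations reported above recovers this 7- to 14-year window:

\[
\underbrace{(1\text{--}2)}_{\text{best-practice review}}
+ \underbrace{(1\text{--}3)}_{\text{proof of concept}}
+ \underbrace{(2\text{--}4)}_{\text{matched comparison}}
+ \underbrace{(3\text{--}5)}_{\text{rigorous impact}}
= 7\text{--}14 \text{ years}
\]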
This four-stage evaluation sequence is coupled closely with assumptions that the development of effective, replicable programs adheres to a sequential “RDDU” logic: research, development, dissemination, and utilization (Rowan, Camburn, & Barnes, 2004; see, also, Rogers, 1995). Basic and applied research feed development and small-scale pilots, from which follow rapid and widespread dissemination and effective use. This sequential, diffusion-centered logic model is highly institutionalized: for example, as evidenced by the use of this logic as the basis for the four-phase progression of the New American Schools initiative (Bodilly, 1996), the current goal
structure of the IES (2012b), and the three-stage “development/validation/ scale up” sequence within the federal Investing in Innovation (i3) program (U.S. Department of Education, 2010).
Questionable Assumptions
The RDDU logic is, itself, based on a set of assumptions about conditions that would enable the rapid development of effective, replicable programs in a 7- to 14-year window:
• Clear, shared understandings of the problems of (and goals for) schools.
• A knowledge infrastructure that includes a basic research enterprise providing robust, cause–effect knowledge; an applied research enterprise providing useable, research-based and research-validated components; and a professional education system that produces human resources capable of developing, supporting, and using these components.
• The possibility of hubs working with small numbers of schools in a short period of time to integrate previously tested and newly developed components into a generally effective schoolwide improvement model.
• The possibility of rapidly and faithfully transferring an established, multicomponent, schoolwide program to large numbers of schools that can quickly incorporate and use it to effect intended outcomes.
The problem, however, is that these conditions rarely (if ever) hold in practice, thus complicating efforts both to rapidly develop and scale up schoolwide models and to evaluate their impact.5 For example, the most relevant goals for schools are those in state accountability schemes (and not evaluation designs). Yet these schemes differ among states, are variably developed by content area and grade level, and are still evolving. Moreover, improvement goals are continuously redefined school-by-school, in response to the past performance of specific subgroups of students, specific content areas and grade levels, and even individual teachers and students. Consequently, the uses to which schools put a school improvement program are likely to differ in significant ways, with hubs as accountable for supporting differentiated use to meet school-specific goals as for maintaining program fidelity to meet evaluation goals.
The knowledge infrastructure in education is equally problematic. The basic research infrastructure has long been characterized as weak and
disconnected from practice (Bryk, 2009; Kaestle, 1993). As a locus of applied research and a possible source of validated program components, the “school improvement industry” is sprawling, turbulent, and dominated by conservative commercial publishers (Rowan, 2002), and it is just now coming under the oversight of emerging quasiregulatory agencies such as the What Works Clearinghouse and the Best Evidence Encyclopedia. And the professional education of teachers and school leaders has long been criticized as weak, lacking a coherent knowledge base, and uncoordinated with specific curricula, assessments, and other resources of practice (Levine, 2005, 2006; Sykes, Bird, & Kennedy, 2010).
Even if the knowledge infrastructure were robust, the use of small-scale pilots to devise a generally effective schoolwide improvement model is complicated by interdependencies in and among schools, the models, hub organizations, and the environments in which they operate (Cohen et al., 2014; Glazer & Peurach, 2013; Peurach, 2011). Each of the preceding consists of multiple and ever-evolving components, with difficult-to-discern relationships among them. Understanding, improving, and coordinating their interactions is difficult to accomplish in a small number of schools, and grounded in the particulars of pilot sites, their environments, and time. Indeed, networks often move from small-scale pilots to large-scale operations with programs that are promising-but-problematic, and under constant revision (Berends et al., 2002; Cohen et al., 2014; Glennan, Bodilly, Galegher, & Kerr, 2004; Marsh, Hamilton, & Gill, 2008; McDonald, Klein, & Riordan, 2009; Peurach, 2011).
Finally, rapid, effective, large-scale use is likely to be complicated not only by the scope, complexity, and uncertainty of schoolwide programs but, also, by problems and shortcomings in common strategies for diffusion and utilization. For example, strategies that emphasize such formal resources as codified routines and guidance have long been interpreted as a bureaucratic affront to local control and professional autonomy and, thus, resisted (Peurach, 2011; Rowan, 1990). At the same time, strategies that emphasize such social resources as mentoring, coaching, and communities of practice are limited by the geographic distance between hubs and schools; cultural and logistical obstacles to moving staff among classrooms and schools; small ratios of experts to novices; personnel transiency; and variability in local environments (Cohen et al., 2014; Peurach, 2011).
The Learning Imperative
Rather than the enabling conditions assumed by the RDDU sequence, longitudinal research on comprehensive school reform suggests that school improvement networks emerge under complex and uncertain conditions that challenge
operating in accord with the RDDU sequence (Berends, Bodilly, & Kirby, 2002; Bodilly, Glennan, Kerr, & Galegher, 2004; Cohen et al., 2014; Honig, 2006; Peurach, 2011). From this complexity and uncertainty follows a learning imperative: a need for hubs and schools to work less linearly and more circuitously, by collaborating over time to create, use, and refine practical knowledge supporting replicable effectiveness.6
Researchers describe this collaborative learning in terms of two interdependent, iterative learning processes: exploration and exploitation (Hatch, 2000; Peurach & Glazer, 2012). Exploration is a type of divergent learning that involves reconsidering premises and identifying new possibilities through search, experimentation, discovery, and invention.7 Exploitation is a type of convergent learning that involves leveraging established knowledge, selecting from among alternatives, and learning and refining through repeated use.
In that hubs and schools are learning through collaborative, iterative exploration and exploitation, the primary intervention in a school improvement network—the model for establishing and improving schoolwide operations—is not a fixed, objective, and effective “treatment” developed in advance of large-scale implementation. Rather, researchers have reconceptualized schoolwide improvement programs as subjective realities created through processes of co-construction and sensemaking among schools, districts, hubs, and other vested organizations in the context of large-scale implementation (Datnow, Hubbard, & Mehan, 2002; Datnow & Park, 2009). Effectiveness, in turn, depends both on leveraging established knowledge and on taking ownership and asserting agency in adapting and using knowledge in specific, local contexts (Coburn, 2003; McLaughlin & Mitra, 2001; Peurach & Glazer, 2012).
Consequently, knowledge of “best practice” does not exist in advance of scaling up, such that it can be readily incorporated and integrated into generally effective school improvement programs. Furthermore, knowledge of “best practice” does not remain constant over time, such that it will maintain currency despite ever-changing organizational and environmental contexts. Rather, increasingly better knowledge of practice emerges through the process of scaling up. This knowledge is intellectual capital that emerges (and is retained) among individuals and communities of practice within the network; that is captured and retained in codified resources, digital resources, and other artifacts; and that is adapted and refined over time through individual and collective use (Peurach & Glazer, 2012).
Learning to Learn
Although researchers have reported instances of this type of learning activity among a small number of school improvement networks, it is not safe to
assume that this is modal practice among the population of school improvement networks. Indeed, it is unlikely that all recognize the need for continuous learning and improvement, are equally adept, and work explicitly and proactively (rather than tacitly and reactively) to align their strategies, operational capabilities, and cultural norms to support the production, use, and management of intellectual capital.8
Consider hub organizations. These are often start-up enterprises with few demonstrated capabilities, founded and managed by educators, advocates, and others with little (if any) prior professional training or experience operating large-scale, knowledge-intensive organizations or networks. The development of such knowledge and capabilities among network executives is limited by weaknesses in available knowledge and professional learning opportunities to support their work.9 It is also limited by incentives and sanctions that often drive network executives to feign (if not actively pursue) operating in accordance with the RDDU logic.10 Indeed, for executives, upholding the myth of rationality is far safer than acknowledging complexity and uncertainty, and instrumental in maintaining legitimacy among funders and clients.
Consider, also, the chief collaborators of hub organizations: (a) existing schools with histories of underperformance and nonimprovement and (b) newly created schools with no history of past performance or improvement. These schools are likely to lack two capabilities essential to continuous learning and improvement via exploitation and exploration: absorptive capacity (i.e., the capability to leverage existing capabilities to recognize, value, incorporate, and use new practices and understandings) and dynamic capabilities (i.e., the capability to systematically generate and modify practices and understandings in pursuit of improved effectiveness, continued legitimacy, and sustainability).11 Moreover, efforts to develop these capabilities are complicated by labor market dynamics that have disproportionate numbers of weakly prepared teachers and school leaders working in underperforming schools, as well as by high rates of personnel transiency in underperforming and charter schools.12 The result is often the steady loss of newly created intellectual capital and the steady incorporation of weakly prepared teachers and school leaders.
Thus, even though the conditions under which they operate are likely to occasion the imperative to learn through interdependent exploration and exploitation, engaging that learning imperative is likely to require that school improvement networks learn to learn. That is, these networks must learn to develop and leverage the foundation—the essential strategies, operational infrastructure, and normative infrastructure—needed to create, use, retain, and manage intellectual capital through continuous learning and improvement.
That, in turn, begins with developing among network executives the understandings and capabilities needed to structure and manage school improvement networks as distributed, collaborative learning systems. Though not impossible, it is unlikely that many network executives will be able to independently develop knowledge and capabilities that, among innovating enterprises more broadly, have been found to be tacit, subconscious, and scarce.13 Rather, supporting network executives in learning to learn is likely to require the assistance and support of an external evaluator of some sort (e.g., a researcher, executive coach, and/or experienced mentor) sufficiently knowledgeable and skillful to provide guidance yet sufficiently humble to value and learn from the experiences of network executives.
The Case for Developmental Evaluation
Thus, school improvement networks operate in environments that link their legitimacy, funding, and sustainability to a progression of increasingly rigorous impact evaluations. However, conditions that would enable these networks to rapidly demonstrate replicable effectiveness at a large scale simply do not hold in practice. Instead, complex and uncertain conditions create a learning imperative: a need for hubs and schools to work collaboratively to produce, use, and manage intellectual capital supporting replicable effectiveness. Yet few networks are likely to have the foundation needed to support such learning, and few network executives are likely to know (or to independently learn) how to establish such a foundation. As such, many executives and networks will likely need support in learning to learn.
Developmental evaluation is an approach to evaluator/innovator collaboration with potential to address this need (Dozois, Langlois, & Blanchet-Cohen, 2010; Gamble, 2008; Patton, 2006, 2011, 2012). Consistent with the preceding analysis, developmental evaluation is grounded in assumptions that large-scale social innovations emerge and operate under conditions of complexity and uncertainty that challenge rational management and decision making and that require continuous learning and improvement. A chief aim of developmental evaluation, thus, is to support the development of large-scale social innovations through learning-centered, improvement-focused evaluation.14
Central to developmental evaluation is the understanding that evaluation thinking, methods, and use need to be stitched deeply into the enterprise and made integral to its management (in contrast to evaluation operating as a parallel process running alongside the enterprise). A key means for achieving this “stitching in” is for evaluators, themselves, to become integral members of the enterprise, working alongside social innovators and collaborating as
partners in their work (Dozois et al., 2010; Gamble, 2008; Patton, 2006, 2011, 2012). As a collaborative partner, the evaluator both contributes technical expertise and rigor to inquiry and analysis and, also, functions as a knowledgeable-and-critical friend who explicates the tacit, challenges assumptions, raises questions, and documents (and reexamines) decision-making processes in light of outcomes. Reciprocally, the evaluator leverages the experience to advance his or her understandings of the innovating enterprise, of social innovation more broadly, and of enacting the role of developmental evaluator.
An especially critical focus of developmental evaluation is the practice-based coaching and support of decision makers as they guide the enterprise, itself: that is, as they assess prevailing conditions; adapt and reconcile mission and strategy; align operations with mission and strategy; and build the understandings, commitment, and motivation of others (Dozois et al., 2010; Gamble, 2008). Because the complexity and uncertainty of prevailing conditions are likely to reduce the quality of available information, a primary role of the evaluator is to support decision makers both in interpreting partial and equivocal information and in communicating interpretations (and their meaning) to others. As such, key responsibilities of evaluators include framing and conceptualizing important issues and problems; identifying a parsimonious set of key indicators to guide rapid data collection; introducing frameworks and questions to guide interpretation and meaning-making; and building consensus among decision makers and others.15
Grounding principles of developmental evaluation in the preceding analysis of school improvement networks, a first-order matter of developmental evaluation early in the emergence of the network is for evaluators to support executives in learning to learn: that is, in learning to develop and leverage the foundation needed to produce, use, and manage intellectual capital through continuous learning and improvement.16 In doing so, the primary unit of analysis would not be the schoolwide program, and whether it works reliably and effectively. Rather, the primary unit of analysis would be the school improvement network and whether it is working as a learning system. Such analysis, in turn, would benefit from two key resources:
• A shared logic of developmental evaluation to support new ways of thinking and reasoning about networks as evolving through ongoing, iterative exploration and exploitation (and not through an RDDU sequence).
• A framework for developmental evaluation: that is, a parsimonious set of guiding questions to assess the network as a learning system; an interpretive framework to support analysis of information generated
using these indicators; and reflective questions to guide the assessment of possible implications.
With its focus on improving the network as a learning system, developmental evaluation would serve as a complement to other approaches to improvement-focused evaluation, by establishing a foundation that would allow these other approaches to be leveraged to greater effect. This includes design-based implementation research focused on the continuous improvement of programs and interventions (e.g., Anderson & Shattuck, 2012; Penuel et al., 2011). It also includes design-educational engineering-development focused on addressing problems of practice in network-based improvement communities (Bryk, 2009; Bryk et al., 2010; Mehta, Gomez, & Bryk, 2012).
By comparison, all three approaches to improvement-focused evaluation exist in a complex relationship with impact evaluation. Although improvement-focused evaluation has potential to increase prospects for success on impact evaluations, it also intentionally disrupts the many things that evaluators would hope to control in rigorous impact evaluations (not the least of which is the ostensible “treatment”). As such, any assessment of impact would be no more than a point estimate of a set of momentary conditions. For networks simultaneously engaged in both improvement-focused and impact evaluations, the relationship between them should be recognized as a tension to be managed and a key consideration in interpreting results.
A Logic for Developmental Evaluation
We continue by proposing a logic for developmental evaluation. The intent is to support external evaluators, network executives, and other stakeholders in thinking and reasoning more carefully about ways in which exploration and exploitation can interact to support the production, use, and management of intellectual capital, both at a large scale and in ways that ultimately have potential to support replicable effectiveness.
Specifically, we review and extend the evolutionary logic of replication advanced by Peurach and Glazer (2012). The logic draws on several knowledge-based traditions of organizational scholarship, including leading research on franchise-like organizational replication in the commercial sector: an approach to large-scale organizational development that closely parallels that of school improvement networks.17 As applied to school improvement networks, the logic was initially used to structure an account of a leading comprehensive school reform program (Success for All) as a learning system and, then, refined through continued use and scholarship.18
The logic is not intended as a “how to” prescription but, instead, as an ideal type: a heuristic for critically analyzing the foundation for continuous learning and improvement in specific networks, and for considering ways in which to build and strengthen that foundation. It is an alternative vision and a common platform on which external evaluators, network executives, and other vested parties can base the work of developmental evaluation, itself likely to evolve through use, reflection, and refinement.
Review: The Evolutionary Logic of Replication
As with school improvement networks, the evolutionary logic begins with a central, hub organization aiming to replicate a common organizational model across large numbers of outlets.19 The organizational model is assumed to be sufficiently broad in scope as to transform the core capabilities (and even the identity) of outlets, with the goal of replicating the effectiveness of production activities and/or service delivery (Winter & Szulanski, 2001).
Recognizing the impossibility of creating precise organizational replicas in widely varying contexts, replication is considered successful when broadly equivalent outcomes are realized by similar means, with specific tolerances established within individual replication initiatives (Baden-Fuller & Winter, 2012). This approach has advantages in terms of speed, efficiency, and effectiveness under complex and uncertain conditions as described above: for example, weakness in knowledge and component technologies in broader environments; weakness in the knowledge and capabilities of outlet staff; and limits on the social management of knowledge through apprenticeship, mentoring, and communities of practice.
Premises: Practice-Focused, Learning-Driven Networks
The evolutionary logic begins with two core premises. The first premise is that, in replicating complex organizational models, the overarching consideration is not the replication of physical characteristics, formal structures, or culture, simply because it is possible to replicate broad organizational forms without replicating organizational effectiveness (Winter & Szulanski, 2001). Instead, the overarching consideration is the replication of capabilities: that is, the replication of practices and understandings that support working differently, more effectively, and in more coordinated ways toward intended outcomes than would be possible if outlets were working independently.
The second premise is that capabilities cannot be reliably replicated through the rapid, unilateral transfer, communication, or dissemination of knowledge and information from hubs to outlets. Reasons include
uncertainties, shortcomings, and flaws in available knowledge; inaccuracies and uncertainties in communicating complex practices and understandings; and the complexities of human agents learning to enact and understand their work in new ways. Instead, the evolutionary logic holds that the replication of organizational capabilities requires the creation and recreation of coordinated, interdependent practices and understandings through collaborative, experiential, long-term learning within and among hubs and outlets.
Foundations: Essential Knowledge Base and Core Learning Processes
Given the preceding, the primary focus of the evolutionary logic is the production, use, and management of an essential knowledge base that supports the broad scope replication of capabilities. Consistent with research on school improvement networks, this essential knowledge base is produced, used, and managed through multiple iterations of two interdependent learning processes coenacted by hubs and outlets: exploitation and exploration (Winter & Szulanski, 2001; see also Bradach, 1998; March, 1996). This essential knowledge base is the core intellectual capital of the enterprise: a nonrivalrous resource that can be used repeatedly in any one outlet without limiting its use in others.20
The essential knowledge base consists of three categories: knowledge of what, where, and how to replicate (Winter & Szulanski, 2001). Knowledge of what to replicate focuses on the essential practices and understandings to be recreated in each outlet. Knowledge of where to replicate focuses on practices and understandings within the hub for identifying, vetting, and selecting outlets and environments that favor successful replication. Knowledge of how to replicate focuses on practices and understandings within the hub for recreating essential practices and understandings in outlets (e.g., strategies for training and coaching).
Emergence: A Template
To establish proof of concept, development of the essential knowledge base begins with the construction of a “template”: an initial outlet that serves as a working example of the production or service capabilities to be replicated, often constructed in carefully selected sites and staffed with carefully selected people (Baden-Fuller & Winter, 2012; Winter, 2010; Winter & Szulanski, 2001). The template functions as a context for initial, exploratory learning in which hub and template staff engage in joint search, experimentation, discovery, and invention to devise means of realizing intended outcomes.
With successful exploration, the template becomes a repository of knowledge that the hub can study to develop provisional understandings of the capabilities to be recreated in outlets, where those capabilities might be recreated, and how to recreate them. It also functions as a resource for developing a formal design for practice to be replicated across outlets: a plan describing intended activity in outlets, as well as a schema around which to develop and organize knowledge of practice. At a minimum, a design for practice describes essential roles, along with the “in principle” responsibilities of each role. A more developed design for practice can include qualifications for people occupying specific roles; descriptions and principles that detail the coordination among roles; and rubrics, goals, and standards for evaluating performance (both of individual roles and of the outlet as a whole).
Essential Resource: Formalized Knowledge
As the template matures, it becomes a context for the social, interpersonal management and reproduction of knowledge through apprenticeship, mentoring, and communities of practice. But, again, the use of social mechanisms to support large-scale organizational replication is limited by such issues as the broad scope of the knowledge to be replicated, geographic distances between the template and new outlets, logistical obstacles to moving staff between the template and new outlets, and small ratios of experienced to novice staff members.
As such, with proof of concept, a central role of the hub is to formalize the essential knowledge base: that is, to codify knowledge of what, where, and how to replicate in tools, manuals, training materials, digital media, and other artifacts (Winter & Szulanski, 2001, 2002).21 The formalization of knowledge functions as a principal strategy for retaining, managing, and exploiting knowledge beyond the template and throughout the network.22
Rather than as a “coercive” mechanism for exercising tight control over outlets, formalized knowledge is viewed as an “enabling” resource intended to support outlet staff in effectively performing coordinated work that would otherwise be beyond their immediate capabilities.23 Furthermore, although formal knowledge can be easily transferred or communicated to outlets, the assumption is that using this knowledge to recreate capabilities in outlets will require opportunities to learn about it and to practice using it.
Formalized knowledge falls into two categories. The first category is codified routines: coordinated patterns of activity, both in outlets (e.g., routines supporting essential practices) and in the hub (e.g., routines supporting the selection and creation of outlets). Routines are considered the primary mechanisms for supporting levels of coordinated activity that would otherwise be
difficult and costly to achieve (Nelson & Winter, 1982). These include “closed” routines: procedures that provide step-by-step directions for what, exactly, to do in particular situations. They include “open” routines: frameworks used to devise courses of action under conditions of uncertainty. They include assessment routines used to generate information with which to evaluate performance and outcomes. And they include “learning” routines that detail cycles of diagnosis, planning, implementation, and reflection.
The second category is codified guidance: professional and background knowledge essential to the understanding and enactment of specific roles and responsibilities, along with evaluation rubrics and decision trees that support analysis and decision making. Such guidance supports the intelligent (rather than rote) selection and enactment of routines, responsiveness to local circumstances, and the management of inevitable breakdowns and limitations in routines.
Endemic Complication: Partial and Problematic Knowledge
Within the evolutionary logic, an endemic complication is that the hub often faces pressure from investors and others to begin exploiting knowledge and scaling up before having a completely worked out template or a highly developed (and tested) formal knowledge base (Winter & Szulanski, 2001). Within the template, activities may combine to effect intended outcomes in nonobvious ways; relevant knowledge will always remain tacit; understandings of cause-and-effect relationships may be flawed; and apparently important activities may be completely unrelated to outcomes. Furthermore, the effectiveness of templates likely depends on specific individuals, relationships, and environments in ways not fully understood at the outset.
Consequently, hubs and outlets satisfice: that is, they commence replication with potentially rich (but partial-and-problematic) knowledge of key practices and understandings to be replicated in outlets, and (absent any experience replicating) with only speculative knowledge about where and how to replicate them. Consider the impossible alternative: that, working from one or a small number of templates, the hub would be able to quickly discern and formalize perfect knowledge of what, where, and how to replicate.
Essential Method: Developmentally Sequenced Replication
The evolutionary logic continues with the hub recruiting or creating outlets and proceeding to large-scale replication. The aim is to recreate conventional capabilities for achieving common performance levels among outlets while
also extending and refining the knowledge needed to do so. The method is a developmentally sequenced replication process that depends on a synergy between two approaches to replication often viewed as logical opposites: fidelity of implementation and adaptive, locally responsive use (Szulanski, Winter, Cappetta, & Van den Bulte, 2002; Winter, 2010; Winter & Szulanski, 2001).24
The developmental sequence begins with fidelity of implementation: exploiting knowledge by supporting outlets in learning to enact formalized routines as specified, with the goal of establishing conventional, coordinated, base-level capabilities and performance levels. Despite shortcomings and problems in the essential knowledge base, and despite the deferred benefits of addressing outlet-specific exigencies, fidelity provides multiple advantages. These include offsetting weak initial capabilities in outlets; taking advantage of lessons learned and problems solved; creating opportunities to learn by doing (e.g., to enact practices, examine underlying principles, and examine the interdependence and coordination of activities); forestalling early problems (e.g., regression to past practice and the introduction of novel, site-specific operational problems); and establishing conventions that support collaborative learning and problem solving (e.g., common language, shared experiences, and joint work).
Once base-level practices and understandings are established, the developmental sequence proceeds to adaptive use. With that, outlets begin learning via exploration: that is, by assuming ownership and asserting agency in enacting the model to compensate for shortcomings, address problems, and respond to local needs and opportunities. Learning via adaptive use can include adjusting hub-formalized routines and guidance to better address local circumstances, inventing new routines and guidance that address critical work not yet formalized by the hub, and/or abandoning routines and guidance that appear either inconsequential or detrimental. Capabilities for adaptive use are not assumed. Rather, the hub enables such activity using a collection of resources. These include open routines that support local decision making; assessment routines for evaluating performance and outcomes; “learning routines” that guide analysis, evaluation, and reflection; guidance that provides theories, principles, goals, standards, and other information to support and constrain local analysis, invention, and problem solving; and support for learning to use these resources.
The enactment of this developmental sequence also creates opportunities for the hub to engage in its own exploitation and exploration to refine and extend knowledge of where and how to replicate. For example, learning where to replicate involves using (and adapting) formal routines and guidance for identifying new outlets prepared for initial implementation, as well
as experienced outlets prepared to advance to adaptive use. Learning how to replicate involves using (and adapting) routines and guidance for use by coaches and trainers in supporting both base-level operations and adaptive use in outlets.
The Outcome: Knowledge Evolution
This developmental sequence fuels a knowledge evolution cycle through which the hub and outlets collaborate to continuously expand and refine the essential knowledge base (Zollo & Winter, 2002).25 The cycle begins with exploitation: fidelity of implementation within and between outlets to establish conventional, base-level capabilities and performance levels. As they advance to exploration and adaptive use, outlets introduce variation into the network regarding practices and understandings that support effective operations. As the coordinative center, the hub monitors the network for instances and patterns of variation; selects, evaluates, and refines potential improvements; squares those with existing or new knowledge, resources, and requirements in broader environments; retains improvements both by incorporating them into an evolving template and by formalizing them as designs for practice, routines, and guidance; and works to purge ineffective practices. New practices and understandings are then fed back into existing outlets as incremental, “small-scope” improvements, and they are incorporated into a broader-yet knowledge base for use in creating new outlets.
The cycle then begins again, with initial recreation of practices and understandings via faithful implementation, followed by adaptation, variation, selection, and retention. Successive iterations of exploitation and exploration result in an increasing (and increasingly refined) formal knowledge base detailing where, what, and how to replicate.
Essential Mechanisms: Dynamic Capabilities
Knowledge evolution is dependent on dynamic capabilities: learned routines through which hubs and outlets systematically generate and modify practices and understandings in pursuit of improved effectiveness, continued legitimacy, and sustainability (Dosi, Nelson, & Winter, 2001; Winter, 2003; Winter & Szulanski, 2001; Zollo & Winter, 2002).
In outlets, dynamic capabilities are anchored in the sort of adaptive use described above: systematic, disciplined, exploratory learning anchored in Deming-like “plan-do-check-act” continuous improvement cycles, with adaptations evaluated in light of outcomes and refined accordingly (Dosi et al., 2001). In hubs, dynamic capabilities are anchored in infrastructure and
capabilities for rapidly pooling and analyzing information and knowledge throughout the network; for evaluating the relationship between practices and understandings (on one hand) and intended outcomes (on the other); for experimentation and rapid prototyping; and for disseminating program improvements through the installed base of outlets by formalizing essential knowledge and supporting both faithful and adaptive use.
The more developed the dynamic capabilities in hubs and outlets, and the more they include formal and rigorous methods of internal evaluation and refinement (e.g., via design-based implementation research), the more likely that the knowledge evolution cycle will yield not merely agreed-on “best practices” but evidence-based practices linked empirically to relevant outcomes.26 Even so, extensive iterations will not yield omniscience. The essential knowledge base will always be partial and problematic, key knowledge will always remain undiscovered and/or tacit, and broader conditions are apt to change.
As such, knowledge evolution featuring iterative, interdependent exploitation and exploration functions as the essential capability of network-based organizational replication initiatives: a condition of “perpetual beta” enacted jointly by hubs and outlets over the life of the enterprise to support the production, use, and management of intellectual capital.
A Framework for Developmental Evaluation
We complete our analysis by proposing a framework for developmental evaluation. While establishing a rationale and logic for developmental evaluation is critical, building a foundation for continuous learning and improvement is, ultimately and essentially, practical work. As such, our aim is to build on our rationale and logic to provide actionable guidance to support researchers, network executives, and other stakeholders in critically analyzing the foundation for continuous learning and improvement in a given school improvement network.27 The framework consists of three components: five guiding questions to structure data collection, a four-category interpretive framework to support the analysis of data generated using these questions, and three questions to structure collective reflection.
Guiding Questions
We begin by adapting guiding questions first proposed by Peurach and Glazer (2012) and Peurach, Glazer, and Lenhoff (2012). These five questions are intended to structure the rapid collection of a parsimonious-yet-powerful body of evidence (through interviews, document analysis, and participant
observation) about characteristics of the network suggested by the evolutionary logic as foundational to continuous learning and improvement.28 The first question examines the alignment between the network’s strategy for managing intellectual capital and the complex and uncertain conditions under which networks are likely to operate. The remaining four questions examine the alignment between that strategy and internal operations.
1. Does the enterprise have an explicit strategy for managing intellectual capital that attends to both exploitation and exploration? Attention to exploitation is evidenced by goals, norms, and language that emphasize the faithful implementation of evidence-based (or otherwise-established) practices. Attention to exploration is evidenced by goals, norms, and language that emphasize local experimentation, invention, and adaptation.
2. Does the enterprise have a formal design for practice? Akin to an explicit program theory or logic model, such a design would be evidenced by formal descriptions of essential roles; qualifications for essential roles; principles detailing responsibilities and coordination among roles; and standards and rubrics for assessing the enactment of those roles. It would be further evidenced by a functional template from which the design was drawn (and in which it can be observed and studied in operation).
3. Does the enterprise feature formal, codified resources for recreating base-level practices and understandings in schools? These resources are evidenced by formal routines and guidance for recruiting, selecting, and enlisting schools in which conditions exist (or can be created) to support base-level operations; by formal routines and guidance for use by schools to establish consistent, base-level practices and understandings; by formal routines and guidance for use by trainers and coaches to support schools in establishing base-level practices and understandings; and by language and frameworks to guide the interpretation of these resources as enabling (and not coercive).
4. Does the enterprise feature formal, codified resources for recreating practices and understandings for adaptive, locally responsive use? These resources would be evidenced by formal routines and guidance for use by hub staff in identifying outlets that have mastered base-level operations (and, thus, are prepared to progress to adaptive use); by formal routines and guidance for use by school staff to support design, evaluation, problem solving, decision making, and other discretionary activity; by formal routines and guidance for use by
trainers and coaches to support discretionary activity in outlets; and by language and frameworks that encourage divergence while also minding conventions.
5. Does the hub organization have the infrastructure and capabilities to support evolutionary learning? Such infrastructure and capabilities would be evidenced by the above-described supports for adaptive use (as a source of within-network variation in practices and understandings) and for base-level capabilities (to feed program improvements back through the network). They would be further evidenced by a communication infrastructure supporting the bilateral exchange of knowledge and information among hubs and schools; opportunities, resources, and capabilities in the hub for analyzing school performance and outcomes (as via design-based implementation research); and opportunities, resources, and capabilities for formalization, rapid prototyping, and small-scale evaluation.
Interpretive Framework
Our conjecture is that few school improvement networks will be fully attentive to exploitation and exploration in ways suggested by the evolutionary logic. After all, these networks operate amid a legacy of past educational reform movements that were vigilant on matters of structural compliance but surprisingly inattentive to the learning required to effect complementary changes in practice. Furthermore, they operate in reform environments that have long understood exploitation and exploration as mutually exclusive and ideologically steeped alternatives, with faithful implementation of external guidance understood both as coercive and as fundamentally at odds with local and professional autonomy, invention, and design. Finally, they operate amid institutionalized understandings of innovation as an RDDU sequence, absent understandings of either the possibility or legitimacy of integrating exploitation and exploration to support continuous learning and improvement.
As such, we continue by proposing an interpretive framework to differentiate among networks by their foundations (i.e., their strategies and supports) for continuous learning and improvement. Using evidence generated with our guiding questions, the framework identifies networks as structured and operating consistent with one of four primary types: a shell enterprise, a diffusion enterprise, an incubation enterprise, or an evolutionary enterprise.29 The framework also identifies vulnerabilities among these types for the production, use, and management of intellectual capital, as well as implications for implementation and outcomes.
Shell enterprise. A shell enterprise is one in which the hub seeks to replicate distinguishing organizational characteristics across schools (e.g., roles, structures, tools, and/or culture) absent efforts to recreate essential capabilities. As such, a shell enterprise will be evidenced by detailed organizational blueprints, including (potentially) a formal design for practice. However, a shell enterprise will show little or no evidence of an explicit strategy for managing intellectual capital, as well as little or no evidence of formal supports for either base-level operations (i.e., exploitation) or adaptive use (i.e., exploration).30
Diffusion enterprise. Consistent with the institutionalized RDDU logic, a diffusion enterprise is one in which the hub places a primary emphasis on codifying proven and/or established practices to be enacted with fidelity in schools. A diffusion enterprise is evidenced by a strategy for managing intellectual capital that focuses primarily on exploiting available knowledge of "what works." It is further evidenced by (a) extensive, formal routines and guidance for creating consistent, base-level operations (i.e., exploitation) and (b) comparatively weak attention to supporting experimentation and adaptation (i.e., exploration).
Incubation enterprise. Mindful of local control and professional autonomy, an incubation enterprise is one in which the hub places a primary emphasis on structuring parameters, processes, and resources to support school-level design, implementation, and problem solving. An incubation enterprise is evidenced by a strategy for managing intellectual capital that focuses primarily on distributed, exploratory learning through which schools operationalize hub-formalized designs for (and principles of) practice. It is further evidenced by (a) extensive, formal routines and guidance supporting adaptive, locally responsive use (i.e., exploration) and (b) comparatively weak emphasis on (and support for) the faithful implementation of specific methods of production or service delivery (i.e., exploitation).
Evolutionary enterprise. An evolutionary enterprise is one in which hubs and schools engage in collaborative learning that yields a formal knowledge base detailing where, what, and how to replicate. An evolutionary enterprise is evidenced by a strategy for managing intellectual capital that emphasizes both exploitation and exploration, as well as by formal routines and guidance supporting both base-level operations (exploitation) and adaptive use (exploration).
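For readers who find it helpful to see the classification rule stated operationally, the sketch below (in Python) tabulates hypothetical evidence of exploitation and exploration supports, of the kind surfaced by the guiding questions, into the four enterprise types. The indicator and function names are illustrative assumptions for exposition, not an instrument proposed in this analysis.

```python
# A minimal sketch, assuming two hypothetical boolean indicators that summarize
# evidence from the guiding questions; not an instrument from this analysis.

from dataclasses import dataclass


@dataclass
class NetworkEvidence:
    supports_base_level_operations: bool  # formal routines/guidance for exploitation
    supports_adaptive_use: bool           # formal routines/guidance for exploration


def classify_enterprise(evidence: NetworkEvidence) -> str:
    """Map evidence of exploitation and exploration supports to an enterprise type."""
    if evidence.supports_base_level_operations and evidence.supports_adaptive_use:
        return "evolutionary enterprise"   # both exploitation and exploration
    if evidence.supports_base_level_operations:
        return "diffusion enterprise"      # exploitation with weak exploration
    if evidence.supports_adaptive_use:
        return "incubation enterprise"     # exploration with weak exploitation
    return "shell enterprise"              # organizational blueprint without either


# Example: strong fidelity supports but weak adaptive supports -> diffusion enterprise
print(classify_enterprise(NetworkEvidence(True, False)))
```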
Vulnerabilities and implications. From the perspective of continuous learning and improvement, shell, diffusion, and incubation enterprises lack mechanisms that would support the development of a formal knowledge base that could be used to recreate and refine capabilities for effective practice in large
numbers of schools. For example, shell enterprises lack the foundation for exploitation (to establish base-level operations), exploration (to support adaptation and problem solving), and iterations between the two (to continuously learn and improve). Furthermore, while diffusion enterprises establish mechanisms for exploiting established knowledge, they lack mechanisms to support exploration, adaptation, problem solving, and feedback that would introduce variation and new knowledge. Finally, while incubation enterprises support exploration, adaptation, and problem solving, they also lack mechanisms for culling, testing, and exploiting new knowledge as it emerges, and they run the risk of new knowledge being so context-specific as to have little value beyond individual schools.
Weaknesses in the resulting knowledge base, in turn, create risks for implementation and outcomes for networks operating under the uncertain and complex conditions described above; with schools likely weak in initial capabilities; with increasing scale likely straining social mechanisms for managing intellectual capital; and with impact evaluation likely on the horizon. For example, shell enterprises risk classic loose coupling, with weak linkages between the formal and behavioral structure of schools and, thus, a high risk of regression to past practice. Furthermore, diffusion enterprises risk schools faithfully enacting practices as specified, neglecting to address local needs, and thus capping program impact below desired levels. Finally, incubation enterprises risk both variation in the "treatment" and regression to past practice, thus complicating efforts to identify a "treatment effect."
Failure isn’t inevitable. It is conceivable that schools could compensate for network-level weaknesses with strengths of their own: for example, prior knowledge and capabilities; the ability to incorporate and use other knowledge; and the ability to learn from experience. Yet, beyond the possibility of a small number of positive outliers, our earlier analysis suggests that chronically underperforming and newly created schools are likely to lack such capabilities.
The risk, then, is of a Matthew effect: an asymmetric, bimodal distribution in implementation and outcomes determined by the initial capabilities of new schools, with a small number of initially capable schools succeeding and many others struggling. With that, the network’s strategy for managing intellectual capital becomes a chief source of precisely the type of variation that complicates establishing replicable effectiveness on impact evaluations.
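To make the claimed distribution concrete, the following toy simulation (in Python, with invented parameters) illustrates how outcomes that hinge on schools' initial capabilities, rather than on network supports for learning, would split into the asymmetric, bimodal pattern described above. It is an illustration of the reasoning, not a model estimated by the authors.

```python
# A toy illustration, under invented assumptions: in a network lacking supports
# for learning, suppose outcomes track initial capability plus a little noise.
import random

random.seed(0)

# Hypothetical network of 100 schools; only a small minority begin with strong capabilities.
initially_capable = [random.random() < 0.15 for _ in range(100)]

# Assumed outcome rule (for exposition only): capable schools cluster high, others cluster low.
outcomes = [
    (0.8 if capable else 0.3) + random.gauss(0, 0.05)
    for capable in initially_capable
]

succeeding = sum(outcome > 0.6 for outcome in outcomes)
print(f"{succeeding} of 100 schools succeed; the remainder struggle")
# The resulting bimodal spread is the kind of "treatment" variation that
# complicates estimating an average effect in an impact evaluation.
```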
Structured Reflection
Again, our conjecture is that networks are more likely to be identified as shell, diffusion, and incubation enterprises than as evolutionary enterprises. From the perspective of developmental evaluation, the task for external evaluators
is to use the preceding evidence and interpretations to motivate and guide critical reflection among network executives and stakeholders, with the goal of understanding (and, possibly, improving) the network’s foundation for continuous learning and improvement, given the likelihood of impact evaluation.
This is not a straightforward task, and would surely benefit from guidance all its own.31 It would require that external evaluators, executives, and stakeholders understand the rationale, logic, and framework for developmental evaluation. Furthermore, it would require collectively reviewing evidence of the network’s current strategies and supports, reaching consensus on interpretations, and squaring hypothesized vulnerabilities and risks with evidence of implementation and outcomes. Finally, it would require developing shared understandings of what are likely to be multiple, mutually reinforcing conditions that have the network pursuing a shell, diffusion, or incubation strategy: for example, the understandings, ideologies, beliefs, and identities of network executives, as well as those of the people and schools that have joined the network; the expectations and constraints of funders and other stakeholders; institutionalized understandings and ideologies of education and its reform; and limitations in resources.
Assuming the development of such understandings, we conclude by providing guidance to structure collaborative reflection about possible next steps. Specifically, we consider three options: staying the course, transforming the network, and learning to learn. The catch is this: While each is a legitimate possibility, each also presents its own problems.
Option 1: Staying the course. For networks vested in shell, diffusion, or incubation strategies, one possible response is to stay the course: that is, to hold tight to the existing strategy while, at the same time, trying to reduce the complexity and uncertainty under which the network operates. This approach might include working in comparatively developed domains of research and professional preparation (e.g., early reading); working with more established, able, and stable schools; working at a scale that permits the social management of knowledge; and avoiding circumstances that call for impact evaluation.
However, it is not clear that any domain of educational activity is sufficiently developed as to obviate the need for both exploitation and exploration.32 Furthermore, efforts to establish favorable conditions (e.g., through detailed program adoption processes, district-level collaboration, and broader lobbying efforts) are often ineffective (Datnow, 2000), and policy-level efforts to establish coherent environments often do more to effect turbulence than to effect enabling conditions (Glazer & Peurach, 2013; Trujillo,
2012). Finally, the stronger the schools, the smaller the scale, and the less the concern with replicable effectiveness, the greater the risk that the network will define itself out of the policy-supported and philanthropic-supported agenda for educational reform (and, with that, struggle to secure resources and schools).
Option 2: Transforming the network. A second option is for executives to commence a rapid and radical transformation of the network to reorganize as an evolutionary enterprise. Diffusion enterprises would place renewed emphasis on exploration, incubation enterprises would place renewed emphasis on exploitation, shell enterprises would attend to both, and all would place renewed emphasis on establishing centralized dynamic capabilities. This would have the advantage of aligning the network’s foundation for continuous learning and improvement with the complexity and uncertainty under which it is likely to operate, thus (in principle) increasing the potential of quickly producing, using, and managing the intellectual capital needed to demonstrate replicable effectiveness.
Even so, rapid transformation is risky. Again, the evolutionary logic is an ideal type to support analysis, and not a set of “how to” prescriptions. Furthermore, though research provides a number of accounts of networkwide learning via exploration and exploitation, it provides only one detailed account of a school improvement network operating in full accord with the evolutionary logic, the result of three decades of self-guided learning and improvement (Peurach & Glazer, 2012). Finally, research provides neither accounts of a school improvement network intentionally reorganizing as an evolutionary enterprise nor knowledge of how to do so.33
Rapidly reorganizing as an evolutionary enterprise would likely be exceedingly difficult, in that it would require fundamentally deconstructing and reconstructing the entire system of interdependent, mutually reinforcing conditions that support operating as a shell, diffusion, or incubation enterprise.34 For example, it would require that network executives have the confidence and humility to reconstruct their understandings, ideologies, and identities, at the same time they support hub and school staff in doing the same. Furthermore, it would require expanding the operating capabilities of the hub, recruiting staff members and collaborators with new knowledge and capabilities, and reducing other activities and reallocating resources. Finally, it would require that funders, policy makers, and other stakeholders not only reconstruct their own understandings but also create conditions that would support reorganizing as an evolutionary enterprise: for example, by legitimizing the learning imperative; building political support; and providing necessary time and resources.
Indeed, the uncertainty and challenges of rapidly reorganizing as an evolutionary enterprise could be so daunting that network executives decide, instead, to incur the vulnerabilities and risks of continuing to operate as a shell, diffusion, or incubation enterprise.
Option 3: Learning to learn. A third option is to learn to learn. This option would have executives, stakeholders, and evaluators initiating cycles of exploration and exploitation more limited in scope than networkwide transformation, with two goals: strengthening the knowledge needed to demonstrate replicable effectiveness and, importantly, building the knowledge needed to manage movement toward an evolutionary enterprise.
This approach would begin by reviewing the network’s design for practice to identify exceptional instances in which particular roles are operating in accord with the evolutionary logic, with a formal knowledge base accumulating to support both conventional, base-level operations and adaptive, locally responsive use. These exceptional instances would function as existence proofs of the possibility of pursuing an evolutionary strategy within the network, and of the conditions that support doing so.
Building on evidence of possibility, this approach would continue by again examining the design for practice to identify a small set of “linchpin roles” particularly instrumental to replicating effectiveness at scale: for example, coaching or leadership roles with responsibilities for supporting the performance of others. Attention would then focus on those linchpin roles for which role incumbents were likely to have weak initial capabilities, and for which increasing the scale of operations would likely tax social mechanisms for managing intellectual capital (e.g., apprenticeship, mentoring, and communities of practice).
The hub would then explore the possibility of supporting these roles in accordance with the evolutionary logic: that is, by codifying routines and guidance to support base-level enactment of key responsibilities; by codifying routines and guidance to support and bound discretionary enactment; by introducing these routines and guidance in developmentally appropriate ways, progressing from fidelity to adaptation; and, finally, by monitoring implementation, evaluating and codifying new knowledge as it emerges, and diffusing it throughout the network. Knowledge gained by executives through reforming individual roles in this way could then be exploited to reform additional roles over time.
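The sketch below (in Python) restates this learning-to-learn progression as explicit steps for hypothetical roles. Every function, field, and criterion here is a placeholder assumption meant only to make the sequence of moves legible; it is not a tool or protocol proposed in this article.

```python
# A minimal sketch, assuming hypothetical role records; placeholder logic only.

def identify_linchpin_roles(roles):
    """Select roles instrumental to replication whose incumbents likely have weak
    initial capabilities and whose support would be strained by growth in scale."""
    return [
        r for r in roles
        if r["instrumental"] and r["weak_initial_capability"] and r["strained_by_scale"]
    ]


def support_role_evolutionarily(role):
    """One exploration/exploitation cycle for a single role (placeholder steps)."""
    return {
        "role": role["name"],
        "base_level_routines": f"codified routines for {role['name']}",     # exploitation supports
        "discretionary_guidance": f"codified guidance for {role['name']}",  # exploration supports
        # Introduce developmentally (fidelity first, adaptation later), then monitor,
        # evaluate, and formalize emergent knowledge for diffusion across the network.
        "formalized_new_knowledge": f"lessons from enacting the {role['name']} role",
    }


roles = [
    {"name": "coach", "instrumental": True, "weak_initial_capability": True, "strained_by_scale": True},
    {"name": "teacher", "instrumental": True, "weak_initial_capability": True, "strained_by_scale": False},
]

knowledge_base = [support_role_evolutionarily(r) for r in identify_linchpin_roles(roles)]
print(len(knowledge_base))  # -> 1; only the coaching role meets all three selection criteria
```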
With that, learning to learn would, itself, be an evolutionary progression toward operating as an evolutionary enterprise, one that would heed argued advantages in improving the network’s foundation for continuous learning and improvement while, at the same time, managing the uncertainty and
challenges in doing so. The risk, however, is that this would be a slow process in policy and funding environments only now beginning to recognize the need for such learning; in which the myth of rationality has long held sway despite overwhelming evidence of complexity and uncertainty; and in which the continued viability of networks is linked tightly to rapidly demonstrating replicable effectiveness on rigorous impact evaluations.
Conclusion: Reforming Practice and Practicing Reform
Anchored in contemporary concern with the continuous improvement of school improvement networks, our analysis contributes to long-running traditions of policy, research, and reform focused on the role of knowledge and knowledge production in educational improvement. It also provides critical perspective on contemporary efforts to leverage research and evaluation in the service of large-scale education reform, including the Obama administration’s tiered evidence sequence for social innovation and its heavy focus on impact evaluation.
Our specific focus is reforming the practice—the day-to-day work—of students, teachers, and school leaders. Our analytic foil is the research–development–diffusion–utilization sequence: a highly institutionalized set of understandings about creating and using knowledge, and the paradigm that currently structures the leading funding opportunities and evaluation demands for large-scale educational improvement initiatives.
Recognizing both the imperative to evaluate school improvement networks and the challenges of doing so, we began with a rationale establishing the need for pursuing developmental evaluation as a complement both to impact evaluation and to other approaches to improvement-focused evaluation. We continued by proposing an evolutionary logic to support thinking and reasoning about developmental evaluation as focused on the production, use, and management of intellectual capital under conditions of complexity and uncertainty. We concluded by proposing a parsimonious framework to structure inquiry, analysis, and reflection about the current state of a school improvement network and about strengthening its foundation for continuous learning and improvement.
Our general argument is that the prospects for continuous learning and improvement increase with a positive coalignment among three things: the conditions under which a school improvement network operates; its strategies for producing, using, and managing intellectual capital; and its operational supports for collaborative learning among the hub and schools. For most school improvement networks, achieving such positive coalignment will require learning to
learn, with the conditions under which they are likely to operate driving networks to develop strategies and operational supports that combine exploitation and exploration in the service of two goals: building a formal knowledge base with potential to support replicable effectiveness; and building knowledge among executives, stakeholders, and evaluators to manage such work.
With that, our analysis suggests that reforming practice will require practicing reform: executives, stakeholders, and external evaluators learning about (and from) the work of producing, using, and managing intellectual capital in the context of large-scale educational networks. Doing so will require that these unlikely collaborators critically examine the fundamental conditions under which networks operate; their strategies for producing, using, and managing intellectual capital; and the assumptions, ideologies, and identities on which those strategies are based. It will also require that they make reasoned judgments about how best to move forward under conditions of complexity and uncertainty, and that they continuously reflect on their rationale and logic for moving forward in light of the experiences that follow.
One next step, then, is to begin experimenting with exactly that: actually using the proposed logic and framework to experiment with supporting a small number of school improvement networks open to analyzing and improving themselves as learning systems. Another next step would be to develop (and similarly experiment with) alternative logics and frameworks for improving networks as learning systems. Doing so would provide opportunity to create additional resources and guidance to support evaluators, network executives, and other stakeholders in collecting, analyzing, and reflecting on evidence about networks as learning systems. Furthermore, it would provide opportunity to refine the proposed logics and frameworks through use. Finally, it would provide opportunity to examine the feasibility, enabling conditions, and payoff for developmental evaluation, especially if carefully coordinated with complementary analyses of implementation and outcomes.
Another next step is to use the analysis developed here as a stepping stone into understanding continuous learning and improvement in other approaches to large-scale education reform. This includes approaches that use designs for improvement less comprehensive than the schoolwide models used in school improvement networks. It also includes approaches that leverage network forms other than the type of hub-outlet topology described here: for example, self-organizing networks that emphasize reciprocal relationships among schools absent a strong, coordinating hub organization.
As argued at the outset, we are edging toward a much fuller appreciation of the challenges of large-scale school improvement, and of the different forms of knowledge and knowledge production that support it. This fuller appreciation is captured clearly in IES moving beyond an essential focus on
program impact to an equally essential focus on the continuous improvement of educational systems.
Our analysis affirms the wisdom of these efforts and predicts challenges likely to follow. Demonstrating replicable effectiveness on rigorous impact evaluations depends on the type of continuous improvement initiatives that IES is just now beginning to legitimate and fund. Yet the success of these initiatives likely depends on foundational organization-building in novel educational systems; the success of those efforts depends on the practice-based professional development of novel types of educational leaders; and the success of those efforts depends on devising new types of frameworks and methods to support such learning, as well as new types of external evaluators to occasion and lead such learning. Furthermore, absent careful coordination, success along the preceding dimensions would likely have school improvement networks working at cross purposes, in that they would be learning and improving in ways that could well undermine rigorous, complex, controlled impact evaluations.
Thus, some might view new support for the continuous learning and improvement of education systems as another plank in the RDDU/tiered evidence platform. Instead, we view this support fundamentally differently: as new, complex, and uncertain territory, the mapping of which will require that unlikely collaborators learn to think, reason, and coordinate in entirely new ways, both about the production, use, and management of intellectual capital and about efforts to evaluate impact.
The former perspective has continuous improvement research as part of the answer. The latter perspective has continuous improvement research as yet another piece of the puzzle, and a reform agenda all its own.
Acknowledgment
The lead author gratefully acknowledges the opportunity to participate in the i3 Learning Community cosponsored by the W. T. Grant Foundation and the Spencer Foundation, as well as the opportunity to discuss this work in the Seminar on the Evolution of Organizations and Industries at the Wharton School. Both experiences served as important contexts in which to further develop and advance this work.
Authors’ Note
An earlier draft of this article was presented at the 2012 Conference of the National Center on Scaling Up Effective Schools, Nashville, TN (June 10-12, 2012).
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The authors gratefully acknowledge funding received from the Spencer Foundation (Reference No. 201300078), the School of Education at the University of Michigan, and the Education Policy Center at Michigan State University.
Notes
1. Our conceptualization of intellectual capital draws from Bontis (2002) and Stewart (1997). Bontis considers intellectual capital as the stock of knowledge of an organization and as the product of organizational learning over time. Furthermore, he describes intellectual capital as consisting of three subdomains: structural capital (i.e., the organizational routines of an organization), human capital (i.e., the tacit knowledge embedded in individuals and their interactions), and relational capital (i.e., the knowledge embedded in relationships established in outside environments). Stewart, in turn, argues that the value of intellectual capital lies in its practical use in creating instrumental outcomes (in his case, wealth; in our case, improved performance in schools).
2. For example, school improvement networks have emerged concurrent with the rise of standards-based reform in states; the establishment of the Institute of Education Sciences (IES; 2012a), and its mission of identifying “what works, what doesn’t, and why”; funding streams that require rigorous evaluation of outcomes; the coordination of evidence standards among federal agencies; and the rise of such quasiregulatory agencies as the What Works Clearinghouse and the Best Evidence Encyclopedia. Such efforts in education parallel wider efforts to establish the impact of other social programs, both in the United States and abroad (Campbell Collaboration, 2013; Granger, 2011; Khandker, Koolwal, & Samad, 2010).
3. This tiered evidence sequence is one component of the Obama administration’s efforts to advance the use of evidence in support of social policy and innovation—efforts described by Haskins and Baron (2011) as the most expansive in the history of the U.S. government. For an overview of the Obama administration’s strategy for using evidence and innovation to improve government performance, see Burwell, Munoz, Holdren, and Krueger (2013) and IES/National Science Foundation (2013).
4. Time estimates are derived from IES (2012b).
5. Rather than being unique to school improvement networks, weaknesses in such enabling conditions are characteristic of innovation in general (Van de Ven, Polley, Garud, & Venkataraman, 1999) and of social innovation in particular (Preskill & Beer, 2012). Multiple efforts are underway to strengthen the impact evaluation infrastructure in education in ways that would support evaluating the impact of complex educational interventions amid weaknesses in enabling conditions: for example, the establishment of the IES in 2002; the establishment
of the Society for Research on Educational Effectiveness (the chief professional organization focused on understanding cause–effect relationships in educational programs and interventions) in 2005; the development of methods such as regression discontinuity design (Schochet, 2008); efforts to conceptualize sources of variation in program effects (M. J. Weiss, Bloom, & Brock, 2013); and multiple IES-sponsored initiatives to develop capabilities of researchers to conduct randomized control trials, impact evaluations, and causal analyses. Even so, within education, issues related to the potential and problems of impact evaluation have been (and continue to be) hotly debated among proponents and critics (e.g., Foray, Murnane, & Nelson, 2007; Mosteller & Boruch, 2002; Schneider & McDonald, 2007). Moreover, recent emphasis on impact evaluations in other domains of social improvement has led to political, empirical, and practical challenges described as both dividing and overwhelming evaluators (Easterly, 2009; Khandker et al., 2010).
6. The learning imperative is not unique to school improvement networks. Rather, comparative research on the innovation process directly refutes the RDDU (research, development, dissemination, and utilization) sequence and, instead, argues that the innovation process is better understood as cycles of exploration/divergence and exploitation/convergence (Van de Ven, Polley, Garud, & Venkataraman, 1999). Such learning is argued to be the essential capability for innovating organizations operating in turbulent, uncertain, and chaotic environments (Lewin, 1999; Waldrop, 1992). This learning imperative is also captured in leading characterizations of “design thinking” (T. Brown, 2009), “perpetual beta” in the development of technical innovations (Musser & O’Reilly, 2006), and the development and refinement of “better” (rather than “best”) medical practice (Berwick, 2008; Gawande, 2002, 2007, 2009).
7. Exploration and exploitation have roots in March (1996). Furthermore, “exploration” parallels “double loop learning” as developed by Argyris and Schön (1978), whereas “exploitation” parallels “single loop learning.”
8. In organizational studies, capabilities for continuous learning and improvement are recognized as unevenly distributed among organizations, and as a source of competitive advantage for innovating organizations operating in complex and turbulent environments (Choo & Bontis, 2002; Dosi, Nelson, & Winter, 2001).
9. For example, scholars describe a “paucity of research” on the work of hub organizations (Datnow, Hubbard, & Mehan, 2002, p. 90; see, also, Allen & Peurach, 2013; Peurach & Gumus, 2011). Furthermore, professional learning opportunities are limited to a small number of executive education programs (e.g., Harvard University’s Doctor of Education Leadership and the Broad Residency) and to small communities of practice coordinated by leading philanthropists (e.g., the i3 Learning Community, which is sponsored by the Spencer Foundation and the W. T. Grant Foundation; the Deeper Learning Initiative, which is sponsored by the William and Flora Hewlett Foundation).
10. For example, executives in charter management organizations (CMOs) report a “tyranny of business plans” resulting from the need to project rationality to secure
funding, despite encountering complexity and uncertainty that require flexibility and adaptability in their work (National Charter School Research Project, 2007). Again, such pressure is not unique to school improvement networks but, instead, characteristic of social innovation enterprises (Preskill & Beer, 2012).
11. For absorptive capacity, see Cohen and Levinthal (1990). For dynamic capabilities, see Eisenhardt and Martin (2000), Winter (2003), and Zollo and Winter (2002).
12. For example, see Almy and Theokas (2010) and Ingersoll (2001) for analyses of occupational demographics. See Ronfeldt, Loeb, and Wyckoff (2013) for a review and analysis of teacher transiency (and its negative consequences for students and for entire schools). See Stuit and Smith (2009) and Gross and DeArmond (2010) for comparative research on teacher turnover in charter schools and conventional public schools.
13. For example, regarding the challenges of developing executive capacity to manage learning-focused innovation processes, see Van de Ven et al. (1999). Regarding challenges in strategically managing intellectual capital, organizational knowledge, and organizational learning, see Choo and Bontis (2002). Finally, regarding the emergence of (and challenges in) the role of “chief knowledge officer” as a formal executive role responsible for such work, see Earl and Scott (1999).
14. Developmental evaluation is an emerging approach, and not an established tradition or method. Its principles, central tenets, and unique contributions are developed most fully in Patton (2011), drawing on prior work on utilization-focused evaluation (Patton, 2008) and social innovation (Westley, Zimmerman, & Patton, 2007). Two practical guides for developmental evaluation have emerged out of early collaborations with the McConnell Family Foundation (Dozois, Langlois, & Blanchet-Cohen, 2010; Gamble, 2008). The notion of developmental evaluation has been quickly and widely embraced for supporting innovation. For example, a simple Google search using “developmental evaluation” and “innovation” yielded hundreds of sources. Even so, a search on “developmental evaluation” using Google Scholar yielded very little research on (or using) developmental evaluation in peer-reviewed journals (by our count, fewer than 10 in the first hundred sources identified at the time of this writing). Among both the peer-reviewed and non-peer-reviewed reports that we identified, most function as “proof of concept” affirming the principles and tenets of developmental evaluation, and most report positive value to program developers. These include one study focused on supporting a new approach for teaching classroom assessment methods to aspiring teachers (Lam, 2011), one focused on supporting learning in a network context (Ramstad, 2009), and one focused on supporting the development of interorganizational networks (Sydow, 2004). Our conjecture is that the small number of peer-reviewed studies arises, in part, from developmental evaluation being focused primarily on internal use within organizations (and not on testing and advancing general knowledge, at least beyond the principles and practices of developmental evaluation, itself). Furthermore, our experience is that quickly identifying peer-reviewed studies is complicated by the fact that
“developmental evaluation” is an established diagnostic regime in psychology, and the subject of extensive research.
15. As argued both by Gamble (2008) and Patton (2011), central to this interpretation is research on executive decision making in complex environments with high costs (and little potential) for accurate information and knowledge (Sutcliffe & Weber, 2003). As summarized by Sutcliffe and Weber, “Our findings suggest that perceptual accuracy at the very top executive levels is actually a source of competitive disadvantage for most firms. The task of leaders is to manage ambiguity and to mobilize action, not to store highly accurate knowledge about their environment. The more effective way to improve the performance of a company is to invest in how leaders shape their interpretive outlooks” (Sutcliffe & Weber, 2003, p. 82).
16. Working from this perspective, the meaning of “development” in “developmental evaluation” begins to shift. Specifically, it shifts away from a more behavioral connotation: the sort of active program fashioning connoted by research–development–diffusion–utilization and as captured by the “development” stage of the tiered evidence sequence. And it shifts toward a more cognitive connotation: establishing the resources and capabilities needed to support collective thinking, reasoning, and understanding. As argued by Patton (2011), this focus on continuous learning amid complex and uncertain conditions (vs. active program fashioning and related problem solving) is what distinguishes developmental evaluation from other improvement-focused evaluation strategies, including design-based research and action research.
17. As noted by Peurach and Glazer (2012), the evolutionary logic is drawn primarily from theory and research by Sidney Winter, Gabriel Szulanski, and colleagues focused on the replication of knowledge within and between organizations: for example, Baden-Fuller and Winter (2012); Szulanski and Winter (2002); Szulanski, Winter, Cappetta, and Van den Bulte (2002); Winter (2003, 2010, 2012); Winter and Szulanski (2001, 2002); and Zollo and Winter (2002). Roots of this work lie in the work of Nelson and Winter (1982) on evolutionary economics, with specific focus on developing, adapting, and replicating routines. The perspective has contemporary ties to research in organizational learning (March, 1996); innovation development (Van de Ven et al., 1999); organizational routines (Feldman & Pentland, 2003); dynamic capabilities, the resource-based view of the firm, and the evolutionary view of the firm (Arrow, 1962, 1974; J. S. Brown & Duguid, 1998; Eisenhardt & Martin, 2000; Grant, 1996; Wernerfelt, 1995); alternative conceptions of centralized control (Adler & Borys, 1996); franchised organizational forms (Bradach, 1998); and nonprofit replication (Bradach, 2003).
18. The evolutionary logic as represented here incorporates three important adaptations over the logic as originally represented in Peurach and Glazer (2012). The first adaptation is our naming of the logic. Where we initially represented it as a “knowledge-based” logic, we came to recognize that the RDDU sequence is, itself, a knowledge-based logic, such that the “knowledge-based” qualifier did
not discriminate between the two logics on which we focus. Hence, we renamed this as an evolutionary logic, recognizing its deep roots in evolutionary economics. The second adaptation is our elaboration of conditions under which an evolutionary strategy has advantages in speed, effectiveness, and efficiency over a more straightforward “in principle” articulation of capabilities to be developed and coordinated in new outlets. The third adaptation is our incorporation of the notion of a “design for practice” as an essential component of the evolutionary logic, which becomes central to the interpretative framework developed later in this article. These adaptations were motivated by personal communications with Sidney Winter and Charles Baden-Fuller about their work on “principles” and “templates” as strategies for organizational replication; by feedback on our analysis from participants in the Seminar on the Evolution of Organizations and Industries at the Wharton School; and by efforts actually practicing the work of developmental evaluation in collaboration with a large-scale effort to support the implementation of Response to Intervention and Positive Behavioral Interventions and Supports in Michigan.
19. “Outlet” is the general term describing the organizations that are to be replicated. In school improvement networks, the outlets are schools.
20. Winter and Szulanski (2001) describe this knowledge base as the “Arrow core” in recognition of Kenneth Arrow’s (1962) exposition of information economics: in particular, his analysis of information as a nonrivalrous good, the fundamental assumption on which the evolutionary logic of replication rests. To say that this knowledge base is nonrivalrous is not to say that it is (or should be) available to other, possibly competing enterprises. An enterprise may well take measures to protect the use of this knowledge base through copyrights, trademarks, patents, noncompete agreements, and other means of protecting intellectual property rights. Indeed, whether the knowledge base produced by school improvement networks (through public funding, in the service of public education) is a public or private good is a formidable policy issue that arises from this analysis, especially because current policy contexts do more to structure competition among these networks (thus, the protection of intellectual capital, despite being produced with public funding for the public good) than collaboration (thus, the sharing of intellectual capital as a public good in the public domain).
21. The work of Winter, Szulanski, and colleagues generally places more emphasis on routines than on guidance. However, the importance of professional and background knowledge as a complement to routines becomes salient in Baden-Fuller and Winter (2012).
22. Knowledge thus formalized functions as a sort of “immutable mobile” (Latour, 1988) that can be used, studied, and manipulated by others.
23. See Adler and Borys (1996) on “coercive” versus “enabling” formalization.
24. As noted by Peurach and Glazer (2012), Szulanski et al. (2002) cast this as a four-phase process. As forms of exploration, initiation involves recognizing opportunities to replicate and deciding to act on them, while initial implementation is a process of “learning before doing.” As forms of exploitation, ramp
up to satisfactory performance is a process of learning by resolving unexpected outcomes, while integration involves maintaining and improving performance after satisfactory results are initially obtained.
25. Zollo and Winter (2002) are clear that this is an analytical representation and that, in practice, the processes of knowledge evolution described here are likely concurrent and confounded.
26. In this sense, the evolutionary logic can be understood as a broader learning strategy, of which such improvement-focused evaluation strategies as design-based implementation research are a core component.
27. Our framework can be understood as a theory-based approach to developmental evaluation (Patton, 2011; C. H. Weiss, 1997), though etic (i.e., grounded in the evolutionary logic) rather than emic (i.e., grounded in the network’s own theory of action). Moreover, it is a reciprocal approach. Specifically, it is an external perspective on the internal structure and operation of the network, with two interdependent purposes: using the evolutionary logic (and associated evaluation framework) to motivate and inform critical analysis of the network as a learning enterprise, and using that experience to reflect critically on the logic (and the associated framework).
28. While derived from the evolutionary logic, our view is that these questions have face validity independent of the evolutionary logic, and are potentially useful for analyzing enterprises smaller in scope than schoolwide improvement enterprises. We also recognize that complementary analyses would be needed to examine the content of routines and guidance, the actual use of program resources in schools, and the work of hubs in leveraging school-level adaptations as resources for networkwide improvement.
29. Our notion of shell enterprises derives from “faux replication” as discussed by Baden-Fuller and Winter (2012) and Winter and Szulanski (2001, 2002). Our notion of diffusion and incubation enterprises derives from Baden-Fuller and Winter (2012), who discuss the distinction between replication via “templates” (our “diffusion enterprises”) and replication via “principles” (our “incubation enterprises”).
30. A shell enterprise can operate as part of a good faith effort to improve schools: for example, efforts in which a hub establishes a small set of similarly structured templates to support initial, exploratory, cross-template learning. A shell enterprise can also operate absent good faith efforts to improve schools: for example, when the hub seeks to capitalize on fees to schools and, thus, does not engage in costly efforts to develop capabilities; and/or when a newly adopting school seeks to capitalize on the reputational assets of networks to establish identity and secure legitimacy, though absent a commitment to reform core capabilities. See the discussion of “faux replication” in Baden-Fuller and Winter (2012) and Winter and Szulanski (2001, 2002) for discussion of differing motivations for what we describe as a shell enterprise.
31. Evaluation use is a topic that has received formidable attention from researchers, policy makers, funders, and others for decades, a key theme being difficulty in
making effective use of evaluation processes and results. As argued by Preskill and Torres (2000), evaluation that seeks to support transformative learning in organizational contexts (such as developmental evaluation as proposed here) requires communal, collaborative, dialogical approaches that are carefully structured and expertly supported. Next steps in our research agenda include (a) reviewing the literature on evaluation use through the lens of developmental evaluation as proposed here and (b) experimenting with protocols and guidance to support the type of communal, collaborative, dialogical approaches described by Preskill and Torres.
32. For example, see Peurach (2011) and Peurach and Glazer (2012) on efforts by Success for All to field a schoolwide model focused on K-6 reading: on one half of one core content area that enjoys some consensus regarding its basic knowledge base (as evidenced by the efforts of the National Reading Panel) and for which the What Works Clearinghouse and Best Evidence Encyclopedia suggest a body of possible, vetted components. Yet, even in this comparatively established domain, Success for All evolved consistent with the evolutionary logic reported here.
33. Research does provide an account of a large-scale educational reform effort reorganizing in ways consistent with the evolutionary enterprise: the reorganization of National Alliance for Restructuring Education as America’s Choice (Cohen, Peurach, Glazer, Gates, & Goldin, 2014; Glazer, 2009a, 2009b). However, this reorganization was driven by issues other than analysis of weaknesses in existing strategies for managing intellectual capital.
34. See Aldrich (1999) on the challenges of organizational (never mind networkwide) transformation.
References
Adler, P. S., & Borys, B. (1996). Two types of bureaucracy: Enabling and coercive. Administrative Science Quarterly, 41, 61-89.
Aladjem, D. K., & Borman, K. M. (2006, April). Summary of findings from the National Longitudinal Evaluation of Comprehensive School Reform. Paper presented at the Annual Meeting of the American Educational Research Association, San Francisco, CA.
Aldrich, H. (1999). Organizations evolving. Thousand Oaks, CA: Sage.
Allen, A., & Peurach, D. J. (2013, April). The work of charter and education management organizations: A review of research. Paper presented at the Annual Meeting of the American Educational Research Association, San Francisco, CA.
Almy, S., & Theokas, C. (2010). Not prepared for class: High-poverty schools continue to have fewer in-field teachers. Washington, DC: The Education Trust.
Anderson, T., & Shattuck, J. (2012). Design-based research: A decade of progress in education research? Educational Researcher, 41, 16-25.
Argyris, C., & Schön, D. (1978). Organizational learning: A theory of action perspective. Reading, MA: Addison Wesley.
Arrow, K. J. (1962). Economic welfare and the allocation of resources for invention. In R. R. Nelson (Ed.), The rate and direction of inventive activity (pp. 609-625). Princeton, NJ: Princeton University Press.
Arrow, K. J. (1974). The limits of organization. New York, NY: W.W. Norton.
Baden-Fuller, C., & Winter, S. G. (2012). Replicating organizational knowledge: Principles or templates? Philadelphia: Wharton School, University of Pennsylvania.
Berends, M., Bodilly, S. J., & Kirby, S. N. (2002). Facing the challenges of whole school reform: New American Schools after a decade. Santa Monica, CA: RAND.
Berwick, D. M. (2008). The science of improvement. The Journal of the American Medical Association, 299, 1182-1184.
Bodilly, S. J. (1996). Lessons from New American Schools Development Corporation’s demonstration phase: Prospects for bringing designs to multiple schools. Santa Monica, CA: RAND.
Bodilly, S. J., Glennan, T. K., Jr., Kerr, K. A., & Galegher, J. R. (2004). Introduction: Framing the problem. In T. K. Glennan Jr., S. J. Bodilly, J. R. Galegher, & K. A. Kerr (Eds.), Expanding the reach of educational reforms: Perspectives from leaders in the scale-up of educational interventions (pp. 647-685). Santa Monica, CA: RAND.
Bontis, N. (2002). Managing organizational knowledge by diagnosing intellectual capital: Framing and advancing the state of the field. In C. W. Choo & N. Bontis (Eds.), The strategic management of intellectual capital and organizational knowledge (pp. 621-642). New York, NY: Oxford University Press.
Bradach, J. L. (1998). Franchise organizations. Boston, MA: Harvard Business School Press.
Bradach, J. L. (2003). Going to scale: The challenge of replicating social programs. Stanford Social Innovation Review, 1 (1), 19-25.
Brown, J. S., & Duguid, P. (1998). Organizing knowledge. California Management Review, 40(3), 90-111.
Brown, T. (2009). Change by design: How design thinking transforms organizations and inspires innovation. New York, NY: HarperCollins.
Bryk, A. S. (2009). Support a science of performance improvement. Phi Delta Kappan, 90, 597-600.
Bryk, A. S., Gomez, L. M., & Grunow, A. (2010). Getting ideas into action: Building networked improvement communities in education. Stanford, CA: Carnegie Foundation for the Advancement of Teaching.
Burwell, S. M., Munoz, C., Holdren, J., & Krueger, A. (2013). Next steps in the evidence and innovation agenda. Retrieved from http://www.whitehouse.gov/sites/default/files/omb/memoranda/2013/m-13-17.pdf
Camburn, E., Rowan, B., & Taylor, J. T. (2003). Distributed leadership in schools: The case of elementary schools adopting comprehensive school reform models. Educational Evaluation and Policy Analysis, 25, 347-373.
Campbell Collaboration. (2013). What helps? What harms? Based on what evidence? Available from http://www.campbellcollaboration.org
Center for Research on Education Outcomes. (2009). Multiple choice: Charter school performance in 16 states. Palo Alto, CA: Center for Research on Education Outcomes, Stanford University.
Choo, C. W., & Bontis, N. (Eds.). (2002). The strategic management of intellectual capital and organizational knowledge. New York, NY: Oxford University Press.
Coburn, C. E. (2003). Rethinking scale: Moving beyond numbers to deep and lasting change. Educational Researcher, 32(6), 3-12.
Cohen, D. K., & Ball, D. B. (2007). Educational innovation and the problem of scale. In B. Schneider & S. K. McDonald (Eds.), Scale up in education: Ideas in principle (Vol. I, pp. 19-36). Lanham, MD: Rowman & Littlefield.
Cohen, D. K., Peurach, D. J., Glazer, J. L., Gates, K. G., & Goldin, S. (2014). Improvement by design: The promise of better schools. Chicago, IL: University of Chicago Press.
Cohen, W. M., & Levinthal, D. A. (1990). Absorptive capacity: A new perspective on learning and innovation. Administrative Science Quarterly, 35, 128-152.
Datnow, A. (2000). Power and politics in the adoption of school reform models. Educational Evaluation and Policy Analysis, 22, 357-374.
Datnow, A., Hubbard, L., & Mehan, H. (2002). Extending educational reform: From one school to many. New York, NY: Routledge.
Datnow, A., & Park, V. (2009). Towards the co-construction of educational policy: Large-scale reform in an era of complexity. In D. Plank, B. Schneider, & G. Sykes (Eds.), Handbook of education policy research (pp. 348-361). New York, NY: Routledge.
DeArmond, M., Gross, B., Bowen, M., Demeritt, A., & Lake, R. (2012). Managing talent for school coherence: Learning from charter management organizations. Seattle: University of Washington, Center on Reinventing Public Education.
Dosi, G., Nelson, R. R., & Winter, S. G. (2001). Introduction: The nature and dynamics of organizational capabilities. In G. Dosi, R. R. Nelson, & S. G. Winter (Eds.), The nature and dynamics of organizational capabilities (pp. 51-68). New York, NY: Oxford University Press.
Dozois, E., Langlois, M., & Blanchet-Cohen, N. (2010). A practitioner’s guide to developmental evaluation. Montreal, Quebec, Canada: The J.W. McConnell Family Foundation.
Earl, M. J., & Scott, I. A. (1999). What is a chief knowledge officer? Sloan Management Review, 40(2), 29-38.
Easterly, W. (2009). The civil war in development economics. Retrieved from http://aidwatchers.com/2009/12/the-civil-war-in-development-economics/
Education Sector. (2009). Growing pains: Scaling up the nation’s best charter schools. Washington, DC: Education Sector.
Eisenhardt, K. M., & Martin, J. A. (2000). Dynamic capabilities: What are they? Strategic Management Journal, 21, 1105-1121.
Feldman, M. S., & Pentland, B. T. (2003). Reconceptualizing organizational routines as a source of flexibility and change. Administrative Science Quarterly, 48, 94-118.
Foray, D., Murnane, R., & Nelson, R. (2007). Randomized trials of educational and medical practices: Strengths and limitations. Economics of Innovation and New Technology, 16, 303-306.
Gamble, J. A. A. (2008). A developmental evaluation primer. Montreal, Quebec, Canada: The J.W. McConnell Family Foundation.
Gawande, A. (2002). Complications: A surgeon’s notes on an imperfect science. New York, NY: Picador.
Gawande, A. (2007). Better: A surgeon’s notes on performance. New York, NY: Picador.
Gawande, A. (2009). The checklist manifesto: How to get things right. New York, NY: Metropolitan Books.
Glazer, J. L. (2009a). External efforts at district-level reform: The case of the National Alliance for Restructuring Education. Journal of Educational Change, 10, 295-314.
Glazer, J. L. (2009b). How external interveners leverage large-scale change: The case of America’s Choice, 1998-2003. Educational Evaluation and Policy Analysis, 31, 269-297.
Glazer, J. L., & Peurach, D. J. (2013). School improvement networks as a strategy for large-scale education reform: The role of environments. Educational Policy, 27, 676-710.
Glennan, T. K., Jr., Bodilly, S. J., Galegher, J. R., & Kerr, K. A. (2004). Expanding the reach of educational reforms: Perspectives from leaders in the scale-up of educational interventions. Santa Monica, CA: RAND.
Granger, R. C. (2011). The big why: A learning agenda for the scale-up movement. Pathways, Winter, 28-32.
Grant, R. M. (1996). Toward a knowledge-based theory of the firm [Special Winter issue]. Strategic Management Journal, 17, 109-122.
Gross, B., & DeArmond, M. (2010). Parallel patterns: Teacher attrition in charter vs. district schools. Seattle: Center on Reinventing Public Education, University of Washington.
Haskins, R., & Baron, J. (2011). Part 6: The Obama Administration’s evidence-based social policy initiatives: An overview. In R. Puttick (Ed.), Evidence for social policy and practice: Perspectives on how research and evidence can influence decision making in public services (pp. 28-35). London, England: Nesta.
Hatch, T. (2000). What does it take to break the mold? Rhetoric and reality in New American Schools. Teachers College Record, 102, 561-589.
Honig, M. I. (2006). Complexity and policy implementation: Challenges and opportunities for the field. In M. I. Honig (Ed.), New directions in policy implementation: Confronting complexity (pp. 1-24). Albany: State University of New York.
Ingersoll, R. M. (2001). Teacher turnover and teacher shortages: An organizational analysis. American Educational Research Journal, 38, 499-534.
Institute of Education Sciences. (2012a). About IES: Connecting research, policy, and practice. Available from http://ies.ed.gov/aboutus/
Institute of Education Sciences. (2012b). Request for applications: Education research grants. Retrieved from http://ies.ed.gov/funding/pdf/2012_84305A.pdf
Institute of Education Sciences. (2013). Seeking comment on new IES research topic, Continuous Improvement Research in Education. Retrieved from http://ies.ed.gov/funding/comment_CIRE.asp/
Institute of Education Sciences/National Science Foundation. (2013). Common guidelines for education research and development. Washington, DC: Institute of Education Sciences.
Kaestle, C. F. (1993). The awful reputation of education research. Educational Researcher, 22(1), 23, 26-31.
Khandker, S. R., Koolwal, G. B., & Samad, H. A. (2010). Handbook on impact evaluation: Quantitative methods and practice. Washington, DC: The World Bank.
Lake, R., Dusseault, B., Bowen, M., Demeritt, A., & Hill, P. (2010). The National Study of Charter Management Organizations (CMO) Effectiveness: Interim findings. Seattle: Center on Reinventing Public Education, University of Washington.
Lam, C. Y. (2011). A case study on the use of developmental evaluation for innovating: Navigating uncertainty and unpacking complexity (Unpublished master’s thesis). Queen’s University, Kingston, Ontario, Canada.
Latour, B. (1988). Science in action: How to follow scientists and engineers through society. Cambridge, MA: Harvard University Press.
Levine, A. (2005). Educating school leaders. New York, NY: The Education School Project.
Levine, A. (2006). Educating school teachers. New York, NY: The Education School Project.
Lewin, R. (1999). Complexity: Life at the edge of chaos. Chicago, IL: University of Chicago Press.
March, J. G. (1996). Exploration and exploitation in organizational learning. In M. D. Cohen & L. S. Sproull (Eds.), Organizational learning (pp. 101-123). Thousand Oaks, CA: Sage. (Reprinted from Organization science, 2(1), 1991)
Marsh, J., Hamilton, L., & Gill, B. (2008). Assistance and accountability in externally managed schools: The case of Edison Schools, Inc. Peabody Journal of Education, 83, 423-458.
McDonald, J. P., Klein, E. J., & Riordan, M. (2009). Going to scale with new schools designs: Reinventing high schools. New York, NY: Teachers College Press.
McLaughlin, M. W., & Mitra, D. (2001). Theory-based change and change-based theory: Going deeper and going broader. Journal of Educational Change, 2, 301-323.
Mehta, J., Gomez, L. M., & Bryk, A. S. (2012). Building on practical knowledge: The key to a stronger profession is learning in the field. In J. Mehta, R. B. Schwartz, & F. M. Hess (Eds.), The futures of school reform (pp. 35-64). Cambridge, MA: Harvard Education Press.
Mosteller, F., & Boruch, R. (Eds.). (2002). Evidence matters: Randomized trials in education research. Washington, DC: Brookings.
Musser, J., & O’Reilly, T. (2006). Web 2.0: Principles and best practices. Cambridge, MA: O’Reilly Media.
National Charter School Research Project. (2007). Quantity counts: The growth of charter school management organizations. Seattle: National Charter School Research Project, University of Washington.
Nelson, R. R., & Winter, S. G. (1982). An evolutionary theory of economic change. Cambridge, MA: Harvard University Press.
Patton, M. Q. (2006). Evaluation for the way we work. Nonprofit Quarterly, 13, 28-33.
Patton, M. Q. (2008). Utilization-focused evaluation (4th ed.). Thousand Oaks, CA: Sage.
Patton, M. Q. (2011). Developmental evaluation: Applying concepts to enhance inno- vation and use. New York, NY: Guilford Press.
Patton, M. Q. (2012). Essentials of utilization-focused evaluation. Thousand Oaks, CA: Sage.
Penuel, W., Fishman, B., Cheng, B. H., & Sabelli, N. (2011). Organizing research and development at the intersection of learning, implementation, and design. Educational Researcher, 40, 331-337.
Peurach, D. J. (2011). Seeing complexity in public education: Problems, possibilities, and success for all. New York, NY: Oxford University Press.
Peurach, D. J., & Glazer, J. L. (2012). Reconsidering replication: New perspectives on large-scale school improvement. Journal of Educational Change, 13, 155-190.
Peurach, D. J., Glazer, J. L., & Lenhoff, S. W. (2012). Make or buy? That’s really not the question: Considerations for systemic school improvement. Phi Delta Kappan, 93(7), 51-55.
Peurach, D. J., & Gumus, E. (2011). Executive leadership in school improvement networks: A conceptual framework and agenda for research. Current Issues in Education, 14(3), 1-17
Preskill, H., & Beer, T. (2012). Evaluating social innovation. Washington, DC: Center for Evaluation Innovation.
Preskill, H., & Torres, R. T. (2000). The learning dimension of evaluation use. New Directions for Evaluation, 88(Winter), 25-37.
Ramstad, E. (2009). Developmental evaluation framework for innovation and learn- ing networks: Integration of the structure, process and outcomes. Journal of Workplace Learning, 21, 181-197.
Raudenbush, S. W. (2007). Designing field trials of educational innovations. In B. Schneider & S. K. McDonald (Eds.), Scale up in education: Issues in practice (Vol. II, pp. 1-15). Lanham, MD: Rowman & Littlefield.
Rogers, E. M. (1995). Diffusion of innovations (4th ed.). New York, NY: Free Press.
Ronfeldt, M., Loeb, S., & Wyckoff, J. (2013). How teacher turnover harms student achievement. American Educational Research Journal, 50, 4-36.
Rowan, B. (1990). Commitment and control: Alternative strategies for the organi- zational design of schools. In C. Cazden (Ed.), Review of research in educa- tion (Vol. 16, pp. 353-389). Washington, DC: American Educational Research Association.
Peurach et al. 647
Rowan, B. (2002). The ecology of school improvement: Notes on the school improvement industry in the United States. Journal of Educational Change, 3, 283-314.
Rowan, B., Camburn, E., & Barnes, C. (2004). Benefiting from comprehensive school reform: A review of research on CSR implementation. In C. Cross (Ed.), Putting the pieces together: Lessons from comprehensive school reform research (pp. 1-52). Washington, DC: National Clearinghouse for Comprehensive School Reform.
Rowan, B., Correnti, R. J., Miller, R. J., & Camburn, E. M. (2009a). School improve- ment by design: Lessons from a study of comprehensive school reform programs. In G. Sykes, B. Schneider, & D. Plank (Eds.), AERA handbook on education policy research (pp. 637-651). New York, NY: Routledge.
Rowan, B., Correnti, R. J., Miller, R. J., & Camburn, E. M. (2009b). School improve- ment by design: Lessons from a study of comprehensive school reform programs. Philadelphia, PA: Consortium for Policy Research in Education.
Schneider, B., & McDonald, S. K. (Eds.). (2007). Scale up in education, Volume II: Issues in practice. Lanham, MD: Rowman & Littlefield.
Schochet, P. Z. (2008). Technical methods report: Statistical power for regression Discontinuity designs in education evaluations (NCEE 2008-4026). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education.
Slavin, R. E., & Fashola, O. S. (1998). Show me the evidence! Proven and promising programs for America’s schools. Thousand Oaks, CA: Corwin Press.
Stewart, T. A. (1997). Intellectual capital: The new wealth of organizations. New York, NY: Doubleday.
Stuit, D. A., & Smith, T. M. (2009). Teacher turnover in charter schools. Nashville, TN: Peabody College, National Center on School Choice, Vanderbilt University.
Sutcliffe, K., & Weber, K. (2003). The high cost of accuracy. Harvard Business Review, 81, 74-82.
Sydow, J. (2004). Network development by means of network evaluation? Explorative insights from a case in the financial services industry. Human Relations, 57, 201- 220.
Sykes, G., Bird, T., & Kennedy, M. (2010). Teacher education: Its problems and some prospects. Journal of Teacher Education, 61, 464-476.
Szulanski, G., & Winter, S. G. (2002). Getting it right the second time. Harvard Business Review, 80(January-February), 62-69.
Szulanski, G., Winter, S. G., Cappetta, R., & Van den Bulte, C. (2002). Opening the black box of knowledge transfer: The role of replication accuracy. Philadelphia: Wharton School of Business, University of Pennsylvania.
Trujillo, T. (2012). The paradoxical logic of school turnarounds: A Catch-22 (ID No. 16797). Teachers College Record. Available from http://www.tcrecord.org.
U.S. Department of Education. (2010). Investing in Innovation Fund (i3) program: Guidance and frequently asked questions. Retrieved from http://www2.ed.gov/ programs/innovation/faqs.pdf
648 Educational Policy 30(4)
Van de Ven., A. H., Polley, D. E., Garud, R., & Venkataraman, S. (1999). The innova- tion journey. Oxford, UK: Oxford University Press.
Waldrop, M. M. (1992). Complexity: The emerging science at the edge of order and chaos. New York, NY: Simon & Schuster.
Weiss, C. H. (1997). Theory-based evaluation: Past, present, and future. New Directions for Evaluation, 76(Winter), 41-55.
Weiss, M. J., Bloom, H. S., & Brock, T. (2013). A conceptual framework for studying the sources of variation in program effects. New York, NY: MDRC.
Wernerfelt, B. (1995). The resource-based view of the firm: Ten years after. Strategic Management Journal, 16, 171-174.
Westley, F., Zimmerman, B., & Patton, M. Q. (2007). Getting to maybe: How the world is change. Toronto, Ontario: Vintage Canada.
Winter, S. G. (2003). Understanding dynamic capabilities. Strategic Management Journal, 24, 991-995.
Winter, S. G. (2010). The replication perspective on productive knowledge. In H. Itami, K. Kusunoki, T. Numagami, & A. Takeishi (Eds.), Dynamics of knowledge, corporate systems, and innovation (pp. 85-124). New York, NY: Springer.
Winter, S. G. (2012). Capabilities: Their origin and ancestry. Philadelphia: The Wharton School, University of Pennsylvania.
Winter, S. G., & Szulanski, G. (2001). Replication as strategy. Organization Science, 12, 730-743.
Winter, S. G., & Szulanski, G. (2002). Replication of organizational routines: Conceptualizing the exploitation of knowledge assets. In C. W. Choo & N. Bontis (Eds.), The strategic management of intellectual capital and organiza- tional knowledge (pp. 207-222). New York, NY: Oxford University Press.
Zollo, M., & Winter, S. G. (2002). Deliberate learning and the evolution of dynamic capabilities. Organization Science, 13, 339-351.
Author Biographies
Donald J. Peurach is an Assistant Professor of educational leadership and policy in the School of Education at the University of Michigan. His research focuses on large-scale, practice-focused improvement efforts in underperforming schools.
Joshua L. Glazer is a Visiting Associate Professor of educational administration at the Graduate School of Education and Human Development at George Washington University. He is currently directing a study of Tennessee's Achievement School District, an ambitious initiative to improve the state's most underperforming schools.
Sarah Winchell Lenhoff is the director of policy and research at Education Trust–Midwest. Her work focuses on the intersections of policy and practice in school and educator improvement efforts.
EVALUATION MATTERS: GETTING THE INFORMATION YOU NEED FROM YOUR EVALUATION

A GUIDE FOR EDUCATORS TO BUILD EVALUATION INTO PROGRAM PLANNING AND DECISION-MAKING, USING A THEORY-DRIVEN, EMBEDDED APPROACH TO EVALUATION
Prepared by:
Susan P. Giancola
Giancola Research Associates, Inc.
Prepared for:
U.S. Department of Education
Office of Elementary and Secondary Education
School Support and Rural Programs

DRAFT 2014
This publication was prepared for the U.S. Department of Education under Contract Number ED-07-CO-0098 (Contracting Officer's Representatives: Kenneth Taylor, Sharon Horn, and Vickie Banagan) with Kauffman & Associates, Inc. The views expressed in this publication do not necessarily reflect the positions or policies of the U.S. Department of Education. For the reader's convenience, this publication contains information about and from outside organizations, including hyperlinks and URLs. Inclusion does not constitute endorsement by the Department of any outside organization or the products or services offered or views expressed. Nor is any endorsement intended or implied of the consulting firm "Evaluation Matters." In fact, this publication was not prepared with help from or in consultation with, in any manner, that firm.

U.S. Department of Education
Arne Duncan, Secretary

Office of Elementary and Secondary Education
Deb Delisle, Assistant Secretary

School Support and Rural Programs
Jenelle V. Leonard, Director

DRAFT January 2014

This publication is in the public domain, except for the 1:1 Implementation Rubric in Appendix B, for which the William & Ida Friday Institute for Educational Innovation at North Carolina State University kindly granted permission to reproduce herein. Authorization to reproduce Evaluation Matters in whole or in part—except for the 1:1 Implementation Rubric—is granted. Any further use of the 1:1 Implementation Rubric in Appendix B is subject to the permission of the William & Ida Friday Institute (for more information, email Jeni Corn, director of evaluation programs, Friday Institute, at [email protected]).

The citation for Evaluation Matters should be: U.S. Department of Education, Office of Elementary and Secondary Education, School Support and Rural Programs, Evaluation Matters: Getting the Information You Need From Your Evaluation, Washington, D.C., 2014.

To obtain copies of this publication,
Write to ED Pubs, Education Publications Center, U.S. Department of Education, P.O. Box 22207, Alexandria, VA 22304. Or fax your request to 703-605-6794. Or email your request to [email protected]. Or call in your request toll-free to 1-877-433-7827 (1-877-4-ED-PUBS). Those who use a telecommunications device for the deaf (TDD) or a teletypewriter (TTY) should call 1-877-576-7734. If 877 service is not yet available in your area, call 1-800-872-5327 (1-800-USA-LEARN). Or order online at http://edpubs.gov.

On request, this publication is available in alternate formats, such as Braille, large print, audiotape, or compact disk. For more information, please contact the Department's Alternate Format Center at 202-260-9895 or 202-260-0818. In addition, if you have difficulty understanding English, you may request language assistance services for Department information that is available to the public. These language services are available free of charge. If you need more information about interpretation or translation services, please call 1-800-USA-LEARN (1-800-872-5327) (TTY: 1-800-437-0833), or email the content contact below.

Content Contact: Nancy Loy, Project Officer
Phone: 202-205-5375; Email: [email protected]
Contents

Acknowledgements
Before You Get Started
Introduction
    What Is the Purpose of the Guide?
    Why Evaluate and What Do I Need to Consider?
    Where Do I Start?
    How Is the Guide Organized?
Embedding Evaluation Into the Program
    STEP 1: DEFINE – What Is the Program?
    STEP 2: PLAN – How Do I Plan the Evaluation?
    STEP 3: IMPLEMENT – How Do I Evaluate the Program?
    STEP 4: INTERPRET – How Do I Interpret the Results?
    STEP 5: INFORM and REFINE – How Do I Use the Evaluation Results?
Appendix A: Embedded Evaluation Illustration – READ*
    Program Snapshot
    Step 1: Define the Program
    Step 2: Plan the Evaluation
    Step 3: Implement the Evaluation
    Step 4: Interpret the Results
    Step 5: Inform and Refine – Using the Results
Appendix B: Embedded Evaluation Illustration – NowPLAN*
    Program Snapshot
    Step 1: Define the Program
    Step 2: Plan the Evaluation
    Step 3: Implement the Evaluation
    Step 4: Interpret the Results
    Step 5: Inform and Refine – Using the Results
Appendix C: Evaluation Resources
    Evaluation Approaches
    Program Theory and Logic Modeling
    Research and Evaluation Design, Including Reliability and Validity
    Threats to Validity
    Budgeting Time and Money
    Ethical Issues
    Data Collection, Preparation, and Analysis
    Evaluation Pitfalls
    Interpreting, Reporting, Communicating, and Using Evaluation Results
Appendix D: Evaluation Instruments for Educational Technology Initiatives
Appendix E: Evaluation Templates
Appendix F: Lists of Tables and Figures
    List of Tables
    List of Figures
Acknowledgements
This guide was created with valuable input and advice from many individuals. Some helped shape the guide's initial conceptual framework, some edited portions of the guide, and some reviewed draft versions.
Kathleen Barnhart, Principal Education Consultant, Illinois State Board of Education
Barbara DeCarlo, Retired Principal and Teacher
Beverly Funkhouser, Adjunct Professor, University of Delaware
Rick Gaisford, Educational Technology Specialist, Utah State Office of Education
Robert Hampel, Interim Director, School of Education, University of Delaware
Vic Jaras, Education Technology Director, Iowa Department of Education
Karen Kahan, Director of Educational Technology, Texas Education Agency
Tonya Leija, Reading Recovery Teacher Leader, Spokane Public Schools
Melinda Maddox, Director of Technology Initiatives, Alabama Department of Education
Daniel Maguire, District Instructional Technology Coach, Kennett Consolidated School District
Jeff Mao, Learning Technology Policy Director, Maine Department of Education
Jennifer Maxfield, Research Associate, Friday Institute for Educational Innovation, North Carolina State University
Brandy Parker, Graduate Research Assistant, Friday Institute for Educational Innovation, North Carolina State University
Shannon Parks, State Education Administrator, Technology Initiatives, Alabama Department of Education
Barry Tomasetti, Superintendent, Kennett Consolidated School District
Bruce Umpstead, Educational Technology Director, Michigan Department of Education
Carla Wade, Technology Education Specialist, Oregon Department of Education
Brent Williams, Director, Educational Technology Center, Kennesaw State University
Thanks also to Jeni Corn, Director of Evaluation Programs at the William & Ida Friday Institute for Educational Innovation at North Carolina State University, for obtaining approval to use the 1:1 Implementation Rubric in the Evaluation Matters guide.
I would like to extend a special thank you to Jenelle Leonard, Director of School Support and Rural Programs (SSRP), Office of Elementary and Secondary Education (OESE) at the U.S. Department of Education, for being the driving force in the creation of this guide.
In addition, I would like to thank Andy Leija, Kelly Bundy, Kim Blessing, Janelle McCabe, and Anna Morgan with Kauffman & Associates, Inc. for their continued support during the creation of the guide.
And finally, I would like to especially thank Nancy Loy (SSRP/OESE) at the U.S. Department of Education for her constant assistance and support throughout the development and writing of the guide, from brainstorming ideas to reading multiple drafts to facilitating review of the evaluation guide.
Portions of this guide were adapted from An Educator’s Guide to Evaluating the Use of Technology in Schools and Classrooms prepared by Sherri Quiñones and Rita Kirshstein at the American Institutes for Research for the U.S. Department of Education in 1998 (Nancy Loy, Project Officer).
Before You Get Started
Some who use this guide, especially those who are unfamiliar with evaluation or educational program design, may decide to read it cover to cover. Most readers, however, will likely treat it as a reference and companion, turning to the portions that are relevant to their current needs. Several features will help you navigate the guide in this way.
Click on the I note icon to go to excerpts from Appendix A: Embedded Evaluation Illustration – READ* that appear throughout the text to illustrate each step of the
evaluation process. If you find the excerpts interspersed within text to be distracting, you may want to skip them in the main text and instead read the example in its entirety in Appendix A. There you will find a detailed example of a theory-driven, embedded program evaluation from its inception through the use of its first-year results. Appendix B: Embedded Evaluation Illustration – NowPLAN* provides another example. Both examples set out in this guide are provided solely for the purpose of illustrating how the principles in this guide can be applied in actual situations. The programs, characters, schools, and school districts mentioned in the examples are fictitious.
Click on the R note icon to see additional resources on a topic included in Appendix C: Evaluation Resources.
Introduction

What Is the Purpose of the Guide?
Who Is this Guide For?
This guide is written for educators. The primary intended audience is state- and district-level educators (e.g., curriculum supervisors, district office personnel, and state-level administrators). Teachers, school administrators, and board members also may find the guide useful. It is intended to help you build evaluation into the programs and projects you use in your classrooms, schools, districts, and state. This guide will also provide a foundation in understanding how to be an informed, active partner with an evaluator to make sure that evaluation provides the information you need to improve the success of your program, as well as to make decisions about whether to continue, expand, or discontinue a program.
No previous evaluation knowledge is needed to understand the material presented. However, this guide may also be useful for experienced evaluators who want to learn more about how to incorporate theory-based evaluation methods into their programs and projects.
In addition to using the guide to embed evaluation within your program, the guide will be useful for
• State education agencies during preparation of program and evaluation guidelines within Requests for Proposals (RFPs), in order to facilitate uniform assessments of proposals and for districts to know how their proposals will be assessed.

• School districts in responding to RFPs or in writing grant proposals, in order to set clear expectations for what a program is intended to accomplish and how the evaluation will be embedded within the program to measure changes as a result of the program.

• Teams of educators to show value added for a program, in order to build program support and provide budget justification.

• Program staff to tell the story of a program using data.

• Organizations for evaluation training and professional development.
How Is this Guide Different From Other Evaluation Guides?
There are many evaluation guidebooks, manuals, and tool kits readily available. So, what makes the material presented in this guide different from other evaluation guides? This guide is written with you, the educator, in mind. It outlines an evaluation approach that can be built
into your everyday practice. It recognizes the preciousness of time, the need for information, and the tension between the two. The theory-driven, embedded approach to evaluation is not an additional step to be superimposed upon what you do and the strategies you use but rather a way to weave evaluation into the design, development, and implementation of your programs and projects.
The term program is used broadly in this guide to represent activities, small interventions, classroom-based projects, schoolwide programs, and district or statewide initiatives.
This guide will help you to embed evaluation within your program in order to foster continuous improvement by making information and data the basis upon which your program operates. The step-by-step approach outlined in this guide is not simply a lesson in “how to evaluate” but rather a comprehensive approach to support you in planning and understanding your program, with a rigorous evaluation included as an integral part of your program’s design.
In Appendices A and B, you will find two examples of educators building evaluation into their everyday practices. Through a narrative about programs, characters, schools, and school districts that are fictitious, each example is designed to illustrate how the principles in this guide can be applied in actual situations. While embedded evaluation can be used for any type of program you may be implementing, these illustrations specifically focus on programs that involve infusing technology into the curriculum in order to meet teaching and learning goals.
Why Evaluate and What Do I Need to Consider?
Why Evaluate?
Evaluation is important so that we can be confident the programs we are using in our schools and classrooms are successful. A common criticism regarding evaluation is that it takes time and resources that could be dedicated to educating students. However, evaluation, done properly, can actually result in better quality practices being delivered more effectively to enhance student learning.
You would not hire new teachers without regular monitoring and mentoring to help them improve their skills and foster student success. Would you adopt and maintain a new curriculum full scale without being sure that student learning improved when you tested the new curriculum? What if student learning declined after implementing a new curriculum? How would you know whether the curriculum did not work well because it was a faulty curriculum, because teachers were not trained in how to use it, or because it was not implemented properly? Building evaluation into your educational programs and strategies enables you to make midcourse corrections and informed decisions regarding whether a program should be continued, expanded, scaled down, or discontinued.
Evaluation enables you to identify and use better quality practices more effectively to improve learning outcomes.
A primary purpose of evaluation is to make summative decisions. You can use summative evaluation results from rigorous evaluations to make final, outcome-related decisions about whether a program should be funded or whether program funding should be changed. Summative decisions include whether to continue, expand, or discontinue a program based on evaluation findings.
Another important purpose of evaluation is to make formative decisions. You can use formative evaluation data from rigorous evaluations to improve your program while it is in operation. Formative evaluation examines the implementation process, as well as outcomes measured throughout program implementation, in order to make decisions about midcourse adjustments, technical assistance, or professional development that may be needed, as well as to document your program’s implementation so that educators in other classrooms, schools, or districts can learn from your program’s evaluation.
Who Should Do the Evaluation?
Once you have decided to evaluate the implementation and effectiveness of a program, the next step is to determine who should conduct the evaluation. An evaluation can be conducted by someone internal to your organization or someone external to your organization. However, the ideal arrangement is a partnership between the two, i.e., forming an evaluation team that includes both an internal and an external evaluator.
Preferably, evaluation is a partnership between staff internal to your organization assigned to the evaluation and an experienced, external evaluator.
Such a partnership will ensure that the evaluation provides the information you need for program improvement and decision-making. It also can build evaluation capacity within your organization.
An internal evaluator may be someone at the school building, district office, or state level. For evaluations that focus on program improvement
and effectiveness, having an internal evaluator on your evaluation team can foster a deeper understanding of the context in which the program operates. Involving people inside your organization also helps to build capacity within your school or district to conduct evaluation. An internal evaluator should be someone who is in a position to be objective regarding program strengths and weaknesses. For this reason, choosing an internal evaluator who is responsible for the program’s success is not recommended and may compromise the evaluation. In order to maintain objectivity, an internal evaluator should be external to the program. However, while staff internal to the program itself should not be part of the evaluation team, they should certainly partner with the evaluation team in order to ensure that the evaluation informs the program during every phase of implementation.
It is good practice to have an external evaluator be part of your evaluation team. Using an external evaluator as a “critical friend” provides you with an extra set of eyes and a fresh perspective from which to review your design and results. Professional evaluators are trained in the design of evaluations to improve usability of the findings, and they are skilled in data
collection techniques such as survey design, focus group facilitation, interviewing, selecting quality assessments, and conducting observations. An experienced evaluator can also help you analyze and interpret your data, as well as guide you in the use of your results. Further, when you are very close to the program being evaluated, objectivity or perceived objectivity may suffer.
The choice of who conducts your evaluation should depend upon the anticipated use of the results and the intended audience, as well as your available resources.
Partnering with an external evaluator can improve the credibility of the findings, as some may question whether an evaluator internal to an organization can have the objectivity to recognize areas for improvement and to report results that might be unfavorable to the program. For some programs, you may choose to use an evaluator who is external to your organization to be the sole or
primary evaluator. An external evaluator may be a researcher or professor from your local university or a professional evaluator from a private evaluation firm.
The choice of who conducts your evaluation should depend upon the anticipated use of the results and the intended audience, as well as your available resources. If evaluation results are to be used with current or potential funding agencies to foster support and assistance, contracting with an external evaluator would be your most prudent choice. If the evaluation is primarily intended for use by your
organization in order to improve programs and understand impact, an evaluation team comprised of an internal and an external evaluator may be preferred. Connecting with someone external to your organization to assist with the evaluation and results interpretation will likely enhance the usability of your evaluation and the credibility of your evaluation findings. Evaluation as a partnership between an internal evaluator and an external evaluator is the ideal arrangement to ensure the utility of the evaluation and its results.
The focus of embedded evaluation is to enable educators to build and implement high-quality programs that are continuously improving, as well as for educators to know when to discontinue programs that are not working.
For some programs, while an external evaluator might be preferred, funding an evaluator who is external to your organization may not be feasible. In such cases, partnering with an evaluator who is internal to your organization, yet external to your program, might work well. For instance, staff from a curriculum and instruction office implementing a program might partner with staff from another office within the district, such as an assessment or evaluation office, to conduct the evaluation.
If resources are not available for an external evaluator and there is no office or department in your organization that is not affected by your program, you may want to consider other potentially affordable evaluation options. You could put out a call to individuals with evaluation experience within your community who might be willing to donate time to your program, contact a local university or community college regarding faculty or staff with evaluation experience who might work with you at a reduced rate, ask your local university if there is a doctoral student in evaluation who is looking for a research opportunity or dissertation project, or explore grant opportunities that fund evaluation activities.
What Is Embedded Evaluation?
The embedded evaluation approach presented in this guide is one of many approaches that can be taken when conducting an evaluation. Embedded evaluation combines elements from several approaches, including theory-based evaluation, logic modeling, stakeholder evaluation,
and utilization-focused evaluation. See Appendix C: Evaluation Resources for resources with additional information on evaluation approaches.
Further, it is important to note that evaluation is not a linear process. While the steps of embedded evaluation may appear as if they are linear rungs on a ladder culminating with the final step, they are not rigid steps. Rather, embedded evaluation steps build on each other and depend upon decisions made in prior steps, and information learned in one step may lead to refinement in a previous step. The steps of embedded evaluation are components of the evaluation process that impact and influence each other. What you learn or decide in one step may prompt you to return to a previous step for modifications and improvements. Just as programs are ongoing, evaluation is dynamic.
The dynamic nature of evaluation and the interconnectedness of an embedded evaluation with the program itself may not sit well with researchers who prefer to wait until a predefined time to divulge findings. And inarguably, having a program stay its course without midcourse refinements and improvements would make cross-site comparisons and replication easier. However, embedded
evaluation is built upon the principle of continuous program improvement. With embedded evaluation, as information is gathered and lessons are learned, the program is improved. The focus of embedded evaluation is to enable educators to build and implement high-quality programs that are continuously improving, as well as to determine when programs are not working and need to be discontinued. The overall purpose of designing a rigorous, embedded evaluation is to aid educators in providing an effective education for students.
Evaluation is a dynamic process. While embedded evaluation leads the evaluator through a stepped process, these steps are not meant to be items on a checklist. Information learned in one step may lead to refinement in a previous step. The steps of embedded evaluation are components of the evaluation process that impact and influence each other.
Where Do I Start?
Just as the first step in solving a problem is to understand the problem, the first step in conducting an evaluation is to understand what you want to evaluate. For the purposes of this guide, what you want to evaluate is referred to as the "program." It is important to note that the term program is used broadly in this guide to represent small interventions, classroom-based projects, schoolwide programs, and districtwide or statewide initiatives.
The first step in evaluation is to understand what it is you want to evaluate.
You can use the evaluation process that is presented in this guide to define and evaluate a small project, as well as to understand and evaluate the inner workings of large programs and initiatives. Regardless of the size or type of program, understanding the program is not only the first step in evaluation. It also is the most important step. Defining why your program should work and making the theory that underlies your program explicit lay the foundation upon which you can accomplish program improvement and measure program effectiveness.
How Is the Guide Organized?
Steps to Embed Evaluation Into the Program
This guide presents a framework to aid you in embedding evaluation into your program planning, design, and decision-making. You will be led step-by-step from documenting how and why your program works to using your evaluation results (see Figure 1: Embedded Evaluation Model). The framework is based on the following five steps:
STEP 1: DEFINE – What is the program?
STEP 2: PLAN – How do I plan the evaluation?
STEP 3: IMPLEMENT – How do I evaluate the program?
STEP 4: INTERPRET – How do I interpret the results?
STEP 5: INFORM (a) and REFINE (b) – How do I use the results?
Throughout the guide, the boxed notes highlight important evaluation ideas. As mentioned earlier,
I notes provide excerpts from Appendix A: Embedded Evaluation Illustration – READ* to illustrate the process of designing an evaluation from understanding the program to using results, and
R notes indicate that additional resources on a topic are included in Appendix C: Evaluation Resources.
Appendices Appendices A and B provide examples of theory-driven, embedded evaluations of two programs that involve infusing technology into the curriculum in order to meet teaching and learning goals. These examples are provided solely for the purpose of illustrating how the principles in this guide can be applied in actual situations. The programs, characters, schools, and school districts mentioned in the examples are fictitious. The examples include methods and tools to aid you as you build evaluation into your programs and projects and become an informed, active partner with the evaluator.
The illustration in Appendix A: Embedded Evaluation Illustration – READ* is of a districtwide reading program that uses technology to improve literacy outcomes and to assess reading progress. The illustration in Appendix B: Embedded Evaluation Illustration – NowPLAN* focuses on a building-level evaluation of a statewide strategic technology plan. This example builds evaluation into the everyday practice of educators in order to improve instruction and monitor strategic planning components.
Appendix C: Evaluation Resources and Appendix D: Evaluation Instruments for Educational Technology Initiatives include evaluation resources and information about instruments that you may find useful for your evaluations. Appendix E: Evaluation Templates includes a logic model template you can use to define your program and an evaluation matrix template to use to plan your evaluation. Finally, Appendix F: Lists of Tables and Figures appears at the end of the guide.
Figure 1: Embedded Evaluation Model
This figure illustrates the five-step, iterative evaluation process: define, plan, implement, interpret, and inform and refine.
Step 1 involves defining the program and logic. Ask these questions about the program: What is the program? What does the program purport to accomplish? What are the goals and objectives? What are the strategies and activities?
Ask these questions about the logic: How do program strategies relate to program goals? What is the underlying logic of the program? What are the program's short-term, intermediate, and long-term objectives? To what extent is program theory supported by rigorous research?
Step 2 involves planning the design. Ask these questions about the design: What questions should the evaluation answer? What indicators best address objectives? What evaluation methods should be used? What is the strongest design that can be feasibly implemented?
Step 3 involves implementation. Ask these questions about the evaluation: How should data be collected? How should data be organized and maintained? How should data be analyzed to best answer evaluation questions?
Step 4 involves interpreting the results. Ask these questions about the results: How should results be interpreted? How can the program be improved? To what extent did the program accomplish its goals? How should results be communicated? What can be done to make sure that evaluation results are used?
Step 5 has two parts: 5a is inform, and 5b is refine.
Embedding Evaluation Into the Program

STEP 1: DEFINE – What Is the Program?
How Can I Find Out More About the Program? (Understanding the Program)
The first step to conducting your evaluation is to understand what you want to evaluate. Whether you are evaluating a new program or a program that you have been using for some
For the past 5 years, reading scores in the Grovemont School District have been declining. The curriculum supervisor, Mrs. Anderson, has tried many strategies to improve reading skills. However, scores continue to decline. Mrs. Anderson has been searching for curricular and assessment materials that are better aligned with state reading standards and that provide ongoing standards-based assessment data. Mrs. Anderson found a program called READ (Reading Engagement for Achievement and Differentiation) that looked promising. After reviewing research on the program and documentation from the vendor, and after numerous discussions and interviews with other districts that had implemented the program, Mrs. Anderson and the district superintendent decided to present the READ program to the school board, in order to gain approval for funding the program for Grades 3-5.
At last month’s meeting, the school board voted to partially fund the READ program. Due to recent state budget cuts, the school board was only able to fund the program at 50% for 2 years. At the end of the 2 years, the board agreed to revisit its funding decision. The board required an evaluation report and presentation due in September of each year.
Before starting to plan the READ program, Mrs. Anderson invited one teacher from each of the district’s six elementary schools, the district reading coach, one of the district’s reading specialists, and the district technology coordinator to join the READ oversight team. This 10-member team was charged with planning the READ program and its evaluation. The team asked an evaluator from the local university to conduct the READ evaluation and to attend oversight team meetings.
Note: The examples set out in this guide are provided solely for the purpose of illustrating how the principles in this guide can be applied in actual situations. The programs, characters, schools, and school districts mentioned in the examples are fictitious.
time, it is still important to begin from the basics in understanding how a program works. Do not rely on what you already know about the program or what you believe the program is intended to accomplish. Instead, take what you know, and build upon it with information from multiple sources. By doing this, you will have a full understanding of the program including multiple perspectives and expectations, as well as basic underpinnings and complex inner workings.
So, how do you find out more about the program? If you have experience with the program, you should first document what you know. You may want to investigate whether any rigorous evaluations of the program have been conducted previously. If well designed and well carried out, previous evaluations can provide useful information regarding how a program operates.
Another good source from which you can learn more about the program is existing documentation. Documents such as technology plans, curriculum materials, strategic plans, district report cards, user manuals, and national, state, or district standards may have useful information for understanding your program and the context in which it will be implemented. Further, you may want to talk with people who are most familiar with the program, such as vendors and people from other districts that have implemented the program. Consider
The oversight team asked the external evaluator, Dr. Elm, to help them plan the evaluation. Dr. Elm suggested that the oversight team build evaluation into its program as the team is designing it. By embedding evaluation into the program, information from the evaluation would be available to guide program implementation. Evaluation data would both drive program improvement and be the foundation for future decisions regarding whether the program should be continued, expanded, scaled down, or discontinued.
The oversight team members invited Dr. Elm to lead them through the process of building evaluation into their program planning. Dr. Elm explained that the first step is to gain a thorough understanding of the program. In doing this, Mrs. Anderson shared the materials she had already reviewed with the oversight team. In addition, the oversight team contacted four school districts that had used the READ program successfully in order to learn more about the program. To develop a thorough and shared understanding of the context in which the READ program would be implemented, the team reviewed the state's reading standards, the district's strategic plan, the district's core learning goals and curriculum maps in reading, and the district's technology plan. The team also examined reading grades and state reading assessment scores for the district as a whole, as well as by school, English Language Learner (ELL) status, and special education status for the past 5 years.
conducting interviews and group discussions to learn more about their insight into the program, how it operates, and what goals it is intended to achieve.
Why Should the Program Work? (Explaining the Program Theory)
Once you have a good understanding of the program, the next step is for you to document more thoroughly what you know about the program. The first component in explaining the program is to describe the program's goals and objectives. Goals should reflect a shared understanding among program stakeholders as to what the program should achieve. What is the program intended to accomplish? How would you know if it worked? If the program were a success, what would have happened? What would have changed?
The next step, stated Dr. Elm, is to define the program by explaining the program theory. Explaining the program theory will include what the program is intended to accomplish, as well as how and why the program is expected to work. Dr. Elm recommended that the team complete the program theory in three parts: (a) defining the program’s long-term goals, (b) delineating the program’s strategies and activities, and (c) explaining how and why the team believes the program’s activities and strategies will result in the desired outcomes.
Your program may have one or two goals, or your program may have many goals. For some programs, the primary goal may be to improve student learning. For others, primary goals might be to affect teacher content knowledge and teacher practice. Goals may have to do with behavior, safety, involvement, or attitudes. The first piece in explaining the program is to list the overall goals of your program or initiative. Goal statements should be broad and general and should reflect the overall intent of your program or a shared vision of what your program is supposed to accomplish. Objectives tend to be more specific and are often short term or intermediate term. If objectives are known, record them. However, at this point in program planning, broad goal statements are sufficient.
Based on their review of documentation and research as well as discussions and interviews with other districts that have implemented the program, and from meetings with district administration and school staff, Mrs. Anderson and the oversight team set the following long-term goals for READ:
1. Increased student engagement in reading
2. Improved student reading skills
Once you have documented what the program is intended to accomplish, the next component is to document your program’s strategies and activities. How will the program accomplish these goals? What strategies will be used to achieve your goals? What activities will need to be put in place for the program? Does the program have activities that occur in the classroom, in another setting at school, at home, or in a combination of these settings?
The READ oversight team examined program materials to determine the primary components of the READ program. They determined that the READ program had three strategies: classroom lessons, homework, and assessments. Each of these strategies required certain activities in order to be successful. For instance, teachers would need professional development on how to integrate the READ classroom lessons into their instruction, as well as how to use the READ assessment data. Students would also need training in how to use the READ system in the classroom and at home.
Strategies might include activities such as professional development, technology access, and the use of curricular materials. Strategies might be ongoing throughout the program or drawn on at various stages during the program’s operation. Listing all strategies and activities used in your program is important to explain later on how and to what extent your program’s goals were met.
After careful review of the READ program and the district’s particular program needs, the oversight team outlined the following primary strategies and activities for the READ program:
1. Interactive, standards-based classroom lessons (using the READ software with interactive classroom technologies and individual handheld mobile devices for each student).
2. Standards-based reading assessments (Internet-based, formative READ assessments of student reading skills administered using the READ software).
3. Standards-based reading homework (Internet-based using READ software).
4. Teacher professional development on integrating READ into classroom instruction (using an interactive wireless pad).
5. Teacher professional development on using READ assessment data for classroom lesson planning.
6. Student training on using READ (in the classroom and at home).
At this point, you have documented your program’s goals and objectives, as well as the strategies and activities that will be conducted as part of the program to meet these goals. The next component is to relate program strategies and activities to program goals. Why should the program work? Why do you think implementing this set of strategies and activities will result in the goals you have set? The linkages between program strategies and program goals are assumptions as to why the program should work.
During a planning meeting focusing on why READ strategies and activities should result in the desired long-term goals, the oversight team brainstormed the underlying assumptions that were necessary for READ to work. The evaluator, Dr. Elm, facilitated the discussion among the oversight team members, leading them through the process of linking the program’s activities and strategies to the long-term goals. Dr. Elm asked each member of the team to record why and how they thought each strategy or activity would lead to increased student engagement and improved student reading skills. Team members shared their reasoning with the group.
These underlying assumptions, taken together, are the basis of the program’s theory. That is, the program’s theory is your theory as to why the program should work. Perhaps you believe that employing a set of curricular materials and providing professional development to teachers in the use of these materials will result in improved differentiation of instruction and ultimately increased student learning. Or an assumption might be that having students respond to teacher questions using tablet computers or handheld devices will improve student engagement and participation as well as student learning. Documenting the relationship between your program’s strategies and its goals explains your program design and is the basis for embedding evaluation into your program.
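If your team keeps its planning documents electronically, one optional way to make these linkages explicit is to record each strategy, the intermediate outcome it is assumed to produce, and the long-term goal it is expected to serve as rows in a simple structured list or spreadsheet. The short Python sketch below is a hypothetical illustration only; the strategy, outcome, and goal names are simplified placeholders loosely echoing the READ example rather than an actual program theory, and the same three-column structure works just as well on paper or in a spreadsheet.

    # Hypothetical sketch: recording program-theory linkages as structured data.
    # The strategy, outcome, and goal names are illustrative placeholders.
    from collections import defaultdict

    program_theory = [
        {"strategy": "Interactive, standards-based classroom lessons",
         "assumed_outcome": "Increased student interaction during learning",
         "goal": "Increased student engagement"},
        {"strategy": "Formative, standards-based assessments",
         "assumed_outcome": "Teacher use of formative assessment data",
         "goal": "Improved student reading skills"},
        {"strategy": "Professional development on using assessment data",
         "assumed_outcome": "Improved differentiation of instruction",
         "goal": "Improved student reading skills"},
    ]

    # Group the linkages by goal so the team can see which strategies and
    # assumed outcomes are expected to contribute to each long-term goal.
    by_goal = defaultdict(list)
    for link in program_theory:
        by_goal[link["goal"]].append(
            link["strategy"] + " -> " + link["assumed_outcome"])

    for goal, chains in by_goal.items():
        print(goal)
        for chain in chains:
            print("    " + chain)

Listing the linkages this way also gives the evaluation team a ready-made starting point for choosing indicators, since each assumed intermediate outcome is a candidate for measurement.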
Dr. Elm led a discussion with the oversight team in which they examined each team member’s ideas regarding why the program should work. Focusing on these ideas but not limited by them, the team members formulated, as a group, the underlying assumptions that were necessary to relate READ strategies and activities to long-term goals. During the discussion, team members were able to build on each other’s ideas in order to construct a comprehensive theory that was supported by the group. As a result of their discussion, the team put forward seven assumptions forming the basis of READ’s program theory.
The following seven assumptions form the basis of READ’s program theory:
1. Interactive, standards-based classroom lessons (using READ software) will increase student interaction during learning, which will lead to increased exposure to standards-based learning opportunities.
2. Standards-based reading assessments (using READ software) will increase the availability of formative standards-based data on reading performance, which will lead to increased teacher use of standards-based reading assessment data and then improved differentiation of instruction.
3. Standards-based reading homework (using READ software) will increase student exposure to standards-based learning opportunities.
4. Teacher training on integrating READ into their classroom instruction will increase teacher use of READ, which will lead to improved integration of READ into classroom instruction. Teacher training on using READ assessment data for classroom lesson planning will increase teacher use of formative standards-based reading assessment data. Both will lead to improved differentiation of instruction.
5. Student training on using READ in the classroom will increase student interaction during learning. Student training on using READ at home will increase student use of READ at home. Both will lead to increased student exposure to standards-based learning opportunities.
6. Increased student interaction in the classroom and improved differentiation of instruction will result in increased student engagement.
7. Increased student exposure to standards-based learning opportunities, improved differentiation of instruction, and increased student engagement will result in improved reading skills.
Most programs rely upon certain contextual conditions being met and resources being readily available in order to operate the program. If your program assumes that a certain infrastructure is in place or that certain materials are available, you should identify and list these conditions and resources when planning your evaluation.
The oversight team also identified contextual conditions and resources that are necessary to the success of READ:
1. Program funding for READ, as well as necessary equipment to support infrastructure needs.
2. Program funding for external evaluation assistance.
3. Technology infrastructure at school:
a. Classroom computer with Internet access
b. Interactive technologies in each classroom
c. Interactive, wireless pad for convenient, mobile teacher operation of computer
d. 25 student handheld mobile devices per classroom for interactive learning
4. Availability of professional development for teachers on:
a. Using interactive equipment in the classroom with the READ software; ongoing technical assistance from technology coordinator
b. Integrating the READ software into their instruction
c. Using READ assessment data for classroom lesson planning and differentiation of instruction
5. Availability of student training on how to use interactive equipment in the classroom, as well as how to use the READ software at home.
6. Student access to technology at home (computer with Internet connection).
You have now documented your program, including the strategies and activities that will be part of the program and the goals you hope to accomplish. You have also documented your assumptions as to why the strategies should result in achieving the program’s goals. As mentioned in the previous paragraph, these assumptions explaining why the program should work are the basis of the program’s theory. Before defining your program any further, this would be a good place to pause for a moment and reflect on the program design that you have
documented. Ask yourself again why you think the program should work, given the assumptions you have laid out. Are your assumptions based on a solid research foundation? That is, do you have reason to believe based on the results from past evaluations or research conducted by others that the program will work? Or are your assumptions based on emerging knowledge in the field or your own experience? Do you believe that your assumptions are based on strong evidence or are they just hypotheses?
Understanding the basis of the program’s theory is important to designing a rigorous evaluation. Assessing program implementation should always be central to your evaluation design. However, the less evidence there is to support the program’s theory, the more carefully you will want to monitor the implementation of your program and gather early and intermediate information on program effectiveness. If there is evidence from methodologically sound past evaluations that is contrary to your proposed theory, you will want to think carefully about what is different about your program that leads you to think it will work. In such cases, documenting alternative theories may prove useful to you in understanding and interpreting program results. It is important to note that there is nothing wrong with a sound, well-documented theory that has little existing information to support its effectiveness, as the information you obtain from your evaluation may be the foundation of innovation. See Appendix C for more resources on program theory.
How Does the Program Work? (Creating the Logic Model)
You have completed the most important part of program design and evaluation; you have defined your program, documenting why your program should work. Next is the process of refining the program design and evaluation: How does the program work? Using the program’s theory and underlying assumptions as the foundation, you will begin to create a model that depicts your program’s inner workings.
What Is Logic Modeling?
A logic model lays out your program’s theory by explaining how you believe your program works. Your logic model will set short-term and intermediate objectives that you can check throughout the evaluation to determine the extent to which your program is working as envisioned. Your logic model is the cornerstone of your program and its evaluation, and you should continually use it to check progress throughout the program, to help you discover problems with your program, and to make necessary corrections and improvements while your program is in operation.
Logic modeling is a process, and the model created through the process will be the foundation of your program and its evaluation.
Logic modeling is a process, not simply an end result. While you will create a logic model through the process—a model that will be a critical component of your program’s operation and evaluation—the power is in the process. The process of logic modeling has many uses, from designing a new project to fostering shared ownership of a plan to teaching others how a program is intended to work. We will touch on those uses that are important to evaluation. Additional resources are provided in Appendix C if you would like to learn more.
Putting a new idea into practice is change, and change takes time. Logic modeling can facilitate change by building a shared vision and ownership among stakeholders from the outset, but only if creating the logic model is a shared process. This does not mean that you need to include every stakeholder in every phase of your logic modeling. The initial creation of your logic model works best if done by a small group. However, once this group creates a draft, including others in the process will likely improve your model and the program’s subsequent implementation.
Including teachers in the logic modeling process can help to ensure that teachers are working toward a common goal and that all teachers understand and support what the program is trying to accomplish. Including parents can help to foster a culture in which parents understand and embrace what the teachers are trying to accomplish with their children, so parents can, in turn, support these efforts at home. Including students invites them to be active participants in the program planning and understanding process. Further, including administrators and school board members is critical to creating a shared understanding and mutual support of the program and its goals. Finally, the inclusion of stakeholders is not a one-time effort to garner support but rather an ongoing partnership to improve your program’s design and operation.
A logic model explains how you expect a program’s strategies and activities to result in the program’s stated goals and objectives.
The logic modeling process should include the person or people who will have primary responsibility for the program, as well as those who are critical to its success. Because the logic model you are creating will be used for evaluation purposes, your model will not simply describe your program or project, but it will also provide indicators that you will use to measure your program’s success throughout its operation. For this reason, it would be helpful to ask someone with evaluation expertise to be part of your logic modeling group. Once you have your logic modeling team assembled, the following paragraphs will step you through the process of creating your model.
How Do I Create a Logic Model?
At the heart of your logic model are the linkages between what you do as part of your program and what you hope to accomplish with the program. The linkages explain how your program works, and they include your program’s short-term and intermediate objectives. Short-term
and intermediate objectives are critical to improving the implementation of your program, as well as to establishing the association, supported by data, that your program’s activities are theoretically related to your program’s goals. Without short-term and intermediate indicators that reflect the program’s underlying theory, your evaluation would be a black box with inputs (strategies and activities) and outputs (goals and objectives). The logic model is a depiction of the inside of the box, allowing you to monitor your program’s operation and enabling you to make assertions about the success of the strategies that are part of your program.
If your program theory is well defined, you may find that creating the logic model is a breeze. If your program theory still needs more explanation of how your program should work, the process of creating your logic model will aid you in further refining it. Logic modeling is an opportunity to really think through the assumptions you laid out in your program’s theory, to
consider again what resources and supports you will need to implement your program effectively, and to lay out what you plan to achieve at various stages during your program’s operation.
These are the primary components of a logic model:
1. Long-term objectives or outcome goals
2. Program strategies and activities
3. Early (short-term) objectives
4. Intermediate objectives
5. Contextual conditions
Your logic model will be a living model, in that the theory underlying your model and the indicators informing your model are not static but should be changed as your understanding changes. You will start
with your program theory, and your logic model will represent this theory. However, as information is obtained through the program’s implementation and evaluation, you will need to revise and improve the model so that it is always an accurate representation of your program. The logic model is your road map and should reflect your initial understanding of the program, as well as the knowledge you learn during your program’s operation.
These are the primary components of a logic model, in order of development:
1. Defining long-term objectives/outcome goals.
2. Delineating program strategies and activities.
3. Detailing early (short-term) objectives.
4. Outlining intermediate objectives.
5. Listing necessary contextual conditions or resources (context).
While you may decide to depict your logic model using various shapes, in this guide:
• Strategies and activities will be denoted by rounded rectangles.
• Early (short-term) and intermediate objectives will be denoted by rectangles.
• Long-term goals will be represented by elongated ovals.
Remember that there is no magic to the shapes. You should use whatever shapes make the most sense to you! The substance is in the connections between your shapes, as these connections represent your program’s theory.
Start by stating your long-term goals on the right-hand side of your logic model. Move to the left and give your intermediate and early or short-term objectives, followed by your strategies and activities on the left-hand side. Including contextual conditions and resources on your model is a helpful reminder of what needs to be in place for your program to operate. If you decide to add contextual conditions or resources to your model, you can list them on the far left-hand side of your model (before your strategies and activities).
The headings of your model might look like those in Figure 2.
Figure 2: Possible Logic Model Headings
Once you have listed your contextual conditions and necessary resources, strategies and activities, short-term and intermediate objectives, and long-term goals, it is time to translate your program’s theory (set of assumptions) into your logic model. Think carefully about what needs to occur in the short term, intermediate, and long term. Map out your assumptions, carrying each strategy through to a long-term goal. Some strategies may share short-term and intermediate objectives, and some objectives may branch out to one or more other objectives. Check to be sure that all strategies ultimately reach a goal and that no short-term or intermediate objectives are dead-ends (meaning that they do not carry through to a long-term goal). Every piece of your model is put into place to achieve your long-term goals. As mentioned earlier, it is your road map, keeping you on track until you reach your destination. Seeing a fully completed logic model may be helpful at this point. Please refer to Figure 3: READ Logic Model
in Appendix A (and reproduced on page 24), and Figure 5: NowPLAN-T Logic Model in Appendix B for examples.
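If you keep your logic model in electronic form, the dead-end check described above can even be automated. The sketch below is a minimal illustration in Python and is not part of the READ materials: the node names are invented stand-ins loosely patterned on the READ model, and each edge represents one "X leads to Y" assumption.

```python
# A minimal sketch (illustrative only) of checking a logic model for "dead ends":
# every strategy and objective should carry through to at least one long-term goal.
from collections import defaultdict

# Directed edges: each link is one assumption ("X leads to Y"). Names are invented.
links = [
    ("Interactive classroom lessons", "Increased student interaction"),
    ("Increased student interaction", "Increased student engagement"),
    ("Increased student engagement", "Improved reading skills"),
    ("Standards-based homework", "Increased exposure to standards-based learning"),
    ("Increased exposure to standards-based learning", "Improved reading skills"),
]
long_term_goals = {"Improved reading skills"}

graph = defaultdict(list)
for source, target in links:
    graph[source].append(target)

def reaches_goal(node, visited=None):
    """Return True if any path from `node` ends at a long-term goal."""
    visited = visited or set()
    if node in long_term_goals:
        return True
    if node in visited:
        return False
    visited.add(node)
    return any(reaches_goal(nxt, visited) for nxt in graph.get(node, []))

# Flag strategies or objectives that never carry through to a goal.
all_nodes = {n for edge in links for n in edge}
dead_ends = [n for n in all_nodes - long_term_goals if not reaches_goal(n)]
print("Dead-end components:", dead_ends or "none")
```

If the check flags any components, those strategies or objectives do not yet lead to a long-term goal, and either the model or the program theory behind it needs another look.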
A logic model can be used to explain your program and its evaluation to others, as well as to track your program’s progress.
Keep in mind that creating your logic model offers another opportunity for you to examine whether important activities are missing. Does it make sense
that the program strategies and activities would result in your short-term and intermediate objectives and long-term goals for the program? Are additional strategies needed? Are some strategies more important than others? If so, note this in your program definition and theory. In addition, the logic modeling process can help you to refine your program’s theory. As you think through the assumptions that link strategies and activities to goals, you may decide that the logic model needs more work and may want to include additional or different objectives. It is important to use the logic modeling process to reaffirm or refine your program’s theory, as the model will be the basis of your program’s design and evaluation. Your logic model will have many uses, including documenting your program, tracking your program’s progress, and communicating your program’s status and findings. As mentioned previously, your model can
also be used to foster a mutual understanding among your stakeholders of what your program looks like, as well as what you intend for the program to accomplish.
At this point in the evaluation design, Dr. Elm recommended that the READ oversight team create an evaluation subcommittee, named the E-Team, composed of three to five members. The evaluation subcommittee was formed as a partnership and a liaison between the READ program staff and the external evaluator, and was tasked with helping to design the evaluation and with monitoring the evaluation findings shared by the READ external evaluator. Mrs. Anderson appointed two oversight committee members (the district reading coach and one of the district reading specialists) to the E-Team. She also asked the district supervisor for assessment and evaluation to serve on the E-Team and to be the primary internal contact for the READ external evaluator. Finally, she invited Dr. Elm to chair the E-Team and to serve as the lead external evaluator of the READ program. As the external evaluator, Dr. Elm would conduct the evaluation and share findings with the E-Team and oversight team. The four-member E-Team’s first task was to create the READ logic model.
This guide touches on the basics of logic models. Logic models can be simple or quite sophisticated, and can represent small projects as well as large systems. If you would like to know more about logic models or logic modeling, a few good resources are included in Appendix C.
Why Is Understanding the Program Important?
As stated earlier, understanding your program by defining your program’s theory is the most important step in program design and evaluation. The logic model that you create to depict your program’s theory is the foundation of your program and your evaluation. Once you have a draft logic model, you can share the draft with key program stakeholders, such as the funding agency (whether it be the state education agency, the district, the school board, or an external foundation, corporation, or government entity), district staff, teachers, and parents. Talking through your model with stakeholders and asking for feedback and input can help you improve your model as well as foster a sense of responsibility and ownership for the program. While your program may be wonderful in theory, it will take people to make it work. The more key stakeholders you can substantively involve in the logic model development process and the more key people who truly understand how your program is intended to work, the more likely you will succeed.
Using the program definition developed by the oversight team, the E-Team worked to create a logic model. The E-Team started with the long-term goals on the right side of the model. The E-Team listed the contextual conditions and resources on the left. Just to the right of the context, the E-Team listed the strategies and activities. Next, the E-Team used the oversight team’s assumptions to work through the early/short-term and intermediate objectives.
Finally, following and updating your logic model throughout your program’s operation, as well as recording the degree to which early (short-term) and intermediate objectives have been met, enable you to examine the fidelity with which your program is carried out and to monitor program implementation. Logic modeling as an exercise can facilitate program understanding, while the resulting logic model can be a powerful tool to communicate your program’s design and your program’s results to stakeholders. Stakeholders, including the funding agency, will want to know the extent to which their resources – time and money – were used effectively to improve student outcomes.
This is a reduced-size version of the full logic model for the READ program. Appendix A provides the full-size logic model in Figure 3: READ Logic Model.
STEP 2: PLAN – How Do I Plan the Evaluation?
What Questions Should I Ask to Shape the Evaluation?
While many evaluations ill-advisedly begin with creating evaluation questions, the first step should always be understanding the program. How can you create important and informed evaluation questions until you have a solid understanding of the theory that underlies a program? Because you have already created a logic model during the process of understanding your program, generating your evaluation questions is a natural progression from the model.
Your evaluation questions should be open-ended. Avoid yes/no questions, as closed-ended responses limit the information you can obtain from your evaluation. Instead of asking “does my program work?” you might ask:
• To what extent does the program work?
• How does the program work?
• In what ways does the program work?
• For whom does the program work best?
• Under what conditions does the program work best?
Evaluation questions tend to fall into three categories taken from your logic model: measuring the implementation of strategies and activities, identifying the progress toward short-term and intermediate objectives, and recognizing the achievement of long-term program goals. The following paragraphs will lead you through a process and some questions to consider while creating your evaluation questions.
At the next READ planning meeting, the E-Team shared the draft logic model with the full oversight team. Oversight team members reviewed the model and felt comfortable that it represented the assumptions and logic as they had agreed on at their last meeting. No changes were needed to the logic model at this time. Next, the E-Team and the oversight team used the logic model to develop evaluation questions for the READ program.
Evaluating Implementation of Activities and Strategies
How do you know if your program contributed toward achieving (or not achieving) its goals if you do not examine the implementation of its activities and strategies? It is important for your evaluation questions to address the program’s activities and strategies. Education does not take place in a controlled laboratory but rather in real-world settings, which require that you justify
why you believe the program strategies resulted in the measured outcomes. Your program’s underlying theory, represented by your logic model, shows the linkages between the strategies and activities and the goals. The evaluation of your program’s operation will set the stage to test your theory. And more importantly, asking evaluation questions about how your strategies and activities were applied can tell you the degree to which your program had the opportunity to be successful.
It is never a good idea to measure outcomes before assessing implementation. If you find down the road that your long-term goals were not met, is it because the program did not work or because key components of it were not applied properly or at all? Suppose you find that your long-term goals were successfully met. Do you have enough information to support that your program contributed to this success? It is a waste of resources to expend valuable time and money
evaluating program outcomes if important program components were never put into place. While you will likely want to create evaluation questions that are specific to your program’s activities and strategies, a fundamental evaluation question at this stage is: What is the fidelity with which program activities have been implemented?
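If you track implementation in a simple checklist, a rough fidelity summary can be computed directly from it. The sketch below is purely illustrative: the activity names and yes/no ratings are hypothetical, and most evaluations would use a richer rubric (for example, degree of implementation rather than a simple flag).

```python
# A minimal, hypothetical sketch: summarizing implementation fidelity as the
# share of planned activities observed to be in place. Names and ratings are
# invented, not data from the READ evaluation.
planned_activities = {
    "Interactive classroom lessons delivered": True,
    "READ assessments administered": True,
    "READ homework assigned": False,
    "Teacher professional development held": True,
    "Student training completed": True,
}

implemented = sum(planned_activities.values())
fidelity = implemented / len(planned_activities)
print(f"Implementation fidelity: {implemented}/{len(planned_activities)} "
      f"activities in place ({fidelity:.0%})")
```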
Use your logic model to guide you as you create your evaluation questions.
Your questions regarding strategies and activities address the degree to which your program had the opportunity to be successful. Questions in this category may also address contextual conditions and resources.
Questions addressing your early and intermediate objectives are important in determining if your program is on track toward meeting its long-term goals.
Using each of the strategies and activities listed on the left-hand side of the logic model, the E-Team worked with the READ oversight team to develop evaluation questions. For each strategy or activity, they developed questions addressing whether the strategy or activity had been carried out, as well as questions addressing some contextual conditions and resources necessary for program implementation.
The READ E-Team and oversight team created six evaluation questions to assess READ strategies and activities:
Strategy/Activity: Interactive, standards-based classroom lessons (using the READ software with interactive classroom technologies and individual handheld mobile devices for each student).
Evaluation question: To what extent did teachers have access to the necessary technology in the classroom to use READ in their instruction?

Strategy/Activity: Standards-based reading assessments (Internet-based, formative assessments of student reading skills administered within the READ software).
Evaluation question: To what extent were READ assessments made available to students and teachers? Examine overall, by school, and by grade level.

Strategy/Activity: Standards-based reading homework (Internet-based using READ software).
Evaluation question: To what extent did students have access to READ at home? Examine overall and by grade level, race, gender, and socioeconomic status.

Strategy/Activity: Teacher professional development on integrating READ into classroom instruction (using an interactive wireless pad).
Evaluation question: To what extent did teachers receive professional development on how to integrate READ into their classroom instruction?

Strategy/Activity: Teacher professional development on using READ assessment data for classroom lesson planning.
Evaluation question: To what extent did teachers receive professional development on how to incorporate READ assessment data into their classroom lesson planning?

Strategy/Activity: Student training on using READ (in the classroom and at home).
Evaluation question: To what extent were students trained in how to use READ?
Note: These questions are intended to evaluate the degree to which the program had the opportunity to be successful, as well as to determine if additional program supports are needed for successful implementation.
Evaluating Progress Toward Short-Term and Intermediate Objectives
Evaluating your program’s opportunity to be successful is the initial step toward determining your program’s success. The second category of evaluation questions will address how your program is working. That is, how do you know if your program is on track to meet its long-term goals? Measuring progress toward short-term and intermediate objectives plays a significant role in determining how your program is working. By examining progress, you can
catch early problems with the program and remediate them before they become critical impediments to your program’s success. Program staff can use interim evaluation findings to plan, shape, and improve the program prior to the evaluation of final outcomes. It is much easier and more cost-effective to uncover problems or issues early in your program’s implementation. Your evaluation should strive to provide program staff with the necessary
information for them to be able to understand the degree to which the program is on course so that they can make midcourse adjustments and refinements as needed.
Your evaluation questions at this stage should focus on your program’s specific short-term and intermediate objectives. However, an overarching evaluation question at this stage might be: To what extent is the program on track to achieving long-term goals? Use your logic model to guide you as you create your evaluation questions pertaining to early and intermediate objectives, just as you used the strategies and activities from your logic model to create your first set of evaluation questions.
While you created your logic model right to left (starting with your long-term goals), it is often easier to craft your evaluation questions left to right. Begin with your early (short-term) objectives and work
your way toward your intermediate objectives, and then long-term goals. Some evaluation questions may address more than one objective, while some objectives may have more than one evaluation question. That is, there does not need to be a one-to-one correspondence between objectives on your logic model and evaluation questions. However, you should have at least one evaluation question that addresses each objective. Later, the evaluation team can prioritize evaluation questions. In doing this, it is possible that you will decide, based on your priorities and resource constraints, not to address certain questions and objectives in your evaluation.
While you created your logic model right to left (starting with your long-term goals), it is often easier to craft your evaluation questions left to right. Begin with your early (short-term) objectives and work your way toward your intermediate objectives, and then long-term goals.
There does not need to be a one-to-one correspondence between objectives and evaluation questions. Some evaluation questions may address more than one objective, while some objectives may have more than one evaluation question.
At this point in your evaluation design, it is important to brainstorm evaluation questions and not be hindered by resource concerns. Prioritizing your questions will come later. When prioritizing your evaluation questions, you will decide based on resource constraints and feasibility which questions your evaluation can adequately address.
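One practical way to honor the rule that every objective has at least one evaluation question is to keep an explicit mapping from questions to objectives and check it for gaps. The sketch below is a hypothetical illustration using a few READ-style labels; it is not part of the READ evaluation plan.

```python
# A minimal sketch, with made-up labels, of a coverage check: every objective
# on the logic model should have at least one evaluation question mapped to it.
objectives = [
    "Increased student use of READ at home",
    "Increased teacher use of READ in the classroom",
    "Improved differentiation of instruction",
]

# Each evaluation question is tagged with the objective(s) it addresses.
question_map = {
    "To what extent did students complete READ homework?":
        ["Increased student use of READ at home"],
    "In what ways did teachers use READ in the classroom?":
        ["Increased teacher use of READ in the classroom"],
}

covered = {obj for objs in question_map.values() for obj in objs}
uncovered = [obj for obj in objectives if obj not in covered]
print("Objectives without an evaluation question:", uncovered or "none")
```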
Next, the E-Team worked with the READ oversight team to create several evaluation questions addressing READ early/short-term objectives and intermediate objectives:
Objective: Increased student use of READ at home (early/short-term).
Evaluation questions: How often did students receive READ homework assignments? To what extent did students complete READ homework assignments? (Note frequency and duration of use.)

Objective: Increased teacher use of READ in the classroom (early/short-term).
Evaluation question: In what ways and how often did teachers use READ in the classroom with students? (Note frequency, duration, and nature of use.)

Objective: Increased student exposure to standards-based learning opportunities (early/short-term).
Evaluation questions: To what extent did students complete READ homework assignments? How often did teachers use READ in the classroom with students?

Objective: Increased availability of standards-based, formative READ assessment data on student reading performance (early/short-term).
Evaluation question: How often did teachers access READ student assessment data? (Note frequency and type of access.)

Objective: Increased teacher use of standards-based READ assessment data (early/short-term).
Evaluation question: In what ways did teachers use READ student assessment data?

Objective: Increased student interaction during learning (intermediate).
Evaluation question: To what extent and how did students interact during classroom instruction when READ was used? (Note frequency and type of interaction.)

Objective: Improved integration of READ into classroom instruction (intermediate).
Evaluation question: In what ways and to what extent did teachers integrate READ into their classroom instruction? (Note the quality with which READ was integrated into classroom instruction by teachers.)

Objective: Improved differentiation of instruction (intermediate).
Evaluation question: In what ways and to what extent did teachers use READ assessment data to plan and differentiate instruction? (Note what data were used and how data were used in instructional planning.)
Evaluating Progress Toward Long-Term Goals
Finally, a third set of evaluation questions should focus on the program’s long-term goals. While evaluation findings at this stage in your program’s operation can still be used to improve the program’s operation, assessment of long-term goals is typically used for summative decision-making. That is, results from the measurement of progress toward long-term goals are often used to make decisions about whether program funding should be extended and if a program should be continued, expanded, scaled down, or discontinued. Your questions will be specific to your program’s goals, though they should address the following: To what extent does the program work? For whom does the program work best? Under what conditions does the program work best?
Finally, the E-Team and the READ oversight team created evaluation questions addressing READ long-term goals:
Long-term goal: Increased student engagement in reading.
Evaluation question: To what extent and in what ways did READ foster student engagement during reading lessons?

Long-term goal: Improved student reading skills.
Evaluation questions: To what extent did READ improve student learning in reading?
• To what extent did student learning improve after READ was implemented?
• To what extent did learning outcomes vary with teacher use of READ in the classroom?
• To what extent did learning outcomes vary with teacher use of READ assessment data to plan and differentiate instruction?
• How did student performance on the READ assessments correlate with student performance on state assessments?
• In what ways did learning outcomes vary by initial reading performance on state assessments?
• In what ways did learning outcomes vary by grade level?
• In what ways did learning outcomes vary by special education status and English language proficiency?
• In what ways did learning outcomes vary with the frequency of READ use at home?
If you do not have the resources to focus on all of your evaluation questions, you may need to prioritize. When prioritizing evaluation questions, it is important to have at least some measurement in all three categories: implementation, short-term/intermediate objectives, and long-term goals.
What Data Should I Collect?
Now that you have developed your logic model and decided on your evaluation questions, the next task is to plan how you will answer those questions. Your logic model is your road map during this process. Just as you used the key components of your logic model as a guide to develop your evaluation questions, your evaluation questions will drive the data that will be collected through your evaluation.
The answers to your evaluation questions will give you the information you need to know in order to improve your program and to make critical program decisions. The following paragraphs will take you through the process of creating indicators for your evaluation questions that relate to program strategies and activities, short-term objectives, intermediate objectives, and long-term goals. Your indicators will dictate what data you should collect to answer your evaluation questions.
Indicators are statements that can be used to gauge progress toward program goals and objectives. An indicator is a guide that lets you know if you are moving in the right direction. Your indicators will be derived from your evaluation questions; for some evaluation questions, you might have multiple indicators. Indicators are the metrics that will be tied to targets or benchmarks, against which to measure the performance of your program.
Indicators can be derived from evaluation questions and are used to measure progress toward program goals and objectives. An evaluation question may have one or more indicators.
Targets provide a realistic time line and yardstick for your indicators. Indicators and targets should have the following characteristics:
An indicator is SMA:
• Specific
• Measurable
• Agreed upon
And a target is RT:
• Realistic
• Time-bound
Together, indicators and targets are SMART.
Indicators and targets should be specific, measurable, agreed upon, realistic, and time-bound (SMART). For instance, suppose you are evaluating a teacher recruitment and retention program. You may have an
objective on your logic model that states “to increase the number of highly qualified teachers in our school district” and a corresponding evaluation question that asks “to what extent was the number of highly qualified teachers increased in our school district?” However, we know there are several ways that a “highly qualified teacher” can be defined, such as by certifications, education, content knowledge, etc. The indicator would specify the definition(s) that the evaluator chooses to use and the data element(s) that will be collected. For example, to be specific and measurable, the indicator might be twofold: “increasing number and percentage of teachers who are state certified” and “increasing number and percentage of teachers who hold National Board certification.” At this point, it would also be wise to consider whether the indicators you choose are not only measurable, but also available, as well as agreed upon by the evaluation team and program staff.
Next, clarify your indicator by agreeing upon a realistic and time-bound target. Thus, a target is a clarification of an indicator. A target provides a yardstick and time line for your indicator, specifying how much progress should be made and by when in order to determine to what extent goals and objectives have been met. Targets for the above example might include: “by 2015, all teachers in the school district will be state certified” and “by 2018, 50 percent of district teachers will have National Board certification.” For some programs, it is possible that reasonable targets cannot be set prior to the program’s operation. For instance, consider a program that is intended to improve writing skills for seventh graders, and the chosen indicator is a student’s score on a particular writing assessment. However, the evaluation team would like to see baseline scores for students prior to setting their target. In this case, a pretest may be given at the start of the program and, once baseline scores are known, targets can be determined.
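If you record indicators and targets electronically, the SMART elements can be captured as structured fields so that progress checks are unambiguous. The sketch below is a minimal illustration built around the National Board certification example from the text; the observed percentages and dates are invented.

```python
# A minimal sketch of storing an indicator with a time-bound target and checking
# observed progress against it. Figures below are hypothetical.
from dataclasses import dataclass
from datetime import date

@dataclass
class Target:
    indicator: str        # specific, measurable, agreed upon
    goal_pct: float       # realistic yardstick
    due: date             # time-bound

    def met(self, observed_pct: float, as_of: date) -> bool:
        """True if the yardstick has been reached by the due date."""
        return observed_pct >= self.goal_pct and as_of <= self.due

board_cert = Target(
    indicator="Percentage of district teachers with National Board certification",
    goal_pct=50.0,
    due=date(2018, 6, 30),
)

print(board_cert.met(observed_pct=42.0, as_of=date(2017, 6, 30)))  # False: below the yardstick
print(board_cert.met(observed_pct=55.0, as_of=date(2018, 5, 1)))   # True: met on time
```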
With the evaluation questions that the READ oversight team and E-Team had created, the E-Team was ready to expand on each with indicators and accompanying targets. Using the logic model as its guide, the E-Team created an evaluation matrix detailing the logic model component, associated evaluation questions, indicators, and accompanying targets. Two examples are provided below. All indicators for the READ project are provided in Appendix A.
1. To what extent were READ assessments made available to students and teachers? (activity) Indicator: Increased number of students and teachers with access to READ assessments. Targets: By the start of the school year, all teacher accounts will have been set up in READ. By the end of September, all student accounts will have been set up in READ.
2. In what ways and to what extent did teachers integrate READ into their classroom instruction? (intermediate objective) Indicator: Improved integration of READ lessons into classroom instruction, as measured by teacher scores on the READ implementation rubric (rubric completed through classroom observations and teacher interviews). Targets: By April, 50% of teachers will score a 3 or above (out of 4) on the READ implementation rubric. By June, 75% of teachers will score a 3 or above and 25% of teachers will score a 4 on the READ implementation rubric.
Evaluation Matrix
Now that you have created evaluation questions with accompanying indicators and targets for each component of your logic model, how do you organize that information into a usable format for your evaluation? One method is to use an evaluation matrix. An evaluation matrix organizes your evaluation questions, indicators, and targets by your logic model components: strategies and activities, early and intermediate objectives, and long-term goals. Table 1 shows an example shell. Table 26: Evaluation Matrix Template is provided in Appendix E. Information for completing the data source, data collection, and data analysis columns will be covered next in the guide.
Table 1: Evaluation Matrix Example Shell

Columns: Logic Model Component | Evaluation Questions | Indicators | Targets | Data Sources | Data Collection | Data Analysis

Rows (one per logic model component):
• Strategies and Activities / Initial Implementation
• Early/Short-term and Intermediate Objectives
• Long-term Goals
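An evaluation matrix is easy to maintain as a simple spreadsheet. The sketch below shows one way such a shell might be written out, with a single illustrative row adapted from the READ example; the file name and the blank data source, collection, and analysis cells are placeholders, not part of the guide's materials.

```python
# A minimal sketch of keeping an evaluation matrix as plain rows and exporting
# it to a spreadsheet-friendly CSV file. The row shown is adapted from the READ
# example; blank cells are to be completed later in the planning process.
import csv

columns = ["Logic Model Component", "Evaluation Questions", "Indicators",
           "Targets", "Data Sources", "Data Collection", "Data Analysis"]

rows = [{
    "Logic Model Component": "Strategies and Activities / Initial Implementation",
    "Evaluation Questions": "To what extent were READ assessments made available "
                            "to students and teachers?",
    "Indicators": "Increased number of students and teachers with access to "
                  "READ assessments",
    "Targets": "All teacher accounts set up by the start of the school year; "
               "all student accounts set up by the end of September",
    "Data Sources": "",       # to be completed as the data collection plan develops
    "Data Collection": "",
    "Data Analysis": "",
}]

with open("evaluation_matrix.csv", "w", newline="") as f:  # hypothetical file name
    writer = csv.DictWriter(f, fieldnames=columns)
    writer.writeheader()
    writer.writerows(rows)
```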
In an effort to organize their logic model and associated information, the E-Team created an Evaluation Matrix. At this stage, the Evaluation Matrix included the READ logic model components, evaluation questions, indicators, and targets by the READ logic model strategies, early/short-term and intermediate objectives, and long-term goals. A copy of the READ Evaluation Matrix starts at Table 7: Evaluation Matrix Addressing Strategies and Activities During the Initial Implementation—Indicators and Targets.

How Should I Design the Evaluation?
Evaluation Design
You are most likely evaluating your program because you want to know to what extent it works, under what conditions or with what supports it works, for which students it works best, and how to improve it. You have spent the last step defining the program and what you mean when you say it “works.” A strong evaluation design can help you to rule out other plausible explanations as to why your program may or may not have met the expectations you set through your indicators and targets. How many programs are continued with little examination of how they are benefiting the students? How often do we “experiment” in education by putting a new program into the classroom without following up to see if there was any benefit (much less, any adverse effect)? When do we make our decisions based on data, and how often do we accept anecdotal stories or simple descriptions of use as though they were evidence of effectiveness (because we have nothing else on which to base our decisions)? Evaluation can provide us with the necessary information to make sound decisions regarding the methods and tools we use to educate our students.
Evaluation should be built into your program so you can continually monitor and improve your program—and so you know whether students are benefiting (or not). Your evaluation also should help you determine the extent to which your program influenced your results. Suppose you are evaluating a mathematics program and your results show that student scores in mathematics, on average, increased twofold after your program was put into place. But upon further investigation, you find that half of the students had never used the program, and that the students who used the program in fact had much lower scores than those who did not.
What if you had not investigated? This program may have been hindering, rather than helping, student learning.
The questions and example in the above paragraph are intended to show that while evaluation is important, it is a good evaluation (one that gives you valid information as to how your program is working) that really matters. Evaluation relies on attribution. And, the more directly you can attribute your evaluation
findings to the program activities you implemented, the more meaningful your findings will be—and the more useful your findings will be to you as you work to improve your program.
Some evaluation designs provide you with stronger evidence of causality than others. So, how do you choose the strongest possible design and methods to answer your evaluation questions, taking into account any constraints that you may have? This will partly depend upon the extent to which you have control over your implementation setting and other, similar settings.
Common evaluation designs include:
• single-group designs
• comparison group designs
• randomized controlled experiments
Strong comparison group designs are often referred to as quasi-experimental designs.
Randomized controlled experiments are also called true experiments or randomized controlled trials (RCTs).
Single-Group Designs
If you are implementing a project in only one of the schools in your district, your evaluation may focus on a single group—one school. In a single-group design, one group participates in the program and that same group is evaluated. While a single-group design is the simplest evaluation design, it is also the weakest, because there may be many competing explanations as to why your evaluation produced the results it did. If your evaluation showed promising results, could it be because of something else that was going on at the same time? Or perhaps the participants would have had the same results without the program?
Using your logic model along with the single-group design can help to improve the credibility of your findings. For instance, suppose you are working with an evaluator to examine a new program in your classroom or school focused on improving reading comprehension among third graders. If the evaluation results are promising, the principal has agreed to incorporate the funding for the program into the ongoing budget. If you do not have another classroom or
school against which to compare progress (i.e., you have a single-group design), you can explain how the program operates by using your logic model and the data collected at each stage of operation. You can give evidence showing that the program’s activities were put into place, use data from your early and intermediate objectives to show change in teacher practice and student progress, and present your long-term outcomes showing how reading comprehension changed. While you cannot claim that your program caused the change in reading comprehension, you can use your logic model and its associated indicators to demonstrate a theoretical association
between your program and long-term outcomes.
Using your logic model to guide your evaluation will strengthen your evaluation design.
Other ways to strengthen your design include:
• measuring indicators multiple times,
• sampling, and
• studying the program longitudinally.
Comparison Group Designs
If you are able to have more than one group participate in your evaluation, typically you can improve the usability of your findings. For instance, one teacher could use the program in the classroom in one year, and another the next year—and you could compare the results not only within the evaluation classroom from one year to the next, but also between the two classrooms in the evaluation. Using multiple groups, referred to as a comparison group design, can help you rule out some of the other competing explanations as to why your program may have worked. The comparison group is the group that does not use the program being evaluated. However, the groups must be comparable. Comparing test scores from a district that used a new program to test scores from another district that did not use the new program would not yield meaningful information if the two districts are not comparable.
The strength of your evaluation design will vary with how closely matched your comparison group is with the group that will be implementing your program. Convenience groups, such as a district chosen because it neighbors your district, will likely not yield results that are as meaningful as would a comparison district that is purposefully chosen to match your school district based on multiple key indicators that you believe might influence your outcomes, such as gender, ethnic and socioeconomic composition, or past test performance.
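When several candidate comparison groups are available, choosing the closest match can be made explicit rather than left to convenience. The sketch below is a simplified, hypothetical illustration: the district names and figures are invented, only two indicators are used, and the scaling is deliberately rough. A real matching exercise would use more indicators and, ideally, an evaluator's guidance.

```python
# A minimal, hypothetical sketch of choosing a matched comparison district:
# pick the candidate closest to your district on a few key indicators, after
# putting the indicators on a roughly common scale. All figures are invented.
districts = {
    "Neighboring District": {"prior_score": 71.0, "pct_low_income": 18.0},
    "District B":           {"prior_score": 64.0, "pct_low_income": 41.0},
    "District C":           {"prior_score": 66.5, "pct_low_income": 38.0},
}
our_district = {"prior_score": 65.0, "pct_low_income": 40.0}

# Rough scaling so that one indicator does not dominate the distance.
ranges = {"prior_score": 10.0, "pct_low_income": 25.0}

def distance(candidate):
    """Scaled absolute difference between a candidate and our district."""
    return sum(abs(candidate[k] - our_district[k]) / ranges[k] for k in our_district)

best_match = min(districts, key=lambda name: distance(districts[name]))
print("Closest comparison district:", best_match)
```

In this invented example, the convenient neighboring district turns out not to be the closest match, which is exactly the risk of relying on convenience groups.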
Just because a good comparison group does not readily exist for your program, do not give up on the possibility of finding or creating one. Use some creativity when creating your evaluation design and identifying comparison groups. If you are implementing a project across a district, you may have flexibility such that you could vary the timing of the implementation in order to create a comparison group. For instance, several schools could implement the program in one year, leaving the remaining schools as a comparison group. If your evaluation results are promising, the comparison schools can be brought on board in the following year.
Strong comparison group designs are often referred to as quasi-experimental designs. When considering a comparison group, seek to identify or create a group that is as similar as possible, especially on the key indicators that you believe might influence your results, to the group that will be implementing your program. However, the only way to make certain, to the extent possible, that groups are equivalent is through random assignment. Random assignment is discussed in the following section.
Experimental Designs
The gold standard of evaluation design is the true experiment. Comparison group designs, discussed in the above paragraph, attempt to approximate, to the extent possible, a true experiment. In an experimental design, participants are randomly assigned to the program or to a nonprogram control group. True experiments are also referred to as randomized controlled experiments or randomized controlled trials (RCTs).
In a true experiment, participants are randomly assigned to either participate in the program or an alternative condition (such as a different program or no program at all). Theoretically, the process of random assignment creates groups that are equivalent across both observable and unobservable characteristics. By randomly assigning program participants, you can rule out other explanations for and validity threats to your evaluation findings. See the Research and Evaluation Design, Including Reliability and Validity section and the Threats to Validity section in Appendix C for resources addressing random assignment and threats to validity.
For some programs, random assignment may align well with program resources. For instance, for programs that do not have the resources to include all students from the start, randomly assigning students or classrooms to the program would address in a fair manner who participates in the program and would allow you to draw causal conclusions from your
evaluation findings. As mentioned in the section on comparison groups, be creative when designing your evaluation. You might find that, with a little resourcefulness at the design stage, you can implement a stronger evaluation than you originally thought. For example, instead of purposefully assigning students or teachers to the program or allowing participants to self-select, you might consider a lottery at the start to determine who will participate. In such cases, if results are promising for the first cohort who participates, additional resources could be sought to expand the program to all students and classrooms.
Enriching Your Evaluation Design
Whether you have chosen to evaluate using a single-group, comparison-group, or experimental design, there are several methods and approaches you can use to enrich your evaluation design. Such methods are added supports in your evaluation design that can increase the usefulness of your results and credibility of your findings, make your evaluation more manageable, and expand upon information obtained throughout program implementation. These methods include using repeated measures, longitudinal data, and sampling. Logic modeling too can enrich your evaluation, as it can be used to construct a reasoned explanation for your evaluation findings. Supplementing your evaluation design with a case study could also enrich your evaluation design by providing in-depth information regarding implementation and participant experiences.
Using repeated measures, collecting the same data elements at multiple time points, can also help to strengthen your evaluation design. If the program you are evaluating is intended to improve critical thinking skills over a specified time period (e.g., 1 year), taking repeated measurements (perhaps monthly) of indicators that address critical thinking skills will not only provide you with baseline and frequent data with which to compare end-of-year results, but will also enable program staff to use midterm information in order to make midcourse corrections and improvements.
Using longitudinal data, data collected over an extended period of time, can enable you to follow program participants long-term and examine post-program changes. Longitudinal data can also enable you to examine a program’s success using a time series analysis. For example, suppose your district made the change from half-day to full-day kindergarten 5 years ago, and you are asked whether the program positively affected student learning. The district has been using the same reading assessment for kindergarteners for the past 10 years. The assessment is given in September and May of each year. You examine the September scores over the past 10 years and find that there has been little variability in mean scores. Mean scores by gender, ethnicity, and English Language Learner (ELL) status have been fairly steady. You conclude the kindergarteners have been entering school at approximately the same mean reading level for the past 10 years.
Next, you examine the May reading scores for the past 10 years. You notice that for the first 5 years, the mean end-of-year scores (overall and by subgroup) were significantly greater than the September scores, but varied little from year to year. However, for the past 5 years, the May scores were about 15 percent higher than in the previous 5 years. The increase by gender and ethnicity was similar and also consistent over the past 5 years, while reading scores for ELL students were over 30 percent higher in the spring, after the full-day program was instituted. After ruling out other possible explanations for the findings to the extent possible, you conclude that the full-day kindergarten program appears to have been beneficial for all students and, particularly, for the district’s ELL students.
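The before-and-after comparison in this example boils down to simple arithmetic on the yearly means. The sketch below uses invented scores chosen to mirror the pattern described above (roughly a 15 percent increase in mean May scores after the switch to full-day kindergarten); a real analysis would also examine subgroups and test whether the change is statistically significant.

```python
# A minimal sketch of the before/after comparison described in the text, using
# invented mean May reading scores for 10 years (full-day kindergarten is
# assumed to begin in year 6).
may_means = [52.1, 51.8, 52.4, 51.9, 52.3,   # years 1-5: half-day kindergarten
             59.8, 60.4, 59.9, 60.7, 60.2]   # years 6-10: full-day kindergarten

before = may_means[:5]
after = may_means[5:]
mean_before = sum(before) / len(before)
mean_after = sum(after) / len(after)
pct_change = (mean_after - mean_before) / mean_before * 100

print(f"Mean May score, years 1-5: {mean_before:.1f}")
print(f"Mean May score, years 6-10: {mean_after:.1f}")
print(f"Change after full-day kindergarten: {pct_change:.1f}%")
```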
If your program has many program participants or if you lack the funds to use all of your participants in your evaluation, sampling to choose a smaller group from the larger population of program participants is an option. Random sampling selects evaluation participants randomly from the larger group of program participants, and may be more easily accepted by teachers, parents, and students, as well as other stakeholder groups. Whether you are using random sampling or purposeful sampling, you should select a sample group that is as representative as possible of all of your participants (i.e., the population). Typically, the larger the sample you use, the more precise and credible your results will be. For more information on Research and Evaluation Design, Including Reliability and Validity and Threats to Validity, see Appendix C.
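Drawing a random sample is straightforward to do, and to document, with a few lines of code. The sketch below is illustrative only: the participant IDs, the sample size of 100, and the random seed are arbitrary choices, not recommendations.

```python
# A minimal sketch of drawing a random sample of program participants for the
# evaluation. IDs and sample size are illustrative.
import random

participant_ids = [f"S{n:03d}" for n in range(1, 501)]   # 500 program participants

random.seed(42)   # fixing the seed lets the draw be reproduced and documented
evaluation_sample = random.sample(participant_ids, k=100)
print(len(evaluation_sample), "participants sampled for the evaluation")
```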
Using logic modeling in your evaluation can also help to strengthen the credibility of your findings. By examining the implementation of your strategies and activities as well as the measurement of progress on your early, intermediate, and long-term indicators, your logic model can provide you with interim data that can be used to adjust and improve your program during its operation. As described with the reading comprehension example in the single-group design section, logic modeling can help to show a theoretical association between the strategies and outcomes in even the weakest of evaluation designs.
Evaluation reporting should be ongoing. While formal evaluation data and reports may be issued once or twice a year, informal updates should be provided to program staff on a regular basis.
It is a good idea to explicitly identify (on your logic model) when evaluation updates will occur and to delineate these evaluation milestones in the evaluation contract with your evaluator.
Frequent and ongoing evaluation updates give a program the best opportunity to monitor and improve implementation, as well as to help maintain stakeholder support.
Finally, case studies are in-depth examinations of a person, group of people, or context. Case studies can enrich your understanding of a program, as well as provide a more
accurate picture of how a program operates. See the Evaluation Methods and Tools section for more information on case studies.
Building Reporting into Your Evaluation Design
You do not need to wait until the end of the evaluation to examine your goals. In fact, you should not wait until the end! Just as our teachers always told us that our grades should not be a surprise, your evaluation findings should not be a surprise. You should build reporting into your evaluation design from the very start.
It works well to align your evaluation’s schedule with your program’s time line. If you aim to have your program infrastructure in place by the end of summer, monitor your logic model indicators that address this activity prior to the end of the summer (to verify that the program is on track), and again at the end of summer (or early fall). If the infrastructure is not in place on schedule or if it is not properly operating, program staff need to know right away to minimize delays in program implementation (and so you do not waste time measuring intermediate indicators when early indicators tell you the program is not in place). Likewise, do not wait until the end of the year to observe classrooms to determine how the program is used. Frequent and routine observations will provide program staff with valuable information from which they can determine whether additional professional development or resources are needed.
Grovemont School District had 80 third- through fifth-grade classrooms across six elementary schools (28 third-grade classrooms, 28 fourth-grade classrooms, and 24 fifth-grade classrooms). District class size for grades three through five ranged from 22 to 25 students per classroom. Because of state budget cuts and reduced funding for the program, the E-Team knew that Mrs. Anderson and the READ oversight team would have to make some difficult choices about how to structure and evaluate their program.
Some members of the oversight team wanted to implement the program in fifth grade only for the first year, and then reexamine funds to see if they might be able to expand down to fourth grade in Year 2. Others voted to start the program at two of the six elementary schools and then try to include an additional school in Year 2.
Dr. Elm and the E-Team recommended that they consider partially implementing the program at all six schools and across all three grades. Dr. Elm explained that they would receive much better information about how their program was working and, more importantly, how it could be improved, if they were able to compare results from those classrooms that were using the program with those that were not. Dr. Elm knew that students at all of the schools in Grovemont School District were randomly assigned to teachers during the summer before each school year. However, Dr. Elm explained that in order to minimize initial differences between those classrooms that participate in READ and those that do not, they should consider randomly assigning half of the classrooms to continue with the existing district curriculum while the other half would supplement their existing curriculum with the READ program.
Dr. Elm also recommended that they first divide the classrooms by school and grade level so that each school and grade would have one half of the classrooms assigned to the program. Teachers whose classrooms were not assigned to the program would be assured that if the program proved successful, they would be on board by Year 3. However, if the program did not have sufficient benefits for the students, it would be discontinued in all classrooms after Year 2. Dr. Elm concluded that building a strong evaluation into their program would provide them with credible information as to how their program was working and that having data to direct their program adjustments and improvements would give the program the best opportunity to be successful.
The READ oversight team agreed to think about this idea and reconvene in 1 week to make a decision. The E-Team also distributed the evaluation matrix it had created based on the READ logic model. The E-Team asked the oversight team to review the matrix and provide any feedback or comments.
The following week, the E-Team and READ oversight team reconvened to decide how to structure the program and to work on the evaluation design. Mrs. Anderson had spoken with the district superintendent about the evaluator’s suggestion of implementing READ in half the district’s third- through fifth-grade classrooms, with the promise that it would be expanded to all classrooms in Year 3 if the program was successful. Although logistically it would be easier to implement the program in two or three schools or one or two grades than to implement it in half the classrooms in all schools and at all grades, the superintendent understood the benefit of the added effort. The evaluation would provide higher quality data to inform decisions for program improvement and decisions regarding the program’s future.
Mrs. Anderson shared the superintendent’s comments with the oversight team and evaluation subcommittee. Like the superintendent, team members felt conflicted by the choice between simpler logistics or a stronger evaluation design. Dr. Elm understood the dilemma all too well, but as an evaluator and an educator, she believed that a strong evaluation would result in improved program implementation and improved program outcomes.
Dr. Elm recognized that implementing the program in all classrooms in one grade level across the district would offer the weakest evaluation design and the least useful information but would likely be the simplest option logistically. Another option would be to start the program in all classrooms at two or three schools. In such a case, the other schools could be used as comparisons. For this reason, Dr. Elm explored the comparability of the six elementary schools in case the team decided to go that route. Five of the elementary schools had somewhat comparable state test scores in reading, while the sixth school had lower state test scores, and the difference was statistically significant. In addition, schools one through five had similar (and fairly homogenous) populations, while school six had a much lower socioeconomic student population and a much higher percentage of ELL students. Because the district was interested in how the program worked with ELL students, the team knew that the evaluation needed to include school six. However, if school six were used in a three-school implementation, the team would not have a comparable school against which to benchmark its results.
While not the simplest option, the oversight team decided that its best option would be to structure the program in such a way as to maximize the quality of the information from the evaluation. The team chose to build a strong evaluation into the READ program design to provide the formative information needed for program improvement and valid summative information for accountability.
41
Follow progress on your logic model indicators carefully along the way, so you continually know how your program is doing and where it should be modified. And when the time does come to examine results in terms of your long-term goals, your logic model is critical to explaining your findings. While you may not be able to rule out all competing explanations for your results, you can provide a plausible explanation based on your program’s logic that your program activities are theoretically related to your program findings.
Finally, as mentioned above, the strength of your evaluation design, or its rigor, directly affects the degree to which your evaluation can provide the program with valid ongoing information about implementation and credible evidence of progress toward long-term goals. A strong evaluation design is one that is built to provide credible information for program improvement, as well as to rule out competing explanations for your summative findings. A strong evaluation design coupled with positive findings is what you might hope for, but even a strong evaluation that shows dismal results provides valuable and important information. Evaluation results that help you discontinue programs that do not work are just as valuable as findings that enable you to continue and build upon programs that do improve student outcomes.
Evaluation Methods and Tools
You have almost completed your evaluation design. The most difficult part is over—you have defined your program and built your evaluation into your program’s logic model. Using your logic model as a road map, you have created evaluation questions and their related indicators. You have decided how your evaluation will be designed. Now, how will you collect your data? You may have thought about this during the discussion on creating indicators and setting targets. After reading through the following paragraphs on methods that you might use in your evaluation, revisit your indicators to clarify and refine the methods you will use to measure each indicator.
42
Based on the READ oversight team’s decision about how to structure the program, Dr. Elm and the E-Team drafted the following evaluation design. They presented the design at the next oversight team meeting. The oversight team voted to approve the design as follows:
Design: Multiple-group, experimental design (students randomly assigned to classrooms by the school prior to the start of the school year and classrooms randomly assigned to the READ program group or a non-READ comparison group)
Program group (READ): 40 classrooms (22 to 25 students per classroom)
Comparison group (non-READ): 40 classrooms (22 to 25 students per classroom)
Classrooms will be stratified by grade level within a school and randomly assigned to either the READ program group or a comparison group. The READ and non-READ groups will each include 14 third-grade classrooms, 14 fourth-grade classrooms, and 12 fifth-grade classrooms.
Enriching the evaluation design: Program theory and logic modeling will be used to examine program implementation as well as short-term, intermediate, and long-term outcomes.
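The stratified random assignment described above can be illustrated with a short script. The sketch below is hypothetical and not part of the READ materials: the school labels, grade levels, and number of classrooms per school-grade stratum are invented for illustration, but the logic (shuffle each stratum, then split it in half) mirrors the design.

```python
# Hypothetical sketch of stratified random assignment: within each school and
# grade level, half of the classrooms go to READ and half to the comparison group.
import random

random.seed(2024)  # fixed seed so the assignment can be documented and reproduced

schools = [f"School {i}" for i in range(1, 7)]   # assumed six elementary schools
grades = [3, 4, 5]
classrooms_per_stratum = 4                       # illustrative count only
                                                 # (the actual READ design had 40 classrooms per group)

assignments = {}
for school in schools:
    for grade in grades:
        stratum = [f"{school}-Gr{grade}-Class{c}"
                   for c in range(1, classrooms_per_stratum + 1)]
        random.shuffle(stratum)
        half = len(stratum) // 2
        for classroom in stratum[:half]:
            assignments[classroom] = "READ"
        for classroom in stratum[half:]:
            assignments[classroom] = "comparison"

read_count = sum(1 for group in assignments.values() if group == "READ")
print(f"READ classrooms: {read_count}, comparison classrooms: {len(assignments) - read_count}")
```

Stratifying before assignment guarantees that each school and grade contributes equally to both groups, which is why the READ and comparison groups end up with matching grade-level counts.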
43
Although there are many evaluation methods, most are classified as qualitative, quantitative, or both. Qualitative methods rely primarily on noncategorical, free responses or narrative descriptions of a program, collected through methods such as open-ended survey items, interviews, or observations. Quantitative methods, on the other hand, rely primarily on discrete categories, such as counts, numbers, and multiple-choice responses. Qualitative and quantitative methods reinforce each other in an evaluation, as qualitative data can help to describe, illuminate, and provide a depth of understanding to quantitative findings. For this reason, you may want to choose an evaluation design that includes a combination of both qualitative and quantitative methods, commonly referred to as mixed-method. Some common evaluation methods are listed below and include assessments and tests; surveys and questionnaires; interviews and focus groups; observations; existing data; portfolios; and case studies. Rubrics are also included as an evaluation tool that is often used to score, categorize, or code interviews, observations, portfolios, qualitative assessments, and case studies.
Assessments and tests (typically quantitative but can include qualitative items) are often used prior to program implementation (pre) and again at program completion (post), or at various times during program implementation, to assess program progress and results. Results of assessments are usually objective, and multiple items can be used in combination to create a
subscale, often providing a more reliable estimate than any single item. If your program is intended to improve learning outcomes, you will likely want to use either an existing state or district assessment or choose an assessment of your own to measure change in student learning. However, before using assessment or test data, you should be sure that the assessment adequately addresses what you hope your program achieves. You would not want the success or failure of your program to be determined by an assessment that does not validly measure what your program is intended to achieve.
Surveys and questionnaires (typically quantitative but can include qualitative items) are often used to collect
information from large numbers of respondents. They can be administered online, on paper, in person, or over the phone. In order for surveys to provide useful information, the questions must be worded clearly and succinctly. Survey items can be open-ended or closed-ended.
Reliability and validity are important considerations when selecting and using instruments such as assessments and tests (as well as surveys and questionnaires).
Reliability is the consistency with which an instrument assesses (whatever it assesses). Reliability may refer to any of the following elements:
• The extent to which a respondent gives consistent responses to multiple items that are asking basically the same question in different ways (internal consistency reliability).
• The extent to which individuals’ scores are consistent if given the same assessment a short time later (test-retest reliability).
• The extent to which different raters give consistent scores for the same open-ended response, or different observers using an observation protocol give consistent scores for the same observation (inter-rater reliability).
(See next page for information on validity.)
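To make the internal consistency idea concrete, here is a minimal sketch, not part of this guide’s toolkit, that computes Cronbach’s alpha, a common internal consistency statistic, for a small set of invented survey responses.

```python
# Minimal sketch of an internal consistency check (Cronbach's alpha).
# Rows = respondents, columns = items intended to measure the same construct.
# All response values are invented for illustration.
import numpy as np

responses = np.array([
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [4, 4, 4, 5],
])

k = responses.shape[1]                              # number of items
item_variances = responses.var(axis=0, ddof=1)
total_variance = responses.sum(axis=1).var(ddof=1)  # variance of respondents' total scores
alpha = (k / (k - 1)) * (1 - item_variances.sum() / total_variance)
print(f"Cronbach's alpha: {alpha:.2f}")  # values around 0.8 or higher are often considered acceptable
```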
44
Open-ended survey items allow respondents to provide free-form responses to questions and are typically scored using a rubric. Closed-ended items give the respondent a choice of responses, often on a scale from 1 to 4 or 1 to 5. Surveys can be quickly administered, are usually easy to analyze, and can be adapted to fit specific situations.
Validity refers to how well an instrument measures what it is supposed to or claims to measure. An assessment is not simply valid or not valid but rather valid for a certain purpose with a certain population. In fact, the same assessment may be valid for one group but not for another. For example, a reading test administered in English may be valid for many students but not for those in the classroom who are ELL.
Traditional views of validity classify the validity of a data collection instrument into three types: content validity, construct validity, and criterion-related validity.
Content validity addresses whether an instrument asks questions that are relevant to what is being assessed.
Construct validity is the degree to which a measure accurately represents the underlying, unobserved theoretical construct it purports to measure.
Criterion-related validity refers to how well a measure predicts performance. There are two types of criterion- related validity—concurrent and predictive. Concurrent validity compares performance on an assessment with that on another assessment. For example, how do scores on the statewide assessment correlate with those on another nationally normed, standardized test? Predictive validity indicates the degree to which scores on an assessment can accurately predict performance on a future measure. For instance, how well do SAT scores predict performance in college?
A fourth type of validity that is sometimes noted is consequential validity. Consequential validity refers to the intended and unintended social consequences of using a particular measure, for example, using a particular test to determine which students to assign to remedial courses.
When choosing an assessment or creating your own assessment, you should investigate the technical qualities of reliability and validity to be sure the test is consistent in its measurement and to verify that it does indeed measure what you need to measure.
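As a rough illustration of a concurrent validity check, the sketch below correlates two sets of invented assessment scores for the same students; the score values, and the use of SciPy, are assumptions for illustration only.

```python
# Minimal sketch of a concurrent validity check: correlating scores on one
# assessment with scores on another assessment taken by the same students.
# The score lists are invented for illustration.
from scipy.stats import pearsonr

program_assessment = [410, 455, 390, 520, 480, 445, 500, 430]
state_assessment = [395, 470, 380, 510, 490, 450, 505, 420]

r, p_value = pearsonr(program_assessment, state_assessment)
print(f"Correlation: r = {r:.2f}, p = {p_value:.3f}")
# A strong positive correlation suggests the two assessments rank students similarly.
```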
45
Building your survey in conjunction with other methods and tools can help you understand your findings better. For instance, designing a survey to explore findings from observations or document reviews enables you to compare findings across multiple sources. Validating findings through multiple methods gives the evaluator more confidence in the evaluation results.
Using a previously administered survey can save you time, may give you something to compare your results to (if previous results are available), and may give you confidence that some of the potential problems have already been addressed.
Two notes of caution, however, in using surveys that others have developed: (a) be sure the instrument has been tested and demonstrated to be
reliable, and (b) be sure the survey addresses your evaluation needs. It is tempting to use an already developed survey without thinking critically about whether it will truly answer your evaluation questions. Existing surveys may need to be adapted to fit your specific needs.
Interviews and focus groups (qualitative) are typically conducted face-to-face or over the phone. You can create an interview protocol with questions to address your specific information needs. The interviewer can use follow-up questions and probes as necessary to clarify responses. However, interviews and focus groups take time to conduct and analyze. Because of their time-consuming nature, sample sizes are typically small and costs can be high.
Observations (usually qualitative but can be quantitative) can be used to collect information about people’s behavior, such as teachers’ classroom instruction or students’ active engagement. Observations can be scored using a rubric or through theme-based analyses, and multiple observations are necessary to ensure that findings are grounded. Because of this, observational techniques tend to be time-consuming and expensive, but they can provide an extremely rich description of program implementation.
Rubrics are guidelines that can be used objectively to examine subjective data. Rubrics as an evaluation tool provide you with a way to identify, quantify, categorize, sort, rank, score, or code portfolios, observations, and other subjective data.
Rubrics are used to score student work, such as writing samples or portfolios, as well as to examine classroom implementation of a program. When rubrics are used to examine behavior or performance, observers rely on the rubric definitions to determine where the behavior or performance lies on the rubric scale. Rubrics are typically scaled 1 to 4 or 1 to 5, with each number representing a level of implementation or a variation of use.
Observers or rubric scorers must be highly trained so that scoring is consistent among scorers (referred to as inter-rater reliability) and over multiple scoring occasions.
Rubrics can also be used to facilitate program implementation. Providing those implementing a project or program with a rubric that indicates variations in implementation, as well as what the preferred implementation would look like, can help to promote fidelity of implementation. For instance, just as students are provided with a scoring rubric before they complete a writing assignment (so they know what is expected and what constitutes an ideal response), teachers or administrators could be provided with a rubric regarding how to use or operate a program or how to conduct an activity.
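Inter-rater reliability for rubric scoring can be checked with a few lines of code. The sketch below uses invented scores from two hypothetical raters scoring the same responses on a 1 to 4 rubric and computes simple percent agreement along with Cohen’s kappa, one common chance-corrected agreement statistic.

```python
# Minimal sketch of an inter-rater reliability check for two raters scoring the
# same set of responses on a 1-4 rubric (all scores invented for illustration).
import numpy as np

rater_a = np.array([4, 3, 3, 2, 4, 1, 3, 2, 4, 3])
rater_b = np.array([4, 3, 2, 2, 4, 1, 3, 3, 4, 3])

percent_agreement = np.mean(rater_a == rater_b)

# Cohen's kappa adjusts observed agreement for the agreement expected by chance.
categories = np.union1d(rater_a, rater_b)
p_observed = percent_agreement
p_expected = sum(np.mean(rater_a == c) * np.mean(rater_b == c) for c in categories)
kappa = (p_observed - p_expected) / (1 - p_expected)

print(f"Percent agreement: {percent_agreement:.0%}, Cohen's kappa: {kappa:.2f}")
```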
46
Existing data (usually quantitative but can be qualitative) are often overlooked but can be an excellent and readily available source of evaluation information. Using existing data such as school records (e.g., student grades, test scores, graduation rate, truancy data, and behavioral infractions), work samples, and lesson plans, as well as documentation regarding school or district policy and procedures, minimizes the data collection burden. However, despite the availability and convenience, you should critically examine
the quality of existing data and whether they meet your evaluation needs.
Data Collection, Preparation, and Analysis
When collecting, storing, and using data from any source (including surveys, interviews, observations, existing data, etc.), it is important to keep participant information and responses confidential.
Ethical considerations should be first and foremost in the mind of the evaluator. Participant privacy is more important than evaluation needs or evaluation results.
Portfolios (typically qualitative) are collections of work samples and can be used to examine the progress of your program’s participants throughout your program. Work samples from before (pre) and after (post) program implementation can be compared and scored using rubrics to measure growth. Portfolios can show tangible and powerful evidence of growth and can be used as concrete examples when reporting program results. However, scoring can be subjective and is highly dependent upon the strength of the rubric and the training of the portfolio scorers.
Case studies (mostly qualitative but can include quantitative data) are in-depth examinations of a person, group of people, or context. Case studies can include a combination of any of the methods reviewed above. Case studies look at the big picture and investigate the interrelationships among data. For instance, a case study of a school might include interviews with teachers and parents, observations in the classroom, student surveys, student work, and test scores. Combining many methods into a case study can provide a rich picture of how a program is used, where a program might be improved, and any variation in findings from using different methods. Using multiple, mixed methods in an evaluation allows for a deeper understanding of a program, as well as a more accurate picture of how a program operates and its successes. See the Research and Evaluation Design, Including Reliability and Validity section in Appendix C for additional resources.
47
Table 2 presents an overview of evaluation methods and tools used to collect data, noting advantages and disadvantages.
Table 2: Evaluation Methods and Tools: Overview
For each method or tool, the table notes basic information, advantages, and disadvantages.

Assessments and Tests
Basic information:
• Usually quantitative but can be qualitative
• Can be administered online or in person
• Can be administered individually or in groups
Advantages:
• Multiple items may be used in combination to create a subscale, often providing a more reliable estimate than any single item.
• Can be used pre- and post-program implementation to measure growth.
Disadvantages:
• If the assessment is not aligned well with the program, data may not be a meaningful indicator of program success.
• If reliability and validity are not adequate, the data will be of poor quality, and inaccurate conclusions may be drawn.

Surveys and Questionnaires
Basic information:
• Typically quantitative but can be qualitative
• Can be administered in person, over the phone, online, or through the mail
Advantages:
• In-person surveys can be a quick method to collect data.
• If conducted with a captive (in-person) audience, response rates can be high.
• Electronic or Internet-based surveys can save time and costs with data entry and can improve data quality by reducing data entry errors.
Disadvantages:
• Due to postage costs and multiple mailings, mail surveys can be expensive.
• Response rates of mail surveys can be low.
• If, upon data analysis, it is found that questions were not worded well, some data may be unusable.

Interviews
Basic information:
• Qualitative method
• Can be conducted in person or over the phone
Advantages:
• Follow-up questions can be used to obtain more detail when needed.
• Follow-up probes can be used to determine how interviewees are interpreting questions.
• Nonverbal communication during in-person interviews aids in response interpretation.
Disadvantages:
• Time-consuming to conduct
• Time-consuming to analyze data
• Limited number of participants
• Can be expensive, depending on the number of people interviewed
48
Focus Groups
Basic information:
• Qualitative method
• Multiple people can be interviewed at the same time.
Advantages:
• Follow-up questions can be used to obtain more detail when needed.
• Follow-up questions can be used to determine how interviewees are interpreting questions.
• Participants can build on each other’s responses.
• Often more cost effective than interviews
• Nonverbal communication during in-person focus groups can aid in response interpretation.
Disadvantages:
• Group setting may inhibit participants from speaking freely.
• Difficult to coordinate schedules with multiple people
• Participants may focus on one topic, limiting exploration of other ideas.
• Requires a skilled facilitator
• Time-consuming to analyze data

Observations
Basic information:
• Typically qualitative but can be quantitative
• Can be done in person, via videotape, through one-way glass, or from a distance
Advantages:
• Provides a good sense of how the program is used
• Allows the researcher to gain a full understanding of the environment of participants
• Helps to provide a context for interpreting data
Disadvantages:
• Many observations may be needed to gain a realistic sense of how the program is used.
• Time-consuming to observe, and thus expensive
• Time-consuming to analyze
• Participant behavior may be affected by observer presence.

Existing Data
Basic information:
• Can be qualitative or quantitative
• Might include school records (electronic or paper based), work samples, lesson plans, or existing documentation (such as meeting minutes or attendance sheets)
Advantages:
• Low burden on participants to provide data
• Relatively inexpensive to collect
• Electronic data may facilitate analysis.
• Interpretation of existing data is often objective (although interpretation of documents or meeting minutes can be subjective).
Disadvantages:
• May not correspond exactly to evaluation needs
• May be incomplete or require additional interpretation
• May need special permission or consent to access and use
• If not electronic, may be time-consuming to analyze
49
Portfolios
Basic information:
• Primarily a qualitative method
• Can be captured and stored electronically
Advantages:
• Can provide a representative cross-section of work
• If portfolio work is collected pre-program and post-program, data can be used to examine growth.
Disadvantages:
• Scoring of qualitative work is often subjective.
• Objectivity of results relies on the strength of the scoring rubric and the training of scorers, so reliability and validity should be considered.

Case Studies
Basic information:
• Primarily a qualitative method
• Can include both qualitative and quantitative data
• Can include a mixture of many methods, including interviews, observations, existing data, etc.
Advantages:
• Provides a multi-method approach to evaluation
• Often allows a more in-depth examination of implementation and change than other methods
Disadvantages:
• Analyses of data can be subjective.
• Expensive to conduct and analyze; as a result, sample sizes are often small

Rubrics
Basic information:
• Quantitative method
• Guidelines to objectively examine and score subjective data such as observations, portfolios, open-ended survey responses, student work, etc.
• See the Rubrics sidebar on page 47 for more information.
Advantages:
• Powerful method to examine variations of program implementation
• Well-defined rubrics can be used not only for evaluation purposes but also to facilitate program implementation.
Disadvantages:
• Objectivity of results relies on the strength of the scoring rubric and the training of scorers.
50
Constraints
All programs have constraints during their implementation. Constraints might be contextual in that you may not have the support needed to fully evaluate your program. Or you may have
resource constraints, including financial or time constraints. Feasibility is important to consider while designing your evaluation.
A good evaluation must be doable. The design for a rigorous, comprehensive evaluation may look great on paper, but do you have the time available and
financial resources necessary to implement the evaluation? Do you have adequate organizational and logistical support to conduct the evaluation the way you have planned?
Feasibility is important to remember when planning your evaluation. Every evaluation has constraints, and if you do not consider them at the outset, your thoughtfully planned evaluation may be sidelined into no evaluation at all. Remember, a small evaluation is better than no evaluation, because basing program decisions on some information is better than basing decisions on no information. Be sure to plan within your organizational constraints.
You can also use your logic model to represent your evaluation time line and evaluation budget. The time frame of when and how often you should measure your short-term, intermediate, and long-term objectives can be noted directly on the logic model, either next to the headings of each or within each objective. Likewise, the cost associated with data collection and analysis can be recorded by objective.
By examining time line and budget by objective, evaluation activities that are particularly labor intensive or expensive can be clearly noted and planned for throughout the program’s implementation and evaluation. The Budgeting Time and Money section in Appendix C includes several resources that may help you with considerations when budgeting time and money for an evaluation.
51
The READ E-Team decided on data collection methods, including the data sources, for each evaluation question and associated indicators. Two examples are provided below.
1. In what ways and to what extent did teachers integrate READ into their classroom instruction?
→ A READ rubric will be used to measure teacher implementation of READ in the classroom.
→ The rubric will be completed through classroom observations and teacher interviews.
→ The READ implementation rubric will be on a 4-point scale, with a 4 representing the best implementation.
→ Data will be collected monthly, alternating between classroom observations one month and interviews the following month.
2. To what extent did READ improve student learning in reading?
→ The state reading assessment will be used to measure student learning in reading. It is administered in April of each academic year, beginning in second grade.
→ READ assessment data will be used as a formative measure to examine student reading performance.
→ State reading scores and READ assessment data will be disaggregated and examined by quality of teacher use (using the READ implementation rubric), frequency of home use, initial reading performance, grade level, gender, ethnicity, special education status, and English language proficiency.
→ Previous year state reading assessment scores will be used as a baseline against which to measure student reading improvement.
→ Reading scores on the state assessment will be analyzed in relation to scores on the READ assessments in order to determine the degree to which READ assessments correlate with the state reading assessment.
For a full list of evaluation questions, data sources, and data collection methods, see the READ Evaluation Matrix tables 10, 11, and 12 in Appendix A, Step 3: Implement the Evaluation. The READ Evaluation Matrix includes the READ logic model components, evaluation questions, indicators, targets, data sources, and data collection methods by the READ logic model strategies and activities, early/intermediate objectives, and long-term goals. The data analysis column in the READ Evaluation Matrix will be completed in Step 3.
52
STEP 3: IMPLEMENT – How Do I Evaluate the Program?
Ethical Issues Because evaluation deals with human beings, ethical issues must be considered. Evaluation is a type of research—evaluators research and study a program to determine how and to what extent it works. You likely have people (perhaps teachers or students) participating in the program, people leading the program, people overseeing the program, and people relying on the program to make a difference. It is the responsibility of the evaluator to protect people during evaluation activities. An evaluator must be honest, never keeping the truth from or lying
to participants. You should be clear about the purpose of the program and its evaluation. Respect for participants always comes before evaluation needs.
Prior to collecting any data, check with your administration to see what policies and procedures are in place for conducting evaluations. Is there an
Institutional Review Board (IRB) at the state, district, or school level that must be consulted prior to conducting an evaluation? Does your state, district, or school have formal Human Subjects Review procedures that must be followed? Does the evaluator need to obtain approvals or collect permission forms? Policies and procedures to safeguard study participants must be followed, and permissions must be received, before any data are collected. For resources on federal requirements regarding Institutional Review Boards or the protection of human subjects in research, see the Ethical Issues section in Appendix C.

Policies and procedures regarding informed consent and ethics to safeguard study participants must be followed before any data are collected.
Many programs are implemented as part of the school curriculum or as a districtwide or statewide initiative. In such cases, participants may by default participate in those programs as part of their education or work. However, if data are collected or used in the program evaluation, the participants have the right to consent or refuse to have their information used in the evaluation. In some situations and for some data, participants may have consented prior to the evaluation for their information to be used for various purposes, and their consent may extend to your evaluation. If you think this may be the case for your evaluation, be sure the evaluators verify it with your administration. In other instances, especially when data will be newly collected for the evaluation, the evaluator should obtain informed consent from participants before data collection begins. Depending upon the nature of your study and your institution, informed consent may be obtained through permission forms or through a formal human subjects review process.
53
As part of the evaluator’s responsibility to protect people, information obtained through and used by the evaluation must be kept confidential. Individual identities should be kept private and access to evaluation data should be limited to the evaluation team. Evaluators should protect privacy and ensure confidentiality by not attaching names to data and also by ensuring that individuals cannot be directly or deductively identified from evaluation findings. An exception to this may be case studies or evaluations that use student work as examples. For these evaluations, you should take care that your informed consents and written permissions explicitly state that participating individuals or organizations consent to being identified in evaluation reports, either by name or through examples used in the report.
Finally, you must be especially careful not to blur the lines between the roles of program staff and evaluation team when it comes to privacy and confidentiality. This is one of the reasons it is prudent to have the external evaluator on your evaluation team collect, manage, and analyze your data. If your data are particularly sensitive, or if evaluation participants were promised complete confidentiality, having an external evaluator handle all data collection and management is both the practical choice and the ethical preference. See the Ethical Issues section in Appendix C for resources on the ethical considerations and obligations of evaluation.
How Do I Collect the Data? Your data collection approach will depend upon your evaluation method. Table 3 includes an overview of data collection procedures for various evaluation methods.
Table 3: Evaluation Methods and Tools: Procedures
For each method or tool, the table lists key data collection procedures.

Assessments and Tests
• Review the test to be sure that what it measures is consistent with the outcomes you hope to affect.
• Review the test manual to be sure the test has adequate reliability and validity. (See the reliability and validity sidebars on pages 45 and 46 for more information.)
• Be sure that test proctors are well trained in test administration.

Surveys and Questionnaires
• Develop the survey questions or choose an existing survey that addresses your evaluation needs.
• Pilot test the survey to uncover and correct problems with survey items and questions, as well as to plan data analyses.
• Decide in advance on a target response rate, as well as the maximum number of times you will administer the survey or send the questionnaire.
• Examine reliability and validity. (See the reliability and validity sidebars on pages 45 and 46 for more information.)
54
Interviews
• Develop an interview protocol, highlighting key questions.
• Include question probes to gather more in-depth information.
• Limit how long the interview takes so that participants will be more willing to participate (and make sure to tell participants how much time will be needed for the interview).
• Obtain permission to digitally record so that you can concentrate on listening and asking questions. (The recording can be transcribed and analyzed after the interviews.)

Focus Groups
• As with an interview, develop a focus group protocol that includes key questions.
• Limit group size. (Using six to eight participants tends to work well, though a skilled facilitator may be able to increase the size.)
• Purposefully organize focus groups that include participants who can build upon and benefit from each other’s ideas, providing for a richer discourse.
• Purposefully organize focus groups that include participants who will feel comfortable speaking their opinions in the group.
• Obtain permission to digitally record so that you can concentrate on listening and asking questions. (The recording can be transcribed and analyzed after the focus groups.)

Observations
• Design the observation protocol and rubrics (if you will be analyzing data with rubrics). Remember to consider the environment and atmosphere, dispositions, pedagogy, curriculum, etc., when designing your protocol and rubrics.
• Observers should try to be as unobtrusive as possible so as not to influence the environment they are observing.
• See the Rubrics entry below in this table for pointers on design and consistency in scoring.

Existing Data
• Review existing data for applicability and accuracy. Caution: Simply because data exist does not mean that they are complete or accurate.

Portfolios
• Choose artifacts to be included in the portfolio.
• Design the scoring rubric in advance.
• See the Rubrics entry below in this table for pointers on design and consistency in scoring.

Case Studies
• Case studies might involve a combination of the above methods.
55
Rubrics
• Design the scoring rubrics before examining the qualitative data.
• Describe the best response or variation in detail.
• Decide on the number of variations or categories. (It works well to use four or five.)
• For each variation, describe in detail what the response or variation would look like. Typically, the best response is at the top of the scale. For example, on a scale of 1 to 4, the best response would be a 4. A variation with many but not all components of the best response might be a 3, a variation with a few components of the best response might be a 2, and a variation with little to no components of the best response would be a 1.
• Train raters or observers how to score using the rubric. Use several raters to score the same responses, observations, or student work using the rubric. Compare scores to examine inter-rater reliability. Discuss scoring among raters to improve consistency.
How Should I Organize the Data? During data collection, procedures should be put in place to protect privacy and to provide data security. For instance, if data can be tied to individual respondents, assign each respondent an identification number and store data according to that number. Often when data are collected at multiple times during the evaluation (e.g., pre and post) or when data sources need to be individually connected (e.g., student
demographic data and assessment data), a secondary data set can be created to match identification numbers with respondents. If this is the case, this secondary data set should be encrypted and kept highly confidential (i.e., stored in a locked office and not on a shared server), so that individual information cannot be accessed intentionally or inadvertently by others. It is also good practice to control and document who has access to raw evaluation data.
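One possible way to implement the ID-number approach described above is sketched below; the file names, participant names, and scores are hypothetical, and in practice the linking file would be kept on an encrypted or physically secured device rather than alongside the analysis data.

```python
# Hypothetical sketch of separating identities from evaluation data: each
# participant gets a study ID, the analysis file stores only the ID, and the
# name-to-ID link is kept in a separate, secured file.
import csv
import random

participants = ["Teacher A", "Teacher B", "Teacher C"]  # invented names

random.seed(7)
study_ids = random.sample(range(1000, 9999), len(participants))
link = dict(zip(participants, study_ids))

# Linking file: store separately and securely (e.g., encrypted drive, locked office).
with open("id_link_CONFIDENTIAL.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["name", "study_id"])
    writer.writerows(link.items())

# Analysis file: contains only study IDs, never names.
with open("rubric_scores.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["study_id", "rubric_score"])
    for name, study_id in link.items():
        writer.writerow([study_id, random.randint(1, 4)])  # placeholder scores
```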
Confidentiality and individual privacy are of primary importance during all aspects of the evaluation.
An evaluator should safeguard that private information is not divulged in conversations regarding the program; during data collection, organization, and storage; and through evaluation reporting.
56
You should also document your data sets. Having good documentation increases the credibility of your evaluation should questions be asked regarding your findings. It is sound practice to keep a record of what data were collected, when they were collected, and how respondents and other participants were chosen. This documentation also should include any definitions that might be necessary in order to interpret data, as well as interview protocols or survey instruments that were used. Documentation of data collected and how data were stored will be useful if you should want to reanalyze your data in the future, if someone asks you questions
about your data, or if someone would like to replicate your evaluation. See the Data Collection, Preparation, and Analysis section in Appendix C for resources on data preparation and creating codebooks to organize and document your data.
How Should I Analyze the Data? The purpose of analyzing your data is to convert all of the raw data that you have collected into something that is meaningful. Upon organizing your data, you may find that you are overwhelmed with the data you have available and wonder how you will make sense of it. Start with your logic model and evaluation questions. List the indicators and associated targets you have outlined for each evaluation question. Use what you have set up during your evaluation design to organize your analysis. Take each evaluation question one at a time, examine the data that pertain to the indicator(s) you have identified for the evaluation question, and compare the data collected to your targets.
Analyzing your data does not have to be daunting. Often when people think of data analysis, they assume complicated statistics must be involved. In reality, there are two things to keep in mind:
• Not all data analysis involves statistics.
• Even if statistics are involved, they should be at the level that the intended audience will understand.
Analysis methods differ by the type of data collected. If the information to be analyzed includes quantitative data, some type of statistical analysis will be necessary. The most common way statistics are used in evaluation is for descriptive purposes. For example, if you want to describe the number of hours students spent using a computer at home or at school, you would calculate either the average number or the percentage of students who use computers for a specified period of time. Or, you may want to compare the results of one group of students (e.g., at-risk students) to another group to see if technology influences different groups differently. In this case, you may want to use the same statistics (e.g., means and percentages), but report separate results by group.
You may also want to use a simple test of significance (e.g., t-test) to see if the differences in means are statistically significant (i.e., unlikely to differ by chance). Whether you use simple descriptive statistics or tests of significance and how you want to group your information depend on the type of information you have collected and your evaluation questions. For more complex data sets or in-depth analyses, more sophisticated statistical techniques, such as regression analysis, analysis of variance, multilevel modeling, factor analysis, and structural equation modeling can be used.
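A minimal sketch of these descriptive and inferential steps, using invented reading scores for a program group and a comparison group, might look like the following; the group sizes and score values are placeholders, not evaluation results.

```python
# Minimal sketch of descriptive statistics plus an independent-samples t-test
# comparing two groups (all scores invented for illustration).
import numpy as np
from scipy.stats import ttest_ind

program_group = np.array([72, 68, 75, 80, 66, 74, 79, 71, 77, 70])
comparison_group = np.array([65, 70, 62, 68, 64, 69, 66, 63, 71, 67])

print(f"Program mean: {program_group.mean():.1f}")
print(f"Comparison mean: {comparison_group.mean():.1f}")

t_stat, p_value = ttest_ind(program_group, comparison_group)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
# A small p-value (commonly below .05) suggests the difference in means is
# unlikely to be due to chance alone.
```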
57
If the information to be analyzed involves qualitative data, such as data collected from open- ended survey questions, interviews, case studies, or observations, data analysis will likely involve one of two methods. The first is to develop a rubric to score your interview or observational data. Remember, if at all possible, the rubric should be developed in advance of data collection. Once data are scored using the rubric, you can use quantitative analyses to analyze the resulting numerical or categorical data.
A second method to analyze qualitative data is to create a protocol to aid you in data analysis. Such protocols typically call for an iterative process of identifying and understanding themes, organizing data by emerging themes, coding data by theme, and making assertions or conclusions based on these themes. Often, example responses or descriptions taken from the data are used to support the assertions. As with quantitative data, it is important when reporting qualitative data not to inadvertently reveal an individual’s identity. All assertions and findings should be “scrubbed” to be sure that someone reviewing the report cannot deductively identify evaluation participants. See Appendix C for more information on Data Collection, Preparation, and Analysis.
When developing a rubric to code qualitative data:
• Decide on the number of variations or categories. (It works well to use four to five categories.)
• Describe the best response in detail.
• For each subsequent variation, describe what the response would look like. For example, on a scale of 1 to 4, the best response would be a 4. A variation with many but not all components of the best might be a 3. A variation with a few components of the best response might be a 2, while a variation with little to no components of the best response would be a 1.
58
The READ external evaluator collected a mix of quantitative and qualitative data to address the evaluation questions. Qualitative data collected through observations and interviews were coded using the READ implementation rubric and analyzed using descriptive statistics, including means and frequency distributions. Student reading assessment data were analyzed by testing for statistical significance, comparing mean test scores between groups of students and over time. An example is provided below. The full READ Evaluation Matrix starts in Appendix A, at Table 10: READ Evaluation Matrix—Strategies and Activities/Initial Implementation.
1. Logic Model Component: Improved integration of READ into classroom instruction (intermediate objective).
2. Evaluation Question: In what ways and to what extent did teachers integrate READ into their classroom instruction?
3. Indicator: Improved integration of READ lessons into classroom instruction.
4. Targets: By April, 50% of teachers will score a 3 or above (out of 4) on the READ implementation rubric. By June, 75% of teachers will score a 3 or above on the READ implementation rubric.
5. Data Source: READ implementation rubric (developed by the E-Team and administered by Dr. Elm).
6. Data Collection: Rubric completed through alternating, monthly classroom observations and teacher interviews.
7. Data Analysis: Rubric scores aggregated into frequency distributions and means; change over time to be analyzed.
All data collected through the evaluation were managed and stored by Dr. Elm, the external evaluator. The computer used for storage and analysis was located in a locked office. Only the external evaluator had access to the raw data. Data were backed up weekly to an external drive, which was kept in a locked drawer. To protect teacher and student privacy, identification numbers were assigned to all participants. Teacher and student names were not recorded with the data.
READ online records regarding student and teacher use, rubric data, and survey data were only accessible by the external evaluator. Results that were released were only in aggregation and had no identifying information. All evaluation data were secured and kept confidential to protect individual privacy.
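To illustrate how rubric data like those described in the analysis plan above might be compared against a target, here is a minimal sketch using invented rubric scores; the scores and the April target check are for illustration only and do not reflect actual READ results.

```python
# Minimal sketch of comparing rubric results to a target: the share of teachers
# scoring 3 or above on a 1-4 implementation rubric (scores invented).
from collections import Counter

april_scores = [4, 3, 2, 3, 4, 2, 3, 1, 4, 3, 2, 3]  # hypothetical April rubric scores

frequency = Counter(april_scores)
share_at_target = sum(1 for s in april_scores if s >= 3) / len(april_scores)

print("Score frequencies:", dict(sorted(frequency.items())))
print(f"Share scoring 3 or above: {share_at_target:.0%} (April target: 50%)")
```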
59
Managing the Unexpected and Unintended Just as with life, sometimes the unexpected happens. Perhaps you find that you were unable to collect all the data you had outlined in your design. Or maybe the existing data that you were relying on are not accessible. Or the data were available but the quality was not as good as you had expected (e.g., too much missing information or recording errors). Or possibly you were unable to get enough program participants to respond to your survey or agree to an interview. Don’t panic. Go back to your evaluation questions. Reexamine your indicators and measures. Is there another measure that can be used for your indicator? Is there another indicator you can use to address your evaluation question? Think creatively about what data you might be able to access or collect. You may find that you are not able to answer a certain evaluation question or that your answer to that question will be delayed. Or you may find that you can answer your question, sort of, but not in the best way. In any case, document what happened, explain what alternatives you are pursuing, and simply do the best you can. Evaluation does not occur in a sterile laboratory but within the course of everyday practice. Your evaluation might be less than ideal at times and you will undoubtedly face challenges, but in the long run, some information is better than no information. See Appendix C for resources on Evaluation Pitfalls.
60
STEP 4: INTERPRET – How Do I Interpret the Results?
How Do I Examine and Interpret My Results? In the end, the important part of collecting and analyzing your information is not the statistics or analytical technique but rather the conclusions you draw. The process of coming to a conclusion can vary from goal to goal and objective to objective. One of the most difficult tasks is defining vague goals and objectives, such as “sufficient training” or “adequate progress.” However, you have gone to great lengths to understand your program and plan your evaluation, and you have already developed targets for your indicators. Because of this, your interpretation of results will likely be more straightforward and less cumbersome.
Examination of evaluation results should be ongoing. It is not wise to wait until the end of an evaluation to analyze your data and interpret your results. For instance, if your evaluation results from implementation reveal that program activities were not put into place, continuing with the measurement of short-term and intermediate objectives is likely a waste of your resources. Similarly, if the evaluation of intermediate objectives reveals that outcomes are not as envisioned, an important question would be whether the program should be modified, scaled back, or discontinued. Do results indicate that the program is not working as expected?
Or do results reveal that the program’s theory is invalid and needs to be revisited? Was the program implemented as planned? Is it reasonable to think that making a change in the program could improve results? These are important questions to consider before moving on to the measurement of progress toward long-term goals.
Examination of data and interpretation of findings should be ongoing. Do not wait until the end of the evaluation!
In order to use evaluation for program improvement, communication of findings should be regular, continuous, and timely.
61
The READ evaluation subcommittee (the E-Team) examined the evaluation results and determined the following. Use of these findings will be discussed in Step 5.
Summative
1. First-year results indicate that state reading scores for READ students are higher than those for non-READ students. The gains are especially compelling for classrooms in which READ was used regularly and with fidelity, where increases in state reading scores were more than three times those of non-READ students.
2. Students in classrooms where READ was used regularly and with fidelity increased their reading scores on the state assessment by twice that of students in READ classrooms where READ was used minimally.
3. Students of teachers who used READ assessment data as intended to differentiate instruction increased their reading scores on the state assessment by twice as much as students of teachers who did not use READ assessment data as intended.
4. Student scores on READ assessments had a significant and strong positive correlation with student scores on the state reading assessment, indicating that the two assessments are likely well aligned and that READ assessment data are likely a good indicator of performance on the state reading assessment.
Formative
5. State reading assessment data could not be analyzed by home use of the READ program because only one classroom implemented the home component.
6. At the start of the year, teacher use of READ was promising and the program met its targets. However, as the program progressed and as more teachers were pressed to improve their use of READ, several targets were not met. READ student assessment data were not used as regularly by teachers as the classroom component of READ.
A full accounting of evaluation results by logic model component and evaluation question is provided in Appendix A, starting at Table 13: READ Evaluation Results—Strategies and Activities/Initial Implementation.
62
Interpretation should address the relationship between implementation and long-term goals. Presuming that your program was implemented and ongoing results were promising, to what extent did the program accomplish its long-term goals?
During interpretation, consider how the program worked for different groups of participants and under different conditions. You may also want to examine how long-term outcomes vary with implementation, as well as with results from short-term and intermediate indicators.
Results should be examined in relation to the proposed program’s theory. Do evaluation findings support the program’s theory? Were the assumptions underlying the program’s theory validated? If not, how did the program work differently from what you had proposed? How can the theory and the logic model representing this theory be changed to reflect how the program worked?
The logic model can be used as a tool to present evaluation findings, as well as to explain the relationships among components of the program. Updating the logic model to include results can be a useful reporting and dissemination tool.
Cautions During Interpretation Two common errors during results interpretation are overinterpretation and misinterpretation of results. Unless the evaluation design was a randomized, controlled experiment, results interpretation should not claim causal relationships. Indeed there may be relationships
between your program’s activities and its outcomes (and hopefully there will be!), but unless all rival explanations can be ruled out, causal associations cannot be claimed. Doing so would be an overinterpretation of your results.
When interpreting evaluation findings, be careful not to claim the data say more than they actually do!
Additionally, when interpreting results, you should consider possible alternative theories for your results. Considering and recognizing other explanations or contributors to your evaluation results does not diminish the significance of your findings but rather shows an understanding of the environment within which your program was implemented.
Over time, it is a combination of factors, some unrelated to the program itself, that interact to create results. Documenting your program’s environment can guard against misinterpretation of results and instead provide a thoughtful description of the circumstances under which the results were obtained. See the Interpreting, Reporting, Communicating, and Using Evaluation Results section in Appendix C for more information.

63
Although the READ evaluation used a true experimental design, E-Team members knew it would still be worthwhile to consider the possibility that other factors might have influenced the positive findings. The E-Team therefore brainstormed possible competing explanations for the positive results of the READ program.
The E-Team decided that another plausible explanation for the positive results was that the teachers who used READ regularly in the classroom and who used READ assessments as intended may have been more skilled teachers, and their students might have had a similar increase in reading scores even without the READ program. The E-Team decided to follow up on fidelity of implementation and its relationship to teacher skills. In addition, while classrooms were randomly assigned to READ to minimize initial differences between READ and non-READ classrooms, it is possible that by chance more skilled teachers were assigned to the READ program group. The E-Team also intends to investigate this issue further in Year 2 of the evaluation.
64
How Should I Communicate My Results? As mentioned earlier, evaluation findings should be communicated to program staff on an ongoing and regular basis. These formative findings are critical to program improvement. Setting a schedule for regular meetings between program staff and the evaluation team, as well as building these communications into your time line, will ensure that evaluation findings can
truly help the program during its operation. Evaluators can provide quick feedback at any stage of the program to help improve its implementation. For instance, if an evaluator notices from observing professional development sessions that teachers are leaving the
training early to attend another faculty meeting, the evaluator should give quick feedback to program staff that the timing of sessions may not be convenient (and for this reason, teachers are not receiving the full benefit of the training).
Setting up regular times throughout the program’s operation to share evaluation findings with program staff and other stakeholders is a key responsibility of the evaluator and critical to a program’s success.
Suppose the evaluator finds during the early stages of the program (through interviews or classroom observations) that teachers are struggling with the technology needed to use the program in the classroom. The evaluator can give quick feedback at a monthly meeting or through an email that technology support and technical assistance are needed in the classroom. Remember, however, an evaluator should not report on individual teachers or classrooms unless consent to do so has been obtained. Doing so could violate the ethical
obligation to participants in the evaluation and undermine future data collection efforts. Even quick feedback should maintain confidentiality.
In addition to relaying your findings on an ongoing basis for formative purposes, you will also want to communicate your summative evaluation findings regarding the extent of your program’s success to stakeholders, including administrators, school board members, parents, and funders. The first step to communicating your results is to determine your audience. If you have multiple audiences (e.g., administrators and parents), you may want to consider multiple methods of reporting your findings, including reports, presentations, discussions, and short briefs. Make a list of (a) all people and organizations you intend to communicate your results to; and (b) any others you would like to know about your evaluation findings. For each audience, ask yourself these questions:
• What background do they have regarding the program?
• What will they want to know?
• How much time and interest will they have?
• What do you want the audience to know?
Thinking through these questions will help you tailor your communication. In general, if you are given guidelines on what to report by a funder or by the state or district, try to follow them as closely as you can. If you are not given guidelines, then put yourself in the position of your audience and consider what information you would like to know. Here are some tips to keep in mind:
• If the audience already has background information on the program, try to focus on providing only specific findings from your evaluation. If your audience is not familiar with your program, you can use your program theory and logic model to introduce the program and provide a description of how the program is intended to work.
• Address the goals and objectives that you believe the audience would most want to know about.
• If the audience wants information immediately, write a short summary of major findings and follow up with a longer, more detailed report.
Don’t rely on the typical end-of-year evaluation report to communicate evaluation findings. Communicate to multiple audiences using multiple methods.
In addition to regularly sharing evaluation findings with program staff, let other stakeholders know on an ongoing basis how the program is doing. Think creatively about modes of communication that will reach all stakeholders.
65
• Don’t be afraid to include recommendations or identify possible areas for change. Recommendations are critical to making sure your evaluation findings are used appropriately. If changes are needed, you are going to have to talk about them sooner or later, and putting them in the report is a good way to start the conversation.
Finally, a long report is not the only way to communicate results. It is one way and perhaps the most traditional way, but there are many other methods available. Other options include:
• A memo or letter;
• A special newsletter or policy brief;
• A conference call or individual phone call;
• A presentation before a board or committee, or at a conference;
• A publication in a journal, newspaper, or magazine;
• A workshop;
• A web page or blog; or
• The school district newsletter or website.
Evaluation reports or presentations typically have a common format. First is an executive summary or overview that notes key findings. In fact, some will read only the executive summary, so you want to be sure it has the most important information. Other report sections might include:
• Introduction (including program background and theory);
• Evaluation design (including logic model, evaluation questions, and evaluation methods);
• Results (including all findings from the evaluation, organized by evaluation question);
• Conclusions (including your interpretation of the results);
• Recommendations (including how the program should proceed based on your findings); or
• Limitations (including limitations based on evaluation design, analysis of data, and interpretation of findings).
See Appendix C for more information on Interpreting, Reporting, Communicating, and Using Evaluation Results.
The READ oversight team met monthly to discuss program monitoring and improvement. At each meeting, the READ evaluator, Dr. Elm, and the E-Team provided an update to the oversight team. Based on the formative evaluation findings, the oversight team developed recommendations and a plan for the next month.
At the December school board meeting, the oversight team presented a status report, noting important findings from the evaluation. The oversight team asked Dr. Elm to create a full evaluation report for the administration and to present the findings at the August school board meeting. The E-Team also drafted a one-page brief of evaluation findings which was provided to all participants, as well as to the local newspaper.
STEP 5: INFORM and REFINE – How Do I Use the Evaluation Results?
Informing for Program Improvement One of the most important uses of evaluation findings is for program improvement. In fact, for many audiences, your evaluation communication should focus on improvement. In order to do this, evaluation communication and reporting should include not only positive findings but also findings that may not be flattering to your program. These not-so-positive findings are the basis for program improvement.
When using evaluation results, ask yourself whether your findings are what you expected.
Has the program accomplished what was intended? If yes, do you see areas where it can be made even better? If no, why do you think the program was not as successful as anticipated? Did the program not have enough time to be successful? Was the implementation delayed or flawed? Or perhaps the program theory was not correct. In any case, using evaluation results is vital to improving your program.
Be sure to report both positive and negative findings. Negative findings can be communicated as lessons learned or areas for improvement.
Informing for Accountability Another important use of evaluation findings is for accountability purposes. Designing and implementing programs take valuable resources, and your evaluation findings can help you determine whether the expenditure is worth the results.
Accountability pertains to basic questions, such as whether the program was indeed implemented and whether program funding was faithfully spent on the program, and to more involved questions, such as whether the program is a sound investment. For this reason, as with program improvement communications, it is important for your evaluation reporting to include all findings, good and bad, so that informed decisions can be made regarding the program’s future. Should the program be continued or expanded? Should it be scaled back? While evaluation reporting can be used for program marketing or for encouraging new funding, evaluation findings should include sufficient information for decisions regarding accountability. A caution, however, is that decisions regarding accountability should be made carefully and be based on evidence from multiple sources derived from a rigorous evaluation.
During her evaluation update at the November oversight team meeting, Dr. Elm shared initial findings from the evaluation of the implementation of READ program activities. Indicators showed that many students did not have the technology available at home to access READ. Even within those schools that had high numbers of students with the technology necessary for home access, the classroom variability was large. Only one of the 40 classrooms was able to have 100 percent of students access READ from home. Open-ended survey items revealed that teachers did not feel comfortable offering READ homework assignments to some but not all students in their classroom and therefore chose not to train students in the home use of READ. Only one teacher had trained his students in the home use of READ because all of his students had the technology at home necessary to access READ. This teacher indicated that he would like to continue with the home component of READ.
The oversight team discussed the home-component issue and asked for advice from the E-Team on how to proceed. With the support of the E-Team, the oversight team decided to have a one classroom pilot of the home component but otherwise to remove the home component from the program during Year 1. Based on results from the pilot, implementing a partial home component in Year 2 would be considered.
During the same November update, Dr. Elm provided some findings from the evaluation of the early/short-term objectives on the READ logic model. She noted that in October all teachers had reported using READ in their classroom and that over half of teachers reported that they had used READ every week. However, over one-quarter of teachers reported that they had used READ in their classroom only once or twice in the last month. Survey data indicated that some of these teachers felt overwhelmed with the technology and some said they could not fit READ classroom use into their already busy day.
The oversight team discussed this information and decided to make a midcourse adjustment. Before the READ program began, team members had thought that the initial professional development and ongoing technical assistance would be sufficient. However, they now believed that they needed to make one-on-one professional development available to those teachers who would like to have someone come into their classroom and model a lesson using READ. Mrs. Anderson assigned arrangements for this one-on-one professional development to one of the oversight team members.
During her evaluation update at the January oversight team meeting, Dr. Elm shared findings from the evaluation of the intermediate objectives on the READ logic model. Dr. Elm explained that on the December teacher survey, slightly less than half the teachers reported that they used the READ assessment data on a weekly basis for planning and differentiating instruction. One in 10 teachers said they had never used the READ assessment data. Dr. Elm further stated that the lack of use of the READ assessment data was likely affecting scores on the READ implementation rubric. From classroom observations, interviews, and surveys, she believed that the quality of teacher use of READ in the classroom was progressing nicely but that the lack of assessment data use was decreasing the overall rubric score.
The oversight team knew that using the READ assessment data to plan and differentiate instruction was critical to the program’s success. Mrs. Anderson decided to discuss the issue with the READ faculty at each school in an effort to understand what she could do to facilitate their use of the READ assessment data. Additionally, the E-Team planned to elaborate on the rubric so that subscores could be captured for various components of the rubric. These rubric subscores would be especially useful for analysis when the data are disaggregated by teacher use of READ in the classroom, student interaction in the classroom, and teacher use of READ student assessment data to plan and differentiate instruction. The revised rubric would be developed during the spring, piloted over the summer, and implemented during Year 2.
Finally, at the evaluation update at the end of the school year, Dr. Elm reported on the preliminary evaluation of long-term goals of the READ program. Student reading achievement was higher among students of teachers who used READ regularly and as intended, and the difference was statistically significant. Further, students of teachers who used the READ assessment data to tailor classroom instruction had higher reading test scores than students of teachers who did not use the READ assessment data, and again the difference was statistically significant.
Year 1 evaluation findings also indicated that not all teachers had bought into using READ with their students, especially the READ assessment component. The oversight team decided to share the evaluation findings with all teachers at a staff meeting in order to encourage them to use READ in their classroom. Prior to sharing the evaluation findings with teachers, Dr. Elm conducted an anonymous follow-up survey at the staff meeting in an effort to find out why some teachers chose to not use READ.
If your program design and evaluation were inclusive processes that involved stakeholders and participants from the start, it is more likely that your evaluation findings will be used for program improvement and accountability. Involving others in your program’s implementation encourages a shared sense of responsibility for the program as well as a shared investment in the program’s success. Hearing about a program at its very start and not again until an evaluation report is provided does not foster the ownership among staff, stakeholders, and participants that is needed for a successful program.
So, how do you make sure your evaluation report, along with all of your hard work and informative results, is not put on a shelf to gather dust? Make evaluation a participatory process from understanding and defining the program in Step 1 to informing the program in Step 5.
Why do we evaluate?
To ensure that the programs we are using in our schools are beneficial to students; to make programs and projects better; and to learn more about what programs work well, for whom, and to what extent.
How do we increase the likelihood that evaluation results will be used?
Create the opportunity for evaluation to have an impact on programmatic decision-making.
How do we create this opportunity?
We can start by:
• Embedding evaluation into our programs from the outset;
• Communicating program findings frequently and regularly; and
• Making the evaluation process participatory from start to finish.
Refining the Program’s Theory Your evaluation findings should also be used to refine your logic model. As mentioned earlier, the logic model is a living model and its underlying assumptions should be dynamic, changing as new information is learned. If the culture in which your program is implemented is a learning culture, using findings to improve the logic model comes naturally. However, in other environments, it may not be as easy to apply your findings to logic model improvement. Regardless, if your program is to continue, you should keep its logic model up-to-date.
An up-to-date logic model can facilitate future evaluation and serve as the cornerstone of your program. Your program’s theory and logic model should be part of the core documentation of your program and can be used to train new program participants, as well as to explain the program to parents, administrative staff, potential funders, and other stakeholders.
Take Action You have completed a lot of work. You have distributed your evaluation findings through letters, reports, meetings, and informal conversations. You have given presentations. So, what do you do now? How do you make sure that your information is used?
First, think about what changes you would like to see. Before you can attempt to persuade others to use your information, you need to figure out what you would like to happen. What changes would you like to see or what decisions do you think need to be made as a result of your information?
Second, think about what changes others might want. Learning how others would like the information to be used gives you more awareness of where they are coming from and more insight as to how they would best be motivated.
Next, take action. You have evidence from your evaluation, you have shared it with others, and you know what you want done. Ask for it! Find out who is in charge of making the changes you want and make sure they hear your findings and your recommendations. Give them a chance to process your suggestions. Then follow up. See Appendix C for more information on Interpreting, Reporting, Communicating, and Using Evaluation Results.
The READ oversight team felt that the logic model they created accurately portrayed the program. Yet, since it was clear from November that the home component could not be fully implemented, they wanted to highlight this on the logic model. The team decided to draw a box around the program as it was implemented, excluding the home component. Below the model, a note was provided indicating why the home component was not part of the existing implementation and that it was currently being piloted in one classroom. The oversight team hoped to understand more about the implementation of the home component, as well as the success of the home component, from examining results from the pilot classroom.
The oversight team also wanted to understand more about the strength of the relationship between classroom use of READ and state assessment scores and between use of READ assessment data for instructional planning and state assessment scores. It noted this on the logic model and asked the E-Team to investigate the linkages further in the second year of the evaluation.
Change Takes Time One final note: change takes time. We all want to see the impact of our efforts right away, but in most cases change does not happen quickly. Embedded evaluation allows you to show incremental findings as you strive to achieve your long-term goals, and can help you to set realistic expectations regarding the time it takes to observe change related to your indicators. If you plan to use your evaluation results to advocate program expansion or to secure funding, keep in mind that changing policy based on your findings also will take time. People need to process your evaluation findings, determine for themselves how the findings impact policy and practice, decide how to proceed based on your evidence, and then go through the appropriate process and get the proper approvals before you will see any change in policy from your evaluation findings. As mentioned earlier, including others throughout your program’s design and implementation can facilitate the change process. However, even with a participatory evaluation and positive findings, policy change will occur on its own time line.
The READ oversight team recommended that the READ program be offered to all students in the district. It also recommended that the program be incorporated into the regular curriculum. The team felt that the positive findings regarding test scores were strong enough that all students should have access to the program.
However, since READ funding was still at the 50% level for the second year, the oversight team planned to work with Dr. Elm and the E-Team for another year in order to continue to refine the implementation of the program in the classroom and to further understand the success of the READ program with students. To do this, the team recommended that the second-year evaluation include student surveys and focus groups as data sources to address objectives related to student interaction and engagement in the classroom.
The oversight team decided to continue to advocate for the program's expansion in the hope that it would be institutionalized soon.
Appendix A: Embedded Evaluation Illustration – READ* Program Snapshot The Reading Engagement for Achievement and Differentiation (READ) program is a districtwide initiative focused on improving student reading skills in Grades 3-5. The READ example uses an experimental evaluation design and theory-based, embedded evaluation methods.
*This example was created solely to illustrate how the principles in this guide could be applied in actual situations. The program, characters, schools, and school districts mentioned in the example are fictitious.
Step 1: Define the Program
Background For the past 5 years, reading scores in the Grovemont School District have been declining. The curriculum supervisor, Mrs. Anderson, has tried many strategies to improve reading skills. However, scores continue to decline. Mrs. Anderson has been searching for curricular and assessment materials that are better aligned with state reading standards and that provide ongoing standards-based assessment data. Mrs. Anderson found a program called READ (Reading Engagement for Achievement and Differentiation) that looked promising. After reviewing research on the program and documentation from the vendor, and after numerous discussions and interviews with other districts that had implemented the program, Mrs. Anderson and the district superintendent decided to present the READ program to the school board in order to gain approval for funding the program for Grades 3-5.
At last month’s meeting, the school board voted to partially fund the READ program. Due to recent state budget cuts, the school board was only able to fund the program at 50% for 2 years. At the end of the 2 years, the board agreed to revisit its funding decision. The board required an evaluation report and presentation due in September of each year.
Before starting to plan the READ program, Mrs. Anderson invited one teacher from each of the district’s six elementary schools, the district reading coach, one of the district’s reading specialists, and the district technology coordinator to join the READ oversight team. This 10-member team was charged with planning the READ program and its evaluation. The team asked
an evaluator from the local university to conduct the READ evaluation and to attend oversight team meetings.
The Evaluation The oversight team asked the external evaluator, Dr. Elm, to help them plan the evaluation. Dr. Elm suggested that the oversight team build evaluation into its program as the team is designing it. By embedding evaluation into the program, information from the evaluation would be available to guide program implementation. Evaluation data would both drive program improvement and be the foundation for future decisions regarding whether the program should be continued, expanded, scaled down, or discontinued.
The oversight team members invited Dr. Elm to lead them through the process of building evaluation into their program planning. Dr. Elm explained that the first step is to gain a thorough understanding of the program. To do this, Mrs. Anderson shared with the oversight team the materials she had already reviewed. In addition, the oversight team contacted four school districts that had used the READ program successfully in order to learn more about the program. To develop a thorough and shared understanding of the context in which the READ program would be implemented, the team reviewed the state's reading standards, the district's strategic plan, the district's core learning goals and curriculum maps in reading, and the district's technology plan. The team also examined reading grades and state reading assessment scores for the district as a whole, as well as by school, English Language Learner (ELL) status, and special education status for the past 5 years.
The next step, stated Dr. Elm, is to define the program by explaining the program theory. Explaining the program theory will include what the program is intended to accomplish, as well as how and why the program is expected to work. Dr. Elm recommended that the team complete the program theory in three parts: (a) defining the program’s long-term goals, (b) delineating the program’s strategies and activities, and (c) explaining how and why the team believes the program’s activities and strategies will result in the desired outcomes.
Program Goals Based on their review of research and documentation as well as discussions and interviews with other districts that had implemented the program, and meetings with district administration and school staff, Mrs. Anderson and the oversight team set the following long-term goals for READ:
1. Increased student engagement in reading
2. Improved student reading skills
Program Strategies and Activities The READ oversight team examined program materials to determine the primary components of the READ program. They determined that the READ program had three strategies: classroom lessons, homework, and assessments. Each of these strategies required certain activities in order to be successful. For instance, teachers would need professional development on how to integrate the READ classroom lessons into their instruction, as well as how to use the READ assessment data. Students would also need training in how to use the READ system in the classroom and at home.
After careful review of the READ program and the district’s particular program needs, the oversight team outlined the following primary strategies and activities for the READ program:
1. Interactive, standards-based classroom lessons (using the READ software with interactive classroom technologies and individual handheld mobile devices for each student)
2. Standards-based reading assessments (Internet-based, formative READ assessments of student reading skills administered using the READ software)
3. Standards-based reading homework (Internet-based using READ software)
4. Teacher professional development on integrating READ into classroom instruction (using an interactive wireless pad)
5. Teacher professional development on using READ assessment data for classroom lesson planning
6. Student training on using READ (in the classroom and at home)
Relating Strategies to Goals: Program Theory During a planning meeting focusing on why READ strategies and activities should result in the desired long-term goals, the oversight team brainstormed the underlying assumptions that were necessary for READ to work. The evaluator, Dr. Elm, facilitated the discussion among the oversight team members, leading them through the process of linking the program’s activities and strategies to the long-term goals. Dr. Elm asked each member of the team to record why and how they thought each strategy or activity would lead to increased student engagement and improved student reading skills. Team members shared their reasoning with the group.
Dr. Elm led a discussion with the oversight team in which they examined each team member’s ideas regarding why the program should work. Focusing on these ideas but not limited by them, the team members formulated, as a group, the underlying assumptions that were necessary to relate READ strategies and activities to long-term goals. During the discussion, team members were able to build on each other’s ideas in order to construct a comprehensive theory that was supported by the group.
As a result of their discussion, the team put forward the following seven assumptions forming the basis of READ’s program theory:
1. Interactive, standards-based classroom lessons using READ software will increase student interaction during learning, which will lead to increased exposure to standards-based learning opportunities.
2. Standards-based reading assessments using READ software will increase the availability of formative, standards-based data on student reading performance, which will lead to increased teacher use of formative standards-based reading assessment data and then improved differentiation of instruction.
3. Standards-based reading homework using READ software will increase student exposure to standards-based learning opportunities.
4. Teacher professional development on integrating READ into their classroom instruction will increase teacher use of READ, which will lead to improved integration of READ into classroom instruction. Teacher professional development on using READ assessment data for classroom lesson planning will increase teacher use of formative standards-based student reading assessment data. Both will lead to improved differentiation of instruction.
5. Student training on using READ in the classroom will increase student interaction during learning. Student training on using READ at home will increase student use of READ at home. Both will lead to increased student exposure to standards-based learning opportunities.
6. Increased student interaction in the classroom and improved differentiation of instruction will result in increased student engagement.
7. Increased student exposure to standards-based learning opportunities, improved differentiation of instruction, and increased student engagement will result in improved reading skills.
Resources The oversight team also identified contextual conditions and resources necessary to the success of READ:
1. Program funding for READ, as well as necessary equipment to support infrastructure needs.
2. Program funding for external evaluation assistance.
3. Technology infrastructure at school:
a. Classroom computer with Internet access
b. Interactive technologies in each classroom
c. Interactive, wireless pad for convenient, mobile teacher operation of computer
d. 25 student handheld mobile devices per classroom for interactive learning
4. Availability of professional development for teachers on:
a. Using interactive equipment in the classroom with the READ software; ongoing technical assistance from technology coordinator
b. Integrating the READ software into their instruction
c. Using READ assessment data for classroom lesson planning and differentiation of instruction
5. Availability of student training on how to use interactive equipment in the classroom, as well as how to use the READ software at home.
6. Student access to technology at home (computer with Internet connection).
Program Logic Model At this point in the evaluation design, Dr. Elm recommended that the READ oversight team create an evaluation subcommittee, named the E-Team, composed of 3-5 members. The evaluation subcommittee was formed as a partnership and a liaison between the READ program staff and the external evaluator, and was tasked with helping to design the evaluation and with monitoring the evaluation findings shared by the READ external evaluator. Mrs. Anderson appointed two oversight committee members (the district reading coach and one of the district reading specialists) to the E-Team. She also asked the district supervisor for assessment and evaluation to serve on the E-Team and to be the primary internal contact for the READ external evaluator. Finally, she invited Dr. Elm to serve as the chair of the E-Team and as the lead external evaluator of the READ program. As the external evaluator, Dr. Elm would conduct the evaluation and share findings with the E-Team and oversight team. The four-member E-Team’s first task was to create the READ logic model.
Using the program definition developed by the oversight team, the E-Team worked to create a logic model. The E-Team started with the long-term goals on the right side of the model. The E-Team listed the contextual conditions and resources on the left. Just to the right of the context, the E-Team listed the strategies and activities. Next, the E-Team used the oversight team’s assumptions to work through the early/short-term and intermediate objectives. The resulting logic model is provided in Figure 3: READ Logic Model.
Figure 3: READ Logic Model
Now that it had a draft logic model, the E-Team planned to share it with the oversight team in order to fine-tune, clarify, and finalize. Next, the oversight team and the E-Team would work together to develop evaluation questions.
Note: Figure 6: Logic Model Template is provided in Appendix E. Logic models can be created using the drawing template in a simple word processing application. There are also several applications available that are specifically tailored for creating logic models, and there are others that enable you to create a diagram using different shapes.
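If you prefer to generate the diagram with a short script rather than a drawing tool, the sketch below shows one possibility. It is a minimal illustration only, assuming the open-source graphviz Python package is installed along with the Graphviz software; the node labels are condensed from the READ example and are not part of the guide's template.

```python
# Minimal sketch of scripting a logic model diagram with the graphviz package.
# Assumes "pip install graphviz" and the Graphviz system software are available.
from graphviz import Digraph

model = Digraph("read_logic_model", graph_attr={"rankdir": "LR"})  # left-to-right layout

# One node per logic model column (labels condensed from the READ example).
model.node("context", "Contextual Conditions\nand Resources")
model.node("strategies", "Strategies and Activities")
model.node("short_term", "Early/Short-Term Objectives")
model.node("intermediate", "Intermediate Objectives")
model.node("long_term", "Long-Term Goals")

# Arrows follow the program theory: each column leads to the next.
model.edge("context", "strategies")
model.edge("strategies", "short_term")
model.edge("short_term", "intermediate")
model.edge("intermediate", "long_term")

model.render("read_logic_model", format="pdf", cleanup=True)  # writes read_logic_model.pdf
```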
Step 2: Plan the Evaluation At the next READ planning meeting, the E-Team shared the draft logic model with the full oversight team. Oversight team members reviewed the model and felt comfortable that it represented the assumptions and logic as they had agreed on at their last meeting. No changes were needed to the logic model at this time. Next, the E-Team and the oversight team used the logic model to develop evaluation questions for the READ program.
Evaluation Questions – Strategies and Activities Using each of the strategies and activities listed on the left-hand side of the logic model, the E-Team worked with the READ oversight team to develop evaluation questions. For each strategy or activity, they developed questions addressing whether the strategy or activity had been carried out, as well as questions addressing some contextual conditions and resources necessary for program implementation. The READ E-Team and oversight team created six evaluation questions to assess READ strategies and activities.
Table 4: Evaluation Questions for Strategies and Activities
Strategies and Activities Evaluation Questions
Interactive, standards-based classroom lessons (using the READ software with interactive classroom technologies and individual handheld mobile devices for each student)
To what extent did teachers have access to the necessary technology in the classroom to use READ in their instruction?
Standards-based reading assessments (Internet-based, formative assessments of student reading skills administered within the READ software)
To what extent were READ assessments made available to students and teachers? Examine overall, by school, and by grade level.
Standards-based reading homework (Internet-based using READ software)
To what extent did students have access to READ at home? Examine overall and by grade level, race, gender, and socioeconomic status.
Teacher professional development on integrating READ into classroom instruction (using an interactive wireless pad)
To what extent did teachers receive professional development on how to integrate READ into their classroom instruction?
Teacher professional development on using READ assessment data for classroom lesson planning
To what extent did teachers receive professional development on how to incorporate READ assessment data into their classroom lesson planning?
Student training on using READ (in the classroom and at home)
To what extent were students trained in how to use READ?
Note: These questions are intended to evaluate the degree to which the program had the opportunity to be successful, as well as to determine if additional program supports are needed for successful implementation.
Evaluation Questions – Early/Short-Term and Intermediate Objectives Next, the E-Team worked with the READ oversight team to create several evaluation questions addressing READ early/short-term and intermediate objectives:
Table 5: Evaluation Questions for Early/Short-Term and Intermediate Objectives
Early/Short-Term and Intermediate Objectives
Evaluation Questions
Increased student use of READ at home (early/short-term)
How often did students receive READ homework assignments? To what extent did students complete READ homework assignments? **Note frequency and duration of use.
Increased teacher use of READ in the classroom (early/short- term)
In what ways and how often did teachers use READ in the classroom with students? **Note frequency, duration, and nature of use.
Increased student exposure to standards-based learning opportunities (early/short- term)
To what extent did students complete READ homework assignments?
How often did teachers use READ in the classroom with students?
Increased availability of standards-based, formative READ assessment data on student reading performance (early/short-term)
How often did teachers access READ student assessment data? **Note frequency and type of access.
Increased teacher use of standards-based READ assessment data (early/short- term)
In what ways did teachers use READ student assessment data?
Increased student interaction during learning (intermediate)
To what extent and how did students interact during classroom instruction when READ was used? **Note frequency and type of interaction.
Improved integration of READ into classroom instruction (intermediate)
In what ways and to what extent did teachers integrate READ into their classroom instruction? **Note the quality with which READ was integrated into classroom instruction by teachers.
Improved differentiation of instruction (intermediate)
In what ways and to what extent did teachers use READ assessment data to plan and differentiate instruction? **Note what data were used and how data were used in instructional planning.
Evaluation Questions – Long-Term Goals Finally, the E-Team and the READ oversight team created evaluation questions addressing READ long-term goals:
Table 6: Evaluation Questions for Long-Term Goals
Long-Term Goals Evaluation Questions
Increased student engagement in reading
To what extent and in what ways did READ foster student engagement during reading lessons?
Improved student reading skills
To what extent did READ improve student learning in reading?
• To what extent did student learning improve after READ was implemented?
• To what extent did learning outcomes vary with teacher use of READ in the classroom?
• To what extent did learning outcomes vary with teacher use of READ assessment data to plan and differentiate instruction?
• How did student performance on the READ assessments correlate with student performance on state assessments?
• In what ways did learning outcomes vary by initial reading performance on state assessments?
• In what ways did learning outcomes vary by grade level?
• In what ways did learning outcomes vary by special education status and English language proficiency?
• In what ways did learning outcomes vary with the frequency of READ use at home?
Data Collection – Indicators and Targets With the evaluation questions that the READ oversight team and E-Team had created, the E-Team was ready to expand on each question with indicators and accompanying targets. Using the logic model as its guide, the E-Team created the evaluation matrix below detailing the logic model components, associated evaluation questions, indicators, and accompanying targets.
For example, as the E-Team members began developing indicators for logic model components, they realized that student exposure to standards-based learning opportunities was an important construct composed of multiple components. Student exposure was assumed to
occur early on through READ homework assignments and teacher use of READ in the classroom, both of which the E-Team identified as indicators of exposure. Then at the intermediate stage, student interaction during classroom lessons using READ (measured by the rubric described below) was assumed to further increase student exposure.
The evaluation matrix is presented in three tables:
1. Strategies and Activities (Table 7: Evaluation Matrix Addressing Strategies and Activities During the Initial Implementation—Indicators and Targets)
2. Early/Short-Term and Intermediate Objectives (Table 8: Evaluation Matrix Addressing Early/Short-Term and Intermediate Objectives—Indicators and Targets)
3. Long-term Goals (Table 9: Evaluation Matrix Addressing Long-Term Goals—Indicators and Targets)
As you read through the tables, you will see that the evaluation will collect much of the data through the four instruments described below: (1) the READ implementation rubric, (2) the teacher survey, (3) the READ student assessment, and (4) the annual state student reading assessment.
1. READ implementation rubric: The E-Team created the READ implementation rubric to examine the quality of teacher practice when using READ during classroom instruction, student interaction during learning, teacher integration of READ into classroom instruction, and teacher use of READ student assessment data to plan and differentiate instruction. Dr. Elm will administer the READ implementation rubric on a monthly basis, alternating between classroom observations one month and interviews with teachers the following month.
2. Teacher survey: Dr. Elm will conduct the teacher survey in October as a baseline and again in December, February, April, and June. The teacher survey has multiple sections including some open-ended questions. Most items will be included every time the survey is administered. Others (such as items on the initial account setup and access) will be administered only when appropriate.
3. READ student assessment: The READ software itself includes an embedded, formative, standards-based READ assessment to measure student learning before and after each lesson. The data from the embedded READ assessment are stored in the READ system for teachers to use in assessing student learning and planning their instruction. The evaluation also will use these data.
4. State student reading assessment: The evaluation also will use state reading assessment scores to measure student learning in reading. The state reading assessment is administered in April of each academic year. Reading scores from the spring prior to READ program implementation will be used as a baseline against which to measure student reading improvement.
The evaluation will use data collected using each of these four instruments—the READ implementation rubric, teacher survey, READ student assessment, and reading scores from the state assessment. READ student assessment data and state student reading assessment data will be disaggregated and examined by quality of teacher use (using the READ implementation rubric), frequency of home use, initial reading performance on state assessments, grade level, gender, ethnicity, special education status, and English language proficiency. In addition, reading scores on the READ student assessment will be analyzed in relation to the state assessment reading scores to determine the degree to which the READ assessments correlate with the state reading assessment.
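As a rough illustration of how the planned disaggregation and correlation analyses might be carried out, the sketch below uses the pandas library and a hypothetical student-level data file; the file name and column names are placeholders, not part of the READ evaluation.

```python
# Sketch of the disaggregation and correlation analysis described above.
# The data file and column names are hypothetical placeholders.
import pandas as pd

df = pd.read_csv("read_student_data.csv")  # one row per student (hypothetical)

# Disaggregate mean state reading scores by selected student characteristics.
subgroup_means = df.groupby(
    ["grade_level", "special_ed_status", "ell_status"]
)["state_reading_score"].agg(["mean", "count"])
print(subgroup_means)

# Correlation between READ assessment scores and state reading scores (Pearson r).
r = df["read_assessment_score"].corr(df["state_reading_score"])
print(f"Correlation between READ and state assessment scores: r = {r:.2f}")
```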
Tables 7, 8, and 9 show the evaluation questions, indicators, and targets developed in Step 2: Plan the Evaluation.
Table 7: Evaluation Matrix Addressing Strategies and Activities During the Initial Implementation—Indicators and Targets
Logic Model Components Evaluation Questions Indicators Targets
Interactive, standards- based classroom lessons (using the READ software with interactive classroom technologies and individual handheld mobile devices for each student)
To what extent did teachers have access to the necessary technology in the classroom to use READ in their instruction?
Increased number of teachers with access to the necessary technology in their classroom to use READ
By the start of the school year, all teachers will have the necessary technology in their classroom to use READ.
Standards-based reading assessments (Internet- based, formative assessments of student reading skills administered within the READ software)
To what extent were READ assessments made available to students and teachers? Examine overall, by school, and by grade level.
Increased number of teachers with access to READ assessments
Increased number of students with access to READ assessments
By the start of the school year, all teacher accounts will have been set up in READ.
By the end of September, all student accounts will have been set up in READ.
Standards-based reading homework (Internet-based using READ software)
To what extent did students have access to READ at home? Examine overall and by grade level, race, gender, and socioeconomic status.
Increased number of students with access to READ at home
By the end of September, all teachers will have determined how many students have the technology necessary to access READ from home.
Teacher professional development on integrating READ into classroom instruction (using an interactive wireless pad)
To what extent did teachers receive professional development on how to integrate READ into their classroom instruction?
Increased number of teachers trained in how to effectively use READ in their classroom instruction
By the start of the school year, all teachers will have received professional development on how to integrate READ into their classroom instruction.
Teacher professional development on using READ assessment data for classroom lesson planning
To what extent did teachers receive professional development on how to incorporate READ assessment data into their classroom lesson planning?
Increased number of teachers trained in how to use READ assessment data in their lesson planning
By the start of the school year, all teachers will have received professional development on how to use READ assessment data in their lesson planning.
Student training on using READ (in the classroom and at home)
To what extent were students trained in how to use READ?
Increased number of students trained in how to use READ
By the end of September, all teachers will have trained their students in the use of READ (for use in the classroom and at home).
Table 8: Evaluation Matrix Addressing Early/Short-Term and Intermediate Objectives—Indicators and Targets
Logic Model Components Evaluation Questions
Indicators Targets
Increased student use of READ at home (early/short- term)
How often did students receive READ homework assignments?
To what extent did students complete READ homework assignments? **Note frequency and duration of use.
Increased number of teachers assigning READ homework
Increased number of students completing READ homework, within a reasonable time
By November, over 50% of teachers will be assigning weekly READ homework.
By December, over 50% of students will be completing weekly READ homework assignments. Students will spend no more than 20 minutes to complete READ homework. (Note: Completion rates and duration of use are available through the READ online system.)
Increased teacher use of READ in the classroom (early/short-term)
In what ways and how often did teachers use READ in the classroom with students? **Note frequency, duration, and nature of use.
Increased number of teachers using READ in the classroom with students
Improved teacher use of READ in the classroom with students
By October, all teachers will be using READ in the classroom with students.
By November, 25% of teachers will score a 2 or above (out of 4) on the READ implementation rubric.
By December, 50% of teachers will score a 2 or above on the READ implementation rubric.
Increased student exposure to standards- based learning opportunities (early/short- term)
To what extent did students complete READ homework assignments?
Increased number of students completing READ homework
By December, over 50% of students will be completing weekly READ homework assignments.
Increased student exposure to standards-based learning opportunities (early/short-term)
How often did teachers use READ in the classroom with students?
Increased number of teachers using READ in the classroom with students
By October, all teachers will be using READ in the classroom with students.
Increased availability of standards-based, formative READ assessment data on student reading performance (early/short- term)
How often did teachers access READ student assessment data? **Note frequency and type of access.
Increased number of teachers accessing READ student assessment data
By October, 50% of teachers will have accessed READ student assessment data.
By November, all teachers will have accessed READ student assessment data.
Increased teacher use of standards-based READ assessment data (early/short-term)
In what ways did teachers use READ student assessment data?
Improved teacher use of READ student assessment data
By February, 25% of teachers will score a 3 or above on the READ implementation rubric.
Increased student interaction during learning (intermediate)
To what extent and how did students interact during classroom instruction when READ was used? **Note frequency and type of interaction.
Increased student interaction during learning, as measured by the student component of the READ implementation rubric (rubric completed through classroom observations and teacher interviews)
By February, 25% of classrooms will have a score of 3 or above (out of 4) on the READ implementation rubric.
By April, 50% of teachers will score a 3 or above on the READ implementation rubric.
Improved integration of READ into classroom instruction (intermediate)
In what ways and to what extent did teachers integrate READ into their classroom instruction? **Note the quality with which READ was integrated into classroom instruction by teachers.
Improved integration of READ lessons into classroom instruction, as measured by teacher scores on the READ implementation rubric (rubric completed through classroom observations and teacher interviews)
By April, 50% of teachers will score a 3 or above on the READ implementation rubric.
By June, 75% of teachers will score a 3 or above and 25% of teachers will score a 4 on the READ implementation rubric.
Improved differentiation of instruction (intermediate)
In what ways and to what extent did teachers use READ assessment data to plan and differentiate instruction? **Note what data were used and how data were used in instructional planning.
Increased number of teachers using READ assessment data to plan instruction
Improved use of READ assessment data to differentiate instruction
By December, all teachers will be using READ assessment data on a weekly basis to plan instruction.
By April, 50% of teachers will score a 3 or above on the READ implementation rubric.
By June, 75% of teachers will score a 3 or above and 25% of teachers will score a 4 on the READ implementation rubric.
Table 9: Evaluation Matrix Addressing Long-Term Goals—Indicators and Targets
Logic Model Components
Evaluation Questions Indicators Targets
Increased student engagement in reading
In what ways did READ foster student engagement during reading lessons?
Increased frequency and improved quality of student engagement in the classroom, as measured by the READ implementation rubric
By February, 25% of classrooms will have a score of 3 or above (out of 4) on the READ implementation rubric.
By April, 50% of classrooms will score a 3 or above on the READ implementation rubric.
By June, 75% of classrooms will score a 3 or above and 25% of teachers will score a 4 on the READ implementation rubric.
Improved student reading skills
To what extent did READ improve student learning in reading?
→ To what extent did student learning improve after READ was implemented?
→ To what extent did learning outcomes vary with teacher use of READ in the classroom?
→ To what extent did learning outcomes vary with teacher use of READ assessment data to plan and differentiate instruction?
→ How did student performance on the READ assessments correlate with student performance on state assessments?
→ In what ways did learning outcomes vary by initial reading performance on state assessments?
→ In what ways did learning outcomes vary by grade level?
→ In what ways did learning outcomes vary by special education status and English language proficiency?
→ In what ways did learning outcomes vary with the frequency of READ use at home?
Increased scores on tests assessing students’ reading ability (including both state assessments and the formative assessments provided within the READ software)
Within 2 years, the increase in scores on the state standards-based reading assessment will be statistically significantly greater for students who participated in READ than for students who did not participate in READ.
State reading scores and READ assessment data will be disaggregated and examined by quality of READ teacher use (using the READ implementation rubric), frequency of READ home use, initial reading performance on state assessments, grade level, gender, ethnicity, special education status, and English language proficiency.
Reading scores on the state assessment will be analyzed in relation to scores on the READ assessment data, in order to determine the degree to which READ assessments correlate with the state assessments.
Evaluation Design Grovemont School District had 80 third- through fifth-grade classrooms across six elementary schools (28 third-grade classrooms, 28 fourth-grade classrooms, and 24 fifth-grade classrooms). District class size for grades 3 through 5 ranged from 22 to 25 students per classroom. Because of state budget cuts and reduced funding for the program, the E-Team knew that Mrs. Anderson and the READ oversight team would have to make some difficult choices about how to structure and evaluate their program.
Some members of the oversight team wanted to implement the program in fifth grade only for the first year, and then reexamine funds to see if they might be able to expand down to fourth grade in Year 2. Others preferred to start the program at two of the six elementary schools and then try to include an additional school in Year 2. Dr. Elm and the E-Team recommended that they consider partially implementing the program at all six schools and across all three grades.
Dr. Elm explained that they would receive much better information about how their program was working and, more importantly, how it could be improved, if they were able to compare results from those classrooms that were using the program with those that were not. Dr. Elm knew that students at all of the schools in Grovemont School District were randomly assigned to teachers during the summer before each school year. However, Dr. Elm explained that in order to minimize initial differences between those classrooms that participate in READ and those that do not, they should consider randomly assigning half of the classrooms to continue with the existing district curriculum while the other half would supplement their existing curriculum with the READ program. Dr. Elm also recommended that they first divide the classrooms by school and grade level so that each school and grade would have one half of the classrooms assigned to the program. Teachers whose classrooms were not assigned to the program would be assured that if the program proved successful, they would be on board by Year 3. However, if the program did not have sufficient benefits for the students, it would be discontinued in all classrooms after Year 2. Dr. Elm concluded that building a strong evaluation into their program would provide them with credible information as to how their program was working and that having data to direct their program adjustments and improvements would give the program the best opportunity to be successful. The READ oversight team agreed to think about this idea and reconvene in 1 week to make a decision.
The E-Team also distributed the evaluation matrix it had created based on the READ logic model. The E-Team asked the oversight team to review the matrix and provide any feedback or comments.
The following week, the E-Team and READ oversight team reconvened to decide how to structure the program and to work on the evaluation design. Mrs. Anderson had spoken with the district superintendent about the evaluator’s suggestion of implementing READ in half the district’s third- through fifth-grade classrooms, with the promise that it would be expanded to all classrooms in Year 3 if the program was successful. Although logistically it would be easier to implement the program in two or three schools or one or two grades than to implement it in half the classrooms in all schools and at all grades, the superintendent understood the benefit of the added effort. The evaluation would provide higher quality data to inform decisions for program improvement and decisions regarding the program’s future.
Mrs. Anderson shared the superintendent’s comments with the oversight team and evaluation subcommittee. Like the superintendent, team members felt conflicted by the choice between
simpler logistics and a stronger evaluation design. Dr. Elm understood the dilemma all too well, but as an evaluator and an educator, she believed that a strong evaluation would result in improved program implementation and improved program outcomes.
Dr. Elm recognized that implementing the program in all classrooms in one grade level across the district would offer the weakest evaluation design and the least useful information but would likely be the simplest option logistically. Another option would be to start the program in all classrooms at two or three schools. In such a case, the other schools could be used as comparisons. For this reason, Dr. Elm explored the comparability of the six elementary schools in case the team decided to go that route. Five of the elementary schools had somewhat comparable state test scores in reading, while the sixth school had lower state test scores, and the difference was statistically significant. In addition, Schools 1 through 5 had similar (and fairly homogenous) populations, while School 6 had a much lower socioeconomic student population and a much higher percentage of ELL students. Because the district was interested in how the program worked with ELL students, the team knew that the evaluation needed to include School 6. However, if School 6 were used in a three-school implementation, the team would not have a comparable school against which to benchmark its results.
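One way to carry out this kind of comparability check is to compare mean prior-year reading scores across the six schools, for example with a one-way analysis of variance. The sketch below is illustrative only; it assumes the pandas and scipy libraries and a hypothetical file of prior-year scores, and it is not necessarily the procedure Dr. Elm used.

```python
# Sketch of a baseline comparability check across schools (hypothetical data file).
import pandas as pd
from scipy import stats

baseline = pd.read_csv("prior_year_reading.csv")  # columns: school, state_reading_score

# Mean prior-year score by school.
print(baseline.groupby("school")["state_reading_score"].mean())

# One-way ANOVA: do the school means differ more than chance would suggest?
groups = [g["state_reading_score"].values for _, g in baseline.groupby("school")]
f_stat, p_value = stats.f_oneway(*groups)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```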
While not the simplest option, the oversight team decided that its best option would be to structure the program in such a way as to maximize the quality of the information from the evaluation. The team chose to build a strong evaluation into the READ program design to provide the formative information needed for program improvement and valid summative information for accountability.
Based on the READ oversight team’s decision about how to structure the program, Dr. Elm and the E-Team drafted the following evaluation design. They presented the design at the next oversight team meeting. The oversight team voted to approve the design as follows:
Design: Multiple-group, experimental design (students randomly assigned to classrooms by the school prior to the start of the school year and classrooms randomly assigned to the READ program group or a non-READ comparison group).
Program group (READ): 40 classrooms (22 to 25 students per classroom).
Comparison group (non-READ): 40 classrooms (22 to 25 students per classroom).
Classrooms will be stratified by grade level within a school and randomly assigned to either the READ program group or a comparison group. The READ and non-READ groups will each include 14 third-grade classrooms, 14 fourth-grade classrooms, and 12 fifth-grade classrooms.
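To make the assignment procedure concrete, the sketch below stratifies classrooms by school and grade and randomly splits each stratum in half. It is a minimal illustration using only the Python standard library; the roster file and its columns are hypothetical, and a district would substitute its own classroom list.

```python
# Sketch of stratified random assignment of classrooms to READ or comparison.
# The roster file and its columns (classroom_id, school, grade) are hypothetical.
import csv
import random
from collections import defaultdict

random.seed(2024)  # fixed seed so the assignment can be reproduced

with open("classroom_roster.csv") as f:
    rows = list(csv.DictReader(f))

# Group classrooms into strata defined by school and grade.
strata = defaultdict(list)
for row in rows:
    strata[(row["school"], row["grade"])].append(row["classroom_id"])

# Shuffle each stratum and assign half to READ, half to the comparison group.
assignment = {}
for rooms in strata.values():
    random.shuffle(rooms)
    half = len(rooms) // 2
    for room in rooms[:half]:
        assignment[room] = "READ"
    for room in rooms[half:]:
        assignment[room] = "comparison"

print(sum(group == "READ" for group in assignment.values()), "classrooms assigned to READ")
```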
Enriching the evaluation design: Program theory and logic modeling will be used to examine program implementation as well as short-term, intermediate, and long-term outcomes.
Data Collection Methods The E-Team decided on data collection methods, including the data sources, for each evaluation question and associated indicators. Two examples are provided below.
1. In what ways and to what extent did teachers integrate READ into their classroom instruction?
→ A READ rubric will be used to measure teacher implementation of READ in the classroom.
→ The rubric will be completed through classroom observations and teacher interviews.
→ The READ implementation rubric will be on a 4-point scale, with a 4 representing the best implementation.
→ Data will be collected monthly, alternating between classroom observations one month and interviews the following month.
2. To what extent did READ improve student learning in reading?
→ The state reading assessment will be used to measure student learning in reading. It is administered in April of each academic year, beginning in second grade.
→ READ assessment data will be used as a formative measure to examine student reading performance.
→ State reading scores and READ assessment data will be disaggregated and examined by quality of teacher use (using the READ implementation rubric), frequency of home use, initial reading performance, grade level, gender, ethnicity, special education status, and English language proficiency.
→ Previous year state reading assessment scores will be used as a baseline against which to measure student reading improvement.
→ Reading scores on the state assessment will be analyzed in relation to scores on the READ assessments in order to determine the degree to which READ assessments correlate with the state reading assessment.
For a full list of evaluation questions, data sources, and data collection methods, see the READ Evaluation Matrix tables 10, 11, and 12 in Step 3.
Step 3: Implement the Evaluation The READ external evaluator collected a mix of quantitative and qualitative data to address evaluation questions. Qualitative data collected through observations and interviews were coded using the READ implementation rubric and analyzed using descriptive statistics, including means and frequency distributions. Student reading assessment data were analyzed by testing for statistical significance, comparing mean test scores between groups of students and over time.
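As one hypothetical sketch of the between-group comparison described above, the code below uses pandas and scipy to compare mean state reading scores for READ and comparison classrooms; the file and column names are placeholders.

```python
# Sketch of the between-group comparison of reading scores (hypothetical data file).
import pandas as pd
from scipy import stats

scores = pd.read_csv("spring_state_scores.csv")  # columns: group, state_reading_score

read_scores = scores.loc[scores["group"] == "READ", "state_reading_score"]
comparison_scores = scores.loc[scores["group"] == "comparison", "state_reading_score"]

# Descriptive statistics for each group.
print(read_scores.describe())
print(comparison_scores.describe())

# Independent-samples t-test on the group means (Welch's correction for unequal variances).
t_stat, p_value = stats.ttest_ind(read_scores, comparison_scores, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

Because classrooms, not individual students, were randomly assigned, a more careful analysis would also account for the clustering of students within classrooms (for example, with a multilevel model); the simple comparison above is only a starting point.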
The following is an example using one of the READ intermediate objectives:
1. Logic Model Component: Improved integration of READ into classroom instruction (intermediate objective).
2. Evaluation Question: In what ways and to what extent did teachers integrate READ into their classroom instruction?
3. Indicator: Improved integration of READ lessons into classroom instruction.
4. Targets: By April, 50% of teachers will score a 3 or above (out of 4) on the READ implementation rubric. By June, 75% of teachers will score a 3 or above on the READ implementation rubric.
5. Data Source: READ implementation rubric (developed by the E-Team and administered by Dr. Elm)
6. Data Collection: Rubric completed through alternating, monthly classroom observations and teacher interviews.
7. Data Analysis: Rubric scores aggregated into frequency distributions and means; change over time to be analyzed.
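A minimal sketch of the data analysis named in item 7, aggregating rubric scores into frequency distributions and means and checking them against the targets, might look like the following. The scores are invented for illustration and are not the READ results.

```python
from collections import Counter
from statistics import mean

# Illustrative rubric scores (1-4, one per observed teacher) at two collection points.
rubric_scores = {
    "April": [3, 2, 4, 3, 1, 3, 2, 4, 3, 3, 2, 3],
    "June":  [4, 3, 4, 3, 2, 3, 3, 4, 3, 4, 3, 3],
}
targets = {"April": 50, "June": 75}  # percent of teachers expected to score 3 or above

for month, scores in rubric_scores.items():
    dist = dict(sorted(Counter(scores).items()))              # frequency distribution of scores
    pct_3_plus = 100 * sum(s >= 3 for s in scores) / len(scores)
    status = "met" if pct_3_plus >= targets[month] else "not met"
    print(f"{month}: mean = {mean(scores):.2f}, distribution = {dist}, "
          f"3 or above = {pct_3_plus:.0f}% (target {targets[month]}%: {status})")
```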
The full READ Evaluation Matrix is included in tables 10, 11, and 12. Note that the evaluation matrix was completed in steps. The logic model components are taken directly from the READ logic model created in Step 1: Define the Program. The logic model components consist of strategies and activities, early/short-term and intermediate objectives, and long-term goals. The evaluation questions were created in Step 2: Plan the Evaluation, guided by the READ logic model. Indicators and targets were derived in Step 2 using the READ logic model and evaluation questions. At the end of Step 2, data collection sources and methods were chosen for each READ indicator. Data analysis methods were determined in Step 3: Implement the Evaluation. (See Appendix E for Table 26: Evaluation Matrix Template.)
Table 10: READ Evaluation Matrix—Strategies and Activities/Initial Implementation
Logic Model Components | Evaluation Questions | Indicators | Targets | Data Sources | Data Collection | Data Analysis
Interactive, standards-based classroom lessons (using the READ software with interactive classroom technologies and individual hand-held mobile devices for each student)
To what extent did teachers have access to the necessary technology in the classroom to use READ in their instruction?
Increased number of teachers with access to the necessary technology in their classroom to use READ
By the start of the school year, all teachers will have the necessary technology in their classroom to use READ.
Technology installation records; teacher survey
Technology installation records examined in September for evidence of necessary classroom technology
Teacher survey administered in October, including items on technology in the classroom
Records analyzed with basic descriptive statistics (counts and percentages) of classrooms with necessary technology
Teacher survey analyzed with basic descriptive statistics including means and frequency distributions; open-ended items on the survey summarized, and if warranted, analyzed for themes
Standards-based reading assessments (Internet-based, formative assessments of student reading skills administered within the READ software)
To what extent were READ assessments made available to students and teachers?
Increased number of teachers with access to READ assessments
Increased number of students with access to READ assessments
By the start of the school year, all teacher accounts will have been set up in READ.
By the end of September, all student accounts will have been set up in READ.
Technology records (teacher accounts); teacher survey
Technology records (student accounts)
Technology records examined in September for evidence of teacher and student account setup/activation
Teacher survey administered in October (includes items on teacher account setup/activation)
Records analyzed with basic descriptive statistics (counts and percentages) on setup/activated teacher and student accounts
Teacher survey analyzed with basic descriptive statistics on setup/activated accounts
Standards-based reading homework (Internet-based using READ software)
To what extent did students have access to READ at home?
Increased number of students with access to READ at home
By the end of September, all teachers will have determined how many students have the technology necessary to access READ from home.
Teacher survey Teacher survey to be administered in October (includes items on student technology availability)
Teacher survey analyzed with basic descriptive statistics including means and frequency distributions; open-ended items summarized, and if warranted, analyzed for themes
Teacher professional development on integrating READ into classroom instruction (using an interactive wireless pad)
To what extent did teachers receive professional development on how to integrate READ into their classroom instruction?
Increased number of teachers trained in how to effectively use READ in their classroom instruction
By the start of the school year, all teachers will have received professional development on how to integrate READ into their classroom instruction.
Professional development records
Teacher survey
Professional development examined in September for evidence it was offered, as well as attendance
Teacher survey administered in October (includes items on professional development)
Records summarized and analyzed with basic descriptive statistics, where appropriate
Teacher survey analyzed as described above
Teacher professional development on using READ assessment data for classroom lesson planning
To what extent did teachers receive professional development on how to incorporate READ assessment data into their classroom lesson planning?
Increased number of teachers trained in how to use READ assessment data in their lesson planning
By the start of the school year, all teachers will have received professional development on how to use READ assessment data in their lesson planning.
Professional development records
Teacher survey
Professional development examined for offerings, as well as attendance
Teacher survey administered in October (includes items on professional development)
Records summarized and analyzed with basic descriptive statistics, where appropriate
Teacher survey analyzed as described above
Student training on using READ (in the classroom and at home)
To what extent were students trained in how to use READ?
Increased number of students trained in how to use READ.
By the end of September, all teachers will have trained their students in the use of READ.
Teacher survey Teacher survey administered in October (includes items on the training of students)
Teacher survey analyzed as described above
Table 11: Evaluation Matrix—Early/Short-Term and Intermediate Objectives
Logic Model Components | Evaluation Questions | Indicators | Targets | Data Sources | Data Collection | Data Analysis
Increased student use of READ at home (early/short-term)
How often did students receive READ homework assignments?
Increased number of teachers assigning READ homework
By November, over 50% of teachers will be assigning weekly READ homework.
Teacher survey Teacher survey, including items on teacher practice regarding homework
Teacher survey analyzed with basic descriptive statistics including means and frequency distributions; open-ended items summarized, and if warranted, analyzed for themes
Increased student use of READ at home (early/short-term)
To what extent did students complete READ homework assignments? **Note frequency and duration of use.
Increased number of students completing READ homework within a reasonable time
By December, over 50% of students will be completing weekly READ homework assignments. Students will spend no more than 20 minutes to complete weekly READ homework assignments.
(Note: Completion rates and duration of use are available through the READ online system.)
READ online records
READ online records examined monthly for evidence of student use
READ online records analyzed with basic descriptive statistics
Increased teacher use of READ in the classroom (early/short-term)
In what ways and how often did teachers use READ in the classroom with students? **Note frequency, duration, and nature of use.
Improved teacher use of READ in the classroom with students
By November, 25% of teachers will score a 2 or above (out of 4) on the READ implementation rubric.
By December, 50% of teachers will score a 2 or above on the READ implementation rubric.
READ implementation rubric
Rubric data collected monthly (for each teacher), alternating between classroom observation and teacher interviews
Rubric data analyzed by means and frequency distributions of rubric scores; change over time analyzed by testing for statistical significance
Increased student exposure to standards-based learning opportunities (early/short-term)
In what ways and how often did teachers use READ in the classroom with students? **Note frequency, duration, and nature of use
Increased number of teachers using READ in the classroom with students
By October, all teachers will be using READ in the classroom with students.
Teacher survey Teacher survey, including items on teacher practice regarding classroom use
Teacher survey analyzed with basic descriptive statistics including means and frequency distributions; open-ended items summarized, and if warranted, analyzed for themes
Increased student exposure to standards-based learning opportunities (early/short-term)
To what extent did students complete READ homework assignments? **Note frequency and duration of use.
Increased number of students completing READ homework, within a reasonable time
By December, over 50% of students will be completing weekly READ homework assignments.
Students will spend no more than 20 minutes to complete weekly READ homework assignments.
READ online records
READ online records examined monthly for evidence of student use
READ online records analyzed with basic descriptive statistics
Increased availability of standards-based, formative READ assessment data on student reading performance (early/short-term)
How often did teachers access READ student assessment data? **Note frequency and type of access.
Increased number of teachers accessing READ student assessment data
By October, 50% of teachers will have accessed READ student assessment data.
By November, all teachers will have accessed READ student assessment data.
READ online records
READ online records examined monthly to determine access patterns
READ online records analyzed with basic descriptive statistics
Increased teacher use of standards- based READ assessment data (early/short-term)
In what ways did teachers use READ student assessment data?
Improved teacher use of READ student assessment data
By February, 25% of teachers will score a 3 or above on the READ implementation rubric.
READ implementation rubric
Rubric data collected monthly (for each teacher), alternating between classroom observation and teacher interviews
Rubric data analyzed by means and frequency distributions of scores; change over time analyzed by testing for statistical significance
Increased student interaction during learning (intermediate)
To what extent and how did students interact during classroom instruction when READ was used? **Note frequency and type of interaction.
Increased student interaction during learning
By February, 25% of classrooms will score a 3 or above on the READ implementation rubric.
By April, 50% of teachers will score a 3 or above on the rubric.
READ implementation rubric
Rubric data collected monthly (for each teacher), alternating between classroom observation and teacher interviews
Rubric data analyzed by means and frequency distributions of rubric scores; change over time analyzed by testing for statistical significance
Improved integration of READ into classroom instruction (intermediate)
In what ways and to what extent did teachers integrate READ into their classroom instruction? **Note the quality with which READ was integrated into classroom instruction by teachers.
Improved integration of READ lessons into classroom instruction
By April, 50% of teachers will score a 3 or above on the READ implementation rubric.
By June, 75% of teachers will score a 3 or above and 25% of teachers will score a 4 on the rubric.
READ implementation rubric
Rubric data collected monthly (for each teacher), alternating between classroom observation and teacher interviews
Rubric data analyzed by means and frequency distributions of rubric scores; change over time analyzed by testing for statistical significance
Improved differentiation of instruction (intermediate)
In what ways and to what extent did teachers use READ assessment data to plan and differentiate instruction? **Note what data were used and how data were used in instructional planning.
Increased number of teachers using READ assessment data to plan instruction
By December, all teachers will be using READ assessment data on a weekly basis to plan instruction.
By April, 50% of teachers will score a 3 or above on the READ implementation rubric.
By June, 75% of teachers will score a 3 or above and 25% of teachers will score a 4 on the READ implementation rubric.
Teacher survey Teacher survey, including items on teacher practice regarding use of READ assessment data
Teacher survey analyzed with basic descriptive statistics including means and frequency distributions; open-ended items summarized, and if warranted, analyzed for themes
Improved use of READ assessment data to differentiate instruction
By December, all teachers will be using READ assessment data on a weekly basis to plan instruction.
By April, 50% of teachers will score a 3 or above on the READ implementation rubric.
By June, 75% of teachers will score a 3 or above and 25% of teachers will score a 4 on the READ implementation rubric.
READ implementation rubric
Rubric data collected monthly (for each teacher), alternating between classroom observation and teacher interviews
Rubric data analyzed by means and frequency distributions of rubric scores; change over time analyzed by testing for statistical significance
Table 12: Evaluation Matrix—Long-Term Goals
Logic Model Components | Evaluation Questions | Indicators | Targets | Data Sources | Data Collection | Data Analysis
Increased student engagement in reading
To what extent and in what ways did READ foster student engagement during reading lessons?
Increased frequency and improved quality of student engagement in the classroom, as measured by the READ implementation rubric
By February, 25% of classrooms will have a score of 3 or above (out of 4) on the READ implementation rubric.
By April, 50% of classrooms will score a 3 or above on the READ implementation rubric.
By June, 75% of classrooms will score a 3 or above and 25% of teachers will score a 4 on the READ implementation rubric.
READ implementation rubric
Monthly, alternating between classroom observation and teacher interviews
Means and frequency distributions of READ rubric scores determined; change over time analyzed using significance testing
Improved student reading skills
To what extent did READ improve student learning in reading?
To what extent did student learning improve after READ was implemented?
To what extent did learning outcomes vary with teacher use of READ in the classroom?
To what extent did learning outcomes vary with teacher use of READ assessment data to plan and differentiate instruction?
How did student performance on the READ assessments correlate with student performance on state assessments?
In what ways did learning outcomes vary by initial reading performance, grade level, special education status, and English language proficiency?
In what ways did learning outcomes vary with the frequency of READ use at home?
Increased scores on tests assessing students’ reading ability (including both state assessments and formative assessments provided within the READ software)
Within 2 years, the increase in student scores on the state standards-based reading assessment will be statistically significant for those students who participated in READ versus those students who did not participate in READ.
State and READ assessment data will be disaggregated and examined by quality of teacher use (using the READ implementation rubric), frequency of home use, initial reading performance, grade level, gender, ethnicity, special education status, and English language proficiency.
Reading scores on the state assessment will be analyzed in relation to scores on the READ assessment data, in order to determine the degree to which READ assessments correlate with the state reading assessments.
State reading assessment data
READ assessment data
READ implementation rubric
Teacher survey
Demographic data from school records
READ online records
April of each academic year
T-test of mean test scores (on state reading assessment and READ assessments) between READ and non-READ students, taking into account prior reading performance on the state reading assessment; results disaggregated by teacher use, grade, gender, race, English language proficiency; correlational testing of state reading test scores and READ assessment scores
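The data analysis entry just above calls for comparing READ and non-READ scores while taking prior reading performance into account. One common way to do that, shown here only as a sketch with fabricated numbers, is a regression adjustment that includes the prior-year score as a covariate; this is a generic technique, not necessarily the exact procedure the E-Team used. The sketch assumes the pandas and statsmodels libraries.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Fabricated student-level records: group indicator, prior-year score, current-year score.
df = pd.DataFrame({
    "read":  [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],   # 1 = READ classroom, 0 = non-READ
    "prior": [388, 402, 395, 410, 399, 405, 390, 401, 396, 408, 398, 404],
    "post":  [405, 421, 414, 430, 418, 426, 398, 409, 402, 417, 405, 412],
})

# Regress current scores on group membership and prior scores; the coefficient on `read`
# estimates the group difference after adjusting for prior reading performance.
model = smf.ols("post ~ read + prior", data=df).fit()
print(f"Adjusted READ effect: {model.params['read']:.1f} points "
      f"(p = {model.pvalues['read']:.4f})")
```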
All data collected through the evaluation were managed and stored by Dr. Elm, the external evaluator. The computer used for storage and analysis was located in a locked office. Only the external evaluator had access to the raw data. Data were backed up weekly to an external drive, which was kept in a locked drawer. To protect teacher and student privacy, identification numbers were assigned to all participants. Teacher and student names were not recorded with the data.
READ online records of student and teacher use, rubric data, and survey data were accessible only to the external evaluator. Results were released only in aggregate form and contained no identifying information. All evaluation data were secured and kept confidential to protect individual privacy.
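The de-identification step described above, assigning study identification numbers so that names never travel with the data, could be handled with a small crosswalk file kept separately from the analysis files. The sketch below is purely illustrative; the names, file name, and ID format are made up.

```python
import csv
import secrets

# Hypothetical participant roster; real names never travel with the evaluation data files.
roster = ["Teacher A. Alvarez", "Teacher B. Brown", "Teacher C. Chen"]

# Random study IDs mapped to names; the crosswalk is stored separately (e.g., locked office).
crosswalk = {f"T-{secrets.token_hex(3).upper()}": name for name in roster}

with open("id_crosswalk.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["study_id", "name"])
    writer.writerows(crosswalk.items())

print(list(crosswalk))  # only these study IDs appear in the analysis data sets
```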
Step 4: Interpret the Results
The READ evaluation subcommittee (the E-Team) examined the evaluation results. Some highlights from the findings are provided below. Use of these findings will be discussed in Step 5: Inform and Refine – Using the Results.
Summative
1. First-year results indicated that READ assessment and state reading scores for READ students were higher than those for non-READ students. The gains were especially compelling for students in classrooms in which READ was used regularly and with fidelity, where increases in reading scores were more than three times those of non-READ students.
2. Students in classrooms where READ was used regularly and with fidelity increased their reading scores by twice as much as students in READ classrooms where READ was used minimally.
3. Students of teachers who used READ assessment data as intended to differentiate instruction increased their reading scores on the state assessment by twice as much as students of teachers who did not use READ assessment data as intended.
4. Student scores on READ assessments had a statistically significant and strong positive correlation with student scores on the state reading assessment, indicating that these two assessments are likely well aligned and that READ assessment data are likely a good indicator of performance on the state reading assessment.
Formative
5. Student assessment data could not be analyzed by home use of the READ program because only one classroom implemented the home component.
6. At the start of the year, teacher use of READ was promising, and the program met its targets. However, as the program progressed and as more teachers were pressed to improve their use of READ, several targets were not met. Teachers did not use the READ student assessment data as regularly as they used the classroom component of READ.
A full accounting of evaluation results by logic model component and evaluation question is provided in tables 13, 14, and 15.
Table 13: READ Evaluation Results—Strategies and Activities/Initial Implementation
Logic Model Components | Evaluation Questions | Indicators | Targets | Evaluation Findings
Interactive, standards-based classroom lessons (using the READ software with interactive classroom technologies and individual handheld mobile devices for each student)
To what extent did teachers have access to the necessary technology in the classroom to use READ in their instruction?
Increased number of teachers with access to the necessary technology in their classroom to use READ
By the start of the school year, all teachers will have the necessary technology in their classroom to use READ.
By September, all READ teachers (100%) had the necessary technology in their classroom to use READ.
Standards-based reading assessments (Internet-based, formative assessments of student reading skills administered within the READ software)
To what extent were READ assessments made available to students and teachers?
Increased number of teachers with access to READ assessments
Increased number of students with access to READ assessments
By the start of the school year, all teacher accounts will have been set up in READ.
By the end of September, all student accounts will have been set up in READ.
By September, all READ teacher accounts (100%) had been set up.
By the end of September, all student accounts (100%) had been set up.
Standards-based reading homework (Internet-based using READ software)
To what extent did students have access to READ at home?
Increased number of students with access to READ at home
By the end of September, all teachers will have determined how many students have the technology necessary to access READ from home.
By the end of September, all teachers (100%) had determined how many students had home access to READ.
→ At three of the schools, most students (90%) had home access to READ.
→ At two of the schools, about half the students (54%) had home access to READ.
→ At one school, less than 20% of students had the technology necessary to access READ from home.
→ Even within those schools that had high numbers of students with home access, the classroom variability was large. Only one of the 40 classrooms had 100% of students with access to READ from home.
Teacher professional development on integrating READ into classroom instruction (using an interactive wireless pad)
To what extent did teachers receive professional development on how to integrate READ into their classroom instruction?
Increased number of teachers trained in how to effectively use READ in their classroom instruction
By the start of the school year, all teachers will have received professional development on how to integrate READ into their classroom instruction.
By September, all teachers (100%) had received professional development on how to integrate READ into their classroom instruction.
Teacher professional development on using READ assessment data for classroom lesson planning
To what extent did teachers receive professional development on how to incorporate READ assessment data into their classroom lesson planning?
Increased number of teachers trained in how to use READ assessment data in their lesson planning
By the start of the school year, all teachers will have received professional development on how to use READ assessment data in their lesson planning.
By September, all teachers (100%) had received professional development on how to use READ assessment data in their classroom lesson planning.
Student training on using READ (in the classroom and at home)
To what extent were students trained in how to use READ?
Increased number of students trained in how to use READ
By the end of September, all teachers will have trained their students in the use of READ (for use in the classroom and at home).
By October, all teachers (100%) had trained their students in the classroom use of READ.
By the end of October, only one teacher (2%) had trained his students in the home use of READ.
→ Open-ended survey items revealed that teachers chose not to train students in the home use of READ unless every student in the classroom was able to take advantage of the home component. Since only one classroom had 100% participation, only one teacher trained students on home use.
Table 14: READ Evaluation Results—Early/Short-Term and Intermediate Objectives
Logic Model Components | Evaluation Questions | Indicators | Targets | Evaluation Findings
Increased student use of READ at home (early/short-term)
How often did students receive READ homework assignments?
Increased number of teachers assigning READ homework
By November, over 50% of teachers will be assigning weekly READ homework.
By November, only one teacher (2%) was assigning weekly READ homework.
Increased student use of READ at home (early/short-term)
To what extent did students complete READ homework assignments? **Note frequency and duration of use.
Increased number of students completing READ homework within a reasonable time
By December, over 50% of students will be completing weekly READ homework.
Students will spend no more than 20 minutes to complete weekly READ homework. (Note: Completion rates and duration of use are available through the READ online system.)
By December, most students (70%) in the classroom where the READ homework component was used were completing the assignment. These students spent, on average, 15 minutes to complete the weekly READ homework.
Increased teacher use of READ in the classroom (early/short-term)
In what ways and how often did teachers use READ in the classroom with students? **Note frequency, duration, and nature of use.
Improved teacher use of READ in the classroom with students
By November, 25% of teachers will score a 2 or above (out of 4) on the READ implementation rubric.
By January, 50% of teachers will score a 2 or above on the READ implementation rubric.
By November, one-third of teachers (33%) scored a 2 or above on the READ implementation rubric.
By January, over half the teachers (58%) scored a 2 or above on the READ implementation rubric.
Increased student exposure to standards-based learning opportunities (early/short-term)
In what ways and how often did teachers use READ in the classroom with students? **Note frequency, duration, and nature of use.
Increased number of teachers using READ in the classroom with students
By October, all teachers will be using READ in the classroom with students.
By October, all teachers (100%) reported some classroom use of READ, while a little over half (58%) reported regular (at least weekly) use of READ in the classroom.
→ By October, over one-quarter of teachers (30%) reported that they had only used READ in the classroom once or twice in the last month.
Increased student exposure to standards-based learning opportunities (early/short-term)
To what extent did students complete READ homework assignments? **Note frequency and duration of use.
Increased number of students completing READ homework within a reasonable time
By December, over 50% of students will be completing weekly READ homework. Students will spend no more than 20 minutes to complete weekly READ homework. (Note: Completion rates and duration of use are available through the READ online system.)
→ By December, most students (70%) in the classroom where the READ homework component was used were completing the assignment. These students spent, on average, 15 minutes to complete the weekly READ homework.
Increased availability of READ standards-based, formative assessment data on student reading performance (early/short-term)
How often did teachers access READ student assessment data? **Note frequency and type of access.
Increased number of teachers accessing READ student assessment data
By October, 50% of teachers will have accessed READ student assessment data.
By November, all teachers will have accessed READ student assessment data.
By October, over half the teachers (58%) had accessed the READ student assessment data.
→ By November, one-fifth of teachers (20%) had not accessed the READ student assessment data. Thirty-two teachers (80%) had accessed the READ assessment data, while over half (58%) accessed the READ assessment data on a regular (at least weekly) basis.
Increased teacher use of standards-based READ assessment data (early/short-term)
In what ways did teachers use READ student assessment data?
Improved teacher use of READ student assessment data
By January, 25% of teachers will score a 3 or above (out of 4) on the READ implementation rubric.
In January, 11 teachers (28%) scored a 3 or above on the READ implementation rubric.
→ Eight teachers (20%) scored a 1 on the READ implementation rubric.
Increased student interaction during learning (intermediate)
To what extent and how did students interact during classroom instruction when READ was used? **Note frequency and type of interaction.
Increased student interaction during learning
By February, 25% of classrooms will have a score of 3 or above (out of 4) on the READ implementation rubric.
By April, 50% of classrooms will score a 3 or above on the READ rubric.
In February, 11 classrooms (28%) scored a 3 or above on the READ implementation rubric.
In April, 21 classrooms (52%) scored a 3 or above on the READ implementation rubric.
Improved integration of READ into classroom instruction (intermediate)
In what ways and to what extent did teachers integrate READ into their classroom instruction? ** Note the quality with which READ was integrated into classroom instruction by teachers.
Improved integration of READ lessons into classroom instruction
By March, 50% of teachers will score a 3 or above on the READ implementation rubric.
By May, 75% of teachers will score a 3 or above and 25% of teachers will score a 4 on the READ implementation rubric.
By March, 21 teachers (52%) scored a 3 or above on the READ implementation rubric.
→ Twelve teachers (30%) scored a 2 on the READ implementation rubric, while seven (18%) scored a 1 on the rubric.
By May, 27 teachers (68%) scored a 3 or above on the READ implementation rubric.
→ Nine teachers (22%) scored a 2 on the rubric, while four (10%) scored a 1 on the rubric.
By May, 15 teachers (38%) scored a 4 on the READ implementation rubric.
Improved differentiation of instruction (intermediate)
In what ways and to what extent did teachers use READ assessment data to plan and differentiate instruction? ** Note what data were used and how data were used in instructional planning.
Increased number of teachers using READ assessment data to plan instruction
Improved use of READ assessment data to differentiate instruction
By December, all teachers will be using READ assessment data on a weekly basis to plan and differentiate instruction.
By March, 50% of teachers will score a 3 or above on the READ implementation rubric.
By May, 75% of teachers will score a 3 or above and 25% of teachers will score a 4 on the READ implementation rubric.
On the December teacher survey, less than half the teachers (48%) reported using READ assessment data on a weekly basis to plan and differentiate instruction.
→ Also on the December survey, most teachers (88%) said they had used the READ assessment data at least some of the time to plan instruction, while a little over 10% said they had never used the READ assessment data in their classroom lesson planning.
By March, 21 teachers (52%) scored a 3 or above on the READ implementation rubric.
→ Twelve teachers (30%) scored a 2 on the READ implementation rubric, while seven (18%) scored a 1 on the rubric.
By May, 27 teachers (68%) scored a 3 or above on the READ implementation rubric.
→ Nine teachers (22%) scored a 2 on the rubric, while four (10%) scored a 1 on the rubric.
By May, 15 teachers (38%) scored a 4 on the READ implementation rubric.
Table 15: READ Evaluation Results—Long-Term Goals
Logic Model Components | Evaluation Questions | Indicators | Targets | Evaluation Findings
Increased student engagement in reading
To what extent and in what ways did READ foster student engagement during reading lessons?
Increased frequency and improved quality of student engagement in the classroom, as measured by the READ implementation rubric
By February, 25% of classrooms will have a score of 3 or above (out of 4) on the READ implementation rubric.
By April, 50% of classrooms will score a 3 or above on the READ implementation rubric.
By June, 75% of classrooms will score a 3 or above and 25% of teachers will score a 4 on the READ implementation rubric.
By February, 28% of classrooms scored a 3 or above on the READ implementation rubric.
By April, 52% of classrooms scored a 3 or above on the rubric.
By June, 68% of classrooms scored a 3 or above on the rubric.
By June, 38% of classrooms scored a 4 on the rubric.
Improved student reading skills
To what extent did READ improve student learning in reading?
To what extent did student learning improve after READ was implemented?
Increased scores on tests assessing students’ reading ability (state and READ assessments)
Within 2 years, the increase in student scores on the state standards-based reading assessment will be statistically significant for those students who participated in READ versus those students who did not participate in READ.
First year results indicate that state reading scores for READ students were higher than those for non-READ students, and the difference was statistically significant.
Improved student reading skills
To what extent did READ improve student learning in reading?
How did student performance on the READ assessments correlate with student performance on state assessments?
Increased scores on tests assessing students’ reading ability (state and READ assessments)
Reading scores on the state assessment will be analyzed in relation to scores on the READ assessment data, in order to determine the degree to which READ assessments correlate with the state assessments.
Student scores on READ assessments had a statistically significant and strong positive correlation with student scores on the state reading assessment, indicating that the assessments are likely well aligned.
Improved student reading skills
To what extent did READ improve student learning in reading?
To what extent did learning outcomes vary with teacher use of READ in the classroom?
To what extent did learning outcomes vary with teacher use of READ assessment data to plan and differentiate instruction?
In what ways did learning outcomes vary by initial reading performance on the state reading test, grade level, special education status, and English language proficiency?
Increased scores on tests assessing students’ reading ability (state and READ assessments)
Reading assessment data will be disaggregated and examined by quality of teacher use (using the READ implementation rubric), initial reading performance, grade level, gender, ethnicity, special education status, and English language proficiency.
Students in classrooms where READ was used regularly and with fidelity:
→ Demonstrated a statistically significant increase in their reading scores from last year to this year.
→ Increased their reading scores by twice as much as students in READ classrooms where READ was used minimally.
Among READ students:
→ Third-grade students showed greater reading gains on the state assessment than did fourth- and fifth-grade students.
→ There were no differences in reading gains by gender or ethnicity.
→ Regular education students showed greater reading gains than did special education students.
→ Scores for English Language Learner students were mixed. In third grade, ELL students showed statistically significant reading gains, while the differences in reading scores on the state assessment were not statistically significant for ELL students in either the fourth or fifth grade.
Improved student reading skills
To what extent did READ improve student learning in reading?
In what ways did learning outcomes vary with the frequency of READ use at home?
Increased scores on tests assessing students’ reading ability (state and READ assessments)
Reading assessment data will be disaggregated and examined by frequency of home use.
Assessment data could not be analyzed by home use because only one classroom implemented the home component.
Although the READ evaluation used a true experimental design, E-Team members knew it would still be worthwhile to consider the possibility that other factors might have influenced the positive findings. The E-Team therefore brainstormed possible competing explanations for the positive results of the READ program.
The E-Team decided that another plausible explanation for the positive results was that the teachers who used READ regularly in the classroom and who used READ assessments as intended may have been more skilled teachers, whose students might have shown a similar increase in reading scores even without the READ program. The E-Team decided to follow up on fidelity of implementation and its relationship to teacher skills. In addition, while classrooms were randomly assigned to READ to minimize initial differences between READ and non-READ classrooms, it is possible that by chance more skilled teachers were assigned to the READ program group. The E-Team also intends to investigate this issue further in Year 2 of the evaluation.
Communicating Results
The READ oversight team met monthly to discuss program monitoring and improvement. At each meeting, the READ evaluator, Dr. Elm, and the E-Team provided an update to the oversight team. Based on the formative evaluation findings, the oversight team developed recommendations and a plan for the next month.
At the December school board meeting, the oversight team presented a status report, noting important findings from the evaluation. The oversight team asked Dr. Elm to create a full evaluation report for the administration and to present the findings at the August school board meeting. The E-Team also drafted a one-page brief of evaluation findings, which was provided to all participants as well as to the local newspaper.
Step 5: Inform and Refine – Using the Results
Informing the Program
During her evaluation update at the November oversight team meeting, Dr. Elm shared initial findings from the evaluation of the implementation of READ program activities. Indicators showed that many students did not have the technology available at home to access READ. Even within schools where most students had the technology necessary for home access, variability across classrooms was large; in only one of the 40 classrooms did every student have home access to READ. Open-ended survey items revealed that teachers did not feel comfortable offering READ homework assignments to some but not all students in their classroom and therefore chose not to train students in the home use of READ. Only one teacher had trained his students in the home use of READ, because all of his students had the technology at home necessary to access READ. This teacher indicated that he would like to continue with the home component of READ.
The oversight team discussed the home-component issue and asked the E-Team for advice on how to proceed. With the support of the E-Team, the oversight team decided to pilot the home component in one classroom but otherwise to remove it from the program during Year 1. Based on results from the pilot, a partial home component would be considered for Year 2.
During the same November update, Dr. Elm provided some findings from the evaluation of the early/short-term objectives on the READ logic model. She noted that in October all teachers had reported using READ in their classroom and that over half of teachers reported that they had used READ every week. However, over one-quarter of teachers reported that they had used READ in their classroom only once or twice in the last month. Survey data indicated that some of these teachers felt overwhelmed with the technology and some said they could not fit READ classroom use into their already busy day.
The oversight team discussed this information and decided to make a midcourse adjustment. Before the READ program began, team members had thought that the initial professional development and ongoing technical assistance would be sufficient. However, they now believed they needed to make one-on-one professional development available to teachers who would like to have someone come into their classroom and model a lesson using READ. Mrs. Anderson assigned responsibility for arranging this one-on-one professional development to one of the oversight team members.
During her evaluation update at the January oversight team meeting, Dr. Elm shared findings from the evaluation of the intermediate objectives on the READ logic model. Dr. Elm explained that on the December teacher survey, slightly less than half the teachers reported that they used the READ assessment data on a weekly basis for planning and differentiating instruction. One in 10 teachers said they had never used the READ assessment data. Dr. Elm further stated that the lack of use of the READ assessment data was likely affecting scores on the READ implementation rubric. From classroom observations, interviews, and surveys, she believed that the quality of teacher use of READ in the classroom was progressing nicely but that the lack of assessment data use was decreasing the overall rubric score.
The oversight team knew that using the READ assessment data to plan and differentiate instruction was critical to the program’s success. Mrs. Anderson decided to discuss the issue with the READ faculty at each school in an effort to understand what she could do to facilitate their use of the READ assessment data. Additionally, the E-Team planned to elaborate on the rubric so that subscores could be captured for various components of the rubric. These rubric subscores would be especially useful for analysis when the data are disaggregated by teacher use of READ in the classroom, student interaction in the classroom, and teacher use of READ student assessment data to plan and differentiate instruction. The revised rubric would be developed during the spring, piloted over the summer, and implemented during Year 2.
Finally, at the evaluation update at the end of the school year, Dr. Elm reported on the preliminary evaluation of long-term goals of the READ program. Student reading achievement was higher among students of teachers who used READ regularly and as intended, and the difference was statistically significant. Further, students of teachers who used the READ assessment data to tailor classroom instruction had higher reading test scores than students of teachers who did not use the READ assessment data, and again the difference was statistically significant.
Year 1 evaluation findings also indicated that not all teachers had bought into using READ with their students, especially the READ assessment component. The oversight team decided to share the evaluation findings with all teachers at a staff meeting in order to encourage them to use READ in their classroom. Prior to sharing the evaluation findings with teachers, Dr. Elm conducted an anonymous follow-up survey at the staff meeting in an effort to find out why some teachers chose not to use READ.
Refining the Program Logic
The READ oversight team felt that the logic model they had created accurately portrayed the program. Yet, since it was clear from November that the home component could not be fully implemented, they wanted to highlight this on the logic model. The team decided to draw a box around the program as it was implemented, excluding the home component. Below the model, a note was added indicating why the home component was not part of the existing implementation and that it was currently being piloted in one classroom. The oversight team hoped to learn more about the implementation of the home component, as well as its success, from the results of the pilot classroom.
The oversight team also wanted to understand more about the strength of the relationship between classroom use of READ and state assessment scores and between use of READ assessment data for instructional planning and state assessment scores. It noted this on the logic model and asked the E-Team to investigate the linkages further in the second year of the evaluation.
Making Recommendations
The READ oversight team recommended that the READ program be offered to all students in the district and that it be incorporated into the regular curriculum. The team felt that the positive findings regarding test scores were strong enough that all students should have access to the program.
However, since READ funding was still at the 50 percent level for the second year, the oversight team planned to work with Dr. Elm and the E-Team for another year in order to continue to refine the implementation of the program in the classroom and to further understand the success of the READ program with students. To do this, the team recommended that the second-year evaluation include student surveys and focus groups as data sources to address objectives related to student interaction and engagement in the classroom.
The oversight team decided to continue to advocate for the program's expansion in the hope that it would be institutionalized soon.
Appendix B: Embedded Evaluation Illustration – NowPLAN*
Program Snapshot
Strategic Planning for Learning and Achievement in Nowgarden (NowPLAN) is part of a statewide strategic planning initiative. The example in Appendix B focuses on the building-level evaluation of the district’s strategic technology plan (NowPLAN-T). The NowPLAN-T evaluation uses theory-based, embedded evaluation within a mixed-method design.
*The example set out in this appendix is provided solely for the purpose of illustrating how the principles in this guide can be applied in actual situations. The programs, characters, schools, and school district mentioned in the example are fictitious.
Step 1: Define the Program
Background
Every six years, the State Department of Education requires each school district to create a new strategic plan. This strategic plan is intended to drive the district’s initiatives and strategies over the next six years, as well as to provide a means to monitor and evaluate the implementation of those initiatives and strategies.
Nowgarden School District recently completed its strategic planning process. With the help of teachers, administrators, board members, and the community, Nowgarden developed a 6-year plan that includes action plans for student learning, professional development, additional learning opportunities, safety and security, technology, and communication. The district’s technology plan, in particular, was modeled after the state technology plan and is intended as the foundation to build a districtwide technology program that will support the 21st century learner.
The Evaluation
The Nowgarden School District administration asked a local evaluation organization to help plan and conduct an evaluation of the district’s strategic plan. The organization assigned an evaluator to work with the district to create an evaluation design. The external evaluator and the district created an evaluation of the district’s 6-year strategic plan that includes both quantitative and qualitative measures and has two foci:
1. Formative evaluation to help shape and improve the implementation of the plan strategies.
2. Summative evaluation of the plan to determine its overall success with students.
The main focus of this multiyear evaluation is to monitor how well the district has achieved its primary learning-related long-term goal: to improve the achievement of all students.
As part of its strategic planning, Nowgarden identified six key strategies (listed below) that it will employ to meet the primary long-term goal. The strategic plan evaluation will also examine and monitor how well the district’s primary learning goal is being met as the district implements these strategies.
While Nowgarden School District and its external evaluator will design a comprehensive evaluation of all six strategies, they chose to design the evaluation of the technology component of the district’s strategic plan first.
Program Goals
Nowgarden School District’s strategic plan and its related technology plan have one primary long-term goal: to improve student achievement. However, a longer-term goal of the district is to ultimately improve postsecondary success for all students.
Program Strategies and Activities – NowPLAN
Nowgarden identified six key strategies to meet the district’s long-term goals:
1. Provide an engaging and challenging education program.
2. Provide the necessary resources and professional development for teachers.
3. Provide additional opportunities and supports for students.
4. Provide a safe and healthy educational environment.
5. Provide the technological tools necessary for the 21st century learner.
6. Effectively communicate with and engage the community.
Strategy #5, provide the technological tools necessary for the 21st century learner, will be the focus of this example.
A logic model shell of the entire Nowgarden strategic plan, titled NowPLAN Logic Model, is provided in Figure 4: NowPLAN Logic Model.
Note that the “improved postsecondary success” goal is not currently addressed through the NowPLAN evaluation. However, the district does plan to examine progress toward this goal in the future through graduate follow-up surveys and focus groups.
Also note that the strategy related to the districtwide technology plan is highlighted. The relationship between the district’s technology plan and its long-term goals will be explored throughout the remainder of this appendix.
When referring to the technology component of the overall strategic plan, the acronym NowPLAN-T will be used.
Figure 4: NowPLAN Logic Model
Program Strategies and Activities – NowPLAN-T
The NowPLAN-T evaluation examines Nowgarden’s strategy to create a technology plan that provides the technological tools necessary for the 21st century learner. The district identified seven activities that comprise the technology plan strategy.
1. Districtwide Technology Curriculum: to create a districtwide K-12 technology curriculum. The technology curriculum should integrate technology into the core curriculum.
2. Student Technology Orientation: to create a technology orientation plan for students (initially for all students, and then for new students transferring into the district).
3. Technology Professional Development Model: to create a districtwide professional development model. The model will focus on improving student learning, best instructional practice, and administrative functions such as data management and assessment.
4. Teacher Technology Orientation: to create a technology orientation plan for teachers (initially for all teachers, and then for new teachers transitioning into the district).
5. Technology-based Communications: to fully develop the use of technology resources for communication throughout the school community.
6. Technology-based Additional Learning Opportunities: to provide technology-based additional learning opportunities (ALOs) for students, including but not limited to distance learning and extended school day opportunities.
7. Hardware and Software Acquisition Plan: to create a districtwide software and hardware acquisition plan.
Relating Strategies to Goals: Program Theory
The overall theory behind the NowPLAN-T program is that by providing
• A districtwide technology curriculum,
• A technology orientation plan for students,
• A technology professional development model,
• A technology orientation plan for teachers,
• Technology resources for communication,
• Technology-based additional learning opportunities, and
• A districtwide software and hardware acquisition plan,
Student achievement will be improved, AND
Student postsecondary success ultimately will be improved.
The external evaluator helped district staff develop a set of assumptions as to why they believed the NowPLAN-T strategies would result in improved student achievement. They based these assumptions on research and developed them in consultation with educational technology professionals.
1. A districtwide technology curriculum will lead to a revised technology curriculum, which will lead to improved integration of technology into the core curriculum. This integration will lead to improved integration of technology in the classroom to enhance instruction.
2. A technology orientation plan for students will lead to a revised student orientation plan, which will lead to improved student understanding of technology availability and appropriate use. Improved understanding will lead to increased student use of technology to enhance learning and then improved integration of technology to enhance instruction.
3. A technology professional development model will lead to a revised professional development model, which will lead to improved teacher understanding of technology availability and then an increased use of technology by teachers. Increased use of technology will lead to improved teacher use of technology, which will in turn improve the integration of technology to enhance instruction.
4. A technology orientation plan for teachers will lead to a revised teacher orientation plan, which will lead to improved teacher understanding of technology availability and then to increased use of technology by teachers. Increased use of technology will lead to improved teacher use of technology. Improved teacher use will in turn improve the integration of technology to enhance instruction.
5. Technology-based communications will lead to a revised districtwide protocol for technology-based communications, which will lead to improved communication with families regarding events, assignments, and emergencies. Improved communication will lead to increased parental involvement in their child’s education.
6. Technology-based additional learning opportunities (ALOs) will improve the identification of technology-based ALOs. Improved identification will lead to increased availability of ALOs outside of the regular school day, which will lead to increased student participation in technology-based learning opportunities. Increased participation will lead to increased student exposure to learning opportunities.
7. A districtwide software and hardware acquisition plan will lead to a revised protocol for hardware and software acquisitions, which will lead to improved long-term acquisition planning. Improved acquisition planning will lead to increased availability of appropriate and necessary technology, then to increased availability of technology-based learning opportunities, and then to increased student participation, which will lead to increased student exposure to learning opportunities.
8. Improved use and integration of technology to enhance instruction, increased parental involvement in their child’s education, and increased student exposure to learning opportunities will lead to increased student learning and improved student achievement and ultimately improved postsecondary success.
Program Logic Model
Figure 5 illustrates these assumptions in the NowPLAN-T logic model.
Figure 5: NowPLAN-T Logic Model
Resources
Nowgarden School District worked with its external evaluator to identify several contextual conditions and resources that are necessary for the success of the district's technology plan. These are listed in the first column of the NowPLAN-T logic model and include financial resources to implement the technology plan, administrative support throughout the NowPLAN-T implementation, technology infrastructure (or resources to build this infrastructure), and the necessary technology personnel.
Step 2: Plan the Evaluation
Evaluation Design
Nowgarden School District is a public school district educating over 56,000 students. The district has 38 elementary schools (grades K-5), 13 middle schools (grades 6-8), and 11 high schools (grades 9-12). Nowgarden employs over 4,000 teachers across its 62 schools.
The district’s technology plan is implemented districtwide. The NowPLAN-T program and its evaluation will employ a single group evaluation design and will include both quantitative and qualitative data sources.
Design: Mixed-method, nonexperimental design
To improve the quality of the NowPLAN-T evaluation, Nowgarden’s external evaluator and the district worked to embed the evaluation into the technology plan. The evaluation is theory based and uses logic modeling to relate NowPLAN-T strategies and activities to long-term district goals. The evaluation design is longitudinal (6 years) and uses repeated measures. (Some data will be collected annually, while other data will be collected quarterly.)
Enriching the Evaluation Design: Logic modeling; longitudinal data with some repeated measures. An in-depth case study of NowPLAN-T schools, with sampling of classrooms within schools, will be used to collect observational data.
Data Collection Methods
Data will be collected through a variety of methods, including interviews, documents, surveys, rubrics, and assessments. All schools in the district will participate in certain components of the evaluation, while a purposefully selected group of schools will be chosen for a comprehensive case study, based on the quality of their implementation of the NowPLAN-T activities.
In the districtwide portion of the evaluation, all teachers and students will participate in surveys examining the use and integration of technology into the classroom. Teacher and student surveys will be based on the NowPLAN-T rubrics. That is, along with questions regarding the program's implementation, teachers and students will self-report on rubric components.
To provide a rich understanding of the classroom context of NowPLAN-T, the external evaluator will conduct a case study in a purposefully selected group of schools. Schools will be selected for the case study based upon the quality with which they have implemented NowPLAN-T activities. The case study will focus on four schools and provide linkages between classroom-level implementation and student learning that may not be fully understood by examining survey data only. A random sample of classrooms will be chosen at each case study school to participate in observations using the NowPLAN-T rubrics. The external evaluator and trained members of an evaluation team assembled by the external evaluator will observe these classrooms periodically throughout the school year. During observations, the NowPLAN-T rubrics will be completed and classroom context recorded. Case study teachers will also participate in in-depth interviews.
Student surveys (delivered electronically) will address classroom and home use of technology for primary learning activities, as well as additional learning opportunities offered as part of NowPLAN. The evaluation also will use participation logs from additional learning opportunities to determine which opportunities were offered and attendance. Additionally, a parent survey will be electronically administered to all parents in the district. The survey will be voluntary and anonymous. Information from the parent survey will examine communication and parent involvement related to NowPLAN components. The district would like to conduct a graduate follow-up survey to address the second long-term goal of improved postsecondary success but does not currently have the funds to do so. The proposed survey is included in the evaluation design as a placeholder for future evaluation possibilities.
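The random selection of classrooms within the case study schools can be documented in a few lines of code. The sketch below is illustrative only: the school and classroom identifiers, the sample size per school, and the fixed seed are hypothetical choices, not part of the NowPLAN-T design.

```python
# Minimal sketch (illustrative only): drawing a random sample of classrooms
# within each purposefully selected case study school. School and classroom
# identifiers are hypothetical placeholders, not actual NowPLAN-T data.
import random

case_study_schools = {
    "Elementary A": ["A-101", "A-102", "A-103", "A-104", "A-105"],
    "Middle B": ["B-201", "B-202", "B-203", "B-204"],
    "High C": ["C-301", "C-302", "C-303", "C-304", "C-305", "C-306"],
    "Elementary D": ["D-101", "D-102", "D-103"],
}

def sample_classrooms(schools, per_school=2, seed=2024):
    """Return a reproducible random sample of classrooms for each school."""
    rng = random.Random(seed)  # fixed seed so the draw can be documented and replicated
    return {
        school: rng.sample(rooms, min(per_school, len(rooms)))
        for school, rooms in schools.items()
    }

print(sample_classrooms(case_study_schools))
```

Fixing the seed makes the draw reproducible, which is helpful when the sampling procedure must be reported in the evaluation documentation.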
Step 3: Implement the Evaluation
The NowPLAN-T evaluation matrix is provided in Tables 16, 17, and 18. Evaluation questions, indicators, and targets, as well as data sources, collection, and analysis are addressed by logic model component.
Note that some very early (short-term) objectives are included in Table 16: NowPLAN-T Evaluation Matrix—Strategies and Activities/Initial Implementation. The remaining short-term and intermediate objectives are covered in Table 17: NowPLAN-T Evaluation Matrix—Early/Short-Term and Intermediate Objectives. Table 18: NowPLAN-T Evaluation Matrix—Long-Term Goals addresses long-term goals on improving student achievement and postsecondary success.
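Because the same matrix columns repeat across Tables 16 through 18, some evaluation teams find it convenient to also keep the matrix in a structured, machine-readable form. The sketch below is one possible representation; the class and field names are hypothetical and simply mirror the column headings, and the example row paraphrases an entry from Table 16.

```python
# Illustrative sketch: storing evaluation matrix rows so they can be filtered
# and reported on programmatically. Field names mirror the matrix columns.
from dataclasses import dataclass, field

@dataclass
class MatrixRow:
    logic_model_component: str
    evaluation_question: str
    indicator: str
    targets: list = field(default_factory=list)
    data_sources: list = field(default_factory=list)
    data_collection: str = ""
    data_analysis: str = ""

row = MatrixRow(
    logic_model_component="Technology-based ALOs / improved identification of ALOs",
    evaluation_question="To what extent were technology-based ALOs identified?",
    indicator="Increasing number of technology-based ALOs identified",
    targets=["By the end of Year 2, an identification process will be operational."],
    data_sources=["Interviews with technology personnel"],
    data_collection="Interviews conducted quarterly",
    data_analysis="Interview data summarized and, if warranted, analyzed for themes",
)
print(row.logic_model_component)
```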
Table 16: NowPLAN-T Evaluation Matrix—Strategies and Activities/Initial Implementation
Logic Model Components | Evaluation Questions* | Indicators | Targets | Data Sources | Data Collection | Data Analysis
Districtwide technology curriculum/revised technology curriculum
Student technology orientation/revised student orientation plan
Technology professional development model/revised professional development model
Teacher technology orientation/revised teacher orientation plan
Technology-based communications/revised districtwide protocol for technology-based communication
Hardware/software acquisition plan/revised protocol for hardware/software acquisitions
In what ways were the districtwide technology curriculum, student technology orientation plan, technology professional development model, teacher technology orientation plan, districtwide protocol for technology-based communication, and protocol for hardware/software acquisitions revised?
Creation of a revised districtwide technology curriculum, student technology orientation plan, technology professional development model, teacher technology orientation plan, districtwide protocol for technology-based communication, and protocol for hardware/software acquisitions
By the end of Year 1, a revised districtwide hardware/software acquisition plan and districtwide protocol for technology-based communication will have been created.
By the end of Year 2, a revised districtwide technology curriculum and technology professional development model will have been created.
By the end of Year 3, revised student and teacher technology orientation plans will have been developed.
Technology records
Document analysis
Meeting minutes
Interviews with technology personnel
Technology records and documents, as well as meeting minutes, reviewed monthly
Interviews with technology personnel conducted quarterly
Documents/minutes summarized for evidence of implementation
Interview data summarized, and if warranted, analyzed for themes
Technology-based additional learning opportunities (ALOs)/improved identification of ALOs
To what extent were technology-based ALOs identified?
Increasing number of technology-based ALOs identified
By the end of Year 2, a process to identify technology-based ALOs will be operational.
Interviews with technology personnel
Interviews with technology personnel conducted quarterly
Interview data summarized, and if warranted, analyzed for themes
*Note: Logic model components are combined in the evaluation questions but will be disaggregated in the data analysis.
Table 17: NowPLAN-T Evaluation Matrix—Early/Short-Term and Intermediate Objectives
Logic Model Components | Evaluation Questions | Indicators | Targets** | Data Sources | Data Collection | Data Analysis
Improved integration of technology into the core curriculum
How was technology integrated into the core curriculum?
Increased number of schools with improved classroom integration of technology
By the end of Year 4, at least <>% of schools will score a 3 or better on the NowPLAN-T rubrics.
By the end of Year 6, <>% will score a 4 out of 4 on the NowPLAN-T rubrics.
NowPLAN-T rubrics
Teacher surveys
Baseline rubric data collected at start of Year 1
Rubric data collected quarterly (for each school), through teacher surveys (all classrooms) and classroom observations (case study classrooms)
Rubric data analyzed by frequency distributions of rubric scores
Changes over time analyzed using significance testing
Improved classroom integration of technology to enhance instruction
In what ways and to what extent was technology integrated into classroom instruction?
Increased number of schools with improved curricular integration of technology
By the end of Year 4, at least <>% of schools will score a 3 or better on the NowPLAN-T rubrics.
By the end of Year 6, <>% will score a 4 out of 4 on the NowPLAN-T rubrics.
NowPLAN-T rubrics
Teacher surveys
Baseline rubric data collected at start of Year 1
Rubric data collected quarterly (for each school), through teacher surveys (all classrooms) and classroom observations (case study classrooms)
Rubric data analyzed by frequency distributions of rubric scores
Changes over time analyzed using significance testing
Improved student understanding of technology availability and appropriate use
To what extent do students understand the technology available to them, as well as its appropriate use?
Increased number of students who have an understanding of available technology
Using Year 1 survey data as a baseline, by the end of Year 4 at least <>%, Year 5 at least <>%, and Year 6 at least <>% of students will appropriately understand and use available technology to enhance learning.
Student surveys
Baseline student survey administered during Year 1
Student survey administered annually (and electronically) to all students
Survey data analyzed using frequency distributions and basic descriptive statistics
Changes over time analyzed using significance testing
Increased student use of technology to enhance learning
In what ways and how often do students use technology for learning (in the classroom and at home)?
Increased number of students who use technology in their learning activities
Using Year 1 survey data as a baseline, by the end of Year 4 at least <>%, Year 5 at least <>%, and Year 6 at least <>% of students will appropriately understand and use available technology to enhance learning.
Student surveys
Baseline student survey administered during Year 1
Student survey administered annually (and electronically) to all students
Survey data analyzed using frequency distributions and basic descriptive statistics
Changes over time analyzed using significance testing
Improved teacher understanding of technology availability
To what extent do teachers understand the technology available to them?
Increased number of schools with improved teacher understanding of the technology available to them
By the end of Year 4, at least <>% of schools will score a 3 or better on the NowPLAN-T rubrics, and will demonstrate an understanding of technology (as measured by the teacher survey).
By the end of Year 6, <>% will score a 4 out of 4 on the NowPLAN-T rubrics, and will demonstrate an understanding of technology (as measured by the teacher survey).
Teacher surveys
NowPLAN-T rubrics
Baseline teacher survey data and rubric data collected at start of Year 1
Rubric data collected quarterly (for each school), through teacher surveys (all classrooms) and classroom observations (case study classrooms)
Survey data analyzed through basic descriptive statistics and frequency distributions
Rubric data analyzed by frequency distributions of rubric scores
Changes over time analyzed using significance testing
Increased use of technology by teachers
To what extent do teachers use technology to improve student learning?
Increased number of schools with increased teacher use of technology to improve student learning
By the end of Year 4, at least <>% of schools will score a 3 or better on the NowPLAN-T rubrics, and will demonstrate an understanding of technology (as measured by the teacher survey).
By the end of Year 6, <>% will score a 4 out of 4 on the NowPLAN-T rubrics, and will demonstrate an understanding of technology (as measured by the teacher survey).
Teacher surveys
NowPLAN-T rubrics
Baseline teacher survey data and rubric data collected at start of Year 1
Rubric data collected quarterly (for each school), through teacher surveys (all classrooms) and classroom observations (case study classrooms)
Rubric data analyzed by frequency distributions of rubric scores
Changes over time analyzed using significance testing
Improved teacher use of technology
In what ways do teachers use technology to improve student learning?
Increased number of schools with improved teacher use of technology to improve student learning
By the end of Year 4, at least <>% of schools will score a 3 or better on the NowPLAN-T rubrics, and will demonstrate an understanding of technology (as measured by the teacher survey).
By the end of Year 6, <>% will score a 4 out of 4 on the NowPLAN-T rubrics, and will demonstrate an understanding of technology (as measured by the teacher survey).
Teacher surveys
NowPLAN-T rubrics
Baseline teacher survey data and rubric data collected at start of Year 1
Rubric data collected quarterly (for each school), through teacher surveys (all classrooms) and classroom observations (case study classrooms)
Rubric data analyzed by frequency distributions of rubric scores
Changes over time analyzed using significance testing
Improved communication with families
To what extent has communication with families improved?
Increased number of families who report improved communication
Using Year 1 survey data as a baseline, by the end of Year 4 at least <>%, Year 5 at least <>%, and Year 6 at least <>% of parents will report improved communication.
Parent survey
Baseline parent survey administered during Year 1
Parent survey administered annually (and electronically) to all families
Survey data analyzed using frequency distributions and basic descriptive statistics
Changes over time analyzed using significance testing
Increased parental involvement
In what ways has technology contributed to parental involvement?
Increased number of parents for whom communication has contributed to increased parental involvement
Using Year 1 survey data as a baseline, by the end of Year 4 at least <>%, Year 5 at least <>%, and Year 6 at least <>% of parents will report that communication has contributed to their increased parent involvement.
Parent survey
Baseline parent survey administered during Year 1
Parent survey administered annually (and electronically) to all families
Survey data analyzed using frequency distributions and basic descriptive statistics
Changes over time analyzed using significance testing
Increased availability of technology-based additional learning opportunities (ALOs)
To what extent are technology-based ALOs available to students?
Increased number of technology-based ALOs offered to students
Each year, the number of technology-based ALOs offered to students will increase by 20% (e.g., online courses, supplemental programs).
Technology records
Technology records and participation logs reviewed quarterly
Technology records and participation logs reviewed for evidence of ALO availability
Increased student participation in technology-based ALOs
How often do students participate in technology-based ALOs?
Increased number and percent of students participating in technology-based ALOs (within and outside of the regular school day)
Using Year 1 survey data as a baseline, by the end of Year 2 at least <>%, Year 3 at least <>%, Year 4 at least <>%, Year 5 at least <>%, and Year 6 at least <>% of students will participate in technology-based ALOs.
Participation logs
Student surveys
Baseline student survey administered during Year 1
Student survey administered annually (and electronically) to all students
Survey data analyzed using frequency distributions and basic descriptive statistics
Changes over time analyzed using significance testing
Results disaggregated by type of ALO (e.g., home-based, outside of school day)
Increased student exposure to technology-based ALOs
To what extent and in what ways do students participate in technology-based ALOs?
Increased number and percentage of students who have increased their overall learning time through technology-based ALOs (Note: investigate the nature of use, i.e., replacing an ALO or adding a new ALO)
Using Year 1 survey data as a baseline, students will have increased their learning time through technology-based ALOs by the end of Year 2 by at least <>%, Year 3 by at least <>%, Year 4 by at least <>%, Year 5 by at least <>%, and Year 6 by at least <>%.
Participation logs
Student surveys
Baseline student survey administered during Year 1
Student survey administered annually (and electronically) to all students
Survey data analyzed using frequency distributions and basic descriptive statistics
Changes over time analyzed using significance testing
Results disaggregated by type of ALO (e.g., home-based, outside of school day) and nature of the ALO (e.g., if the student replaced an ALO with a technology- based ALO)
Improved long- term hardware/ software acquisition planning
In what ways has hardware/software acquisition planning improved?
Increased number of teachers and technology staff who report improved acquisition planning
By the end of Year 2 at least <>%, Year 3 at least <>%, Year 4 at least <>%, Year 5 at least <>%, and Year 6 at least <>% of teachers/staff will report improved planning.
Interviews with technology staff
Teacher surveys
Interviews conducted annually
Teacher surveys administered annually
Interviews summarized and analyzed by theme
Survey data analyzed using frequency distributions and basic descriptive statistics
Increased availability of appropriate and necessary technology
To what extent has the availability of appropriate and necessary technology improved?
Increased number of teachers and technology staff who report improved availability of necessary technology
By the end of Year 2 at least <>%, Year 3 at least <>%, Year 4 at least <>%, Year 5 at least <>%, and Year 6 at least <>% of teachers/staff will report improved availability.
Interviews with technology staff
Teacher surveys
Interviews conducted annually
Teacher surveys administered annually
Interviews summarized and analyzed by theme
Survey data analyzed using frequency distributions and basic descriptive statistics
Increased student learning
To what extent has technology contributed to student learning as measured by local assessments?
To what extent did learning outcomes vary by school and classroom technology use?
(Not currently evaluated: In what ways have technology-based ALOs contributed to student learning?)
Increased scores on local assessments
Increased correlation between local assessment scores and technology implementation
Students in schools with high rubric scores will have higher gains on local assessments than students in schools with lower rubric scores, and the difference will be statistically significant.
Local assessments
NowPLAN-T rubric scores
Local assessment data collected quarterly
Baseline rubric data collected at start of Year 1
Rubric data collected quarterly (for each school) through teacher surveys (all classrooms) and classroom observations (case study classrooms)
Correlational analyses between local assessment scores and NowPLAN-T rubric scores
T-test of mean test scores pre-NowPLAN-T and each academic year post-NowPLAN-T
Results disaggregated by school, grade level, gender, race/ethnicity, special education status, and English language proficiency
**Targets will be updated once baselines are measured.
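The quantitative analyses named in the matrix (frequency distributions of rubric scores, significance testing of change over time, and correlations between rubric scores and local assessment results) can be illustrated with a short script. The sketch below uses made-up numbers rather than NowPLAN-T data, and the paired t-test and Pearson correlation are only one reasonable option; because rubric scores are ordinal, nonparametric alternatives (e.g., a Wilcoxon signed-rank test or Spearman correlation) may be preferable.

```python
# Minimal analysis sketch with hypothetical numbers (not NowPLAN-T data):
# frequency distributions of rubric scores, a paired test of change over time,
# and a correlation between rubric scores and local assessment gains.
from collections import Counter
from scipy import stats

# Hypothetical school-level rubric scores (1-4) at baseline and in Year 4.
rubric_year1 = [1, 2, 2, 1, 3, 2, 2, 1, 2, 3]
rubric_year4 = [2, 3, 3, 2, 4, 3, 3, 2, 3, 4]

print("Year 1 distribution:", Counter(rubric_year1))
print("Year 4 distribution:", Counter(rubric_year4))

# Change over time for the same schools: paired t-test.
t_stat, p_value = stats.ttest_rel(rubric_year4, rubric_year1)
print(f"Paired t-test: t = {t_stat:.2f}, p = {p_value:.4f}")

# Relationship between implementation and learning: correlation between
# Year 4 rubric scores and hypothetical gains on local assessments.
assessment_gains = [3.1, 5.4, 4.8, 2.9, 7.2, 5.0, 5.5, 3.3, 4.9, 6.8]
r, p = stats.pearsonr(rubric_year4, assessment_gains)
print(f"Pearson r = {r:.2f}, p = {p:.4f}")
```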
Table 18: NowPLAN-T Evaluation Matrix—Long-Term Goals
Logic Model Component | Evaluation Questions | Indicators | Targets | Data Source | Data Collection | Data Analysis
Improved student achievement
To what extent did the district’s technology plan contribute to student achievement?
To what extent did student learning improve after NowPLAN-T was implemented?
To what extent did learning outcomes vary by school and classroom technology use?
Increased scores on statewide standards-based achievement assessments
Increased correlation between achievement scores and NowPLAN-T implementation
Within 2 years, the correlation between improvement in student scores on the statewide standards-based achievement tests and scores on the NowPLAN-T technology rubrics will be statistically significant.
Students in schools (and classrooms) with high rubric scores will have higher achievement gains than students in schools (and classrooms) with lower rubric scores, and the difference will be statistically significant.
State test scores in reading and math (as well as science and writing)
NowPLAN-T rubric scores
State tests conducted in April of each academic year
Rubric data collected quarterly, through teacher surveys (all classrooms) and classroom observations (case study classrooms)
Correlational analyses between state achievement test scores and NowPLAN-T rubric scores
T-test of mean test scores pre- NowPLAN-T and each academic year post-NowPLAN-T
Results disaggregated by school, classroom, grade level, gender, race/ethnicity, special education status, and English language proficiency
Improved postsecondary success
To what extent was postsecondary success related to implementation of NowPLAN-T?
Increased correlation between achievement scores and postsecondary success
Note: This component is not currently evaluated. Indicators will be refined and targets will be determined at a later date.
Graduate follow-up surveys and focus groups
Note: This component is not currently evaluated. Data collection will be determined at a later date.
Descriptive statistics
Qualitative analysis of focus group data
Improved postsecondary success
To what extent was postsecondary success related to student achievement?
Increased correlation between NowPLAN-T implementation and postsecondary success
Note: This component is not currently evaluated. Indicators will be refined and targets will be determined at a later date.
State test scores
Graduate follow-up surveys
Note: This component is not currently evaluated. Data collection will be determined at a later date.
Descriptive statistics
Correlational studies and significance testing
Step 4: Interpret the Results
The NowPLAN-T external evaluator will meet with district staff quarterly to provide interim findings for program improvement and midcourse adjustments. The external evaluator will also provide an annual evaluation report focusing on the district's progress towards meeting the targets set for NowPLAN-T objectives. Biannual newsletters (traditional and electronic), as well as periodic presentations and press releases, will be used to communicate NowPLAN-T progress and findings to staff, parents, and students.
Step 5: Inform and Refine – USING the Results
Evaluation results will be used to inform and improve NowPLAN-T, refine the NowPLAN-T logic model (as necessary), and make recommendations and decisions regarding the future direction of NowPLAN-T.
Additional Notes
Embedding Evaluation in the Strategic Plan
Nowgarden School District had been through strategic planning before. Past strategic plans were completed only because they were required and then filed away and rarely consulted. The superintendent, who was hired 3 years ago, saw strategic planning as an opportunity for the district to reflect and grow. The superintendent did not want to ask teachers, administrators, and community members to spend their valuable time participating in a strategic planning process that was not going to be used to its fullest potential. The superintendent knew that a good strategic plan could be used for positive change and growth, and that embedding evaluation within the plan itself would provide information for continuous improvement. District administration agreed that a powerful strategic plan with an embedded evaluation provides the ingredients for success.
The strategic planning team chose to use the NowPLAN rubrics as the cornerstone of its strategic plan. It developed rubrics for each strategy included in the strategic plan. The NowPLAN rubrics were to be used as a guide and benchmarking tool. All teachers received professional development on using the NowPLAN rubrics for self-assessment. Administrators believed that familiarity with the rubrics would provide teachers with an understanding of the district’s expectations (i.e., what the district has determined good practice to “look like”), and that use of the rubrics would encourage self-reflection and ultimately improvement. Embedding the NowPLAN rubrics into everyday practice, from the classroom teacher’s use to the curriculum supervisor’s reviews, was Nowgarden’s way to translate the district’s strategic plan into practice. The Nowgarden superintendent knew that the district’s strategic plan,
including its technology plan, could drive change if it was a living, breathing plan incorporated as the foundation of, and into every aspect of, the district's operation.
The superintendent also knew that an important aspect of using rubrics to understand expectations and drive change is consistency. Understanding of the rubrics must be uniform, and application of the rubric must be consistent. For this reason, the external evaluator was asked to compare externally completed observation-based rubric ratings (from case study classrooms) with self-report rubric ratings by classroom teachers. By doing this, the evaluator could uncover discrepancies and inconsistencies in understanding and application. These findings were to be provided to program staff to be used to plan professional development activities that aid in rubric use and to improve the reliability of rubric data.
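A simple way to surface the discrepancies described above is to compare the two sets of ratings directly. The sketch below uses hypothetical rubric scores and reports exact agreement, agreement within one rubric point, and the average gap; a formal agreement statistic such as weighted kappa could be substituted if the evaluation team prefers.

```python
# Illustrative sketch (hypothetical scores): comparing evaluator observation
# ratings with teacher self-report ratings on the same rubric element to
# surface discrepancies in how the rubric is understood and applied.
observer = [2, 3, 3, 2, 4, 3, 2, 3]     # external observation ratings
self_report = [3, 3, 4, 2, 4, 4, 3, 3]  # teacher self-report ratings

pairs = list(zip(observer, self_report))
exact = sum(o == s for o, s in pairs) / len(pairs)
within_one = sum(abs(o - s) <= 1 for o, s in pairs) / len(pairs)
mean_gap = sum(s - o for o, s in pairs) / len(pairs)  # positive = self-report higher

print(f"Exact agreement: {exact:.0%}")
print(f"Agreement within one rubric point: {within_one:.0%}")
print(f"Average self-report minus observation: {mean_gap:+.2f}")
```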
The evaluator assured the district staff that the reporting of such data would in no way violate teacher confidentiality and privacy. The evaluation team planned to collect and manage data such that individual privacy was maintained. Data were to be stored with “dummy” keys that would allow linkages between data sets, but that would not relate to any internal, district identifier. Only the external evaluator would have access to key coding. Linkages between data sources, such as observational data and survey data, would be performed by the evaluation team and findings would be reviewed prior to release to ensure that individual identities could not be directly or deductively determined.
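The "dummy key" arrangement can be sketched as follows. The identifiers and field names are hypothetical; the essential point is that the crosswalk from district IDs to random study keys is held only by the external evaluator, while shared analysis files carry nothing but the study key.

```python
# Sketch of the "dummy key" idea described above (hypothetical identifiers):
# the evaluator keeps a crosswalk from district IDs to random study keys and
# releases only the study key on analysis files, so data sets can be linked
# without exposing any internal district identifier.
import uuid

district_ids = ["T-0042", "T-0117", "T-0256"]  # hypothetical teacher IDs

# Crosswalk held only by the external evaluator; never released.
crosswalk = {district_id: uuid.uuid4().hex for district_id in district_ids}

def deidentify(record: dict) -> dict:
    """Replace the district identifier with the study key before sharing."""
    out = dict(record)
    out["study_key"] = crosswalk[out.pop("district_id")]
    return out

survey_record = {"district_id": "T-0117", "rubric_CI1": 3}
observation_record = {"district_id": "T-0117", "observed_CI1": 2}

# Both records now carry the same study key and can be linked by the evaluator.
print(deidentify(survey_record))
print(deidentify(observation_record))
```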
While summarizations of NowPLAN (including NowPLAN-T) rubric data would be provided to program staff for formative program changes, the evaluator planned to also use rubric data to relate implementation to long-term outcomes. The program theory laid out by district administration assumed that higher rubric scores would be positively related to higher student achievement scores. Similarly, they hypothesized that lower rubric scores (that is, less sophisticated levels of implementation) would be associated with lower student achievement scores.
NowPLAN-T Rubrics
Tables 19 through 25 represent the 1:1 Implementation Rubric. The William & Ida Friday Institute for Educational Innovation at North Carolina State University kindly granted permission to reproduce the 1:1 Implementation Rubric in Appendix B. The rubric was developed by research staff at the Friday Institute. The 1:1 Implementation Rubric is based on the International Society for Technology in Education's National Educational Technology Standards (ISTE's NETS) framework. It was also developed using the North Carolina IMPACT Guidelines, the Texas Star Chart, and the North Carolina Learning Technology Initiatives (NCLTI) Planning Framework.
The rubric provides an assessment of the daily impact and use of technology programs and services on the teaching and learning process. It can be used to examine technology programs
at the district level, as well as the school and classroom levels. The 1:1 Implementation Rubric is intended to aid in reflecting on your technology implementation. For more information, visit https://eval.fi.ncsu.edu/11-implementation-rubric/.
Although the programs, characters, schools, and school district mentioned in Appendix B are fictitious examples provided to illustrate how the principles in this guide can be applied, the 1:1 Implementation Rubric is a real instrument. The 1:1 Implementation Rubric serves as the NowPLAN-T evaluation rubric for the fictitious Nowgarden School District. The NowPLAN-T evaluation incorporates the rubric as a powerful indicator of classroom and school performance. The rubric will be used by teachers for self-assessment and by evaluators during observations of case study classrooms.
The NowPLAN-T evaluation will use the Friday Institute 1:1 Implementation Rubric to examine four dimensions of classroom technology use and teacher experience with technology. The four implementation areas are shown below:
1. Curriculum and Instruction
2. Infrastructure and Technical Support
3. Leadership, Administration, and Instructional Support
4. Professional Development
Each of the four implementation areas has six elements of reflection. The NowPLAN-T evaluator created the chart below, using the elements from the Friday Institute 1:1 Implementation Rubric.
Table 19: 1:1 Implementation Rubric: Implementation Areas and Elements of Reflection
Implementation Area | Element of Reflection
Curriculum & Instruction (CI)
Classroom Use
Access to Digital Content
Content Area Connections
Technology Applications
Student Mastery of Technology Applications
Web-based Lessons
Infrastructure & Technical Support (IA)
Students: Computer
Access/Connectivity
Classroom Technology
Technical Support
LAN/WAN
Student Access to Distance Learning
Leadership, Administration, & Instructional Support (LA)
Leadership and Vision
Planning
Instructional Support
Communication and Collaboration
Sustainability
Policy
Professional Development (PD)
Professional Development Experiences
Models of Professional Development
Educator Capability
Participation in Technology-Driven Professional Development
Levels of Understanding
Student Training
Scores on each element will range from 1 to 4. Rubric comparison points are awarded as follows for each level of technology implementation:
1 point = Early (Starting) Technology
2 points = Developing Technology
3 points = Advanced (Prepared) Technology
4 points = Target Computing (i.e., exemplary implementation)
A total classification score is calculated for each of the four implementation areas by adding the scores across the six elements. Thus, scores in each implementation area of the rubric can be a maximum of 24 points. These scores are then classified into one of four categories. A score of 21 to 24 points is considered “target,” while a score of 15 to 20 points is “advanced/prepared.” A classification of “developing” is assigned to a score of 9 to 14 points, and fewer than 9 points is considered “early/starting.”
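The scoring arithmetic just described can be expressed as a small function. The element scores in the example are hypothetical; the cut points follow the bands given above (21-24 target, 15-20 advanced/prepared, 9-14 developing, fewer than 9 early/starting).

```python
# Worked sketch of the scoring rule described above: six element scores (1-4)
# are summed to an implementation-area total (6-24) and mapped to a band.
def classify_area(element_scores):
    """Sum six element scores and return (total, classification)."""
    assert len(element_scores) == 6 and all(1 <= s <= 4 for s in element_scores)
    total = sum(element_scores)
    if total >= 21:
        label = "Target Computing"
    elif total >= 15:
        label = "Advanced (Prepared) Technology"
    elif total >= 9:
        label = "Developing Technology"
    else:
        label = "Early (Starting) Technology"
    return total, label

# Example: hypothetical Curriculum & Instruction element scores CI1-CI6.
print(classify_area([3, 2, 3, 3, 2, 3]))  # (16, 'Advanced (Prepared) Technology')
```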
Using the Score Chart to Complete the Rubric
Using the elements from the Friday Institute 1:1 Implementation Rubric, the NowPLAN-T evaluator created the score chart shown below in Table 20. The evaluator explained that to complete the rubric, you need to consider each of the four implementation areas separately. For each implementation area, you decide where the classroom's technology use and experience falls on each of the six elements comprising that dimension.
Each element has one or two characteristics that describe each level of technology implementation. These bulleted characteristics were developed by research staff at the Friday Institute and are shown in the 1:1 Implementation Rubric in Tables 22 through 25. Note that all characteristics describing a level of technology implementation must be achieved for points to be awarded for that level. For example, if both characteristics for the Developing level accurately describe a classroom but the characteristics for higher levels do not, then you grade that classroom as Developing. Use the score chart to record your scores, calculate the total score, and identify the classification level of implementation for each implementation area.
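The "all characteristics must be achieved" rule can also be made explicit in code. In the sketch below, the True/False checks an observer might record for each level are hypothetical, and the fallback to a score of 1 when no level fully matches is an assumption, since the rubric's minimum score is 1.

```python
# Sketch of the element-level rule above: a classroom earns the points for the
# highest level at which *all* characteristics are met. The checks below are
# hypothetical booleans an observer might record for one element.
def score_element(level_checks):
    """Return the highest level (1-4) whose characteristics are all met."""
    for level in (4, 3, 2, 1):
        checks = level_checks.get(level, [])
        if checks and all(checks):
            return level
    return 1  # assumption: default to Early (Starting) if no level fully matches

# Example: both Developing characteristics met, Advanced only partially met.
ci1_checks = {
    1: [True, True],
    2: [True, True],
    3: [True, False],
    4: [False, False],
}
print(score_element(ci1_checks))  # 2 -> Developing
```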
Table 20: 1:1 Implementation Rubric: NowPLAN-T Score Chart
Implementation Area | Element of Reflection | Early | Developing | Advanced | Target | TOTAL (6-24) | Classification (Circle One)
Curriculum & Instruction (CI)
CI1: Classroom Use 1 2 3 4
CI2: Access to Digital Content 1 2 3 4
CI3: Content Area Connections 1 2 3 4
CI4: Technology Applications 1 2 3 4
CI5: Student Mastery of Technology Applications 1 2 3 4
CI6: Web-based Lessons 1 2 3 4
TOTAL (6-24): ___   Classification (Circle One): Target / Advanced / Developing / Early
Infrastructure & Technical Support (IA)
IA1: Students: Computer 1 2 3 4
IA2: Access/Connectivity 1 2 3 4
IA3: Classroom Technology 1 2 3 4
IA4: Technical Support 1 2 3 4
IA5: LAN/WAN 1 2 3 4
IA6: Student Access to Distance Learning 1 2 3 4
TOTAL (6-24): ___   Classification (Circle One): Target / Advanced / Developing / Early
Leadership, Administration, & Instructional Support (LA)
LA1: Leadership and Vision 1 2 3 4
LA2: Planning 1 2 3 4
LA3: Instructional Support 1 2 3 4
LA4: Communication and Collaboration 1 2 3 4
LA5: Sustainability 1 2 3 4
LA6: Policy 1 2 3 4
TOTAL (6-24): ___   Classification (Circle One): Target / Advanced / Developing / Early
Professional Development (PD)
PD1: Professional Development Experiences 1 2 3 4
PD2: Models of Professional Development 1 2 3 4
PD3: Educator Capability 1 2 3 4
PD4: Participation in Technology-Driven Professional Development 1 2 3 4
PD5: Levels of Understanding 1 2 3 4
PD6: Student Training 1 2 3 4
TOTAL (6-24): ___   Classification (Circle One): Target / Advanced / Developing / Early
In the following pages, Table 21 shows the score chart developed by research staff at the Friday Institute. The Friday Institute's charts with bulleted characteristics are presented for curriculum and instruction in Table 22; for infrastructure and technical support in Table 23; for leadership, administration, and instructional support in Table 24; and for professional development in Table 25.
These tables have been reformatted from their original versions to fit the pages.
These evaluation instruments were identified, modified, or developed through support provided by The Friday Institute. The Friday Institute grants you permission to use these instruments for educational, non-commercial purposes only. You may use an instrument as is, or modify it to suit your needs, but in either case you must credit its original source. By using these instruments, you agree to allow the Friday Institute to use the data collected for additional validity and reliability analysis. You also agree to share with The Friday Institute publications, presentations, evaluation reports, etc. that include data collected and/or results from your use of these instruments. The Friday Institute will take appropriate measures to maintain the confidentiality of all data. For information about additional permissions, or if you have any questions or need further information about these instruments, please contact Dr. Jeni Corn, Director of Evaluation of the Friday Institute, [email protected].
Table 21: The Friday Institute 1:1 Implementation Rubric 2: Score Chart
Curriculum & Instruction Total
CI1 Classroom Use CI2 Access to Digital Content
CI3 Content Area Connection
CI4 Technology Applications CI5 Student Mastery of Technology Applications
CI6 Web-Based Lessons
Infrastructure & Technology Support
IA1 Students: Computer
IA2 Access/ Connectivity
IA3 Classroom Technology
IA4 Technical Support IA5 LAN/WAN IA6 Student Access to Distance Learning
Leadership, Administration & Instructional Support
LAI1 Leadership & Vision
LAI2 Planning LAI3 Instructional Support
LAI4 Communication & Collaboration
LAI5 Sustainability LAI6 Policy
Professional Development
PD1 Professional Development Experiences
PD2 Model of Professional Development
PD3 Educator Capability
PD4 Participation in Technology- Driven Professional Development
PD5 Levels of Understanding
PD6 Student Training
1:1 Implementation Summary:
Implementation Area Total Classification*
Curriculum & Instruction
Infrastructure & Technology Support
Leadership, Administration & Instructional Support
Professional Development
*Classification: Early (Starting) Technology (6-8 pts.), Developing Technology (9-14 pts.), Advanced Technology (15-20 pts.), Target Computing (21-24 pts.)
If you have any questions or need further information about these instruments, please contact Dr. Jeni Corn, Director of Evaluation of the Friday Institute, [email protected].
Table 22: The Friday Institute 1:1 Implementation Rubric 3: Curriculum and Instruction
Curriculum & Instruction | Early (Starting) Technology | Developing Technology | Advanced (Prepared) Technology | Target Computing
CI1
Classroom Use
Teachers occasionally use technology to support instruction and present teacher-centered lectures.
Students use technology for skill reinforcement.
Teachers use technology to drive instruction, improve productivity, and model technology skills.
Students use technology to communicate and present information.
Teachers use technology as a collaborative tool in teacher-led and some student-centered learning experiences to facilitate the development of students’ higher order thinking skills and to interact with content experts, peers, parents, and community.
Students use technology to evaluate information and analyze data to solve problems.
Teachers and students are immersed in a student-centered learning environment where technology is seamlessly integrated into the learning process and used to solve real world problems.
Students use technology to develop, assess, and implement solutions to real world problems.
CI2
Access to Digital Content
Teachers have occasional access to digital resources for instruction.
Teachers have regular access to digital resources in the classroom.
Teachers have regular access to digital resources in various instructional settings (e.g., school, home, community).
Teachers have on demand access to digital resources anytime/anywhere.
CI3
Content Area Connections
Teachers use technology for basic skills practice with little or no connection with content objectives.
Teachers use technology to support content objectives.
Teachers integrate technology in subject areas.
Teachers seamlessly apply technology across all subject areas to provide learning opportunities beyond the classroom.
CI4
Technology Applications
Teachers are aware of technology applications for grades K-12.
Teachers have a general understanding of appropriate technology applications for their content areas.
Teachers are knowledgeable of and consistently use appropriate technology applications for their content areas and grade levels.
Teachers seamlessly integrate technology applications in collaborative, cross-curricular units of instruction.
CI5
Student Mastery of Technology Applications
Up to 25% of students have mastered technology applications.
Between 26-50% of students have mastered technology applications.
Between 51-85% of students have mastered technology applications.
Between 86-100% of students have mastered technology applications.
CI6
Web-Based Lessons
Teachers use a few web-based activities with students.
Teachers have customized several web-based lessons, which include online standards-based content, resources, and learning activities that support learning objectives.
Teachers have created many web-based lessons, which include online standards-based content, resources, and learning activities that support learning objectives.
Teachers have created and integrated web-based lessons, which include online standards-based content, resources, and learning activities that support learning objectives, throughout the curriculum.
Table 23: The Friday Institute 1:1 Implementation Rubric 4: Infrastructure and Technical Support
Infrastructure & Technical Support | Early (Starting) Technology | Developing Technology | Advanced (Prepared) Technology | Target Computing
IA1
Students: Computer
Less than two (2) student computers available per classroom.
Two (2) to five (5) connected multimedia student computers available per classroom.
At least one connected multimedia student lab or mobile cart is available.
Six (6) or more connected multimedia student computers available per classroom.
1 to 1 access to multimedia computers for all students in the classroom when needed.
Ability to take computers home.
IA2
Access/Connectivity
No access to the Internet in the classroom.
Internet access to at least one computer in the classroom.
Direct Internet access with reasonable response time in the classroom.
Direct Internet connectivity in the classroom with adequate bandwidth to access e-learning technologies and resources for all students.
Consistent access at home and school.
IA3
Classroom Technology
Teachers have shared access to resources such as, but not limited to, digital cameras, PDAs, MP3 players, probes, interactive white boards, projection systems, scanners, and classroom sets of graphing calculators.
Teachers have access to a designated computer and shared resources such as, but not limited to, digital cameras, PDAs, MP3 players, probes, interactive white boards, projection systems, scanners, and classroom sets of graphing calculators.
Teachers have access to a designated computer, and dedicated and assigned use of commonly used technology such as, but not limited to, digital cameras, PDAs, MP3 players, probes, interactive white boards, projection systems, scanners, and classroom sets of graphing calculators.
Teachers have ready access to a designated computer and a fully equipped classroom to enhance student instruction. Technologies include those listed earlier, as well as the use of new and emerging technologies.
IA4
Technical Support
When needed, the response time for technical support is greater than twenty-four (24) hours.
When needed, the response time for technical support is less than twenty-four (24) hours.
When needed, the response time for technical support is less than eight (8) hours.
When needed, the response time for technical support is less than four (4) hours.
IA5
LAN/WAN
Students and teachers have access to technologies such as print/file sharing and some shared resources outside the classroom.
Students and teachers have access to technologies such as print/file sharing, multiple applications, and district servers.
Students and teachers have access to technologies such as print/file sharing, multiple applications, and district- wide resources on the campus network.
All classrooms are connected to a robust LAN/WAN that allows easy access to multiple district-wide resources for students and teachers, including, but not limited to, video streaming and desktop videoconferencing.
IA6
Student Access to Distance Learning
Students have no or limited access to online learning with rich media such as streaming video, podcasts, applets, animation, etc.
Students have scheduled access to online learning with rich media such as streaming video, podcasts, applets, animations, etc.
Students have anytime access to online learning with rich media such as streaming video, podcasts, applets, animation, etc.
Students have anytime access to online learning with rich media such as streaming video, podcasts, applets, and animation, and sufficient bandwidth storage to customize online instruction.
Table 24: The Friday Institute 1:1 Implementation Rubric 5: Leadership, Administration, and Instructional Support
Leadership, Administration & Instructional Support | Early (Starting) Technology | Developing Technology | Advanced (Prepared) Technology | Target Computing
LAI1
Leadership & Vision
Leadership has the basic awareness of the potential of technology in education to lead to student achievement.
Leadership develops a shared vision and begins to build buy-in for comprehensive integration of technology leading to increased student achievement.
Leadership communicates and implements a shared vision and obtains buy-in for comprehensive integration of technology leading to increased student achievement.
Distributive leadership facilitates sustainability of the initiative.
A student leader is included in the planning team.
Leadership promotes a shared vision with policies that encourage continuous innovation with technology leading to increased student achievement.
Teams of instructional, curriculum, technology, and administrative personnel work together to develop goals and strategies for an effective 1:1 initiative.
LAI2
Planning
Few technology goals and objectives are incorporated in the school/district improvement plan.
Several technology goals and objectives are incorporated in the school/district improvement plan.
Technology-rich school district plan sets annual technology benchmarks based on the technology applications standards.
Leadership team has a collaborative, technology-rich school/district improvement plan grounded in research and aligned with district strategic plan focused on student achievement.
LAI3
Instructional Support
Teachers have limited opportunity for technology integration and planning or professional development.
Teachers have time for professional development on the integration of technology.
Teacher teams are provided time to create and participate in learning communities to stimulate, nurture, and support the use of technology to maximize teaching and learning.
Education leaders and teacher teams facilitate and support the use of technology to enhance instructional methods.
On-demand, up-to-date student data is available to administrators and teachers to drive instructional decision-making.
LAI4
Communication & Collaboration
School leaders use technology for limited written communication with teachers and parents.
Technology is used for communication and collaboration among colleagues, staff, parents, students, and the community.
Current information tools and systems are used for communication, management of schedules and resources, performance assessment, and professional development.
Technology is used to engage leaders from the business community.
A variety of media and formats, including telecommunications and the school website, is used to communicate, interact, and collaborate with all education stakeholders.
Marketing strategies are used to engage the business community and seek volunteers to assist with promoting the initiative.
LAI5
Sustainability
Limited discretionary funds are available for implementation of technology strategies to meet goals and objectives outlined in the school/district improvement plan.
Discretionary funds and other resources are allocated to advance implementation of some technology strategies to meet goals and objectives outlined in the school/district improvement plan.
Discretionary funds and other resources are allocated to advance implementation of most of the technology strategies to meet the goals and objectives outlined in the school/district improvement plan.
Discretionary funds and other resources are allocated to advance implementation of all the technology strategies to meet the goals and objectives outlined in the school/district improvement plan.
A team of stakeholders is assembled to create a long-term funding plan for the initiative. These individuals include the district leadership team, local business partners, and outside business individuals.
LAI6
Policy
A planning team is in place to develop policies for ensuring student safety and appropriate use of computers.
Policies for ensuring student safety and appropriate use of computers are in place.
Policies for ensuring student safety and appropriate use of computers are in place and enforced.
Policies for ensuring student safety and appropriate use of computers are in accord with the Children's Internet Protection Act (CIPA), while still enabling teachers and students to access a wide range of information and communication resources (AUP; plans for parent, teacher, and student information; filtering; virus/spyware protection).
Table 25: The Friday Institute 1:1 Implementation Rubric 6: Professional Development
Professional Development | Early (Starting) Technology | Developing Technology | Advanced (Prepared) Technology | Target Computing
PD1
Professional Development Experiences
Teachers participate in professional development on basic technology literacy skills and district information systems.
Teachers have participated in professional development on integrating technology into content area activities for students as well as to streamline productivity and management tasks.
Teachers have participated in professional development on technology integration into the curriculum through the creation of new lessons and activities that promote higher order thinking skills and collaboration with experts, peers, and parents.
Teachers collaborate with other professionals in developing new learning environments to empower students to think critically to solve real- world problems and communicate with experts across business, industry and higher education.
PD2
Models of Professional Development
Teachers participate in large group professional development sessions to acquire basic technology skills.
Teachers participate in large group professional development sessions focusing on increasing teacher productivity and building capacity to integrate technology effectively into content areas with follow-up that facilitates implementation.
Teachers participate in on-going professional development, including training, observation/assessment, study groups, and mentoring.
Teachers participate in multiple professional development opportunities that support anytime, anywhere learning available through delivery systems including individually guided activities, inquiry/action research, and involvement in a development/improvement process.
PD3
Educator Capability
Educators are aware of the certification for technology applications.
Most educators meet two (2) to three (3) technology application standards.
Most educators meet four (4) to five (5) of the technology application standards.
Most educators meet all six (6) of the technology application standards.
PD4
Participation in Technology-Driven Professional Development
Teachers participate in less than nine (9) hours of technology professional development per year.
Teachers participate in nine (9) to eighteen (18) hours of technology professional development per year.
Teachers participate in nineteen (19) to twenty-nine (29) hours of technology professional development per year.
Teachers participate in thirty (30) or more hours of technology professional development per year.
PD5
Levels of Understanding
Teachers understand technology basics and how to use teacher productivity tools.
Teachers adapt technology knowledge and skills for content area instruction.
Teachers use technology as a tool in and across content areas to enhance higher order thinking skills.
Teachers create new, interactive, collaborative, and customized learning environments.
PD6
Student Training
Training on school technology policies and software is not provided to students.
Training on school technology policies and software is being planned for students.
Training on school technology policies and software is provided to students once a year.
Training on school technology policies and software is provided to students multiple times a year.
Appendix C: Evaluation Resources
The evaluation resources in this appendix can help you find more information on a topic. Note that many of these resources address multiple evaluation subjects, so the inclusion of a resource under one topic should not imply that it does not also pertain to other areas. There are many good evaluation texts, so you will undoubtedly find additional resources. This list is not exhaustive by any means. However, these resources will get you started if you are interested in a more in-depth look at a subject of interest.
Evaluation Approaches
Alkin, M. (2004). Evaluation roots: Tracing theorists' views and influences. Thousand Oaks, CA: Sage Publications.
Fetterman, D., and Wandersman, A. (2005). Empowerment evaluation principles in practice. New York, NY: Guilford Press.
Patton, M. (2008). Utilization focused evaluation. (4th ed.) Thousand Oaks, CA: Sage Publications.
Paul, J. (2005). Introduction to the philosophies of research and criticism of the education and the social sciences. Upper Saddle River, NJ: Prentice-Hall.
Preskill, H., and Jones, N. (2009). A practical guide for engaging stakeholders in developing evaluation questions. Princeton, NJ: Robert Wood Johnson Foundation.
Rossi, P., Lipsey, M., and Freeman, H. (2004). Evaluation: A systematic approach. (7th ed.) Thousand Oaks, CA: Sage Publications.
Shadish, W., Cook, T., and Leviton, L. (1991). Foundations of program evaluation: Theories of practice. Newbury Park, CA: Sage Publications.
Stake, R. (2004). Standards-based and responsive evaluation. Thousand Oaks, CA: Sage Publications.
Stufflebeam, D. (2001). Evaluation models: New directions in evaluation, No. 89. San Francisco, CA: Jossey-Bass.
Program Theory and Logic Modeling
Frechtling, J. (2007). Logic modeling methods in program evaluation. San Francisco, CA: Jossey-Bass.
Knowlton, L., and Phillips, C. (2009). The logic model guidebook. Thousand Oaks, CA: Sage Publications.
Weiss, C. (1998). Evaluation. Upper Saddle River, NJ: Prentice Hall.
Research and Evaluation Design, Including Reliability and Validity
Haertel, G., and Means, B. (2003). Evaluating educational technology: Effective research designs for improving learning. New York, NY: Teachers College Press.
Lauer, P. (2006). An education research primer: How to understand, evaluate, and use it. San Francisco, CA: Jossey-Bass.
Means, B., and Haertel, G. (2004). Using technology evaluation to enhance student learning. New York, NY: Teachers College Press.
Mosteller, F., and Boruch, R. (2002). Evidence matters: Randomized trials in education research. Washington, DC: Brookings Institution Press.
Shadish, W., Cook, T., and Campbell, D. (2001). Experimental and quasi-experimental designs for generalized causal inference. Boston, MA: Houghton-Mifflin.
Shavelson, R., and Towne, L. (2002). Scientific research in education. Washington, DC: National Academy Press.
Stake, R. (1995). The art of case study research. Thousand Oaks, CA: Sage Publications.
Stake, R. (2010). Qualitative research: Studying how things work. New York, NY: Guilford Press.
Trochim, W., and Donnelly, J. (2006). The research methods knowledge base (3rd ed.). Cincinnati, OH: Atomic Dog Publishing.
What Works Clearinghouse. (2011). Procedures and standards handbook (version 2.1). Retrieved from http://ies.ed.gov/ncee/wwc/references/idocviewer/doc.aspx?docid=19&tocid=1.
Wholey, J., Hatry, H., and Newcomer, K. (Eds.). (2010). Handbook of practical program evaluation (3rd ed.). San Francisco, CA: Jossey-Bass.
Threats to Validity
Campbell, D., and Stanley, J. (1963). Experimental and quasi-experimental designs for research. Boston, MA: Houghton Mifflin Co.
Budgeting Time and Money
Bamberger, M., Rugh, J., and Mabry, L. (2011). RealWorld evaluation. Thousand Oaks, CA: Sage Publications.
Posavac, E., and Carey, R. (2003). Program evaluation methods and case studies. Upper Saddle River, NJ: Prentice Hall.
W. K. Kellogg Foundation. (2004). Evaluation handbook. Battle Creek, MI: W. K. Kellogg Foundation.
Ethical Issues
APA. (1982). Ethical principles in the conduct of research with human participants. Washington, DC: American Psychological Association.
National Center for Education Statistics. SLDS Technical Brief (NCES 2011-601), Basic concepts and definitions for privacy and confidentiality in student education records. Retrieved from http://nces.ed.gov/pubs2011/2011601.pdf.
National Center for Education Statistics. SLDS Technical Brief (NCES 2011-602), Data stewardship: Managing personally identifiable information in electronic student education records. Retrieved from http://nces.ed.gov/pubs2011/2011602.pdf.
U.S. Department of Education. (2007). Mobilizing for evidence-based character education. Washington, D.C.: Office of Safe and Drug-Free Schools. Retrieved from http://www2.ed.gov/programs/charactered/mobilizing.pdf.
U.S. Department of Education. Privacy technical assistance center. Retrieved from http://ptac.ed.gov/.
U.S. Department of Education. Safeguarding student privacy. Retrieved from http://www2.ed.gov/policy/gen/guid/fpco/ferpa/safeguarding-student-privacy.pdf.
Wilder Research. (2007). Evaluation tip sheets: Ethical issues. St. Paul, MN: Wilder Research. Retrieved from http://www.wilderresearch.org.
Yarbrough, D., Shulha, L., Hopson, R., and Caruthers, F. (2011). The program evaluation standards: A guide for evaluators and evaluation users (3rd ed.). Thousand Oaks, CA: Sage Publications.
See also general information, regulations, and guidance on the protection of human subjects at the U.S. Department of Education at http://www2.ed.gov/about/offices/list/ocfo/humansub.html.
Data Collection, Preparation, and Analysis
Bradburn, N., and Sudman, S. (2004). Asking questions: The definitive guide to questionnaire development. San Francisco, CA: John Wiley and Sons.
Cox, J., and Cox, K. (2008). Your opinion, please!: How to build the best questionnaires in the field of education (2nd ed.). Thousand Oaks, CA: Sage Publications.
Fowler, F. (2008). Survey research methods (4th ed.). Thousand Oaks, CA: Sage Publications.
Friesen, B. (2010). Designing and conducting your first interview. San Francisco, CA: John Wiley and Sons.
Krueger, R., and Casey, M. (2009). Focus groups: A practical guide for applied research (4th ed.). Thousand Oaks, CA: Sage Publications.
Wilder Research. (2010). Evaluation tip sheets: Making sense of your data. St. Paul, MN: Wilder Research. Retrieved from http://www.wilderresearch.org.
See also evaluation publications, research syntheses, and technical assistance resources at the U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance (NCEE) at http://ies.ed.gov/ncee.
Evaluation Pitfalls
Hall, G., and Hord, S. (2010). Implementing change: Patterns, principles, and potholes (3rd ed.). Boston, MA: Allyn and Bacon.
Wholey, J., Hatry, H., and Newcomer, K. (2010). Handbook of practical program evaluation (3rd ed.). San Francisco, CA: John Wiley and Sons.
Wilder Research. (2007). Evaluation tip sheets: Does it measure up? St. Paul, MN: Wilder Research.
Interpreting, Reporting, Communicating, and Using Evaluation Results
Hittleman, D., and Simon, A. (2005). Interpreting educational research (4th ed.). Upper Saddle River, NJ: Pearson.
Hord, S., Rutherford, W., Huling-Austin, L., and Hall, G. (1998). Taking charge of change. Austin, TX: Southwest Educational Development Laboratory.
McMillan, J. (2011). Educational research: Fundamentals for the consumer (6th ed.). Boston, MA: Allyn and Bacon.
McMillan, J., and Wergin, J. (2009). Understanding and evaluating educational research (4th ed.). Upper Saddle River, NJ: Prentice Hall.
Torres, R., Preskill, H., and Piontek, M. (1996). Evaluation strategies for communicating and reporting: Enhancing learning in organizations. Thousand Oaks, CA: Sage Publications.
Appendix D: Evaluation Instruments for Educational Technology Initiatives
Links to instruments that may be helpful for evaluating educational technology initiatives are provided below. As with Appendix C: Evaluation Resources, this list is not meant to be exhaustive. The links are current as of the publication date; should a link stop working in the future, the author information may help you track the instrument down. There are many useful evaluation instruments available, some in the public domain and some through private organizations. These should provide a starting point as you build your instrument library.
Technology Policy Implementation Rubric (NCREL)
Location: http://www.air.org/focus-area/education/index.cfm?fa=viewContent&content_id=3006&id=10
Author: Learning Point Associates, North Central Regional Educational Laboratory (2004)
Description: The Technology Policy Implementation Rubric “can be used to assess a state’s implementation of educational technology policies in 19 areas. The rating scale provides indicators for four levels of implementation: outstanding, high, medium, and low.”
NETS for Students: Achievement Rubric (NCREL)
Location: http://www.comfsm.fm/national/administration/VPA/researchdocs/techPlan/NCREL%20p-12rubric.pdf
Author: Learning Point Associates, North Central Regional Educational Laboratory (2005)
Description: The NETS for Students: Achievement Rubric defines “four achievement levels in relation to the NETS. The rubric is being developed to assist state and school district leaders in their efforts to measure and monitor the development of student technology literacy throughout the elementary and secondary grades.”
NETS for Teachers: Achievement Rubric (NCREL)
Location: http://www.air.org
Author: Learning Point Associates, North Central Regional Educational Laboratory (2005)
Description: The NETS for Teachers: Achievement Rubric is the teacher counterpart rubric to the NETS for Students: Achievement Rubric.
1:1 Technology Implementation Rubric (NCSU-FI)
Location: https://eval.fi.ncsu.edu/11-implementation-rubric/
Author: The William and Ida Friday Institute for Educational Innovation, North Carolina State University (2010)
Description: A technology implementation rubric that “is based on Technology Standards & Performance Indicators for Students (ISTE NETS-S), the NC IMPACT Guidelines, Texas Star Chart, and NC Learning Technology Initiatives (NCLTI) Planning Framework. This rubric provides a global perspective of school media and technology programs at both the building and system levels.”
Integration of Technology Observation Instrument (ASU-West)
Location: http://www.west.asu.edu/pt3/assessment/observation.htm
Author: Arizona State University West PT3 (2002)
Description: The Integration of Technology Observation Instrument includes a preobservation form to be completed by a classroom teacher, and a timed observation form and postobservation form to be completed by the observer. It is a component of the Arizona State University West Preparing Tomorrow's Teachers to Use Technology (PT3) project. More information can be found at http://www.west.asu.edu/pt3. Please note that this instrument is copyrighted and written permission of the authors must be obtained prior to use.
The Technology Integration Matrix (USF)
Location: http://fcit.usf.edu/matrix/matrix.php
Author: University of South Florida, College of Education, Florida Center for Instructional Technology (2011)
Description: The Technology Integration Matrix (TIM) is an online, multimedia tool that examines classroom technology use in order to develop a common vocabulary to describe levels of technology integration. The matrix assesses five characteristics of effective learning environments: active, constructive, goal directed, authentic, and collaborative. These are measured against five levels of technology integration: entry, adoption, adaptation, infusion, and transformation. Each of the resulting 25 boxes includes a written explanation and a classroom video example for math, science, social studies, and language arts. Text-only versions of the online matrix are also available. Two additional tools are also described, though online versions are not available. The first is the Technology Integration Matrix Observation Tool (TIM-O), which contains yes/no questions designed to evaluate classroom technology integration using the
terms of the Technology Integration Matrix. The second, the Technology Comfort Measure (TCM), is a teacher self-assessment containing 35 questions and photographs showing classroom technology use. Results provide teachers with a profile and professional development suggestions. More information can be found at http://fcit.usf.edu/matrix/index.php.
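The TIM itself is simply a cross of the five characteristics by the five integration levels. As a rough illustration only (this is not an official TIM tool), the 25 cells can be generated programmatically, for example to seed a local observation coding sheet; the characteristic and level names come from the description above, and everything else in this sketch is hypothetical.

```python
from itertools import product

# The five characteristics of effective learning environments and the five
# levels of technology integration named in the TIM description above.
# Crossing them yields the matrix's 25 cells.
CHARACTERISTICS = ["active", "constructive", "goal directed", "authentic", "collaborative"]
LEVELS = ["entry", "adoption", "adaptation", "infusion", "transformation"]

# One record per cell, with a blank notes field for an observer to fill in.
coding_sheet = [
    {"characteristic": c, "level": l, "notes": ""}
    for c, l in product(CHARACTERISTICS, LEVELS)
]

print(len(coding_sheet))   # 25 cells
print(coding_sheet[0])     # {'characteristic': 'active', 'level': 'entry', 'notes': ''}
```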
The Looking for Technology Integration Protocol (UNC-Greensboro)
Location: http://www.serve.org/uploads/docs/LoFTI_1.1.pdf
Author: University of North Carolina at Greensboro SERVE Center, North Carolina Department of Public Instruction Educational Technology Division (2005)
Description: The Looking for Technology Integration (LoFTI) protocol is an observation tool that profiles school educational technology implementation. Observation Record Forms may be used to assess the learning environment, teaching and learning styles, student engagement, use of technology, and hardware and software use. A Data Tally Tool is available for presenting data (http://www.serve.org/uploads/docs/LoFTIpaperpencilAnalysis.pdf). More information can be found at http://www.serve.org/lofti.aspx.
School Technology Needs Assessment (UNC-Greensboro)
Location: http://www.serve.org/uploads/docs/STNA3.0.0.pdf
Author: University of North Carolina at Greensboro SERVE Center, North Carolina Department of Public Instruction Educational Technology Division (2006)
Description: The School Technology Needs Assessment (STNA) is designed for planning and formative evaluation. It is available as an online or paper-pencil survey tool. Questions address conditions for technology use, professional development, and classroom practices. A checklist for STNA preparation is available at http://www.serve.org/uploads/docs/STNAchecklist.pdf. The STNA website is located at http://www.serve.org/stna.aspx. A guide for interpreting STNA findings, as well as additional resources (including instructions for using the online survey) can be found on the STNA website.
Additional resources: Corn, J. (2007, November). Investigating the validity and reliability of the School Technology Needs Assessment (STNA): http://www.serve.org/uploads/docs/STNA_paper.pdf
Profiling Educational Technology Integration (SETDA)
Location: http://www.setda.org/web/guest/petitools
Author: State Educational Technology Directors Association (SETDA), developed in partnership with Metiri Group (2004)
Description: Profiling Educational Technology Integration (PETI) Evaluating Educational Technology Effectiveness includes a set of tools designed to examine school, district, and state use of educational technology over time. Assessment is focused on both technology readiness and effective use of technology, and is aligned with No Child Left Behind, Title II, Part D. Tools in the PETI suite include a teacher survey, building-level survey, district survey, artifact review forms, principal interview protocol, classroom observation protocol, school walk-through protocol, school range-of-use observation tool, and teacher focus group protocol. An overview of and introduction to the SETDA/Metiri Group’s PETI - Evaluating Educational Technology Effectiveness, as well as the PETI framework and information regarding reliability and validity, can be found at http://www.setda.org/web/guest/peti.
Levels of Technology Implementation (DCET)
Location: http://www.dcet.k12.de.us/instructional/loti/index.shtml
Author: Delaware Center for Educational Technology (1994/2003)
Description: The Levels of Technology Implementation (LoTi) instrument is a 37-question online teacher self-assessment. Questions address various topics, including teacher personal technology use, current instructional practice, and level of technology implementation. Information is reported individually to teachers and in aggregate to districts and the state. LoTi is funded by the Delaware Center for Educational Technology (DCET) and is available to Delaware Public and Charter schools. Other schools may contact DCET for information about the LoTi instrument.
Insight (TCET)
Location: http://www.tcet.unt.edu/insight/instruments/
Author: Texas Center for Educational Technology, University of North Texas (2005)
Description: Insight, the South Central Instrument Library and Data Repository, links to several useful tools for examining the use and integration of technology in classrooms and schools. Instruments include a variety of self-assessments, checklists, affective questionnaires, and technology implementation surveys. Information regarding Insight materials can be found at http://www.tcet.unt.edu/insight/about.php.
Observation Protocol for Technology Integration in the Classroom (NETC)
Location: http://www.netc.org/assessing/home/integration.php
Author: Northwest Educational Technology Consortium (2004)
Description: The Observation Protocol for Technology Integration in the Classroom (OPTIC) was developed by the Northwest Educational Technology Consortium (NETC) in 2004. OPTIC is an observation protocol that relies on checklists and rubrics to assess the degree of technology integration in classrooms and schools. Federal funding for the regional technology consortia program ended in September 2005. However, the Northwest Regional Educational Laboratory (NWREL), now Education Northwest, continues to make the OPTIC resources available to educators.
Capacity Building Instruments (NCSU-FI)
Location: https://eval.fi.ncsu.edu/instruments-2/
Author: The William and Ida Friday Institute for Educational Innovation, North Carolina State University
Description: The Friday Institute for Educational Innovation (FI) at North Carolina State University has compiled a list of instruments for evaluation capacity building. Instruments include inventories, checklists, surveys, and rubrics, including versions of LoFTI and STNA modified for special use. The site includes links to the instruments, as well as instructions for using each instrument. A form is also provided for requesting permission to use instruments included on the site.
IMPACT Surveys (Sun Associates/Alabama State Department of Education)
Location: http://www.sun-associates.com/index.html
Author: Sun Associates
Description: The Alabama State Department of Education, Technology Initiatives Section, uses two sets of surveys to assess and monitor the impact of its state technology plan. The first set of surveys, Indicators for Measuring Progress in Advancing Classroom Technology (IMPACT), examines growth in technology use and perceptions towards technology. IMPACT surveys are self-report and are conducted with teachers, administrators, and technology coordinators. The second set, Speak Up surveys, are part of a national research project conducted by Project Tomorrow (see below). For more information on Alabama educational technology initiatives, see http://www.alsde.edu/html/sections/section_detail.asp?section=61&footer=sections.
Speak Up Surveys (Project Tomorrow)
Location: http://www.tomorrow.org/speakup/index.html
Author: Project Tomorrow (2011)
Description: Speak Up is a group of surveys available to schools and districts. Speak Up surveys are conducted online and are voluntary. Nationally aggregated data are available for comparison purposes. Speak Up includes self-report surveys for teachers, students, administrators, and parents. The Speak Up website includes information about the project, directions on how to participate, and sample survey questions.
School 2.0 ETOOLKIT (CTL)
Location: http://etoolkit.org/etoolkit/
Author: Center for Technology in Learning at SRI International
Description: The School 2.0 eToolkit was created by the Center for Technology in Learning (CTL) at SRI International. It includes an online reflection tool comprised of teacher, principal, and technology coordinator questionnaires. The reflection tool focuses on skills in technology integration and identifies areas for growth. Resources relating to several categories, including planning and implementation—technology evaluation, are also provided on the website. The tool kit is currently maintained by the Central Susquehanna Intermediate Unit (CSIU) and content is provided by the International Society for Technology in Education (ISTE).
School Technology and Readiness Chart (Texas Education Agency)
Location: http://starchart.epsilen.com/docs/TxTSC.pdf
Author: Texas Education Agency (2006)
Description: The School Technology and Readiness (STaR) Chart was created by Texas Education Agency to help teachers in Texas to self-assess their progress toward meeting state technology goals. The STaR Chart measures technology integration in four areas: teaching and learning; educator preparation and development; leadership, administration, and instructional support; and infrastructure for technology. More information on the Texas STaR Chart can be found at http://starchart.epsilen.com/.
Appendix E: Evaluation Templates
Figure 6: Logic Model Template
Table 26: Evaluation Matrix Template
The template is a blank grid with seven columns (Logic Model Components, Evaluation Questions, Indicators, Targets, Data Sources, Data Collection, and Data Analysis) and one row for each logic model component: Strategies and Activities/Initial Implementation; Early/Short-Term and Intermediate Objectives; and Long-Term Goals.
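If you prefer to keep your evaluation plan electronically, the matrix can be laid out as a simple spreadsheet. The sketch below is not part of the original template; it rebuilds the Table 26 grid with Python and pandas (an assumed tool choice), and the sample entries are hypothetical placeholders to be replaced with your own program's questions, indicators, and targets.

```python
import pandas as pd

# Columns of the Evaluation Matrix Template (Table 26).
columns = [
    "Logic Model Components", "Evaluation Questions", "Indicators",
    "Targets", "Data Sources", "Data Collection", "Data Analysis",
]

# One row per logic model component; the entries are hypothetical placeholders.
rows = [
    ["Strategies and Activities/Initial Implementation",
     "Were laptops deployed as planned?", "% of classrooms equipped",
     "100% by fall", "Inventory records", "Quarterly audit", "Frequency counts"],
    ["Early/Short-Term and Intermediate Objectives",
     "Did teacher technology use increase?", "Observed integration level",
     "80% at adoption level or higher", "Classroom observations",
     "Twice-yearly protocol", "Rubric scoring, descriptive statistics"],
    ["Long-Term Goals",
     "Did student achievement improve?", "State assessment scores",
     "5-point gain over baseline", "District assessment data",
     "Annual extract", "Pre/post comparison"],
]

matrix = pd.DataFrame(rows, columns=columns)
matrix.to_csv("evaluation_matrix_template.csv", index=False)  # fill in and share with your team
print(matrix.iloc[:, :2])  # preview the first two columns
```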
Appendix F: Lists of Tables and Figures
List of Tables
Table 1: Evaluation Matrix Example Shell
Table 2: Evaluation Methods and Tools: Overview
Table 3: Evaluation Methods and Tools: Procedures
Table 4: Evaluation Questions for Strategies and Activities
Table 5: Evaluation Questions for Early/Short-Term and Intermediate Objectives
Table 6: Evaluation Questions for Long-Term Goals
Table 7: Evaluation Matrix Addressing Strategies and Activities During the Initial Implementation—Indicators and Targets
Table 8: Evaluation Matrix Addressing Early/Short-Term and Intermediate Objectives—Indicators and Targets
Table 9: Evaluation Matrix Addressing Long-Term Goals—Indicators and Targets
Table 10: READ Evaluation Matrix—Strategies and Activities/Initial Implementation
Table 11: Evaluation Matrix—Early/Short-Term and Intermediate Objectives
Table 12: Evaluation Matrix—Long-Term Goals
Table 13: READ Evaluation Results—Strategies and Activities/Initial Implementation
Table 14: READ Evaluation Results—Early/Short-Term and Intermediate Objectives
Table 15: READ Evaluation Results—Long-Term Goals
Table 16: NowPLAN-T Evaluation Matrix—Strategies and Activities/Initial Implementation
Table 17: NowPLAN-T Evaluation Matrix—Early/Short-Term and Intermediate Objectives
Table 18: NowPLAN-T Evaluation Matrix—Long-Term Goals
Table 19: 1:1 Implementation Rubric: Implementation Areas and Elements of Reflection
Table 20: 1:1 Implementation Rubric: NowPLAN-T Score Chart
Table 21: The Friday Institute 1:1 Implementation Rubric 2: Score Chart
Table 22: The Friday Institute 1:1 Implementation Rubric 3: Curriculum and Instruction
Table 23: The Friday Institute 1:1 Implementation Rubric 4: Infrastructure and Technical Support
Table 24: The Friday Institute 1:1 Implementation Rubric 5: Leadership, Administration, and Instructional Support
Table 25: The Friday Institute 1:1 Implementation Rubric 6: Professional Development
Table 26: Evaluation Matrix Template
List of Figures
Figure 1: Embedded Evaluation Model
Figure 2: Possible Logic Model Headings
Figure 3: READ Logic Model
Figure 4: NowPLAN Logic Model
Figure 5: NowPLAN-T Logic Model
Figure 6: Logic Model Template
- Acknowledgements
- Before You Get Started
- Introduction
- What Is the Purpose of the Guide?
- Why Evaluate and What Do I Need to Consider?
- Where Do I Start?
- How Is the Guide Organized?
- Embedding Evaluation Into the Program
- STEP 1: DEFINE – What Is the Program?
- STEP 2: PLAN – How Do I Plan the Evaluation?
- STEP 3: IMPLEMENT – How Do I Evaluate the Program?
- STEP 4: INTERPRET – How Do I Interpret the Results?
- STEP 5: INFORM and REFINE – How Do I Use the Evaluation Results?
- Appendix A: Embedded Evaluation Illustration – READ*
- Program Snapshot
- Step 1: Define the Program
- Step 2: Plan the Evaluation
- Step 3: Implement the Evaluation
- Step 4: Interpret the Results
- Step 5: Inform and Refine – Using the Results
- Appendix B: Embedded Evaluation Illustration – NowPLAN*
- Program Snapshot
- Step 1: Define the Program
- Step 2: Plan the Evaluation
- Step 3: Implement the Evaluation
- Step 4: Interpret the Results
- Step 5: Inform and Refine – Using the Results
- Appendix C: Evaluation Resources
- Evaluation Approaches
- Program Theory and Logic Modeling
- Research and Evaluation Design, Including Reliability and Validity
- Threats to Validity
- Budgeting Time and Money
- Ethical Issues
- Data Collection, Preparation, and Analysis
- Evaluation Pitfalls
- Interpreting, Reporting, Communicating, and Using Evaluation Results
- Appendix D: Evaluation Instruments for Educational Technology Initiatives
- Appendix E: Evaluation Templates
- Appendix F: Lists of Tables and Figures
- List of Tables
- List of Figures
Almost everywhere we turn, trust is on the decline. Trust in our culture at large, in our institutions, and in our companies is significantly lower than a generation ago. Research shows that only 49% of employees trust senior management, and only 28% believe CEOs are a credible source of information. Consider the loss of trust and confidence in the financial markets today. Indeed, "trust makes the world go 'round," and right now we're experiencing a crisis of trust.
This crisis compels us to ask three questions. First, is there a measurable cost to low trust? Second, is there a tangible benefit to high trust? Third, how can the best leaders build trust in and within their organizations to reap the benefits of high trust?
Most people don't know how to think about the organizational and societal consequences of low trust because they don't know how to quantify or measure the costs of such a so-called "soft" factor as trust. For many, trust is intangible, ethereal, unquantifiable. If it remains that way, then people don't know how to get their arms around it or how to improve it. But the fact is, the costs of low trust are very real, they are quantifiable, and they are staggering.
In 2004, one estimate put the cost of complying with federal rules and regulations alone in the United States—put in place essentially due to lack of trust—at $1.1 trillion, which is more than 10% of the gross domestic product. A recent study conducted by the Association of Certified Fraud Examiners estimated that the average American company lost 6% of its annual revenue to some sort of fraudulent activity. Research shows similar effects for the other disguised low-trust taxes as well.
Think about it this way: When trust is low, in a company or in a relationship, it places a hidden "tax" on every transaction: every communication, every interaction, every strategy, every decision is taxed, bringing speed down and sending costs up. My experience is that significant distrust doubles the cost of doing business and triples the time it takes to get things done.
By contrast, individuals and organizations that have earned and operate with high trust experience the opposite of a tax—a "dividend" that is like a performance multiplier, enabling them to succeed in their communications, interactions, and decisions, and to move with incredible speed. A recent Watson Wyatt study showed that high trust companies outperform low trust companies by nearly 300%!
I contend that the ability to establish, grow, extend, and (where needed) restore trust among stakeholders is the critical competency of leadership needed today. It is needed more than any other competency. Engendering trust is, in fact, a competency that can be learned, applied, and understood. It is something that you can get good at, something you can measure and improve, something for which you can "move the needle." You cannot be an effective leader without trust. As Warren Bennis put it, "Leadership without mutual trust is a contradiction in terms."
How do the best leaders build trust?
The first job of any leader is to inspire trust. Trust is confidence born of two dimensions: character and competence. Character includes your integrity, motive, and intent with people. Competence includes your capabilities, skills, results, and track record. Both dimensions are vital.
With the increasing focus on ethics in our society, the character side of trust is fast becoming the price of entry in the new global economy. However, the differentiating and often ignored side of trust—competence—is equally essential. You might think a person is sincere, even honest, but you won't trust that person fully if he or she doesn't get results. And the opposite is true. A person might have great skills and talents and a good track record, but if he or she is not honest, you're not going to trust that person either.
The best leaders begin by framing trust in economic terms for their companies. When an organization recognizes that it has low trust, huge economic consequences can be expected. Everything will take longer and everything will cost more because of the steps organizations will need to take to compensate for their lack of trust. These costs can be quantified and, when they are, suddenly leaders recognize how low trust is not merely a social issue, but that it is an economic matter. The dividends of high trust can be similarly quantified, enabling leaders to make a compelling business case for trust.
The best leaders then focus on making the creation of trust an explicit objective. It must become like any other goal that is focused on, measured, and improved. It must be communicated that trust matters to management and leadership. It must be expressed that it is the right thing to do and it is the economic thing to do. One of the best ways to do this is to make an initial baseline measurement of organizational trust and then to track improvements over time.
The true transformation starts with building credibility at the personal level. The foundation of trust is your own credibility, and it can be a real differentiator for any leader. A person's reputation is a direct reflection of their credibility, and it precedes them in any interactions or negotiations they might have. When a leader's credibility and reputation are high, it enables them to establish trust fast—speed goes up, cost goes down.
There are 4 Cores of Credibility, and it's about all 4 Cores working in tandem: Integrity, Intent, Capabilities, and Results. Part of building trust is understanding—clarifying—what the organization wants and what you can offer them. Be the one that does that best. Then add to your credibility the kind of behavior that builds trust. (see the 13 high trust behaviors below). Next, take it beyond just you as the leader and extend it to your entire organization. The combination of that type of credibility and behavior and organizational alignment results in a culture of high trust.
Consider the example of Warren Buffett—CEO of Berkshire Hathaway (and generally considered one of the most trusted leaders in the world)—who completed a major acquisition of McLane Distribution (a $23 billion company) from Wal-Mart. As public companies, both Berkshire Hathaway and Wal-Mart are subject to all kinds of market and regulatory scrutiny. Typically, a merger of this size would take several months to complete and cost several million dollars to pay for accountants, auditors, and attorneys to verify and validate all kinds of information. But in this instance, because both parties operated with high trust, the deal was made with one two-hour meeting and a handshake. In less than a month, it was completed. High trust, high speed, low cost.
13 Behaviors of High-Trust Leaders Worldwide
I approach this strategy primarily as a practitioner, both in my own experience and in my extensive work with other organizations. Throughout this learning process, I have identified 13 common behaviors of trusted leaders around the world that build—and allow you to maintain—trust. When you adopt these ways of behaving, it's like making deposits into a "trust account" of another party.
1. Talk Straight
2. Demonstrate Respect
3. Create Transparency
4. Right Wrongs
5. Show Loyalty
6. Deliver Results
7. Get Better
8. Confront Reality
9. Clarify Expectations
10. Practice Accountability
11. Listen First
12. Keep Commitments
13. Extend Trust
Remember that the 13 Behaviors always need to be balanced by each other (e.g., Talk Straight needs to be balanced by Demonstrate Respect) and that any behavior pushed to the extreme can become a weakness.
Depending on your roles and responsibilities, you may have more or less influence on others. However, you can always have extraordinary influence on your starting points: Self-Trust (the confidence you have in yourself—in your ability to set and achieve goals, to keep commitments, to walk your talk, and also with your ability to inspire trust in others) and Relationship Trust (how to establish and increase the trust accounts we have with others).
The job of a leader is to go first, to extend trust first. Not a blind trust without expectations and accountability, but rather a "smart trust" with clear expectations and strong accountability built into the process. The best leaders always lead out with a decided propensity to trust, as opposed to a propensity not to trust. As Craig Weatherup, former CEO of PepsiCo, said, "Trust cannot become a performance multiplier unless the leader is prepared to go first."
The best leaders recognize that trust impacts us 24/7, 365 days a year. It undergirds and affects the quality of every relationship, every communication, every work project, every business venture, every effort in which we are engaged. It changes the quality of every present moment and alters the trajectory and outcome of every future moment of our lives—both personally and professionally. I am convinced that in every situation, nothing is as fast as the speed of trust.
Educational Administration Quarterly 2016, Vol. 52(4) 675–706
© The Author(s) 2016. Reprints and permissions: sagepub.com/journalsPermissions.nav
DOI: 10.1177/0013161X16652202
eaq.sagepub.com
Article
Teacher Trust in District Administration: A Promising Line of Inquiry
Curt M. Adams1 and Ryan C. Miskell1
Abstract Purpose: We set out in this study to establish a foundation for a line of inquiry around teacher trust in district administration by (1) describing the role of trust in capacity building, (2) conceptualizing trust in district administration, (3) developing a scale to measure teacher trust in district administration, and (4) testing the relationship between district trust and teacher commitment. Method: Teachers were the unit of analysis. Data were collected from a sample of teachers in one urban school district. Construct validity was assessed by examining content, structural, and convergent validity of the scale. A fully latent structural equation model was used to test the relationship between teacher trust in district administration and teacher commitment. Results: This study makes a strong case for developing a line of research on teacher trust in district administration. It establishes a good measure to use in future research, and it provides initial evidence showing that teacher beliefs are sensitive to the actions of district administrators. Implications: A valid and reliable measure can be used by researchers to study systematically the formation and effects of teacher trust in district administration. Accurate information on district trust also allows central office leaders to formatively assess the capacity of the school system to accomplish reform objectives at scale.
1University of Oklahoma, Tulsa, OK, USA
Corresponding Author: Curt M. Adams, University of Oklahoma, 4502 E. 41st Street, Tulsa, OK 74135, USA. Email: [email protected]
Keywords trust in district administration, district leadership, capacity building, teacher commitment
Two general findings on school reform provide the backdrop for this study. First, uneven progress in raising achievement and reducing achievement gaps has shifted the reform focus from individual schools to the larger district context in which schools are nested (Daly, Liou, & Moolenaar, 2014; A. Hargreaves & Shirley, 2009; Harris, 2011; Harris & Chrispeels, 2006; Sharrat & Fullan, 2009). Second, school systems making measurable and sustainable improvements in teaching and learning have done so by building the capacity of educators to learn from practice (King & Bouchard, 2011). Considerable evidence within the United States and abroad indicates that top performing school systems are distinguished by their capacity to continuously enact changes that produce better processes and outcomes for students across the entire system (Chenowith, 2007; Darling-Hammond, 2005; Fullan, 2010; D. H. Hargreaves, 2011; Mourshed, Chinezi, & Barber, 2010).
Capacity is not a tool or resource that districts purchase and input into schools. It emerges through a relational context that supports information exchange, knowledge creation, and purposeful action (Forsyth, Adams, & Hoy, 2011; Fullan, 2008; Harris, 2011). A move toward capacity building requires careful thought and action for how district leaders organize and coordinate the work of schools and teachers (Sharrat & Fullan, 2009). Leaders who centralize too much control at the top threaten to constrict knowledge creation and adaptation at the school level, whereas too little coordination across schools tends to produce unequal learning opportunities and outcomes (Darling-Hammond, 2005; Honig & Hatch, 2004). The tight–loose balance is a tricky dance for district leaders to master. Successful execution requires a relational context that synchronizes district, school, and teacher actions. Herein enters trust. Trust operates like glue by connecting individuals and groups to a common purpose and as a lubricant that eases collaboration and cooperation among interdependent actors (Forsyth et al., 2011; Tschannen-Moran, 2004, 2014).
The school trust literature is extensive, spanning over 30 years and including diverse school samples (national and international), different trust forms, and a variety of research methods (Forsyth et al., 2011; Tschannen-Moran, 2014). We know from this literature that trust is antecedent to important educational processes and outcomes: professional learning, instructional change, collective action, collaboration, school outreach to parents, knowledge creation, student achievement, school identification, motivation, and school performance
(Forsyth et al., 2011; Bryk & Schneider, 2002; Tschannen-Moran, 2014). Even though the trust evidence runs deep, it is limited almost entirely to role relationships within schools (e.g., teacher–teacher, teacher–principal, student–teacher, teacher–parent). Research on trust in district administration is scarce, leading to an impoverished explanation for how decisions and actions made at the executive level affect the attitudes and behaviors of individuals whose collective actions can energize or constrain capacity.
On the surface, relational ties between teachers and district administration would seem to function as a critical conduit for greater school and district capacity. Existing evidence, however, does not provide much explanation for the teacher–district administration relationship. We argue that teacher trust in district administration is a useful line of inquiry for understanding the pathway by which district leaders work through teachers to ensure that students receive a learning experience that prepares them for a purposeful life. We set out in this study to establish a foundation for this line of research by (1) describing the role of trust in capacity building, (2) conceptualizing trust in district administration, (3) developing a scale to measure teacher trust in district administration, and (4) testing the relationship between district trust and teacher commitment.
District Context, Capacity, and Trust
School districts operate as intermediary agents in the larger educational system (Firestone, 2009; Harris & Chrispeels, 2006; Sharrat & Fullan, 2009); they span boundaries between federal/state policy, community interests and values, and local school needs. As intermediary agents, district leaders control how they implement policies and respond to external pressure (Honig & Hatch, 2004). This is an important function. Policy implementation is where the real, messy work of change unfolds (Darling-Hammond, 2005; Darling-Hammond et al., 2006; Honig & Hatch, 2004; McLaughlin, 1990), and district leaders retain considerable discretion in the design and use of strategies to achieve goals and objectives set forth in state and federal policy (Firestone, 2009). Trust in district leaders seems, on the surface, to be a condition that enables teachers and central administrations to work cooperatively toward shared goals and aims.
Firestone (2009) describes three types of district contexts that uniquely affect the role of trust for policy implementation and improvement efforts. First, an accountability context tightens control over the instructional core by standardizing teaching practices and materials, closely monitoring teaching and student outcomes, sanctioning poor performance when goals are not met, and rewarding high achievement. External control, more so than trust,
regulates decisions and behaviors in an accountability context (Forsyth & Adams, 2014). Second, a loosely coupled system has many moving parts with little to no coherence at the district level. Loosely coupled districts tend to cycle through programs and interventions, chase money, set unrelated goals, and create silos of decision making and action (Firestone, 2009).
Finally, a learning context, the ideal type of district environment to sustain quality performance (Firestone, 2009), uses capacity building to drive system-wide reform (Darling-Hammond, 2005; Fullan, 2008, 2014). Capacity emerges as relational connections enable individuals and schools to learn from their experiences (Adams, 2013; D. H. Hargreaves, 2011). As Sharrat and Fullan (2009) argue, "Capacity is a highly complex, dynamic, knowledge-building process" (p. 8). The work of district leaders is to build the social infrastructure by which schools create knowledge to accurately diagnose if, how, and why improvement strategies are leading to more effective performance (Levin, 2008). For capacity to grow across a district, superintendents and other central office leaders need to be trusted.
There is nothing innovative or particularly new about features of high-capacity school systems. They set clear expectations for quality instruction and student learning, establish a coherent and aligned curriculum, use process and outcome data to study variation in teaching and student performance, and attract, retain, and develop talented educators (Fullan, 2010; Harris, 2011). What these characteristics have in common is the reliance on trust to make structures, processes, and practices functional by igniting behavior that leads to improved teaching and learning. Trust enhances cooperation (Tschannen-Moran, 2014), enriches openness (Hoffman, Sabo, Bliss, & Hoy, 1994), promotes cohesiveness (Zand, 1997), facilitates knowledge creation (Adams, 2013), deepens change (Daly, Moolenaar, Bolivar, & Burke, 2010), and builds school capacity (Cosner, 2009). System-wide reform carried out through a culture of trust is the difference between knowing characteristics that define effective districts and how to actually make schools better places to teach and learn. The latter moves school organizations forward while the former leaves behind many unfilled promises.
Capacity does not form simply by establishing quantitative performance targets, inputting new resources into schools, adopting new teacher evaluation models, and holding people accountable for results (Harris, 2011; King & Bouchard, 2011; Sharrat & Fullan, 2009). Rather, capacity grows as social and psychological barriers to change are replaced by a culture that values risk taking, experimentation, cooperation, and collective problem solving (Schein, 1996; Spillane, Reiser, & Reimer, 2002). Open and connected social networks provide the relational structure for deep knowledge creation (Daly et al., 2010; Daly et al., 2014), while trust establishes the psychological safety
to ask tough questions about the effectiveness of strategies, resources, and practices (Garvin, Edmondson, & Gino, 2008). If knowledge creation drives capacity as many scholars claim, then trust is the ignition that starts the process moving forward.
Teacher Trust in District Administration: Conceptual Definition
Trust is a complex phenomenon that exists at different analytical levels (Van Maele, Van Houtte, & Forsyth, 2014). It has been studied as a personality trait (Rotter, 1967; Zand, 1972), a relational condition (Bryk & Schneider, 2002), a group norm (Forsyth et al., 2011), and an organizational property (Shamir & Lapidot, 2003). Despite its different forms and characteristics, scholars have reached agreement on some general attributes of trust. These attributes are reflected in the definition used in this study. Trust is a teacher’s willingness to risk vulnerability based on the confidence that district administrators act benevolently, competently, openly, honestly, and reliably (Hoy & Tschannen-Moran, 1999; Mishra, 1996).
Teacher trust in district administration is conceptualized and measured as a type of relational trust that forms as teachers observe and judge the actions and intentions of district leaders. Bryk and Schneider (2002) argue that relational trust is fundamentally an intrapersonal phenomenon that emerges through interactions that occur within defined role relationships. We argue that teacher trust in district administration manifests itself as a teacher perception, not a collective property of a school faculty or a normative condition of the school district. Trust that is measured as an individual belief is substantively different than a collective property (Forsyth et al., 2011). A collective property is the representation of the assumptions, beliefs, and values held in common by members of a role group (Forsyth et al., 2011); it reflects a group norm. We aim to capture trust beliefs of individual teachers, not the collective perception of a teaching faculty in a school.
Relational trust corresponds to the unique social structure of school organizations (Schneider, Judy, Ebmeye, & Broda, 2014). School systems have been described as a complex web of social actors who interact within established role relationships and defined responsibilities (Bryk & Schneider, 2002; Schneider et al., 2014; Van Maele et al., 2014). Critical role relationships affecting school processes and practices extend throughout the larger social system in which schools are embedded. To illustrate, school districts cannot accomplish system-wide goals without teachers, and teachers in turn depend on district leaders to organize teaching and learning in ways that maximize teacher potential and support their effectiveness in the classroom. Trust
enables teachers to risk vulnerability by placing themselves in an uncertain position based on confidence that district leaders will respond in ways that are not detrimental to their instructional effectiveness or professional growth (Van Maele et al., 2014).
Social exchanges between teachers and district administrators may not be as frequent or direct as they are between teachers and principals, but hierarchical boundaries and organizational constraints do not eliminate opportunities for teachers to evaluate the collective actions of central office leaders. Teachers follow decisions made at board meetings, read stories communicated in the media, have access to intradistrict newsletters, receive messages through site administrators, and have informal conversations with colleagues. Direct social exchanges combined with other observations build a body of evidence that teachers use to interpret decisions and actions of district leaders. This evidence becomes the wellspring of trust discernments.
Trust in District Administration Scale Development
Before we could proceed with an empirical test, we needed to develop a scale to measure teacher trust in district administration. Similar to existing school trust measures, items forming the Teacher Trust in District Administration Scale operationalize district actions that align with the trust facets. Teacher trust is observable in the perceived benevolence, competence, openness, reliability, and honesty of district leaders. The major conceptual difference between teacher trust in district administration and existing school trust scales (e.g., Omnibus T-Scale, Parent Trust in School Scale, Student Trust in Teacher Scale) is the unit of measurement. Collective trust measures are written so the trustee and trustor reflect the collective group. Teacher trust in district administration is conceived of and measured as an individual teacher belief. The trustor is the individual teacher, not the collective teaching faculty in a school. Thus, items capturing benevolence, competence, openness, honesty, and reliability reflect the collective action of district administration yet are written at a level that measures individual teacher beliefs, not teachers’ shared perceptions of district leaders. We now turn to a description of the facets and proposed scale items.
Benevolence in interpersonal exchanges relates to confidence that one’s interest or something one cares about will be protected by the trustee (Baier, 1986; Mishra, 1996). In an organizational setting, benevolence extends beyond basic care and concern for a single individual to a belief that the interests and well-being of the collective are preserved (Barber, 1983; Ouchi, 1981). Goodwill toward the collective defines benevolence items for trust in
district administration. Teachers depend on district administrators to act in ways that express their care and concern for the school community as a whole. Often this involves valuing the expertise and work of teachers. District administrators express benevolence by how they work with teachers to build a school environment where teaching and learning thrive. Items capturing benevolence include the following: District administrators value the expertise of teachers. District administrators show concern for the needs of my school.
Competence reflects the trustee’s perceived ability to perform tasks that are required of his/her position (Gabarro, 1987). Trust diminishes if individuals and groups do not demonstrate the competencies needed for successful performance. For educational systems, competence is reflected in knowing how to organize teaching and learning in ways that bring out the best in teachers and students alike (Darling-Hammond, 2005). Teacher discernment of competence hinges on the ability of district leaders to create an organizational environment that develops instructional expertise. Items capturing competence include the following: District administrators demonstrate knowledge of teaching and learning. District administrators have established a coherent strategic plan for the district.
Openness functions like a valve in trust production. When individuals are open, their intentions flow freely; when they are closed, intentions remain hidden, raising doubts about one’s motives and future actions (Adams, 2010; Mishra, 1996). Openness by district administration is observable in two types of actions. The first involves the transparency with which decisions are made and performance information is communicated to teachers (Tschannen-Moran & Hoy, 1998). The second is the willingness of district leaders to listen to teachers and to understand their experiences. Openness connects teachers to the larger district vision by providing them with influence over school improvement efforts and regularly communicating important decisions and information (Fullan, 2008). Items capturing openness include the following: District administrators are open to teacher ideas about school improvement. District administrators are transparent in making strategic decisions about district performance.
Honesty is based on integrity and truthfulness (Hoy & Tarter, 2004). Disingenuous actions can elicit a degree of suspicion that raises red flags about the sincerity and intentions of leaders. Suspicion dampens a willingness of teachers to risk vulnerability (Tschannen-Moran, 2004, 2014). In contrast, honesty builds trust by increasing confidence in the actions of the trustee (Mishra, 1996). Honest behavior reflects both the accuracy of information communicated to others and the acceptance of responsibility for decisions and actions (Adams, 2010). District administrators who hide facts,
blame others, or cover up mistakes damage the integrity of the central office and jeopardize its ability to advance an agenda that improves learning for all students (Sharrat & Fullan, 2009). Items capturing honesty include the following: District administrators often say one thing and do another. District administrators take personal responsibility for their actions and decisions.
Reliability relates to predictable and consistent behavior (Butler & Cantrell, 1984). For interpersonal exchanges, reliability gets measured in terms of perceived fairness (Adams & Forsyth, 2009), but at the organizational level, reliability reflects consistent and dependable action (Mishra, 1996). In a school district, reliability can be found in coherent and consistent improvement efforts centered on a predictable instructional focus (Sharrat & Fullan, 2009). Districts that cycle through programs and chase after the latest research-based strategies cannot establish the consistency of practice needed for continuous improvement (Firestone, 2009; Fullan, 2014). Items measuring reliability include the following: District administrators follow through on commitments. District administrators are committed to the stated goals of the district.
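For readers who want to see how responses to the 10 items might be combined into a single score, the following is a minimal scoring sketch. It assumes a 1-to-6 Likert response format and treats the negatively worded honesty item as reverse-coded; neither assumption is stated in the article, and the variable names are shorthand stand-ins for the items listed above.

```python
import numpy as np

# Ten items, two per facet (benevolence, competence, openness, honesty, reliability).
# "says_one_thing_does_another" is negatively worded and is reverse-coded here,
# an illustrative choice rather than the authors' documented scoring rule.
ITEMS = [
    "values_teacher_expertise", "concern_for_school_needs",       # benevolence
    "knowledge_of_teaching", "coherent_strategic_plan",           # competence
    "open_to_teacher_ideas", "transparent_decisions",             # openness
    "says_one_thing_does_another", "takes_responsibility",        # honesty
    "follows_through_on_commitments", "committed_to_stated_goals" # reliability
]
REVERSE = {"says_one_thing_does_another"}
SCALE_MAX, SCALE_MIN = 6, 1  # assumed 1-6 Likert format

def trust_score(responses: dict) -> float:
    """Average the ten items into one trust score, reverse-coding where needed."""
    values = []
    for item in ITEMS:
        x = responses[item]
        if item in REVERSE:
            x = SCALE_MAX + SCALE_MIN - x
        values.append(x)
    return float(np.mean(values))

# Hypothetical teacher response: agreement (5) with the positive items,
# disagreement (2) with the negatively worded item.
example = {item: 5 for item in ITEMS}
example["says_one_thing_does_another"] = 2
print(round(trust_score(example), 2))  # -> 5.0
```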
Benevolence, competence, openness, honesty, and reliability establish criteria by which teachers judge the collective actions of district administrators. Criteria for trust discernments, however, do not explain how judgments of trustworthy behavior interact in the cognitive process to elicit an overall trust belief. Two distinct arguments on the nature of trust beliefs have emerged. First, psychometric evidence on several existing trust scales (see Forsyth et al., 2011; Tschannen-Moran, 2004, 2014) implies that trust is a one-dimensional construct with the five facets sharing variance around a single factor. Second, a more recent belief has been advanced by Romero (2010) and Makiewicz and Mitchell (2014), who conceptualized and measured trust as a second-order construct, specifying trust facets as related but distinct factors. These two different conceptualizations call attention to the factor structure of the trust concept. Do trustworthy behaviors cohere around a single, latent trust variable or do they cohere around distinct trust facets?
Teacher trust in district administration is hypothesized to be a single-factor construct (Figure 1). Items derive from trust facets, but facets are not independent behaviors; they are interrelated and converge in the discernment process to form a unitary belief (Forsyth et al., 2011; Hoy & Tschannen-Moran, 1999; Tschannen-Moran, 2004). That is, trust facets are inextricably related; the absence of one affects the presence of others (Adams, 2010). To illustrate, district administrators who regularly listen to teachers and solicit teacher feedback are likely to be perceived as open, benevolent, and competent. The weight of the existing empirical evidence supports specifying trust as a single-factor construct. Exploratory factor analyses from multiple samples of
the Omnibus Trust Scale, Student Trust in Teachers Scale, and Parent Trust in School Scale consistently find that scale items load strongly on one factor (Forsyth et al., 2011).
Validation Study
The purpose of the validation study was to evaluate the construct validity of the Teacher Trust in District Administration Scale before using the measure in the empirical test. Construct validity refers to the ability of a measure to yield truthful judgments about the object it purports to measure (Messick, 1995; Miller, 2008). While there are different aspects of construct validity (content, discriminant, convergent, consequential, etc.), these aspects share the same fundamental logic—that validity exists to the degree that the measure represents the underlying theoretical construct and informs credible judgments about the phenomenon of interest (Cronbach, 1971; Messick, 1995).
Construct validity was assessed by examining content, structural, and convergent validity. Evidence for content validity comes from two sources. First, similar to existing trust scales, items were written to capture the different facets of trustworthy behavior. Second, the 10 items were submitted to a group of 30 educators to assess item clarity and alignment with the trust facets. This review resulted in support for the theoretical alignment between the items and the underlying trustworthy behaviors. Additionally, respondent feedback revealed behavioral characteristics of district administrators that potentially link to more than one facet. For example, the item, District administrators often say one thing and do another, was identified as representing openness, honesty, and reliability. Similarly, the item, District administrators value the expertise of teachers, was largely identified as benevolence, but a few respondents felt it was also a reflection of competence.
Figure 1. Hypothesized single-factor model of teacher trust in district administration. Note. B = Benevolence items; C = Competence items; O = Openness items; H = Honesty items; R = Reliability items.
Structural validity and convergent validity were examined in a field test with more than 800 teachers from an urban district in a southwestern city. Confirmatory factor analysis (CFA) in AMOS 19.0 was used to test the factor structure of the scale. Building and testing a measurement model a priori has two advantages over a traditional exploratory approach. First, CFA models are guided by theory. The empirical relationships found in the sample data will either support or repudiate the underlying logic of the measure. Second, CFA is useful for evaluating comparative model structure and fit by testing different theoretical specifications of the observed and latent features of the construct (Thompson & Daniel, 1996). In this case, a comparative analysis was used to test the hypothesized model by comparing estimates against a second-order specification of district trust.
Sample
Teachers were the unit of analysis. Data were collected in February of 2013 from a random sample of teachers in an urban school district. A roster of certified teaching faculty in each school was provided to the researchers by the school district. Half of the teachers in each school were randomly sampled to receive a survey with the 10 teacher trust in district administration items and additional survey questions about academic emphasis and perceptions of the teacher evaluation model. This resulted in a sample of 1,305 teachers in 73 schools. Of the 1,305 teachers, 849 completed and returned usable surveys for a response rate of 65%. Teachers in the sample averaged 13 years of teaching experience and 6 years in their current school, approximately 9% were nationally board certified, and 85% were female (Table 1).
The school district is located in a southwestern city with a metropolitan population of around 900,000 residents. The district serves approximately 42,000 students across 88 different educational sites. Student demographics in the district include 30% Hispanic, 27% African American, 27% Caucasian, 8% multiracial, 6% American Indian, and 1% Asian. Seventy-nine percent of the students qualified for the federal lunch subsidy. The district employs approximately 2,978 certified staff. Nearly 49% have more than 11 years in the district, 123 are nationally board certified, 30% are racial minorities, and 1,173 hold graduate degrees.
Measures
Convergent validity was examined by correlating teacher trust in district administration with teacher perceptions of the performance-based teacher evaluation system. Teacher evaluation has become a central feature of district reform. Trust in district administration would, on the face of it, appear to be related to teacher perceptions of the teacher evaluation model. Three items measuring teacher favorableness of the evaluation system were taken from Milanowski and Heneman's (2001) teacher evaluation survey. The items capture teacher perceptions of the system as a whole, the evaluation process itself, and its perceived valence. The items include the following: "I am satisfied with the new evaluation system." "The evaluation process takes more time than it is worth" (reverse scored). "The new system makes working in the district more attractive to me." The Likert-type response set ranges from Strongly Disagree (coded as 1) to Strongly Agree (coded as 6).
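Reverse scoring is a small step that is easy to get wrong, so a brief illustration may help. The sketch below is not the authors' code; the column names are hypothetical, and it simply applies reversed = 7 - x for a 1 to 6 response set.

import pandas as pd

# Hypothetical responses to the three evaluation items on the 1-6 Likert scale.
df = pd.DataFrame({
    "te_satisfied": [5, 2, 6],
    "te_time_worth": [2, 5, 1],   # negatively worded item, needs reverse scoring
    "te_attractive": [4, 3, 6],
})

# On a 1-6 scale, a reversed score is (scale maximum + 1) - response, i.e., 7 - x.
df["te_time_worth_rev"] = 7 - df["te_time_worth"]
print(df)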
Analysis
For structural validity, a model generating approach using maximum likelihood estimation was used in AMOS 19.0 to evaluate the hypothesized single-factor specification of trust against the alternative second-order model. The first step was to build and test the single-factor model with all 10 items loading on trust. The second step was to build and test a second-order model with each facet conceptualized as a first-order factor of the latent second-order trust concept. This second step addresses Moss' (1995) argument that "construct validation is most efficiently guided by the test of plausible rival hypotheses" (pp. 6-7). Specifying trust as a second-order model reflects the rival hypothesis advanced by Romero (2010) and Makiewicz and Mitchell (2014). Fit indices, parameter estimates, and residuals were used to evaluate the two competing models. The absolute fit index was the root mean square error of approximation (RMSEA). Relative fit indices included the normed fit index (NFI), Tucker-Lewis index (TLI), and comparative fit index (CFI).
Table 1. Teacher Demographic Information.
Mean SD Min Max
Years experience 13.11 9.10 1 30
Years in current school 6.13 6.4 1 30
National board 0.09 0.28 0 1
Female 0.85 0.36 0 1
Note. N = 849 teachers.
The final step was a model trimming process for both the simple-factor hypothesized model and the alternative second-order construct. Model modification is a conventional and acceptable approach for identifying a parsimonious model with the best fit (Schreiber, Nora, Stage, Barlow, & King, 2006). When trimming models for parsimony, it is critical to preserve the theoretical specification. For the hypothesized model, items were trimmed from 10 to 5. For the second-order model, first-order factors were collapsed from 5 to 3. This decision was based on the models advanced and tested by Romero (2010) and Makiewicz and Mitchell (2014), who measured trust as consisting of three factors: competence, benevolence, and integrity.
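For readers who want to reproduce this kind of comparison outside AMOS, the sketch below shows how the single-factor and second-order specifications could be set up with the open-source semopy package in Python. It is only an illustration under assumptions: the item names follow Table 2, but the data file, the assignment of items to facets, and the use of semopy rather than the authors' AMOS 19.0 workflow are ours.

import pandas as pd
from semopy import Model, calc_stats

df = pd.read_csv("teacher_trust_items.csv")  # hypothetical file of item responses

# Hypothesized model: all 10 items load on a single latent trust factor.
single_factor = "Trust =~ FTDist1 + FTDist2 + FTDist3 + FTDist4 + FTDist5 + FTDist6 + FTDist7 + FTDist8 + FTDist9 + FTDist10"

# Alternative model: items load on five facets, which load on a second-order factor.
# The item-to-facet mapping below is an assumption for illustration.
second_order = """
Benevolence =~ FTDist1 + FTDist2
Competence =~ FTDist3 + FTDist4
Openness =~ FTDist5 + FTDist6
Honesty =~ FTDist7 + FTDist8
Reliability =~ FTDist9 + FTDist10
Trust =~ Benevolence + Competence + Openness + Honesty + Reliability
"""

for name, desc in [("single factor", single_factor), ("second order", second_order)]:
    model = Model(desc)
    model.fit(df)                     # maximum likelihood estimation
    stats = calc_stats(model)         # chi-square, RMSEA, CFI, TLI, NFI, and more
    print(name)
    print(stats[["DoF", "chi2", "RMSEA", "CFI", "TLI", "NFI"]])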
For convergent validity, we looked at the relationship between teacher trust in district administration and teacher favorableness of the teacher evaluation framework. Based on CFA results, we specified trust as a simple-factor latent variable composed of five observed variables representing each trust facet. Teacher favorableness of the evaluation framework was also treated as a latent variable so as to account for measurement error in the analysis.
Harman's single-factor test was used to evaluate the degree to which common method bias may confound the estimated relationships in the convergent validity test. Common method bias reflects variance that is attributed to characteristics of the measurement itself and not an underlying relationship between constructs (Meade, Watson, & Kroustalis, 2007). The Harman test estimates the potential problem of common method variance by subjecting items from all three constructs to an exploratory factor analysis (Podsakoff, MacKenzie, Lee, & Podsakoff, 2003). It is assumed that measurement bias exists if a single factor emerges to explain variance in the combined items (Meade et al., 2007). Results of our test showed that two factors were extracted with eigenvalues greater than one. Items had the strongest loadings on their conceptual factors (see Appendix A).
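As a rough sketch of how such a check can be run (this is not the authors' exact procedure, and the file and column prefixes are assumptions), the pooled items can be inspected for a dominant first factor by examining the eigenvalues of the item correlation matrix and the share of variance carried by the leading factor.

import numpy as np
import pandas as pd

df = pd.read_csv("survey_items.csv")           # hypothetical pooled survey data
items = df.filter(regex="^(FTDist|TE)")        # trust and evaluation items (assumed prefixes)

# Eigenvalues of the item correlation matrix, sorted from largest to smallest.
corr = items.corr().to_numpy()
eigenvalues = np.sort(np.linalg.eigvalsh(corr))[::-1]

n_above_one = int((eigenvalues > 1).sum())
first_share = eigenvalues[0] / eigenvalues.sum()

print(f"Factors with eigenvalues > 1: {n_above_one}")
print(f"Variance carried by the first factor: {first_share:.0%}")
# A single dominant factor (only one eigenvalue > 1, or one factor absorbing most
# of the shared variance) would signal a common method bias concern.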
The final analysis was to check our specification of district trust as an individual teacher belief. To do this, we estimated two types of intraclass correlation coefficients (ICCs): an ICC-2 from a one-way ANOVA, computed as (Mean square between − Mean square within) / Mean square between, and an ICC-1 from an HLM unconditional model. ICC-2 is a measure of within-group homogeneity and assesses the degree to which individual teacher responses cluster around the school mean (Glisson & James, 2002). It is different from the HLM-derived ICC-1 in that the HLM model estimates between-group differences in teacher trust, not the cohesiveness of teacher trust perceptions within schools. Both estimates are necessary to justify aggregation of school-level variables because within-group consistency and between-group variability occur independent of each other (Glisson & James, 2002).
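A minimal sketch of both estimates is given below, assuming a long-format data set with one row per teacher, a school identifier, and a composite trust score. The column names and file are hypothetical, and the ICC-1 is taken from an unconditional mixed model fit with statsmodels rather than HLM.

import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("teacher_trust_scores.csv")   # hypothetical columns: school, trust

# ICC-2 from a one-way ANOVA: (MS between - MS within) / MS between.
grand_mean = df["trust"].mean()
groups = df.groupby("school")["trust"]
n_per_school = groups.size()
ss_between = (n_per_school * (groups.mean() - grand_mean) ** 2).sum()
ss_within = ((df["trust"] - groups.transform("mean")) ** 2).sum()
ms_between = ss_between / (groups.ngroups - 1)
ms_within = ss_within / (len(df) - groups.ngroups)
icc2 = (ms_between - ms_within) / ms_between

# ICC-1 from an unconditional (intercept-only) multilevel model:
# share of total variance that lies between schools.
model = smf.mixedlm("trust ~ 1", df, groups=df["school"]).fit()
var_between = float(model.cov_re.iloc[0, 0])
var_within = model.scale
icc1 = var_between / (var_between + var_within)

print(f"ICC-2 = {icc2:.2f}, ICC-1 = {icc1:.2f}")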
Results
Preliminary data screening confirmed that sample data were normally distributed. Skewness and kurtosis for all trust items were below 1.0, and the quantile-quantile plot of the items indicated that sample data aligned with a normal distribution. Furthermore, we did not find any outlier cases that would bias the estimates. Item means, standard deviations, skewness, kurtosis, and correlation coefficients appear in Table 2. Notable from these results is the statistically significant and strong relationship among all 10 items. Next, results are organized by tests of structural validity and convergent validity.
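A brief sketch of this screening step (with hypothetical file and column names, not the authors' code) is shown below; pandas reports skewness and excess kurtosis per item, and scipy can draw the normal quantile-quantile plot.

import pandas as pd
from scipy import stats
import matplotlib.pyplot as plt

items = pd.read_csv("teacher_trust_items.csv").filter(like="FTDist")  # hypothetical file

screening = pd.DataFrame({
    "mean": items.mean(),
    "sd": items.std(),
    "skewness": items.skew(),     # adjusted Fisher-Pearson skewness
    "kurtosis": items.kurt(),     # excess kurtosis (0 for a normal distribution)
})
print(screening.round(2))

# Normal quantile-quantile plot for one item as an example.
stats.probplot(items["FTDist1"], dist="norm", plot=plt)
plt.show()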
Structural Validity
For structural validity, we first examined the fit indices to compare the hypothesized to the alternative specification. Estimates reported in Table 3 show good fit for the hypothesized model. Chi-square was statistically significant, but RMSEA, NFI, TLI, and CFI all exceeded acceptable thresholds, suggesting a reasonably good fit between the hypothesized model and the observed data. Estimates of the alternative, second-order model did not meet established thresholds for good fit. Chi-square was considerably larger than the single-factor hypothesized model, and RMSEA, NFI, TLI, and CFI were all below the acceptable criteria for a good fitting model.
Parameter estimates and residuals lend further support for conceptualizing and measuring trust as a simple, single-factor construct. For the simple-factor hypothesized model, all parameter estimates were statistically significant and above .70, suggesting strong relationships between the latent trust construct and each item (Table 4). Parameter estimates and residuals for the alternative second-order model point to multicollinearity concerns among the first-order factors. Three of the latent first-order factors had estimates of more than 1.0 and negative residuals. Factor loadings over 1.0 and negative error variances are not possible outcomes in an actual population (Kolenikov & Bollen, 2012).
Negative residuals indicate a Heywood case stemming from a mis-identified model, outliers in the sample data, sampling fluctuations, or mis-specified structural relations (Kolenikov & Bollen, 2012). We can rule out problems with the sample data, model identification, and sampling fluctuation because the hypothesized model did not have the same estimation problems. This leaves a specification problem as the plausible reason for the Heywood case. We can trace the specific problem to a high degree of multicollinearity among the Benevolence, Openness, and Honesty factors. Parameter estimates over 1 for these factors suggest that the factors are measuring the same cognitive discernment. In other words, benevolence, openness, and honesty are not independent dimensions of trust. They are related facets that cohere in the discernment process (see Tables 5 and 6).

Table 2. Item Means, Standard Deviations, Skewness, Kurtosis, and Correlations.

Item       Mean   SD     Skewness   Kurtosis   Correlations (upper triangle, beginning with the item's own column)
FTDist1    3.09   1.38    .15       −.89       1.0  .53  .58  .66  .59  .63  .59  .57  .60  .63
FTDist2    3.26   1.34   −.08       −.76       1.0  .83  .69  .71  .67  .61  .65  .63  .68
FTDist3    3.49   1.36   −.18       −.75       1.0  .75  .76  .72  .68  .73  .67  .72
FTDist4    3.50   1.32   −.31       −.56       1.0  .76  .85  .76  .70  .76  .77
FTDist5    3.40   1.45   −.13       −.88       1.0  .80  .74  .70  .70  .74
FTDist6    3.71   1.28   −.48       −.29       1.0  .81  .73  .75  .78
FTDist7    3.94   1.29   −.60       −.11       1.0  .74  .77  .75
FTDist8    3.69   1.37   −.43       −.56       1.0  .77  .79
FTDist9    3.64   1.37   −.39       −.62       1.0  .77
FTDist10   3.50   1.45   −.26       −.84       1.0

Note. N = 849 teachers. All correlations are statistically significant at p < .01.
Estimates of the trimmed models provide additional evidence to test the validity of the Teacher Trust in District Administration Scale. The hypothesized simple-factor structure was trimmed from 10 items to 5, one item for each facet. The item with the strongest parameter estimate for each facet was retained. Fit indices for the trimmed hypothesized model show a better overall fit compared to the 10-item measure. Chi-square was considerably smaller and not statistically significant at the .05 level. Additionally, RMSEA, NFI, TLI, and CFI improved. All parameter estimates had a strong, statistically significant relationship with trust. Estimates for the respecified alternative second-order model suggested continued specification problems. Fit indices did not reach the thresholds for good fit, and parameter estimates and residuals point to continued problems with multicollinearity between the latent first-order factors.
Reliability was estimated through an interitem consistency analysis and a split-half reliability test. Both the 10-item simple-structure model and the more parsimonious 5-item model demonstrated strong reliability. Cronbach's alpha was .96 for the 10 items and .93 for the 5 items. For the 10 items, Cronbach's alphas for Part 1 and Part 2 of the split-half results were .90 and .93. Additionally, the correlation between the split forms was .93, with a Spearman-Brown coefficient of .96 and a Guttman split-half coefficient of .96. For the 5 items, Cronbach's alphas were .84 and .91 for the two parts of the split-half, the correlation between the forms was .84, and the Spearman-Brown and Guttman split-half coefficients were both .92.
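These reliability statistics are straightforward to reproduce. The sketch below is not the authors' output; it assumes item columns named FTDist1-FTDist10 and a simple first-half versus second-half split (which may differ from the article's Part 1/Part 2 split), computing Cronbach's alpha from the usual formula and stepping up the split-half correlation with Spearman-Brown.

import pandas as pd

items = pd.read_csv("teacher_trust_items.csv").filter(like="FTDist")  # hypothetical file

def cronbach_alpha(data: pd.DataFrame) -> float:
    # alpha = k / (k - 1) * (1 - sum of item variances / variance of the total score)
    k = data.shape[1]
    item_variances = data.var(axis=0, ddof=1)
    total_variance = data.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Split-half reliability: correlate the two half-scores and apply Spearman-Brown.
first_half = items.iloc[:, : items.shape[1] // 2].sum(axis=1)
second_half = items.iloc[:, items.shape[1] // 2 :].sum(axis=1)
r_halves = first_half.corr(second_half)
spearman_brown = 2 * r_halves / (1 + r_halves)

print(f"Cronbach's alpha: {cronbach_alpha(items):.2f}")
print(f"Split-half r: {r_halves:.2f}, Spearman-Brown: {spearman_brown:.2f}")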
Table 3. Model Fit Indices for the Hypothesized and Alternative Models.
Fit Index (Criteria)   Hypothesized Model (df = 35)   Alternative Five-Factor Model (df = 30)   Hypothesized Trimmed Model (df = 5)   Alternative Three-Factor Model (df = 32)
Chi-square (NS)   188.6**   384.6**   5.0   224.7**
RMSEA (<.05)   .05   .14   .03   .11
NFI (>.95)   .97   .93   .99   .93
CFI (>.95)   .97   .94   .99   .94
TLI (>.95)   .95   .91   .98   .94
Note. N = 849 teachers. RMSEA = root mean square error of approximation; NFI = normed fit index; CFI = comparative fit index; TLI = Tucker–Lewis index. **p < .01.
Table 4. Sample Items, Factor Loadings, and Squared Multiple Correlations for Hypothesized Model and Trimmed Five-Item Model.
Latent Factor and Items   Factor Loadings   Squared Multiple Correlation   Residuals
Latent Trust Factor
  District administrators value the expertise of teachers   .74   .54   .96
  District administrators show concern for the needs of my school   .77   .59   .73
  District administrators demonstrate knowledge of teaching and learning   .83   .68   .58
  District administrators have established a coherent strategic plan for the district   .90   .81   .34
  District administrators are open to teacher ideas about school improvement   .86   .74   .54
  District administrators are transparent in making strategic decisions about district performance   .90   .81   .31
  District administrators often say one thing and do another   .85   .73   .45
  District administrators take personal responsibility for their actions and decisions   .83   .69   .59
  District administrators follow through on commitments   .84   .71   .55
  District administrators are committed to the stated goals of the district   .88   .77   .48
Five-Item Trimmed Model
  District administrators show concern for the needs of my school   .81   .66   .61
  District administrators have established a coherent strategic plan for the district   .91   .82   .31
  District administrators are transparent in making strategic decisions about district performance   .88   .77   .48
  District administrators often say one thing and do another   .78   .61   .74
  District administrators are committed to the stated goals of the district   .84   .71   .61
Note. Factor loadings and squared multiple correlations were statistically significant at p < .01; N = 849 teachers. Degrees of freedom for the 5-item model was 5, and for the 10-item model was 35.
Table 5. Sample Items, Factor Loadings, and Squared Multiple Correlations for Alternative Second-Order Five Factor Model.
Latent Factor and Items   Factor Loading   Squared Multiple Correlation   Residuals
Trust
  Benevolence   1.02   1.05   −.04
  Competence   1.01   1.01   −.02
  Openness   1.02   1.03   −.05
  Honesty   .97   .93   .09
  Reliability   .95   .91   .95
Benevolence
  District administrators value the expertise of teachers   .69   .47   1.0
  District administrators show concern for the needs of my school   .78   .60   .70
Competence
  District administrators demonstrate knowledge of teaching and learning   .84   .71   .54
  District administrators have established a coherent strategic plan for the district   .88   .78   .38
Openness
  District administrators are open to teacher ideas about school improvement   .85   .73   .57
  District administrators are transparent in making strategic decisions about district performance   .87   .75   .53
Honesty
  District administrators often say one thing and do another   .93   .86   .23
  District administrators take personal responsibility for their actions and decisions   .88   .77   .38
Reliability
  District administrators follow through on commitments   .87   .76   .44
  District administrators are committed to the stated goals of the district   .88   .77   .43
Note. Factor loadings and squared multiple correlations were statistically significant at p < .01; N = 849 teachers. Degrees of freedom were 30.
Table 6. Sample Items, Factor Loadings, and Squared Multiple Correlations for Alternative Second-Order Three Factor Model.

Latent Factor and Items   Factor Loading   Squared Multiple Correlation   Residuals
Trust
  Benevolence   .90   .80   .17
  Competence   1.01   1.02   −.02
  Reliability   .96   .92   .11
Benevolence
  District administrators value the expertise of teachers   .66   .44   1.10
  District administrators show concern for the needs of my school   .87   .76   .42
  District administrators are open to teacher ideas about school improvement   .93   .86   .26
Competence
  District administrators demonstrate knowledge of teaching and learning   .88   .76   .35
  District administrators have established a coherent strategic plan for the district   .86   .73   .54
  District administrators are transparent in making strategic decisions about district performance   .87   .76   .28
Reliability
  District administrators often say one thing and do another   .89   .80   .36
  District administrators take personal responsibility for their actions and decisions   .86   .74   .50
  District administrators follow through on commitments   .91   .83   .45
  District administrators are committed to the stated goals of the district   .88   .77   .49

Note. Factor loadings and squared multiple correlations were statistically significant at p < .01; N = 849 teachers. Degrees of freedom were 32.

The combined results establish empirical support for the structural validity of conceptualizing and measuring Teacher Trust in District Administration as a simple-factor construct with shared variance among trust facets. The five-item trust measure had the best fit with the observed sample data, strong factor loadings, and large squared multiple correlations. The five-item measure also displayed strong reliability as assessed through interitem consistency and a split-half reliability test. Results do not lend support for measuring trust as a second-order construct composed of distinct first-order factors. We now turn to evidence of convergent validity.
Convergent Validity
Results for the relationship between teacher trust and teacher perceptions of the evaluation system indicate good model fit. Chi-square was 36.1 and statistically significant, but RMSEA was below .05, NFI was above .95, CFI was above .95, and TLI was above .96. Factor loadings for the latent constructs were strong, ranging from .70 to .91. As for the correlation result, trust had a statistically significant and strong relationship with teacher favorableness of the evaluation framework (r = .54, p < .01; see Figure 2).
Tests of within-group homogeneity and between-school variability in teacher trust in district administration support the specification of trust at the individual teacher level. An ICC-2 estimate of .50 was not statistically significant, and it was considerably below the .70 threshold set by Cohen, Doveh, and Eick (2001) as indicative of reliable group means. A small ICC-2 indicates weak agreement among teachers within schools as to the trustworthiness of district administrators (see Table 7). The ICC-1 achieved statistical significance at .05, but the amount of variance in district trust between schools was only 4%, a relatively small amount of variance in comparison to the school-level variance associated with collective trust measures (Forsyth et al., 2011).
Figure 2. Test of convergent validity with a fully latent structural equation model using maximum likelihood estimation. Note. The parameter estimates are standardized regression coefficients. TE is teacher perceptions of the evaluation system. Scale items are treated as observed variables. Model fit estimates include the following: Chi-square = 36.1; RMSEA = .03; CFI = .99; NFI = .99; TLI = .99. Degrees of freedom = 61.
To conclude, empirical results lend support for specifying trust in district administration as a simple-factor construct observable in the perceived benevolence, competence, openness, honesty, and reliability of district leaders. Trust can be measured with either the 10-item scale or the shorter 5-item scale. Both scales had good fit with the observed data, strong factor loadings, and high explained variance. Additionally, both scales had strong internal item consistency and good split-half correlation results. Teacher trust was also related to teacher perceived favorableness of the evaluation framework. Both ICC-1 and ICC-2 estimates support the theoretical specification of district trust as a teacher belief.1
Empirical Test: District Trust and Teacher Commitment
Teacher trust in district administration represents a line of research that appears capable of deepening our understanding of how school systems continuously get better at teaching and learning. As an initial study, we were interested in knowing if teacher trust in district administration can ignite beliefs and behavior that would enable teachers and schools to flourish. Teacher commitment to the school and its vision stands out as a psychological driver of individual and group capacity (Firestone & Pennell, 1993; Tsui & Cheng, 1999). Just as commitment to a goal motivates an individual to persist in her journey toward success, teacher commitment ignites the desire and determination to see that schools are working to serve students in ways that engage them in deep and meaningful learning (Nordin, Darmawan, & Keeves, 2009).
Commitment is defined as an individual's identification with the values and goals of an organization, a willingness to work toward achievement of a shared vision, and a desire to remain in the organization (Angle & Perry, 1981; Mowday, Porter, & Steers, 1982; Ross & Gray, 2006). Committed teachers have established a psychological attachment that has deep and wide-ranging effects on their motivation, actions, and performance (Nordin et al., 2009). The effects of commitment also extend to the performance of schools and the larger school system in which teachers work (Reyes, 1992). At the individual level, committed teachers are autonomously motivated, persevere through challenges, and work cooperatively with colleagues (Firestone & Pennell, 1993; Nordin et al., 2009). For schools and school systems, having committed teachers generally leads to better achievement outcomes, a healthy teaching environment, less teacher conflict, and innovative practices (Henkin & Holliman, 2009; Kushman, 1992; Nordin et al., 2009).

Table 7. Intraclass Correlation Coefficients.

Variable   ICC-1   Chi Square   ICC-2   F Ratio
Teacher trust in district administration   .04   82.36*   .50   .99

Note. N = 849 teachers. *p < .05.
Evidence has identified school leaders as an essential factor in getting teachers to commit to a school and its success (Henkin & Holliman, 2009; Nordin et al., 2009; Ross & Gray, 2006). Teachers have stronger commitment when principals are seen as open, collaborative, and empowering; where teachers collectively work to accomplish high academic goals; and where collective teacher efficacy defines the normative school climate (Lee, Zhang, & Yin, 2011; Ross & Gray, 2006; Ware & Kitsantas, 2011). It makes sense that daily interactions between principals and teachers would be a source of teacher commitment. Principals, through their decisions and actions, create an environment that can either reinforce teacher psychological attachment to the school’s vision, or conversely, alienate teachers by leading in ways that contrast with their values, expectations, and beliefs (Henkin & Holliman, 2009; Kushman, 1992; Nordin et al., 2009).
The relationship between district leadership and teacher commitment is less straightforward. At the executive level, district leaders have fewer interactions with teachers, making it possible for the psychological and behavioral effects of district trust to be attenuated by several school factors. That being the case, we believe that commitment to a school can be deepened when district leaders are viewed by teachers as trustworthy. Low trust would seem to lessen the faith that teachers place in district leaders, thereby jeopardizing their value congruence with the vision of the school, diminishing their willingness to carry out improvement plans, and reducing their desire to remain in the school. Thus, we predict that teacher trust in district administration has a relationship to teacher commitment over and above the effects of teacher trust in principal.
Data Source
Teachers were the unit of analysis for the study. Data came from 785 teachers in the same urban school district as the validation study. A general teacher survey was administered in February of 2014 to teachers in 73 elementary and secondary schools. Half of the teachers in each school were randomly sampled to receive a survey of teacher trust in district administration, teacher trust in principal, and teacher commitment. The result was a sample of 1,273 teachers in 73 schools. Of the 1,273 teachers, 785 completed and returned usable surveys for a response rate of 62%.
Measures
District trust was measured with the five-item Teacher Trust in District Administration Scale. Teacher commitment was measured with five items adapted from the Organizational Commitment Scale (Porter, Steers, Mowday, & Boulian, 1974). The scale captures the facets of value congruence, willingness to work toward a shared vision, and desire to stay in the school. Sample items include the following: "I feel very little loyalty to this school" (reverse scored). "I am glad I chose to teach in this school." "I would probably continue teaching in this school." Five items on teacher trust in principal were adapted from the Omnibus Trust Scale (Hoy & Tschannen-Moran, 1999) so as to measure individual teacher beliefs, not collective teacher perceptions. Sample items include the following: "The principal in this school typically acts in the best interests of teachers." "The principal in this school shows concern for teachers." "The principal in this school is competent in doing his/her job." "The principal keeps teachers informed about school issues." "The principal in this school is honest." A Harman single-factor test provided evidence to indicate that common method variance does not seem to be a concern with including all measures on a common survey. Three factors emerged from the extraction and all items loaded strongly on their theoretical factor (see Appendix B).
Analysis and Results
We tested a fully latent structural equation model in AMOS 19.0. The fully latent model enabled us to account for measurement error in the estimation of the structural relationships among the two forms of trust and teacher commitment. Results (Figure 3) show that teacher trust in principal and teacher trust in district administration combined to explain 29% of the variance in teacher commitment. The unique effects of district and principal trust were similar. Teacher trust in district administration had a unique effect of .33, whereas the unique effect of principal trust was .32. District and principal trust were strongly related to each other, with a parameter estimate of .40. Model fit indices show a strong alignment between the theoretical model and the sample data. Chi-square was statistically significant, but the comparative fit indices all met or exceeded acceptable thresholds for good fit: RMSEA was .05, CFI was .97, TLI was .96, and NFI was .96.
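For readers working outside AMOS, the fully latent model can be expressed in lavaan-style syntax. The sketch below uses the open-source semopy package, with hypothetical observed variable names (td1-td5, tp1-tp5, tc1-tc5) and data file; it illustrates the model form rather than reproducing the authors' estimation.

import pandas as pd
from semopy import Model, calc_stats

df = pd.read_csv("empirical_test_data.csv")    # hypothetical item-level data

# Measurement model for the three latent variables, plus the structural paths.
# Observed variable names are placeholders, not the authors' item labels.
desc = """
DistrictTrust =~ td1 + td2 + td3 + td4 + td5
PrincipalTrust =~ tp1 + tp2 + tp3 + tp4 + tp5
Commitment =~ tc1 + tc2 + tc3 + tc4 + tc5
Commitment ~ DistrictTrust + PrincipalTrust
DistrictTrust ~~ PrincipalTrust
"""

model = Model(desc)
model.fit(df)                       # maximum likelihood estimation
print(model.inspect())              # loadings, structural paths, and covariances
print(calc_stats(model)[["DoF", "chi2", "RMSEA", "CFI", "TLI", "NFI"]])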
Discussion
Increasingly, scholars, policy makers, and practitioners have called on school districts to pursue an improvement strategy that builds capacity within schools (Darling-Hammond, 2005; Firestone, 2009; Harris, 2011; Sharrat & Fullan, 2009). Capacity grows out of a relational context that encourages risk taking, problem solving, knowledge creation, and adaptation among individuals and groups (D. H. Hargreaves, 2011). Trust, for its role in knowledge creation and learning, is foundational to high capacity school organizations (Adams, 2013; Bryk & Schneider, 2002; Tschannen-Moran, 2004, 2014). With this in mind, we set out to extend trust research to the teacher-district role relationship by developing a measure of teacher trust in district administration and using the measure in an initial empirical test to evaluate the usefulness of this line of research. Together, the validity evidence and findings from the empirical test lend support for extending trust research to district leadership.
Figure 3. Results of the empirical test of the fully latent structural equation model. Note. Full maximum likelihood was used as the estimation technique. Parameter estimates are standardized regression coefficients. Model fit estimates include the following: Chi-square = 316.3; RMSEA = .05; CFI = .97; NFI = .96; TLI = .96. Degrees of freedom = 87.
Evidence from the validity study supports the use of the Teacher Trust in District Administration Scale for research and practice. Items for teacher trust in district administration were based on the facets of trust and written to reflect the collective actions of district administrators. For validity evidence, we were interested in three patterns in the data: (1) the relationship between scale items and the latent construct, (2) the factor structure of the scale, and (3) the correlation between district trust and convergent teacher beliefs. Based on findings from other trust measures, we hypothesized that teacher trust in district administration exists as a single-factor construct. This hypothesis stands in contrast to an alternative specification of trust as a multifactored construct with the facets functioning as distinct beliefs.
Results supported our hypothesized model. Each trust facet contributes to a teacher’s perception of district administration. No one facet stands out as more critical in the formation process than others. Benevolence, competence, openness, honesty, and reliability are inextricably related and converge in the discernment process to form a singular belief. While conceptual distinctions can be made, the facets represent highly integrated actions that together elicit teacher judgments of administrators’ trustworthiness. Our findings suggest it is hard to be perceived as benevolent without being perceived as competent, open, honest, or reliable. Similarly, competent behavior likely contains elements of benevolence, openness, honesty, and reliability. The absence of one facet affects the presence of others.
Additional validity support comes from the strong correlation between trust in district administration and perceived favorableness of the teacher evaluation system. We expected trust and teacher perceptions of the evaluation system to go hand in hand. Consistent with our prediction, higher teacher trust in district administration was associated with higher favorableness of the evaluation system, linking teacher trust beliefs to a strategy behind efforts to elevate teaching quality.
A valid measure of district trust by itself does not establish a strong case for developing this line of research. Evidence from the empirical test provides additional support. Results highlight the potential of district trust to explain differences in teacher, school, and school system capacity. At a minimum, capacity building requires a core group of dedicated and committed teachers who have the inspiration and know-how to address problems plaguing teaching and learning (Harris, 2011). Fostering such commitment has generally been viewed as the responsibility of school principals (Leithwood, Louis, Anderson, & Wahlstrom, 2004), and indeed, the evidence indicates that principals have considerable sway over the dedication and persistence of teachers (Henkin & Holliman, 2009; Kushman, 1992; Nordin et al., 2009). That stated, results of our empirical test add district trust to the equation.
We were not surprised that teacher trust beliefs would be related to their commitment. Trust has been described metaphorically as a type of glue that unites individuals and connects individuals to organizations (Bryk & Schneider, 2002; Tschannen-Moran, 2004, 2014). Nonetheless, we did not expect district trust to have such a large effect on commitment once principal trust was accounted for. Principals, given their proximity to teachers, would seem to have greater influence on teacher commitment than district leaders. This was not the case in our sample of teachers, suggesting district leaders may be closer to the psychological sources of motivated and engaged teachers than hierarchical boundaries imply.
The study had limitations that future research can address. First, we sampled teachers from one urban school district, thereby potentially reducing variability in individual teacher beliefs. Future research can broaden the sample to include teachers from different school districts. A sample of teachers across multiple school districts may yield different results. A second limitation concerns the simplicity of the empirical test. The study was partly based on the argument that capacity building is a viable process for school improvement. Capacity, both as a concept and a condition, is more complex and dynamic than what can be captured by measuring trust and teacher commitment. Future research can examine the relationship between teacher trust in district administration and other indicators of capacity and high performance. Finally, we believe evidence on the formation of teacher trust in district administration has utility as well. Understanding practices that build trust has the potential to shape how district leaders lead and manage system-wide improvement efforts.
Conclusion
In conclusion, we offer three implications for research and practice. First, as a simple-factor construct, trust facets should not be weighted or ranked in an attempt to understand which actions are more likely to elicit positive discernments. With facets shaping a unitary belief, any attempt to weight the relative importance of specific behaviors would be an arbitrary decision. For example, we cannot conclude that competence is any more instrumental than openness, or that honesty does not carry the same weight as benevolence. Rather, findings confirm the theory that district administrators can build trust by consistently acting in ways that teachers perceive as benevolent, competent, open, reliable, and honest (Forsyth et al., 2011). Each trust facet contributes to teacher discernments of the actions and intentions of district leaders.
Second, trust in district administration has the potential to deepen our understanding of how executive leaders in school districts set a direction for improvement, organize and coordinate work processes, and develop talent and expertise. Deeper knowledge in these areas is critical when considering the pressure school districts are under to improve, and the relatively few examples of districts that consistently make teaching and learning better at scale (Bryk, Gomez, Grunow, & LeMahieu, 2015; Fullan, 2010). Unlike descriptive accounts of improving districts (Firestone, 2009; Waters & Marzano, 2006), trust research explores the relational connections that either ignite positive change or become barriers to turning good ideas into effective practices. It is the human and social side of school districts that ultimately determines how school systems adapt to a changing environment (A. Hargreaves & Shirley, 2009; Sharrat & Fullan, 2009), and additional research on district trust can bring this knowledge to the surface.
Third, evidence on district trust can be used by practitioners to gauge capacity of the school system to continuously improve. Indeed, there is more to capacity than trust, but it is hard to envision a high-capacity system without similarly strong levels of trust between teachers and district leaders. High trust signals an open, cooperative, and cohesive relational network (Forsyth et al., 2011; Bryk & Schneider, 2002; Tschannen-Moran, 2004, 2014), whereas low trust points to relational problems that constrain information exchange and knowledge creation (Adams, 2013). Accurate information on district trust allows central office leaders to formatively assess the strength of relational ties with teachers.
As a whole, this study makes a strong case for developing a line of research on teacher trust in district administration. It establishes a good measure to use in future research, and it provides initial evidence showing that teacher beliefs are sensitive to the actions of district administrators. Looking ahead, additional knowledge on district trust can be used to map the process by which district leaders work through teachers to create high functioning school systems.
Appendix A Exploratory Factor Analysis Results of the Teacher Trust in District Administration and Teacher Evaluation Scales.
Factor 1 Factor 2
TTDA1   .74   −.21
TTDA2   .79   −.22
TTDA3   .82   −.25
TTDA4   .83   −.23
TTDA5   .78   −.27
TE1   .35   .65
TE2   .25   .72
TE3   .21   .77

Note. N = 849 teachers. Principal axis factoring with no rotation was used. Two factors were extracted with eigenvalues over one. Factor 1 explained 56% of the variance, and Factor 2 explained 20%.
Appendix B Exploratory Factor Analysis Results of the Teacher Trust in District Administration, Teacher Trust in Principal, and Teacher Commitment.

        Factor 1   Factor 2   Factor 3
TTDA1   .73   .21   −.08
TTDA2   .78   .32   −.01
TTDA3   .72   .35   −.09
TTDA4   .80   .31   −.17
TTDA5   .77   .37   −.18
TTP1   .29   .67   −.15
TTP2   .22   .70   −.29
TTP3   .37   .69   −.28
TTP4   .39   .68   −.23
TTP5   .34   .65   −.24
TC1   −.31   .27   .69
TC2   −.24   .21   .75
TC3   −.19   .30   .67
TC4   −.29   .29   .71
TC5   −.31   .29   .72

Note. N = 785 teachers. Principal axis factoring with no rotation was used. Three factors were extracted with eigenvalues over one. Factor 1 explained 33% of the variance, Factor 2 explained 24%, and Factor 3 explained 13%. TTDA = Teacher Trust in District Administration items; TTP = Teacher Trust in Principal items; TC = Teacher Commitment items.

Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The authors received no financial support for the research, authorship, and/or publication of this article.

Note
1. Estimates used to assess the factor structure of the scale derive from individual-level data that assume independence of observations within schools. The small ICC-1 (.04) calculated from our sample of teachers establishes empirical support for proceeding with an individual-level analysis of data to evaluate the factor structure of the survey items. For future uses of the scale, researchers will want to estimate the independence of teacher observations prior to any analysis. Any use of the survey at the group level would not be supported by the validity evidence in this study. Analysis at the group level would require that a multilevel factor analysis be performed to assess the factor structure when taking into account nonindependence of observations.
References
Adams, C. M. (2010). Social determinants of student trust in high poverty elementary schools. In W. K. Hoy & M. DiPaola (Eds.), Analyzing school contexts: Influences of principals and teachers in the service of students (pp. 65-85). Charlotte, NC: Information Age.
Adams, C. M. (2013). Collective trust: A social indicator of instructional capacity. Journal of Educational Administration, 51(3), 1-36.
Adams, C. M., & Forsyth, P. B. (2009). Conceptualizing and validating a measure of student trust. In W. Hoy & M. DiPaola (Eds.), Improving schools: Studies in leadership and culture (pp. 186-206). Charlotte, NC: Information Age.
Angle, H. L., & Perry, J. L. (1981). An empirical assessment of organizational com- mitment and organizational effectiveness. Administrative Science Quarterly, 26, 296-319.
Baier, A. C. (1986). Trust and antitrust. Ethics, 96, 231-260. Barber, B. (1983). The logic and limits of trust. New Brunswick, NJ: Rutgers
University Press. Bryk, A. S., Gomez, L. M., Grunow, A., & LeMahieu, P. G. (2015). Learning to improve: How America's schools can get better at getting better. Cambridge, MA: Harvard Education Press.
Bryk, A. S., & Schneider, B. (2002). Trust in schools: A core resource for improve- ment. New York, NY: Russell Sage Foundation.
Butler, J. K., & Cantrell, S. R. (1984). A behavioral decision theory approach to model- ing dyadic trust in superiors and subordinates. Psychological Reports, 55, 19-28.
Chenoweth, K. (2007). It's being done: Academic success in unexpected schools. Cambridge, MA: Harvard Education Press.
Cohen, A., Doveh, E., & Eick, U. (2001). Statistical properties of the rwg(i) index of agreement. Psychological Methods, 6, 297-310.
Cosner, S. (2009). Building organizational capacity through trust. Educational Administration Quarterly, 45, 248-291.
Cronbach, L. J. (1971). Test validation. In R. L. Thorndike (Ed.), Educational mea- surement (pp. 443-507). Washington, DC: American Council on Education.
Daly, A. J., Liou, Y., & Moolenaar, N. M. (2014). The principal connection: Trust and innovative climate in a network of reform. In D. Van Maele, P. B. Forsyth, & M. Van Houtte (Eds.), Trust and school life: The role of trust for learning, teaching, leading, and bridging (pp. 285-312). New York, NY: Springer.
Daly, A. J., Moolenaar, N. M., Bolivar, J. M., & Burke, P. (2010). Relationships in reform: The role of teachers’ social networks. Journal of Educational Administration, 48, 359-391.
Darling-Hammond, L. (2005). Policy and change: Getting beyond bureaucracy. In A. Hargreaves (Ed.), Extending educational change: International handbook of educational change (pp. 362-387). Dordrecht, Netherlands: Springer.
Darling-Hammond, L., Hightower, A. M., Husbands, J. L., LaFors, J. R., Young, V. M., & Christopher, C. (2006). Building instructional quality: Inside-out and outside- in: Perspective on San Diego’s school reform. In A. Harris & J. Chrispeels (Eds.), Improving schools and educational systems (pp.129-185). New York, NY: Routledge.
Firestone, W. A. (2009). Culture and process in effective school districts. In W. K. Hoy & M. F. DiPaola (Eds.), Studies in school improvement (pp. 177-204). Raleigh-Durham, NC: Information Age.
Firestone, W. A., & Pennell, J. R. (1993). Teacher commitment, working conditions, and differential incentive policies. Review of Educational Research, 63, 489-525.
Forsyth, P. B., & Adams, C. M. (2014). Organizational predictability, the school prin- cipal, and achievement. In D. Van Maele, M. Van Houtte, & P. Forsyth (Eds.), Trust and school life: The role of trust for learning, teaching, leading, and bridg- ing (pp. 83-98). New York, NY: Springer.
Forsyth, P. B., Adams, C. M., & Hoy, W. K. (2011). Collective trust: Why schools can’t improve without it. New York, NY: Teachers College Press.
Fullan, M. (2008). Six secrets of change: What the best leaders do to help their orga- nizations survive and thrive. San Francisco, CA: Wiley.
Fullan, M. (2010). All systems go: The change imperative for whole system reform. Thousand Oaks, CA: Corwin.
Fullan, M. (2014). The principal: Three keys to maximizing impact. San Francisco, CA: Jossey-Bass.
Gabarro, J. J. (1987). The development of working relationships. In J. Lorsch (Ed.), Handbook of organizational behavior (pp. 172-189). Englewood Cliffs, NJ: Prentice-Hall.
Garvin, D. A., Edmondson, A. C., & Gino, F. (2008, March). Is yours a learning orga- nization? Harvard Business Review, 3-11.
Glisson, C., & James, L. R. (2002). The cross-level effects of culture and climate in human service teams. Journal of Organizational Behavior, 23, 767-794.
Hargreaves, A., & Shirley, D. (2009). The fourth way: The inspiring future for educa- tional change. Thousand Oaks, CA: Corwin Press.
Hargreaves, D. H. (2011). System redesign for system capacity building. Journal of Educational Administration, 49, 685-700.
Harris, A. (2011). System improvement through collective capacity building. Journal of Educational Administration, 49, 624-636.
Harris, A., & Chrispeels, J. H. (2006). Improving schools and educational systems: International perspectives. New York, NY: Routledge.
Henkin, A. B., & Holliman, S. L. (2009). Urban teacher commitment: Exploring asso- ciations with organizational conflict, support for innovation, and participation. Urban Education, 44, 160-180.
Hoffman, J. D., Sabo, D., Bliss, J., & Hoy, W. K. (1994). Building a culture of trust. Journal of School Leadership, 4, 484-501.
Honig, M. I., & Hatch, T. C. (2004). Crafting coherence: How schools strategically manage multiple, external demands. Educational Researcher, 33(8), 16-30.
Hoy, W. K., & Tarter, C. J. (2004). Organizational justice in schools: No justice with- out trust. International Journal of Educational Management, 18, 250-279.
Hoy, W. K., & Tschannen-Moran, M. (1999). Five faces of trust: An empirical con- firmation in urban elementary schools. Journal of School Leadership, 9, 184-208.
King, B. M., & Bouchard, K. (2011). The capacity to build organizational capacity in schools. Journal of Educational Administration, 49, 653-669.
Kolenikov, S., & Bollen, K. A. (2012). Testing negative error variances: Is a Heywood case a symptom of misspecification? Sociological Methods & Research, 41, 124-167.
Kushman, J. W. (1992). The organizational dynamics of teacher workplace com- mitment: A study of urban elementary and middle schools. Educational Administration Quarterly, 28, 5-42.
Lee, J. C., Zhang, Z., & Yin, H. (2011). A multilevel analysis of the impact of a professional learning community, faculty trust in colleagues, and collective effi- cacy on teacher commitment to students. Teaching and Teacher Education, 27, 820-830.
Leithwood, K., Louis, K. S., Anderson, S., & Wahlstrom, K. (2004). How leadership influences student learning. New York, NY: The Wallace Foundation. Available at: http://www.wallacefoundation.org/knowledge-center/school-leadership/key- research/Documents/How-Leadership-Influences-Student-Learning.pdf
Levin, B. (2008). Thinking about knowledge mobilization. Paper presented at the Canadian Council on Learning and the Social Sciences and Humanities Research Council, Vancouver, Canada.
Makiewicz, M., & Mitchell, D. (2014). Teacher trust in the principal: Factor structure and effects. In D. Van Maele, P. B. Forsyth, & M. Van Houtte (Eds.), Trust and school life: The role of trust for learning, teaching, leading, and bridging (pp. 99-120). New York, NY: Springer.
McLaughlin, M. W. (1990). The rand change agent study revisited: Macro perspec- tives and micro realities. Educational Researcher, 19(6), 11-16.
Meade, A. W., Watson, A. M., & Kroustalis, C. M. (2007). Assessing common meth- ods bias in organizational research. Paper presented at the 22nd Annual Meeting of the Society for Industrial and Organizational Psychology, New York, NY.
Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning. American Psychologist, 50, 741-749.
Milanowski, A. T., & Heneman, H. G. (2001). Assessment of teacher reactions to a standards-based teacher evaluation system: A pilot study. Journal of Personnel Evaluation in Education, 15, 193-212.
Miller, D. M. (2008). Data for school improvement and educational accountability: reliability and validity in practice. In K. Ryan & L. Shepard (Eds.), The future of test-based educational accountability (pp. 249-262). New York, NY: Routledge.
Mishra, A. K. (1996). Organizational responses to crises: The centrality of trust. In R. Kramer & T. Tyler (Eds.), Trust in organizations: Frontiers of theory and research (pp. 261-287). Thousand Oaks, CA: Sage.
Moss, P. A. (1995). Themes and variations in validity theory. Educational Measurement: Issues and Practice, 14(2), 5-12.
Mourshed, M., Chijioke, C., & Barber, M. (2010). How the world's most improved school systems keep getting better. New York, NY: McKinsey.
Mowday, R. T., Porter, L. W., & Steers, R. M. (1982). Employee-organization link- ages: The psychology of commitment, absenteeism and turnover. New York, NY: Academic Press.
Nordin, A. R., Darmawan, G. N., & Keeves, J. P. (2009). Teacher commitment. In L. Saha & A. Dworkin (Eds.), International handbook of research on teachers and teaching (pp. 343-360) New York, NY: Springer.
Ouchi, W. G. (1981). Theory Z: How American business can meet the Japanese chal- lenge. Reading, MA: Addison-Wesley.
Podsakoff, P. M., MacKenzie, S. B., Lee, J. Y., & Podsakoff, N. P. (2003). Common method biases in behavioral research: A critical review of the literature and rec- ommended remedies. Journal of Applied Psychology, 88, 879-903.
Porter, L. W., Steers, R. M., Mowday, R. T., & Boulian, P. V. (1974). Organizational commitment, job satisfaction, and turnover among psychiatric technicians. Journal of Applied Psychology, 59, 603-609.
Reyes, P. (1992). Preliminary models of teacher organizational commitment: Implications for restructuring the workplace. Madison, WI: Center on Organization and Restructuring of Schools.
Romero, L. (2010). Student trust: Impacting high school outcomes (Unpublished doc- toral dissertation). University of California Riverside.
Ross, J. A., & Gray, P. (2006). Transformational leadership and teacher commitment to organizational values: The mediating effects of collective teacher efficacy. School Effectiveness and School Improvement, 17, 179-199.
Rotter, J. B. (1967). A new scale for the measurement of interpersonal trust. Journal of Personality, 35, 651-665.
Schein, E. H. (1996). Culture: The missing concept in organization studies. Administrative Science Quarterly, 41, 229-240.
Schneider, B., Judy, J., Ebmeye, C., & Broda, M. (2014). Trust in elementary and secondary urban schools: A pathway for student success and college ambition. In D. Van Maele, P. B. Forsyth, & M. Van Houtte (Eds.), Trust and school life: The role of trust for learning, teaching, leading, and bridging (pp. 37-56). New York, NY: Springer.
Schreiber, J. B., Nora, A., Stage, F. K., Barlow, E. A., & King, J. (2006). Reporting structural equation modeling and confirmatory factor analysis results: A review. Journal of Educational Research, 99, 323-338.
Shamir, B., & Lapidot, Y. (2003). Trust in organizational superiors: Systemic and collective considerations. Organization Studies, 24, 463-491.
Sharrat, L., & Fullan, M. (2009). Realization: The change imperative for deepening district wide reform. San Francisco, CA: Corwin Press.
Spillane, J. P., Reiser, B. J., & Reimer, T. (2002). Policy implementation and cogni- tion: Reframing and refocusing implementation research. Review of Educational Research, 72, 387-431.
Thompson, B., & Daniel, L. G. (1996). Factor analytic evidence for the construct validity of scores: A historical overview and some guidelines. Educational and Psychological Measurement, 56, 213-224.
Tschannen-Moran, M. (2004). Trust matters: Leadership for successful schools. San Francisco, CA: Jossey-Bass.
Tschannen-Moran, M. (2014). Trust matters: Leadership for successful schools (2nd ed.). San Francisco, CA: Jossey-Bass.
Tschannen-Moran, M., & Hoy, W. K. (1998). A conceptual and empirical analysis of trust in schools. Journal of Educational Administration, 36, 334-352.
Tsui, K. T., & Cheng, Y. C. (1999). School organizational health and teacher com- mitment: A contingency study with multi-level analysis. Educational Research and Evaluation: An International Journal on Theory and Practice, 5, 249-268.
Van Maele, D., Van Houtte, M., & Forsyth, P. B. (2014). Introduction: Trust as a matter of equity and excellence in education. In D. Van Maele, P. B. Forsyth, & M. Van Houtte (Eds.), Trust and school life: The role of trust for learning, teaching, leading, and bridging (pp. 1-36). New York, NY: Springer.
Ware, H. W., & Kitsantas, A. (2011). Predicting teacher commitment using princi- pal and teacher efficacy variables: An HLM approach. Journal of Educational Research, 104, 183-193.
Waters, T. J., & Marzano, R. J. (2006). School district leadership that works: The effect of superintendent leadership on student achievement. Denver, CO: MCREL.
Zand, D. (1972). Trust and managerial problem solving. Administrative Science Quarterly, 17(2), 229-239.
Zand, D. (1997). The leadership triad: Knowledge, trust, and power. New York, NY: Oxford University Press.
Author Biographies
Curt M. Adams is the Linda Clarke Anderson Presidential Professor at the University of Oklahoma and co-director of the Oklahoma Center for Educational Policy. He conducts research on the social psychology of school systems, performance measurement, leadership, accountability, and improvement science.
Ryan C. Miskell is a research associate with WestEd where he conducts research related to issues and factors affecting teaching, learning, and school improvement.
Educational Administration Quarterly 2016, Vol. 52(2) 221 –258
© The Author(s) 2016 Reprints and permissions:
sagepub.com/journalsPermissions.nav DOI: 10.1177/0013161X15616863
eaq.sagepub.com
Article
The Impact of Leadership on Student Outcomes: How Successful School Leaders Use Transformational and Instructional Strategies to Make a Difference
Christopher Day1, Qing Gu1, and Pam Sammons2
Abstract Purpose: This article illustrates how successful leaders combine the too often dichotomized practices of transformational and instructional leadership in different ways across different phases of their schools’ development in order to progressively shape and “layer” the improvement culture in improving students’ outcomes. Research Methods: Empirical data were drawn from a 3-year mixed-methods national study (“Impact Study”) that investigated associations between the work of principals in effective and improving primary and secondary schools in England and student outcomes as defined (but not confined) by their national examination and assessment results over 3 years. The research began with a critical survey of the extant literature, followed by a national survey that explored principals’ and key staff’s perceptions of school improvement strategies and actions that they believed had helped foster better student attainment. This was complemented by multiperspective in-depth case studies of a subsample of 20 schools.
1University of Nottingham, Nottingham, UK 2University of Oxford, Oxford, UK
Corresponding Author: Christopher Day, School of Education, University of Nottingham, Jubilee Campus, Wollaton Road, Nottingham NG8 1BB, UK. Email: [email protected]
Findings: The research provides new empirical evidence of how successful principals directly and indirectly achieve and sustain improvement over time through combining both transformational and instructional leadership strategies. The findings show that schools' abilities to improve and sustain effectiveness over the long term are not primarily the result of the principals' leadership style but of their understanding and diagnosis of the school's needs and their application of clearly articulated, organizationally shared educational values through multiple combinations and accumulations of time and context-sensitive strategies that are "layered" and progressively embedded in the school's work, culture, and achievements. Implications: Mixed-methods research designs are likely to provide finer-grained, more nuanced evidence-based understandings of the leadership roles and behaviors of principals who achieve and sustain educational outcomes in schools than single-lens quantitative analyses, meta-analyses, or purely qualitative approaches. The findings themselves provide support for more differentiated, context-sensitive training and development for aspiring and serving principals.
Keywords school leadership, effective principal leadership, student outcomes, transformational leadership, instructional leadership
The Research Context: Why School Leadership Matters
The past 20 years have witnessed remarkably consistent and persisting worldwide efforts by educational policymakers to raise standards of achievement for all students through various school reforms. Common to almost all government reforms has been an increased emphasis on accountability and performativity accompanied by a concurrent movement toward the decentralization of financial management and quality control functions to schools, with increasing emphasis on evaluation and assessment (Ball, 2001, 2003; Baker & LeTendre, 2005; Organisation for Economic Co-operation and Development [OECD], 2008, 2013).
These changing policy landscapes of education have culminated in a changing profile of school leadership in many countries (OECD, 2008, 2010, 2012). However, what remains unchanged is a clear consensus in the policy and research arenas that "effective school autonomy depends on effective leaders" (OECD, 2012, p. 14). International research has provided consistent evidence that demonstrates the potential positive and negative impacts of leadership, particularly principal leadership, on school organization, culture and conditions, and, through these, on the quality of teaching and learning and student achievement (Bruggencate, Luyten, Scheerens, & Sleegers, 2012; Bryk, Sebring, Allensworth, Luppescu, & Easton, 2010; Day et al., 2009; Gu & Johansson, 2013; Leithwood & Jantzi, 1999; Leithwood, Jantzi, & Steinbach, 1999; Marks & Printy, 2003; Mulford, 2008; Robinson, Lloyd, & Rowe, 2008; Silins & Mulford, 2002a).
Comprehensive and large-scale systematic reviews of, by and large, quantitative data (Hallinger & Heck, 1996, 2010; Leithwood, Day, Sammons, Harris, & Hopkins, 2006; Leithwood, Day, Sammons, Hopkins, & Harris, 2008; Marzano, Waters, & McNulty, 2005; Robinson, Hohepa, & Lloyd, 2009) have also found that leadership is second only to classroom teaching as an influence on pupil learning (Leithwood et al., 2006) and that such influence is achieved through its effects on school organization and culture as well as on teacher behavior and classroom practices (Witziers, Bosker, & Krüger, 2003). Hallinger's (2010) review of 30 years of empirical research on school leadership points in particular to the indirect or mediated positive effects that leaders can have on student achievement through the building of collaborative organizational learning, structures, and cultures and the development of staff and community leadership capacities to promote teaching and learning and create a positive school climate—which in turn promote students' motivation, engagement, and achievement.
Although it is acknowledged that measurable outcomes of students' academic progress and achievement are key indicators in identifying school "effectiveness," they are insufficient to define "successful" schools. A range of leadership research conducted in many contexts over the past two decades shows clearly that "successful" schools strive to educate their pupils by promoting positive values (integrity, compassion, fairness, and love of lifelong learning), as well as fostering citizenship and personal, economic, and social capabilities (Day & Leithwood, 2007; Ishimaru, 2013; Mulford & Silins, 2011; Putnam, 2002). These social outcomes are likely to be deemed by successful leaders to be as important as fostering students' academic outcomes. Studies carried out by members of the 20-country International Successful School Principals Project over the past decade provide rich empirical evidence that leadership values, qualities, and strategies are critical factors in explaining variation in pupil outcomes between schools (Day & Leithwood, 2007; Moos, Johannson, & Day, 2012; Ylimaki & Jacobson, 2011). A U.S. study (Louis, Leithwood, Wahlstrom, & Anderson, 2010) that investigated the links between school leadership and student learning in 180 schools in 43 school districts in North America further confirms that leadership, particularly that of the principal, counts.
Most school variables, considered separately, have only small effects on student learning. To obtain large effects, educators need to create synergy across the relevant variables. Among all the parents, teachers and policy makers who work hard to improve education, educators in leadership positions are uniquely well positioned to ensure the necessary synergy. (Louis et al., 2010, p. 9)
Thus, "effectiveness," as defined solely in terms of academic progress and measurable attainment, is a necessary, but not sufficient, indicator of "success" in terms of students' broader educational progress and attainment. In this article, while schools were selected initially on the basis of their academic effectiveness over time, the case studies showed clearly that their principals defined success in broader terms.
Despite the consensus on the important influence of school leaders on student outcomes, the ways in which leadership effects have been analyzed vary considerably, depending on the variables and research designs adopted by researchers to study the nature and significance of particular aspects of school leadership in improving student outcomes. The most commonly researched leadership models that have been identified as resulting in success are "instructional" and "transformational." While transformational leadership has traditionally emphasized vision and inspiration, focusing on establishing structures and cultures that enhance the quality of teaching and learning, setting directions, developing people, and (re)designing the organization, instructional leadership is said to emphasize above all else the importance of establishing clear educational goals, planning the curriculum, and evaluating teachers and teaching. It sees the leaders' prime focus as responsibility for promoting better measurable outcomes for students, emphasizing the importance of enhancing the quality of classroom teaching and learning.
The results of Robinson et al.'s (2009) meta-analysis of quantitative empirical studies suggested that transformational leadership is less likely to result in strong effects on pupil outcomes (because it focused originally on staff relationships) than instructional leadership, which is focused on the core business of schools in enhancing effective teaching and learning. This, however, appears to be at variance with empirical evidence from Marks and Printy's (2003) earlier research that claimed that concentrated instructional leadership had rather limited value and impact if leaders were to effectively respond only to the undeniably strong, policy-driven external demands of accountability, performativity, and change: "Responding to these demands with an outmoded conception of instructional leadership was senseless, but engaging teachers in a collaborative dialogue about these issues and their implications for teaching and learning was essential" (p. 392). They concluded that "when transformational and shared instructional leadership coexist in an integrated form of leadership, the influence on school performance, measured by the quality of its pedagogy and the achievement of its students, is substantial" (p. 370). In a meta-analysis of unpublished research studies about the nature of transformational leadership and its impact on school organization, teachers, and students, Leithwood and Sun (2012) reached a similar conclusion. They found that "each transformational school leadership practice adds to the status of consequential school conditions." Effective leadership, especially in today's performance-driven culture, thus includes both a focus on the internal states of organizational members that are critical to their performance and classroom instruction.
Evidence from the empirical research reported in this article supports and extends Marks and Printy's (2003) conclusions, those of other later work on integrated leadership (Printy, Marks, & Bowers, 2009), and the conclusions of Leithwood and Sun (2012). It shows that the overrigid distinction between transformational leadership and instructional leadership made by Robinson et al. (2009), and indeed their claims that instructional leadership has greater effects on students than transformational leadership, did not apply to the leadership approaches in a sample of more than 600 (primary n = 363 and secondary n = 309) of the most effective and improved schools in England (Day et al., 2011). Our data showed that, on the contrary, in schools that sustained and/or improved their performance as judged by student academic outcomes and external inspection results, principals had exercised leadership that was both transformational and instructional as they progressively shaped the culture and work of their schools in building teachers' commitment and capacities during different phases of their schools' development journeys. Through this integrated approach, changes had been introduced and implemented successfully and standards of teaching and learning built and sustained. These findings provide empirical support to Leithwood and Sun's (2012) claim that "improvement requires leaders to enact a wide range of practices" (Leithwood & Sun, 2012, p. 403). They also go beyond their claim by providing a "practice-specific" conceptualization of what we call "successful" school leadership that is expressed through the application and accumulation of combinations of values-informed organizational, personal, and task-centered strategies and actions, which, according to the data in our research, together contributed to successful student outcomes. We identified these leadership approaches as the "layering" (Day et al., 2011) of "fit-for-purpose" combinations and accumulations of within-phase leadership strategies and actions over time through the enactment of principals' personal and professional values and visions and in response to careful diagnosis and multiple and sometimes conflicting communities of interest.
This understanding of successful leadership values and practices is distinctively different from, for example, "contingency" leadership theory (Fiedler, 1964). This theory proposed that decisions by the leader were made solely in response to the interaction between environmental uncertainty, organizational structure, and aspects of "performance" (Pennings, 1975). It is different, also, from "situational" leadership theory (Hersey & Blanchard, 1988) in which, similar to "contingency" leadership, the fundamental principle is that there is no single "best" approach to leadership because leaders who are successful respond according to their judgments of the perceived "maturity" of the individual or group that they are trying to influence. Neither theory was generated from research in school contexts. Moreover, neither appears to acknowledge the complex range and combinations of strategies, actions, and behaviors that successful principals employ over time in striving to improve their schools. Both, also, seem to ignore the active role played by values—moral and ethical purposes—in decisions about which strategies to apply, how they should be combined, applied, and changed over time and how, cumulatively, these might best lead to the building of organizational cultures and actions by all stakeholders through which improvements may be more likely to occur. Hersey and Blanchard's (1988) model, for example, seems to ignore participation in leadership by others in its identification and application of four leadership behavior "types" (telling, selling, participating, and delegating) that leaders use according to their identification of four levels of organizational maturity (very capable and confident, capable but unwilling, unable but willing, unable and insecure). These models were important in their time and contributed significantly to knowledge of leadership, though there were criticisms (see, e.g., Goodson, McGee, & Cashman, 1989; Graeff, 1997; Thompson & Vecchio, 2009). Much research since then, however, has been able to find more complex relationships between, for example, values, behaviors, and strategies used in effective and improving schools that serve different contexts to a range of communities.
By "layering," we are referring to the ways in which, within and across different phases of their schools' improvement journeys, the principals selected, clustered, integrated, and placed different emphases on different combinations of both transformational and instructional strategies that were timely and fit for purpose. In this way, as findings of our 20 case studies show, the principals progressively built the individual and collective capacity and commitment of staff, students, and community. Quantitative results complemented these case study findings by providing empirical evidence of the patterns of associations between certain key features of leadership identified from confirmatory factor analysis (CFA) of survey responses by principals (setting directions, redesigning the organization, developing people, managing teaching and learning, and trust) and the role of personal qualities. The results revealed the interconnections of how such leadership strategies and actions shaped school and classroom processes and improved school conditions that, in turn, promoted better pupil outcomes (Sammons, Davies, Day, & Gu, 2014; Sammons, Gu, Day, & Ko, 2011).
The IMPACT Research: Mixed Methods Design
Figure 1 illustrates the different phases and strands of the IMPACT (Impact of School Leadership on Pupil Outcomes) research and their sequencing. A review of the leadership literature (Leithwood et al., 2006) informed the design and development of the questionnaire surveys and the case study interviews. The use of mixed methods increased the possibilities of identifying various patterns of association and possible causal connections between variation in different indicators of school performance and measures of school processes and the way these are linked with different features of leadership practices and pupil outcomes. The sequencing of the study facilitated the ongoing integration of evidence, synthesis, and meta-inferences necessary in well-designed mixed methods research (Day et al., 2008; Sammons, 2010; Sammons et al., 2014).
Figure 1. Research design: Integrating evidence about effective/improved schools.
The Sampling Strategy: Identifying Effective and Improved Schools
An analysis of national assessment and examination data sets on primary and secondary school performance was used to identify schools that were effective in their value-added results (which take account of pupils' prior attainment and background characteristics) and also showed significant improvement in raw results or stable high attainment over at least the previous three consecutive years under the leadership of the same principal. The analyses were based on relevant published data and key indicators, including "value-added" measures of pupil progress based on multilevel statistical analyses, combined with important accountability indicators such as the percentage of pupils achieving national performance benchmarks in Key Stage 2 assessments (Age 11), or at Key Stage 4 in public GCSE examinations (Age 16). Approximately a third of the primary (34%) and of the secondary (37%) schools in England for which national data were available were classified as meeting our criteria as more effective/improved in terms of value-added performance and changes in pupil attainment over the course of 3 years.
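The selection rule described above can be made concrete with a short sketch. What follows is only one plausible operationalization, assuming a hypothetical national data file and column names (school_id, year, principal_id, value_added, raw_attainment, national_benchmark); it is not the study's actual procedure.

    import pandas as pd

    # Hypothetical school-by-year performance file; the columns are assumptions for illustration.
    perf = pd.read_csv("national_performance.csv")

    def meets_criteria(school: pd.DataFrame) -> bool:
        """Screen one school: same principal, positive value-added, and improving
        or stable-high raw attainment over the last three consecutive years."""
        last3 = school.sort_values("year").tail(3)
        same_head = last3["principal_id"].nunique() == 1
        effective = (last3["value_added"] > 0).all()
        improving = last3["raw_attainment"].is_monotonic_increasing
        stable_high = (last3["raw_attainment"] >= last3["national_benchmark"]).all()
        return same_head and effective and (improving or stable_high)

    selected = perf.groupby("school_id").filter(meets_criteria)
    print(selected["school_id"].nunique(), "schools meet the screening criteria")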
Nationally, a greater proportion of English schools are in Free School Meal (FSM) Band 1 (0% to 8% pupils eligible for FSM) and Band 2 (9% to 20% eligible) than in the more disadvantaged groups of FSM Band 3 (21% to 35% eligible) and Band 4 (36%+ eligible), and this is the case for both primary and secondary schools. We deliberately oversampled schools with higher proportions of disadvantaged pupils (FSM Bands 3 and 4) to achieve a more balanced (less skewed toward low disadvantage) sample of schools in relation to level of disadvantaged pupil intake. In addition, pupils in schools from more disadvantaged areas tend to start from a lower attainment level, and thus, such a sample allowed us to (a) secure a group of schools that had seen pupil progress and attainment improve significantly from low to moderate or high and (b) explore in greater depth the impact of leadership on the improvement of pupil outcomes in schools serving more disadvantaged intakes. Table 1 indicates the composition of this stratified random sample of schools by FSM bands against the national distribution of schools.
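A minimal sketch of how such a stratified, deliberately skewed sample could be drawn with pandas follows; the sampling frame, column names, and random seed are hypothetical, and the per-band targets simply reuse the primary sample counts reported in Table 1 rather than the study's actual sampling code.

    import pandas as pd

    # Hypothetical sampling frame: one row per school with its phase and FSM band.
    frame = pd.read_csv("school_frame.csv")  # assumed columns: school_id, phase, fsm_band

    # Target counts per band, oversampling FSM Bands 3 and 4 relative to their
    # national share (here, the primary sample sizes shown in Table 1).
    targets = {"FSM1": 510, "FSM2": 452, "FSM3": 275, "FSM4": 312}

    primary_sample = (
        frame[frame["phase"] == "primary"]
        .groupby("fsm_band", group_keys=False)
        .apply(lambda band: band.sample(n=targets[band.name], random_state=42))
    )
    print(primary_sample["fsm_band"].value_counts())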
Two Questionnaire Surveys to Investigate Leadership and School Process
The first questionnaire survey was conducted for principals and key staff (two per school at primary level, five per school at secondary level) among the sample schools. The survey design was informed by a review of the literature on the impact of school leadership on pupil outcomes (Leithwood et al., 2006) and covered the following topics:
1. Leadership Practice
2. Leaders' Internal States
3. Leadership Distribution
4. Leadership Influence
5. School Conditions
6. Classroom Conditions
The questionnaire included specific items that focused on key aspects of transformational leadership strategies (e.g., setting directions and visions) and instructional leadership strategies (e.g., managing teaching and learning) and items that explored principals' and key staff's perceptions of change in these six areas of school work and on academic and other kinds of pupil outcomes (nonacademic areas such as engagement, motivation, behavior, and attendance) over the previous 3 years. This period coincided with the years over which the analyses of national pupil attainment data had taken place. The key staff survey closely mirrored that of the principals so that comparisons could be made between responses by the two groups. The response rate (Table 2) was somewhat higher for principals of secondary schools, which were followed up in more detail to ensure roughly equal numbers of responses from schools in each sector. Although not high, the response rate is typical of that achieved by surveys of schools in England in recent years.
Case Studies of 20 Primary and Secondary Schools
The qualitative strand used 20 in-depth case studies of a subset of these schools. Data were collected through three visits each year (N = 6) over 2 years with detailed interviews of principals and a range of key staff and stakeholders.1 These case studies represented schools in different sectors and contexts, including different levels of socioeconomic advantage as identified through the "Free School Meals" proxy and disadvantage and ethnic diversity (FSM Bands 1 and 2: 3 primary and 4 secondary; FSM Bands 3 and 4: 7 primary and 6 secondary). We also constructed, in interviews with principals, "lines of school improvement," using critical incident techniques. These allowed us to build holistic representations of the strategies for improvement that each principal had used over the period of their leadership. These were then mapped onto data showing changes in external measures of students' progress and attainment over the same period and external inspection grades for the schools. Interviews with principals and key staff prompted them to speak about those issues that were most significant to them in relation to the research aims and objectives and aspects identified as important in the literature review. Interviews with other colleagues in the school provided insights into their perceptions of the nature and impact of the practice and effectiveness of school (and, in secondary schools, departmental) leadership and its distribution.

Table 1. Distribution of Sample Schools by Socioeconomic Contexts.

FSM Band             Primary Sample    Primary National    Secondary Sample    Secondary National
                     N       %         N        %          N       %           N        %
FSM1 (0% to 8%)      510     33        6,150    42         400     35          1,159    37
FSM2 (9% to 20%)     452     29        2,896    27         393     34          1,097    35
FSM3 (21% to 35%)    275     18        2,359    16         191     17          520      17
FSM4 (36%+)          312     20        2,267    15         156     14          339      11
Total                1,549   100       14,672   100        1,140   100         3,115    100

Note. FSM = free school meal.
Findings: How School Leadership Makes a Difference
1. Building and sustaining the right conditions for a sustained focus on the quality of teaching and learning: evidence from the first principal and key staff surveys.
Table 2. Wave 1 Survey Response Rate.

                                       Sample Size Surveyed (n)    Questionnaires Returned (n)    Response Rate (%)
Heads
  Primary                              1,550                       378(a)                         24
  Secondary                            1,140                       362(a)                         32
Key staff(b) at school level
  Primary                              1,550                       409(c)                         26
  Secondary                            1,140                       393(c)                         34
Key staff(b) at questionnaire level
  Primary                              3,100                       608(d)                         20
  Secondary                            5,700                       1,167(d)                       20

(a) Questionnaires returned by heads. (b) Key staff include members of the senior leadership team and middle managers (e.g., Key Stage Leaders). (c) Schools with returned key staff questionnaires. (d) Returned key staff questionnaires.
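As a quick arithmetic check on Table 2, a response rate is simply the number of returned questionnaires divided by the number surveyed; the snippet below reproduces the heads' figures only, with illustrative labels.

    surveyed = {"primary heads": 1550, "secondary heads": 1140}
    returned = {"primary heads": 378, "secondary heads": 362}

    for group, n in surveyed.items():
        rate = 100 * returned[group] / n
        print(f"{group}: {rate:.0f}%")  # primary heads: 24%, secondary heads: 32%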
Actions Identified by Principals as Most Important in Promoting School Improvement
In the first survey, principals were asked about the most important combinations of specific strategies that they felt had had the most positive impact on improving pupil outcomes over the past 3 years. Leadership strategies related to improving teaching practices and promoting a stronger academic press or emphasis were the most frequently cited strategies. More specific actions most commonly cited by primary principals as most important were as follows:
• Improved assessment procedures (28.1%)
• Encouraging the use of data and research (27.9%)
• Teaching policies and programs (26.0%)
• Strategic allocation of resources (20.4%)
• Changes to pupil target setting (20.2%)
For secondary principals, the actions/strategies viewed as most important showed strong similarities to the findings for primary principals, although the emphasis on the "use of data" was somewhat stronger, and secondary principals placed much more emphasis on changing school culture:
• Encouraging the use of data and research (34.0%)
• Teaching policies and programs (27.7%)
• School culture (21.1%)
• Providing and allocating resources (19.5%)
• Improved assessment procedures (18.6%)
There was consistent evidence in the first survey that both principals and key staff were positive about the role of instructional leadership strategies in promoting and sustaining the academic standards and expectations in their schools, which, to some extent, might be expected given the study’s focus on more effective/improved schools. The large majority of the primary (69%) and secondary (64%) principals agreed strongly that “this school sets high standards for academic performance.” Such a view was also shared by the key staff, with more than 90% in agreement (“strongly” and “moderately”).
In particular, the use of performance data and monitoring were shown to be important strategies in the drive to raise standards in schools that make sustained improvement in raising pupil attainment—especially for those in disadvantaged contexts. The large majority of primary (79%) and secondary (91%) principals agreed strongly or moderately that "the performance of department/subject areas is regularly monitored and targets for improvement are regularly set." For principals of primary schools, those in high disadvantaged schools (N = 118, 84%, vs. N = 175, 75%) were somewhat more likely to be in agreement with this (p < .05). Principals in low disadvantaged secondary schools (N = 200, 79%, vs. FSM 3 and 4: N = 91, 88%) were slightly less likely to agree strongly that "teachers regularly use pupil assessment data to set individual pupil achievement targets" (p < .05).
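The article does not state which statistical test produced these p values, and the counts below are reconstructed from the rounded percentages, so the sketch illustrates the mechanics of comparing two groups' agreement rates rather than reproducing the published result; a two-proportion z-test (via statsmodels) is only one plausible choice.

    from statsmodels.stats.proportion import proportions_ztest

    # High- vs. low-disadvantage primary principals agreeing with the monitoring item.
    n_high, n_low = 118, 175                              # group sizes as read from the text
    agree = [round(0.84 * n_high), round(0.75 * n_low)]   # approx. 99 and 131 in agreement
    stat, p_value = proportions_ztest(agree, [n_high, n_low])
    print(round(p_value, 3))  # the study's exact test and categories are unknown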
To explore the relationships between leadership, school process, and changes in pupil outcomes, exploratory factor analysis followed by CFA were used to investigate the possible structures underpinning the questionnaire data from principals and to test theoretical models about the extent to which leadership characteristics and practices identified in the earlier literature review (Leithwood et al., 2006) could be confirmed from the sample of effective and improved schools in England. Results showed that the underlying leadership factors identified for both primary and secondary principal surveys largely accorded with the conclusions of Leithwood et al.'s (2006) literature review. After deletion of missing data, the structural equation modelling (SEM) analysis was conducted with data for 309 secondary schools and 363 primary schools. The development of the models draws on but extends the cross-sectional approach that predicts student outcomes adopted in the earlier Leadership and Organisational Learning study in Australia by Silins and Mulford (2004)—as the factors identified in this research in the English context relate to improvement in school performance (as measured by change in student outcomes and progress). Results for the primary and secondary samples showed strong similarities. The SEM models predict changes (i.e., the extent of improvement) in student attainment over a 3-year period for our sample of effective and improved schools as the dependent variable. They demonstrated that the leadership constructs identified in the literature operated in ways in which we hypothesized in relation to influencing directly and indirectly a range of school and classroom processes that in turn predicted changes (improvements) in schools' academic performance. These dynamic, empirically driven models present new results on the leadership of a large sample of effective and improving schools in England and thus add to school improvement and leadership theories. Details of the exploratory factor analysis and CFA results and SEM models were reported in our final project report and other subsequent publications (Day et al., 2009; Day et al., 2011; Sammons et al., 2011; Sammons et al., 2014). In this article, we use the secondary SEM model as an example to illustrate how transformational and instructional leadership strategies were used by principals in our research to influence the processes of school improvement and, through these, improve pupil outcomes over time.
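The EFA-then-CFA-then-SEM pipeline described here can be sketched in lavaan-style syntax; the snippet assumes Python's semopy package, and the construct names, item names, and paths are invented placeholders that only gesture at the kind of model reported, not the study's actual specification.

    import pandas as pd
    import semopy

    # Measurement part (CFA) and a simplified structural part, in lavaan-style syntax.
    description = """
    SettingDirections =~ sd1 + sd2 + sd3 + sd4
    DevelopingPeople =~ dp1 + dp2 + dp3
    SchoolConditions =~ sc1 + sc2 + sc3
    SchoolConditions ~ SettingDirections + DevelopingPeople
    change_in_attainment ~ SchoolConditions
    """

    survey = pd.read_csv("principal_survey.csv")  # hypothetical item-level data, one row per school
    model = semopy.Model(description)
    model.fit(survey)
    print(model.inspect())           # path estimates and standard errors
    print(semopy.calc_stats(model))  # fit statistics such as CFI and RMSEA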
The secondary SEM model of leadership practice showed a relatively high internal consistency reliability of 0.950 (Figure 2). The model fit indices in Figure 2 suggest a "good" model–data fit (Hu & Bentler, 1999; Kaplan, 2004; Kelloway, 1998; Kline, 2010).

Figure 2. Example of leadership practices and changes in secondary pupil outcomes over 3 years: A structural equation model (N = 309 principal survey responses).

All latent variables were derived from the CFA, and Table 3 lists the observed variables (i.e., questionnaire items) that are associated with the latent constructs in the model. While all the links between the different latent constructs were statistically significant (as indicated by the t values at p < .05), some were stronger than others. The strength of these connections indicates which features of leadership practice were most closely linked for respondents to the surveys. Figure 2 shows that the school processes directly connected with principals' leadership strategies are the ones that also connect most closely with improvements in aspects of teaching and learning and staff involvement in leadership; these in turn help predict improvement in school conditions, and so, indirectly, improvement in pupil outcomes.
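The fit is reported as "good" without the index values being reproduced in this excerpt; the sketch below simply shows a hypothetical fit summary being checked against the commonly cited Hu and Bentler (1999) guidelines (CFI at or above roughly .95, RMSEA at or below roughly .06, SRMR at or below roughly .08).

    # Hypothetical fit summary; the article's actual index values are not reported here.
    fit = {"CFI": 0.96, "RMSEA": 0.05, "SRMR": 0.06}

    acceptable = fit["CFI"] >= 0.95 and fit["RMSEA"] <= 0.06 and fit["SRMR"] <= 0.08
    print("acceptable fit" if acceptable else "re-examine the model specification")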
Four groups of latent constructs were identified in the SEM (as indicated by four different shadings in Figure 2) predicting change in pupil attainment outcomes. They are positioned from proximal (i.e., factors that are near to principal leadership and influence directly constructs such as "developing people" and school conditions) to distal (i.e., factors that are further removed from principal leadership and influence indirectly the intermediate outcomes such as pupil behavior and attendance). They represent robust underlying dimensions of leadership and school and classroom processes (i.e., latent constructs relating to key features of leadership practice and school and classroom processes) and highlighted strategies and actions that school principals and staff had adopted to raise pupil attainment.
Group 1 comprises three key dimensions of principal leadership: "Setting Directions," "Redesigning Organization," and "Principal Trust," plus three other major dimensions, "Developing People," "Use of Data," and "Use of Observation," strongly linked with the first two. Group 2 comprises four dimensions in relation to leadership distribution in the school: "Distributed Leadership," "Leadership by Staff," "Senior Leadership Team (SLT) Collaboration," and the "SLT's Impact on Learning and Teaching." Group 3 comprises four dimensions relating to improved school and classroom processes that seem to function as mediating factors in this structural model: "Teacher Collaborative Culture," "Assessment for Learning," "Improvement in School Conditions," and "External Collaborations and Learning Opportunities." Group 4 also comprises four dimensions: "High Academic Standards," "Pupil Motivation and Learning Culture," "Change in Pupil Behavior," and "Change in Pupil Attendance." These constructs identify important intermediate outcomes that had direct or indirect effects on measured changes in pupil academic outcomes for schools over 3 years.
Table 3. Questionnaire Items That Underpin Each Latent Variable in the SEM Model (Secondary Principals).

Setting directions
  Demonstrating high expectations for staff's work with pupils
  Demonstrating high expectations for pupil behavior
  Demonstrating high expectations for pupil achievement
  Working collaboratively with the governing body
Developing people
  Encouraging staff to consider new ideas for their teaching
  Promoting leadership development among teachers
  Promoting a range of CPD experiences among all staff
  Encouraging staff to think of learning beyond the academic curriculum
Redesigning the organization
  Encouraging collaborative work among staff
  Improving internal review procedures
  Allocating resources strategically based on pupil needs
  Structuring the organization to facilitate work
Use of data
  Encouraging staff to use data in their work
  Encouraging all staff to use data in planning for individual pupil needs
Use of observation
  Regularly observing classroom activities
  After observing classroom activities, working with teachers to improve their teaching
  Using coaching and mentoring to improve quality of teaching
Leader trust in teachers
  I feel quite confident that my teachers will always try to treat me fairly
  My teachers would not try to gain an advantage by deceiving me
  I feel a strong loyalty to my teachers
Distributed leadership
  Most leadership tasks in this school are not carried out by the Head and SMT/SLT
  Many others take on leadership tasks
Staff leadership
  Groups of teachers
  Individual teachers with formally assigned tasks
  Individual teachers acting informally
SLT collaboration
  SLT share a similar set of values, beliefs, and attitudes related to teaching and learning
  Participate in ongoing collaborative work
  Have a role in schoolwide decision making
SLT impact on L and T
  SLT have a positive impact on standards of teaching
  SLT have a positive impact on raising levels of pupil attainment
  SLT have a role in determining the allocation of resources to pupils
Teacher collaborative culture
  Most teachers in our school share a similar set of values, beliefs, and attitudes related to teaching and learning
  Teachers in our school mostly work together to improve their practice
  There is ongoing collaborative planning of classroom work among teachers in our school
  Teachers in this school have a sense of collective responsibility for pupil learning
Assessment for learning
  The performance of department/subject areas is regularly monitored, and targets for improvement are regularly set
  Pupils are regularly involved in assessment for learning
  Class teachers regularly use pupil data to set individual pupil achievement targets
Improvement in school conditions
  School experienced enhanced commitment and enthusiasm of staff
  Promoted an orderly and secure working environment
  Improved pupil behavior and discipline as a result of a whole school approach
External collaborations and learning opportunities
  Parents often visit the school
  The school is actively involved in work with other schools or organizations
  There are more opportunities for pupils to take responsibilities for their own learning in school now than 3 years ago
High academic standards
  Pupils in this school can achieve the goals that have been set for them
  Most pupils do achieve the goals that have been set for them
  This school sets high standards for academic performance
Positive learner motivation and learning culture
  Pupils respect others who get good marks
  Change in pupils' motivation in learning
  Pupils feel safe in our school
Improvement in pupil behavior
  Changes in physical conflict among pupils
  Physical abuse of teachers
  Verbal abuse of teachers
Improvement in pupil attendance
  Changes in pupils' lateness to lessons
  Pupils' lateness to school
  Pupils' missing class

Note. SEM = structural equation modelling; CPD = continuing professional development; SMT = senior management team; SLT = senior leadership team.
These groups of latent constructs, driven by theories of school leadership and school improvement, were identified in the process of model building. As the SEM shows, the leadership practices of the principal (Group 1 dimensions) and of the SLT (Group 2 dimensions) influence, directly or indirectly, the improvement of different aspects of school culture and conditions (Group 3 dimensions), which then indirectly influence the change in pupil academic outcomes through improvements in several important intermediate outcomes (Group 4 dimensions). Some of the dimensions (latent constructs) in the model have direct effects on dimensions at more than one level. For example, to create a collaborative culture among teachers (Group 3), "Leader Trust in Teachers" (Group 1) is shown to be critical not only in terms of directly influencing the building and development of such a culture but also indirectly influencing the culture through distributing leadership to the "Staff" and promoting "SLT Collaboration" (Group 2). In addition, three dimensions (latent constructs) were found to have small direct effects on change in "Pupil Academic Outcomes": "SLT's Impact on Learning and Teaching," "Leadership by Staff," and "Improvement in Pupil Behavior."
While the direct effects of school leadership on pupil outcomes are generally found to be weak (Leithwood et al., 2006), these effects should be interpreted in relation to the size of the effects of other school variables, which are also generally found to have relatively small effects in comparison with teacher effects (Creemers & Kyriakides, 2008). Leithwood, Patten, and Jantzi (2010) argue that it is likely that the influence of different leadership practices travels different routes (i.e., influences different mediators) to improve student outcomes. As a way of interpreting the complex direct and indirect effects in our model, we suggest that "synergistic influences" may be promoted through the combination and accumulation of various relatively small effects of leadership practices that influence different aspects of school improvement processes in the same direction, in that they promote better teaching and learning and an improved culture, especially in relation to pupil behavior and attendance and other pupil outcomes such as motivation and engagement.
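One way to read this "small effects travelling different routes" argument is that a mediated (indirect) effect is the product of the path coefficients along a route, and several such products can accumulate alongside a small direct effect; the coefficients below are invented for illustration and are not taken from Figure 2.

    # Invented path coefficients for one mediated route plus a small direct path.
    a = 0.40       # leadership -> teacher collaborative culture
    b = 0.35       # teacher collaborative culture -> improvement in school conditions
    c = 0.30       # improvement in school conditions -> change in pupil outcomes
    direct = 0.05  # leadership -> change in pupil outcomes

    indirect = a * b * c       # 0.042: modest on its own
    total = direct + indirect  # 0.092: several small, same-direction effects accumulate
    print(indirect, total)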
Such synergy of leadership influences is also related to the ways in which transformational and instructional leadership strategies (Groups 1 and 2) were used in combination by secondary principals in our survey to create and build the structural and cultural conditions (Groups 3 and 4) necessary for school improvement. As the SEM model shows, transformational leadership strategies relating to setting directions and restructuring the organization for change (Group 1) set the departure point for their schools' improvement journeys and, from our case study data, are shaped by the principal's skills in diagnosis of their school's performance and needs. These strategies served to raise expectations and provide organizational structures that promoted collaborative work among teachers (see Table 3 for observable variables attached to these latent constructs). Building trusting relationships with teachers and the senior leadership team (Group 1) was shown to be another key leadership strategy that enabled the distribution of leadership across the school (four latent constructs at Group 2) and, through this, the transformation of the social and relational conditions of schools (Group 3: "teacher collaborative culture," "improvement in school conditions," and "external collaborations and learning opportunities").
As Table 3 shows, observed leadership strategies that are related to instruction tend to be loaded on their respective latent factor while those that are related to transformation and change form distinct latent variables. What is clear, however, is that neither instructional leadership strategies nor transformational leadership strategies alone were sufficient to promote the improvement identified by the SEM model. Leadership strategies that built on change in organizational structures and conditions but that focused more closely on developing people to become innovative and more rigorous in their teaching practices and to learn to use data and observation to improve their teaching (Group 1; see Table 3 for observable variables) also played an important role in school improvement processes. As the SEM model shows, they contributed to "positive learner motivation and learning culture," "high academic standards," and "improvement in pupil attendance" (Group 4) through leadership distribution (Group 2) and "teacher collaborative culture" and "assessment for learning" (Group 3). The SEM analysis of the responses of primary school principals showed very similar results, suggesting that leadership operated in similar ways across the two sectors.
We view the models as dynamic representations of the use of both transformational and instructional strategies by principals as they seek to identify the ways in which different dimensions that relate to features of leadership and school and classroom processes link with, and predict improvements in, schools' internal conditions and various pupil outcomes. The results suggest that school and leadership effects would be expected to operate most closely via their influence on developing teachers, improving teaching quality, and promoting a favorable school climate and culture that emphasize high expectations and academic outcomes. In addition, they showed connections between other important intermediate outcomes such as the retention and attendance of staff, improvements in pupil attendance and behavior, and perceived increases in pupil motivation, engagement, and sense of responsibility for learning—all of which were themselves linked by the dynamic combination and accumulation of different leadership values, strategies, and actions. The models and case studies indicate that their various effects on school improvement processes and outcomes were both interactive and interdependent in our sample of effective and improving schools.
Although of value in identifying patterns and testing hypothesized relationships, and a range of interconnected leadership actions and strategies, on their own, these SEM quantitative analyses were not able to reveal what kind of leaders these principals were. Nor could the SEM illuminate how they diagnosed their schools' needs or were perceived by their colleagues or the different ways in which combinations of strategies were applied by principals in particular contexts and at particular times and the reasons for this. Evidence from the case study investigations provides complementary, rich illustrations and insights as to how the "synergistic effects" of different dimensions of transformational and instructional leadership strategies on students' academic outcomes are achieved in different phases of schools' development over time. The use of mixed methods thus enabled deeper insights and explanations to emerge.
2. School improvement phases: The layering of transformational and instructional leadership strategies.
Two key findings that resulted from the project's mixed methods approach concerned the identification of clear, interrelated phases in the schools' improvement trajectories (reflecting the dynamic nature of improvement) and, within these, what we have termed "the layering of leadership."
Phases of School Improvement
Toward the end of the field research, we used focused interviews to discuss the school's improvement trajectories and the school's leadership since the principal's appointment. Principals and their key staff identified various combinations of actions and strategies that had contributed to school improvement as defined by improvements in student attainment, evidence from external Ofsted inspection reports and their own vision and broad educational purposes during their tenure. By plotting these on a time graph, then identifying significant turning points, each principal created a detailed "line of school improvement" that extended through a number of school improvement phases during their time at the school. The conceptualization of phases of school improvement focuses on how and why some leadership actions are contextually appropriate at a point in time. Together, these actions are able, individually and in combination, to make a difference to aspects of school improvement processes and enable schools to develop capacity and achieve intermediate successes that are essential for them to move on to the next phase of school improvement. There will be overlaps in terms of leadership practices (or "variables") in-between phases—thus, layering the foundation for the next phase. Some practices continue to be important across phases.
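A "line of school improvement" of this kind can be represented as a simple annotated time series; the tenure years, improvement ratings, and phase boundaries below are hypothetical and serve only to illustrate the plotting idea, assuming matplotlib.

    import matplotlib.pyplot as plt

    years = [2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008]   # hypothetical tenure
    improvement = [2, 2, 3, 4, 4, 5, 6, 7]                     # principal's own rating (arbitrary scale)
    phase_starts = {"Foundational": 2001, "Developmental": 2004,
                    "Enrichment": 2006, "Renewal": 2008}       # illustrative turning points

    plt.plot(years, improvement, marker="o")
    for name, start in phase_starts.items():
        plt.axvline(start, linestyle="--", linewidth=0.8)
        plt.text(start, max(improvement) + 0.2, name, rotation=90, va="bottom", fontsize=8)
    plt.xlabel("Year of tenure")
    plt.ylabel("Perceived improvement")
    plt.title("Line of school improvement (illustrative)")
    plt.show()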
It is important to note that there are differences between "phases" and "time periods." Depending on the capacity at the departure point for improvement and many other associated leadership and contextual factors, different schools may, for example, take longer to move from Phase 1 to Phase 2, while others may need a shorter period of time. Some schools in our case studies took, for example, 6 months to move from Phase 1 to Phase 2. The example we give in this article (Figure 3) took longer than that (3 years).
Nevertheless, while there were differences in the number and variations in the length of these phases, on close analysis, four broad improvement phases were identified across the 20 cases: foundational, developmental, enrichment, and renewal phases (see Figure 3).
Figure 3. Principal's line of success: Eyhampton Secondary School. *To distinguish transformational and instructional leadership strategies, we have placed strategies that focus on improving the quality of teaching and learning in italics.

In the foundational phase of principals' leadership, key strategies relating to transformational leadership (e.g., developing vision, setting directions, building a "core" senior leadership group with common purpose) were used, together with instructional leadership strategies (e.g., raising teacher performance expectations of self and pupils; improving pupil behavior; improving the physical, social, psychological, and emotional conditions for teaching and learning; and using data and research). They were combined to ensure that certain "basics" were in place. Three particular strategies were prioritized in this foundational phase.
a. Improving the physical environment of the school for staff and pupils to create positive environments conducive to high-quality teaching and learning.
Principals recognized the importance of creating a physical environment in which all staff and students felt inspired to work and learn. Changes to the school buildings varied in scope from increasing visual display in classrooms, corridors, and reception areas to the creation of internal courtyards and entirely new buildings. For example,
When [the principal] first came here the biggest impact that she made her number one priority was the environment. And everything went into the environment. That was the focus, nothing else, which I think is great because if you try to do too many things too soon, I don’t think we’d have got where we are today. So that was the one big thing. (Primary teacher, Round 1 Interview)
b. Setting standards for pupil behavior and improving attendance.
Strategies for improving pupil behavior initiated in the early phase often included changes to uniform, systems for monitoring attendance patterns, and follow-up of unauthorized absence.
Behavior was seen as a whole-school collegiate approach. We refined classroom rules and had the same classroom rules and expectations displayed in each classroom, so we were having, I think, more emphasis on a unified approach to behavioral issues so students knew the ground rules and what to expect. (Secondary Head of Department, Round 5 Interview)
c. Restructuring the senior leadership team and redefining the roles, responsibilities, and accountabilities of its members.
Both primary and secondary school principals prioritized the early creation of a Senior Leadership Team around them that shared and championed their values, purposes, and direction for the school. They viewed this as essential to enable the development of other important improvement strategies.
In the first year [of the] new SLT structure, that was partly good luck because the existing senior deputy left and that gave me the chance to restructure . . . basically bringing more people onto the team. The previous structure had been a head, two deputies, and two senior teachers, and I made it a head, a deputy, and five assistant principals. The number of assistant principals has increased with time. The idea was to have more people involved. That has been a key plank all the way through, to try and be less hierarchical than it had been before. (Secondary Principal, Round 5 Interview)
Only later did they distribute leadership responsibilities to the middle leaders and other staff.
In the developmental phase of principals' leadership, two key transformational and instructional strategies were prioritized. First, there was wider distribution of leadership with the focus being placed on redesigning organizational roles and responsibilities to extend leadership across the school, build leadership capacity, and, through this, deepen the scope and depth of change. By the second phase, all but two of the 20 case study principals were distributing significant decision making both to the senior leadership team and to a larger group of middle leaders. The additional distribution of responsibility was very much a function of growing trust and trustworthiness. For example,
We’ve always been involved in leading but I think it is distributed more between the whole staff now rather than just the senior leadership. (Primary Deputy Headteacher, Round 5 Interview)
Second, systematic classroom observations and increasing the use of data-informed decision making to improve the quality of teaching and learning were key features of practice in all schools (i.e., instructional focus). Data were used to identify those who needed extra support, facilitating increases in opportunities for personalized learning.
[These] data are what then help us to track progress within the school on [a] whole-school level and for a department because clearly each pupil is set targets when they join us in Year 7. (Secondary Head of Department, Round 3 Interview)
Building on the growth of achievement and its positive effects on teachers' and students' sense of confidence and stability in the foundational and developmental phases, the key strategies that principals prioritized in the later enrichment and renewal phases focused on the further personalization of learning and enriching of the curriculum. Throughout these phases, the emphasis on quality (of learning as well as teaching), classroom observations, target setting, and pupil participation in learning was increased. Personalization (in Phase 4) was reflected in an increasing emphasis on teaching that promoted more participative, interdependent, independent, and flexible learning and that supported a range of approaches to pupil learning. The relationships between the extended use of data and personalizing the curriculum (instructional leadership) were highlighted by staff and the senior leaders as key strategies that affected improved pupil outcomes. For example,
It would be the assessment and tracking systems. I think that has got to be. I had taken a long time to get there and I think at some stage that people thought that (Principal) was just filling in more forms for us, but I think that now people have realized that there is benefit, that from the systems we can narrow it down to individual pupils who might need differentiated approaches, personalized learning. It is not just one size fits all. (Primary Deputy Principal, Round 3 Interview)
Curriculum enrichment (as part of instructional leadership) refers to broad pupil outcomes and development of the whole pupil. It focuses on social and emotional learning and provision of creative, cross-curricular, or skills-based learning. For primary schools, the emphasis tended to be on making the curriculum more creative, flexible, and enjoyable for the pupils, aiming to inspire and interest them, with the aim of producing a more rounded individual. For secondary schools, flexibility and enjoyment were also central. This would sometimes involve whole days off timetable, working on cross-curricular projects or skills-based learning. Specialist school status often helped focus on these days and use the specialism as a guide, such as science fun days or adding extra dimensions to sports days.
The Layering of Leadership
These phases of improvement contained within them, then, different combinations of actions and strategies relating both to transformational and instructional leadership. At certain times, principals emphasized some more than others. They made judgments, according to their values and diagnoses of context, about the timing, selection, relevance, application, and continuation of strategies that created the optimal conditions for both the motivation, well-being, and commitment of staff and effective teaching, learning, and pupil achievement within and across broad development phases. Some strategies did not continue through each phase, an example being "restructuring and redesigning roles and responsibilities," which was a particular feature of the early phase. Others grew in importance and formed significant foundations on which other strategies were built. Thus, they grew, nurtured, and sustained school improvement by combining and accumulating what we identified as "layered leadership" strategies and actions that were both transformational and instructional.
For the purpose of this article, we have selected the story of a secondary case study school that provides an example of how the principal selected, combined, and accumulated strategic actions, placing relatively more or relatively less emphasis on one or more at any given time and over time, to ensure school improvement. In doing so, the principal was demonstrating not only the possession and use of key values, qualities, and skills (i.e., an ability to diagnose and problem solve and an ability to exercise judgments about selection, timing, and combination of strategies that were sensitive to individual and organizational contexts) but also highly attuned cognitive and emotional understandings of the needs of individual staff and students and of the concerns of both national government and local community. This example is used to illustrate how and why school leaders in our case study schools were able to influence others and achieve and sustain success over time in the contexts in which they worked, such that they not only transformed the conditions and culture of a school but, more importantly, developed and transformed the people who shaped and were shaped by the culture. Together, these resulted in continual improvement in student learning and achievement.
Eyhampton High School: From “Notice to Improve” to “Outstanding”
Context
Eyhampton is a 13 to 19 age mixed comprehensive school. It was situated in an area of high industrial deprivation, where few parents had a history of accessing further education. Aspirations and academic expectations in the community were typically low, although students came from a range of backgrounds. At the time of our visits, the school was below average size, with 793 pupils on the roll. It provided a range of opportunities for trips and visits, opportunities for achievement through sport, opportunities for performance through theatre and music arts, and opportunities for citizenship through involvement in a range of community activities.
The school was struggling with low attainment, poor behavior, a poor reputation locally, and a poor external inspection report when Graham, the new principal, arrived. He felt that strong authoritarian leadership was what was needed at that time to raise aspirations and change the underachieving school culture. He had worked as a modern foreign languages teacher and senior leader in a number of schools in a different region of England before joining this school 10 years earlier. Over the 10 years of his leadership, he had worked hard and successfully to change the physical environment, culture, and capacity of the school and to raise student performance. In 2006, the leadership of the principal and senior staff was described as "outstanding" by the external national inspection agency, and by 2010, the school itself achieved an overall grade of "outstanding." The school's attainment levels measured by national benchmarks and value-added indicators of student progress also revealed the school's transformed performance.
Four School Improvement Phases
Phase 1 (Foundational): Urgent attention, back to basics (3 years). Typical of the secondary schools in the sample, this principal began his tenure with a wide-ranging redesign of organizational roles and responsibilities, particularly within the leadership team. He had strong values, a sense of moral purpose, and a desire to raise standards for pupils in this disadvantaged and declining ex-mining area. There was a clear emphasis on high expectations and raising aspirations, which continued throughout. This led to a major focus on pupil behavior and on teacher and teaching quality, as well as an improvement in the physical environment. During this phase, the principal focused on six strategies, which, together, illustrate his twin focus on transformational and instructional leadership:
1. Redesigning the leadership and staff teams: Initially the principal, Graham, built a new senior leadership team (SLT) and focused on building and interlocking teams. He made a number of key appointments in the early stages and then later reduced the number of middle managers and the size of the SLT to widen participation and make the leadership structure stronger and flatter.
2. Training and development for all: Typically, he focused on school-based and school-led professional learning and development, which he saw as better value for money than external training. He provided a comprehensive range of training and monitoring for all staff, and in the first phase, the emphasis was on raising standards using the national inspection criteria.
3. School ethos and high expectations: This was described as "not easy." However, the principal was "fortunate" in that many of the staff who were initially resistant to change chose to retire or move, leaving the way clear to develop the new ethos and "get the floating voters on board."
4. Pupil behavior: The early change to a school uniform and the development of a focus on discipline and high behavioral expectations were key elements in instilling the new culture into the school. These measures were accompanied by the development of a new pastoral system, led by a member of the SLT, to ensure that the higher expectations were accompanied by pupil support and guidance.
5. Improving the physical environment: Some of the buildings were completely remodeled and this was an ongoing process. One of the first changes made by the principal was to create environments in each classroom that were conducive to learning.
6. Raising expectations and standards of classroom teaching and learning: This was an important strategy, both desirable (in terms of moral purpose and service to pupils) and necessary (in terms of securing external judgments of quality).
Phase 2 (Developmental): Rebuilding and making the school more student-centered, continuing focus on the quality of teaching and learning (2 years). The strategies used by Graham in Phase 2 again illustrate his combination of transformational and instructional leadership. In this phase, there was a continued focus on performance management of staff, high expectations, and improving teacher and teaching quality. Pupil behavior was also a continuing priority and addressed through the pastoral care system. Pupil voice was given greater importance. Five key leadership strategies were the focus of this phase:
1. Performance management—Observation and coaching: All staff were regularly observed and strengths and weaknesses were identified. Coaching and support were available for all to enable them to meet the high expectations. Peer observation also began to play a role in development. It was in this phase that the school increased the number of preservice students enrolled in school-based teacher training.
2. High expectations and use of data: To continue to raise aspirations, Graham introduced the use of data and target setting. This was seen as crucial to promoting higher academic standards and change in staff and student attitudes and in the school culture. In addition, he established a "pupil exclusion" center and a "flexible learning" center, which were used to manage teaching and learning for pupils with a range of special learning and behavioral needs.
We track the children really closely, which is not something that all of the departments do within the school, or are trying to do. And we are then able to send letters home, for example, termly, to tell the parents where they're at . . . and what percentage, so on and so forth. We're also quite motivational. (Principal, Round 2 Interview)
3. Pupil behavior and pastoral care: The focus on pupil behavior continued into Phase 2, and to ensure that pupils had the support they needed to achieve, the pastoral care system was strengthened. A collegial approach to student behavior management was adopted by all staff, and classroom rules were refined early in Graham's leadership.
We have very positive and supportive teacher pupil relationships. We have worked on pupil management strategies and assertiveness of staff. They can’t be aggressive or pupils will be aggressive back. (Head of Department, Round 1 Interview)
4. Pupil voice: The profile of pupil voice was increased. Graham introduced a questionnaire through which pupils could comment on lessons, teaching and learning, and other aspects of school life. A student council was also introduced early on, and this grew in influence over time. The school council was consulted at every level, including on staff recruitment. The opinions of its representatives were taken into account and had a significant influence on new appointments. The school council grew in many ways and provided the pupils with leadership opportunities.
5. Becoming a (preservice) training school: The school enjoyed strong links with universities and became a training school in this phase, enabling it to develop and then recruit newly qualified teachers who understood the ethos of the school.
Phase 3 (Enrichment): Period of reflection and curriculum development (2 years). In this phase, Graham distributed leadership more widely as a consequence of the trust that had developed over the previous phases. Again, both staff and students were at the center of his layering of values-based leadership strategies. He also expanded the curriculum significantly, enriching the experience of the pupils and making their options more personalized and pupil-centered. It was also in this phase that the school achieved "Specialist" status as a Sports College. Four key leadership strategies strengthened the school's earlier achievements and extended its development.
1. Distribution of leadership: Graham and his assistant principal took most of the strategic decisions in the foundational phase, but over time, this process became more distributed. In the third phase, decisions were taken with the whole of the SLT.
2. Curriculum enrichment, personalization, and pupil-centered learning: A new curriculum was designed to "meet the enormous range of needs that we have in the school, right from children who can't cope in the classroom . . . to pupils who will go to Cambridge" (Assistant Principal). In addition, pupils took more responsibility for their own learning, having a greater awareness of and responsibility for identifying and achieving their learning objectives. The expansion and personalization of curriculum provision took place throughout the school and had a powerful effect on pupil outcomes.
3. Developing the school ethos and raising aspirations: There was a renewed focus on developing the school ethos, accompanied by a continued emphasis on raising expectations.
The school culture is one of understanding, at the forefront, respect, warm and friendly. It’s fast and demanding as well. (Key Stage 4 Curriculum Coordinator, Round 1 Interview)
4. Specialist status—Building an improved environment: The achievement of specialist status enabled the school to release funds for further improvements to its physical environment.
Phase 4 (Renewal): Distributed leadership (3 years). Graham took further steps toward distributing leadership more widely, ensuring that all teachers were able to take on some leadership responsibility, a further extension of the trust built through the increased participation in leadership roles during the previous phase. Perhaps the most important change in this phase was the introduction of nonteaching staff as "inclusion managers," who were responsible for pupil behavior and emotional issues. Finally, the deeper strategic work on the curriculum also had a big impact in this phase, with a more highly personalized and enriched curriculum.
1. Further distribution of leadership: More responsibility was given to the faculty leaders to run their own departments. Also, leadership responsibilities were further devolved to middle leaders and other staff. Where the principal used to lead all the staff meetings, in this phase, he encouraged staff to take the lead. They were supported in their decision making and encouraged to find their own solutions, knowing that they could approach the principal whenever they needed guidance.
[The principal] wants staff to think of solutions, not to bring him problems. He gives responsibility to people. (Assistant Principal, Round 1 Interview)
2. Further pastoral restructuring—focus on learning and inclusion: The introduction of nonteaching pastoral staff was a common feature in many of the case study schools, and all reported how much this benefited behavior. With the increased support, the pupils cooperated more with staff. This new system helped provide an environment that was strict and yet supportive, regarded as "essential in this context." New "learning" and "inclusion" managers focused on behavioral issues and worked regularly with those pupils who required it. This monitoring and learning support allowed the school to meet the needs of individual pupils, essential in an area where the pupils had diverse needs and capacities.
3. Further curriculum enrichment and personalization: Pupils had a more extensive range of options available to them, and this provided opportunities for all pupils to succeed. Key elements of this new focus were “enrichment” days and community involvement.
Just for example, for Year 10 we had a crime and punishment day. So we had the justice system in, we had judges in, we set up a mock trial, we had the police in talking about forensic science, we had a youth offending team, we had convicted people in talking about what happened to them. So it’s citizenship and I think it’s true, it’s for them really. (Principal, Round 2 Interview)
Figure 3 shows how Graham established, combined, and built on strategies over time. It provides an illustration of the ways in which both transformational and instructional leadership strategies and practices were layered and developed over the course of the school's improvement journey. While some strategies, such as restructuring, which was a particular feature of the early phase, did not continue through each phase, others grew in importance, and others formed foundations on which other strategies were built. An example of the integration of transformational and instructional strategies is "pupil behavior," which figures in different ways in all phases of Graham's tenure (see Figure 3), expressed as "pupil behavior" in Phase 1, "pupil behavior and pastoral care" and "pupil voice" in Phase 2, "pupil-centered learning" in Phase 3, and "focus on learning and inclusion" in Phase 4. Alongside this focus on instructional leadership was an emphasis on, for example, "redesigning the leadership and staff teams" in Phase 1, "performance management: peer observation and coaching" in Phase 2, distribution of leadership to a small group of colleagues in Phase 3, and the "further expansion of leadership distribution and trust" in Phase 4. The growing confidence in using data, which began in Phase 2, was a necessary step on the way to developing a complex personalized curriculum in Phases 3 and 4. The two strategies then continued to develop in tandem. By the latest phase, a range of strategic actions was being implemented simultaneously, though not all with the same degree of intensity. While some had a higher priority than others, it was the context-sensitive combination and accumulation of actions, along with the timely broadening and deepening of strategies, that allowed the later strategies to succeed and made it possible for Graham's leadership to have such a powerful impact on pupil outcomes.
Discussion and Conclusions: Both Transformational and Instructional Leadership Are Necessary for Success
The complementarity of the quantitative and qualitative methodologies enabled this research to identify patterns and common strategies used by principals of effective and improved schools in England and to probe their qualities and context-specific strategies and actions over time. The principals
• measured success both in terms of pupil test and examination results and in terms of broader educational purposes
• were not charismatic or heroic in the traditional sense. However, they possessed a number of common values and traits (e.g., clarity of vision for the short and longer term, determination, responsiveness, courage of conviction, openness, fairness), and their work was informed and driven by strong, clearly articulated moral and ethical values that were shared by their colleagues
• were respected and trusted by their staff and parental bodies and worked persistently, internally and externally, to build relational and organizational trust
• built the leadership capacities of colleagues through the progressive distribution of responsibility with accountability, as levels of trust were built and reinforced
• placed emphasis on creating a range of learning and development opportunities for all staff and students
• used data, research, inspection evidence, and observation as tools to enhance teaching and learning and thus to support school improvement
• combined and accumulated both transformational and instructional leadership strategies within, through, and across each developmental phase of their schools' long-term improvement.
In addition, principals whose schools drew their pupils from highly socioeconomically disadvantaged communities faced a greater range of challenges in terms of staff commitment and retention and student behavior, motivation, and achievement than those in more advantaged communities. Principals of primary and secondary schools in all contexts were able to achieve and sustain successful pupil outcomes, but the degree of success was likely to be influenced by the relative advantage/disadvantage of the communities from which their pupils were drawn.
These results draw attention to Hallinger's (2005) argument that leadership should be viewed as a process of mutual influence, whereby instructional leaders influence the quality of school outcomes through shaping the school mission and the alignment of school structures and culture. This in turn promotes a focus on raising the quality of teaching and learning (instructional leadership). The extent to which influence is perceived, felt, and "measured" in terms of students' academic gains can only be judged over time; and how influence is exercised positively or negatively over time can in part be seen in the conditions, structures, traditions, relationships, expectations, and "norms" that make up the cultures of schools. In the effective and improving schools in our study, principals palpably exercised both "transformational" and "instructional" leadership. We have seen this both in the quantitative findings (e.g., in the importance of "trust") and in the clear evidence from the qualitative case studies of the strategies used to raise expectations and build the commitment and capacities of teachers, students, and community. Both "transformational" and "instructional" leadership strategies were, therefore, used in combination, as Printy et al. (2009) would claim, in an "integrated" leadership model. However, even for successful principals like Graham, integration took time.
Like all research, the IMPACT project had its limitations. For example, while it was able to draw on national data on effective and improving schools across all socioeconomic and geographic contexts, the initial judgment of "effectiveness" related to performance in national tests and examinations and to the judgments made by Ofsted (England's independent school inspection agency). The structural equation model used to illustrate our quantitative conclusions was based on the responses of the principals only and not their staff (although further work has supported the models; e.g., Sammons et al., 2014). Moreover, we drew only from those schools in the national database that had improved over at least 3 consecutive years under the leadership of the same principal. In the 20 school cases, we were able to interview a cross-section of staff and other stakeholders over 3 years but did not directly observe the principals at work.
Nevertheless, the findings of the research both confirmed the observations of a range of previous research and, through its mixed methods approach, enabled new knowledge to be generated about the ways in which the strategies, actions, and values of the principals and their relationships with teachers, parents, and the community were grown, accumulated, combined, and applied over time in different contexts in ways that resulted in ongoing, sustained school improvement. The qualitative component of the IMPACT study, in particular, adds to the growing body of research suggesting that successful principals use the same basic leadership practices. It found, also, that there is no single leadership formula for achieving success. Rather, successful principals draw differentially on elements of both instructional and transformational leadership and tailor (layer) their leadership strategies to their particular school contexts and to the phase of development of the school. When and how they do so, and the relative emphases that they place on these in different phases of their schools' improvement trajectories, depend on their ongoing diagnoses of the needs of staff and students, the demands of the policy contexts and communities that their schools serve, clear sets of educational beliefs and values that transcend these, and the growth of trust and trustworthiness:
Is it a surprise, then, that principals at schools with high teacher ratings for "institutional climate" outrank other principals in developing an atmosphere of caring and trust? (The Wallace Foundation, 2011, p. 6)
The work of successful principals, like that of the best classroom teachers, is intuitive, knowledge informed, and strategic. Their ability to respond to their context and to recognize, acknowledge, understand, and attend to the needs and motivations of others defines their level of success. Successful principals build cultures that promote both staff and student engagement in learning and raise students' achievement levels in terms of value-added measures of pupil progress in national test and examination results.
Much has been written about the high degree of sensitivity that successful leaders bring to the contexts in which they work. Some would go so far as to claim that “context is everything.” However, the IMPACT research suggests that this reflects too superficial a view of who successful leaders are and what they do. Without doubt, successful leaders are sensitive to context, but this does not mean that they use qualitatively different practices in every different context. It means, rather, that they apply contextually sensitive combinations of the basic leadership practices described earlier. The ways in which leaders apply these leadership practices—not the practices themselves—demonstrate responsiveness to, rather than dictation by, the contexts in which they work. They also demonstrate their ability to lead and manage successfully and to overcome the extreme challenges of the high need contexts in which some of them work. Success, then, seems to be built through the synergistic effects of the combination and accumulation of a number of strategies that are related to the principals’ judgments about what works in their particular school context.
The evidence in this article also suggests that there is value in using mixed methods approaches to identify and study leadership and to move beyond the oversimplified promotion of particular types or models of leadership (an adjectival approach to improvement) as the key to enabling success, recognizing that what leaders do (strategies and actions) and their personal qualities (values and relationships) are more important. Future research should move beyond the use of single-paradigm models that may, despite their apparent technical rigor, provide somewhat simplistic dichotomies or limited accounts of successful school leadership. Rather, to increase understanding, we need research that combines and synthesizes results and evidence from different methodological perspectives to provide more nuanced accounts and insights that can inform and support improved practice.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The authors received financial support for the original research from the Department for Education (DfE) in England but not for the authorship or publication of this article.
Note
1. It is important to note that all principals had led their schools for more than 5 years. Thus, informants were able to draw on a considerable bank of experience of the nature, impact, and effects of their principals' leadership. Direct quotations used in this article, as indicated, are drawn from a range of interviews over time.
References
Baker, D., & LeTendre, G. (2005). National differences, global similarities: World culture and the future of schooling. Stanford, CA: Stanford University Press.
Ball, S. J. (2001). The teachers’ soul and the terrors of performativity. London, England: University of London, Institute of Education.
Ball, S. J. (2003, May 14). Professionalism, managerialism and performativity: Professional development and educational change. Paper presented at a conference organized by the Danish University of Education, Copenhagen, Denmark.
Bruggencate, G., Luyten, H., Scheerens, J., & Sleegers, P. (2012). Modeling the influence of school leaders on student achievement: How can school leaders make a difference? Educational Administration Quarterly, 48, 699-732.
Bryk, A., Sebring, P., Allensworth, E., Luppescu, S., & Easton, J. (2010). Organizing schools for improvement: Lessons from Chicago. Chicago, IL: University of Chicago Press.
Creemers, B. P. M., & Kyriakides, L. (2008). The dynamics of educational effectiveness: A contribution to policy, practice and theory in contemporary schools. Abingdon, England: Routledge.
Day, C., & Leithwood, K. (Eds.). (2007). Successful school principalship in times of change: An international perspective. Dordrecht, Netherlands: Springer.
Day, C., Sammons, P., Hopkins, D., Harris, A., Leithwood, K., Gu, Q., . . . Kington, A. (2008). The impact of school leadership on pupil outcomes: Interim report. London, England: UK Department for Children, Schools and Families Research.
Day, C., Sammons, P., Hopkins, D., Harris, A., Leithwood, K., Gu, Q., . . . Kington, A. (2009). The impact of school leadership on pupil outcomes: Final report. London, England: UK Department for Children, Schools and Families Research.
Day, C., Sammons, P., Leithwood, K., Hopkins, D., Gu, Q., Brown, E., & Ahtaridou, E. (2011). Successful school leadership: Linking with learning and achievement. Maidenhead, England: McGraw Hill Open University Press.
Fiedler, F. E. (1964). A contingency model of leadership effectiveness. Advances in Experimental Social Psychology, 1, 149-190.
Goodson, J. R., McGee, G. W., & Cashman, J. F. (1989). Situational leadership theory: A test of leadership prescriptions. Group & Organization Management, 14, 446-461.
Graeff, C. L. (1997). Evolution of situational leadership theory: A critical view. Leadership Quarterly, 8, 153-170.
Gu, Q., & Johansson, O. (2013). Sustaining school performance: School contexts matter. International Journal of Leadership in Education, 16, 301-326.
Hallinger, P. (2005). Instructional leadership and the school principal: A passing fancy that refuses to fade away. Leadership and Policy in Schools, 4, 221-239.
Hallinger, P. (2010). Leadership for learning: What we have learned from 30 years of empirical research. Paper presented at the Hong Kong School Principals’ Conference 2010: Riding the Tide, The Hong Kong Institute of Education, Hong Kong.
Hallinger, P., & Heck, R. H. (1996). The principal's role in school effectiveness: An assessment of methodological progress, 1980-1995. In K. Leithwood & P. Hallinger (Eds.), International handbook of educational leadership and administration (pp. 723-783). Dordrecht, Netherlands: Kluwer Academic.
Hallinger, P., & Heck, R. H. (2010). Collaborative leadership and school improvement: Understanding the impact on school capacity and student learning. School Leadership and Management, 30(2), 95-110.
Hersey, P., & Blanchard, K. H. (1988). Management of organisational behaviour (5th ed., pp. 169-201). Englewood Cliffs, NJ: Prentice Hall.
Hu, L., & Bentler, P. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1-55.
Ishimaru, A. (2013). From heroes to organizers: Principals and education organizing in urban school reform. Educational Administration Quarterly, 49, 3-51.
Kaplan, D. (2004). Structural equation modelling. In M. Lewis-Beck, A. Bryman, & T. Liao (Eds.), Encyclopaedia of social science research methods (pp. 1089-1093). Thousand Oaks, CA: Sage.
Kelloway, E. K. (1998). Using LISREL for structural equation modelling. Thousand Oaks, CA: Sage.
Kline, R. (2010). Principles and practice of structural equation modeling (3rd ed.). New York, NY: Guilford Press.
Leithwood, K., Day, C., Sammons, P., Harris, A., & Hopkins, D. (2006). Seven strong claims about successful school leadership. Nottingham, England: National College for School Leadership.
Leithwood, K., Day, C., Sammons, P., Hopkins, D., & Harris, A. (2008). Successful school leadership: What it is and how it influences pupil learning. Nottingham, England: Department for Education and Skills.
Leithwood, K., & Jantzi, D. (1999). Transformational school leadership effects: A replication. School Effectiveness and School Improvement, 10, 451-479.
Leithwood, K., Jantzi, D., & Steinbach, R. (1999). Changing leadership for changing times. Buckingham, England: Open University Press.
Leithwood, K., Patten, S., & Jantzi, D. (2010). Testing a conception of how school leadership influences student learning. Educational Administration Quarterly, 46, 671-706.
Leithwood, K. A., & Sun, J. (2012). The nature and effects of transformational school leadership: A meta-analytic review of unpublished research. Educational Administration Quarterly, 48, 387-423.
Louis, K. S., Leithwood, K., Wahlstrom, K. L., & Anderson, S. E. (2010). Investigating the links to improved student learning: Final report of research findings to the Wallace Foundation. Minneapolis: University of Minnesota.
Marks, H., & Printy, S. (2003). Principal leadership and school performance: An integration of transformational and instructional leadership. Educational Administration Quarterly, 39, 370-397.
Marzano, R. J., Waters, T., & McNulty, B. A. (2005). School leadership that works: From research to results. Alexandria, VA: Association for Supervision and Curriculum Development.
Moos, L., Johansson, O., & Day, C. (Eds.). (2012). How school principals sustain success over time: International perspectives. Dordrecht, Netherlands: Springer.
Mulford, B. (2008). The leadership challenge: Improving learning in schools (Australian Education Review No. 53). Camberwell, Victoria, Australia: Australian Council for Educational Research. Retrieved from www.acer.edu.au/research_reports/AER.html
Mulford, B., & Silins, H. (2011). Revised models and conceptualisation of successful school principalship for improved student outcomes. International Journal of Educational Management, 25, 61-82.
Organisation for Economic Co-operation and Development. (2008). Improving school leadership. Paris, France: Author.
Organisation for Economic Co-operation and Development. (2010). Education today: The OECD perspective. Paris, France: Author.
Organisation for Economic Co-operation and Development. (2012). Preparing teach- ers and developing school leaders for the 21st century. Paris, France: Author.
Organisation for Economic Co-operation and Development. (2013). Synergies for better learning: An international perspective on evaluation and assessment. Paris, France: Author.
Pennings, J. M. (1975). The relevance of the structural-contingency model for organisational effectiveness. Administrative Science Quarterly, 20, 393-410.
Printy, S. M., Marks, H. M., & Bowers, A. J. (2009). Integrated leadership: How principals and teachers share transformational and instructional influence. Journal of School Leadership, 19, 504-532.
Putnam, R. (Ed.). (2002). Democracies in flux: The evolution of social capital in contemporary society. Oxford, England: Oxford University Press.
Robinson, V., Hohepa, M., & Lloyd, C. (2009). School leadership and student outcomes: Identifying what works and why. Auckland, New Zealand: Ministry of Education, Best Evidence Syntheses Iteration.
Robinson, V., Lloyd, C., & Rowe, K. (2008). The impact of leadership on student outcomes: An analysis of the differential effects of leadership types. Educational Administration Quarterly, 44, 635-674.
Sammons, P. (2010). The contribution of mixed methods to recent research on educational effectiveness. In A. Tashakkori & C. Teddlie (Eds.), Handbook of mixed methods research (2nd ed., pp. 697-723). Thousand Oaks, CA: Sage.
Sammons, P., Davies, S., Day, C., & Gu, Q. (2014). Using mixed methods to investigate school improvement and the role of leadership. Journal of Educational Administration, 52, 565-589.
Sammons, P., Gu, Q., Day, C., & Ko, J. (2011). Exploring the impact of school leadership on pupil outcomes: Results from a study of academically improved and effective schools in England. International Journal of Educational Management, 25, 83-101.
Silins, H., & Mulford, W. (2002a). Leadership and school results. In K. Leithwood & P. Hallinger (Eds.), Second international handbook of educational leadership and administration (pp. 561-612). Dordrecht, Netherlands: Kluwer Academic.
Silins, H., & Mulford, W. (2004). Schools as learning organisations: Effects on teacher leadership and student outcomes. School Effectiveness and School Improvement, 15, 443-466.
Thompson, G., & Vecchio, R. P. (2009). Situational leadership theory: A test of three versions. Leadership Quarterly, 20, 837-848.
The Wallace Foundation. (2011). The school principal as leader: Guiding schools to better teaching and learning. New York, NY: Author.
Witziers, B., Bosker, R., & Krüger, M. (2003). Educational leadership and student achievement: The elusive search for an association. Educational Administration Quarterly, 39, 398-425.
Ylimaki, R. M., & Jacobson, S. L. (Eds.). (2011). US and cross-national policies, practices, and preparation. New York, NY: Springer.
Author Biographies
Christopher Day is a professor of education, convenor of the Centre for Research on Educational Leadership and Management (CRELM) at the University of Nottingham, and leader of the 25-country International Successful School Principals Research Project. He is lead author of Successful School Leadership: Linking with Learning and Achievement (Open University Press, 2010); Resilient Teachers, Resilient Schools (Routledge, 2014); and co-editor of Leading Schools Successfully: Stories from the Field (Routledge, 2014).
Qing Gu is a professor of education at the University of Nottingham. She is author of Teacher Development: Knowledge and Context (Continuum, 2007); editor of The Work and Lives of Teachers in China (Routledge, 2014); and coauthor of Teachers Matter (Open University Press, 2007), The New Lives of Teachers (Routledge, 2010), Successful School Leadership: Linking with Learning and Achievement (Open University Press, 2011), and Resilient Teachers, Resilient Schools (Routledge, 2014).
Pam Sammons is a professor of education at the Department of Education, University of Oxford, and a senior research fellow at Jesus College, Oxford. Her research over more than 30 years has focused on school effectiveness and improvement, school leadership, teaching effectiveness, and promoting equity and inclusion in education. She is a governor of a primary school in Oxfordshire and a secondary school academy in the city of Oxford.
Mission, vision, values, and goals: An exploration of key organizational statements and daily practice in schools
D. Keith Gurley, Gary B. Peters, Loucrecia Collins, and Matthew Fifolt

Published online: 26 February 2014
© Springer Science+Business Media Dordrecht 2014
J Educ Change (2015) 16:217–242. DOI 10.1007/s10833-014-9229-x

Abstract This article reports findings from a study of graduate-level educational leadership students' familiarity with shared mission, vision, values, and goals statements and the perceived impact these concepts have on their practice as leaders and teachers in schools. The study is primarily qualitative and uses content analysis of responses to open-ended questions. Researchers adopted a limited quantitative analysis technique, however, in order to report frequency of responses to survey questions. We used the literature base regarding strategic planning and school improvement as conceptual frameworks to guide the analysis. Findings revealed that educational leadership students had limited ability to recall the content of key organizational statements. Further, respondents reported that these key organizational statements had only minimal impact on their daily practice. Implications are presented for university preparation programs designed to equip school leaders to effect meaningful school improvement and organizational change centered on development of shared mission and vision for improvement. This research confirms similar findings reported by Watkins and McCaw (Natl Forum Educ Adm Superv J 24(3):71–91, 2007) and adds to the research by exploring respondents' reports of the impact of mission, vision, values, and goals statements on their daily practice. It further extends the discussion by presenting a content analysis of key organizational statements, comparing mission, vision, values, and goals statements to models of strategic planning and planning for continuous school improvement from the organizational improvement literature.

D. K. Gurley (corresponding author): Department of Human Studies, University of Alabama at Birmingham, 210B Education Building, 901 13th Street South, Birmingham, AL 35294, USA. E-mail: [email protected]
G. B. Peters: Department of Human Studies, University of Alabama at Birmingham, 203 Education Building, 901 13th Street South, Birmingham, AL 35294, USA. E-mail: [email protected]
L. Collins: Department of Human Studies, University of Alabama at Birmingham, 223 Education Building, 901 13th Street South, Birmingham, AL 35294, USA. E-mail: [email protected]
M. Fifolt: Center for the Study of Community Health, University of Alabama at Birmingham, 112 912 Building, 912 18th Street South, Birmingham, AL 35294, USA. E-mail: [email protected]
Keywords Goal-setting · Organizational change · Organizational values · School culture · School improvement · Shared mission · Shared vision
Introduction
Articulating and nurturing widely shared ownership and commitment to purpose in
organizations (i.e., mission, vision, values, and goals) has long been identified as
essential to effective strategic planning for organizational improvement (Bryson
2004; Kaufman 1992; Mintzberg 1994). Bryson (2004) stated, "Clarifying purpose
can eliminate a great deal of unnecessary conflict in an organization and can channel
discussion and activity productively" (p. 38). Unity of purpose, or mission, within
an organization provides a means by which organizational members can work
together toward a common set of objectives.
The purpose of the research presented in this article was to explore how familiar
graduate students, enrolled in educational leadership programs at a southeastern US
university, were with the mission, vision, values, and goals statements in their
schools. We also explored the perceived level of impact that these statements had on
educational leadership students’ daily, professional practice. The article concludes
with a discussion of the findings as well as some implications for university
preparation programs designed to equip future school leaders to effect meaningful,
organizational change in their schools.
Background
While discussion of strategic planning finds its roots in business management
contexts, much of what has been presented within this literature has migrated into
the research and discussion regarding school improvement models over the last two
decades (Quong et al. 1998). Development of a clear school mission, shared vision,
articulated values, and specific goal statements has also been applied more
specifically to the fundamental processes of school improvement focused on
increased levels of learning for all students (DuFour and Eaker 1998; DuFour et al.
2008; Perkins 1992; Renchler 1991; Teddlie and Reynolds 2000; Wiggins and
McTighe 2007). Yet, despite a longstanding and consistent admonition in the
literature regarding the purpose and power in developing these foundational
statements, the practice of clearly articulating such statements continues to be
effectively ignored by many school leaders (DuFour et al. 2008; Watkins and
McCaw 2007). In an insightful piece on vision-guided schools, Pekarsky (2007)
stated, ‘‘… thoughtful, systematic attention to larger questions of purpose is rarely at the heart of American social and educational discourse’’ (p. 424).
The current authors contend that, among school leaders, there exists a lack of
understanding of exactly what mission, vision, values, and goals statements are and
the value such foundational statements offer to the development of shared
commitment among stakeholders to the process of school improvement. Citing
evidence from a recent survey of our graduate-level educational leadership students,
we point to the presence of an implied disconnect between the widely established,
best practice in the first steps of school improvement (i.e., development of key
organizational mission, vision, values, and goals statements) and the daily,
professional practice of educational leaders charged with demonstrating continuous
improvement in school achievement and student learning.
In the first section of the article, we provide clear definitions of the terms and
then research-based evidence for the value of school mission, vision, values, and
goals statements. Next, we describe findings from the research conducted in an
educational leadership program at a university in the southeastern United States. We
conclude the article by presenting a discussion of the findings as well as some
implications for further research into the topic. The article begins with a description
of the two key conceptual frameworks adopted to guide the research, i.e., strategic
planning and continuous school improvement. We based the content analysis of
mission, vision, values, and goals statements recalled by our students on the models
of strategic planning and continuous school improvement.
Conceptual frameworks
The research project was guided by two frames of thought regarding organizational
and school change. The first of these frameworks is strategic planning, developed by
authors and researchers primarily outside the field of education. The second
framework, that is continuous school improvement, stems from the strategic
planning literature, but applies its concepts specifically to the process of increasing
the capacity of schools to effect high levels of learning for students and adults in a
school context. Discussion of continuous school improvement comprises a broad
framework developed by a wide variety of school improvement experts. Strategic
planning and continuous school improvement frameworks are described in more
detail in the section that follows.
Strategic planning
We first adopted a conceptual framework of strategic planning to guide the project.
Strategic planning finds its roots in the work of Lewin (1943) on organizational
change (Burnes 2004). Lewin described three stages of organizational change
claiming that, in order to solidify meaningful change within an organization,
organizational members must first unfreeze or become aware that the current
mindset within the organization must change in order to meet new demands from
the external environment. Next, organizational members, now aware of the need for
change, actually experience a state of confusion or become unsettled as they recreate
and redefine the new norms for the organization. Finally, once new norms and
expectations have been defined, the organization experiences a state of freezing in
which they establish, commit to, and become comfortable again with the new set of
organizational norms, goals, and expectations (Lewin 1943).
Based upon an extensive knowledge of historical literature on planning in
American and European corporations, Mintzberg (1994) sought to define the elusive
construct of strategic planning. Mintzberg asserted that planning has been conceived
of historically as merely "future thinking" by many planning experts, while others
define planning as actually "controlling the future" (p. 7). Finally, Mintzberg
asserted the possibility that planning is simply a process of "decision making" (p.
9). In an effort to define strategic planning, Mintzberg clearly pointed to the
complex nature of the process and the need for organizational actors to define what
it is they mean by "strategic planning" and how that process will be fleshed out in
the organization. Other strategic planning experts have focused specifically on
aspects of organizational change in the nonprofit sector, including the second phase
described by Lewin (1943), wherein organizational leaders and members focus on
developing a new set of organizational norms and commitments in order to enable
the change process (Bardwell 2008; Crittenden and Crittenden 1997; Moore 2000).
Describing a successful strategic planning process in their nonprofit organization,
McHatton et al. (2011) stated, "…strategic planning has been shown to be beneficial in gaining stakeholder consensus for organizational objectives and future action" (p.
235).
In this second stage of strategic planning described by Lewin (1943), confusion,
members engage in a process of developing organizational purpose statements
intended to guide the change process. Purpose statements include statements of
mission, vision, values, and goals, and become the cornerstones upon which
organizational change is built (Bardwell 2008; Crittenden and Crittenden 1997;
Moore 2000). McHatton et al. (2011) identified common elements of effective
strategic planning emergent from the literature and from their own experience,
including the development of clear mission and vision statements, a commitment to
organizational values (e.g., leadership, collaboration), and development of a
systematic way to monitor progress toward organizational goals.
School improvement
Out of this dialogue of strategic planning for organizations in general stems the
discussion of organizational improvement specifically for schools. Researchers for
the current study adopted this conceptual framework of school improvement to
further guide data analysis and reflection.
In the seminal work on the problem of change in US schools, Sarason (1971)
clearly explicated many problems that school leaders often encounter in their efforts
to effect meaningful, modal change in educational settings. Among these problems
was an insufficient understanding of the context of schools and the regularities, or
common practices of school personnel. Without a thorough understanding of these
regularities, change agents have traditionally found it difficult, if not impossible, to
implement and sustain desired changes in schools.
Rooted in the work of Sarason (1971), Fullan (1993, 1998, 1999) extended the
discussion of the complexity of the change process in school improvement. Fullan
observed that schools are not only complex organizations, but operate in constantly
changing, fluid contexts. School improvement leaders are challenged, at best, to introduce
and support change efforts within organizations that experience ongoing, dynamic
external and internal change forces, most of which may be hidden and unexpected.
Fullan (1993) explained that, while developing a shared vision among school
personnel is essential, it is important that this vision remain fluid, especially at the
point of introduction of substantial change in a school. Fullan recommended that
school improvement leaders remain open to reflection within and about the
organization in order to gain a more comprehensive understanding of the context
before establishing an organizational vision. Fullan wrote, "Under conditions of
dynamic complexity, one needs a good deal of reflective experience before one can
form a plausible vision’’ (p. 28).
Many other authors have contributed to the school improvement knowledge base
over the last two decades, offering a wealth of research-based practices in school
leadership, change agency, instruction, curriculum development, and organizational
planning (Danielson 2007; DuFour et al. 2008; Marzano et al. 2005; Reeves 2000).
In presenting an increasingly popular model of school improvement, professional
learning communities (PLC), DuFour and Eaker (1998) identified the articulation,
implementation, and stewardship of mission, vision, values, and goals statements as
fundamental building blocks to effective school improvement. For the current study,
researchers adopted the PLC guiding framework due to the more extensive
articulation of the definitions of these foundational, organizational statements.
These definitions are explained more clearly in the following section.
Review of literature
Although strategic planning and school improvement literature bases are replete,
even saturated, with discussion about organizational mission, vision, values, and
goals, there remains a widespread misunderstanding of exactly what each of these
terms means, as well as an apparent lack of understanding of the value of establishing
such statements to the process of school improvement (DuFour et al. 2008; Watkins
and McCaw 2007). It is imperative, then, that we carefully define each term and
provide background regarding how well-articulated, foundational terms can
contribute to the evolution and improvement of organizations and schools.
Defining a mission statement
Often in leadership discourse, a mission statement is used synonymously and
interchangeably with the vision statement of an organization. However, the two
statements are distinct from one another (DuFour et al. 2008). A mission statement
is, most simply, a statement of why an organization exists, a statement of its
fundamental purpose. In the context of continuous school improvement, DuFour and
Eaker (1998) described a mission statement as "stating the business of our
business" and answering the question, "Why do we exist?" (p. 58). Lunenberg
(2010) argued that leading an ongoing, community-wide discussion about the
purpose of the organization’s existence is essential to the function of school
leadership and to the process of building unity and shared commitment to the work
to be done in an educational organization.
Stemler et al. (2011) conducted a comprehensive content analysis of high school
mission statements from a sample of schools from ten states across the United States.
These authors noted that, despite the presence of an allegedly unifying school mission
statement, the reasons that stakeholders assign for a school’s existence may vary
widely from school to school, and even among stakeholders within the same school.
For example, faculty and other community members may perceive the mission of a
school as ranging from preparing students to function as mature civic, emotional,
cognitive, and social adults to preparing students to assume vocational functions,
physically healthy habits, and even local and global integration (Stemler et al. 2011).
Stemler et al. (2011) argued that, while each of the perceived missions or purposes of
schooling is indeed important and laudable, the fact that such a wide variety of
individually held or perceived purposes for schooling exists, even among faculty
members operating within the same school unit, results in a lack of unity of mission and
shared effort toward a common set of objectives. This lack of unity in defining a shared
mission may result in a breakdown of mutual understanding of the primary purpose for
the school’s existence and eventually lead to fragmentation of effort among
organizational actors. The purpose of developing a widely shared organizational
mission, therefore, is not conducted to limit other, important functions of schools, but
rather to focus members’ efforts in order to reach clearly articulated and specific goals
(Bryson 2004; DuFour et al. 2008; Kaufman 1992; Mintzberg 1994; Stemler et al. 2011).
According to Boerema (2006), the mission statement of a school actually
articulates a set of values that answer fundamental questions about the purpose of
education and how the educational program should be carried out. Boerema pointed
out that, ‘‘The school mission provides the context for governance, decision making,
and the way the school is managed’’ (p. 182). Boerema further explained that a
school mission statement provides key direction to those individuals performing the
core technology of a school, namely teaching and learning.
The process of articulating a clear and concise mission statement is imperative in
order to solidify a shared understanding of what the primary work of the school
actually is. Without careful examination, discussion, articulation, and clarification
of the school mission, educational professionals who work together closely on a
daily basis may interpret their purpose very differently, each assuming a different
reason for why they do the work that they do.
Defining a vision statement
A vision statement is qualitatively different from a mission statement. A vision
statement is an articulation not of purpose, but of a preferred future for the
organization. According to DuFour and Eaker (1998), a vision statement answers
the question, ‘‘What do we hope to become?’’ (p. 62).
A vision statement provides stakeholders with a picture of what their ideal school
and students will look like if educators are successful in working together to achieve
that vision. Though a vision statement should be clear and meaningful to all
stakeholders, effective vision statements are concise and provide lofty, yet
measurable, language so that school personnel know when the vision has been
achieved or when it should be adjusted to better meet the needs of the organization
(Pekarsky 2007).
Pekarsky (2007) stated that a vision statement is far more than a mere slogan. A
vision statement enables school community members to assume a desired state of
heart and mind with which to carry out their daily functions in the school.
Stakeholders in a vision-guided organization, through the function of a clearly
articulated and supported vision statement, are explicit about where they are headed,
what they are about, and how they will know when they have arrived.
Kose (2011) stated that a shared, articulated vision is characteristic of effective
schools, is a vehicle for building more inclusive and equitable schools, and can
influence positive change in school improvement efforts, hiring, evaluation,
professional development, and other key school functions. According to Kose,
principals can use a well-crafted and supported vision statement to effect powerful
change in the school on many different levels.
Defining values statements
Perhaps the least understood and under-implemented of the four foundational
statements is the statement of core values. As the name suggests, core values
statements articulate the shared beliefs of an organization. Again, DuFour and Eaker
(1998) claimed that core values statements answer the question, "How must we
behave in order to make our shared vision a reality?" (p. 88).
In describing their work with business organizations, Blanchard and O’Connor
(1997) argued that, ‘‘When aligned around shared values and united in a common
purpose, ordinary people accomplish extraordinary results and give their organi-
zation a competitive edge’’ (p. 144). Though their work and research was conducted
in a profit-driven context, key concepts may arguably be applied to non-profit
organizations, as well. Blanchard and O’Connor wrote of the importance for
contemporary organizations to adopt key values, such as honesty, fairness, and
integrity, in order to survive in the current economy.
Blanchard and O’Connor (1997) further contended that organizations, centered
on powerful, shared values, report better service to their clientele, higher profits, and
a higher quality of working environments for their employees. The authors stated
that it is these shared values that act as the primary authority within an organization,
the authority to which all organizational members answer.
In order for statements of organizational values or belief statements to be
effective and meaningful to a school community, however, they must be translated
from esoteric statements of stakeholder beliefs into clear and succinct statements of
observable behaviors. In other words, statements of core values do not merely
answer the question, ‘‘What do we believe?’’ but also address the question, ‘‘Based
upon our core beliefs, how will we behave within our organization in order to
achieve our vision?’’
For example, if a school community identifies a core value of safety for its
school, it is not enough merely to state, "We believe our school should be safe."
Instead, the value becomes much more realistic and observable when a statement of
safety as a value is translated into behavioral statements such as, "Because we value
keeping our community safe, we will each assume responsibility to keep school
doors locked at all times." Or, "Because we value safety for all staff and students,
we will each approach and greet strangers to our building and offer our assistance."
Such behavioral statements, added to a stem statement of a basic value, make the
core values statements come alive within the organization and allow leaders to
observe when and if the espoused core values are actually at work or if they are,
rather, mere words on a document.
Calder (2011) extended the understanding of the importance of values statements
by claiming that values statements provide an important foundational pillar for how
business is to be conducted. Calder wrote, "Values shape much of the work
processes and, as such, influence how an institution moves forward in a positive
way" (p. 24).
Defining goals statements
Perhaps the most clearly understood of the four terms is the statement of goals. In a
goal statement, educators spell out precisely what level of performance is to be
achieved in the selected domain (e.g., student learning, professional development)
and what steps are to be taken, by whom, in order to achieve the goal. Clearly, in
this era of increased accountability for student learning and professional practice,
setting clear, measurable performance goals has become common practice for
school leaders and other school personnel. DuFour and Eaker (1998) stated that
statements of learning goals address the question, ‘‘Which steps will we take first,
and when?’’ (p. 100).
A widespread trend across the United States in school improvement efforts,
especially in light of increased accountability, is the development of organizational
goals that are Strategic, Measurable, Attainable, Results-oriented, and Time-bound
or SMART goals (O’Neill 2000). The connection between effective goal setting and
student achievement has been clearly established among researchers (Moeller et al.
2012).
Summary of literature review
A clear definition of the meaning of each of the four foundational statements
(mission, vision, values, goals) is imperative for members of the organization,
especially leaders, to understand the purpose of statement development. Further-
more, a deep understanding of the value of each type of statement, not merely the
development of the statement, but the organization-wide ownership and investment
in the principles asserted in the statement, is also imperative if school leaders are to
make important and significant progress toward school improvement. In other
words, the purpose and value of developing foundational mission, vision, values,
and goals with stakeholders within an organization is not merely to have done so,
and to check these tasks off of the ‘‘to do’’ lists. Rather, the purpose of developing
these statements is to bring organizational stakeholders together to share in a
common understanding of and commitment to the school’s purpose, preferred
future, behavioral expectations, and next steps toward school improvement and
increased levels of student learning.
Methodology
We used a primarily qualitative methodology (i.e., content analysis) in order to
explore the level of familiarity educational leadership students had with their
school’s mission, vision, values, and goals statements and the level to which the
statements impacted their daily practice. We also, however, employed a
quantitative technique in reporting the frequency of responses to survey questions.
While not strictly a replication, the project described here follows up on
and extends research conducted by Watkins and McCaw (2007).
Reporting findings from a similarly designed study, Watkins and McCaw (2007)
discovered a lack of ability by their educational leadership students to articulate
their own school or district mission, vision, and values statements. These authors
discovered that the mission, vision, and values statements that their students recalled
were largely not aligned between school and district levels and that only a small
percentage of recalled statements (8–15 %) were reflective of identified criteria for
what the content of vision, mission, and values statements should reflect.
We patterned the current investigation after the Watkins and McCaw (2007)
study by surveying our current educational leadership students, asking them to recall
key organizational statements. We also followed the Watkins and McCaw design by
conducting a content analysis of the actual statements that respondents could recall.
Our study departs from the Watkins and McCaw study in that we added students’
ability to recall school goals statements to the survey. Further, at the suggestion of
Watkins and McCaw, we explored our students’ perceptions of the impact that school
mission, vision, values, and goals statements had on their daily practice as
professional educators.
Study sample
The individuals who comprised the convenience sample for this study were enrolled
in one of three graduate-level, educational leadership preparation programs at a
university in the southeastern part of the United States during the fall of 2012. All
participants were employed as teachers, principals, or central office administrators
in schools within the university service area and were enrolled in either an
educational master’s, educational specialist, or doctoral program at the university.
Educational leadership students were selected to participate in the research based on
their experience working in schools and on their demonstrated interest in the study
of school leadership evidenced by their enrollment in an educational leadership
program. Based on students’ professional pursuits, we assumed that educational
leadership students would be familiar with any guiding mission, vision, values, and
goals statements in their schools.
Survey development and administration
The survey, developed by the researchers, was brief and straightforward. In a
web-based, electronic survey administered after a class session, participants
were asked to report whether or not their school had a mission statement, a vision
statement, a values statement, and a statement of school goals. Participants were
further asked to recall any or all of the words included in each of the statements and
to rate the level to which each of the statements impacted their daily practice as
teachers or school leaders (6-point, Likert-type scale).
Of the 98 students enrolled in one of the three educational leadership degree
programs, 80 students completed the survey, yielding a survey return rate of just
over 81 %. Because the survey was administered after class activities had
concluded, some potential respondents chose not to participate. The research team
did not inquire as to the reasons these individuals chose not to complete the survey.
Before administering the survey, we explained the project and provided potential
survey respondents the opportunity to either complete the survey or to opt out
without penalty. Participants were provided the opportunity to complete the survey
online or via paper-and-pencil, submitting the completed questionnaires to one of
the researchers who later entered responses, verbatim, into the survey website.
Although we asked participants to provide only a limited amount of identifying
demographic information (e.g., educational level, school level where employed,
subject taught, job title, school and district where employed), all study participants
were assured of confidentiality in data analysis and anonymity in future reporting of
the data.
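For readers who wish to picture the structure of the data this instrument produced, the following is a minimal sketch of one plausible record layout for a single response. The field names and example values are hypothetical; they simply mirror the elements described above (four presence questions, open-ended recall, and a six-point impact rating) and do not reproduce the actual questionnaire.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative record structure for one survey response. Field names are
# hypothetical and mirror the survey elements described in the text.
@dataclass
class StatementResponse:
    has_statement: bool           # "Does your school have a written ... statement?"
    recalled_text: Optional[str]  # open-ended recall; None if nothing was recalled
    impact_rating: int            # 1 (no effect) .. 5 (maximum effect); 6 = does not apply

@dataclass
class SurveyResponse:
    role: str                     # e.g., teacher, building-level leader
    mission: StatementResponse
    vision: StatementResponse
    values: StatementResponse
    goals: StatementResponse

# Hypothetical example record, for illustration only.
example = SurveyResponse(
    role="teacher",
    mission=StatementResponse(True, "Prepare all students for a global society", 4),
    vision=StatementResponse(True, None, 2),
    values=StatementResponse(False, None, 6),
    goals=StatementResponse(True, "Increase 3rd grade math proficiency to 90 %", 3),
)
print(example.mission.recalled_text)
```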
Analysis of survey data
We performed simple statistical analyses on demographic and quantitative
responses to the survey. For open-ended questions, however, we conducted a
two-phase content analysis of the response texts (Rosengren 1981). Because
mission, vision, values, and goals statements are clearly defined in strategic
planning literature, we first scanned the text of responses to collect all statements
that were related to the more general, strategic planning definitions. For example,
because mission statements are clearly identified as purpose-related statements in
general, we first analyzed the content of mission statement responses against this
standard. This part of the analysis provided insight into the scope of the statements
recalled by educational leadership students relative to their schools’ mission, vision,
values, and goals statements.
In the second phase of content analysis of open-ended responses, we used more
specific, school improvement-related definitions of mission, vision, values, and
goals statements to guide the analysis. For example, while school improvement
literature reiterates that mission statements are organizational purpose statements,
school improvement experts also assert that the primary mission of schools is to
effect high levels of learning for all students (DuFour et al. 2008). Therefore, in the
second phase of content analysis of the mission-related text responses, we took note
of the student learning-related content recalled by survey respondents. This two-
phased, content analysis approach was applied to all four sets of statement-related
responses recalled by study participants.
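As an illustration only, the following sketch shows how the two-phase screening described above might be operationalized as a simple keyword check. The keyword lists, sample responses, and function name are hypothetical; the study’s actual coding relied on researcher judgment against the published definitions, not on automated matching.

```python
# Minimal sketch of the two-phase content analysis described above.
# Keyword lists and sample responses are hypothetical; the study's coding
# was done by researchers reading each response, not by keyword matching.

# Phase 1: does the recalled text express an organizational purpose in the
# broad, strategic-planning sense?
PURPOSE_MARKERS = ["purpose", "mission", "prepare", "provide", "develop", "ensure"]

# Phase 2: does it mention student learning or academic achievement, the
# school improvement standard (DuFour et al. 2008)?
LEARNING_MARKERS = ["learn", "learning", "achievement", "academic"]

def code_response(text: str) -> dict:
    """Return phase-1 and phase-2 codes for one recalled statement."""
    lowered = text.lower()
    return {
        "purpose_related": any(marker in lowered for marker in PURPOSE_MARKERS),
        "learning_related": any(marker in lowered for marker in LEARNING_MARKERS),
    }

if __name__ == "__main__":
    recalled = [
        "Our mission is to prepare all students for a global society.",
        "We ensure high levels of learning for every student.",
    ]
    for text in recalled:
        print(code_response(text), "-", text)
```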
The research team first summarized the demographic characteristics of the survey
respondents, the reported presence (or lack thereof) of the specific statements in
their schools, and the perceived effect that the statements had on their daily work.
Next, we analyzed the content of open-ended responses asking respondents to recall
any or all of their schools’ mission, vision, values, and goals statements. We
searched for and identified themes that emerged from each set of statement-related
responses (Creswell 2013). These themes, related first to general, strategic
planning-based definitions of the terms and then to school improvement-related
definitions, are reported in the next section.
Findings
Findings from the study indicated wide variation in whether or not respondents’
schools had mission, vision, values, and goals statements. Reports of the presence
of school mission statements were clearly most prevalent, while the reported
presence of school vision and, especially, values statements was substantially
lower. School goals statements were somewhat more common than values
statements, as might be expected given the current environment of increased
accountability resulting from the No Child Left Behind (NCLB) legislation (United
States Department of Education 2002). Next, we present demographic characteristics of the sample as
well as the reported presence of each type of statement and the perceived effect that
each statement has on the daily practice of educational leadership students in their
school contexts. The section concludes with a presentation of the two-phased
analysis of the content that educational leadership students were able to recall on
demand, through the open-ended survey questions.
Study participants’ demographics
Study participants were fairly evenly divided among all demographic categories,
suggesting a balanced distribution of input. Regarding educational level, 37 % of
respondents were at the master’s level, 39 % at the educational specialist level, and
24 % of respondents were doctoral students. Participants were also evenly
distributed regarding the level at which they work within a K-12 educational setting
(Elementary 29 %; Middle School 23 %; and High School 29 %). Thirteen percent
of respondents were district-level school leaders. Teachers comprised 47 % of the
sample, school-level principals, assistant principals, and curriculum specialists
comprised 38 % of the sample, and district-level administrators comprised 14 % of
the sample. Among respondents who reported serving in a formal school leadership
role (i.e., school- and district-level leaders), 13 % were relatively new, having served
only a year or two in the role of principal or assistant principal. Thirty-six percent had
served in a formal school leadership role between 3 and 10 years, while 14 % had
served in formal school leadership roles for more than 10 years. Although we know a
large majority of the respondents to be employed in the K-12 public school
environment, we did not inquire as to the public or private status of respondents’
school contexts. The demographic characteristics of study participants we surveyed
are represented in Table 1 below.
Presence of mission, vision, values, and goals statements
The survey asked participants to indicate, ‘‘Yes’’ or ‘‘No,’’ whether or not their
school had each of the four types of organizational statements. As expected, a vast
majority of students (94 %) reported that their school had a mission statement in
place. Only 62 % of respondents, however, reported that their school had adopted a
separate vision statement. Only 18 % of study respondents reported that their school
had a statement of organizational values. Finally, 42 % of respondents reported that
their school had written goals statements. These findings are reported in Table 2.
Perceived impact of statement on daily professional practice
Educational leadership students were asked to rate the perceived level to which each
type of statement affected their daily professional practice. For the study, impact on
professional practice was defined as the level to which respondents thought about,
referred to, and were guided by the foundational statements on a daily basis in their
individual school roles and responsibilities. Respondents indicated the perceived
level of impact through the use of a six-point, Likert-type scale, ranging from ‘‘1’’
No Effect, to ‘‘5’’ Maximum Effect. An option of ‘‘6’’ Does not Apply was provided
for respondents reporting an absence of that type of statement in their schools.

Table 1 Demographic characteristics of educational leadership student sample

Demographic characteristic                                  N     (%)
Level of participation in educational leadership program
  Master's                                                  29    37
  Educational specialist                                    31    39
  Doctoral                                                  19    24
Level of K-12 professional practice
  Elementary                                                23    29
  Middle school                                             18    23
  High school                                               23    29
  District level                                            10    13
Role in K-12 practice
  Teacher                                                   37    47
  Building-level leader                                     30    38
  District-level leader                                     11    14
Years in formal leadership role
  1–2 years                                                 10    13
  3–10 years                                                29    36
  More than 10 years                                        11    14

For brevity, the category of ‘‘other’’ has been eliminated, resulting in some categories
totaling less than 100 %.

Findings from these questions are represented in Table 3.
Content of mission statements
In open-ended question format, educational leadership students were asked to recall
any or all of their school’s mission, vision, values, and goals statements. Clearly,
recalling the content of any organization’s mission, vision, values, and goals
statements, on demand, is a formidable task for any employee. We chose to include
this on-demand task within the survey, however, for two reasons. First, this was the
task that Watkins and McCaw (2007) assigned their own educational leadership
students in their exploration of the issue within their own university setting. We
designed our survey to parallel the Watkins and McCaw instrument so that
results could be compared.
Second, since the survey was administered to educational leadership students, our
team sought to determine the level to which these school personnel had internalized
the mission, vision, values, and goals statements of their schools. We assumed that,
of all school employees, educational leadership students were perhaps the most
likely to have internalized, and possibly even memorized, key organizational
statements. In the following sections, we report
findings from a two-phased content analysis of the text from educational leadership
student responses to the open-ended questions asking them to recall any or all of the
designated statement.
Table 2 Reported presence of mission, vision, values, and goals statements

Statement type                                               Present (%)
Does your school have a written mission statement?          94
Does your school have a written vision statement?           62
Does your school have a written values statement?           18
Does your school have a written goals statement?            42

Table 3 Perceived effect of statement on daily professional practice, percent by level

Type of statement              Little to none   Some   Large to maximum   Does not apply
Impact of mission statement    21               29     45                  6
Impact of vision statement     28               20     25                 28
Impact of values statement     25               10      7                 60
Impact of goals statement      26               14     23                 38

For brevity, categories 1 and 2 (i.e., little to no effect) and categories 4 and 5 (i.e., large to
maximum effect) have been collapsed.
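As a concrete illustration of the collapsing described in the note to Table 3, the sketch below shows how six-point ratings could be grouped into the four reported categories and converted to percentages. The raw response values in the example are invented for illustration and are not the study’s data.

```python
from collections import Counter

# Hypothetical raw responses on the 6-point scale used in the survey:
# 1-2 = little to no effect, 3 = some effect, 4-5 = large to maximum effect,
# 6 = does not apply. These values are illustrative only.
mission_impact = [1, 2, 3, 5, 4, 6, 3, 2, 5, 4]

def collapse(responses):
    """Collapse 6-point ratings into the four reporting categories (percent)."""
    labels = {1: "little to none", 2: "little to none", 3: "some",
              4: "large to maximum", 5: "large to maximum", 6: "does not apply"}
    counts = Counter(labels[r] for r in responses)
    n = len(responses)
    return {label: round(100 * count / n) for label, count in counts.items()}

print(collapse(mission_impact))
# e.g. {'little to none': 30, 'some': 20, 'large to maximum': 40, 'does not apply': 10}
```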
In the first phase of analysis of mission statement text provided by 80 survey
participants, researchers examined respondent comments, collapsed similar com-
ments into a single category, and tallied the frequency of each comment or category
of comment. The goal of this phase of analysis was to examine the overall content
of mission statements provided by respondents to determine whether the comments
represented the broad definition of mission statements provided in the strategic
planning literature as statements of organizational purpose.
While nearly all of the content of mission statements recalled pertained to
organizational purpose, the variety of stated purposes was very broad in scope.
During this phase of the analysis, researchers gained insight into the wide variety of
stated purposes for schools represented by educational leadership students’
responses. In all, 144 different statements of purpose were provided by
survey respondents. These 144 individual statements collapsed into 56 different
statement categories, which ranged in frequency from 11 (7 % of the
total number of comments) down to a single, unique statement. For example, the two
most frequently mentioned statement categories of school purpose included
inclusivity of all students (11 of 144) and preparation of students for productive
citizenry (11 of 144). Our team cataloged 26 of the 144 total comments as unique
statements of purpose (i.e., only mentioned by one respondent), some of which
included such purposes as (a) meeting unique needs of students, (b) expanding
opportunities and horizons for students, (c) producing responsible students,
(d) producing respectful students, and (e) transforming students.
In the second phase of content analysis, researchers sorted the comments and
comment categories into themes in order to determine the frequency to which
student learning, high levels of learning, or student academic achievement was
mentioned as a school purpose. Researchers included this phase of the analysis for
each type of statement in an effort to compare survey results to the commonly
defined purpose of schools as producing high levels of student learning (DuFour
et al. 2008). Nine separate categories or themes emerged from the analysis of
mission statement content recalled. In Table 4, each category or theme of school
purpose is presented along with several of the most frequently mentioned examples
from respondents. The frequency that the theme was mentioned by survey
respondents is also provided.
While all themes identified in Table 4 represent laudable purposes for schools to
exist, the variety and scope of the themes mentioned is broad, ranging from
producing students who possess desirable social characteristics, to the provision of
highly qualified personnel. Only a minimal number of the statements included in
recalled school mission content (14 of 144, or 10 %) were specifically related to
high levels of student learning.
Table 4 Thematic statement of school purpose, examples, frequency, and percent

Theme and examples of school purpose statements                                   f   (%)
Development of student personal characteristics
  Examples: productive citizens, life-long learners, realized potential,
  empowered students, successful students                                         48  33
Preparation of students for future
  Examples: preparing students for: global society, every facet of life,
  college and career, technological world                                         25  17
Provision of safe, orderly environment
  Examples: physically safe, emotionally nurturing, orderly                       15  11
Student learning/academic achievement
  Examples: helping students achieve or excel in academics, educating/training
  students, learning at high levels                                               14  10
High quality curriculum and instruction
  Examples: providing: relevant, challenging curriculum; opportunity to learn;
  excellent teaching strategies                                                   12   8
Provision of inclusive environment
  Examples: inclusive of all students, educating diverse populations              12   8
Student achievement, non-academic
  Examples: helping students achieve in creativity, athletics, innovation,
  leadership, decision-making                                                      9   6
High quality environment
  Examples: providing a quality education, world-class education, providing
  atmosphere of excellence                                                         8   6
Provision of high quality staff
  Examples: committed staff, dedicated staff; role models, qualified               3   2

Content of vision statements
Content provided by participants recalling any or all of their school’s vision
statement represented a large discrepancy in the data. While 62 % of survey
respondents indicated that their school had a vision statement, only 16 of 80
respondents (20 %) were able to recall any portion of that vision statement on
demand.
Similar to the analysis of mission statement responses, researchers first analyzed
vision statement content against the widely accepted definition of vision statements
from the strategic planning literature, specifically, that a vision statement is a future-
oriented statement or describes a preferred future state of the organization.
Researchers scanned survey responses for future-oriented language embedded in the
vision statements. Of the 16 responses provided, 11 (14 % of total possible
responses) included language that was future-oriented.
In the second phase of analysis, researchers used a model for vision statement
evaluation proposed by Kotter (1996). Kotter suggested that, in order to be
effective, the content of vision statements must clarify a general direction for the
school, must be motivational, and must help to coordinate the actions of individuals
within the organization. More specifically, Kotter (1996) explained that, in
evaluating for effectiveness, vision statements should be (a) imaginable, (b) desir-
able, (c) feasible, (d) focused, (e) flexible, and (f) communicable (p. 72).
Researchers analyzed content recalled by educational leadership students with
these criteria in mind. Of the 11 future-oriented vision statements (or portions
thereof) provided by survey respondents, researchers found that only two statements
actually appeared to meet all six criteria. For example, one of the statements that
seemed to meet or at least approach all six criteria stated:
[Our school] will develop curriculum and instructional strategies that utilize
various resources which will promote active involvement of students, provide
for their varied experiences, as well as individual abilities and talents. We will
provide monitoring of our students’ progress and offer guidance and support
services tailored to individual student needs.
We acknowledge that, while this example vision statement meets or addresses all
of Kotter’s six criteria, it may be somewhat lacking in the second criterion, that is,
desirability. In other words, according to strategic planning and school improvement
experts, one important quality of vision statements is that they should be inspiring or
motivational to organizational members (Bryson 2004; DuFour et al. 2008; Kose
2011; Kotter 1996). Though the vision statement quoted here may not be particularly
inspiring to organizational members, it does represent the most thorough vision
statement recalled by study respondents.
Other vision statements provided by survey respondents amounted to what might be
perceived as organizational slogans, including such statements as, ‘‘A tradition of
excellence,’’ and ‘‘A small system that dreams big.’’ Such slogan-like statements are
not future-oriented, nor do they include criteria for effective vision statements
(Kotter 1996).
Content of values statements
Of the 80 survey respondents, 6 (7.5 %) provided any type of values statements. In
the first phase of analysis, a number of values were identified, including statements
of commitment to: (a) diversity; (b) service to students; (c) student learning;
(d) creativity and innovation; and (e) various stakeholder groups, including parents,
students, and the community at large. One values statement was particularly
complete and included commitment statements to ten different organizational
values. This statement included all of the following values: (1) All students matter;
(2) Partnerships with parents are important; (3) Manage with data; (4) Teacher
collaboration is important to improve; (5) We must continually improve; (6) Strong
leadership is important; (7) Students must be engaged in authentic, real-world
learning; (8) Teachers must be life-long learners; (9) Students must be safe and
secure; and (10) Students must be provided extra help when needed. With the
exception of this single, yet fairly comprehensive statement, none of the values
statements recalled by educational leadership students approached the criteria for
powerful organizational values statements provided in the strategic planning
literature (Blanchard and O’Connor 1997; Calder 2011). For example, one
statement read simply, ‘‘We value creativity, diversity, and innovation.’’ Another
stated, ‘‘[Our school] values its constituents and seeks to place education for all as
its vision.’’ While these statements represent laudable values, they do not meet the
standards for specific, behavior-based values statements powerful enough to drive
an organization firmly toward its mission as described in the strategic planning
literature (Blanchard and O’Connor 1997; Bryson 2004; Kotter 1996; Moore 2000).
In the second phase of content analysis, researchers compared the values
statements to the standards of values statements presented in the school improve-
ment literature (DuFour et al. 2008). These authors state that values statements
should clearly indicate the ‘‘actions, behaviors, and commitments necessary to bring
mission and vision to life’’ (p. 148). None of the values statements met these
criteria, with the possible, partial exception of the single, most complete values
statement included above. For example, one of the other, more typical values
statements reads, ‘‘[Our school system] is a system that is unique and values
diversity, commitment, service, and learning’’. Again, while these are all certainly
admirable values, none of the statements clearly outlines actions, behaviors, and
commitments to guide the implementation of these values in everyday professional
practice in a school setting.
Content of goals statements
Twelve of the 80 (15 %) educational leadership students responding to the survey
were able to recall some sort of goals statements developed and adopted by their
schools. However, as mentioned earlier, in a goal statement, organizational
members spell out precisely what level of performance is to be achieved in the
selected domain and what steps are to be taken, by whom, in order to achieve the
goal. The goals statements recalled were somewhat vague and non-specific and
included such statements as, ‘‘Our goal is to prepare students to enter college or
work force,’’ or ‘‘[Our goal] is to model the importance of life-long learning
activities daily in the curriculum’’. Eight of the 12 goals statements recalled by
respondents were of this nature.
In the second phase of analysis, researchers compared the statements of survey
respondents to the SMART goal standard. Four (5 %) of the recalled statements
gave some indication that SMART goals had indeed been developed and adopted.
Example statements that gave evidence of SMART goal development included:
Each year we develop academic goals based on the previous year’s test data.
For example, one goal would be: Math test scores will increase from 84 to
90 % in 3rd grade. Then we set numeric goals per grade level for reading and
math.
Another stated simply, ‘‘… to increase the number of students scoring Level 4 on [the state assessment], each grade level will focus on reading comprehension and
writing’’. In the first example, strong evidence is provided indicating that a SMART
goal had been developed. The second example suggests that such a process had been
followed in developing goals statements for the school.
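To illustrate the kind of screening involved in this second phase, the sketch below flags goal statements that show surface evidence of SMART characteristics (a numeric target plus a time reference). The patterns and example statements are hypothetical simplifications; the study’s analysis applied the full SMART criteria (O’Neill 2000) through researcher judgment rather than pattern matching.

```python
import re

# Rough heuristic sketch: flag goal statements that include a measurable
# target (a number or percent) and a time reference. Illustrative only; the
# study's analysis used researcher judgment against the full SMART criteria.
MEASURABLE = re.compile(r"\d+\s*%?")
TIME_BOUND = re.compile(r"\b(year|semester|grading period|by \d{4}|each grade)\b",
                        re.IGNORECASE)

def looks_smart(statement: str) -> bool:
    """Return True if the statement shows surface evidence of a SMART goal."""
    return bool(MEASURABLE.search(statement)) and bool(TIME_BOUND.search(statement))

goals = [
    "Our goal is to prepare students to enter college or the work force.",
    "Math test scores will increase from 84 to 90 % in 3rd grade this year.",
]
for g in goals:
    print(looks_smart(g), "-", g)
```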
Discussion
Based upon strategic planning and school improvement conceptual frameworks
(Bryson 2004; DuFour et al. 2008; Kaufman 1992; Mintzberg 1994), we conducted a
study designed to explore the extent to which graduate-level, educational leadership
students were able to recall, on demand, any part of the mission, vision, values, and
goals statements adopted by the schools in which they were currently serving as
professional educators. Further, we asked survey respondents to report the level to
which these organizational statements impacted their daily practice in the school
context. We identified a convenience sample of students enrolled in university
educational leadership graduate degree programs because of their experience
working in schools and because they would likely be knowledgeable about such
foundational, organizational statements as mission, vision, values, and goals.
The authors designed the study to follow up on and extend the research conducted
by Watkins and McCaw (2007) who discovered that, among their own graduate-
level, educational leadership students, the ability to recall any or all of their schools’
statements of mission, vision, and core values was limited, that alignment of
such statements between the school and district levels was limited, and that a large
majority of the recalled statements did not meet criteria for how such statements are
defined in the literature on organizational improvement. The results from the current
study confirm these findings and combine to suggest a disturbing lack of
understanding of the purpose and value of developing and stewarding mission,
vision, values, and goals statements among graduate-level, educational leadership
students.
Lack of focus on student learning as school mission
Leadership students in the current study were nearly unanimous (94 %) in claiming
that their schools had adopted a mission statement. This is good news!
However, upon close analysis of the content recalled by leadership students of
their school mission statements, researchers determined that the school mission
statements, while overwhelmingly inclusive of purpose statements, failed to identify
high levels of student learning or academic achievement as a primary
purpose of their schools (DuFour et al. 2008). In fact, only about 10 % of the
content of mission statements recalled (14 of 144 statements) identified student
learning as a primary focus of schools. This is disturbing!
Strategic planning and school improvement experts have consistently, and over a
long period of time, identified the value of mission statements as a key element in
defining organizational purpose. While mission statements in our sample did
identify key organizational purposes, many of those purposes were unrelated or only
loosely related to student learning. The largest amount of recalled content of their
schools’ mission statements, in fact, focused on the inclusion of all students and on
developing character traits in students such as reaching their potential, developing
productive citizenship, preparing for their future in a global society, and developing
life-long learning skills.
If a mission statement is intended to clarify a singular and compelling purpose, or
raison d’etre for a school’s existence, one might hope or even expect that student
learning would be at the top of the list of possible purposes for schools. Clearly
these data suggest a lack of focus by school leaders, in the contexts represented, on
the obvious reason that schools exist, that is, effecting high levels of learning for all
(DuFour et al. 2008).
We acknowledge that there exists some level of disagreement among school
personnel and educational experts regarding the primary purpose of schools.
Schools, indeed, can and do fulfill many important purposes for students, only one
of which is increased levels of learning and achievement. Clearly, however, the
preponderance of literature on strategic planning exhorts leaders to work toward
defining a singular, organizational purpose in order to focus the efforts of
organizational members toward a set of common goals (Bardwell 2008; Bryson
2004; Crittenden and Crittenden 1997; Moore 2000). Further, among school
improvement experts specifically, this singular mission for schools has coalesced
around the issue of increasing levels of student learning (DuFour et al. 2008;
Lunenberg 2010; Stemler et al. 2011). We further acknowledge that defining such a
singular purpose or mission for schools may be perceived by some as limiting
educators’ perceptions of why schools exist. Nevertheless, we contend that the
process of defining and focusing organizational members’ shared understanding of
student learning as the primary function of schools serves to focus organizational
effort. Such a process does not preclude schools from addressing a multitude of
purposes and student needs. Rather, the process serves to direct and focus the school
improvement efforts of individuals and of the group toward common ends.
Lack of shared vision
Another clear direction from the literature calls for school leaders to develop,
articulate, implement, and steward a clear, shared vision among school personnel.
The Interstate School Leaders Licensure Consortium (ISLLC), in 1996, identified
six standards for school leaders, widely adopted by licensing agencies across the
United States (Council of Chief State School Officers [CCSSO] 2008). The first of
these standards charges school leaders with the development of a shared vision
among all school personnel (CCSSO 2008). Kouzes and Posner (2006) stated, ‘‘You
can leave a lasting legacy only if you can imagine a brighter future, and the capacity
to imagine exciting future possibilities is the defining competence of leaders’’ (p.
99).
Data from the current study suggest yet another disturbing disconnect between
best practice and reality, insofar as a mere 14 % of educational leadership students
were able to recall any part of a future-oriented vision statement adopted by their
school. We acknowledge that, just because a school leader may not be able to recall
specific language from their school’s vision statement, this does not necessarily
indicate that there is no adopted vision statement in the school. What it does
indicate, however, is that, even if a vision statement is clearly articulated, and even
perhaps framed and hanging in the front hall, the vision itself has not been
internalized by key formal and informal leaders. A claim made by DuFour et al.
(2008) springs to life in light of these data: ‘‘there is an
enormous difference between merely writing a mission [or vision] statement and
actually living it’’ (p. 114).
No articulated organizational values
Data from this study indicated a nearly universal absence of articulated values or
organizational commitments in schools represented by study participants. Only six
of 80 respondents (7.5 %) could recall any part of a set of values articulated and
adopted by their schools. The reverse of this statistic implies that well over 90 % of
formal and informal leaders in the schools represented had no knowledge of any
shared values articulated by their school personnel. As with the data for mission and
vision statements, the overwhelming lack of ability of school leaders to recall values
statements suggests, simply put, that a set of shared commitments has not been
articulated in the schools represented. One wonders, then, exactly what
values are demonstrated in the daily practice of school personnel.
Organizational and educational experts agree that articulated values, or shared
commitments, are fundamental to the process of organizational improvement. These
statements are not merely a set of words or platitudes. When commonly developed,
adopted, and lived, organizational values actually drive the daily practice of
individuals within the organization (Blanchard 2007; Blanchard and O’Connor
1997).
Lack of focused goal statements
Educational leadership students who responded to the survey were similarly unable
to recall key organizational goal statements relative to student learning, specifically,
or to school improvement in general. Again, this does not automatically imply that
goal statements have not been developed or identified in their schools. What it does
imply is that school leaders surveyed have not internalized these goals to a level
where they are conscious of them and are able to recall even any part of those goals
on demand. The fact that only 12 of 80 (15 %) students could recall any part of the
goal statements of their schools, and, of those, only 4 (5 %) could recall their
organizational goals with any specificity, suggests a fourth looming disconnect in
the practice of school leaders. As Schmoker (2003) stated, ‘‘Abundant research and
school evidence suggest that setting goals may be the most significant act in the
entire school improvement process, greatly increasing the odds of success’’ (p. 23).
The problem of impact
As mentioned above, the current research project was designed to follow up on and
extend research conducted by Watkins and McCaw (2007). While the results from
the current study confirm Watkins’ and McCaw’s findings in terms of a lack of
educational leadership students’ ability to recall key organizational statements, the
questions regarding the perceived effect that mission, vision,
values, and goals statements had on respondents’ daily work offer an extension of
Watkins and McCaw’s research.
Considering our own students’ limited ability to recall any part of their
schools’ mission, vision, values, and goals statements, it is not surprising that survey
respondents also reported low levels of impact from these statements on their daily
professional practice. Revisiting the data presented in Table 3, we suggest that the
low levels of impact that these key organizational statements reportedly had on the
professional practice of school leadership students are very likely mirrored by
personnel throughout their schools.
To be explicit, let us consider the following statements derived from our data on
perceived impact of the mission, vision, values and goals in the schools represented:
1. Half of the school leaders surveyed reported that the mission statement in
their school had only some to no effect on their daily practice as educators.
2. Fifty-six percent of school leaders reported either that their school had no
articulated vision statement or that the vision statement that was present had
little to no effect on their daily work as school leaders.
3. Sixty percent of school leaders surveyed reported no articulated values
statements (i.e., common commitments) in their schools.
4. Only 23 % of school leaders surveyed reported that the articulated goals
statements in their schools had a large to maximum effect on what they did
every day at work!
Concluding remarks
Findings from the current study may be provocative inasmuch as they imply that
school leaders continue to ignore the call from educational change experts to
establish, and especially to steward, a shared purpose in the context of school
improvement efforts. This is evidenced by the fact that school leadership students
surveyed were either unable to recall the content of such statements or recalled
statements that were so widely varied as to suggest a lack of shared understanding
of, and focus on, the purpose and future of their schools.
Furthermore, our respondents reported that the statements they could recall had only
a minimal impact on their daily practice. We also note that, even among the
mission, vision, values, and goals statements that have been articulated, such
foundational statements, intended to focus and drive organizational change in the
schools represented, are imprecise, and are not expressly focused on student
learning.
From a broader perspective, the findings from this study point to a long-
established reality among those who have studied organizational and educational
change: there exists a wide gap between theory and practice, or between what we
know as educators, and what we do in schools (Dewey 1938; Fullan 1998, 1999;
Pfeffer and Sutton 2000; Sarason 1971). Returning to the arguments made by
leading scholars in educational change, effecting systemic change within organi-
zations is, at best, a rare occurrence, due in part to the complexity of the
organization, to the multiplicity of purposes and values espoused by organizational
members, and to the fluid contexts in which they operate (Fullan 1993, 1998, 1999;
Hargreaves et al. 2001; Sarason 1971). These experts argued that change agents,
committed to the process of school improvement, may be unsuccessful due to a lack
of understanding of the nature of this complexity. Reflecting on the work of
Sarason, Fried (2003) re-emphasized the complexity of schools and the problem of
change by restating Sarason’s words, ‘‘It could be argued that schools and school
personnel vary so fantastically on so many different levels that attempts to arrive at
communalities or distinctive patterns of behavior and attitudes are rendered
meaningless or fruitless’’ (p. 80).
Our study confirms these authors’ view of the complexity of the change process.
Certainly, the process of school improvement is a formidable task. However, rather
than resigning ourselves to the ‘‘fruitless’’ nature of school change, we hope that
these findings may contribute to uncovering and more fully understanding the nature
of this complexity by recognizing that, at least among our respondents, the guiding
principles and specific goals of their organizations appear to be unclear, at best, and
have not been internalized by organizational players. Perhaps school personnel who
fail to achieve desired success in effecting change may be informed by reflecting on
the possibility that school leaders, and the people they lead, suffer from a lack of
understanding, articulation, unity, and shared commitment to the mission, vision,
values, and goals of their organization. We believe that school change agents and
their communities would be well-served to recognize and address the fact that
school personnel vary widely in their ‘‘beliefs, norms, and practices across diverse
schools,’’ and to work toward a unification of purpose to support effective, school-
and system-wide change (Talbert 2010, p. 569).
We further contend that leaders who work toward meaningful and substantial
change in schools would benefit from reflecting on the source and power of a shared
purpose among school personnel. Based on a multi-national study of successful
school leadership, Mulford (2010) concluded:
The principal’s core values and beliefs, together with the values and capacities
of other members of the school community, feed directly into the development
of a shared school vision, which shapes the teaching and learning—student
and social capital outcomes of schooling (p. 201)
The power of shared mission, vision, values, and goals among school personnel to
shape teaching and learning, i.e., the core technology of schools, is difficult to
overstate and certainly worthy of continued focus and reflection.
Though the findings from the current study may be interpreted by some as an
indictment of school leaders in general, and of our own students specifically, that is
certainly not the intent of this research project. On the contrary, the current research
project was designed to explore findings from previous research (Watkins and
McCaw 2007) in order to compare results and to give further consideration to what
may be interpreted as some rather disturbing disconnects between best practice and
the realities of daily practice of school leaders. Indeed, such widespread inability of
educational leadership students, all of whom work actively and daily in their
respective schools, to recall these statements should raise red flags, not just in our
own university setting, but
among personnel in school leadership preparation programs across the nation. Of
course, the findings of our study may not be generalized beyond the context in
which the research was conducted. Other researchers may find very dissimilar
results to our own within the contexts of their own settings. However, the findings
should and do raise more questions than they answer.
Careful consideration of these findings may benefit school leadership professionals,
as well as those who prepare school leaders, in the effort
to have a powerful and effective impact on the school improvement process. Despite
decades of evidence and admonishment from organizational and school improvement
experts, school leaders may, at best, simply continue to misunderstand the purpose
and power of developing school mission, vision, values, and goals statements.
At worst, the evidence may suggest that school leaders in many places are
simply ignoring the essential role of key
organizational statements, to the detriment of the improvement processes in the
schools to which they are, undoubtedly, deeply committed.
Implications for school leadership preparation programs
This study was conducted by educational leadership faculty in an effort to
understand and explore what appeared in previous studies to be a lack of
understanding and implementation among school leaders of the four key organi-
zational statements. Findings from this study suggest that faculty involved in
university school leadership programs would do well to clarify for students the
meaning of organizational mission, vision, values, and goals statements, as well as
explore the powerful impact that the articulation, widespread adoption, and
alignment with such statements can have on the process of school improvement. What
is clear from the results of this analysis is that educational leadership students had
little to no knowledge of the content of these statements in their schools. It clearly
follows, then, that such statements will have little to no effect on their practice.
Leadership preparation programs would also do well to emphasize the how and
the why of articulating, adopting, implementing, and stewarding shared mission,
vision, values, and goals to serve as a vehicle for unifying school stakeholders
around a common purpose and direction for the future, that is, toward increased
levels of learning for all students.
Implications for further research
Researchers continue to study the school improvement process on many levels and
examine best practice from many different angles. The continued study of the
purpose and power of clearly developed and shared school mission, vision, values,
and goals statements is definitely in order. Findings from this study and others add
evidence of the knowing-doing gap (DuFour et al. 2010; Schmoker 2006) in
educators’ efforts to improve schools and to effect high levels of learning for all
students. Surprisingly little research exists that documents how
prevalent the gap is between what school leaders know and what they
actually do on a daily basis.
Research analyzing the actual (as opposed to recalled) content of school mission,
vision, values, and goals statements is only recently beginning to emerge (Stemler
et al. 2011). Further study in this area across varied contexts (e.g., rural, suburban,
and urban schools) is indicated as well.
Finally, the findings from this study have led us to pose the following two key
questions to researchers, school leaders, and to those who prepare individuals to
assume school leadership roles. The first question is, When school leaders know
what to do to improve schools (i.e., begin by developing, articulating, and
stewarding clear school mission, vision, values, and goals statements), and how to
accomplish these beginning steps, why do school leaders continue to ignore these
foundational practices? The second, and perhaps more profound question is, In the
absence of such guiding statements, what statements or belief systems, perhaps
unwritten and unexamined, are serving as de facto school missions, visions, values,
and goals for school personnel? While addressing these questions was beyond the
scope of the current study, we believe that the findings of this study, and of others,
clearly indicate the need for further investigation and discussion of these
important matters.
References
Bardwell, R. (2008). Transformational assessment: A simplified model of strategic planning. AASA
Journal of Scholarship and Practice, 5(2), 30–37.
Blanchard, K. (2007). Leading at a higher level: Blanchard on leadership and creating high performing
organizations. Upper Saddle River: Prentice Hall.
Blanchard, K., & O’Connor, R. (1997). Managing by values. San Francisco: Berrett-Koehler.
Boerema, A. J. (2006). An analysis of private school mission statements. Peabody Journal of Education,
81(1), 180–202.
Bryson, J. M. (2004). Strategic planning for public and nonprofit organizations: A guide to strengthening
and sustaining organizational achievement (3rd ed.). San Francisco: Jossey-Bass.
Burnes, B. (2004). Kurt Lewin and the planned approach to change: A re-appraisal. Journal of
Management Studies, 41(6), 977–1002.
Calder, W. B. (2011). Institutional VVM statements on web sites. Community College Enterprise, 17(2),
19–27.
Council of Chief State School Officers. (2008). Educational leadership policy standards: ISLLC 2008.
Washington: Council of Chief State School Officers.
Creswell, J. W. (2013). Qualitative inquiry and research design: Choosing among five approaches.
Thousand Oaks: Sage.
Crittenden, W. F., & Crittenden, V. L. (1997). Strategic planning in third-sector organizations. Journal of
Managerial Issues, 9(1), 86–103.
Danielson, C. (2007). Enhancing professional practice: A framework for teaching (2nd ed.). Alexandria,
VA: Association for Supervision and Curriculum Development.
Dewey, J. (1938). Logic: The theory of inquiry. New York: Holt, Rinehart and Winston.
DuFour, R., DuFour, R., & Eaker, R. (2008). Revisiting professional learning communities at work: New
insights for improving schools. Bloomington: Solution Tree.
DuFour, R., DuFour, R., Eaker, R., & Many, T. (2010). Learning by doing: A handbook for professional
learning communities at work (2nd ed.). Bloomington: Solution Tree.
DuFour, R., & Eaker, R. (1998). Professional learning communities at work: Best practices for enhancing
student achievement. Bloomington: National Education Service.
Fried, R. L. (2003). The skeptical visionary: A Seymour Sarason education reader. Philadelphia: Temple
University Press.
Fullan, M. (1993). Change forces: Probing the depths of educational reform. Philadelphia: Falmer Press.
Fullan, M. (1998). The meaning of educational change: A quarter century of learning. In A. Hargreaves,
A. Lieberman, M. Fullan, & D. Hopkins (Eds.), International handbook of educational change.
Dordrecht: Kluwer Academic Publishers.
Fullan, M. (1999). Change forces: The sequel. Philadelphia: Falmer Press.
Hargreaves, A., Earl, L., Moore, S., & Manning, S. (2001). Learning to change: Teaching beyond
subjects and standards. San Francisco: Jossey-Bass.
Kaufman, R. (1992). Strategic planning plus: An organizational guide. Newbury Park: Sage.
Kose, B. W. (2011). Developing a transformative school vision: Lessons from peer-nominated principals.
Education and Urban Society, 43(2), 119–136.
Kotter, J. (1996). Leading change. Boston: Harvard Business School.
Kouzes, J., & Posner, B. (2006). A leader’s legacy. San Francisco: Jossey-Bass.
Lewin, K. (1943). Defining the field at a given time. Psychological Review, 50, 292–310.
Lunenberg, F. C. (2010). Creating a professional learning community. National Forum of Educational
Administration and Supervision Journal, 27(4), 1–7.
Marzano, R., Waters, T., & McNulty, B. (2005). School leadership that works: From research to results.
Alexandria: Association for Supervision and Curriculum Development.
McHatton, P. A., Bradshaw, W., Gallagher, P. A., & Reeves, R. (2011). Results from a strategic planning
process: Benefits for a nonprofit organization. Nonprofit Management and Leadership, 22(2),
233–249.
Mintzberg, H. (1994). The rise and fall of strategic planning. New York: Free Press.
Moeller, A. J., Theiler, J. M., & Wu, C. (2012). Goal setting and student achievement: A longitudinal
study. The Modern Language Journal, 96(2), 1–17.
Moore, M. H. (2000). Managing for value: Organizational strategy in for-profit, nonprofit, and
governmental organizations. Nonprofit and Voluntary Sector Quarterly, 29(1), 183–204.
Mulford, B. (2010). Recent developments in the field of educational leadership: The challenge of
complexity. In A. Hargreaves, A. Lieberman, M. Fullan, & D. Hopkins (Eds.), Second international
handbook of educational change. New York: Springer.
O’Neill, J. (2000). SMART goals, SMART schools. Educational Leadership, 57(5), 46–50.
Pekarsky, D. (2007). Vision and education: Arguments, counterarguments, rejoinders. American Journal
of Education, 113(3), 423–450.
Perkins, D. (1992). Smart schools: Better thinking and learning for every child. New York: Free Press.
Pfeffer, J., & Sutton, R. (2000). The knowing-doing gap: How smart companies turn knowledge into
action. Boston: Harvard Business School.
Quong, T., Walker, A., & Stott, K. (1998). Values-based strategic planning: A dynamic approach for
schools. Singapore: Prentice Hall.
Reeves, D. (2000). Accountability in action: A blueprint for learning organizations. Denver: Advanced
Learning Press.
Renchler, R. (1991). Leadership with a vision: How principals develop and implement their visions for
school success. OSSC Bulletin, 34(5), 1–29.
Rosengren, K. E. (Ed.). (1981). Advances in content analysis. Beverly Hills: Sage.
Sarason, S. B. (1971). The culture of the school and the problem of change. Boston: Allyn & Bacon.
Schmoker, M. (2003). First things first: Demystifying data analysis. Educational Leadership, 60(5),
22–24.
Schmoker, M. (2006). Results now: How we can achieve unprecedented improvements in teaching and
learning. Alexandria: Association for Supervision and Curriculum Development.
Stemler, S. E., Bebell, D., & Sonnabend, L. A. (2011). Using school mission statements for reflection and
research. Education Administration Quarterly, 47(2), 383–420.
Talbert, J. E. (2010). Professional learning communities at the crossroads: How systems hinder or
engender change. In A. Hargreaves, A. Lieberman, M. Fullan, & D. Hopkins (Eds.), Second
international handbook of educational change. New York: Springer.
Teddlie, C., & Reynolds, D. (Eds.). (2000). The international handbook on school effectiveness research.
New York: Falmer Press.
United States Department of Education (2002). No Child Left Behind Act. Accessed http://www.ed.gov/
policy/elsec/leg/esea02/index.html.
Watkins, S. G., & McCaw, D. S. (2007). Analysis of graduate students’ knowledge of school district
mission, vision, and core values. National Forum of Educational Administration and Supervision
Journal, 24(3), 71–91.
Wiggins, G., & McTighe, J. (2007). Schooling by design: Mission, action, and achievement. Alexandria:
Association for Supervision and Curriculum Development.