CHAPTER SEVEN
Working with Stakeholders
Establishing the Context and the Evaluand
You have already read about a wide variety of evaluands that reflect many disciplines and issues, such as programs to provide youth mentoring, address homelessness and unemployment, provide effective mental health services, increase literacy skills, provide safe housing, improve schools, and prevent the spread of HIV/AIDS. An evaluand may seem pretty clear in the published version of an evaluation; however, this clarity generally comes from many hours of discussions and revisions during the evaluation planning and implementation phases. The evaluations discussed in earlier chapters have also been conducted in a wide variety of contexts and countries across the globe, with diverse cultural groups who use different languages and live in different socioeconomic conditions. These contextual factors influence what is chosen to be evaluated and how that determination is made.
Evaluation planning can begin in many different ways: a phone call from a person previously unknown to you who says, “I have a program that needs to be evaluated”; an email from someone who is preparing a proposal to develop a new program that needs an evaluation plan; or a request to expand on previous evaluation work with members of a community with whom you have an ongoing relationship. What these beginning points have in common is that you, as the evaluator, are interacting with another person or persons. Hence issues of human relations are inevitably part of the process of planning an evaluation. A second important point to note is that evaluands come in all stages of being implemented—from existing only as an idea in a principal investigator’s head, to a firmly established program or one that is undergoing changes, to a more dynamic organization that wants to be in a mode of continuous learning.
Identifying Stakeholders
Once the initial contact has been made between a client and an evaluator, both parties need to consider who needs to be involved in the process of planning the evaluation. As defined in Chapter 1, stakeholders are people who have a stake in the program: They fund, administer, provide services, receive services, or are denied access to services. It is usually wise to spend some time and effort thinking about which stakeholders need to be included at the very beginning; this can help avoid political disasters at the end of evaluations if the proper people were not involved. On a more positive note, the quality of the evaluation will be enhanced with representation of diverse interests, especially by inclusion of traditionally marginalized groups. Appropriate stakeholders are sometimes identified by default, including only those who have power in positions related to the evaluation. The selection of stakeholders can also be an evolving process, with some stakeholders identified early in the process and others added as the relevant issues become clarified. In relatively small projects, the identification of stakeholders may be fairly straightforward. However, in larger projects, strategies for selection of representatives from stakeholder groups will probably need to be employed.
Identification of stakeholders is context-specific. Two lists of categories of stakeholders are displayed in Box 7.1; these lists will give you an idea of how many and what types of diverse groups can be considered in identifying stakeholders. The first list is based on a study of projects specifically focused on substance abuse prevention (Center for Substance Abuse Prevention [CSAP], 2008). The second list of stakeholders is based on UN-Women’s (2014) Guide for the Evaluation of Programmes and Projects with a Gender, Human Rights and Intercultural Perspective, which details how evaluations should incorporate principles of gender equality, women’s rights, and the empowerment of women in all initiatives they support and fund. Box 7.1 lists the four groups of stakeholders UN-Women and all UN agencies identify and include throughout all evaluation processes.
Box 7.1. Two Samples of Stakeholders for Evaluations, Listed by Category
Substance abuse prevention (based on CSAP, 2008):
Law enforcement
Education
Youth
Criminal justice
Civic organizations
Parents
Faith-based organizations
Elderly persons
Businesses
Human service providers
Health care providers
Military
Colleges and universities
Ethnic groups
Government
Elected officials
Child care providers

Integration of gender in policy for poverty reduction strategies (based on UN-Women, 2014):
Various ministry officials, such as finance, economic planning, and others (health, education, trade, industry, labor, social development, natural resources, and environment)
Elected officials
Civil society (e.g., NGOs, community-based organizations, faith-based groups, trade unions, private sector associations), with specific attention given to relevant dimensions of diversity within these groups (e.g., rural–urban, disability groups, women’s groups)
World Bank staff involved in poverty reduction planning, especially those responsible for the World Bank Joint Staff Assessments/Joint Staff Advisory Notes, because they assess the quality of poverty reduction plans and make their recommendations for funding or debt reduction to the World Bank and International Monetary Fund
International agencies, such as United Nations agencies and international donor agencies (e.g., CARE, Oxfam, Save the Children, ActionAid)
Representatives from the sectoral groups that represent infrastructure, agriculture, education, health, and employment
Broad categories that are contextually relevant can be helpful in identifying stakeholders for specific evaluation studies. Evaluators can determine which stakeholder groups have relevance by recalling their own experiences in particular contexts, reading literature related to the particular context, conferring with knowledgeable members of the community, and asking for specific recommendations to represent diverse viewpoints. Evaluators should be aware of the need to include stakeholders who represent diverse perspectives and positions of power. They should also be aware of the need to provide support for those stakeholders who require it for authentic participation. This support might take the form of transportation, stipends, a safe meeting environment, interpreters, food, or child care. Evaluators working with stakeholders must pay careful attention to their interpersonal skills, because human relations are critical in conducting high-quality evaluations, as discussed in the next section.
· · · · · · · · · · · · E X T E N D I N G Y O U R T H I N K I N G · · · · · · · · · · · ·
Identifying Stakeholders
1. Machik is an NGO that is building new opportunities for education and training with Tibetans living in a small, isolated village in a deep valley. With support from donors, it has opened the Ruth Walter Chungba Primary School in this rural community. Imagine that Machik has asked you to evaluate the impact the school has made on the community. You need to decide with the school authorities and the donors who the stakeholders are in this community. Who would you ask to participate in this study, and why? (Read about the school and watch a video at this website: www.machik.org/index.php?option=com_content&task=view&id=24&Itemid=50.)
2. You have been hired by a school system to evaluate a new reading program for use in elementary schools. How would you begin your identification of appropriate stakeholders for this evaluation?
Human Relations
The nature of the relationship between the evaluator and stakeholders is an area of tension in the evaluation community, as exemplified by the different paradigmatic perspectives on this topic:
Methods Branch evaluators tend to favor having a distant relationship, in the belief that this will protect the evaluator from developing biases toward particular stakeholder groups.
Use Branch evaluators see the necessity of forming a relationship with the stakeholders who are the primary intended users, so the evaluator can be responsive to their needs and thus enhance the possible use of the findings.
Values Branch evaluators believe that the evaluator needs to be involved with the community sufficiently to reveal the viewpoints of different stakeholder groups accurately.
Social Justice Branch evaluators directly address differences in power between themselves and various stakeholder groups, with a conscious awareness of the need to include the full range of stakeholders, especially those who have traditionally been excluded from decision-making positions, in the process.
These differences in the nature of evaluator–stakeholder relationships lead to differences in the processes used to define the evaluand and understand its context.
· · · · · · · · · · · · E X T E N D I N G Y O U R T H I N K I N G · · · · · · · · · · · ·
Human Relations Skills for Evaluators
Two eminent scholars in the evaluation community see the importance of human relations very differently. Read the following two passages and discuss your own thoughts and positioning with regard to this issue. First, Patton (as a contributor to Donaldson, Patton, Fetterman, & Scriven, 2010) writes:
Human beings are in a relationship to each other and that relationship includes both cognitive and emotional dynamics. The interpersonal relationship between the evaluator and intended users matters and affects use. That interpersonal relationship is not just intellectual. It is also political, psychological, emotional, and affected by status and self-interest on all sides. What the astute evaluator has to be able to do, which includes the essential competencies to do that, is to be able to engage in relationships. (p. 25)
In contrast, Scriven (also as a contributor to Donaldson et al., 2010) writes that interpersonal skills are not necessarily important for evaluators:
Michael [Patton] finds one of these to be a great strength, namely having lots of interpersonal skills. Forget it, guys! The way that evaluation works, and always will, is that it inhabits ninety niches. One of those niches is to be found in Washington in every agency, e.g., in the office of its inspector-general. Here are to be found the desk evaluators. Most of them don’t have to have interpersonal skills any more than anyone in any kind of office job; and they don’t need them. All they’re doing is analyzing the reports, and they’re very important people because they’re the first line of advice and back-up to the decision makers. What we need from them is good analytic skills. It’s not that I don’t think that it’s a good thing to have good interpersonal skills; it is that one must not put them in as minimum requirements for every evaluator. (p. 24)
Now answer the following questions:
1. What do you think about these two positions?
2. What merits do their arguments have?
3. Do you personally agree with one more than the other?
4. What are your reasons for your own positioning on the topic of human relations skills in evaluation?
Interacting with Stakeholders
Kirkhart (2005) has noted that the validity of an evaluation is influenced by “interpersonal justification” (i.e., the quality of the interactions between and among participants and the evaluator). Evaluators bring their own cultural lenses to the planning process, and these affect their interactions with stakeholders in terms of who is involved in the process and how. Lincoln (1995) has reinforced the importance of the quality of human relations in evaluation by suggesting that an evaluator needs to know the community “well enough” to link the evaluation results to positive action within the community. Evaluators must critically examine what “well enough” means. Indigenous researchers provide insights into the nature of relationships that they would interpret as indicating that an evaluator is appropriately situated to work in their communities.
Lessons from the Maori
Cram (2009) and Smith (2012), who work in the Indigenous Maori community in New Zealand (Aotearoa), have provided guidance on the meaning of kaupapa Maori (which means “a Maori way”). Kaupapa Maori can be applied to many aspects of life; it implies the development of a relationship that is respectful of Maori cultural, social, and economic well-being. Cram (2009) provides a list of cultural values that she translates into expectations for evaluators’ interactions in their community. These include the following:
Aroha ki te tangata (respect for people). Evaluators establish relationships with people by situating themselves within the history of the community (genealogically, if possible; through personal connections if no genealogical link is present), with the assistance of the community elders. Another aspect of respect for people is being knowledgeable about appropriate rituals for entering the community (such as whom to contact, how to approach people, what gifts to bring, etc.).
He kanohi kitea (a voice may be heard, but a face must be seen). Maori people expect that an evaluator will come into their community to allow the community members to see for themselves who this person is. Community meetings, called hui, are often used as a forum for evaluators to meet stakeholders, explain the study, and ask permission to proceed.
Titiro, whakarongo . . . korero (watch, listen . . . talk). An evaluator shows respect for Maori people by listening to what they say before he/she talks. This process of first looking and listening conveys the value that the evaluator places on the contributions of the community members.
Manaaki ki te tangata (looking after people). In the context of the evaluation, the essential meaning of this concept is that the evaluator establishes a reciprocal relationship with the stakeholders. The stakeholders are providing access to their community and information in the form of data; the evaluator can offer small gifts or services, capacity-building activities, networking, and access to the evaluation findings.
Kaua e takahia te mana o te tangata (do not trample on the mana [authority] of the people). Maori people want to know what an evaluator is saying about them before the results are released outside the community. As most communities would, the Maori do not want to be portrayed as having something wrong with them (a deficit view). Rather, they want to be portrayed in a balanced way, with both their strengths and their challenges.
Kia mahaki (be humble). An evaluator should share the results with the Maori community in a way that helps the community take action on its own behalf. The community members can be provided with the tools necessary to fight for their own rights and challenge oppressive systems.
· · · · · · · · · · · · E X T E N D I N G Y O U R T H I N K I N G · · · · · · · · · · · ·
Maori Cultural Values and Evaluation
1. Reciprocity is seen as valuable in evaluations conducted in the Maori community. How would this principle translate to evaluation situations outside the Maori community?
2. What is your opinion with regard to the implications of applying these Maori cultural values in other evaluation contexts?
3. What could evaluators learn about the establishment of relationships with stake- holders from these Maori cultural values?
4. What might some evaluators find objectionable concerning the Maori’s expectations of the evaluators’ interactions in their community? Why would they object?
5. What do you know about yourself that might enhance or inhibit your ability to work in an evaluation context that requires attention to and respect for cultural values and backgrounds?
6. Symonette (2004) suggests that evaluators need to be aware of who they are themselves, as well as who they are in relation to community:
Even more important for the viability, vitality, productivity and trust-building capacity of a transaction and relationship cultivation is multilateral self-awareness: self in context and self as pivotal instrument. Who do those that one is seeking to communicate with and engage perceive the evaluator as being? . . . Regardless of the truth value of such perceptions, they still rule until authentically engaged in ways that speak into the listening. (p. 100)
How would you answer this question: Who do others think that you are? If you are in an evaluator role, who do others think you are?
Power and Privilege
Power and privilege are concepts discussed in prior chapters. Here the emphasis is on (1) strategies for evaluators to use to bring themselves and the communities with which they work to consciousness about the dynamics of power and privilege, as well as on (2) meaningful ways to engage those who have traditionally had less power in evaluation contexts. Two action researchers, Heron and Reason (2006), provide the following strategies for designing evaluations that include self-reflection and that monitor evaluators’ engagement with communities in culturally respectful ways:
Research cycling. Evaluators should be prepared to go through the inquiry process several times. This cycling process allows for repeated episodes of action and reflection that can help refine understandings and reduce distortions.
Authentic collaboration. Evaluators and stakeholders need to devise strategies for interactions that allow for the development of an egalitarian relationship. The interaction dynamic needs to be designed so that stakeholders are motivated to have sustained involvement and allow every voice to be expressed.
Challenging consensus collusion. Individuals have the right to challenge the assumptions that underlie the knowledge created or the process by which it was created.
Managing distress. Group processes typically have moments of stress and tension; a process needs to be in place to handle this distress respectfully.
Reflection and action. A cyclical process that includes phases of action and reflection allows needed changes to occur.
Chaos and order. Reflective action is difficult when a system is in total chaos. Evaluators should encourage divergent thinking and also bring the system back into balance so that the group can move forward toward its goals.
· · · · · · · · · · · · E X T E N D I N G Y O U R T H I N K I N G · · · · · · · · · · · ·
Power and Privilege
1. How do we understand the dynamics of power when participatory methods are employed by the powerful?
2. Whose voices are raised, and whose are heard?
3. How are these voices mediated as issues of representation become more complex with the use of participatory methods in larger-scale planning and consultation exercises?
4. The culturally responsive approach to evaluation places emphasis on matching the characteristics of the evaluation team with those of the community, particularly in terms of race. Frierson et al. (2002) suggest that data will not be valid if they are collected by people who are not attuned to the program’s cultural context. What if you are a member of the community? How does that prepare you to work in that community? What if you are not a member of a community? To what extent is it necessary to share salient characteristics of a community?
5. Recall the discussion of cultural competence in Chapter 1. How does cultural competence come into the discussion of interactions in evaluation contexts?
6. When evaluators enter a community, they may find that they hold power in a way they have not before. For example, an elderly female evaluator may be more respected in this community than in her home culture. List situations where you must be cognizant of the increased or decreased power you hold as a result of personal characteristics that may affect your relationship with the stakeholders (age, gender, education, ethnicity, sexual orientation, etc.).
Developing Partnerships/Relationships
A large community of immigrants and refugees settled outside Lowell, Massachusetts, in a relocation effort for people from Laos who had assisted the United States in the years preceding the Vietnam War. When the United States lost the war, the government followed through on its promise to move members of the Laotian community who had been their allies to a safe place. The presence of such a large community in what had previously been a very white, working-class, mainstream American community did not go unnoticed by researchers. Researchers motivated by a desire to create knowledge, to work with an exotic community, or even simply to do good inundated the community with their study teams. Silka (2005) and her colleagues at the Center for Family, Work and Community at the University of Massachusetts, Lowell, noted that the immigrants and refugees were not benefiting from the research. They developed a model for partnership research and evaluation between a consortium of universities and the Laotian community, in order to protect the community from exploitative research that did not directly benefit the community. Silka and her colleagues have developed a set of tip sheets to guide researchers and evaluators who conduct studies in the Laotian community; several of these tip sheets are summarized in Box 7.2. They have wider applicability in the development of partnerships with other communities as well.
Box 7.2. Developing Ethical Partnerships: Tip Sheet Summaries
Initiating Partnerships: Gathering the Players, by Darcie Boyer. This is the initial step in the process of acting on a felt need, identifying others who share a concern in the community and in the research or evaluation world, finding appropriate ways to contact and communicate with potential partners, and planning to have a community meeting to discuss the potential partnership.
Ethical Considerations in Participatory Research: The Researcher’s Point of View, by Maryjane Costello. Researchers need to be aware of the diversity of perceptions as to what constitutes ethical practice in various communities.
Partnership-Based Research: How the Community Balances Power within a Research Partnership, by G. Martin Sirait. Partnerships should be arranged so that both researchers and participants are recognized as having power in that context.
Everything You Always Wanted to Know about IRBs, by Sokmeakara Chiev. IRBs, or institutional review boards, are mandated by U.S. federal legislation for any organization that receives federal funds to do research. Communities can institute IRBs of their own with membership from within their cultural group.
Overcoming the Roadblocks to Partnership, by Marie Martinelli. Communities can ensure that they derive benefits from proposed research or evaluation by forming community advisory boards, actively participating in the planning process, and considering successful models of partnerships that might transfer to their own situation.
Knowledge Creation in Research Partnerships, by Pascal Garbani. Researchers need to work together to create knowledge in a manner that respects differences between and within groups.
Source: Based on Center for Family, Work and Community at the University of Massachusetts, Lowell (2004). The Center for Community Research and Engagement’s home page is www.uml.edu/research/ccre.
Many Indigenous peoples prefer to speak of “relationships” rather than “partnerships.” For example, Maori, Native Americans, and Africans share an emphasis on connectivity and extend it beyond relationships among human beings to include the wider environment, ancestors, and inanimate objects. For them, “partnership” implies more of a contractual relationship that may still reflect inequities and exploitation. “Relationship” means that there is a deeper connection at multiple levels in terms of where we are from and who our people are. It means that the evaluators understand the culturally appropriate ways of a community and see the evaluation as a journey that they take together with community members, with opportunities for mutual learning, participant control, and evaluator accountability (Cram, 2009).
Partnerships or relationships are not easy to develop and may not be smooth throughout their existence. Kirkhart (2005) suggests the following considerations related to effective partnerships and relationships. First, relationships in evaluation take time and effort to develop. Evaluators often work in compressed time frames with limited budgets that constrain their ability to be responsive to multicultural dimensions. Second, cultural responsiveness requires knowledge, emotions, and skills. These are complex and not easily taught. Third, evaluators need to be able to interact with the stakeholders in the evaluation in ways that are culturally respectful, cognizant of the strength in the community, and facilitative of desired change. This means that they need to be flexible with the design and implementation of the evaluation in order to be responsive to these factors. Finally, evaluators, particularly if they are from outside the community, need to avoid cultural arrogance in several forms: imposing their own cultural beliefs on the stakeholders, pre-imposing a design on the evaluation, or mistakenly thinking that they accurately understand the culture in which they are working.
Evaluators can also work with community members on capacity building. The capacity building can be reciprocal, in that the evaluators have knowledge and skills to teach from their perspective, and the community members have knowledge, skills, and attitudes to teach from theirs. Teams of evaluators can be formed that allow strengths from all sides to be represented in the evaluation planning. Caldwell et al. (2005) describe effective evaluator teams formed with academic and tribal representatives. They do point out that one challenge with this approach arises from concerns about confidentiality and anonymity, especially in small communities where identities can be recognized readily.
· · · · · · · · · · · · E X T E N D I N G Y O U R T H I N K I N G · · · · · · · · · · · ·
Developing Partnerships
Think about the evaluation you intend to plan.
1. At what point will you involve the community?
2. How will you prepare yourself for meeting the community (by reading about the culture, etc.)?
3. How will you approach that community?
4. What benefits do you see for the community?
5. How will you demonstrate your respect for its culture and traditions?
The Evaluand and Its Context
The theme of AEA’s annual meeting in 2009 was “Context and Evaluation.” Debra J. Rog, the 2009 president of AEA, defined context in these terms:
Context typically refers to the setting (time and place) and broader environment in which the focus of the evaluation (evaluand) is located. Context also can refer to the historical context of the problem or phenomenon that the program or policy targets as well as the policy and decision-making context enveloping the evaluation. Context has multiple layers and is dynamic, changing over time. (Rog, 2009, p. 1)
The contrast in terms of how evaluators from different branches view context was captured in the opening plenary session of the 2009 AEA meeting. Bickman (2009), a theorist from the Methods Branch, said that context was always something that he called “extraneous variables”—in other words, variables that were not of central concern but had to be controlled, so that the validity of the intervention could be determined apart from contextual factors. His perspective contrasted sharply with that of Bledsoe (2009), who is situated in the Social Justice Branch. She indicated that understanding the context was critical to understanding the experiences of the less powerful in the evaluations that she conducted, in order to challenge assumptions by the more powerful. With those two anchor points, we now explore several types of contextual variables and the implications of these variables for the identification of the evaluand and the methods used in the evaluation.
Contextual variables include those associated with the local setting (time and place), as well as with the broader context—the history of the problem and its proposed solutions, as well as politics and legislation that have relevance for the evaluand. The range of stakeholders and their cultural differences are also contextual variables that need to be considered. These contextual variables influence who is involved (stakeholders), how they are involved, the evaluation questions, the type of evaluation undertaken, use of evaluation findings, and decisions about analysis and dissemination of results. The following questions can help stimulate your thinking about contextual variables and their implications:
What dimensions of context influence the type of evaluation questions that can be addressed?
How does the nature of the political context influence utilization? How does it interact with the type of evaluation conducted?
What dimensions of context influence the choice of methods?
How does culture within context affect evaluation practice?
How do our evaluation theories guide us in thinking about context?
How can we learn about context in multisite studies?
What are the implications of a context-sensitive evaluation for analysis and dissemination?
How can we incorporate context into our evaluation inquiries?
Here is an example from the Hawaiian housing study (Stufflebeam et al., 2002; see Chapter 4, Box 4.3) of the identification of contextual variables. The local setting for the housing project was on Oahu’s Waianae Coast, one of the most depressed and crime-ridden areas in the state. The project stretched over 7 years. The funding agency placed high value on self-help and sustainability; this value system influenced the design of the program as well as the evaluation. Contextual variables of particular importance centered on the characteristics of the intended beneficiaries: specifically, the extent of their needs and their abilities to follow through on the expectations for helping to build and pay for their houses. These contextual variables influenced who was finally accepted as the target audience and how local people were used in the role of data collectors. As noted in Box 4.3, the original intent of the program was to serve the poorest families. However, these families could not get the mortgages, so the focus of the project was shifted to the working poor.
· · · · · · · · · · · · E X T E N D I N G Y O U R T H I N K I N G · · · · · · · · · · · ·
Questions about Context
Reflect on the excerpt of Rog’s (2009) explanation of context and the discussion of contextual variables in this section. Now return to the sample studies summarized in boxes in Chapters 3–6. Use the questions listed earlier in this section to analyze relevant contextual variables in at least one sample study. Think about how the authors either considered or did not consider these contextual variables.
Sources That Inform the Identification of the Evaluand and Context
Developing a focused identification of the context and the evaluand can be approached through a number of different strategies:
Funding agencies establish priorities and provide information in requests for proposals (RFPs) about the context and the program that needs to be evaluated. Another version of a funding agency request is a request for a program to be developed with the requirement for an evaluation plan in the proposal.
Traditional scholarly literature reviews can provide valuable information about the context and the evaluand in terms of what is already known about the setting and the program. This type of resource is generally found through databases of articles available in university and sometimes community libraries, or online for a fee.
Theoretical frameworks for evaluation approaches can provide guidance regarding the variables that are important (e.g., an Indigenous evaluation will emphasize specifics of the targeted culture), as well as a basis for decisions about appropriate components of a program. Theoretical frameworks can inform the evaluator and stakeholders about power differences on the basis of race/ethnicity, gender, sexual identities, disabilities/deafness, religion, class/socioeconomic status, and other characteristics associated with discrimination and oppression.
Web-based resources are now available (sometimes overwhelmingly!). Here, an evaluator can read about past evaluations, recommended evaluation strategies for this type of evaluand, and relevant contextual factors. Web-based resources can also include databases such as those posted by the U.S. Census Bureau (2017), the Central Intelligence Agency (CIA) and its World Factbook (CIA Factbook, 2017), the U.S. Department of Education’s (2017) evaluation reports, and USAID’s Development Experience Clearinghouse (2017) evaluations.
“Grey literature” (i.e., literature that is not formally published) can be a valuable resource, especially to gain the perspectives of those who have not been in the privileged scholarly or technological circles that would be represented in the first several strategies. This literature can include program-produced documents such as brochures, project reports, self-studies, past evaluations, conference papers, policy statements, newsletters, newspapers, fact sheets, and more.
Group and individual strategies can be used, such as interviews, surveys, focus groups, concept mapping, and outcome mapping, as well as Indigenous methods based on traditional community meeting ceremonies and rituals.
Advisory boards are commonly used to guide evaluators throughout the process of planning and implementing an evaluation.
New technological tools such as satellite imagery and mapping can be used to provide valuable contextual information about the locations of roads, buildings, services, and natural terrain.
We discuss all of these strategies in more detail below.
Funding Agencies
Funding agencies typically include government agencies and foundations. The U.S. government has a website that lists opportunities to apply for more than $400 billion in federal monies from over 1,000 different programs (www.grants.gov). In addition, many agencies offer their own funding opportunities on their websites (e.g., the U.S. Department of Education). Obtaining funds from federal agencies usually brings a fairly prescriptive set of requirements for how the funds can be used. On the other hand, foundations also offer many potential funding opportunities through a web portal (http://foundationcenter.org/find-funding); larger foundations offer such opportunities at their own websites. Foundations tend to have priority interest areas, but they are generally more flexible than government granting agencies. Box 7.3 provides contrasting statements from a federal agency’s and a foundation’s RFPs.
Box 7.3. Government and Foundation RFPs
The U.S. Department of Justice (2009) offers funding for a tribal youth program that includes the following program requirements:
[The Office of Juvenile Justice and Delinquency Prevention] seeks applicants to establish or expand a mentoring program that offers a mixture of core services and engages youth with activities that enable them to practice healthy behaviors within a positive pro-social peer group. The target population should be youth at risk of gang activity, delinquency, and youth violence.
The goals of this mentoring program are to prevent gang activity, delinquency, and violence by doing the following:
(1) Offering at-risk youth core services that fulfill their adolescent developmental needs within the context of a positive pro-social peer group, including:
A multi-modal mixture of services that may include, but is not limited to, life skills and psycho-educational training, mental health counseling, job placement, community service projects, and structured afterschool recreational, educational, and artistic/culturally enhancing activities.
Emphasizing long-term relationships with mentors and key staff, who are nurturing and supportive adults.
(2) Developing structured mentoring relationships that include the following:
A relationship that lasts 2 or more years with significant contact between the mentor and mentee where the mentee views the mentor as a friend, not an authority figure.
Significant training for the mentor.
Oversight of the mentoring relationship.
Data collection to track the relationship and positive outcomes arising from the mentoring relationship.
Structured activities for the mentors and mentees to participate in together.
The Ford Foundation (2010) also supports grantees to develop and implement projects for youth mentoring, but it does not have explicit requirements about the nature of the program. Rather, it has issued this broad statement:
We make grants to develop new ideas and strengthen organizations that reduce poverty and injustice and promote democratic values, international cooperation and human achievement. To achieve these goals, we take varied approaches to our work, including supporting emerging leaders; working with social justice movements and networks; sponsoring research and dialogue; creating new organizations; and supporting innovations that improve lives. These methods of problem-solving reflect our values and the diverse ways in which we support grantees.
The foundation also describes a model of philanthropy that it has pursued for more than 70 years: to be a long-term and flexible partner for innovative leaders of thought and action. Lasting change in difficult areas, such as the reduction of poverty, protection of human rights, and establishment of democratic governance after a dictatorship, requires decades of effort. It involves sustained work with successive generations of innovators, thinkers, and activists as they pursue transformational and ambitious goals.
Cheek (cited in Mertens, 2009, p. 112) offers the following cautionary questions to consider before accepting money from a funding agency:
Who owns the data and what can you do with the data?
What if the funder wants to suppress results of the study? Or wants to exclude parts of the results?
What exactly is the deliverable (i.e., the product expected by the funder)?
In what time frame?
Reporting requirements?
What if there is a disagreement about the way the research or evaluation should proceed?
Scholarly Literature
Many funding agencies require a scholarly review of literature on the evaluation topic in order to provide evidence of knowledge in the field, demonstrate the need for the proposed project, and inform the proposed scope of work. Searching databases is very easy for evaluators in the developed world, especially those who work in universities. A list of commonly used databases is provided in Box 7.4. These are generally searchable for free at universities and for a modest fee for people in other settings. Most of these databases can be searched by topic, author, or title. Many databases now have full text documents electronically available to users, eliminating the need to actually visit the library to obtain the documents.
Box 7.4. Scholarly Databases
Psychology
The American Psychological Association (APA) produces the following databases:
PsycARTICLES. This database contains full text articles from 42 journals published by APA and related organizations. The dates of coverage vary; the earliest articles are from 1988, but APA is developing PsycARCHIVES, which has over 100 years of content coverage.
PsycINFO. This database indexes and abstracts over 1,300 journals, books, and book chapters in psychology and related disciplines (1887–present).
Mertens, D. M., & Wilson, A. T. (2018). Program evaluation theory and practice, second edition : A comprehensive guide. Guilford Publications. Created from templeuniv-ebooks on 2022-07-27 00:55:54.
C o p yr
ig h t ©
2 0 1 8 . G
u ilf
o rd
P u b lic
a tio
n s.
A ll
ri g h ts
r e se
rv e d .
Working with Stakeholders 223
PsycBOOKS. Textbooks published by APA and selected classic books from other publishers are found in this database.
Social Science
Social Science Journals (ProQuest). Social science journal articles published from 1994 to the present.
Sociological Abstracts. This is an online resource for researchers, professionals, and students in sociology and related disciplines. Sociological Abstracts includes citations and abstracts from over 2,000 journals, plus relevant dissertation listings, abstracts of conference papers and selected books, citations of book reviews and other media, and citations and abstracts from Social Planning/Policy and Development Abstracts.
Social Work Abstracts. Index to articles from social work and other related journals on topics such as homelessness, AIDS, child and family welfare, aging, substance abuse, legislation, community organization, and more.
Education
Education Database (ProQuest). Indexes more than 750 titles on education, including primary-, secondary-, and university-level topics. Almost 500 titles include full text.
Educational Resources Information Center (ERIC). A bibliographic database covering the U.S. literature on education; a key source for researchers, teachers, policy makers, librarians, journalists, students, parents, and the general public. Accessible to the public at www.eric.ed.gov.
Dissertations and Theses
ProQuest Dissertations and Theses. An index of dissertations and theses published in the United States and internationally.
Lawless and Pellegrino (2007) describe an evaluation they were planning to determine how to prepare teachers to use technology in their classrooms to enhance learning. They began with a very extensive literature review, which focused on “what is known and unknown about professional development to support the integration of technology into teaching and learning. To answer such questions, we have assembled bodies of literature that are relevant to the design of research studies, the evaluation of the quality of the evidence obtained therein, and the possible utility of conclusions” (p. 577). To this end, they examined a multipart literature: what constitutes professional development, how technology is integrated into the classroom, what influences teachers to adopt technology, the multiple roles that technology can play in this context, the quality of previous research on this topic, and the long-term impacts technology has had on teachers and administrators. They used this literature review to “lay out the kinds of questions that should be asked in evaluating how states, districts, and schools have invested their technology integration funds and the nature of the research designs and sources of evidence that might be used to better answer questions about what is effective and why” (p. 578).
In an evaluation of the sustainability of health projects, Scheirer (2005) provides this description of her literature search strategy:
The search was conducted using the search string “sustainability OR routinization OR institutionalization AND health OR healthcare,” in all major relevant bibliographic databases, for the years 1990 to 2003, including PubMed, ProQuest, the Librarians Index to the Internet, and NLM Gateway. The abstracts of potentially relevant citations were examined to determine if the original research included data collected about any aspect of sustainability after the initial funding had ended. Full texts of all relevant articles were then obtained. A few studies were already known to me from prior related work. In addition, reference lists of obtained articles were examined for any additional studies, such as those using different terminology. The systematic review did not include articles or how-to-do-it commentaries about sustainability that did not report empirical data, although these articles were consulted for their conceptual frameworks and approaches. These procedures yielded 19 studies that met the criteria for inclusion: reporting data collected about the status and/or influences on health program sustainability (including case studies). The review included all available studies that met these criteria, not a sample of them. (p. 327)
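Read literally, this search string combines two OR-groups with an AND: a record is retained only if it mentions at least one of the sustainability terms and at least one of the health terms. The following Python fragment is a minimal sketch of that boolean logic, using invented abstracts rather than actual database records; note that real databases apply their own operator precedence, so explicit parentheses are usually safer in practice.

# Hypothetical illustration of the boolean logic in a search string like
# Scheirer's; the abstracts below are invented, not real database records.
TOPIC_TERMS = ("sustainability", "routinization", "institutionalization")
FIELD_TERMS = ("health", "healthcare")

def matches(abstract: str) -> bool:
    text = abstract.lower()
    # AND across the two groups, OR within each group
    return (any(term in text for term in TOPIC_TERMS)
            and any(term in text for term in FIELD_TERMS))

abstracts = [
    "Routinization of a rural healthcare program after grant funding ended.",
    "Institutionalization of new teaching practices in urban schools.",
]
print([a for a in abstracts if matches(a)])  # keeps only the first abstract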
The use of scholarly literature is a critical part of enhancing our understanding of the context in which the evaluation is taking place. However, it is limited by the fact that various gatekeepers decide what will be published and what will be archived in a database. Therefore, evaluators should be cognizant of this limitation and engage in other types of search strategies to identify important contextual variables.
Theoretical Frameworks
The theorists whose work is described in Chapters 3–6 provide evaluators with a multitude of theoretical frameworks from which to choose in their planning work. These theories can range from theories of literacy development to theories of community involvement. Theories provide a framework for thinking, highlight relevant concepts, and suggest dynamic relationships between those concepts. Here are some examples of evaluations that used theoretical concepts:
Bowman’s (2005) evaluation of a tribal education model in a technical college in Wisconsin was based on an Indigenous theory from the Native American community. The geographic coverage area of the technical college included members of three tribes. The evaluators sought out each tribe’s individual customs, culture, language, and epistemological views based on their tribal traditions.
Donaldson and Gooler (2002) conducted a theory-based evaluation of a job search training program in California. The underlying theory of the program was based on identifying the skills and psychological factors that were necessary for the participants to find employment and improve their mental health. The theory held that the participants needed to increase their job search confidence, their job search skills, and their problem-solving strategies in order to achieve the intended outcomes.
Campbell et al.’s (2014) study of the effectiveness of an intervention to support victims of sexual assault (see Chapter 6, Box 6.9) used a feminist theoretical framework, which focused on power differentials in the planning, implementation, and use of the evaluation.
Brady and O’Regan (2009) used Rhodes’s model of mentoring as a theoretical framework for their youth mentoring evaluand. This model is presented graphically in Chapter 3, Box 3.3.
Web-Based Resources
The proliferation of web-based resources sometimes makes me wonder what we would do if we didn’t have the World Wide Web anymore. This is probably unimaginable to many people younger than I am, and I admit that life would be a lot harder for me if it happened. The major search engines of today may not be the major search engines of tomorrow. The two major search engines that I currently use (www.google.com and www.bing.com) provide access to printed documents, pictures, graphics, images, news, videos, discussion groups, maps, and more. Evaluators can locate a great deal of information about contexts of evaluations and experiences with similar evaluands through web searching. Here are two examples:
Fredericks et al. (2008): “The evaluation relied on information being collected from a number of data sources, including case records, which contained demographics and disability diagnoses data; Medicaid billing and expenditure data” (p. 225).
Sharma and Deepak (2001) gathered contextual data for their evaluation of CBR in Vietnam (see Chapter 4, Box 4.12) from several websites, including the World Bank, the Central Intelligence Agency (CIA), and UNICEF. They were able to report on the gross national product of Vietnam, the density of its population, its population growth rate, and other demographics such as health indicators, age, life expectancy, infant mortality, literacy rates, access to clean water, and government budgets.
“Grey Literature”
Evaluators should always seek program documents that have been produced before the start of the evaluation process. The quantity and quality of these documents will vary widely, depending on the history of the evaluand. Even if a new program is planned, it is probably going to occur in a context that has some kind of paper trail. When I conducted an evaluation of a residential school for the deaf, I asked to see their self-study report and their accreditation report. In addition, I asked to see the curriculum guides and the student conduct rules. All of these documents gave me an overview of the evaluation context. The APA (www.apa.org/psycextra) has listed the following documents as examples of “grey literature”: research reports, policy statements, annual reports, curricula materials, standards, videos, conference papers and abstracts, fact sheets, consumer brochures, newsletters, pamphlets, directories, popular magazines, white papers, and grant information. Examples of using “grey literature” in evaluation practice include the following:
Mertens et al. (2007; see Chapter 6, Box 6.8) read over the RFP for the teacher training program that they evaluated, as well as the university’s proposal and annual reports for the 6 years prior to the evaluation.
Bowman (2005) located and reviewed the initial needs assessment that was conducted in Wisconsin and was used as the basis for the development of the tribal education model for on- and off-campus activities. She was also able to determine that there had been no electronic, print, or annual data since the time of that report until she undertook her evaluation study in 2004.
Brady and O’Regan (2009; see Chapter 3, Box 3.3) cited the Atlantic Philanthropies annual report for 2007 as a source of historical information that set the context for their evaluation of the youth mentoring program in Ireland. The Atlantic Philanthropies foundation has funded programs to improve people’s lives through education and knowledge creation since the 1990s. The foundation reported that early initiatives in this area were not as effective as it had hoped because of lack of coordination, dependence on volunteers, and reliance on multiple unpredictable funding sources. Within the Foroige agency in Ireland, the foundation funded a pilot project of a BBBS model of youth mentoring.
Group and Individual Strategies
Evaluators can use group and individual strategies such as concept mapping, brainstorming, interviews, surveys, and focus groups, as well as Indigenous methods based on traditional community meeting ceremonies and rituals. Steps for conducting group and individual interviews are described in Chapter 10 on data collection. Here we provide examples of the use of these strategies and Indigenous methods for the purpose of determining the evaluand and its context.
Bowman (2005) included the use of focus groups and individual interviews in the Native American community in order to determine what their needs were for tribal-related education. She integrated the medicine wheel into the interviews (similar to the Cross et al. [2000] study summarized in Chapter 6, Box 6.6). She structured the questions based on the four quadrants of the medicine wheel. In addition, she provided time for informal interaction following the focus group process to allow people to socialize and share experiences that might not have surfaced during the focus group. The data from the focus groups and individual interviews were used to develop recommendations for changes in the tribal education model, the evaluand of interest in this study.
Africans have traditional tribal gatherings that can be used as a basis for dialogue about context and needs (Chilisa, 2011). The group gatherings in Botswana are called kgotla; these involve the village council in the main village, with the chief or his assistant in charge of the process. Smaller kgotla can be held in outlying areas with the head tribesman as the facilitator, or even in extended families with the elders facilitating the process. These gatherings can be used to identify problems and potential solutions. One downside to this process is that it has traditionally excluded women and children. Therefore, evaluators will need to work with the communities to develop appropriate strategies for all stakeholders' views to be represented.
Concept Mapping
Trochim (1989) developed the technique of "concept mapping," which has been applied in many different contexts. The steps in the process involve having participants brainstorm either possible outcomes or specific factors that influence those outcomes. The next step is to edit the statements to reduce repetition. Participants are then asked to rate the outcomes on two dimensions—importance (compared to other factors) and feasibility over the next few years—on 5-point scales where 5 indicates "extremely important" or "extremely feasible." Sophisticated statistical procedures (multidimensional scaling and
hierarchical cluster analysis, discussed in Chapter 12) are then applied to the data to produce configurations revealing which of the statements are rated most similarly. Different types of maps can be used to demonstrate how the statements can be organized and used to understand the underlying theory of the project.
Trochim, Milstein, Wood, Jackson, and Pressler (2004) used concept mapping with the Hawaii Department of Health to determine factors of importance that affect individuals' behaviors related to avoidance of tobacco, improvement of nutrition, and increased physical activity. Project participants brainstormed factors that they believed influenced individuals' behaviors, and then rated those factors according to their importance and feasibility. The concept mapping revealed that factors could be categorized in terms of policies and laws, environmental infrastructure, children and schools, coalitions and collaborations, community infrastructure, information and communication, and access. These results were used by the state's governor in the official state plan, approved by the legislature, and used to create sustainable change in Hawaii.
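To make the analytic step concrete, the following is a minimal sketch of the computation Trochim describes (multidimensional scaling followed by hierarchical cluster analysis), written in Python with NumPy, SciPy, and scikit-learn. The rating matrix, the number of statements and raters, and the number of clusters are all hypothetical; they are not taken from the Trochim et al. study.

```python
# A minimal sketch of the statistical step in concept mapping:
# statements rated by participants are placed on a two-dimensional
# "point map" with multidimensional scaling (MDS) and then grouped
# into themes with hierarchical cluster analysis.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist, squareform
from sklearn.manifold import MDS

# Hypothetical data: rows = brainstormed statements, columns = raters,
# cells = 1-5 importance ratings (feasibility is analyzed the same way).
rng = np.random.default_rng(0)
ratings = rng.integers(1, 6, size=(20, 15))  # 20 statements, 15 raters

# Statements rated most similarly should sit closest together:
# pairwise distances between the statements' rating profiles.
dist = pdist(ratings, metric="euclidean")

# MDS projects the statements into two dimensions for the map.
coords = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(squareform(dist))

# Hierarchical clustering groups nearby statements into candidate
# themes (e.g., "policies and laws," "community infrastructure").
themes = fcluster(linkage(dist, method="ward"), t=5, criterion="maxclust")

for i, (xy, theme) in enumerate(zip(coords, themes)):
    print(f"statement {i:2d}  theme {theme}  position {xy.round(2)}")
```

In practice, the resulting clusters are then labeled and interpreted with stakeholders, as in the Hawaii example above.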
Outcome Mapping
Buskens and Earl (2008; see Chapter 6, Box 6.10) offer a strategy similar to concept mapping called "outcome mapping." These two strategies are similar in many respects; however, Buskens and Earl offer insights into the application of outcome mapping within the context of transformative participatory evaluations in international development. Outcome mapping deliberately involves subgroups of stakeholders in the process of determining how interventions fit into the overall development process. It begins with four questions (Buskens & Earl, 2008, p. 174):
1. What is the program’s vision?
2. Who are its boundary partners?
3. What changes in behavior are being sought?
4. How can the program best contribute to these changes?
"Boundary partners" are defined as "the individuals, groups, or organizations with whom the program works directly and with whom the program anticipates opportunities for influence" (p. 190). Boundary partners are similar to stakeholders; however, Buskens and Earl make the distinction that boundary partners are the subgroups interacting most closely with each other. Hence, instead of having big stakeholder meetings with everyone represented, they tend to have team meetings of relevant boundary partners. For example, the core management team for the IFRP had the following boundary partners (Buskens & Earl, 2008, p. 183):
Action researchers
Training development team
IFRP trainers
IFRP desk researchers
Funders
Motivational Interviewing Southern African Network (MISA)
Department of Family Medicine at University of Stellenbosch
Health researchers in southern Africa
The action researchers had their closest associations with the nurse counselors and the project management team members, who constituted their boundary partners. The boundary partners for the mothers who participated in the project were the nurse counselors with whom they worked. These teams deliberated on the program's vision and desired changes in behavior. Buskens and Earl then discussed how the program could provide the conditions necessary for that change to occur. In outcome mapping, outcomes are understood not only as changes in behavior but also as changes in relationships, actions, activities, policies, or practices of an individual, group, community, organization, or institution (Wilson-Grau & Britt, 2013). The outcome-mapping process is dynamic and ongoing, allowing the boundary partners to examine their progress, to make adjustments to the intervention as deemed necessary, to plan for the next step and wider adaptation, or to scale up their project.
Advisory Boards
Evaluators often work with advisory boards as a way to get input from representatives of various stakeholder groups. It would not be possible to work with all stakeholders in a national-level study (or a state-level or community-level study, in many instances). Hence the use of an advisory board can allow for important dimensions of the community to be represented. Mertens (2000) worked with an advisory board in a national evaluation of court access for deaf and hard-of-hearing people. The advisory board included representatives of the deaf and hard-of-hearing communities who were diverse in various respects: their choice of communication mode and language (sign language, reading lips, use of voice); backgrounds with the court (attorneys, judges, judicial educators, police officers, and interpreters); and hearing status (hearing, hard of hearing, and deaf). This group was able to provide guidance in regard to the diversity of experiences that deaf and hard-of-hearing people encounter in the courts. The group also emphasized the importance of understanding these diverse experiences in order to develop an intervention that could improve court access.
Technological Tools: Satellite Imagery and Mapping
Satellite imagery and mapping are valuable tools that can be used to display current conditions, as well as to compare past and current conditions. An organization called Information Technology for Humanitarian Assistance, Cooperation and Action (http://ithacaweb.org/international-cooperation) provided information to help aid agencies plan how to respond when the island country of Haiti was struck by a massive earthquake on January 12, 2010. This organization used geomapping technology to post before-and-after pictures on its website of the areas hit by the earthquake. The before-earthquake satellite photos showed roads, airports, various types of buildings (public and private), and water and electricity centers. The photos taken after the earthquake showed how extensive the damage was to all these facilities. Electricity was not available; telephone
cables were damaged; the airport had no fuel or lights, and the road from there into the city was destroyed; the water supply collapsed, and wells were contaminated; the prisons broke open, and the prisoners who survived the quake escaped. The geomapping tool thus provided information that was invaluable in helping the aid agencies identify and respond to the conditions on the ground, especially since communication systems were not functioning.
Note that many of these strategies for identification of context and evaluand are revisited in our Chapter 8 discussion of the approach to evaluation known as "needs and assets assessment."
Depicting the Evaluand
In most evaluation planning, the evaluand, as the entity that is being evaluated, needs to be specified early in the evaluation planning process. The exception to this specification might occur in developmental evaluations in which there is no static evaluand, or in transformative cyclical evaluations in which the evaluand might be developed based on findings from early stages of the evaluation. As mentioned at the beginning of this chapter, evaluands can range in definition from a gleam in a proposing investigator's eye to a well-established program. It is sometimes easier to describe an evaluand that has a long history and ample extant information, although this is not always the case. Sometimes a program that has been around for a while has developed layers of complexity that were not present in the original plans, requiring evaluators to do a bit of investigative work. Programs that are under development may also exist differently in the minds of different stakeholders. One of the greatest services an evaluator can provide in such circumstances is to facilitate discussions among the various stakeholder groups to identify what the various components of the evaluand are, how they work together, and what resources are needed and available to lead to the desired outcomes. Portrayals of evaluands should be considered as working models that will change over time; however, in order to plan an evaluation, a preliminary portrayal of the evaluand is needed.
Evaluands can be depicted in many ways: descriptively or graphically, as static or dynamic entities. Descriptive portrayals of evaluands are typically given as narratives; the object of the evaluation is described, along with the major players and goals. Graphic portrayals of evaluands have typically taken the form of logic models or logical frameworks (the latter is sometimes shortened to log frame, the terminology used in the international development community for logic models). Evaluators from all branches can use all of these approaches to depicting evaluands; however, they may use them a bit differently. A Methods Branch evaluator might view the logic model as needing to be followed without changes in order to assure the fidelity of the treatment intervention. A Values Branch evaluator would probably be more comfortable with a flexible view of the logic model, allowing it to evolve as the study progresses. Use Branch evaluators would want the logic model to be viewed as useful to their primary intended user and would therefore be amenable to changes as needed. A Social Justice Branch evaluator would see the logic model as a best guess at the beginning of the project and would want to leave room for changes based on findings from communities throughout the process of the evaluation.
Logic Models and Log Frames
Logic models are most closely tied to theory-based evaluation approaches (although they are used in many evaluation approaches), because the essence of theory-based evaluation is to reveal the underlying theory of how the program intends to achieve its intended outcomes. For example, if I want youth to refrain from using illegal drugs, what is my theory as to how to accomplish that outcome? The logic model is supposed to make the program's theory of change explicit. A theory of change describes how the activities, resources, and contextual factors work together to achieve the intended outcomes.
The W. K. Kellogg Foundation (WKKF, 2004b) has published a logic model development guide that starts with a very simple depiction of a logic model. This includes two main components: what the program people plan to do (resources/inputs and activities) and what their intended results are (outputs, outcomes, impact). This elementary depiction of a logic model is shown in Figure 7.1.
"Resources" or "inputs" are those human, financial, and community resources that are needed for the evaluand, such as funding, partnering organizations, staff, volunteers, time, facilities, equipment, and supplies. They can also include wider contextual factors, such as attitudes, policies, laws, regulations, and geography. "Activities" include the processes, events, technology, and actions that are part of the program implementation. These can include such components as education and training services, counseling, or health screening; products such as curriculum materials, training materials, or brochures; and infrastructure such as new networks, organizations, or relationships. "Outputs" are products of the activities and include the quantity and quality of the services delivered by the program, such as the number of workshops taught or the number of participants served. "Outcomes" are the changes in individual participants in terms of behaviors, knowledge, skills, or attitudes. These can be short term or long term. "Impact" is the desired change on a broader level for organizations or communities, such as reduction of poverty or increase in health.
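Because these five components recur in every logic model discussed below, it can help to see them as a single data structure. The following is a minimal sketch in Python; the mentoring-program entries are hypothetical and are not drawn from any study cited in this chapter.

```python
# A minimal sketch of the basic WKKF logic model as a data structure,
# using the five components defined above. The example program and all
# of its entries are hypothetical.
from dataclasses import dataclass, field

@dataclass
class LogicModel:
    resources: list[str] = field(default_factory=list)   # inputs you need
    activities: list[str] = field(default_factory=list)  # what you will do
    outputs: list[str] = field(default_factory=list)     # evidence of delivery
    outcomes: list[str] = field(default_factory=list)    # 1-6 year changes
    impact: list[str] = field(default_factory=list)      # 7-10 year changes

mentoring = LogicModel(
    resources=["two full-time staff", "foundation grant", "school partnership"],
    activities=["recruit and train mentors", "weekly mentoring sessions"],
    outputs=["40 mentors trained", "120 youth served"],
    outcomes=["improved school attendance", "stronger adult connections"],
    impact=["reduced dropout rate in the district"],
)
print(mentoring.outputs)  # ['40 mentors trained', '120 youth served']
```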
The most basic format for a logic model is the outcomes-based logic model, which starts with stakeholders' identifying those outcomes and impacts that are important to them. Any of the group processes described earlier in this chapter can be used for this purpose. For example, Fredericks et al. (2008; see Chapter 3, Box 3.5) described a logic model for a project that was supposed to improve services and quality of life for people with developmental disabilities. The stakeholders included a state-level steering committee and a finance team from the state agency in charge of the program.
Figure 7.1. Basic logic model template. Source: Based on WKKF (2004, pp. 1 and 17). The template reads left to right: your planned work (resources/inputs: What do you need to accomplish your activities?; activities: What activities do you need to conduct?) leads to your intended results (outputs: What evidence of service delivery is there?; outcomes: What changes do you expect in 1–3 and then 4–6 years?; impact: What changes do you expect in 7–10 years?).
The project had specified goals: "to increase the individualization of service planning and delivery, increase administrative efficiencies, increase person-centered planning, increase consumer choice, increase community integration, and improve the quality of life for consumers—in terms of home, relationships, personal life, work and school, and community" (Center for Policy Research, cited in Fredericks et al., 2008, p. 254). The evaluators and the steering committee worked together to develop the logic model displayed in Box 7.5.
Box 7.5. Logic Model from the Fredericks et al. (2008) Quality-of-Life Study
Inputs (What is going into the system?)
• Training for staff to ensure more individualized services
• Resources to train and retain qualified staff
• Links to community partners that will allow consumers to be more involved in the community, both socially and in a work setting

Process (What is it that we are doing?)
• Implementing sites will redesign service delivery efforts based on an individualized service environment
• Implementing sites will provide services according to the performance contract
• Implementing sites will provide services to individuals currently not being served
• Implementing sites will serve individuals with a full range of disabilities
• Implementing sites will use a new budgeting procedure

Outputs and short-term outcomes (How will we know when we have done this?)
• Increases in person-centered planning
• Increases in community integration
• Increased choices for consumers
• Increases in consumer choice
• Increases in the number of people being served
• Financial predictability, as measured by stability in the budgets

Long-term outcomes and impacts (Why are we doing this?)
• Increases in the individualization of service planning and delivery
• Increases in administrative efficiencies
• Increases in the quality of life for consumers—in terms of home, relationships, personal life, work and school, and community

Source: Fredericks et al. (2008, p. 255). Copyright © 2008 the American Evaluation Association. Reprinted by permission.
The WKKF (2004b) logic model development guide offers another, more intricate template for a theory-based logic model. Like the simpler logic model just presented, this theory-based logic model explains what the project wants to accomplish and how it will accomplish those intended results, but it does so in greater detail and complexity. The theory-based approach begins by clarifying the assumptions that underlie the decisions to plan and implement the evaluand. A template for this type of logic model appears in Figure 7.2. The development of the theory-based logic model follows these steps:
1. Identify the problem or issue. Why is this evaluand needed? What are the conditions in the community that give rise to the need for this program (e.g., high levels of poverty, increased rate of infection from HIV/AIDS, low literacy levels)?
2. List the community's needs and assets. This means listing both the strengths and challenges in the community. For example, strengths might include networks of health care workers, expressed desire to work for change, or access to funds. Challenges might include poor infrastructure in terms of transportation or school buildings or clean drinking water. Part of the contextual analysis should pay attention to issues of power and influences of discrimination and oppression in the evaluation context.
3. Specify the desired results in terms of outputs, outcomes, and impact. As explained above for the outcomes-based logic model, outputs might be services delivered, workshops provided, or number of participants trained. Outcomes are short-term results in the form of changes in individuals' behaviors, skills, efficiency, literacy levels, or disease prevention or treatment. The impacts are the longer-term goals of the project (e.g., reduction of poverty, violence, economic hardship, or hunger).
Figure 7.2. Theory-based logic model template. Source: Based on WKKF (2004, p. 28). The template links the problem or issue and the community's needs/assets to the desired results (outputs, outcomes, and impact), with strategies, influential factors, and assumptions shaping the pathway.
4. Identify influential factors—both those that are facilitative and those that are barriers to change. These can include legislation or policies that either mandate or inhibit the changes that are needed, a history of political stability or civil unrest, economic upturns or downturns, natural disasters, and political or community leadership.
5. Determine strategies (activities) that are needed to achieve the desired results. These might include development of recruitment or training materials, provision of services to enhance skills or health, or enhancement of infrastructure or technology.
6. State the assumptions that underlie the project. Why do the stakeholders believe that this course of action in this context will garner the results they desire? What are the principles, beliefs, or ideas that are guiding this project?
An example of a theory-based logic model is displayed in Figure 7.3. This figure is adapted from the work of Kathleen Donnelly-Wijting (2007) for an evaluation of an HIV/AIDS prevention program for deaf youth in South Africa.
Another example, in Box 7.6, is from Hamilton County, Ohio, which participated in the U.S. Department of Housing and Urban Development's (HUD) Lesbian, Gay, Bisexual, Transgender, and Questioning (LGBTQ) Youth Homelessness Prevention Initiative.
Figure 7.3. Theory-based logic model for HIV/AIDS prevention for youth in South Africa. Source: Adapted from Donnelly-Wijting (2007). Used by permission of Kathleen Donnelly-Wijting.
LGBTQ youth were dramatically overrepresented in the population of youth experiencing homelessness, because there were few systems and services designed to meet their needs. The goals of this initiative were to learn more about (1) preventing homelessness for LGBTQ youth and (2) intervening early to prevent chronic homelessness among LGBTQ youth. The initiative involved a deep and diverse list of stakeholders who had a vested interest in the issue, and together they created a theory, on which they based their logic model, of how to resolve LGBTQ youth homelessness.
Box 7.6. Hamilton County Safe and Supported Community Plan to Prevent Homelessness for Lesbian, Gay, Bisexual, Transgender, and Questioning Youth
Narrative Description of the Evaluand and Theory of Change
The Hamilton County Safe and Supported Community Plan has eight key goals:
1. Facilitate greater community awareness of issues contributing to LGBTQ youth homelessness and the Initiative's efforts to address these issues.
2. Facilitate greater local collaboration among stakeholders, including youth, community members, youth-serving agencies, and staff of youth-chosen spaces.
3. Improve data quality on sexual orientation and gender identity.
4. Use risk and protective factors for screening and assessment of youth at risk of or experiencing episodic homelessness.
5. Improve the quality of interventions to reduce risks and build protective factors that can prevent LGBTQ youth homelessness.
6. Support positive outcomes for LGBTQ youth in the areas of well-being, permanent connections, stable housing, and education/employment.
7. Obtain new funding and in-kind resources to support plan implementation.
8. Evaluate the initiative, including its progress and outcomes.
Safe and Supported Theory of Change: How and Why an Approach Will Produce Change
To prevent LGBTQ youth homelessness:
Start with a needs assessment, understanding of local community context, and a collaborative planning process with stakeholders and youth representing the community.
To identify and implement strategies that leverage local strengths, address gaps in preventing LGBTQ youth homelessness, and address challenges contributing to LGBTQ youth homelessness.
Through increased resources for youth, families, schools, communities, and peer groups.
Through cultural competency training and awareness building for families, schools, communities, and peer groups.
Through changes in policies, procedures, and systems.
So that we build protective factors and reduce risk factors associated with LGBTQ youth homelessness, such as:
1. Improve social climate, including inclusivity of policies, effectiveness of resources, and support/acceptance of LGBTQ identity.
2. Nurture youth who are motivated by self-acceptance and belonging to a community to seek social and emotional well-being, permanent connections, stable housing, and education/employment.
3. Nurture a community that provides a safety net of social and emotional well-being, permanent connections, stable housing, and education/employment opportunities so youth do not experience homelessness.
4. Increase the ability of families to accept and support difference to create a safe space for youth and prevent episodes of homelessness.
Abbreviated Logic Model
Contextual Factors
Community context
Availability of and access to culturally competent services, programs, shelters, and housing
Availability of data
Economic development and financial resources
Geography
Leadership
Collaboration in the community across youth-serving systems (e.g., education, juvenile justice, law enforcement, mental health, faith-based) and "turf" concerns
Culture
Advocacy efforts and politics
Community awareness of prevalence and causes of LGBTQ youth homelessness
Social attitudes toward LGBTQ
Client context
Socioeconomic demographics (age, race, etc.)
Awareness of and willingness to access supports
Previous access to supports
Protective factors (e.g., employment, positive friends, school connection, supportive adults, survival skills)
Risk factors (e.g., emotional distress, family rejection, lack of stable housing, substance use, mental health challenges, physical factors)
Coming out status
Federal context
HUD, DOE, HHS, DOJ support for the initiative
DOE requiring diversity training for all school staff
Inputs, Activities, and Outputs
Inputs
• Initiative planning team (~30 members), including youth participants
• Lighthouse staff (2)
• Strategies to End Homelessness staff (1)
• Technical assistance (TA) team (3) and other federal TA
• Group site
• Coordination of existing funding
• Exploring new funding

Priority Activities
• Needs assessment (SWOT analysis; local collaboration; steering committee meetings [monthly]; community meetings [4]; more clearly defining CQI process [formal change management process])
• Local plan development (six-month strategic planning process involving the systems and providers serving LGBTQ and homeless youth; leadership team meetings [biweekly]; identify funding sources; development and advocacy of funding strategies)
• Local plan implementation (two years of implementation; plan strategies and activities; community advisory group)
• Local plan evaluation

Outputs
• Needs assessment findings
• Analysis of local data—report
• Theory of change
• Logic model
• Strategic plan
• Financial plan
• Local toolkit for corporate response
• Outputs based on final local plan
Outcomes and Impact
Short-term outcomes (months 1–6), intermediate outcomes (months 7–18), and long-term outcomes (months 19+)
• Identification of community need(s) using data
• Participation of LGBTQ homeless youth in planning
• Increased community engagement
• Increased participant and community awareness of LGBTQ homelessness
• Identification of evidence-based or promising practices
• Identification and promotion of existing resources
• Identification of new funding sources
• Reduced number of LGBTQ youth who become homeless
• Strengthened relationships among youth and key partners and within each group
• Expanded screening and assessment opportunities
• Increased cultural competency at initiative partner agencies
• Increased participation in LGBTQ competency training for foster parents and JFS workers
• Increased number of LGBTQ youth in stable housing, permanent connections, social and emotional well-being, and education/employment
• Increased community acceptance and adult support of LGBTQ youth
• Improved response to risk and protective factors of LGBTQ youth at risk of or experiencing homelessness
• Implemented interventions and countywide programs to address specific needs of youth
• Increased number of foster and adoptive families that support LGBTQ foster youth and increased matches between youth and these families
• Improved LGBTQ client services and satisfaction at Sheakley Center
• Decreased number of LGBTQ youth who become homeless
• Improved access to community supports and resources for LGBTQ youth
• More positive school environment for LGBTQ youth
• Improved social and emotional well-being among LGBTQ youth at risk of homelessness
• Secure funding for initiative recommendations
• Expanded dialogue to share and explore perceptions of LGBTQ youth and related issues
• Improved understanding of the prevalence of LGBTQ foster youth in Hamilton County
• Improved data depth and quality (completeness, accuracy, timeliness)
Source: Hicks and Alspaugh (2014). Copyright © 2014 Meredith Hicks and Meradith Alspaugh. Reprinted by permission.
In addition to the WKKF (2004b) development guide for logic models, a number of other guides are available online:
The Harvard Family Research Project has a guide for developing logic models. The logic model development process is illustrated with an example of a districtwide family engagement program (https://eric.ed.gov/?id=ED507500).
The Aspen Institute has developed a tool that includes step-by-step instructions on the development of a logic model within the world of philanthropy. Continuous Progress, a branch of the Aspen Institute's Global Interdependence Initiative, launched its Advocacy Progress Planner (www.aspeninstitute.org/programs/aspen-planning-and-evaluation-program/tools). Funded by the California Endowment and the William and Flora Hewlett Foundation, this tool illustrates the range of possible outcomes and target audiences that might be relevant to a certain advocacy or policy change strategy. The model helps a user focus on identifying the proper goals of any advocacy effort, which depend on where the issue stands in the policy process.
CAPT presents a planning framework for prevention programs (www.samhsa.gov/capt/applying-strategic-prevention-framework). Many of the steps fit into the logic model system. Step 1 is to assess the community's needs and readiness for an intervention. Step 2 is to mobilize the community and build capacity as necessary. Step 3 is called "planning" and includes a description of the program, activities, and strategies. The website gives many examples of best practices from the National Institute on Drug Abuse, CSAP, the
National Center for the Advancement of Prevention, the Office of Juvenile Justice and Delinquency Prevention, the Department of Education, and the Centers for Disease Control and Prevention (CDC). Step 4 is to implement the program, and Step 5 is to evaluate the program's results and sustainability.
In the field of international development, logical frameworks (log frames) are used instead of logic models. Baker (2000) describes log frames as statements of objectives that lead to the identification of outputs and impact indicators.
The use of a logical (log) framework approach provides a good and commonly used tool for identifying the goals of the project and the information needs around which the evaluation can be constructed. The log frame, increasingly used at the World Bank, is based on a simple four-by-four matrix that matches information on project objectives with how performance will be tracked using milestones and work schedules, what impact project outputs will have on a beneficiary institution or system and how that will be measured, and how inputs are used to deliver outputs. . . . In other words, it is assumed that the project's intended impact is a function of the project's outputs as well as a series of other factors. The outputs, in turn, are a function of the project's inputs and factors outside the project. Quantifiable measures should then be identified for each link in the project cycle. This approach does not preclude the evaluator from also looking at the unintended impacts of a project but serves to keep the objectives of the evaluation clear and focused. Qualitative techniques are also useful in eliciting participation in clarifying the objectives of the evaluation and resulting impact indicators. (p. 19)
Davies (2005) also describes a logical framework as a 4 × 4 planning matrix:
The four columns are the Narrative—a description of expected changes, Objectively Verifiable Indicators—of those changes, Means of Verification—of those indicators, and Assumptions about external influences on the expected changes, both positive and negative. The four rows are the Activities, which lead via Assumptions on that row to the Output, which leads via Assumptions on that row to the Purpose, which leads via Assumptions on that row to the Goal. (p. 147)
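To see how the rows and columns fit together, here is a minimal sketch of Davies's 4 × 4 matrix represented as a nested structure in Python. Every cell is filled with a hypothetical water-project illustration; none of it comes from a program discussed in this chapter.

```python
# A minimal sketch of a log frame: four rows (the results chain) by four
# columns (narrative, indicators, means of verification, assumptions).
# All cell contents are hypothetical illustrations.
logframe = {
    "goal": {
        "narrative": "Reduced waterborne illness in the district",
        "indicators": "Clinic-reported diarrheal cases per 1,000 residents",
        "means_of_verification": "District health surveillance records",
        "assumptions": "No major flood or displacement event",
    },
    "purpose": {
        "narrative": "Households use safe drinking water",
        "indicators": "% of sampled households with potable water",
        "means_of_verification": "Annual household water-quality survey",
        "assumptions": "Households maintain treatment practices",
    },
    "output": {
        "narrative": "Community wells rehabilitated",
        "indicators": "Number of functioning wells",
        "means_of_verification": "Engineering inspection reports",
        "assumptions": "Spare parts remain locally available",
    },
    "activities": {
        "narrative": "Train local technicians; repair pumps",
        "indicators": "Technicians trained; pumps repaired",
        "means_of_verification": "Training rosters; work orders",
        "assumptions": "Trained technicians stay in the community",
    },
}

# Reading bottom-up mirrors the if-then chain Davies describes:
# activities -> output -> purpose -> goal, each link via its assumptions.
for level in ["activities", "output", "purpose", "goal"]:
    row = logframe[level]
    print(f"{level:10s} {row['narrative']}  [assumes: {row['assumptions']}]")
```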
· · · · · · · · · · · · EXTENDING YOUR THINKING · · · · · · · · · · · ·
Using a Logic Model
Logic Model: Stopping Teens from Texting While Driving
Situation: A high school in Montgomery County is mourning the death of one senior who died in a car accident as he was texting while driving. The problem seems to be complex: Many teens text while they drive; their parents text while driving; teens see other drivers texting while driving; the local police department does not seem to be ticketing or consistently ticketing drivers, despite the law prohibiting driving and texting; and there are limited consequences for the few teens who have been caught texting.
The Montgomery County Teen Unit (MCTU) is planning a campaign to begin a program to teach the teens and the community at large about the dangers of texting while driving. The following table lists the inputs and processes as well as the outputs/short-term outcomes and impacts/long-term outcomes. What would be some other outputs and short-term outcomes, and some other long-term outcomes and impacts?
Inputs
• Montgomery County grants
• Private funding (telephone companies)
• Parents
• Montgomery High School
• Equipment
• Volunteers (parents, police, community members, teens)
• Community partners
• Existing resources
• MCTU staff
• Materials
• Time

Processes (activities)
MCTU will:
• Develop teaching units with driving schools
• Create literature with teens
• Create public service announcements at high school's TV lab
• Engage youth and build relationships
• Write grants for funding
• Collaborate with county judges for consistent punishments and education
• Conduct training for cellphone providers
• Work with police on vigilant and consistent enforcement
• Discuss initiative at county hall meetings
• Deliver prevention education programs

Outputs and short-term outcomes
• Increased knowledge about the danger of texting while driving
• Name others:

Long-term outcomes and impacts
• Decrease in the number of teens who text while driving after first probation
• Name others:
In the international development context, evaluators focus on the United Nations Sustainable Development Goals (SDGs; these are listed in Chapter 1, Box 1.5). These give evaluators direction in terms of their goals and targets, as well as the indicators they can use to determine whether those goals and targets are being achieved. The World Bank and the United Nations have developed electronic databases that provide helpful information in planning an evaluation for an international development project.
The United Nations developed the Sustainable Development Knowledge Platform, which includes a global database and a metadata repository that contains information about progress toward the achievement of the SDGs by country or geographic area according to each SDG indicator.
The World Bank's World Development Indicators (WDIs) is another database that planners can use to target disparities associated with the most vulnerable groups, thus enhancing the possibility of designing interventions that are appropriate within each country's context.
Here is a list of databases that international development evaluators may find useful if they are working on evaluations related to the SDGs:
1. The SDG Indicators Global Database (https://unstats.un.org/sdgs/indicators/database) allows planners access to UN system data used to prepare for the secretary-general's annual report on "Progress towards the Sustainable Development Goals" by SDG indicator and country or geographic area.
2. The World Bank's WDI database (http://databank.worldbank.org/data/reports.aspx?source=world-development-indicators) contains current national, regional, and global estimates of development indicators collected from officially recognized international data sources, disaggregated by sex, age, economics, and urban or rural location. The WDI has been updated to include more indicators that reflect the SDGs (see the sketch following this list).
3. The World Bank also offers 150 maps and data visualizations of the progress of countries achieving the 17 SDG goals in their online Atlas of Sustainable Development Goals 2018 (http://datatopics.worldbank.org/sdgatlas). The atlas is meant to "help policy makers, managers, and the public alike better understand them (the SDGs). The Atlas helps quantify progress, highlight some of the key issues, and identify the gaps that still remain."
Evaluators can use these databases to provide context for their evaluation planning, as well as to inform stakeholders about the extent of needs within various populations.
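As one illustration of putting these databases to work, here is a minimal sketch that pulls a WDI series from the World Bank's public data API. It assumes the v2 endpoint and its two-part JSON response (metadata followed by observations); the country code and the life-expectancy indicator code are just examples of WDI series an evaluator might request.

```python
# A minimal sketch of fetching a World Development Indicators series
# from the World Bank's public v2 API (endpoint and response shape as
# publicly documented; confirm before relying on it).
import requests

def fetch_wdi(country: str, indicator: str, years: str) -> list[dict]:
    """Return yearly observations for one indicator and country."""
    url = f"https://api.worldbank.org/v2/country/{country}/indicator/{indicator}"
    resp = requests.get(url, params={"format": "json",
                                     "date": years, "per_page": 100})
    resp.raise_for_status()
    meta, rows = resp.json()  # the API returns [metadata, observations]
    return rows or []

# Example: life expectancy at birth (SP.DYN.LE00.IN) for South Africa.
for row in fetch_wdi("ZA", "SP.DYN.LE00.IN", "2010:2015"):
    print(row["date"], row["value"])
```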
Descriptive Depictions of the Evaluand
Evaluators always have a descriptive depiction of the evaluand; it can stand alone or support the graphic depiction of the evaluand in a logic model. All the examples of evaluations presented in this and earlier chapters have either a descriptive depiction of the evaluand or a descriptive and graphic depiction. One framework that is useful for conceptualizing a description of the evaluand is the CIPP model developed by Stufflebeam (see Chapter 4). Box 7.7 contains examples of the types of variables that might be considered for each aspect of the model, as well as applications of these to the evaluand description of a self-help program for women adjusting to breast cancer and its treatment (Sidani & Sechrest, 1999). The program provided information about the course of treatment and belief in self, and worked on improving problem-solving and cognitive reframing skills. The course had three components: (1) the cognitive component provided the knowledge needed to understand the condition, treatment, and self-care strategies; (2) the behavioral component addressed women's skills necessary for active participation in their own care, problem solving, and stress management; and (3) the psychological component helped women deal with their feelings. The course used three teaching modes (interactive, didactic, and hands-on experience).
Box 7.7. Evaluand Descriptions Based on the CIPP Model
Context
Variables: Presenting problem; characteristics of the setting (physical and psychosocial features of the environment; social, political, and economic context of the program). Setting: accessibility, material resources needed to deliver the services; the physical layout and attractiveness of the setting; organizational culture; composition of and working relationships among the staff; norms and policies.
Example from Sidani and Sechrest (1999): Women with breast cancer receiving therapy. Physical side effects; need for management to minimize effect on daily functioning. Setting: classroom in a quiet setting; written materials; seating arrangements to facilitate discussion; audiovisual materials; space and equipment for demonstrations and hands-on learning.

Input
Variables: Critical inputs needed to produce the desired results, including client characteristics (e.g., demographics, personality traits, personal beliefs, employment status, level of anxiety, stage of the disease). Resources available to clients (internal and external support factors); access to treatment. Characteristics of the staff: personal and professional attributes, competency, gender.
Example: Clients: age, gender, educational level, traits such as sense of control, cultural values, and beliefs. Staff: communication abilities, demeanor, education background, level of competence or expertise in provided services, preferences for types of treatment, beliefs and attitudes toward target population. Staff members (women) delivering the courses: knowledge about breast cancer and self-help strategies; sensitivity to clients; good communication and teaching skills. Teaching protocol: objectives, content, learning activities, logistical instructions, training for instructors.

Process
Variables: Mediating processes, targeted activities, quality of implementation; quantity of process delivered (dosage/strength); frequency, duration; which clients received which components of the project at which dosage; sequence of change expected.
Example: The self-help program had three components: cognitive, behavioral, and psychological. The course was given over six sessions (90 minutes each, once a week). The theoretical process involved this chain of events: attending course, increasing knowledge, engaging in self-care, decreasing uncertainty, improving affect, improving quality of life.

Product
Variables: The expected outcomes; reasons why the program was implemented; criteria to judge the effectiveness of the program; nature, timing, and pattern of change expected. (Nature of outcomes included particular changes in the clients' lives or condition; timing refers to when the change was expected to occur—immediately, short term, or long term.)
Example: The self-help program expected positive changes in the quality of life about 6 months after the training; it should continue into the future. Improved quality of life was contingent upon the women's improvement in self-care and affect and the reduction of uncertainties.
Mixing Things Up
As most people know, life rarely follows a linear pathway. Hence the use of linear models to depict evaluands is limited, because they do not portray deviations from what was planned or iterative changes that occur during the life of a program. A logic model is linear and suggests that action flows in one direction. However, the intended outcomes can focus on changes in participants, as well as changes in staff members as they progress through the project. These could lead to additional changes in the program that are not depicted in the logic model. Davies (2004) asserts that linear models are inadequate to depict the complexity of evaluands throughout the life of a project. He suggests that evaluators consider using more complex modeling strategies based on network analysis.
This chapter includes an example of an evaluand that was depicted in both narrative and graphic form, using the WKKF logic model development guide, by a county in Ohio to prevent homelessness for LGBTQ youth (Hicks & Alspaugh, 2014) (Box 7.6). Included in the plan are the list of diverse stakeholders who participated, contextual considerations, the theory of change, a complete logic model, and detailed short- and long-term outcomes.
Planning Your Evaluation: Stakeholders, Context, and Evaluand
Choose an evaluand for which you can develop an evaluation plan. This may be a program that you experienced at some time in your past, something related to your current position, or even a new idea that you would like to develop. Using one of the logic models presented in this chapter, develop a logic model for your evaluand, at least as you presently understand it. Your understanding is expected to change throughout the planning process; therefore, be prepared to be flexible with this part of the evaluation. Identify potential stakeholders for this evaluand; to the extent feasible, involve the stakeholders in the process of developing the evaluand. After you develop the logic model, write a narrative that explains the context of the evaluand and also provides additional details of what
is depicted in the logic model. Share this narrative with a peer; obtain feedback as to the clarity and completeness of your depiction of the context and evaluand. Make revisions as necessary. If possible, obtain feedback from the stakeholders about your logic model and narrative.
Moving On to the Next Chapter

This chapter rests on the assumption that evaluators and stakeholders know what the evaluand should be or is. However, that is not necessarily the case. In Chapter 8, we look at strategies evaluators can use to provide information to stakeholders who are in the process of designing a new intervention or making substantial changes in an existing evaluand. This approach to evaluation is called "needs and assets assessment." We also consider other evaluation purposes and questions that might be used to guide the evaluation; we focus on how answers to those questions might be used to make changes in the organization.
Preparing to Read Chapter Eight
As you are halfway through the book, you now have a good understanding of the landscape of the evaluation field, its history, its currently used paradigms, and the different theories and approaches in evaluation. In Chapter 7, you learned how to identify the stakeholders and establish the context of the evaluand. Do you think by now you can list why evaluations are done?
1. Imagine that your school wants to establish a no-texting policy during classes or meetings. Try to list as many purposes for an evaluation of this type of initiative as you can.
2. Consider the following purposes for an evaluation of the no-texting policy:
Is this a good policy?
How well was it implemented?
What were the results of implementing the policy?
3. What kind of data would you collect in order to address these purposes for the evaluation?
Journal of Physical Activity and Health, 2016, 13, 275–280. http://dx.doi.org/10.1123/jpah.2014-0607. © 2016 Human Kinetics, Inc.
ORIGINAL RESEARCH
The Impact of Playworks on Students’ Physical Activity by Race/Ethnicity: Findings from a Randomized Controlled Trial
Susanne James-Burdumy, Nicholas Beyler, Kelley Borradaile, Martha Bleeker, Alyssa Maccarone, and Jane Fortson
Background: The Playworks program places coaches in low-income urban schools to engage students in physical activity during recess. The purpose of this study was to estimate the impact of Playworks on students' physical activity separately for Hispanic, non-Hispanic black, and non-Hispanic white students. Methods: Twenty-seven schools from 6 cities were randomly assigned to treatment and control groups. Accelerometers were used to measure the intensity of students' physical activity, the number of steps taken, and the percentage of time in moderate-to-vigorous physical activity (MVPA) during recess. The impact of Playworks was estimated by comparing average physical activity outcomes in treatment and control groups. Results: Compared with non-Hispanic black students in control schools, non-Hispanic black students in Playworks schools recorded 338 more intensity counts per minute, 4.9 more steps per minute, and 6.3 percentage points more time in MVPA during recess. Playworks also had an impact on the number of steps per minute during recess for Hispanic students but no significant impact on the physical activity of non-Hispanic white students. Conclusions: The impact of Playworks was larger among minority students than among non-Hispanic white students. One possible explanation is that minority students in non-Playworks schools typically engaged in less physical activity, suggesting that there is more room for improvement.
Keywords: accelerometry, intervention study, youth
Childhood obesity continues to be a significant health concern. In the past 30 years, childhood obesity in the United States has more than doubled in children and quadrupled in adolescents.1,2 By 2012, more than one third of children and adolescents were overweight or obese.2 These rates are even higher in children from racial and ethnic minority groups.3 Further, few children, particularly children from racial and ethnic minority groups,4 meet the current physical activity guidelines recommendation of at least 60 minutes of moderate-to-vigorous physical activity (MVPA) per day.5,6
Because children spend approximately half of their waking hours at school, schools provide an opportunity for interventions to target obesity and promote physical activity.7 Recess, in particular, has been recognized as an important part of a comprehensive school-based program for the promotion of physical activity.8,9 Furthermore, there is evidence that children from racial and ethnic minority groups are less physically active at recess than their nonminority counterparts and most in need of interventions to address this gap.10–12 Recent research suggests that interventions aiming to increase physical activity during recess, such as Ready for Recess, the Playworks program, and social prompting or modeling interventions, may successfully increase physical activity in elementary school children.13–15 However, very little research has examined whether these interventions are specifically effective at increasing the physical activity of children from the racial and ethnic subgroups most in need of intervention.16 To our knowledge, only one study17 has examined the effect of a recess-based intervention on racial and ethnic subgroups; this study, which was conducted on fewer than 100 students from 2 Midwestern schools, showed that the Ready for Recess intervention was associated with a statistically significant
James-Burdumy, Beyler ([email protected]), Borradaile, Bleeker, Maccarone, and Fortson are with Mathematica Policy Research, Princeton, NJ.
increase of 4.7 minutes in MVPA during recess, across all children. The impact at recess was slightly larger in minority children compared with non-Hispanic white children (4.7 vs 4.4 minutes of MVPA), but this difference was not statistically significant.17
The findings presented in this article are part of a larger evaluation that investigated the impact of Playworks across several outcome domains, including students' physical activity, school climate, and student behavior.14,18 Playworks operates in low-income urban schools. Full-time coaches are placed in these schools to provide students with organized recess activities and help them foster social skills such as cooperation and conflict resolution. Playworks was specifically developed for use in urban settings and is often used by schools with large minority populations, thus highlighting the need to test its effectiveness with this specific population. In the larger evaluation, we found that the Playworks intervention reduced bullying and exclusionary behavior during recess and the difficulty and amount of time it took teachers to transition to learning activities after recess. We also found that Playworks increased students' perceptions about the effectiveness of sports, games, and play on their behavior in class and had a positive impact on students' use of positive, encouraging language and teachers' perceptions of student safety at school and during recess. A separate, quasi-experimental study of Playworks showed that with each additional year of exposure, students reported significantly higher levels of physical activity frequency.19 Neither of these studies evaluated the impact of Playworks within racial and ethnic subgroups.
The aim of the current study was to evaluate the impact of the Playworks program on objectively measured physical activity in racial and ethnic subgroups in the context of the large randomized controlled trial previously mentioned, which was conducted in 27 schools across 6 major U.S. cities. In particular, this evaluation will answer the following research question: does Playworks have an impact on recess physical activity, measured as average number of accelerometer intensity counts per minute, average number of steps
taken per minute, and average percentage of time spent in MVPA for non-Hispanic black, non-Hispanic white, and Hispanic students, and is the impact different across these groups?
Methods

The current study took place in elementary (n = 11) or combination elementary and middle (n = 18) schools during either the 2010–2011 or 2011–2012 school year. During the 1-year study period, treatment schools implemented Playworks, and control schools did not implement the program. The 29 study schools were all interested in the Playworks program but had never implemented it before and were located in 6 large urban cities geographically dispersed across the United States.
Study Design
The study used a randomized controlled trial design. The schools were randomly assigned to treatment (n = 17) and control (n = 12) groups. The advantage of random assignment is that any differences in outcomes between the groups can be attributed to the effect of the Playworks intervention. To improve the statistical precision of impact estimates, we conducted random assignment within matched blocks of schools that were similar in terms of school size, highest grade offered, student race/ethnicity, and the percentage of students eligible for free or reduced-price lunch.
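The article does not include its assignment code, but the blocked randomization it describes is straightforward to sketch. Below is a minimal illustration in Python of how schools might be randomly assigned within matched blocks; the column names (block, treatment) are hypothetical, and this is a sketch of the general technique, not the study's actual procedure.

```python
import numpy as np
import pandas as pd

def assign_within_blocks(schools: pd.DataFrame, seed: int = 0) -> pd.DataFrame:
    """Randomly assign schools to treatment (1) or control (0) within blocks.

    Assumes `schools` has a 'block' column grouping schools matched on
    size, highest grade offered, race/ethnicity mix, and percentage of
    students eligible for free or reduced-price lunch.
    """
    rng = np.random.default_rng(seed)
    out = schools.copy()
    out["treatment"] = 0
    for _, labels in out.groupby("block").groups.items():
        labels = list(labels)
        rng.shuffle(labels)
        # Odd-sized blocks send the extra school to treatment, so the
        # overall split need not be exactly half of the sample.
        n_treat = (len(labels) + 1) // 2
        out.loc[labels[:n_treat], "treatment"] = 1
    return out

# Hypothetical usage: 29 schools in matched blocks of roughly 2.
schools = pd.DataFrame({"school": range(29), "block": [i // 2 for i in range(29)]})
print(assign_within_blocks(schools)["treatment"].value_counts())
```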
Procedures

All schools were recruited by research staff to participate in the study. Notification letters were sent to parents of students in the study schools before data collection began, and parents had the opportunity to have their children excluded from the study. The research team visited each school for 1 week to collect accelerometer data during recess periods. Recess schedules varied across schools; most schools offered recess each day, with periods lasting an average of 33 minutes. A few schools provided students with more than one recess period. Some schools designed their schedules so that similar age groups began and ended recess at the same time; other schools had overlapping periods, with different grades and groups of students entering and exiting the play yard at different times. The time of day that recess was offered also varied, but most recess periods were in the morning or scheduled around lunch. The research team found that recess schedules provided by schools were followed loosely because of factors such as weather or other school events. Schools had different policies for what students were to do when inclement weather made outdoor recess impossible, including staying in the classroom to play board games, gathering in the gym for physical activity, or going to the auditorium to watch a movie.
During the week of data collection at each school, students in sampled fourth- and fifth-grade classrooms provided demographic data about their race/ethnicity and gender via survey and wore accelerometers for 1 or 2 days. Four classrooms (2 fourth-grade classrooms and 2 fifth-grade classrooms) were randomly selected from each school for accelerometer data collection. In schools that had fewer than 4 fourth- and fifth-grade classrooms combined, all classrooms were selected. One classroom from each school was also randomly selected to participate in a second day of accelerometer data collection. Including 2 days of data for a subset of students accounts for students' intraindividual variability in physical activity during recess as part of the analyses. The research team arrived
at participating classrooms at the beginning of the school day, described the function of the accelerometer, and then attached an accelerometer to each consenting student's hip using an elastic belt. Students were instructed to seek out someone from the research team if their monitor became loose during the day. The monitors were taken off at the end of the school day by a research team member, and physical activity data were uploaded to a computer. Each school provided information about the start and end time for recess periods for each grade, and accelerometer data measured during these scheduled recess periods were used for analysis. While the data collection team was on site in treatment and control schools, they confirmed the original recess schedules provided by schools; if necessary, they corrected the schedules based on when recess actually occurred during the week of data collection. The corrected version of the recess schedules was used when determining the start and end times for recess periods for the accelerometer analysis.
The Playworks Program

Treatment schools (with n = 511 student study participants) were each assigned one paid full-time Playworks coach for 1 school year. The Playworks program (www.playworks.org) places coaches in low-income urban schools to engage students in physical activity, foster their social skills related to cooperation and conflict resolution, improve school climate, improve their ability to focus on classwork, and decrease their behavioral problems. The typical Playworks coach is a young adult who is experienced or interested in education, youth development, or sports. To ensure the quality of the coaches, Playworks staff members train and then supervise coaches when they are first placed into a school.
The Playworks program comprises 4 main components14,18:
1. Organized recess activities. During recess, the coach assigned to each school encourages involvement in organized and inclusive activities such as four-square, Simon says, wall ball, and basketball and provides a common set of rules for each game. In addition, the coach models conflict resolution tools, such as RoShamBo (rock-paper-scissors), which aim to reduce the number of conflicts that arise, enable youths to resolve their own disputes quickly, and create an environment of positive play. Although not formally part of the Playworks program, other adults, such as playground monitors, were also often present at treatment and control schools during recess to monitor the children.
2. Junior coach program. Older students (ie, fourth- and fifth-grade students and some older students in K-8 schools) serve as role models and facilitators during recess. They receive monthly training in leadership and conflict resolution skills from the school Playworks coaches so they can lead other students in games and help resolve student conflicts.
3. Class game time. In addition to the recess activities described above, the coach meets with individual classes during the school day (at times other than recess) to play games such as four corners, hot potato, and red light-green light. The coaches lead these games with the goal of fostering teamwork and positive play. After the students learn the rules, they can play the games on their own during recess. Classroom teachers are required to be present and are encouraged to play alongside their students.
4. After-school activities. Playworks also offers an after-school program, sports leagues, and school staff trainings, but the current study did not focus on this component.
The first 3 components were carried out at all study schools assigned to the treatment group.
Playworks also includes coach training and supervision:
• Training. Each year, new coaches receive 109 to 120 hours of training; returning coaches receive 65 to 80 hours. Before the school year, coaches receive roughly 30 hours of training; they receive an additional 16 to 24 hours within the first 2 weeks of the school year. The remaining hours are spread out over the course of the year (Playworks staff, personal communication, September 8, 2014).
• Supervision. Playworks program managers, who spend time on-site at schools, observe the coaches and provide feedback. They usually visit the coaches for at least 2 to 3 hours each week (Playworks staff, personal communication, September 8, 2014).
Playworks' central office in Oakland, California provides direction to the independent regional hubs that carry out the program. According to Playworks, the total cost of providing its program to a single school was $61,200 in the 2010-2011 school year and $64,600 in the 2011-2012 school year, based on national estimates.18 However, the cost of the program is subsidized through donations and grants. With these subsidies, study schools paid on average $24,353 for the program.
Control Group Schools

The 12 schools assigned to the control condition (with n = 405 student study participants) were not offered Playworks during the study year but were instead put on a waiting list to implement Playworks in the following year. These schools were asked to carry out recess as they normally would in the absence of the study. Recess periods were observed at both treatment and control schools using the SOPLAY observation tool,20 which allows collection of physical activity information on students within zones of the recess play space, to understand how the two conditions differed. Compared with treatment schools, control schools had less availability of games at recess, fewer adults involved in organizing recess games, and less use of equipment during recess. There were similarities between the 2 groups of schools in terms of the role of playground monitors at recess. Playground monitors at treatment- and control-group schools engaged with students in similar ways and for a similar amount of time—playing with students, intervening in conflicts, and encouraging students with positive messaging.
Participants

The study participants consisted of 916 students (312 non-Hispanic black students, 207 non-Hispanic white students, and 397 Hispanic students from 101 fourth- and fifth-grade classrooms in 27 study schools) who reported their race and ethnicity in the student survey and wore accelerometers for at least 10 minutes during their recess periods on 1 or 2 school days. Students from one study school did not participate in the accelerometer data collection. This school and the school with which it was matched during random assignment were dropped from the accelerometer analysis, leaving 27 schools from the original sample of 29 schools. In these 27 schools, the response rate for accelerometer data collection was about 66% overall in both treatment and control groups. Students were considered nonrespondents if they did not obtain written parental consent (about 30%) or if they refused to participate, were absent during data collection, had an accelerometer malfunction, or lacked data indicating their race/ethnicity and gender (about 5%).
Physical Activity Outcomes

Three outcome variables were used to measure students' recess physical activity: (1) average number of intensity counts per minute, (2) average number of steps per minute, and (3) average percentage of time spent in MVPA. These variables were constructed using data from accelerometers (GT3X; ActiGraph, Pensacola, FL), which are monitoring devices worn on the body that allow researchers to objectively measure the intensity, frequency, and duration of physical activity. The cut points used to measure time spent in MVPA were based on Evenson cut points, which were found to be the most reliable in a comprehensive study involving youth.21 (We also conducted a separate analysis using cut points from Edwardson and Gorely22 that yielded similar findings.) Students' accelerometer wear time was divided into 5-second epochs; if the intensity counts recorded during a given epoch were greater than or equal to 191, that epoch was identified as time spent in MVPA. Although 15-second, 30-second, and 1-minute epochs are typically used to measure time spent in MVPA, recent research suggests that the shorter 5-second epoch lengths used here are more appropriate for measuring physical activity in children because their activity can be more spontaneous, might last only a matter of seconds, and is not always picked up using longer epochs.22-25
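To make the epoch logic concrete, here is a minimal sketch (not the authors' processing pipeline) of how the percentage of recess time in MVPA could be computed from per-epoch intensity counts; the array contents and variable names are hypothetical:

```python
import numpy as np

def percent_time_in_mvpa(epoch_counts: np.ndarray, threshold: int = 191) -> float:
    """Percentage of 5-second epochs at or above the MVPA cut point.

    `epoch_counts` holds one accelerometer intensity-count total per
    5-second epoch recorded during a student's scheduled recess period.
    """
    return 100.0 * (epoch_counts >= threshold).mean()

# Hypothetical example: a 33-minute recess = 396 five-second epochs.
rng = np.random.default_rng(1)
counts = rng.integers(0, 400, size=396)
print(f"{percent_time_in_mvpa(counts):.1f}% of recess spent in MVPA")
```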
Data Analysis

To answer the research question posed earlier, we focused our analyses on estimating the impacts of Playworks on the average number of intensity counts per minute, the average number of steps per minute, and the average percentage of time spent in MVPA during recess for 3 racial and ethnic subgroups: (1) non-Hispanic black students, (2) non-Hispanic white students, and (3) Hispanic students. We did not estimate impacts for students in other race and ethnicity categories because the small number of students in those groups prevented us from estimating reliable impacts.
The impact of Playworks was estimated for each of the 3 race/ethnicity subgroups by using regression models to compare the average outcomes in treatment and control group schools. We also estimated school and student characteristics using regression models to populate Tables 1 and 2, respectively. The models included random assignment block indicator variables to account for the blocked design and school-specific random error terms to account for school-specific effects not attributable to the treatment. The impact models also included accelerometer wear time, gender, and grade level as covariates. These covariates were included because there was a large range in accelerometer wear time (16-61 minutes across students) and because previous research suggests that interventions may have a different effect (1) on boys than on girls and (2) on students in different age groups.26 The modeled outcomes measure students' average number of intensity counts per minute, average number of steps taken per minute, and average percentage of recess time spent in MVPA, controlling for the fact that some students were at recess for different lengths of time during the school day. If we had instead used the total number of intensity counts, total number of steps taken, and total number of recess minutes in MVPA, outcomes would have been disproportionately large for students with longer recess periods.
Generalized estimating equations (GEE) were used to account for clustering of students within schools.
Table 1. Characteristics of Study Schools

Characteristic                                                         Treatment (n = 16)   Control (n = 12)   Difference
Percentage of schools receiving Title I funding                        86.7                 84.2               2.5
Mean number of students per teacher/classroom                          16.3                 16.3               0.0
Mean number of students per school                                     494.0                562.3              -68.3
Mean percentage of students eligible for free or reduced-price lunch   81.0                 83.1               -2.1
Mean percentage of students who are the following race/ethnicity:
  Non-Hispanic black                                                   40.7                 38.3               2.4
  Hispanic                                                             25.6                 32.3               -6.7
  Non-Hispanic white                                                   17.0                 12.9               4.1
  Other                                                                16.7                 16.5               0.2

Source: Common Core of Data (CCD) from the 2009-2010 school year (25 schools) and 2010-2011 school year (3 schools). Data were unavailable for one treatment school that was new in 2011-2012.
Table 2. Descriptive Statistics for Participating Students

Student Type and Characteristic                                          Treatment     Control       Difference
Non-Hispanic black students (n = 312)
  Percentage of female students                                          52.4          49.8          2.6
  Percentage of fourth-grade students                                    48.0          54.1          -6.1
  Mean (SD) minutes students wore accelerometers during recess           32.9 (11.7)   27.4 (13.6)   5.4
Non-Hispanic white students (n = 207)
  Percentage of female students                                          50.5          57.8          -7.3
  Percentage of fourth-grade students                                    67.2          64.6          2.6
  Mean (SD) minutes students wore accelerometers during recess           31.4 (10.5)   36.6 (11.4)   -5.2
Hispanic students (n = 397)
  Percentage of female students                                          57.8          51.6          6.2
  Percentage of fourth-grade students                                    59.4          49.4          10.0
  Mean (SD) minutes students wore accelerometers during recess           34.1 (11.9)   32.3 (18.0)   1.8
Because GEE automatically accounts for any correlations among students below the level of clustering (schools), the standard errors also reflect the nesting of students within classrooms. Before the models were estimated, outcomes for students who wore accelerometers for 2 days were averaged across the 2 days. Sampling weights were used to account for both nonresponse and the selection probabilities of students into the accelerometer sample, ensuring that students included in the analysis represented all eligible students in the participating schools.
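The authors do not publish their estimation code, but the model they describe maps naturally onto a weighted GEE with an exchangeable working correlation at the school level. The following is a minimal sketch in Python's statsmodels under that assumption; all column names and the synthetic data are hypothetical stand-ins for the student-level file:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical synthetic data: one row per student, clustered in schools
# that were randomized within matched blocks.
rng = np.random.default_rng(2)
school = np.repeat(np.arange(27), 30)          # 27 schools, 30 students each
df = pd.DataFrame({
    "school_id": school,
    "block": school // 2,                      # matched pairs of schools
    "treatment": (school % 2 == 0).astype(int),
    "wear_minutes": rng.normal(33, 8, school.size),
    "female": rng.integers(0, 2, school.size),
    "grade5": rng.integers(0, 2, school.size),
    "w": rng.uniform(0.5, 1.5, school.size),   # sampling weight
})
df["mvpa_pct"] = 15 + 5 * df["treatment"] + rng.normal(0, 10, school.size)

# Impact model: block indicators + covariates, school-level clustering,
# sampling weights; the coefficient on `treatment` is the impact estimate.
model = smf.gee(
    "mvpa_pct ~ treatment + wear_minutes + female + grade5 + C(block)",
    groups="school_id",
    data=df,
    family=sm.families.Gaussian(),
    cov_struct=sm.cov_struct.Exchangeable(),
    weights=df["w"],
)
print(model.fit().summary())
```

Note that the paper describes school-specific random error terms; an exchangeable working correlation in GEE plays an analogous role but is not identical to a random-effects model, so this sketch should be read as one plausible operationalization rather than a replication recipe.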
Results

Table 1 provides characteristics of the study schools assigned to the treatment and control groups. A majority of schools in both groups had large low-income student populations and received Title I funding. Students at both treatment and control schools were primarily minority students, and a majority qualified for free or reduced-price lunches. Table 2 provides characteristics of the students in the study schools. There were no significant differences between the treatment and control groups in terms of accelerometer wear time, gender, or grade level after accounting for the blocked design and school-specific effects.
Our research question set out to determine whether Playworks has an impact on physical activity for non-Hispanic black, non-Hispanic white, and Hispanic students. Statistically significant differences were found between the treatment and control groups on average number of intensity counts per minute during recess, average number of steps taken per minute during recess, and average percentage of time spent in MVPA during recess for non-Hispanic black students (Table 3). Relative to control-group students, treatment-group students in this race/ethnicity subgroup had, on average, 338 (95% confidence interval [CI], 155-522) more intensity counts per minute, took 4.9 (95% CI, 1.4-8.3) more steps per minute, and spent 6.3 percentage points (95% CI, 3.1-9.6) more time in MVPA. All of the impacts were large, with effect sizes ranging from 0.37 to 0.53. There was one statistically significant difference for Hispanic students: treatment-group Hispanic students took, on average, 5.4 (95% CI, 0.4-10.3) more steps per minute than control-group Hispanic students (Table 3). The effect sizes for Hispanic students across all outcomes were modest, ranging from 0.21 to 0.34. There were no statistically significant differences for non-Hispanic white students; the 95% confidence intervals for differences between treatment- and control-group means for intensity counts, steps, and time spent in MVPA for non-Hispanic white students all contained 0.
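For readers unfamiliar with standardized effect sizes: an effect size of this kind is typically the impact estimate divided by a standard deviation of the outcome. The article does not state exactly which standard deviation it used, so the following back-of-envelope reading is an assumption, not the authors' computation:

\[
d \;=\; \frac{\bar{Y}_{\text{treatment}} - \bar{Y}_{\text{control}}}{SD(Y)}
\qquad\Rightarrow\qquad
SD(Y) \;\approx\; \frac{6.3}{0.53} \;\approx\; 11.9 \text{ percentage points}
\]

for the MVPA-time outcome among non-Hispanic black students.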
Table 3. Physical Activity of Participants in Race and Ethnicity Subgroups by Treatment Condition

Outcome                                          Treatment Mean   Control Mean   Difference (95% CI)      Effect Size
Non-Hispanic black students (n = 312)
  Mean number of intensity counts per minute     1224.9           886.8          338.1 (154.6 to 521.6)   0.51
  Mean number of steps per minute                26.9             22.0           4.9 (1.4 to 8.3)         0.37
  Mean percentage of recess time spent in MVPA   20.4             14.1           6.3 (3.1 to 9.6)         0.53
Non-Hispanic white students (n = 207)
  Mean number of intensity counts per minute     1250.3           1185.0         65.4 (-205.9 to 336.6)   0.06
  Mean number of steps per minute                28.2             29.2           -0.9 (-8.5 to 6.6)       -0.07
  Mean percentage of recess time spent in MVPA   19.7             19.2           0.5 (-5.3 to 6.2)        0.03
Hispanic students (n = 397)
  Mean number of intensity counts per minute     1310.5           1097.7         212.8 (-73.9 to 499.4)   0.21
  Mean number of steps per minute                31.9             26.5           5.4 (0.4 to 10.3)        0.34
  Mean percentage of recess time spent in MVPA   21.6             17.2           4.4 (-0.3 to 9.0)        0.32

Abbreviations: CI, confidence interval; MVPA, moderate-to-vigorous physical activity.
Discussion
The limited research on racial and ethnic differences in physical activity during recess suggests that minority children are less physically active at recess than their nonminority counterparts and are therefore most in need of interventions to increase their activity.10-12 This study confirmed prior research: non-Hispanic black students in the control group were significantly less physically active than their non-Hispanic white and Hispanic counterparts in terms of the percentage of time spent engaged in MVPA. Non-Hispanic black students in the control group engaged in MVPA for an average of 14.1% of their recess time, whereas non-Hispanic white and Hispanic students in the control group engaged in MVPA for an average of 19.2% and 17.2% of their recess time, respectively.
There is even less research on racial and ethnic differences in the effect of recess interventions on physical activity. To our knowledge, only one other study17 has examined whether a recess intervention has different effects among children from different racial/ethnic groups in the United States. Although Siahpush and colleagues17 found no significant differences by race and ethnicity in the effect of Ready for Recess on MVPA during recess, their quasi-experimental study did find evidence that nonwhite students benefited more (approximately 10 additional minutes of MVPA) from the intervention than white students when considering the entire school day. In the present study, we set out to determine whether the Playworks program had an impact on physical activity during recess for 3 racial/ethnic groups: non-Hispanic blacks, non-Hispanic whites, and Hispanics. The study determined that Playworks provided large, significant benefits to the physical activity of non-Hispanic black students that were consistent across all 3 outcomes measured—intensity, steps, and percentage of time in MVPA.
These findings suggest that Playworks is a promising approach for increasing the physical activity of non-Hispanic black students. These findings are important because this subgroup of students has an increased risk of becoming overweight and obese3 and of not meeting national physical activity guidelines.4,27 Although the pattern was less consistent, with a significant impact for only 1 of 3 outcomes examined, the findings also suggest some benefit of Playworks for Hispanic students' accrual of recess physical activity.
One possible explanation for why the impact of Playworks on physical activity during recess was large and significant for minority students but not for non-Hispanic white students is that there was more room for improved physical activity engagement among minority students compared with white students. Non-Hispanic white students in non-Playworks schools were more active than Hispanic and non-Hispanic black students. Therefore, implementation of Playworks would not necessarily significantly increase the physical activity of white students because they are already physically active; it is more likely that they would keep the same activity levels or increase them marginally with program implementation. On the other hand, implementation of Playworks would be more likely to increase the physical activity of minority students because there would be more room for growth among these students.
This study has several strengths. It used a rigorous randomized controlled trial design and was conducted in a geographically diverse set of 27 schools from 6 major cities. Objective measures of physical activity from accelerometers were used to assess the effect of the program. Despite these strengths, some limitations existed, including the lack of baseline measures. Although baseline measures are not necessary to obtain unbiased impact estimates given the random assignment design, they would have provided even more power to detect impacts. In addition, we were able to collect data after only 1 year of implementation; additional follow-up data collection periods would have enabled us to examine the longer-term impact of the program.
Conclusion

In this first randomized controlled trial of the Playworks program, we found large positive impacts on the intensity, steps, and percentage of time in MVPA among non-Hispanic black students and a significant impact on Hispanic students' number of steps. There were no significant impacts on non-Hispanic white students' physical activity during recess. One possible explanation for these
findings is that there is more room for improving physical activity among minority students compared with their non-Hispanic white counterparts. Future program efforts might include expanding the focus of Playworks on physical activity for all subgroups of students, in the hope that students of all racial and ethnic subgroups involved in the Playworks program will experience increased physical activity during recess to the greatest extent possible.
Acknowledgments

We thank Brittany Vas and William Reeves from Mathematica Policy Research, who helped lead the data collection effort, and Max Benjamin, who helped with the preliminary analyses. We also thank the editing staff from Mathematica Policy Research and the journal's referees for their thoughtful comments on the manuscript. This research was supported by Robert Wood Johnson Foundation Grants 67445, 67807, and 67877.
References

1. Ogden CL, Carroll MD, Kit BK, Flegal KM. Prevalence of childhood and adult obesity in the United States, 2011-2012. JAMA. 2014;311:806-814. doi:10.1001/jama.2014.732
2. National Center for Health Statistics. Health, United States, 2011: With Special Features on Socioeconomic Status and Health. Hyattsville, MD: U.S. Department of Health and Human Services; 2012.
3. Miech RA, Kumanyika SL, Stettler N, Link BG, Phelan JC, Chang VW. Trends in the association of poverty with overweight among US adolescents, 1971-2004. JAMA. 2006;295:2385-2393. doi:10.1001/jama.295.20.2385
4. Trost SG, McCoy TA, Vander Veur SS, Mallya G, Duffy ML, Foster GD. Physical activity patterns of inner-city elementary schoolchildren. Med Sci Sports Exerc. 2013;45:470-474. doi:10.1249/MSS.0b013e318275e40b
5. Strong WB, Malina RM, Blimkie CJ, et al. Evidence based physical activity for school-age youth. J Pediatr. 2005;146:732-737. doi:10.1016/j.jpeds.2005.01.055
6. Council on Sports Medicine and Fitness, Council on School Health. Active healthy living: prevention of childhood obesity through increased physical activity. Pediatrics. 2006;117:1834-1842. doi:10.1542/peds.2006-0472
7. Story M, Kaphingst KM, French S. The role of schools in obesity prevention. Future Child. 2006;16:109-142. doi:10.1353/foc.2006.0007
8. Stellino MB, Sinclair CD. Intrinsically motivated, free-time physical activity: considerations for recess. J Phys Educ Recreat Dance. 2008;79(4):37-40. doi:10.1080/07303084.2008.10598162
9. Robert Wood Johnson Foundation. Recess Rules: Why the Undervalued Playtime May Be America's Best Investment for Healthy Kids and Healthy Schools. Princeton, NJ: Robert Wood Johnson Foundation; 2007.
10. McKenzie TL, Sallis JF, Elder JP, et al. Physical activity levels and prompts in young children at recess: a two-year study of a bi-ethnic sample. Res Q Exerc Sport. 1997;68:195-202. doi:10.1080/02701367.1997.10607998
11. McKenzie TL, Sallis JF, Nader PR, Broyles SL, Nelson JA. Anglo- and Mexican-American preschoolers at home and at recess: activity patterns and environmental influences. J Dev Behav Pediatr. 1992;13:173-180. doi:10.1097/00004703-199206000-00004
12. Turner L, Chaloupka FJ, Chriqui JF, Sandoval A. School Policies and Practices to Improve Health and Prevent Obesity: National Elementary School Survey Results: School Years 2006-07 and 2007-08. Vol 1. Chicago, IL: Bridging the Gap Program, Health Policy Center, Institute for Health Research and Policy, University of Illinois at Chicago; 2010. http://www.bridgingthegapresearch.org/_asset/6q2pg2/ES_2010_monograph.pdf. Accessed October 13, 2014.
13. Huberty JL, Siahpush M, Beighle A, Fuhrmeister E, Silva P, Welk G. Ready for Recess: a pilot study to increase physical activity in elementary school children. J Sch Health. 2011;81:251-257. doi:10.1111/j.1746-1561.2011.00591.x
14. Beyler N, Bleeker M, James-Burdumy S, et al. Findings from an Experimental Evaluation of Playworks: Effects on Play, Physical Activity and Recess. Report submitted to the Robert Wood Johnson Foundation. Princeton, NJ: Mathematica Policy Research; 2013.
15. Efrat MW. Exploring effective strategies for increasing the amount of moderate-to-vigorous physical activity children accumulate during recess: a quasi-experimental intervention study. J Sch Health. 2013;83:265-272. doi:10.1111/josh.12026
16. Yildirim M, van Stralen MM, Chinapaw MJ, et al. For whom and under what circumstances do school-based energy balance behavior interventions work? Systematic review on moderators. Int J Pediatr Obes. 2011;6(Suppl 3):e46-e57. doi:10.3109/17477166.2011.566440
17. Siahpush M, Huberty JL, Beighle A. Does the effect of a school recess intervention on physical activity vary by gender or race? Results from the Ready for Recess pilot study. J Public Health Manag Pract. 2012;18:416-422. doi:10.1097/PHH.0b013e318226ca47
18. Fortson J, James-Burdumy S, Bleeker M, et al. Findings from an Experimental Evaluation of Playworks: Effects on School Climate, Academic Learning, Student Social Skills and Behavior. Report submitted to the Robert Wood Johnson Foundation. Princeton, NJ: Mathematica Policy Research; 2013.
19. Madsen KA, Hicks K, Thompson HR. Physical activity and positive youth development: impact of a school-based program. J Sch Health. 2011;81:462-470. doi:10.1111/j.1746-1561.2011.00615.x
20. McKenzie TL, Marshall SJ, Sallis JF, et al. Leisure-time physical activity in school environments: an observational study using SOPLAY. Prev Med. 2000;30:70-77. doi:10.1006/pmed.1999.0591
21. Trost SG, Loprinzi PD, Moore R, Pfeiffer KA. Comparison of accelerometer cut points for predicting activity intensity in youth. Med Sci Sports Exerc. 2011;43:1360-1368. doi:10.1249/MSS.0b013e318206476c
22. Edwardson CL, Gorely T. Epoch length and its effect on physical activity intensity. Med Sci Sports Exerc. 2010;42:928-934. doi:10.1249/MSS.0b013e3181c301f5
23. McClain JJ, Abraham TL, Brusseau TA Jr, Tudor-Locke C. Epoch length and accelerometer outputs in children: comparison to direct observation. Med Sci Sports Exerc. 2008;40:2080-2087. doi:10.1249/MSS.0b013e3181824098
24. Cain KL, Sallis JF, Conway TL, Van Dyck D, Calhoon L. Using accelerometers in youth physical activity studies: a review of methods. J Phys Act Health. 2013;10:437-450.
25. Nilsson A, Ekelund U, Yngve A, Sjostrom M. Assessing physical activity among children with accelerometers using different time sampling intervals and placements. Pediatr Exerc Sci. 2002;14:87-96.
26. Brown T, Summerbell C. Systematic review of school-based interventions that focus on changing dietary intake and physical activity levels to prevent childhood obesity: an update to the obesity guidance produced by the National Institute for Health and Clinical Excellence. Obes Rev. 2009;10:110-141. doi:10.1111/j.1467-789X.2008.00515.x
27. Kelly EB, Parra-Medina D, Pfeiffer KA, et al. Correlates of physical activity in black, Hispanic, and white middle school girls. J Phys Act Health. 2010;7:184-193.
Chapter 9 Notes

The purpose of evaluation is to determine the merit or worth of an evaluand. That is, we want to know whether a program had the intended effect on its participants as specified by the program's theory and model. Our ability to faithfully and confidently determine the effects of a program is determined in part by the manner in which we design the evaluation. There are many ways to think about designs within the context of evaluation, and designing an evaluation is a complex endeavor. Moreover, it is important to note that different designs can be used for different types of applications. Regardless of how we conceptualize and frame the relationship between the purposes and methods of an evaluation process, there are two major questions that have to be explicitly addressed:
1. To what extent are the effects we observe in participants really due to the program and not some other reason?
2. To what extent can the results observed in participants be expected to generalize (extend to) other situations?
Both of these questions pertain more formally to the concept of validity, and there are two specific forms of validity that we as evaluators must be concerned with:
• Internal validity – The extent to which a research design includes enough control over the conditions and experiences of participants to demonstrate a single unambiguous explanation for a manipulation, that is, cause and effect.
To what extent are the effects we observe in participants really due to the program and not some other reason?
When we have adequately attended to issues involving internal validity, it means the evaluator has controlled the effects of variables other than the treatment and can say with confidence that the results reflect the treatment. Hence, we can confidently say that the observed effects are caused by the program and nothing else.
• External validity – Extent to which observations made in a study generalize beyond the specific manipulations or constraints in the study
To what extent can the results observed in participants be expected to generalize (extend to) other situations?
When we have adequately attended to issues involving external validity, it means the evaluator has ensured that the participants in the program are representative of the population, and therefore that if the treatment is applied to another group of people from that population under similar circumstances, it should be effective there as well.
WHAT FACTORS DIMINISH OR THREATEN VALIDITY OF EVALUATIONS?
We can classify threats to the validity of our conclusions in terms of internal and external threats.

Internal Validity Threats

History – Events occurring during a study (other than the program treatment) that can influence results.

Maturation – Naturally occurring physical or psychological changes in program participants (e.g., growth, development, aging) that can influence results.

Testing – Administering a test before and after the program might influence scores on the test independent of the program (e.g., familiarity with the test produces changes in scores).

Instrumentation – Pretests and posttests that differ in content, structure, format, or difficulty can lead to differences in scores that are due not to the program treatment but to the instruments used.

Statistical regression – Including extreme groups in the program may artificially decrease or increase scores independent of the program treatment; if all members of a group are already scoring at the highest levels and their scores can't go any higher, any observed decline in scores may be due to the test rather than the program treatment, indicating measurement error.

Differential selection – Differences between the groups compared (treatment vs. no-treatment) on important characteristics may account for observed differences that are not due to the program treatment.

Experimental mortality – Differential dropout of participants in the treatment and no-treatment groups yields differences in observed effects that are not a function of the program treatment but rather an artifact of attrition within the groups.

Treatment diffusion – Proximity between participants in the treatment and no-treatment groups leads to treatment exposure for the no-treatment group.

Compensatory rivalry – The no-treatment group outperforms the treatment group, but the differences are due not to treatment effects but to competition (the John Henry effect).

Compensatory equalization of treatments – If one group receives something and the other receives nothing, then any effects on the first group may be due to the fact that this group received something, and not to the specifics of what it received.

Resentful demoralization – When members of the no-treatment group realize they did not get something the treatment group received, they may become demoralized because they are being excluded, not because they missed the specific treatment.
External Validity Threats

Selection-treatment interaction – The program results may apply only to the population from which the treatment and no-treatment groups were chosen; results may be internally valid but not generalizable.

Testing-treatment interaction – The program results may generalize to other groups only when a pretest is also given.

Situation effects (experimenter effects) – Multiple factors associated with the program itself may drive results; for example, results may be due to a particularly charismatic instructor rather than the content of the program.

Multiple treatment effects – Participants are involved in multiple programs at the time of the evaluation, so the findings may not generalize to other settings because of the confounding of multiple treatments.

Population validity – Extent to which results observed in a study will generalize to the population from which the sample was selected. (Homogeneous attrition, in which rates of attrition are about the same in the treatment and no-treatment groups, helps preserve this.)

Ecological validity – Extent to which results observed in a study will generalize across settings or environments.

Temporal validity – Extent to which results observed in a study will generalize across time and at different points in time.

Outcome validity – Extent to which results observed in a study will generalize across different but related dependent variables (DVs).
HOW DO WE MITIGATE AGAINST THREATS TO INTERNAL AND EXTERNAL VALIDITY?
An evaluator can try to mitigate these potential threats by selecting an evaluation design that reduces the influence of a particular threat through the manner in which the design is executed. There are many ways to characterize evaluation designs: Mertens and Wilson (2012) distinguish between quantitative and qualitative data, but we can also classify designs as experimental, quasi-experimental, and non-experimental. I will use this latter classification to highlight how the various designs attempt to address the validity threats we just discussed. Experimental research designs use methods and procedures to make observations in which the researcher fully controls the conditions and experiences of participants by applying three required elements of control: randomization, manipulation, and comparison/control.
Randomization—involves randomly selecting participants into the study so that all individuals in the population have an equal chance of being included; it also involves randomly assigning participants to the experimental conditions.
Manipulation—involves the systematic application of an experimental treatment.
Control—involves controlling who gets or does not get a particular treatment and ensuring that all other aspects of the experimental process are the same except for who gets or does not get a particular treatment.
Experimental research designs are the only research designs capable of establishing cause-effect relationships. To demonstrate that one factor causes changes in a dependent variable, the conditions and experiences of participants must be under the full control of the researcher. This often means that an experiment is conducted in a laboratory and not in an environment where a behavior may occur naturally. Strength: capable of demonstrating cause and effect. Limitation: behavior that occurs under controlled conditions may not be the same as behavior that occurs in a natural environment. We can categorize experimental research designs into a small number of distinct types.
Box 9.4 in Mertens & Wilson (p. 316) provides an alternative way to conceptualize designs. You will note that (R) designates randomization, (O) indicates an observation, and (X) denotes a treatment. For example, the classic pretest-posttest control group design is written as:

R O X O
R O   O

There are 5 different experimental designs we can use to evaluate the impact of a program. Each one affords particular advantages that, if relevant to the validity concerns and purpose of the evaluation, enable you to more faithfully assess the program. Whether you are able to employ these designs depends on whether you can randomize, manipulate, and control. To the extent that you can randomize (randomly select and randomly assign participants to treatment and no-treatment groups), manipulate (determine which group receives the treatment and which does not), and control (hold constant extraneous factors that may influence participants apart from the treatment itself, e.g., the effects of lighting and temperature on performance), you can use one of the experimental designs described (see pp. 316-319). Besides practical concerns, you also have to think about ethical concerns regarding the potential risks and benefits of randomizing, manipulating, and controlling the treatment and the participants: how ethical is it to withhold a potential treatment for cancer from a terminally ill patient?

If you cannot randomize, manipulate, or control within your evaluation design, the alternative is to employ a quasi-experimental design. To be an experimental design, a design must meet all three elements of control: (1) randomization, (2) manipulation, and (3) a comparison/control group. Quasi-experiments are similar to experiments, except that the design does one or both of the following: it includes a quasi-independent variable (a preexisting variable, often a characteristic inherent to the individual, that differentiates the groups or conditions being compared, e.g., gender [man, woman] or health status [lean, overweight, obese]), or it lacks an appropriate or equivalent control group. Strength: allows researchers to study factors related to the unique characteristics of participants. Limitation: cannot demonstrate cause and effect.
Again, there are many ways to classify the various types of quasi-experimental designs; what is most important is to pay attention to the design that matches and addresses the purposes and validity threats that may influence the evaluation. Mertens and Wilson describe the relevant issues with regard to quasi-experimental designs in Box 9.5 (pp. 320-325).
WHAT ABOUT OTHER DESIGNS THAT DO NOT CONFORM TO THE EXPERIMENT AND QUASI-EXPERIMENT CLASSIFICATION?
The last category of designs involves what I refer to as non-experimental designs, or what Mertens and Wilson (2013) classify as qualitative designs. These designs do not share any of the characteristics required for experimentation (randomization, manipulation, control). They use methods and procedures to make observations in which the behavior or event is observed "as is," without intervention from the researcher. Strength: can be used to make observations in the settings where the behaviors and events of interest naturally occur (e.g., interactions between an athlete and coach during a game). Limitation: lacks the control needed to demonstrate cause and effect.
Correlational Designs
• Measurement of two or more factors to determine or estimate the extent to which the values for the factors are related or change in an identifiable pattern
• Correlation coefficient: Statistic used to measure the strength and direction of the linear relationship, or correlation, between two factors
• The value of r can range from -1.0 to +1.0
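A minimal numerical illustration of the correlation coefficient, using hypothetical data (any statistics package would do; NumPy is shown here):

```python
import numpy as np

# Hypothetical paired measurements on 8 participants:
minutes_active = np.array([10, 25, 14, 30, 22, 8, 27, 18])
fitness_score = np.array([42, 61, 45, 70, 58, 39, 66, 50])

r = np.corrcoef(minutes_active, fitness_score)[0, 1]
print(f"r = {r:+.2f}")  # near +1.0: a strong positive linear relationship
```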
Naturalistic Observation

The observation of behavior in the natural setting where it is expected to occur, with limited or no attempt to overtly manipulate the conditions of the environment where the observations are made (e.g., buying behavior in a grocery store, parenting behavior in a residential home). Generally associated with high external validity but low internal validity.
Qualitative Designs
• Use of scientific method to make nonnumeric observations, from which conclusions are drawn without the use of statistical analysis
• Adopts the assumption of determinism; however, it does not assume that behavior itself is universal
• Determinism: Assumption in science that all actions in the universe have a cause
• Based on the holistic view, or "complete picture," that reality changes and behavior is dynamic
Phenomenology (Individual)
• Analysis of the conscious experiences of phenomena from the first-person point of view
• The researcher interviews a participant, then constructs a narrative to summarize the experiences described in the interview
• Conscious experience is any experience that a person has lived through or performed and can bring to memory
• The researcher must be considerate of the intentionality, or meaning, of a participant's conscious experiences
• Identify objects of awareness, which are those things that bring an experience to consciousness
Ethnography (Group)
• Analysis of the behavior and identity of a group or culture as it is described and characterized by the members of that group or culture
• A culture is a “shared way of life” that includes patterns of interaction, shared beliefs and understandings, adaptations to the environments, and many more factors
• To observe a group or culture, it is often necessary to get close up to or participate in that group or culture
• To gain entry into a group or culture without causing participants to react or change their behavior:
  • Researchers can covertly enter a group
  • Researchers can announce or request entry into a group
• Participant observation: Researchers participate in or join the group or culture they are observing
• Researchers need to remain neutral in how they interact with members of the group
• Common pitfalls associated with participant observation:
  • The "eager speaker" bias
  • The "good citizen" bias
  • The "stereotype" bias
Case Study

Analysis of an individual, group, organization, or event used to illustrate a phenomenon, explore new hypotheses, or compare the observations of many cases.
• Case history: An in-depth description of the history and background of the individual, group, or organization observed. A case history can be the only information provided in a case study when the researcher does not include a manipulation, treatment, or intervention
• Illustrative: Investigates rare or unknown cases
• Exploratory: Preliminary analysis that explores potentially important hypotheses
• Case studies have two common applications: (1) general inquiry and (2) theory development
The level of control in a research design is directly related to internal validity, or the extent to which the research design can demonstrate cause and effect. Experimental research designs have the greatest control and therefore the highest internal validity. Nonexperimental research designs typically have the least control and therefore the lowest internal validity.
Internal validity – Extent to which a research design includes enough control of the conditions and experiences of participants that it can demonstrate a single unambiguous explanation for a manipulation, that is, cause and effect.
External validity – Extent to which observations made in a study generalize beyond the specific manipulations or constraints in the study.
Constraint – Any aspect of the research design that can limit observations to the specific conditions or manipulations in a study.
See also Mertens & Wilson (2012), Box 9.8.
Article Review
The purpose of this evaluation was to examine the efficacy of a Rural Infant Care
Program (RICP) designed to reduce infant mortality rates in rural communities from nine states
(Gortmaker, Clark, Graven, Sobol, & Geronimus, 1987). The RICP is based on the notion that
high infant mortality rates are due to deficits in the perinatal system (e.g., system deficits--See
Table 1). Accordingly, the RICP proposes that by bringing together key personnel (e.g., local
providers, medical school personnel, state health department) improvements in the perinatal
system (e.g., training providers, increasing referral rates, regionalizing tertiary centers) can be
made which would lead to lower levels of infant mortality in rural communities (See Table 1).
The RICP provided funding to 10 medical schools with programs designed to improve
the delivery of health services to mothers and infants in rural areas. These programs aimed to
improve access to perinatal care, improve the transportation of sick neonates, upgrade
professional skills in rural hospitals and increase referrals of high-risk pregnancies to tertiary
centers (See Inputs/Obj. in Table 1). These ten sites were selected because they had infant
mortality rates above their state’s level for 1977, had a minimum of 1000 births per year and
were located in states with IPO projects.
A time series design was employed to examine whether the RICP reduced infant
mortality rates above those expected in the absence of the program (See figure 1). This design
has the advantage of controlling for both maturation and history effects. It allows researchers to
determine if the changes in infant mortality rates can be attributed to the RICP intervention.
However, a time series design cannot control for instrumentation effects--the use of the
same/different instrument over various time periods. In the present study infant mortality rates,
natality data and vital statistics from various sources (e.g., National Center for Health Statistics,
State Health Departments, and published/unpublished State vital statistics) were utilized to
examine changes in mortality rates both pre and post-intervention. The reliance on multi-source
data can increase the potential for instrumentation effects (e.g., reliability/validity of the DV
measures) and thus limit the strength of the results of the study. Infant mortality rates from
various sources (e.g., National Center for Health Statistics, State Health Departments, and
published/unpublished State vital statistics) were compared to rule out instrumentation effects.
The results of these analyses showed that there were no significant differences in infant mortality
rates. Thus, the design employed in the present study (partially) controlled for instrumentation
effects as well as maturation and history effects. In addition, this design has the advantage of
controlling for regression and (partially controlling) selection effects as well. These design
characteristics are noteworthy since they allow the evaluators to rule out a number of alternative
explanations for the results, including (1) that the program effects were due to spillover effects from
the IPO projects, (2) that the program effects were due to some special event co-occurring with
the intervention (e.g., history), (3) that the program effects were due to some change in the
participants (e.g., maturation), and (4) that the program effects were due to the composition of
the participants (e.g., selection; regression).
The use of a time series design is particularly appropriate for evaluating the RICP
intervention. This design takes advantage of the fact that multiple data points can be retrieved for
consecutive time periods both before and after the initiation of the RICP (See figure 1). In fact,
three separate comparison tests were conducted to determine whether the RICP was effective.
These included a comparison between RICP and non-RICP areas (e.g., non-RICP areas
included counties not targeted to receive RICP funding with lower infant mortality rates), a
comparison between RICP areas and eligible RICP states not funded, and a comparison between
RICP areas to matched rural areas with IPO funding. Time-series regression models were fit to
examine whether the RICP was effective. Success of the RICP was determined by a change in
infant mortality rates beginning in 1979.
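The time-series logic described here can be made concrete with a small sketch. The following is a minimal segmented (interrupted) time-series regression in Python with entirely hypothetical data; the original evaluation's models were more elaborate (e.g., multiple comparison areas), so this only illustrates the level-shift idea:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical annual infant mortality rates, 1965-1985, with a drop after 1979.
years = np.arange(1965, 1986)
rng = np.random.default_rng(3)
rate = 25 - 0.4 * (years - 1965) - 3.0 * (years >= 1979) + rng.normal(0, 0.8, years.size)

df = pd.DataFrame({
    "rate": rate,
    "time": years - 1965,                  # secular trend (maturation/history proxy)
    "post": (years >= 1979).astype(int),   # level shift when the program begins
})
fit = smf.ols("rate ~ time + post", data=df).fit()
print(fit.params["post"])                  # estimated change in level after 1979
```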
In general, the RICP was successful in reducing infant mortality rates in nine out of ten
sites. Time series regression models revealed that declines in neonatal mortality were
attributable to the RICP. Furthermore, there was a sharp drop in neonatal mortality beginning in
1979 in the RICP areas. By 1982-1984 the neonatal rates in the RICP areas were similar to those
found in non-RICP areas. In addition, there were no significant differences in postneonatal
activity associated with the RICP and there were no significant changes observed in the non-
RICP comparison areas. Thus, the authors conclude that “The RICP demonstrated the value of
local initiative in addressing these problems and showed that effective cooperation can be
achieved among local physicians and nurses, hospital administrators, local health departments,
state health departments, and tertiary hospitals” (Gortmaker et al., 1987, p. 114).
Though encouraging, the results of the present study have several limitations that pertain
to the external validity of the results. First, as the authors note, the ten sites included in the study
were chosen because they were well organized. That is, the sites already had an existing network
which facilitated the implementation of the RICP intervention. This is particularly important
since it raises questions about the extent to which the RICP may be equally successful in rural
communities without such established networks (e.g., generalizability). The authors note that
despite this limitation infant mortality rates were still reduced. However, the magnitude of this
effect remains an open question. It may be that a selection factor may account for the program
effects.
A second limitation, also noted by the authors, concerns the lack of random assignment
of geographic areas to treatment and control groups. While there are ethical issues involved with
this decision (e.g., withholding of treatment to control group participants), it is important to
remember that a selection factor may be (partly) responsible for the results of the study. The
lack of random assignment compounds the potential for a selection bias already noted. For
example, it may be that patients who are the recipients of RICP benefits (e.g., pregnant women)
may be different from those attending other rural hospitals. Though the authors tried to deal
with this issue by comparing Non-RICP areas with lower mortality rates (e.g., non-RICP
comparison areas), similar mortality rates (e.g., IPO-76 programs) and a matched rural area (e.g.,
IPO-78 programs), it remains unsolved. This issue becomes even more problematic when we
consider the fact that though there were overall reductions in mortality rates in the nine sites,
only “three of the reductions were statistically significant” (Gortmaker et al., 1987, p. 106). This
may suggest that the reductions in infant mortality may be due to some selection X treatment
effect. That is, the combination of a specific site and the treatment used at that site.
Thirdly, because the nature of the intervention required prior planning and coordination
on the part of network participants it is possible that this activity alone may account for the
obtained results. This is clearly an alternative hypothesis which cannot be ruled out by the study.
In fact, the authors note that program meetings in the target areas "were mostly informational [at
first]. . . Important contacts, however, were made at this stage. . .local physicians. . . often
[met] for the first time [with] doctors from tertiary centers” (Gortmaker et al., 1987, p. 97).
Thus, the extent to which these early meetings may have influenced the results of the study is
unknown.
Finally, a larger question remains unanswered--are the expenses associated with the
implementation of the RICP justified by the results? A cost-benefit analysis (Shortell &
Richardson, 1978) would shed some light on this issue. If the cost to benefit ratio was such that
the benefits outweighed the cost then clearly the program could be deemed effective. On the
other hand, if the cost outweighed the benefits, then it would certainly call into question the
desirability of replicating such efforts.
Clearly, these issues raise some concerns about the implementation of this program in
other hospitals. In the best case scenario, this program could be implemented in hospitals with
existing networks of providers that are willing to participate in the RICP intervention. In the
worst case, replication of this project in a random fashion would not be desirable. Given that
only three sites reported statistically significant reductions in infant mortality rates, careful
considerations must be given to the fit between the RICP intervention and the characteristics of
the setting in which it will be implemented.
References
Gortmaker, S. L., Clark, C. J. G., Graven, S. N., Sobol, A. M., & Geronimus, A. (1987). Reducing infant mortality in rural America: Evaluation of the Rural Infant Care Program. Health Services Research, 22(1), 91-116.
Shortell, S. M., & Richardson, W. C. (1978). Health program evaluation. Saint Louis, MO: C. V. Mosby.
Table 1. Process Model of Evaluation.

Preexisting Conditions
- High infant mortality
- Individual differences: low SES; underinsurance; isolation; low educational levels; inadequate housing
- System deficits: poor communication among providers; low referral rates to tertiary centers; limited knowledge, skills, and abilities (KSAs) among providers; no regionalization of tertiary centers; limited success with births of LBW infants

Program Components
- Improve delivery services to mothers and infants in target areas
- Inputs/objectives: increase access to perinatal centers; improve transportation of sick neonates; upgrade providers' KSAs; increase referral of high-risk pregnancies
- Resources: special funding; medical school; local providers; state health department; administrative support; travel costs
- Activities: conduct program meetings; identify problems; conduct needs assessment; provide training; upgrade facilities; transport sick neonates; expand well-child clinics; develop high-risk OB/GYN clinics

Intervening Events
- External: IPO projects; announcement of the RICP
- Internal: organization of provider network; regionalization of services; lack of interest in the RICP

Impact/Consequences
- Reduce infant mortality
- Increase referrals
- Greater cooperation/communication among providers
- Increase KSAs of providers
Figure 1. Evaluation Design (a single-group interrupted time series: O = a yearly observation, X = the intervention; subscripts denote years).

O1965 O1966 O1967 O1968 O1969 O1970 O1971 O1972 O1973 O1974 O1975 O1976 O1977 O1978 X1979 O1979 O1980 O1981 O1982 O1983 O1984 O1985
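Given this single-group interrupted time-series design, a segmented regression is one standard way to estimate the level and slope changes at the 1979 intervention point. The sketch below uses an entirely hypothetical mortality series; the study's actual yearly rates are not reproduced here.

```python
# Minimal sketch of a segmented (interrupted time-series) regression matching
# the O...X O... layout in Figure 1. The rates below are hypothetical
# placeholders, not the RICP study data.
import numpy as np
import statsmodels.api as sm

years = np.arange(1965, 1986)  # O1965 ... O1985
rate = np.array([22.0, 21.5, 21.0, 20.8, 20.2, 19.9, 19.5,      # pre-intervention
                 19.0, 18.8, 18.5, 18.0, 17.8, 17.5, 17.2,      # (1965-1978)
                 15.9, 15.3, 14.8, 14.5, 14.1, 13.8, 13.5])     # post (1979-1985)

post = (years >= 1979).astype(float)                # level change at X1979
time = (years - years[0]).astype(float)             # secular trend
time_since = np.where(post > 0, years - 1979, 0.0)  # slope change after X1979

X = sm.add_constant(np.column_stack([time, post, time_since]))
fit = sm.OLS(rate, X).fit()
print(fit.params)  # [intercept, pre-trend, level change at 1979, post slope change]
```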
Project 1 Program Evaluation Critique: Project 1 involves reviewing and critiquing the program evaluation study by James-Burdumy and colleagues (see below). The project is worth 15% of the total grade and requires that you provide an overview and critique of the evaluation by James-Burdumy et al. (2016), including critiques of (a) the background and focus of the evaluation; (b) the evaluation methodology; (c) the analytical approach and findings; and (d) the conclusions and recommendations.
James-Burdumy, S., Beyler, N., Borradaile, K., Bleeker, M., Maccarone, A., & Fortson, J. (2016). The impact of Playworks on students' physical activity by race/ethnicity: Findings from a randomized controlled trial. Journal of Physical Activity & Health, 13(3), 275-280.
Deliverable: A written critique of James-Burdumy et al. (2016) addressing the elements below. Critiques should be typewritten and double-spaced in 12-point font.
(a) Background and Focus. What is the background and context for the evaluation? What program is being evaluated? Describe the components of the program's logic model or program theory (e.g., resources, activities, outputs, outcomes, impact; Mertens & Wilson, 2019, pp. 229-242). What were the goals, objectives, and purposes of the evaluation? What question(s) were being answered in this evaluation? See Mertens and Wilson (2019), Chapters 7-8.
(b) Evaluation Method. Data: Describe the data used to assess the evaluation objectives (e.g., What types of data are collected? How much data is collected, and why?) and comment on the appropriateness of these data (e.g., What were the constructs of interest, and how were these measured? Is there evidence of the reliability and validity of the measured constructs to support their use? Are these appropriate given the evaluation goals, objectives, and purposes?). See Mertens and Wilson (2019), Chapter 10. Evaluation Design: Identify and describe the evaluation design (e.g., Mertens & Wilson, 2019, Chapter 9). Explicate how the evaluation design relates to the goals, objectives, and purposes of the evaluation (see Mertens & Wilson, 2019, Chapter 8) and comment on the appropriateness of the design for answering the evaluation objectives (e.g., Is the design appropriate to the evaluation goals, objectives, and purposes? Why do you suppose the authors chose this design over other alternatives? Which sources of potential invalidity have they ruled out with this design, and what sources might remain? What might be some important considerations other than the pros and cons of the designs themselves? Would you have used a different design? Why or why not?). See Mertens and Wilson (2019), Chapters 8-10.
(c) Analyses and Findings. Analyses: Describe the analytical approach taken to evaluate the data and comment on the appropriateness of the analyses in relation to the evaluation goals, objectives, and purposes (e.g., What analyses were used, and how did these address the evaluation goals, objectives, and purposes? Are there other types of analyses that might further inform them?). Findings: Describe the major findings of the evaluation and comment on how the findings relate to the evaluation goals, objectives, and purposes (e.g., What are the main findings of the study? How do the findings address the evaluation goals, objectives, and purposes? Do the findings support, refute, or inform the program? What alternative interpretations of the findings can be ruled out, and what plausible rival explanations remain? What limitations or qualifiers must be placed on the evaluation results given conceptual/theoretical, methodological, or statistical concerns?). See Mertens and Wilson (2019), Chapter 12.
(d) Conclusions and Recommendations. Describe the major conclusions and recommendations of the evaluation and comment on the evidence supporting, refuting, or informing the evaluation goals, objectives, and purposes (e.g., What were the major conclusions and recommendations? How do the conclusions and recommendations address the evaluation goals, objectives, and purposes? Are there any conceptual/theoretical, methodological/statistical, or practical concerns that warrant caution?). See Mertens and Wilson (2019), Chapter 12.
Project 1 Evaluation Form
Individual Project 1 Program Evaluation Critique. Individual Project 1 required you to provide an overview and critique of the evaluation study conducted by James-Burdumy et al. (2016). The assignment should review and critique (a) the background and focus of the evaluation (3 pts); (b) the evaluation method (3 pts); (c) the analytical approach and findings (3 pts); and (d) the conclusions and recommendations (3 pts), using the questions described in Appendix A, in narrative form within 3-5 single-spaced pages; and (e) be written in a clear and coherent manner that critiques all elements of the program evaluation (3 pts). The assignment is worth 15% of the total grade.
Background & Focus (3 Points) | Not at all (0) | Partially (.5) | Completely (1) | Total Points
a. Does the narrative describe the background and context for the evaluation? | | | |
b. Does the narrative describe relevant aspects of the program being evaluated? (e.g., its components, logic model, or program theory) | | | |
c. Does the narrative describe the purpose of the evaluation? (e.g., goals, objectives, and evaluation questions) | | | |

Evaluation Method (3 Points) | Not at all (0) | Partially (.5) | Completely (1) | Total Points
d. Does the narrative describe the data used to assess the evaluation objectives? (e.g., types and amount of data collected, and why) | | | |
e. Does the narrative comment on the appropriateness of the data? (e.g., relevance of the data to the goals, objectives, and purposes; operationalization of constructs; measurement qualities such as reliability and validity) | | | |
f. Does the narrative describe the evaluation design and comment on its appropriateness for the evaluation? (e.g., appropriateness of the design given the goals, objectives, and purposes; advantages and disadvantages of the design, including threats to validity; alternative designs and their pros and cons) | | | |

Analysis and Findings (3 Points) | Not at all (0) | Partially (.5) | Completely (1) | Total Points
g. Does the narrative describe the analytical approach taken to evaluate the data and comment on the appropriateness of the analyses? (e.g., analyses used to address the goals, objectives, and purposes; alternative analyses that might further inform the evaluation) | | | |
h. Does the narrative describe the major findings of the evaluation in relation to the evaluation goals, objectives, and purposes? (e.g., What are the main findings of the study? How do the findings address the evaluation goals, objectives, and purposes?) | | | |
i. Does the narrative comment on how the findings relate to the evaluation goals, objectives, and purposes? (e.g., Do the findings support, refute, or inform the program? Are there alternative interpretations of the findings? Are there limitations or qualifiers to the findings?) | | | |

Conclusions & Recommendations (3 Points) | Not at all (0) | Partially (.5) | Completely (1) | Total Points
j. Does the narrative describe and discuss the major conclusions? (e.g., What were the major conclusions, and how do they address the goals, objectives, and purposes of the evaluation?) | | | |
k. Does the narrative describe and discuss the recommendations? (e.g., What were the recommendations, and how do they address the evaluation goals, objectives, and purposes?) | | | |
l. Does the narrative comment on the evidence supporting or refuting the program goals, objectives, and purposes? (e.g., Are there any conceptual/theoretical, methodological/statistical, or practical concerns that warrant caution?) | | | |

Critique (3 Points) | Not at all (0) | Partially (.5) | Completely (1) | Total Points
m. Does the narrative critique the background, context, or purpose? | | | |
n. Does the narrative critique the methodology or analytical approach? | | | |
o. Does the narrative critique the results, major findings, or conclusions? | | | |

Grand Total | | | |