How We Rate Programs

June 4, 2021

To be included on CrimeSolutions, programs undergo an eight-step review and evidence-rating process, and potentially a re-review under certain circumstances.

1. Preliminary Program Identification

Programs are identified for potential inclusion on CrimeSolutions through:

Literature searches of relevant databases, journals and publications, including:
- Social science databases using keywords identified in the areas of criminal justice, juvenile justice and victims of crime;
- Journals (including peer-reviewed journals) and other relevant resources; and
- Other web-based databases of effective programs, and meta-analyses of evaluated programs.
Nominations from the field. See how to nominate a program here: Nominate a Program for CrimeSolutions.

Historically, for every article reviewed:

8% resulted in an identified program.
3% resulted in a review.
1% resulted in a finding of inconclusive evidence.
2% resulted in a rating.

2. Initial Program Screening

After programs are identified, research staff review program materials to determine whether the goals of the program fall within the scope of CrimeSolutions. To fall within the scope, the program must:

Aim to prevent or reduce crime, delinquency or related problem behaviors (such as aggression, gang involvement, or school attachment);
Aim to prevent, intervene in, or respond to victimization;
Aim to improve justice systems or processes; and/or
Target a population of persons committing or convicted of a crime or an at-risk population (that is, individuals who have the potential to become involved in the justice system).

Prevention programs not explicitly aimed at reducing or preventing a problem behavior must apply to a population at risk for developing problem behaviors.
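
The scope test above can be sketched as a simple check. This is a minimal illustration, not CrimeSolutions tooling: the `ProgramGoals` class and its field names are hypothetical, and the sketch assumes that "and/or" means meeting any one criterion is enough. The additional condition for prevention programs is not modeled.

```python
from dataclasses import dataclass


@dataclass
class ProgramGoals:
    """Hypothetical summary of a program's goals (field names are illustrative)."""
    aims_at_crime_or_problem_behavior: bool = False
    aims_at_victimization: bool = False
    aims_at_justice_system: bool = False
    targets_justice_involved_or_at_risk: bool = False


def in_scope(goals: ProgramGoals) -> bool:
    # The criteria are joined by "and/or", so any single one is treated as sufficient.
    return any((
        goals.aims_at_crime_or_problem_behavior,
        goals.aims_at_victimization,
        goals.aims_at_justice_system,
        goals.targets_justice_involved_or_at_risk,
    ))
```

For example, `in_scope(ProgramGoals(aims_at_victimization=True))` is true, while a program with none of the four aims is screened out.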

See a list of screened-out program evaluations.

Historically, for every program identified:

34% resulted in a review.
20% resulted in a rating.
45% were screened out.
20% were put on hold.

3. Literature Search

If the program's scope meets CrimeSolutions criteria, research staff then expand the search for evaluations, research and program materials to identify all relevant information needed for Lead Researcher and Study Reviewer consideration. Nonexperimental, qualitative, ethnographic and case-study research is collected if it adds contextual information to the program description, but is not used to determine the program's evidence rating.

4. Initial Evidence Screening

Once the literature search is complete, research staff review the newly identified studies to determine whether they meet the criteria for evidence. To be considered for expert review, the program's evaluation evidence must meet the following minimum requirements:

The program must be evaluated with at least one randomized field experiment or quasi-experimental research design (with a comparison condition).
The outcomes assessed must relate to crime, delinquency, or victimization prevention, intervention or response.
The evaluation must be published in a peer-reviewed publication or documented in a comprehensive evaluation report.
The date of publication must be 2000 or later.
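
The four minimum requirements above can be read as a conjunction applied to each evaluation, with the program passing if at least one evaluation qualifies. The sketch below illustrates that reading. The `Evaluation` class, its field names, and the design labels are assumptions made for illustration; only the year 2000 cutoff comes directly from the text.

```python
from dataclasses import dataclass

MIN_PUBLICATION_YEAR = 2000  # cutoff stated in the requirements above


@dataclass
class Evaluation:
    """Hypothetical record of one evaluation study (field names are illustrative)."""
    design: str  # "randomized", "quasi_experimental", or any other label
    has_comparison_condition: bool
    justice_related_outcomes: bool
    peer_reviewed_or_full_report: bool
    publication_year: int


def meets_minimum_evidence(ev: Evaluation) -> bool:
    # A quasi-experimental design only counts when it has a comparison condition.
    eligible_design = ev.design == "randomized" or (
        ev.design == "quasi_experimental" and ev.has_comparison_condition
    )
    return (
        eligible_design
        and ev.justice_related_outcomes
        and ev.peer_reviewed_or_full_report
        and ev.publication_year >= MIN_PUBLICATION_YEAR
    )


def program_passes_screening(evaluations: list[Evaluation]) -> bool:
    # The program needs at least one qualifying evaluation.
    return any(meets_minimum_evidence(ev) for ev in evaluations)
```

Under these assumptions, a program with one 1998 randomized trial and one 2005 randomized trial passes, because the 2005 study alone satisfies every requirement.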

Based on available resources, some identified program studies that are not selected for review have been placed “on hold.” These studies have met the minimum standards of evidence for CrimeSolutions but, for one or more reasons, have been added to a backlog of studies for future consideration. See Programs Held for Future Consideration.

5. Selection of Evidence Base

A Lead Researcher with subject-matter and research method expertise selects up to three studies representing the most rigorous study designs and methods from all available evaluations of the program. (See the Lead Researchers procedures manual.) The Lead Researcher follows review guidelines to ensure that appropriate statistical comparisons between treatment and control groups were conducted in the studies, that they are not studies of comparative effectiveness, and that the studies’ authors report the outcomes in a manner that allows for the Study Reviewers to assess the quality of the results (for example, providing data to calculate an effect size, if needed). Additionally, the Lead Researcher ensures that outcomes for the full treatment sample are reported in the studies (studies that provide outcomes only for subgroups are not eligible for review). The selected studies comprise the program's evidence base and will be scored by Study Reviewers, ultimately to be used as the basis for the program's evidence rating. Additional studies identified through the literature search, but not included in the evidence base, may serve as supporting documentation. The criteria used to determine the three most rigorous studies include:

Strength of research design
Breadth of documentation
Type of analytic procedures used
Sample size
Independence of evaluator
Year of publication.

If a CrimeSolutions Study Reviewer believes there is a compelling reason to review more than three studies, he or she may contact the Lead Researcher to request additional studies for review. The Lead Researcher will then make the final determination. In addition, multiple articles and publications that report on various aspects of a single study are generally treated as one study for purposes of the review; however, two studies that use the same data set but include different follow-up periods or analyses may be considered separately on a case-by-case basis.

6. Expert Review

Once the Lead Researcher selects the studies that will comprise the program's evidence base and the outcomes that will be considered (see CrimeSolutions Tiered Outcomes List), trained and certified Study Reviewers begin the program evidence review, using the program scoring instrument (download PDF or Excel version) to assess the quality, strength and extent to which the evidence indicates that the program achieves its goals. Each study within the evidence base is assessed by at least two Study Reviewers. The program scoring instrument indicates the overall rating for each study that is reviewed.

The program scoring instrument consists of two parts:

Part 1 of the Scoring Instrument: Conceptual Framework is assessed only once for each program, regardless of the number of studies in the evidence base. The Study Reviewers make this assessment based on information from all of the studies and program materials they have received. These additional program materials may include nonexperimental, qualitative, ethnographic and case-study research as well as implementation materials.

Program's Conceptual Framework
Dimension	Overview	Elements
Conceptual Framework	Assesses the degree to which the program is grounded in the research literature.	Prior research; Theoretical base; Program description

Part 2 of the Scoring Instrument: Quality, Outcomes and Fidelity is completed for each evaluation study that is included as part of the evidence base (up to three studies). It includes the research design quality, outcome evidence, and program fidelity.

Study Quality, Outcomes, and Fidelity
Element	Description	Considerations
Design Quality	Assesses the quality of the research design. The Study Reviewers are also required to note specific information, such as threats to validity.	Type of research design; Sample size; Statistical adjustment (if applicable); Instrumentation; Internal validity; Follow-up period; Displacement/diffusion (if applicable)
Outcome Evidence	Assesses the quality of the results. (Note: Outcomes are considered and rated separately within this dimension because programs may target multiple outcomes. In addition, the assessment focuses on the programs' primary, justice-related outcomes.)	Substantive program effects; Behavior change; Outcomes
Program Fidelity	Assesses the degree to which the program is delivered as designed and intended.	Documentation; Adherence

Study Reviewers assign numerical values to each element in the program scoring instrument. The elements include a definition and other guidance that Reviewers consider when rating the elements. In the Program Review information, the Reviewers also make note of any other information that should be highlighted as being of particular importance. See the program scoring instrument (download PDF or Excel version) for guidance on each element.

The Study Reviewer is responsible for making a reasonable determination (i.e., supported or justified by fact or circumstance) as to the strength of the conceptual framework, research design, outcome evidence and program fidelity on the basis of provided documentation and his or her specialized knowledge with regard to program evaluation.

As a final step on the program scoring instrument, Study Reviewers assess their overall confidence in the study design. If both Study Reviewers agree, and the Lead Researcher concurs, that the study design has a fundamental flaw (not captured in the Design Quality dimension) that raises serious concerns about the study's results, the study is removed from the evidence base and is not factored into the evidence rating. The study citation is instead listed among the program's additional references. This final determination serves as an additional safeguard to ensure that only the most rigorous studies comprise the evidence base.

7. Study Classification

The score for each of the four dimensions (Conceptual Framework, Design Quality, Outcome Evidence, and Program Fidelity) is calculated separately and used to assess each study. On the basis of scores, the study is assigned one of the following classifications:

Class 1 Studies are very rigorous and well-designed and find significant, positive effects on justice-related outcomes.
Class 2 Studies are well-designed but slightly less rigorous, or there may be limitations in their design. They find significant, positive effects on justice-related outcomes.
Class 3 Studies are very rigorous and well-designed and find significant, harmful effects on justice-related outcomes.
Class 4 Studies are very rigorous and well-designed and find no significant effects on justice-related outcomes.
Class 5 Studies do not provide enough information or have significant limitations in their study design such that it is not possible to establish a causal relationship to the justice-related outcomes.
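
The five classes above can be summarized as a lookup on two judgments: how rigorous the study is and what kind of effect it found. The sketch below is only an illustration of that summary. The actual classification is derived from numerical scores on the program scoring instrument, which are not reproduced here, and the `Rigor` and `Effect` categories are hypothetical labels. Combinations the text does not describe (for example, a slightly less rigorous study that finds no effects) return `None` rather than a guessed class.

```python
from enum import Enum
from typing import Optional


class Rigor(Enum):
    VERY_RIGOROUS = "very rigorous and well-designed"
    WELL_DESIGNED = "well-designed but slightly less rigorous"
    INSUFFICIENT = "not enough information or significant design limitations"


class Effect(Enum):
    POSITIVE = "significant, positive effects"
    HARMFUL = "significant, harmful effects"
    NONE = "no significant effects"


# Only the combinations spelled out in the class descriptions are listed.
_CLASS_TABLE = {
    (Rigor.VERY_RIGOROUS, Effect.POSITIVE): 1,
    (Rigor.WELL_DESIGNED, Effect.POSITIVE): 2,
    (Rigor.VERY_RIGOROUS, Effect.HARMFUL): 3,
    (Rigor.VERY_RIGOROUS, Effect.NONE): 4,
}


def classify_study(rigor: Rigor, effect: Effect) -> Optional[int]:
    # Class 5 depends on the design alone: no causal claim can be made.
    if rigor is Rigor.INSUFFICIENT:
        return 5
    # Returns None for combinations the text does not describe.
    return _CLASS_TABLE.get((rigor, effect))
```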

Discrepancies among Reviewers: In the event that there is a classification discrepancy between the Study Reviewers, the Lead Researcher will work to achieve a consensus classification. If necessary, the Lead Researcher will also review the study and make a final determination on the classification.

8. Program Evidence Rating

To reach an evidence rating[1] for each program, the study-level information is aggregated.

All evidence ratings, which are based on 1–3 studies, are classified as follows:

In the CrimeSolutions rating process, we address program and practice evaluations that sit along an evidence continuum with two axes: (1) Effectiveness and (2) Strength of Evidence. Looking at the exhibit below, where a program or practice sits on the Effectiveness axis tells us how well it works in achieving criminal justice, juvenile justice, and victim services outcomes. Where it sits on the Strength of Evidence axis tells us how confident we can be of that determination.

Exhibit. Programs fall along two continuums: Effectiveness and Strength of Evidence.

When a Reviewed Program Does Not Receive a Rating

CrimeSolutions periodically updates a static list of programs that have been reviewed by Study Reviewers but not assigned an evidence rating due to lack of evidence. A program is placed on the Inconclusive Evidence List if the reviewed study (or studies) received only Class 5 study ratings, indicating that there were significant limitations in the study design such that it was not possible to establish a causal relationship to the program's justice-related outcomes (as outlined in the above Program Evidence Rating chart). See Programs Identified But Not Rated.
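
The rule for the Inconclusive Evidence List can be stated compactly: every reviewed study must be Class 5. A minimal sketch follows, assuming study classes are represented as integers 1 through 5. The function name is hypothetical.

```python
def is_inconclusive(study_classes: list[int]) -> bool:
    """Return True when every reviewed study received a Class 5 rating."""
    # An empty list means nothing was reviewed, so the rule does not apply.
    return bool(study_classes) and all(c == 5 for c in study_classes)
```

Under this reading, a program with study classes `[5, 5]` is inconclusive, while `[5, 2]` is not, because one study supports a causal conclusion.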

Historically, for every program reviewed:

42% resulted in a finding of inconclusive evidence.
58% resulted in a rating.

For every program rated:

27% resulted in a rating of No Effects.
60% resulted in a rating of Promising.
13% resulted in a rating of Effective.
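
The historical percentages quoted on this page appear to be consistent with one another when chained. The arithmetic sketch below assumes each figure is a conditional share of the preceding stage, which the page does not state explicitly. The variable names are illustrative.

```python
# Shares quoted in the text, expressed as fractions.
identified_per_article = 0.08      # articles that yield an identified program
reviewed_per_identified = 0.34     # identified programs that reach review
rated_per_reviewed = 0.58          # reviewed programs that receive a rating
inconclusive_per_reviewed = 0.42   # reviewed programs found inconclusive

# Chain the stages to express later outcomes per article or per identified program.
reviewed_per_article = identified_per_article * reviewed_per_identified
rated_per_article = reviewed_per_article * rated_per_reviewed
inconclusive_per_article = reviewed_per_article * inconclusive_per_reviewed
rated_per_identified = reviewed_per_identified * rated_per_reviewed

print(f"reviewed per article:     {reviewed_per_article:.1%}")      # 2.7%, quoted as 3%
print(f"rated per article:        {rated_per_article:.1%}")         # 1.6%, quoted as 2%
print(f"inconclusive per article: {inconclusive_per_article:.1%}")  # 1.1%, quoted as 1%
print(f"rated per identified:     {rated_per_identified:.1%}")      # 19.7%, quoted as 20%
```

Each computed value rounds to the whole-number percentage given in the text. This also suggests that the 20% of identified programs that were rated is a subset of the 34% that were reviewed, not a separate group.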

Re-Reviewing a Program and Updating a Rating

We re-review programs for the following reasons:

New evaluation studies, or studies not previously identified, are found that meet the CrimeSolutions criteria. This may include studies that extend the follow-up period of previously reviewed studies.
New supplemental materials are submitted that better explain the conceptual framework and fidelity dimensions of the program, which may affect a program's evidence rating.
A periodic and continuous quality control assessment is conducted to ensure that information, links, and research in program profiles are current, accurate, and consistent with any updates to policies and procedures for the program review.

A re-review may or may not be sufficient to warrant a new evidence rating. If a Lead Researcher determines that there is sufficient evidence in the new materials to warrant another review, then the new information is sent to the Study Reviewers for assessment. Even if the program's evidence rating does not change, new evidence, information, and materials may be included or referenced on the program's profile page.

We may also re-review select programs when changes or clarifications are made to the program scoring instrument, the criteria for inclusion of a study, or the guidance given to our reviewers.

When a new evaluation study (or studies) is identified for a program that already has three studies in its evidence base, the Lead Researcher will review all eligible studies and determine which three are most rigorous. This is to ensure that the program’s evidence base consists of studies representing the most-rigorous study designs and methods from all available evaluations of the program. This decision can be based on strength of research design, breadth of documentation, type of analytic procedures used, sample size, independence of the evaluator, and year of publication. If a newly identified study is determined to be more rigorous than a study that was previously reviewed, the new study will replace the reviewed study in the evidence base. When this happens, the program will be re-reviewed to ensure the program rating is based on research that is both most rigorous and most up to date.

For more information: Inquiring About or Appealing an Evidence Rating

Date Modified: June 4, 2021

Under step four, Initial Evidence Screening, the date of publication for a study to be considered for expert review must be 2000 or later.

Date Published: June 4, 2021

How We Rate Programs

1. Preliminary Program Identification

2. Initial Program Screening

3. Literature Search

4. Initial Evidence Screening

5. Selection of Evidence Base

6. Expert Review

7. Study Classification

8. Program Evidence Rating

When a Reviewed Program Does Not Receive a Rating

Re-Reviewing a Program and Updating a Rating

Notes