National Institute of Justice. Research. Development. Evaluation. Office of Justice Programs
Crime Solutions.gov
Reliable Research. Real Results.

Glossary

Following is a glossary of terms and acronyms used throughout the CrimeSolutions.gov website:

A

Analysis of Variance (ANOVA)

A method for analyzing the differences in the means of two or more groups. Specifically, this procedure partitions the total variation in the dependent variable into two components: between-group variation and within-group variation. It allows researchers to determine whether the differences between a control group and a treatment group can be attributed to the independent variable or treatment.
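
As an illustration only, the partition described above can be sketched in a few lines of Python. The function name `one_way_anova` and the two small groups of scores are hypothetical and are not taken from CrimeSolutions.gov.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (ss_between, ss_within, f_statistic) for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-group variation: distance of each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: distance of each score from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_stat

# Hypothetical outcome scores for illustration.
control = [4.0, 5.0, 6.0]
treatment = [7.0, 8.0, 9.0]
ss_b, ss_w, f = one_way_anova([control, treatment])
print(ss_b, ss_w, f)  # 13.5 4.0 13.5
```

The two components add up to the total variation (13.5 + 4.0 = 17.5). A larger F statistic means that more of the variation lies between groups than within them; judging significance would additionally require comparing F with an F distribution, which this sketch omits.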


Anticipatory Benefits (Evidence Rating Element)

Occurs when the effects of a program are observed prior to the implementation of the program, generally because the target population believes the program has already started. This element is reviewed along with diffusion and displacement on the CrimeSolutions.gov Scoring Instrument. These elements are typically considered in evaluations of community-level crime prevention efforts. See Program Review and Rating from Start to Finish for more information.


Attrition (Mortality) (Evidence Rating Element)

The loss of participants during the course of a study, which often occurs because subjects move or they refuse to participate in the study. This may be a threat to the study’s Internal Validity. See Program Review and Rating from Start to Finish for more information.

B

Bivariate Analysis

An analysis of the relationship between two variables, such as correlations and one-way analysis of variance (ANOVA).

C

Causal Evidence

Evidence that documents a relationship between an activity, treatment, or intervention (including technology) and its intended outcomes, including measuring the direction and size of a change, and the extent to which a change may be attributed to the activity or intervention. Causal evidence depends on the use of scientific methods to rule out, to the extent possible, alternative explanations for the documented change. This differs from descriptive evidence.


Chi-Square Test

A statistical test used to compare observed categorical data with expected data (based on a specific hypothesis) to determine whether the difference between them is larger than would be expected by chance.
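
A minimal sketch of the test statistic, assuming hypothetical observed and expected counts, is shown below. The function name `chi_square_statistic` is illustrative only.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts for two categories of an outcome.
observed = [30, 20]
# Counts expected under the hypothesis that both categories are equally likely.
expected = [25, 25]

chi2 = chi_square_statistic(observed, expected)
print(chi2)  # 2.0
```

With two categories there is one degree of freedom, and the conventional critical value at the 0.05 level is about 3.84. Because 2.0 falls below it, a difference of this size could plausibly occur by chance.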


Comparative Effectiveness Research (CER)
An evaluation approach, often using meta-analysis results, to show the relative strengths and weaknesses of alternative interventions or programs on the same outcome. CER often uses effect size as a standard metric for comparing the effectiveness of the alternative programs. CER is common in health and medical research, as well as other fields.

Comparison Group

A group of individuals whose characteristics are similar to those of a treatment group. Comparison group individuals may not receive any services, or they may receive a different set of services, treatments, or activities than the treatment group. In no instance do they receive the same services as the individuals being evaluated (the treatment group). Comparison groups are used in quasi-experimental designs where random assignment is not possible or practical.


Contamination (Evidence Rating Element)

Occurs when members of the control group or the comparison group are inadvertently exposed to the intervention or treatment being studied. Contamination threatens the study’s Internal Validity. See Program Review and Rating from Start to Finish for more information.


Control Group

A group of individuals whose characteristics should be almost identical to those of the treatment group but do not receive the program services, treatments, or activities being evaluated. In experimental designs, individuals are placed into control groups and treatment groups through random assignment.


Correlation

A statistical measure of the degree of relationship between two variables. A correlation has two components: magnitude and direction. Magnitude is a measure of strength and ranges from 0 (no correlation) to 1 (perfect correlation). Direction determines whether a correlation is positive or negative. A positive correlation means that the two variables move in the same direction: as one variable, X, increases, so does the other variable, Y, and as X decreases, so does Y. A negative (or inverse) correlation means that as one variable, X, increases, the other variable, Y, decreases, and vice versa. For example, if variables X and Y have a correlation of 0.7, they have a strong, positive relationship. Correlation does not imply a causal relationship between variables.
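
The sketch below computes a Pearson correlation coefficient by hand to show how the sign carries direction and the absolute value carries magnitude. The function name `pearson_r` and the data are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation: co-variation divided by the product of the spreads."""
    mx, my = mean(xs), mean(ys)
    co_variation = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mx) ** 2 for x in xs))
    spread_y = sqrt(sum((y - my) ** 2 for y in ys))
    return co_variation / (spread_x * spread_y)

x = [1, 2, 3, 4, 5]
r_same_direction = pearson_r(x, [2, 4, 6, 8, 10])      # Y rises as X rises
r_opposite_direction = pearson_r(x, [10, 8, 6, 4, 2])  # Y falls as X rises
print(round(r_same_direction, 6), round(r_opposite_direction, 6))
```

In this constructed example both relationships are perfect, so the magnitude is 1 in each case and only the sign differs.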

D

Dependent Variable

A variable whose outcome is influenced or changed by some other variable, usually the independent variable or the treatment. It is the “effect” or outcome variable in a cause and effect relationship.


Descriptive Evidence

Evidence used to characterize individuals, groups, events, processes, trends, or relationships using quantitative statistical methods, correlational methods, or qualitative research methods. This differs from causal evidence.


Diffusion (Evidence Rating Element)

Occurs when the effects or benefits of a program extend beyond the places, individuals, problems, or behaviors directly or indirectly targeted. This element is reviewed along with anticipatory benefits and displacement on the CrimeSolutions.gov Scoring Instrument. These elements are often considered in evaluations of community-level crime prevention efforts. See Program Review and Rating from Start to Finish for more information.


Dimension

One of four broad categories of information included in the CrimeSolutions.gov Scoring Instrument used to review and rate program evidence. The dimensions are Program’s Conceptual Framework, Study Design Quality, Study Outcomes, and Program Fidelity. Each dimension consists of multiple Evidence Rating Elements. See Program Review and Rating from Start to Finish for more information.


Displacement (Evidence Rating Element)

Occurs when an intervention has the effect of moving the problem in question (such as crime) rather than producing an actual reduction in incidence. This element is reviewed along with diffusion and anticipatory benefits on the CrimeSolutions.gov Scoring Instrument. These elements are typically considered in evaluations of community-level crime prevention efforts. See Program Review and Rating from Start to Finish for more information.

E

Effect Size

A standardized, quantitative index representing the magnitude and direction of an empirical relationship. More specifically, the effect size is a value that reflects the magnitude of the treatment effect. An effect size from an outcome evaluation represents the change in an outcome measure from before a program is implemented to the follow-up period. The effect size of the treatment group can be compared to the effect size from the control group to determine if there are any differences, and if so, whether those differences are statistically significant (which allows for greater confidence that the difference was due to the program). See Statistical Significance for more information. The most common types of effect sizes in the criminal justice and delinquency literature are the standardized mean difference effect size; odds ratios and risk ratios; and correlation coefficients.

In program evaluation, the effect size is typically hypothesized a priori to guide decisions about needed sample size and the likelihood of Type I and Type II errors (See Type I Error and Type II Error for more information). In a meta-analysis, the effect sizes from the various evaluation studies are standardized to be in the same form. By representing the findings of each study included in a meta-analysis in the same form, this permits a synthesis of those findings across studies. After evaluation data are analyzed, an actual effect can usually be estimated from the data, and this value is often used as a basis for comparative effectiveness research on alternative interventions.

The magnitude of an effect size is often judged using “rules of thumb” from social science research. For example, standardized mean difference effect sizes (Cohen’s d or Hedges’ g) are judged using the following rules: small=0.20; medium=0.50; large=0.80. These are not hard cut-off points but rather approximations. There are different standards for each type of effect size.
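
To make the standardized mean difference concrete, the following sketch computes Cohen's d with a pooled standard deviation and then applies the commonly used approximate small-sample correction to obtain Hedges' g. The function names and the scores are hypothetical, and this is only one way such effect sizes are calculated.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(treatment, control):
    """Difference in group means divided by the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    pooled_var = ((n1 - 1) * variance(treatment)
                  + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)

def hedges_g(treatment, control):
    """Cohen's d shrunk by an approximate small-sample correction factor."""
    n_total = len(treatment) + len(control)
    correction = 1 - 3 / (4 * n_total - 9)
    return cohens_d(treatment, control) * correction

# Hypothetical outcome scores for illustration.
treatment_scores = [6, 7, 8, 9, 10]
control_scores = [5, 6, 7, 8, 9]
d = cohens_d(treatment_scores, control_scores)
g = hedges_g(treatment_scores, control_scores)
print(round(d, 3), round(g, 3))  # 0.632 0.571
```

Under the rules of thumb above, both values would be read as falling between a medium and a large effect. The correction matters most when samples are small, as they are here.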


Effective

An Evidence Rating on CrimeSolutions.gov that indicates a program with strong evidence that it achieves its intended outcomes when implemented with fidelity. Read more About CrimeSolutions.gov or about Program Review and Rating from Start to Finish. "Effective" programs are represented throughout the site with the "Effective" icon.


Effectiveness

The strength of the evidence demonstrating that a program achieves its intended outcomes.  


Evidence

Information about a question that is generated through systematic data collection, research, or program evaluation using accepted scientific methods that are documented and replicable. Evidence may be classified as either descriptive or causal. 


Evidence base (studies reviewed)

For programs, the evidence base represents the three or fewer studies reviewed and scored by CrimeSolutions.gov Study Reviewers, the results of which are aggregated to determine a program’s Evidence Rating. For practices, the evidence base comprises all available meta-analyses. Read more About CrimeSolutions.gov or about Program Review and Rating from Start to Finish or Practice Review and Rating from Start to Finish.


Evidence Rating

Refers to one of three designations on CrimeSolutions.gov indicating the extent of the evidence that a program works. The three designations are "Effective", "Promising", and "No Effects", each represented by its own icon. A single study icon is used to identify programs that have been evaluated with only one study. A multiple studies icon is used to represent a greater extent of evidence supporting the evidence rating. The icon depicts programs that have more than one study in the evidence base demonstrating effects in a consistent direction. For practices, the rating designations take a slightly different meaning. Read more About CrimeSolutions.gov or about Program Review and Rating from Start to Finish or about Practice Review and Rating from Start to Finish.


Evidence Rating Element

Subcategories within the four broad dimensions included in the CrimeSolutions.gov Scoring Instrument used to review and rate the evidence for a program or practice. See Program Review and Rating from Start to Finish or Practice Review and Rating from Start to Finish for more information.


Evidence-based Programs

The Office of Justice Programs (OJP) considers programs and practices to be evidence-based when their effectiveness has been demonstrated by causal evidence, generally obtained through high quality outcome evaluations.


Experimental Design

A research design in which participants are randomly assigned to an intervention/treatment group or a control group. Many social scientists believe studies using random assignment lead to the highest confidence that observed effects are the result of the program and not another variable. See also Randomized Controlled Trial (RCT).

F

Fidelity (Evidence Rating Element)

The degree to which a program’s core services, components, and procedures are implemented as originally designed. Programs replicated with a high degree of fidelity are more likely to achieve consistent results. See Program Review and Rating from Start to Finish for more information.


Follow-up Period (Evidence Rating Element)
The length of time that the study period continues after the program ends to determine the program’s sustained or continued effects. This is a dimension in the CrimeSolutions.gov Scoring Instrument.

G

Grey Literature (Evidence Rating Element)

Research and evaluations that are not controlled by commercial publishers (i.e., not published in a peer-reviewed journal or a book). Sources of grey literature or unpublished studies include dissertations, theses, government reports, technical reports, conference presentations, and other unpublished sources. This is a dimension in the CrimeSolutions.gov Practices Scoring Instrument that assesses the extent to which a meta-analysis includes results from unpublished or “grey” literature sources. A meta-analysis should always attempt to include grey literature due to consistent evidence that the nature and direction of research findings is often related to publication status. See Publication Bias for more information.

Note: If the literature search does not include an effort to locate unpublished studies, or is explicitly restricted to published literature, it is not eligible for inclusion as a practice on CrimeSolutions.gov.

H

Heterogeneity (Evidence Rating Element)
Refers to the variability of the effect sizes from the different evaluation studies included in a meta-analysis (e.g., some evaluations may show strong, significant effects while other evaluations show small or no effects). This is a dimension in the CrimeSolutions.gov Practices Scoring Instrument that rates a meta-analysis on whether the authors were aware of and attentive to heterogeneity (i.e., variability) in the effect sizes from the studies in the meta-analysis. Heterogeneity statistics include tau (τ), tau-squared (τ²), Q, and I-squared (I²).
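
As a rough sketch of how these statistics relate to one another, the code below computes Q, I-squared, and tau-squared from a set of effect sizes and their sampling variances. It assumes inverse-variance weights and the DerSimonian-Laird estimator for tau-squared, which is one common choice among several; the function name and the input values are hypothetical.

```python
def heterogeneity(effects, variances):
    """Return (Q, I-squared as a percentage, tau-squared)."""
    if len(effects) < 2:
        raise ValueError("at least two effect sizes are needed")
    weights = [1 / v for v in variances]
    total_w = sum(weights)
    mean_effect = sum(w * e for w, e in zip(weights, effects)) / total_w
    # Q: weighted squared deviations of each effect from the weighted mean.
    q = sum(w * (e - mean_effect) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of total variability beyond what chance alone predicts.
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Tau-squared (DerSimonian-Laird): estimated between-study variance.
    c = total_w - sum(w ** 2 for w in weights) / total_w
    tau_squared = max(0.0, (q - df) / c)
    return q, i_squared, tau_squared

# Hypothetical effect sizes from three evaluations with equal variances.
q, i2, tau2 = heterogeneity([0.1, 0.3, 0.8], [0.04, 0.04, 0.04])
print(round(q, 2), round(i2, 1), round(tau2, 3))  # 6.5 69.2 0.09
```

Tau is the square root of tau-squared and is expressed in the same units as the effect sizes themselves.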

History (Evidence Rating Element)

An event that takes place between the pretest (data collected prior to the treatment beginning) and the posttest (data collected after the treatment ends) that has nothing to do with the treatment but may impact observed outcomes.  History is a potential threat to Internal Validity. See Program Review and Rating from Start to Finish for more information.

I

Implementation
Refers to implementing a program in the same or similar manner targeting the same or similar population in order to achieve the same results that occurred when a program was originally implemented.

Independent Variable

A variable that changes or influences another variable, usually the dependent variable. This is often the treatment in experimental designs and precedes the outcome variable in time. It is the “cause” in a cause and effect relationship.


Instrumentation (Evidence Rating Element)

The measures used in a study. The instrumentation quality is dependent on the measures’ reliability and validity. Reliability refers to the degree to which a measure is consistent or gives very similar results each time it is used, and validity refers to the degree to which a measure is able to scientifically answer the question it is intended to answer. Instrumentation is a component considered within Internal Validity. See Program Review and Rating from Start to Finish for more information.


Insufficient Evidence
Programs or practices with insufficient evidence are those that have been reviewed by CrimeSolutions.gov Study Reviewers, but were not assigned an evidence rating due to limitations of the studies included in the programs' evidence base. Programs are placed on this insufficient evidence list if the study (or studies) reviewed (1) had significant limitations in the study design or (2) lacked sufficient information about program fidelity so that it was not possible to determine if the program was delivered as designed.

Intended Outcomes
The results that a program deliberately sets out to achieve by its design (i.e., the program’s goals). For example, a reentry program’s intended outcomes might be to reduce recidivism among program participants.

Intent-to-Treat Analysis

An analysis based on the initial treatment intent, not on the treatment eventually administered. For example, if the treatment group has a higher attrition rate than the control or comparison group, and outcomes are compared only for those who completed the treatment, the study results may be biased. An intent-to-treat design ensures that all study participants are followed until the conclusion of the study, irrespective of whether the participant is still receiving or complying with the treatment.
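
The difference between the two approaches can be seen in a small sketch. The records, field layout, and function names below are hypothetical and were constructed so that dropouts from the treatment group fare worse than completers.

```python
from statistics import mean

# Each hypothetical record: (assigned_group, completed_program, reoffended).
participants = [
    ("treatment", True, 0), ("treatment", True, 0),
    ("treatment", False, 1), ("treatment", False, 1),
    ("control", True, 0), ("control", True, 1),
    ("control", True, 1), ("control", True, 0),
]

def reoffense_rate(records):
    return mean(r[2] for r in records)

def intent_to_treat_rate(records, group):
    # Everyone is analyzed in the group they were assigned to, dropouts included.
    return reoffense_rate([r for r in records if r[0] == group])

def completers_only_rate(records, group):
    # Dropouts are excluded, which can bias results if dropout relates to outcome.
    return reoffense_rate([r for r in records if r[0] == group and r[1]])

itt_treatment = intent_to_treat_rate(participants, "treatment")
itt_control = intent_to_treat_rate(participants, "control")
completers_treatment = completers_only_rate(participants, "treatment")
print(itt_treatment, itt_control, completers_treatment)  # 0.5 0.5 0
```

In these constructed data the completers-only comparison makes the program look fully successful, while the intent-to-treat comparison shows no difference between the groups.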


Internal Validity (Evidence Rating Element)

The degree to which observed changes can be attributed to the program. The validity of a study depends on both the research design and the measurement of the program activities and outcomes. Threats to internal validity may affect the extent to which observed effects may be attributed to a program or intervention. On CrimeSolutions.gov’s Scoring Instrument, these threats include Attrition, Maturation, Instrumentation, Regression toward the Mean, Selection Bias, Contamination, and History, as well as other factors. See Program Review and Rating from Start to Finish for more information.

On CrimeSolutions.gov’s Scoring Instrument for practices, internal validity is measured by the number of randomized controlled trials used to calculate the mean effect size. Mean effect sizes calculated using only randomized controlled trials are considered to have fewer threats to internal validity than mean effect sizes calculated using only quasi-experimental designs. See Practice Review and Rating from Start to Finish for more information.

L

Lead Researcher
Subject matter and research methodology experts who serve a leadership role in selecting the studies that comprise the evidence base for a program or practice and who coordinate the review process for a given topic area on CrimeSolutions.gov. They also ensure that any scoring discrepancies between Study Reviewers are resolved and consensus is achieved prior to a program or practice being assigned a final Evidence Rating. Read more about CrimeSolutions.gov Researchers and Reviewers.

M

Maturation (Evidence Rating Element)

Occurs when observed outcomes result from natural changes in the program participants over time rather than from program impact. Maturation is a threat considered within Internal Validity. See Program Review and Rating from Start to Finish for more information.


Meta-analysis

In general terms, meta-analysis is a social science method that allows us to look at effectiveness across numerous evaluations of similar, but not necessarily identical, programs, strategies, or procedures. Meta-analysis examines conceptually similar approaches and answers the question, "on average, how effective are these approaches?" On CrimeSolutions.gov, we use the term "practices" to refer to these categories of similar programs, strategies, or procedures and meta-analyses form the evidence-base for practices.

A more precise definition for meta-analysis is that it is the systematic quantitative analysis of multiple studies that address a set of related research hypotheses in order to draw general conclusions, develop support for hypotheses, and/or produce an estimate of overall program effects.
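
One simple way to produce an estimate of overall program effects is an inverse-variance weighted mean, in which more precise studies count for more. The sketch below assumes a fixed-effect model and hypothetical inputs; actual meta-analyses often use random-effects models and additional adjustments.

```python
from math import sqrt

def fixed_effect_mean(effects, variances):
    """Return the pooled effect and an approximate 95% confidence interval."""
    weights = [1 / v for v in variances]  # precise studies get larger weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    standard_error = sqrt(1 / sum(weights))
    interval = (pooled - 1.96 * standard_error, pooled + 1.96 * standard_error)
    return pooled, interval

# Hypothetical effect sizes and sampling variances from two evaluations.
pooled, (low, high) = fixed_effect_mean([0.2, 0.5], [0.01, 0.04])
print(round(pooled, 2), round(low, 3), round(high, 3))  # 0.26 0.085 0.435
```

The pooled value sits closer to 0.2 than to 0.5 because the first study has the smaller variance and therefore the larger weight.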


Multilevel Models/Hierarchical Models

A statistical method that allows researchers to estimate separately the variance between subjects within the same setting, and the variance between settings. For example, when evaluating a school-based program it is important to know the variation of students within the same school as well as the variation of students between different schools. This ensures that when programs are evaluated, the effects are not attributed to the program when there could be underlying differences between schools or between the students in those schools.


Multivariate Analysis

Research strategy and analytic technique that involves the investigation of more than two variables at the same time or within the same statistical analysis. For example, in a multiple regression analysis, the effects of two or more independent variables are assessed in terms of their impact on the dependent variable.

N

No Effects

An Evidence Rating on CrimeSolutions.gov indicating a program or practice has strong evidence that it had no effects or harmful effects when trying to achieve its intended outcomes. Read more About CrimeSolutions.gov or Program Review and Rating from Start to Finish. "No Effects" programs are represented throughout the site with the "No Effects" icon.


Non-experimental

Refers to a research design in which participants are not assigned to treatment and control/comparison groups (randomly or otherwise). Such designs do not allow researchers to establish causal relationships between a program or treatment and its intended outcomes. Non-experimental designs are sometimes used when ethics or circumstances limit the ability to use a different design or because the intent of the research is not to establish a causal relationship. Examples of non-experimental designs include case studies, ethnographic research, or historical analysis.

O

Office of Justice Programs (OJP)
An agency of the U.S. Department of Justice, the Office of Justice Programs works in partnership with the justice community to identify the most pressing crime-related challenges confronting the justice system and to provide information, training, coordination, and funding of innovative strategies and approaches to address these challenges. The following bureaus and offices are part of the Office of Justice Programs: the Bureau of Justice Assistance (BJA), the Bureau of Justice Statistics (BJS), the National Institute of Justice (NIJ), the Office of Juvenile Justice and Delinquency Prevention (OJJDP), the Office for Victims of Crime (OVC), and the Office of Sex Offender Sentencing, Monitoring, Apprehending, Registering, and Tracking (SMART). Read more About the Office of Justice Programs.

Outcome Evaluation

A formal study that seeks to determine if a program is working. An outcome evaluation involves measuring change in the desired outcomes (e.g., changes in behaviors or changes in crime rates) before and after a program is implemented, and determines if those changes can be attributed to the program. Outcome evaluations can use many different research designs: randomized controlled trials, quasi-experimental designs, time-series analysis, simple pre/posttest, etc. For CrimeSolutions.gov, a program must be evaluated with at least one randomized controlled trial or quasi-experimental research design (with a comparison condition) in order for the outcome evaluation to be included in the program’s evidence base. See Program Review and Rating from Start to Finish for more information.


Outcomes (Primary and Secondary)

The intended results of a program’s activities or operation and a dimension in the CrimeSolutions.gov Scoring Instrument. Primary outcomes refer to the primary or central intended effects of a program. Within the scope of CrimeSolutions.gov, those primary outcomes must also relate to criminal justice, juvenile justice, or victim services. Secondary outcomes are the ancillary effects of a program. Outcomes are considered and rated separately within this dimension because programs may target multiple outcomes. Examples of outcomes include: reducing drug use, increasing system response to crime victims, and reducing fear of crime.


Outcomes (Tier 1 and Tier 2)
The intended results of a practice’s activities or operation and a dimension in the CrimeSolutions.gov Scoring Instrument. Tier 1 outcomes refer to the general outcome constructs (e.g., crime/delinquency, drugs and substance abuse, mental/behavioral health, education, victimization, family, etc.). Tier 2 outcomes refer to the specific outcome constructs (e.g., property offenses, sex-related offenses, or violent offenses under crime/delinquency; alcohol, cocaine/crack cocaine, and heroin/opioids under drugs and substance abuse; internalizing behavior, externalizing behavior, and psychological functioning under mental/behavioral health; etc.). On CrimeSolutions.gov’s Scoring Instrument for practices, all effect sizes are coded to a Tier 2 outcome construct.

Outlier (Evidence Rating Element)
An unusually high or low effect size. When combining effect sizes from various evaluations, extreme outliers can potentially distort the overall mean effect size. This is a dimension in the CrimeSolutions.gov Practices Scoring Instrument that assesses whether the meta-analysis checks for effect size outliers in the data. Note that this item refers to outlying effect sizes included in the meta-analysis, not the outlying data in the evaluation studies that contributed to the meta-analysis.
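
A check for outlying effect sizes might look like the sketch below. It uses a modified z-score based on the median and the median absolute deviation with a cutoff of 3.5, which is one common heuristic and is not stated by CrimeSolutions.gov to be the method its reviewers look for. The function name and the effect sizes are hypothetical.

```python
from statistics import median

def flag_outlying_effects(effects, cutoff=3.5):
    """Return effect sizes whose modified z-score exceeds the cutoff."""
    center = median(effects)
    # Median absolute deviation: a spread measure that outliers barely move.
    mad = median(abs(e - center) for e in effects)
    if mad == 0:
        return []  # no spread to judge against
    return [e for e in effects if abs(0.6745 * (e - center) / mad) > cutoff]

# Hypothetical effect sizes from six evaluations in one meta-analysis.
effect_sizes = [0.10, 0.15, 0.20, 0.25, 0.18, 1.50]
flagged = flag_outlying_effects(effect_sizes)
print(flagged)  # [1.5]
```

A median-based rule is used here because a single extreme value inflates the ordinary standard deviation and can thereby mask itself.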

P

Practical Significance
Refers to the practical importance of an effect size. For example, an outcome evaluation may show that the treatment group performed statistically significantly better than the control group following participation in a program, but if the effect size is very small and the program costs are very high, the results may not be practically significant. Practical significance can be subjective, and can be assessed by looking at the magnitude of the effect size, the costs and resources of the program, and various other factors.

Practice

A general category of programs, strategies, or procedures that share similar characteristics with regard to the issues they address and how they address them. CrimeSolutions.gov uses the term “practice” in a very general way to categorize causal evidence that comes from meta-analyses of multiple program evaluations. Using meta-analysis, it is possible to group program evaluation findings in different ways to provide information about effectiveness at different levels of analysis.  Therefore, practices on CrimeSolutions.gov may include the following:

- Program types – A generic category of programs that share similar characteristics with regard to the matters they address and how they do it. For example, family therapy is a program type that could be reported as a practice in CrimeSolutions.gov.
- Program infrastructures – An organizational arrangement or setting within which programs are delivered. For example, boot camps may be characterized as a practice. 
- Policies or strategies – Broad approaches to situations or problems that are guided by general principles but are often flexible in how they are carried out. For example, hot spots policing may be characterized as a practice. 
- Procedures or techniques – More circumscribed activities that involve a particular way of doing things in relevant situations. These may be elements or specific activities within broader programs or strategies. For example, risk assessment.

On the CrimeSolutions.gov website, a practice is distinguished from a program. Whereas the evidence base for a practice is derived from one or more meta-analyses, the evidence base for a program is derived from one to three individual program evaluations.


Preponderance of Evidence
To determine if a program works, most of the outcome evidence must indicate effectiveness. This is part of a dimension in the CrimeSolutions.gov Scoring Instrument.

Process Evaluation
A study that seeks to determine if a program is operating as it was designed to. Process evaluations can be conducted in a number of ways, but may include examination of the service delivery model, the performance goals and measures, interviews with program staff and clients, etc. Process evaluations are not included in a program’s evidence base and therefore do not determine a program’s evidence rating, but may be used as supporting documentation. See Program Review and Rating from Start to Finish for more information.

Program

A planned, coordinated group of activities and processes designed to achieve a specific purpose. A program should have specified procedures (e.g., a defined curriculum, an explicit number of treatment or service hours, and an optimal length of treatment) to ensure the program is implemented with fidelity to its model. It may have, but does not necessarily need, a “brand” name and may be implemented at single or multiple locations.

On the CrimeSolutions.gov website, a program is distinguished from a practice. Whereas the evidence base for a program is derived from one to three individual program evaluations, the evidence base for a practice is derived from one or more meta-analyses.  


Promising
An Evidence Rating on CrimeSolutions.gov that indicates a program or practice has some evidence that it achieves its intended outcomes. More extensive research is recommended. See more About CrimeSolutions.gov or Program Review and Rating from Start to Finish. "Promising" programs are represented throughout the site with the "Promising" icon.

Publication Bias (Evidence Rating Element)
Broadly refers to the idea that published evaluations are more likely to show large and/or statistically significant program effects, whereas unpublished evaluations are more likely to show null, small, or “negative” (i.e., opposite of what would be predicted) program effects. This is a dimension in the CrimeSolutions.gov Scoring Instrument that rates the extent to which a meta-analysis investigates the potential for publication bias in the sample of included studies.

Q

Quasi-experimental Design

A research design that resembles an experimental design, but in which participants are not randomly assigned to treatment and control groups. Quasi-experimental designs are generally viewed as weaker than experimental designs because threats to validity cannot be as thoroughly minimized. This reduces the level of confidence that observed effects may be attributed to the program and not other variables.

R

Randomized Controlled Trial (RCT) / Randomized Field Experiment

Refers to an experimental research design in which participants are randomly assigned to a treatment or a control group. Most social scientists consider random assignment to lead to the highest level of confidence that observed effects are the result of the program and not other variables. 


Regression toward the Mean (Evidence Rating Element)
The statistical tendency for extreme scores relative to the mean to move closer to the average score in subsequent measurements. Regression toward the mean is a threat to the study’s Internal Validity. See Program Review and Rating from Start to Finish for more information.
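
The tendency can be demonstrated with a small simulation in which no program is delivered at all. Every number below is a hypothetical simulation setting: each person has a stable underlying level, and each measurement adds independent random noise.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the simulation is repeatable

# Stable underlying level per person, plus fresh noise at each measurement.
true_level = [rng.gauss(50, 10) for _ in range(10_000)]
first = [t + rng.gauss(0, 10) for t in true_level]
second = [t + rng.gauss(0, 10) for t in true_level]

# Select the people with the most extreme (top 10 percent) first scores.
cutoff = sorted(first)[int(0.9 * len(first))]
extreme = [i for i, score in enumerate(first) if score >= cutoff]

first_mean = mean(first[i] for i in extreme)
second_mean = mean(second[i] for i in extreme)
overall_mean = mean(second)
print(round(first_mean, 1), round(second_mean, 1), round(overall_mean, 1))
```

The selected group scores lower on the second measurement than on the first, yet still above the overall average. A study that enrolls participants because of extreme pretest scores could mistake this drift for a program effect.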

Research Design (Evidence Rating Element)
The plan for how a study’s information is gathered that includes identifying the data collection method(s), the instrumentation used, the administration of those instruments, and the methods to organize and analyze the data. The quality of the research design impacts whether a causal relationship between program treatment and outcome may be established. Research designs may be divided into three categories: experimental, quasi-experimental, and non-experimental. See Program Review and Rating from Start to Finish for more information.

Reviewer Confidence

As a final step on the Scoring Instrument, Study Reviewers provide an assessment as to their overall confidence in the study design. If both Study Reviewers agree, and the Lead Researcher concurs, that there is a fundamental flaw in the study design (not captured in the Design Quality dimension) that raises serious concerns about the study’s results, the study is removed from the evidence base and not factored into the Evidence Rating. This final determination serves as an additional safeguard to ensure that only the most rigorous studies comprise the evidence base. The study citation will be listed among the program’s additional references. See Program Review and Rating from Start to Finish for more information.


S

Sample Size (Evidence Rating Element)

A sample is the subset of the entire population that is included in a research study. Typically, all else being equal, a larger sample size leads to increased precision in estimates of various properties of the population. The sample size affects the statistical power of a study and the extent to which a study is capable of detecting meaningful program effects. It is included as an element with Statistical Power in the CrimeSolutions.gov Scoring Instrument. See Program Review and Rating from Start to Finish for more information.
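
One common way to express the link between sample size and precision is the standard error of a sample mean, which equals the standard deviation divided by the square root of the sample size. The sketch below assumes a simple random sample and a hypothetical standard deviation of 10.

```python
from math import sqrt

def standard_error(sample_sd, n):
    """Standard error of a sample mean for a simple random sample of size n."""
    return sample_sd / sqrt(n)

# Hypothetical outcome with a standard deviation of 10.
for n in (25, 100, 400):
    print(n, standard_error(10.0, n))   # 2.0, 1.0, 0.5
```

Quadrupling the sample size halves the standard error, so gains in precision become more expensive as samples grow.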


Scoring Instrument

The method by which aspects, strengths, and weaknesses of programs and practices are consistently and objectively rated for evidence. For programs, the scoring instrument is a compilation of the dimensions and elements of a research study that are reviewed and assigned a numerical score by the CrimeSolutions.gov Study Reviewers in order to assess the evidence of a program’s effectiveness. The instrument provides a standard method to assess the quality of each program’s evidence base, while also reflecting Study Reviewers’ judgment and expertise. A similar method of scoring the aspects of meta-analyses is used for practices. See Program Review and Rating from Start to Finish or Scoring Instrument for more information.


Selection Bias (Evidence Rating Element)

Occurs when study participants are assigned to groups such that pre-existing differences (unrelated to the program, treatment, or activities) impact differences in observed outcomes. Selection bias threatens the study’s Internal Validity. Even if the subjects are randomly assigned, this threat is of particular concern with studies that have small samples. See Program Review and Rating from Start to Finish for more information.

Statistical Adjustment (Evidence Rating Element)

The use of statistical controls to account for the initial measured differences between groups. It is not applicable for all research designs. See Program Review and Rating from Start to Finish for more information.


Statistical Power (Evidence Rating Element)

The ability of a statistical test to detect meaningful program effects. It is a function of several factors, including: 1) the size of the sample; 2) the magnitude of the expected effect; and 3) the type of statistical test used. Statistical power is an element within Sample Size on the CrimeSolutions.gov Scoring Instrument. See Program Review and Rating from Start to Finish for more information.
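
The factors listed above can be made concrete with a rough power calculation. The sketch below assumes a two-sided z-test comparing the means of two equally sized groups, with the expected effect expressed in standard deviation units. The function name and the example values are illustrative only and are not taken from the Scoring Instrument.

```python
from math import sqrt
from statistics import NormalDist

def power_two_group_z_test(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group z-test.

    effect_size is the expected difference in means divided by the
    (assumed known and equal) standard deviation of the outcome.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # critical value, about 1.96
    shift = effect_size * sqrt(n_per_group / 2)  # expected value of the z statistic
    # Probability that the z statistic lands in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

# Hypothetical medium effect (0.5 standard deviations), 64 subjects per group.
print(round(power_two_group_z_test(0.5, 64), 2))   # roughly 0.81
```

Under these assumptions, raising either the sample size or the expected effect size raises power, while an effect size of zero leaves only the alpha-level chance of rejecting the null hypothesis.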


Statistical Significance

In an evaluation, statistical significance indicates that a difference found between the treatment group and control group is larger than would be expected from chance alone if the program or intervention being studied had no effect. For example, if an outcome evaluation finds that after participating in a substance abuse program, the treatment group was statistically significantly less likely to abuse substances compared with the control group, this means that a difference of that size would be unlikely to arise by chance alone, which supports attributing it to the program.

In social science, researchers generally use a p-value threshold of 0.05, which means that, if the program truly had no effect, the probability of observing a difference at least as large as the one found would be less than 5 percent. A p-value of 0.05 is the cut-off point that CrimeSolutions.gov Expert Reviewers use to score whether an outcome is statistically significant. If the p-value is larger than 0.05, the outcome is not statistically significant, and the difference between the treatment and control group could plausibly be due to chance. See Program Review and Rating from Start to Finish for more information.
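
As an illustration of how a p-value might be compared with the 0.05 cut-off, the sketch below applies a two-proportion z-test to made-up counts. The test choice, the function name, and the numbers are assumptions for the example. Evaluations use whatever test suits their design.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided z-test for a difference between two proportions."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)   # rate if the groups did not differ
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data: 30 of 200 treatment subjects and 50 of 200 control
# subjects abused substances during follow-up.
z, p_value = two_proportion_p_value(30, 200, 50, 200)
print(round(z, 2), round(p_value, 3))      # z is -2.5, p is about 0.012
print("significant at 0.05:", p_value < 0.05)
```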


Study Reviewer

Subject matter and research methodology experts who review and assess the individual evaluation studies (for programs) or meta-analyses (for practices) that comprise the evidence base upon which CrimeSolutions.gov ratings are based. All Reviewers must complete training and receive certification prior to becoming a Study Reviewer. Read more about CrimeSolutions.gov Researchers and Reviewers.


Systematic Review

A process by which the research evidence from multiple studies on a particular topic is reviewed and assessed using systematic methods to reduce bias in selection and inclusion of studies. A systematic review is generally viewed as more thorough than a non-systematic literature review, but does not necessarily involve the quantitative statistical techniques of a meta-analysis.


T

Time Series Analysis

An analytic technique that uses a sequence of data points, typically measured at successive, uniform time intervals, to identify trends and other characteristics of the data. For example, a time series analysis may be used to study a city’s crime rate over time and predict future crime trends.
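
One of the simplest time series techniques is a moving average, which smooths short-term fluctuations so that a trend is easier to see. The sketch below uses hypothetical monthly incident counts, and the three-month window is an arbitrary choice.

```python
def moving_average(values, window):
    """Average each run of `window` consecutive observations."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

# Hypothetical monthly incident counts for one city.
monthly_incidents = [120, 132, 125, 140, 138, 150, 145, 160]
trend = moving_average(monthly_incidents, window=3)
print([round(value, 1) for value in trend])
```

The smoothed series rises steadily even though some individual months dip, which is the kind of underlying trend a time series analysis is meant to expose.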


Treatment Group

The subjects or program participants who receive the set of services, treatment, or activities being studied or tested.


Type I Error

The probability of a Type I error, usually signified as “alpha,” is often used to indicate the chance of rejecting a null hypothesis that is actually true (e.g., concluding that a program works when in fact it does not, also called a false positive).

Type II Error

The probability of a Type II error, usually signified as “beta,” is often used to indicate the chance that an actual effect goes undetected (e.g., concluding that a program doesn’t work when in fact it does, also called a false negative).
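
Both error rates can be estimated by simulating many studies. The sketch below is illustrative only: it assumes normally distributed outcomes with a known standard deviation of 1, a two-sided z-test at alpha = 0.05, 30 subjects per group, and a hypothetical true effect of 0.5 for the Type II case.

```python
import random
from math import sqrt
from statistics import NormalDist

def rejection_rate(true_effect, n_per_group=30, alpha=0.05,
                   n_studies=5000, seed=1):
    """Share of simulated studies in which the null hypothesis is rejected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sqrt(2 / n_per_group)   # outcome SD is assumed to be 1 in both groups
    rejections = 0
    for _ in range(n_studies):
        control = sum(rng.gauss(0, 1) for _ in range(n_per_group)) / n_per_group
        treated = sum(rng.gauss(true_effect, 1) for _ in range(n_per_group)) / n_per_group
        if abs((treated - control) / se) > z_crit:
            rejections += 1
    return rejections / n_studies

# No true effect: every rejection is a false positive (Type I error).
type_i_rate = rejection_rate(true_effect=0.0)
# True effect of 0.5: every non-rejection is a false negative (Type II error).
type_ii_rate = 1 - rejection_rate(true_effect=0.5)
print(type_i_rate, type_ii_rate)
```

Under these assumptions the simulated Type I rate should sit close to the chosen alpha of 0.05, while the Type II rate depends on the sample size and on the size of the true effect.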


V

Variance

A statistical measure of how far a set of data points is dispersed from the mean or average for a population or a sample. It is the average squared deviation of outcomes from the mean of outcomes for a group. It is used as a step in determining the effect of an intervention or treatment on a population.
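
The calculation can be sketched with a small set of hypothetical outcome scores. Python's `statistics` module provides `pvariance` for a whole population (dividing by n) and `variance` for a sample (dividing by n - 1).

```python
from statistics import mean, pvariance, variance

# Hypothetical outcome scores for one group.
outcomes = [2, 4, 4, 4, 5, 5, 7, 9]

group_mean = mean(outcomes)                                   # 5
squared_deviations = [(x - group_mean) ** 2 for x in outcomes]
manual_variance = sum(squared_deviations) / len(outcomes)     # 32 / 8 = 4.0

print(manual_variance)                 # 4.0
print(pvariance(outcomes))             # population variance: 4
print(round(variance(outcomes), 2))    # sample variance: 4.57
```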
