The use of one- versus two-tailed tests to evaluate prevention programs. Academic Article

abstract

  • Investigators have used both one- and two-tailed tests to determine the significance of findings yielded by program evaluations. While the literature that addresses the appropriate use of each type of significance test is historically inconsistent, almost all authorities now agree that one-tailed tests are rarely (if ever) appropriate. A review of 85 published evaluations of school-based drug prevention curricula specified on the National Registry of Effective Programs and Practices revealed that 20% employed one-tailed tests and, within this subgroup, an additional 4% also employed two-tailed tests. The majority of publications either did not specify the type of statistical test employed or used some other criterion such as effect sizes or confidence intervals. Evaluators reported that they used one-tailed tests either because they stipulated the direction of expected findings in advance, or because prior evaluations of similar programs had yielded no negative results. The authors conclude that one-tailed tests should never be used because they introduce greater potential for Type I errors and create an uneven playing field when outcomes are compared across programs. The authors also conclude that the traditional threshold of significance that places α at .05 is arbitrary and obsolete, and that evaluators should consistently report the exact p values they find.
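
The "uneven playing field" concern in the abstract can be illustrated with a small numerical sketch. It is not taken from the article: the z statistic below is a hypothetical value, and the helper names (`normal_sf`, `one_tailed_p`, `two_tailed_p`) are assumptions introduced only for illustration. For a symmetric test statistic, the one-tailed p value in the predicted direction is half the two-tailed p value, so the same result can fall on either side of .05 depending on which test an evaluator chose.

```python
import math


def normal_sf(z: float) -> float:
    """Upper-tail probability P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def one_tailed_p(z: float) -> float:
    """One-tailed p value, assuming the effect was predicted to be positive."""
    return normal_sf(z)


def two_tailed_p(z: float) -> float:
    """Two-tailed p value: an effect in either direction counts as extreme."""
    return 2 * normal_sf(abs(z))


# Hypothetical test statistic, not a value reported in the article.
z = 1.80

p_one = one_tailed_p(z)
p_two = two_tailed_p(z)

print(f"one-tailed p = {p_one:.4f}")  # 0.0359, below .05
print(f"two-tailed p = {p_two:.4f}")  # 0.0719, above .05
```

Reporting the exact p value together with the type of test, as the authors recommend, lets readers make this conversion themselves when comparing programs.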

published proceedings

  • Eval Health Prof

author list (cited authors)

  • Ringwalt, C., Paschall, M. J., Gorman, D., Derzon, J., & Kinlaw, A.

citation count

  • 12

complete list of authors

  • Ringwalt, Chris; Paschall, MJ; Gorman, Dennis; Derzon, James; Kinlaw, Alan

publication date

  • June 2011