Interobserver Agreement

History, Theory, and Current Methods
  • Terry J. Page
  • Brian A. Iwata
Part of the Applied Clinical Psychology book series (NSSB)


Research in applied behavior analysis often involves measuring behavior under conditions that preclude the use of the precision mechanical recording equipment commonly found in experimental laboratories. As a result, it has been necessary to rely on human observers to record data that reflect some characteristic of the behavior observed: rate, duration, magnitude, or latency, for instance.


Keywords: Interobserver Agreement; Behavior Analysis; Observation Interval; Apply Behavior Analysis; Exact Agreement





Copyright information

© Plenum Press, New York 1986

Authors and Affiliations

  • Terry J. Page (1)
  • Brian A. Iwata (1)
  1. Division of Behavioral Psychology, The John F. Kennedy Institute, Johns Hopkins University School of Medicine, Baltimore, USA
