Prevention Science, Volume 21, Issue 2, pp 194–202

Using Security Questions to Link Participants in Longitudinal Data Collection

  • Shu Xu
  • Anthea Chan
  • Michael F. Lorber
  • Justin P. Chase
Article

Abstract

Anonymous data collection systems are often necessary when assessing sensitive behaviors but can pose challenges to researchers seeking to link participants over time. To assist researchers in anonymously linking participants, we outlined and tested a novel security question linking (SEEK) method. The SEEK method includes four steps: (1) data management and standardization, (2) many-to-many matching, (3) fuzzy matching, and (4) rematching and verification. The method is demonstrated in SAS with two samples from a longitudinal study of adolescent dating violence. After an initial assessment during a laboratory visit, participants were asked to complete an online assessment either (a) once, 3 months later (Sample 1, n = 60), or (b) three times at 1-month intervals (Sample 2, n = 140). Demographics, eye color, and responses to nine security questions were used as key variables to link responses from the laboratory and online follow-up assessments. The rate of matched cases was 100% in Sample 1 and ranged from 94.3% to 98.3% in Sample 2. To quantify confidence in the data quality of successfully matched pairs, we reported the means and standard deviations of the number of matched security questions. In addition, we reported the rank order and counts of the mismatched components in key variables. Results indicate that the SEEK method provides a feasible and reliable solution for linking responses in longitudinal studies with sensitive questions.
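The four steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' SAS implementation: it is written in Python, it uses a plain Levenshtein distance as a stand-in for the SAS COMPGED function (the two are not equivalent), and the key variable names, record IDs, the `min_agree` threshold, and the greedy one-to-one assignment used for the verification step are all assumptions made for illustration.

```python
from statistics import mean

def standardize(value):
    """Step 1 (assumed rule): lowercase and keep only letters and digits."""
    return "".join(ch for ch in str(value).lower() if ch.isalnum())

def edit_distance(a, b):
    """Plain Levenshtein distance; a stand-in for COMPGED, not a reimplementation."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def count_matches(rec_a, rec_b, keys, max_dist):
    """Count key variables whose standardized answers differ by at most max_dist edits."""
    return sum(
        edit_distance(standardize(rec_a[k]), standardize(rec_b[k])) <= max_dist
        for k in keys
    )

def link(baseline, followup, keys, min_agree, max_dist=1):
    """Steps 2-4: compare every baseline/follow-up pair (many-to-many),
    score exact and fuzzy agreement, then greedily keep the highest-scoring
    pairs so that each record is linked at most once."""
    candidates = []
    for b_id, b in baseline.items():
        for f_id, f in followup.items():
            exact = count_matches(b, f, keys, 0)
            fuzzy = count_matches(b, f, keys, max_dist)
            if fuzzy >= min_agree:
                candidates.append((fuzzy, exact, b_id, f_id))
    candidates.sort(reverse=True)  # most agreement first, exact matches break ties
    links, used_b, used_f = {}, set(), set()
    for fuzzy, exact, b_id, f_id in candidates:
        if b_id not in used_b and f_id not in used_f:
            links[f_id] = (b_id, fuzzy)
            used_b.add(b_id)
            used_f.add(f_id)
    return links

# Hypothetical key variables and records, invented for this example.
keys = ["eye_color", "first_pet", "birth_city"]
baseline = {
    "L01": {"eye_color": "Brown", "first_pet": "Max", "birth_city": "New York"},
    "L02": {"eye_color": "Blue", "first_pet": "Bella", "birth_city": "Boston"},
}
followup = {
    "W01": {"eye_color": "blue ", "first_pet": "Bela", "birth_city": "boston"},
    "W02": {"eye_color": "brown", "first_pet": "max", "birth_city": "NewYork"},
    "W03": {"eye_color": "green", "first_pet": "Rex", "birth_city": "Chicago"},
}

links = link(baseline, followup, keys, min_agree=2)
agree = [n for _, n in links.values()]
print(f"matched {len(links)}/{len(followup)} follow-ups; "
      f"mean agreeing keys = {mean(agree):.2f}")
```

In this toy data, the misspelled answer "Bela" is recovered only at the fuzzy step, and the unmatched follow-up record is left unlinked rather than forced onto a baseline record. A real application would need thresholds chosen and checked against the study's own data.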

Keywords

Security questions; Linking; Longitudinal studies; SEEK; Online studies

Notes

Acknowledgments

Support for the Dating Study data collection was provided by Grants 2014-VA-CX-0066 and 1R21HD077345. We thank Gabriella Damewood, Ashley Dills, Nicole Graziano, and Angela Marinakis for their assistance in data collection.

Funding Information

The third author received research grants from the National Institutes of Health (1R21HD077345) and the National Institute of Justice (2014-VA-CX-0066) to support this study.

Compliance with Ethical Standards

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Informed Consent

Informed consent was obtained from all individual participants included in the study.

Conflicts of Interest

The first, second, and fourth authors declare that they have no conflict of interest.

Supplementary material

11121_2019_1080_MOESM1_ESM.docx (18 kb)
ESM 1 (DOCX 17 kb)


Copyright information

© Society for Prevention Research 2019

Authors and Affiliations

  • Shu Xu (1), corresponding author
  • Anthea Chan (2)
  • Michael F. Lorber (3)
  • Justin P. Chase (3)
  1. Department of Biostatistics, New York University, New York, USA
  2. Columbia University, New York, USA
  3. Family Translational Research Group, New York University, New York, USA
