“You Can Do It!”—Crowdsourcing Motivational Speech and Text Messages

  • Roelof A. J. de Vries
  • Khiet P. Truong
  • Jaebok Kim
  • Vanessa Evers
Part of the Human–Computer Interaction Series book series (HCIS)


Recent technological approaches that assist or encourage people to change their exercise behavior focus on tailoring the content of motivational messages to the user. In designing these messages, the mode and style of presentation (e.g., spoken or written, and tone of voice) are also thought to play an important role in a message's effectiveness. We are interested in studying the effects of the content, mode, and style of motivational messages in the context of exercise behavior change. However, we are not aware of any accessible database of motivational messages, and collecting a large database of spoken and written messages is not a trivial task. Crowdsourcing can be an effective way to collect large amounts of data for all sorts of tasks. Traditionally, crowdsourcing tasks are relatively easy for participants (microtasks). In this work, we use crowdsourcing to collect a large amount of data for more complex tasks (macrotasks): designing motivational text messages and recording spoken motivational messages. We present and discuss our approach, the resulting database, and the challenges we ran into, and report findings from unsupervised explorations of the emotional expressiveness and sound quality (signal-to-noise ratio, SNR) of the crowdsourced motivational speech.
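For intuition about the SNR-based sound-quality screening mentioned above, the sketch below shows one simple, energy-based way to estimate the SNR of a crowdsourced recording: treat the quietest frames as background noise and the rest as speech. This is a minimal illustration, not the chapter's actual estimator; the frame length and noise fraction are arbitrary assumptions for the example.

```python
import math

def frame_energies(samples, frame_len=400):
    """Mean squared energy of each non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def estimate_snr_db(samples, frame_len=400, noise_fraction=0.1):
    """Rough SNR estimate in dB.

    Assumes the quietest `noise_fraction` of frames contain only
    background noise and the remaining frames contain speech.
    """
    energies = sorted(frame_energies(samples, frame_len))
    k = max(1, int(len(energies) * noise_fraction))
    noise = sum(energies[:k]) / k
    speech = sum(energies[k:]) / max(1, len(energies) - k)
    if noise == 0:
        return float("inf")
    return 10.0 * math.log10(speech / noise)
```

A recording where loud speech clearly stands out from a quiet background yields a high positive SNR, while a recording dominated by steady noise yields a value near 0 dB; thresholding such an estimate is one way to flag unusable crowdsourced recordings.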


Crowdsourcing · Macrotasks · Speech elicitation · Speech acquisition · Speech corpus · Motivational speech · Motivational messages



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Roelof A. J. de Vries (1)
  • Khiet P. Truong (2)
  • Jaebok Kim (2)
  • Vanessa Evers (2)

  1. Biomedical Signals and Systems, University of Twente, Enschede, The Netherlands
  2. Human Media Interaction, University of Twente, Enschede, The Netherlands
