
Parody Detection: An Annotation, Feature Construction, and Classification Approach to the Web of Parody

  • Chapter

Part of the book series: Multimedia Systems and Applications (MMSA)

Abstract

In this chapter, we discuss how to discover when works on a social media site are related to one another by artistic appropriation, particularly parody. The goal is to extract concrete link information from text that indicates derivative relationships between works, authors, and topics. In the domain of music video parodies, the approach applies generally to titles, lyrics, musical style, and content features, but the emphasis here is on descriptive text, comments, and quantitative features of songs. We first derive a classification task for discovering the “Web of Parody.” We then describe how to generate song/parody candidates, collect user annotations, and apply machine learning approaches comprising feature analysis, construction, and selection for this classification task. Finally, we report results from applying this framework to data collected from YouTube and explore how the basic classification task relates to the general problem of reconstructing the web of parody and other networks of influence. These results point toward further empirical study of how social media collections can statistically reflect derivative relationships and of what can be understood about the propagation of concepts across texts that are deemed interrelated.
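To make the classification task in the abstract concrete, here is a minimal sketch of one way a song/parody candidate pair could be turned into features and passed to a binary classifier. It is not the chapter's feature set or learning algorithm, which the abstract does not specify. The two features, the `title` and `description` field names, the toy videos and labels, and the small logistic regression are all assumptions made for illustration.

```python
import math
import re


def tokens(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def pair_features(original, candidate):
    """Build a feature vector for one (original, candidate) video pair.

    Both arguments are dicts with hypothetical 'title' and 'description' keys.
    """
    orig_title = tokens(original["title"])
    cand_title = tokens(candidate["title"])
    union = orig_title | cand_title
    # Feature 1: Jaccard overlap between the two titles.
    title_overlap = len(orig_title & cand_title) / len(union) if union else 0.0
    # Feature 2: does the candidate's own text call itself a parody?
    cand_text = cand_title | tokens(candidate["description"])
    says_parody = 1.0 if "parody" in cand_text else 0.0
    return [title_overlap, says_parody]


def train_logistic(X, y, epochs=500, lr=0.5):
    """Fit logistic regression weights with plain stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            grad = p - yi  # derivative of the log loss with respect to z
            w = [wj - lr * grad * xj for wj, xj in zip(w, xi)]
            b -= lr * grad
    return w, b


def predict(w, b, x):
    """Return 1 (parody) if the linear score is positive, else 0."""
    return 1 if b + sum(wj * xj for wj, xj in zip(w, x)) > 0 else 0


# Invented toy data: one original and four annotated candidates (1 = parody).
ORIGINAL = {"title": "Sunny Day Song", "description": "Official music video"}
CANDIDATES = [
    {"title": "Sunny Day Song Parody", "description": "We made this for fun"},
    {"title": "Sunny Day Song (Acoustic Cover)", "description": "My cover version"},
    {"title": "Cooking pasta at home", "description": "Recipe video"},
    {"title": "Rainy Day Song - spoof",
     "description": "A parody of Sunny Day Song, filmed at school"},
]
LABELS = [1, 0, 0, 1]

X = [pair_features(ORIGINAL, c) for c in CANDIDATES]
w, b = train_logistic(X, LABELS)
for cand, x in zip(CANDIDATES, X):
    print(cand["title"], x, predict(w, b, x))
```

In this toy setup the acoustic cover shares most of its title with the original but is still a negative example, which is why title overlap alone would not be enough. Richer signals such as comments and quantitative song features, which the abstract emphasizes, would be added as further entries in the feature vector.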




Acknowledgements

We thank the anonymous reviewers for helpful comments, and Hui Wang and Niall Rooney for the survey of kernel methods for clustering and classification of text documents in Section “Machine Learning Task: Classification”.

Author information

Corresponding author

Correspondence to William H. Hsu.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Weese, J.L., Hsu, W.H., Murphy, J.C., Knight, K.B. (2017). Parody Detection: An Annotation, Feature Construction, and Classification Approach to the Web of Parody. In: Hai-Jew, S. (eds) Data Analytics in Digital Humanities. Multimedia Systems and Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-54499-1_3

  • DOI: https://doi.org/10.1007/978-3-319-54499-1_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-54498-4

  • Online ISBN: 978-3-319-54499-1

  • eBook Packages: Computer Science, Computer Science (R0)
