Parody Detection: An Annotation, Feature Construction, and Classification Approach to the Web of Parody

Weese, Joshua L.; Hsu, William H.; Murphy, Jessica C.; Knight, Kim Brillante

doi:10.1007/978-3-319-54499-1_3

Chapter
First Online: 05 May 2017

Part of the book series: Multimedia Systems and Applications (MMSA)

Abstract

In this chapter, we discuss the problem of discovering when works on a social media site are related to one another by artistic appropriation, particularly parody. The goal of this work is to discover concrete link information from texts and to express how that information may entail derivative relationships between works, authors, and topics. In the domain of music video parodies, this has general applicability to titles, lyrics, musical style, and content features, but the emphasis in this work is on descriptive text, comments, and quantitative features of songs. We first derive a classification task for discovering the “Web of Parody.” We then describe the problems of how to generate song/parody candidates, collect user annotations, and apply machine learning approaches comprising feature analysis, construction, and selection for this classification task. Finally, we report results from applying this framework to data collected from YouTube and explore how the basic classification task relates to the general problem of reconstructing the web of parody and other networks of influence. This points toward further empirical study of how social media collections can statistically reflect derivative relationships and what can be understood about the propagation of concepts across texts that are deemed interrelated.

Acknowledgements

We thank the anonymous reviewers for helpful comments, and Hui Wang and Niall Rooney for the survey of kernel methods for clustering and classification of text documents in Section “Machine Learning Task: Classification”.

Author information

Authors and Affiliations

Department of Computer Science, Kansas State University, Manhattan, KS, 66506, USA
Joshua L. Weese & William H. Hsu
School of Arts and Humanities, University of Texas at Dallas, Richardson, TX, 75080, USA
Jessica C. Murphy
School of Arts, Technology, and Emerging Communication, University of Texas at Dallas, Richardson, TX, 75080, USA
Kim Brillante Knight

Corresponding author

Correspondence to William H. Hsu.

Editor information

Editors and Affiliations

Kansas State University, Manhattan, Kansas, USA
Shalin Hai-Jew

About this chapter

Cite this chapter

Weese, J.L., Hsu, W.H., Murphy, J.C., Knight, K.B. (2017). Parody Detection: An Annotation, Feature Construction, and Classification Approach to the Web of Parody. In: Hai-Jew, S. (ed.) Data Analytics in Digital Humanities. Multimedia Systems and Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-54499-1_3

DOI: https://doi.org/10.1007/978-3-319-54499-1_3
Published: 05 May 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-54498-4
Online ISBN: 978-3-319-54499-1
eBook Packages: Computer Science, Computer Science (R0)
