Abstract
Authorship attribution is a text classification technique used to identify the author of an unknown document by analyzing documents written by multiple candidate authors. The accuracy of author identification depends mainly on the writing styles of the authors, so selecting features that differentiate those styles is one of the most important steps in authorship attribution. Researchers have proposed various feature sets, including character, word, syntactic, semantic, structural, and readability features, to predict the author of an unknown document. Few researchers have used term weight measures in authorship attribution, although such measures have proven effective at improving the accuracy of text classification. Existing approaches to authorship attribution use the bag-of-words model to represent document vectors. In this work, a new approach is proposed in which the document weight, rather than the individual features or terms in the document, is used to represent the document vector. Experiments are carried out on a reviews corpus with various classifiers, and the results obtained for authorship attribution are better than those of most existing approaches.
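The contrast between the two representations can be illustrated with a minimal Python sketch. The abstract does not define how the document weight is computed, so everything below is an assumption made for illustration: the helper names, the toy review texts, the author labels, the use of TF-IDF as the bag-of-words term weight, and in particular the choice to compute a document weight per author as the sum of that author's relative term frequencies. It should not be read as the authors' actual method.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an illustrative simplification)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Bag-of-words baseline: one TF-IDF weight per vocabulary term."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append([tf[t] * math.log(n_docs / df[t]) for t in vocab])
    return vocab, vectors


def author_term_weights(docs, authors):
    """Hypothetical term weight: relative frequency of a term in an author's documents."""
    counts = {}
    for doc, author in zip(docs, authors):
        counts.setdefault(author, Counter()).update(tokenize(doc))
    weights = {}
    for author, c in counts.items():
        total = sum(c.values())
        weights[author] = {t: n / total for t, n in c.items()}
    return weights


def document_weight_vector(doc, weights):
    """Assumed document weight: for each author (sorted by name), the sum of
    that author's term weights over the tokens of the document."""
    toks = tokenize(doc)
    return [sum(weights[a].get(t, 0.0) for t in toks) for a in sorted(weights)]


# Toy data, invented for this sketch.
train_docs = [
    "great room and great staff",
    "staff were rude and slow",
    "great view great breakfast",
]
train_authors = ["alice", "bob", "alice"]

vocab, bow = tfidf_vectors(train_docs)
weights = author_term_weights(train_docs, train_authors)
dw = document_weight_vector("great staff", weights)

print(len(bow[0]), "bag-of-words dimensions")
print(len(dw), "document-weight dimensions")
```

Under these assumptions the bag-of-words vector has one dimension per vocabulary term, while the document-weight vector has one dimension per candidate author, so its size does not grow with the vocabulary. Either vector could then be passed to a standard classifier.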
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Reddy, P.B., Reddy, T.R., Chand, M.G., Venkannababu, A. (2018). A New Approach for Authorship Attribution. In: Satapathy, S., Tavares, J., Bhateja, V., Mohanty, J. (eds) Information and Decision Sciences. Advances in Intelligent Systems and Computing, vol 701. Springer, Singapore. https://doi.org/10.1007/978-981-10-7563-6_1
DOI: https://doi.org/10.1007/978-981-10-7563-6_1
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-7562-9
Online ISBN: 978-981-10-7563-6
eBook Packages: Engineering, Engineering (R0)