Abstract
Author identification is a hot topic, especially in the Internet age. In previous work we proposed a novel approach to this problem, based on relational representations that take into account the structure of sentences. Here we present a tool that computes and visualizes a numerical and graphical characterization of authors and texts based on several linguistic features. This tool, which extends a previous language analysis tool, is the ideal complement to the author identification technique, which is based on a clustering procedure whose outcomes (i.e., the authors’ models) are not human-readable. Both approaches are unsupervised, which allows them to tackle problems to which other state-of-the-art systems are not applicable.
Acknowledgments
The authors would like to thank Fabio Leuzzi, Fulvio Rotella and Domenico Grieco for their work in setting up the system and running the experiments. This work was partially funded by the Italian PON 2007–2013 project PON02_00563_3489339 “Puglia@Service”.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Ferilli, S., Redavid, D., Esposito, F. (2016). Unsupervised Author Identification and Characterization. In: Calvanese, D., De Nart, D., Tasso, C. (eds) Digital Libraries on the Move. IRCDL 2015. Communications in Computer and Information Science, vol 612. Springer, Cham. https://doi.org/10.1007/978-3-319-41938-1_14
DOI: https://doi.org/10.1007/978-3-319-41938-1_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41937-4
Online ISBN: 978-3-319-41938-1
eBook Packages: Computer Science (R0)