Finding Paraphrase Facts Based on Coordinate Relationships

Zhao, Meng; Ohshima, Hiroaki; Tanaka, Katsumi

doi:10.1007/978-3-319-22324-7_12

Conference paper
First Online: 01 January 2015

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9052)

Abstract

We propose a method for acquiring paraphrases of a given sentence from the Web. For example, consider the input sentence “Lemon is a high vitamin c fruit”. Its paraphrases are expressions or sentences that convey the same meaning in a different syntactic form, such as “Lemons are rich in vitamin c” or “Lemons contain a lot of vitamin c”. We aim to find sentence-level paraphrases on the noisy Web rather than in domain-specific corpora. By observing the search results for these paraphrases, users can estimate how likely the sentence is to be a fact. We evaluate the proposed method on five distinct semantic relations. Experiments show that our method achieves an average precision of 60.5%, compared with 44.15% for the TE/ASE method. In addition, it acquires three more paraphrases per input than the TE/ASE method.

Notes

1. http://www.google.com.
2. http://www.bing.com.
3. The acquisition relation exists between two companies such that one company acquired the other.
4. The directorOf relation exists between a director and his works, e.g. (Steven Spielberg, Saving Private Ryan), (James Cameron, Titanic).
5. The leaderOf relation exists between a country and its current leader, e.g. (Barack Obama, U.S.), (Giorgio Napolitano, Italy).
6. The ceoOf relation exists between a company and the chief executive officer of that company, e.g. (Tim Cook, Apple), (Mark Zuckerberg, Facebook).
7. The founderOf relation exists between a person and the company he founded, e.g. (Larry Page, Google).
8. Replace entities in e with variables.
9. http://nlp.stanford.edu/software/tagger.shtml.
10. http://datamarket.azure.com/dataset/bing/search.
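
Note 8 mentions replacing the entities in an expression e with variables. The paper's own procedure is not reproduced on this page, so the following is only a minimal sketch of that idea under stated assumptions: the function name `to_template`, the placeholder names X and Y, and the case-insensitive whole-token matching are all hypothetical choices made for illustration.

```python
import re
from typing import Sequence


def to_template(sentence: str, entities: Sequence[str]) -> str:
    """Replace each entity mention in `sentence` with a placeholder variable.

    Hypothetical helper: the first entity becomes X, the second becomes Y.
    """
    template = sentence
    for entity, variable in zip(entities, ("X", "Y")):
        # Lookarounds require that the match is not glued to other word
        # characters, so "Apple" is not replaced inside "Pineapple", while
        # entities ending in punctuation such as "U.S." still match.
        pattern = r"(?<!\w)" + re.escape(entity) + r"(?!\w)"
        template = re.sub(pattern, variable, template, flags=re.IGNORECASE)
    return template


# Entity pairs taken from the ceoOf and leaderOf notes above.
ceo_template = to_template("Tim Cook is the CEO of Apple", ("Tim Cook", "Apple"))
leader_template = to_template("Barack Obama leads the U.S.", ("Barack Obama", "U.S."))
print(ceo_template)
print(leader_template)
```

Matching on token boundaries rather than raw substrings is a design choice of this sketch; it avoids accidental replacements inside longer words but does not handle aliases or inflected entity names.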

Acknowledgment

This work was supported in part by the following projects: Grants-in-Aid for Scientific Research (Nos. 24240013, 24680008) from MEXT of Japan.

Author information

Authors and Affiliations

Graduate School of Informatics, Kyoto University, Yoshida Honmachi, Kyoto, 606–8501, Japan
Meng Zhao, Hiroaki Ohshima & Katsumi Tanaka

Corresponding author

Correspondence to Meng Zhao.

Editor information

Editors and Affiliations

Soochow University, Suzhou, China
An Liu
Nagoya University, Nagoya, Japan
Yoshiharu Ishikawa
Wuhan University, Wuhan, China
Tieyun Qian
University of Hong Kong, Hong Kong, China
Sarana Nutanong
Monash University, Clayton, Victoria, Australia
Muhammad Aamir Cheema

Cite this paper

Zhao, M., Ohshima, H., Tanaka, K. (2015). Finding Paraphrase Facts Based on Coordinate Relationships. In: Liu, A., Ishikawa, Y., Qian, T., Nutanong, S., Cheema, M. (eds) Database Systems for Advanced Applications. DASFAA 2015. Lecture Notes in Computer Science, vol 9052. Springer, Cham. https://doi.org/10.1007/978-3-319-22324-7_12

DOI: https://doi.org/10.1007/978-3-319-22324-7_12
Published: 30 July 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22323-0
Online ISBN: 978-3-319-22324-7
eBook Packages: Computer Science, Computer Science (R0)
