Patterns and Trends in Semantic Predications

Chen, Chaomei; Song, Min

doi:10.1007/978-3-319-62543-0_8

Abstract

We demonstrate a series of studies of semantic predications from Semantic MEDLINE, including the detection of semantic predications that exhibit burstness or that are associated with conflicting, contradictory, or other sources of uncertainty in scientific knowledge. Semantic networks of predications are analyzed within the framework of structural variations. The examples in this chapter represent scientific knowledge at a level of granularity that differs from studies of scientific knowledge at the level of articles or journals of scholarly communication.

Semantic MEDLINE Database

The backend of Semantic MEDLINE is the Semantic MEDLINE Database (SemMedDB) (Kilicoglu et al. 2012). As of December 31, 2016, SemMedDB contained about 89.2 million predications extracted by SemRep from 26.7 million MEDLINE bibliographic records. Its primary coverage is the biomedical literature. The current version of SemMedDB is semmedVER30.

Representing Semantic Predications as a Graph

SemMedDB contains several tables covering citations (in the MEDLINE sense of the term, i.e. the metadata of a published article), original sentences, and predications. For example, the SENTENCE table stores information on individual sentences, such as SENTENCE_ID, PMID (PubMed ID), and the sentence itself. The PREDICATION table stores information about predications, such as PREDICATION_ID, SENTENCE_ID, PMID, PREDICATE, SUBJECT_CUI, SUBJECT_NAME (the preferred name of the subject of the predication), and similar fields for the object of the predication. We loaded SemMedDB version 24 into a MySQL database, and the examples below are based on this version. Figure 8.1 shows a visualization of a network of semantic predications in Neo4j, a graph database. The visualization shows that semantic connections are unevenly distributed: some entities are connected by many semantic relations, whereas others are connected by only a few. The unevenness implies a level of uncertainty.
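To make the graph view of predications concrete, the following is a minimal Python sketch of how rows from the PREDICATION table could be turned into a directed graph and how the unevenness of connections could be inspected. The rows are placeholder values for illustration only and are not taken from SemMedDB; the tuple layout (SUBJECT_NAME, PREDICATE, OBJECT_NAME) and the helper names `build_graph` and `degree_counts` are assumptions made for this sketch.

```python
from collections import Counter, defaultdict

# Placeholder rows shaped like (SUBJECT_NAME, PREDICATE, OBJECT_NAME).
# Illustrative values only; they are NOT real SemMedDB records.
rows = [
    ("Entity A", "CAUSES", "Entity B"),
    ("Entity A", "AFFECTS", "Entity C"),
    ("Entity A", "INTERACTS_WITH", "Entity D"),
    ("Entity D", "INHIBITS", "Entity E"),
    ("Entity F", "PREVENTS", "Entity B"),
]


def build_graph(predications):
    """Map each subject to its outgoing (predicate, object) edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in predications:
        graph[subject].append((predicate, obj))
    return graph


def degree_counts(predications):
    """Count how many predications each entity takes part in."""
    degrees = Counter()
    for subject, _predicate, obj in predications:
        degrees[subject] += 1
        degrees[obj] += 1
    return degrees


graph = build_graph(rows)
degrees = degree_counts(rows)

# Entities listed from most to least connected.
for entity, degree in degrees.most_common():
    print(f"{entity}: {degree}")
```

Even in this toy example, one entity takes part in three predications while several take part in only one, which is the kind of uneven distribution the visualization makes apparent at scale.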

A distinct advantage of a graph database over a traditional relational database is reduced query complexity. As illustrated in Table 8.1, a complex and time-consuming query with multiple table joins in a relational database can be reduced to a simple and efficient query in Neo4j using Cypher, the query language that Neo4j supports. The query finds paths that start with a doctor node and connect to a therapy node through at least four other types of nodes in between. Stylistically, a Cypher query shares some similarities with SQL queries such as those written for MySQL.

Table 8.1 The complexity of a query can be reduced in a graph database
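The path query described above can be illustrated outside of any database with a short Python sketch. It is not the query from Table 8.1; it is a plain depth-first search, under the simplifying assumption that "at least four other types of nodes in between" is approximated by counting at least four intermediate nodes. The node names, the `find_paths` function, and the `max_between` bound are all hypothetical choices for this example.

```python
def find_paths(graph, start, goal, min_between=4, max_between=6):
    """Return simple paths from start to goal that pass through
    at least min_between and at most max_between intermediate nodes."""
    paths = []
    stack = [[start]]
    while stack:
        path = stack.pop()
        between = len(path) - 1  # intermediate nodes visited so far
        for neighbor in graph.get(path[-1], []):
            if neighbor == goal:
                # Accept the path only if it is long enough.
                if between >= min_between:
                    paths.append(path + [goal])
            elif neighbor not in path and between < max_between:
                # Extend the path without revisiting a node.
                stack.append(path + [neighbor])
    return paths


# Hypothetical toy graph: adjacency lists keyed by node name.
toy_graph = {
    "doctor": ["clinic", "therapy"],  # the direct link is too short
    "clinic": ["patient"],
    "patient": ["diagnosis"],
    "diagnosis": ["disease"],
    "disease": ["therapy"],
}

for path in find_paths(toy_graph, "doctor", "therapy"):
    print(" -> ".join(path))
```

In a relational database, each additional hop in such a path typically requires another join, whereas a graph traversal simply follows edges. The upper bound `max_between` is included here only to keep the search finite on larger graphs.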
