Converting AMR resources to linked data proceeds in two steps. First, we identify or define ontologies that capture the semantics of the original AMR concepts and relations, together with their namespaces. Second, we use entity linkage techniques to map AMR entities to well-known web resources.
2.1 Representing and Linking AMR Concepts and Relations
The semantics of an AMR is primarily defined by the types of the nodes in its graph, which are PropBank frames and AMR entity types. Each PropBank frame defines the applicable roles and their specific meaning. For example, in Fig. 1 the type of the node labeled t is the PropBank frame transform-01 (defined as to “change, cause a change in state”), which has four roles: ARG0, the causer of the transformation (Agent); ARG1, the thing changing (Patient); ARG2, the end state (Result); and ARG3, the start state (Material). In our example, the agent of the transformation, ARG0, is a set of enzymes (BRAF, MEK1, RAS).
We use a simple meta-model consisting of (1) a main concept class AMR-Concept with subclasses (AMR-PropBank-Frame, AMR-Entity-Type, and AMR-Term) and (2) a main relation AMR-Role with a subrelation (AMR-PropBank-Role). We also define namespaces to make these distinctions explicit, as follows:
AMR-Core (ac) includes the core constructs of the AMR specification: the AMR concept, metadata associated with AMRs (e.g., has-annotator, in-document), and AMR-specific roles (e.g., mod, part-of). We also define an xref role to link to external entities (cf. Sect. 2.2).
AMR-PropBank-Frame (pb) includes PropBank frames (e.g., transform-01, activate-01), the roles used by PropBank frames (e.g., ARG0, ARG1, ...), and their inverses (e.g., ARG0-of, ARG1-of, ...) with corresponding inverse role assertions (e.g., pb:ARG0-of owl:inverseOf pb:ARG0).
AMR-Entity-Type (ae) includes all named entity types corresponding to common concepts in a domain (e.g., person, organization, and location in general news text, or enzyme and cell in biomedical text).
AMR-Term (at) covers additional types. AMR parsing tools and human curators are free to create entity types that are not predefined in the AMR-Entity-Type namespace. For example, concepts such as cancer and intestine in Fig. 2 are not registered as AMR entity types.
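The meta-model and namespaces above can be written down compactly. The following is a minimal sketch, not the actual AMR-LD vocabulary: the example.org namespace URIs are placeholders, and placing the meta-model classes and relations under the ac prefix is our assumption for illustration.

```python
# Hypothetical namespace URIs; the real AMR-LD URIs are not given in the text.
NAMESPACES = {
    "ac": "http://example.org/amr-core#",
    "pb": "http://example.org/amr-propbank-frame#",
    "ae": "http://example.org/amr-entity-type#",
    "at": "http://example.org/amr-term#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}


def meta_model_triples():
    """Return the meta-model as (subject, predicate, object) prefixed names.

    Assumption: the meta-model classes and relations live in the ac namespace.
    """
    triples = []
    for sub in ("AMR-PropBank-Frame", "AMR-Entity-Type", "AMR-Term"):
        triples.append((f"ac:{sub}", "rdfs:subClassOf", "ac:AMR-Concept"))
    triples.append(("ac:AMR-PropBank-Role", "rdfs:subPropertyOf", "ac:AMR-Role"))
    return triples


def inverse_role_triples(max_arg=3):
    """Inverse role assertions such as pb:ARG0-of owl:inverseOf pb:ARG0."""
    return [
        (f"pb:ARG{n}-of", "owl:inverseOf", f"pb:ARG{n}")
        for n in range(max_arg + 1)
    ]


def to_turtle(triples):
    """Serialize prefixed-name triples as a small Turtle document."""
    lines = [f"@prefix {prefix}: <{uri}> ." for prefix, uri in NAMESPACES.items()]
    lines.append("")
    lines.extend(f"{s} {p} {o} ." for s, p, o in triples)
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_turtle(meta_model_triples() + inverse_role_triples()))
```

Keeping the triples as plain tuples makes the structure easy to inspect; a real implementation would more likely build them with an RDF library.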
In our translation of AMRs to AMR-LD we closely follow the AMR design. One representational structure we deliberately altered is the naming convention used in the core AMR formulation. This convention uses a :name role to create an instance of a name concept containing one or more :opN roles that hold the string tokens of the name, e.g., :name (n3 / name :op1 "MEK1"). In AMR-LD, we replace this structure with a standard rdfs:label property.
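To illustrate the replacement of the name structure by a label, the sketch below extracts the :opN tokens from a name fragment and joins them into a single string. The function names, the regular expression, and the ex:e3 subject are hypothetical; the sketch handles only quoted tokens without embedded quotes and is not the implementation used in our tool.

```python
import re


def name_to_label(name_fragment: str) -> str:
    """Join the quoted :opN tokens of an AMR name node, ordered by N."""
    ops = re.findall(r':op(\d+)\s*"([^"]*)"', name_fragment)
    ops.sort(key=lambda pair: int(pair[0]))
    return " ".join(token for _, token in ops)


def label_triple(subject: str, name_fragment: str) -> str:
    """Emit one Turtle statement carrying the name as an rdfs:label."""
    label = name_to_label(name_fragment).replace("\\", "\\\\")
    return f'{subject} rdfs:label "{label}" .'


if __name__ == "__main__":
    # ex:e3 is a made-up identifier for the entity that owns the name node.
    print(label_triple("ex:e3", '(n3 / name :op1 "MEK1")'))
```

Multi-token names are joined with single spaces, so a fragment with :op1 "Cbl" :op2 "E3" yields the label "Cbl E3".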
A feature of the PropBank roles :ARG0, :ARG1, :ARG2, and :ARG3 is that their precise semantics may change from frame to frame. Generally, :ARG0 is the Agent and :ARG1 is the Patient, but this is not always the case, and the semantics of :ARG2, :ARG3, etc., is even more variable. In this paper we show only properties that use the generic :ARGN roles. However, the tool we describe in Sect. 2.3 can also generate frame-specific roles, such as transform-01.ARG0. The rationale is to attach precise semantics to the roles of different frames, as needed. For example, one can state that transform-01.ARG0 is a subproperty of the vnrole:26.6.1-Agent role, even though not every ARG0 is an agent.
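A minimal sketch of how frame-specific roles might be generated follows. The function names and the frame_of lookup are hypothetical. Making the frame-specific role a subproperty of the generic pb:ARGN role is our assumption for illustration; only the link to the vnrole property is taken from the example in the text.

```python
def frame_role_axioms(frame, role, vn_role=None):
    """Axioms for a frame-specific role such as pb:transform-01.ARG0."""
    specific = f"pb:{frame}.{role}"
    # Assumption: each frame-specific role specializes the generic role.
    triples = [(specific, "rdfs:subPropertyOf", f"pb:{role}")]
    if vn_role is not None:
        triples.append((specific, "rdfs:subPropertyOf", f"vnrole:{vn_role}"))
    return triples


def specialize(triples, frame_of):
    """Rewrite generic pb:ARGN predicates using the frame of the subject.

    frame_of maps a subject node to its PropBank frame; triples whose
    subject has no known frame are returned unchanged.
    """
    result = []
    for s, p, o in triples:
        if p.startswith("pb:ARG") and s in frame_of:
            p = f"pb:{frame_of[s]}.{p[len('pb:'):]}"
        result.append((s, p, o))
    return result


if __name__ == "__main__":
    for triple in frame_role_axioms("transform-01", "ARG0", "26.6.1-Agent"):
        print(" ".join(triple), ".")
```

With such axioms in place, a reasoner can infer the agent reading only for frames where it has been asserted, while data that uses the generic roles stays valid.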
2.2 Representing and Linking AMR Entities
A crucial feature of the AMR-LD representation is that it explicitly links to well-known entities in the Semantic Web using the xref property. For example, in Figs. 1 and 2 the AMR node p labeled “serpinE2” corresponds to the UniProtKB protein GDN_HUMAN and its synonymous identifier P07093. Similarly, the entity e4 labeled “RAS” corresponds to entity PF00071 in the protein family ontology. We could have used owl:sameAs to indicate these linkages. However, given the strong semantics of owl:sameAs and the difficulty of performing entity linkage accurately, we chose the more relaxed property ac:xref. The ac:in-document property also provides links into the literature. For example, the AMR <a_pmid_2094_2929.39> comes from the PubMed article pm:20942929. These linkages embed AMR data into the Semantic Web. They can significantly enhance the value of AMR corpora by leveraging existing ontologies, and they provide an entry point into linguistic resources for semantic web applications.
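The linkage triples described above can be sketched as follows. The helper names and the ex: and uniprot: prefixes are hypothetical. The rule that recovers the PubMed identifier from the AMR identifier is an assumption inferred from the single example in the text (a_pmid_2094_2929.39 and pm:20942929) and may not hold for other identifiers.

```python
import re


def pmid_from_amr_id(amr_id):
    """Guess the PubMed id embedded in an AMR id (assumed id format)."""
    match = re.search(r"pmid_(\d+)_(\d+)", amr_id)
    return match.group(1) + match.group(2) if match else None


def linkage_triples(amr_id, xrefs):
    """Build ac:in-document and ac:xref triples for one AMR.

    xrefs maps an AMR node to a list of external identifiers. The links
    use ac:xref rather than owl:sameAs, so an inaccurate match does not
    assert that the two resources are identical.
    """
    triples = []
    pmid = pmid_from_amr_id(amr_id)
    if pmid is not None:
        triples.append((f"<{amr_id}>", "ac:in-document", f"pm:{pmid}"))
    for node, targets in xrefs.items():
        for target in targets:
            triples.append((node, "ac:xref", target))
    return triples


if __name__ == "__main__":
    # ex:p stands for the AMR node p labeled "serpinE2".
    for triple in linkage_triples("a_pmid_2094_2929.39", {"ex:p": ["uniprot:P07093"]}):
        print(" ".join(triple), ".")
```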
We developed an entity linkage algorithm for common bioentities based on their labels. First, we collected protein and chemical names from existing databases, specifically the UniProt knowledge base, proteins appearing in pathways in Pathway Commons, and chemicals from NCBI’s PubChem. Then, we mapped entities appearing in BioAMRs to these resources. For short (protein) names, such as “BRAF”, we used a combination of string similarity metrics, such as edit distance and Jaccard similarity over n-grams. For efficiency, we included a blocking algorithm based on prefixes of the protein names. Our implementation used the FRIL record linkage system. For long (protein, chemical) names, such as “Cbl E3 ubiquitin ligase”, we used traditional information retrieval techniques, such as TF-IDF cosine similarity.
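The short-name matching strategy can be illustrated with a small, self-contained sketch. It does not reproduce FRIL: the equal weighting of the two metrics, the 0.7 threshold, the bigram size, and the two-character blocking prefix are all assumptions chosen for illustration.

```python
from collections import defaultdict


def ngrams(text, n=2):
    """Set of character n-grams; falls back to the whole string if too short."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)} or {text}


def jaccard(a, b, n=2):
    """Jaccard similarity over character n-gram sets."""
    x, y = ngrams(a, n), ngrams(b, n)
    return len(x & y) / len(x | y)


def edit_distance(a, b):
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def edit_similarity(a, b):
    """Case-insensitive edit distance normalized to the range [0, 1]."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - edit_distance(a, b) / longest


def build_blocks(names, prefix_len=2):
    """Blocking: group reference names by their lowercase prefix."""
    blocks = defaultdict(list)
    for name in names:
        blocks[name.lower()[:prefix_len]].append(name)
    return blocks


def best_match(query, blocks, prefix_len=2, threshold=0.7):
    """Compare the query only against names that share its prefix block."""
    candidates = blocks.get(query.lower()[:prefix_len], [])
    scored = [
        (0.5 * edit_similarity(query, c) + 0.5 * jaccard(query, c), c)
        for c in candidates
    ]
    if not scored:
        return None
    score, name = max(scored)
    return name if score >= threshold else None


if __name__ == "__main__":
    blocks = build_blocks(["BRAF", "BRCA1", "MEK1", "RAF1"])
    print(best_match("Braf", blocks))
```

Prefix blocking trades recall for speed: a variant such as "B-RAF" falls into a different block than "BRAF" and is never compared with it, so the prefix length has to be chosen with the expected name variation in mind.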
2.3 AMR-LD Open-Source Conversion Software
We developed a Python library for translating the original AMR representation to RDF. The library is hosted on GitHub. The tool provides extensions to connect to different record linkage algorithms. In our development of the bio AMR-LD corpus, we used the L2K2R2 project bioentity mapping web service. We applied this system to the Bio-AMR v0.8 data to generate the publicly available AMR-LD resource. The conversion proceeds as follows:
Generate URLs for AMR elements, qualified by appropriate namespaces.
Add RDFS classes to represent AMR Concepts, Entities, Frames and Roles.
Convert entity names to standard rdfs:label elements.
Define elements from the AMR base language, AMR named entity vocabulary, and PropBank frame repository.
Link to well-known semantic web entities using xref properties.
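The steps above can be sketched on a tiny in-memory AMR. This is an illustration under stated assumptions, not the API of our library: the convert function, its input format, the example.org base URI, the pfam prefix, and the placement of meta-model classes under ac are all hypothetical, and parsing of the AMR text notation is left out.

```python
# Hypothetical base URI; the text does not give the real one.
BASE = "http://example.org/amr/"

# Node kind -> (namespace prefix, meta-model class). Placing the
# meta-model classes under ac: is an assumption.
KINDS = {
    "frame": ("pb", "ac:AMR-PropBank-Frame"),
    "entity": ("ae", "ac:AMR-Entity-Type"),
    "term": ("at", "ac:AMR-Term"),
}


def convert(amr_id, nodes, edges, xrefs):
    """Convert one AMR to (subject, predicate, object) triples.

    nodes: {variable: (kind, concept, name or None)}
    edges: [(source variable, role, target variable)]
    xrefs: [(variable, external identifier)]
    """
    def uri(var):
        # Step 1: generate a URL for each AMR element.
        return f"<{BASE}{amr_id}#{var}>"

    triples = []
    declared = set()
    for var, (kind, concept, name) in nodes.items():
        prefix, meta_class = KINDS[kind]
        concept_name = f"{prefix}:{concept}"
        if concept_name not in declared:
            # Steps 2 and 4: declare the concept in its vocabulary.
            triples.append((concept_name, "rdfs:subClassOf", meta_class))
            declared.add(concept_name)
        triples.append((uri(var), "rdf:type", concept_name))
        if name is not None:
            # Step 3: the entity name becomes an rdfs:label.
            triples.append((uri(var), "rdfs:label", f'"{name}"'))
    for source, role, target in edges:
        prefix = "pb" if role.startswith("ARG") else "ac"
        triples.append((uri(source), f"{prefix}:{role}", uri(target)))
    for var, target in xrefs:
        # Step 5: link to well-known external entities.
        triples.append((uri(var), "ac:xref", target))
    return triples


if __name__ == "__main__":
    # Simplified fragment: a transform-01 node whose ARG0 is the entity RAS.
    nodes = {"t": ("frame", "transform-01", None), "e4": ("entity", "enzyme", "RAS")}
    for triple in convert("a1", nodes, [("t", "ARG0", "e4")], [("e4", "pfam:PF00071")]):
        print(" ".join(triple), ".")
```

The example fragment is a simplification and does not reproduce the full graph of Fig. 1, where the agent of the transformation is a set of enzymes.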