Nanoinformatics: developing new computing applications for nanomedicine

Nanoinformatics has recently emerged to address the need for computing applications at the nano level. In this regard, the authors have participated in various initiatives to identify its concepts, foundations and challenges. While nanomaterials open up the possibility for developing new devices in many industrial and scientific areas, they also offer breakthrough perspectives for the prevention, diagnosis and treatment of diseases. In this paper, we analyze the different aspects of nanoinformatics and suggest five research topics to help catalyze new research and development in the area, particularly focused on nanomedicine. We also encompass the use of informatics to further the biological and clinical applications of basic research in nanoscience and nanotechnology, and the related concept of an extended “nanotype” to coalesce information related to nanoparticles. We suggest how nanoinformatics could accelerate developments in nanomedicine, similarly to what happened with the Human Genome and other -omics projects, on issues like exchanging modeling and simulation methods and tools, linking toxicity information to clinical and personal databases or developing new approaches for scientific ontologies, among many others.


Introduction
Over the past five decades many computing methods and applications have arisen in the context of biomedicine, leading to interdisciplinary areas such as medical informatics, bioinformatics and others [1][2][3]. These biomedical-related informatics disciplines span a wide range of scientific and technological approaches to solve complex problems, including, among others, data and knowledge integration methods, biomedical ontologies and vocabularies, data and text mining, systems interoperability, DNA and RNA sequencing, medical decision support, predicting the relationships between gene mutations and diseases, the development of standards for data representation and exchange, or the development of informatics methods and tools for integrating multilevel data and creating multi-scale simulations of biomedical systems. Informaticians have successfully contributed to these areas, leading to outstanding results such as the Human Genome and other -omics projects, the computerization of clinical practice or the creation of computerized systems for decision support [4]. The authors have been active in informatics research supporting a number of projects in the past decades and pioneered significant examples such as, among others, medical expert systems [5][6][7].
After various decades of research on such biomedical systems, a challenging new field, nanomedicine, which promises to deliver scientific and technological breakthroughs that could transform medicine, is beginning to receive attention from the scientific community, including informaticians [8].
To our knowledge, no paper has been published, at the time of writing, in the computing literature about nanoinformatics. Figure 1 compares the references available in Medline for nanotechnology informatics and related fields, using the goPubmed [9] facility.
Fig. 1 A comparative analysis among the references available in Medline with the terms "Nanotechnology Informatics", "Medical Informatics", "Bioinformatics" and "Biomedical Computing", carried out with goPubmed [9]. The left side shows the number of publications per year (2000-present); the right side shows the top terms or most common keywords used in these publications
In this context, one particularly challenging issue, which remains mostly unexplored, is the application of computing to nanomedicine. To advance research in this field, the requirements for data, information and knowledge management need to be specified, and they are substantial. While research in this area is commonly associated with nanotechnology, there are many topics where computational methods are themselves critical to advance research and support professional practice. For instance, we have identified various areas where significant research in informatics applied to nanomedicine is already underway [10,11]. These can be summarized in the following non-exhaustive list:
- Nanoparticle characterization
- Modeling and simulation
- Imaging
- Terminologies, ontologies and standards
- Data integration and exchange
- Systems' interoperability
- Data and text mining for nanomedical research
- Linking nano-information to computerized medical records
- Basic and translational research
- Networks of international researchers, projects and labs
- Nanoinformatics education
- Ethical issues
Nanoinformatics has only recently emerged to address these issues, with the support of organizations like the US National Science Foundation, the National Cancer Institute and the European Commission. In this new context, (biomedical or nanomedical) nanoinformatics refers to the use of informatics techniques for analyzing and processing information about the structure and physico-chemical characteristics of nanoparticles and nanomaterials, their interaction with their environments, and their applications for nanomedicine [10,11].
Such new applications emerge at a time when genomic and personalized medicine are still gaining recognition, and they promise additional future perspectives for biomedicine. We have adopted the term "nanoinformatics" in this paper as a shorter and simpler form of the broader terms "biomedical nanoinformatics" or "nanomedical informatics". Nanoinformatics can also be related to other applications of nanotechnology, but we will address here only biomedical applications.
Given the rapidly advancing extensions of all the informatics issues arising between molecular biology and systems biology (bioinformatics and computational biology) and public health (public health bioinformatics) in biomedical practice and research, one may ask whether current informatics applications (e.g., from bioinformatics or medical informatics) are also ready to address this new area of nanomedicine. The latter is considered by many as the new frontier of medicine, operating at a different, nano level that implies significant physical and chemical differences, and it raises great challenges for research, medical practice and economic implications due to novel toxic-therapeutic tradeoffs at the nano level [12][13][14][15]. In this regard, we might recall what happened around 1995-2001, when bioinformatics contributed to the early completion of the Human Genome Project; one can conjecture that informatics will also be essential to progress in the development of nanomedicine. In addition to new computing methods and tools, the possibilities of reusing informatics methods, tools, data and lessons learned may well contribute to speeding up the development of nanomedicine.
Nanomedicine includes a large number of significant topics of practical and scientific impact, such as, for instance, the early detection of diseases like cancer, the ability to reach highly specific targets within the body, new molecular imaging methods based on the optical properties of nanoparticles, methods to control drug delivery at very low dosages, nanorobots for diagnosis and therapy, and novel approaches to overcome solubility limitations of new or existing drugs. Readers can access a large number of reports for further reference [8][12][13][14].
Let us consider an example. Some of the authors, at the Universidad Politecnica de Madrid (UPM), have developed an informatics application based on text mining techniques, which we have called "the nanotoxicity searcher", that automatically searches for toxicity information in the literature. In searches carried out for paclitaxel, information about nanotoxicity was found and combined with appropriate ontologies, such as the Nanoparticle Ontology [15] and the Foundational Model of Anatomy [16]; this information can be made available to physicians in advance of treatment or linked to the electronic health record.
This research and other efforts [17][18][19][20] have raised issues associated with managing the nanotechnology information, most notably: (a) the lack of adequate standard classifications of nanomaterials, (b) the rapidly evolving knowledge of the many complex biological, chemical and physical processes occurring at the nano level, and (c) the heterogeneity of the information content and structure of many scientific papers in the very diverse nano disciplines and subfields. All of these issues magnify the challenges of applying standard information extraction and retrieval methods to the literature without further additional knowledge of the specifics of the fields and subfields represented. New informatics approaches are needed to efficiently and effectively link information from nanomedicine, while addressing the various levels of complexity covered by research, development and translation in nanotechnology.
In this context, the authors have worked in an international collaboration to define nanoinformatics and propose a roadmap for the field [18]. We present below the main conclusions of our analysis.

A perspective for nanoinformatics applications
In Table 1, we extend the approach we previously used to analyze biomedical informatics, contrasting its medical informatics and bioinformatics components [4]. We now contrast nanoinformatics with informatics applications in biomedicine according to six different perspectives.
We expand this comparison below.

Academic disciplines: development and scientific goals
A "Workshop on Nanoinformatics Strategies," supported by the National Science Foundation, was held in Arlington, Virginia in 2007 [17], followed by various conferences. This workshop focused on practical engineering perspectives, without including existing computing infrastructures. Given the relatively short time since this meeting took place, only a few references can be found in the literature at the time of writing; however, an increasing number of web references show that nanoinformatics is evolving rapidly. Since nanotechnology is involved in issues beyond medical and biological domains, we can anticipate how the use of nanomaterials may have an impact on health issues.

Scientific content and informatics goals
The past decade has seen considerable efforts to correlate molecular and clinical data for scientific discovery, which have led to significant achievements. However, the semantic heterogeneity and the inherent complexity and uncertainty associated with linking information from these disparate biological levels have demonstrated the difficulties in fulfilling the original expectations of the application of -omics information and knowledge to clinical practice and prognosis [21,22]. With nanomedicine, these difficulties can be expected to increase, but the promise is also great. From a scientific perspective, very distinctive quantum effects and size-related phenomena take place at the nanoscale. In nanoscience, physicists and chemists have shown that bridging orders of magnitude in scale introduces additional scientific challenges. Although the basic quantum laws of physics required at the nanoscale are well known, additional models and tools are needed to compute the interactions between nanoparticles and biomolecules and to compare the results from these models with experimental measurements. For instance, research on quantum dots or the development of new nanoparticles for drug delivery benefits from new data, information, theory and models of the changes in physical characteristics of materials at the nanoscale versus in bulk, which were not previously available or known. In this context, many molecular tools in use for bioinformatics cannot be directly applied, while others can, and as a result substantial contributions from the field of multi-scale modeling and simulation hold the promise of expanding the scope of biomedical engineering and informatics in areas such as imaging.
Table 1 (excerpt)
• Need to link and integrate molecular, cellular and clinical data for scientific discovery
• Differences at the biological nano level suggest that BMI approaches will have to be modified (e.g., ontologies, data integration, simulation)
• Genomic medicine and personalized medicine more complicated than expected, but results raise great expectations
• Challenges in basic research (nanoscience) might lead to applications in clinical medicine: research on areas like nanoparticles for imaging, targeting or drug delivery
• The term "Information" at the core of biomedical informatics and its name; however, still lacking a biologically-focused information theory
• Is there a niche for translational research, from basic nanotechnology to nanomedical applications?
• Optimally biomedical informatics will translate basic research into personalized medical advice for patients
3. Integrating data and knowledge: quality, availability and collection of data, networks, databases, interoperability, semantic issues, standardization and information retrieval

Integrating data and knowledge
Structuring information in nanomedicine is essential for advancing research. The term "nanoparticle" was introduced in 2007 [23] in MeSH, the controlled vocabulary used in PubMed for organizing and indexing the biomedical literature. New taxonomies and ontologies, such as the Nanomedicine Taxonomy (NT) [24] and the Nanoparticle Ontology (NPO) [15], have been developed. Other initiatives include an ontology for discovery of new nanomaterials, the Nanotech Index Ontology, as well as the adoption of applications such as BiomedGT to map among different ontologies [19]. Efforts like the NPO can contribute to data collection, information classification, search and retrieval, and data and text mining. By annotating nanoparticles, professionals can access information from previous research and discover knowledge that might reside in such information. Such work can expand the scope of biomedical ontologies like those included in the Open Biological and Biomedical Ontologies (OBO) Foundry [25].

Tools to support professional practice
Selection of nanomaterials for medical diagnosis and therapy should be based on criteria such as size, shape, topology, composition, pharmacokinetics, biologic activity or toxicity. The nanoinformatics methods and tools that are needed would require extensions of those already developed, for example, tools for integrating -omics and clinical data. Although nanoinformatics is in its initial phase of development, the opportunities for computer specialists to contribute expertise and applications in nanomedicine will be great. Several of the present authors have already carried out research on methods to automatically extract information from the literature, primarily in bioinformatics [26] but also in medical informatics. We are now extending these efforts by working on an inventory of nano-resources, which would expand previous ones in the direction of nanoinformatics. Figure 2 shows a classical diagram reflecting the scope of biomedical informatics, with different research at various levels of granularity. We have added screenshots of our own three developments (for bioinformatics, medical informatics and nanoinformatics), each related to its specific topic.

Methods and tools to support research
Over the past decade, a large number of significant initiatives have been launched (for example, the National Centers for Biomedical Computing (USA) and the Virtual Physiological Human (VPH) programme (Europe) [28]) for supporting biomedical engineering and informatics nationally and internationally. For instance, some of the authors' work has been within the context of the VPH, which supports a large number of projects addressing modeling and simulation of various systems, organs, tissues, and cells of the body, and their linking to clinical applications. Meanwhile, related efforts have been funded in the USA, usually with basic research objectives. Both approaches, clinical and basic research, can be highly complementary. Many engineering approaches and tools that were developed for signal processing, imaging, modeling and simulation could be adapted in the future for nanoinformatics projects such as characterizing, modeling and simulating the behavior of nanoparticles used to target specific molecules within the body, or developing new imaging techniques using quantum dots for clinical research and practice, as mentioned above. Various repositories containing modeling and simulation tools are already available for nanotechnology applications (such as nanoHUB [29]) and can be expanded through adoption of computational capability. In contrast, proprietary issues regarding nanoparticle design and development will add new issues to traditional computational approaches.
Fig. 2 Examples of the automated tools developed by some of the authors at the UPM to address the automated creation of inventories of informatics resources (e.g., databases, software tools, services, etc.). Screenshots of the applications developed at the UPM (from left to right, related to the inventories of resources for the nanoinformatics, bioinformatics and medical informatics fields) are linked to specific areas from Shortliffe's diagram of the various levels of biomedical informatics [27]
The use of open source tools and data in nanoinformatics could also facilitate linkage in the future. An example of a current facility, already available in the USA, is the cancer Nanotechnology Laboratory portal (caNanoLAB) [30]. Another initiative of interest, previously mentioned, is nanoHUB, a resource for nanoscience and technology created by the National Science Foundation-funded Network for Computational Nanotechnology [29]. It includes applications, professional networking, and interactive simulation tools for nanotechnology. Similarly, some of the authors, at the University of Talca, in Chile, in collaboration with members of the Advanced Biomedical Computing Center at the NCI-Frederick, have developed a pilot database of nanoparticle structures, the Collaboratory for Structural Nanobiology (CSN) [31].

Education and training MI and BI professionals
In the case of nanomedicine, the challenges and complexities of the field appear somewhat parallel to, though possibly even more difficult than, those that faced bioinformaticians earlier in the context of the Human Genome Project. Education in nanomedicine and nanoinformatics requires adding content from areas such as advanced quantum physics and chemistry, including new models of imaging, multiscale modeling, and others. These new topics considerably extend past computational research and would add much to current curricula of medical schools. Professionals with expertise in nanoinformatics applications will have an important role as information brokers, connecting people with diverse backgrounds. In such future academic programs for education in nanoinformatics and nanomedicine, informatics tools will play a decisive role for students and professionals by helping them to represent and manage the concepts and knowledge needed, without having to become quantum physicists or chemists. Similarly, the latter can acquire knowledge about nanoinformatics to participate in research in the field. Figure 3 graphically depicts nanoinformatics in relation to Medicine and other computer-related fields.
Fig. 3 Nanoinformatics in the context of Medicine and related disciplines and fields

Five significant areas of research where BMI should influence nanoinformatics
We present below our ideas about five areas of research where the expertise and the lessons learned in the Human Genome and other -omics projects, the European Virtual Physiological Human programme, and projects related to basic and translational research could influence nanoinformatics. Since the number of topics can be very large, we have focused our attention on a few particularly significant challenges (see Fig. 4).

Area 1: data, repositories and standards
The most urgent need for a nanoinformatics infrastructure is to collect, curate, annotate, organize and archive the available data. In addition to archiving the data, expert annotations and analysis regarding its quality and extent of validity, the infrastructure should allow for a federated system of public/private databases with adequate, layered access control to allow aggregation among public and private data where possible. The development or expansion of databases, software tools or repositories for nanoparticles, and, for example, their nanotoxicity, will allow the exchange of information about the actual 3-D structures of nanoparticles and nanomaterials and data about the physical and chemical properties of nanoparticles with biomedical applications. Since BMI professionals have built thousands of biomedical databases, their experience will be very valuable in building the required database infrastructure (Table 2).
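The federated, access-controlled arrangement described for Area 1 can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the three access tiers, the record fields, the repository names and the sample entries are hypothetical and do not describe caNanoLAB or any existing system.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Access(IntEnum):
    """Hypothetical access tiers, ordered from least to most restricted."""
    PUBLIC = 0
    CONSORTIUM = 1
    PRIVATE = 2


@dataclass(frozen=True)
class NanoRecord:
    """One curated nanoparticle entry; all field names are illustrative."""
    identifier: str
    material: str
    mean_diameter_nm: float
    access: Access
    annotations: tuple = ()  # curator notes on data quality and validity


@dataclass
class Repository:
    """A single public or private database taking part in the federation."""
    name: str
    records: list = field(default_factory=list)

    def visible_to(self, clearance):
        # A record is visible when the caller's clearance covers its tier.
        return [r for r in self.records if r.access <= clearance]


def federated_query(repositories, clearance, material):
    """Aggregate matching records across repositories, respecting access tiers."""
    hits = []
    for repo in repositories:
        for rec in repo.visible_to(clearance):
            if rec.material == material:
                hits.append((repo.name, rec))
    return hits


# Placeholder entries, not real measurements.
public_db = Repository("public_db", [
    NanoRecord("NP-001", "gold", 15.0, Access.PUBLIC, ("size checked by curator",)),
    NanoRecord("NP-002", "silica", 50.0, Access.PUBLIC),
])
private_db = Repository("private_db", [
    NanoRecord("NP-101", "gold", 5.0, Access.PRIVATE),
    NanoRecord("NP-102", "gold", 30.0, Access.CONSORTIUM),
])

repos = [public_db, private_db]
public_hits = federated_query(repos, Access.PUBLIC, "gold")
consortium_hits = federated_query(repos, Access.CONSORTIUM, "gold")
```

In this sketch a caller with public clearance sees only the public gold entry, while a consortium member also sees the consortium-level entry held in the private repository. A real infrastructure would additionally need authentication, provenance tracking and agreed data standards, which are outside the scope of this illustration.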

Area 2: interoperability: semantic search and ontologies
An important need for nanoinformatics is to begin to federate the mostly isolated data silos that currently exist in different institutions. One example of such a federated system is caNanoLAB [30]. In this regard, semantic interoperability of heterogeneous information systems containing nano and other information will be a key issue in nanoinformatics, which will benefit from previous experiences [32]. Similarly, we have found in our early work on informatics methods for accessing the scientific literature related to nanomedicine that publications in the nano areas are ill structured, which makes information retrieval and extraction difficult. Improving the structure of abstracts and publications to facilitate such tasks has already been proposed in bioinformatics [33] (Table 3). A related challenge involves building classifications of nanoparticles. Such classification approaches could be very helpful in creating new hierarchies/taxonomies based on actual physical/chemical/clinical/toxic/spatial characteristics, and be supplemented by detailed structural information as it becomes available. Some of the authors are currently working to develop new "morphospatial" taxonomies or ontologies [34], analyzing various examples from biomedical imaging, which can also be applied to nanoparticles. Current ontologies are based on different types of qualitative information and knowledge, but they cannot help in managing different, quantitative, visual/graphical types of information and knowledge that are included, for instance, in shapes, forms and volumes, such as those needed for nanoparticles.
Table 2 Summary of challenges for Area 1
• Creation of a nanoinformatics infrastructure to collect, curate, annotate, organize and archive the available data
• Design of extended web nano portals, linking groups and information around the world to facilitate data sharing
• Development of repositories/databases of use cases, clinical trials experiments, databases and nano-related informatics tools with nano-data, facilitating the reuse of the data, like ArrayExpress for genomic data
• Incorporation of regulatory aspects (standards, issues related to open data and source tools, quality control)

Area 3: extension of virtual integrative physiological programs
BMI researchers have created a large number of models and simulation tools that could be reused or adapted to nanomedicine. For instance, 3-D representations and visualizations of molecular structures [35,36] can be adapted to visualize nanoparticles. Adding significant information for understanding changes in the genotype induced by interactions with nanomaterials could possibly provide new foundations and explanations for phenotypical traits. A hypothetical "nanotype", as we have named it, could include a large catalog of nanoparticles and biological targets, their interactions, potential nanotoxicities and relations to different diagnostic and therapeutic medical uses. An example is provided by the Protein Data Bank (PDB) [37], which could be redesigned and extended to include nanotechnology applications (Table 4).
Table 4 Summary of challenges for Area 3
• Reuse the large number of models and simulation tools which have been created by BMI researchers to adapt them to nanomedicine
• Create a hypothetical, extended "nanotype" to allow cataloging of nanoparticles and their biological targets, their interactions in biological environments, their potential nanotoxicities and their relation to different diagnostic and therapeutic uses
• Simulate "in silico" the effects, reactions or toxicity of new compounds or materials before "in vivo" studies and correlate both in silico and in vivo results to in vitro assay results. Multilevel simulations might predict effects of nanoparticles, provide better in vitro methods, and reduce the need for animal studies
• Initiate theoretical studies of the interactions between nanoparticles and the most common components of human cells
Table 5 Summary of challenges for Area 4
• Data and knowledge integration at the nano level, different from what has already been done between clinical and -omics data at larger scales
• The structural nature of nanomaterials and the unknown effects of many nanoparticles must be investigated independently of ontological analyses
• Imaging. A key issue is to create new contrast agents to target specific organs, functions, or cell types and new imaging methods that are based on nanotechnological advances
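The hypothetical "nanotype" proposed for Area 3 is described only conceptually. As a rough illustration, it could be modeled as a catalog that links each nanoparticle to its biological targets, interactions, potential toxicities and uses. The Python sketch below is an assumption, not a published schema: the class names, the fields and the placeholder entries (`particle_A`, `target_X` and so on) are invented for the example and carry no experimental meaning.

```python
from dataclasses import dataclass, field


@dataclass
class Interaction:
    """A recorded interaction between a nanoparticle and a biological target."""
    target: str    # biological target, e.g. a protein or cell type
    effect: str    # observed or predicted effect
    evidence: str  # "in silico", "in vitro" or "in vivo"


@dataclass
class NanotypeEntry:
    """Catalog entry for one nanoparticle (illustrative fields only)."""
    nanoparticle: str
    interactions: list = field(default_factory=list)
    toxicities: list = field(default_factory=list)
    uses: list = field(default_factory=list)  # diagnostic or therapeutic uses


class NanotypeCatalog:
    """In-memory catalog keyed by nanoparticle name."""

    def __init__(self):
        self._entries = {}

    def add(self, entry):
        self._entries[entry.nanoparticle] = entry

    def by_target(self, target):
        """Names of nanoparticles with a recorded interaction with `target`."""
        return sorted(
            name for name, entry in self._entries.items()
            if any(i.target == target for i in entry.interactions)
        )

    def evidence_levels(self, nanoparticle):
        """Kinds of evidence available for one nanoparticle."""
        return {i.evidence for i in self._entries[nanoparticle].interactions}


# Placeholder data only; these are not findings from any study.
catalog = NanotypeCatalog()
catalog.add(NanotypeEntry(
    "particle_A",
    interactions=[
        Interaction("target_X", "binding", "in silico"),
        Interaction("target_X", "binding", "in vitro"),
    ],
    toxicities=["placeholder_toxicity"],
    uses=["imaging"],
))
catalog.add(NanotypeEntry(
    "particle_B",
    interactions=[Interaction("target_Y", "uptake", "in vivo")],
    uses=["drug delivery"],
))
```

Recording the evidence level with each interaction reflects the challenge, listed in Table 4, of correlating in silico, in vitro and in vivo results. A production resource comparable to the PDB would need persistent identifiers and structural data that this sketch omits.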

Area 4: translational nanoinformatics
Given that many molecular-level processes increasingly involve new atomic-level or nano-level measurements and understanding, bioinformaticians aim to carry out translational research to transform the increasing amount of -omics data into new knowledge that can provide, among other results, personalized medical diagnosis and therapy for patients by taking into account the complexity of multi-gene and environmental interactions in human diseases. Similarly, nanomedicine requires novel insights beyond the current informatics technology that is typically focused on collecting, representing and linking information that is usually heterogeneous. Another challenge would be to develop a nomenclature for nanomaterials. Analyzing structures and information at the nano level may require incorporating new assumptions beyond those considered in BMI (Table 5).
Many current nanoinformatics applications look very similar, at least on the surface, to comparable systems already built. In this context, researchers begin to face unique challenges that nano brings to informatics as we have mentioned above, such as the polymorphic and polydispersed nature of nanomaterials and the as yet unknown effects of many nanoparticles. These variabilities imply a substantial added requirement for expert annotation and curation of data and analyses to inform scientists about the quality and reliability of the data, test methods, analyses and models used in nanomedicine. Establishing better methods for curation by digital means (e.g., wikis) could provide another means of accessing literature and data. Such challenges need to be addressed prior to semantic or ontological analyses, and may well influence the new area of translational nanoinformatics, already proposed elsewhere [38]. That is, it will be used in defining the information needed to advance the science and the translation to clinical medicine.
Table 6 Summary of challenges for Area 5
• To manage nanomedicine-related data, new standards will be needed for storing data, augmenting clinical vocabularies and terminologies or exchanging electronic medical information
• Questions related to patient safety and possible secondary effects related to the use of nanoparticles need to be addressed and managed with nanoinformatics tools
• The creation of large databases that would store nano-related information can be complemented by new approaches to building EHRs. It will require a collaborative effort from a number of researchers, including international initiatives (e.g., USA-Europe)

Area 5: linking nano information to the electronic health record (EHR)
One of the clearest challenges of nanoinformatics is to link nanomedicine-related data to patient EHRs. Diagnostic and therapeutic methods based on new nanomaterials can enhance proposals for personalized medicine, which are mostly based on -omics advances. To manage the new, nano-related information, and to create potential tools for helping with decision making, new models of EHRs must be developed (Table 6).
To accomplish this, new extensions to current standards, such as SNOMED or HL7, must be developed to incorporate nano-related information, terminologies and procedures. Then, how can researchers extract useful clinical information and predictive therapeutic rules from large sets of data obtained through clinical trials of therapeutics containing nanomaterials? How could a physician anticipate the possible therapeutic and toxic effects of nanoparticles for a specific patient?
Despite their benefits, using nanoparticles for therapeutic purposes may involve hazards for patient safety due to their potential secondary effects, which are often reported in the literature. In this context, nanoinformatics methods and techniques can significantly contribute to automatically extracting and organizing the specific nanotoxicology information available in scientific papers, and to making it available to clinicians and researchers. We have already conducted research along these lines, by applying text mining techniques to automatically identify and extract nanotoxicology-related entities from scientific articles. This includes, for instance, names and types of nanoparticles (e.g., carbon nanotubes, fullerenes), routes of exposure to the nanoparticles (e.g., inhalation, dermal contact), potential targets (e.g., organs or anatomic locations) or toxic effects of nanoparticles such as destruction or inflammation. The extracted information can be used for many different purposes, such as indexing and retrieving scientific papers with the different entities appearing in them, automatically finding relationships between the detected entities, or automatically linking and aligning existing nanoinformatics and biomedical ontologies. This kind of research actually addresses three of the five challenges suggested above: (1) data and knowledge storage and management, (2) nano-ontologies and semantic searches and (3) extending traditional Electronic Health Records to include nano-related information.
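The kind of entity extraction described in this section can be illustrated with a deliberately simple, dictionary-based Python sketch. It is not the authors' system, whose methods are not detailed here. The lexicon reuses the example terms mentioned in the text and adds a few organ names as an assumption; the function names and the test sentence are invented for the illustration, and the sentence is not a quotation from any study.

```python
import re

# Tiny illustrative lexicon. The organ names under "target" are assumptions;
# a real system would draw its terms from ontologies such as the NPO or the FMA.
LEXICON = {
    "nanoparticle": ["carbon nanotube", "carbon nanotubes", "fullerene", "fullerenes"],
    "exposure_route": ["inhalation", "dermal contact"],
    "target": ["lung", "liver", "skin"],
    "toxic_effect": ["inflammation", "destruction"],
}


def compile_patterns(lexicon):
    """Build one case-insensitive, whole-word pattern per entity type."""
    patterns = {}
    for label, terms in lexicon.items():
        # Try longer terms first so plural forms win over their singular prefix.
        ordered = sorted(terms, key=len, reverse=True)
        alternation = "|".join(re.escape(term) for term in ordered)
        patterns[label] = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)
    return patterns


def extract_entities(text, patterns):
    """Return (entity_type, matched_term) pairs in order of appearance."""
    found = []
    for label, pattern in patterns.items():
        for match in pattern.finditer(text):
            found.append((match.start(), label, match.group(0).lower()))
    found.sort()
    return [(label, term) for _, label, term in found]


PATTERNS = compile_patterns(LEXICON)

# Made-up sentence used only to exercise the code.
sentence = "Inhalation of carbon nanotubes was associated with inflammation in the lung."
entities = extract_entities(sentence, PATTERNS)
# entities == [("exposure_route", "inhalation"), ("nanoparticle", "carbon nanotubes"),
#              ("toxic_effect", "inflammation"), ("target", "lung")]
```

Dictionary lookup of this kind misses synonyms, abbreviations and new nanomaterial names, which is one reason the lack of standard classifications noted earlier makes information extraction in this literature difficult.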

Conclusions
Various fundamental issues arise when analyzing this new field of nanoinformatics. These include, for instance, the large number of different computing applications that already emerge to cope with the different areas and topics of nanomedicine and nanotechnology, the large number of papers already indexed in bibliographic databases for nanotechnology and nanomedicine, the number of companies and nanotechnologists working with nanotechnological and nanomedical issues, the medical expectations of nanomedicine-promising to deliver breakthrough advances in various aspects of biomedicine-and the economic and ethical implications of this research. The growing importance of the field will lead professionals in the area to develop, in one way or another, the necessary nanoinformatics approaches. These will pose a wide range of new research challenges.
By analyzing nanoinformatics, we can highlight various strong issues that involve building large databases, developing data and information standards, creating and mapping domain ontologies, linking related information (formerly, biological, and now, nano-related) to the computerized medical record, or assimilating new medical imaging techniques. In this regard, some unique scientific issues arise.
The enormous challenges that nanotechnology and nanomedicine present require a significant investment to accelerate current research. Informatics requirements are similar to what was faced in the genomic and post-genomic research projects that transformed biomedicine. In this regard, the current activity of the authors, from three different continents, working together over the last 3 years in various nanoinformatics-related research activities, illustrates the opportunities that international collaborations can provide to advance this new field.