Rationale

The rapidly expanding biological datasets of physical, genetic and functional interactions present a daunting task for data visualization and evaluation [1]. Existing applications such as Pajek allow the user to visualize networks in a simple graphical format [2], but lack the features needed for functional assessment and comparative analysis between datasets. Typically, interaction networks are viewed within a graphing application, while the underlying data are manipulated elsewhere, often manually.

To address these shortfalls, we developed a network visualization system called Osprey that not only represents interactions in a flexible and rapidly expandable graphical format, but also provides options for functional comparisons between datasets. Osprey was developed with the Sun Microsystems Java Standard Development Kit version 1.4.0_02 [3], which allows it to be used both in stand-alone form and as an add-on viewer for online interaction databases.

Network visualization

Osprey represents genes as nodes and interactions as edges between nodes (Figure 1). Unlike other applications, Osprey is fully customizable and allows the user to define personal settings for the generation of interaction networks, as described below. Any interaction dataset can be loaded into Osprey using one of several standard file formats, or by upload from an underlying interaction database. By default, Osprey uses the General Repository for Interaction Datasets (The GRID [4]) as its database, from which the user can rapidly build out interaction networks. User-defined interactions are added or removed through mouse-over pop-up windows that link to the database. Networks can be saved as tab-delimited text files for future manipulation or exported as JPEG (JPG), portable network graphics (PNG) or scalable vector graphics (SVG) images [5]. The SVG image format allows the user to produce high-quality images that can be opened in applications such as Adobe Illustrator [6] for further manipulation.
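The node-and-edge representation and the tab-delimited save format can be illustrated with a minimal Python sketch. It is not Osprey's implementation (Osprey is written in Java), and the three-column layout (gene, gene, experimental system), the function names and the placeholder gene names YFG1 to YFG3 are all assumptions made for illustration; the actual Osprey file format may differ.

```python
import csv
import io
from collections import defaultdict


def load_interactions(handle):
    """Read tab-delimited rows of (gene_a, gene_b, experimental_system).

    The column layout is an assumption for this sketch, not the
    documented Osprey format.
    """
    adjacency = defaultdict(set)  # gene -> set of neighbouring genes
    edges = defaultdict(set)      # frozenset({a, b}) -> experimental systems
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 3 or row[0].startswith("#"):
            continue  # skip blank, short or comment lines
        gene_a, gene_b, system = (field.strip() for field in row[:3])
        adjacency[gene_a].add(gene_b)
        adjacency[gene_b].add(gene_a)
        edges[frozenset((gene_a, gene_b))].add(system)
    return adjacency, edges


def save_interactions(edges, handle):
    """Write one tab-delimited row per (gene pair, experimental system)."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for pair, systems in sorted(edges.items(), key=lambda kv: sorted(kv[0])):
        names = sorted(pair)
        gene_a, gene_b = names[0], names[-1]  # also handles self-interactions
        for system in sorted(systems):
            writer.writerow([gene_a, gene_b, system])


# Hypothetical example rows; the gene names are placeholders, not real data.
SAMPLE = (
    "YFG1\tYFG2\tTwo-hybrid\n"
    "YFG1\tYFG2\tAffinity Precipitation\n"
    "YFG1\tYFG3\tTwo-hybrid\n"
)
adjacency, edges = load_interactions(io.StringIO(SAMPLE))
```

Keying each edge on an unordered gene pair means that an interaction reported by several experimental systems is stored as a single edge carrying several labels.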

Figure 1

Representative Osprey network with genes colored by GO process and interactions colored by experimental system.

Searches and filters

A drawback of current network visualization systems is the inability to find an individual gene within a large graph. To overcome this problem, Osprey allows text-search queries by gene name. A further difficulty with visualization systems is the absence of functional information within the graphical interface. Osprey remedies this problem by providing a one-click link to all database fields for every displayed node, including open reading frame (ORF) name, gene aliases and a description of gene function. By default, this information is obtained from The GRID, which in turn compiles gene annotations provided by the Saccharomyces Genome Database (SGD [7]). Various filters have been developed that allow the user to query the network. For example, an interaction network can be filtered to show only those interactions derived from a particular experimental method. Current Osprey filters include source, function, experimental system and connectivity (Figure 2).
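The search and filter operations described above can be sketched in Python as simple predicates over a list of edges. This is an illustrative sketch rather than Osprey's code: the record layout, the function names, the placeholder gene and dataset names, and the reading of the connectivity filter as a minimum number of interactions per node are all assumptions.

```python
from collections import Counter


def search_genes(nodes, query):
    """Return ORF names whose name or aliases match the query, ignoring case."""
    wanted = query.lower()
    return sorted(
        orf for orf, info in nodes.items()
        if wanted == orf.lower() or wanted in (a.lower() for a in info["aliases"])
    )


def filter_by_system(edges, system):
    """Keep interactions supported by the given experimental system."""
    return [e for e in edges if system in e["systems"]]


def filter_by_sources(edges, sources):
    """Keep interactions reported by every one of the given data sources."""
    required = set(sources)
    return [e for e in edges if required <= e["sources"]]


def filter_by_connectivity(edges, min_degree):
    """Keep interactions whose two endpoints each have at least min_degree edges.

    Assumption: connectivity is taken to mean node degree in the unfiltered graph.
    """
    degree = Counter()
    for e in edges:
        degree[e["a"]] += 1
        degree[e["b"]] += 1
    return [e for e in edges
            if degree[e["a"]] >= min_degree and degree[e["b"]] >= min_degree]


# Hypothetical records; names and values are placeholders, not real data.
NODES = {
    "YFG1": {"aliases": ["ABC1"], "description": "placeholder description"},
    "YFG2": {"aliases": [], "description": "placeholder description"},
}
EDGES = [
    {"a": "YFG1", "b": "YFG2", "systems": {"Two-hybrid"}, "sources": {"set1", "set2"}},
    {"a": "YFG1", "b": "YFG3", "systems": {"Affinity Precipitation"}, "sources": {"set1"}},
    {"a": "YFG2", "b": "YFG3", "systems": {"Two-hybrid"}, "sources": {"set2"}},
    {"a": "YFG3", "b": "YFG4", "systems": {"Two-hybrid"}, "sources": {"set1"}},
]
shared = filter_by_sources(EDGES, ["set1", "set2"])
```

Because each filter takes and returns a plain list of edges, filters can be chained, for example by applying a source filter to the result of an experimental-system filter.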

Figure 2

Searches and filters. (a) Network containing 2,245 vertices and 6,426 edges from combined datasets of Gavin et al. [10], shown in red, and Ho et al. [11], shown in yellow. (b) A source filter reveals only those interactions shared by both datasets, namely 212 vertices and 188 edges.

Network layout

As network complexity increases, graphical representations become cluttered and difficult to interpret. Osprey simplifies network layouts through user-implemented node relaxation, which disperses nodes and edges according to any one of a number of layout options. Any given node or set of nodes can be locked into place in order to anchor the network. Osprey also provides several default network layouts, including circular, concentric circle, spoke and dual ring (Figure 3). Finally, for comparison of large-scale datasets, Osprey can superimpose two or more datasets in an additive manner. In conjunction with filter options, this feature allows interactions specific to any given approach to be identified.
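As an illustration of the ring-based layouts, the following Python sketch places nodes evenly on a circle and builds a dual ring in which the most highly connected nodes occupy one ring and the remaining nodes the other. The function names, parameters and the degree-based ranking are assumptions made for this sketch; Osprey's own layout code and its node relaxation are not reproduced here.

```python
import math


def circular_layout(names, radius=1.0, center=(0.0, 0.0)):
    """Place nodes at evenly spaced angles on a circle; returns name -> (x, y)."""
    cx, cy = center
    count = len(names)
    return {
        name: (cx + radius * math.cos(2 * math.pi * i / count),
               cy + radius * math.sin(2 * math.pi * i / count))
        for i, name in enumerate(names)
    }


def dual_ring_layout(degree, hub_count, inner_radius=1.0, outer_radius=2.0,
                     hubs_inside=True):
    """Put the hub_count best-connected nodes on one ring, the rest on the other.

    degree maps node name -> number of interactions. Ties are broken by name
    so that the layout is reproducible.
    """
    ranked = sorted(degree, key=lambda name: (-degree[name], name))
    hubs, others = ranked[:hub_count], ranked[hub_count:]
    hub_radius, other_radius = (
        (inner_radius, outer_radius) if hubs_inside else (outer_radius, inner_radius)
    )
    positions = circular_layout(hubs, hub_radius)
    positions.update(circular_layout(others, other_radius))
    return positions


# Hypothetical degrees for placeholder genes.
DEGREE = {"YFG1": 5, "YFG2": 4, "YFG3": 1, "YFG4": 1, "YFG5": 1}
ring = circular_layout(["YFG1", "YFG2", "YFG3", "YFG4"])
dual = dual_ring_layout(DEGREE, hub_count=2)
```

Setting hubs_inside=False swaps the two radii, which corresponds to the variant with highly connected nodes on the outside.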

Figure 3

Layout options in Osprey. (a) Circular; (b) concentric circle with five rings; (c) dual ring with highly connected nodes on the inside; (d) dual ring with highly connected nodes outside; (e) spoked dual ring.

Color representations

Osprey allows user-defined colors to indicate gene function, experimental system and data source. Genes are colored by their biological process as defined by standardized Gene Ontology (GO) annotations. Genes that have been assigned more than one process are represented as multicolored pie charts. Osprey currently recognizes 29 biological processes derived from the categories maintained by the GO Consortium [8]. Interactions are colored by experimental system along the entire length of the edge between two nodes. If a given interaction is supported by multiple experimental systems, the edge is divided into multiple colored segments, one for each system. Alternatively, interactions can be colored by data source, again with a multicolored edge if more than one source supports the interaction. When combined with filter options, a network can be rapidly visualized according to any number of experimental parameters.
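The multicolored nodes and edges can be sketched as simple geometry: a node with several GO processes is divided into pie wedges, and an edge supported by several experimental systems is divided into colored segments. In the Python sketch below, the equal division of wedges and segments, the function names and the color palette are assumptions for illustration, not a description of how Osprey draws its networks.

```python
def pie_wedges(processes, palette):
    """Split a node into equal wedges; returns (color, start_deg, end_deg) tuples."""
    if not processes:
        return []
    step = 360.0 / len(processes)
    return [(palette[p], i * step, (i + 1) * step)
            for i, p in enumerate(processes)]


def edge_segments(start, end, labels, palette):
    """Split the line from start to end into equal colored pieces.

    labels may be experimental systems or data sources; returns
    (color, segment_start, segment_end) tuples in order along the edge.
    """
    (x0, y0), (x1, y1) = start, end
    count = len(labels)
    segments = []
    for i, label in enumerate(labels):
        t0, t1 = i / count, (i + 1) / count  # fractional positions along the edge
        segments.append((
            palette[label],
            (x0 + (x1 - x0) * t0, y0 + (y1 - y0) * t0),
            (x0 + (x1 - x0) * t1, y0 + (y1 - y0) * t1),
        ))
    return segments


# Hypothetical palette; Osprey lets the user choose these colors.
PALETTE = {
    "cell cycle": "#ff0000",
    "DNA repair": "#0000ff",
    "Two-hybrid": "#00aa00",
    "Affinity Precipitation": "#ffaa00",
}
wedges = pie_wedges(["cell cycle", "DNA repair"], PALETTE)
pieces = edge_segments((0.0, 0.0), (4.0, 0.0),
                       ["Two-hybrid", "Affinity Precipitation"], PALETTE)
```

The same edge_segments function serves coloring by experimental system and by data source, since only the labels and the palette change.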

Osprey download

A personal copy of the Osprey network visualization system version 0.9.9 for use in not-for-profit organizations can be downloaded from the Osprey webpage [9]. Registration is required for the sole purpose of enabling notification of software fixes and updates. A limited version of Osprey for online interaction viewing is available at The GRID website [4]. For implementation of Osprey as an online viewer for other online interaction databases, please contact the authors.