Generating a Virtual Computational Grid by Graph Transformations

  • Barbara Strug
  • Iwona Ryszka
  • Ewa Grabska
  • Grażyna Ślusarczyk
Conference paper

Abstract

This chapter aims at contributing to a better understanding of generation and simulation problems of the grid. Towards this end, we propose a new graph structure called layered graphs. This approach enables us to use attributed graph grammars as a tool to generate at the same time both a grid structure and its parameters. To illustrate our method an example of a grid generated by means of graph grammar rules is presented. The obtained results allow us to investigate properties of a grid in a more general way.

1 Introduction

Grid computing is based on the distributed computing concept. The latter term refers to a model of computer processing that assumes the simultaneous execution of separate parts of a program on a collection of individual computing devices.

The availability of the Internet and high performance computing makes it possible to execute large-scale computations and to use data intensive computing applications in the areas of science, engineering, industry and commerce. This idea led to the emergence of the concept of Grid computing.

The term “Grid” originated in the mid-1990s to describe a collection of resources geographically distributed that can solve large-scale problems. In the foundational paper “The Anatomy of the Grid. Enabling Scalable Virtual Organizations,” Ian Foster, Carl Kesselman, and Steve Tuecke introduced the paradigm of the Grid and its main features. According to them the Grid concept can be regarded as coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations [3].

The Grid integrates and provides seamless access to a wide variety of geographically distributed computational resources (such as supercomputers, storage systems, instruments, data, service, people) from different administrative areas and presents them as a unified resource. Sharing of resources is restricted and highly controlled, with resource providers and consumers defining clearly and carefully their sharing rules. A dynamic collection of individuals, multiple groups, or institutions defined by such restrictions and sharing the computing resources of the Grid for a common purpose is called a Virtual Organization (VO) [3].

Moreover, in the Grid environment standard, open general-purpose protocols and interfaces should be used. The use of open standards provides interoperability and integration facilities. These standards must be applied for resource discovery, resource access, and resource coordination [9]. Another basic requirement of a Grid Computing system is the ability to provide the quality of service (QoS) requirements necessary for the end-user community. The Grid allows its constituent resources to be used in a coordinated fashion to deliver various qualities of service, such as response time measures; aggregated performance; security fulfillment; resource scalability; availability; autonomic features, e.g. event correlation, configuration management; and partial fail over mechanisms [10].

The Grid is used for many different applications, so it is important to be able to represent its structure appropriately. As it is a heterogeneous, changing structure, based on clusters distributed over different geographical locations, simple graphs are not powerful enough to represent such a structure. We propose to use a new graph structure called a layered graph, which is based on a hierarchical graph. This flexible and yet powerful representation can be used to implement a simulator of a grid, which would allow for testing different types of grids and different grid configurations before actually implementing them. Such an approach could lower the costs of establishing and running a grid. Moreover, having an adequate representation of a grid would allow us to investigate properties of a grid in a more general way and thus better understand its working and behaviour.

Grid generation is discussed by Lu and Dinda [2]. This approach separates topology generation and annotations, while the method proposed here, which uses graph grammars, makes it possible to generate the structure with the annotations (attributes) [6].

Section 2 presents the layered structure of the Grid. In Sect. 3, the Grid simulation is discussed. In Sect. 4, definitions concerning hierarchical graphs are presented, while in Sect. 5 layered graphs are defined. The grid structure represented by a layered graph and the description of rules generating such a representation are discussed in Sects. 6 and 7, respectively.

2 Layered Architecture of the Grid

The Grid can be seen as a framework composed of several layers. Its architecture can be compared to the OSI model [8]. The lowest layer in the Grid is a Fabric Layer which is responsible for providing physical resources that can be shared by the Grid: computational power, storage, network and also local services associated with them. In the OSI model the counterparts of the Grid Fabric layer are the Data Link Layer and the Physical Layer that together ensure proper communication between different types of media.

The next layer in the Grid is responsible for connectivity issues. It defines Grid services such as communication and authentication protocols which support transactions in the grid environment. A similar role in the OSI model is assigned to the Network and Transport Layers, the main function of which is the logical and physical transmission of data between endpoints in a standard way (Fig. 1).

Fig. 1 Comparison between the Grid architecture and the OSI model

The scope of responsibilities of the Resource Layer in the Grid architecture includes using protocols and security functions offered by the Connectivity Layer to perform and manage operations on individual resources in the Grid (for example initiating, securing, terminating operations). Similar functionalities are provided in the Session Layer in case of the OSI model – it enables creating and managing sessions between system elements.

Above the Resource Layer in the Grid architecture the Collective Layer can be distinguished. It is tightly coupled with the previous one due to offering collective protocols, APIs and services which are used to manage the access to multiple resources. In a similar way, the Presentation Layer in the OSI model links elements from the lower layers and provides them as a cooperating environment to the highest layer.

On the top of both architectures there is the Application Layer. In case of the Grid, in this layer toolkits, APIs and Software Development Kits (SDKs) can be defined and used to create applications dedicated to work in a grid environment. Analogously for the OSI structure, this layer provides the interface to cooperate with end users.

In the above mentioned architecture of the Grid, particular attention is paid to resource sharing and supporting cooperation between geographically distributed computers in common problem solving.

The layered architecture is a good starting point for further outlining the key elements of each layer. From the end user’s perspective the first two layers (Fabric and Connectivity) can be regarded as core physical elements which, coupled together, build the environment. These layers are managed by middleware components located in the next two layers. Management can be discussed from two different points of view. The first one takes into account only one resource at a time (this is the Resource Layer). Therefore, there should exist middleware components that perform such operations as authentication, authorization or security services. As the main advantage of the Grid is cooperation, the other middleware components should control the interoperability between low-level elements in such a way that the user’s requirements are met. These components, for instance the Grid Information Service and the Resource Broker, are a part of the Collective Layer.

3 Grid Simulation

Building the Grid environment is a long-lasting and complex process because all elements have to be adjusted to the expected user requirements. Simulators of the system are generally regarded as support tools used during work on the target environment, enabling quick verification of suggested solutions. However, in the case of high performance computing systems such as the Grid, a simulation cannot cover all features of the Grid simultaneously. Therefore, many different types of simulators have already been proposed. In this section we discuss some existing solutions.

The authors of the Brick Grid simulator were particularly interested in the analysis and comparison of various scheduling policies in a high-performance global computing setting [14]. The tool enables one to verify a wide range of resource scheduling algorithms and the behavior of networks. The main advantage of this solution is the possibility of incorporating existing global computing components via its foreign interface. The inability to build complex network topologies can be regarded as a drawback.

The GridSim toolkit focuses on investigation of effective resource allocation techniques based on computational economy [13]. The system provides a possibility to create a wide range of types of heterogeneous resources. Additionally, resource brokers have been introduced and appropriate scheduling algorithms for mapping jobs to resources are used in order to optimize system or user objectives according to their goals. The simulator has been also used to verify the influence of local economy and the global positioning on securing jobs under various pricing and demand/supply situations.

The Grid Scheduling Simulator (GSSIM) has been built on top of GridSim [7]. Its authors were interested in overcoming two issues: the lack of tools that generate workloads, events or resources, and the simultaneous modelling of different Grid levels, i.e. resource brokers and local level scheduling systems. The goals were achieved by introducing multilevel scheduling architectures with plugged-in algorithms for the Grid and local schedulers. Additionally, the simulator enables one to read existing real workloads and to generate synthetic ones on the basis of given probabilistic distributions and constraints.

Another example of a simulator is the SimGrid [1] project. The main goal of the project is to facilitate research in the area of distributed and parallel application scheduling on distributed computing platforms ranging from a simple network of workstations to Computational Grids. The authors are concerned mainly with network topology and flow of data over an available network bandwidth. However, the lack of modelling job decomposition and task parallelization characteristics can be seen as a main disadvantage of the toolkit.

To summarize, research in the area of grid simulation touches on a wide range of problems existing in real environments. However, their complexity calls for new solutions to these issues.

4 Hierarchical Graphs

Hierarchical graphs are used to represent complex objects in different domains of computer science [11, 12]. Their ability to represent the structure of an object as well as the relations of different types between its components makes them useful in representing different parts of the structure of the grid. They can represent such parts at different levels of detail at different stages of the grid construction thus hiding unnecessary low-level data and showing only a relevant part of its structure.

Hierarchical Graphs (HGs) can be seen as an extension of traditional, simple graphs. They consist of nodes and edges. What makes them different from “flat” graphs is that nodes in HGs can contain internal nodes. These nodes, called children, can in turn contain other internal nodes and can be connected to any other nodes with the only exception being their ancestors.

This property makes hierarchical graphs exceptionally well fitted to represent the structure of the grid consisting of a number of geographically distributed networks which can in turn contain other networks.

In this paper, a hierarchical node is defined as a pair \({v}_{i} = (i,{C}_{{v}_{i}})\), where i is a node identifier and \({C}_{{v}_{i}}\) is a set of the node’s children. Nodes which have no children are called simple, and are sometimes referred to as the lowest-level nodes. An edge e is a pair \(({v}_{i},{v}_{j})\), where \({v}_{i}\) and \({v}_{j}\) are nodes connected by e.

For the remaining part of this paper let X be a set of hierarchical nodes with a finite number of children. Sets of children, ancestors and a hierarchical graph are defined in the following way.

Definition 1 (Children).

  1. \(v \in X,\ {\mathit{Ch}}^{0}(v) =\{ v\}\),

  2. \({\mathit{Ch}}^{n}(v) =\{ w :\ \exists z \in \ {\mathit{Ch}}^{n-1}(v),\ w \in {C}_{z}\}\),

  3. \(\mathit{Ch}(v) ={ \mathit{Ch}}^{1}(v)\),

  4. \({\mathit{Ch}}^{+}(v) ={ \bigcup \nolimits }_{i=1}^{\infty }{\mathit{Ch}}^{i}(v),\ {\mathit{Ch}}^{{_\ast}}(v) ={ \bigcup \nolimits }_{i=0}^{\infty }{\mathit{Ch}}^{i}(v)\).
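The sets from Definition 1 can be computed directly from the child relation. The following is a minimal sketch, assuming the hierarchy is stored as a Python dictionary that maps each node identifier to the set of its children; the dictionary `children`, the node names and the function names `ch_n`, `ch_plus` and `ch_star` are hypothetical and are not part of the original formalism.

```python
# Hypothetical hierarchy: node identifier -> set of identifiers of its children (C_v).
children = {
    "v1": {"v2", "v3"},
    "v2": {"v4"},
    "v3": set(),
    "v4": set(),
}


def ch_n(v, n):
    """Ch^n(v): nodes reached from v by descending exactly n levels."""
    level = {v}  # Ch^0(v) = {v}
    for _ in range(n):
        level = {w for z in level for w in children[z]}
    return level


def ch_plus(v):
    """Ch^+(v): union of Ch^i(v) for i >= 1, i.e. all proper descendants."""
    result, level = set(), children[v]
    while level:
        result |= level
        level = {w for z in level for w in children[z]}
    return result


def ch_star(v):
    """Ch^*(v): union of Ch^i(v) for i >= 0, i.e. Ch^+(v) together with v."""
    return {v} | ch_plus(v)
```

The loops terminate because every node in X is assumed to have finitely many children and the hierarchy is assumed to be acyclic.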

     

Definition 2 (Direct ancestor).

The direct ancestor of the node v is defined as a node w, such that \(v \in {C}_{w}\), and denoted anc(v). The ancestor of a node v that is not a child of any node will be denoted by \({v}_{\epsilon }\).
$$\mathit{anc}(v) = \left \{\begin{array}{ll} w\ \mathit{if } &\exists w\ : v \in {C}_{w} \\ {v}_{\epsilon }\ \mathit{if }&\neg \exists w\ v \in {C}_{w} \end{array} \right.$$

Definition 3 (Ancestor).

The i-th level (i ≥ 1) ancestor of the node v is defined in the following way:
$$\begin{array}{rcl}{ \mathit{anc}}^{1}(v)& =& \mathit{anc}(v), \\ {\mathit{anc}}^{i}(v)& =& \left \{\begin{array}{ll} \mathit{anc}({\mathit{anc}}^{i-1}(v))& :{ \mathit{anc}}^{i-1}(v)\neq {v}_{ \epsilon } \\ {v}_{\epsilon } & :{ \mathit{anc}}^{i-1}(v) = {v}_{\epsilon }. \end{array} \right. \end{array}$$
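Definitions 2 and 3 translate into a short lookup. The sketch below is illustrative only: it assumes the same dictionary-based hierarchy as a stand-in for the sets \({C}_{v}\), and it uses the hypothetical constant `V_EPS` (Python's `None`) to play the role of \({v}_{\epsilon }\).

```python
# Hypothetical hierarchy: node identifier -> set of identifiers of its children.
children = {"v1": {"v2", "v3"}, "v2": {"v4"}, "v3": set(), "v4": set()}

V_EPS = None  # stands for v_epsilon, the "ancestor" of a top-level node


def anc(v):
    """Direct ancestor of v, or V_EPS if v is not a child of any node."""
    for w, kids in children.items():
        if v in kids:
            return w
    return V_EPS


def anc_i(v, i):
    """i-th level ancestor of v for i >= 1 (Definition 3)."""
    if i < 1:
        raise ValueError("i must be at least 1")
    current = anc(v)  # anc^1(v)
    for _ in range(i - 1):
        if current is V_EPS:
            break  # once v_epsilon is reached, every higher level is v_epsilon too
        current = anc(current)
    return current
```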

Definition 4 (Hierarchical graph).

A hierarchical graph G is defined as a pair (V, E), where:
  • V ⊂ X is a set of nodes, such that node identifiers are unique and a node may have only one direct ancestor,

  • E is a set of edges, which do not connect nodes related hierarchically.

It is important to note that the edges between nodes are by no means limited to edges between descendants of the same node. In other words, there may exist edges between nodes having different ancestors. Such a feature can be very useful in a grid representation where nodes representing components of different parts of the grid represented by a hierarchical graph (and hence having different ancestors) can be connected to each other in different ways. For example, there can be a direct connection to a node representing a storage unit from the outside of its subnetwork. Such a connection may be represented in a hierarchical graph by an edge connecting the nodes representing the above mentioned elements.
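The conditions of Definition 4 can be checked mechanically. The sketch below is one possible reading, assuming node identifiers are dictionary keys (and therefore unique) and edges are pairs of identifiers; the function names and the example node names are hypothetical.

```python
def descendants(children, v):
    """All proper descendants of v, i.e. Ch^+(v)."""
    found, stack = set(), list(children[v])
    while stack:
        w = stack.pop()
        if w not in found:
            found.add(w)
            stack.extend(children[w])
    return found


def is_hierarchical_graph(children, edges):
    """Check the conditions of Definition 4 for a candidate graph."""
    # Each node may have at most one direct ancestor.
    seen = set()
    for kids in children.values():
        if seen & kids:
            return False
        seen |= kids
    # No edge may connect a node with one of its ancestors or descendants.
    for a, b in edges:
        if b in descendants(children, a) or a in descendants(children, b):
            return False
    return True


# Hypothetical grid: two subnetworks, with a storage unit st1 inside net1.
children = {
    "grid": {"net1", "net2"},
    "net1": {"c1", "st1"},
    "net2": {"c2"},
    "c1": set(), "st1": set(), "c2": set(),
}
```

With this data, an edge `("c2", "st1")` is accepted: it models a direct connection to a storage unit from outside its subnetwork, whereas an edge `("net1", "c1")` is rejected because it links a node to its own child.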

Parts of an object represented by a hierarchical graph correspond to subgraphs.

Definition 5 (Subgraph).

A subgraph of a hierarchical graph G = (V, E) is a hierarchical graph \(g = ({V }_{g},{E}_{g})\), such that \({V }_{g} \subset V\), \({E}_{g} \subset E\) and
  • if a node belongs to a subgraph then all its children also do: \(\forall v \in {V }_{G},\ w \in {\mathit{Ch}}^{{_\ast}}(v) : v \in {V }_{g} \Rightarrow w \in {V }_{g}\),

  • if two nodes belong to a subgraph then the edge between them also does: \(\forall {v}_{i},{v}_{j} \in {V }_{G},\ e = ({v}_{i},{v}_{j}) : {v}_{i},{v}_{j} \in {V }_{g} \Rightarrow e \in {E}_{g}\).

If a subgraph is removed from the graph, the remaining graph (called the rest graph) contains all nodes that do not belong to the subgraph and the edges that connect them. During this operation also all edges that connect nodes of the subgraph with nodes of the rest graph are removed. The set of all such edges is called embedding.

Definition 6 (Embedding).

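Let G = (V, E) be a hierarchical graph and \(g = ({V }_{g},{E}_{g})\) its subgraph. The set of edges Em is called the embedding of g in G if
  • \(\mathit{Em} \subseteq E\),

  • \(e \in \mathit{Em}\) for every edge \(e = ({v}_{i},{v}_{j})\) such that \({v}_{i} \in V - {V }_{g}\) and \({v}_{j} \in {V }_{g}\), or \({v}_{j} \in V - {V }_{g}\) and \({v}_{i} \in {V }_{g}\).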
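The embedding and the rest graph can be computed with two set comprehensions. This is a minimal sketch, assuming a flat view in which the graph is given by a set of node identifiers and a set of edge pairs, and in which the subgraph is identified by its node set; the function names are hypothetical.

```python
def embedding(edges, subgraph_nodes):
    """Edges with exactly one endpoint inside the subgraph (Definition 6)."""
    return {
        (a, b)
        for a, b in edges
        if (a in subgraph_nodes) != (b in subgraph_nodes)
    }


def remove_subgraph(nodes, edges, subgraph_nodes):
    """Return the rest graph left after removing the subgraph and its embedding."""
    rest_nodes = nodes - subgraph_nodes
    rest_edges = {(a, b) for a, b in edges if a in rest_nodes and b in rest_nodes}
    return rest_nodes, rest_edges


nodes = {"a", "b", "c", "d"}
edges = {("a", "b"), ("b", "c"), ("c", "d")}
sub = {"c", "d"}
print(embedding(edges, sub))               # the single edge crossing the boundary
print(remove_subgraph(nodes, edges, sub))  # rest graph on nodes a and b
```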

Nodes and edges in hierarchical graphs can be labelled and attributed. Labels are assigned to nodes and edges by means of node and edge labelling functions respectively, and attributes by node and edge attributing functions. Attributes represent properties of components and relations represented by nodes and edges. Formally, an attribute is a function \(a : V \rightarrow {D}_{a}\), which assigns elements of the domain of attribute a, \({D}_{a}\), to elements of V.

For the rest of this chapter, let \({R}_{V }\) and \({R}_{E}\) be sets of node and edge labels, respectively. Let A and B be sets of node and edge attributes and \({D}_{A}\) and \({D}_{B}\) be domains of attributes of nodes and edges, respectively.

Definition 7 (Labelled attributed hierarchical graph).

A labelled attributed hierarchical graph is defined as a 6-tuple \(\mathit{aHG} = (V,E,{\xi }_{V },{\xi }_{E},{\mathit{att}}_{V },{\mathit{att}}_{E})\) where:
  1. (V, E) is a hierarchical graph,

  2. \({\xi }_{V } : V \rightarrow {R}_{V }\) is a node labelling function,

  3. \({\xi }_{E} : E \rightarrow {R}_{E}\) is an edge labelling function,

  4. \({\mathit{att}}_{V } : V \rightarrow P(A)\) is a node attributing function assigning sets of attributes (i.e. functions \(a : V \rightarrow {D}_{a}\)) to nodes, and

  5. \({\mathit{att}}_{E} : E \rightarrow P(B)\) is an edge attributing function assigning sets of attributes (i.e. functions \(b : E \rightarrow {D}_{b}\)) to edges.

     

A subgraph g of a labelled attributed hierarchical graph G is defined in the same way as in Definition 5 with labelling and attributing of g defined by restrictions of respective functions in G. A labelled attributed hierarchical graph defined above may represent a potentially infinite number of grids. A given hierarchical graph G can represent a potentially infinite subset of such grids having the same structure. To represent an actual grid we must define an instance of a graph. An instance of a hierarchical graph is a hierarchical labelled attributed graph in which to each attribute a a value from the set of possible values of this attribute has been assigned. In the following, a hierarchical graph, a subgraph and an instance will mean a labelled attributed hierarchical graph, its subgraph and an instance of a labelled attributed hierarchical graph, respectively.

5 Layered Graphs

A computational grid contains different types of elements such as physical nodes (computers, computational elements), virtual ones (software, operating systems, applications), and storage elements which can be treated both as physical elements (designated hard drives) or virtual elements (residing on other physical elements). Such a structure requires using graphs which can represent both types of elements.

Moreover, some virtual elements of a grid are responsible for performing computational tasks sent to the grid, while other elements (services) are only responsible for managing the grid structure data, behaviour, and the flow of the tasks. Thus, a structure used as a representation for a grid should be able to reflect all the elements, their interconnections and interdependences.

In this contribution, we introduce a notion of a layer composed of hierarchical graphs as a formal description of one part of a grid; for example we can have a physical layer, a management layer or a resource layer. Each layer consists of one or more graphs representing parts of a grid. For example, at the physical layer a graph may represent a network located at one place. As such parts of a grid are independent of each other, they are represented by disjoint graphs. On the other hand, each such graph consists of elements that can communicate with each other, so they are represented by connected graphs.

Formally, a layer is defined in the following way:

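Definition 8 (Layer).

Let \(\mathit{Ly} =\{ {G}_{1},{G}_{2},\ldots ,{G}_{n}\}\), \(n \in \mathcal{N}\), be a family of labelled attributed hierarchical graphs, such that
  1. \({G}_{i} = ({V }_{i},{E}_{i})\) is a connected graph, \(1 \leq i,j \leq n\),

  2. \({V }_{i} \cap {V }_{j} = \varnothing \) for \(i\neq j\),

  3. \({E}_{i} \cap {E}_{j} = \varnothing \) for \(i\neq j\).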

     

Such a family will be called a layer.
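The three conditions of the layer definition can be tested as follows. This is only a sketch under simplifying assumptions: each graph is given as a flat pair of a node set and a set of undirected edge pairs, and the nesting of hierarchical nodes is ignored when connectivity is checked. The function names are hypothetical.

```python
from itertools import combinations


def is_connected(nodes, edges):
    """Undirected connectivity check by depth-first search."""
    if not nodes:
        return True
    adjacency = {v: set() for v in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for w in adjacency[stack.pop()] - seen:
            seen.add(w)
            stack.append(w)
    return seen == set(nodes)


def is_layer(graphs):
    """Every graph is connected; node and edge sets are pairwise disjoint."""
    if not all(is_connected(nodes, edges) for nodes, edges in graphs):
        return False
    return all(
        not (n1 & n2) and not (e1 & e2)
        for (n1, e1), (n2, e2) in combinations(graphs, 2)
    )


g1 = ({"a", "b"}, {("a", "b")})
g2 = ({"c"}, set())
g3 = ({"b", "d"}, {("b", "d")})  # shares node b with g1
g4 = ({"x", "y"}, set())         # two isolated nodes, not connected
```

Here `[g1, g2]` forms a layer, while `[g1, g3]` does not because the node sets overlap, and `[g4]` does not because the graph is not connected.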

A grid is a dynamic structure. New resources can be added to and removed from the overall structure at any time. Thus, many operations are performed not on the whole structure but on parts of it. In order to be able to define such operations we first have to introduce the notion of a sublayer. A sublayer consists of one or more graphs, each of them either belonging to a layer or being a subgraph of a graph belonging to a layer. Formally, such a structure is defined as:

Definition 9 (Sublayer).

Let \(\mathit{Ly} =\{ {G}_{1},{G}_{2},\ldots ,{G}_{n}\}\) be a layer, then \(\mathit{ly} =\{ {g}_{1},{g}_{2},\ldots ,{g}_{m}\}\) is a sublayer of Ly if
  • \(m \leq n\),

  • \({g}_{i}\) is a subgraph of \({G}_{i}\).

We propose to represent each of the grid layers as layers composed of hierarchical graphs. The graph layers are connected by interlayer edges which represent how the communication between different layers of a grid is carried out.

As the layered graph represents the structure (topology) of a grid, a semantics is needed to define the meaning, and thus possible uses, of its elements. Such information may include an operating system installed on a given computational element, specific software installed, size of memory or storage available, type of resource represented by a given element etc. This information is usually encoded in attributes assigned to elements of a graph. Each attribute is actually a function assigning values to attribute names. In a graph representing a given grid, each node can have several attributes assigned, each of them having one value.

Let \({\Sigma }_{E}\) be a set of edge labels and \({A}_{E}\) be a set of edge attributes.

Definition 10 (Layered graph).

A layered graph is a tuple \(\mathit{GL} = (\{{\mathit{Ly}}^{1},{\mathit{Ly}}^{2},\ldots ,{\mathit{Ly}}^{k}\},E,{I}_{L},{I}_{A})\) built over a set of layers, where
  1. \(E =\{ e\vert e = ({v}_{i},{v}_{j}),{v}_{i} \in {V }_{{G}_{i}},{v}_{j} \in {V }_{{G}_{j}} \wedge {G}_{i}\neq {G}_{j} \wedge {G}_{i} \in {\mathit{Ly}}^{i} \wedge {G}_{j} \in {\mathit{Ly}}^{j} \wedge i\neq j\}\),

  2. \({I}_{L} : E \rightarrow {\Sigma }_{E}\) is an edge labelling function,

  3. \({I}_{A} : E \rightarrow P({A}_{E})\) is an edge attributing function.

     

Elements of E are called inter-layer edges.
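The condition on inter-layer edges amounts to checking that the two endpoints lie in different layers. The sketch below assumes each layer is a list of flat graphs, each given as a pair of a node set and an edge set; the helper names and the small three-layer example are hypothetical.

```python
def layer_of(layers, v):
    """Index of the layer whose graphs contain node v, or None."""
    for index, graphs in enumerate(layers):
        if any(v in nodes for nodes, _ in graphs):
            return index
    return None


def is_inter_layer_edge(layers, edge):
    """Both endpoints exist and lie in different layers."""
    i, j = layer_of(layers, edge[0]), layer_of(layers, edge[1])
    return i is not None and j is not None and i != j


# Hypothetical three-layer example; each layer holds a single graph.
resource_layer = [({"broker", "index"}, {("broker", "index")})]
management_layer = [({"cm1"}, set())]
computing_layer = [({"c1", "rt1"}, {("c1", "rt1")})]
layers = [resource_layer, management_layer, computing_layer]
```

In this example `("broker", "cm1")` qualifies as an inter-layer edge, whereas `("broker", "index")` does not because both nodes belong to the same layer.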

In the above definition, the layers are numbered with superscripts and the graphs making up the layers with subscripts. Throughout this chapter the notation \({G}_{i}^{j}\) will be used to denote that a graph \({G}_{i}\) belongs to a layer \({\mathit{Ly}}^{j}\). Moreover, the notation \({\mathit{Ly}}_{g}^{i}\) will be used to denote the fact that layer i belongs to a layered graph g.

Having defined the notion of a layered graph, a layered subgraph has to be defined formally in order to be able to define operations on parts of a grid at different levels of abstraction. The definition of a layered subgraph makes use of the sublayer definition and the traditional definition of a subgraph.

Definition 11 (Layered subgraph).

A layered subgraph of \(\mathit{GL} = (\{{\mathit{Ly}}^{1},{\mathit{Ly}}^{2},\ldots ,{\mathit{Ly}}^{n}\},E,{I}_{L},{I}_{A})\) is a layered graph \(\mathit{gl} = (\{{\mathit{ly}}_{\mathit{gl}}^{1},{\mathit{ly}}_{\mathit{gl}}^{2},\ldots ,{\mathit{ly}}_{\mathit{gl}}^{m}\},{E}_{g},{I}_{{L}_{g}},{I}_{{A}_{g}})\) where
  • \(m \leq n\),

  • \({\mathit{ly}}_{\mathit{gl}}^{i}\) is a sublayer of \({\mathit{Ly}}^{i}\),

  • \({E}_{g} \subset E\),

  • \({I}_{{L}_{g}} = {I}_{L}\vert {E}_{g}\) is an edge labelling function,

  • \({I}_{{A}_{g}} = {I}_{A}\vert {E}_{g}\) is an edge attributing function.

6 Grid Representation

The layered graph defined in the previous section can contain any number of layers. In the description of the grid we will use three-layered graphs. These layers will represent the resource layer, management layer and computing (physical) layer. As we have only three layers they will be denoted, respectively, by RL, ML, and CL, instead of the \({\mathit{Ly}}^{i}\) notation.

Let \({R}_{V }\) = {C, CE, RT, ST, CM, index, services, broker, job scheduling, monitoring} (where C stands for a computer, CE for a computational element, RT for a router, ST for storage and CM for a managing unit) be a set of node labels used in a grid representation. Let \({R}_{E}\) = {isLocatedAt, hasACopyAt, actionPassing, infoPassing, taskPassing} be a set of edge labels.

An example of a layered graph representing a simple grid is depicted in Fig. 2. The top layer of this graph, layer RL, represents the main resources/services responsible for task distributing/assigning/allocating and general grid behavior. The second layer represents the elements responsible for the grid management. Each node labelled CM represents a management system for a part of a grid, for example for a given subnetwork/computing element. The management elements CM can be hierarchical, as it is shown in the example. Such a hierarchy represents a situation in which data received from the grid services is distributed internally to lower-level managing components and each of them in turn is responsible for some computational units. At the same time each CM element can be responsible for managing one or more computational elements.

Fig. 2 A layered graph representing a grid

The labels of edges are not written in this figure for clarity, instead different styles of lines representing edges are used. Each node label describes the type of a grid element represented by a given node. But grid elements, depending on their type, have some additional properties. These properties in a graph based representation are represented by attributes. Let attributes of nodes be defined on the basis of node labels. We also assume that attributes are defined for low-level nodes only. The attributes for their ancestors are computed on the basis of the children attributes.

Thus, let the set of node attributes be A = {capacity, RAM, OS, apps, CPU, class, type}. Let \({\mathit{att}}_{V }\) be an attributing function for the layered graph depicted in Fig. 2.

The sets of attributes are determined according to the two following rules:
  (R1) \({\mathit{att}}_{V }(v) = \left \{\begin{array}{l@{\quad }l} \{\mathit{RAM},\mathit{OS},\mathit{CPU},\mathit{apps}\}\ \quad &{\xi }_{V }(v) = C \\ \{\mathit{capacity},\mathit{type}\} \quad &{\xi }_{V }(v) = \mathit{ST} \\ \{\mathit{class}\} \quad &{\xi }_{V }(v) = \mathit{RT} \\ \{\mathit{load}\} \quad &{\xi }_{V }(v) = \mathit{CM} \\ \{\mathit{load}\} \quad &{\xi }_{V }(v) = \mathit{broker} \\ \{\mathit{size}\} \quad &{\xi }_{V }(v) = \mathit{index}\end{array} \right.\) for v such that \(\neg \exists w : v = \mathit{anc}(w)\),

  (R2) \({\mathit{att}}_{V }(v) ={ \bigcup \nolimits }_{w\in {\mathit{Ch}}^{+}(v)}{\mathit{att}}_{V }(w)\) for v such that \(\exists w : v = \mathit{anc}(w)\).
     
In Fig. 3 a part of the graph from Fig. 2 is shown. It represents one computational element, which is a part of a computational layer. For this graph the sets of attributes are described according to rule R1 in the following way:
$$\begin{array}{rlrlrl} {\mathit{att}}_{V }({v}_{i}) = \left \{\begin{array}{l@{\quad }l} \{\mathit{RAM},\mathit{OS},\mathit{CPU},\mathit{apps}\}\ \quad &i = 2,3,4 \\ \{\mathit{capacity},\mathit{type}\} \quad &i = 5 \\ \{\mathit{class}\} \quad &i = 6 \end{array} \right. & & \end{array}$$

Fig. 3 A part of the graph from Fig. 2 representing a single computational element

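For node \({v}_{1}\), which is a higher level node, its attributes are computed on the basis of children attributes according to rule R2. Thus, \({\mathit{att}}_{V }({v}_{1}) =\{ \mathit{RAM},\mathit{OS},\mathit{CPU},\mathit{apps},\mathit{capacity},\mathit{type},\mathit{class}\}\).

Appropriate values have to be assigned to the node attributes. Firstly, a domain for each attribute has to be defined. In this example, let \({D}_{\mathit{OS}} =\{ \mathit{Win},\mathit{Lin},\mathit{MacOS},\mathit{GenUnix}\}\), \({D}_{\mathit{apps}} =\{ \mathit{app1},\mathit{app2},\mathit{app3},\mathit{app4}\}\), \({D}_{\mathit{CPU}} =\{ \mathit{cpu1},\mathit{cpu2},\mathit{cpu3}\}\), \({D}_{\mathit{class}} =\{ c1,c2,c3,c4\}\), \({D}_{\mathit{RAM}} =\{ n \in \mathcal{N} : 0 \leq n \leq 64\}\), \({D}_{\mathit{capacity}} =\{ m \in \mathcal{N} : 0 \leq m \leq 1000\}\), \({D}_{\mathit{type}} =\{ \mathit{FAT},\mathit{FAT32},\mathit{NTFS},\mathit{ext3},\mathit{ext4},\mathit{xfs}\}\). The two numerical attributes, RAM and capacity, are expressed in gigabytes. In the example, the values of attributes are as follows: \(\mathit{RAM}({v}_{2}) = 4\), \(\mathit{RAM}({v}_{3}) = 8\), \(\mathit{RAM}({v}_{4}) = 2\), \(\mathit{CPU}({v}_{2}) = \mathit{cpu1}\), \(\mathit{CPU}({v}_{3}) = \mathit{cpu2}\), \(\mathit{CPU}({v}_{4}) = \mathit{cpu1}\), \(\mathit{OS}({v}_{2}) = \mathit{Win}\), \(\mathit{OS}({v}_{3}) = \mathit{Win}\), \(\mathit{OS}({v}_{4}) = \mathit{Lin}\), \(\mathit{apps}({v}_{2}) = \mathit{app1}\), \(\mathit{apps}({v}_{3}) = \mathit{app1}\), \(\mathit{apps}({v}_{4}) = \mathit{app2}\), \(\mathit{capacity}({v}_{5}) = 500\), \(\mathit{type}({v}_{5}) = \mathit{FAT32}\), \(\mathit{class}({v}_{6}) = c2\). For the hierarchical node \({v}_{1}\), the values of the properties CPU, OS, and apps are collections containing all the values of the children of \({v}_{1}\). In the case of numerical properties, the value is the sum of the values of the children.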
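The attribute rules and the aggregation of values for the node of Fig. 3 can be reproduced in a few lines. The sketch below is an illustration under stated assumptions: collections are modelled as Python sets (so repeated values such as Win appear once), non-numerical attributes other than CPU, OS and apps are aggregated in the same way, and the names `ATTRS_BY_LABEL`, `att` and `value` are hypothetical.

```python
# Attribute sets per node label, following rule R1 (only labels used in Fig. 3).
ATTRS_BY_LABEL = {
    "C": {"RAM", "OS", "CPU", "apps"},
    "ST": {"capacity", "type"},
    "RT": {"class"},
}

# Fig. 3 example: v1 is the hierarchical node, v2..v6 are its children.
children = {"v1": ["v2", "v3", "v4", "v5", "v6"]}
labels = {"v2": "C", "v3": "C", "v4": "C", "v5": "ST", "v6": "RT"}
values = {
    "v2": {"RAM": 4, "CPU": "cpu1", "OS": "Win", "apps": "app1"},
    "v3": {"RAM": 8, "CPU": "cpu2", "OS": "Win", "apps": "app1"},
    "v4": {"RAM": 2, "CPU": "cpu1", "OS": "Lin", "apps": "app2"},
    "v5": {"capacity": 500, "type": "FAT32"},
    "v6": {"class": "c2"},
}


def att(v):
    """Attribute names of v: rule R1 for lowest-level nodes, rule R2 otherwise."""
    kids = children.get(v, [])
    if not kids:
        return set(ATTRS_BY_LABEL[labels[v]])
    return set().union(*(att(w) for w in kids))


def value(v, name):
    """Value of attribute `name` at v: sum for numbers, set of values otherwise."""
    kids = children.get(v, [])
    if not kids:
        return values[v][name]
    collected = [value(w, name) for w in kids if name in att(w)]
    if all(isinstance(x, (int, float)) for x in collected):
        return sum(collected)
    merged = set()
    for x in collected:
        merged |= x if isinstance(x, set) else {x}
    return merged
```

For the data above, `value("v1", "RAM")` gives 4 + 8 + 2 = 14 and `value("v1", "capacity")` gives 500, in line with the rule that numerical properties are summed over the children.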

7 Grid Generation

The grid represented by a layered graph can be generated by means of a graph grammar. Graph grammars are systems of graph transformations called productions. Each production is composed of two graphs named the left-hand side and the right-hand side (in short, left side and right side). The right side graph can replace the left side one if it appears in the graph to be transformed. A graph grammar allows us to generate potentially large numbers of grid structures. In order to restrict the number of generated solutions, a control diagram can be introduced. Such a diagram would define the order in which grammar productions are to be applied and thus limit the number of possible replacement operations. Moreover, a grammar contains a starting element called an axiom. In the case of graph grammars the axiom can be either a graph or a single node. It is then changed by subsequent applications of suitable productions of the grammar.

Different types of graph grammars have been investigated and their ability to generate different structures has been proved [11, 12, 5, 4].

In a grid generation process, there is a need for two categories of productions, which can further be divided into five subtypes. The productions operating on only one layer can be based on traditional productions used in graph grammars. But, as we use a more complex structure, there is a need for productions operating on several layers. Moreover, we need productions that can both add elements to and remove elements from the grid represented by the layered graph. To make sure the functionality of a grid is preserved, the productions removing elements must also be able to invoke rules responsible for maintaining the stability of affected elements. The application of a production can also require some actions to be invoked. For example, if a production removes a node representing a computational element containing a virtual resource, all other elements using this resource must be redirected to its copy. If such a copy does not exist, it must be generated on another element before the node can be removed from the grid.

The productions used to generate a grid can be divided into five main types:
  • Working on two layers, but without adding edges or nodes (e.g. dividing a manager)

  • Working on two layers, adding only edges (e.g. making a copy of a resource)

  • Working on two layers adding nodes and edges (e.g. adding a computational element)

  • Working on one layer and not requiring additional actions (e.g. adding a computer, a storage unit)

  • Working on one layer with additional actions required (e.g. removing a computer, removing a manager)

7.1 Productions Working on Two Layers

Making a copy of a resource is an example of a production operating on two layers, as the resource of which a copy is to be made is located on the resource layer, while the computer it is placed on and the one where a copy is to be placed are located on the computing layer. This production is depicted in Fig. 4. The left side of the production consists of two layers, CL and RL. On the resource layer a node labeled service is depicted. This node can be matched with any node representing a service and belonging to the resource layer. On the computing layer two nodes labelled by C are shown. The one connected to a service node represents the node on which this service is located. The second one represents any other computer on the computing layer. Such a definition of the left side guarantees that the copy will be made on a computer different from the one on which a considered service is already present. When a subgraph isomorphic with the left side graph is found in a graph representing a grid it can be replaced by the right side graph within a process called production application. By applying this production a new edge is added to a current graph. This edge represents the fact that a copy of the resource is placed on a computer to which this edge connects a service node.
Fig. 4 The production responsible for making a copy of a resource

It must be noted here that when the subgraph isomorphic with the left side is found in the current graph, there may exist other edges connecting the nodes matched to the nodes of the left side with other nodes in the graph. These edges are not affected by the application of the production.
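The matching and replacement steps of this production can be illustrated with a minimal Python sketch. The encoding is an assumption made only for illustration: nodes are stored as a dictionary from a node name to a (layer, label) pair, interlayer edges as a set of (source, label, target) triples, and the edge labels isLocatedAt and hasACopyAt are borrowed from the discussion of embeddings in Sect. 7.2. The function names `find_copy_matches` and `apply_copy` are hypothetical and do not come from the paper.

```python
# Hypothetical minimal encoding of a layered graph (an assumption, not the
# paper's formal definition): node name -> (layer, label), plus a set of
# labelled interlayer edges (source, edge_label, target).
nodes = {
    "s1": ("RL", "service"),
    "c1": ("CL", "C"),
    "c2": ("CL", "C"),
}
edges = {("s1", "isLocatedAt", "c1")}


def find_copy_matches(nodes, edges):
    """Yield (service, host, other) triples matching the left side:
    a service on RL located at one computer, plus a different computer on CL."""
    for src, label, dst in sorted(edges):
        if label != "isLocatedAt":
            continue
        if nodes[src] != ("RL", "service") or nodes[dst] != ("CL", "C"):
            continue
        for other, layer_and_label in sorted(nodes.items()):
            if layer_and_label == ("CL", "C") and other != dst:
                yield src, dst, other


def apply_copy(edges, match):
    """Right side: add one hasACopyAt edge; every other edge is left untouched."""
    service, _host, other = match
    return edges | {(service, "hasACopyAt", other)}


match = next(find_copy_matches(nodes, edges))
new_edges = apply_copy(edges, match)
print(sorted(new_edges))
```

Because `apply_copy` only adds one triple to the edge set, any edges that connect the matched nodes to the rest of the graph are preserved, which mirrors the remark that edges outside the matched subgraph are not affected.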

Another type of production working on two layers is depicted in Fig. 5. This production models the division of a manager. It can be used when a single manager is responsible for too many computational units and one of the units is to be transferred to a different manager to speed up the operation of the grid. The node labelled CM with two edges connected to it can be matched within a considered graph to any node labelled CM with at least two edges connected to it. The node without edges can be matched to any node labelled CM other than the one matched to the first manager. No other edges are affected by this production and no additional actions are required.
Fig. 5 The production responsible for dividing a manager
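As an illustration of the manager division, the following Python sketch moves one edge from an overloaded manager to another one. The dictionary `manages`, which maps each manager node to the set of computational units it is connected to, and the function name `divide_manager` are assumptions introduced for this example only.

```python
# Hypothetical encoding (an assumption): manager node -> set of the
# computational units it is connected to.
manages = {
    "cm1": {"ce1", "ce2", "ce3"},
    "cm2": set(),
}


def divide_manager(manages, overloaded, target, unit):
    """Redirect the edge (overloaded, unit) to (target, unit).

    Mirrors the left side of the production: the overloaded manager needs
    at least two edges, and the target must be a different manager node."""
    if overloaded == target:
        raise ValueError("target must differ from the overloaded manager")
    if len(manages[overloaded]) < 2:
        raise ValueError("the overloaded manager must have at least two units")
    if unit not in manages[overloaded]:
        raise ValueError("unit is not managed by the overloaded manager")
    # Work on a copy so the input graph is not modified in place.
    result = {manager: set(units) for manager, units in manages.items()}
    result[overloaded].remove(unit)
    result[target].add(unit)
    return result


after = divide_manager(manages, "cm1", "cm2", "ce3")
print(after)
```

The total number of nodes and edges is the same before and after the call, in line with the classification of this production as one that adds neither edges nor nodes.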

The last type comprises productions working on two layers that add both nodes and edges. An example is the production adding a computational element (i.e. a hierarchical node).

7.2 Productions Working on One Layer

The production adding a computer, depicted in Fig. 6, is a good example of a production working on one layer. A computer is always added to a computational element containing at least a router. The left side graph is therefore a hierarchical one, with a node representing a computational element, labelled CE, that has one child, labelled RT, representing a router. After a copy of the left side graph is found in a graph representing a grid, it can be replaced by the right side graph. Applying this production adds a new node and a new edge to the current graph. The node represents the new computer, and the edge represents the fact that this computer is connected to the router. As with the previous productions, no other elements of the considered graph are affected. The new node must also be attributed. Its attributes are transferred from the attributing of the right side graph, so each new element is assigned the attributes of the corresponding right side element. As shown above, attributes of hierarchical nodes are computed on the basis of their children, so adding a child may change the attributes of all ancestors up the hierarchy to the topmost node.
Fig. 6 The production responsible for adding a computer to the grid

A similar production is used to add a storage element to the computational unit. As the only difference is in the label of one node, the production is not shown in this paper.
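The production adding a computer, together with the upward propagation of attributes, can be sketched in Python as follows. This is only an illustration under stated assumptions: the class `Node`, the single numeric attribute `capacity`, and the rule that a hierarchical node's attribute is the sum of its children's attributes are all hypothetical, since the concrete attributes and aggregation functions are defined elsewhere in the paper.

```python
class Node:
    """A node of a hierarchical graph; `capacity` is a hypothetical attribute."""

    def __init__(self, label, capacity=0.0, parent=None):
        self.label = label
        self.capacity = capacity
        self.parent = parent
        self.children = []
        self.links = []  # same-layer edges, e.g. computer <-> router
        if parent is not None:
            parent.children.append(self)


def recompute_upwards(node):
    """Recompute the aggregated attribute of `node` and all its ancestors.

    Assumption: a hierarchical node's attribute is the sum over its children."""
    while node is not None:
        if node.children:
            node.capacity = sum(child.capacity for child in node.children)
        node = node.parent


def add_computer(ce, capacity):
    """Apply the production: the CE node must already contain a router (RT)."""
    router = next((c for c in ce.children if c.label == "RT"), None)
    if ce.label != "CE" or router is None:
        raise ValueError("left side not matched: a CE with a router is required")
    computer = Node("C", capacity, parent=ce)  # the new node
    computer.links.append(router)              # the new edge C -- RT
    router.links.append(computer)
    recompute_upwards(ce)                      # propagate attributes upwards
    return computer


grid = Node("grid")
ce = Node("CE", parent=grid)
Node("RT", parent=ce)
add_computer(ce, 4.0)
add_computer(ce, 8.0)
print(ce.capacity, grid.capacity)
```

A production adding a storage element would differ only in the label given to the new node, which is why a single function parameterised by the label would cover both cases in such a sketch.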

Removing a computer can also be simulated using a production working on one layer. Such a production is depicted in Fig. 7. For this production to be applied, there must be at least two computers belonging to the same computational element. This requirement follows from the assumption that each computational unit must contain at least one computer, and it ensures that the assumption still holds after the production is applied. Applying this production deletes the node representing the computer being removed, together with the edge connecting it to the node representing a router. After the production is applied, some additional actions may be required. These actions are of two types.
Fig. 7 The production responsible for removing a computer from the grid

The first type concerns the attributes of the ancestors of the removed node. As in the case of adding a computer, attributes of hierarchical nodes are computed on the basis of the attributes of their children, so removing one child may change the attributes of all ancestors. This change has to be propagated up the hierarchy to the topmost node.

The second type of action concerns interlayer connections. For example, some services or their copies may have been placed on the computer being removed. To ensure the stability of the grid, a copy of each such service has to be activated and/or a new one made. The decision whether this action is actually needed is made on the basis of the embedding of the removed node. Let $Em = \{e_1, e_2, \ldots, e_n\}$ be the embedding of the removed node, where $e_i = (w, v_i)$, $w$ is the node being removed, $v_i$ is the node connected to $w$ by $e_i$, and $I_L(e_i)$ is the label of this edge. If there exists $e_i$ such that $I_L(e_i) = \mathit{hasACopyAt}$, then the production making a copy of the resource has to be applied immediately after the one removing the computer. This production has to make a copy of the resource represented in the graph by $v_i$. If there is an edge $e_i$ in the embedding such that $I_L(e_i) = \mathit{isLocatedAt}$, a copy of the service has to be found and activated, and a new copy has to be made.
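The embedding-based decision can be sketched in Python as a function that inspects the edges of the removed node and returns the follow-up actions. The sketch covers only this second type of action. The function name `plan_removal`, the action names `activate_copy` and `make_copy`, the representation of an edge as a (w, v, label) triple, and the label `connectedTo` used for the router edge are assumptions made for illustration; only the labels hasACopyAt and isLocatedAt come from the text.

```python
def plan_removal(computers_in_element, computer, embedding):
    """Return the follow-up actions needed after removing `computer`.

    computers_in_element: computers in the same computational element,
        including the one being removed (hypothetical representation).
    embedding: iterable of (w, v, label) triples, one per edge e_i = (w, v_i)
        with label I_L(e_i)."""
    if computer not in computers_in_element:
        raise ValueError("the computer does not belong to this element")
    if len(computers_in_element) < 2:
        raise ValueError("at least two computers are required in the element")
    actions = []
    for w, v, label in embedding:
        if w != computer:
            continue  # not an edge of the removed node
        if label == "hasACopyAt":
            # A copy of resource v is lost, so a new copy has to be made.
            actions.append(("make_copy", v))
        elif label == "isLocatedAt":
            # The service itself is lost: activate a copy, then make a new one.
            actions.append(("activate_copy", v))
            actions.append(("make_copy", v))
    return actions


embedding = [
    ("c1", "s1", "isLocatedAt"),
    ("c1", "s2", "hasACopyAt"),
    ("c1", "rt", "connectedTo"),  # hypothetical label for the router edge
]
actions = plan_removal({"c1", "c2"}, "c1", embedding)
print(actions)
```

Returning a list of actions rather than executing them keeps the removal production separate from the productions it triggers, so the copy-making production can be applied immediately afterwards, as the text requires.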

By applying productions of the grid grammar, a grid structure can be generated. This approach also enables us to modify the grid to model its changing nature. As the layered graph represents not only the structure but also the interconnections, it can be used to simulate the working paradigm of the grid.

The productions described above are responsible for the generation and modification of the grid structure, i.e. mainly its static side. To be able to model the behaviour of the grid, that is, its dynamic side, one more type of graph production is planned to be added. These productions will be based on a notion similar to token passing in Petri nets and will make it possible to simulate the processing of user tasks. They will be capable of simulating the actual workings of the grid.

8 Conclusions

In this chapter, a new approach to grid representation has been described. It makes use of graph structures. Layered graphs have been defined, and an example of such a graph as a grid representation has been shown. Using a graph-based representation enables us to use graph grammars as a tool to generate a grid. A grid graph grammar has been described, and some of its productions have been depicted and explained. As layered graphs are attributed, both the structure and the parameters of the grid can be generated at the same time.

The productions presented in this work are used to generate the grid. The next step will consist in adding productions that are able to simulate the work and behaviour of the grid as well. Then a new grid simulator will be implemented. As we have the possibility of modelling the structure of the environment in a dynamic way, the design of the simulator focuses on building the topology and adapting it at runtime. The basic requirement is to enable the generation of a wide range of different grid structures, described using the proposed grid grammar, for simulation purposes.

Additionally, some research is planned in the area of job execution. The explicitly defined management layer gives us the opportunity to check specific configurations or scheduling algorithms.

The simulator will be deployed and tested using a cluster environment at our institution. We plan to use a computer that has 576 cores and a modular structure. Each of its six computational units has 256 GB of memory and works as a shared-memory multiprocessor computer. Moreover, it allows for running tasks based on OpenMP technology.

Notes

Acknowledgments

The authors would like to thank Wojciech Grabski for the graphic concept of layered graphs visualization used in this paper in many figures.

References

  1. Casanova H., Legrand A. and Quinson M., SimGrid: a Generic Framework for Large-Scale Distributed Experiments, 10th IEEE International Conference on Computer Modeling and Simulation, 2008.
  2. Dong Lu, Peter A. Dinda; GridG: Generating Realistic Computational Grids, SIGMETRICS Performance Evaluation Review, Volume 30, num 4, pp. 33-40, 2003.
  3. Foster I., Kesselman C., and Tuecke S.: The Anatomy of the Grid: Enabling Scalable Virtual Organizations, International Journal of Supercomputer Applications 2001, pp. 200-220.
  4. Grabska, E. Graphs and designing. Lecture Notes in Computer Science, 776 (1994).
  5. E. Grabska, W. Palacz, Hierarchical graphs in creative design. MG&V, 9(1/2), 115-123 (2000).
  6. E. Grabska, B. Strug, Applying Cooperating Distributed Graph Grammars in Computer Aided Design, Lecture Notes in Computer Science, vol 3911, pp. 567-574, Springer, 2005.
  7. Grid Scheduling Simulator, http://www.gssim.org.
  8. Ihssan A., Sandeep G.: Grid Computing: The Trend of the Millenium, Review of Business Information Systems, Volume 11, num 2, 2007.
  9. Joseph J., Ernest M., and Fellenstein C.: Evolution of grid computing architecture and grid adoption models (http://www.research.ibm.com/journal/sj/434/joseph.pdf).
  10. Joseph J. and Fellenstein C.: Grid Computing, IBM Press, 2004.
  11. Rozenberg, G. Handbook of Graph Grammars and Computing by Graph Transformations, vol. 1 Foundations, World Scientific London (1997).
  12. Rozenberg, G. Handbook of Graph Grammars and Computing by Graph Transformations, vol. 2 Applications, Languages and Tools, World Scientific London (1999).
  13. Sulistio A., Cibej U., Venugopal S., Robic B. and Buyya R. A Toolkit for Modelling and Simulating Data Grids: An Extension to GridSim, Concurrency and Computation: Practice and Experience (CCPE), Online ISSN: 1532-0634, Printed ISSN: 1532-0626, 20(13): 1591-1609, Wiley Press, New York, USA, Sep. 2008.
  14. Takefusa A., Matsuoka S., Nakada H., Aida K., and Nagashima U., Overview of a performance evaluation system for global computing scheduling algorithms, in Proceedings of the 8th IEEE International Symposium on High Performance Distributed Computing (HPDC8), 1999, pp. 97-104.

Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Barbara Strug (1)
  • Iwona Ryszka (1)
  • Ewa Grabska (1)
  • Grażyna Ślusarczyk (1)
  1. Faculty of Physics, Astronomy and Applied Computer Science, Jagiellonian University, Cracow, Poland
