1 Introduction

In large-scale distributed environments, reliability is a key requirement. One way to achieve it is to replicate critical resources. Replication enhances the availability of critical resources in the system and also provides fault tolerance. However, the biggest concern in any replication mechanism is maintaining consistency among the replicas of the same resource. There have been many attempts in the past to address this issue. Various update propagation policies exist for communicating changes made in a file to its replicas. The two major policies are Write Update and Write Invalidate, and each has its own advantages and disadvantages.

Multiple copies of a file give rise to conflicts between file replicas, known as inconsistency. To keep a replicated file consistent, all replicas of the file must have the same content at any given time; that is, changes made to one replica must be reflected on the other replicas immediately (in zero time). This is practically impossible because of network delays. Conventional approaches to modification propagation maintain a master replica, and any change made to any other replica has to be propagated immediately to the master replica. The proposed approach reduces this overhead by notifying all replicas that the most recently modified replica (r1) of file (f) is now the new master replica, or new owner, of file (f).

As a result, the system can handle a large number of requests, since several replicas of the file exist. The model proposed in this paper avoids unnecessary file replication and addresses the following issues:

  • Preventing the creation of a file if a copy of the requested file is available on a peer File Replication Server (FRS).

  • Dynamic file replication on peer FRSs, based on file access frequency.

  • Handling file requests after a node failure without user intervention.

The model uses asynchronous communication, so the system keeps accepting requests without blocking. It provides fault tolerance by automatically connecting the user to another FRS if one FRS fails.

Reputation systems [1] provide a way of building trust by using community based feedback about the past experiences of peers to help make recommendations and judgments about the quality and reliability of transactions. The challenge in building such a reputation based trust mechanism in a distributed system is how to cope effectively with various malicious behaviors of peers, such as providing fake or misleading feedback about other peers. Another challenge is how to incorporate various contexts in building trust, as they vary across communities and transactions. Further, the effectiveness of a trust system depends not only on the factors and metrics for building trust, but also on the implementation of the trust model. Most existing reputation mechanisms require a central server for storing and distributing reputation information. It remains a challenge to build a decentralized trust management system that is efficient, scalable, and secure in both trust computation and trust data storage and dissemination.

The rest of the paper is organized as follows. The next section presents a brief literature survey of existing theories and work done so far. Section 3 discusses the proposed trust based approach. Section 4 discusses the replication and consistency maintenance model, followed by simulations and results in Section 5. Finally, Section 6 concludes the work, followed by references.

2 Related work

Cohen and Shenker [2] consider static replication in combination with a variant of Gnutella searching. Static strategies are applied for replication when there is little to gain from dynamic strategies. Dynamic strategies are able to recover from failures such as network partitioning and adapt easily to changes in demand, bandwidth and storage availability. Clark et al. [3] replicate objects both on insertion and on retrieval along the path from the initiator to the target, mainly for anonymity and availability purposes. Wolfson et al. [4] address data replication and consider adaptive replication algorithms that change the replication scheme of an object to reflect the read-write patterns and eventually converge towards the optimal scheme. The adaptive data replication algorithm aims at decreasing bandwidth utilization and latency by moving data closer to clients. Similarly, locality aware file replication is proposed by Cheng and King [5] to ensure data reliability and availability through the parallel I/O system. To ensure synchronized file replication across two loosely connected file systems, Rao and Skarra [6] developed a transparent service model that propagates the modifications of replicated files and directories from either file system. In the primary-copy (master–slave) approach to updating replicas, only one copy (the master) can be updated, and secondary copies are updated lazily. Only one replica always has all the updates; consequently, the load on the primary copy (master replica) is large. Domenici [7] discusses several replication and data consistency solutions, including eager (synchronous) and lazy (asynchronous) replication, single-master and multi-master models, and pull-based and push-based consistency mechanisms. The author presents various replication and consistency maintenance algorithms for dealing with huge scientific data. Guy et al. [8] propose a replica modification approach in which a replica is designated either as a master or as a secondary replica. Only the master replica may be modified, whereas a secondary replica is treated as read-only, i.e. modification permission on a secondary replica is denied. A secondary replica is updated in accordance with the master replica whenever the master replica is modified. Sun and Xu [9] propose two coherence protocols, viz. lazy-copy and aggressive-copy. In the lazy-copy based protocol, replicas are updated only as needed, when someone accesses them. Huang et al. [10] propose differentiated replication to improve access performance and replica availability. They focus on performance, availability and consistency, but their consistency maintenance algorithm does not take storage capacity into account: replicas that are not accessed by grid users for a long time waste free space on the storage device. Düllmann et al. [11] propose a high-level replica consistency service, called Grid Consistency Service (GCS). The GCS supports file updates and consistency maintenance. The literature proposes several different consistency levels, ranging from entirely synchronized data to loosely synchronized data. Grid users can choose among different consistency services, dynamically adjusting the degree of replica consistency. Hu et al. [12] propose an asynchronous model that avoids replica inconsistency in a grid environment despite system failures or network traffic congestion. The consistency problem discussed in the literature can be classified into two kinds: metadata replica consistency and data content consistency.

There has been some recent research on reputation and trust management in distributed systems. Aberer and Despotovic [13] were among the first to propose a reputation based management system. However, their trust metric simply summarizes the complaints a peer receives and files, and is very sensitive to the skewed distribution of the community and to misbehavior of peers. Chen and Singh [14] differentiate ratings by the reputation of the raters, which is computed based on the majority opinion of the ratings. In their method, adversaries who submit dishonest feedback can still gain a good reputation as raters simply by submitting a large amount of feedback and becoming the majority opinion. P2PRep, proposed by Cornelli et al. [15], is a protocol in which servents can keep track of information about the reputation of other peers and share it with others. Their focus is on providing a protocol that complements existing protocols, as demonstrated on top of Gnutella. However, the paper offers no formalized trust metric and no experimental results validating their approach. Dellarocas [16] proposes mechanisms to combat two types of cheating behavior when submitting feedback. The basic idea is to detect and filter out exceptions in certain scenarios using cluster-filtering techniques. The technique can be applied to feedback-based reputation systems to filter out suspicious ratings before aggregation. Another work is EigenTrust, proposed by Kamvar et al. [17]. Their algorithm again focuses on a Gnutella-like P2P file sharing network. They base their approach on the notion of transitive trust and address the collusion problem by assuming that there are peers in the network that can be pre-trusted. While the algorithm showed promising results against a variety of threat models, we argue that pre-trusted peers may not be available in all cases and that a more general approach is needed.
Another shortcoming of their approach is that the implementation of the algorithm is very complex and requires strong coordination and synchronization of peers. One proposal specifically attempts to address the issue of feedback quality. A recent paper by Miller et al. [18] proposes a mechanism, based on budget balanced payments in exchange for feedback, that provides strict incentives for all agents to tell the truth. This provides yet another approach to the problem of feedback trustworthiness. However, such a mechanism is vulnerable to collusion. Sen and Sajja [19] propose a word-of-mouth reputation algorithm to select service providers. Their focus is on allowing a querying agent to select one of the high-performance service providers with a minimum probabilistic guarantee. Zacharia and Maes [20] propose an approach that is an approximation of game-theoretic models and study the effects of feedback mechanisms on markets with dynamic pricing using simulation modeling. The basic idea is to generate trust values describing the trustworthiness, reliability, or competence of individual nodes, based on some monitoring schemes. Such trust information is then used for malicious node detection [21], and even time synchronization. Although a few works study one or several possible vulnerabilities [22] in e-commerce and P2P applications, there is a lack of systematic treatment of this problem. There has been a great deal of confusion on the topic of trust. Many researchers recognize trust as an essential element in security solutions for distributed systems [23], but it is still not clear what trust is and how exactly trust can benefit network security [24]. KeyNote is a well-known trust management system [25] designed for various large and small-scale Internet-based applications. It provides a single, unified language for both local policies and credentials. It has several shortcomings with respect to trust negotiation [26].
Trust-Builder [27] provides a broad class of negotiation policies, as well as a policy- and language-independent negotiation protocol that ensures the interoperability of defined policies within the Trust-Builder architecture. Gwertzman and Seltzer [28] state that most wide-area replication schemes are client initiated: decisions on when and where to replicate files are made without the benefit of the server's global knowledge of the situation. The authors believe that the server should play a role in making these replication decisions, and propose geographical push-caching as a way of bringing the server back into the loop. Hitoshi et al. [29] propose a file clustering based replication algorithm for grid file systems. The algorithm groups files according to a relationship of simultaneous accesses between files and stores replicas of the clustered files on storage nodes, with the aim of minimizing the expected future read access times to the clustered files and the replication times for individual files under the given storage capacity limitation. Hisgen et al. [30] examine the Echo distributed file system. The primary goals of Echo are to explore issues of scaling, availability, and performance. For scaling and uniformity of access, Echo provides a global, hierarchical name space. Replication is used for availability. Performance is achieved by distributed caching on clients and by using a log on the file server to reduce disk seeks. Hurley and Yeap [31] establish that file replication and migration can be used simultaneously to potentially provide significant performance benefits over a system without file migration or replication. File replication can be viewed as a natural extension of file migration, and thus a dynamic file replication policy based on an established file migration heuristic is derived.

3 Proposed approach

3.1 Trust based security service design

This section gives an overview of the Trust Management Service and discusses the main components of the system. It also identifies the functionalities of the components and the interdependencies between them. Once a node registers itself, it becomes part of the elite FRS community as a Service Provider (SP) or Service Requestor (SR). Figure 1 presents the components of the Trust Management Service. The major features of the different modules are listed below:

  • TNI: Trusted Node Identifier

  • SR: Service Requestor

  • SP: Service Provider

  • - - - -: Nodes registered with TNI

  • ____: Shows the logical connection between nodes

Fig. 1 Security model for distributed

3.2 Message types

  • M1: SUKREQUEST (SeqNo, DestNode, TOS)

  • M2: SUKREPLY (SeqNo, SUK, TrustValue)

  • M3: PAIDREG ()

  • M4: UNPAIDREG (Referral)

  • M5: ACK (Type, Message)

  • M6: GET (E (SUK, Request))
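The six message types can be written down as plain record types. The following is a minimal sketch only: the field types, the Python class names, and the reading of TOS as a type-of-service string are assumptions for illustration, since the paper lists only the field names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SukRequest:
    """M1: SUKREQUEST (SeqNo, DestNode, TOS)."""
    seq_no: int
    dest_node: str
    tos: str  # assumed to be a type-of-service label; not defined in the source


@dataclass(frozen=True)
class SukReply:
    """M2: SUKREPLY (SeqNo, SUK, TrustValue)."""
    seq_no: int
    suk: bytes
    trust_value: float


@dataclass(frozen=True)
class PaidReg:
    """M3: PAIDREG (), which carries no fields."""


@dataclass(frozen=True)
class UnpaidReg:
    """M4: UNPAIDREG (Referral)."""
    referral: str  # ID of the referring node (type assumed)


@dataclass(frozen=True)
class Ack:
    """M5: ACK (Type, Message)."""
    ack_type: int
    message: str


@dataclass(frozen=True)
class Get:
    """M6: GET (E (SUK, Request)); the request is already encrypted under the SUK."""
    encrypted_request: bytes
```

A reply can be matched to its request through the shared sequence number, for example `SukReply(seq_no=req.seq_no, ...)`.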

3.3 Acknowledgement types

See Table 1.

Table 1 Acknowledgement types

3.4 Data structure used

Table 2 shows the data structure maintained by the TNI. Its fields are described below.

Table 2 Trust value information

Node_ID is the ID of a node registered with the TNI. Trust value is the current trust value of that node. Last update is the date on which the trust value was last modified. Permissions identifies the operations (read and write) that a node can perform on a file. Refferal_ID is the ID of the node that referred an unregistered node for registration. Paid indicates whether the node's registration is of type paid or unpaid.

The data structure in Table 3 is used to identify frequent file request behavior. Node_ID identifies the node from which repeated requests for a particular file are received. Count gives the number of times a file is requested in a particular time span. Filename is the name of the file for which frequent requests are received. Last request time is the last access time of a file.

Table 3 Frequent file request information
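The two records kept by the TNI might be represented as follows. This is an illustrative sketch: the field names follow Tables 2 and 3, while the Python types (for example a set of strings for permissions) are assumptions not given in the paper.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional, Set


@dataclass
class TrustRecord:
    """One row of the trust value information (Table 2)."""
    node_id: str
    trust_value: float
    last_update: datetime
    permissions: Set[str] = field(default_factory=set)  # e.g. {"read", "write"}
    referral_id: Optional[str] = None  # None for a paid (direct) registration
    paid: bool = True


@dataclass
class FrequentRequestRecord:
    """One row of the frequent file request information (Table 3)."""
    node_id: str
    count: int
    filename: str
    last_request_time: datetime
```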

3.5 Trust based security mechanism

The TNI keeps a log of the registered nodes. As shown in Fig. 2, the Service Requestor (SR) node requests a Session Usage Key (SUK) from the TNI to access a service from the SP. The TNI provides the SUK based on the current trust values of both nodes, i.e. SR and SP. The SUK is valid for a limited time period defined by the system. Information (trust value, permission, file count) about the communicating nodes is maintained at the TNI. The SP provides access to services (file read and write operations) based on the current trust value of the requester node. The SP monitors the behavior of the requester node and informs the TNI to update the requester's trust value based on past requests.

Fig. 2 Working model of trust based security system

The flow graph for the SR is shown in Fig. 3. After the SR receives the SUK, a connection is established between the SP and the SR.

Fig. 3 Flow graph of SR

Figure 4 shows the flow of the TNI. The TNI checks whether the request is for updating the trust value of the SR or for revocation of the SUK, and the requester node is informed accordingly in the reply. Here RN is the Requesting Node and TV is the trust value of the RN.

Fig. 4 Flow graph of TNI

Figure 5 shows the flow diagram of the SP. After receiving a request from the SR, the SP checks whether the request is for a file. If not, the SP checks whether registration is required. If the request is for a file, the SP decrypts the request using the SUK and proceeds.

Fig. 5 Flow graph of service provider

3.6 Trust based parameters

3.6.1 Parameters considered for identifying node behavior

  1. Node doesn’t exist.

    1.1 Node is not registered.

    1.2 Node debarred because of malicious behavior.

  2. Session key (SUK) expires.

  3. Increase trust value if request is correct (good behavior).

  4. Decrease trust value if session key (SUK) mismatch.

  5. Decrease trust value if ‘file not exists’.

  6. Decrease trust value if repeated request is made in a particular time span.

  7. Warning against violation of access permissions, to the RN.

  8. In case of unpaid registration, decrease trust value of referee over wrong referral, if trust value decreases to minimum value (bad mouthing attack).

  9. In case of unpaid registration, referee node must have the required trust value.

  10. In case of unpaid registration, if trust value reaches 0, the node will be debarred.

  11. In case of unpaid registration, initial trust value = half the trust value of referee.

  12. In case of paid registration, initial trust value = half of max trust value.

  13. Node is rejoining within kicked out time period.

    13.1 Rejoin by paid method.

    13.2 Rejoin by unpaid method.
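Several of the parameters above translate directly into arithmetic on the trust value. The sketch below illustrates items 3 to 6 and 10 to 12 under stated assumptions: the maximum trust value `MAX_TRUST`, the adjustment size `STEP`, and the event names are hypothetical, because the paper does not fix these constants here.

```python
from typing import Optional

MAX_TRUST = 100.0  # assumed scale; the source only speaks of a "max trust value"
STEP = 5.0         # assumed size of a single increase or decrease

# Signed adjustment per observed behavior (items 3 to 6 of the list above).
ADJUSTMENTS = {
    "good_request": +STEP,      # item 3
    "suk_mismatch": -STEP,      # item 4
    "file_not_exists": -STEP,   # item 5
    "frequent_request": -STEP,  # item 6
}


def initial_trust(paid: bool, referee_trust: Optional[float] = None) -> float:
    """Items 11 and 12: half of the max for paid, half of the referee's for unpaid."""
    if paid:
        return MAX_TRUST / 2
    if referee_trust is None:
        raise ValueError("unpaid registration needs the referee's trust value")
    return referee_trust / 2


def apply_event(trust: float, event: str) -> float:
    """Return the new trust value, clamped to the range [0, MAX_TRUST]."""
    return min(MAX_TRUST, max(0.0, trust + ADJUSTMENTS[event]))


def is_debarred(trust: float) -> bool:
    """Item 10: a node whose trust value reaches 0 is debarred."""
    return trust <= 0.0
```

For example, a node referred by a node with trust value 80 would start at 40 under these rules, and a paid registration would start at 50 on the assumed scale.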

3.6.2 Cases (for modifying trust value)

  1. When there is a session key (SUK) mismatch, decrease the trust value.

  2. If RN requests the same file frequently:

    a. When the first file request is made, save the file name.

    b. A subsequent file request from the same node is identified by comparing the requested file name with the saved file name.

      i. If both are the same, the count value is increased; if count value ≥ Cfr and Tf − Ti < Tfr, update the trust value of the requesting node, else do nothing.

      ii. Else, update the file name, save the time and set the count value.

    (where Tfr = frequent time period and fr = frequent request of same type).

  3. This paper does not consider the case in which the server responds after a long time, since that may depend on load, communication link, and size and type of the requested file.

3.7 Interaction diagram

3.7.1 Unregistered node

Node1 (N1) sends request to TNI that it wants to communicate with Node2 (N2) and also requests SUK. TNI sends message to N1, that N1 is not a registered node (Fig. 6).

Fig. 6 Not a registered node

3.7.2 Paid registration (direct)

N1 sends a request to TNI stating that it wants to communicate with N2 and also requests an SUK. TNI sends a message to N1 stating that N1 needs to register. N1 sends a registration request to TNI. TNI sends a Registration Successful message to N1 (Fig. 7).

Fig. 7 Paid registration (direct)

3.7.3 Unpaid registration

N1 sends a request to TNI stating that it wants to communicate with N2 and also requests an SUK. TNI sends a message to N1 stating that N1 is not a registered node and therefore cannot communicate with N2. N1 sends N2 a request message saying that it wants to register. N2 forwards N1's request to TNI. TNI replies to N2 that registration is successful. N2 forwards this message to N1. Now N1 can communicate with TNI and N2 (Fig. 8).

Fig. 8 Unpaid registration

3.7.4 Session usage key distribution

N1 sends request to TNI that it wants to communicate with N2 and also requests SUK. TNI sends trust value of N1 and N2 to N2 and N1 respectively and also the requested SUK (Fig. 9).
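The issuing and checking of a time-limited SUK could look roughly like the sketch below. It is not the authors' implementation: the key format (a random hex string), the lifetime constant, and the function names are assumptions, and the three outcomes correspond to the normal, mismatch and expiry interactions described in this section.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional

SUK_LIFETIME_SECONDS = 300.0  # assumed; the source only says "limited time period"


@dataclass(frozen=True)
class Suk:
    key: str           # random hex string (format assumed)
    expires_at: float  # seconds since the epoch


def issue_suk(now: Optional[float] = None,
              lifetime: float = SUK_LIFETIME_SECONDS) -> Suk:
    """TNI side: create a fresh key that is valid for `lifetime` seconds."""
    now = time.time() if now is None else now
    return Suk(key=secrets.token_hex(16), expires_at=now + lifetime)


def check_suk(presented_key: str, issued: Suk,
              now: Optional[float] = None) -> str:
    """SP side: compare the key presented by the SR with the one sent by TNI."""
    now = time.time() if now is None else now
    if now >= issued.expires_at:
        return "expired"
    # Constant-time comparison avoids leaking how many characters matched.
    if not secrets.compare_digest(presented_key, issued.key):
        return "mismatch"
    return "ok"
```

In this sketch the SP would serve the request only on `"ok"`; on `"mismatch"` or `"expired"` it would ask the TNI to update the trust value of the requester.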

Fig. 9 Session key distribution

3.7.5 Session usage key mismatch

N1 sends a request to TNI stating that it wants to communicate with N2 and also requests an SUK. TNI sends the trust values of N1 and N2 to N2 and N1 respectively. N1 sends a request message to N2. N2 finds an SUK mismatch between N1 and N2. N1 and N2 send messages to TNI to update the trust values of N2 and N1 respectively (Fig. 10).

Fig. 10 Session key mismatch

3.7.6 Block write access

N1 sends request to TNI that it wants to communicate with N2 and also requests SUK. TNI sends message back to N1 providing the SUK and trust value of N2. TNI also sends message to N2 providing N1 SUK as well as trust value of N1. N1 sends message to N2 requesting the file in write mode. N2 replies to N1 that N1 does not have enough trust value to access the file in write mode. N1 sends a message to TNI to update trust value of N2. N2 sends a message to TNI to update trust value of N1 (Fig. 11).

Fig. 11 No write access

3.7.7 File not found

N1 sends request to TNI that it wants to communicate with N2 and also requests SUK. TNI sends message back to N1 providing the SUK and trust value of N2. TNI also sends message to N2 providing N1 SUK as well as trust value of N1. N1 sends message to N2 requesting a file. N2 replies to N1 that the requested file does not exist. N1 sends a message to TNI to update trust value of N2. N2 sends a message to TNI to update trust value of N1 (Fig. 12).

Fig. 12 File not found

3.7.8 Frequent request

Frequent File Request Detection Algorithm

  1. Get a request.

  2. Check the request type.

  3. If the request is a file request:

    3.1 Check whether a previous request from the same node exists.

    3.2 If the previous request is the same as the current request, check for the maximum number of requests within the specified time period.

    3.3 If no previous request exists, store the requesting node and the request in a map that has node_id as its key.
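The steps above, together with the count and time conditions of Sect. 3.6.2, can be sketched as a small detector keyed by node_id. The thresholds `c_fr` (Cfr) and `t_fr` (Tfr) are given illustrative defaults, and Ti is assumed to be the time of the first request in the current run of identical requests; neither is specified in the paper.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class _Entry:
    filename: str
    first_time: float  # Ti, assumed to be the first request of the current run
    count: int


class FrequentRequestDetector:
    """Flags a node that requests the same file too often within a time span."""

    def __init__(self, c_fr: int = 3, t_fr: float = 60.0) -> None:
        self.c_fr = c_fr  # Cfr: request count threshold (assumed default)
        self.t_fr = t_fr  # Tfr: frequent time period in seconds (assumed default)
        self._by_node: Dict[str, _Entry] = {}

    def observe(self, node_id: str, filename: str, now: float) -> bool:
        """Record one file request; return True if it counts as frequent."""
        entry = self._by_node.get(node_id)
        if entry is None or entry.filename != filename:
            # Step 3.3 and case 2(b)(ii): remember the new file, time and count.
            self._by_node[node_id] = _Entry(filename, now, 1)
            return False
        # Step 3.2 and case 2(b)(i): same file again from the same node.
        entry.count += 1
        return entry.count >= self.c_fr and (now - entry.first_time) < self.t_fr
```

When `observe` returns True, the SP would reply with a negative acknowledgement and ask the TNI to decrease the trust value of the requesting node. As in the textual description, the counter is reset only when a different file is requested.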

N1 sends a request to TNI stating that it wants to communicate with N2 and also requests an SUK. TNI sends a message back to N1 providing the SUK and the trust value of N2. TNI also sends a message to N2 providing the SUK as well as the trust value of N1. N2 identifies that N1 is frequently requesting the files and replies with ACK_0. N1 sends a message to TNI to update the trust value of N2. N2 sends a message to TNI to update the trust value of N1 (Fig. 13).

Fig. 13 Frequent request

3.7.9 Session key expire

N1 sends a request to TNI stating that it wants to communicate with N2 and also requests an SUK. TNI sends the trust value of N2 to N1 and the trust value of N1 to N2. N1 sends a request to N2. The SUK between N1 and N2 expires, hence the service cannot be accessed by N1. N1 sends a message to TNI to update the trust value of N2. N2 sends a message to TNI to update the trust value of N1 (Fig. 14).

To validate the proposed security model, it is specified in the Calculus of Communicating Systems (CCS) and its observational equivalence is proved using the Concurrency Workbench of the New Century (CWB-NC), which provides different techniques for specifying and verifying finite-state concurrent systems.

Fig. 14 Session key expire

3.8 Observational equivalence of trust based security mechanism

Terms used: suk = Session usage key; sukreq = SUK request; SC = Simple Client; SS = Simple Server; freq = file request; skey = Session Key; utrust = update trust request; utvalue = Update trust value; ack = Acknowledge; sreq = Session Key Request; prreq = Paid Registration Request; uprreq = Unpaid Registration Request; TNI = Trusted Node Identifier; refreq = Referral Registration Request; STGS = Simple Ticket Granting Server (Fig. 15).

Fig. 15 Replication security state diagram

3.8.1 Simple security model

Definition of simple client

$$ {\text{SC}} \;\mathop{=}\limits^{\rm def}\; \hbox{`}{\text{sukreq}}.{\text{suk}}.{\text{SC }} + \, \hbox{`}{\text{freq}}.{\text{ack}}.{\text{SC}} $$
(1)

Definition of Simple Ticket Granting Server

$$ {\text{STGS}}\,\;\mathop{=}\limits^{\rm def}\; {\text{sukreq}}.\hbox{`}{\text{suk}}.\hbox{`}{\text{suk}}.{\text{STGS}} $$
(2)

Definition of simple SP (server)

$$ {\text{SS}}\;\mathop{=}\limits^{\rm def}\; {\text{freq}}.\hbox{`}{\text{ack}}.{\text{SS}} $$
(3)

Definition of simple security system

$$ {\text{SSYSTEM}}\,\;\mathop{=}\limits^{\rm def}\; \,{\text{SC }}\left| {\text{ STGS }} \right|{\text{ SS}} $$
(4)

3.8.2 Trust based model

Definition of Client

$$ {\text{CLIENT}}\;\mathop{=}\limits^{\rm def}\; \hbox{`}{\text{sukreq}}.{\text{suk}}.{\text{CLIENT }} + \, \hbox{`}{\text{freq}}.{\text{ack}}.{\text{CLIENT}} $$
(5)

Definition of Trust Node Identifier

$$ {\text{TNI}}\,\;\mathop{=}\limits^{\rm def}\; \,{\text{sukreq}}.\hbox{`}{\text{suk}}.\hbox{`}{\text{suk}}.{\text{TNI }} + {\text{ utrust}}.\hbox{`}{\text{utvalue}}.{\text{TNI}} $$
(6)

Setting internals for security modules

$$ {\text{RSI}}\,\,\;\mathop{=}\limits^{\rm def}\; \,\,\left\{ {{\text{ utrust}},{\text{ utvalue }}} \right\} $$
(7)

Definition of security service provider (server)

$$ {\text{SERVER}}\;\mathop{=}\limits^{\rm def}\; \hbox{`}{\text{utrust}}.{\text{utvalue}}.{\text{SERVER }} + {\text{ freq}}.\hbox{`}{\text{ack}}.{\text{SERVER}} $$
(8)

Definition of Security System

$$ {\text{SYSTEM}}\,\,\;\mathop{=}\limits^{\rm def}\; \left( {{\text{CLIENT }}\left| {\text{ TNI }} \right|{\text{ SERVER}}} \right) \, \backslash {\text{ RSI}} $$
(9)

3.9 MU Calculus

  1. Whenever an RN makes a file request, eventually there will always be an acknowledgement to the RN

    $$ \text{P1} = \text{AG}\left( \left( \text{not } <\text{-freq}> \text{tt} \right) \,\backslash/\, \left( \text{AF}\left( <\text{-ack}> \text{tt} \right) \right) \right) $$

  2. Whenever an RN makes a Session Key Request, eventually there will always be a Session Key or Nack to the RN

    $$ \text{P2} = \text{AG}\left( \left( \text{not } <\text{-sreq}> \text{tt} \right) \,\backslash/\, \left( \text{AF}\left( \left( <\text{-nack}> \text{tt} \right) \,\backslash/\, \left( <\text{-skey}> \text{tt} \right) \right) \right) \right) $$

  3. Whenever a node makes a Trust Value request, eventually there will always be an Updated Trust Value to the node

    $$ \text{P3} = \text{AG}\left( \left( \text{not } <\text{-utrust}> \text{tt} \right) \,\backslash/\, \left( \text{AF}\left( <\text{-utvalue}> \text{tt} \right) \right) \right) $$

  4. Whenever a node makes a Paid Registration request, eventually there will always be an Ack or Nack to the node

    $$ \text{P4} = \text{AG}\left( \left( \text{not } <\text{-prreq}> \text{tt} \right) \,\backslash/\, \left( \text{AF}\left( \left( <\text{-ack}> \text{tt} \right) \,\backslash/\, \left( <\text{-nack}> \text{tt} \right) \right) \right) \right) $$

  5. Whenever a node makes an Unpaid Registration request, eventually there will always be an Ack or Nack to the node

    $$ \text{P5} = \text{AG}\left( \left( \text{not } <\text{-uprreq}> \text{tt} \right) \,\backslash/\, \left( \text{AF}\left( \left( <\text{-ack}> \text{tt} \right) \,\backslash/\, \left( <\text{-nack}> \text{tt} \right) \right) \right) \right) $$

  6. Whenever there is a session key to the RN, there is a prior Session Key Request from the RN to TNI

    $$ \text{P6} = \text{A}\left( \left( \text{not } <\text{-skey}> \text{tt} \right) \text{ W } \left( <\text{-sreq}> \text{tt} \right) \right) $$

  7. Whenever there is a Referral Registration Request from Server, there is a prior Unpaid Registration request from the RN to TNI

    $$ \text{P7} = \text{A}\left( \left( \text{not } <\text{-refreq}> \text{tt} \right) \text{ W } \left( <\text{-uprreq}> \text{tt} \right) \right) $$

  8. Whenever there is a file to the RN, there is a prior file request from the RN to SERVER

    $$ \text{P8} = \text{A}\left( \left( \text{not } <\text{-file}> \text{tt} \right) \text{ W } \left( <\text{-freq}> \text{tt} \right) \right) $$

  9. Whenever there is an Updated Trust Value to the node, there is a prior Update Trust request from the node to TNI

    $$ \text{P9} = \text{A}\left( \left( \text{not } <\text{-utvalue}> \text{tt} \right) \text{ W } \left( <\text{-utrust}> \text{tt} \right) \right) $$

  10. There is a possibility of a Session Key Request from RN, followed by a session key from the TNI. This sequence of actions may also repeat infinitely

    $$ \text{P10} = \text{max X} = <\text{t}> <\text{-sreq}> <\text{t}> <\text{-skey}> \text{X} $$

  11. There is a possibility of a file request from RN, followed by an Ack from the Server. This sequence of actions may also repeat infinitely

    $$ \text{P11} = \text{max Y} = <\text{t}> <\text{-freq}> <\text{t}> <\text{-ack}> \text{Y} $$

  12. There is a possibility of a file request from RN, followed by an Update Trust request from the Server, followed by an Updated Trust Value from the TNI, then followed by an Ack from the Server. This sequence of actions may also repeat infinitely

    $$ \text{P12} = \text{max Z} = <\text{t}> <\text{-freq}> <\text{t}> <\text{-utrust}> <\text{t}> <\text{-utvalue}> <\text{t}> <\text{-ack}> \text{Z} $$

Having established the trust model, we can safely proceed to provide services, i.e. file read and file write access, to the requestor. For this purpose, a trust based file replication and consistency model is proposed, on the premise that communication between FRSs is now secured and that any malicious activity carried out by an FRS will be observed and reported to the TNI, which in turn will lead to deregistration of the FRS based on its trust value.

4 Replication and consistency maintenance model

4.1 File replication model

Figure 16 shows a group of FRSs along with their nodes; a node is termed an RN when it requests a particular file in the distributed environment. An RN has only read access to a file and cannot perform write operations to modify it. A file can be modified only by an FRS. The figure represents the logical connections between FRSs and RNs. FRSs communicate and exchange information with each other as and when required.

Fig. 16 Proposed scenario

An FRS can be ‘local’ or ‘remote’ with respect to RN. For RN, FRS is said to be ‘local’ if RN is directly connected to FRS, and all the other FRSs are said to be ‘remote’. So, in a group of n FRSs, each RN has one ‘local’ FRS and (n − 1) ‘remote’ FRSs.

4.2 Data structure used

  • Filename: Name of the file.

  • Filesize: Size of the file.

  • Request Count: Number of requests for a particular file that a server is handling.

  • Replication Threshold: Maximum number of requests for a particular file that a server can handle at any time; beyond this, the file is replicated on another server.

  • Valid: Boolean variable that signifies whether the file content is valid.

  • Lock: Integer variable that signifies that some server node is updating the file and has therefore acquired a lock on it. Lock is significant only to the primary server node of the file.

  • Primary Server ID: Integer variable that specifies the ID of the primary server node (or parent node) of the file.

  • Replicating: Boolean variable that signifies that the server node is replicating the file on some other node.

  • Getting: Boolean variable that signifies that the FRS is getting a VALID file from some other peer FRS. This variable is set only when the server node has an INVALID file.

  • Timestamp Array: Array of integer variables that stores the timestamps at which clients requested the file from this particular server node.

  • Peers: Array of integer variables that stores the IDs of the FRSs that have a replica of the file (Table 4).

Table 4 File details table

Each FRS also records the following details of its peers (Table 5):

  • Peer FRS ID: ID of the peer File Replication Server.

  • Peer FRS IP: IP address of the peer File Replication Server.

  • Peer FRS Port: Port address of the peer File Replication Server.

Table 5 Peer FRS table
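The two tables can be pictured as simple records. The following is only a minimal sketch in Python, not the implementation used in this work: the class names `FileDetails` and `PeerFRS`, the helper `record_request`, the byte unit for the file size and the encoding of Lock as a zero/non-zero integer are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PeerFRS:
    """One row of the peer FRS table (Table 5)."""
    frs_id: int
    ip: str
    port: int


@dataclass
class FileDetails:
    """One row of the file details table (Table 4)."""
    filename: str
    filesize: int                  # size in bytes (the unit is an assumption)
    replication_threshold: int     # request count at which replication is triggered
    primary_server_id: int         # ID of the primary (parent) server node
    request_count: int = 0
    valid: bool = True             # False once the local copy is stale
    lock: int = 0                  # assumed encoding: non-zero while an update holds the lock
    replicating: bool = False      # True while this node copies the file to a peer
    getting: bool = False          # True while this node fetches a VALID copy from a peer
    timestamps: List[int] = field(default_factory=list)
    peers: List[int] = field(default_factory=list)   # IDs of FRSs holding a replica

    def record_request(self, timestamp: int) -> None:
        """Hypothetical helper: count a client request and remember when it arrived."""
        self.request_count += 1
        self.timestamps.append(timestamp)
```

Using `field(default_factory=list)` gives every record its own timestamp and peer lists, so that entries for different files do not share state.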

4.3 Proposed replication mechanism

Each FRS receives file requests from RNs and handles each request based on its current load status. Frequently accessed files are replicated on another FRS when the request count for a particular file reaches the threshold value. The states of an FRS are described below:

  • Ready: File is present on the FRS and the Request Count for the file is less than the threshold value.

  • Busy: File is present on the FRS and the Request Count for the file is equal to the threshold value.

  • File Not Found: File is not present on the FRS.
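The three states listed above can be derived from the request count and the threshold. The sketch below is illustrative only; the names `FRSState` and `frs_state` are hypothetical, and treating a count above the threshold as Busy as well is an assumption, since the text defines Busy only for a count equal to the threshold.

```python
from enum import Enum
from typing import Optional


class FRSState(Enum):
    READY = "ready"
    BUSY = "busy"
    FILE_NOT_FOUND = "file not found"


def frs_state(request_count: Optional[int], threshold: int) -> FRSState:
    """Classify an FRS for one file.

    request_count is None when the file is not present on the FRS.
    """
    if request_count is None:
        return FRSState.FILE_NOT_FOUND
    if request_count < threshold:
        return FRSState.READY
    # Assumption: a count at or above the threshold is reported as busy.
    return FRSState.BUSY
```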

The handling of the request takes place as shown in the flow diagram (Fig. 17).

Fig. 17 Flow diagram for replication

To better understand the working of an FRS, a few scenarios are discussed in the next section.

4.4 Replication scenarios

The various scenarios presented in this section explain the complete file replication model (FRM). The scenarios described below involve 3 FRSs S1, S2, S3 and one RN N1. The messages exchanged during the communication between FRSs and RN are described below:

  • M1: This is a request message. It carries a request for a file, for the resource_FRS_list, for replication or for the status of another FRS. The listserver request asks the local FRS for all the filenames together with the FRS IP and FRS Port addresses.

  • M2: This is the status message of an FRS. The possible statuses are ready, busy and file not found.

  • M3: This message denotes the sending of the file contents to the RN or FRS, or the sending of the IP address, Port address of remote FRSs and resource_FRS_list present on the local FRS.

  • M4: This message involves the IP and Port address of the remote FRS from which the RN establishes the connection to receive the replicated file.

  • M5: Reply acknowledgement (RACK) after the file has been replicated successfully.

4.4.1 Case 1: Local FRS S1 cannot fulfill the request and looks for a remote FRS that can fulfill the file request

As shown in Fig. 18, N1 establishes a connection with FRS S1 and sends a resource_FRS_list request (message M1) to it. On successful connection, the resource_FRS_list is received (M3) by N1. N1 sends a file request (M1) to S1. S1 sends the status request message (M1) to remote FRS S2. S2 sends its status as ‘ready’ (M2) to S1. S1 sends the IP and Port address of S2 (message M4) to N1. N1 receives the file from S2.

Fig. 18 Remote FRS S2 handles the request

4.4.2 Case 2: Local FRS S1 replicates the file on remote FRS S3

As shown in Fig. 19, N1 establishes a connection with FRS S1 and sends a resource_FRS_list request (M1) to it. On successful connection, the resource_FRS_list is received (M3) by N1. N1 sends a file request (M1) to S1. The status of S1 is busy, so it sends the status request message (M1) to remote FRS S2. S2 sends its status as ‘busy’ (M2) to S1. S1 sends the status request message (M1) to remote FRS S3. S3 sends its status as ‘file not found’ (M2) to S1. S1 sends the replication request message (M1) to S3. S1 creates the file replica (M3) on S3. S3 sends the RACK message (M5) to S1. S1 sends the IP and Port address of S3 (M4) to N1. N1 receives the file from S3.

Fig. 19 Remote FRS S3 handles the request
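The two cases can be traced with a small in-memory model. This is a hedged sketch rather than the actual FRS implementation: the class `FRS`, the function `handle_request` and the message log format are hypothetical, no networking is involved, and asking every remote FRS for its status before replicating is an assumption about the ordering.

```python
from typing import Dict, List, Optional, Tuple


class FRS:
    """Toy file replication server: filename -> current request count."""

    def __init__(self, name: str, threshold: int) -> None:
        self.name = name
        self.threshold = threshold
        self.files: Dict[str, int] = {}

    def status(self, filename: str) -> str:
        if filename not in self.files:
            return "file not found"
        return "ready" if self.files[filename] < self.threshold else "busy"

    def serve(self, filename: str) -> None:
        self.files[filename] += 1


def handle_request(local: FRS, remotes: List[FRS],
                   filename: str) -> Tuple[Optional[str], List[str]]:
    """Return the name of the FRS that serves the file and the message log."""
    log = [f"M1 file request -> {local.name}"]
    if local.status(filename) == "ready":
        local.serve(filename)
        log.append(f"M3 file content <- {local.name}")
        return local.name, log
    target = None  # first remote FRS without the file, candidate for a replica
    for peer in remotes:
        log.append(f"M1 status request -> {peer.name}")
        state = peer.status(filename)
        log.append(f"M2 {state} <- {peer.name}")
        if state == "ready":          # Case 1: redirect the RN to a ready peer
            peer.serve(filename)
            log.append(f"M4 address of {peer.name} -> RN")
            return peer.name, log
        if state == "file not found" and target is None:
            target = peer
    if target is not None and local.status(filename) == "busy":
        # Case 2: the local FRS holds the file, so it can replicate it.
        log.append(f"M1 replication request -> {target.name}")
        target.files[filename] = 0    # M3: replica created on the peer
        log.append(f"M5 RACK <- {target.name}")
        target.serve(filename)
        log.append(f"M4 address of {target.name} -> RN")
        return target.name, log
    return None, log                  # no FRS can serve the request
```

With S1 and S2 busy and S3 lacking the file, `handle_request(s1, [s2, s3], "f")` reproduces the message order of Case 2 and returns S3 as the serving FRS.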

After a file replica has been created on more than one server, consistency must be maintained among all the replicas of the file. If a file is modified at any FRS, those changes need to be propagated to all the FRSs on which a replica is present. For this, a partial update propagation mechanism for maintaining file consistency is proposed in the next section.

4.5 Update propagation mechanism for maintaining replica consistency

Time Tt depends on the size of the file to transfer. Instead of replacing the stale replica of file f1, (r2, \( \overline{f1} \)), with the updated replica (r1) of file (f1), i.e. (r1,f1) located on FRS1, only the changes made to the file are extracted and propagated, i.e. ∆((r1,f1), (r1, \( \overline{f1} \))), where (r1, \( \overline{f1} \)) is now the old copy of (f1) on FRS1. These changes are stored in a Diff file denoted by D(f1, sequence_no, timestamp) and are propagated to the stale replica (r2) of file (f1) on FRS2. After these changes are applied, the stale copy (r2, \( \overline{f1} \)) is updated to (r2,f1) using the join operation Σ((r2, \( \overline{f1} \)), D(f1, sequence_no, timestamp)). The time required to extract the modifications is denoted by \( t_{\Delta} \), the time to join the Diff file with the stale file by \( t_{\Sigma} \), and the time to propagate D(f1, sequence_no, timestamp) by tpD, so the total time to reflect the changes made to (r1, f1) in (r2, \( \overline{f1} \)) is Trf.

$$ T_{\text{rf}} = t_{\Delta} + t_{\text{pD}} + t_{\Sigma} $$

Here Trf = total time required to update the replica on FRSi (ri,\( \overline{f1} \)), \( t_{\Delta} \) = time required to extract the modified content and store it in the Diff file (D), tpD = time required to propagate the Diff file/s D from (r1) to (r2), \( t_{\Sigma} \) = time for joining the stale file (\( \overline{f1} \)) with the Diff file/s (D), and Tcft = time required to transfer the whole (complete) file.
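The extract (∆) and join (Σ) operations can be illustrated with a short sketch. The diff algorithm is not specified here, so Python's standard `difflib.SequenceMatcher` is used purely as a stand-in; the function names `extract_diff` and `join` and the edit format are hypothetical, and the sequence number and timestamp of the Diff file are omitted.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# One edit: replace old[start:end] with the given text.
Edit = Tuple[int, int, str]


def extract_diff(old: str, new: str) -> List[Edit]:
    """Delta operation: the edits that turn the old copy into the new one."""
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    return [(i1, i2, new[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


def join(stale: str, diff: List[Edit]) -> str:
    """Sigma operation: apply the edits to a stale copy of the old content."""
    parts = []
    pos = 0
    for start, end, replacement in diff:
        parts.append(stale[pos:start])   # unchanged text before the edit
        parts.append(replacement)        # modified text from the new copy
        pos = end
    parts.append(stale[pos:])            # unchanged tail
    return "".join(parts)
```

Only the changed spans travel in the diff, which is why tpD is expected to be small when the modification is small relative to the file. The sketch assumes that the stale replica is identical to the old copy on the sender; otherwise the offsets would not line up.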

In the case of one replica, the proposed approach is beneficial if and only if Trf < Tcft. If there is more than one replica of the file, then updating each further replica rn (n > 1) takes only tpD + \( t_{\Sigma} \), as the previously created Diff file/s (D) can be propagated to all the replicas.
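As a worked illustration of this condition, assume that the n stale replicas are updated one after another from a Diff file that is extracted only once (an assumption, since the update order is not fixed by the model). The total update time and the break-even condition against complete file transfer then read:

$$ T_{\text{rf}}(n) = t_{\Delta} + n\left( t_{\text{pD}} + t_{\Sigma} \right), \qquad T_{\text{rf}}(n) < n\,T_{\text{cft}} \;\Longleftrightarrow\; \frac{t_{\Delta}}{n} + t_{\text{pD}} + t_{\Sigma} < T_{\text{cft}} $$

For n = 1 this reduces to the condition Trf < Tcft, and for larger n the extraction cost is amortized over the replicas.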

In previous approaches every file has a primary replica, called the master replica, and the other replicas are considered secondary replicas [8]. When a secondary replica of a file is updated, the primary copy has to be updated immediately. With this approach one must wait until the file write operation completes on the secondary copy and then transfer the updated file to the master replica. In the proposed approach, as soon as a replica gets a request for a write operation, it notifies the other replicas that it is the new master replica. Since all replicas know that the new master copy is the replica on which the last write operation was done, there is no need to update any other replica immediately. Every FRS maintains the data structure given in Table 6. Its entries keep track of when a file was last modified (tlw) and by which FRS. The detailed working of the consistency mechanism for a file replica on an FRS is given in Table 3.

Table 6 Data structure maintained on FRS
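The hand-over of the master role can be sketched as follows. This is a simplified illustration, not the actual protocol code: the names `MasterRecord`, `ReplicaNode`, `on_write_request` and `on_master_changed` are hypothetical, the table holds only the master FRS and tlw per file, and notifications are delivered as direct method calls instead of network messages.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MasterRecord:
    master_id: int   # FRS holding the most recently written replica
    t_lw: int        # time of the last write (tlw)


class ReplicaNode:
    def __init__(self, frs_id: int) -> None:
        self.frs_id = frs_id
        self.peers: List["ReplicaNode"] = []
        self.table: Dict[str, MasterRecord] = {}

    def on_write_request(self, filename: str, now: int) -> None:
        """Become master for the file and notify peers; no file data is sent."""
        self.table[filename] = MasterRecord(self.frs_id, now)
        for peer in self.peers:
            peer.on_master_changed(filename, self.frs_id, now)

    def on_master_changed(self, filename: str, master_id: int, t_lw: int) -> None:
        """Record the new master unless a later write is already known."""
        current = self.table.get(filename)
        if current is None or t_lw >= current.t_lw:
            self.table[filename] = MasterRecord(master_id, t_lw)

    def is_stale(self, filename: str) -> bool:
        """A replica is stale when another FRS performed the last write."""
        return self.table[filename].master_id != self.frs_id
```

A stale replica would later fetch the pending Diff file/s from the recorded master on demand; that step is left out of the sketch.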

Flow diagram for maintaining file consistency is given in Fig. 20.

Fig. 20 Flow graph for maintaining file consistency

To validate the proposed model, a specification is written in CCS (Calculus of Communicating Systems) and its bisimulation equivalence is proved using the Concurrency Workbench of the New Century (CWB-NC), which provides different techniques for specifying and verifying finite-state concurrent systems.

4.6 Bisimulation equivalence of replication model

Stability analysis of FRM using a process algebraic approach is carried out in this section. Transition systems [32] are considered to perform external and internal actions. External actions are defined as observable actions which are seen by the observer. However, an unobservable action is considered as an internal action which the observer cannot see. Meaning of the symbols used in the CCS [33] is described as follows:

  • SPN: Stands for Simple Provider Node. This denotes the Server Node of the No-Replication model.

  • SRN: Stands for Simple Requesting Node. This denotes the Client Node of the No-Replication model.

  • NR: This denotes the No-Replication Model.

  • FRS: Stands for File Replication Server. This denotes the Server Node of the proposed replication model.

  • RRN: Stands for Replication Requesting Node. This denotes the Client Node of the proposed replication model.

  • RI: This is the set of internal actions for the proposed replication model.

  • The symbol in CCS (‘) denotes the output actions and the rest of the actions denote the inputs.

4.6.1 Definition of simple provider and requesting node

Definition of simple provider node (SPN): provides the file to the RN, without performing any file replication and changes its state back to initial state i.e. SPN (Fig. 21).

$$ {\text{SPN}} \;\mathop{=}\limits^{\rm def}\; \,{\text{listReq}}.\hbox{`}{\text{listSent}}.{\text{SPN }} + {\text{ get}}.\hbox{`}{\text{ready}}.\hbox{`}{\text{fileContent}}.{\text{SPN}} $$
(10)
Fig. 21 Simple provider node

Definition of Simple Requesting Node (SRN): requests a file from the simple server node and changes its state back to initial state i.e. SRN (Fig. 22).

$$ {\text{SRN}} \;\mathop{=}\limits^{\rm def}\; \hbox{`}{\text{listReq}}.{\text{listSent}}.{\text{SRN }} + \, \hbox{`}{\text{get}}.{\text{ready}}.{\text{fileContent}}.{\text{SRN}} $$
(11)
Fig. 22 Simple requesting node

Model for simple server with no replication (NR):

$$ {\text{NR}}\;\mathop{=}\limits^{\rm def}\; \,\left( {{\text{SPN }}|{\text{ SRN}}} \right) $$
(12)
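To make the parallel composition concrete, the two definitions can be encoded as small labelled transition systems and composed by hand. The sketch below is not CWB-NC input and does not check bisimulation; it only enumerates the states that SPN | SRN can reach through handshakes (tau steps). The intermediate state names such as SPN1 and the Python function names are hypothetical.

```python
from collections import deque

# state -> list of (action, next state); output actions carry a leading "'".
SPN = {
    "SPN": [("listReq", "SPN1"), ("get", "SPN2")],
    "SPN1": [("'listSent", "SPN")],
    "SPN2": [("'ready", "SPN3")],
    "SPN3": [("'fileContent", "SPN")],
}
SRN = {
    "SRN": [("'listReq", "SRN1"), ("'get", "SRN2")],
    "SRN1": [("listSent", "SRN")],
    "SRN2": [("ready", "SRN3")],
    "SRN3": [("fileContent", "SRN")],
}


def complement(action):
    """Map an input to the matching output and vice versa."""
    return action[1:] if action.startswith("'") else "'" + action


def parallel_steps(p, q, state):
    """Transitions of p | q: interleaving plus tau for complementary actions."""
    s, t = state
    steps = [(a, (s2, t)) for a, s2 in p[s]]
    steps += [(b, (s, t2)) for b, t2 in q[t]]
    steps += [("tau", (s2, t2))
              for a, s2 in p[s] for b, t2 in q[t] if b == complement(a)]
    return steps


def tau_reachable(p, q, start):
    """States reachable from start using only handshakes (tau steps)."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for action, nxt in parallel_steps(p, q, state):
            if action == "tau" and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


handshake_states = tau_reachable(SPN, SRN, ("SPN", "SRN"))
```

Every handshake path returns to the initial pair (SPN, SRN), which matches the statement that both nodes change their state back to the initial state after serving a request.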

4.6.2 Definition of FRS and requesting node

Definition of FRS: fulfils the file requests, performs the file replication and changes its state back to initial state i.e. FRS.

$$ {\text{FRS}} \;\mathop{=}\limits^{\rm def}\; \,{\text{listReq}}.\hbox{`}{\text{listSent}}.{\text{FRS }} + {\text{ head}}.\hbox{`}{\text{no}}.{\text{FRS }} + {\text{ put}}.{\text{fileContent}}.{\text{FRS }} + {\text{ get}}.\left( {\hbox{`}{\text{ready}}.\hbox{`}{\text{fileContent}}.{\text{FRS }} + \, \hbox{`}{\text{head}}.{\text{no}}.\hbox{`}{\text{put}}.\hbox{`}{\text{fileContent}}.\hbox{`}{\text{newfrs}}.{\text{FRS}}} \right) $$
(13)

Definition of Replicating Requesting Node (RRN): requests a file from FRS and changes its state back to initial state i.e. RRN

$$ {\text{RRN}} \;\mathop{=}\limits^{\rm def}\; \hbox{`}{\text{listReq}}.{\text{listSent}}.{\text{RRN }} + \, \hbox{`}{\text{get}}.\left( {{\text{ready}}.{\text{fileContent}}.{\text{RRN }} + {\text{ newfrs}}.{\text{RRN}}} \right) $$
(14)

Setting internals for replicating module

$$ {\text{RI}} \;\mathop{=}\limits^{\rm def}\; \left\{ {{\text{ head}},{\text{ put}},{\text{ no}},{\text{ newfrs }}} \right\} $$
(15)

Definition of replicating module

$$ {\text{R}} \;\mathop{=}\limits^{\rm def}\; \left( {{\text{FRS }}|{\text{ RRN}}} \right) \, \backslash {\text{ RI}} $$
(16)

The above CCS specification is compiled on CWB-NC, and bisimulation equivalence is proved between the file replication model and the no-replication model.

4.6.3 Mu-calculus of replication model

  1. Whenever an RN makes a List request, a List will be sent to the RN.

    $$ {\text{P1}}\, =\, {\text{ AG }}\left( {{\text{not}} < - \hbox{`}{\text{listReq}} > {\text{tt }}\backslash /{\text{ AF }}\left( {\left[ {\text{listReq}} \right] < {\text{listSent}} > {\text{tt}}} \right)} \right) $$
  2. Whenever an RN makes a Get request, there exists a possibility that a ready message is received by the RN.

    $$ {\text{P2 }}\, =\, {\text{ AG }}\left( {{\text{not}} < - \hbox{`}{\text{get}} > {\text{tt }}\backslash /{\text{ EF }}\left( {\left[ {\hbox{`}{\text{get}}} \right] < {\text{ready}} > {\text{tt}}} \right)} \right) $$
  3. Whenever an RN makes a Get request, there exists a possibility that a newfrs message is received by the RN.

    $$ {\text{P3 }} \,=\, {\text{ AG }}\left( {{\text{not}} < - \hbox{`}{\text{get}} > {\text{tt }}\backslash /{\text{ EF }}\left( {\left[ {\hbox{`}{\text{get}}} \right] < {\text{newfrs}} > {\text{tt}}} \right)} \right) $$
  4. Whenever an RN makes a Get request, there exists a possibility that a fileNotFound message is received by the RN.

    $$ {\text{P4 }} \,=\, {\text{ AG }}\left( {{\text{not}} < - \hbox{`}{\text{get}} > {\text{tt }}\backslash /{\text{ EF }}\left( {\left[ {\hbox{`}{\text{get}}} \right] < {\text{fileNotFound}} > {\text{tt}}} \right)} \right) $$
  5. Whenever an RN makes a Get request, there exists a possibility that a serverBusy message is received by the RN.

    $$ {\text{P5 }} \,=\, {\text{ AG }}\left( {{\text{not}} < - \hbox{`}{\text{get}} > {\text{tt }}\backslash /{\text{ EF }}\left( {\left[ {\hbox{`}{\text{get}}} \right] < {\text{serverBusy}} > {\text{tt}}} \right)} \right) $$
  6. Whenever an RN makes a Get request, there exists a possibility that a ready message is received by the RN, followed by the file content.

    $$ {\text{P6 }} \,=\, {\text{ AG }}\left( {{\text{not}} < - \hbox{`}{\text{get}} > {\text{tt }}\backslash /{\text{ EF }}\left( {\left[ {\hbox{`}{\text{get}}} \right] < {\text{ready}} > < {\text{fileContent}} > {\text{tt}}} \right)} \right) $$
  7. Whenever an RN makes a Get request, there exists a possibility that a head message is received by the RN.

    $$ {\text{P7 }} \,=\, {\text{ AG }}\left( {{\text{not}} < {\text{-get}} > {\text{tt }}\backslash /{\text{ EF }}\left( {\left[ {\text{get}} \right] < \hbox{`}{\text{head}} > {\text{tt}}} \right)} \right) $$
  8. For every head request by the server, there is a possibility that a yes message is received.

    $$ {\text{P8 }} \,=\, {\text{ EF }}\left( {\left[ {\hbox{`}{\text{head}}} \right] < {\text{yes}} > {\text{tt}}} \right) $$
  9. For every head request by the server, there is a possibility that a no message is received.

    $$ {\text{P9 }} \,=\, {\text{ EF }}\left( {\left[ {\hbox{`}{\text{head}}} \right] < {\text{no}} > {\text{tt}}} \right) $$
  10. For every head request by the server, there is a possibility that a busy message is received.

    $$ {\text{P1}}0 \, =\, {\text{ EF }}\left( {\left[ {\hbox{`}{\text{head}}} \right] < {\text{busy}} > {\text{tt}}} \right) $$
  11. For every no message received by the server, there is a possibility that a put message is sent by the server.

    $$ {\text{P11 }} \,=\, {\text{ EF }}\left( {\left[ {\text{no}} \right] < \hbox{`}{\text{put}} > {\text{tt}}} \right) $$
  12. For every no message received by the server, there is a possibility that a fileNotFound message is sent by the server.

    $$ {\text{P12 }} \,=\, {\text{ EF }}\left( {\left[ {\text{no}} \right] < \hbox{`}{\text{fileNotFound}} > {\text{tt}}} \right) $$
  13. For every put request by the server, there is a possibility that a fileReplicate message is received.

    $$ {\text{P13 }} \,=\, {\text{ EF }}\left( {\left[ {\hbox{`}{\text{put}}} \right] < \hbox{`}{\text{fileReplicate}} > {\text{tt}}} \right) $$
  14. For every put request received by the server, there is a possibility that a fileReplicate message is also received.

    $$ {\text{P14 }} \,=\, {\text{ EF }}\left( {\left[ {\text{put}} \right] < {\text{fileReplicate}} > {\text{tt}}} \right) $$
  15. For every busy message received by the server, there is a possibility that a serverBusy message is sent.

    $$ {\text{P15 }} \,=\, {\text{ EF }}\left( {\left[ {\text{busy}} \right] < \hbox{`}{\text{serverBusy}} > {\text{tt}}} \right) $$

The next section presents the simulation and the results obtained from it.

5 Simulation and results

5.1 Simulation results

As shown in Table 7, the simulation has been conducted for three scenarios using two, three and four FRSs (2FRS, 3FRS and 4FRS). Each RN requests file F of size 64.1 MB from FRS1. The threshold varies with the number of FRSs: it is 40 for 2FRS, 27 for 3FRS and 20 for 4FRS.

Table 7 Experiment Configuration Table

Table 8 shows the request completion time in seconds and the FRS that handles the request. The table shows that the average request completion time under the replication scenario is 28.78–47.24 % less than with FTP and 4.9 % less than in the no-replication scenario. Figure 23 compares the request completion time of FTP and the proposed replication mechanism (with 2FRS, 3FRS and 4FRS). When replication is done, the average completion time for a request is always less than the average completion time without replication. After reaching the threshold, FRS1 replicates the file on FRS2 in response to the requests it receives. The request handled by FRS2 therefore takes more time, since this time includes the replication overhead from FRS1 to FRS2. If there is no replication on FRS2 and all requests are handled by FRS1, the service time for each request increases significantly, as shown in Table 8.

Table 8 Average request completion time (s)
Fig. 23 Comparison of request completion time

When the local FRS reaches its threshold value and replicates the file on some other FRS, the replication overhead is compensated by the following benefits:

  • Avoids retransmission of the request by the RN.

  • Reduces latency in case of load above threshold.

  • Ensures scalability.

  • Provides fault tolerance capability.

The proposed model is simulated on a Linux platform over a 10.0 Mbps LAN. The proposed approach is compared with the conventional Complete File Transfer (CFT) approach. It outperforms CFT in terms of the time and the amount of data transferred to replace a stale replica of a file with the modified one.

Table 9 describes the details and possible cases for a better understanding of this approach. It also shows that as the number of replicas to be updated increases, the total reduction in time for updating the replicas increases. The corresponding graph is shown in Fig. 24.

Table 9 Increasing number of replicas updated for Constant file size (5 MB) with modification size (100 kB)
Fig. 24 Graph corresponding to Table 9
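The percentage reduction in time can be computed as (Tcft − Trf)/Tcft × 100. The sketch below applies this to the setting of Table 9 (5 MB file, 100 kB modification, 10.0 Mbps LAN) in an idealized way. The helper names are hypothetical, decimal units are assumed (1 MB = 10^6 bytes), and protocol overhead as well as the extraction and join times are ignored, so the resulting figure is only an upper bound on the reduction and not one of the measured values reported in the tables.

```python
def transfer_time_s(size_bytes: float, link_mbps: float) -> float:
    """Ideal time to push size_bytes over a link of link_mbps (no overhead)."""
    return size_bytes * 8 / (link_mbps * 1_000_000)


def percent_reduction(t_cft: float, t_rf: float) -> float:
    """Percentage reduction in time of the proposed approach relative to CFT."""
    return (t_cft - t_rf) / t_cft * 100.0


# Idealized numbers for one stale replica (assumed units, see lead-in).
t_cft = transfer_time_s(5_000_000, 10.0)   # complete 5 MB file
t_pd = transfer_time_s(100_000, 10.0)      # 100 kB Diff file only
upper_bound = percent_reduction(t_cft, t_pd)
```

Adding measured values for the extraction and join times to `t_pd` lowers the reduction, which is consistent with the measured reductions being smaller than this idealized bound.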

Figure 25 shows that, for a constant file size of 5 MB, a stale file replica can be updated either by replacing it with the latest modified file of 5 MB or by using the proposed approach. As the size of the modification to the file content increases, the ratio (file size/modification size) decreases and the size of the Diff file/s that need to be propagated keeps increasing. As a result, the percentage reduction in time for synchronizing the stale file replica with the latest modified file decreases (Table 10).

Fig. 25 Time required to update the stale replica as file content modification size increases

Table 10 Constant file size (5 MB) with increasing modification size

Based on various features, a comparison matrix of the proposed model with Kerberos [34], Pippal et al. [35] and Tao et al. [36] is shown in Table 11. The table shows that the proposed model provides a few extra features in addition to those provided by the existing models.

Table 11 Feature based comparison with previously proposed models

Figure 26 shows the number of messages exchanged for node registration, SUK and finally for accessing various services.

Fig. 26 Graph showing messages exchanged for different models

The features considered are referral of a node with the help of Sec-SLAs, checking the access rights on the file before providing the service (file read or write), checking the validity of the SUK, and observing malicious behavior on the basis of the frequency of file requests. The average time is calculated with and without considering these features. The results are tabulated in Table 12.

Table 12 Average time required for acquiring keys and service (s)

6 Conclusion

This paper proposes a trust-based file replication and update propagation model that creates a file replica when the number of requests exceeds the threshold value and also maintains file consistency. The threshold is decided based on the configuration of the FRS and the application requirements. The paper discusses the basic trust parameters and adaptive factors used in computing the trustworthiness of a peer FRS, namely the frequency of requests for a particular file that an FRS performs, the registration type of the node (paid or unpaid), blocking of the write operation if the trust value of an FRS is less than the threshold, the authenticity of the session key, and the feedback that an FRS gives about other FRSs. The proposed approach is able to resolve many previously unaddressed issues, viz. file access frequency, failure handling, avoidance of unnecessary file replication, and identification and propagation of partial updates. Instead of creating replicas haphazardly, the proposed approach autonomously determines the location of and the need for file replication based on the number of requests and the availability of files on the FRSs. If an FRS crashes while performing a file replication operation, the proposed model completes the file request via one of the peer FRSs, thus providing fault tolerance capability to the system. Simulation results show that, when the number of requests for a particular file is high, frequently accessed files are replicated on other FRSs dynamically and file requests are redirected in a transparent manner, thus reducing the request completion time by about 28.78–47.24 % as compared to FTP. It has also been observed that, when an FRS replicates a file on another FRS, the replication overhead is compensated by factors such as avoiding retransmission of the request by the RN, reducing latency when the load is above the threshold, ensuring scalability and providing fault tolerance capability.
The paper also presents a mechanism that reduces the time for updating multiple replicas of a file by using modification propagation and by transferring the role of master replica to the last modified replica. The master replica computes the file modifications immediately but propagates only the required partial updates on demand. Experimental results show that various factors, viz. file size, size of the modifications and number of replicas to be updated, affect the time to propagate the changes to other replicas. The percentage reduction in time for propagating these changes varies from 31.56 to 78.1 %. Simulation results also show that the proposed approach gives far better performance in terms of time, and the benefits are even greater when the modification size is small relative to the file size and when more replicas need to be updated. The percentage reduction in time for updating replicas varies from 78.17 to 85.37 %. Though the percentage increase in the time required for acquiring the keys to access the services with the trust-based security model varies from 11.94 to 17.49 %, this increase is compensated by the reduction in time for updating replicas.