1 Introduction

The massive growth of Internet traffic and the emergence of new applications have pushed the Internet architecture to support new services such as security, reliability, and quality assurance. However, the TCP/IP architecture is not well suited to providing such services. Therefore, patches have been used to support them, which do not fully meet the requirements and have also increased the complexity of the Internet architecture [1]. Furthermore, the current Internet architecture is based on a host-centric model that was appropriate for the basic needs of early Internet users. Today's applications mainly need to access information regardless of its physical location. Hence, there is a contradiction between the Internet's design model and the needs of today's users. To eliminate this contradiction and respond to current user needs, a new architectural approach is required. The Information Centric Network (ICN) is considered an ideal candidate for the next generation of Internet architecture [2]. In this architecture, routing is performed based on the content name rather than the physical location of the content host. One of the important features of this architecture is the use of the network routers' caches, which greatly reduces network traffic as well as content retrieval latency. These caches have limited capacity and fill up with stored content after a while. The replacement policy plays a key role in such situations: it determines which content should be removed to make room for new content. Consequently, the cache replacement policy has a significant impact on the hit rate and data retrieval latency.

This paper presents the details of a new cache replacement policy (DFRC) that aims to increase the hit rate and reduce retrieval latency [3]. In the DFRC method, the retrieval time of the contents is estimated using the Forwarding Information Base (FIB) table, and content with a shorter retrieval time receives a higher discard priority. Furthermore, by using the Stale Parameter (SP), which reflects how long a content item has been in the cache without being used, DFRC can detect changes in content popularity, so formerly popular data chunks also receive a higher discard priority. Storing popular data chunks while reducing data redundancy (the storage of the same data chunks in more than one router's cache) has increased the average hit rate of DFRC compared to LRU [4, 5], SRTT [6], EPPC [7], and OCRICN [8]. Compared with [3], this paper investigates the performance of DFRC in depth through an extensive simulation study. Furthermore, the weight of the grade of retrieval relative to the stale parameter in the discard-priority calculation is set using the utility parameter U.

The remainder of this paper is organized as follows: a brief overview of ICN is presented in Sect. 2. Section 3 reviews previous caching and replacement policies. Section 4 describes the proposed replacement method (DFRC). The performance of DFRC is evaluated through simulation in Sect. 5. Finally, the paper is concluded in Sect. 6.

2 A brief overview of named data network

One of the most famous ICN architectures is the Named Data Network (NDN). This section provides a brief overview of the NDN. A more detailed description is presented in [9].

In NDN, information retrieval is performed using two packet types: interest and data packets [9]. Content is segmented by the content provider, a unique name is assigned to each segment (data chunk), and information retrieval is performed according to these names. To retrieve the desired content, the user puts the content's name into an interest packet and sends it into the network. Any router holding the desired data chunk returns it in a data packet, so the interest packet does not need to be forwarded all the way to the content provider. Forwarding is performed using a Content Store (CS), a Forwarding Information Base (FIB), and a Pending Interest Table (PIT) in the NDN routers (Fig. 1) [10]. The CS enables in-network caching of data chunks to decrease data retrieval latency. When the cache becomes full, a cache replacement policy must be used.

Fig. 1 The structure of an NDN router [5]

The FIB is a table, populated by routing algorithms, that specifies the interfaces through which each name prefix can be retrieved. It is similar to the forwarding table in the Internet Protocol, except that multiple interfaces may exist for a given name prefix. Each FIB record contains the name prefix, the stale time, and a ranked list of candidate interfaces for forwarding an interest packet with the corresponding name prefix [10].

The stale time of a content item reflects the length of time it has existed in the cache without being used. Furthermore, other parameters, including the interface id and the average Round Trip Time (RTT), are maintained for each interface.

An abstract of the forwarding function in NDN is presented in Fig. 2. When an interest packet arrives at an NDN router, the content store is first searched for the requested data chunk. If a match is found, the data chunk is returned through the interface on which the interest packet was received. Otherwise, the PIT is searched for the requested name. If an entry with that name is found, the incoming interface is added to it. Otherwise, in addition to adding the corresponding information as a new PIT entry, the interest packet is forwarded to an output interface obtained from the FIB [10].

Fig. 2 An abstract of the forwarding function in NDN

On the way back, when a data chunk is received at a router, the router searches its PIT for the data name. If a match is found, the data chunk is sent to all of the interfaces registered in the corresponding PIT entry, and the entry is then removed. At this point, the data chunk is cached if there is enough free space in the router's CS; otherwise, the replacement strategy determines which data chunk should be removed to make room for the new one. Finally, if no match is found in the PIT, the data packet is discarded.
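To make the interest and data paths concrete, the following is a minimal sketch of this forwarding logic in Python. The table layouts and the method names (`on_interest`, `on_data`) are our own illustrative choices rather than part of the NDN specification, and the replacement step is left as a pluggable callback.

```python
class NdnRouter:
    """Minimal sketch of NDN forwarding with CS, PIT, and FIB lookups."""

    def __init__(self, cache_capacity, replacement_policy):
        self.cs = {}            # Content Store: name -> data chunk
        self.pit = {}           # PIT: name -> set of incoming interfaces
        self.fib = {}           # FIB: name prefix -> ranked interface list
        self.capacity = cache_capacity
        self.replace = replacement_policy  # picks a victim name from the CS

    def on_interest(self, name, in_iface, send):
        if name in self.cs:                      # CS hit: answer directly
            send(in_iface, self.cs[name])
        elif name in self.pit:                   # already pending: aggregate
            self.pit[name].add(in_iface)
        else:                                    # new request: forward via FIB
            self.pit[name] = {in_iface}
            out_iface = self.longest_prefix_match(name)
            if out_iface is not None:
                send(out_iface, name)

    def on_data(self, name, chunk, send):
        if name not in self.pit:                 # unsolicited data: drop
            return
        for iface in self.pit.pop(name):         # satisfy all waiting faces
            send(iface, chunk)
        if len(self.cs) >= self.capacity:        # cache full: evict a victim
            del self.cs[self.replace(self.cs)]
        self.cs[name] = chunk

    def longest_prefix_match(self, name):
        prefix = name
        while prefix:
            if prefix in self.fib:
                return self.fib[prefix][0]       # highest-ranked interface
            prefix = prefix.rpartition('/')[0]
        return None
```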

3 Related works

NDN has attracted extensive attention recently. NDN routers are equipped with in-network caches, and the cache decision-making and replacement policies greatly affect network performance, the cache hit rate, data redundancy, and the data retrieval latency from the user's perspective.

In this section, previous works on the decision-making and replacement policies for the content caches of the routers are briefly reviewed.

3.1 Decision-making policies

Decision-making policies determine which content should be cached in which routers.

One of these policies is “Leave Copy Everywhere” (LCE), in which, when a cache hit occurs for a requested chunk, the data chunk is copied to all downstream routers [11]. Another policy is “Leave Copy Down” (LCD), in which the data chunk is copied only to the next downstream router [11]. A further approach is presented in [11]: when a data chunk is requested and a cache hit occurs, the data chunk is deleted from the current router's cache and copied to the first downstream router. “Copy by probability” is another method, which stores the data chunk in each downstream router with probability p [11]. In “Randomly Copy One”, the data chunk is copied to a single router selected at random from the downstream path [12]. In the “Probabilistic Cache” method, the data chunk may be copied to any router on the return path to the user; the closer a router is to the user, the more likely it is to store the data chunk. The idea is to store data chunks on routers close to the requesting user [2, 4]. In “random caching”, when a data chunk arrives at a router, it is stored with a random chance; as the number of requests for a data chunk increases, its chance of being stored increases too [13]. In [14], a data caching scheme for Vehicular Named Data Networks is proposed that considers the spatial-temporal characteristics of data. In this scheme, the data is divided into emergency safety messages, traffic efficiency messages, and service messages according to the application requirements, and these categories are cached based on their spatial-temporal characteristics. In [15], a novel distributed caching strategy for vehicular environments is proposed. Each vehicle autonomously decides about local caching of contents by considering a content's residual lifetime, popularity, and perceived availability in the neighborhood. Therefore, popular contents that are not cached by a nearby node are more likely to be cached by a vehicle, which leads to higher caching diversity in the neighborhood.
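As an illustration of the distance-weighted idea behind “Probabilistic Cache”, the following is a minimal sketch assuming the simplest linear weighting; the exact weighting function used in [2, 4] may differ.

```python
import random

def should_cache(hops_from_user, path_length):
    """Cache with a probability that grows as the router gets closer
    to the requesting user (hops_from_user = 1 means adjacent)."""
    p = 1.0 - (hops_from_user - 1) / path_length
    return random.random() < p

# Example: on a 5-hop return path, the router next to the user caches
# with probability 1.0, while the farthest caches with probability 0.2.
```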

3.2 Replacement policies

When the cache is full, the cache replacement policy determines which data chunk should be removed to provide enough space for caching the new one.

The simplest replacement algorithm is “First Input First Output” (FIFO), in which the oldest data chunk is removed from the cache [4]. In the “Least Recently Used” (LRU) policy, the cache consists of several slots, each of which can contain a data chunk. When a cache hit occurs, the retrieved data chunk is transferred to the first slot and the contents of the preceding slots are shifted one slot forward. If a cache miss occurs, the last slot's content is discarded; the contents of all other slots are then shifted one slot forward, and the retrieved data chunk is stored in the first slot [4, 5]. The “Least Frequently Used” (LFU) method, on the other hand, tries to keep popular data chunks in the cache [16]. In this method, a counter is kept for each data chunk and increased by one whenever a cache hit occurs; when a data chunk needs to be discarded, the chunk with the smallest counter is selected. The “Most Frequently Used” (MFU) approach is the opposite of LFU: the data chunk with the highest number of requests is removed at replacement time [17, 18]. Another method is “Adaptive Replacement Cache” (ARC) [19], which uses two LRU lists to capture both the recency and the frequency of the cached pages; ARC outperforms LRU in terms of hit ratio over a wide range of cache sizes. In the “Random Replacement” (RR) strategy, a cached data chunk is selected at random to be discarded [4].
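Since DFRC (Sect. 4) reuses this slot view of the cache, a minimal sketch of the slot-based LRU behaviour may be helpful; the class and method names here are illustrative only.

```python
class SlotLru:
    """Slot-based LRU: slot 0 is the most recent; the last slot is evicted."""

    def __init__(self, num_slots):
        self.slots = []            # slots[0] is the first (most recent) slot
        self.num_slots = num_slots

    def access(self, chunk):
        if chunk in self.slots:                  # hit: move to the first slot
            self.slots.remove(chunk)
        elif len(self.slots) == self.num_slots:  # miss on a full cache:
            self.slots.pop()                     # discard the last slot
        self.slots.insert(0, chunk)              # chunk enters slot 0
```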

The “Popularity” approach defines five levels of popularity. When a new data chunk arrives and the cache is full, the chunk with the lowest popularity, provided its popularity is also lower than that of the new chunk, is removed [20]. Another cache replacement policy is OCRICN [8], in which chunks are replaced based on their utility parameters; the utility parameter is computed for all incoming and stored contents.

A new chunk is stored in the cache if its utility is greater than the utilities of the stored chunks, and the chunk with the smallest utility is discarded. The “Age-based Cooperative” (ABC) method operates based on both content location and popularity. In this scheme, an age parameter is calculated for each content item based on these parameters, and the content is discarded from the cache when its age expires. When a data chunk arrives at a router whose cache is full, the router checks whether there is an expired data chunk; the new data chunk is cached only if one exists [21]. In the “WAVE” replacement method, which is based on LRU, an access history file is used to find the best chunk to discard [16]. In the SRTT method, a router evicts the data chunk with the smallest product of request frequency and router-computed RTT [6]. The QoS-aware Cache Replacement (QCR) policy assigns a class to each content item and splits the cache into a set of sub-caches to manage the content classes. Each content item is replaced based on its popularity and size: the content with the higher popularity-to-size ratio, called density-popularity, is cached, and the one with the lower density-popularity is discarded [22]. Meddeb et al. [23] introduce a cache replacement policy named Least Fresh First (LFF) that discards the least fresh content from the cache. The LFF policy uses time-series analysis to predict future events and estimate the residual lifetime of the cached contents. A replacement method named Efficient Popularity-aware Probabilistic Caching (EPPC) is proposed in [7]. EPPC combines content selection, content placement, and content replacement mechanisms: the content with the highest request rate is selected to be cached, while in the replacement policy the content with the lowest hit rate is discarded and simultaneously cached in backup caching nodes without losing its popularity count. The information in the routers' FIB tables is used in [24]. That paper proposes a strategy in which the cache removes data chunks that can be retrieved quickly and replaces them with new data chunks, storing them in the initial cache slots. However, a comprehensive evaluation of this policy has not been conducted, and the proposed method has yet to demonstrate adaptability to the dynamic nature of the network.

The cache replacement schemes mentioned above are compared in Table 1. The main idea of the DFRC method is to use the information available in the routers' FIB tables, together with the temporary popularity of the data chunks, in the replacement strategy to increase the hit rate and reduce data redundancy and data retrieval latency with an acceptable level of complexity.

Table 1 Comparison of the previous cache replacement schemes

4 Proposed method

Our new cache replacement policy (DFRC) is presented in this section. The objectives of DFRC are to reduce both the average RTT and data redundancy, to increase the hit rate, and to reduce the load on the content providers. DFRC defines the Grade Of Retrieval (GOR) parameter to assign higher discard priorities to the data chunks that need lower retrieval times. Another parameter that plays an important role in the proposed method is the SP. The impact of these two parameters on the discard-priority calculation is set through an impact weight, called the U parameter. The optimal value of U is calculated using the Reinforcement Learning paradigm.

DFRC uses the FIB table information described in Sect. 2. The number of available satisfying interfaces and their average RTTs in the FIB table are used to make a better replacement decision in the NDN routers. A FIB entry with a small average RTT for a specific name prefix indicates that its content provider is close to the router, or that upstream routers may have stored this content in their caches; therefore, the data can be retrieved quickly. DFRC assigns higher discard priorities to the data chunks that need less time to retrieve. For this purpose, the GOR parameter is defined according to Eq. (1):

$$ GOR\left( x \right) = \frac{\sum_{i = 1}^{NOI_{x}} \left( 1 - \frac{AVG\_RTT\left( I_{i}, x \right)}{Time\,Out} \right)}{NOI_{x}} \quad 0 < GOR < 1 $$
(1)

Here, \(NOI_{x}\) is the number of candidate interfaces for the name prefix of \(x\). \(AVG\_RTT(I_{i}, x)\) represents the estimated RTT for the Longest Prefix Match of \(x\) \((LPM(x))\) at interface \(I_{i}\), which is calculated and updated as a moving average for each record of the FIB table [10]. The Time Out parameter indicates the interest packet's time-out: if the corresponding data packet is not received before the timeout, the corresponding entry is deleted from the PIT.
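The GOR of Eq. (1) can be computed directly from the FIB record of a chunk's name prefix. Below is a minimal sketch; the FIB record layout (a list of per-interface average RTTs) is an illustrative assumption.

```python
def gor(fib_entry, timeout):
    """Grade Of Retrieval, Eq. (1): averages 1 - RTT/timeout over all
    candidate interfaces of the chunk's name prefix.
    fib_entry: list of average RTTs, one per candidate interface."""
    terms = [1.0 - avg_rtt / timeout for avg_rtt in fib_entry]
    return sum(terms) / len(terms)

# Example: two candidate interfaces with average RTTs of 40 ms and
# 120 ms under a 400 ms timeout give GOR = (0.9 + 0.7) / 2 = 0.8.
```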

Using the \(GOR\) parameter is the main idea of this paper. However, since the popularity of the data chunks influences the hit rate, it cannot be fully ignored in the design of cache replacement policies. LRU is a popular and effective cache replacement policy that gives importance to the recency, and thereby the popularity, of data chunks [4, 5]. LRU (in the form of the \(SP\)) is therefore used alongside the \(GOR\) parameter in our proposed method to improve network performance.

In this paper, similar to the LRU method (explained in Sect. 3), the retrieved data chunk is copied to the first slot when a cache hit occurs, and the contents of the preceding slots are shifted one slot forward (Fig. 3). The difference lies in the cache-miss condition.

Fig. 3 Structure of the cache in an NDN router

In our method, when a cache miss occurs, the Discard Priority (\(DP\)) is calculated for the cached data chunks to determine which one should be discarded. More details of our method are described below.

To provide a parameter corresponding to the LRU policy, the SP is defined as in Eq. (2):

$$ SP\left( x \right) = \frac{Slot\,Number\left( x \right)}{Total\,slot\,number\,of\,cache} \quad 0 < SP < 1 $$
(2)

Here, Slot Number(x) represents the slot number where data chunk x is stored, and the total slot number of the cache indicates the total number of cache slots.

Finally, DP is calculated for the cache’s stored data chunks according to Eq. (3):

$$ DP\left( x \right) = U \cdot GOR\left( x \right) + \left( 1 - U \right) \cdot SP\left( x \right) \quad 0 < U < 1, \quad 0 < DP < 1 $$
(3)

Suppose that a decision-making policy decides to insert a new data chunk into the cache. If there is no free space, the \(DP\) is recalculated for a predefined number \(p\) of cached data chunks with high \(DP\) values, e.g. the 10 data chunks with the highest \(DP\) values. The data chunk with the highest updated \(DP\) is then discarded, the contents of all of its preceding slots are shifted one slot forward, and the new data chunk is stored in the first slot. It is worth noting that the \(DP\) value is also calculated for all cached data chunks periodically, e.g. after every 30 data chunk replacements. Therefore, DFRC's time complexity is \(O(n)\), where \(n\) is the cache size. Furthermore, \(2\times p\) bytes of memory are needed to store the indexes of the cached data chunks with the highest DP values (2 bytes for each index as an integer).
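The following is a minimal sketch of this replacement step under the DP formula of Eq. (3); the cache representation and the `gor_of` helper (computing GOR via the FIB, as in the earlier sketch) are illustrative assumptions.

```python
def dp(u, gor_x, sp_x):
    """Discard Priority, Eq. (3)."""
    return u * gor_x + (1.0 - u) * sp_x

def replace(slots, candidates, new_chunk, u, gor_of):
    """On a cache miss with a full cache: recompute DP for the p tracked
    high-DP candidates, evict the chunk with the highest updated DP, and
    store the new chunk in the first slot. slots[0] is the first slot;
    gor_of(chunk) returns the GOR of the chunk's name prefix."""
    total = len(slots)

    def chunk_dp(chunk):
        sp = (slots.index(chunk) + 1) / total   # Eq. (2): slot number / total
        return dp(u, gor_of(chunk), sp)

    victim = max(candidates, key=chunk_dp)      # highest updated DP
    slots.remove(victim)                        # preceding slots shift forward
    slots.insert(0, new_chunk)                  # new chunk enters slot 0
```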

The parameter \(U\) in Eq. (3) is a weight that determines the relative effects of the \(GOR\) and \(SP\) parameters in the calculation of the \(DP\). If the value of \(U\) is close to one, \(GOR\) has the greater effect; if it is close to zero, \(SP\) has the greater effect. Network conditions such as the distribution of the data chunks' popularity, the routers' cache sizes, and communication latency can change the suitable value of the \(U\) parameter, i.e. the value that leads to a lower average RTT.

This paper utilizes Reinforcement Learning (RL) to determine the U value. RL is a machine-learning approach in which an agent selects actions from a set; the environment transitions between states based on the agent's actions, and the agent receives rewards as feedback. The agent aims to maximize the cumulative reward, thereby learning the optimal action. In this paper, the states represent different values of the U parameter, the agent's actions increase or decrease the U value, and the RTT difference serves as the reward. The Boltzmann strategy is employed for the U value adjustment, balancing the exploration (random U selection) and exploitation (U selection based on past observations) phases [25]. The strategy employs a temperature parameter (T) to control the exploration and exploitation rates. Initially, T is set to 10 and updated using the formula \(T_{t+1}=T_{t}-(0.99)^{t}\times 0.2\times T_{t}\), where t represents the number of iterations used to determine the U value [25]. When T is high, the weight of the exploration phase is greater than that of exploitation, and the agent chooses a random U with high probability. As T decreases, the agent moves towards exploiting past observations (exploitation phase). As mentioned, the reward is the RTT difference between the previous and next states (\(RTT_{previous\,state}-RTT_{next\,state}\)). It is worth noting that the RTT value is set to zero whenever a cache hit occurs.
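A minimal sketch of Boltzmann (softmax) selection over the discretized U values is given below. The Q-value bookkeeping is an illustrative assumption, since the paper specifies only the temperature schedule and the reward.

```python
import math
import random

U_VALUES = [round(0.05 * i, 2) for i in range(1, 21)]   # (0, 1], step 0.05
q_values = {u: 0.0 for u in U_VALUES}                   # learned value per U

def boltzmann_select(temperature):
    """Pick a U with probability proportional to exp(Q/T): a high T gives
    near-uniform exploration, a low T favours the best-known U."""
    weights = [math.exp(q_values[u] / temperature) for u in U_VALUES]
    return random.choices(U_VALUES, weights=weights, k=1)[0]

def next_temperature(t, iteration):
    """Temperature schedule from the paper: T <- T - 0.99^t * 0.2 * T."""
    return t - (0.99 ** iteration) * 0.2 * t

# After switching to a new U, the reward rtt_previous - rtt_next
# (zero on a cache hit) is used to update q_values for the chosen U.
```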

The reinforcement learning algorithm is tractable, i.e. its run time is a small polynomial in the size of the state space [26]. In this paper, each run requires a search to find the next action, whose worst-case complexity is \(O(k)\), where k is the size of the state space. As mentioned above, the different values of \(U\) form the state space: U values are taken from the interval \((0, 1]\) with a step size of 0.05, so \(k=20\). Even if k were large, the computational cost would remain light because the complexity is linear in \(k\). Therefore, in the worst case, DFRC's complexity is \(O(kn)\), where \(n\) is the cache size. Using the proposed method in an SDN-based ICN architecture would help resolve the computational complexity of the routers, which is one of the authors' future works.

5 Simulation and evaluation

In this section, the DFRC method is evaluated through a simulation study in comparison with the LRU, OCRICN, SRTT, and EPPC methods. OCRICN is selected as a reference replacement algorithm because of its similarity to DFRC: it considers the latency of data retrieval from the content provider in its utility-parameter computation, which is somewhat similar to the GOR parameter in DFRC, and it is also similar to DFRC in terms of the amount of computation and running time. It also prevents data redundancy thanks to its LFU mechanism. Because OCRICN outperforms ABC, which itself outperforms WAVE, this study compares against OCRICN instead of ABC or WAVE. SRTT is another reference replacement algorithm that, like DFRC, considers RTT in the replacement procedure, so SRTT is also included in the evaluation. In addition, DFRC is compared with the popularity-based EPPC replacement policy.

The performance of the mentioned methods is compared under different network conditions: different numbers of data chunks in the network, different routers' cache capacities, different user request rates, and different parameters of the Zipf distribution. The evaluated metrics in this paper, computed as sketched after the list below, are:

  • Average hit rate: the ratio of the number of requests satisfied from the network caches (hits) to all the requests in the network.

  • Average round trip time: the average time taken to retrieve a data chunk.

  • Redundancy of the data chunks: the ratio of the total number of data chunks to the number of distinct (non-repetitive) data chunks in the network caches.
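A minimal sketch of these three metrics follows, assuming the simulator exposes per-request hit flags and RTTs and a list of each router's cached chunk names (all illustrative assumptions).

```python
def average_hit_rate(hits):
    """hits: list of booleans, one per request."""
    return sum(hits) / len(hits)

def average_rtt(rtts):
    """rtts: list of per-request retrieval times."""
    return sum(rtts) / len(rtts)

def redundancy(cache_contents):
    """cache_contents: list of per-router lists of cached chunk names.
    Ratio of all cached chunks to distinct cached chunks."""
    total = sum(len(cache) for cache in cache_contents)
    distinct = len(set().union(*map(set, cache_contents)))
    return total / distinct
```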

5.1 Simulation results

The DFRC method is simulated using the Icarus simulator. The DFRC algorithm was simulated on a variety of topologies, all of which showed improved results; three realistic topologies are reported here: GEANT, the main European backbone; GARR, the Italian national computer network for universities and research; and WIDE, the Japanese Internet backbone. In these topologies, the average queuing delay of the routers is chosen so that users face different retrieval delays when accessing the content of different providers. FIBs are constructed based on the criteria mentioned in [10]. The users' request rate is 4 packets per second, each data chunk is 1.5 Kbytes, the cache capacity of each router is 300 data chunks (about 10 MB), and the content popularity follows Zipf's law [27, 28].

In a standard implementation of the Zipf-Mandelbrot distribution, popular data stays popular forever, whereas in reality, popularity changes over time. Therefore, in this paper, the Zipf-Mandelbrot distribution is re-run with a new popularity parameter every minute to obtain more realistic results. The simulation parameters are presented in Table 2. Due to the random nature of the simulation environment, the presented results are the averages of ten simulation trials. The evaluated metrics are shown in Figs. 4, 5, 6, 7, 8 and 9. At first, the routers' caches are filled with the same data chunks; data chunks are then selected for replacement according to the cache replacement methods.

Table 2 Simulation parameters

Fig. 4 Average hit rate in the network caches

Fig. 5 Redundancy of the stored data chunks in the network caches

Fig. 6 Average RTT

Fig. 7 Hit rate in different cache sizes

Fig. 8 Data chunks redundancy in different cache sizes

Fig. 9 Average RTT in different cache sizes

In the learning phase of the RL, different values are chosen for the U parameter (the selected values are used in DFRC's DP formula at different steps), and the feedback is used to select the next value. This process continues until U reaches its optimal value, i.e. the value yielding the best observed RTT.

As observed in Fig. 4, DFRC's average hit rate in the three network topologies is higher than that of the other methods. The DFRC method has lower data chunk redundancy than LRU; therefore, the number of distinct data chunks stored in the network caches is greater than with LRU, which leads to DFRC's higher hit rate. On the other hand, the popularity of the stored data chunks is another prominent factor affecting the hit rate. The OCRICN method stores popular data chunks in the caches but does not consider popularity changes in the network: unlike a real network, OCRICN assumes that popular contents remain popular forever. The DFRC method, in contrast, can detect popularity changes through the SP, so formerly popular data chunks are discarded and newly popular data are stored in the network caches.

In the EPPC method, the least popular data is discarded from the full cache and stored in the backup cache. Thus, cache space in the network is wasted on the least popular data, which is unlikely to be requested in the future, and the overall hit rate of the network is reduced. Storing popular data chunks while reducing data redundancy in the network caches has increased the average hit rate of DFRC compared to the four other methods. SRTT does not perform well because it ignores the popularity changes of the data chunks, whereas in the network the popularity of the data chunks changes over time. Regarding the redundancy of the stored data chunks, the LRU method has the worst performance, owing to the routers' lack of awareness of the network conditions and of each other's caches: since all routers use the same replacement policy, the same data chunks are discarded from the routers' caches, and the stored data chunks end up being almost the same.

As shown in Fig. 5, the OCRICN method has the best performance for this metric. In OCRICN, the data chunks with the highest request frequency are stored in the first upstream router next to the user; requests for these chunks are therefore answered by this router, the other routers remain unaware of these requests, and they store other data chunks in their caches. As a result, data redundancy is low in this method. The EPPC method stores groups of data chunks in the middle nodes based on request frequency and, in addition, stores the requested data chunks at the edge node (the node nearest the requester); therefore, its redundancy is higher than OCRICN's. However, EPPC assigns a predetermined expiry time (with a maximum content expiry time of 5 seconds) to the stored data; when this time expires, the data chunk is discarded, so the redundancy cannot grow too high. The DFRC method also performs well on the redundancy metric. In DFRC, data chunks with low retrieval latency are discarded by the routers; since this low retrieval latency typically stems from copies existing on neighboring routers, such removals help reduce redundancy in the network. DFRC's redundancy is nevertheless higher than OCRICN's due to the use of the SP.

The purpose of reducing network redundancy is to increase the hit rate and reduce the average RTT, which are the main objectives of this study. Lower redundancy of data chunks in the network reduces the average RTT in most cases, but it does not necessarily do so, as can be seen in the OCRICN method. The DFRC method discards data chunks with short retrieval times and holds data chunks with long retrieval times; as a result, all data chunks have a low RTT from the user's point of view, and when the RTT decreases, the response time decreases too.

As observed in Fig. 6, the DFRC method outperforms the four other methods on the RTT metric in all three topologies. DFRC performs better than SRTT because it considers the changes in data chunk popularity over time. The superiority of OCRICN over LRU is due to its lower redundancy and higher hit rate.

The average hit rate of the DFRC method, compared with LRU, OCRICN, EPPC, and SRTT for different routers' cache sizes, is presented in Fig. 7. As shown in Fig. 7, on average, DFRC's hit rate in the three topologies improved by almost 18.78% compared to LRU, 28.12% compared to SRTT, 4.57% compared to OCRICN, and 23.75% compared to EPPC.

Figure 8 shows the data chunk redundancy for different cache sizes. As the routers' cache sizes increase, a larger portion of the entire content is stored in each router; therefore, the ratio of duplicate data chunks to the entire content increases. Figure 9 shows the average RTT for different cache sizes. As observed in Fig. 9, DFRC performs better than OCRICN, owing to DFRC's tendency to keep the data chunks that require more time to retrieve. The RTT improved by almost 11.38%, 17.28%, 9.02%, and 8.64% in DFRC compared to OCRICN, LRU, SRTT, and EPPC, respectively.

6 Conclusion

The growth of content generated on the Internet has recently pushed research toward a new architecture called NDN. In NDN, contents are stored in the network routers, thereby reducing the traffic load and the data retrieval latency from the user's point of view. NDN routers are equipped with caches that are larger than ordinary router buffers. These caches store named data chunks and can thus reduce data retrieval latency, network congestion, and energy consumption. Due to economic cost and cache search time, the capacity of the caches is limited; it is therefore necessary to adopt appropriate cache decision-making and replacement policies. In this paper, a new cache replacement policy using the FIB table information was introduced. The proposed method, called DFRC, aims to reduce the average latency of data retrieval from the end user's point of view and to increase the hit rate. The main idea of DFRC is a cache replacement policy in which data with a shorter retrieval time (a larger GOR parameter) is given a higher discard priority. The SP, which reflects the popularity of the data chunks, is the other parameter considered in the discard-priority calculation. Reinforcement learning is used in DFRC to specify the appropriate weight (the parameter U) that determines the relative impact of these two parameters on the discard priority. Finally, DFRC was evaluated through simulation against LRU, OCRICN, EPPC, and SRTT in terms of data retrieval latency, data redundancy, and overall hit rate. Although DFRC's redundancy is higher than OCRICN's due to the use of the SP, it outperforms the LRU, OCRICN, EPPC, and SRTT algorithms on the main comparison objectives, namely average RTT and hit rate. Therefore, it can be claimed that DFRC operates better than the four other methods. For future studies, the implementation of the proposed method in an SDN-based ICN architecture will be investigated to reduce the computational complexity of the routers.