DOR: a novel dual-observation-based approach for recommendation systems

As online social media platforms continue to proliferate, users are faced with an overwhelming amount of information, making it challenging to filter and locate relevant information. While personalized recommendation algorithms have been developed to help, most existing models primarily rely on user behavior observations such as viewing history, often overlooking the intricate connection between the reading content and the user’s prior knowledge and interests. This disconnect can consequently lead to a paucity of diverse and personalized recommendations. In this paper, we propose a novel approach to tackle the multifaceted issue of recommendation. We introduce the Dual-Observation-based Recommendation (DOR) system, a novel model leveraging dual observation mechanisms integrated into a deep neural network. Our approach is designed to identify both the core theme of an article and the user’s unique engagement with the article, considering the user’s belief network, i.e., a reflection of their personal interests and biases. Extensive experiments have been conducted using real-world datasets, in which the DOR model was compared against a number of state-of-the-art baselines. The experimental results demonstrate the reliability and effectiveness of the DOR model, highlighting its superior performance in news recommendation tasks.


Introduction
Nowadays, web and mobile applications have surged in popularity, granting users access to an enormous wealth of global information. Nevertheless, this abundance of information presents a challenge when it comes to finding articles that cater to users' specific interests. As a solution to this problem, recommendation systems have gained widespread recognition and usage [1].
Traditional recommendation approaches, such as Content-based Filtering (CBF), Collaborative Filtering (CF), and hybrid recommendation systems, have been widely studied and applied to create personalized reading lists based on users' reading behavior [2]. However, these methods have limitations, including data sparsity and the cold start problem [3]. Researchers have incorporated additional information to overcome these challenges and utilized deep neural networks to better model users' interests. Nonetheless, these approaches may not fully capture low-order propagation information [3]. It is crucial to incorporate low-order propagation information, which signifies enriched features derived from each user's behavior using deep neural networks, to capture contextual information from users' reading behavior and obtain more accurate user interests [4].
Recently, there has been a growing trend in combining knowledge graphs with deep neural networks to consider low- and high-order relations in recommendation [5, 6]. For instance, Wang et al. employ deep learning networks to capture representations of low-order relations and utilize graph neural networks to capture representations of high-order relations [6]. Despite these advancements, the recommendation task still poses challenges for a couple of reasons.
• Different readers may focus on different aspects of the same article for different reasons. For example, an article headline such as "New Zealand fully reopens to the world in August: Ardern" may attract readers who are interested in the economic impacts of the border reopening, as well as those who are interested in the real estate market. Therefore, it is important to consider the various belief networks of readers in the recommendation process.
• One-sided or incomplete observations may not be sufficient to form a satisfactory user belief system. A user's belief system is shaped by all of the information that this user has encountered. This information should include all relevant contextual semantics and be revised as the user's beliefs change. Therefore, it is essential to leverage dual observation mechanisms to deeply explore the semantic information in the textual information and analyze the influences of each article on the user's belief network.
To address the two challenging issues mentioned above, in this paper, we propose a novel approach called the Dual-Observation-based approach for Recommendation (DOR). The DOR architecture incorporates a local and global observation-based interest extraction and construction model, which simultaneously distills and learns local and global user interest representations. By integrating dual observation neural networks, the DOR approach can endow each piece of user behavior with deeper meaning. Specifically, the local observation network is a content-based learning model that imports textual inputs, forming the users' belief networks. On the other hand, the global observation network explores the mutual influence between the users' belief networks and various extrinsic information sources. By considering local and global observations, the DOR approach can provide users with more accurate and personalized recommendations.
The contributions of this paper are summarized as follows.
• We propose a novel Dual-Observation-based approach for Recommendation (DOR), a model that leverages local and global observation networks. This dualistic approach delves into a more profound semantic exploration of target items, refining user preference construction. DOR leverages the specificity of item expressions and user belief networks, effectively bridging the gap between the content and the readers.
• We advance the state of the art by integrating low- and high-order relation expressions in our model. This combination offers a powerful solution to the often-encountered data sparsity and cold start problems. By balancing low- and high-order expressions, we address the issue of data sparsity, ensuring robust, reliable recommendations even in scenarios with sparse user interaction data or new users.
• For high-order relation expression in our model, two advanced methods are proposed. First, we introduce hybrid-domain information from external sources, which improves the generalization ability of the proposed model and mitigates the limitation of data sparsity. Second, we propose an Attention-based Graph-enhanced Global Contextual (AGGC) model that extends attention by considering the global context of the graph to enrich user preferences.
• We validate the superior performance of our DOR model through extensive experiments on real-world datasets.
The experimental results show that the DOR significantly outperforms existing baselines across multiple evaluation metrics.This empirical validation demonstrates our proposed approach's practical utility and effectiveness.
The remainder of this paper is organized as follows: Section 2 reviews literature related to this study. Section 3 introduces some preliminary concepts of our research and the problem description. Subsequently, the details of our proposed dual-observation-based recommendation system are given in Section 4. Section 5 describes the experimental settings and demonstrates the results of the experiments. Finally, we conclude this research work and point out future directions.

Feature-based recommendation
Recommendation systems have been a popular research topic for many years, and there have traditionally been two main techniques for studying them: feature-based techniques and deep learning-based techniques [7]. One well-known feature-based modeling technique is Collaborative Filtering (CF) [8], which has been widely adopted to develop recommendation systems because it can effectively capture the interactions between users and items [8, 9]. However, CF-based algorithms may suffer from the cold-start problem and the issue of information sparsity and may not provide satisfactory performance [10].
To address these limitations, Content-Based (CB) modeling algorithms have been proposed [11]. These algorithms calculate the similarity of content features and recommend similar content. Wynne et al. leverage a CB modeling technique to build a fake news detection system [12]. However, CB modeling algorithms are handcrafted and require extensive domain knowledge, which can be time-consuming [13]. Considering the limitations of the CB modeling technique, researchers incorporate additional factual information into their recommendation systems, leading to more promising outcomes. For example, Sharma et al. propose hybrid recommendation modeling algorithms that combine CF and CB algorithms to complete the book recommendation task [14]. Wu et al. utilize social influence among users to improve users' acceptance of recommended incentives [15]. However, such approaches do not completely address the challenges mentioned above.

Deep learning-based recommendation
Deep learning has been widely adopted and applied to various applications, including recommender systems [16-18]; Zhang et al. survey deep learning-based recommendation approaches [19]. These approaches offer several benefits for recommendation systems, such as the ability to model complex user-item interactions through nonlinear transformation, learn rich item representations through representation learning, model sequential user behavior through sequence modeling (SM), and increase flexibility.
However, recommendation systems based solely on deep neural networks may not be able to explain complex interaction patterns fully. They may be perceived as a "black box" due to their lack of interpretability [19]. Hui et al. use an embedding-based model to represent items [20], while Wang et al. propose a time-aware method, which is built on the RNN method to mimic user behaviors [21]. Tang et al. propose a dynamic graph-based recommendation system that can capture users' evolving preferences toward items over time [22]. Wu et al. devise a deep reinforcement learning-based method to recommend incentives for promoting users' beneficial behaviors [23]. Zhu et al. develop a recommendation system based on the RNN model called the Deep Attention Network (DAN) model, showing the importance of the RNN method in fully exploring users' historical sequential features [24]. Chen et al. design a co-occurrence CNN that considers both user-item and item-item interactions [25]. Guo et al. utilize deep learning techniques to address the limitations of previous social recommendation research, such as insufficiently robust data management and overly specific preferences [26]. These deep learning-based approaches can automatically discover information-rich item expressions without the need for extensive manual processes and often provide a better understanding of item content than feature-based techniques.

Knowledge graph-based recommendation
Several researchers incorporate knowledge graph techniques and deep learning methods into diverse tasks. Shi et al. adopt a concern graph and graph-based representation techniques to improve public concern detection effectiveness and achieve a satisfactory result [27]. In recent years, integrating knowledge graphs into recommendation systems has received significant attention from researchers, resulting in several successful approaches. For example, Sun et al. demonstrate the effectiveness of knowledge graphs in improving recommendation satisfaction by using both knowledge graphs and DNNs to obtain item representations [28]. Zhang et al. design a graph-based context-aware recommendation system with a knowledge graph to analyze and predict users' behavior [29]. Wang et al. propose a knowledge graph-based recommendation model that learns movie representations at a high-order level, achieving satisfactory results [30]. Ma et al. incorporate hybrid information, such as news categories and six different types of behaviors, to construct the user's behavior graph and further enhance news diversity [31]. Fan et al. also integrate knowledge graphs into the recommendation model, using Graph Neural Networks (GNNs) to learn features from duplicate user-user and user-item graphs [32].
In text-based scenarios, knowledge graphs have been adopted to extract semantic meaning. For example, Sheu et al. apply a knowledge graph to the news domain, focusing on exploring the contextual features of news to represent users' reading interests over a short period, where Graph Convolutional Networks (GCNs) are used to embed contextual information [33]. Wang et al. incorporate a knowledge graph into a news recommendation system for news content engineering, using TransE to represent news entities and pre-trained Word2Vec to express word embeddings. The findings indicate that the utilization of a knowledge graph has a profound influence on the effectiveness of recommendations [5].

Attention-based recommendation systems using deep neural networks
In recent years, researchers have integrated attention mechanisms into recommendation systems to enhance performance. Attention serves as a technique that enables models to identify the crucial elements of input data that are pivotal for decision-making [34]. The attention method distinguishes the importance of data by learning patterns within it and using those patterns to prioritize certain parts of the data when making decisions, enabling the model to focus more heavily on the most relevant features and improving its decision-making capability [9]. On top of that, attention mechanisms enable personalized recommendations by allowing the model to focus on relevant information and automatically extract it, improving the understanding of the item's content [35]. Jung et al. use an In-and-Out Attention flow framework in a dialogue recommendation system [36], while Zhu et al. develop an attention-based DNN news recommendation system [24]. Wu et al. consider diverse news information in their proposed recommendation model and include an attention mechanism [37]. They represent users' interests from word-level expressions and incorporate category embeddings into the news embeddings. Similarly, Li et al. adopt a similar method, using an attention-based deep neural network recommendation system in various scenarios [38]. Duan et al. leverage the CNN model and a multi-attention mechanism for the knowledge graph-based recommendation task [39], highlighting the non-trivial influence of relations in contextual representation learning. These studies demonstrate the positive impact of attention networks on recommendation research.
However, few studies consider, from a macro perspective, the interaction between the user's knowledge system and the input information, or the context-rich semantic representation of that information.

Summary
Existing recommendation systems have difficulty adequately capturing the two-way influence between users' belief networks and global information sources, and demonstrate a limited understanding of these global sources, such as insufficient contextual representations. Furthermore, such systems can be prone to the cold-start problem and may lack transparency due to their black-box nature. Our proposed DOR model addresses these issues by constructing users' preferences using a dual-observation mechanism. The global observation examines the mutual influence between users' belief networks and various global information sources, while the local observation extracts rich contextual semantic information. In addition, the DOR model learns user interests from low-order and high-order mechanisms based on the dual observation mechanisms, i.e., the local and global observations. The high-order expression model effectively alleviates the cold-start limitation and increases transparency in the recommendation process.

Preliminary
This section aims to provide an introduction to several fundamental concepts that are essential for this research. These concepts include knowledge graph embedding and dual observations. Subsequently, we formulate the problem within the context of the current setting.

Knowledge graph embedding
Knowledge graphs have been widely studied from many perspectives, including representation and modeling, knowledge identification, knowledge fusion, and knowledge retrieval and reasoning [40]. Benefiting from the power of knowledge graphs, incorporating them into recommendation systems has become popular in recent years, as it can significantly improve the performance of recommendations [41]. In general, a knowledge graph consists of a number of Resource Description Framework (RDF) triples, where each RDF triple contains a head entity h, a relation r, and a tail entity t [42]. To effectively derive and utilize information on entities and relations in the knowledge graph, it is necessary to represent them as low-dimensional vectors in a continuous space, i.e., embeddings. These embeddings can be used for subsequent tasks, such as link prediction, entity classification, and knowledge base completion [43].
There are several approaches for learning knowledge graph embeddings, such as neural network-based models [6], semantic matching models [44], and translation-based models [45]. The translation-based models have proved their effectiveness and efficiency in representing entities and relations, and their space and time complexity scales linearly with the dimensionality of the entity and relation embedding space [46]. Furthermore, neural network-based embedding models and semantic matching models usually suffer from data sparsity and over-simplification [47]. Hence, we select three widely used translation-based models (i.e., TransE, TransH, and TransR) to represent knowledge graph triples as low-dimensional embeddings in the proposed DOR system. The details of these models are as follows:
• TransE [48] is a translation-based model that learns low-dimensional embeddings of entities and represents relationships as translations in the embedding space. The objective of TransE is to minimize the distance between the vectors h + r and t if the triple (h, r, t) holds, or to maximize the distance otherwise, as described in Fig. 1(a). Accordingly, the scoring function of TransE can be represented as

f_r(h, t) = ||h + r - t||_2    (1)

Although TransE can effectively handle 1-to-1 relations in the knowledge graph, it remains flawed for 1-to-N, N-to-1, and N-to-N relations.
• TransH [49] overcomes the problems of TransE in modeling 1-to-N, N-to-1, and N-to-N relations by enabling entities to have distributed representations in different relations. Specifically, as described in Fig. 1(b), each entity is projected onto a relation-specific hyperplane before translation, where w_r is the normal vector of the hyperplane, h_⊥ = h - w_r^T h w_r, and t_⊥ = t - w_r^T t w_r. Through this mechanism, TransH enables entities to have diverse roles in different relations.
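As a concrete illustration, the TransE scoring function and the TransH hyperplane projection described above can be sketched in a few lines of NumPy. This is a toy sketch with random embeddings, not the training code used in the paper; the TransH translation vector d_r follows the standard formulation of that model.

```python
import numpy as np

def transe_score(h, r, t):
    """TransE: the score is the L2 distance between (h + r) and t.
    Lower scores indicate more plausible triples."""
    return np.linalg.norm(h + r - t, ord=2)

def transh_project(e, w_r):
    """TransH: project entity e onto the relation-specific hyperplane
    with unit normal vector w_r: e_perp = e - (w_r . e) * w_r."""
    w_r = w_r / np.linalg.norm(w_r)
    return e - np.dot(w_r, e) * w_r

def transh_score(h, t, w_r, d_r):
    """Score the projected entities with the translation vector d_r."""
    return np.linalg.norm(
        transh_project(h, w_r) + d_r - transh_project(t, w_r), ord=2)

rng = np.random.default_rng(0)
h, r, t = rng.normal(size=(3, 50))   # toy 50-dimensional embeddings
w_r, d_r = rng.normal(size=(2, 50))
print(transe_score(h, r, t))         # distance for a random triple
print(transh_score(h, t, w_r, d_r))
```

During training, both models push scores of observed triples toward zero while keeping scores of corrupted triples large.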

Dual observations
In the current setting, the dual observation aims to assess and refine the user's reading preference by considering both the focus of the text-based information and the user's belief network. It consists of local observation and global observation.
The local mechanism combines low-order relation representations with high-order relation expressions, which helps to alleviate data sparsity and the cold-start problem and allows for the exploration of hybrid-domain and more meaningful information for the user. The global observation mechanism focuses on the continual influence of each piece of information on the user. Every time a user reads a piece of information, their belief network is connected to the primary semantic information and historical knowledge of the information, due to the mutual influence between each word and the influence between the user's belief network and the information. Different users may show different levels of attention to the same information. Hence, observing local textual features and the interaction between the user's belief network and the information is essential for extracting the user's current knowledge system.
Dual observation differs from dual attention, which generally refers to using two separate attention mechanisms in a single model, where the two attention mechanisms can be used independently or in combination to attend to different aspects of the input data [51]. For example, a model with dual attention normally uses one attention mechanism to focus on user information and another attention mechanism to focus on item information when making recommendations [34]. By contrast, dual observation emphasizes the vital information of the article and the user's belief network.

Fig. 2 A typical process of how readers perceive and understand information
Figure 2 demonstrates how readers perceive, understand, and integrate information. Users form their belief systems essentially through two observation mechanisms. The first is the observation of the local attention of the textual information, which extracts the textual features (words highlighted in red, with darker red indicating greater importance). When users read this article, they analyze and perceive the information based on their existing belief networks, which describe their prior knowledge and experiences. The pure blue user belief network represents the user's prior knowledge. The second observation mechanism is global observation, which observes the mutual influence between the user's prior knowledge and the outside source. The final affected result (the mixed-color user belief network) is transferred back to the user. It can be seen that when a person reads a text, they receive the article's information and incorporate it into their existing belief networks. It is therefore important to adopt dual observation mechanisms to retrieve users' preferences accurately.

Problem definition
This research explores a novel approach to personalized recommendations considering users' beliefs and textual features. We formally define the reading behavior of a reader r as b_{r,t} ∈ B_r, where B_r represents the list of historical reading behaviors of reader r. Each reading behavior b_{r,t} is represented as a four-tuple, i.e., b_{r,t} = (reader_id, item_id, t, l).
Here, reader_id denotes the reader's unique identifier, item_id refers to the identifier of the textual item associated with the behavior, t represents the timestamp, and l indicates the label that denotes whether the item was clicked or not. Each item in the dataset comprises a title or overview, which can be represented as a sequence of words [w_1, w_2, ..., w_m], where m indicates the length of the text. Our objective is to predict the likelihood of reader r selecting a candidate item and calculate the corresponding click probability.
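The four-tuple above maps directly onto a small data structure. The sketch below is purely illustrative (the field and variable names are ours, not from the paper's implementation):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ReadingBehavior:
    """One historical reading behavior b_{r,t} = (reader_id, item_id, t, l)."""
    reader_id: str
    item_id: str
    timestamp: int   # t: when the interaction occurred
    clicked: int     # l: 1 if the item was clicked, else 0

# B_r: the behavior history of one reader, ordered by time
history = [
    ReadingBehavior("u42", "n001", 1660000000, 1),
    ReadingBehavior("u42", "n007", 1660000500, 0),
]
clicked_items = [b.item_id for b in history if b.clicked]
print(clicked_items)  # ['n001']
```

The model's target is then the probability that a future `ReadingBehavior` for a candidate item has `clicked == 1`.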

Dual-observation based recommendation
In this section, we comprehensively explain the Dual-Observation based Recommendation (DOR) model, starting with an overview of its architecture. Subsequently, we delve into the details of the dual-observation modules, particularly emphasizing the local and global observation mechanisms. This thorough exploration will shed light on the key components and workings of the DOR model.

DOR architecture
The overall architecture of the proposed DOR model is demonstrated in Fig. 3. The model primarily comprises two main modules: the local and global observation mechanisms.
As illustrated in Fig. 3, input items first pass through the local observation mechanism, and the resulting representations are then fed into the global observation mechanism to estimate the user's click probability.

High-order and low-order relations
The local observation module concentrates on the content and contextual information of each item.It aims to identify the inherent characteristics of articles by utilizing both high and low-level representations.The readers' perception of the information is partially influenced by the main idea of the reading materials, where different words are assigned varying levels of importance.
As depicted in Fig. 3, the local observation mechanism includes two essential models, i.e., the low-order relation model (LRM) and the high-order relation model (HRM). The LRM captures the lexical-level representations of textual inputs, and the HRM incorporates global knowledge from knowledge graphs to extract contextual representations at a high-order level. These two models work in tandem to provide a comprehensive understanding of the characteristics of articles. Next, the LRM and HRM are introduced with examples.
The low-order relation module (LRM) captures the intra-article word relationships in the current configuration. Each word in the article is encoded as a vector, enabling the depiction of connections between words. The LRM module aims to derive lexical-level representations of articles by considering these word relationships, which are then used for further analysis. This mechanism allows the DOR system to obtain local information from articles, leveraging it for recommendations. Figure 4 illustrates an example of the LRM module, where the input is titled "Monkeypox to become a notifiable disease in NZ." The title undergoes encoding using a pre-trained word embedding model, resulting in its representations.
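As a rough illustration of this encoding step, the sketch below maps title words to vectors via a toy embedding table. The table stands in for the pre-trained word embedding model mentioned above; its vocabulary and vector values are arbitrary.

```python
import numpy as np

# Toy stand-in for a pre-trained word embedding table (e.g. Word2Vec or
# GloVe in a real system); the vector values are arbitrary.
DIM = 4
EMB = {w: np.full(DIM, float(i + 1))
       for i, w in enumerate(["monkeypox", "notifiable", "disease", "nz"])}

def encode_title(title):
    """LRM input encoding: map each word of a title to its embedding,
    yielding a (num_words, DIM) matrix of lexical-level representations.
    Out-of-vocabulary words fall back to a zero vector."""
    words = [w.strip(".,").lower() for w in title.split()]
    return np.stack([EMB.get(w, np.zeros(DIM)) for w in words])

M = encode_title("Monkeypox to become a notifiable disease in NZ")
print(M.shape)  # (8, 4)
```

The resulting word-vector matrix is what downstream layers consume to model intra-article word relationships.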
On the other hand, high-order relations describe the connections between entities mentioned in the article that are not explicitly stated in the text but can be inferred through additional context or global knowledge. To identify these connections, the model uses techniques such as knowledge distillation and entity linking [52], which align entities present in the textual content with pre-defined entities in a global knowledge graph. In this research, we use Wikidata as the global knowledge graph, which is a vast repository of structured data from the real world. To provide a clearer understanding of the High-order Relation Module (HRM), an example is depicted in Fig. 5. The HRM module utilizes triple distillation techniques to extract subjects, predicates, and objects from textual inputs. In the example, the subject "Abandoned Theme Parks" and the object "Nostalgia" are identified as entities within the input. Additionally, the predicate "Explore" is also extracted, representing the connection between the two entities. However, due to potential limitations in extraction, the keywords of the textual input are also considered as entities in our model and aligned with corresponding entities in a sub-knowledge graph. Consequently, closely related entities (represented as green nodes in the graph) associated with the input keywords (represented as yellow nodes in the graph) are further extracted from the sub-graph and integrated into the existing triples.
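The enrichment step described above can be sketched with a hypothetical toy sub-graph. Both the entities and the relations below are illustrative placeholders; a real system would resolve entities against Wikidata via an entity-linking toolkit.

```python
# Hypothetical toy sub-knowledge-graph: entity -> [(relation, neighbor), ...]
SUB_KG = {
    "Abandoned Theme Parks": [("instance_of", "Amusement Park"),
                              ("has_quality", "Decay")],
    "Nostalgia": [("subclass_of", "Emotion")],
}

def enrich_triples(triples, kg, max_neighbors=2):
    """Expand distilled (subject, predicate, object) triples with
    one-hop neighbors of any entity found in the sub-graph."""
    enriched = list(triples)
    for s, p, o in triples:
        for entity in (s, o):
            for rel, nbr in kg.get(entity, [])[:max_neighbors]:
                enriched.append((entity, rel, nbr))
    return enriched

triples = [("Abandoned Theme Parks", "Explore", "Nostalgia")]
for t in enrich_triples(triples, SUB_KG):
    print(t)
```

The enriched triple set is what the HRM uses to build the input's sub-knowledge graph when the directly distilled triples are too sparse.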

Graph feature learning
We employ graph feature learning to extract entity neighbors and relations from a graph, which are then utilized in a graph-based feature learning mechanism known as the Attention-based Graph-enhanced Global Contextual (AGGC) model. The AGGC model leverages contextual information, such as multi-head neighbor expressions and their associated relation representations, to discover embeddings for each node. It further improves input representations by incorporating a global knowledge graph and considering contextual information within the graph. This graph feature learning mechanism highlights the varying importance of each node, enabling a better understanding of the significance of individual entities.
The proposed AGGC model combines deep neural networks with advanced attention mechanisms to capture intricate relationships within the data. It builds upon the success of existing graph attention models, such as the Graph Attention Network (GAT), but introduces key enhancements that significantly improve performance and versatility. Different from traditional graph convolutional networks (GCNs), which aggregate information from immediate neighbors, and GAT, which selectively attends to neighbors using attention mechanisms, the AGGC extends the concept of attention by incorporating contextual information into the attention mechanism. It considers the global information or context of the graph when assigning attention weights to neighbors. By integrating local and global information, the AGGC allows nodes to attend to neighbors that are relevant in a broader context, capturing more holistic information from the graph and enhancing the representation learning process.
The AGGC model architecture consists of multiple layers of graph convolutional operations (GCNConv), followed by non-linear activations. An attention layer is introduced to compute attention weights based on the output of each GCNConv layer. These attention weights are then applied to the GCNConv outputs using element-wise multiplication and aggregation. Finally, a linear layer is used for further transformation before producing the final context embeddings. To train the AGGC model, an optimizer (e.g., Adam) is employed to minimize the defined loss function between the predicted context embeddings and the target values. The model parameters are updated iteratively over a specified number of epochs.
The AGGC model significantly advances the capture of complex relationships within graph data by incorporating global contextual information through attention mechanisms. Its ability to attend to relevant neighbors in a broader context (hybrid domains) enhances the representation learning process and enables a more comprehensive understanding of each entity's significance. Thus, the AGGC model stands as a valuable and innovative approach for graph-based learning tasks. The detailed algorithm of the AGGC is described as follows.
The AGGC presents a novel neural network architecture that deals with graph-structured information, which reuses the concepts of "local" and "global" and focuses on contextual information from a global perspective. The input is a user belief graph with a set of nodes represented by E and a set of edges expressed by R. Each node e_i in this user network is associated with a feature vector e_i ∈ R^d, where d is the dimension of the feature vector. The purpose of the AGGC model is to calculate a new representation for each node by integrating both local and global contextual information. The local information is captured through a graph convolution operation, which aims to obtain the immediate neighborhood characteristics of a node in the user graph. It focuses on local connections and features between a node and its neighbors to obtain node-level local contextual information, which can be defined as:

y_{local,i}^l = σ( Σ_{j∈N_i} (1/w_ij) W^l y_j^l + b^l )

where y_i^l represents the hidden expression of node e_i at layer l, N_i summarizes all one-hop neighbors of node e_i, W^l and b^l are the weight matrix and bias vector respectively, and σ is the activation function. 1/w_ij indicates the normalized edge weight between nodes e_i and e_j, which can be obtained by learning from the data source.
For capturing global information, our AGGC model conducts the graph attention mechanism between each layer of the local graph convolution operation. The global attention weights a_ij^l between nodes e_i and e_j at layer l are computed as:

a_ij^l = exp(y_i^l · y_j^l) / Σ_{k∈N_i} exp(y_i^l · y_k^l)

where · denotes the dot product operation. The attention weights determine the global importance of node connections and enable the model to focus on global contextual information.
At last, the updated hidden expression y_i^{l+1} at layer l + 1 is learned by incorporating the local and global information using a linear transformation:

y_i^{l+1} = W_local^l y_{local,i}^l + W_global^l Σ_{j∈N_i} a_ij^l y_j^l + b^l

where y_{local,i}^l is the output of the local graph convolution, a_ij^l are the attention weights, and W_local^l, W_global^l, and b^l refer to learnable parameters.
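To make the layer update concrete, the following didactic NumPy sketch combines the local graph convolution, the dot-product attention weights, and the linear local/global combination described above. It is an illustration under our own assumptions (tanh activation, a fully connected toy graph with unit edge weights), not the paper's GCNConv-based implementation trained with Adam.

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def aggc_layer(Y, A, W, b, W_local, W_global, b_out):
    """One AGGC-style layer: local graph convolution over one-hop
    neighbors, softmax-normalized dot-product attention weights, then a
    linear combination of the local and global terms.
    Y: (n, d) node features; A: (n, n) weighted adjacency (w_ij)."""
    n, d = Y.shape
    Y_local = np.zeros_like(Y)
    Y_global = np.zeros_like(Y)
    for i in range(n):
        nbrs = np.nonzero(A[i])[0]
        if len(nbrs) == 0:
            continue
        # local: (1 / w_ij)-weighted convolution over neighbors
        agg = sum((1.0 / A[i, j]) * (W @ Y[j]) for j in nbrs)
        Y_local[i] = np.tanh(agg + b)
        # global: attention weights from dot products of node features
        a = softmax(np.array([Y[i] @ Y[j] for j in nbrs]))
        Y_global[i] = sum(a_ij * Y[j] for a_ij, j in zip(a, nbrs))
    # linear combination of local and global contextual information
    return Y_local @ W_local.T + Y_global @ W_global.T + b_out

rng = np.random.default_rng(1)
n, d = 4, 8
Y = rng.normal(size=(n, d))
A = np.ones((n, n)) - np.eye(n)   # toy fully connected graph, w_ij = 1
W, W_local, W_global = rng.normal(size=(3, d, d)) * 0.1
b = b_out = np.zeros(d)
print(aggc_layer(Y, A, W, b, W_local, W_global, b_out).shape)  # (4, 8)
```

Stacking several such layers, as the architecture description suggests, lets each node absorb progressively wider local and global context.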

Knowledge graph distillation and construction
In the proposed DOR, three approaches are applied to address the issue of inadequate entity and relation distillation. First, it utilizes both in-domain and out-domain knowledge from external sources as inputs to the DOR model, thereby alleviating the cold-start problem, improving generalization ability, and increasing the robustness of the model. Second, it utilizes word embedding techniques to represent the words in each input, considering their low-order relations. Additionally, a global knowledge graph is incorporated to enrich the representation of entities within the inputs. An example of the textual input knowledge graph extraction and construction process is illustrated in Fig. 6. The input is a text composed of a title or overview, with several keywords. Triples (subject, predicate, object) are extracted from the input. Inadequate triple extraction may lead to information sparsity in the experiment. We employ the entity alignment technique to link with corresponding entities and relations in the sub-graph (Wikidata knowledge graph). Based on these entities, their neighbors and relations are also explored. All the extracted information is then used to construct a knowledge graph of the input.

Global-observation mechanism
The Global Observation Mechanism (GOM) in the DOR system receives the content and contextual features of the input from the local observation mechanism. The focus of this module is to study the different influences on the user's behavior from each piece of input. Therefore, the GOM is a deep learning model with a self-attention mechanism that analyses the mutual influence between users' beliefs and each piece of input. This module allows investigation of how users' perceptions and behaviors are affected by the input. Specifically, the GOM takes into account the dynamic nature of users' beliefs. It incorporates a sequential characteristic that considers each historical reading record as a snapshot of the user's evolving belief network. The GOM receives the representations of the current input from the local observation mechanism and uses the user's current belief network to output the user's preferences. Thus, it models how the user's perception of the current input is influenced by their previous reading history and belief network, which makes our model more adaptive to the user's dynamic interests.
Mathematically, given the embedding U_e^i of user i and the candidate input embedding N_e^i, the probability of user i clicking input N_i, i.e., p_{i,N_i}, is estimated through a general deep neural network D:

p_{i,N_i} = D(U_e^i, N_e^i).
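A minimal stand-in for the network D can score the concatenated user and candidate-input embeddings with a single sigmoid layer. This is a sketch only; the actual D in DOR is a deeper network, and the weights here are hypothetical.

```python
import math

def click_probability(user_emb, news_emb, weights, bias):
    """p_{i,N_i} = D(U_e, N_e): a one-layer illustrative stand-in for
    the deep network D, applied to concatenated embeddings."""
    x = user_emb + news_emb                        # list concatenation
    z = sum(wi * xi for wi, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))              # sigmoid -> probability
```

The sigmoid keeps the output in (0, 1), so it can be trained directly against click/no-click labels with log loss.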

Experiment and analysis
In this section, we first evaluate the performance of the proposed DOR model using three real-world datasets by comparing it against a few well-known neural recommendation methods. Subsequently, an ablation study is conducted to verify the contribution of the dual observation mechanisms. Next, we analyze the impact of the chosen parameters on our model's performance. Finally, we discuss the insights obtained from the experiments.

Dataset
This section describes the experimental setup for evaluating the performance of the proposed DOR. In the experiments, we leverage three real-world datasets from two sources: the Microsoft News datasets (MIND) and the IMDB dataset. The IMDB dataset consists of movie rating records derived from users' past behaviors. This dataset provides movie-related information, including movie ID, genres, titles, and overviews. Furthermore, it includes user ratings for target movies and their respective user IDs. In our experiments, we consider the rating degree of a movie as an indication of the user's interest. Ratings range from 0 to 5, where a rating of 5 represents a high level of interest, and a rating of 0 indicates no interest. We define a standard interest threshold of 2.5, where ratings above this threshold imply user interest (clicked movie), while ratings below 2.5 suggest a lack of interest. By leveraging the IMDB dataset, we aim to enhance the recommendation capabilities of the DOR system by incorporating movie-related content and user preferences.
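The 2.5-point interest threshold described above amounts to a one-line labeling rule; this sketch treats ratings exactly at the threshold as non-clicks, an assumption the paper does not spell out.

```python
def rating_to_click(rating, threshold=2.5):
    """Binarise an IMDB rating (0-5) into a click/no-click label
    using the 2.5 interest threshold described in the text."""
    return 1 if rating > threshold else 0
```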

Datasets statistics.
Table 1 provides an overview of the sizes and properties of the three datasets: IMDB, MIND-small, and MIND-large, showing statistics on the number of users, user behaviors, words, and entities, as well as specific constraints on the data. The maximum number of words per title represents a constraint on the length of article titles or movie overviews. The titles or overviews in all three datasets have a maximum length of 20 words. Additionally, there are constraints on the maximum number of history entries and impression logs per user.

Experimental settings
The proposed model and the baseline models are implemented using TensorFlow [53]. The experiments are configured to evaluate the effect of varying the number of epochs from 0 to 8 across all datasets. We adopt a batch size of 20 and a learning rate of 0.0001 for these trials.
To verify the contribution of the dual-observation (DO) mechanism, we design an experimental setup in which the representation of the user's belief is removed from the DOR model, leaving only the local observation mechanism. Additionally, we compare the performance of our proposed High-order Relation Model with other classical graph-based embedding models in the DOR.
In the experiments, we employ 300-dimensional word embeddings obtained with the GloVe method [54]. In addition, we evaluate the performance of other embedding techniques, including Word2Vec [54] and BERT [54].
The DO mechanism extracts contextual data and is trained using a total of 16 layers. The context embedding dimension for each entity is consistently set to 300. The input of the DO module is pre-processed via the TransR model with a dimensionality of 300 and is trained using a series of 10 batches. The DOR model is trained using the Adam optimizer [55], with log loss as the primary objective function.
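The log loss objective mentioned above can be sketched in plain Python (an illustration of the binary cross-entropy being minimized, not the TensorFlow implementation used in the paper):

```python
import math

def log_loss(y_true, y_pred, eps=1e-12):
    """Mean binary log loss (cross-entropy) over click labels y_true
    and predicted click probabilities y_pred."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)
```

The Adam optimizer then adjusts the model parameters to reduce this quantity over training batches.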

Evaluation metrics
In the experiments, we employ several metrics widely used in the field of recommendation systems [56].
• AUC (Area under the ROC Curve): AUC measures the probability that a randomly chosen relevant item ranks higher than a randomly chosen irrelevant item. A higher AUC value indicates that the model can better distinguish between relevant and irrelevant items.
• MRR (Mean Reciprocal Rank): MRR is the mean of the reciprocal ranks over multiple queries. It measures the effectiveness of a ranking system, with a higher MRR indicating a higher level of effectiveness.

• NDCG (Normalised Discounted Cumulative Gain): NDCG measures the ranking quality of a recommendation system. Its principle is that highly relevant items should rank higher than irrelevant items. A higher NDCG value indicates a better ranking of relevant items.
• NDCG@5: calculates the NDCG of the first five recommendations.
• NDCG@10: calculates the NDCG of the first ten recommendations.
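The metrics above can be sketched in plain Python for binary relevance lists; these are illustrative implementations, whereas the experiments rely on standard library routines.

```python
import math

def auc(labels, scores):
    """Pairwise AUC estimate: fraction of (relevant, irrelevant) pairs
    where the relevant item is scored higher (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mrr(ranked_relevance):
    """Mean reciprocal rank; each entry is a 0/1 relevance list
    in ranked order for one query."""
    total = 0.0
    for rels in ranked_relevance:
        total += next((1.0 / (i + 1) for i, r in enumerate(rels) if r), 0.0)
    return total / len(ranked_relevance)

def ndcg_at_k(rels, k):
    """NDCG@k for one ranked 0/1 relevance list: DCG of the top-k
    normalised by the DCG of the ideal ordering."""
    dcg = sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = sorted(rels, reverse=True)
    idcg = sum(r / math.log2(i + 2) for i, r in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0
```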

Baseline methods
To evaluate the performance of our model, we compare it against several widely used baselines in the field of recommendation. These baselines include:
• DKN [5] represents a news recommendation system, employing attention networks to procure entity-word level representations. This model enhances the modeling of pertinent information by utilizing a dual-attention mechanism.
• NRMS [37] introduces an innovative approach for extracting user interests in situations where user interest data is scarce. The proposed model considers both the news title and abstract, thereby creating a comprehensive perception of users' reading interests.
• NAML [57] presents a novel neural news recommendation strategy, applying attentive multi-view learning to assimilate diverse types of news information into the representation of news.
• LSTUR [58] advocates a neural methodology for news recommendation that acknowledges both immediate and lasting user interests. Utilizing a GRU (Gated Recurrent Unit), it captures short-term user preferences based on recent news engagement while considering long-term user interests.
• FIM [59] integrates multi-grained representation and matching methodologies to identify fine-grained interest signals through interactions among news articles at various semantic layers.
• UNBERT [60] addresses the cold-start issue in news recommendation by leveraging a model pre-trained on out-domain data. It integrates multi-grained user-news matching signals at both the word and news levels via WLM (Word-Level Matching) and NLM (News-Level Matching) strategies, respectively.
• MINER [61] utilizes a poly attention scheme to derive multiple user preference vectors, capturing various facets of user interest through attention. It also implements a disagreement regularization technique to boost the diversity of the learned interest vectors. A category-aware attention weighting strategy is further adopted to adjust the significance of historical news based on category resemblance.

Performance evaluation
In this section, we evaluate the proposed DOR model by comparing it against a number of state-of-the-art baselines. Table 2 demonstrates the performance of various models on three real-world datasets: MIND-Small, MIND-Large, and IMDB. As can be observed from this table, the DOR-Glove model achieved the highest performance in AUC, MRR, NDCG@5, and NDCG@10 across all datasets. "Improv.min" indicates the percentage by which the DOR-Glove model outperforms the MINER model in terms of expressiveness, while "Improv.max" refers to the percentage by which it outperforms the DKN model.
To explain the results further, the DOR-Glove model obtained AUC scores of 0.7917 and 0.8701 on the MIND-Small and MIND-Large datasets, respectively, indicating its strong predictive capability. Moreover, it achieved a high MRR of 0.4714, implying the successful ranking of relevant documents higher in the recommendation list. The NDCG@5 and NDCG@10 scores of 0.5731 and 0.6775 illustrate the model's effectiveness in promoting relevant documents to the top of the recommendation list.
On the other hand, the DKN model demonstrated competitive performance on both MIND datasets, with AUC scores of 0.629 and 0.6407, respectively. However, it performed lower in MRR and NDCG metrics than the DOR-Glove model, implying that its ranking and relevance were not as strong. The FIM, NRMS, LSTUR, NAML, and UNBERT models achieved higher AUC scores than DKN, indicating superior discrimination ability. However, these models fell short in MRR and NDCG metrics, suggesting they may not rank relevant items as accurately as others. Interestingly, the DKN model outperformed the NRMS and LSTUR models on the IMDB dataset, attaining an AUC score of 0.6784. Among all baselines, the MINER model consistently delivered high performance across different metrics on all datasets.
DOR-BERT and DOR-Word2Vec, two variants of the DOR model, demonstrated competitive performance on the MIND-Small dataset, with higher AUC scores than other models. However, their performance in MRR and NDCG metrics was not as strong as that of the DOR-Glove model. In particular, the DOR-BERT model displayed weak performance on the MRR metric.
Based on the results of this experiment, the DOR-Glove model demonstrates superior performance to the baseline methods. Its strengths stem from effectively integrating dual observations when building user preferences. It also captures global semantic relationships and contextual information within the text data, resulting in more precise and relevant recommendations.

Ablation studies
In this section, we conduct three ablation experiments using the MIND and IMDB datasets to assess the individual contributions of the various modules within the DOR model, enabling us to discern their specific impacts on performance and understand their importance in enhancing recommendation outcomes.
The first ablation experiment aims to evaluate the contribution of the proposed Dual Observations (DO) mechanism. The second ablation experiment examines the effectiveness of the proposed Attention-based Graph-enhanced Global Contextual (AGGC) model. The third ablation study investigates the influence of the classic translational distance models that can be adopted in DOR, including TransE, TransH, and TransR.
Ablation Study 1: By comparing the performance of the DOR model with and without DO, we aim to quantify the contribution of DO in our model. Recall that the DO mechanism employs a comprehensive two-fold observation pattern encompassing local and global perspectives. A model without the DO mechanism neglects the importance of out-domain semantic information and the user's belief representations.
Table 3 demonstrates the results of the first ablation study, where the MIND-Small, MIND-Large (denoted as MIND↑), and IMDB datasets are utilized.
As can be seen from Table 3, for the MIND-Small dataset, the results explicitly show that the model with the DO mechanism achieved a higher AUC (0.7917) compared to the model without DO (0.7522). Similarly, the DO model outperformed the non-DO model in terms of MRR, NDCG@5, and NDCG@10. In the case of the MIND-Large dataset, the model with DO demonstrates even better performance, with a higher AUC (0.8701), MRR (0.4714), NDCG@5 (0.5731), and NDCG@10 (0.6775) compared to that of the non-DO variant. Lastly, on the IMDB dataset, the model with DO achieved an AUC of 0.7309, an MRR of 0.2731, an NDCG@5 of 0.8688, and an NDCG@10 of 0.9149, while it performed poorly without DO. This further highlights the effectiveness of the DO mechanism, particularly when applied to larger datasets.
The results from this ablation study consistently demonstrate that incorporating the DO mechanism leads to improved performance across various evaluation metrics. The higher AUC, MRR, and NDCG scores obtained by the DO models indicate their superior effectiveness in capturing and utilizing dual observations compared to those without the DO mechanism.

Ablation study 2:
We proposed the Attention-based Graph-enhanced Global Contextual (AGGC) model to capture high-order relation representations. In this ablation study, we replaced the AGGC model with three alternative graph embedding models, namely GCN, GAT, and GraphSAGE, to evaluate their performance within our model.
Figure 7 illustrates the performance of different graph embedding models on various metrics, namely AUC, MRR, NDCG@5, and NDCG@10. The y-axis represents the metrics, while the x-axis represents the performance values ranging from 0 to 1. Each line in the graph corresponds to a specific model and is assigned a specific color. The models MIND_GCN, MIND_GAT, and MIND_AGGC exhibit similar and higher AUC scores compared to MIND_GraphSAGE. Among the IMDB models, IMDB_GCN and IMDB_GAT perform better than IMDB_GraphSAGE, while IMDB_AGGC demonstrates the highest AUC score. MIND_AGGC consistently achieves the highest MRR score among all models. MIND_GCN and MIND_GAT also perform relatively well. However, for the IMDB dataset, IMDB_AGGC does not perform as well as AGGC does on the MIND dataset. MIND_AGGC shows the highest NDCG@5 and NDCG@10 scores, indicating its superior performance. MIND_GCN and MIND_GAT also exhibit competitive performance. IMDB_AGGC achieves the highest NDCG@5 and NDCG@10 scores among the IMDB models, followed by IMDB_GAT.
In conclusion, the graph reveals that MIND_AGGC consistently outperforms the other models across all metrics. It showcases the effectiveness of the AGGC model architecture on the MIND dataset. Similarly, among the IMDB models, IMDB_AGGC and IMDB_GAT demonstrate relatively better performance compared to IMDB_GraphSAGE and IMDB_GCN.
Ablation study 3: We analyze the impact of incorporating these different translational distance models on the overall performance of the DOR model.
Figure 8 presents a comparative analysis of various knowledge graph embedding models, including TransE, TransH, and TransR, when applied to DOR. The evaluation metrics, AUC, MRR, NDCG@5, and NDCG@10, are plotted on the y-axis, while the x-axis signifies the different models.
The results reveal the performance differences between the models across the four evaluation metrics. TransE_large, TransH_large, and TransR_large show a dominant performance over the rest. The "large" suffix in these models signifies their training on a larger-scale dataset, specifically MIND-Large.
TransR leads in performance, consistently achieving peak scores in AUC, MRR, NDCG@5, and NDCG@10. This suggests that TransR excels at modeling intricate relationships and encapsulating the semantic interactions between entities and relations in the DOR. TransE also demonstrates competitive performance, with high scores in all metrics. While its scores fall slightly short compared to TransR and TransH, it maintains a robust overall performance. TransH's performance mirrors that of TransE, indicating its effective capture of the interactions between entities and relations when applied to DOR.
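The translational scoring functions compared in this study can be sketched as follows: TransE scores a triple in a single shared embedding space, while TransR first projects the entity embeddings into the relation space via the matrix M_r. The list-based representation and squared-distance form are illustrative assumptions.

```python
def trans_e_score(h, r, t):
    """TransE: squared translation distance ||h + r - t||^2 in a
    shared embedding space; lower means a more plausible triple."""
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t))

def trans_r_score(h, r, t, M_r):
    """TransR: project entity embeddings into the relation space with
    M_r, then translate: ||M_r h + r - M_r t||^2."""
    def project(v):
        return [sum(M_r[i][j] * v[j] for j in range(len(v)))
                for i in range(len(M_r))]
    hp, tp = project(h), project(t)
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(hp, r, tp))
```

TransH (not shown) instead projects entities onto a relation-specific hyperplane before translating, which is why its behavior tends to sit between TransE and TransR.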
TransR shows superior performance across all evaluation metrics. Thus, in the context of DOR, TransR is regarded as the preferred translational distance model.

Parameter analysis
In this section, we aim to analyze the impact of various parameters on our proposed DOR model. Specifically, we investigate the effects of epoch numbers, filter sizes, and dimension numbers on the performance of the DOR model using the MIND-Large dataset.
First, we conduct experiments with different epoch sizes from 0 to 8 to evaluate the performance of the DOR model. This allows us to examine the model's behavior and performance under varying training durations. Second, we explore the influence of filter sizes in the DOR system. Specifically, we experiment with 5, 8, and 10 filters, respectively, as the number of filters for each size in the DOR system. This analysis helps us understand the impact of filter sizes on the model's ability to capture relevant information and make accurate recommendations. Third, we examine the effects of dimension numbers on the performance of the DOR system. We vary the dimensions for all features of the DOR model, specifically exploring dimensions 50, 100, and 300. By assessing the model's performance with different dimension settings, we are able to obtain an optimal dimensionality for feature representations.
Figure 9(a) demonstrates the model's performance over multiple epochs, with the x-axis representing the epochs and the y-axis representing the AUC values. The AUC value commences at 0.7261 (Epoch 0) and displays an upward trend through training, peaking at 0.8701 (Epoch 5). Subsequent epochs exhibit slight fluctuations in the AUC value, suggesting that the model's performance stabilizes and may not see significant improvements beyond Epoch 5.
In a second parameter analysis, the dimensions are varied according to Fig. 9(b), which reveals that AUC performance differs based on the dimension setting. There is an upward trend in AUC values as the dimensionality increases: 0.8154 for dimension 50 (moderate performance), 0.8523 for dimension 100, and 0.8701 for dimension 300, indicating improved model performance with increased dimensionality.
Lastly, an analysis of filter sizes (2, 5, 8, and 10) is conducted to evaluate the performance of the DOR model. Figure 9(c) reveals a gradual increase in AUC performance corresponding to the number of filters, reaching its peak at 8 filters with an AUC value of 0.8701. Although the AUC slightly decreases to 0.8688 at 10 filters, it remains close to the highest value, suggesting that increasing the number of filters beyond 8 brings no further benefit. These three parameter analyses reveal the interplay between the various parameters and the model's performance, providing critical insights for optimizing the proposed DOR.
In summary, our experiments demonstrate that the DOR model holds a clear advantage in recommendations when utilizing the dual-observation mechanism. Our results confirm that it is essential to fully observe the features of items. Additionally, our experiments demonstrate the importance of considering the mutual influence between the user's belief network and the items in the recommendation process.

Discussion
In the evaluations, three real-world datasets were adopted to compare the performance of the DOR model against well-known neural recommendation methods. This comparison provides a comprehensive assessment of the model's effectiveness. The results showed that the DOR model outperformed its competitors, indicating its superiority in integrating dual observations, capturing global semantic relationships, and utilizing contextual information for more accurate and relevant recommendations.
Based on the experimental results, we observe that our DOR model is superior to the other baselines. This reflects that the DOR model can generate more accurate results and provide more relevant recommendations, which has vital implications for alleviating information overload in recommender systems and improving user satisfaction. These experimental results show that by introducing a dual-observation mechanism, i.e., a local observation mechanism that fully considers contextual semantic information and hybrid-domain features, and a global observation mechanism that adequately captures the mutual propagation between the user's belief network and outside data sources, our DOR model can better uncover vital user preferences. This is critical for personalizing recommendations and improving user experience.
Furthermore, our experiments demonstrate the effectiveness of high-order relation representation in the DOR model. By employing graph representation models, especially our AGGC model, we are able to depict higher-order relational representations. This is critical for recommendation systems, as it can explore more complex semantic information among items, thus providing more accurate, relevant, and personalized recommendations. Additionally, the flexible use of the global observation mechanism in the model can further reflect the interaction between the user's existing knowledge system and externally accessed information.
These findings have important implications for advancing domain knowledge and recommender systems. Our experimental results illustrate that the performance of recommender systems can be significantly improved by incorporating the dual-observation mechanism.

Conclusion and future work
In this research, we proposed a novel recommendation method called DOR, which leverages a dual observation mechanism consisting of two key modules, i.e., the local and global observation modules.
The local observation module focuses on improving the model's generalization ability, enhancing semantic relationships in the user belief network, addressing challenges posed by data sparsity, and making the model applicable to various domains. The global observation module considers the interaction between textual item representations and user belief networks. This module incorporates external features into the user's knowledge and interest representation process by analyzing the propagation and influence of external information on the user belief network. The module recognizes the importance of personalized recommendations and ensures that the recommendations align with the user's preferences and reading behavior.
Extensive experiments have been conducted using real-world datasets, comparing the DOR model against several state-of-the-art baselines. The results demonstrate the stability and accuracy of the proposed DOR model, showcasing its effectiveness in news recommendation tasks.
In the future, we plan to delve deeper into the importance of observation mechanisms in the field of recommendations. This involves further exploring representation techniques that capture high-order and low-order interactions, as well as studying the use of triple representations to better represent user reading behavior. Additionally, our goal is to explore integrating other features, such as images and categories, to enhance the overall performance of recommendation systems. These future directions will contribute to advancing the field of recommendations and further improving the DOR model.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Fig. 1 Simplified illustrations of entities and relations in TransE, TransH, and TransR
TransR further improves the embedding performance by modeling entities and relations in distinct embedding spaces, because entities and relations are completely different objects. To perform the translation in such settings, TransR sets the entity embeddings as h, t ∈ R^k and the relation embedding as r ∈ R^d for any triple (h, r, t), where k ≠ d. Meanwhile, a projection matrix M_r ∈ R^{k×d} is used to project entities from the entity embedding space into the relation embedding space, as shown in Fig. 1(c). The scoring function of TransR is described in (3).

Fig. 5 An example of the High-order Relations Model (HRM)

Fig. 7 Ablation study on diverse graph representation models
Fig. 9 Parameter analysis on AUC scores

References
1. Wu C, Wu F, An M, Huang J, Huang Y, Xie X (2019) NPA: neural news recommendation with personalized attention. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, pp 2576-2584
2. Wang M, Ren P, Mei L, Chen Z, Ma J, de Rijke M (2019) A collaborative session-based recommendation approach with parallel memory modules. In: Proceedings of the 42nd international ACM SIGIR conference on research and development in information retrieval, pp 345-354
3. Wang H, Zhang F, Wang J, Zhao M, Li W, Xie X, Guo M (2018) RippleNet: propagating user preferences on the knowledge graph for recommender systems. In: Proceedings of the 27th ACM international conference on information and knowledge management, pp 417-426
4. Lei F, Liu X, Dai Q, Ling BW-K, Zhao H, Liu Y (2020) Hybrid low-order and higher-order graph convolutional networks. Comput Intell Neurosci 2020
5. Wang H, Zhang F, Xie X, Guo M (2018) DKN: deep knowledge-aware network for news recommendation. In: Proceedings of the 2018 world wide web conference, pp 1835-1844
6. Wang H, Zhao M, Xie X, Li W, Guo M (2019) Knowledge graph convolutional networks for recommender systems. In: The world wide web conference, pp 3307-3313
7. Shlezinger N, Eldar YC, Boyd SP (2022) Model-based deep learning: on the intersection of deep learning and optimization. IEEE Access 10:115384-115398
8. Wu L, He X, Wang X, Zhang K, Wang M (2022) A survey on accuracy-oriented neural recommendation: from collaborative filtering to information-rich recommendation. IEEE Trans Knowl Data Eng
9. Wang R, Wu Z, Lou J, Jiang Y (2022) Attention-based dynamic user modeling and deep collaborative filtering recommendation. Expert Syst Appl 188:116036
10. Wang X, He X, Cao Y, Liu M, Chua T-S (2019) KGAT: knowledge graph attention network for recommendation. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, pp 950-958
11. Javed U, Shaukat K, Hameed IA, Iqbal F, Alam TM, Luo S (2021) A review of content-based and context-based recommendation systems. Int J Emerg Technol Learn (iJET) 16(3):274-306
12. Wynne HE, Wint ZZ (2019) Content based fake news detection using n-gram models. In: Proceedings of the 21st international conference on information integration and web-based applications & services, pp 669-673

Textual input shown in Fig. 6: "17 Abandoned Theme Parks to Explore for Thrills, Chills, and Nostalgia. Disney, Six Flags, and even the Flintstones have had amusement parks that succumbed to disasters, bad press, and shifting entertainment markets. But for the adventurous, abandoned theme parks, whether in California, Florida, or Ohio, can be fascinating places to explore if you dare."

Fig. 6 An example of user knowledge graph construction and representation

Table 1 Statistics of datasets