The cognitive comparison enhanced hierarchical clustering

Guan, Chun; Yuen, Kevin Kam Fung

doi:10.1007/s41066-021-00287-x

Original Paper
Open access
Published: 28 October 2021

Volume 7, pages 637–655, (2022)

Granular Computing

Abstract

The growth of online shopping is rapidly changing the buying behaviour of consumers. Today, buyers face the challenge of selecting a preferred item from the numerous choices available in the market. To improve the consumer online shopping experience, recommender systems have been developed to reduce the information overload. In this paper, a cognitive comparison-enhanced hierarchical clustering (CCEHC) system is proposed to provide personalised product recommendations based on user preferences. A novel rating method, cognitive comparison rating (CCR), is applied to weigh the product attributes and measure the categorical scales of attributes according to expert knowledge and user preferences. Hierarchical clustering is used to cluster the products into different preference categories. The CCEHC model can be used to rank and cluster product data with the input of user preferences and produce reliable customised recommendations for the users. To demonstrate the advantages of the proposed model, the CCR method is compared with the rating approach of the analytic hierarchy process. Two recommendation cases are demonstrated in this paper with two datasets, one collected by this research for laptop recommendation and the other an open dataset for workstation recommendation. The simulation results demonstrate that the proposed system is feasible for providing personalised recommendations. The significance of this research is the provision of a recommendation solution that does not depend on historical purchase records; rather, it considers the users’ rating preferences and expert knowledge, both of which are measured by CCR. The proposed CCEHC model could be further applied to other similar recommendation cases such as music, books, and movies.

1 Introduction

Online shopping has already influenced the purchasing behaviour of consumers. Today, buyers face an overload of information when selecting the most preferred goods. Recommender systems (RSs) are developed to recommend appropriate products to consumers on the basis of their historical records. An effective RS service can boost sales by building and increasing customer loyalty (Aggarwal 2016). Reviews of RS technologies can be found in Aggarwal (2016), Haruna et al. (2017), Adomavicius and Kwon (2015), Kunaver and Požrl (2017), Kotkov et al. (2016), Zhang et al. (2017), and Ma et al. (2018). RSs are typically categorised into three types: collaborative filtering, content-based, and hybrid (Aggarwal 2016). Since these types are based on user profiles, including historical ratings and purchase records (Lika et al. 2014), the RSs have insufficient information to learn the interests of new users. This lack of information about newly joined users is known as the cold-start problem, which is a critical challenge for RSs (Kunaver and Požrl 2017; Lika et al. 2014; Volkovs et al. 2017; Viktoratos et al. 2018). A discussion and review of the cold-start problem can be found in Lika et al. (2014).

Cold-start problems have a significant influence on high-end consumer electronics such as smartphones, laptops, game consoles, and audio–video equipment. Since their electronic components and technologies are frequently updated, recommendations based on historical purchasing records may not be applicable to new products. The motivation of this research is to propose an expert system for product recommendations that is based on the current preferences of individual users and on expert knowledge elicited using the cognitive comparison rating (CCR) method. The proposed model does not suffer from the cold-start problem, as historical information is not used for the recommendations.

The evaluation of expert judgments and user preferences for products is complicated, as many products, such as the aforementioned high-end consumer electronics, consist of multiple attributes. Multi-criteria decision making (MCDM) methods, which can measure both user preferences and expert judgments for multiple product attributes, have been used in RSs (van Capelleveen et al. 2019; Song 2018; Zhang et al. 2018). The analytic hierarchy process (AHP), a classical MCDM method, has been adopted to evaluate user preferences for different product attributes (Hinduja and Pandey 2018; Karthikeyan et al. 2017; Pamučar et al. 2018; Wang and Tseng 2013). CCR, an improved alternative to AHP, is introduced in this study for evaluating expert judgments and user preferences. Because it rectifies the mathematical representation problem of the perception of paired differences in AHP, CCR is well suited to weighing product attributes and defining numerical values of nominal scales based on user preferences (Yuen 2009, 2012, 2014a, b).

To provide product recommendation services, the hierarchical clustering (HC) method is used to group the products based on the evaluation results of CCR. Different clustering analysis methods have been applied to identify groups of products that have similar attributes with respect to consumer preferences (Nilashi 2017; Frémal and Lecron 2017; Katarya and Verma 2017; Selvi and Sivasankar 2019). HC (Murtagh 1983; Ward Jr 1963; Han et al. 2011) is a popular clustering method that has been adopted in other RSs (Selvi and Sivasankar 2019; Gupta and Patil 2015; Zheng et al. 2013; de Aguiar Neto et al. 2020). HC builds a hierarchical decomposition of a dataset in the form of a tree graph (called a dendrogram). The major advantage of HC is that the dendrogram can be easily interpreted, since the distances between the objects are directly presented. However, HC has two limitations when applied to product-recommendation cases. Firstly, all product attributes are treated as equally important, whereas different consumers can have different preferences for each attribute. Secondly, product attributes measured on nominal scales cannot be directly used in clustering processes. To address these limitations, CCR is used to weigh product attributes and define numerical values of nominal scales with respect to user preferences. A novel system, cognitive comparison-enhanced hierarchical clustering (CCEHC), is proposed to provide product recommendations with respect to the current individual user’s rating preferences. The new method provides a solution to the cold-start problem in RSs by using the expert knowledge elicited from CCR instead of the users’ historical data. In addition, non-specialised consumers can express their preferences to interact with the system.

This paper significantly extends the earlier preliminary work (Guan and Yuen 2015; Guan 2018), especially in the sections on methods, experiments, comparisons, and discussions. The remainder of this paper is organised as follows. Section 2 proposes the novel CCEHC system. Section 3 demonstrates the validity and feasibility of the proposed method using a laptop recommendation case, for which the dataset was collected in this study. Section 4 discusses the advantages and limitations of the proposed approach. Section 5 presents the application of CCEHC for workstation recommendations using an open dataset. Finally, Sect. 6 concludes the study.

2 Cognitive comparison enhanced hierarchical clustering

The procedures of the CCEHC model are presented in Fig. 1. In Steps 1 and 2, the attributes of the products are structured as an attribute tree. According to the attribute tree, a raw data table is collected from different sources. In Step 3, CCR is applied to measure the nominal attribute values and attribute weights with user preferences. The resulting table is normalised in Step 4. In Step 5, the values of the products are produced by aggregating the normalised table and attribute weights. In Step 6, a personalised top-N recommendation is produced by ranking the product values. In the final step, the products are clustered by HC, and similar products can be recommended to the different users.
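Steps 4 to 6 can be illustrated with a short sketch. The details of the normalisation and aggregation are not given in this overview, so the code below assumes min-max normalisation and a weighted sum as stand-ins, and it treats every attribute as "higher is better". The product names, the table values, and the weights are hypothetical; in CCEHC the weights would come from CCR. Step 7 (clustering) is not covered here.

```python
from typing import List


def min_max_normalise(table: List[List[float]]) -> List[List[float]]:
    """Rescale each column to [0, 1] (assumed normalisation, not the paper's)."""
    columns = list(zip(*table))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in table
    ]


def fuse(normalised: List[List[float]], weights: List[float]) -> List[float]:
    """Aggregate each product row into one value by a weighted sum (assumed)."""
    return [sum(w * v for w, v in zip(weights, row)) for row in normalised]


def top_n(names: List[str], scores: List[float], n: int) -> List[str]:
    """Return the names of the n products with the highest values."""
    ranked = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:n]]


# Hypothetical data: three products, two numeric leaf attributes.
names = ["A", "B", "C"]
table = [[8.0, 0.4], [16.0, 0.6], [12.0, 0.4]]
weights = [0.7, 0.3]  # hypothetical attribute weights

scores = fuse(min_max_normalise(table), weights)
recommendation = top_n(names, scores, 2)
print(recommendation)
```

A cost-type attribute such as price would need its direction reversed before fusing; that handling is left out of this sketch.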

2.1 Specifying attributes

Detailed product information can be obtained from different sources, including manufacturer websites, product engineers, and retailers. A product is represented as a group of attributes, $\left\{ {\delta_{i} } \right\} = \left( {\delta_{1} , \delta_{2} , \ldots ,\delta_{i} , \ldots ,\delta_{n} } \right)$, where $\delta_{i}$ is the $i$th attribute of the product. Attributes can have sub-attributes. For example, an attribute $\delta_{i}$ is represented by $n_{i}$ sub-attributes, $\left\{ {\delta_{i,j} } \right\} = \left( {\delta_{i,1} , \delta_{i,2} , \ldots ,\delta_{i,j} , \ldots ,\delta_{{i,n_{i} }} } \right)$, where $\delta_{i,j}$ is the $j$th sub-attribute of $\delta_{i}$; in turn, the attribute $\delta_{i,j}$ is represented by $n_{i,j}$ sub-attributes, $\left\{ {\delta_{i,j,k} } \right\} = \left( {\delta_{i,j,1} , \delta_{i,j,2} , \ldots ,\delta_{i,j,k} , \ldots ,\delta_{{i,j,n_{i,j} }} } \right)$, where $\delta_{i,j,k}$ is the $k$th sub-attribute of $\delta_{i,j}$. The attributes of the different levels are structured as an attribute tree. A sample of the laptop attribute tree is presented in Fig. 2 in Sect. 3.
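The nested attribute structure can be sketched as a small tree type. This is only an illustration: the class name `Attribute`, the method `leaves`, and the attribute names are hypothetical and are not taken from Fig. 2.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attribute:
    """One node of the attribute tree; a node without children is a leaf."""
    name: str
    children: List["Attribute"] = field(default_factory=list)

    def leaves(self) -> List["Attribute"]:
        """Return the leaf attributes below this node, left to right."""
        if not self.children:
            return [self]
        found: List["Attribute"] = []
        for child in self.children:
            found.extend(child.leaves())
        return found


# Hypothetical three-level tree: product -> attributes -> sub-attributes.
tree = Attribute("laptop", [
    Attribute("performance", [Attribute("cpu"), Attribute("memory")]),
    Attribute("display", [Attribute("screen_size")]),
    Attribute("price"),
])

leaf_names = [leaf.name for leaf in tree.leaves()]
print(leaf_names)
```

Collecting the leaves this way yields the columns for which measurable values have to be gathered in the next step.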

2.2 Preprocessing data

The leaf attributes, denoted as L, are attributes without sub-attributes. The measurable values of leaf attributes are collected from different sources, as mentioned in Sect. 2.1. The product dataset D, consisting of m products and l leaf attributes, is denoted as $D=\left\{{d}_{\alpha \beta }|\forall \alpha \in \left(1,\dots ,m\right),\forall \beta \in \left(1,\dots ,l\right)\right\}$. An example of a laptop data matrix is presented in Sect. 3.2. D cannot be directly clustered, since it may contain nominal scales that do not have a natural ordering. In the proposed CCEHC system, the nominal scales are substituted by the numerical values measured using the CCR approach presented in the next step.
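The substitution of nominal scales can be sketched as a lookup over the raw table. The column names, the rows, and the numerical values assigned to the nominal labels below are all hypothetical; in CCEHC those values would be measured by CCR rather than written by hand.

```python
from typing import Dict, List, Union

Value = Union[float, str]

# Hypothetical raw table D: rows are products, columns are leaf attributes.
columns = ["price", "memory_gb", "brand"]
D: List[List[Value]] = [
    [900.0, 8.0, "brand_a"],
    [1200.0, 16.0, "brand_b"],
    [1000.0, 16.0, "brand_a"],
]

# Hypothetical numerical values for the nominal column (stand-in for CCR output).
nominal_values: Dict[str, Dict[str, float]] = {
    "brand": {"brand_a": 0.4, "brand_b": 0.6},
}


def substitute_nominal(
    table: List[List[Value]],
    names: List[str],
    mapping: Dict[str, Dict[str, float]],
) -> List[List[float]]:
    """Replace nominal labels by their numerical values; keep numeric cells."""
    result: List[List[float]] = []
    for row in table:
        new_row: List[float] = []
        for name, value in zip(names, row):
            if name in mapping:
                new_row.append(mapping[name][str(value)])
            else:
                new_row.append(float(value))
        result.append(new_row)
    return result


numeric_D = substitute_nominal(D, columns, nominal_values)
print(numeric_D)
```

After this step every cell is numeric, so distances between products are well defined.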

2.3 Evaluating user preferences by CCR

The user preferences for different attributes and nominal scales are measured using the CCR method. A sample of the CCR interface is displayed in Fig. 3.

Table 1 is a typical measurement scale schema $\left( {\aleph ,\overline{X}} \right)$ applied to CCR (Yuen 2009, 2014a). The space of the linguistic labels $\aleph$ of the paired interval scales is {Equally, Slightly, …, Outstandingly, Absolutely}. The numerical representation of the paired interval scales $\overline{X}$ is as follows:

$$\overline{X} = \left\{ {\overline{x}_{q} = \frac{q\kappa }{\tau }|\forall q \in \left\{ { - \tau , \ldots , - 1,0,1, \ldots ,\tau } \right\},\quad \kappa > 0} \right\}.$$

(1)

Table 1 Measurement scale schema for CCR
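As a small check of Eq. (1), the numerical scale $\overline{X}$ can be generated for chosen values of $\tau$ and $\kappa$. The function name `paired_interval_scale` and the values $\tau = 4$, $\kappa = 8$ used below are illustrative assumptions, not values taken from Table 1.

```python
from typing import List


def paired_interval_scale(tau: int, kappa: float) -> List[float]:
    """Return [q * kappa / tau for q = -tau, ..., tau], as in Eq. (1)."""
    if tau <= 0:
        raise ValueError("tau must be a positive integer")
    if kappa <= 0:
        raise ValueError("kappa must be positive")
    return [q * kappa / tau for q in range(-tau, tau + 1)]


# Hypothetical parameters chosen only to show the shape of the scale.
scale = paired_interval_scale(tau=4, kappa=8.0)
print(scale)
```

The scale has $2\tau + 1$ equally spaced points, is symmetric about zero, and spans $[-\kappa, \kappa]$.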
