1 Introduction

Classification is one of the most popular tasks in machine learning and has been used frequently in broad application areas, such as decision making (Pedrycz and Chen 2015a; Liu and Gegov 2015), sentiment analysis (Pedrycz and Chen 2016; Liu et al. 2016b) and pattern recognition (Teng et al. 2007). In general, classification aims at assigning a class/label to an unseen instance, i.e., judging to which category the instance belongs.

In traditional machine learning, classification is typically considered a single-task problem, due to the following two aspects:

Firstly, classification is generally undertaken by assuming that different classes are mutually exclusive and thus an instance can only belong to one class. However, this assumption does not really hold for many real-life problems. For example, in the context of text classification, the same movie may belong to different categories. Similarly, the same book may belong to different subjects. There are also many similar examples in other areas, e.g., a patient may be found to have more than one health issue in medical diagnosis.

Secondly, feature evaluation and selection have been considered a very important step towards advancing the performance of learning classifiers (Liu et al. 2017a; Dash and Liu 1997; Langley 1994). However, the evaluation of features has typically been done by measuring their relevance to all classes. In fact, it could happen that a feature is relevant to only one class and is irrelevant to all the other classes (Cendrowska 1987). For example, in the context of image understanding, some target regions need to be identified, and each of the target regions involves recognizing instances of a specific class and extracting a set of features (that may be relevant only to this class). In this case, if features extracted from different target regions of an image are put together to make up a feature set, then the resulting data set could involve a sparse matrix. Such sparsity could lead to a large number of features being judged as irrelevant and thus filtered out. However, these filtered features may be highly relevant to a specific class, so removing them may lead to poor classification performance on that particular class.

Based on the two aspects described above, we argue in this paper for the need to turn single-task learning into multi-task learning, i.e., a learning task per class. In particular, we propose the use of fuzzy approaches to allow an instance to belong to more than one class, by judging the membership degree of the instance to different classes. We also show how different classes may be related to each other from a granular computing perspective, through looking at the fuzzy membership degrees of an instance to different classes, i.e., each class is viewed as a granule and the possible relationships between granules are identified.

On the other hand, in terms of feature evaluation and selection, we propose to turn it into a multi-task approach, from a granular computing perspective. In particular, we transform the class feature into a number of binary features, and each binary feature corresponds to a class. In this way, features are evaluated for each class in terms of their relevance, i.e., for each class, there is a feature subset selected towards learning to judge if an instance belongs to this class or not. We also show how rule learning approaches are capable of achieving class-specific feature evaluation.

The rest of this paper is organized as follows: Sect. 2 provides related work on classification, feature selection and granular computing. In Sect. 3, we present how single-task learning can be transformed effectively into multi-task learning, in terms of both classification and feature selection. In Sect. 4, we conduct two experimental studies, and discuss the results for showing the necessity to achieve multi-task classification and feature selection, towards advancing machine learning techniques for classification. In Sect. 5, we summarize the contributions of this paper and suggest further directions that could lead to advances in this research area in the future.

2 Related work

In this section, we describe the concepts of granular computing and justify how granular computing is related to classification and feature selection. Moreover, we provide an overview of classification in the context of machine learning and a review of existing approaches of feature selection.

2.1 Granular computing

Granular computing is a computational approach to information processing. It is aimed at structural thinking at the philosophical level and at structural problem-solving at the practical level (Yao 2005b). In general, granular computing involves two operations, namely granulation and organization (Yao 2005a). The former operation decomposes a whole into parts, whereas the latter integrates parts into a whole. In computer science, granulation and organization have frequently appeared as the top-down and bottom-up approaches, respectively (Liu and Cocea 2017a).

In practice, two main concepts of granular computing, which have been popularly used for granulation and organization, are granule and granularity. A granule generally represents a large particle, i.e., a collection of smaller particles that together form a larger unit. There are many real-life examples:

  • In the context of classification, each class can be viewed as a granule, since a class represents a collection of objects/instances.

  • In the context of feature selection, each feature set can be viewed as a granule, since a feature set represents a collection of features.

In general, granules can be at the same level or different levels with specific interrelationships, which leads to the need of the concept of granularity (Pedrycz and Chen 2015b). In particular, if granules are located at the same level of granularity, then the relationships between these granules are referred to as horizontal relationships (Liu and Cocea 2018). In contrast, for granules located at different levels of granularity, the relationships between these granules are referred to as hierarchical relationships (Liu and Cocea 2018). For example, in the context of classification, a class at a higher level of granularity may be specialized/decomposed into sub-classes at a lower level of granularity, in terms of specialization/decomposition (hierarchical relationships). Also, classes at a lower level of granularity may be generalized/aggregated into a super class at a higher level of granularity, in terms of generalization/aggregation (hierarchical relationships) (Liu and Cocea 2017a). On the other hand, classes may also have horizontal relationships between each other when these classes are at the same level of granularity, such as mutual exclusion, correlation and mutual independence (Liu et al. 2017b).

In practice, granular computing concepts and techniques have been used broadly in popular areas, such as artificial intelligence (Wilke and Portmann 2016; Pedrycz and Chen 2011; Skowron et al. 2016), computational intelligence (Dubois and Prade 2016; Yao 2005b; Kreinovich 2016; Livi and Sadeghian 2016), machine learning (Min and Xu 2016; Peters and Weber 2016; Liu and Cocea 2017c; Antonelli et al. 2016), decision making (Xu and Wang 2016; Liu and You 2017; Chatterjee and Kar 2017) and data clustering (Chen et al. 2009; Horng et al. 2005).

Furthermore, ensemble learning is also a subject that involves applications of granular computing concepts (Liu and Cocea 2017c). In particular, ensemble learning approaches, such as Bagging, involve information granulation through decomposing a training set into a number of overlapping samples, and also involve organization through combining the predictions provided from different base classifiers towards classifying an unseen instance; a similar perspective has also been stressed and discussed in Hu and Shi (2009).
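As a minimal illustration of this granular view of ensemble learning, the sketch below (a hypothetical Python example of Bagging, not taken from the cited works) performs granulation by drawing overlapping bootstrap samples from the training set and organization by majority voting over the base classifiers.

```python
# A hypothetical sketch of Bagging viewed through granular computing:
# granulation = drawing overlapping bootstrap samples from the training set,
# organization = combining the base classifiers' votes into one prediction.
import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, n_estimators=10, random_state=0):
    rng = np.random.default_rng(random_state)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = rng.integers(0, n, size=n)                  # bootstrap sample (granulation)
        models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return models

def bagging_predict(models, x):
    votes = [m.predict(x.reshape(1, -1))[0] for m in models]
    return Counter(votes).most_common(1)[0][0]            # majority vote (organization)
```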

In Sect. 3, we will show how granular computing concepts can be used for advancing classification and feature selection in the context of multi-task learning.

2.2 Overview of classification

As mentioned in Sect. 1, classification is one of the most popular tasks of machine learning. In terms of the number of predefined classes for a learning task, classification can be specialized into two categories: binary classification and multi-class classification. On the other hand, classification can be for different purposes, which leads to different types of class attributes, such as nominal, ordinal and string (Tan et al. 2005). In this context, the purposes of classification include recognition, rating and decision making.

Both binary classification and multi-class classification tasks could essentially be for any of the above purposes. In particular, binary classification could be for the purpose of recognition, such as gender classification (Wu et al. 2011). There are also some examples of binary classification for the purpose of rating, such as sentiment analysis (positive or negative) and assessment of teaching and learning (good or bad). In addition, binary classification can be involved in a decision-making task, such as voting (support or objection) and shopping (buy or not). Regarding multi-class classification, examples of recognition include emotion identification (Teng et al. 2007; Altrabsheh et al. 2015). There are also many examples of rating, such as movie rating and multi-sentiment analysis (Jefferson et al. 2017). In addition, multi-class classification can be used as a way of decision making towards selecting one of the given options.

As argued in Sect. 1, in traditional machine learning, different classes are assumed to be mutually exclusive, but this assumption does not always hold in reality. To address this issue, some related work was done in Boutell et al. (2004); Tsoumakas and Katakis (2007); Tsoumakas et al. (2010); Zhang and Zhou (2014) for turning single-label classification into multi-label classification. In particular, multi-label classification typically includes three types: PT3, PT4 and PT5 (Tsoumakas and Katakis 2007).

Table 1 Example of PT3 (Liu et al. 2017b; Liu and Cocea 2018)

PT3 is designed to allow a class to consist of two or more labels, as illustrated in Table 1. For example, two labels A and B can make up three classes: A, B and \(A\wedge B\). PT4 is designed to do the labelling on the same dataset separately for each of the predefined labels, as illustrated in Tables 2 and 3. In addition, PT5 is aimed at uncertainty handling. In other words, when it is not certain to which class label an instance should belong, the instance is assigned all the possible labels and is treated as several different instances that have the same inputs but different class labels. An illustrative example is given in Table 4: both instances (3 and 4) appear twice with two different labels (A and B), respectively, and would be treated as four different instances (two assigned A and the other two assigned B) in the process of learning.

Table 2 Example of PT4 on Label A (Liu et al. 2017b; Liu and Cocea 2018)
Table 3 Example of PT4 on label B (Liu et al. 2017b; Liu and Cocea 2018)
Table 4 Example of PT5 (Liu et al. 2017b; Liu and Cocea 2018)
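To make the three transformations concrete, the sketch below (our own illustration of the PT3, PT4 and PT5 schemes described above, using a tiny hypothetical multi-label dataset) converts the same data into the three single-label forms.

```python
# Hypothetical multi-label dataset: each instance has an input vector and a set of labels.
data = [
    ([1.2, 0.3], {"A"}),
    ([0.8, 1.1], {"B"}),
    ([1.0, 0.9], {"A", "B"}),   # belongs to both labels
    ([0.2, 0.4], {"A", "B"}),
]

# PT3: each distinct label combination becomes one new class (e.g., "A", "B", "A^B").
pt3 = [(x, "^".join(sorted(labels))) for x, labels in data]

# PT4: one binary dataset per label (label vs. not-label).
labels = {"A", "B"}
pt4 = {l: [(x, l if l in ls else "not-" + l) for x, ls in data] for l in labels}

# PT5: an instance with k labels is duplicated into k single-label instances.
pt5 = [(x, l) for x, ls in data for l in sorted(ls)]

print(pt3)
print(pt4["A"])
print(pt5)
```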

As mentioned in Sect. 2.1, there could be different types of relationships between classes. From this point of view, multi-label classification approaches still have limitations as argued in Liu et al. (2017b), and Liu and Cocea (2018) as follows:

PT3 may result in a massive number of classes, i.e., \(2^{n}-1\), where n is the number of class labels. Also, from a software engineering perspective, PT3 may result in high coupling, since class labels that are not correlated are merged into a new class. Coupling generally refers to the degree of interdependence between different parts (Lethbridge and Laganire 2005).

PT4 may result in the class imbalance issue. For example, if a balanced dataset contains instances of three classes A, B and C, then transforming it with regard to label A makes the frequency \(\left( \frac{1}{3}\right)\) of class A far lower than the frequency \(\left( \frac{2}{3}\right)\) of class \(\lnot\)A \(\left( {\text {i.e.}} \, B~\vee ~C\right)\). From a software engineering perspective, PT4 may also result in low cohesion, since class labels that are correlated get separated. Cohesion refers to the degree to which the parts of a whole link together (Lethbridge and Laganire 2005); low cohesion thus means failing to identify the correlations between different classes.

PT5 may result in a massive training sample, leading to high computational complexity, especially when the number of class labels is high in the big data era. Also, from a machine learning perspective, PT5 may result in confusion for a learning algorithm. In other words, when a training set contains instances that have the same input vector but are assigned different class labels, the initial uncertainty in the dataset is increased, leading to difficulty in discriminating between classes in the training process, since popular learning methods typically belong to discriminative learning.

Overall, the above multi-label classification approaches still aim at classifying a test instance uniquely, i.e., assigning the instance a single class, although this class may consist of more than one label. From a granular computing perspective, single-label classification is aimed at providing a string as the output, whereas multi-label classification is aimed at providing a list of strings (as a whole) as the output. However, there is no fundamental difference between the two ways of classification in terms of the strategy of learning classifiers, i.e., discriminative learning. On the basis of the above argumentation, we consider both single-label classification and multi-label classification to be single-task, and a framework of multi-task classification will be presented in Sect. 3.

2.3 Review of feature selection techniques

As introduced in Dash and Liu (1997), the feature selection process typically involves four main steps: generation, evaluation, stopping criterion and validation. In particular, the generation procedure is aimed at generating a candidate feature subset based on the original feature set. In the evaluation stage, a function is used to evaluate the goodness of the feature subset selected in the generation stage, in terms of importance of these selected features. A stopping criterion is then used to decide whether it is necessary to stop the feature selection process. If yes, the selected feature subset is validated in the last stage. Otherwise, the feature selection process needs to be repeated through the generation and evaluation of another candidate feature subset. The process of feature selection is illustrated in Fig. 1.

Fig. 1
figure 1

Feature selection process (Liu et al. 2017a)
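The four-step process in Fig. 1 can be read as the following loop (a schematic sketch; the generate, evaluate, stop and validate functions are placeholders we introduce, to be filled in by a concrete filter or wrapper method).

```python
# Schematic of the four-step feature selection process described above.
# generate(), evaluate(), stop() and validate() are hypothetical placeholders:
# any concrete filter or wrapper method supplies its own versions of them.
def select_features(all_features, generate, evaluate, stop, validate):
    best_subset, best_score = None, float("-inf")
    while True:
        candidate = generate(all_features, best_subset)   # generation step
        score = evaluate(candidate)                       # evaluation step
        if score > best_score:
            best_subset, best_score = candidate, score
        if stop(best_subset, best_score):                 # stopping criterion
            break
    validate(best_subset)                                 # validation step
    return best_subset
```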

Feature selection techniques are generally divided into two categories, namely filter and wrapper. The main difference between the two types of feature selection lies in the way features are evaluated. The filter approach employs heuristics to rank the features according to their importance, whereas the wrapper approach employs an algorithm to learn classifiers from different subsets of features and then checks the performance of these classifiers to evaluate the corresponding feature subsets. In terms of evaluation functions, popular heuristics employed by the filter approach include distance functions (Montalto et al. 2012), entropy (Shannon 1948), information gain (Kullback and Leibler 1951), correlation coefficients (Yu and Liu 2003), and co-variance (Barber 2012). The wrapper approach simply employs the error rate of a classifier as the evaluation function (Dash and Liu 1997).

In terms of the performance of feature selection, the filter approach evaluates features regardless of the fitness of the employed learning algorithm. In other words, a set of features is evaluated and the relevant ones are selected without considering whether the selected feature subset is suitable for the chosen algorithm to learn a classifier. According to the experimental results reported in Dash and Liu (1997), feature selection through the filter approach leads to a low level of time complexity. However, when the selected feature subset is used by a pre-employed algorithm to learn a classifier, the error rate of classification may be high because the feature subset is not suitable for the algorithm to undertake the learning task (Guyon 2003).

In contrast, the wrapper approach evaluates features by checking the accuracy of the classifiers learned from different subsets of features. In other words, a number (n) of different feature subsets are provided and an algorithm is used to learn n classifiers from these feature subsets. The feature subset that leads to the production of the best classifier is selected. According to the experimental results reported in Dash and Liu (1997), feature selection through the wrapper approach leads to very high classification accuracy, but the time complexity is also very high, because all possible combinations of features (leading to different feature subsets) need to be examined.
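To make the contrast concrete, the sketch below (our own simplified illustration using scikit-learn, not the exact procedures evaluated in Dash and Liu (1997)) ranks features by mutual information (closely related to information gain) for the filter approach, and scores candidate subsets by cross-validated accuracy for the wrapper approach.

```python
import numpy as np
from itertools import combinations
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def filter_select(X, y, k):
    # Filter: rank features by a heuristic (here mutual information,
    # closely related to information gain) and keep the top k.
    scores = mutual_info_classif(X, y)
    return np.argsort(scores)[::-1][:k]

def wrapper_select(X, y, k):
    # Wrapper: evaluate each candidate subset by the cross-validated
    # accuracy of a classifier learned from it (exhaustive, hence costly).
    best_subset, best_acc = None, -1.0
    for subset in combinations(range(X.shape[1]), k):
        acc = cross_val_score(DecisionTreeClassifier(), X[:, subset], y, cv=5).mean()
        if acc > best_acc:
            best_subset, best_acc = subset, acc
    return best_subset
```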

Moreover, as argued in Cendrowska (1987), a feature may be relevant to only one class and irrelevant to all the other classes. Also, as mentioned in Sect. 1, in some application areas such as image processing, features are typically extracted from a specific target region. In this context, features could be relevant only to the class corresponding to the target region from which they are extracted. From this point of view, if features extracted from different target regions are put together to make up a feature set, then a sparse matrix would be present in the resulting feature set. In this case, it is very likely that some features are highly important for a specific class but are removed from the feature set due to low occurrence (and to reduce sparsity).

On the basis of the above argumentation, it is necessary to incorporate class-specific feature selection into both the filter and wrapper approaches. In particular, we consider traditional feature selection approaches (filter and wrapper) to be single-task, and a framework of multi-task feature selection will be presented in Sect. 3.2.

3 Multi-task learning framework

In this section, we present a framework of multi-task learning. In particular, we describe how fuzzy approaches can be used to achieve multi-task classification and justify the significance of this way of classification. Also, we describe how Prism (a rule learning algorithm) can be used to achieve multi-task feature selection, i.e., feature selection for each class, and justify the significance of this way of feature selection.

The multi-task learning framework is illustrated in Fig. 2. In particular, a subset of features is selected for each specific class in the feature selection stage, i.e., all the features in a subset are highly relevant to a specific class (referred to as the target class). In the training stage, a classifier is learned from each selected feature subset towards identifying instances of the corresponding target class, i.e., each classifier corresponds to a target class. In the classification stage, each classifier is used to identify whether an instance belongs to the target class corresponding to this classifier. Finally, the outputs from these classifiers may need to be aggregated towards having a unique output, depending on the nature of the classification task, e.g., recognition, rating and decision making.

Fig. 2
figure 2

Multi-task learning framework
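A minimal skeleton of this framework could look as follows (our own schematic, assuming NumPy arrays and scikit-learn-style classifiers; the helper names select_features_for and make_classifier are hypothetical placeholders).

```python
# Schematic of the multi-task learning framework in Fig. 2: one feature subset,
# one binary classifier and one membership judgement per target class, followed
# by an optional aggregation step. All helper names are hypothetical placeholders.
def train_multitask(X, y, classes, select_features_for, make_classifier):
    models = {}
    for c in classes:
        cols = select_features_for(X, y, c)              # class-specific feature subset
        clf = make_classifier().fit(X[:, cols], y == c)  # one binary task per target class
        models[c] = (cols, clf)
    return models

def classify_multitask(models, x, aggregate=None):
    # Each classifier judges membership of x in its own target class.
    outputs = {c: clf.predict(x[cols].reshape(1, -1))[0]
               for c, (cols, clf) in models.items()}
    return aggregate(outputs) if aggregate is not None else outputs
```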

In general, multi-task feature selection is undertaken towards serving multi-task classification. However, multi-task classification can also be done independently, without the need for multi-task feature selection. In Sect. 3.1, we will present how to achieve multi-task classification without feature selection. In Sect. 3.2, we will present how to achieve multi-task feature selection towards advancing classification performance.

3.1 Fuzzy multi-task classification

Multi-task classification is generally aimed at judging the membership or non-membership of an instance independently for each class. In particular, we adopt fuzzy approaches to measure the membership degree of an instance to each class.

Fuzzy classification is based on fuzzy logic, which is an extension of deterministic logic, i.e., the truth values (in the context of fuzzy logic) range from 0 to 1 rather than being binary (0 or 1). Fuzzy logic is typically used in the forms of fuzzy sets and fuzzy rule based systems.

In the context of fuzzy sets, each element \(x_i\) belongs to the set S to a certain degree of membership. The value of the membership degree depends on the membership function \(f_s(x_i)\) defined for the fuzzy set S. Membership functions are of various shapes, such as trapezoid, triangle and rectangle. In general, trapezoidal membership functions can be seen as a generalization of triangular and rectangular membership functions. A trapezoidal membership function is essentially defined by estimating four parameters a, b, c and d, as illustrated below and in Fig. 3.

$$f_T(x) = \begin{cases} 0, & \text{when } x \le a \text{ or } x \ge d;\\ (x-a)/(b-a), & \text{when } a< x <b;\\ 1, & \text{when } b\le x \le c;\\ (d-x)/(d-c), & \text{when } c< x <d. \end{cases}$$

According to Fig. 3, the shape of the membership function would be a triangle if \(b=c\), or a rectangle if \(a=b\) and \(c=d\). A membership function can be defined using expert knowledge (Mamdani and Assilian 1999) or learned statistically from data (Bergadano and Cutello 1993). More details on fuzzy sets and logic can be found in (Zadeh 1965; Chen and Chang 2001; Chen and Chen 2011; Chen 1996).

Fig. 3
figure 3

Trapezoid membership function (Liu and Cocea 2017b)
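A direct implementation of the trapezoidal membership function defined above is sketched below, covering the triangular (\(b=c\)) and rectangular (\(a=b\), \(c=d\)) special cases; the parameter values in the usage example are hypothetical.

```python
def trapezoid_membership(x, a, b, c, d):
    """Trapezoidal membership function f_T(x) with parameters a <= b <= c <= d.
    With b == c it degenerates to a triangle; with a == b and c == d to a rectangle."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)   # rising edge
    return (d - x) / (d - c)       # falling edge, c < x < d

# Example: a fuzzy set 'Middle' for working hours, with hypothetical parameters.
print(trapezoid_membership(47, a=30, b=35, c=40, d=50))  # 0.3
```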

In the context of fuzzy rule based systems, the main operations include fuzzification of continuous attributes and learning of fuzzy rules.

In terms of fuzzifying continuous attributes, one first needs to determine the number of linguistic terms into which a continuous attribute is transformed. Furthermore, a membership function needs to be defined for each of the linguistic terms, i.e., a linguistic term is viewed as a fuzzy set with membership degrees ranging over [0, 1], so a membership function needs to be defined for mapping a value of the continuous attribute to a membership degree value of a linguistic term (transformed from the continuous attribute).

Following the fuzzification of continuous attributes, a number of rules can be learned from data and the best rules can be used for predictions on new data. Some methods of fuzzy rules learning can be found in Wang and Mendel (1992), Chen and Lee (2010), and Berthold (2003). In the context of fuzzy rule based classification, following the learning stage, the resulting rules are typically represented in the following form:

  • Rule 1: if \(x_1\) is \(A_{11}\) and \(x_2\) is \(A_{21}\) and ... and \(x_n\) is \(A_{n1}\) then class = \(C_1\);

  • Rule 2: if \(x_1\) is \(A_{12}\) and \(x_2\) is \(A_{22}\) and ... and \(x_n\) is \(A_{n2}\) then class = \(C_2\);

  • \(\vdots\)

  • Rule m: if \(x_1\) is \(A_{1m}\) and \(x_2\) is \(A_{2m}\) and ... and \(x_n\) is \(A_{nm}\) then class = \(C_k\);

\(A_{nm}\) represents a linguistic term, where n is the index of attribute A and m is the index of rule. Also, \(C_k\) represents a class label, where k is the class index.

In the context of multi-task classification, a fuzzy rule based system is used following the four steps: fuzzification, application, implication and aggregation. We illustrate the whole procedure by using the following example of fuzzy rules:

  • Rule 1: if \(x_1\) is Young and \(x_2\) is Long then class = Positive;

  • Rule 2: if \(x_1\) is Young and \(x_2\) is Middle then class = Neutral;

  • Rule 3: if \(x_1\) is Young and \(x_2\) is Short then class = Negative;

  • Rule 4: if \(x_1\) is Middle-aged and \(x_2\) is Long then class = Neutral;

  • Rule 5: if \(x_1\) is Middle-aged and \(x_2\) is Middle then class = Positive;

  • Rule 6: if \(x_1\) is Middle-aged and \(x_2\) is Short then class = Negative;

  • Rule 7: if \(x_1\) is Old and \(x_2\) is Long then class = Negative;

  • Rule 8: if \(x_1\) is Old and \(x_2\) is Middle then class = Positive;

  • Rule 9: if \(x_1\) is Old and \(x_2\) is Short then class = Neutral;

The fuzzy membership functions defined for the linguistic terms transformed from \(x_1\) and \(x_2\) are illustrated in Figs. 4 and 5, respectively.

Fig. 4
figure 4

Membership functions for linguistic terms of attribute ‘age’

Fig. 5
figure 5

Membership functions for linguistic terms of attribute ‘working hours’

According to Figs. 4 and 5, if \(x_1= 30\) and \(x_2= 47\), then the following steps will be executed:

Fuzzification:

  1. Rule 1:

    \(f_{Young}(30)= 0.67\), \(f_{Long}(47)=0.7\);

  2. Rule 2:

    \(f_{Young}(30)= 0.67\), \(f_{Middle}(47)=0.3\);

  3. Rule 3:

    \(f_{Young}(30)= 0.67\), \(f_{Short}(47)= 0\);

  4. Rule 4:

    \(f_{Middle-aged}(30)= 0.33\), \(f_{Long}(47)=0.7\);

  5. Rule 5:

    \(f_{Middle-aged}(30)= 0.33\), \(f_{Middle}(47)=0.3\);

  6. Rule 6:

    \(f_{Middle-aged}(30)= 0.33\), \(f_{Short}(47)= 0\);

  7. Rule 7:

    \(f_{Old}(30)= 0\), \(f_{Long}(47)=0.7\);

  8. Rule 8:

    \(f_{Old}(30)= 0\), \(f_{Middle}(47)=0.3\);

  9. Rule 9:

    \(f_{Old}(30)= 0\), \(f_{Short}(47)= 0\);

In the fuzzification step, the notation \(f_{Long}(47)= 0.7\) represents that the membership degree of the numerical value ‘47’ to the fuzzy set defined with the linguistic term ‘Long’ is 0.7. The fuzzification step is aimed at mapping the value of a continuous attribute to a value of membership degree to a fuzzy set (i.e., mapping to the value of a linguistic term transformed from the continuous attribute).

Application:

  1. Rule 1:

    \(f_{Young}(30) \wedge f_{Long}(47)= Min(0.67, 0.7) = 0.67;\)

  2. Rule 2:

    \(f_{Young}(30) \wedge f_{Middle}(47)= Min(0.67, 0.3) = 0.3;\)

  3. Rule 3:

    \(f_{Young}(30) \wedge f_{Short}(47)= Min(0.67, 0) =0;\)

  4. Rule 4:

    \(f_{Middle-aged}(30) \wedge f_{Long}(47)= Min(0.33, 0.7) = 0.33;\)

  5. Rule 5:

    \(f_{Middle-aged}(30) \wedge f_{Middle}(47)= Min(0.33, 0.3) = 0.3;\)

  6. Rule 6:

    \(f_{Middle-aged}(30) \wedge f_{Short}(47)= Min(0.33, 0) = 0;\)

  7. Rule 7:

    \(f_{Old}(30) \wedge f_{Long}(47)= Min(0, 0.7) = 0;\)

  8. Rule 8:

    \(f_{Old}(30) \wedge f_{Middle}(47)= Min(0, 0.3) = 0;\)

  9. Rule 9:

    \(f_{Old}(30) \wedge f_{Short}(47)= Min(0, 0) = 0\)

In the application step, the conjunction of the two fuzzy membership degrees for the two attributes ‘\(x_1\)’ and ‘\(x_2\)’ is aimed at deriving the firing strength of a fuzzy rule. For example, the antecedent of Rule 1 consists of ‘\(x_1\) is Young’ and ‘\(x_2\) is Long’; since \(f_{Young}(30)= 0.67\) and \(f_{Long}(47)=0.7\), the firing strength of Rule 1 is \(Min(0.67, 0.7) = 0.67\).

Implication:

  1. Rule 1:

    \(f_{Rule1 \rightarrow Positive}(30, 47)= 0.67;\)

  2. Rule 2:

    \(f_{Rule2 \rightarrow Neutral}(30, 47)= 0.3;\)

  3. Rule 3:

    \(f_{Rule3 \rightarrow Negative}(30, 47)= 0;\)

  4. Rule 4:

    \(f_{Rule4 \rightarrow Neutral}(30, 47)= 0.33;\)

  5. Rule 5:

    \(f_{Rule5 \rightarrow Positive}(30, 47)= 0.3;\)

  6. Rule 6:

    \(f_{Rule6 \rightarrow Negative}(30, 47)= 0;\)

  7. Rule 7:

    \(f_{Rule7 \rightarrow Negative}(30, 47)= 0;\)

  8. Rule 8:

    \(f_{Rule8 \rightarrow Positive}(30, 47)= 0;\)

  9. Rule 9:

    \(f_{Rule9 \rightarrow Neutral}(30, 47)= 0;\)

In the implication step, the firing strength of a fuzzy rule derived in the application step can be used further to identify the membership degree of the value of an input vector to the class label ‘Positive’, ‘Neutral’ or ‘Negative’, depending on the consequent of the fuzzy rule. For example, \(f_{Rule1 \rightarrow Positive}(30, 47)= 0.67\) indicates that the consequent of Rule 1 is assigned the class label ‘Positive’ and the input vector ‘(30, 47)’ has the membership degree of 0.67 to the class label ‘Positive’. In other words, the inference through Rule 1 leads to the input vector ‘(30, 47)’ having the membership degree value of 0.67 to the class label ‘Positive’.

Aggregation:

$$f_{Positive}(30, 47)= f_{Rule 1 \rightarrow Positive}(30, 47) \vee f_{Rule 5 \rightarrow Positive}(30, 47) \vee f_{Rule 8 \rightarrow Positive}(30, 47) = Max(0.67, 0.3, 0) = 0.67$$
$$f_{Neutral}(30, 47)= f_{Rule 2 \rightarrow Neutral}(30, 47) \vee f_{Rule 4 \rightarrow Neutral}(30, 47) \vee f_{Rule 9 \rightarrow Neutral}(30, 47) = Max(0.3, 0.33, 0) = 0.33$$
$$f_{Negative}(30, 47)= f_{Rule 3 \rightarrow Negative}(30, 47) \vee f_{Rule 6 \rightarrow Negative}(30, 47) \vee f_{Rule 7 \rightarrow Negative}(30, 47)= Max(0, 0, 0) = 0$$

In the aggregation step, the membership degree value of the input vector to the class label (‘Positive’, ‘Neutral’ or ‘Negative’), which is inferred through a rule, is compared with the other membership degree values inferred through the other rules, towards finding the maximum among all the membership degree values. For example, Rule 1, Rule 5 and Rule 8 are all assigned the class label ‘Positive’ as their consequent and the membership degree values of the input vector ‘(30, 47)’ inferred through the three rules are 0.67, 0.3 and 0, respectively, to the class label ‘Positive’. As the maximum of the fuzzy membership degree values is 0.67, the input vector is considered to have the membership degree value of 0.67 to the class label ‘Positive’.
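The application, implication and aggregation steps above can be condensed into a short sketch. It starts from the fuzzified membership degrees already computed for the input (30, 47), so the membership functions of Figs. 4 and 5 are not re-implemented here.

```python
# Fuzzified degrees for x1 = 30 ('age') and x2 = 47 ('working hours'),
# taken from the fuzzification step above.
age = {"Young": 0.67, "Middle-aged": 0.33, "Old": 0.0}
hours = {"Long": 0.7, "Middle": 0.3, "Short": 0.0}

# The nine example rules: (linguistic term of x1, linguistic term of x2, class label).
rules = [
    ("Young", "Long", "Positive"),        ("Young", "Middle", "Neutral"),
    ("Young", "Short", "Negative"),       ("Middle-aged", "Long", "Neutral"),
    ("Middle-aged", "Middle", "Positive"),("Middle-aged", "Short", "Negative"),
    ("Old", "Long", "Negative"),          ("Old", "Middle", "Positive"),
    ("Old", "Short", "Neutral"),
]

# Application + implication: the firing strength of each rule is the minimum of the
# membership degrees of its antecedent terms, and is passed to the rule's class label.
memberships = {"Positive": 0.0, "Neutral": 0.0, "Negative": 0.0}
for t1, t2, label in rules:
    firing_strength = min(age[t1], hours[t2])
    # Aggregation: keep the maximum over all rules with the same consequent.
    memberships[label] = max(memberships[label], firing_strength)

print(memberships)                          # {'Positive': 0.67, 'Neutral': 0.33, 'Negative': 0.0}
# Optional defuzzification for a crisp single-task outcome:
print(max(memberships, key=memberships.get))  # 'Positive'
```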

In traditional machine learning, the classification outcome needs to be crisp so defuzzification is typically involved by choosing the class label with the highest value of membership degree. For the above example, the final classification outcome is to assign the class label ‘Positive’ to the unseen instance ‘(30, 47, ?)’, since the value (0.67) of the membership degree to this class label is the highest. In contrast, as mentioned above, multi-task classification is aimed at measuring the membership degree value of an instance to each class, so it is not necessary to use defuzzification.

On the other hand, multi-task classification can be done for providing either a single crisp output (a unique class label) or multiple fuzzy outputs (membership degree values for these class labels), depending on the nature of the classification task. In particular, the former way of classification is typically taken only when different classes are assumed to be mutually exclusive (like the above illustrative example). In this context, the outcome of fuzzy classification would typically show that the sum of the membership degree values of an instance to the given class labels is 1. The above outcome is mainly due to the case that the linguistic terms transformed from the same continuous attribute are considered to be mutually exclusive. For example, both Figs. 4 and 5 show that for the same horizontal coordinate value (the value of a continuous attribute) the sum of the corresponding vertical coordinate values (membership degree values) is always 1, due to the constraint that the linguistic terms (e.g., ‘Long’, ‘Middle’ and ‘Short’) are defined to be mutually exclusive.

However, as argued in Sect. 1, there are many real-life examples indicating that different classes are not mutually exclusive. The argumentation can also be supported in the context of fuzzy classification, as illustrated in Fig. 6. In particular, the membership functions, which are learned from the Anneal dataset regarding the carbon attribute, can show that the sum of the membership degree values could be higher than 1. In this case, the outcome of fuzzy classification would show that an instance belongs to more than one class with a high value of membership degree (close or even equal to 1), as reported in Liu et al. (2017b).

Fig. 6
figure 6

Membership functions learned from anneal data on carbon attribute (Liu and Cocea 2018)

In addition, the outcome of generative multi-task classification can also show that an instance has a fuzzy membership degree value of 0 to all the given class labels. This phenomenon can be explained by the possibility that the set of given class labels is not complete and an extra class label, to which the instance actually belongs, needs to be added.

From a mathematical perspective, the above explanation can be supported by the concept that a function f is defined as a mapping from set A to set B and the range R of this function f is a subset of set B. From this point of view, if the function f is not a complete mapping, then not all elements of set B are in the range R of this function. A classifier is essentially a function that provides a discrete value as the output, so it is possible that an instance cannot be classified, due to the case of an incomplete mapping.

On the basis of the above argumentation, it is necessary to turn discriminative single-task classification into generative multi-task classification. We will show experimental results in Sect. 4 to support the argumentation.

3.2 Multi-task feature selection

Multi-task feature selection generally means to select a subset of features for each class, since a feature may not be relevant to all the classes, as argued in Sect. 1. In particular, we propose to use the Prism algorithm (Cendrowska 1987) towards class-specific feature selection.

Prism is a rule learning algorithm that follows the separate and conquer strategy (Furnkranz 1999). The algorithm is capable of self-evaluation of features in terms of their relevance to a specific class. The procedure of this algorithm is shown in Algorithm 1.

figure a

It can be seen from Algorithm 1 that the Prism algorithm needs to select a class as the target class towards learning a set of rules that discriminate the target class from all the other classes. In particular, each of the n classes is selected in turn as the target class, so there are n sets of rules learned from the same training set, i.e., the learning of each set of rules of a specific class is separate from the learning of other sets of rules of other classes.
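A compact sketch of this separate-and-conquer procedure for categorical attributes is given below (our own Python rendering based on the description above; it is illustrative rather than an exact transcription of Algorithm 1). Applied to the contact lenses data in Table 5, it should reproduce rules such as the one derived in the worked example later in this section.

```python
# Sketch of the Prism rule learner for categorical data (one rule set per target class).
# A dataset is a list of dicts mapping attribute names (plus 'class') to values.
def learn_rules_for_class(dataset, target, class_attr="class"):
    rules, remaining = [], list(dataset)
    while any(row[class_attr] == target for row in remaining):
        rule, covered = [], list(remaining)
        # Specialize the rule until it only covers instances of the target class.
        while any(row[class_attr] != target for row in covered):
            used = {attr for attr, _ in rule}
            candidates = {(a, row[a]) for row in covered for a in row
                          if a != class_attr and a not in used}
            if not candidates:
                break  # no attribute left to specialize on
            # Choose the attribute-value pair with the maximum conditional
            # probability P(class = target | attribute = value) on the covered subset.
            def prob(pair):
                a, v = pair
                matching = [r for r in covered if r[a] == v]
                return sum(r[class_attr] == target for r in matching) / len(matching)
            best = max(candidates, key=prob)
            rule.append(best)
            covered = [r for r in covered if r[best[0]] == best[1]]
        rules.append(rule)
        # Separate: remove the instances covered by the new rule and conquer the rest.
        remaining = [r for r in remaining
                     if not all(r[a] == v for a, v in rule)]
    return rules

def learn_prism(dataset, class_attr="class"):
    classes = {row[class_attr] for row in dataset}
    return {c: learn_rules_for_class(dataset, c, class_attr) for c in classes}
```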

We use the contact lenses data set (Cendrowska 1987) as an example for illustrating the Prism algorithm. The details of the dataset are shown in Table 5.

Table 5 Contact lenses data

In this data set, there are three classes, namely, ‘no lenses’, ‘soft lenses’ and ‘hard lenses’, so there would be three sets of rules learned for the three classes, respectively.

According to Table 5, we can get a frequency table for each attribute, i.e., we have four frequency tables for the four attributes: ‘age’ (see Table 6), ‘spectacle-prescrip’ (see Table 7), ‘astigmatism’ (see Table 8) and ‘tear-prod-rate’ (see Table 9).

Table 6 Frequency table for age
Table 7 Frequency table for spectacle-prescrip
Table 8 Frequency table for astigmatism
Table 9 Frequency table for tear-prod-rate

Based on the frequency tables, the conditional probabilities for each attribute–value pair of each attribute can be calculated. We display these here for ease of explanation—in the normal course of the algorithm, the probabilities would be calculated when needed, not in advance.

According to Table 6, we can derive the conditional probability for each of the three values of attribute ‘age’, towards each of the three classes.


P(class = hard lenses | age = young) \(= \frac{2}{8}\)

P(class = hard lenses | age = pre-presbyopic) \(= \frac{1}{8}\)

P(class = hard lenses | age = presbyopic) \(= \frac{1}{8}\)

P(class = soft lenses | age = young) \(= \frac{2}{8}\)

P(class = soft lenses | age = pre-presbyopic) \(= \frac{2}{8}\)

P(class = soft lenses | age = presbyopic) \(= \frac{1}{8}\)

P(class = no lenses | age = young) \(= \frac{4}{8}\)

P(class = no lenses | age = pre-presbyopic) \(= \frac{5}{8}\)

P(class = no lenses | age = presbyopic) \(= \frac{6}{8}\)


According to Table 7, we can derive the conditional probability for each of the two values of attribute ‘spectacle-prescrip’, towards each of the three classes.


P(class = hard lenses | spectacle-prescrip = myope) \(= \frac{3}{12}\)

P(class = hard lenses | spectacle-prescrip = hypermetrope) \(= \frac{1}{12}\)

P(class = soft lenses | spectacle-prescrip = myope) \(= \frac{2}{12}\)

P(class = soft lenses | spectacle-prescrip = hypermetrope) \(= \frac{3}{12}\)

P(class = no lenses | spectacle-prescrip = myope) \(= \frac{7}{12}\)

P(class = no lenses | spectacle-prescrip = hypermetrope) \(= \frac{8}{12}\)


According to Table 8, we can derive the conditional probability for each of the two values of attribute ‘astigmatism’, towards each of the three classes.


P(class = hard lenses | astigmatism = no) \(= \frac{0}{12}\)

P(class = hard lenses | astigmatism = yes) \(= \frac{4}{12}\)

P(class = soft lenses | astigmatism = no) \(= \frac{5}{12}\)

P(class = soft lenses | astigmatism = yes) \(= \frac{0}{12}\)

P(class = no lenses | astigmatism = no) \(= \frac{7}{12}\)

P(class = no lenses | astigmatism = yes) \(= \frac{8}{12}\)


According to Table 9, we can derive the conditional probability for each of the two values of attribute ‘tear-prod-rate’, towards each of the three classes.


P(class = hard lenses | tear-prod-rate = reduced) \(= \frac{0}{12}\)

P(class = hard lenses | tear-prod-rate = normal) \(= \frac{4}{12}\)

P(class = soft lenses | tear-prod-rate = reduced) \(= \frac{0}{12}\)

P(class = soft lenses | tear-prod-rate = normal) \(= \frac{5}{12}\)

P(class = no lenses | tear-prod-rate = reduced) \(= \frac{12}{12}\)

P(class = no lenses | tear-prod-rate = normal) \(= \frac{3}{12}\)


When the target class is ‘no lenses’, the first attribute, i.e., ‘tear-prod-rate’, is selected (line 6 in Algorithm 1) and the attribute–value pair (tear-prod-rate = reduced or tear-prod-rate = normal) with the maximum conditional probability is chosen (line 13 in Algorithm 1). Of the two attribute-value pairs, tear-prod-rate = reduced has the maximum conditional probability, i.e., P(class = no lenses|tear-prod-rate = reduced \()= 1\).

Since the maximum probability is reached, i.e. 1, the learning of the first rule is complete and the first rule learned is expressed as: if tear-prod-rate = reduced then class = no lenses. Following the completion of learning the first rule, all the 12 instances with the attribute-value pair tear-prod-rate = reduced are deleted from the training set, and the learning of the second rule is started on the reduced training set.
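The selection of this first rule term can be reproduced from the counts behind Tables 6–9, as in the small sketch below (the count dictionary simply transcribes the conditional probabilities listed above for the target class ‘no lenses’).

```python
# Counts of (instances with class = 'no lenses', all instances) for each
# attribute-value pair, transcribed from Tables 6-9 / the probabilities above.
counts = {
    ("age", "young"): (4, 8), ("age", "pre-presbyopic"): (5, 8), ("age", "presbyopic"): (6, 8),
    ("spectacle-prescrip", "myope"): (7, 12), ("spectacle-prescrip", "hypermetrope"): (8, 12),
    ("astigmatism", "no"): (7, 12), ("astigmatism", "yes"): (8, 12),
    ("tear-prod-rate", "reduced"): (12, 12), ("tear-prod-rate", "normal"): (3, 12),
}

# Pick the attribute-value pair with the maximum P(class = no lenses | attribute = value).
best = max(counts, key=lambda pair: counts[pair][0] / counts[pair][1])
print(best, counts[best][0] / counts[best][1])   # ('tear-prod-rate', 'reduced') 1.0
```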

The above illustration indicates that the nature of the Prism algorithm is to evaluate each attribute–value pair in terms of its importance to a specific class. For example, the probability P(class = no lenses | tear-prod-rate = reduced) \(= 1\) indicates that the attribute–value pair tear-prod-rate = reduced is highly important for the ‘no lenses’ class. In other words, the attribute–value pair tear-prod-rate = reduced is selected as the only term of the rule: if tear-prod-rate = reduced then class = no lenses, which indicates that the attribute ‘tear-prod-rate’ is relevant to the class ‘no lenses’. In contrast, if an attribute has never appeared (alongside one of its values) as a rule term in any rule of a specific class, then it would indicate empirically that the attribute is not relevant to that class.

On the basis of the above illustration and argumentation, the Prism algorithm is judged to be capable of self-evaluation of features (attributes) in terms of their relevance to a specific class. We will show experimental results in Sect. 4 to this effect.

4 Experiments, results and discussion

In this section, we report two experimental studies for multi-task classification and feature selection, respectively. In particular, we use the fuzzy rule learning approach for multi-task classification to show that an instance may belong to more than one class or may not belong to any one of the classes. We also compare the fuzzy approach with three popular discriminative learning approaches (C4.5, Naive Bayes and K Nearest Neighbour), in terms of the classification performance on instances that may belong to more than one class. In terms of multi-task feature selection, we use the Prism algorithm for evaluating features in terms of their relevance to each specific class.

Table 10 Characteristics of data sets

For the experimental study on multi-task classification, we use four real-world data sets retrieved from the Weka distribution (David 2005). The characteristics of the four data sets are shown in Table 10. In particular, the ERA dataset contains information on job applications and the output is the degree to which an applicant is acceptable or not. The ESL dataset also contains information on job applications, but the output is the degree to which an applicant is suitable for a specific type of job. The LEV dataset contains information on teaching assessment and the output is an overall evaluation of a lecturer’s performance. The SWD dataset contains information on real-world assessments of qualified social workers and the output is the degree of risk facing children if they stay with their families at home. For all four data sets, the output is ordinal, which indicates that the classification tasks are for the purpose of rating, as described in Sect. 2.2.

The results of fuzzy rule based classification on the four data sets are shown in Tables 11, 12, 13 and 14, in terms of the membership degree values of the instances to the class labels (selected as the representative examples). The first column, i.e., out1, represents the true class label, while the last column, i.e., prediction, represents the output from the fuzzy classification. The results show that it is possible to have an instance belong to more than one class, i.e., the instance has a very high value of membership degree (close or even equal to 1) to more than one class.

Table 11 Results sample on ERA data

As mentioned above, all four data sets contain information about subjective evaluation, i.e., the classification tasks involved in the four data sets belong to rating, since the outputs are all ordinal. In this context, the phenomenon that an instance may belong to more than one class could be explained by the fact that the same item can be given very different ratings by different users, since they may have different backgrounds and preferences.

Table 12 Results sample on ESL data

From a machine learning perspective, an instance is labelled subjectively by a randomly selected person, but people who have different backgrounds and preferences from this person would be very likely to provide the same instance with other labels. Moreover, a training set could contain instances that are highly similar to each other but are provided with different labels, because these instances are labelled by people who are highly dissimilar to each other in terms of their backgrounds and preferences.

Table 13 Results sample on LEV data
Table 14 Results sample on SWD data

As a matter of common sense, subjective evaluation generally means that people are biased towards some particular aspects. For example, a soldier may be very good at fighting and may have received a lot of military awards, but the soldier may also have made a lot of mistakes in daily life, leading to disciplinary actions being taken against the soldier. In this case, it is very difficult to say whether this is a good or bad soldier. For military commanders, the capability of fighting may be treated as more important, so they may be biased towards saying that this is a good soldier. In contrast, for political officers, the behavior of a soldier in daily life may be considered to have a higher impact, so they may be biased towards saying that this is a bad soldier.

In the context of machine learning, the above example could indicate that people may do data labelling without considering all the provided features, i.e., they may label an instance based only on those features that they consider of higher importance. In the context of generative multi-task classification through fuzzy approaches, the above example indicates that the soldier could have a very high membership degree to both the ‘good’ and ‘bad’ classes, due to subjective evaluations from different kinds of people.

The above argumentation can also be supported by the results shown in Tables 11, 12, 13 and 14. For example, the first two data sets are about evaluating the acceptance degree and the suitability of each job applicant. In this context, each applicant may have strengths and weaknesses reflected in the values of different features, so different recruiters may have different opinions on the acceptance degree and the suitability of an applicant for a particular job, unless the applicant is extremely strong in all these criteria or does not meet them at all. From this point of view, it is quite likely that a job applicant has a high membership degree to more than one class. The same argumentation also applies to the results on the other two data sets. For example, the third dataset is about evaluating the teaching performance of lecturers, and such judgment is usually subjective, due to different opinions from different people on what constitutes professional teaching.

In real applications, the case in which a fuzzy classifier gives an instance several equally high membership degree values would be interpreted as the possibility of different labels being assigned to the same instance by different kinds of people.

In addition, the results shown in Table 12 can indicate that it is possible that an instance has the membership degree value of 0 to all the classes. This phenomenon generally indicates that the instance does not belong to any one of the given classes and thus an extra class needs to be provided, as mentioned in Sect. 3.1. However, in the context of rating, the above phenomenon is more likely due to the case that the profile of a particular applicant is a very special example and does not have any similarity to all of the other applicants’. In other words, the applicant has a profile much stronger or weaker than all of the other applicants’. In the context of fuzzy rule based classification, this indicates that for each class none of the rules fires, i.e., the firing strength of each of these rules is 0. This also indicates that fuzzy classifiers generally have no bias towards one class and against all the other classes, unlike classifiers produced by discriminative learning approaches.

In terms of classification accuracy (obtained using cross-validation), the fuzzy rule learning approach is compared with C4.5, Naive Bayes (NB) and K Nearest Neighbour (KNN). The results are shown in Table 15. In particular, the results show that the fuzzy approach significantly outperforms the three discriminative learning approaches in all four cases. This is very likely because the fuzzy approach classifies each instance by measuring independently the membership degree value of the instance to each class, unlike the discriminative learning approaches, which aim at discriminating one class from the other classes towards assigning an instance a unique class. In other words, in the context of generative multi-task classification, the fuzzy approach leads to correct classification when the membership degree value of the instance to the class labelled as the ground truth is the highest one (usually close or even equal to 1), without the need to discriminate this class from the other classes.

Table 15 Classification accuracy

For the experimental study on multi-task feature selection, the results are shown in Figs. 7 and 8. In total, 247 rules were learned from the image segmentation dataset retrieved from the UCI repository (Lichman 2013). In addition, through ten-fold cross-validation of the Prism algorithm on this data set, the classification accuracy is 92%, which indicates that the rules learned by the algorithm are trustworthy.

Fig. 7
figure 7

Frequency of features selected in rules of a specific class

Fig. 8
figure 8

Selection rate of features for a specific class

Figure 7 represents the frequency of each feature used in rules of each class and shows that some features are frequently used in general, such as ‘region-centroid-row’, ‘intensity-mean’ and ‘hue-mean’. However, looking at the frequencies of these features for each specific class shows that a feature tends to be used frequently in rules of one or two classes but much less frequently or even never in rules of all the other classes. For example, the total frequency of the ‘region-centroid-row’ feature is 106, which means that the feature is used in 106 out of all 247 rules. However, the frequency of the feature is very different for different classes. In particular, the frequency is 45 for cement, 28 for window, 21 for foliage, 6 for path, 4 for brickface, 2 for grass, and 0 for sky.

Figure 8 represents how often a feature has been used in rules of a specific class. For example, there are 15 rules of the ‘brickface’ class and the ‘hue-mean’ feature has been used in 11 out of the 15 rules, which indicates that the selection rate of the ‘hue-mean’ feature for the ‘brickface’ class is 0.73. The results on selection rate indicate that features selected for a specific class could be of different levels of relevance. For example, for the ‘brickface’ class, 11 out of the 19 features are used in the 15 rules, namely, ‘region-centroid-col’ (used in 1 of the 15 rules), ‘region-centroid-row’ (used in 4), ‘vedge-mean’ (used in 1), ‘vegde-sd’ (used in 1), ‘hedge-sd’ (used in 1), ‘intensity-mean’ (used in 3), ‘rawred-mean’ (used in 2), ‘rawgreen-mean’ (used in 4), ‘exred-mean’ (used in 1), ‘exgreen-mean’ (used in 4) and ‘hue-mean’ (used in 11).
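The quantities plotted in Figs. 7 and 8 can be computed directly from the learned rule sets, as in the sketch below (assuming each rule is represented as a list of (feature, value) terms, as in the Prism sketch of Sect. 3.2).

```python
from collections import Counter

def feature_usage(rules_per_class):
    """Frequency (Fig. 7) and selection rate (Fig. 8) of each feature per class.
    rules_per_class maps a class label to its rules, each rule being a list of
    (feature, value) terms, as produced by the Prism sketch in Sect. 3.2."""
    frequency, selection_rate = {}, {}
    for cls, rules in rules_per_class.items():
        # Count a feature once per rule in which it appears.
        counts = Counter(f for rule in rules for f in {feat for feat, _ in rule})
        frequency[cls] = counts
        selection_rate[cls] = {f: n / len(rules) for f, n in counts.items()}
    return frequency, selection_rate
```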

The results shown in Figs. 7 and 8 indicate that the Prism algorithm is capable of self-evaluation of features in terms of their relevance to each specific class. As mentioned above, rules of different classes learned by Prism include different features in the rule antecedents, i.e., for each class, only some of the features are selected for appending terms into the antecedents of the learned rules. The rules of different classes, which are used together as a rule based classifier, perform well on the image segmentation data set, as mentioned above.

In comparison with the traditional filter approach, the capacity of Prism for self-evaluation of features avoids the case in which some features are highly required by an algorithm to learn but are removed from the feature set by a filtering-based feature selection method. In other words, a filtering-based method may independently judge some features as irrelevant without considering the fitness of these features to a particular learning algorithm.

In comparison with the traditional wrapper approach, use of the Prism algorithm for multi-task feature selection is likely to lead to the reduction of computational complexity, since the wrapper approach needs to evaluate all combinations of features and the dimensionality of image data is typically high. In other words, feature selection through the Prism algorithm or other rule learning algorithms is self-inclusive in the training stage, and all relevant features are already selected once the learning of a rule based classifier is complete. The selected features can also be used for other algorithms to learn classifiers.

5 Conclusions

In this paper, we proposed a framework of multi-task learning (a learning task per class) to deal with practical issues in classification and feature selection. In particular, we used fuzzy approaches to transform discriminative single-task classification into generative multi-task classification. Also, we used the Prism algorithm to transform single-task feature selection into multi-task feature selection, i.e., the Prism algorithm is used as a wrapper approach for evaluating features in terms of their relevance to each specific class and selecting a subset of features important for that class. The experimental results show that adopting the proposed framework of multi-task learning for both classification and feature selection is effective and leads to advances in prediction performance.

In terms of generative multi-task classification through fuzzy approaches, the experimental results show that an instance may have a high value of membership degree (close or even equal to 1) to more than one class. The results also show that an instance may have a membership degree value of 0 to all the classes, which indicates that the instance does not belong to any of the given classes and an extra class thus needs to be added to the set of given classes. In comparison with popular algorithms that belong to discriminative learning, such as C4.5, Naive Bayes and K Nearest Neighbour, fuzzy approaches, which typically belong to generative learning, achieve better classification accuracy when different classes may be correlated or mutually independent rather than mutually exclusive.

In terms of multi-task feature selection through using the Prism algorithm, seven sets of rules were learned from the image segmentation dataset and each set of rules is specific for one of the seven classes involved in the data set. Through checking each set of rules, it can be seen that some features are frequently used alongside their values as terms in different rules but some other features are never used in any rules. Also, it can be seen that the same feature is frequently used in the rules of one class, but is less frequently or even never used in the rules of the other classes.

On the basis of the above description, it is necessary to turn single-task learning into multi-task learning in terms of both classification and feature selection. In the future, we will investigate the use of fuzzy approaches for identifying the relationships between different classes, based on the membership degree values of the same instances to these classes. Multi-granularity learning is also worthy of future research, especially when there are hierarchical relationships between classes, such as super-classes and sub-classes. For example, in the context of image processing, features could be extracted from different target regions and the classes (corresponding to the regions) could be located at different levels of granularity. In this case, both feature selection and classification need to be done in the setting of multi-granularity learning. In terms of feature selection, we will also compare the Prism algorithm with traditional approaches (filter and wrapper), in terms of their impact on the performance of popular learning algorithms. It is also worth investigating the use of genetic algorithms (Chen and Chung 2006), parallelized genetic ant colony systems (Chen and Chien 2011), particle swarm optimization algorithms (Chen and Kao 2013) and parallel cat swarm optimization algorithms (Tsai et al. 2008, 2012), towards finding an optimal set of features.