1 Introduction

In 2003, UNESCO adopted the Convention for the Safeguarding of the Intangible Cultural Heritage, which defined the intangible cultural heritage as practices, representations, expressions, knowledge and skills that communities or individuals recognize as part of their cultural heritage (Footnote 1). The significance of intangible cultural heritage is, according to UNESCO, not the cultural manifestation itself but rather the wealth of knowledge and skills that is transmitted through it from one generation to the next (Footnote 2). Confucianism is commonly defined as a system of philosophical, ethical and political thought based on the teachings of Confucius ([10] cited in [24]), which has been communicated as part of the everyday living of several Asian cultures from as far back as the sixth century BCE. It is an intangible cultural heritage that has been integrated into common practices, relationships and morals.

How can individuals re-familiarize themselves with an intangible cultural heritage within smart environs? "Creating intelligent cultural spaces" [2] is one of the innovative ways of recreating interactive spaces in which to become immersed in intangible cultural heritage. This project created a digital interactive system in which individuals can interact with Confucius through his teachings and experience a cultural heritage within smart environs. In the Confucian code of ethics, relationships between elders and youngsters are highly valued and revered [27]. Taking this aspect into account, and considering the growing distance between old and young in this digital era, this project encourages the sharing of experiences of Confucian heritage between elders and youngsters.

Thus, there are two key aspects around which this research project has been developed: intergenerational sharing and communication, and the use of modern digital interactive technology to create intangible cultural heritage experiences.

Intergenerational communication is viewed as an activity that enables interaction or exchange of verbal and nonverbal symbols between any two generations, involving the sharing of skills, knowledge and experience between grandparents or parents and children. Play has many proven benefits for intergenerational communication, as is evident in prior studies. However, there is limited research which examines play between grandparents, parents and children, let alone interactive play systems that facilitate intergenerational communication. Prior literature and our own initial survey with grandparents, parents and children revealed that there are currently few digital play and entertainment systems designed for their simultaneous use, despite their eagerness to participate in those activities together. In this research, cultural play is defined as engaging in a play activity which allows the user to experience the core aspects of his or her culture. A design-oriented research approach was employed to develop the research prototypes, involving the intergenerational users throughout the design process. The research prototypes were carefully engineered to meet the requirements of the users.

There have been encouraging results on the use of modern communication technology in bridging the intergenerational gap, such as digital storytelling, which empowers both parties through communication across generations [16]. This motivates us to design a new form of cultural play system, where users can explore cultural values and teachings through digital media. We have created Confucius Chat, a philosophical conversational agent which models the knowledge and teachings of Confucius and allows parents and children to share information. Contemporary users could benefit significantly from this interactive and personalized advice from virtual Confucius, which is not possible in passive media such as printed text. This philosophical conversational technology could also be used to model philosophers in different cultures.

Recent research on intergenerational communication across cultures has indicated that people in Asian nations construed older family and non-family members as less accommodating than did people in Western nations [4]. It is possible that the lack of understanding of the grandparents’ culture, for example the ethic of filial piety, makes salient age-group identities and, thereby, triggers intergroup processes [1]. In support of this argument, studies have also shown that strong traditional Confucian norms, such as filial piety and elders’ contribution to family harmony, have resulted in youths having more positive images of old age, closer psychological proximity and thus more respectful communication with older adults in the East [3, 9, 11]. Thus, it is important that the young generation could interact with new media that promotes ancient philosophies and culture, so that they will have better understanding and communication with the older people.

2 Literature review

The contemporary child is high in digital literacy and is more inclined to explore knowledge through digital media. It is important for us to design a new form of cultural play system, where users can explore cultural values and teachings through digital media. We share Tosa et al.’s [21] view on cultural computing:

Human communication is originally something cultivated in an environment comprising localities, national customs and language. Therefore, the fruits of these cultures have strong roots in their unique histories. [...] Now, as the computer society covers the earth, the task that computers must take on is the clear and accurate intercommunication between local and global cultures. Toward that end, it is first necessary for those involved with computer technology to bring to life local characteristics

There is an emerging trend in entertainment research known as cultural computing, which allows the user to experience an interaction that is closely related to the core aspects of his or her cultural heritage [17]. Similarly, Tosa et al. [21] think of cultural computing as a method for cultural translation that uses scientific methods to represent the essential aspects of a cultural heritage.

For example in ZENetic Computer [20], the user inputs the elements that he/she wants in his/her sansui painting. Based on the user input, the system then tries to infer the user’s internal consciousness and generates a story that the user can "enter" via the computer display. The user can respond to objects presented by the interactive system by manipulating input media, such as a virtual calligraphy brush or rake of a Zen rock garden, on-screen images or simply by clapping hands. By exerting effort to link the fragmentary stories, the user interactions help to decrease the gap between daily self and hidden self. The system aims to allow users to experience a virtual unification of their daily self and their unconscious self into a recreated conscious self through this dialogue with the system.

In another research project, ALICE [17], Salem et al. took inspiration from the Alice in Wonderland project [14] and created an augmented reality (AR) narrative with intelligent agents acting as characters who lead the user through virtual and real locations, moral choices and emotional states. ALICE is designed to provoke self-reflection on unconscious cultural issues such as logic and reasoning, self and ego, selfishness and selflessness. This is achieved by giving users the opportunity to occupy and experience any of these mental and emotional positions as they move along the plot of Alice in Wonderland.

ZENetic Computer and ALICE used interactive storytelling and compelling visuals to bring users through specific intangible cultural content. In contrast, we would like to take a more open-ended approach, giving parents and children the flexibility to ask a wide range of questions, to which the system replies with the most relevant answer from the knowledge database, hence presenting the cultural content directly to the user. Our cultural play system employs natural language processing methods to analyze the user’s input sentence, for example the keywords, the sense of the keywords and their corresponding topics. At the same time, the system models Confucian knowledge and teachings by engaging Confucius scholars to provide a data set for classifying each of the Confucius database entries. The system then retrieves the most relevant entry from the database based on the proximity of the input sentence and the scholars’ classification of the entry. In addition, the k-nearest neighbor algorithm, a text classification method, is used to improve the accuracy of the system retrieval.

Confucius Chat offers an alternative approach to existing virtual chat agents, for example ELIZA, Hex and Jabberwacky. Existing chat agents use simple pattern-matching approaches and employ tricks to cover up their failure to understand the user input, for example frequently switching topics or rephrasing the input by replacing first-person pronouns with second-person pronouns and vice versa. These approaches fail to help users gain further understanding of the topic of discussion and thus offer minimal benefit to the interaction.

3 Designing cultural play

3.1 Problem exploration

While exploring the problem of developing a computer system that supports intergenerational communication through cultural components, we also gathered suggestions for new forms of intergenerational play system from elderly and young people. The research was conducted in Southeast Asia; hence, the suggestions are mostly influenced by Southeast Asian intangible cultural values. Many elderly users suggested using play-like systems to let the children learn about traditional values. When asked what they meant by traditional values, some mentioned Confucian values, Eastern values and family traditions. They highlighted that, since children are spending a substantial amount of time with electronic gadgets like computers and mobile devices, it would be beneficial to have applications that allow them to explore cultural values on their familiar devices and provide them with enjoyment. This would serve as an activity that they could do together with the children. Many children also showed clear interest in exploring intangible cultural heritage using games or interactive systems. Some of them mentioned that their parents had bought educational software for them to learn traditional values. The contents are normally presented as text with illustrations, comics and videos.

We carried out a survey with parents and younger siblings of university students. A total of 20 parents between the ages of 46 and 53 and 15 children between the ages of 11 and 16 took part in the survey. The participants were asked whether they would like to learn or explore intangible cultural values, for example Eastern values, using interactive media such as games and social network chat. Many parents thought that the idea was interesting and might appeal to children. If such a system were available, they would like to use it together with the children. Most children reported that it would be fun to learn about cultural values through games or chat with historical figures. The participants were asked how family members currently communicate about traditional cultural values. Most of them replied that there is currently no method in place, and a few mentioned that they talked about those issues when they arose naturally. In another question on whether they are currently reading any traditional cultural content in books or on the Internet, a few parents reported reading books or searching for such material online, and a few children mentioned that they read such books in school. A further question concerned the enjoyment of reading such cultural content. The parents reported that the activity is meaningful and enjoyable. Most children reported that the activity is not very interesting.

3.2 Design goals

Based on the observations above, we established the design goals for the cultural play as listed below:

(1) Intergenerational cultural communication Studies have shown that strong traditional Confucian norms, such as filial piety and elders’ contribution to family harmony, have resulted in youths having more positive images of old age, closer psychological proximity and thus more respectful communication with older adults in Asia. Therefore, we wanted to create a play system that would facilitate discussion of traditional Eastern cultural values between grandparents and parents with their children. In this way, children may have a better understanding of the cultural values embraced by their parents and grandparents.

(2) Dynamic interaction to explore intangible cultural heritage Currently, Eastern cultural values and teachings are communicated orally in a family and are available in the traditional media. Traditional media, for example books, animation or videos, provide only a linear presentation of the subject matter, and the user is normally a passive receiver of the information. However, children today possess a high level of digital literacy and are more inclined to explore new knowledge using digital media. This motivates us to design a new form of cultural play system, where the children can explore cultural values by actively contributing to the discussion using modern digital media, together with their parents.

3.3 Design requirements

3.3.1 User needs

We considered various cultural contents which are relevant to the users, for example Confucius, Mencius, Lao Zi and the traditional Chinese concept of Yin and Yang, many of which were recommended by the parents. We decided to start with Confucius, as his philosophies and values have a deep influence on Asian Chinese culture. Confucianism has gained popularity in books, animation and movies. For instance, the book written by Yu Dan about Confucius [26] has witnessed phenomenal sales, indicating a high demand for Confucian knowledge in modern Chinese societies.

Another important factor for choosing Confucius for the content of our cultural play is that his philosophies have significant influence on family values of most Southeast Asian cultures. The significance of family can be seen from the following statement outlining the process of Confucian moral cultivation in the Da Xue (Great Learning) chapter of the Book of Rites [22], “Extension of knowledge lay in the investigation of things. Things being investigated, knowledge became complete. Their knowledge being complete, their thoughts were sincere. Their thoughts being sincere, their hearts were then rectified. Their hearts being rectified, their persons were cultivated. Their persons being cultivated, their families were regulated. Their families being regulated, their states were rightly governed. Their states being rightly governed, the whole kingdom was made tranquil and happy” [15].

Evidently, family is the first testing ground beyond the individual self for a cultivated person to manifest himself before he can make an impact on society. Furthermore, out of the traditional five cardinal interpersonal relations (father–son, husband–wife, younger and elder brother, ruler–subordinate and friends), three are family-based. Discussions on filial piety are disproportionately abundant in Confucian literature. This factor is particularly important, as our aim is to facilitate intergenerational communication by allowing the children to better understand their family values.

3.3.2 Context of use

The system is envisioned to be used in a home setting by the children and their parents. The system should be a simple application or be accessible from a Web site, and should be available whenever the parents and children want to explore traditional cultural values. The system should allow the users to input questions or statements, and its output should prompt their discussion and reflection. To facilitate learning and discussion, a record of their interaction should be available for the users to review at a later time.

3.4 Design idea generation

We started our brainstorming session by examining the current media that support exploration of traditional culture. Traditional Eastern cultural values are typically preserved in printed media, for example books. These books were often written in an esoteric way, so they not only have limited appeal to young users, but their sheer volume also simply scares them off (Fig. 1). Existing endeavors in making traditional texts more friendly to young readers have not gone beyond the medium of books. The methods they employ include transforming traditional characters to simplified Chinese, sometimes accompanied by modern language interpretation, and even inserting caricatures to assist understanding. Figure 2 shows a person reading the popular book about Confucius written in English. Others have ventured into digital media by producing movie clips, for example “Biography: Confucius DVD” and “The Complete Analects of Confucius” (Fig. 3). However, these methods still leave the user a passive receiver of the information.

Fig. 1 Example of books about Confucius

Fig. 2 Example of a person reading a book about Confucius

Fig. 3 Screenshot of the Confucius biography DVD cover and the complete Analects of Confucius cover

Modern-day children are highly exposed to digital culture. Through modern networked and social digital media such as Twitter and Facebook, they make friends, explore new forms of entertainment and expand their knowledge. “24% of teens go online ‘almost constantly,' facilitated by the widespread availability of smart phones” (Footnote 3), according to the Pew Research Center's 2015 overview. Ninety-two percent of teenagers use their smart phones to access social networks daily. Sixty-seven percent of teen social network users say they update their page at least once a week. However, usage of such media should not be seen as only for leisure. Teenagers look to their social networks for much more than gossip and photograph sharing. To them, social networks are a key source of information and advice.

Thus, we would like to design a system in which the users can interact dynamically with a virtual historical character in a social network chat environment, as a means to explore and understand traditional Eastern values. The user is no longer a passive receiver watching or reading the cultural content; instead, she will be an active inquirer engaging in stimulating dialogue with the historical giant, who shares his or her values and wisdom. In this way, the knowledge is also presented in the user’s context, which would be more meaningful and personalized.

3.5 Prototype iterations

In this section, we provide an overview of the prototype iterations from the first prototype which addressed more technical issues, through the more recently tested prototype, which supports more accurate system output and additional interaction features.

We now describe the features of the prototype, the user involvement in the design, what design issue each prototype was attempting to answer, the user testing after the realization of the prototype and the lessons learned which were carried to the subsequent iterations of the design cycle. The prototype iterations are shown in Tables 1, 2 and 3.

Table 1 Cultural play prototype iteration 1
Table 2 Cultural play prototype iteration 2
Table 3 Cultural play prototype iteration 3

Prototype 1 was a simple proof-of-concept system, consisting of an application that allows the user to input a question or statement. For prototypes 1 and 2, we collaborated with a Confucius scholar who had a master's degree in Confucian studies to provide us with the relevant Confucian knowledge content. The system uses Artificial Intelligence Markup Language (AIML) [23] to create a database of templates with answers to frequently asked questions. More detail about how AIML works is given in Sect. 4.1. These templates range from casual chat, for example,

“Hi. How are you?”

to important concepts, historical persons and texts. For instance, if the user asks

“Who is Yan Hui?”

Confucius’s reply will be taken directly from our AIML database, which replies

“Yan Hui is my favorite disciple.”

This AIML database consists of short introductory statements about the figures that appear in Confucius’s responses (mostly Confucius’s disciples), classical texts from which Confucius often quotes (like the Odes) and certain names of ancient countries and dynasties (like the state of Lu and the three dynasties of Xia, Shang and Zhou). The database also includes certain information about Confucius as an individual. Since users may be curious about Confucius as a person, they may ask about his personal particulars such as his age, his date of birth and his hometown. We gathered this information from the earliest reliable historical text, the Shiji by Sima Qian (ca. 110 B.C.), and formulated it into Confucius’s answers. In addition, we prepared a few series of dialog sequences, which are initiated by virtual Confucius asking the user a question. This makes the conversation between the user and virtual Confucius more interactive. The prototype was tested with project team members and laboratory members to gather feedback and identify potential usability issues. In this prototype, we noted the limitation of the pattern-matching algorithm of AIML: when a user asked a question in a different sentence style, the system sometimes failed to retrieve relevant output.

In prototype 2, we use a similarity measurement method to overcome the limitation of simple pattern matching in the previous prototype. With the help of the Confucius scholar (the same as in prototype 1), we created a database of Confucian statements from four classical texts: the Analects, Confucius's sayings in the Mencius, passages directly related to Confucius in the Book of Rites and the entire Classic on Filial Piety. We chose to use James Legge’s translation for all four texts. Since his translation is more than a century old and is less literal, we hope it can help to give virtual Confucius's replies a more authentic feeling, so that the user can feel that he or she is talking to someone who has walked out of history. On encountering a disputable interpretation, we consult two other popular translations in the field by Lau [8] to arrive at what we think is an appropriate translation that is more pertinent to our modern users. We eliminate passages that are too historically specific in nature and out of which no real meaning can be extracted. An example of such an elimination is

“to Zhou belonged the eight officers, Bo Da, Bo Kuo, Zhong Tu, Zhong Hu, Shu Ye, Shu Xia, Ji Sui, and Ji Gua.” [15]

Since many of the passages are considerably long (especially those in the Book of Rites and the Classic on Filial Piety) and comprise several parts, each with a distinct meaning, we do not transport the whole paragraph of text into our database as one entry. Instead, we separate it into short phrases, each of which is a self-sustained statement loaded with meaning. For instance, the opening passage of the Analects becomes three entries in our database: 1. Is it not pleasant to learn with a constant perseverance and application? 2. Is it not delightful to have friends coming from distant quarters? 3. Is he not a man of complete virtue, who feels no discomposure though men may take no note of him? In this way, our database is expanded to 2069 entries. The Confucius scholar described each database entry as a vector, using the combination of topics that best describes it. In this way, the system identifies the topics in the user input sentence and compares them with the database entries to find the closest match, based on the semantic closeness of the input topics vector and the database entry vectors. We have also created a web application, so that users can access the system from any web browser. A pilot study was carried out with six pairs of parents and children to identify usability issues and to understand their interaction experience.

In prototype 3, the current prototype, a personal chat log corresponding to a unique username was created so that users can review their previous interactions with virtual Confucius. We have also incorporated rating feedback on the Web site so that users can rate each input-output pair. This information is collected for future improvement of the system.

To improve the retrieval accuracy of our system, k-nearest neighbor (k-NN), a widely used method in text classification [18], was employed. Text classification is the process of identifying the class to which a text document belongs. In our case, each database entry is treated as a unique class, described by a set of vectors manually assigned by Confucius scholars. When a new input sentence is entered into the system, the k-nearest neighbor algorithm determines the most relevant class it belongs to, based on the similarity of the input sentence and the vectors describing each database entry. The database entry corresponding to the selected class is output. Given the limited availability of Confucius scholars, we decided to reduce the database entries to only those relevant to the family topic. The family topic was chosen because of its relevance for intergenerational communication and its importance in Confucius's teaching.
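
The retrieval step described above can be sketched in a few lines. The following is a minimal illustration, not the system's actual implementation: it assumes that each topic combination is encoded as a binary vector over a fixed topic list and that closeness is measured with cosine similarity. The topic names, the entry identifiers and the function names `cosine` and `knn_classify` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical topic list; the real set of 23 topics is given in Table 4.
TOPICS = ["filial piety", "parents", "brothers", "marriage"]


def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(query, training, k):
    """Return the entry id voted for by the k vectors closest to `query`.

    `training` is a list of (topic_vector, entry_id) pairs; several vectors
    may describe the same entry, one per scholar.
    """
    ranked = sorted(training, key=lambda item: cosine(query, item[0]), reverse=True)
    neighbours = [entry_id for _, entry_id in ranked[:k]]
    votes = Counter(neighbours)
    best = max(votes.values())
    # Ties are broken in favour of the nearest neighbour.
    for entry_id in neighbours:
        if votes[entry_id] == best:
            return entry_id


# Illustrative training vectors (assumed data, two per entry).
TRAINING = [
    ([1, 1, 0, 0], "entry_A"),
    ([1, 0, 0, 0], "entry_A"),
    ([0, 0, 1, 0], "entry_B"),
    ([0, 0, 1, 1], "entry_B"),
]
```

With this toy data, a query about filial piety and parents, `knn_classify([1, 1, 0, 0], TRAINING, k=3)`, selects `entry_A`, because two of its three nearest vectors describe that entry.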

Table 4 Set of family topics

Five Confucius scholars, who were final-year undergraduate students recommended by a faculty member who taught them during a Confucius studies module, were involved in selecting only those entries that are related to the family topic. A total of 108 database entries were selected. First, the Confucius scholars had to provide a set of topics which could be used in combination to describe each of the 108 entries. This is an iterative process in which the scholars provide a set of topics and manually check through the entries to identify new topics which are distinct from the existing pool of topics and are important for describing the database entries. The new topics are added to the pool of topics, and the process repeats until the scholars feel that the set of topics is sufficient to describe each database entry. The final set of 23 topics is shown in Table 4.

A list of keywords and the corresponding synsets of those words in the WordNet [25] lexical database were then identified. The synsets are used in the similarity comparison process to identify the semantic closeness of the user input sentence’s keywords to the topics. This step is described in detail in Sect. 4.2.2. Next, the five Confucius scholars each provide an input-output data set, out of which one randomly chosen set is used for the evaluation (elaborated in Sect. 5.1) and the four other sets are used for k-NN classification and training. For each database entry, the scholars compose an input sentence which, in their opinion, should trigger the entry as the virtual Confucius Chat output. The input sentence can be either a question or a statement. For each of the 108 input sentences, the scholars identify two to three keywords. Then, for each keyword, they identify at least one topic from the list in Table 4. A maximum of three topics may be assigned to an input sentence. The combinations of topics provided by the four Confucius scholars are used as the vectors to describe each entry. A total of 432 vectors, four for each database entry, are obtained. To use the k-NN algorithm, the k value which yields the best performance needs to be identified during a training process. A k-fold cross-validation method, a widely used method to estimate the k value of the k-NN classifier [19], was used. More detail about this step is given in Sect. 4.2.3. Finally, the system performance is evaluated in a glass-box evaluation, where the system’s selected keywords, topics and database entry are compared to the evaluation set provided by the scholar, who is the expert in the Confucius knowledge domain.
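
To make the training step concrete, the sketch below shows one way that cross-validation can be used to choose the k value of a k-NN classifier. It is an illustration under stated assumptions rather than the procedure actually used: topic combinations are represented as sets, Jaccard overlap stands in for the similarity measure, and the function names and toy data are hypothetical. The number of folds is called `n_folds` to avoid confusion with the k of k-NN.

```python
from collections import Counter


def jaccard(a, b):
    """Overlap between two topic sets (assumed similarity measure)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def knn_predict(query_topics, training, k):
    """Majority label among the k training items most similar to the query."""
    ranked = sorted(training, key=lambda item: jaccard(query_topics, item[0]), reverse=True)
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]


def cross_validated_accuracy(data, k, n_folds):
    """Hold out each fold in turn and classify it using the remaining folds."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    correct = 0
    for i, held_out in enumerate(folds):
        training = [item for j, fold in enumerate(folds) if j != i for item in fold]
        correct += sum(knn_predict(topics, training, k) == label for topics, label in held_out)
    return correct / len(data)


def select_k(data, candidate_ks, n_folds):
    """Return the candidate k with the highest cross-validated accuracy."""
    return max(candidate_ks, key=lambda k: cross_validated_accuracy(data, k, n_folds))


# Toy data: four topic sets per entry, mirroring four scholars per entry.
DATA = [
    ({"filial piety", "parents"}, "A"),
    ({"brothers", "harmony"}, "B"),
    ({"filial piety"}, "A"),
    ({"brothers"}, "B"),
    ({"filial piety", "parents"}, "A"),
    ({"brothers", "harmony"}, "B"),
    ({"parents"}, "A"),
    ({"harmony"}, "B"),
]
```

Calling `select_k(DATA, [1, 3, 5], n_folds=4)` compares the three candidate values on the toy data. A large k relative to the number of vectors per entry lets unrelated entries outvote the correct one, which is why the value has to be tuned rather than fixed in advance.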

Fig. 4 Confucius Chat interface

Fig. 5 Confucius Chat user instructions

4 System description

When the user enters the Confucius Chat URL into a web browser, it loads the Confucius Chat webpage written in HTML and JavaScript (JavaScript for AJAX request and return processing). There is a chat box where the user can type one or more sentences using the computer keyboard (as shown in Figs. 4, 5). Upon pressing the Enter key or clicking the Send button, an AJAX request containing the input sentence is sent to the Web Server, running on Apache HTTP server version 2.2.9. Upon receiving an incoming query at port 80, the Apache external handler passes the query to port 8088 of localhost. The core of the system is the Chat Server running on Python 2.5.2. The Chat Server has three functions: it listens for incoming queries at port 8088, it processes each query it receives, and it returns virtual Confucius’s reply to the Apache external handler. The processing of the Chat Server is shown in Fig. 6 and will be elaborated in the following sections.

Fig. 6 Flowchart of Confucius Chat Server

The output from the k-NN method is retrieved from the Confucius Knowledge Database. Virtual Confucius’s reply is then updated on the user’s web browser using the AJAX return process. The chat input, output, time, unique index number and the details of the processing are also stored in the Chat History Database, running on MySQL 5.1.53. When the users have finished chatting, they can click on the Rate button to go to the rating webpage. They can provide ratings for relevance and enjoyment for each of the dialog entries by clicking on a rating from 1 to 5 stars, which is stored in the Chat Rating Database with the same index number as in the Chat History Database. The common index number allows for further analysis of the information in the future. After the rating database is updated, the rating webpage is updated using the AJAX return process. Figure 7 shows the block diagram of the overall system described above.
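
The role of the common index number can be illustrated with a small sketch. The system itself stores its data in MySQL; the example below swaps in Python's built-in `sqlite3` module with an in-memory database purely so that it is self-contained. The table names, column names and function names are assumptions, and the stored "details of the processing" are omitted for brevity.

```python
import sqlite3
from datetime import datetime, timezone


def create_tables(conn):
    """Create assumed history and rating tables that share one index number."""
    conn.execute(
        "CREATE TABLE chat_history ("
        "idx INTEGER PRIMARY KEY, username TEXT, user_input TEXT, "
        "reply TEXT, created_at TEXT)"
    )
    conn.execute(
        "CREATE TABLE chat_rating ("
        "idx INTEGER PRIMARY KEY, "
        "relevance INTEGER CHECK (relevance BETWEEN 1 AND 5), "
        "enjoyment INTEGER CHECK (enjoyment BETWEEN 1 AND 5))"
    )


def log_exchange(conn, username, user_input, reply):
    """Store one input-output pair and return its index number."""
    cur = conn.execute(
        "INSERT INTO chat_history (username, user_input, reply, created_at) "
        "VALUES (?, ?, ?, ?)",
        (username, user_input, reply, datetime.now(timezone.utc).isoformat()),
    )
    return cur.lastrowid


def rate_exchange(conn, idx, relevance, enjoyment):
    """Store 1 to 5 star ratings under the same index as the chat entry."""
    conn.execute(
        "INSERT INTO chat_rating (idx, relevance, enjoyment) VALUES (?, ?, ?)",
        (idx, relevance, enjoyment),
    )


def rated_exchanges(conn):
    """Join history and ratings on the common index for later analysis."""
    return conn.execute(
        "SELECT h.user_input, h.reply, r.relevance, r.enjoyment "
        "FROM chat_history h JOIN chat_rating r ON r.idx = h.idx ORDER BY h.idx"
    ).fetchall()
```

Because both tables are keyed on the same index, each rating can be traced back to the exact input-output pair it refers to with a single join.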

Fig. 7 Block diagram of the Confucius Chat system

4.1 Artificial Intelligence Markup Language retrieval

The user input sentence is first put through the Artificial Intelligence Markup Language (AIML) database [23] to retrieve an output. The output sentence is then assigned a score (from 0.0 to 1.0). The score is based on the number of matched words in the input sentence, discounted for templates that give random output and for a list of words that Confucius will not talk about.

For example, the user input

“What is your name please”

maps to two templates in the AIML database. First is the template “_PLEASE” which means that if the sentence ends with the word please, the reply from this template will be used and the words before the word please will be used to search for a second template,

“WHAT IS YOUR NAME”.

The reply for first template is

“Thank you for being polite”.

and the reply for second template is

“My name is Confucius”.

Hence the reply for the user input is

“Thank you for being polite. My name is Confucius.”

For this example, the two templates together match all the words in the input sentence; hence, the score is 1.0. The score is halved for the more general templates in the AIML database, which offer random output. If the output sentence contains any word in our forbidden word list, which is a list of words Confucius will not discuss, for example God and Jesus, the score is 0.0. If the score is above a predetermined threshold value, currently set at 0.7, the output from AIML is used. Otherwise, the input sentence is passed to the knowledge database retrieval step for further processing.
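
The scoring rule can be sketched as follows. This is a hedged illustration only: the text does not give the exact formula, so the sketch assumes that the base score is the fraction of input words covered by the matched templates. The function names and the flag `is_random_template` are hypothetical; the threshold of 0.7 and the two forbidden words are the ones named above.

```python
FORBIDDEN_WORDS = {"god", "jesus"}  # the two examples named in the text
AIML_THRESHOLD = 0.7                # threshold value stated in the text


def aiml_score(input_words, matched_words, output_sentence, is_random_template=False):
    """Score an AIML reply between 0.0 and 1.0.

    Assumption: the base score is the fraction of input words that the
    matched templates account for.
    """
    if not input_words:
        return 0.0
    # A reply containing any forbidden word is rejected outright.
    output_tokens = {w.strip(".,!?\"'").lower() for w in output_sentence.split()}
    if output_tokens & FORBIDDEN_WORDS:
        return 0.0
    score = len(matched_words) / len(input_words)
    # General templates that produce random output count for half.
    if is_random_template:
        score /= 2.0
    return score


def use_aiml_output(score):
    """The AIML reply is used only when the score is above the threshold."""
    return score > AIML_THRESHOLD
```

For the input "What is your name please", all five words are matched and the reply contains no forbidden word, so `aiml_score` returns 1.0 and the AIML reply is used. The same match through a random-output template would score 0.5 and fall through to the knowledge database retrieval step.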

4.2 Knowledge database retrieval

The core of our system is a Similarity module, which calculates the highest similarity score for each keyword in user input sentence with the topics to identify a topic for each keyword. The set of topics are then used to retrieve the closest matched database entry based on k-NN method.

4.2.1 Identifying Keywords

To achieve the above, the computer must first understand the input sentence. This is not an easy task, because the user’s input is natural human language, which has a very complicated structure, and even slight changes in word order may alter the meaning of the sentence. Simple keyword matching therefore does not work well, and a more sophisticated method is needed to analyze the meaning of the input. The user input is fed into a parser to get the grammatical structure of the sentence. Our system uses the Stanford Parser [5] because of its speed and reliability. The last noun of each noun phrase is selected as a headword [6] of the sentence. Usually, these are the topics the user is talking about. However, sometimes there are no nouns in the user input, or there are important words that are not nouns. Therefore, we employ another method, called inverse term frequency, to find the important words. An inverse term frequency database is created by calculating the frequency of appearance of each word in a large corpus. Studies show that the more frequently used words, such as "the" and "and", do not contribute much to the real meaning of the sentence, whereas the less frequent words, "loyal" and "conflict," for example, are the more important words [7].

Combining the results of the two methods, the system selects three keywords. Keywords are taken first from the headword method; if there are not enough headwords, the remaining keywords are selected using the inverse term frequency method. Furthermore, the user input is passed through a Word Sense Disambiguation (WSD) module, so that we know not only the keywords of the input, but also the meaning of these words in context [13].
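The combination of the two methods can be sketched as below. The sketch assumes the headwords have already been extracted by a parser and that a word-frequency table from a large corpus is available; the function name, the toy frequency counts and the treatment of unseen words as rarest are illustrative assumptions.

```python
def select_keywords(headwords, words, word_frequency, max_keywords=3):
    """Pick up to max_keywords: headwords first, then the rarest remaining words.

    headwords: last noun of each noun phrase (assumed to come from a parser).
    words: all words of the input sentence, in order.
    word_frequency: corpus counts; a lower count means a higher inverse
        term frequency. Words missing from the table are treated as rarest.
    """
    keywords = []
    for w in headwords:
        w = w.lower()
        if w not in keywords:
            keywords.append(w)
    keywords = keywords[:max_keywords]

    # Fill any remaining slots with the least frequent non-headword words.
    candidates = []
    for w in words:
        w = w.lower()
        if w not in keywords and w not in candidates:
            candidates.append(w)
    candidates.sort(key=lambda w: word_frequency.get(w, 0))
    keywords.extend(candidates[: max_keywords - len(keywords)])
    return keywords
```

With two headwords and a toy frequency table in which "united" is the rarest remaining word, the function returns the two headwords followed by "united".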

4.2.2 Identifying topics

We then compute the semantic similarity between these selected keywords and the 23 topics provided by the Confucius scholars. This is done using a WordNet-based similarity module WordNet::Similarity developed by Ted Pedersen. For each topic \(T_x,\) there are several topical words,

i.e.,

$$ T_x = \{T_x W_1, T_x W_2, \ldots, T_x W_y\}, $$
(1)

where y is the number of topical words for topic x.

For each topical word, there may exist several suitable senses in WordNet, i.e.,

$$ T_x W_y = \{S_{xy1}, S_{xy2}, \ldots, S_{xyz}\}, $$
(2)

where z is the number of senses for the yth topical word of topic x.

For each keyword in the user input, the topical word whose synset has the highest similarity score with the keyword is selected, and its topic is assigned to the keyword. The selected topic T for a keyword K is given by the equation below.

$$ {\mathrm {T}} = \arg \max \{Sim(S_{111},K), \ldots ,Sim(S_{xyz},K)\} $$
(3)

The user input sentence can then be represented as a vector \(\overrightarrow{v_i}\):

$$ \overrightarrow{v_i} = a_1 \overrightarrow{t_1} + a_2 \overrightarrow{t_2} + a_3 \overrightarrow{t_3} + \cdots + a_n \overrightarrow{t_n} $$
(4)

where n is the total number of topics (23), \(\overrightarrow{t_n}\) is the basis vector representing the nth topic, and \(a_n\) is the binary weight of that topic. A selected topic has a weight of \(a = 1,\) and an unselected topic has \(a = 0.\)
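Equations (3) and (4) can be sketched in a few lines. The similarity function here is a hypothetical lookup table standing in for WordNet::Similarity, and its scores, like the two-topic dictionary, are invented for illustration; only the topic numbers 9 and 22 and the total of 23 topics come from the text.

```python
def identify_topic(keyword, topics, sim):
    """Return the topic whose topical-word sense is most similar to keyword (Eq. 3).

    topics: {topic_id: {topical_word: [sense, ...]}}
    sim: function (sense, keyword) -> similarity score (assumed given).
    """
    best_topic, best_score = None, float("-inf")
    for topic_id, topical_words in topics.items():
        for senses in topical_words.values():
            for sense in senses:
                score = sim(sense, keyword)
                if score > best_score:
                    best_topic, best_score = topic_id, score
    return best_topic

def topic_vector(selected_topics, n_topics=23):
    """Binary weights a_1..a_n of Eq. 4: 1 for selected topics, 0 otherwise."""
    return [1 if t in selected_topics else 0 for t in range(1, n_topics + 1)]

# Toy data: two of the 23 topics and made-up similarity scores.
TOY_TOPICS = {9: {"harmony": ["harmony#n#1"]}, 22: {"family": ["family#n#2"]}}
TOY_SCORES = {
    ("family#n#2", "family#n#2"): 1.0,
    ("harmony#n#1", "family#n#2"): 0.2,
    ("harmony#n#1", "united#a#1"): 0.6,
    ("family#n#2", "united#a#1"): 0.3,
}

def toy_sim(sense, keyword):
    return TOY_SCORES.get((sense, keyword), 0.0)
```

Applied to the keywords family#n#2 and united#a#1, the toy setup selects topics 22 and 9, and the resulting vector has ones only in positions 9 and 22.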

4.2.3 Identifying Confucius entry

To improve the retrieval accuracy of the system, we employed the k-nearest neighbor algorithm (k-NN) to classify the database entries based on training examples provided by the Confucius scholars, who are the domain experts. The k-nearest neighbor algorithm is simple and widely used in text classification [18]. An object is classified by a majority vote of its k nearest neighbors. Each input sentence, as well as each entry in the Confucius database, can be represented as a point in a high-dimensional space. The Euclidean distance between each pair of points is used as the distance metric. For an input sentence x, the class of x, denoted by c(x), is given by

$$ c(x) = \arg \max _{c \in C} \sum _{i=1}^k \delta (c,c(y_i)) $$
(5)

where C is the collection of all classes, \(c(y_i)\) is the class of \(y_i,\) \(y_1, \ldots , y_k\) are the k nearest neighbors of the input sentence, and

$$\begin{aligned} \delta (u,v) = {\left\{ \begin{array}{ll} 1 & \quad \text {if} \quad (u = v) \\ 0 & \quad \text {otherwise} \end{array}\right. } \end{aligned}$$
(6)

Five Confucius scholars were recruited; each of them classified the 108 entries using a combination of up to 3 topics. The scholars’ tagging for each Confucius entry in the database can also be represented as a vector \(\overrightarrow{v_0}\) in the same high-dimensional space as \(\overrightarrow{v_i}\):

$$ \overrightarrow{v_0}=b_1 \overrightarrow{t_1} + b_2 \overrightarrow{t_2} + b_3 \overrightarrow{t_3} + \cdots + b_n \overrightarrow{t_n} $$
(7)

where \(b_i\) is the binary weight of the corresponding topic. Topics tagged by the scholar have a weight of \(b = 1,\) and topics not tagged have \(b = 0.\) The similarity between the input sentence and a database sentence is inversely related to the Euclidean distance, d, between point \(a = (a_1, a_2, \ldots , a_n)\) and point \(b = (b_1,b_2, \ldots , b_n).\)

$$ d = \sqrt{(a_1-b_1)^2+(a_2-b_2)^2+ \cdots +(a_n-b_n)^2} $$
(8)

A smaller d indicates higher similarity between the input and the database entry; such an entry is therefore deemed more suitable as the output to the user.
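Equations (5) to (8) amount to the following minimal sketch. The four-dimensional toy vectors are invented for brevity (the system uses 23 topics), and breaking ties at random is an assumption here that follows the paper's own description of how an equal vote is handled.

```python
import math
import random
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two topic-weight vectors (Eq. 8)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(query, training, k=3, rng=random):
    """Majority vote among the k nearest training vectors (Eqs. 5 and 6).

    training: list of (vector, class_label) pairs tagged by the scholars.
    Ties between equally voted classes are broken at random.
    """
    neighbors = sorted(training, key=lambda item: euclidean(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    top = max(votes.values())
    winners = [label for label, count in votes.items() if count == top]
    return winners[0] if len(winners) == 1 else rng.choice(winners)

# Toy training set: binary topic vectors paired with database entry indices.
TOY_TRAINING = [
    ([1, 1, 0, 0], 1283),
    ([1, 1, 0, 0], 1283),
    ([1, 0, 0, 0], 1970),
    ([0, 0, 1, 1], 26),
]
```

For the query `[1, 1, 0, 0]` with k = 3, the neighbors are two vectors of class 1283 and one of class 1970, so class 1283 wins the vote.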

We have five sets of data from the five Confucius scholars. One data set is randomly selected and reserved as an evaluation set, used to evaluate the overall output accuracy of our system with the k-NN classification algorithm. Ideally, for any input sentence, the output given by our system should be the same as the one given by the domain expert; that is, the system output should be as close as possible to the human expert’s output. The remaining four data sets are used to train the classifier.

Eightfold cross-validation is performed on the 432 data points in our training set, as described in Sect. 3.5. The data points are evenly divided into 8 partitions \(D_1, D_2, \ldots , D_8,\) with each partition containing the same number of data points from each class, i.e., each partition contains 54 samples. Each partition is used in turn as the test set, while the remaining partitions are used as the training set. To tabulate the classification results for the test samples, a 108 × 108 confusion matrix C is used. All elements in C are initialized to 0. Let \(w_t\) denote the true class of a sample and \(w_p\) its predicted class. For every test sample, the element \(C_{w_t,w_p}\) is incremented by 1. The accuracy A of the classifier is given by

$$ A = \frac{trace(C)}{n_\text {total}} $$
(9)

where \(n_\text {total}\) is the total number of samples that have been tested.
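The tabulation and Eq. (9) can be sketched as follows, using a three-class toy example in place of the 108 × 108 matrix; the function names and sample labels are illustrative.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Build C, where C[t][p] counts samples of true class t predicted as p."""
    index = {c: i for i, c in enumerate(classes)}
    n = len(classes)
    C = [[0] * n for _ in range(n)]  # all elements initialized to 0
    for t, p in zip(true_labels, predicted_labels):
        C[index[t]][index[p]] += 1
    return C

def accuracy(C):
    """Accuracy A = trace(C) / n_total (Eq. 9)."""
    n_total = sum(sum(row) for row in C)
    trace = sum(C[i][i] for i in range(len(C)))
    return trace / n_total if n_total else 0.0
```

Correct predictions land on the diagonal, so the trace counts them and dividing by the number of tested samples gives the accuracy.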

Table 5 Classification accuracy using different values of k

Fig. 8 Classification accuracy using different values of k

Furthermore, the process is repeated 10 times, repartitioning the samples in each iteration, to get a better estimate of the accuracy. The classification accuracy for different values of k is shown in Table 5 and Fig. 8. The best value of k for the classifier is influenced by many factors, including the number of Confucius scholar data sets, the agreement between the scholars’ data sets and other nonlinear system parameters. To determine the most suitable value of k for our system, we used cross-validation [19], a well-established technique for this purpose. Based on the classification accuracy obtained for the different values of k in the test, \(k = 3\) was selected.
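Selecting k by repeated cross-validation can be sketched as below. This is an illustrative outline rather than the procedure actually run: the partitioning deals samples round-robin within each class, which only approximates equal class counts per partition, vote ties are resolved by neighbor order instead of at random, and all names and defaults are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict

def knn_predict(query, training, k):
    """Majority vote among the k nearest (vector, label) training pairs."""
    nearest = sorted(training, key=lambda s: math.dist(query, s[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def stratified_folds(samples, n_folds, rng):
    """Shuffle within each class, then deal samples round-robin into folds."""
    by_class = defaultdict(list)
    for sample in samples:
        by_class[sample[1]].append(sample)
    folds = [[] for _ in range(n_folds)]
    i = 0
    for label in sorted(by_class):
        group = by_class[label]
        rng.shuffle(group)
        for sample in group:
            folds[i % n_folds].append(sample)
            i += 1
    return folds

def cv_accuracy(samples, k, n_folds=8, repeats=10, seed=0):
    """Mean accuracy over repeated n-fold cross-validation, repartitioning each time."""
    rng = random.Random(seed)
    correct = total = 0
    for _ in range(repeats):
        folds = stratified_folds(samples, n_folds, rng)
        for i, test in enumerate(folds):
            train = [s for j, fold in enumerate(folds) if j != i for s in fold]
            for vector, label in test:
                correct += knn_predict(vector, train, k) == label
                total += 1
    return correct / total

def best_k(samples, candidates, **cv_options):
    """Return the candidate k with the highest cross-validated accuracy."""
    return max(candidates, key=lambda k: cv_accuracy(samples, k, **cv_options))
```

Calling `best_k(samples, [1, 3, 5, 7])` on the scholars' tagged vectors would return the value of k to use; when several candidates score equally, the first one listed is returned.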

5 Technical results

To evaluate the performance of our system, we carried out glass-box and black-box evaluations [12]. Glass-box evaluation looks inside the system and measures how well each module performs, while black-box evaluation measures how well the system performs as a whole. We carried out the glass-box evaluation by examining each system module carefully. For the black-box evaluation, the users rated each input-output chat entry for relevance and enjoyment.

5.1 Glass-box evaluation

The evaluation set is randomly selected from one of the five Confucius scholar’s data sets.

Keyword identification

For each of the 108 input sentences, the scholar provided two to three keywords. The total number of keywords provided by the scholar is 266. Each sentence is entered into our system, and the system-identified keywords are compared to the ones provided by the scholar. As shown in Table 6, the total number of system-identified keywords that match the keywords provided by the scholar is 236. Therefore, the accuracy of keyword identification is 88.72%.

Table 6 Keywords retrieval accuracy

Topics identification

The total number of topics provided by the scholar is 265. Each sentence is entered into our system, and the system-identified topics are compared to the ones provided by the scholar. As shown in Table 7, the total number of topics identified by the system that match the topics provided by the scholar is 216. Therefore, the accuracy of topic identification is 81.20%.

Table 7 Topics retrieval accuracy

Confucius entry identification

Out of the 5 sets of input-output data provided by the Confucius scholars, one set is randomly selected for evaluation. The remaining four sets are used for k-NN training. The Confucius entries selected using k-NN, and those selected by each of the four individual sets, are compared to the Confucius entries selected in the evaluation set. The result is shown in Table 8.

Table 8 k-NN method retrieval accuracy improvement (k  =  3)

With the k-NN method, an accuracy improvement of 39.39% was observed when compared with the average performance of Sets 1 to 4. Compared with the worst case, Set 3, an improvement of 76.92% was observed.
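The arithmetic behind these figures can be checked with a short sketch. It assumes the reported improvements are relative to the baseline accuracy (an absolute gain of 76.92 points would exceed the 42.59% k-NN accuracy reported in the conclusion), and the baseline accuracies it derives are implied values, not figures quoted from Table 8.

```python
def accuracy_pct(matched, total):
    """Percentage of items the system matched against the scholar's set."""
    return 100.0 * matched / total

def relative_improvement_pct(new, baseline):
    """Relative gain of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline

def implied_baseline(new, improvement_pct):
    """Baseline accuracy implied by a new accuracy and a relative improvement."""
    return new / (1 + improvement_pct / 100.0)

keyword_accuracy = accuracy_pct(236, 266)        # keyword identification
avg_baseline = implied_baseline(42.59, 39.39)    # implied average of Sets 1 to 4
worst_baseline = implied_baseline(42.59, 76.92)  # implied accuracy of Set 3
```

Under the relative-improvement assumption, the implied baselines are roughly 30.6% for the average of the individual sets and 24.1% for the worst set.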

5.2 Black-box evaluation

For the black-box evaluation, the users rated the enjoyment and relevance of each of virtual Confucius’s responses to their input sentence or question. For each input-output pair, the user can give a rating from 1 to 5, with 1 being strongly disagree and 5 being strongly agree. The users’ ratings for the statement “I enjoy the Confucius’s response” are shown in Fig. 9, and their ratings for the statement “The Confucius’s response is relevant to my input sentence” are shown in Fig. 10. Frequency refers to the number of input-output pairs given that particular rating. The results show that users rated both the enjoyment and the relevance of Confucius’s chat replies highly.

Fig. 9 User rating on the enjoyment of Confucius output

Fig. 10 User rating on the relevance of Confucius output

There is a positive correlation between relevance and enjoyment (r (778)  =  .673, p < .01), indicating that as the user ratings for relevance increase, the enjoyment ratings also increase.
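The correlation coefficient is Pearson's r, which can be computed as in the sketch below. The five rating pairs are invented toy data, not the study's ratings; if r(778) follows the usual convention of reporting n − 2 degrees of freedom, the study's value would be based on 780 rated pairs.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length rating lists.

    Raises ZeroDivisionError if either list has zero variance.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Toy 1-5 star ratings for five input-output pairs (made up for illustration).
relevance = [1, 2, 3, 4, 5]
enjoyment = [2, 1, 4, 3, 5]
r = pearson_r(relevance, enjoyment)
```

A value of r close to 1 means that pairs rated as more relevant also tend to be rated as more enjoyable.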

5.3 Example of input-output retrieval

The example below walks through a correctly retrieved system output, one that matches the output given by the expert (a Confucius scholar), based on the input-output data set provided by that scholar, as described in Sect. 3.5. In this example, shown in Fig. 11, the input sentence to the system is

“What would cause the instability of a family and how should we stay united?”

The system selects the keywords of the input sentence using two methods, headwords and inverse term frequency. The sentence is first fed into a parser to get its grammatical structure, and the last noun of each noun phrase is selected as a headword of the sentence. Usually, these are the topics the user is talking about. However, sometimes there are no nouns in the user input, or there are important words that are not nouns. Therefore, we also use inverse term frequency to find the important words: words with a higher inverse term frequency value are considered more important in the sentence. Furthermore, the input sentence is passed through a Word Sense Disambiguation (WSD) module, so that we know not only the keywords of the input, but also the meaning of these words in context. The output of the WSD module is in the format word#part of speech#sense number. For example, instability#n#3 is noun sense number three of the word instability, which is a lack of balance or a state of disequilibrium; united#a#1 is adjective sense number one of the word united, which is characterized by unity or joined into a single entity. The selected keywords with their corresponding senses in the sentence are instability#n#3, family#n#2 and united#a#1. The details of how the keywords are selected are presented in Sect. 4.2.1.

Based on the keywords, the system identifies the topics from the semantic similarity between keywords and topics, as described in Sect. 4.2.2. In this example, the system correctly determines the topics of the input sentence. The topics identified are 22 and 9, which correspond to the family and harmony topics, respectively, in Table 4. The system then finds the three most relevant database entries, based on the closest distance, as described in Sect. 4.2.3. In this case, the three closest entries belong to classes 1283, 1970 and 1283. Based on the majority vote from the three closest neighbors, 1283 is the entry that will be output. The number 1283 refers to the index of the entry in the database. In this case, the output selected by the system matches the output given by the expert,

“A family must first destroy itself before others can destroy it.”
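The WSD output format used in this walk-through, word#part of speech#sense number, can be unpacked with a small helper. The type and function names are hypothetical and the sketch assumes well-formed tags with exactly three fields.

```python
from typing import NamedTuple

class SenseTag(NamedTuple):
    word: str
    pos: str    # part of speech, e.g. "n" for noun, "a" for adjective
    sense: int  # sense number of the word

def parse_sense_tag(tag):
    """Split a tag such as 'instability#n#3' into its three fields."""
    word, pos, sense = tag.split("#")
    return SenseTag(word, pos, int(sense))

keywords = [parse_sense_tag(t) for t in ("instability#n#3", "family#n#2", "united#a#1")]
```

Each parsed keyword then exposes its word, part of speech and sense number as named fields, for example `keywords[0].sense`.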

Fig. 11 Confucius Chat input-output retrieval example 1

In another example, shown in Fig. 12, the output selected by the system differs from the output given by the expert. Based on the expert’s input sentence, the keywords selected using headwords and inverse term frequency, with their corresponding senses, are action#n#1, son#n#1 and filial#a#1. The topics identified are 8, 2 and 19, which correspond to the conduct, children and filial topics, respectively, in Table 4. The three closest entries selected using k-nearest neighbor belong to classes 26, 1612 and 186. Since there is an equal vote, the system randomly chooses one entry to output; in this case, entry 1612 was selected. Note that the three entries selected using k-nearest neighbor were the closest matches to the combination of topics in the input sentence, based on the classification by four Confucius scholars. The output provided by the evaluation expert is entry 186, which differs from the output selected by the system. Although the output is counted as incorrectly retrieved in the evaluation, the system output is nevertheless a reasonable reply to the input sentence.

Fig. 12 Confucius Chat input-output retrieval example 2

Fig. 13 iSage mobile app: an extension of Confucius Chat system

6 Conclusion

In this research, we applied NLP algorithms to an intangible cultural heritage and created a virtual chat agent. It models Confucius’s knowledge and teachings and delivers them intelligently through natural language chat with humans. To understand both the meaning and the context of the user’s natural language input and to retrieve a relevant answer, the k-nearest neighbor (k-NN) algorithm was employed to improve the retrieval accuracy. Five Confucius scholars were engaged to provide input-output data sets for the training and evaluation of the system. A total of 432 vectors, 4 for each database entry, were obtained. The value of k that yields the best performance was obtained using a k-fold cross-validation method. Each database entry is treated as a unique class, described by a set of vectors manually assigned by the Confucius scholars. When a new input sentence is entered into the system, natural language processing methods are used to determine the keywords and corresponding topics in the sentence. The k-nearest neighbor algorithm then determines the most relevant class for the input sentence, based on the similarity between the input sentence’s topics and the vectors describing each database entry. The database entry corresponding to the selected class is output. The software engineering details of building the system prototypes have also been presented.

We also carried out evaluations to test the system performance and the experience of users. Glass-box evaluation looks inside the system and measures how well each module performs, step by step. It was carried out by carefully measuring the computational accuracy of each module. Black-box evaluation examines how well the system works as a whole, through user ratings of relevance and enjoyment for each input-output chat entry. In the glass-box evaluation, the system identified the keywords and topics with an accuracy of 88.72% and 81.20%, respectively. Based on the input sentences provided by a Confucius scholar, the system-selected output was compared to the Confucius scholar’s output. An accuracy of 42.59% was obtained using the k-NN method, an improvement of 39.39% over the average performance of the individual scholars’ classifications. In the black-box evaluation, more than 70% of the users gave a rating of 4 (agree) or 5 (strongly agree) for the enjoyment and relevance of virtual Confucius’s response to their input sentence or question. There is a positive correlation between relevance and enjoyment (r (778)  =  .673, p < 0.01), indicating that as the user ratings for relevance increase, the enjoyment ratings also increase.

By processing natural language input and computationally matching it against the database, we created a novel merging of ancient philosophy with recent media literacy through interactive cultural play. Our studies showed that users gave quite positive feedback on their experience with virtual Confucius. They enjoyed using it and were willing to share their stories with this virtual philosopher, just like talking with a real friend. They also believed that this medium could improve intergenerational interaction.

The Confucius Chat system has since been extended into a mobile application, iSage (Fig. 13), which offers users advice based on various philosophers and knowledge bases. In this application, users can ask the virtual Sage about topics such as love and fate, with more topics to be added in the future. The topic knowledge databases were obtained from various sources. Based on the algorithm of the Confucius Chat system, iSage allows users to interact with the application through natural language chat. The iSage application has been deployed on the Android Market (Footnote 4). We hope this work will in future be used to achieve new interactive experiences with all forms of intangible cultural heritage.