1 Introduction

With the proliferation of powerful hardware and soft-computing techniques, an era of AI-based services has emerged in recent years. AI services have driven a new industrial revolution across the globe. Today, AI is becoming part of everyday life for researchers and innovators owing to advances in deep learning and big data. Data analysis [7] and smart assistance services that collect information from different areas have also been commercialized. Chatbots using artificial intelligence may ease human life by providing real-time news or information, suggestions, recommendations, shopping services etc. The AI employed in chatbots is broadly classified as follows [79]:

  • Weak AI: It is designed and trained for a dedicated task and cannot respond beyond its limitations or defined field, for example Apple Siri, chess programs, college bots, hospital bots, telecom bots, Google Assistant etc.

  • Strong AI: It is intelligent enough to find a solution without human intervention. Strong AI-based systems simulate human abilities and are supposed to handle any kind of task, for example by borrowing ideas from neuroscience, psychology, genetics and language understanding.

  • Evolutionary AI: It involves the study and design of machines that simulate simple creatures, for example ants and bees, and attempt to evolve.

  • Super AI: It is a hypothetical concept in which machines lead humans and can communicate with each other as well as with humans.

Machine Learning (ML) tasks such as classification, regression and clustering are emerging topics, along with combinatorial optimization, computer vision, deep learning, transfer learning, and ensemble learning. Embedding applications for supply chains and important manufacturing industries in an AI environment using ML is evolving as a promising research area [6]. ML, Natural Language Processing (NLP), machine vision, expert systems, robotics etc. are application-specific technologies incorporated with AI. These AI-based technological interventions now play a significant role in everyday aspects of human life.

As AI-based technological intervention has emerged on a large scale alongside a flourishing range of smart tools, traditional chatbot development techniques based on rules or simple ML algorithms are being replaced or complemented. Nowadays, many applications, including chatbots, use deep learning techniques [15, 26] such as Deep Reinforcement Learning (DRL) [33], Deep Neural Networks (DNN), Deep Hybrid Neural Networks [3,4,5] and computational intelligence. Chatbots are mainly of the following two types:

  1. (1)

    Linguistic (rule-based) chatbots: Also known as decision-tree bots, these are programmed to reply to specific questions that are predefined at the outset. Users of this type of chatbot are restricted to limited input options. These chatbots use if-then logic for conversations and have a set of defined rules, framed for different types of problems and their respective solutions. They map the conversation much like a flowchart and depend on the customer’s query matching one of the defined rules. However, the inability to learn from real-time conversation and a less flexible conversational flow are the limitations of rule-based chatbots [20, 68]. Small organizations or companies with a specific working area may use rule-based chatbots for more appropriate responses. Rule-based chatbots can also be a better option for Frequently Asked Questions (FAQ) services, where only limited sample conversations are available to learn from.

  2. (2)

    AI-Chatbots: These chatbots are programmed to interact with users like a real human and can keep track of context and a word dictionary. In addition, they require considerable logic to understand the intent as well as the context of a query before delivering the response. The use of NLP, the framing of their own responses to complicated queries, and the ability to learn continuously from conversations to generate better responses make them the preferred choice of researchers. They take more time to train initially but save a lot of time in the long run. In some application databases, simple nested if-else logic also works well enough to meet the requirements of an AI-Chatbot [20, 68].
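The if-then logic of a rule-based chatbot can be illustrated with a minimal Python sketch. The rules, keywords and replies below are hypothetical examples chosen for illustration and are not taken from any of the systems cited here.

```python
# Minimal sketch of a rule-based (decision-tree style) chatbot.
# All rules and replies are hypothetical illustrations.
RULES = [
    # (keywords that must all appear in the query, predefined reply)
    ({"opening", "hours"}, "We are open from 9 am to 5 pm, Monday to Friday."),
    ({"reset", "password"}, "Use the 'Forgot password' link on the login page."),
]
FALLBACK = "Sorry, I can only answer questions about opening hours or passwords."


def rule_based_reply(query: str) -> str:
    """Return the reply of the first rule whose keywords all occur in the query."""
    words = set(query.lower().replace("?", " ").split())
    for keywords, reply in RULES:
        if keywords <= words:  # if-then: every keyword present, so the rule fires
            return reply
    return FALLBACK  # no rule matched; the bot cannot learn a new answer
```

A query outside the predefined rules always falls through to the fallback reply, which reflects the limited conversational flow described above.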

Further, based on the query and response requirements of the user, chatbots can be categorized as follows:

  1. (a)

    Text-to-text bot (TTT): This is the basic architecture, in which the user gives input as text in the form of words or sentences and the chatbot responds similarly using pattern matching or rule-based general-purpose methods. A suitable example is ELIZA [82], which started the era of chatbots. Nowadays, various messenger apps such as Facebook Messenger, WhatsApp and Telegram use TTT for two-user interaction services.

  2. (b)

    Text-to-speech bot (TTS): It is more interactive and is useful to users who prefer listening, since it speaks its responses. Hence it is suitable for visually impaired persons, who can hear and speak but cannot see; an example is Snatch-bot.

  3. (c)

    Speech-to-text bot (STT): It generates text responses; the user interacts with the bot by voice and views the answers. It is suitable for purposes such as conferences and for helping hearing-impaired individuals interact socially. SIRI [59], developed by Apple, is one such STT chatbot.

  4. (d)

    Speech-to-speech bot (STS): It is an emerging type of chatbot with human-like expertise. Voice-to-voice assistants for teaching in the education sector are currently being attempted.

1.1 Challenges

Several studies are trying to develop an ideal chatbot application that can hold a natural conversation and is indistinguishable from a human, but this is still far from being achieved. From the summary of the literature and the limitations mentioned in [21, 57], the following drawbacks are observed in the development of an efficient chatbot application.

  • Lack of training data: ML-based chatbot systems require a huge amount of relevant training data in an appropriate format. A rule-based chatbot requires human expertise to craft each rule and response, while an ML-based chatbot requires humans to collect, select and clean every single piece of training data. The chatbot learns the relationships between words, phrases, synonyms, lexical entities, sentences and concepts. Owing to the lack of relevant and categorised data, building an AI-Chatbot can be a prohibitively costly and time-consuming part of the application.

  • Poor conversational understanding: Understanding customer behaviour and questions is the main concern in ML chatbots. Rule-based chatbots ensure that questions with the same meaning receive the same answer, but it is difficult for ML systems to correctly recognise similar questions phrased in different ways, even within the same conversation. AI-Chatbots need critical improvement in handling grammatical errors, word ambiguity, language structure, semantic meaning, sentiment analysis, the user’s way of writing, data restructuring, the ability to learn from disordered language so as to address the limitations of NLP, accuracy in self-learning etc.

Deep learning algorithms can help to overcome the challenges involved in the development of a chatbot. They support the analysis of user sentiment and inputs to generate an appropriate response. Deep learning for chatbots is a largely empirical task that involves recognizing human natural language and generating a proper response in every situation. However, generating natural responses requires a huge amount of related data for learning and also consumes a great amount of time. Appropriate training rules for the AI-Chatbot must be selected to handle the challenging issues that are normally obstacles for simpler chatbots.

1.2 Motivation

The main motive of this review paper is to study AI-Chatbots, their evolution and their services. A huge number of services are offered by AI-Chatbots, but in this paper they are divided into customer-based and public administration-based services. The motivation behind this review paper includes the study of chatbots based on the following points:

  • Types of AI-Chatbots based on different aspects and their challenges.

  • Architecture, models, techniques and technologies used in AI-Chatbots.

  • Evolution of chatbots including various state-of-the-art chatbots.

  • Customer service-based chatbots including their approaches, knowledge-base, techniques, target audience, limitations, mode of communications etc.

  • Public administration service-based chatbots including the type of public service, their impact on solving issues, knowledge-base, performance etc.

  • AI-Chatbot performance measurements based on various parameters.

Very limited work has been carried out on developing public administration service-based chatbots. However, much research has been done on solving public administration issues using AI, which can be utilized with some modifications to develop an AI-based chatbot. Some such AI-based public administration research works are as follows:

In 2007, Andy Chun [9] used AI for the automatic assessment of immigration application forms in e-Government. The paper describes an e-Government AI project that provides a range of intelligent services to support automated assessment of various types of applications submitted to an immigration agency. It demonstrates well how AI can make public administration easier. In 2014, S. Mohammady et al. [54] discussed urban growth modelling using an artificial neural network, with Sanandaj city, Iran, as a case study. In this activity centred on local public administration, they used an Artificial Neural Network (ANN) as a powerful tool for simulating the urban growth patterns of Sanandaj, located in the west of Iran. Landsat satellite imagery was used as the database. In 2016, Zheng Xu et al. [91] proposed an ANN-based evaluation method for urban public security. The authors illustrated a scenario from the Pudong District of Shanghai in which an ANN-based simulation is applied to the proposed Grid Management System model to validate its applicability. Hardt et al. [55] addressed equality of opportunity in supervised learning. This highly technical paper develops a framework to assess and reduce bias in predictive algorithms and gives precise definitions of equality of opportunity and equalized odds. This indicates that there is considerable potential for unbiased and transparent AI solutions for citizen-centric services to be adopted by local administrations.

In 2017, Georgios N. Kouziokas et al. [41] developed an Artificial Neural Network (ANN) model for forecasting unemployment rates in public administration. The ANN topologies were examined with respect to the number of neurons and the transfer functions in the hidden layers. They demonstrated that AI techniques have significant potential for carrying out authentic surveys with defined indicators and for giving a clear basis for framing public policies from such unbiased survey reports. G. N. Kouziokas et al. [42] combined artificial intelligence and regression analysis to predict groundwater levels for use in public administration. Different predictive models were constructed with different numbers of nodes in the hidden layers to obtain a fair estimate of the water level. The ANN predictions were found to be accurate enough for further use and for setting a line of action by the local administration. Hila Mehr [52], of the Ash Center for Democratic Governance and Innovation at the Harvard Kennedy School, presented work on artificial intelligence for citizen services and government. It focused on applications directly related to citizen services, such as answering questions, filling out and searching documents, routing requests, translation, and drafting documents.

It is evident, from online services like Netflix and Facebook to chatbots like Siri and Alexa, that an era of daily interaction with AI has begun. It is stated that AI will soon permeate the ways we interact with our government too. From small cities in the US to countries like Japan, government agencies are looking to AI to improve citizen services. In a North Carolina government office, chatbots, which are conversational systems that are mostly AI-based, free up the help centre operators’ line, where nearly 90% of calls are just about basic password support, allowing operators to answer more complicated and time-sensitive inquiries. The government of Singapore worked with Microsoft to create chatbots for selected citizen services. These chatbots are intended to function as digital representatives. New York City is planning to work with IBM’s AI platform, Watson, to build a new customer management system to speed up the process of answering questions and complaints about city services on its 311 platform.

1.3 Type of chatbots: application aspect

Different authors have classified chatbot applications based on various aspects. In this section, the classifications of chatbot applications found in the literature are compiled as follows:

In 2017, Barker et al. [12] divided chatbot applications into four groups: (i) service chatbots, which provide various kinds of facilities to the customer; (ii) commercial chatbots, which provide assistance related to purchases; (iii) entertainment chatbots, which are designed to keep customers engaged with sports, movies etc.; and (iv) advisory chatbots, which provide suggestions and recommendations on various topics. Chen et al. [23] classified chatbot applications into two groups: task-oriented and non-task-oriented chatbots. Task-oriented chatbots assist customers in completing their tasks through short conversations within a closed domain; Siri, Google Now, Alexa etc. are task-oriented chatbots. Non-task-oriented chatbots converse with customers in an open domain to answer their questions and are normally used for entertainment purposes [47].

In 2018, Nuruzzaman et al. [57] categorized chatbot applications into the following four groups:

  1. (1)

    Goal-based chatbot: It is built around a primary goal and is task-specific, holding very short conversations with clients to get the task completed. Such chatbots are deployed by companies to assist users with their queries.

  2. (2)

    Knowledge-based chatbot: It is based on the knowledge available in the data source, in either an open or a closed domain. Open-domain chatbots (Allen AI Science [25], Quiz Bowl [16]) are based on general topics and hence respond to general queries only. Closed-domain chatbots (CNN/Daily Mail [37], MCTest [67] and bAbI [83]) are based on a specific knowledge base containing all the data needed to respond to queries.

  3. (3)

    Service-based chatbot: It is designed to provide personal or commercial facilities to the customer, for example providing a copy of a dispatched document or taking a food order.

  4. (4)

    Response generated-based chatbot: The response models take input and produce output in natural language. A dialogue manager combines all the response models. To generate the response, the dialogue manager uses three steps. In the first step, a set of candidate responses is generated using all the response models. In the second step, a response is returned based on priority. In the third step, if there is no priority response, the response is chosen according to the selection policy of the model. The response generation models include template-based models, generative models, retrieval-based models and search engine models.
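The three-step procedure of the dialogue manager can be sketched in Python as follows. This is only an illustrative sketch: the response models, the `priority` flag, the `score` field and the "highest score wins" selection policy are assumptions made for the example and are not specified in [57].

```python
from typing import Callable, List, NamedTuple, Optional


class Candidate(NamedTuple):
    text: str
    priority: bool  # assumed flag: the model marks its reply as a priority response
    score: float    # assumed confidence value used by the selection policy


ResponseModel = Callable[[str], Optional[Candidate]]


def template_model(user_input: str) -> Optional[Candidate]:
    """Hypothetical template-based model that only handles greetings."""
    if "hello" in user_input.lower():
        return Candidate("Hello! How can I help you?", True, 1.0)
    return None


def retrieval_model(user_input: str) -> Optional[Candidate]:
    """Hypothetical retrieval-based model with a fixed confidence."""
    return Candidate("Here is what I found about: " + user_input, False, 0.6)


def fallback_model(user_input: str) -> Optional[Candidate]:
    """Hypothetical low-confidence fallback."""
    return Candidate("Could you rephrase that?", False, 0.2)


def dialogue_manager(user_input: str, models: List[ResponseModel]) -> str:
    # Step 1: collect candidate responses from all response models.
    candidates = [c for c in (m(user_input) for m in models) if c is not None]
    if not candidates:
        return "I have no response."
    # Step 2: return a priority response if one exists.
    for candidate in candidates:
        if candidate.priority:
            return candidate.text
    # Step 3: otherwise apply the selection policy (here: highest score).
    return max(candidates, key=lambda c: c.score).text
```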

Chatbots are not limited to a specific area, organisation or industry. They are used in fields such as insurance, retail, entertainment, telecom, travel, energy and banking to provide different kinds of support and services to customers. The chatbot case studies from Artificial Solutions [22] cover a range of industries. The lessons on how companies have leveraged AI-powered chatbots to transform their industry are summarised as follows:

  • Automotive chatbot: It increases customer engagement, improves the conversion rate and provides highly qualified leads to dealerships. It also guides users in choosing a vehicle to suit their needs and budget, and helps to configure a car and schedule a test drive. Connected vehicle features enable users to interact with services such as charging and roadside assistance.

  • Banking, financial services & insurance chatbot: It simplifies operations and assists customers in performing various financial operations, managing policy renewals and reporting lost cards. It also ensures customer support and provides quick responses to existing customers’ queries in real time. This chatbot supports back-office operations for employees, such as document management and agreement review, including monitoring of training data for new staff.

  • Retail & ecommerce chatbot: It improves the customer experience, personalizes marketing communication and enhances the shopping journey. It provides pre-purchase information, shopping advice, product navigation across categories and shipping updates, and helps to place orders interactively. The chatbot collects day-to-day conversations to identify customers’ needs and behaviour and analyses them to target product offers and marketing messages so as to increase sales.

  • Telecom chatbot: It resolves technical issues, increases sales and acquisition and improves workforce productivity. Customers can resolve their technical issues and get upgrade deals, plans and services based on their usage history. It helps potential customers find the right product for their needs and takes over repetitive and time-consuming activities.

  • Energy & utilities chatbot: It streamlines customer support, improves customer retention and manages field operations. This chatbot delivers technical advice, resolves billing queries, suggests good tariffs, retains customers by providing proactive plan information, suggests ways of reducing consumption and books appointments with technical staff to simplify services and field operations.

  • Media & entertainment chatbot: This chatbot application transforms the gaming experience with unique targeted content and boosts conversion. It uses virtual reality to transform gaming experiences such as online casino gambling, where the user communicates with a bot player. Based on past behaviour and preferences, bots suggest novel and related content that users may like.

  • Smart homes & IoT chatbot: Chatbots make it possible to control home appliances such as a smart fridge, TV or lights by voice. These chatbots may also enhance the smart driving experience, from locking/unlocking the car and setting the temperature to selecting a route based on congestion or distance.

  • Travel & hospitality chatbot: These chatbots make recommendations by asking questions about preferences, budget restrictions and location. Based on data collected during the conversation, they offer upgrades or add-on packages and increase engagement and brand loyalty. Chatbot users can also submit complaints or requests for various services.

This paper is organized in nine sections. Section "AI-chatbot: architectural components" explains the architectural components of AI-Chatbots; Section "AI-chatbot: response generation models" covers response generation models; Section "AI chatbots: techniques and technologies" discusses techniques and technologies; the evolution of conversational agents, including state-of-the-art chatbots, is discussed in Section "An evolution of conversational agents: state-of-the-art chatbots"; Sections "AI chatbots: customer based services" and "AI-Chatbots: public administration based services" elaborate on customer and public administration based services; Section "Chatbot performance evaluation" contains future direction; and the paper is concluded in Section "Future direction".

2 AI-chatbot: architectural components

A chatbot is application software that provides various services to users, using text- and voice-based natural language as the primary means of taking input and producing output. Understanding the architecture of AI-Chatbots requires understanding their components and models. The components of an AI-Chatbot and the process flow between the communicating components, as presented by Eric Gregori in 2017 [34], are shown in Fig. 1. Accordingly, the chatbot architecture exhibits the following four basic components.

  1. (1)

    Front end: It is used to communicate with the user. It contains the following sub-components to receive the input and generate the response:

    1. (a)

      User interface: It communicates with the user, receives the input and delivers the response.

    2. (b)

      NLP: It breaks the input into entities, intents and actions using general AI techniques such as neural networks, for processes like sequence matching, learning and prediction over natural language. Natural language understanding (NLU) is a sub-part of natural language processing that produces a semantic representation of the user’s statement. Parsing is the main task of NLU: it takes the string of words and generates the linguistic structure of the statement. It uses context-free grammars, pattern matching or data-driven techniques [31].

    3. (c)

      NLG: Natural language generation (NLG) uses templates and text stored in the knowledge base to generate the response. NLG performs content planning and language generation. Content planning is concerned with determining the semantic and pragmatic content, the communicative action and its subject to be conveyed to the user. Language generation is concerned with expressing that meaning through a syntactic structure and the words needed.

    4. (d)

      Dialog manager (DM): The dialogue manager is a meta-component that manages the conversation between the chatbot and the user. It creates responses from user input and knowledge base data. The dialogue manager mainly generates the semantic representation through communicative actions [32].

    5. (e)

      Conversational chatbot AI (CCAI): It is used for common-sense reasoning, making assumptions using the knowledge corpus. It basically manages the conversation. The response of the CCAI depends on the knowledge base output: if it is text, it is forwarded directly to the user; if it is a request for more data, the CCAI and the dialogue manager generate the response with the help of the text generator.

  2. (2)

    Knowledge-base: It is created by the back-end and represents the knowledge of the chatbot in a format that is needed by the front-end. The classification and tagging of the domain corpus is done in the knowledge base. The data in the knowledge base is formatted specifically for the chatbot so that a response can be generated quickly. The input to the knowledge base is entities, intents and actions, and the output is either a text response or a request for additional data.

  3. (3)

    Back-end: It is responsible for creating the knowledge base and consuming the domain corpus. It can be further divided into the following sub-components.

    1. (a)

      Knowledge AI agent (AIA): The AIA applies NLP to the domain corpus, available as PDF documents and web pages, and classifies its text into entities, intents and actions.

    2. (b)

      NLP: It is used to process the corpus.

  4. (4)

    Corpus: The corpus is defined by the domain and hence represents the domain. It is classified and tagged in the knowledge base. A domain corpus can be structured or unstructured, and can be text in a human-understandable format or compressed binary data. Structured data has a specific format. Most human-understandable text has some defined structure in the form of sentences and paragraphs. Many documents also contain additional structure such as sections, tags etc. XML and similar files are examples of tightly structured documents with defined tags.

Fig. 1 Components of chatbot [34]
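The interaction between the knowledge base and the CCAI described above can be sketched as follows. This is a minimal illustration under stated assumptions: the dictionary-based knowledge base, its entries, and the function names `query_knowledge_base` and `ccai_respond` are hypothetical and do not come from [34].

```python
from typing import Dict, Optional, Tuple

# Hypothetical knowledge base: (intent, entity) -> text response.
KNOWLEDGE_BASE: Dict[Tuple[str, str], str] = {
    ("ask_price", "sedan"): "The sedan starts at 20,000 USD.",
    ("ask_price", "suv"): "The SUV starts at 30,000 USD.",
}


def query_knowledge_base(intent: str, entity: Optional[str]) -> Tuple[str, str]:
    """Return ("text", answer) or ("need_more_data", follow-up question)."""
    if entity is None or (intent, entity) not in KNOWLEDGE_BASE:
        return ("need_more_data", "Which model are you interested in?")
    return ("text", KNOWLEDGE_BASE[(intent, entity)])


def ccai_respond(intent: str, entity: Optional[str]) -> str:
    kind, payload = query_knowledge_base(intent, entity)
    if kind == "text":
        # Text output is forwarded directly to the user.
        return payload
    # A request for more data is turned into a follow-up question; in a full
    # system the dialogue manager and text generator would produce this.
    return payload
```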

3 AI-chatbot: response generation models

AI-Chatbot models can be classified based on response generation, conversation length or knowledge base [18]. In terms of response generation, a chatbot can respond in two ways: either by generating the response from scratch using ML, or by selecting a response from a library of available responses using some heuristic. These are termed generative models and retrieval-based models, respectively [14, 73].

The retrieval-based architectural model is reliable and easy to build, owing to the availability of algorithms and APIs to developers. It uses the user’s message and the context of the conversation to output the most suitable response from a predefined library of responses. Traditionally, these models existed in the form of rule-based answering systems with a repository of question-response mappings. The retrieval model proposed by Young et al. [89] generates multiple candidate responses based on the stored context of the conversation. Each candidate is assigned a score, and the one with the highest score is output. Combining the retrieval-based approach with deep learning results in more appropriate responses. In 2016, Yan et al. [88] used deep learning to analyse two sentences, which brings more conversational context into response generation. Such approaches offer better accuracy and control. Figure 2 shows the architectural model of retrieval-based systems. Mitsuku is a retrieval-based chatbot that contains over 300,000 predefined response patterns and a knowledge base of over 3000 objects. This chatbot can construct songs and poems based on its knowledge base.

Fig. 2 Retrieval based architectural model [73]
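The "score every candidate, return the best" idea behind retrieval-based models can be sketched in a few lines of Python. The response library below is hypothetical, and word-overlap (Jaccard) similarity is used only as a simple stand-in for the scoring function; it is not the scoring method of Young et al. [89] or Yan et al. [88].

```python
from typing import List, Tuple

# Hypothetical library of (stored question, predefined response) pairs.
RESPONSE_LIBRARY: List[Tuple[str, str]] = [
    ("how do i track my order", "You can track your order from the 'My orders' page."),
    ("what is your return policy", "Items can be returned within 30 days."),
    ("how can i contact support", "You can reach support through the help page."),
]


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity in [0, 1] between two sentences."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def retrieve_response(message: str) -> Tuple[str, float]:
    """Score every stored question against the message and return the best response."""
    scored = [(jaccard(message, question), response)
              for question, response in RESPONSE_LIBRARY]
    best_score, best_response = max(scored, key=lambda pair: pair[0])
    return best_response, best_score
```

Replacing `jaccard` with a learned similarity model is one way such a system could be combined with deep learning while keeping the response library fixed.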

The generative architectural model makes chatbots smarter and more advanced. Owing to their complex algorithms and implementation, these chatbots are rarely used. Millions of training samples are required to train such chatbots, which consumes a huge amount of effort and time. As opposed to the retrieval-based model, the generation-based neural network model does not depend on fixed responses; instead, it generates the response from scratch [18]. The response depends on the training data set and the ML algorithms.

Sequence-to-sequence models [75] are well suited to generation-based neural networks. Young et al. [89] proposed a chatbot that uses a generative model to ask the user follow-up questions. For the questions, they designed templates such as “what about” and “what do you think about”; the model takes the user input and context and generates the remaining sequence. Figure 3 represents generative model-based chatbot systems. Microsoft Tay is an example of a generative chatbot. Tay is a conversational AI-Chatbot that interacted with Twitter users and learnt from each tweet. It generated new content on its own, depending on what was tweeted to it.

Fig. 3 Generative architectural model [73]

4 AI chatbots: techniques and technologies

An AI-based chatbot is normally designed to respond in natural language. Chatbots help people with their work and interactions and provide an interface to computers; however, they cannot imitate human conversation perfectly and cannot replace the human role and human competency [70]. With increasingly efficient learning algorithms, chatbots are learning from human interaction, emotions and behaviours, which is raising their future relevance. ML technologies such as NLP, NLU and NLG, and deep learning technologies such as ANN and RNN, can be used to analyse text or speech and generate intelligent responses in chatbots interacting with humans. The robust techniques and technologies for AI-Chatbots found in the literature are as follows:

4.1 Deep learning

Deep learning is a multi-layer neural network architecture and a subset of the ML approach. The following two deep learning-based approaches can be used in the development of AI-based chatbots:

  1. (a)

    Artificial neural networks: An Artificial Neural Network computes the desired output from a given input by repeatedly adjusting its weights in a loop over the training dataset. Each training step corrects the weights so that the output moves closer to the desired output. When the algorithm is executed, each phrase is divided into words, and those words are used as the input to the algorithm. The weights are then recalculated thousands of times in a loop over the training dataset. The assigned weights are improved in each training cycle to make the algorithm more accurate. This gradually improves how closely the output matches the training dataset.

  2. (b)

    Recurrent neural network: A Recurrent Neural Network (RNN) is composed of an input layer, multiple hidden layers, and an output layer. The input is fed to the input layer as a vector representation. The input vector is then multiplied by weights and biases are added. The output of the input layer is passed to the first hidden layer; each consecutive hidden layer is composed of numerous RNN cells, which also carry a hidden state from one time step to the next. The cells in the hidden layer multiply the output of the input layer by their own weights and add their biases. Next, in each hidden layer cell, an activation function (sigmoid, hyperbolic tangent) is applied to generate the output of the hidden layer. The output of each hidden layer cell is then passed to the successive hidden layer, where weights, biases and an activation function are applied in the same way. This procedure propagates through all subsequent hidden layers. Finally, the output of the last hidden layer is passed to the output layer, which applies some function to generate the final output.
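The iterative weight adjustment described for artificial neural networks can be illustrated with a single sigmoid neuron trained by gradient descent. This is a toy sketch under stated assumptions: the logical OR dataset stands in for word-based inputs, and the learning rate, epoch count and function names are arbitrary choices made for the example.

```python
import math
import random
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], float]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights: List[float], bias: float, inputs: Sequence[float]) -> float:
    """Weighted sum of the inputs plus bias, passed through the activation."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_neuron(samples: List[Sample], epochs: int = 2000,
                 lr: float = 0.5, seed: int = 0) -> Tuple[List[float], float]:
    rng = random.Random(seed)
    weights = [rng.uniform(-0.5, 0.5) for _ in samples[0][0]]
    bias = 0.0
    for _ in range(epochs):               # the weights are refined in every cycle
        for inputs, target in samples:
            error = predict(weights, bias, inputs) - target
            # Gradient step for a sigmoid output with cross-entropy loss.
            weights = [w - lr * error * x for w, x in zip(weights, inputs)]
            bias -= lr * error
    return weights, bias


# Toy training set (logical OR), used here only as a stand-in for real data.
OR_DATA: List[Sample] = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train_neuron(OR_DATA)
```

After training, `predict(weights, bias, inputs)` lies above 0.5 for the positive samples and below 0.5 for the negative one, which is the sense in which repeated weight updates move the output towards the desired output.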

4.2 Natural Language Processing (NLP)

NLP is the core component and the basis for the development of AI-driven chatbots. NLP algorithms receive the input, interpret it to determine its meaning and, based on that, decide on a series of suitable and necessary actions. In a broad sense, NLP is used for speech recognition and for understanding text-based inputs. In practice, NLP is used for many applications such as tokenization, parsing, information extraction, speech recognition, similarity, speech generation, text summarisation, sentiment analysis, topic extraction, part-of-speech tagging, relationship extraction, text mining, machine translation, automated question answering etc.

NLP includes two main components: NLU and NLG. Natural languages have a variety of forms and structures, and hence NLU is more difficult than NLG. NLU includes mapping the input, analysing the language from various aspects and converting unstructured data into structured data that the system can understand. NLP uses five steps to understand a message in a chatbot.

  1. (a)

    Lexical Analysis: Identifying and analysing the words by dividing the text into chapters, sentences, phrases and words.

  2. (b)

    Syntactic Analysis (Parsing): Understanding the grammar of the text.

  3. (c)

    Semantic Analysis: Understanding the literal meaning of the text.

  4. (d)

    Discourse Integration: Final interpretation of the message based on the overall context.

  5. (e)

    Pragmatic Analysis: Understanding the objective of the text.
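The first of these steps, lexical analysis, can be sketched with the Python standard library. The regular expressions below are simplifying assumptions (sentences end at ".", "!" or "?", and words consist of letters, digits and apostrophes); practical tokenizers handle many more cases.

```python
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Split text into sentences at '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def tokenize(sentence: str) -> List[str]:
    """Lower-case a sentence and extract its word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def lexical_analysis(text: str) -> List[List[str]]:
    """Divide text into sentences, and each sentence into words."""
    return [tokenize(sentence) for sentence in split_sentences(text)]
```

The resulting lists of tokens are the input on which the later steps (parsing, semantic analysis and so on) would operate.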

NLG is used to provide a meaningful response by creating linguistically correct phrases and sentences based on text planning and realisation. NLP involves some major challenges, including the complexity of human language, ambiguous language structure, lexis, metaphors and similes. Deciding whether a word is a noun or a verb, sentences that can be parsed in various ways, and inputs with several meanings are some general challenges of NLP. Figure 4 represents the relationship between NLP, NLU and NLG.

Fig. 4 Relation between NLP, NLU and NLG

4.3  Natural Language Understanding (NLU)

NLU is responsible for understanding the meaning of a text in a human-like way. It is based on the following three concepts:

  (a) Entities: An entity is essentially a concept in a chatbot. For instance, in an e-commerce chatbot an entity may represent a payment system.

  (b) Intents: An intent is the action the chatbot should perform when the user types something. For example, the same intent is triggered if the user types "hi", "hey" or "hello", and the chatbot replies "Welcome, how may I help you?". Similarly, the inputs “I want to order a black pair of balls”, “Do you have black balls? I want to order them” and “Show me some black pair of balls” all trigger the same intent: showing the user the available options for black balls.

  (c) Context: Contexts represent the current status of a user's request and allow the agent to carry information from one intent to another. When an NLU algorithm analyses a sentence in isolation, it has no access to the user's conversation history, so it does not remember earlier questions. To keep track of context in a chat, it can be preserved in local storage. Simple intents such as "hi" or "bye" are detected from keywords alone, without knowing what the earlier question was.

NLU is an integral part of NLP. Beyond understanding individual words, NLU interprets meaning even in the presence of common human errors such as transposed words and letters or mispronunciation.
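As a minimal illustration of how entities, intents and context fit together, the following Python sketch detects an intent by keyword overlap, extracts entities from a small vocabulary and carries them forward in a context dictionary. The intent names, keyword sets and entity vocabulary are assumptions made for illustration; production NLU platforms use trained models rather than keyword lookup.

```python
import re

# Hypothetical intent definitions: each intent lists trigger keywords.
INTENTS = {
    "greet": {"hi", "hey", "hello"},
    "order_product": {"order", "show", "have"},
}

# Hypothetical entity vocabulary for a small e-commerce bot.
ENTITIES = {
    "colour": {"black", "white", "red"},
}


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def detect_intent(tokens):
    """Return the intent whose keyword set overlaps the tokens most, or None."""
    best, best_hits = None, 0
    for name, keywords in INTENTS.items():
        hits = len(keywords & set(tokens))
        if hits > best_hits:
            best, best_hits = name, hits
    return best


def extract_entities(tokens):
    """Collect entity values that appear among the tokens."""
    found = {}
    for entity, values in ENTITIES.items():
        for token in tokens:
            if token in values:
                found[entity] = token
    return found


def understand(text, context):
    """Parse one utterance and update the conversation context in place."""
    tokens = tokenize(text)
    intent = detect_intent(tokens)
    entities = extract_entities(tokens)
    # Carry entities from this turn forward, as a context would.
    context.update(entities)
    context["last_intent"] = intent
    return intent, entities
```

With an initially empty context, the three ordering phrases quoted above all map to the hypothetical `order_product` intent, and the colour "black" remains available in the context for later turns.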

4.4 Natural Language Generation (NLG)

Natural language generation is the reverse of NLU: it generates text from meaning. It is used for text summarisation, dialogue systems and machine translation. NLG maps the system's semantic-frame response to a natural language sentence that the user can understand. NLG may follow a rule-based, model-based or hybrid approach. Rule-based NLG outputs predefined sentence templates for a given semantic frame and therefore has no generalization capability. ML-based NLG draws on input sources such as a content plan, knowledge base or structured database to return domain-specific entities, the dialogue history, referring expressions and a user model. Trainable NLG systems may produce several candidate utterances and rank them using a statistical model, for example a bigram or trigram language model.
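The rule-based case can be sketched in a few lines of Python: a semantic frame selects a predefined template, and the slots fill it. The dialogue acts, slot names and fallback message below are hypothetical and serve only to show why such a generator cannot go beyond its templates.

```python
# Hypothetical templates keyed by dialogue act; slot names are illustrative.
TEMPLATES = {
    "inform_price": "The {item} costs {price} {currency}.",
    "confirm_order": "Your order for {quantity} {item} has been placed.",
}


def realise(frame):
    """Turn a semantic frame (dict with an 'act' and 'slots') into a sentence."""
    template = TEMPLATES.get(frame["act"])
    if template is None:
        # No template for this act: the rule-based approach cannot generalise.
        return "Sorry, I cannot answer that."
    return template.format(**frame["slots"])
```

For example, the frame `{"act": "inform_price", "slots": {"item": "notebook", "price": 3, "currency": "USD"}}` is realised as "The notebook costs 3 USD.", while any act without a template falls through to the fallback message.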

5 An evolution of conversational agents: state-of-the-art chatbots

Today chatbots are proving to be a fundamental tool for improving customer engagement and retention. Amazon's Alexa and IBM Watson have gained worldwide adoption through advances in technology (ML, deep learning, AI), architecture and “advanced information retrieval” processes. Apple's Siri and Amazon's Alexa are now also available on desktops, and Google Assistant has acquired a broad set of conversational capabilities for interacting with humans. The evolution of these concepts and of state-of-the-art chatbots, shown in Fig. 5, is described below.

Fig. 5 Evolution of chatbots

5.1 Turing test [78]

The idea behind chatbots goes back to 1950, when Alan Turing proposed the Turing Test. In the test, an “interrogator” puts questions to a person and a machine and tries to tell them apart. If the person and the machine are indistinguishable, the machine can be said to think.

5.2 ELIZA [43, 82]

ELIZA was the first chatbot, designed in 1966 by Joseph Weizenbaum at the MIT Artificial Intelligence Laboratory. It was based simply on parsing and keyword substitution. ELIZA identifies keywords and matches the input against a set of pre-programmed rules to generate responses. It is open source and uses text as input and output. ELIZA got its name from Eliza Doolittle, a working-class character in George Bernard Shaw's play Pygmalion, who is taught to speak with an upper-class accent. ELIZA was designed to mimic human interaction through pattern recognition, but it could not respond to queries in their full context. Its built-in scripts created the illusion of intelligence when answering questions on a given subject, such as a psychological evaluation.
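The keyword-and-substitution idea can be illustrated with a short Python sketch. The three rules and the pronoun table below are invented for illustration and are not Weizenbaum's original script; they only show how a matched fragment is reflected back to the user without any understanding of context.

```python
import re

# Pronoun swaps applied to the captured fragment ("key substitution").
REFLECTIONS = {"i": "you", "my": "your", "me": "you", "am": "are"}

# Illustrative rules only; these are not Weizenbaum's original script.
RULES = [
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (.+)", re.IGNORECASE), "Tell me more about your {0}."),
]

DEFAULT_REPLY = "Please go on."


def reflect(fragment):
    """Swap first-person words for second-person ones in the fragment."""
    words = fragment.lower().rstrip(".!?").split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(text):
    """Return the reply of the first rule whose keyword pattern matches."""
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(reflect(match.group(1)))
    return DEFAULT_REPLY
```

Here `respond("I am tired.")` yields "How long have you been tired?", and an input with no matching keyword receives the default reply. This shows why such a system appears attentive without tracking the conversation.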

5.3 PARRY [35]

In 1972, the psychiatrist and computer scientist Kenneth Colby created PARRY at Stanford's Psychiatry Department. It was a much more serious attempt at creating an artificial intelligence and became the first machine to pass a version of the Turing Test.

5.4 Racter [35]

In 1988, another interesting chatbot, Racter (short for raconteur, a storyteller), was designed by William Chamberlain et al. under the Inrac Corporation. Through the 1980s and 1990s the technology was deployed in automated telephone systems that used decision trees, through the Microsoft Network (MSN) and America Online (AOL) service providers.

5.5 JABBERWACKY [38]

In 1988 Rollo Carpenter developed Jabberwacky, a chatbot that simulates natural human conversation in an entertaining way. It uses a “contextual pattern matching” technique and has been available for research purposes through its webpage from the beginning.

5.6 Loebner prize competition [17, 48, 77]

The Loebner Prize was launched in 1990 by Hugh Loebner. It is an annual competition in which chatbots are tested on the basis of the Turing Test. ALICE [81] has won the prize three times and Mitsuku [86] five times, including in 2019.

5.7 DR. SBAITSO [38]

In 1992 the Dr. Sbaitso chatbot was developed by Creative Labs for MS-DOS. It was one of the earliest attempts to incorporate AI into a chatbot. It was completely voice based and communicated with the user like a psychologist.

5.8 ALICE [81]

In 1995, Richard Wallace designed the Artificial Linguistic Internet Computer Entity (ALICE), a complex bot based on pattern recognition that matches inputs against pattern-template pairs kept in a knowledge base. The knowledge is written in Artificial Intelligence Markup Language (AIML), which is based on eXtensible Markup Language (XML). ALICE was the first AIML-based program to receive the Loebner Prize, winning in 2000, 2001 and 2004 as “the most human computer” at the annual Turing Test contests. It was inspired by ELIZA [54]. It has two parts, a chatbot engine and a language model. The language model is stored in AIML files, which consist of patterns and templates: a pattern is matched against the user query, and the template provides the response. Each AIML category consists of an input pattern, a response template and an optional context. The context is a condition that gives meaning to the sentence. The input is pre-processed and matched against decision-tree nodes; once a match is found, the chatbot generates a response or executes an action. The major limitations of this chatbot are difficulty in generating appropriate responses, inability to reason and inefficiency in generating human-like responses.

5.9 SmarterChild [38]

SmarterChild was developed by ActiveBuddy, Inc. in 2001. Using quick data access to other services, it carries on amusing conversations. The chatbot was made available over MSN Messenger and AOL Instant Messenger (AIM). Microsoft later developed its own SmarterChild-style bot.

5.10 MITSUKU [86]

The Mitsuku bot was developed by Steve Worswick in 2005. It is a human-like chatbot designed for general conversation and is based on rules written in AIML. It is integrated as a personality layer with bot channels such as Twitter and Telegram. It is hosted on Pandorabots and uses NLP and heuristic patterns. When Mitsuku cannot find a proper match for an input, it automatically falls back to a default category. It can hold long conversations, remembers personal details of the user and learns from user conversations. One of its important features is the ability to reason about specified objects. For example, if the user asks, “Can you eat a home?”, Mitsuku first retrieves the properties of “home”, finds that the value of “made from” is “brick”, and then replies “No” because brick is not edible. Mitsuku supports multiple languages and is based on supervised ML. To learn new things, data is first verified by a human manager, and only verified data is incorporated and used. Mitsuku needs a huge amount of training data to perform effectively and does not provide a dialogue management component.
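The object-based reasoning in the “home” example can be sketched as a property lookup followed by a simple rule. The object table and the set of edible materials below are hypothetical stand-ins; Mitsuku's actual knowledge base and AIML rules are not reproduced here.

```python
# Hypothetical object knowledge base; Mitsuku's real data is not reproduced here.
OBJECTS = {
    "home": {"made_from": "brick"},
    "apple": {"made_from": "fruit"},
}

# Hypothetical set of edible materials.
EDIBLE_MATERIALS = {"fruit", "bread", "vegetable"}


def can_eat(thing):
    """Answer 'can you eat a <thing>?' from the object's 'made_from' value."""
    properties = OBJECTS.get(thing)
    if properties is None:
        return "I do not know what that is."
    material = properties["made_from"]
    if material in EDIBLE_MATERIALS:
        return f"Yes, {material} is edible."
    return f"No, {material} is not edible."
```

Calling `can_eat("home")` returns "No, brick is not edible.", which mirrors the two-step inference described above: first retrieve the “made from” value, then test it against what is known to be edible.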

5.11 Watson [56]

Watson is IBM's rule-based chatbot, developed in 2006 under IBM's DeepQA project [14]. It uses NLP and a hierarchical ML method and is designed for information retrieval and question answering. Watson uses various techniques to identify and assign feature values such as names, geographical locations or other entities for response generation. It uses ML to learn how to combine the feature values into a final score for each candidate response. This score is used to rank the possible answers and choose the best one. Apache Hadoop and the UIMA (Unstructured Information Management Architecture) framework are used to examine the phrase structure and grammar of the query in order to interpret the input. The major disadvantages of Watson are that it has no relational database, does not process structured data directly, has a high maintenance cost, takes considerable effort and time to learn, and is targeted at large organisations.

5.12 SIRI [59]

Siri was developed by Apple as an intelligent personal assistant and knowledge navigator for iOS. Its release opened up significant opportunities to exploit natural-language voice-interface technology in new and profitable ways. Siri is a virtual assistant available specifically on Apple products and has access to Apple applications including Mail, Contacts, Messages, Maps and Safari. Siri can read users' email, text contacts, change the music playing, make calls, find restaurants, find books, set alarms and give directions. Siri's gender, accent and language are configurable.

5.13 Google now/assistant [38, 84]

Google Now was developed by Google in 2012. It answers queries, makes recommendations and performs various actions using a set of web services. Initially, Google Now was used to retrieve contextually appropriate information depending on time and location, which is why it is sometimes referred to as predictive search. It was developed for smartphones and was later upgraded with several features. In 2017 Google Now was replaced by Google Assistant. Google Assistant is an extension of the basic “OK Google” functionality that allows users to search and control their mobile devices through voice commands. Assistant is built into Google phones and the Android OS and will be integrated into some cars. Today Google Assistant is part of Google's search growth strategy.

5.14 ALEXA [58]

Alexa, also known as Amazon Alexa, was developed in 2015 by Amazon Lab126 and was initially used in Amazon Echo smart speakers. It can access the weather, connect to radio and television stations, and has partnerships with a number of services, including JustEat, Uber, FitBit, The Telegraph, Spotify and Nest. These services can be accessed through the Alexa interface [23].

5.15 DIALOGFLOW [27]

Dialogflow was developed by Google in 2016 and provides an NLU platform for conversations in mobile apps, devices, bots, web applications and so on. It can analyse multiple types of user input, including text and audio, and can respond in either text or speech. Dialogflow, formerly known as Api.ai, runs on the Google Cloud Platform. It uses ML and NLP techniques for user interaction in text or voice mode. It first recognises the intent and context of the user input, then matches it with specific intents and extracts the relevant data using entities. It then provides the response through a conversational interface. Its limitations are that it does not include interactive user interfaces and documentation, and that it has no hand-held device versions.

5.16 LUIS [46]

LUIS is a domain-specific, cloud-based API service developed by Microsoft in 2017. It applies ML to natural text conversation to predict the overall meaning and to extract detailed, relevant information. It uses intent and predefined domain entity models to process information and natural language. To find the intents of a sentence, LUIS performs NLP against big data. It is designed to identify valuable information in conversations, interpret user goals (intents) and extract information (entities). A model starts with common user intentions such as booking a train or contacting the helpdesk. After the intentions are identified, the developer supplies example phrases, called utterances, for each intent. The utterances are then labelled with the specific details that LUIS should pull out of them. Once the model is designed and trained, it can receive and process utterances. Utterances are received as HTTP requests, and the responses contain the extracted user intentions. The limitation of LUIS is that it requires an Azure subscription.

5.17 Amazon Lex [8]

Amazon Lex is an AWS service developed by Amazon in 2017 for building conversational application interfaces in text or voice mode. To provide lifelike conversation it uses deep learning, NLP and automatic speech recognition (ASR). It integrates with AWS Lambda, which makes it easy to execute back-end business logic for retrieving and updating data. Its limitations are that it is not multilingual and supports only English, that web integration is difficult, and that dataset preparation is complicated. Mapping utterances and entities is also difficult. Table 1 summarises the various state-of-the-art chatbots.

Table 1 Properties of various state-of-the-art chatbots

6 AI chatbots: customer based services

Chatbots have attracted huge attention with the development of artificial intelligence, the Internet and social networking sites. Customer-service chatbots are used to communicate with customers in online shops, customer service, marketing, advertising, the entertainment industry, data collection and many related areas. These chatbots are summarized in Table 2 and critically reviewed as follows:

Table 2 Customer service based Chatbot approaches with various attributes

In 2016, Ly Pichponreay et al. [65] converted documents into chatbot knowledge. In the first step, the Apache PDFBox OCR library is used to extract text from digital photos or portable document format (PDF) files. After text extraction, the paper proposes generating questions using different techniques and finally uses Artificial Intelligence Markup Language (AIML) and pattern matching to answer the given queries. In 2017, Cyril Joe Baby et al. [11] proposed a chatbot system for home automation using the Internet of Things (IoT). The Natural Language Toolkit (NLTK) is used for language processing tasks in this work. The core idea of the approach is to detect keywords in the text that users enter in the chatbot and then pass these keywords to the microcontroller that controls the home appliances over the IoT.

In 2017, Gregori [34] reviewed a number of modern chatbot platforms, NLP tools, and their applications and design. Gregori's aim was to develop a chatbot for the Online Master of Science in Computer Science (OMSCS) program at the Georgia Institute of Technology. The paper discusses different uses of chatbots, including customer service and education. In education, Gregori mentions examples such as ANTswers, a librarian chatbot at the University of California based on the A.L.I.C.E. open-source framework, and AdmitHub's “Pounce”, a custom virtual assistant for Georgia State University (GSU) admissions with a knowledge base of about a thousand FAQs. The Pounce chatbot achieved great results and proved very successful in handling student queries. Abdul-Kader et al. [1] proposed a novel chatbot with no specific knowledge source. It searches the World Wide Web for responses to queries and uses ChatterBot and an NLP toolkit to perform NLP operations. Responses are found by text matching and stored in a structured database. Divya Madhu et al. [49] proposed a chatbot for the medical domain. It takes symptoms from users, predicts possible diseases and suggests medicine and dosage. Ming-Hsiang Su et al. [74] proposed a chatbot for elders. For many reasons it is often not possible to spend enough time in conversation with elders, so the chatbot engages them in small talk and conversation. Its MHMC chitchat database consists of 2239 messages and corresponding responses collected from daily life. It uses long short-term memory (LSTM) to maintain state information. Chaitrali S. Kulkarni et al. [45] proposed a chatbot for the banking sector. People are generally very cautious about money-related issues and services. Users may have banking queries at any time of day, but a customer service representative cannot always be available. Hence, the paper proposed a chatbot that responds to customers' queries 24 × 7. The knowledge base consists of FAQs from different banking platforms. It uses NLTK for natural language processing and the bag-of-words method for converting words into vectors. Finally, questions are mapped to answers using cosine similarity. The reported accuracy of the approach is 87%.
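The bag-of-words and cosine-similarity mapping used for FAQ retrieval can be sketched as follows. This is a minimal standard-library version under stated assumptions: the three FAQ entries are invented for illustration, and the tokenizer is a plain regular expression rather than the NLTK pipeline used in [45].

```python
import math
import re
from collections import Counter

# Hypothetical FAQ entries; the knowledge base in [45] is not reproduced here.
FAQ = {
    "How do I open a savings account?": "Visit a branch with a valid ID.",
    "How can I block my debit card?": "Call the helpline to block the card.",
    "What is the interest rate on a home loan?": "Rates are listed on the loans page.",
}


def bag_of_words(text):
    """Map a sentence to a word-count vector (a Counter)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def answer(query):
    """Return the answer of the FAQ question most similar to the query."""
    query_vec = bag_of_words(query)
    best = max(FAQ, key=lambda q: cosine_similarity(query_vec, bag_of_words(q)))
    return FAQ[best]
```

A query such as "I lost my debit card, please block it" shares the words "i", "block", "my", "debit" and "card" with the second FAQ question and is therefore mapped to its answer. A real system would also apply a similarity threshold so that unrelated queries are not forced onto the nearest FAQ entry.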

In 2018, Che-Hao Lee et al. [62] designed a companion robot for kids, which requires good question-answer pairs. For the training and testing datasets, articles from http://www.yfes.tn.edu.tw/yfesvi/story.htm were used. Questions were generated from 90 students' stories, each containing on average 10 sentences, giving a total of 885 sentences. The question generation model produced 2693 questions. The questions were then manually labelled as acceptable or not acceptable for supervised learning. Logistic regression is used as the classifier in a question ranking model that ranks the question-answer pairs. The results show a significant level of question acceptance by their model. The question generation system is shown in Fig. 6.

Fig. 6 Question generation system [62]

In 2018, Bhavika R. Ranoliya et al. [66] designed a novel chatbot that answers queries based on an FAQ dataset using AIML and Latent Semantic Analysis (LSA). It can be used to respond to FAQs in an interactive way. Kumar Shivam et al. [72] developed a college chatbot system. Their motivation was that it is tiring for enquiry staff to sit all day answering students' questions, and that answering them is complicated. They use the Facebook Messenger interface to perform the NLP. In 2018, Takuma Okuda and Sanae Shoda [61] proposed a chatbot framework for industrial finance services. Minjee Chung et al. [24] found that luxury fashion retail brands can provide personalized care through e-services, chiefly chatbots, rather than traditional face-to-face interactions. Customers using both online and offline services found the online service to be effective, accessible, and both time- and cost-saving. The authors used customer questionnaire data to test the chatbot on five dimensions of customer perception: interaction, entertainment, trendiness, customization and problem-solving. To understand what luxury brands need in order to communicate with customers, they assumed that the marketing efforts of e-service agents are associated with communication quality, which requires accuracy, credibility and competence. They drew 29 items from previous studies to measure marketing efforts along the five dimensions.

In 2019 Arsovski et al. [10] pointed out that knowledge acquisition is one of the most important tasks in building a conversation model. It is a difficult process that requires a significant amount of effort and time. With this in mind, they proposed a new approach that automatically extracts knowledge from conversations with an existing chatbot. The extracted knowledge is used as a training dataset for a neural network conversational agent. Their approach shares and reuses automated machine-to-machine conversation. Graeme McLean et al. [50] studied the variables that directly affect the use of a live chat function. They identified eight variables that motivate users to initiate a live chat. These variables depend on the context in which the chat is initiated, namely search/navigation support or decision support. The study offers key managerial implications, highlighting the significance of offering a live chat function to customers and explaining why website users are motivated to use it. F. Patel et al. [64] developed an intelligent social therapeutic chatbot to help students relax and relieve stress. They used the ISEAR dataset for emotion detection in text, consisting of 7652 phrases and 1542 emotional words. These emotional words were broadly categorized into emotions such as happiness, joy, shame, anger, disgust, sadness, guilt and fear. For emotion detection, they used three popular deep learning classifiers: Convolutional Neural Network (CNN), Recurrent Neural Network (RNN) and Hierarchical Attention Network (HAN). Based on the negativity percentage computed from the emotion labels and conversation data, their approach assigns the student's mental state to one of five classes: normal, slightly stressed, highly stressed, slightly depressed and highly depressed.

In 2020 Sayed Ehsanullah Ahmady et al. [2] proposed a chatbot as an add-on to Telegram, a social networking application, to help foreigners trapped in disaster-affected areas. It assists them in evacuating the affected area and reaching the nearest evacuation centres or stations. The Telegram API is used to send and receive real-time disaster information from governmental agencies. The chatbot uses a weather API to provide weather information for the user's current location and Google Maps to guide the user to evacuation centres. Snehal Karia et al. [39] developed BankBot, which takes data from the URL www.banksofmumbai.in, provides accurate answers to users' queries and can guide customers in a meaningful manner during and after the COVID-19 situation. The contactless chatbot is built around cosine similarity: an article is downloaded from the specified URL, tokenized, lemmatized and vectorized using TF-IDF, and a similarity score is computed, from which the system selects the most relevant result for the user. The aim of the paper is to give banks a means of contactless communication between employees and customers, which is beneficial during a pandemic. The proposed chatbot could replace the “May I Help You?” desk found at the entrance of every bank, minimising human-to-human interaction.

In 2021, Shi Yu et al. [90] proposed the AVA (A Vanguard Assistant) chatbot, which uses deep bidirectional transformer models (BERT) [29] to handle client questions in financial investment customer service. They used advanced Bayesian deep learning to quantify uncertainties in BERT intent predictions, applying the Monte Carlo Dropout (MCD) approach [30] to approximate variational inference: dropout is performed at both training and test time, using multiple dropout masks. They also used BERT as a language model for automatic spelling correction, so that the approach finds the best corrections for misspelled words. The training data was prepared over a year by a dedicated business team from the interaction logs of phone agents and the expert team. In total, 22,630 queries were selected and classified into 381 intents, which make up the relevant-question set for the intent classification model. Additionally, 17,395 queries were manually synthesized as irrelevant questions, none of which belongs to any of the 381 intents.
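The test-time half of Monte Carlo Dropout can be illustrated with a toy model. The sketch below is not the BERT-based classifier of [90]: it assumes a tiny linear intent classifier with made-up weights, applies a fresh dropout mask to the input features on every forward pass, and summarises the resulting predictions by their mean and by the entropy of that mean as an uncertainty estimate.

```python
import math
import random

# Hypothetical weights of a tiny linear intent classifier (3 features, 2 intents).
WEIGHTS = [
    [2.0, -1.0, 0.5],   # intent 0
    [-1.5, 1.0, 0.5],   # intent 1
]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward_with_dropout(features, drop_rate, rng):
    """One stochastic pass: drop each feature with probability drop_rate."""
    keep = 1.0 - drop_rate
    # Inverted dropout: surviving features are rescaled by 1 / keep.
    masked = [x / keep if rng.random() < keep else 0.0 for x in features]
    logits = [sum(w * x for w, x in zip(row, masked)) for row in WEIGHTS]
    return softmax(logits)


def mc_dropout_predict(features, passes=100, drop_rate=0.5, seed=0):
    """Average many stochastic passes; return mean probabilities and entropy."""
    rng = random.Random(seed)
    samples = [forward_with_dropout(features, drop_rate, rng) for _ in range(passes)]
    mean = [sum(col) / passes for col in zip(*samples)]
    entropy = -sum(p * math.log(p) for p in mean if p > 0)
    return mean, entropy
```

A high entropy of the averaged prediction signals that the dropout masks disagree, which is the kind of signal that can be used to route an uncertain query to a human agent instead of answering automatically.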

7 AI-Chatbots: public administration based services

These chatbots ease communication between administrative bodies and the public, typically on the basis of FAQs. Citizens can get answers to their administrative issues quickly and reliably without involving staff, which frees staff to devote their efforts to other tasks.

The citizen happiness index increases with increasing satisfaction with the governance system. Hence, if some services that today depend on office visits and human intervention are made available through a chatbot, citizens can obtain authentic and unbiased services without constraints of time, travel and place. In practice, however, such chatbot applications are not yet commonly deployed by governing bodies.

Almost every field uses chatbot applications, but public administration lags behind. There is scope for a large number of applications for citizen-centric activities covering different segments of the population. These users need public administration services at their fingertips in their day-to-day life. In reality, especially in the context of local administrative services, governing bodies have not developed and announced authentic chatbot applications of this kind. It therefore remains challenging to provide services such as resolving queries, providing information, interpreting and answering questions, submitting applications and certifying the documents that citizens need. In public administration, most queries and procedures are common and have similar responses, so a chatbot can reduce problems at both ends: public service providers do not have to spend time on common queries, and users do not have to wait or visit offices unnecessarily. The uses of a chatbot in public administrative services may include counselling, complaints, call inquiries, suggestions, immediate responses and communication between the government and the people, which improves transparency and makes services easier for society to obtain. As an outcome of this survey, the aim is to develop a dedicated AI chatbot that local citizens can use with confidence.

In 2017, the chatbot WienBot [76] was launched in Vienna to answer FAQs. The city of Vienna found that its municipal website received numerous searches for details of the online services running in the city. WienBot provides quick information to the public through voice commands, so users need not search for the required page on the website. WienBot answers questions about more than 350 city services in German and is able to respond in the local dialect [85]. The bot won the World Summit Award in 2017 for best government and citizen engagement application [87].

In 2018, the Register of Enterprises of Latvia developed a novel chatbot, UNA [53], which answers FAQs related to the enterprise registration process as well as liquidation, merchants, companies and organizations. UNA is a symbolic name that stands for future support of entrepreneurs in the Latvian language and points to the future of Latvian public administration. The chatbot is available on the Register of Enterprises website as well as on Facebook [60]. Citizens can also follow up after registration. It works in Latvian only [53].

In 2019, Alan R. Shark et al. [71] emphasized the teaching and learning of public policy and administration using artificial intelligence. The paper is part of “AI and its Impact on Public Administration”, released by the US-based National Academy of Public Administration, an organization that assists government in building effective, accountable and transparent systems. It mainly discusses integrating information technology with public administration to turn thought into action. The authors explain that AI in the form of ML is already being used in chatbots in the public management domain, e.g. citizen engagement systems, smart interaction, and language-based interactive systems for field inquiries like Siri or Alexa. In 2019, Masafumi Kosugi et al. [40] developed a LINE bot to share disaster-related information in Japan. It is based on the LINE messaging service, which has more than 80 million monthly active users in Japan. Using this chatbot, users can register their friends, tweet, and share text information together with images and their current location.

In 2020, Vito Bellini et al. [13] developed GUapp, a platform based on Italian public administration data for job search and recommendation. The recommender system uses Latent Dirichlet Allocation and finds the k-nearest-neighbour job positions most similar to the user profile. To improve the user experience, GUapp is implemented as a chatbot so that users can interact in natural language. Paul Henman [36] studied the use of AI in the public sector for automated decision-making systems, for chatbots that give advice and provide information, and for public safety and security. The study discusses four challenges in deploying AI in public administration and outlines governance and technical innovations to address them. Chonnatte Rodsawang et al. [28] designed an informative chatbot, “COVID-19 preventable”, for the COVID-19 pandemic. Reliable information was converted into a question-and-answer system and imported into the NLP engine in Google Cloud Dialogflow. It has seven prompt features: situation report, how to protect yourself from COVID-19, fake news, self-screening for COVID-19, list of nearest hospitals, hotline number to call, and report notification. Fahad Mehfooz [51] proposed a retrieval-based chatbot with voice support and also studied other existing chatbots and their usefulness in helping patients during COVID-19.

8 Chatbot performance evaluation

In the context of chatbots, many metrics are used to measure performance. Several existing parameters for assessing whether a chatbot is useful are listed below:

  • Scalability: The chatbot should accept a large number of questions from users and respond efficiently.

  • Turing Test: The Turing Test, proposed by Alan Turing in 1950, is used to test the equivalence of intelligence between human and machine. If the answers given by the human and the machine are indistinguishable, the machine may be said to have passed the Turing Test. Even popular state-of-the-art chatbots such as Amazon Alexa, Microsoft Cortana and Apple Siri have not passed it. The Loebner Prize Competition [17, 77] is an annual competition in which conversational agents (chatbots) are tested via the Turing Test. ALICE [81] has won this prize three times and Mitsuku [86] five times.

  • Interoperability: The ability of a system to exchange and reuse information; it supports multiple channels, and users can switch between channels.

  • Speed: It measures the response generation time for any query. To the best of our knowledge, Google Assistant has a response time of 2 s, which is better than other existing chatbots in its domain.

There are a number of different perspectives on how to evaluate chatbot performance. From an Information Retrieval (IR) perspective [69], chatbots have specific functions: there are virtual assistants, question–answer and domain-specific bots. Evaluators should ask questions and make requests of the chatbot, evaluating effectiveness by measuring accuracy, precision, recall, and F-score relative to the correct chatbot response. From a user experience perspective [69], the goal of the bot is, arguably, to maximize user satisfaction. Evaluators should survey users (typically, measured through questionnaires on platforms such as Amazon Mechanical Turk), who will rank bots based on usability and satisfaction. From a linguistic perspective, bots should approximate speech, and be evaluated by linguistic experts on their ability to generate full, grammatical, and meaningful sentences. Finally, from an artificial intelligence perspective [69], the bot that appears most convincingly human (e.g. passes the Turing Test best) is the most effective.
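For the information retrieval perspective, the four measures can be computed directly from a list of chatbot outputs and the corresponding correct outputs. The sketch below treats one label as the positive class; the function name `evaluate` and the intent labels in the usage note are illustrative assumptions.

```python
def evaluate(predicted, expected, positive):
    """Accuracy, precision, recall and F1 for one target label."""
    pairs = list(zip(predicted, expected))
    tp = sum(1 for p, e in pairs if p == positive and e == positive)
    fp = sum(1 for p, e in pairs if p == positive and e != positive)
    fn = sum(1 for p, e in pairs if p != positive and e == positive)
    correct = sum(1 for p, e in pairs if p == e)
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

With predictions `["greet", "order", "order", "greet"]`, correct labels `["greet", "order", "greet", "greet"]` and `"order"` as the positive class, the chatbot has one true positive and one false positive, giving an accuracy of 0.75, a precision of 0.5 and a recall of 1.0.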

One of the most widely used evaluation frameworks is the PARAdigm for DIalogue System Evaluation (PARADISE) [80], which is used to estimate factors such as (i) clarity, (ii) friendliness, (iii) ease of use, (iv) naturalness, and (v) robustness to misunderstandings. It seeks to objectively quantify bot effectiveness by maximizing task success and minimizing dialogue cost. It introduced the Attribute Value Matrix (AVM) [80] to measure task success, and it distinguishes two types of dialogue cost to be minimized: (i) efficiency cost and (ii) qualitative cost.
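The trade-off between task success and dialogue cost can be sketched in Python as follows. The sketch follows the commonly cited PARADISE formulation, in which task success is the kappa coefficient computed from a confusion matrix over AVM attribute values, and performance is a weighted, normalized task-success term minus weighted, normalized costs. In the framework the weights are estimated by regression against user satisfaction; here `alpha`, `weights`, the cost name `"turns"`, and all numbers are hypothetical inputs chosen for illustration.

```python
import statistics


def kappa(confusion):
    """Kappa coefficient for a square confusion matrix of AVM values."""
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    p_agree = sum(confusion[i][i] for i in range(size)) / total
    # Chance agreement from column totals (an assumption of this sketch).
    column_totals = [sum(confusion[r][c] for r in range(size)) for c in range(size)]
    p_chance = sum((t / total) ** 2 for t in column_totals)
    return (p_agree - p_chance) / (1 - p_chance)


def z_scores(values):
    """Normalize values to zero mean and unit (population) standard deviation."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return [0.0 for _ in values]
    return [(v - mean) / spread for v in values]


def paradise_performance(kappas, costs, alpha, weights):
    """Per-dialogue performance: alpha * N(kappa) - sum_i w_i * N(c_i)."""
    norm_kappa = z_scores(kappas)
    norm_costs = {name: z_scores(values) for name, values in costs.items()}
    return [
        alpha * norm_kappa[d]
        - sum(weights[name] * norm_costs[name][d] for name in costs)
        for d in range(len(kappas))
    ]


# Two hypothetical dialogues: the first is more successful and shorter.
scores = paradise_performance(
    kappas=[0.8, 0.4],
    costs={"turns": [10, 20]},
    alpha=1.0,
    weights={"turns": 1.0},
)
print(scores)
```

Normalizing each term before weighting is what allows measures on different scales, such as kappa and the number of turns, to be combined in one score.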

In 2015, Kuligowska et al. [44] proposed an evaluation framework that assesses ten quality components of a commercial chatbot: visual appearance; form of implementation on the website; speech synthesis unit; built-in knowledge base (general and specialized information); knowledge presentation and additional functionalities; conversational ability and context sensitivity; personality traits; personalization options; responses in emergency and unusual situations; and the possibility for the client to rate the chatbot and the website. By analysing these factors, they presented the state of the Polish market of commercial chatbots and demonstrated the importance of evaluating the performance, usability, and overall quality of every commercial virtual assistant application.

Another performance metric helpful in designing an efficient chatbot is the BiLingual Evaluation Understudy (BLEU) score [63], a method for comparing a generated sequence of words with a reference sequence. Kishore Papineni and colleagues introduced the BLEU score in 2002, originally for translation-related tasks only. BLEU is quick and inexpensive to calculate, is language independent, and correlates highly with human evaluation. It works by counting the n-grams in the generated text that match n-grams in the reference text. A higher BLEU score indicates that the chatbot's response is closer to the reference and is taken as a sign of a more capable chatbot.
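The n-gram matching behind BLEU can be illustrated with a minimal Python sketch. It makes several simplifying assumptions that differ from production implementations: it scores a single sentence against a single reference, tokenizes on whitespace, and applies no smoothing, so any n-gram order with no match yields a score of zero. The function names `ngram_counts` and `sentence_bleu` are illustrative.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Unsmoothed BLEU of one candidate sentence against one reference."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        if matched == 0:
            return 0.0
        log_precision_sum += math.log(matched / sum(cand_counts.values()))
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return brevity_penalty * math.exp(log_precision_sum / max_n)


print(sentence_bleu("the cat sat on the mat", "the cat sat on the mat"))
print(sentence_bleu("the cat sat", "the cat sat down", max_n=2))
```

Clipping prevents a response that repeats one matching word from scoring well, and the brevity penalty prevents very short responses from scoring well on precision alone.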

9 Future direction

A chatbot that understands citizens' queries about Government schemes, interprets Government resolutions, and communicates them with authentic information, facts, and figures is the need of the day. The expert knowledge base, voice input, vernacular language support, etc. of an AI chatbot can circumvent many of the inherent redundancies and hurdles responsible for the poor performance of citizen-centric activities. It would be truly revolutionary if citizens could share their documents online through a chatbot for processing and receive the required document in return. E-signatures and online payments are needed to make this possible. The administrative process also needs to change so that it can be carried out conveniently online through a chatbot.

Understanding the context of a conversation and generating an appropriate response is one of the major challenges for chatbots. Future chatbots will require improved NLP techniques, a better understanding of the context of communication, and the ability to generate responses with emotion or a personalized feel.

10 Conclusion

Chatbots have been around for decades. They are now found in almost every field and have evolved along with the underlying technology. Major internet companies such as Apple, Google, Facebook, and Microsoft have developed a variety of chatbots using popular technologies, and these chatbots have been accepted across the globe. This paper categorizes chatbot services into customer-based and public administration-based services and explores them to understand the possibilities for improvement and implementation. As customer assistants, chatbots are used for ordering food, making appointments, booking flights, etc., and are becoming a part of day-to-day life, enhancing ease of living. Chatbots for public administration services help resolve citizens' queries and assist them in availing the necessary government services without the involvement of administrative staff. With the advancement of AI technology and language processing techniques, there is huge potential to develop more accurate, multilingual, efficient, multi-communication-mode, and auto-learning chatbots to provide customer and public administration services.