1 Introduction

The growing significance of the real estate market motivates investors to search for factors and variables that support cohesive analyses of real estate markets, market comparisons that are based on diverse criteria and the determination of market potential (Renigier et al., 2017)

Market analysis is an indispensable element of both the property valuation procedure and process and rational investment decision-making. One of the key challenges in analysing the real estate market is identifying the attributes (their occurrence, scope and significance) that affect the attractiveness of properties. This challenge results from the characteristics of the real estate market, which is a multithreaded and complex organism that is affected by frequent changes in the preferences of its participants, the economic situation, the uncertainty of information, and the impacts of many endogenous and exogenous factors on real estate prices.

The two main reasons for the difficulty in identifying the attributes are as follows:

  • Many factors may affect the real estate market and property prices, which are related to economic, social, technical and behavioural aspects and are difficult to define, especially in quantified and graded forms.

  • No unified and effective methods are available for analysing the real estate market due to its specificity.

A detailed analysis of the presented problem led to the following inference: The identification of property attributes and the assessment of their significance in property attractiveness perception are among the most troublesome elements of property analysis.

The main subject and decision maker in the real estate market is the human being, and his/her motives, emotions and consequent reactions should be included as a behavioural component in decision systems. Among the most popular systems that are related to decision algorithms for the real estate market are the automatic valuation models, which enable the objectification and processing in numerical form of large data sets. Currently, the rapid development and common use of modern technologies that are related to artificial intelligence (AI) and machine learning (ML) motivates us to search for more efficient tools that may resolve problematic aspects of analyses (Ziółkowski & Niedostatkiewicz, 2019) and to move on from the traditional analysis of efficiency to more suitable approaches for the real estate specificity.

How can aspects that are based on factors that express human reactions be considered as evidence for decision-making? These factors are based on emotions that can be conveyed via facial (Bobkowska et al., 2016, 2017), vocal and body expressions. Facial emotion detection is one of the most common types of emotion detection, for which mostly deep-learning-based approaches are utilized (Barsoum et al., 2016; Li et al., 2017; Wang et al., 2019).

The scientific question that must be raised regarding this subject is the following: Is it possible to objectify and reflect the impacts of individual variables (attributes) on the attractiveness of real estate? To answer this question, the authors of this paper formulated the following thesis: The use of unconditional reactions of property market participants in market analysis enables the objectification and assessment of property attractiveness. It was assumed that the thesis will be supported or refuted by the following research objective: the use of solutions that are based on the AI emulation of "cognitive" functions that are related to human face emotion detection to increase the reliability of variable significance determination. The main contribution of the paper is the proposal of a new approach for the determination of the significance of the real estate attributes in consideration of the behavioural aspect (based on the subconscious behaviour) of decision making regarding the real estate market. The paper proposes new soft-computing methodologies and conducts experiments that are related to facial emotion recognition for real estate market analyses and valuation. The remainder of this paper is organized as follows: First, the reasons for searching for new solutions for real estate feature analyses are described Sect. 1. Section 2 presents a short theoretical presentation about the uncertainty related to real estate decisions, especially real estate valuation. Section 3 presents current approaches for the determination of attribute significance. Section 4 contains the behavioural aspects of real estate market decision making. Data, the materials, methods and methodology of the property attribute significance determination based on an emotion recognition technology/algorithm were presented in Sect. 5. Section 6 presents the results of analysis. Section 7 discusses findings of this study, its’ limitations and future directions. Finally, Sect. 8 presents the conclusions of this study.

2 Literature review

Real estate is a component of a system, namely, a real estate market. The real estate market is a part of a main system of economic processes in which one can distinguish segments based on the subject of exchange (markets of goods and services, labour markets, and financial and capital markets). A capital market is a type of market in which purchase and sale transactions (financial or material) are conducted. The transactions include currency, securities, mortgages, and real estate trade, among others. The main problem in the analyses of the real estate market is the specificity of the information that describes it. Decision making that is related to real estate is burdened with high uncertainty and is very vague and complex. One of the main problems is limited access to reliable data or information, which is due to excessive amounts of such data (information noise/bias) and difficulties in the selection of suitable data (Renigier et al., 2019). The selection and elaboration of suitable data should be correlated to the objectives of the analyses.

One of the main objectives is real estate valuation for investment purposes that is based on conventional (standard) procedures. To increase the efficiency of market analyses, computational algorithms are applied. The procedures consist of several common stages, which are described in standards that were established by national, European or international societies (organizations) such as Polish Valuation Standards (PKZW, 2017), IAAO (2012), RICS (2017), and IVS (2017). One of most important stages, regardless of the type of valuation, is the collection and provision of reliable information that is related to the elaboration of the dataset. The definition of the data that influence the establishment of the prices is one of most crucial aspects of the property valuation procedure and market analyses (Guo et al., 2012; Hwang et al., 2013; Kauškale & Geipele, 2017; Kobylińska & Cellmer, 2019; Wei-Xing & Sornette, 2008). However, the type and scope of the attributes that determine property values are not universal and may differ from one valuation to another. Hopfer (2008) presents examples of features that are most often considered in the valuation of residential properties, which include the location, surface, plot shape, legal status of the property, designation in the spatial development plan, number of rooms, installed equipment, year of construction, type of development, technical condition, standard of finish, prestige and neighbourhood. According to Hopfer location can be divided into physical location (including location of the property in relation to the directions of the world or sunshine) and economic location (determined, e.g. by the availability of service points, retail outlets or public transport). These features are the typical set of information that are included in property analyses. How are they selected for the evaluations? An analysis of property transaction participants’ motives shows that the scope of their perception is limited to only a few attributes. According to Miller, an average person cannot make a decision on the basis of more than seven criteria or rank the attributes according to their significance (Ries & Trout, 1996).

The property valuation practice assumes the selection of suitable features that potentially impact the property markets based on a pairwise comparison method or, when data are unavailable, on the basis on a preference survey. Pairwise comparisons must be based on the assumption of ceteris paribus of the analysed property, namely, that pairs of properties differ in terms of only one feature. In practice, there is a small probability of identifying sufficiently many such pairs for statistical analyses; hence, the obtained results are not reliable. Moreover, the surveying of preferences is typically not conducted in accordance with professional guidelines. Eventually, the real estate valuer subjectively determines the importance of the features to the property value. There is a very high risk of "falsification" in which attempts are made to adapt reality to the applied methods.

Moreover, property valuation lacks universal algorithms or procedures that can be used in property feature significance analyses in various property market conditions, which are often imprecise, ambiguous, uncertain and “behaviourally affected”. Moreover, d'Amato and Kauko (2017) raise the following issue: “from an analytical point of view, the solution to the problem requires the selection of appropriate methods for analysing the available information rather than, as it is often observed in practice, the adaptation of existing information to popular analytical methods, such as econometric models”. This proves that the most important principle for every analysis of real estate value is the selection of suitable research methods and procedures.

Due to the specificity of the real estate market and of the information on which the analyses are based, the most important tasks are to select methods and develop a methodology (application procedure) that “understands” specificity of information and participants decisions on the real estate market. The following factors should be considered:

  • Significant variations in the quantity of available information, subject to the type of the analysed market (region);

  • Complex methods of data description (differences in the scale of attribute description);

  • Lack of comprehensive (complete) information;

  • Inaccurate and "fuzzy" real estate data;

  • Absence of the restrictive assumption of linearity in data relationships (McCluskey et al., 2013);

  • Absence of homogenous functional dependencies between real estate attributes.

Moreover, according to Kaklauskas et al. (2015) and California Bureau of Real Estate (2015), the most substantial challenge is that the value of a property should be estimated based on accurately determined elements that affect the value.

3 Current approaches for the determination of attribute significance

A property markets analyst or property appraiser who is analysing a type of real estate market must select suitable market attributes that significantly affect the market price of the property being valued (McAllister, 2007). According to Zavadskas et al. (2010), the major problem here is the “estimation influence of different attributes”. To solve this problem, both practitioners and scientists apply either qualitative or quantitative methods (Barańska, 2008; Maleta, 2013; Senetra & Szarek-Iwaniuk, 2019). From the valuation procedural prospective, the attribute significance (which is called the attribute weight) represents the impact of a feature on the price range in the real estate market (Sawiłow, 2004). The determination of market attribute weights consists of measuring the impacts of these features on real estate prices. The impact of a feature can be measured on the basis of:

  • Database analysis of real estate prices and features,

  • Evaluation of the preferences of potential property buyers.

3.1 Database analysis of real estate prices and features (quantitative current state-of-the-art solutions)

One of the most frequently used analytical methods for determining the market attribute weights is based on the Formulas 1 and 2:

$$wg_{il} = \left( {C_{{jl_{s} }} - C_{{jl_{k} }} } \right) \cdot \left( {\mathop {\max }\limits_{j} C_{ij} - \mathop {\min }\limits_{j} C_{ij} } \right)^{ - 1}$$
(1)
$$wg_{j} = \left( {\mathop \sum \limits_{l = 1}^{l = z} wg_{il} } \right) \cdot z^{ - 1}$$
(2)

where: \(wg_{il}\)—market attribute weight for particular property pair, \(C_{{jl_{s} }} ,C_{{jl_{k} }}\)—real estate prices that differ in only one feature (s—highest level of attribute condition, k—lowest level of attribute condition, l—particular pair of property), \(\mathop {\max }\limits_{j} C_{ij} ,\mathop {\min }\limits_{j} C_{ij}\)—minimum and maximum real estate prices on the local market, \(wg_{j}\)—average market weighting coefficients of real estate features reflecting their influence on property prices, z—amount of property pairs.

Other analytical approaches for determining the weights of market features are methods that are based on a correlation matrix between features and property prices (Barańska, 2019a, b; Gaca, 2018; Doszyń, 2017). Sawiłow (2004) proposes the determination of the weighting coefficients of real estate features on the basis of variation coefficients that are calculated according to Formula 3:

$$wg_{j} = \left( {\mathop \sum \limits_{i = 1}^{n} x_{ij} } \right) \cdot \left( {n \cdot \sigma_{j} } \right)^{ - 1}$$
(3)

where: j—designation of a feature, n—number of properties, \(wg_{j}\)—weighting coefficients of real estate features, \(x_{ij}\)—value of attribute x for feature j and particular property i, \(\left( {n \cdot \sigma_{j} } \right)^{ - 1}\)—variation coefficient of feature j.

Dmytrów and Gnat (2019) state that “since there are many attributes that are considered to assess the real estate value, multiple-criteria decision making techniques ought to be applied”. The authors propose the use of the analytical hierarchy process (AHP) method to assess the attribute influence on the value. For that, a pairwise comparison matrix (the AHP matrix) was proposed.

Another current approach towards property attribute determination was proposed by Doszyń and Gnat (2017) and considers predictive and studentized residuals based on Formula 4:

$$y_{j} = \mathop \sum \limits_{j = 0}^{k} \alpha_{j} x_{ji} + \gamma_{i} z_{i} + u_{i}$$
(4)

where: \(y_{j}\)—the dependent variable (the unit price of a property), \(x_{ji}\)—the explanatory variables (the operty attributes) (j = 0, 1,…, k), \(z_{i}\)—the dummy variables (i = 1, 2,…, n), \(\alpha_{j} \gamma_{i}\)—the structural parameters, \(u_{i}\)—a random factor.

Automated solutions for property valuation that are based on current technological capabilities are also applied in numerous aspects of property analyses, which include property attribute significance determination. Despite the main complaint of automated valuation model (AVM) opponents regarding the credibility and quality of the data on which AVM is based, this process considers the behavioural impact on real estate market decision-making that the typically applied methods (e.g., ordinary least squares) cannot consider. Moreover, according to Lentz and Wang (1998), d'Amato and Kauko (2017), Yakubovskyv et al. (2018), Renigier-Biłozor et al. (2018), Mark and Goldberg (1988), Fletcher et al. (2000), Dubin and Goodman (1982), Des Rosiers and Theriault (1999), Helbich et al. 2014, Sawiłow (2010) and Renigier-Biłozor (2011), despite the enormous number of applications, popular statistic data analyses have theoretical weaknesses. The obtained results usually simplify the problem and may not be efficient in markets in which uncertainty is high due to difficulties with quantification of the features and their numerical definition, which is related to behavioural and emotion aspects of the market participants who created/formed the prices/value of the real estate. Hence, many authors investigated alternative approaches for automated valuation, e.g. application of artificial neural networks (ANNs) (Borst, 1992; McCluskey et al., 1997; Yakubovskyv et al., 2018). However, the main problem with using neural networks (NNs) is the influence of differences in the architecture of the neural network model on the results (Worzala et al., 1995). Gonzales et al. (2002) make yet another research contribution by creating a mass appraisal model that is based on fuzzy theory and ANNs. Very interesting results were presented by McCluskey et al. (2013), which demonstrated that “ANNs can be shown to perform very well in terms of predictive power, and therefore valuation accuracy, outperforming the traditional multiple regression analysis and approaching the performance of spatially weighted regression approaches”. Kauko (2002) and Kauklaskas et al. (2012) proposed a self-organizing map, whereas an AHP was applied by Kauko (2002) and Fischer (2003), as an alternative to commonly used methods.

3.2 Evaluating the preferences of potential property investors (qualitative current state-of-the-art solutions)

The analysis of potential property buyer’s preferences is regarded as an alternative property attribute significance determination method (Bartuś et al., 2008; Coolen & Jansen, 2012). It is typically used when analytical methods that are based on property prices and features are not applicable due to, for example, lack of data. The preferences of potential buyers of real estate are among the factors that determine the demand in the real estate market (Ben-Shahar, 2007). Subjective preferences of buyers are treated as customer preferences and form the basis for decision-making in the exchange process. Shopping preferences reflect consumer taste. They reflect the superior suitability of one combination of goods over others. They are characterized by features such as the following:

  • They relate to specific entities, e.g., tangible or intangible;

  • They become identifiable immediately prior to or during the purchase;

  • They determine the purchasing decision;

  • They become identifiable when making a purchasing decision;

  • Their measurement depends on their type (internal versus external preferences).

Buyer preference analysis is conducted to identify a system of individual assessments based on which the consumer makes a choice. They enable the investigation of the product features by which buyers are guided when making a purchase. To identify preferences, it is possible to use historical or presumptive data. The type of data creates a basis for selecting an analysis method from the following: disclosed preference analysis methods that reflect actual market choices (in this case, statistical factors that determine market choices, along with interviews and post-purchase surveys, are analysed); methods for analysing expressed preferences, which are hypothetical behaviours of buyers on the market (data that are collected from surveys and interviews regarding the behaviour of buyers at the time the survey or interview is conducted, namely, before purchase, are subject to analysis).

The structure of buyer preferences reflects a hierarchical arrangement of the possibility of choosing purchases from the market offerings. The consumer assesses which of the purchase options is more suitable for him. The option that occupies the highest position on the scale of his preferences will always be selected. Thus, this structure shows an ordered catalogue of buyer preferences (Szymańska, 2007).

4 Behavioural aspects of real estate market decision making

The real estate market is shaped by factors that can be forecasted with substantial probability and by factors that are little-known, difficult to predict or random. The individual preferences of real estate market participants and their emotional approach to investment decisions undeniably impact the image of the real estate market (Brzezicka, 2016; Czechowska, 2014; Radzewicz & Wiśniewski, 2011). Although this problem has been identified many times in the literature, a large component of behavioural decision making in the property markets seems to be often neglected and remain unidentified (Salzman & Zwinkels, 2017). The physiological and psychological limitations of the organism (biologically defined) and social and moral motives determine decision optimization (Simon, 1955), namely, the choices that are made by market participants are not optimal but are intended to satisfy their objectives at a particular moment (Kucharska-Stasiak, 2014). According to Shiller (2007), residential investments are difficult to compare with other financial investment assets because they have both investment and consumer characteristics. People base life decisions on vague expectations. Combining these expectations with their impression that they have a unique property makes them believe their property will become extremely valuable. This foresight makes them consume more today and implicitly drive up prices tomorrow (Calum et al., 2020; Shiller, 2007). Due to cognitive limitations, real estate market participants make decisions based on random information that they process heuristically. Departing from traditional economic theory, which assumes full rationality of decision making, Simon (1955) proposed an “approximate rationality” approach that considers the actual behaviour of "organisms" in decision-making processes in the context of the ability of entities to collect information and the computational possibilities, which depend on the environment in which these entities operate. According to Szapiro (1993), when making decisions, people focus on selected aspects of the problem and ignore others. In his theory of limited rationality, the principle of maximization (in analytical rationality) is replaced by the principle of satisfaction, and before making a selection, the decision-maker formulates the ranges of satisfactory assessments (reservation and aspiration levels) and selects the first considered option for which the assessments satisfy the restrictions. According to Kahneman and Tversky (1979), the current asset level could be a benchmark for making investment decisions, along with the expectation, from where the gains and losses are coded. The factors that influence investment decisions include moods, emotions, habits and beliefs, among others (Czechowska, 2014; Dunning et al., 2017). Premises of the choices that are made are considered in the field of behavioural economics. Schlösser et al. (2013) and Rick and Loewenstein (2008) focused on two types of emotions that accompany decision making: immediate emotions and expected emotions, which are emotions that are felt while waiting until the results of the decisions are apparent. According to Schlösser et al. (2013, pp.1), immediate emotions refer to the “hot visceral feelings people feel as they contemplate a specific decision option at the cusp of making a decision, whereas anticipated emotions are those emotions that people forecast that they will feel once they experience possible consequences of that decision”.

Entities that make choices and conduct assessments, including experts (professionals who are involved in property markets), tend to make cognitive errors and fall into decision traps (Brzezicka, 2016). Various types of heuristics can negatively impact real estate market analysis, especially when traditional computational methods are used for the analysis. Czechowska (2014) underlines the need for greater integration of behavioural economics and real estate theory, which would provide a better understanding of the behaviour of real estate market participants.

The preferences determine the types of reactions that are reflected by emotions, while emotions can be detected via the application of technology.

4.1 Technology of emotion detection

In the last century, it was understood that the search for deterministic laws of nature is riddled with logical and practical difficulties (Rao, 1994). The understanding and acceptance of the presence of uncertainty in all areas of human life and the acknowledgement of randomness as "an inherent part of the normal functioning of many classes of objects and their modes of existence" (Rao, 1994, pp. 20) led to the development of theories based on stochastic laws of nature. According to these theories, certainty in real estate decision making, by definition, does not exist. However, it is possible to minimize the risk and uncertainty by providing relevant information, in a specified manner, to be processed and given to decision-makers. One of the most important approaches that yield more suitable results with higher stability (less uncertainty) is the inclusion of analyses methods from the artificial intelligence (AI) domain. Methods that are related to the AI domain attempt to emulate the behaviours of humans that directly influence the property market. In this paper, the term “artificial intelligence methods" refers to human-inspired and nature-inspired informatics algorithms that mimic "cognitive" functions that humans/nature associate with other human minds/organism, such as "learning" and "problem solving".

A key feature in human interaction is the universality of facial expressions and body language. In the nineteenth century, Charles Darwin (1872) observed that globally shared facial expressions play an important role in non-verbal communication. In 1971, Ekman and Friesen declared that facial behaviours are universally associated with particular emotions. Emotion recognition can be based on facial, vocal and textual emotions. Facial emotions are the most common type for emotion detection using mostly image analysis algorithms and deep-learning-based approaches (Przybyło, 2008).

The main assumption that facial expressions are used to transmit information is dictated by the finding (which is supported by the results of many psychological studies) that a significant amount of information is transmitted via non-verbal communication, which includes facial expressions.

Four general signal classes can be distinguished (Przybyło, 2008):

  • Static signals, which consist of permanent features such as natural facial structure, mimic wrinkles and skin folds;

  • Slowly changing signals, which represent changes in the appearance of the face that occur gradually over time (e.g., ageing);

  • Artificial elements that have been placed on the face, such as glasses and makeup;

  • Quick signals or facial expressions that are caused by the movements of the corresponding muscles.

An important stage of the analysis is the definition of a description method that will systematize mimic gestures and assign meanings to them. The psychophysiological conditions of a human being and somatic or emotional reflex stimuli should also be considered. From the perspective of provided information, there is the following functional division in relation to facial emotion communication: emotion indicators, which communicate to the recipient the current state of emotion of the sender (e.g., joy, sadness, surprise, anger, embarrassment, or fear), and illustrators, which are used to complement the content of spoken words (Ekman, 1979; Ekman & Friesen, 1975). They are most often made with hands, e.g., to indicate size, distance and direction, but mimicry and vocal channel are also used to emphasize the weight of a word.

In IT solutions to the emotion detection problem, distinguishing spontaneous facial expressions from intentional, meaningful movements is a task that requires calibration of the interface. Among the well-known IT systems for the description of facial expressions is MAX (the maximally discriminative affect coding system) (Izard, 1979).

This is a theoretical method for measuring facial expressions in terms of changes in the appearance of the face that are caused by individual emotions. There are also many methods that enable the precise description of facial expressions without specifying their meaning. The most commonly used systems include the FACTS and FAP methods/systems. FACS (facial action coding system), which is an anatomy-based system for measuring facial mimic movements and grouping them into action units that are closely related to the anatomy of facial muscles, is one of the most commonly used methods for measuring and describing facial expressions (Waller et al., 2020). It was developed by psychologists Ekman and Friesen in the 1970s (1971; 1975; 1979). Their objective was to develop a reliable method for distinguishing and categorizing facial expressions by qualified observers. FAP (facial animation parameters) is defined as part of the MPEG-4 standard as a specification and method of description and animation of the face and human body. As part of the MPEG-4 standard (MPEG-4. ISO/IEC MPEG-4 Part 2; Lavagetto & Pockaj, 1998), a specification of the method of description and animation of the face and human body is defined. It defines a geometric model of the face. In contrast to the FACS methodology, which is closely related to anatomical details (mimic muscles, wrinkles and skin folds), this standard defines 84 characteristic points for describing the appearance of the face.

In economics, the "homo oeconomicus" hypothesis of human rationality is gradually being abandoned. In the economic decision-making process, people most often do not remain fully rational. Many studies have been conducted on emotions and their impact on the rationality of decisions that are made in financial markets (Elster, 1998, Cieciura, 2019, Sanfey et al., 2003, Camerer & Loewenstein, 2003, Loewenstein & Lerner, 2003, Tyszka, 2004, Drabik E. and Drabik I., 2009, Chen & Chen, 2019). The use of artificial intelligence methods in economics offers substantial potential for research, as this area of science remains largely unexplored. Sanfey et al. (2003) used functional magnetic resonance imaging of Ultimatum Game players to investigate neural substrates of cognitive and emotional processes that are involved in economic decision-making. The use of emotion recognition technology in economics offers high potential for further investigation; however, substantial work remains regarding the validation of the elaborated methodologies in practice.

5 Materials and methods

Since the objective of this study is to use solutions that are based on artificial intelligence (AI) methods that emulate "cognitive" functions that are related to human face emotion detection to increase the reliability of variable significance determination, the following procedure of the analyses was applied (Scheme 1).

Scheme 1
scheme 1

Source: authors’ elaboration)

Procedure of attribute significance determination (

5.1 Data: respondents and property attribute images

The analyses involved the participation of 156 students (a consciously selected homogenous group). The research was carried out on 78 women and 78 men. They are students at University of Warmia and Mazury in Olsztyn (Poland) in the final year of the property valuation studies or are postgraduates. They are between 23 and 24 years old. The respondents were selected in terms of knowledge of the local real estate market and the declaration of property investments willingness within the nearest future. For the analyses, the attributes that were selected, along with images that represent them, are presented in Table 1.

Table 1 Attributes of properties and their image representations.

Prior to the study, every respondent was informed in detail of objectives of the analyses and the organization of the test process and was provided technical guidance related to the recording devices. Before the survey, all respondents signed the respondent image protection statement. During the emotion recognition test, each respondent was alone in the recording room to avoid distraction, where he/she watched a presentation of pair of positive/negative attribute images that were additionally described to focus attention on a specified attribute. To confirm the completion of the image recognition, the respondent clicked on a pictogram that expressed positive or negative emotions (this enabled additional emotion evocation and verification of the results). The results of all the recordings were stored as numerical movie frames of individual emotions. Each respondent decided when to stop watching an image, which determined the subset that was used in the analyses. From the video recording, separate digital datasets were created, which represented recordings of individual emotions of each respondent. Due to the emotion noise limitation that was identified at the beginning and the end of attribute image perception, the authors decided to apply eye tracking solutions. At the beginning, the type of attribute (e.g., “standard”) was specified to the respondent, while at the end, the respondent focused on the picture change. As soon as the respondent was focusing on the picture (having read the name of the attribute), the recording was started, and when he/she was focusing on the happy/sad pictogram, the recording was stopped; see Fig. 1.

Fig. 1
figure 1

Source: authors’ elaboration)

Emotion detection concept presentation for respondent No. 4, in which the attribute is a negative standard image (

Figure 1 presents a visualization of 7 emotions that were recognized during negative standard image viewing by respondent No. 4. While the recorded video consisted of 8012 movie frames, 6023 frames were used for attribute significance determination (the eye tracker omitted the beginning/end of the image watching). The property attributes and their image representations are presented in Table 1.

The number and types of attributes were selected on the basis of the maximum efficiency in relation to the number of features that person can consider in making a decision. According Ries and Trout (1996), there are no more than seven attributes. The objective of this study is to determine the significance of attributes based on AI technology; thus, the number of features shouldn’t exceed seven. Additionally, the features were selected on the basis of widespread consideration of investment decision making on residential apartment markets (Hopfer, 2008; Cieślak et al., 2020; Renigier et al., 2018; Mert & Serhat, 2009). The selection of the images was based on the strategy of preparing extreme examples of the features that may trigger extreme emotions.

While designing the study, the authors assumed that the selected attributes ought to be fitted and, therefore, representative of the local market (the origin of the respondents) so that respondents could refer to their feelings. Hence, the technical condition of the building and the technology of the building (which might seem correlated) are perceived as completely different features in the local market. The first attribute reflects the building physical deterioration, which can be improved or worsened via building renovation, whereas the second attribute (technology of the building) reflects mainly the applied construction technology and the functional obsolescence that is connected with it. The differentiation of the two attributes reflects potential dilemmas of property market participants who must choose between apartments that are located in old buildings that were renovated and are of satisfactory technical condition and apartments that are located in new buildings that are outfitted according to modern trends. The next attribute—standard—reflects the quality of the used materials and the aesthetics of the interior, which often affect the investment decision since the investor does not have to spend any additional funds on finishing. The parking space attribute is strictly connected with a common and annoying problem in most overcrowded residential areas. Apartments with designated parking spaces are more attractive since most households possess at least one car. Not needing to search for a parking space saves time and increases the comfort of living. The area feature reflects the apartment functionality and modern trends. The location attribute, due to the presence of local residential properties zones that are comparable, is interpreted as a neighbourhood design, while the window view attribute reflects the immediate surrounding aesthetics (e.g., green spaces and playgrounds).

The preparatory stage concentrated on thorough image selection so that the final set of images that was presented to the respondents (negative/positive attribute visualization) was formed by pairs that were most indicative of the desired attribute and evoked the highest emotion ranges, as indicated by academics with long-term knowledge and experience in market analysis and property valuation. To make results of the studies more objective and, thus, minimize the psychophysiological conditions of the person and reflexive responses to somatic stimuli, the following solutions were adopted in the procedure: The set of attribute example images was shuffled (to eliminate of the influence on the result of factors such as the elapsed time of the examination, tiredness or boredom of the respondent, and emotions that were related to the evaluation of previously presented attributes), and the influence of the length of image exposure was reduced (by adjusting the number of movie frames via skewed observation was eliminated via the solution that was proposed in the methodology (see Formula 5 in Chapter 2.3), namely, unification).

Having collected all the required data, the emotion recognition algorithm was designed.

5.2 Description of the emotion recognition algorithm

Artificial intelligence has been witnessing monumental progress in bridging the gap between the capabilities of humans and machines. As humans, we make many conscious and subconscious evaluations of a person just by looking at his or her face. However, as basic as it is for humans to visually interpret this information, it is a substantial challenge for machines. Conventional semantic facial feature recognition and analysis techniques mostly lack robustness and suffer from long computation time (Gudi, 2015). This paper aims at exploring ways for machines to learn to interpret semantic information that is available in faces via an automated process. Researchers have made extraordinary progress in facial expression detection recently, which has led to improvements in neuroscience and cognitive science that drive the advancement of research in the field of facial expression (e.g. Danisman et al., 2010; Ekman, 2017; Mal & Swarnalatha, 2017; Mehendale, 2020; Parr & Waller, 2006; Xie & Hu, 2018). Moreover, the development of computer vision and machine learning makes emotion identification much more accurate and accessible to the general population. The most promising (e.g., Gudi, 2015; Chollet, 2017; Mehendale, 2020; Chollet, 2017) strategy for facial expression analysis is the use of deep convolutional neural networks (CNNs). CNNs differ from multi-layer perceptrons (MLPs) as CNNs have hidden layers, which are called convolutional layers. In the paper, the authors applied an emotion recognition algorithm under the following assumptions: In the preprocessing stage, the input images were preprocessed to make the faces more uniform. The dataset that was used for training, which consisted of images that were obtained from FERC 2013 and augmented by the CK + (Lucey et al., 2010) resource, has following properties:

  • The images were cropped to the minimum dimensions of one human face,

  • The colours were expressed in the form of one chromatic channel (monochrome),

  • All the images were of the same size (48 × 48 pixels),

  • The images differed in terms of rotation (the ocular base was not always horizontal).

Thus, the authors have normalized the images by float array for shape transformation (48 * 48) with values of between 0 and 1. To avoid large updates of the model estimation error gradient, which could cause network convergence problems, the adoption of small values in the range of 0–1 is recommended (Chollet, 2017). To increase the resistance of network predictions, the training set was augmented by introducing randomized rotations, shifts, cuts, scale changes, and horizontal flipping of the training images.

The images that were subject to prediction, namely, from video recording, contained empirical analysis material. The photographs were registered in a monochromic scale (grey scale). Limiting the registrations to only luminance (brightness) with no colour information enabled the frequency of image registration (60 Hz) and the resolution (640 × 480 px) to be increased. This also ensured that the number of registered channels was consistent during the measurement of emotions and in the process of network training. Realizing consistency of the images that were subjected to emotion prediction using a trained network required limiting the content to the range of faces. For this, an object detection approach that uses Haar feature-based cascade classifiers (Vioa & Jones, 2001) was applied. In addition, the recorded images were subjected to a resampling process, via which they were downsized to 48 × 48px using nearest-neighbour interpolation.

5.3 Training Method and Dataset (FERC and CK +)

Throughout the experiments that are described in this paper, the network is trained using stochastic gradient descent with momentum in mini-batch mode with batches of 256 data samples. The categorical cross-entropy loss function was used. The learning rate for the training was initialized to 0.01 and was supervised by the Adam optimization algorithm (Kingma & Ba, 2014), which was constantly decreased within 100 epochs of training (additional epochs are not presented due to the lack of model efficiency change). The image dataset was divided into 3 sets: 50% training set + 25% validation set + 25% testing set.

A dataset of labelled images from two sources was used for training. One of them was the FER-2013 dataset from the challenge on FER (Carrier et al., 2013), which was published in the International Conference on Machine Learning (ICML) and contained 36,178 48 × 48 px greyscale images of faces that were expressing basic emotions: angry 4953, disgust 838, fear 5121, happy 8989, neutral 6198, sad 6077, and surprised 4002. The FER2013 dataset contains images that vary in terms of viewpoint, lighting, and scale. Additionally, the CK + “Extended Cohn-Kanade Dataset” (Lucey et al., 2010, Kanade et al., 2000, Goodfellow et al., 2013) resource was used to extend the number of images for emotions, which contained 927 emotion images: angry 135, disgust 177, fear 75, happy 207, sad 84, and surprised 249. The CK + resource of images was used to increase the model prediction efficiency. The combined dataset was divided into 3 sets as discussed above.

5.4 Deep network topology

The authors’ designed (using an iterative approach) a CNN network by following Gudi (2015), Chollet (2017), and Mehendale (2020). The designed network topology contains 16 layers: 7 convolution layers, 3 max-pooling layers, 3 dropout layers, 2 dense layers, and 1 flatten layer (Fig. 2). The presented topology yielded the best results in terms of the final model accuracy.

Fig. 2
figure 2

Source: authors’ elaboration)

Network architecture of the developed model (

5.5 Emotion classification: result verification

As a result of network training in the Keras-TensorFlow environment, a CNN model was obtained, and two metrics (accuracy and loss) are presented in Fig. 3.

Fig. 3
figure 3

source: authors’ elaboration)

Accuracy and loss metrics of the model versus the number of training epochs (

Based on an analysis of the efficiency of the proposed model based on two parameters, namely, loss and accuracy (Fig. 3), the larger the number of epochs, the lower the loss during the training process (stable trend). A similar trend is identified for the accuracy parameter. Figure 3 shows that the accuracies of the validation and training increase, whereas the validation and training set losses decrease simultaneously up to ~ 70th epoch, at which the validation loss begins to be approximately constant. At the ~ 70th epoch, symptoms of overtraining begin to appear.

For that reason, the model with weight settings from the 70th epoch was selected for further empirical analysis as a basis for emotion prediction.

The obtained model efficiency was represented by both the percentage (measure) of the training and validation accuracy (73.6% at the 70th epoch) and their loss (0.86 at the 70th epoch), which reflected results that were obtained by other models on the FER2013 dataset by researchers who were participating in projects regarding challenges in the use of CNN models for emotion recognition. They obtained the following accuracy parameters: Alizadeh and Fazel (2017) obtained 65% accuracy on the validation set, Gudi (2015) obtained 67.12% total classification accuracy of the network; Tang in 2013 reported 69.4% total classification accuracy; Nguyen et al. (2019) used a backbone model with 18 weighted layers to realize 69.21% accuracy (with multi-level connections; the best MLCNN increases the testing accuracy to 3.82%); and Taleaongar et al. (2019) obtained 59.63% accuracy before the network became overtrained.

To complete the evaluation of the elaborated CNN model, the confusion matrix was computed using the testing dataset. The performance of this network is presented in Fig. 4.

Fig. 4
figure 4

Source: authors’ elaboration)

Confusion matrix for the CNN model. (

The correct classification of test samples by the network reached 67.04%, and an average precision of 64.85% per class was maintained. The confusion matrix indicates that negative emotions usually have lower network precision than positive emotions. The lowest precision is obtained for faces that are expressing fear since this emotion varies considerably among people and is often confused with anger, surprise or sadness. In contrast, the deep network model performs well in predicting happiness; hence, learning happy face features is easier than learning other expressions. The obtained results are assumed to be justified since many researchers (CNN model developers) obtained similar results, e.g., Talegaonkar et al. (2019)—test accuracy of 60.12%; Alizadeh and Fazel (2017)—54% accuracy for the best shallow models and deep models; and Gudi (2015)—66.56% total classification accuracy and 65.64% average precision.

5.6 Methodology of property attribute significance determination

The original methodology of property attribute significance determination was based on the assumption that the attribute significance equals the number of evoked emotions (1—anger, 2—disgust, 3—fear, 4—happiness, 5—sadness, 6—surprise, and 7 – neutral) with respect to their tinge, namely, positive/negative; therefore, the procedure of their determination was divided into the following steps.

  1. 1.

    Determination of emotions’ share (either positive or negative) in one image for a respondent (Formula 5):

    $${\text{ES}}_{k} = \frac{{\mathop \sum \nolimits_{j = 1}^{n} e_{kj} }}{{\mathop \sum \nolimits_{i = 1}^{7} \left( {\mathop \sum \nolimits_{j = 1}^{n} e_{ij} } \right)}} \cdot 100\%$$
    (5)

    where: ESk—percentage emotion share for a respondent and image, ekj—value of emotion k for movie frame j, k—type of emotion (k = 1,2…,7), n—number of movie frames (indicated by the eye tracker).

    The first step of the methodology was determined by the instability of the respondent’s emotions (see Fig. 1) within the time of video recording (reflected by the number of movie frames). Additionally, the equation enabled the emotion intensity measurement of each respondent who was watching each attribute picture. Such a solution was necessary for the unification of the recognized emotions to enable result comparison.

  2. 2.

    Determination of the positive and negative emotion range for an image and a respondent (Formula 6):

    $$ER_{im} = \left| {\sum e_{Nk} - \sum e_{Pk} } \right|$$
    (6)

    where: ERim—emotion range for the corresponding respondent and image, eNk—negative emotion (k = 1, 2, 3, 5), ePk—positive emotion (k = 4, 6).

    The second step of the methodology was determined by the two adopted assumptions. The first was connected with the neutral emotion. Since it reflects a state of equilibrium (it has neither a positive nor a negative tint), it was omitted from the formula (see “k” from 1 to 6). The emotion detection precision is diversified (see the confusion matrix that is presented in Chapter 4.2), but for detection among either positive (e.g., happiness and surprise) or negative (e.g., anger, disgust, fear, and sadness) emotions, the results are comparable—see Fig. 4. From that perspective, distinction between only negative and positive emotions, regardless their number, is justified.

    The second assumption was connected with the emotion range calculation. The difference in the positive and negative emotions reflected respondents’ approaches towards the attribute through the exposed individual attribute significance. The form of image presentation and the image content may significantly affect the emotion range; hence, the images should be prepared with caution, as in this study.

  3. 3.

    Determination of the emotion range for images (of negative and positive expression) with one attribute for a respondent (Formula 7):

    $$ER_{a} = \frac{{\left| {ER_{imP} - ER_{imN} } \right|}}{{\mathop \sum \nolimits_{a = 1}^{7} \left| {ER_{imP} - ER_{imN} } \right|}} \cdot 100\%$$
    (7)

    where: ERa—percentage emotion range for the corresponding attribute, a –attribute number, a = 1,…, 7, ERimP—emotion range for images of positive expression within one attribute, ERimN—emotion range for images of negative expression within one attribute.

    The third step of the methodology is the determination of the percentage spread of a respondent’s six emotions within one property attribute, which was used for unified attribute significance determination in a form that enabled comparison within the group of all respondents.

  4. 4.

    Approximate property attribute significance determination (Formulas 8, 9, and 10):

    $$A_{a} = \left\{ {ER_{a,1 } \ldots ER_{a,l} } \right\},$$
    (8)
    $$M_{a} = \frac{{ER_{{a, \frac{l}{2}}} + ER_{{a, \frac{l}{2} + 1}} }}{2},$$
    (9)
    $$ES_{a} = \frac{{M_{a} }}{{\mathop \sum \nolimits_{a = 1}^{7} M_{a} }}*100\%$$
    (10)

where: A—attributes in descending order, a—attribute number, a = 1,…, 7, l—number of respondents (\(l = 1 \ldots 156\)), \(M_{a}\)—median ER value for individual attributes, \({\text{ES}}_{a}\)—percentage median.

To objectify the acquired results, the midpoint of a frequency distribution of observed values (median) of individual attributes for selected respondents (and their percentage share) was calculated. The median was adopted because it is outlier-resistant. This enabled a final attribute significance determination that could be used in a variety of applications, such as market analysis and individual property valuations.

The presented methodology was applied and verified in the conducted experiment.

6 Results

The crucial step in almost every property analysis procedure is obtaining the significance attributes that influence the value, attraction or investment potential. To quantify emotions as behavioural elements of decision systems, the experiment that is presented below was conducted.

The presented methodology enables the determination of the percentage share of the sum of the values of the seven emotions that are evoked by the photographs that represent individual attributes of a property (Table 2 presents the results for respondent 4 as an example).

Table 2 Evoked emotion percentage share (either positive or negative)—respondent 4.

In the example of respondent 4 (Table 2), emotions that were expressed by his/her facial expression in accordance with pictures that showed negative standard attributes were recorded and quantified. The facial recognition in this example was conducted on the basis of 6023 movie frames. The value of 5.61 in the “angry” emotion column indicates that the share of anger among all seven emotions, at the time of emotion recognition, equalled 5.61%. The ESk sum is 100%. In the table above, the highest intensities of emotions that were calculated on the basis of Formula 5 for respondent 4 were in the cases of “neutral” and “sad”. The values of the emotions were evaluated in comparison with results from previous experiences and according to individual predispositions and the ability of the respondents to express them.

Further analysis of the subjects' emotions consisted of determining the range of emotion values ERim (Formula 6), according to which the emotions were divided into two groups, namely, negative (angry, disgusted, fearful, and sad) and positive (happy and surprised), within one photo. The human brain assesses a scenario based on previous experience and beliefs as one of the following:

  • Positive—which results in positive emotions,

  • Negative—which results in negative emotions,

  • Neutral—no emotional tinge (neutral emotions).

Since neutral emotions correspond to a state of equilibrium, neutral values were excluded at the next stages of the calculations. The ranges of ERa emotions that were determined on the basis of Formula 7 for individual images (negative and positive expression) under one attribute (two photographs) for the exemplary respondent are presented in Table 3. These values express the differences in intensity of emotions of the respondent in response to the visualization of individual attributes.

Table 3 Attribute emotion intensity determination for the example of respondent 4.

Respondent 4 showed the largest fluctuations in emotions for the standard and technical condition attributes; hence, in these cases, the difference between the positive and negative emotions in relation to both pictures was the largest. These percentages represent the individual/subjective feelings of the respondent—the processes for valuing each scenario—in relation to the property attribute under consideration. The assessment and evaluation of the attributes depends on the respondent’s subjective (even unconscious) preferences, which are affected by experience and individual attitude.

Individual ERa values of each respondent enable the identification of general preferences/trends in the real estate market in the context of the significance of real estate attributes. The representative approximate significances of the seven property features were determined on the basis of the median Ma (Formula 9) and ERa value ranges for all respondents (Table 4). Due to the natural human diversity of the respondents, the median ranges of the emotion values for the obtained results were adopted as the objective measure of the multiple respondent results.Error! Not a valid link.In Table 4, the ESa values, which correspond to the percentage median (Ma) values of the emotion value ranges (ERa) that are evoked in the respondent while viewing individual attributes that were presented in the form of two photographs (negative and positive expressions) represent the significances of real estate attributes.

Table 4 Property attribute significance.

According to the obtained attribute significances, the most crucial property attributes for the group of respondents that were analysed with the proposed methodology were parking space and standard. The results accord with the current mainstream trends of client property preferences, especially those of young investors who are searching for apartments in new attractively located investments. Location was third in terms of attribute significance. The least significant attributes were technical condition, window view and technology of the building.

The presented property attribute significance results were obtained from the facial emotions of a homogenous respondent group (156 students). Hence, these results should be generalized to a wider group of property market participants with caution and reserve. Additionally, the objective of the study was to develop an original methodology that was based on innovative technological solutions to bridge the gap in property attribute significance determination rather than a methodology that would provide a universal set of attributes and their significances (in this stage of investigation).

7 Discussion

Determining the impacts of attributes on the assessment of the attractiveness of real estate is a basic element that affects the accuracy of real estate market analysis and property value estimation. Methods that are based on the detection of emotions can be alternatives to traditionally used methods for determining the significance of real estate attributes. Emotion is a subjective psychological state that activates priority for the associated action. Emotions are typically accompanied by somatic changes, mimicking, pantomime expressions and other behaviours (Imbir, 2012). They are valuation processes that are recognized as ubiquitous (Wojciszke, 1988), and they accompany all impulses that are received from the external world, internal environment and own mind (Jarymowicz, 1997, 2009a, b). According to the obtained results (which were derived from a homogenous group of respondents), features such as technology and the area of the real estate were of the lowest significance (10.73% and 7.41%, respectively). The technical condition of the building, location and window view are at a similar level of significance (12.68%, 14.23% and 12.02%, respectively).

To compare the unconscious expressed facial emotions within the group of analysed respondents, the answerers were asked in the form of a commonly used questionnaire to rank the property attributes from highest significance (7) to lowest significance (1). The results of the questionnaire are presented as a box plot (Fig. 5).

Fig. 5
figure 5

Source: authors’ elaboration)

Property attribute ranking according to a questionnaire (

The comparison of the collected results has shown that the conscious declaration of the attribute ranking differed from the emotion detection conclusions. The three most significant property attributes in both scenarios were the same (standard, parking space and location). In the emotion detection analysis, the most important attributes were standard and parking space (with comparable significances of 21.32 and 21.61); however, the questionnaire results have shown that location and standard are of the highest significance. The respondents in case of the standard attribute were highly consistent. A lower level of consistency was identified in case of the location—the second most significant attribute. The results of the two attribute significance determination methods differed in terms of the parking space attribute: While emotion detection showed that this is the most significant attribute, questionnaire results led to the conclusion that it is in third place in terms of significance. This lack of consistency in the significance of the attribute shows that respondents had varied opinions regarding that attribute. The group of least significant attributes (technical condition, technology of the building, window view and property area) was the same between the two analysed examples. Moreover, the property area attribute was identified as highly significant (with high consistency).

As was demonstrated in the experiment, the results that are obtained with the proposed methodology may be affected by various factors. The video acquisition process, which will be analysed frame by frame, requires the selection of suitable parameters and the satisfaction of important conditions, such as the use of a camera with a suitable resolution and sufficient recording speed, namely, frames per second (the higher the speed, the lower the risk of missing the recording of the correct emotions). As part of the experiment, a 60-fps camera with a resolution of 800 × 600 px was used. The use of fast cameras (Bobkowska et al., 2015) will significantly extend the computational process. Additional conditions include the registration of only one face—one person in the camera's image space; suitable and homogeneous face lighting; and frontal face recording. A 30-degree departure from the camera focal perpendicular from the face frontal plane extends the time of face detection and hinders the accurate detection of emotions.

Compliance with these requirements for image acquisition enables the advancement, after registration, to the next stage of evaluating the recorded emotions. During preprocessing to enable the use of observations in the study of emotions, the impact of registration imperfections should be reduced. The lighting, despite substantial efforts, rarely corresponds to the conditions that were implemented during the construction and training of the neural networks for the detection of emotions; thus, compensation (of the images) is conducted for grey scale recording. Emotion detection requires face detection in the image, e.g., using the Haar classifier (Gao & Lu, 2008), and only within the selected sub-area of the photo is emotion detection conducted. Such tasks are difficult to complete using modern personal computers, especially in the case of high-speed cameras (over 1000 frames per second). Hence, all detections are conducted only during postprocessing. Moreover, the visualization of the images may affect the intensity of the emotions.

7.1 Research limitations and possible areas of future research on attribute significance determination

The determination of attribute significance in a broadly understood real estate market analysis is a highly complex and difficult topic according to multiple scientific studies on this subject (Kauko, 2007; Foryś & Kokot, 2008; Renigier-Biłozor & Biłozor, 2009; Luoma et al., 2010; Hurtubia et al., 2010; Opoku & Abdul-Muhmin, 2010; Jackson & Orr, 2011; Tan, 2012; Zydroń & Walkowiak, 2013; Renigier-Biłozor, 2013; García et al., 2014; Saw & Tan, 2014; Moghimi & Jusan, 2015; Kokot & Bas, 2016; Abidoye & Chan, 2016; Głuszak & Małkowska, 2017; Sarif et al., 2018; Nilsson, 2019; Ghumare, 2019). The complexity of the issue increases the necessity of economic, behavioural and analytical analysis. The diversity and imprecision of attributes (real estate features) and the large and multidimensional set of data render real estate analyses, which include valuation, planning, and management, complex and time-consuming processes with a high degree of risk. The practical problems of data exploration in property attribute significance are due mainly to the non-homogeneity of real estate (no two properties are identical), the presence of various attributes that describe real estate (both qualitative and quantitative), highly diversified access to information on various segments of the real estate market and, frequently, lack of awareness regarding the dependencies between real estate attributes and market actors. Combined with information overload, imprecision, measurement errors and the unavailability of various types of information, these factors directly impact the effectiveness of the selected investigation method and the quality of decisions. From this perspective, it’s not possible to include all features/factors that impact the real estate value, attractiveness, and investments profits. The only solution in this case is to minimize the risk via the selection of suitable data and via significance analyses using methods/algorithms/systems that are most suitable for the domain analysis and the specificity of the real estate domain. The main (well-known) problems in property attribute significance analyses are as follows:

  • Selection of the suitable features for representation in the analyses,

  • Avoiding stereotypes in features assessment (e.g., location is the most important),

  • Human (analyst) subjectivity,

  • Lack of sufficient analytical systems that are practical and user-friendly,

  • Common use of statistical methods that are often unreliable,

  • Limited data access from a behavioural perspective.

The approach that is presented in this study attempts to minimize these problems or to provide a reliable/sufficient alternative from an analytical perspective. No perfect methods are available. All the methods and analyses have limitations. In this study, the two main issues were considered in relation to property data collection and emotion recognition technology application.

In the first case, problems were encountered in the collection of suitable features that may significantly impact the objective of the analyses. These features should be tied with the local market due to the higher awareness and cognitive abilities of potential investors. The analysis covered residential apartments, where, from the perspective of an individual investor and his awareness, the decisive factors are predominately technical and location attributes that are directly related to the condition and description of the real estate and its immediate surroundings. However, a limitation in this analysis was the selection of factors that can be expressed via photography/visualization. The selection of individual attribute images is a neuralgic element of these analyses. The individual attributes are presented in two photographs, which should be selected in a way that evokes intense emotions in the respondent and, therefore, enables the application of the presented methodology. During the analyses, various photo configurations were evaluated, and those that evoked extreme emotions, both positive and negative, were selected. The conducted analyses show that in the selection of photographs for visualizing the attributes, the real estate market should be considered so that the respondents can directly relate emotionally to the images they see.

No ideal methods are available that would enable, without any controversy, the determination of the significance of real estate attributes that influence attractiveness. Both the approach that is described in Chapter 2 and that proposed by the authors have an aspect that is related to the attribute-based perception of real estate. The methods that are used in the analyses are subjectively quantified by an analyst or are subject to survey choices that depend on the respondent's imagination. The descriptive and visual form of presenting the property attribute that is proposed by the authors minimizes the risk of a subjective interpretation of individual attributes, which will (and should, because it concerns human feelings and sensations) appear in every method.

Additionally, from the perspective of the respondents, there is a limit to the number of attributes that can be assessed simultaneously so as not to tire the respondents and lose the possibility of an objective assessment. Another problem concerns the legal aspect that is connected with the respondent's image protection statement, which may slightly intimidate/distort the intensity of the reaction. The answers of the respondents are expressed as personal facial reactions that are specific to each person, which may interfere with the process of detecting emotions. The time from seeing the image, understanding it and making the facial expression is highly important here. The significant variation of this time hinders the comparison and analytically interpretation of the individual image recordings. It is also important to use eye tracking technology, which eliminates the information noise that results from the distraction/distortion from the strict visualization of the attribute.

The next limitation concerns the recording and processing of mass data sets. Recorded image sequences for individual respondents differ in terms of length; hence, it is difficult to compare them directly. Another important element is the sampling rate, namely, the number of frames per second. In this case, a higher sampling rate is better. Therefore, it is reasonable to use a slow-motion camera. The selection of uniform lighting of each respondent's face and suitable direction of the face position (head pose) and the minimization of the presence of elements such as glasses, piercings, hair, and moustaches that hide emotions are highly important for ensuring the effectiveness of the assignment of each recorded image to the correct emotion. An important issue from a technical perspective is the correct (optimal) image resolution. In this case, a compromise must be found between the resolution and sampling rate. In principle, the higher the resolution, the larger the set of data that must be processed in a specified unit of time. Increasing the resolution requires lowering of the frequency, which may result in a failure to capture emotions. Sometimes, a higher recording rate at the same resolution can be realized by reducing the colour depth of the captured images. From the perspective of the applied technology, the problematic aspect is the data that are used to train the model of neural networks with the selected topology. The images that are used to train the model must be representative of the analysed group.

In consideration of both the emotion detection technological constraints and the assumption that is adopted in the experiment, the following directions and challenges will be considered in future research:

  • Improvement of the efficiency and accuracy of the algorithm via the use of a transfer learning approach,

  • Increasing the efficiency via the use of an alternative algorithm structure,

  • Verification of attribute image selection on the obtained results,

  • Verification of the emotion range between static (image) and dynamic (short movie) presentations of property attributes,

  • Increasing the universality of the obtained results in relation to the attribute significance by widening the scope of the questioned respondent groups,

  • Focusing on conscious/unconscious aspects of attribute selection and their significance.

8 Conclusions

The main problem that is encountered in the analyses of the real estate market is the specificity of the information that describes it. The proposed methodology, which is based on an emotion detection algorithm, enables the importance of attributes to be determined in relation to their significance as an alternative solution to the commonly used approaches. The hypothesis that the use of unconditional reactions of property market participants in the market analysis enables objectification and assessment of property attractiveness for property investment decision-making is supported. As was discussed, the most important task is the selection of methods and the development of a methodology (application procedure) that “understand the specificity of information and participant decisions” on the real estate market. The conducted analyses and the obtained results better accord with current property investor preference trends in comparison to their declarations. This phenomenon is highly variable. However, real estate market analysts stereotypically identify the significance of attributes by adopting several of the typically adopted features, among which the location has the highest value. The location is an important attribute, except when analysing comparable properties. Among properties, the locations currently are similar due to the preference for housing estates.

The proposed methodology of attribute significance calculation based on emotion recognition has several advantages:

  • Big data processing,

  • Objectivity of the obtained results (low probability of manipulation and adjustment to the assumed results),

  • Higher flexibility in response to changes in trends and market preferences,

  • The possibility of expanding the observation sets—increasing their precision, completeness and reliability,

  • Speed of data acquisition with indirect (remote) contact with the respondent,

  • The possibility of free selection of relativized (compared) features (attributes) of the property (however, their image visualization must be possible—this is also a limitation).

In addition, the added value of the developed methodology is its universality and the possibility of its application in other areas in which it is important to objectify the recipient's reaction to an element as a component of the decision system.

It is disadvantages may be related to finding a suitable representation of the analysed features using photographs and to mastering new skills and overcoming technological challenges. The need to visualize property features can limit the applicability of this method.

According to the recent literature and findings, property market analyses require the use of innovative new automated valuation techniques. These innovative techniques differ from linear models and are suitable for the specificity of the property market information. They can be used to explore multivariate relationships and the reasoning of decision-makers. One of the most crucial stages in AVM model building is the definition of the information that is related to attributes that crucially influence property values. The proposed methodology can be an essential component of the market analysis in automated models. Since emotion recognition technology is rapidly developing, the authors will develop solutions that overcome the identified limitations in attribute significance determination.