1 Introduction

As the health care industry continuously grows and changes, health care providers constantly seek new and innovative methods to maintain and improve the quality of hospital services. Currently, patients not only seek medical treatment but also demand safe and comfortable health care experiences [1]. Moreover, hospitals, as patient-oriented service providers, are expected not only to provide the best medical treatment but also to improve patient satisfaction [2].

To assess the quality of many services, it is necessary to use formal quality models, and the quality function deployment (QFD) method [3, 4] is one of the best known. This method has been successfully applied in many domains, such as supply chains [5], supplier selection [6], strategic selection [7], manufacturing strategies [8], market segment evaluation [9], health care [10] and health care services [11]. Nevertheless, it still has many limitations.

Wolnia [12] and Huang et al. [13] studied the main properties and the main limitations of the QFD method. Among these limitations, they highlighted the dependence on experts for making any decision. Traditionally, small teams of experts conduct QFD analyses, which may not sufficiently capture the various perspectives and knowledge needed for complex products and services, especially in distributed contexts. This lack of broad input can limit the effectiveness of QFD in real-world scenarios. One such scenario is a health care system: when several hospitals are involved, several dozen experts are needed, which is very difficult to achieve. For that reason, other strategies, such as those presented here, are necessary.

In addition to the difficulty of finding many experts, their often limited involvement further complicates QFD implementation. Experts in the field of health care services are often consulted to assess the quality of the services; nevertheless, the perspective of the patients is frequently not considered [14], although previous studies have highlighted the significance of incorporating user feedback in improving health care service quality [15]. Experts are usually expensive and difficult to find, and they often convey their own opinions, which differ from patient opinions [11, 16, 17, 18].

Furthermore, the static nature of traditional QFD methods poses a challenge in today’s dynamic health care environment. Customer preferences and health care service conditions can change rapidly; nonetheless, QFD may not adapt quickly enough to these changes, potentially leading to services that no longer meet customer needs. For these reasons, we propose the use of a patient feedback web platform to capture information directly from users instead of resorting to experts, which is often more difficult and expensive.

Another typical challenge lies in dealing with the uncertainty associated with QFD evaluations. Vagueness, ambiguity, and incomplete information are common in expert judgments and relationships between customer requirements (CRs) and design requirements (DRs). Additionally, existing QFD approaches often ignore the complex relationships and dependencies between CRs. For this reason, a fuzzy representation model is proposed to represent the information from the system, instead of a classical crisp representation, which is less flexible and less similar to how a user expresses their assessments.

Despite all these limitations, there are opportunities to address them. Medical social platforms can provide valuable information on health care services, supplied directly by patients or other health care users. Furthermore, techniques from areas such as opinion mining and topic modeling can help clarify which dimensions are most relevant to users in the medical domain and what their opinions about them are. Therefore, it is possible to exploit this information, instead of the information provided by experts, to understand the quality of the provided services. This can aid hospitals in enhancing the quality of their services, optimizing them, and increasing their competitiveness within the health care industry. Consequently, the goal of this study is to propose an application that is able to recommend the most suitable hospitals for a specific user depending on the quality of the provided services according to patient opinions. The novelty of the proposal lies in replacing the expert knowledge usually required to build the QFD model: the health care dimensions and quality parameters are determined using topic modeling techniques and assessed using sentiment analysis techniques applied to past patient reviews. Because such values, like expert judgments, are usually expressed not in numbers but in natural language expressions such as 'good' and 'bad', they are represented by fuzzy sets.

In summary, the main contributions of this study are as follows:

  • A fuzzy recommendation mechanism for ranking real hospitals utilizing a fuzzy-QFD method based on patient opinions and user preferences.

  • A topic modeling algorithm to identify the quality parameters of the QFD model from patient opinions on the hospitals.

  • An opinion mining algorithm for calculating the polarity of the DRs.

  • A fuzzy mechanism to represent the polarity of patient opinions and the relationship between the CRs and the DRs of a hospital.

  • A set of experiments using real patient reviews from hospitals to customize the recommendations of these hospitals considering user preferences under different scenarios.

The remainder of the paper is organized as follows: Section 2 provides a detailed overview of the state-of-the-art methods of hospital service assessment. Section 3 presents the proposal, outlining the methodology and dataset utilized in the hospital ranking. In Section 4, the results of the case studies testing the methodology are discussed. Finally, in Section 5, conclusions are drawn based on the findings of this study.

2 Literature review

In the health care industry, different models such as SERVQUAL [19] and Kano [20] have been used to measure the quality of different services. The SERVQUAL model evaluates the gap between expected and perceived service quality through different dimensions, such as assurance, responsiveness, empathy, tangibility and reliability. Kano, on the other hand, sorts service attributes into three categories (must-haves, performance factors, and excitement factors) to assess customer satisfaction. Both models have been applied in health care to identify areas for improvement and to enhance patient satisfaction [21, 22].

Fig. 1 HOQ diagram

The QFD methodology is a systematic approach used to optimize the design and development of products and/or services [3, 4]. It is a customer-focused method that uses a series of matrices to translate customer needs and requirements into specific design and development goals. Cohen [23] introduced a four-phase approach for service development utilizing QFD. One of the key steps of this approach is the house of quality (HOQ), a matrix that reflects the relationships between CRs and service characteristics. This matrix helps clarify the needs and desires of the target customer and how they can be met through the design and delivery of the service.

There are two main styles of HOQ: American and Japanese. The American style typically includes six basic steps and has an additional section, known as Area E (see Fig. 1), while the Japanese style is considered simpler and easier to use [24]. This study focuses on the Japanese style of HOQ to measure the quality of services provided by hospitals considering patient feedback. This approach allows us to identify areas needing improvement and the necessary changes to provide better patient services. The areas of the HOQ are as follows:

  • Area A represents the CRs, which are the needs and desires of the target customer that the service or product should meet. In this paper, the CRs were determined through a comprehensive academic literature review process.

  • Area B represents the relative importance of CRs, which can be determined through various methods, including direct methods where the user assigns scores to the requirements and indirect methods utilizing more complex mathematical and statistical models such as analytical hierarchy process (AHP) and decision-making trial and evaluation laboratory (DEMATEL).

  • Area C describes the DRs or service characteristics, which are the solutions to meet customer requirements.

  • Area D is the relationship matrix, which is the core element of the HOQ and shows the strength of the relationships between the CRs and DRs.

  • Area E is the correlation matrix, which illustrates the impact that one DR may have on another. This area helps identify potential trade-offs between requirements.

  • Area F represents the target value for each DR. These values are derived from a thorough analysis of the customer’s requirements and serve as a benchmark for evaluating the design effectiveness. The weight of each DR is also a significant consideration because it determines the relative priority of each requirement within the design process. These weights are determined using the HOQ methodology and serve as a guiding principle for the design team to ensure that all DRs are met in an appropriate manner.
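The areas listed above can be read as the fields of a single data structure. The following is a minimal sketch under stated assumptions: the class name `HouseOfQuality`, its field names, and the use of crisp floats are illustrative choices, not taken from the paper (the proposal later replaces crisp values with fuzzy ones). Area E is optional because only the American style includes it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HouseOfQuality:
    """Illustrative container for the HOQ areas A-F (names are assumptions)."""
    crs: List[str]                    # Area A: customer requirements
    cr_weights: List[float]           # Area B: relative importance of each CR
    drs: List[str]                    # Area C: design requirements
    relationships: List[List[float]]  # Area D: k x m matrix (rows = CRs, cols = DRs)
    dr_targets: List[float]           # Area F: one target/weight per DR
    dr_correlations: Optional[List[List[float]]] = None  # Area E: m x m, American style only

    def validate(self) -> None:
        """Check that every area has the shape implied by the CR and DR lists."""
        k, m = len(self.crs), len(self.drs)
        if len(self.cr_weights) != k:
            raise ValueError("Area B needs one weight per CR")
        if len(self.relationships) != k or any(len(row) != m for row in self.relationships):
            raise ValueError("Area D must be a k x m matrix")
        if len(self.dr_targets) != m:
            raise ValueError("Area F needs one entry per DR")
        if self.dr_correlations is not None:
            if len(self.dr_correlations) != m or any(len(r) != m for r in self.dr_correlations):
                raise ValueError("Area E must be an m x m matrix")
```

Leaving `dr_correlations` as `None` corresponds to the Japanese style adopted in this study.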

QFD has been successfully applied in various fields. In health care, QFD has proven to be a valuable tool for service planning and customer needs analysis. By translating consumer points of view into conclusions for regular service planning, hospital managers can effectively identify and prioritize consumer needs through the use of QFD [25]. It can also help optimize emergency department operations, decrease patient waiting times [26], and improve medical device design and patient safety in hospital settings [27]. In addition, Said et al. applied QFD to develop quality inpatient services based on patient needs and to achieve high levels of medical tourist satisfaction in India [11]. Gavahi et al. developed a hybrid approach using QFD and SERVQUAL to assess and improve radiology center services [28]. This study used the voice of customers (VOC) through surveys to evaluate service quality for radiology patients; its findings can be easily applied by practitioners.

Araujo et al. integrated fuzzy set theory, QFD, and SERVQUAL to assess medical clinic service quality, emphasizing the importance of prioritizing key requirements [29]. A similar approach was presented by Junior et al. to enhance decision support for surgical centers in Brazil [30]. Kanan et al. applied QFD to analyze private health care services in Saudi Arabia and identified management issues and information flow as key factors [31]. Amran et al. introduced fuzzy multicriteria decision-making (MCDM) for prioritizing critical medical devices in biomedical engineering maintenance services, ensuring the reliability of the hospitals of the Ministry of Health [32]. The HOQ information obtained from surveys completed by experts is exploited in this method.

Alejo-Vilchis et al. applied QFD to identify improvement areas in clinical analysis laboratories based on patient satisfaction data obtained through SERVQUAL questionnaires [33]. Khan et al. focused on building a resilience-based model for the health care sector to fight COVID-19 in Pakistan. This model combines multicriteria decision-making with QFD to prioritize resilient attributes and strategies [7]. Tortorella et al. proposed a problem-oriented methodology using QFD for prioritizing the integration of Industry 4.0 technologies in hospitals, demonstrating its applicability in ranking digital applications [9]. Nie et al. introduced a cloud-supported QFD model to implement a customer-centric approach for improving health care service quality [34].

Fig. 2 Example of Careopinion story

Neira-Rodado et al. addressed challenges in health care technology product design by combining fuzzy Kano, AHP, DEMATEL, and QFD, providing a novel approach for translating customer needs into product characteristics [35]. Salah et al. proposed a QFD-LFA framework to improve patient safety in hospital design [36]. Fauziah et al. analyzed the effect of pharmacy service quality and identified the largest gap between the speed and accuracy of pharmacy staff [37]. These studies offer valuable insights into enhancing the quality of health care services by considering CRs, identifying critical improvement actions, and prioritizing improvement efforts. Furthermore, Raziei et al. presented a hybrid approach that uses group decision-making, service quality measurement, and QFD to prioritize patient needs as quality factors and cope with uncertainties [38].

These studies collectively demonstrate the potential of QFD techniques to improve hospital service quality by prioritizing patient needs, identifying quality characteristics, and enhancing health care processes. Most of these methods have been used for studying specific quality features of a hospital but not for ranking hospitals according to the most critical quality dimensions, as is proposed here.

Although QFD has been widely used, it has limitations mainly due to its manual approach, which can be time-consuming to implement and relies on subjective decision-making. Moreover, customer preferences and/or opinions are often expressed using subjective, vague, or imprecise terms, both numerically and linguistically. Consequently, in QFD, it is challenging to interpret fuzzy human language with the aim of translating it into technical requirements. For this reason, many QFD-based solutions use fuzzy logic-based representations in diverse industries such as supply chains [5], supplier selection [39], strategic selection [40], manufacturing strategies [41], market segment evaluation [42] and health care [43].

3 Proposal

In this section, an application for recommending real hospitals according to user preferences is described. This fuzzy QFD-based approach considers the past experiences of patients as well as their satisfaction degree with respect to the services offered by the hospital.

Unlike traditional QFD mechanisms applied to hospital management, this approach does not rely on experts to determine and assess the primary dimensions of hospitals. Instead, the patient opinions posted on a feedback web platform called Careopinion (see Footnote 1) are sufficient to model and assess all the dimensions of the HOQ. These dimensions are well known in the health care industry, so opinions that address them are easy to find. Furthermore, the opinions can reveal positive or negative feelings toward these dimensions. Hence, it is not only possible to identify the dimensions of the HOQ of the QFD model but also to determine the values of the matrix, i.e., the assessment of the dimensions.

Fig. 3 Steps for modeling the HOQ

The Careopinion website is a platform for patients, where they can share information about their own experiences in a hospital. Each opinion is called a “story” because it includes personal emotions and anecdotes when visiting a hospital. The structure of every story consists of a “Story Summary” section with three subsections covering the patient feelings, positive aspects of the hospital, and areas that could be improved. The “how did you feel” section provides a glimpse into the emotional state of the patient during their hospital visit and can help identify areas of improvement for hospital managers. The “what’s good” section highlights the positive aspects of the patient experience and can serve as an indicator of areas where the hospital is performing correctly. Finally, the “what could be improved” section allows patients to provide feedback on areas needing attention such as waiting times or communication with staff. This last section can be especially useful for detecting DRs to pay special attention to. For instance, if several patients mention that the waiting area is uncomfortable, it is possible to infer that a better waiting area could help increase patient satisfaction.

As shown in Fig. 2, the information about the dimensions and their assessments is available in these opinions. To model the HOQ, it is necessary to detect both the dimension each opinion addresses and the associated feeling. The dimension an opinion concerns is detected by a specifically designed latent Dirichlet allocation (LDA)-based model [44]. To decide whether the opinion toward a dimension is positive or negative, an opinion mining-based algorithm processes the sentences about each dimension and calculates the corresponding polarity.

As one of the goals of the system is to replace human interaction and most of the system information is provided by users, this information is represented by linguistic terms, as is usual. For instance, the most common mechanism for collecting the assessment of the CRs to build the HOQ and gather its inputs is to conduct surveys, interviews, or focus groups with the customers, who may be experts, end-users, or practitioners. Since the values of these variables are mostly linguistic (the dimension may be ‘highly important’ or performance may be ‘very good’), they tend to be uncertain, imprecise, subjective, and vague, so fuzzy logic techniques are appropriate under these circumstances. Therefore, the values of the HOQ are modeled by fuzzy linguistic term sets.

Finally, taking advantage of the information represented in the fuzzy HOQs of the assessed hospitals and considering the user preferences for different CRs, an aggregation method based on the fuzzy suitability index is proposed to determine the final score of all hospitals. The rank of the hospitals is established according to this score. The entire process is depicted in Fig. 3, and the subsequent steps are described in detail below.

This process is repeated for all hospitals that are to be ranked to calculate their different fuzzy suitability indices. The final ranking order is established by comparing the values of the indices.

3.1 Algorithm

3.1.1 Step 1. Identifying the customer requirements (“WHATs”)

To accurately reflect the expectations and needs of patients, identifying the k key CRs (\(CR = \{CR_1, CR_2, CR_3, \cdots , CR_k\}\)) for hospitals is paramount. By conducting a thorough analysis of the most recent and relevant literature [2, 38, 45], the crucial dimensions for assessing the quality of health care services were determined according to the majority of related studies. Moreover, these dimensions play a vital role in shaping DRs for hospitals. Six dimensions were found; therefore, in this study \(k = 6\). The six essential quality dimensions identified for the CRs are:

  • Tangible (\(\varvec{CR_1}\)): This refers to the physical appearance and equipment of the hospital and the staff, including for instance, their uniforms. The goal is to create a professional atmosphere that instills trust in patients.

  • Reliability (\(\varvec{CR_2}\)): This dimension focuses on the consistency and accuracy of the hospital services as well as the ability of the staff to deliver services as needed. Patients can only rely on their health care providers if they deliver high-quality services.

  • Responsiveness (\(\varvec{CR_3}\)): Responsiveness refers to the promptness and willingness of the hospital staff when providing services to patients. This can be demonstrated through the hospital commitment to providing services in a timely manner and sharing diagnostic information promptly with patients.

  • Assurance (\(\varvec{CR_4}\)): This dimension is focused on the level of trust, confidence, and security that patients feel about the provided services. It includes staff competence, knowledge, and skills, as well as staff willingness to help and communicate effectively with the patients.

  • Empathy (\(\varvec{CR_5}\)): This dimension is centered around the caring and understanding attitude of the staff toward the patients. The goal is to create a supportive environment for patients.

  • Professionalism (\(\varvec{CR_6}\)): This dimension is related to the level of skills and expertise of the hospital staff, and their ability to provide services in a manner aligned with patient expectations. Patients expect to receive services from knowledgeable and well-trained staff who adhere to high ethical standards.

3.1.2 Step 2. Determining the relative importance of the customer requirements (“WHATs”)

This approach aims at prioritizing the customer’s voice in the evaluation process. Thus, to determine the importance of the CRs, it is necessary to obtain the information from the user. The easiest way is to ask users directly about their preferences regarding the CRs. The users can assign weights to different dimensions based on their needs, ensuring a transparent and user-centric process. Nonetheless, if the system where this methodology is implemented has information about the user priorities over the CRs, for instance, in the form of textual opinions, it is also possible to develop indirect mechanisms able to analyze these opinions and determine the user preferences in a ‘less subjective’ manner than directly asking the user about them.

Mathematically, these weights are represented by a k-dimensional vector \(W_{CR} = \{ w_1, \cdots , w_k \}\), where k symbolizes the number of CRs. This vector representation provides a comprehensive understanding of customer preferences regarding the six CRs mentioned above in the same order (see Section 3.1.1). The sum of the weights must always be 1.

For the evaluation section, three scenarios are considered for simulating different types of users: (i) a user who considers all dimensions equally important, (ii) a user who considers the “assurance” dimension more relevant, and (iii) a user who considers “reliability” more relevant. In this manner, it is possible to simulate different users and recognize that customer preferences can change over time.

Hence, the first scenario involves all aspects being equal regarding the CRs; i.e., the user does not consider one aspect more important than the others:

$$\begin{aligned} W_{CR1}=\{0.16666, 0.16666, 0.16666, 0.16666, 0.16666, 0.16666\} \end{aligned}$$

In the second scenario, more weight is assigned to the “assurance” dimension; i.e., the user prioritizes this dimension over the others:

$$\begin{aligned} W_{CR2}=\{0.025, 0.025, 0.025, 0.875, 0.025, 0.025\} \end{aligned}$$

In the third scenario, the “reliability” dimension receives the highest weight, i.e., this is the most crucial aspect for the user:

$$\begin{aligned} W_{CR3}=\{0.025,0.875,0.025,0.025,0.025,0.025\} \end{aligned}$$

By evaluating various weight combinations, it is possible to gain a comprehensive understanding of customer preferences and opinions, providing valuable insights for service quality improvement.
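The three weight vectors above can be generated and checked programmatically. The sketch below is illustrative only: the helper names `uniform_weights`, `focused_weights`, and `is_normalized` are assumptions, not part of the paper. It uses the exact value 1/6 for the first scenario, of which the printed 0.16666 is a truncation, so that the sum-to-one constraint holds up to floating-point tolerance.

```python
from typing import List

# CR order follows Section 3.1.1.
CR_NAMES = ["tangible", "reliability", "responsiveness",
            "assurance", "empathy", "professionalism"]


def uniform_weights(k: int) -> List[float]:
    """Scenario (i): every CR receives the same weight 1/k."""
    return [1.0 / k] * k


def focused_weights(k: int, focus: int, focus_weight: float) -> List[float]:
    """Scenarios (ii) and (iii): one CR gets focus_weight, the others share the rest."""
    if not 0.0 <= focus_weight <= 1.0:
        raise ValueError("focus_weight must lie in [0, 1]")
    rest = (1.0 - focus_weight) / (k - 1)
    weights = [rest] * k
    weights[focus] = focus_weight
    return weights


def is_normalized(weights: List[float], tol: float = 1e-9) -> bool:
    """True when all weights are non-negative and sum to 1 within tol."""
    return all(w >= 0 for w in weights) and abs(sum(weights) - 1.0) <= tol


w_cr1 = uniform_weights(len(CR_NAMES))
w_cr2 = focused_weights(len(CR_NAMES), CR_NAMES.index("assurance"), 0.875)
w_cr3 = focused_weights(len(CR_NAMES), CR_NAMES.index("reliability"), 0.875)
```

With a focus weight of 0.875, the remaining five CRs each receive (1 - 0.875) / 5 = 0.025, matching the second and third scenarios.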

3.1.3 Step 3. Identifying the relevant aspects for the design requirements (“HOWs”)

To determine the m DRs (\(DR = \{DR_1, DR_2, \cdots , DR_m\}\)) of the HOQ, a comprehensive manual analysis of the patient reviews available on Careopinion was carried out. This analysis included identifying the key areas of concern and opportunities for improvement highlighted by patient opinions. In addition, the subsections “how did you feel”, “what’s good”, and “what could be improved” provided very relevant information. In particular, “what could be improved” was especially relevant, as it highlights the specific areas that patients believe need improvement. These findings were used to identify the DRs (“HOWs”) necessary to enhance the quality of hospital services. In total, eight DRs were identified, and consequently, \(m=8\). The requirements found largely match those used in the literature [2, 46, 47]:

  • Staff (\(\varvec{DR_{1}}\)): The competency, knowledge, and attitudes of health care providers, including doctors and nurses, play crucial roles in delivering high-quality services to patients. To enhance this aspect, hospitals can focus on ongoing training and development programs, hiring experienced and knowledgeable personnel, and fostering a supportive and empathetic workplace culture.

  • Care (\(\varvec{DR_{2}}\)): The level of compassion, empathy, and attention given to patients is critical to providing high-quality care. Hospitals can improve this aspect by promoting patient-centered care, encouraging open and effective communication between staff and patients, and regularly gathering feedback from patients to address areas of improvement.

  • Facilities (\(\varvec{DR_{3}}\)): The physical environment of the hospital, including the design and maintenance of buildings, equipment, and services, is important for developing a welcoming and safe environment for patients. Hospitals can enhance this aspect by investing in modern and well-maintained facilities, ensuring cleanliness and hygiene, and providing a comfortable and accessible environment.

  • Confidence (\(\varvec{DR_{4}}\)): Confidence refers to the trust, belief, and sense of security that patients have regarding a hospital and its services. It is built upon the reliability, preventative measures, and experience of the hospital in providing safe and high-quality care. Hospitals can promote patient confidence by demonstrating expertise in their field, ensuring the safety and security of patients through robust protocols and procedures, proactively addressing concerns and disappointments, and providing reliable and preventative care measures.

  • Services (\(\varvec{DR_{5}}\)): The quality and responsiveness of the services provided to patients, such as diagnosis, treatment, and recovery support, are important factors for promoting patient satisfaction. Hospitals can enhance this aspect by investing in advanced technology, promoting timely and effective treatment, and providing a patient-centered approach to care.

  • Timeliness (\(\varvec{DR_{6}}\)): The promptness and efficiency of healthcare services are critical for ensuring that patients receive care in a timely manner. Hospitals can improve this aspect by streamlining processes, investing in technology, and providing adequate staffing to ensure that patients receive prompt and effective care.

  • Efficiency (\(\varvec{DR_{7}}\)): The effective use of resources and reduction of waste are important in promoting cost-effectiveness and sustainability in health care services. Hospitals can enhance this by promoting resource optimization, reducing waste, and improving operational processes.

  • Transparency (\(\varvec{DR_{8}}\)): This dimension refers to open, honest, and clear communication, disclosure, and clarity of information and processes in a hospital. It encompasses openness and honesty in sharing information, reviewing procedures, and addressing questions or concerns raised by patients. Transparency involves providing accurate and accessible information, promoting openness in practices and procedures, and actively seeking feedback from patients.

3.1.4 Step 4. Determining the CR-DR (“WHAT”-“HOW”) relationship scores

To calculate the relationship degrees between the CRs and the DRs in the HOQ, the most common solution is to consult x relevant experts in the health care field. The opinion of each expert e can be represented by a \(k \times m\) matrix where each value of the matrix \(r_{ij}\) represents the relationship degree of the i-th CR with respect to the j-th DR:

$$\begin{aligned} matrix_e=\{r_{ij} \mid i=1,\ldots ,k \text { and } j=1,\ldots ,m\} \end{aligned}$$

where \(k=\) number of CRs (“WHATs”), and \(m=\) number of DRs (“HOWs”).

In the literature, the expert opinions are usually expressed by linguistic terms such as ‘very low’ or ‘very high’. Therefore, it is necessary to mathematically represent these opinions. The most natural mechanism to do this is fuzzy logic. Thus, these linguistic labels can be associated with balanced triangular fuzzy numbers (TFNs).

Therefore, the system represents the opinions of several experts with several fuzzy matrices that must be merged into a unique fuzzy relationship matrix that accurately represents the overall relationship between each pair of a CR and DR. To determine the overall relationship between CRs and DRs, it is necessary to aggregate all opinion matrices from all experts:

$$\begin{aligned} relationship\_matrix = matrix_{1} \oplus matrix_2 \oplus \ldots \oplus matrix_x \end{aligned}$$

To do so, the individual fuzzy values \(r_{ije}\) of all the matrices, which represent the opinions of decision-maker e regarding the i-th CR and the j-th DR, must be aggregated into a unique value \(r_{ij}\):

$$\begin{aligned} r_{ij} = r_{ij1} \oplus r_{ij2} \oplus \ldots \oplus r_{ijx} \end{aligned}$$

where \(x\) is the number of decision-makers. Thus, the final matrix values \(r_{ij}\) are the aggregation of the fuzzy values from each expert and represent the relationship between the \(i\)-th “WHAT” aspect and the \(j\)-th “HOW” aspect. Because they result from aggregating fuzzy sets, these matrix items are also fuzzy sets.

This approach not only improves the interpretability of the relationship matrix but also addresses the differences and ambiguity associated with the opinions of experts, providing an accurate and clear representation of the relationship between CRs and DRs.
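The aggregation of expert matrices can be sketched as follows. This is a minimal illustration under stated assumptions: \(\oplus\) is taken to be the usual component-wise addition of TFNs, and the optional division by the number of experts (to keep the result on the same scale as the inputs) is an assumption, since the text does not spell out a normalization. The function names are hypothetical.

```python
from typing import List, Tuple

TFN = Tuple[float, float, float]  # triangular fuzzy number (lower, peak, upper)
FuzzyMatrix = List[List[TFN]]     # k rows (CRs) x m columns (DRs)


def tfn_add(a: TFN, b: TFN) -> TFN:
    """Fuzzy addition of two TFNs: component-wise sum."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def tfn_scale(a: TFN, c: float) -> TFN:
    """Multiply a TFN by a non-negative crisp scalar."""
    if c < 0:
        raise ValueError("scalar must be non-negative to preserve TFN ordering")
    return (a[0] * c, a[1] * c, a[2] * c)


def aggregate_experts(matrices: List[FuzzyMatrix], average: bool = True) -> FuzzyMatrix:
    """Merge x expert matrices into one k x m relationship matrix.

    With average=False the result is the plain fuzzy sum r_ij1 (+) ... (+) r_ijx;
    with average=True (an assumption) the sum is divided by x.
    """
    x = len(matrices)
    if x == 0:
        raise ValueError("at least one expert matrix is required")
    k, m = len(matrices[0]), len(matrices[0][0])
    result: FuzzyMatrix = []
    for i in range(k):
        row: List[TFN] = []
        for j in range(m):
            total: TFN = (0.0, 0.0, 0.0)
            for e in range(x):
                total = tfn_add(total, matrices[e][i][j])
            row.append(tfn_scale(total, 1.0 / x) if average else total)
        result.append(row)
    return result
```

Because addition and non-negative scaling both map TFNs to TFNs, every entry of the aggregated matrix remains a TFN, as the text states.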

Fig. 4 The linguistic term set used in this study

Fig. 5 DR detection algorithm

This process is widely used in the literature [37, 47, 48], and the expert opinions reported there recur across many studies. It is therefore possible to extract the most frequently repeated opinions for the CRs and DRs identified in Sections 3.1.1 and 3.1.3. In the experimental section, these recurring opinions stand in for three different experts when calculating the overall relationship matrix. Consequently, in the experimental section, the number of CRs is \(k=6\) (see Section 3.1.1), the number of DRs is \(m=8\) (see Section 3.1.3), and the number of experts is \(x=3\).

Furthermore, the opinions of these experts in the literature are usually expressed using the following labels: ‘very very low (VVL)’, ‘very low (VL)’, ‘low (L)’, ‘medium-low (ML)’, ‘medium-high (MH)’, ‘high (H)’, ‘very high (VH)’, and ‘very very high (VVH)’. Hence, these labels can be modeled mathematically by a scheme of fuzzy sets, as shown in Fig. 4. This scheme will be followed for the case studies shown in the experimental section. For instance, if the expert considers the relationship between the CR “empathy” and DR “staff” to be very very high (VVH), then it would be mathematically represented by the TFN = (0.875, 1, 1).
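A balanced set of TFNs on [0, 1] can be generated from an ordered list of labels, as in the following sketch. It rests on one assumption that should be read with care: the quoted value (0.875, 1, 1) for VVH is what an evenly spaced partition with nine terms yields, so a middle term, written here as "M" (medium), is assumed in addition to the eight labels listed above; Fig. 4 remains the authoritative definition. The function name is hypothetical.

```python
from typing import Dict, List, Tuple

TFN = Tuple[float, float, float]  # (lower, peak, upper)


def balanced_term_set(labels: List[str]) -> Dict[str, TFN]:
    """Map ordered labels to evenly spaced TFNs on [0, 1], clipped at the ends."""
    n = len(labels)
    if n < 2:
        raise ValueError("at least two labels are needed")
    step = 1.0 / (n - 1)  # distance between neighboring peaks
    terms: Dict[str, TFN] = {}
    for i, label in enumerate(labels):
        peak = i * step
        terms[label] = (max(0.0, peak - step), peak, min(1.0, peak + step))
    return terms


# Nine labels: the eight from the text plus an assumed middle term "M".
LABELS = ["VVL", "VL", "L", "ML", "M", "MH", "H", "VH", "VVH"]
TERMS = balanced_term_set(LABELS)
```

Under this assumption, `TERMS["VVH"]` evaluates to `(0.875, 1.0, 1.0)`, in agreement with the example in the text.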

3.1.5 Step 5. Calculating the weights of the design requirements (“HOWs”)

To finalize the HOQ, it is necessary to calculate the weights of the DRs as \(W_{DR} = \{w_j \mid j = 1,\ldots ,m\}\). This is done by averaging the weighted relationship scores between the CRs and DRs, using the values of the \(relationship\_matrix= \{ r_{ij} \mid i=1,\ldots ,k \text { and } j=1,\ldots ,m\}\) (see Section 3.1.4) and the vector \(W_{CR}\) (see Section 3.1.2), whose \(i\)-th component is denoted by \(w_{CRi}\):

$$\begin{aligned} w_{j}= \frac{1}{k} \otimes [(r_{1j}\otimes w_{CR1}) \oplus \ldots \oplus (r_{kj} \otimes w_{CRk})] \end{aligned}$$

As a result, the vector \(W_{DR}\) represents the importance of each hospital design requirement. The values of the vector are again triangular fuzzy numbers.

Steps 5 and 6 can be executed in parallel, as shown in Fig. 3.
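The DR weighting of Step 5 can be illustrated with a short sketch. It assumes that each \(r_{ij}\) is a TFN stored as a 3-tuple, that the CR weights are crisp numbers, that \(\otimes\) with a crisp weight is component-wise multiplication, and that the average is taken over the CR index (division by the number of rows of the relationship matrix); the normalizing constant only rescales all DR weights uniformly. The function name `dr_weights` is hypothetical.

```python
from typing import List, Tuple

TFN = Tuple[float, float, float]  # (lower, peak, upper)


def dr_weights(relationship_matrix: List[List[TFN]], w_cr: List[float]) -> List[TFN]:
    """Return one fuzzy weight per DR (column) from a k x m fuzzy matrix.

    For column j: sum over CRs i of r_ij scaled by the crisp weight w_cr[i],
    then divided by k (the number of CRs; assumed averaging index).
    """
    k = len(relationship_matrix)
    if k == 0 or len(w_cr) != k:
        raise ValueError("need one crisp weight per CR row")
    m = len(relationship_matrix[0])
    weights: List[TFN] = []
    for j in range(m):
        total = [0.0, 0.0, 0.0]
        for i in range(k):
            r = relationship_matrix[i][j]
            for c in range(3):
                total[c] += r[c] * w_cr[i]  # crisp scalar times TFN component
        weights.append((total[0] / k, total[1] / k, total[2] / k))
    return weights
```

For the case studies, the input would be the 6 x 8 aggregated relationship matrix and one of the three scenario vectors, giving eight TFNs.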

3.1.6 Step 6. Computing the rating of a hospital with respect to the design requirements (“HOWs”)

After the weights of each DR are calculated, the next step consists of evaluating each hospital with respect to each DR. Most studies applying QFD rely on assessments from experts or decision-makers through questionnaires or surveys; nevertheless, in this research, the evaluation of the quality of hospital services is based on patient reviews, exploiting the information available on the internet from real customers, i.e., patients, and avoiding the search for experts for this type of task. Patient reviews provide a comprehensive and nuanced understanding of the quality of the services delivered by hospitals, highlighting areas of improvement that may have gone unnoticed. This methodology leads to a more accurate and well-rounded evaluation of hospital services, providing valuable insights for both health care providers and patients. The proposed methodology for evaluating the quality of hospitals consists of two phases:

Phase A. Design requirement detection

To detect the DRs available in the patient opinions, the algorithm used in [49] was applied. First, the reviews from the dataset were preprocessed, and an LDA-based method was used to identify the most relevant topics. Sixteen topics were computed and manually analyzed to determine which one belonged to each DR criterion according to the most descriptive terms. The procedure is depicted in Fig. 5, and an excerpt of the most representative terms of each topic, which were assigned to each DR criterion manually, is presented in Table 1. Through this approach, any sentence belonging to one of the 16 topics is automatically grouped into the corresponding DR related to that topic.
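The grouping step at the end of this phase can be sketched as follows. The sketch assumes that a topic model has already assigned a topic identifier to every sentence; the topic-to-DR assignment shown is purely hypothetical and stands in for the manual mapping described above, and the function name is illustrative.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical manual assignment of topic ids to DR criteria.
TOPIC_TO_DR: Dict[int, str] = {
    0: "staff",
    1: "staff",
    2: "care",
    3: "facilities",
    4: "timeliness",
}


def group_by_dr(labelled: Iterable[Tuple[str, int]],
                topic_to_dr: Dict[int, str]) -> Dict[str, List[str]]:
    """Group sentences by the DR associated with their topic id."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for sentence, topic in labelled:
        dr = topic_to_dr.get(topic)
        if dr is not None:  # sentences with unmapped topics are skipped
            groups[dr].append(sentence)
    return dict(groups)


# Toy input: (sentence, topic id) pairs as a topic model might produce them.
sentences = [
    ("The nurses were kind.", 0),
    ("I waited four hours.", 4),
    ("Reception staff were helpful.", 1),
]
grouped = group_by_dr(sentences, TOPIC_TO_DR)
print(grouped)
```

Several topics may map to the same DR, which is why sixteen topics can be reduced to a smaller set of criteria.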

Table 1 Example characteristic terms for each DR

Although this method provides helpful information semiautomatically, i.e., the clusters are generated automatically, a user must still classify the clusters manually into the corresponding DRs; nevertheless, it is possible to find other alternatives that also involve user interaction. For example, Wu et al. provided a BERT-based model that automatically detects the features or design requirements from opinions; the major drawback of this approach is that it is necessary to generate a large set of instances that are manually labeled to be able to tune the BERT model to detect them [50].

Phase B. Design requirement polarity detection

Having successfully identified the eight DRs, it is crucial to calculate the sentiment orientation conveyed by the reviews regarding each DR. Understanding the polarity for each individual criterion will help us understand the overall opinion about a hospital.

To compute these polarities, the input of the algorithm is the set of sentences grouped according to each DR, enabling the calculation of the sentiment degree of the opinions toward that particular criterion. In this case, the Valence Aware Dictionary and sEntiment Reasoner (VADER) is the tool selected to compute the polarity of every sentence. The polarity \(P_{DR_j}\) regarding a specific aspect \(DR_j\) of a hospital is calculated as the average of the VADER compound sentiment values of all sentences in the set \(S_{DR_j}\) assigned to that aspect. Mathematically, the overall polarity for each \(DR_j\) of a hospital is computed as follows:

$$\begin{aligned} P_{hospital}= & {} \{ P_{DR_{j}} \mid j=1, \ldots , m \} \\ P_{DR_j}= & {} \frac{1}{ \mid S_{DR_{j}} \mid } \sum _{s_i \in S_{DR_{j}} } VADER(s_i) \end{aligned}$$

With the goal of modeling the polarity value of each aspect \(DR_j\), a fuzzy label set can be used. This fuzzy label set includes labels represented by a TFN with appropriate membership degrees (see Fig. 4). Figure 6 illustrates the mechanism for computing the polarity for each aspect DR represented by a TFN.
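A minimal sketch of this phase is given below. It makes several assumptions that are not stated in the text: the sentence scorer is passed in as a function returning a value in [-1, 1] (as a VADER compound score would be), the average is rescaled linearly to [0, 1], the label scheme consists of nine evenly spaced TFNs (only VVH = (0.875, 1, 1) is taken from the text), and the closest label is chosen by the distance to the modal value.

```python
from typing import Callable, Dict, Sequence, Tuple

TFN = Tuple[float, float, float]  # (lower, modal, upper)

# Assumed label scheme; only the VVH entry is given in the text.
LABELS: Dict[str, TFN] = {
    "VVL": (0.0, 0.0, 0.125),
    "VL": (0.0, 0.125, 0.25),
    "L": (0.125, 0.25, 0.375),
    "ML": (0.25, 0.375, 0.5),
    "M": (0.375, 0.5, 0.625),
    "MH": (0.5, 0.625, 0.75),
    "H": (0.625, 0.75, 0.875),
    "VH": (0.75, 0.875, 1.0),
    "VVH": (0.875, 1.0, 1.0),
}


def dr_polarity(sentences: Sequence[str],
                score: Callable[[str], float]) -> float:
    """Average the sentence scores of one DR and rescale to [0, 1]."""
    if not sentences:
        raise ValueError("no sentences available for this DR")
    mean = sum(score(s) for s in sentences) / len(sentences)
    return (mean + 1.0) / 2.0  # assumed mapping from [-1, 1] to [0, 1]


def closest_label(value: float, labels: Dict[str, TFN] = LABELS) -> str:
    """Return the label whose modal value is nearest to the crisp value."""
    return min(labels, key=lambda name: abs(labels[name][1] - value))


# Toy scores standing in for the output of a sentiment tool.
toy_scores = {"Staff were wonderful.": 0.75,
              "Nobody answered the bell.": -0.25}
p = dr_polarity(list(toy_scores), toy_scores.__getitem__)
label = closest_label(p)
print(p, label, LABELS[label])
```

Passing the scorer as a parameter keeps the sketch independent of any particular sentiment library.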

Fig. 6 The mechanism for calculating the polarity for each DR

3.1.7 Step 7. Fuzzy suitability index calculation

Finally, the fuzzy suitability index (FSI) for each hospital is calculated to determine the extent to which each hospital satisfies the requirements [51]. This method is an aggregation mechanism suitable for dealing with TFNs. In particular, the FSI of a hospital is represented as a TFN, which is determined by multiplying the previously calculated polarity score \(P_{hospital} = \{ P_{DR_{j}} \mid j=1, \ldots , m \}\) by the corresponding weights \(W_{DR} = \{w_1, \ldots , w_m\}\) (see Section 3.1.5) for each DR criterion:

$$\begin{aligned} FSI = \frac{1}{m} \otimes [( P_{DR_{1}} \otimes w_1) \oplus \ldots \oplus (P_{DR_{m}} \otimes w_m)] \end{aligned}$$

This final value, represented by a fuzzy set, indicates how well a hospital meets the expectations of a user. As the goal of the application is to recommend the most suitable hospital in a set of hospitals \(H= \{H_1, H_2, \ldots \}\), the order of this set depends on the FSI of each hospital. As the fuzzy values are represented by triangular fuzzy numbers \( <FN_\alpha ,FN_\beta , FN_\gamma > \), to compare the FSIs of all hospitals, these values are defuzzified by the following formula:

$$\begin{aligned} \frac{FN_{\alpha }+2FN_{\beta }+FN_{\gamma }}{4} \end{aligned}$$

These crisp values can then be ranked to set the final order of the hospitals according to the user’s preferences.
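The aggregation and ranking described in this step can be sketched as follows. The sketch assumes that both the polarities and the DR weights are non-negative TFNs and uses the common component-wise approximation of the fuzzy product; the hospital names, toy values and function names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

TFN = Tuple[float, float, float]  # (lower, modal, upper)


def tfn_add(a: TFN, b: TFN) -> TFN:
    """Fuzzy addition of two TFNs (component-wise)."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def tfn_mul(a: TFN, b: TFN) -> TFN:
    """Approximate fuzzy product, valid for non-negative TFNs."""
    return (a[0] * b[0], a[1] * b[1], a[2] * b[2])


def fsi(polarities: Sequence[TFN], weights: Sequence[TFN]) -> TFN:
    """Average of the weighted polarities over the m design requirements."""
    m = len(weights)
    total: TFN = (0.0, 0.0, 0.0)
    for p, w in zip(polarities, weights):
        total = tfn_add(total, tfn_mul(p, w))
    return (total[0] / m, total[1] / m, total[2] / m)


def defuzzify(t: TFN) -> float:
    """Crisp value (alpha + 2 * beta + gamma) / 4 of a TFN."""
    return (t[0] + 2.0 * t[1] + t[2]) / 4.0


def rank_hospitals(polarities: Dict[str, Sequence[TFN]],
                   weights: Sequence[TFN]) -> List[str]:
    """Order hospital names from the highest to the lowest crisp FSI."""
    return sorted(polarities,
                  key=lambda h: defuzzify(fsi(polarities[h], weights)),
                  reverse=True)


# Toy data: two hypothetical hospitals and two DRs.
weights = [(0.25, 0.5, 0.75), (0.5, 0.5, 0.5)]
polarities = {
    "H_A": [(0.5, 0.75, 1.0), (0.5, 0.5, 0.5)],
    "H_B": [(0.25, 0.5, 0.75), (0.0, 0.25, 0.5)],
}
fsi_a = fsi(polarities["H_A"], weights)
ranking = rank_hospitals(polarities, weights)
print(fsi_a, defuzzify(fsi_a), ranking)
```

Because the defuzzification formula gives double weight to the modal value, two hospitals with the same support but different peaks obtain different crisp scores.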

3.2 Summary of mathematical notations and operations of the proposal

The previous subsections describe the entire process to rank the hospitals. Next, to synthesize all steps, the most important mathematical definitions and operations to be considered are summarized.

Customer requirements

The k CRs (\(CR = \{CR_1, CR_2, \cdots , CR_k\}\)) are determined by experts. Overall, in most of the available studies, six customer requirements are used when assessing the quality of a hospital. These requirements (tangible, reliability, responsiveness, assurance, empathy and professionalism) are used here as explained in Section 3.1.1, and consequently, \(k = 6\).

Relative importance of the CRs

The user must determine which CRs are the most important to them. To do so, the user manually assigns a value in the range [0, 1] to each CR in the vector \(W_{CR}= \{ w_1, \cdots , w_k \}\). The sum of all weights must be 1.

Design requirements

They are represented by an m-dimensional vector \(DR = \{DR_1, DR_2, DR_3, \cdots , DR_m\}\). A manual analysis of the dataset shows that the requirements found are the same eight that usually appear in the literature. These requirements (staff, care, facilities, confidence, services, timeliness, efficiency and transparency) are described in Section 3.1.3, and consequently, \(m = 8\).

Experts

It is usual to resort to a set of experts \(E = \{e_1, e_2, \cdots , e_x \}\) to assess the relationship degrees between the CRs and the DRs. In most of the studies, the number of experts is usually three (\(x = 3\)).

Linguistic term set

The experts express the relationship degrees by linguistic labels such as ‘very good’ or ‘quite bad’; therefore, a linguistic term set like the one depicted in Fig. 4 must be defined to model the experts’ opinions.

CR-DR relationship score matrix

Each expert e expresses the relationship score between the i-th CR and the j-th DR in a \(k \times m\) matrix using the previously selected linguistic term set: \( matrix_e=\{r_{ij} \mid i=1,\ldots ,k \text { and } j=1,\ldots , m\}\).

Relationship matrix

The final opinion over the relationship scores between the CRs and the DRs depends on all the experts’ opinions. Hence, all the score matrices are aggregated into a final matrix: \(relationship\_matrix = matrix_{1} \oplus matrix_2 \oplus \ldots \oplus matrix_x\).

Importance of the DRs

As explained in Section 3.1.5, the importance of the DRs (\(W_{DR} = \{w_1, w_2, \cdots , w_m \}\)) can be computed by averaging the \(relationship\_matrix= \{ r_{ij} \mid i=1,\ldots ,k \text { and } j=1,\ldots ,m\}\) and the vector \(W_{CR}\) following the formula: \(w_{j}= \frac{1}{k} \otimes [(r_{1j} \otimes w_{CR1}) \oplus \cdots \oplus (r_{kj} \otimes w_{CRk})]\).

Evaluation of a hospital regarding the DRs

Following the algorithm described in [49], the evaluation/polarity of a hospital regarding each DR is calculated as a vector \( P_{hospital} = \{ P_{DR_{j}} \mid j=1, \ldots , m \} \).

First, from the opinions of the hospital available in the dataset, all sentences \(s_i\) talking about the different DRs are detected by an LDA-based topic modeling algorithm, and then, the polarity is calculated as the aggregation of the individual polarities of each sentence \(P_{DR_j} = \frac{1}{ \mid S_{DR_{j}} \mid } \sum _{s_i \in S_{DR_{j}} } VADER(s_i)\). To calculate the individual polarities, the well-known tool VADER has been used.

As a real user would usually rate a hospital using linguistic labels rather than crisp values, a fuzzification method must be selected to codify the polarity value according to a previously selected linguistic term set. Thus, each \(P_{DR_j}\) is finally represented by a triangular fuzzy number.

Final score of a hospital

The final score given to a hospital considers the opinion over each DR (\(P_{hospital} = \{ P_{DR_{j}} \mid j=1, \ldots , m \}\)) and the corresponding importance of each DR (\(W_{DR} = \{w_1, \ldots , w_m\}\)). Considering both vectors [51], the FSI computes the final score of the hospital represented by a TFN (\( <FN_\alpha ,FN_\beta , FN_\gamma > \)): \(FSI = \frac{1}{m} \otimes [( P_{DR_{1}} \otimes w_1) \oplus \ldots \oplus (P_{DR_{m}} \otimes w_m)]\).

Final ranking

Once the FSI of each hospital (\(H= \{H_1, H_2, \ldots \}\)) to be assessed is calculated, these values can be defuzzified by the formula: \(\frac{FN_{\alpha }+2FN_{\beta }+FN_{\gamma }}{4}\). These values allow ranking the hospitals. The highest value indicates the most recommendable hospital and the lowest value the least recommendable one.

Fig. 7 The HOQ model results after the first four steps

Table 2 List of hospitals and numbers of retrieved opinions for case study 1

4 Experiments

4.1 Experimental setup

To test the performance of the model, two case studies are proposed for assessing two different sets of hospitals. Furthermore, for every case study, three types of users are represented depending on their preferences toward the CRs. These preferences are modeled by the vectors explained in Section 3.1.2.

To assess the quality of the results of the proposed model, it is necessary to compare the generated ranking against the gold standard or ideal ranking. To compute the ideal ranking, the assessments of the hospitals from Careopinion were used, as explained in Section 5.1 of [49]. These assessments are not obtained from the textual opinions but from the keywords provided by the users in the subsections of the opinions (“how did you feel”, “what’s good”, and “what could be improved”), which are interpreted as direct user evaluations of the different hospital aspects. Apart from this ranking, the model for ranking hospitals proposed in [49], which is based on the textual information opinions, was also implemented for comparison.

To measure how close every proposal is to the ideal baseline, a ranking comparison metric, Spearman’s footrule, is used. This measure is based on the differences between the orders of the n items of the rankings being compared.

$$\begin{aligned} Spearman (X,Y) = \sum _{i=1}^{n} \mid rank(x_i) - rank(y_i) \mid \end{aligned}$$
(1)

where \(X=\left\{ x_{1}, x_{2}, \cdots , x_{n}\right\} \) and \(Y=\left\{ y_{1}, y_{2}, \cdots , y_{n} \right\} \) are the two compared rankings. This value can be normalized to the range [0, 1] by dividing it by the maximum value that the sum can take for rankings of n items. A score of 0 indicates that the two lists are identical, whereas a score of 1 indicates that the rankings are completely different.
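The metric can be sketched in a few lines. The normalization constant used here, \(\lfloor n^2/2 \rfloor\), is the largest value the footrule can take for n items (reached when one ranking is the reverse of the other); using it is an assumption of this sketch, as are the function name and the toy rankings.

```python
from typing import Sequence


def footrule(x: Sequence[str], y: Sequence[str],
             normalize: bool = True) -> float:
    """Spearman's footrule between two rankings of the same items."""
    if len(set(x)) != len(x) or set(x) != set(y) or len(x) != len(y):
        raise ValueError("both rankings must contain the same unique items")
    position_in_y = {item: rank for rank, item in enumerate(y)}
    distance = sum(abs(rank - position_in_y[item])
                   for rank, item in enumerate(x))
    if not normalize:
        return float(distance)
    max_distance = (len(x) ** 2) // 2  # attained by the reversed ranking
    return distance / max_distance if max_distance else 0.0


# Toy rankings of seven generic items.
ideal = ["A", "B", "C", "D", "E", "F", "G"]
one_swap = ["A", "B", "D", "C", "E", "F", "G"]  # C and D exchanged
reverse = list(reversed(ideal))

print(footrule(ideal, ideal))                      # identical rankings
print(footrule(ideal, one_swap, normalize=False))  # raw distance
print(footrule(ideal, reverse))                    # reversed ranking
```

Exchanging two adjacent items contributes a raw distance of 2, so a ranking that differs from the ideal one by a single adjacent swap stays close to 0 after normalization.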

4.2 Case study 1

The first case study uses opinions provided by previous patients on Careopinion for the seven hospitals listed in Table 2.

As mentioned previously, three scenarios are assessed. For simplicity, the case of a balanced user, i.e., one whose preferences are represented by the vector \(W_{CR1}=\{0.16666, 0.16666, 0.16666, 0.16666, 0.16666, 0.16666\}\) (see Section 3.1.2), is explained step by step next, and only the results of the remaining cases are given.

Fig. 8 The fuzzy-HOQ

Table 3 Polarity results for the DR criteria

Using the dataset shown in Table 2, first, the CRs or “WHATs” were identified through the analysis of the relevant available literature (see Section 3.1.1). Second, the relative importance of each CR was determined using three scenarios (see Section 3.1.2). Third, the DRs or “HOWs” were identified by conducting a comprehensive analysis of the relevant literature, a manual analysis of patient reviews on the Careopinion website and an evaluation of user experiences (see Section 3.1.3). Fourth, the CR-DR relationship scores were determined by gathering multiple expert perspectives as explained in Section 3.1.4. A graphical representation of the HOQ model showing the results after the first four steps is presented in Fig. 7. The column “relative importance for the CR” represents the weights for scenario 1 (see Section 3.1.2) and the remaining cells represent the assessments of the three experts conveyed in linguistic terms. For instance, the cell “tangible staff” was assessed as high, high, and medium by the three experts. Mathematically, these labels are processed as the associated fuzzy sets, shown in Fig. 4. Depending on the experts’ assessments, other fuzzy sets could be used.

Finally, the HOQ was completed by calculating the DR weights by averaging the weighted relationship values between the CRs and DRs (see Section 3.1.5). This process utilized both the relationship scores and the weights assigned to the CRs. Additionally, to represent the relationship between the CRs and the DRs, the fuzzy sets depicted in Fig. 4 were employed. The result of this process is the fuzzy-HOQ model shown in Fig. 8, which shows the outcomes after the five steps of the methodology. In this case, the cells represent the final values of the calculated triangular fuzzy sets, with every triangle having the form \( <FN_\alpha ,FN_\beta , FN_\gamma > \). For instance, the value for the cell “staff professionalism” is the fuzzy set “VVH” from Fig. 4.

The next phase (see Section 3.1.6) involved computing the ratings of the hospitals with respect to the DRs or “HOWs”. The topic modeling technique LDA [44] was used to detect the DR aspects from the opinions, and VADER was used to calculate the polarity for each hospital. Table 3 presents the polarity results which include the average compound value per DR criterion and hospital. The polarity values are expressed in the range [0, 1], where 0 is the most negative value, 0.5 is a neutral value and 1 is the most positive value.

As these polarity values are numeric, they do not represent the real linguistic expression a user would use to convey their opinion; for that reason, these values have been fuzzified as the closest label associated with a scheme of fuzzy sets such as the one shown in Fig. 4 for the variable polarity. Each label is able to capture the variations in sentiments and ensure a comprehensive and interpretable representation. The linguistic values obtained by computing the polarities of all aspects for each DR criterion of each hospital are displayed in Table 4.

Table 4 Polarity values for the DRs of each hospital
Table 5 Hospitals ranking
Fig. 9 FSI values for the hospitals in case study 1

The last step (see Section 3.1.7) consists of calculating the FSI of each hospital using the outputs of steps 5 and 6. These values are useful to determine the final hospital recommendations for the user. Table 5 shows the FSIs of each hospital and the rankings. The highest FSI value indicates the most recommendable hospital. In addition to the results for scenario 1, the table shows the final rankings for the other 2 scenarios when “reliability” and “assurance” are priorities for the user. The most relevant hospital is highlighted in bold for each case.

Fig. 10 FSI values of the hospitals in case study 2

By analyzing the results from Table 5 and Fig. 9, it is evident that consumers’ preferences play a vital role in determining the overall ranking. In the first scenario, where all aspects were given the same weight, H4 was ranked first, followed by H6 and H3. However, when the “assurance” dimension was given the highest weight in the second scenario, H6 was ranked first, followed by H4 and H3. In the third scenario, “reliability” was the most important aspect and H4 was the most highly ranked hospital, followed by H6 and H5, whereas H3 was ranked fourth. Overall, analyzing the opinions, hospitals H4 and H6 are the best rated; consequently, they always appear in the first positions because the best polarity values are obtained for each DR in general (see Table 3). Hospitals H2, H7 and H5 obtained the lowest polarity scores for the DRs; for that reason, their positions can change slightly depending on user preferences; nonetheless, they are still the worst-scored hospitals.

4.3 Case study 2

In the second case study, this fuzzy QFD-based approach was extended to seven additional hospitals to validate its stability and adaptability for different health care settings. The patient opinions from Careopinion for the hospitals presented in Table 6 again served as the basis for implementing the proposed approach.

Table 6 List of hospitals and number of retrieved opinions for case study 2

This implementation not only tested the stability of the methodology but also confirmed its effectiveness in different hospitals. Table 7 shows the final ranking of the hospitals according to this methodology under the different scenarios.

A comprehensive analysis of the results presented in Table 7 and Fig. 10 yielded significant findings regarding the effect of customer preferences on the overall ranking of hospitals. In the first scenario, where equal weights were assigned to all aspects, H4 had the top position, closely followed by H7 and H1. In the second scenario, where the “assurance” dimension was given the highest weight, H7 achieved the top position, followed by H4 and H6. In the third scenario, where “reliability” was prioritized over the other aspects, H4 obtained the top rank, followed by H7 and H1.

Notably, hospitals H4 and H7 were consistently ranked as top-rated establishments. In contrast, hospitals H2, H3, and H5 changed positions slightly depending on user preferences but consistently remained the lowest-ranked hospitals.

Table 7 Ranking of hospitals in case study 2
Table 8 Results for the different scenarios

4.4 Comparative analysis

As mentioned previously, to test the effectiveness of the proposal, a gold standard and other approaches for ranking hospitals were implemented [49]. The proposed method demonstrated its effectiveness and superiority over other approaches when evaluating the service quality of hospitals. The most notable difference with respect to [49] lies in the consideration of CRs in the ranking process. Unlike [49], the proposed approach integrates CRs, considering their impact on the overall ranking of hospitals. This additional dimension provides a more patient-centric evaluation and corresponds with user perspectives. Moreover, the relationship between CRs and DRs is a crucial factor in our method, enhancing the accuracy of the ranking process. While [49] does not delve into these aspects, our approach systematically incorporates them, contributing to a more robust and insightful evaluation.

Therefore, the proposed method was used in the same context as described in [49], yielding the rankings shown in Table 8 for case studies 1 and 2.

In the first case study, the rankings are practically the same across the different scenarios: the ideal ranking, the proposed method and the comparison method differ by one position at most. In the second case study, there are a few more differences between the rankings; nonetheless, the proposed method is closer to the ideal one. This can be corroborated by calculating Spearman’s footrule metric.

As shown in Table 9, in both case studies, the proposed method consistently demonstrated superior performance, achieving lower average Spearman’s footrule values than the method in [49]. For instance, in the first case study, the proposed method attained an average footrule of 0.17, while the method in [49] had a higher average of 0.28. Similarly, in case study 2, the average Spearman’s footrule value of the proposed method was 0.14, compared with 0.25 for the other method. These results show that the rankings produced by the proposed model are very close to the ideal ranking in both case studies, which supports its effectiveness.

Table 9 Comparison in terms of Spearman’s footrule metric

To analyze the results in more depth, it is necessary to focus on the proportion of positive and negative sentences per aspect and their importance. Looking at the extremes of the rankings, in case study 1 the best hospitals are H4 and H6 according to the ideal ranking for most scenarios, and the worst for all scenarios are hospitals H2 and H7. Similar results are obtained by our approach and the one in [49] because H4 and H6 have by far the highest ratios of positive to negative opinions, 2.34 and 2.43, respectively, whereas H2 and H7 have the lowest ratios, 1.49 and 1.47, respectively. Nonetheless, the ratios for the other hospitals are closer to one another, and therefore the weights assigned by the house of quality play an important role in correctly placing each hospital in the final ranking.

In scenario 1 of case study 1, our method ranks all hospitals as expected by the ideal ranking except for H1 and H3, which exchange their positions in the final ranking. The same happens in [49], where the positions of H7 and H5 are additionally exchanged with respect to the ideal ranking. In the remaining scenarios similar situations take place: some hospitals are displaced by only one position with respect to the ideal ranking. Nevertheless, there are two hospitals that [49] struggles to rank properly, H3 and H5, whereas our proposal places them closer to their ideal positions. Looking at the relative weights of the HOQ in scenario 2, care/empathy and confidence are the most relevant areas to be considered, whereas staff and services are the least relevant. Regarding scenario 3, timeliness and efficiency are the most important aspects to be considered in the HOQ, whereas staff and facilities are the least relevant ones.

A similar situation occurs in case study 2: overall, the best hospitals are H7 and H4 and the worst ones are H5 and H2 in all scenarios, because the ratios of positive to negative sentences are 2.4 and 2.26 for H7 and H4, respectively, and 1.68, 1.7 and 1.71 for H5, H3 and H2, respectively. In this second case study, our approach displaces several hospitals by just one position with respect to the ideal ranking in all scenarios; nevertheless, H3 and H6 are misplaced by several positions by the other algorithm [49]. For example, in scenario 3, the selection of the appropriate weights for H3 is vital. Our approach assigns more importance to weak aspects such as timeliness, whose ratio is 1.1 for H3, i.e., there are almost as many negative sentences as positive ones. This fact is key for this hospital, because the second lowest ratio is 1.49 for H5. Therefore, the calculation of the correct weights for the HOQ strongly influences and determines the results of the approach presented here.

4.5 Discussion

The proposed methodology distinctively involves patients or users directly in the assessment process, eliminating the reliance on expert evaluations, in contrast to the methods of other studies [30, 52,53,54,55,56,57] that primarily rely on experts to assess service quality and recommend products or suppliers. Instead, this approach strategically avoids these challenges by automating the evaluation process so that it is more adaptable and reflective of real-world users’ perspectives. Considering this, if it is necessary to evaluate 14 hospitals using traditional expert-based methods, and each hospital needs 3 experts for the evaluation, a large team of 42 specialized experts in the medical field would be needed, each providing nonrandom evaluations. This poses challenges in terms of reliability, consistency, and substantial resource requirements.

Moreover, in contrast to the methods of other studies [58,59,60], the method proposed here uses an alternative means of attribute extraction. While classic methods rely on the team knowledge and experience to extract the criteria for the HOQ, the proposed method incorporates a classification technique that automatically identifies the criteria of HOQ within the QFD, effectively capturing the criteria specific to the hospital context. This automation not only improves the efficiency of the process but also reduces the costs associated with human interaction. Overall, this approach represents an alternative to traditional expert-based evaluations, providing a practical and effective method that is easily aligned with the contemporary shift toward patient-centered health care assessments that take advantage of internet technologies.

Furthermore, several practical contributions useful for patients, health care providers, and managers can be derived from this study. First, the results of this study demonstrate the effectiveness of the integrated HOQ matrix and patients’ opinions in detecting the most relevant DRs for hospitals. By analyzing the opinions of patients or users, it is possible to identify the hospital aspects that are most relevant to them.

Overall, the results of this analysis can provide valuable insights for health care providers and policy-makers, helping them focus on specific aspects of service quality that are the most important to their customers. It is also essential to recognize that customer opinions and preferences can change over time. Therefore, health care providers should regularly assess their customers’ changing opinions and preferences and adjust their service quality accordingly. Moreover, these scenarios demonstrate that the weights and scores given to each aspect of service quality have a crucial impact on the ranking of hospitals. Hence, health care providers should consider their customers’ preferences when developing strategies to improve their service quality. Furthermore, the results of this study can help health care providers identify areas for improvement to allocate resources more effectively to meet their customers’ expectations. In doing so, health care providers can enhance their competitiveness, increase customer satisfaction and loyalty, and improve the quality of health care services provided.

The second case study provides evidence that the proposed method is stable, adaptable, and reliable. The consistent rankings across different scenarios and additional hospitals show that the proposed method is stable and that the results are reliable. The method’s adaptability is shown by its ability to perform well in different health care settings.

Second, user opinions have been found to be powerful tools for recommending specific hospitals to others based on their DRs. The presented fuzzy approach mitigates uncertainty in these opinions, yields more accurate information, and enables better-informed decisions regarding the most suitable hospitals for individual patients or users. It also allows hospital managers to gain deeper insights into the needs and expectations of their patients or users, facilitating targeted improvements in their services and facilities.

In conclusion, the integration of the HOQ matrix with user opinions provides a more accurate and reliable analysis of hospital DRs. Implementing this approach in the QFD process has significant implications for enhancing health care service quality and meeting the demands of patients and users. Therefore, this approach can help improve hospital facilities and services and ultimately, enhance patient satisfaction.

Among the implementation challenges, it is necessary to highlight that this proposal is costly in terms of development only the first time it is implemented, because the first steps of the methodology do not need to be implemented again if the set of hospitals to be assessed changes. Therefore, the bottleneck of the system is step 6, which depends on the number of opinions available per hospital.

Regarding the implementation of the proposal in a real environment, the major challenge is to obtain a feedback platform such as Careopinion, where the patients truly want to collaborate to provide reliable information. Furthermore, the proposal is modeled considering the information provided by Careopinion; nevertheless, the primary necessary information is patient feedback, which is assumed to be available on every similar platform. The other information provided by sections such as “how did you feel” or “what’s good” is used only for evaluation purposes and to manually corroborate some of the data obtained automatically in some steps of the methodology.

In terms of privacy, Careopinion users accept the terms of sharing their data with other users through the application programming interface (API) provided by Careopinion; therefore, privacy does not seem to be a major issue for patients. The proposed methodology does not use any personal data, and the information used can be considered perfectly anonymous because the data are aggregated, rather than managed individually. Consequently, it is impossible to trace patient opinions and find personal details about the patients.

5 Conclusions and future work

This study describes a fuzzy QFD approach that integrates patient opinions to rank hospitals; this approach offers significant advantages over traditional QFD methods that rely solely on decision-makers. The paper initially presents the basic concepts of the QFD model and proposes the HOQ method to address the “what” and “how” questions. In situations where information is incomplete, imprecise, or subjective, a fuzzy decision-making environment can be an interesting tool. For that reason, a fuzzy QFD approach is proposed to represent the polarity of user opinions and to calculate the relationship between CRs and DRs. Then, a systematic procedure using the fuzzy-QFD method to rank the hospitals is proposed.

The primary contribution lies in the use of user opinions, integrated within the HOQ model, to rank hospitals and guide health care providers to understand what specific services or requirements need improvement. Unlike the traditional QFD model, which relies on decision-makers to rate hospitals or health care providers, this approach provides a more direct and patient-centered perspective on hospital performance. The findings suggest that incorporating patient or user opinions in the ranking process is useful for understanding patient satisfaction and creating a more patient-centered health care system. Thus, this study contributes to the growing literature on the use of HOQ in improving health care quality and highlights the importance of incorporating patient or user opinions in the ranking process.

There are several potential avenues for further research building upon the findings of this study. First, incorporating public and personal preferences based on patient opinions, rather than relying only on personal preferences, could provide a more patient-centered approach to ranking hospitals. This could involve gathering and integrating feedback from a larger sample of patients to capture a broader range of perspectives and opinions. Second, exploring different fuzzy representation methods that consider the uncertainty of opinions, including the positivity, negativity, or neutrality of sentiments, could enhance the accuracy and reliability of the fuzzy QFD approach, as has been done in other studies [61,62,63,64,65,66]. Furthermore, emotions, rather than only sentiments, could be considered when computing the polarity of opinions [67,68,69].