1 Introduction

The Republic of Korea (ROK) military is considering providing contracted civilian services for field military base management to modernize its barracks culture while also focusing on combat missions. The purpose of this study was to evaluate the performance of the contracting out of military base management and to measure the satisfaction of the soldiers involved in a pilot project. The ROK military conducted the pilot project for military base management by focusing on three service areas: facility management, cleaning, and mowing/landscaping (Table 1).

Table 1 Service areas covered by the pilot project for the outsourcing of military base management

The most effective strategy for evaluating the performance of a pilot project is to compare service performance before and after the project starts. However, in this project, it was not possible to measure the performance before starting the pilot project.

To overcome this problem, we used SERVQUAL for performance analysis. The dimensions of service quality in SERVQUAL are based on multi-item scales that are designed to measure customer expectations and perceptions and any gap between them (Tileng et al. 2013). The elements of SERVQUAL were used to develop the survey questionnaire while an importance-performance analysis (IPA) model was employed to measure service quality, not only in terms of the performance of individual items but also in terms of their importance to the overall satisfaction of the respondent (Syaifoelida et al. 2016). The experiences of the soldiers in the pilot project were collected using the SERVQUAL-based survey, with differences between perception and expectations recorded. Based on the measured performance, IPA was conducted to identify areas that need improvement in the contracting-out process.

2 Related research

Developed by Parasuraman et al. (1988), SERVQUAL is a service quality measurement tool that is widely used to evaluate service quality in companies and public institutions, incorporating five service dimensions: tangibles, reliability, responsiveness, assurance, and empathy (Brady et al. 2001; Cunningham et al. 2017; Ju et al. 2012; Lee et al. 2016; Mukhtar et al. 2013; Pansiri et al. 2010). In the framework of these dimensions, the quality of a service is measured as a result of an evaluation process, in which consumers’ expectations and their perception of performance are compared (Yildiz 2011).

Since it was first proposed by Martilla et al. (1977), IPA has been widely used to analyze importance and performance (Aktas et al. 2007; Azzopardi et al. 2013; Blesic et al. 2014; Shieh et al. 2009, 2011; Wu et al. 2012). SERVQUAL and IPA have since been used together to assess service quality in a number of studies. Izadi et al. (2017) created 32 questions that measured patient expectations and perception of performance for the IPA model. Yin (2016) analyzed the quality of telehealth services designed to monitor the vital signs of patients via mobile devices using SERVQUAL and IPA, while Kim et al. (2010) examined the characteristics of public transportation, including the commuting time, ease of access, and access time, to evaluate the quality of transportation services using SERVQUAL and IPA. Moon et al. (2010) also used SERVQUAL to assess the family welfare services provided by the Healthy Families Support Center and to identify effective management strategies based on IPA. They measured the level of perceptions and expectations for 21 indicators across five dimensions for the Citizen Health and Family Support Center and analyzed the difference between perceptions and expectations using t-tests.

The design of SERVQUAL-based assessment has also been studied. Kim et al. (2016) investigated different combinations of SERVQUAL items and developed appropriate measurement items to assess the quality of a multicultural education training project, with IPA conducted based on the results. Han et al. (2012) also conducted a literature review and qualitative research on B2B service quality to determine the relevant quality factors and measurement items for SERVQUAL.

3 Research methodology

3.1 Research model

In this study, we surveyed the unit members involved in the pilot project. Before distributing the questionnaire, we visited the units to explain the purpose of the performance analysis to the officers in charge of facility management. The questionnaire was composed of perception and expectation items based on the concept of SERVQUAL. The questionnaire was then distributed to the units. After collecting the survey results, we analyzed the gap between the perception and expectation levels to understand the satisfaction of the unit members. Based on this, IPA was employed to identify what improvements were needed. Figure 1 summarizes the research process.

Fig. 1 Research model for the present study

3.2 Creating the SERVQUAL items

SERVQUAL consists of five dimensions (reliability, tangibles, responsiveness, assurance, and empathy) and 22 items. We modified some items to take into account the characteristics of the ROK military force and the defense environment. To confirm the details of the questionnaire items, we conducted three focus group interviews with military officers. Table 2 presents the demographic information of the interviewees. The item for notification of accurate service delivery times (No. 7) was included in facility management only because it was believed to be unrelated to the other services. The SERVQUAL items utilized in this study are presented in Table 3.

Table 2 Focus group interviewees' demographic information
Table 3 Details of the modified SERVQUAL items

3.3 Survey process

We surveyed the unit members involved in the pilot project to analyze the performance of the contracted-out services. The questionnaires regarding facility management and cleaning were distributed to two divisions. In addition to these divisions, members of an ammunition depot unit received the questionnaire for mowing and landscaping. Table 4 summarizes the demographic information for the respondents.

Table 4 Demographic information for the respondents in the present study

4 Performance analysis

4.1 Factor analysis

The construct validity of the questionnaire was supported by the factor loading for each of the three services (Table 5): facility management (Kaiser–Meyer–Olkin [KMO] value of 0.975 and Bartlett’s test of sphericity with χ2 = 29097.95 [p = 0.000]), cleaning (KMO value of 0.976 and Bartlett’s test of sphericity with χ2 = 28067.04 [p = 0.000]), and mowing/landscaping (KMO value of 0.979 and Bartlett’s test of sphericity with χ2 = 32060.44 [p = 0.000]). The recommended thresholds are a KMO value of at least 0.5 and a Bartlett’s test of sphericity that is significant at p < 0.05 (Ramu et al. 2019). Pearson correlation analysis was also conducted to evaluate the convergent validity of the questionnaire (Wu et al. 2010).

Table 5 Construct validity of the questionnaire
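
To illustrate the two sampling adequacy checks reported in Table 5, the following Python sketch computes the overall KMO measure and Bartlett's χ2 statistic from a respondents-by-items matrix. It is not the procedure that produced the values above. The function names and the small response matrix are hypothetical, and only the standard library is assumed. The p-value for Bartlett's test would additionally require a chi-square distribution function, which is omitted here.

```python
import math


def correlation_matrix(data):
    """Pearson correlation matrix for a list of respondent rows."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in data]
    cross = [[sum(r[i] * r[j] for r in centered) for j in range(p)]
             for i in range(p)]
    return [[cross[i][j] / math.sqrt(cross[i][i] * cross[j][j])
             for j in range(p)] for i in range(p)]


def inverse_and_determinant(matrix):
    """Gauss-Jordan elimination with partial pivoting."""
    p = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(p)]
           for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        if pivot != col:
            aug[col], aug[pivot] = aug[pivot], aug[col]
            det = -det  # a row swap flips the sign of the determinant
        pivot_value = aug[col][col]
        det *= pivot_value
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(p):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * c for v, c in zip(aug[r], aug[col])]
    return [row[p:] for row in aug], det


def kmo_and_bartlett(data):
    """Return (overall KMO, Bartlett chi-square, degrees of freedom)."""
    n, p = len(data), len(data[0])
    corr = correlation_matrix(data)
    inv, det = inverse_and_determinant(corr)
    # Bartlett: chi2 = -(n - 1 - (2p + 5) / 6) * ln|R|, df = p(p - 1) / 2
    chi_square = -(n - 1 - (2 * p + 5) / 6) * math.log(det)
    dof = p * (p - 1) // 2
    # KMO: squared correlations relative to squared partial correlations
    r2 = q2 = 0.0
    for i in range(p):
        for j in range(p):
            if i != j:
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                r2 += corr[i][j] ** 2
                q2 += partial ** 2
    return r2 / (r2 + q2), chi_square, dof


# Hypothetical 5-point Likert responses: 8 respondents x 3 items.
responses = [
    [4, 4, 5], [3, 3, 3], [5, 4, 4], [2, 3, 2],
    [4, 5, 4], [3, 2, 3], [5, 5, 5], [2, 2, 3],
]
kmo, chi_square, dof = kmo_and_bartlett(responses)
print(f"KMO = {kmo:.3f}, Bartlett chi2 = {chi_square:.2f}, df = {dof}")
```

A KMO value close to 1 means that partial correlations are small relative to the raw correlations, which is the situation in which factor analysis is considered appropriate.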

Factor analysis was conducted using principal component analysis with varimax rotation, retaining factors with eigenvalues exceeding 1 and items with factor loadings greater than 0.5. The items were classified into four dimensions: reliability, responsiveness, assurance/empathy, and tangibles. Factor analysis was not applied to tangibles because there was only one item. Tables 6, 7 and 8 present the results of the factor analysis for each service area with their Cronbach’s α values.

Table 6 Results of factor analysis for facility management
Table 7 Results of factor analysis for cleaning
Table 8 Results of factor analysis for mowing/landscaping
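
The Cronbach's α values reported alongside the factor analysis can be illustrated with a short Python sketch. It uses the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores); the function name and the response data are hypothetical and are not taken from the survey. The sketch also shows why a dimension with a single item, such as tangibles here, cannot be given an α value.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondent rows (one column per item)."""
    k = len(responses[0])
    if k < 2:
        # Alpha measures consistency across items, so one item is not enough.
        raise ValueError("Cronbach's alpha needs at least two items")
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical 5-point Likert responses for three items of one dimension.
reliability_items = [
    [4, 4, 5], [3, 3, 3], [5, 4, 4], [2, 3, 2],
    [4, 5, 4], [3, 2, 3], [5, 5, 5], [2, 2, 3],
]
print(f"alpha = {cronbach_alpha(reliability_items):.3f}")
```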

4.2 Expectation and perception assessment for the three service areas

We measured the expectations and perceptions for the three service areas based on the five SERVQUAL dimensions. Paired-sample (repeated-measures) t-tests were performed to test the statistical significance of any difference between the expectation and perception scores. As shown in Table 9, for facility management, the perception and expectation scores fell within a range of 3.861–4.049, with no significant differences except for item No. 9.

Table 9 Comparison of expectation and perception levels by SERVQUAL item for facility management
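
The comparison of expectation and perception scores for a single item can be sketched as follows. This is a minimal illustration that assumes the test is a paired-sample t-test on each respondent's perception and expectation ratings, with the gap defined as perception minus expectation. The function name and the ratings are hypothetical. Only the t statistic and degrees of freedom are computed; a p-value would come from the t distribution, for example through `scipy.stats.ttest_rel` if SciPy is available.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(perception, expectation):
    """Return (mean gap, t statistic, degrees of freedom) for paired ratings."""
    if len(perception) != len(expectation):
        raise ValueError("each respondent needs both ratings")
    gaps = [p - e for p, e in zip(perception, expectation)]
    n = len(gaps)
    spread = stdev(gaps)  # sample standard deviation of the gaps
    if spread == 0:
        raise ValueError("all gaps are identical; t is undefined")
    mean_gap = mean(gaps)
    t_stat = mean_gap / (spread / math.sqrt(n))
    return mean_gap, t_stat, n - 1


# Hypothetical ratings from five respondents for one questionnaire item.
perception = [4, 5, 3, 4, 4]
expectation = [5, 5, 4, 4, 5]
mean_gap, t_stat, dof = paired_t_statistic(perception, expectation)
print(f"mean gap = {mean_gap:.2f}, t = {t_stat:.3f}, df = {dof}")
```

A negative mean gap indicates that perceived performance fell short of expectations for that item.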

Next, cleaning performance was measured. As shown in Table 10, the service quality items exhibited high levels of perception (from 4.012 to 4.107). Unlike for facility management, significant differences were found between the expectations and perceptions.

Table 10 Comparison of expectation and perception levels by SERVQUAL item for cleaning

As shown in Table 11, the perception level for mowing/landscaping ranged from 3.936 to 4.036 for the service quality items. Significant differences between expectations and perceptions were found for five items (No. 2, No. 9, No. 10, No. 11, and No. 15).

Table 11 Comparison of expectation and perception levels by SERVQUAL item for mowing and landscaping

Overall, perception levels for the cleaning service were relatively high compared to the other services, and there were significant differences between expectations and perceptions for this service area.

5 Identification of required improvements

5.1 Using IPA

IPA measures the relationship between importance and performance (Izadi et al. 2017). Performance refers to customer perceptions about how a service is delivered by an enterprise, whereas importance is a manifestation of the relative value assigned by customers to a service (Yildiz 2011). By measuring importance and performance, the IPA model develops specific product relations based on attributes of technological priorities. These attributes can be organized into quadrants according to the average importance and performance (Syaifoelida et al. 2016). Figure 2 is divided into four quadrants, with the x-axis representing the importance of an attribute, which is a measure of its expectation, and the y-axis representing the performance level of an attribute, which is a measure of its perception. The first quadrant contains items of high importance and high performance, meaning that an effort is required to maintain high standards. The items in the second quadrant are considered to be of low importance by the users, but their performance is good. Thus, maintaining the status quo for these items is desirable. The third quadrant contains items of low importance and poor performance; they could be improved, but improvement is not urgent. The fourth quadrant contains items with high importance but poor performance, so they require urgent attention. In this study, we focused on the fourth quadrant to identify improvement requirements. IPA plots based on the average expectation and perception levels were drawn for each service (facility management, cleaning, and mowing/landscaping).

Fig. 2 Generalized IPA quadrant layout
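
The quadrant assignment described above can be sketched in Python. The sketch follows the layout in the text: importance (mean expectation) on the x-axis, performance (mean perception) on the y-axis, and cut-offs at the averages across items. The item names and scores are hypothetical, and placing an item that sits exactly on a cut-off in the higher category is an assumption made only for this example.

```python
from statistics import mean


def ipa_quadrants(items):
    """Map each item to its IPA quadrant (1 to 4).

    `items` maps an item name to (importance, performance), i.e. the
    mean expectation and mean perception score of that item.
    """
    importance_cut = mean(imp for imp, _ in items.values())
    performance_cut = mean(perf for _, perf in items.values())
    quadrants = {}
    for name, (imp, perf) in items.items():
        high_importance = imp >= importance_cut    # tie rule: assumption
        high_performance = perf >= performance_cut  # tie rule: assumption
        if high_importance and high_performance:
            quadrants[name] = 1  # keep up the good work
        elif high_performance:
            quadrants[name] = 2  # low importance, good performance
        elif not high_importance:
            quadrants[name] = 3  # low importance, poor performance
        else:
            quadrants[name] = 4  # high importance, poor performance
    return quadrants


# Hypothetical (importance, performance) means for four items.
example_items = {
    "item_a": (4.2, 4.1),
    "item_b": (3.8, 4.2),
    "item_c": (3.7, 3.6),
    "item_d": (4.3, 3.7),
}
quadrants = ipa_quadrants(example_items)
urgent = [name for name, q in quadrants.items() if q == 4]
print(quadrants, urgent)
```

Items that land in the fourth quadrant are the ones flagged for urgent improvement.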

5.2 Results of IPA

Using the information summarized in Tables 6, 7 and 8, three IPA plots were drawn (Figs. 3, 4, and 5, respectively). The plots show that 60.0% of the items in mowing/landscaping fell in the third and fourth quadrants, indicating the need for improvement. The other services had similar results (facility management: 56.3%; cleaning: 50.0%). For mowing/landscaping, two items were in quadrant 4, with the respondents requesting more systematic management, such as maintaining records of the work, as well as services delivered at the desired time. Table 12 summarizes the items for improvement highlighted by the IPA.

Fig. 3 Results of IPA for facility management

Fig. 4 Results of IPA for cleaning

Fig. 5 Results of IPA for mowing/landscaping

Table 12 Summary of the items that require improvement (fourth quadrant)

6 Conclusions and limitations

SERVQUAL and IPA were used as tools to evaluate the performance of pilot projects for the contracting-out of military base management services. We conducted a large-scale survey of 2,112 uniformed members regarding the projects. To administer such a large number of questionnaires smoothly, we conducted focus group interviews with the officers managing the units to explain the importance and purpose of the questionnaire.

Based on factor analysis, the items in the three service areas were classified into four dimensions: reliability, tangibles, responsiveness, and assurance/empathy. It was found that performance was highest for cleaning services. Therefore, it can be concluded that the outsourcing of cleaning services was the most successful initiative. On the other hand, mowing/landscaping had the most urgent tasks for improvement, with some respondents believing that reliability and assurance should be improved. It seems that members of the unit expected the facilities to be well managed.

This study is significant in that it introduced a service quality perspective to performance measurement in the ROK defense sector. We approached the soldiers as if they were customers receiving a service. Another important contribution of this study is the use of IPA methodology to determine which services in the pilot program should be improved. This would enhance the efficiency with which private resources are utilized in the defense sector.

However, the study has some limitations. First, because this is an evaluation for actual policy implementation, future research needs to analyze its cost-effectiveness. Second, some response results lacked discrimination. For facility management in particular, the differences between perceptions and expectations were small. Future research thus requires more sophisticated survey questions to distinguish gaps in service quality more clearly.