Keywords

1 Introduction

As a result of the increased adoption of a user-centered approach to improve human-computer interaction and increasing concerns regarding computer-related privacy and security issues, researchers and scientists have proposed many design techniques that incorporate usable-security [11, 15]. Those techniques are used to build software systems that seek to balance usability and security. However, if we take a careful look at the currently available software systems, we often find that quality attributes, such as security and usability, are applied and evaluated separately [3, 10, 20].

Therefore, the vast majority of those systems are built and deployed without giving proper attention to evaluating the state of integrating the quality attributes, such as usable-security. It is very important to be sure that software systems are usable and secure enough [15, 19], so that people can rely on such systems and interact with them as if interacting with actual human beings. People ought to be able to use and deal with systems with hands-on ease, as this helps in a better cooperation between the people and the systems [9, 10]. Also, people must be able to trust that the systems are secure enough against any malicious penetration behavior that may cause any intentional undesired change or update during the human-computer interaction [2, 22].

Unfortunately, without careful consideration of the balance between usability and security, a system can be designed so that those properties will actually work against each other and cannot be aligned and evaluated together as one joint concept [1, 16, 25]. One approach that has proven successful combines usability and security with the goal of improving the performance and quality of computer systems [8]. Therefore, we adopted the approach of bridging usability and security attributes together as one combined attribute: usable-security. One way to achieve that goal is using models and frameworks that help in assessing software usability, security, and usable-security requirements based on formative evaluation approach [10, 13, 20]. Assessing usable-security helps in predicting to which level software systems could be simultaneously usable and secure [13]. This work proposes a summative usable-security evaluation matrix based on the usable-security assessment models and frameworks. The matrix represents a solid component of a large scale ongoing research project for designing a Usable-Security Engineering Framework (USEF) for enhancing software development when balancing security and usability is an important matter.

Section 2 presents background on how important usability and security are to software systems and Sect. 3 introduces a proposed mathematical-based evaluation model to help determining the levels of usability and security quality attributes. Section 4 introduces a proposed matrix to evaluate usable-security of software systems. The last section is a conclusion about applying the matrix.

2 Background

Despite the success of most of the available systems in making human life even easier, they need continual evaluation and enhancement to make sure that such systems meet the usability and security requirements and standards [4, 17].

2.1 Usability

One usability concern is that software systems must be effective. For example, when airline companies use automatic speech recognition systems for phone reservations, those systems can only be considered effective if they are able to convert a traveler’s speech into a textual representation that can be then processed as accurately as a textual request. Another usability concern is that the systems must be efficient enough to save processing time for both users and business owners. However, there are many examples where the extended time of the sound representation process negatively impacts the speed of the automatic speech recognition systems’ ability to access customers’ information. Consequently, lack of effectiveness and efficiency creates unsatisfied customers and owners of such software systems [4].

2.2 Security

Security is also another important aspect to be evaluated and enhanced continually [17]. For instance, NLP approach is used in voice biometric authentication and access control management applications [5]. The voice representation techniques that are used in voice authentication systems must maintain confidentiality to prevent access to the representation techniques, which could lead to the penetration of authentication systems. Voice authentication systems can also be used against the important security goal of integrity by allowing unauthorized alterations to representation techniques that lead to a misleading or improper representation process. Human-Computer Interaction software systems must guarantee the availability of any application for legitimate users at any time according to the availability requirements specifications. For example, when biometrics systems are used for authentication, it must be available during all access permission requests to resources. Once the biometric authentication is not available, no access control will be activated.

Based on the above justifications on how important are both usability and security to any software system, Sects. 3 and 4 propose an evaluation methodology and a matrix to help in the evaluation of software systems.

3 Usability-Security Measuring and Evaluation

In order to develop a useful evaluation methodology, developers should have appropriate measuring techniques before the evaluation process begins. Access to suitable metrics gives developers the ability to predict the evaluation’s outcome. However, usable-security is difficult to measure, consequently, it is difficult to evaluate as well [7, 11]. Therefore, this work proposes new measuring and evaluation techniques based on OWASP [20] and SAULTA [10]. The evaluation technique serves as extended research work for the Assessment Framework for Usable-Security (AFUS) [13]. The following subsections describe methods for measuring and evaluating usability and security. In order to provide evaluation consistency, all the attributes and their properties are evaluated according to Table 2 after processing their measurements according to Table 1.

Table 1. Usability and Security Attributes and their Properties’ Measuring Guidance, where the listed measuring statuses: HA, MA, SA, and NA represent High Achieved, Mostly Achieved, Some Achieved, and Not Achieved; respectively. \(\alpha \) and \(\beta \) represent Usability and Security, respectively.
Table 2. Usability and Security Attributes and their Properties’ Evaluation Guidance, where the listed measuring statuses: HA, MA, SA, and NA represent High Achieved, Mostly Achieved, Some Achieved, and Not Achieved; respectively.

3.1 Usability Evaluation

The term usability is defined by the International Standard Organization (ISO) as the range in which a product can be operated by legitimate users to satisfactorily perform specific tasks in an effective, efficient, and specified way [26]. Some others add accuracy, memorability, and learnability as secondary usability factors [16]. This paper focuses only on the three factors of the ISO definition.

In order to evaluate the targeted usability level on any developed software system during the evaluation phase, the application state (level) of each of the usability properties (efficiency, effectiveness, and satisfaction) must be focused on, as well as providing a method to determine a measurable way to evaluate these properties on software systems. The following evaluation equations are adapted from Bevan and Macleod’s concepts [6, 18].

Effectiveness. Systems can only be considered as effective if their users are able to achieve their goal of operating such systems. The effectiveness property can be measured based on a goal-centered view by counting the number of successful tasks that legitimate users perform [12]. For example, a software systems is effective if it allows users to successfully create their passwords, login using their previously created passwords, or provide their biometric traits. Equation 1 can be used to evaluate the effectiveness for software systems, where n represents the total number of accepted tasks that legitimate users perform, R represents the result of each performed task’s trial (either “failure” or “success”), EF1 represents the system effectiveness rate.

$$\begin{aligned} EF1 = \frac{1}{n} \sum _{i=1}^{n} \delta (R[i]) \end{aligned}$$
(1)
$$\begin{aligned} \begin{matrix} Where \quad \quad \quad \delta (\theta )\, \text {is defined as:}\, \delta (\theta ) = \left\{ \begin{array}{ll} 0 &{} \quad \text {if}\, \theta = \text {failure}\\ 1 &{} \quad \text {if}\, \theta = \text {success} \end{array} \right. \end{matrix} \end{aligned}$$

According to Table 1, if the result of system effectiveness rate, \(EF1 \ge 0.9\), this means that the software system is highly effective. On the other hand, if the result of \(EF1 <0.3\), this means that the system is ineffective and may need further enhancement. After measuring the effectiveness, it is evaluated according to Table 2, because this provides the consistency among all of the usability properties.

Efficiency. Efficient systems must complete a specific task or process to reach a particular goal within an acceptable amount of time. The efficiency property is important because both the vendors and the users will not rely on a system that takes too long a time to perform a specific task (for instance: authentication). The measurement used to evaluate the efficiency is the amount of time that is consumed for achieving a particular goal or to complete a particular task. Equation 2 depicts the evaluation, where n represents number of trails to perform a particular task, \(\beta \) represents the standard average amount of time to finish such task, and T represents the amount of time to perform the task on each trial.

$$\begin{aligned} EF2 = \frac{\frac{1}{n} \sum _{i=1}^{n} T[i]}{\beta } \end{aligned}$$
(2)

According to Table 1, if the result of system efficiency rate, EF2, is less than 1, this means that the software system is highly efficient. In contrast, if the resultant value of EF2 is greater than or equal to 1.50, this means that the software system is inefficient because the average amount of time consumed to perform that task is too far beneath the standard average, \(\beta \). After measuring the efficiency, it is evaluated according to Table 2, because this provides the consistency among all of the usability properties.

Satisfaction. For a system to be satisfactory, both the vendors and the users must be happy with the system. This is determined by the willingness of both vendors and users to rely on and reuse the system. It is important to be aware that the satisfaction is compellingly affected by the vendors’ and the users’ mood [16]. It is most likely to be perceived as satisfactory if it is both simultaneously effective and efficient. Evaluating the satisfaction property is a challenging task due to the difficulty of having accurate measurement tools. However, the best way to evaluate satisfaction is via questionnaires such as in SUMI [18] and the SUS [23] surveys, or interviews [16]. Therefore, satisfaction is evaluated through involving HCI-SEC principles that are related to user-centered approach [27] that focus on the ease of use, the degree of happiness, and the degree of confusion. Based on the results of the given survey, user satisfaction, SA1, is measured according to Table 1. After measuring user satisfaction based on the above degree of satisfaction via surveys, it is evaluated according to Table 2, because this provides the consistency among all of the usability properties.

Evaluating the above standard usability properties leads to an overall usability evaluation through summing the evaluations of the three properties (effectiveness, efficiency, and user satisfaction) as in Eq. 3, where \(\alpha \) represents the usability:

$$\begin{aligned} \alpha = \frac{EF1 + EF2 + SA1}{3} \end{aligned}$$
(3)

Equation 3 provides the \(\alpha \) measure shown in Table 1. Consequently, same as evaluating the usability properties, overall usability of software systems is evaluated based on Table 2, because this provides the evaluation consistency among three quality attributes under investigation (usability, security, and usable-security).

3.2 Security Evaluation

The term “security” is identified in many ways. Essentially, system security is a set of methods and techniques used to prevent weaknesses from being exploited by applying three security goals: confidentiality, integrity, and availability [21]. Software systems have vulnerabilities that need to be discovered and closed, or at least protected from being imposed. Therefore, to reach an acceptable level of security, the three security properties must be applied and achieved. The following analyzes the importance of the security properties, how they are applied on the systems, and how they are measurably evaluated.

Confidentiality. Confidentiality is a goal of all secure systems. Confidentiality is defined as the ability to grant access only to authorized users. If unauthorized users gain access to computer systems, confidential information may be accessed and then used against the systems’ vendors or users. To measure software systems’ confidentiality, Eqs. 45 and 6 are applied where n represents the total number of access trials, \(\alpha \) represents an individual access trial (either “false access” or “true access”), TAR represents true access rate, FAR represents false access rate, and CO1 represents system confidentiality evaluation value.

$$\begin{aligned} CO1 = TAR \end{aligned}$$
(4)
$$\begin{aligned} TAR = \frac{\sum _{i=1}^{n}\delta (\alpha [i])}{n} \end{aligned}$$
(5)
$$\begin{aligned} \begin{matrix} Where \quad \quad \quad \delta (\theta )\, \text {is defined as:}\, \delta (\theta ) = \left\{ \begin{array}{ll} 1 &{} \quad \text {if}\, \theta = \text {true access}\\ 0 &{} \quad \text {if}\, \theta = \text {false access} \end{array} \right. \end{matrix} \end{aligned}$$

and

$$\begin{aligned} FAR = 1 - TAR \end{aligned}$$
(6)

According to Table 1, if the result of system confidentiality rate, CO1, is greater than or equal to 0.99, this means that the system is highly confidential. In contrast, if the resulted value of CO1 is less than or equal to 0.80, this means that the system failed to provide confidentiality because access was granted for too many unauthorized users. After measuring confidentiality, it is evaluated according to Table 2, because this provides the consistency among all of the security properties.

Integrity. Integrity means that for the authorized users, the system does not allow them to perform tasks in an improper way, and protects the data from any unauthorized alteration. As having usable-security evaluation for software systems is the goal of this paper, the integrity property must be correctly applied to make such systems secure. This property is achieved by enabling systems to create auto-backup and auto-check using proper techniques and tools like hashing, the process of comparing backup files with the same files on the system. Equation 7 depicts the integrity calculation on software systems, where n represents the total number of selected files for hashing, IN1 represents the integrity evaluation result.

$$\begin{aligned} IN1 = \frac{1}{n} \sum _{i=1}^{n} \delta (\frac{system file[i]}{backup file[i]}) \end{aligned}$$
(7)
$$\begin{aligned} \begin{matrix} Where \quad \quad \quad \delta (\theta ) \, \text {is defined as:}\, \delta (\theta ) = \left\{ \begin{array}{ll} 1 &{} \quad \text {if}\, \theta = \text {1}\\ 0 &{} \quad \text {if}\, \theta \ne \text {1} \end{array} \right. \end{matrix} \end{aligned}$$

According to Table 1, if the result of system integrity, IN1, is greater than or equals to 0.99, this means that the system is provides integrity, because the result of the hashing (comparing be-tween the system and the backup files) indicates that there were not any unauthorized alterations of system files. In contrast, if the resulted value of IN1 is less than 0.80, this means that the system did not provide integrity because a critical alternation occurred. After measuring integrity, it is evaluated according to Table 2, because this provides the consistency among all of the security properties.

Availability. Availability is a security factor where the system’s services and contents, or data, must be available at any time an authorized user needs to access them. It is measured based upon the number of success services or data access requests a system receives. Equation 8 shows the availability calculation, where n represents the total number of access trials, \(\alpha \) represents an individual access trial (either “available” or “unavailable”), and AV1 represents availability rate.

$$\begin{aligned} AV1 = \frac{1}{n} \sum _{i=1}^{n} \delta (\alpha [i]) \end{aligned}$$
(8)
$$\begin{aligned} \begin{matrix} Where \quad \quad \quad \delta (\theta )\, \text {is defined as:}\, \delta (\theta ) = \left\{ \begin{array}{ll} 1 &{} \quad \text {if}\, \theta \, \text {is available}\\ 0 &{} \quad \text {if}\, \theta \, \text {is unavailable} \end{array} \right. \end{matrix} \end{aligned}$$

According to Table 1, if the result of system availability evaluation, AV1, is greater than or equal to 0.99, this means that the software system is highly available, because it is accessible to its authorized users whenever needed. In contrast, if the resulted value of AV1 is less than 0.80, this means that the system was not available because more than 20 % of the requests were not serviced. After measuring availability, it is evaluated according to the unified numerical evaluation values of Table 2, because this provides the consistency among all of the security properties.

Evaluating the above standard security properties leads to an overall security evaluation through summing the evaluations of the three properties (confidentiality, integrity, an availability) divided by 3. Equation 9 illustrates the calculation, where \(\beta \) represents the security:

$$\begin{aligned} \beta = \frac{CO1 + IN1 + AV1}{3} \end{aligned}$$
(9)

The outcome of the above calculation provides one of the measuring statuses of Table 1. Consequently, same as evaluating the security properties, overall security of software systems is evaluated based on Table 2, because this provides the evaluation consistency among three quality attributes under investigation.

4 Usable-Security Evaluation Matrix

Analysis and measurement studies of usability and security properties on systems produced detailed understanding of the nature and structure of systems’ usable-security. Based on the previous section, a usable-security evaluation matrix can be constructed that can be used as a guidance to achieve the overall goal of this paper, which is evaluation and enhancement for software systems to be both more usable and secure enough (see Fig. 1). The matrix shown in Fig. 1 is a proposed method for evaluating usable-security of software systems. Systems’ usable-security can be evaluated by using Eqs. 3 and 9 that evaluate the systems’ usability (effectiveness, efficiency, and satisfactory), and security (confidentiality, integrity, and availability). The results are then used in Eq. 10, where \(\alpha \) represents the overall usability evaluation, \(\beta \) represents overall security evaluation, \(\gamma \) represents the matrix score, which is the usable-security evaluation.

$$\begin{aligned} \gamma = (\alpha \times \beta ) - |\alpha - \beta | \end{aligned}$$
(10)

The highest measuring evaluation category in the matrix is when the system attains a score of at least 0.81 as displayed in Fig. 2. This means that the system has applied and achieved the highest level of all of the six usability and security properties. Such a system is considered as usable and secure not only because it achieved the usability and security goals, but because it bridged both usability and security together with careful consideration to the HCI/user-centered approach from the begining of the development process.

Fig. 1.
figure 1

The Usable-Security Evaluation Matrix: The evaluation process consists of two components: formative mathematical-based modeling that evaluates the security and usability properties, and a summative matrix that evaluates the Usable-Security based on the results of the formative modeling.

Fig. 2.
figure 2

Usability and Security Measuring Values: The measuring values and categories of the Security and usability attributes are used to evaluate Usable-Security

On the other hand, the lowest measuring evaluation category in the matrix is when a software system scores at most 0.01 as shown in Fig. 2, which means that the system has not achieved any beneficial values of the usability and security attributes. Such a system is considered neither usable nor secure not only because it does not achieve the usability and security goals, but because it does not bridge usability and security as well. In addition, such a low score for a system indicates that its developers might have not considered the HCI/user-centered approach from the begining of the development process.

Walking through the whole evaluation process, a final usable-security evaluation is reached as one of five perfection levels: high usable-security when \(\gamma \ge \) 0.81, mostly usable-security when 0.49 \(\le \gamma \) \(<\) 0.81, usable-security when 0.36 \(\le \gamma \) \(<\) 0.49, some usable-security when 0.09 \(\le \gamma \) \(<\) 0.36, and not usable-security when \(\gamma \) \(<\) 0.09. Overall, a numerical categorization is given to each perfection level as follows: high usable-security is categorized as 9, mostly usable-security is categorized as 7, usable-security is categorized as 6, some usable-security is categorized as 3, and not usable-security is categorized as 1. Figure 3 presents the final usable-security evaluation and categorization guidance.

Fig. 3.
figure 3

Final usable-security evaluation and categorization guidance

5 Case Study: Authentication Approaches

In order to show the advantage of using the usable-security evaluation matrix, this section presents the results of assessing the usability levels of an experiment that compares two authentication approaches. The Choice-Based Authentication Approach (CBAA) and the Traditional-Based Authentication Approach (TBAA) are described in detail in [14]). The experiment used 30 different scenarios for each approach and the results from the experiment are described below.

For the security values, we used one value (mostly secure: 0.7) for all the thirty scenarios of the TBAA, with expected entropy between \(2^{26}\) and \(2^{27}\). As the assumed security value is considered as the security level that can be achieved when the NIST SP 800 Series is used as a foundation security policy [24]. However, security level of the CBAA has been increased because of the increased complexity for adversaries by displaying multiple authentication methods during the login process as explained in [14], where the expected entropy is between \(2^{64}\) and \(2^{65}\), in addition to biometrics. Hence, we anticipated that the CBAA security should at least be better than the TBAA by 0.1. Therefore, we used a unified security value for all the thirty scenarios of the CBAA as (mostly secure: 0.8). Figure 4 displays the two sets of the scenarios.

Fig. 4.
figure 4

Radar chart for the usable-security evaluation values. It displays usable-security coverage by the CBAA and the TBAA

Figure 4 displays a radar chart showing the trend-lines from evaluating usable-security levels among all the scenarios of both the CBAA and the TBAA. A comparison of the trend-lines for both approaches shows that the CBAA provided a better balance of usable-security than the TBAA. Moreover, it indicates that the usable-security evaluation results are more stable in the CBAA than the TBAA. In addition, the figure shows the usable-security coverage for both approaches and indicates that the CBAA occupies a wider domain than the TBAA.

Following the guidelines in Fig. 3, the results in Fig. 4 for the CBAA show that scenarios 21 and 25 achieved the “some usable-security” level (0.31 and 0.32, respectively), scenarios 3 and 6 provide the “usable-security” level of 0.40 and 0.48, respectively, while the rest of the 26 scenarios reached the “mostly usable-security” level (i.e.: the values are greater than or equal to 0.49, and less than 0.81). As a result, the overall usable-security evaluation value for the CBAA is 0.58, which is at the “mostly usable-security” level.

On the other hand, the TBAA scenarios displayed on Fig. 4 show that the scenarios 16 and 22 only achieve the “not usable-security” level (0.09 and \(-0.19\), respectively). Moreover, 10 scenarios reached the “some usable-security” level, and only 18 scenarios provided the “usable-security” level. Consequently, the overall usable-security evaluation value for the TBAA is 0.35, which is the “some usable-security” level.

6 Conclusion

It is important that developers are able to evaluate usability, security, and usable-security software quality attributes that were specified within a system’s requirements and were properly designed, correctly constructed, accurately evaluated, and appropriately deployed. However, existing methodologies for applying the above attributes do not assure the development of systems that can meet the necessary high quality standards for usability and security. The integration of such attributes becomes more important than just applying each attribute individually. The work presented in this paper goes beyond the application of attributes’ integration, as it proposes an evaluation methodology to test the alignment of those attributes. Moreover, the proposed evaluation matrix is flexible enough to be used to modify the attributes’ measures. It can also be extended to evaluate other quality attributes.