1 Introduction

This study is motivated by the need to evaluate security posture at a national level. This requires measuring the security levels of relevant organisations in a standardised way. Information security standards such as ISO/IEC 27001 provide helpful guidance for securing information assets, but lack methods for gathering quantified, actionable metrics to compare different organisations and to observe changes within an organisation.

This paper describes a framework for a security measurement method and shows how it is used for information security assurance in Estonia. Although the paper focuses on measuring organisations' information security levels in Estonia, a similar approach could be adopted in other countries. Specifically, our study investigates how to evaluate the level of information security of an organisation.

2 Related Work

When developing any framework for information security evaluation, several information security standards and frameworks can be used as a starting point, e.g., the ISO/IEC 27k family, BSI IT-Grundschutz, and CIS 18 (Center for Internet Security Critical Security Controls for Effective Cyber Defense). Many standards provide maturity models, but there is a lack of a benchmark solution or model to support reliably comparable results [2]. Shukla et al. [9] define correctness, measurability, and meaningfulness as the core quality criteria of security metrics. They emphasise that systematic and complete security metrics are needed in the decision-making process. Although there are commercial tools that relate security metrics to vulnerability management and policy compliance [1], those tools tend to lack transparency and accessibility.

Le and Hoang [3] highlight that security maturity models support the security posture. Such a model should (i) have consistent and justified maturity levels of cybersecurity across different domains; (ii) introduce quantitative metrics for security assessment (in contrast to the mainly qualitative metrics and processes in international standards such as the ISO 27k series and the NIST Cybersecurity Framework); and (iii) remain flexible to facilitate the inclusion of new topics connected to novel technologies. They also argue that security models should balance qualitative assessment for management with quantitative assessment for security experts [3]. In this paper, we present a publicly accessible security measurement framework that utilises a maturity model and addresses the above challenges and shortcomings. Our framework provides measurable, systematic and transparent results, is flexible to change, and is compliant with internationally recognised standards (e.g., ISO/IEC 27001).

3 Creation of Security Evaluation Framework

Our study followed the Design Science Research Methodology (DSRM) [5]. First, we identified the problem and elicited requirements for assessing the security level of organisations. Next, following the E-ITS security catalogue, we designed the maturity framework, which consists of ten dimensions, four maturity levels, and attributes. We invited ten organisations to a demonstration and testing experiment of the framework. Finally, we updated the framework according to the evaluation results. The resulting maturity framework is published in [8].

Problem Identification. To implement an information security standard, an organisation needs to understand what should be changed and what the impact of that change is. Similarly, to make decisions at the state level, the government needs data to plan and estimate the security strategy. In Estonia, a new information security standard, E-ITS [7], is being introduced. Hence, each implementer needs to assess its level of information security against set objectives in order to improve and monitor its progress.

Measuring security is challenging because of dependencies, multidimensionality, dynamics, gain and loss perception biases, as well as the probe effect caused by the security measurement process [6]. The problem is: how to evaluate the information security level of an organisation without performing a complete gap analysis, and how to evaluate the security levels of organisations comparably?

Solution Objectives Definition. To obtain an actionable evaluation of organisations' information security levels, a set of requirements must be met. The requirements (Table 1) were gathered from the literature, the authors' previous experience, stakeholders' expectations, and legislation. We acknowledged that comprehensive coverage (Req. 1) and the standard-based requirement (Req. 4) are similar. However, although comprehensive coverage might be simple to achieve with a standard, a security evaluation based on a standard does not always give comprehensive coverage of topics; for example, some standards cover only technical topics.

Table 1. Requirements for the security level evaluation

Design and Development. Based on the elicited requirements, we defined the scope of our framework and developed the security evaluation framework. Figure 1 illustrates the design and development process of the maturity model framework.

Fig. 1. Maturity framework

Baseline Standard. We used the Estonian information security standard E-ITS [7] to create our framework. E-ITS is compliant with the internationally recognised ISO/IEC 27001 and provides comprehensive measures, which satisfies Req. 1 and Req. 4 simultaneously. E-ITS is a baseline standard: it reduces the time assigned to risk assessment for typical assets by providing ready-to-use security measures for organisations. Only the implementation of the E-ITS baseline controls is evaluated; an organisation's uniqueness or special needs are not considered. This scope produces a general E-ITS-based benchmark for evaluating the security level (Req. 2).

Dimensions of the Model. After setting the baseline standard as the base of our framework, the maturity model for security level evaluation formed naturally. E-ITS contains measures divided into ten module groups (Fig. 1), which we used as the dimensions of the model: the ISMS, ORP, CON, OPS and DER module groups are procedural, while the INF, NET, SYS, APP and IND module groups are system-based technical modules. Together, the module groups contain around 90 sub-modules. For example, the OPS module covers several subtopics such as outsourcing of IT services, cloud service usage, and patch and change management. To meet the objectives of Req. 3, we did not transfer the sub-modules from the standard into the framework. Working only with the dimensions and predefined measure levels adds comparability to our framework and follows Req. 2.

Framework Levels. E-ITS measures are ordered as Basic, Standard and High, as illustrated at the top of Fig. 1. We decided to exclude the High measures from our scope to align with Req. 2, i.e., to limit the benchmark to the general baseline part that is mandatory for all organisations. We allocated the Standard-level measures of E-ITS directly to the Standard-level attributes of our model. We divided the E-ITS Basic-level measures into three levels to enable organisations to set reachable goals and to allow more granular measurement of their improvement. We named the starting point the Initial level: at this level, the organisation solves its security issues ad hoc and on a need basis; it is risk-sensitive and unable to deal with incidents on its own. The Defined level was designed to separate formal compliance documentation requirements from the processes actually taking place. The Basic level of our framework is complemented by the Basic-level measures of E-ITS. These three levels (Initial, Defined and Basic) evaluate the organisation's readiness to respond to known threats and direct it towards the documented, trained and optimised Standard-level security. Achieving Standard-level security allows the organisation to deal with unknown risks by significantly reducing their potential impact and loss.
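To make the structure concrete, the ten dimensions and four maturity levels described above can be sketched as a simple data model. This is our own illustrative sketch: the attribute texts are placeholders, not actual E-ITS measures.

```python
from dataclasses import dataclass

# The four maturity levels of the framework, in ascending order.
LEVELS = ["Initial", "Defined", "Basic", "Standard"]

# The ten E-ITS module groups used as dimensions:
# five procedural and five system-based technical groups.
PROCEDURAL = ["ISMS", "ORP", "CON", "OPS", "DER"]
TECHNICAL = ["INF", "NET", "SYS", "APP", "IND"]
DIMENSIONS = PROCEDURAL + TECHNICAL

@dataclass
class Attribute:
    """A single evaluable statement, e.g. 'a policy exists'."""
    text: str
    dimension: str  # one of DIMENSIONS
    level: str      # one of LEVELS

# Placeholder attributes for illustration only.
framework = [
    Attribute("Security issues are solved ad hoc", "ISMS", "Initial"),
    Attribute("An incident reporting channel exists and incidents are registered",
              "DER", "Defined"),
    Attribute("Requirements lists are created for applications", "APP", "Basic"),
]
```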

Attributes of the Framework. We aimed to populate the framework with attributes such that the respondent could, if necessary, find evidence for the implementation status of each attribute. The result should be as unambiguous as possible, giving a similar outcome for holders of the same role and information space. We needed to measure attributes of both performance (e.g., "a policy exists", "backups are done") and effectiveness (e.g., "the policy is reviewed during the year", "logging is targeted and regularly monitored").

We avoided direct technology metrics to follow Req. 2. Still, we needed to keep technical measures, so we customised them by connecting dependencies, generalising, and focusing on the essentials. For example, if there is a channel for security incident reporting, we tied incident registration into the same attribute. We excluded all directly technology-related measures or aggregated them into a general attribute (e.g., if the control was "Organisation has an Office product requirements list", we aggregated it into the attribute "Requirements lists are created for applications" and included other application measures, such as calendar and browser, under that attribute). When dividing attributes into levels, we followed these rules: Initial-level attributes describe ad hoc behaviour or measures that are implemented but not managed or maintained; Defined-level attributes are formal (defining policies or rules); the Basic level was populated with actual procedures and activities not covered by the previous levels; and Standard-level attributes follow the E-ITS Standard measures.

Evaluation Scale. We constructed the evaluation scale for attribute evaluation, which should align with Req. 3. Our first preference was the traffic-light solution, which is cognitively simple to understand: green marks an attribute as fully implemented, yellow means partly implemented, and red indicates not implemented. The respondent's task was to colour the attributes accordingly. Based on validation feedback (see Sect. 4), we decided to provide a scale with four levels.

The four-level scale supports Req. 2 by showing the dynamics of an organisation's security even in the case of minor changes. Here, yellow was divided into mostly implemented with some shortages (still yellow) and orange, marking significant deficiencies while still partly implemented. The results of the three framework iterations are available at [8].
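As a minimal sketch, the two scales described above can be expressed as mappings from colours to numeric scores; the helper function `score` is our own illustrative name, not part of the framework.

```python
# Three-colour traffic-light scale used in the first iterations.
SCALE_3 = {"red": 1, "yellow": 2, "green": 3}

# Four-level scale adopted after validation feedback: yellow is split into
# yellow (mostly implemented, some shortages) and orange (significant
# deficiencies, but still partly implemented).
SCALE_4 = {"red": 0, "orange": 1, "yellow": 2, "green": 3}

def score(colour: str, four_level: bool = True) -> int:
    """Translate a respondent's colour choice into a numeric score."""
    scale = SCALE_4 if four_level else SCALE_3
    return scale[colour]
```

The numeric form makes the self-assessments comparable across respondents and iterations, regardless of which scale was used.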

Demonstration. In the first iteration, we received responses from ten public or vital service providers, ranging in size from 30 to more than a thousand employees, some with subdivisions. In the second iteration, we received responses from a public authority with around 350 employees. The third iteration was performed by an expert (one author of this article) who had not participated in the previous framework development phases. Each respondent was an information security officer, IT manager or data protection officer of the subject organisations; their work experience in this role ranged from one year to more than ten years. To avoid potential leakage of vulnerabilities or security measures, and due to the Public Information Act, we keep the respondents' characteristics at a general level.

Testing of the framework was conducted in three iterations. In the first and second iterations, participants received the respective file [8] in DOCX format to keep the entrance barrier low (avoiding specialised tools, login requirements or unknown formats). Each respondent's task was to perform an organisational self-assessment by highlighting the maturity framework attributes (sentences) using the evaluation scale (the traffic-light colours). The data was gathered into one file, which was processed manually. In the third iteration, the expert analysed the framework structure, the principles of the levels, and the attributes. Each suggestion was later explored and taken into account by consensus in a joint debate with the primary author of the framework.

Fig. 2. Organisation security evaluation and comparison with benchmark (Color figure online)

Results Interpretation. When interpreting the results, we relied only on dimension-based results, which gives us the option to change the attributes within a dimension while the full framework remains comparable with previous results. This approach supports the planned yearly E-ITS updates.

Use case 0. The organisations used the coloured table itself to interpret their security: the dominant visual colour indicates the current security status before any calculations.

Use case 1. We transferred the colours into quantifiable form to perform the calculations (on the three-level scale, red equates to 1, yellow to 2 and green to 3; on the four-level scale, red signifies 0, orange 1, yellow 2 and green 3). We then calculated the organisation's average result for each dimension and maturity level and visualised it on a radar diagram (illustrated in Fig. 2a). The figure indicates that the Defined level is in good shape: all dimension values are higher than 2, except for the IND dimension (which has low values at all levels). The organisation has not recognised that this could be a security issue and has not taken industrial assets into its security management scope. Formal procedures are defined (Defined-level results), but not enough training or exercises are performed (Basic-level results). A high score at the Standard level gives the organisation and its partners confidence that unknown threats can be managed with minimal loss.

Use case 2. To simplify the model, we sum each level's average value by dimension, yielding the organisation's information security level per dimension (blue line in Fig. 2b). For instance, a partner cannot be confident in this illustrative organisation, because the DER result indicates shortages in incident management.

Use case 3. We calculate the average information security level by dimension over all organisations to obtain the benchmark values. The benchmark of our test group is shown as the red line in Fig. 2b. The benchmark is important for organisations to assess their status in the market or in a partnership. In our illustrative case, the organisation is slightly above the benchmark in the procedural dimensions but below it in the technical dimensions.

Use case 4. The benchmark is also an input for state-level political and strategic decisions. Based on the benchmark model, we currently estimate that few organisations have reached the Standard level (average dimension levels over 6) and can manage unknown threats. Exercises and training may be needed to achieve a maturity value over 6 and 7.
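The calculations in use cases 1-3 can be sketched as follows. The data layout (a mapping from (dimension, level) pairs to lists of attribute scores) and the function names are our own illustrative assumptions, not the framework's published tooling.

```python
from statistics import mean

LEVELS = ("Initial", "Defined", "Basic", "Standard")

def level_averages(org):
    """Use case 1: average score per (dimension, level).
    `org` maps (dimension, level) -> list of attribute scores (0-3)."""
    return {key: mean(scores) for key, scores in org.items()}

def dimension_totals(org):
    """Use case 2: sum the per-level averages for each dimension,
    yielding the organisation's security level per dimension (0-12)."""
    avgs = level_averages(org)
    dims = {dim for dim, _ in org}
    return {dim: sum(avgs.get((dim, lvl), 0) for lvl in LEVELS) for dim in dims}

def benchmark(orgs):
    """Use case 3: average the per-dimension totals across all organisations."""
    totals = [dimension_totals(org) for org in orgs]
    dims = set().union(*(t.keys() for t in totals))
    return {dim: mean(t.get(dim, 0) for t in totals) for dim in dims}

# Illustrative example with one dimension and two levels.
org = {("DER", "Defined"): [3, 2], ("DER", "Basic"): [1, 1]}
print(dimension_totals(org))  # {'DER': 3.5}
```

With totals of 0-12 per dimension, the threshold of 6 mentioned in use case 4 corresponds to an average attribute score of 2 (yellow) across all four levels of a dimension.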

Thus, our study has provided a deployment-ready model for evaluating organisations' security levels and an opportunity to centralise the results at the state level.

4 Evaluation

We evaluated the framework's compliance with the stated requirements (Table 1) using semi-formal methods (a feedback seminar with respondents, written feedback, and an interview with a security expert).

Req. 1: The respondents confirmed that the framework gave an overview of the security scope and implementations and remained comprehensible to all respondents. Req. 2: A few respondents noticed duplicates in the framework, which we removed, and pointed out some aesthetic issues in the presented results. Additionally, they indicated that the results attracted managers' attention and provided motivation to compare against a benchmark. One respondent suggested an improvement: replacing the three-colour traffic-light evaluation scale with a four-level scale, to force the respondent to decide whether the situation is somewhat positive or rather negative. Req. 3 was reflected in the time spent (between 30 min (IT officers) and 2 h (data protection officers), 60 min on average); the respondents also confirmed that no entrance barriers were detected. Req. 4 received a positive response: the framework attributes gave a compact overview of the E-ITS requirements: "One hour, and I got the full picture!"

The independent expert analysed the framework dimensions and levels based on the literature and related work, and agreed with the national standard's structural elements and measures. The attributes divided into the Initial, Defined and Basic levels were also double-checked to ensure they follow the internal rules (see the Framework Levels paragraph in Sect. 3).

The first iterations indicated some linguistic limitations: respondents were not able to colour attributes starting from the Initial level. To resolve the problem, we revised the framework so that higher-level attributes would also contain the lower-level attributes. We excluded the first-level results from our examples (see Fig. 2a) to avoid possible misinterpretations.

Limitations. This work only gives insight into how the provided model data can be interpreted. To validate the method as an acceptable benchmark, a bigger reference group is needed. Woods and Böhme [10] have shown that indicators of security have little explanatory power on their own. Furthermore, any security guideline contains the inherent weakness of generalisation. Measuring information security is a complex task; the validity of any metric should be considered with care, and updating the attributes requires expert review.