1 Optimisation of Quality Assurance

Analytical quality assurance is the part of software quality assurance that analyses artefacts and processes to assess the level of quality and identify quality shortcomings. Analytical quality assurance is a major cost factor in any software development project. While real numbers are hard to find, most practitioners would agree that they spend a lot of time on testing and similar activities. At the same time, analytical quality assurance is indispensable to avoid quality problems of all kinds.

Hence, we have a set of activities in the development process that are expensive but also decisive for the quality, and thereby probably the success, of a software product. Therefore, there is a practical need to optimise which kinds of analytical quality assurance are employed, to what degree and with how much effort, to reach high quality at low costs.

In the following, we will first discuss how this problem could be framed from an economics point of view. Second, we will discuss the progress on evaluating and understanding the effectiveness and efficiency of various analytical quality assurance techniques. Third, we will widen our perspective by discussing psychological aspects of quality assurance and, finally, discuss some concrete, current proposals to optimise software testing.

2 Quality Economics

As the optimisation of quality assurance involves many different factors, such as the kinds of techniques applied, the effort spent on them and the kinds of faults present in the software, it is difficult to find a common unit. Hence, various quality economics approaches have been proposed that use monetary units [1, 10].

We have proposed a quality economics model to estimate and evaluate costs and revenue from using analytical quality assurance mainly based on effort and difficulty functions of faults and quality assurance techniques [11,12,13]. We show in Fig. 1 just the part of the model describing the direct costs of applying an analytical quality assurance technique. Such an application usually comes with some kind of fixed setup costs, execution costs depending on how much effort we spend and removal costs depending on the probability that we detect a given fault.

Fig. 1. The direct costs of an analytical quality assurance technique

This needs to be complemented with future costs: the costs incurred by faults that are not detected by analytical quality assurance and lead to failures in the field. Those faults also need to be removed, and there are further effect costs such as compensation payments to customers. We showed that especially the latter are an important part of the model but extremely difficult to determine: a very simple fault can have catastrophic effects, while a very complicated, far-ranging fault might never lead to a failure. Overall, we found that the empirical basis is too thin to apply such models in practice and, because of the difficulty of collecting all the necessary data, might never improve.
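The direct-cost part of such a model can be sketched as follows. All names, figures and function shapes here are illustrative assumptions, not the actual model from [11,12,13]:

```python
# Illustrative sketch of the direct-cost part of a quality-economics model:
# direct costs = fixed setup costs
#              + execution costs (a function of effort spent)
#              + expected removal costs over the faults in the system.
# All figures and function shapes below are hypothetical.

def direct_costs(setup, execution_cost_per_hour, effort_hours,
                 faults, removal_cost, detection_probability):
    """Expected direct costs of applying one QA technique.

    faults: list of fault identifiers
    removal_cost(fault): cost of removing a detected fault
    detection_probability(fault, effort_hours): chance the technique
        detects `fault` given the effort spent (a 'difficulty function').
    """
    execution = execution_cost_per_hour * effort_hours
    expected_removal = sum(
        detection_probability(f, effort_hours) * removal_cost(f)
        for f in faults
    )
    return setup + execution + expected_removal

# Toy example: three faults, flat removal cost, detection probability
# growing linearly with effort up to certainty.
probe = direct_costs(
    setup=500.0,
    execution_cost_per_hour=80.0,
    effort_hours=10.0,
    faults=["f1", "f2", "f3"],
    removal_cost=lambda f: 200.0,
    detection_probability=lambda f, h: min(1.0, 0.05 * h),
)
print(probe)  # 500 + 800 + 3 * 0.5 * 200 = 1600.0
```

The future costs mentioned above would enter as a second term summing, over the undetected faults, the field removal and effect costs, which is exactly the part that is hard to quantify empirically.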

3 Effectiveness and Efficiency of Quality Assurance Techniques

Therefore, we are convinced that it is more useful to devote research to the effectiveness (what share of the faults is found?) and efficiency (what is the relation of the number of faults found to the effort spent?) of particular quality assurance techniques, so that we can at least judge and compare them and have a basis for deciding which techniques to apply when and for what. There have been many studies evaluating these aspects for various quality assurance techniques. In particular, testing and inspections have been a focus of empirical research.

We have also contributed to a better understanding of several quality assurance techniques. For model-based testing, we could show that a large share of the detected defects is already found during the modelling activity. The created test suite itself was not more effective than a test suite created by experts, but it was able to find other defects, often defects that require a particularly long interaction sequence [9]. For black-box integration testing, we could show that test suites with a higher data-flow-based coverage are able to detect more defects [2].

In a comparison of tests, reviews and automated static analysis [15], we found that automated static analysis finds a subset of the defect types found by reviews, but if it detects a specific type, it detects it more thoroughly. Tests found completely different defect types than reviews and automated static analysis. In a study of automated static analysis alone [14], we found that none of the 72 analysed field defects would have been found by any of the static analysis tools used. For the particular static analysis technique of clone detection, we found 107 faults in five production systems by looking specifically at clones with differences between their instances, leading to the observation that every other clone with unintended differences between its instances constitutes a fault [3].

Together with the large and growing body of evidence from other researchers, this starts to give us a good understanding of the effectiveness and efficiency of analytical quality assurance. The main weaknesses I still see today are the infrequent replication of studies and the slightly different operationalisations of effectiveness and efficiency, which make meta-analyses difficult.

4 Psychological Aspects

A further dimension that we believe to be of critical importance, but that has not been widely investigated, comprises the psychological factors of the software developers and testers applying analytical quality assurance. In particular, we studied the use of automated static analysis in relation to the personality of the developers [8] and the stress it caused for developers [7]. For example, we found that while people with a high level of agreeableness show a relatively structured strategy in dealing with static analysis findings, working in cycles of small changes, people with high neuroticism show a more chaotic approach, making larger changes and impulsive decisions: they change the file they were working on without finishing the work they had started. Such findings can in turn inform how to improve quality assurance techniques or the user interfaces of corresponding tools.

5 Test Optimisation

Especially for the optimisation of tests, there is a lot of existing research on test case prioritisation and test suite minimisation. These techniques aim at executing only relevant test cases and at executing test cases with a high probability of detecting a defect for a given change early. Yet, there is still room for improvement. We recently introduced two novel concepts to optimise test suites: (1) detection of pseudo-tested code [4, 6] and (2) inverse defect prediction [5].
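A classic heuristic from this line of research is greedy "additional coverage" prioritisation: repeatedly pick the test that covers the most not-yet-covered code elements. This is a minimal sketch of that standard heuristic, not the paper's own method; the test names and coverage sets are invented:

```python
# Greedy additional-coverage test prioritisation: order tests so that
# each next test adds the most code elements not covered so far.

def prioritise(test_coverage):
    """test_coverage: dict mapping test name -> set of covered elements.
    Returns the test names in greedy additional-coverage order."""
    remaining = dict(test_coverage)
    covered = set()
    order = []
    while remaining:
        # Pick the test adding the most new coverage
        # (ties broken by alphabetical test name).
        best = max(sorted(remaining),
                   key=lambda t: len(remaining[t] - covered))
        order.append(best)
        covered |= remaining.pop(best)
    return order

tests = {
    "t1": {"a", "b"},
    "t2": {"a", "b", "c"},
    "t3": {"d"},
}
print(prioritise(tests))  # ['t2', 't3', 't1']
```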

We define pseudo-tested code as code that is covered by some test case but not effectively tested by it: the test case would not detect faults in that code. We detect pseudo-tested code using an extreme mutation operator: we remove the whole implementation of a method or function and return a constant instead. In empirical analyses, we found pseudo-tested code in all 19 study objects, with a median of 10% pseudo-tested methods. In a practical setting, this can help to correct misleading coverage values and to concentrate on test cases that effectively add coverage.
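The idea behind the extreme mutation operator can be illustrated with a toy sketch. The actual tooling in [4, 6] works on compiled code; this hypothetical version simply swaps out a plain Python function and re-runs a (deliberately weak) test suite:

```python
# Conceptual sketch of detecting pseudo-tested code: replace a method's
# whole body with a constant return (the extreme mutation operator) and
# re-run the tests. If the suite still passes, the method is pseudo-tested.

import types

def extreme_mutant(constant):
    """Return a function that ignores its arguments and returns a constant."""
    return lambda *args, **kwargs: constant

def is_pseudo_tested(module, func_name, constant, run_tests):
    """Temporarily replace module.func_name with an extreme mutant.

    run_tests: callable returning True if the test suite passes.
    Returns True if no test fails, i.e. the function is pseudo-tested.
    """
    original = getattr(module, func_name)
    setattr(module, func_name, extreme_mutant(constant))
    try:
        return run_tests()
    finally:
        setattr(module, func_name, original)

# Toy demo: a "module" with one covered-but-untested function.
mod = types.SimpleNamespace(discount=lambda price: price * 0.9)

def weak_suite():
    mod.discount(100)  # covers discount(), but asserts nothing about it
    return True        # suite "passes" regardless of the result

print(is_pseudo_tested(mod, "discount", 0, weak_suite))  # True
```

A suite that actually asserted `mod.discount(100) == 90.0` would fail under the mutant, and the method would not be flagged.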

We introduced inverse defect prediction to identify methods that are too trivial to be tested. For example, many getter and setter methods in Java contain only trivial functionality, and in many cases, testing these methods is not time and effort well spent. We identified a set of metrics that we considered likely indicators of trivial methods, used them for association rule mining and derived association rules that identify methods with a low fault risk. This effectively forms a theory of low-fault-risk methods. The predictor that uses these rules is what we call inverse defect prediction. It is effective in identifying methods with low fault risk: on average, only 0.3% of the methods classified as "low fault risk" are faulty. The identified methods are, on average, 5.7 times less likely to contain a fault than an arbitrary method in the analysed systems. Hence, we can either not test those methods at all or at least execute the corresponding test cases last.
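The shape of such a rule-based predictor can be sketched as follows. The metric names and thresholds are purely illustrative assumptions; the actual rules in [5] are derived by association rule mining, not hand-written:

```python
# Hedged sketch of an inverse-defect-prediction style classifier:
# flag a method as "low fault risk" when simple metrics all indicate
# triviality. Metric names and thresholds are hypothetical.

def low_fault_risk(method_metrics):
    """method_metrics: dict with illustrative keys
    'loc' (lines of code), 'cyclomatic' (cyclomatic complexity),
    'is_getter_or_setter' (bool)."""
    m = method_metrics
    return (m["is_getter_or_setter"]
            or (m["loc"] <= 3 and m["cyclomatic"] == 1))

methods = [
    {"name": "getName", "loc": 1, "cyclomatic": 1, "is_getter_or_setter": True},
    {"name": "parse",   "loc": 40, "cyclomatic": 9, "is_getter_or_setter": False},
]
# Methods whose tests could be skipped or deferred to the end of the run.
skip = [m["name"] for m in methods if low_fault_risk(m)]
print(skip)  # ['getName']
```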

6 Conclusions

Optimising analytical quality assurance has been an important practical problem, as well as a corresponding research area, for many years. We have learnt that precise economic models suffer from a lack of empirical data and, hence, have not led to much practical progress. Yet, many empirical studies have contributed to a better understanding of the effectiveness and efficiency of particular quality assurance methods. Furthermore, new research directions such as the influence of psychological factors, pseudo-tested code and the prediction of low-fault-risk code provide us with more theoretical understanding and already have practical impact in the form of new tools.