OpenSCV: An Open Hierarchical Taxonomy for Smart Contract Vulnerabilities

Smart contracts are nowadays at the core of most blockchain systems, as they specify and allow an agreement between entities that wish to perform a transaction. Like any computer program, smart contracts are subject to the presence of residual faults, including severe security vulnerabilities, which require that the vulnerable contract is terminated in the blockchain. In this context, research began to be developed to prevent the deployment of smart contracts holding vulnerabilities, mostly in the form of vulnerability detection tools. Along with these efforts, several heterogeneous vulnerability classification schemes arose (most notably DASP and SWC). At the time of writing, these are mostly outdated initiatives, despite the fact that smart contract vulnerabilities are continuously being discovered, and the associated rich information is mostly disregarded. In this paper, we propose OpenSCV, a new and Open hierarchical taxonomy for Smart Contract Vulnerabilities, which is open to community contributions and matches the current state of the practice, while being prepared to handle future modifications and evolution. The taxonomy was built based on the analysis of research on vulnerability classification, community-maintained classification schemes, and research on smart contract vulnerability detection. We show how OpenSCV covers the announced detection ability of current vulnerability detection tools, and highlight its usefulness as a resource in smart contract vulnerability research.


Introduction
Smart contracts play an important role in advancing blockchain because they expand the application of the technology to various domains (e.g., education (Grech and Camilleri, 2017), healthcare (Agbo et al., 2019), government (Geneiatakis et al., 2020)). While they are essential for the consolidation and expansion of the technology, they also bring serious risks, namely those associated with the potential presence of vulnerabilities that can affect the security of the blockchain system (Atzei et al., 2017).
Just as conventional programs, smart contracts are being deployed with residual software faults (i.e., bugs or defects), including security vulnerabilities (i.e., internal faults that enable external events to harm the system) (Qian et al., 2022;Avizienis et al., 2004). However, the consequences of deploying a faulty contract have particular characteristics in the context of blockchain systems, such as: i) if faulty code is identified, the respective contract cannot be patched, it must be terminated, and a new one should be created (Zou et al., 2019); ii) once the potentially erroneous data (generated/updated by faulty contracts) has been stored in the blockchain, there is no way to change it, i.e., to undo the respective transactions (and subsequent transactions that rely on this data) (Yaga et al., 2018); and iii) if the faulty contract has been executed, the associated impact may be irreparable (e.g., reputation costs) (Antonopoulos and Wood, 2018).
Several initiatives have been created that ultimately aim at contributing to the development of more secure smart contracts. Among these initiatives, we find three main types: i) new smart contract programming languages (e.g., Clarity (Blockstack and Algorand, 2021), Vyper (Kaleem et al., 2020), Obsidian (Coblenz, 2019)), which aim at increasing protection against faults; ii) new vulnerability detection tools (e.g., Mythril (ConsenSys, 2021), Neucheck (Lu et al., 2019), (Bose et al., 2022), SoliDetector (Hu et al., 2023)), which have the main goal of detecting vulnerabilities in smart contracts so that vulnerable contracts do not reach the deployment phase; and iii) vulnerability classifications, which mostly allow knowledge regarding vulnerabilities to be systematized and vulnerabilities to be identified in a standard manner.
The existence of vulnerability (or software defect, in general) classifications is quite important, as we can observe by the research and industry effort associated with well-known cases like OWASP (OWASP Foundation, 2001), NVD (government, 1999), CVE (MITRE Corporation, 1999), CWE (CWE Community, 2009), or, in the case of smart contracts, most notably SWC (SmartContractSecurity, 2020) and DASP (NCCGroup, 2021). Generally, they raise the level of awareness among developers and may allow, in a uniform manner, development tools to assist developers regarding mistakes being placed in the code. They may also help in the design and development of vulnerability detection tools and in the assessment of their detection capabilities (Durieux et al., 2020;Hu et al., 2021;di Angelo and Salzer, 2019). This also holds for programming languages. It is known that languages, such as Obsidian, have benefited from the systematized knowledge of vulnerabilities. There are even studies that use taxonomies as a basis for comparing different programming languages with respect to the protection offered against certain types of vulnerabilities, e.g., (Kaleem et al., 2020).
At the time of writing, vulnerability classifications for smart contracts have significant limitations. They are generally outdated, with popular schemes like DASP or SWC not having been updated since 2018. This largely differs from the state of the practice, in which we find cases of tools like Securify2 (Tsankov, 2018) already detecting several vulnerabilities for which there is no accurate description. As in other software areas, with new vulnerabilities being continuously discovered, having a flexible way of integrating new defects (and possibly restructuring the classification) is crucial.
Vulnerability naming and classification schemes are being defined using arbitrary nomenclatures. This is easily visible just by analyzing a few of the most cited papers in vulnerability detection, e.g., (Luu et al., 2016;Tsankov et al., 2018;Kalra et al., 2018). The lack of a standard nomenclature leads to verification tools mostly using arbitrary names to present their results, e.g., SmartCheck (Tikhomirov et al., 2018) and Slither (Feist et al., 2019) respectively use balance equality and incorrect-equality/locked-ether to refer to the same vulnerability. As a result, it is very difficult to compare the effectiveness of different tools. As classifications are many times built based on multiple sources, such as different industry tools and several research papers (Rameder et al., 2022), terms easily end up being inconsistent. This is aggravated when there is no active maintenance even for known issues. Indeed, reduced community contribution is known to be a problem, with the main classifications that are community-oriented (i.e., DASP, SWC) showing residual community activity, many times related to minor issues (e.g., broken links) (NCC Group, 2019;SmartContractSecurity, 2020).
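To make the comparison problem concrete, the following minimal Python sketch shows how tool-specific labels could be normalized to a single identifier before tool outputs are compared. The alias table and the identifier `strict_balance_equality` are illustrative assumptions (they are not names taken from SmartCheck, Slither, or any taxonomy); only the two tool labels come from the text above.

```python
from typing import Optional, Tuple

# Hypothetical alias table: (tool, label reported by the tool) -> canonical identifier.
# "strict_balance_equality" is a placeholder chosen for this sketch only.
ALIASES = {
    ("smartcheck", "balance equality"): "strict_balance_equality",
    ("slither", "incorrect-equality/locked-ether"): "strict_balance_equality",
}


def normalize(tool: str, label: str) -> Optional[str]:
    """Return the canonical identifier for a tool label, or None if unmapped."""
    return ALIASES.get((tool.strip().lower(), label.strip().lower()))


def same_vulnerability(a: Tuple[str, str], b: Tuple[str, str]) -> bool:
    """True only when both (tool, label) pairs map to the same known identifier."""
    ca, cb = normalize(*a), normalize(*b)
    return ca is not None and ca == cb
```

With such a table, findings from two tools become comparable only for labels that have an entry; unmapped labels return `None`, which mirrors the situation in which no shared nomenclature exists.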
Many times, vulnerability classification schemes mix the characteristics of a certain vulnerability with the effect of exploiting it, how it is exploited, or its impact. This conceptual inconsistency is quite visible in current taxonomies. As an example, (Kaleem et al., 2020) presents DoS with unbounded operation as a vulnerability, but it is not possible to understand what the vulnerability is from this name (e.g., it can be a problem in a loop, or it can be a malicious call that is externally triggered several times). Instead, the given name refers to the possible impact of exploiting a vulnerability, which should be a separate dimension for characterizing the defect. This similarly occurs in DASP (NCC Group, 2019), in which one of the categories is precisely Denial of Service. Another aspect this latter example shows is that taxonomies are being built with inadequate granularity, often too coarse to be really helpful. For instance, the Denial of Service category in DASP may refer to gas limit reached, unexpected throw, unexpected kill, or access control breached. Moreover, the description is sometimes so short that it may become ambiguous (e.g., access control breached may refer to a vulnerability that would simply fit in access control, which is another DASP category).
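The idea of keeping the defect and its impact as separate dimensions can be sketched as a small record type. This is an illustrative Python sketch, not a structure defined by any of the cited classifications; the `Entry` type and its field names are assumptions, while the four defect labels are the ones listed above for the DASP Denial of Service category.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Entry:
    defect: str               # what is wrong in the code
    impacts: Tuple[str, ...]  # possible consequences, kept as a separate dimension


# Defect labels taken from the DASP example; the impact assignment is illustrative.
ENTRIES = [
    Entry("gas limit reached", ("denial of service",)),
    Entry("unexpected throw", ("denial of service",)),
    Entry("unexpected kill", ("denial of service",)),
    Entry("access control breached", ("denial of service",)),
]


def by_impact(entries: List[Entry]) -> Dict[str, List[str]]:
    """Derive an impact-oriented view without making impact part of the defect name."""
    index: Dict[str, List[str]] = defaultdict(list)
    for entry in entries:
        for impact in entry.impacts:
            index[impact].append(entry.defect)
    return dict(index)
```

In this arrangement an impact-based grouping such as Denial of Service is a derived view over the entries rather than a category name, so each defect keeps a name that describes the problem itself.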
In this paper, we propose OpenSCV, a new hierarchical and Open taxonomy for Smart Contract Vulnerabilities (available at http://openscv.dei.uc.pt), which is open to community contributions, aims at matching the current state of the practice, and is prepared to handle future modifications and evolution. To build the taxonomy, we analyzed current smart contract vulnerability classifications and discussed their gaps and limitations. We then analyzed the announced detection capabilities of 49 research works on smart contract vulnerability detection with the goal of collecting a heterogeneous set of 357 vulnerability definitions. We then mapped the vulnerabilities to existing classifications, namely DASP (NCC Group, 2019), SWC (SmartContractSecurity, 2020), (Rameder et al., 2022), and CWE (CWE Community, 2009), and further characterized them using the Orthogonal Defect Classification (ODC) (IBM, 2013b,a) and with a code excerpt. Names were then consolidated and grouped in a structure that was built bottom-up. This process involved 2 Experienced Researchers and 1 Early Stage Researcher, who revised the proposed taxonomy iteratively in terms of structure, correctness, and uniformity.
We structured OpenSCV to be flexible to changes and evolution by preparing a supporting infrastructure on GitHub. We are able to easily receive change requests and integrate information from new research on vulnerability detection into the taxonomy. All OpenSCV entries are supported by a code example, with the goal of mitigating possible ambiguities in the description of each vulnerability, and we also prepared an initial dataset holding vulnerable contracts (one for each of the vulnerabilities present in OpenSCV) and their respective corrections. OpenSCV is live and available at http://openscv.dei.uc.pt, the GitHub repository is available at  and linked to Zenodo, which permanently hosts the dataset (Vidal et al., 2023a). It is worthwhile mentioning that the taxonomy considers mostly software vulnerabilities and a few software defects considered in the literature to be associated with high security risks. For simplicity, we use the term vulnerability throughout the paper to refer to both cases.
The rest of this paper is organized as follows. Section 2 discusses the related work and limitations of current vulnerability classification schemes. Section 3 presents the process followed to build the taxonomy and overviews the final outcome. Section 4 presents the taxonomy structure and provides a brief description of all vulnerabilities included in the taxonomy. Section 5 characterizes and discusses the coverage of the taxonomy in perspective with the state of the art. Section 6 presents the threats to the validity of this work, and finally, Section 7 concludes this paper.

State of the Art
This section presents the quality properties of taxonomies and then discusses the existing classification schemes for smart contract vulnerabilities. The classification schemes presented have their origins in: a) existing research on smart contract vulnerability classification; b) community-oriented initiatives; and c) vulnerability detection research. The section closes with a discussion of the gaps and limitations of current classifications.

Taxonomy Quality Properties
We analyzed a set of reference works centered around the definition of taxonomies as well as critical analyses of vulnerability taxonomies (Bishop and Bailey, 1996;Lindqvist and Jonsson, 1997;Mann and Christey, 1999;Rameder et al., 2022;Lough, 2001;Hansman and Hunt, 2005) to identify a set of quality properties that should be followed when designing a taxonomy that is expected to be long-lived. The following paragraphs discuss the identified properties.
A classification system may benefit from a hierarchical organization, as it allows showing similar characteristics of related vulnerabilities, which may be helpful for vulnerability prevention (Bishop and Bailey, 1996). A hierarchical structure may be a tree in which each node refers to a category of vulnerabilities, and each leaf corresponds to an individual vulnerability. Thus, the granularity of the categories should generally vary from coarse to fine as we traverse the tree from the root to the leaves. Nodes at a certain tree level must be as uniform as possible, i.e., ideally representing the same level of abstraction or a group of vulnerabilities viewed from the same perspective. This is quite difficult to achieve because, many times, it has to be balanced against the creation of taxonomy trees that become too complex, which, in the end, may make the taxonomy less comprehensible or less helpful. Also, sometimes the nature of the problem is simply unbalanced or heterogeneous (in structure), which basically prevents this criterion from being met. In any case, a uniform taxonomy may contribute to fewer errors (in its use) and, as such, to a higher probability of adoption by practitioners. In practice, it may contribute to a taxonomy that is useful and comprehensible (Lindqvist and Jonsson, 1997) (i.e., understandable by security experts but also by less specialized people).
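The tree organization described above can be sketched in a few lines of Python. The sketch is illustrative only: the `Node` type, the helper names, and the example grouping are assumptions made here, and checking that all leaves sit at the same depth is just one simple proxy for the uniformity property.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """A taxonomy node: a category if it has children, a vulnerability if it is a leaf."""
    name: str
    children: List["Node"] = field(default_factory=list)


def leaves(node: Node, depth: int = 0) -> Iterator[Tuple[str, int]]:
    """Yield (vulnerability name, depth) for every leaf below `node`."""
    if not node.children:
        yield node.name, depth
        return
    for child in node.children:
        yield from leaves(child, depth + 1)


def is_uniform(root: Node) -> bool:
    """True when all leaves sit at the same depth (a rough proxy for uniformity)."""
    return len({depth for _, depth in leaves(root)}) == 1


# Illustrative grouping only; the labels reuse names that appear in this paper.
taxonomy = Node("Smart contract vulnerabilities", [
    Node("Arithmetic", [Node("Integer over- or underflow"), Node("Integer division")]),
    Node("Access control", [Node("Use of tx.origin")]),
])
```

Attaching a vulnerability directly under the root of `taxonomy` would make `is_uniform` return `False`, which corresponds to the unbalanced case mentioned in the text.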
The selection of names to be used in a classification scheme is particularly important. The name that describes a certain vulnerability must be unique (i.e., a unique identifier) and non-ambiguous (Howard, 1997;Lindqvist and Jonsson, 1997), meaning that the name and the associated description must not only allow for easy identification but also include enough information to distinguish the vulnerability from others (Mann and Christey, 1999;Bishop and Bailey, 1996). Whenever possible, existing terminology should be used (Lindqvist and Jonsson, 1997). The name and characteristics of a certain defect should characterize what the issue is and not additional dimensions, such as the effect of exploiting it. While it is acceptable to understand the effect of exploiting a certain vulnerability starting from its description, the characteristics of the problem itself cannot be omitted and should be clearly identified (Mann and Christey, 1999;Bishop and Bailey, 1996).
Regardless of the perspective of the individual using the taxonomy, a certain defect should be classified in the same manner by different individuals (e.g., developers, users, testers). This means that not only the names and structure should be as clear as possible, but also that the process of classifying a certain defect must be made clear (whenever the structure and nomenclature are not sufficient), i.e., there must be a deterministic (Krsul, 1998) way of classifying a certain defect, which fosters repeatability (Howard, 1997;Krsul, 1998) of using the classification.
Finally, a taxonomy should also allow for completeness (Amoroso, 1994), i.e., the taxonomy should provide good coverage (Rameder et al., 2022) of the vulnerabilities identified in the state of the art or reported by vulnerability detection tools. Also, it should be open to the community (i.e., accept new entries from the community) and shareable (i.e., no distribution restrictions) (Mann and Christey, 1999). The fact that it is open is also a factor that can contribute to it being accepted (Amoroso, 1994;Howard, 1997).

Smart Contract Vulnerability Classification Schemes
To the best of our knowledge, the first initiative to classify smart contract vulnerabilities (for Ethereum systems) is proposed in (Atzei et al., 2017). The authors listed 12 vulnerabilities, which we overview in Table 1, and implemented nine of the corresponding attacks. This initial effort is quite relevant but holds some limitations. Some of the selected names do not really specify the nature of the vulnerabilities or are not clear about the problem being characterized (e.g., "call to the unknown"). This limitation was mitigated in (Zhou et al., 2022) and (Argañaraz et al., 2020), where the authors tried to make the names used more specific. In (Atzei et al., 2017), three categories of issues are proposed: i) Solidity Issues (i.e., language weaknesses), ii) EVM Issues (i.e., residual faults in bytecode), and iii) Blockchain Issues (i.e., vulnerabilities from blockchain technology). Despite allowing an initial separation of the issues (which may help developers in dealing with the faults), this scheme does not benefit from the presence of a more complex hierarchy, which is a better fit for cases where we find several interrelated families of vulnerabilities. We have also identified that these three categories may generate some ambiguity, as some cases could potentially fit into multiple categories. For example, Immutable Bugs could be classified into EVM or Blockchain. Despite this, the separation between the cases referring to the programs (i.e., Solidity source code or EVM binary code) and the blockchain platform is helpful. This classification is not available in a public repository and, due to its age, its coverage is relatively low, accounting for 12 vulnerabilities.
Table 2 overviews the vulnerability classification presented in (Kaleem et al., 2020), which has the goal of allowing comparison between the security of the Solidity and Vyper languages. The work presents 18 vulnerabilities, along with a detailed explanation and an individual code example for each one. Being mostly a list of vulnerabilities, there are no benefits associated with hierarchical structures. There is no open public repository associated with the proposal, and the 18 vulnerabilities are nowadays a small amount, e.g., the work in (Rameder et al., 2022) identifies a total of 54 defects.
A vulnerability classification is presented in (Argañaraz et al., 2020) with the goal of exposing threats and, ultimately, minimizing the presence of software bugs in smart contracts. Table 3 presents the proposed classification, in which we find faults separated into two levels: i) security (i.e., faults that may be exploited by attacks); and ii) functional (i.e., faults that violate the program's functionality). Each fault is also associated with a criticality level, which may be useful for getting developers' attention while coding. We found that certain cases, such as Non-verified maths and Malicious libraries, may indicate the same vulnerability, indicating a potential need for further clarification and refinement of the classification process to address any ambiguity. Similar to the previously presented works, there is no hierarchical structure besides the two groups of vulnerabilities. The names used in the classification are quite specific (e.g., Use of tx.origin), which makes it difficult to understand the problem in a more abstract manner. Only 13 faults are considered, with no possibility of expansion. Still, the idea of classifying the faults into two broad concepts of security and functionality is a vision that may be interesting for newer classifications (e.g., targeting different types of systems).
A smart contract vulnerability classification is presented in (Zhou et al., 2022), based on the previous work in (Atzei et al., 2017). In this classification, summarized in Table 4, the groups were maintained (i.e., Solidity, EVM, Blockchain), but the vulnerability entries were modified (i.e., some names were removed, like stack size limit and gasless send, and other names were included, like tx.origin and default visibility). The authors linked the proposed names to an external taxonomy, namely CWE (CWE Community, 2009), which is helpful for understanding each vulnerability, verifying the correctness of the proposed classification, and also for standardization purposes. The proposed classification defines a basic separation of vulnerabilities, mostly distinguishing cases related to the programs from cases related to the platform. Again, the number of vulnerabilities listed is quite small (i.e., 13 vulnerabilities), and the work could benefit from a repository open to community contributions.
Table 5 presents a vulnerability classification proposed by (Amiet, 2021). The classification is based on two categories: i) core blockchain vulnerabilities (i.e., vulnerabilities related to the blockchain platform); and ii) smart contract vulnerabilities (i.e., vulnerabilities related to the programs deployed in the blockchain). At the blockchain level, examples are provided (e.g., attacks on the consensus mechanism), whereas, at the contract level, pseudo-code is presented, which clarifies the security issues identified. These two broad groups are a basis for applying the classification to other types of systems. There are no further hierarchical levels present in this taxonomy, and we found vulnerability names that are unclear, such as Improper Blockchain Magic Validation, which does not really characterize the technical details involving the vulnerability. As with previous cases, the 12 vulnerabilities represent a quite small number of currently known vulnerabilities.
A classification of 28 vulnerabilities is proposed in (Staderini et al., 2020) and was further evolved to categorize a total of 33 vulnerabilities in (Staderini et al., 2022). Table 6 presents an overview of the authors' classification, identifying the acronym and name of the vulnerability and an associated CWE (CWE Community, 2009).
As we can see in Table 6, the additional characterization by CWE is quite helpful, although it is not accompanied by a blockchain-specific classification scheme, such as SWC (SmartContractSecurity, 2020), which could help in unifying knowledge. The source of information is a set of four references in (Staderini et al., 2020) and five in (Staderini et al., 2022), which do not directly map to the state of the practice (e.g., tools for vulnerability detection). Still, the process for building the classification is insightful and helpful as a way to solidify our own classification, e.g., by allowing a verification of our own mapping to CWE.
A consolidated taxonomy is presented in (Rameder et al., 2022). The authors were able to collect 54 vulnerabilities reported from different verification tools and grouped them into 10 categories. Table 7 overviews the taxonomy created by the authors.
This classification is more fine-grained than the previously discussed ones. However, there are a few issues with some names given to the vulnerabilities. For instance, it is not obvious to what extent Integer bugs or arithmetic issues (7C) is different from Integer over- or underflow (7A) or Integer division (7B), and there are names like Gas costly loops and Gas costly pattern which seem very similar. Also, names like configuration error are quite generic and could lead to a more specific vulnerability like Environment Configuration Issues. Regarding the structure itself, the taxonomy has a flat organization in which the categories do not really represent aspects at the same abstraction or conceptual level. For instance, Denial of Service is generally considered a type of attack or the effect of an exploited vulnerability, and Configuration Issues is quite generic and does not characterize the vulnerability sufficiently. We can observe a similar issue between the names given to the categories and to the specific vulnerabilities, e.g., the Bad Coding group versus the Coding error vulnerability. Although the classification lists 54 vulnerabilities, it would benefit from a way of evolving and including more recent ones (e.g., via an open repository).

Community-Based Classification Schemes
This section discusses taxonomies or classification initiatives maintained by communities. One of the most popular ones is the Smart Contract Weakness Classification (SWC) (SmartContractSecurity, 2020), a vulnerability classification scheme for smart contracts whose main goals are to: i) provide a straightforward way to classify 'weaknesses' of a smart contract; ii) identify weaknesses that lead to vulnerabilities; iii) define a common language to describe weaknesses in the architecture, design, and coding of smart contracts; and finally, iv) be a way to improve the effectiveness of smart contract security analysis tools (Wagner, 2018). In SWC, each software defect has an external relationship with another taxonomy (i.e., CWE (CWE Community, 2009)), and there are examples (i.e., faulty and non-faulty code) to illustrate the vulnerability and a correction. SWC is a flat list structure, where the distinction between vulnerabilities and other types of defects is many times unclear. Also, it is worthwhile mentioning that there are cases where it is difficult to distinguish whether the problem is related to the blockchain platform or to the smart contract itself (e.g., Weak Sources of Randomness from Chain Attributes, Unencrypted Private Data On-Chain). A positive aspect is that SWC is associated with an open repository, although, at the time of writing, the last update was made in 2018. Considering the changes and new knowledge about smart contract vulnerabilities, this means that practitioners' involvement is now impaired. For instance, the classification presented in (Rameder et al., 2022) identifies several new defects that are not present in SWC.
The NCC Group initiated the Decentralized Application Security Project (DASP) in 2018, which includes a vulnerability classification scheme for smart contracts. The idea of the classification is to present the top 10 threats to smart contract security, for which a single iteration was carried out, precisely in 2018. Thus, it does not really reflect the whole landscape of vulnerabilities. DASP provides a short description for each class of vulnerabilities, which is accompanied by pseudo-code as a way of explaining the defects. The classification emphasizes the impact the vulnerability had in real-world scenarios (e.g., for reentrancy, a loss estimated at 3.5M ETH, about 50M USD at the time). References to real-world attacks are provided (i.e., reports, magazines, etc.), which present a historical view of vulnerability exploitation. The nomenclature is clear, although some parts of the structure are questionable. For instance, the Denial of Service category in DASP refers to gas limit reached, unexpected throw, unexpected kill, and access control breached. The description is sometimes so short that it may become ambiguous (e.g., access control breached may refer to a vulnerability that would simply fit in Access Control, which is another DASP category). In (Durieux et al., 2020), the authors used DASP but concluded that the categories were not sufficient to cover the vulnerabilities found.
SIGP (Manning, 2018) is a vulnerability classification scheme for smart contracts written in Solidity that forms the basis of the work in (Antonopoulos and Wood, 2018). The classification considers three main elements: vulnerability, preventive technique, and a real-world example. The first element conceptually describes the reported vulnerability. It also presents the vulnerable code and explains how the attack is performed. The second element presents a solution for the problem, and the last element discusses a real-world attack in which the vulnerability was exploited. The clarity of the names used for the vulnerabilities could be improved (e.g., entropy illusion and constructors with care are ambiguous). There is an associated open repository, but it is not receiving any updates at the time of writing. As in previous cases, there are only 16 vulnerabilities listed, which is currently far from the state of the practice.
The SMARTDEC classification (SmartDec Corporation, 2018) originated from the experience gathered from the creation of SmartCheck (Tikhomirov et al., 2018). The vulnerabilities are organized into three main categories: Blockchain (i.e., vulnerabilities from the blockchain system), Language (i.e., programming language defects), and Model (i.e., vulnerabilities caused by mistakes in the model). Each group has several entries (up to a total of 11), where each entry corresponds to a set of related vulnerabilities. The entry names are unique, although they are also quite generic and therefore less descriptive (e.g., Trust). The authors provide a mapping between their taxonomy and other classifications, namely DASP (NCCGroup, 2021), SWC (SmartContractSecurity, 2020), and SIGP (Manning, 2018). As an example, the Arithmetic category is related to Over/underflow in SWC-101, DASP-3, and SP-2 and to Precision issues in SP-15. The repository is open to contributions, although, at the time of writing, there has been no update since 2018.
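A cross-classification mapping of this kind can be represented as a simple lookup structure. The Python sketch below encodes only the Arithmetic example given above; the `CROSS_REFS` layout and the `categories_for` helper are assumptions made for illustration and do not reproduce the format used in the SMARTDEC repository.

```python
from typing import Dict, List

# Category -> issue -> identifiers in other classifications.
# Only the Arithmetic example from the text is encoded here.
CROSS_REFS: Dict[str, Dict[str, List[str]]] = {
    "Arithmetic": {
        "Over/underflow": ["SWC-101", "DASP-3", "SP-2"],
        "Precision issues": ["SP-15"],
    },
}


def categories_for(external_id: str) -> List[str]:
    """Return the categories that reference the given external identifier."""
    return sorted({
        category
        for category, issues in CROSS_REFS.items()
        for ids in issues.values()
        if external_id in ids
    })
```

A reverse lookup such as `categories_for("SP-15")` then answers which category an entry of another classification falls under, and an empty list signals an identifier that the mapping does not cover.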

Classification Schemes used in Vulnerability Detection Research
Research on vulnerability detection for smart contracts is generally accompanied by custom vulnerability classification schemes (Luu et al., 2016;Kalra et al., 2018;Wang et al., 2019;Ghaleb and Pattabiraman, 2020;Choi et al., 2021;Bose et al., 2022). This is primarily due to the lack of an appropriate and up-to-date classification standard or taxonomy. As a result, biased and limited classifications emerged, which are coupled to the context in which they were created. The next paragraphs describe the classification schemes of selected research, namely of three of the most cited vulnerability detection research works (at the time of writing and according to Google Scholar). In all of these cases, the heterogeneity is clear, as well as the divergence from other classification schemes, such as the ones previously presented in this section.
A symbolic execution tool named Oyente is proposed in (Luu et al., 2016) with the goal of allowing practitioners to detect security vulnerabilities. Within the tool proposal the authors identify a small set of security vulnerabilities, as illustrated in Table 8. Although the work in Oyente targets a specific set of vulnerabilities, the absence of a standard way for categorizing and naming the vulnerabilities impairs the assessment and comparison of results with other tools or approaches.
Securify (Tsankov et al., 2018) is a vulnerability detection tool based on symbolic execution methods that, at the time of writing, is able to detect 37 security defects (Tsankov, 2018), which the tool groups by severity, as we can see in Table 9.
Again, as with the previous tool, the groups and the names or vulnerability definitions are non-standard, although there is an effort to classify most of them according to SWC (SmartContractSecurity, 2020).
Zeus is a tool based on abstract interpretation and symbolic execution (Kalra et al., 2018). Table 10 shows the vulnerability classification performed by the authors and targeted by the tool.
As we can see in Table 10, the authors created several groups (e.g., incorrect contracts, unfair contracts), in which several defects are placed. Although this is obviously a partial classification of known vulnerabilities, the heterogeneity of the naming and definitions and also general classification structures is clear (when compared to other works), which again emphasizes the need for a more standard way of categorizing defects.

Limitations of Current Classification Schemes
In this section, we highlight the main gaps and limitations identified during the analysis of the different vulnerability classifications previously described, as follows:
- Classifications proposed in the literature tend to have simple structures, most of them simply grouping the vulnerabilities into related groups. Many times, no groups at all are used. Such structures are often ad hoc and consequently short-lived, resulting in limited adoption. The classifications that collect the most vulnerabilities are found in (Rameder et al., 2022; SmartContractSecurity, 2020; Tsankov, 2018), with (Tsankov, 2018) grouping vulnerabilities by criticality and (Rameder et al., 2022) using conceptual groups to fit related vulnerabilities.
- There is a large diversity of names being used in the state of the art to refer to the same vulnerability (e.g., both Integer bugs or arithmetic issues and Integer over- or underflow (Rameder et al., 2022) refer to the same vulnerability). There are also cases in which very similar names refer to different vulnerabilities (e.g., unpredictable state (Grishchenko et al., 2018) refers to a wrong class inheritance order defect, while vulnerable state (Krupp and Rossow, 2018) refers to an uninitialized storage variable defect). In some cases, the same name refers to different vulnerabilities, e.g., Transaction Order Dependency (TOD) is the name used in (Liao et al., 2019) and in (Bose et al., 2022), where it refers respectively to "5.1.5 Transfer Amount Dependent on Transaction Order" and to "5.1.6 Transfer Recipient Dependent on Transaction Order".
- Current classifications include several generic names that do not assist in the classification of specific defects (e.g., call to the unknown (Atzei et al., 2017) or unexpected function invocation (Chen et al., 2020)). In several cases, unclear nomenclatures are used, such as entropy illusion, constructors with care (Manning, 2018), or Improper Blockchain Magic Validation (Amiet, 2021), which do not specify what the defect is. Another example is Style guide violation, for which it is not even clear whether it refers to a bad practice or to a vulnerability.
- Regarding vulnerability classification, current research appears to be falling far behind the state of the practice. Current vulnerability detection tools identify several vulnerabilities (e.g., Securify2 (Tsankov, 2018)) that do not fit in relatively well-established classifications, such as DASP (NCC Group, 2019) or SWC (SmartContractSecurity, 2020).
- Current classifications do not involve active community participation, and we observed little to no participation at all in several classifications. Thus, it is fundamental that a classification can be easily maintained and can evolve to integrate new vulnerabilities, or even change structurally (i.e., versioning is also required). This reduced community participation is the main reason why the most popular classification initiatives, like SWC (SmartContractSecurity, 2020) or DASP (NCC Group, 2019), are currently far behind the detection capabilities of vulnerability detection tools.
- Classifications originating from vulnerability detection tools sometimes use names that are biased towards the tool's capabilities. This is fully acceptable from a tool perspective, but for broader goals (e.g., tool benchmarking), a vulnerability classification must be independent of the capabilities of specific tools. For instance, Osiris (Torres et al., 2018) is a tool for detecting vulnerabilities related to integer values and naturally focuses on a few types of issues affecting integer manipulation. Thus, the naming used is very specific to this context and does not capture the larger picture (e.g., issues affecting other types of numbers may be related, but are not represented).
- The high heterogeneity of names used across various tools, community efforts, and research initiatives creates a significant obstacle to understanding which tools perform better. Although initiatives exist to assess the effectiveness of vulnerability detection tools, they all faced difficulties in adopting a uniform, fine-grained taxonomy for defects.
- Many times, taxonomies mix the characteristics of a certain vulnerability with the effect of exploiting it, with how it is exploited, or with its impact, and use category names like Denial of Service, which is basically a consequence of the activation of a certain vulnerability. This is not necessarily wrong, but it may result in a taxonomy that is non-uniform and possibly error-prone from the point of view of its user.
- Classification structures are often constructed with different degrees of granularity. Some structures have general categories, while others have more specific categories. This inconsistent categorization poses difficulties and complexities for practitioners and tool developers, as they end up creating new classifications.

Overall, a broader view on vulnerability detection is needed to foster the longevity of a particular taxonomy, accompanied by the possibility of evolving it.

Table 9: Vulnerability classification in (Tsankov et al., 2018) and extended in (Tsankov, 2018)
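The naming problems listed above can be made concrete with a small lookup table. The following is a minimal sketch in Python, not part of the OpenSCV infrastructure: the table `ALIASES` and the helpers `canonical_name` and `sources_disagree` are hypothetical, and the entries only reuse names quoted in this paper. Keying the table by both source and name is one way to handle the Transaction Order Dependency case, where the same name denotes different vulnerabilities depending on the work that uses it.

```python
# Hypothetical alias table keyed by (source, name used by that source).
# The values are taxonomy entries quoted in this paper.
ALIASES = {
    ("Liao et al., 2019", "Transaction Order Dependency"):
        "5.1.5 Transfer Amount Dependent on Transaction Order",
    ("Bose et al., 2022", "Transaction Order Dependency"):
        "5.1.6 Transfer Recipient Dependent on Transaction Order",
    ("Atzei et al., 2017", "Call to the unknown"):
        "Malicious Fallback Function",
    ("Chen et al., 2020", "Unexpected function invocation"):
        "Malicious Fallback Function",
}


def canonical_name(source, name):
    """Return the taxonomy entry for a name as used by a given source, or None."""
    return ALIASES.get((source, name))


def sources_disagree(name):
    """True when the same name maps to different entries depending on the source."""
    targets = {entry for (_, used), entry in ALIASES.items() if used == name}
    return len(targets) > 1
```

With this table, `sources_disagree("Transaction Order Dependency")` is true, whereas the two generic names for the fallback-related defect resolve to a single entry.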

OpenSCV Construction Process
This section describes the process followed to build the OpenSCV taxonomy. Overall, it was an iterative and incremental process during which we kept general taxonomy quality properties (e.g., the ones discussed in Section 2) in perspective, while going through all the construction phases. As mentioned in Section 1, we use the general term vulnerability to refer to vulnerabilities and also to software defects considered in the literature to be associated with high-security risks. Figure 1 overviews the process, which consists of the following phases: i) Vulnerability information collection; ii) Vulnerability relationship with other classifications; iii) Vulnerability characterization (defect type, qualifier, and code clip example); iv) Structural and nomenclature consolidation; v) Dataset construction.
Regarding the first phase (vulnerability information collection, visible at the top of Figure 1), the main goal was to gather an up-to-date, heterogeneous, and non-curated list of vulnerabilities that affect smart contracts. This list allowed us to understand the naming and classification heterogeneity, which is essential to build an integrated vision and ultimately reach a meaningful taxonomy. Thus, we began by using Google Scholar to identify research work on smart contract vulnerability classification (e.g., taxonomies and defect classification schemes). We then searched for research targeting smart contract vulnerability detection, which resulted in the identification of 49 research papers, which mostly materialize in tools and are summarized in Table 11. The identified works refer to research carried out from October 2016 to January 2023 and resulted in the collection of 357 vulnerability definitions. It is worth mentioning that the identified research also led us to community-oriented initiatives, namely SWC (SmartContractSecurity, 2020) and DASP (NCC Group, 2019), which recurrently appear in the literature. There are other initiatives, such as SIGP (Manning, 2018) or SMARTDEC (SmartDec Corporation, 2018), which seem to be less prominent.
In the second phase, we analyzed the vulnerability relationship with other classifications by going through each of the identified vulnerabilities and mapping them to popular smart contract vulnerability classification schemes, namely SWC (SmartContractSecurity, 2020) and DASP (NCC Group, 2021). We also selected, from the state of the art in vulnerability classification, what is, to the best of our knowledge, the largest and most recent vulnerability classification scheme, proposed by Rameder et al. (2022). Then we resorted to a broader security-related classification, namely the Common Weakness Enumeration (CWE) (CWE Community, 2009), which provides us with a non-domain-specific view of each defect. Although the action consists of simply mapping vulnerabilities, it actually contributes to the characterization of each vulnerability. This may be useful for later taxonomy consolidation purposes (e.g., by merging defects that are the same but are represented with different names). Mapping the identified vulnerabilities to existing classifications also allows us to understand the exact coverage of existing classifications and their disparities against the current state of the art or practice.

The third phase (vulnerability characterization: defect type, qualifier, and code clip example) has the direct goal of detailing each vulnerability according to its nature and also by example, which makes the explanation clearer and may help in cases where the vulnerability description and remaining attributes are inadvertently left unclear. Regarding the vulnerability nature, we resort to the Orthogonal Defect Classification (ODC), namely to the 'defect type' attribute, which generally characterizes the type of defect (e.g., Assignment/Initialization, Checking, or Algorithm), and to the 'qualifier' attribute, which indicates whether the defect corresponds to missing, wrong, or extraneous code. We also use the ODC extensions, as proposed in (IBM, 2013a), for defects that relate to other aspects (e.g., defect types related to the process followed during compilation, or to the management of libraries). For each identified defect, we also extracted a code excerpt (when made available by the authors) that could represent the issue, as a way to reduce or eliminate any ambiguities that could still be present. For the cases where no code excerpt was made available and the description allowed us to build one, we created a Solidity code example as a way of further illustrating the defect. Thus, all of the identified vulnerabilities in OpenSCV are associated with a code example.

Fig. 2: The same vulnerability named and also described differently in: a) (Brent et al., 2018); b)
The fourth phase, structural and nomenclature consolidation, consists of two steps: i) the attribution of names to the vulnerabilities, and ii) their organization in a tree structure. In the first step, we merged defects that referred to the same issue (despite being named differently by different authors). This required going through the names and descriptions of the different defects and, whenever provided by the authors, also analyzing the corresponding vulnerable code to understand whether it referred to the same defect. The additional characterization (e.g., ODC) helped in such grouping. During this step, several adjustments to the characterization of the defects were made, as well as corrections to the defects' relationships with other classifications. Figure 2 shows an example of the same vulnerability named differently by different authors. Brent et al. (2018) named it Unsecured Balance (Figure 2.a)), and it basically consists of a misnamed constructor, while another work named it Missing constructor (Figure 2.b)), where we observe that it is actually a wrong name used during the definition of the constructor. So, besides the differences in names, we see that the definitions provided are sometimes not entirely accurate. In this particular case, and as an example, we named this defect "Wrong Constructor Name". Thus, during this step, we defined an initial name for each of the defects, based on the name given by the authors of the respective paper, on the names presented in the corresponding related classifications (i.e., DASP, SWC, CWE), and on the ODC classification.
In the second step of the fourth phase (i.e., structural consolidation), we defined a hierarchical structure for the taxonomy based on the merged vulnerabilities and preliminary naming. During this step, names were further adjusted for clarity and also to better fit in the categories being created. The final result is visible in Figure 3 and Figure 4. The first level (at the left-hand side of Figure 3 and Figure 4) contains the higher-level categories, while the intermediate one is hybrid and contains groups (i.e., subcategories) of vulnerabilities as well as a few isolated vulnerabilities. All items at the last level (at the right-hand side of Figure 3 and Figure 4) represent vulnerabilities. Each vulnerability identified in the tree is labeled with several symbols that characterize it in terms of ODC defect type and ODC qualifier. To build the taxonomy structure, we followed a bottom-up process and began by grouping the defects of similar nature, which allowed us to create a set of categories, such as reentrancy, useless code, or improper type usage. Certain defects could not really be grouped, such as Use of Malicious Libraries or Inadequate Data Representation, although at the same time many of them sounded like higher-level defects (i.e., siblings were expected). Thus, for the time being, we opted not to keep these vulnerabilities at the bottom layer (e.g., by creating a subcategory with a single vulnerability). After this, the same procedure was applied at the intermediate level to reach the definition of the higher-level categories.
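The hybrid three-level structure described above can be sketched as a small tree type. This is an illustrative Python model only, not the format used by the OpenSCV repository: the class `Node` and the helper `leaves` are hypothetical, the fragment covers just a few entries discussed later in the paper, the identifier "1.2" for the isolated vulnerability is assumed from the numbering pattern, and ODC labels are filled in only where the text states them.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    ident: str                            # e.g., "1.1.1"
    name: str
    odc_type: Optional[str] = None        # only meaningful for vulnerabilities
    odc_qualifier: Optional[str] = None   # missing, wrong, or extraneous
    children: List["Node"] = field(default_factory=list)

    def is_vulnerability(self) -> bool:
        # In this sketch, any node without children is a vulnerability.
        return not self.children


def leaves(node: Node) -> List[Node]:
    """Collect the vulnerabilities below a node, whatever their depth."""
    if node.is_vulnerability():
        return [node]
    found: List[Node] = []
    for child in node.children:
        found.extend(leaves(child))
    return found


# Illustrative fragment: the intermediate level mixes a subcategory (1.1)
# with an isolated vulnerability (identifier 1.2 is an assumption).
tree = Node("1", "Unsafe External Calls", children=[
    Node("1.1", "Reentrancy", children=[
        Node("1.1.1", "Unsafe Credit Transfer", "Algorithm", "Wrong"),
        Node("1.1.2", "Unsafe System State Changes"),
    ]),
    Node("1.2", "Malicious Fallback Function"),
])
```

Here `leaves(tree)` returns the three vulnerabilities even though they sit at different depths, which is the consequence of allowing isolated vulnerabilities at the intermediate level.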
The whole taxonomy construction process was iterative and required the involvement of 2 Experienced Researchers and 1 Early Stage Researcher. During the process, several adjustments to names were performed for further clarity and consistency across all axes of the taxonomy. This is a continuous effort, which is now open to community participation via our GitHub repository, and the current shape of the taxonomy may evolve to incorporate further vulnerabilities. It is worth mentioning that, during this process, we observed that the integration of new works on vulnerability detection was a major contributor to the definition of the taxonomy. This is the reason why we intend to continuously integrate new works on vulnerability detection and map their information into new versions of our taxonomy, possibly making naming and structural adjustments as a consequence of such integration. Currently, our taxonomy lists 76 vulnerabilities and is publicly available, together with the identification of all the mapped works and the vulnerability names used by those works. We allow for easy integration of new works, and the infrastructure is ready to support not only naming and structural changes, but also corrections to possible errors.
The fifth phase refers to the dataset construction, where we aimed at obtaining multiple real examples of smart contracts that match the defects present in our taxonomy. At the time of writing, the goal is simply to have a preliminary version of the dataset by gathering multiple real examples of contracts (i.e., a vulnerable contract and the corresponding correction) for each of the different defects present in the taxonomy. Indeed, each defect may be present in different forms (i.e., different implementations), and vulnerability detection tools may be able to detect just some of the forms. For this collection process, we directly used examples from the collected papers themselves (whenever complete contracts were made available). In some cases, SWC also had usable examples. All collected contracts present in our dataset pass the compilation phase. Our intention is to provide an initial basis for researchers to use and, at the same time, to allow further examples (ideally, different forms of the same vulnerability) to be added to the dataset.

Taxonomy Levels and Defects Description
In this section, we traverse the taxonomy tree and present a brief description of all categories and individual vulnerabilities, which are identified according to the respective numbers in Figure 3 and Figure 4. In order to keep the used space under reasonable limits, most of the descriptions consist of brief explanations. Further details and examples are available with the online version of the taxonomy. The exception to this is the first vulnerability discussed in the text (i.e., 1.1.1 Unsafe credit transfer), which, for illustrative purposes, we present in complete detail.

Unsafe External Calls
This category represents a set of vulnerabilities in which there is an interaction between at least two contracts.

Reentrancy
The first subcategory is reentrancy, in which two contracts are involved: the vulnerable contract and the malicious contract. Overall, this type of vulnerability occurs when the malicious contract, after initiating a call, is allowed to make new calls to the vulnerable contract before the initial call has been completed. Thus, unexpected state changes may occur, such as depletion of credit. We identified two main types of reentrancy vulnerabilities: one type associated with loss of credit and the other one associated with unexpected state changes. This is in line with several vulnerability detection tools, such as Securify (Tsankov, 2018), Slither (Slither's Github, 2019; Feist et al., 2019), or the work of Momeni et al. (2019), which also distinguish these two cases (although using different names).

Unsafe Credit Transfer
Known due to the DAO attack event (Siegel, 2016), this vulnerability allows attackers to maliciously change balance via credit transfer calls that are allowed to take place before a previous call has been completed. Let us consider the case where a smart contract maintains the balance of several addresses, allowing the retrieval of funds. A malicious contract may initiate a withdrawal operation which would lead the vulnerable contract to send funds to the malicious one before updating the balance of the malicious contract. On the malicious contract side, funds would be accepted, and a new withdrawal could be initiated (before the balance had been updated on the vulnerable contract side). As a consequence, the malicious contract could withdraw funds multiple times, with the total sum exceeding its own funds.
Using the Orthogonal Defect Classification (ODC) as a reference, this defect can be classified as being of type Algorithm as the nature of the defect sits in the logic created by the programmer. The ODC qualifier is defined as wrong as the error is related to incorrect logic (i.e., not missing or extraneous logic), related to the order of the instructions in the code.
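The withdrawal scenario above can be reproduced with a toy simulation. The sketch below is written in Python rather than Solidity and only models the call ordering; all class and function names (`VulnerableBank`, `SafeBank`, `Attacker`, `Honest`, `run_attack`) and the deposited amounts are hypothetical. The vulnerable version sends funds before updating the balance, which is the wrong instruction order that the ODC classification refers to; the corrected version updates the balance first.

```python
class VulnerableBank:
    def __init__(self):
        self.balances = {}
        self.funds = 0

    def deposit(self, account, amount):
        self.balances[account] = self.balances.get(account, 0) + amount
        self.funds += amount

    def withdraw(self, account):
        amount = self.balances.get(account, 0)
        if amount > 0 and self.funds >= amount:
            self.funds -= amount
            account.receive(self, amount)   # external call happens first
            self.balances[account] = 0      # balance is updated too late


class SafeBank(VulnerableBank):
    def withdraw(self, account):
        amount = self.balances.get(account, 0)
        if amount > 0 and self.funds >= amount:
            self.balances[account] = 0      # balance updated before the call
            self.funds -= amount
            account.receive(self, amount)


class Honest:
    def __init__(self):
        self.received = 0

    def receive(self, bank, amount):
        self.received += amount


class Attacker(Honest):
    def receive(self, bank, amount):
        self.received += amount
        bank.withdraw(self)                 # re-enter before the first call ends


def run_attack(bank_cls):
    """Deposit 30 for a victim and 10 for the attacker, then attack."""
    bank = bank_cls()
    victim, attacker = Honest(), Attacker()
    bank.deposit(victim, 30)
    bank.deposit(attacker, 10)
    bank.withdraw(attacker)
    return attacker.received
```

In this model, `run_attack(VulnerableBank)` lets the attacker collect all 40 units although only 10 were deposited, while `run_attack(SafeBank)` returns the attacker's own 10 units.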

Unsafe System State Changes
This vulnerability is similar in nature to v1.1.1, with the main difference being the fact that there is no credit involved and, thus, no impact on users' funds. Due to the way the contract is coded, a call that reaches the vulnerable contract before a previous one has ended may allow an attacker to place the program in an unexpected state, leading to various effects, depending on the type of contract involved, including performance or availability issues. This vulnerability is also known in the literature as "ReentrancyNoETH" (Tsankov et al., 2018), "Reentrancy" (Mavridou Anastasia et al., 2018;Mavridou et al., 2019), "Re-entrancy without balance change" (Momeni et al., 2019), or "SWC-107 reentrancy" (SmartContractSecurity, 2020).

Malicious Fallback Function
Fallback functions are functions that are executed when a contract receives a call to a function whose signature does not exist, i.e., either the name does not exist or the parameters do not match the parameters of any of the existing functions. For instance, an attacker could deploy a smart contract with a malicious fallback function, which could be used to drain funds or alter the system's state. By mistake, a user could invoke it and reach a state that the user was not expecting to reach (Chen et al., 2020). This vulnerability is also known in the literature as "Call to the unknown" (Atzei et al., 2017; Argañaraz et al., 2020; Chapman et al., 2019) or "Unexpected function invocation" (Chen et al., 2020).
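The dispatch rule behind this defect can be illustrated with a toy model. The Python sketch below is a simplification under stated assumptions: a signature is approximated by a function name plus the number of arguments (real signatures also depend on parameter types), and the names `Contract`, `malicious_fallback`, `state`, and `token` are hypothetical.

```python
class Contract:
    """Toy model of signature-based dispatch with a fallback function."""

    def __init__(self, functions, fallback):
        self.functions = functions   # maps (name, number of arguments) -> callable
        self.fallback = fallback

    def call(self, name, *args):
        handler = self.functions.get((name, len(args)))
        if handler is None:
            return self.fallback()   # no matching signature: fallback runs
        return handler(*args)


state = {"drained": False}


def malicious_fallback():
    # Stands in for any unwanted effect, such as draining funds.
    state["drained"] = True
    return "fallback"


token = Contract(
    {("transfer", 2): lambda recipient, amount: "transfer"},
    malicious_fallback,
)
```

A correct call such as `token.call("transfer", "bob", 5)` reaches the intended function, whereas a misspelled name or a missing argument silently triggers the fallback.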

Improper Check of External Call Result
This category groups vulnerabilities in which the result of executing an external contract is verified in an improper manner (i.e., the verification is wrong or even missing), which affects the subsequent logic of the calling contract. The result of invoking a certain external operation should be verified, first of all, because the operation may simply fail, but especially because the called operation may be malicious (or may just have been poorly coded, resulting in an unexpected result); thus, the direct use of the result may lead to unexpected behavior.

Improper Check of External Call Return Value
This defect consists of an incorrect (or missing) verification of the value returned by the external execution of a contract. When a smart contract invokes another one, the returned value should be verified because the called operation may return an unexpected value (i.e., either because the callee is malicious or because it has been poorly coded) (Chen et al., 2020). This vulnerability is also known in the literature as "Unchecked call return value" (Zheng et al., 2021), "Unused return" (Tsankov et al., 2018; Momeni et al., 2019), "Unchecked external call" (Tikhomirov et al., 2018), "No check after contract invocation" (Chen et al., 2020), "Call-depth" (Liao et al., 2019), "Not checked return values" (Andesta et al., 2020), "Call-stack Depth Attack" (Song Jingjing and He et al., 2019), or "SWC-104 Unchecked Call Return Value" (SmartContractSecurity, 2020).
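A minimal sketch of this defect, in Python and with hypothetical names (`send`, `payout_unchecked`, `payout_checked`, and the ledger contents), is shown below. The external call reports failure through its return value instead of raising an error, so a caller that ignores the value proceeds as if the payment had succeeded.

```python
def send(recipient_accepts, amount):
    """Toy external call: reports failure by returning False, without raising."""
    return recipient_accepts


def payout_unchecked(ledger, user, recipient_accepts):
    send(recipient_accepts, ledger[user])   # return value is ignored
    ledger[user] = 0                         # debt is cleared even if the send failed


def payout_checked(ledger, user, recipient_accepts):
    if not send(recipient_accepts, ledger[user]):
        raise RuntimeError("external call failed; state left untouched")
    ledger[user] = 0
```

With a failing recipient, `payout_unchecked` zeroes the ledger entry although nothing was paid, while `payout_checked` stops and leaves the entry intact.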

Improper Exception Handling of External Calls
In the case of this defect, the problem resides in the incorrect (or missing) handling of exceptional behavior thrown by a call (i.e., instead of residing in the handling of values, as in the case of vulnerability v1.3.1). The improper verification of exceptions thrown by the callee may lead to unexpected behavior in the caller contract. There are various reasons why the callee may exhibit exceptional behavior. For instance, the callee could be under malicious control, the execution of the transaction could activate a fault in the callee contract, the transaction could be terminated due to reaching the gas limit, or the callee contract may have been terminated (e.g., after a software fault has been detected in the contract). This vulnerability is also known in the literature as "DoS by external contract" (Tikhomirov et al., 2018; Lu et al., 2019), "Denial of service" (Ashouri, 2020; Andesta et al., 2020), or "SWC-113 DoS with Failed Call" (SmartContractSecurity, 2020).

Improper Check of Low-Level Call Return Value
Languages like Solidity offer the possibility of using low-level calls that operate over raw addresses. Such calls do not verify that the code exists or that the call succeeded. Thus, their use may lead to unexpected behavior (Xi and Pattabiraman, 2023). As a result, using such calls can be risky and should be avoided in most cases. This vulnerability is also known in the literature as "LowLevelCalls" (Tsankov et al., 2018; Liao et al., 2019), "Unchecked calls" (Hu et al., 2023), "InlineAssembly" (Liao et al., 2019), "Usage of low-level calls" (Momeni et al., 2019), or "Check-effects" (Liao et al., 2019).

Improper Locking During External Calls
A vulnerable contract uses a lock mechanism in an erroneous manner, which may cause deadlocks. This may result, for instance, in the impossibility of executing transfers and eventually in Denial of Service (Mavridou et al., 2019). This vulnerability is also known in the literature as "Deadlockfreedom" (Mavridou et al., 2019) or "SWC-132 Unexpected Ether balance" (SmartContractSecurity, 2020).

Interoperability Issues with Other Contracts
This vulnerability relates to interoperability issues between contracts built with different language versions. Newer contracts may execute or inherit discontinued functionality present in older contracts (Khan et al., 2021). For instance, Solidity has introduced the operation code STATICCALL to allow a contract to call another contract (or itself) without modifying the state. Starting from v0.5.0, pure and view functions must be called using the STATICCALL code instead of the usual CALL code. Consequently, when defining an interface for older contracts, the programmer should only use view instead of constant if s/he is absolutely sure that the function will work with STATICCALL (Solidity, 2023). This vulnerability is also known in the literature as "AssemblyUsage" (Tsankov et al., 2018; Momeni et al., 2019).

Mishandled Events
This category includes a set of vulnerabilities in which exceptional events are mishandled. In Solidity, there are specific functions that can be used to verify whether certain conditions hold and to throw exceptions when they do not, namely require and assert. There are, however, fundamental differences between the two. When the condition passed to require evaluates to false, all executed changes are reverted, and all remaining gas is refunded. When the condition passed to assert evaluates to false, all changes are reverted, but all remaining gas is consumed. Such differences have become a frequent source of problems (Hajdu and Jovanović, 2020).
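The difference in gas accounting described above can be sketched with a toy model. The Python code below follows the description in this section only (the actual gas behavior of assert depends on the compiler version); the names `Revert`, `require`, `assert_`, and `run`, as well as any gas amounts passed to them, are hypothetical.

```python
class Revert(Exception):
    """Raised when a check fails; carries the gas handed back to the caller."""

    def __init__(self, gas_refunded):
        super().__init__("reverted")
        self.gas_refunded = gas_refunded


def require(condition, gas_left):
    if not condition:
        raise Revert(gas_refunded=gas_left)   # remaining gas is refunded


def assert_(condition, gas_left):
    if not condition:
        raise Revert(gas_refunded=0)          # remaining gas is consumed


def run(check, condition, gas_limit, gas_used_before_check):
    """Return (state_changed, gas_charged) for a transaction with one check."""
    gas_left = gas_limit - gas_used_before_check
    try:
        check(condition, gas_left)
    except Revert as revert:
        # State changes are undone in both cases; only the gas bill differs.
        return False, gas_limit - revert.gas_refunded
    return True, gas_used_before_check
```

For a transaction with a limit of 100000 units that fails after using 21000, `run(require, False, 100000, 21000)` charges 21000 units, whereas `run(assert_, False, 100000, 21000)` charges the full 100000.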

Improper Exceptional Events Handling
This first group of vulnerabilities is directly related to exceptional events, which, when mishandled, are many times linked to the loss of atomicity in operations as well as other effects, such as excessive gas consumption or unauthorized access.

Improper Use of Exception Handling Functions
Diverse runtime errors (e.g., out-of-gas error, data type overflow error, division by zero error, array-out-of-index error, etc.) may happen after a compiled smart contract is deployed. Solidity has many functions for error handling (e.g., throw, assert, require, revert), but their correct use relies on the experience and expertise of the developer. This defect occurs when the developer misuses the exception handling functions, which can lead the program to unexpected behavior. This vulnerability is also known in the literature as "Mishandled exceptions" (Grishchenko et al., 2018; Choi et al., 2021; Zhang et al., 2019; Nguyen et al., 2020; Luu et al., 2016), "UnhandledException" (Ashouri, 2020; Tsankov et al., 2018), "Exception disorder" (Jiang et al., 2018), or "Exception state" (Momeni et al., 2019).

Improper Exception Handling in a Loop
This vulnerability occurs when a transaction is excessively large (i.e., it executes too many statements) and may lead to excessive costs. For instance, when one of the statements in a transaction fails (e.g., due to a software bug), the transaction will not be packaged into a block, and the consumed gas will not be returned to the user (and actually the concluded operations are reverted and must be executed again). Thus, such kinds of transactions should be decomposed into smaller parts so that the likelihood of success increases and the negative effects associated with the failure cases diminish. This vulnerability is also known in the literature as "CallInLoop" (Tsankov et al., 2018), "Revert DOS" (Stephens et al., 2021), "Costly loop" (Tikhomirov et al., 2018; Lu et al., 2019), "Multiple calls in a single transaction" (Momeni et al., 2019), "UnboundedMassOperation", or "SWC-128: DoS With Block Gas Limit" (SmartContractSecurity, 2020).
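The effect of packing many operations into one transaction can be sketched as follows. This Python model is illustrative only: `pay_all` and `pay_each` are hypothetical helpers, gas is counted as one unit per loop iteration, and a failing payee stands in for any statement that fails.

```python
def pay_all(balances, payments, failing):
    """Pay everyone in one transaction; any failure reverts the whole batch."""
    snapshot = dict(balances)
    gas_spent = 0
    for payee, amount in payments:
        gas_spent += 1                      # hypothetical unit cost per iteration
        if payee in failing:
            balances.clear()
            balances.update(snapshot)       # all earlier transfers are undone
            return False, gas_spent         # gas already spent is not returned
        balances[payee] = balances.get(payee, 0) + amount
    return True, gas_spent


def pay_each(balances, payments, failing):
    """One transaction per payee: a failure only affects that payee."""
    results = [pay_all(balances, [payment], failing) for payment in payments]
    return [ok for ok, _ in results]
```

With three payees of which the last one fails, `pay_all` spends three units of gas and pays nobody, while `pay_each` still pays the first two, which is the decomposition suggested above.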

Incorrect Revert Implementation in a Loop
In the case of this vulnerability, the developer incorrectly specifies how the revert operation should be handled (in the context of a loop or a transaction composed of multiple operations), which results in only a partial revert of the set of operations that should be reverted. This vulnerability is also known in the literature as "Nonisolated calls (wallet griefing)", "Push DOS" (Stephens et al., 2021), or "SWC-126 Insufficient Gas Griefing" (SmartContractSecurity, 2020).

Improper Token Exception Handling
The ERC-20 standard (Vogelsteller and Buterin, 2015) provides functionalities to exchange tokens. Besides describing the functionalities, the standard specifies good practices for developers to implement its features. Regarding the transfer function, exceptional events can become problematic if they are not handled properly.

Missing Thrown Exception
Regarding the transfer function (i.e., the functionality to transfer tokens from one account to another), the ERC-20 standard recommends that the developer throw an exception when the caller's account balance does not have enough tokens to spend. This allows the caller to understand the reason why the transfer was not completed and to take appropriate action. This vulnerability is also known in the literature as "Non-standard Implementation of Tokens" (Ji et al., 2020) or "Missing the Transfer Event" (Chen et al., 2020).

Extraneous Exception Handling
This type of defect refers to the implementation of extra actions compared to what is recommended in a certain specification. The specification does not recommend actions like the use of guard functions (e.g., require or assert) in addition to throwing an exception in the case when there is no balance in the caller. The extra actions might be arbitrary and incompatible with the purpose of a transfer functionality (e.g., returning true or false to report the success of the execution). This vulnerability is also known in the literature as "Token API violation" (Tikhomirov et al., 2018).

Gas Depletion
This category groups defects that, in different ways, lead to gas depletion of the account used for the smart contract execution.

Improper Gas Requirements Checking
This defect represents missing or wrong checking of the prerequisites (i.e., in terms of gas) for executing a certain operation, causing unnecessary processing and use of memory resources. For cost management reasons, languages offer programmers several ways to deal with the cost of executing a certain operation in a contract. For instance, for transferring credits, Solidity provides the functions transfer() and send(), which have a limit of 2300 gas units for each execution. An alternative is to build a custom transfer function, where the gas limit is defined by a variable (e.g., address.call.value(ethAmount).gas(gasAmount)()). Despite having several ways of managing the program costs, it is challenging for programmers to predict which part of the code may fail. If an out-of-gas exception is triggered, the result may be unexpected behavior. This vulnerability is also known in the literature as "Send without Gas" (Argañaraz et al., 2020), "Gasless send" (Jiang et al., 2018; Ashraf et al., 2020; Nguyen et al., 2020; Feng et al., 2019; Wang et al., 2019; Chang et al., 2019; Chapman et al., 2019), "Gas Dos" (Stephens et al., 2021), "Out of gas" (Akca et al., 2019), or "SWC-126 Insufficient Gas Griefing" (SmartContractSecurity, 2020).
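The role of the 2300 gas limit can be sketched with a toy model. In the Python code below, the receiver is reduced to a single number, the gas its code needs to accept a payment; the names `run_receiver`, `send`, `transfer`, and `call_with_gas` are hypothetical Python stand-ins that only mirror how the corresponding Solidity operations report failure.

```python
STIPEND = 2300  # gas available to the receiver with transfer() and send()


class OutOfGas(Exception):
    pass


def run_receiver(receiver_cost, gas):
    """Run the receiver's code, which needs receiver_cost units of gas."""
    if receiver_cost > gas:
        raise OutOfGas()


def send(receiver_cost):
    """Like send(): returns False when the receiver runs out of gas."""
    try:
        run_receiver(receiver_cost, STIPEND)
        return True
    except OutOfGas:
        return False


def transfer(receiver_cost):
    """Like transfer(): the failure propagates to the caller."""
    run_receiver(receiver_cost, STIPEND)


def call_with_gas(receiver_cost, gas_amount):
    """Custom transfer where the caller chooses the gas limit."""
    try:
        run_receiver(receiver_cost, gas_amount)
        return True
    except OutOfGas:
        return False
```

A receiver that needs 5000 units makes `send(5000)` return False and `transfer(5000)` raise, while `call_with_gas(5000, 10000)` succeeds; a caller that does not check these outcomes ends up in the unexpected behavior described above.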

Call with Hardcoded Gas Amount
This defect refers to the impossibility of adjusting the amount of gas used by a certain program after it has been deployed. This issue is related to the observation that certain credit transfers in real contracts were being deployed using a fixed amount of gas (i.e., 2300 gas). If the gas cost of EVM instructions changes during, for instance, a hard fork, previously deployed smart contracts will easily break. This vulnerability is also known in the literature as "SWC-134 Message call with hardcoded gas amount" (SmartContractSecurity, 2020).

Bad Programming Practices and Language Weaknesses
This category represents issues that are mostly related to bad programming practices (i.e., error-prone or insecure coding practices) and language weaknesses, which are mostly related to insufficient protection mechanisms offered by the language, allowing the developers to make mistakes that could be avoided, e.g., by language constructs.

Bad Randomness
This vulnerability is related to the use of the variables that control the blocks in a blockchain as a way of generating randomness, which is not secure. Such variables may be manipulated by miners so that the randomness is subverted, compromising the security of the blockchain, with its information becoming vulnerable to attacks. In fact, generating a strong enough source of randomness can be very challenging. The use of variables like block.timestamp, blockhash, block.difficulty, and other fields is problematic as these can be manipulated by miners. For example, a miner could select a specific timestamp within a delimited range, or use powerful hardware to mine several blocks quickly, choose the block that would provide an interesting hash, and drop the remaining. This vulnerability is also known in the literature as "Generating Randomness" (Grishchenko et al., 2018;Tsankov et al., 2018), "Random generation" (Argañaraz et al., 2020), "Bump Seeds" (Cui et al., 2022), "Dependence on predictable variables" (Lu et al., 2019), "Bad randomness" (Ashouri, 2020;Hu et al., 2023), "Bad random"  or "SWC-120 Weak Sources of Randomness from Chain Attributes" (SmartContractSecurity, 2020).
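The timestamp manipulation mentioned above can be sketched as a short search. The Python code below is a toy model: `lottery_winner` stands in for a contract that derives its draw from a block field only, `miner_pick_timestamp` is a hypothetical helper, SHA-256 replaces the hash function a contract would use, and the tolerance window passed by the caller is an assumption rather than a protocol constant.

```python
import hashlib


def lottery_winner(block_timestamp, n_players):
    """Vulnerable draw: the only source of entropy is a block field."""
    digest = hashlib.sha256(str(block_timestamp).encode()).digest()
    return int.from_bytes(digest, "big") % n_players


def miner_pick_timestamp(now, tolerance, n_players, miner_index):
    """Search the timestamps a miner could choose for one that makes it win."""
    for timestamp in range(now, now + tolerance + 1):
        if lottery_winner(timestamp, n_players) == miner_index:
            return timestamp
    return None   # no winning timestamp inside the allowed window
```

Because the draw is a deterministic function of a value the miner selects, even a window of a few hundred candidate timestamps is typically enough to find one that yields the desired winner.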

Improper Initialization
The smart contract has resources that are either not initialized or are initialized in an incorrect manner, leading to unexpected behavior.

Missing Constructor
A smart contract constructor is a function that is executed exactly once during the lifetime of a contract. It executes at deployment time and initializes state variables, performs a few necessary tasks that the specific contract requires, and sets the contract owner. If there is no constructor, the developer will have to implement such tasks manually, which is prone to security issues (e.g., variables may be set with incorrect values or forgotten, which may result in security problems). This vulnerability is also known in the literature as "Unsecure balance" (Brent et al., 2018), "Missing constructor"  or "SWC-118 Incorrect Constructor Name" (SmartContractSecurity, 2020).

Wrong Constructor Name
A contract is published without a working constructor because the programmer created a function expecting it to behave like a constructor. Usually, the constructor has sensitive code (e.g., assignment of the owner of the contract), and when it is declared with a wrong function name, any user can call the function, thus causing serious security risks. This vulnerability is also known in the literature as "Constructor name" (Andesta et al., 2020), "Erroneous Constructor Name" (Hu et al., 2023), or "SWC-118 Incorrect Constructor Name" (SmartContractSecurity, 2020).
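A toy model helps to see why a misnamed constructor is dangerous. The Python sketch below assumes the convention of older Solidity versions, in which the constructor is the function whose name matches the contract name; the class `Contract`, the helper `set_owner`, and the contract and account names used with them are hypothetical.

```python
class Contract:
    """Toy deployment model: the function named after the contract is the
    constructor; every other function stays publicly callable."""

    def __init__(self, contract_name, functions, deployer):
        self.owner = None
        self.public = {}
        for name, body in functions.items():
            if name == contract_name:
                body(self, deployer)        # runs once, at deployment
            else:
                self.public[name] = body    # callable by anyone, at any time

    def call(self, name, sender):
        self.public[name](self, sender)


def set_owner(contract, sender):
    # Sensitive initialization code that should only run at deployment.
    contract.owner = sender
```

Deploying `Contract("Wallet", {"Wallet": set_owner}, "deployer")` sets the owner once. With the misspelled key `"wallet"`, the owner stays unset after deployment, and a later `call("wallet", "attacker")` hands ownership to the attacker.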

Missing Variable Initialization
This defect refers to the lack of initialization of variables that are used throughout the contract. The effects can vary widely, depending on the variable itself and on the context in which it is being used. This vulnerability is also known in the literature as "UninitializedStateVariable" (Tsankov et al., 2018), "Uninitialized-local" (Tsankov et al., 2018) or "Uninitialized variables" (Feist et al., 2019).

Uninitialized Storage Variables
In Solidity, state variables are assigned to memory or storage. When a state variable is declared, it is assigned to a certain storage slot. If that variable is not initialized, it will be stored in slot 0 (the first one) of the contract's storage. Thus, it may conflict with the next variable that is declared in the same slot, causing an address conflict. This latter variable will overwrite the first, leading to unexpected behavior. This is the reason why it is important to initialize all state variables in a smart contract so that they are set into the correct storage slots (and possible conflicts are avoided) (Antonopoulos and Wood, 2018). This vulnerability is also known in the literature as "Uninitialized storage pointers" (Antonopoulos and Wood, 2018), "UninitializedStorage" (Tsankov et al., 2018;Hu et al., 2023) or "SWC-109 Uninitialized Storage Pointer" (SmartContractSecurity, 2020).

Improper Credit Transfer
This category groups defects which are generally related to improper credit transfer operations.

Missing Check On Transfer Credit
This defect refers to the absence of verification after a transfer event, which can lead to an erroneous view of the actual balance of the account. Indeed, the balance of the account may not reflect exactly the currency transferred, leading to potential errors and opening the door to security issues. This vulnerability is also known in the literature as "Unchecked send" (Kalra et al., 2018;Stephens et al., 2021;Brent et al., 2018;Akca et al., 2019;Lu et al., 2019).

Unprotected Transfer Value
The transfer function uses a numeric variable for transfers and may be vulnerable if it does not protect or specify limits for the values. When the attribute address.balance is used to identify the amount to be transferred, the total balance is transferred at once, which is a high-risk operation when the amount is high. This vulnerability is also known in the literature as "Unchecked transfer value", "Transfer forwards all gas" (Tikhomirov et al., 2018), "UnrestrictedEtherFlow" (Tsankov et al., 2018), "Ether Leak" (Choi et al., 2021), "Manipulated Balance" (Hu et al., 2023), "Multiple Send" (Choi et al., 2021) or "SWC-105 Unprotected Ether Withdrawal" (SmartContractSecurity, 2020).

Wrong use of Transfer Credit Function
Depending on the programming language, there are different ways to carry out credit transfer operations. In Solidity, both transfer and send execute a credit transfer. However, if a problem occurs, transfer aborts the process with an exception, whereas send returns false and transaction execution continues. An attacker may manipulate the send function and be able to continue executing a credit transfer operation without proper authorization. This vulnerability is also known in the literature as "Failed Send" (Kalra et al., 2018), "Use of send instead of transfer" (Argañaraz et al., 2020), or "Send instead of transfer" (Tikhomirov et al., 2018).
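
The difference between the two failure modes can be sketched in Python. This is a simplified model under stated assumptions: `send` and `transfer` below are stand-ins that only mimic the return-value versus exception behavior, and the `withdraw_*` functions and the `balances` dictionary are hypothetical. It is not a model of EVM state rollback.

```python
class TransferFailed(Exception):
    """Models the exception that aborts the transaction."""


def send(recipient_accepts: bool) -> bool:
    """Models send: failure is reported through the return value only."""
    return recipient_accepts


def transfer(recipient_accepts: bool) -> None:
    """Models transfer: failure raises and stops execution."""
    if not recipient_accepts:
        raise TransferFailed("recipient rejected the payment")


def withdraw_unchecked(balances: dict, user: str, recipient_accepts: bool) -> None:
    """Vulnerable pattern: the result of send() is ignored."""
    send(recipient_accepts)
    balances[user] = 0  # executed even though no credit was moved


def withdraw_with_transfer(balances: dict, user: str, recipient_accepts: bool) -> None:
    """Safer pattern: a failure aborts before the bookkeeping is changed."""
    transfer(recipient_accepts)
    balances[user] = 0


unchecked = {"alice": 10}
withdraw_unchecked(unchecked, "alice", recipient_accepts=False)

checked = {"alice": 10}
try:
    withdraw_with_transfer(checked, "alice", recipient_accepts=False)
except TransferFailed:
    pass

print(unchecked, checked)
```

In the unchecked variant the recorded balance is cleared although the payment failed, which is the kind of inconsistency that an ignored return value makes possible.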

Error in Function Call
In a blockchain context, each function in a smart contract is identified by its name, input parameters, and output parameters. Thus, these items compose the function signature, which is used by the contracts to verify that the right function is being called. This category groups defects in which a developer uses a function in the wrong manner: either a wrong signature is used, wrong arguments are used, or a wrong function is called.

Wrong Function Call
The issue occurs when a contract executes a certain function at a wrong address, i.e., at the address used by another function, which, however, has the same signature as the intended function. This vulnerability is also known in the literature as "Type casts" (Atzei et al., 2017).

Wrong Selection of Guard Function
Assert is a Solidity function, which is recommended to be used only in the development phase. Intentionally, the programmer inserts the function at a specific point in the program where a bug is suspected. If running the program results in gas depletion, the suspicion is confirmed.
Thus, this defect refers to the cases in which the assert function is implemented with the wrong purpose, not having the expected effect. In more severe cases, in which the programmer forgets to remove it from the code or does not replace it with require, the impact of this defect can be serious. This vulnerability is also known in the literature as "AssertFail" (Liao et al., 2019), "Assertion Failure" (Choi et al., 2021) or "SWC-110 Assert Violation" (SmartContractSecurity, 2020).

Function Call with Wrong Arguments
This defect refers to the presence of certain control characters within the arguments of a function call, namely the right-to-left override control character, which can cause the function to execute with arguments in reverse order. This is a known issue also in other computing areas (Yosifova and Bontchev, 2021). This vulnerability is also known in the literature as "RightToLeftOverride" (Tsankov et al., 2018) or "SWC-130 Right-To-Left-Override control character (U+202E)".

Wrong Class Inheritance Order
Contracts may have inheritance relationships with other contracts. In the case of Solidity, the code of the inherited contract is always executed first, e.g., so that state variables are initialized properly. Solidity uses an algorithm named C3 linearization to determine the order in which the contracts are to be executed. Developers specify the inheritance relationships in an inheritance statement and may believe that the order in which the inherited contracts are listed in that statement reflects the order in which the linearization algorithm works. This opens space for security issues due to the wrong order of the contracts in the inheritance statement. This vulnerability is also known in the literature as "Unpredictable state" (Grishchenko et al., 2018;Argañaraz et al., 2020) or "SWC-125 Incorrect Inheritance Order" (SmartContractSecurity, 2020).
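
Python also resolves multiple inheritance with C3 linearization, so the effect of the base order can be illustrated with a short Python sketch. The class names and the `owner_policy` method are hypothetical. Note that, according to the Solidity documentation, Solidity expects bases to be listed from the most base-like to the most derived, which is the reverse of Python's convention, so the sketch shows the mechanism rather than Solidity's exact precedence.

```python
class Base:
    def owner_policy(self) -> str:
        return "base"


class Strict(Base):
    def owner_policy(self) -> str:
        return "strict"


class Permissive(Base):
    def owner_policy(self) -> str:
        return "permissive"


class VaultA(Strict, Permissive):
    """Same bases as VaultB, listed in a different order."""


class VaultB(Permissive, Strict):
    """Same bases as VaultA, listed in a different order."""


# The linearization (method resolution order) decides which override is used.
mro_a = [c.__name__ for c in VaultA.__mro__]
mro_b = [c.__name__ for c in VaultB.__mro__]
policy_a = VaultA().owner_policy()
policy_b = VaultB().owner_policy()
print(mro_a, policy_a)
print(mro_b, policy_b)
```

Swapping the two base names is enough to change which implementation is executed, even though both contracts inherit exactly the same code.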

Improper Type Usage
This category groups vulnerabilities in which there is some misuse of types of data structures or functions.

Missing return type on Function
This vulnerability refers to a missing return type in the definition of a smart contract interface. At runtime, if a contract that implements that interface contains two functions with the same name and arguments but different return types, there is a chance that the wrong function will be called. This may lead to unexpected results once the calling contract receives the wrong data type. This vulnerability is also known in the literature as "ERC20Interface" (Tsankov et al., 2018), "Unsafe inherit from token" (Lu et al., 2019), "Missing Return Statement" (Hu et al., 2023) or "Incorrect ERC20 interface" (Momeni et al., 2019).

Function Return Type Mismatch
In this case, the developer implemented a function (starting from an interface), but it selected the wrong data type for the value to be returned (i.e., it differs from what is specified in the interface). This vulnerability is known in the literature in the context of non-fungible tokens by the name of "ERC721Interface" (Tsankov et al., 2018) or "SWC-127 Arbitrary Jump with Function Type Variable" (SmartContractSecurity, 2020).

Parameter Type Mismatch
This issue refers to a divergence regarding the types of arguments used in a function that implements an interface. In this situation, even if the call is done with the right function name and arguments, the EVM considers it to be a non-existent function error. This vulnerability is also known in the literature as "Types conversion" (Argañaraz et al., 2020), "Unindexed ERC20 event parameters" (Momeni et al., 2019) or "ERC20Indexed" (Tsankov et al., 2018) in the context of fungible tokens.

Missing Type in Variable Declaration
In Solidity, whenever a variable is declared without an associated type, the compiler infers the data type based on the assigned value. This additional computation may lead to higher costs (i.e., in gas) and memory usage and especially allows for overflow or underflow problems to occur. For instance, the compiler may infer that a signed integer is the right datatype for a certain variable, where an unsigned integer should be used. This vulnerability is also known in the literature as "Unsafe type inference" (Tikhomirov et al., 2018) or "Unsafe-type declaration" (Lu et al., 2019).

Wrong Type in Variable Declaration
This issue refers to a wrong selection of datatypes that leads to the allocation of more memory than what would be necessary for the intended function, leading to an increase in gas consumption. As an example, in Solidity, the byte[] type wastes 31 bytes of space for each element (due to padding), whereas the bytes type requires a single byte per element, thus being more space efficient. This vulnerability is also known in the literature as "byte[ ]", "Byte array" (Tikhomirov et al., 2018), or "Costly bytes" (Lu et al., 2019).

Wrong Type of Function
In Solidity, it is possible to specify a type for each function. Functions of type view can read data from state variables but cannot modify them, and no gas costs are involved, whereas functions of type pure can neither read nor modify state variables and, similarly to view functions, have no associated gas costs. This vulnerability occurs when a developer uses the wrong type for a function. For instance, there is an issue reported in Ethereum's GitHub (Ethereum's Github, 2022) in which a state variable conversion operation (from storage to memory) inside a pure function results in a problem (i.e., to avoid this problem, the function type should be view). This vulnerability is also known in the literature as "Function type operators" (Chapman et al., 2019).

Useless Code
This category groups a set of vulnerabilities in which the program contains a unit of code that, in practice, has no effect.

No Effect Code Execution
This vulnerability refers to the presence of code that has no practical purpose (i.e., it has no effect on the intended functionality). Within a smart contract, it increases the size of the program's binary code, which results in more gas consumption than would otherwise be necessary. This vulnerability is also known in the literature as "CallToDefaultConstructor" (Tsankov et al., 2018), "Useless Assignment" (Hu et al., 2023) or "SWC-135 Code With No Effects" (SmartContractSecurity, 2020).

Unused Variables
This defect refers to the declaration of variables that are not used in the contract, which results directly in the allocation of unnecessary space in memory. As a consequence, the gas cost of executing the contract increases as well as the attack surface of the contract. Other effects are related to the readability or maintainability of the code. This vulnerability is also known in the literature as "UnusedStateVariable" (Tsankov et al., 2018;Hu et al., 2023), "Presence of unused variables" (SmartContractSecurity, 2020) or "SWC-131 Presence of unused variables" (SmartContractSecurity, 2020).

Version Issues
This category refers to issues that relate to the versioning of various aspects, including the use of deprecated versions of functions.

Undetermined Program Version Prevalence
This defect refers to the case where the developer allows a certain contract to be compiled across multiple versions. This allows the known faults in older versions to be easily activated. This vulnerability is also known in the literature as "SolcVersion" (Tsankov et al., 2018), "Compiler version not fixed" (Tikhomirov et al., 2018), "Unfixed compiler version" (Lu et al., 2019), "Usage of complex pragma statement" (Momeni et al., 2019) or "SWC-103 Floating Pragma" (SmartContractSecurity, 2020).

Outdated Compiler Version
Contracts that have been developed against an outdated compiler version can bring in several risks, mainly because newer versions may have resolved certain bugs or even introduced security mechanisms to avoid particular issues (e.g., throw has been disallowed in Solidity 0.5.0 and later versions, in favor of assert, require, and revert). This vulnerability is also known in the literature as "Compiler version problem", "Unfixed compiler version" (Lu et al., 2019) or "SWC-102 Outdated Compiler Version" (SmartContractSecurity, 2020).

Use of Deprecated Functions
Deprecated functions are not recommended due to the fact that they are usually replaced by functions that solve known security issues or even operate in a more efficient manner (i.e., may consume less gas). As an example, sha3 was marked as a deprecated function in Solidity 0.5 and replaced by keccak256, which is more secure and efficient. This vulnerability is also known in the literature as "SWC-111 Use of deprecated solidity functions" (SmartContractSecurity, 2020).

Inadequate Data Representation
The numbers used to represent credits (e.g., Ether) can be very large, and literals with many digits are difficult to read and review. Thus, it is recommended that the programmer use the native resources of the language for this representation (e.g., in Solidity, writing 1 ether rather than spelling out the equivalent 19-digit amount in wei, i.e., 10^18). This vulnerability is also known in the literature as "too-many-digits" (Tsankov et al., 2018).
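
The readability problem can be made concrete with a small Python sketch. The helper `ether` is hypothetical and merely mirrors the idea of a unit suffix; the only fact relied upon is that 1 ether equals 10^18 wei.

```python
WEI_PER_ETHER = 10 ** 18  # 1 ether expressed in wei


def ether(amount: int) -> int:
    """Hypothetical helper that mirrors the idea of a unit suffix."""
    return amount * WEI_PER_ETHER


intended = ether(1)
# What a single extra zero typed into the raw literal would encode.
one_zero_too_many = intended * 10

print(str(intended))
print(str(one_zero_too_many))
print(len(str(intended)), len(str(one_zero_too_many)))
```

The two printed numbers differ by a factor of ten but are hard to tell apart by eye, which is why a named unit is preferable to a long literal.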

Improper Modifier
This group gathers defects that relate to the use of modifiers in functions and variables.

Wrong Function Modifier
This defect refers to the case of functions that are written solely to be used by other contracts (i.e., not within the contract). Such functions should be marked with the external modifier instead of public. The public modifier allows both external and internal calls. Marking a function with external results in gas savings, as every invocation will be using calldata (a special memory region to store arguments, which cannot be later modified by the function) and can avoid unnecessary read and write operations to memory, which occur with internal calls (i.e., that do not use calldata). This vulnerability is also known in the literature as "external-function" (Tsankov et al., 2018) or "SWC-100: Function Default Visibility" (SmartContractSecurity, 2020).

Missing Constant Modifier in Variable Declaration
Variables that are not modified during the execution flow should be declared as constants to save gas. In the absence of the constant modifier, it is assumed that the variable's value can be changed. This vulnerability is also known in the literature as "ConstableStates" (Tsankov et al., 2018) or "State variables that could be declared as constant" (Momeni et al., 2019).

Missing Visibility Modifier in Variable Declaration
Variables have different visibility states, which determine the context for accessing them. In Solidity, the default visibility of state variables is internal, which allows access from functions in the same contract or in derived contracts. A developer who is unaware of this may create a contract that exposes sensitive data or allows unexpected behavior. This vulnerability is also known in the literature as "StateVariablesDefaultVisibility" (Tsankov et al., 2018), "Visibility level" (Tikhomirov et al., 2018;Zhang et al., 2019), "Unspecified visibility level" (Lu et al., 2019), "Gain/Lose visibility" (Chapman et al., 2019) or "SWC-108: State Variable Default Visibility" (SmartContractSecurity, 2020).

Redundant Functionality
Contracts that are written with redundant functionality increase code size and make maintenance difficult. In a simple scenario, a programmer creates a function and later (through bad practice) ends up implementing the same functionality again in a new function. The programmer then identifies a vulnerability in the new function and fixes it, but callers still use the old function, which retains the defect. This vulnerability is also known in the literature as "Redundant refusal of payment", "Redundant fallback function" (Tikhomirov et al., 2018), or "Unnecessary payable fallback function" (Lu et al., 2019).

Shadowing
This category groups defects in which there are code elements (e.g., a function or a variable) with the same name, which can lead to erroneous and unexpected behavior.

Use of Same Variable or Function Name In Inherited Contract
When using the same name as a local variable, which was previously declared by an inherited contract, the program loses the reference of the inherited variable, causing the local variable to assume the role of the other variable. This vulnerability is also known in the literature as "Shadowing state variables" (Tsankov et al., 2018), "Shadow memory" (Ashouri, 2020), "Shadowing" (Feist et al., 2019) or "SWC-119: Shadowing State Variables" (SmartContractSecurity, 2020).

Variables or Functions Named After Reserved Words
This bug occurs when creating variables named after keywords of the language itself. For example, in Solidity, creating a variable with the name now conflicts with the function that returns the date and time. This vulnerability is also known in the literature as "ShadowedBuiltin" (Tsankov et al., 2018).

Use of the Same Variable or Function Name In a Single Contract
This vulnerability refers to cases where the same name is used for more than one variable or function inside the contract. This makes the program lose the reference of the variable of the class, assuming the variable of the function as its role. This vulnerability is also known in the literature as "ShadowedLocalVariable" (Tsankov et al., 2018), "Redefined Variable" (Hu et al., 2023) or "Local variable shadowing" (Momeni et al., 2019).

Buffer Overflow
This category refers to overflow vulnerabilities (e.g., stack-based, heap-based) in which it is possible to write more data than what the buffer can hold, thus modifying memory areas outside the expected.

Stack-based Buffer Overflow
The EVM keeps an execution stack that manages the execution of contracts. If an attacker is allowed to overflow this stack (e.g., by using specially crafted inputs), it can potentially overwrite control variables (e.g., timestamp or block number) and, for instance, gain unauthorized access to certain resources. This vulnerability is also known in the literature as "Stack size limited" (Argañaraz et al., 2020).

Write to Arbitrary Storage Location
In Solidity, arrays are stored as contiguous fixed-size slots. In the absence of a bounds verification, a malicious user could write data to a particular storage slot used to store the contract owner's address, which could be overwritten and then used to further harm the contract. This vulnerability is also known in the literature as "UnrestrictedWrite" (Tsankov et al., 2018), "Storage modification" (Krupp and Rossow, 2018), "Arbitrary Write" (Choi et al., 2021) or "SWC-124: Write to Arbitrary Storage Location" (SmartContractSecurity, 2020).
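
A minimal sketch of a missing bounds check, written in Python under simplifying assumptions: storage is modeled as a dictionary of slots addressed modulo 2^256, and `ARRAY_BASE`, `OWNER_SLOT`, and both write functions are hypothetical. The sketch does not reproduce Solidity's actual storage layout rules.

```python
SLOT_COUNT = 2 ** 256   # size of the modeled storage address space
ARRAY_BASE = 1000       # hypothetical slot where the array data starts
OWNER_SLOT = 0          # hypothetical slot holding the owner's address


def write_unchecked(storage: dict, index: int, value: str) -> None:
    """Vulnerable write: the index is never compared with the array length."""
    storage[(ARRAY_BASE + index) % SLOT_COUNT] = value


def write_checked(storage: dict, length: int, index: int, value: str) -> None:
    """Guarded write: out-of-range indices are rejected."""
    if not 0 <= index < length:
        raise IndexError("index out of bounds")
    storage[(ARRAY_BASE + index) % SLOT_COUNT] = value


storage = {OWNER_SLOT: "owner"}
malicious_index = SLOT_COUNT - ARRAY_BASE  # wraps around to slot 0
write_unchecked(storage, malicious_index, "attacker")
owner_after_attack = storage[OWNER_SLOT]

guarded = {OWNER_SLOT: "owner"}
try:
    write_checked(guarded, 3, malicious_index, "attacker")
    rejected = False
except IndexError:
    rejected = True

print(owner_after_attack, rejected)
```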

Use of Malicious Libraries
This defect refers to the use of third-party libraries containing malicious code. This vulnerability is also known in the literature as "Malicious libraries" (Tikhomirov et al., 2018), "Unknown libraries" (Lu et al., 2019), or "Dynamic libraries" (Andesta et al., 2020).

Incorrect Control Flow
This category groups a set of vulnerabilities that, if exploited, cause changes in the control flow of the program.

Incorrect Sequencing of Behavior
This category gathers vulnerabilities that end up in a sequence of behaviors that are carried out in the wrong order, leading to unexpected results.

Incorrect Function Call Order
This defect refers to the creation of public functions that expect to be called in a certain sequence, producing unanticipated results whenever clients do not follow the right call order (Mavridou Anastasia et al., 2018). This vulnerability is also known in the literature as "Transaction-ordering dependence" (Mavridou Anastasia et al., 2018) or "SWC-114: Transaction Order Dependence" (SmartContractSecurity, 2020).

Improper Locking
This issue refers to the case where a contract assumes that all entities participating in a transaction must have the same credit balance before the contract operations can execute. If there are no adequate (e.g., wrong or even missing) locking mechanisms, an attacker can forcefully send credit to the other entity, which would cause the verification of the balance condition to never be met. Thus, the contract may become unusable or show unexpected behavior (or unexpected state changes). This vulnerability is also known in the literature as "IncorrectEquality" (Tsankov et al., 2018), "Balance equality" (Tikhomirov et al., 2018;Zhang et al., 2019), "Strict equality" (Lu et al., 2019), "Strict Check for Balance" (Chen et al., 2020), "Arbitrary sending of ether" (Feist et al., 2019) or "SWC-132: Unexpected Ether balance" (SmartContractSecurity, 2020).
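
The effect of a strict balance comparison can be sketched in Python. The `Game` class, its bookkeeping field, and the `force_send` method are hypothetical; `force_send` stands for any way in which credit reaches a contract without running its deposit logic.

```python
class Game:
    """Hypothetical contract that compares its balance with its own bookkeeping."""

    def __init__(self) -> None:
        self.balance = 0         # models the contract's actual balance
        self.total_deposits = 0  # what the contract believes it has received

    def deposit(self, amount: int) -> None:
        self.balance += amount
        self.total_deposits += amount

    def force_send(self, amount: int) -> None:
        """Credit that arrives without the deposit logic being executed."""
        self.balance += amount

    def can_payout_strict(self) -> bool:
        return self.balance == self.total_deposits  # fragile guard

    def can_payout_tolerant(self) -> bool:
        return self.balance >= self.total_deposits  # tolerant guard


game = Game()
game.deposit(5)
game.force_send(1)  # attacker pushes one extra unit of credit
strict_ok = game.can_payout_strict()
tolerant_ok = game.can_payout_tolerant()
print(strict_ok, tolerant_ok)
```

After the forced transfer the strict condition can never hold again, whereas a guard based on an inequality keeps the contract usable.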

Transfer Pre-Condition Dependent on Transaction Order
In the case of this vulnerability, the order in which transactions are executed influences a precondition that guards the execution of the transfer. This influence may erroneously result in, for instance, a transaction not being executed at all. This defect is known in the literature as "TOD-Transfer" (Tsankov et al., 2018), "TOD" (Bose et al., 2022) or "Transaction Order Dependence" (SmartContractSecurity, 2020).

Transfer Amount Dependent on Transaction Order
This issue refers to the case where the value of the variable that stores or determines an amount of a digital asset (to be transferred) is modified before it is sent to the recipient due to transaction ordering within a block. The amount may change because multiple transactions that are grouped in a block and executed in a specific order can produce unexpected changes in the value being transferred. This vulnerability is also known in the literature as "TODAmount" (Tsankov et al., 2018), "TOD" (Liao et al., 2019;Wang et al., 2021;Bose et al., 2022) or "SWC-114: Transaction Order Dependence" (SmartContractSecurity, 2020).
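
A minimal Python sketch of how ordering changes the transferred amount. The two transaction labels, the initial price, and the `execute` function are hypothetical and only model a block as an ordered list of transactions applied to shared state.

```python
def execute(block):
    """Apply the transactions of a block in order and return the amount paid."""
    state = {"price": 10}
    paid = None
    for tx in block:
        if tx == "set_price_50":
            state["price"] = 50      # the seller updates the stored price
        elif tx == "buy":
            paid = state["price"]    # the buyer pays whatever is stored right now
    return paid


paid_buy_first = execute(["buy", "set_price_50"])
paid_buy_last = execute(["set_price_50", "buy"])
print(paid_buy_first, paid_buy_last)
```

Both blocks contain the same two transactions, yet the buyer pays 10 in one ordering and 50 in the other.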

Transfer Recipient Dependent on Transaction Order
In the case of this defect, the transfer recipient is modified before the send event due to transaction ordering within a block. As an example, if the intended recipient address is stored as a storage variable and a transfer is to execute based on this address, there is a chance the address may be changed or overwritten by another transaction prior to the transfer. This vulnerability is also known in the literature as "TODReceiver" (Tsankov et al., 2018), "Direct value transfer" (Krupp and Rossow, 2018), "Transaction order dependence" (Kalra et al., 2018;Hu et al., 2023), "Transaction-ordering dependence" (Song Jingjing and He et al., 2019;Grishchenko et al., 2018;Luu et al., 2016;Grieco et al., 2020), "TOD" (Bose et al., 2022) or "SWC-114: Transaction Order Dependence" (SmartContractSecurity, 2020).

Exposed state variables
This vulnerability refers to the case where a developer erroneously exposes a state variable, whose value may then be modified by an attacker so that this modification influences the execution of a certain contract operation. As an example, consider a contract that executes a credit transfer from one user to another and has a require statement for verifying that there is sufficient credit to conclude the operation. If the balance is stored as a public state variable, a malicious user could change its value so that the require check is bypassed, allowing that user to run a transfer that exceeds the amount of credit actually held. This vulnerability is also known in the literature as "Vulnerable state" (Krupp and Rossow, 2018).

Inadequate Input Validation
This group refers to defects involving the inadequate validation of functional conditions, which are requirements that a contract must meet so that it can operate correctly. Such conditions may offer protection against certain types of attacks or force certain business rules to be followed.

Improper Input Validation
This type of problem occurs when an attacker calls a certain contract operation using invalid or malicious input data, capable of affecting the functioning of the contract due to the fact that either it does not validate the incoming inputs or validates them in an incorrect manner. For instance, in the context of Solidity, a Short Address Attack occurs when a contract receives less data than it was expecting, which leads the system to fill the missing bytes with zeros (Chen et al., 2020). As a consequence, the behavior may become unexpected if the code assumes that the input data will comply with a certain length or format. This vulnerability is also known in the literature as "Invalid input data" (Chen et al., 2020;Grieco et al., 2020), "Short address attack" (Ashouri, 2020), "Short address", or "Avoid non-existing address" (Chang et al., 2019).
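
The zero-filling behavior behind a Short Address Attack can be sketched in Python. This is a simplified model: the function selector is omitted, arguments are encoded as two 32-byte words, and `encode_args` and `decode_args` are hypothetical names for a client that does not validate the address length and a callee that reads missing bytes as zeros.

```python
def encode_args(address: bytes, amount: int) -> bytes:
    """Naive client-side encoding that does not validate the address length."""
    return bytes(12) + address + amount.to_bytes(32, "big")


def decode_args(data: bytes):
    """Models the callee: bytes missing at the end are read as zeros."""
    padded = data.ljust(64, b"\x00")
    address = padded[12:32]
    amount = int.from_bytes(padded[32:64], "big")
    return address, amount


full_address = bytes.fromhex("11" * 19 + "00")  # 20-byte address ending in 0x00
short_address = full_address[:-1]               # the attacker omits the last byte

decoded_address, decoded_amount = decode_args(encode_args(short_address, 1000))
print(decoded_address == full_address, decoded_amount)
```

Because the data is one byte short, the address borrows the leading zero byte of the amount and the amount is shifted left by eight bits, i.e., multiplied by 256. Rejecting any address whose length is not exactly 20 bytes before encoding prevents the shift.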

Extraneous Input Validation
In this particular case, the functional conditions of the contract are too strong and do not allow certain behaviors (which would be valid) to occur, making the contract unable to meet the requirements. This vulnerability is also known in the literature as "Requirement Violation" (Choi et al., 2021) or "SWC-123 Requirement violation" (SmartContractSecurity, 2020).

Arithmetic Issues
This category groups different vulnerabilities that share the outcome of resulting in arithmetic problems.

Overflow and Underflow
This category refers to the use of operations (e.g., addition, subtraction) over values that produce a result that is less than the minimum value (or greater than the maximum value) that a variable can hold, so that the stored value differs from the correct result.
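
Wrap-around on fixed-width unsigned integers can be sketched in Python, whose integers are unbounded, by reducing results modulo 2^bits. The function names are hypothetical, and the 8-bit width is chosen only to keep the numbers small.

```python
def unchecked_add(a: int, b: int, bits: int = 8) -> int:
    """Addition that silently wraps around, as in unchecked fixed-width arithmetic."""
    return (a + b) % (2 ** bits)


def unchecked_sub(a: int, b: int, bits: int = 8) -> int:
    """Subtraction that silently wraps around below zero."""
    return (a - b) % (2 ** bits)


def checked_add(a: int, b: int, bits: int = 8) -> int:
    """Addition that refuses to produce an out-of-range result."""
    result = a + b
    if result > 2 ** bits - 1:
        raise OverflowError("result does not fit in the target type")
    return result


overflowed = unchecked_add(255, 1)   # wraps to the minimum value
underflowed = unchecked_sub(0, 1)    # wraps to the maximum value
print(overflowed, underflowed)
```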

Division Bugs
This category groups issues related to erroneous division operations.

Divide by Zero
This issue refers to the attempt of a program to divide a value by zero. This vulnerability is also known in the literature as "Division-by-zero" (So et al., 2020), "Arithmetic Bugs" (Torres et al., 2018), or "Division by zero" (Akca et al., 2019).

Integer Division
At the time of writing, a smart contract mainstream language like Solidity does not support floating point or decimal types. Thus, the remainder of a division operation is always lost. Developers may use fixed-point arithmetic and external libraries to handle this kind of operation. This vulnerability is also known in the literature as "Numerical Precision Error" (Cui et al., 2022), "Integer division" (Tikhomirov et al., 2018), "Using fixed point number type" or "SWC-101: Integer Overflow and Underflow" (SmartContractSecurity, 2020).
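
The loss of the remainder, and two common ways of limiting it, can be sketched with Python's integer division operator. The function names and the scaling factor `SCALE` are assumptions for illustration and do not correspond to a particular fixed-point library.

```python
SCALE = 10 ** 18  # assumed fixed-point scaling factor


def divide_then_multiply(amount: int, parts: int, factor: int) -> int:
    """The remainder is discarded before the multiplication."""
    return (amount // parts) * factor


def multiply_then_divide(amount: int, parts: int, factor: int) -> int:
    """Multiplying first keeps the precision until the final division."""
    return (amount * factor) // parts


def fixed_point_div(a: int, b: int) -> int:
    """Returns a / b scaled by SCALE, so the fractional part is retained."""
    return a * SCALE // b


lossy = divide_then_multiply(7, 2, 2)
exact = multiply_then_divide(7, 2, 2)
one_third = fixed_point_div(1, 3)
print(lossy, exact, 1 // 3, one_third)
```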

Conversion Bugs
This category groups a set of vulnerabilities where there are issues related to the conversion between different datatypes.

Truncation Bugs
This vulnerability refers to the case where a variable declared in a certain type is converted to a smaller type, which means that data is lost during the conversion process. This vulnerability is also known in the literature as "Truncation bugs" (Torres et al., 2018) or "SWC-101: Integer Overflow and Underflow" (SmartContractSecurity, 2020).
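
A narrowing conversion can be sketched in Python by keeping only the low-order bits of a value. The helper `to_uint` is hypothetical.

```python
def to_uint(value: int, bits: int) -> int:
    """Keep only the lowest `bits` bits, as a narrowing conversion would."""
    return value & ((1 << bits) - 1)


truncated = to_uint(0x1234, 8)   # the high-order byte 0x12 is lost
wrapped_to_zero = to_uint(256, 8)
print(hex(truncated), wrapped_to_zero)
```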

Signedness Bugs
The conversion of a signed integer type to an unsigned type of the same width may change a negative value to a positive one (the opposite may also happen) (Torres et al., 2018). This vulnerability is also known in the literature as "Signedness bugs" (Torres et al., 2018) or "SWC-101: Integer Overflow and Underflow" (SmartContractSecurity, 2020).
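
The reinterpretation between signed and unsigned values of the same width can be sketched in Python using two's complement arithmetic. Both helper functions are hypothetical, and the default width of 256 bits is an assumption chosen to match the largest Solidity integer types.

```python
def int_to_uint(value: int, bits: int = 256) -> int:
    """Reinterpret a signed value as unsigned (two's complement)."""
    return value % (1 << bits)


def uint_to_int(value: int, bits: int = 256) -> int:
    """Reinterpret an unsigned value as signed (two's complement)."""
    return value - (1 << bits) if value >= 1 << (bits - 1) else value


negative_as_unsigned = int_to_uint(-1)          # becomes the maximum unsigned value
large_as_signed = uint_to_int(2 ** 256 - 1)     # becomes -1
print(negative_as_unsigned == 2 ** 256 - 1, large_as_signed)
```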

Improper Access Control
This category groups a set of vulnerabilities that are strongly related to authentication or access control.

Incorrect Authentication or Authorization
The smart contract fails to properly identify a client or determine its privileges, resulting in wrong access privileges for that particular client.

Wrong Caller Identification
In Solidity, tx.origin returns the address of the account that initiated a transaction, whereas msg.sender returns the address of the immediate caller (an account or a contract) of the function being executed. Using tx.origin for access control may open an entry point to a malicious user. A malicious user may create a contract that calls the vulnerable function (i.e., the one that uses tx.origin to check the identity of the caller). If a legitimate user is led to invoke that malicious contract, msg.sender will differ from tx.origin inside the vulnerable function: msg.sender is the malicious contract, while tx.origin is still the legitimate user. Because the vulnerable function uses tx.origin for access control, it will allow the malicious user to perform actions that should not be permitted. This vulnerability is also known in the literature as "Transaction Origin Use" (Choi et al., 2021), "Transaction state dependence" (Kalra et al., 2018), "Use of origin" (Brent et al., 2018), "TxOrigin" (Hu et al., 2023;Li et al., 2023;Akca et al., 2019;Liao et al., 2019;Tsankov et al., 2018), "Tx.origin for authentication", "Tx.origin" (Lu et al., 2019;Tikhomirov et al., 2018), "Incorrect Check for Authorization" (Chen et al., 2020), "Unprotected usage of tx.origin" (Momeni et al., 2019) or "SWC-115: Authorization through tx.origin" (SmartContractSecurity, 2020).
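
The difference between the two identifiers can be sketched in Python. The `CallContext` class, the `OWNER` constant, and both check functions are hypothetical; the scenario assumes that the owner has been led to call a malicious contract, which then forwards the call to the protected function.

```python
from dataclasses import dataclass

OWNER = "owner"


@dataclass
class CallContext:
    tx_origin: str   # account that initiated the transaction
    msg_sender: str  # immediate caller of the function being executed


def authorized_by_origin(ctx: CallContext) -> bool:
    """Vulnerable check: trusts whoever started the transaction."""
    return ctx.tx_origin == OWNER


def authorized_by_sender(ctx: CallContext) -> bool:
    """Safer check: trusts only the immediate caller."""
    return ctx.msg_sender == OWNER


# The owner calls the attacker's contract, which calls the protected function.
forwarded_call = CallContext(tx_origin=OWNER, msg_sender="attacker_contract")
origin_result = authorized_by_origin(forwarded_call)
sender_result = authorized_by_sender(forwarded_call)
print(origin_result, sender_result)
```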

Owner Manipulation
This vulnerability allows an attacker to exploit some function or feature of the smart contract by manipulating the owner control variable. This allows the attacker to perform some kind of restricted operations. This vulnerability is also known in the literature as "Missing Owner Check" (Cui et al., 2022), "Unprotected Function" (Stephens et al., 2021), "Vulnerable access control", "Access control" (Lu et al., 2019), or "Tainted owner variable".

Missing Verification for Program Termination
This issue refers to the lack of a secure verification for terminating a published (deployed) contract, allowing an attacker to terminate it in an unauthorized manner. Selfdestruct is an EVM instruction that is able to nullify the bytecode of a deployed contract. When invoked, it stops the execution of the EVM, deletes the contract's bytecode, and sends the remaining funds to a certain address. Access to this kind of function by non-authorized clients may result in security issues. This vulnerability is also known in the literature as "Suicidal Contract" (Choi et al., 2021), "Unprotected Suicide" (Hu et al., 2023), "Destroyable contract" (Brent et al., 2018), "SelfDestruct" (Liao et al., 2019), "Suicidal contracts" (Feist et al., 2019), "Guard suicide" (Chang et al., 2019), "Unprotected usage of selfdestruct" (Momeni et al., 2019), "Accessible selfdestruct", "Tainted selfdestruct" or "SWC-106: Unprotected SELFDESTRUCT Instruction" (SmartContractSecurity, 2020).

Improper Protection of Sensitive Data
This category generally refers to the issues that result in the inability to protect sensitive information from non-authorized clients.

Exposed Private Data
This issue refers to the cases in which contracts store unencrypted sensitive data in public blockchain transactions. Solidity, like other programming languages, supports the private keyword, which indicates that data is only accessible within the contract itself. However, in blockchain environments, marking a variable with private does not make it fully invisible to the outside world. Miners, which are responsible for validating transactions on the blockchain, can view the code of the contract and the value of its state variables. This vulnerability is also known in the literature as "Keeping Secrets" (Argañaraz et al., 2020), "Exposed secret", "Private modifier" (Lu et al., 2019;Tikhomirov et al., 2018;Zhang et al., 2019) or "SWC-136: Unencrypted Private Data On-Chain" (SmartContractSecurity, 2020).

Dependency on External State Data (Unsolvable constraints of external critical state data)
This vulnerability refers to the use of data that is neither controlled nor generated by the contract (i.e., external critical state data). A malicious user may exploit this situation if such data determines the outcome of the execution of the contract. This vulnerability is also known in the literature as "Unsolvable constraints".

Cryptography Misuse
This category groups vulnerabilities that generally reflect misuse of cryptography mechanisms.

Incorrect Verification of Cryptographic Signature
This issue refers to the incorrect verification of the authenticity and integrity of messages through message signatures. As an example, a developer could write a vulnerable contract that relies on a signature in a signed message hash to represent the earlier verification of previous messages. A client could generate a malicious message with a valid signature and include it in the hash. The contract would then validate the signature and update the hash, indicating that the message was processed. This vulnerability is also known in the literature as "Missing Key Check" (Cui et al., 2022) or "SWC-117: Signature Malleability" (SmartContractSecurity, 2020).

Improper Check against Signature Replay Attacks
This defect refers to a situation where a malicious client is able to obtain the message hash of a legitimate transaction and is allowed to use the same signature to impersonate the legitimate client and execute fraudulent transactions. This vulnerability is also known in the literature as "SWC-121: Missing Protection against Signature Replay Attacks" (SmartContractSecurity, 2020).
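A common way to guard against this kind of replay is to bind each signed message to a nonce and to record which nonces have already been consumed. The following Python sketch illustrates the idea under loud assumptions: HMAC-SHA256 stands in for the ECDSA signatures used on Ethereum, and the key `SECRET`, the function `sign`, and the class `Verifier` are hypothetical names, not part of any contract or library discussed in this paper.

```python
import hashlib
import hmac

SECRET = b"alice-signing-key"  # hypothetical key; HMAC stands in for ECDSA


def sign(message: bytes, key: bytes = SECRET) -> bytes:
    """Produce a stand-in signature over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


class Verifier:
    """Toy verifier with and without replay protection."""

    def __init__(self, key: bytes = SECRET) -> None:
        self.key = key
        self.used_nonces = set()  # nonces that were already consumed

    def accept_without_replay_check(self, payload: bytes, nonce: int, sig: bytes) -> bool:
        # Vulnerable: a valid signature is accepted any number of times.
        message = payload + nonce.to_bytes(32, "big")
        return hmac.compare_digest(sig, sign(message, self.key))

    def accept(self, payload: bytes, nonce: int, sig: bytes) -> bool:
        # Guarded: each nonce can be consumed only once.
        if nonce in self.used_nonces:
            return False
        if not self.accept_without_replay_check(payload, nonce, sig):
            return False
        self.used_nonces.add(nonce)
        return True
```

With the unguarded check, resubmitting the same payload, nonce, and signature succeeds every time; with `accept`, the second submission is rejected because the nonce has already been recorded.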

Improper Authenticity Check
In this case, a contract may tolerate off-chain signed messages instead of waiting for an on-chain signature. This is usually done with the goal of improving performance but may come at the expense of compromising the authenticity of the message. This vulnerability is also known in the literature as "Missing Signer Check" (Cui et al., 2022) or "SWC-122: Lack of proper signature verification" (SmartContractSecurity, 2020).

Incorrect Argument Encoding
This defect refers to the misuse of one-way hash functions (i.e., Solidity keccak256), namely the incorrect encoding of the function arguments, which can result in a higher likelihood of hash collisions for different entries. This vulnerability is also known in the literature as "Authorization" (Mavridou Anastasia et al., 2018), "Hash collision" (Lu et al., 2019), or "SWC-133: Hash Collisions With Multiple Variable Length Arguments" (SmartContractSecurity, 2020).
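The collision arises at the encoding step, before any hashing takes place: when variable-length arguments are concatenated without boundary information, different argument lists can produce the same byte string. The Python sketch below illustrates this under stated assumptions. The helpers `packed`, `length_prefixed`, and `digest` are hypothetical; `length_prefixed` is a simplified boundary-preserving encoding rather than Solidity's actual ABI encoding, and SHA3-256 is used only as a stand-in because Solidity's keccak256 uses different padding.

```python
import hashlib


def packed(*parts: str) -> bytes:
    """Concatenate arguments with no length or boundary information."""
    return b"".join(p.encode() for p in parts)


def length_prefixed(*parts: str) -> bytes:
    """Prefix each argument with its byte length so boundaries are preserved."""
    out = b""
    for p in parts:
        raw = p.encode()
        out += len(raw).to_bytes(32, "big") + raw
    return out


def digest(data: bytes) -> str:
    # Stand-in hash: SHA3-256, not the keccak256 variant used by Solidity.
    return hashlib.sha3_256(data).hexdigest()


# ("ab", "c") and ("a", "bc") pack to the same bytes, so they hash identically.
same = digest(packed("ab", "c")) == digest(packed("a", "bc"))

# With explicit lengths, the two argument lists encode (and hash) differently.
different = digest(length_prefixed("ab", "c")) != digest(length_prefixed("a", "bc"))
```

Because the two packed inputs are byte-for-byte identical, the choice of hash function is irrelevant to the collision; only an encoding that preserves argument boundaries removes it.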

Discussion
This section overviews the main characteristics of the taxonomy and maps our observations to state-of-the-art and industry practices. We conclude the section with a brief summary of the main aspects that contribute to the overall quality of the taxonomy.

Mapping the taxonomy to the state of the art
Table 12 summarizes the distribution of the number of vulnerabilities across the main categories of our taxonomy. As we can see, the distribution is dominated by 'Bad Programming Practices & Language Weaknesses', which accounts for almost half of the defects. The remaining categories mostly show relatively similar numbers among themselves. Figure 5 further characterizes the identified vulnerabilities, namely by identifying the different defect types (in the y-axis) and specifying the number of OpenSCV vulnerabilities per defect type (in parentheses, in the y-axis). The plot then shows the prevalence of the qualifier values. Notice that the sum of the qualifier values exceeds the vulnerability count shown in parentheses in the y-axis, as a certain defect may be associated with more than one qualifier.
Figure 5: ODC defect type (y-axis, with the number of vulnerabilities in OpenSCV in parentheses) and prevalence of the qualifiers Missing, Wrong, and Extraneous. Defect types shown: Relationship (1), Application Dependencies (1), Process conformance (2), Function/Class/Object (3), Timing/Serialization (7), Interface/O-O Messages (13), Algorithm/Method (14), Checking (17), Assignment/Initialization (18).
As we can see in Figure 5, the top defect types are 'Assignment/Initialization', 'Checking', and 'Algorithm/Method', closely followed by 'Interface/O-O Messages'. The top three defect types account for nearly two-thirds of the defects, with the top four accounting for more than 80% of the 76 vulnerabilities. Table 13 summarizes the qualifier prevalence. It also shows the prevalence of qualifier combinations, which represent vulnerabilities whose correction may be related to more than one qualifier (e.g., a certain vulnerability may be due to a 'missing' or to a 'wrong' 'assignment'). The 'wrong' qualifier is the most frequent one, followed by 'missing'. In terms of combinations, 'missing' and 'wrong' form the most frequent case.
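The shares quoted for the top defect types can be checked directly from the per-type counts reported alongside Figure 5. The short Python sketch below does this arithmetic; the variable names are ours, and the counts are the ones listed for the figure.

```python
# Number of OpenSCV vulnerabilities per ODC defect type (from Figure 5).
counts = {
    "Assignment/Initialization": 18,
    "Checking": 17,
    "Algorithm/Method": 14,
    "Interface/O-O Messages": 13,
    "Timing/Serialization": 7,
    "Function/Class/Object": 3,
    "Process conformance": 2,
    "Application Dependencies": 1,
    "Relationship": 1,
}

total = sum(counts.values())
ranked = sorted(counts.values(), reverse=True)

top3_share = sum(ranked[:3]) / total  # 49 / 76
top4_share = sum(ranked[:4]) / total  # 62 / 76

print(total, round(100 * top3_share, 1), round(100 * top4_share, 1))
# prints: 76 64.5 81.6
```

The counts sum to the 76 vulnerabilities in OpenSCV, the top three defect types cover 64.5% of them (nearly two-thirds), and the top four cover 81.6%.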
We selected three classification schemes for comparison with OpenSCV: SWC (SmartContractSecurity, 2020), for frequently appearing in the literature; the classification by (Rameder et al., 2022), for being the most extensive one found in the state of the art; and the list of vulnerabilities used in (Hu et al., 2023), for being the most recent vulnerability detection work in our list. Figure 6 shows to what extent these classifications map the vulnerabilities identified in our taxonomy and, overall, shows their reach and practical limitations.
Figure 6: Number of vulnerabilities for OpenSCV, SWC (SmartContractSecurity, 2020), (Rameder et al., 2022), and SoliDetector (Hu et al., 2023), shown as Reported Vulnerabilities and their OpenSCV Correspondent.
The first observation from Figure 6 is that the number of distinct vulnerabilities currently captured by OpenSCV exceeds that of the remaining classifications in the plot. SWC has seen its 37 vulnerabilities unfold into 49 different cases. We also map all 48 vulnerabilities in (Rameder et al., 2022), in this case in a one-to-one mapping (notice that the work reports a total of 54 defects, of which we excluded 6 that do not represent security vulnerabilities). Finally, 18 vulnerabilities in (Hu et al., 2023) are mapped into 19 in OpenSCV (one of the vulnerabilities unfolds into two in OpenSCV). We must mention that (Hu et al., 2023) actually identifies a total of 20 vulnerabilities, although, for two of them, there are not sufficient details to allow an understanding of what exactly the defect represents.
We now analyze the state of the practice (in terms of tools) by comparing their announced vulnerability detection capabilities against the vulnerabilities identified in our taxonomy. Figure 7.a) shows the practical distance between the 49 identified vulnerability detection works and our current state of knowledge regarding smart contract vulnerabilities. Figure 7.b) shows, from the perspective of each individual vulnerability, how many tools are being designed to detect it.
As we can see in Figure 7.a), current tools are being designed to detect, on average, 7 of the vulnerabilities in OpenSCV. If we consider the first quartile, we see that tools are projected to detect from 3 to 8.5 different vulnerabilities. The best three vulnerability detection works (in terms of projected detection capabilities) are: Securify (Tsankov, 2018), covering 44.8% of OpenSCV vulnerabilities (34 out of 76); Neucheck (Lu et al., 2019), with 36% (20 out of 76); and SoliDetector (Hu et al., 2023), which is projected to detect 25% of the vulnerabilities in OpenSCV (19 out of 76). We can see that, even in terms of design for detection, the room for improvement is huge, and the plain combination of the different tools' capabilities is in itself a clear possibility for the creation of a better tool. Figure 7.b) shows that a single vulnerability is being targeted, on average, by 5.1 tools, with the first quartile being between 2 and 6 tools.
Figure 8 shows the focus of the different classes of tools per each of the top categories in our taxonomy. The category that receives the largest focus from vulnerability detection approaches is "1. Unsafe External Calls", which is dominated by software testing approaches. Another aspect to mention is that, although "4. Bad Programming Practices & Language Weaknesses" is the category that groups the largest set of vulnerabilities (i.e., 36 vulnerabilities), proportionally it is far from gathering the same attention as other categories (e.g., "6. Arithmetic Issues" is being targeted by 27 tools, although it groups only 6 types of vulnerabilities). The last aspect worth mentioning is that software testing tends to be the most frequent technique across all categories of vulnerabilities, with the exception of "7. Improper Access Control", where static analysis has the lead. Indeed, several vulnerabilities in this group easily fit the detection capabilities of static analysis techniques (e.g., cryptography misuse vulnerabilities).
The top 5 vulnerabilities (in terms of presence in the different papers) are "1.1.1 Unsafe Credit Changes" (36), "5.1.1 Incorrect Use of Event Blockchain Variables for Time" (28), "6.1.1 Integer Underflow" (21), "6.1.2 Integer Overflow" (21), and "4.7.1 Unreachable Payable Function" (18). Detailed information and further data (e.g., code examples) can be quickly viewed at the OpenSCV website.

Main contributors to the overall quality of the taxonomy
We now summarize the aspects that we believe contribute most to the general quality of the taxonomy, which we are making publicly available at http://openscv.dei.uc.pt. In terms of organization, we opted for a hierarchical structure, as it may be useful from a defect prevention perspective. From a language designer's perspective, understanding that there is a certain group of defects that are related to, for instance, gas depletion may be helpful for designing effective protection mechanisms against those defects. Such mechanisms may share common strategies.
A taxonomic structure of this kind makes it easier to set homogeneous levels of abstraction, which we iteratively tried to achieve, although this goal is quite difficult, as it must be balanced against the number of items and the overall tree complexity (and in some cases, due to the specificity of the problem, it may not even be possible). We tried to reuse existing terminology as much as possible, although many times we converged on new terms (adapted from the literature) for clarity purposes. The required nomenclature adaptations integrated into our taxonomy were carried out mostly with the goal of making the items non-ambiguous (and also uniquely identifiable) and of fostering the determinism of the classification process by clarifying the meaning of each vulnerability. We complemented this with the available information from DASP, SWC, (Rameder et al., 2022), and CWE, aiming to make the taxonomy more comprehensible and non-ambiguous (multiple perspectives help dissipate standing doubts, fostering repeatability).
The taxonomy construction process involved the analysis of a relatively large number of papers, tools, and other classifications, with the main goal of fostering completeness (i.e., good coverage), which in the end also makes the taxonomy more useful, as we end up forming a unified view of the landscape of smart contract vulnerabilities. As previously mentioned, we found that the number of papers and respective vulnerabilities analyzed (i.e., an initial set of 357 vulnerabilities collected from 49 papers) was actually a main contributor to the overall quality of the taxonomy, with a few late additions becoming trivial to map. It is worth mentioning that the created structure is non-rigid in the sense that we make it open to the community and, in particular, open to community contributions, which can be made by submitting issue requests at the OpenSCV GitHub repository.

Threats to Validity
This section discusses the main threats to the validity of this work. To minimize the chances of creating an incorrect structure or providing incorrect vulnerability information, we formalized the taxonomy creation process, which was based on several quality criteria identified in the state of the art and relied on several researchers (i.e., one Early Stage Researcher and two Experienced Researchers) who incrementally and iteratively built the taxonomy following a bottom-up approach. The process was enriched by establishing relations to other classifications in the blockchain context (i.e., SWC, DASP, (Rameder et al., 2022)) and in a more general context (i.e., CWE). We also characterized each vulnerability using ODC and an example, which also served to minimize doubts and resolve divergences among researchers. In addition, we provide the taxonomy as a live structure on the OpenSCV website, supported by a GitHub repository, so that possible mistakes can be corrected and future updates, changes, and overall taxonomy evolution are possible.
We are aware that a classification or categorization scheme or a taxonomy may assume one of several possible forms. We may have more or fewer main categories, we may have a deeper tree, the organization may or may not be hierarchical, and so on. While such diversity is acceptable (as long as the organization and individual items are correct), we opted to focus on the taxonomy creation process instead of forcing a certain structure. For this purpose, we identified quality criteria and analyzed similar structures in the state of the art, so that we could learn from possible mistakes and incorporate lessons learned by previous researchers. While the current structure is a proposal, we built it to change and evolve, by opening it to the community and also by directly providing 'Request For Change' templates to facilitate changes or additions to the present form.
An important aspect is that the taxonomy creation process was guided by the research that was found during the analysis of the state of the art. Thus, we may have missed some relevant work in this context and, with time, this gap may grow. The fact that we were already aware of contributions coming from three areas (research on vulnerability classification, community-oriented initiatives on vulnerability classification, and research on vulnerability detection) allowed for a more efficient search, which we believe captured representative research in this context. Despite this, and to mitigate possible gaps between the set of works considered to build OpenSCV and the set not captured during the collection of papers in this work, we prepared a supporting infrastructure to allow the continuous update and evolution of OpenSCV. Thus, we will be able to capture and integrate new research in vulnerability detection that may bring in emerging smart contract vulnerabilities.

Conclusion
In this paper, we presented an open hierarchical taxonomy for smart contract vulnerabilities. The taxonomy is up-to-date according to the current state of the practice and is prepared to handle future modifications and evolution. To build the taxonomy, we began by analyzing current vulnerability classification schemes for blockchain. We also analyzed the announced detection capabilities of research on smart contract vulnerability detection, and we followed an iterative process to structure the taxonomy. We discussed the characteristics of the proposed taxonomy and its coverage against the state of the practice. In particular, we analyzed the announced detection ability of current industry-level tools and mapped it to the different identified vulnerabilities. As future work, we plan on using this taxonomy as a basis to define a benchmark for smart contract vulnerability detection tools.