Systematic review of declarative tactical knowledge evaluation tools based on game-play scenarios in soccer

In the last two decades, the analysis of tactical knowledge has attracted increasing research interest, contributing to the development of ad-hoc tools for this task. The aim of this study is to collect evaluation tools that measure declarative tactical knowledge (DTK) in soccer. Five databases (Web of Science, PubMed, SPORTDiscus, PsycINFO and ERIC) were used for the literature search based on PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, according to five inclusion/exclusion criteria. The tools had to: (i) determine DTK in soccer players, (ii) come from primary sources, that is, be published for the first time, (iii) show game-play scenarios in video sequences or static images via questionnaires, (iv) have been submitted to a process of validity and reliability, and (v) avoid the use of verbal language. Nine tools were selected and analyzed in this systematic review: Soccer decision-making tests (McMorris, in Percept Mot Ski 85(2):467, 1997), Protocol for the evaluation of declarative tactical knowledge (Mangas, in Conhecimento declarativo no futebol: Estudo comparativo em praticantes federados e não-federados, do escalão de sub-14, Dissertação de Mestrado, Faculdade de Desporto da Universidade do Porto, 1999), Questionnaire for the evaluation of tactical comprehension applied to football, CECTAF (De la Vega, in Cultura, Ciencia y Deporte, 2002. https://repositorio.uam.es/bitstream/handle/10486/1723/11535_vega_marcos_ricardodela.pdf?sequence=1), Decision making instrument for Soccer (Fontana, in The development of a decision making instrument for soccer, Master’s degree dissertation, University of Pittsburgh, 2004. http://d-scholarship.pitt.edu/10124/), Game Understanding Test (Blomqvist et al., in Phys Educ Sport Pedagogy 10(2):107–119, 2005), Offensive Football Tactical Knowledge Test, TCTOF (Serra-Olivares and García-López, in Revista Internacional De Medicina y Ciencias De La Actividad Física y Del Deporte 16(63):521–536, 2016. https://doi.org/10.15366/rimcafd2016.63.008), Video-based decision-making test (Keller et al., in Int J Sports Sci Coach 13(6):1057–1066, 2018. https://doi.org/10.1177/1747954118760778), Decision-Making form IOS application (Bennett et al., in J Sci Med Sport 22(6):729–734, 2019. https://doi.org/10.1016/j.jsams.2018.12.011) and TacticUP video test for soccer (Machado and Teoldo, in Front Psychol, 2020. https://doi.org/10.3389/fpsyg.2020.01690). Most of the tools did not pass many of the criteria proposed to assess their quality. Fundamentally, it can be concluded that few tools provide specific tactical scores based on game principles or subroles that would allow identifying possible points of improvement in players' knowledge of specific aspects of the game. For this reason, and based on the other findings of this review, future studies should consider: (i) the importance of designing tools that reflect scores based on tactical principles and game subroles; (ii) the advantages and disadvantages of designing tools based on static images or video sequences; (iii) the need to design tools that can access the DTK of young children; (iv) the need to design tools that present game-play scenarios in the first person; (v) the importance of subjecting the tools designed to rigorous processes of validity and reliability.


Introduction
Any sport confrontation evolves within a closed field where the actions are channeled inside the borders that space encloses in itself and, beyond this, the game has no meaning (Parlebas 1988). This closed field has a series of characteristics that distinguish and shape it, configuring the possibilities and limitations of action of the players. The way of responding to the problems that the game raises, is not only influenced by this type of constraints, but also by the players' tactical knowledge, which is not inherent to themselves, but is developed and learned (González-Víllora et al. 2015). The players are in constant training, in continuous incorporation of knowledge and experiences, in mutual interaction with each other, and with the physical environment, enriching themselves and enriching the game (Castellano 2000). The mental images that the player must build in the course of the game, which should allow him to read and anticipate the game, are fruit, above all, of the experience, learning and training in contexts of "socio-interaction" (Casamichana et al. 2015), because in team sports, the knowledge the player needs is related to game logic (Gréhaigne et al. 1999).
In the last two decades, the analysis of tactical knowledge has become a research channel of increasing interest, contributing to the development of ad-hoc tools to carry out this task, either from the procedural or declarative level. In the sport field, procedural tactical knowledge (PTK) refers to performance and creation of movements, selecting the most adequate actions according to different competition situations (French and Thomas 1987), being intimately linked to motor action (Kirkhart 2001;Teoldo et al. 2011;Williams and Davids 1995). Specifically, in an eminently open sport with unrepeatable game-play situations such as soccer, PTK should be determined by knowing how, when and where to do it, drawing the "identity document" that determines the traits of the motor action (Parlebas 1993) and the consequences that entails for its realization (Parlebas 2001). Declarative tactical knowledge (DTK) refers to the knowledge on rules, positions, functions, basic offensive and defensive strategies, and understanding of game tactical-technical logic, "to know what to do" (Thomas et al. 1986). It is the player's ability to declare verbally and/or in writing, what is the best decision to be made in a given training or match situation (Tenenbaum and Lidor 2005).
Traditionally, an association has been established between expertise level and declarative tactical knowledge, as it appears that high-skill players possess a larger and more elaborate declarative knowledge base (Williams and Davids 1995). For this reason, selective methodology (Anguera 2003) has traditionally been used to evaluate DTK, and numerous instruments have been designed and validated, allowing access to a large sample quickly and directly, and with little training in their application (Salmon et al. 2009). This wide range of instruments includes questionnaires on basic knowledge, in terms of terminology, rules, principles of the game and performance, used in basketball (Del Villar et al. 2004; French and Thomas 1987; Pinto 1997), volleyball (Moreno-Domínguez et al. 2006), tennis (García-González et al. 2009; McPherson 1987), soccer (Otero et al. 2012), floorball (Contreras Jordán et al. 2005), as well as invasion sports in general (Contreras Jordán et al. 2005; Figueiredo et al. 2008). Verbal protocols based on conditional statements of the form "if X occurs, then I do Y" were used for the first time in tennis (McPherson 1987) and were later adopted in some of the aforementioned works assessing tactical knowledge in other sports; they also served as a scaffold for more sophisticated tools based on static or dynamic images of game situations. Based on these verbal protocols, interviews have also been conducted in tennis and subsequently coded according to a system of categories (García-González et al. 2006; McPherson 1999). Interviews with open questions about basic knowledge and tactical problems have been used in soccer (González-Víllora et al. 2010a, b; Griffin et al. 2001); self-confrontation interviews (Cranach and Harré 1982) with open-ended questions, carried out after the participants' own performance, have been used in volleyball (Macquet 2009), badminton (Macquet and Fleurance 2007) and tennis (McPherson and Thomas 1989). Interviews based on soccer video sequences have also been used (Den Hartigh et al. 2018; García-López et al. 2010; González-Víllora et al. 2010a, b, 2011; Vaeyens et al. 2007). In tennis, in-game interviews have been used to access the knowledge of the participants immediately after the action (McPherson 1987; McPherson and Thomas 1989). The ability to memorize game patterns, associated with the memory paradigm and related in some way to the anticipatory component in team sports, and therefore to tactical knowledge, was first studied in chess (Chase and Simon 1973) and has also been analyzed in soccer (Casanova et al. 2012; McMorris 1997; McMorris and Graydon 1996a, 1996b; Williams et al. 1993). Self-perception questionnaires developed for invasion sports (Elferink-Gemser et al. 2004) have also been used in soccer (Kannekens et al. 2009; Nortje et al. 2014). Reflective supervision protocols based on self-reports have been used in basketball (Iglesias 2006). Multi-response tests based on iconic images of game situations have been used in soccer (De la Vega 2002; Griffin et al. 2001; McMorris and Graydon 1996a, 1996b; Quina et al. 2011; Serra-Olivares and García-López 2016) and futsal (Souza 2002). Multi-response tests based on frozen images from video sequences of game play have been used in soccer (Bennett et al. 2019; Blomqvist et al. 2005; Fontana 2004; Giacomini et al. 2011; Keller et al. 2018; Mangas 1999; Praça et al. 2016), tennis (Aburachid et al. 2013) and badminton (Blomqvist et al. 2000). Computerized tests based on situations from different sports disciplines have also been used (Buscà et al. 2010), as have computer animations in volleyball (Broek et al. 2011) and game simulators in soccer (De la Vega et al. 2008; Helsen and Pauwels 1988; Sánchez-López et al. 2012). Looking ahead, virtual reality may come to play a relevant role in the assessment of DTK.
Regarding the limitations found in these tools, one of the key requirements for any tool is that it must be configured from the internal logic of the game (Parlebas 1988), measuring aspects that faithfully constitute the player's soccer competence (Parlebas 2018), both in attack and defense. Furthermore, it seems worthwhile to draw on the operational (Bayer 1979) and core/specific (Castelo 1999; Garganta and Pinto 1994; Hainaut and Benoit 1979; Kunrath et al. 2020; Queiroz 1983; Teoldo et al. 2009, 2011; Worthington 1974) principles of the game, as well as the roles (Lago 2000) and player subroles (Marqués et al. 2015; Oboeuf et al. 2009), since they allow the analysis of DTK from various perspectives and in relation to the teaching-training processes. Another important issue in DTK analysis is the role of verbalization in decision-making. Decision making is defined as a choice of action, and it is a result that can be observed as a motor or verbal response (MacMahon and McPherson 2009). However, there is no theoretical support to defend the relationship between verbal behavior and tactical behavior (Araújo et al. 2014), so tactical skills and verbalizations about tactical skills cannot be considered equivalent (Araújo et al. 2010). This is because what a person says about something is not everything he knows about that "something" (De la Vega 2002); on many occasions, human beings are not able to express everything they know through language. It is true that explicit knowledge, in addition to allowing one "to do", also allows one to explain the reasons for and the processes of action (Quina et al. 2011), and verbal protocols have been used to distinguish and compare the mental processes that support decision-making outside the context of natural play (Petiot et al. 2017), or to measure cognitive effort in perception-reflection processes (Cardoso et al. 2019).
For these reasons, instruments constructed to access the DTK of players should consider whether to rely on linguistic competence. Last, ensuring the validity and reliability of the data collected is important for the analysis to fulfil its intentions and purposes effectively (Tenga et al. 2009). Validity and reliability are neglected in many tools in the sports literature, yet they are essential to ensure a correct evaluation: validity guarantees that the tool measures what it is intended to measure, while reliability confirms that the tool measures it consistently.
Taking this into account, it is necessary to know whether the tools analyzed in this review address the above aspects, and to what extent. Therefore, the primary aim of this review is to present evaluation tools that are able to measure the DTK of soccer players validly and reliably through game-play scenarios without using verbal language. The secondary aims are to identify the variables studied by the tools found, evaluate their quality, identify the most relevant aspects that the tools presented fail to meet, and justify the design of new tools.

Article search, inclusion and exclusion criteria
A systematic review of DTK evaluation tools in soccer, covering records from database inception to August 27, 2020, was conducted in four phases: identification, screening, suitability and inclusion. In phase 1 (identification), the search strategy was determined and conducted in five databases (Web of Science, PubMed, SPORTDiscus, PsycINFO and ERIC) and in three languages (English, Spanish and Portuguese), according to the search strategy shown in Table 1. In this phase, a total of 1,349 studies were found.
In phase 2 (screening), Mendeley® software was used to organize and manage the search. In this phase, 571 duplicate studies were found and another 701 studies were excluded after reading the title and abstract. A total of 77 potential studies passed this phase.
In phase 3 (suitability), the following inclusion/exclusion criteria were established to select the most relevant tools by reading the full text: (1) the assessment tools had to study variables that determine DTK in soccer players; (2) they had to come from primary sources, that is, be published for the first time, discarding adaptations of the original tools; (3) they had to show game-play scenarios, either through video sequences or static images in questionnaires, in which the participant must answer "what to do"; (4) they had to have undergone a published process of validity and reliability; and (5) they could not use "interviews" or "verbal protocols" that require participants to explain their answers, given that many players are not able to explain everything they know and do through verbal language. In this phase, the studies were examined in detail by two observers independently, and the kappa index (k = 0.927) and the intraclass correlation coefficient (ICC = 0.95) were calculated to establish the degree of agreement between the observers. After this process, the discrepancies found were resolved by consensus. Finally, 71 studies were excluded and six studies (Bennett et al. 2019; Blomqvist et al. 2005; Keller et al. 2018; Machado and Teoldo 2020; McMorris 1997; Serra-Olivares and García-López 2016) passed this phase.
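The agreement statistic reported above can be illustrated with a minimal sketch. It assumes the unweighted Cohen's kappa for two observers (the text does not state which variant was used), and the include/exclude decisions below are hypothetical, not the review's data.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given identical labels.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one and the same label throughout
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical decisions for ten full-text studies (illustration only).
observer_1 = ["include", "exclude", "exclude", "include", "exclude",
              "exclude", "exclude", "include", "exclude", "exclude"]
observer_2 = ["include", "exclude", "exclude", "exclude", "exclude",
              "exclude", "exclude", "include", "exclude", "exclude"]
kappa = cohen_kappa(observer_1, observer_2)
print(round(kappa, 3))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is preferred over a simple percentage when most studies fall into one category (here, exclusion).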
In phase 4 (inclusion), three additional studies were included. Two of them were developed in unpublished master's theses (Fontana 2004; Mangas 1999) and the other in an unpublished doctoral thesis (De la Vega 2002). These three studies were identified through a backward search of the references of the 77 potential articles and were included because all of them met the inclusion/exclusion criteria and present original tools used in studies published in scientific journals. Figure 1 shows the flowchart of the decisions taken.
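As a quick consistency check, the counts reported across the four phases can be reproduced with simple arithmetic. The variable names below are illustrative; the figures are those stated in the text.

```python
# Counts reported in the four phases of the search.
identified = 1349
duplicates = 571
excluded_title_abstract = 701
excluded_full_text = 71
added_backward_search = 3

# Phase 2: remove duplicates and title/abstract exclusions.
screened_in = identified - duplicates - excluded_title_abstract
# Phase 3: remove full-text exclusions.
eligible = screened_in - excluded_full_text
# Phase 4: add the tools found through backward search.
included = eligible + added_backward_search

print(screened_in, eligible, included)  # 77 6 9
```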

Quality of studies and data extraction
The nine studies that met the inclusion/exclusion criteria were assessed for quality. To obtain a quality score for each study, an ad-hoc checklist was elaborated taking as a reference the criteria used in previous studies (Araújo et al. 2014; Castellano et al. 2014; Chu and Zhang 2018; González-Víllora et al. 2018). A total quality score on a 15-point scale with 15 dichotomous questions was calculated. Each item was rated with "1" or "0" points (yes = 1, no = 0), using the criteria established in Table 2.
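The dichotomous scoring described above amounts to counting the items answered "yes". A minimal sketch follows; the function name, item labels and answers are hypothetical and do not correspond to any tool in Table 4.

```python
def quality_score(answers):
    """Total quality score: one point per checklist item answered 'yes'."""
    for item, answer in answers.items():
        if answer not in ("yes", "no"):
            raise ValueError(f"{item}: expected 'yes' or 'no', got {answer!r}")
    return sum(1 for answer in answers.values() if answer == "yes")

# Hypothetical answers for one tool; item labels are placeholders.
example_tool = {
    "item_1": "yes",
    "item_2": "no",
    "item_3": "yes",
    "item_4": "yes",
    "item_5": "no",
}
score = quality_score(example_tool)
print(f"{score} of {len(example_tool)} points")
```

Rejecting any answer other than "yes" or "no" keeps the scale strictly dichotomous, so a missing or ambiguous rating has to be resolved before a score is reported.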
The checklist was used by two observers, and any discrepancies were resolved by consensus. In case of doubt, the two main authors debated and agreed on their inclusion. In the event of an agreement not being reached, a third author took part in the decision. Finally, the quality of this systematic review was assessed using the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines (Moher et al. 2009).

Table 1 Search strategy
Declarative tactical knowledge: (declarative OR "tactical awareness" OR "tactical knowledge" OR "tactical evaluation" OR "tactical assessment" OR "decision making" OR declarativo OR "conocimiento táctico" OR "comprensión táctica" OR "toma de decision" OR "conhecimento tático" OR "avaliação tática") AND Sport

Summary of the tools
The articles included in this review are collected in Table 3, where the following categories are described: tool name (author/s, year), tactical variables evaluated, tested with… (sample), process of validity and reliability, and type and number of game-play scenarios.

Table 2
Q3 Did the study describe the process of content validity of the tool?
Q4 Did the study describe the process of criterion validity of the tool?
Q5 Did the study describe the process of construct validity of the tool?
Q6 Did the study describe the process of intra-observer stability of the tool?
Q7 Did the study describe the process of inter-observer agreement of the tool?
Q8 Did the study describe the process of internal consistency of the tool?
Q9 Did the study report the procedure to use the tool?
Q10 Does the tool provide an offensive tactical score?
Q11 Does the tool provide a defensive tactical score?
Q12 Does the tool provide scores based on offensive tactical principles?
Q13 Does the tool provide scores based on defensive tactical principles?
Q14 Does the tool provide scores based on offensive game subroles?
Q15 Does the tool provide scores based on defensive game subroles?

Table 4 shows the quality of the tools analyzed. The most relevant results according to the general score were the following: (i) the mean score of the nine selected tools was 6.66 points, (ii) no publication achieved the maximum score of 15 points, (iii) all the tools obtained a score between 5 and 10 points, and (iv) more than half of the sample (n = 6) achieved a general score of 5 or 6 points.
More specifically, the following findings can be described: (i) all the studies reported the procedure for using the tools, although the rigor, completeness and level of detail of that procedure were not assessed, (ii) the tools were subjected to different validity and reliability processes that are detailed in depth in the discussion of this paper, (iii) all the tools were tested with samples of players, without considering the representativeness of the samples used, (iv) five tools have been published to date in journals indexed in Journal Citation Reports (JCR) or Scimago Journal Rank (SJR), (v) all tools provide an offensive tactical score, (vi) four tools provide a defensive tactical score, (vii) two tools provide scores based on offensive tactical principles, (viii) one tool provides scores based on defensive tactical principles, and (ix) no tool studied provides scores based on offensive and defensive game subroles.

Discussion
The aim of this review was to present evaluation tools that are able to measure the DTK of soccer players through game-play scenarios, validly and reliably.
A total of nine tools were selected and analyzed, three of which were not found in databases that index scientific journals, probably because they were validated in doctoral and master's theses approximately 20 years ago, although they have been used in subsequently published studies. The geographic origins of the included tools were: Spain (n = 2), Australia (n = 2), Brazil (n = 1), Portugal (n = 1), U.S.A. (n = 1), Finland (n = 1), and England (n = 1). "TacticUP video test for soccer" (Machado and Teoldo 2020) and "TCTOF" (Serra-Olivares and García-López 2016) were the tools with the highest quality score. Therefore, based on this review, researchers and coaches are advised to use these two tools in preference to the others analyzed. Researchers are also encouraged to create and validate new tools to improve DTK analysis and evaluation based on the findings presented and organized into the following five sections: tactical variables and scores, game scenarios, validity and reliability, limitations and future perspectives.

Tactical variables and scores
An interesting question when evaluating tactical knowledge through the tools presented in this study concerns which tactical variables are studied and how they are measured. Traditionally, the "game cycle" (Antón 1990) of invasion sociomotor sports, such as soccer, has been divided into two directly opposed phases that depend on ball possession: the offensive and the defensive (Bayer 1986; Malho 1981). All tools (n = 9) provide an offensive tactical score, and four tools provide a defensive tactical score (Blomqvist et al. 2005; De la Vega 2002; Machado and Teoldo 2020; Serra-Olivares and García-López 2016). The possibilities of action of the players, in the offensive and defensive phases, can be framed within the game principles. Tactical principles are defined as a set of norms about the game that provide players with the possibility of rapidly achieving tactical solutions for the problems that arise from the situations they face (Teoldo et al. 2009). The current literature includes various types of game principles: operational principles (Bayer 1979), general/fundamental principles (Garganta and Pinto 1994; Queiroz 1983) and core/specific principles (Castelo 1999; Garganta and Pinto 1994; Hainaut and Benoit 1979; Queiroz 1983; Teoldo et al. 2009, 2011; Worthington 1974). Specifically, one tool provides scores based on offensive operational principles (Serra-Olivares and García-López 2016), and another tool provides scores based on offensive and defensive core/specific principles (Machado and Teoldo 2020). Compared with the other tools, this brings benefits regarding the specificity of the evaluation of tactical performance in the game context, the attunement with the contents developed in the training process, and the objectivity of the measure with respect to the opposition and the evaluation of players of different categories (Teoldo et al. 2011).
The analysis of the sociomotor roles that players acquire can be useful for evaluating DTK from another angle, since roles allow the participants in the game to be classified according to their specific sociomotor status (Parlebas 2001), that is, the set of motor behaviors resulting from the temporary interaction of the ball, player, teammates, opponents and space. However, roles do not provide a rigorous and detailed analysis of the game action; for that, it is necessary to turn to the subroles associated with each of the game roles (Sánchez-López et al. 2021), since these reveal, for each player, the particular orientation that he gives to his role (Lasierra 1993). Subroles are defined as the possible decision behaviors (e.g. pass the ball, shoot at goal, make a tackle…) that the player can assume and perform during the development of the game (Hernández Moreno 1995). In this regard, none of the tools studied classifies player performance based on roles or provides scores based on offensive and defensive game subroles. This does not mean that the game-play scenarios presented in each tool lack images in which the subroles that players acquire appear; rather, those subroles are not analyzed in the form of scores that show the participant's level of knowledge about them. Obtaining these kinds of scores could therefore be very useful for transferring the findings to the teaching and training process.

Game-play scenarios: video-sequences vs static game situations
The main element in configuring the game-play scenarios should be the tactical variables presented. Having determined this, one of the questions facing any researcher who tries to design tools to evaluate DTK is whether to build game-play scenarios from video sequences or static situations. In this review, six tools used video sequences (Bennett et al. 2019; Blomqvist et al. 2005; Fontana 2004; Keller et al. 2018; Machado and Teoldo 2020; Mangas 1999), while three tools resorted to static game images (De la Vega 2002; McMorris 1997; Serra-Olivares and García-López 2016). Two tools based on static game images (De la Vega 2002; Serra-Olivares and García-López 2016) were able to access a sample of a younger age (U-8, U-9 and U-10), while all the tools based on video sequences were used with samples above U-11. This is very important when approaching the evaluation of knowledge in the earliest categories, in order to detect possible talents. This finding also points to the need to investigate whether the different ways of presenting the scenarios used to evaluate DTK could be affected by the age of the players. Another advantage of static images over video sequences is that questionnaires make it easier to reach a larger sample, since video sequences require projection equipment.
Soccer teams are dynamic systems (Garganta and Cunha e Silva 2000), and a static image does not ideally represent what may really be happening in the game situation. For this reason, the use of dynamic film or video may offer a more natural perception of the scene when compared with static slides (Mann et al. 2007). However, the movements can be planned, executed and stored in memory in the form of mental representations (Schack and Tenenbaum 2004), and our brain fails to think about moving images (Damasio 2018). In other words, the brain has the ability to represent aspects (Damasio 2012).
Regarding the personification of the players, two tools based on static images used icons (De la Vega 2002; Serra-Olivares and García-López 2016), and one tool used pictures of miniature soccer players (McMorris 1997), while three of the tools based on game sequences turned to adult soccer (Fontana 2004; Machado and Teoldo 2020; Mangas 1999), and the other three to formative soccer (Bennett et al. 2019; Blomqvist et al. 2005; Keller et al. 2018). In this sense, the symbolic coding of the images represented in the mind optimizes, at the cognitive level, the perceptual organization and structuring of the gesture to be performed (Díaz Ocejo and Mora Mérida 2009).
Finally, focusing on the tools based on video sequences, five of them used third-person images (Bennett et al. 2019; Blomqvist et al. 2005; Fontana 2004; Keller et al. 2018; Mangas 1999), and one of them bird's-eye images (Machado and Teoldo 2020). It would be very interesting to be able to use tools based on first-person video sequences in the future, because they would provide scenarios based on real-world situations, without conditioning participants to perceive information and make decisions differently than they normally would (Roca et al. 2011).

Validity and reliability
To validate this type of instrument, a distinction must be made between content validity, criterion validity and construct validity (Cronbach and Meehl 1955). Content validity reflects a specific content domain of what is measured (Hernández et al. 2010), and is usually established through expert judgment. In this review, two instruments (McMorris 1997; Serra-Olivares and García-López 2016) were validated through the independent opinions of a panel of experts, calculating the degree of agreement between them; one instrument (De la Vega 2002) was validated through the expert discussion group technique; another instrument (Bennett et al. 2019) used a set of videos validated in a previous study (Vaeyens et al. 2007); two instruments (Blomqvist et al. 2005; Fontana 2004) were designed without explaining the content validation process; and three instruments (Keller et al. 2018; Machado and Teoldo 2020; Mangas 1999) were developed without describing the content validity process for the selection and adaptation of the scenes that make up the test, although groups of experts were later used to identify the solutions for the proposed scenes.
Criterion validity, understood as concurrent or concomitant validity, measures the degree of correlation between two measures of the same concept, at the same time and on the same subjects (Polit and Hungler 1999). In other words, the results obtained must be compared with an external criterion that tries to measure the same thing (Hernández et al. 2010). In one of the tools (Serra-Olivares and García-López 2016), the results obtained were correlated with the opinion of the players' coaches, who assessed the level of knowledge of their players on a rating scale (0-10), a highly questionable approach given the subjectivity of this process. In the rest of the tools no mention is made of this type of validity. Future studies could therefore examine whether applying several of the tools presented to the same sample of players provides similar results. Construct validity, from its perspective of discriminant validity (Carvajal et al. 2011), determines the degree to which the instrument distinguishes between groups of individuals that are expected to be different (McDowell and Newell 1996), due to their characteristics or performance (Thomas et al. 2011). In two tools (Mangas 1999; Serra-Olivares and García-López 2016), groups were differentiated according to their sports context; in three tools (Bennett et al. 2019; Keller et al. 2018; Machado and Teoldo 2020), groups of different expertise levels were compared considering the quantity of accumulated training hours or the level of the competition. In the studies of the five tools mentioned, Student's t-test (Machado and Teoldo 2020; Mangas 1999; Serra-Olivares and García-López 2016), ANOVA with Tukey Honest Significant Difference post hoc tests (Keller et al. 2018) or Bonferroni post hoc corrections (Bennett et al. 2019) were used to determine any difference between the groups. This process is not specified in the remaining tools.
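The group comparison behind discriminant validity can be sketched with an independent-samples Student's t statistic. The sketch assumes the classic pooled-variance form with equal variances; the individual studies may have used other variants, and the two score lists below are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def student_t(group_a, group_b):
    """Independent-samples Student t statistic (pooled variance) and its df."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two scores")
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df

# Hypothetical DTK test scores for two groups expected to differ.
higher_expertise = [8, 9, 7, 10, 6]
lower_expertise = [5, 6, 4, 7, 3]
t_value, degrees_of_freedom = student_t(higher_expertise, lower_expertise)
print(t_value, degrees_of_freedom)
```

The statistic is then compared with the critical value of the t distribution for the given degrees of freedom; a tool with discriminant validity should yield a significant difference between groups that are known to differ in expertise.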
Regarding the reliability of the instruments, three types of reliability were found: intra-observer stability, inter-observer agreement, and internal consistency. For intra-observer stability, four tools (Blomqvist et al. 2005; Keller et al. 2018; Machado and Teoldo 2020; Serra-Olivares and García-López 2016) were subjected to test-retest correlation, while the measurement of this type of reliability was not specified in the remaining five tools. For inter-observer agreement, in six tools (Bennett et al. 2019; Blomqvist et al. 2005; Keller et al. 2018; Machado and Teoldo 2020; Mangas 1999; McMorris 1997), groups of experts were used to weight, rank or select the possible solutions for each scene; in one tool (Fontana 2004), groups of experienced players were used to carry out this task; while in the other two tools this process is not described. Using groups of experienced players in this reliability process carries a risk, since there is no solid evidence to confirm the relationship between DTK and PTK in professional players. In other words, in professional players, the tactical knowledge acquired can often be implicit rather than explicit. Finally, regarding internal consistency, three instruments (Blomqvist et al. 2005; Fontana 2004; Serra-Olivares and García-López 2016) were analyzed using Cronbach's alpha coefficient. The rest of the instruments do not describe this process.
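Internal consistency, as measured by Cronbach's alpha, can be sketched as follows. The data layout (one list of scores per item, one entry per respondent) and the scores themselves are assumptions made for illustration.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores holds one list per item, one score per respondent."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("at least two items are required")
    if len({len(scores) for scores in item_scores}) != 1:
        raise ValueError("every item needs a score from every respondent")
    # Total test score per respondent (sum across items).
    totals = [sum(scores) for scores in zip(*item_scores)]
    item_variance_sum = sum(variance(scores) for scores in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical scores: three items answered by four respondents.
items = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [1, 3, 2, 4],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

Alpha approaches 1 when the items vary together across respondents, that is, when the variance of the total score is much larger than the sum of the individual item variances.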

Limitations
It should be noted that the limitations of this review stem from the inclusion and exclusion criteria used. Only tools based on game-play scenarios were included, although other types of tools and procedures, presented in the introduction of this review, can also be used to evaluate DTK.
Regarding the decision to exclude verbal language as a means of evaluating DTK, Araújo et al. (2010) point out that, to date, there is no theoretical support for the relationship between verbal behavior and tactical behavior. For this reason, and in line with one of the criteria presented, only tools that do not use interviews or verbal protocols were included in this review. Even so, verbalizing and reflecting on their own performance may help players to become more attuned to important informational constraints that they may encounter in future competitive performance (Silva et al. 2013).
Scientific research thus continues to examine how declarative and procedural knowledge are related in explaining the tactical performance of players in sociomotor sports (Parlebas 1988) characterized by predominantly perceptual skills (Knapp 1963), open skills (Poulton 1957) or externally paced skills (Singer 1980), such as soccer. The question remains open because some soccer players can solve game-specific problems conceptually but are unable to apply the same solution when these problems arise in practice, while many professional players cannot explain what they are capable of doing. Selecting a response and executing it in sport differs from doing so in other domains such as chess: tactical skills involve not only the ability to determine which decision is most appropriate in a given situation, but also whether that decision can be successfully executed within the constraints of the required movement (Elferink-Gemser et al. 2010).

Future prospects
Assuming that the analysis of players' PTK, understood as football competence (Parlebas 2018), is especially relevant in football, the evaluation of DTK will be too, provided there is firm scientific evidence supporting the relationship between these two types of knowledge. Therefore, the next step is to determine to what extent this type of tool is related to tools that evaluate PTK, for example, those presented in the review by González-Víllora et al. (2015). A real relationship would justify using some of the tools included in this review to evaluate DTK quickly and to reach large samples of players. This would make it easy, among many other things, to evaluate a player longitudinally over time, or cross-sectionally against other players, whether teammates (intra-evaluation) or players from other teams (inter-evaluation). Thus, gains in tactical knowledge could be detected at each age and category, as well as the particular knowledge that young players show, helping to optimize teaching and training processes in each specific context and to identify the keys to success in soccer (Lago-Ballesteros and Lago-Peñas 2010) and the fundamentals that can support a methodological proposal (Echeazarra 2016). In short, if we understand what children know and how they learn, we will be in a better position to know how to teach (Riera 2005).

Conclusion
In this systematic review, nine tools were analyzed. "TacticUP video test for soccer" (Machado and Teoldo 2020) and "TCTOF" (Serra-Olivares and García-López 2016) were the tools with the highest quality score. Most of the tools did not pass many of the criteria proposed to assess their quality. Fundamentally, it can be concluded that few tools provide specific tactical scores based on game principles or subroles that would allow identifying possible points of improvement in players' knowledge of specific aspects of the game.
For this main reason, and based on the other findings of this review, future studies should take into account the following aspects: (i) the importance of designing tools that provide scores based on tactical principles and game subroles, since this would allow more specific teaching and training processes oriented towards the tactical component of the game, a key aspect in sociomotor sports such as soccer; (ii) the advantages and disadvantages of designing tools based on static images or video sequences: in general, video sequences can better represent the dynamic reality of the game, but static images allow access to samples of younger ages; (iii) the need to design tools that can access the DTK of young children (u-6, u-7, u-8); (iv) the need to design tools that present game-play scenarios in the first person, which is what the player actually sees during the action; (v) the necessity of subjecting the designed tools to rigorous processes of validity and reliability.