Detecting bots with temporal logic

Social bots are computer programs that act like human users on social media platforms. Social bot detection is a rapidly growing field dominated by machine learning approaches. In this paper, we propose a method complementary to machine learning by exploring bot detection as a model checking problem. We introduce Temporal Network Logic (TNL), which we use to specify social networks where agents can post and follow each other. Using this logic, we formalize different types of social bot behavior with formulas that are satisfied in a model of a network with bots. We also consider an extension of the logic where we explore the expressive power of including elements from hybrid logic in our framework. We give model checking algorithms for TNL and its hybrid extension, and show that the complexity of the former is in P and the latter in PSPACE.


Introduction
Software-controlled bots, often called social bots, act like human users on social media: they interact with other users, both humans and other bots; share content; and target users that are likely to believe in misinformation (Shao et al., 2018). Social bots may have beneficial purposes (Gilani et al., 2017), but they can also be used to amplify or direct misinformation. In the worst case, they can be seen as a threat to democracy (Gorodnichenko et al., 2018).
Bot detection is a rapidly growing field, mainly dominated by literature on algorithms using machine learning techniques (Cresci, 2020). These algorithms try to catch bots by identifying behavior such as aggressive following and unfollowing (Lee et al., 2011), or a "bursty" nature: posting a lot in short periods of time followed by long periods of inactivity (Chu et al., 2012). Other algorithms try to detect bots by looking for specific network topologies, such as clusters of bots (Cao et al., 2012), or by considering the content of the bots' posts and metadata about them (Kudugunta & Ferrara, 2018).
Machine learning approaches require a large amount of labeled data for training. We are interested in exploring a complementary approach to bot detection that circumvents this problem: Can social bot detection be done as a model checking problem?
Model checking is a method for verifying whether a formally specified model of a system meets a given specification. Among several purposes, it has been proposed as an approach for the analysis of social networks in the context of information and opinion diffusion properties (Belardinelli & Grossi, 2015; Dennis & Slavkovik, 2020; Dennis et al., 2022; Machado & Benevides, 2022) and privacy (Dennis et al., 2017; Pardo & Schneider, 2017). It has also been proposed to identify groups of compromised smartphones, called mobile botnets, in mobile apps (Bernardeschi et al., 2019). In the line of work on formal verification of social networks, we propose to use this method for detecting social bots.
We introduce Temporal Network Logic (TNL) to represent sufficient information about a social network to detect whether the network contains social bots. In this framework, bot behavior is expressed as logical formulas, based on existing bot detectors as well as empirical findings on how social bots have acted in online social networks. By representing a social network as a model of TNL and bot behavior as a logical formula, checking whether a specific bot formula is satisfied in the model amounts to checking for that particular bot behavior in the social network. We also present a simple algorithm for building a TNL model from social network data, enabling future work to implement this detection method on a real-life social network.
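To make the model-building step concrete, here is a minimal sketch in Python of turning a time-stamped event log into follower and post snapshots with the persistence the paper intends: follows last until an explicit unfollow, and posts stay on a profile until removed. The event format and function name are our own illustration, not the algorithm of Sect. 8.

```python
from collections import defaultdict

def build_model(events, horizon):
    """Build follower and post snapshots from a time-stamped event log.

    `events` is a list of (t, kind, agent, target) tuples, where `kind` is
    one of "follow", "unfollow", "post", "remove"; `agent` is the acting
    agent; `target` is the followed agent or the post in question.
    """
    followers = defaultdict(set)      # (agent, t) -> set of followers
    posts = defaultdict(set)          # (agent, t) -> posts on the profile
    cur_followers = defaultdict(set)  # running state, carried forward
    cur_posts = defaultdict(set)
    by_time = defaultdict(list)
    for t, kind, agent, target in events:
        by_time[t].append((kind, agent, target))
    for t in range(horizon):
        for kind, agent, target in by_time.get(t, []):
            if kind == "follow":
                cur_followers[target].add(agent)
            elif kind == "unfollow":
                cur_followers[target].discard(agent)
            elif kind == "post":
                cur_posts[agent].add(target)
            elif kind == "remove":
                cur_posts[agent].discard(target)
        # snapshot the persistent state at time t
        for agent in cur_followers:
            followers[(agent, t)] = set(cur_followers[agent])
        for agent in cur_posts:
            posts[(agent, t)] = set(cur_posts[agent])
    return followers, posts

events = [(0, "follow", "c", "e"), (0, "post", "d", "p -> q"),
          (1, "post", "a", "p"), (1, "unfollow", "c", "e")]
N, Posts = build_model(events, horizon=3)
```

Note how d's post from time 0 still appears in the snapshot at time 2, reflecting the intended persistence of the Posts function.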
There are three specific advantages of our bot detection method: 1. We do not need to train classifiers on big data. Social bots are constantly evolving (Cresci, 2020). Requiring a classifier to be trained for each type of bot can be problematic because of the amount of training data required. In our approach, we can formalize a new type of bot behavior as a new formula.
Once the specification in TNL of the social network is available, we can check both for properties of individual agents and for properties of groups of agents in the network. 2. Model checking, unlike some machine learning methods, is directly inspectable by design. This detection method is relevant to the current need for responsible algorithms in artificial intelligence research. 3. Detection of logical inconsistencies in agents' posts is possible. The logical syntax lets us formalize the content of a post in terms of propositions and logical connectives. In theory, this allows us to, for instance, check for posts that are themselves logical contradictions. The practical applicability of this option, of course, depends heavily on the availability of natural language processing tools.
Beyond inconsistencies, a symbolic approach allows us to perform a qualitative analysis on chosen segments and time intervals of the social network. Intuitively, we can translate what a human social network user would consider bot behavior into a property that we can formally verify.
Our approach makes it possible to give an exact logical specification of pre-defined mechanisms for bot behavior. This builds a direct connection to studies in social science on information flow in social networks. A limitation of our approach is that it does not yet offer the possibility of real-time bot detection. This work is perhaps best described as forensics: we uncover evidence of bot activity in the network in the past. This, however, is still sufficient to understand the influence that bots have on propagation of information in the network and to use this information to improve social network services to limit the power of bots.
The framework we use is a temporal logic that is also interesting in its own right. We discuss similarities between TNL and existing logics, such as linear-time temporal logic and tense logic, and show that TNL is sound and weakly complete by combining axioms from existing axiomatic systems. In the latter sections of the paper, we take advantage of the structure of the models of TNL to extend our logical semantics and evaluate formulas not only from the viewpoint of a global timeline, but also for each local agent in the social network. This enables us to further analyze powerful network positions a suspected malicious agent might hold. We give model checking algorithms for both TNL and the extended TNL.
The paper is structured as follows. We begin in Sect. 2 by investigating existing bot detectors and empirical research on the behavior of social bots. In Sect. 3, we assess related work. Then, in Sect. 4, we introduce TNL with an example and show some network and agent properties that can be expressed in this language. We also present a sound and weakly complete axiomatization of TNL. In Sect. 5, we give formulas for bot behavior. Section 6 is devoted to a discussion of the complexity and efficiency of model checking for bot detection in TNL. In Sect. 7, we extend TNL to TNL_a to be able to evaluate formulas at agents as well as time points. Here, we also introduce a model checking algorithm for TNL_a. Section 8 proposes a way to build temporal network models from social network data. We end with a conclusion and prospects for future work in Sect. 9.
This paper is an extended version of a paper by Pedersen et al. (2021), first published in the proceedings of the Eighth International Workshop on Logic, Rationality and Interaction (LORI 2021). The extension includes the following additions: a review of related work; a sound and weakly complete axiomatization of TNL; a model checking algorithm for TNL; and the introduction of the logic TNL_a as well as a model checking algorithm for TNL_a.

Social bots
To develop a method for bot detection, it is crucial to first analyze how a bot behaves. We do this by considering existing bot detectors and empirical research on social bot behavior.
Misinformation. Misinformation is, according to common dictionaries, known as incorrect or misleading information. Disinformation is often regarded as a subset of misinformation that is deliberately spread to influence public opinion or distort the truth. There is a general agreement in the literature on bots on social media platforms that their malicious behaviors are inherently related to the spread of misinformation such as hoaxes, rumors, conspiracy theories, or fabricated reports (Shao et al., 2018).
Empirical studies conducted by Vosoughi et al. (2018) on Twitter found that false information, and in particular political news, diffused significantly faster and farther than true information. Two presumed reasons for this tendency are the novelty and shock effect of false information (O'Connor & Weatherall, 2019). People are more likely to share novel or surprising information, and since false information is more likely to be novel and surprising than true information, fake news tends to propagate faster through a social network. Although misinformation is linked to bot behavior, Vosoughi et al. (2018) also found that human behavior contributes more to the spread of false news than bots do. As was mentioned by one of the reviewers, identifying bots as well as false information in the network can help us understand which agents spread this type of information.
Bursty nature. Analyses of bot behavior on Twitter have revealed specific activity patterns, in both posting and following activity. In a well-known study, Chu et al. (2012) analyzed more than 40 million tweets from over 500,000 users and found that bots tend to act in a "bursty" manner, posting a lot in a short period of time followed by longer periods of inactivity.
One reason for this bursty posting behavior, speculated by Shao et al. (2018), is that early and aggressive engagement with information, shortly after it has first been published by a low-credibility source, exposes many users to the content and increases the chances that the article goes "viral": spreads to large groups of users in a short time. In a similar study, Lee and Kim (2014) confirmed that bot account creation is bursty too: accounts are created in bulk within a short period of time, just before they start spamming. A likely reason for bursty account creation before spamming is that it allows the bots to create vast amounts of content before being removed.
Hashtag, subgroup and user targeting. Bots often strive to gain visibility in their social networks. One specific method is to target hashtags by posting irrelevant information to divert a discussion (Chang & Iyer, 2012). An example of bots' misuse of Twitter hashtags was seen during the 2017 natural disasters Hurricanes Harvey, Irma and Maria, as well as an earthquake that occurred in Mexico. By gathering data from over 1.2 million tweets by more than 770,000 accounts, Khaund et al. (2018) found bots using disaster-related hashtags to promote unrelated political content on topics such as North Korean leader Kim Jong-un and the Black Lives Matter movement. Social bots also posted hoaxes such as "shark swimming on freeway" under the same hashtags.
Another tactic to gain visibility in social networks is to target subgroups. Leading up to the United States presidential election in 2016, certain Twitter accounts posed as queer-friendly, dog-loving pages, and only after reaching a number of followers did they start to publish political content (O'Connor & Weatherall, 2019). Although these accounts were not confirmed to be social bots, this is a potential tactic that could be used for manipulation online. Misusing followers' trust by agreeing on particular topics potentially puts such pages in a position to convince their audience on political matters.
Aggressive following and unfollowing. One feature social bots have been shown to exhibit on social media platforms is aggressive following and unfollowing (Lee et al., 2011). A reason for this behavior is, as in the case of targeting, to gain visibility; most online social media platforms alert their users when they have a new follower or contact request. In this way, a bot can increase the probability that users will engage with its posts.

Related work
The work in this paper lies in the intersection of several scientific fields. We can find related literature in research on social bot detectors, in work on logical frameworks to model social network dynamics, and in the formal verification community, in particular work on model checking properties in social networks. In this section, we give a review of related work across these three fields.

Bot detectors
Some of the social bot detectors and their mechanisms to detect bots were reviewed in Sect. 2. Most bot detection frameworks differ considerably from ours on a technical level, and few papers combine work on social bot detectors and formal logic. Bernardeschi et al. (2019) propose model checking to identify mobile botnets in the operating system Android. A mobile botnet is a group of compromised smartphones controlled by an owner using software called Command and Control (C&C). The method in the paper uses a branching temporal logic to characterize botnet behavior, which is checked against a set of finite state automata representing the run-time behavior of an app on a smartphone. Although a mobile botnet is by nature quite different from a social bot, we include the paper as related work because the method proposed is similar to ours, and because both social bots and botnets can be used for malicious purposes.

Logics for social network dynamics
Our work uses a modal logic framework to reason about the dynamics of a social network. In recent years, the field of social network logics, and particularly dynamic social network logics, has seen many influential papers. By a social network, we mean a graph structure in which the nodes represent agents, and the relations between them represent some kind of friendship relation or communication channel. We take social network logics to be logical frameworks that model this network graph explicitly. Some work in the field is reviewed in a paper by Liu and Li (2022), which focuses on work on social network logics in China.
As one of the first on the topic, Pacuit and Parikh (2005) introduce a logic of communication graphs where edges between agents depict possibilities of direct communication. This logic uses a neighborhood semantics, and its language includes a knowledge modality and a dynamic diamond. A formula ♦K_i φ is to be read as "after some communications, agent i knows φ according to i's current information".
The frameworks (Seligman et al., 2013; Liu et al., 2014), built upon the work first presented by Seligman et al. (2011), have been deeply influential in the field of social network logics. The underlying logical models have a set of agents, a set of epistemic states and two families of binary relations: friendship relations on the set of agents and epistemic relations on the set of epistemic states. The logical language includes nominals and hybrid operators, which make it possible to express statements such as "Bella knows that she is not a spy, but doesn't know if a friend of hers is a spy": @b(K¬s ∧ ¬K F s) (Seligman et al., 2013). The frameworks are used to model phenomena such as peer pressure, belief change and epistemic updates. Work by other authors continuing the technical results of these frameworks includes (Christoff et al., 2016; Sano, 2017; Fernández González, 2021; Balbiani & Fernández González, 2020; Zhen, 2020).
Our logic tracks the dynamics of the social network through a temporal logic. Other work in social network logic that uses temporal operators includes (Pedersen & Slavkovik, 2017; Machado & Benevides, 2022; Van der Hoek et al., 2020). Of the papers in this section, we have found the frameworks in these papers to be the most similar to ours, as they combine social networks and temporal logics. We therefore include a longer description of their logical models. Pedersen and Slavkovik (2017) introduce a logic with next-time as its only temporal operator to model social influence in a network of agents. The logical model is constructed such that each state is a social network G = (N, I, pro, E), where N is a set of agents, I is a set of relevant issues, pro is a function specifying for each issue which agents support it, and E is a set of edges between agents. A successor of G is defined to be a new social network (N, I, pro, E′) in which only the set of edges has possibly been reduced. The motivation behind reducing edges in E is that agents cut links to their neighbors to avoid conflict, which might arise when an agent's neighbors disagree on the same issue.
Machado and Benevides (2022) present a logic called LTL-SN, based on linear-time temporal logic with the operators next-time and until. With until, a future operator can also be defined. LTL-SN tracks the changes of a social network similarly to the models of diffusion first defined by Baltag et al. (2019), in which agents adopt a behavior in the next time point if a number of their neighbors have the behavior, relative to a given threshold. The changes in the network are therefore determined by the initial model M = (A, N, θ, I) where A is a set of agents, N a set of relations between agents, θ a given threshold and I ⊆ A a set of agents that initially exhibit the behavior. LTL-SN can be reduced to propositional logic, because the changes through the temporal operators are determined by M. The paper also includes a model checking algorithm, and it is shown how their models can be used to represent a well-known model of epidemics, the Susceptible-Infectious-Recovered (SIR) model.
The work by Van der Hoek et al. (2020) is based on a framework originally presented by van der Hoek et al. (2019); Van der Hoek et al. (2022), where balance games are introduced. Balance games are game theoretical renditions of social situations where relations between groups of three agents are judged to be balanced or not. Given a set of agents A, a relation on A is defined such that each relation is either a friend relation "+" or an enemy relation "−", but not both. Each triad of agents is balanced if it is of the form +++ or −−+ and unbalanced otherwise. When all triads between all agents are balanced, the whole network is said to be balanced. In balance games, a random agent is chosen in each turn to decide whether to change one of its relations for a cost. The agents receive a higher utility when they are in balanced triads. A network is defined to be stable when all agents have more reason for their relationships to stay the same than to change. Van der Hoek et al. (2020) analyze balance games through a temporal logic called Logic of Allies and Enemies (LAE). The language of LAE includes operators AXφ, A[φU φ], and E[φU φ] from Computation Tree Logic (CTL). It is shown that some formulas are valid in the network if and only if, for instance, the network eventually becomes stable. Other works by Xiong (2017), Xiong and Ågotnes (2020), Pedersen et al. (2019) and Pedersen (2019) also use logic to model balance, though with neither a temporal logic nor a game-theoretic framework.

Model checking social networks
We also include here some related work which combines model checking and social networks. As mentioned earlier, Machado and Benevides (2022) give a model checking algorithm for the logic LTL-SN, which models diffusion in a social network. The work by Belardinelli and Grossi (2015), Dennis and Slavkovik (2020) and Dennis et al. (2022) also explores model checking diffusion properties in social networks. Belardinelli and Grossi (2015) introduce logical models for multi-agent systems called open dynamic agent networks, in which agents can join and leave the network. The logic used is a type of first-order CTL, and it is shown that its model checking problem is decidable. Dennis and Slavkovik (2020) use Markov chains to model information diffusion in social networks, where the internal information state of an agent is explicitly modeled. It is shown that the PRISM probabilistic model checker cannot be used to check even simple models. The work by Dennis and Slavkovik (2020) is extended by Dennis et al. (2022): by supplementing the use of PRISM with Monte Carlo simulation, larger networks can be checked.
Pardo and Schneider (2017) present a formal system to model knowledge of agents in social networks. The social network is not represented as a Kripke model, as has been standard in many works in the social network logic literature. Rather, the models by Pardo and Schneider (2017), called SNMs, are social graphs with a knowledge base for each user in the system. The motivation is to formalize privacy policies for agents in the network, such as "only my friends can know my location". Model checking is proposed to check whether a user knows a given statement. It is shown that model checking formulas over SNMs is decidable. The work by Dennis et al. (2017) is also motivated by formal verification of privacy policies in social networks. More concretely, the proposal is to use model checking to analyze information leakage, which occurs when information is accessed by other agents who are not authorized to share it. The framework by Dennis et al. (2017) is probabilistic and uses details from Belief-Desire-Intention (BDI) models. As for Dennis and Slavkovik (2020) and Dennis et al. (2022), the probabilistic model checker PRISM is used by Dennis et al. (2017) to check properties in the network.

Temporal Network Logic (TNL)
We are now ready to introduce Temporal Network Logic (TNL). TNL is a temporal logic in which each time point hosts a social network. In addition to the standard temporal logic operators past Pφ, future Fφ and next-time Xφ, the language includes predicates follow(a, b) and posted(a, ω) that let us reason about agents' activities in the social network, as well as the network structure, at a specific time on the timeline.

Language
We begin by presenting the syntax of TNL.
Definition 1 (Syntax) Let At be a finite set of atomic posts and Ag be a finite set of agents. We define the well-formed formulas of the language L_TNL to be generated by the following grammar:

φ ::= p | follow(a, b) | posted(a, ω) | ¬φ | φ ∧ φ | Pφ | Fφ | Xφ

where p ∈ At, a, b ∈ Ag, and ω is a formula of the propositional fragment of L_TNL (atomic posts combined with the propositional connectives). We define propositional connectives like ∨, → and the formulas ⊤, ⊥ as usual. Further, we define the duals H := ¬P¬ and G := ¬F¬ as standard.
All formulas of L_TNL have the separation property, which means that each formula can be rewritten as a Boolean combination of formulas, each of which depends only on the past, present, or future. We use Ω to denote the set of all possible posts ω, which is the propositional fragment of L_TNL. Since the set At is finite, the set Ω is finite up to equivalent formulas.
Intuitively, we read follow(a, b) as "agent a follows agent b" and posted(a, ω) as "agent a posted ω" or "ω appears on agent a's profile". Here ω is a logical formula made up of atomic posts and standard logical connectives. In L_TNL, we therefore have formulas such as posted(a, p) and posted(a, p → q). We call p, q, r, … ∈ At atomic posts. Pφ, Fφ and Xφ are temporal operators. We read Pφ as "φ was the case at some past time", Fφ as "φ will be the case at some future time" and Xφ as "φ is the case at the next step, or next time".
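As an illustration of the syntax, formulas can be represented as nested tuples, with a well-formedness check enforcing that the ω in posted(a, ω) lies in the propositional fragment. The tuple encoding is our own implementation choice; TNL itself is defined only by the grammar above.

```python
# Formulas of L_TNL as nested tuples, e.g. ("X", ("posted", "a", ("atom", "p"))).
PROP_OPS = {"not": 1, "and": 2, "or": 2, "imp": 2}   # connective -> arity
TEMP_OPS = {"P": 1, "F": 1, "X": 1}

def is_post(phi):
    """Check that phi lies in the propositional fragment (a valid post omega)."""
    head = phi[0]
    if head == "atom":
        return len(phi) == 2
    if head in PROP_OPS:
        return len(phi) == PROP_OPS[head] + 1 and all(is_post(x) for x in phi[1:])
    return False

def is_formula(phi):
    """Check that phi is a well-formed L_TNL formula; posted(a, omega)
    requires omega to be purely propositional."""
    head = phi[0]
    if head == "atom":
        return len(phi) == 2
    if head == "follow":
        return len(phi) == 3
    if head == "posted":
        return len(phi) == 3 and is_post(phi[2])
    if head in PROP_OPS:
        return len(phi) == PROP_OPS[head] + 1 and all(is_formula(x) for x in phi[1:])
    if head in TEMP_OPS:
        return len(phi) == 2 and is_formula(phi[1])
    return False
```

For example, posted(a, Fp) is rejected, since a temporal formula is not a legal post.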

Models and semantics
Before presenting the semantics, we give definitions of temporal network frames and models.
Definition 2 (Temporal Network Frame and Model) We define the following:
• A is a finite set of agents;
• T is a countably infinite set of time points;
• < is a strict linear order with no end on T, yielding an infinite chain;
• N : (A × T) → P(A) is a follower function specifying for each agent their followers at a specific time;
• V : At → P(T) is a valuation function deciding the truth value of atomic posts at each time point;
• Posts : (A × T) → P(Ω) is a function outputting for each agent the posts they have on their profile at a specific time.
The tuple F = (A, T, <) is a temporal network frame, whereas the tuple M = (A, T, <, N, V, Posts) is a temporal network model. We define a pointed temporal network model to be (M, t) where M is a temporal network model and t ∈ T its distinguished point, at which the evaluation takes place.

A temporal network model represents the evolution of a social network over time: a timeline in which each time point hosts a social network, defined by the follower function N. Agents' posts at each time are modeled by the post function Posts. The intention of N and Posts is that followers continue to follow until they actively unfollow, and that posts stay on an agent's profile until they are actively removed by the posting agent. If a post ω ∈ Posts(a, t) for agent a at t, we say that "a posted ω at t" or that "ω appears on agent a's profile at t". This is not intended to mean that ω is first posted at t; ω could have been posted earlier and still appear on the agent's profile. To avoid confusion, we say "a actively posts ω at t" when ω ∈ Posts(a, t) and ω ∉ Posts(a, t − 1), for t − 1 the immediate predecessor of t.

See Fig. 1 for a simple example of a temporal network model. This network has only five agents A = {a, b, c, d, e}, and although the timeline has no end, we concentrate on the step from its first time point to the next: T = {t1, t2, …}. Also note that t1, t2 ∈ V(p) and t1, t2 ∈ V(r), whereas t1, t2 ∉ V(q). Further, Posts(d, t1) = {p → q}, Posts(a, t2) = {p} and Posts(b, t2) = {r}. The edges within the time points correspond to the information captured in the follower function: for instance, "c follows e" is represented by an edge from c to e.
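The fragment of the Fig. 1 model described above can be written down directly as Python dictionaries. The string encoding of posts and the restriction to the facts stated in the text are our own simplifications; the actual figure contains more follower edges than we reproduce here.

```python
# Fig. 1, restricted to its first two time points (t1 = 0, t2 = 1) and to
# the facts stated in the text.
A = {"a", "b", "c", "d", "e"}

# V: atomic post -> set of time points where it is true
V = {"p": {0, 1}, "q": set(), "r": {0, 1}}

# N: (agent, time) -> that agent's followers; "c follows e" at both times
N = {("e", 0): {"c"}, ("e", 1): {"c"}}

# Posts: (agent, time) -> posts on the profile; we assume d's post persists
# into t2, following the intended reading of the Posts function
Posts = {("d", 0): {"p -> q"}, ("d", 1): {"p -> q"},
         ("a", 1): {"p"}, ("b", 1): {"r"}}
```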
We now define the semantics of TNL.
Definition 3 (Semantics) Let M = (A, T, <, N, V, Posts) be a temporal network model, t ∈ T be a time point, a, b ∈ A be agents, and ω ∈ Ω be a post. Truth conditions for TNL are defined as follows:

M, t ⊨ p iff t ∈ V(p)
M, t ⊨ follow(a, b) iff a ∈ N(b, t)
M, t ⊨ posted(a, ω) iff ω ∈ Posts(a, t)
M, t ⊨ ¬φ iff M, t ⊭ φ
M, t ⊨ φ ∧ ψ iff M, t ⊨ φ and M, t ⊨ ψ
M, t ⊨ Pφ iff M, s ⊨ φ for some s ∈ T with s < t
M, t ⊨ Fφ iff M, s ⊨ φ for some s ∈ T with t < s
M, t ⊨ Xφ iff M, t + 1 ⊨ φ, where t + 1 is the immediate successor of t
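The truth conditions translate directly into a recursive evaluator. The sketch below follows the clauses of Definition 3 but, since temporal network models are infinite, bounds F by a finite horizon; that bound is an approximation of our own, adequate only for formulas whose temporal depth stays within the horizon.

```python
def holds(M, t, phi):
    """Evaluate an L_TNL formula (a nested tuple) at time point t of model M.

    M is a dict with keys "V", "N", "Posts" and "horizon"; the F clause
    ranges over the finite prefix [0, horizon) instead of an infinite
    timeline.
    """
    head = phi[0]
    if head == "atom":
        return t in M["V"].get(phi[1], set())
    if head == "follow":            # follow(a, b): a is among b's followers at t
        return phi[1] in M["N"].get((phi[2], t), set())
    if head == "posted":            # posted(a, w): w is on a's profile at t
        return phi[2] in M["Posts"].get((phi[1], t), set())
    if head == "not":
        return not holds(M, t, phi[1])
    if head == "and":
        return holds(M, t, phi[1]) and holds(M, t, phi[2])
    if head == "P":                 # at some strictly earlier time
        return any(holds(M, s, phi[1]) for s in range(t))
    if head == "F":                 # at some strictly later time (bounded)
        return any(holds(M, s, phi[1]) for s in range(t + 1, M["horizon"]))
    if head == "X":                 # at the immediate successor
        return holds(M, t + 1, phi[1])
    raise ValueError(f"unknown operator {head!r}")

M = {"V": {"p": {0, 1}},
     "N": {("e", 0): {"c"}},
     "Posts": {("a", 1): {("atom", "p")}},
     "horizon": 3}
```

On this toy model, follow(c, e) holds at time 0, and X posted(a, p) holds at time 0 because the post appears on a's profile at time 1.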

Expressing agent properties
An important motivation in the still emerging field of social network logics is to explore what network properties we can formalize with a limited logical syntax. In this section, we present some properties that can be expressed in L_TNL.
1. Agent a actively posts ω in the next step: ¬posted(a, ω) ∧ X posted(a, ω)
2. Agent a removes post ω in the next step: posted(a, ω) ∧ X¬posted(a, ω)

We return to the example in Fig. 1. Note that we here define "account creation" as the first action, in other words the first post or follow, of the agent in the network. The intuition is that an agent must follow other agents to view their content. One could alternatively allow for a different setting in which the set A of agents itself expands or shrinks, which would require an extension of the logical framework.

Soundness and completeness
TNL is a logic with future, past and next-time operators, interpreted over a strict linear order with no end. A strict linear order is a binary relation that is transitive, irreflexive and trichotomous, the latter being the property that for all s, t ∈ T, exactly one of s < t, t < s or t = s holds. Irreflexivity entails that we read the Pφ and Fφ modalities as strict past and future, not including the current time point. The formulas follow(a, b) and posted(a, ω) are original to TNL.
The language of TNL is the basic tense logic language with an additional next-time operator and follow- and posted-formulas. Tense logic, originating with Prior (1957), classically includes past and future operators, but not next-time. There are known axiomatic systems for tense logic that are complete with respect to the class of strict linear orders with no end (Blackburn et al., 2001; Venema, 2001). L_TNL can also be seen as the language of linear-time temporal logic (LTL) without the until operator, with an additional past operator and follow- and posted-formulas. Additionally, models of LTL are reflexive with respect to the future operator. In one of the seminal works first introducing LTL, Gabbay et al. (1980) present an axiomatic system called DX which is weakly complete with respect to logical models representing the natural numbers and their natural, irreflexive, ordering. These models are essentially TNL models, as the natural numbers with their ordering form a strict linear order with no end. The axiomatization for TNL consists of the axioms of DX together with axioms capturing the relationship between past and future, as well as axioms for the past operators from axiomatizations of tense logic. The axiomatic system for TNL is given in Table 1.
Like DX, the axiomatic system for TNL is weakly complete. TNL is not compact and can therefore not be strongly axiomatized. Consider the set of sentences Σ = {¬Gp, Xp, XXp, XXXp, …}. Every finite subset of Σ is satisfiable, because we can always construct a model in which a point far enough along the timeline does not satisfy p. However, Σ itself is not satisfiable: if Xⁿp is satisfied for all n ∈ ℕ, then Gp is true and consequently ¬Gp is false.
Theorem 1 TNL is sound and weakly complete with respect to temporal network models.
Proof The proof closely follows the proof of completeness for DX by Gabbay et al. (1980) with details from standard proofs of completeness for various tense logics (Blackburn et al., 2001; Venema, 2001). The full proof is included in Appendix A.

Detection formulas
Our approach to bot detection is to define a set of formulas corresponding to particular behaviors of a social bot. Checking whether these formulas are satisfied in a temporal network model gives us information on whether we should expect bots in the social network.
As was seen in the overview of empirical research in Sect. 2, a recurring feature of bot behavior is to act aggressively, or in a bursty manner. To model these traits, we need to define what it means to do something a lot and in a short or long period of time. We decide to rely on a source external to our system: a program or person that in each network and at each time gives us a specific value for a lot, depending on the individual situation in each network and for each agent. This is implemented in our language as the constant alot. The same approach goes for a long period of time, denoted with the constant long. The constants alot and long can be seen as thresholds: for any specific number x in place of, say, alot, the intended reading is that alot is at least x, since anything that happens more than x times also happens x times. For a short period of time, we argue that in any case, a single time point is a short period of time. We now present the formulas corresponding to bot behaviors.
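As one concrete way to realize the external source, alot could be set by a nearest-rank percentile over observed activity counts. The function name, the percentile method and the default value below are purely illustrative assumptions, not part of the framework.

```python
def alot_threshold(counts, percentile=95):
    """A hypothetical external classifier for `alot`: the nearest-rank
    percentile of observed per-agent activity counts. The percentile
    choice is an illustration, not a recommendation."""
    ranked = sorted(counts)
    # nearest-rank index, computed with integer arithmetic
    k = max(0, min(len(ranked) - 1, (percentile * len(ranked) + 99) // 100 - 1))
    return ranked[k]

posts_per_agent = [1, 2, 2, 3, 3, 4, 40]   # one burst-like outlier
```

With these counts, the 95th-percentile threshold singles out the outlier's activity level as "a lot".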

Posting false information
Recall that our models include a valuation function V : At → P(T) giving an objective truth value for each atomic post in the network at each time point. This enables us to capture the existence of misinformation in the network with the simple formula FI:

⋁_{a∈A} ⋁_{ω∈Ω} (posted(a, ω) ∧ ¬ω)  (FI)

We cannot claim that there are bots in the network based solely on whether the FI formula is satisfied at a time point in the model, as human behavior has been shown to contribute as much to misinformation spread as bots do (Vosoughi et al., 2018). The FI formula should therefore be checked together with other bot detecting formulas to see whether it is likely that there are social bots, and false information, in the network.
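Checking FI at a time point amounts to evaluating each post on each profile against the valuation. A small sketch, using our own tuple encoding of propositional posts:

```python
def eval_post(V, t, w):
    """Truth value of a propositional post w at time t under valuation V."""
    head = w[0]
    if head == "atom":
        return t in V.get(w[1], set())
    if head == "not":
        return not eval_post(V, t, w[1])
    if head == "and":
        return eval_post(V, t, w[1]) and eval_post(V, t, w[2])
    if head == "imp":
        return (not eval_post(V, t, w[1])) or eval_post(V, t, w[2])
    raise ValueError(f"unknown connective {head!r}")

def false_information(V, posts, agents, t):
    """FI at time t: some agent's profile shows a post that is false at t."""
    return any(not eval_post(V, t, w)
               for a in agents for w in posts.get((a, t), set()))

# as in Fig. 1: p is true at time 0, q is false, and d posted p -> q
V = {"p": {0}, "q": set()}
posts = {("d", 0): {("imp", ("atom", "p"), ("atom", "q"))}}
```

Here d's post p → q is false at time 0 (p holds, q does not), so FI is satisfied there.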

Bursty nature
To characterize the bursty nature of a social bot, we want to define a formula that describes agents that post a lot in a short period of time and have longer periods of inactivity. We remind the reader that the intended interpretation of the formula posted(a, ω) is that the post ω appears on agent a's profile; therefore, if posted(a, ω) holds in consecutive time points, the post stays on the agent's profile. We first define the formula inactive_long(a), representing agent a being inactive for a long time:

inactive_long(a) := ⋀_{n=1}^{long} ⋀_{ω∈Ω} ⋀_{b∈Ag} ((Xⁿ posted(a, ω) ↔ posted(a, ω)) ∧ (Xⁿ follow(a, b) ↔ follow(a, b)))

Here Xⁿ is the standard notation Xⁿ := XX…X (n times). We read inactive_long(a) as "for all posts and all agents, in the next steps for a long time, agent a posted something, or followed someone, if and only if they have already done this in the current step". We also abbreviate the formula active_post(a, ω) := ¬posted(a, ω) ∧ X posted(a, ω), to be read as "agent a actively posts ω in the next time point". We now define the formula BN:

⋁_{a∈A} ⋁_{ω_1,…,ω_alot ∈ Ω, ∀n≠k: ω_n ≢ ω_k} ((⋀_{n=1}^{alot} active_post(a, ω_n)) ∧ X inactive_long(a))  (BN)
The notation ∀n ≠ k : ω_n ≢ ω_k means that for any n ≠ k, ω_n and ω_k are not logically equivalent formulas. The BN formula states that "there exists an agent, called a for reference, and there exist m (a lot of) inequivalent posts such that the agent posts all of them at the next time point, and is then inactive for a long time in the future".
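The intended reading of BN can be sketched procedurally for a single agent. The encoding (`posts_by_time` mapping each time point to the set of posts on the agent's profile) and all names are our own illustrative assumptions; `alot` and `long` stand for the externally supplied thresholds.

```python
def bursty_at(posts_by_time, t, alot, long):
    """Sketch of BN for one agent: at least `alot` distinct new posts appear
    at step t+1, followed by `long` steps in which the profile is unchanged.
    posts_by_time[t] is the set of posts on the agent's profile at time t."""
    new_posts = posts_by_time.get(t + 1, set()) - posts_by_time.get(t, set())
    if len(new_posts) < alot:
        return False
    # inactivity: the profile does not change for `long` consecutive steps
    for k in range(t + 1, t + 1 + long):
        if posts_by_time.get(k + 1, set()) != posts_by_time.get(k, set()):
            return False
    return True
```

The sketch only tracks posting inactivity; the formula in the paper also requires that the agent follows no one new during the inactive period.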
Recall that we formalized account creation as the first action of an agent in the network in Sect. 4.3. This definition allows us to formalize bursty account creation: a lot of accounts being created in a short period of time. For simplification, we refer to the creation formula for an agent a as created(a). We define bursty account creation, using the external classifier for alot, in the formula BAC.

Targeting
We did not extend the syntax to express and directly identify a hashtag within a post in L_TNL. However, we can detect the posting of a lot of posts that relate to a particular atomic post p. If we rely on a person, or an external natural language processing algorithm, that can recognize a hashtag in a post, we can let p denote a given hashtag. We use the abbreviation posted(a, p + alot) for "agent a posted a lot of formulas of the form p ∧ ω for a given p" and present the formula HT. An observation made by one of the reviewers is that hashtags on social media platforms are usually not true or false. One way to tackle this is to set the truth values of hashtags to always be true. Alternatively, we could add a separate set of atomic posts that are hashtags, which cannot be true or false. Another property we did not explicitly specify in the language L_TNL is whether a post is relevant or irrelevant to a specific hashtag. An important feature of hashtag targeting is that irrelevant information is posted under a particular hashtag. It is therefore debatable whether HT fully captures hashtag targeting.
Subgroup targeting is characterized by a user posting a lot on one topic related to a subgroup, and then starting to post about another, irrelevant topic, often political. Although we cannot express relevance, we can formalize an agent posting a lot on one topic p before posting a lot on another topic q. This behavior is described in the formula ST.

⋁_{a∈A} (P(posted(a, p + alot)) ∧ posted(a, q + alot))   (ST)

Aggressive following and unfollowing
We define aggressive following or unfollowing as an action that happens a lot, repeatedly, in a short time span. In this context, that is a combination of the proposition alot and a short period of time. For simplification, we abbreviate this in a formula which we read as "there exists an agent a, and a lot of other agents b_1, ..., b_m, who all start being followed by a in the next step and are unfollowed by a in the future".

Bot detecting: model checking

We define the problem of bot detecting in a social network as a model checking problem in TNL. Model checking is an automated decision procedure for establishing whether a finite model of a system satisfies a formal specification expressed as a logical formula.
Let φ ∈ L_TNL be a given formula specifying a property of a bot, and let (M, t) be a given finite pointed temporal network model. The model checking problem is the problem of determining whether φ is satisfied in (M, t).
In this section, we discuss complexity results and the computational efficiency of model checking for bot detecting. We propose two alternative options for model checking. First, we give a model checking algorithm for finite fragments of TNL and show that it runs in polynomial time. Then, we show that the model checking problem for TNL can be translated to the model checking problem for linear-time temporal logic with past (PLTL). The reason for including this translation, when we already present a model checking algorithm for TNL, is to consider whether we can use these results to leverage existing model checkers such as SPIN and NuSMV, which have native support for linear-time temporal logic.

Finite fragments
The main proposal in this paper is to use model checking to detect social bots in a network of agents. We envision this being done for a real-life network by using data from a social network to form a corresponding temporal network model. Further details on how we propose to build TNL models from social network data can be found in Sect. 8.
Recall that TNL models are strict linear orders with no end. When taking existing information about a social network, it is clear that this data will always be represented on a finite timeline. Translating this information to a strict linear order with no end is therefore not entirely straightforward. To bypass this problem, we introduce finite fragments of temporal network models.

Definition 4 (Finite Fragments of Temporal Network Models)
A finite fragment of a temporal network model is a tuple M_f = (A, T, ∼_f, N, V, Posts) where all elements except ∼_f are defined as in a temporal network model. ∼_f is a binary relation on T which is a finite linear order in which all points are irreflexive except the last, reflexive point.
A finite fragment of a temporal network model is essentially a finite strict linear order with a final deadlock state: a last state that loops to itself to end the timeline. This construction is needed to obtain well-defined semantics for the X operator for all formulas of L_TNL on finite fragments. The semantics on M_f is defined as the semantics of TNL on classical temporal network models.
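The deadlock construction can be made concrete with a minimal sketch, assuming time points are numbered 0 to `last`; the function name is our own.

```python
def successor(t, last):
    """Successor along the finite-fragment relation ~_f: a strict linear
    order 0 < 1 < ... < last, with the last point looping to itself so
    that the X operator is defined at every time point."""
    return t + 1 if t < last else t
```

Every time point thus has exactly one successor, which is what keeps X total on the finite fragment.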

A model checking algorithm for TNL
The model checking algorithm for finite fragments of TNL is based on the model checking algorithm for Computation Tree Logic (CTL), a known result which can be found in logic textbooks (Huth & Ryan, 2004) and originates from influential work on model checking by Clarke and Emerson (1981). We provide pseudocode for the algorithm below.
Algorithm 1 Function SAT(φ) determining the set of time points that satisfy φ
Output: the set sat of all time points in M_f where φ is satisfied
1: sat := ∅
2: case
⋮
9: φ is Pψ: sat = SAT_P(ψ)
10: φ is Fψ: sat = SAT_F(ψ)
11: φ is Xψ: sat = SAT_X(ψ)
12: end case
13: return sat

Algorithm 1 is a labeling algorithm that labels the time points of M_f in a case analysis. The function goes through the subformulas of φ from the shortest, working upwards in increasing order of length. It is recursive on the structure of φ, referring to the function SAT itself in the cases of negation and conjunction. The algorithm relies on three subroutines SAT_P, SAT_F and SAT_X, found in Algorithms 8, 9 and 10 in Appendix B. For a formula φ, |φ| denotes the size of φ, i.e. the number of distinct subformulas of φ. We should note that our function is in fact a global model checking algorithm: the output gives us all time points at which the formula is satisfied. Model checking for a single given state only amounts to checking whether the state is a member of the output set SAT(φ).
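For illustration, a global labeling checker for a propositional fragment (atoms, ¬, ∧, P, F, X) over a finite fragment might be sketched as follows. The tuple-based formula encoding and all names are our own assumptions, and the subroutines SAT_P, SAT_F and SAT_X are folded in directly; the reflexive last point is handled explicitly so that X, F and P are defined everywhere.

```python
# Formulas are nested tuples, e.g. ('and', 'p', ('X', ('not', 'q'))).
# valuation[p] is the set of time points where atomic post p is true.

def sat(phi, times, valuation, last):
    """Return the set of time points satisfying phi (global model checking)."""
    if isinstance(phi, str):                       # atomic post
        return valuation.get(phi, set()) & times
    op = phi[0]
    if op == 'not':
        return times - sat(phi[1], times, valuation, last)
    if op == 'and':
        return (sat(phi[1], times, valuation, last)
                & sat(phi[2], times, valuation, last))
    if op == 'X':                                  # next; last point loops
        s = sat(phi[1], times, valuation, last)
        return {t for t in times if (t + 1 if t < last else t) in s}
    if op == 'F':                                  # strict future
        s = sat(phi[1], times, valuation, last)
        return {t for t in times
                if any(u > t for u in s) or (t == last and last in s)}
    if op == 'P':                                  # strict past
        s = sat(phi[1], times, valuation, last)
        return {t for t in times
                if any(u < t for u in s) or (t == last and last in s)}
    raise ValueError(op)
```

Checking a single pointed model (M_f, t) then amounts to testing `t in sat(phi, ...)`, exactly as the global-versus-local remark above describes.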

From TNL to PLTL
We discuss how we can reduce Temporal Network Logic to a fragment of linear-time temporal logic with past (PLTL) (Schnoebelen, 2002) with respect to model checking. That is, we show that for a pointed TNL model (M, t) and a formula φ in the language of TNL, there exists a pointed PLTL model (M′, t′) and a formula ϕ in the language of PLTL such that instead of checking whether (M, t) ⊨ φ, we can check whether (M′, t′) ⊨ ϕ. The reason we include a reduction of the model checking problem is to open up the possibility of utilizing existing model checkers for LTL, such as NuSMV (Cimatti et al., 2002), as an alternative to the model checking algorithm for TNL presented in the previous section.
PLTL is classical linear-time temporal logic with a past operator P and a previous-step, or yesterday, operator X^{-1}, in addition to the standard future F and next-time X operators. In PLTL, the semantics of X^{-1}φ is analogous to that of Xφ: X^{-1}φ is forced at the current time point if and only if φ holds at the previous time point.
Models of PLTL are tuples M = (T, R, V), where T and V are as defined in Definition 2 of temporal network models. In PLTL, R is usually not a strict linear order, and is defined as having no beginning in addition to having no end. The translation from models of TNL to models of PLTL strips the TNL model down to a 3-tuple. We also include a reflexive deadlock state at the beginning of the timeline, so that X^{-1} is well-defined.
The reduction from any well-formed formula of TNL to a formula of PLTL is shown in the following translation.
The translation t : L_TNL → L_PLTL is defined as follows.
The semantics of the P and F operators in TNL is defined as strict past and future, not including the present. Thus, the translation into PLTL, where the standard semantics of past and future includes the present, requires an additional X^{-1} and X operator in the translation of Pφ and Fφ, respectively. The set of agents A is finite, and the set of posts is finite up to equivalent formulas. Therefore, the set of formulas of the form follow(a, b) and posted(a, ω) for any a, b, ω is finite too, and these can be translated into atomic posts in the set At.
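The two key clauses just described, strict past becoming X^{-1}P and strict future becoming XF, with follow- and posted-formulas flattened into fresh atoms, can be sketched as a recursive function. The tuple encoding, the token `'Y'` for X^{-1}, and the atom-naming scheme are our own illustrative assumptions.

```python
def translate(phi):
    """Sketch of the translation t : L_TNL -> L_PLTL. Formulas are nested
    tuples; follow- and posted-formulas become fresh atomic propositions."""
    if isinstance(phi, str):                       # atomic post
        return phi
    op = phi[0]
    if op == 'follow':
        return f'follow_{phi[1]}_{phi[2]}'         # new atom in At
    if op == 'posted':
        return f'posted_{phi[1]}_{phi[2]}'         # new atom in At
    if op == 'not':
        return ('not', translate(phi[1]))
    if op == 'and':
        return ('and', translate(phi[1]), translate(phi[2]))
    if op == 'P':                                  # strict past -> X^-1 P
        return ('Y', ('P', translate(phi[1])))
    if op == 'F':                                  # strict future -> X F
        return ('X', ('F', translate(phi[1])))
    if op == 'X':
        return ('X', translate(phi[1]))
    raise ValueError(op)
```

The flattening step is what makes the reduction possible: since A is finite and posts are finite up to equivalence, only finitely many such fresh atoms are ever needed.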
It is well known that the model checking problem for LTL is in pspace (Sistla & Clarke, 1985) and that adding past operators does not increase the expressivity of LTL. In particular, the model checking problem for LTL with past is also in pspace (Markey, 2004).
Despite being in pspace, model checking of LTL formulas can be computationally efficient in practice (Schnoebelen, 2002), and numerous efficient model checking tools have been developed. It has to be noted that current model checking algorithms are exponential in the size of the formula describing the property. In many of our formulas, the size (number of atoms) depends on the number of agents and the number of posts we would like to consider. These are numbers we can control, and in most cases we would be looking at a specific small number of posts and a part of the social network. We might also be able to use results on model checking LTL on finite traces (Fionda & Greco, 2016). Experiments are needed to establish the true practical efficiency of model checking with PLTL in our examples. It has been shown that for PLTL formulas that satisfy the separation property, the SPIN model checker can be used efficiently (Pradella et al., 2003). The NuSMV model checker (Cimatti et al., 2002) has native support for PLTL formulas.

Extending TNL: from an agent's perspective
We now look at an extension of Temporal Network Logic in which we switch our perspective from the global view of a time point to the local view of a single agent in the social network within that time point. The advantage of this extended version of TNL is that it increases our expressive power and allows us to express specific properties from each agent's point of view, such as "this agent is likely a bot" or "this agent is likely following a bot". The model checking algorithm for TNL given in Sect. 6.2 labels the time points t in the model where a given formula is forced. Given a specific agent, it is possible to check at which time points, if any, this agent exhibited a particular behavior. In this section, we give a model checking algorithm for the extended TNL. This model checker labels pairs (a, t) of agents and time points. Therefore, checking a formula representing a property such as "this agent is likely a bot" outputs not only the time points, but also the agents for which this property is true. Model checking these types of formulas might help identify bots in the network.

Syntax and semantics
We name this version of TNL, in which we evaluate formulas at agents in addition to time points, Temporal Network Agent Logic (TNL_a). The language of TNL_a includes elements from hybrid logic (Areces & ten Cate, 2007), such as a separate set Nom of atomic propositions called nominals. In contrast to the posts in At, an element of Nom can only be true at one agent in a time point, and can therefore be taken to represent the agent's unique name. We also include a set of nominal variables Var.
Definition 5 (Syntax of TNL_a) Let At, Nom and Var be sets of propositional atoms, all finite and pairwise disjoint. We define the well-formed formulas of the language L_TNL_a to be generated by the following grammar, where p ∈ At, s ∈ Nom ∪ Var and i ∈ Nom. We define propositional connectives like ∨ and →, and the formulas ⊤ and ⊥, as usual.
The syntax of TNL_a is the syntax of TNL with nominals, the hybrid operators @ and ↓, and the diamonds ♦ and ♦^{-1}. Intuitively, we read @_s φ as "φ is true at the agent who is called s". The other hybrid operator, ↓x, names the current agent x, so that we can speak generally about this agent without knowing the actual nominal that is satisfied there. We can read ↓x.φ intuitively as "the current agent is called some specific nominal such as i, but we refer to it by the unique generic x, and φ is true at x". We read ♦φ as "the current agent is being followed by an agent where φ holds" and ♦^{-1}φ as "the current agent follows someone where φ is true". Additionally, the operators follow and posted are now defined with nominals and nominal variables. The reason behind this change is to be able to characterize the necessary axioms for the relationship between the global follow operator and the new local ♦ and ♦^{-1}. We intuitively read follow(i, j) as "the agent called i follows the agent called j" and posted(s, ω) as "the agent called s posted ω" or "ω appears on the profile of the agent called s".
The models of Temporal Network Logic already include two distinct binary relations on separate sets: the temporal relation < on the set of time points T, and the follower function N on the set of agents A. Yet, so far we have restricted our semantics to evaluating formulas only at time points and not at agents. Taking advantage of this existing model structure, we need only minimal changes to the temporal network model to be able to evaluate formulas at agents as well. The only component that differs is the valuation function V_a, which now needs to accommodate nominals.

Definition 6 (Temporal Network Agent Model)
A temporal network agent model is a tuple M_a = (A, T, <, N, V_a, Posts) where all items except V_a are defined as in the case of a standard temporal network model. V_a is defined as follows: it is a valuation function that takes both atomic propositions and nominals as input. We call the tuple F_a = (A, T, <, N) a temporal network agent frame.
The temporal network agent model is named: for every agent at every time point, there is a nominal satisfied there. Furthermore, this nominal is unique: the same nominal cannot be true at two distinct agents in the same time point.
Before presenting the semantics of TNL_a, we introduce the function g : Var → A which assigns agents to nominal variables. We define an x-variant of g as the assignment g^x_a with g^x_a(x) = a and g^x_a(y) = g(y) for all y ≠ x. Also, for s ∈ Nom ∪ Var, let [s]^{M_a, g} denote the agent whose name is s.
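Representing g as a finite map, the x-variant is just a functional update; this minimal sketch (with our own function name) makes the definition concrete.

```python
def x_variant(g, x, a):
    """The x-variant g[x := a] of an assignment g: agrees with g on every
    variable except x, which is mapped to agent a. Returns a new map,
    leaving g itself unchanged."""
    h = dict(g)
    h[x] = a
    return h
```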

Relations between the global and the local view
The extended logic TNL_a naturally has to come with principles establishing the relationship between the old global and the new local formulas. We discuss a selection of these principles, characterized by axioms that must be valid on temporal network agent frames, in order to narrow our class of frames down to the ones we are interested in.
Posting. An immediate concern is to make sure that we cannot have a formula stating that the current agent posted ω while the agent with the unique name of the current agent did not. That is, we do not allow ↓x.posted(x, ω) and ¬posted(i, ω) to both be true when x and i refer to the same agent. Therefore, the corresponding axiom must be valid for all posts ω and all i ∈ Nom.
Following. Another concern occurs in the context of the follow operator and the ♦ and ♦^{-1} operators: we exclude situations where an agent called i follows an agent called j while ♦i is false at the agent called j. We have a similar concern for the other diamond. The corresponding axioms, for all i, j ∈ Nom, characterize the properties that we want.
Global truth. The aim of extending our framework is to preserve the original timeline of TNL while being able to evaluate formulas at agents. Therefore, we need to ensure that atomic posts in At hold either at none or at all agents in the same time point. Since our original interpretation of the atomic posts in L_TNL is that they are facts at each time point, we do not want a fact to be true for some agent but not for another in our new model. This property can be characterized with the following axiom for all p ∈ At and all i ∈ Nom.

p → @_i p

Names. Another important feature we want to preserve from the original timeline is the ability to characterize an agent unfollowing another with a formula such as follow(i, j) ∧ X¬follow(i, j). The X operator moves the evaluation of ¬follow(i, j) to the next time step, so it is crucial that the names i and j refer to the same agents at all time steps. If not, we cannot guarantee that it is the same agent called i who unfollows the same agent called j as in follow(i, j) at the previous time point. The property that all agents have the same name at all time points can be characterized with this simple axiom for all i ∈ Nom.

i → Gi
We conjecture that TNL_a is complete, but leave a complete axiomatic system for TNL_a to future work. More specifically, we conjecture completeness both with respect to temporal network agent frames and with respect to the subclass of frames that have the desirable relationships between the global and the local view, using the previously mentioned axioms. The main reason for this conjecture is the observation that our frames are indexed (Balbiani & Fernández González, 2020). Indexed frames are relational structures (W_1, W_2, R_1, R_2) with two sets of worlds and two binary relations such that R_1 is a family of relations {R_1^w : w ∈ W_1} on W_2 and R_2 is a family of relations {R_2^w : w ∈ W_2} on W_1. The frames of TNL_a are essentially such structures: the following relation N is a family of relations on the set of agents A, indexed by each time point in T. The temporal relation < is a relation on the time points, and could be formalized as a family of relations indexed by each agent in A, although for all agents a, <_a would refer to the same relation. Completeness results for logics on indexed frames, including hybrid logics, have been extensively researched by Balbiani and Fernández González (2020, 2021) and Fernández González (2021), and will provide valuable input for our work on TNL_a.

Formulas in TNL a
In this section we present some formulas in TNL_a related to the existence of social bots in a network and the powerful positions they may hold. With the use of the local diamond modalities and the hybrid operators, we can begin to analyze properties that, by zooming in on a specific agent, hold either of that agent, or of the agents it follows or is followed by.
1. I posted p for the first time now:
2. Someone who follows me now has posted false information in the past:
3. I am following someone now who will have a bursty nature in the future (and is therefore likely a bot):

It is interesting to note that the hybrid language lets us analyze particular properties of the network structure, and likely powerful positions that agents inhabit. One such position is called a local gatekeeper (Easley & Kleinberg, 2010). In our context, an agent a is a local gatekeeper if and only if there are two other agents, called b and c, who both follow and are followed by a, but do not have any follower relationship with each other. The reason a is called a local gatekeeper between b and c is that information from b to c, or vice versa, is likely to go through a. It is therefore reasonable to be interested in whether or not a is a trustworthy agent. With the language of TNL_a we can formalize properties such as the following.
4. I am following someone now who posted false information in the past, and is a local gatekeeper between me and another agent:

Another property we can define with formulas in L_TNL_a is a lower bound on the number of followers of the current agent or any of their relations. When analyzing the network, knowing the number of followers is crucial for discussing a particular agent's power to be heard.

5. I am being followed by someone who has at least three followers:

The sentences listed here are only a small selection of the possible properties we could have mentioned. The aim is rather to show some examples of properties expressible in TNL_a which we could use a model checker to check for in the network.

Model checking TNL a
We introduce a model checking algorithm for TNL_a. As in the case of TNL, we start by assuming that we have gathered data from a real-life social network and can use this to develop, in this case, a finite fragment of a TNL_a model. The TNL_a model is, like the TNL model, an infinite timeline, whereas gathered social network data will always be finite. To depict this finite information as a TNL_a model, we therefore define an analogue of the finite fragments of TNL.

Definition 8 (Finite Fragments of Temporal Network Agent Models)
A finite fragment of a temporal network agent model M_f^a = (A, T, ∼_f, N, V_a, Posts) is a tuple where all elements except ∼_f are defined as in a temporal network agent model. ∼_f is a binary relation on T which is a finite linear order in which all points are irreflexive except the last, reflexive point.
Finite fragments of temporal network agent models are finite timelines with social network information, in which the last point is a reflexive deadlock point. The reason for this final deadlock point is to keep the semantics of the X operator well-defined. The semantics for formulas of L_TNL_a on finite fragments is defined as in the case of temporal network agent models.
The model checking algorithm for TNL_a builds on the model checking algorithm MCFULL for hybrid logic given by Franceschet and de Rijke (2006), with some substantive differences to accommodate our setting. Classical hybrid logic models have only one relational structure, as opposed to the two relations ∼_f and N in finite fragments of temporal network agent models. This difference is also reflected in L_TNL_a, which is richer than the tense hybrid logic language given by Franceschet and de Rijke (2006) and includes Xφ, ♦φ and ♦^{-1}φ as well as the follow- and posted-formulas.
For the standard hybrid language without the binder ↓, Franceschet and de Rijke (2006) introduce a model checker MCLITE that takes a finite hybrid model M = (M, R, V), an assignment function g and a formula φ as input. After termination of the procedure, each state in the model is labeled with all the subformulas of φ forced at that state. The algorithm updates a table whose elements are bits. We also begin by defining a model checker for the language of TNL_a without ↓, from here on called L_TNL_a@. Our procedure is based on a similar method, but the table of bits needs to be three-dimensional, as opposed to the two-dimensional table of MCLITE. The subroutines introduced in the following also all include details that are unique to TNL_a.
The model checker MC_@ receives a finite fragment M_f^a = (A, T, ∼_f, N, V_a, Posts), an assignment function g and a formula φ in L_TNL_a@. It outputs the finite fragment in which all states are labeled by the subformulas of φ that hold at the respective states. Like the model checking algorithm MCLITE, MC_@ goes through the subformulas of φ starting with the shortest and working upwards in increasing order of length. Boolean connectives are handled in the standard way.
MC_@ updates a table L of bits of size |φ| × |A| × |T|. Note that the size of L is finite, as both A and T are defined as finite in M_f^a. We write sub(φ) = {α, β, ...} for the set of subformulas of φ. Initially, L(α, a, t) = 1 if and only if α is an atomic post or a nominal that is true at (a, t) under V_a, or a follow- or posted-formula that holds at (a, t) according to N or Posts. When MC_@ terminates, L(α, a, t) = 1 if and only if M_f^a, g, a, t ⊨ α, for all a ∈ A, t ∈ T and α ∈ sub(φ).
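The initialization of the table might be sketched as follows, with L encoded as a dict from (formula, agent, time) triples to bits. The encoding, the field names and the use of nominals directly as agent identifiers are our own illustrative assumptions.

```python
def init_table(subformulas, agents, times, V_a, N, Posts):
    """Sketch of the initial state of the three-dimensional table L.
    Atomic posts and nominals are read off V_a; follow- and posted-
    formulas are read off N and Posts and hold globally at a time point."""
    L = {}
    for alpha in subformulas:
        for a in agents:
            for t in times:
                if isinstance(alpha, str):            # atom or nominal
                    bit = int(a in V_a.get((alpha, t), set()))
                elif alpha[0] == 'follow':            # follow(i, j)
                    bit = int(alpha[2] in N.get((alpha[1], t), set()))
                elif alpha[0] == 'posted':            # posted(s, w)
                    bit = int(alpha[2] in Posts.get((alpha[1], t), set()))
                else:
                    bit = 0                           # set by subroutines later
                L[(alpha, a, t)] = bit
    return L
```

Note that follow- and posted-bits are set identically for every agent at a time point, matching the global reading of these operators.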
For formulas of the form ♦α, ♦^{-1}α, Pα, Fα, Xα and @_s α, MC_@ relies on the subroutines mc_♦, mc_♦^{-1}, mc_P, mc_F, mc_X and mc_@, respectively. We give the subroutines mc_♦, mc_F and mc_@, and explain how the other routines can be defined. For subroutine mc_♦ in Algorithm 2, define the set N^{-1} such that for any a, b ∈ A and t ∈ T: b ∈ N(a, t) iff a ∈ N^{-1}(b, t). Also define L(α) as the set of pairs (a, t) in the table L where L(α, a, t) = 1.
The procedure mc_♦^{-1} can be defined similarly to mc_♦, where N^{-1}(a, t) is exchanged with N(a, t) and line 3 states L(♦^{-1}α, b, t) ← 1 instead. For subroutine mc_F in Algorithm 3, define the set <(t) = {s ∈ T | s < t}.
The procedures mc_P and mc_X can be defined similarly to mc_F. In mc_P, <(t) should be exchanged with the set >(t) = {v ∈ T | t < v} and line 3 should state L(Pα, a, s) ← 1. In the case of mc_X, instead of <(t), define the set <_1(t) to be the set of states reachable one step into the past, in other words the set of immediate predecessors of t. Recall that this set would be defined as pre({t}) using the notation from the model checking algorithm of TNL in Sect. 6. In mc_X, line 3 should state L(Xα, a, s) ← 1 instead of L(Fα, a, s) ← 1.

Algorithm 2 Procedure mc_♦(M_f^a, g, α)
1: for (a, t) ∈ L(α) do
2:   for b ∈ N^{-1}(a, t) do
3:     L(♦α, b, t) ← 1
4:   end for
5: end for

Algorithm 3 Procedure mc_F(M_f^a, g, α)
1: for (a, t) ∈ L(α) do
2:   for s ∈ <(t) do
3:     L(Fα, a, s) ← 1
4:   end for
5: end for
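Under the same dict-based encoding of the table L used above (our own, not the paper's), the two given subroutines might be sketched as:

```python
# L maps (formula, agent, time) to 0/1; compound formulas are tagged tuples.

def mc_diamond(L, alpha, N_inv, agents, times):
    """If L(alpha, a, t) = 1, set L(<>alpha, b, t) = 1 for every agent b
    that follows a at t, i.e. every b in N^{-1}(a, t)."""
    for (a, t) in [(a, t) for a in agents for t in times
                   if L.get((alpha, a, t)) == 1]:
        for b in N_inv.get((a, t), set()):
            L[(('dia', alpha), b, t)] = 1

def mc_F(L, alpha, agents, times):
    """If L(alpha, a, t) = 1, set L(F alpha, a, s) = 1 for every s < t."""
    for (a, t) in [(a, t) for a in agents for t in times
                   if L.get((alpha, a, t)) == 1]:
        for s in [s for s in times if s < t]:
            L[(('F', alpha), a, s)] = 1
```

Both routines only ever set bits for pairs already labeled with α, which is what keeps each pass linear in the size of the relevant relation.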
Algorithm 4 Procedure mc_@(M_f^a, g, s, α)
⋮
4: for b ∈ A do
5:   L(@_s α, b, t) ← 1
6: end for
7: end if
8: end for

Proposition 3 Let M_f^a = (A, T, ∼_f, N, V_a, Posts) be a finite fragment of TNL_a, g an assignment function and φ a formula in L_TNL_a@, the language of TNL_a without the binder ↓. The model checker MC_@(M_f^a, g, φ)

Proof The proof follows the similar proof of the complexity of MCLITE, Theorem 4.3 (Franceschet & de Rijke, 2006). MC_@ checks all subformulas of φ, in total |sub(φ)| = |φ| formulas. To determine the initial state of the table, one needs to check |V_a| times whether α is an atomic post or a nominal, |N| times whether α is of the form follow(i, j), and |Posts| times whether α is of the form posted(s, ω). Since V_a, N and Posts depend on A × T, the complexity of determining the initial state is O(|A| × |T|). The complexity of checking each subformula α depends on the main operator in α. Recall that ∼_f is the binary relation in M_f^a which is a finite linear order in which all points are irreflexive except the last, reflexive point. The cardinality |∼_f| is the number of elements in ∼_f, i.e. the number of transitions from one time point to another in M_f^a, plus 1 for the last reflexive point. Subroutine mc_♦ is the routine with the highest complexity to check.

We introduce the model checker MC_↓ for the full language of TNL_a. MC_↓ uses the same combination of bottom-up and top-down strategies as the model checker MCFULL (Franceschet & de Rijke, 2006). It is a recursive model checker which, like MC_@, updates a three-dimensional table L of bits. For each formula φ, all subformulas whose main operator is not the hybrid binder ↓ are checked using the subroutines from MC_@. If the main operator is the hybrid binder ↓, the formula is of the form ↓x.α. First, for each time point t and for each agent a, we assign a to x. Then we check α at t and a, using the subroutines from MC_@, with the new assignment for x. MC_↓ uses the same procedures check_* for * ∈ {♦, ♦^{-1}, P, F, X, @} as MCFULL, shown in Algorithm 5.
The procedure check_↓ is shown in Algorithm 6. check_↓ uses a new subroutine clear(L, x, t) which resets all values of L(α) at t for the formulas α in which x is free. As in MC_@, Boolean operators are treated as usual.
Algorithm 6 Procedure check_↓(M_f^a, g, x, α)
⋮
end for
10: end for

Proposition 4 Let M_f^a = (A, T, ∼_f, N, V_a, Posts) be a finite fragment of TNL_a, g an assignment function and φ a formula in L_TNL_a. Let r_↓ be the nesting degree of ↓. The model checker MC_↓
Proof The proof closely follows the proof of the complexity of MCFULL, Theorem 4.5 (Franceschet & de Rijke, 2006). The procedure check_* runs in time C_α + C_*, where C_α is the cost of checking α and C_* is the cost of the subroutine mc_* for each * ∈ {♦, ♦^{-1}, P, F, X, @}. The procedure check_↓ runs in time |A| × |T| × C_α. Let C_MC@ denote the complexity of MC_@. As in the case of MCFULL, the height of the recursion stack for MC_↓ is at most the length of the formula φ, and thus MC_↓ uses polynomial space.
The output of Algorithm 7 is a finite fragment M_f = (A, T, ∼_f, N, V, Posts). Note that the output here is a finite fragment of TNL, and not a finite fragment of TNL_a. To build a finite fragment of TNL_a, the algorithm would take a nominal valuation V_a instead of V as part of the input. The rest of the algorithm would remain the same.
T is a set of natural numbers of the same size as the set of snapshots. Recall that a snapshot is a dataset that contains information about the agents in the network, their posts and their relations at a given moment in time. For each element t in T, the ordered pair (t, t + 1) is added to ∼_f to construct a strict linear order (line 3). From each snapshot, every agent that is a node in the social network snapshot is added to A (line 5), relations are added to N (line 7) and posts are added to Posts (line 10). Then the reflexive deadlock state is added (line 16) and the program returns the model (line 17). The algorithm runs in time linear in the number of agents, connections and posts in the social network.
Algorithm 7 Building a finite fragment of a temporal network model from snapshots of a social network
Input: Ordered snapshots of a social network, enumerated from 1 to n, and a valuation V
Parameter: Set of natural numbers T = {1, ..., n} of size n with its standard ordering, each element referring to a snapshot
Output: A finite fragment of a temporal network model M_f = (A, T, ∼_f, N, V, Posts)
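The construction just described can be sketched end-to-end in Python. The snapshot format (dicts with `agents`, `follows` and `posts` fields) and all names are our own illustrative assumptions about how such data might be packaged.

```python
def build_fragment(snapshots, valuation):
    """Sketch of Algorithm 7: build a finite fragment from ordered snapshots.
    Each snapshot is a dict with 'agents' (a set of agent ids), 'follows'
    (a set of (a, b) pairs) and 'posts' (a set of (a, post) pairs)."""
    T = list(range(1, len(snapshots) + 1))
    rel = {(t, t + 1) for t in T[:-1]}             # strict linear order
    rel.add((T[-1], T[-1]))                        # reflexive deadlock state
    A, N, Posts = set(), {}, {}
    for t, snap in zip(T, snapshots):
        A |= set(snap['agents'])                   # nodes become agents
        for (a, b) in snap['follows']:             # follower relation at t
            N.setdefault((a, t), set()).add(b)
        for (a, p) in snap['posts']:               # profile content at t
            Posts.setdefault((a, t), set()).add(p)
    return (A, set(T), rel, N, valuation, Posts)
```

As in the algorithm, the whole pass is linear in the number of agents, connections and posts across the snapshots.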

Conclusion and future directions
In this paper, we explored social bot detection as a model checking problem. We represented social networks as logical models and characterized bot behavior as formulas; checking whether a formula is satisfied in the model then tells us whether to expect bots in the network. We first presented examples of bot behavior based on empirical literature and existing bot detectors. Then, we reviewed related work before we introduced Temporal Network Logic and showed some of the network and agent properties we can formalize in the language. We also showed that TNL is sound and weakly complete. Then, we presented bot behavior formulas corresponding to the previously assessed properties of social bots. We gave a model checking algorithm for finite fragments of TNL models and showed that it runs in polynomial time. We also showed a reduction of the model checking problem for TNL to the model checking problem for linear-time temporal logic with past, and discussed the complexity of model checking for bot detecting through this translation. We then presented an extension of TNL, named TNL_a, in which we explored the expressive power of adding hybrid elements to the language and evaluating formulas not only at time points, but also at agents in the network. We introduced a model checking algorithm for finite fragments of TNL_a and showed that its complexity is in pspace. The paper ended with a proposal for how to build finite fragments of TNL models from real social network data.
Completeness for TNL_a is still an open problem. In future work, we want to use the axiomatic system for TNL as well as the literature on indexed frames (Balbiani & Fernández González, 2020, 2021; Fernández González, 2021) to provide a complete axiomatization of TNL_a.
Temporal network models keep the whole social network through the entire timeline. As mentioned by one of the reviewers, an alternative to TNL models is a framework that defines an initial model and only tracks the necessary changes. We believe this could be done in a Dynamic Epistemic Logic-style setting by using event models to track changes from a given outset. It will be interesting to study the model checking problem for the language of TNL on this type of model. A proper exploration of this idea is left to future work.
We also hope to empirically evaluate our detection method and compare its efficiency with standard machine learning methods. The evaluation can be done in at least one of two ways: either by implementing the model checking algorithms for finite fragments of TNL and TNL_a models, or by using an LTL model checker such as SPIN (Holzmann, 1997), LTSmin (Kant et al., 2015) or NuSMV (Cimatti et al., 2002). One of the challenges is to substitute human input for the natural language and fact checking tasks.

Theorem 1 TNL is sound and weakly complete with respect to temporal network models.

Proof The proof closely follows the completeness proof for DX by Gabbay et al. (1980), with details from standard completeness proofs for various tense logics (Blackburn et al., 2001; Venema, 2001). Soundness is established by showing validity of the axioms and that the rules preserve validity; we omit the details of the soundness proof.
A classic proposition in modal logic states that a logic L is weakly complete with respect to a class of frames F if and only if every L-consistent formula is satisfiable on a frame F ∈ F (Blackburn et al., 2001). Let φ be an arbitrary consistent formula. We want to show that φ is satisfiable on some TNL frame. We define the canonical model M^TNL to be identical to the canonical model S by Gabbay et al. (1980), except for the added relation R^TNL_H and for the fact that the valuation function in S only takes standard atoms as input. Using the axioms (GP) and (HF) we can prove that for any Γ, Δ ∈ W^TNL: Γ < Δ iff Δ > Γ. This is a standard proof in tense logic (Blackburn et al., 2001). The result implies that R^TNL_G and R^TNL_H determine the same relation, and we only need to define R^TNL_G and R^TNL_X in the canonical model. Therefore, the frame structure of M^TNL is now the same as that of S. From here on, the proof follows Gabbay et al. (1980). It is shown that for every Γ ∈ W^TNL: R^TNL_X(Γ) ∈ W^TNL, and that for all Γ, Δ, Λ ∈ W^TNL: if Γ < Δ, Γ < Λ and Δ ≠ Λ, then Δ < Λ or Λ < Δ. It is also shown that if Fψ ∈ Γ, then for some Δ: Γ < Δ and ψ ∈ Δ. The canonical relation R^TNL_G on the set W^TNL is therefore a linear order consistent with F, but it is not irreflexive. We will use the closure of a formula to build a finitely representable model that represents the natural numbers and their ordering. Proving completeness by constructing a model based on the closure of a formula is also used in weak completeness proofs for other non-compact logics, such as S5 with common knowledge (van Ditmarsch et al., 2008) and propositional dynamic logic (Blackburn et al., 2001).
Then, we define our final model. Let s_0 be an element of W^TNL such that φ ∈ s_0. Let S_n be a sequence of states from W^TNL such that for any t ∈ W^TNL, if t = s_i for infinitely many i, then every u such that (t, u) ∈ ρ_X is also equal to s_i for infinitely many i. That is, if a state appears infinitely many times in the sequence, then so does each of its successors. Our final model is the sequence S_n with the valuation function V'^TNL, which is defined as V^TNL but with respect to the elements of the sequence instead of W^TNL. S_n is a strict linear order with no end. A truth lemma is proved by induction by Gabbay et al. (1980): in the language of DX, for any ψ ∈ cl(φ): (S_n, V'^TNL), s_n ⊨ ψ iff ψ ∈ s_n. For the language of TNL, we must make sure the proof also holds for the past operator and the follow- and posted-formulas. For formulas of the form follow(a, b) and posted(a, ω), the induction proof can be extended with two additional separate base cases. We prove the case where ψ is Pχ.
(⇐) Suppose that Pχ ∈ s_n. s_n = Γ ∩ cl(φ) for some Γ ∈ W^TNL, and therefore Pχ ∈ Γ. As mentioned earlier in the proof, Gabbay et al. (1980) show that for any ψ and any Γ ∈ W^TNL: if Fψ ∈ Γ, then for some Δ: Γ < Δ and ψ ∈ Δ. We assume that, by the same reasoning, for any ψ and any Γ ∈ W^TNL: if Pψ ∈ Γ, then for some Δ: Δ < Γ and ψ ∈ Δ. This step is common in standard completeness proofs for tense logics; see for instance de Jongh et al. (2004). It follows that for some Δ: Δ < Γ and χ ∈ Δ. Since cl(φ) is closed under subformulas and Pχ ∈ cl(φ), we have that χ ∈ cl(φ). Thus, χ ∈ Δ ∩ cl(φ). By the induction hypothesis, (S_n, V'^TNL), Δ ∩ cl(φ) ⊨ χ. Hence, (S_n, V'^TNL), s_n ⊨ Pχ.
We have now proved that the truth lemma also holds for formulas in L_TNL. Since φ ∈ s_0, it follows that φ is satisfied in (S_n, V'^TNL), and we can build a TNL model from this sequence together with its valuation. □

Fig. 1 A temporal network model M

3. Agent b starts to follow agent a back: P follow(a, b) ∧ ¬P follow(b, a) ∧ follow(b, a)
4. Agent a posted an original post ω, i.e. they are the first in the network to post ω: posted(a, ω) ∧ ⋀_{b∈A} ¬P posted(b, ω)
5. At the current time, agent a is a lurker, an agent in the network that only observes (follows at least one other agent) but has not yet posted: ⋁_{b∈A} P follow(a, b) ∧ ⋀_{ω∈Ω} (¬P posted(a, ω) ∧ ¬posted(a, ω))

10. (MP) If φ and φ → ψ then ψ
11. (US) If φ then φσ, for σ a substitution
12. (TG_G) If φ then Gφ
13. (TG_H) If φ then Hφ

(HT) ⋁_{a∈A} posted(a, p_alot)

Let start_follow(a, b) := ¬follow(a, b) ∧ X follow(a, b) and define the formula AggU:
⋁_{a∈A} ⋁_{b_1,...,b_m ∈ A, ∀n≠k: b_n≠b_k} start_follow(a, b_1) ∧ F¬follow(a, b_1)

To construct sat, each visit of line 4 checks |V| times, each visit of line 5 checks |N| times, and each visit of line 6 checks |Posts| times. Since V depends on T, and N and Posts depend on A × T, the complexity of checking lines 4, 5 and 6 is O(|A| × |T|). In the worst case, one would need to check every distinct node in the parse tree of φ exactly once. The search through the parse tree is depth-first and has complexity O(|φ|). The complexity of the whole algorithm is therefore O(|φ| × |A| × |T|).

The canonical model M^TNL = (W^TNL, R^TNL_G, R^TNL_H, R^TNL_X, V^TNL), where:
• W^TNL is the set of all maximal consistent sets of formulas in TNL;
• R^TNL_G is a binary relation on W^TNL such that (Γ, Δ) ∈ R^TNL_G iff for all ψ: Gψ ∈ Γ implies ψ ∈ Δ;
• R^TNL_H is a binary relation on W^TNL such that (Γ, Δ) ∈ R^TNL_H iff for all ψ: Hψ ∈ Γ implies ψ ∈ Δ;
• R^TNL_X is a function from W^TNL to the power set of L_TNL defined such that R^TNL_X(Γ) = {ψ | Xψ ∈ Γ};
• V^TNL is the valuation defined by
  - V^TNL(p) = {Γ ∈ W^TNL | p ∈ Γ},
  - V^TNL(follow(a, b)) = {Γ ∈ W^TNL | follow(a, b) ∈ Γ},
  - V^TNL(posted(a, ω)) = {Γ ∈ W^TNL | posted(a, ω) ∈ Γ}.
For simplicity, we write Γ < Δ if (Γ, Δ) ∈ R^TNL_G and Γ > Δ if (Γ, Δ) ∈ R^TNL_H.

Algorithm 9 Function SAT_F(φ) determining the set of time points that satisfy Fφ
Input: Finite fragment M_f = (A, T, ∼_f, N, V, Posts), formula φ ∈ L_TNL
Output: The set sat_F of time points in M_f where Fφ is satisfied
1: sat_F := ∅;
2: Y := pre(SAT(φ));
3: repeat until Z = sat_F
4: Z := sat_F;
5: sat_F := Y ∪ pre(sat_F);
6: end
7: return sat_F

Algorithm 10 Function SAT_X(φ) determining the set of time points that satisfy Xφ
Input: Finite fragment M_f = (A, T, ∼_f, N, V, Posts), formula φ ∈ L_TNL
Output: The set sat_X of time points in M_f where Xφ is satisfied
1: X := SAT(φ);
2: sat_X := pre(X);
3: return sat_X
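Algorithms 9 and 10 can be sketched over a finite timeline as follows. Here pre maps a set of time points to their immediate predecessors, and the recursive call SAT(φ) is replaced by a precomputed set of time points where φ holds; this is an illustrative Python rendering under those assumptions, not the paper's implementation.

```python
# Illustrative sketch of Algorithms 9 and 10 on a finite timeline 0..T-1.
# 'sat_phi' stands in for SAT(phi): the set of time points where phi holds.
T = 6

def pre(s):
    """Immediate predecessors of the time points in s (assumed helper)."""
    return {t - 1 for t in s if t - 1 >= 0}

def sat_F(sat_phi):
    """Algorithm 9: least fixpoint of sat_F := Y ∪ pre(sat_F)."""
    sat_f = set()
    y = pre(sat_phi)
    while True:
        z = set(sat_f)
        sat_f = y | pre(sat_f)
        if z == sat_f:
            return sat_f

def sat_X(sat_phi):
    """Algorithm 10: time points whose immediate successor satisfies phi."""
    return pre(sat_phi)

print(sat_F({4}))  # {0, 1, 2, 3}: every strictly earlier point satisfies F phi
print(sat_X({4}))  # {3}
```

The fixpoint loop stabilizes after at most |T| iterations, consistent with the polynomial bound discussed above.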