Empowering the Peer Group to Prevent School Bullying in Kosovo: Effectiveness of a Short and Ultra-Short Version of the ViSC Social Competence Program

Evidence-based anti-bullying programs are predominantly implemented in high-income countries, although there is a clear need for bullying prevention also in low- and middle-income countries. The present study reports the effectiveness of a short and ultra-short version of the ViSC Social Competence Program that was implemented in nine Kosovar schools. The ViSC program aims to empower adolescents to recognize bullying and to intervene in bullying situations. A quasi-experimental longitudinal control group design was realized to examine the effectiveness of the two program versions regarding different forms of self-reported perpetration and victimization. The short program version was implemented in 10 classes (N = 282, 52% girls, Mage = 13.45), the ultra-short program version was implemented in 13 classes (N = 354, 46% girls, Mage = 13.28), and 23 classes (N = 613, 50% girls, Mage = 13.31) served as control group. Multilevel growth models revealed intervention effects in favor of the ultra-short version when compared to the control group regarding physical victimization. All other effects were not significant. To conclude, educational and social policies supporting the implementation of evidence-based anti-bullying programs need to be issued in low- and middle-income countries, as even ultra-short versions might be effective in contexts with limited available resources.

There is ample evidence that bystanders matter in bullying situations, implying that bullying is a group phenomenon (Lambe et al. 2018). Research demonstrates that reinforcing bullies and not defending victims are associated with higher levels of bullying at class level. Therefore, several anti-bullying programs aim to raise children's awareness of the importance of bystander behaviors and try to change the peer norms regarding bullying (Polanin et al. 2012). Anti-bullying programs usually consist of a large variety of measures; therefore, their implementation requires extensive amounts of resources that are not readily available in low- and middle-income countries (Sivaraman et al. 2018). Kosovo is a lower middle-income country in which anti-bullying programs have not been implemented yet, although there is a clear need for evidence-based prevention because prevalence rates of school violence and bullying are high (Xërxa et al. 2014). For the purpose of the present study, (1) a short and ultra-short version of the Viennese Social Competence (ViSC) Program was developed, and (2) the effectiveness of these two program versions regarding the reduction of five different forms of perpetration and victimization was compared within a longitudinal randomized control study.

Why Bystander Behavior Matters in Bullying Situations
Peers are usually present when bullying happens (Hawkins et al. 2001) and they take different participant roles during the bullying process (Salmivalli et al. 1996). A bullying episode consists of youth who are actively involved in intentionally harming a peer who cannot easily defend himself or herself, youth who try to defend their bullied peer, and youth who do not recognize the incident or who ignore it (Pöyhönen et al. 2012). Research shows that prosocial bystanders might take several actions to stop bullying, ranging from active interventions during the bullying episode to taking the side of the bullied peer or reporting the incident to teachers or other adults (Espelage et al. 2011; Pozzoli and Gini 2010; Salmivalli 2010).
In line with the socio-ecological theory of human development (Bronfenbrenner 1979), research demonstrates that individual and class characteristics are associated with different types of bystander behaviors during bullying episodes (Pozzoli et al. 2012; Salmivalli et al. 2011; Kollerová et al. 2018). Systematic reviews revealed that on the individual level, being a girl, having high empathy, low moral disengagement, high social self-efficacy and high peer popularity are associated with defending a bullied peer (Lambe et al. 2018; van Noorden et al. 2014). On the class level, class pro-victim attitudes (Pozzoli et al. 2012), peer injunctive norms (Kollerová et al. 2018), and perceived supportive relationships with teachers (Lambe et al. 2018) are associated with higher levels of defending. Being defended matters for youth who are bullied, because they are more likely to escape from an escalating circle of bullying (Hawkins et al. 2001; O'Connell et al. 1999) and they are less likely to suffer from mental health problems compared to their non-defended peers (Juvonen et al. 2016; Sainio et al. 2010). Defending also matters on the class level, because in classes where defending is supported by descriptive and injunctive norms, bullying rates are lower (Pozzoli et al. 2012; Salmivalli et al. 2011; Thornberg and Wänström 2018).

Empowering the Peer Group to Intervene in Bullying Situations
Encouraging defending and empowering the peer group to take an active stance against bullying has been found to be an effective component to reduce bullying across various bullying prevention programs (Polanin et al. 2012). Salmivalli (2010) pointed out that group processes like not reinforcing bullying behavior and not assigning high status to youth who bully others need to be tackled by anti-bullying programs to reduce bullying and victimization. Socio-ecological whole school anti-bullying programs acknowledge that bullying is a complex systemic problem; therefore, such programs usually aim to change several processes on the school, class, and individual level (Kärnä et al. 2011). A meta-analysis comprising 44 studies showed that school-based anti-bullying programs are on average effective in decreasing bullying by 20-23% and victimization by 17-20% (Ttofi and Farrington 2011). The same meta-analysis also showed that programs that are more intensive were more effective, as were programs including parent meetings, disciplinary methods, and improved playground supervision. A discrete choice experiment examining 11 hypothetical anti-bullying program attributes demonstrated that the number of implemented anti-bullying measures in a school moderates the decisions to intervene (Cunningham et al. 2019). Analyses revealed that 28.7% of the participating students in grades 5 to 8 would intervene if the school implemented intensive anti-bullying measures like daily program activities, a high number of playground supervisions, mandatory reporting, and suspensions for perpetrators. Interestingly, a smaller group of students (10.3%) thought that their decisions to intervene would be most motivated in schools with minimal interventions.
The ViSC program (Strohmeier et al. 2012) is a socio-ecological anti-bullying program suitable for adolescents aged 11 to 14 years that aims to empower the whole school to take actions against bullying. The original program consists of a large number of measures on the school, class and individual level (for details see Strohmeier et al. 2012). On the class level, the program trains students to feel responsible when bullying is going on and to react in a way that is likely to improve the situation. Students are also trained to recognize their own emotions and the emotions of others and to cope with these emotions in a positive, non-aggressive way. Furthermore, students are trained how to best react when being bullied by others. Previous studies that were conducted in Austria demonstrate that the ViSC program is effective in reducing victimization (Yanagida et al. 2019); however, a school-wide program implementation also requires an extensive amount of resources (Schultes et al. 2014) that are not readily available in low- and middle-income countries (Sivaraman et al. 2018). Therefore, it was necessary to shorten the ViSC program to be able to implement it in Kosovar schools.

Bullying Prevalence in Low- and Middle-Income Countries
As Sivaraman et al. (2018) point out, research on school bullying predominantly originates from high-income countries (HICs), although 80% of school-aged children and adolescents live in low- and middle-income countries (LMICs) where this age group also represents a proportionally larger section of the general population compared to HICs (Blum et al. 2012). LMICs are classified on the basis of their gross national income (GNI) per capita, an indicator that is publicly disclosed by the World Bank every year. Fleming and Jacobsen (2010) examined the prevalence of victimization in middle school students in 19 low- and middle-income countries (Kosovo was not included) and explored the relationship of victimization and mental health. The prevalence rate for being victimized at least once during the last month was 34.2%. Large differences between LMICs were found, however, with rates ranging from 7.8% (Tajikistan) to 60.9% (Zambia). Nevertheless, in all LMICs, victimized students reported higher levels of sadness and hopelessness, loneliness, insomnia and suicidal ideation as well as higher rates of tobacco use, alcohol use, drug use and sexual intercourse as compared to non-victimized students, indicating that victimization is an important mental health risk also in LMICs (Fleming and Jacobsen 2010).
For the situation in Kosovo, only limited data are available. According to the 2014 HBSC study (Xërxa et al. 2014), 23.8% of 11- to 15-year-old adolescents reported being involved in a physical fight, 20.3% reported having bullied others, and 24.3% reported being bullied by others at least once during the last 12 months. In line with international findings, boys were more often involved in these behaviors compared to girls and prevalence rates were highest for 13-year-olds compared to 11- and 15-year-olds. A literature review comprising 17 research reports on school violence conducted in Kosovo since 1999 (Arënliu et al. 2019) documents the presence of all forms of school violence in Kosovo including physical, psychological and sexual aggression, bullying, cyberbullying, and weapon carrying. It should be mentioned that this review found mainly papers and reports that focused on school violence rather than on bullying, which is characterized by repetition and power imbalance. Depending on selected samples and methodology, the prevalence rates of different forms of violence varied widely. Prevalence rates for the last 12 months or the last school year for physical violence ranged between 25% and 38%, for psychological violence the rates ranged between 14.2% and 53.4%, for sexual violence the rates ranged between 1.8% and 5.8%, and for cyberbullying the rates ranged between 16.1% and 21.8%.

Challenges of Implementing the ViSC Program in Kosovar Schools
During the last decades, the enrollment of children in school in LMICs has increased (Bundy et al. 2017) and therefore schools can be used to address the social and mental health of students more systematically. However, most of the schools and teachers in LMICs struggle to cope with basic infrastructural problems and are therefore less able to foster the socioemotional development of their students. Addressing the social and mental health needs of students is also often associated with stigma that prevents schools from using the limited services or community resources that might exist in LMICs.
Kosovo experienced long periods of political instability and violence, especially since 1989, which culminated in a war in 1999. In the same year, the NATO bombings and subsequent troop placement in Kosovo put an end to this situation and approximately 900,000 refugees returned to their homes in Kosovo. In 1999, a UN protectorate, secured by a multinational force, was established in Kosovo (NATO, July 15, 1999). On February 17, 2008, Kosovo declared its independence under a new government.
Since 1999, the educational system in Kosovo has been in a constant reform process, with increasing investments in the infrastructure and the training of teachers (Ministry of Education 2015). However, the Kosovar education system still struggles with a high number of major challenges. For example, because of lacking infrastructure, schools usually work in two shifts, and 42% of primary and secondary school teachers teach courses for which they have no proper credentials (Ministry of Education 2015, p. 24). Furthermore, the PISA assessment (OECD 2018) evidenced very poor results for the Kosovar students, who showed the third lowest performance of all participating countries. Not surprisingly, interventions targeting school violence and bullying in Kosovo are limited and evidence-based approaches are lacking, as they are not considered a priority by policymakers, school managers and teachers. Schools also lack school psychologists and counselors who could promote the implementation of school violence and bullying prevention programs. Lastly, schools in Kosovo have no tradition of implementing structured mental health or socio-emotional interventions including bullying or violence prevention programs.
This lack of evidence-based anti-bullying programs was also documented for other LMICs. Sivaraman et al. (2018) conducted a systematic review to locate evidence-based anti-bullying programs that were implemented in LMICs between January 1987 and June 2016. Over a period of 29 years, only three reports were eligible to be included in the systematic review. Importantly, one of these three eligible studies reported the evaluation results of the ViSC-REBE program that was implemented in Romania in 2012/13 (Trip et al. 2015). The ViSC program was also implemented in Turkey in 2015/16 (Dogan et al. 2017). Thus, to date, the ViSC program is the only evidence-based program that has already been implemented in two LMICs, Romania and Turkey. This was one important reason why this specific program was also implemented in Kosovo; in addition, a small grant made a pilot implementation feasible.

Developing a Short and Ultra-Short Version of the ViSC Program
Given the many challenges of the Kosovar educational system, it was necessary to adapt the ViSC program to be able to implement it in Kosovar schools. The adaptations were guided primarily by available resources, but also by theoretical considerations to overcome several practical obstacles. In Kosovo, the Austrian program developers trained volunteer assistants who were undergraduate students of the University of Prishtina instead of training professional multipliers (e.g., school psychologists, school social workers). For each intervention class, a team consisting of two volunteer assistants then implemented a 4- or 6-unit class project by directly working with the students in each class over a period of 4 or 6 weeks, instead of implementing a whole school program consisting of teacher trainings, parent meetings, and a 13-unit class project over a period of one school year (see Strohmeier et al. 2012). Focusing on the class level and implementing the four or six units within 4 or 6 weeks was necessary so as not to overburden the schools and the volunteer assistants.
The decision to compare a short (six units) versus ultra-short (four units) version of the ViSC program was guided by theoretical considerations, because responsibility is passed over to the students between unit 4 and unit 5. The comparison also has practical implications, since the four-unit intervention requires fewer resources to be implemented than the six-unit intervention. During units 1 to 4, the students are trained in various social skills (e.g., recognizing bullying, empathy), while during units 5 and 6, the group puts a small self-chosen project into practice (see Table 1). The role of the volunteer assistants during units 1 to 4 is to work with the materials provided by the ViSC program manual. In contrast, during unit 5, the class is assigned to find a common, positive, and realistic activity that can be carried out together during unit 6. The role of the volunteer assistants in these units is to create a group process that enables cooperative learning and the experience of a common success.

Previous Evaluation Studies of the ViSC Program
To date, the ViSC program has been implemented in Austria, Cyprus, Romania, and Turkey. In Austria, the effectiveness of the ViSC program has been demonstrated in various studies and it has been shown that the program is effective in reducing victimization, cyber-victimization, and cyberbullying (Gollwitzer et al. 2006, 2007; Gradinger et al. 2015, 2016; Yanagida et al. 2019).
In Cyprus, evaluation results revealed that seventh-grade students profited more from the program compared to eighth-grade students (Solomontos-Kountouri et al. 2016). In Romania, where only the class project was implemented by external research assistants and no teacher trainings were provided, no intervention effects on victimization and bullying were found but reductions regarding dysfunctional cognitions and emotions were observed (Trip et al. 2015). In Turkey, evaluation results demonstrated that perpetration and victimization increased in the intervention group compared to the control group between pre- and posttest, but also decreased between posttest and follow-up, indicating a sensitizing effect of the program (Dogan et al. 2017).

The Present Study
The present study is the first investigation of the effectiveness of a short and ultra-short version of the ViSC program that was implemented in Kosovo. The ViSC program was chosen to be implemented in Kosovo because it had already been successfully implemented in four other countries including two LMICs. Moreover, the current implementation was feasible given the available resources. The program was implemented in seventh- and eighth-grade classes. These grade levels were chosen because prevalence rates of bullying peak for these age groups in Kosovo (Xërxa et al. 2014). Five different forms of perpetration and victimization were differentiated in the present study, because perpetration and victimization might include physical attacks, verbal insults, or relational harassments (Olweus 1993) as well as offenses via electronic means (Smith et al. 2008). These five forms of perpetration and victimization were also differentiated and compared in previous ViSC evaluation studies (Dogan et al. 2017; Solomontos-Kountouri et al. 2016).

Table 1 (partial)
What can we do if we are treated in a mean and unfair way by others? What is the best thing to do in such situations, and why?
Reflection, Unit 5: What have we learned during the project so far, and what do we want to learn in the remaining units? Which common activity do we want to carry out during our project day? How can we plan and organize the common activity in a way that every classmate is able to make a valuable contribution?
Action, Unit 6: Carrying out the common activity by creating a process that leads to the experience of a common success.

We formulated the following two main hypotheses:
Hypothesis 1 We expected that both the short and the ultra-short program version would be effective. Program effectiveness is indicated by a steeper decrease or lower increase in the different forms of perpetration and victimization when comparing the two intervention groups to the control group.
We also expected differential training effects depending on the form of perpetration and victimization. We expected that the two program versions are more effective regarding direct (e.g., physical) forms of perpetration and victimization compared to more indirect (e.g., relational, cyber) forms, because direct behavior is more visible and can be disapproved more easily by the peer group compared to indirect behavior.
Hypothesis 2 We expected that the six-unit class project (short version) would be more effective than the four-unit class project (ultra-short version). This hypothesis was formulated because the exposure to the training content is more intense in the short compared to the ultra-short version. Because programs with higher intensity and longer duration for children were found to be more effective in a meta-analysis (Ttofi and Farrington 2011), it is reasonable to assume that the six-unit version leads to better training outcomes compared to the four-unit version.

Program Implementation
Seven school psychologists employed in nine secondary schools supported the implementation of the program by facilitating the communication with the schools. These seven school psychologists were identified by the first author, who sent emails to fourteen school psychologists and asked for their interest in the intervention. To be able to implement the program, permission was obtained from the relevant municipal education authorities located in Prishtina, Lipjan, and Podujevë, where the seven school psychologists were employed. School psychologists, in collaboration with the school administration and two undergraduate volunteer assistants, randomly assigned classrooms to the intervention and control groups.
A "ViSC Train-the-Trainer Seminar" was organized at the University of Prishtina and 46 undergraduate volunteer assistants (two persons per class) were certified as ViSC coaches shortly before program implementation. The seven school psychologists also attended the training. The training (18 h, October 2017) focused on the definition and recognition of bullying; tackling acute bullying incidents; and preventive measures at the class level including detailed instructions on how to implement the ultra-short versus short program version.
To monitor the implementation fidelity, the volunteer assistants filled in reflection sheets after each session. These reflection sheets contained three parts. First, there were three questions regarding the didactical process, the use of the work sheets, and of the games. Second, there were two questions regarding the aspects that went well and those that did not go so well. Third, there was an overall evaluation of the unit, and there were four items regarding the behavior of the students during the units. Descriptive analyses of these sheets revealed that the four-unit intervention was implemented in 13 classes, whereas 10 other classes received the six-unit intervention. In total, 112 training sessions were implemented. In the majority of classrooms, interventions were conducted without major modifications. The minor modifications included not using the individual worksheet (during one session) and not using the group worksheets (during three sessions). Alongside the planned materials, volunteer assistants used flipchart papers, coloring markers, small rectangular pieces of paper, and finger-paints. In one implementation session, students did not want to fill in any worksheets; therefore, the volunteer assistants improvised the same activities without using the worksheets. During 23 of the 112 implemented units, the volunteer assistants reported being disadvantaged by the large number of students present in classrooms (more than 35 pupils per classroom) and the shortage of time to implement the intervention units. Further analyses revealed that these 23 problematic units were delivered in four classes. It is important to understand that overcrowded classrooms are common in Kosovar schools because of the poor infrastructure.
Thus, these kinds of obstacles are typical in schools of many low-income countries and need to be considered when implementing anti-bullying programs that were developed in high-income countries characterized by a smaller number of students and more spacious classrooms.

Study Design and Procedure
A quasi-experimental longitudinal control group design with two measurement points was used in nine schools. The 23 intervention and 23 control classes were randomly chosen from these nine schools, and classrooms were randomly assigned to the intervention and control groups. The ultra-short program version (four units) was implemented in 7 seventh-grade and 6 eighth-grade classes. The short program version (six units) was implemented in 5 seventh-grade and 5 eighth-grade classes. Data were collected at two time points: pretest (beginning of November 2017) and posttest (end of December 2017).

Participants
In total, 1249 students (354 in four-unit intervention, 282 in six-unit intervention, 613 in control group) nested in 46 classes (13 in four-unit intervention, 10 in six-unit intervention, 23 in control group) and 9 schools participated in at least one measurement and were included. At wave 1 (pretest), the sample comprised 1053 students (49.28% girls) with a mean age of 13.33 years (SD = 0.70, Min = 10, Max = 15). Table 2 provides a description of the sample. The groups were compared regarding demographic characteristics using a series of Pearson's chi-square tests and a one-way analysis of variance (ANOVA). There were no statistically significant results, except for age, F (2, 977) = 4.13, p = 0.016. Tukey post hoc tests revealed that students participating in the short program version (six units) were slightly older compared to students participating in the ultra-short program version (four units) and the control group.

Missing Data
In total, 610 records (177 in the four-unit intervention, 143 in the six-unit intervention, 290 in the control group) were incomplete. A total of 387 students showed wave non-response: 196 students were missing at wave 1 and 191 students were missing at wave 2. Because active parental consent was obtained for all students, missing consent was not a cause of wave non-response; instead, these students missed data collection for one of three reasons: they (1) were absent on the day of data collection, (2) completed the questionnaire in an invalid way, or (3) had moved to another school. The remaining 223 students had a general missing data pattern with omitted items on single scales. The percentage of missing values across the 92 variables varied between 15.69% and 17.61%.
A series of two-sample Welch t tests with Bonferroni-Holm correction for multiple comparisons was conducted to compare students with complete and incomplete data. There were no statistically significant differences between students with complete data and students with missing values (effect sizes ranged from d = 0.00 to d = 0.22), indicating that missing data are not systematically related to the study variables. Full information maximum likelihood (FIML) under the MAR assumption was used to deal with missing data (Enders 2010).
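The counts reported in the Participants and Missing Data sections can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis; the variable names are illustrative, and only the numbers are taken from the text.

```python
# Counts as reported in the Participants and Missing Data sections.
# Dictionary keys and variable names are illustrative, not from the study.
groups = {
    "four_unit": {"students": 354, "classes": 13, "incomplete": 177},
    "six_unit": {"students": 282, "classes": 10, "incomplete": 143},
    "control": {"students": 613, "classes": 23, "incomplete": 290},
}

total_students = sum(g["students"] for g in groups.values())
total_classes = sum(g["classes"] for g in groups.values())
total_incomplete = sum(g["incomplete"] for g in groups.values())

missing_wave1, missing_wave2 = 196, 191

# Students missing an entire wave, and students with only item-level gaps.
wave_nonresponse = missing_wave1 + missing_wave2
item_nonresponse = total_incomplete - wave_nonresponse

# Students observed at pretest = everyone minus those missing at wave 1.
pretest_n = total_students - missing_wave1

print(total_students, total_classes, total_incomplete)
print(wave_nonresponse, item_nonresponse, pretest_n)
```

The derived values agree with the reported totals of 1249 students, 46 classes, 610 incomplete records, 387 wave non-responders, 223 students with item-level omissions, and 1053 students at pretest.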
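For readers unfamiliar with the Bonferroni-Holm procedure, the step-down logic can be sketched as follows. This is an illustrative implementation only; the function name and the example p values are hypothetical and are not taken from the study.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Step-down Holm procedure.

    Returns a list of booleans in the original order of p_values,
    True where the corresponding null hypothesis is rejected.
    """
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p value is compared with alpha / (m - k + 1).
        threshold = alpha / (m - rank)
        if p_values[idx] <= threshold:
            reject[idx] = True
        else:
            break  # once one test fails, all larger p values fail too
    return reject


# Hypothetical p values, for illustration only.
example_p = [0.001, 0.04, 0.03, 0.20]
decisions = holm_bonferroni(example_p)
print(decisions)
```

Compared with a plain Bonferroni correction, the procedure relaxes the threshold after each rejection, which keeps the family-wise error rate controlled while being somewhat less conservative.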

Measures
Demographic Information Students reported their gender and age, and their parents' marital status, perceived financial situation, working status, and educational level. There were no statistically significant differences between groups.
Bullying Perpetration/Bullying Victimization Both of these scales consist of one global item and three specific items covering different forms (physical, relational, and verbal) of bullying and victimization. In the global item, students were asked "How often have you insulted or hurt other students during the last two months?" and "How often have others insulted or hurt you in the last two months?" The three specific items were similar to the global one, except that they described specific forms of bullying and victimization. Cronbach's α coefficients were .80/.85 (pretest/posttest) for the bullying perpetration scale and .76/.85 (pretest/posttest) for the bullying victimization scale. Prior to filling in the questionnaires, students were given the following explanation: "Sometimes, a student or a group of students hurt or insult other student or students by saying unpleasant things, teasing, hitting, kicking, threatening, or excluding him/her. These or similar events may take place repeatedly towards one or more students. In these events, one student is stronger than the other, and it is hard for the weaker one to defend himself/herself. Please think about such incidents when you answer the following questions."
Cyberbullying/Cyber-Victimization Both of these scales contain one global and seven specific items related to different electronic means based on Smith et al. (2008). The different electronic means were calls, text messages, emails, chat contributions, discussion board, instant messages, and videos or photos. Cronbach's α coefficients were .84/.92 (pretest/posttest) for the cyberbullying scale and .77/.90 (pretest/posttest) for the cyber-victimization scale.
Physical Aggression/Physical Victimization The peer nomination measure developed by Crick and Grotpeter (1995) was modified into a self-report questionnaire and comprised three items, e.g., "How often did you hit one or more classmates?" or "How often have you been hit by one or more classmates?" Cronbach's α coefficients were .79/.84 (pretest/posttest) for the physical aggression scale and .79/.82 (pretest/posttest) for the physical victimization scale.
Relational Aggression/Relational Victimization These five items were also adapted from the peer nomination measure originally developed by Crick and Grotpeter (1995), e.g., "How often did you leave out other kids on purpose when it was time to play or do an activity?" or "How often were you left out on purpose when it was time to play or do an activity by one or more classmates?" Cronbach's α coefficients were .85/.90 (pretest/posttest) for the relational aggression scale and .79/.89 (pretest/posttest) for the relational victimization scale.
Verbal Aggression/Verbal Victimization These three items cover direct and indirect verbal harassments (Strohmeier et al. 2013), e.g., "How often did you say mean or hurtful things to other classmates?" or "How often did other classmates make fun of you?" Cronbach's α coefficients were .76/.67 (pretest/posttest) for the verbal aggression scale and .84/.84 (pretest/posttest) for the verbal victimization scale.
These self-report scales cover specific aggressive behavior and victimization incidents during the last 2 months. Answers to all questions were given on a five-point response scale consisting of the following answer options: 0 (never), 1 (once or twice), 2 (two or three times a month), 3 (once a week), 4 (nearly every day).
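The Cronbach's α coefficients reported for each scale follow the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of the sum score). The sketch below illustrates the computation for a three-item scale on the 0 to 4 response format; the response matrix is hypothetical and the function name is illustrative, so the resulting value does not correspond to any coefficient reported above.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete data.

    rows: list of respondents, each a list of k item responses.
    """
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the sum score across respondents.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses of five students to three items (0 = never ... 4 = nearly every day).
responses = [
    [0, 0, 1],
    [1, 1, 1],
    [2, 1, 2],
    [4, 3, 4],
    [0, 1, 0],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

This version assumes complete responses; with omitted items, as occurred in the study, respondents with missing values would have to be handled before computing α.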
Exactly the same scales had been used when evaluating the program effectiveness in Austria, Cyprus, Romania, and Turkey (for details see Dogan et al. 2017; Solomontos-Kountouri et al. 2016; Trip et al. 2015; Yanagida et al. 2019). Cross-national scalar measurement invariance of these scales was established. A series of confirmatory factor analyses (CFA) was conducted with Mplus Version 8.1 (Muthén and Muthén 1998-2018) using the robust maximum likelihood estimator (MLR) to establish measurement models under strong longitudinal and between-group measurement invariance (see Little 2013) for the five different forms of aggression and victimization. Results yielded acceptable model fit (see Table 3) indicating sound measurement properties for all scales except for cyberbullying, relational aggression, verbal aggression, and cyber-victimization.

Analytic Strategy
Multilevel growth modelling (level 1: time, level 2: student, level 3: class) was conducted in R version 3.5.0 (R Core Team 2018) using the R package lme4 version 1.1-17 (Bates et al. 2016) and the lmerTest package version 3.0-1 (Kuznetsova et al. 2017) to test program effectiveness. Maximum likelihood was used as estimation procedure. This analysis adequately considers the nested data structure, where time is nested in students, and students are nested in classes taking into account the dependencies between observations (i.e., design effect, see Snijders and Bosker 2012).
To investigate our hypotheses, we ran two sets of multilevel growth models. The first set of models was conducted to investigate whether the control group differed from the two intervention groups (Hypothesis 1). Program effectiveness regarding the five perpetration and five victimization scales was investigated based on the following two cross-level interactions: time × control vs. four-unit intervention and time × control vs. six-unit intervention (see Table 5).
The second set of models was conducted to investigate whether the two intervention groups (four-unit vs. six-unit) differed from each other (Hypothesis 2). Intervention type differences regarding the five perpetration and the five victimization scales were investigated based on the following cross-level interaction (see Table 6): time × intervention type (four-unit intervention vs. six-unit intervention).
We also computed standardized estimates for each effect to be able to estimate effect sizes. To compute these standardized estimates, we re-ran all analyses with standardized dependent variables. We added the standardized estimates to the tables only for relevant effects.
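One way the first set of models could be written in composite form is sketched below. This is an illustration, not the specification reported by the authors: it assumes time is coded 0 (pretest) and 1 (posttest), that D4 and D6 are dummy variables marking classes in the four-unit and six-unit intervention (control group as reference), and that the random part consists of a class-level intercept v, a student-level intercept u, and a residual e.

```latex
\begin{aligned}
Y_{tij} ={}& \gamma_{0} + \gamma_{1}\,\mathrm{Time}_{tij}
            + \gamma_{2}\,D^{(4)}_{j} + \gamma_{3}\,D^{(6)}_{j} \\
           &+ \gamma_{4}\,\bigl(\mathrm{Time}_{tij} \times D^{(4)}_{j}\bigr)
            + \gamma_{5}\,\bigl(\mathrm{Time}_{tij} \times D^{(6)}_{j}\bigr) \\
           &+ v_{j} + u_{ij} + e_{tij}
\end{aligned}
```

Under this coding, \gamma_{2} and \gamma_{3} would capture baseline differences from the control group, while \gamma_{4} and \gamma_{5} would be the two cross-level interactions that test program effectiveness. The second set of models would take the same form, restricted to the intervention classes, with a single dummy distinguishing the four-unit from the six-unit intervention.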
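The effect of standardizing the dependent variable can be sketched with a simple regression. The example below uses ordinary least squares on hypothetical scores rather than a multilevel model, and it assumes that only the outcome is z-standardized. In that case, refitting on the standardized outcome gives the raw coefficient divided by the outcome's standard deviation; the same rescaling property applies to fixed effects in linear mixed models.

```python
from statistics import mean, pstdev


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def standardize(values):
    """z-standardize values (mean 0, standard deviation 1)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


# Hypothetical data: time coded 0 (pretest) and 1 (posttest),
# scores on the 0-4 response scale. Not taken from the study.
time = [0, 0, 0, 0, 1, 1, 1, 1]
score = [1.0, 0.5, 2.0, 1.5, 0.5, 0.0, 1.5, 1.0]

b_raw = ols_slope(time, score)               # change in raw scale points
b_std = ols_slope(time, standardize(score))  # change in SD units

print(f"raw estimate: {b_raw:.3f}")
print(f"standardized estimate: {b_std:.3f}")
print(f"raw estimate / SD: {b_raw / pstdev(score):.3f}")
```

The last two printed values coincide, which shows that the standardized estimate expresses the same effect in standard deviation units of the outcome.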

Descriptive Statistics for Outcome Variables
Means and standard deviations of the different perpetration and victimization scales for the three groups are reported by wave of data collection (pre- and posttest) in Table 4. Sample size differs slightly by scale and wave due to missing values. Answers to all questions were given on a five-point response scale with the labels never (0), once or twice (1), two or three times a month (2), once a week (3), and nearly every day (4).

Baseline Effects in Perpetration and Victimization
Neither intervention group differed from the control group on any of the perpetration or victimization outcome variables at pretest (see Table 5). Similarly, when comparing the two intervention groups (four- vs. six-unit intervention) with each other, no baseline effects were detected (see Table 6).

Intervention Effects on Perpetration
Regarding program effectiveness, no intervention effects were found for any of the outcome variables measuring perpetration (see Table 5). As shown in Table 6, results in the two intervention groups did not differ from each other.

Intervention Effects on Victimization
Regarding program effectiveness, an intervention effect was found for physical victimization (see Table 5). More specifically, there was a steeper decrease in physical victimization over time in the four-unit intervention (time × control vs. four-unit intervention, b = −0.122, p = .023) compared with the control group. However, no intervention effects were found for any of the other outcome variables. Likewise, results regarding the two intervention types did not differ from each other (see Table 6).
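To make the meaning of such a time × group coefficient concrete: if time is coded 0 (pretest) and 1 (posttest), which is an assumption here, the interaction corresponds to the difference between the two groups' mean pre-post changes. The sketch below computes that difference for hypothetical scale scores; it ignores the random effects and missing data that the multilevel models handle, and none of the numbers are taken from the study.

```python
from statistics import mean


def mean_change(pre, post):
    """Mean posttest score minus mean pretest score."""
    return mean(post) - mean(pre)


def interaction_estimate(ctrl_pre, ctrl_post, int_pre, int_post):
    """Change in the intervention group minus change in the control group."""
    return mean_change(int_pre, int_post) - mean_change(ctrl_pre, ctrl_post)


# Hypothetical physical victimization scores (0-4 scale), for illustration only.
control_pre = [0.50, 0.25, 0.75, 0.50]
control_post = [0.50, 0.25, 0.50, 0.50]
four_unit_pre = [0.50, 0.75, 0.25, 0.50]
four_unit_post = [0.25, 0.50, 0.25, 0.25]

b = interaction_estimate(control_pre, control_post, four_unit_pre, four_unit_post)

print(f"control change: {mean_change(control_pre, control_post):+.4f}")
print(f"four-unit change: {mean_change(four_unit_pre, four_unit_post):+.4f}")
print(f"time x group estimate: {b:+.4f}")
```

A negative value means the intervention group's scores dropped more than the control group's. Read this way, b = −0.122 would indicate that the decrease in the four-unit group was about 0.12 scale points larger than in the control group.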

Discussion
Most schools and teachers in low- or middle-income countries struggle to cope with basic infrastructural problems and are therefore less able to foster the socio-emotional development of their students. Not surprisingly, evidence-based anti-bullying programs are only rarely implemented in low- or middle-income countries (Sivaraman et al. 2018), mostly because their implementation requires extensive resources that are not readily available. The ViSC program (Strohmeier et al. 2012) is one of the few evidence-based programs that were originally developed and implemented in a high-income country (Austria) but were later adapted and implemented in two low- or middle-income countries (Dogan et al. 2017; Trip et al. 2015).
For the purpose of the present study, a short and an ultra-short version of the ViSC class level component were developed so that the program could be implemented in Kosovar schools. Because of a number of practical obstacles, it was only possible to implement a short (i.e., 6-week) and an ultra-short (i.e., 4-week) program version at the class level, delivered by volunteer assistants, instead of a multilevel whole-school program including all teachers over the span of one school year.
Although it was necessary to shorten the original program considerably, it was still expected that both the short and the ultra-short program version would be effective in reducing different forms of perpetration and victimization. This hypothesis was formulated because both program versions aim to empower the peer group to take an active stance against bullying and to intervene in bullying situations. The importance of the peer group in bullying situations has been investigated in a large number of studies (Lambe et al. 2018), and research demonstrates that reinforcing bullies and not defending victims are associated with higher levels of bullying at class level. Thus, raising youth's awareness of the importance of bystander behaviors and trying to change the peer norms regarding bullying are important components in reducing bullying rates in schools (Polanin et al. 2012).
The analyses revealed one program effect of the ultra-short (i.e., 4-week) class intervention on physical victimization. This finding could indicate that bystanders started expressing their disapproval of bullying either directly (e.g., confronting the perpetrator) or indirectly (e.g., considering bullying as unacceptable), or that the program might have encouraged bystanders to defend their victimized peers, which could have been easiest for them when the harassment was directly observable (i.e., physical aggression). However, this result certainly needs to be replicated, and the hypothesized mechanisms of change need to be measured as well. It is also possible that students are less familiar with project-based work, as frontal teaching is quite common in Kosovo, and this might be the reason why the ultra-short version appeared more effective than the short version. Of course, all these explanations remain highly speculative until this result is replicated in future studies.
Contrary to this one positive finding and to our hypotheses, there were no program effects on any form of perpetration for either the short or the ultra-short program version. Several factors might explain why the two program versions were ineffective in changing different forms of perpetration. To begin with, our analyses revealed that the levels of cyberbullying, cyber victimization, relational aggression, and relational victimization were extremely low at baseline. This restriction of range might be the reason why it was not possible to decrease the mean levels of these variables even further. Likewise, we cannot rule out the occurrence of sensitization effects (e.g., Dogan et al. 2017). Whereas the students might already have been aware of the negative consequences of physical aggression, the program might have stimulated them to perceive and think about other forms of perpetration or victimization they might not have considered important before. Future studies should therefore also include a follow-up data collection in order to disentangle these different effects.
Moreover, the necessary trade-offs regarding program duration and intensity in favor of feasibility and cost-effectiveness might have been too big. Although the short and ultra-short Kosovar program versions included a number of exercises to empower the peer group to recognize and intervene in bullying situations, a four- or six-week implementation period might be too short to cause a sustainable change and to reduce the bullying rates. Furthermore, the status and expertise of the volunteer assistants might have played a role. In both intervention versions, the class project units were delivered by trained undergraduate psychology students, not by teachers. It is possible that external trainers, especially undergraduate students, do not have the same authority as teachers to change dysfunctional class norms. Moreover, external trainers with more practical experience in working with students might have been more effective in bringing about positive change. Another relevant factor might be the management of ongoing bullying situations. In neither intervention group were indicated actions (e.g., talks with bullies, victims, and bully-victims) implemented. It is reasonable to assume that a four- or six-unit preventive class project is not a strong enough measure to deal with an actual bullying case. In a similar vein, bystander behavior might not be sufficient to stop ongoing bullying cases; instead, it might be necessary that teachers or other adults intervene (Rigby and Bauman 2010; Burger et al. 2015). A very similar negative result was found when the original and much longer ViSC class project was implemented in Romania (Trip et al. 2015). Although it might be more resource-intensive, future implementations of the ViSC program in low- or middle-income countries (LMICs) could benefit from wider social and administrative support by moving toward a whole-school approach.
Instead of focusing on the peers in the class, future efforts should strive to get headmasters of the schools, teachers, other school staff including school psychologists, and parents on board with the bullying intervention.

Limitations
Although the present study has a high methodological standard, several limitations should be mentioned. To begin with, we relied on self-assessments only. When applying an intervention program on a large scale (as was the case in the present study), self-report measures are often chosen because they are easy to apply and are reliable, provided that multiple items are used to measure a construct (Yanagida et al. 2019). The strengths and weaknesses of self-report measures in studies about aggressive behavior including bullying have already been discussed extensively in the literature (e.g., Solberg and Olweus 2003). Aggressive behavior is likely underestimated using self-reports because perpetrators might not report the "true" frequency of their behavior. Thus, self-report measures should be interpreted with caution. In addition, our rigorous analyses revealed that the construct validity of four scales was not satisfactory, indicating that future studies need to improve the