Geofencing in location-based behavioral research: Methodology, challenges, and implementation

This manuscript presents a novel geofencing method in behavioral research. Geofencing, built upon geolocation technology, constitutes virtual fences around specific locations. Every time a participant crosses the virtual border around the geofenced area, an event can be triggered on a smartphone, e.g., the participant may be asked to complete a survey. The geofencing method can alleviate the problems of constant location tracking, such as recording sensitive geolocation information and battery drain. In scenarios where locations for geofencing are determined by participants (e.g., home, workplace), no location data need to be transferred to the researcher, so this method can ensure privacy and anonymity. Given the widespread use of smartphones and mobile Internet, geofencing has become a feasible tool in studying human behavior and cognition outside of the laboratory. The method can help advance theoretical and applied psychological science at a new frontier of context-aware research. At the same time, there is a lack of guidance on how and when geofencing can be applied in research. This manuscript aims to fill the gap and ease the adoption of the geofencing method. We describe the current challenges and implementations in geofencing and present three empirical studies in which we evaluated the geofencing method using the Samply application, a tool for mobile experience sampling research. The studies show that sensitivity and precision of geofencing were affected by the type of event, location radius, environment, operating system, and user behavior. Potential implications and recommendations for behavioral research are discussed.

Nowadays, with technological development of the mobile Internet, research is no longer limited to the laboratory, desktop computers, or bulky laptops, but can be conducted anywhere the Internet can reach via smartphones and other small devices.In support of this tendency that goes beyond Webbased research (e.g., Reips, 2008Reips, , 2021)), scientific investigation has been done increasingly via participants' smartphones in recent years (Pinter et al., 2015), and the number of mobile health (mHealth) applications is growing (Agarwal et al., 2016).The expansion of mobile Internet coverage and the fact that people carry smartphones everywhere make it possible to study people's experiences across spatial boundaries and in the midst of activities (Klein & Reips, 2017).The ability to take into account participants' locations is important because many experiences are defined by happening in specific places, such as shopping (in a store) or visiting the ill (in a hospital).Experiences may be changed or shaped by the location where they happen, e.g., singing at a concert hall versus a church.Memory has been shown to be prone to biases and distortions (Shiffman et al., 2008), therefore an ideal time to survey participants is when they are at or have just left a specific location, like, e.g., in exit polls during elections.
Geofencing technology enables location-aware research by defining a virtual border around a geographic area.A smartphone can detect incidents of entering or leaving the geofenced area, and each of these events can trigger an action, which depends on the application that uses geofencing.As we will demonstrate in this article, one such event can be sending a link to a mobile survey, where a participant is asked about the current experience.Other possible applications of the method include the presentation of any study materials that can be put on a website linked to the notification: e.g., interventions or experimental tasks.
Geofencing is based on the geotracking functionality that allows a mobile device to determine its location.
Historically, the first mobile location-based services developed in 2000-2007 derived the user's position from the coordinates of the serving base station, i.e., the Cell-ID protocol (Willaredt, 2011).The use of this technology was limited to point-of-interest (PoI) services, such as finding a nearby restaurant.Later, the Global Positioning System (GPS) was developed and became a standard feature on most smartphones.Some limitations of GPS, such as low signal quality indoors or near large buildings, long acquisition time to obtain the first position, and high battery consumption, have been addressed by the development of hybrid positioning systems, which are a combination of GPS, Wi-Fi, and Cell-ID positioning.For example, using the Secure User Plane (SUPL) protocol, the user's device first obtains the position via Wi-Fi and cell identifiers, and then a more accurate GPS protocol is employed using GPS satellites.Further technological advancement enabled background tracking, which runs as a background process on a smartphone (Küpper et al., 2011).Beyond reliable and valid measures of longitude and latitude, Stieger and Reips (2019) showed that GPS sensors in smartphones can also provide valid measures of altitude.
Geofencing is different from geotracking, which continuously records a user's absolute geographic location at regular intervals, depending on the app's settings (e.g., Geyer et al., 2019).The geotracking method has been widely used in smartphone-based research, but it entails problems such as missing data (Bähr et al., 2020), potential violation of user privacy, and users' concerns about battery drain (Liss et al., 2018).Location tracking technologies have been criticized for their intrusion into private life, sometimes disguised under the mask of security (Zuboff, 2019).Compared to geotracking, geofencing can be perceived as less intrusive.For example, if the location of interest is the participant's home, using geofencing will not expose the home address, but researchers can still send notifications and conduct surveys when people leave or enter the home.Geofencing also allows the smartphone to apply strategies for energy-efficient tracking, as there are algorithms that do not require continuous geotracking (Nakagawa, 2013).Küpper et al. (2011) described possible methods, such as checking the location less frequently or using the built-in accelerometer to activate tracking only during movements.Finally, the smartphone can also use approximate methods, such as Cell-ID protocol, as a first step to determine if a more accurate geotracking method should be used afterwards.
The geofencing method can trigger an event in a mobile application only when some predefined geographical borders are crossed.Moreover, when using geofencing, the absolute coordinates of a participant's location do not need to be sent to the server as geofencing can be handled by the participant's smartphone.This is ideal for situations where a researcher studies a sensitive topic and must not collect location data for privacy reasons.This also offsets the burden of handling sensitive private data for a researcher, such as anonymization of location data (Attwood et al., 2017).In addition, geofencing eliminates the need to analyze largescale multidimensional geotracking data that might consist of thousands of timestamps with coordinates and requires preprocessing and aggregation.

Previous research using geofencing
Because the technique itself is relatively new, research using geofencing has only begun in the last decade.To our knowledge, this article is the first systematic description of the geofencing method and its evaluation in a behavioral research journal.The method has the potential to generate novel theoretical perspectives and practical applications.
Geofencing can provide insights for the ecological approach that studies the structure of contextualized social interactions, the link between knowledge and action, and the situated nature of knowledge (Good, 2007).The use of geofencing is in line with a theoretical shift from studying the psychology of people passively perceiving the environment to the psychology of people acting in the current situation.In social and personality psychology, geofencing can be a valuable tool for studying person-situation interactions.The method can help to characterize individual differences in perception of situations (Rauthmann et al., 2014;Ziegler et al., 2019) and behavior across different locations (e.g., "if... then..." situation-behavior relations in Mischel & Shoda, 1995).Geofencing can benefit research that combines psychological theories with digital movement data.For example, surveys conducted at specific locations, such as sports events or segregated neighborhoods, can explore the sense of identity, group belonging, or perceptions of out-groups (for other examples, see Hinds et al., 2022).Such studies can inform social identity and intergroup contact theories.Researchers can use publicly available databases or maps that provide location information to select potential places of interest and specify survey questions (e.g., the Atlas of Inequality in Moro, 2019).
Previous research underscored that geofencing can improve understanding of behavior in context, with an emphasis on potential applications.Mair et al. (2019) described the importance of incorporating neighborhood and regional contexts into research on alcohol use by combining risk assessments of exposures within specific locations.Risk assessment models can be further updated based on real-time data to generate interventions tailored to a user and delivered at a time of high risk for a lapse (Forman et al., 2019).This makes geofencing a viable tool for clinical and mobile health research.Mobile health research can benefit from including locations in the analysis, as many addictive behaviors, such as alcohol or nicotine consumption, are triggered by specific places.Potential applications for locationbased interventions include a number of health issues, such as physical activity, alcohol use, smoking, obesity, and mental illnesses (in Nahum-Shani et al., 2018).Attwood et al. (2017) used geofencing as part of an alcohol reduction intervention.The mobile app sent messages when participants were at locations where they needed support to regulate alcohol consumption.Wray et al. (2019) examined social and environmental factors of alcohol use by sending surveys to participants at places associated with heavy drinking.About 40% of surveys arrived when participants were not at the intended location, indicating that geofencing technology is still in development.On the other hand, Naughton et al. (2016) concluded that geofencing was reliable and accurate in identifying smoking locations.The authors developed a context-triggered lapse-prevention system, in which supportive messages were sent to smokers who had decided to quit when they were entering the areas associated with smoking.Besoain et al. (2020) used a co-design approach to develop a mobile application UBESAFE that informed users about the potential risk of HIV near areas with a high probability of sexual encounters.An interesting feature of this app was that users could add their own health-promoting messages and share the areas with a community.Coral et al. (2020) proposed using geofences around gambling establishments to discourage their attendance.The authors developed a mobile app that could apply geofencing even if the GPS was turned off.Nguyen et al. (2017) showed the usefulness of geofencing in ascertaining hospitalizations; however the accuracy of geofencing validated by medical records was moderate.Other geofencing applications include dental clinic promotion (Wright et al., 2021), employment research by geofencing at job centers (Haas et al., 2020), and using geofencing for disaster information systems (Suyama & Inoue, 2016).During the COVID-19 pandemic, some mobile applications, such as the luca App developed in Germany (culture4life GmbH, 2021), used geofencing to automatically check people out when they left a location.
Overall, previous research has shown that geofencing can be applied for data collection in the field (see the summary of the existing and potential research applications for geofencing in Table 1).However, the geofencing method has not been systematically evaluated in terms of its sensitivity and precision.With the present manuscript we intend to fill this gap: we report three experimental studies in which we tested the method.Thus, this manuscript can help researchers understand the influence of various factors, such as the type of operating system or location radius, and plan a geofencing study in their research area.Second, we provide the implementation of the geofencing method with a mobile application that allows researchers to conduct their own geofencing studies.In the following, we discuss this implementation in detail.

Implementation of geofencing in Samply
Geofencing itself is performed by the smartphone's operating system.To use geofencing, application developers utilize the software libraries provided by LocationManager Class in Android (Android, 2023) and Core Location Framework in iOS (Apple, 2023).Researchers must either program a mobile application from scratch or use one of the available applications.
In the example implementations described in this article, we will be using the software Samply, which was developed in our iScience research group (Shevchenko et al., 2021).Other researchers can also freely use Samply, without having to program an app or set up a server.Samply supports experience-sampling studies on mobile devices and consists of the Samply website for researchers (https:// samply.unikonst anz.de) and the Samply Research mobile app for participants.Participants can download the app at no cost from the respective Apple or Google app stores1 .Researchers can create and send push notifications containing links to online surveys or other websites2 .Once the participant taps on the notification, the link opens in a mobile web browser.Samply provides schedule-based notifications that allow participants to be notified at certain predefined or random times or at time intervals (Shevchenko & Reips, 2022).
In the current work, we extended the Samply website and mobile application code and built a user-friendly geofencing interface to create and edit geofenced locations on the website for researchers and in the mobile application for participants.Thus, Samply allows both researchers and participants to customize locations, which broadens the software's application scope -a benefit for software (Rough & Quigley, 2020).The events of entering or leaving a location can trigger sending a notification with a web link to the participant.For the locations determined by a researcher, the Samply website requires entering GPS coordinates and defining the radius of the fence.The coordinates, longitude, and latitude can be found in publicly available maps, e.g., Google Maps, Apple Maps, Microsoft Bing Maps, or OpenStreetMaps.Alternatively, locations can be selected through the dragand-drop map interface (see Fig. 1a).Researcher-defined locations can be the same areas for all participants, such as a university or a city center.Participant-defined locations are unique to each participant, such as home or workplace.Participants use the mobile app interface to select locations on the map (see Fig. 1b).The coordinates are stored locally on the smartphone and not shared with the Samply website or the researcher to protect participants' privacy.While the operating system continuously monitors the location in the background, the GPS data is neither stored on the smartphone nor shared with the server.

Factors influencing geofencing
A number of factors can affect the accuracy of geofencing: radius of the geofence, type of mobile operating system and device (Kuhlmann et al., 2020), Wi-Fi access, and type of geofencing event.The common shape of the geofenced area is a circle that is defined by its radius.Although there are possibilities of defining the boundaries in the polygon shape, this functionality is not equally supported in iOS and Android devices.Regarding the radius, previous research has used different sizes, e.g., 10 to 30 m by Wray et al. (2019) or 100 m by Naughton et al. (2016).A radius that is too small can increase the number of misses and reduce the accuracy of geofencing.
The way a smartphone responds to geofencing events depends on the type of mobile operating system -almost all smartphones run either iOS or Android 3 .The iOS documentation specifies 10 m as the smallest possible radius 4 , although anecdotal evidence from Internet forums suggests that the use of the 10-m radius could be problematic.Since iOS14, released in the fall of 2020, there are two types of user location available to applications: precise and approximate.Geofencing requires a precise location and that a user grants permission always to access the location.To decrease the likelihood of a false alarm, the iOS waits for a user to travel a minimum distance over the geofenced area and remain on the same side of the geofence for at least 20 s 5 .The Android documentation recommends that the minimum geofence radius be between 100 and 150 m for best results 6 .Beginning with Android 12, released in the fall of 2021, there is also a distinction between precise and approximate user locations available to an application.With enabled Wi-Fi (even if the smartphone is not connected to a Wi-Fi network), the minimum radius can be between 20 and 50 m.If an indoor positioning system is available, the radius can be as small as 5 m.When Wi-Fi is disabled, the smartphone relies on GPS or the cellular network to determine the location7 .The lack of reliable network connection or the latency of location queries can also affect the accuracy of geofencing.An Android smartphone usually requests the current location every second minute.If the device has been stationary for a significant amount of time, the latency may increase up to 6 min.
The way Android and iOS operating systems implement geofencing represents different approaches to maximizing the efficiency of geofencing without putting too much strain on the smartphone battery.In summary, iOS relies more on distance measures, while Android reduces the frequency of geotracking.These differences may become important for real-world applications, therefore the comparison of the two operating systems was included in our empirical studies.
With regard to similarities between the operating systems, geofencing in both systems still works in the background even when a user closes the mobile application that uses geofencing.The system continues to monitor the registered regions and launches the application in case of an event.Both operating systems impose a limit on the number of geofences that a user can create: 20 for iOS and 100 for Android.When a user uninstalls the mobile application that uses geofencing, the geofencing service is stopped.For iOS, a restart of geofencing in the application is required if the user has disabled Background App Refresh (either for the app or for all apps).For Android, a restart is necessary if the user has rebooted the smartphone, cleared the app's data, cleared Google Play services data, or disabled Android's Network Location Provider.Researchers should thus provide instructions on how to restart geofencing in case participants accidentally disable it (e.g., by rebooting the smartphone).
Previous research reported problems with detecting geofencing events when Wi-Fi was disabled in iOS (Suyama & Inoue, 2016), so we included a test between smartphones with enabled and disabled Wi-Fi in Study 1.In addition, Suyama and Inoue (2016) found differences in the distance to the fence at which users were notified for entering and exiting events in iOS: While the distance for enter events was 20 to 30 m, it was 120 to 130 m for exit events in their study.This motivated us to systematically observe the differences between enter and exit events in our studies.
To evaluate the feasibility of the geofencing implementation, we conducted three empirical studies.By conducting these empirical studies and with this manuscript we aim to address the lack of methodological, technical, and practical guidance on geofencing in behavioral research and help researchers understand the influence of various factors on the results of the method's application.

General method
To evaluate the geofencing method and its implementation in Samply, we analyzed its performance using sensitivity, precision, distance, and time measures.Sensitivity and precision are measures constructed from a confusion matrix (see Table 2).Sensitivity is the probability of receiving the notification given that its corresponding event has occurred.Sensitivity can be calculated as the proportion of correct hits to the sum of correct hits and misses.Precision shows how accurate was the notification given that the notification was sent.Precision can be calculated as the proportion of correct hits to the sum of correct hits and false alarms.The distance and time measures refer to the absolute distance and time difference, respectively, between the location where the notification was received and the geofenced area.
We used different methods and measures in our three empirical studies (see Table 3).The evaluations in Study 1 and Study 2 were conducted by the first author and research assistants, while Study 3 involved naïve participants.In Study 1, we recorded the location of geofencing events using the smartphone GPS tracker available via the mobile Internet browser.The main measures were the geofencing sensitivity and the distance between the center of the geofenced area and the location where the notification was received.In Study 2, we replicated and validated the results from Study 1 by using an external GPS tracker to record the location and timestamp of geofencing events.Study 2 measured sensitivity, the distance between the fence and the location where the notification was received, and the time difference between the two.In both Study 2 and Study 3, sensitivity was modelled as a binary variable (whether or not the notification was received given that the evaluator was at the location), which allowed us to assess the effect of each factor.In Study 3, students at the University of Konstanz received notifications with an online survey when they entered or left the university.Sensitivity and precision were computed based on the number of received notifications for each participant.The data and analysis scripts are available at OSF (https:// osf.io/ mg6f4/).

Study 1
The goal of Study 1 was to assess and calculate sensitivity and the distance to the center in different conditions using the absolute geolocation of testers as a benchmark.

Participants
We evaluated the geofencing method in a series of empirical tests, which were performed by the first author and research assistants.

Procedure
Four evaluators walked with their smartphones along a pre-defined route in three different environments: downtown, residential area, and forest (see Fig. 2).These three The downtown area included locations in the city center characterized by a high density of commercial buildings such as shops and restaurants.The residential area was composed of a mixture of single-family houses and apartment buildings.The forest was a dense collection of trees outside the city.Additionally, we systematically varied the factors such as the type of geofencing event (enter or exit), type of OS (Android, iOS), and Wi-Fi access (turned on or off).Therefore, the testers completed a route in each environment eight times in a fully crossed 2 x 2 x 2 design with two event types, two operating systems, and two Wi-Fi conditions.To evaluate the effect of the geofence radius, there were 15 locations for each route -five locations for each radius of 10, 50, and 100 m (see Fig. 2).In total, there were 360 tests performed (3 environments x 8 repetitions x 15 locations).
During the tests, the evaluators walked along the pathways that went through predefined geofenced areas.They were instructed to carry the smartphone conveniently in their dominant hand and not to use it for other purposes (e.g., texting or web browsing).By carrying the smartphone in the hand, we wanted to minimize the probability of missing the notification if the smartphone was in the pocket or the bag and the evaluators missed the sound or vibration.When they heard the signal of notification arrived or/and felt vibration of the smartphone, they stopped moving, attended to their smartphone, and clicked on the notification, which opened a mobile browser with a one-question survey about the current location.At this moment, the absolute geolocation of the user was automatically recorded in the smartphone's mobile browser.After answering the survey question, the evaluators closed the browser app and continued moving.
We measured the geofencing sensitivity by recording whether the notification was sent for every location the evaluator visited.We also computed the distance between the actual location of the smartphone at the moment Note.Each of the three routes contains 15 locations -five locations for each radius of 10, 50, and 100 m

Materials Smartphones
The four evaluators used their own smartphones: iPhone 11, iPhone 12, Xiaomi Redmi Note 10 Pro, and Samsung Galaxy S8 (see Table 4).The Samply Research app was installed by downloading the app from the respective app stores.Then, the testers joined the "Geofencing test" study and gave the app permission to access the smartphone's location at any time.

Survey
The online survey was created in the lab.js experiment builder (https:// lab.js.org, Henninger et al., 2021) and hosted on the Open Lab platform (https:// open-lab.online, Shevchenko, 2022).The survey included one multiple-choice question about the participant's current location.The possible answers were locations from 1 to 15 (e.g., Location 1, Location 2) and "Other" with an option to specify details.Additionally, we recorded the time and the location coordinates using the mobile browser libraries so that locations were recorded on the same smartphone that was used for testing.We used the mobile browser libraries to access the user's location, because Samply itself does not keep records of absolute geolocation.

Analysis plan
The data recorded in Samply and the survey were matched using the message ID, a unique identifier for each notification passed via a link and recorded in the survey.We computed the sensitivity score and distance to the center of the geofenced area for each experimental condition.Based on the distance, we later calculated the percentage of the notifications sent at different distances.To test the effect of different conditions, we used the general linear model approach: logistic regression to model the sensitivity as the probability to receive a notification and linear regression to model the distance to the center (R Core Team, 2021).Although the data were nested within evaluators, we did not apply a mixed-effects model approach due to the small number of clusters (N = 4).The effect of the environment was of primary interest, given that the absence of Wi-Fi networks in the forest area might impair the geofencing.Turned on Wi-Fi access on the device was expected to have a higher sensitivity than switched off Wi-Fi.We expected a larger distance to the center of the geofenced area for exit events than for enter events.The reason for our expectation was the setup of the experiment.Given a possible time delay between the moment of crossing the fence and notification, people entering the area may be closer to the center of the area than people walking out of the area.We did not expect to see statistically significant differences with regard to the effects of the radius and type of operating system.

Results
We conducted the tests with the mobile app Samply Research between July and November 2021 and obtained 330 records of notifications sent from the server.In 16 of 330 cases, although the notification was delivered, no notification information could be recorded due to the lack of Internet connection, so we treated 16 unidentified records as missing data.Therefore, we used the remaining 314 records to calculate the sensitivity score.
To calculate the distance to the center of the geofenced area, we needed to record the location of the testers.In 46 of 314 cases, it was impossible to record the smartphone location due to an unstable Internet connection8 .Therefore, we used the other 268 records where the location was recorded to calculate the distance between the center of the geofenced area and the actual position of the smartphone.The sensitivity score and distance for each of the experimental conditions can be found in Appendix Table 16.

Sensitivity
On average, the notification was received in 82.50% of all locations9 .To evaluate the effect of each experimental factor on sensitivity, we constructed a logistic regression model.The dependent variable was binary -whether the notification was received or not.As independent variables in the model, we entered the type of environment (downtown, residential area, and forest), Wi-Fi settings (off, on), type of event (enter, exit), operating system of the smartphone (Android, iOS), and radius (10, 50, and 100 m).The results of the logistic regression model are displayed in Table 5.
The baseline condition in the model denotes the enter event in the 10-m area in downtown, on an Android device with turned off Wi-Fi.The sensitivity in the baseline condition was 60%.The effects of other conditions are interpreted in terms of the odds ratio, increasing (odds ratio > 1) or decreasing (odds ratio < 1) the probability to receive the notification in comparison with the baseline condition.
Conducting a study in the forest area decreased the probability to receive the notification, OR = 0.43, 95% CI = 0.19-0.93,p = 0.035.Smartphone operating system turned out to play an important role in sensitivity, with iOS devices showing higher sensitivity than Android devices, OR = 7.61, 95% CI = 3.76-16.59,p < 0.001.Both 50 and 100 m radius of geofenced area had a significantly higher sensitivity compared to 10-m radius, OR = 7.58, 95% CI = 3.57-17.33,p < 0.001, and OR = 10.94,95% CI = 4.83-27.73,p < 0.001, respectively.Neither the Wi-Fi settings nor the type of event significantly affected the probability to receive the notification (ps > 0.05).

Distance to the center
The distribution of distances to the center of the geofenced area was right-skewed due to some potentially invalid data points (see Fig. 3).
For further analysis, we excluded six observations as outliers that were above two standard deviations from the mean (upper threshold of 882.86 m) and could distort the analysis.The average distance in the remaining sample was 153.34 m (Med = 113.08,SD = 146.18).The majority of excluded observations were from the forest area (n = 5), so the unusually large distances might be related to the lack of Internet connection and hence a measurement error in the mobile browser in the area.For example, a bad Internet connection10 could result in a situation where the notification link was clicked but not completely opened in the mobile browser until the evaluator moved to a place with better Internet access (which may be 1000 m away from the location where the notification was received).In this case, the measurement would not be a valid representation of the distance.
Therefore, we used the remaining 262 records for the analysis in the linear regression model.The dependent variable was the distance between the smartphone's position and the center of the geofenced area.As independent variables in the model, we entered the type of environment (downtown, residential area, and forest), Wi-Fi settings (off, on), type of event (enter, exit), operating system of the smartphone (Android, iOS), and radius (10, 50, and 100 m).The results of the linear regression model are displayed in Table 6.
It is important to note that the distance data represent only the cases where there was an Internet connection.Because we excluded missing data from locations without an Internet connection, the missing mechanism was not random.Therefore, the results should be considered as the results only for situations with an Internet connection.This limitation generally applies to Internet-based studies, by definition they require an Internet connection.
The baseline condition represents the enter event in the 10-m area in downtown, on an Android device with turned off Wi-Fi.The average distance to the center in the baseline condition was 28.13 m (95% CI = -15.09 to 71.36 m).Being in the forest area increased the distance between the location of receiving a notification and the center of the geofenced area, b = 75.64,95%CI = 38.50-112.78m, p < 0.001.We detected no significant difference between downtown and residential areas, b = 27.47,95%CI = -4.24 to 59.19 m, p = 0.089.Turning on  17 for the marginal effects of each experimental condition on sensitivity and distance for enter and exit events.We have also calculated the statistical power post hoc using a simulation approach (see Appendix Table 22).

False alarms
Regarding false alarms, we did not observe any obvious cases of notification confusion, such as approaching location 1 but receiving a notification for location 2. There was a technical error at the moment of joining the study in the mobile application and enabling notifications for exit events.
If the user was outside the area, this resulted in a batch of notifications (15 notifications) being sent to the user at once (as the operating system incorrectly registered the exit event).We discarded these notifications and solved this error afterwards, which will be explained in Study 2. The data on distance to the center of the geofenced area showed that notifications often arrived outside the geofenced area.While this was expected for exit events, notifications for enter events sometimes were also sent outside the area.Therefore, whether a notification is considered a "false alarm" depends on the threshold chosen for when notifications are correctly sent and when they are not.For example, if we apply a strict threshold of the exact geofencing boundary, then only notifications for entering events sent at the boundary or within the geofenced area should be considered correct, and all others are "false alarms".To investigate the relationship between the distance to the center and the number of sent notifications, we calculated the percentage of notifications received at different distances from the center of the geofenced area (see Fig. 4).For example, iOS can potentially generate a higher number of false alarms than Android, which corresponds to the fact that iOS notifications had a larger distance to the center of the geofenced area than Android notifications.

Exploratory analysis
To investigate further the differences between operating systems, we examined how Android and iOS systems performed with different radii (see Appendix Table 18 for descriptive statistics).We repeated the main analysis with additional interactions between the type of operating system and radius in the subgroups of enter and exit events.Regarding sensitivity, the interactions between the  4 Percentage of sent notifications for different distances to the center of the geofenced area.Note: On the X-axis, the distances between the location where the notification was received and the center of the fenced area are shown.For each distance, we calculated the percentage of notifications (between 0 and 100%, on the Y-axis) that had arrived at that or smaller distance operating system and radius were not statistically significant (ps > 0.05).Concerning the distance to the center of the geofenced area, the interaction effect between the type of operating system and 100-m radius was not statistically significant for enter events (b = -78.84,SE = 45.49,p = 0.09), but reached the level of statistical significance for exit events (b = -166.06,SE = 76.22,p = 0.032).Android showed higher differentiation between the radii of different sizes, while iOS demonstrated little differentiation between the radii (see Fig. 5).That means, notifications for exit events on Android devices were sent closer to the border of the geofenced area than on iOS devices.

Discussion
The goal of Study 1 was to test the geofencing method under various conditions, such as the type of environment, geofencing radius, Wi-Fi settings, the type of event, and operating system.Study 1 demonstrated a moderately high geofencing sensitivity, defined as the probability of receiving a notification at a location, of 82.50%.The average distance between the actual location of receiving a notification and the center of the geofenced area was 87 m for enter and 234 m for exit events.Importantly, both sensitivity and the distance were affected by several factors, such as location radius, operating system, environment, Wi-Fi settings, and the type of event.
Using a 10-m radius for geofencing notifications resulted in missed notifications on Android or notifications delivered at a larger distance on iOS (see Table 15 in General discussion for recommendations on using geofencing).In general, iOS outperformed Android in sensitivity, but Android notifications were sent closer to the border.No substantial differences were found between residential and downtown areas, but geofencing in forest areas encountered problems due to a lack of mobile Internet connection.Turned-off Wi-Fi increased the distance for exit events, but did not affect sensitivity.The type of geofencing event (enter or exit) did not influence the probability of receiving the notification, whereas exit events had a larger distance than enter events.

Study limitations
Study 1 had some limitations that could have biased the results.First, we used the GPS tracker of the smartphone to measure its absolute geolocation in Study 1.This measure might not be independent, as the same smartphone's GPS system was used to trigger notifications.In Study 2, therefore we used an independent external GPS tracker to record the position of the evaluator.Additionally, the distance between the location of receiving a notification and the center of the geofenced area may not be an ideal measure of geofencing precision.If the notification is triggered by the event of crossing the geofencing border when entering or leaving the area, a more precise measure would be the distance between the location of the fence crossing and the location where a notification was received.In Study 1, we could not calculate that distance because we measured the actual location of the smartphone only at the moment of opening a web survey after receiving the notification.In Study 2, we thus recorded the evaluator's complete path with an external GPS tracker.This allowed us to calculate where and when the geofenced area was crossed.Because of this, we could further calculate the distance and time difference between the location and time of crossing the fence and the location and time of receiving the notification.
In Study 1, the testers moved through a geofenced area without stopping or taking a break.The testers were free to choose their own speed, there were no specific instructions regarding the time or speed.However, in a real case scenario people might stay for some time inside of a geofenced area (e.g., visiting a shop, staying in the office).Therefore, we added a staying inside condition in Study 2. In this condition, we investigated a more realistic protocol that represented the situation in which people are not just going through the geofenced locations, but stay inside of them for some time11 .From the perspective of geofencing technology, it might make a difference, as staying inside of a geofenced area for a longer time can increase the likelihood that an enter event will be triggered.

Participants
All the tests were performed by the first author together with a research assistant to ensure following the same protocol.

Procedure
The procedure was identical to Study 1, except we introduced some changes to improve the methodology.First, an external GPS tracker was used to record evaluators' timestamps and positions during the whole test at the rate of one recording per second.Second, we added a new condition, where evaluators stayed for 5 min in the center of each location.
As Study 1 had clearly demonstrated the disadvantage of geofencing in the forest environment, where there was a lack of Internet, we excluded the forest from the tests.Additionally, as Study 1 did not find any differences between downtown and residential areas, we conducted all tests in the downtown area.We also left out the condition with turned off Wi-Fi settings, as there were no substantial differences found for each condition in Study 1, and participants usually have their Wi-Fi turned on.
We systematically varied the factors type of geofencing event (enter or exit), type of OS (Android, iOS), and user behavior (walking through or staying in the center for 5 min).Therefore, the route in the downtown environment was completed eight times in a fully crossed design with two event types, two operating systems, and two user behavior conditions (2 x 2 x 2).To evaluate the effect of the geofence radius, there were 15 locations for each route -five locations for each radius of 10, 50, and 100 m (see Fig. 2A).In total, 120 tests were performed (1 environment x 8 repetitions x 15 locations).
During the tests, both evaluators walked along the pathway in the downtown area that went through predefined geofenced areas (see Fig. 2A).The smartphone was carried in the right hand, and the evaluator was not to use it for other purposes (e.g., texting or web browsing).The Garmin Watch (model Venu Sq), an external GPS tracker, was carried on the left hand.When the evaluator heard the signal of notification arrived or/and felt vibration of the smartphone, he stopped moving, and registered the event on the Garmin Watch by starting a new lap.Then the evaluator noted the time of receiving the notification and the lap number in the Garmin Watch on a protocol sheet.The notification was then removed from the screen by swiping it away.
We measured the geofencing sensitivity by recording whether the notification was sent for every location the evaluator visited.We also computed the distance and time difference between the actual location of the evaluator at the moment of receiving the notification and the point of crossing the area fence.False-alarm rates were later calculated based on the distance data.

Materials Smartphones
The evaluators used their own smartphones: iPhone 11 and Nokia (see Table 7).

External GPS tracker
We used a Garmin Watch (model Venu Sq) as an external GPS tracker.The log files recorded by the watch were available for downloading in the Garmin Connect software.The log files were opened and examined with the Google Earth program.Each event of a notification was recorded as the start of a new lap, which was shown in the logs with a timestamp and geolocation.The time and location of crossing the virtual fence were also calculated from the logs.

Preventing false alarms for exit events
To prevent the batch notification problem observed in Study 1, we implemented the "vicinity zone" algorithm in the Samply Research mobile app.In general, for each triggered notification in the mobile app, the operating system provides additional information about whether the user is inside or outside the geofenced area.While this prevents a false alarm for enter events (by not sending a notification if the user is outside), for exit events the condition that the user is outside is quite loose and can be met if the user is, for example, 100 m or 100 km away.As a remedy and to provide an additional control mechanism, we programmed the app to check the user's current location when an exit event is triggered.Only when the user is in the immediate vicinity of the fence (i.e., up to 200 m) can the application proceed and send a notification.The exact threshold of 200 m was based on the results of Study 1 in the areas with good Internet connectivity (downtown and residential area).The idea behind this is that if the user has actually left the area, the user should still be somewhere nearby.If this is not the case, a false alarm is very likely, and the notification should not be sent.

Analysis plan
The data recorded in Samply and the Garmin Watch were matched using the time and the lap numbers recorded during the tests.We computed sensitivity, distance, and time difference for each radius condition.The distance and time difference to the fence were calculated with respect to the location and time at which the fence was crossed.Positive values on the distance and time scales mean that the notification was triggered after the participants had crossed the fence and negative values mean that the notification was triggered before they had crossed the fence.Analysis was similar to the analysis in Study 1.In general, we expected to replicate the results of Study 1.With respect to the new staying inside condition, we expected to record higher sensitivity.

Results
We conducted the tests in July and August 2022 and obtained 84 records of notifications sent from the server.One notification was sent twice, we only used the first notification in the analysis.The final dataset thus consisted of 83 records.For each of the notifications, we recorded the absolute position of the smartphone, and calculated the distance and time difference with the location where the fence was crossed.The sensitivity score, distance, and time difference for each of the experimental conditions can be found in Appendix Table 19.

Sensitivity
On average, the notification was received in 69% of all locations.To evaluate the effect of each experimental factor on sensitivity, we constructed a logistic regression model.The dependent variable was binary -whether the notification was received or not.As independent variables in the model, we entered the type of user behavior (no waiting, 5 min waiting), the type of event (enter, exit), operating system of the smartphone (Android, iOS), and radius (10, 50, and 100 m).The results of the logistic regression model are displayed in Table 8.

Distance to the fence
The average distance between the location where the fence was crossed and the location where the notification was received was -6.76 m for enter events (range -137.79 to 109.47,Med = 8.76, SD = 71.13)and 132.00 m for exit events (range -56.58 to 217.56, Med = 140.75,SD = 53.99).
In the regression model analysis, the dependent variable was the distance between the smartphone's position and the fence.As independent variables in the model, we entered the type of user behavior (no waiting, waiting 5 min), the type of event (enter, exit), operating system of the smartphone (Android, iOS), and radius (10, 50, and 100 m).The results of the linear regression model are displayed in Table 9.
The baseline condition represents the enter event without waiting in the 10-m area, on an Android device.The average distance to the fence in the baseline condition was -8.97 m, so the notification was on average triggered before crossing the fence (95% CI = -49.21 to 31.27 m).Waiting 5 min in the center of the geofenced area did not affect the distance, b = 0.32, 95% CI = 27.10 to 27.75 m, p = 0.98.iOS triggered notifications at a greater distance before crossing the fence compared to the baseline with an Android device, b = -41.80,95% CI = -70.06 to -13.53 m, p = 0.004.Notifications sent on the exit event had a larger distance to the fence after crossing than notifications sent on the enter event, b = 135.40,95% CI = 108.88-161.93 m, p < 0.001.The 50-m or 100-m radius increased the distance after crossing the fence compared to the 10-m radius in the baseline condition, b = 41.12,95% CI = 6.66-75.57m, p = 0.02, and b = 38.15,95% CI = 3.80-72.50m, p = 0.03.

Time difference with the fence crossing
For the analysis, we had to exclude two observations, because of unusually large time difference values, which were not related to the experiment (i.e., taking a break during the test).
The average time difference between the time when the fence was crossed and the time when the notification was received was 21.68 s for enter events (range -172 to 684, Med = 20.0,SD = 147.88)and 177.82 s for exit events (range -59 to 560, Med = 176.50,SD = 109.85).In general, the linear regression analysis of time differences confirmed some of the patterns that we had found in the analysis of the distance to the fence above.The average time difference with the fence in the baseline condition was 46.09 s, so the notification was on average triggered after crossing the fence (95% CI = -33.61 to 125.79 s).iOS had triggered notifications before crossing a fence compared to Android, b = -122.35,95% CI = -178.81to -65.89 s, p < 0.001 (see Table 10).Notifications sent on the exit event had a larger time difference after crossing the fence than notifications sent on the enter event, b = 160.49,95% CI = 107.45-213.52 s, p < 0.001.There were no differences in the time between 50-or 100-m radius compared to the 10-m radius (although it took longer for both radii).Waiting 5 min in the center of the geofenced area did not affect the time difference as well, in comparison with the condition of going through the geofenced area without stopping.Please see Appendix Table 20 for the marginal effects of the conditions and Appendix Table 22 for statistical power estimates.

False alarms
Regarding false alarms, one notification was sent twice with an interval of 1 min 33 s.It was triggered by an exit event on Android device at the location of 100-m radius in the condition when the tester moved through the location without stopping in the center.As in Study 1, the data on the distance to the fence showed that notifications often arrived before the smartphone crossed the fence.Whether to consider these notifications as false alarms depends on the threshold a researcher selects.To investigate the relationship between the distance to the fence and the false alarm rate, we calculated the percentage of notifications received at each distance (see Fig. 6).

Exploratory analysis
When comparing distances and time differences for enter events, the average distance was negative, but the average time difference was positive.This counterintuitive outcome could be due to averaging across different operating systems.A clearer picture arises if we consider the effects separately for iOS and Android operating systems (see Fig. 7).
We repeated the main analysis with additional interactions between the operating system and event.The interaction was statistically significant both for the distance, b = 120.76,SE = 23.98,p < 0.001, and for the time difference, b = 168.72,SE = 52.12,p = 0.002.With regard to enter events, iOS sent notifications on average before the fence was crossed, as shown by the negative distance to the fence (M iOS = -43.90,SD iOS = 60.25) and time difference (M iOS = -55.34,SD iOS = 66.93).Android sent notifications on average after the fence was crossed, as indicated by the positive distance (M Android = 53.07,SD Android = 39.57) and time difference (M Android = 145.78,SD Android = 159.27).

Discussion
The goal of Study 2 was to confirm the findings of Study 1 using an external GPS tracker, and in addition systematically test the difference between walking through the geofenced area or staying for 5 min in the center of the location.Study 2 replicated the differences in sensitivity between iOS and Android operating systems, with iOS having a higher sensitivity than Android.A larger radius and waiting 5 min increased sensitivity, while exit events generally had a lower sensitivity level.Study 2 also provided more accurate measurements of distance and confirmed that iOS triggered enter notifications before crossing the fence, whereas Android sent enter notifications after crossing the fence.Possible explanations and recommendations are discussed in the General discussion section.

Study 3
In Studies 1 and 2, we only tested a limited selection of smartphones with trained testers who paid extensive attention to receiving notifications.However, in many studies Fig. 6 Percentage of received notifications for different distances to the fence.Note.On the X-axis, the distances between the location where the notification was received and the fence are shown.For each distance, we calculated the percentage of notifications (between 0 and 100%, on the Y-axis) that had arrived at that or smaller distance Fig. 7 Sensitivity, distance, and time difference for iOS and android operating systems for enter and exit events.Note.95% CI is shown researchers might encounter a higher variation of participant devices, operation system versions, and mobile Internet providers.The question is whether and to which extent geofencing will still be accurate under those conditions.To evaluate the geofencing method, we conducted a study with participants, who were naïve to the research materials and hypotheses.The participants, students of the University of Konstanz, used their own smartphones.

Participants
Fifty-eight participants (54 women, four men, mean age 21, SD = 4, range 18-42) completed the baseline survey and registered in the mobile app "Samply Research".Regarding the type of operating system, 40 participants used iOS, 17 participants used Android, and one participant used EMUI (an Android-based system developed by Huawei).From the perspective of the system sending notifications, at least one geofencing notification was sent to 53 participants, and the average number of notifications sent per participant was 15.83, Med = 14, SD = 13.05,range 1-71.As for responding to the notification by entering the data in the geofencing survey, 49 participants responded at least once, and the average number of completed geofencing surveys per participant was 12.31, Med = 12, SD = 7.25, range 1-39.

Procedure1
After completing the baseline survey, participants installed the Samply Research mobile application and joined the study in the app.From that point on, participants received a notification linked to the geofencing survey when they entered or left the area of interest, i.e., the University of Konstanz.The radius was set up to 200 m.In addition, participants received a notification at 9:00 pm each evening with a link to the daily survey.The study lasted 2 weeks for each participant, and participants could begin the study between November 28 and December 7, 2022.At the end of the study (on day 15) participants received a notification with a link to the debriefing survey.

Baseline survey
The purpose of the baseline survey was to introduce participants to the study and to collect basic sociodemographic (gender, age) and smartphone-related technical information (smartphone model, operating system, mobile Internet provider).The perceived quality of mobile Internet connectivity was measured on a five-point Likert scale ranging from "very poor" to "excellent".The survey also explained the goal and content of the study to participants, provided the informed consent form, and instructed them on how to install and use the mobile application.

Geofencing survey
The survey consisted of two questions about the participant's location when the notification was received.The first question asked whether the participant was at the university or not.The second question was whether the participant was entering, leaving, or neither entering nor leaving the university.

Daily survey
The daily survey aimed to ask participants whether they were at the university during the day.The options were "Yes", "No", and "Don't want to answer".The notification to fill in the daily survey was sent out at 9:00 pm on each study day.

Debriefing survey
In the debriefing survey, participants were informed about the study's aim and thanked for their contribution.Participants were asked whether there were any technical problems during the study and whether they had any comments about the study for researchers.The survey also included instructions on how to disable geofencing, leave the study, and delete the app.

Analysis plan
The data recorded in Samply and online surveys were matched using the anonymous participant ID.We computed the confusion matrix for enter and exit events based on notification logs and participants' survey answers.Hits and false alarms were calculated from the answers that participants provided in the geofencing surveys, i.e., whether they were entering/ leaving the university at the moment of receiving the notification.Misses and correct rejections were computed from the answers that participants provided in the daily surveys, i.e., whether they were at the university during the day.Sensitivity and precision statistics were calculated for each participant (see Appendix Table 21) and for each phone model.We used a t test to analyze the effect of the type of operating system on the sensitivity and precision of enter and exit events.

Sensitivity
To compute sensitivity, we calculated the number of correct hits and misses based on the answers to the daily survey (see Table 11).For enter events, the sensitivity was 70%, and for exit events, the sensitivity was 18%.

Precision
To compute precision, we calculated the number of correct hits and false alarms based on the answers to the geofencing survey (see Table 12).The precision for enter events was 70% and for exit events 89%.
It could be that some participants received multiple enter notifications even though they had been at the university for some time, so they did not select the "Entering the university" option.Analysis of responses to the second question in the geofencing survey about the current position confirmed that participants were at the university 97% of the time when enter notifications were sent (see Table 13).Thus, the false alarms for enter events were not related to misidentification of location (as participants were at the university), but were related to either timing (e.g., a notification was delayed) or the repetition of a notification when participants had been in the university for some time or had moved on campus.For exit events, 91% of participants reported being at the university at the time they received the notification.The boundaries between the university and outside the university were not clearly defined in the study, so participants who were leaving the university could still have considered that they were in the university campus area.

Smartphone models
To explore the variability in precision and sensitivity between smartphones, we calculated both scores for each smartphone model for enter and exit events (see Fig. 8).

Operating systems
We conducted t tests to analyze whether there were differences in precision and sensitivity between the iOS and Android operating systems.The precision for enter events and the sensitivity for exit events were higher for iOS devices than for Android devices, t (df = 21.11)= -2.35,p = 0.028, and t (df = 40.90)= -2.57,p = 0.014 (see Table 14 and Appendix Table 22 for statistical power estimates).Precision for exit events and  sensitivity for enter events did not differ statistically significantly between operating systems, t (df = 17.07) = 1.00, p = 0.33, and t (df = 25.11)= -1.27,p = 0.21.The perceived quality of the mobile Internet connection or the type of mobile Internet provider had no effect on precision and sensitivity (ps > 0.05).

Discussion
The goal of Study 3 was to investigate the feasibility of the geofencing method, which was implemented via Samply, in a study with naïve subjects.At the same time, because the participants used their own smartphones, we tested the app's functionality on different devices, operating systems, and mobile Internet providers.The general results indicate a moderate level of sensitivity and precision for enter events, 70% for both.For exit events, while precision was high at 89%, sensitivity was low at 18%.As in Studies 1 and 2, iOS performed better than Android.The iOS operating system demonstrated higher precision for enter events and higher sensitivity for exit events than Android.

General discussion
Our three empirical studies showed that implementing geofencing technology yielded sensitivity and precision scores comparable to those of previous studies (Nguyen Fig. 8 Precision and sensitivity for enter and exit events.Note.The smartphone model names are shown next to the data points.iPhone models are abbreviated with "i-".Question marks in the name of a phone model (e.g., "Xiaomi-?")mean that the participant has not provided information about the specific model of the phone et al., 2017;Suyama & Inoue, 2016;Wray et al., 2019).The scores were not ideal, suggesting that there is room for improvement in both the technology and its use by researchers.We identified several limitations and provided recommendations for geofencing research (see Table 15).

Location radius
Using a 10-m radius for sending notifications may be problematic, as some of the notifications for this radius were not delivered on Android or were delivered earlier than expected on iOS.A 50-m radius improves sensitivity on Android, but iOS still sends notifications earlier than expected.The differences between operating systems can be attributed to the operating systems' underlying strategies for optimized geofencing.As Android requests the current location every second minute, locations with a 10-m radius could be missed because the smartphone had a higher chance of being removed from the zone before the notification is triggered.Regarding the iOS performance, Apple documentation points out that certain conditions should be met to deliver a notification: "The user's location must cross the region boundary, move away from the boundary by a minimal distance, and remain at that minimum distance for at least 20 s"12 .However, the documentation does not explain why notifications are delivered before crossing the region boundary.It might be that iOS uses a radius of 100 m for all radii below 100 m.This would explain the results of Studies 1 and 2, where iOS did not differentiate between radii below 100 m.If this is true, then iOS can reach a higher sensitivity even for smaller radii, such as a 10-m radius, by giving up on precision.This implies that researchers who want to use geofencing for very specific and small locations, e.g., a smoking area next to an office building, should be aware of the possibility of misses (in particular, for Android smartphones) or false alarms (in particular, for iOS smartphones).Both sensitivity and distance measures were acceptable for a radius of 100 m or above, which leads us to recommend 100 m as a minimum radius for studies.We expect a further increase in radius should not significantly affect the results, although this has to be tested in further research.Many locations can be set up with a radius of 100 m and higher, such as public areas, offices, apartments, stores, etc.Whether the precision can be improved in the future is an open question, as it is not only a matter of technical affordances, but also an ethical decision regarding user privacy.

Operating system
All three studies reported here demonstrated that iOS currently has an advantage over the Android system in using geofencing for research.These differences between operating systems were mainly present for a 10-m radius in the controlled tests and were also replicated in a study with naïve participants.As discussed above, the differences can be explained by the fact that iOS relies more on distance measures to optimize the geofencing, whereas Android reduces the frequency of tracking (to every second minute or every sixth minute, if the device has not been active) and may apply additional optimization techniques that interfere with geofencing.Android's doze mode might contribute to misses, although the documentation specifies that the system exits the doze mode when the user moves the device13 .Moreover, some Android devices optimize battery performance by shutting down or delaying background tasks, such as geofencing.In these cases, researchers may prepare troubleshooting instructions for participants to adjust the settings14 .

Environment
We did not find any substantial differences in geofencing sensitivity and distance between residential and downtown areas.Although there were more public Wi-Fi spots in downtown due to stores and restaurants, this did not significantly improve sensitivity and the distance compared to the residential area with fewer public Wi-Fi spots.On the other hand, we discovered that geofencing in the forest area might encounter a number of problems.First, the sensitivity was lower, and the distance was measured as larger in the forest area.Second, in our tests in the forest area, there was also an area without cellular Internet connection, which further reduced connectivity.While the notifications arrived even in the area without Internet connection, the lack of connection prevented the data from being sent from the smartphone to the server and the geolocation from being recorded correctly.Therefore, it is recommended that studies be conducted in an area that has a cellular Internet connection (at least at the poor signal level indicated by at least one bar or dot of signal strength in the smartphone menu).
Because Internet connectivity can affect geofencing technology, it is important to understand what can affect the connection.The signal can be blocked by building materials such as steel, concrete, brick, wood, or fiberglass insulation.Therefore, we recommend setting up a geofencing boundary in an open area.For example, setting up the boundary inside a building would not be a good idea.If researchers are interested in a particular building (e.g., a school, a hospital, a store), the boundary should be set up around the building, including the open space next to the building.In addition, in densely populated areas, cell towers may reach their capacity limit.When many people use the bandwidth of the same network and cell tower, connectivity decreases.This means that geofencing studies could be problematic at events or high-traffic locations (e.g., concerts, conferences, shopping malls) unless Wi-Fi networks compensate for cellular network congestion.

Wi-Fi settings
Although the previous studies and operating system documentation pointed to possible adverse effects of turned-off Wi-Fi signals, we could only confirm the negative impact of turned-off Wi-Fi on the distance to the center of the geofenced area in exit events.This factor may not be as problematic for geofencing studies because most people usually leave their Wi-Fi on by default.Researchers may remind participants to keep their Wi-Fi on during the study.

Enter vs. exit event
While the type of geofencing event (enter or exit) did not influence the probability of receiving the notification, it did affect where the geofencing event was triggered.Exit events had a larger distance from the center of the geofenced location than enter events.Given the time delay by the operating system, people leaving the area moved further away from the border with each additional minute.This means that exit notifications can generate a higher variability of responses than enter notifications.When an exit notification arrives, people may be in different locations and at different distances from the geofenced area depending on the mode of transportation (e.g., walking or driving).
By combining enter and exit events, researchers can design studies that measure the time participants spend at a location.To record events without sending notifications, researchers can activate the invisible mode for each type of event in Samply.

User behavior
Waiting 5 min in the center of the geofenced area improved sensitivity because smartphones had more time to detect entering or leaving the area.This means that researchers using geofencing should consider how much time participants might spend at the location: The more time a participant spends at the location, the higher the sensitivity likely will be.If a participant only passes through the location, iOS devices will tend to have a higher sensitivity than Android devices.

General recommendations
There are two general strategies for conducting geofencing research.The researcher can use prepared devices (e.g., smartphones) or utilize participants' smartphones.In the case of the first strategy, the choice of operating system and smartphone model is important.Our results show that iOS has an advantage over the Android system but comes with higher costs.An Android device could also be an option, as long as a minimum 50-m radius is used and the smartphone is configured allow geofencing background tasks to run without being optimized by the operating system.With a predetermined smartphone model, researchers can pre-test the phone's functionality.If researchers choose to use participants' smartphones, it is recommended to record the type of device and assist participants in adjusting smartphone settings if needed.False alarms and misses are currently unavoidable in geofencing research.To control for false alarms, researchers can include a question in a geofencing survey that confirms that the participant is in the intended location.To control for misses, researchers can use Samply's interval-based notifications functionality (as we did in Study 3) to ask participants to report whether they were at the intended location during the day.In Samply, we have implemented two customizable features to decrease the number of false alarms.These are the minimum time window between notifications and the size of the vicinity zone for exit notifications.The first feature allows researchers to set a minimum time window between geofencing notifications.When the notification is triggered, an algorithm checks the time of the previous notification and prevents sending the notification if the time that has passed since is too short.The second feature allows users to customize the size of the vicinity zone for exit notifications.Only when a participant is in the vicinity zone (i.e., up to 500 m) could the app proceed and send an exit notification.This enables researchers to control the trade-off between precision and sensitivity for exit events.
Researchers can set up either shared or participant-specific locations.In Study 3, we tested a single location shared among all participants.For those interested in participantspecific locations, the Samply mobile app interface allows participants to enter their locations in the app.The location coordinates are not shared with the researcher.This allows for various research designs, such as notifying participants when they enter their office, place of study, home, etc.However, if the geofenced areas are close to each other, the probability of false alarms is higher.Due to the limited number of locations a user can create (20 locations for iOS, 100 locations for Android), we recommend focusing on the most important locations.
Participant privacy is an essential aspect of a geofencing study.Researchers should be explicit and specific about any information they collect in the study, as required by the EU GDPR (Voigt & dem Bussche, 2017).To support this, the Samply user interface prompts researchers to provide a study consent form and a rationale for geofencing as two separate information inputs to make it explicit for participants.

Validation method
To validate the accuracy of geofencing in Samply, we combined controlled repeated tests from Study 1 and Study 2 with real-world tests in Study 3. Controlled tests allowed us to systematically examine the combination of different factors, such as radius or operating system.We recommend applying an external source of validation, such as an external GPS tracker in Study 2, to verify an app's functionality.Although the smartphone can provide various measurements, such as GPS position from the mobile browser, an external data source guarantees the independence of those measurements.
With real-world tests, on the other hand, we estimated the plausibility of applying geofencing given the heterogeneity of devices and user behavior.This approach of combining controlled and real-world tests has been used before with other smartphone apps (e.g., Geyer et al., 2022).However, in our studies, we found that the statistical power of realworld tests in Study 3 was lower than that of controlled tests in other studies (see Appendix Table 22).This difference may be due to the use of aggregated proportions as measures of sensitivity and precision for each participant.While aggregated per-participant data is a common way to measure accuracy, further studies may consider recruiting a larger sample size to increase the statistical power.

Limitations and further research
The required Internet connectivity is a general limitation of the geofencing approach implemented in Samply.However, given technological developments and the widespread use of smartphones, it is reasonable to assume that more people will own smartphones and have access to mobile Internet.At the same time, we expect the geofencing technology to evolve both quantitatively (by improving sensitivity and precision) and qualitatively, by developing new forms of geofencing (e.g., vertical geofencing in Stieger & Reips, 2019) or combining it with other device information such as spatial orientation (Kuhlmann et al., 2020).
Given our findings and the speed of technological development, further research will be required to amend our findings and test new technological solutions and implementations.At the same time, we are confident that as technology advances, researchers will adopt geofencing (e.g., by using the Samply software) to develop and test new questions that complement laboratory research with inquiries into everyday life.

Conclusion
Technically, geofencing and geotracking technology have already achieved impressive results and are being used in many domains outside of research, e.g., logistics, retail, car navigation, and business (e.g., Uber, Airbnb).Behavioral science research has yet to take advantage of this rapidly evolving technology to explore new applications.User privacy is of great importance, so a technology such as geofencing could be more attractive to users because it does not record absolute geolocation and does not share the data with others.However, although geofencing has this advantage, it needs to be explained to the end user to establish a trustworthy and transparent relationship between the researcher and the participant.People differ in terms of their preferences regarding the level of privacy (Joinson et al., 2006;Paine et al., 2007), so all participants' privacy should be treated with care.
This manuscript shows advantages, conditions, and limitations of geofencing methodology.As an application ready to be used, Samply provides a research infrastructure for geofencing, and researchers are welcome to use the software, run benchmark tests, and use the tool in their studies.Documentation with information on geofencing and step-by-step tutorials can be found on the Samply website https:// samply.uni-konst anz.de/.Each radius was repeated five times.The overall number of tests was 120 (2 types of events x 2 OS x 3 radii x 2 types of user behavior x 5 locations per radius).Sensitivity is defined as the probability of receiving a notification at a geofenced location (between 0 and 1).Distance is provided in meters between the location where the fence was crossed and the location of the smartphone at the moment of receiving the notification.Mean (SD).As the enter and exit events represent substantially different experimental conditions affecting the geofencing, the marginal effects are presented separately for enter and exit events.Sensitivity is defined as the probability of receiving a notification at a geofenced location (between 0 and 1).Distance is provided in meters between the location where the fence was crossed and the location of the smartphone at the moment of receiving the notification.The perceived quality of mobile Internet connectivity was measured on a five-point Likert scale ranging from "very poor" (1) to "excellent" (5).Question marks in the name of a phone model (e.g., "Xiaomi ?") mean that the participant has not provided information about the specific model of the phone

Table 22 Statistical power estimates
We used a simulation approach in which model parameters were used to simulate the outcome variable (with 1000 repetitions).After that, a model was fit into the simulated data, and the effects were estimated.The percentage indicates the number of cases in which the effect turned out to be statistically significant (below the alpha level of 0.05).

Fig. 1
Fig. 1 Geofencing interface on the Samply website for researchers (a) and mobile application for participants (b)

Fig. 2
Fig. 2 Routes with geofenced locations in downtown (a) residential area (b), and forest (c).Note.Each of the three routes contains 15 locationsfive locations for each radius of 10, 50, and 100 m

Fig. 5
Fig. 5 Effect of operating system and radius on sensitivity and distance to the center.Note.95% CI is shown.The probability of receiving a notification indicates the sensitivity of geofencing.The distance

Table 1
Existing and potential applications of geofencing in research

Table 2
Confusion matrix for geofencing events

Table 3
Overview of studies

Table 4
Smartphone specification

Table 5
(Tjur, 2009)s ratios for probability to receive a notification The odds ratios compare the odds of receiving a notification at the absence and at the presence of the predictor.Odds ratios greater than 1 indicate that receiving a notification is more likely when the predictor is present.Odds ratios less than 1 indicate that receiving a notification is less likely when the predictor is present.R 2 Tjur is a coefficient of discrimination, which represents the difference between the averages of fitted values for successes (notifications received) and failures (notifications not received)(Tjur, 2009) ferences, iOS had a larger distance compared to Android, b = 85.48, 95% CI = 55.67-115.30m, p < 0.001.Using the 50-m or 100-m radius surprisingly did not affect the distance to the center, b = 5.86, 95% CI = -31.25 to 42.96 m, p = 0.76, and b = 16.71,95% CI = -20.61 to 54.02 m, p = 0.38.Notifications sent on the exit event had a larger distance to the center, b = 144.52,95% CI = 116.39-172.64 m, p < 0.001, which was expected as the exit event should be triggered after people left the geofenced area.Please see Appendix Table

Table 6
Model 2. Regression coefficients for the distance to the center

Table 7
Smartphone specification in Study 2

Table 8
Model 1. Odds ratios for probability to receive a notification The odds ratios compare the odds of receiving a notification at the absence versus the presence of the predictor.Odds ratios greater (Tjur, 2009)ate that receiving a notification is more likely when the predictor is present.Odds ratios less than 1 indicate that receiving a notification is less likely when the predictor is present.R 2 Tjur is a coefficient of discrimination, which represents the difference between the averages of fitted values for successes (notifications received) and failures (notifications not received)(Tjur, 2009)

Table 9
Model 2. Regression coefficients for the distance to the fence

Table 11
Geofencing notifications and participants' responses to the daily surveys

Table 12
Geofencing notifications and participants' responses about their current movement in the geofencing surveys

Table 13
Geofencing notifications and participants' responses about their current location in the geofencing surveys

Table 14
Sensitivity and precision for iOS and android operating systems

Table 15
Limitations and recommendations for geofencing research

Table 17
Marginal effects of each experimental condition on sensitivity and distance to the center of the geofenced area in Study 1 Mean (SD).As the enter and exit events represent substantially different experimental conditions affecting the geofencing, the marginal effects are presented separately for enter and exit events

Table 18
Marginal effects of operating system and radius on sensitivity and distance to the center of the geofenced area in Study 1

Table 19
Sensitivity, distance, and time difference with the fence in Study 2

Table 20
Time difference is provided in seconds between the moment of crossing the fence and receiving a notification Marginal effects of each experimental condition on sensitivity, distance, and time difference with the fence in Study 2 Time difference is provided in seconds between the moment of crossing the fence and receiving a notification

Table 21
Sensitivity and precision in Study 3