Stealing PINs via mobile sensors: actual risk versus user perception

In this paper, we present the actual risks of stealing user PINs by using mobile sensors versus the perceived risks by users. First, we propose PINlogger.js which is a JavaScript-based side channel attack revealing user PINs on an Android mobile phone. In this attack, once the user visits a website controlled by an attacker, the JavaScript code embedded in the web page starts listening to the motion and orientation sensor streams without needing any permission from the user. By analysing these streams, it infers the user’s PIN using an artificial neural network. Based on a test set of fifty 4-digit PINs, PINlogger.js is able to correctly identify PINs in the first attempt with a success rate of 74% which increases to 86 and 94% in the second and third attempts, respectively. The high success rates of stealing user PINs on mobile devices via JavaScript indicate a serious threat to user security. With the technical understanding of the information leakage caused by mobile phone sensors, we then study users’ perception of the risks associated with these sensors. We design user studies to measure the general familiarity with different sensors and their functionality, and to investigate how concerned users are about their PIN being discovered by an app that has access to all these sensors. Our studies show that there is significant disparity between the actual and perceived levels of threat with regard to the compromise of the user PIN. We confirm our results by interviewing our participants using two different approaches, within-subject and between-subject, and compare the results. We discuss how this observation, along with other factors, renders many academic and industry solutions ineffective in preventing such side channel attacks.


I. INTRODUCTION
Smartphones equipped with modern sensors such as GPS, light, orientation and motion are continuously providing more features to end users in order to interact with their real-world surroundings. Developers can have access to the mobile sensors either by 1) writing native code using mobile OS APIs [17], 2) recompiling HTML5 code into a native app [34], or 3) using standard APIs provided by the W3C which are accessible through JavaScript code within a mobile browser (see footnote 1). The last method has the advantage of not needing any app store approval for releasing the app or doing future updates. More importantly, the JavaScript code is platform independent, i.e., once the code is developed it can be executed within any modern browser on any mobile OS.
In-browser access risks. While sensor-enabled mobile web applications provide users more functionalities, they raise new privacy and security concerns. Both the academic community and the industry have recognised such issues regarding certain sensors such as geolocation [20]. For the website to access the geolocation data, it must ask for explicit user permission. However, to the best of our knowledge, there is little work evaluating the risks of in-browser access to other sensors. Unlike in-app attacks, an in-browser attack, i.e., via JavaScript code embedded in a web page, does not require any app installation. Furthermore, JavaScript code does not require any user permission to access sensor data such as device motion and orientation. Moreover, there is no notification while JavaScript is reading the sensor data stream. Hence, such in-browser attacks can be carried out far more covertly than the in-app counterparts. However, launching an effective in-browser attack still has to overcome the technical challenge that the sampling rates available in browser are much lower than those in app. For example, as we observed in [24], frequency rates of motion and orientation sensor data available in-browser are 3 to 5 times lower than those of accelerometer and gyroscope available in-app.
Footnote 1: w3.org/TR/#trJavascript APIs
Fig. 1. PINlogger.js potential attack scenarios; a) the malicious code is loaded in an iframe and the user is on the same tab, b) the attack tab is already open and the user is on a different tab, c) the attack content is already open in a minimised browser, and the user is on an installed app, d) the attack content is already open in a (minimised) browser, and the screen is locked. The attacker listens to the side channel motion and orientation measurements of the victim's mobile device through JavaScript code, and uses machine learning methods to discover the user's sensitive information such as his activity types and PINs.
Motion and orientation sensors detail. According to W3C specifications [2], motion and orientation sensor data are a series of a few different measurements:
• device orientation, which provides the physical orientation of the device, expressed as three rotation angles (α, β, γ) in the device's local coordinate frame
• device acceleration, which provides the physical acceleration of the device, expressed in Cartesian coordinates (x, y, z) in the device's local coordinate frame
• device acceleration-including-gravity, which is similar to acceleration except that it includes gravity as well
• device rotation rate, which provides the rotation rate of the device about the local coordinate frame, expressed as three rotation angles (α, β, γ)
• interval, which provides the constant interval at which data is obtained from the underlying hardware, expressed in milliseconds (ms)
The device coordinate frame is defined with respect to the standard position of the mobile screen. When it is in the portrait mode, x and y axes are in the plane of the screen and are positive towards the screen's right and up, and z is perpendicular to the plane of the screen and is positive outwards from the screen. Moreover, the sensor data discussed above are processed sensor data obtained from multiple physical sensors (see Section V). In the rest of this paper, unless specified otherwise, by sensor data we mean the sensor data accessible through mobile browsers, which includes acceleration, acceleration including gravity, rotation rate, and orientation.
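As an illustration only, the measurements listed above can be pictured as one record per sensor event, which an analysis script then regroups into twelve time series. The sketch below is in Python; the names `MEASUREMENTS`, `CHANNELS`, `SensorSample` and `to_sequences` are hypothetical and are not taken from PINlogger.js, which reads these values in the browser through JavaScript.

```python
from dataclasses import dataclass
from typing import Dict, List

# Four measurements, three components each (abbreviations follow the paper).
MEASUREMENTS = {
    "ori": ("alpha", "beta", "gamma"),    # orientation angles
    "acc": ("x", "y", "z"),               # acceleration
    "accG": ("x", "y", "z"),              # acceleration including gravity
    "rotR": ("alpha", "beta", "gamma"),   # rotation rate
}

# Hypothetical flat channel names, e.g. "acc_x" or "ori_beta".
CHANNELS: List[str] = [
    f"{name}_{axis}" for name, axes in MEASUREMENTS.items() for axis in axes
]


@dataclass
class SensorSample:
    """One reading of all channels, plus the reported sampling interval."""
    interval_ms: float
    values: Dict[str, float]


def to_sequences(samples: List[SensorSample]) -> Dict[str, List[float]]:
    """Regroup a list of samples into one time series per channel."""
    return {ch: [s.values[ch] for s in samples] for ch in CHANNELS}
```

Keeping one sequence per channel matches how the later feature extraction treats the data: each of the twelve sequences is processed independently before features are combined.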
Motivation. Many popular browsers such as Safari, Chrome, Firefox, Opera and Dolphin have already implemented access to the above sensor data. As we demonstrated in [23], [24], all of these mobile browsers allow such access when the code is placed in any part of the active tab including iframes (Figure 1, a). In some cases such as Chrome and Dolphin on iOS, an inactive tab including the sensor listeners has access to the sensor measurements as well (Figure 1, b). Even worse, some browsers such as Safari allow the inactive tabs to access the sensor data when the browser is minimised (Figure 1, c), or even when the screen is locked (Figure 1, d). Mobile operating systems and browsers do not seem to be implementing consistent access control policies in regard to mobile orientation and motion sensor data. Furthermore, W3C specifications [2] do not discuss any risks associated with this potential vulnerability. Because of the low sampling rates available in browser, the community has been neglecting the security risks associated with in-browser access to such sensor data. However, in TouchSignatures [24], we showed that despite the low sampling rates, it is possible to identify user touch actions such as click, scroll, and zoom. We also demonstrated that a numpad's digits can be recovered in a side channel attack. In this work we contribute to the study of such attacks as follows:
• We show that other highly sensitive information about users such as their physical activities can be obtained through these sensors. Furthermore, we introduce PINlogger.js, an attack on full 4-digit PINs as opposed to only single digits in [24]. We show that unregulated access to these sensors imposes more serious security risks to the users in comparison with more well-known sensors such as camera, light and microphone.
• We conduct user studies to investigate users' understanding about these sensors and also their perception of the security risks associated with them. We show that users in fact have fewer security concerns about these sensors compared to more well-known ones.
• We study and challenge current suggested solutions, and discuss why our studies show they cannot be effective. We argue that a usable and secure solution is not straightforward and requires further research.

II. USER ACTIVITIES
The potential threats to the user security posed by an unauthorised access to the described sensor data are not immediately clear. Here we demonstrate two simple scenarios which show that sensitive user information such as phone call timing and physical activities can be deduced from device orientation and motion sensor data obtained from JavaScript.
Users tend to move their mobile devices in distinctive manners while performing certain tasks on the devices, or by simply carrying them. Examples of the former include answering a call or taking a photo, while the latter covers their transport mode. In both cases, an identifiable succession of movements is exhibited by the device. As a result, a web-based program which has access to the device orientation and motion data may reveal sensitive facts about users such as the exact timing information of the start and end of phone calls and that of taking photos. On the other hand, while the user is simply carrying her device, the device movement pattern may reveal information about the user's movement pattern, e.g., if the user is stationary at one place, walking, running, on the bus, in a car, or on the train. We present the results of two initial experiments that we have performed on a Nexus 5 using Maxthon Browser (as an example of a browser that provides access to sensor data even when the screen is locked).
Phone call timing. In the first experiment, we opened the website carrying our JavaScript code, then locked the screen. The JavaScript code continued to log orientation and motion data while the Android phone was on a desk. For this experiment, we used another phone to call the Android phone four times with a few seconds gap between the calls. As demonstrated in Fig. 2 (left), the 4 distinct phone calls along with their timing are recognisable from the three dimensions of acceleration (including gravity) which come from the device motion sensor. For a better comparison, Fig. 2 (right) shows the received call history of the phone during the experiment with their start times and durations. As shown in this figure, the captured sensor data match the call history.
User physical activities. In the second experiment, we again locked the phone and recorded the sensor data during 22 seconds of sitting, 34 seconds of walking and 25 seconds of slow running. We observed that the mentioned activities have visibly distinctive sensor streams. As an example, Fig. 3 shows the acceleration data from the motion sensor. As can be seen, the activities are distinguishable from each other since they produce visibly different sensor measurements.
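One simple way to make "visibly different" concrete is to compare how much the acceleration magnitude fluctuates in each recording. The following Python sketch is an assumption for illustration, not the analysis used in the paper (which relies on visual inspection of Fig. 3); the function names and the synthetic test signals are hypothetical.

```python
import math
import statistics


def magnitude_spread(acc_xyz):
    """Population standard deviation of the acceleration magnitude."""
    magnitudes = [math.sqrt(x * x + y * y + z * z) for x, y, z in acc_xyz]
    return statistics.pstdev(magnitudes)


def rank_by_movement(windows):
    """Order labelled recordings from least to most movement.

    `windows` maps a label to a list of (x, y, z) acceleration samples.
    """
    return sorted(windows, key=lambda label: magnitude_spread(windows[label]))


# Synthetic stand-ins for real recordings: a resting phone, a gentle
# oscillation, and a stronger oscillation.
recordings = {
    "running": [(0.0, 0.0, 9.8 + 4.0 * math.sin(i)) for i in range(50)],
    "sitting": [(0.0, 0.0, 9.8) for _ in range(50)],
    "walking": [(0.0, 0.0, 9.8 + 1.0 * math.sin(i)) for i in range(50)],
}
ordering = rank_by_movement(recordings)
```

A real classifier would need labelled data and more robust features; the point here is only that a single summary statistic already separates recordings whose movement intensity differs as strongly as in this experiment.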
Our initial evaluations suggest that discovering device movement related information such as call times and user's mode of transport can be easily implemented. However, as we will explain, distinguishing user PINs is a lot harder as the induced sensor measurements are only subtly different. In the following sections, we will demonstrate that, with advanced machine learning techniques, we are able to remotely infer the entered PINs on a mobile with high accuracy.

III. PINLOGGER.JS
In this section, we describe a more advanced attack on user's PINs by introducing PINlogger.js.

A. Attack approach
We consider an attacker who wants to learn the user's PIN tapped on a soft keyboard of a smartphone via side channel information. We consider (digit-only) PINs since they are popular passwords used by users for many purposes such as unlocking phone, SIM PIN, NFC payments, bank cards, other banking services, gaming, and other personalised applications such as healthcare, insurance, etc. Unlike similar works which have to gain the access through an installed app [25], [29], [26], [11], [31], [32], [28], [35], [4], [12], our attack does not require any user permission. Instead, we assume that the user has loaded the malicious web content in the form of an iframe, or another tab while working with the mobile browser as shown in Figure 1. At this point, the attack code has already started listening to the sensor sequences from the user's interaction with the phone.
In order to uncover when the user enters his PIN, we need to classify his touch actions such as click, scroll, and zoom. We have already shown in TouchSignatures [24] that with the same sensor data and by applying simple classifiers, it is possible to effectively identify user's touch actions. Here, we consider a scenario after the touch action classification. In other words, our attacker already knows that the user is entering his PIN. Moreover, unless explicitly noted, we consider a generic attack scenario which is not user dependent. This means that we do not need to train our machine learning algorithm with the same user as the subject of the attack. Instead, we have a one-round training phase with multiple voluntary users, and use the obtained trained algorithm to output any new user's PIN later. This approach has the benefit of not needing to trick individual users to collect data for training.

B. Web program implementation
We implemented a web page with embedded JavaScript code in order to collect the data from voluntary users. Our code registers two listeners on the window object to have access to orientation and motion data, separately. The corresponding events, deviceorientation and devicemotion, deliver DeviceOrientationEvent and DeviceMotionEvent objects, respectively. On the client side, we developed a GUI in HTML5 which shows random 4-digit PINs to the users and activates a numpad for them to enter the PINs as shown in Figure 4. All sensor sequences are sent to the database along with their associated labels, which are the digits of the entered PINs. We implemented our server program using Node.js (nodejs.org). Our code sends the orientation and motion sensor data of the mobile device to our NoSQL database using MongoLab (mongolab.com, web-based service for MongoDB). When the event listener fires, it establishes a socket by using Socket.IO (socket.io) between the client and the server and constantly transmits the sensor data to the database. Both Node.js and MongoDB (as a document-oriented database) are known for being capable of supporting data intensive applications in real time.

C. Data collection
Following the approach of Aviv et al. [4] and Spreitzer [32], we consider a set of 50 fixed random PINs in this paper. We conducted our user studies using Chrome on an Android device (Nexus 5). The experiments and results are based on the collected data from 5 users, each entering all 50 4-digit PINs 5 times. Our voluntary participants were university students and staff, and performed the experiments at university offices. We simply explained to them that all they needed to do was to enter a few PINs as shown in a web page.
In relation to the environmental setting for the data collection, we asked the users to remain sitting in a chair while working with the phone. We did not require our users to hold the phone in any particular mode (portrait or landscape) or work with it by using any specific input method (using one or two hands). We let them choose their most comfortable posture for holding the phone and working with it as they do in their usual manner. While watching the users during the experiments, we noticed that all of our users used the phone in the portrait mode by default. Users were either leaning their hands on the desk or freely keeping them in the air. We also observed the following input methods used by the users.
• Holding the phone in one hand and entering the PIN with the thumb of the same hand (Figure 4, left).
• Holding the phone in one hand and entering the PIN with the fingers of the other hand (Figure 4, centre).
• Holding the phone with two hands and entering the PIN with the thumbs of both hands (Figure 4, right).
In the first two cases, users interchangeably used either their right hands or left hands in order to hold the phone. We tried to simulate a real world data collection environment. We took the phone to each user's workspace and briefly explained the experiment to them, and let them complete the experiment without our supervision. All users found this way of data collection very easy and could finish the experiments without any difficulties.

D. Feature extraction
In order to build the feature vector as the input to our classifier algorithm, we consider both time domain and frequency domain features. We improve our suggested feature vectors in [24] by adding some more complicated features such as the correlation between the measurements. This addition improves the results, as we will discuss in Section IV. As discussed before, the 12 different sequences obtained from the collected data are orientation (ori), acceleration (acc), acceleration including gravity (accG), and rotation rate (rotR), with three sequences (either x, y and z, or α, β and γ) for each sensor measurement. As a pre-processing step and in order to remove the effect of the initial position and orientation of the device, we subtract the initial value in each sequence from subsequent values in the sequence.
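The pre-processing step described above amounts to re-basing each sequence on its first sample. A minimal Python sketch follows; the function name `remove_initial_offset` is hypothetical, and returning an empty list for an empty sequence is an assumption.

```python
def remove_initial_offset(sequence):
    """Subtract the first value of a sequence from every value in it.

    The result starts at zero, so the features no longer depend on the
    initial position or orientation of the device.
    """
    if not sequence:
        return []  # assumption: an empty recording yields an empty result
    first = sequence[0]
    return [value - first for value in sequence]


rebased = remove_initial_offset([5.0, 7.0, 4.0])
```

The same function would be applied to each of the 12 sequences independently before any time domain or frequency domain feature is computed.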
We use these pre-processed sequences for feature extraction in time domain directly. In frequency domain, we apply the Fast Fourier transform (FFT) on the pre-processed sequences and use the transformed sequences for feature extraction. In order to build our feature vector, first we obtain the maximum, minimum, and average values of each pre-processed and FFT sequence. These statistical measurements give us 3 × 12 = 36 features in the time domain, and the same number of features in the frequency domain. We also consider the total energy of each sequence in both time and frequency domains, calculated as the sum of the squared sequence values, i.e., $E = \sum_i v_i^2$, which gives us 24 new features. The next set of features are in time domain and are based on the correlation between each pair of sequences in different axes. We have 4 different sequences: ori, acc, accG, and rotR, each represented by 3 measurements. Hence, we can calculate 6 different correlation values between the possible pairs: (ori, acc), (ori, accG), (ori, rotR), (acc, accG), (acc, rotR), and (accG, rotR), each in 3 values. We use the correlation coefficient function in order to calculate the similarity rate between the mentioned sequences. The correlation coefficient method is commonly used to compare the similarity of the shapes of two signals (e.g. [6]). Given two sequences A and B and Cov(A, B) denoting covariance between A and B, the correlation coefficient is computed as below, where Cov(A, A) and Cov(B, B) are the variances of A and B:
$$R_{AB} = \frac{\mathrm{Cov}(A, B)}{\sqrt{\mathrm{Cov}(A, A) \cdot \mathrm{Cov}(B, B)}}$$
The correlation coefficient of two vectors measures their linear dependence by using their covariance. By adding these new 18 features, our feature vector consists of a total of 114 features.
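The feature counts above (36 + 36 + 24 + 18 = 114) can be checked with a short sketch. The Python code below is an illustration under several assumptions that the text does not settle: frequency domain statistics are taken over FFT magnitudes, correlations pair the same component index of two measurements, a constant sequence is given correlation 0, and a naive DFT stands in for an FFT library. All function names are hypothetical.

```python
import cmath
import math
from itertools import combinations

MEASUREMENTS = ("ori", "acc", "accG", "rotR")
AXES = (0, 1, 2)


def dft_magnitudes(seq):
    """Magnitudes of the discrete Fourier transform (naive O(n^2) version)."""
    n = len(seq)
    return [
        abs(sum(v * cmath.exp(-2j * cmath.pi * k * t / n) for t, v in enumerate(seq)))
        for k in range(n)
    ]


def stats(seq):
    """Maximum, minimum and average of a sequence."""
    return [max(seq), min(seq), sum(seq) / len(seq)]


def energy(seq):
    """Total energy: the sum of squared values."""
    return sum(v * v for v in seq)


def correlation(a, b):
    """Correlation coefficient Cov(A, B) / sqrt(Cov(A, A) * Cov(B, B))."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: a constant sequence counts as uncorrelated
    return cov / math.sqrt(var_a * var_b)


def feature_vector(data):
    """Build the feature list; data[measurement][axis] is a pre-processed sequence."""
    sequences = [data[m][a] for m in MEASUREMENTS for a in AXES]
    spectra = [dft_magnitudes(s) for s in sequences]
    features = []
    for s in sequences:
        features += stats(s)                 # 3 x 12 = 36 time domain statistics
    for s in spectra:
        features += stats(s)                 # 3 x 12 = 36 frequency domain statistics
    for s in sequences + spectra:
        features.append(energy(s))           # 12 + 12 = 24 energies
    for m1, m2 in combinations(MEASUREMENTS, 2):
        for a in AXES:                       # 6 pairs x 3 components = 18 correlations
            features.append(correlation(data[m1][a], data[m2][a]))
    return features
```

The common 1/n factor in the covariance and variance terms cancels in the ratio, which is why `correlation` works with plain sums.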

E. Neural network training
We apply a supervised machine learning algorithm by using an Artificial Neural Network (ANN) to solve this classification problem. The input of an ANN system could be either raw data, or pre-processed data from the samples. In our case, we have pre-processed our samples by building a feature vector as described before. Therefore, as input, our ANN receives a set of 114 features for each sample. As explained before, we collected 5 samples for each of the 50 4-digit PINs from 5 different users, giving us 1250 feature vectors in total.
The feature vectors are mapped to specific labels from a finite set: i.e., 50 fixed random 4-digit PINs. We train and validate our algorithm with two different subsets of our collected data, and test the neural network against a completely new subset of the data. We trained the network with 70% of our data, validated it with 15% of the records and tested it with the remaining 15% of our data set. We used a pattern recognition/classifying network in Matlab with one hidden layer and 1000 nodes. Pattern recognition/classifying networks normally use a scaled conjugate gradient (SCG) backpropagation algorithm for updating weight and bias values in training. Scaled conjugate gradient is a fast supervised learning algorithm [27].
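The 70/15/15 partition of the 1250 feature vectors can be sketched as follows. This Python snippet only illustrates the split; it does not reproduce the Matlab network, and the function name, the fixed seed and the choice to give the rounding remainder to the test set are assumptions.

```python
import random


def split_dataset(samples, train_pct=70, validation_pct=15, seed=0):
    """Shuffle samples and split them into train, validation and test parts.

    Integer percentages avoid floating point rounding; whatever remains
    after the train and validation parts goes to the test set.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_pct // 100
    n_validation = len(shuffled) * validation_pct // 100
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    test = shuffled[n_train + n_validation:]
    return train, validation, test


# 50 PINs x 5 users x 5 repetitions = 1250 feature vectors (indices stand in for them).
train, validation, test = split_dataset(range(50 * 5 * 5))
```

Because 15% of 1250 is not a whole number, the validation and test parts differ by one sample here.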

IV. EVALUATION
In this section we present the results of our attack and compare them with other works.

A. PINlogger.js success rate
Table I shows the accuracy of our ANN trained with the data from all users. Since these results are based on the collected data from all users, we refer to it as the user-independent mode. As the table shows, in the first attempt PINlogger.js is able to infer the user's 4-digit PIN correctly with accuracy of 82.96%. The results get better on further attempts: our system is able to reveal the user's PIN with nearly 100% accuracy in three attempts. By comparison, a random attack can guess a PIN from the set of 50 PINs with the probability of 2% in the first attempt, and 6% in three attempts.
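Success "in k attempts" corresponds to what is usually called top-k accuracy: a sample counts as a hit if the true PIN is among the classifier's k highest-scored candidates. The Python sketch below shows this metric next to the random baseline of k guesses out of 50 PINs; the function names and the toy scores are hypothetical, and the paper's own figures come from the Matlab network rather than from this code.

```python
def top_k_accuracy(scores, true_labels, k):
    """Fraction of samples whose true label is among the k best-scored candidates.

    scores[i] maps each candidate PIN to the classifier's score for sample i.
    """
    hits = 0
    for sample_scores, truth in zip(scores, true_labels):
        ranked = sorted(sample_scores, key=sample_scores.get, reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


def random_guess_rate(k, num_pins=50):
    """Chance that k distinct random guesses include the right PIN."""
    return k / num_pins


# Toy scores for two samples over three candidate PINs.
toy_scores = [
    {"1234": 0.6, "0000": 0.3, "1111": 0.1},
    {"1234": 0.5, "0000": 0.1, "1111": 0.4},
]
toy_truth = ["1234", "1111"]
first_attempt = top_k_accuracy(toy_scores, toy_truth, 1)
second_attempt = top_k_accuracy(toy_scores, toy_truth, 2)
```

With 50 candidate PINs, `random_guess_rate(1)` and `random_guess_rate(3)` give the 2% and 6% baselines quoted in the text.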

B. User-dependent mode
In order to study the impact of individual training, we trained, validated and tested the network with the data collected from one user. We refer to this mode of analysis as the user-dependent mode. We asked our user to enter 50 random PINs, repeated over 5 rounds. The reason we have increased the number of rounds from 1 to 5 is that the classifier needs to receive enough samples to be able to train the system. Interestingly, our user used all three different input methods shown in Figure 4 during PIN entry. As expected, our classifier performs better when it is personalized: the accuracy increases to 91.42% in the first attempt, and 98.64% and 100% in two and three attempts, respectively.
In the user-dependent mode, convincing the users to provide the attacker with sufficient data for training customised classifiers is not easy, but still possible. Approaches based on game apps similar to in-app ideas such as Math Trainer 2 could be applied. Math-based CAPTCHAs are possible web-based alternatives. Any other web-based game application which segments the GUI similar to a numerical keypad will do as well. Nonetheless, this is out of the scope of this paper since we mainly follow a user-independent approach.

C. Guessing the PIN from the entire PIN space
One might argue that the attack should be evaluated against the whole 4-digit PIN space. However, we believe that the attack could still be practical when selecting from a limited set of PINs since users do not select their PINs randomly [9]. It has been reported that around 27% of the 4-digit PINs chosen by users belong to a set of just 20 PINs 3, including straightforward ones like '1111', '1234', or '2000'. Nevertheless, we present the results of our analysis of the attack against the entire search space for both the user-independent and user-dependent modes. We train another ANN in order to infer a single digit on the numpad. In this experiment, we consider 10 classes of the entered digits (0-9) from the data we collected on 4-digit PINs used in Section IV-A. Unlike the test condition of Section IV-B, we did not have to increase the number of rounds of PIN entry for the user-dependent mode here since we had enough samples for each digit per user. Hence, we trained and tested our system with individual users, and used the average of the results of our 5 users in this section. The average identification rates of different digits are presented in Table III. The results in our user-independent mode show that it is possible to correctly infer digits in over 71% of the cases in the first attempt, going up to 92% in three attempts. This means that for a 4-digit PIN and based on the obtained sensor data, the attacker can guess the PIN to be within a set of $3^4 = 81$ possible PINs with a probability of success equal to $0.92^4 = 71.67\%$. A random attack, however, can only predict the 4-digit PIN with the probability of 0.81% in 81 attempts. By comparison, PINlogger.js achieves a dramatically higher success rate than a random attacker.
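The arithmetic behind this argument can be written out in a few lines. The Python sketch below assumes, as the text implicitly does, that the four digits are recognised independently; the function names are hypothetical. With the rounded per-digit rate of 92% the product comes to roughly 0.716, so the 71.67% quoted above presumably uses the unrounded rate from Table III.

```python
def candidate_set_size(guesses_per_digit, pin_length=4):
    """Number of PINs covered when keeping several guesses for every digit."""
    return guesses_per_digit ** pin_length


def pin_success_probability(per_digit_rate, pin_length=4):
    """Chance that every digit's true value is among its guesses.

    Assumes the digits are recognised independently of each other.
    """
    return per_digit_rate ** pin_length


def random_success_probability(attempts, pin_length=4):
    """Chance that `attempts` distinct random guesses hit a uniformly chosen PIN."""
    return attempts / 10 ** pin_length


attempts = candidate_set_size(3)                 # 3 guesses for each of 4 digits
informed = pin_success_probability(0.92)         # sensor-based attacker
baseline = random_success_probability(attempts)  # attacker guessing blindly
```

The ratio `informed / baseline` makes the gap between the sensor-based attack and blind guessing explicit for the same budget of 81 attempts.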
Using a similar argument, in the user-dependent mode the success probability of guessing the PIN in 81 attempts is 81.62%. In the same setting, Cai and Chen report a success rate of 65% using accelerometer and gyroscope data [3], and Simon and Anderson's PIN Skimmer only achieves a 12% success rate in 81 attempts using camera and microphone [31]. Our results in digit recognition in this paper are also better than what is achieved in TouchSignatures [24]. In summary, PINlogger.js performs better than all sensor-based digit-identifier attacks in the literature.

D. Comparison with related work
Obtaining sensitive user information such as PINs based on mobile sensors through a malicious app running in the background has been actively explored by researchers in the field. For example, GyroPhone, by Michalevsky et al. [25], shows that gyroscope data is sufficient to identify the speaker and even parse speech to some extent. Other examples include Accessory [29] by Owusu et al. and Tapprints [26] by Miluzzo. They infer passwords on full alphabetical soft keyboards based on accelerometer measurements. Touchlogger [11] is another example by Cai and Chen [3] which shows the possibility of distinguishing user's input on a mobile numpad by using accelerometer and gyroscope. The same authors demonstrate a similar attack in [12] on both numerical and full keyboards. The only work which relies on in-browser JavaScript access to sensors for attacking a numpad is our previous work, TouchSignatures [24]. All of these works, however, aim for the individual digits or characters of a keyboard, rather than the entire PIN or password.
Another category of works directly targets user PINs. For example, PIN Skimmer by Simon and Anderson [31] is an attack on a user's numpad and ultimately PINs using their smartphone camera and microphone. Spreitzer suggests another PIN Skimming attack [32] and steals a user's PIN based on the measurements from the smartphone's ambient light sensor. Narain [...] microphone [28]. TapLogger by Xu et al. [35] is another attack on the smartphone numpad which outputs the pressed digits and PINs based on accelerometer and orientation sensor data. Similarly, Aviv et al. introduce an accelerometer-based side channel attack on the user's PINs and patterns in [4]. We choose to compare PINlogger.js with the works in this category since they have the same goal of revealing the user's PINs. Table II presents the results of our comparison.
As shown in Table II, PINlogger.js is the only attack on PINs which acquires the sensor data via JavaScript code. In-browser JavaScript-based attacks impose even more security threats to users since, unlike in-app attacks, they do not require any app installation or user permission to work. Moreover, the attacker does not need to develop different apps for different platforms such as Android, iOS, and Windows. Once the attacker develops the JavaScript code, it can be deployed to attack all mobile devices regardless of the platform. Moreover, PINlogger.js and [4] are the only user-independent works. By contrast, the results from other works are based on training the classifiers for individual users. In other words, they assume the attacker is able to collect input training data from the victim user before launching the PIN attack. We do not have such an assumption as the training data is obtained from other users. In terms of accuracy, with the exception of [28], PINlogger.js generally outperforms other works with an identification rate of 82.96% in the first try, and 96.23% and 100% in the second and third tries respectively. This is a significant success rate (despite the fact that the sampling rate in-browser is much lower than that available in-app) and confirms that the described attack imposes a serious threat to the users' security and privacy.

V. WHY DOES THIS VULNERABILITY EXIST?
Although reports of side channel attacks based on the in-browser access to mobile sensors via JavaScript are relatively recent, similar attacks via in-app access to mobile sensors have been known for years. Yet the problem has not been fixed. Here, we discuss the reasons why such a vulnerability has remained unfixed for a long time.

A. Unmanaged sensors
In an attempt to explain multiple sensor-related in-app vulnerabilities, Xu et al. argue that "the fundamental problem is that sensing is unmanaged on existing smartphone platforms" [35]. There are multiple in-app side-channel attacks that support this argument, as we discussed in the previous section. Our work shows that the problem of in-app access to "unmanaged sensors" is now spreading to in-browser access.
Here we present the "unmanaged" motion and orientation sensor case which shows how the technical mismanagement of these sensors causes serious user privacy consequences when it comes to unregulated access to such sensors via JavaScript.
W3C vs. Android. According to W3C specifications, the motion and orientation sensor streams are not raw sensor data, but rather high-level data which are agnostic to the underlying source of information. Common sources of information include gyroscopes, compasses and accelerometers [2]. In Tables IV and V, we present raw (low-level) and synthesized (high-level) motion sensors supported by Android [17] along with their descriptions and units, as well as their corresponding W3C definitions [2].
As can be seen in the tables, different terminologies have been used for describing the same measurements in-app and in-browser. For example, while in-app access uses the raw sensor terminology, i.e., accelerometer, gyroscope, magnetic field, the in-browser access uses synthesized sensor terminology, i.e., motion and orientation [2]. This creates confusion for users (as we will explain later) and developers (as we experienced it ourselves). One of the W3C's specifications on mobile sensors, "Generic Sensor API" [1], dedicates a few sections to the issue of naming sensors, and low-level and high-level sensors. It discusses how the terminology for in-browser access has been high-level so far. It also mentions that the low-level use cases are increasingly popular among the developers. As stated in this specification: "The distinction between high-level and low-level sensor types is somewhat arbitrary and the line between the two is often blurred". And, "Because the distinction is somewhat blurry, extensions to this specification are encouraged to provide domain-specific definitions of high-level and low-level sensors for the given sensor types they are targeting". We believe that, due to the rapid increase of mobile sensors, it is necessary to come up with a consistent approach.
Furthermore, as mentioned before, HTML5 code can be recompiled to build native apps [18], [34], called hybrid apps, via automatic tools such as NativeScript 4. Hybrid apps provide a cross-platform, device-independent means for developers to utilize mobile-only capabilities such as sensors. The inconsistency in how sensors are handled on the web, and in particular in their access control policies, would probably conflict with OS policies and create even more complexity.

B. Unknown sensors
Another contributing factor is that users seem to be less familiar with the relatively newer (and less advertised) sensors such as motion and orientation, as opposed to their immediate familiarity with well-established sensors such as camera and GPS. For example, a confused user has asked this question on a mobile forum: "... What benefits do having a gyroscope, accelerometer, proximity sensor, digital compass, and barometer offer the user? I understand it has to do with the phone orientation but am unclear in their benefits. Any explanation would be great! Thanks!" 5.
We design and conduct user studies in this work in order to investigate whether these sensors and their risk levels are indeed less known to the users.
List of mobile sensors. We prepared a list of different mobile sensors by inspecting the official websites of the latest iOS and Android products, and the specifications that W3C and Android provide for developers. We also added some extra sensors, found in common sensing mobile hardware, which are not covered by these sources.

C. User study
We prepared a list of sensors based on the above. We asked volunteer participants to rate the level of their familiarity with each sensor. In all of our studies, we had 29 participants (12 self-identified as male and 17 as female) from the university and local community recruited through social and vocational networks, from 18 to 59 years old, with a median age of 31. With one exception, none of the participants was studying or working in the field of computer security. Our university participants were from multiple degree programs and levels, and the remaining participants worked in a range of different fields. Moreover, our participants owned a wide range of mobile devices, and had been using a smartphone/tablet for 6 years on average (from 0 to 11 years). We interviewed our participants at a university office and gave each an Amazon voucher (worth 10 pounds Sterling) at the end for their participation. Details of the interview template can be found in the Appendix. For a list of 25 different sensors, we used a five-point scale self-rated familiarity questionnaire as used in [21]: "I've never heard of this", "I've heard of this, but I don't know what this is", "I know what this is, but I don't know how this works", "I know generally how this works", and "I know very well how this works". The list of sensors was randomly ordered for each user to minimize bias. In addition, we observed the experiments to make sure users were answering the questions based on their own knowledge, in order to avoid the effect of processed answers. Full descriptions of all studies are provided in the Appendix. Fig. 5 summarizes the results of this study.
Our participants were generally surprised to hear about some sensors and impressed with the variety. As one may expect, newer sensors tend to be less known to users than older ones. In particular, our participants were generally not familiar with ambient sensors. Low-level hardware sensors such as accelerometer and gyroscope also seem to be less known to users than high-level software ones such as motion, orientation, and rotation. We suspect that this is partly due to the fact that the high-level sensors are named after their functionalities and can be more immediately related to user activities.
We also noticed that a few of the participants knew some of the low-level sensors by name but could not link them to their functionality. For example, one of our participants, who knew almost all of the listed sensors (except hall sensor and sensor hub), stated that: "When I want to buy a mobile [phone], I do a lot of search, that is why I have heard of all of these sensors. But, I know that I do not use them (like accelerometer and gyroscope)". On the other hand, as the functionalities of mobile devices grow, vendors quite naturally turn to promoting the software capabilities of their products instead of introducing the hardware. For example, many mobile devices are recognised by users for their gesture recognition features; however, the same users might not know how these devices provide such a feature. For instance, one of the participants commented on a feature on her smartphone called "Smart Stay" 9 as follows: "I have another sensor on my phone; Smart Stay. I know how it works, but I don't know which sensors it uses?".
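The ordering used in Fig. 5 (share of participants who know "generally" or "very well" how a sensor works) can be illustrated with a minimal Python sketch. The function names and the answers below are invented for illustration only; they are not the study's data or analysis code.

```python
from collections import Counter

# The five familiarity levels of the questionnaire, lowest to highest.
LEVELS = [
    "I've never heard of this",
    "I've heard of this, but I don't know what this is",
    "I know what this is, but I don't know how this works",
    "I know generally how this works",
    "I know very well how this works",
]

def top_two_share(responses):
    """Percentage of responses at the two highest familiarity levels."""
    counts = Counter(responses)
    top = counts[LEVELS[3]] + counts[LEVELS[4]]
    return 100.0 * top / len(responses)

def order_sensors(answers):
    """Sort sensors by descending top-two share (ties broken by name)."""
    shares = {sensor: top_two_share(r) for sensor, r in answers.items()}
    return sorted(shares.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical answers from four participants (NOT the study's data).
example = {
    "Gyroscope": [LEVELS[0], LEVELS[1], LEVELS[3], LEVELS[2]],
    "GPS": [LEVELS[4], LEVELS[3], LEVELS[3], LEVELS[2]],
    "Hall sensor": [LEVELS[0], LEVELS[0], LEVELS[1], LEVELS[0]],
}

for sensor, share in order_sensors(example):
    print(f"{sensor}: {share:.0f}%")
```

The same aggregation, applied to the two highest concern levels instead of the two highest familiarity levels, would give an ordering of the kind used for the risk questionnaire.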

VI. RISK PERCEPTION OF MOBILE SENSORS
In this section, we study the participants' risk perception of mobile sensors. There have been several studies on risk perception addressing different aspects of mobile technology. Some works discuss the risks that users perceive in smartphone authentication methods such as PIN and patterns [19], TouchID and Android face unlock [15], and implicit authentication [22]. Other works focus on the privacy risks of certain sensors such as GPS [5]. In [30], Raji et al. show users' concerns (on disclosure of selected behaviours and contexts) about a specific sensor-enabled device called AutoSense 10. To the best of our knowledge, the research presented in this paper is the first that studies user risk perception for a comprehensive list of mobile sensors (25 in total). We limit our study to the level of perceived risk users associate with their PINs being discovered by each sensor. We chose PINs for two reasons: first, finding one's PIN is a clear and intuitive security risk, and second, we can put the perceived risk levels in context with respect to the actual risk levels for a number of sensors as described in Table II.

A. Methodology
For this study, we interviewed the same group of users from Section V-C in two phases. In phase one, we gave them the same sensor list (randomized for each user). We asked users to rate the level of risk they perceive for each sensor with regard to revealing their PINs. We described a specific scenario in which a game app is open in the background and the user is working on his online banking app, entering a PIN. We used a self-rated questionnaire with five-point scale answers following the same terminology as used in [30]: "Not concerned", "A little concerned", "Moderately concerned", "Concerned", and "Extremely concerned". During this phase, we asked the users to rely on the information that they already had about each sensor (see the Appendix for details). In the second phase, we first provided the participants with a short description of each sensor and let them know that they could ask further questions until they felt confident that they understood the functionality of all sensors. Afterwards, we asked the participants to fill in another copy of the same questionnaire on risk perceptions (details in the Appendix). The results are presented in Fig. 6.

B. Intuitive risk perception
We make the following observations from the results of the experiment:
Touch Screen. Although our participants rated the touch screen as one of the riskiest sensors in relation to a PIN discovery scenario, about half of our participants were still either moderately concerned, a little concerned, or not concerned at all. Through our conversations with the users, we received some interesting comments, e.g., "Why any of these sensors should be dangerous on an app while I have officially installed it from a legal place such as Google Play?", and "As long as the app with these sensors is in the background, I have no concern at all". It seems that a more general risk model in relation to mobile devices is affecting the users' perception of the presented PIN discovery threat. This can be a topic of research on its own, and is out of the scope of this paper.
Communicational Sensors. One category of sensors which users are generally concerned about includes WiFi, Bluetooth and NFC. For example, one of the participants commented that: "I am not concerned with physical [motion, orientation, accelerometer, etc.]/ environmental [light, pressure, etc.] sensors, but network ones. Hackers might be able to transfer my information and PIN". This comment is understandable since we asked them to what extent they were concerned about each sensor in regard to the PIN discovery.
Identity-related Sensors. Another category which has been rated riskier than others contains those sensors which can capture something related to the user's identity, i.e., camera, fingerprint, GPS, microphone, and TouchID. Although we described a PIN-related scenario, our participants were still concerned about these sensors. This was also pointed out by a few participants through the comments. For example, a user stated: "..., however, GPS might reveal the location along with the user input PIN that has a risk to reveal who (and where) that PIN belongs to. Also the fingerprint/TouchID might recognize and record the biometrics with the user's PIN". Some of these sensors, such as GPS, fingerprint, and TouchID, however, cannot cause the disclosure of PINs on their own. Hence, the concern does not entirely match the actual risk. Similar to the discussion on touch screen, we believe that a more general risk model on mobile technology influences how users perceive the risk of specific threats such as the one we presented to them.
Environmental Sensors. The level of concern about ambient sensors (humidity, light, pressure, and temperature) is generally low and stays low after the users are provided with the description of the sensors (see Fig. 6). In many cases, our users expressed that they were concerned about these sensors simply because they did not know them: "[now that I know these sensors,] I am quite certain that movement/environmental sensors would not affect the security of personal id/passwords etc.". In fact, researchers have reported that it is possible to infer the user's PIN using the ambient light sensor data [32], although, to our knowledge, exploits of other environmental sensors have not been reported in the literature.
Movement Sensors. For the sensors related to the movement and the position of the phone (accelerometer, gyroscope, motion, orientation, and rotation), the users display varying levels of risk perception. In some cases they are slightly more concerned, but in others they are less concerned once they know the functionality. Some of our users stated that since they did not know these sensors, they were not concerned at all, but others were more concerned when they were faced with new sensors. Overall, knowing or not knowing these sensors has not affected the perceived risk level significantly, and they were rated generally low in both cases.
Motion and Orientation Sensors. The sensors with which we performed our attack, namely orientation, rotation, and motion, were generally not rated as high risk for revealing PINs. Users do not seem to be able to relate the risk of these sensors to the disclosure of their PINs, even though they seem to have an average general understanding of how they work. For hardware sensors such as accelerometer and gyroscope, the risk perception seems to be even lower. A few comments include: "In my everyday life, I don't even think about these [movement] sensors and their security. There is nothing on the news about their risk", and "I have never been thinking about these [movement] sensors and I have not heard about their risk". On the other hand, some of the participants expressed more concern about sensors that they were familiar with, as one wrote, "You always hear about privacy stuff for example on Facebook when you put your location or pictures". Similarly, it seems that having a previous risk model is a factor that might explain the correlation between the user's knowledge and their perceived risk.

C. General knowledge versus risk perception
Figs. 5 and 6 suggest that there may be a correlation between the relative level of knowledge users have about sensors and the relative level of risk they perceive from them. We limit our attention to users' knowledge before being presented with sensor descriptions. We confirm our observation of correlation using Spearman's rank-order correlation measure.
Spearman's correlation between the comparative knowledge (median: "I know what this is, but I don't know how this works", IQR: "I've never heard of this" - "I know very well how this works") and the perceived risk about different sensors (median: "Not concerned", IQR: "Not concerned" - "A little concerned") was r = 0.61 (p < 0.05). This result supports the observation that the more users know about these sensors, the more concern they express about the risk of the sensors revealing PINs. We acknowledge that other methods of ranking the results, e.g., using the median, produce slightly different final rankings. However, given the high confidence level of the above test, we expect the correlation to be supported if other methods of ranking are used.
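For readers unfamiliar with the measure, the following self-contained Python sketch computes Spearman's rank-order correlation as the Pearson correlation of average ranks. The per-sensor scores are hypothetical placeholders, not the values behind the reported r = 0.61, and the significance test is not reproduced here.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical per-sensor scores (NOT the study's data): e.g. percentage of
# participants who know a sensor well, and percentage who are concerned.
knowledge = [90, 75, 60, 40, 20, 10]
concern = [55, 60, 30, 35, 5, 10]

rho = spearman(knowledge, concern)
print(f"Spearman's rho = {rho:.2f}")
```

Working on ranks rather than raw scores is what makes the measure suitable for ordinal questionnaire data such as the five-point scales used here.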
Assuming that customer demand drives better security designs, the above correlation may explain why sensors that are newer to the market have not been considered as OS resources and consequently have not been subject to similar strict access control policies.

D. Perceived risk vs the actual risk
We are specifically interested in the users' relative risk perception of sensors in revealing their PINs in comparison to the actual relative risk level of these sensors. We list the results reported in the literature in Table II for the following sensors: light, camera, microphone, gyroscope, motion, and orientation. Fig. 6 shows that users have generally expressed more concern about sensors such as camera and microphone than about accelerometer, gyroscope, orientation, and motion. This does not match the actual risk levels, since the latter sensors allow PIN recovery with higher accuracy, as we have shown in Section IV. When asked after filling in the questionnaire, most participants could not come up with realistic attack scenarios using camera and microphone. For microphone, some users thought they might say the PIN out loud. For camera, a few of our participants thought face recognition might be used to recover the PIN, hence they rated the camera's risk to their PINs as high. One user thought the camera might capture the reflection of the entered PIN in their glasses.
Among our participants, one mentioned, after seeing the sensor descriptions, that motion, orientation, accelerometer, and gyroscope might be able to record the shakes of the mobile phone while a PIN is entered, but expressed doubt about it: "I feel those positional sensors might be able to reveal something about my activities, for example if I open my banking app or enter my PIN. But it is extremely hard for different users, and when working with different hands and positions". This participant expressed only "a little concern" about them. One of our participants was completely familiar with these attacks and in fact had read some related papers. This user was "extremely concerned". Other users who rated these sensors as risky said they were generally concerned about different sensors. One commented: "I can not think of any particular situation in which these sensors can steal my PIN, but the hackers can do everything these days."

VII. POSSIBLE SOLUTIONS
In this section, we discuss the current academic and industrial countermeasures to mitigate sensor-based attacks.

A. Academic approach
Different solutions to address the in-app access attacks have been suggested in the literature, e.g., restricting the sensor to one app, reducing the sampling rate, temporarily pausing the sensor during sensitive entries such as keyboard entries, rearranging the keyboard for password entry, asking for explicit permission from the user, ranking apps based on their similarities to malware, and obfuscating anomalies in sensor data [28], [4], [32], [35], [31], [25], [26], [29], [14], [7]. However, after many years of research showing the serious security risks of sensors such as accelerometer and gyroscope, none of the major mobile platforms have revised their in-app access policy.
We believe that the risks of unmanaged sensors on mobile phones, especially through JavaScript code, are not yet well known. More specifically, many OS/app level solutions, such as asking for permissions at installation time or malware detection approaches, would not work in the context of a web attack. In our previous work [24], we suggested applying the same security policies as those for camera, microphone, and GPS to the motion and orientation sensors. Our suggestion was to set up a multi-layer access control system at the OS and browser levels. However, the usability and effectiveness of this solution are arguable. First, asking the user for too many permissions for different sensors might not be usable. Furthermore, for some basic use cases such as gesture recognition to clear a web form, or adjusting the screen from portrait to landscape, it might not make sense to ask for user permission for every website. Second, with the increase in the number of sensors accessible through mobile browsers, this approach might not be effective due to the classic problem of users sidestepping the security procedure when it is too much of a burden [10]. As stated by one of our participants: "I don't mind these sensors being risky anyway. I don't even review the permission list. I have no other choice to be able to use the app". Moreover, as we have shown in Section V, users generally do not understand the implications of these sensors for discovering their PINs, for example, even though they know how these sensors work. Hence, such an approach might not be effective in practice.

B. Industrial approach
W3C Device Orientation Event Specification. There is no Security and Privacy section in the latest official W3C Working Draft Document on Device Orientation Event [2]. However, at the time of writing this paper, a new version of the W3C specification is being drafted, which includes a new section on security and privacy issues related to mobile sensors 11, as suggested by us in [24]. The authors working on the revision of the W3C specification point out the problem of fingerprinting mobile devices [8], and touch action recovery [24] through these sensors, and suggest the following mitigations:
• "Do not fire events when the page where they were registered on is not visible or has been backgrounded."
• "Fire events only on the top-level browsing context or same-origin nested iframes."
• "Limit the frequency of events (typically 60 Hz seems to be sufficient)."
We believe that these measures may be too restrictive in blocking useful functionalities. For example, imagine a user consciously running a web program in the browser to monitor his daily physical activities such as walking and running. This program needs to continue to have access to the motion and orientation sensor data when the user is working on another tab or minimizes the browser. One might argue that such a program should be available as an app instead, hence the use case is not valid. However, it is expected that the boundary between installed apps and embedded JavaScript programs in the browser will gradually diminish [13].
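As a rough illustration of how the three quoted mitigations combine, the following Python sketch models them as a single gate that decides whether a sensor event is delivered to a page. It is a toy model under our own assumptions; the class and parameter names are hypothetical and it does not describe how any browser implements these checks.

```python
from dataclasses import dataclass

@dataclass
class EventGate:
    """Toy model of the three quoted mitigations (not any browser's code)."""
    max_hz: float = 60.0                  # frequency cap, as in the quote
    last_fired: float = float("-inf")     # time of the last delivered event

    def should_fire(self, t, page_visible, top_level, same_origin):
        # 1) No events while the page is hidden or backgrounded.
        if not page_visible:
            return False
        # 2) Only top-level contexts or same-origin nested iframes.
        if not (top_level or same_origin):
            return False
        # 3) Deliver at most max_hz events per second.
        if t - self.last_fired < 1.0 / self.max_hz:
            return False
        self.last_fired = t
        return True

# Simulate one second of a hypothetical 200 Hz sensor stream reaching a
# visible, top-level page, and count how many samples are delivered.
gate = EventGate()
delivered = sum(
    gate.should_fire(i / 200, page_visible=True, top_level=True, same_origin=True)
    for i in range(200)
)
print(f"delivered {delivered} of 200 samples")
```

The first rule is the one that conflicts with the background activity-monitoring use case described above: once the page is hidden, the gate rejects every sample regardless of the other two conditions.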
Mobile browsers. As we showed in [24], browsers and mobile operating systems behave differently in providing access to sensors. Some allow access only on the active webpage and any embedded iframes (even those with different origins), while some allow access from other tabs, when the browser is minimized, or even when the phone is locked. Hence, there is no consistent approach across all browsers and mobile platforms. Reducing the frequency rate has been applied in all well-known browsers at the moment [24]. For instance, Chrome reduced the sensor readings from 200 Hz to 60 Hz due to security concerns 12. However, our attack shows that security risks are still present even at lower frequencies. iOS and Android limit the maximum frequency rate of some sensors such as the gyroscope to 100 Hz and 200 Hz, respectively. It is expected that these frequencies will increase on mobile OSs in the near future, and in-browser access is no exception. In fact, current mobile gyroscopes support much higher sampling frequencies, e.g., up to 800 Hz by STMicroelectronics (on Apple products), and up to 8000 Hz by InvenSense (on the Google Nexus range) [25]. With higher frequencies available, attacks such as ours can perform better in the future if adequate security countermeasures are not applied.
Following our report of the issue to Mozilla, starting from version 46 (released in April 2016), Firefox restricts JavaScript access to motion and orientation sensors to only top-level documents and same-origin iframes 13. In the latest Apple Security Updates for iOS 9.3 (released in March 2016), Safari took a similar countermeasure by "suspending the availability of this [motion and orientation] data when the web view is hidden" 14. However, we believe the implemented countermeasures should only serve as a temporary fix rather than the ultimate solution.
In particular, we are concerned that they have the drawback of prohibiting potentially useful web applications in the future. For example, a web page running a fitness program has a legitimate reason to access the motion sensors even when the web page view is hidden. However, this is no longer possible in the new versions of Firefox and Safari. Our concern is confirmed by members of the Google Chromium team 15, who also believe the issue remains unresolved.

VIII. FURTHER DISCUSSION AND LIMITATIONS OF OUR WORK
As mentioned earlier, another problem with respect to the discussed sensors is that there is no consistent terminology to refer to sensors available in-app and in-browser, e.g., accelerometer and gyroscope sensors available to an app vs. motion and orientation in the browser. Hence, even if the user recognizes these sensors in-app, it is not guaranteed that he will be able to recognize the same sensors in-browser, and vice versa. On the other hand, many of the suggested academic solutions either have not been applied by the industry as a practical solution, or have failed. Given the results of our user studies, designing a practical solution for this problem does not seem to be straightforward. A combination of different approaches might help researchers devise a usable and secure solution. Having control over granting access before running a website and while working with it, in combination with a smart notification feature in the browser, would probably achieve a balance between security and usability. Users should also have control over reviewing, updating and deleting these data, if stored by the website or shared with a third party afterwards. Solutions such as TaintDroid [16], a tracking app for monitoring sources of sensitive data on a mobile device, which has been applied to GPS in [5], could be helpful. After all, it seems that an extensive study is required toward designing a permission framework which is usable and secure at the same time. Such research is a very important usable security and privacy topic to be explored further in the future.
We consider this work a pilot study that explores user risk perception on a comprehensive list of mobile sensors. We envisage the following future work to address its limitations and expand this work:
• More Participants: We performed our user studies on a set of users who were recruited from a wide range of backgrounds. Yet the number of participants is limited. A larger set of participants will improve the confidence in the results. With a large and diverse set of participants, we can also study the effect of demographic factors on perceived risk.
• Other Risks: We studied the perceived risk to PINs as a serious and immediate risk to users' security. The study can be expanded by studying users' risk perception of other issues such as attackers discovering phone call timing, physical activities, or shopping habits.
• Other Types of Access: When interviewing our participants, we presented them with a scenario involving a game app which is installed on their smartphone. This only covers the in-app access to sensors. However, people might express different risk levels for other types of access, e.g., in-browser access. This needs further investigation.
• Issues with Training Users: We decided to provide our participants with a short description of each sensor's functionality (details in the Appendix, part 3). Furthermore, the participants were given the chance to ask as many questions as they wanted to fully understand the functionality of each sensor. This might not be the most effective way to inform users about sensors, since some descriptions might seem too technical (and hence not fully understandable) to some users. The effectiveness of different approaches to informing users is a complex topic of research which can be explored in the future. Besides, we used the same set of participants to generally compare the level of perceived risk before and after seeing sensor descriptions. An alternative approach is to use a different set of participants, i.e., to follow a between-subjects approach instead of a within-subjects one, which would have less bias if carefully designed. However, in order to get meaningful results, the between-subjects approach would require recruiting a larger number of participants.

IX. CONCLUSION
In this paper, we introduced PINlogger.js, a web-based program which reveals users' sensitive information such as phone call timing, physical movements, and PINs by recording the mobile device's orientation and motion sensor data through JavaScript code. We also showed that users do not generally perceive a high risk of such sensors being able to steal their PINs. We discussed the complexity of designing a usable and secure solution to prevent the proposed attacks. Access to mobile sensor data via JavaScript is limited to only a few sensors at the moment. This will probably expand in the future, considering for instance the ongoing development of JavaScript-based operating systems such as Firefox OS 16. Hence, designing a general mechanism for secure and usable sensor data management remains a crucial open problem for future research.

Interview Script
Hi. Thanks very much for contributing to our study. In this interview, we will ask you to fill in a few questionnaires about mobile sensors such as GPS, camera, light, motion and orientation. You are encouraged to think out loud as you go through, and please feel free to provide any comments during the interview. Everything about this interview is anonymous. Please provide some information about yourself in Table VI

PART THREE
Let us explain each sensor here:
• GPS: identifies the real-world geographic location.
• Touch Screen: enables the user to interact directly with the display by physically touching it.
• WiFi: is a wireless technology that allows the device to connect to a network.
• Bluetooth: is a wireless technology for exchanging data over short distances.
• NFC (Near Field Communication): is a wireless technology for exchanging data over shorter distances (less than 10 cm) for purposes such as contactless payment.
• Proximity: measures the distance of objects from the touch screen.
• Ambient Light: measures the light level in the environment of the device.
• Ambient Pressure (Barometer), Ambient Humidity, and Ambient Temperature: measure the air pressure, humidity, and temperature in the environment of the device, respectively.
• Device Temperature: measures the temperature of the device.
• Gravity: measures the force of gravity.
• Magnetic Field: reports the ambient magnetic field intensity around the device.
• Hall sensor: produces voltage based on the magnetic field.
• Accelerometer: measures the acceleration of the device movement or vibration.
• Rotation: reports how much and in what direction the device is rotated.
• Gyroscope: estimates the rotation rate of the device.
• Motion: measures the acceleration and the rotation of the device.
• Orientation: reports the physical angle that the device is held in.
• Sensor Hub: is an activity recognition sensor and its purpose is to monitor the device's movement.
Please feel free to ask us about any of these sensors for more information.
Now that you have more knowledge about the sensors, let us describe the same scenario here again. Imagine that you own a smartphone which is equipped with all these sensors. You have opened a game app which can have access to all mobile sensors. You leave the game app open in the background, and open your banking app which requires you to enter your PIN. Do you think any of these sensors can help the game app to discover your entered PIN? To what extent are you concerned about each sensor's risk to your PIN? Please rate them in the table (Table VIII was used). In this part, please make sure that you know the functionality of all the sensors. If you are unsure, please have another look at the descriptions, or ask us about them.
Thanks very much for taking part in this study. Please leave any extra comment here.
An Amazon voucher and a business card are in this envelope. Please contact us if you have any questions about this interview, or are interested in the results of this study.

Fig. 2. Left: Three dimensions (x, y, and z) of acceleration data including gravity (from the motion sensor). The start time, duration, and end time of four phone calls are easily recognisable from these measurements. Right: The screenshot of the call history of the phone during the experiment.

Fig. 3. Three dimensions (x, y, and z) of acceleration data (from the motion sensor) during 22 s of sitting, 34 s of walking and 25 s of running.

Fig. 4. Different input methods used by the users for PIN entrance.

Fig. 5. Level of self-declared knowledge about different mobile sensors. Question: "To what extent do you know each sensor on a mobile device?" Sensors are ordered based on the aggregate percentage of participants declaring they know generally or very well how each sensor works. This aggregate percentage is shown on the right-hand side.

Fig. 6. Users' perceived risk for different mobile sensors, before (top bars) and after (bottom bars) being presented with descriptions of sensors. Question: "To what extent are you concerned about each sensor's risk to your PIN?" Sensors are ordered based on the aggregate percentage of participants declaring they are either concerned or extremely concerned about each sensor before seeing the descriptions. This aggregate percentage is the first value presented on the right-hand side.

TABLE V. POSITION SENSORS SUPPORTED BY ANDROID AND THEIR CORRESPONDING W3C DEFINITIONS. NOTE: ORIENTATION SENSOR WAS DEPRECATED IN ANDROID 2.2 (API LEVEL 8).
. multiple mobile sensors is presented below. To what extent do you know each sensor on a mobile device? Please rate them in the table (Table VII was used).

PART TWO
Imagine that you own a smartphone which is equipped with all these sensors. Consider this scenario: you have opened a game app which can have access to all mobile sensors. You leave the game app open in the background, and open your banking app which requires you to enter your PIN. Do you think any of these sensors can help the game app discover your entered PIN? To what extent are you concerned about each sensor's risk to your PIN? Please rate them in the table (Table VIII was used). In this section, please only rely on the knowledge you already have about the sensors, and if you do not know some of them, describe your feeling of security about them.