The Development and Usability Evaluation of an Interactive Health Technology Solution, for Encouragement of Physical Activity in Inactive Adults Based on the User Perspective

Physical inactivity is increasing worldwide. There is good evidence that technology can have a positive impact on health. Successful behavior change regarding physical activity requires user involvement and the integration of relevant behavior change techniques. The aim of this study was, based on the user perspective, to develop and later evaluate the usability of an interactive health technology solution that can encourage physically inactive adults to be physically active. A mixed methods design with a user-centered approach was used. The data collection included two workshops, technical development, and a usability evaluation (effectiveness, efficiency, satisfaction) involving 35 participants. From one workshop, four themes were identified: focus, platform, characteristics, and content. A schematic model of the first prototype was presented at the next workshop, and the feedback was positive. A behavior change application was developed (App&Move), with a focus on encouraging functions. App&Move, which measures physical activity in minutes per day, has integrated behavior change techniques, is based on physical activity recommendations, relies on scientific evidence, and is strongly based on the user perspective. The usability evaluation of App&Move showed that effectiveness was acceptable, efficiency was (with one exception) high, and satisfaction was medium. The results from this study contribute to the development of a first model of a behavior change application that aims to encourage physically inactive adults to be physically active. In conclusion, the usability evaluation of App&Move showed promising usability, and App&Move seemed to encourage physical activity to some extent. Future research could further improve the usability and the encouragement functions and address the user demands.


Background
The prevalence of physical inactivity is increasing in many countries (WHO 2005) and can therefore be regarded as a global public health problem. According to the World Health Organization (WHO), 23-55% of adults were not sufficiently physically active in 2010 (WHO 2017). Physical activity (PA) can reduce the risk of chronic disease (ISPAH 2010) and has positive effects on several body functions (WHO 2004; Van Praag 2008; Nieman et al. 2010; Hillman et al. 2008; Sibley and Etnier 2003). Different kinds of self-monitoring technologies are used today when promoting PA, for example, pedometers, activity trackers, and smartphone applications, mostly targeting people who are already physically active. However, pedometers have limitations (Corder et al. 2007; Melanson et al. 2004; Åkerberg and Linden 2013; Kinnunen et al. 2011); for example, a recent study showed that a pedometer application and a traditional pedometer were valid for step self-monitoring only under some conditions (Åkerberg et al. 2016).
Users can be defined as individuals who use a service or a product. Warburton et al. (2006) state that most improvements in health are gained when the least fit individuals become physically active. Therefore, the users in this study correspond to physically inactive adults. Involvement of users in health care device development has several advantages, such as improvements in functionality, usability, and quality of the devices (Sarwar Shah and Robinson 2007). The degree to which a product can be used by specified users to achieve goals in a specified context of use can be referred to as usability (ISO 9241-11 1998), and a user-centered design is often used to improve the usability of products or systems (Nielsen 1993; Abras et al. 2004). A recent study of a group of physically inactive people indicated that encouraging PA is more important than measuring it (Åkerberg et al. 2017, submitted). Through encouragement, people can become motivated to make behavior changes and therefore increase their level of PA. Behavior change techniques (BCTs) consist of theory-based techniques that are effectively used in behavior change interventions (Abraham and Michie 2008); however, most PA self-monitoring technologies include only a limited number of BCTs. Based on a user perspective, the aim of this study was to develop an interactive health technology solution to encourage physical activity behavior change in physically inactive adults. Further, the aim was to evaluate the developed interactive health technology solution in a usability study.

Methods
The Regional Ethics Review Board in Uppsala, Sweden, approved the study (EPN, Dnr 2013/072).

Study Design
A mixed methods design was used, based on a user-centered approach. This study was performed in two phases, as illustrated in Fig. 1.

Study samples
The selection method, the study sample, and the criteria for inclusion for the two phases in the study are illustrated in Table 1.
Phase 1 For the brainstorming workshop (B-WS), the experts were chosen because of their different kinds of technical/engineering experience and knowledge (4 m/2 w, ages 28-69 years). For the user workshop (U-WS), persons who had participated in a related study (Åkerberg et al. 2017, submitted) were invited (2 m/2 w, ages 40-64 years). The pre-testing included two voluntary test persons (1 m/1 w, 33-38 years).
Phase 2 Out of the sample of 23, 61% were men and 39% were women, in the ages of 18-64 years (74% were between 25 and 44). Seventy percent were employed and 30% were students. Out of the sample, 75% lived with a partner or partner and children, and 83% had university education. The majority of the participants used Samsung smartphones (70%); however, One Plus, Huawei, ZTE, and Sony were also represented.

Recruiting Process
Phase 1 For the B-WS and the U-WS, a request was sent by email to the selected persons. When recruiting the pre-testers, the test leader (main author) personally contacted employees at the university to ask if they used an Android smartphone and if they were willing to undergo pre-testing.
Phase 2 A request was sent by e-mail to participants who had participated in a related study (Åkerberg et al. 2017, submitted), and some of these were the same individuals who had participated in the U-WS. An e-mail request was sent to group-mail addresses of the employees, and advertisements were placed on bulletin boards at the university. Both employees and students were also recruited by personal contact.
Fig. 1 Illustration of the study designs and phases in the study

Data Collection
Phase 1 The data collection for the B-WS and U-WS consisted of Dictaphone recordings and written notes. The pre-testers were asked to use the interactive health technology solution, hereafter only referred to as “the technology solution,” for 4 weeks. Regular meetings were held once a week, which were documented. A test file, filtered to contain data between 06:00 and 22:00, was downloaded from the cloud, including raw data from the technology solution generated from the pre-testers' Android smartphones.
Phase 2 The data collection consisted of activity data from the technology solution and two questionnaires (Q1 and Q2; see Table 2). Q1 included 12 questions: background information, health status, PA level, and behavior change. Q2 consisted of 24 questions, partly the same as those of Q1, and usability questions. The raw acceleration data (the activity data) were converted to activity minutes per day. The participants' data were identified using an Android ID, which is specific for each Android smartphone.

Procedure
Phase 1 B-WS and U-WS were performed during autumn 2016 and spring 2017. One moderator (main author) led the B-WS, and two observers (the co-authors) took notes. Preparation documents were sent to the participants in advance. At the workshop, the participants completed a consent form and a background information form. Information was given about the study and the aim of the B-WS, which was to generate one or more theoretical ideas on how to build the technology solution. Then, parts from the “10-plus-10 method” (Greenberg et al. 2012) were used. Steps 1-4 were used as follows: (1) state your design challenge, (2) generate 10 or more different design concepts of a system that addresses this challenge, (3) reduce the number of design concepts, and (4) choose the most promising design concept(s) as a starting point. A questioning route was used, consisting of an introduction question, the selected steps from the 10-plus-10 method, and two ending questions. The B-WS was recorded by Dictaphone. The same moderator led the U-WS. The participants signed a consent form and completed a background information form. General information was given about the study and the aim of the workshop, which was to gain feedback from a user group that had participated in a related study (Åkerberg et al. 2017, submitted). A PowerPoint presentation described the first draft of the technology solution. This was followed by a discussion, where the moderator used a questioning route including eight questions. The discussion was recorded by a Dictaphone. Both WSs took a maximum of 1.5 h to perform. A company was contacted with a request to develop the technology solution according to findings in the present and former studies, and study results, requirement specification, drafts, and media files were delivered to the company. The technical development was conducted in an iterative process in collaboration with the company, including the testing of prototype versions in parallel. The pre-testing was performed during autumn 2017, collecting the pre-testers' opinions regarding functions and usability.
Table 2 (excerpt)
U1 - To what extent do you experience that App&Move has encouraged you to an increased level of physical activity? (not encouraged at all = 0, very encouraged = 10)
U2 - How do you experience that App&Move measures your physical activity? (not good = 0, very good = 10)
U3 - Do you experience that App&Move is capable of measuring all physical activity (both everyday activity and exercise)? (do not agree at all = 0, agree completely = 10)
Q1 questionnaire 1; Q2 questionnaire 2; A activity data from App&Move; B, C questions from Q1 and Q2; D, E, F usability questions from Q2, numbered as U1, U2, etc. SOC stands for stages of change
Phase 2 The usability study was performed during autumn 2017. The participants attended meeting 1 (approx. 60 min), which contained general information about the study and instructions regarding how to use the technology solution. The participants completed a consent form and questionnaire 1 (Q1), downloaded an Android device application, Device ID (Android ID) (1 M Lab, PEGI3), and installed the technology solution on their Android smartphones (Mälardalen University/Life Science Technology in Europe AB 2017). The participants were instructed to use the technology solution independently for 4 weeks. The day after it was installed, a baseline measure started, which consisted of a blind measure during seven whole days (the first week). The smartphone needed to be turned on 24/7, or restarted before 06:00, during the baseline measure. When the baseline measure was completed, the participant received the result as feedback on the screen, expressed as the mean minutes per day during the baseline measure. The baseline measurements were performed to assess the pre-intervention status and, further, to help the participants formulate their first personal goal. Immediately after the baseline measurement was finished, the participant was instructed to confirm the readiness to change the activity behavior and to formulate the first personal goal, which was set to 1 week. After that, the participant had access to all the functions in the application. Mandatory parts of the study were to perform the baseline measurement, to formulate a first personal goal, and to revise the goal once a week during the following 3 weeks. The test leader sent information to the participants by e-mail when the study started, at each mandatory step of the study, and when the study was ending. Every day during the 4-week test period, the test leader received reports by e-mail from the developer about the status of each Android ID.
After the 4 weeks of use, the participants attended meeting 2 (approx. 30 min), where they completed questionnaire 2 (Q2) and received information about future plans for the technology solution. All participants received a lottery ticket (trisslott) to compensate for their participation in the study.

Data Analysis
Phase 1 The B-WS was analyzed by qualitative content analysis (Graneheim and Lundman 2004), and the U-WS was summarized without any categorization. The protocols from the pre-testing were sent by e-mail to the developer.
Phase 2 The ISO's (ISO 9241-11 1998) definition of usability and the included aspects of effectiveness, efficiency, and satisfaction were used. Effectiveness reflects the quality of the system and can be described as how accurately the user can achieve specified goals with the system or product (ISO 9241-11 1998). In this study, the effectiveness aspect is indicated by the data that were related to encouraging PA and measuring outcomes of the technology solution. Efficiency corresponds to the expended resources in achieving the goals with the system or product (ISO 9241-11 1998). The efficiency indicator in this study corresponds to factors regarding learning and understanding how to use the technology solution. Satisfaction reflects the users' comfort with and attitudes towards using the system (ISO 9241-11 1998). In the present study, satisfaction is reflected by questions about attitudes towards the technology solution and further use outside the study. The aspects of usability and its included data are presented in Table 2.
The activity data from the technology solution were presented as descriptive statistics as baseline measure, outcome in minutes per week, and four personal goals. The baseline measure was assessed as the mean minutes for the baseline activity period. The first personal goal was possible to formulate after the baseline measure was finished, and the goal was thereafter revised three times. If no last goal (fourth) was formulated by the participant, the current goal was used as the last goal. The outcomes per week were assessed as follows: the sum of the number of activity minutes for the day of the first goal until the day before the goal was revised was divided by the number of days during this period. A day with zero (0) active minutes corresponds to when the technology solution was active but not able to detect any activity minutes. This was classified as an active day (the device was active). A day with no data (− −) corresponds to no data measured by the device or the cloud server collecting data from the device, which classified this day as a non-active day (the device was not active).
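The outcome calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `weekly_outcome` and the flag `skip_missing_days` are hypothetical, and the text does not state whether days with no data (− −) were kept in the denominator, so both variants are shown.

```python
from typing import Optional, Sequence


def weekly_outcome(daily_minutes: Sequence[Optional[int]],
                   skip_missing_days: bool = False) -> Optional[float]:
    """Mean activity minutes per day for one goal period.

    Each entry is one day: an int (0 = device active, no activity detected)
    or None (no data, i.e. the device was not active that day).
    skip_missing_days is an assumption, not something the study specifies.
    """
    if skip_missing_days:
        days = [m for m in daily_minutes if m is not None]
    else:
        days = list(daily_minutes)
    if not days:
        return None
    # A day without data contributes zero minutes to the sum.
    total = sum(m if m is not None else 0 for m in days)
    return total / len(days)


# Hypothetical period: one zero-minute day and one day without data.
period = [30, 0, None, 45, 60, 15, 60]
print(weekly_outcome(period))
print(weekly_outcome(period, skip_missing_days=True))
```

Keeping zero-minute days and no-data days as separate values (0 versus None) mirrors the distinction between active and non-active days made in the text.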
The Wilcoxon signed rank test (Field 2013) was used to study possible differences between the same questions asked in Q1 and Q2 (see Table 2); self-estimated health status (q. 1-3), PA level (q. 4-5), sedentary level (q. 6-7), and SOC (q. 8). The participants were classified in different SOC according to the algorithm by Marcus and Forsyth (2003). The significance level of 0.05 was used in all statistical analyses. The data were analyzed using Microsoft Excel (v.2013, Microsoft, WA, USA) and IBM SPSS Statistics (v.24, IBM Corporation, USA).
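For readers unfamiliar with the test, the following is a minimal pure-Python sketch of a Wilcoxon signed rank test on paired Q1/Q2 answers. It is not the SPSS procedure used in the study: it drops zero differences, gives tied values their average rank, and uses a plain normal approximation without tie or continuity correction, which is rough for small samples. The function name and the example answers are hypothetical.

```python
import math
from typing import Sequence, Tuple


def wilcoxon_signed_rank(before: Sequence[float],
                         after: Sequence[float]) -> Tuple[float, float, float]:
    """Return (W_plus, W_minus, approximate two-sided p value)."""
    # Paired differences; pairs with no change are dropped.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank the absolute differences, averaging ranks over ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    # Normal approximation of the null distribution of the rank sum.
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, w_minus, p_value


# Hypothetical answers from five participants on one question.
q1 = [3, 3, 2, 4, 3]
q2 = [4, 3, 3, 3, 5]
print(wilcoxon_signed_rank(q1, q2))
```

A result would be read against the 0.05 significance level stated above; for real analyses a statistics package with exact small-sample p values is preferable.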
The usability questions from Q2 (U1-U13) were presented as descriptive statistics, and several questions were also complemented with qualitative data from the questionnaires, which were analyzed by qualitative content analysis (Graneheim and Lundman 2004). The meaning units for each question were read several times and thereafter condensed, and categories were generated for each question separately. The categories are described in the text, and in most cases also clarified by one or several quotations.

Results
Phase 1 The transcript of the B-WS resulted in 57 meaning units and 11 categories, where four themes were formed. Figure 2 presents an illustration of the results from the qualitative content analysis. The yellow boxes describe the categories, and the blue boxes describe the four themes.
The four themes from the B-WS were used as the starting point when developing the first theoretical model of the technology solution and were later presented in the U-WS. The analyses of the U-WS generated a short summary: The users in the U-WS were positive toward the draft and toward using it, understood its benefits, and thought it was simple and clear. The users thought the functions were encouraging and were positive toward the red and green feedback. Some suggestions for feedback and personal or individual adjustments were given. The users thought that reminders were important and suggested connecting with other users in games and competitions. Possible challenges were the development of this technical solution, the fact that people are different, and the user demand to carry the smartphone.
An application aiming at behavior change was developed specifically for the Android platform. App&Move version 1.8 was delivered from the company and was available at Google Play for free download for the study participants. App&Move is mainly focused on encouraging the user to be more physically active and less sedentary and has built-in functions for encouragement in the form of integrated BCTs. The main functions in App&Move were listed, and by matching to the taxonomy by Abraham and Michie (2008), relevant BCTs were selected, according to the recommendation from Direito et al. (2014). The application includes 19 different BCTs (some repeated in several functions), as presented in Table 3.
With the built-in accelerometer in the smartphone, App&Move measures PA in minutes per day (everyday activity and exercise) between 06:00 and 22:00, in counts per minute (CPM). The CPM threshold was set to 220 CPM (a minute was defined as active if the number of CPM exceeded 220 during that minute) and filtered at 1.4 (meaning an acceleration more than 1.4 m/s² over gravitation, exceeding 9.81 + 1.4 m/s²). The CPM threshold determines how many counts a minute must contain to be classified as an active minute, and the filter determines the sensitivity. The first model of App&Move was developed for level 0, which means that the starting point for all new users is level 0, with possibilities to advance up to level 4 (but with the same functions as for level 0). App&Move is compatible with Android smartphones equipped with at least Android version 5.0. Requirements (user demands) for using App&Move are to carry/wear the smartphone during the time awake and to have access to Wi-Fi or mobile data. Data were uploaded from the device to the cloud server when the user was connected to a Wi-Fi network, or over mobile data if there had been no Wi-Fi connection during the last 24 h. Raw acceleration data are delivered to a server when there is an internet connection. Seven days of raw data can be saved on the smartphone for uploading to the server. The raw acceleration data were filtered and converted to activity minutes for the analysis.
Fig. 2 Schematic illustration of the results from the qualitative content analyses generated by the B-WS, described in categories and themes
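The active-minute rule can be illustrated with a short Python sketch. The thresholds (220 CPM, 1.4 m/s² above 9.81 m/s², the 06:00-22:00 window) come from the text; everything else is an assumption made for illustration, in particular that one count corresponds to one accelerometer sample whose magnitude exceeds the filter, and the sampling rate used in the example. The actual App&Move implementation may differ.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Iterable, Tuple

GRAVITY = 9.81                        # m/s^2
FILTER_OFFSET = 1.4                   # m/s^2 above gravity (from the text)
CPM_THRESHOLD = 220                   # counts per minute (from the text)
DAY_START_HOUR, DAY_END_HOUR = 6, 22  # measuring window 06:00-22:00


def active_minutes(samples: Iterable[Tuple[datetime, float]]) -> int:
    """Count active minutes from (timestamp, acceleration magnitude) samples.

    Assumption: every sample above the filter adds one count to its minute.
    """
    counts = Counter()
    for timestamp, magnitude in samples:
        if not (DAY_START_HOUR <= timestamp.hour < DAY_END_HOUR):
            continue  # outside the daily measuring window
        if magnitude > GRAVITY + FILTER_OFFSET:
            minute = timestamp.replace(second=0, microsecond=0)
            counts[minute] += 1
    # A minute is active only if its count exceeds the CPM threshold.
    return sum(1 for count in counts.values() if count > CPM_THRESHOLD)


# Hypothetical burst: 221 samples at 5 Hz inside the minute starting 10:00.
start = datetime(2017, 10, 2, 10, 0)
burst = [(start + timedelta(milliseconds=200 * i), 12.0) for i in range(221)]
print(active_minutes(burst))
```

Grouping by truncated timestamp keeps the sketch independent of the sampling rate, which the text does not report.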
A short description of the main functions in App&Move is as follows:
- Every user has a unique user ID, only visible to the user
- Every installation has a unique Android ID used for analysis
- App&Move presents activity data in text and illustration on the main page
- App&Move provides feedback as notifications, text, and icons
- App&Move performs a baseline measure during 7 days, blind for the user
- App&Move provides several pages with written information, tips, and available recommendations, based on research within the area
- The user should set and revise personal goals in minutes (possible to set between 10 and 1000 min) and during weeks (possible to set between 1 and 10 weeks; however, 1 week was set in this study)
- App&Move provides reminders regarding being physically active
- App&Move provides a link to App&Move's closed Facebook group
- App&Move provides feedback on activity minutes, in the form of bar graphs over 7 or 21 days
- App&Move provides storage and upload of accelerometer data to a dedicated server.
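The goal limits mentioned in the list above can be expressed as a small validation sketch. The class name `PersonalGoal` and its fields are hypothetical; only the ranges (10-1000 min, 1-10 weeks, with 1 week used in this study) are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonalGoal:
    """A personal goal with the limits described for App&Move."""
    minutes: int
    weeks: int = 1  # the study fixed the goal period to 1 week

    def __post_init__(self) -> None:
        if not 10 <= self.minutes <= 1000:
            raise ValueError("goal must be between 10 and 1000 min")
        if not 1 <= self.weeks <= 10:
            raise ValueError("goal period must be between 1 and 10 weeks")


goal = PersonalGoal(minutes=45)
print(goal)
```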
A flowchart over the navigation in App&Move is presented in Fig. 3.
When developing App&Move, parts from available PA recommendations were adopted.
4. In addition to being regularly physically active, try to decrease the total amount of daily sedentary time (American Diabetes Association; Colberg et al. 2016).

Table 3 Functions in App&Move and the integrated BCTs
Make agreement: Agreement of behavioral contract, action planning, provide feedback on performance, self-efficacy
Measure active time: Self-monitoring of behavior outcomes, provide feedback on performance
Baseline measure: Self-monitoring of behavior outcomes, provide feedback on performance
Encouragement: Fear arousal (fear-inducing information on health risks), informing when and where to perform the behavior, environmental restructuring, use of follow-up prompts
Personal goals: Information provision (to the individual), SMART goal setting, effort or progress towards contingent rewards, agreement of behavioral contract
Report and compare measures: Review of outcome goals, provide feedback on performance, focus on past success
Connection with other users: Plan social support, information provision (others' approval), information provision (others' behavior), facilitate social comparison
Game functions: Plan social support, gamification (a), provide feedback on performance
(a) Gamification has several similarities with some available BCTs (Cugelman 2013)
In parallel with the technical development, testing of the App&Move versions was conducted, for instance, when walking, jogging, on a treadmill, housework, gardening, bicycling, on stairs, on a slope, and driving a car. During the pre-testing, the company that developed App&Move made minor changes and fixed some bugs in App&Move, according to the delivered pre-testing protocols.
Phase 2 In total, 23 participants used App&Move during the test period of 4 weeks (17 users owned Androids, and 6 had temporary Android smartphones) and data were generated from 22 persons; 91% of the participants could certify their own use of App&Move, and 77% had completed the test period according to the given instructions. Reasons for not performing the study according to the given instructions were as follows: forgot to carry the smartphone sometimes, health issues, problems with the smartphone battery, the application temporarily crashed sometimes, and problems uploading data. One week after the test period ended, nine participants (43%) were still active users of App&Move; eight participants (38%) were still active after 2 weeks, five participants (24%) after 3 weeks, three participants (13%) after 4 weeks, and two participants (9%) after 8 weeks. A participant was regarded as still active after the test period if he/she continued to use the application after the test period of 4 weeks (however, not necessarily regularly), and these data were uploaded to the server. The response rate was 100% (23/23) for Q1 and 96% (22/23) for Q2.

Assessment of Effectiveness
Activity data from App&Move Figure 4 presents the activity data from App&Move as the mean activity minutes plotted for all participants for the baseline measure and for weeks 1, 2, and 3. Figure 4 indicates that the participants slightly increased their activity minutes from week 1 to week 3. Table 4 presents how the participants developed their personal goals, described as baseline mean activity minutes, differences between the baseline and the four personal activity minute goals, and the total and mean increase per week in minutes, for all participants. Table 4 illustrates how the participants had formulated their personal goals and shows that compared to the baseline, the mean increase in minutes was the highest for the first goal (m = 45 min, range 12-199). The total increase in mean activity minutes from baseline to the fourth goal (performed during 4 weeks) was 59 min (range 21-124), which generated a mean of 15-min increase per week (range 5-31).
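The relation between the total and the weekly increase reported above can be checked with a few lines of Python. The helper `goal_progress` and the example baseline and goals are hypothetical; the sketch assumes that the weekly mean is simply the total increase divided by the number of goals (four, one per week), which is consistent with the reported figures.

```python
from typing import Sequence, Tuple


def goal_progress(baseline: float, goals: Sequence[float]) -> Tuple[float, float]:
    """Return (total increase, mean increase per goal period) in minutes."""
    total_increase = goals[-1] - baseline
    return total_increase, total_increase / len(goals)


# Hypothetical participant: baseline 30 min, four weekly goals.
print(goal_progress(30, [75, 80, 85, 89]))

# Consistency check of the reported mean and range (59, 21, and 124 min over 4 weeks).
print([round(total / 4) for total in (59, 21, 124)])
```

The second line reproduces the rounded values of 15, 5, and 31 min per week given in the text.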

Estimated health status (Q1 and Q2)
Before and after using App&Move, the participants estimated their health status in Q1 and Q2 (see Table 2). Regarding self-rated health status (q. 1, Q1 and Q2), 57% (Q1) and 74% (Q2) rated their health status as very good/good, 39% (Q1) and 13% (Q2) rated their health as neither good nor bad, and 4% (Q1) and 9% (Q2) felt bad/very bad. The Wilcoxon signed rank test showed no significant difference on the question of self-rated health status between Q1 and Q2 (p = 0.25). Seventy-eight percent of the participants stated that they had no long-term health problem (q. 2, Q1), and 22% stated that they had experienced a health change during the test period (q. 3, Q2). The content analysis of the participants' description of their health changes generated five meaning units, and a category, increased health and awareness about health. This category involves opinions on increased awareness about their PA, that they had increased their level of PA and that they felt healthier. “Yes, I have been moving more and started to exercise. I feel healthier and stronger.” (participant 7)
Fig. 4 Presentation of the mean activity minutes plotted for all participants for the baseline and for weeks 1-3. The mean activity minutes per week were not possible to calculate for two participants because of partly missing data
a Mean and median activity minutes per week were not possible to calculate for three participants because of complete or partly missing data

Estimated level of physical activity and sedentary behavior (Q1 and Q2) and SOC (Q1 and Q2)
The participants rated their level of physical exercise (q. 4), everyday activity (q. 5), and sedentary level at work (q. 6) and during leisure time (q. 7), and were classified in SOC (q. 8), in both Q1 and Q2 (see Table 2). Table 5 shows that there were no significant differences in the self-estimated level of physical exercise, level of everyday activity, level of sedentary time at work or during leisure time, or in stages of change (SOC) between Q1 and Q2.

Three usability questions related to effectiveness aspect
Three usability questions in Q2 (U1-U3; see Table 2) were assessed regarding the effectiveness aspect. A medium mean score (5.1-6.1) was generated for the questions about achieving specified goals (U1-U3). The participants expressed to what extent App&Move had encouraged them to PA (U1). The content analysis generated 11 meaning units and the category extra effort, including statements regarding taking extra walks, spending more time for movement, reminders, and to reach goals. “The application made me want to be more physically active, to achieve my goals…” (participant 17). Thirteen meaning units were generated for the question about experiences of how App&Move measures PA (U2), and the category good and less good experiences of the measurements. This category consisted of statements regarding satisfaction about the measurements as well as difficulties regarding measuring some activities. “Good, but possibilities to add activities manually would be good…” (participant 19) “It seems like it does measure some activity when travelling by bus…” (participant 8).
Twelve meaning units were produced for the question about experiences of whether App&Move can measure all PA (U3), generating the category, different experiences of measuring capabilities. This category involved opinions regarding whether App&Move can measure everyday activities and certain exercise activities. Some participants had not been exercising and had therefore no opinion about exercise. “Everyday activities yes, but exercise…it doesn't measure some activities like swimming and gym training” (participant 16).
Table 5 All participants before and after using the application, classification at different PA levels (4-5), sedentary levels (6-7), and SOC (8). The number of the question is clarified in the first column to the left. Significance level p < 0.05. Because of missing data for one participant in Q2, this participant was excluded from all statistical analyses. In the test for SOC, three more participants were excluded from the analysis because their SOC were not possible to identify

Assessment of Efficiency
Six questions in Q2 (U4-U9) were related to the efficiency aspect.

Six usability questions related to the efficiency aspect
The highest means (7.0-8.4) were generated from the questions about learning and understanding (U4-U8; see Table 2). The question about the participants' own ability to use the application (U4) generated six meaning units and the category believe in own ability, where the participants discussed control of the functions and that it was good not having any more devices beside the smartphone. The question about if the application was simple to use (U5) consisted of five meaning units and the category mainly simple to use, which includes opinions that App&Move is very useful and simple to use but also that one action, in particular, was complicated. “Very user-friendly, incredibly smooth” (participant 13). “A bit confusing that so many choices led to set a new personal goal” (participant 4). The question about providing fair feedback (U6) generated nine meaning units and the category fair feedback but sometimes bad timing. “It feels completely fair” (participant 13). “Sometimes, I walk for some time, and I still get the message that my activity is low” (participant 20). The participants graded if the information sections in App&Move were understandable (U7), including four meaning units and the category clear and understandable. The participants were asked about experiences of using App&Move during the time awake (U8), which generated 13 meaning units and a category, slightly problematic using during the time awake. The participants stated opinions that it was easy to forget to carry the smartphone, problems with battery capacity, and problems charging the smartphone when needing to use it. “A bit difficult in the beginning, but you get used to it” (participant 5). “Forget the smartphone sometimes…” (participant 13). A low mean score (3.6) was generated for the question about the user demand (U9), generating 11 meaning units and the category mostly difficult user demand. The opinions regarding the user demand (U9) were twofold; a few were used to it since before, while others were not used to carrying the smartphone all the time. They also stated problems with battery capacity and that the smartphone was not appropriate to use in all situations. “Difficult to remember to always carry the smartphone” (participant 6).

Assessment of Satisfaction
The last four usability questions in Q2 (U10-U13; see Table 2) relate to the satisfaction aspect.

Four usability questions related to the satisfaction aspect
The participants' attitudes towards using App&Move (U10) were slightly higher than neutral (6.7), generating 11 meaning units and the category mainly positive attitude. “Easy to use, and interesting information about my own everyday routines” (participant 7). The participants scored a high mean (7.5); thus, App&Move was perceived to be generally useful (U11

Discussion
A first model of a behavior change application, called App&Move, was developed, strongly based on the user perspective. The usability of App&Move was evaluated, indicating that it had promising usability and seemed to encourage the participants to PA to some extent. The assessments of the usability aspects, according to ISO 9241-11 (1998), showed that the effectiveness was acceptable, the efficiency was mainly high, and the satisfaction was medium. The effectiveness aspect was regarded as acceptable because the included data (see Table 2, A, B, C, and D) showed positive, but not significant, results or were estimated as medium. Except for the user demand, the efficiency data (E) showed high means. The satisfaction data (F) were mainly scored as medium.
The data from App&Move indicated a slight increase in activity minutes. However, the data from Q1 and Q2 showed no significant differences in level of physical exercise, everyday activity, sedentary time at work or leisure time, or in SOC.
A possible reason could be that, since the increase in minutes was small, the participants did not notice any difference in these health parameters. It is, however, possible that the participants were encouraged to PA, that a behavior change was initiated, and that the participants progressed on the SOC ladder, which could explain why the p value for SOC approached significance. Some of the participants seemed to have estimated their level of PA slightly lower in Q2 compared to that in Q1, and a possible reason could be an overestimation or problems with memory recall. Another possibility is that they trusted the activity data generated from App&Move, which, according to Haskell (2012), can be referred to as an objective measuring method, although such a method does not have to be correct.
App&Move is based on the users' needs (Åkerberg et al. 2016, 2017, submitted), and the smartphone was regarded as a suitable platform. The participants were informed about the user requirement to carry the smartphone at all times; however, the result indicated that this could be a problem. This was noted by individuals who were not used to carrying a smartphone before the study. This study does not reveal anything about the wearing time of the smartphones, which can affect the reliability of the study.
The baseline measure was partly blinded for the participants to minimize the risk of a change in PA behavior during the baseline measurements. According to Fig. 4, the baseline activity minutes were in some cases higher than the outcome for week 1. Michie et al. (2009) concluded that the self-monitoring BCT is most effective in PA and eating interventions. A possible reason for the higher baseline in this study might be that the participants were strongly affected by doing self-monitoring. The participants were also instructed how to formulate personal goals; however, some participants seemed to formulate overly high goals, especially the first goal. Therefore, it can be discussed whether 4 weeks was too short a span for the participants to manage to formulate a SMART goal.
The participation (App&Move) and response rates (Q1 and Q2) were high in the usability study, which possibly generated more accurate results. To measure PA in minutes is not a new phenomenon. For instance, advanced pedometers, activity trackers, and smartwatches measure moderate and vigorous PA in minutes. However, the difference is that App&Move captures a wider spectrum of PA behavior by also measuring everyday activities in minutes. Another strength of App&Move is that it has been generated from the users' perspectives, scientific evidence within the field, and available PA recommendations. Most PA applications contain a limited number of BCTs (Yang et al. 2015); however, App&Move has 19 integrated BCTs. A special effort was made to integrate BCTs that were recommended or missing (Sullivan and Lachman 2017), for instance, environmental restructuring.
The participants' integrity was well protected during the study. The developer of the app only obtained access to the Android IDs of the participants. The company delivered the captured data, identified by Android ID, to the research group, and only the main author could identify the participants in the usability study. A Wi-Fi connection and mobile data were also required for using App&Move and for uploading data to the cloud and server. A few problems with the smartphones occurred, such as poor battery capacity and some specific brands of smartphones that seemed to automatically shut down applications (including App&Move).
The participants in this study used a few different brands and models of smartphones, which can be considered a limitation and can make comparisons between different individuals difficult. However, we could see that App&Move was working satisfactorily on several smartphone models, and each participant used the same smartphone during the entire study, which enabled comparisons on an individual basis. Notably, newer and older smartphones can have different sampling frequencies, meaning that the number of CPM may be different for different smartphones for the same activity performed. It is also possible that different ways of carrying the smartphone during activity can affect the movement and registered activity on the device (Åkerberg et al. 2016).
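The point about sampling frequency can be illustrated with a minimal sketch. It assumes a hypothetical counting rule (one count per acceleration sample whose magnitude exceeds a threshold) and a simulated sine-wave signal; the function names, the threshold, and the 50 Hz reference rate are illustrative assumptions and do not describe the algorithm used in App&Move. Under this assumed rule, a device that samples faster registers more counts for the same one-minute activity unless the counts are rescaled to a common rate.

```python
import math


def counts_per_minute(sampling_hz, step_hz=2.0, amplitude=1.0,
                      threshold=0.5, seconds=60):
    """Count samples in one minute whose |acceleration| exceeds a threshold.

    Hypothetical counting rule for illustration only; it is NOT the
    algorithm used by App&Move.
    """
    n_samples = int(sampling_hz * seconds)
    counts = 0
    for i in range(n_samples):
        t = i / sampling_hz
        # Simulated movement signal: a sine wave at the step frequency.
        accel = amplitude * math.sin(2 * math.pi * step_hz * t)
        if abs(accel) > threshold:
            counts += 1
    return counts


def normalized_counts_per_minute(sampling_hz, reference_hz=50.0, **kwargs):
    """Rescale raw counts to an assumed common reference sampling rate."""
    raw = counts_per_minute(sampling_hz, **kwargs)
    return raw * reference_hz / sampling_hz


if __name__ == "__main__":
    for hz in (50, 100):
        print(hz, counts_per_minute(hz), normalized_counts_per_minute(hz))
```

In this sketch, the raw counts differ between the two sampling rates for the identical simulated activity, whereas the rescaled counts are comparable. Whether a real application needs such a correction depends on how its activity counts are actually derived.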
A limitation of the usability study was the low number of participants (23) and the short test period. However, the study was performed to receive a first evaluation of the developed application. The idea was to develop a first model of App&Move that could suit most physically inactive adults and thereafter continue developing and refining this model in the future.

Conclusion
The results from the two phases in this study (phase 1-development and pre-testing and phase 2-usability evaluation) served as the basis for the requirement specification of the behavior change application, which aims to encourage physically inactive adults to be more physically active. As the next step, the first model of App&Move was developed, and thereafter, its usability was evaluated; the effectiveness was assessed as acceptable, the efficiency as mainly high, and the satisfaction as medium. The usability evaluation indicates that the developed behavior change application has promising usability. The study also indicates that App&Move encouraged the users to physical activity to some extent; however, this finding requires further investigation. Future research could also aim to further improve the usability attributes, encouragement functions, and user demand.

Compliance with Ethical Standards
Ethical Standards This study complies with the laws in Sweden, where it was performed. The study was approved by an ethical board in Sweden and was performed in accordance with the 1964 Helsinki Declaration. All participants in this study gave informed consent to participate, and the results were presented in an unidentifiable manner.