User-Centered Design of Learn to Quit, a Smoking Cessation Smartphone App for People With Serious Mental Illness

Background Smoking rates in the United States have been reduced in the past decades to 15% of the general population. However, up to 88% of people with psychiatric symptoms still smoke, leading to high rates of disease and mortality. Therefore, there is a great need to develop smoking cessation interventions that have adequate levels of usability and can reach this population. Objective The objective of this study was to report the rationale, ideation, design, user research, and final specifications of a novel smoking cessation app for people with serious mental illness (SMI) that will be tested in a feasibility trial. Methods We used a variety of user-centered design methods and materials to develop the tailored smoking cessation app. This included expert panel guidance, a set of design principles and theory-based smoking cessation content, development of personas and paper prototyping, usability testing of the app prototype, establishment of app’s core vision and design specification, and collaboration with a software development company. Results We developed Learn to Quit, a smoking cessation app designed and tailored to individuals with SMI that incorporates the following: (1) evidence-based smoking cessation content from Acceptance and Commitment Therapy and US Clinical Practice Guidelines for smoking cessation aimed at providing skills for quitting while addressing mental health symptoms, (2) a set of behavioral principles to increase retention and comprehension of smoking cessation content, (3) a gamification component to encourage and sustain app engagement during a 14-day period, (4) an app structure and layout designed to minimize usability errors in people with SMI, and (5) a set of stories and visuals that communicate smoking cessation concepts and skills in simple terms. Conclusions Despite its increasing importance, the design and development of mHealth technology is typically underreported, hampering scientific innovation. 
This report describes the systematic development of the first smoking cessation app tailored to people with SMI, a population with very high rates of nicotine addiction, and offers new design strategies to engage this population. mHealth developers in smoking cessation and related fields could benefit from a design strategy that capitalizes on the role of visual engagement, storytelling, and the systematic application of behavior analytic principles to deliver evidence-based content.


Background
Smoking rates in the United States have been reduced to 15% in the past decades [1]. However, this downward trend is not present in people with serious mental illness (SMI) [2]. This population, which is characterized by chronic mental health symptoms and functional impairments that interfere with major life activities, typically encompasses individuals with a diagnosis of schizophrenia, schizoaffective disorder, bipolar disorder, or recurrent major depression [3]. People with SMI have smoking rates of up to 88% [4][5][6], have high levels of disease [7], and lose 25 years of life expectancy [8].
Survey research indicates that this population has rates of adoption of mobile technology that range between 72% and 81% [9,10], closing the gap with the 95% adoption of the general population [11], and thus presenting an opportunity to develop mobile smoking cessation apps that address the treatment needs of this population.
Despite the increasing number of digital interventions developed for people with SMI [12][13][14][15][16][17], no mobile intervention for smoking cessation has been described in the scientific literature for this vulnerable patient population. This shortage of smoking cessation mHealth research for SMI is not surprising when considering the larger context of smoking cessation mHealth. There are over 540 smoking cessation apps in the market [18], but only 2 have been tested in randomized controlled trials [19,20], and to our knowledge, there are no reports of their user-centered design research.
This lack of reports on the user-centered design process of mHealth interventions for smoking cessation is problematic, because (1) the determination of the active therapeutic ingredients delivered by an app should be the result of a careful design process, (2) poorly designed software systems have an impact on their ultimate efficacy and when not usable can be a waste of resources [21,22], and (3) unreported design research undermines design reproducibility and our body of knowledge. Thus, user-centered design research of mHealth interventions is not only an important step to ensure their efficacy and usability, but also an important way to advance our scientific knowledge.
People with SMI can have very low levels of adherence to digital interventions [12,23], which calls for a user-centered design process that addresses a series of known usability barriers in this population, such as persistent and moderate-to-severe mental health symptoms [24], low levels of educational attainment [25], cognitive deficits [26,27], and poor fine motor skills [28].

Prior Work
In previous research, we found a direct link between key demographic factors and engagement with SmartQuit, a smoking cessation app designed for the general population. Specifically, we demonstrated that experiencing mental health symptoms, being female, and having low levels of educational attainment predicted low levels of engagement with the app at a 3-month follow-up [29]. In subsequent research, we further identified specific usability barriers encountered by people with SMI when using NCI QuitPal, a smoking cessation app developed by the National Cancer Institute (NCI) [30]. Despite adherence to US Clinical Practice Guidelines (USCPG) for smoking cessation, our study directly showed that NCI QuitPal led to critical performance errors and low user experience among smokers with SMI, suggesting the need for new design approaches for this population.

Goal of This Study
From these initial studies, we planned to develop Learn to Quit, a smoking cessation app tailored to this often neglected and vulnerable population. Consistent with the need to report the user-centered design process of mHealth interventions, the aim of this paper was to describe the rationale, ideation, prototyping, design, user research, and final feature set of Learn to Quit, a smoking cessation app tailored to individuals with SMI that will be subsequently tested in a randomized controlled feasibility trial (clinicaltrials.gov NCT03069482).

Methods
Our user-centered design methods and materials are summarized in Figure 1. This formative study was organized in 7 phases consistent with the user-centered design framework [31,32], which includes the following key activities: (1) understanding and specifying the context of use (phase I); (2) specifying the user and organizational requirements, such as the active ingredients of the behavior change intervention (phases II-III); (3) producing design solutions (phase IV); and (4) evaluating the design (phase V). Our design process also involved users throughout design and development, addressed the whole user experience (both usability and user experience), and incorporated multidisciplinary perspectives, which are key principles in user-centered design [31].
Phases VI and VII are not part of the user-centered design process per se, but are important steps in design implementation, which are also documented in the user-centered design literature [33]. Substantial progress in each phase was necessary to initiate meaningful progress in the following phase. However, at times, these phases overlapped with each other. For example, phase IV overlapped with phase V, because feedback from usability testing was used to modify or edit the original sketches and paper prototype. Conversely, completion of phase V was a required step to initiate phases VI and VII. All study procedures were approved by the Institutional Review Board of the University of Washington.

Phase I: Expert Panel
To better understand the needs of our target population, we sought to get the perspective of patients from this population and their providers. Expert guidance has been used in prior work to inform app development efforts [34]. We formed an expert panel composed of 2 social workers, 1 psychiatric case manager, 2 psychiatrists, and 2 smokers with SMI. After a project introduction, we addressed the panel with 2 research questions: (1) What are the biggest challenges for people with SMI to quit smoking? and (2) How could we design a highly engaging app for smokers with SMI? The first and second authors took notes from comments and observations, discussed them to identify agreements and focal points, and organized them for qualitative review. No formal thematic analysis of these interviews was conducted.

Phase II: Selection of Design Principles
Before designing the app, we prespecified 2 sets of design principles: (1) general learning principles based on applied behavior analysis and (2) design principles specific to individuals with SMI. Applied behavior analysis is a scientific discipline focused on developing strategies and behavior modification techniques in areas of social relevance based on principles of learning. Use of these principles has shown promise for the treatment of addiction in people with SMI [35,36], and these principles have been used in the design of games and other health apps [37]. Design principles specific to individuals with SMI have typically addressed the cognitive deficits and mental health symptoms encountered by this population. This approach has been used in previous work designing technologies for people with SMI [38], and it is supported by the general literature on the importance of adjusting system designs to meet the cognitive model of the user [39][40][41].

Phase III: Incorporating Evidence-Based Smoking Cessation Content
Smoking cessation app content was selected from behavior change interventions supported by the empirical literature (eg, clinical trials) and from process research suggesting a theoretical link between intervention components and the symptoms typically experienced by our target population (see below "Evidence-Based Smoking Cessation Content" subheading in the Results section). This process ensured the theoretical grounding of the app and its evidence-based foundation.

Phase IV: Ideation, Sketching, and Paper Prototyping
We used several user-centered design tools to ideate a prototype of the app. These included (1) the creation of personas, a technique aimed at increasing the designer's emotional understanding of the end user by creating a short narrative of their motivations, context, and personal characteristics [42][43][44], and (2) sketching and paper prototyping, an important component of the design process consisting of the use of paper drawings to quickly iterate on variations of app structure, app interactions, and the layout of content [42,45].

Procedures
Sketches and images developed during the ideation and paper prototyping phase provided the basis for usability testing. To simulate the app experience, we used app prototyping software.

Figure 3. These wireframes represent how our app's initial Play Screen evolved throughout our design process. From left to right: (a) Play Screen sketch, (b) pre-14 journey map, and (c) post-14 journey map. Wireframe (a) presents a character that needs to "jump" from stone to stone to "pick up" skills for quitting while navigating through a "swamp of urges." In wireframe (b), the user has completed the "Finding Your North Star" lesson and practiced the "Your North Star for Quitting" skills module. Wireframe (c) presents a user who has completed all levels of the Learn to Quit journey and, motivated by his values for quitting, has metaphorically reached "Learn to Quit Land".

Results
In this section, we describe the results of our formative study by focusing on a summary of our expert panel feedback, describing an initial paper prototype of the app and the results of its usability testing. This formative study resulted in a clearer definition of the app's core vision and helped define the scope of work to be conducted by a software development company. Results from the remaining user-centered design processes and materials (ie, design principles, theory-based content, paper prototyping) are presented throughout a final section that lays out the app's final characteristics and features and how they were informed by those processes and materials.

Expert Panel Results
The panel emphasized the specific challenges faced by this population when trying to quit (Table 1). This included the panel members' view that, as opposed to the general population, patients with SMI are more concerned about ongoing medical and mental health challenges (eg, metabolic syndrome, suicidality) rather than dying from cancer. Although not consistent with the behavioral economics literature [52], in the panel members' experience as providers, motivations to quit in people with SMI are either different or less pronounced than in the general population. To add to this challenge, providers indicated that both withdrawal symptoms and ongoing mental health symptoms were a common concern in their patients when discussing the possibility of quitting smoking (eg, triggering a psychotic event). Needing preparation to quit was also emphasized, arguing that patients would prefer a system that teaches them a set of skills for quitting, rather than a system that emphasizes setting up a quit date and monitoring maintenance.
To solve this lack of motivation and concerns about the process of quitting, the panel outlined a series of strategies that could engage the users with this app. One of them was the use of meaningful images and storytelling. Some level of gamification was viewed as important to engage users with SMI. In addition, the panel argued that as a way to compensate for the overwhelming task of quitting smoking, the system should include progressive disclosures, make sure it rewards small victories, and add external motivation (eg, money saved). Social networking and monitoring of medication aids were also mentioned. Finally, a few other themes emerged during our discussion, including challenges related to the use of technology in this population, the social context of these patients (eg, risk of having their smartphone being stolen), and the benefits of personalization and provider support during the initial stages of nicotine withdrawal and beyond.

Initial Results From Ideation, Sketching, and App Prototyping
On the basis of input from our expert panel and the authors' experience as clinical providers, we created 3 personas: (1) Esteban, a 55-year-old male with schizophrenia, 35 years of smoking history, psychiatrically stable, and with a range of medical complications; (2) Martin, a 31-year-old male with a diagnosis of bipolar disorder, who is a light smoker, and holds a stable job; and (3) Julia, a 43-year-old female with recurrent major depression and a history of drug use, who is taking care of a young daughter. These personas guided the design process and helped ensure that the initial prototype was responsive to the physical and emotional contexts of a broad spectrum of people with SMI, from psychotic to chronic affective disorders (see Multimedia Appendix 1).
These personas were used as inspiration to sketch our first wireframes (ie, sets of images displaying the functional elements of a website or app) and the overall app prototype, which was limited to a few basic components: a Home Screen (see Figure 2, panel a); a Play Screen (see Figure 3, panel a); a Tracking Screen (see Figure 4, panel a); a Quiz Screen (see Figure 5, panel a); a proof-of-concept smoking cessation module (see Figure 6, panel a); and an onboarding tutorial. To minimize cognitive demand in people with SMI, the app had only a few buttons (eg, "back," "next," and "Home"). These buttons were large, an important feature given the fine motor deficits observed in people with SMI [28] and findings from previous research in this population [30,38].

Table 1. Expert panel themes. Each insight from our expert panel is organized by question and accompanied by a short description.
Questions and themes | Description
Question 1: What are the biggest challenges for people with serious mental illness to quit smoking? |
Motivation to quit | Life expectancy is not generally a motivation to quit in this population
Quitting without preparation | Early attempts to quit without enough preparation, and/or lacking a step-down quitting process

Figure 4. These wireframes represent how our initial tracking feature evolved. From left to right: (a) tracking feature sketch, (b) cigarette tracking, and (c) personalized cigarette use feedback. As opposed to wireframe (a), in which we planned to use a single wireframe to collect all desirable tracking dimensions, in the final app, we used separate wireframes for each dimension (eg, smoking, mood). Note in (b) that users could report smoking half a cigarette. Wireframe (c) is an example of personalized feedback following a user who reported smoking between 5 and 10 cigarettes.

Figure 5. Wireframe examples of an initial sketch of a "Review Quiz" and a final Learning Module Quiz. Quizzes were presented at the end of the learning modules, and contained 3 questions each. From left to right: (a) Review Quiz sketch, (b) example of question for the "Key to Quitting" module, (c) feedback to correct answer that is followed by game reward sound, and (d) summary of quiz results, which indicates number of correct answers, best answer of all times, and number of practice stars gained (1 for each practice with a total of 3 per module).

Usability Testing Results
A total of 5 daily smokers recruited from an outpatient mental health clinic participated in usability testing of the app prototype. They averaged 44 years of age (standard deviation [SD] 7.5), and the majority were female (4/5) and had less than a college education (4/5). Of the 5 participants, 1 was multiracial, 1 African American, and the rest were white. Our sample group smoked an average of 11 cigarettes per day (SD 3), and most had an extended smoking history (mean 20 years, SD 13). Although we did not conduct diagnostic interviews, all participants were patients from a community mental health clinic, had an assigned psychiatric case manager and a psychiatric provider, were currently taking psychiatric medication, and on average had received mental health treatment for 25 years (SD 10). See Table 2 for a breakdown of individual baseline characteristics.

System Usability Scale
Overall, participants' usability ratings of the prototype were above the standard cutoff, suggesting that the initial prototype had promise. This was reflected in both the overall scale and the usability subscale. Conversely, the learnability subscale of the System Usability Scale (SUS) did not reach the standard cutoff, although it reached high levels for 3 out of 5 individuals (see Table 2). Table 2 summarizes the results of the thematic analysis. To facilitate comparison with user testing of a nontailored app for smoking cessation in the same population (NCI QuitPal), we matched some of the themes resulting from our analysis with a previous usability testing study we conducted in smokers with SMI [30].
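For readers less familiar with the SUS, the sketch below illustrates how an overall score and the two subscales could be computed and compared against the cutoff of 68. It assumes the standard 10-item SUS scoring and the common two-factor split in which items 4 and 10 form the learnability subscale. The paper does not state which scoring procedure was used, so this is an illustration rather than the authors' method, and the example responses are hypothetical.

```python
SUS_CUTOFF = 68  # standard cutoff referenced in the text


def item_contributions(responses):
    """Convert ten 1-5 SUS responses into 0-4 item contributions."""
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("SUS needs ten responses, each between 1 and 5")
    # Odd-numbered items are positively worded (response - 1);
    # even-numbered items are negatively worded (5 - response).
    return [r - 1 if i % 2 == 0 else 5 - r for i, r in enumerate(responses)]


def sus_scores(responses):
    """Return overall, usability, and learnability scores on a 0-100 scale."""
    c = item_contributions(responses)
    learnability = c[3] + c[9]        # items 4 and 10 (assumed subscale split)
    usability = sum(c) - learnability  # the remaining 8 items
    return {
        "overall": sum(c) * 2.5,
        "usability": usability * 3.125,
        "learnability": learnability * 12.5,
    }


# Hypothetical responses from one participant (not study data).
example = [4, 2, 4, 3, 4, 2, 4, 2, 4, 3]
scores = sus_scores(example)
above_cutoff = {name: value > SUS_CUTOFF for name, value in scores.items()}
```

With these made-up responses, the overall and usability scores fall above the cutoff while learnability falls below it, which mirrors the pattern described in this section.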

Thematic Analysis
Second, usability testing identified a critical usability error with our paper prototype. Specifically, it revealed that our original Home Screen was confusing to most users (see Figure 2, panel a). Users had difficulty in understanding the Home Screen structure and the purpose of each Home Screen subpanel. This is reflected in the prototype's suboptimal SUS learnability score.
Finally, we identified minor usability problems with the prototype, including small font in the subpanels and confusion about specific language. These usability errors led to a final Home Screen that had a simpler layout and removed most of its original displays and content (see Figure 2, panels b and c).

Table 3. Usability testing results for Learn to Quit prototype (n=5) matched with comparable usability testing results from a previous user-centered design study (n=5) we conducted in a smoking cessation app designed for the general population (QuitPal) [30].
Theme | Learn to Quit paper prototype (SUS = 74) | Smoking cessation app (QuitPal) (SUS^a = 65.5)
Difficulty entering information in the app | The need to use the keypad was removed from the prototype | Unable to "pull up the keypad"
Difficulty saving information | The need to use a "save" button was removed from the prototype | Failure to identify and press "save" button at the top of screen
Getting lost in app layers | No observed confusion about how to return to the Home Screen | "It took me a long time to get back to that menu frame"
Tremor and fine motor skills | P5^b: "I like how the letters are big" | "[buttons were] too close together"
^a SUS: System Usability Scale; scores above the usability standard cut-off (>68) are indicated in italics.
^b P: Participant.

Table 4. Themes identified during usability testing of the Learn to Quit app prototype (n=5).
Themes | Representative quote^a
Interested in gamification of smoking cessation skills | P1: "That would be so cool! A point every day"; P2: "[the most exciting] the points"; P4: "this would be kind of fun"; "So this is like a little game and people often play a lot of games"; P5: "That's a good thing […] that builds confidence"
Drawn by cartoons and storytelling | P3: "The cartoons, the whole thing. It's got great spirit"; P4: "that's very cute!"; "I like it, it's cartoony-like"; "it is very eye-catching"; P5: "So you got a cartoon character! That's what I was thinking. It was right on. That works for me"; "Mm, cartoon characters, yes!"
Appreciating simplicity | P2: "It was simple, informative, easy to use"; P4: "[I like it]...when you're a kid and you're learning something new, it's basic and it's not all overstimulated, to put it that way…"; P5: "It just goes right in your mind"; "this looks really simple too and it looks good"
Proof of concept: Acceptance and Commitment Therapy module showed promise | P1: "I wish you guys could send it to me so that I could practice it and learn it"; P4: "it's like this too shall pass […], makes sense."; "Mind and action skills. That's pretty good. Because it's like when you go to an AA"; P5: "It relaxed me"
[…] | P1: "It's very small and I can't see what it is."; P2: "I don't know if that's an "I" or not."; P3: "The first part of the app is so busy…"

Design Specifications and the App's Core Vision
On the basis of this formative study, we created a technical document laying out the app's structure and its core screens and features. We named the app Learn to Quit and synthesized the app's core vision as follows: learn, practice, and play. Learning referred to the process of being exposed to daily modules that explain different smoking cessation concepts. Brief quizzes would help the user retain and learn those materials. Practice referred to the actual practice of smoking cessation skills in the form of brief daily exercises. Practice should lead to "mastery" of the learned materials. Play referred to the user's opportunity to participate in a game that consists of completing learning and practice modules and earning rewards along the way. The role of play was to promote higher levels of app engagement and commitment to learn.

Software Development Timeline
In May 2015, we filed a Report of Innovation at the University of Washington (ROI#47274) with the design specifications document of the app and proceeded to approach a company to develop a software-coded version of the app. Learn to Quit was built between August 2015 and October 2015. Smashing Ideas Inc. [53], a design and development agency, contributed to refining and enhancing the app prototype.

Evidence-Based Smoking Cessation Content
Acceptance and Commitment Therapy (ACT) has 3 components relevant to smoking cessation: awareness, or the smoker's ability to recognize smoking triggers and urges; openness, or the smoker's willingness to experience smoking urges and triggers; and values activation, or the smoker's active engagement in values-based activities related to health. The app content was therefore designed to deliver techniques related to each of those components, which predicted smoking cessation outcomes in previous research [59,60]. A secondary active ingredient of our intervention was adherence to key elements of USCPG for smoking cessation [66]. These guidelines include setting up a quit date, proper use of medication aids (eg, nicotine patch), and preparing for relapse. These 2 active ingredients (ACT and USCPG) were integrated to ensure an evidence-based design approach to app development.
We adapted ACT to a mobile format by creating 14 modules of ACT+USCPG content and 14 modules of exercises to practice smoking cessation skills (see Multimedia Appendix 2). We chose a 14-day program that would be consistent with USCPG, which recommends 2 weeks of preparation for quitting, and would balance the need for gradual exposure to smoking cessation content while providing a concrete timeline. Each module presented a short narrative that exemplified ACT+USCPG content and explained it from the perspective of someone with nicotine addiction and SMI. Each concept was presented in a variety of narratives and exercises to increase learning generalizability (see principle of multiple exemplar training in Multimedia Appendix 3). In addition, each lesson ended with a quiz designed to further enhance comprehension and retention (see Figure 5, panels b and c). We arranged these modules so that completing a lesson per day was necessary to unlock new lessons the next day (see principle of differential reinforcement of successive approximations in Multimedia Appendix 3).
Completion of skills modules was optional to increase the user's perceived behavioral control [67]. ACT+USCPG content was developed and organized so that it would gradually increase in complexity as the user advances through each of the 3 levels (see section below on app gamification).
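The lesson-per-day unlocking rule described in this section can be sketched as follows. This is a minimal illustration, not the app's actual implementation. It assumes that lesson n+1 becomes available on the calendar day after lesson n was completed, and that optional skills modules do not affect unlocking. The function name and data layout are hypothetical.

```python
from datetime import date, timedelta

TOTAL_LESSONS = 14  # one lesson per day over the 14-day program


def highest_unlocked_lesson(completions, today):
    """Return the highest lesson number available to the user today.

    completions maps a lesson number to the date it was completed.
    Lesson 1 is always available; each later lesson unlocks the day
    after the previous lesson was completed (assumed rule).
    """
    unlocked = 1
    for lesson in range(1, TOTAL_LESSONS):
        done_on = completions.get(lesson)
        # Stop at the first lesson that is unfinished or was finished today.
        if done_on is None or done_on >= today:
            break
        unlocked = lesson + 1
    return unlocked


# Hypothetical usage: the user finishes lesson 1 on day 0 and lesson 2 on day 1.
day0 = date(2015, 10, 1)
history = {1: day0, 2: day0 + timedelta(days=1)}
available_on_day1 = highest_unlocked_lesson(history, day0 + timedelta(days=1))
available_on_day2 = highest_unlocked_lesson(history, day0 + timedelta(days=2))
```

Under this rule, completing several lessons on the same day is not possible, which matches the gradual exposure the 14-day program is meant to provide.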

Simple Screens, Large Buttons, and a Predictable App Structure
To address the cognitive deficits observed in this population, we incorporated feedback from our formative study and followed recommendations from previous literature [30,38-41,68]. See Multimedia Appendix 3 for a list of key design principles we used to address SMI in our app. The result was an app with simple screens, large buttons, and a simple structure. The Home Screen (Figure 2, panels b and c) only included 2 buttons, 1 to access the Play Screen, which unlocked the module content described above, and 1 to access the app settings. Usability testing suggested that the Home Screen should incorporate only a few elements. Thus, the Home Screen had as its focus a character surrounded by a circle of checkmarks, indicating overall progress in modules' completion.
The Play Screen (Figure 3, panels b and c) represents a game in which each day the user takes a step forward toward completing a smoking cessation module. This screen had a linear structure, with older modules at the bottom and newer modules toward the top. Each module ranged between 6 and 24 screens, all of which had a reduced amount of semantic information, a simple color palette, consistent font size and type, and a predictable structure. All remaining wireframes in the app were presented in the form of cartoon scripts divided into 2 panels: the upper panel included the text and the lower panel a complementary image. We made each panel identical in size to maximize consistency and minimize cognitive load. This format was consistent with our panel's emphasis on the use of storytelling and interactive characters to engage patients with SMI. We also avoided the use of videos and audio. We hypothesized that the use of sliding cartoons and vignettes could maximize retention and comprehension of smoking cessation skills, as this format provides the user with more control over the speed of presentation of smoking cessation content and the ability to stop or review a single vignette for as long as needed.

Gamification of Smoking Cessation Content
Consistent with our expert panel's feedback, usability testing, and the empirical literature on the use of games for purposes other than entertainment, such as health or education (ie, serious games) [67,69], we designed an app that gamifies smoking cessation content. The main concept was designing a game in which the smoker metaphorically overcomes a swamp of urges and learns quitting skills along the way (see Figure 3). Gamification included interactive quizzes that provide immediate feedback about quiz results (see Figure 5, panel d), a component that has been correlated with long-term improvements in eHealth interventions [70]. Feedback included auditory cues for right or wrong answers and a reward system consisting of checkmarks, badges, cups, and crowns (see Applied Behavior Analysis section below). Repeated practice was encouraged with a reward system that provided stars every time a module was completed. This star system was consistent with the narrative of the app game (ie, module "Finding Your North Star"; see Multimedia Appendix 2). Sliding through module screens was also followed by video game sounds and feedback. We also added interactive elements to our tracking feature. Specifically, a 3-tiered system of personalized feedback was incorporated to diminish assessment burden. That is, we categorized all possible answers to a specific question ("How many cigarettes you smoked today?") into 3 different levels and created a custom message for each that acknowledged the user's answer and encouraged them to move forward in their journey (eg, smoking fewer than 5 cigarettes; see Figure 4, panels b and c).
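The 3-tiered feedback on cigarette tracking can be illustrated with a short sketch. The source mentions a tier for fewer than 5 cigarettes and a tier for between 5 and 10 cigarettes. The third tier (more than 10), the tier labels, and the message wording below are assumptions made for illustration and are not taken from the app.

```python
# Hypothetical messages; the app's actual wording is not given in the text.
FEEDBACK_MESSAGES = {
    "under_5": "You smoked fewer than 5 cigarettes today. Keep moving forward!",
    "5_to_10": "You smoked between 5 and 10 cigarettes today. Every step counts.",
    "over_10": "Thanks for tracking today. Your journey continues tomorrow.",
}


def feedback_tier(cigarettes):
    """Map a daily cigarette count to one of 3 feedback tiers.

    Half cigarettes are allowed, so the count may be a float.
    """
    if cigarettes < 0:
        raise ValueError("cigarette count cannot be negative")
    if cigarettes < 5:
        return "under_5"
    if cigarettes <= 10:
        return "5_to_10"
    return "over_10"  # assumed third tier


def feedback_message(cigarettes):
    """Return the custom message for the tier that matches the count."""
    return FEEDBACK_MESSAGES[feedback_tier(cigarettes)]
```

Collapsing every possible answer into 3 tiers keeps the number of messages small while still acknowledging what the user reported.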

Application of Principles of Applied Behavior Analysis
Behavioral principles (see Multimedia Appendix 3) informed the organization and delivery of smoking cessation content. These principles had the goals of (1) increasing retention, comprehension, and mastery of app content and (2) minimizing the impact of the cognitive deficits and low educational attainment of our target population [26,27,71]. For example, instead of providing the user with all smoking cessation content at once, we used the principle of successive approximations [72] to lay out increasingly complex ACT content throughout a 14-day period (see Multimedia Appendix 2 for a list of all the lessons and skills). Then, we applied the principle of multiple exemplar training [73]. Consistent with this principle, we presented multiple examples of a given concept or skill throughout multiple modules as a means to foster skills generalization to real-world settings.
In addition, the app used a combination of antecedent and consequential control strategies [72,74]. On the one hand, antecedent control was used in the form of notifications and Home Screen messages to prompt the individual to complete certain modules or tasks (Figure 2, panel b: "New Lesson unlocked! Tap button to Start"). On the other hand, consequential control was used by applying positive reinforcement arranged on a fixed ratio schedule [72]. More specifically, obtaining a new star was made contingent upon completing a module. This reinforcement ratio was used to encourage users to complete those modules at least 3 times (ie, a maximum of 3 stars could be obtained per module; see Figure 3, panels b and c), which could lead to a total of 84 stars (see Figure 2, panels b and c). Additional module completions were possible at any time but were not incentivized with more stars.
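The star arithmetic can be checked with a small sketch: 1 star per module completion, capped at 3 per module, across 14 learning modules and 14 skills modules gives 84 stars. The function names and the module identifiers in the example are hypothetical and only illustrate the contingency described in the text.

```python
LEARNING_MODULES = 14
SKILLS_MODULES = 14
MAX_STARS_PER_MODULE = 3


def stars_for_module(times_completed):
    """One star per completion, with no extra stars after the third."""
    return min(times_completed, MAX_STARS_PER_MODULE)


def total_stars(completion_counts):
    """Sum stars over a mapping of module id to number of completions."""
    return sum(stars_for_module(n) for n in completion_counts.values())


# Maximum number of stars a user could collect across all modules.
MAX_TOTAL_STARS = (LEARNING_MODULES + SKILLS_MODULES) * MAX_STARS_PER_MODULE

# Hypothetical progress: one lesson done 5 times, one skill done once.
progress = {"lesson_1": 5, "skill_1": 1}
earned = total_stars(progress)
```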
A rewards scheme was also implemented to increase the reinforcing effect of responding correctly to lesson quizzes (see Figure 5, panel d). For example, an individual who gave 3 correct answers to 3 questions would receive a "crown." Badges and cups were provided to users who gave 2 and 1 correct answers, respectively. Checkmarks were offered to users who completed the quiz but did not have any correct answer. Finally, we used negative reinforcement to promote daily completion of app modules. Specifically, the number of days in a row in which the user completed a module was indicated with a big bold number at the bottom of the Home Screen (see Figure 2, panel b). Not completing a new module one day was penalized with starting the count again from zero.
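The quiz reward mapping and the day-streak counter can be sketched as below. The reward mapping follows the text directly. The streak logic is an assumption: it treats the current day as still open, so the count resets to zero only once a full day has passed without a completed module. Function names are hypothetical.

```python
from datetime import date, timedelta

# Reward by number of correct answers on a 3-question quiz (from the text).
QUIZ_REWARDS = {3: "crown", 2: "badge", 1: "cup", 0: "checkmark"}


def quiz_reward(correct_answers):
    """Return the reward earned for a completed 3-question quiz."""
    if correct_answers not in QUIZ_REWARDS:
        raise ValueError("a quiz has between 0 and 3 correct answers")
    return QUIZ_REWARDS[correct_answers]


def current_streak(completion_days, today):
    """Count consecutive days with a completed module, ending today.

    If nothing was completed today yet, counting starts from yesterday
    (assumed grace rule); a fully missed day resets the streak to zero.
    """
    days = set(completion_days)
    cursor = today if today in days else today - timedelta(days=1)
    streak = 0
    while cursor in days:
        streak += 1
        cursor -= timedelta(days=1)
    return streak


# Hypothetical history: modules completed on 3 consecutive days.
start = date(2015, 10, 1)
history = [start, start + timedelta(days=1), start + timedelta(days=2)]
streak_same_day = current_streak(history, start + timedelta(days=2))
streak_after_gap = current_streak(history, start + timedelta(days=4))
```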

Emphasis on Visual Engagement and Storytelling
Because our expert panel and usability testing strongly supported the use of simple cartoons and visual storytelling, we created a gender-neutral character that rotated across the different stories and metaphors presented in each module (see Figures 2, 3, and 6). The character enacts a variety of scenarios with the aim of exemplifying the experience of nicotine addiction and the smoking cessation skills offered in the app. The combination of imagery and text has been recommended in previous literature on mHealth in SMI populations [14]. The stories were told in very short sentences designed to avoid cognitive overload and maximize module completion. The purpose of these images and stories is to evoke an emotional connection and increase retention and comprehension of app content. For example, the Home Screen had 2 types of background: (1) dark green (Figure 2, panel b), to represent that the user is still completing the Learn to Quit journey and thus navigating through a "swamp of urges," and (2) light blue (Figure 2, panel c), to indicate that the user has reached "Learn to Quit land" and thus is ready to quit. Similarly, the background of each module used a consistent and muted color scheme that mapped onto the 3 dimensions of the app's theory-based content: (1) gray for Learning Modules, (2) blue for Mind Skills, and (3) red for Action Skills. The function of this color scheme was to offer a simple and predictable layout that would be more likely to fit the cognitive model of the user [40,41]. Previous research in people with SMI also indicated that bright colors can be overwhelming in this population, thus supporting our use of muted colors and a simplified color scheme [38].

Access to Technical Coaching
Our expert panel indicated that "technological illiteracy" was common in patients with SMI (Table 1). In the panel members' opinion, many of their patients have never used a smartphone device, and therefore successfully using an app for smoking cessation could be a challenge. Furthermore, they commented on the complex social context of these patients and the potential need for more intense personal support. In our previous study testing a smoking cessation app in a small sample of this population [30], we observed some of these challenges, but we also noted that there was a wide range of technological literacy in this population, with some patients owning their own device and comfortably using smartphone apps. Therefore, we incorporated a simple technical coaching feature in our design that would serve as an aid to those who needed guidance, while at the same time maintaining the core vision of the app as a stand-alone intervention. This technical coaching feature consisted of a button in the settings section that allowed the user to call a preassigned technical coach. Eventually, this technical coaching role could be performed by addiction counselors or case managers and therefore serve to integrate the app within ongoing health care at a community mental health clinic or larger health care organization.

Principal Findings
This paper reports the rationale, ideation, design, user research, and final features of a novel smoking cessation app developed for people with SMI, a population in great need of novel smoking cessation treatment. Building this app involved a user-centered design process that carefully considered a series of design principles to maximize comprehension and retention of smoking cessation concepts, minimize the impact of known challenges in people with SMI, and ensure the effective delivery of evidence-based smoking cessation content.
Results from our user-centered design process informed the features included in the final app. First, informed by our formative study and as suggested by the literature [14], we gradually delivered evidence-based smoking cessation content using imagery and simple semantic content. The primary active ingredient of this evidence-based content was ACT, which is an intervention that has empirical support as a smoking cessation intervention [55-58] and as an intervention to treat individuals with SMI [62-65]. This intervention was further integrated with smoking cessation recommendations from USCPG [66] to ensure alignment with best clinical practices.
Second, as suggested by the literature [30,38-41,68] and our user testing of a paper prototype, we created an app with very simple screens, buttons, and a predictable app structure that led to promising usability and user experience results. The paper prototype we created used a minimal set of app buttons and a very simple app structure, which probably contributed to scores above the usability standard cut-off. Our thematic analysis of user experience interviews was in line with these usability scores and further reinforced the inclusion of our final set of core app features.
Third, input from a panel of experts in SMI led to the idea of incorporating app gamification, visual engagement, and storytelling [67,69,75]. Use of these elements is consistent with the serious games literature for smoking cessation published in this journal [67]. More specifically, the app offered a module to help users identify their core values for quitting and used these values as the game's objective (ie, module "Your North Star for Quitting"), which increases the user's perceived behavioral control and intrinsic motivation [67]. In addition, it offered a clearly structured game that had functional utility [67], that is, successfully completing the game involved being exposed to a series of content known to help people quit smoking.
Fourth, we implemented a number of applied behavior analysis principles to maximize retention and comprehension of app content [37,76]. This included the use of progressive disclosures of smoking cessation content, variation and repetition of that content, a reward system, and modules that gradually portrayed complex smoking cessation concepts (eg, psychological awareness). Application of these principles provided guidance to adjust our design to the needs of our population and provided a level of conceptual clarity that linked our work with the behavior change literature at large. For example, although the term "notification" is common in mHealth, this feature is essentially an antecedent control strategy, which has been extensively used in the behavior change literature in areas such as autism [77-79], individuals with dementia and cognitive impairments [80-82], and in cases of traumatic brain injury and poor executive functioning [83,84].
Finally, our development effort took into account implementation considerations: (1) it used an Android operating system, which according to our panel was a common platform among people with SMI and tends to dominate the market among people with lower socioeconomic status (eg, individuals with disabilities) [85,86], and (2) it addressed the potential need for more intense personal support by including a technical coach feature within the app. In the future, this coaching role could be performed by addiction counselors or case managers in community mental health clinics and address not just technical issues but also support for the use of ACT skills to deal with smoking cravings or adherence to USCPG.

Comparison With Prior Work
The study reported in this paper is consistent with user-centered design research of mobile apps for depression [87], smoking cessation [30], and work with diverse populations [88]. Furthermore, Learn to Quit's emphasis on gamification is consistent with an app developed for depression, SuperBetter [89].
To date, many apps have focused on the use of sensors and algorithms to track user context and provide personalized feedback [90-94]. Learn to Quit differs from this approach in the sense that it is based on a more traditional form of engagement, visual storytelling. Visual storytelling is a core form of engagement common across human cultures [95]. Storytelling and visual engagement are implicated in emotional arousal [96] and in attaching value and significance to sensory descriptions [97,98], thus contributing to learning and memory (eg, comprehension and retention of new content) [98,99]. Given the fact that motivational challenges are common among people with SMI (eg, negative symptoms of schizophrenia), we built an app that had visual storytelling at its core. Our main goal was to create an app with visuals and stories that were as engaging as possible, in combination with gamification, well-established behavior analytic principles, and the use of evidence-based smoking cessation content.
As stated in the introduction, we believe that the determination of the active therapeutic ingredients delivered by an app should be the result of a careful design process. However, user-centered design research could lead to stakeholder recommendations that are not consistent with evidence-based practices or theory-based principles of change. Adherence to evidence-based practices or theory-based principles of change might not always be emphasized in user-centered design research, yet it is a key activity of the design process inherent in the original user-centered design guidelines [31,32]. As shown in our Methods section, we emphasized this key activity by incorporating phases II and III and by making an effort to ensure that our final design integrated concerns brought up by each of these phases.
Finally, a relevant aspect of this app is that it could be a good example of the concept of universal design [100], for which systems tailored to specific groups of individuals with certain disabilities or challenges become inherently usable to larger groups of the population without those challenges. For example, even though the app addresses core usability barriers experienced in people with SMI, in doing so, it explains in simple terms very complex psychological concepts which might engage smokers of young age, low literacy groups, and the general public. Furthermore, the app's systematic use of behavior analytic principles to increase comprehension and retention of ACT and USCPG is broadly applicable. Therefore, Learn to Quit's design approach could be generalizable to related mental health conditions, health behaviors, or the general public.

Limitations
This study had several limitations. First, the number of patients with SMI in the expert panel group (n=2) and the usability testing study (n=5) was small, leaving open the question of whether a larger sample of people with SMI could have led to more feedback and opportunities for innovation. There is debate among user-centered design researchers about the most cost-efficient number of subjects to identify usability errors [101,102], with the most traditional approach suggesting a sample size of 5 [101]. Recruiting individuals with SMI is particularly challenging; therefore, we believe our sample size was justified.
Second, our methods could have been more rigorous in several aspects. Specifically, results from our expert panel were not transcribed and analyzed using a complete set of qualitative methods, which could have led to the identification of additional themes. However, rather than a thorough and comprehensive analysis of provider input, the goal of this expert panel was to quickly gather initial insights and impressions that would orient our imminent design process. Additionally, usability testing did not include observational coding of user behavior. As reported in similar studies [30,87,88], this could have provided more concrete usability feedback and complemented the results of the SUS and the thematic analysis of our transcripts. A third methodological limitation is that we did not conduct diagnostic interviews of our subjects. Despite this, all subjects received psychiatric treatment and were recruited from a community mental health clinic, which, according to regulations by the US Department of Health and Human Services [3], requires serving individuals with SMI status.
Third, the app's tracking feature provided personalized feedback based on participants' responses to self-reported ratings of mood and smoking behavior. However, this level of personalization did not take into consideration each individual's baseline (eg, certain smoking reductions could be large or small depending on the individual's baseline), limiting its impact for personalization. Likewise, the current app system does not take into account a variety of quitting scenarios (eg, individuals who quit before the end of the program) and how these scenarios interact with app content. Future versions could take into account these personal scenarios to strengthen Learn to Quit's usefulness and level of personalization.
Finally, this paper focuses on the user-centered design research of a paper prototype leading to the development of the Learn to Quit app. Although this report does not provide data about the usability and user experience of the final Learn to Quit app, it allowed us to transparently report in more detail its user-centered design process. A separate report of Learn to Quit's usability and user experience in its target population is under review elsewhere.

Conclusions
This is the first paper to systematically describe the rationale, ideation, design, and user research of a smoking cessation app specifically designed for people with SMI, a population with alarmingly high rates of nicotine addiction and in high need of novel smoking cessation treatments. The feasibility and acceptability of this app will be subsequently tested in a randomized controlled feasibility trial (clinicaltrials.gov NCT03069482).
User-centered design is a critical process for the development of mobile interventions for individuals with SMI. However, because emotional and cognitive challenges are present in less severe forms of mental illness or can be present in other health conditions (eg, cancer patients), the results of this study might be generalizable to other areas of mHealth research. Therefore, while ideating and designing digital interventions, mHealth developers might consider capitalizing on the role of visual engagement, storytelling, and the systematic application of behavior analytic principles to deliver evidence-based content.

Authors' Contributions
RV envisioned the app rationale and intervention, ideated app design and features, sketched imagery, conducted research activities, interpreted results, and wrote this manuscript. JR coordinated research activities, sketched imagery to be included in each module, and contributed to the overall conduct of design and research activities. EZ transcribed the recordings of the usability testing procedure, conducted thematic analysis of the interviews, discussed emerging themes with first author, and sketched imagery to be included in each module. JK provided user-centered design expertise in the development of the software app for behavior change and smoking cessation and contributed to early drafts of this manuscript. RR contributed to form the expert panel, provided serious mental illness expertise and feedback about the overall app design, and contributed to late drafts of this manuscript. CO created high-fidelity designs and illustrations and contributed to improvements in interaction design and the final user experience of the app. KH contributed to late drafts of this manuscript and to the layout of the manuscript structure and sections.

Conflicts of Interest
None declared.

Multimedia Appendix 1
Primary and secondary personas that guided Phase IV of the user-centered design process.

Multimedia Appendix 2
Evidence-based smoking cessation content of Learn to Quit app.

Anxiety Disorders and Adolescence
The term anxiety disorder (AD) represents a category of psychological disorders characterized by feelings of anxiety about future events and fear reactions to current events [1], in addition to increased attentional biases toward threat detection [2]. ADs are the most prevalent of the psychiatric disorders [3], affecting approximately 117 million young people worldwide. They are the sixth leading cause of disability, with the largest longevity among young people aged 15-34 years [4]. A range of factors, such as misdiagnosis, health care avoidance behaviors, and hardiness, mean that these statistics change continually.
The extant evidence suggests that the proportion of adolescents suffering from ADs has increased by up to 70% since the mid-1980s and that nearly 300,000 young people in the United Kingdom have a diagnosable AD [5]. The onset of AD increases significantly during the adolescent years [6], in part as a result of conflicts regarding existential identity [7], educational pressures and high self-expectations [8], negative peer comparisons or perceived relational victimization [9], and over-demanding intrusive parenting [10,11]. Experience of AD in early life is associated with negative short- and long-term implications for social, academic, financial, and health performance [12] and predicts adult anxiety and substance abuse disorders [13].
For the individual, adolescence is both a source of increased opportunity and increased pressure and risk [14]. Increased social expectations of developing autonomy in self-regulation and self-determination of behavior, coinciding with diminishing assistance from adults, require the adolescent individual to develop and coordinate effective emotional and cognitive capabilities in relatively short time frames. These time frames, however, do not always correlate well with progress made in brain maturation [15,16]. Functional magnetic resonance imaging data also point to a tendency toward an increased response to emotionally loaded stimuli at this age [17,18]. Neuroimaging data point to a biomaturational explanation for the increased prevalence of AD in the adolescent years [19]. Furthermore, naturally occurring consolidation of neural pathways during adolescence may explain the tendency of experience of AD at this age to lead to negative outcomes in later life, leading some to describe AD as a potential gateway disorder [20]. Therefore, effective treatment of adolescent AD is critical in the mitigation of both its impact at the point of experience and the potential long-term ramifications [21].

Therapeutic Interventions for Anxiety Disorders
Cognitive behavioral therapy (CBT) has been shown to be highly effective in the treatment of ADs [22], reducing or eliminating symptoms through the development of effective behavioral adjustment and coping strategy enhancement. Attention bias modification (ABM), an emerging technique derived from neurocognitive models of anxiety, has also been noted for having significant potential to enhance both pharmacological and psychological interventions for anxiety, as well as being an effective standalone intervention [23]. However, although evidence-based early intervention strategies reduce the probability of negative life outcomes [24], practical barriers to treatment (eg, cost) and social barriers to treatment (eg, stigma) mean as many as 50% of people in the United Kingdom experiencing anxiety do not seek treatment [25]. For those who seek treatment, waiting lists via Improving Access to Psychological Therapies (IAPT) referrals can be lengthy, leading to high dropout [26]. In addition, educational institutions and universities often fail to provide adequate support [27], with the existing services unable to meet the rising demand [28]. As a result, in adolescents, it is estimated that less than 20% of individuals affected by ADs receive treatment [29], with fewer than 20% of those seeking and receiving treatment being provided with interventions supported by scientific evidence [30].
The growing discrepancy between demand for, and available provision of, mental health services has led to the development of a range of alternative methods for delivering clinical interventions for anxiety [31]. eHealth (health care practice supported by electronic processes and communication) and mHealth (versions of eHealth using mobile devices) models (eg, computerized cognitive behavioral therapy) aim to mitigate the impact of both practical and social barriers to treatment by utilizing ubiquitous mediums to broaden the reach of clinical models [32-34]. One such medium is therapeutic video games, a derivative of serious games. Due to the improved realism in simulated artificial environments and capabilities of contemporary hardware, Web-based therapies are more comparable than ever to in vivo forms of treatment [35]. The gamification of clinical models may be particularly suitable for younger people, as they often reflect the typically more visual, rapid, and multi-tasking learning styles of a generation with a lifelong exposure to and familiarity with technology [36,37]. Therapeutic games afford a flexible and personalized learning environment that allows for exploratory learning and behavior practice [38], allowing Web-based environments to be adapted in terms of content and challenge to the requirements of the user, which is likely to be conducive to an enhanced learning experience [39,40]. As games utilize both intrinsic and extrinsic motivational elements in active and realistic learning opportunities with immediate opportunities for feedback, they have already been shown to be capable of eliciting improvements in self-awareness and self-management behaviors in people with chronic physical health conditions [41].
In terms of the benefits of therapeutic games for mental health and well-being, the extant evidence suggests they may be effective across a variety of disorders, including reducing psychopathological symptoms associated with gambling disorders [42], and as an effective preliminary treatment to CBT for bulimia nervosa [43]. Although the current literature presents conflicting evidence regarding the health benefits versus health hazards of video game platforms [35,44], therapeutic games utilize a popular platform to achieve clinically measurable health improvements and behavioral changes [45].

Research Questions
Therapeutic games provide young people with a dynamic, adaptable, and personalized learning environment in which they are afforded an opportunity to seek relevant information and guidance in an exploratory manner, receive immediate feedback, and utilize unlimited opportunities for repeat engagement [38]. As a result, therapeutic games should offer a more accessible platform for rapid learning and adoption of scientifically validated therapeutic techniques.
Although some evidence exists to suggest that therapeutic games can be linked with reductions in subjective anxiety and observed stress reactivity [46], research to date often combines adolescent samples with either child or adult participants, limiting the capacity of the current data in terms of its applicability to the unique nature of anxiety experienced at this life stage. Furthermore, to the researchers' knowledge, there is currently no stated set of guidelines available for the development of therapeutic games, to which developers are required to adhere, nor is there a definitive protocol established for their scientific evaluation.
Consequently, it is unclear whether the potential benefits of therapeutic games establish themselves in anxiety in adolescents, and if so, whether any benefits are modulated by the therapeutic framework employed. A systematic review was conducted to assess the effectiveness of therapeutic games in enhancing engagement with clinical interventions, and their efficacies in making clinically measurable reductions in AD symptoms in adolescent samples.

Databases Searched
Relevant papers were identified by performing a comprehensive literature search of the following databases: Journal of Medical Internet Research, Journal Storage, Psychology Articles, Psychology Info, ScienceDIRECT, and Scopus.

Search Terms and Selection of Papers for Inclusion
The following search terms were used to address the variety of games that might be played, and the variation in terms used to describe them: all("serious game" OR "video game" OR "therapeutic game" OR "online game") AND all("adolescen*" OR "teenage" OR "youth" OR "young adult*") AND "anxi*". For further detail regarding search terms, definitions, and variation of input, see Multimedia Appendix 1.
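As an illustration of how the Boolean search string is composed, the following minimal sketch assembles it from its 3 components. The function names `or_group` and `build_query` are hypothetical, and the `all(...)` field prefix is reproduced from the search string as reported; individual databases may require a different field syntax.

```python
# Term lists taken from the reported search string.
GAME_TERMS = ["serious game", "video game", "therapeutic game", "online game"]
AGE_TERMS = ["adolescen*", "teenage", "youth", "young adult*"]
ANXIETY_TERM = "anxi*"


def or_group(terms):
    """Quote each term and join the terms with OR inside parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query():
    """Combine the game, age, and anxiety components with AND."""
    return (
        f"all{or_group(GAME_TERMS)} AND "
        f"all{or_group(AGE_TERMS)} AND "
        f'"{ANXIETY_TERM}"'
    )


if __name__ == "__main__":
    print(build_query())
```

Keeping the term lists separate from the joining logic makes it straightforward to record any database-specific variation of the input alongside the base query.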
Paper abstracts were initially scanned to determine eligibility. If eligibility could not be determined from the abstract alone, or if the paper was deemed as potentially relevant from the abstract, the full-text paper was studied for its relevance to the review.

Inclusion Criteria
In line with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, clear inclusion criteria were established to determine the eligibility of papers for inclusion in the review. Only studies meeting the following criteria were considered eligible for inclusion: papers linked to therapeutic games for ADs; studies conducted on adolescent samples; empirical research using a randomized controlled trial (RCT) design, quasi-experimental design, or correlational design; and studies using a control group or pre- versus posttest design to allow for comparison. All searches were limited to anxiety.
Adolescent sample was defined using American Psychiatric Association criteria (10-19 years). As a result, papers that used a sample of participants aged between 10 and 19 years were considered eligible. Papers that used samples including participants aged 8 or 9 years were considered eligible if the mean age for the sample was over or very close to 10 years.
Papers published in English or German were selected and subjected to the inclusion criteria as outlined above. In line with PRISMA guidelines, a specific date range was established. Studies published between January 1990 and July 2017 were selected. This date frame was chosen as papers first studying the effect of video games in the context of health education were published in the 1990s [41].

Exclusion Criteria
Opinion pieces, existing literature reviews, conference posters, and design documents for therapeutic games were excluded from the review. Study protocols were also excluded, as they would be unable to provide outcome measures. Pilot studies were included in the review provided the 4 criteria discussed above were met.

Quality Assessment
For the purposes of consistency, one researcher oversaw the initial coding of the papers. The quality of papers meeting the inclusion criteria was assessed using the mixed methods appraisal tool (2011) [47]. The mixed methods appraisal tool is designed for systematic reviews that include a combination of quantitative, qualitative, and mixed methods studies, and has been noted for its reliability and efficiency as a quality assessment protocol and for its capability to concomitantly appraise methodological quality across a variety of empirical research [48]. In line with PRISMA guidelines, an interrater process was adopted and the degree of agreement was assessed to reduce risk of bias.

Primary Outcome Measure
To assess the extent to which therapeutic games elicit reductions in AD symptoms, papers were studied for comparisons between measures of anxiety symptoms at pre- versus postintervention.

Papers Meeting Inclusion Criteria
A total of 2259 records were identified through database searches. After papers published in languages other than English or German and duplicate papers were removed, the remaining papers (N=2222) were assessed using the inclusion and exclusion criteria outlined above. Initially, abstracts were searched to assess a paper's eligibility for inclusion. If abstract information alone was not sufficient to determine whether a paper met the criteria, the entire paper was studied. Figure 1 shows the number of academic papers from each database identified using the search terms and the number of papers meeting the inclusion criteria.

Quality Assessment Outcomes
Papers meeting all of the inclusion criteria (N=5) were then quality assessed using the procedure described above. Subsequent to these papers meeting the quality criteria, they were included in the review. The mean rating for the papers was 75%, and the modal rating for the papers was 75%.
Owing to the small number of papers meeting the inclusion criteria, all papers included in the review were subsequently coded again for interrater reliability. The interrater reliability for the total scores was 0.8, showing good agreement between the 2 coders regarding paper quality.

Overview of Papers Included in the Review
Of the 5 studies selected as meeting the criteria for inclusion in the review, 2 relied solely on quantitative measures [49,50] and 3 used a mixed-methods approach [51-53], with differing balances of reliance on quantitative versus qualitative data. Of the papers reporting quantitative data (N=5), 2 used an RCT design [49,52], 1 used a quasi-experimental design [50], and 2 were exploratory evaluations and usability studies [51,53]. Of the papers reporting qualitative data (N=3) [51-53], all used interview techniques to obtain varying amounts of qualitative feedback.
Of the 2 papers focusing on the therapeutic game "Dojo" [51,52], one reported a pilot study evaluating perceptions and feasibility of the game [51], whereas the other reported an RCT comparing the game to "Rayman 2: The Great Escape," in terms of their relative abilities to reduce adolescent anxiety [52].
In terms of quantitative data, one paper focused on the neurofeedback game "MindLight" [49] using an RCT to compare the game to "Max and the Magic Marker" in terms of their relative abilities to reduce adolescent anxiety.
One paper used a mixed-methods approach to qualitatively evaluate impressions of the game "gNats Island" using a sample of 6 adolescents [53]. Anxiety was assessed quantitatively at pre- and posttreatment and at 6-week follow-up using a range of clinical measures to indicate symptom levels for anxiety and a range of other mental health conditions.
The final paper included in this review utilized a quantitative approach to assess the efficacy of an augmented reality (AR) therapeutic game [50]. A quasi-experimental design was adopted for this study, recruiting participants to experimental (AR exposure therapy) and control (no game exposure) conditions in 2 separate phases.

Outcomes of Therapeutic Games on Adolescent Anxiety
Schoneveld et al [49] utilized an RCT design to examine the anxiety-reduction effects of "MindLight," a game designed for children and adolescents using a combination of CBT and ABM. In this game, users are guided through relaxation techniques (CBT) and must use a glowing light on their headset (their "mind light"), which reacts to activity collected from an electroencephalogram (EEG) they wear during play, to help them navigate the in-game world and defeat monsters they encounter. Nonplayable characters (NPCs) become gradually more difficult to ignore, and players must remain calm (keep their "mind light" bright) to "decloak" the threats (eg, turn a scary cat into a friendly kitten). The game also rewards players for attending to and quickly responding to positive stimuli and disattending or moving away from negative stimuli (ABM).
A total of 136 children aged between 8 and 13 years (mean 9.93 [SD 1.33]) were selected after screening for elevated anxiety and randomized to either an experimental or control condition. Experimental participants took part in five 1-hour sessions, scheduled twice a week, in which they played the game "MindLight" in groups of 7-19 participants at a time. Control participants undertook the same program in terms of time allocated to game playing and group size, but the therapeutic game was replaced with a control game, "Max and the Magic Marker." Self- and parent-reported anxiety levels were assessed pre- and postintervention, followed by a 3-month follow-up, using the child and parent versions of the Spence Children's Anxiety Scale (SCAS-C and SCAS-P) [54]. Latent growth curve modeling revealed a significant slope for all models, indicating levels of anxiety decreased significantly over time.
Intention-to-treat linear regression analysis, however, found no significant effect of game condition on anxiety outcome. Qualitative feedback revealed that "MindLight" was more anxiety inducing than "Max and the Magic Marker," suggesting that "MindLight" was successful in achieving its intended emotional exposure effects. Furthermore, no difference was found in the perceived difficulty of the 2 games studied, or in their perceived appeal to other children. "MindLight," however, was reported by the participants as less appealing to themselves and less likely to induce flow, a common issue with serious games when they are compared with their more entertainment-focused counterparts [55].
The therapeutic game "Dojo" appeared in 2 of the papers selected for review. "Dojo" is a first-person emotion-management game that takes place in a secret temple below an urban subway. The player assumes the role of a young person experiencing a difficult time and navigates 3 rooms, each headed by a "dojo master" and thematically designed to represent a different emotion: anger, frustration, or fear. In each room, the "dojo master" trains the player in a relevant coping skill for the emotion, after which the player undertakes a task (eg, in the fear room, the "dojo master" trains the player in deep-breathing exercises, then challenges them to collect bones in a labyrinth while attempting to evade a powerful and frightening angry ghost). The player's heart rate can be monitored and used by the game to increase or decrease game difficulty.
In one study, "Dojo" was assessed by means of a pilot study using a mixed-methods approach [51]. This study was conducted in 2 residential treatment centers offering 24-hour care for youths with severe mental health problems. A total of 8 adolescents (mean age 14.38 [SD 1.60]) took part in eight 30-min sessions playing "Dojo" on a laptop. These sessions took place twice a week for 4 consecutive weeks, although 3 participants experienced a 2-week break due to scheduling conflicts. The participants rated statements regarding game satisfaction on a 5-point scale, and they were also offered the opportunity to provide comments. The Dutch language version of the SCAS was used to measure anxiety. Reported satisfaction with "Dojo" was high, and both participants and mentors reported high compliance and positive changes in anxiety. The participants did, however, suggest that "Dojo" became repetitive, and they would have preferred more game rooms.
Following this study, Scholten et al [52] utilized an RCT design to examine the anxiety-reduction potential of "Dojo" in comparison with "Rayman 2: The Great Escape," which was chosen as a control game. A total of 138 adolescents (11-15 years, mean age 13.87 [SD 0.91]) were tested both pre- and postintervention, with a 3-month follow-up (N=126), using the SCAS-C. The participants also provided feedback about game experience and game expectations before the intervention. The intervention took place over 3 weeks, consisting of two 1-hour sessions a week. All the participants accessed their games after school hours in the same room regardless of condition, using separate computer terminals and headphones to hear game sound and diminish distractions. Results indicated that anxiety symptoms significantly decreased at follow-up in both conditions (total anxiety symptoms: beta=.70, SE=0.04, P<.001; personalized anxiety symptoms: beta=.63, SE=0.05, P<.001). Latent growth curve models revealed a steeper decrease of personalized anxiety symptoms, but not of total anxiety symptoms, in "Dojo."
Coyle et al [53] studied the therapeutic game "gNats Island" using a series of trials. "gNats Island" is a gamified CBT intervention derived from a paper-based CBT manual for 19 adolescents (total sample aged 11-16 years, 12 males, 7 females). Players navigate a 3D animated tropical island in which they meet a series of NPCs, who introduce mental health concepts using spoken conversation, embedded animations, videos, and questions regarding the player's own situation (to which players can respond through multiple-choice answers). Players carry an in-game notebook to answer further questions posed by NPCs and to record new ideas. Negative automatic thoughts are represented in-game as "gNats," which sting players to cause negative thinking. Catching, trapping, and swatting gNats are used to represent the identification and challenging of negative thinking.
In one study, therapists independent of the design team used "gNats Island" with adolescents experiencing clinical ADs who had been referred to the psychology team at the participating hospital. Adolescents played the game alongside a clinician who acted as a partner. Results from questionnaire feedback indicated that adolescents found the game more fun and engaging than "just talking," and that it helped to avoid the perception of confrontation that can arise in direct face-to-face interaction. Quantitative data indicated a decrease in anxiety scores both during and postintervention. No further statistical data are provided in this study.
In a second study, a member of the design team used the game with 15 adolescents experiencing issues including anxiety, but also depression, anger management, and issues relating to autism spectrum conditions. Although some participants used the game in a structured manner, in six 1-hour sessions over 6 weeks, others used the game flexibly as determined by clinician assessments of their individual needs. Qualitative feedback showed that after the intervention stage had finished, the participants preferred to explore the Web-based world, suggesting "winding down" time may be as important as engagement with therapeutic elements of game play. The participants also rated modules 4 and 5 of the game lower than previous elements. Notably, difficulty levels of the game at this point had increased, and a core component of CBT had been introduced, suggesting pacing at this stage of the game may not have been as effective as required. Low graphical fidelity of the game was noted by users, but not reported as a barrier.
The final study eligible for inclusion in this review was conducted by Li et al [50] and investigated the effectiveness of "PlayMotion" hardware (which creates Web-based environments via AR) using a quasi-experimental design. A total of 122 children aged between 8 and 16 years (mean age 11.85 [SD 2.20]) admitted to an oncology ward in a large hospital in Hong Kong were assigned either to experimental (PlayMotion) or control (routine nursing care with no engagement in Web-based interactions) conditions. The participants were recruited in 2 phases, with all patients admitted in phase 1 assigned to control (N=70), followed by a 1-month washout, followed by phase 2, in which all patients admitted were assigned to the experimental condition (N=52). Engagement with the Web-based intervention consisted of 30-min sessions for 5 days a week, with 4 participants per group. Anxiety was measured using the SCAS-C, with results indicating no change in anxiety symptoms after 7 days.

Principal Findings
Although therapeutic games show early signs of promise in helping to alleviate symptoms of anxiety in adolescent samples, a number of issues and limitations of the extant evidence have emerged. First, although some evidence utilizes RCT protocols to establish a clear comparison between therapeutic games designed specifically for anxiety reduction and control games designed without this primary purpose in mind, other research is based on clinicians' impressions of games, or on feedback from adolescents given in the presence of a clinician, who may also be a member of the design team, creating the potential for bias.
The RCTs that exist are limited in number (N=2) [49,52]. Furthermore, both these studies reported reductions in anxiety in control groups, with no significant differences in anxiety reduction found between the 2 conditions. The authors note that control condition games may have inadvertently utilized game mechanics that trained resilience and coping skills despite this not being their intended or primary purpose [49], or that participants may have vicariously acquired coping strategies from their peers in the experimental group, who played their games in the same room at the same time [49,52].
In addition, neither study utilized a waiting-list control group or a comparison with established nongaming therapy. Although the therapeutic games tested seem to have potential, it is therefore difficult in either case, despite their healthy sample sizes, to establish the extent to which specifically designed therapeutic games may have additional capacities for anxiety reduction in adolescents. As other findings offer no follow-up results, offer pre- versus posttest as control [53], or offer a "wait-list" control but use a game designed for multiple conditions [50,53], there is scope for further research to examine the capabilities of specifically designed therapeutic games to reduce anxiety in both the short term and the long term.
Current research has also focused on the evaluation and exploration of the benefits of therapeutic games in relatively controlled environments using equipment such as EEG, AR hardware, and heart rate monitoring, which is impractical in everyday environments. Intervention sessions in this field are routinely scheduled in classrooms [49,52] or clinical environments, sometimes with a practitioner present to guide the interaction [50,51,53]. Although initial findings suggest that engagement with therapeutic gaming may assist in clinically measurable reductions in anxiety symptoms over time, it is not clear how effective such games may be in real-life environments. It is also unclear how such games may assist at the point of symptom experience, either as a distraction technique or coping mechanism, in an everyday manner. As highlighted previously, access to, and efficacy of, clinically proven interventions for anxiety are limited by practical and social barriers to treatment [24], lengthy IAPT referrals [26], or inadequate support from educational institutions [27,28]. As a result, as many as 50% of individuals who may benefit do not receive any form of therapy [25], with figures significantly higher for adolescents [29]. Therapeutic games that successfully breach these practical and social barriers to treatment, overcoming the need for hardware impractical to day-to-day life, may be significant in helping to mitigate the short- and long-term implications of one of the most prevalent psychological disorders in a population that is difficult to treat.
A further theme of the current research concerned the structure and format of delivery of the existing interventions. For instance, in Scholten et al's [52] RCT to assess "Dojo," participants reported that the duration of the intervention was "too long," specifically with regard to maintaining concentration and motivation after approximately 4 sessions. Authors suggested that this was potentially a product of "Dojo" being a small game. Once adolescents finished the available rooms, as most did after 3-4 sessions, they were required to repeat the rooms until the end of the intervention period. Although repeating the rooms may be seen as a reinforcement of their learning, this repetitive play may also have caused boredom. Murphy et al [56] suggest that although some degree of repetition may be beneficial in improving learning and future experience of flow in video games, repetition without learning anything new, or repetition without experience of even subtle changes in gameplay, can disrupt flow and inhibit perceptions of "mastery." This may explain wider research in the field, which suggests that programs with shorter durations tend to have better outcomes [57].

Limitations
Despite the initial number of studies found through search terms being extensive, the final number of papers successfully meeting the inclusion criteria was low (N=5). Although this was somewhat expected due to the stringency of the criteria used in this review, there is further research using therapeutic games for young adults, which may be of interest in the development of games for anxiety aimed at older adolescents. In addition, while the mean age of participants in the papers included in this review was between (or just below) 10-19 years, some studies included individual participants aged below 10 years. As a result, the data are partially affected by the presence of participants who qualify as children rather than adolescents, making it difficult for research to date to establish the extent to which currently available therapeutic games may be beneficial for older adolescents on the brink of early adulthood.
Furthermore, this review considered papers with a publication date of up to and including July 2017. Due to the fast-paced nature of technology development, particularly with regard to software applications, of which games are an example, the relevance of this review in terms of its ability to answer the research questions posed may be time-limited.

Future Research
As noted by the authors of papers included in this review, this is an emerging area of interest in the field, and there are consequently several avenues yet to be fully explored. Further research would benefit from full RCT studies that ensure appropriate control games are selected, or that use a waiting-list control as a second control condition, to allow for more rigorous comparison. Furthermore, although some studies to date have utilized a follow-up time point, others have not. As a result, the long-term benefits of the use of therapeutic games for adolescent anxiety are currently unclear.
Future research would also benefit from further consideration of the applicability and efficacy of therapeutic games in more ecologically valid settings. Research to date has explored the use of therapeutic gaming in systematically controlled environments. Consequently, the potential of using games in day-to-day life or at the point of symptom experience remains unknown. As noted previously, therapeutic games that successfully breach practical and social barriers to treatment, and that are engaging and capable of repeating concepts of clinical value while maintaining flow, will be of value to the field in establishing the potential of this protocol.
Finally, no current legislative body or code of conduct exists for the development and regulation of "therapeutic games," nor is there currently a standardized procedure or empirical protocol for their scientific evaluation. In such an unregulated environment, there is substantial potential, therefore, for misuse of the term "therapeutic game" on the part of a more commercially driven developer. The capricious nature of the results presented in the investigations included in this review could be argued to be a product of such methodological variability, rather than an indication of inconsistencies of therapeutic games as effective treatments for anxiety symptoms. Consequently, future research should aim to establish a valid and reliable model for the assessment and verification of therapeutic games, with the view to developing a trustworthy quality-approved protocol.

Conclusions
This review aimed to assess the effectiveness of therapeutic games in making clinically measurable reductions in AD symptoms in adolescent samples. Although research in this field appears to be extremely limited, as demonstrated by the small number of papers meeting the inclusion criteria for this review, early findings suggest that therapeutic games have potential in helping to reduce anxiety levels in adolescents. By utilizing this protocol in a medium that facilitates overcoming the existing barriers to treatment, therapeutic games may be a valuable facilitator in reducing the short- and long-term implications of one of the most prevalent psychological disorders in a population that is difficult to treat.
Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Serious Games, is properly cited. The complete bibliographic information, a link to the original publication on http://games.jmir.org, as well as this copyright and license information must be included.

Introduction
Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder responsible for impairments in social communication and interaction and for restricted, repetitive patterns of behavior, interests, or activities. Current ASD prevalence is estimated to be around 1% of the population worldwide [1]. Given that a cure is yet to be found, individuals rely on interventional approaches to improve and overcome their impairments. The relevance of rehabilitation-driven strategies is further increased by the fact that still only a minority of individuals with ASD are able to live independently in adulthood [1].
Virtual reality (VR) consists of artificial, 3D, computer-generated environments which the user can explore and interact with [2]. VR usually takes advantage of different types of devices such as head-mounted displays and controllers to deliver the stimulation and provide means of interaction that lead to immersive experiences [3]. The immersiveness of an experience is measured by the level of fidelity, concerning all sensory modalities, that a VR system can provide. Thus, immersion is objective, measurable, and depends only on the technology used by the VR system. Presence, on the other hand, is the human reaction to immersion, the feeling of actually being in the virtual environment and behaving as such [4,5,6]. In fact, the immersive VR technologies available nowadays are able to present users with experiences realistic enough to trick the mind and create a feeling of presence within the environment [7]. These technologies have already been used and proven effective in therapies for posttraumatic stress disorder [8], phobias [9], as well as ASD [10]. There are several reasons that justify the use of VR in those approaches. Its capacity to provide safe, realistic, and controlled environments [11] makes therapy possible for people who, due to physical (eg, paralysis) or psychological (eg, anxiety) reasons, cannot undergo exposure in real-life situations. Moreover, this creates the possibility of applying therapies at ease without the need to go to a specific location where real exposure occurs. Finally, even when using low-budget hardware and software, it has been proven that VR therapies can be as effective as exposure in real life [9].
Serious games have also been proven effective in ASD therapies, not only because they include game design techniques that keep players motivated, but also because individuals with ASD are often interested in computer-based activities [12]. However, according to Zakari et al [13], most of the serious games developed for ASD rehabilitation between 2004 and 2014 are delivered through nonimmersive VR (eg, desktops, tablets) and focus mainly on communication and social skill development. This highlights the importance both of focusing on other relevant ASD impairments, including executive function in outdoor situations, and of exploring the potential of these new immersive technologies.
The aim of this study was to develop and validate a serious game which uses a VR headset (Oculus Rift) as a proof of concept for rehabilitation of individuals with ASD. Teaching a person with ASD to use public transportation requires parents or therapists to practice with them until they are ready and comfortable enough to perform this task alone (basically, the same process used with people with typical development, but with all the obstacles set by ASD particularities). In fact, being able to engage in outdoor activities such as the efficient use of public transport can be particularly challenging for people with ASD due to deficits in adaptive behavior [14] and anxiety [15]. This project intended to ease this process by creating and validating a game that prepares the players to use, in this case, buses for transportation. This included not only teaching the required skills, but also making them comfortable with the involved procedures and environments. To our knowledge, this is the first study to use VR training for teaching the process of bus-taking for people with ASD.

Methods
In this section, we describe the game, the experimental setup, the participants, and the validation procedure.

Game Description
The game was developed in-house with the aim of preparing individuals with ASD to use buses for transportation. To achieve that, it places the user in a three-dimensional city and sets a task that is completed by riding buses to reach a specific destination. Figure 1 shows different screenshots of the city and the buses.
There are several different buses driving in 4 pre-defined routes within the city. Figure 2 shows the map of the city with the 4 bus lines available. The player can enter any of these buses, validate his/her ticket, choose a place to sit, and press the STOP button, requesting the bus to stop, and leave the bus. Before starting the game, it is possible to choose from 7 different tasks, 4 of them labelled by complexity as simple (the player only needs to take 1 bus to reach the destination) and 3 labelled as complex (the player needs to take 2 buses to reach the destination). Each task has 2 levels of difficulty: an easy and a hard mode. The easy mode leads the player, step by step, to the destination, while in the hard mode, the player is only told the place he/she must go to. At the end of each task, a scoring system evaluates the performance of the player on 2 different components: "Actions" (the capacity of the player to memorize and execute bus norms, eg, validating the ticket or sitting on unreserved seats) and "Route" (the capacity of the player to plan a route to the destination, eg, if the player took the right buses in the right bus stops).
The game includes several objects/elements, such as people, traffic, and dogs, which might cause anxiety, depending on the player. For that reason, a biofeedback system was implemented to ensure that, while the player becomes familiarized with the environment, it never becomes hostile to him/her. This is achieved by assessing the anxiety felt by the player through the analysis of the electrodermal activity (EDA) and reducing the stimulus clutter the player is exposed to, in real time, in case high anxiety levels are reached (eg, reducing the amount of noise in the environment). Figure 3 schematizes the biofeedback loop.
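To make the control logic of such a loop concrete, the following is a minimal sketch under stated assumptions, not the game's actual implementation. The integer clutter scale, the anxiety score, the threshold, and the function names are all hypothetical; the text only states that stimulus clutter is reduced in real time when high anxiety levels are reached.

```python
def adjust_clutter(clutter_level, anxiety_score, high_threshold, min_level=0):
    """Return the next clutter level for one biofeedback update.

    clutter_level  -- hypothetical integer scale (higher means more noise, people, traffic)
    anxiety_score  -- hypothetical EDA-derived anxiety estimate for the current window
    high_threshold -- hypothetical cutoff above which anxiety counts as "high"
    """
    if anxiety_score > high_threshold:
        # High anxiety: remove one unit of stimulus clutter, never going below the floor.
        return max(min_level, clutter_level - 1)
    # Otherwise leave the environment unchanged.
    return clutter_level


def run_biofeedback(anxiety_scores, high_threshold, start_level=10):
    """Apply adjust_clutter to a sequence of anxiety estimates and log each level."""
    level = start_level
    history = []
    for score in anxiety_scores:
        level = adjust_clutter(level, score, high_threshold)
        history.append(level)
    return history
```

In this sketch the clutter is only ever reduced, which matches the description above; whether the game restores stimuli once anxiety falls is not stated and is therefore left out.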

Recruitment
For this project, 2 groups were selected: a clinical and a control group. For the clinical group, 10 participants with ASD, whose diagnosis followed the criteria established in the Diagnostic and Statistical Manual of Mental Disorders, 5th Edition [1], were recruited from Associação Portuguesa para as Perturbações do Desenvolvimento e Autismo de Viseu (APPDA-Viseu). The clinical group comprised 9 males and 1 female, with a mean age of 18.8 (SD 4.5) years. In this group, 8 participants had no intellectual disability (IQ equal to 70 or higher) and 2 had mild intellectual disability (IQ between 50 and 69 [16]). For the control group, 10 individuals with typical development (TD) were recruited: 4 males and 6 females, with a mean age of 21.9 (SD 3.56) years. The groups were matched by age (t(18)=−1.633, P=.12). Participants gave oral consent, and written informed consent was obtained from their parents/guardians, or from the participants themselves if they were adults with sufficient autonomy.

Intervention Protocol
The ASD group underwent an intervention of 3 sessions of increasing complexity and difficulty (see Table 1), with a duration between 20 and 40 minutes each. Of the 10 participants recruited with ASD, only 6 completed all the intervention sessions of the study; the others did not because of scheduling issues. During recruitment, participants completed a questionnaire to assess their experience in using buses for transportation. With the exception of 1 participant, all the participants were unable to take buses autonomously at the beginning of the study.
The control group took part in a single session (corresponding to the first session of the clinical group). This group was used to provide a task baseline and to assess the capacity of the serious game to identify differences between groups.
The sessions took place in APPDA-Viseu for the ASD participants and in the Institute of Biomedical Imaging and Life Sciences for the TD participants. The same research staff was present in all sessions. This study and all of its procedures were reviewed and approved by the Ethics Commission of the Faculty of Medicine of the University of Coimbra and were conducted in accordance with the Declaration of Helsinki.

Session Procedure
In each session, the players received a tutorial (to learn or review the game controls) and a task. The task difficulty and complexity changed from session to session, as shown in Table 1. At the end of every session, participants were asked to describe the process of riding a bus, from the moment they arrived at the bus stop, until they reached their destination, but never received feedback on the answer given. Their responses were recorded in a checklist containing all the steps existing in a bus trip (Textbox 1).

Acquisition Setup
In every session, players sat on a swivel chair and wore a bracelet for wireless EDA recording (Biopac Bionomadix BN-PPGED and MP150 amplifier). All tasks were run on a laptop computer (Windows 8.1, 16.0 GB RAM, and an Intel Core i7 2.50 GHz processor). The head-mounted display used was the Oculus Rift Development Kit 2, firmware version 2.12, and a gamepad was used for input. Three participants with ASD and 1 with TD performed the tasks without the Oculus Rift due to vision impairments. Figure 4 illustrates the setup used.

Metrics and Outcome Measures
Because the main objective of the serious game was to teach how to take the bus, we defined two main outcome measures to evaluate the knowledge of the process of riding the bus. One is measured automatically by the game and the other is measured using the debriefing. Both consist of the percentage of the checklist (Textbox 1) performed correctly.

Actions Accuracy
The game identified every action of the participant (entering the bus, ticket validation, etc) and calculated the actions accuracy based on the equation: number of correct actions/number of expected actions (Textbox 1).
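The calculation can be illustrated with a minimal sketch. The step names below are hypothetical placeholders (the actual checklist is the one in Textbox 1), and the function name is an assumption introduced only for this example.

```python
def checklist_accuracy(expected_steps, performed_steps):
    """Percentage of expected checklist steps that were performed correctly."""
    if not expected_steps:
        raise ValueError("the checklist must contain at least one step")
    performed = set(performed_steps)
    # Count each expected step once, regardless of how often it was performed.
    correct = sum(1 for step in expected_steps if step in performed)
    return 100.0 * correct / len(expected_steps)


# Hypothetical checklist for illustration only; the real steps are in Textbox 1.
EXPECTED = [
    "enter_bus",
    "validate_ticket",
    "sit_on_unreserved_seat",
    "press_stop",
    "leave_bus",
]

# A player who forgot to sit down completes 4 of the 5 expected steps.
example = checklist_accuracy(
    EXPECTED, ["enter_bus", "validate_ticket", "press_stop", "leave_bus"]
)
```

The same function would apply to the debriefing measure, with the steps the participant mentions taking the place of the steps performed in the game.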

Debriefing Accuracy
After the game, we asked the participants to describe the step-by-step process of riding a bus and calculated an accuracy using the same equation as for the actions accuracy.

Task Duration
Duration, in minutes, of the time taken to complete the task in each session. Since the tasks change in complexity and difficulty, this metric is not directly comparable between sessions. Nevertheless, it is useful to analyze intersubject variability and to perform intergroup comparisons for the first session.

Anxiety Level
The variation of EDA values during the session was used as a measure of anxiety. To calculate it, we detrended the signal by subtracting its best straight-line fit (for details, see the detrend implementation in MathWorks MATLAB) and then calculated the standard deviation of the detrended EDA values. We then created heat maps of anxiety peaks in the "virtual city," highlighting the game locations where EDA peaks occurred, corresponding to anxiety events felt by the players. If the locations of anxiety peaks are the same between subjects and sessions, those locations appear red, whereas sparse peaks appear green and blue. The heat maps were created using heatmap.js [17], an open source heat map visualization library for JavaScript. Since the game has two major situations (the city streets and the inside of the buses), two conditions were defined: one representing the anxiety felt on the streets of the city and another representing the anxiety felt inside the buses.
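The detrending and standard deviation steps can be sketched in a few lines. This is an illustrative reimplementation in plain Python, not the MATLAB code used in the study; the function names are hypothetical, evenly spaced samples are assumed, and the sample (n−1) standard deviation is assumed because the text does not state which normalization was used.

```python
import statistics


def detrend_linear(signal):
    """Subtract the least-squares straight-line fit from an evenly sampled signal."""
    n = len(signal)
    if n < 2:
        raise ValueError("need at least two samples to fit a line")
    xs = range(n)
    mean_x = (n - 1) / 2
    mean_y = sum(signal) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, signal))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residuals around the fitted line are the detrended signal.
    return [y - (slope * x + intercept) for x, y in zip(xs, signal)]


def eda_variation(signal):
    """Anxiety proxy: sample standard deviation of the detrended EDA signal."""
    return statistics.stdev(detrend_linear(signal))
```

Removing the linear trend first means that slow drift in skin conductance over a session does not inflate the variability measure; only fluctuations around the trend contribute.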

Statistical Analysis
For every metric, normality was assessed using histograms, Q-Q plots, and the Kolmogorov-Smirnov test. The normality test results were used to choose between parametric and nonparametric tests accordingly.

Clinical vs Control Group Analysis:
The metrics actions accuracy, debriefing accuracy, and task duration were not normally distributed. Therefore, we used the Mann-Whitney U test for the between-group comparison of those metrics. Anxiety Level data, in contrast, were normally distributed. Thus, we used a two-sample t test for those between-group comparisons.
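For readers unfamiliar with the nonparametric comparison, the U statistic itself can be sketched as a count of pairwise "wins" between the two groups. This is a minimal illustration in plain Python; the function name is hypothetical, the statistical software used in the study is not stated, and no P value is computed here.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the pairs (a, b) in which a exceeds b, with ties counted as 0.5.
    U_a + U_b always equals len(sample_a) * len(sample_b).
    """
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b
```

With 10 participants per group, U ranges from 0 to 100, and a value of 50 corresponds to complete overlap between the groups. Which of the two values is reported as "U" varies between software packages.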

Intervention Analysis (Intragroup Analysis for the Clinical Group):
As expected from the intergroup analysis, Actions Accuracy, Debriefing Accuracy, and Task Duration were not normally distributed. Paired data for the first and last sessions were therefore compared using a Wilcoxon signed-rank test. Regarding Anxiety Level, a one-sample t test was applied to the paired differences between the last and first sessions. Additionally, we generated heat maps of locations where consistent peaks of anxiety were identified.
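A one-sample t test on paired data operates on the per-participant differences, testing whether their mean differs from zero. The sketch below shows that computation in plain Python; the function name is hypothetical and it returns only the t statistic and degrees of freedom, not a P value.

```python
import math
import statistics


def paired_t_statistic(first_session, last_session):
    """One-sample t statistic on the differences (last minus first) and its df."""
    if len(first_session) != len(last_session):
        raise ValueError("both sessions must contain the same participants")
    diffs = [last - first for first, last in zip(first_session, last_session)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two paired observations")
    sd = statistics.stdev(diffs)
    if sd == 0:
        raise ValueError("differences are constant; t is undefined")
    # t = mean difference divided by its standard error.
    t = statistics.mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1
```

With this sign convention, a decrease in anxiety from the first to the last session yields a negative t value.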

Results
The results are organized in two sections, one for the intergroup analysis and another for the within-subject analysis of the clinical group. In the figures presented, the error bars always represent the standard error of the mean, and the horizontal bar represents the median when considering nonparametric data (actions and debriefing accuracies, as well as task durations) and the mean when considering parametric data (EDA measures).

Clinical vs Control Group Results
Regarding the actions accuracy (percentage of correct actions performed during the task), we found a statistically significant difference between groups. The Mann-Whitney U test indicated that the actions accuracy of the control group (median=100%) was greater than that of the clinical group (median=88%), U=75.0, P=.01. When analyzing the knowledge retained, as measured by the debriefing, the control group (median=100%) had greater accuracy than the clinical group (median=62.5%), as shown by the Mann-Whitney U test, U=80.5, P=.02 (Figure 5).
In terms of Task Duration, Figure 6 shows the between-group differences. A Mann-Whitney U test showed that the task duration of the ASD group (median=3.76 minutes) was significantly higher than that of the control group (median=3.19 minutes), U=18.0, P=.02.
Regarding the anxiety level of each group, we see a trend for higher values for the clinical group for all the scenarios (global EDA, inside the bus and outdoors, in the street). Figure 7 shows

Intervention Results
We then compared the main outcome measures pre- and postintervention for the 6 participants who completed the 3 sessions. The in-game measure (actions accuracy) evolved positively throughout the sessions (preintervention median accuracy=75.0%, postintervention median accuracy=93.8%; see Figure 8), with a Wilcoxon signed-rank test showing a nonsignificant trend toward improvement (Z=1.63, P=.10). Even though the differences were not statistically significant when using "in-game" measures, the same test applied to the Debriefing Accuracy showed a significant increase (Z=2.22, P=.03) from session 1 (median=68.8%) to session 3 (median=100.0%). Figure 9 illustrates the evolution of the debriefing checklist accuracy throughout the sessions.
Regarding the task duration, since the tasks increased in complexity and difficulty over the course of the intervention, the task duration is not directly comparable. However, it is important to verify that the time to successfully complete the task did not increase significantly, even with exposure to harder levels (Figure 10). In session 1, the task was performed in the "easy" mode, in which players are guided step by step from the starting position to the final location. Tasks in "easy" mode do not require as much planning as the ones in "hard" mode, but slowly introduce the player to this concept and demonstrate how buses can be used to travel between bus stations. The "hard" mode tasks, on the other hand, only tell the player where they must go to complete the task (eg, "go to the hospital"). In these tasks, players have to analyze the map to discover which bus or buses they can take to reach the final destination. Furthermore, the tasks received in sessions 1 and 2 required the player to take 1 bus, while the task from session 3 required the player to take 2 buses. A Wilcoxon signed-rank test showed no difference between the easiest and simplest task (first session) and the most complex and difficult task (last session), Z=0.105, P=.92.
The anxiety levels decreased from the first to the last session, for both overall EDA and for the bus and streets conditions. We created heatmaps using the peaks of anxiety of each participant and separated the conditions inside the bus and outside. Figure 12 shows the streets scenario for each of the sessions, and Figure 13 shows the same metrics but for the inside the bus condition.
According to Figure 12, players felt most anxious in 3 different situations (Textbox 2).

Textbox 2. Situations in which players felt anxious.
• When waiting for the bus at the bus stops (red areas near the bus stop signs)
• When planning the trip or looking for the correct bus stop in the starting areas (ie, green areas near the restaurant in session 1, the fire station in session 2, and one of the bus stops in session 3)
• When reaching or looking for the destination in the finishing areas (ie, green areas near the police station in session 1, the hospital in session 2, and the fire station in session 3)

Principal Findings
This project aimed to assess the potential of the developed serious game and here presents it as a tool for rehabilitation. The game is, to our knowledge, the first application specifically developed to teach ASD individuals to use public transportation systems. In addition, we were able to measure psychophysiological markers of anxiety, paving the way for future biofeedback applications. With only three sessions, it was possible to improve the knowledge of the participants regarding the norms of the bus-taking process and to reduce the anxiety levels felt by the participants during that process.
The impacts of a learning tool with this purpose are broad since it trains executive functions and might increase the autonomy of the users, providing them with a new way of moving through a city. It is also a way to make cities more inclusive, providing people with special needs ways to successfully use this type of public service.
Some studies have been conducted using VR training for ASD, usually focusing on training other skills. Most interventional approaches target social performance training [18,19] or job interviewing [20]. Gaming platforms [21] and brain-computer interfaces [22] were also suggested in the literature for autism training, but without validation with patients. Although these are important targets of intervention, our work focuses on a more specific task of executive function that is relevant for the needs of daily life. Our pilot validation study aimed to assess not only the efficacy of the application, but also the acceptance of the solution with this specific clinical population. Few studies perform fully immersive interventions, and the difficulties of combining them with biofeedback create a technology apparatus that could potentially be disruptive to the participants. The 4 drop-outs during the intervention occurred exclusively due to scheduling issues. No patient dropped the study due to discomfort or raised any difficulty in using the setup. There were specific cases where the Oculus device was not used, but all of those were related to vision impairments, not to lack of tolerance from the user.
The baseline comparison with the control group was clear in identifying the impairments in the clinical group. Both the debriefing of the procedure for taking the bus and the in-game actions showed statistically different results between groups, supporting the validity of the rehabilitation target and confirming the capacity of the game to, by itself, identify the deficits of target participants in the process of taking the bus. Additionally, the time needed to complete the task was longer for the clinical group. Although mean anxiety levels in the clinical group were above those of the control group, this difference was not significant, perhaps due to the low statistical power resulting from the small number of participants, as well as the implementation of biofeedback, which decreases differences between groups.
The intervention was successful in increasing the accuracy of the process description during the debriefing, showing a statistically significant improvement in the theoretical knowledge of the process, which was the main outcome measure. When evaluated inside the game by the user actions, the increase was not statistically significant, but showed a tendency that we believe additional sessions or a larger intervention group would further confirm. It was also successful in decreasing the anxiety felt by participants, especially inside the bus. By using heat maps to represent the anxiety peaks recorded, it was possible to understand that participants with ASD felt more anxious in bus stops and near the starting and finishing areas. This led to the conclusion that, when outside the buses, players felt most anxious when planning the trip, when looking for the bus stop, when waiting for the bus, and when looking for the final destination. Inside the bus, we observed a desensitization to stress throughout the sessions, with a final session showing fewer peaks of EDA activity.
Despite the increase of task complexity and difficulty across sessions, the time duration to complete the task did not increase, suggesting a learning effect and adaptation to the serious game.

Conclusions
By using the game as a therapeutic intervention tool, in just three sessions it was possible to improve the general efficiency of participants and expose them to peculiar scenarios in which they could train their planning skills. More importantly, it was possible to nearly extinguish the anxiety felt in bus environments and teach the bus-taking norms necessary for the autonomous use of buses for transportation, both in theoretical and practical contexts. Future studies should conduct randomized controlled trials, with larger intervention groups, to replicate the findings and extend them to other clinical populations with executive function deficits and lack of autonomy.

Introduction
Participation in physical activities has multiple health benefits for children [1]. It also provides cognitive benefits, including facilitating learning and academic achievement [2,3]. The World Health Organization recommends that children be physically active for at least 60 min daily [1]. Unfortunately, the majority of children around the world do not reach this level [4]. Excessive use of technology has been suggested as a contributing factor to childhood inactivity [5]. However, research also recognizes that games that incentivize exercise have the potential to promote children's physical activity [6].
Pokémon GO is a mobile game where fictional Pokémon creatures are captured and trained [7]. Niantic, the developer, uses an augmented reality technique where live direct views of a real-world environment are augmented by computer-generated graphics to create a geocaching (ie, geographic search) game where Pokémon characters are sought. Supplementing the real world with computer imagery allows players to become immersed in the game, requiring them to be physically active to capture and hatch new Pokémon [7]. Some studies suggest that Pokémon GO increases physical activity and discourages sedentary behavior among adults [8][9][10][11]. However, another study showed that although the number of steps increased at the beginning, it gradually decreased and returned to previous levels 9 weeks after downloading the game [12]. Nevertheless, the authors conclude that the effect of Pokémon GO on physical activity might be different in children [12]. Moreover, the game might have the potential to reach populations that traditionally have low levels of physical activity (preteens and teens) and promote their physical activity [8]. Despite the potential benefits of the game, negative effects, such as increased risk of injury, abduction, and trespassing, threaten the safety and physical well-being of children [13,14]. Further research with children is needed to understand which game components promote initial and sustained physical activity [7]. To the best of our knowledge, this is the first qualitative study of both children and parents aiming to develop an understanding of the underlying mechanisms of Pokémon GO so that this knowledge can be transferred to physical activity interventions for children.
We aimed to explore children's and parents' experiences playing Pokémon GO to determine which components make physical activity games attractive.

Design
This qualitative study is one of the several studies of the process of developing gamification-inspired programs to promote physical activity in children. Qualitative methods provide the opportunity to achieve rich descriptions, enhance understanding of a phenomenon, and capture the voices of people who are rarely heard (ie, the children) [15]. Thus, a qualitative approach to gamification research has the potential to identify the components that make physical activity games attractive to promote physical activity among children and adolescents.

Participants
Children aged 7 to 12 years and their parents living in a municipality of approximately 80,000 inhabitants situated in north Sweden with experience playing Pokémon GO were selected using purposeful sampling. Eight families comprising 13 children and 9 parents agreed to participate. Participant information is shown in Table 1. Krueger and Casey [16] recommended having a homogeneous composition of focus groups in terms of age and sex, for example. However, we chose to distribute participants in the focus groups based on family affiliation because we wanted to capture the reflections from the parents on the opinions of the children and vice versa. We paid extra attention to letting the children tell their stories because they could automatically have a sense of less power in this situation.

Data Collection
In December 2016, data were collected using focus groups. A semistructured interview guide was constructed to ensure all aspects of the aim were addressed. The opening question was, "Let's pretend I know nothing. Could you please tell me about Pokémon GO?" To persuade informants to expound on their answers, follow-up questions, such as "Could you tell me more?" and "How did you experience that?" were asked [17]. Each family formed a focus group consisting of 1 or 2 children and 1 or 2 parents, and the interviews lasted between 30 and 60 min. The focus group interviews were performed by the first and last authors, and they were audio recorded and transcribed verbatim.

Data Analysis
The qualitative latent content analysis was inspired by Graneheim and Lundman [18]. First, transcriptions were read several times to obtain a sense of the overall data; second, the text was divided into meaning units; and third, the meaning units were coded, and these codes were compared, contrasted, and sorted into themes while maintaining fidelity with the text.
To strengthen the credibility of the study, the first and last authors initially performed their analyses separately. Then, differences in coding and sorting were discussed, and consensus was reached. The second and third authors made comments about the result to add to readability and clarity. To further strengthen the credibility of the study, quotes were used to illustrate our interpretations.

Ethical Approval
This study was performed in accordance with the principles of the World Medical Association's Declaration of Helsinki: Ethical Principles for Medical Research Involving Human Subjects. Each participant signed an informed consent before participating in the study. This study was approved by the Regional Ethical Board in Umeå, Sweden, before the start of the research project (issue date: February 9, 2016; Application Registration Number: 2015/296-31Ö).

Table 1. Characteristics of the children and parents who participated in the focus groups.

Results
The analysis rendered the following three themes: exciting and enjoyable exploration, dangers and disadvantages, and cooperation conquers competition.

Exciting and Enjoyable Exploration
As the game requires exploration of the participant's neighborhood for fictional characters that appear and "hatching" of Pokémon eggs, almost all respondents described the game as increasing their physical activity, especially when the game was new. In the beginning, it was often played every day, and sometimes weekend trips were taken with the sole goal of catching Pokémon. It was fun and exciting to catch a new Pokémon and add it to the Pokédex. Having a full Pokédex was a goal for some players; however, with the release of new Pokémon, this was difficult. Having captured a Pokémon that nobody else had gave feelings of uniqueness and gave extra spice to the game. In addition, it was exciting to evolve a Pokémon by giving it candies, which players earned by being physically active. Some players said they walked hundreds of kilometers on "Poké-walks" to catch Pokémon, get candies, or incubate eggs, not knowing which Pokémon would be gained, as shown in the quoted text below:
Boy: The game is really exciting. A couple of days ago, we saw that Pikachu was nearby, and we hurried up with clothes and ran there. But he had disappeared when we came there, and we got really disappointed. Suddenly, he appeared again and I just shouted out. A woman was walking by and just looked at me, but I was so involved in the game. [Focus Group 6]
One highlighted advantage of Pokémon GO was that it combines playing with being outside and moving around. Parents often encouraged their children to play and vice versa. The children wanted their parents to come along on Pokémon walks. Parents told how easy it was to get their children to go for a walk or join a trip to the city, which earlier had not been so appreciated. Some parents, primarily those who played together with their children, began to take extra walks during lunch breaks or in the evenings just to progress in the game. In that way, the game led to increased everyday physical activity, as shown in the quoted text below:
Father: We went on a walk last night, all three of us. I think we were outside one hour. Earlier, we have never gone walking that far and especially my son has not earlier been the one pushing us to go out walking. [Focus Group 4]
Both children and parents stated that to keep their interest, it was important that new things happen in the game. Examples were the ability to catch rare new Pokémon or participate in special events that appeared in the game. For instance, on Halloween, more experience points (XP) and candies could be earned. Moreover, bonuses for playing every day during a certain period increased the desire to play. A common wish for future game development was allowing friends to trade Pokémon. As some Pokémon exist only on one continent, it was desirable to have a chat function that allowed people from various parts of the world to connect and exchange Pokémon. In addition, players expressed the desire to collaborate with others and play together, build a joint Pokédex, and so on. Furthermore, they suggested having a map so that friends could be located. Another wish for development was having the ability to do more with their Pokémon than fight in a gym, as shown in the quoted text below:

Dangers and Disadvantages
A few children described having small accidents, such as falling into a ditch or biking into a mailbox, when playing. Furthermore, some parents were afraid that their children would walk into a road with traffic without acknowledging the consequences. Some parents limited where their children could play the game, and one parent did not allow his son to go outside alone in the evenings when it was dark. When playing the game, it was difficult to heed one's surroundings, making it possible to hurt oneself. In addition, one could miss things in nature, such as a squirrel or something else in the real world. One family acknowledged that walking around with a phone in hand at all times increased the risk of theft. Some parents noticed that by playing the game, it became acceptable to walk around with the phone in one hand even when not playing the game, as shown in the quoted text below: [Focus Group 3]
Overall, parents emphasized the challenge in limiting the time their children spent playing computer and mobile games. As they saw several advantages of being physically active and being outside, they were less likely to limit the time spent on this game. In addition, parents who played with their children relaxed their rules about screen time. Moreover, some families experienced conflicts when playing the game. For instance, one boy became hysterical when he was disallowed from going outside to catch a rare Pokémon that he really wanted, as shown in the quoted text below:
Mother: My sister sent a message that a really rare Pokémon was in the city, and my son really wanted to go there, and we couldn't. It ended in a real fight, and he got really sad, which I can understand because he really wanted that. I have also heard from friends about children becoming hysteric, as it is important to them to catch a Pokémon somewhere. It is like an obsessed behavior. [Focus Group 5]
Some families mentioned that others cheat in the game. Two boys gave up playing when it became obvious that some of their friends were cheating, and they thought, "why bother playing" when it is possible to progress in the game incorrectly. Another family refrained from using cheats because they took pride in making progress without using them. A practical problem that occurred with many parents was that their children stayed out playing until the phone battery died. When this happened, the parent and the child could not contact each other in the usual way. This was a particularly large problem in the winter when phone batteries are much less effective. It also led to a need for keeping the phone in a pocket, making it more difficult to play. In addition, several respondents commented on the relative ease of playing in the summer when it was more comfortable because less clothing was needed to play outside, especially for a long time.

Cooperation Conquers Competition
Cooperation and togetherness were highly valued by the participants, and without a doubt, this was highlighted before competition. Nevertheless, winning a gym fight and competing against other players were mentioned as fun activities, especially among the boys in the study. Through these features of the game, they could measure who was best in the area. Those players who liked competing in the gym were motivated to reach higher levels in the game. One aspect of the game that was especially appreciated was that it led to new friends. The children talked a lot about the game while at school, discussing things such as who found or scouted a rare Pokémon. This brought about new friendships throughout the school. Showing others a rare Pokémon was described as fun, and others were interested in hearing about it. Sometimes, a lot of attention was given by people they did not know. At some locations, many players would gather, and it was common for players to start talking to others in the same manner that dog owners start talking to each other about their dogs. For one boy, this game facilitated making friends, something that he previously found challenging, as shown in the quoted text below:
Mother: My boy has had difficulties interacting with peers. Most of them like to play football. Now he has found friends with the same interests, and his social skills grow every day. [Focus Group 7]
Some parents and children played on the same phone or using the same login, thereby cooperating in progressing through the game. This added another dimension to the game, and they often talked about which Pokémon was caught, or they were motivated to be more active in the game. Both parents and children described how cooperation was an important factor to continue playing the game, and it motivated them to play a lot. In one family, 2 children and their mother played using a single login, and the daughter would not play at all if not doing it together. Furthermore, cooperation created the opportunity for interacting with the children and talking about other things, such as school, while walking together and playing the game; this is not possible with other computer-related activities. In addition, playing together allowed parents to be more involved in the game compared with other computer or mobile games that their children played, as shown in the quoted text below:

Principal Findings
This study shows that the components of the game that the participants valued most were predominately linked to exploring and socializing, and only a few players valued fighting in the gym. We found that the social aspect of the game dominated; comments about walking together, cooperating using a single account, making new friends, and sharing a common interest in the game prevailed. This finding is not in line with the findings of Rasche et al [19], who found that social interaction was not affected by playing the game. In their case, the focus was predominantly on adults, and it is reasonable to believe that children as well as families act differently. The second-most prominent classification was that of the explorers who commented about exploring new places to find new Pokémon, being excited about spotting a rare Pokémon, hatching a new Pokémon, and being able to show off their rare and developed Pokémon. Our findings suggest that gaming elements, such as cooperating and exploring, inspire participation in physical activities. These components of the game can be valuable when designing physical activity interventions using gamification. In earlier research, game characteristics shown to promote physical activity were cooperative play, flexibility in activity choices, short-term goals, integration of physical activities with elements of a story or narrative, rewards, and replayability [7]. Moreover, evidence supports that cooperative gaming offers more intrinsic motivation, self-efficacy, enjoyment, and continued participation than solitary play [20]. An increase in motivation is crucial to physical activity interventions because motivation promotes adherence and attendance, two factors necessary for creating lasting behaviors and physiological changes [20].
Our findings show that playing Pokémon GO is not without some risk; however, learning to navigate environmental risks is a natural part of growth and maturation. Furthermore, the sustainability of the new behaviors is inherently based on the elements of the game, such as gaining access to new characters or a joint Pokédex. Thus, the increased physical activity evidenced in this study remains largely dependent on the game and is therefore not likely sustainable. The bidirectional and reciprocal nature of the social cognitive theory [21] can explain underlying tenets related to the discoveries in this study. Using the game as the context, participants could begin to model behaviors that lead to success in the game because this experience brings joy to their lives. Players learn from each experience how best to locate, lure, and catch Pokémon. When successful, the behaviors are replicated. Thus, social learning through watching and cooperating with others to catch specific characters helps shape the new behavior of daily participation in physical activities. Visiting new locations and being physically active increased the likelihood of success, and it was the behavior that linked feelings of enjoyment and excitement with the environment even though it was through a virtual, augmented environment. In other words, the game players were rewarded for their behaviors regardless of whether the behavior was modeled by another player or by the game itself. In the presence of self-observation, self-evaluation, self-reaction, and self-efficacy, as features of the game, the interactivity between and among these processes positively influenced motivation and goal achievement [22]. Self-observation, or having an awareness of one's performance, does not by itself result in a sustainable behavior change, but the inclusion of both regularity and proximity can enhance its effects on motivation [22]. 
It is the consistency of the success (eg, if I am physically active, then I am more likely to catch a Pokémon) and the timing of the feedback that influence behavior. The components that make physical activity games addictive might be transferable: gamification components have been used previously to improve the effectiveness of health promotions [23], and promising research demonstrates the use of gamification with the aim of promoting physical activity [24]. For instance, we are currently developing an intervention promoting children's active school transportation, and that intervention will draw upon the design patterns from Pokémon GO.

Limitations
This is a qualitative study that provides a rich description of families' experiences playing Pokémon GO, and it therefore enhances the understanding of how mobile games can promote physical activity as well as capture the voices of people who are rarely heard (ie, the children) [15]. This study is limited in that we used only 8 families; however, the last 2 focus group interviews did not add new insights, indicating saturation in the data. This small sample limits the transferability of the results, but the knowledge gained by this study might be transferable to physical activity interventions for children.

Conclusions
As cooperation and exploration were the motivating aspects of Pokémon GO that were most appreciated by the families and most inspiring to increase physical activity, they could be advantageous aspects of gamification when building physical activity interventions for children. For example, including elements of surprise and cooperation with friends and parents, such as collecting points or badges together, might be useful. This knowledge is important when designing interventions aimed at increasing physical activity using gamification. However, more research is needed in the area.

Introduction
The latest government audit of university-based research in the United Kingdom has highlighted the need for more sport and exercise research to target health inequalities that persist in society [1]. There are strong inverse relationships between life expectancy, cardiorespiratory fitness, and socioeconomic status [2][3][4]. As these trends are exacerbated in males, improving fitness in men living in regions of socioeconomic deprivation could be important for addressing health inequalities. High-intensity training (HIT) is a time-efficient way to improve fitness over a short duration [5], and a growing body of evidence advocates this form of exercise for public health interventions [6]. HIT is known to stimulate a combination of central and peripheral adaptations promoting an enhanced availability, extraction, and utilization of oxygen [7,8]. Although these findings are encouraging, these previous studies have tended to prescribe exercise with a focus on cycling and running [9][10][11][12]. Some argue that the psychologically aversive nature of high-intensity exercise means that this training will not be adopted or maintained by many people [13]. Thus, although the physiological benefits of HIT are unequivocal, there is ongoing debate about its relevance across populations, particularly for less-motivated individuals. Finding ways to improve its acceptability could be the key to wider adoption of HIT.
The concept of exergaming, or active video gaming, has been around since the 1980s and affords unrivalled opportunities to reward exercise through visual, audio, haptic, and mental stimuli. Exergaming-based interventions have understandably shown improved rates of compliance, adherence, and social inclusion (eg, see [2]). Despite this, however, and after nearly a decade of research, the potential for exergaming to improve levels of fitness on a population level has not been realized [14]. A possible reason is that most of the research is based on commercially available exergames that are designed primarily for entertainment. In other areas of health research, such as physiotherapy, researchers have overcome these shortfalls by creating their own exergames specifically geared to the physiological needs of the clinical or at-risk population. These serious games are purposively designed to train physiological systems at a level that is sufficient to cause positive adaptation. Importantly, these serious exergames have succeeded in improving health-related outcomes (eg, balance and falls risk) and demonstrate their potential for use in other health professions [15].
A major challenge with improving HIT and exergaming is to translate positive laboratory-based findings into interventions that can directly affect those individuals who need it most and can also be administered on a larger scale [16,17]. Whether or not the concept of serious exergaming can contribute in this regard requires, first, that a game is tailored toward reducing the aversive protocols associated with HIT and second, that the intervention is capable of improving fitness in our target population. Given that the commercially available games are known to deliver, at best, low to moderate levels of exercise [18], our first aim was to describe the development of a serious exergame for this purpose. Our second aim was to quantify its real-world fidelity and potential to improve fitness and health outcomes in males recruited in regions of socioeconomic deprivation over a 6-week intervention.

Development of the Game and Intervention
During the preintervention period, we held regular focus groups comprising in-house participants, computer programmers, and exercise scientists with specialisms in delivering HIT. Together, we explored opportunities to gamify HIT protocols. Boxing was an obvious theme for gamification as it involves high-intensity exercise [19], and the subculture already exists within our target population of lower working-class males of the North-East of England [20,21]. The complete set of hardware used in the trials comprised a single computer with a dedicated graphics card (GeForce GTX 960, [22]), two standard 23-inch monitors, two Kinect for Windows sensors, four wrist-mounted wireless inertial measurement units (developed in-house to reduce latency in the Kinect datastream), and two wireless chest-worn heart-rate monitors (Polar RS400, Polar Electro Oy, Kempele, Finland). In addition, the participants wore boxing-specific resistance bands (ShadowBoxer Pty, Australia) to increase levels of exertion and reduce eccentric muscle contractions during the deceleration phase of punching. The movement data from the sensors were streamed to our gaming engine [23] via the User Datagram Protocol with a latency of approximately 15 ms. Input from the skeletal tracking module of the Kinect system and the inertial measurement units was converted to a digital avatar using purpose-written functions that convert each body segment long axis (19 in total) into three-dimensional quaternions; the hardware setup is shown in Figure 1. The C# programming language was used throughout to create scripts for gameplay (Figure 2).
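The conversion of a segment long axis into a quaternion is not specified further here, so the following Python sketch shows one common way to do it: the shortest-arc rotation that carries a reference axis onto the measured segment direction. Everything in it is an assumption made for illustration, including the function names, the choice of the y axis as the reference, and the use of Python rather than the C# scripts used in the game.

```python
import math


def _normalize(v):
    """Scale a vector (any length) to unit norm."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("zero-length vector")
    return tuple(c / norm for c in v)


def _cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def axis_to_quaternion(direction, reference=(0.0, 1.0, 0.0)):
    """Unit quaternion (w, x, y, z) rotating `reference` onto `direction`.

    The y-up reference axis is an assumption, not the authors' convention.
    """
    r = _normalize(reference)
    d = _normalize(direction)
    dot = sum(a * b for a, b in zip(r, d))
    if dot < -1.0 + 1e-9:
        # Opposite directions: turn 180 degrees about any axis
        # perpendicular to the reference.
        helper = (1.0, 0.0, 0.0) if abs(r[0]) < 0.9 else (0.0, 1.0, 0.0)
        axis = _normalize(_cross(r, helper))
        return (0.0, axis[0], axis[1], axis[2])
    cx, cy, cz = _cross(r, d)
    return _normalize((1.0 + dot, cx, cy, cz))


def rotate(q, v):
    """Rotate 3D vector v by unit quaternion q = (w, x, y, z)."""
    w, u = q[0], q[1:]
    t = _cross(u, v)
    tt = _cross(u, t)
    return tuple(v[i] + 2.0 * w * t[i] + 2.0 * tt[i] for i in range(3))


# Hypothetical forearm direction pointing along the x axis.
q = axis_to_quaternion((1.0, 0.0, 0.0))
print(rotate(q, (0.0, 1.0, 0.0)))  # the reference axis, carried onto the segment
```

A single direction vector leaves the twist about the long axis undetermined, which is one plausible reason for supplementing the Kinect skeleton with wrist-mounted inertial measurement units.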
Commercial games are primarily designed for entertainment and recreation for younger populations and tend to have colorful and visually busy game and user interfaces [24]. Furthermore, commercially available games are mostly designed for enjoyment and not based on basic exercise principles. For games to promote beneficial changes in health, they need to engage physiological systems in a meaningful manner that is relevant for the function being trained [25]. Our philosophy was to ensure that the gameplay was simple and, where possible, true to the rules of boxing. Each avatar was assigned collision objects with inertial characteristics on the trunk, head, and hands, thus enabling the physics engine to superimpose realistic joint movements in response to being hit. As in real boxing, the primary mechanism for scoring was to successfully land a punch on either the trunk or head of the opponent. The number of points awarded per punch was the product of the current heart rate (expressed as a fraction of maximal) and the speed of impact (m·s⁻¹). Thus, for example, a punch landing to the head with a relative speed of 9 m·s⁻¹ and at 95% of maximal heart rate (ie, 95% HRmax) was awarded 8.55 points (ie, 9 × 0.95). Each participant was provided with biofeedback in the form of a partial arc positioned around the opponent's head; the angle subtended by this arc was linked to the current heart rate (Figure 3).
For example, when player 1 was working at 50% of maximal heart rate, an arc of 180° was displayed around the head of player 2. The players were made aware that the angle of the arc was proportional to their current training intensity, which in turn contributed strongly to punch power and points scoring. They were also aware that large global displacements of their center of mass (eg, bouncing up and down) would increase this intensity but with a lag of approximately 20 to 30 s [26]. This arc was visible in their monitor only and was active throughout the exergaming session (ie, during both repetitions and rest), essentially allowing the participants to self-regulate training intensity per the oncoming requirements of the game. Although we kept on-screen text to a minimum, we elected to display a countdown timer during some periods. These were during the final 30 s of active recovery to encourage the elevation of training intensity in preparation for a repetition and during the latter 10 s of each repetition to encourage all-out punching right up until the bell. Following preliminary tests, we removed the ability to block punches as this was observed to be energetically undemanding; thus, the only forms of defense were to attack before being hit or to dodge the oncoming punch using whole body movements. For similar reasons, we limited the periods of allowable close-up punching. Specifically, after 1 s of close-up punching, the referee (shown in the striped shirt in Figure 3) interrupted the fight and ordered both participants to race back to their respective corners (ie, 3 m of backwards stepping). The winner of each mini race received an additional 50 points, equivalent to approximately 6 to 7 well-timed punches. After several iterations of observing kinematics, heart rates, and tactics during gameplay, we prepared the exergaming system (ie, both hardware and software) for use in the target settings.
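The scoring and biofeedback rules described above can be summarized in a short sketch. This is an illustration in Python rather than the C# gameplay scripts used in the study; the function names are hypothetical, and the clamping of the arc to the range 0° to 360° is an assumption (the text states only that the angle was proportional to intensity and that 50% of maximal heart rate gave 180°).

```python
# Bonus awarded to the winner of each race back to the corner (from the text).
RACE_BONUS_POINTS = 50


def punch_points(impact_speed_ms, hr_fraction):
    """Points for a landed punch: impact speed (m/s) times fraction of maximal heart rate."""
    return impact_speed_ms * hr_fraction


def arc_angle_deg(hr_fraction):
    """Angle of the biofeedback arc, proportional to the fraction of maximal heart rate.

    Assumption: values outside 0..1 are clamped so the arc never exceeds a full circle.
    """
    clamped = min(max(hr_fraction, 0.0), 1.0)
    return 360.0 * clamped
```

With these definitions, `punch_points(9.0, 0.95)` reproduces the 8.55 points of the worked example, and `arc_angle_deg(0.5)` gives the 180° arc shown at 50% of maximal heart rate.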

Participants
We conducted a 6-week exploratory controlled trial designed to assess the fidelity of the game in terms of delivering the intended training stimulus and to examine the effect of the intervention on selected health outcomes. As appropriate for an exploratory trial, we did not conduct formal sample size estimation a priori; rather, the CIs would be used to inform future trials. Men are commonly referred to as a hard-to-reach population to engage in health promotion activities [27], and a targeted recruitment approach at locations predominantly attended by men may facilitate uptake of participants. Therefore, to maximize recruitment within our intended population, relevant gatekeepers were approached at institutions positioned within regions of social deprivation. The two settings used for recruitment and the trials were a social club and a mosque, both situated within deprived regions of Middlesbrough, United Kingdom (TS1 and TS4). Recruitment in the social club relied heavily on the gatekeeper (club secretary), whereas uptake of A third-party minimization process using baseline measures of age, waist circumference, and predicted maximum oxygen consumption (VO2max) was used to remove bias in group allocation. The control group was instructed to maintain their current physical activity levels and inform the researchers should any changes arise during the intervention period. One of the participants in the control group decided to embark on his own fitness program following the preintervention tests but was included in the data analysis. Participant flow through the trials is shown in Figure 4; 3 participants were lost to follow-up (control=2, intervention=1). Overall retention to the intervention that encompassed baseline and follow-up measures was 87.5% (21/24).
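The text does not describe how the third-party minimization process was implemented. Purely as an illustration of the general technique, the sketch below shows a generic count-based minimization in Python. It assumes that age, waist circumference, and predicted VO2max have already been grouped into categories (the category labels and cut-points are hypothetical), that each new participant is assigned to the arm giving the smallest total imbalance, and that ties are broken at random. The function name `minimization_assign` is also hypothetical.

```python
import random


def minimization_assign(new_levels, allocated, arms=("intervention", "control"), rng=None):
    """Pick the arm that minimizes covariate imbalance for a new participant.

    new_levels: dict mapping factor name to the new participant's category.
    allocated:  list of (arm, levels) tuples for participants already assigned.
    """
    rng = rng or random.Random()
    imbalance = {}
    for candidate in arms:
        total = 0
        for factor, level in new_levels.items():
            # Count existing participants in each arm who share this category.
            counts = {arm: 0 for arm in arms}
            for arm, levels in allocated:
                if levels.get(factor) == level:
                    counts[arm] += 1
            counts[candidate] += 1  # hypothetically place the new participant here
            total += max(counts.values()) - min(counts.values())
        imbalance[candidate] = total
    best = min(imbalance.values())
    # Assumption: ties between arms are resolved at random.
    return rng.choice([arm for arm in arms if imbalance[arm] == best])
```

For example, if the only participant allocated so far is in the intervention arm, a new participant with the same three categories would be placed in the control arm.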
To explore perceptions of the exergame and the HIT regime, semistructured interviews were conducted with 5 intervention participants following the 6-week training period, which were analyzed semantically. The study was approved by the ethics committee of Teesside University, United Kingdom, and written informed consent was obtained from all participants.

Measures
Participants' baseline characteristics (mean [SD]) are shown in Table 1. All measures were assessed pre- and postintervention. Blood pressure was measured on the left arm, positioned at heart height with the participant seated, using an automatic upper arm blood pressure monitor (Omron MX13). Measures were made at least three times at 3-min intervals, and the average of the two lowest measures was used for analysis [31]. Waist circumference was measured using the World Health Organization guidelines [32]. Predicted VO2max was obtained by performing a submaximal 8-min ramped step test [33]. Heart rate response (Polar T34; Polar Electro Oy, Kempele, Finland) and simultaneous breath-by-breath expired gas were collected using a portable indirect calorimeter (Cosmed K4 b2; Rome, Italy), calibrated per the manufacturer's guidelines. Individual HRmax was estimated [34] and plotted against VO2 data for the determination of predicted VO2max.
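Two of the calculations above can be sketched in a few lines of Python. The blood pressure rule follows the text directly, with the assumption that it is applied separately to systolic and diastolic values. For predicted VO2max, the text states only that estimated HRmax was plotted against the VO2 data; the sketch assumes the common approach of fitting a straight line to the submaximal heart rate and VO2 pairs and extrapolating to HRmax. The function names are hypothetical, and HRmax is taken as an input because the estimation equation from [34] is not given here.

```python
def mean_of_two_lowest(readings):
    """Average of the two lowest of at least three blood pressure readings."""
    if len(readings) < 3:
        raise ValueError("at least three readings are required")
    lowest = sorted(readings)[:2]
    return sum(lowest) / 2.0


def predict_vo2max(heart_rates, vo2_values, hr_max):
    """Extrapolate a least-squares line VO2 = a + b * HR to the estimated HRmax.

    Assumption: the HR-VO2 relationship is linear over the submaximal range.
    """
    if len(heart_rates) != len(vo2_values) or len(heart_rates) < 2:
        raise ValueError("need at least two paired HR and VO2 samples")
    n = len(heart_rates)
    mean_hr = sum(heart_rates) / n
    mean_vo2 = sum(vo2_values) / n
    sxx = sum((h - mean_hr) ** 2 for h in heart_rates)
    if sxx == 0.0:
        raise ValueError("heart rates must not all be identical")
    sxy = sum((h - mean_hr) * (v - mean_vo2) for h, v in zip(heart_rates, vo2_values))
    slope = sxy / sxx
    intercept = mean_vo2 - slope * mean_hr
    return intercept + slope * hr_max
```

The sample values used to exercise these functions would be illustrative only and are not study data.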

Intervention Delivery
Evidence recommends a minimum duration of 12 weeks for a HIT protocol to promote favorable changes in blood pressure and anthropometric measurements of obesity [35]. However, a 6-week intervention was selected, as a minimum of 13 sessions (0.16 work/rest ratio) is sufficient to elicit moderate improvements in VO2max in sedentary individuals [5]. Additionally, there is still ambiguity regarding the optimal work-to-rest ratio when designing HIT interventions, particularly in populations with varied age, baseline fitness, and training experience [36]. Therefore, longer duration HIT models (1-4 min) were deemed unsuitable for our target population [37]. Furthermore, minigames (such as the current exergame) have short life spans, where adherence to a longer intervention (eg, 12 weeks) may diminish over time and influence health outcomes. This was evident from a 12-week pilot study (unpublished data) using an exergame in the same population that saw attendance drop from 53% during week 2 to 16% during week 12.
Participants allocated to the intervention group were invited to attend three sessions of exergaming per week. At the beginning of the exergaming session, participants were required to complete a 6-min structured warm-up consisting of a series of exercises on a 210 mm step until both participants reached >70% HRmax. Session workloads with volumetric progression were set automatically once the user's identifying information was entered. The session workloads were 120 s, 150 s, and 180 s of work during weeks 1 and 2, weeks 3 and 4, and weeks 5 and 6, respectively.
To avoid staleness, the repetition lengths (10, 20, or 30 s) were randomly selected at the beginning of each round. We set the work-to-rest ratio at 1:4, and thus, the respective repetitions were followed by 40, 80, or 120 s of active recovery. Participants were instructed to perform the repetitions at an intensity ≥85% HRmax. Each exergaming session took approximately 30 to 40 min to complete, including equipment set-up and warm-up, with additional enjoyment and task immersion questionnaires administered upon completion of the HIT bouts (not reported here). Heart rate responses were taken within repetitions and therefore did not include any of the recovery period. This avoided an overestimation of physiological load, which can occur when heart rate continues to rise after exercise cessation [26].
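The session structure described above can be sketched as follows. The weekly workloads, repetition lengths, and 1:4 work-to-rest ratio come from the text; how the randomly drawn repetitions were combined to reach the session workload is not stated, so the sketch assumes that repetitions are drawn until the workload is met exactly, choosing only from lengths that still fit. The names `build_session`, `WEEKLY_WORKLOAD_S`, and `REST_PER_WORK` are hypothetical.

```python
import random

# Total work per session in seconds, by intervention week (from the text).
WEEKLY_WORKLOAD_S = {1: 120, 2: 120, 3: 150, 4: 150, 5: 180, 6: 180}
REPETITION_LENGTHS_S = (10, 20, 30)
REST_PER_WORK = 4  # work-to-rest ratio of 1:4


def build_session(week, rng=None):
    """Return a list of (work_s, recovery_s) pairs for one exergaming session.

    Assumption: repetitions are drawn at random until the weekly workload is
    reached, restricting each draw to lengths that fit the remaining work.
    """
    rng = rng or random.Random()
    remaining = WEEKLY_WORKLOAD_S[week]
    session = []
    while remaining > 0:
        options = [s for s in REPETITION_LENGTHS_S if s <= remaining]
        work = rng.choice(options)
        session.append((work, work * REST_PER_WORK))
        remaining -= work
    return session
```

Because every workload is a multiple of 10 s, at least one repetition length always fits, and each repetition is paired with 40, 80, or 120 s of active recovery as described.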

Statistical Analysis
Data are presented as mean (SD). Exergaming session attendance was calculated using descriptive statistics. Training data (heart rate [repetition mean and peak], rating of perceived exertion [RPE]) were analyzed using a mixed linear model with a random intercept (Statistical Package for the Social Sciences). This approach enabled us to calculate (1) the within-participant variability in the training dose [38], expressed as an SD; (2) the effect of HIT repetition duration (entered as a fixed effect) on heart rate and RPE; and (3) the change in heart rate and RPE across the 6-week HIT intervention, with session number (1-18) entered as a fixed effect. A priori, we defined a minimal practically important difference (MPID) in training heart rates as two percentage points, given that when training at high intensity, this difference influences the adaptive response [39].
The MPID for RPE was set at one arbitrary unit (AU) on the Borg CR10 Scale, representing a full increment change on the scale. Inferences were then based on the disposition of the 90% confidence limits (CLs) for the mean difference relative to these MPIDs. The probability (percent chances) that differences in heart rate and RPE between HIT repetitions of different durations or across the 6-week intervention were substantial (>2 percentage points, >1 AU) or trivial was calculated as per the magnitude-based inference approach described by Batterham and Hopkins [40], an approach that has been advocated within user research [41].
These percent chances were qualified via probabilistic terms assigned using the following scale: 25 to 75%, possibly; 75 to 95%, likely; 95 to 99.5%, very likely; and >99.5%, most likely [42]. To determine the magnitude of the within-participant variability in our training heart rate and RPE, the values were doubled and then interpreted against the aforementioned MPIDs. All training data effects were evaluated mechanistically, whereby if the 90% CL overlapped the thresholds for the smallest worthwhile positive and negative effects, the effect was deemed unclear [42]. The effect of HIT on our outcome measures was determined using a custom-made spreadsheet [43], with the baseline value of the dependent variable used as a covariate to control for baseline between-group imbalances. Following this, standardized thresholds for small, moderate, and large changes (0.2, 0.6, and 1.2, respectively) [42] derived from between-subject SDs of the baseline values were used to assess the magnitude of all effects, with magnitude-based inferences subsequently applied. Here, all inferences were categorized as clinical, with the default probabilities for declaring an effect clinically beneficial being <0.5% (most unlikely) for harm and >25% (possibly) for benefit [42].
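The way percent chances follow from a mean difference, its 90% CLs, and an MPID can be illustrated with a minimal sketch. It is not the spreadsheet or mixed-model output used in the study: it assumes a normal sampling distribution (the published approach typically uses a t distribution), it labels only the probability bands listed above, and the function names are hypothetical.

```python
from statistics import NormalDist


def percent_chances(mean_diff, cl90_half_width, mpid):
    """Return percent chances that the true effect is (higher, trivial, lower).

    Assumption: the estimate is normally distributed, so the standard error is
    the 90% CL half-width divided by the 95th percentile of the standard normal.
    """
    if cl90_half_width <= 0 or mpid <= 0:
        raise ValueError("CL half-width and MPID must be positive")
    se = cl90_half_width / NormalDist().inv_cdf(0.95)
    dist = NormalDist(mean_diff, se)
    p_lower = dist.cdf(-mpid)
    p_higher = 1.0 - dist.cdf(mpid)
    p_trivial = 1.0 - p_lower - p_higher
    return 100.0 * p_higher, 100.0 * p_trivial, 100.0 * p_lower


def qualify(percent):
    """Map a percent chance to the probabilistic terms listed in the text."""
    if percent > 99.5:
        return "most likely"
    if percent >= 95.0:
        return "very likely"
    if percent >= 75.0:
        return "likely"
    if percent >= 25.0:
        return "possibly"
    return None  # bands below 25% are not labelled in the scale quoted above


def is_unclear(mean_diff, cl90_half_width, mpid):
    """Mechanistic rule: unclear if the 90% CL spans both substantial thresholds."""
    return (mean_diff - cl90_half_width) < -mpid and (mean_diff + cl90_half_width) > mpid
```

As a usage example, a session-to-session slope of 0.1 percentage points with 90% CLs of ±0.2 and an MPID of 2 percentage points gives a chance of a trivial effect above 99.5% under these assumptions, which corresponds to the term most likely trivial.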

Measures
When compared with the control group, the effect of the exergaming intervention was a likely small beneficial improvement in predicted VO2max. The SD of the individual responses in predicted VO2max to the exergaming intervention was 3.4 (90% CLs ±3.1). All other effects were either most likely trivial (body mass and waist circumference) or unclear (systolic blood pressure and diastolic blood pressure; Table 1).

Intervention Delivery
Mean session attendance was 67.5% (170/252; range 0%-100%). Of the 14 intervention participants, 9 attended at least 75% of the prescribed exergaming sessions. The total number of repetitions performed was 1268 (range 6-136 per participant), and the descriptive training data for the exergaming intervention are presented in Table 2. The magnitude of the within-participant variability in all intensity measures was moderate.
Mean exercise intensity data for the 10-s, 20-s, and 30-s repetitions are presented in Table 3. Mean heart rates were substantially higher during the 30-s repetitions when compared with the 20-s (2.2 percentage points; 90% CLs ±0.5 percentage points) and 10-s repetitions (3.5 percentage points; ±0.5 percentage points). Peak heart rates were also substantially higher for the 30-s repetitions when compared with the 20-s (3.6 percentage points; ±0.6 percentage points) and 10-s repetitions (6.4 percentage points; ±0.6 percentage points). The effect of repetition duration on RPE was most likely trivial. The mean exercise intensity scores per exercise session, along with the individual data points to illustrate the variability around the mean score, are presented in Figure 5. Across the 6-week HIT intervention, the regression slope revealed most likely trivial changes in mean heart rate (0.1 percentage points; 90% CLs ±0.2 percentage points), peak heart rate (0.1 percentage points; ±0.2 percentage points), and RPE (0.1 AU; ±0.1 AU).

Qualitative Findings
Participants who engaged in interviews ranged between 25 and 50 years and attended between 60% and 100% of the exercise sessions. Data revealed that participants found the exergame to be challenging but rewarding, as illustrated in the following quotes:
At the start you think this is too hard but as you go on and it does get easier, I say it gets easier but you extend the time don't you so you '
Participants found that the competitive aspects resulted in a changing game scenario between players and rounds for each training session, which facilitated motivation to attend and engage, as illustrated in the following quotes:
There was always something else that kept you interested, kept you wanting to do more. If you go for a run, its more like "I'm just running," that's it.
The variance in the allocated workload was a motivating factor but highlighted that repeated 30-s bursts may have discouraged one participant from attending exercise sessions, as illustrated in the following quotes:

Principal Findings
This study developed a novel HIT intervention using exergaming. Obtaining information about whether the game was used as intended by the target population is a fundamental stage in the development cycle. We evaluated its use in men recruited from regions of socioeconomic deprivation, a population predisposed to low life expectancy. The training data are indicative of a consistent and progressive dose of HIT (>85% of maximal heart rate) over the duration of the 6-week intervention. Given that our protocol for measuring training intensity is conservative [38], we are confident that the game was used by the participants as intended. Furthermore, we also found clear beneficial effects on predicted VO2max, and these findings together demonstrate the potential usefulness of exergaming for the future delivery of HIT.
Despite the respective popularities of both HIT and video gaming, there have been no previous attempts to combine these concepts in the form of an exercise intervention. As such, there are no studies against which to directly compare our training data, although the intensity levels are clearly greater than those measured for commercially available exergames [18]. In fact, our training data are more comparable to those from a recent high-intensity exercise program delivered via traditional means and described as a high-quality dose of HIT [38]. For example, our repetition peak heart rates of 89% of maximal, when averaged over the duration of the 6-week intervention, were only two percentage points below the levels reported for that program. Although we cannot pinpoint the specific mechanisms underpinning this success, we suggest the feedback and point-scoring made a strong contribution. Specifically, by matching point-scoring to current training intensity and enabling users to self-regulate and maximize their intensity, it was possible to deliver a consistent dose of HIT. Furthermore, given that these levels were based on their own individual maximal levels (ie, relative to maximal heart rate), the participants were also aware that these were attainable (eg, [44]). Accordingly, after the first session during which users became familiar with the game concepts, intensity levels remained consistent throughout the 6-week intervention, indicative of a high-quality dose of HIT.
When our outcome data are compared with those from other HIT programs, however, our improvements in fitness were relatively modest. For example, the effect on predicted VO2max was only half the pooled effect for untrained males of 6.2% [5]. This relatively smaller improvement could be because of a range of factors (eg, age), although given that we analyzed the outcome data on an intention-to-treat (ITT) basis, nonattendance was the most probable cause. Specifically, our attendance of 68% is much lower than is typical for lab-based studies and more similar to other community-based trials involving HIT. For example, our attendance data were similar to those reported for other boxing-themed interventions (79%, [45]), mixed-gender HIT programs (75% for maximal volitional intensity training; [46]), HIT in obese males (57%, [47]), and community-based HIT interventions using adult populations (58%-77%, [48]). Thus, although there were some positive findings related to this HIT intervention (eg, ease of recruitment in the target population and training at the intended dose), some of the usual problems associated with conducting exercise interventions in the community remained. Despite favorable changes in VO2max, there were no clear beneficial effects for body mass, waist circumference, and blood pressure when compared with our control group. We suspect that our ITT analysis and shorter intervention period (<12 weeks) influenced these outcome measures when compared with other HIT studies [35,49].
In summary, we have established that HIT can be delivered via exergaming but that the usual problems of attendance remain. Clearly, finding strategies to improve attendance will be important in future development cycles. One of the appealing features of HIT among our participants was the time-efficient nature of HIT. However, the delivery of our exergaming sessions in many cases required travel to and from the venue on a thrice-weekly basis. This could add hours onto a relatively brief period of HIT, and for some, this was a major disincentive to attend. On a more positive note, however, online gaming is now a major part of the gaming industry, and through internet protocols (IPs) and greater bandwidth, these exergaming sessions could be delivered over the internet. Further work into the acceptability of delivering exercise in this manner may need to be undertaken, but the benefits of doing exercise at work or at home would most probably appeal to some of our participants. Furthermore, by using these technologies effectively, it may be possible to exploit other opportunities to further motivate, such as social media to improve competitiveness and camaraderie [50].

Limitations
There were several limitations to this study that need to be overcome in a larger trial. First, our sample only included white European and South Asian males recruited from regions of socioeconomic deprivation. The acceptability of this intervention across different ethnicities and other socioeconomic groups is therefore not known. Second, we administered a submaximal fitness test despite it being known that such tests are prone to errors [51]; future studies are needed to explore the acceptability of a maximal VO2max test within this population. Third, minigames such as this usually have a short lifespan [52,53], and it is not known how much longer the exergame could maintain interest beyond this 6-week intervention. Fourth, participant blinding was not possible in this study, and we therefore cannot rule out cluster contamination and participant allocation resentment. Fifth, we did not perform an economic evaluation of this intervention. Whether these benefits, given the costs of treating symptoms associated with lack of exercise, outweigh the costs of developing and delivering the intervention requires formal evaluation in future research.

Conclusions
Exergaming can be configured to deliver a sustained and consistent dose of high-intensity exercise, resulting in improved fitness in a group of males living in regions of socioeconomic deprivation over a 6-week program. Furthermore, class-salient interventions delivered to specific populations have the potential to facilitate adherence and attendance. Exergaming is inherently scalable and has good reach into our population, although future strategies using IPs could improve the real-world effectiveness.