Published on 11.11.14 in Vol 2, No 2 (2014): Jul-Dec
How to Systematically Assess Serious Games Applied to Health Care
The usefulness and effectiveness of specific serious games in the medical domain are often unclear. This is caused by a lack of supporting evidence on the validity of individual games, as well as a lack of publicly available information. Moreover, insufficient understanding of design principles among the individuals and institutions that develop or apply medical serious games compromises the use of such games. This article provides the first consensus-based framework for the assessment of specific medical serious games. The framework provides 62 items in 5 main themes, aimed at assessing a serious game’s rationale, functionality, validity, and data safety. This will allow caregivers and educators to make balanced choices when applying a serious game for healthcare purposes. Furthermore, the framework provides game manufacturers with standards for the development of new, valid serious games.
JMIR Serious Games 2014;2(2):e11
Serious or applied games are digital games intended to improve an individual’s knowledge, skills, or attitudes in the “real” world. Serious games applied to medical or health-related purposes are growing rapidly in number and in types of application. Serious games have been shown to be at least as effective as conventional tests in improving cognitive abilities in the elderly and even more effective than conventional neuropsychological interventions in improving the neuropsychological abilities of alcoholic patients [ ]. Serious game-based interventions have been used to support rehabilitation in disabled patients, showing effectiveness equal to that of conventional training programs [ ]. Games have been applied to promote healthy behavior in children [ ] and to educate patients [ , ]. Serious game-based patient education has also been shown to increase treatment adherence among adolescents with leukemia [ ]. A third application for serious games is training medical personnel [ ]. Serious games have been shown to add to Advanced Life Support training [ ] and to improve understanding of geriatrics principles among medical students compared to conventional training methods [ ]. Patients [ ], students, and professionals [ ] generally view game-based interventions as fun and challenging.
Although results for serious games in terms of effectiveness for such purposes are promising, their implementation as “serious” modalities for prevention, treatment, or training in health care is hindered by a lack of understanding of the underlying concepts among health care professionals, or even by distrust. Before doctors and patients consider using a serious game as a solution for a health care-related problem, it is important that they understand what problem the game addresses and that any claim of effectiveness is indeed trustworthy. Many clinicians currently lack the training to judge a serious game’s safety or effectiveness. Information on individual games is often hard to find in disorganized app stores and websites. Studies on serious games’ validity and effectiveness remain scarce [ , ]. The idea of applying a video game in health care may even be resented by certain clinicians or patients. In addition, threats to data safety fuel distrust towards electronic applications in health care altogether [ ]. Such issues threaten the practical application of serious games throughout health care, and subsequently limit investment in smart solutions that may prove beneficial in the end.
This article discusses the first tool for the systematic assessment of serious games applied to medical use, intended for educators and clinicians. The information collected and organized with this tool will help health care practitioners understand and appraise the risks and benefits of specific serious games in health care in a uniform manner.
To our knowledge, no systematic framework for the assessment of serious games in health care has been described in the literature. Therefore, the Dutch Society for Simulation in Healthcare (DSSH) has developed a consensus-based framework, categorizing important items that assess a serious game’s safety and validity. Eight individuals (see Acknowledgements section for details) from six different institutions, all experienced in designing, applying, or researching serious games for health care-related purposes, participated. The reporting standards for non-game mobile health apps for medical purposes (mHealth), published by Lewis [ ] and Albrecht [ ], were used as a basis. This system is applied by the peer-reviewed mHealth app assessment initiative of the Journal of Medical Internet Research [ ]. Due to inherent differences in functionality between games and purely informational mHealth applications, these standards required re-evaluation.
The panel reviewed the items from these reporting standards during two meetings. All items in the Albrecht framework were systematically evaluated. For each of the 5 categories, items irrelevant to serious games were removed and, where necessary, extra items were added. During the second panel meeting, the framework was re-evaluated and all members approved the final version.
The framework provides 62 items in 5 main themes (see the table below), aimed at assessing a serious game’s rationale, functionality, validity, and data safety. It specifically does not aim to assess a game’s effectiveness in terms of success or attractiveness to users. The panel defined serious games (as distinct from regular medical applications) as digital applications that instigate a specific behavioral change in their users, in the form of skills, knowledge, or attitudes useful in reality [ ]. The framework therefore does not apply to (mobile or Web-based) digital health apps with a purely informational purpose, for which the mHealth app assessment framework was designed [ ].
|Game description||Meta-data||Operating system||Operating systems of the game|
|Game description||Meta-data||Project type||Commercial, non-commercial, other|
|Game description||Meta-data||Access||Public / restricted / other|
|Game description||Meta-data||Adjunct devices||Is an adjunct device needed?|
|Game description||Development||Funding||How was development funded? Eg, funding agencies, investors|
|Game description||Sponsoring / Advertising||Advertisement policy||Is the game free of commercial pop-ups?|
|Game description||Sponsoring / Advertising||Advertisement policy||If not, what is advertised?|
|Game description||Sponsoring / Advertising||Sources of income||Are there sources of income within the game?|
|Game description||Sponsoring / Advertising||Sources of income outside game||What are the sources of income of the owner/distributor?|
|Game description||Potential conflicts of interest||Affiliations||What affiliations do the publishers have that could influence content or user group?|
|Game description||Potential conflicts of interest||Conflicts of interest||What interests do the publishers have that could influence the game’s content or user group?|
|Game description||Potential conflicts of interest||Disclosure||Are conflicts of interest disclosed?|
|Rationale||Purpose||Goal or purpose||What is (are) the purpose(s) of the game?|
|Rationale||Purpose||Disclosure||Is (are) the purpose(s) disclosed to users?|
|Rationale||Medical device||Medical device||Is the serious game a medical device, or not?|
|Rationale||Medical device||Class||If yes, which class?|
|Rationale||Medical device||Approval by legal bodies||If yes, does it comply with the necessary requirements (FDA approval, CE mark)?|
|Rationale||User group||Specific user groups||For each user group: disease/condition, or health care profession.|
|Rationale||User group||Description||Please specify gender, age (range), and other relevant descriptive items.|
|Rationale||User group||Limits||Are there age limits, or other limits?|
|Rationale||User group||Disclosure||Is the intended user group disclosed?|
|Rationale||Setting||Patient care||Is the game used in patient care?|
|Rationale||Setting||Training courses||Is the game used in training courses or curricula?|
|Rationale||Setting||SCORM compliancy||If used in training courses or curricula, is the serious game SCORM-compliant?|
|Functionality||Purposes / didactic features|| ||For every purpose of the game:|
|Functionality||Purposes / didactic features||Learning or behavioral goals||What content will the player learn?|
|Functionality||Purposes / didactic features||Relation learning and gameplay||How does the learning content relate to the gameplay?|
|Functionality||Purposes / didactic features||Instruction||What intervention leads to the learning transition (eg, tutorial, in-game instructions)?|
|Functionality||Purposes / didactic features||Assessment (progress) in game||Through which parameters is progress in the game measured?|
|Functionality||Purposes / didactic features||Assessment parameters||Which parameters are, in the designers’ opinion, indicative of learning effects?|
|Functionality||Content Management||Content Management system||Is the Content Management System restricted to specified persons or institutions?|
|Functionality||Content Management||User uploaded content||If no, are users allowed to upload their own content?|
|Functionality||Content Management||Content monitoring||How is uploaded content checked?|
|Functionality||Content Management||Restrictions and limits of the serious game||Please describe restrictions and limits of the serious game. What content on the learning goals is not covered?|
|Functionality||Potentially undesirable effects||Potentially undesirable effects||What potential undesirable effects could the game have?|
|Functionality||Potentially undesirable effects||Disclosure||Are such potential undesirable effects disclosed to the user?|
|Functionality||Potentially undesirable effects||Measures taken||What measures are taken to prevent potential undesirable effects?|
|Validity||Design process||Medical expert complicity||Were medical experts (content experts) involved in the design process from the start?|
|Validity||Design process||User group complicity||Were representatives from the user group involved in the design process from the start?|
|Validity||Design process||Educationalist complicity||Were educationalists involved in the design process from the start?|
|Validity||User testing||User testing||Did user testing take place? What were the results, and how were these incorporated in the design?|
|Validity||Stability||Platform stability||Does the game produce the same results on different platforms?|
|Validity||Validity (effectiveness)||Face validity||Do educators and trainees view it as a valid way of instruction?|
|Validity||Validity (effectiveness)||Content validity||How is its content validated to be complete, correct, and nothing but the intended medical construct?|
|Validity||Validity (effectiveness)||Construct validity||Is the game able to measure differences in skills it intends to measure?|
|Validity||Validity (effectiveness)||Concurrent validity||How does learning outcome compare to other methods assessing the same medical construct?|
|Validity||Validity (effectiveness)||Predictive validity||Does playing the game predict skills improvement in real life?|
|Data protection||Data protection and privacy||Data processing||How is data collected in the serious game?|
|Data protection||Data protection and privacy||Patient privacy||Are patient-specific data stored in the game?|
|Data protection||Data protection and privacy||Patient privacy||If yes, are patient informed consent criteria met according to relevant national standards?|
|Data protection||Data protection and privacy||Data ownership||Who owns and stores the data resulting from play?|
|Data protection||Data protection and privacy||Data storage period||During what period are data stored?|
|Data protection||Data protection and privacy||Data removal||Can the user delete data temporarily and/or permanently?|
|Data protection||Data protection and privacy||Data storage security||Is the data storage secured in conformity with laws of the countries stated above?|
|Data protection||Data protection and privacy||Data transmission security||Is the data transmission secured in conformity with laws of the countries stated above?|
|Data protection||Data protection and privacy||Disclosure||Are all items on “data protection” disclosed to the user?|
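For readers who want to record an assessment electronically, the theme, category, item, and question structure of the table could be captured in a small data structure. The following is a minimal sketch under stated assumptions, not part of the published framework: the class name `Item`, the helper `unanswered_by_theme`, and the example answers are hypothetical, and only four of the 62 items are shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Item:
    """One row of the assessment table (hypothetical representation)."""
    theme: str
    category: str
    name: str
    question: str
    answer: Optional[str] = None  # None means the developer has not answered yet


def unanswered_by_theme(items: List[Item]) -> Dict[str, List[str]]:
    """Return the names of items without an answer, grouped by theme."""
    missing: Dict[str, List[str]] = {}
    for item in items:
        if item.answer is None or not item.answer.strip():
            missing.setdefault(item.theme, []).append(item.name)
    return missing


# Excerpt of the framework; the answers are invented example data.
checklist = [
    Item("Game description", "Meta-data", "Operating system",
         "Operating systems of the game", "iOS, Android"),
    Item("Rationale", "Purpose", "Goal or purpose",
         "What is (are) the purpose(s) of the game?",
         "Train clinical decision-making"),
    Item("Validity", "Validity (effectiveness)", "Predictive validity",
         "Does playing the game predict skills improvement in real life?"),
    Item("Data protection", "Data protection and privacy", "Data ownership",
         "Who owns and stores the data resulting from play?"),
]

gaps = unanswered_by_theme(checklist)
for theme, names in gaps.items():
    print(f"{theme}: {', '.join(names)}")
```

Listing the open items per theme in this way gives a reviewer a quick view of where a developer has not yet provided the information the framework asks for.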
Assessing Medical Serious Games
A specific serious game under evaluation should be thoroughly described and registered, including the version and the manufacturer or owner to whom the game should be attributed. As with mobile applications, particular attention is paid to the owner’s policy concerning revenues from sponsoring and advertisements, both during development and during use. Sources of revenue and affiliations (eg, pharmaceutical industry) may bias a serious game’s content or threaten its validity. These should be fully disclosed to the game’s users. Sources of income within a game can be as relevant as the cost of the initial purchase.
The rationale clarifies the game’s purpose outside the game. This external purpose (eg, improving eye-hand coordination in laparoscopic surgery) may differ from the actual goal in the game (eg, completing a quest in an underground world or playing a tennis game [ ]). This clearly differs from the Albrecht framework, because most mHealth apps have a single obvious purpose (internal goal = external goal). As with mHealth apps, a game’s purpose relates to the intended user group and the setting in which it is used.
Additionally, serious games might fall within the scope of medical device regulation, in which case they must meet the requirements set by the US Food and Drug Administration (FDA), the European Union (CE marking, Conformité Européenne), or national equivalents. This specifically applies to games with a distinct diagnostic or therapeutic purpose. Moreover, integration of serious games into electronic learning environments may impose certain technical requirements. The industry has set standards to improve the interoperability of e-learning content (the Sharable Content Object Reference Model; SCORM). Implementing SCORM will facilitate the integration of educational serious games into learning management software.
The functionality of a serious game clearly differs from that of an mHealth app. mHealth apps usually contain “dry” content (eg, medical information) or an obvious functionality (eg, communicating or registering information), whereas a game requires the user to operate or interact with the content, with the ultimate goal of changing one’s behavior in real life (ie, learning). To understand this process, information is required on the game’s content, how the instruction is delivered, how performance is assessed, and how these aspects are integrated in the gameplay [ , ].
Consequently, it is important to register information on the game’s content management. For instance, users may be able to add content themselves, making content validation an important issue. This directly influences the validity of the game’s content.
Finally, undesired results or negative transfer of learning could occur in the interaction with a serious game. This is not the same concept as “gaming the game” (ie, cheating), an effect that may very well enhance learning. If validation research is not available, the developer should at least establish and disclose a logical connection between gameplay and the behavioral or learning goals.
Validity determines whether an instructional instrument (such as a serious game) adequately represents the construct it aims to teach or measure. More formally, validity is “the degree to which evidence and theory supports the interpretations of [game] scores entailed by the proposed use of [the game]”. The American Psychological Association has set a series of standards to measure validity [ ]. Whereas many validity types have been described, validity research in medical education usually contains several consecutive phases [ , ]. First, experts should scrutinize the game’s content to determine its legitimacy (content validity). Second, experts and novices judge the instrument’s apparent similarity to the construct it attempts to represent (face validity). Construct validity reflects the ability of the instrument to actually measure what it intends to measure (ie, the difference in performance between groups of users with different levels of experience in reality). Concurrent validity reflects the correlation between users’ performance in the serious game and their performance on an instrument believed to measure the same construct (eg, a simulator or course). The ultimate goal is to prove a game’s predictive validity: does performance in the game lead to better outcomes in reality? Most validation research currently published in the medical domain uses these concepts [ ]. For individual cases, the relevance of specific validity types may differ. When considering mHealth apps in general, content validity may be the sole source of validity.
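Two of these concepts lend themselves to a simple numerical illustration. The sketch below is only an assumption about how such an analysis might start, not the method of any cited study: concurrent validity is illustrated with a Pearson correlation between game scores and scores on a reference instrument, and construct validity with the difference in mean game score between experienced and inexperienced users. All scores are invented, the function names are hypothetical, and a real validation study would add appropriate significance testing and sample size considerations.

```python
from math import sqrt
from statistics import mean
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two paired score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two paired lists with at least two scores each")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must not be constant")
    return cov / sqrt(var_x * var_y)


def mean_difference(experts: Sequence[float], novices: Sequence[float]) -> float:
    """Difference in mean game score between experts and novices."""
    return mean(experts) - mean(novices)


# Invented example data: the same five trainees scored in the game and on a simulator.
game_scores = [55, 60, 70, 80, 90]
simulator_scores = [50, 65, 68, 85, 88]
concurrent = pearson_r(game_scores, simulator_scores)

# Invented example data: game scores of experienced and inexperienced users.
expert_scores = [80, 85, 90]
novice_scores = [50, 60, 70]
construct_gap = mean_difference(expert_scores, novice_scores)

print(f"Correlation with reference instrument: {concurrent:.2f}")
print(f"Expert minus novice mean score: {construct_gap}")
```

A high correlation would be consistent with concurrent validity, and a clear gap between experts and novices with construct validity; neither number by itself demonstrates predictive validity, which requires outcomes measured in reality.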
Validity research is frequently a long and costly enterprise. Many newly developed serious games have therefore not yet undergone validity research. For this reason, the framework defines a number of steps to pre-assess a serious game’s potential as a valid instrument, based on its design and initial testing phases. This encompasses the involvement of user groups, content experts, or educationalists in the design (if relevant to the game’s purpose). Furthermore, a game that has undergone user testing and stability testing is more likely to have higher face and content validity.
Threats to user privacy are inherent in electronic and mobile health apps, especially when patient-specific data are measured or entered in the game. This concerns data “at rest” on devices or servers, as well as data “in transit”. It must be clear whether data are collected by the game, who owns the data, and whether users can request removal of their data. Storage and analysis of personal data should be disclosed to users and must conform with the laws of the countries in which the serious game is distributed. Special care must be taken if patient information is collected. These items are in general conformity with the requirements for mHealth apps described in Albrecht’s framework [ ].
When using serious games in health care, end users (clinicians, patients, or educators) must decide whether games are safe and effective enough to be used for their intended purposes. In order to do so, they need consistent, transparent, and reliable assessments. Are applied games really staking their claim in this field? The framework described in this article supports both developers and end users in assessing the relevance, validity, and data safety of an applied game. For a product to become a “qualified game”, developers should disclose comprehensive information on their products and claims. They must provide transparency to meet the standards. The Journal of Medical Internet Research and the Dutch Society for Simulation in Healthcare have launched an international peer-reviewing initiative for serious games in health care.
The safe application of technology-enhanced solutions remains the responsibility of the health care provider. Whether a serious game meets the user’s needs can be judged from information on the 5 main areas described in this article. The majority of the items cannot be assessed using objective parameters. For instance, a claim that a specific serious game has predictive validity should be supported by solid evidence. A comprehensive evaluation by a panel of experts, in the form of a quality label, could form a more practical solution.
Reporting standards have recently been published to support clinicians and patients in distinguishing high-quality mHealth apps [- ] and medical websites [ ]. These standards form the basis for the framework described in this article. However, they have two important shortcomings when it comes to games. First, explicit information on a serious game’s content and didactic features is required, as the external purpose of a serious game is frequently less obvious to the user than in the case of mHealth apps. Second, serious games require additional validation steps (eg, construct and predictive validity) compared to non-interactive information platforms. Gameplay is dynamic, and learning goals in gameplay are often not disclosed to the user. In fact, the user learns by playing the game, and discovery in itself may be part of the gameplay. Disclosing learning goals would thus be counterproductive.
There are several limitations to the framework described in this study. It considers the validity of the serious game’s content and its didactic functionality. Validity predicts neither a game’s success nor its attractiveness to the user, which also depend on its entertainment value and distribution method. The framework does not aim to establish which game is most fun, but merely which game is most valid. A second consideration is that, in the scientific field of validity research in medicine, validity concepts other than those used in this framework have been proposed [ ]. The “classical” validity concepts (content, face, construct, concurrent, and predictive validity) have been used most frequently in validity research in medicine and were therefore the most logical to include in the framework presented in this article [ , ].
In summary, this consensus-based tool provides end users with the support required when assessing the effectiveness and relevance of serious games in health care. FDA approval or a CE mark alone is insufficient for this purpose. In order to prevent wrongful application and theft of data from unsuspecting patients or medical students, this information on medical serious games should become publicly available to all end users. This will aid the prescription of safe and effective games to patients and the implementation of games into educational programs.
The authors received funding from the Patient safety project (grant ref PID 101060), of the Pieken in de Delta-program by the Dutch Ministry of Economic Affairs, Agriculture and Innovation, the city of Utrecht and the province of Utrecht, the Netherlands. The funding agency had no role in the preparation of the consensus tool, nor the preparation of the manuscript.
The Committee for Serious Gaming of the Dutch Society for Simulation in Healthcare (DSSH) consists of the following members: M Graafland (Department of Surgery, Academic Medical Centre, Amsterdam, The Netherlands); MEW Dankbaar (Department of Medical Education (Desiderius School), Erasmus University Medical Centre, Rotterdam, The Netherlands); A Mert (National Military Rehabilitation Centre, Doorn, The Netherlands); J Lagro (Department of Geriatrics, Radboud University Medical Centre, Nijmegen, The Netherlands); LD de Wit-Zuurendonk (Department of Obstetrics and Gynaecology, Máxima Medical Centre, Veldhoven, The Netherlands); SCE Schuit (Departments of Emergency Medicine and Internal Medicine, Erasmus University Medical Centre, Rotterdam, The Netherlands); A Schaafstal (Department of ICT, Windesheim University of Applied Sciences, Zwolle, The Netherlands); and MP Schijven (Department of Surgery, Academic Medical Centre, Amsterdam, The Netherlands).
Conflicts of Interest
- Kueider AM, Parisi JM, Gross AL, Rebok GW. Computerized cognitive training with older adults: a systematic review. PLoS One 2012;7(7):e40588.
- Gamito P, Oliveira J, Lopes P, Brito R, Morais D, Silva D, et al. Executive functioning in alcoholics following an mHealth cognitive stimulation program: randomized controlled trial. J Med Internet Res 2014;16(4):e102.
- Prange GB, Kottink AI, Buurke JH, Eckhardt MM, van Keulen-Rouweler BJ, Ribbers GM, et al. The Effect of Arm Support Combined With Rehabilitation Games on Upper-Extremity Function in Subacute Stroke: A Randomized Controlled Trial. Neurorehabil Neural Repair 2014 May 29.
- Majumdar D, Koch PA, Lee H, Contento IR, Islas-Ramos AD, Fu D. "Creature-101": A Serious Game to Promote Energy Balance-Related Behaviors Among Middle School Adolescents. Games Health J 2013 Oct;2(5):280-290.
- Cooper H, Cooper J, Milton B. Technology-based approaches to patient education for young people living with diabetes: a systematic literature review. Pediatr Diabetes 2009 Nov;10(7):474-483.
- Read JL, Shortell SM. Interactive games to promote behavior change in prevention and treatment. JAMA 2011 Apr 27;305(16):1704-1705.
- Kato PM, Cole SW, Bradlyn AS, Pollock BH. A video game improves behavioral outcomes in adolescents and young adults with cancer: a randomized trial. Pediatrics 2008 Aug;122(2):e305-e317.
- Graafland M, Schraagen JM, Schijven MP. Systematic review of serious games for medical education and surgical skills training. Br J Surg 2012 Oct;99(10):1322-1330.
- Buttussi F, Pellis T, Cabas Vidani A, Pausler D, Carchietti E, Chittaro L. Evaluation of a 3D serious game for advanced life support retraining. Int J Med Inform 2013 Sep;82(9):798-809.
- Lagro J, van de Pol MH, Laan A, Huijbregts-Verheyden FJ, Fluit LC, Olde Rikkert MG. A Randomized Controlled Trial on Teaching Geriatric Medical Decision Making and Cost Consciousness With the Serious Game GeriatriX. J Am Med Dir Assoc 2014 Jun 6.
- Forsberg A, Nilsagård Y, Boström K. Perceptions of using videogames in rehabilitation: a dual perspective of people with multiple sclerosis and physiotherapists. Disabil Rehabil 2014 May 16:1-7.
- Graafland M, Vollebergh MF, Lagarde SM, van Haperen M, Bemelman WA, Schijven MP. A Serious Game Can Be a Valid Method to Train Clinical Decision-Making in Surgery. World J Surg 2014 Aug 27.
- van Velsen L, Beaujean DJ, van Gemert-Pijnen JE. Why mobile health app overload drives us crazy, and how to restore the sanity. BMC Med Inform Decis Mak 2013;13:23.
- Connolly TM, Boyle EA, MacArthur E, Hainey T, Boyle JM. A systematic literature review of empirical evidence on computer games and serious games. Computers & Education 2012 Sep;59(2):661-686.
- Kotz D. A threat taxonomy for mHealth privacy. In: Crowcroft J, Manjunath D, Misra A, editors. Third International Conference on Communication Systems and Networks (COMSNETS 2011). IEEE; 2011. Presented at: Third International Conference on Communication Systems and Networks; 2011; New York, NY. p. 1.
- Dutch Society for Simulation in Healthcare. DSSH official website. 2014.
- Lewis TL. A systematic self-certification model for mobile medical apps. J Med Internet Res 2013;15(4):e89.
- Albrecht UV. Transparency of health-apps for trust and decision making. J Med Internet Res 2013;15(12):e277.
- Journal of Medical Internet Research (JMIR). Apps Peer-Review launched. 2014. URL: http://mhealth.jmir.org/announcement/view/67 [accessed 2014-11-04]
- Michael DR, Chen S. Serious Games: Games That Educate, Train, and Inform. Boston, MA: Course Technology PTR; 2006.
- Jalink MB, Goris J, Heineman E, Pierie JP, ten Cate Hoedemaker HO. Construct and concurrent validity of a Nintendo Wii video game made for training basic laparoscopic skills. Surg Endosc 2014 Feb;28(2):537-542.
- Badurdeen S, Abdul-Samad O, Story G, Wilson C, Down S, Harris A. Nintendo Wii video-gaming ability predicts laparoscopic skill. Surg Endosc 2010 Aug;24(8):1824-1828.
- SCORM. SCORM Explained. URL: http://scorm.com/scorm-explained/ [accessed 2014-11-04]
- Anderson LW, Krathwohl DR, Bloom B. A taxonomy for learning, teaching, and assessing: a revision of Bloom's taxonomy of educational objectives. New York, NY: Longman; 2001.
- Van Staalduinen JP, de Freitas S. A Game-Based Learning Framework: Linking Game Design and Learning Outcomes. In: Khine MS, editor. Learning to Play. New York, NY: Peter Lang Publishing, Inc; 2011:29.
- American Educational Research Association, American Psychological Association, National Council on Measurement in Education. Standards for educational and psychological testing. Washington, DC: American Educational Research Association; 1999.
- Schijven MP, Jakimowicz JJ. Validation of virtual reality simulators: Key to the successful integration of a novel teaching technology into minimal access surgery. Minim Invasive Ther Allied Technol 2005;14(4):244-246.
- Gallagher AG, Ritter EM, Satava RM. Fundamental principles of validation, and reliability: rigorous science for the assessment of surgical education and training. Surg Endosc 2003 Oct;17(10):1525-1529.
- Cook DA, Zendejas B, Hamstra SJ, Hatala R, Brydges R. What counts as validity evidence? Examples and prevalence in a systematic review of simulation-based assessment. Adv Health Sci Educ Theory Pract 2014 May;19(2):233-250.
- Health on the Net Foundation Code of Conduct (HONcode). Operational definition of the HONcode principles. URL: http://www.hon.ch/HONcode/Webmasters/Guidelines/guidelines.html [accessed 2014-11-04]
- Bellotti F, Kapralos B, Lee K, Moreno-Ger P, Berta R. Assessment in and of Serious Games: An Overview. Advances in Human-Computer Interaction 2013;2013:1-11.
- Cook DA, Beckman TJ. Current concepts in validity and reliability for psychometric instruments: theory and application. Am J Med 2006 Feb;119(2):166.e7-166.16.
Edited by G Eysenbach; submitted 31.08.14; peer-reviewed by I Paraskevopoulos; comments to author 10.09.14; revised version received 15.10.14; accepted 17.10.14; published 11.11.14
©Maurits Graafland, Mary Dankbaar, Agali Mert, Joep Lagro, Laura De Wit-Zuurendonk, Stephanie Schuit, Alma Schaafstal, Marlies Schijven. Originally published in JMIR Serious Games (http://games.jmir.org), 11.11.2014.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Serious Games, is properly cited. The complete bibliographic information, a link to the original publication on http://games.jmir.org, as well as this copyright and license information must be included.