Published in Vol 7, No 3 (2019): Jul-Sep

Impact of Using a 3D Visual Metaphor Serious Game to Teach History-Taking Content to Medical Students: Longitudinal Mixed Methods Pilot Study


Original Paper

1South Auckland Clinical Campus, Faculty of Medical and Health Sciences, The University of Auckland, Auckland, New Zealand

2College of Medicine, Taif University, Taif, Saudi Arabia

3Auckland City Hospital, Auckland, New Zealand

4Middlemore Hospital, Auckland, New Zealand

Corresponding Author:

Hussain Alyami, MBChB, FRANZCP, PGDCBT

South Auckland Clinical Campus

Faculty of Medical and Health Sciences

The University of Auckland

Middlemore Hospital

Private Bag 93311

Auckland, 1640

New Zealand

Phone: 00966 541118828


Background: History taking is a key component of clinical practice; however, this skill is often poorly performed by students and doctors.

Objective: The study aimed to determine whether Metaphoria, a 3D serious game (SG), is superior to another electronic medium (PDF text file) in learning the history-taking content of a single organ system (cardiac).

Methods: In 2015, a longitudinal mixed methods (quantitative and qualitative) pilot study was conducted over multiple sampling time points (10 weeks) with a group of undergraduate medical students at The University of Auckland Medical School, New Zealand. Assessors involved in the study were blinded to group allocation. From an initial sample of 83, a total of 46 medical students were recruited. Participants were assigned to either a PDF group (n=19) or a game group (n=27). One participant left the PDF group after allocation was revealed and was excluded. A total of 24 students in the game group and 14 students in the PDF group completed follow-up 7 weeks later. Using an iterative design process over more than a year, with input from a variety of clinical disciplines, a cardiac history-taking game and a PDF file were designed, both informed by Cognitive Load Theory. Each group completed its intervention in 40 min. Three levels of the Kirkpatrick training evaluation model were examined using validated questionnaires: affective (perception and satisfaction), cognitive (knowledge gains and cognitive load), and behavioral attitudes (objective structured clinical examination), along with a qualitative assessment. A priori hypotheses were formulated before data collection.

Results: Compared with baseline, both groups showed significant improvement in knowledge and self-efficacy longitudinally (P<.001). Apart from the game group reporting significantly higher satisfaction (P<.001), there were no significant differences between the groups in knowledge gain, self-efficacy, cognitive load, ease of use, acceptability, or objective structured clinical examination scores. However, qualitative findings indicated that the game was more engaging and enjoyable than the PDF file and served as a visual aid.

Conclusions: Students favored learning cardiac history taking through an SG. This finding may be relevant to other areas of medicine and highlights the importance of innovative methods for teaching the next generation of medical students.

JMIR Serious Games 2019;7(3):e13748


Introduction

Since the time of Hippocrates, clinical history taking has remained a cornerstone of medicine [1]. Although history taking is the most common medical procedure performed by doctors [2], research has shown a deficit in this essential skill among medical students [3-6], interns, and, more worryingly, general practitioners [7]. History taking is more valuable than physical examination in reaching a diagnosis in around 80% of medical outpatient referrals [8,9]. When adequately performed, history taking is associated with improved patient physiological and psychological health outcomes [10], satisfaction [11,12], and compliance [13,14]. Although the literature shows that this essential clinical skill can be learned, it is inadequately taught [3,5,15].

A fine balance between the history-taking components, that is, history-taking content (HTC) and history-taking process (HTP), needs to be achieved for optimal effectiveness. HTC is commonly referred to as data or information gathering, and it is concerned with the what part of the medical interview, eliciting specific information about the patient’s symptomatology from the presenting complaint through to social and occupational histories [1]. On the other hand, HTP, commonly known as communication skills, is the method by which this content is elicited; thus, it is more concerned with the how of the medical interview [1]. There is evidence to suggest that medical students are distracted when trying to remember HTC, which impairs their communication skills [16,17]. Despite the importance of teaching HTC, this remains an understudied area, with a recent systematic review showing only 6 studies of educational interventions that targeted this essential part of the medical interview; this review also found that technologically enhanced interventions improved teaching of HTC [18].


The literature shows that medical students use multiple learning styles, including visual, auditory, and kinesthetic [19]; therefore, we propose that a visual metaphor system could prove to be of significant value to enhance the teaching of HTC. A visual metaphor is defined as “a graphic structure that uses the shape and elements of a familiar natural or man-made artifact or of an easily recognizable activity or story, to organize content meaningfully and use the associations with the metaphor to convey additional meaning about the content” [20:233]. The use of visual metaphors in medicine dates back to antiquity, when Aristotle explained how blood is contained in the heart and associated vessels and compared this with a “vase” [21]. Currently, visual metaphors continue to be used successfully to teach some of the more complicated and abstract scientific principles, for example, Emil Fischer’s Lock-and-Key model of enzyme-substrate interaction in physiology [22]. Until recently, metaphor was considered a decorative component of speech; it is now considered “to pervade all forms of knowledge” [23:86]. A number of benefits have been reported for using a visual metaphor system as a learning and knowledge sharing tool, including improved audience engagement, attention, memory, and comprehension [24]. Visual metaphors can provide a means of transferring a complex concept by using simple visual symbols that facilitate the connection between thoughts and feelings [25]. Visual metaphors, when created and presented well, enable high levels of content comprehension, retention, and recall, and they are easier to construct and interpret, compared with mind maps [20]. They also assist learners to incorporate newly learned material with previous knowledge [26,27]. Serious games (SGs) offer a promising modality of delivering visual metaphors, as they have been shown to provide learners with an “anchor” for knowledge [28]. 
Fabricatore described how metaphors could be embedded in SGs to enhance learning, especially if they are designed to be at the heart of gameplay rather than serving as a mere decorative component [29]. There are potential benefits of SGs that utilize visual metaphorical elements in medical education [30]; however, the application and assessment of metaphor use in SG design are limited [31]. SGs are defined as games “that engage the user and contribute to the achievement of a defined purpose other than pure entertainment” [30:5]. Although adopting visual metaphors in SG design is in its infancy, the SG literature suggests they can improve engagement and facilitate learning of the intended teaching points [28,32,33]. However, SGs are considered complex, and they need to be designed taking into consideration the limits of human cognitive capacity [34]. Cognitive Load Theory (CLT) offers a wide range of instructional design principles [35]. CLT has been used in the design of SGs to manage the limited cognitive capacity of the learner [36]; it highlights the limited capacity of working memory (WM), which can process 3 to 7 items at any given time, compared with the potentially unlimited capacity of long-term memory [37]. CLT aims to reduce the cognitive load imposed by instructional design (extraneous load) on WM to improve learning (germane load). CLT design principles mostly target the extraneous cognitive load imposed by poorly designed material [38,39]. These principles include using worked examples and minimizing redundant information and split attention [38]. The aim of this study was to evaluate the impact of an SG on learning HTC in medical students during their first clinical exposure. To our knowledge, this is the first visual metaphor–enhanced SG implemented for teaching HTC to medical students.
To evaluate this SG’s impact, Kirkpatrick’s training evaluation model was adopted, as it is the most widely used method for evaluating training programs [40] and is commonly used to assess medical educational interventions [41-43]. Kirkpatrick’s model was modified by Freeth et al [43] and adopted by the Best Evidence Medical Education Collaboration to assess medical teaching interventions [44]. The domains included are the following: affective (cognitive load, material difficulty level, and perceptions and satisfaction), cognitive (knowledge gains), and behavioral attitudes (objective structured clinical examination). The fourth level of Kirkpatrick’s model (outcomes related to the impact of interventions on patient care) is not commonly measured in educational settings [45] and is beyond the scope of this study.

Methods

This pilot study used a mixed methods approach by combining a repeated-measures nonrandomized quasi-experimental design with a qualitative component. The study was approved by University of Auckland Human Participants Ethics Committee, reference number 015567. Med Metaphoria game design and development were funded by The New Zealand Health Innovation Hub, in collaboration with Counties Manukau District Health Board, who provided the first author with a clinical fellowship in Health Systems Innovation and Improvement.


The University of Auckland has a 6-year undergraduate medical program, predominantly comprising basic sciences (Years 1-3) and then clinical practice (Years 4-6). Our sample was a preclinical population of year-3 medical students at the South Auckland Clinical Campus of the University of Auckland, who were attending a 10-week clinical methods module at the end of 2015. This is a foundation course for teaching clinical skills. All 83 students on the rotation were invited to participate in this study. Recruitment strategies involved both verbal and poster invitations. Students were preallocated, at the medical school administration level, to 1 of 2 teaching days (Wednesday or Thursday), independent of the investigators.


To minimize potential contamination, the control group (PDF file group) was allocated to Wednesday, followed by the intervention group (Visual Metaphor Game; VMG group) on Thursday, in keeping with medical education recommendations that suggest restricting access to the intended educational intervention [46]. Both groups had the same face-to-face cardiac history teaching (a whole morning session) delivered on the day of the interventions, followed by a one-off 40-min session using either the PDF file or the VMG. Both groups used iPads to access the content; the same content was covered, but the instructional design differed: the PDF file was textual, whereas the game used visual metaphors. A visual metaphor–enhanced game (Med Metaphoria) was developed as a 3D SG containing elements of HTC (Figure 1 and Multimedia Appendix 1). The game was codesigned by the lead author in collaboration with clinicians, medical educationalists, code developers, and students. We followed a participatory iterative agile game development process [47], using the “I’s” development framework proposed by Annetta et al for the design of SGs [48] (Figure 2). This framework comprises the following elements: Identity, Immersion, Interactivity, Increasing complexity, Informed teaching, and Instructional. Game development (versions 1 and 2) incorporated student and clinician feedback using a think-aloud protocol, originally developed by Ericsson and Simon [47,49] and later adopted in the SG design literature [50], until the final iteration (version 3) was developed (Figure 3). Although there are no significant differences between 2D and 3D educational games in terms of learning gains [51], 3D design was chosen as the main game design modality, despite its higher cost, for the following reasons.
First, a key teaching point of the history-taking visual metaphor is to avoid “tunnel vision” history taking, which is akin to looking at a pyramid in 2D: focusing on 1 causative side or system while ignoring the pyramid’s other sides and systems. The 3D depth of the visual metaphor was therefore essential to convey this point; this device is commonly used in the game literature and called a “spatial metaphor” [52]. Second, the literature has shown that health sciences students prefer 3D over 2D games [53]. Finally, the game literature has found that 3D games facilitate a more enhanced sense of presence than 2D games [53]. Overall, 2D graphics were still used where there was no spatial need for metaphors.

Figure 1. In-game snapshot showing the symptoms and signs relevant to the cardiovascular system.
Figure 2. Visual metaphor and game design process in each game version.
Figure 3. Iterative visual metaphor design.

The main characters in the game were the medical students in the submarine (the rescuers; students had the option of choosing their own personalized name) and John the Archaeologist (the patient). The game was a point-and-click adventure game set in a 3D fantasy world, where John was inadvertently locked in a pyramid after wrongly deciphering a set of visual metaphors. The game rules required deciphering the visual metaphors correctly by associating them with the correct multiple-choice answer to rescue John, who is complaining of chest pain. The quicker this was done, the higher the score and rank achieved. Each section of the history taking was a separate part of the game journey. Audiovisual and textual feedback was provided, in addition to static and animated game narrative and cut scenes. Students had to decipher the visual metaphors by choosing the right answer out of 3 or 4 multiple-choice options per question. Although the question order was constant, answer positions were randomly ordered for each question. The game had 3 difficulty levels: the beginner level allowed 30 seconds per question and displayed the full textual question; the intermediate level allowed 20 seconds, with the question only partially visible; and the advanced level allowed 10 seconds and showed only the visual item, with no textual question. The total number of seconds per difficulty level was divided into thirds, and students were ranked into 1 of 3 ranks on the basis of their time to answer each question correctly: Consultant (first third), Resident (second third), or Intern (last third). Students also scored points corresponding to their speed in choosing the right answer (Figure 4).

Figure 4. In-game snapshot showing the visual metaphor in the background, the question, and the multiple-choice options. The score of 29 points indicates that, at the beginner level, the player took 1 second (out of the 30 seconds allocated per question) to decipher the visual metaphor and answer the question correctly.
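The timing, scoring, and ranking rules described above can be sketched in code. This is an illustrative reconstruction, not the game's actual implementation; the function names are ours, and we assume points equal the seconds remaining on the clock, which is consistent with the 29 points awarded for a 1-second answer at the beginner level in Figure 4.

```python
# Hypothetical sketch of the game's speed-based scoring and ranking rules.
# Assumption: points per correct answer = seconds remaining when answering,
# matching Figure 4 (29 points for answering in 1 s of a 30 s limit).

TIME_LIMITS = {"beginner": 30, "intermediate": 20, "advanced": 10}
RANKS = ("Consultant", "Resident", "Intern")  # fastest third to slowest third

def points(level: str, elapsed: int) -> int:
    """Points for a correct answer, given seconds elapsed at a difficulty level."""
    return max(TIME_LIMITS[level] - elapsed, 0)

def rank(level: str, elapsed: int) -> str:
    """Rank based on which third of the allotted time the correct answer fell in."""
    third = TIME_LIMITS[level] / 3
    if elapsed <= third:
        return RANKS[0]
    if elapsed <= 2 * third:
        return RANKS[1]
    return RANKS[2]
```

Under these assumptions, a correct beginner-level answer after 1 second yields 29 points and a Consultant rank.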

Development and Validation of Teaching Interventions and History-Taking Content Measures

Both PDF files and the VMG were developed after consulting with the recommended reading list the students received, which included a clinical skills module booklet and a recommended clinical skills textbook [54]. Content validity and instructional design of both interventions were carried out through iterative consultation with multiple junior and senior doctors, from medical and surgical disciplines, including Internal Medicine, Cardiology, Surgery, and Psychiatry. Face validity was established through iterative discussions with multiple medical students across the medical school program.


For both groups, knowledge about HTC was assessed using a paper-based context-rich open-ended question. This was asked immediately before intervention (baseline), immediately after intervention (postintervention), and then again 7 weeks after intervention (follow-up). These tests were designed to assess baseline HTC knowledge and immediate and delayed recall (see Figure 5 for study design). This open-ended question was deemed more appropriate than using closed multiple-choice questions [55]. It read “what are the cardiac symptoms you would ask about as part of the history of presenting complaint in a patient presenting with chest pain?” The model answer template had a total of 25 symptoms. Answers that included lay or medical terms were accepted if they were consistent with the symptom meaning. Marking was conducted by a senior cardiology registrar, who was blinded to both groups, out of a total score of 25 based on predefined marking criteria (Multimedia Appendix 2).
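The matching of lay or medical terms against a model answer template can be sketched as follows. This is only an illustration of the marking approach: the symptom names and synonym sets below are made-up examples, not the study's actual 25-item template or its predefined marking criteria.

```python
# Illustrative sketch of marking an open-ended answer against a model template.
# Each model symptom earns at most 1 mark; lay synonyms are accepted when they
# are consistent with the symptom meaning. The synonym table is a hypothetical
# fragment, not the study's actual marking template.

MODEL = {
    "dyspnoea": {"dyspnoea", "shortness of breath", "breathlessness"},
    "palpitations": {"palpitations", "racing heart"},
    "syncope": {"syncope", "fainting", "blackout"},
}

def mark(answers):
    """Count distinct model symptoms matched by a student's listed answers."""
    found = set()
    for raw in answers:
        term = raw.strip().lower()
        for symptom, accepted in MODEL.items():
            if term in accepted:
                found.add(symptom)
    return len(found)
```

For instance, listing "shortness of breath", "fainting", and "dyspnoea" would score 2 marks here, because the first and third answers map to the same model symptom.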

Figure 5. Study design and sampling time points.

Pre, post, and follow-up questionnaires were designed to assess the 3 levels of Kirkpatrick’s training evaluation model. Key points about the measures used are summarized in Table 1. The Mental Effort Scale [56] was used to assess students’ perceived history-taking cognitive load, whereas history-taking difficulty was measured using the Difficulty Level Scale [57]. The Mental Effort Scale is a subjective rating scale for cognitive load, with a Cronbach coefficient alpha of .90 (internal reliability) [56]. This scale has been studied widely in the literature and has shown good psychometric properties [37]. It has been shown to be comparable with objective physiological measures, has good validity and reliability [58], and is more sensitive and simpler than objective physiological measures [37]. The adapted 5-point Likert scale is commonly used to assess cognitive load [57]; the question “In taking a history, I invest...” is answered on a 5-point Likert scale. As a one-off administration of the scale has been shown to over- or underestimate scores [59,60], it was administered at 2 time points (baseline and postintervention). The perceived task difficulty level was also measured, as it is an important but distinct construct from cognitive load [61]. According to van Gog and Paas, tasks perceived to be highly difficult might lead to using low mental effort in solving them [61]. This scale was administered with the question “I experienced history-taking as...”

Despite the known criticism of single-item scales [62], such as the cognitive load and perceived task difficulty scales, research has shown these scales to have properties comparable with multi-item scales [63]. Medical students’ self-efficacy has been shown to influence academic performance [65-67]. Bandura’s seminal work on self-efficacy theory highlighted that competency to perform a certain task depends not only on knowledge and skill acquisition but also on how strongly the participant believes in their efficacy in reaching this goal [68]. In the context of SGs, Gee highlighted how they could increase self-efficacy [69]. The General Self-Efficacy (GSE) Scale devised by Schwarzer and Jerusalem [64] was used to measure students’ perceived self-confidence in history taking. GSE refers to the global confidence in one’s coping ability across multiple situations [64]. This scale has been used extensively in research, with an internal consistency of 0.75 to 0.91 and a reliability of 0.67 [70]. However, it is a nondomain-specific scale, which goes against Bandura’s domain-specific conceptualization of self-efficacy [71]. Therefore, the GSE scale was adapted to make it history-taking specific, in line with Bandura’s self-efficacy theory [71]. Given that self-efficacy is dynamic, researchers have advised measuring this construct longitudinally rather than cross-sectionally [72]. Although some researchers have measured self-efficacy at 2 time points, medical educationalists have recommended doing so more than twice [71]. Given that both educational interventions in this study were delivered using a technologically enhanced medium (an iPad), the Technology Acceptance Model (TAM) was used [73].
The TAM’s Ease of Use and Acceptability questionnaire has been previously used to investigate medical students’ [74] and physicians’ intentions to use technology [75,76]. Owing to its robustness, simplicity, and adaptive nature, it has become one of the most widely used models in measuring technology acceptance [77].

SGs that involve instructional design optimization have been shown to improve satisfaction levels [78]. Therefore, a satisfaction questionnaire was designed as it could serve as an indirect academic performance measure [79]. These questions were drawn from other relevant medical education literature evaluating technologically enhanced educational interventions [80,81]. The questionnaire face and content validation process followed an iterative design process similar to game design (Figure 2). Immediately after the intervention, 3 open-ended questions were used to assess students’ perceptions and suggestions of what they learned, enjoyed, and did not enjoy, as well as suggestions for further improvement.

History-taking clinical skills were objectively assessed at 10 weeks postintervention by comparing end-of-year objective structured clinical examination (OSCE) results for the history-taking station. The history-taking station was designed independent of the investigators, and it targeted assessment of abdominal pain; however, the overall history-taking structure and pain questions were similar.

Table 1. Measures used and relevant information.
| Scale name (construct) | Items | 1 is | 5 is | Timelines |
| --- | --- | --- | --- | --- |
| Mental effort (cognitive load) | 1 | Very low | Very high | Baseline and postintervention |
| Perceived difficulty (task difficulty) | 1 | Not difficult at all | Very high difficulty | Baseline and postintervention |
| General Self-Efficacy (self-confidence in history taking) | 10 | Strongly disagree | Strongly agree | Baseline, postintervention, and 7 weeks later |
| Technology Acceptance Model: ease of use | 6 | Strongly disagree | Strongly agree | Postintervention |
| Technology Acceptance Model: usefulness | 6 | Strongly disagree | Strongly agree | Postintervention |
| Technology Acceptance Model: satisfaction | 8 | Strongly disagree | Strongly agree | Postintervention |
| Open-ended knowledge question | Marks out of 25 | —a | —a | Baseline and postintervention |
| Clinical skills (objective structured clinical examination) | Marks out of 25 | —a | —a | 10 weeks postintervention |

aNot applicable.

Statistical Analysis

Descriptive statistics were used to assess baseline demographic characteristics, and mean scores of continuous variables were calculated. Chi-square or Fisher exact tests were used to test for associations between the PDF and VMG groups. To examine changes in self-efficacy and knowledge across baseline, postintervention, and follow-up, a repeated-measures analysis of variance was carried out. This was adjusted for confounders, including age, gender, ethnicity, education level, marital status, and specialty preference. The interaction between group and time was also tested and adjusted for in the model if it was deemed to be significant (P<.05). All the analyses were carried out using SAS (version 9.3; SAS Institute) software. Arbitrary missing data at the follow-up test were handled using multiple imputation, using the 2-fold fully conditional specification approach proposed by Welch et al, with predictive mean matching [82].
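For the 2x2 between-group comparisons, the Fisher exact test can be computed directly from the hypergeometric distribution. The following standalone sketch is our own illustration (the study itself used SAS); it implements the common two-sided definition, summing all tables no more probable than the one observed.

```python
# Two-sided Fisher exact test for a 2x2 table [[a, b], [c, d]]:
# sum the hypergeometric probabilities of all outcomes that are no more
# probable than the observed table. Illustrative reimplementation only;
# the study's analyses were run in SAS.
from math import comb

def fisher_exact_2x2(a, b, c, d):
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):  # hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # small tolerance guards against floating-point ties
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
```

For example, applying this to the gender split in Table 2 (4/15 in the PDF group vs 8/19 in the VMG group) yields a nonsignificant p value, consistent with the reported P=.73.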

Qualitative Analysis

Data analysis was conducted using the 6-phase thematic analysis guide developed by Braun and Clarke [83].

Authors HA and MA independently followed these 6 steps, and consensus was reached using the constant comparison approach where disagreements were discussed until mutual agreement was reached. The first phase involved entering responses in an Excel spreadsheet where they were read and reread as part of the data familiarization stage, and notes were taken for noticeable patterns. Manual coding was performed through highlighting repeated meanings that represented certain patterns as part of a data driven analysis. After data were coded, broader themes were generated. We used tables of codes and generated themes accordingly. At this stage, themes were reviewed and refined against the coded data level, followed by the whole dataset level. These themes were refined and named to ensure they reflected the meaning of the dataset they represent, highlighting interesting points and providing any potential explanations. Each theme was analyzed separately and in relation to other themes within the overall dataset meaning. Finally, report writing involved writing the meaning of the datasets on the basis of the generated themes supported by data extracts. At this stage, critical explanations about the rationale for such themes in the context of our research question about learning enhancement were sought.

Results

Sample Characteristics

A total of 83 students were eligible for the study, and 46 students agreed to participate (55% response rate). One student left the PDF group, leaving 18 participants in this group and 27 in the VMG group. The student demographic characteristics in both groups were similar, including age, gender, specialty preference, and gaming frequency (Table 2). All students participated in the baseline and postintervention measures. Of those, 24 (88%) students in the VMG group and 14 (73%) students in the PDF group completed the follow-up test 7 weeks later. The sample characteristics are provided in Table 2.

Table 2. Sample characteristics.
| Baseline characteristics | PDF group, n (%) | VMG group, n (%) | Total, n (%) | P value |
| --- | --- | --- | --- | --- |
| Gender | | | | .73a |
| Male | 4 (21.1) | 8 (29.6) | 12 (26.1) | |
| Female | 15 (79) | 19 (70.4) | 34 (73.9) | |
| Age (years) | | | | .68a |
| 19-21 | 10 (52.6) | 18 (66.7) | 28 (60.9) | |
| 22-25 | 8 (42.1) | 8 (29.6) | 16 (34.8) | |
| >26 | 1 (5.3) | 1 (3.7) | 2 (4.4) | |
| Ethnicity | | | | .32a |
| NZ European | 5 (26.3) | 8 (30.8) | 13 (28.9) | |
| Maori | 2 (10.5) | 3 (11.5) | 5 (11.1) | |
| Pacific | 5 (26.3) | 2 (7.7) | 7 (15.6) | |
| Asian | 1 (5.3) | 6 (23.1) | 7 (15.6) | |
| Other | 6 (31.6) | 7 (26.9) | 13 (28.9) | |
| Marital status | | | | .58a |
| Single | 16 (84.2) | 18 (69.2) | 34 (75.6) | |
| Couple/De facto | 3 (15.8) | 7 (26.9) | 10 (22.2) | |
| Married | 0 | 1 (3.9) | 1 (2.2) | |
| Specialty | | | | .75a |
| Medicine | 10 (52.6) | 14 (53.9) | 24 (53.3) | |
| Surgery | 0 | 2 (7.7) | 2 (4.4) | |
| Subspecialties | 2 (10.5) | 3 (11.5) | 5 (11.1) | |
| Do not know | 7 (36.8) | 7 (26.9) | 14 (31.1) | |
| Gaming frequency | | | | .88a |
| Once a day | 1 (5.3) | 3 (11.5) | 4 (8.9) | |
| More than once a day | 1 (5.3) | 2 (7.7) | 3 (6.7) | |
| Once a week | 3 (15.8) | 5 (19.2) | 8 (17.8) | |
| More than once a week | 0 | 1 (3.9) | 1 (2.2) | |
| I do not play games | 14 (73.7) | 15 (57.7) | 29 (64.4) | |
| Education | | | | .31b |
| Undergraduate entry | 12 (63.2) | 20 (76.9) | 32 (71.1) | |
| Master’s degree | 7 (36.8) | 6 (23.1) | 13 (28.9) | |
| Enrolment status | | | | .50a |
| Domestic | 19 (100) | 25 (92.6) | 44 (95.7) | |
| International | 0 | 2 (7.4) | 2 (4.4) | |

aFisher exact test used.

bChi-square test used.

Knowledge and Self-Efficacy

Although between-group differences in knowledge and self-efficacy at postintervention and 7-week time points compared with baseline were not significant (knowledge P=.95 and self-efficacy P=.85), the within-group differences were statistically significant (P<.001) for both groups (Figures 6 and 7). This significant difference persisted across unadjusted and adjusted models (adjusted by age, gender, ethnicity, education level, marital status, and specialty preference).

Clinical Skills (Objective Structured Clinical Examination Scores)

The OSCE took place 10 weeks after the intervention. No significant differences between the intervention and control groups were found (P=.60; Table 3).

Figure 6. Knowledge across baseline, postintervention, and follow-up time points (no statistically significant differences).
Figure 7. History-taking self-efficacy across baseline, postintervention, and follow-up time points (no statistically significant differences).
Table 3. History-taking cognitive load, difficulty level, objective structured clinical examination results, ease of use, usefulness, and satisfaction: means, SDs, and P values.

| Outcome | Group | Timeline | Mean (SD) | P value |
| --- | --- | --- | --- | --- |
| Cognitive load | PDF | Baseline | 3.94 (0.64) | .88a |
| | PDF | Post | 3.89 (0.68) | |
| | VMG | Baseline | 3.81 (0.62) | |
| | VMG | Post | 3.74 (0.66) | |
| Difficulty level | PDF | Baseline | 2.94 (0.73) | .68a |
| | PDF | Post | 3.06 (0.64) | |
| | VMG | Baseline | 2.96 (0.59) | |
| | VMG | Post | 3.04 (0.52) | |
| Satisfaction | PDF | Post | 23.72 (4.35) | <.001 |
| | VMG | Post | 29.85 (4.86) | |
| Usefulness | PDF | Post | 21.39 (4.1) | .30 |
| | VMG | Post | 22.59 (3.62) | |
| Ease of use | PDF | Post | 23.28 (3.68) | .19 |
| | VMG | Post | 24.7 (3.36) | |
| Objective structured clinical examination | PDF | Post | 14.33 (4.52) | .60 |
| | VMG | Post | 15.04 (4.28) | |

aWilcoxon 2-sample test (nonparametric test).

History-Taking Cognitive Load and Level of Difficulty

There were no statistically significant differences from baseline between the 2 groups in either cognitive load scores (P=.88) or level of difficulty scores (P=.68; Table 3). Similarly, the ease-of-use and usefulness levels assessed postintervention did not reach statistical significance (ease of use P=.19 and usefulness P=.30; Table 3).


There was a significant difference in satisfaction levels between the groups, with the VMG group having a statistically higher satisfaction level compared with the PDF group (mean difference of 6.13; 95% CI 3.27-9.00, P<.001; Table 3).

Qualitative Results

The overall findings indicate that both the VMG and the PDF file presented HTC and its structure well to participants. However, although the VMG was viewed as a learning tool that aided understanding, encoding, and recall of history taking in an engaging and fun way, the PDF file was viewed as easy to read but difficult to learn from or remember. In addition, the PDF file was seen as an electronic handout, no different from other handouts, which were boring and nonengaging. A total of 4 major themes emerged from this analysis for both interventions: interventions as learning tools, enjoyment and engagement, generalizability, and suggestions for future improvement (Table 4). Identities are masked and replaced by pseudonyms in the section below to preserve confidentiality.

Table 4. Thematic analysis summary.
| Game | Shared main themes | PDF |
| --- | --- | --- |
| Memory aid for conceptual, procedural, and metacognitive knowledge development | Learning tool | An easy-to-read but hard-to-learn reference tool |
| Enhances enjoyment in learning history taking | Affects enjoyment and engagement | Neither engaging nor enjoyable |
| Text size, spelling, and grammar optimization; add a skip button; increase difficulty levels | Design and functionality optimization suggestions | Add pictures, color, and animations; enhance interactivity |
| Game-specific theme: the game is generalizable to other clinical scenarios of different systems | —a | —a |

aNot applicable.

Theme 1: Interventions as Learning Tools

Visual Metaphor Game as a Memory Aid for Conceptual, Procedural, and Metacognitive Knowledge Development

A recognized benefit of using visual metaphors is enhancing memory retrieval through activating prior knowledge, thus allowing new knowledge and concepts to be learned [21]. For example, Stacey discussed how visual metaphors enhanced understanding and recall: “The use of symbols made important factors to ask about easy to recall.” Michael described how visual metaphors went beyond being just a memory aid to facilitating the “how” aspects of knowledge in terms of structure and systematic history-taking approach, as they added a “very good systematic process with visual symbols – aids memory,” whereas James shared that visual metaphors “helped realise that a system is important to follow - logical order/sequence.” On a metacognitive level in terms of “thinking about thinking” [84], visual metaphors enable the interpreter to make strong associations among concepts [85]. This enables the students to understand how they think and process information. Tanya stated that she enjoyed “the metaphors - since it has visual association with aspects of history-taking. I like that it is very structured in its approach,” and Sarah highlighted how visual metaphors were “very effective associating concepts with images.”

PDF Viewed as an Easy-to-Read but Hard-to-Learn Reference Tool

Although most PDF group participants viewed the file as a clear and concise list, they found it hard to learn and remember. William, for example, saw it as “easy to read bullet points and concise information,” yet found it “not interactive, cannot remember just by reading, boring.” Perceiving the PDF file as a reference tool is consistent with the medical education literature, in which technology-based handouts have been regarded as reference tools [86,87]. The negative impact of the PDF file’s lack of interactivity on learning outcomes has been observed in a previous study, where students given noninteractive PDF handouts had significantly worse educational outcomes than those given interactive electronic handouts [88]. This may have implications for medical education in general, as information (including difficult-to-understand concepts) is often presented in textual format without accompanying images.

Theme 2: Instructional Design Influences Enjoyment and Engagement

Visual Metaphors Enhance Enjoyment in Learning History Taking

Almost all VMG group participants enjoyed the game. For example, Jake enjoyed “the results afterwards, i.e., remembering content of history-taking.” Stacey highlighted this by stating:

I enjoyed how easy it was to use…made things fun to follow.
This highlights how enjoyment was linked both to learning the material and to the ease of playing the game. Such increased enjoyment is reported in the SGs literature, as studies have highlighted how students experienced high enjoyment levels while playing SGs [28,89]. Some researchers explain this increased enjoyment by the balance achieved between the learner’s abilities and the challenge posed by the game [90], which is consistent with Stacey’s statement above.

PDF Was Neither Engaging Nor Enjoyable

A majority of the PDF group participants found it boring and nonengaging because of a lack of visuals or interactivity, as Josephine stated, “It was not interactive at all, so it was not fun and I probably won't remember the stuff well.” Sam mentioned that “It was plain and was not necessarily enjoyable but just another handout but electronic.” These observations are supported by the literature, as technologically noninteractive modalities of presenting information, such as textual information, are considered less productive than interactive educational modalities [91]. Incorporating visual and verbal modalities of information is said to improve student understanding of taught material [92], which could explain the above statements.

Theme 3: Visual Metaphor Game Is Generalizable to Other Clinical Scenarios of Different Systems

Overall, a majority of the VMG group participants appreciated the teaching value of the VMG and suggested generalizing it to teach other organ systems and clinical scenarios. Martin suggested the following:

To have different cases played out but with the same format so we learn about other systems with depth too. I feel that it has definitely helped me a lot with my memory recall, thanks.

Mike said the following:

I think the game is focused on one type of clinical problem, chest pain. It would be nice to include more scenarios

This did not feature in the PDF group responses. Visual metaphor use with games has been reported to serve as an “anchor” for conceptual knowledge [28], and this could explain the above responses. In the current study, though the history-taking visual metaphor had a cardiac focus, this could be easily applied to other organ systems (eg, respiratory system).

Theme 4: Design and Functionality Suggestions for Improvement

Participants in both groups offered suggestions on how the interventions could be optimized. PDF group participants offered several design suggestions, whereas the VMG group offered relatively few.

Game Prototype Design and Functionality Suggestions

VMG participants suggested how the functionality and user interface could be optimized. In terms of design, some participants suggested text size optimization, as Simon commented, “make text larger maybe?” Another suggested enlarging the buttons, and others suggested spelling and grammatical corrections and adding more levels of difficulty. In terms of functionality, suggestions included adding back and skip buttons, improving lag, and minimizing repetition. Some participants, such as Harris, commented on the game prototype lagging and asked us to “make it faster if possible. Great concept.” Stephanie shared a similar opinion: “Make the game respond faster.” In the gaming literature, lag refers to the delay in a game’s response to the player’s actions [93], and it has been shown to negatively affect the gameplay experience [49,93,94]. A possible explanation for the reported lag is the use of data-rich 3D graphics in the game [95].

Angela appreciated the positive impact of repetition on memory but acknowledged that repetition became less enjoyable once she had learned the material. This was captured in her response: “repetition was really good for memory, but after I was able to remember it, it became a little boring, sorry.” The use of repetition in SGs is associated with improved learning [96], and the reported boredom after learning the content is a good illustration of the flow state proposed by Csikszentmihalyi [95]. In the game literature, a balance must be struck between task difficulty and the player’s ability in order to reach a state of flow [97]: games that are too hard frustrate the player, whereas games that are too easy bore the player. The literature therefore suggests increasing game difficulty as the user’s knowledge improves, to maintain this state of flow. Such an increase in difficulty levels was suggested by Jasmin, who wanted “an even more advanced level.” However, others enjoyed the current 3 difficulty levels, as evident in Yvonne’s response about enjoying the “different levels to eventually learn history from memory.” Finally, given that the reported boredom occurred only after the material had been learned, which is the educational purpose of this SG, adding more difficulty levels may be of questionable value.

PDF Design and Functionality Optimization Suggestions

Most PDF group participants had suggestions to enhance PDF design and functionality through adding pictures, color, and animations, as well as enhancing interactivity.

In terms of design, Patrick suggested “using more animations, colours, pictures making it more interesting and interactive.” Patricia suggested a function to write on the file, a “marker so you can write on and interact with the file to improve memorisation,” and Ronald suggested having “a record button that records the conversation and puts all patients’ responses into each section immediately.” These suggestions are in line with educational research showing that interactive audiovisual functionalities enhance learning outcomes [98].

Summary of Key Findings in This Study

This quasi-experimental pilot study, which used a mixed quantitative and qualitative methods approach, evaluated a visual metaphor-enhanced 3D SG for teaching cardiac HTC to year-3 medical students in comparison with PDF-delivered teaching. More than half the sample indicated a preference to specialize in General Medicine in the future and did not play games regularly. The study showed that the game was comparable with textual information in increasing knowledge gains, enabling self-efficacy, and managing cognitive load, as well as in level of difficulty, perceived ease of use, and acceptability at various time points. However, the game was superior to textual information in satisfaction scores. The reasons for this could be drawn from the qualitative analyses, which showed that the VMG was perceived as a useful visual memory aid, more enjoyable, and transferable to other clinical scenarios. By contrast, the PDF group found the PDF file boring and merely another handout that was neither interactive nor engaging.

Quantitative Findings in the Context of the Literature

The quantitative findings were consistent with the current pre- and postgraduate medical education SGs literature, which shows that SGs are at least as effective as traditional teaching methods. Consistent with the higher satisfaction scores in the VMG group and the qualitative findings of students viewing the game as more enjoyable, fun, and a useful memory aid, 2 systematic reviews reported that SGs have the advantage of increasing students’ enjoyment of and interest in the topic taught [99,100]. The improvement in satisfaction was consistent with medical education game use among medical students when compared with traditional teaching [81,101]. The main benefit of the new teaching approach was that the VMG produced statistically significantly higher satisfaction scores and was more favored in the qualitative analysis. As improved student satisfaction is itself an important learning outcome that complements academic achievement [102], this significant improvement in the game group merits further consideration. Research has shown that improved satisfaction enhances student academic performance, and its enhancement is strongly advised by educationalists [103,104]. A previous systematic review assessing the impact of educational games on medical students’ learning outcomes found potential for improving learning outcomes but highlighted the need for more rigorous research [105]. Graafland et al systematically reviewed the literature on the impact of SGs on training health professionals and assessed their validity [106]. They included a total of 30 SGs, 17 of which were designed for educational purposes; however, none of these games were fully validated. The authors suggested validating games before incorporating them into teaching, with validation covering content, face, concurrent, and predictive validity.
In 2013, a Cochrane review of games as a teaching method for health professionals found insufficient evidence for or against their use [107]. Wang et al conducted a systematic review of SGs in medical education in 2016 and identified 42 studies, reflecting an increase in the number of SGs. Of these, only 19 studies included an evaluation of SGs, with the majority of these (n=17) associated with significant educational benefits [108]. Of the 19 studies, only 4 involved a medical student population, assessing the effect of SGs on medical students’ knowledge and skill acquisition [81,109-111]. Only 1 of these studies found a significant improvement in knowledge, but this was for immediate recall only, as long-term retention was not assessed [111]. There are several possible reasons why this study showed no significant quantitative differences between the teaching methods apart from satisfaction scores. The duration (40 min) allotted for learning the textual information may be a confounder: many students regarded it as too long, and students began interacting with and assessing each other. As assessment drives learning [112], it could be argued that the PDF group members assessing each other, rather than engaging only with the iPad, could have improved their scores independent of the PDF file. The influence of student-student interaction has been found in previous work to be similar to interactions between students and facilitators [81]. SGs are usually played more than once, unlike the one-off 40-min play session in this study, which could also explain the lack of significant findings. In 2013, a meta-analysis of SGs reported that a lack of significant difference between games and traditional teaching could be attributed to the game not being played on multiple occasions, as normally happens [32].
In that meta-analysis, single-session gameplay, unlike multiple gameplay sessions, was no more beneficial than traditional teaching methods. Another possible reason for the lack of significant difference in scores between the groups is the progression of skill development over time in the absence of formal teaching [3]. Although the VMG group had only 40 min of gameplay, both groups had unlimited access to the traditional teaching material from which the PDF file was designed, which could have been a confounding factor in this study. A further possible explanation is that the control group’s awareness of its control status (the text PDF file being one of 2 interventions, the other a game) influenced the group’s behavior, as its members may have tried to outperform the intervention group. This statistical confounder, called the John Henry Effect [113], was evident in PDF group participants assessing each other during the experiment and highlighted by the fact that one participant left the control group as soon as she saw that the PDF file was not the game. Embedding qualitative research alongside quantitative approaches has been suggested as a way of addressing the John Henry Effect [114].

Qualitative Findings in Context of the Literature

The qualitative component of this study investigated students’ perceptions of the 2 educational approaches. Although both were seen as learning tools, the depth of learning was greater in the VMG group (conceptual, procedural, and metacognitive knowledge levels) than in the PDF group (conceptual level only). Several students touched on the educational benefits of visual metaphors; for example, students described how the game enabled them to “associate” visual metaphors with HTCs. This mirrors findings on how visual metaphors work by associating existing knowledge of certain visuals (familiar objects) with new (unfamiliar) concepts [20,115-117]. This associative function may stem from their visual nature, which, according to Paivio’s Dual Coding Theory, creates cognitive associations with textual information, thus enhancing learning [118]. Active cognitive processing that marries new knowledge to previous knowledge is commonly referred to as “meaningful learning,” a term pioneered by the educational psychologist Ausubel in the 1960s, who stated: “If I had to reduce all of educational psychology to just one principle, I would say this: The most important single factor influencing learning is what the learner already knows. Ascertain this and teach him accordingly” [119].

The VMG group’s awareness of such associative cognitive processes shows how visual metaphor enabled them to “think about thinking,” which is an important metacognitive skill that is essential for self-regulated learning [120]. Another benefit reported by students using visual metaphor was reaching a personalized deeper meaning of history taking. This “Cognitive Elaboration” [117] was observed in some responses alluding to reaching a conceptual understanding of how history taking has multiple interrelated parts, which need to be uncovered systematically and carefully. This was consistent with the visual metaphor literature, as it was seen to provide the learner with a deeper meaning and understanding of the presented material [115].

Game Design Considerations

Most VMG group participants suggested developing the game for other organ systems and clinical scenarios, which points to its perceived usefulness. A few students commented on the need to address the gameplay lag. This need for immediacy is one of the main features contributing to improved gameplay [121] and will be addressed in the next iteration of the game. In terms of implications, clinical educational interventions that focus on HTC should continue to explore the value of visual metaphors delivered through SGs to better teach this crucial skill. First, our SG was delivered as a one-off 40-min session, whereas the literature indicates that more than one session of playing SGs yields significantly better results than traditional teaching [32]. The SG could therefore be tested with a larger sample and a randomized study design over multiple sessions to assess its actual impact against traditional teaching, given the overall literature support for SGs’ benefits. Second, students’ feedback on game optimization is being taken into consideration, as we are in the final stages of developing another game iteration. It is hoped that this version will be tested in a multicenter randomized controlled trial to assess its impact against traditional teaching. Third, the use of visual metaphors in teaching other clinical topics warrants further exploration, as students’ qualitative feedback indicated perceived generalizability to history-taking topics beyond cardiac history taking. In terms of cross-cultural impact, SGs could be useful for medical students across the globe, especially for technologically savvy current and future millennial medical students. Although most SGs developed to date are from developed countries, other, less-developed countries are expected to follow suit [30].
The global benefits of SGs are widely recognized [122]: once developed and deployed, they can be accessed from anywhere in the world, and collaborations could be sought to translate SGs into local languages to facilitate medical education efforts across the globe. This is increasingly feasible with growing international internet access. In Kenya, for example, mobile phone access jumped from 10% in 2002 to 80% in 2015, with increasing smartphone access among higher education students [123]. The internet is seen by a majority in 32 developing countries as beneficial for education [124].


Limitations

This study has several limitations. First, the sample size was relatively small; of note, this was a pilot study targeting one of the medical school’s 3 main teaching sites in Auckland. The response rate was 55%, which is consistent with the SGs literature [125], in which some studies’ response rates were below 50%. Studies with higher response rates incorporated the interventions into the curriculum, which was not possible in this study. Although a randomized study design would have been ideal, it was not feasible, as students were preallocated at the medical school administration level according to their residential proximity to the clinical teaching campus. This lack of contextual feasibility has previously been cited as a barrier to randomized study design adoption in medical education [126,127], reflected in the fact that fewer than 20% of SG studies have used a randomized design [128]. Moreover, participants were not blinded, a common issue in educational interventions such as games, as students can easily identify the intervention. In addition, student-student interaction in the PDF group could not be controlled and could have influenced learning; reducing the 40-min time limit for both the PDF and VMG groups might have mitigated this issue. Finally, our design included a one-off game session, which differs from the usual experience of playing a game more than once; this difference has been found to explain the lack of positive findings for SGs compared with traditional teaching methods [32].


Conclusions

In this mixed methods experimental pilot study, we provide evidence that 40 min of playing an SG is as effective as textual information in teaching cardiac history taking to year-3 medical students, with the added value of increased student satisfaction compared with traditional teaching. The qualitative analysis showed that the game was more engaging, fun, and enjoyable and was perceived as a useful visual memory aid. The game was as effective across the 3 Kirkpatrick model levels of evaluating educational interventions: affective, cognitive, and behavioral attitudes. Although SGs are in their infancy, their potential educational benefits should be harnessed for current and future digitally adept medical students, especially when motivation and enjoyment of learning are at stake.

Conflicts of Interest

None declared.

Multimedia Appendix 1

A showreel of the Med Metaphoria game.

MP4 File (MP4 Video), 120956 KB

Multimedia Appendix 2

Knowledge test marking criteria.

PDF File (Adobe PDF File), 206 KB

  1. Nardone DA, Reuler JB, Girard DE. Teaching history-taking: where are we? Yale J Biol Med 1980;53(3):233-250 [FREE Full text] [Medline]
  2. Lipkin Jr M, Frankel RM, Beckman HB, Charon R, Fein O. Performing the interview. In: Lipkin MJ, Putnam SM, Lazare A, Carroll JG, Frankel RM, Keller A, et al, editors. The Medical Interview. New York: Springer; 1995:65-82.
  3. Craig JL. Retention of interviewing skills learned by first-year medical students: a longitudinal study. Med Educ 1992;26(4):276-281. [CrossRef]
  4. Maguire GP, Rutter DR. History-taking for medical students. I-deficiencies in performance. Lancet 1976 Sep 11;2(7985):556-558. [CrossRef] [Medline]
  5. Pfeiffer C, Madray H, Ardolino A, Willms J. The rise and fall of students' skill in obtaining a medical history. Med Educ 1998 May;32(3):283-288. [CrossRef] [Medline]
  6. Platt FW, McMath JC. Clinical hypocompetence: the interview. Ann Intern Med 1979 Dec;91(6):898-902. [CrossRef] [Medline]
  7. Ramsey PG, Curtis J, Paauw DS, Carline JD, Wenrich MD. History-taking and preventive medicine skills among primary care physicians: an assessment using standardized patients. Am J Med 1998 Feb;104(2):152-158. [CrossRef] [Medline]
  8. Hampton JR, Harrison MJ, Mitchell JR, Prichard JS, Seymour C. Relative contributions of history-taking, physical examination, and laboratory investigation to diagnosis and management of medical outpatients. Br Med J 1975 May 31;2(5969):486-489 [FREE Full text] [CrossRef] [Medline]
  9. Peterson MC, Holbrook JH, von Hales D, Smith NL, Staker LV. Contributions of the history, physical examination, and laboratory investigation in making medical diagnoses. West J Med 1992 Feb;156(2):163-165 [FREE Full text] [CrossRef] [Medline]
  10. Stewart MA. Effective physician-patient communication and health outcomes: a review. Can Med Assoc J 1995 May 1;152(9):1423-1433 [FREE Full text] [Medline]
  11. Robbins JA, Bertakis KD, Helms LJ, Azari R, Callahan EJ, Creten DA. The influence of physician practice behaviors on patient satisfaction. Fam Med 1993 Jan;25(1):17-20. [Medline]
  12. Kravitz RL, Callahan EJ, Paterniti D, Antonius D, Dunham M, Lewis CE. Prevalence and sources of patients' unmet expectations for care. Ann Intern Med 1996 Nov 1;125(9):730-737. [CrossRef] [Medline]
  13. Harmon G, Lefante J, Krousel-Wood M. Overcoming barriers: the role of providers in improving patient adherence to antihypertensive medications. Curr Opin Cardiol 2006 Jul;21(4):310-315. [CrossRef] [Medline]
  14. Zolnierek KB, Dimatteo MR. Physician communication and patient adherence to treatment: a meta-analysis. Med Care 2009 Aug;47(8):826-834 [FREE Full text] [CrossRef] [Medline]
  15. Aspegren K, Lønberg-Madsen P. Which basic communication skills in medicine are learnt spontaneously and which need to be taught and trained? Med Teach 2005 Sep;27(6):539-543. [CrossRef] [Medline]
  16. van Dalen J, Bartholomeus P, Kerkhofs E, Lulofs R, Van Thiel J, Rethans JJ, et al. Teaching and assessing communication skills in Maastricht: the first twenty years. Med Teach 2001 May;23(3):245-251. [CrossRef] [Medline]
  17. Harding SR, D'Eon MF. Using a Lego-based communications simulation to introduce medical students to patient-centered interviewing. Teach Learn Med 2001;13(2):130-135. [CrossRef] [Medline]
  18. Alyami H, Su'a B, Sundram F, Alyami M, Lyndon M, Yu TC, et al. Teaching medical students history taking content: a systematic review. Am J Educ Res 2016;4(3):227-233 [FREE Full text] [CrossRef]
  19. Hilliard RI. How do medical students learn: medical student learning styles and factors that affect these learning styles. Teach Learn Med 1995 Jan;7(4):201-210. [CrossRef]
  20. Eppler MJ. A comparison between concept maps, mind maps, conceptual diagrams, and visual metaphors as complementary tools for knowledge construction and sharing. Inf Vis 2006 Jun 22;5(3):202-210. [CrossRef]
  21. Marcos A. The tension between Aristotle's theories and uses of metaphor. Stud Hist Philos Sci A 1997 Mar;28(1):123-139. [CrossRef]
  22. Paton RC. Towards a metaphorical biology. Biol Philos 1992 Jul;7(3):279-294. [CrossRef]
  23. St Clair RN. Visual metaphor, cultural knowledge, the new rhetoric. In: Reyhner J, Martin J, Lockard L, Gilbert WS, editors. Learn in Beauty: Indigenous Education for a New Century. Flagstaff, Arizona: Northern Arizona University; 2000:85-101.
  24. Hanson L. Affective response to learning via 'visual metaphor'. In: Beauchamp DG, Braden RA, Baca JC, editors. Visual Literacy in the Digital Age. Blacksburg, VA: The International Visual Literacy Association; 1993:271-279.
  25. Ortony A. Metaphor, language and thought. In: Metaphor and Thought. Second Edition. United Kingdom: Cambridge University Press; 1993:1-16.
  26. Roth WM, Bowen GM, McGinn MK. Differences in graph-related practices between high school biology textbooks and scientific ecology journals. J Res Sci Teach 1999 Nov;36(9):977-1019. [CrossRef]
  27. Eppler MJ, Burkhard R. Google Books. 2004. Knowledge Visualization: Towards a New Discipline and Its Fields of Application   URL: https://books.google.com/books/about/Knowledge_Visualization_Towards_a_New_Di.html?id=15gunQAACAAJ [accessed 2019-08-27]
  28. Rieber LP, Noah D. Games, simulations, and visual metaphors in education: antagonism between enjoyment and learning. Educ Media Int 2008 Jun;45(2):77-92. [CrossRef]
  29. Fabricatore C. Learning Video Games: An Unexploited Synergy. In: Proceedings of the Annual Convention of the Association for Educational Communications and Technology. 2000 Presented at: AECT'00; February 16-19, 2000; Long Beach, CA, USA   URL:
  30. Susi T, Johannesson M, Backlund P. DiVA portal. 2007. Serious Games – An Overview   URL: [accessed 2017-10-01]
  31. Kuipers DA, Terlouw G, Wartena BO, van't Veer JT, Prins JT, Pierie JP. The role of transfer in designing games and simulations for health: systematic review. JMIR Serious Games 2017 Nov 24;5(4):e23 [FREE Full text] [CrossRef] [Medline]
  32. Wouters P, van Nimwegen C, van Oostendorp H, van der Spek ED. A meta-analysis of the cognitive and motivational effects of serious games. J Educ Psychol 2013;105(2):249-265. [CrossRef]
  33. Ludden GD, van Rompay TJ, Kelders SM, van Gemert-Pijnen JE. How to increase reach and adherence of web-based interventions: a design research viewpoint. J Med Internet Res 2015 Jul 10;17(7):e172 [FREE Full text] [CrossRef] [Medline]
  34. Sweller J. Cognitive load during problem solving: effects on learning. Cogn Sci 1988 Jun;12(2):257-285. [CrossRef]
  35. Adams DM, Clark DB. Integrating self-explanation functionality into a complex game environment: keeping gaming in motion. Comput Educ 2014 Apr;73:149-159. [CrossRef]
  36. Sweller J, van Merriënboer JJ, Paas FG. Cognitive architecture and instructional design. Educ Psychol Rev 1998 Sep;10(3):251-296. [CrossRef]
  37. Sweller J, van Merriënboer JJ, Paas FG. Cognitive architecture and instructional design: 20 years later. Educ Psychol Rev 2019 Jan 22;31(2):261-292 [FREE Full text] [CrossRef]
  38. Wong A, Marcus N, Sweller J. Instructional animations: more complex to learn from than at first sight? In: Campos P, Graham N, Jorge J, Nunes N, Palanque P, Winckler M, editors. Human-Computer Interaction -- INTERACT 2011. Berlin, Heidelberg: Springer; 2011:552-555.
  39. Bates R. A critical analysis of evaluation practice: the Kirkpatrick model and the principle of beneficence. Eval Program Plan 2004 Aug;27(3):341-347. [CrossRef]
  40. Curran VR, Fleet L. A review of evaluation outcomes of web-based continuing medical education. Med Educ 2005 Jun;39(6):561-567. [CrossRef] [Medline]
  41. Hill AG, Yu TC, Barrow M, Hattie J. A systematic review of resident-as-teacher programmes. Med Educ 2009 Dec;43(12):1129-1140. [CrossRef] [Medline]
  42. Yardley S, Dornan T. Kirkpatrick's levels and education 'evidence'. Med Educ 2012 Jan;46(1):97-106. [CrossRef] [Medline]
  43. Freeth D, Reeves S, Koppel I, Hammick M, Barr H. University of Nebraska Medical. 2005. Evaluating Interprofessional Education: A Self-Help Guide   URL: [accessed 2019-04-25]
  44. Walker SL, Fraser BJ. Development and validation of an instrument for assessing distance education learning environments in higher education: the distance education learning environments survey (DELES). Learn Environ Res 2005 Nov;8(3):289-308. [CrossRef]
  45. O’Hagan AO, Coleman G, O'Connor RV. Software Development Processes for Games: A Systematic Literature Review. In: Proceedings of the European Conference on Software Process Improvement. 2014 Presented at: EuroSPI'14; June 25-27, 2014; Luxembourg p. 182-193. [CrossRef]
  46. Howe A, Keogh-Brown M, Miles S, Bachmann M. Expert consensus on contamination in educational trials elicited by a Delphi exercise. Med Educ 2007 Feb;41(2):196-204. [CrossRef] [Medline]
  47. Ericsson KA, Simon HA. Protocol Analysis: Verbal Reports as Data. Cambridge, MA, US: MIT Press; 1984.
  48. Annetta LA. The 'I's' have it: a framework for serious educational game design. Rev Gen Psychol 2010 Jun;14(2):105-113. [CrossRef]
  49. Olsen T, Procci K, Bowers C. Serious Games Usability Testing: How to Ensure Proper Usability, Playability, and Effectiveness. In: Proceedings of the International Conference of Design, User Experience, and Usability. 2011 Presented at: DUXU'11; July 9-14, 2011; Orlando, FL, USA p. 625-634. [CrossRef]
  50. Ak O, Kutlu B. Comparing 2D and 3D game-based learning environments in terms of learning gains and student perceptions. Br J Educ Technol 2015 Aug 17;48(1):129-144. [CrossRef]
  51. Möring SM. CORE. 2013. Games and Metaphor – A Critical Analysis of the Metaphor Discourse in Game Studies   URL: [accessed 2019-08-27]
  52. Chang HY, Poh DY, Wong LL, Yap JY, Yap KY. Student preferences on gaming aspects for a serious game in pharmacy practice education: a cross-sectional study. JMIR Med Educ 2015 May 11;1(1):e2 [FREE Full text] [CrossRef] [Medline]
  53. Roettl J, Terlutter R. The same video game in 2D, 3D or virtual reality - how does technology impact game evaluation and brand placements? PLoS One 2018;13(7):e0200724 [FREE Full text] [CrossRef] [Medline]
  54. Talley NJ, O'Connor S. Clinical Examination: A Systematic Guide to Physical Diagnosis. New South Wales, Australia: Churchill Livingstone; 2013.
  55. Schuwirth LW, van der Vleuten CP. Different written assessment methods: what can be said about their strengths and weaknesses? Med Educ 2004 Sep;38(9):974-979. [CrossRef] [Medline]
  56. Paas FG. Training strategies for attaining transfer of problem-solving skill in statistics: a cognitive-load approach. J Educ Psychol 1992;84(4):429-434. [CrossRef]
  57. Klepsch M, Schmitz F, Seufert T. Development and validation of two instruments measuring intrinsic, extraneous, and germane cognitive load. Front Psychol 2017;8:1997 [FREE Full text] [CrossRef] [Medline]
  58. Szulewski A, Gegenfurtner A, Howes DW, Sivilotti ML, van Merriënboer JJ. Measuring physician cognitive load: validity evidence for a physiologic and a psychometric tool. Adv Health Sci Educ Theory Pract 2017 Oct;22(4):951-968. [CrossRef] [Medline]
  59. van Gog T, Kirschner F, Kester L, Paas F. Timing and frequency of mental effort measurement: evidence in favour of repeated measures. Appl Cognit Psychol 2012 Nov 28;26(6):833-839. [CrossRef]
  60. Schmeck A, Opfermann M, van Gog T, Paas F, Leutner D. Measuring cognitive load with subjective rating scales during problem solving: differences between immediate and delayed ratings. Instr Sci 2014 Aug 21;43(1):93-114. [CrossRef]
  61. van Gog T, Paas F. Instructional efficiency: revisiting the original construct in educational research. Educ Psychol 2008 Jan 28;43(1):16-26. [CrossRef]
  62. Brunken R, Plass JL, Leutner D. Direct measurement of cognitive load in multimedia learning. Educ Psychol 2003 Mar;38(1):53-61. [CrossRef]
  63. Yeo G, Neal A. Subjective cognitive effort: a model of states, traits, and time. J Appl Psychol 2008 May;93(3):617-631. [CrossRef] [Medline]
  64. Schwarzer R, Jerusalem M. Generalized self-efficacy scale. In: Weinman J, Wright S, Johnston M, editors. Measures in Health Psychology: A User’s Portfolio Causal and Control Beliefs. Windsor, UK: Nfer-Nelson; 1995:35-37.
  65. Artino AR, La Rochelle JS, Durning SJ. Second-year medical students' motivational beliefs, emotions, and achievement. Med Educ 2010 Dec;44(12):1203-1212. [CrossRef] [Medline]
  66. Woods JL, Pasold TL, Boateng BA, Hense DJ. Medical student self-efficacy, knowledge and communication in adolescent medicine. Int J Med Educ 2014 Aug 20;5:165-172 [FREE Full text] [CrossRef] [Medline]
  67. Young HN, Schumacher JB, Moreno MA, Brown RL, Sigrest TD, McIntosh GK, et al. Medical student self-efficacy with family-centered care during bedside rounds. Acad Med 2012 Jun;87(6):767-775 [FREE Full text] [CrossRef] [Medline]
  68. Bandura A. Self-efficacy: toward a unifying theory of behavioral change. Psychol Rev 1977 Jan;84(2):191-215. [CrossRef]
  69. Gee JP. What Video Games Have to Teach Us About Learning and Literacy. New York, USA: St Martin's Griffin Publishing; 2004.
  70. Scholz U, Doña BG, Sud S, Schwarzer R. Is general self-efficacy a universal construct? Eur J Psychol Assess 2002 Sep;18(3):242-251. [CrossRef]
  71. Klassen RM, Klassen JR. Self-efficacy beliefs of medical students: a critical review. Perspect Med Educ 2018 Apr;7(2):76-82 [FREE Full text] [CrossRef] [Medline]
  72. Campbell J, Tirapelle L, Yates K, Clark R, Inaba K, Green D, et al. The effectiveness of a cognitive task analysis informed curriculum to increase self-efficacy and improve performance for an open cricothyrotomy. J Surg Educ 2011;68(5):403-407. [CrossRef] [Medline]
  73. Davis FD. Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Q 1989 Sep;13(3):319-340. [CrossRef]
  74. Briz-Ponce L, García-Peñalvo FJ. An empirical assessment of a technology acceptance model for apps in medical education. J Med Syst 2015 Nov;39(11):176. [CrossRef] [Medline]
  75. Hu PJ, Chau PY, Sheng OR, Tam KY. Examining the technology acceptance model using physician acceptance of telemedicine technology. J Manage Inform Syst 2015 Dec 2;16(2):91-112. [CrossRef]
  76. Huang HM, Liaw SS, Lai CM. Exploring learner acceptance of the use of virtual reality in medical education: a case study of desktop and projection-based display systems. Interact Learn Environ 2013 Jul 22;24(1):3-19. [CrossRef]
  77. King WR, He J. A meta-analysis of the technology acceptance model. Inf Manag 2006 Sep;43(6):740-755. [CrossRef]
  78. Gentry SV, Gauthier A, Ehrstrom BL, Wortley D, Lilienthal A, Car LT, et al. Serious gaming and gamification education in health professions: systematic review. J Med Internet Res 2019 Mar 28;21(3):e12994 [FREE Full text] [CrossRef] [Medline]
  79. Witowski LL. The Relationship Between Instructional Delivery Methods and Student Learning Preferences: What Contributes to Student Satisfaction in an Online Learning Environment? Semantic Scholar. 2008. URL: [accessed 2019-08-27]
  80. Succar T, Zebington G, Billson F, Byth K, Barrie S, McCluskey P, et al. The impact of the virtual ophthalmology clinic on medical students' learning: a randomised controlled trial. Eye (Lond) 2013 Oct;27(10):1151-1157 [FREE Full text] [CrossRef] [Medline]
  81. Sward KA, Richardson S, Kendrick J, Maloney C. Use of a web-based game to teach pediatric content to medical students. Ambul Pediatr 2008;8(6):354-359. [CrossRef] [Medline]
  82. Welch CA, Petersen I, Bartlett JW, White IR, Marston L, Morris RW, et al. Evaluation of two-fold fully conditional specification multiple imputation for longitudinal electronic health record data. Stat Med 2014 Sep 20;33(21):3725-3737 [FREE Full text] [CrossRef] [Medline]
  83. Braun V, Clarke V. Using thematic analysis in psychology. Qual Res Psychol 2006 Jan;3(2):77-101. [CrossRef]
  84. Pintrich PR. The role of metacognitive knowledge in learning, teaching, and assessing. Theor Pract 2002 Nov;41(4):219-225. [CrossRef]
  85. Serig D. A conceptual structure of visual metaphor. Stud Art Educ 2015 Dec 18;47(3):229-247. [CrossRef]
  86. Islam MN, Majumder MD, Ja'afar R, Rahman S. Students' perceptions of 'technology-based' lecture handouts. Malays J Med Sci 2005 Jan;12(1):26-28 [FREE Full text] [Medline]
  87. Buzzetto-More N. Student perceptions of various e-learning components. Interdiscip J E-Learn Learn Obj 2008 Sep 01;4(17):113-135. [CrossRef]
  88. McLaughlin JE, Rhoney DH. Comparison of an interactive e-learning preparatory tool and a conventional downloadable handout used within a flipped neurologic pharmacotherapy lecture. Curr Pharm Teach Learn 2015 Jan;7(1):12-19. [CrossRef]
  89. Ke F. Computer games application within alternative classroom goal structures: cognitive, metacognitive, and affective evaluation. Educ Tech Res Dev 2008 Jan 25;56(5-6):539-556. [CrossRef]
  90. Sherry JL. Flow and media enjoyment. Commun Theory 2004 Nov;14(4):328-347. [CrossRef]
  91. Moreno R, Mayer R. Interactive multimodal learning environments. Educ Psychol Rev 2007 Jun 22;19(3):309-326. [CrossRef]
  92. Fletcher JD, Tobias S. The multimedia principle. In: Mayer RE, editor. The Cambridge Handbook of Multimedia Learning. New York: Cambridge University Press; 2005:117-133.
  93. Tseng P, Wang N, Lin R, Chen K. On the Battle Between Lag and Online Gamers. In: Proceedings of the Conference on Communications Quality and Reliability. 2011 Presented at: CQR'11; April 10, 2011; Florida, USA. [CrossRef]
  94. Loh CS, Sheng Y, Ifenthaler D. Serious games analytics: theoretical framework. In: Loh CS, Sheng Y, Ifenthaler D, editors. Serious Games Analytics: Methodologies for Performance Measurement, Assessment, and Improvement. New York, USA: Springer International Publishing; 2015:3-29.
  95. Csikszentmihalyi M, Abuhamdeh S, Nakamura J. Flow and the Foundations of Positive Psychology: The Collected Works of Mihaly Csikszentmihalyi. Heidelberg: Springer Netherlands; 2014.
  96. Drummond D, Hadchouel A, Tesnière A. Serious games for health: three steps forwards. Adv Simul (Lond) 2017;2:3 [FREE Full text] [CrossRef] [Medline]
  97. Harteveld C, Guimarães R, Mayer I, Bidarra R. Balancing Pedagogy, Game and Reality Components Within a Unique Serious Game for Training Levee Inspection. In: Proceedings of the International Conference on Technologies for E-Learning and Digital Entertainment. 2007 Presented at: Edutainment'07; June 11-13, 2007; Hong Kong, China. p 128-139. [CrossRef]
  98. Kennedy G. Promoting cognition in multimedia interactivity research. J Interact Learn Res 2004;15(1):43-61 [FREE Full text]
  99. Blakely G, Skirton H, Cooper S, Allum P, Nelmes P. Educational gaming in the health sciences: systematic review. J Adv Nurs 2009 Feb;65(2):259-269. [CrossRef] [Medline]
  100. Ricciardi F, de Paolis LT. A comprehensive review of serious games in health professions. Int J Comput Games Technol 2014;2014:1-11. [CrossRef]
  101. O'Leary S, Diepenhorst L, Churley-Strom R, Magrane D. Educational games in an obstetrics and gynecology core curriculum. Am J Obstet Gynecol 2005 Nov;193(5):1848-1851. [CrossRef] [Medline]
  102. Doménech-Betoret F, Abellán-Roselló L, Gómez-Artiga A. Self-efficacy, satisfaction, and academic achievement: the mediator role of students' expectancy-value beliefs. Front Psychol 2017;8:1193 [FREE Full text] [CrossRef] [Medline]
  103. Artino AR. Motivational beliefs and perceptions of instructional quality: predicting satisfaction with online training. J Comput Assist 2008;24(3):260-270. [CrossRef]
  104. Eom SB, Wen HJ, Ashill N. The determinants of students' perceived learning outcomes and satisfaction in university online education: an empirical investigation. Decis Sci J Innov Educ 2006 Jul;4(2):215-235. [CrossRef]
  105. Akl EA, Pretorius RW, Sackett K, Erdley WS, Bhoopathi PS, Alfarah Z, et al. The effect of educational games on medical students' learning outcomes: a systematic review: BEME guide no 14. Med Teach 2010 Jan;32(1):16-27. [CrossRef] [Medline]
  106. Graafland M, Bemelman WA, Schijven MP. Prospective cohort study on surgeons' response to equipment failure in the laparoscopic environment. Surg Endosc 2014 Sep;28(9):2695-2701. [CrossRef] [Medline]
  107. Akl EA, Sackett KM, Erdley WS, Mustafa RA, Fiander M, Gabriel C, et al. Educational games for health professionals. Cochrane Database Syst Rev 2013 Jan 31;(1):CD006411. [CrossRef] [Medline]
  108. Wang R, DeMaria S, Goldberg A, Katz D. A systematic review of serious games in training health care professionals. Simul Healthc 2016 Feb;11(1):41-51. [CrossRef] [Medline]
  109. Aagaard E, Teherani A, Irby DM. Effectiveness of the one-minute preceptor model for diagnosing the patient and the learner: proof of concept. Acad Med 2004 Jan;79(1):42-49. [CrossRef] [Medline]
  110. Duque G, Fung S, Mallet L, Posel N, Fleiszer D. Learning while having fun: the use of video gaming to teach geriatric house calls to medical students. J Am Geriatr Soc 2008 Jul;56(7):1328-1332. [CrossRef] [Medline]
  111. Boeker M, Andel P, Vach W, Frankenschmidt A. Game-based e-learning is more effective than a conventional instructional method: a randomized controlled trial with third-year medical students. PLoS One 2013;8(12):e82328 [FREE Full text] [CrossRef] [Medline]
  112. Wormald BW, Schoeman S, Somasunderam A, Penn M. Assessment drives learning: an unavoidable truth? Anat Sci Educ 2009 Oct;2(5):199-204. [CrossRef] [Medline]
  113. Salkind NJ, editor. John Henry effect. In: Encyclopedia Of Research Design. Newbury Park, California: Sage Publications; 2010.
  114. Kocakaya S. An educational dilemma: are educational experiments working? Educ Res Rev 2011;6(1):110-123 [FREE Full text]
  115. Feinstein H. Meaning and visual metaphor. Stud Art Educ 1982;23(2):45-55. [CrossRef]
  116. Jeffreys MR. Guided visual metaphor: a creative strategy for teaching nursing diagnosis. Nurs Diagn 1993;4(3):99-106. [CrossRef] [Medline]
  117. Jeong S. Visual metaphor in advertising: is the persuasive effect attributable to visual argumentation or metaphorical rhetoric? J Market Comm 2008 Feb;14(1):59-73. [CrossRef]
  118. Clark JM, Paivio A. Dual coding theory and education. Educ Psychol Rev 1991 Sep;3(3):149-210. [CrossRef]
  119. Ausubel DP. Educational Psychology: A Cognitive View. New York: Rinehart and Winston; 1968.
  120. Schraw G, Crippen KJ, Hartley K. Promoting self-regulation in science education: metacognition as part of a broader perspective on learning. Res Sci Educ 2006 Mar;36(1-2):111-139. [CrossRef]
  121. Klimmt C, Hartmann T. Effectance, self-efficacy, and the motivation to play video games. In: Vorderer P, Bryant J, editors. Playing Video Games: Motives, Responses, and Consequences. Mahwah, NJ, US: Lawrence Erlbaum Associates Publishers; 2006:133-145.
  122. Edgcombe H, Paton C, English M. Enhancing emergency care in low-income countries using mobile technology-based training tools. Arch Dis Child 2016 Dec;101(12):1149-1152 [FREE Full text] [CrossRef] [Medline]
  123. Pew Research Center. 2015. Cell Phones in Africa: Communication Lifeline   URL: [accessed 2019-07-01]
  124. Pew Research Center. 2015. Internet Seen as Positive Influence on Education but Negative on Morality in Emerging and Developing Nations   URL: https://www.global/2015/03/19/internet-seen-as-positive-influence-on-education-but-negative-influence-on-morality-in-emerging-and-developing-nations/ [accessed 2019-07-01]
  125. Gorbanev I, Agudelo-Londoño S, González RA, Cortes A, Pomares A, Delgadillo V, et al. A systematic review of serious games in medical education: quality of evidence and pedagogical strategy. Med Educ Online 2018 Dec;23(1):1438718 [FREE Full text] [CrossRef] [Medline]
  126. Prideaux D. Researching the outcomes of educational interventions: a matter of design. RCTs have important limitations in evaluating educational interventions. Br Med J 2002 Jan 19;324(7330):126-127 [FREE Full text] [CrossRef] [Medline]
  127. Dauphinee WD, Wood-Dauphinee S. The need for evidence in medical education: the development of best evidence medical education as an opportunity to inform, guide, and sustain medical education research. Acad Med 2004 Oct;79(10):925-930. [CrossRef] [Medline]
  128. Primack BA, Carroll MV, McNamara M, Klem ML, King B, Rich M, et al. Role of video games in improving health-related outcomes: a systematic review. Am J Prev Med 2012 Jun;42(6):630-638 [FREE Full text] [CrossRef] [Medline]

CLT: cognitive load theory
GSE: General Self-Efficacy
HTC: history-taking content
HTP: history-taking process
OSCE: objective structured clinical examination
SG: serious game
TAM: Technology Acceptance Model
VMG: Visual Metaphor Game
WM: working memory

Edited by T Rashid Soron; submitted 18.02.19; peer-reviewed by C Eichenberg, D Elliott; comments to author 28.04.19; revised version received 17.06.19; accepted 17.07.19; published 26.09.19


©Hussain Alyami, Mohammed Alawami, Mataroria Lyndon, Mohsen Alyami, Christin Coomarasamy, Marcus Henning, Andrew Hill, Frederick Sundram. Originally published in JMIR Serious Games, 26.09.2019

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Serious Games, is properly cited. The complete bibliographic information, a link to the original publication, as well as this copyright and license information must be included.