Participants were recruited from the online pool www.prolific.com and were restricted to those currently studying a science subject at undergraduate degree level. The specific degrees allowed were Medicine, Biochemistry, Dentistry, Biomedical Sciences, Genetics, Pharmacology, Biological Sciences, Veterinary Science, Physics, Engineering, Nursing, Chemistry, Biology, Earth Sciences, Health and Medicine, and Material Sciences. Participants were paid the rate recommended by Prolific, which at the time was GBP 2.40 for satisfactory completion of the study. Sixty-five participants were recruited for Study 1 (the original intention was to recruit 2 × 32, but a technical error led to an additional participant being recruited to one of the groups). Two ‘attention check’ questions [28] were included, for which the only appropriate answer option was ‘I am paying attention to this study’. One participant in Study 1 answered both attention check questions incorrectly, so their data were removed, leaving N = 33 in one counterbalancing group and N = 31 in the other. Fifty new participants were recruited for Study 3, none of whom failed both attention checks.
Study Instrument Structure (Studies 1 and 3)
The study instrument was constructed in Qualtrics™ and included 40 general science MCQs aimed at a level that year 1 UK STEM undergraduate students could reasonably answer, based on A-level and GCSE (UK schools exams) science revision questions [29,30,31,32,33]. The full question set is shown in Appendix 1. Each question was constructed in both SBA and ET format. The study instrument was piloted twice on Prolific for clarity and the overall level of difficulty of the questions. Following the first pilot, colour cues were added to highlight the question format being answered at any one time, along with additional instructions regarding the format of the questions. A second pilot was conducted with eight participants per group. No further changes were required, and so the second set of pilot data were included in the final analysis. The study settings on Prolific were configured to ensure that no participants could take the study twice.
Introduction
The purpose of the study was explained (see below). Participants were asked to refrain from consulting any external sources and to complete the survey to the very best of their ability. They were told that they would be paid for any satisfactory contribution made and that the submission of their answers indicated consent.
Instructions
The two question formats were explained, along with how they would have been scored had this been an actual test. SBA questions scored one point for selecting the correct answer and no points for selecting an incorrect answer. ET questions scored one point for each of the four incorrect answer options eliminated; if the correct answer was eliminated, the question scored − 4 points (Study 1; negative marking) or zero points (Study 3; neutral marking), regardless of how many incorrect options had also been eliminated. The negative mark of − 4 in Study 1 was used to match the real-world experience of the students using ET formats for summative assessment in Study 2.
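To make the two scoring rules concrete, the following is a minimal sketch in Python; the function names and the five-option assumption are ours for illustration and are not part of the study instrument:

```python
def score_sba(selected: str, correct: str) -> int:
    """SBA: one point for selecting the correct answer, none otherwise."""
    return 1 if selected == correct else 0

def score_et(eliminated: set[str], correct: str, marking: str) -> int:
    """ET: one point per incorrect option eliminated (up to 4 of 5 options).
    Eliminating the correct answer scores -4 (marking='negative', Study 1)
    or 0 (marking='neutral', Study 3), however many others were eliminated."""
    if correct in eliminated:
        return -4 if marking == "negative" else 0
    return len(eliminated)  # every eliminated option here is incorrect

# e.g. score_et({"B", "C", "D"}, correct="A", marking="negative") == 3
```

Under negative marking, a participant who eliminated three incorrect options plus the correct one would therefore score − 4, not − 1; under neutral marking the same response scores zero.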
Practice Questions
The explanation was then followed by three practice questions in each format. All practice questions were easy (for example, ‘What colour is the sky?’) so that participants could focus on the format rather than the content.
Study Questions
These were presented in four groups of ten. Two groups of participants were recruited in each study. Both groups received the same forty questions, in the same order. However, the first group (N = 31 participants in Study 1, N = 25 in Study 3) received questions 1–10 and 21–30 in ET format and questions 11–20 and 31–40 in SBA format. This allocation was reversed for the second group (N = 31 participants in Study 1, N = 25 in Study 3). Thus, all questions were tested in both formats, and the presentation of the formats was counterbalanced.
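A short sketch of this counterbalancing, purely for illustration (the group and block labels are ours):

```python
def question_format(question: int, group: int) -> str:
    """Return 'ET' or 'SBA' for questions 1-40 under the counterbalanced design:
    blocks Q1-10 and Q21-30 are ET for group 1 and SBA for group 2;
    blocks Q11-20 and Q31-40 are the reverse."""
    block_is_et_for_group1 = ((question - 1) // 10) % 2 == 0
    if group == 1:
        return "ET" if block_is_et_for_group1 else "SBA"
    return "SBA" if block_is_et_for_group1 else "ET"

# e.g. question_format(5, 1) == "ET" and question_format(5, 2) == "SBA"
```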
Cognitive Load
Participants were asked, in separate questions, to ‘rate the mental effort required to complete’ each of the question types using the Paas 9-point scale [34], where 1 is ‘very, very low mental effort’ and 9 is ‘very, very high mental effort’.
Student Experience
Participants were asked ‘of the two aforementioned questioning styles, which did you find easier?’ and ‘of the two aforementioned questioning styles, which did you prefer?’. In Study 3, participants were then asked additional questions to further explore their reasoning. This section was branched so that participants who said they preferred SBA were asked different questions to those who said they preferred ET. These questions were also used in Study 2 and were derived from the existing literature on the student experience of the different question types [35].
Clarity Check
Participants were asked whether (1) the format of the survey was clear and (2) the format of the questions was clear.
Finally, participants were asked whether they had any further feedback. They were then debriefed, redirected to Prolific, and paid.
Study 2. Survey of Students for Whom Elimination MCQs Are Part of Their Summative Assessment
This survey was administered to students studying Medical Science degrees at a medical school in the UK (not medical students). These are a suite of 19 different undergraduate degrees in which ET MCQs are used as part of a portfolio of summative assessment methods, and so students have experience of this format.
Survey Instrument Structure
The structure of the two question formats was explained, and participants were asked to complete the example questions, as in Studies 1 and 3. They were then asked to confirm that they had taken a multiple-choice exam in either format and then, in a separate question, to confirm that they had taken an MCQ exam in the ET format. They were then asked which of the two formats they preferred, followed by four Likert scale questions designed to understand the reasons for their preference. These Likert questions were based upon prior literature examining students’ preferences for the two formats [35]. The Likert questions were specific to the preferred format (i.e. students who preferred SBA-type questions were asked four questions about why they preferred that format). All students were then asked a free-text question: ‘Is there anything else you would like to add about your preference (or lack thereof) towards exam styles?’.
Survey Distribution and Sample
We used a convenience sample. The survey was sent via email to all 827 enrolled students from UK FHEQ Level 3 to 6 (foundation undergraduate through to final year undergraduate). The email stated that the researchers ‘are running a very short survey on student experiences of certain types of exam questions. Your responses are entirely anonymous and the survey should take less than 5 min. The project has ethical approval from the SUMS RESC’.
Participants
One hundred and eighteen students responded, with 107 completing the survey, giving a response rate of 13% (107/827). Of these, 90 answered yes to both confirmation questions (i.e. had sat MCQ-based exams, including in the ET format), and so data are presented from those 90 students.
Analysis
Data are reported as mean ± standard error. All datasets were tested for normal distribution using a Kolmogorov–Smirnov test [36]. Non-parametric tests were used where data were not normally distributed. The specific statistical tests used are identified in the relevant sections of the results.
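As an illustration only, a minimal version of this decision rule in Python using SciPy; the paired design, the variable names, and the Wilcoxon test as the non-parametric fallback are our assumptions, since the actual tests are named in the results:

```python
import numpy as np
from scipy import stats

def summarise(x):
    """Mean ± standard error, as reported in the text."""
    x = np.asarray(x, dtype=float)
    return x.mean(), x.std(ddof=1) / np.sqrt(len(x))

def compare_paired(a, b, alpha=0.05):
    """Kolmogorov-Smirnov normality check on the paired differences;
    parametric paired t-test if consistent with normality, otherwise a
    non-parametric Wilcoxon signed-rank test (assumed fallback)."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    z = (diff - diff.mean()) / diff.std(ddof=1)  # standardise before KS
    _, p_norm = stats.kstest(z, "norm")
    if p_norm > alpha:
        return stats.ttest_rel(a, b)
    return stats.wilcoxon(a, b)
```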
Ethics
Ethical approval was given by the Swansea University Medical School Research Ethics Committee, code SUMS-RESC 2019–0060. All participants gave electronic written consent, and all data were anonymous.