Web-assisted assessment of professional behaviour in problem-based learning: more feedback, yet no qualitative improvement?
Although other web-based approaches to the assessment of professional behaviour have been studied, no publications to date have compared the potential advantages of a web-based instrument with a classic, paper-based method. This study had two research goals: first, to examine the quantity and quality of comments provided by students and their peers (two researchers independently scored comments as correct or incorrect in relation to five commonly used feedback rules, resulting in an aggregated score across the five criteria); and second, to examine the feasibility, acceptability and perceived usefulness of the two approaches (using a survey). The amount of feedback was significantly higher in the web-based group than in the paper-based group for all three categories (dealing with work, dealing with others and dealing with oneself). Regarding the quality of feedback, the aggregated score for each of the three categories did not differ significantly between the two groups, neither at the interim nor at the final assessment. Some trends, although not statistically significant, were nevertheless noteworthy. Feedback in the web-based group was more often unrelated to observed behaviour for several categories at both the interim and final assessment. Furthermore, most comments in the category ‘Dealing with oneself’ consisted of descriptions of a student’s attendance, thereby neglecting other aspects of personal functioning. The survey identified significant differences between the groups for all questionnaire items regarding feasibility, acceptability and perceived usefulness, in favour of the paper-based form. In conclusion, the web-based instrument for professional behaviour assessment yielded a significantly higher number of comments than the traditional paper-based assessment.
Unfortunately, the quality of the feedback obtained by the web-based instrument as measured by several generally accepted feedback criteria did not parallel this increase.
Keywords: Problem-based learning · Professional behaviour · Professionalism · Tutorial group · Assessment · Peer · Web-assisted · E-mail · Electronic
Abbreviations: APB, Assessment of Professional Behaviors; NBME, National Board of Medical Examiners
Professionalism is becoming increasingly central in undergraduate and postgraduate training, and the associated research has resulted in a vast increase in the number of papers on the topic (van Mook, de Grave et al. 2009). Tools for assessing professionalism and professional behaviour have been developed to identify, counsel, and remediate students and trainees demonstrating unacceptable professional behaviour (Papadakis et al. 2005, 2008). Since validated tools are scarce (Cruess et al. 2006), combining currently available instruments has become the norm (Schuwirth and van der Vleuten 2004; van Mook, Gorter et al. 2009). Self- and peer assessment and direct observation by faculty during regular educational sessions (Singer et al. 1996; Asch et al. 1998; Fowell and Bligh 1998; van der Vleuten and Schuwirth 2005; Cohen 2006) are some of these tools. Self-assessment is defined as personal evaluation of one’s professional attributes and abilities against perceived norms (Eva et al. 2004; Eva and Regehr 2005; McKinstry 2007). So far, published studies on self-assessment of professionalism are scarce (Rees and Shepherd 2005). Given the poor validity of self-assessment in general (Eva and Regehr 2005), it seems ill advised to use self-assessment in isolation, without triangulation from other sources. Peer assessment involves assessors with the same level of expertise and training and similar hierarchical institutional status. Medical students usually know which of their classmates they would trust to treat their family members, which illustrates the intrinsic potential of peer assessment (Dannefer et al. 2005). However, a recent analysis of instruments for peer assessment of physicians revealed that none met the required standards for instrument development (Evans et al. 2004). Studies addressing peer assessment of professional behaviour of medical students are beginning to appear (Freedman et al. 2000; Arnold et al. 
2005; Dannefer et al. 2005; Shue et al. 2005; Lurie, Nofziger et al. 2006a, b). Observation and assessment by faculty using rating scales is another commonly used method of professional behaviour assessment (van Luijk et al. 2000; van Mook, Gorter et al. 2009; van Mook and van Luijk 2010). Prior studies have revealed that such teacher-led sessions are highly dependent on the teacher’s attitudes, motivation and instructional skills (van Mook et al. 2007). When teachers’ commitment declines, assessment of professional behaviour may become trivialised, shifting the emphasis to attendance rather than participation and to completing tick boxes rather than providing informative feedback, at the expense of students’ contribution and motivation (van Mook et al. 2007). In an attempt to further improve professional behaviour assessment, the triangulated teacher-led discussion of self- and peer assessment of professional behaviour using a paper form is the contemporary practice at Maastricht medical school (van Mook and van Luijk 2010).
This study focuses on:
- the quantity and quality of comments provided by students and the feedback provided by their tutor and peers, and
- the feasibility, acceptability and perceived usefulness of the two approaches.
Methods and research tools
The study involved all medical students enrolled in the second, ten-week course in year 2 at the Faculty of Health, Medicine and Life Sciences, Maastricht University, the Netherlands. During the bachelor programme of the six-year problem-based medical curriculum, professional behaviour is assessed on various occasions in tutorial groups during all regular courses (van Mook and van Luijk 2010). Each tutorial group consists of ten students on average and a tutor/facilitator, and each meeting lasts 2 h. For the purpose of this study, the students were divided into two groups: those in tutorial groups with even numbers and those in groups with odd numbers. The first group used a web-based instrument to assess professional behaviour and the other group used the usual method with a paper assessment form. We will first describe the two assessment methods in some detail.
The ‘classic’ paper-based professional behaviour assessment form
The web-based instrument
The web-based instrument is based on an application consisting of a 360° feedback system specifically designed for higher education. Its development involved more than thirty pilot studies and evaluations by over 6,000 students. Prior to the current study, the tool was piloted at Maastricht in a group of first-year students who did not participate in the current study. Providing adequate practical information to students and tutors prior to using the application, and rephrasing items to achieve a more detailed focus on aspects of professional behaviour, were considered prerequisites for the successful implementation of web-based assessment (unpublished data). The web-based instrument used for assessment of professional behaviour pertained to the same three categories (and clarifying descriptions) used on the paper form (Project Team Consilium Abeundi van Luijk 2005). Ample information about background, confidentiality, timing and some practical matters was made available to students and staff electronically and in writing prior to, and during, the opening session of the course, as well as verbally to the tutors during the tutor instruction session. Halfway through and at the end of the course, each student in the web-based assessment group received an internet link in an e-mail. Clicking the link gave access to the web-based assessment instrument. The students were asked to complete the questions themselves and then invite five peers and the tutor of their group to evaluate their professional behaviour and provide feedback. Selection of the peer students was standardised to the five students listed immediately below the student’s name on the centrally and randomly generated list of the tutorial group members, resulting in a semi-anonymised feedback procedure. Ample space for narrative feedback relating to the three categories of professional behaviour was provided for each questionnaire item (Fig. 1). 
All items were also answered using a Likert scale (1 = almost never to 5 = almost always). The students received the results of the feedback process in the form of a printable report presenting the results of their self-assessment relative to the assessment by their peers as well as an overview of all the narrative comments. The web-based group used the printed reports and the paper-based group used the completed paper-based forms to discuss each student’s professional behaviour during the end-of-course assessment in the final tutorial group of the course.
At the end of the last tutorial group of the course, all students of the two groups were asked to complete a questionnaire addressing fourteen aspects of feasibility, acceptability and perceived usefulness of the two instruments. The tutors were invited to report their findings by e-mail. All data were recorded and analysed anonymously.
Feedback was scored as correct or incorrect against commonly used feedback rules, with the following correct vs. incorrect pairs:
- is clear and concrete vs. is vague and general
- is constructive and positive vs. is destructive and negative
- comments on behaviour vs. comments on personality
- is descriptive (formative) vs. is evaluative (summative)
- is related to observed behaviour vs. is unrelated to observed behaviour
Of the 307 (198 females, 109 males) medical students enrolled in the second course of the second year, 150 were assigned to tutorial groups with even numbers (web-based group) and 157 to the tutorial groups with odd numbers (paper-based group). Since assessment of professional behaviour is mandatory, the participation rate was 100%. We will first present the results of the quantitative and qualitative analyses of the narrative feedback provided at the interim and final assessment of professional behaviour in the two groups. After that we present the results for the feasibility, acceptability and perceived usefulness of the two assessment instruments.
Analysis of the amount of feedback
[Table: the total number of comments, n (%), per category of professional behaviour (dealing with work, dealing with others, dealing with oneself) in the web-based and the paper-based group at the interim and final assessment.]
Analysis of the quality of the feedback
[Table: number of correct and incorrect aggregated scores on the five feedback criteria, per category of professional behaviour (dealing with work, dealing with others, dealing with oneself), in the web-based and paper-based groups at the interim and final assessment.]
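For illustration, the aggregation of the two researchers’ criterion-level ratings into one score per comment can be sketched in code. This is a minimal, hypothetical sketch: the criterion labels, the requirement that both raters agree, and the example scores are illustrative assumptions, not the study’s protocol.

```python
# Hypothetical sketch: two raters score each comment as correct (1) or
# incorrect (0) on the five feedback criteria; the aggregated score is
# the number of criteria both raters marked as correct. The labels and
# the both-raters-agree rule are assumptions for illustration only.

RULES = ["clear and concrete", "constructive", "targets behaviour",
         "descriptive", "related to observed behaviour"]

def aggregate(rater1, rater2):
    """Count criteria that both raters scored as correct (1)."""
    return sum(1 for rule in RULES if rater1[rule] == rater2[rule] == 1)

rater1 = dict(zip(RULES, [1, 1, 0, 1, 1]))
rater2 = dict(zip(RULES, [1, 1, 1, 1, 0]))
print(aggregate(rater1, rater2))  # criteria met according to both raters
```

Other resolution strategies (e.g. discussion until consensus) would change the aggregation step but not its overall shape.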
When statistical significance was corrected for sample size, no difference between the groups was found in the aggregated scores on the five feedback criteria. Several differences of potential practical significance pertaining to the five generally accepted feedback rules were, however, found, all favouring paper-based assessment. For example, the feedback in the web-based group pertaining to the categories ‘Dealing with others’ and ‘Dealing with oneself’ was more often unrelated to observed behaviour. However, in view of the statistical ‘multiple comparisons’ problem, the statistical significance of the between-group differences revealed by these individual analyses is questionable. Finally, the majority of comments relating to the category ‘Dealing with oneself’ consisted of descriptions of a student’s attendance, to the neglect of other aspects of personal functioning.
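The ‘multiple comparisons’ caveat can be made concrete with a small numerical sketch. The counts below are hypothetical (not the study’s data), and the two-proportion z-test with a Bonferroni-corrected alpha is only one common way of handling five parallel criterion-level comparisons.

```python
import math

def two_prop_z(x1, n1, x2, n2):
    """Two-sided two-proportion z-test; returns the p-value."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)  # pooled proportion under H0
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # two-sided p-value via the standard normal CDF
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Hypothetical counts of 'correct' feedback per criterion (web vs. paper):
# (correct_web, n_web, correct_paper, n_paper)
comparisons = {
    "clear and concrete":     (60, 100, 75, 100),
    "constructive":           (70, 100, 72, 100),
    "targets behaviour":      (55, 100, 68, 100),
    "descriptive":            (65, 100, 66, 100),
    "related to observation": (50, 100, 70, 100),
}

alpha = 0.05
bonferroni_alpha = alpha / len(comparisons)  # 0.01 for five tests

for name, (x1, n1, x2, n2) in comparisons.items():
    p = two_prop_z(x1, n1, x2, n2)
    verdict = "significant" if p < bonferroni_alpha else "not significant"
    print(f"{name}: p = {p:.4f} ({verdict} at corrected alpha {bonferroni_alpha})")
```

With the corrected threshold, a per-criterion p-value that looks convincing at 0.05 may no longer qualify, which is exactly the concern raised about the individual criterion-level analyses.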
Feasibility, acceptability and perceived usefulness
[Table: scores on the questionnaire items on the acceptability, feasibility and usefulness of web-based (n = 121) versus paper-based (n = 143) assessment of professional behaviour.] The questionnaire items were:
- The program was easily accessible using the link
- The program/form was easy to use
- The program/form was clear
- Completing the program/form contributed to self-reflection on my personal functioning
- The output (results, report) from the program was clear
- The tutor used the program’s results (report)/form as the basis for discussing professional behaviour in the tutorial group
- Discussing the program’s results (report)/completed form in the tutorial group contributed to self-reflection
- The program’s results (report) increased the usefulness of the professional behaviour evaluation in the tutorial group
- I recognise the strengths and weaknesses identified by my peers and/or tutor
- The time and effort needed to complete the program/form were worthwhile (agree = 1/disagree = 2)
- Time needed to complete the program/form (in minutes)
- Give a mark out of ten for ease of use (whole mark, 1–10)
- Give a mark out of ten for the usefulness of the evaluation of professional behaviour (whole mark, 1–10)
Remarks by students regarding web-based professional behaviour assessment fell into the following categories (number of comments per category not shown):
- No added value
- Difficult to use
- Difficulty with internet access, including late receipt of the link or report
- Some questions need rephrasing
- Electronic process of professional behaviour assessment too standardised
- Hampers provision of feedback
- Rest (including privacy issues, requests for assessment of the tutor’s professional behaviour, and a plea for paper-based professional behaviour assessment)
- Problems with the report, such as printing problems and the presentation of results
- Process of professional behaviour evaluation becomes less personal, more distant
Although other web-based approaches to the assessment of professional behaviour have been studied (Mazor et al. 2007, 2008; Stark et al. 2008) and are in contemporary use (National Board of Medical Examiners 2010), very few studies specifically address the amount and quality of feedback resulting from such an approach. The assessment method currently used at Maastricht medical school requires each student to reflect on their professional behaviour and requires all members of a tutorial group (tutor and students) to provide feedback on the professional behaviour of each student, which is then recorded by the tutor on the assessment form. This process was deliberately mimicked in the web-based instrument, which elicited feedback from students and tutor on the same three categories and items relating to professional behaviour that are included in the paper form.
The study reveals that the number of comments was significantly higher in the web-based group than in the paper-based group. The quality of the feedback, however, did not parallel the quantitative increase. When considering the aggregated scores on the five feedback criteria, no differences in the quality of feedback were found between the groups (Table 3). Nevertheless, the feedback provided by the web-based group showed poorer quality in relation to several feedback criteria (e.g. it was unrelated to the observed behaviour; data not shown). However, as previously mentioned, the statistical significance revealed by these individual analyses is questionable.
Moreover, the survey results on acceptability, feasibility and usefulness of the instruments were strongly in favour of the paper form. It should be noted that this result might be partly due to technical difficulties that were experienced with the web-based instrument despite adequate technical preparation and extensive tutor and student instruction. However, even when these limitations are taken into account, the web-based instrument did not show an improvement in educational impact compared with the existing method of assessing professional behaviour.
Another striking finding, which is unrelated to the nature of the assessment instrument, was the emphasis on attendance in the feedback relating to the category ‘Dealing with oneself’ and the relative absence of feedback on other aspects of self-functioning. An earlier analysis of 4 years of experience at Maastricht with paper-based assessment of professional behaviour had yielded similar findings (van Mook and van Luijk 2010). This suggests that the context (small group sessions) in the earlier years of medical school may be less suited to stimulate self-reflection (van Mook and van Luijk 2010). Perhaps attendance and time management as measures of responsible behaviour should be evaluated separately from feedback on other aspects of professional behaviour, a suggestion that was also put forward during the plenary discussion at a recent symposium on professionalism (Centre for Excellence in Developing Professionalism 2010; van Mook and van Luijk 2010).
Comparison of the results of this study to results reported in the literature is difficult, since few studies on web-based instruments to assess professional behaviour have been published. However, there are some published reports on the development and use of web-based assessment in general (Wheeler et al. 2003; Tabuenca et al. 2007). In one study, implementation of a web-based instrument resulted in a substantial reduction in administration and bureaucracy for course organisers and proved to be a valuable research tool, while students and teachers were overwhelmingly in favour of the new course structure (Wheeler et al. 2003). Another study described a successful multi-institutional validation of a web-based core competency assessment system in surgery (Tabuenca et al. 2007). However, the transferability of these more general studies to web-based self- and peer assessment of professional behaviour seems limited. Studies addressing the NBME’s APB program, however, report similarly promising results, for example improved faculty comfort and self-assessed skill in giving feedback about professionalism (Stark et al. 2008). It therefore seems advisable to conduct further studies to examine the effectiveness and optimal use of web-based assessment of professional behaviour.
Although the literature pertaining to web-based assessment is sparse, the peer assessment literature provides evidence of the importance of anonymity, or at least confidentiality, for the acceptance of peer assessment (Arnold et al. 2005; Shue et al. 2005). That is why we used a semi-anonymous feedback procedure in the web-based assessment in this study. Although reliability can be enhanced by increasing the number of raters (Ramsey et al. 1993; Dannefer et al. 2005), the desired number of raters may not be feasible or acceptable, for example due to time constraints. Consequently, in the current study we limited the number of peer raters to five randomly selected students (Ramsey et al. 1993; Dannefer et al. 2005). It seems reasonable for medical schools to base the selection of peer raters on practical and logistical considerations (Arnold et al. 1981; Arnold and Stern 2006; Lurie, Nofziger et al. 2006a, b), since bias due to rater selection has been shown not to affect peer assessment results (Lurie, Nofziger et al. 2006a, b). Although some anticipated problems could thus be adequately addressed, mention must be made of some remaining limitations of the current study.
In the preparation phase we were confronted with limited availability of tools for web-based multisource feedback. Because re-designing an existing tool proved costly, a feature that was superfluous for the purpose of this study (the Likert scales) was left unchanged, which may have influenced the results, for instance those relating to time investment. The content of the web-assisted and paper versions of the instrument was otherwise identical. Furthermore, it cannot be excluded that the results were negatively affected by the participants’ unfamiliarity with web-based assessment instruments, even though the implementation process was carefully prepared based on feedback from a pilot study; ample time was spent on technical preparation and the participants received information and instruction on multiple occasions. In addition, the tutor’s ability to omit redundant comments and to make comments more concrete before recording them, as well as the more limited space on the paper form, may have contributed to the lower number and/or higher quality of the comments in the paper-based group. Finally, automated data extraction only enabled feedback analysis at the level of the whole year group, although analysis at the level of individuals (students, tutors) or tutorial groups would have been preferable.
The results revealed that a confidential web-based assessment instrument for professional behaviour yielded a significantly higher number of comments than the traditional paper-based assessment. The quality of the feedback obtained with the web-based instrument, as measured by several generally accepted feedback criteria, was comparable. However, judging by the questionnaire results, students strongly favoured the traditional paper-based method. The interpersonal nature of professional behaviour prompted comments that it was eminently suitable for face-to-face, ‘en groupe’ discussion and assessment. Although teachers and students are nowadays preferably ‘wired for learning’, it seems that, so far, professional behaviour assessment does not necessarily require advanced assessment technologies, even though such ‘innovative’ electronic and/or web-based assessment methods do result in more feedback of comparable quality. Their exact position within the current labour-intensive, traditional assessment armamentarium needs to be the subject of further study.
The authors wish to thank the following colleagues for their contributions. Kirsten Thijsen, medical student, Maastricht University, for her contributions to data entry from the paper forms; Renee Stalmeijer, Department of Educational Development and Research, Maastricht University, for her help in developing the survey; and all students and tutors in block 2.2 in the 2008–2009 academic year for participating in this study. Furthermore, the authors thank Ms. Mereke Gorsira, Department of Educational Development and Research, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, The Netherlands, for critically reviewing the manuscript regarding use of the English language. Finally, the authors are indebted to Mrs. Anita Legtenberg, data manager MEMIC, Maastricht University, The Netherlands, for assistance in managing the dataset from the web-based groups.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
- Arnold, L., & Stern, D. (2006). Content and context of peer assessment. In D. T. Stern (Ed.), Measuring medical professionalism. New York: Oxford University Press. ISBN-13: 978-0-19-517226-3.
- Arnold, L., Willoughby, L., et al. (1981). Use of peer evaluation in the assessment of medical students. Journal of Medical Education, 56(1), 35–42.
- Centre for Excellence in Developing Professionalism (2010). Professionalism or post-professionalism? Four years of the centre for excellence. Liverpool, UK (chair H. O’Sullivan).
- Cohen, J. (1987). Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Erlbaum. ISBN 0-8058-0283-5.
- De Leng, B. (2009). Wired for learning: How computers can support interaction in small group learning in higher education. Thesis. Mediview, Maastricht. ISBN 978-90-77201-35-0.
- Evans, R., Elwyn, G., et al. (2004). Review of instruments for peer assessment of physicians. British Medical Journal, 328(7450), 1240.
- Fingertips. http://www.fngtps.com/work. Accessed 16 Mar 2011.
- McKinstry, B. (2007). BEME Guide No. 10: A systematic review of the literature on the effectiveness of self-assessment in clinical education. http://www.bemecollaboration.org/beme/files/BEME%20Guide%20No%2010/BEMEFinalReportSA240108.pdf. Accessed 10 Oct 2008.
- National Board of Medical Examiners (2010). Assessment of professional behaviors program. http://www.nbme.org/schools/apb/index.html. Accessed 24 Sep 2010.
- Papadakis, M. A., Arnold, G. K., et al. (2008). Performance during internal medicine residency training and subsequent disciplinary action by state licensing boards. Annals of Internal Medicine, 148(11), 869–876.
- Pendleton, D., Schofield, T., et al. (1984). A method for giving feedback. In The consultation: An approach to learning and teaching (pp. 68–71). Oxford: Oxford University Press.
- Project Team Consilium Abeundi, van Luijk, S. J. (Ed.) (2005). Professional behaviour: Teaching, assessing and coaching students. Final report and appendices. Mosae Libris.
- SPSS, Inc. (2007). SPSS 16.0.1.
- van Mook, W. N., van Luijk, S. J., et al. (2010). Combined formative and summative professional behaviour assessment approach in the bachelor phase of medical school: A Dutch perspective. Medical Teacher, 32, e517–e531.