Evidence Based Library and Information Practice 2011, 6.1

Evidence Summary

Statistical Measures Alone Cannot Determine Which Database (BNI, CINAHL, MEDLINE, or EMBASE) Is the Most Useful for Searching Undergraduate Nursing Topics

A Review of:
Stokes, P., Foster, A., & Urquhart, C. (2009). Beyond relevance and recall: Testing new user-centred measures of database performance. Health Information and Libraries Journal, 26(3), 220-231.

Reviewed by:
Giovanna Badia
Librarian, Royal Victoria Hospital Medical Library
McGill University Health Centre
Montreal, Quebec, Canada
Email: giovanna.badia@mail.mcgill.ca

Received: 15 Dec. 2010 Accepted: 23 Feb. 2011

© 2011 Badia. This is an Open Access article distributed under the terms of the Creative Commons-Attribution-Noncommercial-Share Alike License 2.5 Canada (http://creativecommons.org/licenses/by-nc-sa/2.5/ca/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly attributed, not used for commercial purposes, and, if transformed, the resulting work is redistributed under the same or similar license to this one.

Abstract

Objective – The research project sought to determine which of four databases was the most useful for searching undergraduate nursing topics.

Design – Comparative database evaluation.

Setting – Nursing and midwifery students at Homerton School of Health Studies (now part of Anglia Ruskin University), Cambridge, United Kingdom, in 2005-2006.

Subjects – The subjects were four databases: British Nursing Index (BNI), CINAHL, MEDLINE, and EMBASE.

Methods – This was a comparative study using title searches to compare BNI, CINAHL, MEDLINE, and EMBASE. According to the authors, this is the first study to compare BNI with other databases. BNI is a database produced by British libraries that indexes the nursing and midwifery literature.
It covers over 240 British journals, and includes references to articles from health sciences journals that are relevant to nurses and midwives (British Nursing Index, n.d.). The researchers performed keyword searches in the title field of the four databases for the dissertation topics of nine nursing and midwifery students enrolled in undergraduate dissertation modules. The lists of titles of journal articles on their topics were given to the students, who were asked to judge the relevancy of the citations. The title searches were evaluated in each of the databases using the following criteria:

• precision (the number of relevant results obtained in the database for a search topic, divided by the total number of results obtained in the database search);
• recall (the number of relevant results obtained in the database for a search topic, divided by the total number of relevant results obtained on that topic from all four database searches);
• novelty (the number of relevant results that were unique in the database search, which was calculated as a percentage of the total number of relevant results found in the database);
• originality (the number of unique relevant results obtained in the database for a search topic, which was calculated as a percentage of the total number of unique results found in all four database searches);
• availability (the number of relevant full text articles obtained from the database search results, which was calculated as a percentage of the total number of relevant results found in the database);
• retrievability (the number of relevant full text articles obtained from the database search results, which was calculated as a percentage of the total number of relevant full text articles found from all four database searches);
• effectiveness (the probable odds that a database will obtain relevant search results);
• efficiency (the probable odds that a database will obtain both unique and relevant search results); and
• accessibility (the probable odds that the full text of the relevant references obtained from the database search is available electronically or in print via the user’s library).

Students decided whether the search results were relevant to their topic by using a “yes/no” scale. Only record titles were used to make relevancy judgments.

Main Results – Friedman’s Test and odds ratios were used to compare the performance of BNI, CINAHL, MEDLINE, and EMBASE when searching for information about nursing topics. These two statistical measures demonstrated the following:

• BNI had the best average score for the precision, availability, effectiveness, and accessibility of search results;
• CINAHL scored the highest for the novelty, retrievability, and efficiency of results, and ranked second for all the other criteria;
• MEDLINE excelled in the areas of recall and originality, and ranked second for novelty and retrievability; and
• EMBASE did not obtain the highest or second-highest score for any of the criteria.

Conclusion – According to the authors, these results suggest that none of the databases studied can be considered the most useful for searching undergraduate nursing topics. CINAHL and MEDLINE emerge as consistently good performers, but both databases are needed to find relevant material on a topic. Friedman’s Test clearly differentiated between the databases for the accessibility of search results. Odds ratio testing may help librarians make decisions about database purchases. BNI scored the highest for availability of results and CINAHL ranked the highest for retrievability. Statistical measures need to be supplemented with qualitative data about user preferences in order to determine which database is the most useful to our users.
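The ratio-based criteria listed under Methods, and the idea of comparing one database against the pool, can be illustrated with a short sketch. This is not the authors' procedure: the `Record` fields, the function names, the two toy databases, and the choice to treat a record's title as its identity are all assumptions made for illustration. Precision and recall are reported as percentages here for uniformity, originality is left out, and the odds ratio assumes that "the pool" means the results of the other databases combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One search result. All field names are illustrative assumptions."""
    title: str        # used as the record's identity across databases
    relevant: bool    # the student's yes/no judgment
    full_text: bool   # full text obtainable through the user's library


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage, or 0.0 when whole is zero."""
    return 100.0 * part / whole if whole else 0.0


def ratio_criteria(results: dict[str, list[Record]], db: str) -> dict[str, float]:
    """Compute five ratio-based criteria for one database.

    Assumes titles are unique within each database's result list.
    """
    own = results[db]
    own_rel = {r.title for r in own if r.relevant}
    other_rel = {r.title for name, recs in results.items() if name != db
                 for r in recs if r.relevant}
    own_ft = {r.title for r in own if r.relevant and r.full_text}
    all_ft = {r.title for recs in results.values()
              for r in recs if r.relevant and r.full_text}
    return {
        "precision": pct(len(own_rel), len(own)),
        "recall": pct(len(own_rel), len(own_rel | other_rel)),
        "novelty": pct(len(own_rel - other_rel), len(own_rel)),
        "availability": pct(len(own_ft), len(own_rel)),
        "retrievability": pct(len(own_ft), len(all_ft)),
    }


def relevance_odds_ratio(results: dict[str, list[Record]], db: str) -> float:
    """Odds of a relevant result in `db` divided by the odds in the pooled others.

    Assumption: the pool is the concatenated results of every other database.
    Raises ZeroDivisionError if a group has no irrelevant records or the
    pool has no relevant ones.
    """
    def odds(recs: list[Record]) -> float:
        hits = sum(r.relevant for r in recs)
        return hits / (len(recs) - hits)

    pool = [r for name, recs in results.items() if name != db for r in recs]
    return odds(results[db]) / odds(pool)


# Hypothetical data for two made-up databases (not taken from the study).
sample = {
    "DB_A": [Record("t1", True, True), Record("t2", True, False),
             Record("t3", False, False), Record("t4", False, False)],
    "DB_B": [Record("t2", True, False), Record("t5", True, True),
             Record("t6", False, False)],
}

if __name__ == "__main__":
    for name in sample:
        print(name, ratio_criteria(sample, name), relevance_odds_ratio(sample, name))
```

An odds ratio above 1 would suggest that the database returns relevant results at better odds than the pool, and a value below 1 the opposite. As the Commentary notes, such a figure compares each database with the pool rather than ranking the databases against each other.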
Commentary

This study contributed to the existing literature in that it was the first study to compare BNI, CINAHL, MEDLINE, and EMBASE, and the first one to combine the novelty, originality, availability, and retrievability of search results with the traditional testing criteria of precision and recall to compare database performance. Its findings confirmed what is already known, i.e., “that searching a single database is likely to miss relevant articles, and that some databases may be in general good performers” (p. 230).

The statistical measures used for comparative database evaluation, i.e., Friedman’s Test and odds ratios, could not determine which database was the most useful. This reviewer questions whether the odds ratio was an appropriate statistical test to compare BNI, CINAHL, MEDLINE, and EMBASE, since the authors state that “the odds ratio is comparing each database individually against the pool of data; it does not compare the four databases with each other” (pp. 229-230). The authors also suggest that odds ratios may assist in the selection of databases to purchase, but do not explain how.

The study would have benefited from including a brief description of Friedman’s Test and odds ratios, as well as an explanation of how the data from both tests were combined. A table with the raw data from the searches could also have been included in the article itself. Unfortunately, the appendix containing the search data is no longer available on the publisher’s website. The missing appendix and the lack of sufficient explanatory details in the article make it difficult to replicate, or completely understand, the study’s research methodology.
It is also difficult to generalize the study’s findings due to: its small sample size (i.e., nine students’ topics); the use of keyword searching in the title field to obtain relevant results, which may not be a user’s typical searching behaviour; and the use of database testing criteria that are dependent on an individual library’s subscriptions rather than on database search performance (i.e., the use of the availability, retrievability, and accessibility criteria).

Despite its weaknesses, this study reminds librarians that precision and recall are not the only criteria that should be used to measure the performance of a database.

References

British Nursing Index (n.d.). About BNI. Retrieved 20 Feb. 2011 from http://www.bniplus.co.uk/about_bni.html