ISDS Annual Conference Proceedings 2012. This is an Open Access article distributed under the terms of the Creative Commons Attribution-Noncommercial 3.0 Unported License (http://creativecommons.org/licenses/by-nc/3.0/), permitting all non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

ISDS 2012 Conference Abstracts

Data Quality: A Systematic Review of the Biosurveillance Literature

Tera Reynolds*1, Ian Painter2 and Laura Streichert1

1International Society for Disease Surveillance, Brighton, MA, USA; 2University of Washington, Seattle, WA, USA

Objective
To highlight how data quality has been discussed in the biosurveillance literature in order to identify current gaps in knowledge and areas for future research.

Introduction
Data quality monitoring is necessary for accurate disease surveillance. However, it can be challenging, especially when “real-time” data are required. Data quality has been broadly defined as the degree to which data are suitable for use by data consumers [1]. When quality is compromised at any point in a health information system, the resulting low-quality data can impair the detection of data anomalies, delay the response to emerging health threats [2], and result in inefficient use of staff and financial resources. While the impacts of poor data quality on biosurveillance are largely unknown, and vary depending on field and business processes, the information management literature includes estimates of increased costs amounting to 8-12% of organizational revenue and, in general, poorer decisions that take longer to make [3].

Methods
To fill an unmet need, a literature review was conducted using a structured matrix based on the following predetermined questions:
-How has data quality been defined and/or discussed?
-What measurements of data quality have been utilized?
-What methods for monitoring data quality have been utilized?
-What methods have been used to mitigate data quality issues?
-What steps have been taken to improve data quality?

The search included PubMed, ISDS and AMIA Conference Proceedings, and reference lists. PubMed was searched using the terms “data quality,” “biosurveillance,” “information visualization,” “quality control,” “health data,” and “missing data.” The titles and abstracts of all search results were assessed for relevance, and relevant articles were reviewed using the structured matrix.

Results
The completeness of data capture is the most commonly measured dimension of data quality discussed in the literature (other dimensions include timeliness and accuracy). The methods for detecting data quality issues fall into two broad categories: (1) methods for regular monitoring to identify data quality issues and (2) methods that are utilized for ad hoc assessments of data quality. Methods for regular monitoring of data quality are more likely to be automated and focused on visualization, compared with the methods described as part of special evaluations or studies, which tend to include more manual validation.

Improving data quality involves the identification and correction of data errors that already exist in the system using either manual or automated data cleansing techniques [4]. Several methods of improving data quality were discussed in the public health surveillance literature, including development of an address verification algorithm that identifies an alternative, valid address [5], and manual correction of the contents of databases [6].

Communication with the data entry personnel or data providers, either on a regular basis (e.g., annual report) or when systematic data entry errors are identified, was the step most commonly mentioned in the literature for preventing data quality issues.
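
To make the two most frequently measured dimensions concrete, the following is a minimal sketch of how completeness and timeliness might be computed for a batch of surveillance records. It is an illustration only and is not taken from any of the reviewed articles: the record layout, the field names (patient_zip, chief_complaint, visit_date, received_date), and the helper functions are all hypothetical, and the choice to treat blank strings as missing is an assumption.

```python
from datetime import date
from statistics import median

# Hypothetical surveillance records; field names and values are illustrative only.
records = [
    {"patient_zip": "98101", "chief_complaint": "fever",
     "visit_date": date(2012, 11, 1), "received_date": date(2012, 11, 2)},
    {"patient_zip": "", "chief_complaint": "cough",
     "visit_date": date(2012, 11, 1), "received_date": date(2012, 11, 4)},
    {"patient_zip": "98105", "chief_complaint": None,
     "visit_date": date(2012, 11, 2), "received_date": date(2012, 11, 3)},
    {"patient_zip": "98109", "chief_complaint": "rash",
     "visit_date": date(2012, 11, 3), "received_date": date(2012, 11, 3)},
]


def is_missing(value):
    """Treat None and blank strings as missing (an assumption of this sketch)."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def completeness(rows, field):
    """Return the fraction of rows in which `field` is populated."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if not is_missing(row.get(field)))
    return filled / len(rows)


def reporting_lags_days(rows, event_field, received_field):
    """Return the lag in days between the event and its receipt, per row.

    Rows missing either date are skipped rather than counted as zero lag.
    """
    lags = []
    for row in rows:
        event = row.get(event_field)
        received = row.get(received_field)
        if is_missing(event) or is_missing(received):
            continue
        lags.append((received - event).days)
    return lags


zip_completeness = completeness(records, "patient_zip")
complaint_completeness = completeness(records, "chief_complaint")
lags = reporting_lags_days(records, "visit_date", "received_date")
median_lag = median(lags)

print(f"patient_zip completeness: {zip_completeness:.0%}")
print(f"chief_complaint completeness: {complaint_completeness:.0%}")
print(f"median reporting lag (days): {median_lag}")
```

Run on a schedule, summaries of this kind correspond to the regular monitoring category described above, while a one-off run against a historical extract is closer to an ad hoc assessment. Accuracy is not covered here because it generally requires comparison against a reference source.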

Conclusions
In reviewing the biosurveillance literature in the context of the data quality field, the largest gap appears to be that the data quality methods discussed in the literature are often ad hoc and not consistently implemented. Developing a data quality program to identify the causes of lower quality health data, address data quality problems, and prevent issues would allow public health departments to more efficiently and effectively conduct biosurveillance and to apply results to improving public health practice.

Keywords
Biosurveillance; Data quality; Literature review

Acknowledgments
We thank the ISDS Data Quality Workgroup for initiating this project, which was supported by CDC through contract with the Task Force for Global Health.

References
1. Wang RY, Strong DM. Beyond accuracy: What data quality means to data consumers. JMIS. 1996:5–33.
2. Dixon BE, McGowan JJ, Grannis SJ. Electronic Laboratory Data Quality and the Value of a Health Information Exchange to Support Public Health Reporting Processes. Proc AMIA Symp. 2011;2011:322.
3. Redman TC. The impact of poor data quality on the typical enterprise. Commun ACM. 1998;41(2):79–82.
4. Maydanchik A. Data Quality Assessment. Technics Publications, LLC; 2007.
5. Zinszer K, Charland K, Jauvin C, et al. The influence of address errors on detecting outbreaks of campylobacteriosis. Emerg Health Threats J. 2011;4(s59):68–69.
6. Chen L, Dubrawski A, Waidyanatha N, Weerasinghe C. Automated detection of data entry errors in a real time surveillance system. Emerg Health Threats J. 2011;4(s69):9–10.

*Tera Reynolds
E-mail: treynolds@syndromic.org

Online Journal of Public Health Informatics * ISSN 1947-2579 * http://ojphi.org * 5(1):e20, 2013