Developments in technology over the past 30 years have led us to an era of data collection. In healthcare, we see this as the routine computerized collection of health information. Data from administrative databases, electronic medical records (EMRs), prescription dispensing records, specially designed registries, and hospital billing records can serve as an abundant source of information that can be harnessed to answer questions about clinical effectiveness,1 safety, financing,2 knowledge translation,3 and delivery of care.4 Utilizing secondary sources of patient data provides a unique opportunity to conduct large epidemiologic studies within a short time frame, for a fraction of the cost of investigations using prospectively collected data. The sheer efficiency of using secondary sources of data has led to their increasing availability and utilization in clinical research. Despite these practical advantages, such data sources have considerable limitations that require careful attention when designing these studies and interpreting their results.
Utilization of routinely collected medical data provides access to a wealth of information, and often results in datasets with patients numbering in the thousands.5 Since large sample sizes increase the likelihood of spuriously significant findings,6 the sheer size of such datasets provides unique challenges for appraisal and interpretation. Larger studies can magnify bias and the play of chance,6 so it becomes critical to appraise the quality of the data collected. Protocol-based studies utilize specially trained research staff who collect data according to standardized procedures, with the purpose of ensuring high-quality data collection.7 This process is very different from the way information accumulates in secondary-source databases, where data are collected and recorded for numerous reasons. For instance, data stored within patient EMRs are influenced by the clinical and prescribing practices of the attending physicians, and thus reflect the biases inherent in the training, experiences, and values held by each physician, as well as the differences between information systems used by clinics and hospitals.8 Differences in physician, community, administrative, billing, or hospital practices may result in data inconsistencies and ultimately bias the findings. This phenomenon has been demonstrated before: studies comparing standardized patients to abstraction of medical charts for measuring quality of outpatient care found that medical charts underestimated quality of care by 10% to 75%.9-13 Trained research staff evaluating patients according to a standardized protocol, using structured case-report forms, have been shown to collect more accurate data than busy clinicians.14 In addition, prospectively collected data provide the opportunity for trained staff to be blinded to treatment allocation in circumstances where the patients and clinicians cannot be.
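The point about large samples magnifying small biases can be made concrete with a quick simulation. This is a hypothetical sketch, not drawn from any of the cited studies: it imagines a small systematic measurement bias (0.05 standard deviations) between two otherwise identical groups, and shows how a simple z-test treats it at different sample sizes.

```python
import math
import random

random.seed(1)

def two_sample_p(n, bias):
    """Simulate two groups whose true means differ only by a small
    systematic bias, then return the two-sided z-test p-value
    (assuming known unit variance in both groups)."""
    a = [random.gauss(0.0, 1.0) for _ in range(n)]
    b = [random.gauss(bias, 1.0) for _ in range(n)]
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    se = math.sqrt(2.0 / n)  # standard error of the difference in means
    z = (mean_b - mean_a) / se
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value

# The same tiny bias is usually invisible at n = 100 ...
print(two_sample_p(100, 0.05))
# ... but becomes overwhelmingly "significant" at n = 100,000,
# even though nothing clinically real separates the groups.
print(two_sample_p(100_000, 0.05))
```

Nothing about the bias changed between the two calls; only the sample size did. This is why a highly significant p-value from an enormous administrative dataset says little about whether the underlying data are trustworthy.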
Performance characteristics of variables collected by secondary sources of data are an important topic rarely addressed by investigators. Administrative coding systems such as the International Classification of Diseases (ICD) index were developed to capture the presence of specific disorders, transforming information from complex clinical environments into pre-specified categorizations. Such coding systems are subject to the same diagnostic performance as any medical test, and it is critical that we understand their ability to accurately represent the disease of interest. Validation studies evaluating the performance characteristics of different secondary-source coding systems are being undertaken,15,16 finding poor sensitivity and specificity in some cases.17 Errors in measurement can result in misclassification of patients, impairing our ability to draw accurate inferences and to ascertain eligibility criteria, study outcome(s), exposure(s) of interest, and variables of prognostic importance that require risk adjustment. Thus, it is important to understand how well routinely collected data perform at accurately identifying the exposure(s) and outcome(s) of interest. While we often accurately identify more serious disorders (HIV, cancer) within secondary data sources, we can miss milder, non-life-threatening events; this is known as spectrum bias. The sometimes poor performance of routinely collected data is captured well by a recent study that linked data from primary care electronic health records, hospital admissions, and a registry of acute coronary syndrome, finding that fewer than one third of patients with non-fatal myocardial infarctions could be identified as such across all three sources.18
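How much these performance characteristics matter follows directly from Bayes' rule: even a code with impressive sensitivity and specificity can have a poor positive predictive value (PPV) when the condition is uncommon in the database. The figures below are invented for illustration, not taken from any of the validation studies cited above.

```python
def ppv(sens, spec, prev):
    """Positive predictive value of a diagnostic code, from Bayes' rule:
    P(disease | code positive) = TP / (TP + FP)."""
    tp = sens * prev               # true positives per patient screened
    fp = (1 - spec) * (1 - prev)   # false positives per patient screened
    return tp / (tp + fp)

# A code with 90% sensitivity and 95% specificity looks excellent on paper,
# but its PPV collapses as the condition becomes rarer in the database:
print(round(ppv(0.90, 0.95, 0.20), 3))  # prints 0.818
print(round(ppv(0.90, 0.95, 0.01), 3))  # prints 0.154
```

At 1% prevalence, roughly five of every six patients flagged by this hypothetical code would not actually have the disease, which is exactly the kind of misclassification that can distort eligibility, exposures, and outcomes in a database study.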
The sometimes poor performance characteristics of data from secondary sources, in combination with the lack of caution displayed when interpreting these studies, is concerning. A recent randomized trial evaluating the impact of statins on reducing COPD exacerbations highlights the problems associated with using large administrative databases: this trial overturned the results of a high-impact observational study using secondary-sourced data.19 The trial by Criner et al. (2014) received considerable attention after publishing results that conflicted with widely accepted previous work,20 finding that participants randomized to statins had the same rates of COPD exacerbations as those randomized to placebo.19 The large (n = 76,232) retrospective cohort study utilizing secondary-source data had shown that participants on statins had significantly lower rates of COPD-related mortality.20 In contrast to the stringently designed trial by Criner et al. (2014), the retrospective cohort study used the Lovelace Patient Database, a longitudinal, de-identified database comprising prescription information and insurance records of healthcare encounters from several health maintenance organizations.20 The authors had neither a pre-specified analysis plan nor any description of the performance characteristics of the outcome variable (COPD mortality identified by ICD-9-CM 490-496 coding).20 The investigators only briefly mentioned ICD-9-CM codes as a possible source of bias in the limitations section of the discussion.
There was also a cardiovascular risk imbalance between the intervention groups.20 Statins are indicated for populations with cardiovascular disease (CVD), so there should have been adjustment to account for the higher rates of CVD among the patients receiving statins, especially given the strong relationship between COPD and CVD.21 The trial addressed these problems by excluding patients already receiving statins for CVD and patients whose CVD profile required statins.19 However, such exclusions are not always possible, and it is important to remember that we are limited by what is collected in the secondary-source database.
To improve our confidence in studies using secondary sources of data, it is important that we describe the performance characteristics of any coding system or routinely collected variable used in the study. Citing previous work demonstrating the sensitivity, specificity, positive predictive value, or negative predictive value of the routinely collected variable against a gold standard test for the outcome/exposure of interest strengthens confidence in the work. To address problems with data validity, authors can also properly adjust for important confounders and perform additional sensitivity analyses to ensure the robustness of the findings. One study, by Gershon et al. (2014), provides an exemplar of the strong analytical assessment these studies should be subject to.22 This recent Canadian study, conducted using Ontario's linked administrative databases, compared long-acting beta2-agonists (LABAs) alone to combination long-acting beta2-agonists/inhaled corticosteroids (LABA/ICS).22 Both treatments are used to treat COPD, but LABA/ICS combinations are more likely to be used in severe cases than LABAs alone. Confounding would therefore be expected to make patients treated with LABA/ICS appear to have worse outcomes than those treated with LABAs alone. In this comparison, however, LABA/ICS performed better than LABAs alone, strengthening our confidence in the results.22 Further, the authors made excellent use of a priori designed secondary and sensitivity analyses, lending additional credibility to their findings.22 They used propensity score matching to compare patients with similar observed characteristics (all of whom were eligible for LABAs or LABAs and ICSs),22 a method that assured stronger prognostic balance between groups.
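The core of a propensity score analysis — pairing each treated patient with a control whose estimated probability of treatment is similar — can be sketched in a few lines. This is a minimal greedy nearest-neighbour illustration with a caliper, not the actual matching procedure used by Gershon et al.; the patient IDs and scores are invented.

```python
def greedy_caliper_match(treated, control, caliper=0.05):
    """1:1 greedy nearest-neighbour matching on the propensity score.
    `treated` and `control` are lists of (patient_id, score) pairs;
    a treated patient is left unmatched if no control falls within
    `caliper` of their score. Each control is used at most once."""
    matches = []
    available = dict(control)  # control id -> propensity score
    for pid, score in sorted(treated, key=lambda pair: pair[1]):
        best_id, best_dist = None, caliper
        for cid, cscore in available.items():
            dist = abs(score - cscore)
            if dist <= best_dist:
                best_id, best_dist = cid, dist
        if best_id is not None:
            matches.append((pid, best_id))
            del available[best_id]  # each control matched only once
    return matches

treated = [("t1", 0.30), ("t2", 0.62), ("t3", 0.90)]
control = [("c1", 0.28), ("c2", 0.60), ("c3", 0.10)]
print(greedy_caliper_match(treated, control))
# prints [('t1', 'c1'), ('t2', 'c2')] — t3 has no control within the caliper
```

Note that `t3` goes unmatched: a caliper trades sample size for prognostic balance, discarding treated patients who have no comparable control, which is precisely what makes the matched comparison more credible than a naive one.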
The presence of bias can always impact the results of observational research, making the design and analysis of these studies a unique challenge. There are many strategies for dealing with bias in observational studies; however, the quality of the data remains a central pillar, and poor data quality can at times render the results invalid. Thus, in the absence of prospectively collected high-quality data, we are required to focus on ways to identify and adjust for data inadequacies.
Stay tuned for the next blog post in the series from Dr Brittany!
1. Mamdani M, Juurlink DN, Kopp A, Naglie G, Austin PC, Laupacis A. Gastrointestinal bleeding after the introduction of COX 2 inhibitors: ecological study. BMJ. 2004;328(7453):1415-1416.
2. Stukel TA, Fisher ES, Alter DA, et al. Association of hospital spending intensity with mortality and readmission rates in Ontario hospitals. JAMA. 2012;307(10):1037-1045.
3. Juurlink DN, Mamdani MM, Lee DS, et al. Rates of hyperkalemia after publication of the Randomized Aldactone Evaluation Study. N Engl J Med. 2004;351(6):543-551.
4. Mamdani M, Warren L, Kopp A, et al. Changes in rates of upper gastrointestinal hemorrhage after the introduction of cyclooxygenase-2 inhibitors in British Columbia and Ontario. CMAJ. 2006;175(12):1535-1538.
5. Joynt KE, Orav EJ, Jha AK. Mortality rates for Medicare beneficiaries admitted to critical access and non-critical access hospitals, 2002-2010. JAMA. 2013;309(13):1379-1387.
6. Ioannidis JP. Large scale evidence and replication: insights from rheumatology and beyond. Ann Rheum Dis. 2005;64(3):345-346.
7. Nagurney JT, Brown DF, Sane S, Weiner JB, Wang AC, Chang Y. The accuracy and completeness of data collected by prospective and retrospective methods. Acad Emerg Med. 2005;12(9):884-895.
8. Parsons A, McCullough C, Wang J, Shih S. Validity of electronic health record-derived quality measurement for performance monitoring. J Am Med Inform Assoc. 2012;19(4):604-609.
9. Luck J, Peabody JW, Dresselhaus TR, Lee M, Glassman P. How well does chart abstraction measure quality? A prospective comparison of standardized patients with the medical record. Am J Med. 2000;108(8):642-649.
10. Peabody JW, Luck J, Glassman P, Dresselhaus TR, Lee M. Comparison of vignettes, standardized patients, and chart abstraction: a prospective validation study of 3 methods for measuring quality. JAMA. 2000;283(13):1715-1722.
11. Peabody JW, Luck J, Glassman P, et al. Measuring the quality of physician practice by using clinical vignettes: a prospective validation study. Ann Intern Med. 2004;141(10):771-780.
12. Norman GR, Neufeld VR, Walsh A, Woodward CA, McConvey GA. Measuring physicians' performances by using simulated patients. J Med Educ. 1985;60(12):925-934.
13. Rethans JJ, Martin E, Metsemakers J. To what extent do clinical notes by general practitioners reflect actual medical performance? A study using simulated patients. Br J Gen Pract. 1994;44(381):153-156.
14. Mills AM, Dean AJ, Shofer FS, et al. Inter-rater reliability of historical data collected by non-medical research assistants and physicians in patients with acute abdominal pain. West J Emerg Med. 2009;10(1):30-36.
15. Goldstein LB. Accuracy of ICD-9-CM coding for the identification of patients with acute ischemic stroke: effect of modifier codes. Stroke. 1998;29(8):1602-1604.
16. Guevara RE, Butler JC, Marston BJ, Plouffe JF, File TM Jr, Breiman RF. Accuracy of ICD-9-CM codes in detecting community-acquired pneumococcal pneumonia for incidence and vaccine efficacy studies. Am J Epidemiol. 1999;149(3):282-289.
17. van de Garde EM, Oosterheert JJ, Bonten M, Kaplan RC, Leufkens HG. International classification of diseases codes showed modest sensitivity for detecting community-acquired pneumonia. J Clin Epidemiol. 2007;60(8):834-838.
18. Herrett E, Shah AD, Boggon R, et al. Completeness and diagnostic validity of recording acute myocardial infarction events in primary care, hospital care, disease registry, and national mortality records: cohort study. BMJ. 2013;346:f2350.
19. Criner GJ, Connett JE, Aaron SD, et al. Simvastatin for the prevention of exacerbations in moderate-to-severe COPD. N Engl J Med. 2014;370(23):2201-2210.
20. Frost FJ, Petersen H, Tollestrup K, Skipper B. Influenza and COPD mortality protection as pleiotropic, dose-dependent effects of statins. Chest. 2007;131(4):1006-1012.
21. Mannino DM, Thorn D, Swensen A, Holguin F. Prevalence and outcomes of diabetes, hypertension and cardiovascular disease in COPD. Eur Respir J. 2008;32(4):962-969.
22. Gershon AS, Campitelli MA, Croxford R, et al. Combination long-acting beta-agonists and inhaled corticosteroids compared with long-acting beta-agonists alone in older adults with chronic obstructive pulmonary disease. JAMA. 2014;312(11):1114-1121.