Professor, Harvard Medical School and Harvard Pilgrim Health Care Institute, United States
Background: Pharmacoepidemiologic studies are increasingly conducted within linked databases, often to obtain richer confounder data. However, the potential for selection bias is frequently overlooked when linked data are available for only a subset of the study population.
Objectives: To highlight the importance of accounting for potential selection bias by evaluating the association between antipsychotics and type 2 diabetes (T2D) in youths within a claims database linked to a smaller laboratory database.
Methods: We defined the study population within the MarketScan commercial claims database (2010-2019) as youths (5-24 years) who initiated an antipsychotic medication or a control psychotropic medication. To obtain additional confounder data, we identified a smaller laboratory database and linked records across data sources for the subset of patients who appeared in both databases. We considered three definitions of the “linked cohort”: 1) linked patients with any laboratory data during the study period; 2) linked patients with any laboratory data during the covariate assessment period; and 3) linked patients with complete confounder data during the covariate assessment period.
We used inverse probability of treatment weights (IPTW) to control for confounding. In analyses restricted to the linked cohorts, we applied inverse probability of selection weights (IPSW) to account for potential selection bias. We then used pooled logistic regression, weighted by IPTW alone or by both IPTW and IPSW, to estimate treatment effects.
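The weighting scheme can be illustrated with a minimal sketch on simulated data. It is not the study's implementation. All variable names, the single binary confounder, and the simulated probabilities are assumptions made for illustration. The weights are unstabilized, the outcome model collapses follow-up into a single interval rather than pooling person-time records, and a small hand-written Newton-Raphson solver stands in for a statistics package.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, w=None, n_iter=25):
    """Weighted logistic regression by Newton-Raphson (X includes an intercept)."""
    w = np.ones(len(y)) if w is None else w
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        grad = X.T @ (w * (y - p))
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Simulated full cohort (all values hypothetical).
rng = np.random.default_rng(0)
n = 20000
metabolic = rng.binomial(1, 0.2, n)                              # confounder
treated = rng.binomial(1, np.where(metabolic == 1, 0.3, 0.1))    # antipsychotic vs control
linked = rng.binomial(1, np.where(metabolic == 1, 0.30, 0.05))   # appears in lab database
outcome = rng.binomial(1, expit(-3.0 + 0.8 * treated + 1.0 * metabolic))

X_cov = np.column_stack([np.ones(n), metabolic])

# IPTW: inverse of the probability of the treatment actually received.
ps = expit(X_cov @ fit_logistic(X_cov, treated))
iptw = np.where(treated == 1, 1.0 / ps, 1.0 / (1.0 - ps))

# IPSW: inverse of the probability of being linked, modeled in the full cohort.
p_link = expit(X_cov @ fit_logistic(X_cov, linked))
ipsw = 1.0 / p_link

# Outcome model restricted to the linked cohort, weighted by IPTW x IPSW.
m = linked == 1
X_out = np.column_stack([np.ones(m.sum()), treated[m]])
beta = fit_logistic(X_out, outcome[m], w=(iptw * ipsw)[m])
odds_ratio = np.exp(beta[1])

# Diagnostic: IPSW should restore the full-cohort confounder distribution.
full_prev = metabolic.mean()
linked_prev_raw = metabolic[m].mean()
linked_prev_weighted = np.average(metabolic[m], weights=ipsw[m])
```

In this simulation, linkage is more likely for patients with the metabolic condition, so the unweighted linked cohort over-represents them. Because the selection model here is saturated in the single confounder, the IPSW-weighted prevalence in the linked cohort matches the full-cohort prevalence. Real analyses would need a richer selection model and variance estimation that accounts for the weights.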
Results: The full claims-based cohort consisted of 349,180 antipsychotic initiators and 2,000,308 initiators of a control medication. After linkage to the laboratory database, the sample size decreased substantially (ranging from 0.1% to 15.0% of the full cohort, depending on the definition of the linked cohort). Metabolic conditions were more prevalent in the linked cohorts than in the full cohort.
Within the full cohort, the confounding-adjusted hazard ratio (aHR) was 2.19 (95% CI: 1.98-2.42) comparing initiation of antipsychotics to initiation of control medications. Within the linked cohorts, analyses without adjustment for selection suggested a different magnitude of association (aHR ranging from 0.72 to 1.73), whereas applying IPSW yielded point estimates similar to the full-cohort estimate (aHR: 2.16 and 2.21).
Conclusions: Studies conducted within linked databases, often with the goal of improved confounding control, may be restricted to patients who are not representative of the target population of interest. As a result, analyses conducted within linked cohorts may generate biased effect estimates without proper adjustment for potential selection bias.