Improving our understanding of the social determinants of mental health: a data linkage study of mental health records and the 2011 UK census

Summary of results

To our knowledge, this is the first time that large-scale routine EHRs from a major secondary mental healthcare provider have been successfully linked to individual-level socio-demographic data from the census in England. The resultant data set draws from an urban and ethnically diverse catchment area from which 220 864 secondary mental healthcare records were linked deterministically to detailed socio-demographic data from the 2011 census of England and Wales. Overall, half (50.4%) of records in the secondary mental healthcare data set linked to the 2011 census, and our analyses revealed differences between matched and non-matched records with respect to several socio-demographic and clinical characteristics. We observed the lowest match rates among young adults, individuals living in more deprived areas and members of ethnic minority groups. We applied weights to assess how non-matching influenced mortality estimates and observed negligible differences between unweighted and weighted estimates, suggesting that non-linkage to the census did not significantly bias associations.
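As an illustration of the weighting idea described above, the following minimal sketch assumes a simple inverse-probability scheme in which each matched record is weighted by the inverse of the match rate in its stratum. The source does not specify how the weights were constructed, so the function names, the single stratifying variable and the toy cohort are all hypothetical.

```python
from collections import defaultdict


def inverse_match_weights(records, stratum_of):
    """Weight each matched record by 1 / (match rate of its stratum).

    Unmatched records get weight 0, so within every stratum the matched
    records are scaled up to stand in for the whole stratum.
    """
    total = defaultdict(int)
    matched = defaultdict(int)
    for rec in records:
        stratum = stratum_of(rec)
        total[stratum] += 1
        matched[stratum] += rec["matched"]  # True counts as 1
    return [
        total[stratum_of(rec)] / matched[stratum_of(rec)] if rec["matched"] else 0.0
        for rec in records
    ]


def weighted_proportion(records, weights, outcome):
    """Weighted share of records with a binary outcome (for example, death)."""
    return sum(w * rec[outcome] for rec, w in zip(records, weights)) / sum(weights)


# Toy, made-up cohort (assumption): younger people match to the census less often.
cohort = (
    [{"age": "young", "matched": True, "died": d} for d in (0, 1)]
    + [{"age": "young", "matched": False, "died": 0} for _ in range(2)]
    + [{"age": "older", "matched": True, "died": d} for d in (1, 0, 0, 0)]
)
weights = inverse_match_weights(cohort, lambda rec: rec["age"])

# Unweighted estimate: every matched record counts once, unmatched records are dropped.
unweighted = weighted_proportion(cohort, [float(r["matched"]) for r in cohort], "died")
# Weighted estimate: matched records in low-match strata count for more.
weighted = weighted_proportion(cohort, weights, "died")
```

Comparing `unweighted` with `weighted` mirrors the comparison reported above: if the two estimates are close, non-matching related to the stratifying characteristics is unlikely to have distorted the result.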

Analysis of records not matching

There are multiple reasons why non-linkage might occur. First, the match rate in our study will have been inherently constrained by the proportion of cases in the CRIS cohort that responded to the 2011 census in the first place. The average response rate within the four London boroughs that comprise the SLaM catchment was lower (88%) than the national average (94%).7 Among younger individuals (25–34 years old), who constituted a large proportion of our sample, the response rate was even lower in this region (84%). More mobile populations, which may include migrants and other groups temporarily moving into an area for work alongside people with severe mental illnesses,19 may have been less likely to have taken part in the census. Individuals who moved into the SLaM catchment area and accessed services after 2011 would by default be unable to match. In addition, a growing body of evidence shows that racially minoritised groups, migrants and other socioeconomically marginalised groups are more likely to face discrimination in their interactions with governmental institutions in the UK, such as the police and the criminal justice system20 21 and the NHS.22 Previous studies have highlighted that Black and South Asian people may have concerns around how their data are safeguarded by institutions,23 and it is conceivable that this manifests in lower rates of participation, although this remains to be explored in future work. Whatever the cause may be, it would nevertheless seem improbable that our match rate would exceed the average census response rate specific to the SLaM region or the various demographic groups that were prevalent in our sample.
It is also well established that unit non-response can be considerable among individuals with a history of mental health disorders, who because of their illnesses might find it challenging to participate24 or may be more mobile.19 Individuals with mental disorders are also more likely to experience objective social isolation (eg, have fewer measurable contacts with other individuals)25 and might consequently be less likely to be captured through proxy responses (ie, family members responding in their stead). Indeed, surveys conducted annually since 2004 by the Care Quality Commission, the independent regulator of healthcare in England, have never observed response rates above 41% in community mental health samples.26

Another factor that merits consideration is the underlying methodology employed in the matching itself. In our study, records were matched deterministically through matchkeys composed of administrative information collected in both data sets. Inaccuracies or differences (eg, wrong postcode, incorrect date of birth, name changes due to marriage or alternative or erroneous spelling of names) in how these data were recorded might therefore have prevented some records from successfully matching. For example, previous linkage of health records to the census in Scotland highlighted a higher chance of clerical error with respect to the spelling of names for minority ethnic groups, leading to lower match rates.27 As individuals from these groups were preponderant in our cohort, it is possible that clerical error accounted for a degree of non-matching in our study. Moreover, because most matchkeys required postcode information to match and because the match rate peaked among individuals who were referred in the year the census was taken, it is possible that the deterministic matching methodology that we employed also missed some individuals whose address when they interacted with SLaM services differed from the address at which they responded to the census. This is supported by the higher observed level of matching (60%) among those with an address recorded in the mental health records at the time of the census, in 2011, and is consistent with the interpretation that a high proportion of the sample in this study were potentially more mobile. Comparisons with previous efforts to link the 2011 census to other administrative data could help disentangle the relative effects of sample-specific non-participation (eg, cohort member mobility or non-participation due to mental illness) and issues related to the methodology itself (eg, sensitivity of matchkeys).
However, data linkage methods and the measurement of linkage quality are continuously evolving within the ONS following the adoption of new working environments and data sharing agreements, which precludes a fair comparison with other data linkage efforts involving the 2011 census. Our weighted analyses nevertheless indicated that missingness had a negligible influence on relevant study outcomes, such as associations of clinical/socio-demographic characteristics with all-cause mortality.
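The deterministic approach discussed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the ONS procedure: the matchkey compositions, field names, record identifiers and toy records are all hypothetical. Matchkeys are tried from strictest to loosest, and a link is accepted only when exactly one unlinked census record agrees on every field in the key.

```python
def normalise(value):
    """Upper-case and strip all whitespace so trivial formatting differences agree."""
    return "".join(str(value).upper().split())


# Hypothetical matchkeys (assumption), ordered from strictest to loosest.
MATCHKEYS = [
    ("forename", "surname", "dob", "postcode"),
    ("surname", "dob", "postcode"),
    ("forename", "surname", "dob"),
]


def build_key(record, fields):
    """Concatenate the normalised fields of one matchkey into a tuple."""
    return tuple(normalise(record[f]) for f in fields)


def link(health_records, census_records):
    """Return {health_id: census_id} for records that agree exactly on a matchkey."""
    links = {}
    used = set()  # census records that have already been linked
    for fields in MATCHKEYS:
        index = {}
        for c in census_records:
            index.setdefault(build_key(c, fields), []).append(c["census_id"])
        for h in health_records:
            if h["health_id"] in links:
                continue
            candidates = [c for c in index.get(build_key(h, fields), []) if c not in used]
            if len(candidates) == 1:  # accept only an unambiguous agreement
                links[h["health_id"]] = candidates[0]
                used.add(candidates[0])
    return links


# Made-up records: H2 has moved address; H3 has moved and is misspelt in the census.
health = [
    {"health_id": "H1", "forename": "Ann", "surname": "Lee", "dob": "1980-01-02", "postcode": "AB1 2CD"},
    {"health_id": "H2", "forename": "Bo", "surname": "Khan", "dob": "1990-03-04", "postcode": "AB3 4EF"},
    {"health_id": "H3", "forename": "Cy", "surname": "Okafor", "dob": "1975-05-06", "postcode": "AB5 6GH"},
]
census = [
    {"census_id": "C1", "forename": "ann", "surname": "lee", "dob": "1980-01-02", "postcode": "ab12cd"},
    {"census_id": "C2", "forename": "Bo", "surname": "Khan", "dob": "1990-03-04", "postcode": "XY7 8ZZ"},
    {"census_id": "C3", "forename": "Cy", "surname": "Okafour", "dob": "1975-05-06", "postcode": "XY9 1AA"},
]
links = link(health, census)
```

In this toy example the change of address is recovered only by the one matchkey that omits postcode, while the record with both a change of address and a spelling difference is not linked by any key, which illustrates how mobility and clerical error can each lower a deterministic match rate.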

Finally, together with existing evidence from cohort studies of substantial attrition among participants diagnosed with mental illnesses, and of non-participation in community surveys, our findings point to non-response being a significant contributor to the low match rate that we observed. Since the census informs the planning, funding and commissioning of local services, such as schools and health services, the potential under-representation of individuals with mental illnesses is concerning and merits further investigation.

Strengths and weaknesses

We believe that this is the first study to link census data in England to clinical records from a population in contact with secondary mental healthcare services. Because of the cohort’s size, unique socio-demographic composition and abundant individual-level data on a multitude of important socio-demographic indicators provided by the linkage, we expect this data set to facilitate novel investigations into health inequalities among people living with mental disorders. For example, most prior research based on EHRs in the UK has relied on area-level measures of socioeconomic status, such as the IMD, which itself is derived from census attributes.14 By linking clinical records to the census at the individual level, we could obtain a more accurate measure of socioeconomic indicators. The overall size of the cohort is several orders of magnitude larger than previous UK-based mental health cohorts,28 particularly with respect to ethnic minority groups and specific clinical subpopulations (eg, individuals with severe mental illnesses). The degree of non-linkage that we observed is a potential source of bias. However, we had comprehensive data on many relevant characteristics for the fully enumerated cohort, irrespective of matching status, and could therefore determine through non-response weighting the relative influence that missingness related to these characteristics had on all-cause mortality estimates. We intend to incorporate these weights in all future analyses to minimise sources of bias. Although the area is ethnically diverse with a good overall representation of Black Caribbean and Black African people, other prevalent ethnic minority groups in England, such as Indian, Pakistani and Bangladeshi populations, are less well represented. In addition, some characteristics that we examined as predictors for matching, such as ethnicity and marital status, are inherently dynamic, which may have resulted in less precise estimates.
Although findings from the highly urban South London catchment area may be generalisable to other urbanised locations in England, inferences relating to more rural areas may not be possible. There is some evidence that matching of administrative records can be improved through the use of probabilistic techniques,29 but these were not used by the ONS for this linkage. It is possible that we could have obtained a higher match rate had record matching been supplemented with probabilistic methods. Salary information, a direct measure of socioeconomic standing, is not collected in the census. However, the census does contain data on numerous other factors that can be used to estimate individual wealth, including employment status, tenure, household composition and car ownership. One of the challenges with the linkage methods employed here is that we could not conclusively determine the exact causes of non-linkage. For instance, we could not quantify the relative degree to which non-linkage was caused by unit non-response or by clerical errors in how data were recorded. Our study described the process of linking the census to electronic mental health records. In the future, we plan to assess the association of social and economic indicators from the census with mental health outcomes. However, a limitation of the census is that it is self-reported, and this may lead to under-reporting for some important indicators (eg, migration status, employment status). This will need to be considered in future work. Finally, we could not examine cause-specific mortality, but will explore this in future analyses with linked data from the ONS mortality registration.
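To illustrate how the probabilistic techniques mentioned above differ from exact matchkeys, the sketch below scores a candidate pair of records in the style of Fellegi-Sunter agreement weights. It was not part of this linkage; the field names and the m and u probabilities (the chance that a field agrees for a true match and for a non-match, respectively) are hypothetical values chosen purely for illustration.

```python
import math

# Hypothetical (m, u) probabilities per field (assumption):
# m = P(field agrees | records belong to the same person)
# u = P(field agrees | records belong to different people)
FIELDS = {
    "surname": (0.90, 0.01),
    "forename": (0.90, 0.02),
    "dob": (0.95, 0.001),
    "postcode": (0.70, 0.0005),
}


def match_score(a, b):
    """Sum of log2 likelihood ratios over fields; higher means more likely a true match."""
    score = 0.0
    for field, (m, u) in FIELDS.items():
        if a[field] == b[field]:
            score += math.log2(m / u)  # agreement adds evidence for a match
        else:
            score += math.log2((1 - m) / (1 - u))  # disagreement subtracts evidence
    return score


# Made-up records: the same person before and after a move, and an unrelated person.
health_rec = {"surname": "KHAN", "forename": "BO", "dob": "1990-03-04", "postcode": "AB34EF"}
census_moved = {"surname": "KHAN", "forename": "BO", "dob": "1990-03-04", "postcode": "XY78ZZ"}
census_other = {"surname": "LEE", "forename": "ANN", "dob": "1980-01-02", "postcode": "AB12CD"}

score_moved = match_score(health_rec, census_moved)
score_other = match_score(health_rec, census_other)
```

A pair that disagrees only on postcode still accumulates a high score from the other fields, so a suitably chosen threshold could accept it as a link, whereas an exact matchkey that requires postcode would reject it.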

This post was originally published on https://bmjopen.bmj.com