Routinely collected electronic medical records are an important resource for researchers. They offer a cheaper, faster and more accessible alternative to costly epidemiological studies that recruit patients. The Health Improvement Network (THIN) is a database of anonymised electronic primary care medical records which can be accessed through the University of Birmingham.
The THIN database contains electronic patient records from over 580 general practices across the UK. The database covers approximately 6% of the UK population and includes data on 12 million patients. Data can be extracted from THIN and used for epidemiological research.
Data available from THIN is summarised in the diagram below. Data is organised by general practice, then by patient, and linked by practice ID and patient ID. In addition to practice files, there are four main files: patient, therapy, medical and additional health data (AHD).
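To illustrate what linking by practice ID and patient ID means in practice, here is a minimal Python sketch. The field names (`practice_id`, `patient_id`, `year_of_birth`, `read_code`) and the records are hypothetical and are not the real THIN file layout. The sketch also assumes that a patient ID is only unique within a practice, which is why both IDs are used together as the key.

```python
# Made-up records for illustration only; real THIN files use their own
# field names and layouts.
patients = [
    {"practice_id": "a0001", "patient_id": "0001", "year_of_birth": 1950},
    {"practice_id": "a0001", "patient_id": "0002", "year_of_birth": 1982},
    {"practice_id": "b0002", "patient_id": "0001", "year_of_birth": 1975},
]
medical = [
    {"practice_id": "a0001", "patient_id": "0001", "read_code": "G64z200"},
    {"practice_id": "b0002", "patient_id": "0001", "read_code": "G64z200"},
]


def link_records(patients, events):
    """Attach each event to its patient using the (practice ID, patient ID) pair."""
    # Index patients by the combined key so that patient "0001" in practice
    # "a0001" is never confused with patient "0001" in practice "b0002".
    by_key = {(p["practice_id"], p["patient_id"]): p for p in patients}
    linked = []
    for event in events:
        key = (event["practice_id"], event["patient_id"])
        if key in by_key:
            # Merge the patient fields and the event fields into one row.
            linked.append({**by_key[key], **event})
    return linked


linked = link_records(patients, medical)
for row in linked:
    print(row)
```

The same idea applies to the therapy and AHD files: each record is matched to its patient through the two IDs taken together.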
Data within THIN are presented as coded information. Clinical data (e.g. diagnoses, procedures, signs and symptoms) are coded using Read codes (version 2). Read codes are hierarchical, organised in chapters and categories, and comprise seven characters (e.g. G64z200: left sided cerebral infarction). Additional clinical information (e.g. clinical measurements) is coded using AHD codes. The therapy file contains drug codes, which correspond to specific drug formulations, and British National Formulary (BNF) codes, which are based on BNF chapters. Anonymised free text comments are contained in the medical and AHD files.
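Because Read codes are hierarchical, one simple way to pick out related codes is to match on their leading characters. The Python sketch below assumes that codes sharing a prefix belong to the same category. Only G64z200 comes from the text above; the other two strings are illustrative placeholders that have not been checked against a Read code dictionary, and the function name `codes_in_category` is made up for this example.

```python
def codes_in_category(codes, prefix):
    """Return the codes whose leading characters match the category prefix."""
    return [code for code in codes if code.startswith(prefix)]


# "G64z200" is the example from the text; the other strings are placeholders
# used only to show the matching, not verified Read codes.
recorded = ["G64z200", "G64..00", "H33..00"]

matches = codes_in_category(recorded, "G64")
print(matches)
```

In a real study a code list built this way would still need to be reviewed by hand, since a prefix can sweep in codes that do not fit the research definition.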
A strength of research using THIN data is that individual studies do not require separate ethical approval, provided only anonymised THIN data is used. This is because data collection for THIN was approved by the South East Multicentre Research Ethics Committee (MREC) in 2003. However, these studies must still be reviewed by an independent Scientific Review Committee (SRC) to ensure data is analysed and interpreted appropriately.
It is important to understand the strengths and limitations of THIN, because this database is not suitable for all research questions.
Strengths:
- The data is collected in a non-interventional way and, therefore, represents ‘real primary care’ (or at least primary care recording practice)
- Patients within the database are broadly representative of the UK population
- The database covers over 580 general practices, so large sample sizes are achievable
- The database contains rich and reliable clinical and prescribing data
- Data is relatively accessible and research studies have reduced time and cost compared to traditional epidemiological studies
- The data is current: updated three times a year at the University of Birmingham
- The database includes people who are often excluded from research, e.g. pregnant women and the very elderly
Limitations:
- The accuracy of the research depends on the quality of the data recorded by GPs
- The data is routinely collected for clinical management, not research
- There is a hierarchy of data accuracy in THIN: prescribing data is very reliable but some other data, such as ethnicity, is poorly coded
- A prescription does not mean the patient collected or took the medication, so adherence cannot be inferred
- The Quality and Outcomes Framework (QOF) and other policy changes may distort the data
- Disease severity/ subtypes may not be recorded
- Very large sample sizes can be a limitation: studies may be overpowered, so differences too small to matter clinically can still reach statistical significance
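The last limitation can be shown with a small worked example. The Python sketch below uses a simple two-sample z-test with made-up numbers: a difference of 0.5 units between two groups, a standard deviation of 15, and equal group sizes. None of these values come from THIN. The point is only that the same tiny difference is far from significant with 1,000 patients per group but highly significant with 1,000,000 per group.

```python
import math


def two_sample_z_p_value(mean_diff, sd, n_per_group):
    """Two-sided p-value for a difference in means, assuming equal SDs and group sizes."""
    # Standard error of the difference between two independent group means.
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = mean_diff / standard_error
    # Two-sided p-value from the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical values chosen for illustration only.
mean_diff = 0.5
sd = 15.0

p_small = two_sample_z_p_value(mean_diff, sd, 1_000)
p_large = two_sample_z_p_value(mean_diff, sd, 1_000_000)

print(f"n = 1,000 per group:     p = {p_small:.3g}")
print(f"n = 1,000,000 per group: p = {p_large:.3g}")
```

For this reason, results from very large samples are usually best judged by the size of the effect and its confidence interval, not by the p-value alone.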
For more information on accessing THIN data please see the University of Birmingham’s THIN website: http://www.birmingham.ac.uk/research/activity/mds/projects/HaPS/PCCS/THIN/index.aspx
By Grace Turner