Bioinformatics: It’s a Bird… It’s a Plane… It’s magic?

What is bioinformatics?

Good question. Let’s start off with some definitions:

Essentially, bioinformatics is a diverse and multidisciplinary subject. Put simply, it is a combination of biology, statistics, and computer science (Figure 1). Some may argue that it also includes mathematics, physics, and chemistry (Bayat, 2002; EMBL-EBI). From my limited experience, I’ve seen bioinformatics advertised as anything from building predictive machine learning models to creating interactive apps. For example, machine learning algorithms can predict the diagnosis of neurodegenerative diseases based on medical images, medical history, sex, age, and other attributes (Myszczynska et al., 2020). Also, at a previous job interview, I learnt that the team used R Shiny (an R package) to develop apps for scientists to interact with data on T-cell receptor structures. In the NHS, I’ve observed Django (a Python web framework) being used to develop apps, alongside a limited amount of JavaScript/HTML/CSS. Bioinformatics also includes the analysis of proteomics, transcriptomics, and genomics data. At university, I did a project on protein mutations which involved computational modelling with PyMOL. However, most in-service work at the NHS is the analysis of genomic sequencing data.

Figure 1: Bioinformatics is the combination of biology, statistics, and computer science. Some bioinformatics overlaps with data science and software engineering.

Bioinformatics Genomics vs Other Specialisms

The NHS Scientist Training Programme (STP) offers several informatics specialisms whose names can sound confusingly similar. So, what exactly is the difference? Well, from left to right in Table 1 they become more computationally heavy. Bioinformatics Genomics focusses on generating reports on genomic variants, whereas Informatics looks at patient medical records. Scientific Computing is about developing software for medical devices.

Table 1: Comparison of the different NHS STP Informatics specialisms.

Bioinformatics Genomics | Informatics | Scientific Computing
Life sciences | Informatics | Physical sciences
Processing genomic sequencing data | Data science (AI/ML) | Software engineering (project management)
Develop and implement automated pipelines | Deliver digital systems to record, analyse, extract, and use data | Develop and maintain new and existing in-house software systems
Implement visualisation tools and data warehousing | Development of a healthcare app for administration purposes | Support IT infrastructure within the department of Medical Physics & Clinical Engineering
Optimise and streamline workflows to speed up data analysis | Information governance and auditing | Cyber security
Collaborate with clinical scientists in the genomics department (geneticists) | Gather, analyse, interpret, and present information to improve services | Database design
Follow SOPs/guidelines/legislation | Follow SOPs/guidelines/legislation | Follow SOPs/guidelines/legislation
Python, Bash, R, SQL | SQL, Python, UML, PHP, R | JavaScript, CSS, HTML, Python, C++, Java, C, SQL, UML, VBA

NSHCS, 2022

You may also be wondering what the difference between Genomics and Bioinformatics (Genomics) is. Genomics involves the analysis of rare diseases, e.g., cystic fibrosis. Genomics clinical scientists interpret results from FISH and G-banding (karyotyping) in cytogenetics. They also interpret results from WGS/WES/targeted gene panels in molecular genetics. On the other hand, the work of bioinformaticians in the NHS consists of running “pipelines”, which are workflows of tools used to process data from WGS/WES/targeted gene panels. So, we use command line tools such as BWA-MEM, Pindel, and GATK. In summary, bioinformaticians process data whereas geneticists interpret data.
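To give a flavour of what chaining command line tools looks like, here is a minimal Python sketch that builds (but does not run) the commands for a tiny align, sort, and call workflow. Treat it as an illustration rather than how any NHS lab actually does it: the sample and file names are made up, samtools is my own addition for sorting and indexing, and a real pipeline needs extra steps that are left out here (indexing the reference, adding read groups, marking duplicates, and so on).

```python
import shlex


def build_commands(sample, reference="reference.fa"):
    """Build (but do not run) the commands for a minimal align-sort-call workflow.

    All file names are hypothetical and derived from the sample name.
    """
    fastq_1 = f"{sample}_R1.fastq.gz"
    fastq_2 = f"{sample}_R2.fastq.gz"
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.vcf.gz"
    return [
        # 1. Align paired-end reads to the reference genome with BWA-MEM.
        ["bwa", "mem", "-o", sam, reference, fastq_1, fastq_2],
        # 2. Sort the alignments by position (samtools is an assumption here).
        ["samtools", "sort", "-o", bam, sam],
        # 3. Index the sorted BAM so later tools can jump to any region.
        ["samtools", "index", bam],
        # 4. Call variants with GATK HaplotypeCaller.
        ["gatk", "HaplotypeCaller", "-R", reference, "-I", bam, "-O", vcf],
    ]


commands = build_commands("patient_001")
for command in commands:
    # Dry run: print each command instead of executing it.
    print(shlex.join(command))
```

Keeping each command as a list makes it easy to hand the same data to subprocess.run later, once the tools are installed and the inputs exist.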

Where does bioinformatics genomics sit within the NHS?

When a patient is suspected to have a genetic condition, they are referred for genetic testing by a GP or Genomic Counsellor. The genetic technologists in the laboratory extract DNA from the patient’s sample and perform sequencing. There are three types of sequencing: targeted gene panels, Whole Exome Sequencing (WES), and Whole Genome Sequencing (WGS). When the results arrive, they are raw and require processing. Bioinformaticians take this data and run it through a pipeline/workflow (definition in next section) to make sense of it. Then, Geneticists analyse the processed data to identify pathogenic variants and return the result to the GP or Genomic Counsellor, who goes on to explain the result to the patient (Figure 2).

Figure 2: Patient pathway for Genetic Testing. A multidisciplinary team work together to produce a diagnosis for the patient which guides treatment options.

So, what is a pipeline/workflow?

Ok, so you’re absolutely baffled. Let’s imagine you’re baking a cake. Your “input” is a bunch of ingredients which don’t appeal to your bake sale audience (who wants to eat flour, egg, and butter separately?!). Your job is to produce a yummy “output” i.e., a cake. So, you pop them through a “pipeline” of tools which includes a set of scales to weigh out your ingredients, an electric whisk to mix them, and an oven to bake them (Figure 3A). Ta-dah! Cake.

Still confuzzled? Right, this time your “input” is a bunch of reads from sequencing genomes. Your aim is to produce an “output” which is a file of variants (e.g., SNPs, indels, CNVs). So, we check the quality of our reads, trim them, align our reads to the reference genome, convert files, call the variants, and then annotate them (Figure 3B). Simples!

Figure 3: A) Pipeline for baking a cake. B) Pipeline for analysing genomic sequencing data.
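If you prefer code to cake, the same idea can be sketched in Python. This is a toy, not a real variant caller: the reference, the reads, the quality thresholds, and every function name are made up for illustration, the "aligner" simply slides each read along the reference and picks the offset with the fewest mismatches, and only SNPs are reported. The point is the shape: an ordered list of steps where the output of one becomes the input of the next.

```python
REFERENCE = "ACGTTGCAAGCT"  # made-up 12-base "reference genome"


def check_quality(reads, min_mean_quality=20):
    """Drop reads whose mean base quality is below the threshold."""
    return [r for r in reads
            if sum(r["quals"]) / len(r["quals"]) >= min_mean_quality]


def trim(reads, min_base_quality=10):
    """Trim low-quality bases from the end of each read."""
    trimmed = []
    for r in reads:
        end = len(r["seq"])
        while end > 0 and r["quals"][end - 1] < min_base_quality:
            end -= 1
        if end > 0:
            trimmed.append({"seq": r["seq"][:end], "quals": r["quals"][:end]})
    return trimmed


def align(reads, reference=REFERENCE):
    """Place each read at the offset with the fewest mismatches (no gaps)."""
    aligned = []
    for r in reads:
        seq = r["seq"]
        offsets = range(len(reference) - len(seq) + 1)
        best = min(offsets,
                   key=lambda o: sum(a != b for a, b in zip(seq, reference[o:])))
        aligned.append({**r, "pos": best})
    return aligned


def call_variants(aligned, reference=REFERENCE):
    """Report every position where a read base differs from the reference."""
    variants = set()
    for r in aligned:
        for i, base in enumerate(r["seq"]):
            ref_base = reference[r["pos"] + i]
            if base != ref_base:
                variants.add((r["pos"] + i + 1, ref_base, base))  # 1-based
    return sorted(variants)


def annotate(variants):
    """Attach a label to each variant (this toy only ever finds SNPs)."""
    return [{"pos": p, "ref": ref, "alt": alt, "type": "SNP"}
            for p, ref, alt in variants]


# The pipeline itself: an ordered list of tools.
PIPELINE = [check_quality, trim, align, call_variants, annotate]


def run_pipeline(data, steps=PIPELINE):
    """Feed the output of each step into the next one."""
    for step in steps:
        data = step(data)
    return data


reads = [
    {"seq": "GTTGAAA", "quals": [30, 30, 30, 30, 30, 30, 5]},  # poor last base
    {"seq": "TGAAAG", "quals": [35] * 6},
    {"seq": "ACGTTG", "quals": [8] * 6},  # fails the quality check
]

print(run_pipeline(reads))
```

Here the third read is discarded at the quality check, the first loses its poor final base at the trimming step, and the two surviving reads both support a single C to A change at position 7 of the reference.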

If you have a memory of a goldfish 🐟 – here’s what you need to know

  • Bioinformatics is the use of computational tools to process and analyse biological data
  • Bioinformaticians in the NHS create and run pipelines
  • Pipelines are ordered lists of software tools used to process data

Phoarrrr that was a lot of info! 🤯 Time for a cuppa! ☕

References

Bayat, A., 2002. Science, medicine, and the future: Bioinformatics. BMJ, 324(7344), pp.1018–1022.

EMBL-EBI. Bioinformatics for the terrified: What is bioinformatics? [online] EMBL-EBI. Available from: https://www.ebi.ac.uk/training/online/courses/bioinformatics-terrified/what-bioinformatics/ [Accessed 21st December 2022]

Genomics Education Programme. What is bioinformatics? [online] NHS Health Education England. Available from: https://www.genomicseducation.hee.nhs.uk/education/core-concepts/what-is-bioinformatics/ [Accessed 21st December 2022]

Myszczynska, M.A., Ojamies, P.N., Lacoste, A.M., Neil, D., Saffari, A., Mead, R., Hautbergue, G.M., Holbrook, J.D., and Ferraiuolo, L., 2020. Applications of machine learning to diagnosis and treatment of Neurodegenerative Diseases. Nature Reviews Neurology, 16(8), pp.440–456.

NSHCS, 2022. Healthcare science specialities explained: Informatics. [online] NSHCS. Available from: https://nshcs.hee.nhs.uk/healthcare-science/healthcare-science-specialisms-explained/informatics/ [Accessed 21st December 2022]

Roy, S., Coldren, C., Karunamurthy, A., Kip, N.S., Klee, E.W., Lincoln, S.E., Leon, A., Pullambhatla, M., Temple-Smolkin, R.L., Voelkerding, K.V., Wang, C., and Carter, A.B., 2018. Standards and guidelines for validating next-generation sequencing bioinformatics pipelines. The Journal of Molecular Diagnostics, 20(1), pp.4–27.
