Precision Medicine Driven by Bioinformatics and AI
The 19th and 20th centuries saw milestone achievements in medicine. From the development of the “germ theory” to the first vaccine; from the first stethoscope to the first practical electrocardiograph; from improvements to antiseptic practices to pioneering surgical procedures; from solving the structure of DNA to polymerase chain reaction technology that can create thousands of copies of DNA – these noteworthy achievements constitute just some of the medical advancements that have saved lives and indisputably changed the world.
The Human Genome Project – a multinational, multidisciplinary collaboration to identify the individual base pairs that make up the human genome – was completed in 2021 after a three-decade-long effort that started in 1990. Mapping of the DNA from this project spurred new discoveries and complemented genetic engineering technologies, which scientists used to study the functions of genes. These technologies empowered research groups around the world to focus on specific areas of interest, diseases, or biological processes to understand how they function and what the consequences of mutations are. However, as the Human Genome Project dramatically increased our understanding of human genetics, it paradoxically revealed how little we know about human biology. Now, in the 21st century, newly developed and emerging technologies are once again enabling scientists to understand the intricate mechanisms of disease and to create novel targeted therapies.
Chief among those technologies are massively parallel, high-throughput sequencing methodologies that have created a sprawling biotechnology industry. These rapid profiling methods have allowed for the large-scale study of the three components in the central dogma of molecular biology: DNA, RNA, and proteins – essentially the building blocks and gears behind all life.
One area that has seen major breakthroughs is next-generation sequencing (NGS), which allows for the rapid sequencing of DNA and RNA at affordable costs. Such technology has, in turn, created new types of cellular and molecular assays that enable researchers to interrogate genes, their expression, and their products (proteins) across multiple organisms. Examples include high-throughput screens that survey hundreds or thousands of genes at once (transcriptome studies, CRISPR screening, whole genome sequencing) and assays designed to elucidate the accessibility of chromosomes (ATAC-seq) or how proteins interact with genes (ChIP-sequencing). These are referred to as genomic or transcriptomic studies: those that investigate genes (DNA) and their expression (RNA), respectively.
Likewise, there have been advancements in the study of proteins – the ultimate products of most genes. This has been aided by gene editing (CRISPR/Cas9 genetic engineering) and high-throughput mass spectrometry, which allow for gain-of-function or loss-of-function protein studies and rapid sequencing of proteins, respectively. The large-scale study of proteins in this manner is called proteomics.
Finally, the newest field among the group focuses on small biochemicals called metabolites that interact with and influence many biological processes. Metabolites are substrates, intermediates, or end products of cellular processes, and their presence or absence can provide information on cellular physiology. The large-scale study of metabolites is called metabolomics.
These four areas constitute the four pillars of multi-omics systems biology, and they are transforming how we make decisions that impact patient outcomes. Each of these areas is generating exabytes of data that are helping clinicians, scientists, and researchers unravel the intricate mechanisms of diseases. At the heart of this is Bioinformatics, sitting at the intersection of cellular and molecular biology, medicine, engineering, data science, and statistics. An exciting challenge within Bioinformatics is the integration of these data – finding biologically meaningful patterns and correlations that can not only help create new drugs and therapies, but also help diagnose patients and support long-term care. This has given rise to precision medicine, which aims to identify and understand the multi-omics profile (or part thereof) of an individual – essentially a molecular signature that can help healthcare providers specifically tailor a course of treatment to an individual’s unique needs. For example, one individual may have a genetic mutation that would render drug ABC ineffective, but the person may respond to drug XYZ. Or perhaps an individual’s gut microbiota (the communities of microorganisms that reside in the gut) are found to be altered, and this is causing immune system dysregulations, indicating treatments should perhaps focus on the microbiome. Together, Bioinformatics and precision medicine aim to identify and solve such problems.
These are, however, difficult problems to solve. Any Bioinformatics algorithm developed to recognize patterns must be both biologically and statistically sound. This requires rigorous testing and validation at the computational and experimental levels, as well as validation through clinical trials for any diagnostic platform. Research into precision medicine methodologies and their applications is actively ongoing, and novel findings are consistently being disseminated in peer-reviewed publications.
Advancements in computational power and the increasing accessibility of machine learning (ML) and artificial intelligence (AI) have also had positive effects on Bioinformatics and precision medicine. The number of peer-reviewed Life Sciences publications that describe ML/AI increased from approximately 600 in 2010 to over 12,000 in 2019. The Food and Drug Administration (FDA) in the US has already approved several medical devices and platforms that aid in areas including general decision making, oncology, and ophthalmology. Many pre-clinical studies have also applied ML/AI to multi-omics data to identify diseases, but the applications go far beyond this and can aid in the interpretation of imaging data, lab results, and patient behavior. Some studies have shown that ML and deep learning can improve patient diagnostics by augmenting current practices. ML can also help select patients for experimental therapies, reduce wait times, determine optimal drug dosages, and improve medication adherence. The penetration of ML/AI in medicine is still nascent, but these preliminary studies show great promise for the future.
This is largely being driven by advances in automation and miniaturization that have enabled the development of relatively low-cost yet sophisticated biomedical instrumentation, such as NGS platforms, which supports large-scale data aggregation. Advances in cytometry and imaging methods have also improved diagnostics at the cellular, tissue, organ, and whole-organism levels. Digital stores of clinical data complement the various biological data and provide descriptions of disease and physiological conditions. The burgeoning areas of wearables and smart devices – including smartphones, smartwatches, and fitness trackers – provide access to even more sources of physiological data. The aggregation, processing, analysis, and integration of these data are at the core of Bioinformatics, with the goal of deciphering the underlying mechanisms of disease with respect to the four pillars of multi-omics systems biology.
Our “macro” understanding of biology in the 19th and 20th centuries led to medical innovations that eradicated certain suffering and pioneered medical technologies of the modern era. Today, in the 21st century, we are utilizing our Bioinformatics prowess to sift through an enormous wealth of data to not only gain a deeper understanding at the “micro” and “nano” levels but also uncover how biological processes are interconnected and how they uniquely differ across individuals. Bioinformatics, ML/AI, and precision medicine, therefore, have the potential to bring forth remarkable advancements and change the landscape of medicine and healthcare.