A genome is the complete set of information that is needed to build and maintain a single individual, like a big instruction book. The human genome is split into 23 sections of information known as chromosomes, which can be thought of as chapters in a book. The human genome is made of DNA.
Genes are particular sequences of DNA, which are a bit like the words in each chapter of the book that makes up your genome. Within a gene there are coding regions of DNA known as exons, and non-coding gaps known as introns. Each gene contains specific instructions to make proteins. Proteins are large, complex molecules that carry out lots of different jobs in the body, from helping your muscles to contract so you can bend your arms and legs, to carrying oxygen around your body through your bloodstream.
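As a rough illustration of how exons and introns sit side by side within a gene, here is a minimal Python sketch. The sequence is made up, and writing exons in upper case and introns in lower case is only a convention assumed for this example. It is not how real DNA is labelled.

```python
from itertools import groupby

# Toy gene, invented for illustration: upper-case runs stand in for exons
# (coding regions) and lower-case runs for introns (non-coding gaps).
toy_gene = "ATGGCCgtaagtttcagGATTACAgtgagtccagTGA"


def split_exons_introns(gene):
    """Split a sequence into its upper-case runs (exons) and lower-case runs (introns)."""
    exons, introns = [], []
    # groupby collects consecutive letters that share the same case.
    for is_exon, run in groupby(gene, key=str.isupper):
        (exons if is_exon else introns).append("".join(run))
    return exons, introns


exons, introns = split_exons_introns(toy_gene)

# Joining the exons gives the part of the toy gene that codes for protein.
coding_sequence = "".join(exons)

print("Exons:", exons)
print("Introns:", introns)
print("Coding sequence:", coding_sequence)
```

In this toy gene, three exons are separated by two introns, and only the joined exons make up the coding sequence.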
We often talk about genes "for" particular characteristics, and this is where proteins play a part. When we talk about the genes for eye colour, for example, what we are talking about is a code for a protein that helps to give our eyes their colour. Genes can come in different versions, so some people’s versions of the gene will code for proteins that make their eyes look blue, while other people’s versions will make proteins that make their eyes look brown. Sometimes, however, a gene that codes for a protein has a difference in it that means the gene and its protein do not work in the usual way, a bit like a spelling mistake in a word.
It may seem surprising, but only around 1% of the human genome is made up of DNA that codes for proteins (exons). The rest of it is made up of non-coding DNA. Coding and non-coding DNA are made up of the same four chemical building blocks, or bases, but they do not have the same function. So although all of our genome is made of DNA, only a tiny proportion of it codes for proteins.
Although they don’t directly code for proteins, some non-coding sequences enable our cells to produce different amounts of proteins at different times, because they contain instructions that tell the cell how to switch genes on and off. Other non-coding sequences, the introns, are part of genes but don’t directly code for proteins. These are thought to help the cell to generate a number of different varieties of the same protein from just one gene.
We refer to the protein-coding part of the genome as the exome. Although the exome makes up only a very small fraction of the genome, it is thought to contain 85% of the variations that can cause genetic conditions. Researchers can use a technology called sequencing to read the order of chemical bases in DNA. They can therefore sequence your exome and compare it to a reference exome to see whether there are any alterations. It’s just like reading a book and checking to see if there are any spelling mistakes. By sequencing and analysing just the 1% of the genome that codes for proteins, researchers believe that they are likely to find most disease-causing alterations. Looking at just this small fraction of the genome also makes sequencing and analysis cheaper, faster and easier.
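The "spelling check" idea can be sketched in a few lines of Python. This is a simplified illustration only: the two short sequences are invented, the function name find_alterations is hypothetical, and the comparison assumes both sequences are the same length and already lined up. Real analysis is far more involved, for example because it must also cope with inserted or deleted bases.

```python
def find_alterations(reference, sample):
    """Return (position, reference_base, sample_base) for every base that differs.

    Assumes the two sequences are the same length and already lined up,
    so only single-base substitutions are detected.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length for this simple comparison")
    return [
        (position, ref_base, sample_base)
        # Positions are counted from 1, as if counting letters in a word.
        for position, (ref_base, sample_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != sample_base
    ]


# Invented sequences for illustration: the sample differs from the
# reference at a single base.
reference_exon = "ATGGCCGATTACATGA"
sample_exon = "ATGGCCGTTTACATGA"

alterations = find_alterations(reference_exon, sample_exon)
for position, ref_base, sample_base in alterations:
    print(f"position {position}: {ref_base} -> {sample_base}")
```

Here the only difference is at position 8, where the reference has an A and the sample has a T, much like a single misspelt letter in a word.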
While many genetic alterations that cause disease are found in the exome, not all of them are. In fact, researchers are finding out more and more about the genome every day, and some non-coding parts of the genome are now considered to play a role in genetic conditions. Alterations in these parts of the genome cannot be picked up by looking at the exome alone, but they can be found by sequencing all of the DNA in a genome. This is called whole-genome sequencing. As whole-genome sequencing becomes faster and cheaper, it is likely to eventually replace exome sequencing, allowing us to gain a deeper understanding of our genome and how its alterations are related to genetic conditions.