I’ve wanted to write an intro to “epigenetics” for a while, because when I was trying to learn about it for the first time, I found most things fell into two camps: either mainstream press with lots of hype but not much detail or jargon-filled explanations aimed at hardcore biologists with lots of prior genetics knowledge. As someone who started my PhD with little understanding of genetics, it took me quite a while to piece together information with enough detail to properly understand all the terms but not so much detail that my brain felt like melting. I wished there had been one resource which talked me through at least the basic ideas from start to finish, without assuming a tonne of knowledge beforehand – and so that’s the aim of this series of posts! For people new to genetics, this will be a crash course in all the key things to know. I will use some of the jargon so that you can learn the words, but try not to get bogged down in it – you don’t need to remember it all and I will try to explain everything as I go. Leave me a comment if I need to explain something further! For people familiar with genetics, this will probably cover a lot of old ground and you might want to just skim it then read the extra resources at the end!
So… let’s start with the hype!
Maybe you’ve read that “Epigenetics is showing that your perceptions and thoughts control your biology, which places you in the driver’s seat. By changing your thoughts, you can influence and shape your own genetic readout.” Wow! Or maybe you’ve read “When a disease occurs, the solution, according to epigenetic therapy, is simply to “remind” your affected cells (change its environmental instructions) of its healthy function, so they can go back to being normal cells instead of diseased cells.” Incredible! Maybe you’ve heard that your parents’ social lives may have changed the genes they passed on to you, or that “children inherit[ed] the nightmare[s] their mothers experience”.
Well, I’m sorry to break it to you but… none of this is true. Or, more accurately, we currently have no evidence to support any of this being true (and instead quite a bit of evidence suggesting that some of it is very unlikely). Epigenetics is real and important, and it adds another layer to our understanding of genetics – but there’s no epigenetic therapy which involves just wishing our biology different, and the idea that parents can pass on their experiences via our genes is still incredibly controversial.
Things get even more complicated by the fact that scientists themselves struggle to agree on what counts as “epigenetic”. But I’ll come back to that at the end…
So, in this series of posts, I’m going to focus on one particular biological mechanism called DNA methylation. This is far from the only mechanism we might be interested in, and it may well not be the most important mechanism, but I’ve chosen it because a) most of the stuff you’ll see in the press talking about epigenetics is based on research studying DNA methylation and b) DNA methylation was the focus of my PhD!
In this post I’ll describe what DNA methylation is, starting from the very basics of what DNA is, and why we care about it. In future posts I’ll move on to how we study DNA methylation, the problems in a lot of the research that has been done, and then some of the specific research findings in more depth – including my own!
Note – I came to epigenetics via psychiatric research so my focus is very much on humans. A lot of the below applies more broadly, but some doesn’t, so bear that in mind.
What is DNA methylation??
First, let’s do some background, because once you know the basics of genetics, DNA methylation itself is a pretty simple concept… So, let’s start with DNA: DNA is a type of acid – deoxyribonucleic acid – which contains all of our genetic information. You’ve probably seen images like the below before, and these represent DNA and its famous “double helix” structure:
Looking at the picture, those short bars linking together the long strands swirling around each other can be constructed with four different “bases”. The bases are called cytosine, thymine, guanine and adenine but we often refer to them just by their first letters – C, T, G and A. When we talk about genes, genes are simply a section of this DNA strand, containing a bunch of these bases which we represent with these letters – so you can think of genes as codes, unique combinations of these letters which tell our body what to do. Certain genes are universal and all humans have the same code in that section of DNA, but other ones can differ and this can lead to differences between people – switch out some Cs for some Ts in one piece of genetic code and you might change your eye colour. We call genes like this polymorphic (which simply means – many forms) and each different version of the gene is called an allele – so, everyone has eye colour genes, but you might have allele A of a certain eye colour gene and I might have allele B of that gene, and each allele will have different letters of code.
Don’t worry about remembering all that detail and all the terms – the key things to remember are that DNA is an acid which looks like these long winding strands, different sections of these strands are different genes, and genes contain codes for our body.
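If it helps to see the “genes as codes” idea in a more concrete form, here is a tiny Python sketch. The two sequences are completely made up for illustration (they are not real eye colour alleles), and `differing_positions` is just a hypothetical helper name. All it shows is that two alleles of the same gene can be thought of as two nearly identical strings of letters that differ at a few positions.

```python
def differing_positions(allele_a: str, allele_b: str) -> list[int]:
    """Return the 0-based positions where two equal-length sequences differ."""
    if len(allele_a) != len(allele_b):
        raise ValueError("this toy comparison needs sequences of equal length")
    return [i for i, (a, b) in enumerate(zip(allele_a, allele_b)) if a != b]


# Invented sequences: not real genes, just letters for illustration.
allele_a = "ATGCCGTACGTTAGC"
allele_b = "ATGCTGTACGTTAGT"

# Two Cs have been switched out for Ts, at positions 4 and 14.
print(differing_positions(allele_a, allele_b))  # [4, 14]
```

Real genes are thousands of letters long and real alleles can differ in more complicated ways, so treat this purely as a mental model.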
These strands of DNA wrap around a cluster of proteins (called a histone core), and then we have lots and lots of these protein clusters with roughly two metres of DNA per cell wrapped around them, which bundle up tightly into X shapes and form a set of 46 chromosomes (23 pairs of chromosomes) that look like this:
These 46 chromosomes contain the whole of our DNA and all our genetic information. This is the bit where you flashback to sex education biology in school, with the egg providing 23 chromosomes and the sperm providing 23 chromosomes – remember? And you might also remember being told that (pretty much) all the cells in our body have a full set of all of our chromosomes and all of our genetic information – this is because we start off as just that sperm and egg with those chromosomes, they combine and then they just divide and replicate again and again, copying these chromosomes and all the information inside them each time.
Now, the second key thing to understand is the basic idea of gene expression (I promise we’ll get to the methylation soon). I’m not going to go into detail in this post but – in short – gene expression is when the body reads the code of our genes and then uses this to make instructions to create things that our body needs (such as proteins). The proper words for these processes are transcription and translation. One detail that is useful to know – to start this process, certain molecules (called RNA polymerase) need to attach to DNA at certain points and then run along the strand of DNA reading the code. A copy of the code, called RNA, is then created, and this provides the core instructions. Other proteins called transcription factors can also attach to DNA and make this whole process more likely to happen.
In short, the gene is the section of DNA with a code containing biological instructions and gene expression is the process of that gene actually being read and those instructions implemented.
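For anyone who likes to see things as code, the “copying” step (transcription) can be sketched in a few lines of Python. This is a deliberately oversimplified toy: it relies on one detail not covered above, which is that RNA uses the letter U (uracil) wherever DNA has a T, and it ignores everything about how RNA polymerase actually finds and reads a gene. The function name `transcribe` and the example sequence are made up for illustration.

```python
def transcribe(coding_strand: str) -> str:
    """Toy transcription: return the RNA copy of a DNA coding strand."""
    dna = coding_strand.upper()
    if not set(dna) <= set("ACGT"):
        raise ValueError("DNA should only contain the letters A, C, G and T")
    # The RNA copy carries the same code, with U in place of T.
    return dna.replace("T", "U")


print(transcribe("ATGCCGTAC"))  # AUGCCGUAC
```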
Now, finally, let’s move on to: what is DNA methylation? If you’ve got this far, the hard part is over! Quite simply, DNA methylation is when a methyl group – a little bunch of atoms, as in the picture below – attaches to DNA. That’s it!
This is a way of imagining it, with the DNA laid flat and orangey red Ms representing methyl groups:
So, along that strand of DNA, running along the outside of these strands are patterns of methylation – sometimes sections have lots of these methyl groups, sometimes they have none. When these methyl groups attach it’s methylation, when they detach it’s demethylation. If there are lots of these methyl groups around then a region is highly methylated, and vice versa.
If you want a little more information… Methyl groups only attach to the C bases in DNA (remember that C, G, T and A code – you can see these letters in the picture above) and C bases always pair up with G bases on the opposite strand. When a C is directly followed by a G along the same strand (so you see two C-G pairs next to one another, as with the first two vertical bars in the picture above), we call this a CpG site; the “p” stands for the phosphate that links the two bases along the strand. Papers on DNA methylation tend to focus on CpG sites because whether the C bases in these sites are methylated or not is particularly important – partly because we tend to find a lot of CpG sites around the parts of DNA which are important for triggering gene expression.
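To make CpG sites a bit more tangible, here is a minimal Python sketch that scans a sequence for them. It assumes the usual convention that a CpG site is a C immediately followed by a G when reading along one strand. The sequence is invented and `find_cpg_sites` is a hypothetical helper name, not a function from any real genetics library.

```python
def find_cpg_sites(sequence: str) -> list[int]:
    """Return the 0-based position of every C that is directly followed by a G."""
    seq = sequence.upper()
    return [i for i in range(len(seq) - 1) if seq[i] == "C" and seq[i + 1] == "G"]


# Invented sequence, just for illustration.
sequence = "ATCGGCGTACCGTTAG"

print(find_cpg_sites(sequence))  # [2, 5, 10]
```

Notice that the C at position 9 is not counted, because the letter after it is another C rather than a G.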
Why do we care about DNA methylation?
And so this is when we come to the reason why DNA methylation is of such interest – methylation seems to be associated with gene expression. In other words, methylation is associated with whether our genes get read or not and whether the instructions within actually get used in our body or not – so it’s a pretty important thing to consider when looking at genes! If the gene is there but not getting read, it’s almost like it being switched off or invisible.
Quick aside for those interested in a bit of the nitty gritty – it’s worth noting that we don’t really know how DNA methylation is linked to gene expression. Firstly, while there tends to be a link between high methylation and low gene expression (= lots of methyl groups on the outside of DNA strands, not a lot of the gene being read), we don’t know what happens first. Do high levels of methylation physically block important molecules (transcription factors and RNA polymerase) from accessing genes and being able to read them? This has been the prevailing view. Or is it simply that genes with high expression happen to also have low methylation? Secondly, there has been increasing evidence that sometimes high methylation is linked with high gene expression! So we’ve got some stuff to figure out there…
But there’s some kind of link and critically – big reveal! – while we’re stuck with our genetic code throughout our lifetime, these patterns of methylation change. So, in that last picture, those Cs and Gs and As won’t change, but the red Ms might change. If you think about it, we need our bodies to do different things during different parts of our life, so of course these patterns and the bits of genetic instruction we need to be reading will change over time. Also, while our body is forming in the womb, because every cell in our body has all of our chromosomes, we need something to tell certain cells to only listen to the genetic instructions relevant for becoming a hand, other cells to only listen to the instructions relevant for becoming an eye, etc. We need something extra on top of our genetic code which changes and adapts.
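One way to picture “the letters stay the same but the Ms change” is to keep the sequence as a fixed string and store the methylated positions separately. The Python sketch below does that and works out what fraction of the CpG sites (each C directly followed by a G) carries a methyl group. Everything here is invented for illustration: the sequence, the two snapshots, and the helper names `cpg_positions` and `methylation_level`. It is a toy model, not how methylation is measured in a real study.

```python
def cpg_positions(sequence: str) -> list[int]:
    """Positions of each C that is directly followed by a G."""
    return [i for i in range(len(sequence) - 1) if sequence[i:i + 2] == "CG"]


def methylation_level(sequence: str, methylated: set[int]) -> float:
    """Fraction of CpG sites in the sequence that carry a methyl group."""
    sites = cpg_positions(sequence)
    if not sites:
        return 0.0
    return sum(1 for pos in sites if pos in methylated) / len(sites)


# The genetic code itself never changes in this toy example...
sequence = "ATCGGCGTACCGTTAG"

# ...but the set of methylated Cs differs between two made-up snapshots.
snapshot_one = {2, 5, 10}  # every CpG site methylated
snapshot_two = {5}         # only one of the three still methylated

print(methylation_level(sequence, snapshot_one))
print(methylation_level(sequence, snapshot_two))
```

The first snapshot would count as a highly methylated region, and the second much less so, even though the underlying string of letters is identical in both.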
Now, we’ve known all this for quite a long time. But in recent years the question has been, if our methylation patterns change over our development, do they also change in response to our environment? How flexible are they? How much do they differ between people, and what information can we get from this? Then two really big questions are: Can we deliberately influence methylation patterns to help prevent or treat illness? And if our methylation patterns do change in response to our experiences, do we then pass this on to our children?
The short answer to most of this is that we don’t really know. Some things are more certain than others – some things we do definitely affect DNA methylation. Smoking has a very clear effect on methylation patterns (research article), as do certain foods and medications, though the impact of these changes isn’t fully clear. There might be meaningful methylation differences between people that give us extra information on top of knowing someone’s genetic code – but a lot of the studies out there have flaws and are difficult to interpret (which I’ll explore more in a future post). But least certain of all is the idea that we can deliberately change our methylation to achieve specific outcomes or that we pass on experiences via methylation to our children – the evidence for this is very limited in humans. So, going back to all those media articles and hype at the start of this post, while the study of epigenetics has raised these ideas and is interested in exploring all of these questions, unfortunately the headlines have focused on the ideas with the least evidence!
So… what about epigenetics??
You’ll notice I haven’t referred to epigenetics throughout that description of DNA methylation and that’s because I said I’d come back to the issue of defining epigenetics at the end…
So epigenetics can be defined in a number of ways.
You can take the word at face value – epi means beyond/above/on top of, so epigenetics means anything beyond/above/on top of the genome. Or you can take the historical meaning from the guy who coined the term, Waddington – the basic idea that we start with a genetic profile which gives us a range of future routes and over our development different internal or external experiences guide us down different routes. Hopefully you’ll agree that, by these definitions, DNA methylation is epigenetic.
Over the last decade or so, epigenetic variation has often been defined as dynamic changes in the structure of chromatin (the collective term for DNA wrapped around cores to make up chromosomes) without changes in DNA sequence – the methylation of DNA is part of the changing structure of chromatin but it doesn’t change the code of the base letters, so it would again meet the criteria to be epigenetic.
However, another common definition for epigenetic variation is that it refers to mechanisms which control and regulate gene expression – this becomes slightly more controversial due to some of the uncertainty I discussed above. Are all the changes in DNA methylation that we study actually controlling and causing changes in gene expression? Jury’s out on that one.
Other definitions use aspects of the above but specify that it must be heritable – again, this becomes controversial due to the lack of evidence that changes in DNA methylation are passed on to children.
So, I might be leaving you a little confused about what epigenetics is (join the club) – but hopefully I’ll be leaving you a little clearer on what DNA methylation is! Which, chances are, if you’ve heard about epigenetics in the news, is what they were talking about.
Next time, I’ll talk about how we study DNA methylation, some of the issues we run into, and some of the limitations with the studies out there.
If you want to do some further reading, here are a few articles and scientific papers which I recommend, which recap and expand on what I’ve said (though most of them still talk with more certainty than I think is justified so bear that in mind):
- Guardian article by Adam Rutherford [Adam Rutherford is great, follow him on Twitter and read his book!]
- Another decent Guardian overview
- Nice overview with some info on relevance in cancer
- Slightly more technical overview
- A more technical discussion of the different definitions of epigenetics in Nature
- Great Nature review from a couple years back for those wanting much more of the biology
- Wikipedia is pretty solid too but very dense!
- Finally, this is a really nicely illustrated video which goes into a bit more of the biology including histone modifications which I haven’t even touched on:
05/04/2018 – I made a few edits based on some v helpful Twitter feedback from Cath Ennis. In particular, correcting the definition of CpG sites and adding a disclaimer that I’m only speaking for human DNA methylation.