What is ENCODE?
You’ve probably heard a lot about ENCODE. But you might still be wondering just what the big deal is, why some people find it controversial, and how it affects you. Ask Science explains.
Lee Falin, PhD
By now you’ve heard the hype surrounding ENCODE, more formally known as “The Encyclopedia of DNA Elements.” You might have heard sensational phrases thrown around like “Disproves widely held beliefs” or “Shatters what we used to believe.” So just like with our discussion of the Higgs Boson, let’s cut through the hype and see what exactly ENCODE is, and what all the excitement is about.
Question #1: What exactly is ENCODE?
Traditionally, most studies involving genetics have centered on so-called “coding” regions, that is, the parts of the DNA where genes are. This is because genes contain the blueprints for proteins, which seem to get all of the flashy jobs in the cell. The rest of the DNA just wasn’t that exciting, though most people thought it had to be useful for something.
ENCODE combined the results of hundreds and hundreds of experiments across multiple types of cells in an attempt to systematically catalog what the rest of the genome was up to.
What they found was that while a small portion of the genome (only around 2%) contains the blueprints for proteins, another 78% of the genome contains biochemical switches that act to control how these genes are turned on and off.
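To get a feel for what those percentages mean in practice, here is a rough back-of-the-envelope sketch. It assumes the human genome is about 3 billion base pairs long, which is a commonly quoted round number and not a figure from the ENCODE papers themselves. The category names and the `breakdown` function are just made up for illustration.

```python
# Approximate length of the human genome in base pairs.
# This round number is an assumption for illustration only.
GENOME_BASE_PAIRS = 3_000_000_000

# Percentages quoted above: protein blueprints and biochemical switches.
SHARES = {"protein blueprints": 2, "biochemical switches": 78}


def breakdown(total_bp, shares):
    """Return base-pair counts per category, plus whatever is left over."""
    counts = {name: total_bp * pct // 100 for name, pct in shares.items()}
    counts["everything else"] = total_bp - sum(counts.values())
    return counts


if __name__ == "__main__":
    for name, bp in breakdown(GENOME_BASE_PAIRS, SHARES).items():
        print(f"{name}: about {bp / 1e6:,.0f} million base pairs")
```

Under that assumption, the protein blueprints take up only tens of millions of base pairs, while the switches account for billions.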
Question #2: What’s all this about junk DNA? Did people really believe some of our DNA is junk?
Back in the 1970s, scientist Susumu Ohno famously said that most of the human genome must be “junk” because it was too unlikely that a large genome that was subject to constant mutations could be entirely functional. It’s sort of like buying a used car. Sure, that 1965 Chevy looks shiny, but it’s so old, what are the chances that all the parts still work?
Well, most scientists didn’t believe him then, and over time more and more evidence has been found that disproves this claim. Unfortunately, like most snappy terms, the phrase “junk DNA” stuck around in the media, and over time it was used to refer to any part of the genome that didn’t contain the blueprints for proteins.
Question #3: Why is the media reporting that ENCODE disproves this “widely held theory”?
Because sensational headlines sell! While many people believed that the rest of the genome must be doing something, nobody really knew what exactly it all did, nor how much of it was relevant to things like cancer and other genetic diseases. The ENCODE project is significant because it determines what kinds of things happen to each part of the genome and how those parts interact with one another. ENCODE has shown that the interactions happening between different parts of the genome are much more complex and widespread than we originally thought.
As Ewan Birney, lead analysis coordinator for the project from the European Bioinformatics Institute said:
“We’ve always known there’s another set of controls in your DNA that turn genes on and off. We uncovered the control points, or switches, that do this. And there are way, way more switches than we ever thought possible – an insane number of switches.”
Question #4: What’s all the controversy about the genome being 80% functional?
The controversy comes from the differing definitions of the word “functional.” Traditionally, a “functional” part of the genome meant a gene, the part of the genome containing the blueprint for proteins. Some people extend that meaning to include those parts of the genome that control the expression of those genes, areas typically found near the genes themselves. The definition ENCODE uses is even looser and encompasses any areas that have some type of biochemical activity. (Though some of those activities are arguably more interesting than others.)
Question #5: So what about the other 20%?
Who knows? Right now most scientists think that the other 20% represents a blend of bits that have lost their function over time along with errors made when DNA is copied for cell division. There are also likely some parts that are leftover viral and bacterial DNA which has integrated itself over the years.
Question #6: If this is all such a big deal, why haven’t I heard about it until now?
You haven’t heard about it because there was a publication embargo on the research. A project the size of ENCODE took years of work by hundreds of individuals. As the project progressed, papers were written describing its progress and impact, but they were held back from publication. The idea was that all of the papers would come out at once, making a bigger impact for the project so that more people would sit up and take notice.
Critics of this approach say that this type of embargo is driven by journals in an attempt to make a greater profit from the research, and that it has cost scientists (and the world) up to 10 years of access to this data. However, some of the data, along with a guide to its use, has been available for some time.
Question #7: Is this going to affect me in some way?
The biggest message you should take away from the ENCODE project right now is that “DNA is not destiny.” ENCODE shows more definitively than ever that what makes you, you isn’t just a set of genes that you can’t do anything about, but rather the choices you make and the environment your body is exposed to.
As with the Higgs Boson, the most immediate change most people will see will be in the genetics chapters of their children’s biology textbooks. However, in the next generation (or maybe sooner if we’re fortunate), this information will change much of how we assess the risk of, diagnose, and treat disease.
Question #8: Is there anything else interesting I should know about ENCODE?
I’m glad you asked! Aside from the data itself, there are two other things that came from ENCODE that I’m particularly excited about: two concepts about the publication of the data that I hope will set a precedent for other genetics studies. The first is the concept of threads. If you look at Nature’s ENCODE page, you’ll see that there are over 30 papers (and more to come) published on the ENCODE project’s findings.
If you’re interested in a certain topic of genetics, the old method would have required you to manually search through all of those papers and pick out the relevant bits. Thankfully, threads do this for you by highlighting the relevant parts of each paper for a given topic. It’s a fantastic idea and one that I hope sees more widespread adoption in the future.
The second is the ENCODE virtual machine. One of the traditional difficulties of building off of the findings of other teams engaged in bioinformatics (the computational part of genetic research) is that their results are sometimes very tricky to reproduce. You usually have access to their data and a description of their analysis, but sometimes getting from that to the final result can be challenging (if not impossible) without the detailed steps of how they cleaned up the data and formatted it into its final form. The ENCODE virtual machine gets around this by giving everyone access to a downloadable, virtual computer that they can use to see the exact code used to generate the ENCODE results. This, even more than journal threads, is something that I hope sees wider adoption in the future.
Conclusion
Hopefully that answered most of your questions about ENCODE. You can read more about ENCODE on Nature.com’s ENCODE page. As always, if you have further questions, please leave them in the comments below.
If you liked today’s episode, you can become a fan of Ask Science on Facebook or follow me on Twitter. If you have a question that you’d like to see on a future episode, send me an email at everydayeinstein@quickanddirtytips.com