How do scientists connect the dots between your genes?

June 29, 2018

How do geneticists find the parts of your DNA that influence or determine your traits? As we’ve written previously, the path to discovery often starts with a genome-wide association study (GWAS). A successful GWAS generates a number of leads that can then be followed to uncover the biology behind the trait in question, such as BMI or sleep traits.
These leads come in the form of “tag SNPs,” which are common DNA variants meant to identify a certain region of the genome for follow-up study. You can think of a tag SNP like the X on a treasure map, or a ZIP code. It draws our attention to a location, but we have to do more digging to find exactly what we’re looking for: in our case, the causal genetic variant that directly influences the trait in question. Imagine that a friend of yours wants to meet you for lunch. “Meet me in San Francisco” is much less informative than “Meet me near the 24th St. Mission BART station.” With the second message, you have enough information to start heading in the right direction, but you’ll still need more to find exactly which restaurant your friend is at.
A frequent starting point for identifying the gene related to a tag SNP is to look for the gene closest to it (sometimes the SNP even sits within a gene). Since sections of DNA that are close to each other are more likely to be inherited together, it makes sense to use physical proximity as a starting point. It’s important to note, though, that the closest gene is not always the right one. For example, there’s a variant located within the MCM6 gene that is associated with lactose intolerance because it impacts the production of the protein lactase from the LCT gene, not MCM6!
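To make the "closest gene" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the gene names, the coordinates, and the helper functions are made up, and real analyses rely on curated genome annotations rather than a hand-written list. It only shows the basic logic of measuring the distance from a tag SNP to each nearby gene and picking the smallest.

```python
from typing import List, NamedTuple, Optional


class Gene(NamedTuple):
    name: str   # hypothetical gene label
    start: int  # first base of the gene (made-up coordinate)
    end: int    # last base of the gene (made-up coordinate)


def distance_to_gene(position: int, gene: Gene) -> int:
    """Distance in bases from a SNP to a gene; 0 if the SNP lies within it."""
    if gene.start <= position <= gene.end:
        return 0
    return min(abs(position - gene.start), abs(position - gene.end))


def nearest_gene(position: int, genes: List[Gene]) -> Optional[Gene]:
    """Return the gene closest to the SNP, or None if no genes are given."""
    if not genes:
        return None
    return min(genes, key=lambda g: distance_to_gene(position, g))


# Made-up genes on a single chromosome, for illustration only.
genes = [
    Gene("GENE_A", 1_000, 5_000),
    Gene("GENE_B", 8_000, 12_000),
    Gene("GENE_C", 20_000, 26_000),
]

tag_snp_position = 9_500  # hypothetical tag SNP that falls inside GENE_B
closest = nearest_gene(tag_snp_position, genes)
print(closest.name)
```

As the MCM6 and LCT example shows, the gene this kind of lookup returns is only a first guess, and the real answer may be a neighbor.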

Scientists also use existing information about the genes in a region to identify the causal variants behind a GWAS SNP. For example, if we’re studying BMI, it would make sense to find a tag SNP near genes that have previously been linked to controlling appetite. In the metaphor where we’re meeting our friend for lunch, you might get to the neighborhood and start looking inside taquerias if you know that your friend likes burritos. However, the gene we’re looking for doesn’t always have an obvious connection to the trait we’re studying. One of the reasons for performing a GWAS is to identify genes and biological pathways that we didn’t already know were related to a given trait.
Ultimately, genetic discovery doesn’t stop with the GWAS. We have to do additional experiments to test whether the genes we’ve identified do indeed influence the trait we’re studying. Often this involves a number of different approaches using biochemistry or genetic experiments in model organisms. Another possibility is to look for rare genetic variants in the target gene in another population with the trait. You might ask why we don’t start by looking for rare variants, and in some cases we do! Previously, tag SNPs provided a more cost-effective starting point, but the cost of DNA sequencing has decreased dramatically in recent years. This allows us to more easily follow up on previous GWAS, and to do new rare variant association studies. The great thing about research is that it is always growing, making new discoveries, and finding new ways to make discoveries!