Who does it take to build better genetic insights? Everyone.

Genome Diversity

Over the last decade, genetic testing has become increasingly accessible to the masses. With the cost of sequencing at a historic low and a growing number of DNA products available on the market, more individuals than ever before can choose to sequence their DNA and open a world of genetic insights.

That level of accessibility is incredibly exciting. Now, we have the opportunity to remove another barrier.

For the most part, genetic studies have drastically underrepresented non-European populations. In 2009, an analysis revealed that 96% of participants in genome-wide association studies (GWAS), which seek genetic associations with biological traits, were of European descent [2]. A similar analysis performed seven years later showed slight improvement in those numbers, but highlighted a concerning trend: although the proportion of GWAS participants of European descent dropped from 96% to 81%, individuals of African, Latin American, and indigenous/native descent remained significantly underrepresented in genomics research. Together, they made up less than 4% of all participants!

What does this mean?

Over the last decade, GWAS have been used to find tens of thousands of significant associations between genetic variants and complex biological traits. For example, these studies have allowed the scientific community to make significant progress in identifying genetic variants associated with Type 2 diabetes, Parkinson’s disease, heart disorders, obesity, Crohn’s disease, schizophrenia, and prostate cancer. Although we have accumulated an incredible amount of knowledge, the fact remains that the vast majority of this data comes from European samples. We cannot confidently apply many of these associations outside of European ancestral groups until we have repeated the studies in non-European populations; in fact, sometimes the associations do not replicate in these populations at all. As a result, individuals of non-European ancestry are more likely than their European counterparts to receive inconclusive genetic test results.

A little history

In the early days of genetic research, investigators used “convenience sampling,” a process of recruiting study participants based on ease of recruitment, instead of trying to assemble pools of research participants on their own. For the sake of efficiency, they piggybacked off of existing large cohorts, such as the Framingham Heart Study, which was started in 1948. Unfortunately, these existing cohorts mainly consisted of individuals of European ancestry, so these genetic studies also reflect this lack of diversity.

Another reason researchers chose study participants of the same ancestral background was to minimize confounding variables—outside variables that can result in false associations. Individuals from the same ancestral populations have similar genetic profiles in parts of their DNA because their ancestors lived in close proximity to each other. These shared genetic markers are informative of ancestry, but may be completely unrelated to the trait being studied. Therefore, when participants of diverse ancestry are included in the same study, these markers can be falsely associated with the trait if the true association is actually with an underlying cause that is tightly linked to geography, such as environmental conditions or lifestyle.
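The confounding effect described above can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the two groups "A" and "B", the marker frequencies, and the trait risks are hypothetical numbers chosen for illustration, and carrying the marker is treated as a simple yes/no. In the simulation, the trait depends only on which group a person belongs to (standing in for environment or lifestyle) and never on the marker itself.

```python
import random

# Hypothetical parameters (not taken from any real study):
# the marker is common in group A and rare in group B, and the trait is
# more frequent in group A for reasons unrelated to the marker.
GROUPS = {
    "A": {"marker_freq": 0.8, "trait_risk": 0.3},
    "B": {"marker_freq": 0.2, "trait_risk": 0.1},
}

def simulate(n_per_group=20000, seed=0):
    """Return a list of (group, carries_marker, has_trait) tuples."""
    rng = random.Random(seed)
    people = []
    for name, params in GROUPS.items():
        for _ in range(n_per_group):
            carrier = rng.random() < params["marker_freq"]
            # The trait is drawn independently of the marker.
            trait = rng.random() < params["trait_risk"]
            people.append((name, carrier, trait))
    return people

def risk_difference(people):
    """Trait rate among marker carriers minus trait rate among non-carriers."""
    carriers = [trait for _, carrier, trait in people if carrier]
    others = [trait for _, carrier, trait in people if not carrier]
    return sum(carriers) / len(carriers) - sum(others) / len(others)

people = simulate()

# Pooling both groups makes the marker look associated with the trait.
pooled = risk_difference(people)

# Analyzing each group separately makes the apparent association vanish.
within = {
    name: risk_difference([p for p in people if p[0] == name])
    for name in GROUPS
}

print(f"pooled risk difference: {pooled:.3f}")
for name, diff in within.items():
    print(f"risk difference within group {name}: {diff:.3f}")
```

With these assumed numbers, the pooled comparison shows a clear gap in trait rates between carriers and non-carriers, while the gap within each group is close to zero. That is the false association researchers were trying to avoid by studying a single ancestral group.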

Why is diversity important to genetic studies?

By continuing to disproportionately rely on these large European cohorts, the genetics community runs the risk of missing the full spectrum of human genetic diversity. When a person receives a genetic test result that says they have a variant that affects their health, that report is based on the notion that the data has been interpreted accurately. But what if it hasn’t been? A common concern is that a variant that is rare in European populations might be mistakenly classified as disease-causing, whereas if that same variant had been studied in another population, we would have seen that it is actually relatively common and benign [1].

This recruitment bias can have significant consequences for non-European populations. In 2013, researchers discovered that one set of variants influences dosing of the blood-thinning medication warfarin in individuals of European descent, while a different set of variants influences dosing in African populations [3]. Warfarin is a drug that must be dosed very carefully, because a starting dose that is too high can cause internal bleeding and a starting dose that is too low will not be effective against blood clots. Physicians start patients on a standard dose and monitor them closely. However, finding a safe dose is complicated, especially in African Americans, and warfarin is responsible for one-third of US hospital visits due to adverse drug reactions in patients 65 or older. By including African Americans in their study, researchers discovered a genetic variant that is associated with a warfarin dose reduction in individuals of African descent. Had researchers limited themselves to European cohorts, they would never have discovered this variant, and many patients could have had serious medical complications as a result of being incorrectly dosed.

What can we do moving forward?

Efforts are being made to include underrepresented populations in genetic studies. Helix co-founder James Lu was involved in the 1000 Genomes Project, which was one of the first projects to attempt to address this issue by sequencing the genomes of 2,504 people from 26 diverse populations. Furthermore, in 2016, the National Cancer Institute launched the Breast Cancer Genetic Study in African-Ancestry Populations initiative to investigate the genetic factors that contribute to breast cancer risk in black women, who tend to be diagnosed with a type of breast cancer known as “triple negative breast cancer” that is harder to treat. The NIH also supports a number of long-term study cohorts, such as the Multi-Ethnic Study of Atherosclerosis, the Hispanic Community Health Study, and the All of Us Research Program, which aims to have about half of its participants be from non-European populations.

Additionally, large databases are being assembled around the world to increase our understanding of underrepresented genomes. A good example is the China Kadoorie Biobank, which aims to collect information on more than 500,000 people of Chinese descent. These types of major efforts are encouraging, and signal a change in the right direction.

Still, more progress needs to be made. As Nature puts it, “The message being broadcast by the scientific and medical genomics community to the rest of the world is currently a harmful and misleading one: the genomes of European descendants matter the most.” Moving forward, continuing to actively recruit and include research participants from underrepresented populations will allow us to improve testing outcomes for individuals from all ancestral groups.

We recognize this need, and we’re taking it seriously. Helix has recently joined with the National Geographic Society in supporting the African Diaspora Genome Database from Howard University’s Cobb Research Laboratory, which aims, in the lab’s words, to develop “a fully documented genomic database of African-descended peoples of the transatlantic and Red Sea diasporas.” Given the incredible advances that we have made in genetic research and in access to genetic testing over the last decade, lack of diversity is a hurdle that we certainly have the resources and ability to overcome. These types of initiatives are critical to unlocking personalized genetic insights for everyone, and we look forward to seeing even more of them in the future.


[1] Manrai, Arjun K., et al. “Genetic Misdiagnoses and the Potential for Health Disparities.” The New England Journal of Medicine 375.7 (2016): 655–665. PMC. Web. 10 Nov. 2017.

[2] Need, A. C., and D. B. Goldstein. “Next Generation Disparities in Human Genomics: Concerns and Remedies.” Trends in Genetics 25.11 (2009): 489–494.

[3] Perera, Minoli A., et al. “Genetic Variants Associated with Warfarin Dose in African-American Individuals: A Genome-Wide Association Study.” Lancet 382.9894 (2013): 790–796. PMC. Web. 10 Nov. 2017.
