Tuning Into Allele Frequencies

Positive Result Rate

When friends and acquaintances find out that I provide consulting services to genetic testing companies, I’m frequently asked how often people get back a positive result from a genetic test. It’s a loaded question for many reasons. Defining a positive test seems straightforward; it means that the laboratory found a change in a particular gene or chromosome. However, there are many different kinds of genetic tests, and to further complicate matters, how often a certain test result is seen can vary in different populations. I’ll review these issues in more detail below. At its core, the answer to the question can be found in allele frequencies. Allele frequencies show the genetic diversity of a species population – how rich is the gene pool?

Our understanding of how common human genetic variants are has changed over time. In the early days of sequencing, reference populations used could be small, on the order of 100 people. The available reference sets have grown larger and more diverse, and the most current is the Genome Aggregation Database (gnomAD), which contains 141,456 unrelated individuals.

Admera Health Allele Frequencies Reference Set Over Time

Figure 1: The population breakdown of the Genome Aggregation Database (gnomAD). In its first release, which contained exclusively exome data, gnomAD was known as the Exome Aggregation Consortium (ExAC). Other earlier reference populations included the National Heart, Lung, and Blood Institute (NHLBI) Exome Sequencing Project (ESP) and the 1000 Genomes Project. The figure shows how the reference databases have grown larger and more diverse over time. Source: https://macarthurlab.org/2018/10/17/gnomad-v2-1/

I’ve been working with allele frequencies since before any of these reference populations were available, and to me, it’s amazing to be able to use the gnomAD browser today to pull up allele frequencies for nearly any variant that I need to work with, broken down by gender and ancestry. Thanks to gnomAD, we now know that the vast majority of human mutations are rare, with minor allele frequencies under 1%, and that the plurality have been observed only once in gnomAD (i.e., “singletons”) [1].

Allele Frequency

So what is an allele frequency? Well, an allele is one of the two (or more) alternative forms of a gene that exist at the same place on a chromosome. The allele frequency is how often one form occurs in some reference population. An allele frequency is a number between 0 and 1, or it can be expressed as a percent between 0% and 100%. Allele frequency is not the same as “how many people have it”; it is “how many chromosomes have it.” To answer “how many people have it,” you need the genotype frequency.
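To make the distinction concrete, here is a minimal sketch in Python. It assumes an autosomal variant in people who each carry two copies of the chromosome. The function name and the genotype counts are made up for illustration and do not come from any real dataset.

```python
def summarize_variant(n_hom_ref, n_het, n_hom_alt):
    """Summarize one variant from hypothetical genotype counts.

    n_hom_ref: people with zero copies of the variant allele
    n_het:     people with one copy (heterozygous)
    n_hom_alt: people with two copies (homozygous)
    Assumes an autosomal site, so every person has two chromosomes.
    """
    n_people = n_hom_ref + n_het + n_hom_alt
    n_chromosomes = 2 * n_people
    variant_alleles = n_het + 2 * n_hom_alt
    return {
        # "How many chromosomes have it?"
        "allele_frequency": variant_alleles / n_chromosomes,
        # "How many people have it?" split by genotype
        "heterozygote_frequency": n_het / n_people,
        "homozygote_frequency": n_hom_alt / n_people,
        # People with at least one copy
        "carrier_frequency": (n_het + n_hom_alt) / n_people,
    }


# Illustrative counts only: 100 people, 9 with one copy, 1 with two copies.
result = summarize_variant(n_hom_ref=90, n_het=9, n_hom_alt=1)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, 11 of 200 chromosomes carry the variant, while 10 of 100 people carry at least one copy, so the two questions give noticeably different answers.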

Admera Health Allele Frequency Genotype Frequency

Each person’s DNA differs from the reference genome in more than 3 million places that can be identified by sequencing, but only a few thousand of these variants are in the coding regions of disease-causing genes. For Mendelian disorders, that is, those caused by changes or alterations in a single gene, each of us typically carries only a few variants that are pathogenic, meaning that they could cause disease. (Keep in mind that many Mendelian disorders are recessive, that is, they only result in disease when a person inherits two pathogenic variants in the same gene, one from the mother and one from the father. For dominant conditions, one variant alone is enough to cause the disease.)

Most variants that cause Mendelian disease are rare. In fact, the guidelines for interpreting sequence variants for Mendelian disease, which clinical geneticists use to find the “needle in the haystack” disease-causing variant(s), state that an allele frequency above 5% in ESP, 1000 Genomes, or ExAC counts as stand-alone evidence that the variant is benign rather than pathogenic [2] (the guidelines predate gnomAD).
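As a rough sketch, the frequency cutoff described above could be checked like this in Python. The function name, the dictionary layout, and the example frequencies are hypothetical, and this covers only the single 5% criterion, not the rest of the interpretation guidelines.

```python
BENIGN_CUTOFF = 0.05  # the >5% allele frequency threshold from the guidelines


def exceeds_benign_cutoff(frequencies_by_database, cutoff=BENIGN_CUTOFF):
    """Return True if any reference database reports a frequency above the cutoff.

    frequencies_by_database maps a database name to an allele frequency
    between 0 and 1, or to None when the variant is absent from that database.
    """
    return any(
        frequency > cutoff
        for frequency in frequencies_by_database.values()
        if frequency is not None
    )


# Hypothetical values for illustration; not taken from any real variant.
example_variant = {"ESP": 0.002, "1000 Genomes": 0.061, "ExAC": None}
print(exceeds_benign_cutoff(example_variant))
```

The comparison is strictly greater than, so a variant sitting at exactly 5% is not flagged by this sketch.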

However, other kinds of genetic test results can include variants with allele frequencies greater than 5%, and these can reveal significant and actionable health information. Pharmacogenetics is a type of genetic testing that helps determine what medication and dosage will be most effective and beneficial for you if you have a particular health condition or disease. Many of the genetic variants that are used in pharmacogenetic testing are polymorphisms, that is, variants that are relatively common, usually defined as having an allele frequency of 1% or higher. A SNP (single nucleotide polymorphism) is a type of polymorphism that changes exactly one base pair. Health care providers using pharmacogenetic tests are likely to see a positive test result in many, if not most, patients. Of course, this result is most actionable at the time of testing if it relates to a medication that the patient is currently taking or that the provider is considering prescribing. Having the pharmacogenetic result in a patient’s medical chart before they need the medicine can also prove actionable in the long run, if the medications are needed later.

Two health care providers using the same pharmacogenetic test may see very different rates of positive tests, simply because the communities that they serve may come from different parts of the world. For example, the variant allele HLA-B*15:02 is strongly associated with greater risk of serious adverse effects, such as Stevens–Johnson syndrome and toxic epidermal necrolysis, in patients treated with carbamazepine or oxcarbazepine (used to treat epilepsy). The allele frequency of this variant is 7% among East Asians, but it is much rarer in other populations: 0.16% in the Americas, 0.04% in Caucasians (European and North American), and 0.00% in African Americans [3]. Thus, some medical practices might see a positive HLA-B*15:02 test result fairly frequently, while other practices might hardly ever see the same positive result.
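To get a feel for what these allele frequencies could mean for a practice, here is a back-of-the-envelope sketch in Python. It relies on two assumptions that are not in the frequency tables themselves: that a “positive” result means carrying at least one copy of the allele, and that genotypes follow Hardy-Weinberg proportions, so the chance of carrying no copies is (1 - p) squared. The function name is made up for illustration, and real patient populations will deviate from this idealized model.

```python
def expected_positive_rate(allele_frequency):
    """Estimate the share of people carrying at least one copy of an allele.

    Assumes Hardy-Weinberg proportions: with allele frequency p, the chance
    that both of a person's chromosomes lack the allele is (1 - p) ** 2.
    """
    if not 0.0 <= allele_frequency <= 1.0:
        raise ValueError("allele_frequency must be between 0 and 1")
    return 1.0 - (1.0 - allele_frequency) ** 2


# HLA-B*15:02 allele frequencies quoted in the text, written as proportions.
hla_b_1502_frequencies = {
    "East Asian": 0.07,
    "Americas": 0.0016,
    "Caucasian": 0.0004,
    "African American": 0.0,
}

rates = {
    population: expected_positive_rate(frequency)
    for population, frequency in hla_b_1502_frequencies.items()
}
for population, rate in rates.items():
    print(f"{population}: roughly {rate:.2%} of patients expected to test positive")
```

Under these assumptions, an allele frequency of 7% corresponds to roughly 13 to 14 patients in every 100 testing positive, because each person has two chances to carry the allele.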

A recent study [4] investigated the spectrum of variation found in genes that are important for drug response (pharmacogenes) across human populations. The reference population used was the 1000 Genomes Project Phase 3 data (2,504 individuals from 26 global populations). The patterns revealed were similar to those described above for all genes: the vast majority of variants are rare. Ninety percent of variants had an allele frequency less than 0.5%, and about half of variants were singletons. Amazingly, 97% of individuals in the 1000 Genomes Project had at least one clinically relevant variant. The frequencies of some of the more common variants are shown in the figure below. Variants in certain genes, such as CYP2C19, CYP4F2, and SLCO1B1, displayed differences in allele frequencies between populations. For SLCO1B1, a single coding SNP, rs4149056T>C, increases systemic exposure to simvastatin and the risk of muscle toxicity. Simvastatin is among the most commonly used prescription medications for cholesterol reduction. This SNP is most common in Europeans and least common in Africans. Again, some medical practices might see a positive test result fairly frequently, while other practices might hardly ever see the same positive result.

Admera Health Allele Frequency Scatterplot

Scatterplot of allele frequencies of clinically relevant variants in the different population groups. Variants in certain genes, such as CYP2C19, CYP4F2, and SLCO1B1, displayed differences in allele frequencies between super-populations. Abbreviations: AFR = African; AMR = admixed American; EAS = East Asian; EUR = European; SAS = South Asian. Source: Wright et al., 2018 [4]

In conclusion, while most genetic variants are rare, having a genetic variant that may influence your health is common. Knowledge is power, and I hope this blog post has given you a little more of both!

About the Author:

Elana Silver, MS, is Principal Consultant at Laurelton Research, which provides public health planning, analysis, and education services for a range of clients. Elana works in a wide variety of subject areas, with an emphasis on environmental and genetic epidemiology. She is working with Admera and other biotech companies to provide people with actionable information about their genetic susceptibility to disease, including how their genes can affect their response to medications. Additionally, Elana recently collaborated with the California Department of Public Health on a study of occupational exposures to breast cancer and e-cigarettes, and developed online CME courses for health care providers. Elana received her Master’s degree in epidemiology from UC Berkeley and lives in Oakland, California.

References:

1. Hernandez et al., 2018. Singleton Variants Dominate the Genetic Architecture of Human Gene Expression. bioRxiv preprint: https://www.biorxiv.org/content/10.1101/219238v1.full.pdf

2. Richards et al., 2015. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genetics in Medicine 17(5):405-24.

3. CPIC HLA-A and HLA-B frequency tables: https://cpicpgx.org/guidelines/guideline-for-carbamazepine-and-hla-b/

4. Wright et al., 2018. The global spectrum of protein-coding pharmacogenomic diversity. The Pharmacogenomics Journal 18:187-195.