Interpretome, a new tool to explore your genomic data
Jun 20th, 2011 | By Trey | Category: Featured, Learning, Tutorial
Interpretome is a tool created by a team at Stanford that allows individuals to explore their genomic data (currently from either 23andMe or Lumigenix) in new ways. There are ways to explore your ancestry, health and drug response data. Additionally, there are some exercises to try out and a ‘SNP’ lookup tool. I’ve put together a short video tip to get you started. It shows you how to get your data from 23andMe, load it into Interpretome and start exploring ancestry. I’ll do two more tips over the next couple of weeks to further explore the tool (more ancestry and clinical). Additionally, Daniel at Genomes UnZipped and Blaine at Genetic Genealogist have both written about it, and you might want to check out those posts.
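If you’d like to peek at your raw data yourself before loading it anywhere, here is a minimal Python sketch of a ‘SNP’ lookup. It assumes the raw data download is a tab-separated text file with ‘#’ comment lines and four columns (rsID, chromosome, position, genotype), with ‘--’ for a no-call. The sample rows, rsIDs and function name below are made up for illustration, and this is not how Interpretome itself does it.

```python
import csv
import io

# Made-up sample in the assumed layout; the rsIDs and genotypes are placeholders.
SAMPLE = """\
# rsid\tchromosome\tposition\tgenotype
rs0000001\t1\t12345\tAG
rs0000002\t8\t67890\tCC
rs0000003\tX\t13579\t--
"""

def load_genotypes(handle):
    """Map rsID -> genotype, skipping '#' comment lines and blank lines."""
    data_lines = (line for line in handle
                  if line.strip() and not line.startswith("#"))
    rows = csv.reader(data_lines, delimiter="\t")
    return {rsid: genotype for rsid, _chrom, _pos, genotype in rows}

genotypes = load_genotypes(io.StringIO(SAMPLE))
print(genotypes.get("rs0000002"))
```

To try it on a real download, you would pass an opened file instead of the in-memory sample, for example `load_genotypes(open("your_raw_data.txt"))`, where the file name is whatever you saved yours as.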
For the novice, much of the terminology used might be a bit opaque. It’s worth trying out, though, because you will learn more and get a sense of which things you can know from your genomic data and which aren’t so clear.
To make it easier for the ‘rest of us,’ I’m going to explain some of the concepts and terminology in this blog post (focusing on ancestry for now):
PCA*: In a sentence, “Principal Component Analysis” attempts to place your genome in the geographic or ancestral context of many other genomes. It is similar to 23andMe’s global similarity analysis, except that it lets you play around with the options.
Over the last decade there have been several projects to survey human genetic diversity. These projects have collected genomic data from many individuals in several populations. PCA allows you to compare your genome to these and see where you fit. There are several sets of population data to compare with: some from the “Human Genome Diversity Project,” some from “HapMap” and some from POPRES. Each of these is divided regionally.
You pick a region to compare to and a ‘resolution’ (pick the highest if you have a few moments to wait). For our purposes, choose the PC1 and PC2 axes.
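If you’re curious what is going on under the hood, here is a minimal Python sketch of the idea. It is not Interpretome’s actual code. It invents three reference populations (A, B and C) with made-up allele frequencies, finds the first two principal components (PC1 and PC2) of that panel, and then drops a new individual onto the same axes to see which population they land nearest. All function names, population labels and numbers are assumptions for illustration, and power iteration is just one simple way to find the components.

```python
import math
import random

def simulate_genotypes(freqs, n_people, rng):
    """Draw 0/1/2 allele counts per SNP for n_people from allele frequencies."""
    return [[(rng.random() < p) + (rng.random() < p) for p in freqs]
            for _ in range(n_people)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def principal_axes(rows, k, rng, iters=50):
    """Return (column means, top-k principal axes) via power iteration with deflation."""
    n_cols = len(rows[0])
    means = [sum(col) / len(rows) for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    columns = list(zip(*centered))
    axes = []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _ in range(n_cols)]
        for _ in range(iters):
            # Multiply by X^T X, which is proportional to the covariance matrix.
            scores = [dot(row, v) for row in centered]
            v = [dot(scores, col) for col in columns]
            # Deflation: strip out the directions already found.
            for axis in axes:
                overlap = dot(v, axis)
                v = [x - overlap * y for x, y in zip(v, axis)]
            norm = math.sqrt(dot(v, v))
            v = [x / norm for x in v]
        axes.append(v)
    return means, axes

def project(person, means, axes):
    """Coordinates of one genotype vector on the principal axes."""
    centered = [x - m for x, m in zip(person, means)]
    return tuple(dot(centered, axis) for axis in axes)

def nearest_population(seed=0, n_snps=200, n_per_pop=20):
    rng = random.Random(seed)
    # Hypothetical reference panel: made-up allele frequencies per population.
    pop_freqs = {name: [rng.uniform(0.1, 0.9) for _ in range(n_snps)]
                 for name in "ABC"}
    panel = {name: simulate_genotypes(freqs, n_per_pop, rng)
             for name, freqs in pop_freqs.items()}
    all_rows = [row for rows in panel.values() for row in rows]
    means, axes = principal_axes(all_rows, k=2, rng=rng)
    # "You": one new individual drawn from population B's frequencies.
    you = project(simulate_genotypes(pop_freqs["B"], 1, rng)[0], means, axes)
    centroids = {}
    for name, rows in panel.items():
        points = [project(row, means, axes) for row in rows]
        centroids[name] = tuple(sum(c) / len(points) for c in zip(*points))
    return min(centroids, key=lambda name: math.dist(you, centroids[name]))

print(nearest_population())
```

The toy populations here are far more distinct than real human populations, so the new individual lands cleanly in one cluster. With real data the clusters overlap, which is part of why your placement moves around when you change the settings.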
Also, make sure you choose the ‘full descriptions’ instead of the abbreviations when you are able. The population abbreviations are not much use on their own. Interpretome doesn’t supply full descriptions for the HapMap populations, but you can see them here for the HapMap 2 data, and here for the more recent (and in progress) HapMap 3 data.
You will notice from your analysis that your placement changes depending on the resolution, populations and axes used. The take-home is that your placement is ‘fuzzy’ and you should take it as a guide, not gospel. If you are of mixed race or mixed geographic origins, the placement will not be precise. For example, a friend of mine is of mixed Arabic and Northern European heritage. PCA puts him smack dab in the middle of Southern Europeans. From PCA alone, he’d have a difficult time teasing apart his ancestry. It’s illustrative, not determinative.
Ancestry Painting: This is similar to the tool by the same name at 23andMe. As populations expand and migrate, there are genomic ‘signatures’ that are inherited and passed on. They can also be lost. This tool looks for those signatures and attempts to determine which you’ve inherited from which population. You have a choice to compare with the HapMap 2 data (which looked at 3 basic populations: European, Asian [Japanese/Chinese] and African) or the more extensive (but not yet complete) HapMap 3 data, which looked at 11 separate populations.
As above, there are some caveats to keep in mind. These ‘signatures’ can be lost over time depending on which half of the genome you inherited from your parents, which half they inherited from theirs, and so forth. The rule of thumb is that signatures from 5-6 generations back will most likely be seen; further back, the chance of that ‘signature’ being lost increases substantially. It doesn’t mean you don’t have that “Native American great-great-grandmother,” it only means that her signature is gone. Here’s an example. We have an African ancestor in my maternal lineage. A signature of that heritage shows up for me and my mother (same location on chromosome 8), but not for my brother. He didn’t get that segment of my mother’s chromosome when the egg was formed.
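To put rough numbers on that rule of thumb: on average, each ancestor g generations back contributes about 1/2^g of your autosomal DNA. The short Python sketch below just prints that arithmetic (the function name is mine). It is an average only. DNA is actually inherited in chunks, so your real share from any one distant ancestor can be higher than this, or zero.

```python
def expected_share(generations):
    """Average fraction of your autosomal DNA from one ancestor that far back."""
    return 0.5 ** generations

for g in range(1, 9):
    ancestors = 2 ** g
    print(f"{g} generations back: {ancestors} ancestors, "
          f"about {expected_share(g):.2%} from each on average")
```

A great-great-grandmother is 4 generations back, so her average share is 1/16, or about 6%. By 8 generations each of the 256 ancestors averages well under 1%, which is a small enough slice that it can easily go missing.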
Also realize that the analysis can change depending on which options are chosen. If I do HapMap 2 using the default advanced settings, I get 100% European. With just a little tweaking of those settings, I get significant African heritage and a bit of Asian (Native-American descended paternal grandmother). If I do HapMap 3, my chromosomal segments are placed with Italians and Gujarati Indians** in Houston, Texas. A bit of tweaking will get me the African ancestry in Southwest USA and the Gujarati Indians in Houston, Texas again.
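Here is a toy Python sketch of why the settings matter so much. It is not the method Interpretome or 23andMe actually uses, just one simple way to ‘paint’ a chromosome: chop it into windows of SNPs and label each window with whichever reference population’s allele frequencies fit your genotypes best. The two populations, their frequencies, the 12-SNP ‘chromosome’ and the function names are all made up for illustration.

```python
import math

def window_log_likelihood(genotypes, freqs):
    """Log-probability of 0/1/2 allele counts given per-SNP allele frequencies."""
    return sum(math.log(math.comb(2, g) * p**g * (1 - p)**(2 - g))
               for g, p in zip(genotypes, freqs))

def paint(genotypes, ref_freqs, window):
    """Label each block of `window` consecutive SNPs with the best-fitting population."""
    labels = []
    for start in range(0, len(genotypes), window):
        chunk = genotypes[start:start + window]
        scores = {pop: window_log_likelihood(chunk, freqs[start:start + window])
                  for pop, freqs in ref_freqs.items()}
        labels.append(max(scores, key=scores.get))
    return labels

# Made-up reference data: the allele is common in population A, rare in B.
ref_freqs = {"A": [0.9] * 12, "B": [0.1] * 12}
# Made-up individual: the first 7 SNPs look like A, the last 5 look like B.
genotypes = [2] * 7 + [0] * 5

for window in (4, 6, 12):
    print(window, paint(genotypes, ref_freqs, window))
```

The same made-up genome comes out as half A with a window of 6, two-thirds A with a window of 4, and entirely A with a window of 12. Nothing about the DNA changed, only the setting.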
Again, illustrative, even informative, not determinative.
*PCA is a non-trivial mathematical procedure
**I am 99.9% sure I have no Gujarati ancestors from the last 300 years.