Genome assembly: GRCh37.p13
Ensembl version: 87
1000 genomes: phase 3 version 5
This is a silent update of SNiPA annotations that extends version 3.3. It comprises updated variant-phenotype associations and annotations as contained in the Ensembl 100 release, as well as new data on mQTL and pQTL associations (see below).
We updated SNiPA to contain genome-wide significant associations from two recent studies by Schlosser et al. (Nature Genetics, Feb 2020) and Lotta et al. (preprint, Jul 2020).
From a recent study of the "Genetic architecture of host proteins interacting with SARS-CoV-2" by Pietzner et al. (preprint, Jul 2020) we incorporated around 225,000 pQTL associations from 10,708 individuals.
We updated SNiPA to contain the new V8 release of the GTEx project. SNiPA version 3.4 now includes GTEx associations across 50 tissues (6 more than the previous release). Please refer to gtexportal.org for information on GTEx data release and publication policy.
Data from the Neale lab UK Biobank GWASs (round 2) added more than 4.2 million unique pooled and sex-specific variant-phenotype associations. Because this is the largest addition of data from a single source, it is displayed in a separate table in variant annotations.
For variant associations, we updated association datasets that changed from Ensembl 92 to Ensembl 100. The following table shows these datasets and association counts:
Source | N (unique) | Reference | Version |
HGMD | 108,643 (108,069) | PMID: 24077912 | Ensembl 100 |
ClinVar | 687,892 (652,053) | PMID: 24234437 | Ensembl 100 |
GWAS Catalog | 148,463 (142,722) | PMID: 19474294 | Ensembl 100 - r2020-06-15 |
Covid pGWAS | 226,741 (226,720) | doi: 10.1101/2020.07.01.182709 | |
GCKD GWAS | 902,657 (846,281) | PMID: 31959995 | |
Cross-platform GWAS meta-analysis | 393,464 (393,457) | doi: 10.1101/2020.02.03.932541 | |
UK Biobank | 35,028,767 (4,232,784) | http://www.nealelab.is/uk-biobank | |
This is a silent update that continues to use "Ensembl87" as identifier for some of the latest annotation data. SNiPA version 3.4 was made accessible on November 13th, 2020. If you accessed SNiPA after this date, you have been using version 3.4. If you are uncertain as to which annotation version you have been using, we provide a full data freeze of SNiPA annotations version 3.3 here.
Genome assembly: GRCh37.p13
Ensembl version: 87
1000 genomes: phase 3 version 5
This is a silent update of SNiPA annotations for Ensembl 87; most data remain unchanged from version 3.2. It comprises updated variant-phenotype associations and annotations as contained in the Ensembl 92 release, as well as new data on pQTL associations (see below).
We updated SNiPA to contain genome-wide significant pQTL association data from the pGWAS study in blood published this month by Sun et al. 2018. SNiPA version 3.3 now includes more than 16,500 cis- and trans-associations with blood protein levels.
For variant associations, we only updated association datasets that changed from Ensembl 87 to Ensembl 92. The following table shows these datasets and association counts:
Source | N (unique) | Reference | Version |
HGMD | 75,910 (68,320) | PMID: 24077912 | Ensembl 92 |
ClinVar | 335,828 (308,521) | PMID: 24234437 | Ensembl 92 |
OMIM variation | 24,028 (22,220) | http://omim.org/ | Ensembl 92 |
GWAS Catalog | 67,764 (63,808) | PMID: 19474294 | Ensembl 92 - r2018-05-29 |
This is a silent update that continues to use "Ensembl87" as identifier for the latest annotation data. SNiPA version 3.3 was made accessible on June 25th, 2018. If you accessed SNiPA after this date, you have been using version 3.3. If you are uncertain as to which annotation version you have been using, we provide a full data freeze of SNiPA annotations version 3.2 here.
Genome assembly: GRCh37.p13
Ensembl version: 87
1000 genomes: phase 3 version 5
This is only a minor update. It comprises updated variant-phenotype associations and annotations as contained in the Ensembl 87 release, as well as new data on mQTL and pQTL associations (see below).
We updated SNiPA to contain new data from the metabolomics GWAS server and from two additional studies (Draisma et al. 2015, Long et al. 2017). SNiPA version 3.2 now includes more than half a million associations with metabolite concentrations from two biofluids (blood and urine).
We updated SNiPA to contain pQTL data from our proteomics GWAS server that is based on the largest pGWAS in blood to date (Suhre et al. 2017). SNiPA version 3.2 now includes almost 15,000 cis- and trans-associations with blood protein levels.
We updated SNiPA to contain the most recent version (v1.3) of the CADD score. We also used updated versions of phyloP and phastCons based on sequences of 100 vertebrate species with alignments to the current GRCh38 genome assembly (mapped back to GRCh37 using our variant-based mapping procedure).
Source | N (unique) | Reference |
HGMD | 61,489 (55,484) | PMID: 24077912 |
dbGaP | 40,253 (28,766) | PMID: 17898773 |
ClinVar | 190,165 (174,435) | PMID: 24234437 |
OMIM variation | 22,129 (20,501) | http://omim.org/ |
UniProt | 18,805 (17,575) | PMID: 24253303 |
GWAS Catalog | 26,193 (18,769) | PMID: 19474294 |
DrugBank - 4.2 | 179 (169) | PMID: 24203711 |
Genome assembly: GRCh37.p13
Ensembl version: 82
1000 genomes: phase 3 version 5
This is only a minor update. It comprises updated variant-phenotype associations and annotations as contained in the Ensembl 82 release, as well as the new release of GTEx (see below). As of this version, we include the association data of the new GWAS Catalog at EMBL-EBI.
We updated SNiPA to contain the new V6 release of the GTEx project. SNiPA version 3.1 now includes about 20 million significant associations from GTEx across 44 tissues. Please refer to http://gtexportal.org/home/documentationPage for information on GTEx data release and publication policy.
Source | N (unique) | Reference |
HGMD - 2015 | 53,420 (48,305) | PMID: 24077912 |
dbGaP - October, 9th, 2015 | 40,254 (28,767) | PMID: 17898773 |
ClinVar - October, 9th, 2015 | 156,160 (139,160) | PMID: 24234437 |
OMIM variation - October, 9th, 2015 | 19,878 (18,442) | http://omim.org/ |
UniProt - October, 9th, 2015 | 3,484 (3,219) | PMID: 24253303 |
GWAS Catalog - October, 9th, 2015 | 19,950 (18,769) | PMID: 19474294 |
DrugBank - 4.2 | 179 (169) | PMID: 24203711 |
Genome assembly: GRCh37.p13
Ensembl version: 80
1000 genomes: phase 3 version 5
There was a minor bug in our probe mapping script that mapped array address IDs to probe IDs. Affected data sets were the associations reported by Westra et al. and Innocenti et al. The bug was fixed and as of Ensembl version 80, the associations should all be reported correctly. We thank ME Reyes for reporting this bug!
As the development of SNiPA started with 1000 genomes phase 1 version 3 data, we initially used the precompiled CADD data set for 1000 genomes. With 1000 genomes phase 3 version 5, the variant count more than doubled, and the "new" variants were not contained in the provided CADD data file, leaving a large number of variants for which no CADD scores were available. In SNiPA version 3, we downloaded the genome-wide data set from the CADD website and retrieved allele-matched scores for all variants contained in SNiPA. We thank AM Nissen for reporting this bug!
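Allele-matched retrieval can be illustrated with a minimal sketch. It assumes a tab-separated score file with the columns Chrom, Pos, Ref, Alt, RawScore and PHRED, and it keys each score on position plus both alleles, so that a variant only receives the score computed for its own alternative allele. The function names and the toy rows are hypothetical and are not taken from the SNiPA pipeline.

```python
import csv
import io


def load_cadd_scores(handle):
    """Index score rows by (chrom, pos, ref, alt); header lines start with '#'."""
    scores = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        chrom, pos, ref, alt, raw, phred = row[:6]
        scores[(chrom, int(pos), ref, alt)] = (float(raw), float(phred))
    return scores


def annotate(variants, scores):
    """Return {variant: (raw, phred)}; None marks a variant without a matching allele."""
    return {variant: scores.get(variant) for variant in variants}


# Toy input with made-up values (assumed column layout, see lead-in).
cadd_tsv = io.StringIO(
    "## illustrative header line\n"
    "#Chrom\tPos\tRef\tAlt\tRawScore\tPHRED\n"
    "1\t10001\tT\tA\t0.10\t4.0\n"
    "1\t10001\tT\tC\t0.25\t6.5\n"
)
scores = load_cadd_scores(cadd_tsv)
variants = [("1", 10001, "T", "C"), ("1", 10001, "T", "G")]
annotated = annotate(variants, scores)
```

In this toy case the T>C variant picks up the second row, while T>G stays unannotated because no row carries that allele, even though the position itself is covered.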
Several SNiPA-users asked if we could include the eQTL associations from the GTEx project. In SNiPA version 3, we included significant associations from GTEx release 4 data for 13 tissues. Please refer to http://gtexportal.org/home/documentationPage for information on GTEx data release and publication policy.
Ensembl's variant effect predictor (VEP) now includes SnpEff effect impact predictions (details). We added the prediction denoted as "effect impact" to the basic features table contained in SNiPAcards.
We were asked whether the association maps feature could be customized so that users can create their own figures of association results. We therefore added this as a new feature in the Association Maps module, including a how-to and example input.
SNiPA GeneBuild: Ngenes = 59,413 (based on GENCODE 22)
SNiPA RegBuild: Nelements = 1,471,812 (now includes FANTOM5 permissive promoters)
PolyPhen: v2.2.2
SIFT: v5.2.2
Source | N (unique) | Reference |
DECIPHER | 1,829 (1,829) | http://decipher.sanger.ac.uk/ |
OMIM gene | 4,886 (4,882) | http://omim.org/ |
OrphaNet | 5,684 (5,684) | http://orpha.net/ |
Source | N (unique) | Reference |
HGMD - 2014.4 | 41,077 (36,807) | PMID: 24077912 |
dbGaP - June 12th, 2015 | 41,426 (28,824) | PMID: 17898773 |
ClinVar - June 12th, 2015 | 98,289 (85,835) | PMID: 24234437 |
OMIM variation - June 12th, 2015 | 18,854 (17,586) | http://omim.org/ |
UniProt - June 12th, 2015 | 3,399 (3,210) | PMID: 24253303 |
GWAS Catalog - June 12th, 2015 | 17,703 (16,635) | PMID: 19474294 |
DrugBank - 4.2 | 179 (169) | PMID: 24203711 |
We have included the variant identifiers contained in dbSNP's rsMergeArch table, so users can search for rs identifiers that were merged into newer rs numbers. SNiPA cards and the tooltips in the interactive plots now list these alias rs identifiers in addition to the one currently assigned to each variant. Also, a column labeled "RSALIAS" was added to the genomic data sets (available for download here).
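A lookup of merged rs identifiers might work along the following lines. This is a minimal sketch that assumes the merge history has already been reduced to (old rs number, new rs number) pairs; the actual rsMergeArch table has more columns, and the function name and the rs numbers below are made up for illustration.

```python
def build_alias_index(merge_pairs):
    """Build a resolver and an alias table from (old_rs, new_rs) pairs."""
    parent = dict(merge_pairs)

    def resolve(rs):
        # Follow merge chains (old -> newer -> current); 'seen' guards against cycles.
        seen = set()
        while rs in parent and rs not in seen:
            seen.add(rs)
            rs = parent[rs]
        return rs

    aliases = {}
    for old in parent:
        aliases.setdefault(resolve(old), set()).add(old)
    return resolve, aliases


# Made-up identifiers: 300 was merged into 200, which was later merged into 100.
resolve, aliases = build_alias_index([(300, 200), (200, 100), (400, 100)])
current = resolve(300)
alias_list = sorted(aliases[100])
```

Searching for any of the retired numbers then leads to the current identifier, and the alias set of that identifier is what a column such as "RSALIAS" could hold.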
Users may optionally specify variants that should be highlighted in the linkage disequilibrium plots.
Genome assembly: GRCh37
Ensembl version: 77
1000 genomes: phase 3 version 5
SNiPA's data model is fully position-based. It would therefore be possible to update to the new GRCh38 genome assembly. However, as most annotation sets contained in SNiPA are not yet available for this assembly, we decided to stick to the old but fully annotated GRCh37 assembly in this version. This has some implications, which are discussed in the following sections.
The current Ensembl gene build (GENCODE 21) is based on GRCh38; a mapping to GRCh37 is not provided. We updated gene information as known from the previous SNiPA version, used the UCSC liftOver tool (command line executable) to convert genome coordinates, and retained all genes that could be mapped to the old genome assembly. The new SNiPA gene build contains 59,006 entries (Ndiff to SNiPA v1 = +1,764).
The current Ensembl regulatory build (ENCODE) is based on GRCh38; a mapping to GRCh37 is not provided for batch retrieval. We updated information on regulatory elements from ENCODE as known from the previous SNiPA version, used the UCSC liftOver tool (command line executable) to convert genome coordinates, and retained all elements that could be mapped to the old genome assembly. Combined with the known datasets for promoter and enhancer regions, the new SNiPA regulatory build contains 1,127,068 elements (Ndiff to SNiPA v1 = -109,063). The lower number of regulatory elements is due to the new regulatory build of Ensembl, in which several regulatory clusters have been merged.
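The conversion step can be sketched as follows. The liftOver executable takes an input file, a chain file, an output file and a file for unmapped records, in that order. Everything else here is an assumption for illustration: the file names, the four-column BED layout (chrom, start, end, name), the helper names and the toy coordinates are hypothetical and do not describe the actual SNiPA workflow.

```python
def liftover_command(bed_in, chain, bed_out, unmapped):
    """Argument list for the UCSC liftOver executable: input, chain, output, unmapped."""
    return ["liftOver", bed_in, chain, bed_out, unmapped]


def retained_genes(mapped_bed_lines):
    """Parse liftOver's mapped BED output into {gene_name: (chrom, start, end)}."""
    genes = {}
    for line in mapped_bed_lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, start, end, name = line.rstrip("\n").split("\t")[:4]
        genes[name] = (chrom, int(start), int(end))
    return genes


# Hypothetical file names; on a machine with liftOver installed the command
# could be executed with subprocess.run(cmd, check=True).
cmd = liftover_command(
    "genes.GRCh38.bed", "hg38ToHg19.over.chain.gz",
    "genes.GRCh37.bed", "genes.unmapped.bed",
)

# Toy stand-in for the mapped output file (made-up genes and coordinates).
mapped_output = ["chr1\t100\t200\tGENE_A\n", "chr2\t500\t900\tGENE_B\n"]
all_genes = {"GENE_A", "GENE_B", "GENE_C"}
kept = retained_genes(mapped_output)
dropped = sorted(all_genes - set(kept))
```

Genes that do not appear in the mapped output, here the made-up GENE_C, are the ones that would be left out of a build restricted to the old assembly.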
We have used the new 1000 genomes release (phase 3 version 5) in SNiPA v2. This release has more than doubled the number of available variants. We want to point the user to the data usage policy of the 1000 genomes project.
dbSNP identifiers are not yet completely included in the 1000 genomes data files, but autosomal variants have already been integrated in dbSNP build 142. We downloaded dbSNP mapped to GRCh37 and merged both data sets. As X-chromosomal 1000 genomes markers were released after dbSNP 142 was created, we only supply those variants which could be mapped to a dbSNP rs-identifier.
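One way to picture this merge is a join on position and alleles. The sketch below is only an illustration: it assumes both data sets can be keyed on (chrom, pos, ref, alt), that autosomal variants are kept whether or not an rs identifier is found, and that X-chromosomal variants are dropped when no identifier matches. The function name, the keying scheme and all values are hypothetical.

```python
def assign_rs_ids(variants, dbsnp):
    """Attach rs identifiers to variants; drop X-chromosomal variants without one.

    variants: iterable of (chrom, pos, ref, alt) tuples
    dbsnp:    dict mapping the same kind of tuple to an rs identifier
    """
    merged = []
    for key in variants:
        rs_id = dbsnp.get(key)
        if key[0] == "X" and rs_id is None:
            continue  # assumed rule: unmapped X-chromosomal variants are not supplied
        merged.append((key, rs_id))
    return merged


# Made-up positions and identifiers.
dbsnp = {("1", 1000, "A", "G"): "rs1", ("X", 5000, "C", "T"): "rs2"}
variants = [
    ("1", 1000, "A", "G"),   # autosomal, has an rs identifier
    ("1", 2000, "G", "C"),   # autosomal, no rs identifier (kept under the assumption)
    ("X", 5000, "C", "T"),   # X-chromosomal, has an rs identifier
    ("X", 6000, "T", "A"),   # X-chromosomal, no rs identifier (dropped)
]
merged = assign_rs_ids(variants, dbsnp)
```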
population | rs-count |
African (AFR) | 39,581,182 |
American (AMR) | 26,474,088 |
East Asian (EAS) | 22,128,163 |
European (EUR) | 22,541,970 |
South Asian (SAS) | 24,854,259 |
total (unique) | 78,471,927 |
SNiPA allows combining all annotation releases with all variant sets. However, annotation data of SNiPA v1 (Ensembl v. 75) is not available for the new variant set. Therefore, if you use 1000 genomes phase 3 version 5 data and combine it with Ensembl 75 annotations, variants not contained in phase 1 version 3 will show no annotations.
The Ensembl VEP tool only features full annotation data for the new GRCh38 genome assembly. Therefore, SNiPA's annotation workflow had to be adjusted to provide both all the annotation data from the previous release and simultaneously a mapping to the newest gene and regulatory builds of Ensembl. To achieve that, we did effect predictions for both assemblies. Our custom annotation program then merged the VEP output for both assemblies using the variant mapping provided by dbSNP build 142 to again yield a full SNiPA build for GRCh37.
All phenotype annotation sets have been updated to Ensembl version 77. OrphaData and the GWAS Catalog were accessed on October 20th, 2014.
Source | N (unique) | Reference |
HGMD | 35,326 (31,770) | PMID: 24077912 |
dbGaP | 41,426 (28,824) | PMID: 17898773 |
ClinVar | 89,522 (87,923) | PMID: 24234437 |
OMIM variation | 9,595 (8,968) | http://omim.org/ |
UniProt | 3,573 (3,366) | PMID: 24253303 |
GWAS Catalog | 16,342 (15,343) | PMID: 19474294 |
DrugBank 4.0 | 179 (169) | PMID: 24203711 |
Source | N (unique) | Reference |
DECIPHER | 1,795 (1,795) | http://decipher.sanger.ac.uk/ |
OMIM gene | 5,055 (5,051) | http://omim.org/ |
OrphaNet | 5,684 (5,684) | http://orpha.net/ |
We have decided to make all data contained in SNiPA available to the community. Please refer to the README for details on folder structure and data formats.
→ Data access
Genome assembly: GRCh37
Ensembl version: 75
1000 genomes: phase 1 version 3
This is the original release of SNiPA as described in the original publication and the documentation.
This version contains all bi-allelic variants present in 1000 genomes project, phase 1 version 3. These are the variant counts for the individual superpopulations:
population | rs-count |
African (AFR) | 25,837,142 |
American (AMR) | 20,097,916 |
Asian (ASN) | 15,012,236 |
European (EUR) | 17,361,202 |
Source | N (unique) | Reference |
HGMD | 93,758 (86,491) | PMID: 24077912 |
dbGaP | 41,181 (33,819) | PMID: 17898773 |
ClinVar | 47,315 (44,141) | PMID: 24234437 |
OMIM variation | 19,911 (19,184) | |
UniProt | 5,055 (4,850) | PMID: 24253303 |
GWAS Catalog | 15,500 (14,513) | PMID: 19474294 |
DrugBank 4.0 | 179 (169) | PMID: 24203711 |
Source | N (unique) | Reference |
DECIPHER | 2,144 (2,143) | |
OMIM gene | 5,775 (5,775) | |
OrphaNet | 5,705 (5,675) | |