Rapid, Large-Scale Prioritizing of Human Variants with Lasergene Genomics Suite

September 7, 2016 Clinical Research, Next-Gen Sequencing

Lasergene Genomics Suite now includes access to the Variant Annotation Database (VAD) for human sequencing data. I recently spoke with DNASTAR scientist Dr. Tim Durfee about the VAD to get a better understanding of how the tool works and how it can help genomics and clinical researchers with their variant analysis.

Can you describe what the Variant Annotation Database is?
The VAD is a database resource that contains information on individual positions and alleles across the human genome. It is currently specific to the human genome. The major purpose of the VAD is to allow rapid prioritizing and ranking of the large number of variants found in any given sample relative to the reference genome. That number can be on the order of thousands of variants for gene panels, tens of thousands for exomes, and millions for whole genomes. This kind of large-scale analysis is critical for the clinical sequencing market.

How can users access the information in the VAD?
Annotation information for each called variant in a specific sample is automatically retrieved from the VAD during project setup in ArrayStar. With the upcoming Lasergene 14.0 release, it will be added to the project directly following assembly and variant calling. The data are accessible in the ArrayStar SNP table and can be used to filter variants and to create gene and SNP sets. For examples of how this can be done, take a look at our tutorial.

What is the source of the annotations in the VAD?
The data come from two major sources: the 1000 Genomes Project and dbNSFP (Database of Human Nonsynonymous SNPs and their Functional Predictions). As the name suggests, the dbNSFP data cover protein-encoding positions in the genome. The data are organized into five broad categories:

  1. Allele and genotype frequencies from the 1000 Genomes phase 3 data as well as from NHLBI’s Exome Sequencing Project. The 1000 Genomes data are available as global frequencies as well as frequencies for 26 populations grouped into 5 super populations. These data are extremely useful for filtering. For example, if you’re studying a rare disease that occurs in only a small number of individuals, you wouldn’t expect a relevant SNP to occur at high frequency in the population; typically, you filter for variants with a population frequency below 5%, or even below 1%.

  2. Functional impact prediction methods: LRT, MutationTaster, PolyPhen-2 (two models) and SIFT. The four methods use different strategies to predict whether a given non-synonymous change is deleterious to the function of the encoded protein.

  3. Evolutionary conservation scoring systems: GERP++, SiPhy, PhyloP and PhastCons. These methods use sequence alignments of the human genome with the corresponding regions of other organisms to produce scores of how conserved each particular base is across evolution. In coding regions, the more evolutionarily conserved a particular base is, the more likely it is that having that base at that position is important for the function of the encoded protein. Some methods (e.g. GERP++) can also be used to assess the importance of bases outside the coding regions.

  4. Pathogenicity information from ClinVar: ClinVar is a central repository hosted by NCBI that catalogs and reviews human variation and its connection to disease. The VAD uses the clinical significance field to allow filtering on different classifications, including Benign and Pathogenic.

  5. Miscellaneous information: The VAD also contains other types of information, such as links to dbSNP, UniProt and InterPro, that allow the user to easily retrieve additional data from those resources.
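
To make the idea of prioritizing concrete, here is a minimal Python sketch of how the annotation categories above could be combined into a filter and a ranking. It is an illustration only, not how ArrayStar or the VAD works internally: the `Variant` record, its field names, the `prioritize` function, the thresholds and the sample values are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Variant:
    """Hypothetical annotated variant; field names are illustrative, not the VAD schema."""
    name: str
    global_af: float        # global allele frequency (e.g. from 1000 Genomes)
    deleterious_calls: int  # number of impact predictors calling the change deleterious
    gerp: float             # conservation score; higher means more conserved
    clinvar: str = ""       # clinical significance, e.g. "Pathogenic" or "Benign"

def prioritize(variants, max_af=0.01, min_calls=2):
    """Keep rare, non-benign variants and rank them by predicted impact."""
    kept = [
        v for v in variants
        if v.global_af < max_af
        and v.clinvar != "Benign"
        and (v.deleterious_calls >= min_calls or v.clinvar == "Pathogenic")
    ]
    # Rank: ClinVar pathogenic first, then more deleterious calls, then higher conservation.
    return sorted(
        kept,
        key=lambda v: (v.clinvar == "Pathogenic", v.deleterious_calls, v.gerp),
        reverse=True,
    )

# Made-up values for illustration only.
variants = [
    Variant("var_a", 0.20, 4, 5.1),                  # too common
    Variant("var_b", 0.004, 3, 4.2),                 # rare, several deleterious calls
    Variant("var_c", 0.0005, 1, 2.0, "Pathogenic"),  # rare, ClinVar pathogenic
    Variant("var_d", 0.002, 4, 5.5, "Benign"),       # ClinVar benign
    Variant("var_e", 0.008, 0, -1.3),                # rare, but no supporting evidence
]
ranked = [v.name for v in prioritize(variants)]
print(ranked)
```

The 1% cutoff mirrors the rare-disease example in the first category; which thresholds and ranking order make sense in practice depends on the study.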

What are the advantages to using the VAD over a user’s own database or VCF file?
If a user has huge VCF files with the annotations, they would have to manually go through each position and retrieve the relevant information for that allele. With the VAD, all the annotations are automatically retrieved and readily available for filtering. The VCF is more useful as a record file of all the variants and their annotations that can be shared between applications. For example, a VCF of alleles of interest produced by ArrayStar can be used by SeqMan NGen in subsequent assemblies to report on those positions.
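
For readers who have not looked inside a VCF, the sketch below shows what retrieving annotations by hand involves: each data line is tab-separated, and the eighth column (INFO) holds semicolon-separated `key=value` pairs and flags. The `parse_vcf_line` helper and the sample record are hypothetical and only illustrate the general VCF layout, not the files ArrayStar or SeqMan NGen write.

```python
def parse_vcf_line(line):
    """Split one VCF data line into its fixed columns and an INFO dict."""
    chrom, pos, vid, ref, alt, qual, flt, info = line.rstrip("\n").split("\t")[:8]
    info_dict = {}
    for item in info.split(";"):
        if item in ("", "."):
            continue  # "." means no INFO annotations are present
        key, sep, value = item.partition("=")
        info_dict[key] = value if sep else True  # flags carry no value
    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": vid,
        "ref": ref,
        "alt": alt.split(","),  # multiple alternate alleles are comma-separated
        "info": info_dict,
    }

# Made-up record for illustration only.
line = "1\t12345\t.\tA\tG\t50\tPASS\tAF=0.004;DB"
record = parse_vcf_line(line)
print(record["pos"], record["alt"], record["info"])
```

Doing this for every position in a whole-genome file, and then joining the result against several annotation sources, is the manual work that automatic retrieval avoids.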

How does this compare to other tools on the market today?
The major advantage of DNASTAR’s Variant Annotation Database is the seamless connection with the assembly and variant caller. With open-source software, you have to first run the assembly, do the variant calling with a separate program, and then use yet another tool to add the annotation information. There is often a steep learning curve with each of these tools, which can make the overall process laborious. The DNASTAR pipeline integrates all these steps into one suite and allows for multi-sample comparison and filtering. Additionally, we provide the most accurate assembly and variant calling.

Want to learn more? Check out our variant analysis workflow page to see videos and benchmarks on NGS assembly and variant analysis in Lasergene Genomics Suite.
