Developing new tools to support regulatory use of “Next Gen Sequencing” data

By: Carolyn A. Wilson, Ph.D.

When you’re thirsty, you don’t want to take a drink from a fire hose. And when scientists are looking for data, they don’t want to be knocked over by a flood of information that overwhelms their ability to analyze and make sense of it.

That’s especially true of data generated by some types of both human and non-human genome research called Next Generation Sequencing (NGS). This technology produces data sets so large and complex that they overwhelm the ability of most computer systems to store, search, and analyze them, or transfer them to other computer systems.

The human genome comprises about 3 billion building blocks called nucleotides; much medical research involves analyzing this huge storehouse of data by a process called sequencing—determining the order in which the nucleotides occur, either in the entire genome or in a specific part of it. The goal is often to find changes in the sequence that might be mutations that cause specific diseases. Such information could be the basis of diagnostic tests, new treatments, or ways to track the quality of certain products, such as vaccines made from viruses.
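To give a feel for what “finding changes in the sequence” means, here is a toy sketch in Python. It assumes the simplest possible case—a short sequenced fragment already lined up against the matching stretch of a reference genome—and simply reports each position where the two differ. The sequences and the function name are invented for illustration; real genomes run to billions of nucleotides.

```python
# Toy illustration of sequencing-based mutation finding: compare a short
# sequenced fragment against the reference for the same region and report
# every position where the nucleotides differ.

def find_mutations(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (position, reference_base, sample_base) for each mismatch."""
    return [
        (pos, ref_base, sample_base)
        for pos, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]

reference = "ATGGCCATTGTAATGGGCCGC"
sample    = "ATGGCCGTTGTAATGAGCCGC"

# Two positions differ between sample and reference.
print(find_mutations(reference, sample))  # [(6, 'A', 'G'), (15, 'G', 'A')]
```

Whether such a difference is a harmless variation or a disease-causing mutation is exactly the kind of interpretive question the research described here tries to answer.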

NGS is a complicated technique, but basically it involves cutting the genome into millions of small pieces so you can use sophisticated chemical tricks and technologies to ignore the “junk” you don’t need, and then make up to hundreds of copies of each of the pieces you want to study. This enables additional techniques to identify changes in the sequence of nucleotides that might be mutations. NGS enables scientists to fast-track this process by analyzing millions of pieces of the genome at the same time. For comparison, the famous human genome sequencing and analysis program that took 13 years to complete and cost $3 billion could now be completed in days for a few thousand dollars.
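The “cut into millions of pieces, then analyze them at the same time” idea can be sketched in a few lines of Python. This is a deliberately simplified model—the function names are invented, the “reads” are tiny, and each read is located by exact string matching, whereas real aligners tolerate sequencing errors and handle millions of reads in parallel.

```python
import random

# Toy sketch of the NGS workflow: shred a genome into many short "reads,"
# then locate each read on the reference independently. Because each read
# is handled on its own, the work parallelizes across millions of reads.

def shred(genome: str, read_length: int, n_reads: int, seed: int = 0) -> list[str]:
    """Cut random short fragments (reads) out of the genome."""
    rng = random.Random(seed)
    starts = [rng.randrange(len(genome) - read_length + 1) for _ in range(n_reads)]
    return [genome[s:s + read_length] for s in starts]

def align(read: str, reference: str) -> int:
    """Return the position where the read exactly matches the reference (-1 if none)."""
    return reference.find(read)

genome = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
reads = shred(genome, read_length=8, n_reads=5)
for read in reads:
    print(read, "maps to position", align(read, genome))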


The Center for Biologics Evaluation and Research (CBER) supported the development of High-Performance Integrated Virtual Environment (HIVE) technology, a private, cloud-based environment that comprises both a storage library of data and a powerful computing capacity being used to support Next Generation Sequencing of genomes.

To prepare FDA to review and understand the interpretation and significance of data in regulatory submissions that include NGS, CBER supported the development of a powerful, data-hungry computer technology called the High-Performance Integrated Virtual Environment (HIVE), which can consume, digest, analyze, manage, and share all this data. HIVE is a private, cloud-based environment that comprises both a storage library of data and a powerful computing capacity. One specific HIVE algorithm (a set of instructions for handling data) that enables CBER scientists to manage the NGS fire hose is called the HIVE-hexagon aligner. CBER scientists have used HIVE-hexagon in a variety of ways; for example, it helped scientists in the Office of Vaccines Research and Review study the genetic stability of influenza A viruses used to make vaccines. The scientists showed that this powerful tool might be very useful for determining whether influenza viruses being grown for use in vaccines were accumulating mutations that could either reduce their effectiveness in preventing infections or, even worse, cause infections.
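The genetic-stability idea described above can be illustrated with a small sketch: given reads that have already been aligned to a vaccine-virus reference, tally the bases observed at each position and flag positions where a non-reference base shows up in a large fraction of reads. This is purely illustrative—the function, data, and 50% threshold are invented for this example and are not HIVE-hexagon’s actual algorithm.

```python
from collections import Counter

# Hypothetical sketch of stability monitoring: count, at each reference
# position, how often aligned reads disagree with the reference base.
# Assumes every read falls entirely within the reference.

def mutation_frequencies(reference, aligned_reads):
    """aligned_reads: list of (start_position, read_sequence) pairs.
    Returns {position: fraction of covering reads with a non-reference base}."""
    counts = [Counter() for _ in reference]
    for start, read in aligned_reads:
        for offset, base in enumerate(read):
            counts[start + offset][base] += 1
    freqs = {}
    for pos, tally in enumerate(counts):
        total = sum(tally.values())
        if total:
            non_ref = total - tally[reference[pos]]
            freqs[pos] = non_ref / total
    return freqs

reference = "ATGGCCATTG"
reads = [(0, "ATGGC"), (2, "GGCCA"), (4, "CCGTT"), (5, "CGTTG")]

# Flag positions where at least half the covering reads carry a mutation.
flagged = {pos: f for pos, f in mutation_frequencies(reference, reads).items() if f >= 0.5}
print(flagged)  # {6: 0.6666666666666666}
```

A rising mutation frequency at a position over successive rounds of virus growth is the kind of signal that would prompt a closer look at a vaccine seed strain.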

There’s another exciting potential to HIVE-hexagon research: the more scientists can learn about variations in genes that alter the way they work—or make them stop working—the more they can help doctors modify patient care to reflect those very personal differences. These differences can affect health, disease, and how individuals respond to treatments, such as chemotherapy and influenza vaccines. Such knowledge will contribute to advances in personalized medicine.

Team members at work in FDA's HIVE server room.

CBER scientists showed that HIVE might help scientists determine if influenza viruses being grown for use in vaccines were accumulating mutations that could either reduce their effectiveness in preventing infections or cause infections. Genome studies supported by HIVE will also contribute to advances in personalized medicine.

Because CBER’s HIVE installation has been so successful, we are now collaborating with FDA’s Center for Devices and Radiological Health (CDRH) to provide a second installation with greater storage and computing power that takes advantage of CDRH’s high-performance computing resources. When ready and approved by FDA for use, this powerful, CBER-managed, inter-center resource will be used to handle regulatory submissions.

HIVE-hexagon and its innovative NGS algorithms are just one major step CBER has taken recently as it continues its pioneering work in regulatory research to ensure that products for consumers are safe and effective. I’ll tell you about other exciting breakthroughs in my next update on CBER research.

Carolyn A. Wilson, Ph.D., is Associate Director for Research at FDA’s Center for Biologics Evaluation and Research.

For more HIVE photos, go to Flickr.
