Started in the mid-90s within the Human Genome Global Physical Mapping Project at the Genethon/CEPH laboratories, the Biofacet Engine is a multi-database, multi-algorithm, high-performance sequence search engine. Initially developed for large-scale comparative genomics studies, Biofacet Engine has evolved over the years to reach an unprecedented level of expressive power, enabling fast and sophisticated analysis of any sequence dataset, with a focus on NGS data.
Biofacet Engine is different from everything that you may have used so far with Blast searches, or more recently with popular analysis packages for short reads in the NGS bioinformatics arena. It is a breakthrough in sequence comparison.
It is different because:
- It is not yet-another-stack of heterogeneous software modules, whose throughput is limited by the thin pipes connecting multiple components, and by the flaws of parsing huge flat files back and forth.
- It is not an ad-hoc solution for a given problem linked to a sequencing machine vendor technology, or to a read length, or to a given application. It is a versatile, global solution.
It is a breakthrough because its design has:
- No limits: Biofacet Engine handles large volumes by design. It operates on anything from individual objects up to whole datasets, whether they are sequence data, annotations or alignments.
- No DBMS: Biofacet Engine definitively solves the dichotomy between annotation and sequences. All properties of a sequence record or of an alignment record, whether annotation or sequence, whether the records number one or millions, are usable together, simultaneously and instantaneously.
- No breakpoints: Alignments are no longer an end point, but first-class, reusable objects. Hundreds of millions of alignments, coming from one or multiple databases and computed with different algorithms, can be queried, filtered and reused as databases or alignments for the next steps. More generally, objects in the system can be filtered, grouped, sorted, and dumped with no limitation other than the hardware.
- No parser: parsing is the biggest limitation in bioinformatics. The free-formatting module allows you to dump sequences or alignments in bulk in any format, from simple tab-delimited to standard SAM/BAM, including data structures directly usable by programs.
- No compromise: always know what could have happened. Whatever the algorithm chosen, the number of hits and their error distribution per query are computed, regardless of the maximum number of hits kept.
- No mental barrier: BWA in this case, BOWTIE in that case ... genomes are not enough anyway. One can decide to use the included ultra-fast short-read mapping algorithms, gapped or not, or, as read lengths increase and SMS arises, to use our extended GASSST mapper, which handles reads up to 20Knt, or to use the unbeatable Blast algorithm, whatever the technology (ILMN, Genia, Ion Torrent, 454, SOLiD, etc.) and whatever the database (a reference genome, a Genbank division, a protein database, any subset and/or any combination of them).
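To make the "No compromise" point concrete, here is a minimal Python sketch of the idea: count every hit and record the error distribution per query, while retaining only a capped number of hits. This is not Biofacet Engine code or syntax; the function name `summarize_hits`, the `(query_id, target_id, n_errors)` tuple layout and the `max_hits_kept` parameter are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def summarize_hits(alignments, max_hits_kept=2):
    """Summarize hits per query (hypothetical helper, not a Biofacet API).

    alignments: iterable of (query_id, target_id, n_errors) tuples.
    Returns (kept, totals, error_dist):
      kept       -> at most max_hits_kept lowest-error hits per query
      totals     -> total number of hits seen per query, uncapped
      error_dist -> per query, a Counter mapping n_errors to hit count
    """
    kept = defaultdict(list)
    totals = Counter()
    error_dist = defaultdict(Counter)

    for query_id, target_id, n_errors in alignments:
        # Statistics are updated for every hit, kept or not.
        totals[query_id] += 1
        error_dist[query_id][n_errors] += 1

        # Keep only the best (lowest-error) hits, bounded in size.
        hits = kept[query_id]
        hits.append((n_errors, target_id))
        hits.sort()
        del hits[max_hits_kept:]

    return kept, totals, error_dist


if __name__ == "__main__":
    demo = [
        ("r1", "chr1", 0),
        ("r1", "chr2", 1),
        ("r1", "chr3", 1),
        ("r2", "chr1", 2),
    ]
    kept, totals, error_dist = summarize_hits(demo, max_hits_kept=2)
    print(totals["r1"], kept["r1"], dict(error_dist["r1"]))
```

Even though only two hits are kept for `r1`, the totals and the error distribution still reflect all three, which is what lets a user see what was left out by the cap.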
Ever heard anything like this?
Discard all reads that mapped the Mouse, and give me the Human-only; btw, remove also abnormally repeated ones (from the search itself I mean; not from known repeats). And, hey sorry to call back, it’s really great but was wondering whether I could get only genes with short insertions on splice sites, and see how it compares with Rat. Just checking in; nothing urgent, but would be nice to have something by the end of the afternoon, thanks.
Biofacet Engine is designed to handle these types of requests and implements them in a few command lines.
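The first part of that request boils down to set operations on alignment results. The sketch below shows the logic in plain Python, not in Biofacet Engine's own command syntax. The function name `human_only_reads`, the `(read_id, target)` pair layout and the `max_hits_per_read` threshold are assumptions; in particular, "abnormally repeated" is interpreted here as a read with more hits in the search than the threshold allows.

```python
from collections import Counter


def human_only_reads(human_hits, mouse_hits, max_hits_per_read=3):
    """Return read IDs that hit Human but not Mouse (hypothetical helper).

    human_hits, mouse_hits: iterables of (read_id, target) pairs.
    Reads with more than max_hits_per_read Human hits are treated as
    abnormally repeated in the search itself and are discarded.
    """
    # Any read with at least one Mouse hit is excluded.
    mouse_reads = {read_id for read_id, _ in mouse_hits}

    # Count Human hits per read to detect over-represented reads.
    human_counts = Counter(read_id for read_id, _ in human_hits)

    return sorted(
        read_id
        for read_id, n_hits in human_counts.items()
        if read_id not in mouse_reads and n_hits <= max_hits_per_read
    )


if __name__ == "__main__":
    human = [("r1", "chr1"), ("r2", "chr2")] + [("r3", f"chr{i}") for i in range(1, 5)]
    mouse = [("r2", "chr5")]
    print(human_only_reads(human, mouse))
```

In this toy input, `r2` is dropped because it also maps to Mouse and `r3` is dropped because it has four Human hits, leaving only `r1`.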
Below is an overview of some Biofacet Engine features: