BiofacetCast Database Service (BFCast)
BFCast is a database service, operating at BIOFACET’s facilities, which provides the Subscriber with regular updates of biological REFERENCE DATABASES, according to the PRINCIPLES OF OPERATIONS.
PRINCIPLES OF OPERATIONS
BIOFACET periodically fetches reference databases from public repositories (NCBI, EBI, etc.). On receipt, the Biofacet lspbank software module parses them and writes their content in Biofacet and BLAST formats.
lspbank’s specific options (-T GENBANK_EXT, -T EMBL_EXT) build the Biofacet format in such a way that the original content is preserved and can be regenerated at any time. The entire content is indexed.
Consequently, BFCast offers:
- preservation and retrieval of the original records' content and format;
- total indexing, allowing keyword searches on any combination of annotation fields (including all fields).
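The two properties above can be illustrated with a minimal sketch. This is not the lspbank implementation: the RecordStore class, its method names and the field names are hypothetical, chosen only to show how a store can keep each record's raw text untouched while indexing every annotation field.

```python
from collections import defaultdict


class RecordStore:
    """Toy store: keeps raw record text verbatim and indexes every field."""

    def __init__(self):
        self.raw = {}                   # record id -> original text, unmodified
        self.index = defaultdict(set)   # (field, keyword) -> set of record ids

    def add(self, record_id, raw_text, fields):
        """Store the raw record and index each word of each annotation field."""
        self.raw[record_id] = raw_text
        for field, value in fields.items():
            for word in value.lower().split():
                self.index[(field, word)].add(record_id)

    def search(self, keyword, fields=None):
        """Return ids whose selected fields (all fields if None) hold keyword."""
        keyword = keyword.lower()
        hits = set()
        for (field, word), ids in self.index.items():
            if word == keyword and (fields is None or field in fields):
                hits |= ids
        return hits

    def original(self, record_id):
        """Return the record exactly as it was received."""
        return self.raw[record_id]
```

With such a store, search("kinase") looks in every field, search("kinase", fields={"KEYWORDS"}) restricts the search to one field, and original(record_id) returns the record text exactly as it was stored.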
In addition to mirroring the content of major biological databases, BFCast provides two value-added databases: NR and NT. These are built from the original FASTA files available as nr and nt at NCBI, with the FASTA annotation replaced by the full flat-file content of the original records.
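As an illustration of the NR/NT idea, the sketch below pairs each FASTA sequence with the full flat-file record for its accession. It is a simplified assumption of how such a merge could work, not BIOFACET's actual procedure: the function names, the flat_records dictionary, and the rule that the accession is the first token of the FASTA header are all hypothetical.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:].strip(), []
        elif line.strip():
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def enrich(fasta_text, flat_records):
    """Replace each FASTA annotation with the full flat-file record.

    flat_records is a hypothetical dict mapping accession -> flat-file text.
    Assumption: the accession is the first whitespace-delimited header token.
    """
    enriched = []
    for header, sequence in parse_fasta(fasta_text):
        accession = header.split()[0] if header else ""
        # Fall back to the original FASTA header when no flat-file record exists.
        annotation = flat_records.get(accession, header)
        enriched.append(
            {"accession": accession, "annotation": annotation, "sequence": sequence}
        )
    return enriched
```

Falling back to the FASTA header keeps every sequence in the output even when its flat-file record is missing; a real pipeline might prefer to report such cases instead.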
The set of processed databases forms the REFERENCE DATABASES. Once the databases are built at BIOFACET facilities, the Subscriber downloads them according to the schedule described in UPDATE FREQUENCY. This set constitutes what is called the “Hotdrive” at the Subscriber’s site.
REFERENCE DATABASES
Glossary
- gbff: GenBank Flatfile Format: NCBI’s flat file format for nucleotide sequences
- gpff: GenPept Flatfile Format: NCBI’s flat file format for protein sequences
- embl: EBI’s flat file format (nucleotide and protein)
- NUC: nucleotide
- PRT: protein
- nr (nt): NCBI non-redundant protein (nucleotide) databases in FASTA format
- NR (NT): Biofacet’s nr (nt) populated with gpff (gbff) content
In addition to the above databases, the following derived databases and operations are provided:
- Other Reference Databases
  - NCBI bacterial and archaebacterial genomes, plasmids
- Cross-referencing operations
  - Dual cross-references of Uniprot/NR “by 100% sequence identity”
  - Inclusion of NR cross-references in Uniprot (and vice versa)
  - Inclusion of GOA in Uniprot
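Cross-referencing “by 100% sequence identity” can be sketched as grouping entries by their exact sequence. The code below is a minimal illustration under stated assumptions, not BIOFACET's procedure: the function names and the input dictionaries (identifier -> sequence) are hypothetical, and sequences are compared after removing whitespace and ignoring case.

```python
from collections import defaultdict


def normalize(sequence):
    """Strip whitespace and upper-case so formatting differences are ignored."""
    return "".join(sequence.split()).upper()


def cross_reference(uniprot, nr):
    """Link Uniprot and NR entries whose sequences are exactly identical.

    uniprot, nr: hypothetical dicts mapping entry id -> sequence.
    Returns (uniprot id -> NR ids, NR id -> Uniprot ids).
    """
    nr_by_sequence = defaultdict(list)
    for nr_id, sequence in nr.items():
        nr_by_sequence[normalize(sequence)].append(nr_id)

    uniprot_to_nr = {}
    nr_to_uniprot = defaultdict(list)
    for up_id, sequence in uniprot.items():
        matches = nr_by_sequence.get(normalize(sequence), [])
        if matches:
            uniprot_to_nr[up_id] = list(matches)
            for nr_id in matches:
                nr_to_uniprot[nr_id].append(up_id)
    return uniprot_to_nr, dict(nr_to_uniprot)
```

Keying the lookup on the full sequence guarantees exact identity; for very large databases a fixed-size digest of the sequence could serve as the key to save memory, at the cost of an extra equality check to rule out collisions.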
UPDATE FREQUENCY
Biofacet databases are built at BIOFACET facilities every two months, when a new UNIPROT release (N+2 relative to the release used in the previous build) is ready.
Only major releases (and not incremental updates) are provided.
The source databases follow different release cycles:
- monthly: UNIPROT
- every two months: GenBank, RefSeq
- weekly: nr/nt
Every two months, BFCast includes:
- The latest (and new) major release of UNIPROT
- The latest (and new) releases of NR and NT
- Depending on the release calendar, the latest major releases of GenBank and RefSeq, whether or not they are new since the previous build.
- All other derived databases described previously
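The “N+2” rule can be made concrete with a small sketch. It assumes, as stated above, one UNIPROT release per month and a build every two months; the function name and the consecutive release numbering are hypothetical and serve only to show which releases a Subscriber would receive and which would be skipped.

```python
def plan_builds(first_release, months, build_interval=2):
    """Split monthly release numbers into those built and those skipped.

    Assumption: releases are numbered consecutively, one per month, and a
    build happens every `build_interval` months starting with first_release.
    """
    picked, skipped = [], []
    for month in range(months):
        release = first_release + month
        if month % build_interval == 0:
            picked.append(release)
        else:
            skipped.append(release)
    return picked, skipped
```

For example, plan_builds(10, 6) treats releases 10, 12 and 14 as delivered and releases 11, 13 and 15 as skipped, so each delivered release is two releases ahead of the previous one.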