GCG-Provided Sequence Databases

With SeqWeb, GCG provides the DNA and protein databases shown in the following table. Your SeqWeb Administrator determines which of these databases are available to you.

Note: All DNA databases are a combination of sequences in GenBank and the EMBL Data Library. Because GenBank and EMBL overlap heavily, GCG has eliminated EMBL sequence entries that share a primary accession number with a sequence in GenBank.
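The elimination rule described in the note can be sketched in a few lines of Python. This is only an illustration of the idea, not GCG's actual procedure: the function name, the dictionary layout of an entry, and the accession numbers are all hypothetical.

```python
def merge_without_duplicates(genbank_entries, embl_entries):
    """Combine two entry lists, dropping EMBL entries whose primary
    accession number already appears in GenBank (hypothetical helper)."""
    genbank_accessions = {entry["accession"] for entry in genbank_entries}
    abridged_embl = [
        entry for entry in embl_entries
        if entry["accession"] not in genbank_accessions
    ]
    return list(genbank_entries) + abridged_embl


# Made-up accession numbers, for illustration only.
genbank = [
    {"accession": "ACC0001", "source": "GenBank"},
    {"accession": "ACC0002", "source": "GenBank"},
]
embl = [
    {"accession": "ACC0002", "source": "EMBL"},  # duplicate of a GenBank entry
    {"accession": "ACC0003", "source": "EMBL"},
]

combined = merge_without_duplicates(genbank, embl)
for entry in combined:
    print(entry["accession"], entry["source"])
```

In this sketch the GenBank copy is kept whenever both databases hold an entry with the same primary accession number, which mirrors the abridged EMBL described above.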

DNA Databases
Name Description
All DNA Databases Includes all entries in GenBank and the abridged EMBL (GenEMBLPlus).
DNA Databases excluding EST and STS Includes sequences from all divisions of both DNA databases except the EST and STS divisions (GenEMBL).
EST and STS Includes only sequences from the EST and STS divisions of GenEMBLPlus (Tags).
Expressed Sequence Tags (EST) Includes sequences derived from sampling cDNA libraries. Entries are mostly short and generated by automatic sequencing.
Genome Survey Sequences (GSS) Includes short sequences from several genomes. Most entries have been generated by automatic sequencing.
Sequence Tagged Sites (STS) Includes short, randomly primed regions of several genomes. The locations of the entries make them important markers for genome mapping and sequencing.
Bacterial Includes sequences derived from bacterial species.
High Throughput Genomes (HTG) Includes long, contiguous regions of several genomes, usually at the cosmid level or longer. The entries are considered finished sequences.
Invertebrate Includes sequences derived from invertebrate species.
Organelle Includes sequences from mitochondria and other organelles (EMBL only; GenBank entries are in the division of the host species).
Mammalian without Primate and Rodent Includes sequences from all mammalian species except primates and rodents.
Vertebrate without Mammalian Includes sequences from all vertebrate species except mammalian species.
Patent Includes sequences submitted to the public databases by the European and United States Patent Offices.
Phage Includes sequences from bacteriophage.
Plant Includes sequences from plant, fungal, and algal species.
Primate Includes sequences from primate species.
Rodent Includes sequences from rodent species.
Structural RNA Includes RNA sequences with solved structures. (Not all structural RNAs have a solved structure.)
Synthetic Includes man-made sequences, mostly vectors, regions of vectors, or genetically engineered plasmids.
Unannotated Includes entries that have incomplete documentation or other problems. Entries in this division will move to another division in a future database release.
Viral Includes sequences from all viral species except bacteriophage.

Protein Databases
Name Description
SWISS-PROT Includes all SWISS-PROT entries.
Translated EMBL Includes most of the proposed translated regions of the complete EMBL database, except those regions with entries already in the SWISS-PROT database (SP-TREMBL).
SWISS-PROT plus Translated EMBL Includes all SWISS-PROT and SP-TREMBL entries (SwissProtPlus).
PIR Includes all PIR entries.