BACKTRANSLATE

Table of Contents
FUNCTION
DESCRIPTION
OUTPUT
INPUT FILES
RELATED PROGRAMS
RESTRICTIONS
CONSIDERATIONS
PARAMETER REFERENCE

FUNCTION

BackTranslate backtranslates an amino acid sequence into a nucleotide sequence. The output helps you identify areas with fewer ambiguities that might be candidates for synthetic probes.

DESCRIPTION

BackTranslate uses a translation table to backtranslate a protein sequence into either the most probable or the most ambiguous nucleic acid sequence. The output file can be used as input to other Wisconsin Package(TM) programs.

If you choose one of the table of back-translations parameters, the program also uses a codon preference table and writes the codons for each amino acid in order of their preference in that table. Below each codon list is a number between 0 and 1,000: the product of the probabilities of the most likely codon for each of the next four amino acids, starting with the amino acid in that column, multiplied by 1,000. The higher the number, the more likely it is that the next 12 nucleotides (four amino acids) contain preferred codons.
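The arithmetic behind that number can be sketched in a few lines of Python. This is an illustration, not the program's own code: the function name window_scores is hypothetical, the probabilities are the top values for Ser, Phe, Ser, Gln, Pro and Trp from the ecohigh.cod sample in the OUTPUT section, and the rules "round to the nearest integer" and "score 0 when fewer than four residues remain" are assumptions inferred from that sample.

```python
from math import prod

WINDOW = 4  # amino acids covered by each score, per the description above


def window_scores(top_probs, window=WINDOW):
    """Return one score per residue: 1000 times the product of the top codon
    probabilities for that residue and the residues that follow it in the
    window. Positions with fewer than `window` residues left score 0
    (an assumption inferred from the sample output)."""
    scores = []
    for i in range(len(top_probs)):
        chunk = top_probs[i:i + window]
        if len(chunk) < window:
            scores.append(0)
        else:
            # Rounding is an assumption; truncation gives the same values here.
            scores.append(round(1000 * prod(chunk)))
    return scores


# Most likely codon probability for Ser, Phe, Ser, Gln, Pro, Trp
top = [0.37, 0.76, 0.37, 0.86, 0.77, 1.00]
print(window_scores(top))  # [89, 186, 245, 0, 0, 0]
```

The printed values match the row of numbers under the codon lists in the sample output.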

OUTPUT

Here is part of the output file:


!!NA_SEQUENCE 1.0
 BACKTRANSLATE of: : ilvhiaa.pep  check: 2165  from: 1  to: 6

E Coli. ilvI - ilvH (peptide)

 Using codon frequencies from: /package/share/9.0/gcgcore/data/rundata/ecohigh.cod
 CheckFile: 9032

Codon usage for enteric bacterial (highly expressed) genes 7/19/83

    Ser        Phe        Ser        Gln        Pro        Trp

  UCC 0.37   UUC 0.76   UCC 0.37   CAG 0.86   CCG 0.77   UGG 1.00
  UCU 0.34   UUU 0.24   UCU 0.34   CAA 0.14   CCA 0.15
  AGC 0.20              AGC 0.20              CCU 0.08
  UCG 0.04              UCG 0.04              CCC 0.00
  AGU 0.03              AGU 0.03
  UCA 0.02              UCA 0.02
  89         186        245        0          0          0

ilvhiaa.seq  Length: 18  September 30, 1998 17:08  Type: N  Check: 2929  ..

       1  WSNTTYWSNC ARCCNTGG
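
The sequence line above can be reproduced with a small lookup, shown here as a Python sketch rather than as the program's implementation. The dictionary holds only the five residue types in this sample, with the ambiguous codons exactly as they appear in the output; a real translation table would cover every amino acid. The names AMBIGUOUS_CODON, most_ambiguous and in_blocks are hypothetical.

```python
# Most ambiguous codon for each residue type in the sample peptide,
# copied from the sample output above (not a complete translation table).
AMBIGUOUS_CODON = {
    "S": "WSN",  # Ser
    "F": "TTY",  # Phe
    "Q": "CAR",  # Gln
    "P": "CCN",  # Pro
    "W": "TGG",  # Trp
}


def most_ambiguous(peptide):
    """Concatenate the ambiguous codon for each one-letter residue."""
    return "".join(AMBIGUOUS_CODON[aa] for aa in peptide)


def in_blocks(seq, size=10):
    """Split a sequence into space-separated blocks, as in the output file."""
    return " ".join(seq[i:i + size] for i in range(0, len(seq), size))


seq = most_ambiguous("SFSQPW")  # Ser Phe Ser Gln Pro Trp
print(len(seq), in_blocks(seq))  # 18 WSNTTYWSNC ARCCNTGG
```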

INPUT FILES

BackTranslate accepts a single protein sequence and a single codon frequency table as input. Look at the CodonFrequency program for information about how to create or modify a codon frequency file. If BackTranslate rejects your protein sequence, turn to Appendix VI to see how to change or set the type of a sequence.

RELATED PROGRAMS

Prime selects oligonucleotide primers for a template DNA sequence. The primers may be useful for the polymerase chain reaction (PCR) or for DNA sequencing. You can allow Prime to choose primers from the whole template or limit the choices to a particular set of primers listed in a file.

CodonFrequency tabulates codon usage from sequences or existing codon frequency tables. Composition counts trinucleotides from any set of sequences. The mapping programs can be run with -ALL to identify all potential restriction sites in back-translated sequences. If you run the mapping programs with -SILent, they identify potential restriction sites that can be created without changing the translation of the nucleic acid sequence.

RESTRICTIONS

No checking is done to see that your codon frequency table and your translation table agree. The most ambiguous back-translated sequence comes from the translation table. The most probable back-translated sequence comes from the codon frequency table. The table of codon choices also comes from the codon frequency table.
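
As an illustration of how the most probable sequence depends only on the codon frequency table, here is a minimal Python sketch. It is not the program's code: the name most_probable is hypothetical, the frequencies are copied from the ecohigh.cod sample in the OUTPUT section, and the codons are left in the U alphabet used by that table (the alphabet of the program's actual most probable output is not shown in this document).

```python
# Codon frequencies for the residue types in the sample peptide,
# copied from the sample output (not a complete codon frequency table).
CODON_FREQ = {
    "S": {"UCC": 0.37, "UCU": 0.34, "AGC": 0.20,
          "UCG": 0.04, "AGU": 0.03, "UCA": 0.02},
    "F": {"UUC": 0.76, "UUU": 0.24},
    "Q": {"CAG": 0.86, "CAA": 0.14},
    "P": {"CCG": 0.77, "CCA": 0.15, "CCU": 0.08, "CCC": 0.00},
    "W": {"UGG": 1.00},
}


def most_probable(peptide):
    """Pick the highest-frequency codon for each one-letter residue."""
    return "".join(
        max(CODON_FREQ[aa], key=CODON_FREQ[aa].get) for aa in peptide
    )


print(most_probable("SFSQPW"))  # UCCUUCUCCCAGCCGUGG
```

Because nothing here consults the translation table, a codon frequency table built for a different genetic code would be used without complaint, which is the restriction described above.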

CONSIDERATIONS

The most ambiguous back-translation uses three IUB codes (see Appendix III) to represent each codon. These codes are not capable of correctly representing sets of codons where more than one of the bases is incompletely permuted. This is the case for the stop codons and for the residues with six synonymous codons. For instance, serine should back-translate into the codons TCT, TCC, TCA, TCG, AGT, or AGC. These can be represented precisely as either TCN or AGY. The codon shown by BackTranslate for serine is WSN (W is A or T, S is C or G, N is any base), which has 16 permutations, six of which are correct and ten of which are not!
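
The over-representation can be checked by expanding the ambiguity codes. The Python sketch below uses the serine code that appears in the sample output (WSN) and assumes the standard IUB meanings W = A or T, S = C or G, Y = C or T, and N = any base; the names IUB, SERINE and expand are hypothetical, and only the codes needed for this example are included.

```python
from itertools import product

# Subset of the IUB ambiguity codes needed for this example
IUB = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "S": "CG", "Y": "CT", "N": "ACGT",
}

# The six serine codons listed above
SERINE = {"TCT", "TCC", "TCA", "TCG", "AGT", "AGC"}


def expand(codon):
    """Return every unambiguous codon matched by an ambiguous codon."""
    return {"".join(bases) for bases in product(*(IUB[c] for c in codon))}


expanded = expand("WSN")
extra = sorted(expanded - SERINE)  # codons matched by WSN that are not serine

print(SERINE <= expanded)                       # True: all serine codons match
print(extra)                                    # the spurious codons
print(expand("TCN") | expand("AGY") == SERINE)  # True: the exact representation
```

Comparing len(expanded) with len(SERINE) shows how many of the expanded codons are spurious.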

PARAMETER REFERENCE

You can set the parameters listed below from the command line. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.

Would you like to see:
    most probable sequence only
    most ambiguous sequence only

indicates the type of output to create. If you select the most probable sequence, BackTranslate chooses the most probable nucleotides based on the codon usage. If you select the most ambiguous sequence, BackTranslate uses the translation table to create a nucleotide sequence with ambiguity symbols.

Codon Frequency Table

selects the codon frequency table to use when constructing the most probable sequence. You can select optional codon frequency tables to bias the results in favor of the codon usage in E. coli, human, Drosophila, maize, yeast, and some other classes of organisms.

Translation Table

Usually, the Standard translation table is the basis for all translations. You can choose translation tables for various non-standard genomes such as yeast mitochondrial.

Printed: January 13, 1999 6:26 (1162)