Prime selects oligonucleotide primers for a template DNA sequence. The primers may be useful for the polymerase chain reaction (PCR) or for DNA sequencing. You can allow Prime to choose primers from the whole template or limit the choices to a particular set of primers listed in a file.
The Polymerase Chain Reaction (PCR) process for amplifying nucleic acids is covered by U.S. Patent Nos. 4,683,195 and 4,683,202 owned by Hoffmann La Roche. A license for research may be obtained through the purchase and use of authorized reagents and thermocyclers from Perkin-Elmer Corp., or by otherwise negotiating a license with Perkin-Elmer. No license to use PCR is granted by the purchase or use of the Wisconsin Package(TM).
Prime analyzes a template DNA sequence and chooses primer pairs for the polymerase chain reaction (PCR) and primers for DNA sequencing. For PCR primer pair selection, you can choose a target range of the template sequence to be amplified. For DNA sequencing primers, you can specify positions on the template that must be included in the sequencing.
In selecting appropriate primers, Prime considers a variety of constraints on the primer and amplified product sequences. You either can use the program's default constraint values or modify those values to customize the analysis. You can specify upper and lower limits for primer and product melting temperatures and for primer and product GC contents. For primers, you can specify a range of acceptable primer sizes, any required bases at the 3' end of the primer (3' clamp), and a maximum difference in primer melting temperatures for PCR primer pairs. For PCR products, you can specify a range of acceptable product sizes.
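The size, GC-content, and 3' clamp constraints can be illustrated with a minimal Python sketch. It is not Prime's code: the function names are hypothetical, the default limits are copied from the sample run shown in this document, and the clamp is simplified to a single required 3' base, where "S" is read as G or C.

```python
def gc_percent(primer):
    """Percentage of G and C bases in the primer."""
    seq = primer.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def passes_basic_constraints(primer, min_len=18, max_len=22,
                             min_gc=40.0, max_gc=55.0, clamp_bases="GC"):
    """Check size, GC content, and a one-base 3' clamp (assumed simplification)."""
    seq = primer.upper()
    if not min_len <= len(seq) <= max_len:
        return False
    if not min_gc <= gc_percent(seq) <= max_gc:
        return False
    # The last character of a 5'-to-3' sequence is its 3' end.
    return seq[-1] in clamp_bases
```

For the forward primer of the first product in the sample output, AGTTCCACACACTCGCTTC, this gives a GC content of 10/19, which rounds to the 52.6% reported in the listing.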
For efficient priming, you should avoid primers with extensive self-complementarity in order to minimize primer secondary structure and primer dimer formation. Additionally, in PCR experiments, primer pairs with extensive complementarity between the two primers should be avoided in order to minimize primer dimer formation. Prime uses the annealing test described in the ALGORITHM topic to check individual primers for self-complementarity and to check the two primers in a PCR primer pair for complementarity to each other. Using this same annealing test, Prime optionally can screen against non-specific primer binding on the template sequence and on any repeated sequences you specify.
The terms forward primer and reverse primer are used in the remainder of this document and in the program output. Forward primers are complementary to sequences on the reverse template strand and create copies of the forward strand by primer extension. Conversely, reverse primers are complementary to sequences on the forward template strand and create copies of the reverse strand by primer extension.
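The relationship between the two primer types and the forward strand can be sketched as follows. This is an illustration only: the function names are hypothetical, positions are taken to be 1-based and inclusive on the forward strand, and only unambiguous bases are handled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def forward_primer(forward_strand, start, end):
    """A forward primer reads the same as the forward strand, 5' to 3'."""
    return forward_strand[start - 1:end].upper()


def reverse_primer(forward_strand, start, end):
    """A reverse primer is the reverse complement of the forward-strand segment."""
    return reverse_complement(forward_strand[start - 1:end])
```

Under this reading, a reverse primer reported as CTTCCACATTCACCTTGCC would bind a forward-strand segment reading GGCAAGGTGAATGTGGAAG.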
Here is some of the output file listing the twenty-five most appropriate PCR primer pairs selected by Prime.
 PRIME of: ggamma.seq  ck: 3814  from: 1  to: 500   October 1, 1998 10:50

 INPUT SUMMARY
 -------------

 Input sequence: ggamma.seq

 Primer constraints:
     primer size: 18 - 22
     primer 3' clamp: S
     primer sequence ambiguity: NOT ALLOWED
     primer GC content: 40.0 - 55.0%
     primer Tm: 50.0 - 65.0 degrees Celsius
     primer self-annealing. . .
         3' end: < 8 (weight: 2.0)
         total: < 14 (weight: 1.0)
     unique primer binding sites: required
     primer-template and primer-repeat annealing. . .
         3' end: ignored
         total: ignored
     repeated sequences screened: none specified

 Product constraints:
     product length: 100 - 300
     product GC content: 40.0 - 55.0
     product Tm: 70.0 - 95.0 degrees Celsius
     duplicate primer endpoints: NOT ALLOWED
     difference in primer Tm: < 2.0 degrees Celsius
     primer-primer annealing. . .
         3' end: < 8 (weight: 2.0)
         total: < 14 (weight: 1.0)

 PRIMER SUMMARY
 --------------
                                        forward   reverse
 Number of primers considered:             1403      1403
 Number of primers rejected for . . .
     primer 3' clamp:                       227       225
     primer sequence ambiguity:               0         0
     primer GC content:                     623       631
     primer Tm:                             170       178
     non-unique binding sites:                0         0
     primer self-annealing:                  56        57
     primer-template annealing:               0         0
     primer-repeat annealing:                 0         0
 Number of primers accepted:                327       312

 PRODUCT SUMMARY
 ---------------
 Number of products considered:          102024
 Number of products rejected for. . .
     product length:                      76315
     product GC content:                   1636
     product Tm:                              0
     product position:                        0
     duplicate primer endpoints:           9001
     difference in primer Tm:              5992
     primer-primer annealing:              7455
 Number of products accepted:              1625
 Number of products saved:                   25

 Maximum overlap between products: 300 bp

 --------------------------------------------------------------------------------

 Product: 1

 [DNA] = 50.000 nM     [salt] = 50.000 mM

 PRIMERS
 -------
                                5'                      3'
 forward primer (19-mer):    18 AGTTCCACACACTCGCTTC     36
 reverse primer (19-mer):   145 CTTCCACATTCACCTTGCC    127

                                forward   reverse
 primer %GC:                       52.6      52.6
 primer Tm (degrees Celsius):      52.0      50.8

 PRODUCT
 -------
 product length: 128
 product %GC: 50.8
 product Tm: 75.4 degrees Celsius
 difference in primer Tm: 1.2 degrees Celsius
 annealing score: 37
 optimal annealing temperature: 53.2 degrees Celsius

 --------------------------------------------------------------------------------

 Product: 2

 [DNA] = 50.000 nM     [salt] = 50.000 mM

 PRIMERS
 -------
                                5'                      3'
 forward primer (20-mer):    17 CAGTTCCACACACTCGCTTC    36
 reverse primer (20-mer):   144 TTCCACATTCACCTTGCCCC   125

 ///////////////////////////////////////////////////////////////
The output file begins with a summary listing all of the constraints used by the program to select appropriate primers or PCR primer pairs. Most of these constraints can be modified by adjusting the program parameters. Many of these constraints were described in the DESCRIPTION topic of this document. Several of the constraints, including primer-self annealing, primer-template annealing, primer-primer annealing, and duplicate primer endpoints, are explained more fully in the ALGORITHM and CONSIDERATIONS topics.
Following the input summary, a primer summary lists the number of forward and reverse primers considered by Prime and the number of primers rejected because they failed to meet the various primer constraints. You can use this information to relax the appropriate program constraints if few or no primers are accepted.
If you are selecting PCR primer pairs, the primer summary is followed by a product summary listing the number of PCR products considered by Prime and the number of products rejected because they failed to meet the various product constraints. Again, you can use this information to relax the appropriate program constraints if few or no PCR primer pairs are selected.
Following these summaries is an ordered listing of the most appropriate primers or PCR primer pairs selected by Prime. The list is ordered by total annealing score (see the ALGORITHM topic) so that those primers or PCR primer pairs with the least amount of complementarity to sequences other than the appropriate primer binding sites are listed first. Each output primer or PCR primer pair is designated by a number that corresponds to a line number in the plot of primer sites. While the text output file lists the location of the primer binding site along with each primer sequence, the plot provides a convenient way to review the primer binding sites of many of the selected primers at once.
Prime can create a plot of the primer sites that can help you rapidly review the primer binding sites for the primers selected by the program. The line numbers in the plot correspond to the primer or product numbers in the text output file. Short blue lines extending above the horizontal sequence line indicate the positions of forward primers and short red lines extending below the sequence line indicate the positions of reverse primers.
By default, Prime writes instructions for plotting the primer sites into a figure file named prime.figure. Such files can be plotted on any supported graphics device using the Figure program.
Prime accepts any nucleotide sequence as input and selects appropriate oligonucleotide primers that are complementary to sites on the input template sequence. If Prime rejects your nucleotide sequence, turn to Appendix VI to see how to change or set the type of a sequence.
With -PRImers, you can optionally specify an input file of primer sequences; Prime then selects appropriate primers from that file that are complementary to sites on the template sequence. The file of primer sequences for Prime is modeled on the enzyme data files for the mapping programs described in Appendix VII. Primer names should not have more than 31 characters. The offset field is ignored by Prime, but the field must contain a number so that input primer files remain compatible with the files read by the mapping programs. The input primer sequence may contain only valid GCG sequence symbols (see Appendix III of this manual) plus the single quotation mark ( ' ) and underscore ( _ ) characters. Single quotation marks and underscores in the sequence patterns are ignored. Prime ignores input primers containing any other characters in the sequence. The overhang field has no significance to Prime and can be omitted. For other GCG mapping programs, if the overhang field is absent or is a non-numeric character, then the bottom strand is not searched.
The exact spacing between each field does not matter, only the order of the fields in the line. Blank lines and lines that start with an exclamation point ("!") are ignored. Here is part of an example file of input primers:
 An example file of input primers for the PRIME program.

 Name      Offset   Sequence               Documentation  ..

 x13598    1        ACCCTTCAGCAGTTCCACAC   !
 x24332    1        AAGCACCCTTCAGCAGTTCC   !
 u35982    1        AAGAGAGGTGGAAATGAGG    !
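The file rules described above can be sketched as a small Python reader. This is a hypothetical helper, not part of the Wisconsin Package. It assumes that the data lines begin after the first line ending in "..", that the valid sequence symbols are the IUPAC nucleotide codes, and that text after an exclamation point on a data line is documentation.

```python
# Assumed symbol set: IUPAC nucleotide codes (the manual defers to Appendix III).
VALID_SYMBOLS = set("ACGTURYMKSWBDHVN")


def parse_primer_file(text):
    """Return (name, sequence) tuples from a Prime-style primer file."""
    primers = []
    in_data = False
    for line in text.splitlines():
        stripped = line.strip()
        if not in_data:
            # Assumption: everything up to the line ending in ".." is header text.
            if stripped.endswith(".."):
                in_data = True
            continue
        if not stripped or stripped.startswith("!"):
            continue  # blank lines and comment lines are ignored
        fields = stripped.split("!", 1)[0].split()
        if len(fields) < 3:
            continue
        name, _offset, sequence = fields[0], fields[1], fields[2]
        # Single quotation marks and underscores are ignored in the pattern.
        sequence = sequence.replace("'", "").replace("_", "").upper()
        if not sequence or set(sequence) - VALID_SYMBOLS:
            continue  # primers with any other characters are skipped
        primers.append((name, sequence))
    return primers
```

Because fields are split on any run of whitespace, the exact spacing between fields does not matter, only their order.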
The GCG mapping programs Map, MapPlot, and MapSort can be used to mark finds in the context of a DNA restriction map. FindPatterns identifies sequences that contain short patterns like GAATTC or YRYRYRYR. You can define the patterns ambiguously and allow mismatches. You can provide the patterns in a file or simply type them in from the terminal.
You cannot search for primers longer than fifty bases. You cannot specify a maximum product length greater than 10,000 bases. Prime will not read more than 5,000 primers from an input file of primer sequences.
Prime determines primer melting temperatures by a calculation using the nearest-neighbor model of Borer, and thermodynamic parameters for DNA nearest-neighbor interactions and the salt dependence of oligonucleotides determined by SantaLucia (Proc. Natl. Acad. Sci. USA. 95; 1460-1465 (1998)):
T(m)(primer) = delta H / (delta S' + R x ln(c/4)) - 273.15
where delta H is the enthalpy of helix formation, delta S' is the salt-adjusted entropy of helix formation (including helix initiation), R is the molar gas constant (1.987 cal/degree Celsius/mol), and c is the primer concentration.
In the above equation, the salt-adjusted delta S' is determined from the delta S at 1M salt according to the equation:
delta S' = delta S + (0.368 x (primer_length - 1) x ln[K(+)])
where [K(+)] is the potassium ion concentration.
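The two equations above can be combined in a short Python sketch. It takes delta H and delta S as inputs rather than deriving them from nearest-neighbor tables, so it does not reproduce Prime's parameter set. The function names are hypothetical, and the units (cal/mol for delta H, cal/K/mol for delta S, mol/L for both concentrations) are assumptions consistent with the stated value of R.

```python
import math

R = 1.987  # molar gas constant, cal/(K*mol)


def salt_adjusted_entropy(delta_s, primer_length, salt_molar):
    """delta S' = delta S + 0.368 x (primer_length - 1) x ln[K(+)]."""
    return delta_s + 0.368 * (primer_length - 1) * math.log(salt_molar)


def primer_tm(delta_h, delta_s, primer_length, primer_molar, salt_molar):
    """Primer melting temperature in degrees Celsius."""
    ds_adjusted = salt_adjusted_entropy(delta_s, primer_length, salt_molar)
    return delta_h / (ds_adjusted + R * math.log(primer_molar / 4.0)) - 273.15
```

At 1 M salt the logarithm is zero, so delta S' equals delta S. Lowering the salt concentration makes delta S' more negative and therefore lowers the calculated melting temperature.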
Prime determines PCR product melting temperatures using the formula of Baldino, et al. (in Methods Enzymol. 168; 761-777 (1989)) as modified slightly by Rychlik, et al. (Nucleic Acids Res. 18; 6409-6412 (1990)).
T(m)(product) = 0.41 x (% G+C) + 16.6 x log[K(+)] - 675 / len + 81.5
where len is the length of the product.
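As a worked check of this formula, here is a minimal Python sketch. The function name is hypothetical, and it assumes that log is the base-10 logarithm and that [K(+)] is expressed in mol/L.

```python
import math


def product_tm(gc_percent, length, salt_molar):
    """PCR product melting temperature in degrees Celsius."""
    return (0.41 * gc_percent
            + 16.6 * math.log10(salt_molar)
            - 675.0 / length
            + 81.5)
```

For the first product in the sample output (128 bp, 50.8% GC, 50 mM salt), this evaluates to about 75.46 degrees Celsius, against the 75.4 shown in the listing. The small difference is probably due to rounding of the reported %GC.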
If you are selecting PCR primer pairs, the output includes a proposed annealing temperature for each listed primer pair. The annealing temperature is calculated using the formula of Rychlik, et al. (Nucleic Acids Res. 18; 6409-6412 (1990)).
T(a) = 0.3 x T(m)(primer) + 0.7 x T(m)(product) - 14.9
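The annealing temperature calculation can be sketched as below. The text does not say which primer's melting temperature enters the formula for a pair. This sketch assumes the lower of the two, which is an assumption rather than documented behavior, and the function name is hypothetical.

```python
def annealing_temperature(tm_forward, tm_reverse, tm_product):
    """Proposed annealing temperature in degrees Celsius."""
    # Assumption: use the less stable (lower-Tm) primer of the pair.
    tm_primer = min(tm_forward, tm_reverse)
    return 0.3 * tm_primer + 0.7 * tm_product - 14.9
```

With the rounded values printed for the first product in the sample output (primer Tm 52.0 and 50.8, product Tm 75.4), this gives about 53.1 degrees Celsius, close to the 53.2 reported there.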
Prime uses an annealing test described by Hillier and Green (PCR Methods and Applications. 1; 124-128 (1991)), with slight modification, to check individual primers for self-complementarity and to check the two primers in a PCR primer pair for complementarity to each other. For tests of self-complementarity, a primer sequence in the 5' to 3' orientation is compared with the same sequence in the 3' to 5' orientation. For tests of complementarity between two different primers, one of the primer sequences in the 5' to 3' orientation is compared to the other sequence in the 3' to 5' orientation. The sequences are compared in every register of comparison, using a scoring matrix containing values of complementarity for every pair of nucleotide symbols. For each register of comparison, the score of each base pair comparison is determined. The scores of contiguous base pairs with positive comparison values are summed. The maximum score of all such contiguous segments, taken over all registers of comparison between the sequences, determines the total primer-primer annealing score. Complementarity at the 3' ends of the primer sequences has a particularly large influence on primer-dimer formation. Therefore, the maximum score of all contiguous segments that include the 3' position of either primer sequence, taken over all registers of comparison, is separately determined as the 3' primer-primer annealing score.
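The annealing test described above can be sketched in Python. This is an illustration of the general technique, not Prime's implementation. The scoring values (2 for an A-T pair, 4 for a G-C pair, 0 otherwise) are assumed because the text does not list Prime's matrix, ambiguity symbols are not handled, and the function name is hypothetical.

```python
# Assumed complementarity scores; Prime's actual matrix is not given in the text.
PAIR_SCORE = {("A", "T"): 2, ("T", "A"): 2, ("G", "C"): 4, ("C", "G"): 4}


def annealing_scores(primer_a, primer_b):
    """Return (total score, 3' score) for two primers given 5' to 3'.

    Pass the same sequence twice to test self-complementarity.
    """
    a = primer_a.upper()
    b_rev = primer_b.upper()[::-1]  # second primer in 3'-to-5' orientation
    n, m = len(a), len(b_rev)
    best_total = 0
    best_three_prime = 0
    # Each offset is one register of comparison.
    for offset in range(-(m - 1), n):
        run = 0
        run_has_three_prime = False
        for i in range(max(0, offset), min(n, offset + m)):
            j = i - offset
            score = PAIR_SCORE.get((a[i], b_rev[j]), 0)
            if score > 0:
                run += score
                # 3' end of primer_a is a[n-1]; 3' end of primer_b is b_rev[0].
                if i == n - 1 or j == 0:
                    run_has_three_prime = True
                best_total = max(best_total, run)
                if run_has_three_prime:
                    best_three_prime = max(best_three_prime, run)
            else:
                run = 0
                run_has_three_prime = False
    return best_total, best_three_prime
```

Under these assumed scores, the palindrome GAATTC compared with itself pairs along its full length and scores 16 for both the total and the 3' test, while GGGGAA against CCCCAA scores 16 in total but 0 at the 3' ends.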
The same annealing test is used to determine complementarity between the primer and any non-specific binding sites on the template sequences. In this case, the primer in the 5' to 3' orientation is compared over all registers of comparison with both strands of the template sequence in the 3' to 5' orientation to determine a total primer-template annealing score. Since complementarity at the 3' end of the primer sequence has a particularly large effect on non-specific primer binding, the 3' primer-template annealing score is also determined. If you screen against non-specific primer binding on any specified repeated sequences, then total primer-repeat and 3' primer-repeat annealing scores, taken over all registers of comparison in all repeated sequences, are also determined.
Total and 3' annealing scores are saved in tests of primer self-complementarity (to check for secondary structure and primer dimer formation) and in tests of complementarity between the two primers in PCR primer pairs (to check for primer dimer formation). Total and 3' annealing scores are also saved when you screen against non-specific primer binding on the template sequence and when you screen against non-specific primer binding on any specified repeated sequences. Primers are rejected that exceed the maximum score you specify for any of these tests. For those primers that are accepted, the program uses the sum of all annealing scores to determine the order of primers or PCR primer pairs in the output list. You can specify weights for each of these scores to adjust their relative contributions in determining the output order. By default, 3' annealing scores have twice the weight of total annealing scores in determining the output order.
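The weighted ordering can be sketched as follows. The data layout and function names are hypothetical. Each candidate carries a list of (total, 3') score pairs, one per annealing test that was run, and the default weights mirror the 1.0 and 2.0 shown in the sample output.

```python
def weighted_annealing_score(tests, total_weight=1.0, three_prime_weight=2.0):
    """Sum the weighted total and 3' scores over all annealing tests."""
    return sum(total_weight * total + three_prime_weight * three_prime
               for total, three_prime in tests)


def rank_candidates(candidates):
    """Order candidates so that the lowest weighted score comes first."""
    return sorted(candidates,
                  key=lambda c: weighted_annealing_score(c["tests"]))
```

A candidate with tests [(10, 4), (6, 2)] gets a weighted score of 10 + 8 + 6 + 4 = 28 with the default weights.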
The template sequence may contain ambiguous bases, but Prime will not select primers complementary to any ambiguous sites on the template sequence. If you specify an input file of primer sequences from which to select appropriate oligonucleotide primers, Prime will not select any primers in the file that contain ambiguous bases.
By default, Prime selects appropriate PCR primer pairs. To search for DNA sequencing primers, you need to use either Select: or Select:.
When several acceptable PCR primer pairs have the same 3' ends for both primers, Prime outputs only the PCR primer pair with the shortest primer sequences. By not allowing duplicate primer endpoints, Prime increases the diversity among the PCR primer pairs in the output list.
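The duplicate-endpoint rule can be sketched as a grouping step. The dictionary keys and the function name are hypothetical, and the use of combined primer length to decide which pair is shortest is an assumption.

```python
def drop_duplicate_endpoints(pairs):
    """Keep one primer pair per (forward 3' end, reverse 3' end) combination."""
    best = {}
    for pair in pairs:
        key = (pair["fwd_3prime"], pair["rev_3prime"])
        # Assumption: "shortest" is judged by the combined primer length.
        size = len(pair["fwd_seq"]) + len(pair["rev_seq"])
        if key not in best or size < best[key][0]:
            best[key] = (size, pair)
    return [pair for _, pair in best.values()]
```

The two products shown in the sample output share a forward 3' end at position 36 but have different reverse 3' ends (127 and 125), so neither counts as a duplicate of the other.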
Prime only determines melting temperatures for DNA primers. We do not know of any appropriate nearest-neighbor thermodynamic parameters for RNA-DNA hybrids, so we haven't attempted to calculate melting temperatures for RNA primers. While thermodynamic parameters for RNA duplexes involving mismatches have been described, we do not know of any similar results for DNA duplexes. Therefore, we have not attempted to calculate melting temperatures or other thermodynamic properties for DNA duplexes involving mismatches.
Prime does not currently allow you to determine the compatibility of two primers for PCR in the absence of a template sequence. This function will be added in a future release.
Prime does not currently consider formamide concentration in determining primer melting temperatures.
If Prime fails to select any appropriate primers or PCR primer pairs, review the program summary displayed both on the terminal screen and in the output file. This summary lists the number of primers and PCR primer pairs rejected because they failed to meet each program constraint. With this information, you can determine which constraints to relax in subsequent runs of the Prime program.
To avoid reporting trivially different PCR primer pairs in the output list, use the "Set maximum overlap (in base pairs) between predicted PCR products" option to select sets of primer pairs whose PCR products have limited overlap with each other.
Prime was written by Irv Edelman.
You can set the parameters listed below from the command line. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.