PeptideSort shows the peptide fragments from a digest of an amino acid sequence. It sorts the peptides by position, putative molecular weight, and relative HPLC retention at pH 2.1, and shows the composition of each peptide. It also prints a summary of the composition of the whole protein.
PeptideSort cuts a peptide sequence with any or all of the proteolytic enzymes and reagents listed in the public or local data file proenzall.dat. The peptides from each digest are sorted by position, weight, and retention time in a high-pressure liquid chromatograph at pH 2.1. For each peptide in each sorting, the following data are displayed: beginning and ending positions, molecular weight, HPLC retention at pH 2.1, HPLC retention at pH 7.4, charge, number of aromatic residues, number of acidic residues, number of basic residues, number of residues containing sulfur, number of hydrophilic residues, and number of hydrophobic residues. The content, isoelectric point, and molar extinction coefficient at 280 nm of each peptide are shown with the table of peptides sorted by position. The content can be displayed in the order of expected elution from an amino acid analyzer.
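The digest-and-sort step described above can be sketched in a few lines of Python. This is a simplified illustration, not PeptideSort's own code: it assumes an enzyme that cuts on the C-terminal side of every K or R with no exceptions, it does not read proenzall.dat, and the function names and toy sequence are hypothetical. Peptide length stands in for molecular weight in the second sort, since a real weight calculation would need a residue mass table.

```python
# Illustrative sketch only: a simplified digest, not PeptideSort's code.
from collections import Counter


def digest_after(sequence, cut_residues="KR", start=1):
    """Split `sequence` after every residue found in `cut_residues`.

    Returns a list of (from_pos, to_pos, peptide) tuples. Positions are
    1-based and offset by `start`. Assumption: the enzyme cuts on the
    C-terminal side of each listed residue, with no exceptions.
    """
    peptides = []
    begin = 0
    for i, residue in enumerate(sequence):
        # Do not cut after the final residue; the last peptide is added below.
        if residue in cut_residues and i < len(sequence) - 1:
            peptides.append((start + begin, start + i, sequence[begin:i + 1]))
            begin = i + 1
    peptides.append((start + begin, start + len(sequence) - 1, sequence[begin:]))
    return peptides


def composition(peptide):
    """Format residue counts as 'A2,F1,...' in alphabetical order."""
    counts = Counter(peptide)
    return ",".join(f"{aa}{counts[aa]}" for aa in sorted(counts))


# Hypothetical toy sequence, not taken from any real protein.
toy = "MASRLLPQKAAF"
by_position = digest_after(toy)
# Length is only a crude stand-in for molecular weight here.
by_length = sorted(by_position, key=lambda p: len(p[2]))

for frm, to, pep in by_position:
    print(frm, to, pep, composition(pep))
```

The `start` argument mirrors the way PeptideSort numbers peptides from the beginning of the selected range rather than from 1.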
Here is the output file:
    PEPTIDESORT of: gzeinaa.pep  check: 2106  from: 18  to: 243

    Corn Storage Protein Am. Ac. (19,000, genomic)
    extracted from GZEIN.SEQ, checksum 2842, row a

    With Enzymes: TRYPSIN

    October 5, 1998 10:42  ..

    Digest with: Trypsin.  Peptides Sorted by Position

    Pos  From    To   Mol Wt  Ret2.1  Ret7.4   Chg  Aro  Acid  Base  Sulf  Phil  Phob

      1    18 -  65   4991.8   173.4   167.6   0.0    3     1     1     4    21    27
         A7,C2,E1,F1,G1,I3,L8,M2,N1,P7,Q2,R1,S8,T1,V1,Y2  Iso=6.11  Ext=2800
      2    66 - 103   4117.9   160.0   146.2   1.0    1     0     1     0    16    22
         A5,F1,G1,H1,I4,L10,N1,P3,Q7,R1,S3,V1  Iso=10.53  Ext=0
      3   104 - 156   5919.7   153.6   115.4   1.0    6     0     1     0    24    29
         A11,F3,L11,N3,P3,Q14,R1,S3,V1,Y3  Iso=9.50  Ext=3840
      4   157 - 243   9608.0   364.4   291.6  -1.0   11     1     0     0    38    49
         A12,D1,F8,G3,H2,I2,L18,N4,P8,Q16,S3,T4,V3,Y3  Iso=6.50  Ext=3840

    Digest with: Trypsin.  Peptides Sorted by Weight

    Pos  From    To   Mol Wt  Ret2.1  Ret7.4   Chg  Aro  Acid  Base  Sulf  Phil  Phob

      2    66 - 103   4117.9   160.0   146.2   1.0    1     0     1     0    16    22
      1    18 -  65   4991.8   173.4   167.6   0.0    3     1     1     4    21    27
      3   104 - 156   5919.7   153.6   115.4   1.0    6     0     1     0    24    29
      4   157 - 243   9608.0   364.4   291.6  -1.0   11     1     0     0    38    49

    Digest with: Trypsin.  Peptides Sorted by Retention

    Pos  From    To   Mol Wt  Ret2.1  Ret7.4   Chg  Aro  Acid  Base  Sulf  Phil  Phob

      3   104 - 156   5919.7   153.6   115.4   1.0    6     0     1     0    24    29
      2    66 - 103   4117.9   160.0   146.2   1.0    1     0     1     0    16    22
      1    18 -  65   4991.8   173.4   167.6   0.0    3     1     1     4    21    27
      4   157 - 243   9608.0   364.4   291.6  -1.0   11     1     0     0    38    49

    Summary for whole sequence:

    Molecular weight = 24583.35    Residues = 226
    Average Residue Weight = 108.776    Charged = 1
    Isoelectric point = 8.12
    Extinction coefficient = 10360

    Residue                  Number    Mole Percent  ..

    A = Ala                      35          15.487
    B = Asx                       0           0.000
    C = Cys                       2           0.885
    D = Asp                       1           0.442
    E = Glu                       1           0.442
    F = Phe                      13           5.752
    G = Gly                       5           2.212
    H = His                       3           1.327
    I = Ile                       9           3.982
    K = Lys                       0           0.000
    L = Leu                      47          20.796
    M = Met                       2           0.885
    N = Asn                       9           3.982
    P = Pro                      21           9.292
    Q = Gln                      39          17.257
    R = Arg                       3           1.327
    S = Ser                      17           7.522
    T = Thr                       5           2.212
    V = Val                       6           2.655
    W = Trp                       0           0.000
    Y = Tyr                       8           3.540
    Z = Glx                       0           0.000

    A + G                        40          17.699
    S + T                        22           9.735
    D + E                         2           0.885
    D + E + N + Q                50          22.124
    H + K + R                     6           2.655
    D + E + H + K + R             8           3.540
    I + L + M + V                64          28.319
    F + W + Y                    21           9.292

    Enzymes that do cut:

    Trypsin

    Enzymes that do not cut:

    NONE
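The summary figures in the sample output are related by simple arithmetic, which the short Python check below reproduces. It assumes that the average residue weight is the molecular weight divided by the number of residues, and that mole percent is the residue count as a percentage of all residues. Both readings are inferred from the numbers in the sample output, not taken from the program source.

```python
# Values copied from the sample summary above.
residues = 226
molecular_weight = 24583.35
counts = {"A": 35, "L": 47, "Q": 39}  # a few rows of the composition table


def mole_percent(count, total):
    """Residue count as a percentage of all residues (inferred definition)."""
    return 100.0 * count / total


average_residue_weight = molecular_weight / residues
print(round(average_residue_weight, 3))  # 108.776, as in the summary

for aa, n in counts.items():
    # Prints 15.487 for A, 20.796 for L, and 17.257 for Q, as in the table.
    print(aa, round(mole_percent(n, residues), 3))
```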
PeptideSort accepts a single protein sequence as input. If PeptideSort rejects your protein sequence, turn to Appendix VI to see how to change or set the type of a sequence.
PeptideMap creates a peptide map with an output format similar to the DNA restriction maps. Isoelectric plots the charge as a function of pH for any peptide sequence.
The algorithm used by PeptideSort to estimate HPLC retention times (Meek, Proc. Natl. Acad. Sci. USA 77; 1632 (1980)) assumes that the retention of a peptide correlates with its amino acid composition. This assumption holds for peptides of up to about 20 amino acids, but steric and conformational factors can affect the retention of longer peptides. Retention times calculated by PeptideSort for peptides longer than 20 amino acids should not be considered accurate.
The formula for estimating the retention time is the sum of the retention coefficients for the amino acids in the peptide, plus the coefficients for the end groups, plus a value t0, which is the time for elution of unretained compounds. The retention time reported by PeptideSort does not include the t0 value. You will have to determine this time for your HPLC system and add it to the reported times.
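The formula above can be sketched as a short Python function. The coefficient values here are placeholders invented for illustration and are not Meek's published coefficients. The function name, its parameters, and the t0 value of 3.0 are likewise hypothetical. Substitute the published coefficients and a t0 measured on your own system before relying on any result.

```python
def estimated_retention(peptide, coefficients, end_groups=0.0, t0=0.0):
    """Sum the per-residue coefficients, then add end-group terms and t0.

    `coefficients` maps one-letter residue codes to retention coefficients.
    With the default t0 of 0.0 the result corresponds to the kind of value
    PeptideSort reports, which leaves t0 out.
    """
    return sum(coefficients[aa] for aa in peptide) + end_groups + t0


# PLACEHOLDER coefficients, invented for illustration; NOT Meek's values.
demo_coefficients = {"A": 1.0, "L": 10.0, "K": -2.0}

# Value without t0, analogous to what the program prints.
reported = estimated_retention("ALLK", demo_coefficients, end_groups=0.5)

# Add the t0 you measured for your own HPLC system (3.0 is made up).
expected_on_column = reported + 3.0

print(reported, expected_on_column)
```

Keeping t0 as a separate argument reflects the point made above: it depends on your instrument, so it is added after the composition-based sum.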
Meek's paper does not report retention coefficients for cysteine, only for cystine. PeptideSort assumes that these are the same. Therefore the estimated retention time for a peptide containing cysteines may be inaccurate.
The retention times reported by PeptideSort should be regarded as estimates, since the actual retention times can vary according to the elution conditions. Meek's retention coefficients were determined empirically using a linear gradient of acetonitrile, starting at 0% at 0 min and increasing to 60% at 80 min (0.75% per min). Increasing the gradient rate to 1.5% acetonitrile per min resulted in retention times that were 70 percent of normal. Decreasing the gradient rate to 0.5% per min resulted in retention times that were 120 percent of normal. Meek also noted minor differences in relative retention rates with columns made by different manufacturers.
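If your gradient matches one of the rates quoted above, a rough adjustment can be sketched as a lookup. This Python fragment assumes that the quoted percentages can be applied directly to the value PeptideSort reports, which the text does not confirm. It supports only the three quoted rates and does not interpolate between them. The names are hypothetical.

```python
# Relative retention versus gradient rate (% acetonitrile per min), as quoted
# in the text above. 0.75 %/min is the reference gradient.
RELATIVE_RETENTION = {0.5: 1.20, 0.75: 1.00, 1.5: 0.70}


def rescale(reported_time, gradient_rate):
    """Scale a reported retention time to one of the quoted gradient rates.

    Assumption: the quoted factor applies directly to the reported value.
    Rates other than the three quoted ones are rejected, not interpolated.
    """
    try:
        factor = RELATIVE_RETENTION[gradient_rate]
    except KeyError:
        raise ValueError(f"no quoted factor for {gradient_rate} %/min") from None
    return reported_time * factor


# Peptide 1 from the sample output (Ret2.1 = 173.4) at 1.5 %/min.
print(round(rescale(173.4, 1.5), 2))
```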
A single digest is limited to 1,000 peptides. If you choose all enzymes and your protein sequence is more than 500 residues long, the output may be very large. Remember to delete the output file when you have finished looking at the data to free disk space.
The program presents you with an enzyme selection prompt that lets you enter enzymes individually or collectively. We maintain our enzyme files with a semicolon (;) character in front of all but one member of a family of isoschizomers. (Isoschizomers are restriction endonucleases with the same recognition site.) The isoschizomers beginning with a semicolon are normally not displayed by our mapping programs unless you specifically select them by name or type "**" instead of "*" at the enzyme prompt.
There is more information on enzyme files in Appendix VII.
PeptideSort was written by John Devereux in the GCG laboratory. It was designed to handle several suggestions made to us by Drs. Michael Gribskov and Roland Rueckert. HPLC retention is from Meek, Proc. Natl. Acad. Sci. USA 77; 1632 (1980). Molar extinction coefficient is from Gill, S.C. and von Hippel, P.H., Anal. Biochem. 182; 319-326 (1989).
You can set the parameters listed below from the command line. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.