StemLoop finds stems (inverted repeats) within a sequence. You specify the minimum stem length, minimum and maximum loop sizes, and the minimum number of bonds per stem. All stems or only the best stems can be displayed on your screen or written into a file.
StemLoop searches for inverted repeats in your sequence after you choose a minimum stem length and minimum and maximum loop sizes. You must also specify a minimum number of bonds per stem with G-T, A-T/U, and G-C scored as 1, 2, and 3 bonds, respectively. The stems found can be sorted by position, size (stem length), or quality (number of bonds) and can be either filed or displayed on the screen. StemLoop tells you the number of stems found for your settings of minimum stem size, maximum loop size, minimum loop size, and minimum bonds per stem. If you feel there are too many stems, you may reset the parameters without reviewing the stems found or view only the best stems found. To view only the best stems, there must be more than 25 stems found and you must sort them by quality or size. (See the ALGORITHM topic below to understand precisely what StemLoop does.)
StemLoop creates an output file if you choose to file the stems from any search; otherwise, you may view the stems on your screen. In either case, the stem is shown, as below, with vertical bars ('|') indicating the base pairs. The associated loop is shown to the right of the stem. If either the stem or loop is too long to be displayed in its entirety on the line, then only that part that fits on the line is shown. The first and last coordinates of the stem are displayed on the left, and the length of the stem (size), the number of bonds in the stem (quality), and the loop size are shown on the right. Here is part of the file alucons.stem created by the example session above:
STEMLOOP of: alucons.seq  check: 1861  from: 1  to: 290

Alu consensus sequence
Labuda, D. and Striker, G. (1989) Sequence conservation in Alu evolution.
Nucleic Acids Research 17, 2477-2491.

 Minimum Stem: 8  Minimum bonds/stem: 18  Maximum loop size: 20
 Stems found: 8  Stems shown: 8
 Average Match: 1.80  Average Mismatch: 0.00  Nibbling Threshold: 1

 October 6, 1998 14:16  ..

   217 AGGCTGCAGTG AGCCGTGAT            11, 25
       |||||| ||||          C
   257 TCCGGCCTCAC GTCACCGCG            19

   135 TAGCCGGGCGT GG                   11, 22
       ||| || ||||
   160 GTCCGCGCGCG GT                   4

 /////////////////////////////

   221 TGCAGTG AGCCGTG                  7, 18
       |||||||
   248 ACGTCAC CGCGCTA                  14

    35 CACTTCGG GA                      8, 18
       | ||||||
    54 GCGGAGCC GG                      4
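The quality figures in the listing above can be checked by hand from the bond values given earlier (G-T = 1, A-T/U = 2, G-C = 3). The following Python sketch does that arithmetic for two strands read column by column as printed. The names pair_bonds, stem_quality, and bar_line are hypothetical helpers written for this illustration; they are not part of StemLoop.

```python
# Bond values quoted in the text: G-T = 1, A-T/U = 2, G-C = 3.
# Any other pairing is assumed here to score 0, as in the public data file.
BONDS = {
    frozenset("GT"): 1, frozenset("GU"): 1,
    frozenset("AT"): 2, frozenset("AU"): 2,
    frozenset("GC"): 3,
}

def pair_bonds(a, b):
    """Return the bond value for one base pair (0 if the bases do not pair)."""
    return BONDS.get(frozenset((a.upper(), b.upper())), 0)

def stem_quality(top, bottom):
    """Sum the bond values of two strands aligned column by column."""
    if len(top) != len(bottom):
        raise ValueError("both strands of a stem must have the same length")
    return sum(pair_bonds(a, b) for a, b in zip(top, bottom))

def bar_line(top, bottom, threshold=1):
    """Draw '|' under each column whose bond value reaches the threshold."""
    return "".join(
        "|" if pair_bonds(a, b) >= threshold else " "
        for a, b in zip(top, bottom)
    )

# First stem in the listing (coordinates 217 and 257).
print(stem_quality("AGGCTGCAGTG", "TCCGGCCTCAC"))  # 25
print(bar_line("AGGCTGCAGTG", "TCCGGCCTCAC"))      # |||||| ||||
```

The single gap in the bar line corresponds to the C-C mismatch in the seventh column of that stem.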
You may choose to see only the numbers defining each stem on your screen by choosing option '2' in the first menu. This is what the screen output looks like if you then choose to sort by quality in the second menu:
 Loop    Start      End     Size  Quality

    1      217      257       11       25
    2      135      160       11       22
    3      139      160        8       20
    4       69       95        7       20
    5        4       25        9       20
    6      213      247        8       19
    7      221      248        7       18
    8       35       54        8       18
StemLoop can also make an output file with points for plotting with DotPlot.
StemLoop accepts a single nucleotide sequence as input. If StemLoop rejects your nucleotide sequence, turn to Appendix VI to see how to change or set the type of a sequence.
MFold predicts optimal and suboptimal secondary structures for an RNA or DNA molecule using the most recent energy minimization method of Zuker. PlotFold displays the optimal and suboptimal secondary structures for an RNA or DNA molecule predicted by MFold.
Using Compare-DotPlot to create a dot-plot of the similarities between a nucleotide sequence and its reverse-complement strand is functionally equivalent to running StemLoop. Repeat uses the same algorithm as StemLoop to find repeats that are not inverted. DotPlot shows you the output from Compare or StemLoop on a surface of comparison.
StemLoop only searches for loops through a range that is equal to twice the minimum stem length, plus the maximum loop size. You may extend the search range by increasing the maximum loop size; however, the maximum range for the search may not exceed 2,000 bases. StemLoop cannot find more than 1,000 loops.
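As a worked example of the range rule above, the short Python sketch below computes the search range for a given minimum stem length and maximum loop size and checks it against the 2,000-base limit. The function names are hypothetical, and how StemLoop itself reacts to an over-limit setting is not stated in this document, so the sketch only reports whether the limit is respected.

```python
MAX_SEARCH_RANGE = 2000  # upper limit on the search range quoted in the text

def search_range(min_stem, max_loop):
    """Range searched for one loop: two stem halves plus the largest loop."""
    return 2 * min_stem + max_loop

def within_limit(min_stem, max_loop):
    """True if the settings keep the search range at or under the limit."""
    return search_range(min_stem, max_loop) <= MAX_SEARCH_RANGE

# Settings from the alucons.stem header: minimum stem 8, maximum loop 20.
print(search_range(8, 20))   # 36
print(within_limit(8, 20))   # True
```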
StemLoop uses a window and stringency match criterion in exactly the same manner as Compare. For every position in each register shift, a window whose length you set as the minimum stem size is moved along the sequence. If the bonds under the window equal or exceed the minimum number of bonds per stem, a stem is recorded covering all of the bases under the window. The number of bonds under the window at each window position is the sum, over every base pair under the window, of the values in the scoring matrix read from the file stemloop.cmp. Mismatches can be scored negatively, although the public data file simply scores matches, with G-T, A-T/U, and G-C worth 1, 2, and 3, respectively. Several adjacent mismatches may be found within a long stem if there are strong matches on either side. The criterion for a stem is that the minimum number of bonds occur within a length set by you as the minimum stem length.
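The window criterion can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not StemLoop's implementation: it uses the 1, 2, 3 bond values of the public data file rather than reading stemloop.cmp, scores every other pairing as 0, visits windows by start position and loop size rather than by register shift (the same windows in a different order), and does not merge overlapping windows into longer stems. The names pair_bonds and find_stems are hypothetical.

```python
# Assumed bond values (from the text): G-T = 1, A-T/U = 2, G-C = 3, else 0.
BONDS = {
    frozenset("GT"): 1, frozenset("GU"): 1,
    frozenset("AT"): 2, frozenset("AU"): 2,
    frozenset("GC"): 3,
}

def pair_bonds(a, b):
    """Bond value for one base pair, 0 when the bases do not pair."""
    return BONDS.get(frozenset((a.upper(), b.upper())), 0)

def find_stems(seq, min_stem, min_loop, max_loop, min_bonds):
    """Return (start, end, bonds, loop) for every window that qualifies.

    start and end are 1-based coordinates of the outermost base pair;
    loop is the number of unpaired bases enclosed by the window.
    """
    hits = []
    n = len(seq)
    for i in range(n):                      # 5' end of the window
        for loop in range(min_loop, max_loop + 1):
            j = i + 2 * min_stem - 1 + loop  # base pairing with position i
            if j >= n:
                break                        # larger loops run off the end
            # Pair i with j, i+1 with j-1, and so on across the window.
            bonds = sum(
                pair_bonds(seq[i + k], seq[j - k]) for k in range(min_stem)
            )
            if bonds >= min_bonds:
                hits.append((i + 1, j + 1, bonds, loop))
    return hits

# A perfect 8-base G-C stem closed by a 4-base loop.
hairpin = "GCGCGCGC" + "AAAA" + "GCGCGCGC"
print(find_stems(hairpin, min_stem=8, min_loop=3, max_loop=6, min_bonds=24))
# [(1, 20, 24, 4)]
```

Because the score is a sum over the whole window, a window can qualify even when it contains mismatches, provided the remaining pairs contribute enough bonds.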
Before the stems are presented, they are trimmed (or nibbled) from both ends so that the first base on each end participates in a bond. The criterion for a bond between pairing bases is that the value in the scoring matrix file (stemloop.cmp) for the pair is greater than or equal to the average positive non-identical comparison value in the scoring matrix. You can reset the threshold for nibbling with the "Threshold for nibbling, match (|), and point display" option. You could set a pairing threshold high enough so that all stems are nibbled away!
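Nibbling can be illustrated with a short Python sketch. It assumes the 1, 2, 3 bond values of the public data file, scores every other pairing as 0, and takes its default threshold of 1 from the "Nibbling Threshold: 1" line in the example output; it does not compute the threshold from a scoring matrix as StemLoop does. The names pair_bonds and nibble are hypothetical.

```python
# Assumed bond values (from the text): G-T = 1, A-T/U = 2, G-C = 3, else 0.
BONDS = {
    frozenset("GT"): 1, frozenset("GU"): 1,
    frozenset("AT"): 2, frozenset("AU"): 2,
    frozenset("GC"): 3,
}

def pair_bonds(a, b):
    """Bond value for one base pair, 0 when the bases do not pair."""
    return BONDS.get(frozenset((a.upper(), b.upper())), 0)

def nibble(top, bottom, threshold=1):
    """Trim both ends of a stem until the end pairs reach the threshold.

    top and bottom are the two strands aligned column by column.
    Returns the trimmed strands; both are empty if nothing survives.
    """
    scores = [pair_bonds(a, b) for a, b in zip(top, bottom)]
    start, end = 0, len(scores)
    while start < end and scores[start] < threshold:
        start += 1                 # drop a non-bonding pair at the left end
    while end > start and scores[end - 1] < threshold:
        end -= 1                   # drop a non-bonding pair at the right end
    return top[start:end], bottom[start:end]

# An A-A mismatch at the left end is nibbled off.
print(nibble("ACACTTCGG", "AGCGGAGCC"))
# ('CACTTCGG', 'GCGGAGCC')

# No pair scores 4 or more, so the whole stem is nibbled away.
print(nibble("CACTTCGG", "GCGGAGCC", threshold=4))
# ('', '')
```

Only the ends are trimmed: mismatches in the interior of a stem are left in place, which matches the gaps seen in the bar lines of the example output.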
StemLoop chooses a default minimum number of bonds per stem that is appropriate for the scoring matrix it reads. If you select a different scoring matrix with -MATRix, the program will adjust the default minimum number of bonds per stem accordingly.
You can set the parameters listed below from the command line. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.