The sequence editor recognizes predefined symbols for nucleotide and protein sequences according to the IUPAC definitions.
Symbol | Name |
A | Adenine |
C | Cytosine |
G | Guanine |
T | Thymine |
U | Uracil |
W | Weak (A or T) |
S | Strong (G or C) |
M | aMino (A or C) |
K | Keto (G or T) |
R | puRine (G or A) |
Y | pYrimidine (C or T) |
B | not A (B comes after A) |
D | not C (D comes after C) |
H | not G (H comes after G) |
V | not T (V comes after T and U) |
N | aNy nucleotide (unknown, but not a gap) |
- | gap symbol |
X | wrong entry |
The symbols with a yellow background are ambiguity symbols. The difference between "N" and a gap symbol ("-") is that a gap symbol represents an unspecified number of unknown symbols, whereas "N" stands for exactly one unknown nucleotide. Wrong symbols are marked with a brown background.
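
The symbol table above can be expressed as a small lookup, which may help when checking sequences outside the editor. The following is only a sketch and not the editor's own implementation: the names `IUPAC_NUCLEOTIDES`, `classify`, and `matches` are hypothetical, and treating "N" as an ambiguity symbol is an assumption based on the standard IUPAC nucleotide codes.

```python
# Hypothetical lookup: IUPAC nucleotide symbol -> bases it can stand for.
# Built from the table above; not taken from the editor's source code.
IUPAC_NUCLEOTIDES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "U",
    "W": "AT", "S": "CG", "M": "AC", "K": "GT", "R": "AG", "Y": "CT",
    "B": "CGT",   # not A
    "D": "AGT",   # not C
    "H": "ACT",   # not G
    "V": "ACG",   # not T (and not U)
    "N": "ACGT",  # exactly one unknown nucleotide (assumed to count as ambiguous)
}
GAP = "-"


def classify(symbol):
    """Return 'base', 'ambiguity', 'gap', or 'wrong' for a single symbol."""
    s = symbol.upper()
    if s == GAP:
        return "gap"
    bases = IUPAC_NUCLEOTIDES.get(s)
    if bases is None:
        return "wrong"  # e.g. "X", which the table lists as a wrong entry
    return "base" if len(bases) == 1 else "ambiguity"


def matches(symbol, base):
    """Return True if the concrete base is one the symbol can stand for."""
    return base.upper() in IUPAC_NUCLEOTIDES.get(symbol.upper(), "")


if __name__ == "__main__":
    for ch in "ACRN-X":
        print(ch, classify(ch))
```

With this sketch, "R" is classified as an ambiguity symbol and matches both "A" and "G", while "-" is reported as a gap and "X" as a wrong entry. A gap never matches a concrete base, because it is not a key in the lookup.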