Both FASTA and BLAST use a rapid word-based lookup strategy to speed the initial phase of the similarity search. In protein searches, FASTA looks for pairs of aligned identical amino acids, e.g.
seq1 KDKEAYADRQELQDELRQEREARQKLEMMIKELKLQILKSSKTAKE
     . ::.    .::..::..::. ::: :.. :::. .:
seq2 NAKEGLEKIEELEEELENERKLRQKSELQRKELESRIEELQDQLET
       ^^      ^^  ^^  ^^  ^^^     ^^^
With ktup=2, FASTA would ignore a region like:
seq1 LNKKLLNLKQAGEHLKPE
     .....:. ..  :.:. .
seq2 FEEEFLETREQYEKLQKD

in the initial scanning phase, because it contains no two adjacent identities. Thus, searches with ktup=1 can be more sensitive than searches with ktup=2. However, a more sensitive algorithm may also raise the scores of unrelated sequences, so that the statistical significance of an intermediate-distance match is reduced, while the significance of a very distant match is improved.
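
The effect of ktup on the two example alignments above can be checked with a short sketch. This is an illustration of a word lookup table with per-diagonal hit counting, not FASTA's actual implementation; the function name `ktup_hits_by_diagonal` and the variables `pair_a` and `pair_b` are made up for this example, and the later rescoring and joining steps that FASTA performs are omitted.

```python
from collections import defaultdict

def ktup_hits_by_diagonal(query, subject, ktup):
    """Count identical ktup-length words shared by two sequences, per diagonal.

    The diagonal of a hit is (query position - subject position), so all hits
    that belong to one ungapped alignment fall on the same diagonal.
    """
    # Lookup table: each word in the query -> list of its start positions.
    table = defaultdict(list)
    for i in range(len(query) - ktup + 1):
        table[query[i:i + ktup]].append(i)

    # Scan the subject once, looking each of its words up in the table.
    hits = defaultdict(int)
    for j in range(len(subject) - ktup + 1):
        for i in table.get(subject[j:j + ktup], ()):
            hits[i - j] += 1
    return dict(hits)

# The two sequence pairs from the examples in the text.
pair_a = ("KDKEAYADRQELQDELRQEREARQKLEMMIKELKLQILKSSKTAKE",
          "NAKEGLEKIEELEEELENERKLRQKSELQRKELESRIEELQDQLET")
pair_b = ("LNKKLLNLKQAGEHLKPE",
          "FEEEFLETREQYEKLQKD")

for name, (s1, s2) in (("first pair", pair_a), ("second pair", pair_b)):
    for ktup in (1, 2):
        # Diagonal 0 corresponds to the ungapped alignments shown in the text.
        on_diag0 = ktup_hits_by_diagonal(s1, s2, ktup).get(0, 0)
        print(f"{name}, ktup={ktup}: {on_diag0} word hits on diagonal 0")
```

With ktup=2 the second pair produces no hits on the alignment diagonal, so a scan based only on identical pairs would never examine that region; with ktup=1 its isolated identities are found.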
BLAST also looks for initial similarities, using a word size (ktup) of 3, but BLAST looks for conservative substitutions as well as identities. Thus, BLAST with a word size of 3 is often more sensitive than FASTA with ktup=2.
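
The difference between matching identical words and matching words that allow conservative substitutions can be sketched as follows. The scoring here is a toy scheme invented for illustration (identity +5, a substitution within one of a few hand-picked groups +2, anything else -2); BLAST itself scores words with a substitution matrix and a score threshold, and the names `toy_score`, `neighborhood`, and `CONSERVATIVE_GROUPS` are assumptions of this sketch.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical groups of "conservative" substitutions, for illustration only.
CONSERVATIVE_GROUPS = ["ILMV", "KR", "DE", "NQ", "ST", "FWY"]

def toy_score(a, b):
    """Toy residue score: +5 identity, +2 same group, -2 otherwise."""
    if a == b:
        return 5
    if any(a in group and b in group for group in CONSERVATIVE_GROUPS):
        return 2
    return -2

def neighborhood(word, threshold):
    """Return every word of the same length scoring >= threshold against word."""
    return {
        "".join(candidate)
        for candidate in product(AMINO_ACIDS, repeat=len(word))
        if sum(toy_score(x, y) for x, y in zip(word, candidate)) >= threshold
    }

# Threshold equal to the identity score: only the word itself matches.
print(sorted(neighborhood("KEL", 15)))
# A lower threshold also admits words with one conservative substitution.
print(sorted(neighborhood("KEL", 12)))
```

Under this toy scheme, lowering the threshold enlarges the set of words that can seed a match, which is the sense in which a word size of 3 with substitutions can find regions that a search for two adjacent identities misses.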
For DNA sequences, FASTA uses ktup=6 by default. DNA searches with ktup=3 are even more sensitive, but ktup=1 is less sensitive (at a given statistical significance threshold) than ktup=3 for DNA. ktup=1 is appropriate when searching for oligonucleotides (< 20 nt).