Markov Chain Algorithm (cont)
Text analysis phase: (build model of text)
- consider the input as a sequence of overlapping phrases
  e.g. the text "this is mine, this is yours" contains the phrases
    "this is mine,",
    "is mine, this",
    "mine, this is",
    "this is yours"
- each phrase consists of a two-word prefix followed by a one-word suffix
- a prefix may occur several times in the text, with a different suffix each time
- for each prefix, build a list of all the suffixes that follow it in the text
Note: you can use n-word phrases (for any n ≥ 1) to construct the model, but n=3 (a two-word prefix plus a one-word suffix, as above) seems to work best.
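The analysis phase can be sketched in a few lines of Python. This is only an illustrative sketch, not the course's reference implementation: the function name build_model, the parameter n, and the choice of a dictionary keyed by word tuples are all assumptions made for the example. Words are split on whitespace, so punctuation stays attached to the word it follows, as in "mine," above.

```python
from collections import defaultdict

def build_model(text, n=3):
    """Map each (n-1)-word prefix to the list of suffixes that follow it.

    Hypothetical helper for illustration; n is the phrase length in words.
    """
    words = text.split()              # whitespace split keeps "mine," intact
    model = defaultdict(list)
    # Slide an n-word window over the text, one word at a time.
    for i in range(len(words) - n + 1):
        prefix = tuple(words[i:i + n - 1])   # first n-1 words of the phrase
        suffix = words[i + n - 1]            # last word of the phrase
        model[prefix].append(suffix)
    return dict(model)

model = build_model("this is mine, this is yours")
for prefix, suffixes in model.items():
    print(" ".join(prefix), "->", suffixes)
```

For the example text, the prefix "this is" ends up with two suffixes ("mine," and "yours"), while "is mine," and "mine, this" each have one. Suffixes are kept in a list rather than a set so that a suffix seen more often after a prefix appears more often in that prefix's list.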