|
1
|
- Parameters and
- Probabilities
|
|
2
|
- Forward decoding:
- What is the most probable path underlying the observed sequence?
- What is the full probability of all paths (and hence the probability of
any path) underlying the observed sequence?
- Posterior Decoding
- Given an observed sequence, what
is the a posteriori probability that the observed sequence came from a
particular state?
|
|
3
|
- When the training data are labeled
- Maximum Likelihood Estimation
- When the training paths are unknown
- EM techniques: Baum-Welch algorithm
|
|
4
|
|
|
5
|
|
|
6
|
|
|
7
|
|
|
8
|
|
|
9
|
|
|
10
|
|
|
11
|
|
|
12
|
|
|
13
|
|
|
14
|
|
|
15
|
- Define a function relating all joint probabilities up to the observation
of interest,xi
|
|
16
|
- Consider the Forward Algorithm variable (function) f , vis à vis the Viterbi variable vk(i)
- v is an ad hoc argmax times emission
- f is an ad hoc joint probability
times emission times transition
|
|
17
|
|
|
18
|
|
|
19
|
- Both the Forward Algorithm and the Viterbi Algorithm can be implemented
with DYNAMIC PROGRAMMING
- Otherwise they are hard problems (exponential)
- Use logs to prevent underflow
|
|
20
|
- Did observation xi come from state k given the observed sequence x.
- Restated, what is P(pi=k|x)
- Need the Backward Algorithm
|
|
21
|
- Recurse backward from the end of the sequence
|
|
22
|
|
|
23
|
|
|
24
|
- Expectation Step
- Estimate transition and emission probabilities using frequency counts
(the states will look alike)
- Maximization step
- Run the model with these estimates, using Posterior decoding
- Re-estimate the parameters using maximum likelihood estimator
- Iterate
|