Some fixes in LM part regarding ngram history length + MLE ngram #13

Open
wants to merge 3 commits into master
Conversation

@uralik commented Dec 30, 2017

Given the definition of an n-gram, the surrounding text is correct, but the formulas condition on histories of length n rather than n-1, which is probably a typo. I have also added a short explanation of why the relative-frequency n-gram estimator is optimal from the MLE perspective.
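For reference, the standard MLE argument runs as follows (the counts c(h, w) and the multiplier lambda below are notation introduced here, not necessarily the notation used in the added text): maximizing the training log-likelihood under the normalization constraint yields the relative-frequency estimate.

\begin{align*}
&\max_{p(\cdot \mid h)} \sum_{w} c(h, w) \log p(w \mid h)
\quad \text{subject to} \quad \sum_{w} p(w \mid h) = 1, \\
&\text{stationarity of the Lagrangian:} \quad
\frac{c(h, w)}{p(w \mid h)} = \lambda
\;\Rightarrow\; p(w \mid h) = \frac{c(h, w)}{\lambda}, \\
&\sum_{w} p(w \mid h) = 1
\;\Rightarrow\; \lambda = \sum_{w} c(h, w) = c(h)
\;\Rightarrow\; \hat{p}(w \mid h) = \frac{c(h, w)}{c(h)},
\end{align*}

where h denotes the (n-1)-token history.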

lecture_note.tex Outdated
@@ -3568,11 +3568,12 @@ \section{$n$-Gram Language Model}
conditional probability (Eq.~\eqref{eq:unidir_sentence}~(a)) is only conditioned
on the $n-1$ preceding symbols only, meaning
\begin{align*}
-p(w_k | w_{<k}) \approx p(w_k | w_{k-n}, w_{k-n+1}, \ldots, w_{k-1}).
+p(w_k | w_{<k}) \approx p(w_k | w_{k-n+1}, \ldots, w_{k-1}).
+% p(w_k | w_{<k}) \approx p(w_k | w_{k-n}, w_{k-n+1}, \ldots, w_{k-1}). % history length should be n-1
Contributor

please remove this commented line

Author

done
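As a sanity check on the corrected history length: with n = 3 (a trigram model), each word should be predicted from the two preceding words,

\begin{align*}
p(w_k | w_{<k}) \approx p(w_k | w_{k-2}, w_{k-1}),
\end{align*}

i.e. a history of n - 1 = 2 tokens, whereas the old formula listed w_{k-3}, w_{k-2}, w_{k-1}, a history of n = 3 tokens.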

lecture_note.tex Outdated
\end{align*}
This results in
\begin{align*}
-p(S) \approx \prod_{t=1}^T p(w_t | w_{t-n}, \ldots, w_{t-1}).
+p(S) \approx \prod_{t=1}^T p(w_t | w_{t-n+1}, \ldots, w_{t-1}). % history should have n-1 length
Contributor

same here

Author

done
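A minimal runnable sketch of the corrected factorization together with the relative-frequency (MLE) estimate, assuming (n-1)-token histories padded with <s> / </s> markers; the function and variable names here are illustrative and do not come from the lecture note:

from collections import defaultdict
import math

def train_ngram_counts(sentences, n):
    """Count n-grams with (n-1)-token histories, padding with <s> and </s>."""
    history_counts = defaultdict(int)   # c(h)
    ngram_counts = defaultdict(int)     # c(h, w)
    for sent in sentences:
        tokens = ["<s>"] * (n - 1) + sent + ["</s>"]
        for i in range(n - 1, len(tokens)):
            history = tuple(tokens[i - n + 1:i])   # exactly n-1 tokens
            ngram_counts[(history, tokens[i])] += 1
            history_counts[history] += 1
    return history_counts, ngram_counts

def sentence_log_prob(sent, n, history_counts, ngram_counts):
    """log p(S) = sum_t log p(w_t | w_{t-n+1}, ..., w_{t-1}) under the MLE estimate."""
    tokens = ["<s>"] * (n - 1) + sent + ["</s>"]
    logp = 0.0
    for i in range(n - 1, len(tokens)):
        history = tuple(tokens[i - n + 1:i])
        c_hw = ngram_counts[(history, tokens[i])]
        c_h = history_counts[history]
        if c_hw == 0:   # unseen n-gram: zero probability (motivates smoothing)
            return float("-inf")
        logp += math.log(c_hw / c_h)
    return logp

corpus = [["the", "cat", "sat"], ["the", "cat", "ran"]]
h_counts, ng_counts = train_ngram_counts(corpus, n=2)
print(sentence_log_prob(["the", "cat", "sat"], 2, h_counts, ng_counts))

With this two-sentence toy corpus, the seen sentence gets log-probability log(1/2), while any sentence containing an unseen bigram is driven to zero probability, which is exactly the issue the next hunk's subsection addresses.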

lecture_note.tex Outdated
\subsection{Smoothing and Back-Off}

{\em Note that I am missing many references this section, as I am writing this
on my travel. I will fill in missing references once I'm back from my
travel.}

The biggest issue of having an $n$-gram that never occurs in the training corpus
-is that any sentence containing the $n$-gram will be given a zero probability
+is that any sentence containing such $n$-gram will be given a zero probability
Contributor

such an $n$-gram

Author

done
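Continuing the sketch above (it reuses math, corpus, h_counts and ng_counts from the previous block), and since the changed sentence is about unseen n-grams forcing a zero probability, one generic remedy is add-one (Laplace) smoothing; this is only an illustration of the zero-probability fix, not the particular smoothing or back-off scheme the subsection describes:

def smoothed_log_prob(sent, n, history_counts, ngram_counts, vocab_size):
    """Add-one smoothing: p(w | h) = (c(h, w) + 1) / (c(h) + |V|), never zero."""
    tokens = ["<s>"] * (n - 1) + sent + ["</s>"]
    logp = 0.0
    for i in range(n - 1, len(tokens)):
        history = tuple(tokens[i - n + 1:i])
        c_hw = ngram_counts.get((history, tokens[i]), 0)
        c_h = history_counts.get(history, 0)
        logp += math.log((c_hw + 1) / (c_h + vocab_size))
    return logp

# The bigram ("the", "ran") never occurs in the toy corpus, so the unsmoothed
# estimator assigns this sentence zero probability; the smoothed one does not.
vocab = {w for s in corpus for w in s} | {"</s>"}
print(smoothed_log_prob(["the", "ran"], 2, h_counts, ng_counts, len(vocab)))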
