Chapter 3: the max function in the HMM viterbi #25

Open
Lukasjame opened this issue Apr 25, 2019 · 3 comments

Comments

@Lukasjame

When the input to be segmented is a whole paragraph of text, the max function inside the viterbi function raises an error:

Traceback (most recent call last):
  File "D:\code\python_test\test\start.py", line 11, in <module>
    print(str(list(res)))
  File "D:\code\python_test\test\hmm.py", line 150, in cut
    prob, pos_list = self.viterbi(text, self.state_list, self.Pi_dic, self.A_dic, self.B_dic)
  File "D:\code\python_test\test\hmm.py", line 134, in viterbi
    for y0 in states if V[t - 1][y0] > 0])
ValueError: max() arg is an empty sequence
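
The error can be reproduced in isolation. This is only a minimal sketch: `prev` is a hypothetical stand-in for `V[t - 1]`, and it assumes every state's probability in that column has already reached 0, which is one way the filtered list can end up empty. `best_previous` is an illustrative helper, not a function from hmm.py.

```python
# Hypothetical previous Viterbi column: every state has probability 0.
prev = {"B": 0.0, "M": 0.0, "E": 0.0, "S": 0.0}
states = ["B", "M", "E", "S"]


def best_previous(prev, states):
    """Mimic the filtered max() call shown in the traceback."""
    # The "> 0" filter drops every state, so max() receives an empty list.
    return max([(prev[y0], y0) for y0 in states if prev[y0] > 0])


try:
    best_previous(prev, states)
    error_message = None
except ValueError as exc:
    # The exact wording of the message depends on the Python version.
    error_message = str(exc)

print(error_message)
```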

@choupiqi

choupiqi commented May 7, 2019

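Are you using a corpus you built yourself? If so, the corpus itself may be the problem.
You could also try this: in
for k, v in enumerate(line_state):
    Count_dic[v] += 1
    if k == 0:
add the following under that if as well: self.B_dic[line_state[k]][word_list[k]] = self.B_dic[line_state[k]].get(word_list[k], 0) + 1.0
When I ran into this problem I was using my own corpus. It turned out the corpus was bad, and switching to a different one fixed it.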

@qxxiao

qxxiao commented Jul 20, 2019

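This is best used on short sentences. With long text, the values in the V matrix get very small after enough iterations and eventually underflow to 0, so for y0 in states if V[t - 1][y0] > 0 yields nothing and max receives an empty list. Adding log probabilities instead of multiplying probabilities avoids the underflow.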
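
The log-probability idea can be sketched as follows. This is not the code from hmm.py: `viterbi_log`, `safe_log`, and the two-state toy model (`X`, `Y`, symbols `a` and `b`) are made up for illustration, and a zero probability is assumed to map to negative infinity so that no > 0 filter is needed.

```python
import math

NEG_INF = float("-inf")


def safe_log(p):
    """Return log(p), mapping a zero probability to -inf."""
    return math.log(p) if p > 0 else NEG_INF


def viterbi_log(obs, states, start_p, trans_p, emit_p):
    """Viterbi decoding that sums log probabilities instead of multiplying."""
    V = [{y: safe_log(start_p.get(y, 0)) + safe_log(emit_p[y].get(obs[0], 0))
          for y in states}]
    path = {y: [y] for y in states}
    for t in range(1, len(obs)):
        V.append({})
        new_path = {}
        for y in states:
            emit = safe_log(emit_p[y].get(obs[t], 0))
            # Every state is a candidate; -inf marks an impossible path,
            # so max() never sees an empty sequence.
            score, prev = max(
                (V[t - 1][y0] + safe_log(trans_p[y0].get(y, 0)) + emit, y0)
                for y0 in states
            )
            V[t][y] = score
            new_path[y] = path[prev] + [y]
        path = new_path
    score, last = max((V[-1][y], y) for y in states)
    return score, path[last]


# Hypothetical toy model with deliberately small emission probabilities.
states = ["X", "Y"]
start_p = {"X": 0.6, "Y": 0.4}
trans_p = {"X": {"X": 0.7, "Y": 0.3}, "Y": {"X": 0.4, "Y": 0.6}}
emit_p = {"X": {"a": 0.001, "b": 0.0001}, "Y": {"a": 0.0001, "b": 0.001}}
obs = "ab" * 200  # 400 observations

# Multiplying 400 small factors underflows to exactly 0.0 in floating point.
naive = 1.0
for _ in range(len(obs)):
    naive *= 1e-3

score, best_path = viterbi_log(obs, states, start_p, trans_p, emit_p)
print(naive, math.isfinite(score), len(best_path))
```

The returned score is a log probability, so it is negative and only meaningful for comparing paths. Apply math.exp to it only when the sequence is short enough not to underflow again.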

@FlyingCat-fa

On line 128, for y0 in states if V[t - 1][y0] > 0]) can reference a state, such as "B", that did not exist at the previous time step. It should be changed to
for y0 in V[t - 1].keys()])

chenw23 pushed a commit to chenw23/learning-nlp that referenced this issue Dec 10, 2020