Hi, I am new to word2vec. I am preparing a corpus of sentences from a Wikipedia
dump. However, the dump is pre-split into paragraphs, which apparently need
further processing into sentences.

My question is: is it possible to train directly on paragraphs instead of
sentences, or must word2vec (the skip-gram model) work with sentences?

Since the algorithm trains on the data through a context window, I don't see
much difference from the extra window positions that span sentence boundaries
within the same paragraph.
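To make the reasoning concrete, here is a minimal sketch (plain Python, not the word2vec code itself) of how skip-gram extracts (center, context) pairs from one input line. The only change when feeding a whole paragraph instead of its individual sentences is a handful of extra pairs that straddle sentence boundaries; every within-sentence pair is unchanged:

```python
def skipgram_pairs(tokens, window=2):
    """Return the (center, context) pairs a skip-gram model
    would see for a single input line of tokens."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

# Two toy sentences, fed separately vs. joined as one "paragraph".
sent_a = ["the", "cat", "sat"]
sent_b = ["it", "purred"]
separate = skipgram_pairs(sent_a) + skipgram_pairs(sent_b)
joined = skipgram_pairs(sent_a + sent_b)

# The joined paragraph keeps every within-sentence pair and adds
# only the pairs that cross the sentence boundary.
extra = [p for p in joined if p not in separate]
print(extra)
```

With window=2 the joined version contains all eight of the separate pairs plus six boundary-crossing pairs such as `('cat', 'it')` and `('sat', 'purred')`, so the effect of training on paragraphs is limited to this extra cross-sentence context.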
Original issue reported on code.google.com by [email protected] on 24 Feb 2015 at 9:35