update char_cnn and fasttext #66

Open · gmichalo wants to merge 7 commits into master
Conversation

gmichalo commented Jul 4, 2020

While examining the great models you have created, I noticed that the code in char_cnn (Character-level Convolutional Networks for Text Classification) is not exactly the same as the model the authors describe in their paper.

In particular, in this pull request I added:

a) For the char_cnn model:
1) I changed the last (sixth) convolutional layer and the first linear layer of char_cnn to the structure that the authors used.
2) I added the equation (lines 33-34) for calculating the input size of the first linear layer from the maximum number of characters of a sentence (this number could be added to the config file, but since I did not want to change more files, I added it as a constant in model.py).

b) For the FastText model:

I changed F.avg_pool2d(x, (x.shape[1], 1)).squeeze(1) to F.avg_pool1d(x, x.shape[1]).squeeze(1). The two are equivalent, but I believe using F.avg_pool1d makes the code more understandable, since the code right now uses F.avg_pool2d to basically perform a 1-D average pooling (a small sanity check is sketched below).
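As a quick sanity check of the pooling equivalence, here is a minimal sketch. The shapes are illustrative assumptions (embedded batch laid out as (batch, seq_len, embed_dim)); the transpose is only there to move the pooled axis last, which is what avg_pool1d expects.

```python
import torch
import torch.nn.functional as F

# Hypothetical example: 4 documents, 10 tokens each, 300-dim embeddings.
x = torch.randn(4, 10, 300)

# 2-D formulation: treat (batch, seq_len, embed_dim) as a 2-D map and
# average over the sequence axis with a (seq_len, 1) window.
pooled_2d = F.avg_pool2d(x, (x.shape[1], 1)).squeeze(1)             # (4, 300)

# 1-D formulation: avg_pool1d averages over the last axis, so the sequence
# axis is moved last before pooling.
pooled_1d = F.avg_pool1d(x.transpose(1, 2), x.shape[1]).squeeze(2)  # (4, 300)

print(torch.allclose(pooled_2d, pooled_1d))  # True -- both are the per-document mean embedding
```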

daemon commented Aug 31, 2020

Thanks for the PR. Have you run any benchmarks?

gmichalo commented Sep 1, 2020

For the FastText model, I tested on the Reuters and AAPD datasets, and the new version gets exactly the same results as the previous version, as expected.

For the char_cnn model, the main difference between the new version and the old one is the input dimension of the first linear layer, which was fixed to 256 in the old version but is now calculated as described in the paper; for Reuters it comes out to 8448 (lines 33-36 of the code, and the last paragraph before Section 2.4 in the paper). Because of this change, when I ran it on Reuters I got an F1 of 0 on dev, due to the limited data. However, if we follow the paper, I believe this is the correct way to build char_cnn.
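For reference, a minimal sketch of how that input size falls out of the paper's small architecture (six convolutional layers with 256 filters; kernels of 7, 7, 3, 3, 3, 3; max-pooling of size 3 after the first, second, and sixth layers). The 1000-character cap below is purely an illustrative assumption; with it, the output frame length works out to 33, so the first linear layer gets 256 × 33 = 8448 inputs, consistent with the paper's l6 = (l0 − 96) / 27 rule.

```python
# Sketch only: the 1000-character cap is an illustrative assumption.
num_filters = 256   # frame size of the "small" model in the paper
max_chars = 1000    # hypothetical maximum number of characters per document

length = max_chars
# (conv kernel, pooling size) per layer; pooling size 1 means no pooling.
for kernel, pool in [(7, 3), (7, 3), (3, 1), (3, 1), (3, 1), (3, 3)]:
    length = length - kernel + 1              # valid convolution, stride 1
    if pool > 1:
        length = (length - pool) // pool + 1  # max-pooling with stride == pool size

fc1_input_size = num_filters * length
print(length, fc1_input_size)  # 33 8448
```

With a different character cap the same loop gives the corresponding fc1 size, which is why the value has to be recomputed per dataset rather than hard-coded to 256.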

daemon commented Sep 4, 2020

Okay, it's probably a good idea to note in some readme that the correct implementation of char_cnn produces an F1 of 0 on Reuters, so people don't waste their time running it.

daemon previously approved these changes Sep 15, 2020