ELMo From Scratch

This repository implements ELMo in two variants:

- ELMo with Character CNNs
- ELMo with Pretrained Embeddings (GloVe)

The file WordEmb.py contains the code for the variant with pretrained word embeddings (GloVe).

The file WordEmb.py goes through the following steps:

Preprocessing the data
- Loading the data
- Preprocessing the data
- Creating the vocabulary
- Creating the Dataset
Pretraining the Embeddings
- Creating the Model
- Training the Model
- Evaluating the Model
- Saving the Model
Downstream Task
- Loading the Pretrained Model
- Creating the Downstream Model
- Training the Downstream Model
- Evaluating the Downstream Model
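
To illustrate the "Creating the vocabulary" step, here is a minimal sketch. It is not the code from WordEmb.py: the names `build_vocab` and `encode`, the `<pad>` and `<unk>` tokens, and the `min_freq` threshold are assumptions made for this example.

```python
from collections import Counter

PAD, UNK = "<pad>", "<unk>"  # hypothetical special tokens, not taken from the repository


def build_vocab(tokenized_sentences, min_freq=1):
    """Map each token seen at least min_freq times to an integer id."""
    counts = Counter(tok for sent in tokenized_sentences for tok in sent)
    vocab = {PAD: 0, UNK: 1}
    # Sort by descending frequency, then alphabetically, so ids are deterministic.
    for tok, freq in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if freq >= min_freq:
            vocab[tok] = len(vocab)
    return vocab


def encode(sentence, vocab):
    """Convert tokens to ids, falling back to the unknown-token id."""
    return [vocab.get(tok, vocab[UNK]) for tok in sentence]


sentences = [["the", "cat", "sat"], ["the", "dog", "sat", "down"]]
vocab = build_vocab(sentences)
ids = encode(["the", "bird", "sat"], vocab)  # "bird" is out of vocabulary
```

The resulting ids could then index an embedding matrix initialised from the pretrained GloVe vectors; how the repository does this is not shown here.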

The file CharCNN.py contains the code for the variant with Character CNNs.

The file CharCNN.py goes through the following steps:

Preprocessing the data
- Loading the data
- Preprocessing the data
- Creating the char vocabulary
- Creating the Dataset
Pretraining the Embeddings
- Creating the CharCNN Model
- Creating the ELMo Model
- Training the Model
- Evaluating the Model
- Saving the Model
Downstream Task
- Loading the Pretrained Model
- Creating the Downstream Model
- Training the Downstream Model
- Evaluating the Downstream Model
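
To illustrate the "Creating the char vocabulary" step, here is a minimal sketch of how words might be turned into fixed-length sequences of character ids before being fed to a character CNN. It is not the code from CharCNN.py: `build_char_vocab`, `encode_word`, the reserved ids, and `max_len` are assumptions made for this example.

```python
PAD_ID, UNK_ID = 0, 1  # hypothetical reserved ids for padding and unknown characters


def build_char_vocab(words):
    """Assign an id (starting at 2) to every distinct character in the corpus."""
    chars = sorted({ch for word in words for ch in word})
    return {ch: i + 2 for i, ch in enumerate(chars)}


def encode_word(word, char_vocab, max_len=8):
    """Truncate or right-pad a word to exactly max_len character ids."""
    ids = [char_vocab.get(ch, UNK_ID) for ch in word[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


char_vocab = build_char_vocab(["cat", "act"])
encoded = encode_word("cab", char_vocab, max_len=5)  # "b" is an unseen character
```

Fixing the length per word keeps every word tensor the same shape, which makes batching for a convolution over characters straightforward.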

Both files define a parameter DIR, the path to the directory where the best models are saved.

The files train.csv and test.csv must be present in a directory named data, located in the same directory as the code.
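
The expected layout can be checked before training starts. The sketch below is only an illustration: the helper `data_paths` is hypothetical and does not exist in the repository.

```python
from pathlib import Path


def data_paths(code_dir="."):
    """Return the paths of data/train.csv and data/test.csv under code_dir.

    Raises FileNotFoundError if either file is missing.
    """
    data_dir = Path(code_dir) / "data"
    paths = {name: data_dir / f"{name}.csv" for name in ("train", "test")}
    missing = [str(p) for p in paths.values() if not p.is_file()]
    if missing:
        raise FileNotFoundError("Missing data files: " + ", ".join(missing))
    return paths
```

With the files in place, `data_paths()["train"]` could be passed to `pandas.read_csv` to load the training split.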

Main Libraries used in the code:

PyTorch       -> create the models
Pandas        -> load the data
Gensim        -> load the pretrained embeddings
NLTK          -> tokenize the data
Scikit-Learn  -> evaluate the models
WandB         -> log the metrics
NumPy         -> perform numerical operations
tqdm          -> show progress bars
torchinfo     -> print the model summary

