
IMoJIE-Faster-Copy-Attention

Overview

This is a simple repository holding the scripts that were used for a German 'Belegarbeit' (term paper) about the IMoJIE model.

If you want to replicate the speed-up of IMoJIE's extraction time from 247.4 to 85.8 seconds without any degradation in performance or any retraining, simply copy the code of the optimized gather_final_log_probs() function from this repo and use it alongside the original IMoJIE code.
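For context, the bottleneck sits in how copy attention scores are merged into the final token distribution. Roughly speaking, CopyNet-style implementations combine generation and copy log-probabilities position by position in Python, while the same aggregation can be expressed as a single scatter-add over the vocabulary dimension. The following is only a minimal sketch of that idea, with assumed tensor names and shapes (and ignoring CopyNet's out-of-vocabulary copy handling); it is not the exact code from this repo:

```python
import torch

def fast_gather_final_log_probs(generation_log_probs: torch.Tensor,
                                copy_log_probs: torch.Tensor,
                                source_token_ids: torch.Tensor) -> torch.Tensor:
    """Sketch of a vectorized merge of generation and copy scores.

    generation_log_probs: (batch, vocab_size)  log P(token | generate)
    copy_log_probs:       (batch, source_len)  log P(position | copy)
    source_token_ids:     (batch, source_len)  vocab id of each source token
    """
    # Work in probability space so copy scores of duplicate source tokens
    # can be summed with a single scatter_add_ instead of a Python loop.
    final_probs = generation_log_probs.exp()
    final_probs.scatter_add_(1, source_token_ids, copy_log_probs.exp())
    # Back to log space; the clamp avoids log(0) for impossible tokens.
    return final_probs.clamp_min(1e-45).log()
```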

Content

Since this work was written for teachers, it is long but kept easy to understand for the generally adept reader. It also represents the writer's first paper in the field of AI and NLP. The Belegarbeit started with the goal of replacing the BERT encoder of IMoJIE with more effective variants to potentially reach new state-of-the-art performance. The first experiments, however, focused on the simpler task of replacing only the LSTM decoder with a GRU decoder for possible performance and speed improvements. Replacing the decoder brought no significant gains, but in the process the huge performance bottleneck in the gather_final_log_probs() function was found and successfully optimized.

The second goal of replacing the BERT encoder with the more recent variants RoBERTa, ELECTRA, DeBERTa, and DeBERTaV3 was pursued in this repo. However, the deprecated AllenNLP version and the BERT-oriented IMoJIE code made these experiments very difficult, which is why they were finally conducted on the more recent DetIE model.

Plots and Measurements

Decoding requires over 95% of IMoJIE's extraction time:

[Figure: encoder vs. decoder extraction times of IMoJIE]
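For anyone who wants to reproduce this split, timing the encoder and decoder separately is straightforward. The sketch below uses a hypothetical timed() helper, and the model.encode/model.decode calls are placeholders for however the two stages are actually invoked; the CUDA synchronization matters because GPU kernels run asynchronously and unsynchronized timings undercount GPU work:

```python
import time
import torch

def timed(fn, *args, **kwargs):
    """Return (result, seconds) for fn(*args, **kwargs), including GPU work."""
    if torch.cuda.is_available():
        torch.cuda.synchronize()  # wait for pending kernels before starting
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    if torch.cuda.is_available():
        torch.cuda.synchronize()  # ensure all launched kernels have finished
    return result, time.perf_counter() - start

# Hypothetical usage around the two stages:
# encoded, t_enc = timed(model.encode, batch)
# _,       t_dec = timed(model.decode, encoded)
# print(f"encoder: {t_enc:.2f}s  decoder: {t_dec:.2f}s")
```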

Model performance won't decrease when using the optimized function:

[Figure: extraction speeds and performance of the optimized IMoJIE variants]
Note that 'GRU' and 'LSTM' denote IMoJIE models with the respective decoder; their performance is nearly identical. The numbers denote the training batch sizes of IMoJIE.
'*' means that copy log-probabilities of identical tokens are no longer combined.
'†' means that tokens are only copied from the source sentence, i.e. the combining/summing of token copy log-probabilities is restricted to the source sentence.
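To illustrate the '†' setting: restricting copying to the source sentence amounts to masking the copy scores of all other positions before they are aggregated. A minimal sketch, assuming a boolean source_mask that is True exactly for source-sentence positions (the name is an assumption, not IMoJIE's actual interface):

```python
import torch

def restrict_copy_to_source(copy_log_probs: torch.Tensor,
                            source_mask: torch.Tensor) -> torch.Tensor:
    """Sketch of the '†' variant: set copy log-probabilities of positions
    outside the original source sentence to -inf, so the summing of
    duplicate-token copy probabilities only sees source positions.

    copy_log_probs: (batch, source_len)
    source_mask:    (batch, source_len), True for source-sentence tokens
    """
    return copy_log_probs.masked_fill(~source_mask, float("-inf"))
```

Since exp(-inf) is 0, this composes directly with a probability-space scatter-add like the one sketched above.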

About

A research / school project that improves the speed of IMoJIE by fixing an implementation bottleneck.
