This is the code for the paper Unsupervised Topic Segmentation of Meetings with BERT Embeddings.
Link to the paper and original repository: .
The code doesn't require training and uses a pretrained model. See the paper's appendix for more information.
[The original code uses the AMI and ICSI datasets. If you are running segmentation on those datasets, the original code is likely more suitable.]
- Unpacked the main functions into a notebook. See Unsupervised_text_seg.ipynb.
- The code can now be run on a custom corpus. See the example corpus in the same notebook.
- Changed how RoBERTa is imported so that it is compatible with the latest Hugging Face library.
- Made minor modifications to other base files to make them compatible with my corpus.
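
For readers new to the approach, here is a rough sketch of the general idea behind training-free, embedding-based segmentation: embed each sentence, compare the averaged embeddings on either side of every gap, and place a boundary where the similarity dips. This is a simplified illustration and not the code from this repository or the paper. The function names (`cosine`, `gap_similarities`, `find_boundaries`), the `window` and `threshold` parameters, and the toy 2-d vectors are all assumptions made for the example. In practice the vectors would come from a pretrained model such as RoBERTa.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_vector(vectors: Sequence[Vector]) -> List[float]:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def gap_similarities(embeddings: Sequence[Vector], window: int = 2) -> List[float]:
    """Similarity between the windows before and after each gap.

    Entry k describes the gap between sentence k and sentence k + 1.
    """
    sims = []
    for i in range(1, len(embeddings)):
        left = mean_vector(embeddings[max(0, i - window):i])
        right = mean_vector(embeddings[i:i + window])
        sims.append(cosine(left, right))
    return sims


def find_boundaries(
    embeddings: Sequence[Vector], window: int = 2, threshold: float = 0.5
) -> List[int]:
    """Return indices i where a new segment is assumed to start at sentence i.

    A gap counts as a boundary when its similarity is below `threshold`
    and is not higher than the similarity of its neighbouring gaps.
    """
    sims = gap_similarities(embeddings, window)
    boundaries = []
    for k, sim in enumerate(sims):
        prev_sim = sims[k - 1] if k > 0 else float("inf")
        next_sim = sims[k + 1] if k + 1 < len(sims) else float("inf")
        if sim < threshold and sim <= prev_sim and sim <= next_sim:
            boundaries.append(k + 1)
    return boundaries


# Toy "embeddings": three sentences on one topic, then three on another.
toy = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3
print(find_boundaries(toy, window=2, threshold=0.5))  # prints [3]
```

With a real corpus, the `window` and `threshold` values would need tuning, since similarity scores from pretrained embeddings tend to sit in a much narrower range than in this toy example.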