iesl/box-embeddings

A fully open-source Python library for geometric representation learning, compatible with both PyTorch and TensorFlow, that makes it easy to replace existing neural network layers with boxes or transform them into boxes.


🌟 Features

  • Modular and reusable library that helps researchers study probabilistic box embeddings.
  • Extensive documentation and example code that demonstrate how to use the library and make it easy to adapt to existing codebases.
  • Rigorously unit-tested codebase with high coverage, providing an additional layer of reliability.
  • Customizable pipelines.
  • Actively maintained by IESL at UMass.

💻 Installation

Installing via pip

The preferred way to install Box Embeddings for regular usage, testing, or integration into an existing workflow is via pip:

pip install box-embeddings

Installing from source

You can also install Box Embeddings from source. First, clone our git repository:

git clone https://github.com/iesl/box-embeddings

Then create a Python 3.7 or 3.8 virtual environment under the project directory, activate it, and install the Box Embeddings package in editable mode along with its core requirements:

virtualenv box_venv
source box_venv/bin/activate
pip install --editable .
pip install -r core_requirements.txt

👟 Quick start

After installing Box Embeddings, a box can be initialized from a tensor as follows:

import torch
from box_embeddings.parameterizations.box_tensor import BoxTensor
data_x = torch.tensor([[1,2],[-1,5]])
box_x = BoxTensor(data_x)
box_x

The result, box_x, is now a BoxTensor object (for more examples, visit the examples section). Its representation is:

BoxTensor(tensor([[ 1,  2],
        [-1,  5]]))
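
For readers new to the idea, the geometry that a box represents can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's API: the `HardBox` class, its `z`/`Z` corner fields, and the `conditional_probability` helper are hypothetical names introduced here. The sketch treats a box as a pair of minimum and maximum corners, intersects two boxes coordinate by coordinate, and reads a conditional probability as a ratio of volumes.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HardBox:
    """Hypothetical helper (not part of box_embeddings): an axis-aligned box."""

    z: Tuple[float, ...]  # minimum corner, one coordinate per dimension
    Z: Tuple[float, ...]  # maximum corner, one coordinate per dimension

    def volume(self) -> float:
        # Clamp each side length at zero so an empty box has zero volume.
        vol = 1.0
        for lo, hi in zip(self.z, self.Z):
            vol *= max(hi - lo, 0.0)
        return vol

    def intersection(self, other: "HardBox") -> "HardBox":
        # The intersection of two boxes is again a box: take the larger
        # of the minimum corners and the smaller of the maximum corners.
        z = tuple(max(a, b) for a, b in zip(self.z, other.z))
        Z = tuple(min(a, b) for a, b in zip(self.Z, other.Z))
        return HardBox(z, Z)


def conditional_probability(a: HardBox, b: HardBox) -> float:
    """P(a | b), read as the fraction of b's volume that a covers."""
    vol_b = b.volume()
    if vol_b == 0.0:
        raise ValueError("b has zero volume, so P(a | b) is undefined")
    return a.intersection(b).volume() / vol_b


# Toy two-dimensional boxes: `dog` lies entirely inside `animal`.
animal = HardBox(z=(0.0, 0.0), Z=(4.0, 4.0))
dog = HardBox(z=(1.0, 1.0), Z=(3.0, 2.0))

print(dog.volume())                          # 2.0
print(conditional_probability(animal, dog))  # 1.0
print(conditional_probability(dog, animal))  # 0.125
```

Because `dog` is fully contained in `animal`, the first conditional probability is 1 while the reverse is small. This asymmetry is what makes boxes a natural fit for hierarchies. The clamped side length used here has zero gradient when boxes are disjoint, which is the kind of issue the smoothed variants cited in the Reference section aim to address.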

📖 Command Overview

| Command | Description |
| --- | --- |
| box_embeddings | An open-source library for NLP or graph learning |
| box_embeddings.common | Utility modules that are used across the library |
| box_embeddings.initializations | Initialization modules |
| box_embeddings.modules | A collection of modules to operate on boxes |
| box_embeddings.parameterizations | A collection of modules to parameterize boxes |

📍 Navigating the codebase

| Task | Where to go |
| --- | --- |
| Contribution manual | Link |
| Source code | Link |
| Usage documentation | Link |
| Training examples | Link |
| Unit tests | Link |

📚 Reference

  1. If you use this library in your work, please cite the following arXiv version of the paper:
@article{chheda2021box,
  title={Box Embeddings: An open-source library for representation learning using geometric structures},
  author={Chheda, Tejas and Goyal, Purujit and Tran, Trang and Patel, Dhruvesh and Boratko, Michael
  and Dasgupta, Shib Sankar and McCallum, Andrew},
  journal={arXiv preprint arXiv:2109.04997},
  year={2021}
}
  1. If you use simple hard boxes with a surrogate loss, cite the following paper:
@inproceedings{vilnis2018probabilistic,
  title={Probabilistic Embedding of Knowledge Graphs with Box Lattice Measures},
  author={Vilnis, Luke and Li, Xiang and Murty, Shikhar and McCallum, Andrew},
  booktitle={Proceedings of the 56th Annual Meeting of the Association for
  Computational Linguistics (Volume 1: Long Papers)},
  pages={263--272},
  year={2018}
}
  1. If you use softboxes without any regularization, cite the following paper:
@inproceedings{li2018smoothing,
title={Smoothing the Geometry of Probabilistic Box Embeddings},
author={Xiang Li and Luke Vilnis and Dongxu Zhang and Michael Boratko and Andrew McCallum},
booktitle={International Conference on Learning Representations},
year={2019},
url={https://openreview.net/forum?id=H1xSNiRcF7},
}
  1. If you use softboxes with the regularizations defined in the Regularizations module, cite the following paper:
@inproceedings{patel2020representing,
title={Representing Joint Hierarchies with Box Embeddings},
author={Dhruvesh Patel and Shib Sankar Dasgupta and Michael Boratko and Xiang Li and Luke Vilnis
and Andrew McCallum},
booktitle={Automated Knowledge Base Construction},
year={2020},
url={https://openreview.net/forum?id=J246NSqR_l}
}
  1. If you use Gumbel boxes, cite the following paper:
@article{dasgupta2020improving,
  title={Improving Local Identifiability in Probabilistic Box Embeddings},
  author={Dasgupta, Shib Sankar and Boratko, Michael and Zhang, Dongxu and Vilnis, Luke
  and Li, Xiang Lorraine and McCallum, Andrew},
  journal={arXiv preprint arXiv:2010.04831},
  year={2020}
}

💪 Contributors

We welcome all contributions from the community to make Box Embeddings a better package. If you're a first-time contributor, we recommend that you start by reading our CONTRIBUTING.md guide.

💡 News and Updates

Our library Box Embeddings will be officially introduced at EMNLP 2021!

🤗 Acknowledgments

Box Embeddings is an open-source project developed by the research team at the Information Extraction and Synthesis Laboratory (IESL) in the College of Information and Computer Sciences at UMass Amherst.