update readme and bump version

py-lidbox · Jul 4, 2020 · 9323393
1 parent 3a9ba61
commit 9323393

Showing 2 changed files with 22 additions and 10 deletions.
diff --git a/README.md b/README.md
@@ -8,7 +8,7 @@
 * Average detection cost (`C_avg`) implemented as a `tf.keras.metrics.Metric`.
 * You can also try `lidbox` for speaker recognition, since no assumptions will be made of the signal labels. E.g. use utt2speaker as utt2label and see what happens.
 
-[Here](./examples/common-voice/common-voice-4.ipynb) is an example notebook showing `lidbox` in action.
+[Here](./examples/common-voice/common-voice-4.ipynb) is a full example notebook showing what `lidbox` can do.
 
 ## Why would I want to use this?
 
@@ -27,31 +27,43 @@
 
 ## Installing
 
-Install TensorFlow 2.1 or 2.2 (both have been tested).
-
-Clone the repo and install `lidbox` as a Python package (note the explicit `./`).
-This will install all other required dependencies, but not TensorFlow.
 ```
 git clone --depth 1 https://github.com/matiaslindgren/lidbox.git
 pip install ./lidbox
 ```
-Check that the command line entry point is working
+Check that the command line entry point is working:
 ```
 lidbox -h
 ```
 If not, make sure the `setuptools` entry point scripts (e.g. directory `$HOME/.local/bin`) are on your path.
 
+Then, install TensorFlow 2.1 or 2.2 (both should work), unless it is already installed.
+
 If everything is working, see [this](./examples/common-voice) for a simple example to get started.
 
-### Note
+### Language embeddings
+
+If you want to use language embeddings, install the [PLDA package](https://github.com/RaviSoji/plda) from [here](https://github.com/matiaslindgren/plda/tree/as-setuptools-package):
+```
+pip install plda@https://github.com/matiaslindgren/plda/archive/as-setuptools-package.zip#egg=plda-0.1.0
+```
+
+### Editable install
 
 If you plan on making changes to the code, it is easier to install `lidbox` as a Python package in setuptools develop mode:
 ```
-pip install --editable ./lidbox
+git clone --depth 1 https://github.com/matiaslindgren/lidbox.git
+pip install --editable ./lidbox
 ```
 Then, if you make changes to the code, there's no need to reinstall the package since the changes are reflected immediately.
 Just be careful not to make changes when `lidbox` is running, because TensorFlow will use its `autograph` package to convert some of the Python functions to TF graphs, which might fail if the code changes suddenly.
 
-### X-vector embeddings from a trained model for 4 languages
+## X-vector embeddings
+
+One benefit of deep learning classifiers is that you can first train them on large amounts of data and then use them as feature extractors to produce low-dimensional, fixed-length language vectors from speech.
+See e.g. the [x-vector](http://danielpovey.com/files/2018_odyssey_xvector_lid.pdf) approach by Snyder et al.
+
+Below is a visualization of test set language embeddings for 4 languages in 2-dimensional space.
+Each data point represents 2 seconds of speech in one of the 4 languages.
 
 ![2-dimensional PCA plot of 400 random x-vectors for 4 Common Voice languages](./examples/common-voice/img/embeddings-PCA-2D.png)
diff --git a/setup.py b/setup.py
@@ -5,7 +5,7 @@
 
 setuptools.setup(
     name="lidbox",
-    version="0.5.0",
+    version="0.6.0",
     description="End-to-end spoken language identification (LID) on TensorFlow",
     long_description=readmefile_contents,
     long_description_content_type="text/markdown",