Merge pull request #94 from readbeyond/devel

aeneas v1.5.1
readbeyond · Jul 25, 2016 · d7dbb8c
2 parents faeaff6 + d30ad36
commit d7dbb8c

Showing 154 changed files with 2,064 additions and 1,705 deletions.
diff --git a/MANIFEST.in b/MANIFEST.in
@@ -14,6 +14,7 @@ prune docs/build
 include CHANGELOG
 include LICENSE
 recursive-include licenses *
+include output/.gitignore
 include README.md
 include README.rst
 include requirements.txt

diff --git a/README.md b/README.md
@@ -2,8 +2,8 @@
 
 **aeneas** is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment).
 
-* Version: 1.5.0.3
-* Date: 2016-04-23
+* Version: 1.5.1.0
+* Date: 2016-07-25
 * Developed by: [ReadBeyond](http://www.readbeyond.it/)
 * Lead Developer: [Alberto Pettarin](http://www.albertopettarin.it/)
 * License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -87,6 +87,16 @@ which can be installed on any modern OS (Linux, Mac OS X, Windows).
 
 ### Installation
 
+All-in-one installers are available for Mac OS X and Windows,
+and a Bash script for deb-based Linux distributions (Debian, Ubuntu)
+is provided in this repository.
+It is also possible to download a VirtualBox+Vagrant virtual machine.
+Please see the
+[INSTALL file](https://github.com/readbeyond/aeneas/blob/master/wiki/INSTALL.md)
+for detailed, step-by-step installation procedures for different operating systems.
+
+The generic OS-independent procedure is simple:
+
 1. Install
    [Python](https://python.org/) (2.7.x preferred),
    [FFmpeg](https://www.ffmpeg.org/), and
@@ -102,20 +112,16 @@ which can be installed on any modern OS (Linux, Mac OS X, Windows).
     pip install aeneas
     ```
 
-See the
-[INSTALL file](https://github.com/readbeyond/aeneas/blob/master/wiki/INSTALL.md)
-for detailed, step-by-step procedures for Linux, OS X, and Windows.
-
-
-## Usage
-
-1. To **check** whether you installed **aeneas** correctly, run:
+4. To **check** whether you installed **aeneas** correctly, run:
 
    ```bash
     python -m aeneas.diagnostics
     ```
 
-2. Run without arguments to get the **usage message**:
+
+## Usage
+
+1. Run without arguments to get the **usage message**:
 
     ```bash
     python -m aeneas.tools.execute_task
@@ -131,7 +137,7 @@ for detailed, step-by-step procedures for Linux, OS X, and Windows.
     python -m aeneas.tools.execute_task --examples-all
     ```
 
-3. To **compute a synchronization map** `map.json` for a pair
+2. To **compute a synchronization map** `map.json` for a pair
    (`audio.mp3`, `text.txt` in
    [plain](http://www.readbeyond.it/aeneas/docs/textfile.html#aeneas.textfile.TextFileFormat.PLAIN)
    text format), you can run:
@@ -169,7 +175,7 @@ for detailed, step-by-step procedures for Linux, OS X, and Windows.
    [documentation](http://www.readbeyond.it/aeneas/docs/)
    for details.
 
-4. If you have several tasks to process,
+3. If you have several tasks to process,
    you can create a **job container**
    to batch process them:
 
@@ -222,12 +228,12 @@ which explains how to use the built-in command line tools.
 * Arbitrary text fragment granularity (single word, subphrase, phrase, paragraph, etc.)
 * Input audio file formats: all those readable by `ffmpeg`
 * Output sync map formats: AUD, CSV, EAF, JSON, SMIL, SRT, SSV, SUB, TSV, TTML, TXT, VTT, XML
-* Tested languages: ARA, BUL, CAT, CYM, CES, DAN, DEU, ELL, ENG, EPO, EST, FAS, FIN, FRA, GLE, GRC, HRV, HUN, ISL, ITA, LAT, LAV, LIT, NLD, NOR, RON, RUS, POL, POR, SLK, SPA, SRP, SWA, SWE, TUR, UKR
+* Confirmed working on languages: ARA, BUL, CAT, CYM, CES, DAN, DEU, ELL, ENG, EPO, EST, FAS, FIN, FRA, GLE, GRC, HRV, HUN, ISL, ITA, JPN, LAT, LAV, LIT, NLD, NOR, RON, RUS, POL, POR, SLK, SPA, SRP, SWA, SWE, TUR, UKR
 * MFCC and DTW computed via Python C extensions to reduce the processing time
-* On Linux, eSpeak called via a Python C extension for faster audio synthesis
-* Batch processing of multiple audio/text pairs
 * Several built-in TTS engine wrappers: eSpeak (default, FLOSS), Festival (FLOSS), Nuance TTS API (commercial)
-* Use custom TTS engine wrappers besides the built-in ones
+* Default TTS (eSpeak) called via a Python C extension for fast audio synthesis
+* A custom, user-provided TTS engine Python wrapper can be used instead of the built-in ones (included example for speect)
+* Batch processing of multiple audio/text pairs
 * Download audio from a YouTube video
 * In multilevel mode, recursive alignment from paragraph to sentence to word level
 * Robust against misspelled/mispronounced words, local rearrangements of words, background noise/sporadic spikes
@@ -236,13 +242,14 @@ which explains how to use the built-in command line tools.
 * Output an HTML file for fine tuning the sync map manually (`finetuneas` project)
 * Execution parameters tunable at runtime
 * Code suitable for Web app deployment (e.g., on-demand cloud computing)
+* Extensive test suite including 898 unit/integration/performance tests, that run and must pass before each release
 
 
 ## Limitations and Missing Features 
 
 * Audio should match the text: large portions of spurious text or audio might produce a wrong sync map
-* Audio is assumed to be spoken: not suitable/YMMV for song captioning
-* No protection against memory trashing if you feed extremely long audio files
+* Audio is assumed to be spoken: not suitable for song captioning, YMMV for CC applications
+* No protection against memory trashing if you feed extremely long audio files (>1.5h per single audio file)
 * [Open issues](https://github.com/readbeyond/aeneas/issues)
 
 
@@ -340,6 +347,9 @@ for its asynchronous usage.
 **Chris Hubbard** prepared the files for
 packaging aeneas as a Debian/Ubuntu `.deb`.
 
+**Daniel Bair**, **Chris Hubbard**, and **Richard Margetts**
+packaged the installers for Mac OS X and Windows.
+
 **Firat Ozdemir** contributed the `finetuneas`
 HTML/JS code for fine tuning sync maps in the browser.
 

diff --git a/README.rst b/README.rst
@@ -4,8 +4,8 @@ aeneas
 **aeneas** is a Python/C library and a set of tools to automagically
 synchronize audio and text (aka forced alignment).
 
--  Version: 1.5.0.3
--  Date: 2016-04-23
+-  Version: 1.5.1.0
+-  Date: 2016-07-25
 -  Developed by: `ReadBeyond <http://www.readbeyond.it/>`__
 -  Lead Developer: `Alberto Pettarin <http://www.albertopettarin.it/>`__
 -  License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -100,6 +100,16 @@ modern OS (Linux, Mac OS X, Windows).
 Installation
 ~~~~~~~~~~~~
 
+All-in-one installers are available for Mac OS X and Windows, and a Bash
+script for deb-based Linux distributions (Debian, Ubuntu) is provided in
+this repository. It is also possible to download a VirtualBox+Vagrant
+virtual machine. Please see the `INSTALL
+file <https://github.com/readbeyond/aeneas/blob/master/wiki/INSTALL.md>`__
+for detailed, step-by-step installation procedures for different
+operating systems.
+
+The generic OS-independent procedure is simple:
+
 1. Install `Python <https://python.org/>`__ (2.7.x preferred),
    `FFmpeg <https://www.ffmpeg.org/>`__, and
    `eSpeak <http://espeak.sourceforge.net/>`__
@@ -114,18 +124,14 @@ Installation
        pip install numpy
        pip install aeneas
 
-See the `INSTALL
-file <https://github.com/readbeyond/aeneas/blob/master/wiki/INSTALL.md>`__
-for detailed, step-by-step procedures for Linux, OS X, and Windows.
+4. To **check** whether you installed **aeneas** correctly, run:
+
+``bash     python -m aeneas.diagnostics``
 
 Usage
 -----
 
-1. To **check** whether you installed **aeneas** correctly, run:
-
-``bash     python -m aeneas.diagnostics``
-
-2. Run without arguments to get the **usage message**:
+1. Run without arguments to get the **usage message**:
 
    .. code:: bash
 
@@ -140,7 +146,7 @@ Usage
        python -m aeneas.tools.execute_task --examples
        python -m aeneas.tools.execute_task --examples-all
 
-3. To **compute a synchronization map** ``map.json`` for a pair
+2. To **compute a synchronization map** ``map.json`` for a pair
    (``audio.mp3``, ``text.txt`` in
    `plain <http://www.readbeyond.it/aeneas/docs/textfile.html#aeneas.textfile.TextFileFormat.PLAIN>`__
    text format), you can run:
@@ -178,7 +184,7 @@ specifies the parameters controlling the I/O formats and the processing
 options for the task. Consult the
 `documentation <http://www.readbeyond.it/aeneas/docs/>`__ for details.
 
-4. If you have several tasks to process, you can create a **job
+3. If you have several tasks to process, you can create a **job
    container** to batch process them:
 
    .. code:: bash
@@ -229,17 +235,19 @@ Supported Features
 -  Input audio file formats: all those readable by ``ffmpeg``
 -  Output sync map formats: AUD, CSV, EAF, JSON, SMIL, SRT, SSV, SUB,
    TSV, TTML, TXT, VTT, XML
--  Tested languages: ARA, BUL, CAT, CYM, CES, DAN, DEU, ELL, ENG, EPO,
-   EST, FAS, FIN, FRA, GLE, GRC, HRV, HUN, ISL, ITA, LAT, LAV, LIT, NLD,
-   NOR, RON, RUS, POL, POR, SLK, SPA, SRP, SWA, SWE, TUR, UKR
+-  Confirmed working on languages: ARA, BUL, CAT, CYM, CES, DAN, DEU,
+   ELL, ENG, EPO, EST, FAS, FIN, FRA, GLE, GRC, HRV, HUN, ISL, ITA, JPN,
+   LAT, LAV, LIT, NLD, NOR, RON, RUS, POL, POR, SLK, SPA, SRP, SWA, SWE,
+   TUR, UKR
 -  MFCC and DTW computed via Python C extensions to reduce the
    processing time
--  On Linux, eSpeak called via a Python C extension for faster audio
-   synthesis
--  Batch processing of multiple audio/text pairs
 -  Several built-in TTS engine wrappers: eSpeak (default, FLOSS),
    Festival (FLOSS), Nuance TTS API (commercial)
--  Use custom TTS engine wrappers besides the built-in ones
+-  Default TTS (eSpeak) called via a Python C extension for fast audio
+   synthesis
+-  A custom, user-provided TTS engine Python wrapper can be used instead
+   of the built-in ones (included example for speect)
+-  Batch processing of multiple audio/text pairs
 -  Download audio from a YouTube video
 -  In multilevel mode, recursive alignment from paragraph to sentence to
    word level
@@ -253,15 +261,18 @@ Supported Features
 -  Execution parameters tunable at runtime
 -  Code suitable for Web app deployment (e.g., on-demand cloud
    computing)
+-  Extensive test suite including 898 unit/integration/performance
+   tests, that run and must pass before each release
 
 Limitations and Missing Features
 --------------------------------
 
 -  Audio should match the text: large portions of spurious text or audio
    might produce a wrong sync map
--  Audio is assumed to be spoken: not suitable/YMMV for song captioning
+-  Audio is assumed to be spoken: not suitable for song captioning, YMMV
+   for CC applications
 -  No protection against memory trashing if you feed extremely long
-   audio files
+   audio files (>1.5h per single audio file)
 -  `Open issues <https://github.com/readbeyond/aeneas/issues>`__
 
 License
@@ -362,6 +373,9 @@ asynchronous usage.
 **Chris Hubbard** prepared the files for packaging aeneas as a
 Debian/Ubuntu ``.deb``.
 
+**Daniel Bair**, **Chris Hubbard**, and **Richard Margetts** packaged
+the installers for Mac OS X and Windows.
+
 **Firat Ozdemir** contributed the ``finetuneas`` HTML/JS code for fine
 tuning sync maps in the browser.
 

diff --git a/VERSION b/VERSION
@@ -1 +1 @@
-1.5.0
+1.5.1
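
Review note: this release bumps the version string in `VERSION` and in the `__version__` line of each module in the diffs that follow. A minimal sketch of how such a bump could be checked for consistency is given below. The helper name `find_version_mismatches` and the in-memory `example_sources` are assumptions made for illustration and are not part of aeneas; the second entry is deliberately stale to show a mismatch.

```python
import re

# Matches lines such as: __version__ = "1.5.1"
VERSION_RE = re.compile(r'^__version__\s*=\s*"([^"]+)"', re.MULTILINE)


def find_version_mismatches(expected, sources):
    """
    Return {name: found_version_or_None} for every source whose
    __version__ differs from ``expected``.

    ``sources`` maps a file name to its text contents.
    """
    mismatches = {}
    for name, text in sources.items():
        match = VERSION_RE.search(text)
        found = match.group(1) if match else None
        if found != expected:
            mismatches[name] = found
    return mismatches


# Hypothetical inputs: a real check would read the files from disk.
# The second entry is intentionally left at the old version.
example_sources = {
    "aeneas/__init__.py": '__license__ = "GNU AGPL v3"\n__version__ = "1.5.1"\n',
    "aeneas/audiofile.py": '__license__ = "GNU AGPL v3"\n__version__ = "1.5.0"\n',
}
print(find_version_mismatches("1.5.1", example_sources))
```

Running a check of this kind before tagging a release would flag any module whose header was missed by the bump.
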
diff --git a/aeneas/__init__.py b/aeneas/__init__.py
@@ -13,7 +13,7 @@
     Copyright 2015-2016, Alberto Pettarin (www.albertopettarin.it)
     """
 __license__ = "GNU AGPL v3"
-__version__ = "1.5.0"
+__version__ = "1.5.1"
 __email__ = "[email protected]"
 __status__ = "Production"
 

diff --git a/aeneas/adjustboundaryalgorithm.py b/aeneas/adjustboundaryalgorithm.py
@@ -30,7 +30,7 @@
     Copyright 2015-2016, Alberto Pettarin (www.albertopettarin.it)
     """
 __license__ = "GNU AGPL v3"
-__version__ = "1.5.0"
+__version__ = "1.5.1"
 __email__ = "[email protected]"
 __status__ = "Production"
 

diff --git a/aeneas/analyzecontainer.py b/aeneas/analyzecontainer.py
@@ -32,7 +32,7 @@
     Copyright 2015-2016, Alberto Pettarin (www.albertopettarin.it)
     """
 __license__ = "GNU AGPL v3"
-__version__ = "1.5.0"
+__version__ = "1.5.1"
 __email__ = "[email protected]"
 __status__ = "Production"
 

diff --git a/aeneas/audiofile.py b/aeneas/audiofile.py
@@ -37,7 +37,7 @@
     Copyright 2015-2016, Alberto Pettarin (www.albertopettarin.it)
     """
 __license__ = "GNU AGPL v3"
-__version__ = "1.5.0"
+__version__ = "1.5.1"
 __email__ = "[email protected]"
 __status__ = "Production"
 
@@ -116,6 +116,77 @@ class AudioFile(Loggable):
     :type  logger: :class:`~aeneas.logger.Logger`
     """
 
+    FILE_EXTENSIONS = [
+        u"3g2",
+        u"3gp",
+        u"aa",
+        u"aa3",
+        u"aac",
+        u"aax",
+        u"aiff",
+        u"alac",
+        u"amr",
+        u"ape",
+        u"asf",
+        u"at3",
+        u"at9",
+        u"au",
+        u"avi",
+        u"awb",
+        u"celt",
+        u"dct",
+        u"dss",
+        u"dvf",
+        u"eac",
+        u"flac",
+        u"flv",
+        u"gsm",
+        u"m4a",
+        u"m4b",
+        u"m4p",
+        u"m4v",
+        u"mid",
+        u"midi",
+        u"mkv",
+        u"mmf",
+        u"mov",
+        u"mp2",
+        u"mp3",
+        u"mp4",
+        u"mpc",
+        u"mpeg",
+        u"mpg",
+        u"mpv",
+        u"msv",
+        u"oga",
+        u"ogg",
+        u"ogv",
+        u"oma",
+        u"opus",
+        u"pcm",
+        u"qt",
+        u"ra",
+        u"ram",
+        u"raw",
+        u"riff",
+        u"rm",
+        u"rmvb",
+        u"shn",
+        u"sln",
+        u"theora",
+        u"tta",
+        u"vob",
+        u"vorbis",
+        u"vox",
+        u"wav",
+        u"webm",
+        u"wma",
+        u"wmv",
+        u"wv",
+        u"yuv",
+    ]
+    """ Extensions of common formats for audio (and video) files. """
+
     TAG = u"AudioFile"
 
     def __init__(self, file_path=None, is_mono_wave=False, rconf=None, logger=None):
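
Review note: the hunk above adds `AudioFile.FILE_EXTENSIONS` but does not show where the list is consumed. One plausible use is a case-insensitive extension check on a file path, sketched below. The function `has_known_extension` is hypothetical, and the list is a small subset copied from the diff so that the sketch runs without aeneas installed.

```python
import os

# Subset of the FILE_EXTENSIONS list added in this diff, copied here
# so that the sketch runs without aeneas installed.
FILE_EXTENSIONS = [u"flac", u"m4a", u"mp3", u"mp4", u"ogg", u"wav", u"webm"]


def has_known_extension(path, extensions=FILE_EXTENSIONS):
    """Return True if the extension of ``path`` (case-insensitive) is in ``extensions``."""
    ext = os.path.splitext(path)[1]  # e.g. ".MP3", or "" if there is none
    return ext[1:].lower() in extensions


print(has_known_extension("chapter01.MP3"))   # extension listed
print(has_known_extension("notes.txt"))       # extension not listed
```

The entries in the list carry no leading dot, so the sketch strips the dot returned by `os.path.splitext` before comparing.
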

diff --git a/aeneas/audiofilemfcc.py b/aeneas/audiofilemfcc.py
@@ -29,7 +29,7 @@
     Copyright 2015-2016, Alberto Pettarin (www.albertopettarin.it)
     """
 __license__ = "GNU AGPL v3"
-__version__ = "1.5.0"
+__version__ = "1.5.1"
 __email__ = "[email protected]"
 __status__ = "Production"
 
@@ -134,7 +134,7 @@ def __init__(
                 self._compute_mfcc_c_extension,
                 self._compute_mfcc_pure_python,
                 (),
-                c_extension=self.rconf[RuntimeConfiguration.C_EXTENSIONS]
+                rconf=self.rconf
             )
             self.audio_length = self.audio_file.audio_length
             if audio_file_was_none:
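
Review note: the call site above now passes the whole runtime configuration (`rconf=self.rconf`) instead of a single `c_extension` flag, leaving the choice between the C extension and the pure Python code to the callee. The general pattern can be sketched as follows. Everything here is an assumption for illustration: the name `run_with_fallback`, the plain `dict` standing in for the runtime configuration, the `"c_extensions"` key, and the convention that both callables return a `(success, result)` pair. It is not the aeneas implementation.

```python
def run_with_fallback(c_function, py_function, args, rconf):
    """
    Try ``c_function`` when ``rconf`` enables C extensions,
    then fall back to ``py_function``.

    Both callables are assumed to return a (success, result) pair.
    """
    if rconf.get("c_extensions", True):
        try:
            success, result = c_function(*args)
            if success:
                return result
        except Exception:
            # e.g. the compiled module is missing: use pure Python instead
            pass
    success, result = py_function(*args)
    if not success:
        raise RuntimeError("the pure Python code failed")
    return result


def fake_c(x):
    # Hypothetical stand-in that simulates a missing C extension
    raise ImportError("C extension not compiled")


def fake_py(x):
    # Hypothetical stand-in for the pure Python code path
    return (True, x * 2)


print(run_with_fallback(fake_c, fake_py, (21,), {"c_extensions": True}))
```

Passing the configuration object rather than one flag lets the callee consult other settings later without changing every call site.
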

diff --git a/aeneas/cdtw/000_compile_driver.sh b/aeneas/cdtw/000_compile_driver.sh
@@ -1,6 +1,6 @@
 #!/bin/bash
 
-gcc cdtw_driver.c cdtw_func.c cint.c -o cdtw_driver -lm -Wall -pedantic -std=c99
+gcc cdtw_driver.c cdtw_func.c ../cint/cint.c -o cdtw_driver -lm -Wall -pedantic -std=c99
 
 
 
diff --git a/aeneas/cdtw/900_clean.sh b/aeneas/cdtw/900_clean.sh
@@ -0,0 +1,3 @@
+#!/bin/bash
+
+rm -rf build __pycache__ *.so cdtw_driver