Releases: snipsco/snips-nlu
0.7.0
Fixed
- We no longer add regexes to the `RegexIntentParser` when there are enough queries for a given intent (> 50). In this case we rely on the `ProbabilisticIntentParser` to parse correctly. Previously, if you added 1000 queries per intent, 1000 regexes were compiled and kept in memory per intent. This fix improves both the memory footprint and serialization
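The threshold rule described above can be sketched roughly as follows. This is only an illustration, not the library's implementation: `MAX_QUERIES_FOR_REGEX`, `build_intent_patterns`, and the way a query is turned into an exact-match pattern are hypothetical. Only the cutoff of 50 queries comes from the note above.

```python
import re

# Cutoff taken from the release note; the constant name is hypothetical.
MAX_QUERIES_FOR_REGEX = 50


def build_intent_patterns(queries_by_intent):
    """Compile one exact-match regex per query, but only for small intents.

    Intents with more than MAX_QUERIES_FOR_REGEX queries get no regexes and
    are left to a statistical parser (not shown here).
    """
    patterns = {}
    for intent, queries in queries_by_intent.items():
        if len(queries) > MAX_QUERIES_FOR_REGEX:
            patterns[intent] = []  # nothing compiled, nothing kept in memory
            continue
        patterns[intent] = [
            re.compile(r"^" + re.escape(q) + r"$", re.IGNORECASE) for q in queries
        ]
    return patterns


queries = {
    "small_intent": ["turn on the light", "switch the light on"],
    "large_intent": ["query number %d" % i for i in range(1000)],
}
patterns = build_intent_patterns(queries)
```

With this rule, the number of compiled regexes per intent is bounded by the cutoff, which is what keeps memory use and serialized size from growing with the dataset.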
Changed
- Renamed `NLUEngine.add_pretrained_model` to `NLUEngine.add_fitted_tagger`
Added
- Added `NLUEngine.get_fitted_tagger(dataset, intent)`, which returns a trained CRFGTagger for a given intent. This tagger can then be added to the `NLUEngine` like this: `tagger = engine.get_fitted_tagger(dataset, "my_intent")` and `engine.add_fitted_tagger(intent="my_intent", model_data=tagger.to_dict())`. When you then train the `NLUEngine`, make sure that you do not retrain the `"my_intent"` intent. To do so, pass the list of intents you want to train when calling `engine.fit(dataset, intents=[<list the intents you need to retrain>])`
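One way to build the `intents` list is to take every intent in the dataset except those whose tagger was already added. The sketch below assumes the dataset is a dict with a top-level `"intents"` mapping, which is not stated in these notes. `intents_to_retrain` is a hypothetical helper, not part of the library.

```python
def intents_to_retrain(dataset, already_fitted):
    """Return the sorted intent names that still need training."""
    skip = set(already_fitted)
    return sorted(name for name in dataset["intents"] if name not in skip)


# Toy dataset: only the intent names matter for this sketch.
dataset = {"intents": {"my_intent": {}, "set_alarm": {}, "play_music": {}}}

to_train = intents_to_retrain(dataset, already_fitted=["my_intent"])
print(to_train)  # ['play_music', 'set_alarm']

# The result would then be passed along the lines of:
# engine.fit(dataset, intents=to_train)
```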
0.6.0
Removed
- Removed the `NLUEngine.tag` method. Tagging is now handled by this library: https://github.com/snipsco/snips-nlu-tagging
Fixed
- Fixed the serialization bug of the `SnipsIntentClassifier`
0.5.0
Added
- Tests to ensure that the whole pipeline is robust to naughty strings
- Added a `NLUEngine.tag(text, intent)` method for autotagging. When an intent has fewer than `x` queries, the tag method returns (by order of priority in case of overlap): the previously seen entities in the NLUEngine intents, the builtin entities, the result of the intent model trained with `x` queries. When there are more than `x` queries, the autotagging output is the output of the model trained with `x` queries
- Added the ability to add a pretrained intent to an `NLUEngine` with the `NLUEngine.add_pretrained_model(intent, model_data)` method
- Added the ability to train only particular intents when fitting a dataset, with the `intents` argument of the `NLUEngine.fit(dataset, intents=None)` method
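The "order of priority in case of overlap" rule for autotagging can be illustrated with a small span-resolution sketch. Everything here is an assumption made for illustration: the tuple layout, the `resolve_overlaps` helper, the numeric priorities, and the labels are hypothetical and do not come from the library.

```python
def resolve_overlaps(candidates):
    """Keep non-overlapping spans, preferring lower priority numbers.

    Each candidate is (start, end, label, priority) with end exclusive.
    """
    kept = []
    # Visit candidates by priority first, then by position, so that a
    # higher-priority span always wins over a lower-priority overlapping one.
    for cand in sorted(candidates, key=lambda c: (c[3], c[0])):
        start, end = cand[0], cand[1]
        if all(end <= k[0] or start >= k[1] for k in kept):
            kept.append(cand)
    return sorted(kept, key=lambda c: c[0])


# Hypothetical priorities: 0 = previously seen entity, 1 = builtin entity,
# 2 = output of the intent model.
SEEN, BUILTIN, MODEL = 0, 1, 2

text = "wake me up at 7 am tomorrow"
candidates = [
    (14, 18, "alarm_time", SEEN),   # "7 am"
    (14, 27, "datetime", BUILTIN),  # "7 am tomorrow", overlaps the span above
    (0, 7, "action", MODEL),        # "wake me"
]
resolved = resolve_overlaps(candidates)
```

Here the builtin span is dropped because it overlaps a previously seen entity, while the model span survives because it overlaps nothing.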
Changed
- Changed the serialization of the `NLUEngine` and most other objects of the lib
- Improved the performance of some features by adding caching
- Improved builtin entities handling in regex generation
- The `snips_nlu_version` key is now mandatory in the input dataset
Fixed
- Fixed bug with synonyms in the dataset, see: #224
- Fixed bug with unseen CRF labels at inference time: if some labels were not seen during training, we could request the CRF probabilities of unseen labels when post-processing builtin entities
- Fixed the bug in intent classification feature extraction that occurred when the input queries were empty or only contained stop words, leading to an empty vocab for the `Featurize.count_vectorizer`
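The empty-vocabulary case can be reproduced without any library. The sketch below assumes a toy stop-word list and whitespace tokenization. `STOP_WORDS`, `build_vocab`, and `can_fit_vectorizer` are hypothetical names and do not reflect how the library's featurizer is written.

```python
STOP_WORDS = {"the", "a", "is", "of"}  # toy list, for illustration only


def build_vocab(queries):
    """Map each non-stop-word token to an index; may return an empty dict."""
    tokens = sorted(
        {tok for q in queries for tok in q.lower().split() if tok not in STOP_WORDS}
    )
    return {tok: i for i, tok in enumerate(tokens)}


def can_fit_vectorizer(queries):
    """Guard to call before fitting a count-based vectorizer."""
    return len(build_vocab(queries)) > 0


print(can_fit_vectorizer(["", "the a", "is of the"]))  # False
print(can_fit_vectorizer(["turn on the light"]))       # True
```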
Removed
- Removed the `force_builtin_entities` flag in the `NLUEngine.parse` method; autotagging is now handled by the `NLUEngine.tag` method
- Removed deep intent support with the Rust library; builtin intents are now light intents added from the registry
0.4.1
0.4.0
Added
- `snips_nlu` can now expose the version of the `builtin_entities_ontology` that it's using through `snips_nlu.__builtin_entities_version__`
- Added a test to check that the members of the `BuiltInEntities` enum class are exactly the members of the `builtin_entities_ontology`
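A consistency check of this kind might look like the sketch below. The enum members, the `ONTOLOGY_ENTITIES` list, and the `enum_matches_ontology` helper are all hypothetical stand-ins, since neither the real enum nor the ontology contents are shown in these notes.

```python
from enum import Enum


class ToyBuiltInEntities(Enum):
    # Hypothetical members; the real enum is not shown in these notes.
    DATETIME = "datetime"
    NUMBER = "number"
    DURATION = "duration"


# Stand-in for the entity names declared by the ontology package.
ONTOLOGY_ENTITIES = ["number", "duration", "datetime"]


def enum_matches_ontology(enum_cls, ontology_entities):
    """True when the enum values and the ontology names are the same set."""
    return {member.value for member in enum_cls} == set(ontology_entities)


print(enum_matches_ontology(ToyBuiltInEntities, ONTOLOGY_ENTITIES))  # True
print(enum_matches_ontology(ToyBuiltInEntities, ["number"]))         # False
```

Comparing sets rather than lists makes the check independent of declaration order while still failing if either side has an extra or missing entity.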
Changed
- BuiltIn entities in the dataset no longer require the following fields: `data`, `automatically_extensible`, `use_synonyms`
0.3.4
Added
- Post-processing that enriches parsed slots with additional builtin entities
- New SnipsNLUEngine API that parses an input when its intent is already known, thus skipping the intent classification
- New SnipsNLUEngine API that forces the use of builtin entities even when not declared by the user (useful for automatic tagging)
Changed
- Load resources for all languages during the import of the snips_nlu package instead of doing it on-the-fly
Fixed
- Fixed a bug that made the engine crash when used with builtin intents only
- The regex parser now correctly handles intents with no slots
0.3.2
[0.3.2] - 2017-04-28
Added
- Exposed the version number through `snips_nlu.__version__`
Changed
- Improved tokenization to keep special tokens like `$`
- improved CI
Fixed
- Tagging issue when entities contained special characters
- Can now fit an empty dataset. Previously, when the user added only built-in intents, the NLU engine crashed
0.3.1
[0.3.1] - 2017-04-18
Added
- Korean support
- New duckling entities as model features
Changed
- Improved German and English models
- Improved `dataset` validation
- Improved the intent classification strategy, so that built-in intents are not mixed up with custom intents
Fixed
- Fixed tagging scheme related issues
- Heavy serialization
Removed
- Removed the internal `utterance_text` field from the `dataset`