Specialized Keras.Model which implements the core features needed for metric learning.
TFSimilarity.callbacks.SimilarityModel(
*args, **kwargs
)
In particular, SimilarityModel() supports indexing, searching, and saving the embeddings predicted by the network.
All similarity model classes derive from this class to benefit from those core features.
calibrate(
x: <a href="../../TFSimilarity/callbacks/FloatTensor.md">TFSimilarity.callbacks.FloatTensor</a>,
y: <a href="../../TFSimilarity/callbacks/IntTensor.md">TFSimilarity.callbacks.IntTensor</a>,
thresholds_targets: MutableMapping[str, float] = {},
k: int = 1,
calibration_metric: Union[str, <a href="../../TFSimilarity/callbacks/ClassificationMetric.md">TFSimilarity.callbacks.ClassificationMetric</a>] = 'f1',
matcher: Union[str, <a href="../../TFSimilarity/callbacks/ClassificationMatch.md">TFSimilarity.callbacks.ClassificationMatch</a>] = 'match_nearest',
extra_metrics: MutableSequence[Union[str, ClassificationMetric]] = ['precision', 'recall'],
rounding: int = 2,
verbose: int = 1
) -> <a href="../../TFSimilarity/indexer/CalibrationResults.md">TFSimilarity.indexer.CalibrationResults</a>
Calibrate model thresholds using a test dataset.
Args | |
---|---|
x | examples to use for the calibration. |
y | labels associated with the calibration examples. |
thresholds_targets | Dict of performance targets to (if possible) meet with respect to the calibration_metric. |
calibration_metric | [ClassificationMetric()](classification_metrics/overview.md) used to evaluate the performance of the index. |
k | How many neighbors to use during the calibration. Defaults to 1. |
matcher | 'match_nearest', 'match_majority_vote' or ClassificationMatch object. Defines the classification matching, e.g., match_nearest will count a True Positive if the query_label is equal to the label of the nearest neighbor and the distance is less than or equal to the distance threshold. Defaults to 'match_nearest'. |
extra_metrics | List of additional tf.similarity.classification_metrics.ClassificationMetric() to compute and report. Defaults to ['precision', 'recall']. |
rounding | Number of digits used to round the metrics. Defaults to 2. |
verbose | Be verbose and display calibration results. Defaults to 1. |
Returns | |
---|---|
CalibrationResults containing the thresholds and cutpoints Dicts. |
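
To make the idea of calibration concrete, here is a minimal pure-Python sketch of a threshold sweep. It is an illustration, not the library's implementation. It assumes each calibration query has already been reduced to the distance to its nearest neighbor plus a flag saying whether that neighbor carries the right label, and it assumes a correct neighbor rejected for being too far counts as a false negative. `best_f1_threshold` is a hypothetical helper name.

```python
from typing import List, Tuple


def best_f1_threshold(
    distances: List[float], is_correct: List[bool], rounding: int = 2
) -> Tuple[float, float]:
    """Return the (threshold, f1) pair that maximizes F1.

    Every observed distance is tried as a candidate threshold. A query is a
    true positive when its neighbor is correct and within the threshold, a
    false positive when the neighbor is within the threshold but wrong, and
    (by assumption) a false negative when a correct neighbor is too far away.
    """
    best = (0.0, 0.0)
    pairs = list(zip(distances, is_correct))
    for threshold in sorted(set(distances)):
        tp = sum(d <= threshold and ok for d, ok in pairs)
        fp = sum(d <= threshold and not ok for d, ok in pairs)
        fn = sum(d > threshold and ok for d, ok in pairs)
        denom = 2 * tp + fp + fn
        f1 = round(2 * tp / denom, rounding) if denom else 0.0
        # Keep the smallest threshold that reaches the best score.
        if f1 > best[1]:
            best = (threshold, f1)
    return best
```

In this sketch the `rounding` argument plays the same role as the `rounding` parameter above: scores are rounded before being compared, so near-identical thresholds collapse to the first one found.
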
create_index(
distance: Union[<a href="../../TFSimilarity/distances/Distance.md">TFSimilarity.distances.Distance</a>, str] = 'cosine',
search: Union[<a href="../../TFSimilarity/indexer/Search.md">TFSimilarity.indexer.Search</a>, str] = 'nmslib',
kv_store: Union[<a href="../../TFSimilarity/indexer/Store.md">TFSimilarity.indexer.Store</a>, str] = 'memory',
evaluator: Union[<a href="../../TFSimilarity/callbacks/Evaluator.md">TFSimilarity.callbacks.Evaluator</a>, str] = 'memory',
embedding_output: int = None,
stat_buffer_size: int = 1000
) -> None
Create the model index to make embeddings searchable via KNN.
This method is normally called as part of SimilarityModel.compile(). However, this method is provided if users want to define a custom index outside of the compile() method.
NOTE: This method sets SimilarityModel._index and will replace any existing index.
Args | |
---|---|
distance | Distance used to compute embedding proximity. Defaults to 'cosine'. |
kv_store | How to store the indexed records. Defaults to 'memory'. |
search | Which Search() framework to use to perform KNN search. Defaults to 'nmslib'. |
evaluator | What type of Evaluator() to use to evaluate index performance. Defaults to 'memory', the in-memory evaluator. |
embedding_output | Which model output head predicts the embeddings that should be indexed. Defaults to None, which is for single-output models. For multi-head models, the caller, usually the SimilarityModel() class, is responsible for passing the correct one. |
stat_buffer_size | Size of the sliding window buffer used to compute index performance. Defaults to 1000. |
Raises | |
---|---|
ValueError | Invalid search framework or key value store. |
evaluate_classification(
x: <a href="../../TFSimilarity/callbacks/Tensor.md">TFSimilarity.callbacks.Tensor</a>,
y: <a href="../../TFSimilarity/callbacks/IntTensor.md">TFSimilarity.callbacks.IntTensor</a>,
k: int = 1,
extra_metrics: MutableSequence[Union[str, ClassificationMetric]] = ['precision', 'recall'],
matcher: Union[str, <a href="../../TFSimilarity/callbacks/ClassificationMatch.md">TFSimilarity.callbacks.ClassificationMatch</a>] = 'match_nearest',
verbose: int = 1
) -> DefaultDict[str, Dict[str, Union[str, np.ndarray]]]
Evaluate model classification matching on a given evaluation dataset.
Args | |
---|---|
x | Examples to be matched against the index. |
y | Labels associated with the examples supplied. |
k | How many neighbors to use to perform the evaluation. Defaults to 1. |
extra_metrics | List of additional tf.similarity.classification_metrics.ClassificationMetric() to compute and report. Defaults to ['precision', 'recall']. |
matcher | 'match_nearest', 'match_majority_vote' or ClassificationMatch object. Defines the classification matching, e.g., match_nearest will count a True Positive if the query_label is equal to the label of the nearest neighbor and the distance is less than or equal to the distance threshold. |
verbose | Display results if set to 1, otherwise results are returned silently. Defaults to 1. |
Returns | |
---|---|
Dictionary of [evaluation metrics](distance_metrics.md) |
Raises | |
---|---|
IndexError | Index must contain embeddings but is currently empty. |
ValueError | Uncalibrated model: run model.calibrate() first. |
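
The `match_nearest` rule described for the matcher argument can be sketched in a few lines of plain Python. This is an illustrative reading of the rule, not the library's code: `classification_counts` and `precision` are hypothetical helpers, and the inputs are assumed to be the label and distance of each query's single nearest neighbor.

```python
from typing import Dict, Sequence


def classification_counts(
    query_labels: Sequence[int],
    neighbor_labels: Sequence[int],
    neighbor_distances: Sequence[float],
    threshold: float,
) -> Dict[str, int]:
    """Count outcomes under a match_nearest-style rule.

    A query is a true positive when its label equals the nearest neighbor's
    label and the distance is less than or equal to the threshold.
    """
    counts = {"tp": 0, "fp": 0, "rejected": 0}
    for query, neighbor, distance in zip(
        query_labels, neighbor_labels, neighbor_distances
    ):
        if distance > threshold:
            counts["rejected"] += 1  # too far away: no match is reported
        elif query == neighbor:
            counts["tp"] += 1
        else:
            counts["fp"] += 1
    return counts


def precision(counts: Dict[str, int]) -> float:
    """Fraction of reported matches that carry the right label."""
    matched = counts["tp"] + counts["fp"]
    return counts["tp"] / matched if matched else 0.0
```

Raising the threshold accepts more matches, which tends to trade precision for coverage; that trade-off is what the calibration step tunes.
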
evaluate_retrieval(
x: <a href="../../TFSimilarity/callbacks/Tensor.md">TFSimilarity.callbacks.Tensor</a>,
y: <a href="../../TFSimilarity/callbacks/IntTensor.md">TFSimilarity.callbacks.IntTensor</a>,
retrieval_metrics: Sequence[<a href="../../TFSimilarity/indexer/RetrievalMetric.md">TFSimilarity.indexer.RetrievalMetric</a>],
verbose: int = 1
) -> Dict[str, np.ndarray]
Evaluate the quality of the index against a test dataset.
Args | |
---|---|
x | Examples to be matched against the index. |
y | Labels associated with the examples supplied. |
retrieval_metrics | List of [RetrievalMetric()](retrieval_metrics/overview.md) to compute. |
verbose | Display results if set to 1, otherwise results are returned silently. Defaults to 1. |
Returns | |
---|---|
Dictionary of metric results where keys are the metric names and values are the metric values. |
Raises | |
---|---|
IndexError | Index must contain embeddings but is currently empty. |
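
As an illustration of the kind of result this method returns (metric names mapped to metric values), here is a small sketch that computes recall@k by hand. It is only a stand-in for a real RetrievalMetric: `recall_at_k` and `evaluate_retrieval_sketch` are hypothetical names, and the neighbor labels are assumed to be already sorted from nearest to farthest.

```python
from typing import Dict, Sequence


def recall_at_k(
    query_labels: Sequence[int],
    neighbor_labels: Sequence[Sequence[int]],
    k: int,
) -> float:
    """Fraction of queries with at least one same-label neighbor in the top k."""
    hits = sum(
        label in neighbors[:k]
        for label, neighbors in zip(query_labels, neighbor_labels)
    )
    return hits / len(query_labels)


def evaluate_retrieval_sketch(
    query_labels: Sequence[int],
    neighbor_labels: Sequence[Sequence[int]],
    ks: Sequence[int] = (1, 5),
) -> Dict[str, float]:
    """Return a dictionary keyed by metric name, one entry per value of k."""
    return {
        f"recall@{k}": recall_at_k(query_labels, neighbor_labels, k) for k in ks
    }
```
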
index(
x: <a href="../../TFSimilarity/callbacks/Tensor.md">TFSimilarity.callbacks.Tensor</a>,
y: <a href="../../TFSimilarity/callbacks/IntTensor.md">TFSimilarity.callbacks.IntTensor</a> = None,
data: Optional[<a href="../../TFSimilarity/callbacks/Tensor.md">TFSimilarity.callbacks.Tensor</a>] = None,
build: bool = True,
verbose: int = 1
)
Index data.
Args | |
---|---|
x | Samples to index. |
y | Class ids associated with the data, if any. Defaults to None. |
data | Data associated with the samples, to be stored in the key value store. Defaults to None. |
build | Rebuild the index after indexing. This is needed to make the new samples searchable. Set it to False to save processing time when calling indexing repeatedly without the need to search between the indexing requests. Defaults to True. |
verbose | Output indexing progress info. Defaults to 1. |
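
The effect of the `build` flag can be pictured with a toy index that separates stored records from searchable ones. This is a hypothetical sketch of the deferred-rebuild idea only; `ToyIndex` and its methods are made up for illustration and say nothing about how the library's index is actually implemented.

```python
from typing import Any, List, Optional, Tuple


class ToyIndex:
    """Toy index where added records become searchable only after a build."""

    def __init__(self) -> None:
        self._records: List[Tuple[Any, Optional[int]]] = []  # everything added
        self._searchable: List[Tuple[Any, Optional[int]]] = []  # last snapshot

    def add(self, embedding: Any, label: Optional[int] = None,
            build: bool = True) -> None:
        self._records.append((embedding, label))
        if build:
            self.build()

    def build(self) -> None:
        # Stand-in for an expensive rebuild of the search structure.
        self._searchable = list(self._records)

    def searchable_size(self) -> int:
        return len(self._searchable)
```

With this toy class, adding many records with `build=False` and rebuilding once at the end pays the rebuild cost a single time, at the price of the new records being invisible to searches until then.
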
index_single(
x: <a href="../../TFSimilarity/callbacks/Tensor.md">TFSimilarity.callbacks.Tensor</a>,
y: <a href="../../TFSimilarity/callbacks/IntTensor.md">TFSimilarity.callbacks.IntTensor</a> = None,
data: Optional[<a href="../../TFSimilarity/callbacks/Tensor.md">TFSimilarity.callbacks.Tensor</a>] = None,
build: bool = True,
verbose: int = 1
)
Index a single sample.
Args | |
---|---|
x | Sample to index. |
y | Class id associated with the data, if any. Defaults to None. |
data | Data associated with the sample, to be stored in the key value store. Defaults to None. |
build | Rebuild the index after indexing. This is needed to make the new sample searchable. Set it to False to save processing time when calling indexing repeatedly without the need to search between the indexing requests. Defaults to True. |
verbose | Output indexing progress info. Defaults to 1. |
index_size() -> int
Return the index size.
index_summary()
Display index info summary.
load_index(
filepath: str
)
Load Index data from a checkpoint and initialize underlying structure with the reloaded data.
Args | |
---|---|
filepath | Directory where the checkpoint is located. |
verbose | Be verbose. Defaults to 1. |
lookup(
x: <a href="../../TFSimilarity/callbacks/Tensor.md">TFSimilarity.callbacks.Tensor</a>,
k: int = 5,
verbose: int = 1
) -> List[List[Lookup]]
Find the k closest matches in the index for a set of samples.
Args | |
---|---|
x | Samples to match. |
k | Number of nearest neighbors to look up. Defaults to 5. |
verbose | Display progress. Defaults to 1. |
Returns | |
---|---|
List of lists of the k nearest neighbors: List[List[Lookup]] |
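
The nested return shape (one list of neighbors per query) is easy to see in a brute-force version of the search. The sketch below is a pure-Python illustration using cosine distance, not the approximate search the index actually performs. `Neighbor` is a hypothetical stand-in for the library's Lookup record, and `toy_lookup` is a made-up name.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Neighbor:
    """Hypothetical stand-in for a lookup result."""
    rank: int
    label: int
    distance: float


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; both vectors are assumed to be non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def toy_lookup(
    queries: Sequence[Sequence[float]],
    indexed: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 5,
) -> List[List[Neighbor]]:
    """Return, for each query, its k nearest indexed items, closest first."""
    results = []
    for query in queries:
        scored = sorted(
            (cosine_distance(query, emb), label)
            for emb, label in zip(indexed, labels)
        )
        results.append(
            [Neighbor(rank, label, dist)
             for rank, (dist, label) in enumerate(scored[:k])]
        )
    return results
```

The outer list has one entry per query and each inner list has at most `k` entries, which mirrors the `List[List[Lookup]]` return type.
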
match(
x: <a href="../../TFSimilarity/callbacks/FloatTensor.md">TFSimilarity.callbacks.FloatTensor</a>,
cutpoint='optimal',
no_match_label=-1,
k=1,
matcher: Union[str, <a href="../../TFSimilarity/callbacks/ClassificationMatch.md">TFSimilarity.callbacks.ClassificationMatch</a>] = 'match_nearest',
verbose=0
)
Match a set of examples against the calibrated index.
For the match function to work, the index must be calibrated using calibrate().
Args | |
---|---|
x | Batch of examples to be matched against the index. |
cutpoint | Which calibration threshold to use. Defaults to 'optimal' which is the optimal F1 threshold computed using calibrate(). |
no_match_label | Which label value to assign when there is no match. Defaults to -1. |
k | How many neighbors to use during the matching. Defaults to 1. |
matcher | 'match_nearest', 'match_majority_vote' or ClassificationMatch object. Defines the classification matching, e.g., match_nearest will count a True Positive if the query_label is equal to the label of the nearest neighbor and the distance is less than or equal to the distance threshold. |
verbose | Be verbose. Defaults to 0. |
Returns | |
---|---|
List of class ids that match for each supplied example. |

This function matches all the cutpoints at once internally, as there is little performance downside to doing so and it allows the evaluation to be done in a single pass.
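
How a cutpoint and `no_match_label` interact can be shown with a short sketch. It is a simplification under stated assumptions: the cutpoints are modeled as a plain mapping from cutpoint name to distance threshold, which is not necessarily how the library stores them, and `toy_match` is a hypothetical name. The inputs are the label and distance of each example's nearest neighbor.

```python
from typing import Dict, List, Sequence


def toy_match(
    nearest_labels: Sequence[int],
    nearest_distances: Sequence[float],
    cutpoints: Dict[str, float],
    cutpoint: str = "optimal",
    no_match_label: int = -1,
) -> List[int]:
    """Return one class id per example, or no_match_label when too far."""
    threshold = cutpoints[cutpoint]  # distance threshold for this cutpoint
    return [
        label if distance <= threshold else no_match_label
        for label, distance in zip(nearest_labels, nearest_distances)
    ]
```

A stricter cutpoint (a smaller distance threshold) turns more examples into `no_match_label`, so the same neighbors can yield different outputs depending on which cutpoint is selected.
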
reset_index()
Reinitialize the index.
save_index(
filepath, compression=True
)
Save the index to disk.
Args | |
---|---|
filepath | Directory where the index is saved. |
compression | Store index data compressed. Defaults to True. |
single_lookup(
x: <a href="../../TFSimilarity/callbacks/Tensor.md">TFSimilarity.callbacks.Tensor</a>,
k: int = 5
) -> List[<a href="../../TFSimilarity/indexer/Lookup.md">TFSimilarity.indexer.Lookup</a>]
Find the k closest matches in the index for a given sample.
Args | |
---|---|
x | Sample to match. |
k | Number of nearest neighbors to look up. Defaults to 5. |
Returns | |
---|---|
List of the k nearest neighbors' info: List[Lookup] |
to_data_frame(
num_items: int = 0
) -> <a href="../../TFSimilarity/indexer/PandasDataFrame.md">TFSimilarity.indexer.PandasDataFrame</a>
Export data as a pandas DataFrame.
Args | |
---|---|
num_items | Number of items to export to the DataFrame. Defaults to 0 (unlimited). |
Returns | |
---|---|
pd.DataFrame | A pandas DataFrame. |