Commit

Merge pull request #55 from vc1492a/54-add-regression-tests-for-refactor-validation

54 add regression tests for refactor validation
IroNEDR authored May 1, 2024
2 parents 89483a2 + e24cc98 commit 2662adf
Showing 6 changed files with 234 additions and 53 deletions.
168 changes: 162 additions & 6 deletions .gitignore
@@ -6,12 +6,168 @@ nasaValve
rel_research
PyNomaly/loop_dev.py
/PyNomaly.egg-info/
.pytest_cache
build
htmlcov/
*.egg
*.pyc
.coverage
*.coverage.*
.coveragerc
venv/

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

2 changes: 1 addition & 1 deletion PyNomaly/loop.py
@@ -11,7 +11,7 @@
pass

__author__ = 'Valentino Constantinou'
-__version__ = '0.3.4'
+__version__ = '0.3.3'
__license__ = 'Apache License, Version 2.0'


11 changes: 1 addition & 10 deletions changelog.md
@@ -4,15 +4,6 @@ All notable changes to PyNomaly will be documented in this Changelog.
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
and adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).

-## 0.3.4
-### Changed
-- Unit tests from using the `sklearn.utils.testing` submodule
-to standard Python assertions, as the submodule will be changed
-to private functions after scikit-learn version 0.24.
-- Logging statements or warnings when testing with numba disabled or
-enabled (respectively) to reflect the effect of numba just-in-time
-compilation on code coverage statistics.

## 0.3.3
### Changed
- The implementation of the progress bar to support use when the number of
@@ -226,4 +217,4 @@ in computing the neighborhood distance for each observation.
### Added
- readme.md file documenting methodology, package dependencies, use cases,
how to contribute, and acknowledgements.
-- Initial open release of PyNomaly codebase on Github.
+- Initial open release of PyNomaly codebase on Github.
20 changes: 10 additions & 10 deletions readme.md
@@ -5,10 +5,10 @@ LoOP is a local density based outlier detection method by Kriegel, Kröger, Schu
scores in the range of [0,1] that are directly interpretable as the probability of a sample being an outlier.

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
-[![PyPi](https://img.shields.io/badge/pypi-0.3.4-blue.svg)](https://pypi.python.org/pypi/PyNomaly/0.3.4)
+[![PyPi](https://img.shields.io/badge/pypi-0.3.3-blue.svg)](https://pypi.python.org/pypi/PyNomaly/0.3.3)
![](https://img.shields.io/pypi/dm/PyNomaly.svg?logoColor=blue)
-[![Build Status](https://travis-ci.org/vc1492a/PyNomaly.svg?branch=master)](https://travis-ci.org/vc1492a/PyNomaly)
-[![Coverage Status](https://coveralls.io/repos/github/vc1492a/PyNomaly/badge.svg?branch=master)](https://coveralls.io/github/vc1492a/PyNomaly?branch=master)
+[![Build Status](https://travis-ci.org/vc1492a/PyNomaly.svg?branch=main)](https://travis-ci.org/vc1492a/PyNomaly)
+[![Coverage Status](https://coveralls.io/repos/github/vc1492a/PyNomaly/badge.svg?branch=main)](https://coveralls.io/github/vc1492a/PyNomaly?branch=main)
[![JOSS](http://joss.theoj.org/papers/f4d2cfe680768526da7c1f6a2c103266/status.svg)](http://joss.theoj.org/papers/f4d2cfe680768526da7c1f6a2c103266)

The outlier score of each sample is called the Local Outlier Probability.
@@ -234,13 +234,13 @@ plt.close()
Your results should look like the following:

**LoOP Scores without Clustering**
-![LoOP Scores without Clustering](https://github.com/vc1492a/PyNomaly/blob/master/images/scores.png)
+![LoOP Scores without Clustering](https://github.com/vc1492a/PyNomaly/blob/main/images/scores.png)

**LoOP Scores with Clustering**
-![LoOP Scores with Clustering](https://github.com/vc1492a/PyNomaly/blob/master/images/scores_clust.png)
+![LoOP Scores with Clustering](https://github.com/vc1492a/PyNomaly/blob/main/images/scores_clust.png)

**DBSCAN Cluster Assignments**
-![DBSCAN Cluster Assignments](https://github.com/vc1492a/PyNomaly/blob/master/images/cluster_assignments.png)
+![DBSCAN Cluster Assignments](https://github.com/vc1492a/PyNomaly/blob/main/images/cluster_assignments.png)


Note the differences between using LocalOutlierProbability with and without clustering. In the example without clustering, samples are
@@ -312,7 +312,7 @@ scores = m.local_outlier_probabilities
The below visualization shows the results by a few known distance metrics:

**LoOP Scores by Distance Metric**
-![DBSCAN Cluster Assignments](https://github.com/vc1492a/PyNomaly/blob/master/images/scores_by_distance_metric.png)
+![DBSCAN Cluster Assignments](https://github.com/vc1492a/PyNomaly/blob/main/images/scores_by_distance_metric.png)

## Streaming Data

@@ -383,7 +383,7 @@ plt.close()
```

**LoOP Scores using Stream Approach with n=10**
-![LoOP Scores using Stream Approach with n=10](https://github.com/vc1492a/PyNomaly/blob/master/images/scores_stream.png)
+![LoOP Scores using Stream Approach with n=10](https://github.com/vc1492a/PyNomaly/blob/main/images/scores_stream.png)

### Notes
When calculating the LoOP score of incoming data, the original fitted scores are not updated.
@@ -401,10 +401,10 @@ any changes to a branch which corresponds to an open issue. Hot fixes
and bug fixes can be represented by branches with the prefix `fix/` versus
`feature/` for new capabilities or code improvements. Pull requests will
then be made from these branches into the repository's `dev` branch
-prior to being pulled into `master`. Pull requests which are works in
+prior to being pulled into `main`. Pull requests which are works in
progress or ready for merging should be indicated by their respective
prefixes ([WIP] and [MRG]). Pull requests with the [MRG] prefix will be
-reviewed prior to being pulled into the `master` branch.
+reviewed prior to being pulled into the `main` branch.

### Tests
When contributing, please ensure to run unit tests and add additional tests as
4 changes: 2 additions & 2 deletions setup.py
@@ -3,14 +3,14 @@
setup(
name='PyNomaly',
packages=['PyNomaly'],
-version='0.3.4',
+version='0.3.3',
description='A Python 3 implementation of LoOP: Local Outlier '
'Probabilities, a local density based outlier detection '
'method providing an outlier score in the range of [0,1].',
author='Valentino Constantinou',
author_email='[email protected]',
url='https://github.com/vc1492a/PyNomaly',
-download_url='https://github.com/vc1492a/PyNomaly/archive/0.3.4.tar.gz',
+download_url='https://github.com/vc1492a/PyNomaly/archive/0.3.3.tar.gz',
keywords=['outlier', 'anomaly', 'detection', 'machine', 'learning',
'probability'],
classifiers=[],