The project implements the Kohonen network (SOM, self-organising map) algorithm, packaged as a containerised service.

Divjyot/kohonen-network

Kohonen Network

For more discussion, please see kohonen.ipynb

Running from source code:

As a local server:

  1. cd app/
  2. uvicorn main:app (default port is 8000)

Locally via a Python file:

  • See app/test.py.
  • It contains sample tests that run the network over the following configurations:
    1. 10X10 for 100N
    2. 10X10 for 200N
    3. 10X10 for 500N
    4. 100X100 for 1000N

Building/running containerised (on port 8000 or a port of your choice):

  1. docker build . -t kohonen (assuming your current directory is this project's directory)
  2. docker run -d --name kohonen -p 8000:80 kohonen:latest

Project Structure

  • numpy is used as the primary library to hold and manipulate data.
  • FastAPI is used to package the application and expose the /train/, /atrain/, /list-of-models/, /download/ and /predict/ endpoints. FastAPI was chosen for its high performance; the server setup is auto-tuned to the number of CPU cores to handle high request load.
  • The solution is packaged as a production-ready, containerised server application. The solution files reside under the app/ folder:
    • kohonen.py : file contains Kohonen, which implements the algorithm.
    • main.py : file acts as the entry point for FastAPI, where all REST API requests arrive.
    • settings.py : file contains settings, constant values etc. used in the project.
    • utils/utils.py : file contains helper methods that are used in the project.
    • saved_models/ : folder which (can) contain saved models (weights) after training. Ideally, models/weights would be saved to blob storage for better scalability.
      • There are already saved grids/model files (.npy) under following names/configurations:
        • 10X10 100N
        • 10X10 200N
        • 10X10 500N
        • 100X100 1000N
    • exceptions.py : file implements the exceptions raised by the Kohonen algorithm.
    • logs/logging.log : file contains logs that are captured throughout the application.
    • api_params : file contains multiple classes that are responsible for parsing FastAPI requests (body parameters).
    • app/test.py : file for triggering tests. Uncomment the relevant lines to run it from the command line with python test.py
    • saved_plots/ : folder to save any plot images (used in test.py) for analysis/code-testing purposes
    • saved_train_inputs/ : folder to save any training data (used in test.py) as .npy file for analysis/code-testing purposes
    • configs/ : folder contains the Python package requirements files.
      • prod.requirements.text is used to install requirements when packaging and containerising.
      • requirements.text is used to create a local environment for running and testing the application via the command line, jupyter lab etc.
        • (Core Libs)
          • ! pip install numpy==1.20.1
          • ! pip install matplotlib==3.3.4
        • (Packaging/Productioning Libs)
          • ! pip install fastapi==0.63.0
          • ! pip install aiofiles==0.6.0
          • ! pip install uvicorn==0.13.4
        • (For making HTTP requests to running application)
          • ! pip install requests==2.25.1
    • Dockerfile : to package this project as REST API
      • This is based on tiangolo/uvicorn-gunicorn-fastapi:latest image which has python==3.8.6 preinstalled.
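
The saved_models/ entry above stores trained grids as .npy files. A minimal sketch of round-tripping a weight grid with numpy follows. The function names, the file name and the weight dimension of 3 are illustrative assumptions, not taken from the project's code.

```python
import tempfile
from pathlib import Path

import numpy as np


def save_weights(weights, folder, name):
    """Write a weight grid to <folder>/<name>.npy and return the path.

    Hypothetical helper: the project's own saving code may differ.
    """
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{name}.npy"
    np.save(path, weights)
    return path


def load_weights(path):
    """Read a weight grid back from a .npy file."""
    return np.load(path)


# Round-trip a random 10X10 grid in a temporary folder.
# The last axis (3) is an assumed input dimension.
with tempfile.TemporaryDirectory() as tmp:
    weights = np.random.default_rng(0).random((10, 10, 3))
    path = save_weights(weights, Path(tmp) / "saved_models", "10x10_100N")
    restored = load_weights(path)
    assert np.array_equal(weights, restored)
```

Writing to blob storage instead would only change where the bytes go; the .npy format stays the same.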

EDIT :

Project Outcome

To make the Kohonen network more efficient, I first tried Python's built-in multiprocessing. However, as the results show (see kohonen.ipynb), the larger model still took more than an hour to train, i.e. not 'that' efficient.

My Further Action

I revisited the codebase and started attacking its most computationally expensive method, i.e. the (Euclidean) distance calculation in utils.py. After learning more about vectorisation, I managed to convert the method to a vectorised form (the one you now see in the codebase). The result was nothing less than joy! :D
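
To illustrate the kind of change described above, here is a minimal sketch that contrasts a per-node Python loop with a vectorised numpy version of the Euclidean distance between one input vector and every node of the grid. It is not the code from utils.py: the function names, the (rows, cols, dim) grid layout and the input dimension of 3 are assumptions.

```python
import numpy as np


def distances_loop(grid, x):
    """Baseline: one Python-level iteration per node of the grid."""
    rows, cols, _ = grid.shape
    out = np.empty((rows, cols))
    for i in range(rows):
        for j in range(cols):
            out[i, j] = np.sqrt(np.sum((grid[i, j] - x) ** 2))
    return out


def distances_vectorised(grid, x):
    """Broadcast x against every node's weight vector in one numpy call."""
    return np.linalg.norm(grid - x, axis=-1)


def best_matching_unit(grid, x):
    """Return the (row, col) index of the node closest to x."""
    d = distances_vectorised(grid, x)
    return np.unravel_index(np.argmin(d), d.shape)


rng = np.random.default_rng(0)
grid = rng.random((10, 10, 3))  # assumed layout: rows x cols x input dimension
x = rng.random(3)

# Both versions compute the same distances; only the speed differs.
assert np.allclose(distances_loop(grid, x), distances_vectorised(grid, x))
```

The vectorised version moves the per-node loop into compiled numpy code, which is where a speed-up of this kind typically comes from on large grids.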

For the 100X100 network on the 1000 run, the average model training time dropped from 2-3 hours to 24-25 seconds!
