Tchoung-te is a Yemba-language word meaning "association/group".
The objective of the project is to federate the metadata of all Cameroonian associations in France to make them more accessible to the community.
Presentation video (in French)
If you want to do data analysis, the latest raw database of Cameroonian associations is accessible here.
We also maintain a public dashboard to visualize the associations here.
If you are here, it means that you are interested in an in-house deployment of the solution. Follow the guide :)
- Create a Sourcegraph account and get credentials to use CodyAI
- Devspace installed locally
- Have admin access to a Gogocarto instance
- Go through the Gogocarto tutorials
Locally install all tools (the `init` and `command` scripts from the `.gitpod.yml` file) or use a ready-made development environment on Gitpod:
Execute the `filter-cameroon.ipynb` and `enrich-database.ipynb` notebooks:
pipenv shell
secretsfoundry run --script 'python filter-cameroon.py'
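For context, the filtering step essentially selects RNA records whose title or social object mentions Cameroon. Here is a minimal sketch of that idea with pandas; the file and column names (`titre`, `objet`) are assumptions for illustration and not necessarily those used by the actual notebook:

```python
import pandas as pd

# Load the raw RNA (Répertoire National des Associations) export.
# File name and column names are assumptions for illustration only.
rna = pd.read_csv("rna-import.csv", dtype=str)

# Keep associations whose title or social object mentions Cameroon.
keywords = "cameroun|cameroon"
mask = (
    rna["titre"].str.contains(keywords, case=False, na=False)
    | rna["objet"].str.contains(keywords, case=False, na=False)
)

cameroon = rna[mask]
cameroon.to_csv("rna-cameroon.csv", index=False)
print(f"Kept {len(cameroon)} associations out of {len(rna)}")
```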
Finally, use the resulting CSV file as a data source in Gogocarto and customize it.
You can, for example, define icons by category (social object); ours are in `html/icons`.
They were built from these base icons: https://thenounproject.com/behanzin777/kit/favorites/
csvdiff ref-rna-real-mars-2022.csv rna-real-mars-2022-new.csv -p 1 --columns 1 --format json | jq '.Additions' > experiments/update-database/diff.csv
python3 main.py
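To illustrate what the update step can do with that diff, the sketch below appends the added rows to the reference CSV. It assumes `diff.csv` contains the JSON array written by the `jq '.Additions'` call above, with each element being a raw CSV line; this is only an illustration, not the actual `main.py`:

```python
import csv
import io
import json

# Assumption: diff.csv holds the JSON array produced by `jq '.Additions'`,
# where each element is a raw CSV row added in the newer RNA export.
with open("experiments/update-database/diff.csv", encoding="utf-8") as f:
    additions = json.load(f)

# Parse each raw line back into CSV fields.
rows = [next(csv.reader(io.StringIO(line))) for line in additions]

# Append the new associations to the reference file.
with open("ref-rna-real-mars-2022.csv", "a", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows(rows)

print(f"Appended {len(rows)} new associations")
```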
cd etl/
secretsfoundry run --script "chainlit run experiments/ui.py"
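The UI is a Chainlit app on top of the question-answering chain. Below is a minimal sketch of such an app; `answer_question` is a hypothetical placeholder for the project's real RAG chain, not the actual content of `experiments/ui.py`:

```python
import chainlit as cl


def answer_question(question: str) -> str:
    # Hypothetical placeholder for the real RAG chain over the associations data.
    return f"(placeholder answer for: {question})"


@cl.on_message
async def on_message(message: cl.Message):
    # Answer each incoming user message and send the reply back to the UI.
    answer = answer_question(message.content)
    await cl.Message(content=answer).send()
```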
devspace deploy
The list of runs in `runs.csv` was built by fetching all runs from the beginning using:
export LANGCHAIN_API_KEY=<key>
cd evals/
python3 rag-evals.py save_runs --days 400
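For reference, `save_runs` boils down to exporting runs from LangSmith. Here is a hedged sketch of that idea with the `langsmith` Python client; the project name and the exported columns are assumptions, not the exact behaviour of `rag-evals.py`:

```python
import csv
from datetime import datetime, timedelta

from langsmith import Client

# The client reads LANGCHAIN_API_KEY from the environment.
client = Client()

# Fetch all runs from the last 400 days; the project name is an assumption.
runs = client.list_runs(
    project_name="tchoung-te",
    start_time=datetime.now() - timedelta(days=400),
)

# Persist run ids and inputs so they can be re-evaluated later.
with open("runs.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["run_id", "input"])
    for run in runs:
        writer.writerow([run.id, (run.inputs or {}).get("question", "")])
```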
Then we use Lilac to surface the most interesting questions by clustering them per topic/category. The "Associations in France" cluster was the one selected, and we also deleted some rows that were irrelevant.
The clustering distribution is available here: Clustering Repartition
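Lilac does this clustering interactively in its UI. As a self-contained illustration of the same idea (this is not Lilac's API), the questions could be embedded and grouped with sentence-transformers and scikit-learn, assuming a `runs.csv` with an `input` column as in the sketch above:

```python
import pandas as pd
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

# Load the exported runs and embed the questions.
questions = pd.read_csv("runs.csv")["input"].dropna().tolist()
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(questions)

# Group the questions into a handful of topics.
kmeans = KMeans(n_clusters=5, random_state=0, n_init="auto")
labels = kmeans.fit_predict(embeddings)

# Print questions grouped by cluster id for manual review.
for label, question in sorted(zip(labels, questions)):
    print(label, question)
```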
Finally, you just need to do:
export LANGCHAIN_API_KEY=<key>
cd evals/
python3 rag.py ragas_eval tchoung-te --run_ids_file=runs.csv
python3 rag.py deepeval tchoung-te --run_ids_file=runs.csv
Whenever you change a parameter that can affect the RAG pipeline, you can execute all the inputs present in evals/base_ragas_evaluation.csv and use LangSmith to track them. Then you just have to fetch the runs and execute the commands above. As there are only 27 elements, you can compare the results manually.
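`ragas_eval` relies on the RAGAS library to score the runs. A minimal, self-contained sketch of a RAGAS evaluation follows; the metrics and the toy data are illustrative (the actual script builds its dataset from the LangSmith runs listed in `runs.csv`, and the dataset wrapper may differ depending on the installed RAGAS version):

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_relevancy, faithfulness

# Toy example with the columns RAGAS expects (question, answer, contexts).
data = {
    "question": ["Are there Cameroonian associations in Lyon?"],
    "answer": ["Yes, several Cameroonian associations are registered in Lyon."],
    "contexts": [["Association X, Lyon, social object: mutual aid ..."]],
}
dataset = Dataset.from_dict(data)

# Scoring uses an LLM judge, so an API key (e.g. OPENAI_API_KEY) must be set.
result = evaluate(dataset, metrics=[faithfulness, answer_relevancy])
print(result)
```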
cd etl/
python3 backtesting_prompt.py
Create the dataset on which you want to test the new prompt in LangSmith, then run the script above to backtest the new prompt and see its results on the dataset. Specify the name of the dataset in the file before running it.
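As a rough sketch of the backtesting idea: pull the examples from the LangSmith dataset and run the new prompt against each of them. The dataset name, the prompt, and the `run_chain` helper below are hypothetical; the real `backtesting_prompt.py` wires in the project's actual chain:

```python
from langsmith import Client

# Assumption: the LangSmith dataset name is configured here, as in the script.
DATASET_NAME = "tchoung-te-backtest"

NEW_PROMPT = (
    "You are an assistant answering questions about Cameroonian "
    "associations in France. Question: {question}"
)


def run_chain(prompt_template: str, question: str) -> str:
    # Hypothetical placeholder for the real RAG chain called with the new prompt.
    return f"(answer for: {prompt_template.format(question=question)})"


client = Client()  # reads LANGCHAIN_API_KEY from the environment
for example in client.list_examples(dataset_name=DATASET_NAME):
    question = (example.inputs or {}).get("question", "")
    answer = run_chain(NEW_PROMPT, question)
    print(question, "->", answer)
```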
Thanks goes to these wonderful people (emoji key):
- Ghislain TAKAM ✅ 🔣
- pdjiela ✅
- DimitriTchapmi ✅
- GNOKAM ✅ 🔣
- fabiolatagne97 ✅ 🔣
- hsiebenou 🔣
- Flomin TCHAWE 💻 ✅ 🔣
- Bill Metangmo 💻 🔣 🤔
- dimitrilexi 🔣
- ngnnpgn 🔣
- Tchepga Patrick 🔣
This project follows the all-contributors specification. Contributions of any kind are welcome!