We use the text completion functionality of the OpenAI API to condense lengthy Corporate Social Responsibility (CSR) reports and extract relevant information.
To do this, we use the API to create text embeddings for semantic search, then query the API again to summarize the most relevant text snippets. The result is a trace of summaries that leads to one overall summary. Users can quickly get a high-level overview of the report and, if needed, follow the trace back to the original text for details.
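
The semantic search step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `cosine_similarity`, `top_snippets`, and the `embed` callable are hypothetical names, and `embed` stands in for whatever function returns an embedding vector for a piece of text (for example, a call to the OpenAI embeddings endpoint).

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_snippets(
    query: str,
    snippets: List[str],
    embed: Callable[[str], Vector],  # hypothetical: maps text to an embedding
    k: int = 3,
) -> List[Tuple[float, str]]:
    """Return the k snippets most similar to the query, best match first."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(s)), s) for s in snippets]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

The snippets returned here would then be sent to the completion endpoint for summarization. Passing `embed` in as a parameter keeps the ranking logic testable without an API key.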
- common: folder for all common variables such as API keys
: file containing secret keys (DO NOT COMMIT THIS; it is listed in .gitignore)
- file_parser: folder for all parsing needs
: used to smartly split up the sections sent by file_reader.sumarizer.py
: see description below
- file_reader: folder for all reading/formatting needs
: see below
- models: folder for all the models that we use in the project
contains a wrapper class for the OpenAI API.