A tool to extract meaningful information from long CSR reports.

jyaacoub/CSR_summarizer

CSR_summarizer

We use the text completion functionality of the OpenAI API to condense lengthy Corporate Social Responsibility (CSR) reports and extract the relevant information.

To do this, we use the API to create text embeddings for semantic search, then query the API for summaries of the most relevant text snippets. The result is a trace of summaries leading to a final overall summary. Users can quickly get a high-level overview of the report and, if needed, follow the trace back to the original text for the details.
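The semantic search step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `cosine_similarity`, `top_k_snippets`, and `toy_embed` are hypothetical names, and `toy_embed` is a keyword-counting stand-in for a real call to an embeddings API, so the sketch runs without an API key.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_snippets(
    query: str,
    snippets: List[str],
    embed: Callable[[str], Vector],
    k: int = 3,
) -> List[Tuple[float, str]]:
    """Rank snippets by similarity to the query and return the k best.

    `embed` is injected so a real embeddings call can replace the toy one.
    """
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(s)), s) for s in snippets]
    # Highest similarity first; Python's sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in for an embeddings API: counts a few keywords."""
    keywords = ("emissions", "water", "diversity")
    lowered = text.lower()
    return [float(lowered.count(word)) for word in keywords]
```

In use, the top-ranked snippets would then be passed to the summarization step, with each summary keeping a reference to the snippet it came from so the trace can be followed back to the original text.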

Video Overview

Net.Zero.AI.-.OpenAI.Hackathon.Submission.-.112022.mp4

File structure

  • common: folder for all common variables, such as API keys
    • SECRETS.py: file containing secret keys (DO NOT COMMIT THIS; added to gitignore)
  • file_parser: folder for all parsing needs
    • chunker.py: used for smartly splitting up sections that are sent by the file_reader.
    • sumarizer.py: see description below
  • file_reader: for all reading/formatting needs
    • pdf_reader.py: see below
  • models: for all the models that we use in the project
    • open_ai.py: contains a wrapper class for the OpenAI API.
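
As an illustration of the kind of splitting chunker.py is described as doing, here is a minimal sketch. The real splitting logic is not shown in this README, so the paragraph-boundary strategy, the `chunk_text` name, and the `max_chars` parameter are all assumptions.

```python
from typing import List


def chunk_text(text: str, max_chars: int = 1000) -> List[str]:
    """Group paragraphs into chunks of at most max_chars characters.

    Assumption: paragraphs are separated by blank lines. A single paragraph
    longer than max_chars is kept whole rather than cut mid-sentence.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: List[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            # Still fits: keep growing the current chunk.
            current = candidate
        else:
            # Adding this paragraph would overflow: close the chunk, start anew.
            if current:
                chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Splitting on paragraph boundaries keeps each chunk readable on its own, which matters when the chunks are later embedded and summarized individually.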
