File Insights

Built to provide insight into media content.
Explore the docs »
Report Bug · Request Feature

Table of Contents

  1. About The Project
  2. Getting Started
  3. Contributing
  4. License
  5. Contact
  6. Acknowledgements

About The Project

Overview

File Insights takes a media file (.wav, .mp3, .mp4, .pdf, or .png) and surfaces more information about its content. It combines machine learning, natural language processing, and web scraping to produce succinct, translated summaries of Wikipedia articles relevant to the file. NLP algorithms extract named entities and summarize text, so users can quickly learn about the major topics. An image-captioning ML model captions images that contain no text, and the textract library handles every other file type. All of these features (and more!) are integrated into an easy-to-use Flask website.
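
The routing between text extraction and image captioning described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the names describe_file, extract_text, caption_image, and SUPPORTED_SUFFIXES are hypothetical, and the two callables stand in for whatever wraps the textract library and the captioning model in the real application.

```python
from pathlib import Path
from typing import Callable

# File types listed in the overview; the constant name is hypothetical.
SUPPORTED_SUFFIXES = {".wav", ".mp3", ".mp4", ".pdf", ".png"}


def describe_file(
    path: str,
    extract_text: Callable[[str], str],
    caption_image: Callable[[str], str],
) -> str:
    """Return text describing the file's content.

    extract_text and caption_image are injected stand-ins (assumptions)
    for the text-extraction step and the image-captioning model.
    """
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_SUFFIXES:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}")

    text = extract_text(path).strip()

    # Fall back to captioning only for images that yielded no text.
    if suffix == ".png" and not text:
        return caption_image(path)
    return text
```

Passing the two steps in as callables keeps the sketch runnable without any ML dependencies; for example, describe_file("photo.png", lambda p: "", lambda p: "a dog on a beach") falls through to the caption because the image produced no text.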

Algorithm Flowchart

Built With

  • Python
  • TensorFlow
  • NLTK
  • Beautiful Soup
  • Flask

Getting Started

To get a local copy up and running, follow these simple steps.

Prerequisites

  • Verify that Python (version 3.8 or later) is installed. Earlier versions may also work.

    python --version
  • Verify that pip is installed.

    pip --version
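
The version check can also be done from inside Python, which helps when several interpreters are installed and it is unclear which one a command resolves to. The helper below is a small sketch; the function name meets_minimum is hypothetical, and the (3, 8) default simply mirrors the prerequisite stated above.

```python
import sys


def meets_minimum(version_info=sys.version_info, minimum=(3, 8)) -> bool:
    """Return True if the given version is at least `minimum` (major, minor)."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum():
        print(f"Python {current} meets the stated requirement.")
    else:
        print(f"Python {current} is older than 3.8; it may still work.")
```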

Installation and Setup

  1. Clone the repo and move into it
    git clone https://github.com/dgobalak/File-Insights.git
    cd File-Insights
  2. Create a virtual environment
    python -m venv venv
  3. Activate the virtual environment (Windows)
    venv\Scripts\activate
     On macOS or Linux, run this instead:
    source venv/bin/activate
  4. Install dependencies
    pip install -r requirements.txt
  5. Start the Flask app
    python run.py

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

  1. Fork the Project
  2. Create your Feature Branch
    git checkout -b AmazingFeature
  3. Commit your Changes
    git commit -m 'Add some AmazingFeature'
  4. Push to the Branch
    git push origin AmazingFeature
  5. Open a Pull Request and wait for it to be reviewed.

License

Distributed under the Apache 2.0 License. See LICENSE for more information.

Contact

Acknowledgements