Text Preprocessing Project

Welcome to the Text Preprocessing Project! This project is designed to help you preprocess text data using various techniques like tokenization, lemmatization, stemming, and more. It includes a simple Tkinter GUI that allows users to input text and apply different preprocessing functions to it.

Project Structure

- functions/: Python scripts for the individual text preprocessing functions:
  - voices_script.py: For handling different voice processing tasks.
  - clauses.py: For clause-based text processing.
  - tenses.py: For identifying and modifying tenses in text.
  - pos_tagging.py: For part-of-speech tagging.
  - lemmatize_script.py: For lemmatization.
  - stem_script.py: For stemming.
  - tokenize_script.py: For tokenization.
- nltk_data/: NLTK data used by the text processing tasks:
  - corpora/: Includes corpora like stopwords and WordNet.
  - taggers/: Contains the averaged perceptron tagger.
  - tokenizers/: Includes the Punkt tokenizer.
- .gitignore: Specifies files and directories to be ignored by Git.
- Readme.md: This file.
- requirements.txt: Lists the Python packages required to run the project.
- run.py: The main script to run the Tkinter GUI.
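
Since the repository bundles its own nltk_data/ directory, NLTK needs a way to find it. How the project itself does this is not documented here, so the following is only a minimal sketch of one option. It assumes the calling script sits in the project root next to nltk_data/, and the helper name `use_bundled_nltk_data` is hypothetical. It relies on NLTK reading the `NLTK_DATA` environment variable when it is imported.

```python
import os
from pathlib import Path


def use_bundled_nltk_data(project_root: Path) -> Path:
    """Put <project_root>/nltk_data first in the NLTK_DATA search path.

    Hypothetical helper, not part of the project. Call it before
    `import nltk`, because NLTK reads NLTK_DATA at import time.
    """
    data_dir = project_root / "nltk_data"
    existing = os.environ.get("NLTK_DATA", "")
    parts = [str(data_dir)] + [p for p in existing.split(os.pathsep) if p]
    # Drop duplicates while keeping order, so repeated calls are harmless.
    os.environ["NLTK_DATA"] = os.pathsep.join(dict.fromkeys(parts))
    return data_dir
```

A script in the project root could call `use_bundled_nltk_data(Path(__file__).resolve().parent)` before importing NLTK. If NLTK is already imported, appending the same directory to the `nltk.data.path` list is an alternative.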

Features

- Tokenization: Breaks down text into words or sentences.
- Lemmatization: Converts words to their base or dictionary form.
- Stemming: Reduces words to their root form.
- POS Tagging: Tags parts of speech for each word in the text.
- Tense Handling: Identifies and modifies tenses in the text.
- Clause Processing: Analyzes and processes clauses within the text.
- Voice Processing: Handles different voice-related text transformations.
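
To make the first few features concrete, here is a small standard-library sketch of word tokenization, sentence tokenization, and stemming. These are deliberately naive stand-ins written for illustration: they are not the project's scripts, and they are not NLTK's Punkt tokenizer or any established stemming algorithm. All function names are hypothetical.

```python
import re


def tokenize_words(text: str) -> list[str]:
    """Split text into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def tokenize_sentences(text: str) -> list[str]:
    """Split after '.', '!' or '?' when followed by whitespace (naive rule)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def naive_stem(word: str) -> str:
    """Strip one common English suffix; the result need not be a real word."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word
```

With this sketch, `naive_stem("running")` returns `"runn"`, which shows the difference between the two features: stemming only trims the word, whereas lemmatization aims for a dictionary form such as "run".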

Installation

To get started with this project, follow these steps:

Clone the Repository

```
git clone https://github.com/SumitRajam/Text_Preprocessor.git
cd Text_Preprocessor
```

Install Dependencies

Create a virtual environment and install the required packages:

```
python -m venv venv
source venv/bin/activate   # On Windows use `venv\Scripts\activate`
pip install -r requirements.txt
```

Usage

Run the Tkinter GUI
Start the application by running:
```
python run.py
```
The Tkinter GUI will open, allowing you to input text and apply various preprocessing functions.

Using the GUI

Input Text: Enter the text you want to preprocess.
Select a Function: Choose a preprocessing function from the available buttons.
View Output: Click the selected button to apply the function and view the results.
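
The three steps above map onto a simple Tkinter layout: an input box, one button per function, and an output box. The sketch below illustrates that pattern only and is not the project's run.py. The names `FUNCTIONS`, `apply_function`, and `build_app` are hypothetical, and the two registered functions are placeholders rather than the real preprocessing scripts.

```python
# Placeholder registry: button label -> function taking and returning a string.
FUNCTIONS = {
    "Tokenize": lambda text: " | ".join(text.split()),
    "Lowercase": str.lower,
}


def apply_function(name: str, text: str) -> str:
    """Run the preprocessing function registered under `name` on `text`."""
    return FUNCTIONS[name](text)


def build_app():
    """Build a window with an input box, one button per function, and an output box."""
    import tkinter as tk  # imported here so apply_function works without a display

    root = tk.Tk()
    root.title("Text preprocessing sketch")
    input_box = tk.Text(root, height=6, width=60)
    input_box.pack()
    output_box = tk.Text(root, height=6, width=60)

    def on_click(name: str) -> None:
        # "end-1c" skips the newline Tkinter keeps at the end of a Text widget.
        text = input_box.get("1.0", "end-1c")
        output_box.delete("1.0", tk.END)
        output_box.insert(tk.END, apply_function(name, text))

    for name in FUNCTIONS:
        # Bind the label as a default argument so each button keeps its own name.
        tk.Button(root, text=name, command=lambda n=name: on_click(n)).pack()
    output_box.pack()
    return root
```

To try the sketch, call `build_app().mainloop()` on a machine with a display. Keeping `apply_function` separate from the widgets means the text logic can be tested without opening a window.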

Contributing

We welcome contributions to enhance this project! If you'd like to contribute, please follow these steps:

Fork the Repository: Click the "Fork" button on the top-right corner of this page.
Create a Branch: Create a new branch for your feature or bug fix.
```
git checkout -b feature/your-feature-name
```

Make Changes: Implement your changes and commit them.

```
git add .
git commit -m "Add your commit message"
```

Push Changes: Push your changes to your forked repository.
```
git push origin feature/your-feature-name
```
Create a Pull Request: Go to the original repository and create a pull request from your forked repository.

Happy preprocessing! ⚙️💬🧠🖥️
