# Text Preprocessing Project

Welcome to the Text Preprocessing Project! This project helps you preprocess text data using techniques such as tokenization, lemmatization, stemming, and part-of-speech tagging. It includes a simple Tkinter GUI that lets users enter text and apply different preprocessing functions to it.

## Project Structure

- **Functions/**: Python scripts for the text preprocessing functions:
  - `voices_script.py`: For voice-related text transformations.
  - `clauses.py`: For clause-based text processing.
  - `tenses.py`: For identifying and modifying tenses in text.
  - `pos_tagging.py`: For part-of-speech tagging.
  - `lemmatize_script.py`: For lemmatization.
  - `stem_script.py`: For stemming.
  - `tokenize_script.py`: For tokenization.
- **nltk_data/**: NLTK data used by the text processing tasks:
  - **corpora/**: Corpora such as stopwords and WordNet.
  - **taggers/**: The averaged perceptron tagger.
  - **tokenizers/**: The Punkt tokenizer.
- **.gitignore**: Specifies files and directories to be ignored by Git.
- **Readme.md**: This file.
- **requirements.txt**: Lists the Python packages required to run the project.
- **run.py**: The main script to run the Tkinter GUI.
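
Because the repository bundles its own `nltk_data/` folder, NLTK has to be told where to look for it. The sketch below shows one way that could be done, by adding the folder to NLTK's `nltk.data.path` search list. The helper names `bundled_nltk_data_dir` and `register_bundled_nltk_data` are hypothetical, and whether `run.py` does this (or relies on the `NLTK_DATA` environment variable instead) is an assumption.

```python
from pathlib import Path


def bundled_nltk_data_dir(project_root):
    """Return the path of the nltk_data/ folder inside the project root."""
    return Path(project_root).resolve() / "nltk_data"


def register_bundled_nltk_data(project_root):
    """Put the bundled folder first in NLTK's search path.

    Returns False when NLTK is not installed, True otherwise.
    """
    data_dir = str(bundled_nltk_data_dir(project_root))
    try:
        import nltk  # third-party package, assumed to be in requirements.txt
    except ImportError:
        return False
    # nltk.data.path is a plain list of directories that NLTK searches in order.
    if data_dir not in nltk.data.path:
        nltk.data.path.insert(0, data_dir)
    return True
```

Calling `register_bundled_nltk_data(".")` from the project root before any tokenizer or tagger is loaded would let NLTK find the bundled corpora without a separate download step.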

## Features

- **Tokenization**: Breaks text down into words or sentences.
- **Lemmatization**: Converts words to their base or dictionary form.
- **Stemming**: Reduces words to a stem by stripping affixes; the result is not always a dictionary word.
- **POS Tagging**: Tags each word in the text with its part of speech.
- **Tense Handling**: Identifies and modifies tenses in the text.
- **Clause Processing**: Analyzes and processes clauses within the text.
- **Voice Processing**: Handles different voice-related text transformations.
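
To make the first few features concrete, here is a minimal standard-library sketch of sentence tokenization, word tokenization, and stemming. It is not the project's implementation: the scripts in `Functions/` presumably use NLTK (an assumption based on the bundled `nltk_data/` folder), and the function names and the suffix list below are purely illustrative.

```python
import re


def sentence_tokenize(text):
    """Split text on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def word_tokenize(text):
    """Return runs of word characters and individual punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def naive_stem(word):
    """Strip one common English suffix; a crude stand-in for a real stemmer."""
    lowered = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words like "is" survive.
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= 3:
            return lowered[: -len(suffix)]
    return lowered


sentences = sentence_tokenize("Cats are running. Dogs jumped!")
words = word_tokenize(sentences[1])       # ['Dogs', 'jumped', '!']
stems = [naive_stem(w) for w in words]    # ['dog', 'jump', '!']
```

Note that `naive_stem("running")` yields `"runn"`, which is not a dictionary word. That gap is what lemmatization closes, but it needs a vocabulary such as WordNet and so is not reproduced in this sketch.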

## Installation

To get started with this project, follow these steps:

1. **Clone the Repository**
   ```bash
   git clone https://github.com/SumitRajam/Text_Preprocessor.git
   cd Text_Preprocessor
   ```
2. **Install Dependencies**\
   Create a virtual environment and install the required packages:
   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows use `venv\Scripts\activate`
   pip install -r requirements.txt
   ```

## Usage
1. **Run the Tkinter GUI**\
   Start the application by running:
   ```bash
   python run.py
   ```
   The Tkinter GUI will open, allowing you to input text and apply various preprocessing functions.

2. **Using the GUI**
   - **Input Text:** Enter the text you want to preprocess.
   - **Select a Function:** Choose a preprocessing function from the available buttons.
   - **View Output:** Click the selected button to apply the function and view the results.
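
The input, button, and output flow described above could be wired up roughly as follows. This is a sketch under stated assumptions, not the contents of `run.py`: the `FUNCTIONS` registry, the two placeholder functions, and the widget layout are all hypothetical.

```python
def to_tokens(text):
    """Placeholder function: whitespace tokens joined with a visible separator."""
    return " | ".join(text.split())


def word_count(text):
    """Placeholder function: number of whitespace-separated tokens."""
    return str(len(text.split()))


# Hypothetical registry: button label -> function that takes and returns a string.
FUNCTIONS = {"Tokenize": to_tokens, "Word count": word_count}


def apply_function(name, text):
    """Look up the function behind a button label and apply it to the text."""
    return FUNCTIONS[name](text)


def build_gui():
    """Create the window: an input box, one button per function, an output box."""
    import tkinter as tk  # imported here so the helpers above work without a display

    root = tk.Tk()
    root.title("Text Preprocessing (sketch)")
    input_box = tk.Text(root, height=6, width=60)
    input_box.pack()
    output_box = tk.Text(root, height=6, width=60)

    def run(name):
        text = input_box.get("1.0", "end-1c")  # everything except the trailing newline
        output_box.delete("1.0", tk.END)
        output_box.insert(tk.END, apply_function(name, text))

    for name in FUNCTIONS:
        # Bind the label as a default argument so each button keeps its own name.
        tk.Button(root, text=name, command=lambda n=name: run(n)).pack()
    output_box.pack()
    return root
```

To try it, call `build_gui().mainloop()`. Keeping the text functions separate from the widgets means they can be tested without opening a window.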

## Contributing
We welcome contributions to enhance this project! If you'd like to contribute, please follow these steps:
1. **Fork the Repository:** Click the "Fork" button in the top-right corner of the repository page.
2. **Create a Branch:** Create a new branch for your feature or bug fix.
   ```bash
   git checkout -b feature/your-feature-name
   ```
3. **Make Changes:** Implement your changes and commit them.
   ```bash
   git add .
   git commit -m "Add your commit message"
   ```
4. **Push Changes:** Push your changes to your forked repository.
   ```bash
   git push origin feature/your-feature-name
   ```
5. **Create a Pull Request:** Go to the original repository and create a pull request from your forked repository.

# Happy preprocessing! ⚙️💬🧠🖥️