This guide provides detailed instructions on how to set up and run the code for the technical challenge. It includes exercises 1.1, 1.2, 2.1, and 2.2.
- Python 3.8 or higher installed.
- Access to the terminal or command prompt.
- Download the required dataset from this link and place it in the Part2 folder with the name Software.json.
- Clone the Repository: Clone or download the code repository to your local machine.
- pandas
- beautifulsoup4 (bs4)
- nltk
- matplotlib
- numpy
- transformers
- scikit-learn
pip3 install pandas
pip3 install beautifulsoup4
pip3 install nltk
pip3 install matplotlib
pip3 install numpy
pip3 install transformers
pip3 install scikit-learn
This will install all the necessary libraries and dependencies required to run the code.
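After installing, it can be useful to confirm that every library is importable before running the exercises. The following is a minimal sketch, not part of the repository: the names REQUIRED and find_missing are made up for illustration. It relies on the fact that two packages are imported under a different name than the one used with pip (beautifulsoup4 is imported as bs4, scikit-learn as sklearn).

```python
import importlib.util

# Map each pip package name to the module name used in `import` statements.
# Two of them differ: beautifulsoup4 -> bs4, scikit-learn -> sklearn.
REQUIRED = {
    "pandas": "pandas",
    "beautifulsoup4": "bs4",
    "nltk": "nltk",
    "matplotlib": "matplotlib",
    "numpy": "numpy",
    "transformers": "transformers",
    "scikit-learn": "sklearn",
}


def find_missing(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

Any name reported as missing can be installed with the matching pip3 install command listed above.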
- Update File Paths: Change the file paths in all the scripts, replacing /Users/joaovasco/Desktop/Part I/ with your local directory path.
The Part I folder contains the following files:
Bert_results.txt
comments.json
ideas.json
PreTrainedPortugueseAnalysis.py
innovation.py
innovation12bert.py
cluster_descriptions.txt
innovation_ideas.txt
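One way to make the path update less error-prone is to define the folder once at the top of each script and build every file path from it. This is only a sketch under assumptions: BASE_DIR and data_path are hypothetical names that do not appear in the repository, and it is assumed here that the scripts read comments.json and ideas.json from the Part I folder.

```python
from pathlib import Path

# Hypothetical constant: the folder that holds the Part I files.
# By default this is the folder the script is saved in; in an interactive
# session, where __file__ is not defined, it falls back to the current directory.
BASE_DIR = Path(__file__).resolve().parent if "__file__" in globals() else Path.cwd()


def data_path(filename, base_dir=BASE_DIR):
    """Build the full path to a file inside the project folder."""
    return Path(base_dir) / filename


# Example: paths to two of the files listed for Part I.
comments_file = data_path("comments.json")
ideas_file = data_path("ideas.json")
```

With this approach, moving the project to another machine would require changing one line (or none, if the data files sit next to the script) rather than editing every path.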
- Running Scripts:
  - To run the solution for 1.1 (innovation.py): python3 innovation.py
  - To run the solution for 1.2 (innovation12bert.py): python3 innovation12bert.py
Before running the scripts in Part II, change to the Part2 directory:
cd Part2
The Part2 folder contains:
21.py
22.py
Software.json
- Place the Dataset: Ensure Software.json is in the Part2 folder.
- Running Scripts:
  - To run the solution for 2.1 (21.py): python3 21.py
  - To run the solution for 2.2 (22.py): python3 22.py
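Since both Part II scripts depend on the dataset, a quick check that Software.json is in place can save a failed run. The sketch below is an illustration only (dataset_ready is a hypothetical helper, not part of the repository). It checks that the file exists and is not empty, and makes no assumption about the file's internal format.

```python
from pathlib import Path


def dataset_ready(folder=".", filename="Software.json"):
    """Return True if the dataset file exists in `folder` and is not empty."""
    path = Path(folder) / filename
    return path.is_file() and path.stat().st_size > 0


if __name__ == "__main__":
    # Intended to be run from inside the Part2 directory.
    if dataset_ready():
        print("Software.json found.")
    else:
        print("Software.json is missing or empty; place it in the Part2 folder.")
```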
- Don't hesitate to email me if you have further questions :)