Extraction of Skills

Extracts skills from resumes using NLP and machine learning techniques, with Word2Vec from gensim providing the word embeddings.

Description

The resume text is first cleaned with NLP methods such as tokenization and stopword removal, and word embeddings are then trained on it with Word2Vec from gensim. These embeddings are grouped into K clusters using the K-Means algorithm. Some of the K clusters contain skills (technical, non-technical, and soft skills).
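The steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the notebooks: the function names `clean_text` and `cluster_words`, the tiny stopword set, the sample resume strings, and the parameter values (`window=3`, `k=2`) are all assumptions made for the example. The project lists nltk, whose stopword corpus would be the natural replacement for the hard-coded set. Only Word2Vec keyword arguments that exist in both gensim 3.x and 4.x are used, so the embedding size is left at the library default.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword set (an assumption for this sketch);
# nltk's stopword corpus would normally be used instead.
STOPWORDS = {"a", "an", "and", "in", "of", "the", "to", "with", "for", "on"}


def clean_text(text):
    """Lowercase the text, tokenize it, and drop stopwords."""
    # Keep '+' and '#' so tokens such as "c++" and "c#" survive.
    tokens = re.findall(r"[a-z][a-z0-9+#]*", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def cluster_words(sentences, k=3, seed=0):
    """Train Word2Vec on tokenized sentences and group the words with K-Means."""
    # Imported here so the cleaning step works without these packages.
    from gensim.models import Word2Vec
    from sklearn.cluster import KMeans

    # min_count=1 keeps every word, which suits a toy corpus only.
    model = Word2Vec(sentences, min_count=1, window=3, seed=seed, workers=1)

    # Build the vocabulary ourselves to avoid version-specific attributes.
    vocab = sorted({token for sentence in sentences for token in sentence})
    vectors = [model.wv[word] for word in vocab]

    labels = KMeans(n_clusters=k, random_state=seed, n_init=10).fit_predict(vectors)

    clusters = defaultdict(list)
    for word, label in zip(vocab, labels):
        clusters[int(label)].append(word)
    return dict(clusters)


if __name__ == "__main__":
    # Hypothetical resume snippets, for illustration only.
    resumes = [
        "Experienced in Python and machine learning with pandas",
        "Strong communication and teamwork skills",
        "Built data pipelines in Python and SQL",
    ]
    sentences = [clean_text(r) for r in resumes]
    for label, words in sorted(cluster_words(sentences, k=2).items()):
        print(label, words)
```

On a corpus this small the clusters are not meaningful; with a real resume collection, the clusters would still need to be inspected by hand to decide which ones contain skills.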

Prerequisites

Software

PyPDF2 1.26.0
doc2text 0.2.4
textract 1.6.3
python-docx 0.8.10
pdfminer3 2018.12.3.0
nltk 3.5
pandas 1.0.3
wordcloud 1.7.0
matplotlib 3.2.1
gensim 3.8.3
scikit-learn 0.22.2.post1
python 3.8.2

Dataset

The dataset for this project is currently collected from: https://github.com/JAIJANYANI/Automated-Resume-Screening-System
