
Natural Language Processing

Preprocessing Data:

  1. Tokenizing: Splitting text into words or sentences.
  2. Stop Words: Removing very common words (e.g. "the", "a", "is") that carry little meaning.
  3. Stemming: Reducing words to their root form by stripping suffixes (e.g. "running" → "run").
  4. Lemmatization: Like stemming, but maps each word to a valid dictionary form (lemma), so it handles irregular words better than stemming.
  5. Part-of-Speech Tagging: Making tuples of words with their tags (nouns, adverbs, adjectives, etc.).
  6. Chunking: Grouping words into phrases based on their part-of-speech tags (e.g. noun phrases).
  7. Chinking: Like chunking, but defined by selecting everything and then removing (chinking out) certain kinds of tags.
  8. Named Entity Recognition: Identifying named entities (people, places, organizations) in text; an alternative to chunking/chinking.
  9. Wordnet: A lexical database used to find synonyms, antonyms, and definitions of words. Also used to measure similarity between words.
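Steps 1–3 above can be sketched in plain Python. This is a simplified illustration, not a real implementation: the stop-word list and suffix rules here are toy assumptions, and in practice a library such as NLTK provides proper tokenizers, stop-word corpora, and stemmers.

```python
import re

# Toy stop-word list (an assumption for illustration; NLTK ships a much fuller one).
STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to", "in", "are"}

def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]

def stem(token):
    """Naive suffix-stripping stemmer (a rough sketch of what a real
    stemmer like NLTK's PorterStemmer does far more carefully)."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

tokens = tokenize("The cats are chasing the mice in the garden")
filtered = remove_stop_words(tokens)
stems = [stem(t) for t in filtered]
print(stems)
```

Each step shrinks or normalizes the token list, so later stages (tagging, chunking) work on cleaner input.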
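The difference in step 4 (why lemmatization handles irregular words better than stemming) can be shown with a tiny hand-built lemma dictionary. The dictionary here is an assumption purely for illustration; a real lemmatizer such as NLTK's WordNetLemmatizer looks words up in WordNet instead.

```python
# Tiny hand-built lemma table (an assumption for illustration only).
LEMMAS = {"better": "good", "mice": "mouse", "ran": "run", "geese": "goose"}

def naive_stem(token):
    """Suffix stripping: fails on irregular forms like 'mice' or 'better'."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def lemmatize(token):
    """Dictionary lookup returns a valid word (lemma), falling back to the token."""
    return LEMMAS.get(token, token)

print(naive_stem("mice"), "->", lemmatize("mice"))      # stemming misses the irregular plural
print(naive_stem("better"), "->", lemmatize("better"))  # lemmatization maps to the base word
```

A stemmer only chops suffixes, so "mice" and "better" pass through unchanged, while the lemmatizer maps them to the real base words "mouse" and "good".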
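Steps 5–8 all revolve around (word, tag) tuples. A minimal sketch of the grouping idea, with hand-assigned tags (an assumption for illustration; a real tagger such as NLTK's pos_tag would produce these, and NLTK's RegexpParser does chunking/chinking with tag-pattern grammars):

```python
# Hand-tagged sentence: (word, part-of-speech tag) tuples, as step 5 produces.
tagged = [("Barack", "NNP"), ("Obama", "NNP"), ("visited", "VBD"),
          ("the", "DT"), ("White", "NNP"), ("House", "NNP")]

def chunk(tagged, keep=frozenset({"NNP"})):
    """Chunking: group maximal runs of kept tags into phrases."""
    phrases, current = [], []
    for word, tag in tagged:
        if tag in keep:
            current.append(word)
        elif current:
            phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases

def chink(tagged, remove=frozenset({"DT", "VBD"})):
    """Chinking: start from all tags, then carve the unwanted ones out."""
    keep = {tag for _, tag in tagged} - remove
    return chunk(tagged, keep=keep)

# Grouping consecutive proper nouns is also a toy stand-in for what a real
# named-entity recognizer (step 8) does with far richer features.
print(chunk(tagged))
print(chink(tagged))
```

Here chunking selects tags in, chinking selects tags out, and both arrive at the same proper-noun phrases ("Barack Obama", "White House") from opposite directions.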