description
⚠️ THIS IS A WORK IN PROGRESS

Count words, n-grams, shingles

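library(dplyr)    # %>%, anti_join(), filter(), count()
library(stringr)  # str_detect()
library(tidytext) # unnest_tokens(), get_stopwords()

# Count how often each word appears on each page (one row per url/word pair)
top_words <- all_full_txt %>%
  unnest_tokens(word, txt) %>%                  # one lowercase word per row
  anti_join(get_stopwords(), by = "word") %>%   # drop common stop words
  filter(!str_detect(word, "[0-9]")) %>%        # drop tokens containing digits
  count(url, word, sort = FALSE)

# View() stays outside the pipe so that top_words keeps the counts, not NULL
View(top_words)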
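
The title also mentions n-grams and shingles, which the snippet above does not cover yet. A minimal sketch of how they might be counted with the same approach, assuming tidytext's `unnest_tokens()` with `token = "ngrams"` and `token = "character_shingles"`. The toy `all_full_txt` tibble and the window sizes (2 words, 5 characters) are hypothetical stand-ins, not values from the original data:

```r
library(dplyr)
library(tidytext)

# Hypothetical stand-in for all_full_txt: one row per page with url and txt columns
all_full_txt <- tibble(
  url = c("https://example.com/a", "https://example.com/b"),
  txt = c("the quick brown fox jumps", "the quick brown dog sleeps")
)

# Word bigrams: every run of two consecutive words, counted per page
top_bigrams <- all_full_txt %>%
  unnest_tokens(bigram, txt, token = "ngrams", n = 2) %>%
  filter(!is.na(bigram)) %>%   # texts shorter than n words produce NA
  count(url, bigram, sort = TRUE)

# Character shingles: overlapping 5-character windows, counted per page
top_shingles <- all_full_txt %>%
  unnest_tokens(shingle, txt, token = "character_shingles", n = 5) %>%
  count(url, shingle, sort = TRUE)
```

Stop words are left in here because removing them before building n-grams changes which words count as adjacent; whether that is desirable depends on the analysis.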