add sub-category information of the KEGG pathways #5

Open
GuangchuangYu opened this issue May 8, 2023 · 6 comments

Comments

@GuangchuangYu
Member

KEGG pathways are grouped into 7 top-level categories, each with its own sub-categories; see https://www.genome.jp/kegg/pathway.html.

It is easy to incorporate this information into the enrichKEGG() and gseKEGG() results, so that it can be used to filter the results or to distinguish the pathways in visualizations.

reference: https://mp.weixin.qq.com/s/17ujVhcrkX1DLsUJBtUGEw.
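
As a rough illustration of the filtering idea above, here is a minimal sketch on a toy table. The `category` column name is hypothetical (it is not an existing column of the enrichKEGG() output), and the `p.adjust` values are made up for the example.

```r
# Toy stand-in for an enrichment result table; the p.adjust values are
# invented and the `category` column is a hypothetical addition.
res <- data.frame(
  ID = c("mmu04380", "mmu04110", "mmu00010"),
  Description = c("Osteoclast differentiation", "Cell cycle",
                  "Glycolysis / Gluconeogenesis"),
  p.adjust = c(0.001, 0.02, 0.04),
  category = c("Organismal Systems", "Cellular Processes", "Metabolism"),
  stringsAsFactors = FALSE
)

# Filter: keep only the pathways in one top-level category.
metabolic <- res[res$category == "Metabolism", ]

# Differentiate: group pathway IDs by category, e.g. to facet or color a plot.
ids_by_category <- split(res$ID, res$category)

print(metabolic$ID)
print(names(ids_by_category))
```

The same subsetting would apply to a real result table once such a column exists.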

@Potato-tudou

Potato-tudou commented May 12, 2023

A function wrapping the KEGG REST API (https://rest.kegg.jp/get/) can fetch the classification of a KEGG term from its ID. The advantage of this approach is that the classification is always up to date. It works, but it is not very efficient, because it sends one request per term. Handling the lookup locally would still be more elegant.

library(httr)      # provides GET()
library(magrittr)  # provides %>%

# Fetch the KEGG flat-file entry for a pathway ID and extract its CLASS field.
class_t <- function(term) {
    res <- paste0("https://rest.kegg.jp/get/", term) %>% GET()
    res_info <- res$content %>% rawToChar()
    # Keep only the text between the CLASS and PATHWAY_MAP field labels.
    class_res <- gsub('^.*CLASS\\s*|\\s*PATHWAY_MAP.*$', '', res_info)
    return(class_res)
}
class_t("mmu04380")
[1] "Organismal Systems; Development and regeneration"

# The classification of every KEGG term in an enrichKEGG() result can then be retrieved with:
lapply(kegg_res@result$ID, class_t)
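
The regular expression above assumes that PATHWAY_MAP immediately follows CLASS in the entry. A line-based parse is one possible alternative; the sketch below runs offline on an abbreviated, illustrative sample of a flat-file entry (not a verbatim KEGG response), and `parse_kegg_class` is a hypothetical helper name.

```r
# Abbreviated, illustrative flat-file entry (not a verbatim KEGG response).
sample_entry <- paste(
  "ENTRY       mmu04380                    Pathway",
  "NAME        Osteoclast differentiation - Mus musculus (house mouse)",
  "CLASS       Organismal Systems; Development and regeneration",
  "PATHWAY_MAP mmu04380  Osteoclast differentiation",
  sep = "\n"
)

# Hypothetical helper: split the CLASS field into category and sub-category.
parse_kegg_class <- function(entry_text) {
  lines <- strsplit(entry_text, "\n", fixed = TRUE)[[1]]
  class_line <- grep("^CLASS\\s+", lines, value = TRUE)
  if (length(class_line) == 0) {
    # No CLASS field in this entry: return NA for both levels.
    return(c(category = NA_character_, subcategory = NA_character_))
  }
  value <- sub("^CLASS\\s+", "", class_line[1])
  parts <- trimws(strsplit(value, ";", fixed = TRUE)[[1]])
  c(category = parts[1], subcategory = parts[2])
}

kegg_class <- parse_kegg_class(sample_entry)
print(kegg_class)
```

Returning the two levels separately makes the sub-category directly available, which is what the issue title asks for.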

@Potato-tudou

Potato-tudou commented May 14, 2023

The category of a given KEGG term can now be extracted using the table at the referenced URL ("https://pathview.uncc.edu/data/khier.tsv").

'''
k.info <- read.table("https://pathview.uncc.edu/data/khier.tsv", header = T) %>%
separate(pathway, c("ID","Description"), extra = "merge",fill = "right")

getKEGG_cat <- function(ID, k_info) {
cleanID <- function (id_num) {
gsub("[a-z]", "", id_num)
}
inputID <- cleanID(ID)
k_info[(k_info$ID == inputID),]$category
}

getKEGG_cat(ID = "mmu04380", k_info = k.info)
[1] "Organismal Systems"

'''

@GuangchuangYu
Member Author

@Potato-tudou please learn how to format your code first.

Refer to point 2 mentioned by Yonghe, #1 (comment).

@Potato-tudou

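library(magrittr)  # provides %>%
library(tidyr)     # provides separate()

# Load the KEGG hierarchy table and split "pathway" into ID and Description.
k.info <- read.table("https://pathview.uncc.edu/data/khier.tsv", header = TRUE) %>%
  separate(pathway, c("ID", "Description"), extra = "merge", fill = "right")

# Return the top-level category for a KEGG pathway ID such as "mmu04380".
getKEGG_cat <- function(ID, k_info) {
  # Drop the organism prefix (lowercase letters), leaving the numeric map ID.
  cleanID <- function(id_num) {
    gsub("[a-z]", "", id_num)
  }
  inputID <- cleanID(ID)
  k_info[(k_info$ID == inputID), ]$category
}

getKEGG_cat(ID = "mmu04380", k_info = k.info)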

@Potato-tudou

I think it's better to use the result of enrichKEGG() as an input. So here it is:

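# Requires magrittr (%>%) and tidyr (separate()).
listKEGG_cat <- function(enrich_res) {
  # Load the KEGG hierarchy table and split "pathway" into ID and Description.
  k.info <- read.table("https://pathview.uncc.edu/data/khier.tsv", header = TRUE) %>%
    separate(pathway, c("ID", "Description"), extra = "merge", fill = "right")
  getKEGG_cat <- function(ID, k_info) {
    # Drop the organism prefix (lowercase letters), leaving the numeric map ID.
    cleanID <- function(id_num) {
      gsub("[a-z]", "", id_num)
    }
    inputID <- cleanID(ID)
    k_info[(k_info$ID == inputID), ]$category
  }
  # Look up the category of every pathway ID in the enrichKEGG() result.
  lapply(enrich_res@result$ID, getKEGG_cat, k_info = k.info) %>% unlist()
}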
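
One caveat with the lapply() and unlist() combination above: an ID that has no row in the table yields a zero-length result, which unlist() drops, so the output can end up shorter than the input. A vectorized lookup with match() may be a safer variant. The sketch below uses a small in-memory table in place of khier.tsv, and `kegg_categories` is a hypothetical name.

```r
# Small in-memory stand-in for the hierarchy table (assumed columns: ID, category).
k_info <- data.frame(
  ID = c("04380", "04110", "00010"),
  category = c("Organismal Systems", "Cellular Processes", "Metabolism"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: vectorized lookup that returns NA for unknown IDs,
# so the output always has the same length and order as the input.
kegg_categories <- function(ids, k_info) {
  map_ids <- sub("^[a-z]+", "", ids)  # strip the organism prefix, e.g. "mmu"
  k_info$category[match(map_ids, k_info$ID)]
}

ids <- c("mmu04380", "mmu99999", "mmu00010")
cats <- kegg_categories(ids, k_info)
print(cats)
```

Because the result is aligned with the input, it could be assigned directly as a new column of a result table.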

@GuangchuangYu
Member Author

see also YuLab-SMU/clusterProfiler#236.
