section 4.1 what do you mean "Each broad domain and its sub-domains share seed terms" #2

Open
lihuiliullh opened this issue Feb 13, 2022 · 5 comments


@lihuiliullh

Suppose there are two seed terms "A" and "B", given the hierarchy CS -> AI -> ML. Do you mean that "A" and "B" connect to "CS", "AI", and "ML" at the same time?

image

@jeffhj
Owner

jeffhj commented Feb 14, 2022

It is not that "A" and "B" connect to "CS", "AI", and "ML" at the same time. Rather, in our setting, CS/AI/ML share the same set of seed terms (because we use the same domain corpus to extract the terms that initialize the graph). You may also refer to the data statistics in Table 1, where #terms is the number of "seed terms".

In this paper, we use "seed terms" to refer to a large set of terms extracted from the corpus. This may differ from the usage in graph mining, where seeds are typically a small number of nodes used to initialize an algorithm.
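
As a rough illustration of what "sharing seed terms" means here, the sketch below uses a handful of made-up terms. The names `seed_terms` and `graph_nodes` are hypothetical and are not taken from the repository.

```python
# Hypothetical toy terms standing in for the large set extracted from one domain corpus.
seed_terms = frozenset({"neural network", "compiler", "gradient descent", "database index"})

# Every level of the hierarchy CS -> AI -> ML starts from the same extracted term set.
hierarchy = ["CS", "AI", "ML"]
graph_nodes = {domain: seed_terms for domain in hierarchy}

# The term set is identical at each level; it is not narrowed for sub-domains.
for domain in hierarchy:
    print(domain, len(graph_nodes[domain]))
```

In this sketch the term set is the same object at every level, so any difference between CS, AI, and ML would have to come from something other than the set of seed terms.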

@lihuiliullh
Author

lihuiliullh commented Feb 14, 2022

Thanks. I have another question.
In section 3.3, each core term has one or several categories, and all these categories can form a category tree. Does this mean all the core terms are from the same large area, like "Computer Science" or "Physics"?
In the picture below, "for a given domain", does the "domain" here mean a small domain within the large area?

For example, in the "computer science" dataset, the root of the tree is "subfields of computer science". Suppose I want to find information about "deep learning", i.e., the given domain is "deep learning". Then in the sentence "For a given domain, we can first traverse from a root category and collect some gold subcategories.", does the root category mean "subfields of computer science" or "deep learning"?

image

@jeffhj
Owner

jeffhj commented Feb 15, 2022

Yes. Your understanding is correct. The root category here means "Category:Subfields of computer science" (https://en.wikipedia.org/wiki/Category:Subfields_of_computer_science) or "Category:Machine learning" (https://en.wikipedia.org/wiki/Category:Machine_learning)
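
The traversal from a root category could be sketched roughly as follows. This is only an illustration: the `CATEGORY_TREE` dictionary is a hypothetical toy tree (its parent/child links are not checked against Wikipedia), and the `max_depth` cutoff is an assumption, not a setting confirmed in this thread.

```python
from collections import deque

# Hypothetical toy category tree (parent -> subcategories).
CATEGORY_TREE = {
    "Category:Subfields of computer science": [
        "Category:Artificial intelligence",
        "Category:Databases",
    ],
    "Category:Artificial intelligence": ["Category:Machine learning"],
    "Category:Machine learning": ["Category:Deep learning"],
    "Category:Databases": [],
    "Category:Deep learning": [],
}


def collect_subcategories(root, tree, max_depth=2):
    """Breadth-first traversal from `root`; returns the root plus all
    categories reachable within `max_depth` levels (assumed cutoff)."""
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        category, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for child in tree.get(category, []):
            if child not in seen:
                seen.add(child)
                queue.append((child, depth + 1))
    return seen


# The root depends on the domain: a broad domain starts higher in the tree.
cs_gold = collect_subcategories("Category:Subfields of computer science", CATEGORY_TREE)
ml_gold = collect_subcategories("Category:Machine learning", CATEGORY_TREE)
```

Changing the root changes which subcategories are collected, which is why the root is chosen per domain.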

@lihuiliullh
Author

lihuiliullh commented Feb 16, 2022

May I know whether the label here is a boolean value (e.g., 1 means related and 0 means unrelated) or a scalar (e.g., 0.8, 0.7)?
If the domain changes, do all the terms need to be relabeled?

Also, in the paragraph above Equation (7), "all the core terms are labeled at each level of the hierarchy". Can a term belong to several different hierarchies at the same time?

image

@jeffhj
Owner

jeffhj commented Feb 16, 2022

Yes. The label here is a boolean value.
Yes. All the terms need to be relabeled for a new domain (in our paper, we introduce an automatic approach to do this).
Yes. A term can belong to several different hierarchies at the same time.
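
To make the three answers concrete, here is a minimal sketch with made-up data. The labeling rule used (a term is positive for a domain if any of its categories falls in that domain's gold category set) is an assumption for illustration, and `TERM_CATEGORIES`, `GOLD`, and `label_terms` are hypothetical names, not the repository's API.

```python
# Hypothetical core terms, each with one or several categories.
TERM_CATEGORIES = {
    "backpropagation": {"machine learning", "optimization"},
    "b-tree": {"databases"},
}

# Hypothetical gold category sets, one per domain.
GOLD = {
    "CS": {"machine learning", "optimization", "databases"},
    "ML": {"machine learning"},
}


def label_terms(term_categories, gold_categories):
    """Return a boolean label per term: True if any of the term's
    categories is in the gold set (assumed rule, for illustration)."""
    return {
        term: bool(categories & gold_categories)
        for term, categories in term_categories.items()
    }


# Labels are recomputed for every domain, so a new domain means relabeling.
labels = {domain: label_terms(TERM_CATEGORIES, gold) for domain, gold in GOLD.items()}
```

Here "backpropagation" is positive for both CS and ML, while "b-tree" is positive for CS only, so the same term can carry labels in several domains at once.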
