The definition of IRD in the original paper and the code in NOF are different. #1

BlueBirdHouse · 2019-07-18T12:35:54Z

The DDOutlier package is great work. While searching for sophisticated outlier detection algorithms, I came across DDOutlier by accident. It provides the code together with the associated papers, which is exactly what I need.

I chose one algorithm, NOF, and then read the two papers associated with it: 'A non-parameter outlier detection algorithm based on Natural Neighbor' and 'Natural neighbor: A self-adaptive neighborhood method without parameter K.' My only reason for choosing them is that they do not need the parameter K. This is my first time using a density-based algorithm, and I have no idea how to choose K, so it seems better to let the algorithm select it automatically.

To properly understand your code, I studied the R language last week!

Even though your code does not precisely reflect the idea of Huang et al., I agree with your code. However, I need your confirmation, because I lack experience with both R and outlier detection algorithms. The authors of 'A non-parameter outlier detection algorithm based on Natural Neighbor' try to upgrade the LOF algorithm with a set of concepts built on the natural neighbor, so I agree that everything in LOF should be replaced. The authors cite the definition of IRD in equation 2. In that definition everything is expressed in terms of the k-distance neighborhood, so the sum runs over all 'o' in 'NNk(p)'. In definition 13 (equation 9), everything should instead be associated with NIS. I think this is why the code has 'k_dist <- as.vector(dist.obj$dist[NIS[[i]],max_nb])'. Note that the authors never update the definition of IRD, so formally it is still the old one. I think this is a typo on the authors' part. However, a set of experiments is needed to show that your code is right. Unfortunately, I am new to R and do not have the resources to compare the different interpretations.
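For reference, and assuming that 'IRD' here denotes the local reachability density (lrd) of the original LOF paper by Breunig et al., the baseline definition can be written as below. The symbols follow the LOF paper: d(p, o) is the distance between p and o, and N_k(p) is the k-distance neighborhood of p, which this thread writes as 'NNk(p)'. This is only the standard LOF form, not a statement of what equation 2 or equation 9 of the NOF paper contains; whether the sum should run over NIS instead is exactly the open question.

```latex
% Standard LOF definitions (Breunig et al.), shown as a baseline only.
% Reachability distance of p with respect to o:
\[
  \operatorname{reach-dist}_k(p, o) = \max\bigl\{\, k\text{-distance}(o),\; d(p, o) \,\bigr\}
\]
% Local reachability density of p: inverse of the mean reachability
% distance from p to the points o in its k-distance neighborhood N_k(p):
\[
  \operatorname{lrd}_k(p) =
  \left( \frac{1}{\lvert N_k(p) \rvert}
         \sum_{o \in N_k(p)} \operatorname{reach-dist}_k(p, o) \right)^{-1}
\]
```

Under the reading discussed above, the set N_k(p) in the sum would be replaced by NIS(p), which would match the code indexing the distances with 'NIS[[i]]'.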

I need your opinion about this different point.

jhmadsen · 2019-07-18T16:34:54Z

I already replied to you privately by email; I'm posting the response here as well.

First, I need to mention that I had some difficulty understanding the natural neighbor algorithm, specifically steps 16 and 17 in the original paper by Zhu et al. To resolve this, I contacted Zhu himself, but unfortunately he never replied.

Second, I think the paper by Huang et al. is a bit confusing. I'm also a bit confused about your argument.
What part of the paper is wrong/a typo? Definition 13 (equation 9) or definition 4 (equation 2)? Can you provide an example? Preferably with the Iris dataset.

I would happily accept your help. If you'd like, you can make a pull request on GitHub.

BlueBirdHouse · 2019-07-23T01:13:53Z

The detailed bug report is taking a while because of my limited research ability. I am working on it.
