Avg. dist to knn in second dataset #31

Open
nlebovits opened this issue Oct 4, 2023 · 0 comments


Is there a quick way to calculate knn distances from one set of points to another? That is, given pts and otherPts, I want the average distance from each point in pts to its k nearest neighbors in otherPts. My current use case is a home sale price prediction assignment for Public Policy Analytics, in which we're asked to consider, among other potential features, the average distance to the k nearest crimes.

The custom function currently proposed by our textbook is:

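library(FNN)     # get.knnx()
library(dplyr)   # %>%, arrange(), group_by(), summarize(), pull()
library(tidyr)   # gather()
library(tibble)  # rownames_to_column()

# For each row of measureFrom, return the mean distance to its k nearest rows of measureTo.
nn_function <- function(measureFrom, measureTo, k) {
  measureFrom_Matrix <- as.matrix(measureFrom)
  measureTo_Matrix <- as.matrix(measureTo)
  # n x k matrix of distances, one row per point in measureFrom
  nn <- get.knnx(measureTo_Matrix, measureFrom_Matrix, k)$nn.dist
  output <-
    as.data.frame(nn) %>%
    rownames_to_column(var = "thisPoint") %>%
    gather(points, point_distance, V1:ncol(.)) %>%
    arrange(as.numeric(thisPoint)) %>%
    group_by(thisPoint) %>%
    summarize(pointDistance = mean(point_distance)) %>%
    arrange(as.numeric(thisPoint)) %>%
    dplyr::select(-thisPoint) %>%
    pull()

  return(output)
}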
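
For comparison, the same per-point average can probably be computed without the reshaping step, since the nn.dist matrix returned by get.knnx already has one row per query point. The following is only a minimal sketch under assumptions: the name avg_knn_dist and the toy coordinates are hypothetical, and both inputs are assumed to be plain coordinate matrices in a projected CRS (for example, the output of sf::st_coordinates()).

```r
library(FNN)

# Hypothetical helper (not from the textbook): mean distance from each row of
# `from` to its k nearest rows of `to`.
avg_knn_dist <- function(from, to, k) {
  from_mat <- as.matrix(from)
  to_mat <- as.matrix(to)
  # nn.dist is an nrow(from) x k matrix of distances to the k nearest rows of `to`
  nn_dist <- FNN::get.knnx(data = to_mat, query = from_mat, k = k)$nn.dist
  # Average across the k neighbours of each query point
  rowMeans(nn_dist)
}

# Toy coordinates, assumed for illustration only
pts <- cbind(x = c(0, 10), y = c(0, 0))
otherPts <- cbind(x = c(1, 3, 20), y = c(0, 0, 0))

avg_knn_dist(pts, otherPts, k = 2)
```

With these toy points, the first point's two nearest neighbours are at distances 1 and 3, and the second point's are at 7 and 9, so the averages should be 2 and 8.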