Bug: Problem with the predict method for bipartiteSBM with covariates #7

Open
Chabert-Liddell opened this issue Mar 3, 2021 · 7 comments

@Chabert-Liddell

The predict method for bipartiteSBM with covariates and the 'bernoulli' distribution is highly biased. I believe there is a problem with the computation, as it differs from the result given by the following blockmodels code (with the same returned model):

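B <- 0  # accumulator for the covariate effect (initialization assumed; not shown in the original snippet)
for (k in seq_along(covlbm)) {
  B <- B + sbm_cov$model_parameters[[4]]$beta[k] * covlbm[[k]]
}
# logistic link applied to (block effect + covariate effect)
1 / (1 + exp(-sbm_cov$memberships[[4]]$Z1 %*%
               sbm_cov$model_parameters[[4]]$m %*%
               t(sbm_cov$memberships[[4]]$Z2) - B))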

If I sum the above expression over a network with 129 edges, I get about 125, whereas the predictions returned by the predict method of sbm sum to 65.
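The sanity check used above relies on the fact that, under a Bernoulli model, the expected number of edges equals the sum of the edge probabilities. Here is a minimal base R sketch of that check on simulated data; the covariate, the intercept `-1`, and the coefficient `2` are hypothetical values chosen for illustration and do not come from the network in this issue.

```r
set.seed(1)
n_row <- 30
n_col <- 60

# Hypothetical edge covariate and parameters, for illustration only
covar <- matrix(rnorm(n_row * n_col), n_row, n_col)
prob  <- plogis(-1 + 2 * covar)   # logistic link on the linear predictor

# Simulate a bipartite Bernoulli network from these probabilities
Y <- matrix(rbinom(n_row * n_col, 1, prob), n_row, n_col)

# The two totals should be close when the predictions are well calibrated
c(observed_edges = sum(Y), expected_edges = sum(prob))
```

A predicted total that is roughly half of the observed edge count, as reported here, is far outside the sampling variability this check would tolerate.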

@jchiquet
Member

jchiquet commented Mar 17, 2021

Thank you for spotting this.

The link function was not applied at the correct place in the prediction method. This is now fixed, and the following code gives equivalent results for bm and sbm.

Please confirm that it also works on your side (since version 0.4.0-9110) so that I can close this issue.

library(sbm)
library(blockmodels)

set.seed(111)

nbNodes  <- c(30, 60)
blockProp <- list(row = c(.5, .5), col = c(1/3, 1/3, 1/3)) # group proportions
nbBlocks <- sapply(blockProp, length)
covarParam <- c(-2,2)
covar1 <- matrix(rnorm(prod(nbNodes)), nbNodes[1], nbNodes[2])
covar2 <- matrix(rnorm(prod(nbNodes)), nbNodes[1], nbNodes[2])
covarList <- list(covar1 = covar1, covar2 = covar2)

## BIPARTITE UNDIRECTED BERNOULLI SBM
means <- matrix(c(0.05, 0.95, 0.4, 0.75, 0.15, 0.6), 2, 3)  # connectivity matrix
connectParam <- list(mean = means)

## Basic construction - check for wrong specifications
mySampler <- BipartiteSBM$new('bernoulli', nbNodes, blockProp, connectParam, covarParam = covarParam, covarList = covarList)
mySampler$rMemberships(store = TRUE)
mySampler$rEdges(store = TRUE)

## Construction----------------------------------------------------------------
mySBM <- BipartiteSBM_fit$new(mySampler$networkData, 'bernoulli', covarList = covarList)

sbm_fit <- estimateBipartiteSBM(mySBM$networkData, covariates = covarList)
bm_fit <- BM_bernoulli_covariates_fast("LBM", adj = mySBM$networkData, covariates = covarList)
bm_fit$estimate()
Q <- 4

## identical parameters
sbm_fit$connectParam
sbm:::.logistic(bm_fit$model_parameters[[Q]]$m)
sbm_fit$covarParam
bm_fit$model_parameters[[Q]]$beta
sbm:::.logit(sbm_fit$connectParam$mean)
bm_fit$model_parameters[[Q]]$m

## identical clustering
row_cl <- apply(bm_fit$memberships[[Q]]$Z1, 1, which.max)
col_cl <- apply(bm_fit$memberships[[Q]]$Z2, 1, which.max)
sbm_fit$memberships

## identical covariates effects
sbm_mu <- sbm_fit$covarEffect
bm_mu  <- matrix(0, nrow(mySBM$networkData), ncol(mySBM$networkData))
for (k in seq_along(covarList)) {
  bm_mu  <- bm_mu  + bm_fit$model_parameters[[Q]]$beta[k] * covarList[[k]]
}
sum( (bm_mu - sbm_mu)^2 )


pred_bm  <- bm_fit$prediction(Q = Q)
pred_sbm <- predict(sbm_fit)

delta <- sum( (pred_bm - pred_sbm)^2 )
delta
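
To illustrate why the placement of the link function matters, here is a small base R sketch that does not depend on sbm or blockmodels. The membership matrices, covariates, and the "misplaced" variant are assumptions made for illustration; the misplaced formula is only one hypothetical way of getting the order wrong and is not claimed to be the code that was actually in sbm.

```r
set.seed(111)

# Hypothetical hard memberships: 3 row nodes in 2 blocks, 4 column nodes in 3 blocks
Z1 <- diag(2)[c(1, 1, 2), ]
Z2 <- diag(3)[c(1, 2, 3, 3), ]

means <- matrix(c(0.05, 0.95, 0.4, 0.75, 0.15, 0.6), 2, 3)  # block connectivity
beta  <- c(-2, 2)                                           # covariate coefficients
covs  <- list(matrix(rnorm(12), 3, 4), matrix(rnorm(12), 3, 4))

# Covariate effect: sum over k of beta[k] * covs[[k]]
covar_effect <- Reduce(`+`, Map(`*`, beta, covs))

# Block effect on the logit scale, expanded to the node level
block_logit <- Z1 %*% qlogis(means) %*% t(Z2)

# Link applied last, to the full linear predictor
pred_ok <- plogis(block_logit + covar_effect)

# Hypothetical misplacement: link applied before adding the covariate effect
pred_misplaced <- plogis(block_logit) + covar_effect

range(pred_ok)         # stays inside (0, 1)
range(pred_misplaced)  # not constrained to be a probability
```

The first form matches the expression quoted at the top of this issue, `1/(1+exp(-(block effect) - B))`.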

@Chabert-Liddell
Author

Thank you for your change.

version 0.4.0-9110
Here is an .Rdata file where predict(lbm_cov_sbm) != lbm_cov_bm$prediction(Q = 2):
predict_cov.zip
On this example, sbm gives me a uniform prediction, which is not what is expected.

@jchiquet
Member

jchiquet commented Apr 29, 2021

@Chabert-Liddell Could you be a little more explicit? How many groups were considered? What were the settings (number of nodes, etc.)?

And more importantly, what distribution did you use for the edges: Bernoulli, Poisson, or Gaussian?

Thanks

@jchiquet
Member

OK, my guess is that you're using a Poisson model, right?

@Chabert-Liddell
Author

I am using a Bernoulli model with edge covariates (fast covariates) on a bipartite SBM.
The goal is to fit an EDD model by setting the number of blocks to 1 for both rows and columns.
The estimation of the model parameters seems correct; the problem comes from the predict formula, which in this case does not take the covariate parameters into account.
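
To make the single-block case concrete, here is a minimal base R sketch. With one row block and one column block, the block term is the same constant for every dyad, so any prediction that drops the covariate term is necessarily uniform. The values of `m` and `beta` and the covariate matrix are hypothetical and are not taken from the attached .Rdata.

```r
set.seed(42)
n_row <- 5
n_col <- 4

covar <- matrix(rnorm(n_row * n_col), n_row, n_col)  # hypothetical edge covariate
m     <- -1    # hypothetical block parameter on the logit scale
beta  <- 1.5   # hypothetical covariate coefficient

# One block on each side: every node belongs to block 1
Z1 <- matrix(1, n_row, 1)
Z2 <- matrix(1, n_col, 1)
block_logit <- Z1 %*% matrix(m, 1, 1) %*% t(Z2)

pred_without_cov <- plogis(block_logit)                 # covariates ignored
pred_with_cov    <- plogis(block_logit + beta * covar)  # covariates included

length(unique(as.vector(pred_without_cov)))  # a single distinct value
range(pred_with_cov)                         # varies across dyads
```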

@jchiquet
Member

Pierre and I realized that there may be a mistake in blockmodels regarding the prediction for the Bernoulli model with covariates, and I copied this mistake into sbm. I am looking into the details of the prediction for all the models and will keep you informed.

@jchiquet
Member

Part of the discussion will hopefully continue at jb-leger/blockmodels#2.

jchiquet added a commit that referenced this issue Apr 29, 2021
…s a slight mistake for the Bernoulli, covariate case. Will be fixed later