[QUESTION] Spawned processes and memory usage #25
Hi!
I would suggest not `source()`-ing anything on every request; instead, source all the code during application start.
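A minimal sketch of that layout (the file names, `model.rds`, and `predict_with()` are hypothetical, not from this project): everything heavy is sourced and loaded once at startup, so forked request handlers inherit it via copy-on-write instead of re-reading it on every request.

```r
library(RestRserve)

source("/app/helpers.R")           # hypothetical helpers, sourced once at startup
model = readRDS("/app/model.rds")  # hypothetical model object, loaded once

app = RestRserveApplication$new()
app$add_get(
  "/predict",
  function(req, res) {
    # the handler only references objects that already exist in the parent process
    res$body = as.character(predict_with(model, req))
    forward()
  }
)
app$run(8080)
```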
How do you measure memory usage? It's true that each request is handled in a separate fork, but all processes share memory under copy-on-write semantics. Only a small amount of memory is not shared, because each process will likely allocate some memory of its own while handling a request.
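If you want to see how much of a fork's memory is actually private, one rough option (a sketch, assuming Linux with kernel >= 4.14 for `smaps_rollup`) is to read the process's memory map: RSS counts shared pages in full for every process, while PSS splits them between the processes that share them, so summing RSS across forks greatly overstates real usage.

```r
# Linux-only sketch: inspect shared vs. private memory of the current process.
smaps = readLines(sprintf("/proc/%d/smaps_rollup", Sys.getpid()))
grep("^(Rss|Pss|Shared|Private)", smaps, value = TRUE)
```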
This is related to s-u/Rserve#111. I would suggest putting a proxy in front of RestRserve/Rserve and configuring it to close the connection after each request. I use HAProxy for that; a minimal sketch of the idea is below.
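For illustration only (hypothetical names and ports, not the actual config referenced in this thread): `option httpclose` tells HAProxy to close the client connection after each request, so keep-alive clients don't stay pinned to a single fork.

```
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend restrserve_front
    bind *:80
    default_backend restrserve_back

backend restrserve_back
    option httpclose
    server app1 127.0.0.1:8080
```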
I usually deploy it with …
Hi, thanks for your quick reply and the help provided! So, just some quick feedback:
Thank you again. Cheers
(Sorry, I closed the issue by mistake.)
The problem with memory is likely related to your code. Consider the following example, where on each request we compute the product of an ~800 MB matrix and a vector:

```r
library(RestRserve)

n = 1e5
m = 1e3
mat = matrix(runif(m * n), nrow = n)
# object.size(mat) / 1e6
# around 800 MB

app = RestRserveApplication$new()
app$add_get(
  "/tst",
  function(req, res) {
    v = runif(m)
    dummy = mat %*% v  # matrix-vector product over the shared 800 MB matrix
    res$body = as.character(Sys.getpid())
    forward()
  }
)
app$run(8080)
```
Now create a container:

```dockerfile
FROM dselivanov/restrserve:0.1.5
COPY app.R /
CMD ["Rscript", "/app.R"]
```

Build it:

```sh
docker build -t tst .
```

Run it with memory limited to 1 GB:

```sh
docker run -p 8080:8080 -m='1g' -it tst
```

And stress test with 16 threads using the apib tool:

```sh
apib -c 16 -d 3 http://127.0.0.1:8080/tst
```

You will see that it successfully serves several concurrent requests within the 1 GB container memory limit. If the processes did not share memory, this would not be possible.
From what I've seen, yes, but only from the same client.
Thank you for the explanation and example! Can you clarify what you mean by "same client"?
See here https://en.m.wikipedia.org/wiki/HTTP_persistent_connection
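To see the effect in practice (a sketch, assuming the `/tst` example server from above is running on localhost:8080): reusing one curl handle keeps the HTTP connection alive, so the same forked worker answers every request and the printed PID stays constant. A fresh handle per request may land on different forks.

```r
library(curl)

h = curl::new_handle()
for (i in 1:3) {
  r = curl::curl_fetch_memory("http://127.0.0.1:8080/tst", handle = h)
  cat(rawToChar(r$content), "\n")  # same PID on every iteration
}
```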
Thank you for the support! I'll have a look at what you've sent. Just a quick unrelated question: do you know what may be causing this error? It seems to be related to Rserve and not to my code. Thanks.
Yes, this is an Rserve-related issue; see s-u/Rserve#121.
Thanks again for the help and support! Keep up the good work.
Hello @freitasskeeled,
Hello @dselivanov, here is my R script, run under systemd as a micro-service:

```r
#!/usr/bin/env Rscript

# define external arguments passed when calling this R script from the shell
args <- commandArgs(trailingOnly = TRUE)

## ---- load packages ----
tryCatch({
  library(RestRserve)
  library(pool)
},
error = function(error_detail) {
  install.packages(c("RestRserve", "pool"))
  library(RestRserve)
  library(pool)
})

## ---- create pool connection to database ----
create_pool_conn <- function() {
  ...
  ...
}
pool_conn <- create_pool_conn()
pool::dbListTables(pool_conn) # important: use the pool once before the server starts forking

## ---- create application -----
app <- RestRserve::Application$new()

## ---- create handler for the HTTP requests ----
my_function <- function(request, response) {
  # foo is an R package with my algorithm
  response$body <- foo::bar(input = request$body,
                            pool = pool_conn)
  response$content_type <- "application/json"
}

## ---- register endpoints and corresponding R handlers ----
app$add_post(path = "/my-api-endpoint",
             FUN = my_function)

app$add_openapi(
  path = "/openapi.yaml",
  file_path = "openapi.yaml"
)

# see details on https://swagger.io/tools/swagger-ui/
app$add_swagger_ui(
  path = "/swagger",
  path_openapi = "/openapi.yaml",
  path_swagger_assets = "/swagger/assets/",
  file_path = tempfile(fileext = ".html"),
  use_cdn = FALSE
)

## ---- start application ----
backend <- RestRserve::BackendRserve$new()
backend$start(app, http_port = 8060)
```

What should I improve? Please tell me. Thank you very much.
Hi @dselivanov,
I honestly don't see how …
How to turn off …?
Hi there,
I'm using your library to get predictions based on a trained model.
Here is the code sequence when I start the program: it loads `.RData` files and other binary files. After this, I have one process using about 350 MB of memory (which is normal for this application).
Then, I make a request and a second process spawns using the same amount of memory (350 MB).
After this, if I put some load on the API (more than 2 concurrent requests), a new process is spawned, again using the same amount of memory per process.
I understand that's the way RestRserve or Rserve handles concurrent requests (by forking), but I can't understand why each process has that memory usage. Since it's all shared read-only data, shouldn't all processes use the same memory pages instead of copying the data?
I also don't understand why all the spawned processes are kept running even when there aren't any requests to handle.
And my third question is: what are the advantages of the recommended way of deploying the API (the one mentioned in the documentation) versus just running `Rscript api.R`?
Sorry for the long text; some of the questions are probably basic, but my knowledge of R is not very extensive.
Thank you!