1.7.1 April version and test of big data files (number and/or size) #190

Open
yvanlebras opened this issue Apr 5, 2022 · 2 comments

Comments

@yvanlebras
Contributor

Testing the April release of MetaShARK 1.7.1 on the ASPE and OBIS (IA-Biodiv project) datasets, we ran into issues that appear to be related to file size (around 1.5 Gb here) and/or rich content.

For ASPE, with the 6 main data tables, here is the log from a crash that occurs after clicking "make EML":

remotes::install_github("ROpenSci/bibtex")
[Metric] Connected users at 2022-04-05 10:45:27: 1
[fill_module.R] save & template, at: 4.2s
[fill_module.R] set EAL variables, at: 4.2s
[fill_module.R] set local rv, at: 4.2s
[fill-module-setup.R] set variable, at: 0s
[fill-module-setup.R] post-modification, at: 0s
[fill-module-setup] passed, at: 0s
[fill_module.R] change pane, at: 0s
[fill_module.R] update history, at: 0s
[fill_module.R] save variables change, at: 0s
[fill_module.R] display UI, at: 0s
[fill_module.R] ended, at: 0s
[fill_module.R] save & template, at: 40.8s
[fill_module.R] set EAL variables, at: 40.8s
[fill_module.R] set local rv, at: 40.8s
[fill-module-setup.R] set variable, at: 0s
[fill-module-setup.R] post-modification, at: 0s
[fill-module-setup] passed, at: 0s
[fill_module.R] change pane, at: 0s
[fill_module.R] update history, at: 0s
[fill_module.R] save variables change, at: 0s
[fill_module.R] display UI, at: 0s
[fill_module.R] ended, at: 0s
[dev] autosaved: 9, at: 0.1s

 *** caught segfault ***
address (nil), cause 'memory not mapped'

Traceback:
 1: data.table::fread(file = f, fill = TRUE, blank.lines.skip = TRUE,     sep = "\t", colClasses = list(character = 1:utils::count.fields(f,         sep = "\t")[1]))
 2: as.data.frame(data.table::fread(file = f, fill = TRUE, blank.lines.skip = TRUE,     sep = "\t", colClasses = list(character = 1:utils::count.fields(f,         sep = "\t")[1])))
 3: read_tbl(paste0(path, "/", tfound[i]))
 4: EMLassemblyline::template_arguments(path = .$SelectDP$dp.metadata.path,     data.path = .$SelectDP$dp.data.path, data.table = dir(.$SelectDP$dp.data.path))
 5: doTryCatch(return(expr), name, parentenv, handler)
 6: tryCatchOne(expr, names, parentenv, handlers[[1L]])
 7: tryCatchList(expr, classes, parentenv, handlers)
 8: tryCatch(expr, error = function(e) {    call <- conditionCall(e)    if (!is.null(call)) {        if (identical(call[[1L]], quote(doTryCatch)))             call <- sys.call(-4L)        dcall <- deparse(call)[1L]        prefix <- paste("Error in", dcall, ": ")        LONG <- 75L        sm <- strsplit(conditionMessage(e), "\n")[[1L]]        w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type = "w")        if (is.na(w))             w <- 14L + nchar(dcall, type = "b") + nchar(sm[1L],                 type = "b")        if (w > LONG)             prefix <- paste0(prefix, "\n  ")    }    else prefix <- "Error : "    msg <- paste0(prefix, conditionMessage(e), "\n")    .Internal(seterrmessage(msg[1L]))    if (!silent && isTRUE(getOption("show.error.messages"))) {        cat(msg, file = outFile)        .Internal(printDeferredWarnings())    }    invisible(structure(msg, class = "try-error", condition = e))})
 9: try(EMLassemblyline::template_arguments(path = .$SelectDP$dp.metadata.path,     data.path = .$SelectDP$dp.data.path, data.table = dir(.$SelectDP$dp.data.path)))
10: eval(expr, env)
11: eval(expr, env)
12: withProgress({    . <- main.env$save.variable    fileName <- .$SelectDP$dp.title    x <- try(EMLassemblyline::template_arguments(path = .$SelectDP$dp.metadata.path,         data.path = .$SelectDP$dp.data.path, data.table = dir(.$SelectDP$dp.data.path)))    if (class(x) == "try-error") {        out <- x        out[1] <- paste("Upon templating arguments: ", x)        incProgress(0.9)    }    else {        incProgress(0.3)        x$path <- .$SelectDP$dp.metadata.path        x$data.path <- .$SelectDP$dp.data.path        x$eml.path <- .$SelectDP$dp.eml.path        x$dataset.title <- .$SelectDP$dp.title        x$temporal.coverage <- .$Misc$temporal.coverage        x$maintenance.description <- "ongoing"        x$data.table.name <- optional(.$DataFiles$table.name)        x$data.table.description <- optional(.$DataFiles$description)        x$data.table.url <- optional(.$DataFiles$url)        x$user.id <- optional(if (main.env$SETTINGS$user != "public")             main.env$SETTINGS$user)        x$package.id = x$dataset.title        x$write.file <- TRUE        x$return.obj <- TRUE        incProgress(0.2)        file.remove(dir(main.env$save.variable$SelectDP$dp.eml.path,             full.names = TRUE))        do.call(EMLassemblyline::make_eml, x[names(x) %in% names(formals(EMLassemblyline::make_eml))])        incProgress(0.4)    }}, message = "Writing EML ...", value = 0.1)
13: `<observer>`(...)
14: valueFunc()
15: ..stacktraceon..(expr)
16: contextFunc()
17: env$runWith(self, func)
18: force(expr)
19: domain$wrapSync(expr)
20: promises::with_promise_domain(createVarPromiseDomain(.globals,     "domain", domain), expr)
21: withReactiveDomain(.domain, {    env <- .getReactiveEnvironment()    rLog$enter(.reactId, id, .reactType, .domain)    on.exit(rLog$exit(.reactId, id, .reactType, .domain), add = TRUE)    env$runWith(self, func)})
22: domain$wrapSync(expr)
23: promises::with_promise_domain(reactivePromiseDomain(), {    withReactiveDomain(.domain, {        env <- .getReactiveEnvironment()        rLog$enter(.reactId, id, .reactType, .domain)        on.exit(rLog$exit(.reactId, id, .reactType, .domain),             add = TRUE)        env$runWith(self, func)    })})
24: ctx$run(function() {    ..stacktraceon..(expr)})
25: ..stacktraceoff..(ctx$run(function() {    ..stacktraceon..(expr)}))
26: isolate(valueFunc())
27: func(v$value)
28: withVisible(func(v$value))
29: f(init, x[[i]])
30: Reduce(function(v, func) {    if (v$visible) {        withVisible(func(v$value))    }    else {        withVisible(func(invisible(v$value)))    }}, list(...), result)
31: withCallingHandlers(expr, error = doCaptureStack)
32: domain$wrapSync(expr)
33: promises::with_promise_domain(createStackTracePromiseDomain(),     expr)
34: captureStackTraces({    result <- withVisible(force(expr))    if (promises::is.promising(result$value)) {        p <- promise_chain(valueWithVisible(result), ..., catch = catch,             finally = finally)        runFinally <- FALSE        p    }    else {        result <- Reduce(function(v, func) {            if (v$visible) {                withVisible(func(v$value))            }            else {                withVisible(func(invisible(v$value)))            }        }, list(...), result)        valueWithVisible(result)    }})
35: doTryCatch(return(expr), name, parentenv, handler)
36: tryCatchOne(expr, names, parentenv, handlers[[1L]])
37: tryCatchList(expr, classes, parentenv, handlers)
38: tryCatch({    captureStackTraces({        result <- withVisible(force(expr))        if (promises::is.promising(result$value)) {            p <- promise_chain(valueWithVisible(result), ...,                 catch = catch, finally = finally)            runFinally <- FALSE            p        }        else {            result <- Reduce(function(v, func) {                if (v$visible) {                  withVisible(func(v$value))                }                else {                  withVisible(func(invisible(v$value)))                }            }, list(...), result)            valueWithVisible(result)        }    })}, error = function(e) {    if (!is.null(catch))         catch(e)    else stop(e)}, finally = if (runFinally && !is.null(finally)) finally())
39: do()
40: hybrid_chain(eventFunc(), function(value) {    if (ignoreInit && !initialized) {        initialized <<- TRUE        return()    }    if (ignoreNULL && isNullEvent(value)) {        return()    }    if (once) {        on.exit(x$destroy())    }    req(!ignoreNULL || !isNullEvent(value))    isolate(valueFunc())})
41: `EAL9: make eml`(...)
42: contextFunc()
43: env$runWith(self, func)
44: force(expr)
45: domain$wrapSync(expr)
46: promises::with_promise_domain(createVarPromiseDomain(.globals,     "domain", domain), expr)
47: withReactiveDomain(.domain, {    env <- .getReactiveEnvironment()    rLog$enter(.reactId, id, .reactType, .domain)    on.exit(rLog$exit(.reactId, id, .reactType, .domain), add = TRUE)    env$runWith(self, func)})
48: domain$wrapSync(expr)
49: promises::with_promise_domain(reactivePromiseDomain(), {    withReactiveDomain(.domain, {        env <- .getReactiveEnvironment()        rLog$enter(.reactId, id, .reactType, .domain)        on.exit(rLog$exit(.reactId, id, .reactType, .domain),             add = TRUE)        env$runWith(self, func)    })})
50: ctx$run(.func)
51: run()
52: withCallingHandlers(expr, error = doCaptureStack)
53: domain$wrapSync(expr)
54: promises::with_promise_domain(createStackTracePromiseDomain(),     expr)
55: captureStackTraces(expr)
56: withCallingHandlers(captureStackTraces(expr), error = function(e) {    if (inherits(e, "shiny.silent.error"))         return()    handle <- getOption("shiny.error")    if (is.function(handle))         handle()})
57: shinyCallingHandlers(run())
58: force(expr)
59: withVisible(force(expr))
60: withCallingHandlers(expr, error = doCaptureStack)
61: domain$wrapSync(expr)
62: promises::with_promise_domain(createStackTracePromiseDomain(),     expr)
63: captureStackTraces({    result <- withVisible(force(expr))    if (promises::is.promising(result$value)) {        p <- promise_chain(valueWithVisible(result), ..., catch = catch,             finally = finally)        runFinally <- FALSE        p    }    else {        result <- Reduce(function(v, func) {            if (v$visible) {                withVisible(func(v$value))            }            else {                withVisible(func(invisible(v$value)))            }        }, list(...), result)        valueWithVisible(result)    }})
64: doTryCatch(return(expr), name, parentenv, handler)
65: tryCatchOne(expr, names, parentenv, handlers[[1L]])
66: tryCatchList(expr, classes, parentenv, handlers)
67: tryCatch({    captureStackTraces({        result <- withVisible(force(expr))        if (promises::is.promising(result$value)) {            p <- promise_chain(valueWithVisible(result), ...,                 catch = catch, finally = finally)            runFinally <- FALSE            p        }        else {            result <- Reduce(function(v, func) {                if (v$visible) {                  withVisible(func(v$value))                }                else {                  withVisible(func(invisible(v$value)))                }            }, list(...), result)            valueWithVisible(result)        }    })}, error = function(e) {    if (!is.null(catch))         catch(e)    else stop(e)}, finally = if (runFinally && !is.null(finally)) finally())
68: do()
69: hybrid_chain({    if (!.destroyed) {        shinyCallingHandlers(run())    }}, catch = function(e) {    if (inherits(e, "shiny.silent.error")) {        return()    }    printError(e)    if (!is.null(.domain)) {        .domain$unhandledError(e)    }}, finally = .domain$decrementBusyCount)
70: flushCallback()
71: FUN(X[[i]], ...)
72: lapply(.flushCallbacks, function(flushCallback) {    flushCallback()})
73: ctx$executeFlushCallbacks()
74: .getReactiveEnvironment()$flush()
75: flushReact()
76: serviceApp()
77: ..stacktracefloor..(serviceApp())
78: withCallingHandlers(expr, error = doCaptureStack)
79: domain$wrapSync(expr)
80: promises::with_promise_domain(createStackTracePromiseDomain(),     expr)
81: captureStackTraces({    while (!.globals$stopped) {        ..stacktracefloor..(serviceApp())    }})
82: ..stacktraceoff..(captureStackTraces({    while (!.globals$stopped) {        ..stacktracefloor..(serviceApp())    }}))
83: runApp(shinyApp(ui = ui, server = server), launch.browser = args$launch.browser)
84: MetaShARK::runMetashark(dev = FALSE)
An irrecoverable exception occurred. R is aborting now ...
Segmentation fault

With such a "big" dataset, it takes something like 2 hours to generate / load the attributes metadata after the upload step. Maybe there are too many attributes and too much related information for EAL to hold in memory? Testing directly in EAL gives the same kind of behaviour, with R crashing.
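
To narrow down which file triggers the segfault, one option is to replay the read call from traceback frame 1 on each file in a separate R process, so that a crash there does not kill the session doing the testing. The sketch below is only an illustration: `run_isolated()`, `fread_call()` and `probe_files()` are hypothetical helper names, not part of MetaShARK or EAL, and it assumes `Rscript` is available and `data.table` is installed for the child process.

```r
# Run one R expression (given as a string) in a fresh child R process, so a
# segfault there cannot take down the current session.
# Returns TRUE when the child exits with status 0, FALSE otherwise.
run_isolated <- function(expr) {
  rscript <- file.path(R.home("bin"), "Rscript")
  status <- system2(rscript, c("--vanilla", "-e", shQuote(expr)),
                    stdout = FALSE, stderr = FALSE)
  identical(as.integer(status), 0L)
}

# Build, as a string, the same read call that appears in traceback frame 1,
# applied to a single file `f`.
fread_call <- function(f) {
  sprintf(paste0(
    "f <- %s; invisible(data.table::fread(file = f, fill = TRUE, ",
    "blank.lines.skip = TRUE, sep = '\\t', ",
    "colClasses = list(character = 1:utils::count.fields(f, sep = '\\t')[1])))"),
    deparse(f))
}

# Hypothetical helper: probe every file in a folder and report, per file,
# whether the isolated read finished without crashing or erroring.
probe_files <- function(dir_path) {
  files <- list.files(dir_path, full.names = TRUE)
  vapply(files, function(f) run_isolated(fread_call(f)), logical(1))
}
```

Calling `probe_files()` on the metadata folder of the data package would return a named logical vector; any `FALSE` entry points at a file whose read either errored or crashed in the child process. A `FALSE` does not distinguish a segfault from an ordinary error, so the failing file would still need a closer look.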

@yvanlebras
Contributor Author

Calling the EAL "make_eml" function in RStudio gives an "R Session Aborted" message (R encountered a fatal error).

@yvanlebras
Contributor Author

After several tests, it seems to me this may be caused by "catvars_**" files that are too large or contain too much content.
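
A quick way to check this suspicion is to list the `catvars_*` files with their size and line count, largest first. This is a minimal base-R sketch; `summarise_catvars()` is a hypothetical helper name, and the folder passed to it is assumed to be the metadata path of the data package.

```r
# Summarise the catvars_* templates found in a metadata folder:
# size on disk and number of lines, sorted from largest to smallest.
summarise_catvars <- function(metadata_path) {
  files <- list.files(metadata_path, pattern = "^catvars_", full.names = TRUE)
  size_bytes <- file.size(files)
  # readLines() loads each file fully; acceptable for a one-off diagnostic.
  n_lines <- vapply(files, function(f) length(readLines(f, warn = FALSE)),
                    integer(1))
  out <- data.frame(
    file = basename(files),
    size_mb = round(size_bytes / 1024^2, 2),
    n_lines = unname(n_lines),
    stringsAsFactors = FALSE
  )
  out <- out[order(size_bytes, decreasing = TRUE), , drop = FALSE]
  rownames(out) <- NULL
  out
}
```

Files that stand out at the top of this table would be the first candidates to remove or trim before retrying "make EML".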
