Hello,
I want to thank you for your amazing work. I've been using your library for almost a year now and it's really nice.
I'm having an issue with heap memory.
Showcase 1: I start my app without doing any scraping
Showcase 2: I start my app and do 1 scrape, followed by close
class ArkeaArenaFetcher {
    fun fetch(): List<EventJpa> {
        val webClient = WebClient().apply {
            options.isCssEnabled = true
            options.isJavaScriptEnabled = true
            cssErrorHandler = SilentCssErrorHandler()
            javaScriptErrorListener = SilentJavaScriptErrorListener()
            options.isThrowExceptionOnFailingStatusCode = false
        }
        return try {
            val page = webClient.getPage<HtmlPage>("https://www.arkeaarena.com/fr/programmation/tous-les-evenements/#")
            webClient.waitForBackgroundJavaScript(4000)
            val container: HtmlElement = page.getFirstByXPath("//div[@class='events-list ajaxed']/div[@class='container']")
            val rawEvents = container.getByXPath<HtmlElement>("a")
            rawEvents.map(::htmlToInfra)
        } catch (e: Exception) {
            emptyList()
        } finally {
            webClient.close()
        }
    }

    private fun htmlToInfra(html: HtmlElement): EventJpa {
        // convert html to Kotlin object
        ...
    }
}
Showcase 3: I start my app and do 1 scrape, followed by close + additional cleanup + gc
The code is the same as in showcase 2; only the finally clause changes, as shown below:
finally {
    webClient.cache.clear()
    webClient.topLevelWindows.forEach { it.close(false) }
    webClient.topLevelWindows.forEach { it.jobManager.removeAllJobs() }
    webClient.cookieManager.clearCookies()
    webClient.close()
    System.gc()
}
The issue is that even after I close the WebClient instance, some memory is still not released. My example code deals with a single source, but in production I scrape multiple sources.
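To put a number on the memory that stays allocated, one option is to compare used heap before and after a scrape. The following is a minimal sketch using only the JDK's Runtime API. The helpers usedHeapBytes and measureRetainedHeap are hypothetical names introduced for illustration and are not part of HtmlUnit. System.gc() is only a hint to the JVM, so the figure is an approximation, and it also includes whatever the returned result itself occupies.

```kotlin
// Hypothetical helper (not part of HtmlUnit): heap currently in use, in bytes.
fun usedHeapBytes(): Long {
    val rt = Runtime.getRuntime()
    return rt.totalMemory() - rt.freeMemory()
}

// Runs `block` and returns its result together with the change in used heap.
// A GC is requested before and after the block; since System.gc() is only a
// hint, the delta is approximate and can even be negative.
fun <T> measureRetainedHeap(block: () -> T): Pair<T, Long> {
    System.gc()
    val before = usedHeapBytes()
    val result = block()
    System.gc()
    val after = usedHeapBytes()
    return result to (after - before)
}
```

Wrapping the call as measureRetainedHeap { ArkeaArenaFetcher().fetch() } and logging the second element of the pair gives a rough per-scrape figure that can be compared across the three showcases. A heap dump remains the more reliable way to see which objects are still referenced.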
I also tried .use in Kotlin (the equivalent of try-with-resources) (article)
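For reference, this is roughly what the .use variant looks like. WebClient implements AutoCloseable, so Kotlin's use calls close() when the lambda returns or throws, which is the same as the try/finally in showcase 2. To keep the sketch self-contained, FakeClient below is a hypothetical stand-in for WebClient, and fetchWith is an illustrative name, not code from the app.

```kotlin
// Hypothetical stand-in for WebClient: records whether close() was called.
class FakeClient : AutoCloseable {
    var closed = false
        private set

    fun load(url: String): String {
        if (url.isBlank()) throw IllegalArgumentException("empty URL")
        return "page for $url"
    }

    override fun close() {
        closed = true
    }
}

// use {} closes the client whether the lambda returns normally or throws;
// the exception is still propagated, so it is caught outside the use block.
fun fetchWith(client: FakeClient, url: String): String? =
    try {
        client.use { it.load(url) }
    } catch (e: IllegalArgumentException) {
        null
    }
```

Because use only guarantees that close() runs, it would not be expected to release anything beyond what the explicit webClient.close() already does.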
Other info
- Language: Kotlin
- HtmlUnit version: 3.9.0 (the behaviour is the same on older versions)
Articles I have read on the subject:
- https://maxrohde.com/2013/01/05/fix-htmlunit-memory-leak
- https://htmlunit.sourceforge.io/faq.html#MemoryLeak