Memory leak (even after webclient.close()) #729

Open
@fleboulch

Hello,

I want to thank you for your amazing work. I've been using your library for almost a year now and it's really nice.

I'm having an issue with heap memory.

Showcase 1: I start my app without doing any scraping

Heap: 74 MB

Showcase 2: I start my app and do one scrape, followed by close()

Heap: 256 MB

class ArkeaArenaFetcher {

    fun fetch(): List<EventJpa> {
        val webClient = WebClient().apply {
            options.isCssEnabled = true
            options.isJavaScriptEnabled = true
            cssErrorHandler = SilentCssErrorHandler()
            javaScriptErrorListener = SilentJavaScriptErrorListener()
            options.isThrowExceptionOnFailingStatusCode = false
        }

        return try {

            val page = webClient.getPage<HtmlPage>("https://www.arkeaarena.com/fr/programmation/tous-les-evenements/#")
            webClient.waitForBackgroundJavaScript(4000)
            val container: HtmlElement = page.getFirstByXPath("//div[@class='events-list ajaxed']/div[@class='container']")
            val rawEvents = container.getByXPath<HtmlElement>("a")
            rawEvents.map(::htmlToInfra)
        } catch (e: Exception) {
            emptyList()
        } finally {
            webClient.close()
        }
    }
    
    private fun htmlToInfra(html: HtmlElement): EventJpa {
        // convert html to Kotlin object
        ...
    }
}

Showcase 3: I start my app and do one scrape, followed by close(), extra cleanup, and a GC

Heap: 166 MB

The code is the same as in showcase 2; only the finally clause changes, as shown below:

        finally {
            webClient.cache.clear()
            webClient.topLevelWindows.forEach { it.close(false) }
            webClient.topLevelWindows.forEach { it.jobManager.removeAllJobs() }
            webClient.cookieManager.clearCookies()
            webClient.close()
            System.gc()
        }

The issue is that even after I close the WebClient instance, some memory is still not released. My example code deals with a single source, but in production I scrape multiple sources.
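
For anyone trying to reproduce the heap figures quoted in the showcases, the heap can also be sampled from inside the JVM. The following is only a minimal sketch using the standard `MemoryMXBean`; `usedHeapMiB` and `heapDeltaMiB` are hypothetical helper names, not part of HtmlUnit, and the issue does not say which tool produced the original numbers. Since `System.gc()` is only a hint to the JVM, the delta is approximate.

```kotlin
import java.lang.management.ManagementFactory

// Hypothetical helper (not HtmlUnit API): used heap in MiB as reported by the JVM.
fun usedHeapMiB(): Long {
    val usage = ManagementFactory.getMemoryMXBean().heapMemoryUsage
    return usage.used / (1024 * 1024)
}

// Hypothetical helper: runs `block` and returns its result together with the
// change in used heap (MiB), requesting a GC before and after the call.
// System.gc() is only a request, so treat the delta as a rough indication.
fun <T> heapDeltaMiB(block: () -> T): Pair<T, Long> {
    System.gc()
    val before = usedHeapMiB()
    val result = block()
    System.gc()
    val after = usedHeapMiB()
    return result to (after - before)
}
```

Wrapping the scraping call, for example `heapDeltaMiB { ArkeaArenaFetcher().fetch() }`, would give one number per showcase that can be compared across runs.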

I also tried

  • .use in Kotlin (try with resources) (article)
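
The `.use` variant mentioned above follows roughly the shape below. This is a sketch, not the exact code that was run: `FakeClient` and `fetchWithUse` are hypothetical stand-ins so the example runs without HtmlUnit on the classpath. `WebClient` implements `AutoCloseable`, so the same pattern should apply to it.

```kotlin
// Hypothetical stand-in for WebClient; it only records whether close() was called.
class FakeClient : AutoCloseable {
    var closed = false
        private set

    // Simulates a scrape that either returns results or fails.
    fun fetch(fail: Boolean): List<String> {
        if (fail) throw IllegalStateException("simulated scraping failure")
        return listOf("event-1", "event-2")
    }

    override fun close() {
        closed = true
    }
}

// Same behaviour as the try/catch/finally version: `use` closes the client on
// both the normal path and the exception path; failures map to an empty list.
fun fetchWithUse(client: FakeClient, fail: Boolean): List<String> =
    try {
        client.use { it.fetch(fail) }
    } catch (e: Exception) {
        emptyList()
    }
```

Because `use` only guarantees that `close()` is called, it is equivalent to the `finally { webClient.close() }` in showcase 2 and would not be expected to change the heap figures by itself.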

Other info

  • Language: Kotlin
  • HtmlUnit version: 3.9.0 (the behaviour is the same on older versions)

Articles I read about memory usage:

Similar issues
