Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Could the project switch to using log4j for logs? #1

Closed
Asimov4 opened this issue Sep 13, 2013 · 33 comments
Closed

Could the project switch to using log4j for logs? #1

Asimov4 opened this issue Sep 13, 2013 · 33 comments
Milestone

Comments

@Asimov4
Copy link

Asimov4 commented Sep 13, 2013

I see a lot of logs printed to System.out or System.err.
Would it be possible to use a library like log4j http://logging.apache.org/log4j/2.x/ and use log.error, log.warning, log.info, log.debug instead?
That would make it easier for users of the StanfordCoreNLP to manage which logs should be printed by choosing the log level of the project.

@AngledLuffa
Copy link
Contributor

My initial reaction is that it would be quite a lot of work for very little
early return for us... we certainly could discuss it again in the future,
of course.

Why specifically log4j?

John
On Sep 12, 2013 7:18 PM, "Asimov4" [email protected] wrote:

I see a lot of logs printed to System.out or System.err.
Would it be possible to use a library like log4j
http://logging.apache.org/log4j/2.x/ and use log.error, log.warning,
log.info, log.debug instead?
That would make it easier for users of the StanfordCoreNLP to manage which
logs should be printed by choosing the log level of the project.


Reply to this email directly or view it on GitHubhttps://github.com//issues/1
.

@Asimov4
Copy link
Author

Asimov4 commented Sep 13, 2013

I don't have any strong opinion about log4j. It is pretty standard and does the job well.
I'd be willing to contribute if that is something that you would agree to.

@reckart
Copy link

reckart commented Sep 13, 2013

IMHO log4j would be a pretty bad choice, since it is a logging framework, not a logging API. Apache Commons Logging for example is split into an API and an implementation part. The default implementation can be replaced with e.g. a SLF4J bridge which can then log using any logging backend, including log4j. There is also a logging API built directly into Java: Java Utility Logging - it requires no extra dependencies and can likewise (although with a bit of extra effort) be rerouted over another logging backend.

My all-time-favourite is still Apache Commons Logging, since it is very easy to re-route, widely used, and very stable (no releases since ages). I usually re-route it to log4j using SLF4J, but that's only an option, not a must.

@AngledLuffa
Copy link
Contributor

Stuff that gets done around here tends to fall into one or more of the
following categories: stuff that interests us, stuff that advances us
academically, and stuff that gets us paid. In all honesty, switching the
logging system from System.err.println() to anything else doesn't fall into
any of those three categories. It's always possible that changes in the
future, but for now, I don't expect us to put effort into this.

It does sound like if we were to use any alternate framework, Java Utility
Logging would be the winner, since we wouldn't have to depend on any
outside libraries to use it. If you or anyone else is interested in
contributing this on their own, please keep that in mind.

On Fri, Sep 13, 2013 at 12:05 AM, reckart [email protected] wrote:

IMHO log4j would be a pretty bad choice, since it is a logging framework,
not a logging API. Apache Commons Logging for example is split into an API
and an implementation part. The default implementation can be replaced with
e.g. a SLF4J bridge which can then log using any logging backend, including
log4j. There is also a logging API built directly into Java: Java Utility
Logging - it requires no extra dependencies and can likewise (although with
a bit of extra effort) be rerouted over another logging backend. My
all-time-favourite is still Apache Commons Logging, since it is very easy
to re-route, widely used, and very stable (no releases since ages).


Reply to this email directly or view it on GitHubhttps://github.com//issues/1#issuecomment-24376874
.

@manning
Copy link
Member

manning commented Sep 29, 2013

On writing to stdout/stderr: There are certainly a lot of things written to stderr. However, there should not be any logging/error messages written to stdout. If anyone finds any, we will fix them. Things should (and at least mainly do) run cleanly so that only genuine NLP output gets sent to stdout.

Logging: There is now some use of logging in CoreNLP (whereas a few years ago, there was none). I think my preference would be to stick to j.u.l, which is what is used a bit now in the code, just because the marginal gains in flexibility from some of the other frameworks just don't seem to outweigh the cost of adding another library dependency (exacerbated by the fact that there are several different ones that different people like...).

Our use of logging is likely to increase gradually over time, but, as AngledLuffa commented, there's not likely to be someone tasked to convert everything. If someone like you wanted to take this on, we'd certainly be willing to support a change to j.u.l everywhere.

@dinoboy197
Copy link

I'd be very interested in getting involved here. I think having CoreNLP write to a logging API which can route into any particular logging backend of the library user's choice seems like the right approach - this seems pretty industry standard at this point. It's unfortunate to write / maintain / use programs which maintain a nice separation between logging API and logging backend for the purpose of maintaining consistent logging format, level, and log destination, only to include libraries which don't take advantage of these nice logging APIs and just dump unformatted text onto stdout / stderr (I don't mean to pick on CoreNLP here alone - there are hundreds, probably thousands of libraries in Maven Central which blindly dump text onto stdout / stderr).

As for a logging API for CoreNLP to interface with, my recommendation would be to use SLF4J. It doesn't bias the client code (which makes calls to CoreNLP) to use any particular logging backend, nor does it require client code to provide additional backend shims to re-route logging events between backends. For example, by using SLF4j as the logging API, a program including the CoreNLP library needs only to include a single logging backend of their choice, for example logback (the successor to log4j), log4j, commons-logging, or java logging (jul). No logging bridges or shims are required for the end client to use any logging backend that they choose. (CoreNLP -> SLF4j API -> SLF4j implementation -> logging backend -> log destination)

If CoreNLP used commons-logging as the logging API, then programs which want to use logback, log4j, or jul as the backend would need to include some sort of log event redirection shim (as reckart mentioned above) (typically in the form of SLF4j plus some additional commons-logging to destination backend bridge) which increases logging backend complexity, adds additional library dependencies to the client program, and increases logging event overhead. (CoreNLP -> commons-logging -> slf4j-commons-logging-shim -> slf4j API -> slf4j implementation -> logging backend -> log destination).

If CoreNLP used JUL as the logging API, then programs which want to use logback, log4j, or commons-logging as the backend need the shims described in the paragraph above - plus, JUL + shims has some very unfortunate performance penalties as desribed (look for jul-to-slf4j bridge "Note on performance" section). (CoreNLP -> jul -> slf4j-jul-shim -> slf4j API -> slf4j implementation -> logging backend -> log destination).

Anyway, I'd love to discuss further and/or get involved to help out.

@reckart
Copy link

reckart commented Jul 15, 2015

The thing is the Commons Logging is basically done - I doubt it will ever change again - as stable as it can be. While with SLF4J I've run into trouble in the past due to conflicting versions of SLF4J being on the classpath, etc. I like the SFL4J and I think it's a great choice for applications - for frameworks, I'd still vote for commons logging. If you want to use commons logging with log4j, no additional shims are needed. Just add log4j to the classpath and commons logging picks it up and uses it automatically. If you prefer another backend, you can route it via SLF4J.

@dinoboy197
Copy link

Cool, thanks for the insight. If CoreNLP is starting to migrate to Commons Logging, then that's fine - maybe I could help to just finish it up? The latest stable release still writes to stderr without Commons Logging, in places.

Question for Commons Logging: do you happen to know if a library (like CoreNLP) uses Commons Logging as the API and client code uses Logback (which is log4j's "successor") as the backend, does that work without additional shims (ie, no need to SLF4j or any other libraries to be on the classpath?)

@reckart
Copy link

reckart commented Jul 15, 2015

I don't think the folks in Stanford are convinced just yet that they want to do this, but maybe if somebody comes up with a patch? No idea... I guess it will be quite some work and would be a pity if they later didn't merge it.

Unless logback comes with a "shim" that allows it to pose as another framework such as log4j, I doubt it. ACL apparently supports these frameworks: Log4j (1.2), LogKit, Avalog, and Java 1.4.

Btw. I just noted that ACL indeed is still being improved - veeery slowly - and activity picked up only recently again: there were 6 years between the release of 1.1.1 and 1.1.2. The latest version is 1.2 (Jul 2014). Changes are apparently minimal and mostly bugfixes.

http://commons.apache.org/proper/commons-logging/changes-report.html

@markwoon
Copy link

@dinoboy197 You'll need to use SLF4J if you want to use/control/configure both logback and ACL.

@dinoboy197
Copy link

Btw. I just noted that ACL indeed is still being improved - veeery slowly - and activity picked up only recently again: there were 6 years between the release of 1.1.1 and 1.1.2. The latest version is 1.2 (Jul 2014). Changes are apparently minimal and mostly bugfixes.

Heh. Better late than never, right? :)

Anyway, as crazy as it may seem, I may write a patch to convert library methods to use ACL versus println and see if the CoreNLP folks are interested. Thanks for your feedback, @reckart!

@manning
Copy link
Member

manning commented Sep 27, 2015

Hi everyone, good news: We are convinced to start moving to logging! Our plan is to go with slf4j, essentially for the reasons @dinoboy197 gives: It just seems the most modern, most flexible facade, and it seems to be winning in practice: http://zeroturnaround.com/rebellabs/the-state-of-logging-in-java-2013/ . This is likely to be a slow, gradual implementation for the reasons @AngledLuffa gave, but in the next release, you should start to see slf4j appearing and j.u.l -- and a few System.err.println's -- disappearing. Happy to have help. We are actually at 0 unmerged pulled requests right now (!).

@manning manning added this to the v.3.5.3 milestone Sep 27, 2015
@Hronom
Copy link

Hronom commented Oct 29, 2015

Any progress in moving to logging?
Print in error stream message that not an error not good aproach, for example edu.stanford.nlp.util.Timing#startDoing:1500.
Current version: 3.5.2

Or at least give to user an ability to put custom LogListener

@dinoboy197
Copy link

Hi everyone, good news: We are convinced to start moving to logging! Our plan is to go with slf4j, essentially for the reasons @dinoboy197 gives

Super awesome! I missed this message about a month ago. Very excited!

Happy to have help.

@manning - What, in particular can I do to help out this effort? Once slf4j-api is integrated into the build, I can start replacing print statements with slf4j statements, but am not sure if you want smaller PRs, larger PRs, PRs for certain classes, packages, etc. Let me know!

@manning
Copy link
Member

manning commented Nov 17, 2015

Since work on this has started (slowly) on this - and slf4j-api is now present in the project - I'm going to close this issue.
@dinoboy197: I just merged your first PR. Thanks! In general, medium size PRs, not ones with 30+ files are much more manageable. But, I think we will still try to get v.3.5.3 out any day now before incorporating a lot of new stuff....

@manning manning closed this as completed Nov 17, 2015
@dinoboy197
Copy link

@manning - thanks for merging that first PR! That one only had 4 files - was that size ok?

The second slf4j PR that I filed (with 11 files) was in response to a comment from @J38 mentioning to me that I should file individual PRs for whole packages in corenlp. Which size do you like better?

J38 pushed a commit that referenced this issue Jan 5, 2016
@manning
Copy link
Member

manning commented Feb 21, 2016

Just in case anyone looks at this issue, we've decided to change again what we're doing for logging. (Flip-flopping Californian liberals, or something!)

While we agree that much of the world would like logging support, the adoption of slf4j logging in v.3.6.0 was unpleasant for many users, as it meant that not only CoreNLP but all our other smaller tools (NER, parser, classifier, POS tagger, ...) required the slf4j library. This caused breakages for many tools, especially integrations into non-Java frameworks which used to just load one jar and to have everything that they needed....

So, for v.3.6.1, we're going to switch to logging in CoreNLP directly via the Redwoods logging package that @gangeli wrote (and which already appeared in parts of CoreNLP). We've extended it -- and speeded it up -- and it will log via slf4j, if slf4j is present, and otherwise it will (by default) log to stderr.

If you looked at what's in master on github, we've now made a lot of progress on swapping System.err.println's across to Redwoods logging commands, though there are still quite a few minor things to fix.

@markwoon
Copy link

Out of curiosity, if the goal was to keep to a single jar, why not just use java.util.logging?

I'm not familiar with Redwoods. What features are available there that's not available in j.u.l?

@gangeli
Copy link
Member

gangeli commented Feb 23, 2016

There are actually features in Redwood that are not in j.u.l (tracks,
etc); but the main reason is that Redwood can re-route to whatever
framework the user wants, whereas j.u.l is stuck as itself. If you'd like
to log to j.u.l, you can always call
RedwoodConfiguration.javaUtilLogging().apply()

--Gabor

On Mon, Feb 22, 2016 at 4:40 PM, Mark Woon [email protected] wrote:

Out of curiosity, if the goal was to keep to a single jar, why not just
use java.util.logging?

I'm not familiar with Redwoods. What features are available there that's
not available in j.u.l?


Reply to this email directly or view it on GitHub
#1 (comment).

@reckart
Copy link

reckart commented Feb 23, 2016

j.u.l can also be routed through anything you like, for example through SLF4J. Only the respective logging backend needs to provide an adapter for j.u.l. Although the way that j.u.l is implemented makes it just a tad bit more effort to route it through something else than e.g. Apache Commons Logging.

@markwoon
Copy link

I use SLF4J, so I'm familiar with routing j.u.l. Just curious about Redwood. It sounds like it will automatically use SLF4J if it's detected, so that works fine for me.

@kno10
Copy link
Contributor

kno10 commented Mar 8, 2017

Please reconsider this. For all I can tell, Redwood comes at quite some overhead. Interfaces such as slf4j are designed to minimize performance cost of logging, both enabled and disabled. There is a reason why they use message templates, for example: to avoid string concatenation when logging is disabled. Redwood appears to be very expensive here, performing several array allocations for every log call (even when disabled) and by not allowing the use of templates. All the good logging frameworks are designed so that the Hotspot VM will inline the logger call, and when logging is disabled, it is a if (false) {} and can then be optimized away by Hotspot easily.

Also, the logger should never kill the application, but Redwood.java does:

public void fatal(Object... objs) { log(Util.revConcat(objs, ERR, FORCE)); System.exit(1); }

Use an exception, and give users a chance to catch and recover, rather than doing a hard exit.

@manning:

While we agree that much of the world would like logging support, the adoption of slf4j logging in v.3.6.0 was unpleasant for many users, as it meant that not only CoreNLP but all our other smaller tools (NER, parser, classifier, POS tagger, ...) required the slf4j library. This caused breakages for many tools, especially integrations into non-Java frameworks which used to just load one jar and to have everything that they needed....

I believe the better approach here is to simply provide a "standalone" jar (also known as "fat jar") for these users, which is self-contained and thus includes e.g. slf4j-jdk14 to log via the JDK (this should be just a few kb), and a library jar which does not include slf4j to be used with Maven, and which allows users to choose their favorite logging framework via slf4j, and get all the benefits.

@dinoboy197
Copy link

I believe the better approach here is to simply provide a "standalone" jar (also known as "fat jar") for these users, which is self-contained and thus includes e.g. slf4j-jdk14 to log via the JDK (this should be just a few kb), and a library jar which does not include slf4j to be used with Maven, and which allows users to choose their favorite logging framework via slf4j, and get all the benefits.

I completely support this approach. Logging framework changes should not cause undue pain with existing developers. However, I would hate to see a lock-in approach where best-practices in Java software engineering (for instance, third-party logging) can't be introduced to benefit those clients who want to follow them.

Perhaps this idea of dual distribution (fat jar with custom logging AND standalone jar via maven dependency which pulls in slf4j) suggestion is something which could be rolled into a major version change; ex: v3 -> v4 with explicit communication to clients that they now have choices?

@gangeli
Copy link
Member

gangeli commented Mar 9, 2017

Redwood does indeed come with some overhead, but CoreNLP doesn't exactly spew a lot of logs, even on the debug channel. Are there benchmarks that show measurable performance degradation? On my machine, I count 7.7M log messages / second, which means that even if CoreNLP were to spew an absurd 1k log messages per second, this would add an extra second for every 2 hours of runtime. Have you encountered concrete issues with runtime / interfacing with Redwood?

As the author of Redwood I of course I bring my own biases into the discussion, but in my view CoreNLP is first and foremost an academic library meant to support NLP research. Redwood was created first and foremost as a logger to facilitate this: hierarchical logging (startTrack / endTrack) are a more convenient way to log when the point of logging is to trace a single program execution (e.g., training a model / evaluating on a dataset) rather than a continuously running program. As another example: whereas in a production application a logger should never crash a program (as fatal does), in research code that's exactly what the semantics of "fatal" -- as opposed to "error" -- is. Of course, a library should never issue a fatal log, and CoreNLP never does.

Second, I do think there's an appeal to having pure CoreNLP be entirely dependency-free. Sure, certain features depend on external libraries (SUTime / serialization formats / etc.), but the library will boot without these. Keep in mind that not everyone uses Maven, and not everyone calls CoreNLP from Java. "Fat Jars" are a wonderful idea for applications, but often lead to a special sort of Jar hell with libraries, and I dislike the "use Maven or be sad" answer to that.

@dinoboy197
Copy link

@gangeli - understood. I'm commenting from a non-academic setting; my organization makes use of CoreNLP in combination with a variety of other libraries in a combined experiment that we use internally (so a fat jar doesn't work as losing that dependency versioning and conflict resolution creates a mess).

Having said that, whatever internal implementation a library (such as corenlp) has to interact with the slf4j api (redwood, for instance) is non-material; as long as the slf4j-api is the final logging backend, that's what matters. I know Chris mentioned that a year ago ("and it will log via slf4j, if slf4j is present, and otherwise it will (by default) log to stderr.") - that's the important part for me.

@kno10
Copy link
Contributor

kno10 commented Mar 10, 2017

@gangeli I am an academic user. Nevertheless, I do care about performance; because it limits how fast I can do development iterations and in the worst case, may mean missing a deadline. But yes, I have not benchmarked the share of runtime cause by logging currently, and it may well be negligible overall (but note that you want a next-to-zero overhead logging to make people actually use it).

Fat jars of course should only be used standalone, then there is no dependency hell. It's just the solution for those unable to handle multiple jars; the problematic case mentioned above. These are the users that will want a fat jar:

non-Java frameworks which used to just load one jar and to have everything that they needed


There is no excuse to use System.exit() in a logger:

  1. it violates separation of concerns. Loggers should log, not control program flow. If you really want to log and exit hard, you should be using log("message"); System.exit(1);

  2. It may cause you to lose the log message (e.g. when asynchronously logging to a network server collecting the logs of multiple workers)

  3. throw new Error("Fatal: cannot continue.") (and often, even a RuntimeException may be enough) is exactly the semantics you describe:

    An Error is a subclass of Throwable that indicates serious problems that a reasonable application should not try to catch.

    Nevertheless, an Error may allow the application to flush pending log messages, close file handles, and shut down gracefully rather than hard exiting, and also killing all other threads of the VM. See also: https://cwe.mitre.org/data/definitions/382.html
    It's okay to use System.exit in a driver. I'd recommend to never use it outside the class that has the public static void main(String[] args) entrance point; almost always an exception or error is more appropriate.

slf4j has a Marker interface that may be well suited. "Markers can contain references to other markers, which in turn may contain references of their own.", so they probably can be used to model the hierarchical use case you seem to have without much cost.

@J38
Copy link
Contributor

J38 commented Mar 10, 2017

Some thoughts:

1.) I'll look into improving the fatal situation, there are a few fatal logs in CoreNLP right now so it shouldn't be too hard to clean that up and make it safer. I'm open to changing Redwood's fatal() behavior if @gabor doesn't mind me making changes to his logging system! : )

I do think we have to be mindful of making sure Stanford CoreNLP doesn't bring entire applications down and gives them a chance to recover if for instance a pipeline has a serious problem.

2.) I am extremely skeptical that Redwood logging is causing any kind of meaningful performance hit. I would have to see someone provide a concrete example to consider this an issue.

3.) We support slf4j logging, so if users want to plug in their favorite logging framework that works with slf4j, we allow for that (though with the added cost of Redwood).

4.) If there are more problems with Redwood please feel free to let me know and I am open to modifying it to address concerns.

5.) Thanks everyone for bringing up great points!

@reckart
Copy link

reckart commented Mar 10, 2017

It is relatively straightforward to create a fat jar in addition to a non-fat JAR by adding the maven-shade-plugin to a POM. So offering both shouldn't complicate your build setup too much.

@kno10
Copy link
Contributor

kno10 commented Mar 10, 2017

Some numbers, with a jmh microbenchmark, tinylog backend, info level, logging disabled:

Logging 1 million string constants:

RedwoodBenchmark.redwoodconst      1000  thrpt   10     0,545 ±  0,011  ops/µs
RedwoodBenchmark.slf4jconstant     1000  thrpt   10   892,057 ± 27,687  ops/µs
RedwoodBenchmark.tinyconstant      1000  thrpt   10  1190,662 ± 10,029  ops/µs

Logging with the format string pattern (i.e. logger.info("{} plus {} is {}", i, j, i+j))

RedwoodBenchmark.redwoodlogf       1000  thrpt   10     0,515 ±  0,029  ops/µs
RedwoodBenchmark.slf4jformat       1000  thrpt   10   132,398 ± 18,193  ops/µs
RedwoodBenchmark.tinyformat        1000  thrpt   10   137,060 ± 14,474  ops/µs

Logging with string concatenation antipattern (logger.info(i+" plus "+j+" is "+ (i+j))):

RedwoodBenchmark.redwoodstringbuilder     1000  thrpt   10     0,518 ±  0,021  ops/µs
RedwoodBenchmark.slf4jstringbuilder       1000  thrpt   10    24,443 ±  0,271  ops/µs
RedwoodBenchmark.tinystringbuilder        1000  thrpt   10    25,293 ±  2,406  ops/µs

So while slf4j adds some overhead over e.g. tinylog, the cost of the facade is small. The cost of redwood however is massive. With redwood, I can ignore 0.5 disabled log statements per microsecond. With the others, this is around 1000. With format strings, about 130 (apparently there is still some array construction or boxing involved then).

We can also look at the runtime of a single call to info, first with a string constant:

RedwoodBenchmark.redwoodconst             1000  avgt   10  1602,644 ± 138,955  ns/op
RedwoodBenchmark.slf4jconstant            1000  avgt   10     1,100 ±   0,008  ns/op
RedwoodBenchmark.tinyconstant             1000  avgt   10     0,837 ±   0,007  ns/op

Then with a format string:

RedwoodBenchmark.redwoodlogf              1000  avgt   10  1807,916 ± 208,674  ns/op
RedwoodBenchmark.slf4jformat              1000  avgt   10     7,998 ±   1,441  ns/op
RedwoodBenchmark.tinyformat               1000  avgt   10     7,231 ±   0,469  ns/op

Or with string concatenation:

RedwoodBenchmark.redwoodstringbuilder     1000  avgt   10  1883,103 ±  16,495  ns/op
RedwoodBenchmark.slf4jstringbuilder       1000  avgt   10    40,877 ±   1,153  ns/op
RedwoodBenchmark.tinystringbuilder        1000  avgt   10    41,646 ±   6,564  ns/op

Cost of string construction with i+" plus "+j+" is "+ (i+j)):

RedwoodBenchmark.stringbuilder            1000  avgt   10    37,776 ±   5,089  ns/op

So Redwood Channels logging costs about 1600 ns per call, alternative approaches are about 1 ns for constant strings, and 8 ns for string patterns. Avoid using string concatenation in logging statements, unless they are warning level or higher (so usually enabled anyway). Instead, use templates.

You really don't want to use Redwood Channels for any debug-level logging. Too heavyweight.

Right now, it may not matter, but if you at some point want to add (optional) detailed logging for debugging, you may find this cost become problematic.

@gangeli
Copy link
Member

gangeli commented Mar 10, 2017

Right now, it may not matter, but if you at some point want to add (optional) detailed logging for debugging, you may find this cost become problematic.

I agree entirely with this statement, and with the benchmarks showing Redwood to be much slower than SLF4J. But, the point of Redwood is not to be a replacement for SLF4J -- SLF4J does a wonderful job of what it does. The point is to provide a different sort of logging, very much tuned to a particular use-case. CoreNLP is not expected to spew debug logs anytime in the near future (why would it? It's not like we analyze logs, or keep historical logs, or any of the normal things companies do with logs.).

I don't see this as a runtime concern for any practical application. If we're optimizing for efficiency, there's much more low hanging fruit in, e.g., your IntPair bug, or in CoreMaps, or in the Simple API serializing+deserializing on every annotation.

RE Fat Jar: My issue isn't with our own build process, but with people using it downstream. I'd like people to be able to put CoreNLP into a lib/ folder and add it to their classpath, and have it work (as, actually, I did for all of my research projects, or as wrappers in other languages often do it). This is awkward if CoreNLP doesn't boot as a simple Jar, and problematic if CoreNLP includes dependencies that I may be including from elsewhere -- effectively forcing a Maven-like dependency resolution scheme. SLF4J itself is not awful, because there aren't many versions and it doesn't have transitive dependencies, but nonetheless it's a trend I'd like to argue against.

RE Fatal: I've used this in my own projects to bring down multi-threaded programs. There's nothing worse than starting a run, having the main thread crash on something, and then having the whole process hang (holding up, e.g., cluster resources) because a non-daemon thread didn't exit. Sure, all my threads should be Daemons; but again, academic code. Yes, it breaks abstractions, since it's more than what a logging framework should do, but the whole point of Redwood is to be a bit different -- otherwise again, SLF4J does a wonderful job. Redwood also manages threads a bit to synchronize tracks in multithreaded applications, and does its own logic to suppress repeated log messages, etc. Clearly out of scope for a "logging framework," but useful in code that I and others have written. That said, I'm not opposed to removing the function, or changing its behavior.

And of course, at the end of the day the final say is Chris' and Jason's. I'm just voicing opinions. :)

@markwoon
Copy link

A big +1 for removing the System.exit() in logger.

@kno10
Copy link
Contributor

kno10 commented Mar 11, 2017

Does Redwood have any means of logging error traces (i.e. Throwables)?

I got this out-of-memory warning (logged as info) due to some Wikipedia documents, and the backtrace could be helpful to see where it got the OOM: #382

On a side note, when running multi threaded, the out of memory will also affect other threads, so it's not as easy as just continuing...) I probably need to rerun the entire database, because I can't reliably tell which documents were correctly processed, and which were not...

Also, CoreNLP currently has much of its debug logging code-disabled via private static final boolean DEBUG = false; - with a low-overhead logging, you could enable them if desired via logging configuration per package or class... i.e. if a user reports a bug, you could ask them to reproduce with logging enabled for relevant classes and send the log file; whereas with the static final boolean they would need to recompile first.

@gangeli
Copy link
Member

gangeli commented Mar 12, 2017

Redwood should support stack traces. The following works for me:

Redwood.log(new RuntimeException());

Yielding:

[main] INFO CoreNLP - java.lang.RuntimeException
  edu.stanford.nlp.util.logging.Redwood.main(Redwood.java:1315)
  sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  java.lang.reflect.Method.invoke(Method.java:498)
  com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)

The DEBUG = false bits are actually a holdover from when CoreNLP logged to stderr, and this was the only way to squash messages. I agree that we should remove them, but these sorts of technical debt fixes are often low-priority for the group.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants