ISC title variations #497
The colon in `ISC License:` is probably copied from https://www.isc.org/downloads/software-support-policy/isc-license/ where it is part of a longer string that is more clearly not even supposed to be included in the text ("Text of the ISC License:"), so I think it makes sense to support matching the colon but not have it be canonical -- not sure I've seen it in the wild. I have seen `The ISC License` plenty, e.g. in one of the most starred ISC repos on github.com -- https://github.com/isaacs/node-glob/blob/master/LICENSE. I have also seen `ISC License (ISC)`, probably because that's what seems to be on https://opensource.org/licenses/isc-license.txt if you take the heading to be part of the license text, which people often do. Note the version at https://choosealicense.com/licenses/isc/ just has `ISC License`, and that version is of course popular.
Looks good to me. Some previous discussion here and the following comments. I'll buy your assertion that the ISC page does not consider their current “Text of the ISC License:” as part of the license itself, and once you accept that, your title makes sense.
I'm also ok dropping the entire title from the canonical version. For example, DHCP v4.0.0 includes no title. Markup for that would be:
<titleText>
<alt match="(The )?ISC License( \(ISC\))?:?" name="title"></alt>
</titleText>
or something like that. |
I weakly prefer to see a title, but happy to change PR to use alt for title if that makes sense for the project. Speaking of which, I realize I'm not sure what the objective here for canonical versions is. It's the text that'll show up on spdx.org/licenses/$ID ... but should that text be as true as possible to 1) the first known use of a license, 2) the license as currently published by its steward if it has one, 3) the most common use in the wild, 4) useful defaults for people who wish to copy and use the license text, or 5) something else? This especially has bearing on ISC ... if 3 or 4, probably ISC should not be the copyright holder in the canonical/published-on-spdx.org version. |
The
While that's obviously an acceptable option, I'd point out that whatever canonical version ends up being picked will have a significant effect on the text people end up using in the wild (case in point, the title shown in https://choosealicense.com/licenses/isc/ has been getting popular, as @mlinksva points out). The short, permissive licenses are quite easily confused: I ran into several people who were using the ISC license mistakenly believing it to be MIT (possibly having copied it from elsewhere, or inherited the repo maintainership). Sane defaults would make this kind of confusion less likely. [EDIT] examples of ISC/MIT confusion: umbrae/jsonlintdotcom#32, jung-kurt/gofpdf#77, danreeves/react-tether#36. |
👍. There's some discussion about this in licensee/licensee#139. I quite like the wording used for BSD-2 ("the copyright holders and/or contributors"), but I agree it's best to discuss this in a separate thread. |
IMHO this is the most reasonable choice. The other ones risk propagating mistakes (e.g. typos) or sub-optimal choices, but I can imagine that 2) could be considered legally safer. |
On Tue, Nov 28, 2017 at 06:05:49PM +0000, Waldir Pimenta wrote:
> 4) reflecting useful defaults for people who wish to copy and use
> the license text
> IMHO this is the most reasonable choice. The other ones risk
> propagating mistakes (e.g. typos) or sub-optimal choices, but I can
> imagine that 2) could be considered legally safer.
Yeah, my preference is (2), because it means the SPDX isn't making
decisions. Users can always advocate with the license steward to move
the canonical form towards useful defaults, and then SPDX would pick
them up if/when the steward accepted them. Although that obviously
doesn't help with licenses which lack an upstream maintainer (HPND?
Or is the OSI the canonical steward of that license?).
On the other hand, I am fine fixing upstream typos without waiting for
upstream to sign off on those changes (e.g. [1]). So not black and
white, but I like sticking closer to (2) than (4).
[1]: #488 (comment)
|
As pointed out at spdx#497 (comment) ISCL is used in some communities, and sometimes appears in license files https://github.com/search?utf8=%E2%9C%93&q=ISCL+filename%3ALICENSE&type=Code
I included "if it has one" with (2) for a reason... Is ISC steward of the ISC license? I think it's questionable. They're the creator/first user, and they have a web page up about what licenses their software is under (some ISC, some MPL-2.0). Contrast with FSF, CC, Eclipse, Mozilla, and Larry Rosen (off the top of my head), who either actively maintain their licenses, or their licenses stipulate that they are the steward. |
I think that makes them the steward unless:
If the ISC doesn't want to steward the ISC license, can we get them to designate an official successor? |
I don't think that being creator/first user makes one steward. Not providing a reusable version is pretty clear indication of non-stewardship. ISC is the steward of their software, not of the license they happened to make up. Same as MIT, UC Berkeley, and others. Those licenses have become widely used in spite of non-stewardship of their creators. (This is just a conjecture, not strongly held.) |
On Tue, Nov 28, 2017 at 08:33:36PM +0000, Mike Linksvayer wrote:
> Not providing a reusable version is pretty clear indication of
> non-stewardship.
I don't think it's that straightforward. In some cases the license
steward may not intend for reuse. In others, the license steward may
feel that <alt> tags are sufficient, without using placeholder values
as well.
> ISC is the steward of their software, not of the license they
> happened to make up.
I don't think the distinction between licenses and software is that
clear either. For example, the GPL is © FSF [1] verbatim copies only
[2]. They don't clarify whether that covers markup (like the XML we
add), but my impression is that they don't mind restyling (e.g. see
[3]). I think they might start to get grumpy if folks started
suggesting changes like “redistribute it and/or modify” →
“redistribute it or modify” and such.
If a steward goes dark and the OSI or SPDX or whoever wants to
nominate themselves as the steward, that's fine. But I think folks
should be careful, or we'll end up with multiple parallel would-be
stewards and things will get really confusing ;). The safe way would
be to freeze out the abandoned license, and deprecate it in favor of
ISC-SPDX (or whatever) that was stewarded by the SPDX from the start.
But that approach won't work with the OSI, where only the steward can
make the retirement request [4]. In that case, all you can do is add
a note saying “We the OSI assert that upstream is non-responsive and
suggest you use ISC-OSI (or whatever) instead because …”.
[1]: https://github.com/spdx/license-list-XML/blob/9f4432fbb660510859417b3d78a795beeeb8279b/src/GPL-3.0.xml#L24
[2]: https://github.com/spdx/license-list-XML/blob/9f4432fbb660510859417b3d78a795beeeb8279b/src/GPL-3.0.xml#L25-L26
[3]: https://www.gnu.org/licenses/licenses.html#VerbatimCopying
[4]: https://opensource.org/approval#retirement
|
I'd rather license texts be public domain (as Creative Commons ones are), but I don't think that has any bearing on whether a license is reusable, and thus in my opinion, potentially stewarded. As far as influencing practice, ISC, MIT, UC Berkeley, and similar have all gone dark. The cat is out of the bag, the licenses aren't frozen (obviously, but also see |
I tend to agree with @mlinksva here. The points raised by @wking are strong, but ultimately I believe organizations/projects aiming to curate licenses (OSI, SPDX, choosealicense, etc.) would do more good overall by ensuring sane defaults are available and can proliferate, than by increasing the complexity of an already fragmented field out of respect for actors who, for all we know, didn't make a conscious decision to make licenses hard to reuse in the first place. That said, I think it's reasonable to assume that consensus can be easily reached, in a one-by-one case, regarding what changes are acceptable (formatting/templating vs. rewording à la "and/or"); so as long as these decisions are made by the interested members of the community, following a reasonable discussion period, in an open forum, with permanent documentation, I don't think it's problematic to make these changes. Which, going back to the origin of this sub-thread, is the reason I support option 4 ("reflecting useful defaults for people who wish to copy and use the license text") for SPDX's canonical versions of licenses. |
That seems orthogonal. You still can't edit, for example,
So formally appoint a new steward in charge of vetting these things? What happens if the OSI decides a particular variation matches the ISC but SPDX does not agree? Or vice versa? For example, SPDX currently accepts the
$ curl -s https://opensource.org/licenses/ISC | grep and/or
Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.
SPDX claims the ISC is OSI-approved, but is the and-only form actually OSI approved? Once you have multiple parties editing the templates, this gets sticky. But in order to have a single party editing templates, the OSI (and others) would have to delegate that authority to SPDX (or vice versa). Or things like “OSI approved” will need to become conditional on the exact wording (in which case, collapsing to an SPDX ID may be too lossy to be useful).
Right, as long as folks keep in mind that the consensus impacts the many parties consuming SPDX.
With title text excluded from SPDX matching (and presumably from OSI/FSF/… analysis), I'm ok having a title in the canonical ISC text. And I'm also fine using dummy placeholders in copyright statements and other “you must replace this variable text to use the license” cases; users will be replacing those anyway. I don't think leaving those out makes the license particularly hard to use, but I don't think making those changes is likely to break compatibility with folks consuming the SPDX identifier either. But there's a very similar title issue with the MIT, where we claim (possibly based on the same mistaken reading of the OSI page) that there's a title which belongs in the license. The FSF folks would presumably prefer “The Expat License”, although they include no title in their text. Would you recommend the SPDX breaks that tie and picks one as the canonical license title? |
Wouldn't an |
On Fri, Dec 01, 2017 at 01:34:14AM +0000, Waldir Pimenta wrote:
> Wouldn't an `<alt match=...>` expression be able to support both
> options?
Yes.
> Note, I don't actually see this as being a tie in practical terms,
> considering the vastly more common usage of the MIT title than the
> Expat one.
But how many people include a title at all? choosealicense does [1].
The OSI [2], FSF [3], Gentoo [4], James Clark [5] (linked from Debian
[6]), and Wikipedia [7] do not. A few of the Fedora variants contain
titles [8], but they seem to end with “COPYRIGHT NOTICE, LICENSE AND
DISCLAIMER”.
> So as I see it, it would make sense to use the "MIT" title as part
> of the canonical template, while supporting the "Expat" title.
Supporting both makes sense. But having a title as part of the
MIT/Expat license text at all seems to be unique to choosealicense and
the SPDX.
[1]: https://github.com/github/choosealicense.com/blob/7213ccf8ac07263d66dd6aa8bf955d1bbec06117/_licenses/mit.txt#L32
[2]: $ curl -s https://opensource.org/licenses/MIT | grep -B4 Copy
</script></div>
<p>Copyright <YEAR> <COPYRIGHT HOLDER></p>
[3]: $ curl -s https://directory.fsf.org/wiki/License:Expat | grep -B1 'Copyright (c)' | head -n2
…<div class="smwttcontent">The given value "<pre>
Copyright (c) 1998, 1999, 2000 Thai Open Source Software Center Ltd
[4]: https://gitweb.gentoo.org/repo/gentoo.git/tree/licenses/MIT?id=d4ae32945e452603d9ab8000c4f748a7b2be446b#n1
[5]: http://www.jclark.com/xml/copying.txt
[6]: https://www.debian.org/legal/licenses/
[7]: https://en.wikipedia.org/wiki/MIT_License#License_terms
[8]: https://fedoraproject.org/wiki/Licensing:MIT?rd=Licensing/MIT
|
Getting back to the original purpose of this issue and pull request and a question for @mlinksva as he opened this: do I understand that you simply wanted to reflect that the presence or absence of a colon in the title can still be a match to the license? (even though as per the matching guidelines, the title does not need to be an exact match) |
src/ISC.xml
@@ -6,7 +6,7 @@
 <crossRef>http://www.opensource.org/licenses/ISC</crossRef>
 </crossRefs>
 <titleText>
-<p>ISC License:</p>
+<p><alt match="The " name="titleThe"></alt>ISC License<alt match=" \(ISC[L]{0,1}\)" name="titleID"></alt><alt match=":" name="titleColon"></alt></p>
Here's an alternative alt tag that uses a single alt which should improve performance on the matching:
<alt match="(The )?ISC License( \(ISC\))?:?"></alt>
I tested the recommended match with the following:
ISC License:
ISC License
The ISC License
The ISC License (ISC)
The ISC License (ISC):
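The check above can be reproduced with a short script. This is only a sketch: it assumes the `match` attribute is applied to the whole title string, and it uses Python's `re` module as a stand-in for the Java regex engine used by the SPDX tools (for a pattern this simple the two should agree, but that is an assumption). The names `TITLE_PATTERN`, `accepts`, and `results` are illustrative.

```python
import re

# Pattern copied from the suggested single-alt markup.
TITLE_PATTERN = re.compile(r"(The )?ISC License( \(ISC\))?:?")

# Title variations listed in the comment above.
titles = [
    "ISC License:",
    "ISC License",
    "The ISC License",
    "The ISC License (ISC)",
    "The ISC License (ISC):",
]


def accepts(title):
    """Return True if the whole title is matched by the pattern."""
    return TITLE_PATTERN.fullmatch(title) is not None


results = {title: accepts(title) for title in titles}
for title, ok in results.items():
    print(f"{title!r}: {'match' if ok else 'NO MATCH'}")
```

All five titles are accepted. A title such as `ISC License (ISCL)` is not accepted by this exact pattern; the `ISC[L]{0,1}` form from the diff would be needed for that.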
@jlovejoy yes. I was not aware of:
What counts as substantive text? I don't see that defined in https://spdx.org/spdx-license-list/matching-guidelines ... would love a pointer to matcher code so I can see intention manifest. 😄 I had noticed that punctuation is generally significant for matching purposes, which is one of the reasons I opened this PR. But the main reason I opened it doesn't have much to do with matching, which perhaps makes it uninteresting or off-topic for this repo: ideally for me, the texts published by SPDX would be as close as possible to what is used and useful in the wild, so that eventually projects like choosealicense.com and thus github.com can just consume what SPDX publishes. I've never seen a colon used in the title of ISC in the wild. Yes, trivial. I should've left ISC as the canonical copyright holder to a wholly separate issue, but this main reason is also why I mentioned that topic. |
@mlinksva There is a Java implementation of matcher code in the spdx tools repo. It is rather complex (probably more complex than it needs to be ;). There were a number of interesting technical challenges in implementing the algorithm, such as handling nested optional and variable blocks and greedy regular expression matches. The starting point is isStandardLicense. The algorithm works off of the license template, which describes variable and optional text. The License-List-XML source is translated to the license templates when the license-list-data and website HTML pages are generated.
Note that this doesn't exactly implement the matching guidelines, since there is no attempt to programmatically determine substantive text. This could be considered a bug in the matching algorithm; however, if we tried to implement a more generous matching algorithm, we would run the risk of allowing a match to substantive text unless we define substantive text more precisely. For the purposes of the License-List-XML source, if we include the additional matching hints as we did here, we will be able to match more of the license titles. |
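To make the template idea concrete, here is a rough sketch of turning a license template with optional and variable blocks into a single regular expression. It is not the SPDX tools implementation: it assumes a simplified template syntax (`<<beginOptional>>`, `<<endOptional>>`, and `<<var;name=...;original=...;match=...>>` with fields in that order), collapses whitespace as a crude normalization, and the toy template, including the optional title and the optional `/or`, is made up for illustration.

```python
import re

# Simplified token grammar (an assumption, not the full template syntax).
TOKEN = re.compile(
    r'<<beginOptional>>|<<endOptional>>|'
    r'<<var;name="[^"]*";original="[^"]*";match="(?P<match>[^"]*)">>'
)


def normalize(text):
    """Collapse every run of whitespace into a single space."""
    return " ".join(text.split())


def template_to_regex(template):
    """Translate the template into one compiled regular expression."""
    template = normalize(template)
    parts, pos = [], 0
    for tok in TOKEN.finditer(template):
        # Literal text between tokens must match exactly.
        parts.append(re.escape(template[pos:tok.start()]))
        if tok.group(0) == "<<beginOptional>>":
            parts.append("(?:")   # optional blocks become optional groups,
        elif tok.group(0) == "<<endOptional>>":
            parts.append(")?")    # so nesting works through group nesting
        else:
            parts.append("(?:" + tok.group("match") + ")")
        pos = tok.end()
    parts.append(re.escape(template[pos:]))
    return re.compile("".join(parts))


def matches(template, text):
    return template_to_regex(template).fullmatch(normalize(text)) is not None


# Hypothetical toy template: optional title (a var nested in an optional
# block) followed by a fragment of the grant sentence.
TEMPLATE = (
    r'<<beginOptional>>'
    r'<<var;name="title";original="ISC License";match="(The )?ISC License( \(ISC\))?:?">>'
    r' <<endOptional>>'
    r'Permission to use, copy, modify, and<<beginOptional>>/or<<endOptional>>'
    r' distribute this software'
)

print(matches(TEMPLATE, "The ISC License (ISC):\n\nPermission to use, copy, modify, and/or distribute this software"))
print(matches(TEMPLATE, "Permission to use, copy, modify, and distribute this software"))
print(matches(TEMPLATE, "MIT License\n\nPermission to use, copy, modify, and/or distribute this software"))
```

The first two texts match and the third does not. A sketch like this ignores the harder cases mentioned above, such as greedy variable matches that swallow following text, and it makes no attempt to decide what counts as substantive text.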
@goneall thanks for your patient reviews and explanation. I might open up other issues or PRs for tangents raised here, but am happy the title is sorted out. 😄 |
src/ISC.xml
@@ -6,7 +6,7 @@
 <crossRef>http://www.opensource.org/licenses/ISC</crossRef>
 </crossRefs>
 <titleText>
-<p>ISC License:</p>
+<p><alt match="(The )?ISC License( \(ISC[L]?\))?:?"></alt></p>
This is back to having no canonical title in the license body (which seems reasonable to me, but has had a fair bit of discussion in this PR, I don't know if there is a consensus yet). If the intention is to have no canonical title in the license body, I think we want to drop the `<p>` tags too. No need to render an empty `<p></p>`.
If the intention is to have a canonical title in the license body, then I think you want:
<p><alt match="(The )?ISC License( \(ISC[L]?\))?:?">ISC License</alt></p>
or whatever you decide to use for the canonical title inside. I'm also fine with `<alt …><titleText><p>…</p></titleText></alt>` if folks prefer that order.
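The difference between the two markups can be illustrated with a small consistency check: the body of the `<alt>` element is the canonical text, and it seems reasonable to expect that text to satisfy the element's own `match` expression. This is a sketch of such a check, not something the SPDX tooling is known to enforce; the helper name `canonical_is_consistent` is hypothetical, and Python's `re` stands in for the matcher's regex engine.

```python
import re
import xml.etree.ElementTree as ET

# The two variants discussed above: with and without a canonical title.
WITH_TITLE = r'<p><alt match="(The )?ISC License( \(ISC[L]?\))?:?">ISC License</alt></p>'
WITHOUT_TITLE = r'<p><alt match="(The )?ISC License( \(ISC[L]?\))?:?"></alt></p>'


def canonical_is_consistent(snippet):
    """Return True if the <alt> body matches the element's own match regex."""
    alt = ET.fromstring(snippet).find("alt")
    canonical = alt.text or ""  # an empty element has text None
    return re.fullmatch(alt.get("match"), canonical) is not None


print(canonical_is_consistent(WITH_TITLE))
print(canonical_is_consistent(WITHOUT_TITLE))
```

The first call reports a consistent canonical title. The second reports an inconsistency, because an empty body cannot satisfy a pattern that requires the words `ISC License`; that fits the suggestion to either supply a canonical title or drop the empty paragraph.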
I'd prefer the title be in the canonical version, so I'll go ahead and add it back. @goneall can either approve or ask it to be removed.
The colon in `ISC License:` is probably copied from https://www.isc.org/downloads/software-support-policy/isc-license/ where it is part of a longer string that is more clearly not even supposed to be included in the text ("Text of the ISC License:"), so I think it makes sense to support matching the colon but not have it be canonical -- not sure I've seen it in the wild.
I have seen `The ISC License` plenty, e.g. in one of the most starred ISC repos on github.com -- https://github.com/isaacs/node-glob/blob/master/LICENSE
I have also seen `ISC License (ISC)`, probably because that's what seems to be on https://opensource.org/licenses/isc-license.txt if you take the heading to be part of the license text, which people often do.
Note the version at https://choosealicense.com/licenses/isc/ just has `ISC License`, and that version is of course popular.
So I'm proposing that `ISC License` be canonical, optionally preceded by `The `, optionally followed by ` (ISC)`, optionally followed by `:`.
Separately, I also think it would be a good idea to make ISC not the canonical copyright holder. The license is now very widely used, generally with the disclaimer fields as `AUTHOR`, and starting with a fill-in copyright line like MIT (see above links). If there's any interest I could add here or make a new PR to cover these non-title aspects.
cc @waldyrious as an ISC expert