Ownership of specification? #17

Closed
isaacs opened this issue Mar 23, 2015 · 23 comments

Comments

@isaacs
Contributor

isaacs commented Mar 23, 2015

How are TAP specification changes adopted? Who votes?

Is this repository even relevant? If so, can that information be added to the README.md?

@jonathanKingston
Member

@isaacs sorry for the delay, I will answer this comprehensively shortly.

@jonathanKingston
Member

I'm sure you are aware of most of this however I will make it clear for the benefit of others.

I get the feeling that some of the push towards centralising the format has come from Andy Armstrong, who is the domain owner and previous site maintainer; however, I'm not perfect on the history here.
I brought the site back online over a year ago, clearing up lots of the old conversation-style content and turning it into something closer to what could be treated as a specification.

However, as I'm sure you are aware, some of the issues lie with the specification itself: it is ambiguous in places, which doesn't lend itself to writing a consumer or producer.

Something I have tried to drum up interest in is a committee that votes on specification changes, which TAP really needs in order to advance; I'm of the mindset that some pretty big breaking changes need to happen to actually advance the format. However, that goes against what a lot of the people who contribute here stand for.

This puts us in a very tricky place, whereby advancing the format is likely to consist of just more of the same additive changes, which so far has meant library writers adding new features until they become commonplace. This can't really continue either; the differences between producers and consumers are far and wide. Take, for example, subtests, which you have raised in the other issue: very few implementations align, and the old site had some notes on people's ideas.

I'm coming to the view that the motivations behind TAP pull in three separate directions:

  • Legacy - lots of TAP parsers and consumers exist in the wild, and most will likely break with any change. The motivation here is to keep TAP the same as it has always been.
  • Human readable - people want to add subtests and other changes in the most readable format, with lots of meaningful documentation and metadata.
  • Machine readable - lots of metadata in the smallest format possible; concurrency and ordering don't matter; consumers can open up the exact line of code that caused the issue, with a stack trace and memory output, like a cross-device Chrome debugger.

These three directions really don't match up at all and it puts advancing the format at a standstill. The closest I have come so far in my thinking is that two new formats would need to be produced to match these directions.

@isaacs
Contributor Author

isaacs commented Mar 23, 2015

I actually did not know most of that history, so this is very helpful, thank you!

I agree 100% that we need a committee of committed stake-holders. It's very easy to have big ideas about what would make a good format. It's another thing entirely to have thousands of people who use your test framework in a CI workflow and will get justifiably upset if you change it on them.

I would suggest that we approach it somewhat like TC-39 manages JavaScript. Come up with a list of the most relevant currently maintained and depended-upon TAP implementations from a variety of languages, and let them each nominate a member to be on the committee. Then, before a proposal can be ratified, there must be at least 2 compliant implementations.

Regarding your three groups, I think that there is more overlap than one might expect. I am personally in all three camps. I maintain a very popular JavaScript tap generating and consuming framework, and I have a lot of users who depend on node-tap's output being consumable by other consumers. I also maintain a lot of projects where I look at tap output for tests I run, so I like it to be human-readable. And making it machine-readable means that we can write better reporters that output stuff like spec, or dots, dashboards, or other fancy colorful things, which serves the human-readable goals in other ways.

I strongly disagree with the idea that we need large breaking changes. If anything, I'd like to see a "One TAP" resolution from this committee similar to TC-39's "One JavaScript". That is, no new proposals should be ratified if they cause significant hardship for parsers that do not understand that new proposal.

@jonathanKingston
Member

The reason very little has been approved so far is that there has been opposition on so many of the issues. I'm treating the format like the ageing relic that it is; the main issue as I see it is that most implementations remind me of the browser issues of the '90s, and that is why we need the standardisation process to happen.

The problem with the 'one tap' vision is mostly inconsistencies of producers:

  • Subtests are implemented in a few ways
    • Sub test plans, ordering, numbering
  • Some tests use '-' to separate the numbering
  • Some producers have assigned their own meanings to YAML
  • Many producers add extra metadata which is specific to just themselves
    • For example, I only recently found out about pragmas (Pragmas #6)
  • The philosophy of TAP suggests producers and consumers should behave in a very 'Quirks mode' manner http://testanything.org/philosophy.html
  • The age of some of the implementations on older systems leaves a lot to be desired; all *nix installs get some version of TAP with their installs, which drives sys-admins to worry about the stability of upgrades.

There is a lot of overlap within those three points, however I feel some desired changes can't be addressed very easily:

This all seems a little negative; however, I'm trying to shepherd something which is far bigger and older than most formats. XML and TAP are of a similar age; however, TAP wasn't written down in its current format until 2001. As you suggest, getting the authors of notable implementations on board is certainly a good direction.

Also, I am aware of node-tap; I used it to test the ESLint formatter I submitted 👍

@isaacs
Contributor Author

isaacs commented Mar 24, 2015

Yes, "One TAP" is an aspirational goal, and certainly not trivial. In less pithy terms, I'm proposing that we consider "breaks TAP <= n implementations" as a significant drawback when proposing new additions for "TAP n+1". New additions should be non-harmful to existing parser implementations, and ideally be interpreted (by humans, if not machines) somewhat like the intended purpose. Otherwise, we run the risk of providing too steep an upgrade path for implementors.

Remember, this is first and foremost a protocol, so it is a way for multiple different programs with different implementations to communicate with one another. The prescription that consumers ignore any line that does not parse as TAP provides a powerful means for forward-compatibility, and we should take advantage of it.

If there is a desire to implement something like asynchronous parallel/concurrent child tests, which cannot be safely done with the constraints of TAP, then the answer is not "make breaking changes to support this feature", but rather, "Create something that is not called "tap", which has that feature." If that thing is better than tap, then it'll win in the long run. Lead with running code and loose consensus.

The Test::More approach I advocated in #2, for example, is extremely backwards compatible. Any compliant TAP 13 parser will safely ignore the sub-test details, and print the pass/fail based on the trailing test line. So, e.g., instead of seeing 4 tests with 4 points each, it'll only see the 4 top-level tests. That's a reasonable fallback, and a human who sees the output will likely know what's going on. I'm using this with my own projects now, and it's very good.
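To make that fallback concrete, here is a minimal sketch of a lenient consumer, assuming indented subtest lines in the style described above. The function name `top_level_results`, the regular expression, and the sample stream are illustrative assumptions, not code from node-tap, Test::More, or any other implementation, and directives such as TODO and SKIP are deliberately not handled.

```python
import re

# Assumption: top-level test lines start at column 0, so indented
# subtest lines never match and are skipped like any unknown line.
TEST_LINE = re.compile(r"^(not ok|ok)\b(?:\s+(\d+))?(?:\s*-?\s*(.*))?$")


def top_level_results(tap_text):
    """Return (passed, description) pairs for top-level test lines only."""
    results = []
    for line in tap_text.splitlines():
        match = TEST_LINE.match(line)
        if match:  # version line, plans, and indented subtests are ignored
            results.append((match.group(1) == "ok", match.group(3) or ""))
    return results


# Hypothetical stream: two top-level tests, each preceded by subtest detail.
SAMPLE = """\
TAP version 13
1..2
    ok 1 - parses plan
    ok 2 - parses test line
    1..2
ok 1 - parser
    not ok 1 - handles yaml
    1..1
not ok 2 - diagnostics
"""

for passed, description in top_level_results(SAMPLE):
    print("PASS" if passed else "FAIL", description)
```

The consumer reports two results rather than five, and the failing subtest still surfaces through its trailing `not ok` line.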

I'm not sure why you'd say that machines hate yaml. There are many well-established parsers and generators. The one that node-tap uses today is rather terrible, but I'll be moving to js-yaml in the next major release and abandoning the "yamlish" module. JSON might've been a better choice, but it's significantly less human-readable.

My concern is that, if we optimize for parsers and generators, and are willing to make sweeping breaking changes, then taken to the logical ends, that leads to something like streaming line-delimited JSON, with a suite of output reporters for various purposes. A rigorously specified JSON format would address those concerns ideally, but in my opinion, the defining feature of TAP is that it does work for both humans and machines, and is profoundly liberal in what it accepts. But calling such a thing "TAP" is just going to be confusing and weird.
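As a rough illustration of where that road ends, the sketch below reads a line-delimited JSON stream and renders it through a trivial reporter. The event fields (`id`, `ok`, `name`, `file`, `line`) and the `report` function are hypothetical, invented for this example; no such format has been specified in this thread.

```python
import json


def report(ndjson_text):
    """Render one summary line per event in a line-delimited JSON stream."""
    lines = []
    for raw in ndjson_text.splitlines():
        if not raw.strip():
            continue
        # Every non-blank line must be valid JSON: there is no lenient
        # "ignore what you do not understand" fallback here.
        event = json.loads(raw)
        status = "ok" if event["ok"] else "not ok"
        lines.append(f"{status} {event['id']} - {event['name']}")
    return lines


# Hypothetical stream with made-up field names.
STREAM = """\
{"id": 1, "ok": true, "name": "parser"}
{"id": 2, "ok": false, "name": "diagnostics", "file": "test/diag.js", "line": 42}
"""

print("\n".join(report(STREAM)))
```

The raw stream is easy for machines and awkward for people, so a reporter is needed before a human can comfortably read it. That is the trade-off being described.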

@beatgammit

JSON might've been a better choice, but it's significantly less human-readable.

If we're open to changing the format of meta-data inclusions, perhaps something like TOML or any other ini-style format would be nicer (easy to parse, easy to read). The main problem with YAML is that it's a huge specification which tries to do everything. Perhaps I'll formalize this as a change request for TAP > 13.

I think I agree though that any changes shouldn't break existing consumers, but existing consumers may not be able to take advantage of the new features. Perhaps this could be formalized with a CI-setup? In order to get a new feature into the protocol, an example must go through some number of existing implementations. If this exposes a bug in an implementation, the bug must be fixed and released before the change is accepted.
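One way such a CI check could look, as a sketch only: feed the proposed example to each consumer and list the ones whose summary differs from what the proposal expects. The two consumers here are toy Python stand-ins written for this example; a real setup would have to invoke the actual implementations, and `check_proposal` is a hypothetical name.

```python
def lenient_consumer(tap_text):
    """Count top-level results, ignoring every line it does not understand."""
    passed = failed = 0
    for line in tap_text.splitlines():
        if line.startswith("not ok"):
            failed += 1
        elif line.startswith("ok"):
            passed += 1
    return passed, failed


def naive_consumer(tap_text):
    """Toy buggy consumer: strips indentation, so subtests are double-counted."""
    stripped = "\n".join(line.strip() for line in tap_text.splitlines())
    return lenient_consumer(stripped)


def check_proposal(example, consumers, expected):
    """Return the names of consumers whose summary differs from `expected`."""
    failures = []
    for name, consume in consumers.items():
        try:
            summary = consume(example)
        except Exception:  # a crash counts as a failure to cope with the example
            summary = None
        if summary != expected:
            failures.append(name)
    return failures


# Hypothetical proposal example: one top-level test with one indented subtest.
EXAMPLE = "TAP version 13\n1..1\n    ok 1 - inner\n    1..1\nok 1 - outer\n"
CONSUMERS = {"lenient": lenient_consumer, "naive": naive_consumer}

print(check_proposal(EXAMPLE, CONSUMERS, expected=(1, 0)))
```

Any name in the returned list points at an implementation bug to fix, or at a proposal that is not as backwards compatible as hoped.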

It seems that we have support of a few significant projects, so perhaps this needs to be formalized for progress to be made? I created #18 to discuss how we'll guarantee this.

@jonathanKingston
Member

@isaacs
I'm struggling to find a very good comment on one of the TestAnything repos which explained a lot of the issues with YAML; mostly, it is far too feature-packed and has had lots of poor implementations over the years because of the complex syntax, etc.

Subtests are extremely backwards compatible, provided we want TAP implementations to continue ignoring lines that they do not understand. I'm still lost as to why forwards compatibility is good behaviour for a testing framework.
Let me take an extreme example; it highlights my view of how testing should behave:

  • A life support machine outputs TAP for each test of the patient's heartbeat
    • The machine reads 'OK - heart rate within normal bounds' when the patient is ok.
    • The machine reads 'not ok - heart rate not ok' when the patient is dead or does not have a healthy heart rate
  • A separate system reads the TAP output and alerts the nearest doctor's pager if the patient is not ok.
    • This system abides by the TAP philosophy of ignoring TAP that it doesn't understand
    • The system designer is also clever enough to cope with loss of network or power failure; no response for a certain time calls out a nurse to check the patient is ok.
  • The life support machine suffers a stack overflow and starts to output memory dumps and stack traces
    • The patient arrests and the separate system ignores the output.

If this behaviour seems OK to the TAP community as a whole, this is where I have wanted to create a subset of stricter TAP parsers which fail on non-standard output.
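The difference between the two behaviours can be sketched as a single consumer with a strictness switch. Everything here is an assumption made for illustration: the `consume` function, its `strict` parameter, the `NonTapOutput` exception, and the simplified pattern for what counts as a recognised line. It is not taken from any existing parser.

```python
import re

# Simplified assumption of what counts as a recognised line: version,
# plan, test line, comment, bail out, or a blank line.
KNOWN_LINE = re.compile(
    r"^(TAP version \d+|\d+\.\.\d+|(not ok|ok)\b.*|#.*|Bail out!.*|\s*)$"
)


class NonTapOutput(Exception):
    """Raised in strict mode when a line is not recognised."""


def consume(lines, strict=False):
    """Return failing test lines; in strict mode, reject unrecognised input."""
    failures = []
    for line in lines:
        if not KNOWN_LINE.match(line):
            if strict:
                raise NonTapOutput(line)
            continue  # lenient mode: silently skip what is not understood
        if line.startswith("not ok"):
            failures.append(line)
    return failures


# Hypothetical output from a producer that crashes part-way through.
OUTPUT = [
    "ok - heart rate within normal bounds",
    "Segmentation fault (core dumped)",
    "0x7ffe0000 0x00000000 0xdeadbeef",
]

print(consume(OUTPUT))  # lenient: no failures reported
try:
    consume(OUTPUT, strict=True)
except NonTapOutput as error:
    print("strict consumer rejected:", error)
```

In lenient mode the crash output disappears without a trace, while in strict mode the first unrecognised line stops the run.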


Line-delimited JSON, as much as it solves the JSON issue, is in the same ballpark as YAML front matter for articles: it pushes a format into another one that it was never designed to be embedded in.

@beatgammit #18 largely duplicates pretty much all the other issues it references.

Yeah I even started creating a subset of JSON which allows template strings for this exact usage.

@beatgammit

@jonathanKingston

The reason I created it was to discuss backwards compatibility and how to ensure it. I think getting this in place (with a related RFC process) is critical to making progress. If you don't agree, feel free to close it.

I feel there's been a lot of discussion but not a lot of progress, because it's not apparent what the next step is. I'll detail my proposed RFC process in #18.

Again, since it seems you're the de facto project lead, I would really like your input on this type of proposal system. If you agree that it's a valuable effort, then I'll go ahead and work on a PR to get this system in place.

@isaacs
Contributor Author

isaacs commented Mar 25, 2015

@jonathanKingston

Let me take an extreme example however it highlights my direction on how testing should behave...

Please do not use tap for monitoring life support systems.

Tap is a test reporting protocol designed to be parseable and also human readable. It is not an alerting system. It is the wrong tool for this job.

I'd suggest using a system designed specifically for monitoring outages, such as nagios.

That being said, if you believe that test harnesses should be extremely strict and treat any non-tap output as an error, then you are of course free to create one that does so.

If we want to make it easy for implementers to upgrade, we need to make it easy for them to continue to work with tap 13 implementations while they upgrade to tap 14. That means creating new features that degrade gracefully in the current context of what exists, so that a tap 14 producer can be reasonably consumed by a tap 13 consumer, and vice versa.

If we don't want to make it easy for implementers to upgrade, then there is no reason to bother specifying anything. History has taught us how to handle cases where there are many implementations in the wild of a many-to-many protocol. Pave cowpaths, provide graceful upgrade options, specification following implementation. If it doesn't work with TAP::Test, Test::More, tap4j, node-tap, etc, it's not a spec worth caring about. The implementations will push one another towards consistency; the spec is a way to facilitate this natural maturation.

So. Who's gonna be on the committee? When can we meet? Which implementors are interested in participating? Is anyone working on bylaws, voting mechanisms, inclusion requirements? Without implementor buy-in, there is no specification.

This repo here is a fine place to start, but I am eager to progress past this to some kind of explicit decision-making process.

@kinow
Member

kinow commented Mar 25, 2015

Hi all, I'm relocating and won't be able to help much on TAP discussions for the next four to six weeks, I think. But I'm trying to read all the messages to keep up to date with the discussion :-)

"Create something that is not called "tap", which has that feature." If that thing is better than tap, then it'll win in the long run. Lead with running code and loose consensus.

Indeed. The JPA Java specification has several things that were defined in Hibernate. IIRC, some Java security specifications were also based on implementations. One more recent example is the new Java date API in Java 8, based on the excellent Joda time library. If one creates such library for running tap with other interesting features, that will only make it easier for them to get added to future specifications.

If we want to make it easy for implementers to upgrade, we need to make it easy for them to continue to work with tap 13 implementations while they upgrade to tap 14. That means creating new features that degrade gracefully in the current context of what exists, so that a tap 14 producer can be reasonably consumed by a tap 13 consumer, and vice versa.

+1, I agree with @jonathanKingston when he says that we must not be afraid of breaking things (IIRC it was you who said that in some other thread, if not I apologize) but I'm also +1 that we must find a balance, introducing new things, and slowly deprecating things.

So. Who's gonna be on the committee? When can we meet? Which implementors are interested in participating? Is anyone working on bylaws, voting mechanisms, inclusion requirements? Without implementor buy-in, there is no specification.

I believe the old specification had a group of people working on a formal IETF draft, and they also met at some conferences to discuss and vote on some of the issues they had at the moment (which is probably like the situation we have now in this repo).

The votes were reported on the old website/Wiki with +1 and -1, as well as the decisions taken by the group. It would be really nice if we could also meet in person, or have some video conferences to try to work on issues that are, sometimes, harder to get consensus here on GitHub or mailing lists.

Even if I'm not on the committee, I'd be happy to help with testing, reviewing docs, and later updating tap4j and the Jenkins tap plug-in to the latest specification (or even creating snapshots compliant with a spec draft).

@jonathanKingston
Member

@isaacs I had no intention to use it for anything that critical, it was however the example that kept cropping up in my head.

It isn't a system at all however I would like to use it for simple continuous tests similar to this, for example monitoring many pages to see if live reloaded changes cause parse errors or linting issues. Consumers could then join together the stream input of multiple of these lint tests and alert when something broke the build. TAP does lend itself to this style of tests, if however this isn't a goal others are interested in then that can be done elsewhere.

That being said, if you believe that test harnesses should be extremely strict and treat any non-tap output as an error, then you are of course free to create one that does so.

I do; however, in my paranoia I would like both consumer and producer to share this behaviour. We have mentioned before the use of pragmas to specify strictness, which would be a forwards-compatible change and would make me very happy.

As I have mentioned before I would prefer TAP 14 to be non breaking, just further clarifying what is currently out there in a much stricter manner than is currently done and bringing it in line with all other specs in wording and examples. I am pro breaking changes as you say when we have isolated where the weaknesses are and if there are no other alternate paths to use instead (Further additions that could be the preferred way, this is the behaviour of many other well adopted web specs with warts). I think we are aligned mostly however I would take a harder line I feel which is where the steering committee comes into play.

So. Who's gonna be on the committee? When can we meet? Which implementors are interested in participating? Is anyone working on bylaws, voting mechanisms, inclusion requirements? Without implementor buy-in, there is no specification.

I'm open to helping; I'm not sure if there would be a place for me in the long term, however. I think there is enough technical knowledge between us for all of us to be implementors of the changes. In fact, I think all the people who speak here have written TAP plugins, open source or commercially.

I'm sure certain contributors here should certainly have a vote: @Leont, @AndyA, @exodist, @Ovid, @isaacs

All these people have expressed a lot of interest in the project and would likely give valuable conversation and I'm not sure if locking them out early on is a wise idea: @xmojmr, @beatgammit, @ligurio, @kinow, @gaurav, @jonathanKingston
Without these I suspect the traction would not have been there to get to this stage, I would propose having a period of review while the committee is still young. After the period of review I would suggest any member could be voted out by all but themselves voting against them. I'm wary of forming a list set in concrete for a format that is so old; but on the other hand worried about a committee of only 2 plugin authors that have the time continuing the format.

I would also like to reach out to the TAP mailing list and to all the contributors to the format so far: http://testanything.org/history.html, which includes the likes of Larry Wall @TimToady. I'm guessing there will be crickets in return; however, it is only fair that these people are given an open hand in voting for TAP's future too.

I suspect a closed mailing list would potentially be appropriate to gain consensus on items; the committee would then publish the outcome. Video chat would work too, as part of a regular minuted meeting. Meetings in person would likely be more productive; however, I'm not sure they could be regular due to the distribution of people.

This repo here is a fine place to start, but I am eager to progress past this to some kind of explicit decision-making process.

Me too, this repo has actually drummed up a lot more interest into TAP than has happened in the past years. It needed traction before the process could happen in my opinion due to the age of the software; now is the time to solidify the process.

@kinow yup I did say that, see above.

@beatgammit

All these people have expressed a lot of interest in the project and would likely give valuable conversation and I'm not sure if locking them out early on is a wise idea: @xmojmr, @beatgammit, @ligurio, @kinow, @gaurav, @jonathanKingston

I am interested in being involved, but I don't think I'm at a point yet where I'd want any kind of voting power. I use TAP occasionally, but I'm not invested enough to want that kind of responsibility.

I don't know enough about the history of TAP, but perhaps people involved in the original Perl test framework should be offered a position at the outset (that is, if they weren't already in your list)?

@isaacs
Contributor Author

isaacs commented Mar 26, 2015

@beatgammit Everyone on the list other than me are Perl testing folks ;)

@isaacs
Contributor Author

isaacs commented Mar 26, 2015

@jonathanKingston As far as the steps to move forward, I think we're pretty much on the same page, which is very encouraging. We can disagree with everything else after that, but at least then we'll have a framework for disagreeing productively ;)

I think it would probably be a bit overkill to create a foundation and working group and all the other ceremony for this, especially since there are really no assets to maintain, and it's just a loose coalition of implementors.

A private conversation could be useful to get things started, but once we come to agreement on how the spec gets decided, we should certainly make decisions and discussions as public as possible.

I'm going to be away for the next few weeks on vacation. I'd love to pick up the conversation then and see if we can make progress towards a regular meeting. I also expect that some of the perl crowd who haven't been involved in this repo will probably not be interested, but I'd love to extend the invite out of respect anyway. Their input would be valuable if they feel like being a part of the process.

@jonathanKingston
Member

@isaacs great I look forward to the debate anyway, I'm happy to be overruled so long as I can help move the discussions on.

You mean we can't have TAPconf? 😆

I was thinking mostly that the private conversations could be useful to align the group regularly, not to hide away discussions, just to ensure that people are not reading lots of debate over semantic noise that is unlikely to help. It might just be a tool used in the initial setup; however, it could be worthwhile as the group would likely be very distributed. I believe many of the W3C groups do the same for similar reasons.

I will draft a mail tomorrow to the others that were mentioned in my comment above (Mostly perl contributors and the mailing list).

@beatgammit I don't really see how a history of TAP matters much. I see you are maintaining a consumer in Go which also parses TAPY/J, so you are likely versed in the benefits those additions have brought to the table and how they could be merged back into the format.

@beatgammit

You mean we can't have TAPconf?

I'm sure the TAP community could force its way in to nearly any other conference.

I will draft a mail tomorrow to the others that were mentioned in my comment above (Mostly perl contributors and the mailing list).

Awesome!

@Leont

Leont commented Mar 27, 2015

How are TAP specification changes adopted? Who votes?

Is this repository even relevant? If so, can that information be added to the README.md?

I've been wondering about that myself too. We need some structure. I'm generally in favor of focussing on implementors here (producers and consumers).

@ranford

ranford commented Apr 7, 2015

I support the work you are trying to do here and will at least be watching closely. I am an implementer of the TAP producer for MATLAB (BTW, here is a post demonstrating this producer in action: http://blogs.mathworks.com/developer/2015/01/29/tap-plugin/) and one reason we chose to implement TAP before another output format like JUnit XML is the language-agnostic intention of the framework, which I applaud. That said, the JUnit XML format is certainly more widespread, meaning that there are more CI systems, for example, that support JUnit-style XML but do not support TAP.

However, even with the problems discussed here surrounding the TAP spec, the JUnit spec is worse, with the format getting gleaned from the JUnit Ant task and CI systems interpreting this format how they will. There is no official schema or spec like there is for TAP.

As far as the committee goes, I would encourage you to be sure to look for a committee which represents a diverse set of languages and test frameworks. Obviously, due to Perl's rich history with TAP, there should be no trouble getting the Perl community involved in the committee, but it may take some deliberate thought to grab folks that are consumers/producers working in other languages. For example, it may be a good goal to strive for representation from the top languages of the TIOBE index and their relevant test frameworks.

@jonathanKingston
Member

Due to the time I wish to put into working on the WebAppSec WG and my drive to work on web standards I'm going to mostly bow out of this now.

I shall appoint @isaacs as an organisation administrator due to his background in managing such communities and large-scale software.

@isaacs
Contributor Author

isaacs commented Jun 11, 2015

Hahaha, careful what you wish for, right!?

Thanks, @jonathanKingston, I'll try to get something moving shortly, maybe write up at least a straw-man of a plan and get folks to start poking at it. I've been rallying some of the JS tap folks to at least write up a specification of what we're currently implementing. Maybe we can get broader buy-in as well.

@jonathanKingston
Member

I got made an Invited Expert there so really want to plough as much effort as possible into that. I'm currently spending most of my time trying to catch up on oodles of pages of specs.

I'm happy to read whatever and provide help where possible though. Thank you so much for taking on waving the flag for JS devs to help too.

@Leont

Leont commented Jun 12, 2015

Due to the time I wish to put into working on the WebAppSec WG and my drive to work on web standards I'm going to mostly bow out of this now.

That sounds pretty cool, good luck with that :-).

@isaacs
Contributor Author

isaacs commented Aug 8, 2015

See #20

@isaacs isaacs closed this as completed Aug 8, 2015

6 participants