Testing framework #3

ladc · 2016-02-04T07:33:38Z

Agree on a testing library and a test runner with minimal dependencies, preferably Lua/shell only.
Test cases should still be easy to run without any (big) dependencies.

agentzh · 2016-02-04T22:26:49Z

As mentioned in #5, I hope that with the data-driven approach we can provide an official test runner (or test harness) while also allowing more sophisticated third-party test runners to run exactly the same test suite without modification. The test cases themselves should be test-framework agnostic and stay very clean and self-contained. This also has the advantage that a Lua test framework library cannot affect the behavior of the test cases in any way, if we so wish.

ladc · 2016-02-22T00:05:02Z

@agentzh I've made a proof-of-concept test runner in the spirit of TestML and following the guidelines by @MikePall in his comment to issue #5. I've adapted some tests as an example (in the directories test/libs and test/libs_ext).

Like ldoc it considers the --- as special, in this case a marker for a new test. The first line defines the test name. Subsequent comments preceded by -- contain the description, and tags are added in the description +this +way. The test file's path is also tokenized and added to the tags.

The tags allow you to select or exclude certain tests; this allows for 'tiers' (slow/fast/lua52 etc.).
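
A minimal sketch of how such tag selection could work. The helper name `selected` and the exact semantics (every `+tag` must be present, no `-tag` may be present) are assumptions for illustration; the actual runner may behave differently.

```lua
-- Hypothetical helper: decide whether a test carrying `test_tags` is
-- selected by `spec`, a list of strings such as { "+table", "-lua52" }.
local function selected(test_tags, spec)
  local has = {}
  for _, tag in ipairs(test_tags) do has[tag] = true end
  for _, item in ipairs(spec) do
    local sign, tag = item:match("^([%+%-])(.+)$")
    if sign == "+" and not has[tag] then return false end  -- required tag missing
    if sign == "-" and has[tag] then return false end      -- excluded tag present
  end
  return true
end

print(selected({ "libs", "table", "fast" },  { "+table", "-lua52" }))  --> true
print(selected({ "libs", "table", "lua52" }, { "+table", "-lua52" }))  --> false
```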

It also considers the code before the first --- identifier as a prelude which should be included before each subsequent test.
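
The format described above can be sketched roughly as follows. This is not the code from the linked repository: the function name `parse`, the result fields, and the sample test file are made up for illustration, and tokenizing the file path into tags is left out.

```lua
-- Split a test file's source into tests, each prefixed with the prelude.
-- Assumed format: "--- name" starts a test, the "-- ..." lines directly
-- after it form the description, and "+word" in the description is a tag.
local function parse(source)
  local prelude, tests, current = {}, {}, nil
  local in_desc = false
  for line in (source .. "\n"):gmatch("(.-)\r?\n") do
    local name = line:match("^%-%-%-%s*(.-)%s*$")
    if name then
      current = { name = name, desc = {}, tags = {}, body = {} }
      tests[#tests + 1] = current
      in_desc = true
    elseif current == nil then
      prelude[#prelude + 1] = line          -- code before the first "---"
    else
      local desc = in_desc and line:match("^%-%-%s?(.*)$")
      if desc then
        current.desc[#current.desc + 1] = desc
        for tag in desc:gmatch("%+([%w_]+)") do current.tags[tag] = true end
      else
        in_desc = false                     -- description ended, body starts
        current.body[#current.body + 1] = line
      end
    end
  end
  local prelude_src = table.concat(prelude, "\n")
  for _, t in ipairs(tests) do
    t.desc = table.concat(t.desc, " ")
    t.code = prelude_src .. "\n" .. table.concat(t.body, "\n")
  end
  return tests
end

-- A made-up test file in the assumed format.
local example = [[
local insert = table.insert
--- insert appends
-- Appending at the end. +table +fast
local t = {}
insert(t, "a")
assert(#t == 1)
--- insert shifts
-- Inserting at the front. +table
local t = { "b" }
insert(t, 1, "a")
assert(t[1] == "a" and t[2] == "b")
]]

local loadstring = loadstring or load  -- Lua 5.1/LuaJIT vs. 5.2+
for _, t in ipairs(parse(example)) do
  local ok, err = pcall(assert(loadstring(t.code, t.name)))
  print(t.name, ok and "ok" or err)
end
```

Because each test is loaded as its own chunk with the prelude prepended, the file also stays runnable as-is with a plain `luajit file.lua`.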

It's here: https://github.com/ladc/LuaJIT-test-cleanup
Try running

luajit test.lua test/libs +table -lua52

or

luajit test.lua --help

Let me know what you think.

It still has a lot of TODOs of course, like the filenames used to extract failing code snippets to (currently uses os.tmpname()), etc.

fsfod · 2016-02-22T19:14:12Z

It seems kind of fragile to parse out the bodies of the tests and append what you hope are the helper functions, instead of just rewriting all the individual tests in a file into functions that you can selectively call.

ladc · 2016-02-22T21:49:09Z

@fsfod Thank you for your feedback. I tried to find a solution that's minimally invasive to the tests, while avoiding the need to include a YAML parser or the like, and while allowing the tests to be run as-is.

Of course a declarative framework like this would need a minimal specification for the tests' format, which would include that the chunk before the first --- contains the helper functions. And some tests would need to be adapted to comply with this.

Wiladams · 2016-02-23T02:48:29Z

I wonder why lua itself could not be the test format? Why rely on something else, when lua is perfectly capable of doing the job?

ladc · 2016-02-23T17:00:19Z

@Wiladams That would require more invasive changes to the tests, and no longer make them runnable as-is. Or it would require putting every (sub)test into a different file, which is not ideal either.

CapsAdmin · 2016-02-24T00:17:32Z

Coming up with names for all the subtests seems a bit difficult if they were to be split into files but I guess you could just name them table/insert1.lua table/insert2.lua etc instead.

I think parsing a test for subtests is fine. I would match subtests by balanced do end and leave the top part of the test as shared code. For instance https://github.com/LuaJIT/LuaJIT-test-cleanup/blob/master/test/ffi/ffi_new.lua has 10 subtests, where lines 1 to 11 would be each subtest's header. Then moving your comment config thing inside do end, or to the left of do, while adding a symbol of some sort to tell people reading the code that this is something that's being parsed, would make it seem a little less "fragile".

I don't know how important it is to not have dependencies but if you're going to use a stable version of luajit to run the tests you could just use ffi to iterate directories instead of relying on lfs. But that's something that should probably be looked at later.

ladc · 2016-02-26T23:02:25Z

OK, the --- divisor is indeed a bit fragile since the subtests can still leak (local) variables if they don't have do ... end delimiters. How would you match this in a robust way without adding too much machinery?

It's indeed better to drop the dependency on lfs in favour of an ffi solution (see ljsyscall and TINN).

ladc · 2016-02-26T23:22:02Z

I wrote

It's indeed better to drop the dependency on lfs in favour of an ffi solution (see ljsyscall and TINN).

OTOH if you're not testing a stable LuaJIT (or if you're in the process of porting it) you may appreciate a test runner that works in vanilla Lua...

Wiladams · 2016-02-27T06:23:54Z

Or minilua

ladc · 2016-02-27T09:50:24Z

@Wiladams I also considered minilua. But keep in mind that minilua lacks some core libraries and functions. If I understand correctly from genminilua.lua the following list is not included:

  collectgarbage dofile gcinfo getfenv getmetatable load print rawequal rawset
  select tostring xpcall
  foreach foreachi getn maxn setn
  popen tmpfile seek setvbuf __tostring
  clock date difftime execute getenv rename setlocale time tmpname
  dump gfind len reverse

CapsAdmin · 2016-02-27T14:08:51Z

@ladc Instead of trying to come up with a parser that matches do end, you could do something like:

--#header foo {
local ffi = require("ffi")
local bit = require("bit")

dofile("../common/ffi_util.inc")

ffi.cdef([[
typedef struct { int a,b,c; } foo1_t;
typedef int foo2_t[?];
void *malloc(size_t size);
void free(void *ptr);
]])
--#}

do 
  --#subtest test {
  --#include foo
  assert(ffi.sizeof("foo1_t") == 12)
  local cd = ffi.new("foo1_t")
  assert(ffi.sizeof(cd) == 12)
  local foo1_t = ffi.typeof("foo1_t")
  assert(ffi.sizeof(foo1_t) == 12)
  cd = foo1_t()
  assert(ffi.sizeof(cd) == 12)
  --#}  
end 

do --#subtest test2 {
  --#include foo
  assert(ffi.sizeof("foo2_t", 3) == 12)
  local cd = ffi.new("foo2_t", 3)
  assert(ffi.sizeof(cd) == 12)
  local foo2_t = ffi.typeof("foo2_t")
  fails(ffi.sizeof, foo2_t)
  assert(ffi.sizeof(foo2_t, 3) == 12)
  cd = foo2_t(3)
  assert(ffi.sizeof(cd) == 12)
  --#}  
end 

do --#subtest test3 {
  --#include foo
  local tpi = ffi.typeof("int")
  local tpb = ffi.typeof("uint8_t")
  local t = {}
  for i=1,200 do t[i] = tpi end
  t[100] = tpb
  local x = 0
  for i=1,200 do x = x + tonumber(ffi.new(t[i], 257)) end
  assert(x == 199*257 + 1)
  --#}  
end

I think the prefix symbol makes it clear that this is something more than a comment.

Instead of making a pseudo-language you could also use Lua itself somehow, at which point it becomes a generic macro thing (--#header = [[ ... --#]] ... --#test1 = [[ ... --#include(header) ... --#]]), but I'm not sure if there's going to be much benefit or if it'll make things easier.
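
A rough sketch of how the `--#header` / `--#subtest` / `--#include` directives proposed above could be read line by line, without matching do ... end. The function name `parse_directives` and the sample input are assumptions for illustration; the directive syntax itself is taken from the comment above.

```lua
-- Collect named headers and subtests; "--#include NAME" inside a subtest
-- is replaced by the text of the header with that name.
local function parse_directives(source)
  local headers, subtests = {}, {}
  local block  -- block currently being collected, or nil outside any block
  for line in (source .. "\n"):gmatch("(.-)\r?\n") do
    local kind, name = line:match("%-%-#(%a+)%s+([%w_]+)%s*{%s*$")
    local include = line:match("%-%-#include%s+([%w_]+)")
    if kind == "header" or kind == "subtest" then
      block = { kind = kind, name = name, lines = {} }
    elseif block and line:match("%-%-#}") then
      local code = table.concat(block.lines, "\n")
      if block.kind == "header" then
        headers[block.name] = code
      else
        subtests[#subtests + 1] = { name = block.name, code = code }
      end
      block = nil
    elseif block and include then
      block.lines[#block.lines + 1] =
        assert(headers[include], "unknown header: " .. include)
    elseif block then
      block.lines[#block.lines + 1] = line
    end
  end
  return subtests
end

-- Made-up input that avoids the ffi library so it runs in plain Lua too.
local example = [[
--#header util {
local function double(x) return x * 2 end
--#}

do --#subtest small {
  --#include util
  assert(double(2) == 4)
  --#}
end

do --#subtest large {
  --#include util
  assert(double(2^20) == 2^21)
  --#}
end
]]

local loadstring = loadstring or load  -- Lua 5.1/LuaJIT vs. 5.2+
for _, t in ipairs(parse_directives(example)) do
  print(t.name, pcall(assert(loadstring(t.code, t.name))))
end
```

Lines outside any `--#... {` block, such as the bare `do` and `end`, are ignored by this sketch, so the file remains valid Lua when run as a whole.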

Wiladams · 2016-02-27T16:10:38Z

Here's an example of what I'm thinking. Just deal with the test file as if it were a database of individual test cases.

This is all pure Lua, so a Lua parser can deal with it. And if you want to split things out, you can easily do that by just querying the table and only running what you want.

The added benefit is that you get more meta data associated with tests. So, if you want to run tests that are related to a particular area, you can easily do that.

I just want to separate the difference between data, and the mechanism to run the tests. The data itself can easily be represented in lua itself, minimizing the need to create a different 'language' simply to represent the data.

local testCases = {
  {
    id = "constov1",
    desc = "test case to catch issue 2345",
    author = "williamaadams",
    issue = "2345",
    [[
      local t = { "local x\n" }
      for i=2,65537 do t[i] = "x="..i..".5\n" end
      assert(loadstring(table.concat(t)) ~= nil)
      t[65538] = "x=65538.5"
      assert(loadstring(table.concat(t)) == nil)
    ]]
  },
  {
    id = "constov2",
    desc = "test case to catch issue 2346",
    issue = "2346",
    [[
      local t = { "local x\n" }
      for i=2,65537 do t[i] = "x='"..i.."'\n" end
      assert(loadstring(table.concat(t)) ~= nil)
      t[65538] = "x='65538'"
      assert(loadstring(table.concat(t)) == nil)
    ]]
  }
}

if not os.getenv("SLOWTEST") then return end

-- Run all the test cases: the positional entries of each case hold its code.
for _, tcase in ipairs(testCases) do
  for _, body in ipairs(tcase) do
    assert(loadstring(body))()
  end
end

CapsAdmin · 2016-02-27T16:19:02Z

That's probably how I would do it personally too but if it's important (for some reason?) to keep the tests as is in single lua files then I would do what I said.

ladc · 2016-02-27T17:15:49Z

@Wiladams That's a lot of boilerplate. If you need additional metadata, that can easily be parsed from the comments or tags. You could use a simple @key:value pair syntax in the description.

I really don't see the need to add a lot of overhead to the tests. I think it's best to keep the tests clean and simple and to not end up with thousands of files (seriously, this can be very painful on some systems).

Please look at the example below:
https://github.com/ladc/LuaJIT-test-cleanup/blob/master/test/libs_ext/table.lua

The tests remain very readable and immediately runnable, while you still have a lot of flexibility, like adding metadata using the tags and possibly arbitrary attributes.

I'm also not sure if this 'fragility' is much of a concern? The files will be "cut" at --- comments and the individual chunks as such will still need to be valid.

The responsibility for not unintentionally leaking variables between the tests lies with the writer of the tests, as is already the case. This will require some curation, but that will be the case with any framework, and luacheck can certainly help there.
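
A small illustration of the leak in question, using two made-up subtests. It only shows the general Lua behavior (locals are visible to the end of their chunk); it is not taken from the test suite.

```lua
local loadstring = loadstring or load  -- Lua 5.1/LuaJIT vs. 5.2+

local first  = "local counter = 41\n"
local second = "return counter"  -- relies on a local from the first subtest

-- Run as one file: the second subtest silently sees the leaked local.
local whole = assert(loadstring(first .. second))
print(whole())                       --> 41

-- Cut at the subtest boundary and load each chunk on its own:
-- `counter` is now just an undefined global in the second chunk.
assert(loadstring(first))()
print(assert(loadstring(second))())  --> nil
```

So a subtest that depends on a leaked local passes when the file is run as-is but changes behavior once the file is cut into chunks, which is the kind of issue curation (or wrapping each subtest in do ... end) has to catch.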

ladc · 2016-02-27T23:57:24Z

@Wiladams I've implemented your suggestion to allow for more metadata, using the @key: value syntax. Right now it's not yet possible to filter on these attributes like for tags, but this can easily be added if needed. In any case the functionality is there and the interface should be stable enough.
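
For illustration, extracting such attributes could look like the sketch below. The helper name `attributes` and the one-attribute-per-line rule are assumptions, not a description of the code in the pull request.

```lua
-- Hypothetical helper: collect "@key: value" attributes from the
-- description lines of a test (one attribute per line assumed).
local function attributes(desc_lines)
  local attrs = {}
  for _, line in ipairs(desc_lines) do
    local key, value = line:match("^%s*@([%w_]+):%s*(.-)%s*$")
    if key then attrs[key] = value end
  end
  return attrs
end

local attrs = attributes({
  "Constant table overflow.",
  "@issue: 2345",
  "@author: williamaadams",
})
print(attrs.issue, attrs.author)
```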

Please read the requirements by @MikePall again; I think I've addressed many of them in the proposed implementation. Missing features include documentation of the current features, parallelization, shuffling of tests, timing the runs to look for potential performance regressions, and running a test multiple times. C(-API) tests are not handled yet either.

Wiladams · 2016-02-28T01:53:37Z

I didn't quite get the comment on "a lot of boilerplate". None of those attributes are required. If you leave them out, then what you have is essentially the same thing as what you're proposing, but, instead of using a special notation in the comments, I just use the language itself to indicate what's what. Saves me from having to require that special parser to separate out the meaning from the comments. I can just use plain old lua.

Also, as far as the MikePall 'requirements' go, I take those as Mike's strong suggestions. He's not here to drive the effort, and he's said as much. He's leaving it in our hands. I thought I was following those strong suggestions: minimal dependencies, tests that can stand on their own, executed from a plain, normal Lua...

But, really, we should just go ahead and implement some things. We risk debating/prototyping ourselves into inaction. Not a single checkin since the original one...

ladc · 2016-02-28T16:03:51Z

Rewriting the tests as Lua strings in a table would add unnecessary, repeated code, which to me is boilerplate. But a more important downside to me is that you lose a lot with this approach: syntax highlighting and other editor tools like luacheck or lua inspect don't work on the code inside the strings; the test files will compile even with syntax errors in the tests; you can't run the tests as-is. To be honest, I really don't look forward to refactoring the tests into this format.
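
The point about syntax errors can be shown with a made-up file in the string-table format: the file compiles, and the error only appears when the test body itself is loaded.

```lua
local loadstring = loadstring or load  -- Lua 5.1/LuaJIT vs. 5.2+

-- A made-up test file: one test whose body has a syntax error
-- (the `for` loop is missing its `end`).
local file = [==[
return {
  { id = "broken", [[ for i = 1, 3 do print(i) ]] },
}
]==]

-- The file itself compiles and runs without complaint...
local cases = assert(loadstring(file))()

-- ...and the error only surfaces when the test body is loaded.
local chunk, err = loadstring(cases[1][1])
print(chunk, err)
```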

The parser is not that special; it basically just matches three leading dashes and parses the comments for tags and attributes. It doesn't get much simpler than that. It's implemented in #6. The resulting table is basically the one you would define explicitly in your proposed format.

Anyway, the required test format can still be modified by replacing the parse() function in the module tester.lua if need be. Could you please have a look at the other features of the test runner I proposed? For example:

- one can filter tests by specifying the tags as in +tag1 -tag2; this could be used for the different 'tiers', or possibly to replace conditions like if os.getenv("LUA52") then ... end
- to run the tests in a separate state, specify the command used, e.g. --runcmd="luajit -joff"
- failed tests are extracted into the failed_tests dir (with an error report appended as a comment).
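
As an illustration of the last feature, the text written for a failed test could be built as in the sketch below. The helper name `failure_report` and the report layout are assumptions; the actual output of the proposed runner may look different.

```lua
-- Hypothetical helper: the test's code followed by the error report,
-- with every report line turned into a Lua comment.
local function failure_report(code, err)
  local commented = ("\n" .. tostring(err)):gsub("\n", "\n-- ")
  return code .. "\n-- FAILED:" .. commented .. "\n"
end

local loadstring = loadstring or load  -- Lua 5.1/LuaJIT vs. 5.2+
local code = "local t = {}\nassert(#t == 1, 'table should have one element')"
local ok, err = pcall(assert(loadstring(code, "=subtest")))
if not ok then io.write(failure_report(code, err)) end
```

Since the report is appended purely as comments, the extracted file is still valid Lua and can be rerun directly to reproduce the failure.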

The command line tool's output to stdout is currently quite crude and the features are not complete. But if you think it could be useful (test format aside), please consider merging my pull request #6.

Wiladams · 2016-02-28T16:31:37Z

I think you should just do a pull request and we can move from debate to reality. I'm not married to my idea as much as I was just demonstrating a concept.

ladc · 2016-02-28T16:35:42Z

Well, the code has already been written and the pull request has been sitting here for a week... See #6.

fsfod · 2016-02-28T18:23:51Z

I don't mind metadata that can be ignored, but I would still prefer to wrap all the individual tests in a file in functions and use telescope to run them, optionally using your metadata to control which tests run and to declare special things to check, such as whether loops were JIT'ed, instead of what I currently do. Maybe it's because I started with a different test setup to MikePall, but extracting failing tests doesn't seem that useful to me, based on my experience implementing intrinsic support for LuaJIT. I would just set a telescope test filter to run only the failed tests I wanted, based on the test name from the test declaration it("test name", function() end). I also ended up sticking what would be your tags in the test name as well.

Wiladams · 2016-02-29T13:52:09Z

@ladc, just to be complete:

{
  id = "constov2",
  desc = "test case to catch issue 2346",
  issue = "2346",
  function()
    local t = { "local x\n" }
    for i=2,65537 do
      t[i] = "x='"..i.."'\n"
    end
    assert(loadstring(table.concat(t)) ~= nil)
    t[65538] = "x='65538'"
    assert(loadstring(table.concat(t)) == nil)
  end
}

This can also work. You don't need to put the test code into a literal string, it can be bracketed by anything that can show up discretely in a table. At this point, eliminating the meta data, and turning the curly braces into '---', we have exactly the same thing, except mine is parseable by lua directly, not requiring any sort of test parser.
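
A minimal sketch of a runner for this table-of-functions format, with optional filtering on the metadata. The two test cases, the issue number and the function name `run` are made up for illustration.

```lua
local testCases = {
  {
    id = "concat",
    desc = "table.concat joins with a separator",
    function()
      assert(table.concat({ "a", "b" }, ",") == "a,b")
    end
  },
  {
    id = "failing",
    issue = "0000",  -- made-up issue number
    function() error("deliberate failure") end
  },
}

-- Run every case accepted by the optional `wanted` predicate and
-- return a table mapping each case id to true (passed) or false (failed).
local function run(cases, wanted)
  local results = {}
  for _, case in ipairs(cases) do
    if not wanted or wanted(case) then
      local ok, err = pcall(case[1])  -- the function is the positional entry
      results[case.id] = ok
      print(case.id, ok and "ok" or ("FAILED: " .. tostring(err)))
    end
  end
  return results
end

local results = run(testCases)
-- Query the metadata: only run cases that are linked to an issue.
local only = run(testCases, function(case) return case.issue ~= nil end)
```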

agentzh · 2016-03-03T22:32:12Z

Using Lua for the test spec is very bad since it makes alternative test scaffolds written in other languages (like Perl and Python) much much harder. We need the capability to run the same test suite in various wildly different ways. BTW, TestML supports custom section delimiters other than ---:

http://testml.org/specification/language//index.html

I believe it's very wrong and limited to assume the test scaffold is always Lua.

Wiladams · 2016-03-04T05:50:12Z

I'm not sure you're reading my comments correctly.
There is a difference between how you represent the test data, and how you run the test cases. What I've been trying to point out in this thread is that there's not much difference in my eyes between using:
--- annotation here

and:

{ annotation = "here"

In either case, you can still have a test runner in PHP or whatever language you choose. It just so happens that the annotations I'm selecting are parseable using Lua as well.

The real question to me is "can we choose a test case format that is easily parseable by lua as well as other languages?" Is it necessary to make it harder for Lua to be the test case parser?

At any rate, I'm not pushing for anything here because I don't think it's worth the argument. The thing that should be selected is the thing we have tools for. It will no doubt change over time anyway.

ladc · 2016-03-04T20:04:09Z

Whichever representation we choose right now, it'll be trivial to parse and reformat the tests in the format which we ultimately decide to go for.

Wiladams · 2016-03-04T20:18:35Z

I agree. It's more important to get some momentum at the moment.

lukego · 2016-10-11T08:54:27Z

Help! I really need a simple way to drive the tests for the CI in #10. The ideal would be:

- Simple way to enumerate the names of all available benchmarks.
- Simple way to run a benchmark by name (using some sane default parameters).
- Uniform way to calculate a score for each benchmark, e.g. elapsed time until process terminates (with non-zero exit status on failure).

Could any kind soul provide this?

The current solution is basically to enumerate the tests with ls bench/*.lua and to run them with luajit bench/$test.lua. This is not ideal though... not every Lua source file is a stand-alone test case (some contain none, some contain many), not every test has sane defaults (e.g. running for at least 0.1s to amortize startup costs), and not every test case can necessarily be scored by execution time (they don't always do a fixed amount of work).

The drop-in solution would be to refactor bench/*.lua such that every file runs exactly one test case, requires no parameters, and executes a fixed amount of work. This would require splitting up bigger benchmarks (e.g. scimark) and moving library code into a subdirectory (also scimark). However that is just one possibility.
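
One possible shape for such a driver, sketched under assumptions: the registry `benchmarks`, its two toy entries and the functions `names` and `run` are hypothetical, and the score here is in-process CPU time from os.clock rather than the elapsed time of a whole process.

```lua
-- Hypothetical registry: one entry per benchmark, each taking no
-- parameters and doing a fixed amount of work.
local benchmarks = {
  sum_loop = function()
    local s = 0
    for i = 1, 1e6 do s = s + i end
    return s
  end,
  string_concat = function()
    local parts = {}
    for i = 1, 1e5 do parts[i] = "x" end
    return #table.concat(parts)
  end,
}

-- Enumerate the names of all available benchmarks, sorted.
local function names()
  local list = {}
  for name in pairs(benchmarks) do list[#list + 1] = name end
  table.sort(list)
  return list
end

-- Run one benchmark by name and return its score in CPU seconds.
-- An unknown name or a failing benchmark raises an error, which gives a
-- non-zero exit status when it is not caught.
local function run(name)
  local bench = assert(benchmarks[name], "unknown benchmark: " .. tostring(name))
  local start = os.clock()
  bench()
  return os.clock() - start
end

for _, name in ipairs(names()) do
  print(name, string.format("%.4f s", run(name)))
end
```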

I am happy to take care of the CI side if somebody can support on the test suite side :).

ladc mentioned this issue Feb 5, 2016

Convert, split up and reorganize tests #4

Open
