% Moulette documentation
% Kevin Amiraux
% 2013

Moulette
========

General information
-------------------

### Author and origin

This software was written by Kevin Amiraux (<[email protected]>) in 2013.

It was originally designed to execute a generic test suite on a large set of
student submissions.

### Presentation and objectives

This is a system to automatically execute a test suite on a set of student
submissions.

The following functionalities are available:

- Isolation of students from each other and from the system

- Configuration file for a project

- Complete architecture for tests:

  - A set of compilation units: useful if the student code has to be compiled in
    several ways.

  - A set of test units: a set of tests may refer to a test unit, which avoids
    rewriting the same thing several times.

  - A set of test data: the test data may be put in this set in order to be
    referenced by tests afterwards.


### Usage

This software can be called with the following syntax:

    ./moulette CONFIG_FILE

where `CONFIG_FILE` is the path to the configuration file of the project to be
processed.


### About isolation

TODO

#### Timeouts

TODO


### About parallelization

Parallelization of tests is possible by setting the variable `M_NB_WORKERS` to a
value greater than `1` in the configuration file.

Please keep in mind that parallelization is done on students, i.e. multiple
students are processed at the same time, not multiple tests on the same student.


Configuration file
------------------

This software provides a simple way to process various projects; each one may
have a different configuration file, which is given as an argument when the
software is invoked.

The configuration file must expect to be executed in the directory where it is
located (this is very important). Every path defined by the variables in this
file must be an absolute path. To use paths relative to the location of the
configuration file, the special variable `$PWD` may be used.

Example:

    $ ls
    project.config
    my_script.sh
    $ cat project.config
    #! /bin/sh

    PATH_TO_SCRIPT=$PWD/my_script.sh

### Variables to set

- `M_PROJECT_NAME` is a string representing the project name.

- `M_RTOKEN` is a string; it should be set to a random token. This is for
  security reasons: it may prevent anyone from finding the name of the directory
  where the tests are executed. The rights management of this software should
  prevent such a threat, so this variable may be left empty.

- `M_LOG_FILE_NAME` is an absolute path to the file where logs will be
  output. This name will be suffixed to indicate the nature of the log file; for
  example, during the pre-processing step logs will be written to the file
  `${M_LOG_FILE_NAME}_pre_process.out`.

- `M_LIMIT_DISK_USAGE` is a boolean value (`true` or `false`). It is used to
  determine whether temporary test output directories should be deleted when
  they are no longer required by the software. Setting it to `true` may save a
  lot of disk space during processing. Setting it to `false` may be useful for
  test-suite debugging or manual review.

- `M_MAKE` is a command; it will be used to call the `make` utility. It should
  be used to choose between `make` and `gmake` on non-GNU operating systems.

- `M_MAX_DISPLAY_DIFF_LENGTH` is an integer; it is the maximum size of the
  `diff` output when differences exist between the test output and the reference
  output. Beyond this limit the `diff` output will not be displayed; a simple
  message saying that the outputs differ will be shown instead.

- `M_ARCH` is a string; it is the system architecture where the tests will be
  executed. It may be obtained with the command `uname`.  This variable is used,
  for example, to display correct information messages when a signal is received
  by a process during a test execution.

- `M_STUDENT_LIST_FILE` is an absolute file path; it is the file containing the
  list of students to be processed by the moulette.

- `M_TESTS_FOLDER` is an absolute path to the directory where the test suite is
  located (see the section on _Test suite_).

- `M_TARBALLS_FOLDER` is an absolute path to the directory containing the
  tarballs to be tested.  For the moment (-> TODO) the tarballs must be
  uncompressed (i.e. each "tarball" is a folder containing the files).  The name
  of the folder must be the name/login of the student as it appears in the
  student list file.

- `M_TRACE_DIR` is an absolute path to the directory where traces must be
  output. One file will be created for each student.

- `M_LOCK_FILE` is an absolute path to the lock-file to be used. This is used to
  prevent parallel execution of tests marked as "non-parallelizable".

- `M_NB_WORKERS` is an integer that represents the maximum number of workers
  that will be executed simultaneously. Setting it to `1` disables parallel
  execution.

- `M_MOULETTE_ASSETS` is an absolute path to the directory that contains the
   installation of this software (i.e. folder containing the main program
   `moulette.sh`).

- `M_TRACE_GENERATOR` is an absolute or relative path to the executable file
  that will be used to generate traces; a relative path is resolved from the
  folder `$M_MOULETTE_ASSETS/trace_generators`.
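
For illustration, a complete configuration file might look like the following
sketch. All paths, names and values below are hypothetical and should be adapted
to your installation:

    #! /bin/sh

    M_PROJECT_NAME="my_project"
    M_RTOKEN="s3cr3t_t0k3n"               # may be left empty
    M_LOG_FILE_NAME=$PWD/logs/my_project
    M_LIMIT_DISK_USAGE=true
    M_MAKE=make                           # or gmake on non-GNU systems
    M_MAX_DISPLAY_DIFF_LENGTH=2000
    M_ARCH=$(uname)
    M_STUDENT_LIST_FILE=$PWD/students.txt
    M_TESTS_FOLDER=$PWD/tests
    M_TARBALLS_FOLDER=$PWD/tarballs
    M_TRACE_DIR=$PWD/traces
    M_LOCK_FILE=$PWD/.moulette.lock
    M_NB_WORKERS=4                        # 1 disables parallel execution
    M_MOULETTE_ASSETS=/opt/moulette
    M_TRACE_GENERATOR=txt_generator.sh    # searched in $M_MOULETTE_ASSETS/trace_generators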



Test suite
----------

### Compilation units

TODO

A compilation unit can be referenced from a test directory using a file named
`ref_comp_unit`, which contains the name (without '`.comp_unit`') of the
compilation unit.

The folder `comp_units` (if present) must be in the tests root directory. It
should contain a set of compilation units.

A compilation unit is represented by a directory whose name ends with
'`.comp_unit`'.

This folder may contain the following files:

- `pre_comp_tb_changes.sh` (x)
- `compile_student.sh` (x)
- `exec_timeout`
- `abs_timeout`
- `expected` (d)
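
As an illustration, a compilation unit might be laid out as follows (all names
here are hypothetical):

    comp_units
    `-- default.comp_unit
        |-- pre_comp_tb_changes.sh
        |-- compile_student.sh
        |-- exec_timeout
        |-- abs_timeout
        `-- expected
            `-- binaries
                `-- expected_files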


#### Pre compilation tarball changes

Execution of `pre_comp_tb_changes.sh`

This script should expect to be run in the compilation unit directory.  The
first argument to this script is the absolute path to the student directory
corresponding to the current compilation unit.
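
As a sketch, such a script could look like the following; the cleanup performed
here is purely hypothetical:

    #! /bin/sh
    # $1 is the absolute path to the student directory for this compilation unit.
    student_dir="$1"

    # Hypothetical cleanup: remove stale build artifacts from the submission.
    rm -f "$student_dir"/*.o "$student_dir"/a.out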


#### Compilation of the student code

Execution of `compile_student.sh`

This script should expect to be run in the student directory corresponding to
the current compilation unit. It is given as its argument the content of the
variable `M_MAKE` from the project configuration file.
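
A minimal sketch of such a script, assuming the student code is built by a plain
`make` invocation, could be:

    #! /bin/sh
    # $1 is the content of the M_MAKE variable (e.g. "make" or "gmake").
    MAKE=${1:-make}

    # Build the student code in the current (student) directory.
    $MAKE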


A timeout is applied to this process using the `exec_timeout` and `abs_timeout`
files if they are present; otherwise a default timeout is applied, as described
in the _'Timeouts'_ section of the _'Isolation'_ part.


#### Checking presence of expected files

The system allows you to define a set of files that must be present for the
compilation unit to be marked as "valid".

To achieve this you have to create a folder named `expected` in the compilation
unit directory.

In this folder you may create as many sub-folders as you want; in each one the
system will look for a file named `expected_files`. This file lists the names of
all the files whose presence must be checked (one file name per line).
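
For example, with the hypothetical layout shown earlier, the file
`expected/binaries/expected_files` could contain:

    my_program
    libmy_project.a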

#### Checking usage of functions

If there is a file named `authorized_functions` in the same folder, then all the
expected files listed in `expected_files` will be checked for their usage of
external functions.

The format of this file is as follows:

- One function name per line;

- A function name may use the character '\*' with the same meaning as in
  globbing.

  For example: "`dup*`" on a line will allow all functions whose names begin
  with `dup` (e.g. `dup`, `dup2`, etc.).

If the usage of an unauthorized function is detected, an error will be reported
and the whole compilation unit will be marked as failed.
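
For example, an `authorized_functions` file for a project restricted to a small
set of functions might contain (hypothetical content):

    malloc
    free
    write
    dup*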




### Test units

TODO

The test units are defined in a folder named `test_units`. This folder (if
present) must be in the tests root directory. It should contain a set of test
units.

A test unit can be referenced from a test directory using a file named
`ref_test_unit`, which contains the name (without '`.test_unit`') of the
test unit.

A test unit is represented by a directory whose name ends with
'`.test_unit`'.

This folder may contain the following files:

- `configure` (x)
- `Makefile`
- `gen_tests_exec_commands.sh` (x)
- `pre_test.sh` (x)
- `exec_timeout`
- `abs_timeout`
- `compute_result.sh` (x)

#### Configure and Makefile

These files will be used twice: first when compiling tests during the
preprocessing step, second when linking (if needed) the test with the student
code (this step is important for projects that generate a static library).

The first time, they are called inside the test folder and the rule invoked in
the Makefile is '`common_compilation`'. The `configure` file is given as its
first argument the absolute path to the test unit directory.

The second time, they are executed inside the folder where the test will be
executed (different for each student); the binary (if any) to be generated
should be put there. In order to access previously compiled files (from the test
and from the student), two variables are set before the call to (`g`)`make`:

- `COMP_UNIT_DIR` contains the absolute path to the student directory
  corresponding to the current test's compilation unit (this may contain the lib
  or binary generated by student compilation).

- `TEST_DIR` contains the absolute path to the test folder (this may contain the
  precompiled object files for the test).

The rule called in the Makefile is '`test_compilation`'.
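
As an illustration only, a test unit Makefile could implement the two rules as
follows; the source, library and binary names are hypothetical, and recipe lines
must be indented with a tab:

    # Rule called inside the test folder during the preprocessing step.
    common_compilation:
    	cc -c -o test_main.o test_main.c

    # Rule called inside the test execution folder of each student;
    # COMP_UNIT_DIR and TEST_DIR are set before the call to (g)make.
    test_compilation:
    	cc -o test $(TEST_DIR)/test_main.o -L$(COMP_UNIT_DIR) -lmy_project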


#### Generation of test execution commands

During the preprocessing step the script `gen_tests_exec_commands.sh` of the
test unit will be called. Its goal is to generate a file that contains the list
of commands that will be called during the execution of the test.

The script `gen_tests_exec_commands.sh` will be executed inside each test folder
that references the test unit. This script must generate a file named
`test_exec_commands` with the following format: each line represents a test;
the line format is: '`<test_id>,<test_command>`'.

- `<test_id>` is the identifier of the test; it will be its name in the
  results. It will also be used to refer to the files corresponding to the test:

    * `<test_id>.in` file that will be piped to the standard input for the
      execution of the test;

    * `<test_id>.arg` file that contains the arguments that will be given to the
      program for the execution of the test;

    * `<test_id>.out` file that contains the standard output of the reference;

    * `<test_id>.err` file that contains the error output of the reference;

    * `<test_id>.ret` file that contains the return code of the reference;

    * `<test_id>.ref` file that will be diffed with the `<test_id>.my` file in
      the test execution folder;

    * `<test_id>.diff_comp` file that contains a list of files to be "diffed"
      for this test (see section on tests for more details);

    * `<test_id>.name` file that contains the name of this test. **Not
      implemented**

    * `<test_id>.desc` file that contains the description of this test. **Not
      implemented**

    These files are specific to a test so they are located in the `.test`
    folder. For more details please refer to the part on test folders.

- `<test_command>` is the command to be executed, i.e. the path to a binary that
  will be executed for the test. It is important to note that this command
  should assume it is launched from the student test folder.

  For example, if the `test_compilation` rule of the Makefile (see the previous
  part on _'Configure and Makefile'_ in the test units) has generated (or copied)
  a binary named `test` in the directory where it was executed, then the
  `<test_command>` should be `./test`. You must not include the call's arguments
  in the `<test_command>`; the arguments will be appended automatically using
  the file `<test_id>.arg` (if any) present in the test folder.

Note that if there is no `gen_tests_exec_commands.sh` file or its execution does
not generate the `test_exec_commands` file, then nothing will be executed
(unless the test directory contains a static file `test_exec_commands`).
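
A minimal sketch of such a generator, producing two sub-tests with hypothetical
identifiers, could be:

    #! /bin/sh
    # Executed inside each test folder that references this test unit.
    # Each line of test_exec_commands is "<test_id>,<test_command>".
    {
        echo "01_basic,./test"
        echo "02_empty_input,./test"
    } > test_exec_commands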


#### Pre-test script execution

A script named `pre_test.sh` may be put in the test unit folder if any action
must be executed on the system before the execution of a test. It will be
executed just before the execution of the test (just once per test, not before
each sub-test).

This script must assume it is executed in the folder where it is located; the
first argument of the script will therefore be the absolute path to the test
execution directory.

This script is called after copying test data to the test execution directory
and before the test execution.
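
A sketch of such a script; the setup action performed here is purely
hypothetical:

    #! /bin/sh
    # $1 is the absolute path to the test execution directory.
    exec_dir="$1"

    # Hypothetical setup: create a scratch directory the tested program expects.
    mkdir -p "$exec_dir/tmp"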


#### Timeouts

The files `exec_timeout` and `abs_timeout` located in a test unit are used as
the timeout for the execution of each test referring to the test unit.


#### Computing the results

If there is no specific script, then the standard output, error output and
return value of the test execution will be compared to the reference files
(`<test_id>.out`, `<test_id>.err` and `<test_id>.ret`). If a `<test_id>.ref` is
present it will also be checked. Each difference will be reported and an error
log will be generated. Note that only files that are present will be diffed
(i.e. if a file is not present, no error will be reported; the check will simply
not be done).

If there is a specific script (named `compute_result.sh`), it will be executed
afterwards (not instead).  It will be executed inside the result directory
specific to this test (i.e. the `.test` one). It is expected to generate, at
least, a file named `test.result`.

- If the result is considered successful, then `test.result` must contain
  "`PASS`";

- If the result is considered failed, then `test.result` must contain "`FAIL`".
  A file named `test.error` may also be created with an explanation of why the
  test was not successful and an error log.

The script will be given the following 3 arguments (see the sketch after this
list):

- Absolute path to the test dir (ref)
- Absolute path to the output dir
- Absolute path to the test exec dir
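
A minimal sketch of such a script, assuming a hypothetical test that must
produce a non-empty `output.txt` file in the test execution directory:

    #! /bin/sh
    # $1: absolute path to the test dir (reference files)
    # $2: absolute path to the output dir
    # $3: absolute path to the test execution dir
    test_exec_dir="$3"

    # Executed inside the result directory of this test (the .test one).
    if [ -s "$test_exec_dir/output.txt" ]; then
        echo "PASS" > test.result
    else
        echo "FAIL" > test.result
        echo "output.txt is missing or empty" > test.error
    fi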

#### Precedence considerations

The following files may be present in a test (`.test`) directory:

- `configure`
- `Makefile`
- `pre_test.sh`
- `gen_tests_exec_commands.sh`
- `exec_timeout`
- `abs_timeout`
- `compute_result.sh`

Each of these files, if present in a test directory, has exactly the same
meaning and usage as the one that may be present in a test unit. The only
difference is that a file present in a test directory takes precedence over the
one present in the test unit directory. This allows a generic behavior for a set
of tests, while keeping the possibility of a more specific behavior for a single
test.


### Tests

During test execution, the test-suite directory will be browsed to find every
folder whose name matches "`*.test`". Each of these folders represents a test,
independent from the others.

In addition to the files previously explained, a test directory may contain the
following files:

- `ref_comp_unit`: This file must contain the name of the compilation unit to be
  used for this test (without the '`.comp_unit`').

- `ref_test_unit`: This file must contain the name of the test unit to be
  used for this test (without the '`.test_unit`').

- `required_data`: This file must contain the list (one per line) of the data
  files needed for the execution of the test. A file is referred to by its
  relative path from the `test_data` directory of the test suite. A line can
  refer to a file or a directory; in the latter case the whole directory will be
  copied inside the test execution directory.  If this file is not present or is
  empty, nothing will be copied.  If the file contains a single line with
  '`all`' (i.e. '`echo all > required_data`'), then all files of the `test_data`
  directory will be copied.

    Note that the complete directory tree will be copied, for example:

    With the following test-suite architecture:

          .
          |-- comp_units
          |   `-- [...]
          |-- project
          |   `-- [...]
          |-- test_data
          |   |-- some_dir
          |   |   |-- some_file
          |   |   `-- some_file_2
          |   `-- some_other_dir
          |       |-- some_file
          |       `-- some_file_2
          `-- test_units
              `-- [...]

    And the `required_data`:

          some_dir/some_file
          some_other_dir

    The test execution directory will look like this before the test:

          .
          |-- some_dir
          |   `-- some_file
          `-- some_other_dir
              |-- some_file
              `-- some_file_2


- `not_parallelizable`: This file indicates that the test is *not*
  parallelizable, i.e. only one instance of this test will be executed at a time
  even if multiple students are run in parallel (see the part on
  _Parallelization_ for more details).  Note that other tests may run at the
  same time for other students. This only prevents the execution of two
  instances of tests tagged as "non-parallelizable" at the same time.

    Two tests tagged as "non-parallelizable" will never run simultaneously (even
    two different tests). But several parallelizable tests may run at the same
    time as a "non-parallelizable" test.


#### Test execution

##### Sub-tests

During the execution of the test, the system uses the file
`test_exec_commands`. Each line defines what is called a "sub-test".

The format of the line is: '`<test_id>,<test_command>`', where `<test_id>` is
the identifier of the test (it will be its name in the results; it will also be
used to refer to the files corresponding to the test, as explained in the part
_Test structure_) and `<test_command>` is the command that will be executed for
the test (an absolute path, or a path relative to the test execution directory).

In fact, a test directory must contain at least one sub-test (otherwise the test
is useless).
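
For example, a `test_exec_commands` file defining two sub-tests could look like
this (hypothetical identifiers):

    01_basic,./test
    02_empty_input,./test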

##### Test structure

In addition to the test configuration files, there are some data files for the
execution of the test that will be used automatically by the test system.


The executable file specified is executed inside the folder where the test data
have been copied. Neither the executable file nor the output files are
accessible during the execution of the test (they are not in this folder). The
directory where the test is executed is referred to as the "test execution
directory". It contains the test data copied before the tests and the files that
may be generated during the execution.

_Input files:_

- `<test_id>.in` file that will be piped to the standard input for the
  execution of the test;

- `<test_id>.arg` file that contains the arguments that will be given to the
  program for the execution of the test;

_Reference output files:_

- `<test_id>.out` file that contains the standard output of the reference;

- `<test_id>.err` file that contains the error output of the reference;

- `<test_id>.ret` file that contains the return code of the reference;

These files will be used to compute the result of the test. Each one (if
present) will be diffed and differences will be reported.


_Special test files:_

- `<test_id>.ref` file that will be diffed with the `<test_id>.my` file in the
  test execution folder; useful when the execution generates a single file to
  test;

- `<test_id>.diff_comp` file that contains a list of files to be "diffed" for
  this test; useful when the execution generates several files and/or modifies
  some data files.

    This file must comply with a specific format: one file to compare is
    specified per line, with the following line format:

        <reference_file>,<file_to_test>

    where `<reference_file>` is the path to the "correct" output file (i.e. the
    "ref"); this path is relative to the test directory. `<file_to_test>` is the
    path to the file that should be generated by the test; this file will be
    diffed with the reference file. This path is relative to the test execution
    directory.
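
    For example, a `<test_id>.diff_comp` file for a hypothetical test producing
    two output files could contain:

        01_basic.expected_log,logs/output.log
        01_basic.expected_csv,result.csv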

Remember that the `<test_id>.ref` file is for "simple" tests, while
`<test_id>.diff_comp` is for more advanced usage. If `<test_id>.diff_comp` is
present, `<test_id>.ref` will not be used even if present. This implies that you
can refer to a file `<test_id>.ref` inside the file `<test_id>.diff_comp`
without any risk of conflict between the two files: the `diff_comp` file takes
precedence over the `ref` file.

_Information files:_

- `<test_id>.name` file that contains the name of this test. **Not
  implemented**

- `<test_id>.desc` file that contains the description of this test. **Not
  implemented**


### Generating traces

At the end of the computation for a student, a script will be called to generate
the final "trace".  This script is selected through the variable
'`M_TRACE_GENERATOR`' of the project configuration file. This variable must
refer to a trace generator for the desired output trace format. It may be an
absolute path; if it is not, the script will be searched for in the
`trace_generators` directory of the moulette assets directory.

The generator will be executed inside the result directory so that it can be
browsed. It will be given 2 arguments: the first one is the login of the student
and the second one is the path to the directory containing the generator (this
way, the generator can use files located in its own directory).

The trace must be generated on the standard output; it will automatically be
written to the correct file.
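
A sketch of a very simple plain-text trace generator; the result file names
browsed here are hypothetical:

    #! /bin/sh
    # $1: login of the student
    # $2: path to the directory containing this generator
    login="$1"

    # Executed inside the result directory: browse it and write the trace to
    # the standard output.
    echo "Trace for $login"
    for result in */test.result; do
        [ -f "$result" ] || continue
        echo "$(dirname "$result"): $(cat "$result")"
    done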

Notes
=====

In this document, when talking about a file we use the following convention:

- \(x\) represents an executable file
- \(d\) represents a directory
- no annotation represents a regular file


Appendix
========
