Further information
How does it work?
Here's a description of the evaluation process performed by the taskgrader, managed by the function `evaluation(evaluationParams)`. `evaluationParams` is the input JSON.

- A build folder is created for the evaluation
- The `defaultParams.json` file from the task is read and its variables added
- A `dictWithVars` is created to hold the variables, and returns the JSON data as if variables were replaced by their values (a schematic sketch of this idea follows the list)
- `generators` are compiled
- `generations` describe how `generators` are to be executed in order to generate all the test files and optional libraries
- `extraTests` are added into the tests pool
- The `sanitizer` and the `checker` are compiled
- The `solutions` are compiled
- All `executions` are done for the solutions
- The full evaluation report is returned on standard output
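As a schematic illustration of the `dictWithVars` idea (not the actual implementation), the sketch below shows JSON data whose values referring to variables get replaced on resolution. The `@` reference syntax and the variable names are assumptions made for this example only.

```python
# Schematic sketch of the dictWithVars idea: JSON data where string values
# referring to variables are replaced by the variables' values.
# The '@' reference syntax and the variable names are assumptions for illustration.

def resolve(data, variables):
    """Return a copy of `data` with variable references replaced by their values."""
    if isinstance(data, str) and data.startswith('@'):
        return variables[data[1:]]
    if isinstance(data, dict):
        return {key: resolve(value, variables) for key, value in data.items()}
    if isinstance(data, list):
        return [resolve(item, variables) for item in data]
    return data

# Variables would typically come from the task's defaultParams.json.
variables = {'defaultChecker': {'path': 'checker.py'}}   # hypothetical content
params = {'checker': '@defaultChecker'}
print(resolve(params, variables))   # {'checker': {'path': 'checker.py'}}
```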
Executions
Each execution is the grading of one solution against multiple test files. For each execution:
- Test files corresponding to `filterTests` are selected, then for each test file:
    - it first passes through the sanitizer, which checks its validity
    - then the solution is executed, with the test file as standard input, and its output is saved
    - finally the checker grades the solution according to its output on that particular test file

`filterTests` is a list of globs (such as `"test*.in"` or `"mytest.in"`) selecting which test files to use among all the test files generated by the generators and the `extraTests` given. Specific test files can also be listed directly in this array to use only those.
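As a rough sketch of what one execution amounts to, the example below selects test files with glob filters, then sanitizes, runs and checks each one. The file names, commands and grade format are hypothetical, and the real taskgrader adds compilation, sandboxing and caching on top of these steps; the checker receives the three arguments described in the Checker section below.

```python
# Illustrative sketch of one execution: select tests with glob filters, then
# sanitize, run the solution, and check its output. File names, commands and
# the grade format are assumptions for this example.
import glob
import subprocess

filter_tests = ['test*.in']   # globs from filterTests
tests = sorted(sum((glob.glob(pattern) for pattern in filter_tests), []))

for test_in in tests:
    # 1. The sanitizer validates the test input (exit code 0 means valid).
    with open(test_in, 'rb') as f:
        if subprocess.run(['./sanitizer'], stdin=f).returncode != 0:
            continue

    # 2. The solution is executed with the test file on stdin, its output saved.
    sol_out = test_in.replace('.in', '.solout')
    with open(test_in, 'rb') as fin, open(sol_out, 'wb') as fout:
        subprocess.run(['./solution'], stdin=fin, stdout=fout)

    # 3. The checker grades the solution output against the reference output.
    test_out = test_in.replace('.in', '.out')
    result = subprocess.run(['./checker', sol_out, test_in, test_out],
                            capture_output=True, text=True)
    print(test_in, result.stdout.strip())
```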
Evaluation components
The evaluation is made against a task which has multiple components.
Generators
The generators generate the testing environment. They are executed, optionally with various parameters, to generate files, which can be:
- test files: inputs for the solution, and if necessary, expected output results
- libraries, for the compilation and execution of solutions
Some of these files can be passed directly in the evaluation JSON, without the need for a generator.
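As an illustration, here is a hypothetical generator sketch; it assumes that a generator writes the files it produces into its working directory and that parameters arrive on the command line. The interface expected by a given task may differ.

```python
#!/usr/bin/env python3
# Hypothetical generator sketch: writes a few test input files and their
# expected outputs into the working directory. Taking the number of tests
# from a command-line parameter is an assumption for this example.
import sys

def main():
    count = int(sys.argv[1]) if len(sys.argv) > 1 else 5
    for i in range(count):
        n = 10 ** i
        with open('test%d.in' % i, 'w') as f:
            f.write('%d\n' % n)
        with open('test%d.out' % i, 'w') as f:
            f.write('%d\n' % (n * 2))   # e.g. the task asks to double the input

if __name__ == '__main__':
    main()
```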
Sanitizer
The sanitizer checks whether a test input is valid. It expects the test input on its stdin, and its exit code indicates the validity of the data.
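A minimal sanitizer sketch following that contract is shown below; the validity rule itself (a single positive integer) is only an example.

```python
#!/usr/bin/env python3
# Minimal sanitizer sketch: reads the test input on stdin and exits with 0 if
# it is valid, non-zero otherwise. The validity rule here (one positive
# integer on a single line) is only an example.
import sys

def main():
    data = sys.stdin.read().split()
    if len(data) == 1 and data[0].isdigit() and int(data[0]) > 0:
        sys.exit(0)   # valid test input
    sys.exit(1)       # invalid test input

if __name__ == '__main__':
    main()
```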
Checker
The checker checks whether the output of a solution corresponds to the expected result. It expects three arguments on the command line:
- `test.solout`: the solution output
- `test.in`: the reference input
- `test.out`: the reference output

All checkers are passed these three arguments, whether they use them or not. The checker outputs the grading of the solution; its exit code can indicate an error while checking (invalid arguments, missing files, ...).
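A minimal checker sketch following that interface is shown below; it assumes the grade is written to standard output as a number (here 100 for a correct answer, 0 otherwise), which is an assumption of this example.

```python
#!/usr/bin/env python3
# Minimal checker sketch: compares the solution output with the reference
# output and prints a grade on stdout. Printing a numerical grade (100/0) is
# an assumption; the exit code signals errors while checking only.
import sys

def main():
    if len(sys.argv) != 4:
        sys.exit(1)   # invalid arguments: error while checking
    sol_out, test_in, test_out = sys.argv[1:4]
    try:
        with open(sol_out) as f1, open(test_out) as f2:
            ok = f1.read().split() == f2.read().split()
    except IOError:
        sys.exit(1)   # missing file: error while checking
    print(100 if ok else 0)

if __name__ == '__main__':
    main()
```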
Tools
Various tools are available in the `tools` subfolder. They can be configured with their respective `config.py` files.
Creating a task
`taskstarter.py` helps task writers create and modify simple tasks. This simple tool is meant as a starting point for people who don't know how the taskgrader works but want to write a task, and it guides them through documented steps. It can perform some operations in task folders, such as creating the base skeleton, giving some help on various components, and testing the task. This tool creates a `taskSettings.json` in the task folder, which `genJson.py` can then use to create a `defaultParams.json` accordingly. Read the "Getting started on writing a task" section for more information.
Preparing a task for grading
`genJson.py` analyses tasks and creates the `defaultParams.json` file for them. It will read the `taskSettings.json` file in each task for some settings and try to automatically detect other settings.
taskSettings.json
`taskSettings.json` is JSON data giving some parameters about the task, for use by `genJson.py`. It has the following keys:
- `generator`: path to the generator of the task
- `generatorDeps`: dependencies for the generator (list of fileDescr, see the input JSON schema for more information)
- `sanitizer`, `sanitizerDeps`, `sanitizerLang`: sanitizer of the task (path, dependencies, language; default is no dependencies and auto-detection of the language from the file extension)
- `checker`, `checkerDeps`, `checkerLang`: checker of the task (path, dependencies, language; default is no dependencies and auto-detection of the language from the file extension)
- `extraDir`: folder with extra files (input test files and/or libraries)
- `overrideParams`: JSON data to be copied directly into `defaultParams.json`; it will replace any key with the same name in the JSON data generated by `genJson.py`
- `correctSolutions`: list of solutions known to work with the task; they will be tested by `genJson.py`, which will check whether they get the right results. Each solution must have the following keys: `path`, `lang` and `grade` (the numerical grade the solution is supposed to get).
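To make the key list concrete, here is a hypothetical `taskSettings.json`, written as a Python dict for consistency with the other sketches; all paths, language names and grades are made up, and the exact shape of each value should be checked against the input JSON schema.

```python
# Hypothetical example of a taskSettings.json, shown as a Python dict and
# written out with json. All paths, language names and grades are made up.
import json

task_settings = {
    'generator': 'tests/gen/generator.py',
    'sanitizer': 'tests/gen/sanitizer.py',
    'sanitizerLang': 'python',
    'checker': 'tests/gen/checker.py',
    'checkerLang': 'python',
    'extraDir': 'tests/files/',
    'correctSolutions': [
        {'path': 'solutions/sol-ok.py', 'lang': 'python', 'grade': 100}
    ]
}

with open('taskSettings.json', 'w') as f:
    json.dump(task_settings, f, indent=4)
```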
defaultParams.json
`defaultParams.json` is a task file giving some information about the task. It must be JSON data pairing the following keys with the right objects:
- `rootPath`: the root path of the files
- `defaultGenerator`: a default generator
- `defaultGeneration`: the default generation for the default generator
- `extraTests` (optional): some extra tests
- `defaultSanitizer`: the default sanitizer
- `defaultChecker`: the default checker
- `defaultDependencies-[language]` (optional): default dependencies for that language; if not defined, it will fall back to `defaultDependencies` or to an empty list
- `defaultFilterTests-[language]` (optional): default glob-style filters for the tests for that language; if not defined, it will fall back to `defaultFilterTests` or to an empty list
Grading a solution
`stdGrade.sh` allows you to easily grade a solution. The task path must be the current directory, or must be specified with `-p`. It expects a `defaultParams.json` file in the task directory, describing the task with some variables. Note that it's meant for fast and simple grading of solutions; it doesn't give full control over the evaluation process. `stdGrade.sh` is a shortcut to two utilities present in its folder; for more options, see `genStdTaskJson.py -h`.
Basic usage: `stdGrade.sh [SOLUTION]...` from a task folder.
Exit codes
The taskgrader will return the following exit codes:
- `0` if the evaluation took place without error
- `1` if an error with the evaluation happened, usually because of the evaluation parameters themselves
- `2` if there was a temporary error, meaning the same evaluation should be tried again at a later time
- `3` if the evaluation needed a language which is not supported, or which lacks a dependency to compile
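Below is a sketch of how a calling script might react to these exit codes; invoking the grader as `taskgrader.py` with the evaluation JSON on standard input is an assumption of this example.

```python
# Sketch of a caller reacting to the taskgrader exit codes. The invocation
# (taskgrader.py reading the evaluation JSON on stdin) is an assumption.
import subprocess
import sys
import time

def grade(input_json_path, retries=3):
    for attempt in range(retries):
        with open(input_json_path, 'rb') as f:
            proc = subprocess.run(['./taskgrader.py'], stdin=f, capture_output=True)
        if proc.returncode == 0:
            return proc.stdout          # full evaluation report on stdout
        if proc.returncode == 1:
            sys.exit("error in the evaluation parameters")
        if proc.returncode == 2:
            time.sleep(5)               # temporary error: try again later
            continue
        if proc.returncode == 3:
            sys.exit("unsupported language or missing compilation dependency")
    sys.exit("evaluation kept failing temporarily")
```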
Internals (for developers)
`evaluation` is the evaluation process. It reads an input JSON and preprocesses it to replace the variables.
Each program is defined as an instance of the class `Program`: it is compiled, then `prepareExecution` sets the execution parameters, then it is executed with the proper parameters.
Languages are defined as classes implementing two functions: `getSource`, which defines how to search for dependencies for that language, and `compile`, which is the compilation process.
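Schematically, such a language class could look like the sketch below; this only illustrates the described interface, it is not the taskgrader's actual code, and the method signatures and the gcc invocation are assumptions.

```python
# Schematic illustration of a language class as described above: getSource
# gathers the sources/dependencies to compile, compile performs the compilation.
# Method signatures and the gcc invocation are illustrative only.
import subprocess

class LangC:
    def getSource(self, main_file):
        """Return the list of source files/dependencies needed for this language."""
        return [main_file]            # e.g. C headers would be searched for here

    def compile(self, sources, executable):
        """Compile the sources into an executable."""
        subprocess.run(['gcc', '-O2', '-o', executable] + sources, check=True)
```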
The cache is handled by various Cache classes, each storing the cache parameters for a specific program and giving access to the various cache folders corresponding to compilation or execution of said programs.
Update documentation
The documentation is in the `docs/` folder. It is written in Markdown and formatted into HTML by MkDocs. To update the documentation:
- `pip install mkdocs` to install MkDocs locally with pip
- `mkdocs serve` to preview (live) your changes
- `mkdocs build` to build a local HTML version of the documentation
- `mkdocs gh-deploy` to build the HTML version and upload it directly to the taskgrader GitHub Pages