Configuration schema

This is the complete reference for a CRAB experiment config — the JSON file you pass to crab run. It is system-independent: the same file runs on any cluster by selecting a different preset.

For the conceptual walkthrough of writing one, see Writing experiment configs. This page is the exhaustive field list.

The experiment hierarchy

A run config describes a small nesting of levels. From outermost to innermost:

Job — one `crab run`: one config file → one Slurm job → one output directory
│     (named by global_options.name)
├─ Experiment "baseline"          an entry in `experiments`; experiments run sequentially
│    ├─ Run 1, Run 2, …           repetitions (minruns…maxruns), for statistical convergence
│    └─ apps 0, 1, 2, …           entries in `apps`; all run concurrently within each run
└─ Experiment "with_aggressor"
     └─ …

Level	Where it lives	Notes
Job	the whole config file	One `crab run` = one Slurm job = one `data/<system>/<name>_<timestamp>/` directory. Named by `global_options.name`.
Experiment	a key under `experiments`	Run one after another, in sorted key order. Has its own `apps`; may override options via `local_options`.
Run	runtime only (not in the JSON)	One repetition of an experiment. Repeated at least `minruns` and at most `maxruns` times, stopping early once the metrics converge.
Application	a key under an experiment's `apps`	The processes launched concurrently within each run.

Runs and apps are orthogonal

Each run executes all of the experiment's apps once, together; runs are repetitions for statistics, apps are the concurrent workload measured in every run. A 3-app experiment with 10 runs launches those 3 apps together, ten times.

Top-level shape

A config has two top-level keys: global_options and experiments.

{
  "global_options": { ... },
  "experiments": {
    "experiment_name": {
      "description": "...",
      "local_options": { ... },
      "apps": { ... }
    }
  }
}

Legacy single-experiment form

A config may instead use a top-level applications block in place of experiments. CRAB wraps it automatically into a single experiment named default_ex. New configs should use the explicit experiments form.
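The wrapping described above can be pictured with a minimal Python sketch. The function name `normalize_config` is hypothetical, and it assumes the wrapped experiment stores the legacy block under `apps`; the framework's real code may differ.

```python
def normalize_config(config: dict) -> dict:
    """Return a config in the explicit `experiments` form (illustrative only)."""
    if "experiments" in config:
        return config
    if "applications" not in config:
        raise ValueError("config needs either 'experiments' or 'applications'")
    # Copy every other top-level key (e.g. global_options) unchanged.
    normalized = {k: v for k, v in config.items() if k != "applications"}
    # Assumption: the legacy block becomes the `apps` of one experiment.
    normalized["experiments"] = {"default_ex": {"apps": config["applications"]}}
    return normalized


legacy = {
    "global_options": {"numnodes": 2},
    "applications": {"0": {"path": "a2a_comm_only.py", "collect": True}},
}
explicit = normalize_config(legacy)
print(list(explicit["experiments"]))
```

Writing the explicit form by hand gives you control over the experiment name instead of inheriting `default_ex`.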

`global_options`

Settings applied to the whole run. An experiment can override most of these in its own local_options (see below).

Key	Type	Default	Meaning
`numnodes`	int	required	Total nodes to allocate for the Slurm job.
`ppn`	int	`1`	Processes per node. Drives the launcher's total task count.
`name`	string	`""`	Human-readable run name; prefixes the output directory (`<name>_<timestamp>`).
`datapath`	string	`<CRAB_ROOT>/data`	Root directory for results.
`allocationmode`	`l` | `i` | `p`	`l`	Node-to-app mapping: linear, interleaved, partitioned. See Allocation.
`allocationsplit`	string	`e`	How nodes are divided among apps. See Allocation.
`partitionsplit`	string	`100`	(mode `p`) Size split between partitions, e.g. `50:50`.
`partitionlayout`	`l` | `i`	`l`	(mode `p`) Whether partitions take contiguous (`l`) or interleaved (`i`) nodes.
`minruns`	int	`10`	Minimum runs before convergence is checked.
`maxruns`	int	`20`	Hard cap on runs.
`timeout`	float	`1200.0`	Wall-clock budget for the experiment, in seconds.
`convergeall`	bool	`false`	If true, every metric must converge; otherwise only the metrics flagged `conv` in the wrapper must converge.
`alpha`	float	`0.05`	Confidence-interval significance level.
`beta`	float	`0.05`	Convergence threshold: CI width must fall below `beta × mean`.
`outformat`	`csv` | `hdf`	`csv`	Output file format.
`retain_files`	bool	`true`	Keep per-run working directories. If `false`, successful runs' scratch dirs are deleted.
`tags`	string	`none`	Free-form label recorded in the run registry (`metadata.csv`).
`walltime`	string	`00:10:00`	Base Slurm `--time` value (overridable via `sbatch_directives`).
`extrainfo`	string	`job`	Short token used to build the Slurm job name.
`sbatch_directives`	list | dict	`[]`	User Slurm directives. See sbatch directives.

ppn is global

The processes-per-node value is read from global_options and applied uniformly; it reflects the physical allocation and is not overridden per experiment.
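To make the `minruns`, `maxruns`, `alpha`, and `beta` rows of the table above concrete, here is a rough Python sketch of a run loop with a convergence check. It is not CRAB's implementation: it assumes a normal-approximation confidence interval, takes "CI width" to mean the full interval width, and the names `has_converged` and `runs_needed` are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def has_converged(samples, alpha=0.05, beta=0.05):
    """True when the (1 - alpha) CI width falls below beta * mean."""
    n = len(samples)
    if n < 2:
        return False
    # Assumption: normal approximation; the framework may use another interval.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * stdev(samples) / n ** 0.5
    return 2 * half_width < beta * abs(mean(samples))


def runs_needed(measure, minruns=10, maxruns=20, alpha=0.05, beta=0.05):
    """Repeat `measure` until convergence, never fewer than minruns or more than maxruns."""
    samples = []
    while len(samples) < maxruns:
        samples.append(measure(len(samples)))
        # Convergence is only checked once minruns samples exist.
        if len(samples) >= minruns and has_converged(samples, alpha, beta):
            break
    return len(samples)


stable = runs_needed(lambda i: 100.0 + (i % 2) * 0.1)
noisy = runs_needed(lambda i: 1.0 if i % 2 == 0 else 100.0)
print(stable, noisy)
```

Under these assumptions a low-variance metric stops at `minruns`, while a noisy one runs until the `maxruns` cap.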

Allocation fields

allocationmode selects the strategy that maps the job's nodes onto the applications:

l — linear: each application gets a contiguous block of nodes.
i — interleaved: nodes are dealt round-robin across applications.
p — partitioned: nodes are first split into partitions (typically victims vs. aggressors, according to each app's partition value), then the apps within each partition are placed by a per-partition sub-rule.

allocationsplit controls the division among apps (modes l/i) or within partitions (mode p):

e — equal split across applications.
50:50, 70:30, … — explicit percentages, one per application (the total must not exceed 100).
For mode p, the value uses - to separate per-partition rules, e.g. 100-100 (each partition shares all its nodes among its apps) — see the partitioned example below.

partitionsplit (mode p only) sizes the partitions themselves: 50:50, or e to auto-size by the number of distinct partition IDs in use.
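The difference between the linear and interleaved modes with an equal split can be sketched in a few lines of Python. This is an illustration only: `split_nodes` is a hypothetical helper, and the way leftover nodes are distributed when the count does not divide evenly is an assumption.

```python
def split_nodes(nodes, n_apps, mode="l"):
    """Equal split ('e') of nodes across n_apps for modes 'l' and 'i'."""
    if mode == "i":
        # Interleaved: deal nodes round-robin, app k gets nodes k, k + n_apps, ...
        return [nodes[k::n_apps] for k in range(n_apps)]
    if mode == "l":
        # Linear: contiguous blocks, sized like the round-robin deal
        # (assumption: earlier apps absorb any remainder).
        sizes = [len(nodes[k::n_apps]) for k in range(n_apps)]
        blocks, start = [], 0
        for size in sizes:
            blocks.append(nodes[start:start + size])
            start += size
        return blocks
    raise ValueError(f"unsupported mode: {mode!r}")


nodes = list(range(8))
print(split_nodes(nodes, 2, "l"))  # [[0, 1, 2, 3], [4, 5, 6, 7]]
print(split_nodes(nodes, 2, "i"))  # [[0, 2, 4, 6], [1, 3, 5, 7]]
```

Linear placement keeps each application's nodes adjacent, while interleaved placement spreads every application across the whole allocation.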

`experiments`

A dictionary mapping an experiment name to its definition. Experiments run sequentially, in sorted key order.

Key	Type	Meaning
`description`	string	Optional free-text note (stored in the run's `config.json`).
`local_options`	object	Per-experiment overrides of `global_options` (see below).
`apps`	object	The applications to run concurrently in this experiment.
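Because experiments run in sorted key order, the names you choose determine the execution sequence. The short Python sketch below assumes a plain string sort (as Python's `sorted` performs); the experiment names are hypothetical.

```python
experiments = {
    "exp2": {"apps": {}},
    "exp10": {"apps": {}},
    "baseline": {"apps": {}},
}

# A plain string sort compares character by character, so "exp10" < "exp2".
run_order = sorted(experiments)
print(run_order)  # ['baseline', 'exp10', 'exp2']
```

If the sort is indeed lexicographic, zero-padded names such as `exp02` and `exp10` keep the order intuitive.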

Per-experiment overrides (`local_options`)

Most global_options keys may be repeated in an experiment's local_options; the experiment's value wins for that experiment (the framework merges {**global, **local}). The exception is ppn, which is always read from global_options. Use local_options to vary, say, allocationmode or timeout between experiments in the same run.
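The merge can be sketched as follows. The helper name `effective_options` is hypothetical, and pinning `ppn` to the global value after the merge is an assumption based on the "ppn is global" note; the framework may enforce it differently.

```python
def effective_options(global_options, local_options=None):
    """Options seen by one experiment: local values win, except ppn."""
    merged = {**global_options, **(local_options or {})}
    # Assumption: ppn always comes from global_options (default 1).
    merged["ppn"] = global_options.get("ppn", 1)
    return merged


global_options = {"numnodes": 8, "ppn": 1, "allocationmode": "l", "timeout": 1200.0}
local_options = {"allocationmode": "i", "timeout": 600.0, "ppn": 4}

options = effective_options(global_options, local_options)
print(options["allocationmode"], options["timeout"], options["ppn"])
```

Keys absent from `local_options` simply fall through to their global values.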

The `apps` block

A dictionary keyed by numeric string IDs ("0", "1", …). Each value describes one application:

Key	Type	Default	Meaning
`path`	string	required	Path to the wrapper module. If relative, resolved against `CRAB_PATH_WRAPPERS`.
`args`	string	`""`	Command-line arguments passed to the executable.
`collect`	bool	`false`	Whether to parse and store this app's metrics.
`start`	string	`"0"`	When to start the app. See Scheduling.
`end`	string	`""`	When to stop the app. See Scheduling.
`partition`	int	auto	(mode `p`) Which allocation partition this app belongs to. Defaults to `0` if `collect` is true, else `1`.

Extra keys become wrapper attributes

Any key in an app entry that is not one of the reserved keys above (path, args, collect, start, end, partition) is injected as an attribute onto the wrapper instance. This lets a wrapper accept custom configuration straight from the JSON without framework changes.
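A small Python sketch shows how an app entry might be split into reserved settings and injected attributes. `DemoWrapper`, `apply_app_entry`, and the extra key `msg_bytes` are hypothetical; only the reserved key names and defaults come from the table above.

```python
RESERVED_KEYS = {"path", "args", "collect", "start", "end", "partition"}


class DemoWrapper:
    """Hypothetical stand-in for a wrapper class."""


def apply_app_entry(wrapper, entry):
    """Return the reserved settings and inject every other key as an attribute."""
    collect = bool(entry.get("collect", False))
    settings = {
        "path": entry["path"],  # required
        "args": entry.get("args", ""),
        "collect": collect,
        "start": entry.get("start", "0"),
        "end": entry.get("end", ""),
        # Default partition: 0 for collected apps, 1 otherwise.
        "partition": entry.get("partition", 0 if collect else 1),
    }
    for key, value in entry.items():
        if key not in RESERVED_KEYS:
            setattr(wrapper, key, value)
    return settings


wrapper = DemoWrapper()
settings = apply_app_entry(
    wrapper, {"path": "a2a_comm_only.py", "collect": True, "msg_bytes": 8192}
)
print(settings["partition"], wrapper.msg_bytes)
```

The wrapper can then read `self.msg_bytes` like any other attribute, with no change to the framework.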

Scheduling (`start` and `end`)

The start and end strings encode the victim/aggressor model and timed/sequential execution.

start — when the app launches:

Value	Meaning
`"0"` or a number	Delay in seconds from the start of the run before launching.
`"sN"`	Start only after application N has finished (a dependency, enabling sequential chains).

end — when the app is stopped:

Value	Meaning	Role
`""` (empty)	Wait for the app to finish on its own.	Victim
`"f"`	Force-terminate once all non-`f` apps have finished.	Aggressor
a number	Terminate after that many seconds.	Timed
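The two tables above can be read as a tiny grammar. The Python sketch below decodes the strings into tagged tuples; the function names and tuple labels are hypothetical, and the framework's own parsing may accept or reject inputs differently.

```python
def parse_start(value):
    """Decode a `start` string into ('after', app_id) or ('delay', seconds)."""
    if value.startswith("s"):
        return ("after", int(value[1:]))  # "sN": wait for application N
    return ("delay", float(value))        # a number: seconds from run start


def parse_end(value):
    """Decode an `end` string into a (kind, seconds) pair."""
    if value == "":
        return ("wait", None)    # victim: runs to completion
    if value == "f":
        return ("force", None)   # aggressor: killed when non-`f` apps finish
    return ("timed", float(value))


# A sequential chain: app 1 starts only after app 0 has finished.
apps = {
    "0": {"start": "0", "end": ""},
    "1": {"start": "s0", "end": "30"},
}
for app_id, app in apps.items():
    print(app_id, parse_start(app["start"]), parse_end(app["end"]))
```

In this chain, application 1 waits for application 0 and is then stopped after 30 seconds.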

sbatch directives

global_options.sbatch_directives lets you add or override Slurm #SBATCH lines. The preferred form is a list of complete directives:

"sbatch_directives": ["--account=IscrC_FOO", "--qos=normal", "--exclusive"]

A legacy dict form is also accepted (true → bare flag, false → omitted, value → --key=value):

"sbatch_directives": { "time": "00:20:00", "exclusive": true }

CRAB protects some directives

--nodes and --ntasks-per-node are computed by the framework from numnodes/ppn and cannot be overridden; user attempts are ignored with a warning. Overriding --output or --error is allowed but triggers a warning, since it redirects CRAB's standard logs. Directives containing newlines are rejected. System-level directives from the preset are merged in with lower priority than your config's directives.
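The rules for both directive forms can be summarized in a Python sketch. It is an approximation under stated assumptions: the function names are hypothetical, only long `--flag` spellings are recognized, a user directive is assumed to replace a preset directive with the same flag, and `--partition=debug` is an invented preset value.

```python
import warnings

PROTECTED = {"--nodes", "--ntasks-per-node"}  # computed from numnodes/ppn
LOG_FLAGS = {"--output", "--error"}           # allowed, but warned about


def to_list(directives):
    """Normalize the legacy dict form into a list of complete directives."""
    if isinstance(directives, dict):
        result = []
        for key, value in directives.items():
            if value is True:
                result.append(f"--{key}")          # bare flag
            elif value is not False:
                result.append(f"--{key}={value}")  # false is omitted
        return result
    return list(directives)


def merge_directives(preset, user):
    """Merge preset and user directives; user entries take priority."""
    merged = {}
    for from_user, source in ((False, to_list(preset)), (True, to_list(user))):
        for directive in source:
            if "\n" in directive:
                raise ValueError("directives must not contain newlines")
            name = directive.split("=", 1)[0].strip()
            if name in PROTECTED:
                warnings.warn(f"ignoring protected directive {name}")
                continue
            if from_user and name in LOG_FLAGS:
                warnings.warn(f"{name} redirects CRAB's standard logs")
            merged[name] = directive  # later (user) value replaces earlier one
    return list(merged.values())


print(merge_directives(
    ["--time=00:10:00", "--partition=debug"],
    {"time": "00:20:00", "exclusive": True},
))
```

Here the user's `--time` replaces the preset's value, while the preset's other directive is kept.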

Worked example: partitioned victim vs aggressor

{
  "global_options": {
    "name": "congestion_study",
    "numnodes": "8",
    "ppn": "1",
    "allocationmode": "p",
    "partitionsplit": "50:50",
    "allocationsplit": "100-100",
    "partitionlayout": "i",
    "minruns": "10",
    "maxruns": "30",
    "timeout": "1200.0",
    "outformat": "csv",
    "sbatch_directives": ["--exclusive", "--time=00:20:00"]
  },
  "experiments": {
    "a2a_vs_graph500": {
      "description": "All-to-all victim against a Graph500 aggressor, interleaved nodes.",
      "apps": {
        "0": {
          "path": "a2a_comm_only.py",
          "args": "-msgsize 8192 -iter 1000",
          "collect": true,
          "start": "0",
          "end": "",
          "partition": 0
        },
        "1": {
          "path": "others/g500.py",
          "args": "",
          "collect": false,
          "start": "0",
          "end": "f",
          "partition": 1
        }
      }
    }
  }
}

This allocates 8 nodes, splits them 50/50 into two interleaved partitions, runs an all-to-all victim (collected) in partition 0 against a Graph500 aggressor (force-killed when the victim finishes) in partition 1, and repeats between 10 and 30 runs until the victim's convergence-target metric stabilizes.

For where the resulting files land, see Architecture → Output layout.
