# Analyses configuration

The Analyses configuration defines how analyses should be run in TrustInSoft CI, by describing the source files to use for the analysis, the entry point, the compilation options to preprocess the source files, and any other options to configure the TrustInSoft CI Analyzer.

## The Analyses configuration file

{% hint style="info" %}
It is recommended to first read the [Configuration files](https://docs.ci.trust-in-soft.com/configuration-file) section, which explains the differences between a **Global configuration** and a **Committed configuration**.
{% endhint %}

If a **Global configuration** is used, the Analyses configuration file can be enabled and written in the Project settings page of the project in the `Build configuration` section:

![](https://3982345336-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LpYF4Rmm1tt0M5Cn4E0%2F-MeenHDdmhslY91Lw1wb%2F-MeeniZUT0KVUz_ikFtB%2Fimage.png?alt=media\&token=078fc0c1-876f-4087-93b0-7799ec903b76)

If a **Committed configuration** is used, the Analyses configuration file should be written and committed in the `.trustinsoft/config.json` file for the branch that is going to be analyzed.

{% hint style="info" %}
With a **Committed configuration**, it is possible to generate the `.trustinsoft/config.json` file during the [Build preparation stage](https://docs.ci.trust-in-soft.com/configuration-file/build-preparation-script).
{% endhint %}

## Syntax and basic usage

{% hint style="success" %}
All examples in this section can be replayed with our [demo-caesar](https://github.com/TrustInSoft-CI/demo-caesar) repository, used for our [Introduction tutorial](https://docs.ci.trust-in-soft.com/tutorial). Feel free to fork this repository to try the different analysis options.
{% endhint %}

The Analyses configuration should be written using the JSON [ECMA-404 standard](https://www.ecma-international.org/publications/standards/Ecma-404.htm), with a syntax extension for comments:

* `//` ignores all characters until the end of line
* `/*` ignores all characters until the next `*/`

The Analyses configuration is a **list** of analysis configuration **objects:**

```javascript
[
  {
    /* First analysis configuration */
  },
  {
    /* Second analysis configuration */
  }
  // And so on...
]
```
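
Because of the comment extension, a strict JSON parser rejects a configuration that contains comments. As a minimal sketch for checking a file locally before committing it, the script below removes both comment forms (outside of strings) and parses the rest as strict JSON, which also reports trailing commas. The helpers `stripComments` and `parseAnalysesConfig` are hypothetical names written for this illustration; they are not part of TrustInSoft CI and only approximate how the analyzer reads the file.

```javascript
// Remove // and /* */ comments that appear outside of JSON strings.
function stripComments(text) {
  let out = "";
  let i = 0;
  let inString = false;
  while (i < text.length) {
    const c = text[i];
    const next = text[i + 1];
    if (inString) {
      out += c;
      if (c === "\\" && i + 1 < text.length) {
        // Copy the escaped character verbatim so that \" does not end the string.
        out += next;
        i += 2;
        continue;
      }
      if (c === '"') inString = false;
      i += 1;
    } else if (c === '"') {
      inString = true;
      out += c;
      i += 1;
    } else if (c === "/" && next === "/") {
      // Line comment: skip up to (not including) the end of the line.
      while (i < text.length && text[i] !== "\n") i += 1;
    } else if (c === "/" && next === "*") {
      // Block comment: skip up to and including the next */.
      const end = text.indexOf("*/", i + 2);
      i = end === -1 ? text.length : end + 2;
    } else {
      out += c;
      i += 1;
    }
  }
  return out;
}

// Parse the configuration and check that the top level is a list.
function parseAnalysesConfig(text) {
  const config = JSON.parse(stripComments(text));
  if (!Array.isArray(config)) {
    throw new Error("The Analyses configuration must be a list");
  }
  return config;
}

const example = `[
  {
    /* First analysis configuration */
    "files": [ "main.c", "caesar.c" ] // source files
  }
]`;
console.log(parseAnalysesConfig(example)[0].files);
```

A syntax error (for example a trailing comma) makes `JSON.parse` throw a `SyntaxError`, so the check fails before the configuration is pushed.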

Each analysis configuration should contain at least:

* the **list** of source files to analyze (usually `.c` or `.cpp` files)
* the compilation options required to [preprocess](https://en.wikipedia.org/wiki/C_preprocessor) these files

```javascript
[
  {
    "files": [ "main.c", "caesar.c" ],
    "cpp-extra-args": "-I ."
    // or use "cxx-cpp-extra-args" for C++ source files
  }
]
```

{% hint style="danger" %}
Do not add a comma `,` before a closing `}` or `]`: a trailing comma is a syntax error in JSON.
{% endhint %}

{% hint style="info" %}
`cpp-extra-args` or `cxx-cpp-extra-args` can be omitted if the files do not need any particular preprocessing options.
{% endhint %}

It is also recommended to add the following **optional** information to each analysis configuration:

* a `name` to clearly identify the analysis in the result table in TrustInSoft CI
* the target architecture to use (also called `machdep`); if omitted, the default one is `"gcc_x86_32"` (see also the list of [Supported architectures](https://docs.ci.trust-in-soft.com/reference/supported-architectures))
* the function to use as the entry point of the analysis; if omitted, the default one is the `main` function

```javascript
[
  {
    "name": "Test shift values 7 and -3 (gcc_x86_64)",
    "files": [ "main.c", "caesar.c" ],
    "cpp-extra-args": "-I .",
    "machdep": "gcc_x86_64",
    "main": "main"
  }
]
```

## Paths in Analyses configuration

With a **Global configuration**, all filenames/paths inside an Analyses configuration are relative to the root of the repository.

With a **Committed configuration**, all filenames/paths inside an Analyses configuration are relative to the directory where the file is, hence relative to `.trustinsoft`.

{% tabs %}
{% tab title="Global configuration" %}

```javascript
[
  {
    "name": "Test shift values 7 and -3",
    "files": [ "main.c", "caesar.c" ],
    "cpp-extra-args": "-I ."
  }
]
```

{% endtab %}

{% tab title="Committed configuration" %}
{% code title=".trustinsoft/config.json" %}

```javascript
[
  {
    "name": "Test shift values 7 and -3",
    "files": [ "../main.c", "../caesar.c" ],
    "cpp-extra-args": "-I .."
  }
]
```

{% endcode %}
{% endtab %}
{% endtabs %}

For the **Committed configuration**, prefixing every path with `../` is tedious. The `"prefix_path"` option avoids this by prefixing all paths with the given value:

{% code title=".trustinsoft/config.json" %}

```javascript
[
  {
    "name": "Test shift values 7 and -3",
    "prefix_path": "..",
    "files": [ "main.c", "caesar.c" ],
    "cpp-extra-args": "-I ."
  }
]
```

{% endcode %}

{% hint style="info" %}
The `"prefix_path"` option can also be used with a **Global configuration**, for instance if all your source files are located in the same sub-directory.
{% endhint %}
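
As a sketch of that case, the configuration below assumes the sources live in a hypothetical `src` sub-directory of the repository, and that `"prefix_path"` applies to `-I .` in the same way as in the Committed configuration example above:

```javascript
[
  {
    "name": "Test shift values 7 and -3",
    // "src" is a hypothetical sub-directory of the repository root.
    "prefix_path": "src",
    "files": [ "main.c", "caesar.c" ],
    "cpp-extra-args": "-I ."
  }
]
```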

## Advanced usages

### Adding inputs for the entry point function

If the entry point function has type `int (int argc, char * argv[])`, inputs can be given to the program with the `"val-args"` option.

The analysis starts with `argc` bound to k+1 and `argv` pointing to a NULL-terminated array of pointers to the strings `program`, `arg_1`, …, `arg_k`, where `arg_1`, …, `arg_k` are the arguments given to `"val-args"`. The first character of the `"val-args"` value is used as the separator that splits the value into the individual arguments.

{% hint style="info" %}
`argv[0]` is set by default to `program`. This value can be changed with the `"val-program-name"` option.
{% endhint %}

```javascript
[
  {
    "name": "Test from program inputs",
    "files": [ "main.c", "caesar.c" ],
    "cpp-extra-args": "-I .",
    "main": "main_with_input",
    
    // argc will be set to 3
    // argv[0] = "a.out"
    // argv[1] = "People of Earth, your attention please"
    // argv[2] = "7"
    "val-program-name": "a.out",
    "val-args": "|People of Earth, your attention please|7"
  }
]
```
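
The splitting rule can be illustrated with a few lines of JavaScript. This is only a sketch of the rule as described above, not the analyzer's implementation; `buildArgv` is a hypothetical helper, and its behavior for an empty `"val-args"` string is an assumption.

```javascript
// Illustration only: mimics the "first character is the separator" rule.
function buildArgv(valArgs, programName = "program") {
  // Assumption: an empty "val-args" string yields no extra argument.
  if (valArgs.length === 0) {
    return [programName];
  }
  const separator = valArgs[0];
  const args = valArgs.slice(1).split(separator);
  return [programName, ...args];
}

const argv = buildArgv("|People of Earth, your attention please|7", "a.out");
console.log(argv.length); // argc: 3
console.log(argv);
```

Choosing a separator that never occurs in the arguments themselves (here `|`) keeps arguments containing spaces or commas intact.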

{% hint style="danger" %}
If your entry point function does not have the type `int (int argc, char * argv[])`, then it is not possible to give an input with `"val-args"`. In this case, it is recommended to write a *test driver function* (a function written only for test purposes) that directly calls your function with the wanted input, and to use this test driver function as the entry point for the analysis.
{% endhint %}

{% hint style="warning" %}
If you want to analyze a lot of different inputs, it is recommended to use a *test driver function* (a function written only for test purposes) instead. Write this *test driver function* so that it calls your entry point as many times as you want, then set the entry point of the analysis to this *test driver function*.
{% endhint %}
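
Once such a test driver exists, the analysis configuration only has to list its source file and name it as the entry point. A sketch, assuming a hypothetical driver function `test_driver` defined in a hypothetical file `test_driver.c`:

```javascript
[
  {
    "name": "Test several inputs through a test driver",
    // "test_driver.c" and "test_driver" are hypothetical names.
    "files": [ "test_driver.c", "caesar.c" ],
    "cpp-extra-args": "-I .",
    "main": "test_driver"
  }
]
```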

### Using compilation databases

If your project uses tools such as [CMake](https://cmake.org/) or [Bear](https://github.com/rizsotto/Bear), the generated [compilation database JSON file(s)](https://clang.llvm.org/docs/JSONCompilationDatabase.html) can be used instead of the `"cpp-extra-args"` and `"cxx-cpp-extra-args"` options to deduce the preprocessing options to use for the analyzed source files.

First, the compilation database file(s) must be generated during the [Build preparation stage](https://docs.ci.trust-in-soft.com/configuration-file/build-preparation-script):

```bash
#!/bin/bash

set -e

# Generate compile_commands.json files with Bear or CMake
bear make
```

Then, to use these generated compilation database file(s), the `"compilation-database"` option should be added in your analysis configuration object with the paths to the compilation database file(s):

```javascript
[
  {
    "name": "Test with a compilation database",
    "files": [ "main.c", "caesar.c" ],
    "compilation-database": [ "compile_commands.json" ]
  }
]
```

{% hint style="info" %}
If the `"cpp-extra-args"` or `"cxx-cpp-extra-args"` options are given in addition to `"compilation-database"`, they are appended to the preprocessing command line (used by TrustInSoft Analyzer to parse the source files) after the preprocessing options extracted from the compilation database.
{% endhint %}

{% hint style="success" %}
If a **directory** is given instead of a compilation database file in the `"compilation-database"` option, the analyzer will scan all `compile_commands.json` files located in this directory and sub-directories.
{% endhint %}
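
The two notes above can be combined as in the following sketch. The `build` directory and the `-D TEST_MODE` option are hypothetical and only serve the illustration:

```javascript
[
  {
    "name": "Test with a compilation database directory",
    "files": [ "main.c", "caesar.c" ],
    // "build" is a hypothetical directory containing one or more
    // compile_commands.json files.
    "compilation-database": [ "build" ],
    // Hypothetical extra option, appended after the options
    // extracted from the compilation database.
    "cpp-extra-args": "-D TEST_MODE"
  }
]
```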

### Selecting a C++ standard

For C++ programs, it is recommended to explicitly specify which C++ standard to use for the analysis with the `"cxx-std"` option. If omitted, the default C++ standard used is `c++11`.

The available C++ standards for TrustInSoft CI are: `c++03`, `c++0x`, `c++11`, `c++14`, `c++17`, `c++1y`, `c++1z`, `c++20`, `c++2a`, `c++98`, `gnu++03`, `gnu++0x`, `gnu++11`, `gnu++14`, `gnu++17`, `gnu++1y`, `gnu++1z`, `gnu++20`, `gnu++2a`, `gnu++98`.

Example with our C++ example repository [Cxx\_matrix](https://github.com/TrustInSoft-CI/Cxx_matrix):

```javascript
[
  {
    "name": "Matrix manipulations in C++",
    "files": [ "matrix.cpp" ],
    "compilation_cmd": "-I.",
    "cxx-std": "c++14"
  }
]
```

### Customizing the address alignment

In TrustInSoft CI Analyzer, the base addresses are assumed to be aligned to multiples of `1` by default.

If your analyzed program assumes the addresses to have a different alignment, it can be specified with the `"address-alignment"` option:

```javascript
     // Base addresses are assumed to be aligned to multiples of 65536.
     "address-alignment": 65536
```
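
Placed in a complete analysis configuration, the option sits next to the other keys of the object. A minimal sketch reusing the demo-caesar files from the earlier examples:

```javascript
[
  {
    "name": "Test with base addresses aligned to 65536",
    "files": [ "main.c", "caesar.c" ],
    "cpp-extra-args": "-I .",
    "address-alignment": 65536
  }
]
```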

### Simulating a file system

If the analyzed program performs operations on files, the analysis may need a virtual file system to be deterministic; otherwise the analysis may be interrupted by a `Bad libc call` error.

This virtual file system simulates a list of files available for the analyzed program. This list of files is based on files of the real file system, hence it is recommended to either commit the files needed for the analysis in your GitHub repository or to generate them during the [Build preparation stage](https://docs.ci.trust-in-soft.com/configuration-file/build-preparation-script).

The virtual file system is enabled with the `"filesystem"` option, which contains a list of `"files"`. Each file should indicate the `"name"` used by the program and the associated file `"from"` the real file system. During the analysis, the contents of the `"name"` file are mapped to those of the real `"from"` file, allowing a deterministic behavior of functions operating on files (such as `fgetc`, `fread`, ...).

{% hint style="danger" %}
The string in `"name"` needs to be **exactly the same** as the one used inside the program to open the file. Otherwise the file will not be found and mapped to the `"from"` file of the virtual file system.
{% endhint %}

```javascript
[
  {
    "name": "Test with file as input",
    "files": [ "caesar.c", "main.c" ],
    "cpp-extra-args": "-I.",
    "main": "main_with_filesystem",
    "filesystem": {
      "files": [
        {
          // Path used for the "fopen" in "main_with_filesystem".
          "name": "/var/demo/caesar/test-suite.txt",
          // Path to the file located in the repository.
          "from": "input.txt"
        },
        {
          // If "from" is omitted, the analyzer assumes this file
          // does not exist. 
          "name": "/var/demo/caesar/test-suite-2.txt"
        } 
      ]
    }
  }
]
```

### Tweaking for performance issues

Some analyses can take too much time or memory according to the limits set by TrustInSoft CI.

Regarding time, a single analysis is stopped after running for 15 minutes, leading to the `Timeout` error. This limit can be increased up to 3 hours with the `"val-timeout"` option.

Regarding memory usage, TrustInSoft CI Analyzer keeps all the results of the analysis in memory. These results are needed to **Inspect** the analysis with the Graphical User Interface of TrustInSoft CI Analyzer (which shows the values of all variables at any program point).

However, it is hard to anticipate how much memory an analysis will consume. If an analysis is stopped with the `Out of memory` error, the `"no-results"` option can be used to force TrustInSoft CI Analyzer not to keep the results of the analysis in memory. As a side effect, the `"no-results"` option can also make the analysis slightly faster.

{% hint style="warning" %}
With the `"no-results"` option, the analysis may no longer hit the memory limit. However, you will no longer be able to **Inspect** the result with the Graphical User Interface of TrustInSoft CI Analyzer.
{% endhint %}

```javascript
[
  {
    "name": "Test beyond limits",
    "files": [ "main.c", "caesar.c" ],
    "cpp-extra-args": "-I .",
    
    // The value is the number of seconds. 10800 is equal to 3 hours.
    // A value greater than 3 hours will still be capped to 3 hours.
    "val-timeout": 10800,
    
    // If "true", inspecting the analysis afterwards with
    // TrustInSoft CI Analyzer is no longer possible.
    "no-results": true
  }
]
```

### More advanced usages

{% hint style="warning" %}
The options described on this page are only part of a long list. Most of the options not described here are only useful in very specific use cases.

TrustInSoft CI Analyzer shares its options with [TrustInSoft Analyzer](https://trust-in-soft.com/product-c-and-c-source-code-analyzer/). So the complete list of options for an analysis configuration can be found on the [TrustInSoft Analyzer documentation](https://man.trust-in-soft.com/ref/tis-config.html#list-of-options) and is closely related to the [TrustInSoft Analyzer command line options](https://man.trust-in-soft.com/ref/options.html#list-of-options).

However, some of these options are **not available** with TrustInSoft CI Analyzer.

Do not hesitate to [contact us](https://docs.ci.trust-in-soft.com/get-help) if you have trouble configuring your analyses.
{% endhint %}
