# Find the root cause of the undefined behavior

## Launch TrustInSoft CI Analyzer

{% hint style="info" %}
In the following, we'll use TrustInSoft CI Analyzer to explore the behavior of the code along the tested path and find the root cause of the undefined behavior.
{% endhint %}

**1.** In the Summary, click on `Inspect with TrustInSoft CI Analysis` (under the snippet of code) to launch TrustInSoft CI Analyzer:

![](https://3982345336-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LpYF4Rmm1tt0M5Cn4E0%2F-Meek5aFnikzGe2zq0ZC%2F-MeekN02qAcifZPyhhe-%2Fimage.png?alt=media\&token=00c862ad-b6f2-4eba-92be-124c731591a8)

**2.** Once TrustInSoft CI Analyzer has launched, take a look at the interface:

![](https://3982345336-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LpYF4Rmm1tt0M5Cn4E0%2F-M0qbB42-5VfKas3Jk7U%2F-M0qd-goPbaztwAESIZK%2Fimage.png?alt=media\&token=3bf6a4ca-41a0-4eed-8035-96a8c9165728)

* The **Overview panel** displays information about the analysis
* The **Interactive code panel** displays a "normalized" version of the code that preserves the semantics of the program
* The **Source code panel** shows the original source code

## Describe the undefined behavior

**1.** Observe how the **Interactive code panel** points to the undefined behavior:

![](https://3982345336-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LpYF4Rmm1tt0M5Cn4E0%2F-LrK1S4B-nMf4vJjmyr1%2F-LrK3SPn7oM79f9yMNBq%2Fimage.png?alt=media\&token=1548f050-9838-4b02-b691-c42d76c8dea0)

The second line highlights the statement in the normalized code where the undefined behavior was found. The first line highlights the corresponding functional property, which TrustInSoft CI could not verify.

{% hint style="info" %}
As part of the analysis process, TrustInSoft CI automatically annotates the code with ACSL functional properties starting with `assert Value`. These properties are written using the [ACSL specification language](https://en.wikipedia.org/wiki/ANSI/ISO_C_Specification_Language). They contain predicates, such as `\valid(buf+i)`, that TrustInSoft CI tries to verify to prove the absence of undefined behaviors.
{% endhint %}

Note the corresponding source code statement **line 31** in grey in the **Source code panel**:

![](https://3982345336-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LpYF4Rmm1tt0M5Cn4E0%2F-M0qbB42-5VfKas3Jk7U%2F-M0qdCH0lVhh33SByLud%2Fimage.png?alt=media\&token=d2616288-9718-4411-b9d7-66909972d783)

Let's understand why the predicate `\valid(buf+i)` did not hold, by inspecting the variables `buf` and `i`.

**2.** In the **Interactive code panel**, click on `i` inside `\valid(buf+i)`, then look at the **Values widget** (bottom):

![](https://3982345336-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LpYF4Rmm1tt0M5Cn4E0%2F-LrUO4tOiDeuit1ue6UI%2F-LrURfoqwdNdWREFjZ2H%2Fimage.png?alt=media\&token=6ea0f298-b38e-44f9-98e1-3233cae193dc)

The widget displays the selected expression `i` along with the union of all its values until the undefined behavior was found. Since `i` gets incremented at each iteration of the `while` loop enclosing the undefined behavior, `i` takes several values so an integer range between 1 and 33.

**3.** Similarly, click on `buf` to look at its values in the **Values widget**:

![](https://3982345336-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LpYF4Rmm1tt0M5Cn4E0%2F-LrUO4tOiDeuit1ue6UI%2F-LrURwBzynHD8RVOmXVz%2Fimage.png?alt=media\&token=3407ef1a-637b-4080-95a6-9857aa3cfc04)

`buf` is a pointer, whose value is the result of the call to `malloc` in the function `caesar_encrypt` line 26. The memory space allocated by `malloc` is represented using the symbolic notation `__malloc_caesar_encrypt_l26`.

**4.** You can click anywhere in the code on `i`, `buf` or any other expression, to see their values before and after the clicked statement in the **Values widget**:

![](https://3982345336-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LpYF4Rmm1tt0M5Cn4E0%2F-LrPKCeuSsihHV9H8m7T%2F-LrPU5hUWcAQh4vmIlKA%2F2019-10-17_18-04-55%20\(2\).gif?alt=media\&token=72af48f9-b377-4128-bbf4-eb7f3b3b4271)

**5.** Right click on `i` and select **Track this term**, then click on `buf`  to simultaneously inspect the values of `i`  and ~~`buf`~~ :

![](https://3982345336-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LpYF4Rmm1tt0M5Cn4E0%2F-M0NhxF_UC-ks2LfJcvF%2F-M0O9eclZztHvYacL_O2%2Fimage.png?alt=media\&token=00acb36f-8fdb-4bdf-a6ac-abcd12f2e1f7)

**6.** In the **Values widget**, click on the link **\_\_malloc\_caesar\_encrypt\_l26** to see its values. Then, click on the cell to expand it:

![](https://3982345336-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LpYF4Rmm1tt0M5Cn4E0%2F-LsNXI10FzgLqDnjfOwa%2F-LsN_B1dm8Wls3AOqKlB%2F2019-10-29_18-33-18%20\(1\).gif?alt=media\&token=c508450f-415f-476b-81fb-c69444ea31b1)

**7.** Click on **Per path** to look at the values of `i` and `__malloc_caesar_encrypt_l26`  for each iteration of the `while` loop enclosing the undefined behavio&#x72;**:**

![](https://3982345336-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LpYF4Rmm1tt0M5Cn4E0%2F-M0NhxF_UC-ks2LfJcvF%2F-M0O6yhXWMrwSYP66mrx%2Fimage.png?alt=media\&token=cc27a2ed-c44f-4e43-bf16-445b6e933c82)

Each "master" row corresponds to one iteration of the loop. For example, the first row shows  `i = 1`, which means  that the statement with the undefined behavior is first reached when `i = 1`.

**8.** Click on **Emission rank** in the column headers, then **Sort Descending** to display the last loop iteration first, that is where the analysis stopped because of the undefined behavior:

![](https://3982345336-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LpYF4Rmm1tt0M5Cn4E0%2F-M0NhxF_UC-ks2LfJcvF%2F-M0O83RiREaqv4MUJ0AK%2Fimage.png?alt=media\&token=f8761cb4-cdea-495e-8d3f-3f76a800e674)

In the last iteration (top "master" row) so right before the analysis stopped because of the undefined behavior, the code is accessing `buf` with an offset `i` equal to 33:

![](https://3982345336-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LpYF4Rmm1tt0M5Cn4E0%2F-M0O9iwl5a_P14uqQ365%2F-M0OB3Lzyh-aBaLMwfFJ%2Fimage.png?alt=media\&token=afe382cc-0e37-4f6a-886d-65a7c6e59151)

**9.** Finally, click on `__malloc_caesar_encrypt_l26` in the **Expression** column (see above screenshot) to see its type:

![](https://3982345336-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LpYF4Rmm1tt0M5Cn4E0%2F-LsN_TPyJqeqjWTQsPWv%2F-LsN_Y-zbQB_pTVBGub6%2Fimage.png?alt=media\&token=a8b5a1e1-18d0-4fd8-9599-b195c6f5806f)

`__malloc_caesar_encrypt_l26` is an array of char of length 33 `(char [33])`.  The access with an offset of 33 is therefore out of bounds and it is an undefined behavior.

Let's find where the length of the array comes from.

## Find the root cause of the undefined behavior

**1.** In the **Interactive code panel**, right click on `buf`  inside the code statement below the functional property, then click on **Show Defs**.

The **Show Defs Statement widget** (left) moves up into view and displays the previous write statement to `buf` (see video capture below).

**2.** Click on this previous statement to point the **Interactive code panel** to it:

![](https://3982345336-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LpYF4Rmm1tt0M5Cn4E0%2F-M0qdqpI_gjAJmDgAqcS%2F-M0qfQpJOCesM9XxNXrB%2F2020-02-24_11-10-05%20\(1\).gif?alt=media\&token=bea48166-e5a7-4d79-878e-fdc5b3b1ff05)

The problem is the integer literal 33 that is passed to `malloc`. Since `buf` contains the result of the encryption of `str`, the code should not allocate a buffer of constant size but one of the same size as `str`.

{% hint style="info" %}
Whenever it is needed to understand where a value comes from, use **Show Defs** on this value and then recursively on the left-value of the previous statement, until you reach the root cause of the bad value.
{% endhint %}

## Summary

Before we fix the root cause of the undefined behavior, let's summarize what happened and what we did:

* TrustInSoft CI verifies the presence of undefined behaviors by propagating the program's input values, statement by statement, along the analyzed path. As part of the process, it computes for each statement the current "memory state", that can be roughly defined as the current value of all the variables.<br>
* We first analyzed the undefined behavior by inspecting the values of the variables at the program point where the undefined behavior was found. Then, we explored backwards the memory states, until we found the root cause of the undefined behavior.<br>
* We used the **Values widget** to isolate the program point where the undefined behavior occurred and to explore the values of the variables, and **Show Defs** to move to previous program points / memory states.
