assert(), __FILE__, Path and other cool GNU gcc Tricks to be aware of

It is always good to take a close look at what ends up in a microcontroller's FLASH memory, for example by using the EHEP Eclipse plugin to inspect the binary file:

Source File Name in Binary Image

Obviously it has path and source file information in it. Why is that? And is this really needed?

What about:

  • Privacy: the path or file name might expose information (a secret project name?) or might be used for reverse engineering.
  • Size: the strings add to the final data/FLASH size, so they increase the need for ROM space.

So let’s have a look at the reason for this and how it can be avoided or at least reduced.

Outline

This article covers why information about file names and path information can be present in a binary. It goes through how assert() checks are used, how they can be enabled or disabled, how the information about the files can be avoided, changed or removed to address privacy or code size concerns.

assert()

The reason for the file name strings is the use of asserts, for example like this:

void McuLED_GetDefaultConfig(McuLED_Config_t *config) {
  assert(config!=NULL);
  memcpy(config, &defaultConfig, sizeof(*config));
}

Asserts are used to verify a condition in order to catch error cases (a NULL pointer in the above case). The assert checks whether the condition is true. If it is false, it can trigger an error handler.

assert tooltip help

The assert is typically a macro. A typical library implementation is like below, found in assert.h:

#ifdef NDEBUG           /* required by ANSI standard */
# define assert(__e) ((void)0)
#else
# define assert(__e) ((__e) ? (void)0 : __assert_func (__FILE__, __LINE__, \
						       __ASSERT_FUNC, #__e))
#endif

The macro (if turned on) uses the __FILE__ macro/preprocessor symbol which gets filled by the compiler with the file name. It is used with the __LINE__ preprocessor symbol to write an error message, indicating the file name and line number where the assertion failed.
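To make the mechanism concrete, here is a minimal sketch of an assert-like macro with the same shape as the library one. The names MY_ASSERT, my_assert_failed(), check_pointer() and the last_* variables are hypothetical and only exist for this illustration; a real failure handler would usually log and halt instead of returning.

```c
#include <stddef.h>

/* Hypothetical bookkeeping for this sketch: remembers the last failed check. */
static const char *last_file = NULL;
static const char *last_expr = NULL;
static int last_line = 0;

/* Hypothetical failure handler: only records where the check failed. */
static void my_assert_failed(const char *file, int line, const char *expr) {
  last_file = file;
  last_line = line;
  last_expr = expr;
}

/* __FILE__ and __LINE__ are expanded at the place where MY_ASSERT() is used,
 * and #e turns the checked expression into a string literal. */
#define MY_ASSERT(e) ((e) ? (void)0 : my_assert_failed(__FILE__, __LINE__, #e))

/* Returns 1 if the check passed, 0 if the failure handler was called. */
int check_pointer(const void *p) {
  last_file = NULL;
  MY_ASSERT(p != NULL);
  return last_file == NULL;
}
```

Every use of such a macro places one file name string and one expression string into the binary, which is where the strings seen in the image above come from.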

💡 Whether __FILE__ gets resolved to just the file name or to the file name with the path depends on the compiler implementation; more about this later.

To what extent the file name and path information is useful for reverse engineering of course depends on the project: at the very least it might expose some information you do not want to share.

Turning Asserts Off

As we can see, the assert functionality can be turned off completely by defining NDEBUG. This is usually defined for a Release build (see Debug vs. Release?). So one solution is obviously to change the usual DEBUG define to NDEBUG, or just to add NDEBUG to the list of defines like below:

NDEBUG defined to turn off asserts
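One consequence worth keeping in mind: with NDEBUG defined, the expression inside assert() is not evaluated at all. The sketch below shows this in a single file by including <assert.h> twice, once with and once without NDEBUG (the header may be included multiple times and picks up the current NDEBUG state each time). The helper names touch(), evaluations_with_ndebug() and evaluations_without_ndebug() are hypothetical.

```c
#include <stddef.h>

static int evaluations = 0;

/* Counts how often it is called; always returns true. */
static int touch(void) {
  evaluations++;
  return 1;
}

#define NDEBUG
#include <assert.h> /* assert() now expands to ((void)0) */

int evaluations_with_ndebug(void) {
  evaluations = 0;
  assert(touch()); /* touch() is never called here */
  return evaluations;
}

#undef NDEBUG
#include <assert.h> /* assert() is active again */

int evaluations_without_ndebug(void) {
  evaluations = 0;
  assert(touch()); /* touch() is called once */
  return evaluations;
}
```

So code that relies on a side effect inside assert() behaves differently between Debug and Release builds.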

__FILE__

The advantage of defining NDEBUG is of course that these file names are removed, because the asserts no longer generate them. As a side effect, this can reduce code size too, depending on how many asserts are present in the code (see Tutorial: How to Optimize Code and RAM Size).

But the safety checks provided by the asserts are gone too. So what can I do to keep the asserts without exposing the full path through the __FILE__ macro?

The thing is that the C/C++ standard does not specify whether __FILE__ includes the full path or not. Some compilers implement dedicated options to configure exactly that. GNU gcc basically uses what I pass on the command line. So if I compile a file with the full path, this is what __FILE__ resolves to. In the example below the full path is used:

full path

full path passed to the compiler

What gets passed to the compiler as file name depends on your build environment, e.g. how you call the compiler in the make file.

Eclipse for example usually uses a relative path to the file:

💡 In Eclipse CDT the relative paths are relative to the ‘output’ folder which is where the binaries and object files are stored. Usually this is the ‘Debug’ or ‘Release’ folder in your project.

Eclipse CDT Build with relative path

Again it is up to the compiler what is then used for __FILE__. For the above file compiled with “../source/main.c” the result is interesting:

relative path shortened by gcc

So obviously the compiler has shortened the path to something I did not expect: “.main.c”

Compiling it with an absolute path shows that __FILE__ is absolute too:

Compiling with absolute path

__FILE__ with absolute path

__BASE_FILE__

When I first saw that there is a __BASE_FILE__ macro in gcc, I thought this could be the solution and just have the file name without path. But this is not the case:

Absolute path with __BASE_FILE__

Confirmed by the documentation on https://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html

__BASE_FILE__

This macro expands to the name of the main input file, in the form of a C string constant. This is the source file that was specified on the command line of the preprocessor or C compiler.

So to me it is the same as __FILE__. Well, not really: if used in an included (header) file it reports the includer file and not the included file (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42579#c3). So this one might not be helpful, but let’s see :-).

Linux

Because paths are handled differently on Windows and Linux, I created a simple test on a Raspberry Pi (using gcc for that ARM core):

#include <stdio.h>

int main(void) {
  printf("__FILE__ is '%s'\n", __FILE__);
  return 0;
}

Compiled with
gcc main.c
it gives

__FILE__ is 'main.c'

which is expected.

Compiled with
gcc ../0_test/main.c
it gives

__FILE__ is '../0_test/main.c'

which is different from what I had on Windows.

Finally compiled it with
gcc /home/pi/aembs/0_test/main.c
it gives

__FILE__ is '/home/pi/aembs/0_test/main.c'

To me things are a bit more consistent on Linux. It seems that GNU gcc for ARM has an issue with relative paths on Windows and somehow truncates them to a single dot (.).

Absolute Paths

The first step is to get rid of the path or at least make it relative. How to do this depends on the build environment used. In Eclipse, look at the build console output to see what kind of file path is used:

full path

If Linked Files or Folders are used, they get expanded to an absolute path which is passed to the build tools:

Linked Location

Unfortunately depending how the project is organized, this cannot be easily changed. The project is still portable because it is a relative path, but an absolute path gets passed to the compiler.

-ffile-prefix-map Option

The GNU compiler has a nice option (see https://gcc.gnu.org/onlinedocs/gcc/Preprocessor-Options.html) which can be used to cut off paths in the preprocessor:

-ffile-prefix-map=old=new

When compiling files residing in directory old, record any references to them in the result of the compilation as if the files resided in directory new instead. Specifying this option is equivalent to specifying all the individual -f*-prefix-map options. This can be used to make reproducible builds that are location independent. See also -fmacro-prefix-map and -fdebug-prefix-map.

Say if I want to cut-off “c:/tmp/” from the __FILE__, I can use the following option:

-fmacro-prefix-map=c:/tmp/=

That way a “c:/tmp/main.c” gets mapped just to “main.c”:

remapped __FILE__ macro

If a path has spaces, use a double quoted path, e.g.

-fmacro-prefix-map="c:/path with spaces/"="new path"

Of course this means I have to do this for multiple directories depending on my application structure, but at least that way I can keep the strings short and still useful.
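To illustrate what an old=new mapping does to a path, here is a small sketch in plain C. It only models the idea of replacing a leading prefix; map_prefix() is a hypothetical helper and not how gcc implements the option, and details such as case or separator handling on Windows are not covered.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: applies one old=new mapping to path and writes the
 * result to out. Paths that do not start with old_prefix are copied unchanged. */
void map_prefix(const char *path, const char *old_prefix,
                const char *new_prefix, char *out, size_t out_size) {
  size_t old_len = strlen(old_prefix);

  if (strncmp(path, old_prefix, old_len) == 0) {
    /* prefix matches: emit the new prefix followed by the rest of the path */
    snprintf(out, out_size, "%s%s", new_prefix, path + old_len);
  } else {
    /* no match: keep the path as it is */
    snprintf(out, out_size, "%s", path);
  }
}
```

With old set to “c:/tmp/” and an empty new, “c:/tmp/main.c” becomes “main.c”, while a file in a different directory keeps its full path. That is why one mapping per source root may be needed.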

For that example shown at the beginning of the article I can easily make that path shorter and get the first part of the absolute path removed:

shorter path

So with this I can shorten the path/file name as much as I want. In addition to that, what could be done with the __FILE__ macro to just have it represent the file name?

__FILE__ without path: __FILENAME__

With the __FILE__ being the problem, why not create a new one (__FILENAME__) which only contains the file name and not path?

That’s actually not that hard to do:

 
#define __FILENAME__ (strrchr("/"__FILE__, '/') + 1)

In the above macro, strrchr() is from <string.h>: it locates the last occurrence of a character (in this case ‘/’) in a string (in this case __FILE__), and the +1 points just past it. A bit of trickery is the implicit string concatenation with the “/” prefix: that way the string always has a ‘/’ present.
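The effect of the “/” prefix is easiest to see with fixed strings instead of __FILE__. The sketch below uses a hypothetical FILENAME_OF() macro with the same expression; it only works on string literals because it relies on literal concatenation. Without the prefix, a name that contains no ‘/’ would make strrchr() return NULL, and adding 1 to that would be invalid.

```c
#include <string.h>

/* Hypothetical helper macro for this sketch: same expression as __FILENAME__,
 * but with the string literal passed in as an argument. */
#define FILENAME_OF(lit) (strrchr("/" lit, '/') + 1)

/* Absolute path: the last '/' is the one in front of main.c. */
const char *name_from_absolute(void) { return FILENAME_OF("c:/tmp/main.c"); }

/* Relative path: same result. */
const char *name_from_relative(void) { return FILENAME_OF("../source/main.c"); }

/* No path at all: the added "/" prefix is the only '/' in the string. */
const char *name_without_path(void) { return FILENAME_OF("main.c"); }
```

All three functions return a pointer to “main.c”. Paths written with backslashes are not split by this expression.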

But wait! strrchr() is a function call: first, it is not efficient, and second, with __FILE__ the string with the path still ends up in my binary :-(. And yes, looking at the assembly code confirms this:

#define __FILENAME__ (strrchr("/"__FILE__, '/') + 1)
 
const char *fileName;
...
fileName = __FILENAME__;

gives:

call to strrchr()

Definitely not good if there were such a call for each __FILE__ usage.

Actually there is help with gcc built-in functions (see GNU gcc printf() and BuiltIn Optimizations and list of gcc built-in functions).

First, make sure that you don’t have a -fno-builtin in your project settings, so remove that option:

remove -fno-builtin

With that option removed or not present, the gcc compiler can optimize, replace or inline standard library functions. Because just removing the option might not be enough, I do call the built-in function directly:

 
#define __FILENAME__ (__builtin_strrchr("/"__FILE__, '/') + 1)

💡 Note that strrchr() does return a pointer into the __FILE__ string.

With this there is no extra call, and the code just uses the pointer to the constant string “main.c”, which is at address 0x2f7c below:

using string main.c

So with this I have a __FILENAME__ macro which, unlike the normal __FILE__, contains only the file name and no path.

Using __FILENAME__ in assert()?

Instead of using __FILE__ I can use __FILENAME__ and I should be fine. That’s OK for my own code which I can change from using __FILE__ to __FILENAME__.
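As a sketch of what that looks like in my own code, the following uses __FILENAME__ to build a “file:line” location string for a log message. LOCATION(), format_location(), where_am_i() and the buffer are hypothetical names for this illustration, and the single static buffer is a simplification (not reentrant).

```c
#include <stdio.h>
#include <string.h>

/* File name without path, as defined in this article (gcc built-in). */
#define __FILENAME__ (__builtin_strrchr("/" __FILE__, '/') + 1)

/* Hypothetical static buffer for the formatted location. */
static char location_buf[64];

/* Hypothetical helper: formats "file:line" into the static buffer. */
const char *format_location(const char *file, int line) {
  snprintf(location_buf, sizeof(location_buf), "%s:%d", file, line);
  return location_buf;
}

/* Expands at the place of use, so file name and line match the caller. */
#define LOCATION() format_location(__FILENAME__, __LINE__)

/* Example use: returns the location of this very line. */
const char *where_am_i(void) {
  return LOCATION();
}
```

The returned string contains the file name and the line number, but no directory part, as long as the path passed to the compiler uses ‘/’ as separator.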

Unfortunately this does not work for the assert() macro which is inside <assert.h>: I can disable assert() with NDEBUG, but I cannot easily override it with my own define. I could define my own __assert_func(), but it is the assert() macro which uses __FILE__, as shown below:

 
#ifdef NDEBUG /* required by ANSI standard */ 
# define assert(__e) ((void)0) 
#else 
# define assert(__e) ((__e) ? (void)0 : __assert_func (__FILE__, __LINE__, \ 
__ASSERT_FUNC, #__e)) 
#endif 

Well, I could change the <assert.h> library header file or recompile the GNU standard library. But this is not easy, and I would rather keep the library as it is.

Redefining __FILE__

If the __FILE__ macro is not what I want, why not change that macro instead?

 
#define __FILE__ (__builtin_strrchr("/"__FILE__, '/') + 1)

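The definition above refers to itself, and the preprocessor does not expand a macro again inside its own expansion. To get rid of the recursion, I can rewrite it as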

 
#define __FILE__ (__builtin_strrchr("/"__BASE_FILE__, '/') + 1)

This of course raises a gcc warning which I can suppress with
-Wno-builtin-macro-redefined

-Wno-builtin-macro-redefined

To have this define present for every file I compile in the project, I use the -include option with the following file:

 
/*
 * __FILE__def.h
 *
 * Copyright (c) 2020: Erich Styger
 * License: SPDX-License-Identifier: BSD-3-Clause
 */

#ifndef FILE__DEF_H_
#define FILE__DEF_H_

/* Redefine the __FILE__ macro so it contains just the file name and no path
 * Add -Wno-builtin-macro-redefined to the compiler options to suppress the warning about this.
 */

#define __FILE__ (__builtin_strrchr("/"__BASE_FILE__, '/') + 1)

#endif /* FILE__DEF_H_ */

To include it, I use the -include option:

Including the file with -include

With this there are no warnings, I still have the asserts in place, the file names are without path information, plus I save FLASH space 🙂

Assert callback

The last thing is about what should happen in case the assertion triggers. By default the library prints an error message using printf().

What I recommend is overriding the callbacks with custom routines: this not only avoids the printf() bloat, it also gives you the ability to perform custom actions (blink an LED) or log the error. Below is what I usually use with the McuLog library:

 
/* overwrite assertion callback */
#include "McuLog.h"

void __assertion_failed(char *_Expr)  {
  McuLog_fatal(_Expr);
  McuLog_fatal("Assert failed!");
  __asm volatile("bkpt #0");
  for(;;) {
    __asm("nop");
  }
}

void __assert_func(const char *file, int line, const char *func, const char *expr) {
  McuLog_fatal("%s:%d %s() %s", file, line, func, expr);
  McuLog_fatal("Assert failed!");
  __asm volatile("bkpt #0");
  for(;;) {
    __asm("nop");
  }
}

Simply add the two functions above to the code: the base implementation in the GNU library is marked as ‘weak’ and can easily be overridden.

Summary

Using assert() in the source code is a good way to catch errors early. The assert() checks a condition, and if it fails the default implementation reports the source file name (__FILE__) and line number (__LINE__). That way the path and source file name get added as constant strings to the binary, which can be a concern for privacy and/or code size. What exactly __FILE__ represents depends on the compiler and on how the file gets passed to the compiler. The asserts can be turned off with the NDEBUG macro. In case asserts shall still be checked in a release binary, the assert handling can be overridden and modified to whatever you want.

With the help of this article you can now turn asserts on or off, limit or replace the path information to the files, get the file name without path from the __FILE__ preprocessor macro, and use custom assert() hooks. Congratulations!

Happy asserting 🙂

14 thoughts on “assert(), __FILE__, Path and other cool GNU gcc Tricks to be aware of”

  1. Hi, some questions please.

    1.- How can I see the disassembly of the binary in MCUXpresso?
    2.- Do you know if this is possible to see in all IDE based on Eclipse as in ST for STM32?
    3.- Can it be seen in any binary or only in the full projects with sources developed in MCUXpresso?
    4.- Is it possible to modify it or is it only displayed?

    Regards

  2. Hi Erich,

    This partially worked for me. The call to strrchr does get optimized out, and just the relevant portion of the filename gets used, however, the full path is still included in the binary. I tried a bunch of different optimization levels and options, but couldn’t find anything that eliminated the full path from the binary. Do you have any suggestions?

    These are my options:
    -Og -mcmodel=medium -g3 -Wall -mcpu=n25f -ffunction-sections -fdata-sections -c -fmessage-length=0 -fomit-frame-pointer -fno-strict-aliasing -Werror -fstrict-volatile-bitfields

    I’m using the AndeSight IDE (eclipse-based) with an Andes gcc compiler, so maybe things are just too different.

    Thanks,
    Ben

    • Hi Ben,
      I did not know about this AndeSight IDE. I checked their web site and it seems to me that they are not using the standard ARM/GNU compiler, so the compiler/linker might not be doing the same thing. Your options look fine. Did you check/verify from where that absolute path is used? Maybe you are using something different or assert is still using the full path?

      • Hi Erich,

        Thanks for your response. I’ve tried to eliminate variables to solve this issue so I’m not working with “assert” yet. I just have
        #define __FILE_NAME__ (__builtin_strrchr("/"__FILE__, '/') + 1)
        and then I call
        printf("__FILE_NAME__ is %s\n", __FILE_NAME__);
        in main(). The functionality is correct and no calls to strrchr are found in the assembly, but the full path of the file can still be found in the binary.

        Thanks,
        Ben

        • “I guess you have something wrong with your includes. Place that #define directly in front of your printf and check the assembly code.”

          No luck with that either. And I just got a response from my question to Andes – they don’t think it’s possible. Must be something about their toolchain…
