Poor Man’s Trace: Free-of-Charge Function Entry/Exit Trace with GNU Tools

There are cases where my application runs find for days, weeks or even months, but then from time to time there is an application crash. Yes, the watchdog will recover it, but still it would be good to know what happened? One solution would be to hook up a trace probe (like the one I have described in this post: “First Steps with the P&E Tracelink“). But having such a trace probe attached all the time is first not cheap and second not always possible. So what if the application would leave ‘breadcrumbs’ behind which would tell me the flow of the program leading to the problem? I have found a functionality in the GNU tools which seems not be widely known or use, but is incredibly helpful in such cases.

So what if I could get a log like this telling me which functions get called by whom?

{ 00000E88->00000DA0 ???->DEMO_Init
} 00000E88<-00000DA0 ???<-DEMO_Init
{ 00000E8C->00000D40 ???->DEMO_Run
 { 00000D62->00000CE8  DEMO_Run:0x0022->decide
  { 00000D0E->00000C60  decide:0x0026->calcValue
  } 00000D0E<-00000C60  decide:0x0026<-calcValue
  { 00000D16->00000CA0  decide:0x002E->getValue
   { 00000CC6->00000C60  getValue:0x0026->calcValue
   } 00000CC6<-00000C60  getValue:0x0026<-calcValue
  } 00000D16<-00000CA0  decide:0x002E<-getValue
 } 00000D62<-00000CE8  DEMO_Run:0x0022<-decide
 { 00000D62->00000CE8  DEMO_Run:0x0022->decide

Example Trace

For example if I have this piece of code:

static int decide(int i) {
  if (i<10) {
    return calcValue(i,5)+getValue(7);
  } else {
    return getValue(i);
  }
}

I can get this function trace

  { 00000D0E->00000C60  decide:0x0026->calcValue
  } 00000D0E<-00000C60  decide:0x0026<-calcValue
  { 00000D16->00000CA0  decide:0x002E->getValue
   { 00000CC6->00000C60  getValue:0x0026->calcValue
   } 00000CC6<-00000C60  getValue:0x0026<-calcValue
  } 00000D16<-00000CA0  decide:0x002E<-getValue

The trace is very simple:

  • { denotes entry of a function. It lists the PC where the function gets called, with the destination address (<PCofCaller> -> <DestinationAddress<). After that, it tries to decode the function name and offset inside that function.
  • } denotes exit of a function. It lists the address where it returns to, and from where it returns. If available, it lists the function name with offsets inside the function.

To take it from the above example:

{ 00000D0E->00000C60  decide:0x0026->calcValue

This means: at address 0xD0E there is a function call to the function calcValue(). That function call happens inside the function decide(), 0x26 bytes after the function start.

And

} 00000D0E<-00000C60  decide:0x0026<-calcValue

means that there is an exit of the function calcValue(), and it returns to the function decide() at the offset 0x26 inside the function decide().

The question is: how to get that kind of information? I’ll show you :-).

Poor Developers Trace

To be clear: what I show here does not replace more powerful hardware tracing. And the approach presented here only works with GNU :-). Basically what I’m doing is recording in my application firmware the entry and exit sequence of my program. The data is stored in a ring buffer with configurable size, so I can spend as much RAM available as needed. So using a function trace like this has an impact on the realtime behaviour of the application, so use with care!

The basic idea is: every function entry and exit will be recorded and stored in a trace ring buffer. In that ring buffer there is information if it is a function entry or exit, with the corresponding address information. Then that data gets transformed into a human readable form, like shown in the example above.

Instrumenting Function Enter and Exit

To instrument the function enter and exit, the following option has to be added to the GNU compiler settings:

-finstrument-functions
GNU Compiler Instrumentation Option

GNU Compiler Instrumentation Option

With this option enabled, gcc will call the following functions for entry and exit of a function:

void __cyg_profile_func_enter(void *this_fn, void *call_site);
void __cyg_profile_func_exit(void *this_fn, void *call_site);

To exclude functions from profiling, use

__attribute__((no_instrument_function))

The profiling functions itself need to be excluded, otherwise they would be profiled recursively.

With that option, every function entry code will contain a call to __cyg_profile_func_enter(). It will pass its own address/location from where the call happens (call_site) and which function it calls ‘this_fn’:

call to __cyg_profile_func_enter

call to __cyg_profile_func_enter

In a similar way, it calls __cyg_profile_func_exit() at the end of the function to record the function exit:

call to __cyg_profile_func_exit

call to __cyg_profile_func_exit

Custom Recording Functions

All what I need to do is to implement a recording in the enter and exit functions. I’m using the following implementation:

/*!
 * \brief Function which is called upon function enter. The function call is inserted by the compiler.
 * \param this_fn Address of the caller function.
 * \param call_site Return address to the function which called this_fn
 */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site) {
  if (CYG_Enabled) {
    CYG_Store(TRUE, call_site, this_fn);
  }
}

/*!
 * \brief Function which is called upon function exit. The function call is inserted by the compiler.
 * \param this_fn Address of the caller function.
 * \param call_site Return address to the function which called this_fn
 */
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site) {
  if (CYG_Enabled) {
    CYG_Store(FALSE, call_site, this_fn);
  }
}

CYG_Enabled is a global flag to turn recording on and off. CYG_Store() is used to store the information into a ring buffer. Every element in the ring buffer needs 9 bytes (without alignment):

/*!
 * Element in ring buffer to store the trace information.
 */
typedef struct {
  bool isEnter; /*!< TRUE for __cyg_profile_func_enter(), FALSE for __cyg_profile_func_exit() */
  void *this_fn; /*!< address (with thumb bit) of the (caller) function */
  void *call_site; /*!< return address to the function which called this_fn */
} CYG_RNG_ElementType;

Function Names

Because only addresses are recorded, the data does not contain any symbol/function names. So a trace would look like this:

{ 00000E88->00000DA0
} 00000E88<-00000DA0
{ 00000E8C->00000D40
 { 00000D62->00000CE8
  { 00000D0E->00000C60
  } 00000D0E<-00000C60
  { 00000D16->00000CA0
   { 00000CC6->00000C60
   } 00000CC6<-00000C60
  } 00000D16<-00000CA0
 } 00000D62<-00000CE8
 { 00000D62->00000CE8

To produce a more verbose trace with function names, one way is that you look up the linker map file and determine the function names. Because this is very time-consuming, I added support so the application itself provides information about the function names. For this, the application can provide decoding function like this

#if CYG_FUNC_TRACE_NAMES_ENABLED
__attribute__((no_instrument_function))
const char *DEMO_getFuncName(void *addr) {
  if (addr==calcValue) {
    return "calcValue";
  } else if (addr==decide) {
    return "decide";
  } else if (addr==getValue) {
    return "getValue";
  } else if (addr==DEMO_Run) {
    return "DEMO_Run";
  } else if (addr==DEMO_Init) {
    return "DEMO_Init";
  }
  return "";
}
#endif /* __FUNC_TRACE_ENABLED */

To disable function name decoding the following macro is used which is either 0 (disabled) or 1 (enabled):

CYG_FUNC_TRACE_NAMES_ENABLED

The decoder functions are listed in a table which is used by the decoder:

/*! Array of callbacks/function pointers to find the names for the functions */
static CYG_GetFuncNameFct GetFuncNames[] = {
  getFuncName,
  /* list all your decoder/function name functions here... */
  DEMO_getFuncName
};

Which are used by the decoder to determine the function names. With this the decoder is able to extend the information with function names, and can use ??? if the function is unknown:

{ 00000E88->00000DA0 ???->DEMO_Init
} 00000E88<-00000DA0 ???<-DEMO_Init
{ 00000E8C->00000D40 ???->DEMO_Run
 { 00000D62->00000CE8  DEMO_Run:0x0022->decide
  { 00000D0E->00000C60  decide:0x0026->calcValue
  } 00000D0E<-00000C60  decide:0x0026<-calcValue
  { 00000D16->00000CA0  decide:0x002E->getValue
   { 00000CC6->00000C60  getValue:0x0026->calcValue
   } 00000CC6<-00000C60  getValue:0x0026<-calcValue
  } 00000D16<-00000CA0  decide:0x002E<-getValue
 } 00000D62<-00000CE8  DEMO_Run:0x0022<-decide
 { 00000D62->00000CE8  DEMO_Run:0x0022->decide

Demo Application and Source Code

A demo application using the FRDM-KL25Z board and Kinetis Design Studio is available on GitHub here:

https://github.com/ErichStyger/mcuoneclipse/tree/master/Examples/KDS/FRDM-KL25Z/FRDM-KL25Z_FuncTrace

Call

CYG_Init();

to initialize the ring buffer. To dump the data collected on the console use

CYG_PrintCallTrace();

Summary

The solution presented here is based on open source GNU tools. The gcc compiler has an option to insert calls to function enter and exit functions which can be used for my own recoreding. All what it requires is little RAM on the target board to store the data, and a way to get the data off the target (e.g. with the debugger or with a serial connection. One can say it is the ‘poor developers trace’, but in finding some strange errors or crashes any breadcrumbs are more than welcome :-).

The project presented here is available on GitHub (for Kinetis Design Studio) here:
https://github.com/ErichStyger/mcuoneclipse/tree/master/Examples/KDS/FRDM-KL25Z/FRDM-KL25Z_FuncTrace

Happy Tracing 🙂

Links:

10 thoughts on “Poor Man’s Trace: Free-of-Charge Function Entry/Exit Trace with GNU Tools

  1. One thing I’ve done is added some code to the fault handler to pull the return address off the stack and save it in a no init section. On start up I pull that out and log it. The nice thing is with the return address you can use addr2line to find the code location where the error happened.

    I also have a CriticalError() function that does the same thing, records the location and any information needed as a hint, and then resets the uP.

    This has been so useful it’s ridiculous.

    One other thought I’ve had but never followed up on is boards that have external eerprom. If there is a fault, copy the ram into a section in eeprom. And then have a function that pulls that information out and copies it back into ram. The idea is to be able to do an old time core dump.

    Like

    • Yes, such a thing helps a lot too. I’m using a few applications which have an external EEPROM available. But what I have found out is that in case of a crash, I usually cannot write any more to the EEPROM (e.g. because the connection/I2C to it is the problem). So in case of a problem, be aware that because of that problem there is not much you can do (except to reset with a watchdog).

      Like

    • Not out of the box, and it would not be easy. This method only inserts calls to special functions. But if you know what parameters are passed (and where they are), you could trace them. So technically possible, but I’m not aware of anyone who did that.

      Like

    • You can turn this off with __attribute__((no_instrument_function)). The other way round you define a macro you place into your code, and then turn it on/off depending on your needs. Basically a ‘instrument_function’ version of it.

      Like

What do you think?

This site uses Akismet to reduce spam. Learn how your comment data is processed.