GNU Static Stack Usage Analysis

Stack overflows are a big problem: if I see a system crash, the first thing I usually try is to increase the stack size to see if the problem goes away. The GNU linker can check whether my global variables fit into RAM, but it cannot know how much stack I need. So how cool would it be to have a way to find out how much stack I need?

Static Stack Usage Analysis with GNU

And indeed, this is possible with the GNU tools (I’m using it with the GNU ARM Embedded (launchpad) 4.8 and 4.9 compilers :-)), but it seems that this ability is not widely known.


One approach I have used for a very long time is:

  1. Fill the memory of the stack with a defined pattern.
  2. Let the application run.
  3. Check with the debugger how much of that stack pattern has been overwritten.

That works pretty well, except that it is very empirical: it only shows the stack used by the code paths that actually ran. What I need is some numbers from the compiler to have a better view.
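Step 3 of the pattern approach can also be done offline on a memory dump of the stack region, instead of by eye in the debugger. The following is a minimal Python sketch and not part of the original workflow. It assumes a descending stack (as on ARM Cortex-M), a dump that starts at the lowest stack address, and a fill pattern of 0xA5A5A5A5; the function name and the pattern are made up for illustration.

```python
def stack_used_bytes(dump: bytes, pattern: bytes = b"\xA5\xA5\xA5\xA5") -> int:
    """Estimate the used stack size from a dump of the painted stack region.

    Assumption: the stack grows downwards, so the untouched fill pattern
    survives at the low-address end (the start) of the dump.
    """
    word = len(pattern)
    untouched = 0
    # Walk up from the lowest address until the first overwritten word.
    for offset in range(0, len(dump) - word + 1, word):
        if dump[offset:offset + word] != pattern:
            break
        untouched += word
    return len(dump) - untouched


# Example: 24 bytes still painted, 8 bytes overwritten by the application.
example_dump = b"\xA5\xA5\xA5\xA5" * 6 + b"\x01\x02\x03\x04" * 2
print(stack_used_bytes(example_dump))  # 8
```

If the result equals the size of the whole region, the pattern is gone completely and the stack has probably overflowed.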

In this article I present an approach that uses the GNU tools plus a Perl script to report the stack usage of the application.

GNU -fstack-usage Compiler Option

The GNU compiler suite has an interesting option: -fstack-usage

“A unit compiled with -fstack-usage will generate an extra file that specifies the maximum amount of stack used, on a per-function basis. The file has the same basename as the target object file with a .su extension.”

If I add that option to the compiler settings, there is now a .su (Stack Usage) file together with each object (.o) file:

Stack Usage File

The files are simple text files like this:

main.c:36:6:bar    48    static
main.c:41:5:foo    88    static
main.c:47:5:main    8    static

It lists the source file (main.c), the line (36) and column (6) position of the function, the function name (bar), the stack usage in bytes (48) and the allocation (static, this is the normal case).
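To make the field layout concrete, here is a small Python sketch that splits one line of a .su file into its parts. It is only an illustration under a few assumptions: plain C function names without spaces, and source file names without a colon in them (so no Windows drive letters). The names SuEntry and parse_su_line are hypothetical.

```python
from typing import NamedTuple


class SuEntry(NamedTuple):
    file: str
    line: int
    column: int
    function: str
    size: int        # stack usage in bytes
    qualifier: str   # e.g. "static"


def parse_su_line(text: str) -> SuEntry:
    """Parse one line such as 'main.c:36:6:bar    48    static'."""
    # The last two whitespace-separated fields are the size and the qualifier.
    location, size, qualifier = text.rsplit(None, 2)
    # The location part is file:line:column:function.
    file, line, column, function = location.split(":", 3)
    return SuEntry(file, int(line), int(column), function, int(size), qualifier)


entry = parse_su_line("main.c:36:6:bar    48    static")
print(entry.function, entry.size)  # bar 48
```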

Creating Stack Report

While the .su files already are a great source of information on a file/function basis, how do I combine them to get the full picture? I have found a Perl script developed by Daniel Beer.

From the original script, you might need to adapt the $objdump and $call_cost. With $objdump I specify the GNU objdump command (make sure it is present in the PATH) and $call_cost is a constant value added to the costs for each call:

my $objdump = "arm-none-eabi-objdump";
my $call_cost = 4;

Call the script with the list of object files, e.g. ./Debug/Sources/main.o ./Debug/Sources/application.o

💡 You need to list all the object files: the script does not have a feature to use all the .o files in a directory. I usually put the call to the Perl script into a batch file which I call from a post-build step (see “Executing Multiple Commands as Post-Build Steps in Eclipse”).
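Since the object files have to be listed by hand, a small wrapper can collect them instead. The Python sketch below is one possible way to do it, not something the Perl script provides: it walks a build folder for .o files and builds the command line. The function names are hypothetical, and the script path is passed in by the caller because it depends on where the Perl script was saved.

```python
from pathlib import Path


def collect_objects(build_dir):
    """Return all .o files below build_dir, sorted for a stable order."""
    return sorted(str(path) for path in Path(build_dir).rglob("*.o"))


def build_command(script, build_dir, perl="perl"):
    """Build the argument list: perl <script> <all object files>."""
    return [perl, script, *collect_objects(build_dir)]
```

The resulting list could then be handed to subprocess.run() from a post-build step, for example with build_dir set to ./Debug.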

This generates a report like this:

  Func                               Cost    Frame   Height
> main                                176       12        4
  foo                                 164       92        3
  bar                                  72       52        2
> INTERRUPT                            28        0        2
  __vector_I2C1                        28       28        1
  foobar                               20       20        1
R recursiveFunct                       20       20        1
  __vector_UART0                       12       12        1

Peak execution estimate (main + worst-case IV):
  main = 176, worst IV = 28, total = 204
  • The function names with a ‘>’ in front are ‘root’ functions: they are not called from anywhere else (maybe I have not passed all the object files, or they are really not used).
  • If the function is recursive, it is marked with ‘R’. The cost estimate will be for a single level of recursion.
  • Cost shows the cumulative stack usage (this function plus all the callees).
  • Frame is the stack size used as in the .su file, including $call_cost constant.
  • Height indicates the number of call levels which are caused by this function.
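The relation between the three columns can be sketched in a few lines of Python. This is not the Perl script's code, only an illustration of the rule that a function's cost is its own frame plus the most expensive callee, with recursion counted for a single level. The call chain main → foo → bar → foobar is an assumption inferred from the numbers in the report above, and the function name is hypothetical.

```python
# Frame sizes from the report above (they already include $call_cost).
FRAMES = {"main": 12, "foo": 92, "bar": 52, "foobar": 20, "recursiveFunct": 20}

# Assumed call graph: caller -> list of callees.
CALLS = {
    "main": ["foo"],
    "foo": ["bar"],
    "bar": ["foobar"],
    "recursiveFunct": ["recursiveFunct"],
}


def cost_and_height(name, frames, calls, active=frozenset()):
    """Return (cost, height) for a function.

    cost   = own frame + largest cost among the callees
    height = 1 + largest height among the callees
    A function already on the current call path contributes nothing,
    so recursion is only counted for one level.
    """
    if name in active:
        return 0, 0
    active = active | {name}
    worst_cost = worst_height = 0
    for callee in calls.get(name, ()):
        cost, height = cost_and_height(callee, frames, calls, active)
        worst_cost = max(worst_cost, cost)
        worst_height = max(worst_height, height)
    return frames.get(name, 0) + worst_cost, 1 + worst_height


print(cost_and_height("main", FRAMES, CALLS))  # (176, 4)
```

With these inputs the sketch reproduces the Cost and Height values of main, foo, bar, foobar and recursiveFunct shown in the report.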

Notice the INTERRUPT entry: it is the amount of stack needed by the interrupts. The tool assumes non-nested interrupts: it adds the worst-case Interrupt Vector (IV) stack usage to the peak execution estimate:

Peak execution estimate (main + worst-case IV):
  main = 176, worst IV = 28, total = 204

What is counted as interrupt routine is controlled by this part in the Perl script, so every function starting with __vector_ is treated as interrupt routine:

# Create fake edges and nodes to account for dynamic behaviour.
$call_graph{"INTERRUPT"} = {};

foreach (keys %call_graph) {
    $call_graph{"INTERRUPT"}->{$_} = 1 if /^__vector_/;
}
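The peak estimate itself is simple arithmetic. As a hedged illustration (in Python rather than Perl, with a hypothetical function name), the sketch below selects the interrupt routines by the same __vector_ prefix and adds the most expensive one to the cost of main, assuming non-nested interrupts as the tool does.

```python
def peak_estimate(costs, main="main", isr_prefix="__vector_"):
    """Return (main cost, worst interrupt cost, total).

    costs maps function names to their cumulative stack cost in bytes.
    Interrupts are assumed not to nest, so only the worst one is added.
    """
    worst_isr = max(
        (cost for name, cost in costs.items() if name.startswith(isr_prefix)),
        default=0,  # no interrupt routines found
    )
    return costs[main], worst_isr, costs[main] + worst_isr


# Costs taken from the report above.
report = {"main": 176, "foo": 164, "__vector_I2C1": 28, "__vector_UART0": 12}
print(peak_estimate(report))  # (176, 28, 204)
```

If the interrupt routines in a project follow a different naming scheme, the prefix has to be adapted, just like the regular expression in the Perl script.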

Assembly Code

If I have inline assembly and assembly code in my project, then the compiler is not able to report the stack usage. These functions are reported with ‘zero’ stack usage:

  Func                               Cost    Frame   Height
> HF1_HardFaultHandler                  0        0        1

The compiler will warn me about it:

stack usage computation not supported for this target

💡 I have not found a way to provide that information to the compiler in the source.

RTOS Tasks

The tool works nicely and out of the box for tasks in an RTOS-based system (e.g. FreeRTOS). With the tool I get a good estimate of each task’s stack usage, but I need to add the interrupt stack usage to that value:

  Func                               Cost    Frame   Height
> ShellTask                           712       36       17

-Wstack-usage Warning

Another useful compiler option is -Wstack-usage. With this option (given as -Wstack-usage=<limit in bytes>) the compiler will issue a warning whenever the stack usage of a function exceeds the given limit.

Option to warn about stack usage

That way I can quickly check which functions are exceeding a limit:

stack usage warning
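A similar check can be approximated offline from the .su files, for example to try out different limits without recompiling. The Python sketch below is only an illustration with a hypothetical function name. It assumes the .su layout shown earlier (location, size, qualifier) and lists every function whose stack usage is above the limit, largest first.

```python
def functions_over_limit(su_text, limit):
    """Return [(function, bytes), ...] for entries above limit, largest first."""
    hits = []
    for line in su_text.splitlines():
        fields = line.rsplit(None, 2)
        if len(fields) != 3:
            continue  # skip blank or incomplete lines
        location, size, _qualifier = fields
        function = location.split(":", 3)[-1]
        if int(size) > limit:
            hits.append((function, int(size)))
    return sorted(hits, key=lambda item: item[1], reverse=True)


su_text = """main.c:36:6:bar    48    static
main.c:41:5:foo    88    static
main.c:47:5:main    8    static
"""
print(functions_over_limit(su_text, 40))  # [('foo', 88), ('bar', 48)]
```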


The GNU compiler suite comes with the very useful option -fstack-usage, which produces a text file for each compilation unit (source file) listing the stack usage. These files can be processed further, and I’m using the great Perl script created by Daniel Beer (Thanks!). With the presented tools and techniques, I get an estimate of the stack usage upfront. I’m aware that this is an estimate only, that recursion is only counted at a minimum level, and that assembly code is not counted in. I might extend the Perl file to scan folders for all the object files in them, unless someone already did this? If so, please post a comment and share :-).

Happy Stacking 🙂

UPDATE 24-Aug-2015: For all the C++ users: Daniel Beer has updated his article on


28 thoughts on “GNU Static Stack Usage Analysis”

  1. This looks very useful. Have you compared the results from the GNU stack usage option method against the empirical approach?


  2. Thanks for the tip.
    If you don’t want to type in all of the object files you can do something like this: `find Release -name '*.o'`

    If you are doing this in Eclipse then you would replace Release with the path to your Release (or Debug directory).

    Also as an aside, I was running on Cygwin where it promptly died because of a carriage return that was being generated by objdump.

    The solution was just to remove any carriage returns in the calling function:

    Around line 94:
    if (/: R_[A-Za-z0-9_]+_CALL[ \t]+(.*)/) {
        my $t = $1;

        $t =~ s/\r//g; # New -> remove carriage returns

        if ($t eq ".text") {
            $t = "\@$objfile";
        } elsif ($t =~ /^\.text\+0x(.*)$/) {
            $t = "$1\@$objfile";
        }

        $call_graph{$source}->{$t} = 1;
    }


  3. Thank you very much for this post, Erich!
    Daniel Beer says ” .. This is calculated for each function as the maximum stack usage of any of its callees, plus its own stack frame, plus some call-cost constant (not included in GCC’s analysis).”
    How do we know what ‘-fstack-usage’ includes or not in its output? Is there some documentation about how these compiler options work? I was looking for that information but I haven’t found anything yet.
    Would you show your batch file where you call the perl script and pass the objects list?

    Thank you in advance!



    • Hi Alex,
      GCC knows the amount of stack needed from its internal data structures, which it fills while allocating the local and temporary variables. To really know what is included or not you need to check the disassembly, because it might differ from compiler to compiler. My finding is that it does not include the stack needed by the call instruction itself. This is not a big issue with ARM if the BL or BLX instruction is used, as the return address is in the link register and not pushed on the stack.
      The batch file calls the Perl script with the object files as arguments, something like this: ./Debug/Sources/main.o ./Debug/Sources/application.o
      Simply add your own object files to it.
      Have a look at the .bat file here:


  4. Hi Erich,
    I have a problem.
    The script doesn’t work correctly when I compile my code using gcc.
    The Height is always 1, and the call path is also wrong.
    Is there something I have to edit?
    Thank you for your work.


  5. Hi Erich,

    I used the Perl script with a batch file similar to yours, for a FreeRTOS application. It seems to work fine on an individual function basis. However, it does not tell me the stack usage of a function that calls other functions: my result file contains only ‘root’ entries with an initial ‘>’. The entries for my tasks, which call some functions, do not have ‘child’ entries like foo and bar in your example.
    So I can examine this further by hand, but it would be nice if the script did the job for me. Do you have any hint what is not working in my case?

    Best Regards


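  6. To pass the list of object files, you can create a Windows batch file like this:

    setlocal EnableDelayedExpansion
    set "OFILES="
    FOR /R %WORKSPACE%\Your_Debug_Folder\ %%G IN (*.o) DO (
        ECHO [Batch] Adding %%G to analysis
        set "OFILES=!OFILES! %%G"
    )
    REM Pass !OFILES! as arguments to the Perl script here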


  7. Pingback: New NXP MCUXpresso IDE v11.0 | MCU on Eclipse

  8. Hi Erich, thanks for the post.

    You mentioned that one way you used to check the stack usage is to:

    1) Fill the memory of the stack with a defined pattern.
    2) Let the application run.
    3) Check with the debugger how much of that stack pattern has been overwritten.

    I am trying to apply the stack painting technique to a simple recursive program (that can potentially overflow the stack) on an IMX6SX Sabre board simulated in Keil µVision5, or on an STM32F407VG board emulated with QEMU in Eclipse. I want to try it on a small program first and apply it to an office project if the results are good.

    You can find more about it in my questions on Stack Overflow.

    As a newbie to embedded programming I am finding it difficult to do this.

    Do you have any working code or an open source project that implements the stack painting technique to check the stack usage? If yes, can you please share the code?

    Thank you!

  9. Hi Erich, thanks for sharing! I understand this approach is only an approximation, but have you wondered how to deal with indirect calls when building the call graph? Function pointers could get passed around everywhere… It is a difficult problem, but I just wanted to know your thoughts. Thank you

