Debugging ARM Cortex-M Hard Faults with GDB Custom Command

In “A Processor Expert Component to Help with Hard Faults” I used a C handler with some assembly code, created with Processor Expert, to help me debug hard faults on ARM Cortex-M. Inspired by a GNU gdb script, I now have an alternative. Because this approach only uses GDB commands, it works both with an Eclipse GUI and with GDB in plain command-line mode :-).

GDB script to debug ARM Hard Faults

The idea is:

  1. Set a breakpoint in the hard fault exception handler
  2. When a hard fault occurs, the CPU will call the hard fault exception handler, and the debugger will stop the target
  3. Execute the ‘armex’ (ARM Exception) script/command in GDB to dump the stacked registers, which show the program counter where the problem happened.
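
The stacked registers in step 3 are the eight words the core pushes onto the active stack on exception entry. As a rough sketch in C (the type and function names are made up for illustration, and it assumes the basic frame without floating-point state), the layout looks like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Basic Cortex-M exception stack frame, lowest address first.
 * stacked_frame_t and read_frame() are hypothetical names for this sketch. */
typedef struct {
    uint32_t r0;             /* offset  0 */
    uint32_t r1;             /* offset  4 */
    uint32_t r2;             /* offset  8 */
    uint32_t r3;             /* offset 12 */
    uint32_t r12;            /* offset 16 */
    uint32_t lr;             /* offset 20: R14 of the interrupted code */
    uint32_t return_address; /* offset 24: PC to resume at */
    uint32_t xpsr;           /* offset 28 */
} stacked_frame_t;

/* Copy the eight words found at the active stack pointer into a frame. */
stacked_frame_t read_frame(const uint32_t *sp)
{
    stacked_frame_t frame;
    assert(sp != NULL);
    memcpy(&frame, sp, sizeof frame);
    return frame;
}
```

On the target, sp would be the MSP or PSP value at the time the handler was entered. On a host, the same function can be fed a captured memory dump.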

.gdbinit Script

There are several ways to extend GDB with your own commands. One easy way is to add the extra functions to the .gdbinit script, which GDB loads on startup.

I have added the following to my .gdbinit file to define my ‘armex’ command:

define armex
  printf "EXC_RETURN (LR):\n"
  info registers $lr
  if (($lr & 0x4) == 0x4)
    printf "Uses PSP 0x%x return.\n", $psp
    set $armex_base = $psp
  else
    printf "Uses MSP 0x%x return.\n", $msp
    set $armex_base = $msp
  end

  printf "xPSR            0x%x\n", *($armex_base+28)
  printf "ReturnAddress   0x%x\n", *($armex_base+24)
  printf "LR (R14)        0x%x\n", *($armex_base+20)
  printf "R12             0x%x\n", *($armex_base+16)
  printf "R3              0x%x\n", *($armex_base+12)
  printf "R2              0x%x\n", *($armex_base+8)
  printf "R1              0x%x\n", *($armex_base+4)
  printf "R0              0x%x\n", *($armex_base)
  printf "Return instruction:\n"
  x/i *($armex_base+24)
  printf "LR instruction:\n"
  x/i *($armex_base+20)
end

document armex
ARMv7 Exception entry behavior.
xPSR, ReturnAddress, LR (R14), R12, R3, R2, R1, and R0
end
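
GDB evaluates expressions with C operator rules, so the stack selection at the top of armex can be written down in C as well. The following is only a sketch of how I read the EXC_RETURN bits on ARMv7-M; the macro and function names are made up for illustration. It also shows why the mask needs its own parentheses: == binds tighter than &.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* EXC_RETURN bits (ARMv7-M); the macro names are hypothetical. */
#define EXC_RETURN_SPSEL  (UINT32_C(1) << 2) /* 1: frame on PSP, 0: frame on MSP */
#define EXC_RETURN_THREAD (UINT32_C(1) << 3) /* 1: return to Thread mode, 0: Handler mode */
#define EXC_RETURN_FTYPE  (UINT32_C(1) << 4) /* 1: basic frame, 0: extended frame with FP state */

/* Bits [31:28] of an EXC_RETURN value are all ones. */
bool is_exc_return(uint32_t lr)
{
    return (lr >> 28) == 0xFu;
}

/* Correct test: mask first, then compare. */
bool frame_on_psp(uint32_t exc_return)
{
    assert(is_exc_return(exc_return));
    return (exc_return & EXC_RETURN_SPSEL) == EXC_RETURN_SPSEL;
}

/* How "lr & 0x4 == 0x4" is parsed: the comparison yields 1, so this tests bit 0. */
bool frame_on_psp_misparsed(uint32_t exc_return)
{
    return (exc_return & (0x4 == 0x4)) != 0;
}

/* True when the exception interrupted Thread mode rather than another handler. */
bool returns_to_thread_mode(uint32_t exc_return)
{
    return (exc_return & EXC_RETURN_THREAD) != 0;
}

/* True when the stacked frame also carries floating-point state. */
bool has_fp_frame(uint32_t exc_return)
{
    return (exc_return & EXC_RETURN_FTYPE) == 0;
}
```

frame_on_psp_misparsed() returns true for both 0xfffffff9 and 0xfffffffd, because bit 0 is set in both, so it cannot tell the two stacks apart. frame_on_psp() returns false for 0xfffffff9 (MSP) and true for 0xfffffffd (PSP, as typically seen with an RTOS).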

You can place the .gdbinit file anywhere. I have placed it where my gdb is located inside the Freescale Kinetis Design Studio (C:\Freescale\KDS_3.0.0\toolchain\bin).

To make sure GDB finds the .gdbinit, I specify the path to it in the Eclipse workspace preferences:

GDB Command File in Eclipse Workspace Preferences

Debugging Hard Fault

To debug a hard fault, I set a breakpoint in my hard fault interrupt handler to stop the debugger when the fault happens:

stopped on hard fault

To find out where the problem occurred, I now use the ‘armex’ command in the gdb console:

💡 Use the ‘triangle’ menu of the console to switch to the arm-none-eabi-gdb view

armex command in gdb console

The armex command lists the stacked registers (same as with my handler shown in “Debugging Hard Faults on ARM Cortex-M”). The important information is either the return instruction or the LR instruction. I can enter that address in the disassembly view to find out where the problem happened:

Disassembly View of Hard Fault Reason

In the above example, the stacked LR (Link Register) was 0xbd2 (0xbd3 with the Thumb bit set). It holds the return address of the last call, so in the disassembly view the problem must be just before that address. Checking the assembly code there, I find an indirect branch through a register:

blx r3

The stacked register shows

R3              0x0

Branching to address 0 is what causes the hard fault. If the problem is not that clear, simply set a breakpoint around that location and restart the application to debug what happens before the hard fault is triggered. With this, it should hopefully be easy to find and fix the problem.
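
The arithmetic behind this reading of the dump is small enough to sketch in C. The function names are made up for illustration, and the call-site calculation assumes that the stacked LR was written by a 16-bit BLX <register> instruction such as the blx r3 above, and that it has not been overwritten since:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Cortex-M executes Thumb code only, so a branch target held in a register
 * must have bit 0 set. */
bool is_thumb_target(uint32_t code_pointer)
{
    return (code_pointer & 1u) != 0;
}

/* Address shown in the disassembly view for a code pointer (Thumb bit cleared). */
uint32_t instruction_address(uint32_t code_pointer)
{
    return code_pointer & ~UINT32_C(1);
}

/* Address of a 16-bit "blx <reg>" that produced the given link register value. */
uint32_t blx_reg_call_site(uint32_t stacked_lr)
{
    assert(is_thumb_target(stacked_lr));
    return instruction_address(stacked_lr) - 2u;
}
```

For the dump above, instruction_address(0xbd3) gives 0xbd2, blx_reg_call_site(0xbd3) gives 0xbd0, and is_thumb_target(0x0) is false, which matches a blx r3 with R3 being 0.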

Summary

I have now yet another way to debug my hard faults: using my custom gdb command to dump the stacked registers. The advantage of this approach is that it does not need any additional resources on the target (no extra handler in the code and no variables), compared to my earlier solution. And the added benefit is now that I know how to extend GDB with my custom commands :-).

Happy Faulting 🙂

47 thoughts on “Debugging ARM Cortex-M Hard Faults with GDB Custom Command”

  1. Hello Eric

    Try a hard fault like

    void hard_fault(void)
    {
    }

    and set a break point at the { (which is the return).

    Then, when the fault occurs you step out of the routine (in disassembler view) and you should be at the line of code that faulted with all register content as it was, showing bad pointers etc., without needing to dump any registers.

    Regards

    Mark

    • Hi Mark,
      unless I’m really missing the obvious: how are you able to step out of a hard fault handler? Yes, I can do this for normal interrupts, but not for a hard fault. So how do you recover from a hard fault?

      • Eric

        A hard fault interrupt will simply return like any other interrupt, then the error repeats and the exception enters the hard fault handler again. It then returns, faults again, and so on, until the watchdog fires.

        This means that hard faults are usually very easy to debug because you simply disable the watchdog, and then, when it happens, it results in an endless loop (hard fault, hard fault exception, return, hard fault, hard fault exception, return, hard fault, hard fault except…) and you can then connect the debugger (without resetting), pause and see the instruction causing the problem. As long as the watchdog doesn’t cause a reset, a system can be started without the debugger and left to run (for as long as it takes) until such a fault occurs – then carefully connect the debugger without causing a reset, pause, see the problem and solve it. In some cases it is even possible to manually correct a pointer or such and let the system recover (until it maybe fails again later).

        This works well with IAR as long as you step out of the hard fault in disassembly mode (not in source code view, since the debugger gets confused), and I think that KDS manages it too.

        Regards

        Mark

  2. Additional comment – it is only the core error that is unrecoverable (eg. a hard fault with an invalid handler), which will result in a reset.

    Regards

    Mark

  3. Pingback: Tutorial: Adafruit WS2812B NeoPixels with the Freescale FRDM-K64F Board – Part 5: DMA | MCU on Eclipse

  4. Hi Erich Styger,
    In my application, the hard fault happens before even entering main. In the disassembly view I get the address, but it does not show the C source.
    My question is: how should we debug startup and initialization code before main? Can we put a breakpoint before main?

  5. I’m trying to debug a hard fault on a KE06 MCU in KDS in Windows. Is there any chance that you could help with that? The PE Hard Fault component wants to be in the .text portion of memory, and my project doesn’t have that. When I define it, it doesn’t like the leading period.

    I then tried the steps here, but I can’t create a .gdbinit because Windows also doesn’t like the preceding period. The menus are all different in KDS, too.

    These look like very helpful tips to help with hard faults, but sadly they are difficult to implement in KDS.

    • Hi Dan,
      I’m using KDS (v3.x) too. And the screenshots in this post are from KDS too, so I’m not sure why your menus etc. are different?
      Yes, you cannot rename a file to a name with a dot at the beginning in Windows Explorer :-(. Simply use the DOS shell (command prompt) for this.
      As for “The PE Hard Fault component wants to be in the .text portion of memory, and my project doesn’t have that.”, I’m not sure what your problem is either. I have created a Processor Expert project (no SDK!) for the KE06Z128, and I can add the HardFault component without any problems. And the component has no requirement on the text section or section name, as far as I can tell. What am I missing?
      I hope that this is of some help?

  6. Pingback: Flashing many ARM Boards without a Host PC | MCU on Eclipse

  7. Thanks for this information. There are a couple problems: gdb parses “$lr & 0x4 == 0x4” as “$lr & (0x4 == 0x4)” which tests the wrong condition. With the common EXC_RETURN value 0xfffffff9 this obscured the fact that when “$lr & 0x04” is non-zero the active stack is PSP, not MSP. The correct check to handle all return modes is:

    if (0x04 == ($lr & 0x04))
    set $armex_base = $psp
    else
    set $armex_base = $msp
    end

    • hi pabigot,
      indeed, thanks for pointing this out. I have used this approach mostly for bare metal applications, so did not see that bug. I have now updated the code, thank you!
      Erich

    • Hi Peter,
      interesting approach! I quickly tried setting the SP based on MSP or PSP in a FreeRTOS thread, but somehow it did not show the stack for me. Not sure what was wrong. I have added your code sequence to my HardFault handler, but have it disabled for now. I’m using the GDB which is in KDS V3.2.0 (GNU gdb (GNU Tools for ARM Embedded Processors) 7.6.0.20140731-cvs). Are you using a different version?

      • I extracted the KDS 3.2.0 installer and used the bundled gdb (7.6.0, as you said).

        but it also displays the thread when the hardfault handler is active.

      • I did a build run using the gcc toolchain supplied with KDS 3.2

        Now the call stacks with the Segger Thread Awareness are not working anymore, and I also don’t see the call stack when I trigger an intentional hard fault.

      • I just compiled with the gcc launchpad toolchain.
        The stack view also works, but only when I use gdb 7.8.
        The bundled 7.10 does not play well with my GNU ARM Eclipse Plugin or Segger GDBServer.

    • Hi Marco,
      .gdbinit is a text file with gdb commands and settings in it. You cannot create a file with a dot at the beginning in Windows Explorer. Use the Windows command prompt (cmd.exe) for this, e.g. create a dummy.txt and then rename it with ‘rename dummy.txt .gdbinit’
      I hope this helps,
      Erich

      • Thank you, Erich

        I followed your advice and now my .gdbinit file is created. I still haven’t faced a Hard Fault condition to test your tool, but sometimes such a situation occurs to some customers and it is hard to find out what caused the Hard Fault condition. And this certainly will be a great tool to help me in those cases.

        • Hi Marco,
          it depends a bit on the core you are using, but common things to trigger a hard fault are for example
          *(int*)0x0 = 1;
          e.g. writing to read only memory.

        • Hi, Erich

          Thanks for your response. Such an instruction really causes a Hard Fault condition, but after typing the command “armex” in Console, I received the following:

          armex
          EXEC_RETURN (LR):
          Value can’t be converted to integer.
          lr 0xfffffff9 -7
          Uses MSP 0x

          I don’t know what could be causing such a problem. Can you throw some light on that, please?

          I tested with a firmware in KDS 3.2 and FreeRTOS on a FRDM-K64F kit.

        • Hi Marco,
          ‘armex’ does not ring a bell for me. Which console are you talking about?
          But after a hard fault you have to reboot the device.

        • Hi, Erich

          The Console I am using is “[GDB PEMicro Interface Debugging] arm-none-eabi-gdb”. And the interface used is the on board P&E Open-SDA built on FRDM-K64F kit.
          I just set a breakpoint at the beginning of “HardFault_Handler”, ran the code with that illegal memory access and received the error message I described in my previous message.

        • Hi Marco,
          I suggest giving that special HardFault handler a chance: for me it solved nearly every hard fault problem because it pointed out the location where it happened.

  8. Hi, Erich

    I don’t know why “armex” didn’t work for me.

    I will try the other Hard Fault debugging solution you posted before. In that solution (Simple PC Handler), you suggest to replace the default code with:

    __asm volatile (
    " movs r0,#4 \n"
    " movs r1, lr \n"
    " tst r0, r1 \n"
    " beq _MSP \n"
    " mrs r0, psp \n"
    " b _HALT \n"
    "_MSP: \n"
    " mrs r0, msp \n"
    "_HALT: \n"
    " ldr r1,[r0,#20] \n"
    " bkpt #0 \n"
    );

    In my code (that uses KSDK 2 and FreeRTOS, but not Processor Expert), my Hard Fault Handler is already in assembly language and looks like this:

    HardFault_Handler:

    ldr r0,=HardFault_Handler
    bx r0
    .size HardFault_Handler, . - HardFault_Handler

    .align 1
    .thumb_func
    .weak SVC_Handler
    .type SVC_Handler, %function
    SVC_Handler:
    ldr r0,=SVC_Handler
    bx r0
    .size SVC_Handler, . - SVC_Handler

    .align 1
    .thumb_func
    .weak PendSV_Handler
    .type PendSV_Handler, %function

    My question: Is it necessary to change something or add more code in addition to yours? Sorry for the question. I’m not very used to assembly language, especially on ARM MCUs.

    • Hi Marco,
      your handlers basically do nothing. So you won’t easily get the information you need (what is causing the hard fault). The handler discussed in this article does extra work, like getting the values from the stack and storing them in local variables so they can be easily inspected using the debugger. Of course if you are able to do that manually (looking at the stack and memory), you don’t need that, but this might not be an easy task for everyone. So if you would like to see the information in the debugger, then you should use such a handler.

      • Hi, Erich again

        I just noticed that you left a comment above the Extended Handler code that says: “This is called from the HardFault_HandlerAsm with a pointer the Fault stack as the parameter”. Can you please explain to me in more detail how I should make that call in HardFault_HandlerAsm? I suppose “HardFault_HandlerAsm” is the same function we find in the “startup_MK64F12.S” file, isn’t it?

        Thanks!

      • Hi, Erich

        I just replaced the code of HardFault Handler(ASM) with the one you suggested, just removing the quotation marks and put “HardFault_HandlerC” function in my main.c file. It worked perfectly!

        It really is such a great HardFault Handler debugging tool. The best (maybe the only one) I’ve ever seen!

        Thank you very much!

        • Great! For myself, these kinds of things are maybe hard to manage the first time, but it is such a great learning experience (I hope it is the same for you). So keep going that way!

  9. For those having problems with ARMEX, my version of the GDB (GNU Tools for Arm Embedded Processors 7-2017-q4-major 8.0.50.20171128-git ) uses $msp and $psp instead of the upper case $MSP and $PSP. Changing to lower case definitions of these fixes the armex command for me.

      • Yes, changing to all lower case works for me.

        The error ” Value can’t be converted to integer.” people keep getting appears because p $MSP evaluates to void (apparently like any other unknown variable/register) which cannot be converted to integer.

        p $msp evaluates to the correct value of the register, and if I send GDB “info registers” it prints all the registers with lower case names, so logically they should always be referred to in lower case?

        • Thanks for the additional explanation, this absolutely makes sense. Are you going to contribute that back to the gdb sources or have you already done that?

        • Thanks!
          It’s not an issue with the GDB sources I think, it’s with the .gdbinit you give. I needed to change all instances of $MSP, $PSP to $msp, $psp, e.g.:

          if ($lr & (0x4 == 0x4))
          printf "Uses MSP 0x%x return.\n", $MSP
          set $armex_base = $MSP

          Should be

          if ($lr & (0x4 == 0x4))
          printf "Uses MSP 0x%x return.\n", $msp
          set $armex_base = $msp

        • Hi Chrizlax & Erich,
          Thanks! It works. Changing all the $MSP and $PSP to lowercase does solve the “Value can’t be converted to integer.” issue.

          Now I have more information on the underlying issue. Could someone help me shed some light on this armex report?

          “PE_ISR(Unhandled_ivINT_Hard_Fault);
          PE_ISR(Unhandled_ivINT_Hard_Fault)
          {
          PE_DEBUGHALT();
          }”

          armex
          EXEC_RETURN (LR):
          lr 0xffffffe9 4294967273
          Uses MSP 0x2002fea8 return.
          xPSR 0x5e4f
          ReturnAddress 0xa
          LR (R14) 0x0
          R12 0x5e6c
          R3 0x2002ffbe
          R2 0x0
          R1 0x2
          R0 0x2002feb0
          Return instruction:
          0xa : movs r0, r0
          LR instruction:
          0x0 : movs r0, r0

          Sincerely,
          Vu

        • So your application has caused a hard fault (see https://mcuoneclipse.com/2012/11/24/debugging-hard-faults-on-arm-cortex-m/ and https://mcuoneclipse.com/2012/12/28/a-processor-expert-component-to-help-with-hard-faults/).
          But your register dump does not give enough information to track down the issue (LR does not show usable information). You have to debug and step through your code to find out where it happens.
          It could be an illegal access to memory or accessing a peripheral which is not clocked.
