Debugging ARM Cortex-M Hard Faults with GDB Custom Command

Posted on July 5, 2015 by Erich Styger

In “A Processor Expert Component to Help with Hard Faults” I use a C handler with some assembly code, created with Processor Expert, to help me debug hard faults on ARM Cortex-M. Inspired by a GNU gdb script here, I now have an alternative way. Because this approach uses the GDB command line, it works both with an Eclipse GUI and with GDB in command-line-only mode :-).

The idea is:

Set a breakpoint in the hard fault exception handler
When a hard fault occurs, the CPU will call the hard fault exception handler, and the debugger will stop the target
Execute the ‘armex’ (ARM Exception) script/command in GDB to dump the stacked registers to show the program counter where the problem happened.

.gdbinit Script

There are several ways to extend GDB with your own commands. One easy way is to add the extra functions to the .gdbinit script, which is loaded by GDB on startup.

I have added the following to my .gdbinit file to define my ‘armex’ command:

define armex
  printf "EXC_RETURN (LR):\n"
  info registers $lr
  # EXC_RETURN bit 2 set: the exception frame is on PSP, otherwise on MSP
  if (($lr & 0x4) == 0x4)
    printf "Uses PSP 0x%x return.\n", $psp
    set $armex_base = $psp
  else
    printf "Uses MSP 0x%x return.\n", $msp
    set $armex_base = $msp
  end

  printf "xPSR            0x%x\n", *($armex_base+28)
  printf "ReturnAddress   0x%x\n", *($armex_base+24)
  printf "LR (R14)        0x%x\n", *($armex_base+20)
  printf "R12             0x%x\n", *($armex_base+16)
  printf "R3              0x%x\n", *($armex_base+12)
  printf "R2              0x%x\n", *($armex_base+8)
  printf "R1              0x%x\n", *($armex_base+4)
  printf "R0              0x%x\n", *($armex_base)
  printf "Return instruction:\n"
  x/i *($armex_base+24)
  printf "LR instruction:\n"
  x/i *($armex_base+20)
end

document armex
ARMv7 Exception entry behavior.
xPSR, ReturnAddress, LR (R14), R12, R3, R2, R1, and R0
end
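
To make the offsets used by the script easier to follow, here is a minimal C sketch of the same decoding, written so that it can run on a host PC against a copy of the stacked words. The type armex_frame_t and the function names are hypothetical and not part of GDB or any vendor SDK. The sketch assumes the basic eight-word frame that the core pushes on exception entry, and that bit 2 of EXC_RETURN selects PSP when set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Basic exception frame, lowest address first (hypothetical type name).
 * The field order mirrors the offsets 0..28 used by the armex script. */
typedef struct {
  uint32_t r0;             /* base + 0  */
  uint32_t r1;             /* base + 4  */
  uint32_t r2;             /* base + 8  */
  uint32_t r3;             /* base + 12 */
  uint32_t r12;            /* base + 16 */
  uint32_t lr;             /* base + 20 */
  uint32_t return_address; /* base + 24 */
  uint32_t xpsr;           /* base + 28 */
} armex_frame_t;

/* Pick the stack pointer that holds the frame: bit 2 of EXC_RETURN
 * set means the process stack (PSP), clear means the main stack (MSP). */
static uint32_t armex_select_base(uint32_t exc_return, uint32_t msp, uint32_t psp) {
  return ((exc_return & 0x4u) == 0x4u) ? psp : msp;
}

/* Copy the eight stacked words (as read from target memory) into the struct. */
static armex_frame_t armex_decode(const uint32_t words[8]) {
  assert(words != NULL);
  armex_frame_t f = {
    words[0], words[1], words[2], words[3],
    words[4], words[5], words[6], words[7]
  };
  return f;
}
```

With an EXC_RETURN of 0xFFFFFFF9 the base is MSP, with 0xFFFFFFFD it is PSP; the word at base + 24 is the stacked return address that the script disassembles.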

You can place the .gdbinit file anywhere. I have it placed where my gdb is located inside the Freescale Kinetis Design Studio (C:\Freescale\KDS_3.0.0\toolchain\bin).

To make sure GDB finds the .gdbinit, I specify the path to it in the Eclipse workspace preferences:

GDB Command File in Eclipse Workspace Preferences

Debugging Hard Fault

To debug a hard fault, I set a breakpoint in my hard fault interrupt handler to stop the debugger when the fault happens:

To find out where the problem occurred, I now use the ‘armex’ command in the gdb console:

💡 Use the ‘triangle’ menu of the console to switch to the arm-none-eabi-gdb view

The armex command lists the stacked registers (same as with my handler shown in “Debugging Hard Faults on ARM Cortex-M”). The important information is either the return instruction or the LR instruction information. I can enter that address in the disassembly view to find out where the problem happened:

In the above example, the stacked LR (Link Register) pointed to 0xbd2 (0xbd3 with the Thumb bit set). In the disassembly view, this is the address execution would return to after the call, so the problem must be just before that. Checking the assembly code, there is an indirect branch through a register:

blx r3

The stacked register shows

R3              0x0

which causes the hard fault: the blx r3 branches to address 0x0. If the problem is not that clear, simply set a breakpoint around that location and restart the application to debug what happens before the hard fault is triggered. With this, it should hopefully be easy to find and fix the problem.
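
The stacked LR carries the Thumb bit in bit 0, so the address to enter in the disassembly view is the value with that bit cleared (0xbd3 becomes 0xbd2). A small C sketch of that arithmetic, with a hypothetical function name:

```c
#include <stdint.h>

/* Clear bit 0 (the Thumb bit) to get the instruction address
 * that a stacked LR or a function pointer refers to. */
static uint32_t thumb_to_address(uint32_t value) {
  return value & ~UINT32_C(1);
}
```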

Summary

I now have yet another way to debug my hard faults: using my custom gdb command to dump the stacked registers. The advantage of this approach is that, compared to my earlier solution, it does not need any additional resources on the target (no extra handler in the code and no variables). And the added benefit is that I now know how to extend GDB with my custom commands :-).

Happy Faulting 🙂

47 thoughts on “Debugging ARM Cortex-M Hard Faults with GDB Custom Command”

Mark on July 5, 2015 at 11:07 said:

Hello Eric

Try a hard fault like

void hard_fault(void)
{
}

and set a break point at the { (which is the return).

Then, when the fault occurs, you step out of the routine (in the disassembler view) and you should be at the line of code that faulted with all register content as it was, showing bad pointers etc., without needing to dump any registers.

Regards

Mark

- Erich Styger on July 5, 2015 at 14:02 said:
  
  Hi Mark,
  unless I’m really missing the obvious thing: how are you able to step out of a hard fault handler? Yes, I can do this for normal interrupts, but not for a hard fault. So how do you recover from a hard fault?
  
  - Mark on July 6, 2015 at 17:47 said:
    
    Eric
    
    A hard fault handler will simply return like any other interrupt; the faulting instruction is then executed again and raises the hard fault exception again. It will return again, fault again, and so on, until the watchdog fires.
    
    This means that hard faults are usually very easy to debug because you simply disable the watchdog, and then, when it happens, it results in an endless loop (hard fault, hard fault exception, return, hard fault, hard fault exception, return, hard fault, hard fault except…) and you can then connect the debugger (without resetting), pause and see the instruction causing the problem. As long as the watchdog doesn’t cause a reset, a system can be started without the debugger and left to run (for as long as it takes) until such a fault occurs – then carefully connect the debugger without causing a reset, pause, see the problem and solve it. In some cases it is even possible to manually correct a pointer or such and let the system recover (until it maybe fails again later).
    
    This works well with IAR as long as you step out of the hard fault in disassembly mode (not in source code view, since the debugger gets confused), and I think that KDS manages it too.
    
    Regards
    
    Mark
    
Mark on July 6, 2015 at 17:49 said:

Additional comment – it is only the core error that is unrecoverable (eg. a hard fault with an invalid handler), which will result in a reset.

Regards

Mark

Horaira on January 26, 2016 at 11:58 said:

Hi Erich Styger,
In my application, the hard fault happens before even entering main. In the disassembly view I get the address, but it does not show the C source.
My question is: how should we debug startup and initialization code before main? Can we put a breakpoint before main?

- Erich Styger on January 26, 2016 at 12:31 said:
  
  So it crashes inside the library. You could use the program counter and check in the linker map file which function it is.
  You would need to rebuild the library (which is not easy): https://mcuoneclipse.com/2014/08/23/gnu-libs-with-debug-information-rebuilding-the-gnu-arm-libraries/
  But I think it is crashing for you in the malloc for the libraries. Have a look at this one:
  
  FreeRTOS, malloc() and SP check with GNU Tools
  
  Maybe this helps?
  And yes, you could put breakpoints before main, but if you don’t have debug information, then you need to do assembly stepping.
  Erich
  
  - Horaira on January 26, 2016 at 21:20 said:
    
    Thanks, I will go through it.
    
Dan on March 28, 2016 at 00:16 said:

I’m trying to debug a hard fault on a KE06 MCU in KDS in Windows. Is there any chance that you could help with that? The PE Hard Fault component wants to be in the .text portion of memory, and my project doesn’t have that. When I define it, it doesn’t like the leading period.

I then tried the steps here, but I can’t create a .gdbinit because Windows also doesn’t like the preceding period. The menus are all different in KDS, too.

These look like very helpful tips to help with hard faults, but sadly they are difficult to implement in KDS.

- Erich Styger on March 28, 2016 at 11:27 said:
  
  Hi Dan,
  I’m using KDS (v3.x) too. And the screenshots in this post are from KDS too, so I’m not sure why your menus etc. are different.
  Yes, you cannot rename a file to a name with a dot at the beginning with the Windows Explorer :-(. Simply use the DOS shell (command prompt) for this.
  As for “The PE Hard Fault component wants to be in the .text portion of memory, and my project doesn’t have that.”, I’m not sure what your problem is either. I have created a Processor Expert project (no SDK!) for the KE06Z128, and I can add the HardFault component without any problems. And the component has no requirement on the text section or section name, as far as I can tell. What am I missing?
  I hope that this is of some help?
  
pabigot on May 30, 2016 at 22:16 said:

Thanks for this information. There are a couple problems: gdb parses “$lr & 0x4 == 0x4” as “$lr & (0x4 == 0x4)” which tests the wrong condition. With the common EXC_RETURN value 0xfffffff9 this obscured the fact that when “$lr & 0x04” is non-zero the active stack is PSP, not MSP. The correct check to handle all return modes is:

if (0x04 == ($lr & 0x04))
set $armex_base = $psp
else
set $armex_base = $msp
end
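
GDB evaluates such expressions with C operator precedence, so the same pitfall can be reproduced in plain C. The sketch below (function names are hypothetical) writes out how the unparenthesized test is parsed, next to the intended test. It assumes ARMv7-M EXC_RETURN values such as 0xFFFFFFF9 and 0xFFFFFFFD, which all have bit 0 set.

```c
#include <stdbool.h>
#include <stdint.h>

/* How "lr & 0x4 == 0x4" is parsed: (0x4 == 0x4) is 1, so this tests bit 0. */
static bool misparsed_test(uint32_t lr) {
  return (lr & (0x4 == 0x4)) != 0;
}

/* Intended test: is bit 2 of EXC_RETURN set (frame on PSP)? */
static bool frame_on_psp(uint32_t lr) {
  return (lr & 0x4u) == 0x4u;
}
```

For 0xFFFFFFFD (frame on PSP) both functions return true, but for 0xFFFFFFF9 (frame on MSP) the mis-parsed form still returns true, so it cannot tell the two stacks apart.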

- Erich Styger on May 31, 2016 at 07:44 said:
  
  hi pabigot,
  indeed, thanks for pointing this out. I have used this approach mostly for bare metal applications, so did not see that bug. I have now updated the code, thank you!
  Erich
  
Peter on June 13, 2016 at 10:45 said:

Hi Erich,
I wrote a small write-up on a method which I didn’t see discussed anywhere:
https://www.element14.com/community/message/199113/l/gdb-assisted-debugging-of-hard-faults#199113

what do you think about this approach?

- Erich Styger on June 13, 2016 at 13:48 said:
  
  Hi Peter,
  interesting approach! I quickly tried setting the SP based on MSP or PSP in a FreeRTOS thread, but somehow it did not show the stack for me. Not sure what was wrong. I have added your code sequence to my HardFault handler, but have it disabled for now. I’m using the GDB which is in KDS V3.2.0 (GNU gdb (GNU Tools for ARM Embedded Processors) 7.6.0.20140731-cvs). Are you using a different version?
  
  - Peter on June 13, 2016 at 14:01 said:
    
    I’m not using the toolchain coming with KDS
    my versions:
    gdb 7.8.2
    gcc 4.9.3
    
  - Peter on June 13, 2016 at 14:11 said:
    
    I extracted the KDS 3.2.0 installer and used the bundled gdb (7.6.0, as you said).
    
    but it also displays the thread when the hardfault handler is active.
    
  - Peter on June 13, 2016 at 14:21 said:
    
    I did a build run using the gcc toolchain supplied with KDS 3.2
    
    Now the call stacks with the Segger Thread Awareness are not working anymore, and I also don’t see the call stack when I trigger an intentional hard fault.
    
    - Erich Styger on June 13, 2016 at 14:25 said:
      
      Interesting, so it depends on the gcc toolchain? Segger Thread Awareness works mostly (not always) for that combination.
      
  - Peter on June 13, 2016 at 14:50 said:
    
    I just compiled with the gcc launchpad toolchain.
    stack view also works. but only when I use gdb 7.8
    the bundled 7.10 does not play well with my GNU ARM Eclipse Plugin or Segger GDBServer.
    
- Peter on June 17, 2016 at 11:23 said:
  
  Arne from Segger shares my motivation on this topic.
  the quality of embedded systems debugging is about to be improved again 🙂
  
  http://forum.segger.com/index.php?page=Thread&postID=11390#post11390
  
  - Erich Styger on June 18, 2016 at 15:49 said:
    
    That’s indeed good news! Hopefully this will be in beta drop/release soon 🙂
    
Marco Coelho on October 11, 2016 at 20:47 said:

Hi, Erich! How do you create the .gdbinit file in Windows Explorer? What kind of file is this? Is it a .exe file?

- Erich Styger on October 11, 2016 at 21:02 said:
  
  Hi Marco,
  .gdbinit is a text file with gdb commands and settings in it. You cannot create a file with a dot at the beginning with the Windows Explorer. Use the Windows command prompt (cmd.exe) for this, e.g. create a dummy.txt and then rename it with ‘rename dummy.txt .gdbinit’
  I hope this helps,
  Erich
  
  - Marco Coelho on October 11, 2016 at 22:05 said:
    
    Thank you, Erich
    
    I followed your advice and now my .gdbinit file is created. I still haven’t faced a Hard Fault condition to test your tool, but sometimes such a situation occurs to some customers and it is hard to find out what caused the Hard Fault condition. And this certainly will be a great tool to help me in those cases.
    
    - Erich Styger on October 12, 2016 at 07:20 said:
      
      Hi Marco,
      it depends a bit on the core you are using, but common things to trigger a hard fault are for example
      *(int*)0x0 = 1;
      e.g. writing to read only memory.
      
    - Marco Coelho on October 13, 2016 at 16:08 said:
      
      Hi, Erich
      
      Thanks for your response. Such an instruction really causes a Hard Fault condition, but after typing the command “armex” in Console, I received the following:
      
      armex
      EXEC_RETURN (LR):
      Value can’t be converted to integer.
      lr 0xfffffff9 -7
      Uses MSP 0x
      
      I don’t know what could be causing such a problem. Can you throw some light on that, please?
      
      I tested with a firmware in KDS 3.2 and FreeRTOS on a FRDM-K64F kit.
      
    - Erich Styger on October 13, 2016 at 16:55 said:
      
      Hi Marco,
      ‘armex’ does not ring a bell for me. Which console are you talking about?
      But after a hard fault you have to reboot the device.
      
    - Marco Coelho on October 13, 2016 at 17:05 said:
      
      Hi, Erich
      
      The Console I am using is “[GDB PEMicro Interface Debugging] arm-none-eabi-gdb”. And the interface used is the on board P&E Open-SDA built on FRDM-K64F kit.
      I just set a breakpoint at the beginning of “HardFault_Handler”, run the code with that illegal memory access, and receive the error message I described in my previous message.
      
    - Erich Styger on October 13, 2016 at 20:04 said:
      
      Hi Marco,
      I suggest giving that special HardFault handler a chance: for me it solved nearly every hard fault problem because it pointed out the location where it happened.
      
Marco Coelho on October 13, 2016 at 22:32 said:

Hi, Erich

I don’t know why “armex” didn’t work for me.

I will try the other Hard Fault debugging solution you posted before. In that solution (Simple PC Handler), you suggest to replace the default code with:

__asm volatile (
" movs r0,#4 \n"
" movs r1, lr \n"
" tst r0, r1 \n"
" beq _MSP \n"
" mrs r0, psp \n"
" b _HALT \n"
"_MSP: \n"
" mrs r0, msp \n"
"_HALT: \n"
" ldr r1,[r0,#20] \n"
" bkpt #0 \n"
);

In my code (that uses KSDK 2 and FreeRTOS, but not Processor Expert), my Hard Fault Handler is already in assembly language and looks like this:

HardFault_Handler:

ldr r0,=HardFault_Handler
bx r0
.size HardFault_Handler, . - HardFault_Handler

.align 1
.thumb_func
.weak SVC_Handler
.type SVC_Handler, %function
SVC_Handler:
ldr r0,=SVC_Handler
bx r0
.size SVC_Handler, . - SVC_Handler

.align 1
.thumb_func
.weak PendSV_Handler
.type PendSV_Handler, %function

My question: Is it necessary to change something or add more code in addition to yours? Sorry for the question. I’m not very used to assembly language, especially on ARM MCUs.

- Erich Styger on October 14, 2016 at 12:40 said:
  
  Hi Marco,
  your handlers basically do nothing. So you won’t easily get the information you need (what is causing the hard fault). The handler discussed in this article does extra work, like getting the values from the stack and storing them in local variables so they can be easily inspected using the debugger. Of course, if you are able to do that manually (looking at the stack and memory), you don’t need that, but this might not be an easy task for everyone. So if you would like to see the information in the debugger, then you should use such a handler.
  
  - Marco Coelho on October 14, 2016 at 20:34 said:
    
    Hi, Erich
    
    As I told you previously, “armex” didn’t work for me. So, the last chance I have is to try the Extended Handler posted on https://mcuoneclipse.com/2012/11/24/debugging-hard-faults-on-arm-cortex-m/
    
    But I have a problem. As my project does not use Processor Expert, HardFault Handler is in “startup_MK64F12.S” file and is in assembly language. Do you have a similar solution in assembly language?
    
    Thanks!
    
  - Marco Coelho on October 14, 2016 at 20:51 said:
    
    Hi, Erich again
    
    I just noticed that you left a comment above the Extended Handler code that says: “This is called from the HardFault_HandlerAsm with a pointer the Fault stack as the parameter”. Can you please explain in more detail how I should make that call in HardFault_HandlerAsm? I suppose “HardFault_HandlerAsm” is the same function we find in the “startup_MK64F12.S” file, isn’t it?
    
    Thanks!
    
  - Marco Coelho on October 14, 2016 at 22:01 said:
    
    Hi, Erich
    
    I just replaced the code of HardFault Handler(ASM) with the one you suggested, just removing the quotation marks and put “HardFault_HandlerC” function in my main.c file. It worked perfectly!
    
    It really is such a great HardFault Handler debugging tool. The best (maybe the only one) I’ve ever seen!
    
    Thank you very much!
    
    - Erich Styger on October 15, 2016 at 09:53 said:
      
      Great! For myself, these kind of things are maybe hard to manage the first time, but it is such a great learning experience (I hope it is the same for you). So keep going that way!
      
chrizlax on December 31, 2019 at 18:36 said:

For those having problems with ARMEX, my version of the GDB (GNU Tools for Arm Embedded Processors 7-2017-q4-major 8.0.50.20171128-git ) uses $msp and $psp instead of the upper case $MSP and $PSP. Changing to lower case definitions of these fixes the armex command for me.

- Erich Styger on January 1, 2020 at 06:36 said:
  
  Thanks for letting us know. I agree that registers should not be case sensitive, but in that case I always can change them to what is accepted?
  
  - chrizlax on January 2, 2020 at 13:18 said:
    
    Yes, changing to all lower case works for me.
    
    The error “Value can’t be converted to integer.” people keep getting appears because p $MSP evaluates to void (apparently like any other unknown variable/register), which cannot be converted to integer.
    
    p $msp evaluates to the correct value of the register, and if I send GDB “info registers” it prints all the registers with lower case names, so logically they should always be referred to in lower case?
    
    - Erich Styger on January 2, 2020 at 13:30 said:
      
      Thanks for the additional explanation, this absolutely makes sense. Are you going to contribute that back to the gdb sources or have you already done that?
      
    - chrizlax on January 2, 2020 at 13:37 said:
      
      Thanks!
      It’s not an issue with the GDB sources, I think; it’s with the .gdbinit you give. I needed to change all instances of $MSP, $PSP to $msp, $psp, e.g.:
      
      if ($lr & (0x4 == 0x4))
      printf "Uses MSP 0x%x return.\n", $MSP
      set $armex_base = $MSP
      
      Should be
      
      if ($lr & (0x4 == 0x4))
      printf "Uses MSP 0x%x return.\n", $msp
      set $armex_base = $msp
      
    - Erich Styger on January 2, 2020 at 13:55 said:
      
      ah, I misunderstood when you wrote ‘my version of gdb’: I thought you changed the GDB sources to accept both lower and upper case. It is clear now 🙂
      
    - Vu on March 13, 2020 at 02:33 said:
      
      Hi Chrizlax & Erich,
      Thanks ! It works. Changing all the $MSP and $PSP to lowercase does solve the “Value can’t be converted to integer.” issue.
      
      Now I have more information on the underlying issue. Could someone help me shed some light on this armex report?
      
      “PE_ISR(Unhandled_ivINT_Hard_Fault);
      PE_ISR(Unhandled_ivINT_Hard_Fault)
      {
      PE_DEBUGHALT();
      }”
      
      armex
      EXEC_RETURN (LR):
      lr 0xffffffe9 4294967273
      Uses MSP 0x2002fea8 return.
      xPSR 0x5e4f
      ReturnAddress 0xa
      LR (R14) 0x0
      R12 0x5e6c
      R3 0x2002ffbe
      R2 0x0
      R1 0x2
      R0 0x2002feb0
      Return instruction:
      0xa : movs r0, r0
      LR instruction:
      0x0 : movs r0, r0
      
      Sincerely,
      Vu
      
    - Erich Styger on March 14, 2020 at 06:50 said:
      
      So your application has caused a hard fault (see https://mcuoneclipse.com/2012/11/24/debugging-hard-faults-on-arm-cortex-m/ and https://mcuoneclipse.com/2012/12/28/a-processor-expert-component-to-help-with-hard-faults/).
      But your register dump does not give enough information to track down the issue (LR does not show usable information). You have to debug and step through your code to find out where it happens.
      It could be an illegal access to memory or accessing a peripheral which is not clocked.
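
      As a side note on the dump above, the EXC_RETURN value itself can be decoded. The C sketch below assumes the ARMv7-M bit assignments (bit 2 selects PSP, bit 3 means return to Thread mode, and a cleared bit 4 means an extended frame with floating-point state was stacked); the type and function names are hypothetical. For 0xffffffe9 it reports Thread mode, MSP, and an extended frame, whose first eight words are still the registers that armex prints.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of an EXC_RETURN value (hypothetical type name). */
typedef struct {
  bool uses_psp;       /* bit 2 set: frame is on the process stack */
  bool thread_mode;    /* bit 3 set: exception returns to Thread mode */
  bool extended_frame; /* bit 4 clear: FP registers were stacked too */
} exc_return_info_t;

static exc_return_info_t decode_exc_return(uint32_t exc_return) {
  exc_return_info_t info;
  info.uses_psp = (exc_return & (1u << 2)) != 0;
  info.thread_mode = (exc_return & (1u << 3)) != 0;
  info.extended_frame = (exc_return & (1u << 4)) == 0;
  return info;
}
```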
      
Vu on March 17, 2020 at 20:02 said:

Hi Erich,
Thanks for your feedback. I captured more information using your “simple PC handler” along with the GDB custom command. I captured the issue and posted in the below thread.

https://community.nxp.com/message/1283392

Thanks,
Vu

- Vu on March 17, 2020 at 20:07 said:
  
  Somehow my post hasn’t shown up in the forum yet. It probably needs approval from the moderator.
  
  - Erich Styger on March 17, 2020 at 20:09 said:
    
    Do you mean on the NXP forum? Yes, posts up there need to be approved first.
    