A Processor Expert Component to Help with Hard Faults

Argh! My ARM application crashed somewhere again and I ended up in a HardFault exception :-(. In my earlier post I used a handler to get information from the processor about what happened. But it is painful to add this handler again and again. So I decided to make things easier for myself with a special HardFault Processor Expert component :-).

Once the HardFault component is added to a project, it automatically adds an entry to the vector table. No manual steps are needed: having the component in the project and enabled is enough.

The component implements two functions: one is the interrupt handler itself:

__attribute__((naked)) void HF1_HardFaultHandler(void)
{
  __asm volatile (
    " movs r0,#4      \n"  /* load bit mask for EXC_RETURN bit 2 into R0 */
    " mov r1, lr      \n"  /* load link register (EXC_RETURN) into R1 */
    " tst r0, r1      \n"  /* test bit 2 of EXC_RETURN */
    " beq _MSP        \n"  /* bit clear: stack frame is on MSP */
    " mrs r0, psp     \n"  /* bit set: stack frame is on PSP, load it into R0 */
    " b _GetPC        \n"  /* go to part which loads the stacked address */
    "_MSP:            \n"  /* stack frame is on MSP */
    " mrs r0, msp     \n"  /* load stack pointer into R0 */
    "_GetPC:          \n"  /* find out where the hard fault happened */
    " ldr r1,[r0,#20] \n"  /* load the word at offset 20 (stacked LR) into R1; the stacked PC is at offset 24 */
    " b HandlerC      \n"  /* decode more information. R0 contains pointer to stack frame */
  );
  HandlerC(0); /* dummy call to suppress compiler warning */
}

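The above code loads the address of the exception stack frame into R0. R1 receives the word stacked at offset 20, which is the saved link register: for a faulting call this is the address right after the call instruction. The stacked PC is one word further, at offset 24.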
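The stack selection done in assembly above can also be expressed in plain C, which may be easier to follow. This is only a sketch, assuming the standard Cortex-M EXC_RETURN encoding in which bit 2 selects the stack. The helper name frame_is_on_psp is hypothetical and not part of the component.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of the HardFault component.
 * On exception entry LR holds an EXC_RETURN value. Bit 2 tells which
 * stack received the exception frame: 0 = MSP, 1 = PSP. */
static bool frame_is_on_psp(uint32_t exc_return)
{
  return (exc_return & (1u << 2)) != 0u;
}
```

The handler itself still has to do this in assembly, because it must read LR before any compiler-generated prologue can change the stack.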

At the end it calls a function implemented in C (HandlerC()) which performs more decoding:

/**
 * This is called from the HardFaultHandler with a pointer to the fault stack
 * as the parameter. We can then read the values from the stack and place them
 * into local variables for ease of reading.
 * We then read the various Fault Status and Address Registers to help decode
 * cause of the fault.
 * The function ends with a BKPT instruction to force control back into the debugger
 */
static void HandlerC(dword *hardfault_args)
{
  volatile unsigned long stacked_r0;
  volatile unsigned long stacked_r1;
  volatile unsigned long stacked_r2;
  volatile unsigned long stacked_r3;
  volatile unsigned long stacked_r12;
  volatile unsigned long stacked_lr;
  volatile unsigned long stacked_pc;
  volatile unsigned long stacked_psr;
  volatile unsigned long _CFSR;
  volatile unsigned long _HFSR;
  volatile unsigned long _DFSR;
  volatile unsigned long _AFSR;
  volatile unsigned long _BFAR;
  volatile unsigned long _MMAR;

  /* suppress warnings about unused variables */
  (void)stacked_r0;
  (void)stacked_r1;
  (void)stacked_r2;
  (void)stacked_r3;
  (void)stacked_r12;
  (void)stacked_lr;
  (void)stacked_pc;
  (void)stacked_psr;
  (void)_CFSR;
  (void)_HFSR;
  (void)_DFSR;
  (void)_AFSR;
  (void)_BFAR;
  (void)_MMAR;

  stacked_r0 = ((unsigned long)hardfault_args[0]);
  stacked_r1 = ((unsigned long)hardfault_args[1]);
  stacked_r2 = ((unsigned long)hardfault_args[2]);
  stacked_r3 = ((unsigned long)hardfault_args[3]);
  stacked_r12 = ((unsigned long)hardfault_args[4]);
  stacked_lr = ((unsigned long)hardfault_args[5]);
  stacked_pc = ((unsigned long)hardfault_args[6]);
  stacked_psr = ((unsigned long)hardfault_args[7]);

  /* Configurable Fault Status Register */
  /* Consists of MMSR, BFSR and UFSR */
  _CFSR = (*((volatile unsigned long *)(0xE000ED28)));

  /* Hard Fault Status Register */
  _HFSR = (*((volatile unsigned long *)(0xE000ED2C)));

  /* Debug Fault Status Register */
  _DFSR = (*((volatile unsigned long *)(0xE000ED30)));

  /* Auxiliary Fault Status Register */
  _AFSR = (*((volatile unsigned long *)(0xE000ED3C)));

  /* Read the Fault Address Registers. These may not contain valid values.
   * Check BFARVALID/MMARVALID to see if they are valid values
   * MemManage Fault Address Register
   */
  _MMAR = (*((volatile unsigned long *)(0xE000ED34)));
  /* Bus Fault Address Register */
  _BFAR = (*((volatile unsigned long *)(0xE000ED38)));

  __asm("BKPT #0\n") ; /* cause the debugger to stop */
}

It will use the stack pointer in R0 to write the stacked registers into variables for easier inspection.
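The indices 0 to 7 used in HandlerC() follow the order in which the core pushes registers on exception entry. A small sketch that names those indices may make the layout easier to remember. The enum and the frame_word helper are hypothetical names for illustration, and the sketch assumes the basic eight-word frame without floating-point context.

```c
#include <stdint.h>

/* Hypothetical names for the eight words of the basic exception frame.
 * Each word is 4 bytes, so FRAME_LR is at byte offset 20 and FRAME_PC
 * at byte offset 24 from the stack pointer passed in R0. */
enum {
  FRAME_R0 = 0,
  FRAME_R1,
  FRAME_R2,
  FRAME_R3,
  FRAME_R12,
  FRAME_LR,
  FRAME_PC,
  FRAME_PSR,
  FRAME_WORDS
};

/* Read one stacked register from the frame. */
static uint32_t frame_word(const uint32_t *frame, int index)
{
  return frame[index];
}
```

With this, frame_word(hardfault_args, FRAME_PC) reads the same word as hardfault_args[6] in the handler above.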

Automatic Vector Allocation

The component uses a new feature available in CodeWarrior for MCU10.3: it is able to automatically assign a handler to a vector.

%if (CPUfamily = "Kinetis")
%- =============================================================================
%- Allocation of interrupt vectors by component.
%- =============================================================================
%-
%if (defined(PEversionDecimal) && (PEversionDecimal >=0 '1282')) %- this is only supported with MCU 10.3
%- Get interrupts info from CPU database
%- Note: this is done only for Kinetis for now.
%:tmp = %CPUDB_define_Interrupt_Vectors_info()
%-
 %for vect from InterruptVectors
   %if %"%'vect'" = 'defaultInt'
     %if vect = 'ivINT_Hard_Fault'
       %define_prj %'vect' %'ModuleName'%.%HardFaultHandler
     %else
       %- keep PEx default
     %endif
   %endif
 %endfor
%-
%endif %- MCU 10.3
%-
%else
  %error "this component is only supported for GCC and Kinetis!"
%endif %-(CPUfamily = "Kinetis")

Because this is specific to MCU10.3 and the Processor Expert version in it, the script needs to test against the version number. Additionally, the component is only supported for GCC and Kinetis, hence the second test.

Usage

Using the component is simple: just add it to the list of components in the project:

HardFault Handler in the project

The HardFaultHandler() automatically gets assigned to the ARM Cortex HardFault interrupt.

In case of a hard fault, the debugger will stop on the bkpt 0x0 instruction. The Variables View shows the stacked registers, and the stacked_lr shows the address where we are coming from:

Stopped in a hard fault

Then that link register/PC address can be entered in the Disassembly View to see where the problem happened:

Instruction causing the Hard Fault

💡 Keep in mind that the Link Register points to the instruction *after* the problem. And that an odd address bit indicates that the code is executed in Thumb mode.
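To turn a stacked LR value into an address for the Disassembly View, the Thumb bit has to be cleared first. The sketch below shows that arithmetic. Both function names are hypothetical, and the second one assumes the call was a register-form blx, which is a 16-bit instruction in Thumb.

```c
#include <stdint.h>

/* Hypothetical helpers for reading a stacked LR value. */

/* Clear bit 0 (the Thumb state bit) to get the real return address. */
static uint32_t thumb_return_address(uint32_t stacked_lr)
{
  return stacked_lr & ~(uint32_t)1u;
}

/* Address of the calling instruction, assuming it was "blx <reg>",
 * which occupies 2 bytes directly before the return address. */
static uint32_t blx_reg_call_site(uint32_t stacked_lr)
{
  return thumb_return_address(stacked_lr) - 2u;
}
```

As an illustration with an assumed value: a stacked LR of 0x4db gives a return address of 0x4da, and a 16-bit blx before it sits at 0x4d8.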

For the above case, the blx r3 instruction at address 0x4d8 is causing the problem. And inspecting the stacked_r3 register, it is clear that R3 has 0x0 or a NULL pointer. Calling a function at address 0x0 is not a good idea ;-).
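The fault status values captured in HandlerC() can narrow down the cause further. The sketch below splits _CFSR into its three parts and checks a few flags. The bit positions are the ARMv7-M ones (Cortex-M3/M4), and the Cortex-M0+ does not implement these registers. The type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in the ARMv7-M fault status registers. */
#define CFSR_MMARVALID (1u << 7)   /* MMAR holds a valid address */
#define CFSR_PRECISERR (1u << 9)   /* precise data bus error */
#define CFSR_BFARVALID (1u << 15)  /* BFAR holds a valid address */
#define CFSR_INVSTATE  (1u << 17)  /* invalid state, e.g. branch to a non-Thumb address */
#define HFSR_FORCED    (1u << 30)  /* configurable fault escalated to hard fault */

/* Hypothetical container for the three sub-registers of CFSR. */
typedef struct {
  uint8_t  mmfsr; /* MemManage Fault Status, CFSR bits 0..7 */
  uint8_t  bfsr;  /* Bus Fault Status, CFSR bits 8..15 */
  uint16_t ufsr;  /* Usage Fault Status, CFSR bits 16..31 */
} CfsrFields;

static CfsrFields split_cfsr(uint32_t cfsr)
{
  CfsrFields f;
  f.mmfsr = (uint8_t)(cfsr & 0xFFu);
  f.bfsr  = (uint8_t)((cfsr >> 8) & 0xFFu);
  f.ufsr  = (uint16_t)((cfsr >> 16) & 0xFFFFu);
  return f;
}

/* _BFAR and _MMAR are only meaningful when their valid bits are set. */
static bool bfar_is_valid(uint32_t cfsr) { return (cfsr & CFSR_BFARVALID) != 0u; }
static bool mmar_is_valid(uint32_t cfsr) { return (cfsr & CFSR_MMARVALID) != 0u; }

/* True if the hard fault was escalated from another fault type. */
static bool fault_was_escalated(uint32_t hfsr) { return (hfsr & HFSR_FORCED) != 0u; }
```

In the debugger, the same checks can be done by hand on the _CFSR and _HFSR values shown in the Variables View.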

Summary

This new HardFault component makes my life easier: I can simply add it to the project, and if I run into a HardFault exception, I get the information I need to find the problem. The component has parts written in inline assembly, and for now supports ARM GNU gcc for Kinetis. The new component is available from this link.

27 thoughts on “A Processor Expert Component to Help with Hard Faults”

  1. Is it snowing on your website?

    I found your technique to split out the interrupts and exceptions to individual handlers useful also on a STM32 ARM based MCU I’m working with. Of course I had to create it all by hand and with vi, since there is no ProcessorExpert available.

  2. Pingback: Debugging ARM Cortex-M0+ Hard Fault with MTB Trace | MCU on Eclipse

  3. Erich, I have my own design board with a KL04Z8 uC on it. It’s brand new, I’m using the internal OSC. After following your tutorial, I kept receiving this error: Cpu_ivINT_Hard_Fault. What could go wrong if my project uses only the generated code from PE without any modification? Many thanks!!

    • I suggest that you step through your code (from the startup) to see what code is causing this? I suspect that you are accessing the RTC (realtime clock) without having it powered or clocks enabled. Happened to me in the past.

  4. Pingback: C++ with Kinetis Design Studio | MCU on Eclipse

  5. Pingback: Debugging ARM Cortex-M Hard Faults with GDB Custom Command | MCU on Eclipse

  6. Hi Erich, I added your FSL_USB_Stack component to a MK20DX256VLL10 microcontroller. I was running other tasks on it, but I was getting a Hard Fault most of the time. In the CFSR the PRECISERR bit was set, but I couldn’t find any info on possible causes for that kind of fault. The only way I could avoid it was putting breakpoints in some places and continuing execution after hitting them, which was very strange. I spent a lot of hours researching the possible causes and learned a lot about the ARM Cortex M4 CPU and fault registers. In the end, out of desperation, I increased the “Interrupt stack size” from 256 to 512 in the MQX Lite component and that did the trick. I’m sharing this experience here in case anyone faces the same or a similar problem. Your HardFault component was of great, great help! I think KDS should have a special View that shows fault reports, such as in Keil (http://www.keil.com/dd/vtr/4919/13530.htm) but well.. Thanks for your efforts in this great blog!

    • Hi Carlos,
      thanks for sharing your experience and solution! Yes, in my experience stack overflows are a constant source of application crashes, and usually if things do not go well, I try to increase the stack to see if this solves the problem.
      Thanks again!

  7. Hi Erich,
    Again a very big thanks for such an awesome tutorial.

    I am using FRDM-K22F board. I am using following components.
    1. ADC
    2. FREE-RTOS

    I have a very simple task which enables ADC conversion. Once I enable it, I call vTaskDelay(). In this function I am hitting a hard fault. I used the HardFault component. My observations are as follows. Probably FreeRTOS has some bug, or maybe I am wrong.

    1. LR is pointing to a function in the tasks.c file. Function: vTaskDelay().
    Line: if( uxListRemove( &( pxCurrentTCB->xGenericListItem ) ) == ( UBaseType_t ) 0 )
    2. PC is pointing to a function in list.c. Function: uxListRemove();
    Line: pxItemToRemove->pxPrevious->pxNext = pxItemToRemove->pxNext;

    This basically means that when delay is called, the task tries to remove itself from the ready list and while doing so, it is hitting the hardfault error. I am stuck currently. Can you please help me?

    • Hi Vishal,
      Are you using interrupts? One common programming error is not properly assigning the correct interrupt priorities in relationship to the RTOS.
      Have a look at http://www.freertos.org/RTOS-Cortex-M3-M4.html
      The other thing is: if weird things happen, try to increase the task stack size. Just in case if your problem is caused by a stack overflow (do you have the stack overflow check hook enabled?).
      I hope this helps,
      Erich

      • Hi Erich,
        I have the same problem as Vishal, but I am running it on my KL27 MCU.
        I am not using interrupts, just polling two ADC channels alternately.
        After reading the above feedback, I increased the stack size, but I still get the hard fault. In addition, I tried to enable the stack overflow hook, but I am using IAR Workbench and am not sure what component I need for it.
        Please advise. Thanks a lot!
        Gilbert

        • Hi Gilbert,
          it is not easy on a Cortex-M0+ to find out the reason for a hard fault.
          About the stack overflow hook: this is a setting in FreeRTOSConfig.h, set the following
          configCHECK_FOR_STACK_OVERFLOW to 1.
          And make sure your variables on the stack are aligned: the M0 raises a hard fault on an unaligned access, while the M4 does two accesses (no hard fault, but slower execution).
          I hope this helps,
          Erich

  8. Pingback: How to Add Bluetooth Low Energy (BLE) Connection to ARM Cortex-M | MCU on Eclipse

  9. Getting this error on compile with the hard fault component, any ideas?
    Generated_Code/HF1.c:124:(.text.HF1_HardFaultHandler+0x14): relocation truncated to fit: R_ARM_THM_JUMP11 against symbol `HF1_HandlerC’ defined in .text.HF1_HandlerC section in /var/folders/kn/fhk1p9tx2dz52qqq3jcmy7_h0000gn/T//ccAhPPjD.ltrans6.ltrans.o

    • Never seen that one :-(. It looks like the jump from the handler to the HandlerC function cannot be resolved with an 11-bit address somehow? Maybe your linker has placed them too far apart? Is it a compiler error or a linker error?

  10. Pingback: Debugging ARM Cortex-M0+ HardFaults | MCU on Eclipse

  11. Hey Erich, this is a most helpful component, thank you. I have a situation where an odd fault is manifesting in a bootloader+application scenario, and the debugger can’t be running (I think). Do you have any thoughts on accessing this diagnostic data without the debugger? I guess I could dump the numbers out a com port by writing to low level component calls, but the MCU is in an ‘exception’ state so I’m not sure how well that would work.
    Thanks in advance.

    • You always should be able to debug things, even with a bootloader. Especially if you have the source code, you could add the symbols of it and will have more information.
      Writing things to a COM port is possible, as long as you do not use any interrupts for this.

      • Erich I’m back again on a similar issue. I patched in some code to dump the regs captured in HF1_Handler to internal EEPROM (Kinetis). This actually works surprisingly, but occasionally it rebuilds PEx components and nukes the code I added in. I’m not familiar with how to control that part of auto code generation; do you have any tips? Basically I want to #include a .h file with inline code in it (add in a hook, I guess).
        Thanks in advance.
