By default, the GNU compiler (gcc) optimizes each compilation unit (source file) separately. This is effective, but it misses the opportunity to optimize across compilation units. This is where Link Time Optimization (LTO, option -flto) can help: with a global view of the whole program, it can optimize one step further.
Another positive side effect is that the link step can flag possible issues, like the one below, which are not visible to the compiler while it only sees one file at a time:
type of '__SP_INIT' does not match original declaration [enabled by default]
Link Time Optimization (-flto)
The Link Time Optimizer can be turned on in the optimization settings of the GNU MCU Eclipse plugins (e.g. in Kinetis Design Studio):
The same setting can be found inside the MCUXpresso IDE (shown for the version 10.2 below):
Type does not match original declaration
What LTO has found in this case is an issue the compiler was not able to see:
The warning is for the object named ‘__SP_INIT’, which is flagged for being used with different declarations. Searching the project for the usage of that object shows that LTO is correct with its analysis:
In startup.c the (linker generated) symbol __SP_INIT is declared as
extern char __SP_INIT[];
while it is used in Vectors.c with:
extern uint32_t __SP_INIT;
Note that in one case it is an array of char, while in the other case it is an unsigned 32-bit variable!
That problem would not exist if the external declaration were in a header file shared by both sources, but this is how the engineers have set up the startup and vector table files :-(.
In Vectors.c __SP_INIT is used to initialize the stack pointer (SP) in the vector table as:
extern uint32_t __SP_INIT;
#define VECTOR_SP_MAIN &__SP_INIT
....
__attribute__ ((section (".vectortable"))) const tVectorTable __vect_table = { /* Interrupt vector table */
  /* ISR address       No.  Name */
  VECTOR_SP_MAIN,   /* 0x00 ivINT_Initial_Stack_Pointer */
  {
__SP_INIT is defined in the linker script:
_estack = 0x20000000; /* end of m_data */
__SP_INIT = _estack;
So the linker creates a virtual object (a symbol with no storage of its own) and places it at the address 0x2000’0000.
It is used in the startup code as below:
extern char __SP_INIT[];

__attribute__((naked)) void __thumb_startup(void) {
  int addr = (int)__SP_INIT;
  /* setup the stack before we attempt anything else
     skip stack setup if __SP_INIT is 0
     assume sp is already setup. */
  __asm (
    "mov r0,%0\n\t"
    "cmp r0,#0\n\t"
    "beq skip_sp\n\t"
    "mov sp,r0\n\t"
    "sub sp,#4\n\t"
    "mov r0,#0\n\t"
    "mvn r0,r0\n\t"
    "str r0,[sp,#0]\n\t"
    "add sp,#4\n\t"
    "skip_sp:\n\t"
    ::"r"(addr));
So for the vector table it is a 32-bit object whose address (&__SP_INIT) ends up in the table, while in the startup code it is an array of char: the array name decays to a pointer to its first element, so the cast yields the address of that object, which is assigned to the local variable ‘addr’.
While this ‘magically’ works, it is not correct, because the same object is used with two different declarations. At the very least it is not a clean way to use that symbol.
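The difference between the two declarations can be tried out on a host PC. The following is only a minimal sketch, not the project code: `sym_as_array` and `sym_as_word` are hypothetical stand-ins for a linker symbol, and the function names are made up for illustration. It shows that an array name already yields an address, while a scalar needs an explicit `&` and otherwise reads the memory content:

```c
#include <stdint.h>

/* Hypothetical stand-ins for a linker symbol (assumption: on the target,
   the address would come from the linker script instead). */
static char     sym_as_array[8];
static uint32_t sym_as_word = 0xDEADBEEFu;

/* Array declaration: the name decays to a pointer to the first element,
   so a cast is enough to get the address. */
uintptr_t address_via_array(void) {
  return (uintptr_t)sym_as_array;
}

/* Scalar declaration: the address has to be taken explicitly with '&'. */
uintptr_t address_via_word(void) {
  return (uintptr_t)&sym_as_word;
}

/* Scalar declaration without '&': this reads the memory content AT the
   address, not the address itself. */
uint32_t content_via_word(void) {
  return sym_as_word;
}
```

On the target, reading `__SP_INIT` as a plain `uint32_t` value would fetch whatever is stored at the top of the stack, which is why Vectors.c uses `&__SP_INIT`.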
Solution
How to solve this? One challenge in this case is that the ‘Vectors.c’ file is generated by Processor Expert, so I cannot easily change that one. But I can fix the declaration and its usage in startup.c, which is normal application code. The fix is to match the declaration present in Vectors.c:
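A sketch of what such a fix could look like, assuming the declaration in startup.c is changed to the one from Vectors.c and the address is then taken explicitly. The helper `initial_stack_pointer()` and the host-only stand-in definition are assumptions added so the snippet also builds on a PC; on the target the linker script provides `__SP_INIT`:

```c
#include <stdint.h>

/* Same declaration as in Vectors.c, so both files now agree on the type. */
extern uint32_t __SP_INIT;

#if !defined(__arm__)
/* Host-only stand-in (assumption for illustration): on the target the
   symbol is defined by the linker script, not by C code. */
uint32_t __SP_INIT;
#endif

/* Hypothetical helper: returns the address assigned to __SP_INIT.
   The explicit '&' is needed because the symbol is no longer an array. */
static inline uintptr_t initial_stack_pointer(void) {
  return (uintptr_t)&__SP_INIT;
}
```

In `__thumb_startup()` the corresponding line would then read `int addr = (int)&__SP_INIT;`.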
With this, the linker is happy. And I’m happy too :-).
LTO and FreeRTOS
As a reminder: if using -flto with FreeRTOS and you want to debug it, make sure you turn on the LTO helpers in the FreeRTOS configuration:
The above setting tweaks the FreeRTOS source base and makes sure that symbols needed for Kernel Awareness or symbols used in some assembly routines are not removed or tweaked by LTO.
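How the FreeRTOS setting protects its symbols is not shown here. As a general illustration only, GCC offers the `used` attribute, which tells the compiler to emit a symbol even if nothing in the C code appears to reference it. All names in this sketch are hypothetical:

```c
#include <stdint.h>

/* Hypothetical marker that only a debugger plug-in would read.
   Without 'used', the optimizer may drop it as unreferenced. */
__attribute__((used)) static const uint8_t debug_helper_marker = 0x5A;

/* Hypothetical counter touched by the handler below. */
static volatile uint32_t example_irq_count;

/* Hypothetical handler that is referenced only from assembly code or a
   vector table, so the optimizer sees no C caller for it. */
__attribute__((used)) void Example_IRQHandler(void) {
  example_irq_count++;
}
```

Note that `used` only addresses the compiler side. If the linker is set up to discard unused sections, the linker script may additionally need a `KEEP()` for the affected section.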
Summary
Link Time Optimization is a cool optimization. It still has room for improvement, but I have found that turning it on can flag hidden issues in the code. So -flto also serves as an extra check for my code.
Happy Finding :-)
Links
- GNU Link Time Optimization: https://gcc.gnu.org/wiki/LinkTimeOptimization
- Inter Procedural Optimizations: https://en.wikipedia.org/wiki/Interprocedural_optimization
A couple of comments, one on stack and one on LTO:
From my GCC Linker script:
/*
* To prevent obscure problems with printf like library code, when
* printing 64-bit numbers, the stack needs to be aligned to an
* eight byte boundary.
*
* See “Eight-byte Stack Alignment” – http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/14269.html
*/
So the uint32 would still be wrong.
LTO can be aggressive and remove code that it thinks is unneeded yet is used such as interrupt code. Unless debugging is an obsession, and not using FreeRTOS ‘helpers’, mark such code or variables with __attribute__((used)) to prevent them from vanishing.
Hi Bob,
good reminder about stack 8-byte alignment. I have this usually done implicitly in the linker file, pointing the stacktop/start to the end of memory which usually is 8-byte aligned.
And yes, LTO can be aggressive and replace/remove symbols. I have used the __attribute__((used)) for this in several places.
Thanks!
Erich
And yet another time this blog helps me out with a coding problem…
I had a typedef with an #ifdef modifying the type based on some configuration options. The problem was that this #define was placed in a header which was not included everywhere the type was used. Only the LTO warning described here detected this issue!
Time to subscribe for updates of this blog I guess :-)
Many thanks,
Michael
Hi Michael,
glad to be at your service! Yes, LTO has uncovered several subtle issues in my coding too. And if you subscribe to this blog, you get all kinds of non-technology pieces which hopefully you will enjoy as well. :-)
Erich