In “Code Size Information with gcc for ARM/Kinetis” I use an option in the ARM gcc tool chain for Eclipse to show me the code size:
text    data   bss    dec   hex   filename
0x1408  0x18   0x81c  7228  1c3c  size.elf
I have been asked by a reader of this blog what these item numbers really mean. Especially: what the heck is ‘bss’???? 🙂
Note: I’m using the ARM GNU ‘printsize’ utility for gcc, with an example for Kinetis-L (KL25Z).
text
‘text’ is what ends up in FLASH memory. I can show this by adding
void foo(void) { /* dummy function to show how this adds to 'text' */ }
to my program. The ‘text’ part then increases:
text    data   bss
0x1414  0x18   0x81c
Likewise, my new function ‘foo’ gets added to the .text segment, as I can see in the map file generated by the linker:
*(.text*)
 .text.foo      0x000008c8        0x8 ./Sources/main_c.o
                0x000008c8                foo
But it does not only contain functions, it has constant data as well. If I have a constant table like
const int table[] = {5,0,1,5,6,7,9,10};
then this adds to ‘text’ too. That variable ‘table’ will be in FLASH, initialized with the values specified in the source.
Another thing which is included in ‘text’ is the interrupt vector table (more on this later).
In summary: ‘text’ is what ends up typically in FLASH and has code and constant data.
data
‘data’ is used for initialized data. This is best explained with the following (global/extern) variable:
int32_t myVar = 0x12345678;
Adding above variable to my application will increase the ‘data’ portion by 4 bytes:
text    data   bss
0x1414  0x1c   0x81c
This variable ‘myVar’ is not constant, so it will end up in RAM. But the initialization (0x12345678) *is* constant, and can live in FLASH memory. The initialization of the variable is done in the normal ANSI startup code: it copies the initialization value from FLASH to the variable in RAM. This is sometimes named ‘copy-down’. For the startup code used by CodeWarrior for MCU10.3 for Kinetis-L (ARM Cortex-M0+), this is performed in __copy_rom_sections_to_ram().
Just one thing to consider: my variable ‘myVar’ will use space in RAM (4 bytes in my case), *plus* space in FLASH/ROM for the initialization value (0x12345678). So I need to count the ‘data’ size twice: that size will end up in RAM, plus will occupy FLASH/ROM. That amount of data in FLASH is *not* counted in the text portion.
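The copy-down idea can be tried on a host PC with a small simulation. This is only a sketch: the two arrays and the function names are made up for illustration and stand in for the linker-provided FLASH and RAM regions; the real startup code works on addresses supplied by the linker.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the linker-provided regions. On a real target
 * these would be locations in FLASH and RAM, not ordinary arrays. */
static const uint8_t flash_init_image[4] = {0x78, 0x56, 0x34, 0x12}; /* 0x12345678, least significant byte first */
static uint8_t ram_data_section[4];                                  /* where 'myVar' would live */

/* Copy the initial values from the FLASH image to their RAM location.
 * Returns the number of bytes copied, which corresponds to the 'data' size. */
static size_t copy_down(void)
{
    memcpy(ram_data_section, flash_init_image, sizeof flash_init_image);
    return sizeof flash_init_image;
}

/* Read the simulated variable back from RAM, assembling it byte by byte
 * so the result does not depend on the byte order of the host. */
static uint32_t read_my_var(void)
{
    return (uint32_t)ram_data_section[0]
         | ((uint32_t)ram_data_section[1] << 8)
         | ((uint32_t)ram_data_section[2] << 16)
         | ((uint32_t)ram_data_section[3] << 24);
}
```

The same 4 bytes exist twice: once as the constant image and once as the writable copy. That is why the ‘data’ size has to be counted for both FLASH and RAM.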
❗ The ‘data’ only has the initialization data (in my example 0x12345678), and not the variable (myVar).
bss
The ‘bss’ contains all the uninitialized data.
💡 bss (or .bss, or BSS) is the abbreviation for ‘Block Started by Symbol’, a term from an old assembler (see this link).
This is best explained with the following (global/extern) variable:
int32_t myGlobal;
Adding this variable will increase the ‘bss’ portion by 4:
text    data   bss
0x1414  0x18   0x820
💡 I like to remember ‘bss’ as ‘Better Save Space’ :-). As bss ends up in RAM, and RAM is very valuable for a microcontroller, I want to keep the amount of variables which end up in the .bss at the absolute minimum.
The bss segment is initialized in the startup code by the zero_fill_bss() function:
static void zero_fill_bss(void)
{
  extern char __START_BSS[];
  extern char __END_BSS[];

  memset(__START_BSS, 0, (__END_BSS - __START_BSS));
}
dec
The ‘dec’ (as a decimal number) is the sum of text, data and bss:
dec = text + data + bss
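The arithmetic is easy to check with the numbers from the first listing. The following is a minimal sketch; the struct and function names are invented for this example, and the FLASH/RAM split follows the counting rule described above (‘data’ counts for both), which ignores any toolchain-specific extras.

```c
#include <stdint.h>

/* Section sizes as printed by 'size' in Berkeley format. */
struct berkeley_sizes {
    uint32_t text;
    uint32_t data;
    uint32_t bss;
};

/* The 'dec' column: plain sum of the three columns. */
static uint32_t dec_total(struct berkeley_sizes s)
{
    return s.text + s.data + s.bss;
}

/* Approximate FLASH usage: code/constants plus the initialization values. */
static uint32_t flash_usage(struct berkeley_sizes s)
{
    return s.text + s.data;
}

/* Approximate RAM usage: initialized plus zero-initialized variables. */
static uint32_t ram_usage(struct berkeley_sizes s)
{
    return s.data + s.bss;
}
```

With text=0x1408, data=0x18 and bss=0x81c, dec_total() gives 7228 (0x1c3c), matching the ‘dec’ and ‘hex’ columns. Note that ‘dec’ itself is neither the FLASH nor the RAM usage, because ‘bss’ needs no FLASH and ‘text’ needs no RAM.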
Size – GNU Utility
The size (or printsize) GNU utility has more options:
size [-A|-B|--format=compatibility] [--help] [-d|-o|-x|--radix=number] [--common] [-t|--totals] [--target=bfdname] [-V|--version] [objfile...]
The ‘System V’ option can be set directly in the Eclipse panel:
It produces similar information as shown above, but with greater detail.
To illustrate this, I use
int table[] = {1,2,3,4,5};
While in ‘Berkeley’ mode I get:
text    data   bss    dec   hex   filename
0x140c  0x2c   0x81c  7252  1c54  size.elf
I get this in ‘System V’ mode:
section              size       addr
.interrupts          0xc0       0x0
.text                0x134c     0x800
.data                0x14       0x1ffff000
.bss                 0x1c       0x1ffff014
.romp                0x18       0x1ffff030
._user_heap_stack    0x800      0x1ffff048
.ARM.attributes      0x31       0x0
.debug_info          0x2293     0x0
.debug_abbrev        0xe66      0x0
.debug_loc           0x27df     0x0
.debug_aranges       0x318      0x0
.debug_macinfo       0x53bf3    0x0
.debug_line          0x1866     0x0
.debug_str           0xc23      0x0
.comment             0x79       0x0
.debug_frame         0x594      0x0
Total                0x5defe
I’m using an ARM Cortex-M0+ in my example, so addresses from 0x1ffff000 upward are in RAM.
The lines from .ARM.attributes up to .debug_frame do not end up on the target: they are debug and other information.
.interrupts is my interrupt vector table, and .text is my code plus constants, and is in FLASH memory. That makes the 0xc0+0x134c=0x140c for text in ‘Berkeley’.
.bss is my uninitialized (zeroed-out) variable area. Additionally there is ._user_heap_stack: this reserves the heap used by the ANSI library for malloc() calls, together with the stack. That makes the total of 0x1c+0x800=0x81c shown in ‘Berkeley’ format.
.data is for my initialized ‘table[]’ variable in RAM (5*4 bytes=0x14)
The .romp is used by the linker for the ‘copy-down’ and initialization of .data. But it looks confusing: it is shown with addresses in RAM? Checking the linker map file shows:
.romp           0x1ffff030       0x18 load address 0x00001b60
                0x00001b60                __S_romp = _romp_at
                0x1ffff030        0x4 LONG 0x1b4c ___ROM_AT
                0x1ffff034        0x4 LONG 0x1ffff000 _sdata
                0x1ffff038        0x4 LONG 0x14 ___data_size
                0x1ffff03c        0x4 LONG 0x0
                0x1ffff040        0x4 LONG 0x0
                0x1ffff044        0x4 LONG 0x0
Ah! That actually is not in RAM, but in FLASH: the linker maps this to the FLASH address 0x1b60! So this size 0x18 really needs to be added to the FLASH size too!
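To double-check how the ‘System V’ sections map onto the ‘Berkeley’ columns, the sums can be written down in a few lines of C. The constant and function names are invented for this sketch. The grouping of .romp under ‘data’ is my reading of the numbers (0x14+0x18 gives exactly the 0x2c reported), not something the size utility states.

```c
#include <stdint.h>

/* Section sizes taken from the 'System V' listing above. */
enum {
    SEC_INTERRUPTS      = 0xc0,
    SEC_TEXT            = 0x134c,
    SEC_DATA            = 0x14,
    SEC_BSS             = 0x1c,
    SEC_ROMP            = 0x18,
    SEC_USER_HEAP_STACK = 0x800
};

/* Berkeley 'text': vector table plus code and constants. */
static uint32_t berkeley_text(void) { return SEC_INTERRUPTS + SEC_TEXT; }

/* Berkeley 'data': initialized variables plus the copy-down table. */
static uint32_t berkeley_data(void) { return SEC_DATA + SEC_ROMP; }

/* Berkeley 'bss': zeroed variables plus the reserved heap/stack area. */
static uint32_t berkeley_bss(void)  { return SEC_BSS + SEC_USER_HEAP_STACK; }

/* Everything that has to be programmed into FLASH. */
static uint32_t flash_bytes(void)
{
    return SEC_INTERRUPTS + SEC_TEXT + SEC_DATA + SEC_ROMP;
}
```

The three column functions return 0x140c, 0x2c and 0x81c, the same values as in the ‘Berkeley’ output, and flash_bytes() comes out at 0x1438 (5176 bytes).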
Summary
I hope I have sorted things out correctly. The way initialized data is reported might be confusing. But with the right knowledge (and the .map file in mind), things get much clearer:
‘text’ is my code, vector table plus constants.
‘data’ is for initialized variables, and it counts for RAM and FLASH. The linker allocates the data in FLASH which then is copied from ROM to RAM in the startup code.
‘bss’ is for the uninitialized data in RAM which is initialized with zero in the startup code.
Happy Sizing 🙂
Do you ever compare the sizes between debug and release builds? On one project I was almost out of Flash code space and just to see what would happen I switched to Release build. Dramatic difference in size. Much more than I expected. I was thinking of using custom switches “per file” or maybe using libraries for tested code and only using Debug compile for new stuff.
Hi Bill,
I always do only ‘release’ builds. I do not see any value in doing a ‘debug’ build. See https://mcuoneclipse.com/2012/06/01/debug-vs-release/. With this, I do not see the problem you had. Yet another reason not to get fooled by ‘debug’ builds. It makes sense for the desktop world, but not for embedded. So probably what you saw is that in the ‘release’ build the compiler optimizations were turned on.
Hmm, I must have missed that post or just forgotten. I will try it. I guess I have been under the assumption the un-optimized code generation was needed by the debugger.
Thanks!
Bill
Modern debug information standards like DWARF have been designed with optimized code debugging in mind. If it is not possible to debug highly optimized code with the debugger, then this is a bug in the tools for me. Either in the compiler (not correctly generating debug information) or in the debugger (not correctly using that information). Yes, generating correct and useful debug information is not the simplest thing in the world. But very doable. Yes, there are academic cases where it is debatable whether some optimizations severely impact the debugging information. But as I say: academic 🙂
I apologize, as this isn’t a Freescale MCU. But it is ARM, Eclipse, and gcc.
This is the ‘debug’ build:
‘Invoking: ARM Windows GNU Print Size’
arm-none-eabi-size --format=berkeley stm32f4_test.elf
text data bss dec hex filename
264468 14744 102772 381984 5d420 stm32f4_test.elf
This is the ‘release’ build:
arm-none-eabi-size --format=berkeley stm32f4_test.elf
text data bss dec hex filename
147484 14392 102764 264640 409c0 stm32f4_test.elf
‘Finished building: stm32f4_test.siz’
That’s a significant amount of Flash space I’ve been wasting. And have been for years!
That really looks like in your release build you have the optimizations turned on, while they are off in the debug build. I get similar differences with my ARM cores with optimizations turned on/off.
Wrong! Being able to debug on embedded hardware will save you hours of work. It sounds like you need to brush up on utilizing the power of remote GDB, it is a doddle under eclipse and it will literally transform your development cycle. That debug code can be a complete godsend, not using it makes no sense.
Hi teejay,
not sure if you have posted that comment to the wrong thread? If it is about release and debug configurations: of course a ‘release’ configuration does not mean that I cannot debug it. Of course I keep the debug information in that configuration, and I’m using GDB and remote GDB every day 🙂
Hi!
can you please explain how to switch from debug to release build?
Thanks,
Cristian
Hi Cristian,
That depends on the architecture and tool chain. In general, in debug mode the optimizations are switched off, while in release they are on and symbolics are stripped off. You simply need to look at the compiler/build options/libraries and make sure, it matches your need. In any case even in ‘release’ you shall be able to debug it.
Hello:
Thank you for your response! It is very informative!
I have a question about BSS. I made a bare-bones project for the FRDM-KL25Z board in CodeWarrior, and I am getting a huge .bss at startup! About 2076 bytes with an empty project! The MCU has 4k of RAM, so it seems excessive to me to waste half of the RAM without any code! Is this space really used, or is it somehow just reserved for later use? Maybe it is the start-up code CodeWarrior generated, but I don’t know how to look into the subject.
Thank you for your time,
Ed.
Hi Ed,
the KL25Z on the FRDM board has 16 KByte of RAM :-).
Could it be that you have a heap used/linked with your application? This for sure will add to the .bss section. Best if you have a look at the generated .map file (generated by the linker). There you should see what uses that amount of RAM in your application.
Hello Erich:
I am sorry for my late response. Yes, I am working with the FRDM board, but at the end I will work with a KL05 chip with 32kb ROM, 4kb RAM. Sorry I did not mention it! 😛
I made a baremetal project (as you did on your great tutorial “Optimizing the Kinetis gcc Startup”) and I found I have the same 2076 bytes of BSS that you had at the start of the tutorial.
Now, I have seen in other posts of yours that you have a smaller BSS size. For example, in your tutorial on the RTOS with the FRDM board you had a BSS of 1040! Is there some way to reduce the BSS size? As I said before, I only made a bare-metal project with no code of my own and no Processor Expert code. I cannot find the startup code which takes all this space.
I made a barebones project for the KL05 chip and found the following BSS usage:
text data bss dec hex filename
868 24 540 1432 598 p1.elf
It still uses a lot! (13% of ram!), but at least not so much as in the case of the KL25 (~50%).
I am trying to understand the .map file, and I will post here what I find. I may take a while though :(.
Thank you for your time,
Ed.
Hi Ed,
have a look and check the linker file: how much stack and heap do you have allocated?
/* Generate a link error if heap and stack don’t fit into RAM */
__heap_size = 0x400; /* required amount of heap */
__stack_size = 0x400; /* required amount of stack */
Depending on your needs and startup, you can reduce this.
E.g. with FreeRTOS, I only need the initial stack to initialize the registers and basic drivers.
Then I switch to the stack of each task.
If you are not using any heap (and using the FreeRTOS one), you can reduce the heap size to zero.
The stack size to less than 0x50 (depending on what you do in your main()).
I hope this helps.
Oops! I made a mistake in my post! Sorry, I can’t edit it out…
“It still uses a lot! (13% of ram!), but at least not so much as in the case of the KL25 (~50%).”
KL25 using 2076 bytes of BSS would take about the same percentage (~13%).
Hello, I have a problem when generating code for my bootloader project, it appears by looking at the mapfile and srec file that although I used the technique of limiting ROM to 0x5000, using Processor Expert, for my bootloader code, some code specifically the code for .romp is being allocated outside this limit.
********MAP File extract
._user_heap_stack
0x1fff84e4 0x400 load address 0x00004e1c
0x1fff84e4 . = ALIGN (0x4)
0x1fff84e4 PROVIDE (end, .)
0x1fff84e4 PROVIDE (_end, .)
0x1fff84e4 __heap_addr = .
0x1fff84e4 . = (. + __heap_size)
0x1fff88e4 . = (. + __stack_size)
*fill* 0x1fff84e4 0x400 00
0x1fff88e4 . = ALIGN (0x4)
0x00005c20 _romp_at = ((___ROM_AT + SIZEOF (.data)) + SIZEOF (.m_data_20000000))
.romp 0x1fff88e4 0x24 load address 0x00005c20
0x00005c20 __S_romp = _romp_at
0x1fff88e4 0x4 LONG 0x4938 ___ROM_AT
0x1fff88e8 0x4 LONG 0x1fff8000 _sdata
0x1fff88ec 0x4 LONG 0x298 ___data_size
0x1fff88f0 0x4 LONG 0x4bd0 ___m_data_20000000_ROMStart
0x1fff88f4 0x4 LONG 0x20000000 ___m_data_20000000_RAMStart
0x1fff88f8 0x4 LONG 0x1050 ___m_data_20000000_ROMSize
0x1fff88fc 0x4 LONG 0x0
0x1fff8900 0x4 LONG 0x0
0x1fff8904 0x4 LONG 0x0
******Linker File extract
MEMORY {
m_interrupts (RX) : ORIGIN = 0x00000000, LENGTH = 0x000001BC
m_text (RX) : ORIGIN = 0x00000410, LENGTH = 0x00004BF0
m_data (RW) : ORIGIN = 0x1FFF8000, LENGTH = 0x00008000
m_data_20000000 (RW) : ORIGIN = 0x20000000, LENGTH = 0x00008000
m_cfmprotrom (RX) : ORIGIN = 0x00000400, LENGTH = 0x00000010
}
/* User_heap_stack section, used to check that there is enough RAM left */
._user_heap_stack :
{
. = ALIGN(4);
PROVIDE ( end = . );
PROVIDE ( _end = . );
__heap_addr = .;
. = . + __heap_size;
. = . + __stack_size;
. = ALIGN(4);
} > m_data
_romp_at = ___ROM_AT + SIZEOF(.data) +SIZEOF(.m_data_20000000);
.romp : AT(_romp_at)
{
__S_romp = _romp_at;
LONG(___ROM_AT);
LONG(_sdata);
LONG(___data_size);
LONG(___m_data_20000000_ROMStart);
LONG(___m_data_20000000_RAMStart);
LONG(___m_data_20000000_ROMSize);
LONG(0);
LONG(0);
LONG(0);
}
.ARM.attributes 0 : { *(.ARM.attributes) }
}
Question is.. how can I tell the linker not to allocate .romp outside the 0x5000 limit?, thanks in advance
It seems to me that you are missing the > m_data for the
.romp : AT(_romp_at)
{
…
} > m_data
?
Thanks for your response Erich
This is an automatically generated file from PE.
Just in case anyone runs into this: _romp_at = ___ROM_AT + SIZEOF(.data) +SIZEOF(.m_data_20000000); was causing trouble when using a large uninitialised buffer defined in .m_data_20000000. This made m_text overflow beyond 0x5000. Changing the calculation of _romp_at to _romp_at = ___ROM_AT + SIZEOF(.data) solved the problem.
Erich, you are correct, there is a missing > m_data, but this linker file is (was) managed by PE. It seems like a bug in this version of PE. I have the following installed: MCU10-3 PRODUCT 1.0.0, Processor Expert for MCU 1.2.0.S.RT2_b1248-1334.
Reblogged this on a version of mine and commented:
This explanation is easy to understand. I will use it as my reminder, and I hope it useful to others too.
Can you help me understand memory allocation for local static variables? Here is a code snippet:
uint32_t Flash_ReadRecord(uint16_t recordNumber) {
uint32_t previousTimeStamp;
uint16_t nextIndex;
uint16_t nextOffset;
// code here
}
As I understand it, these variables will be allocated space on a stack, and this memory will be freed once the Flash_ReadRecord() routine exits. Right? But if they are declared static then they will be allocated permanent RAM locations. Right? So as I make each of these variables static in turn I would expect to see the bss number increase (by 4, 2 and 2 bytes respectively). However something different happens:
With none static:
text data bss dec hex filename
30360 48 3868 34276 85e4 HBM2_MMM.elf
Changing previousTimeStamp to static I get 4 extra bss bytes but also 8 extra text bytes:
30368 48 3872 34288 85f0 HBM2_MMM.elf
Changing nextIndex to static as well I get no extra bss bytes and 48 extra text bytes!
30416 48 3872 34336 8620 HBM2_MMM.elf
Changing all three to static I get 4 extra bss bytes and a further 20 extra text bytes!
30436 48 3876 34360 8638 HBM2_MMM.elf
Making nextIndex and nextOffset uint32_t (all 3 static) I get 4 extra bss bytes and no change to text bytes:
30436 48 3880 34364 863c HBM2_MMM.elf
The bss numbers seem OK if bss memory has to be allocated on 4-byte boundaries – but what is happening to the text?
Hmmm… from the map file it looks like the size of the routine Flash_ReadRecord() is changing, so this is presumably related to the relative efficiencies of accessing variables if they are on the stack compared to when they are in RAM elsewhere.
Interestingly:
previousTimeStamp is used 6 times and making it static takes 8 extra bytes
nextIndex is used 3 times and making it static takes 48 extra bytes
nextOffset is used 4 times and making it static takes 20 extra bytes
I wonder why these numbers are as they are?
What are the general lessons if one wants to minimise memory use?
To minimize memory usage:
– consider alignment for structs: make sure you place 8bit, 16bit and 32bit variables together so there are no gaps in between
– try to use the smallest data type possible
– instead of using global memory, use dynamic/heap memory or local (stack) variables
– ensure that linker is removing unused variables
– try not to use the C runtime/ANSI library if possible: they tend to use their own variables/heap
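The first point of the list can be illustrated with two structs holding the same members in a different order. This is a sketch with made-up struct and member names; the exact sizes depend on the compiler and ABI, so the typical numbers mentioned below are an expectation for a 32-bit ARM target, not a guarantee.

```c
#include <stddef.h>
#include <stdint.h>

/* Small and large members interleaved: the compiler usually has to insert
 * padding so that 'counter' and 'index' start on their natural alignment. */
struct mixed_order {
    uint8_t  flag;
    uint32_t counter;
    uint8_t  mode;
    uint16_t index;
};

/* Same members, sorted from largest to smallest: no gaps are needed
 * between them on common ABIs. */
struct grouped_order {
    uint32_t counter;
    uint16_t index;
    uint8_t  flag;
    uint8_t  mode;
};

/* Number of bytes saved per instance by regrouping the members. */
static size_t padding_saved(void)
{
    return sizeof(struct mixed_order) - sizeof(struct grouped_order);
}
```

With gcc for ARM I would expect 12 bytes for the first struct and 8 bytes for the second, so an array of such structs in .bss or .data shrinks by a third.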
Yes, if you mark local variables static, they are ‘static locals’ and will consume global RAM (like other global variables). The code size might change too because the compiler will use different ways to access them. I hope that makes sense?
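A small example of the difference, with hypothetical function names. It only demonstrates the lifetime; how the compiler accesses each variable, and therefore how the code size changes, depends on the target and the optimization level.

```c
#include <stdint.h>

/* 'calls' is a static local: it gets one permanent RAM location (counted in
 * 'bss' because it starts at zero) and keeps its value between calls. */
static uint32_t count_calls(void)
{
    static uint32_t calls;
    calls++;
    return calls;
}

/* 'scratch' is an automatic local: it only exists during the call, on the
 * stack or in a register, and does not show up in 'data' or 'bss'. */
static uint32_t add_one(uint32_t value)
{
    uint32_t scratch = value + 1u;
    return scratch;
}
```

Calling count_calls() twice returns 1 and then 2, because the counter survives between the calls, while add_one() starts fresh every time.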
My application (KL05, PE, FreeRTOS) sometimes behaves as though it has run out of RAM even though (a) the compiler shows no errors and (b) the sum of .data and .bss is less than the RAM available in the chip. What might be happening? I wonder about the “heap” and “stack” defined in the CPU PE component. I’ve defined no heap and a small stack. Does the stack start at the end of RAM and extend towards the variables? Maybe the stack is trashing some of the variables? The FreeRTOS documentation says “The stack used upon program entry is not required once the RTOS scheduler has been started (unless your application … uses the stack as an interrupt stack as is done in the ARM Cortex-M)”.
http://www.freertos.org/FAQMem.html#RAMReduce
How can I establish the stack’s RAM requirements?
If you need heap depends how you are using the ANSI library. It depends as well which library you are using (newlib, newlib nano), as the library might use heap for things like printf() etc. The stack you assign in the Processor Expert component is the stack used out of reset and during startup and calling main(): I usually allocate 0x100 there. But this depends what you are actually doing during startup, until FreeRTOS is running. Once you start the RTOS, it is using the stack assigned to the tasks. How do you know that your application is running out of RAM? Yes, it could be that your application starts overwriting global variables/RAM. FreeRTOS has a check for stack overflows which is very useful, have you turned that on (it is on by default)? The FreeRTOS stack is allocated inside the FreeRTOS heap, so it would likely overwrite another task stack if this really happened. Otherwise it could be a dangling pointer?
I have used the useful FreeRTOS stack high water and stack overflow routines. To look for stack overflow in the main() stack, before FreeRTOS starts, I added this code to startup.c before the stack pointer initialisation:
extern char __END_BSS[];
unsigned long len = __SP_INIT - __END_BSS;
unsigned long dst = (unsigned long) __END_BSS;
while( len > 0)
{
*((char *)dst) = 0x44;
dst += 1;
len--;
}
Then I can check how much stack is used by examining the RAM for 0x44. I think my earlier problem was that the stack was extending down from the end of RAM and over-writing some of the variables at the end of .bss (without warning).
This also shows the benefit of doing as little initialisation as possible before FreeRTOS starts, to minimise the main() stack use, which is wasted after FreeRTOS starts. My stack use is currently 64 bytes.
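The evaluation step (examining the RAM for 0x44) can also be done in code instead of by hand in the memory view. The following is a host-side sketch with invented function names. It assumes the stack grows downward from the high end of the painted region, as described above, and it can under-report if the deepest stack bytes happen to contain 0x44.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FILL_PATTERN 0x44u

/* Paint the whole region with the fill pattern before the stack is used. */
static void paint_stack(uint8_t *low, size_t size)
{
    memset(low, FILL_PATTERN, size);
}

/* Count the still-painted bytes from the low end upward; everything above
 * the first overwritten byte is treated as used stack. */
static size_t stack_bytes_used(const uint8_t *low, size_t size)
{
    size_t untouched = 0;

    while (untouched < size && low[untouched] == FILL_PATTERN) {
        untouched++;
    }
    return size - untouched;
}
```

On the target, the region would be the memory between the end of .bss and the initial stack pointer. If stack_bytes_used() ever returns the full region size, the stack has reached the variables below it.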
I am left with a question: how are the data and bss numbers derived? I create a dummy array of a size so that the RAM is 100% used (verified by the .map file, which also gives ___data_size = 16). The Berkeley figures are data=48 bss=4056 when I would expect numbers that add up to 4096 (KL05 RAM size).
The size utility reads the ELF information of the ELF/Dwarf file and sums the object sizes up. I don’t know the actual implementation, but this is what I think it is.
Hi 2 ALL !
Talking about Kinetis and memory layout, I’d like to warn you, about a potential problem which may lead your code to the HardFault handler. The K-series devices include two blocks of on-chip SRAM: SRAM_L and SRAM_U, split at address 0x200000000. If say a 32-bit variable is placed across 0x200000000 e.t. 0x200000000-2 to 0x200000000+2, any attempt to access it would end up in the HardFault handler.
I think that If you want to not care, you should use either SRAM_L or SRAM_U, else, a special care should be taken what ends up where.
The ANs say that the faster one is the smaller one – SRAM_L, suitable for critical code (ISR ?).
Typo correction: the address has one zero less: 0x20000000
Ups, Erich already has written about it here: https://mcuoneclipse.com/2013/07/10/freertos-heap-with-segmented-kinetis-k-sram/
UPS, Erich was wrong about KL25. It’s not discontinuous SRAM space. See the above. Tested that goes to the HardFault. Test methodology:
In the .ld linker (GNU used), place absolute start address section:
.myBufBlock 0x1FFFFFFE:
{
KEEP(*(.myBufSection))
} > RAM
In the main.c, declare a global at this address:
volatile uint32_t __attribute__((section (".myBufSection"))) myBufVar __attribute__ ((aligned (1)));
In main.c trigger the hard fault:
printf("\n Test HF:");
myBufVar=11;
if(myBufVar==11) printf("- HF not happened");
It’s possible that my example is also *not* very correct, since at 0x1FFFFFFE, it’s only 16-bit-aligned, and if the M0+ core fails on non-32-bit aligned access, then the hardFault will be caused more likely because of this instead of cross-boundary access. I won’t flood the forum with cross-boundary DMA test, but may report the results.
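The two effects mixed up in this test, crossing the SRAM_L/SRAM_U boundary and plain misalignment, can be told apart with two small checks. This is a sketch with hypothetical helper names; the boundary value 0x20000000 is the one given in this thread. ARMv6-M cores such as the Cortex-M0+ do not support unaligned word accesses, so a misaligned test address faults regardless of the boundary.

```c
#include <stdbool.h>
#include <stdint.h>

#define SRAM_BOUNDARY 0x20000000u /* split between SRAM_L and SRAM_U */

/* True if an object of 'size' bytes at 'addr' starts below the boundary
 * and ends above it. The sum is done in 64 bits to avoid wrap-around. */
static bool crosses_boundary(uint32_t addr, uint32_t size)
{
    return size > 0u
        && addr < SRAM_BOUNDARY
        && ((uint64_t)addr + size) > SRAM_BOUNDARY;
}

/* True if 'addr' is a multiple of 'alignment' (which must not be zero). */
static bool is_aligned(uint32_t addr, uint32_t alignment)
{
    return (addr % alignment) == 0u;
}
```

For the address 0x1FFFFFFE used in the test, a 4-byte variable both crosses the boundary and is not 4-byte aligned, so the HardFault alone does not show which of the two caused it.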
Hi Erich,
As always, thanks again for the great tutorial. I am currently using FreeRTOS for my code. Is there any way I can know the Stack and Heap sizes during run time? I just want to know the stats of these while executing my code.
Hi Vishal,
yes, have a look at my Shell implementation (see System Status screenshot in https://mcuoneclipse.com/2012/08/05/a-shell-for-the-freedom-kl25z-board/). You can get the free heap size and a list of stack sizes for each task.
Additionally, there is an Eclipse plugin helping with the stack size: https://mcuoneclipse.com/2013/08/04/diy-free-toolchain-for-kinetis-part-5-freertos-eclipse-kernel-awareness-with-gdb/
Hi Erich,
Thanks a lot for the quick reply. I will take look at both of them.
Hi Vishal – you might find these 3 calls useful. Put the first two in a “print status” call to report usage – then tweak the task stack sizes. Implement the third to catch stack overflows:
FRTOS1_uxTaskGetStackHighWaterMark() // call for each task
FRTOS1_xPortGetFreeHeapSize() // won’t change once FRTOS resources are allocated
FRTOS1_vApplicationStackOverflowHook()
Be aware that you might call some stack-intensive code from within different tasks, meaning that code pushes up the high water mark in more than one task. You might be able to design around that (the XF print routines for example).
AFAIK the stack allocated in the PE CPU component is not used after FRTOS starts to run. If RAM is short you can minimise this by being careful with calls before starting FRTOS. I don’t know if there is a GetStackHighWaterMark() call for that stack.
Hi Erich, how are you?
First, thanks for the blog, it’s been very helpful!
Now I’ve got one question related to this topic.
My code is getting very close to the the 16k flash size of the MKE0216VLC4 uC I’m using. Here is the print:
text data bss dec hex filename
15068 140 600 15808 3dc0 AL-X15.elf
I’m getting the impression the linker is considering the ‘dec’ value as the amount to be stored in flash. Is that right? Shouldn’t it be the sum of ‘text’ and ‘data’ only, not including the ‘bss’?
I’m getting a lot of “region m_text overflowed with text and data” errors!
Thanks!
Hi Daniel,
yes, it should be the sum of text+data which is stored in flash.
Daniel,
to really know what the linker does: artificially increase the size of RAM and FLASH in your linker file so that it links. Then check the linker .map file produced (where the .elf is) to see what it actually did.
Great! Looking at the .map file I’ve found something rather strange to me. Here is part of the memory configuration defined by PE:
Name Origin Length Attributes
m_text 0x00000410 0x00003bf0 xr
Why is my .text region starting at 0x0410? I’m losing almost 1Kb of code space!! That’s why my code doesn’t fit!
I’ve tried to change the linker file via PE, but it gives me an error, saying: “There is flash-configuration area from 0x0400 to 0x040F”. What is that?
Do you have any suggestions? Should I manually override the PE link file and start .text at 0x0? Or should I create a new section and store some functions in that region?
I’m new to Freescale’s uCs, I’ve been using NXP for the last 5 years. Here is a usual .map file from my LPC1113 project:
Name Origin Length Attributes
MFlash24 0x00000000 0x00006000 xr
RamLoc4 0x10000000 0x00001000 xrw
Thanks!
Starting at address 0x0 there is the vector table. And at 0x400 there is the flash and security configuration. Do not mess this up! See
https://mcuoneclipse.com/2014/06/22/preventing-reverse-engineering-enabling-flash-security/
https://mcuoneclipse.com/2012/05/28/device-is-secure/
https://mcuoneclipse.com/2012/10/26/unsecuring-the-kl25z-freedom-board/
Otherwise, I highly recommend reading the device reference manual.
What about the region in between the end of the Vector Table (0x0C0) and the start of Flash Configuration (0x0400)? Is it safe to use it? That region is defined on the linker file as “m_text_000000C0”. Like I said, it’s almost 1K of code space wasted (832 bytes to be precise).
I’m all optimized (-Os) and there is still more code to go, so I really could use that space! I know there is a 32k version of this chip, but this is a very “cost sensitive” project!
Thanks!
Hi Daniel,
yes, you can use that area.
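The numbers in this exchange can be verified with a few constants. The names are invented for this sketch and the values are the ones quoted in the comments above (16 KByte flash, vector table ending at 0xC0, flash configuration at 0x400 to 0x40F, m_text starting at 0x410 with length 0x3bf0).

```c
#include <stdint.h>

/* Flash layout values quoted in this thread. */
enum {
    FLASH_SIZE         = 0x4000, /* 16 KByte device */
    VECTOR_TABLE_END   = 0x00c0, /* first address after the vector table */
    FLASH_CONFIG_START = 0x0400, /* flash configuration field, 16 bytes */
    M_TEXT_ORIGIN      = 0x0410,
    M_TEXT_LENGTH      = 0x3bf0
};

/* Unused bytes between the vector table and the flash configuration. */
static uint32_t gap_bytes(void)
{
    return FLASH_CONFIG_START - VECTOR_TABLE_END;
}

/* First address after the m_text region. */
static uint32_t m_text_end(void)
{
    return M_TEXT_ORIGIN + M_TEXT_LENGTH;
}
```

gap_bytes() gives 832, the figure mentioned above, and m_text_end() lands exactly on 0x4000, so m_text already reaches the end of the flash and the gap is the only area left to reclaim.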
Hi Daniel – it would be ideal if the linker could place code from 0xc0 to 0x3ff, then start again at 0x410. I don’t know if that can be done, but I have managed to get the linker to use some of the lost space to store constants. The hack has three parts (no guarantee this is bug-free):
1) Add a new SECTIONS definition in the .ld linker file, after the vector table section:
.applicationConstants :
{
. = ALIGN(4);
KEEP(*(.application_const)) /* Flash Configuration Field (FCF) = 0x400 */
. = ALIGN(4);
} > m_text_000000C0
2) Add this definition of “CONSTANT” in a .h header file:
#define CONSTANT __attribute__ ((section (".application_const"))) const
3) When you have some constant data that can reside in this ” .application_const” section, make use of the CONSTANT definition:
CONSTANT uint8_t bar[2] = {0x55, 0xaa};
Then bar gets put in the .application_const region, which was previously unused memory. But most of my constants are strings, as in:
Xprintf("foo\r\n");
I can force these strings into the .application_const region like this:
CONSTANT char string1[] = "foo\r\n";
then later:
Xprintf(string1);
Not so easy to read, but it works.
Hi Erich.
My constants sum up to only 0xA4 bytes, so I’m manually selecting functions to store on that region to take the most space possible. Hope it works!
It would be nice to have an automated script to search for functions in *(.text*) and place them in .text2 section, until it’s filled. My knowledge of linker file scripts is limited, so I have no idea if that is even possible!
Thanks!
Hi Daniel,
there is no way (to my knowledge) to distribute objects in the linker across multiple sections. This is one feature other linkers have implemented, but not the GNU one 😦
Hi Erich
Thanks for your great post.
But I have a question.
In my Eclipse environment, selecting sysV mode, size data printed on dec.
I want to see size data on dec.
Can you help me this problem?
Hi RedSun,
set the format to Berkeley, and then you get exactly what I have.
Hi Erich,
I’ve started writing a customized bootloader for KL25z uC at address 0x00000000 in flash.
The startup code which I normally use to write an application at address other than 0x00000000 doesn’t seem to work properly at address 0x00000000. I use the startup files crt0.s, startup.c, system.c and system.s provided in USB HID bootloader by Freescale. I’m able to successfully compile it, and flash it, but during Flashing IAR shows the following warnings:
Warning: Target inconsistency detected in Memory range 0x00000000-0x000000BD
Warning: Target inconsistency detected in Memory range 0x00000410-0x00000D41
These memories belong to vector table and configuration registers respectively. I’m not sure why these warnings are popping up. After Flashing I debug the code stepping and at “SystemInit” as shown below of crt0.s file, the value in r0 register is expected to be 0xAEC (indicated by function pointer) but it is filled by 0xA61(this address belongs to some instruction of a function other than “SystemInit”). From that instruction onwards I lose complete control of Debug.
#define SCS_BASE (0xE000E000) /*!< System Control Space Base Address */
#define SCB_BASE (SCS_BASE + 0x0D00) /*!< System Control Block Base Address */
#define SCB_VTOR_OFFSET (0x00000008)
PUBLIC Reset_Handler
EXPORT Reset_Handler
Reset_Handler
// Mask interrupts
cpsid i
// Set VTOR register in SCB first thing we do.
ldr r0,=__vector_table
ldr r1,=SCB_BASE
str r0,[r1, #SCB_VTOR_OFFSET]
// Init the rest of the registers
ldr r2,=0
ldr r3,=0
ldr r4,=0
ldr r5,=0
ldr r6,=0
ldr r7,=0
mov r8,r7
mov r9,r7
mov r10,r7
mov r11,r7
mov r12,r7
// Initialize the stack pointer
ldr r0,=CSTACK$$Limit
mov r13,r0
// Call the CMSIS system init routine
ldr r0,=SystemInit
blx r0
// Init .data and .bss sections
ldr r0,=init_data_bss
blx r0
// Init interrupts
ldr r0,=init_interrupts
blx r0
// Unmask interrupts
cpsie i
// Set argc and argv to NULL before calling main().
ldr r0,=0
ldr r1,=0
ldr r2,=main
blx r2
__done
B __done
I suspect the problem is in the linker file. Here is the linker file I'm using.
/*###ICF### Section handled by ICF editor, don't touch! ****/
/*-Editor annotation file-*/
/* IcfEditorFile="$TOOLKIT_DIR$\config\ide\IcfEditor\cortex_v1_0.xml" */
define symbol __CODE_START_ADDRESS__ = 0x00000000;
define symbol __ICFEDIT_intvec_start__ = __CODE_START_ADDRESS__;
/*-Memory Regions-*/
define symbol __ICFEDIT_region_ROM_start__ = __CODE_START_ADDRESS__;
define symbol __ICFEDIT_region_ROM_end__ = 0x0001FFFF;
define symbol __ICFEDIT_region_RAM_start__ = 0x1FFFF000;
define symbol __ICFEDIT_region_RAM_end__ = 0x20002FFF;
define symbol __ICFEDIT_region_RAM1_start__ = 0x1FFFF000;
define symbol __ICFEDIT_region_RAM1_end__ = 0x1FFFFFFF;
define symbol __ICFEDIT_region_RAM2_start__ = 0x20000000;
define symbol __ICFEDIT_region_RAM2_end__ = 0x20002FFF;
define symbol __IntVectTable_start__ = __CODE_START_ADDRESS__ + 0x00000000;
define symbol __IntVectTable_end__ = __CODE_START_ADDRESS__ + 0x0000003F;
define symbol __FlashConfig_start__ = __CODE_START_ADDRESS__ + 0x00000400;
define symbol __FlashConfig_end__ = __CODE_START_ADDRESS__ + 0x0000040f;
/*-Sizes-*/
define symbol __ICFEDIT_size_cstack__ = (3 * 1024);
define symbol __ICFEDIT_size_heap__ = (2 * 1024);
define exported symbol __BOOT_STACK_ADDRESS = __ICFEDIT_region_RAM_end__ - 8;
/**** End of ICF editor section. ###ICF###*/
define memory mem with size = 4G;
//define region ROM_region = mem:[from __ICFEDIT_region_ROM_start__ to (__FlashConfig_start__ - 1)] | mem:[from (__FlashConfig_end__ + 1) to __ICFEDIT_region_ROM_end__];
define region ROM_region = mem:[from __ICFEDIT_region_ROM_start__ to __ICFEDIT_region_ROM_end__];
define region RAM_region = mem:[from __ICFEDIT_region_RAM_start__ to __ICFEDIT_region_RAM_end__];
define region FlashConfig_region = mem:[from __FlashConfig_start__ to __FlashConfig_end__];
define region IntVectTable_region = mem:[from __IntVectTable_start__ to __IntVectTable_end__];
define block CSTACK with alignment = 8, size = __ICFEDIT_size_cstack__ { };
define block HEAP with alignment = 8, size = __ICFEDIT_size_heap__ { };
do not initialize { section .noinit };
initialize manually { readwrite };
initialize manually { section .data};
initialize manually { section .textrw };
define block CodeRelocateRam { section .textrw };
define block CodeRelocate { section .textrw_init };
define block BootloaderFlash { readonly, block CodeRelocate };
define block BootloaderRam { readwrite, block CodeRelocateRam, block CSTACK, block HEAP };
place at address mem:__ICFEDIT_intvec_start__ { readonly section .intvec, readonly section .noinit };
place in ROM_region { block BootloaderFlash };
place in RAM_region { block BootloaderRam };
place in IntVectTable_region { section IntVectTable};
place in FlashConfig_region { section FlashConfig};
Am I using a correctly configured linker file?
If I change the address in the linker to be
define symbol __ICFEDIT_intvec_start__ = 0x8000;
things start working.
I request you to guide me to tackle this issue.
Best Regards,
Bharath
Hi Bharath,
I believe the warning is that you have these memory areas allocated twice in your downloaded file? Can you check that you are not allocating these areas multiple times (e.g. by the application and by the bootloader)?
I hope this helps,
Erich
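One way to follow up on this hint is to check the regions of the ICF file for overlaps. In the file quoted above, ROM_region runs from 0x00000000 to 0x0001FFFF, so it contains both IntVectTable_region and FlashConfig_region, and content could be placed at the same addresses through more than one "place" directive. The following is only a host-side sketch with the values copied from that file; whether this overlap is the actual cause of the IAR warnings is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* An inclusive address range, like "mem:[from A to B]" in an ICF file. */
typedef struct {
    uint32_t start;
    uint32_t end;
} region_t;

/* Values copied from the ICF file quoted in the comment above. */
const region_t rom_region          = { 0x00000000u, 0x0001FFFFu };
const region_t int_vect_region     = { 0x00000000u, 0x0000003Fu };
const region_t flash_config_region = { 0x00000400u, 0x0000040Fu };

/* True if the two inclusive ranges share at least one address. */
bool regions_overlap(region_t a, region_t b)
{
    assert(a.start <= a.end && b.start <= b.end);
    return a.start <= b.end && b.start <= a.end;
}
```

With these numbers, ROM_region overlaps both of the small regions, while the vector table and the flash configuration field do not overlap each other. The commented-out ROM_region definition in the ICF file, which excludes the flash configuration range, would avoid one of the two overlaps.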
Hi Erich,
I’m working with a FRDM-K64F board, under KDS 3.1.0 and KSDK 2.0.
I have made some changes in the MK64FN1M0xxx12_flash.ld file to get my program running, but they do not seem correct. See the messages and my comments below.
Any help is welcome
Best regards
Nadine
====================== ORIGINAL SETTINGS ARE:
/* Specify the memory areas */
MEMORY
{
m_interrupts (RX) : ORIGIN = 0x00000000, LENGTH = 0x00000400
m_flash_config (RX) : ORIGIN = 0x00000400, LENGTH = 0x00000010
m_text (RX) : ORIGIN = 0x00000410, LENGTH = 0x000FFBF0
m_data (RW) : ORIGIN = 0x1FFF0000, LENGTH = 0x00010000
m_data_2 (RW) : ORIGIN = 0x20000000, LENGTH = 0x00030000
}
====================== I CHANGE THEM TO:
/* Specify the memory areas */
MEMORY
{
m_interrupts (RX) : ORIGIN = 0x00000000, LENGTH = 0x00000400
m_flash_config (RX) : ORIGIN = 0x00000400, LENGTH = 0x00000010
m_text (RX) : ORIGIN = 0x00000410, LENGTH = 0x000FFBF0
m_data (RW) : ORIGIN = 0x1FFF0000, LENGTH = 0x00020000
m_data_2 (RW) : ORIGIN = 0x20010000, LENGTH = 0x00020000
}
====================== THEN, MY PROGRAM LINKS and RUNS OK ….
12:03:50 **** Incremental Build of configuration Debug for project PLC ****
make -j2 all
‘Invoking: Cross ARM GNU Print Size’
arm-none-eabi-size --format=berkeley "PLC.elf"
text data bss dec hex filename
115416 516 129512 245444 3bec4 PLC.elf
‘Finished building: PLC.siz’
‘ ‘
====================== I ADD A LARGE TABLE IN MY PROGRAM, AND I HAVE A m_data overflowed message
‘Building target: PLC.elf’
‘Invoking: Cross ARM C++ Linker’
arm-none-eabi-g++ -mcpu=cortex-m4 -mthumb -mfloat-abi=hard -mfpu=fpv4-sp-d16 -O0 -fmessage-length=0 -fsigned-char -ffunction-sections -fdata-sections -Wall -g3 -T "../MK64FN1M0xxx12_flash.ld" -Xlinker --gc-sections -Wl,-Map,"PLC.map" -specs=nosys.specs -specs=nano.specs -Xlinker -z -Xlinker muldefs -o "PLC.elf" ./utilities/fsl_debug_console.o ./utilities/fsl_notifier.o ./utilities/fsl_sbrk.o ./startup/startup_MK64F12.o ./startup/system_MK64F12.o ./source/CRC_get.o ./source/DXpsk_DeMod.o ………..
c:/freescale/kds_3.0.0/toolchain/bin/../lib/gcc/arm-none-eabi/4.8.4/../../../../arm-none-eabi/bin/ld.exe: PLC.elf section `.bss' will not fit in region `m_data'
c:/freescale/kds_3.0.0/toolchain/bin/../lib/gcc/arm-none-eabi/4.8.4/../../../../arm-none-eabi/bin/ld.exe: region `m_data' overflowed by 26896 bytes
collect2.exe: error: ld returned 1 exit status
make: *** [PLC.elf] Error 1
====================== I INCREASE THE m_data as seen below
/* Specify the memory areas */
MEMORY
{
m_interrupts (RX) : ORIGIN = 0x00000000, LENGTH = 0x00000400
m_flash_config (RX) : ORIGIN = 0x00000400, LENGTH = 0x00000010
m_text (RX) : ORIGIN = 0x00000410, LENGTH = 0x000FFBF0
m_data (RW) : ORIGIN = 0x1FFF0000, LENGTH = 0x00028000
m_data_2 (RW) : ORIGIN = 0x20018000, LENGTH = 0x00020000
}
====================== BUT, A PROBLEM APPEARS WHILE DOWNLOADING THE PROGRAM
‘Invoking: Cross ARM GNU Create Flash Image’
arm-none-eabi-objcopy -O ihex “PLC.elf” “PLC.hex”
‘Invoking: Cross ARM GNU Print Size’
arm-none-eabi-size --format=berkeley "PLC.elf"
text data bss dec hex filename
115416 516 159504 275436 433ec PLC.elf
‘Finished building: PLC.siz’
‘Finished building: PLC.hex’
‘ ‘
‘ ‘
Open On-Chip Debugger 0.8.0-dev (2015-01-09-16:22)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.sourceforge.net/doc/doxygen/bugs.html
Info : only one transport option; autoselect ‘cmsis-dap’
Info : CMSIS-DAP: SWD Supported
Info : CMSIS-DAP: Interface Initialised (SWD)
Info : add flash_bank kinetis kinetis.flash
cortex_m reset_config sysresetreq
adapter speed: 1000 kHz
Started by GNU ARM Eclipse
Info : CMSIS-DAP: FW Version = 1.0
Info : SWCLK/TCK = 0 SWDIO/TMS = 1 TDI = 0 TDO = 0 nTRST = 0 nRESET = 1
Info : DAP_SWJ Sequence (reset: 50+ ‘1’ followed by 0)
Info : CMSIS-DAP: Interface ready
Info : clock speed 1000 kHz
Info : IDCODE 0x2ba01477
Info : kinetis.cpu: hardware has 6 breakpoints, 4 watchpoints
Info : accepting ‘gdb’ connection from 3333
Info : Probing flash info for bank 0
Warn : acknowledgment received, but no packet pending
undefined debug reason 7 – target needs reset
target state: halted
target halted due to debug-request, current mode: Thread
xPSR: 0x01000000 pc: 0x000004d0 msp: 0x20030000
semihosting is enabled
Warn : Any changes to flash configuration field will not take effect until next reset
target state: halted
target halted due to debug-request, current mode: Thread
xPSR: 0x01000000 pc: 0x000004d8 msp: 0x20038000, semihosting
===== arm v7m registers
(0) r0 (/32): 0x00000000
(1) r1 (/32): 0x00000000
(2) r2 (/32): 0x00000000
(3) r3 (/32): 0x00000000
(4) r4 (/32): 0x00000000
(5) r5 (/32): 0x00000000
(6) r6 (/32): 0x00000000
(7) r7 (/32): 0x00000000
(8) r8 (/32): 0x00000000
(9) r9 (/32): 0x00000000
(10) r10 (/32): 0x00000000
(11) r11 (/32): 0x00000000
(12) r12 (/32): 0x00000000
(13) sp (/32): 0x20038000
(14) lr (/32): 0xFFFFFFFF
(15) pc (/32): 0x000004D8
(16) xPSR (/32): 0x01000000
(17) msp (/32): 0x20038000
(18) psp (/32): 0x00000000
(19) primask (/1): 0x00
(20) basepri (/8): 0x00
(21) faultmask (/1): 0x00
(22) control (/2): 0x00
===== Cortex-M DWT registers
(23) dwt_ctrl (/32)
(24) dwt_cyccnt (/32)
(25) dwt_0_comp (/32)
(26) dwt_0_mask (/4)
(27) dwt_0_function (/32)
(28) dwt_1_comp (/32)
(29) dwt_1_mask (/4)
(30) dwt_1_function (/32)
(31) dwt_2_comp (/32)
(32) dwt_2_mask (/4)
(33) dwt_2_function (/32)
(34) dwt_3_comp (/32)
(35) dwt_3_mask (/4)
(36) dwt_3_function (/32)
Error: CMSIS-DAP: Write Error (0x04)
Error: CMSIS-DAP: Write Error (0x04)
Error: CMSIS-DAP: Read Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Write Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Read Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Read Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Read Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Read Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Write Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Read Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Read Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Read Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Read Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Read Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Write Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Read Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Read Error (0x04)
Error: CMSIS-DAP: Write Error (0x04)
Polling target kinetis.cpu failed, GDB will be halted. Polling again in 100ms
Error: CMSIS-DAP: Write Error (0x04)
Error: Failed to write memory at 0x00000000
Error: CMSIS-DAP: Write Error (0x04)
Polling target kinetis.cpu failed, GDB will be halted. Polling again in 300ms
Error: CMSIS-DAP: Read Error (0x04)
Error: Failed to read memory at 0x00000000
Error: CMSIS-DAP: Write Error (0x04)
Polling target kinetis.cpu failed, GDB will be halted. Polling again in 700ms
Error: CMSIS-DAP: Write Error (0x04)
Polling target kinetis.cpu failed, GDB will be halted. Polling again in 1500ms
Error: CMSIS-DAP: Write Error (0x04)
Polling target kinetis.cpu failed, GDB will be halted. Polling again in 3100ms
Info : dropped ‘gdb’ connection
Hi Nadine,
first off: be careful with ignoring that memory boundary at 0x2000’0000 (see https://mcuoneclipse.com/2013/07/10/freertos-heap-with-segmented-kinetis-k-sram/). If you know what you are doing, simply declare one area of memory (without the split), see that discussion in this post.
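Independent of the boundary at 0x2000’0000, the numbers in the last linker file can be checked against the original one. A minimal sketch, assuming the original MEMORY block describes the physical SRAM (0x1FFF0000 up to 0x20000000 + 0x00030000); the function names are only for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* SRAM limits as implied by the ORIGINAL linker file quoted above
 * (assumption: that file matches the physical memory of the device). */
#define SRAM_START 0x1FFF0000u
#define SRAM_END   0x20030000u  /* first address past the SRAM */

/* First address past a MEMORY entry with the given ORIGIN and LENGTH. */
uint32_t region_end(uint32_t origin, uint32_t length)
{
    assert(length <= UINT32_MAX - origin);  /* no 32-bit wrap-around */
    return origin + length;
}

/* True if the whole region lies inside the SRAM limits above. */
bool region_fits_sram(uint32_t origin, uint32_t length)
{
    return origin >= SRAM_START && region_end(origin, length) <= SRAM_END;
}
```

For the last change (m_data_2 with ORIGIN = 0x20018000 and LENGTH = 0x00020000) the region ends at 0x20038000, which is past 0x20030000. That value matches the "msp: 0x20038000" reported in the OpenOCD log, so the initial stack pointer may point outside the RAM; this would be one possible explanation for the failing download.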
Hi Erich,
Thanks for the answer.
Now, I better understand the usage of m_data and m_data2 sections.
I have made the suggested changes, and things are working better.
Regards
Nadine
I blogged about my investigation: Size optimization for .elf ARM binaries in Kinetis Design Studio ‘link not valid any more!’
thanks for sharing that link! One point to note: the size of the debug information does not matter, or only if you care about how big the file is on the host PC, because only the code (and data) gets downloaded to the target. So all the .debug_* sections and their sizes do not matter: this stuff does not take space on your board (as long as you are not running something like an Embedded Linux). What does shrink the size is using the optimization settings of the compiler. Here I typically see a 50% reduction in my projects as well, as without optimizations the gcc compiler produces completely unoptimized code.
I am a digital verification engineer and not an expert in linker or compilers.
I have a particular question on MEMORY defined in .ld file. And, it is related to ARM GCC bare metal linker.
What I noticed is, if I don’t specify MEMORY in .ld file and in .text specify the start address as 0x0000_0000, and 0x0000_0000 is ROM address space, there is no initialization done by memset function.
But, when I specify FLASH and RAM in MEMORY, there is a zero initialization of .bss section. I can actually see ARM core writing 0’s to certain addresses.
What is confusing is why there is a change of behavior? Does linker do something clever when there is no memory defined with ‘rwx’ attributes?
Does it assume memory to be ‘rx’?
What happens to the heap if everything is ‘rx’?
Hi Neha,
I always have used the MEMORY block. What you see as initialization of bss is what is required by a standard ANSI C/C++ startup code. To follow the standard, the uninitialized memory has to be initialized with zeros.
If you don’t want that or you don’t need that, you could skip this initialization part in the startup code.
Erich
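The zero-initialization mentioned above is usually just a small loop in the startup code. A minimal sketch of such a loop; the symbol names in the trailing comment are assumptions and depend on the linker script in use:

```c
#include <assert.h>
#include <stdint.h>

/* Write zero to every word in [start, end).
 * This is the "zero-out" the startup code performs for .bss. */
void zero_fill(uint32_t *start, uint32_t *end)
{
    assert(start <= end);
    while (start < end) {
        *start++ = 0u;
    }
}

/* On a target, the bounds would come from symbols defined in the linker
 * script, for example (assumed names, check your own .ld file):
 *
 *   extern uint32_t __bss_start__, __bss_end__;
 *   zero_fill(&__bss_start__, &__bss_end__);
 */
```

Skipping this step means removing (or not calling) such a loop in the startup code; variables without an initializer then start with whatever happens to be in RAM.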
thanks for this explanation. is there any correlation between ‘hex’ or ‘dec’ and size of hex-file?
‘hex’ and ‘dec’ are simply the same size, with different number bases.
Now they don’t really correlate to the size of the hex file, as the hex file uses an ASCII encoding (see https://mcuoneclipse.com/2012/09/27/s-record-intel-hex-and-binary-files/), and therefore the size is very different, although it is in the range of twice what the GNU size utility reports for the code size, as every code byte is encoded in two ASCII characters.
I hope this helps,
Erich
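To make the "in the range of twice" more concrete, here is a rough estimate for the Intel HEX format. It is only a sketch: it counts data records and ignores line endings, extended address records and the end-of-file record. Note also that the payload in the file is roughly text + data, because bss has no bytes in the file.

```c
#include <assert.h>
#include <stdint.h>

/* Characters in one Intel HEX data record with n data bytes (no line ending):
 * ':' + 2 (byte count) + 4 (address) + 2 (record type) + 2*n (data) + 2 (checksum). */
uint32_t ihex_record_chars(uint32_t n)
{
    assert(n <= 255u);
    return 1u + 2u + 4u + 2u + 2u * n + 2u;
}

/* Rough character count for 'bytes' of payload, split into records of
 * 'per_record' data bytes each (16 is a common choice). */
uint32_t ihex_estimate_chars(uint32_t bytes, uint32_t per_record)
{
    uint32_t full, rest, total;

    assert(per_record > 0u && per_record <= 255u);
    full  = bytes / per_record;
    rest  = bytes % per_record;
    total = full * ihex_record_chars(per_record);
    if (rest != 0u) {
        total += ihex_record_chars(rest);  /* last, shorter record */
    }
    return total;
}
```

With 16 data bytes per record, each record is 43 characters for 16 bytes of payload, so the file ends up somewhat larger than twice the payload size.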
From the Department of Unabashed Astonishment:
=== Debug
arm-none-eabi-size --format=berkeley "evaluation.elf"
text data bss dec hex filename
62560 540 3568 66668 1046c evaluation.elf
=== Release
text data bss dec hex filename
17996 340 3248 21584 5450 evaluation.elf
Finished building: evaluation.siz
I’ll go back and check my optimization settings, but still, that’s a 3x reduction going from debug to release.
Hello,
I’ve a problem of space in a cortex M3 when I try to debug, but not when I create the release version. Can somebody tell me how to debug a release version or create a debug one which fits into the available space?
thanks
BR
‘Release’ usually is just with higher optimization levels, and with debug information removed (which does not make sense): See https://mcuoneclipse.com/2012/06/01/debug-vs-release/ on the general concept. I suggest you turn on/enable debug information, then you can debug the ‘release’ version with source level debugging as well.
Hi Erich,
Thanks for the post. This is very helpful.
Is there an option that I can use to pad the ELF (output) so that it always takes up the exact same space in flash?
Reference:
I am using the Zynq 7020 SoC from Xilinx (and for firmware development, I am using the Xilinx SDK, which is basically Eclipse).
regards,
Srijan.
Hi Srijan,
You can fill up things with the linker script and even use your own pattern, see https://mcuoneclipse.com/2014/06/23/filling-unused-memory-with-the-gnu-linker/.
I hope this helps,
Erich
dec = text + data + bss
yes, thanks for that reminder. Will make it clear in the article.
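The relation can be checked with the numbers posted earlier in this thread for PLC.elf (text 115416, data 516, bss 129512, dec 245444). The sketch below also adds the FLASH and RAM totals as described in the article, where 'data' counts for both; the type and function names are only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* One line of 'size' output in Berkeley format. */
typedef struct {
    uint32_t text;
    uint32_t data;
    uint32_t bss;
} size_info_t;

/* The 'dec' column: the plain sum of the three columns. */
uint32_t size_dec(size_info_t s)   { return s.text + s.data + s.bss; }

/* FLASH usage: code and constants plus the initial values of 'data'. */
uint32_t size_flash(size_info_t s) { return s.text + s.data; }

/* RAM usage: initialized plus zero-initialized variables. */
uint32_t size_ram(size_info_t s)   { return s.data + s.bss; }

/* Self-check against the PLC.elf line quoted in this thread. */
void check_plc_elf(void)
{
    const size_info_t plc = { 115416u, 516u, 129512u };
    assert(size_dec(plc) == 245444u);  /* 0x3bec4, as printed by 'size' */
}
```

Note that 'dec' is neither the FLASH nor the RAM usage, because it counts 'data' only once and mixes both memories.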
Hi,
How do we do something similar for MSP430 in the CCS IDE?
Regards
Suresh
Hi Suresh,
you should be able to call the size tool from the CCS as post-build step as outlined in https://mcuoneclipse.com/2014/05/04/printing-code-size-information-in-eclipse/
I hope this helps,
Erich
quote: “💡 I like to remember ‘bss’ as ‘Better Save Space’ :-). As bss ends up in RAM, and RAM is very valuable for a microcontroller, I want to keep the amount of variables which end up in the .bss at the absolute minimum.”
????
‘Better Save Space’ refers to ROM space !
So you should want to have as many variables as possible in .BSS instead of in .DATA !
All variables are stored either in .DATA or .BSS and .DATA also resides in RAM but ADDITIONALLY needs initialisation data in ROM and .BSS data does not !
So why do you wanna keep the amount of variables which end up in the .bss at the absolute minimum ??
Hi,
I suggest you have a read at https://en.wikipedia.org/wiki/Data_segment
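For reference, this is how typical global definitions usually end up in the different sections with gcc. It is a sketch with made-up variable names; the exact placement can depend on compiler options and the linker script.

```c
#include <assert.h>
#include <stdint.h>

/* Initialized global: lives in RAM, its initial value is stored in FLASH
 * and copied by the startup code. Counted in 'data'. */
int32_t in_data = 0x12345678;

/* Global without initializer: lives in RAM only and is zeroed by the
 * startup code. Counted in 'bss'. */
int32_t in_bss;

/* Constant table: lives in FLASH only. Counted in 'text'. */
const int32_t in_text[] = { 5, 0, 1, 5, 6, 7, 9, 10 };

/* What a program can rely on once the startup code has run. */
void check_startup_values(void)
{
    assert(in_data == 0x12345678);
    assert(in_bss == 0);
    assert(in_text[0] == 5);
}
```

Both in_data and in_bss cost 4 bytes of RAM each; only in_data costs 4 additional bytes of FLASH for its initial value.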
Hi,
I am a beginner at this, trying to compile and debug on an atmel eval board with a cortex m4. When I compile the program, I always get
text data bss dec
0x1c594 0x108c 0x506c 140940
and adding and removing things from the code does not change these values.
I try adding: float32_t fft_out[10000000];
and this has no effect, even though I would want it to complain that I am out of memory.
Any idea if and where I have a problem in my settings which would prevent these values from being correct?
Thanks!
The linker will remove (‘dead-strip’) any variable or code you are not using. So just adding something and not referencing it will not change anything.
you will see a difference if you do
float32_t var;
and then use it e.g. as
var = fft_out[0];
in main().
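A slightly more complete sketch of the same idea: the array is read through a function and the value is stored in a volatile variable, so neither the compiler nor the linker has a reason to drop it. The names and the (deliberately small) array length are assumptions for illustration, and plain float is used instead of the CMSIS float32_t type to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

#define FFT_LEN 1024u        /* hypothetical length, kept small on purpose */

float fft_out[FFT_LEN];      /* no initializer: ends up in 'bss' */
volatile float sink;         /* volatile: the store cannot be optimized away */

/* Reading an element makes the array "used", so it is kept in the image. */
float read_fft_out(size_t i)
{
    assert(i < FFT_LEN);
    sink = fft_out[i];
    return sink;
}
```

Once something like read_fft_out(0) is called from main(), the 'bss' number reported by the size utility should grow by the size of the array.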
Hello Erich,
A question that may be related. With your approach I am able to get the flash footprint of my functions. What if I would like to know how much RAM a single function takes? With -size I am only able to get the flash footprint of each function; using -fstack-usage I get huge values for simple procedures, so I was wondering if I should inspect the map file. But again, in the map file I can see only the flash footprint.
I am testing in a simple way, despite knowing that the function is not really useful:
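#include <stdint.h>

uint32_t test_function(void); /* prototype must match the definition below */

int main(void)
{
    (void)test_function(); /* result intentionally unused */
    return 0;
}

uint32_t test_function(void)
{
    uint32_t internalVal = 0;
    internalVal++;
    return internalVal;
}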
any hint on this?
K.R.
Hi Luca,
are you using the approach I described in https://mcuoneclipse.com/2015/08/21/gnu-static-stack-usage-analysis/?
I don’t get huge numbers, for your example I get:
MK64FN1M0xxx12_Project.c:72:5:main 8 static
MK64FN1M0xxx12_Project.c:76:10:test_function 16 static
which is reasonable.
Hello Erich,
Yes I tried with that approach but I have some issues (https://community.nxp.com/thread/486445) because results vary according to the perl version used. I was looking for some alternative and tested method that does not involve making a new script by myself, using GNU toolchain if possible.
Hi Luca,
Because I don’t need the stack usage for every build, I stuck with calling it from the shell/console and not as part of the build process.
Maybe this is related to the thing you see: I had different set of tools (in my case: scp) called depending on if I do it from the application or from the console.
The issue seems that it depends if the application (Eclipse in this case) is 32bit or 64bit: Windows makes a kind of ‘shadow’ environment and depending on if the caller is 32bit or 64bit it might call different binaries in different folders :-(.
Hello Erich,
Thank you for the post.
I have a question; could you give some advice? If I do not use the standard library startup, such as __main, I must write some assembly code to copy the data section from ROM to RAM. How do I find the start address of the data section, and its size?
Hi Henry,
thank you!
Have a read at https://mcuoneclipse.com/2016/11/01/getting-the-memory-range-of-sections-with-gnu-linker-files/. You can set your own symbols in the linker file, like a start and end symbol. The size you get with the address difference between the symbols.
I hope this helps?
Erich
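To illustrate the idea in C rather than assembly: with a start symbol, an end symbol and a load address from the linker file, the copy-down is a simple loop and the size is the address difference. A minimal sketch; the symbol names in the trailing comment are assumptions and must match what your own linker file defines.

```c
#include <assert.h>
#include <stdint.h>

/* Copy the initial values from their load address (FLASH/ROM) to the
 * run address range [start, end) in RAM, one word at a time. */
void copy_down(const uint32_t *load, uint32_t *start, const uint32_t *end)
{
    assert(start <= end);
    while (start < end) {
        *start++ = *load++;
    }
}

/* Section size in bytes: the address difference between the two symbols. */
uint32_t section_size(const uint32_t *start, const uint32_t *end)
{
    assert(start <= end);
    return (uint32_t)((const uint8_t *)end - (const uint8_t *)start);
}

/* On a target (assumed symbol names, defined in the linker file):
 *
 *   extern uint32_t __data_load__, __data_start__, __data_end__;
 *   copy_down(&__data_load__, &__data_start__, &__data_end__);
 */
```

The same three addresses can be loaded into registers in assembly code, with the loop written as a load/store pair and a compare.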
Thank you for your reply. It helped me very much.
Hello Erich,
I can see that this is still a hot topic after 6 years (and gladly I can polish my skills reading back your posts!).
I have a question about local variables initialized reading register values. something like:
uint32_t wdog_cfg = (uint32_t)((FEATURE_WDOG_CLK_FROM_LPO << WDOG_CS_CLK_SHIFT))
where is it placed?
I expect the wdog_cfg to be pushed in the stack memory while the mcu evaluates the value of FEATURE_WDOG_CLK_FROM_LPO << WDOG_CS_CLK_SHIFT
are both of them part of the .text section?
What if the variable is local and initialized?
uint32_t wdog_cfg = 0xAAAAAAAA;
the numeric value will be placed in flash in text?
My goal is to perform some watchdog test/init before branching into init_data_bss, but I am wondering if I risk messing up uninitialized RAM data with this approach!
Hi Luca,
as a local variable, the wdog_cfg is placed (by default) on the stack. But if the compiler is optimizing it, it could be kept in register and actually never stored on the stack memory itself.
Numeric constants such as 0xAAAAAAAA are handled by the compiler. Usually the compiler places the constant as an immediate value into the code, usually at the end of the function (especially for ARM Cortex).
There are some details about this discussed in this article: https://mcuoneclipse.com/tag/execute-only-code/
I hope this helps,
Erich
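For code that runs before init_data_bss, the safe pattern is to rely only on local variables and compile-time constants, and not to read global variables, because .data has not been copied and .bss has not been zeroed yet at that point. A minimal sketch; the two macro values are placeholders (the real ones come from the device header files) and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder values for illustration only. */
#define FEATURE_WDOG_CLK_FROM_LPO 1u
#define WDOG_CS_CLK_SHIFT         8u

/* Uses only a local variable and constants: nothing here depends on
 * .data or .bss being initialized. */
uint32_t early_wdog_cfg(void)
{
    /* The shift of two constants is evaluated by the compiler; the result
     * is part of the code in FLASH, the local lives in a register or on
     * the stack. */
    uint32_t wdog_cfg = (uint32_t)(FEATURE_WDOG_CLK_FROM_LPO << WDOG_CS_CLK_SHIFT);

    assert(wdog_cfg != 0u);  /* host-side sanity check only */
    return wdog_cfg;
}
```

The stack pointer must of course be set up before such a function is called, which the startup code shown earlier in this thread does before calling SystemInit.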
The .romp is used by the linker for the ‘copy-down’ and initialization of .data. But it looks confusing: it is shown with addresses in RAM? Checking the linker map file shows:
Ah! That actually is not in RAM, but in FLASH: the linker maps this to the FLASH address 0x1b60! So this size 0x18 really needs to be added to the FLASH size too!
Does this mean that we have to consider the size of .romp in both FLASH as well as RAM?
yes.
Hello Erich,
Can you please explain .fill section in gcc mapfile. Also should this section be added to total consumption?
.fill is used to fill up a section to the next section boundary. I believe it is added to the total consumption, but I have not tried/counted the numbers.
Sorry to be posting to such an old thread, but it’s where I ended up after following from your post today about assert/etc!
I wonder how “size” can be customized in MCUXpresso … but more, I don’t think it would tell me what I want anyway.
I am a good way through development on K22F for a redesign of our handheld configuration tool, and I’m stumped by “data=73100” in the compiled output. The code itself is only a little bigger (text=93596) so this additional “data” is a big hit on storage! I cannot find what is the cause; the map file gives no clues that I’ve spotted.
Is there a tool to decode / identify where initialized data comes from?
Thanks.
Hi Ian,
yes, the ld map file is sometimes not that useful. Some ideas: are you aware of the ‘Image Info’ view in MCUXpresso? See https://mcuoneclipse.com/2020/01/10/listing-code-and-data-size-for-all-files-with-the-gnu-size-utility-as-post-build/ which gives as well another tool to see which file is producing how much of data.
Have a look as well at the .bin file, if the linker has created initialization data (see https://mcuoneclipse.com/2014/04/19/gnu-linker-can-you-not-initialize-my-variable/ for that topic).
I hope this helps?
I did find that article Saturday but apparently didn’t pursue enough. Today in “PROGRAM_FLASH” when I expand I see section “data_RAM2” of 51K; it lists among other things “menucache” of nearly 20K so that’s part of where the data is coming from.
My declaration was “__SECTION(data,RAM2) menucache[CacheSize];”. There’s no “= {0}” initialiser but the compiler/linker decided to do it anyway! Maybe the __SECTION is being used wrong?
I just changed to “__NOINIT(RAM)” and that seems to have removed the data block from Flash. Now I have to do the same in a few other places!
And … the code ends at 0x1EDB8 and the SREC file ends at 0x1EFE0 which is so much better!
Thanks 🙂
That’s great, thanks for reporting this back!
Hi Erich,
You said: ‘data’ is for initialized variables, and it counts for RAM and FLASH. The linker allocates the data in FLASH which then is copied from ROM to RAM in the startup code. But I don’t fully understand why data is equal to 0x1c (after adding 4) instead of 0x20 as you told. And when data is stored in FLASH, how can it be copied from ROM?
Thanks,
Jane
Hi Jane,
the initialization of RAM data with content from ROM/FLASH is done in your startup code. Check the code you are executing before calling main: there it does the zero-out (initializing globals with zero) and the copy-down (initializing variables with content from FLASH).
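On the first part of the question: the sizes in the article are hexadecimal numbers. 0x18 is 24 in decimal, adding the 4 bytes of the new variable gives 28, which is 0x1c; 0x20 would be 32. As for the second part, FLASH and ROM are used interchangeably in the article for the same non-volatile memory. A tiny sketch of the arithmetic (the function name is only for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* New size of 'data' after adding one variable of var_bytes bytes. */
uint32_t data_after_adding(uint32_t data_size, uint32_t var_bytes)
{
    assert(var_bytes <= UINT32_MAX - data_size);  /* no wrap-around */
    return data_size + var_bytes;
}
```

Called with 0x18 and 4 (the size of an int32_t), this returns 0x1c, the value shown by the size utility in the article.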