text, data and bss: Code and Data Size Explained

Posted on April 14, 2013 by Erich Styger

In “Code Size Information with gcc for ARM/Kinetis” I use an option in the ARM gcc tool chain for Eclipse to show me the code size:

   text       data        bss        dec        hex    filename
 0x1408       0x18      0x81c       7228       1c3c    size.elf

I have been asked by a reader of this blog what these item numbers really mean. Especially: what the heck is ‘bss’???? 🙂

Note: I’m using the ARM GNU ‘printsize’ utility for gcc, with an example for Kinetis-L (KL25Z).

text

‘text’ is what ends up in FLASH memory. I can show this by adding

void foo(void) {
  /* dummy function to show how this adds to 'text' */
}

to my program. The ‘text’ part then increases from 0x1408 to 0x1414:

   text       data        bss
 0x1414       0x18      0x81c

Likewise, my new function ‘foo’ gets added to the .text segment, as I can see in the map file generated by the linker:

 *(.text*)
 .text.foo      0x000008c8        0x8 ./Sources/main_c.o
                0x000008c8                foo

But it does not only contain functions, it has constant data as well. If I have a constant table like

const int table[] = {5,0,1,5,6,7,9,10};

then this adds to ‘text’ too. That variable ‘table’ will be in FLASH, initialized with the values specified in the source.

Another thing which is included in ‘text’ is the interrupt vector table (more on this later).

In summary: ‘text’ is what ends up typically in FLASH and has code and constant data.

data

‘data’ is used for initialized data. This is best explained with the following (global/extern) variable:

int32_t myVar = 0x12345678;

Adding the above variable to my application will increase the ‘data’ portion by 4 bytes:

 text       data        bss
 0x1414     0x1c      0x81c

This variable ‘myVar’ is not constant, so it will end up in RAM. But the initialization (0x12345678) *is* constant, and can live in FLASH memory. The initialization of the variable is done during the normal ANSI startup code. The code will assign/copy the initialization value. This is sometimes named ‘copy-down’. For the startup code used by CodeWarrior for MCU10.3 for Kinetis-L (ARM Cortex-M0+), this is performed in __copy_rom_sections_to_ram():

ARM Startup Code Initializing Variables

Just one thing to consider: my variable ‘myVar’ will use space in RAM (4 bytes in my case), *plus* space in FLASH/ROM for the initialization value (0x12345678). So I need to count the ‘data’ size twice: that size will end up in RAM, plus will occupy FLASH/ROM. That amount of data in FLASH is *not* counted in the text portion.

❗ The ‘data’ only has the initialization data (in my example 0x12345678), and not the variable (myVar).

bss

The ‘bss’ contains all the uninitialized data.

💡 bss (or .bss, or BSS) is the abbreviation for ‘Block Started by Symbol’ by an old assembler (see this link).

This is best explained with the following (global/extern) variable:

int32_t myGlobal;

Adding this variable will increase the ‘bss’ portion by 4:

 text       data        bss
 0x1414     0x18      0x820

💡 I like to remember ‘bss’ as ‘Better Save Space’ :-). As bss ends up in RAM, and RAM is very valuable for a microcontroller, I want to keep the amount of variables which end up in the .bss at the absolute minimum.

The bss segment is initialized in the startup code by the zero_fill_bss() function:

static void zero_fill_bss(void)
{
	extern char __START_BSS[];
	extern char __END_BSS[];

	memset(__START_BSS, 0, (__END_BSS - __START_BSS));
}

dec

The ‘dec’ (as a decimal number) is the sum of text, data and bss:

dec = text + data + bss
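
As a quick check of that formula, here is a minimal sketch in C that adds up the three columns from the first listing in this post. The struct and function names are made up for illustration; they are not part of the size utility.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Berkeley-style columns as printed by the size utility
   (hypothetical helper type, for illustration only). */
struct berkeley_sizes {
    uint32_t text;
    uint32_t data;
    uint32_t bss;
};

/* 'dec' and 'hex' are the same sum, just printed in a different radix. */
uint32_t berkeley_total(const struct berkeley_sizes *s)
{
    assert(s != NULL);
    return s->text + s->data + s->bss;
}
```

With the values 0x1408, 0x18 and 0x81c from above, the sum is 0x1c3c, which is 7228 in decimal and matches the ‘dec’ and ‘hex’ columns.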

Size – GNU Utility

The size (or printsize) GNU utility has more options:

size [-A|-B|--format=compatibility]
          [--help]
          [-d|-o|-x|--radix=number]
          [--common]
          [-t|--totals]
          [--target=bfdname] [-V|--version]
          [objfile...]

The ‘System V’ option can be set directly in the Eclipse panel:

GNU Print Size Option in CodeWarrior for MCU10.3

It produces similar information as shown above, but with greater detail.

To illustrate this, I use

int table[] = {1,2,3,4,5};

While in ‘Berkeley’ mode I get:

   text       data        bss        dec        hex    filename
 0x140c       0x2c      0x81c       7252       1c54    size.elf

I get this in ‘System V’ mode:

section                size         addr
.interrupts            0xc0          0x0
.text                0x134c        0x800
.data                  0x14   0x1ffff000
.bss                   0x1c   0x1ffff014
.romp                  0x18   0x1ffff030
._user_heap_stack     0x800   0x1ffff048
.ARM.attributes        0x31          0x0
.debug_info          0x2293          0x0
.debug_abbrev         0xe66          0x0
.debug_loc           0x27df          0x0
.debug_aranges        0x318          0x0
.debug_macinfo      0x53bf3          0x0
.debug_line          0x1866          0x0
.debug_str            0xc23          0x0
.comment               0x79          0x0
.debug_frame          0x594          0x0
Total               0x5defe

I’m using an ARM Cortex-M0+ in my example, so addresses at or above 0x1ffff000 are in RAM.

The lines from .ARM.attributes up to .debug_frame do not end up on the target, they are debug and other information.

.interrupts is my interrupt vector table, and .text is my code plus constants, and is in FLASH memory. That makes the 0xc0+0x134c=0x140c for text in ‘Berkeley’.

.bss is my uninitialized (zeroed-out) variable area. Additionally there is ._user_heap_stack: this reserves the stack and the heap (the heap is what the ANSI library uses for malloc() calls). That makes the total of 0x1c+0x800=0x81c shown in ‘Berkeley’ format.

.data is for my initialized ‘table[]’ variable in RAM (5*4 bytes=0x14)
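
The mapping between the two formats can be sketched in C as follows. This is only an illustration under stated assumptions: the rows are copied from the ‘System V’ listing above, but the column assignment per section is my reading of the numbers (in particular that .romp is counted under ‘data’, since 0x14+0x18=0x2c). As far as I know the real utility decides this from the ELF section flags, not from section names, and the type and function names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Which Berkeley column a section is counted in (COL_NONE: not counted). */
enum column { COL_TEXT, COL_DATA, COL_BSS, COL_NONE };

struct section {
    const char *name;
    uint32_t size;
    enum column col;
};

/* Rows from the 'System V' listing; the column assignment is an assumption. */
static const struct section sections[] = {
    { ".interrupts",       0xc0,   COL_TEXT },
    { ".text",             0x134c, COL_TEXT },
    { ".data",             0x14,   COL_DATA },
    { ".bss",              0x1c,   COL_BSS  },
    { ".romp",             0x18,   COL_DATA },
    { "._user_heap_stack", 0x800,  COL_BSS  },
    { ".debug_info",       0x2293, COL_NONE },  /* debug info is not counted */
};

#define SECTION_COUNT (sizeof sections / sizeof sections[0])

/* Sum the sizes of all rows that belong to one Berkeley column. */
uint32_t column_total(const struct section *rows, size_t count, enum column col)
{
    uint32_t total = 0;
    assert(rows != NULL);
    for (size_t i = 0; i < count; i++) {
        if (rows[i].col == col) {
            total += rows[i].size;
        }
    }
    return total;
}
```

Summing this way gives 0x140c for text, 0x2c for data and 0x81c for bss, the same numbers as in the ‘Berkeley’ output.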

The .romp is used by the linker for the ‘copy-down’ and initialization of .data. But it looks confusing: it is shown with addresses in RAM? Checking the linker map file shows:

.romp           0x1ffff030       0x18 load address 0x00001b60
                0x00001b60                __S_romp = _romp_at
                0x1ffff030        0x4 LONG 0x1b4c ___ROM_AT
                0x1ffff034        0x4 LONG 0x1ffff000 _sdata
                0x1ffff038        0x4 LONG 0x14 ___data_size
                0x1ffff03c        0x4 LONG 0x0
                0x1ffff040        0x4 LONG 0x0
                0x1ffff044        0x4 LONG 0x0

Ah! That actually is not in RAM, but in FLASH: the linker maps this to the FLASH address 0x1b60! So this size 0x18 really needs to be added to the FLASH size too!
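
To make the role of that table more concrete, here is a minimal sketch of how a copy-down routine could walk such a table. It assumes that each row consists of three values (ROM source, RAM target, size) and that a row of three zeros ends the table, as the map file extract suggests. The struct and function names are hypothetical, and this is not the actual CodeWarrior startup code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One row of a copy-down table: where the initialization image sits in
   ROM, where it has to go in RAM, and how many bytes to copy.
   Hypothetical layout modelled on the three LONGs per row in .romp. */
struct rom_entry {
    const uint8_t *source;
    uint8_t *target;
    uint32_t size;
};

/* Copy every block listed in the table until the all-zero terminator.
   Returns the total number of bytes copied. */
uint32_t copy_rom_sections(const struct rom_entry *table)
{
    uint32_t copied = 0;
    assert(table != NULL);
    while (table->source != NULL || table->target != NULL || table->size != 0) {
        memcpy(table->target, table->source, table->size);
        copied += table->size;
        table++;
    }
    return copied;
}
```

On the target, the first row would describe the .data image (source 0x1b4c, target 0x1ffff000, size 0x14 in the listing above); on a host the same function can be tried with ordinary arrays standing in for FLASH and RAM.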

Summary

I hope I have sorted out things in a correct way. The way the initialized data is reported might be confusing. But with the right knowledge (and .map file in mind), things get much clearer:

‘text’ is my code, vector table plus constants.

‘data’ is for initialized variables, and it counts for RAM and FLASH. The linker allocates the data in FLASH which then is copied from ROM to RAM in the startup code.

‘bss’ is for the uninitialized data in RAM which is initialized with zero in the startup code.
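
Put as a small calculation, and assuming the accounting described in this post (data counts for both FLASH and RAM), the memory footprint can be estimated from the three Berkeley columns like this. The names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Estimated memory use in bytes (hypothetical helper type). */
struct footprint {
    uint32_t flash;
    uint32_t ram;
};

/* 'data' is counted twice: its initial values are stored in FLASH,
   and the variables themselves live in RAM. */
struct footprint footprint_from_sizes(uint32_t text, uint32_t data, uint32_t bss)
{
    struct footprint f;
    f.flash = text + data;   /* code, constants, plus the init image */
    f.ram = data + bss;      /* initialized plus zeroed variables */
    assert(f.flash >= text); /* guard against wrap-around */
    assert(f.ram >= bss);
    return f;
}
```

For the ‘Berkeley’ numbers of the table[] example (0x140c, 0x2c, 0x81c) this gives 0x1438 bytes of FLASH and 0x848 bytes of RAM, where the RAM figure includes the reserved heap and stack.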

Happy Sizing 🙂

104 thoughts on “text, data and bss: Code and Data Size Explained”

Bill Lewis on April 14, 2013 at 15:59 said:

Do you ever compare the sizes between debug and release builds? On one project I was almost out of Flash code space and just to see what would happen I switched to Release build. Dramatic difference in size. Much more than I expected. I was thinking of using custom switches “per file” or maybe using libraries for tested code and only using Debug compile for new stuff.

- Erich Styger on April 14, 2013 at 16:02 said:
  
  Hi Bill,
  I always do only ‘release’ builds. I do not see any value in doing a ‘debug’ build. See https://mcuoneclipse.com/2012/06/01/debug-vs-release/. With this, I do not see the problem you had. Yet another reason not to get fooled by ‘debug’ builds. It makes sense for the desktop world, but not for embedded. So probably what you saw is that in the ‘release’ build the compiler optimizations were turned on.
  
  - Bill Lewis on April 14, 2013 at 16:05 said:
    
    Hmm, I must have missed that post or just forgotten. I will try it. I guess I have been under the assumption the un-optimized code generation was needed by the debugger.
    
    Thanks!
    Bill
    
    - Erich Styger on April 14, 2013 at 16:10 said:
      
      Modern debug information standards like DWARF have been designed with optimized code debugging in mind. If it is not possible to debug highly optimized code with the debugger, then this is a bug in the tools for me. Either in the compiler (not correctly generating debug information) or in the debugger (not correctly using that information). Yes, generating correct and useful debug information is not the simplest thing in the world. But very doable. Yes, there are academic cases where it is debatable whether some optimizations severely impact debugging information. But as I say: academic 🙂
      
    - Bill Lewis on April 14, 2013 at 16:26 said:
      
      I apologize, as this isn’t a Freescale MCU. But it is ARM, Eclipse, and gcc.
      
      This is the ‘debug’ build:
      ‘Invoking: ARM Windows GNU Print Size’
      arm-none-eabi-size --format=berkeley stm32f4_test.elf
      text data bss dec hex filename
      264468 14744 102772 381984 5d420 stm32f4_test.elf
      
      This is the ‘release’ build:
      arm-none-eabi-size --format=berkeley stm32f4_test.elf
      text data bss dec hex filename
      147484 14392 102764 264640 409c0 stm32f4_test.elf
      ‘Finished building: stm32f4_test.siz’
      
      That’s a significant amount of Flash space I’ve been wasting. And have been for years!
      
    - Erich Styger on April 14, 2013 at 16:47 said:
      
      That really looks like in your release build you have the optimizations turned on, while they are off in the debug build. I get similar differences with my ARM cores with optimizations turned on/off.
      
  - teejay on June 5, 2015 at 07:48 said:
    
    Wrong! Being able to debug on embedded hardware will save you hours of work. It sounds like you need to brush up on utilizing the power of remote GDB, it is a doddle under eclipse and it will literally transform your development cycle. That debug code can be a complete godsend, not using it makes no sense.
    
    - Erich Styger on June 5, 2015 at 08:01 said:
      
      Hi teejay,
      not sure if you have posted that comment to the wrong thread? If it is about release and debug configurations: of course a ‘release’ configuration does not mean that I cannot debug it. Of course I keep the debug information in that configuration, and I’m using GDB and remote GDB every day 🙂
      
Cristian Iacob on April 14, 2013 at 20:42 said:

Hi!
can you please explain how to switch from debug to release build?
Thanks,
Cristian

- Erich Styger on April 15, 2013 at 05:57 said:
  
  Hi Cristian,
  That depends on the architecture and tool chain. In general, in debug mode the optimizations are switched off, while in release they are on and symbolics are stripped off. You simply need to look at the compiler/build options/libraries and make sure they match your needs. In any case, even in ‘release’ you should be able to debug it.
  
Ed Garcia on April 15, 2013 at 15:37 said:

Hello:

Thank you for your response! It is very informative!

I have a question about BSS. I made a bare-bones project for the FRDM-KL25Z board in CodeWarrior, and I am getting a huge .BSS at startup! About 2076 with a blank code! The MCU has 4k of RAM, so it seems excessive to me to waste half of the RAM without any code! Is this space really used, or is it in some way reserving the space for use? Maybe it can be the start-up code CodeWarrior generated, but I don’t know how to look into the subject.

Thank you for your time,

Ed.

- Erich Styger on April 15, 2013 at 16:26 said:
  
  Hi Ed,
  the KL25Z on the FRDM board has 16 KByte of RAM :-).
  Could it be that you have a heap used/linked with your application? This for sure will add to the .bss section. Best if you have a look at the generated .map file (generated by the linker). There you should see what uses that amount of RAM in your application.
  
  - Ed Garcia on April 18, 2013 at 22:12 said:
    
    Hello Erich:
    
    I am sorry for my late response. Yes, I am working with the FRDM board, but at the end I will work with a KL05 chip with 32kb ROM, 4kb RAM. Sorry I did not mention it! 😛
    
    I made a baremetal project (as you did on your great tutorial “Optimizing the Kinetis gcc Startup”) and I found I have the same 2076 bytes of BSS that you had at the start of the tutorial.
    
    Now, I have seen in other posts of yours that you have a smaller BSS size. For example, in your tutorial of RTOS with the FRDM board you had a BSS of 1040! Is there some way to reduce the BSS size? As I said before, I only made a baremetal project with no code of my own or no processor expert code. I cannot find the startup code which takes all this space.
    
    I made a barebones project for the KL05 chip and found the following BSS usage:
    text data bss dec hex filename
    868 24 540 1432 598 p1.elf
    
    It still uses a lot! (13% of ram!), but at least not so much as in the case of the KL25 (~50%).
    
    I am trying to understand the .map file, and I will post here what I find. I may take a while though :(.
    
    Thank you for your time,
    
    Ed.
    
    - Erich Styger on April 19, 2013 at 10:22 said:
      
      Hi Ed,
      have a look and check the linker file: how much stack and heap do you have allocated?
      /* Generate a link error if heap and stack don’t fit into RAM */
      __heap_size = 0x400; /* required amount of heap */
      __stack_size = 0x400; /* required amount of stack */
      Depending on your needs and startup, you can reduce this.
      E.g. with FreeRTOS, I only need the initial stack to initialize the registers and basic drivers.
      Then I switch to the stack of each task.
      If you are not using any heap (and using the FreeRTOS one), you can reduce the heap size to zero.
      The stack size to less than 0x50 (depending on what you do in your main()).
      
      I hope this helps.
      
  - Ed Garcia on April 18, 2013 at 22:19 said:
    
    Oops! I made a mistake in my post! Sorry, I can’t edit it out…
    
    “It still uses a lot! (13% of ram!), but at least not so much as in the case of the KL25 (~50%).”
    
    KL25 using 2076 of BSS would take about the same percentage (~13%).
    
Hugo Ordonez on July 2, 2014 at 00:12 said:

Hello, I have a problem when generating code for my bootloader project, it appears by looking at the mapfile and srec file that although I used the technique of limiting ROM to 0x5000, using Processor Expert, for my bootloader code, some code specifically the code for .romp is being allocated outside this limit.

********MAP File extract
._user_heap_stack
0x1fff84e4 0x400 load address 0x00004e1c
0x1fff84e4 . = ALIGN (0x4)
0x1fff84e4 PROVIDE (end, .)
0x1fff84e4 PROVIDE (_end, .)
0x1fff84e4 __heap_addr = .
0x1fff84e4 . = (. + __heap_size)
0x1fff88e4 . = (. + __stack_size)
*fill* 0x1fff84e4 0x400 00
0x1fff88e4 . = ALIGN (0x4)
0x00005c20 _romp_at = ((___ROM_AT + SIZEOF (.data)) + SIZEOF (.m_data_20000000))

.romp 0x1fff88e4 0x24 load address 0x00005c20
0x00005c20 __S_romp = _romp_at
0x1fff88e4 0x4 LONG 0x4938 ___ROM_AT
0x1fff88e8 0x4 LONG 0x1fff8000 _sdata
0x1fff88ec 0x4 LONG 0x298 ___data_size
0x1fff88f0 0x4 LONG 0x4bd0 ___m_data_20000000_ROMStart
0x1fff88f4 0x4 LONG 0x20000000 ___m_data_20000000_RAMStart
0x1fff88f8 0x4 LONG 0x1050 ___m_data_20000000_ROMSize
0x1fff88fc 0x4 LONG 0x0
0x1fff8900 0x4 LONG 0x0
0x1fff8904 0x4 LONG 0x0

******Linker File extract
MEMORY {
m_interrupts (RX) : ORIGIN = 0x00000000, LENGTH = 0x000001BC
m_text (RX) : ORIGIN = 0x00000410, LENGTH = 0x00004BF0
m_data (RW) : ORIGIN = 0x1FFF8000, LENGTH = 0x00008000
m_data_20000000 (RW) : ORIGIN = 0x20000000, LENGTH = 0x00008000
m_cfmprotrom (RX) : ORIGIN = 0x00000400, LENGTH = 0x00000010
}

/* User_heap_stack section, used to check that there is enough RAM left */
._user_heap_stack :
{
. = ALIGN(4);
PROVIDE ( end = . );
PROVIDE ( _end = . );
__heap_addr = .;
. = . + __heap_size;
. = . + __stack_size;
. = ALIGN(4);
} > m_data

_romp_at = ___ROM_AT + SIZEOF(.data) +SIZEOF(.m_data_20000000);
.romp : AT(_romp_at)
{
__S_romp = _romp_at;
LONG(___ROM_AT);
LONG(_sdata);
LONG(___data_size);
LONG(___m_data_20000000_ROMStart);
LONG(___m_data_20000000_RAMStart);
LONG(___m_data_20000000_ROMSize);
LONG(0);
LONG(0);
LONG(0);
}

.ARM.attributes 0 : { *(.ARM.attributes) }
}

Question is.. how can I tell the linker not to allocate .romp outside the 0x5000 limit?, thanks in advance

- Erich Styger on July 2, 2014 at 12:54 said:
  
  It seems to me that you are missing the > m_data for the
  .romp : AT(_romp_at)
  {
  …
  } > m_data
  ?
  
  - Hugo Ordonez on July 2, 2014 at 17:55 said:
    
    Thanks for your response Erich
    This is an automatically generated file from PE.
    Just in case anyone runs into this, _romp_at = ___ROM_AT + SIZEOF(.data) +SIZEOF(.m_data_20000000); was causing trouble when using a large uninitialised buffer defined in .m_data_20000000: this made m_text overflow beyond 0x5000. Changing the calculation of _romp_at to _romp_at = ___ROM_AT + SIZEOF(.data) solved the problem.
    
  - Hugo Ordonez on July 2, 2014 at 18:32 said:
    
    Erich, you are correct, there is a missing > m_data, but this linker file is (was) managed by PE. It seems like a bug in this version of PE. I have the following installed: MCU10-3 PRODUCT 1.0.0, Processor Expert for MCU 1.2.0.S.RT2_b1248-1334.
    
ninovsnino on March 20, 2015 at 09:47 said:

Reblogged this on a version of mine and commented:
This explanation is easy to understand. I will use it as my reminder, and I hope it is useful to others too.

acutetech on April 22, 2015 at 12:16 said:

Can you help me understand memory allocation for local static variables? Here is a code snippet:

uint32_t Flash_ReadRecord(uint16_t recordNumber) {
uint32_t previousTimeStamp;
uint16_t nextIndex;
uint16_t nextOffset;
// code here
}

As I understand it, these variables will be allocated space on a stack, and this memory will be freed once the Flash_ReadRecord() routine exits. Right? But if they are declared static then they will be allocated permanent RAM locations. Right? So as I make each of these variables static in turn I would expect to see the bss number increase (by 4, 2 and 2 bytes respectively). However something different happens:

With none static:
text data bss dec hex filename
30360 48 3868 34276 85e4 HBM2_MMM.elf

Changing previousTimeStamp to static I get 4 extra bss bytes but also 8 extra text bytes:
30368 48 3872 34288 85f0 HBM2_MMM.elf

Changing nextIndex to static as well I get no extra bss bytes and 48 extra text bytes!
30416 48 3872 34336 8620 HBM2_MMM.elf

Changing all three to static I get 4 extra bss bytes and a further 20 extra text bytes!
30436 48 3876 34360 8638 HBM2_MMM.elf

Making nextIndex and nextOffset uint32_t (all 3 static) I get 4 extra bss bytes and no change to text bytes:
30436 48 3880 34364 863c HBM2_MMM.elf

The bss numbers seem OK if bss memory has to be allocated on 4-byte boundaries – but what is happening to the text?

- acutetech on April 22, 2015 at 13:01 said:
  
  Hmmm… from the map file it looks like the size of the routine Flash_ReadRecord() is changing, so this is presumably related to the relative efficiencies of accessing variables if they are on the stack compared to when they are in RAM elsewhere.
  
  Interestingly:
  previousTimeStamp is used 6 times and making it static takes 8 extra bytes
  nextIndex is used 3 times and making it static takes 48 extra bytes
  nextOffset is used 4 times and making it static takes 20 extra bytes
  I wonder why these numbers are as they are?
  
  What are the general lessons if one wants to minimise memory use?
  
  - Erich Styger on April 22, 2015 at 18:17 said:
    
    To minimize memory usage:
    – consider alignment for structs: make sure you place 8bit, 16bit and 32bit variables together so there are no gaps in between
    – try to use the smallest data type possible
    – instead using global memory, use dynamic/heap memory or local (stack) variables
    – ensure that linker is removing unused variables
    – try not to use the C runtime/ANSI library if possible: they tend to use their own variables/heap
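
The first point in the list above can be illustrated with a minimal sketch. The two structs hold the same four members; only the order differs. The names are made up for illustration, and the exact sizes depend on the ABI (on a typical 32-bit ARM ABI the first struct takes 12 bytes and the second 8).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Members in mixed order: the compiler typically has to insert padding
   before 'timestamp' and before 'index' to keep them aligned. */
struct scattered {
    uint8_t flag;
    uint32_t timestamp;
    uint8_t mode;
    uint16_t index;
};

/* Same members, ordered from largest to smallest: no gaps needed. */
struct grouped {
    uint32_t timestamp;
    uint16_t index;
    uint8_t flag;
    uint8_t mode;
};

/* Number of bytes saved per instance by reordering the members. */
size_t bytes_saved_by_grouping(void)
{
    assert(sizeof(struct grouped) <= sizeof(struct scattered));
    return sizeof(struct scattered) - sizeof(struct grouped);
}
```

The saving applies to every instance, so it adds up quickly for arrays of structs placed in .data or .bss.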
    
- Erich Styger on April 22, 2015 at 14:45 said:
  
  Yes, if you mark local variables static, they are ‘static locals’ and will consume global RAM (like other global variables). The code size might change too because the compiler will use different ways to access them. I hope that makes sense?
  
acutetech on April 24, 2015 at 13:23 said:

My application (KL05, PE, FreeRTOS) sometimes behaves as though it has run out of RAM even though (a) the compiler shows no errors and (b) the sum of .data and .bss is less than the RAM available in the chip. What might be happening? I wonder about the “heap” and “stack” defined in the CPU PE component. I’ve defined no heap and a small stack. Does the stack start at the end of RAM and extend towards the variables? Maybe the stack is trashing some of the variables? The FreeRTOS documentation says “The stack used upon program entry is not required once the RTOS scheduler has been started (unless your application … uses the stack as an interrupt stack as is done in the ARM Cortex-M)”.

http://www.freertos.org/FAQMem.html#RAMReduce

How can I establish the stack’s RAM requirements?

- Erich Styger on April 26, 2015 at 10:12 said:
  
  Whether you need heap depends on how you are using the ANSI library. It depends as well on which library you are using (newlib, newlib nano), as the library might use heap for things like printf() etc. The stack you assign in the Processor Expert component is the stack used out of reset and during startup and calling main(): I usually allocate 0x100 there. But this depends on what you are actually doing during startup, until FreeRTOS is running. Once you start the RTOS, it is using the stack assigned to the tasks. How do you know that your application is running out of RAM? Yes, it could be that your application starts overwriting global variables/RAM. FreeRTOS has a check for stack overflows which is very useful, have you turned that on (it is on by default)? The FreeRTOS stack is allocated inside the FreeRTOS heap, so it would likely overwrite another task stack if this really happened. Otherwise it could be a dangling pointer?
  
acutetech on April 28, 2015 at 19:23 said:

I have used the useful FreeRTOS stack high water and stack overflow routines. To look for stack overflow in the main() stack, before FreeRTOS starts, I added this code to startup.c before the stack pointer initialisation:
extern char __END_BSS[];
unsigned long len = __SP_INIT - __END_BSS;
unsigned long dst = (unsigned long) __END_BSS;
while (len > 0)
{
  *((char *)dst) = 0x44; /* paint the unused RAM with a known pattern */
  dst += 1;
  len--;
}
Then I can check how much stack is used by examining the RAM for 0x44. I think my earlier problem was that the stack was extending down from the end of RAM and over-writing some of the variables at the end of .bss (without warning).
This also shows the benefit of doing as little initialisation as possible before FreeRTOS starts, to minimise the main() stack use, which is wasted after FreeRTOS starts. My stack use is currently 64 bytes.
I am left with a question: how are the data and bss numbers derived? I create a dummy array of a size so that the RAM is 100% used (verified by the .map file, which also gives ___data_size = 16). The Berkeley figures are data=48 bss=4056 when I would expect numbers that add up to 4096 (KL05 RAM size).
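
The "examine the RAM for 0x44" step can also be done in code. Below is a minimal sketch, assuming a stack that grows downward from the end of the painted region, so that the unused part is the run of fill bytes at the low end. The function names are hypothetical, and the region is passed in as a plain buffer so the idea can be tried on a host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL 0x44u  /* same pattern as in the snippet above */

/* Fill the whole region with the pattern before the stack is used. */
void paint_region(uint8_t *start, size_t len)
{
    assert(start != NULL);
    memset(start, STACK_FILL, len);
}

/* Count how many bytes at the high end have been overwritten.
   Everything from the low end up to the first changed byte is unused. */
size_t bytes_used(const uint8_t *start, size_t len)
{
    size_t untouched = 0;
    assert(start != NULL);
    while (untouched < len && start[untouched] == STACK_FILL) {
        untouched++;
    }
    return len - untouched;
}
```

One caveat of this technique: if the deepest stack bytes happen to hold the value 0x44, the result is slightly too low, so it is wise to leave some margin.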

- Erich Styger on April 28, 2015 at 21:18 said:
  
  The size utility reads the ELF information of the ELF/Dwarf file and sums the object sizes up. I don’t know the actual implementation, but this is what I think it is.
  
Angel G. on October 24, 2015 at 10:45 said:

Hi 2 ALL !
Talking about Kinetis and memory layout, I’d like to warn you, about a potential problem which may lead your code to the HardFault handler. The K-series devices include two blocks of on-chip SRAM: SRAM_L and SRAM_U, split at address 0x200000000. If say a 32-bit variable is placed across 0x200000000 e.t. 0x200000000-2 to 0x200000000+2, any attempt to access it would end up in the HardFault handler.
I think that If you want to not care, you should use either SRAM_L or SRAM_U, else, a special care should be taken what ends up where.
The ANs say that the faster one is the smaller one – SRAM_L, suitable for critical code (ISR ?).

- Angel G. on October 24, 2015 at 12:43 said:
  
  Typo correction: the address has one zero less: 0x20000000
  
Angel G. on October 24, 2015 at 10:50 said:

Ups, Erich already has written about it here: https://mcuoneclipse.com/2013/07/10/freertos-heap-with-segmented-kinetis-k-sram/

- Angel G. on October 24, 2015 at 12:13 said:
  
  UPS, Erich was wrong about KL25. It’s not discontinuous SRAM space. See the above. Tested that goes to the HardFault. Test methodology:
  In the .ld linker (GNU used), place absolute start address section:
  .myBufBlock 0x1FFFFFFE:
  {
  KEEP(*(.myBufSection))
  } > RAM
  In the main.c, declare a global at this address:
  volatile uint32_t __attribute__((section (".myBufSection"))) myBufVar __attribute__ ((aligned (1)));
  In main.c trigger the hard fault:
  printf("\n Test HF:");
  myBufVar=11;
  if(myBufVar==11) printf("- HF not happened");
  
- Angel G. on October 25, 2015 at 15:19 said:
  
  It’s possible that my example is also *not* very correct, since at 0x1FFFFFFE, it’s only 16-bit-aligned, and if the M0+ core fails on non-32-bit aligned access, then the hardFault will be caused more likely because of this instead of cross-boundary access. I won’t flood the forum with cross-boundary DMA test, but may report the results.
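
The two possible causes can be told apart with a small check. This is only a sketch: the split address 0x20000000 is taken from the correction above, the function names are made up, and whether a straddling access really faults on a given device still has to be tested on hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SRAM_SPLIT 0x20000000u  /* boundary between SRAM_L and SRAM_U */

/* True if an object of 'size' bytes at 'addr' starts below the split
   and extends to or beyond it. */
bool straddles_split(uint32_t addr, uint32_t size)
{
    assert(size > 0);
    return addr < SRAM_SPLIT && (SRAM_SPLIT - addr) < size;
}

/* True if the address is aligned for a 32-bit access. */
bool is_word_aligned(uint32_t addr)
{
    return (addr & 3u) == 0;
}
```

For the test address 0x1FFFFFFE both conditions apply: a 4-byte variable there crosses the split and is not word-aligned. Any 4-byte object that crosses the split is necessarily not word-aligned, so a 32-bit access alone cannot separate the two causes.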
  
Vishal Girisagar on November 13, 2015 at 04:35 said:

Hi Erich,
As always, thanks again for the great tutorial. I am currently using FreeRTOS for my code. Is there any way I can know the Stack and Heap sizes during run time? I just want to know the stats of these while executing my code.

- Erich Styger on November 13, 2015 at 05:59 said:
  
  Hi Vishal,
  yes, have a look at my Shell implementation (see System Status screenshot in https://mcuoneclipse.com/2012/08/05/a-shell-for-the-freedom-kl25z-board/). You can get the free heap size and a list of stack sizes for each task.
  Additionally, there is an Eclipse plugin helping with the stack size: https://mcuoneclipse.com/2013/08/04/diy-free-toolchain-for-kinetis-part-5-freertos-eclipse-kernel-awareness-with-gdb/
  
  - Vishal Girisagar on November 13, 2015 at 06:15 said:
    
    Hi Erich,
    Thanks a lot for the quick reply. I will take look at both of them.
    
    - acutetech on November 13, 2015 at 10:27 said:
      
      Hi Vishal – you might find these 3 calls useful. Put the first two in a “print status” call to report usage – then tweak the task stack sizes. Implement the third to catch stack overflows:
      FRTOS1_uxTaskGetStackHighWaterMark() // call for each task
      FRTOS1_xPortGetFreeHeapSize() // won’t change once FRTOS resources are allocated
      FRTOS1_vApplicationStackOverflowHook()
      Be aware that you might call some stack-intensive code from within different tasks, meaning that code pushes up the high water mark in more than one task. You might be able to design around that (the XF print routines for example).
      AFAIK the stack allocated in the PE CPU component is not used after FRTOS starts to run. If RAM is short you can minimise this by being careful with calls before starting FRTOS. I don’t know if there is a GetStackHighWaterMark() call for that stack.
      
Daniel on December 5, 2015 at 00:51 said:

Hi Erich, how are you?
First, thanks for the blog, it’s been very helpful!
Now I’ve got one question related to this topic.
My code is getting very close to the 16k flash size of the MKE0216VLC4 uC I’m using. Here is the print:
text data bss dec hex filename
15068 140 600 15808 3dc0 AL-X15.elf
I’m getting the impression the linker is considering the ‘dec’ value as the amount to be stored in flash. Is that right? Shouldn’t it be the sum of ‘text’ and ‘data’ only, not including the ‘bss’?
I’m getting a lot of “region m_text overflowed with text and data” errors!
Thanks!

- Erich Styger on December 5, 2015 at 08:50 said:
  
  Hi Daniel,
  yes, it should be the sum of text+data which is stored in flash.
  
  - Erich Styger on December 5, 2015 at 08:52 said:
    
    Daniel,
    to really know what the linker does: artificially increase the size of RAM and FLASH in your linker file and make it link. Then check the linker .map file produced (where the .elf is) to see what it actually did.
    
  - Daniel on December 5, 2015 at 12:30 said:
    
    Great! Looking at the .map file I’ve found something rather strange to me. Here is part of the memory configuration defined by PE:
    Name Origin Length Attributes
    m_text 0x00000410 0x00003bf0 xr
    Why is my .text region starting at 0x0410? I’m losing almost 1 KB of code space!! That’s why my code doesn’t fit!
    
    I’ve tried to change the linker file via PE, but it gives me an error, saying: “There is flash-configuration area from 0x0400 to 0x040F”. What is that?
    
    Do you have any suggestions? Should I manually override the PE link file and start .text at 0x0? Or should I create a new section and store some functions in that region?
    
    I’m new to Freescale’s uCs, I’ve been using NXP for the last 5 years. Here is a usual .map file on my LPC1113 project:
    Name Origin Length Attributes
    MFlash24 0x00000000 0x00006000 xr
    RamLoc4 0x10000000 0x00001000 xrw
    
    Thanks!
    
    - Erich Styger on December 5, 2015 at 12:52 said:
      
      Starting at address 0x0 there is the vector table. And at 0x400 (up to 0x40F) there is the flash and security configuration. Do not mess this up! See
      
      Preventing Reverse Engineering: Enabling Flash Security
      
      Device is secure?
      
      Unsecuring the KL25Z Freedom Board
      
      Otherwise, I highly recommend reading the device reference manual.
      
  - Daniel on December 5, 2015 at 13:24 said:
    
    What about the region in between the end of the Vector Table (0x0C0) and the start of Flash Configuration (0x0400)? Is it safe to use it? That region is defined on the linker file as “m_text_000000C0”. Like I said, it’s almost 1K of code space wasted (832 bytes to be precise).
    I’m all optimized (-Os) and there is still more code to go, so I really could use that space! I know there is a 32k version of this chip, but this is a very “cost sensitive” project!
    Thanks!
    
    - Erich Styger on December 5, 2015 at 13:34 said:
      
      Hi Daniel,
      yes, you can use that area.
      
    - acutetech on December 5, 2015 at 19:29 said:
      
      Hi Daniel – it would be ideal if the linker could place code from 0xc0 to 0x3ff, then start again at 0x410. I don’t know if that can be done, but I have managed to get the linker to use some of the lost space to store constants. The hack has three parts (no guarantee this is bug-free):
      1) Add a new SECTIONS definition in the .ld linker file, after the vector table section:
      .applicationConstants :
      {
      . = ALIGN(4);
      KEEP(*(.application_const)) /* Flash Configuration Field (FCF) = 0x400 */
      . = ALIGN(4);
      } > m_text_000000C0
      
      2) Add this definition of “CONSTANT” in a .h header file:
      #define CONSTANT __attribute__ ((section (".application_const"))) const
      
      3) When you have some constant data that can reside in this ".application_const" section, make use of the CONSTANT definition:
      CONSTANT uint8_t bar[2] = {0x55, 0xaa};
      
      Then bar gets put in the .application_const region, which was previously unused memory. But most of my constants are strings, as in:
      Xprintf("foo\r\n");
      I can force these strings into the .application_const region like this:
      CONSTANT char string1[] = "foo\r\n";
      then later:
      Xprintf(string1);
      Not so easy to read, but it works.
      
  - Daniel on December 6, 2015 at 13:53 said:
    
    Hi Erich.
    My constants sum up to only 0xA4 bytes, so I’m manually selecting functions to store on that region to take the most space possible. Hope it works!
    It would be nice to have an automated script to search for functions in *(.text*) and place them in .text2 section, until it’s filled. My knowledge of linker file scripts is limited, so I have no idea if that is even possible!
    Thanks!
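    
    The manual approach mentioned here (tagging selected functions so that they land in the gap before the flash configuration field) could look roughly like the sketch below. It is a minimal sketch, not a tested recipe: the section name `.text2` is taken from the wish above, the macro name `IN_TEXT2` and the function `checksum8` are made up for illustration, and the linker file still needs an output section that maps `*(.text2)` into `m_text_000000C0`. The attribute is only applied for GNU-compatible compilers on ELF targets.
    
    ```c
    #include <stddef.h>
    #include <stdint.h>
    
    /* Hypothetical helper macro: puts a function into the '.text2' input
     * section on GNU/ELF toolchains and expands to nothing elsewhere. */
    #if defined(__GNUC__) && defined(__ELF__)
      #define IN_TEXT2 __attribute__((section(".text2"), noinline))
    #else
      #define IN_TEXT2
    #endif
    
    /* A small function is a good candidate for the 832-byte gap. */
    IN_TEXT2 uint8_t checksum8(const uint8_t *buf, size_t len) {
      uint8_t sum = 0;
      for (size_t i = 0; i < len; i++) {
        sum = (uint8_t)(sum + buf[i]); /* wraps modulo 256 */
      }
      return sum;
    }
    ```
    
    After linking, the .map file should list `checksum8` under `.text2`; if it still shows up under `.text`, the linker file does not pick up the new section yet.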
    
    - Erich Styger on December 6, 2015 at 13:55 said:
      
      Hi Daniel,
      there is no way (to my knowledge) to distribute objects in the linker across multiple sections. This is one feature other linkers have implemented, but not the GNU one 😦
      
RedSun on December 25, 2015 at 08:09 said:

Hi Erich
Thanks for your great post.
But I have a question.
In my Eclipse environment, selecting sysV mode, the size data is printed in dec.
I want to see size data on dec.
Can you help me with this problem?

- Erich Styger on December 25, 2015 at 08:59 said:
  
  Hi RedSun,
  set the format to Berkeley, and then you get exactly what I have.
  
Bharath on February 4, 2016 at 14:52 said:

Hi Erich,

I’ve started writing a customized bootloader for KL25z uC at address 0x00000000 in flash.

The startup code which I normally use to write an application at address other than 0x00000000 doesn’t seem to work properly at address 0x00000000. I use the startup files crt0.s, startup.c, system.c and system.s provided in USB HID bootloader by Freescale. I’m able to successfully compile it, and flash it, but during Flashing IAR shows the following warnings:

Warning: Target inconsistency detected in Memory range 0x00000000-0x000000BD
Warning: Target inconsistency detected in Memory range 0x00000410-0x00000D41

These memory ranges belong to the vector table and the configuration registers respectively. I’m not sure why these warnings are popping up. After flashing, I step through the code in the debugger. At the “SystemInit” call in the crt0.s file (shown below), the value in register r0 is expected to be 0xAEC (as indicated by the function pointer), but it contains 0xA61 (this address belongs to an instruction of a function other than “SystemInit”). From that instruction onwards I completely lose control of the debugger.

#define SCS_BASE (0xE000E000) /*!< System Control Space Base Address */
#define SCB_BASE (SCS_BASE + 0x0D00) /*!< System Control Block Base Address */
#define SCB_VTOR_OFFSET (0x00000008)

PUBLIC Reset_Handler
EXPORT Reset_Handler
Reset_Handler

// Mask interrupts
cpsid i

// Set VTOR register in SCB first thing we do.
ldr r0,=__vector_table
ldr r1,=SCB_BASE
str r0,[r1, #SCB_VTOR_OFFSET]

// Init the rest of the registers
ldr r2,=0
ldr r3,=0
ldr r4,=0
ldr r5,=0
ldr r6,=0
ldr r7,=0
mov r8,r7
mov r9,r7
mov r10,r7
mov r11,r7
mov r12,r7

// Initialize the stack pointer
ldr r0,=CSTACK$$Limit
mov r13,r0

// Call the CMSIS system init routine
ldr r0,=SystemInit
blx r0

// Init .data and .bss sections
ldr r0,=init_data_bss
blx r0

// Init interrupts
ldr r0,=init_interrupts
blx r0

// Unmask interrupts
cpsie i

// Set argc and argv to NULL before calling main().
ldr r0,=0
ldr r1,=0
ldr r2,=main
blx r2

__done
B __done

I suspect the problem is in the linker file. Here is the linker file I'm using.

/*###ICF### Section handled by ICF editor, don't touch! ****/
/*-Editor annotation file-*/
/* IcfEditorFile="$TOOLKIT_DIR$\config\ide\IcfEditor\cortex_v1_0.xml" */

define symbol __CODE_START_ADDRESS__ = 0x00000000;

define symbol __ICFEDIT_intvec_start__ = __CODE_START_ADDRESS__;

/*-Memory Regions-*/
define symbol __ICFEDIT_region_ROM_start__ = __CODE_START_ADDRESS__;
define symbol __ICFEDIT_region_ROM_end__ = 0x0001FFFF;
define symbol __ICFEDIT_region_RAM_start__ = 0x1FFFF000;
define symbol __ICFEDIT_region_RAM_end__ = 0x20002FFF;
define symbol __ICFEDIT_region_RAM1_start__ = 0x1FFFF000;
define symbol __ICFEDIT_region_RAM1_end__ = 0x1FFFFFFF;
define symbol __ICFEDIT_region_RAM2_start__ = 0x20000000;
define symbol __ICFEDIT_region_RAM2_end__ = 0x20002FFF;
define symbol __IntVectTable_start__ = __CODE_START_ADDRESS__ + 0x00000000;
define symbol __IntVectTable_end__ = __CODE_START_ADDRESS__ + 0x0000003F;
define symbol __FlashConfig_start__ = __CODE_START_ADDRESS__ + 0x00000400;
define symbol __FlashConfig_end__ = __CODE_START_ADDRESS__ + 0x0000040f;
/*-Sizes-*/
define symbol __ICFEDIT_size_cstack__ = (3 * 1024);
define symbol __ICFEDIT_size_heap__ = (2 * 1024);

define exported symbol __BOOT_STACK_ADDRESS = __ICFEDIT_region_RAM_end__ - 8;

/**** End of ICF editor section. ###ICF###*/

define memory mem with size = 4G;
//define region ROM_region = mem:[from __ICFEDIT_region_ROM_start__ to (__FlashConfig_start__ - 1)] | mem:[from (__FlashConfig_end__ + 1) to __ICFEDIT_region_ROM_end__];
define region ROM_region = mem:[from __ICFEDIT_region_ROM_start__ to __ICFEDIT_region_ROM_end__];
define region RAM_region = mem:[from __ICFEDIT_region_RAM_start__ to __ICFEDIT_region_RAM_end__];
define region FlashConfig_region = mem:[from __FlashConfig_start__ to __FlashConfig_end__];
define region IntVectTable_region = mem:[from __IntVectTable_start__ to __IntVectTable_end__];

define block CSTACK with alignment = 8, size = __ICFEDIT_size_cstack__ { };
define block HEAP with alignment = 8, size = __ICFEDIT_size_heap__ { };

do not initialize { section .noinit };
initialize manually { readwrite };
initialize manually { section .data};
initialize manually { section .textrw };

define block CodeRelocateRam { section .textrw };
define block CodeRelocate { section .textrw_init };
define block BootloaderFlash { readonly, block CodeRelocate };
define block BootloaderRam { readwrite, block CodeRelocateRam, block CSTACK, block HEAP };

place at address mem:__ICFEDIT_intvec_start__ { readonly section .intvec, readonly section .noinit };
place in ROM_region { block BootloaderFlash };
place in RAM_region { block BootloaderRam };
place in IntVectTable_region { section IntVectTable};
place in FlashConfig_region { section FlashConfig};

Am I using a correctly configured linker file?
If I change the address in the linker file to
define symbol __ICFEDIT_intvec_start__ = 0x8000;
things start working.
I would appreciate your guidance on how to tackle this issue.

Best Regards,
Bharath

- Erich Styger on February 5, 2016 at 08:41 said:
  
  Hi Bharath,
  I believe the warning is that you have these memory areas allocated twice in your downloaded file? Can you check that you are not allocating these areas multiple times (e.g. by the application and by the bootloader)?
  I hope this helps,
  Erich
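  
  One way to check for such a double allocation is to compare the address ranges by hand. The sketch below does this for the regions of the linker file quoted above; the helper name `ranges_overlap` is made up for illustration, and the end addresses are inclusive, as in the ICF file. With `ROM_region` defined as the whole range from 0x00000000 to 0x0001FFFF (instead of the commented-out variant that leaves out the flash configuration), it overlaps both `IntVectTable_region` and `FlashConfig_region`. Whether this overlap is what triggers the IAR warnings is an assumption, not something verified here.
  
  ```c
  #include <stdbool.h>
  #include <stdint.h>
  
  /* Returns true if the inclusive ranges [a_start, a_end] and
   * [b_start, b_end] share at least one address. */
  bool ranges_overlap(uint32_t a_start, uint32_t a_end,
                      uint32_t b_start, uint32_t b_end) {
    return a_start <= b_end && b_start <= a_end;
  }
  ```
  
  For example, `ranges_overlap(0x00000000, 0x0001FFFF, 0x00000400, 0x0000040F)` is true for the active `ROM_region` definition, while both halves of the commented-out definition (0x0 to 0x3FF and 0x410 to 0x1FFFF) stay clear of the flash configuration area.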
  
Nadine on February 22, 2016 at 12:45 said:

Hi Erich,

I’m working with a FRDM-K64F board, under KDS 3.1.0 and KSDK 2.0.

I have made some changes in the MK64FN1M0xxx12_flash.ld file to get my program running, but they do not seem correct. See the messages and my comments below.

Any help is welcome
Best regards
Nadine

====================== ORIGINAL SETTINGS ARE:

/* Specify the memory areas */
MEMORY
{
m_interrupts (RX) : ORIGIN = 0x00000000, LENGTH = 0x00000400
m_flash_config (RX) : ORIGIN = 0x00000400, LENGTH = 0x00000010
m_text (RX) : ORIGIN = 0x00000410, LENGTH = 0x000FFBF0
m_data (RW) : ORIGIN = 0x1FFF0000, LENGTH = 0x00010000
m_data_2 (RW) : ORIGIN = 0x20000000, LENGTH = 0x00030000
}

====================== I CHANGE THEM TO:

/* Specify the memory areas */
MEMORY
{
m_interrupts (RX) : ORIGIN = 0x00000000, LENGTH = 0x00000400
m_flash_config (RX) : ORIGIN = 0x00000400, LENGTH = 0x00000010
m_text (RX) : ORIGIN = 0x00000410, LENGTH = 0x000FFBF0
m_data (RW) : ORIGIN = 0x1FFF0000, LENGTH = 0x00020000
m_data_2 (RW) : ORIGIN = 0x20010000, LENGTH = 0x00020000
}

====================== THEN, MY PROGRAM LINKS and RUNS OK ….
12:03:50 **** Incremental Build of configuration Debug for project PLC ****
make -j2 all
'Invoking: Cross ARM GNU Print Size'
arm-none-eabi-size --format=berkeley "PLC.elf"
   text    data     bss     dec     hex filename
 115416     516  129512  245444   3bec4 PLC.elf
'Finished building: PLC.siz'
' '

====================== I ADD A LARGE TABLE IN MY PROGRAM, AND I HAVE A m_data overflowed message

'Building target: PLC.elf'
'Invoking: Cross ARM C++ Linker'
arm-none-eabi-g++ -mcpu=cortex-m4 -mthumb -mfloat-abi=hard -mfpu=fpv4-sp-d16 -O0 -fmessage-length=0 -fsigned-char -ffunction-sections -fdata-sections -Wall -g3 -T "../MK64FN1M0xxx12_flash.ld" -Xlinker --gc-sections -Wl,-Map,"PLC.map" -specs=nosys.specs -specs=nano.specs -Xlinker -z -Xlinker muldefs -o "PLC.elf" ./utilities/fsl_debug_console.o ./utilities/fsl_notifier.o ./utilities/fsl_sbrk.o ./startup/startup_MK64F12.o ./startup/system_MK64F12.o ./source/CRC_get.o ./source/DXpsk_DeMod.o ………..
c:/freescale/kds_3.0.0/toolchain/bin/../lib/gcc/arm-none-eabi/4.8.4/../../../../arm-none-eabi/bin/ld.exe: PLC.elf section `.bss' will not fit in region `m_data'
c:/freescale/kds_3.0.0/toolchain/bin/../lib/gcc/arm-none-eabi/4.8.4/../../../../arm-none-eabi/bin/ld.exe: region `m_data' overflowed by 26896 bytes
collect2.exe: error: ld returned 1 exit status
make: *** [PLC.elf] Error 1

====================== I INCREASE THE m_data as seen below
/* Specify the memory areas */
MEMORY
{
m_interrupts (RX) : ORIGIN = 0x00000000, LENGTH = 0x00000400
m_flash_config (RX) : ORIGIN = 0x00000400, LENGTH = 0x00000010
m_text (RX) : ORIGIN = 0x00000410, LENGTH = 0x000FFBF0
m_data (RW) : ORIGIN = 0x1FFF0000, LENGTH = 0x00028000
m_data_2 (RW) : ORIGIN = 0x20018000, LENGTH = 0x00020000
}

====================== BUT, A PROBLEM APPEARS WHILE DOWNLOADING THE PROGRAM

'Invoking: Cross ARM GNU Create Flash Image'
arm-none-eabi-objcopy -O ihex "PLC.elf" "PLC.hex"
'Invoking: Cross ARM GNU Print Size'
arm-none-eabi-size --format=berkeley "PLC.elf"
   text    data     bss     dec     hex filename
 115416     516  159504  275436   433ec PLC.elf
'Finished building: PLC.siz'
'Finished building: PLC.hex'
' '
' '

Open On-Chip Debugger 0.8.0-dev (2015-01-09-16:22)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.sourceforge.net/doc/doxygen/bugs.html
Info : only one transport option; autoselect ‘cmsis-dap’
Info : CMSIS-DAP: SWD Supported
Info : CMSIS-DAP: Interface Initialised (SWD)
Info : add flash_bank kinetis kinetis.flash
cortex_m reset_config sysresetreq
adapter speed: 1000 kHz
Started by GNU ARM Eclipse
Info : CMSIS-DAP: FW Version = 1.0
Info : SWCLK/TCK = 0 SWDIO/TMS = 1 TDI = 0 TDO = 0 nTRST = 0 nRESET = 1
Info : DAP_SWJ Sequence (reset: 50+ ‘1’ followed by 0)
Info : CMSIS-DAP: Interface ready
Info : clock speed 1000 kHz
Info : IDCODE 0x2ba01477
Info : kinetis.cpu: hardware has 6 breakpoints, 4 watchpoints
Info : accepting ‘gdb’ connection from 3333
Info : Probing flash info for bank 0
Warn : acknowledgment received, but no packet pending
undefined debug reason 7 – target needs reset
target state: halted
target halted due to debug-request, current mode: Thread
xPSR: 0x01000000 pc: 0x000004d0 msp: 0x20030000
semihosting is enabled
Warn : Any changes to flash configuration field will not take effect until next reset
target state: halted
target halted due to debug-request, current mode: Thread
xPSR: 0x01000000 pc: 0x000004d8 msp: 0x20038000, semihosting
===== arm v7m registers
(0) r0 (/32): 0x00000000
(1) r1 (/32): 0x00000000
(2) r2 (/32): 0x00000000
(3) r3 (/32): 0x00000000
(4) r4 (/32): 0x00000000
(5) r5 (/32): 0x00000000
(6) r6 (/32): 0x00000000
(7) r7 (/32): 0x00000000
(8) r8 (/32): 0x00000000
(9) r9 (/32): 0x00000000
(10) r10 (/32): 0x00000000
(11) r11 (/32): 0x00000000
(12) r12 (/32): 0x00000000
(13) sp (/32): 0x20038000
(14) lr (/32): 0xFFFFFFFF
(15) pc (/32): 0x000004D8
(16) xPSR (/32): 0x01000000
(17) msp (/32): 0x20038000
(18) psp (/32): 0x00000000
(19) primask (/1): 0x00
(20) basepri (/8): 0x00
(21) faultmask (/1): 0x00
(22) control (/2): 0x00
===== Cortex-M DWT registers
(23) dwt_ctrl (/32)
(24) dwt_cyccnt (/32)
(25) dwt_0_comp (/32)
(26) dwt_0_mask (/4)
(27) dwt_0_function (/32)
(28) dwt_1_comp (/32)
(29) dwt_1_mask (/4)
(30) dwt_1_function (/32)
(31) dwt_2_comp (/32)
(32) dwt_2_mask (/4)
(33) dwt_2_function (/32)
(34) dwt_3_comp (/32)
(35) dwt_3_mask (/4)
(36) dwt_3_function (/32)
Error: CMSIS-DAP: Write Error (0x04)
Error: CMSIS-DAP: Write Error (0x04)
Error: CMSIS-DAP: Read Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Write Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Read Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Read Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Read Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Read Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Write Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Read Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Read Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Read Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Read Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Read Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Write Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Read Error (0x04)
Error: JTAG failure 4
Error: CMSIS-DAP: Read Error (0x04)
Error: CMSIS-DAP: Write Error (0x04)
Polling target kinetis.cpu failed, GDB will be halted. Polling again in 100ms
Error: CMSIS-DAP: Write Error (0x04)
Error: Failed to write memory at 0x00000000
Error: CMSIS-DAP: Write Error (0x04)
Polling target kinetis.cpu failed, GDB will be halted. Polling again in 300ms
Error: CMSIS-DAP: Read Error (0x04)
Error: Failed to read memory at 0x00000000
Error: CMSIS-DAP: Write Error (0x04)
Polling target kinetis.cpu failed, GDB will be halted. Polling again in 700ms
Error: CMSIS-DAP: Write Error (0x04)
Polling target kinetis.cpu failed, GDB will be halted. Polling again in 1500ms
Error: CMSIS-DAP: Write Error (0x04)
Polling target kinetis.cpu failed, GDB will be halted. Polling again in 3100ms
Info : dropped ‘gdb’ connection

- Erich Styger on February 22, 2016 at 21:16 said:
  
  Hi Nadine,
  first off: be careful with ignoring that memory boundary at 0x2000’0000 (see https://mcuoneclipse.com/2013/07/10/freertos-heap-with-segmented-kinetis-k-sram/). If you know what you are doing, simply declare one area of memory (without the split), see that discussion in this post.
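  
  To make the boundary concrete, the sketch below recomputes the region ends from the MEMORY settings quoted above. The helper names are made up for illustration. With the original file, m_data ends exactly at 0x20000000; with LENGTH = 0x00020000 or 0x00028000 it reaches across that address. In the last variant m_data_2 also ends at 0x20038000, beyond the 0x20030000 end of the original file, which matches the msp value in the OpenOCD log. Assuming the original file describes the physical RAM, that alone could explain the failing download.
  
  ```c
  #include <stdbool.h>
  #include <stdint.h>
  
  #define SRAM_BOUNDARY 0x20000000u /* split between the two SRAM blocks */
  
  /* First address after a region that starts at 'origin' and is
   * 'length' bytes long. */
  uint32_t region_end(uint32_t origin, uint32_t length) {
    return origin + length;
  }
  
  /* True if the region has bytes on both sides of SRAM_BOUNDARY. */
  bool crosses_sram_boundary(uint32_t origin, uint32_t length) {
    return origin < SRAM_BOUNDARY && region_end(origin, length) > SRAM_BOUNDARY;
  }
  ```
  
  For example, `crosses_sram_boundary(0x1FFF0000, 0x00010000)` is false for the original m_data, while the same call with a length of 0x00028000 is true.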
  
  - Nadine on February 23, 2016 at 11:29 said:
    
    Hi Erich,
    
    Thanks for the answer.
    Now I better understand the usage of the m_data and m_data_2 sections.
    I have made the suggested changes, and things are working better.
    
    Regards
    Nadine
    
Pingback: Dealing with Code Size in Kinetis SDK v2.x Projects | MCU on Eclipse
jacobjennings on March 11, 2016 at 20:11 said:

I blogged about my investigation: Size optimization for .elf ARM binaries in Kinetis Design Studio ‘link not valid any more!’

- Erich Styger on March 11, 2016 at 20:20 said:
  
  Thanks for sharing that link! One point to note: the size of the debug information does not matter, unless you care about how big the file is on the host PC, because only the code (and data) gets downloaded to the target. So the .debug_* sections and their sizes do not matter: this stuff does not take space on your board (as long as you are not running something like Embedded Linux). What does shrink the size is using the optimization settings of the compiler. Here I typically see a 50% reduction in my projects as well, since without optimizations the gcc compiler produces completely unoptimized code.
  
Neha on July 1, 2016 at 17:34 said:

I am a digital verification engineer and not an expert in linker or compilers.
I have a particular question on MEMORY defined in .ld file. And, it is related to ARM GCC bare metal linker.
What I noticed is, if I don’t specify MEMORY in .ld file and in .text specify the start address as 0x0000_0000, and 0x0000_0000 is ROM address space, there is no initialization done by memset function.
But, when I specify FLASH and RAM in MEMORY, there is a zero initialization of .bss section. I can actually see ARM core writing 0’s to certain addresses.
What is confusing is why there is a change of behavior? Does linker do something clever when there is no memory defined with ‘rwx’ attributes?
Does it assume memory to be ‘rx’?
What happens to the heap if everything is ‘rx’?

- Erich Styger on July 1, 2016 at 19:05 said:
  
  Hi Neha,
  I always have used the MEMORY block. What you see as initialization of bss is what is required by a standard ANSI C/C++ startup code. To follow the standard, the uninitialized memory has to be initialized with zeros.
  If you don’t want that or you don’t need that, you could skip this initialization part in the startup code.
  Erich
  
naum_18 on November 1, 2016 at 15:22 said:

thanks for this explanation. is there any correlation between ‘hex’ or ‘dec’ and size of hex-file?

- Erich Styger on November 1, 2016 at 18:08 said:
  
  ‘hex’ and ‘dec’ are simply the same size, with different number bases.
  They don’t really correlate with the size of the hex file, as the hex file uses an ASCII encoding (see https://mcuoneclipse.com/2012/09/27/s-record-intel-hex-and-binary-files/). Therefore the size is very different, although it is in the range of twice what the GNU size utility reports for the code size, as every code byte is encoded in two ASCII bytes.
  I hope this helps,
  Erich
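  
  As a rough illustration of why the hex file is larger than the numbers from the size utility, the sketch below estimates the length of an Intel HEX file from the number of payload bytes. It assumes plain data records with a fixed number of bytes per record plus one end-of-file record, and it ignores extended address records; the function names are made up for this example. Each data record carries a start code, byte count, address, record type and checksum (11 characters) on top of the two ASCII characters per data byte.
  
  ```c
  #include <stddef.h>
  
  /* Characters in one Intel HEX record with 'n' data bytes, without the
   * line ending: ':' + 2 (count) + 4 (address) + 2 (type) + 2*n + 2 (checksum). */
  size_t ihex_record_chars(size_t n) {
    return 11u + 2u * n;
  }
  
  /* Estimated file size for 'payload' bytes, 'per_record' data bytes per
   * record (must be > 0) and 'eol' characters per line ending
   * (1 for LF, 2 for CR LF). */
  size_t ihex_estimate_size(size_t payload, size_t per_record, size_t eol) {
    size_t full = payload / per_record;
    size_t rest = payload % per_record;
    size_t total = full * (ihex_record_chars(per_record) + eol);
    if (rest != 0) {
      total += ihex_record_chars(rest) + eol; /* last, shorter data record */
    }
    total += ihex_record_chars(0) + eol;      /* end-of-file record */
    return total;
  }
  ```
  
  With 16 data bytes per record and LF line endings this comes to 44 characters per 16 payload bytes, so the factor of two is the part contributed by the data bytes alone and the record overhead comes on top.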
  
Pingback: MCUXpresso IDE: S-Record, Intel Hex and Binary Files | MCU on Eclipse
Robert Poor on April 12, 2017 at 03:10 said:

From the Department of Unabashed Astonishment:

=== Debug
arm-none-eabi-size --format=berkeley "evaluation.elf"
   text    data     bss     dec     hex filename
  62560     540    3568   66668   1046c evaluation.elf

=== Release
   text    data     bss     dec     hex filename
  17996     340    3248   21584    5450 evaluation.elf
Finished building: evaluation.siz

I’ll go back and check my optimization settings, but still, that’s a 3x reduction going from debug to release.

Javier on May 17, 2017 at 18:57 said:

Hello,
I’ve a problem of space in a cortex M3 when I try to debug, but not when I create the release version. Can somebody tell me how to debug a release version or create a debug one which fits into the available space?

thanks

BR

- Erich Styger on May 17, 2017 at 19:13 said:
  
  ‘Release’ usually is just with higher optimization levels, and with debug information removed (which does not make sense): See https://mcuoneclipse.com/2012/06/01/debug-vs-release/ on the general concept. I suggest you turn on/enable debug information, then you can debug the ‘release’ version with source level debugging as well.
  
srijan Banerjee on August 18, 2017 at 16:20 said:

Hi Erich,

Thanks for the post. This is very helpful.
Is there an option that I can use to pad the elf (output) so that it always takes up the exact same space in flash?

Reference:
I am using the Zynq 7020 SoC from Xilinx (and for firmware development, I am using Xilinx SDK, which is basically Eclipse)

regards,
Srijan.

- Erich Styger on August 18, 2017 at 16:26 said:
  
  Hi Srijan,
  You can fill up things with the linker script and even use your own pattern, see https://mcuoneclipse.com/2014/06/23/filling-unused-memory-with-the-gnu-linker/.
  I hope this helps,
  Erich
  
JH on September 27, 2017 at 16:46 said:

dec = text + data + bss

- Erich Styger on September 27, 2017 at 17:09 said:
  
  yes, thanks for that reminder. Will make it clear in the article.
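  
  A small sketch of that arithmetic, using the numbers from the top of the article (text 0x1408, data 0x18, bss 0x81c). The function names are made up for illustration. The FLASH and RAM estimates follow the statement in the article that 'data' counts for both; whether stack and heap are part of 'bss' depends on the linker file, so treat the RAM figure as an estimate.
  
  ```c
  #include <stdint.h>
  
  /* 'dec' (and 'hex') column: the sum of the three sections. */
  uint32_t size_total(uint32_t text, uint32_t data, uint32_t bss) {
    return text + data + bss;
  }
  
  /* FLASH holds code and constants plus the initial values of 'data'. */
  uint32_t flash_usage(uint32_t text, uint32_t data) {
    return text + data;
  }
  
  /* RAM holds the initialized variables plus the zero-initialized ones. */
  uint32_t ram_usage(uint32_t data, uint32_t bss) {
    return data + bss;
  }
  ```
  
  For the example, `size_total(0x1408, 0x18, 0x81c)` gives 7228 (0x1c3c), which is the 'dec' and 'hex' value printed by the size utility.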
  
Pingback: Flash-Resident USB-HID Bootloader with the NXP Kinetis K22 Microcontroller | MCU on Eclipse
Suresh on June 11, 2018 at 16:34 said:

Hi,
How do we do it similarly for MSP430 in CCS IDE?
Regards
Suresh

- Erich Styger on June 11, 2018 at 18:37 said:
  
  Hi Suresh,
  you should be able to call the size tool from the CCS as post-build step as outlined in https://mcuoneclipse.com/2014/05/04/printing-code-size-information-in-eclipse/
  
  I hope this helps,
  Erich
  
pgScorpio on June 22, 2018 at 17:49 said:

quote: “💡 I like to remember ‘bss’ as ‘Better Save Space’ :-). As bss ends up in RAM, and RAM is very valuable for a microcontroller, I want to keep the amount of variables which end up in the .bss at the absolute minimum.”

????
‘Better Save Space’ refers to ROM space !
So you should want to have as much as possible variables in .BSS instead of in .DATA !

All variables are stored either in .DATA or .BSS and .DATA also resides in RAM but ADDITIONALLY needs initialisation data in ROM and .BSS data does not !
So why do you wanna keep the amount of variables which end up in the .bss at the absolute minimum ??

- Erich Styger on June 22, 2018 at 18:29 said:
  
  Hi,
  I suggest you have a read at https://en.wikipedia.org/wiki/Data_segment
  
mesorahvchidush on November 1, 2018 at 12:01 said:

Hi,
I am a beginner at this, trying to compile and debug on an atmel eval board with a cortex m4. When I compile the program, I always get

text data bss dec
0x1c594 0x108c 0x506c 140940

and adding and removing things from the code does not change these values.
I try adding: float32_t fft_out[10000000];
and this has no effect, even though I would want it to complain that I am out of memory.
Any idea if and where I have a problem in my settings which would prevent these values from being correct?
Thanks!

- Erich Styger on November 1, 2018 at 12:38 said:
  
  The linker will remove (‘dead-strip’) any variable or code you are not using. So just adding something and not referencing it will not change anything.
  You will see a difference if you do
  float32_t var;
  
  and then use it e.g. as
  var = fft_out[0];
  in main().
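  
  A minimal sketch of that idea follows. It assumes the usual setup where unreferenced objects are removed through `-ffunction-sections`, `-fdata-sections` and `--gc-sections` (the flags visible in the linker command line quoted further up in this thread). The `float32_t` typedef stands in for the CMSIS-DSP type, and `FFT_OUT_LEN` and `read_first_sample` are made-up names; the array is much smaller than in the question so that it has a chance to fit.
  
  ```c
  typedef float float32_t;        /* stand-in for the CMSIS-DSP type */
  #define FFT_OUT_LEN 1024u       /* hypothetical size for this sketch */
  
  float32_t fft_out[FFT_OUT_LEN]; /* no initializer, so it counts as 'bss' */
  
  /* Reading through a volatile pointer keeps the access, and with it the
   * array, alive even when the compiler optimizes. */
  float32_t read_first_sample(void) {
    volatile float32_t *p = fft_out;
    return p[0];
  }
  ```
  
  Once `read_first_sample()` is called from `main()`, 'bss' should grow by the size of the array; with the original ten million elements the linker would then be expected to report that the RAM region overflows.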
  
Luca on January 15, 2019 at 10:48 said:

Hello Erich,
A question that may be related. With your approach I am able to get the flash footprint of my functions. What if I would like to know how much RAM a single function takes? With -size I am only able to get the flash footprint of each function; using -fstack-usage I get huge values for simple procedures, so I was wondering if I should inspect the map file. But again, in the map file I can see only the map file.
I am testing in a simple way, even though I know that the function is not really useful:
#include <stdint.h>

uint32_t test_function(void); /* prototype must match the definition below */

int main(void)
{
  (void)test_function();
  return 0;
}

uint32_t test_function(void){
  uint32_t internalVal = 0;
  internalVal++;
  return internalVal;
}

any hint on this?
K.R.

- Erich Styger on January 17, 2019 at 14:50 said:
  
  Hi Luca,
  are you using the approach I described in https://mcuoneclipse.com/2015/08/21/gnu-static-stack-usage-analysis/?
  I don’t get huge numbers, for your example I get:
  MK64FN1M0xxx12_Project.c:72:5:main 8 static
  MK64FN1M0xxx12_Project.c:76:10:test_function 16 static
  which is reasonable.
  
  - Luca on January 17, 2019 at 15:39 said:
    
    Hello Erich,
    Yes I tried with that approach but I have some issues (https://community.nxp.com/thread/486445) because results vary according to the perl version used. I was looking for some alternative and tested method that does not involve making a new script by myself, using GNU toolchain if possible.
    
    - Erich Styger on January 17, 2019 at 16:51 said:
      
      Hi Luca,
      Because I don’t need the stack usage for every build, I stuck with calling it from the shell/console and not as part of the build process.
      Maybe this is related to the thing you see: I had different set of tools (in my case: scp) called depending on if I do it from the application or from the console.
      The issue seems that it depends if the application (Eclipse in this case) is 32bit or 64bit: Windows makes a kind of ‘shadow’ environment and depending on if the caller is 32bit or 64bit it might call different binaries in different folders :-(.
      
Henry on May 5, 2019 at 05:16 said:

Hello Erich,
Thank you for the post.

I have a question; could you give some advice? If I do not use the standard library startup, such as __main, I must write some assembly code to copy the data section from ROM to RAM. How do I know the start address of the data section, and its size?

- Erich Styger on May 5, 2019 at 08:07 said:
  
  Hi Henry,
  thank you!
  Have a read at https://mcuoneclipse.com/2016/11/01/getting-the-memory-range-of-sections-with-gnu-linker-files/. You can set your own symbols in the linker file, like a start and end symbol. The size you get with the address difference between the symbols.
  
  I hope this helps?
  Erich
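  
  With such start and end symbols available, the copy-down and the zero-out can be sketched in C as below (the same loops can then be written in assembly). This is a minimal sketch under assumptions: the function names are made up, the ranges are assumed to be word-aligned, and the symbol names in the comment are hypothetical and have to match whatever your linker file defines.
  
  ```c
  #include <stdint.h>
  
  /* Copy-down: copies the initial values of .data from FLASH ('src') to
   * their run-time location in RAM ('dst' up to, not including, 'dst_end'). */
  void copy_down(uint32_t *dst, const uint32_t *dst_end, const uint32_t *src) {
    while (dst < dst_end) {
      *dst++ = *src++;
    }
  }
  
  /* Zero-out: clears .bss from 'dst' up to, not including, 'dst_end'. */
  void zero_out(uint32_t *dst, const uint32_t *dst_end) {
    while (dst < dst_end) {
      *dst++ = 0u;
    }
  }
  
  /* On the target the arguments would come from linker symbols, e.g.
   * (hypothetical names, they must match your linker file):
   *   extern uint32_t __data_load__[], __data_start__[], __data_end__[];
   *   extern uint32_t __bss_start__[], __bss_end__[];
   *   copy_down(__data_start__, __data_end__, __data_load__);
   *   zero_out(__bss_start__, __bss_end__);
   */
  ```
  
  The size in bytes is the difference of the two addresses, for example `(uintptr_t)__data_end__ - (uintptr_t)__data_start__` with the hypothetical symbols from the comment.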
  
  - henry on May 6, 2019 at 10:10 said:
    
    Thank you for your reply. It helped me very much.
    
Luca on May 23, 2019 at 14:56 said:

Hello Erich,
I can see that this is still a hot topic after 6 years (and gladly I can polish my skills reading back your posts!).
I have a question about local variables initialized by reading register values. Something like:
uint32_t wdog_cfg = (uint32_t)((FEATURE_WDOG_CLK_FROM_LPO << WDOG_CS_CLK_SHIFT))
where is it placed?
I expect the wdog_cfg to be pushed in the stack memory while the mcu evaluates the value of FEATURE_WDOG_CLK_FROM_LPO << WDOG_CS_CLK_SHIFT
are both of them part of the .text section?
What if the variable is local and initialized?
uint32_t wdog_cfg = 0xAAAAAAAA;
the numeric value will be placed in flash in text?

My goal is to perform some watchdog test/init before branching into init_data_bss but I am wondering if I am risking to mess up with uninitialized RAM data with this approach!

- Erich Styger on May 24, 2019 at 08:09 said:
  
  Hi Luca,
  as a local variable, the wdog_cfg is placed (by default) on the stack. But if the compiler is optimizing it, it could be kept in register and actually never stored on the stack memory itself.
  Numeric constants such as 0xAAAAAAAA are handled by the compiler: usually the compiler places the constant as an immediate value into the code, typically at the end of the function (especially for ARM Cortex).
  There are some details about this discussed in this article: https://mcuoneclipse.com/tag/execute-only-code/
  I hope this helps,
  Erich
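  
  To summarize where things typically end up, here is a small sketch that reuses `table` and `myVar` from the article and the `wdog_cfg` value from the question; `counter` and `local_example` are made-up names. The placement given in the comments is the usual outcome with GNU tools for a microcontroller, not a guarantee, as it depends on the linker file and on the optimization level.
  
  ```c
  #include <stdint.h>
  
  const int table[] = {5, 0, 1, 5, 6, 7, 9, 10}; /* constant data: 'text' (FLASH) */
  int32_t myVar = 0x12345678; /* initialized: 'data' (RAM, initial value in FLASH) */
  int32_t counter;            /* no initializer: 'bss' (RAM, zeroed by startup code) */
  
  uint32_t local_example(void) {
    /* Local variable: lives on the stack or in a register. The value
     * 0xAAAAAAAA is part of the code and therefore counts as 'text'. */
    uint32_t wdog_cfg = 0xAAAAAAAAu;
    return wdog_cfg ^ (uint32_t)table[0];
  }
  ```
  
  Code that runs before the copy-down and zero-out can rely on locals like `wdog_cfg` and on constants like `table`, but not on `myVar` or `counter` having their expected values yet.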
  
Litesh on August 13, 2019 at 12:02 said:

The .romp is used by the linker for the ‘copy-down’ and initialization of .data. But it looks confusing: it is shown with addresses in RAM? Checking the linker map file shows:
Ah! That actually is not in RAM, but in FLASH: the linker maps this to the FLASH address 0x1b60! So this size 0x18 really needs to be added to the FLASH size too!

Does this mean that we have to consider the size of .romp in both FLASH as well as RAM?

- Erich Styger on August 13, 2019 at 13:19 said:
  
  yes.
  
Pingback: Tutorial: How to Optimize Code and RAM Size | MCU on Eclipse
LItesh on September 11, 2019 at 08:13 said:

Hello Erich,
Can you please explain .fill section in gcc mapfile. Also should this section be added to total consumption?

- Erich Styger on September 11, 2019 at 11:28 said:
  
  .fill is used to fill up a section to the next section boundary. I believe it is added to the total consumption, but I have not tried/counted the numbers.
  
Pingback: 6 ways to communicate with STM32F103C8T6. Part 1. Zero to blinking an led – Miles's nerdier side
Pingback: Listing Code and Data Size for all Files with the GNU size Utility in a Post-Build Action | MCU on Eclipse
Ian C. on January 23, 2021 at 22:36 said:

Sorry to be posting to such an old thread, but it’s where I ended up after following from your today post about assert/etc!
I wonder how “size” can be customized in MCUXpresso … but more, I don’t think it would tell me what I want anyway.
I am a good way through development on K22F for a redesign of our handheld configuration tool, and I’m stumped by “data=73100” in the compiled output. The code itself is only a little bigger (text=93596) so this additional “data” is a big hit on storage! I cannot find what is the cause; the map file gives no clues that I’ve spotted.
Is there a tool to decode / identify where initialized data comes from?

Thanks.

- Erich Styger on January 24, 2021 at 08:01 said:
  
  Hi Ian,
  yes, the ld map file is sometimes not that useful. Some ideas: are you aware of the ‘Image Info’ view in MCUXpresso? See https://mcuoneclipse.com/2020/01/10/listing-code-and-data-size-for-all-files-with-the-gnu-size-utility-as-post-build/ which gives as well another tool to see which file is producing how much of data.
  Have a look as well at the .bin file, if the linker has created initialization data (see https://mcuoneclipse.com/2014/04/19/gnu-linker-can-you-not-initialize-my-variable/ for that topic).
  I hope this helps?
  
  - Ian C. on January 25, 2021 at 14:55 said:
    
    I did find that article Saturday but apparently didn’t pursue enough. Today in “PROGRAM_FLASH” when I expand I see section “data_RAM2” of 51K; it lists among other things “menucache” of nearly 20K so that’s part of where the data is coming from.
    My declaration was “__SECTION(data,RAM2) menucache[CacheSize];” and there’s no “= {0}” initialiser, but the compiler/linker decided to do it anyway! Maybe the __SECTION is being used wrong?
    I just changed to “__NOINIT(RAM)” and that seems to have removed the data block from Flash. Now I have to do the same in a few other places!
    
    And … the code ends at 0x1EDB8 and the SREC file ends at 0x1EFE0 which is so much better!
    
    Thanks 🙂
    
    - Erich Styger on January 25, 2021 at 15:38 said:
      
      That’s great, thanks for reporting this back!
      
Jane on July 27, 2021 at 05:59 said:

Hi Erich,
You said: ‘data’ is for initialized variables, and it counts for RAM and FLASH. The linker allocates the data in FLASH which then is copied from ROM to RAM in the startup code. But I don’t fully understand why data is equal to 0x1c (after adding 4) instead of 0x20 as you told. And when data is stored in FLASH, how can it be copied from ROM?
Thanks,
Jane

- Erich Styger on July 27, 2021 at 07:30 said:
  
  Hi Jane,
  the initialization of RAM data with content from ROM/FLASH is done in your startup code. Check the code you are executing before calling main: there it does the zero-out (initializing globals with zero) and the copy-down (initializing variables with content from FLASH).
  