Spilling the Beans: Endless Loops

The university lectures are kind of ‘back to normal’: with the COVID certificates mandatory, many former limitations (social distance, masks, …) have been relaxed. So this means there are now many more questions and discussions with students.

One of the things I realized is that I do things in a certain way without needing to think about it, because I have used certain techniques for a long time. So I had several discussions with students last week which I would characterize as “aus dem Nähkästchen plaudern” (German, literally ‘chatting out of the sewing box’, in English roughly ‘spilling the beans’). No real ‘secrets’, just things which might be something new to think about. I think this is worth a potential new blog article series if it continues, so here we go with a first one: how to write ‘endless’ loops in C?

My usual way is to write ‘endless’ loops as below:

for(;;) {
  /* do things */
}

I prefer the above ‘for’ way over the following variants, which of course do the same thing (‘true’ needs <stdbool.h> in C99 or later):

while(true) {
  /* do things */
}

or:

do {
  /* do things */
} while(true);

Why? Because a really picky compiler could warn about an ‘always’ true condition. And to me the ‘for’ way is more compact too.
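
To try the three forms side by side on a host, here is a minimal sketch. It assumes C99 with <stdbool.h>; the function names and the ‘limit’ parameter are made up for illustration, and they only exist so that the ‘endless’ loops can be left again with a break.

```c
#include <stdbool.h>  /* C99: provides bool, true and false */

/* Each helper runs an 'endless' loop and leaves it after 'limit' passes.
   The names and the 'limit' parameter are illustrative only. */
unsigned count_with_for(unsigned limit) {
  unsigned n = 0;
  for (;;) {        /* no condition written, so nothing to warn about */
    if (n == limit) {
      break;
    }
    n++;
  }
  return n;
}

unsigned count_with_while(unsigned limit) {
  unsigned n = 0;
  while (true) {    /* constant condition: a picky tool may flag this */
    if (n == limit) {
      break;
    }
    n++;
  }
  return n;
}

unsigned count_with_do(unsigned limit) {
  unsigned n = 0;
  do {              /* the condition is only checked after the body */
    if (n == limit) {
      break;
    }
    n++;
  } while (true);
  return n;
}
```

All three helpers return the same count for the same limit, so the choice between them is about warnings and readability, not behavior.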

Sometimes I need to halt the program in case of an error. I could write this as:

for(;;);

which is compact, but I don’t like the missing { … }. So a better version is:

for(;;) { /* wait */ }

In general, I’m a big fan of having the {…} even if strictly not necessary: they improve the readability and avoid the ‘dangling else’ problem:

 if (a == 1)
    if (b == 1)
      a = 42;
 else
    b = 42;

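Can you tell in which case the variable b gets the value 42? Despite the indentation, the else belongs to the nearest if, that is to ‘if (b == 1)’, so b only becomes 42 if a is 1 and b is not 1. Using { … } makes the intent explicit, even where the braces are not strictly necessary.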
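
The two possible readings of the snippet above can be written out with braces. This is only a sketch: the function names are invented, and returning b is just a way to observe the result.

```c
/* How a C compiler reads the unbraced snippet: the else pairs with the
   nearest if, that is with 'if (b == 1)'. */
int b_as_compiled(int a, int b) {
  if (a == 1) {
    if (b == 1) {
      a = 42;
    } else {
      b = 42;
    }
  }
  return b;
}

/* What the indentation of the snippet suggests: the else pairs with the
   outer 'if (a == 1)'. Braces are the only way to get this reading. */
int b_as_indented(int a, int b) {
  if (a == 1) {
    if (b == 1) {
      a = 42;
    }
  } else {
    b = 42;
  }
  return b;
}
```

With a = 1 and b = 0, the first function returns 42 and the second returns 0; with a = 0 and b = 0 it is the other way round.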

Using

for(;;) { }

can still be somewhat problematic, as can any code which ‘branches on itself’. The assembly code will be something like this:

L:
  b L  /* branch to label L */

which is fine, but still some debuggers have problems with it :-(. The solution for this is to add some code inside, like:

for(;;) {
  __asm("nop"); /* for debugging purpose only */
}

Again, this would be only for easier debugging, so that might not be relevant for you.
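
A slightly more defensive variant is sketched below. It assumes GCC or Clang inline assembly syntax and a target with a ‘nop’ mnemonic; the function names are hypothetical. As far as the GCC documentation goes, a basic asm statement without operands is already treated as volatile, so the explicit qualifier mainly documents the intent.

```c
/* Assumes GCC/Clang inline assembly and a target that has a 'nop'
   mnemonic (ARM and x86 both do). Names are illustrative only. */
static inline void DebugNop(void) {
  __asm__ __volatile__("nop"); /* volatile: ask the compiler to keep it */
}

/* Endless loop with a body, so there is an instruction besides the
   self-branch for the debugger to stop on. */
void HaltForDebugger(void) {
  for (;;) {
    DebugNop(); /* for debugging purpose only */
  }
}

/* Bounded variant (hypothetical helper) to try the NOP on a host:
   executes 'count' NOPs and reports how many were issued. */
unsigned SpinNops(unsigned count) {
  unsigned done = 0;
  while (done < count) {
    DebugNop();
    done++;
  }
  return done;
}
```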

The question is: where to use ‘endless loops’? For sure not in interrupt service routines (keep them short and fast!): it would be a really bad idea to have such a loop in there.

One use is in case of an error, to wait for a COP (Computer Operating Properly) or watchdog to kick in:

if (fatalErrorHappened) {
  for(;;) {
    /* wait and block, the watchdog shall trigger a reboot here! */
  }
}
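
Because such a loop is so final, it can help to leave a trace of why the program stopped before blocking. The following is a sketch only: the function names, the error code and the idea of keeping the variable in memory that survives the reset are all assumptions, not part of any particular SDK.

```c
#include <stdint.h>

/* Hypothetical breadcrumb. On a real target this could be placed in a RAM
   section that is not cleared at startup, so the cause of the halt can be
   read out after the watchdog reset. */
static volatile uint32_t lastFatalError;

/* Remember why we are about to stop (codes are application defined). */
void RecordFatalError(uint32_t code) {
  lastFatalError = code;
}

uint32_t GetLastFatalError(void) {
  return lastFatalError;
}

/* Store the breadcrumb, then block until the watchdog resets the device. */
void FatalError(uint32_t code) {
  RecordFatalError(code);
  for (;;) {
    /* wait and block, the watchdog shall trigger a reboot here! */
  }
}
```

A pin toggle or a short message on a serial port before the loop would serve the same purpose; which one fits depends on the hardware at hand.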

Another useful place for endless loops is in tasks, for example in FreeRTOS:

static void myTask(void *pv) {
  (void)pv; /* parameter not used */
  for(;;) {
    /* do work here */
  }
  /* never leave the task! */
}

So: I try to keep these ‘beans’ short, and I hope they are useful for you.

Happy spilling 🙂

49 thoughts on “Spilling the Beans: Endless Loops”

  1. Excellent article – of course I agree with everything you’ve written (including the code formatting comments).

    I’m hoping that your next step is to explain *when* infinite loops are used. I would argue that in a single task thread, it’s only used once in the main loop – code like:

    if (errorCondition) {
    for (;;) { }
    }

    in a method or ISR should be avoided, especially when developing code/testing/debugging because it is so final, especially when the product gets into the field and somebody forgot that it is there.

    If you’re writing a multi-tasking application then every task should have an infinite loop with a wait for a timer delay, message, notification or mutex to indicate that it should resume execution.

  2. When I started coding in C the while loop seemed intuitive to me. Then I read that the for loop is the industry standard, so I switched for the sake of the programmers who would support my apps. I still think the while loop is easier to read.

  3. Yeah, I like while(true) (although I don’t like while(1)). This syntax makes sense to me and tells me what the programmer intended. A for without parameters is disturbing to me, and frankly I don’t know why it has caught on in recent years; perhaps because poor compilers optimize this statement poorly? In any case I checked the output, and my compiler generates the same code in both cases.

    For halting I would use assert(NULL) and if the code is never meant to reach there, then I would use an appropriate “unreachable_code” compiler specific function such as __builtin_unreachable(); so the compiler doesn’t complain and again it is very clear what the programmer intended. I always laugh when I see a mountain of comments explaining that the app will not continue such as after a rtos scheduler start.

    • Personally, I preferred “while (1 == 1)” as metaphysically, it continues to be true while the rules governing the way universe works stays the same.

      • >>Personally, I preferred “while (1 == 1)”
        I like that one :-). C did not have ‘bool’ as a datatype (until <stdbool.h>), and ‘equal to zero’ and ‘not equal to zero’ were the conditions; using ‘1==1’ (or maybe ‘42==42’) makes things clear for the universe 🙂

    • >>Yeah, I like while(true) (although I dont like while(1))
      ‘true’ and ‘false’ are not part of the C language, and until <stdbool.h> was added with C99, everyone had their own definitions (TRUE, FALSE, True, False, as #define, as enumeration), making portable code very hard or causing conflicts between different implementations. With this in mind, using ‘while(1)’ was a good approach because it was portable in those days. With <stdbool.h> now available, using ‘while(true)’ is the correct approach. Yes, C99 seems ‘old’, but I have to maintain code bases using C89!

    • Don’t give them the out.

      I’ve seen too many cases where it becomes a judgement call and ends up going the wrong way – see Erich’s comment on dangling elses.

      • yeah that’s true, good point… and I have been caught myself in times past!
        It’s a rare event that, as you say, very much depends on a ‘judgement call’.

        I guess the libertarian in me is saying it’s OK and the authoritarian in me is saying only in some cases! 🙂

    • >>but I do allow it if it is only a simple one line statement.
      No, do not allow it for that. IMHO it is really bad practice. Consider the following:
      if (cond) DO_THINGS;
      Consider DO_THINGS as a macro, having one or multiple statements depending on some settings (e.g. DEBUG, etc).
      Putting aside that using macros is another possible topic for discussions: putting ‘{‘ … ‘}’ around things never hurts and increases code quality.

  4. Have you considered teaching the students to follow, or at least make them aware of, the MISRA C/C++ Guide Lines?

    “MISRA provides world-leading best practice guidelines for the safe and secure application of both embedded control systems and standalone software.”

    The for(;;) is also recommended by Gimpel Lint to avoid “Evaluation is always true” errors.
    Gimpel was once a great thing to use, until they implemented their new license scheme that requires you to buy a new copy, at a no longer reasonable price, for each individual project.
    This is akin to a carpenter having to buy a new hammer for each nail they pound.

    • Yes, I make them aware under the topic of ‘Software Reviews and Checks’ and ‘Coding Guidelines’. I have been a great fan of Gimpel (PC-Lint) too, and as you notice their change in licensing/product bundling made me shy away from them.

  5. “for(;;) {
    __asm("nop"); /* for debugging purpose only */
    }”

    GCC can be aggressive, even at low levels of optimization, and can remove statements that have no side effects. The NOPs should be marked as volatile.

    #define ATTR_NO_INSTRUMENT_FUNCTION __attribute__( ( no_instrument_function ) )
    /*
    * NOP is not necessarily a time-consuming NOP. The processor might
    * remove it from the pipeline before it reaches the execution stage.
    * Use in combination with the Sync Barrier Instruction to consume time.
    */
    static inline ATTR_NO_INSTRUMENT_FUNCTION void nop( void )
    {
    __asm__ __volatile__ ("nop");
    }

    • Yes, good point. I thought about it, but did not add it to seed a discussion :-). So yes, it is a good practice to add the volatile for assembly too, but I have not seen (or missed) a case where gcc would have removed that NOP, even with the highest optimizations?

      • Hmmm… I tried “nop” without volatile on four different compilers with max optimization and none of them took it out. I know this is a side topic, but surely a nop is ‘known’ to have no side effects (other than timing) and I would expect any competent compiler to know this also. Hence I see no need for the volatile keyword, which btw has had significant usage deprecations of late.

        • Marking inline assembly with volatile certainly does not hurt, but imho it should not be necessary, because the compiler should take it ‘as-is’ with the exception of address or branch offsets calculated. I wrote compiler front and back ends, and the mandate was always not to touch inline assembly code. But in any case, if you want 100%+ control about what the code will be on the target, you would need to switch to a pure assembly module, and not having it mixed inside C/C++.

        • I have in fact seen the GCC-AVR compiler remove such loops.
          In the early days of GCC-AVR it was something that came up on the support mailing list from time to time when it would bite someone.

          Also did you try optimization of ‘Small’ -Os?

          “significant usage deprecations of late.”

          Is that because it was being used incorrectly in the first place in those cases?
          The only correct application is to warn the compiler that a value may change outside of the compiler’s abstract machine knowledge and that instructions must not be removed.

        • Hi Bob,
          >>“significant usage deprecations of late.”

          My take too is that it has been used in wrong ways and places, and maybe programmers have realized this.

      • On ARM the NOP is a ‘register-register move’ in the background, so it translates into ‘something’ like ‘mov r0, r0’. The question is what the processor then does with it, especially with pipelines (or caches): depending on these, the instruction might be optimized away by the processor architecture itself. I don’t think this is the case on Cortex-M. Anyway, delay loops with nops are subject to such effects, and execution time might vary depending on flash access wait cycles and so on.

        • It’s not the case on a Cortex-M. I often use NOP for timing delays, and so does the NXP SDK for that matter. I asked the question because I can’t think of a reason other than timing to use a NOP.

          This point is somewhat pedantic, but I also think it is incumbent on any embedded software engineer to know such behavior, to ‘know’ their MCU, that’s why such programmers have the ‘engineer’ in their name.

          >>It’s not the case on a Cortex-M.

          You mean that the time needed for NOP is always the same? Actually on a LPC845 (Cortex-M0+) it depends on the flash memory wait cycles, so depending on the settings for FLASH this might take 1, 2 or more cycles. I thought I wrote about it, but cannot find it 😦

          And yes, I do use NOPs for delays too, but not in an ‘exact’ fashion, knowing that timing will vary (flash bus, internal pipelines, bus clock frequency, core frequency) to some extent. I tend to say that delay loops with NOPs are in the +/- 10-15% accuracy: if that fits the needs: fine. Otherwise other realtime synchronization methods have to be used.

        • This is a really interesting conversation and one that I wish could be carried out in a forum better than this.

          Both of you have excellent points (I would include Bob’s comments in here as well) but I feel like something fundamental is missing here and that is nops should NEVER be required in a application in a device that has a word size greater than 8 bits.

          When I first started doing assembly language programming, literally 40 years ago, the nop had two primary purposes:
          1. Provide clock level delays for IO operations.
          2. Provide space for code patches.

          Hopefully we can all agree that point 2. is not required today and should have never been a consideration in the first place.

          For point 1., I would start with Erich’s point that in reasonably complex, modern systems (i.e. Cortex-M) with interrupts, DMA operations, caching, etc., the delay provided by the nop is at best a minimum case and cannot be expected to be accurate at all times.

          So that leaves us with an assertion that nops should never be used but we have the real world where debuggers can’t handle empty loops and hardware which cannot be accessed across consecutive bus operations.

          What do we do? My preference would be to eschew nops all together and when we discover an issue with a compiler, linker or debugger that needs a nop to function correctly, a bug is posted and highlighted immediately. As for hardware issues, I think John is right and we have to be engineers and understand what is happening with the hardware, document it (including making sure it’s documented in the appropriate datasheets and hardware manuals) and work out the best way to avoid the issue that avoids the use of nops because of their unpredictable execution discussed above.

        • Hi Mike,
          excellent summary and thoughts, thanks! I may add one more usage of NOPs for some architectures (non-ARM, afik) for Branch-Delay-Slots: the architecture requires that after a branch instruction due the prefetch there shall be a NOP followed by the branch. But it is the job of the compiler to include these nops in the right places.
          And I agree that some hardware peripherals require one more bus/CPU cycle to be effective, so using a NOP there is justified in my view. But I would ask if they could have made the hardware in a way not requiring this? Here again software needs to cover possible not-thought-through hardware, maybe?

        • Yeah, me too… I don’t use NOPs for “exact timing”; close enough is good enough in most cases, and the product’s flash and setup remain largely the same throughout its life. I would not expect subtle changes to break my code; that would be bad programming on my part!! very bad!

          If I needed absolute accuracy I would do it differently e.g. PIT

        • Looking at how the replies are organized, my original thought wishing that this conversation could be on another platform seems warranted.

          A few comments back…

          >>Erich wrote: “I may add one more usage of NOPs for some architectures (non-ARM, afik) for Branch-Delay-Slots: the architecture requires that after a branch instruction due the prefetch there shall be a NOP followed by the branch.”

          Isn’t this something that you have with a PICmicro? I’m not sure it’s something that you have to worry about with other (larger word size) architectures. This is why I put in “nops should NEVER be required in a application in a device that has a word size greater than 8 bits” in the comments back.

          >>John Coppola wrote: “I upgraded to C++20 and then the NXP SDK’s generated a ton of deprecation warnings.” with Erich replying ” I feel volatile or something similar as an attribute is reasonable for some hardware register (only), to allow the compiler to be informed about underlying things. Maybe a new qualifier/keyword or attribute?”

          This conversation reminds me of the xkcd cartoon: https://xkcd.com/927/ Rather than deprecating “volatile” and coming up with something different shouldn’t there be a better definition and guidelines on when to use volatile?

          >>Erich wrote: “. I tend to say that delay loops with NOPs are in the +/- 10-15% accuracy: if that fits the needs: fine. Otherwise other realtime synchronization methods have to be used.”

          The “+/- 10-15%” is significantly better than anything I’ve ever seen when I’ve used delay loops in anything more than an 8 bit system running with interrupts off and no DMA. A few years back I had an employee try them on a Nordic M0 processor and saw variations as high as 300% longer and none shorter than the expected with a very uneven distribution that made them basically useless for the task. Can I ask how you came up with that value?

          >>John Coppola wrote: “close enough is good enough in most cases…”

          This comment made me want to relate our experiences over the summer although I know I’ll want to repress those memories. Our originally selected (Toshiba) stepper motor drivers won’t be available until next year (this was discovered in April) and the thought was we’ll use Trinamic’s TMC2xxx devices because there are a) lots available of different but very similar part numbers and b) they’re used in a lot of applications so it should be a low risk activity.

          It’s now October and we’re finally getting to the point where we have qualified motor driver circuits. Going with Trinamic (recently purchased by Maxim) was a disaster as we have had four FAEs working on our application (with one being a chip designer) and we have found multiple major errors and omissions in their datasheets because they felt their evaluation tools were “close enough is good enough”. And they were for large steppers turning at what they call low to moderate speeds driven by Arduinos and they extrapolated their confidence to small motors running at moderate to high speeds driven by custom controllers (and software)…

        • Hi Myke,
          >>>>Erich wrote: “I may add one more usage of NOPs for some architectures (non-ARM, afik) for Branch-Delay-Slots: the architecture requires that after a branch instruction due the prefetch there shall be a NOP followed by the branch.”
          >>Isn’t this something that you have with a PICmicro? I’m not sure it’s something that you have to worry about with other (larger word size) architectures. This is why I put in “nops should NEVER be required in a application in a device that has a word size greater than 8 bits” in the comments back.

          No, to my knowledge this is not something on the PIC, but found on some of the ‘pure’ RISC cores like the M-CORE (worked with that one a while back) or some DSP or older RISC cores. https://en.wikipedia.org/wiki/Delay_slot gives a good explanation why NOPs are needed there.

          >>>>Erich wrote: “. I tend to say that delay loops with NOPs are in the +/- 10-15% accuracy: if that fits the needs: fine. Otherwise other realtime synchronization methods have to be used.”
          >>The “+/- 10-15%” is significantly better than anything I’ve ever seen when I’ve used delay loops in anything more than an 8 bit system running with interrupts off and no DMA. A few years back I had an employee try them on a Nordic M0 processor and saw variations as high as 300% longer and none shorter than the expected with a very uneven distribution that made them basically useless for the task. Can I ask how you came up with that value?

          Actually such realtime synchronization should be in the +10-15% range (always taking the specified time, but something more). You might have a look at the McuWait_Waitms() implementation in the McuLib (https://github.com/ErichStyger/McuOnEclipseLibrary/blob/master/lib/src/McuWait.c) which does exactly that, with a series of NOPs. Of course this timing is not intended to count in interrupts (except you are using the cycle counter approach used by that module), but it is good enough to wait a few ns, us or even ms. waiting for ms I usually use a delay from the RTOS anyway, and this module is able to handle this too.

      • “My preference would be to eschew nops all together and when we discover an issue with a compiler, linker or debugger that needs a nop to function correctly, a bug is posted and highlighted immediately. ”

        In the case that started this whole discussion of a volatile NOP: with the GCC-AVR compiler, optimization is set by default to ‘Small’ in the WinAVR/avrlibc project templates that many, many people blindly copy. The compiler is doing what was requested. A loop with no side effects is removed to make the code ‘small’, just as the user asked. So it is not technically a bug in this case; rather, it is something that violates the tenet that tools should do the thing of least surprise.

        • Hey Bob,

          I agree it’s not a “bug” but an incompatibility between the various tools being used. I said put it in as a bug as I don’t know of any other way of reporting the issue so that it’s at least documented and discussed.

          Cheers.

  6. Pingback: Spilling the Beans: Breaking Loops | MCU on Eclipse

  7. C++20 has deprecated some uses of the volatile keyword, especially with regard to compound assignments and atomic operations (there are other bad-use deprecations), but the fundamental purpose that we embedded programmers rely on is still in place. Specifically, variables are not atomic and should not be expected to be manipulated as such.

    Further, asm("") is inherently left “as is” by the compiler, and if it’s not, the compiler is at fault. But of course the compiler may not take into account what the MCU architecture will do at runtime, so you need to know.

    • Hi John, 100% agreed! On the topic of volatile: I did not expect that level of good discussion. In any case, I started drafting a new article around volatile, so other comments could be added there. I should be able to publish it early this week if time permits.

      • I look forward to it. I was forced to look at all this recently because I upgraded to C++20 and then the NXP SDK’s generated a ton of deprecation warnings. Hopefully, they will fix it in the near future, for now I use “-Wno-deprecated-volatile” to avoid the myriad of warnings…

        I read that volatile may be completely deprecated in the future, will be interesting to see how.

        • I did not switch/use C++20 yet, thanks for the heads-up. I feel volatile or something similar as an attribute is reasonable for some hardware register (only), to allow the compiler to be informed about underlying things. Maybe a new qualifier/keyword or attribute?

  8. You are right Mike, but when I said “close enough is good enough” I was talking as someone who had done his due diligence and determined that absolute precision is not warranted. The same applies when I choose a RTOS delay function, some are accurate, some have latency. I decide… I’m responsible. But I take your point and one must be cognizant. For the most part, I use timers. I also agree that removal of nop is probably a good thing especially for 8bit micros. In any case you should know better than to trust an FAE! LOL

    The cartoon is great! and sadly, very true. However, I have read and tried to understand the volatile deprecations, not an easy read, and what they have done makes sense. see http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1152r4.html

    I think the word has been abused by many non-embedded programmers, I have worked in a purely IT applications environment, very large projects, for many years and would often see it in legacy code. It was used incorrectly most times and I only needed to use it once with respect to OS thread global variables. The standards correction aims to fix this and I think it is warranted, a kind of forced education perhaps. Further, the volatile changes will also apply to C in the proposed draft aimed at C23. See http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2596.pdf

    I firmly believe in moving forward, my mind boggles when I read people suggesting one stays with C99!

    As for MAXIM, I feel your pain! I’m currently working on our new battery pack using the max17320 fuel gauge. Their documentation leaves a lot to be desired and I have come across several typos! But that’s why I have an eval kit to confirm my understanding. 🙂

    Finally, the complete removal of volatile may be achieved through function based interfaces in C++; I don’t know what the equivalent would be in C. Interfaces in C++ are interesting and if you want a good overview, see https://www.fluentcpp.com/2017/06/20/interface-principle-cpp/

    regards,

    • Hi John,

      A couple of things back.

      Volatile. Thank you for the links; I never realized how out of control the reserved word had become. I have always restricted my use to variables/registers which are updated externally to the currently executing code and are read and then polled. Honestly, when you’re writing what I consider good code – which means in a multi-tasker with no polling (see the comments on nops) you should never need to use volatile (yes I know we live in the real world and there will be cases where this isn’t possible or optimal).

      When reading through the references you provided, the conclusion I came to is that the standards committee needs to go back to K&R and look at how they defined it before deciding what and how to deprecate: “The purpose of volatile is to force an implementation to suppress optimization that could otherwise occur.” It seems like they’re spending much too much time deciding what needs to be deprecated and how it “should” work.

      I’m all for updated versions of languages but they need to be downlevel compatible. I would think that a good test of whether or not C++20 is going in the right direction is the number of warnings produced when building existing code: if you have to turn off the warnings because you are so inundated with them, then the language update is not appropriate.

  9. Pingback: Spilling the Beans: volatile Qualifier | MCU on Eclipse

  10. Yeah, I wholeheartedly agree, but I don’t think the committee is just on a mission to deprecate. I think volatile still means what you and I know it to mean, and they probably don’t need to change or clarify the existing meaning. It’s just all those ‘other’ bad uses out there that compilers have allowed that should be prevented moving forward, and I have no problem with a bit of short term pain.

    My code base is substantial and runs across three projects and several libraries. NONE of my code generated those warnings. Only my NXP SDK and specifically compound statements such as “volatile x++”. I’m sure your code wouldn’t either! 🙂

    Also, I have “all warnings” enabled and that’s probably why I see this particular warning.

  11. Erich Wrote:

    >>> Actually such realtime synchronization should be in the +10-15% range (always taking the specified time, but something more). You might have a look at the McuWait_Waitms() implementation in the McuLib (https://github.com/ErichStyger/McuOnEclipseLibrary/blob/master/lib/src/McuWait.c) which does exactly that, with a series of NOPs. Of course this timing is not intended to count in interrupts (except you are using the cycle counter approach used by that module), but it is good enough to wait a few ns, us or even ms. waiting for ms I usually use a delay from the RTOS anyway, and this module is able to handle this too.

    Wow. I haven’t seen code like that used for anything other than an 8bit device for literally decades – the last time would be in 1988/1989 when I was working on the POWER Processor for IBM RS/6k memory tests. You will get 10%-15% accuracy with that approach if interrupts, DMAs, etc. are not active in the system.

    Thank you for the example.

    • Hi Myke,
      >> You will get 10%-15% accuracy with that approach if interrupts, DMAs, etc. are not active in the system.
      Yes, and actually with using cycle counters it is even below 5% (it all depends on how long you wait). And of course depending on your system load. But even say if your interrupt load is 10% of the system, then it would be an extra 10% in the worst case which is not bad. And you don’t need to spend a timer for this. Actually this kind of realtime synchronization is very useful during system startup, e.g. waiting 100 ms because the OLED needs time to power up: at that stage interrupts usually are not enabled, so using a timer is not always the best choice. Or it is very handy to wait if you need to pull down say for 100 us: an RTOS usually has 1 kHz resolution, so won’t be good for this. Again, it is not for everything.

  12. These endless loops can be a hardware engineering nightmare. For instance, a board doesn’t boot and without hooking up the debugger you have no idea why because it’s stuck in a loop. Consider adding some type of breadcrumb, serial output, toggle a pin, novram log etc. so that you can quickly tell what is going on without pulling out the big guns.
