Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Long sleeps imprecise #11

Open
diggit opened this issue Oct 4, 2019 · 11 comments
Open

Long sleeps imprecise #11

diggit opened this issue Oct 4, 2019 · 11 comments

Comments

@diggit
Copy link
Contributor

diggit commented Oct 4, 2019

When there is scheduled sleep longer than maximum systick period, sleeping times become imprecise. Sleep times are then multiples of systick maximum period, even the last one which would be probably shorter than that. Problem is this line https://github.com/flit/argon-rtos/blob/master/src/ar_kernel.cpp#L279. Systick is not reconfigured and stays at maximum period until it overshoots or matches nextWakeup time. Sleep is then longer than required.

I'll try to fix it and open PR

@diggit
Copy link
Contributor Author

diggit commented Oct 4, 2019

Oh, and Argon is running in tickless mode.

@diggit
Copy link
Contributor Author

diggit commented Oct 4, 2019

Also, argon keeps time of wakeups in ticks. When systick period is clamped to value, which is not aligned to with (multiple of) tick duration, more timing errors raises.

@diggit
Copy link
Contributor Author

diggit commented Apr 8, 2020

Hi @flit, again, I have project which would benefit from such light RTOS. So I dove into Argon debugging again and found several bugs in tickless mode timing (not tested tick mode). It's all about requests to argon between ticks (aligned times when could scheduler ticks occur). Are you interested in discussion about fixes?

@flit
Copy link
Owner

flit commented Apr 9, 2020

Definitely interested! 😄

@diggit
Copy link
Contributor Author

diggit commented Apr 10, 2020

Great!
At first, I have to say, that systick really sucks for ticksless mode.

Timing allignment issue

Let's have this scenario:
10ms scheduler time quanta
Thread A:

  • turn on LED_1
  • sleep 10 ms
  • turn off LED_1
  • sleep 40 ms

Thread B:

  • flip state of LED_2
  • suspend

Thread B is periodically resumed from some timer IRQ (really does not matter, just asynchronous to tick).

Runtime:
System starts, init is done (abs time = 0ms, 0 ticks) and Thread A is started.

  1. Thread A finishes it's job and requests 10ms sleep.
  2. Scheduler is invoked, context is switched to idle thread because all threads are sleeping or are suspended. Next wake-up is scheduled to absolute time 10ms (tick 1).
  3. At 3 ms, timer irq fires and requests resume of Thread B.
  4. Scheduler is invoked, every thread is sleeping, so context is switched to Thread B
  5. Thread B finishes it's job and requests suspend.
  6. Scheduler is invoked, all treads are sleeping or are suspended. when ar_port_set_timer_delay is called, timer value is reset and Argon completely looses information, about those 3ms. Next wake-up is scheduled to absolute time 10ms (1 tick).

In extreme case when F_async > 1/(kSchedulerQuanta_ms/1000) (more than 1 async events fit in first kSchedulerQuanta_ms of sleep), ticks will never increment and Thread A never wakes.

Solution is to take timer current value into account when configuring timer. This would mean configuring timer to 7 ms.
Systick is kinda stupid timer and its' value cannot be preset. Only option is to modify Reload value. This involves several issues:

  • Now when Argon calls "ar_port_get_timer_elapsed_us()", returned value must also add that 3ms "correction", because that is real time since tick. Could be solved
  • (as a result of solution above) in scenario when timer is reconfigured to tick in very short time (eg. 10us) and execution is busy by some thread having disabled interrupts or by some other interrupt with higher priority, systick timer starts rapidly overflowing and thus loosing time. This issue does not have solution with systick, because we have to modify reload value to alter next tick time. Solution is to use some normal timer allowing value preset. Thus reloading to longer times after overflow allowing to Argon to catch up.

Same time loss happens, when thread runs several ms and in the end, requests sleep. Wakeup time is referenced to moment of this call screwing timing for others.

Repetitive tick increment issue

Same thread config like in issue above.

Runtime:
System runs sime time...

  1. Thread A finishes it's job and requests 40ms sleep.
  2. Scheduler is invoked, context is switched to idle thread because all threads are sleeping or are suspended. Next wake-up is scheduled after 40ms.
  3. After 25 ms, timer irq fires and requests resume of Thread B.
  4. Scheduler is invoked, every thread is sleeping, so context is switched to Thread B
  5. Thread B finishes it's job and requests suspend.
  6. Scheduler is invoked, all treads are sleeping or are suspended. Next wake-up is scheduled after 20 ms. (ignore time losses)
    ...

The issue is, that scheduler calls ar_kernel_increment_tick_count(elapsed ticks) where elapsed ticks are 2 and this increment is done more than once.

IMO cleanest solution would be to move handling of all ticking to ar_port.cpp together with tickCount and probably get rid of missedTickCount (why do we need it instead of just incrementing ticks even when kernel is locked?).

Note: Issues may not be clear at first, but I've spent several days debugging Argon. Drawn numerous timing diagrams and did some poor man's tracing

Current state

I have working tickless mode with precise timing on STM32F303 with 2 chained timers/counters. First one is counting microseconds between kernel ticks (16bit) and second one is counting overflows = ticks (32bit). Wake-up interrupt is connected on value compare on tick counter. Values or Reloads are never altered. Counters are never stopped. When Argon wants time, it always gets value from counters.
Not deeply tested, but at first glance working.
image
LED = LED_1, ASYNC = LED_2
Some broken timing could be seen in #10

I'll try to implement timing with single timer without necessity for chaining 2 timers. I have to cleanup my code and will link relevant branch here.

@diggit
Copy link
Contributor Author

diggit commented Apr 11, 2020

WIP here: https://github.com/diggit/argon-rtos/tree/ticklessFixes
Compare with master: master...diggit:ticklessFixes
Default ar_port.cpp still needs conversion. Tick mode was not tested yet.

@diggit
Copy link
Contributor Author

diggit commented Apr 11, 2020

I am also curious, which version of C++ do you target? Even C++11 has constexpr for constants, so you don't have to abuse enums for this purpose. (in C++ code obviously)

@flit
Copy link
Owner

flit commented Apr 13, 2020

(Sorry I'm being slow to reply… this weekend I've been totally focused on implementing CMSIS-Pack debug sequence support for pyOCD.)

Timing
Thanks a ton for investigating these problems! I've known there are issues with timing in Argon, especially for tickless. The main cause of the timing alignment issue (that affects tick mode too) is that the timing granularity is only whole ticks, so as you've seen, you lose any fraction of a tick if rescheduling asynchronously.

These are where I'd like to take Argon in regards to timing:

  1. Switch to microsecond precision timing instead of ticks. Might be optional since it really requires 64-bit variables.
  2. Support use of two timers, one for tracking wall time and one for sleeping. This ensures you don't lose any time whilst reconfiguring a single timer, which is a continual problem for tickless implementations. There might be better ways, such as how you are chaining timers. (Should also still support using only SysTick, so Argon is easy to get working on new MCUs without having to write timer code until you need precision.)
  3. Move Argon to be tickless-only.

(You can see some of these, and a lot more, in doc/argon_todo.txt.)

I was thinking the timer management should be moved to a separate ar_port_timer file. This would have to be chip-specific, or even application-specific. While the rest of ar_port.c works for any Cortex-M device.

I'll take a look at your ticklessFixes branch over the next few days/week. Again, I really appreciate your working on this! 😄

Language stuff
Argon nominally targets C99 and C++03. At the time, I would have liked to use C++11, but IAR didn't support it. Now, all 3 main Arm compilers support C11 and C++14. so should be ok to switch. But I'd rather do it as a coherent change instead of piecemeal.

Btw, using enums for integer constants is a style widely used by Apple. Years ago, I used to primarily be a Mac developer, so it just looks natural to me. For related constants, I like how it syntactically groups them together.

Missed ticks
Regarding missedTicks, you're right it should be removed. Just need to refactor ar_kernel_increment_tick_count() to only check sleeping threads (rename it ar_kernel_check_sleeping_threads()), and move the update of tickCount.

@diggit
Copy link
Contributor Author

diggit commented Apr 13, 2020

No problem. I've never tried pyOCD and always stick to openocd. Maybe it's time to try something new.

  1. Yeah tick<->ms<->us conversion is done on several places and it would make sense to use just one unit. Also scheduler tick decoupling from timing resolution would make sense. This could benefit from some c++ templating and not introducing a lot of ifdefs.

  2. Have a look at ar_port_f303_chained.cpp I've utilized two timers which are chained in HW (STM32 specific). First one counts microseconds between ticks and when it overflows, increments second counter (tick). Wake-up delay config is just about changing compare value of second one. This requires quite large counters (second one is 32bit wide).
    In ar_port_f303_single.cpp is used only one timer. Value of timer is never adjusted, just reload value. Some parts would be better atomic, but requires disabling irqs for at least short moment.

  3. Definitely. Well dropping tick mode support is probably not so difficult.

What about moving todos to some issue which can be easily edited/updated even without committing changes?

Language stuff

Well I am using llvm and gcc, both support C++17, but even C++14 would be nice.
Current mic of C++ and C is bit dangerous. Eg. C++ class extending C struct max not be compatible. C and C++ compiler may have different opinion about padding inside etc. Imo cleanest solution would be going full C++ internally and exposing C wrappers.
(I have projects using C++20 on embedded, but that's different story)

Missed ticks

Already ditched them in my branch. Function was remaed ar_kernel_tick_process() in ma branch, but it is kind of temporary. All ticking was moved to port, even they storage.

@flit
Copy link
Owner

flit commented Apr 19, 2020

Unfortunately IAR only supports C++14 (not even C++11!). Although I'm not using IAR much anymore myself, I'd still like to keep compatibility with the 3 major compilers.

Long ago (like mid-2000s), in a pre-Argon RTOS for ARM7, I did have a full C++ implementation with C wrappers. Unfortunately, this configuration makes it very difficult to manage statically-allocated kernel objects from the C wrappers. It also is the least efficient as far as code-size (you can't place calls to the C++ objects in inline C functions, since you can't import the C++ headers at the C API level).

@diggit
Copy link
Contributor Author

diggit commented Apr 27, 2020

...very difficult to manage statically-allocated kernel objects from the C wrappers

can you elaborate?

In following code, void int test_double_me_and_add(int num) can be called from C code.

class Test {
	int member {99};
public:
	int double_me_and_add(int num) {return num*2 + member;}
};

Test g_t;

extern "C" int test_double_me_and_add(int num) {
	return g_t.double_me_and_add(num);
}

It also is the least efficient as far as code-size (you can't place calls to the C++ objects in inline C functions, since you can't import the C++ headers at the C API level).

Well, as long as you compile sources individually and link them afterwards, almost every call in single compilation unit can be inlined. Calls between compilation units can be inlined when function body is in header which can be included other compilation units. If you enable -flto, inlining can happen between compilation units even when function body is not in header. Modern compilers are really amazing in optimizations. You can even mark functions as constexpr and calculate some data at compile time.

Is IAR Embedded something differemnt from IAR? Here they say about support of all features of C++17.

Even C++14 would be nice. Please consider migration of Argon to C++, there are so many benefits.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants