Proposal to remove ztimer clock adjustments

To be precise, I am proposing to remove the functionality provided by ztimer_clock_t::adjust_set and ztimer_clock_t::adjust_sleep, I have not looked at ztimer_clock_t::adjust_clock_start, but it also might make since to remove.

I also propose to add some flavor of ztimer_set_absolute() (inspired by timer_set_absolute()), as a replacement solution to the problem, which the adjustment feature sought to solve. That is having timers fire too much later than expected. Callers anticipate some delay, but it should not be unbounded. Much of the latency of ztimer is that “now” is a moving target. If “now“ could be fixed at the time of call (example ztimer_set_absolute(ZTIMER_USEC, ztimer_now() + 100)), the delay could be accounted for in ztimer’s internals. Currently, all of the ztimer low level implementations only allow setting timers relative to now. This means that calculating a timer’s timeout time must deal with the fact that there is a finite difference between the time when a user calls ztimer_set() and the time the low level timer is set, and the time must be stable for the current adjustment feature to work well.

No in-tree board makes use of the ztimer_auto_adjust module. I don’t see CONFIG_ZTIMER_USEC_ADJUST_CLOCK_START used anywhere, and CONFIG_ZTIMER_USEC_ADJUST_CLOCK_START and CONFIG_ZTIMER_USEC_ADJUST_SLEEP are only used by a few boards.

The problem is that ztimer_periodic fails catastrophically if its callback fires early. It gets suck in its ISR chasing the next set point while repeatedly setting its internal timer with a value of 0. I suspect ztimer_periodic_wakeup() also would not tolerate an early wakeup. You could argue that the bug is in ztimer_periodic, however I bet I can find other ztimer users that also do not assume an early wakeup. I also bet that every application writer that makes use of ztimer, makes the assumptions that early callbacks won’t happen.

How does this early wakeup occur? There are two ways. One is through the use of theCONFIG_ZTIMER_ADJUST_* macros. If they are set and then forgotten, they may become overly aggressive and subtract too much time from timers, if ztimer (in all cases of ztimer use) or the scheduler’s (in the case of ztimer_sleep()) performance is improved. This is especially likely to happen to out-of-tree boards. The second way is as follows, and is the way I discovered this issue. I had the ztimer_auto_adjust module enabled. For reasons I don’t understand, it was setting the adjustment values to different values as my application’s code base changed. These changes should not have had any impact on the module which runs very early in the boot process. Further, I confirmed that regardless of the adjustment values determined at boot, calls to ztimer_periodic were firing 1 tick early sometimes. This implies that, after boot, sometimes the latency in setting timers was less than what ztimer_auto_adjust measured at boot.

Initially I sought to make ztimer_periodic more robust to handle early callbacks, but this turned out to be non-trivial, and only fixes the problem in this one place.

Using absolute timeouts comes with a different can of worms. We had that before. I wouldn’t go there again. Just solving underflows with e.g., ztimer_set_absolute(ZTIMER_USEC, ztimer_now()+0) on a loaded system is not trivial. The sane thing to do is using 64bit timer values, which ztimer64 already provides (ztimer64_set_at()).

Timers triggering too early is obviously not great. Maybe the adjust_* functionality needs to be toned down a bit? Add some safety ticks?

How much earlier were your timers firing?

Using absolute timeouts comes with a different can of worms. We had that before. I wouldn’t go there again.

Fair enough, you have the experience to back that claim up.

I did not know about ztimer64’s absolute time functions. I incorrectly assumed that it only existed to implement longer wait times.

Timers triggering too early is obviously not great. Maybe the adjust_* functionality needs to be toned down a bit? Add some safety ticks?

This would improve the situation when ztimer_auto_adjust is used, but doesn’t help when the CONFIG_ZTIMER_ADJUST_*' macros are used, and could get out-of-date.

How much earlier were your timers firing?

Auto adjust would determine I needed 2 or 3 ticks offset (auto adjust was always consistent for a given binary, but it varied from binary to binary depending on changes to code that runs much later after auto adjust, or even code in branches that was never actually executed). Any adjustment >=2 ticks would trigger the mentioned ztimer_periodic bug. I interpret that to mean calls were at least 2 ticks early sometimes (and that the adjustment should be 1 tick max).

Perhaps setting absolute timers is me focusing on the solution and not the problem. Here is what I want to be able to do (my problem).

  1. set a timer relative to another timer
  2. set a periodic timer (special case of point 1, but relative to itself)
  3. set a timer relative to a point in time other than now (possible/likely in the past)

Solving these problems externally to ztimer are frustrated by the fact that ztimer_set() does work spanning multiple ticks both before and after the “now” tick that the timer is set relative to. Even the tick value returned by the function is not the now that the timer is set relative to. If I call now = ztimer_set(ZTIMER_USEC, 100) the timer is actually set now + SOME_NONZERO_VALUE + 100. This is because the now that is returned is the now from before the timer was inserted, but the now used to set the timer is the now from after the timer was inserted and the clock’s list was updated.

The clock adjustments attempt to solve time spent before the timer is set, but there is no solution to the time spent after the timer is set. So we wind up with:

unsigned isr_state = isr_disable();
const uint32 start = ztimer_now(ZTIMER_USEC);
ztimer_set(ZTIMER_USEC, start + 100 - ztimer_now(ZTIMER_USEC));
ztimer_set(ZTIMER_USEC, start + 200 - ztimer_now(ZTIMER_USEC));
ztimer_set(ZTIMER_USEC, start + 300 - ztimer_now(ZTIMER_USEC));
unsigned isr_state = isr_restore();

Internally ztimer uses absolute times and does have to deal with underflow. There is some great stuff with check-pointing and extending clocks. Applying some of these tools, it seems like there should be a way to solve the problems above. Maybe a ztimer_set_rel(clock, last, offset)? Or something inspired by ztimer_periodic_wakeup() ?

The difficulty with absolute timers is that it is hard to detect underflows. Like when setting timer_set_absolute(<actual_now>), was that meant to sleep for a whole timer tick cycle or for 0 ticks?

But when ztimer_set(100) is called, we know already that the absolute timer value must be between 0 and 100 from now(), in absolute values. ztimer calls now() very early in ztimer_set(), still non-zero but much earlier than after working the timer queue.

I think this would need to be handled at the lowest level (like, in `periph_timer`).

Something like timer_set_absolute_with_max(absolute_value, max_length).