hypervisor: 0.846 register: 0.809 arm: 0.793 peripherals: 0.775 mistranslation: 0.766 semantic: 0.752 architecture: 0.750 PID: 0.742 performance: 0.740 device: 0.736 permissions: 0.730 risc-v: 0.717 graphic: 0.706 boot: 0.701 ppc: 0.697 virtual: 0.697 TCG: 0.695 KVM: 0.691 user-level: 0.689 assembly: 0.686 VMM: 0.681 vnc: 0.661 debug: 0.657 files: 0.641 network: 0.616 x86: 0.600 socket: 0.527 kernel: 0.524 i386: 0.365 arm9 clock pending (SP804) Hello all, I'm using the versatilepb board and the timer Interrupt Mask Status register (offset 0x14 of the SP804) does not seem to be working properly on the latest qemu-2.12. I tried on the 2.5 (i believe this is the mainstream version that comes with Linux) and it works perfectly. What happens is that the pending bit does not seem to be set in some scenarios. In my case, I see the timer value decreasing to 0 and then being reset to the reload value and neither does the interrupt is triggered nor the pending bit is set. I believe this is a matter of timing since in the "long" run the system eventually catches up (after a few microseconds). Thank you Do you have a test case image you can provide that reproduces the problem? Hello, Let me explain a bit how to reproduce the problem. I'm developing a RTOS with qemu (and boards,etc) - https://sourceforge.net/projects/rtospharos/. The first version used ARM926 with the versatilepb board and I used the qemu 2.5 version. In my test (to check that the RTOS is working correctly) I produced a loop that is always reading the current time ("pharosClockGet") and checking that it is always increasing. This works perfectly on qemu 2.5. When qemu 2.12 was released I updated to the new version to add support for the raspberry pi 3. Here is a piece of the test: void aperiodicThread0_0() { uint64_t microseconds; uint64_t previousMicroseconds = 0; uint32_t i; for(i = 0; i < NUMBER_REPETITIONS; i++) { microseconds = pharosClockGetSinceBoot(); if(previousMicroseconds > microseconds) { print("ERROR! Clock got lower\r\n"); break; } } } I'll show you the "internals" of Pharos next (this is file clock.c): uint64_t pharosIClockGetSinceBoot(void) { /* microseconds since last clock tick*/ uint32_t microseconds; /* number of clock ticks since boot */ ClockTick clockTicks; /* >> interrupts are disabled when we get here << */ /* read the number of microseconds on the counter */ clockTicks = pharosVClockTicks; /* read the timer value */ microseconds = pharosCpuClockReadCounter(); /* if the clock interrupt is pending we might have read the microseconds before or after the clock tick occurred (but not processed since interrupts are disabled) */ if(pharosCpuClockIsrIsPending() == TRUE) { /* interrupt is pending, so read again the microseconds and subtract 2 clock ticks */ microseconds = (2U * pharosIMicrosecondsPerClockTick()) - pharosCpuClockReadCounter(); } else { /* the timer counts backwards so we must subtract to get the number of microseconds since last tick */ microseconds = pharosIMicrosecondsPerClockTick() - microseconds; } /* calculate the result using the last us read */ return (uint64_t) ((pharosIMicrosecondsPerClockTick() * clockTicks) + microseconds); } And the part that is hw dependent (for ARM926) - this is in file ClockSp804.h: INLINE bool armSp804TimerInterruptIsPending(const ptrArmSp804Timer c) { /* get the first bit of the interrupt status (ignore other bits) to check if the interrupt is pending */ return (c->timerInterruptMaskStatus & 0x1U) == 1U ? TRUE : FALSE; } I hope the code is easy enough to read. Thank you Sorry, I should have been clearer. Please can you provide a test case binary and QEMU command line that reproduces the problem (including what the expected output is and what the actual output is)? Mark this bug as Incomplete as we're still waiting for the test case binary and command line to reproduce it. [Expired for QEMU because there has been no activity for 60 days.] Sorry to have taken some time to answer. Created the binary with the test. Here is the output when it fails (tested in Qemu 2.12): Checked 0 times if the time is always increasing Checked 100000 times if the time is always increasing Checked 200000 times if the time is always increasing Checked 300000 times if the time is always increasing Checked 400000 times if the time is always increasing Checked 500000 times if the time is always increasing Checked 600000 times if the time is always increasing Checked 700000 times if the time is always increasing Error: the time read is smaller than the previous time read And here with the latest Qemu (4.1.0): Checked 0 times if the time is always increasing Checked 100000 times if the time is always increasing Error: the time read is smaller than the previous time read The error can occur basically at any time. But when tested with Qemu 2.5 it goes well: Checked 0 times if the time is always increasing Checked 100000 times if the time is always increasing Checked 200000 times if the time is always increasing Checked 300000 times if the time is always increasing Checked 400000 times if the time is always increasing Checked 500000 times if the time is always increasing Checked 600000 times if the time is always increasing Checked 700000 times if the time is always increasing Checked 800000 times if the time is always increasing Checked 900000 times if the time is always increasing Checked 1000000 times if the time is always increasing Checked 1100000 times if the time is always increasing Checked 1200000 times if the time is always increasing Checked 1300000 times if the time is always increasing Checked 1400000 times if the time is always increasing Checked 1500000 times if the time is always increasing Checked 1600000 times if the time is always increasing Checked 1700000 times if the time is always increasing Checked 1800000 times if the time is always increasing Checked 1900000 times if the time is always increasing Checked 2000000 times if the time is always increasing Checked 2100000 times if the time is always increasing Checked 2200000 times if the time is always increasing Checked 2300000 times if the time is always increasing Checked 2400000 times if the time is always increasing Checked 2500000 times if the time is always increasing Checked 2600000 times if the time is always increasing Checked 2700000 times if the time is always increasing Checked 2800000 times if the time is always increasing Checked 2900000 times if the time is always increasing . . . Checked 9500000 times if the time is always increasing Checked 9600000 times if the time is always increasing Checked 9700000 times if the time is always increasing Checked 9800000 times if the time is always increasing Checked 9900000 times if the time is always increasing Example finished Just to give some background on the test itself, the main.c file contains this: void helloWorld0() { uint64_t microseconds; uint64_t previousMicroseconds = 0; uint32_t i; for(i = 0; i < NUMBER_REPETITIONS; i++) { microseconds = pharosClockGetSinceBoot(); if(microseconds < previousMicroseconds) { printk("Error: the time read is smaller than the previous time read\r\n"); pharosThreadSuspendSelf(); } previousMicroseconds = microseconds; /* only print 100 times the message */ if(i % (NUMBER_REPETITIONS / 100) == 0) { printk("Checked %d times if the time is always increasing\r\n" , i); } } printk("Example finished\r\n"); pharosThreadSuspendSelf(); } And where I suspect the code is reading the wrong hardware level is here: bool pharosCpuClockIsrIsPending(void) { /* apparently QEMU has a bug when it comes to reading the SP804 interrupt status. It seems that we have to read it multiple times in order for it to give the correct result since it does not update the interrupt status of the SP804 when it should. Qemu 2.5 did not have this problem but at least Qemu 2.12, 3.1 and 4.1.0 have this problem. So we just create a work-around for it. */ #if 0 /* number of times to read it. */ #define N 100 uint32_t i; /* read the interrupts status a lot of times to wait for QEMU to update the status */ for(i = 0; i < N; i++) { /* if there is an interrupt */ if(armSp804TimerInterruptIsPending(TIMER1_BASE_ADDRESS) == TRUE) { /* return right away */ return TRUE; } } /* read the status a LOT of times, there is no interrupt pending */ return FALSE; /* On a real hardware this problem should not occur so we only have to read it once */ #else /* read the SP804 timer interrupt pending info */ return armSp804TimerInterruptIsPending(TIMER1_BASE_ADDRESS); #endif } that is, the armSp804TimerInterruptIsPending is not correctly returning if the timer1 has a pending interrupt (even though I check the timer load value and it was reset just before and no interrupt was raised - interrupts disabled) Forgot to state the qemu argument, it's this one: ~/cross-compiler/qemu/2.5/bin/qemu-system-arm -M versatilepb -m 128M -nographic -kernel myApplication Thanks for the repro case. Preliminary analysis: I added some tracepoints to the sp804 code, and you're right that the value of the interrupt status register isn't always correct: A normal attempt to read the clock looks like this: Taking exception 2 [SVC] ...from EL0 to EL1 ...with ESR 0x11/0x4600000b AArch32 mode switch from svc to sys PC 0x12010 AArch32 mode switch from sys to svc PC 0x12018 32696@1569513406.450378:sp804_read addr 0x00000004 value 0x00000681 32696@1569513406.450384:sp804_read addr 0x00000014 value 0x00000000 Exception return from AArch32 svc to usr PC 0x10ff0 Sometimes we might read the clock when it's exactly got down to 0 (this is more likely on QEMU than on real h/w for internal reasons of our implementation): Taking exception 2 [SVC] ...from EL0 to EL1 ...with ESR 0x11/0x4600000b AArch32 mode switch from svc to sys PC 0x12010 AArch32 mode switch from sys to svc PC 0x12018 32696@1569513406.452273:sp804_read addr 0x00000004 value 0x00000000 32696@1569513406.452279:sp804_read addr 0x00000014 value 0x00000000 Exception return from AArch32 svc to usr PC 0x10ff0 A correct handling of the rollover looks like this (we read the counter, which is rolled over, and the interrupt-status, which is 1, which causes us to reread the counter; once we're done the IRQ handler itself runs): 4003@1569514474.944756:sp804_read addr 0x00000004 value 0x0000c29d 4001@1569514474.944761:sp804_arm_timer_update level 1 4003@1569514474.944797:sp804_read addr 0x00000014 value 0x00000001 4003@1569514474.944828:sp804_read addr 0x00000004 value 0x0000c255 Taking exception 5 [IRQ] ...from EL1 to EL1 ...with ESR 0x11/0x4600000b AArch32 mode switch from irq to svc PC 0x2a5e4 4003@1569514474.944943:sp804_read addr 0x00000014 value 0x00000001 4003@1569514474.944957:sp804_read addr 0x00000034 value 0x00000000 4003@1569514474.944962:sp804_read addr 0x00000014 value 0x00000001 4003@1569514474.944965:sp804_write addr 0x0000000c value 0x00000000 4003@1569514474.944966:sp804_arm_timer_update level 0 4003@1569514474.944969:sp804_read addr 0x00000034 value 0x00000000 AArch32 mode switch from svc to irq PC 0x2a718 Exception return from AArch32 irq to svc PC 0x2a3bc Exception return from AArch32 svc to usr PC 0x10ff0 But sometimes we get this, where we read the rolled-over counter value but the interrupt-status register is still 0: Taking exception 2 [SVC] ...from EL0 to EL1 ...with ESR 0x11/0x4600000b AArch32 mode switch from svc to sys PC 0x12010 AArch32 mode switch from sys to svc PC 0x12018 4003@1569514475.794690:sp804_read addr 0x00000004 value 0x0000c2df 4003@1569514475.794698:sp804_read addr 0x00000014 value 0x00000000 4001@1569514475.794703:sp804_arm_timer_update level 1 Exception return from AArch32 svc to usr PC 0x10ff0 Taking exception 5 [IRQ] ...from EL0 to EL1 ...with ESR 0x11/0x4600000b AArch32 mode switch from irq to svc PC 0x2a5e4 4003@1569514475.794768:sp804_read addr 0x00000014 value 0x00000001 4003@1569514475.794937:sp804_read addr 0x00000034 value 0x00000000 4003@1569514475.794944:sp804_read addr 0x00000014 value 0x00000001 4003@1569514475.794947:sp804_write addr 0x0000000c value 0x00000000 4003@1569514475.794949:sp804_arm_timer_update level 0 4003@1569514475.794952:sp804_read addr 0x00000034 value 0x00000000 AArch32 mode switch from svc to irq PC 0x2a718 Exception return from AArch32 irq to usr PC 0x10ff0 This happens because the sp804 uses a QEMU timer abstraction "ptimer". The ptimer updates its internal state when a raw QEMU timer expires and calls the ptimer_tick() function. From this point on, a guest read of the counter value will get the rolled-over value, because the sp804 implements this as a simple ptimer_get_count(). However, the ptimer doesn't immediately call the sp804's arm_timer_tick() function (which is where we update the interrupt-status flag and arrange for an IRQ to be delivered) -- it just schedules that to happen later via a QEMU "bottom half handler". Unfortunately it's possible for the guest CPU to run between when the ptimer's ptimer_tick() happens and when the bottom-half-handler is triggered, which means that the guest can see this incorrectly out-of-sync state from the sp804 device. I'm not currently sure how best to fix this. Thank you for checking this issue. I looked at the ptimer code, like you said. Just one question: isn't this used by other hw as well? Maybe this problem is more general... I also tried (basically) the same example on a aarch64 (raspberry pi3), and I don't find any problems there. Maybe could be helpful to look how this problem is solved there. Just trying to help I sent out an initial RFC patchset which fixes this (it's just an RFC because it only converts this one device to the new ptimer API, and we should do a proper conversion of all devices); it seems to make the test case in this bug work correctly: https://