hypervisor: 0.846
register: 0.809
arm: 0.793
peripherals: 0.775
mistranslation: 0.766
semantic: 0.752
architecture: 0.750
PID: 0.742
performance: 0.740
device: 0.736
permissions: 0.730
risc-v: 0.717
graphic: 0.706
boot: 0.701
ppc: 0.697
virtual: 0.697
TCG: 0.695
KVM: 0.691
user-level: 0.689
assembly: 0.686
VMM: 0.681
vnc: 0.661
debug: 0.657
files: 0.641
network: 0.616
x86: 0.600
socket: 0.527
kernel: 0.524
i386: 0.365

arm9 clock pending (SP804)

Hello all,

I'm using the versatilepb board and the timer Interrupt Mask Status register (offset 0x14 of the SP804) does not seem to be working properly on the latest qemu-2.12. I tried on the 2.5 (i believe this is the mainstream version that comes with Linux) and it works perfectly.

What happens is that the pending bit does not seem to be set in some scenarios. In my case, I see the timer value decreasing to 0 and then being reset to the reload value and neither does the interrupt is triggered nor the pending bit is set.

I believe this is a matter of timing since in the "long" run the system eventually catches up (after a few microseconds).

Thank you

Do you have a test case image you can provide that reproduces the problem?


Hello,

Let me explain a bit how to reproduce the problem. I'm developing a RTOS with qemu (and boards,etc) - https://sourceforge.net/projects/rtospharos/. The first version used ARM926 with the versatilepb board and I used the qemu 2.5 version.

In my test (to check that the RTOS is working correctly) I produced a loop that is always reading the current time ("pharosClockGet") and checking that it is always increasing. This works perfectly on qemu 2.5. When qemu 2.12 was released I updated to the new version to add support for the raspberry pi 3. 

Here is a piece of the test:


void aperiodicThread0_0()
{
    uint64_t microseconds;
    uint64_t previousMicroseconds = 0;
    uint32_t i;

    for(i = 0; i < NUMBER_REPETITIONS; i++)
    {
        microseconds = pharosClockGetSinceBoot();
        if(previousMicroseconds > microseconds) 
        {
            print("ERROR! Clock got lower\r\n");
            break;
        }
    }
}


I'll show you the "internals" of Pharos next (this is file clock.c):


uint64_t pharosIClockGetSinceBoot(void)
{
    /* microseconds since last clock tick*/
    uint32_t microseconds;

    /* number of clock ticks since boot */
    ClockTick clockTicks;

    /* >> interrupts are disabled when we get here << */

    /* read the number of microseconds on the counter */
    clockTicks = pharosVClockTicks;

    /* read the timer value */
    microseconds = pharosCpuClockReadCounter();

    /* if the clock interrupt is pending we might have read the microseconds before or after the clock tick occurred (but not processed since interrupts are disabled) */
    if(pharosCpuClockIsrIsPending() == TRUE)
    {
        /* interrupt is pending, so read again the microseconds and subtract 2 clock ticks */
        microseconds = (2U * pharosIMicrosecondsPerClockTick()) - pharosCpuClockReadCounter();
    }
    else
    {
        /* the timer counts backwards so we must subtract to get the number of microseconds since last tick */
        microseconds = pharosIMicrosecondsPerClockTick() - microseconds;
    }

    /* calculate the result using the last us read */
    return (uint64_t) ((pharosIMicrosecondsPerClockTick() * clockTicks) + microseconds);
}


And the part that is hw dependent (for ARM926) - this is in file ClockSp804.h: 

INLINE bool armSp804TimerInterruptIsPending(const ptrArmSp804Timer c)
{
        /* get the first bit of the interrupt status (ignore other bits) to check if the interrupt is pending */
        return (c->timerInterruptMaskStatus & 0x1U) == 1U ? TRUE : FALSE;
}


I hope the code is easy enough to read.
Thank you


Sorry, I should have been clearer. Please can you provide a test case binary and QEMU command line that reproduces the problem (including what the expected output is and what the actual output is)? 

Mark this bug as Incomplete as we're still waiting for the test case binary and command line to reproduce it.


[Expired for QEMU because there has been no activity for 60 days.]

Sorry to have taken some time to answer.

Created the binary with the test.

Here is the output when it fails (tested in Qemu 2.12):

Checked 0 times if the time is always increasing
Checked 100000 times if the time is always increasing
Checked 200000 times if the time is always increasing
Checked 300000 times if the time is always increasing
Checked 400000 times if the time is always increasing
Checked 500000 times if the time is always increasing
Checked 600000 times if the time is always increasing
Checked 700000 times if the time is always increasing
Error: the time read is smaller than the previous time read


And here with the latest Qemu (4.1.0):

Checked 0 times if the time is always increasing
Checked 100000 times if the time is always increasing
Error: the time read is smaller than the previous time read


The error can occur basically at any time. But when tested with Qemu 2.5 it goes well:

Checked 0 times if the time is always increasing
Checked 100000 times if the time is always increasing
Checked 200000 times if the time is always increasing
Checked 300000 times if the time is always increasing
Checked 400000 times if the time is always increasing
Checked 500000 times if the time is always increasing
Checked 600000 times if the time is always increasing
Checked 700000 times if the time is always increasing
Checked 800000 times if the time is always increasing
Checked 900000 times if the time is always increasing
Checked 1000000 times if the time is always increasing
Checked 1100000 times if the time is always increasing
Checked 1200000 times if the time is always increasing
Checked 1300000 times if the time is always increasing
Checked 1400000 times if the time is always increasing
Checked 1500000 times if the time is always increasing
Checked 1600000 times if the time is always increasing
Checked 1700000 times if the time is always increasing
Checked 1800000 times if the time is always increasing
Checked 1900000 times if the time is always increasing
Checked 2000000 times if the time is always increasing
Checked 2100000 times if the time is always increasing
Checked 2200000 times if the time is always increasing
Checked 2300000 times if the time is always increasing
Checked 2400000 times if the time is always increasing
Checked 2500000 times if the time is always increasing
Checked 2600000 times if the time is always increasing
Checked 2700000 times if the time is always increasing
Checked 2800000 times if the time is always increasing
Checked 2900000 times if the time is always increasing
.
.
.
Checked 9500000 times if the time is always increasing
Checked 9600000 times if the time is always increasing
Checked 9700000 times if the time is always increasing
Checked 9800000 times if the time is always increasing
Checked 9900000 times if the time is always increasing
Example finished


Just to give some background on the test itself, the main.c file contains this:

void helloWorld0()
{
    uint64_t microseconds;
    uint64_t previousMicroseconds = 0;
    uint32_t i;

    for(i = 0; i < NUMBER_REPETITIONS; i++)
    {
        microseconds = pharosClockGetSinceBoot();

        if(microseconds < previousMicroseconds)
        {
            printk("Error: the time read is smaller than the previous time read\r\n");
            pharosThreadSuspendSelf();
        }

        previousMicroseconds = microseconds;

        /* only print 100 times the message */
        if(i % (NUMBER_REPETITIONS / 100) == 0)
        {
            printk("Checked %d times if the time is always increasing\r\n" , i);
        }
    }

    printk("Example finished\r\n");
    pharosThreadSuspendSelf();
}


And where I suspect the code is reading the wrong hardware level is here:

bool pharosCpuClockIsrIsPending(void)
{
    /* apparently QEMU has a bug when it comes to reading the SP804 interrupt status. It seems that we have to read
      it multiple times in order for it to give the correct result since it does not update the interrupt status 
      of the SP804 when it should. Qemu 2.5 did not have this problem but at least Qemu 2.12, 3.1 and 4.1.0 have this
      problem. So we just create a work-around for it. */
#if 0

    /* number of times to read it. */
#define N 100

    uint32_t i;

    /* read the interrupts status a lot of times to wait for QEMU to update the status */
    for(i = 0; i < N; i++)
    {
        /* if there is an interrupt */
        if(armSp804TimerInterruptIsPending(TIMER1_BASE_ADDRESS) == TRUE)
        {
            /* return right away */
            return TRUE;
        }
    }

    /* read the status a LOT of times, there is no interrupt pending */
    return FALSE;

    /* On a real hardware this problem should not occur so we only have to read it once */
#else
    /* read the SP804 timer interrupt pending info */
    return armSp804TimerInterruptIsPending(TIMER1_BASE_ADDRESS);
#endif
}

that is, the armSp804TimerInterruptIsPending is not correctly returning if the timer1 has a pending interrupt (even though I check the timer load value and it was reset just before and no interrupt was raised - interrupts disabled)


Forgot to state the qemu argument, it's this one:

~/cross-compiler/qemu/2.5/bin/qemu-system-arm -M versatilepb -m 128M -nographic -kernel myApplication


Thanks for the repro case. Preliminary analysis: I added some tracepoints to the sp804 code, and you're right that the value of the interrupt status register isn't always correct:

A normal attempt to read the clock looks like this:

Taking exception 2 [SVC]
...from EL0 to EL1
...with ESR 0x11/0x4600000b
AArch32 mode switch from svc to sys PC 0x12010
AArch32 mode switch from sys to svc PC 0x12018
32696@1569513406.450378:sp804_read addr 0x00000004 value 0x00000681
32696@1569513406.450384:sp804_read addr 0x00000014 value 0x00000000
Exception return from AArch32 svc to usr PC 0x10ff0

Sometimes we might read the clock when it's exactly got down to 0 (this is more likely on QEMU than on real h/w for internal reasons of our implementation):
Taking exception 2 [SVC]
...from EL0 to EL1
...with ESR 0x11/0x4600000b
AArch32 mode switch from svc to sys PC 0x12010
AArch32 mode switch from sys to svc PC 0x12018
32696@1569513406.452273:sp804_read addr 0x00000004 value 0x00000000
32696@1569513406.452279:sp804_read addr 0x00000014 value 0x00000000
Exception return from AArch32 svc to usr PC 0x10ff0

A correct handling of the rollover looks like this (we read the counter, which is rolled over, and the interrupt-status, which is 1, which causes us to reread the counter; once we're done the IRQ handler itself runs):

4003@1569514474.944756:sp804_read addr 0x00000004 value 0x0000c29d
4001@1569514474.944761:sp804_arm_timer_update level 1
4003@1569514474.944797:sp804_read addr 0x00000014 value 0x00000001
4003@1569514474.944828:sp804_read addr 0x00000004 value 0x0000c255
Taking exception 5 [IRQ]
...from EL1 to EL1
...with ESR 0x11/0x4600000b
AArch32 mode switch from irq to svc PC 0x2a5e4
4003@1569514474.944943:sp804_read addr 0x00000014 value 0x00000001
4003@1569514474.944957:sp804_read addr 0x00000034 value 0x00000000
4003@1569514474.944962:sp804_read addr 0x00000014 value 0x00000001
4003@1569514474.944965:sp804_write addr 0x0000000c value 0x00000000
4003@1569514474.944966:sp804_arm_timer_update level 0
4003@1569514474.944969:sp804_read addr 0x00000034 value 0x00000000
AArch32 mode switch from svc to irq PC 0x2a718
Exception return from AArch32 irq to svc PC 0x2a3bc
Exception return from AArch32 svc to usr PC 0x10ff0

But sometimes we get this, where we read the rolled-over counter value but the interrupt-status register is still 0:

Taking exception 2 [SVC]
...from EL0 to EL1
...with ESR 0x11/0x4600000b
AArch32 mode switch from svc to sys PC 0x12010
AArch32 mode switch from sys to svc PC 0x12018
4003@1569514475.794690:sp804_read addr 0x00000004 value 0x0000c2df
4003@1569514475.794698:sp804_read addr 0x00000014 value 0x00000000
4001@1569514475.794703:sp804_arm_timer_update level 1
Exception return from AArch32 svc to usr PC 0x10ff0
Taking exception 5 [IRQ]
...from EL0 to EL1
...with ESR 0x11/0x4600000b
AArch32 mode switch from irq to svc PC 0x2a5e4
4003@1569514475.794768:sp804_read addr 0x00000014 value 0x00000001
4003@1569514475.794937:sp804_read addr 0x00000034 value 0x00000000
4003@1569514475.794944:sp804_read addr 0x00000014 value 0x00000001
4003@1569514475.794947:sp804_write addr 0x0000000c value 0x00000000
4003@1569514475.794949:sp804_arm_timer_update level 0
4003@1569514475.794952:sp804_read addr 0x00000034 value 0x00000000
AArch32 mode switch from svc to irq PC 0x2a718
Exception return from AArch32 irq to usr PC 0x10ff0


This happens because the sp804 uses a QEMU timer abstraction "ptimer". The ptimer updates its internal state when a raw QEMU timer expires and calls the ptimer_tick() function. From this point on, a guest read of the counter value will get the rolled-over value, because the sp804 implements this as a simple ptimer_get_count(). However, the ptimer doesn't immediately call the sp804's arm_timer_tick() function (which is where we update the interrupt-status flag and arrange for an IRQ to be delivered) -- it just schedules that to happen later via a QEMU "bottom half handler". Unfortunately it's possible for the guest CPU to run between when the ptimer's ptimer_tick() happens and when the bottom-half-handler is triggered, which means that the guest can see this incorrectly out-of-sync state from the sp804 device.

I'm not currently sure how best to fix this.


Thank you for checking this issue.

I looked at the ptimer code, like you said. Just one question: isn't this used by other hw as well?
Maybe this problem is more general...

I also tried (basically) the same example on a aarch64 (raspberry pi3), and I don't find any problems there. Maybe could be helpful to look how this problem is solved there.

Just trying to help

I sent out an initial RFC patchset which fixes this (it's just an RFC because it only converts this one device to the new ptimer API, and we should do a proper conversion of all devices); it seems to make the test case in this bug work correctly:
https://<email address hidden>/


Fix has been included in QEMU v4.2:
https://git.qemu.org/?p=qemu.git;a=commitdiff;h=5a65f7b5f4907ca70cb6