diff options
| author | Christian Krinitsin <mail@krinitsin.com> | 2025-07-08 13:28:15 +0200 |
|---|---|---|
| committer | Christian Krinitsin <mail@krinitsin.com> | 2025-07-08 13:28:28 +0200 |
| commit | 5aa276efcbd67f4300ca1a7f809c6e00aadb03da (patch) | |
| tree | 9b8f0e074014cda8d42f5a97a95bc25082d8b764 /results/classifier/zero-shot-user-mode/runtime/1086 | |
| parent | 1a3c4faf4e0a25ed0b86e8739d5319a634cb9112 (diff) | |
| download | emulator-bug-study-5aa276efcbd67f4300ca1a7f809c6e00aadb03da.tar.gz emulator-bug-study-5aa276efcbd67f4300ca1a7f809c6e00aadb03da.zip | |
restructure results
Diffstat (limited to 'results/classifier/zero-shot-user-mode/runtime/1086')
| -rw-r--r-- | results/classifier/zero-shot-user-mode/runtime/1086 | 75 |
1 files changed, 75 insertions, 0 deletions
diff --git a/results/classifier/zero-shot-user-mode/runtime/1086 b/results/classifier/zero-shot-user-mode/runtime/1086 new file mode 100644 index 00000000..51ab6212 --- /dev/null +++ b/results/classifier/zero-shot-user-mode/runtime/1086 @@ -0,0 +1,75 @@ +runtime: 0.432 +instruction: 0.308 +syscall: 0.261 + + + +Numpy/scipy test suites fails in QEMU on ppc64le (but not on aarch64) +Description of problem: +I'm not really qualified to report this problem, but after being affected by it for ~2 years (and QEMU 7 not fixing things), I decided to give it a shot. Please excuse reporting deficiencies, I'll endeavour to fix them as best I can once pointed out. + +In my spare time, I help out for the packaging effort in the [conda-forge](https://conda-forge.org/) ecosystem, which is mostly associated/attached to the python world, but - in contrast to the vanilla python tools - also deals with non-python dependencies, and in particular has strong enough abstractions to deal with ABI-issues and generally provides much better integration than the packages on PyPI. + +This strength of abstraction has also allowed conda-forge to publish artefacts for many more architectures than most projects are commonly able to provide precompiled binaries for. Due to the lack of (reliable) public CI for aarch64 & ppc64le, these packages are mostly cross-compiled from linux-x86. Where cross compilation is not possible, the packages are compiled in emulation through QEMU, coming through https://github.com/multiarch/qemu-user-static (this is the part of the infrastructure I don't fully understand myself...). The full infrastructure is somewhat involved, but should not be relevant (hopefully) to the issue at hand (see instructions below) - and even if that turns out to be the case, that would be a great information gain as well. + +In either case, the tests for the package (ideally comprising the entire upstream test suite) are then run in emulation. + +Two of the so-called "feedstocks" I co-maintain are for [numpy](https://github.com/conda-forge/numpy-feedstock) and [scipy](https://github.com/conda-forge/scipy-feedstock), and there have been persistent issues with running the test suite in emulation on PPC (interestingly, the same setup on a different architecture - aarch64 - has no problems). However, the compiled artefacts on PPC run fine on native hardware. + +Said otherwise, it appears numpy/scipy are exercising QEMU enough to uncover some bugs. I've seen similar problems also in other packages (e.g. the cvxpy-stack), reinforcing the impression that this is a QEMU issue, and not one on the level of the individual packages. + +Depending on the exact combination of python version, the result of the numpy test suite might be as follows: +``` +320 failed, 18900 passed, 361 skipped, 36 xfailed, 9 xpassed, 144 warnings in 2516.49s (0:41:56) +``` + +Looking at the test failures, sometimes the results are garbage +``` +> assert_array_max_ulp(x, x+eps, maxulp=20) +E AssertionError: Arrays are not almost equal up to 20 ULP (max difference is 8.55554e+08 ULP) + +eps = 1.1920929e-07 +self = <numpy.testing.tests.test_utils.TestULP object at 0x401ec8beb0> +x = array([ 2.3744986e-38, nan, 2.2482052e-15, 7.5780330e+28, + nan, nan, 5.8310814e+29, -5.6511531e+24, + 1.0010809e+00, 1.0101526e+00], dtype=float32) +``` +sometimes the values are permuted +``` +> assert_array_equal(actual, desired) +E AssertionError: +E Arrays are not equal +E +E x and y nan location mismatch: +E x: array([0.000000e+00, 6.704092e-39, 9.000000e+00, 2.350989e-38, +E 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, +E 6.772341e-39, nan], dtype=float32) +E y: array([6.704092e-39, 6.772341e-39, 0.000000e+00, 0.000000e+00, +E 0.000000e+00, 0.000000e+00, nan, 2.350989e-38, +E 2.000000e+00, 7.000000e+00], dtype=float32) +``` +sometimes the results are fundamentally different (zero vs. non-zero) +``` +> raise AssertionError(msg) +E AssertionError: +E Arrays are not almost equal to 6 decimals +E +E Mismatched elements: 72 / 216 (33.3%) +E Max absolute difference: 1. +E Max relative difference: 1. +E x: array([[[[[0., 0., 0.], +E [0., 0., 0.], +E [0., 0., 0.]],... +E y: array([[[[[1., 0., 0.], +E [0., 1., 0.], +E [0., 0., 1.]],... +``` + +I don't know where it goes wrong, but it's not just a little tolerance violation. One PR that illustrates this is [here](https://github.com/conda-forge/numpy-feedstock/pull/274) and the respective CI run is [here](https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=526218&view=results) (ignore the errors for osx-arm64, those are unrelated). +Steps to reproduce: +1. In an emulated ppc64 machine, install miniforge from [here](https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-ppc64le.sh) +2. Run `conda create -n test_env numpy pytest cython hypothesis typing_extensions` and then `conda activate test_env` +3. Run `python -c "import numpy; numpy.test()"` +4. Pick any test that fails and run it as `python -c "import numpy; numpy.test(tests='x.y.z')"` +Additional information: + |