How To Use Mesa's Test Tool: deqp-runner
deqp-runner is a set of tools written by Mesa developers for running Vulkan and OpenGL quality test programs. It can run dEQP (drawElements Quality Program), piglit, SkQP (Skia Quality Program), and others in parallel and robustly.
In my experience, when running a large set of dEQP test cases, your own changes may cause the UMD (user-mode driver) to fail, crash, time out, or hang. If you simply use deqp-vk, the Khronos conformance test program for Vulkan, the run stops when the UMD throws an unrecoverable error, and we cannot easily see which failures are new relative to the previous version.
We don’t need to write our own scripts to handle test cases stopping the run; deqp-runner handles it for us:
- Runs cases in parallel, which is very useful when we need to run a huge set
- Moves on to the next case and reports the failed case’s status when an unrecoverable error is thrown
- Automatically compares the differences between the baseline and the current version
Build
The tool is written in Rust, and is very easy to get and build (compared with C++, LoL).
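A minimal sketch of the build steps; the repository URL is the upstream one, and the --target-dir flag is my assumption to match the _build/release path mentioned below:

```sh
# Fetch and build deqp-runner in release mode.
git clone https://gitlab.freedesktop.org/mesa/deqp-runner.git
cd deqp-runner
# --target-dir _build makes cargo emit binaries into _build/release
# instead of the default target/release.
cargo build --release --target-dir _build
```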
The executables are generated in _build/release; deqp-runner and piglit-runner are the ones most useful for me.
Usage for dEQP
We run dEQP cases via deqp-runner.
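A minimal sketch of a deqp-runner run invocation for Vulkan; every path and value here is a placeholder:

```sh
deqp-runner run \
    --deqp /path/to/deqp-vk \
    --caselist /path/to/vk-caselist.txt \
    --output results/ \
    --jobs 8 \
    --timeout 60
```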
For running dEQP test cases, the important deqp-runner arguments are:
- --deqp: the dEQP executable file
- --jobs: how many threads to run in parallel
- --output: where the results are stored
- --timeout: a test run is terminated (and reported as Timeout) if it exceeds the given number of seconds
- --caselist: a list of test-case list files
- --env: a list of test run-time environment variables, e.g. VK_DRIVER_FILES, MESA_LOADER_DRIVER_OVERRIDE
If you run a base driver first, you can set the baseline to the base driver’s failure list; deqp-runner will then automatically compare the current results against that baseline.
If you just want to run a subset of the given case lists, --include-tests does it: test cases are skipped if their names do not match this option.
For example, here is the kind of command I use a lot.
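A sketch of such a command; the paths, the test-name pattern, and the ICD JSON file are placeholders:

```sh
# --baseline compares against the failures recorded for the base driver;
# --include-tests only runs test cases whose names match the pattern.
deqp-runner run \
    --deqp /path/to/deqp-vk \
    --caselist /path/to/vk-caselist.txt \
    --output results-new/ \
    --baseline results-base/failures.csv \
    --include-tests 'dEQP-VK.api.*' \
    --jobs 8 \
    --timeout 60 \
    --env VK_DRIVER_FILES=/path/to/my_driver_icd.json
```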
Running OpenGL dEQP tests is similar to Vulkan.
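For instance, a GLES run might look like this (the deqp-gles31 binary, the caselist, and the driver name are placeholders):

```sh
deqp-runner run \
    --deqp /path/to/deqp-gles31 \
    --caselist /path/to/gles31-caselist.txt \
    --output results-gles31/ \
    --jobs 8 \
    --timeout 60 \
    --env MESA_LOADER_DRIVER_OVERRIDE=llvmpipe
```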
Output
The main output files we should focus on:
- results.csv: the status of every test case
- failures.csv: all failures, crashes, and timeouts
According to the source code, the test statuses map to runner statuses as follows:
| Test status | Runner status |
|---|---|
| Pass | Pass |
| Fail | Fail |
| QualityWarning | Warn |
| CompatibilityWarning | Warn |
| Pending | Fail |
| NotSupported | Skip |
| ResourceError | Fail |
| InternalError | Fail |
| Crash | Crash |
| DeviceLost | Crash |
| Timeout | Timeout |
| Waiver | Warn |
| Test case not found | Missing |
If a baseline is used, you may also get these statuses:
| Runner status | Meaning |
|---|---|
| ExpectedFail | Failed in both the baseline and the current run |
| UnexpectedImprovement | The current status is better than the baseline |
If any result comes back as an unexpected failure, deqp-runner runs the case again to see whether it gets the same result, and marks results that change as flaky tests. In the results file these are called Flake.
You can easily report the flaky test cases from the current run.
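A simple way to list them, assuming results.csv stores one test-name,status pair per line:

```sh
# Print every test case whose final status was Flake.
grep ',Flake' results/results.csv
```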
Usage for piglit
Similar to deqp-runner, but piglit-runner must be given a profile, which corresponds to one of the *.xml.gz files in piglit’s tests folder, and takes --piglit-folder (the piglit checkout) instead of a dEQP executable file.
For example, a piglit-runner invocation might look like this.
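A sketch, where the piglit checkout path and the quick profile name are placeholders:

```sh
piglit-runner run \
    --piglit-folder /path/to/piglit \
    --profile quick \
    --output results-piglit/ \
    --jobs 8 \
    --timeout 60
```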
Unlike the other runners, the test-case names look strange in the results file, for example asmparsertest@arbfp1.0@cos-03.txt. In fact, the names come from the profile.