Test Standards

This page will be used to collect information about test standards.

meta-documents

A survey of existing test systems was conducted in the Fall of 2018. The survey and results are here: Test Stack Survey


Here are some things we'd like to standardize in open source automated testing:

Terminology and Framework

Diagram

Below is a diagram for the high level CI loop:

The boxes represent different processes, hardware, or storage locations. Lines between boxes indicate APIs or control flow, and are labeled with letters. The intent of this is to provide a reference model for the test standards.

[Diagram: high-level CI loop]

Board management (G)

This standard defines a set of APIs or interfaces for managing the devices under test (DUTs). It includes things like:

  • board reservation
  • image instantiation (in the case of VMs or emulators)
  • board provisioning (installation of software under test)
  • power control (or, in the case of VMs - VM start)
  • bus control
  • power measurement
  • attribute discovery
  • console monitoring
  • file transfer to/from the board
  • command execution

See Board Management Layer Notes

As of October 2020, TimeSys and Sony have developed a "board farm REST API" which they are proposing to the Linux automated testing community.

The API is described at Board Farm REST API
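To make the scope of this layer concrete, here is a minimal sketch (in Python) of the kind of interface a board management layer might present to the rest of the CI loop. All class and method names are hypothetical illustrations of the operations listed above; this is not the Board Farm REST API or any existing framework's interface.

    # Hypothetical board-management interface (interface G).  The class and
    # method names are invented for illustration and do not correspond to an
    # existing framework or to the Board Farm REST API.

    class BoardManager:
        def reserve(self, board, timeout=300):
            """Reserve a DUT for exclusive use, waiting up to 'timeout' seconds."""
            raise NotImplementedError

        def release(self, board):
            """Release a previously reserved DUT."""
            raise NotImplementedError

        def provision(self, board, image_path):
            """Install the software under test (or instantiate a VM/emulator image)."""
            raise NotImplementedError

        def power(self, board, action):
            """Change power state: 'on', 'off', or 'reboot' (VM start in the VM case)."""
            raise NotImplementedError

        def measure_power(self, board, duration):
            """Return a power measurement log covering 'duration' seconds."""
            raise NotImplementedError

        def get_attributes(self, board):
            """Discover board attributes (architecture, memory, attached buses, ...)."""
            raise NotImplementedError

        def read_console(self, board, max_bytes=4096):
            """Return recent console output, for console monitoring."""
            raise NotImplementedError

        def copy_file(self, board, src, dest, to_board=True):
            """Transfer a file to or from the board."""
            raise NotImplementedError

        def run_command(self, board, cmd):
            """Execute a command on the board; return (exit_code, output)."""
            raise NotImplementedError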

Power Control

The power control API is a standard for controlling the power state of a board in a board farm (an automated testing lab).

See Test Power Control Notes

Standards

pdudaemon was selected as the standard for controlling power to a board in a lab.

The document containing this standard is at:

  • https://github.com/dave-pigott/pdudaemon/blob/master/share/powercontrolapi.md

or

  • https://docs.google.com/document/d/1-f2VNVlOnaJUSKUUWeYko3wXh7_ertbFD55y_0dTZvI
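As an illustration of how a client drives this power control API, the sketch below sends an HTTP request to a running pdudaemon instance. The /power/control/{on,off,reboot} path layout and the default listener port 16421 used here are assumptions recalled from the pdudaemon documentation; check the power control API document linked above for the authoritative details.

    # Hedged example: requesting a power action from pdudaemon over HTTP.
    # The URL layout and default port are assumptions; verify them against
    # powercontrolapi.md (linked above) before relying on this.
    import urllib.request

    PDUDAEMON = "http://localhost:16421"   # assumed default listener address

    def pdu_power(action, pdu_hostname, port):
        """Request a power 'on', 'off', or 'reboot' for one PDU port."""
        url = (f"{PDUDAEMON}/power/control/{action}"
               f"?hostname={pdu_hostname}&port={port}")
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status

    # Example: power-cycle the board attached to port 3 of PDU "lab-pdu-1"
    # pdu_power("reboot", "lab-pdu-1", 3)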

Board IO connections

This consists of the ability to control or test features of the board that are related to buses or other connections to the device under test. This could include things like:

  • serial ports
  • USB buses
  • i2c
  • CAN

It might also involve any other aspect of board IO, such as audio, video, GPIO, LEDs, etc.

In general, the connection might be between the board farm controller node and the DUT (interface G) or between some special hardware and the DUT (interface H, controlled by interface J).

See Board IO connection Testing
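As a small example of exercising one of these connections, the sketch below uses the pyserial package to check that a DUT's serial console answers with a shell prompt. The device path, baud rate, and prompt string are assumptions for illustration; a real lab would take them from the board's configuration.

    # Illustration of board IO testing over a serial console, using pyserial.
    # Device path, baud rate, and expected prompt are illustrative assumptions.
    import serial  # pip install pyserial

    def console_responds(device="/dev/ttyUSB0", baud=115200, prompt=b"# "):
        """Send a newline to the DUT console and look for a shell prompt."""
        with serial.Serial(device, baud, timeout=5) as console:
            console.write(b"\n")
            output = console.read(256)
            return prompt in output

    if __name__ == "__main__":
        print("console ok" if console_responds() else "console not responding")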

Test Definition

The test definition is the set of attributes, code, and data that are used to perform a test. A test definition standard would specify things like the following:

  • fields - the data elements of a test
  • file format (json, xml, etc.) - how a test is expressed and transported
  • meta-data - data describing the test
  • visualization control - information used for visualization of results
  • instructions - executable code to perform the test

See Test Definition Project for more information about a project to harmonize test definitions across multiple test systems.
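To illustrate the elements above, here is a hypothetical test definition expressed as JSON (built as a Python dict). The field names are invented for illustration and are not taken from any particular framework; the Test Definition Project page discusses the real candidates.

    # Hypothetical test definition covering the elements listed above.
    # Field names are invented for illustration only.
    import json

    test_definition = {
        "name": "hello-world-test",
        "description": "Build and run a trivial program on the target",
        "metadata": {                       # data describing the test
            "author": "example@example.org",
            "license": "MIT",
            "tags": ["smoke", "example"],
        },
        "dependencies": {"kernel_config": ["CONFIG_PRINTK"]},
        "visualization": {"chart_type": "pass/fail"},   # visualization control
        "instructions": {                   # executable code to perform the test
            "build": "make hello",
            "run": "./hello",
        },
    }

    print(json.dumps(test_definition, indent=2))        # file format: json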

Test Definition server

The test definition itself is separate from the API or interface to a test definition server. A test definition server is a place where test definitions are stored.

As of 2019, there is no central repository that holds open source test definitions for multiple test frameworks.

But there is some movement towards storing auto-generated tests (produced by fuzzers) for the Linux kernel in a centralized location. See this repository: https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-arts.git/
Discussion about the creation of this repository is at: https://lwn.net/Articles/799162/


Test dependencies

  • how to specify test dependencies
    • ex: assert_define ENV_VAR_NAME
    • ex: kernel_config
  • types of dependencies

See Test_Dependencies
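Below is a framework-neutral sketch of how a test might check dependencies like those listed above before running. The helper names are invented for illustration, and KCONFIG_PATH is the kernel-config location variable mentioned in the Miscellaneous section at the end of this page.

    # Framework-neutral sketch of dependency checks a test might perform before
    # running.  Helper names are invented; KCONFIG_PATH is the variable this
    # page mentions for locating the kernel configuration.
    import os
    import sys

    def assert_define(var_name):
        """Skip the test if a required environment variable is not set."""
        if not os.environ.get(var_name):
            sys.exit(f"SKIP: required variable {var_name} is not defined")

    def require_kernel_config(option):
        """Skip the test if the kernel config does not enable 'option'."""
        config_path = os.environ.get("KCONFIG_PATH", "/boot/config")
        with open(config_path) as f:
            config = f.read()
        if f"{option}=y" not in config and f"{option}=m" not in config:
            sys.exit(f"SKIP: kernel config option {option} is not enabled")

    assert_define("BOARD_IP")                 # ex: assert_define ENV_VAR_NAME
    require_kernel_config("CONFIG_MODULES")   # ex: kernel_config dependency
    print("dependencies satisfied, running test...")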

Test Execution API (E)

See Test Execution Notes for more details and miscellaneous notes.

  • test API
  • host/target abstraction
    • kernel installation / provisioning (see Board Provisioning Notes)
    • file operations
    • console access
  • test retrieval, build, deployment
    • ex: 'make test'
    • runtest.sh?
    • communicating lab and board-specific parameters to a test (see LTP device discovery protocol)
  • test phases
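As a rough sketch of how the items above fit together, here is a hypothetical skeleton of a test execution layer driving a host/target abstraction through the usual test phases. All names (the Target methods, run-test.sh, the phase breakdown) are illustrative assumptions, not an existing framework's API.

    # Hypothetical skeleton of a test execution layer (interface E), showing
    # test phases over a host/target abstraction.  All names are invented.

    class Target:
        """Minimal host/target abstraction used by the execution layer."""
        def run(self, cmd):          # execute a command on the DUT
            raise NotImplementedError
        def put(self, src, dest):    # copy a file from host to DUT
            raise NotImplementedError
        def get(self, src, dest):    # copy a file from DUT to host
            raise NotImplementedError

    def execute_test(target, test):
        """Run one test through its phases and return parsed results."""
        test.fetch()                                   # retrieve test source
        test.build()                                   # (cross-)build on the host
        target.put(test.package_path, "/tmp/test.tar.gz")          # deploy
        rc, output = target.run(
            "cd /tmp && tar xzf test.tar.gz && ./run-test.sh")     # run
        target.get("/tmp/results.log", "results.log")  # collect run artifacts
        return test.parse_results("results.log", exit_code=rc)     # parse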

Build Artifacts

  • kernel build bundle
    • See Kernel build bundle
  • image build bundle
  • test package format
    • meta-data for each test
    • test results
    • baseline expected results for particular tests on particular platforms

Test package format

This is a package intended to be installed on a target (as opposed to the collection of test definition information that may be stored elsewhere in the test system).
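For illustration, a manifest for such an on-target package might look like the following; the layout and field names are invented, not a proposed standard.

    # Hypothetical manifest for an on-target test package, showing the
    # meta-data, results, and baseline elements listed under Build Artifacts.
    import json

    package_manifest = {
        "package": "hello-world-test-1.0.tar.gz",
        "contents": ["run-test.sh", "hello", "data/input.txt"],
        "metadata": {
            "test_name": "hello-world-test",
            "target_arch": "arm64",
        },
        "baseline": {                  # expected results for this platform
            "platform": "beaglebone-black",
            "expected_pass": ["boot", "run_hello"],
            "expected_fail": [],
        },
    }

    print(json.dumps(package_manifest, indent=2))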

Test server

This is a place where tests or test packages can be stored, and downloaded for use in a CI framework.

See Test Server Notes

Run Artifacts

  • logs
  • data files (audio, video)
  • monitor results (power log, trace log)
  • snapshots


Results Format

See Test Results Format Notes for details and miscellaneous notes

The results format is the output from the test, and is part of the interface between the test program and the test execution layer (or test harness).

The main thing that the format communicates is the list of testcases (or metrics, in the case of benchmarks) and the result of each testcase (pass, fail, etc.).
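As a simple illustration of that information, here is a hypothetical minimal results document; real formats (such as JUnit, Fuego's run.json, or the KernelCI schemas below) differ in syntax but carry similar content.

    # Hypothetical minimal results document: the list of testcases and their
    # results, plus metrics for a benchmark.  Not any framework's actual schema.
    import json

    run_results = {
        "test_name": "hello-world-test",
        "board": "beaglebone-black",
        "testcases": [
            {"name": "boot",      "result": "pass"},
            {"name": "run_hello", "result": "fail", "detail": "exit code 1"},
            {"name": "optional",  "result": "skip"},
        ],
        "metrics": [
            {"name": "boot_time", "value": 4.2, "units": "seconds"},
        ],
    }

    print(json.dumps(run_results, indent=2))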

Server-based results storage

yocto project

All test results for Yocto Project (YP) builds are added to a git repository:

http://git.yoctoproject.org/cgit.cgi/yocto-testresults/

They are stored in a JSON format. The YP project doesn't yet have good tools to analyze the data, but is at least storing it.

Fuego

Fuego uses the fserver project (https://github.com/tbird20d/fserver) to store run results in a common location.

Data is stored using Fuego's run.json format (http://fuegotest.org/wiki/run.json).

Fuego can also save results to a KernelCI backend and a SQUAD backend.

KernelCI

Site: https://kernelci.org/

A test holds the results from a test run: https://api.kernelci.org/schema-test.html. A test group is a collection of test cases: see https://api.kernelci.org/schema-test-group.html and https://api.kernelci.org/schema-test-case.html.

SQUAD

The results backend for LKFT is: https://qa-reports.linaro.org/lkft/

Source code for SQUAD: https://github.com/linaro/squad

Documentation: https://squad.readthedocs.io/en/latest/

BigQuery common results server project

The client for this is at: https://github.com/spbnick/kcidb

The server for this is at: ??

Standards

The Linux kernel kselftest uses TAP as the preferred output format.
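For reference, TAP output in the style kselftest produces looks roughly like the following (a hand-written sample, not captured from a real run):

    TAP version 13
    1..3
    ok 1 selftests: size: get_size
    not ok 2 selftests: timers: posix_timers # exit=1
    ok 3 selftests: zram: zram.sh # SKIP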

Pass Criteria

The pass criteria are a set of data that tell the test framework how to interpret the results from a test. They can indicate the following:

  • what tests can be skipped (this is more part of test execution and control)
  • what test results can be ignored (xfail)
  • min required pass counts, max allowed failures
  • thresholds for measurement results
    • requires testcase id, number and operator

The pass criteria allow things like expected failures to be separated from the test code itself, to handle situations where different sets of results are interpreted as success or failure depending on factors outside the test (for example, kernel version, kernel configuration, available hardware, etc.)

For things like functional unit tests, a single failure should cause the overall failure of a test suite. However, for system tests or benchmarks, it is often the case that some results must be interpreted in a context-sensitive manner, or some set of testcases is ignored for expediency's sake.
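For illustration, pass-criteria data kept separate from the test might look like the following, together with a tiny evaluator for measurement thresholds. The field names are invented and do not correspond to a particular framework.

    # Hypothetical pass-criteria data (separate from the test code), plus a
    # small evaluator for measurement thresholds.  Field names are invented.
    import operator

    pass_criteria = {
        "skip": ["needs_rtc"],                    # tests not run in this lab
        "xfail": ["known_issue_on_this_board"],   # failures that can be ignored
        "max_allowed_failures": 0,
        "thresholds": [
            # each threshold needs a testcase id, a number, and an operator
            {"testcase": "boot_time", "op": "<", "value": 10.0},
        ],
    }

    OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}

    def threshold_ok(measurement, criteria):
        """Return True if a measured value satisfies its threshold (if any)."""
        for t in criteria["thresholds"]:
            if t["testcase"] == measurement["name"]:
                return OPS[t["op"]](measurement["value"], t["value"])
        return True    # no threshold defined for this measurement

    print(threshold_ok({"name": "boot_time", "value": 4.2}, pass_criteria))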

Miscellaneous (uncategorized)

  • environment variables used to create an SDK build environment for a board
  • environment variables used for controlling execution of a test
  • location of kernel configuration (used for dependency testing) KCONFIG_PATH (adopted by LTP)
  • default name of test program in a target package (run-test.sh?)
    • this should be part of the test definition