Test Lab Architecture

Introduction
This document describes the architecture of the CE Linux Forum Open Test Lab. The architecture of the lab is intended to support multiple activities envisioned for the lab. See Open Test Lab for an introduction and description of the purpose of the lab.

Access use cases
There are four broad types of activities that the test lab is intended to support:
 * interactive software development and testing (single-user testing on a remote board)
 * custom automated testing against multiple boards (a single user running an ad-hoc test remotely)
 * automated regression testing against lab boards (lab-driven tests against multiple boards)
 * private testing (accessing tests from the lab to run against a local (possibly private) board)

On the Open Test Lab page, these are described as:
 * Interactive board usage via remote access
 * Automated multi-platform testing
 * Nightly regression tests
 * Private test usage

Definitions

 * server : : A single machine in the lab is used as the main server for lab operations. This machine processes board reservations, coordinates nightly regression tests, is a repository for lab test software, and provides information and results for the lab to the public.
 * host : : The machine used to control or access one or more target boards.


 * node : : The collection of machines consisting of a single host and one or more targets.


 * client : : A machine that accesses the lab for testing services.


 * target : : A development board which can be accessed by the lab software.


 * target software : : The collection of software that is run on the target. The target software is compiled on the host machine, and transferred to the target machine for execution. This includes the kernel, and user-space libraries, daemons, utilities and other programs.
 * toolchain : : The collection of programs used to create software for a target machine. This consists of the compiler (including preprocessor), assembler, linker and other programs used for manipulating source and machine code into a form usable on the target.  This usually consists of software from the Linux packages for gcc, binutils, and glibc.  The toolchains are run on the host machine of a node.

A host and target combination (a node) can be located anywhere in the world: physically in the main lab, in a satellite lab, or at some point on the Internet. Operationally, it is useful to distinguish between nodes that are made available for public use (and are managed by CELF or an affiliated entity), and nodes that are used privately. The following terminology is used for these different nodes:


 * lab node : : A host/target combination managed by CELF.


 * private node : : A host/target combination NOT managed by CELF.
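
The definitions above can be read as a small data model. The following is only an illustrative sketch: the class name `Node`, its field names, and the sample host and target names are assumptions made for this example, and the "host:target" notation is borrowed from the job schema later in this document.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """A single host together with the targets it controls (hypothetical model)."""
    host: str
    targets: List[str]
    managed_by_celf: bool

    @property
    def kind(self) -> str:
        # A node managed by CELF is a lab node; any other node is a private node.
        return "lab node" if self.managed_by_celf else "private node"

    def names(self) -> List[str]:
        # One "host:target" name per target attached to this host.
        return [f"{self.host}:{target}" for target in self.targets]


lab = Node("hostA", ["target1", "target2"], managed_by_celf=True)
private = Node("clientB", ["private1"], managed_by_celf=False)
print(lab.kind, lab.names())
print(private.kind, private.names())
```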

Architecture Diagram
Here is a diagram of the architecture of the CELF Open Test Lab.

Server
More details about the server for the lab are at Test Lab Server Spec.

One important attribute of the server that is reflected strongly in the architecture is that the server plays a very passive role in the lab. As can be seen below, the host machines have the major role in initiating test activities as part of automated testing.

Interactive use of a board remotely
A developer wants to use an individual board for interactive testing. In this mode, the developer follows certain high-level steps to work with the board:

Steps:
 * the developer identifies a board using information on the lab web site
   * each board should have a page on the lab wiki
 * the developer reserves a time slot to use the board
   * reservation information is maintained for each board on its associated host
   * the server presents a web form to the user for reserving time on a board
   * the client accesses the server using http
   * the server runs a command on the host to make or change a reservation
 * the developer logs in to the host for that board (using ssh)
   * this requires an account on the host that the client can use
   * lab hosts have static IP addresses that can be accessed from outside the lab
   * the server has host-target mappings, and host IP addresses
 * the developer compiles the software for the board
   * the user can compile on the lab host, or send binaries to the host from their own client
   * the user can supply their own configuration and patches for software on the host
   * the lab host has toolchains, kernel source and a Linux distribution for each target it controls
 * the developer installs the software to the target (or uses it directly from the host, e.g. via an nfs-mounted root filesystem)
 * the developer reboots the target, from the host
 * the developer uses a serial console or network login from the host to work interactively on the target
 * the developer (on the target) runs tests, examines results, etc.
 * the developer can collect results onto the host, and transfer them back to the client, if desired

In the lab diagram, an example of this would be client A accessing Target 1, through Lab host A. The reservation would be sent to the server. The host would grant the reservation after checking with the server.

Automated multi-platform testing
In this scenario, a developer uses the lab to perform the same test on multiple boards. This could be a small subset of boards, or conceivably all the boards registered with the lab.


 * the developer identifies a set of boards using information on the lab web site
 * the developer uploads the test software and test parameters to the server
 * the host checks with the server periodically, and initiates a test that is requested for one of its boards
   * the host uses http to access the server
 * the host downloads the software for the test from the server
   * test programs for the target must be packaged somehow (Note: specifications for host and client test package formats are needed)
 * the host compiles the software for the board (if appropriate)
   * the host has toolchains, kernel source and a Linux distribution for each target it controls
 * the host installs the software to the target
 * the host reboots the target
 * the host runs the tests on the target
 * the host collects the results
 * the host uploads the results to the server
 * the server collates the results and publishes them on the web site

In the lab diagram, an example of this would be client A requesting a test to be run on Target 1 through Lab host A and on Target 3 through Lab host B.

Automated Periodic Regression Test
In this case, a lab administrator (CELF) sets up a test to be run on a periodic basis (e.g. nightly), on a certain set of boards.


 * the administrator identifies a set of boards using information on the lab web site
 * the administrator reserves a periodically-repeating time slot for those boards
 * the administrator uploads the test software and test parameters to the server
 * the host initiates the test, at the start of the reserved time slot
 * the host checks for updates to the Linux kernel or other software involved in the test
   * when a new version is detected, the host downloads the code automatically
 * the host compiles the software for the board (if appropriate)
   * the host has toolchains, kernel source and a Linux distribution for each target it controls
 * the host installs the software to the target
 * the host reboots the target
 * the host runs the tests on the target
 * the host collects the results from the target
 * the host uploads the results to the server
 * the server collates the results and publishes them on the web site
 * the server may do additional processing of the results, to provide enhanced feedback to the community
   * for example, the server may do an automated binary search for the patch that caused a particular regression, and report that to the community
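
The automated binary search mentioned in the last step could look roughly like the sketch below. It is not a description of actual lab software. It assumes that the patches are applied in a known order, that the test passes with no patches applied and fails with all of them applied, and that the regression is caused by a single patch. The callback `passes_with` is hypothetical: it stands for "build with the first n patches, run the test on the target, and report whether it passed".

```python
from typing import Callable, Sequence


def find_regressing_patch(patches: Sequence[str],
                          passes_with: Callable[[int], bool]) -> str:
    """Return the first patch whose application makes the test fail.

    Assumes passes_with(0) is True and passes_with(len(patches)) is False.
    """
    if not patches:
        raise ValueError("need at least one patch to search")
    good, bad = 0, len(patches)   # invariant: 'good' patches pass, 'bad' patches fail
    while bad - good > 1:
        mid = (good + bad) // 2
        if passes_with(mid):
            good = mid            # regression is in a later patch
        else:
            bad = mid             # regression is in patch 'mid' or earlier
    return patches[bad - 1]       # the patch that moved the test from pass to fail


# Simulated series of six patches where the fourth one breaks the test.
patches = ["p1", "p2", "p3", "p4", "p5", "p6"]
culprit = find_regressing_patch(patches, lambda n: n < 4)
print(culprit)
```

Each probe costs a full build, install, reboot and test cycle on a target, so the search needs roughly log2(N) cycles for N patches instead of N.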

Private testing
This mode of usage allows an individual to use all the tests on the server, in an interactive or automated fashion, on their own machines. An example of this would be Client B downloading tests from the lab server, and running them locally (as a host) to run a test on private target 1.
 * PRE-REQUISITE - the user verifies that their host/target combination interoperates with the lab system
   * the user downloads an interoperability test and runs it
 * the developer selects a test on the test server
   * the available tests are listed on the lab server (wiki)
 * the developer downloads the test
 * the developer runs the test on their local host/target combination

Required Interfaces and Capabilities
There are multiple elements in the lab system, with interfaces between each one.

This section describes the interfaces between different elements, in terms of protocols, formats and conventions. The model of interaction (push vs. pull, synchronous vs. asynchronous, interactive vs. automated) is also described.

Note: This section is still under construction.

Interface between client and server
A client accesses the server via http. The server runs a web server, with a wiki to provide data about each board in the system.

Jobs for each host to run are located on the server. A reservation for interactive use of a board is a special type of job. Many jobs will not have a specific time slot.

When a job is submitted, the server creates a page for each lab node for which the job is requested. The server rejects job requests or reservations which conflict with those already scheduled for a node. (This can only happen with job requests with a specific time or duration.)
 * protocols:
   * http
 * data:
   * web page of information about each target board
   * web page per job
     * the schema is not yet fully defined, but it includes:
       * lab node for job (host:target)
       * requesting account
       * requested start time (specific time or ANY)
       * requested duration (specific time or NONE)
       * duration estimate (average of previous durations of this test on this target)
       * priority (1=high, 5=low)
       * status (one of: pending, in-progress, complete)
       * current client (in the case of an active reservation)
       * result summary (one-line description of result)
       * result detail (multi-line description of result)
       * test script
   * web page showing list of jobs and reservations per target
   * user account
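
As a rough illustration of the job record and of the conflict check described above, here is a minimal sketch. The field names, the default priority, and the sample accounts and script names are assumptions made for this example, since the schema is not yet fully defined. The sketch also assumes that a request can only conflict when it has both a specific start time and a specific duration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Iterable, List, Optional


@dataclass
class Job:
    node: str                              # lab node for the job, "host:target"
    account: str                           # requesting account
    test_script: str
    start: Optional[datetime] = None       # None stands for ANY
    duration: Optional[timedelta] = None   # None stands for NONE
    priority: int = 3                      # 1 = high, 5 = low (default is assumed)
    status: str = "pending"                # pending, in-progress or complete
    current_client: Optional[str] = None   # set during an active reservation
    result_summary: str = ""
    result_detail: str = ""


def estimate_duration(previous: List[timedelta]) -> Optional[timedelta]:
    """Average of earlier durations of this test on this target, if any."""
    if not previous:
        return None
    return timedelta(seconds=mean(d.total_seconds() for d in previous))


def conflicts(new: Job, scheduled: Iterable[Job]) -> bool:
    """True if 'new' overlaps an already scheduled timed job on the same node."""
    if new.start is None or new.duration is None:
        return False                       # untimed requests never conflict
    new_end = new.start + new.duration
    for job in scheduled:
        if job.node != new.node or job.start is None or job.duration is None:
            continue
        if new.start < job.start + job.duration and job.start < new_end:
            return True
    return False


nightly = Job("hostA:target1", "celf-admin", "boot_test.sh",
              start=datetime(2006, 1, 10, 2, 0), duration=timedelta(hours=1))
request = Job("hostA:target1", "alice", "interactive",
              start=datetime(2006, 1, 10, 2, 30), duration=timedelta(hours=1))
print(conflicts(request, [nightly]))
print(estimate_duration([timedelta(minutes=10), timedelta(minutes=30)]))
```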

Interface between host and server
A server does not directly initiate work on a host. Rather, the host initiates contact with the server, to obtain reservation requests. At regular intervals, the host downloads the list of its jobs and determines which one it will "execute". At the end of a job, the host uploads the results and completion information to the server.

The host also uploads its target information and status periodically to the server.


 * protocols:
   * http
 * data:
   * web page of information about each target board (schema not defined yet)
   * web page per job
   * web page with list of jobs per target
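
This document does not say how a host decides which downloaded job to "execute". One possible policy is sketched below, purely as an assumption: among the pending jobs whose requested start time is ANY or has already arrived, take the one with the highest priority (lowest number). The jobs are shown as plain dictionaries, as if already parsed from the job pages. The keys and sample values are hypothetical.

```python
from datetime import datetime
from typing import Dict, List, Optional


def pick_next_job(jobs: List[Dict], now: datetime) -> Optional[Dict]:
    """Choose the pending job to run next, or None if nothing is runnable."""
    runnable = [
        job for job in jobs
        if job["status"] == "pending"
        and (job["start"] is None or job["start"] <= now)   # None stands for ANY
    ]
    # Priority 1 is high and 5 is low, so the smallest number wins.
    return min(runnable, key=lambda job: job["priority"], default=None)


jobs = [
    {"name": "adhoc", "status": "pending", "start": None, "priority": 3},
    {"name": "done", "status": "complete", "start": None, "priority": 1},
    {"name": "nightly", "status": "pending",
     "start": datetime(2006, 1, 10, 2, 0), "priority": 1},
]
early = pick_next_job(jobs, datetime(2006, 1, 10, 1, 0))
late = pick_next_job(jobs, datetime(2006, 1, 10, 3, 0))
print(early["name"], late["name"])
```

Because the host pulls this list itself, the server never needs to open a connection to a host, which matches the passive server role described earlier.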

During a reservation, the host does nothing automated. That is, it doesn't run a script from the server, but it might record secure-shell access during the reservation time slot, and report the activity log as the result of the "job". This could be used to detect wasted time slots (if no one logs in).
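
Detecting a wasted time slot from such an activity log could be as simple as the following sketch. It assumes the host has already extracted the login timestamps from its ssh records. The function name and the sample times are made up for illustration.

```python
from datetime import datetime
from typing import Iterable


def slot_was_wasted(start: datetime, end: datetime,
                    logins: Iterable[datetime]) -> bool:
    """True if no ssh login was recorded inside the slot [start, end)."""
    return not any(start <= login < end for login in logins)


slot_start = datetime(2006, 1, 10, 14, 0)
slot_end = datetime(2006, 1, 10, 16, 0)
used = slot_was_wasted(slot_start, slot_end, [datetime(2006, 1, 10, 14, 5)])
unused = slot_was_wasted(slot_start, slot_end, [datetime(2006, 1, 10, 17, 0)])
print(used, unused)
```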

Interface between host and target
The interface between the host and the target is specified in Remote Board Access Spec.

Note that there is no specified interface between a client and a target, or between a server and a target. Targets are only accessible via a host machine.

The host machine is the initiator of all activity in the system. The target machine is a slave to the host.

Interface between host and Internet
A host may be required to download software directly from the Internet as part of building the software for a target. A host needs http access to the Internet.


 * protocols:
   * http

Interface between client and host
The only direct interface between a client and a host is interactive command line access, via ssh.


 * protocols:
   * ssh (from client to host)
 * conventions:
   * the default command shell for an incoming client is bash
   * programs and utilities that are available on the host are specified in Remote Board Access Spec

At the time of an ssh login, the host machine may validate the client's reservation by contacting the server. (Or, it may "know" that this client is allowed to access the machine, by having recently checked with the server.) The host may update the status of the job with the current user of the target.
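
The two options above (ask the server at login time, or rely on a recent check) can be combined in a small cache, sketched here under stated assumptions. The `query_server` callback stands in for whatever http request the host would actually make. The class name, the five-minute maximum age, and the sample account and target names are hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class ReservationCache:
    """Remember recent answers from the server about who may use a target."""

    def __init__(self, query_server: Callable[[str, str], bool],
                 max_age_s: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._query = query_server
        self._max_age = max_age_s
        self._clock = clock
        self._seen: Dict[Tuple[str, str], Tuple[float, bool]] = {}

    def is_allowed(self, account: str, target: str) -> bool:
        key = (account, target)
        now = self._clock()
        cached = self._seen.get(key)
        if cached is not None and now - cached[0] <= self._max_age:
            return cached[1]                    # recent answer, no server contact
        allowed = self._query(account, target)  # contact the server
        self._seen[key] = (now, allowed)
        return allowed


calls = []
fake_now = [0.0]


def fake_server(account: str, target: str) -> bool:
    calls.append((account, target))
    return account == "alice"


cache = ReservationCache(fake_server, clock=lambda: fake_now[0])
first = cache.is_allowed("alice", "target1")    # asks the server
second = cache.is_allowed("alice", "target1")   # answered from the cache
fake_now[0] = 301.0
third = cache.is_allowed("alice", "target1")    # entry expired, asks again
print(first, second, third, len(calls))
```

A short maximum age keeps the host from admitting a client long after a reservation has been changed on the server, at the cost of more requests.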

Desired operations on the host machine
This section describes, in simple sentences, the requirements on the host machine for this architecture. After each statement, in parentheses, is a possible command that could be used to implement the required operation.
 * A user MUST be able to log in to the host machine remotely (ssh)
 * A user SHOULD be able to verify that the target machine is available for use ("target status")
 * A user MUST be able to reserve a target for use, for a period of time ("target acquire [timeout]")
 * A user SHOULD be able to unreserve a target, or relinquish a held reservation ("target release")
 * A user or program MUST be able to build the Linux kernel for the target, on the host
   * obtain kernel source for target ("target get_kernel")
   * set up build environment for kernel ("target setenv")
   * get default kernel configuration ("target get_config")
   * set specific configuration parameters ("target set_config [options]")
 * A user or program SHOULD be able to build a root filesystem for target, from source
 * A user or program MUST be able to build an arbitrary C program for target
   * ability to put source for arbitrary program on the host???
   * set up build environment for programs (target setup_build_environment???)
 * A user or program MUST be able to install kernel ("target kinstall")
 * A user or program MUST be able to install a program ("target cp")
 * A user or program SHOULD be able to install a new rootfs ("target fsinstall")
 * A user or program MUST be able to transfer files from the target to the host ("target cp")
 * A user or program MUST be able to restart a target machine, by at least one of the following methods:
   * resetting the target hardware ("target reset")
   * power cycling the target machine ("target reboot")
 * A user MUST be able to interactively access the Linux console of the target ("target console")
 * A user SHOULD be able to interactively access a target command shell ("target login")
 * A user or program MUST be able to execute an arbitrary command on the target ("target run")


 * A host machine MUST periodically poll the server for its list of tasks (wget)
 * A host machine MUST be able to download materials from the lab server (wget)
 * A LAB host machine MUST respond to a request for reservation information from the server ???
 * The server MUST be able to query a LAB host machine for its reservation status ???