LRI
Équipe VALS
Bât 650 Ada Lovelace, Université Paris Sud
Rue Noetzlin, 91190 Gif-sur-Yvette, France
Test Club Meeting on 5/6/2018

A History of Test Theories

Personnes présentes

  • Marie-Claude Gaudel
  • Delphine Longuet
  • Frederic Voisin
  • Safouan Taha
  • Burkhart Wolff
  • Lina Yue
  • Fatiha Zaidi
  • Du Zeye
  • Hai Nguyen Van

Presentators (by order of appearance)

  • Marie-Claude Gaudel

Background:

Anger in a review of a test-paper making corse and crude statements about the theory of testing incited a debate between mcg and bu, who convinced her to write a statement about her experiences about the 40 years of history in test theories. And about the question "what is the folklore of testing as a research field"

A Definition

For Marie-Claude (in this context), a theory is something that allows to explain and anticipate phenomena in a rigourous manner. It is not necessarily a theory in a mathematical and/or formal sense. It should be refutable (in Poppers sense).

Discussion (mix of Bu and Fred 's rough notes)

Slides of the talk

Basics:

- Testing Software is different from testing hardware

- One tests a system, not a program text, a specification or a model. A system is a dynamic entity embedded in a physical world. It is observable only through some limited interfaces and is not always controllable.

- To test, we need

  • some specified requirements that the SUT must satisfy
  • a Systeme Under Test (SUT) that is executable
  • some knowledge about the SUT and its environment.
  • an oracle to decide if a test is successful or not.

A Test Theory and the notion of Testability

— Notion of selection criteria formalised for the first time in the following paper:

 John B. Goodenough, Susan L. Gerhart: "Toward a theory of test data selection" (IEEE TSE-1(2):156-173). Goodenough/Gerhart1975

This paper establishes also the notions of:

  • Reliable Criterion: if two test sets satisfy the criterion, either they are both successful or they both fail.
  • Valid criterion: if some input is not correct there exists at least one test set that satisfy the criterion and that will reveal the error.

and a theorem establishing the link between the success of a test set satisfying a valid and reliable criterion and the system correctness (but without stating how to find such a criterion and data test).

It also introduces the notion of exhaustive Testing, a comparaison between Test and Proof, and the idea of partionning input domains in equivalence classes.

- The second historical paper if the one by T.S. Chow in: "Testing Software Design Modeled by Finite State Machines" (IEEE T.S.E., May 1978) T. Chow TSE 1978. It considers the test of systems that can be modeled by a deterministic FSM ans the notion of exhaustivity of a test set under the assumption that the SUT behaves like a FSM with the same number of states.

Additional historical and interesting papers in the domain are:

  • J. S. Gourlay: "a Mathematical Framework for the Investigation of Testing" (IEEE T.S.E., Nov. 1983) Gourlay: TSE paper and "Introduction to the Formal Treatment of Testing" (Ohio State Univ., 1983 ?) Gourlay: Research Report
  • J. C. Cherniavsky and C. H. Smith: "a Recursion Theoretic Approach to Program Testing" (IEEE T.S.E., July 1987) Cherniavsky and Smith.

About Test Hypothesis

In general we are looking for a conformance relation between a specification SP and a SUT, despite the first one is some sort of model or formula and the second is a system, hence both are objects of different kinds. To fill this gap, authors introduce (even implicitly) Testability Hypothesis on the SUT considering the SUT to behave like a model of the specification. A particular case is the one of SUT that are non deterministic, with the introduction of Complete Testing Assumption. In practice one relies on "on the fly" selection of test inputs or on the instrumentation of the SUT or its environment (scheduler).

Even when one can define an "exhaustive test set", the latter is generally infinite and additional hypothesis (like uniformity or regularity) must be added to select an adequate finite subset.

- For uniformity and regularity hypothesis for specification based testing, see for instance:

  • Luc Bougé: "Modélisation de la notion de test de programmes; application à la production de jeux de tests" (Thèse, Université Pierre et Marie Curie, 1982, or "A contribution to the theory of program testing" (TCS, Vol 37, 1985) Bouge-TCS

- Testing hypothesis, coverage criteria and fault models are basically the same as stated in

  • R. Hierons: "Comparing Test Sets and Criteria in the Presence of Test Hypothesis and Fault Domains", (ACM TSE Methodology , 2002). R. Hierons ACM-2002

Future of Test Theory:

Things that are still to be investigated or polished:

  • Probabilistic testing and adequate measures; Observability and Concurrency (but see additional references below).
  • Testing and machine learning: see how machine learning techniques can help weakening test hypothesis
  • Test and Invariant Learning: how testing techniques can help discovering or disproving invariants on the fly.

Additional References:

  • R. De Nicola and M.C.B. Hennesy: Testing Equivalence for Processes (TCS 34, 1984) DeNicola and Hennessy
  • A prominent Survey reflecting test theories in the unit test domain: Software Unit Test Coverage and Adequacy (1997) H. ZHU, Nanjing University, P. A. V. HALL and J. R. MAY, The Open University, Milton Keynes, UK (ACM Computing Surveys, Volume 29 Issue 4, Dec. 1997).
  • I. Castellani and M. Hennessy: Testing Theories for Asynchronous Languages (FSTTCS98). Castellani and Hennessy
  • Yuxin Deng and Rob van Glabbeek and Matthew Hennessy and Carroll Morgan and Chenyi Zhang. Remarks on Testing Probabilistic Processes. Y. Deng and alli
  • A survey about Formal Methods and Testing: R. Hierons and alli : Using formal methods to support testing (ACM Computing Surveys, 41-2, 2009) R. Hierons and alli.