## Combinatoric Input Set Generation I’ve been studying combinatoric methods of generating test cases for quite some time now. Most publicly available fuzz testing packages implement fairly crude techniques for passing input values to applications–although recent research is becoming more creative in attacking the issue because of insufficient path coverage metrics by orthodox methods. Generating input sets combinatorially is a much more holistic approach to the black-box software testing paradigm.

In this article, I’ll be providing a brief overview of how set operations taken from the field of discrete mathematics can be applied to fuzz testing. Explicit definitions of set operations can be found elsewhere and links to Wikipedia are provided where appropriate. Brief explanations on set theory will be given here but I will mainly be focusing on how it relates to software testing.

Let’s start off with permutations; just about any computer science bachelor is familiar with these. They’re the possible orderings of a sequence. A “sequence” may consist of members that are non-unique (which is in contrast to a “set” whose members are unordered and unique.) For example, the permutations of sequence {1,2,3} are:

{{1,2,3},{1,3,2}, {2,1,3},{2,3,1},{3,1,2},{3,2,1}}

Easy enough. Permutations can be used when testing protocols where the order of commands is significant. For example, if testing the FTP protocol with a sequence of {“STAT .”,”CWD ..”,”MKD foo”} depending on the order in which the commands are executed, the client could be retrieving the status for and/or making the “foo” directory in the parent/child directories.

Sets of subsets (also known as combinations) are also quite useful and are expressed using what’s called “n-choose-r” notation because r elements are being chosen from a set that has n elements in total (the set is said to have cardinality n.) Combinations can be utilized to enumerate command-line argument possibilities since the order of argv values is usually irrelevant. Say an executable has the command line flags “-n”, “-i”, “-q”, and “-v”. We want to generate all subsets of cardinality 2 (4-choose-2.) The answer is:

{{“-n”,”-i”},{“-n”,”-q”},{“-n”,”-v”},{“-i”,”-q”},

{“-i”,”-v”},{“-q”,”-v”}}

r-permutations are similar to subsets in that a given amount of objects are chosen from the list but since permutations are represented as sequences the order of the elements matters. Using the same baseline sequence as the previously described n-permutations, the 2-permutations of {1,2,3} are as follows:

{{1,2},{2,1},{1,3},{3,1},{2,3},{3,2}}

Again, observe how these permutations would affect the logic of network protocol commands processed by a server daemon..

Moving on, a power set is the set of all subsets including the empty set. Using the example above, the power set of {“-n”,”-i”,”-q”,”-v”} looks like this:

{{},{“-n”},{“-i”},{“-q”},{“-v”},{“-n”,”-i”},{“-n”,”-q”},

{“-n”,”-v”},{“-i”,”-q”}{“-i”,”-v”},{“-q”,”-v”},

{“-n”,”-i”,”-q”},{“-n”,”-q”,”-v”},{“-n”,”-i”,”-v”},

{“-i”,”-q”,”-v”},{“-n”,”-i”,”-q”,”-v”}}

The accuracy of the power set calculation can be checked because power sets have a cardinality of 2**n. In this case, 2**4=16. Other set operations can be checked with similar formulas.

Taking it another step further, consider giving that executable whose command line flags are being generated environment variable input as well. Suppose that the power set for the command line options is:

{{},{“-n”},{“-i”},{“-n”,”-i”}}

and the power set for the environment variable values is:

The environment variable names and values were paired up using a set operation known as the Cartesian product. i.e. the Cartesian product of the sets {{“TERM”}} and {“vt100″,”%n%n%n”} is:

{{“TERM”,”vt100″},{“TERM”,”%n%n%n”}}

Taking further advantage of Cartesian products, all possible pairings of command line flags and environment values are generated. I won’t be typing the whole thing out here as it is excruciatingly long but it would start out something like this:

{{{},{}},{{},”TERM=vt100″},{{},”TERM=%n%n%n”}},