Specifying the appropriate software load testing workload.

Specification of the appropriate workload is critical to the overall value and conclusions of any load testing project.

The workload specification should be tailored to meet the project's objectives, no more and no less.

Defining the workload for software performance testing.


The importance of defining the appropriate workload to satisfy the performance project objectives has been emphasized in all of the basic performance project type descriptions. Specification of the workload is key to these performance projects because:-
  • There are many variables that define the workload.
  • There are many variables that define the intensity (arrival rates, concurrency, etc.) of the workload.
  • These variables and their intensities form a unique combination. It is difficult to extrapolate outcomes for a different variable or intensity without re-running the performance test with the new values (at which point it becomes another project iteration).

It is useful to define workload within three basic categories:-

  • What function the software is being subjected to.
  • How many concurrent users, or calling APIs, are performing the function.
  • At what average rate each user calls the function.

By way of example, let's consider the business function Place Order. We could define the workload for Place Order as:-
  • Function: Place Order.
  • Concurrent Users: 700
  • Rate (per User): 1 every 60 seconds (plus or minus 15 seconds).
The rate could also be specified for the total concurrent User population (700) rather than per User. In the above example the overall rate would be 700/60 = approximately 11.7 Place Order calls per second.

Whilst the above workload would measure the resource usage and response times for Place Order at the defined rate for a concurrent population, the potential User population and its access patterns might also matter.
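
The conversion from a per-User rate to an overall rate described above is simple arithmetic. A minimal Python sketch follows; the function name is hypothetical and chosen only for illustration, and the calculation uses the average interval and ignores the plus or minus 15 seconds of spread.

```python
def aggregate_rate_per_second(concurrent_users: int, seconds_between_calls: float) -> float:
    """Total calls per second across the whole concurrent population."""
    return concurrent_users / seconds_between_calls


# Figures from the Place Order example: 700 users, 1 call every 60 seconds each.
rate = aggregate_rate_per_second(700, 60)
print(f"{rate:.2f} Place Order calls per second")  # 11.67
print(f"{rate * 3600:.0f} Place Order calls per hour")  # 42000
```

Stating the overall rate alongside the per-User rate makes it easier to check that a test run actually delivered the intended intensity.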

We also need to consider one other key workload characteristic:-
  • The User access patterns for the function being tested.
Consider the simplest case: we have 700 employees who log on at around 8am, Place Orders at a known rate and then log off at 5pm. In this case our workload requirements, for a basic stability test, are as stated above.

Now consider the situation in which we have 7,000 employees working shifts and performing other functions. At any given time, any 700 of these 7,000 could be logged on placing orders. Some of these 700 would be logged on all day and place orders at a steady rate, whilst others would log on and log off throughout the day and place one or two orders while logged on.

The question arising from the above scenarios is: "For the purposes (objectives) of the given load testing project, do we need to simulate the potential user populations and their access patterns?"

The answer to the usage access question is: "It depends on the requirements of the load testing project." For example, for an initial code stability test (looking for memory leaks or excessive CPU usage) it might not be important to recreate the user access patterns. If, however, some elaborate security (access) algorithms accompanied the Place Order function, then the actual usage patterns might need reproducing to perform an accurate performance capability analysis.

In cases such as these, when specifying the workload requirements, the question "Is the difference material to the processing environment and/or the project's objectives?" should always be examined.

A basic workload specification for a stability test.

If we add duration to the workload specification we have been discussing, for Place Order, we now have the following basic workload requirement:-
  • Function: Place Order.
  • Concurrent Users: 700
  • Rate (per User): 1 every 60 seconds (plus or minus 15 seconds).
  • Duration: 6 Hours
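
As a rough illustration, the basic specification above can be captured as a small data structure. This is a sketch only: the class and field names are hypothetical, and drawing the interval uniformly between 45 and 75 seconds is an assumption, since the specification states only "plus or minus 15 seconds".

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadSpec:
    function: str
    concurrent_users: int
    interval_s: float  # average seconds between calls, per user
    jitter_s: float    # plus or minus spread around the average
    duration_h: float  # test duration in hours

    def next_interval(self, rng: random.Random) -> float:
        """Draw one per-user wait time (uniform spread is an assumption)."""
        return rng.uniform(self.interval_s - self.jitter_s,
                           self.interval_s + self.jitter_s)

    def expected_calls(self) -> int:
        """Expected total calls over the whole test, using the average interval."""
        calls_per_user = self.duration_h * 3600 / self.interval_s
        return round(calls_per_user * self.concurrent_users)


spec = WorkloadSpec("Place Order", 700, 60, 15, 6)
rng = random.Random(42)  # fixed seed so a run can be repeated
waits = [spec.next_interval(rng) for _ in range(5)]
print(spec.expected_calls())  # 252000
```

The expected total (360 calls per user over 6 hours, times 700 users) gives a figure against which the number of orders actually placed during a run can be checked.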
If we believe that the user access patterns will make a material difference to the project's conclusions, based on the objectives, then the following workload requirements could be specified:-
  • Function: Place Order.
  • Concurrent Users: 700
  • Total User population: 7,000
  • Average time each user is logged on: 1 hour.
  • Rate (per User): 1 every 60 seconds (plus or minus 15 seconds).
  • Duration: 6 Hours
As can be seen, the second workload specification gets closer to what might be happening. It could be refined further by introducing a spread, or standard deviation, around each average (the plus or minus 15 seconds for the rate being a simple example of this) to better reflect the required workload.
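
The second specification also implies a rate of logons and logoffs that the first does not. A minimal sketch of that arithmetic follows, assuming a steady state in which the concurrent population stays at 700 throughout and every logged-on user places orders at the stated rate; both are assumptions for illustration, not part of the specification. It uses Little's law (concurrent users = arrival rate × average time logged on).

```python
concurrent_users = 700
total_population = 7_000
avg_session_s = 1 * 3600   # average time each user is logged on
duration_s = 6 * 3600      # test duration
interval_s = 60            # average seconds between orders, per user

# Little's law rearranged: arrival rate = concurrent users / average session time.
logons_per_second = concurrent_users / avg_session_s
total_sessions = concurrent_users * duration_s / avg_session_s
sessions_per_person = total_sessions / total_population
orders_per_session = avg_session_s / interval_s

print(f"{logons_per_second:.3f} logons per second")      # 0.194
print(f"{total_sessions:.0f} sessions over the test")    # 4200
print(f"{sessions_per_person:.1f} sessions per person")  # 0.6
print(f"{orders_per_session:.0f} orders per session")    # 60
```

Under these assumptions the test would generate roughly one logon every five seconds, which is the additional load that matters if logon or access processing is material to the project's objectives.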

Other workload specification considerations

The types of workload specification question noted above need to be asked because issues such as caching versus disk reading will come up when the same data is reused repeatedly. In all cases the material difference between the workload specification options needs to be weighed against the load testing objectives. The answer is a pragmatic one that depends on what you are trying to conclude from your performance testing project.

The above discussion has only scratched the surface of specifying the appropriate workload for a given load testing project.

The software under test could be a single API, a user function (as in the case described), or a series of user functions representing multiple users performing various activities.
Regardless of the scope of the software under test, the principles of defining the appropriate workload are the same. Careful analysis of the project's objectives needs to be undertaken before the appropriate workload can be specified.

The workload specification versus the workload implementation in a given load testing tool.

The current discussion has been about specifying What workload is required in order to meet the software load testing objectives. This activity is separate from How the workload requirements are implemented in a given load testing tool (e.g. JMeter). As with defining the workload itself, there are many tradeoffs and issues concerning the extent to which the required workload can be simulated in a given tool.

This discussion is continued in reproducing the defined workload but, by way of example, consider the workload requirement of concurrent users. A given load testing tool may open one TCP connection for each simulated Web user and then process all the HTTP requests on that single connection. This behavior might not reproduce the number of TCP connections that real browsers use (a browser will typically open multiple TCP connections to fetch URLs in parallel when possible). As with the analysis of the appropriate workload requirements, further analysis is needed as to how accurately a given load testing tool can, and needs to, reproduce the required workload.
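
To give a sense of the scale of the TCP connection difference, here is a rough sketch. The figure of 6 parallel connections per browser is an assumption (a common per-host limit for HTTP/1.1 in desktop browsers); the real number depends on the browser, the protocol version and the page being loaded, so the result is an upper bound rather than a measurement.

```python
concurrent_users = 700

# Assumption: the load testing tool opens one TCP connection per simulated user.
tool_connections_per_user = 1
# Assumption: a real browser opens up to 6 parallel connections to the server.
browser_connections_per_user = 6

tool_connections = concurrent_users * tool_connections_per_user
browser_connections = concurrent_users * browser_connections_per_user

print(f"Tool: {tool_connections} connections")                  # 700
print(f"Real browsers: up to {browser_connections} connections")  # 4200
```

Whether a gap of this size matters depends, as before, on whether connection handling is material to the project's objectives.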