
Testing Site Performance

part of Perl for the Web


It's not enough to write the fastest Web applications imaginable and develop the most efficient architecture to host them. In a real-world situation, a Web site also should be tested to see just how fast those applications perform. It's also nice to know just where the architecture might break down under heavy load. A good start to performance testing comes from simply acknowledging that performance is a factor that can be measured and compared against real-world situations. After this idea is firmly in place, it's possible to design a series of repeatable tests that produce results that can be used to predict the future performance of the site under any condition.

Performance testing tools make it possible to test the response of a site to a variety of traffic levels before it is put into production. A number of testing applications exist, and some are made freely available either independently or in conjunction with Web server software. However, most of these tools are designed to provide only basic performance information for simple areas of a Web site, with little support for the complex paths taken by real-world site visitors. These tools generally require some outside assistance to provide understandable data based on realistic tests. Fortunately, Perl provides a good framework for automating tools and creating more complex tests without much extra work. Perl also enables analysis to be incorporated into the automation program and customized to provide specific performance results.

Creating a Useful Test

As I mentioned in Chapter 5, "Architecture-Based Performance Loss", a poorly conceived test can be worse than no site testing at all. The best way to start a site performance test is to gather the data necessary to make the test realistic and representative. This usually involves discovering the usage patterns of current site visitors or using a proxy to capture a log of a few representative visits. This data then can be used to provide a more robust test of all parts of the site in the combinations likely to be seen with real usage. Testing also requires a good grasp of the amount of traffic currently experienced by the site or expected after the site goes live. The combination of realistic usage patterns and traffic estimates enables testers to produce a baseline figure for site traffic, which then can be used to determine the effects of both incremental and exponential increases of traffic.

For most sites, determining site usage or predicting site patterns in exhaustive detail won't be necessary. The data need be complete only to the extent that it affects the test results. Besides, real-world usage on a live site is likely to vary greatly from day to day and over the life of the site, so only a representative average of site usage is necessary. "Representative data" can be loosely defined and still provide valuable insights into the state of site performance and future trends.

Discovering Usage Patterns

For a good test that's representative of real-world site traffic, a realistic set of requests to the site is necessary. There's no better source for these sets than real-world data collected from existing traffic to a Web site. This data indicates how actual visitors are using the site as it exists. Thus, it can be assumed that additional visitors to the site will probably use the site in the same way. Usage data can provide insight into the relative popularity of site sections and functions, which translates to the relative use of static files and Perl programs. For instance, a site with ten static HTML files and two search programs might see much heavier use of the search programs. Site usage analysis would indicate that the search programs should be weighted more heavily in testing.
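As a rough sketch of this kind of weighting analysis, the following program tallies request counts per URL from a few Common Log Format entries. The log lines and addresses here are invented for illustration; a real run would read the server's access log instead.

```perl
#!/usr/bin/perl
# Tally request counts per URL to gauge the relative popularity of
# site sections. Sample data stands in for a real access log.
use strict;
use warnings;

my @log_lines = (
    '10.0.0.1 - - [01/Apr/2001:10:00:00 -0500] "GET /index.html HTTP/1.0" 200 2511',
    '10.0.0.2 - - [01/Apr/2001:10:00:02 -0500] "GET /perl/search HTTP/1.0" 200 1024',
    '10.0.0.1 - - [01/Apr/2001:10:00:05 -0500] "GET /perl/search HTTP/1.0" 200 980',
);

my %hits;
foreach my $line (@log_lines) {
    # the quoted request field holds: method, URL, protocol
    if ($line =~ /"(?:GET|POST)\s+(\S+)/) {
        $hits{$1}++;
    }
}

# print URLs from most to least requested
foreach my $url (sort { $hits{$b} <=> $hits{$a} } keys %hits) {
    print "$hits{$url} $url\n";
}
```

The resulting counts can be used directly as weights when deciding how often each URL should appear in a test path.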

The best source of existing site usage information also is the most commonly available: server logs. Most Web servers take copious notes on every incoming request in a server access log. Server access logs usually contain an entry for each request that includes the client address, a time stamp, the URL being requested, the referring page, and the status of the request. With this information, analogous site requests can be generated easily by a server load testing application, in essence recreating the traffic from any given time period. These blocks of requests can be valuable when testing site performance because they contain a mix of files–including static HTML, programs, and graphic files–that otherwise might be overlooked during performance testing.

In addition, server log information usually is all that's necessary to recreate an individual user's complete path through the site, which is called a visit or session. Session information usually is valuable when testing a Web application that requires multiple requests to perform a coherent action. For example, an e-commerce application that makes a flight reservation might require four or five pages to complete the transaction. Because later pages in the transaction rely on the results of the previous pages, it's necessary to generate simulated requests in the correct order.

Session information can be extracted from an access log fairly easily. If one client IP address is selected and lines containing the address are extracted from the log file, the result is a list of every URL visited by a single user over time. That list can be further narrowed to a single session by choosing only one connected set of URLs from the entire record. The start and end of a session usually can be identified by finding a common entry page–a login page, for example–and extracting entries from that point until a likely exit page is reached. (A page displaying the results of a transaction is an example of an exit page.) Finding the start and end of a session is easier for sites, such as Google, that serve a single purpose from a common starting point, and determining entry and exit pages can be difficult for diffuse sites, such as Yahoo!, that have multiple points of entry. In general, though, a Web application transaction already has some identifier–a user login or session ID, for instance–that separates out a single session.
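A sketch of that extraction, assuming Common Log Format entries and an invented client address:

```perl
#!/usr/bin/perl
# Extract one client's path through the site: filter log entries by IP
# address, then keep the requested URLs in order. The address and log
# lines are made up for illustration.
use strict;
use warnings;

my $client = '10.0.0.1';
my @log_lines = (
    '10.0.0.1 - - [01/Apr/2001:10:00:00 -0500] "GET /login.html HTTP/1.0" 200 1200',
    '10.0.0.9 - - [01/Apr/2001:10:00:01 -0500] "GET /index.html HTTP/1.0" 200 2511',
    '10.0.0.1 - - [01/Apr/2001:10:00:04 -0500] "POST /perl/reserve HTTP/1.0" 200 880',
    '10.0.0.1 - - [01/Apr/2001:10:00:09 -0500] "GET /receipt.html HTTP/1.0" 200 640',
);

my @session;
foreach my $line (@log_lines) {
    next unless $line =~ /^\Q$client\E /;    # this client only
    push @session, $1 if $line =~ /"(?:GET|POST)\s+(\S+)/;
}

print join("\n", @session), "\n";
```

From here, narrowing the list to a single session is a matter of trimming it to the entries between a likely entry page and a likely exit page.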

Using Proxy Data

Server access logs aren't always an ideal source of usage information for a site in development. Logs are sometimes unavailable for the site being tested, or existing server logs might refer to pages and applications on a current site that aren't representative of the new site being created. In some cases, only a specific part of the site is being tested–exclusive of the rest of the site. In these cases, acquiring and processing existing usage data from server logs becomes a problem. In other cases, access logs don't contain all the information necessary to recreate a complete user session. Data stored in cookies or sent through POST requests usually is not recorded in access logs. Thus, requests that rely on this data for session state information or program flow can't be recreated from log data alone.

When server access logs aren't viable for testing, it's possible to generate a representative set of user sessions by using a Web proxy. A Web proxy is a server that accepts all Web requests from a client and passes them along to the Web server and then accepts the response from the server and passes it back to the client. Web proxies usually are used to provide caching or additional security for Web requests, but for server testing, a proxy needs to log only the requests and their associated POST and cookie information as they pass through. In addition, because a proxy for this purpose is concerned more with accurate logging than with throughput, the proxy can be implemented either within the load testing application or with available tools such as Perl. (A proxy implementation in Perl is left as an exercise for the reader. A simple proxy is included with VeloMeter, as mentioned later in this chapter.)

One of the strong points of using a Web proxy to capture client requests is ubiquity. Almost all Web clients–including browsers, file updaters, and custom Web clients–can be configured to use a proxy to process all Web requests. The Web client handles the connection to the proxy automatically, and the rest of the user interface behaves identically, regardless of whether a proxy is used. For instance, specifying a proxy in the Netscape Web browser requires only a simple change to the preferences. After the change has been made, any interaction between Netscape and the Web sites is filtered through the proxy transparently. (I've been known to spend hours diagnosing a "network problem" because I've forgotten to change my Netscape preferences back after demonstrating a proxy during tutorials. Transparency can have its pitfalls.)

Estimating Site Traffic

When interpreting the results of performance testing, it's important to compare them to a current benchmark using the same metrics. It might make sense to find that a Web application has a response time of 0.02 seconds in a test environment, but that value provides little insight into the response time of that application in the future–or the total number of requests the application can process in a given time period. A better result would be a comparison between the current performance of the application and the maximum performance the application can sustain under heavy load. In addition, an idea of current traffic estimates usually provides a framework for evaluating testing results. Determining current traffic levels has the added benefit of forcing the tester to think in terms of traffic units (for example, requests per second [RPS]) rather than in terms of response times.

Site traffic can be estimated in a number of ways, but the easiest source of traffic data comes from log analysis software. In fact, some log analysis programs produce traffic figures as a matter of course–any figures listed as requests (or hits) per a specified time period can be converted to the commonly used RPS. If precalculated figures aren't available, a figure in RPS can be attained easily by taking any listing of the number of requests in a time period and dividing it by the number of seconds in that period. In addition, log analysis figures might also include the minimum and maximum number of hits in a given time period. These values, especially the maximum, can give hints about future traffic levels and periodic traffic increases.
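The arithmetic is simple enough to sketch directly; the daily hit count here is a made-up figure:

```perl
#!/usr/bin/perl
# Convert a log-analysis figure (requests per time period) into RPS by
# dividing by the number of seconds in that period.
use strict;
use warnings;

my $requests = 864_000;        # requests reported for one day (hypothetical)
my $seconds  = 24 * 60 * 60;   # seconds in the same period
my $rps      = $requests / $seconds;

printf "%.2f requests per second\n", $rps;   # prints "10.00 requests per second"
```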

When determining site traffic in terms of a simple unit (such as RPS), keep in mind that not all requests are equal in terms of server load and response times. Requests for static HTML, for instance, are likely to enjoy higher throughput than requests to a Web application. As a result, site traffic that's composed of mostly dynamic requests might cause the same load as a much larger number of static requests. Thus, the ratio of the two can be an important number to determine as well. Don't discount static requests entirely. They do contribute to overall load and play a part in perceived performance, but be sure to reduce their significance in load analysis because they are unlikely to be the cause of a performance bottleneck.
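As an illustration of this weighting, the following sketch converts mixed traffic into a static-equivalent figure. The tenfold cost factor for dynamic requests is an assumption made for the example, not a measured value; the real ratio should be measured on the server being tested.

```perl
#!/usr/bin/perl
# Express mixed traffic as a static-equivalent load, weighting dynamic
# requests by an assumed relative cost.
use strict;
use warnings;

my $total_rps    = 100;    # observed traffic (hypothetical)
my $dynamic_frac = 0.30;   # portion handled by Perl programs
my $cost_factor  = 10;     # one dynamic request ~ ten static ones (assumed)

my $static_rps  = $total_rps * (1 - $dynamic_frac);
my $dynamic_rps = $total_rps * $dynamic_frac;
my $equivalent  = $static_rps + $dynamic_rps * $cost_factor;

printf "%.0f RPS of mixed traffic ~ %.0f RPS static-equivalent\n",
       $total_rps, $equivalent;
```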

Basic Load Simulation with ApacheBench

After representative site usage data has been collected, the next step in testing site performance is to simulate site traffic using a load testing application. At its core, a load testing application simply generates Web requests based on input, times the responses, and records or displays the results. Many applications are available, but a few are more likely to be encountered by Perl programmers in standard Web environments.

ApacheBench is a tool included with most distributions of the Apache Web server. It's a simple load testing application designed to take a single URL and a few configuration parameters and produce a report based on the requests sent by the application. The application is very efficient as a result, so ApacheBench can simulate a heavy load without much CPU or memory impact. This can come in handy when simulating thousands of simultaneous users with a single client machine.

ApacheBench doesn't provide a user-friendly interface, and the tests it performs generally are simplistic when used in the default configuration. However, it is possible to write a Perl program that controls ApacheBench to automate more complex simulations, including dynamic requests from a server or proxy log. The result of this kind of automation more closely resembles a full-featured load testing application suite, and a good deal of input customization and report generation can be built in.

Configuring ApacheBench

ApacheBench is a command-line program, usually found under the name ab in the Apache binary directory or in a system utilities directory. Each execution of ApacheBench corresponds to a load test performed on a single URL–the program simply accesses the URL repeatedly until the test is completed. Testing parameters are specified by command-line arguments, including the total time spent testing and the number of concurrent users to simulate. Additional parameters, such as POST request data (for simulating Web forms) and cookie values, can be specified as well. A sample test performed on a local server with a time limit of 30 seconds and a simulated load of 20 users might be configured like this:


ab -t 30 -c 20 http://localhost/perl/

The -t parameter specifies the total test time in seconds; with the -n parameter, the duration of the test can be limited by the number of requests. Concurrency is specified by the -c parameter, which sets the number of simultaneous users (or threads, in this case) accessing the URL at the same time. If concurrency were set to 1, for instance, ApacheBench would always wait for a response before initiating the next request. Setting higher values for the -c parameter generally increases the load experienced by the server, up to a limit imposed by client capabilities. The result of a typical ApacheBench test might look like the following:


This is ApacheBench, Version 1.3c <$Revision: 1.1 $> apache-1.3
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd,
Copyright (c) 1998-1999 The Apache Group,

Benchmarking localhost (be patient)...
Server Software:        Apache/1.3.14
Server Hostname:        localhost
Server Port:            80

Document Path:          /perl/
Document Length:        2511 bytes

Concurrency Level:      20
Time taken for tests:   30.004 seconds
Complete requests:      19989
Failed requests:        0
Total transferred:      55991990 bytes
HTML transferred:       50194890 bytes
Requests per second:    666.21
Transfer rate:          1866.15 kb/s received

Connnection Times (ms)
              min   avg   max
Connect:        0     3   128
Processing:     4    25   810
Total:          4    28   938

Of the values reported by ApacheBench, the few that are most interesting for site performance testing are complete requests, failed requests, and RPS. Complete requests gives the total number of requests that came back in a successful form, which is divided by the time taken for tests to calculate the RPS value. The RPS value might be faulty in some situations, however. Some valid responses might be listed as failed requests due to assumptions on the part of ApacheBench. For the purposes of the test, a complete request is one that receives a response with a status of 200 OK that is the same length as the initial response. If a page is dynamically generated from a Web application, it might return a different status code or have a different length and therefore be classified as a failed request. These requests should be added to the total to compute the true RPS value.
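A quick sketch of that correction, using the figures from the sample run above; the count of misclassified requests is hypothetical and would come from inspecting the "failed" responses by hand:

```perl
#!/usr/bin/perl
# Recompute RPS after reclassifying "failed" requests that were really
# valid dynamic responses of varying length.
use strict;
use warnings;

my $complete      = 19989;    # complete requests reported by ApacheBench
my $misclassified = 0;        # "failed" requests found to be valid (hypothetical)
my $seconds       = 30.004;   # time taken for tests

my $true_rps = ($complete + $misclassified) / $seconds;
printf "%.2f requests per second\n", $true_rps;   # prints "666.21 requests per second"
```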

Simulating a Simple Path

Performing a robust performance test can be difficult given the basic command-line interface to ApacheBench. Simulating even a simple path through the site is tedious when URLs have to be entered one at a time, and the result is unlikely to be representative of the mixed requests received by a live site. Fortunately, a Perl interface to the ApacheBench application was developed to facilitate more complex paths than the command-line interface allows. The HTTPD::Bench::ApacheBench module provides an object interface based around test runs and URL sets that makes it possible to create repeatable performance tests based on real-world data.

Listing 15.1 is a test set example using a single usage path encoded in a command-line Perl program. The program uses HTTPD::Bench::ApacheBench to test the response rate of a set of URLs at various concurrency values. Results of the test are displayed to the console.

Listing 15.1 Simple ApacheBench Automation

01 #!/usr/bin/perl
03 use 5.6.0;
04 use strict;
05 use warnings;
07 use HTTPD::Bench::ApacheBench;
09 # create and configure the ApacheBench object
10 my $b = HTTPD::Bench::ApacheBench->new;
11 $b->priority("run_priority");
12 $b->repeat(1000);
14 # add the URL list
15 my $list = HTTPD::Bench::ApacheBench::Run->new();
16 $list->order('depth_first');
17 $list->urls([
18              "http://localhost/index.html", 
19              "http://localhost/thing.html",
20              "http://localhost/perl/",
21             ] );
22 $b->add_run($list);
25 # run the load test and return the result
26 foreach my $number_of_users (5,10,15,20)
27 {
28   $b->concurrency($number_of_users);
29   my $ro = $b->execute;
31   # calculate requests per second
32   print "$number_of_users users, ", $b->total_requests, " total requests:\n";
33   my $rps = sprintf('%2.2f', (1000*$b->total_requests / $b->total_time));
34   print "$rps requests per second\n\n";
35 }

Lines 01–07 of Listing 15.1 set up the environment and load the HTTPD::Bench::ApacheBench module. Line 10 creates a benchmarking object that provides both configuration methods and regression methods for the test. Line 11 sets the priority of the test runs, which would affect only this program if it used more than one test path. Line 12 sets the number of times to repeat the test; this value should be set to keep the test running at least 30 seconds. Shorter tests are likely to produce erratic results because the server doesn't have time to adjust to a sustained load.

Lines 15–22 add a list of URLs as a single test run. Line 16 specifies that the list should be ordered depth-first, or completed in full before starting the next repetition, as opposed to breadth-first, which would loop over each URL individually. Line 22 adds the test run object created in line 15 to the complete test. Multiple tests can be added this way, allowing a more varied set of user interactions to be simulated.

Lines 26–35 perform a test once for each value of $number_of_users, which varies the concurrency value starting at 5 users and ending with 20 simultaneous users. These values should be determined interactively during testing; settings used in practice are likely to be much higher than these defaults. Line 28 applies the current concurrency value to the test, and line 29 starts the test itself. Lines 32–34 display a selection of the test results, concentrating on the RPS values for each level of concurrency. The output of a typical run might look like the following:


5 users, 3000 total requests:
345.94 requests per second

10 users, 3000 total requests:
301.08 requests per second

15 users, 3000 total requests:
301.02 requests per second

20 users, 3000 total requests:
301.17 requests per second

Note that the number of total requests doesn't vary based on the number of users. That number is based solely on the number of URLs given in the test path times the number of repetitions. Note also that the total number of requests for this particular run isn't enough to carry the test for more than 10 seconds; results from an initial run should be used to modify the number of repetitions or the number of URLs to provide a longer test. Three URLs isn't a very robust sample of site usage. Thus, this test could be augmented by using more data from site access logs to fill out the test path.
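For example, a short Perl sketch could distill access-log entries into the one-URL-per-line file that Listing 15.2 expects. The host name, log lines, and output file name here are placeholders:

```perl
#!/usr/bin/perl
# Turn Common Log Format entries into a URL-per-line test path file.
# Sample data stands in for a real access log.
use strict;
use warnings;

my $host = 'http://localhost';
my @log_lines = (
    '10.0.0.1 - - [01/Apr/2001:10:00:00 -0500] "GET /index.html HTTP/1.0" 200 2511',
    '10.0.0.1 - - [01/Apr/2001:10:00:02 -0500] "GET /thing.html HTTP/1.0" 200 1400',
    '10.0.0.1 - - [01/Apr/2001:10:00:04 -0500] "GET /perl/ HTTP/1.0" 200 2511',
);

# pull the requested path from each entry and make it a full URL
my @urls;
foreach my $line (@log_lines) {
    push @urls, "$host$1" if $line =~ /"GET\s+(\S+)/;
}

open my $out, '>', 'testpath.txt' or die "Can't write testpath.txt: $!";
print $out "$_\n" for @urls;
close $out;
```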

Comparing Multiple Simple Paths

Listing 15.2 provides a more general way to configure site paths. It also produces output that is more amenable to automated comparisons using spreadsheet software and other numeric tools. The example is a similar command-line Perl program that reads URLs for a test path from a log file specified on the command line, saves the results to a comma-separated values (CSV) file, and displays the results to the console.

Listing 15.2 Log-Based ApacheBench Automation

01 #!/usr/bin/perl
03 use 5.6.0;
04 use strict;
05 use warnings;
07 use HTTPD::Bench::ApacheBench;
09 # create and configure the ApacheBench object
10 my $b = HTTPD::Bench::ApacheBench->new;
11 $b->priority("run_priority");
12 $b->repeat(1000);
14 # get a URL list from the specified file
15 my $filename = $ARGV[0];
16 unless ($filename)
17 {
18   print "Please specify a filename.\n\n";
19   exit;
20 }
21 open URLFILE, "$filename" or die "Can't open $filename: $!";
22 my @urls = <URLFILE>;
23 close URLFILE;
25 print "Simulating load from path:\n";
26 print @urls, "\n";
28 # add the URL list
29 my $list = HTTPD::Bench::ApacheBench::Run->new();
30 $list->order('depth_first');
31 $list->urls(\@urls);
32 $b->add_run($list);
34 my $csvfile = $ARGV[1] || "$filename.csv";
35 open CSV, ">$csvfile" or die "Can't open $csvfile for writing: $!";
36 print CSV "Test path, users, total requests, total time, RPS,\n";
38 # run the load test and return the result
39 foreach my $number_of_users (5,10,15,20)
40 {
41   $b->concurrency($number_of_users);
42   my $ro = $b->execute;
44   print "$number_of_users users, ", $b->total_requests, " total requests:\n";
46   # calculate and display requests per second
47   my $rps = sprintf('%2.2f', (1000*$b->total_requests / $b->total_time));
48   print "$rps requests per second\n\n";
50   # log run results to CSV file
51   print CSV "$filename, ";
52   print CSV "$number_of_users, ";
53   print CSV $b->total_requests, ", ";
54   print CSV $b->total_time, ", ";
55   print CSV "$rps, ";
56   print CSV "\n";
57 }
58 close CSV;

Lines 15–23 open the file specified as a command-line argument and read in a list of URLs to use as the test path. Line 15 gets the file name from $ARGV[0], and lines 16–20 display an error and exit if no file name is provided. Lines 21–23 open the file and assign each line of the file as an element in the @urls array. Line 31 assigns the list of URLs to a test run. As a result, the input file for Listing 15.2 is formatted with one URL per line, as in the following example:

http://localhost/index.html
http://localhost/thing.html
http://localhost/perl/

A list of URLs in this format easily can be extracted from site access logs or custom proxy logs. Lines 25 and 26 of the program print the list of URLs being used in the test run for verification. The console output of Listing 15.2 looks like the following:

Simulating load from path:
http://localhost/index.html
http://localhost/thing.html
http://localhost/perl/

5 users, 3000 total requests:
329.20 requests per second

10 users, 3000 total requests:
300.93 requests per second

15 users, 3000 total requests:
301.69 requests per second

20 users, 3000 total requests:
300.63 requests per second

Lines 34–36 open and label a CSV file that is used to save the test results in a format readable by other programs for further analysis. Line 34 tries to read the file name from the second command-line argument; if none is provided, it uses the name of the input file with a .csv extension appended. The labels printed by line 36 are optional and should correspond directly with the values printed to the file in lines 51–56 for each test run. Again, the emphasis in this example is on the number of RPS for each run, but other data is provided by the HTTPD::Bench::ApacheBench module if needed. A file produced using the format in Listing 15.2 would look like the following:


Test path, users, total requests, total time, RPS,
first-set, 5, 3000, 9113, 329.20, 
first-set, 10, 3000, 9969, 300.93, 
first-set, 15, 3000, 9944, 301.69, 
first-set, 20, 3000, 9979, 300.63,

The CSV output can be used to compare the results from multiple sets by importing the files into a spreadsheet program or database. These results also can be used to compare the performance of a Web application at baseline and after changes have been made to the application. Additional analysis can be performed within the Perl program, as well; the only limiting factors are the complexity of the file that specifies URLs for test runs and the format of the CSV file produced as output. This kind of program also can be used to produce Web-friendly graphs and tables from the results, although care should be taken in making the program itself available as a dynamic Web application. Such an application could easily be turned into a site from which distributed denial of service attacks could be carried out.
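As a sketch of that kind of comparison, the following program reads two sets of result rows in the column order Listing 15.2 prints and reports the change in RPS at each concurrency level. The rows for the changed application are invented figures for illustration:

```perl
#!/usr/bin/perl
# Compare RPS figures from two result sets (e.g. baseline vs. changed
# application). Rows are inline samples in Listing 15.2's column order:
# test path, users, total requests, total time, RPS.
use strict;
use warnings;

my @baseline = ("first-set, 5, 3000, 9113, 329.20,",
                "first-set, 10, 3000, 9969, 300.93,");
my @changed  = ("first-set, 5, 3000, 8702, 344.75,",
                "first-set, 10, 3000, 9400, 319.15,");

for my $i (0 .. $#baseline) {
    my ($users, $old_rps) = (split /,\s*/, $baseline[$i])[1, 4];
    my $new_rps           = (split /,\s*/, $changed[$i])[4];
    printf "%2d users: %.2f -> %.2f RPS (%+.1f%%)\n",
           $users, $old_rps, $new_rps,
           100 * ($new_rps - $old_rps) / $old_rps;
}
```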

Comparing Multiple Complex Paths

The capability to pass complex requests to the server would make Listing 15.2 a more robust performance-testing tool. URLs can encode some kinds of form data, but most Web applications require POST data or cookie values to operate correctly. Listing 15.3 is a Perl program that reads paths with POST data from a more complex log file and provides results similar to those from Listing 15.2.

Listing 15.3 Complex ApacheBench Automation

01 #!/usr/bin/perl
03 use 5.6.0;
04 use strict;
05 use warnings;
07 use HTTPD::Bench::ApacheBench;
09 # create and configure the ApacheBench object
10 my $b = HTTPD::Bench::ApacheBench->new;
11 $b->priority("run_priority");
12 $b->repeat(1000);
14 # get a URL list from the specified file
15 my $filename = $ARGV[0];
16 unless ($filename)
17 {
18   print "Please specify a filename.\n\n";
19   exit;
20 }
21 open URLFILE, "$filename" or die "Can't open $filename: $!";
22 my @urls;
23 my @post_data;
24 while (my $line = <URLFILE>)
25 {
26   my ($url, $post) = split "\t", $line;
27   $post = '' unless (defined $post and $post =~ tr/=//);
28   push @urls, "$url\n";
29   push @post_data, $post;
30 }
31 close URLFILE;
33 print "Simulating load from path:\n";
34 print @urls, "\n";
36 # add the URL list
37 my $list = HTTPD::Bench::ApacheBench::Run->new();
38 $list->order('depth_first');
39 $list->urls(\@urls);
40 $list->postdata(\@post_data);
41 $b->add_run($list);
43 my $csvfile = $ARGV[1] || "$filename.csv";
44 open CSV, ">$csvfile" or die "Can't open $csvfile for writing: $!";
45 print CSV "Test path, users, total requests, total time, RPS,\n";
47 # run the load test and return the result
48 foreach my $number_of_users (5,10,15,20)
49 {
50   $b->concurrency($number_of_users);
51   my $ro = $b->execute;
53   print "$number_of_users users, ", $b->total_requests, " total requests:\n";
55   # calculate and display requests per second
56   my $rps = sprintf('%2.2f', (1000*$b->total_requests / $b->total_time));
57   print "$rps requests per second\n\n";
59   # log run results to CSV file
60   print CSV "$filename, ";
61   print CSV "$number_of_users, ";
62   print CSV $b->total_requests, ", ";
63   print CSV $b->total_time, ", ";
64   print CSV "$rps, ";
65   print CSV "\n";
66 }
67 close CSV;

HTTPD::Bench::ApacheBench enables POST data to be specified for each request in a test series. Lines 21–31 open the file specified as a command-line argument and populate an array of URLs as well as a corresponding array of POST data. Line 26 splits each line of the file into a URL and POST data segment. The input file format for Listing 15.3 is slightly different from the one for Listing 15.2; each line contains a URL and its associated POST information separated by a tab character:

http://localhost/index.html	
http://localhost/perl/search	query=perl
http://localhost/thing.html	

Not all requests have associated POST data, although each line has a tab character after the URL. To handle the difference, line 27 checks for lines without POST data–identifiable by the lack of an = character–and creates an empty entry in the @post_data array for each one. The array of POST data is then added to the test run in line 40. URLs with no corresponding POST data are sent through GET requests by HTTPD::Bench::ApacheBench, and the rest are sent using POST requests. In the sample file, the first and third URLs would be processed using GET requests, and the second URL would be sent through a POST request with the supplied POST data.

These programs only hint at the scriptability of an ApacheBench setup, of course. Cookie data could be added to the mix, or session information could be generated using new POST data for each request. HTTPD::Bench::ApacheBench also provides more detailed data about the result of each test. Thus, values provided in the response could be used to generate new requests that mimic the cycle of a client browser more closely.

Graphic Comparisons with VeloMeter

Of course, after a site performance analysis tool is written to handle robust test sets and produce graphable results, a logical option is to combine the application with a user-friendly interface and graphing tools. The result would rival high-end performance analysis packages while retaining the capability to be customized. VeloMeter is an application that followed just such a path. Originally developed by VelociGen as an in-house tool for testing the performance of client sites, VeloMeter eventually gained features such as a graphing library and a built-in proxy server for generating usage logs interactively.

VeloMeter is written in Java, so the application can run under any Java environment. In addition to a short run as a commercial application, VeloMeter was offered as a free download from the VelociGen Web site. Starting with version 3.0, VeloMeter is offered as open source software under the GNU General Public License (GPL), which gives anyone permission to use, modify, and redistribute the program. It is available in Java source form or as a binary package for Linux or Windows environments.

Proxy Configuration

A VeloMeter testing session is organized around test sets, which are single site paths that consist of a list of URLs with associated POST and cookie data. A test set can be entered manually one URL at a time, or it can be generated using the built-in proxy server and a standard Web browser. The latter method is considerably easier and more likely to produce complete test sets, including graphic file requests and POST data that might be overlooked otherwise. The dialog box in Figure 15.1 provides an example of the proxy configuration process and results.


Figure 15.1

VeloMeter proxy configuration.

When activated, the VeloMeter proxy indicates which port it is using (usually 9999). This port should be specified when configuring the proxy settings of the browser being used to generate test sets. After the browser is configured, a user can browse through the site to be tested as normal, and the interactions between the browser and the Web server are recorded in the proxy window. Each URL is listed on its own line with accompanying POST or cookie values (or null, if none). After the session is complete, the test set can be saved by stopping the proxy with the Done button. Any number of test sets can be generated this way, and additional URLs can be added to any test set by using the same process.

Log File Configuration

Test sets can be saved to or loaded from VeloMeter-format test set files. Each file can contain one or more test sets in a format similar to the following:

# 5 users Set (6 urls, 5 users, 100 times)
5 users
[urls]
http://localhost/index.html	null	null
http://localhost/perl/search	query=perl	null
...
[end set]
# End 5 users Set

The core of a test set is the list of URLs in the [urls] section. This list follows a simple format, with one URL per line followed by POST and cookie value strings, separated by a tab character. The list of URLs easily can be extracted from a site access log, with POST and cookie data added by hand if necessary. Conversely, lists generated by the VeloMeter proxy server can be adapted for use with other load testing applications, including Listing 15.3 earlier in this chapter. In VeloMeter 3.0, test sets from files in the VeloMeter format can be imported into the comparison form using the File menu.
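As a sketch of such an adaptation, the following program converts tab-separated URL/POST/cookie lines (with null standing in for absent values, as described above) into the two-column input format Listing 15.3 reads. The captured lines are hypothetical:

```perl
#!/usr/bin/perl
# Convert proxy-captured URL/POST/cookie lines into the tab-separated
# URL/POST format read by Listing 15.3. Sample lines stand in for a
# real proxy log.
use strict;
use warnings;

my @captured = (
    "http://localhost/index.html\tnull\tnull",
    "http://localhost/perl/search\tquery=perl\tnull",
);

my @converted;
foreach my $line (@captured) {
    my ($url, $post, $cookie) = split /\t/, $line;
    $post = '' if !defined $post or $post eq 'null';   # GET request
    push @converted, "$url\t$post";                    # drop the cookie column
}

print "$_\n" for @converted;
```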

Comparison Settings

After test sets have been generated or imported, each set can be run individually to generate a performance test and save the results. A useful baseline test can be created by importing the same test set repeatedly and then assigning an incremental number of users to each set. For example, Figure 15.2 shows a test set that has been imported four times and assigned levels of concurrency in increments of five users. Note that the total number of requests is derived by multiplying the number of URLs by both the number of users and the number of repeated runs.


Figure 15.2

Running a VeloMeter test.

Test runs can be repeated individually or in groups, if necessary, but VeloMeter saves only the results of the most recent test. Results from each set can be compared either by saving the result data or by plotting the results on a common graph. A table of result data can be viewed individually for each test set and saved in CSV format from the display dialog if desired. Results also can be plotted individually or in groups on a graph that displays the average response times for each URL in a set. When used in conjunction with the testing method shown in Figure 15.2, this type of URL-by-URL comparison might provide insights into the source of bottlenecks at various concurrency levels. The graph of response times can be overlaid by a bar chart showing the average response time and RPS value for each test run, which gives a more-is-better indicator of which run performed the best.
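The saved CSV data also lends itself to quick URL-by-URL summaries outside VeloMeter. The sketch below averages response times per URL from a small made-up results file; the two-column layout here is an assumption for illustration, not VeloMeter's exact export format.

```shell
# Hypothetical CSV of individual response times, one row per request.
cat > results.csv <<'EOF'
url,response_time
/index.html,0.21
/index.html,0.25
/search.cgi,1.10
/search.cgi,0.90
EOF

# Average the response time for each URL, skipping the header row.
awk -F, 'NR > 1 { sum[$1] += $2; n[$1]++ }
         END { for (u in sum) printf "%s %.2f\n", u, sum[u] / n[u] }' results.csv \
    | sort > averages.txt
cat averages.txt
```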

VeloMeter is undergoing continuous development as this book is being written; you likely will see significant changes due to its open-source nature. Check the Web site for this book for current updates of the VeloMeter software.

Simulating Site Overload

The most common test performed on a site is an overload test, which sends requests to a site as fast as the site can handle them. As usually conducted, however, an overload test is good only for comparisons between configurations that have been subjected to the same overload test. The comparison might be between different configurations of a site, different iterations of a Web application, or even different times when a site is active. An overload test doesn't provide any relationship between its results and the current traffic experienced by a site, however. As a result, site testing should be carried out with the idea that any one test is incomplete by itself and that it requires a comparison test to put it into perspective.

All these tests assume that the site being tested is a development site, not a live site. Under no circumstances should overload testing be performed on a live site. Sites are expected to behave erratically when put under these conditions, and requests might be dropped or incorrect data returned if a user accesses a live site while testing is performed. If a live site needs to be tested, make a copy of the site solely for testing or use the procedures in the "Testing a Live Site" section of this chapter.

Checking Baseline Performance

When testing a site to determine the maximum traffic it can support, it's good to start with a reasonable amount of load that produces response times and server activity similar to the current load. This value can take some fine tuning to achieve, but site usage data from access logs and site analysis software can provide some insights into the amount of traffic the site currently receives. Add additional load gradually, checking response times and server monitors to gauge the server's reaction. With tools such as ApacheBench and VeloMeter, load can be added by increasing the number of users (or concurrent threads) accessing the site simultaneously.
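ApacheBench prints a "Requests per second" line in each report, and pulling that figure out makes runs at increasing concurrency easy to tabulate. The sketch below parses a saved, abbreviated ab report (the numbers are made up); in practice the file would be the captured output of something like ab -n 1000 -c 5 run against the development server.

```shell
# Abbreviated stand-in for a captured ApacheBench report.
cat > ab_c5.txt <<'EOF'
Concurrency Level:      5
Requests per second:    212.43 [#/sec] (mean)
Time per request:       23.537 [ms] (mean)
EOF

# Pull out just the RPS figure for comparison across runs.
awk -F: '/Requests per second/ { split($2, f, " "); print f[1] }' ab_c5.txt > rps.txt
cat rps.txt
```

Repeating this at each concurrency level produces a simple table of load versus RPS, which is where the bottleneck point shows up.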

At some point during testing, the server should encounter a bottleneck that causes response times to change noticeably. When this happens, the load being generated at that point can be considered the maximum baseline traffic that can be handled gracefully by the server. That load then can be used in more robust site tests to determine site response in terms of RPS and total response time for each URL being tested. For instance, a concurrency value of 2,000 users might produce results of 200 RPS and an average response time of 0.5 seconds. These values won't relate to each other directly, so the important numbers to consider are the RPS figures, especially as compared to current traffic values, and the response times, which can be used to determine the overall response time perceived by visitors. If a site that is currently receiving 50 RPS can achieve a baseline of 200 RPS, for example, traffic to the site can grow to four times its current level (an increase of 300 percent) without losing significant performance.
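That headroom calculation is simple enough to script alongside the test run. The sketch below uses illustrative figures of 50 RPS current traffic and a 200 RPS graceful maximum; both numbers are placeholders for measured values.

```shell
current=50    # RPS the site receives today (illustrative)
baseline=200  # graceful maximum RPS found by testing (illustrative)

# Express the baseline as a multiple of, and a percentage increase
# over, current traffic.
awk -v c="$current" -v b="$baseline" 'BEGIN {
    printf "headroom: %.0fx current traffic (a %.0f%% increase)\n", b / c, (b / c - 1) * 100
}' > headroom.txt
cat headroom.txt
```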

In addition, a test can be performed to see how the server responds when it is completely overloaded. This usually can be simulated by generating as much traffic as the client machine can produce and then adding in as many client machines as necessary to cause the server to overload. The point of overload can be gauged by checking the load test application for dropped requests or by accessing the server with a normal Web browser to see if it returns a 503 Service Unavailable error or stops responding to Web requests entirely.

Finding Points of Failure

One of the most difficult parts of performance testing is determining the root cause of a bottleneck. When a server is overloaded by testing, the resultant performance disruption to different parts of the server could be either the cause of the slowdown or a result of it. For instance, a server bottleneck due to Perl processing might use all available processing power and tie up every Web server process, which slows down any other Web request. Overall Web server slowness might lead a site tester to incorrectly determine that the Web server itself was the cause of the bottleneck, while the real culprit goes undetected.

The cause of a bottleneck sometimes can be determined by repeating the test multiple times and watching the condition of each system involved to catch unusual behavior. For instance, the top program can be run in a UNIX environment to determine memory and CPU usage of processes in real time. A test can be performed while the top program is running to check if any particular process (such as the Web server, the application engines, or the database) is taking up an unusual amount of memory or processor time. Additionally, the conditions that cause overload can be throttled back a bit to catch conditions that are close to failure but that haven't yet gone over the top.
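The same watch-for-outliers idea can be scripted from snapshot output. The sketch below flags processes above an arbitrary 40 percent CPU threshold in a made-up ps aux-style snapshot; during a real test, the snapshot would come from top in batch mode (top -b) or ps aux captured while the load runs.

```shell
# Made-up ps aux-style snapshot taken during a load test.
cat > ps_snapshot.txt <<'EOF'
USER   PID %CPU %MEM COMMAND
web    101 87.0 12.3 httpd
web    102  2.1  1.0 httpd
mysql  200 45.5 30.2 mysqld
EOF

# Flag anything over 40% CPU as a candidate bottleneck.
awk 'NR > 1 && $3 > 40 { print $5, $3 "%" }' ps_snapshot.txt > hot.txt
cat hot.txt
```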

Retesting After Fixes

After performance data is collected and potential bottlenecks have been identified, work usually is done to fix the root of the problem and overcome the bottlenecks. (If not, work should be done to fix the problem.) This isn't the end of the process, however. After performance bottlenecks seem to be solved, it's important to retest the site using the original test sets. This will indicate whether a real performance improvement has been realized, and it will provide numbers that can be used to calculate a percentage improvement. The improvement then can be compared to estimated traffic growth to determine whether the time spent finding a solution allowed the site to keep pace with traffic growth over the same period. For example, a 10 percent improvement that took six weeks to implement didn't meet its goal if site traffic increased 20 percent over the same period.

The performance testing process might have any number of these iterations, as many as the development timeline allows. As a result, it's good to plan for as many testing cycles as possible, starting as early as possible in the timeline to allow time for fixes and retesting. One good place to which performance testing can be added is regression and acceptance test cycles. When load testing is considered a regular part of the Web application evaluation process, the work done in designing performance test sets can be combined with usability tests and other standard testing practices.

Testing a Live Site

Sometimes a site is already in production before it can be tested. Unfortunately, testing the performance of a live site is trickier than testing a site in development. Even when a site has been load tested during development, it's important to continue testing a site after it is put in production.

A repetition of the baseline test described earlier in this section can provide valuable data for a live site. Using data determined from the development site–or similar data determined indirectly from usage patterns–you can test the live site using the baseline testing configuration. Of course, the last thing a busy site needs is to be inundated by load testing requests while it also is handling real user traffic. Therefore, extra care needs to be taken to restrict the test set to a percentage of current traffic that is well within the graceful maximum.

It also is important to exclude any usage information that alters or adds data to the site. User requests that post information to forums, e-commerce forms, and other site input applications should not be duplicated because their effects are duplicated as well. For instance, it's not a good idea to use a forum-posting request as the template for a thousand similar requests because each one adds a new message to the forum. Similarly, any request that is likely to alter the site indirectly–for instance, a user login that alters the "last logged in" date–should be avoided. These exclusions might reduce the accuracy of the representative requests, but the alternative is a site with erroneous data created whenever testing occurs.
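Filtering state-altering requests out of a captured list is easy to automate. The sketch below drops POST requests and a login URL from a made-up request list; the URL names are illustrative, and the real list of indirectly state-altering URLs has to come from knowledge of the site.

```shell
# Made-up captured request list.
cat > captured.txt <<'EOF'
GET /index.html
POST /forum/post.cgi
GET /login.cgi
GET /images/logo.gif
EOF

# Keep only GETs, and drop URLs known to alter the site indirectly.
grep '^GET' captured.txt | grep -v '/login' > safe.txt
cat safe.txt
```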

Sidebar: Interpreting Test Results

Web site performance indicators have had a rough history. They started out as an offshoot of the infamous hit counter graphic, and in most cases, they've gone downhill from there. The industry never quite got the idea of "hits" out of its collective consciousness, and site log analysis tools have been crippled as a result. I usually have to go to great pains to eradicate any mention of hits when configuring a new analysis tool, but the result is worth it. Web applications, HTML files, and sundry files such as graphics and PDFs should be treated differently when judging performance, and analysis tools that lump them together provide analyses of marginal utility at best.

With load testing tools, the opposite is true–there isn't nearly enough synthesis being performed within the application. ApacheBench, for instance, produces a result that means more to the hit-counter crowd than anyone; the assumption is that I'd like to see only the absolute best transfer rate possible for a single file being tested. There's no way for me to get what I really want–the time I have before my Web server collapses under the strain of traffic increases–without having to do lots of work massaging the results into a format that makes sense. I'm not alone: I have yet to see any real-world performance test–in a magazine or on the Web–provide the results in a format directly produced by any load-testing application. I'm much more likely to see results displayed as a graph produced by Excel from hand-entered data. This isn't the kind of environment that encourages Web administrators to test their sites more often.

Closer to home, VeloMeter is no exception to this rule. The initial release of VeloMeter created exciting color graphs, but the data in those graphs was nearly useless for anything but raw comparisons. Gleaning any sort of useful information from the VeloMeter data required hours of configuration and reconfiguration and meticulous notes of both the conditions and the results of each test. The current iteration has made some progress in the right direction, but it still produces results that are more closely related to the hit counter than to the kind of usage I'm liable to see from a real site over time. It still doesn't take connection speeds or varying numbers of realistic users into account.


Web application performance testing can be difficult to understand, but the results of a well-designed test are usually worth the effort. The first step to a successful load test is designing the test with real-world data and a realistic expectation of the results that are needed to prepare the site for future traffic. After a testing plan is in place and representative data are collected, load generators such as ApacheBench or VeloMeter can be used to simulate high request volumes and to gauge the response times and throughput of the server. Using these tools, a tester can check baseline performance levels, simulate the Slashdot effect, or overload a development site to determine potential bottlenecks. Data gleaned from these tests can be used to fix potential performance problems, evaluate new hardware and software, and provide assurance that the site can perform efficiently under any expected load.


Page last updated: 15 August 2001