
Tools for Perl Persistence

part of Perl for the Web


Many tools exist for adding persistent performance to Perl-based Web application environments. Products such as mod_perl, FastCGI, VelociGen, and PerlEx provide persistence for Web applications on a wide variety of platforms using widely varying architectures. This list is far from complete, but it samples the most common tools available at the time this book was written. In fact, most readers are likely to run across one of these tools, mod_perl, much more often than all the rest combined, simply because the server for which it is written, Apache, is the single most popular Web server in the world. Each product has its strengths and weaknesses, though, so the decision to use one over another is based as much on the Web application being deployed as on any comparison among the tools.

A combination of approaches also is possible. One product, VelociGen, uses a variety of architectures to provide a range of possible solutions based on environment. You can create similar solutions by using a combination of tools. Any of these solutions provides a marked improvement in performance; they all implement architectures that provide persistence and that do away with most of the overhead due to Common Gateway Interface (CGI) application processing. The main differentiation points between these tools are subtle architectural differences that are likely to make notable differences only when tested using individual Web applications in production-class environments. More information about simulating these environments is available in Chapter 15, "Testing Site Performance."

Other factors to consider when choosing a persistence environment are platform, price, and ease of use. Some tools are available only for a handful of platforms, and some platforms rule out all but one persistence environment. For instance, an iPlanet Web server running under Solaris won't be able to use mod_perl or PerlEx, and FastCGI might also be difficult to find for that platform. Thus, the choice would be narrowed to VelociGen. The Apache Web server running under Linux, on the other hand, has a wide range of Perl-persistence choices. In that case, the relative price of each tool could be the deciding factor when it is weighted against the relative difficulty of implementing one persistent solution over another.


mod_perl

One popular environment for providing Perl persistence is mod_perl, which is offered as a product by the Apache group. mod_perl is implemented as a Perl interface to the Apache application program interface (API). Thus, mod_perl applications are compiled into the server process and run within it. mod_perl provides persistence for Perl programs by embedding the Perl interpreter into the server process and caching the compiled Perl code as it is executed.

mod_perl is offered under the same open source license as the Apache Web server. In this context, an open source license grants users the right to download, execute, modify, and redistribute both the Apache Web server and mod_perl without prior permission from the Apache group. Because of this openness and the wide deployment of the Apache Web server, mod_perl has developed a large user base and is used by many other open source Web applications and site development systems made available for download. In fact, mod_perl is the most popular Perl-persistence package in use at the time this book was written.

The tight integration between Apache and mod_perl can be both a strength and a weakness for the environment. Full access to the Apache server API gives mod_perl a unique advantage when creating server extensions, but the distinct API programming style used to create mod_perl extensions is often daunting for CGI programmers who are not used to the object-oriented Perl style. mod_perl comes with an Apache::Registry module to provide a CGI compatibility mode for these developers, but some care still must be taken when using complex CGI programs in a mod_perl environment.

Architecture of mod_perl

Apache offers mod_perl as a direct Perl interface to the Apache API. As a result, the architecture of a Web application written for mod_perl is based around the architecture of the Apache server itself. At runtime, mod_perl programs are considered direct extensions of the Apache server process rather than separate programs, even though mod_perl extensions are written independently of the compiled server.

mod_perl operates by embedding a Perl interpreter into each server process. The Perl interpreter compiles mod_perl programs into the server process at runtime. In addition, these programs can compile Perl subprograms into the server process at compile time or as requests come in. This latter approach is how Apache::Registry is used to compile CGI programs. Embedding the Perl interpreter into the server process saves the overhead of loading and executing it every time a request comes in. It also offers the capability to compile and cache Perl programs as persistent parts of the server process, which saves the overhead of loading and compiling these programs each time a request comes in.
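The compile-and-cache idea described above can be sketched in plain Perl. This is a simplified simulation, not the actual Apache::Registry implementation: the %script_source hash is a hypothetical stand-in for scripts read from disk, and run_script is an illustrative name. Each script body is wrapped in a subroutine, compiled once, and reused for later requests.

```perl
use strict;
use warnings;

my %compiled;            # cache: script name => compiled code reference
my $compile_count = 0;   # how many times compilation actually happened

# Hypothetical stand-in for reading a CGI script from disk.
my %script_source = (
    'hello.cgi' => 'return "Hello, $_[0]!";',
);

sub run_script {
    my ($name, @args) = @_;
    unless ($compiled{$name}) {
        # Wrap the script body in an anonymous subroutine and compile it once.
        $compiled{$name} = eval "sub { $script_source{$name} }"
            or die "compile failed for $name: $@";
        $compile_count++;
    }
    # Later requests skip compilation and call the cached subroutine.
    return $compiled{$name}->(@args);
}

my @replies = map { run_script('hello.cgi', $_) } qw(Alice Bob Carol);
print "$_\n" for @replies;
print "compiled $compile_count time(s) for ", scalar(@replies), " requests\n";
```

Three requests are served here from a single compilation, which is the saving a persistent interpreter offers over loading and compiling the program on every request.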

However, the one-to-one relationship between Apache server processes and mod_perl processes can cause a performance problem for servers with exceptionally high volumes of traffic. This problem stems from the default behavior of Apache server processes. This default behavior creates as many server processes as there are simultaneous requests. In some cases, hundreds of Apache processes might be running simultaneously. Because mod_perl embeds the Perl interpreter in every Apache process, these processes are likely to be much larger in terms of memory than Apache server processes would be alone. Note that all Apache processes are considered equal when deciding which process handles a specific request, regardless of whether that request is for a static file or a Perl program; thus, it's possible to have hundreds of Perl interpreters loaded in server processes, even though most requests handled by those processes won't need mod_perl at all. This situation can cause secondary inefficiencies with caching Perl programs; with hundreds of server processes, there could be hundreds of copies of each Perl program, which causes even more memory to be used.

Open Source

Both Apache and mod_perl are open source programs offered under the GNU General Public License (GPL). The GPL gives Apache and mod_perl users and developers permission to use, modify, and redistribute the source code to both programs. The only condition imposed on these permissions is that modified or redistributed versions of the programs must also be licensed under the GPL with the same permissions. (Incidentally, programs written in Perl for mod_perl or Apache are not bound by this license because they are considered dynamically linked libraries and therefore separate from the licensed programs.) In addition to the GPL license, both the Apache server and mod_perl are made freely available for download from the Apache project Web site.

Correction: Both Apache and mod_perl are open source programs offered under the Apache Software License (ASL). The ASL gives Apache and mod_perl users and developers permission to use, modify, and redistribute the source code or binaries to both programs. The conditions imposed on these permissions include that modified or redistributed versions of the programs must not use the Apache name without permission. Both the Apache server and mod_perl are made freely available for download from the Apache project Web site.

Open source licenses embody a philosophy that has served the Web community well, and the Apache project specifically. In fact, one of the major reasons for the Apache server's overwhelming popularity and exponential growth has been the fact that it's readily available to any Web developer. Similarly, mod_perl has become a popular way to achieve Perl persistence because it is made just as readily available. In addition, both products have become faster, more robust, and more stable because the open source development model encourages programmers to add features and fix errors in the code, regardless of whether they're directly allied with the project. Perl itself has been offered under a similar license called the Artistic License, which gives similar permissions to users, but which imposes no restrictions on the license used for derivative works. Because of their shared open source philosophy, the Perl and Apache communities have a good working relationship, with current technology and future plans shared between the groups. This results in recognition of mod_perl and persistence when developing Perl modules, many of which are optimized to perform well in persistent environments.

The fact that mod_perl is offered under an open source license also hedges against the possibility that the software's original developers will shift their focus away from mod_perl and leave it undeveloped. In the case of closed source commercial software, a project can suffer significantly if a vendor decides to discontinue development of a crucial product. The software still is available, but it soon can become obsolete without regular updates to make it compatible with current technologies. With open source software, however, it is possible to take control of crucial software if the vendor decides to discontinue development. The vendor's product then is moved from the list of purchased products to the list of internally developed applications. Also, in some cases, development of a product is taken over by a new group if an existing group stops development. In fact, many open source projects (including Perl's XML::Parser and DBI modules) were developed initially by programmers from related projects, and then taken over and implemented by different programmers.

Tight Integration with Apache

Apache and mod_perl are integrated very tightly, which can be both a benefit and a drawback. When viewed as a Perl interface to the Apache API, mod_perl can be a very powerful tool for changing the Apache server environment and creating extensions for complex tasks that CGI-style Web applications are not efficient at performing. However, when viewing mod_perl as a CGI replacement for persistent Perl environments, the tight integration of mod_perl into the Apache server can be a deterrent to developing portable Web applications.

One of the unique benefits of developing Perl Web applications in mod_perl is that it gives complete access to the Apache API. However, because mod_perl provides little in the way of an abstraction layer between mod_perl programs and the API, programs written for a specific version of mod_perl are subject to changes in the Apache API. For instance, if a call that generates a specific error message (for example, "500 Server Error" or "301 Moved") changes from one version of the Apache server to the next, modules written using mod_perl with this functionality behave erratically or cease to work.

Of course, the main difficulty with mod_perl's tight integration with Apache is that it requires the Apache server. This would seem to be a minor issue because Apache is currently the most popular Web server software available and the acknowledged performance leader. In addition, Apache is available for most operating systems, including most UNIX variants and Windows. Unfortunately, this assumes that the only considerations when choosing a Web server are Perl integration and platform availability. If some other reason to switch Web server platforms arises, converting mod_perl programs to operate in the new environment would be a nontrivial task.

Large User Base

As a popular module for the most popular server in the world, mod_perl benefits from a very large user base. mod_perl programmers have developed many different types of applications, ranging from development environments to complete Web site administration systems. Many systems have been written with mod_perl in mind, including the Slash news site generation system mentioned in the SlashCode section of Chapter 7, "Perl for the Web."

In addition, many modules have been created to enable existing Perl modules to take advantage of the persistent environment offered by mod_perl. These applications usually are listed under the Apache:: branch of the Perl module tree, which includes important modules such as Apache::DBI. More information on Apache::DBI and its role in persistence can be found in Chapter 14, "Database-Backed Web Sites."

Object-Oriented Programming

Although there are many Perl programming environments built on top of mod_perl, the standard form of mod_perl programming takes the form of Apache API calls through an object-oriented interface. Although this provides a powerful way to interact directly with the Web server, some Perl programmers are less than comfortable with Perl's object-oriented programming style. In fact, a few Perl programmers might have a specific aversion to object-oriented style because of occasional programmatic awkwardness such a style entails.
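For readers used to CGI-style print statements, the object-oriented handler style can look like the sketch below. The method names content_type, send_http_header, and print follow the mod_perl 1.x request object, but MockRequest and MyHandler are hypothetical stand-ins defined here so the example runs without an Apache server, and the OK constant is defined locally rather than imported from Apache::Constants.

```perl
use strict;
use warnings;

# Mock request object: a hypothetical stand-in so the sketch runs without Apache.
package MockRequest;
sub new { return bless { type => '', body => '' }, shift }
sub content_type {
    my ($self, $type) = @_;
    $self->{type} = $type if defined $type;
    return $self->{type};
}
sub send_http_header { return 1 }    # a real server would emit headers here
sub print { my $self = shift; $self->{body} .= join '', @_; return 1 }

package MyHandler;
use constant OK => 0;    # under mod_perl this would come from Apache::Constants

# The handler receives the request object and drives the response through it.
sub handler {
    my $r = shift;
    $r->content_type('text/html');
    $r->send_http_header;
    $r->print("<p>Hello from a handler</p>");
    return OK;
}

package main;
my $r      = MockRequest->new;
my $status = MyHandler::handler($r);
print "status: $status\n";
print "type: ", $r->content_type, "\n";
print $r->{body}, "\n";
```

Every interaction with the server goes through method calls on the request object instead of through environment variables and standard output, which is the main stylistic shift for CGI programmers.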

In addition, the persistent nature of mod_perl programs requires them to have a stricter style that is more suitable for a persistent environment. Again, this can be a style that is different from that which is used by many Perl programmers in a stand-alone or CGI environment. Specifically, adapting existing CGI programs for use as mod_perl modules might be a daunting task because the differences between the mod_perl style and the CGI style can be present at such a fundamental level that it might seem easier to write the entire application from scratch.

Many of these issues are common enough that they have been addressed in modules designed to soften the sharp differences between the mod_perl programming style and the CGI style. The Apache::Registry module, for instance, provides a way to use CGI programs in an unmodified form within a mod_perl environment. This process is not without its caveats, however; Apache::Registry does not account for the needs of CGI programs that stray from the norm (for example, programs with nonparsed headers), so additional changes to the environment or the program might be necessary to fit with the assumptions made by Apache::Registry. The module also doesn't solve the problem of global variables used in a persistent context, which still can cause some CGI-style programs to behave erratically when run persistently. For new program creation, additional functionality is provided by modules such as Embperl and HTML::Mason, which enable mod_perl programs to be created in sections that make more sense within a Web site framework. Environments such as these are discussed in Chapter 12, "Environments for Reducing Development Time."
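The global variable problem mentioned above is easy to reproduce without any Web server. In this minimal sketch, leaky_request and safe_request are hypothetical names standing in for the body of a CGI program that is called repeatedly in one persistent process.

```perl
use strict;
use warnings;

our $total;    # package global, as a loosely written CGI script might use

# CGI-style body: silently assumes $total starts out empty on every run.
sub leaky_request {
    my @prices = @_;
    $total += $_ for @prices;
    return $total;
}

# Persistence-safe body: the lexical is created fresh on every call.
sub safe_request {
    my @prices = @_;
    my $sum = 0;
    $sum += $_ for @prices;
    return $sum;
}

# Simulate three identical requests handled by the same persistent process.
my @leaky = map { leaky_request(5, 10) } 1 .. 3;
my @safe  = map { safe_request(5, 10) } 1 .. 3;
print "leaky: @leaky\n";    # prints "leaky: 15 30 45"
print "safe:  @safe\n";     # prints "safe:  15 15 15"
```

Under CGI the leaky version would appear correct, because each request starts a new process; only in a persistent environment does the value carry over from one request to the next.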


FastCGI

FastCGI is a protocol designed to replace the CGI protocol and provide a more platform-neutral approach to API programming for Web servers. As a replacement for CGI, it was written to provide a language-independent framework for persistent connections between a Web server and application engines running in separate processes. As a broader solution for API programming, the FastCGI architecture adds greater control over the performance characteristics of application engines, as well as an abstracted interface that enables the deployment of FastCGI applications in various load-balancing configurations, including over a network of application processing servers.

FastCGI was originally developed by Fast Engines, and then released as open source software and maintained by a group of programmers. Modules implementing the FastCGI protocol are available for Apache Web servers from the FastCGI Web site, and Zeus servers include support for the FastCGI protocol as a native API.

The architecture of FastCGI applications is different from both CGI applications and mod_perl. FastCGI applications are deployed as independent persistent processes that communicate with the Web server through the FastCGI protocol. On the other hand, CGI applications are run once per request by the Web server, and they communicate through standard program output channels. Despite the differences, FastCGI holds true to the CGI programming style. FastCGI applications are written in a style that is similar to the style used with CGI programs, although some modifications have to be made to accommodate the persistent environment and to handle the FastCGI protocol.

Architecture of FastCGI

FastCGI programs are run as independent processes–either as network daemons or application engines, as described in Chapter 9, "The Power of Persistence." The FastCGI processes communicate with a module embedded in the Web server process or processes. FastCGI engines written in Perl embed the Perl interpreter in each application engine process, providing a persistent Perl interface to the Web server that is similar to the one mod_perl provides. However, the FastCGI interface is abstracted away from the Web server API by the FastCGI protocol. Thus, FastCGI programs are more resistant to changes in the Web server environment.

FastCGI processes can be implemented as daemons, which communicate with the Web server over the network, or as application engines, which communicate with the Web server through named pipes or similar structures. FastCGI processes might also be implemented as threaded applications, but Perl processes are unlikely to take advantage of FastCGI threading because it's not advisable to run Perl programs in a threaded environment. However, some Perl-related programs, such as VelociGen, use FastCGI threading to reduce the overhead of having multiple FastCGI processes in memory.

Because FastCGI application engines are kept separate from the Web server process, it's possible to execute a different number of application engines than there are Web server processes. This can be a benefit to environments in which hundreds of Web server processes are running to deal with large volumes of static page requests while only a few application engines are needed to handle the less-frequent requests for dynamic pages. The opposite case also is possible, such as when there is only a single-threaded Web server process handling static requests with many more application engines supporting it.

In addition, the separation between FastCGI and Web server processes adds a layer of stability by isolating potentially unstable code in the FastCGI processes. Because the FastCGI protocol conveys only a limited amount of information between the Web server and the application engine, an engine can fail without taking down the Web server process. This level of separation is crucial for threaded Web server processes because there is likely only one process running and all current client connections would be lost if that process were to crash.

Like the CGI protocol, FastCGI was designed to be language-neutral. FastCGI applications can be created in any language as long as the application implements the FastCGI protocol. Toolkits with FastCGI libraries and sample code are available for the C, Perl, Tcl, and Java languages. These toolkits are made available under open source licenses. Thus, programs can be developed with them without the need for licensing agreements or fees. For Perl programmers, the FCGI module provides all the functionality necessary to create a FastCGI application engine in Perl.

Familiar CGI-Style Code

True to its name, FastCGI implements an interface that should be familiar to CGI application developers. In fact, the core of a FastCGI application engine can simply be a CGI application written in the same style as any other CGI program. This core is implemented as a subroutine that is similar to the way Apache::Registry compiles CGI programs as subroutines, but FastCGI programs are likely to have just one core subroutine that handles every request.

Of course, CGI programs as written are not FastCGI applications. An additional layer of code has to be added to each application engine to handle the communication between the engine and the Web server and the persistence of the application engine. The FCGI module makes this code as easy to implement as possible, but modifications still have to be made to each CGI application being translated into a FastCGI process. It is possible to combine the functionality of the Apache::Registry and FCGI modules to produce a FastCGI program that loads other Perl programs and that processes the differences automatically; however, the work involved can be more difficult than rewriting the applications individually.

One notable difference between CGI and FastCGI styles is that global variables have to be declared and initialized at the beginning of a FastCGI program. In contrast, CGI-style global variables are generally declared in place, if at all. The upside of this initialization phase is that complex data structures can be explicitly initialized outside the event handler loop so that they are instantiated only once over the lifetime of an application engine. This process can reduce the overhead of creating these data structures each time a request comes in, which can be considerable for complex data such as XML parse trees or other parsed files. This procedure should not be used to create databases in memory to store state information because more than one application engine might be used to process the same type of request. In addition, the process loses all in-memory data when the application engine is restarted, so it's generally safer to store state information outside the application engine, in a database or session server, for instance.
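The split between a one-time initialization phase and a per-request loop can be sketched as follows. This is a simulation under stated assumptions: the @queue array is a hypothetical stand-in for the accept loop that the FCGI module would supply in a real engine, and load_config stands in for an expensive step such as parsing a large XML file.

```perl
use strict;
use warnings;

my $init_count = 0;

# Stand-in for an expensive step such as parsing a large XML file.
sub load_config {
    $init_count++;
    return { greeting => 'Hello' };
}

# Initialization phase: runs once over the lifetime of the engine.
my $config = load_config();

# Hypothetical request queue standing in for the real accept loop.
my @queue = ({ name => 'Alice' }, { name => 'Bob' }, { name => 'Carol' });

# Event handler loop: runs once per request and reuses $config.
my @responses;
while (my $request = shift @queue) {
    push @responses, "$config->{greeting}, $request->{name}!";
}

print "$_\n" for @responses;
print "initialized $init_count time(s) for ", scalar(@responses), " requests\n";
```

The data built before the loop is read-only configuration, which is safe to share across requests. Per-user state is deliberately left out, for the reasons given above.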

Other aspects of persistence need to be coded in a FastCGI application as well. CGI programs rarely have to consider the effects of global variables versus local or file-scoped variables. In a persistent context, however, global variables can hold on to their values from one request to the next, providing inconsistent results for each request. In addition, FastCGI programs run CGI-style code as a subroutine the same way Apache::Registry does under mod_perl. Thus, unanticipated behavior might occur due to CGI subroutines being shifted outside the main subroutine. Some of these issues are covered in Chapter 11.


VelociGen

VelociGen is a commercial Web application server, offered by a company of the same name, that provides Perl persistence and a series of Perl development environments. VelociGen is available for Netscape Enterprise Server, Apache, Microsoft Internet Information Server (IIS), and Zeus on most UNIX and Windows platforms.

The VelociGen server architecture combines many of the techniques mentioned in Chapter 9 to take advantage of every performance improvement provided by any technique available. As a result, the specific architecture used by any one installation of VelociGen can vary widely depending on the capabilities of the operating system, the hardware configuration, and the Web server software and deployment style used on any specific instance of the server environment.

Common to all VelociGen servers, however, is an abstracted programming environment that uses the same style, regardless of the underlying architecture. This environment is designed to provide comprehensive CGI compatibility for existing programs with no program modifications necessary. Additional environments are provided for new Web application programming. These environments include a full persistence mode with no overhead, which is for power-hungry applications, and an embedded mode, which addresses simpler programming tasks in more complex HTML or XML environments.

Architecture of VelociGen

The architecture of a VelociGen server varies depending on the environment. For instance, VelociGen for Netscape Server API (NSAPI) servers uses a threaded plug-in module that communicates with application engines, while VelociGen for Apache servers uses Dynamic Shared Objects (DSO) or FastCGI connections between application engines and the Web server. In fact, variations on both the monolithic architecture (the "Web Applications in One Modular Program" section in Chapter 9) and the independent architectures (the "Web Applications as Persistent Separate Programs" section in Chapter 9) are possible depending on server choice and the configuration of the VelociGen engines. For instance, VelociGen engines can be deployed as DSOs for the Apache server under Linux, and each in turn can control a Perl processing engine on the same machine. In another configuration, a threaded VelociGen load balancing plug-in for the iPlanet server under Solaris can control a number of clustered Perl engines on multiple machines.

The most common architecture used by VelociGen servers consists of a number of application engines controlled by a load-balancing plug-in module. Each application engine is a persistent Perl program that is started by the plug-in module and sent requests as the Web server delivers them. The application engines communicate with the VelociGen plug-in through a named pipe or socket, depending on the server and configuration used, so that they can be deployed on local or remote machines.

Each application engine is designed to operate in a fashion similar to Apache::Registry by compiling independent programs as subroutines within an overarching persistent Perl program. The engines can be set up to run for a specific number of requests before restarting, which enables the environment to be purged of cached elements if they are likely to grow over time.

Caching and Clustering

In addition to basic process control over the number of application engines present, the plug-in module provides load balancing and automatic clustering for application engines. Each request is dispatched to an appropriate Perl engine based on a load-balancing algorithm. The algorithm takes into account which engines have processed similar requests in the past as well as which engines are currently in use or unavailable. As a result, program engines are clustered automatically based on execution patterns and relative application popularity. This results in engines that take up less memory because fewer programs are being cached in each engine. The engines also are more responsive to requests overall because the odds of a request being processed by an engine with the necessary program and connections cached increase dramatically.
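The dispatch rule described above (prefer an available engine that has already handled the same program) can be illustrated with a small sketch. VelociGen's actual algorithm is not documented here, so the engine records, the pick_engine name, and the simple "warm engine first" rule are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical engine records: a busy flag plus the programs already cached.
my @engines = (
    { id => 1, busy => 0, cached => { 'search.pl' => 1 } },
    { id => 2, busy => 1, cached => { 'cart.pl'   => 1 } },
    { id => 3, busy => 0, cached => { 'cart.pl'   => 1 } },
);

sub pick_engine {
    my ($program, @pool) = @_;
    my @idle = grep { !$_->{busy} } @pool;
    return unless @idle;    # no engine is available right now

    # Prefer an idle engine that already has the program compiled.
    my ($warm) = grep { $_->{cached}{$program} } @idle;
    my $chosen = $warm || $idle[0];

    # After running the program, the chosen engine has it cached.
    $chosen->{cached}{$program} = 1;
    return $chosen;
}

for my $program (qw(cart.pl search.pl report.pl)) {
    my $engine = pick_engine($program, @engines);
    print "$program -> engine $engine->{id}\n";
}
```

With this pool, cart.pl goes to engine 3 (idle and warm), search.pl goes to engine 1, and report.pl falls back to the first idle engine. Over time each engine ends up caching only the programs routed to it, which is the clustering effect the text describes.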

Because applications in a persistent environment are compiled and cached in memory and those applications might be running in a persistent mode for days or months, it's necessary to check for changes in the application or other cached files every time a request is received. VelociGen tracks the age of each file cached in memory, checks that age against the file on disk, and reloads and recompiles the file (if necessary) before processing the request. Although this capability can be important when developing a Web application, this functionality can be turned off to save overhead and improve the performance of deployed applications.
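A minimal sketch of this kind of freshness check follows. It assumes the file's modification time from stat is the "age" being compared. The names load_program and write_program are hypothetical, and the demo sets timestamps explicitly with utime so that the change is visible regardless of clock resolution.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my %cache;               # path => { mtime => ..., code => ... }
my $compile_count = 0;

# Return the compiled program, recompiling only if the file on disk changed.
sub load_program {
    my ($path) = @_;
    my $mtime = (stat $path)[9];
    die "cannot stat $path: $!" unless defined $mtime;

    my $entry = $cache{$path};
    if (!$entry || $entry->{mtime} != $mtime) {
        open my $fh, '<', $path or die "cannot read $path: $!";
        my $source = do { local $/; <$fh> };
        close $fh;
        my $code = eval "sub { $source }" or die "compile failed: $@";
        $compile_count++;
        $entry = $cache{$path} = { mtime => $mtime, code => $code };
    }
    return $entry->{code};
}

# Demo helper: write a program and force a specific modification time.
sub write_program {
    my ($path, $source, $mtime) = @_;
    open my $fh, '>', $path or die "cannot write $path: $!";
    print {$fh} $source;
    close $fh;
    utime $mtime, $mtime, $path;
}

my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;
my $now = time;

write_program($path, 'return "version 1";', $now - 60);
my @seen = (load_program($path)->(), load_program($path)->());

write_program($path, 'return "version 2";', $now);
push @seen, load_program($path)->();

print "@seen (compiled $compile_count times)\n";
```

The stat call on every request is the overhead the text refers to; skipping it in production trades automatic reloads for slightly faster request handling.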

As mentioned, Perl application engines can be clustered across machines to provide greater scalability to a Web application. To do this, identical copies of the programs and their shared environment–including database connection managers and other driver software–must be present on each machine used as an application environment. After those copies are present, however, each new machine can be pressed into service interactively as additional performance is needed, without modifying any of the applications.

Full CGI Compatibility

VelociGen was developed to provide a complete solution to companies that want to accelerate existing CGI programs without the need to rewrite any code. To this end, the VelociGen architecture and environment is built around providing high-transaction support for persistence in as transparent a fashion as possible. In fact, VelociGen is tested to ensure that any valid CGI Perl program–no matter how complex or sloppy–produces the same output in a persistent context as it would in a CGI environment. To do this, VelociGen provides a CGI compatibility environment that simulates a CGI environment as closely as possible. The environment checks programs before they are compiled to make simple changes automatically, and then it checks the Perl environment after each instance of the program is executed to clean up global variables or other memory structures that might not have been disposed of by the program itself. The CGI compatibility environment also provides environment variables expected by many CGI programs that might not be present in the persistent environment.

VelociGen provides persistent environments with stricter style rules as well. After existing CGI programs are made persistent within the compatibility framework, it's possible to develop new applications in full persistence mode. Persistent applications built without the CGI compatibility layer achieve even better performance because the environment no longer keeps track of global variables or preprocesses input parameters. This saves a bit of overhead and offers an incremental performance improvement, but it requires extra care when writing programs. Often, however, the requirements for this mode can be met simply by using the strict and warnings pragmas to check code during development and the CGI module to process input parameters when necessary.

For development that more closely resembles HTML than CGI programming, an environment that provides embedded Perl code and abstracted HTML-like tags also is available. This environment, which is known as Perl Server Pages (PSP), enables Perl code to be embedded into HTML pages in a manner similar to Active Server Pages (ASP), Java Server Pages (JSP), and PHP pages. The Perl code then can be treated as though it will be executed to fill in the page at runtime, which enables smaller sections of Perl to be used within more complex HTML pages by site designers without the cumbersome CGI style. The specifics of PSP syntax and usage are explored in Chapter 12, "Environments for Reducing Development Time." An important architectural note is that these pages are translated into full persistent Perl programs at runtime and treated much like other persistent programs handled by VelociGen.

In fact, all the environments offered by VelociGen provide persistence, database connection caching, precompiling, and script caching. Because the VelociGen server is based on a robust persistence architecture, performance benchmarks in any of the available modes are similar, and even the slowest of the modes still posts benchmarks that are orders of magnitude faster than CGI environments.


PerlEx

PerlEx is a similar commercial CGI Perl accelerator offered by the ActiveState software company. The software is generally less expensive than VelociGen, but it is not offered under an open source license, as are FastCGI and mod_perl. Unlike mod_perl, however, PerlEx provides persistence for Perl programs under Windows. In addition, accelerating Perl Web applications might be simpler and might require less modification using PerlEx than with a FastCGI implementation. PerlEx can be a good solution for noncritical Perl programs that need basic persistence.

PerlEx utilizes a plug-in architecture, embedding Perl interpreters directly into the Web server process. The plug-in is available for IIS, O'Reilly WebSite, and Netscape (or iPlanet) servers. As this book is being written, PerlEx is available only for Web servers running under the Windows operating system; versions of PerlEx for UNIX or other operating systems have not been announced. Similarly, a version of PerlEx for the Apache Web server has not been announced.

Architecture of PerlEx

PerlEx is implemented as a plug-in for Windows Web servers. The PerlEx environment embeds a threaded Perl interpreter in the Web server process, which provides persistence by compiling Perl programs into an encompassing Perl program. The resulting architecture is most similar to the "A Web Application as a Single Plug-In" section of Chapter 9. No separation is maintained between the Web server process and the Perl environment, and there is only one server process. Thus, any unstable Perl code is potentially capable of bringing down the entire Web server process.

Web application performance in PerlEx can be improved by designating scripts for preloading and precompiling in the PerlEx registry entries. In addition, PerlEx can support persistent database connections or custom file caching by specifying one-time code in the program's BEGIN and END blocks. Code in the BEGIN block to create a database connection, for instance, is run once when the program is compiled. The resulting connection then is available to subsequent requests as the program is executed. This method is not as transparent as the method employed by Apache::DBI, however. (Apache::DBI is described in detail in Chapter 14.)
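The one-time nature of BEGIN block code can be seen in a plain Perl sketch. The hash used as $connection is a hypothetical stand-in for a real database handle, and handle_request is an illustrative name for the code that would run on each request in a persistent environment.

```perl
use strict;
use warnings;

our ($connection, $connect_count);

BEGIN {
    # One-time setup, run when the program is compiled.
    # The hash is a stand-in for a real database handle.
    $connect_count = 0;
    $connection = { id => ++$connect_count };
}

# Per-request code reuses the connection created at compile time.
sub handle_request {
    my ($n) = @_;
    return "request $n used connection $connection->{id}";
}

my @log = map { handle_request($_) } 1 .. 3;
print "$_\n" for @log;
print "connections opened: $connect_count\n";
```

The program itself has to arrange for the connection to be created in the right place, which is why this approach is less transparent than one in which the environment manages connections on the program's behalf.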

As mentioned in Chapter 9, threading is experimental in Perl implementations through version 5.6. Though many Web servers are threaded, it is generally not wise to introduce threaded Perl into a high-transaction Web application. PerlEx is designed to work in threaded mode, however. Thus, thread safety of Perl programs running under PerlEx is a concern. A list of thread-safe Perl modules is available from the ActiveState Web site, but the list is far from comprehensive. In general, any Perl module written solely in Perl (that is, without any compiled extensions) is likely to be thread-safe, and a few common compiled modules, such as DBI, are considered thread-safe as well. Because the threading implementation is likely to change, however, many modules considered thread-safe in one version of Perl might not be so in another.

ASP-Style Programming

PerlEx provides persistence for both CGI and ASP-style code. Many CGI programs can be used with PerlEx without modification, but a few caveats should be observed when using PerlEx to add persistence to a complex CGI Web application. PerlEx doesn't always automatically account for global variables, and certain common modules should be used only in object-oriented mode. Luckily, PerlEx enables programs to be excluded from the persistent mode by specifying them in a registry entry used by the environment. Unfortunately, though, programs excluded in this fashion do not benefit from any performance increases PerlEx provides other programs.

PerlEx also provides an environment for new development using ASP-style embedded Perl code. Perl code can be embedded in HTML pages using the <% and %> tags, which are familiar to ASP programmers. These pages then are processed to replace the Perl code with its output at runtime–similar to the way a Server Side Include (SSI) works.
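The replace-code-with-output idea can be approximated in a few lines of plain Perl. This is a deliberately simplified sketch, not PerlEx's implementation: render_page is a hypothetical helper, each embedded span is treated as a single Perl expression whose value is substituted into the page, and the %vars hash is an assumed way of passing data to the page.

```perl
use strict;
use warnings;

# Replace each <% ... %> span with the value of the Perl expression inside it.
sub render_page {
    my ($page, %vars) = @_;
    $page =~ s{<%(.*?)%>}{
        my $value = eval $1;    # embedded code can read %vars
        die "embedded code failed: $@" if $@;
        defined $value ? $value : '';
    }gse;
    return $page;
}

# Single quotes keep the embedded code from being interpolated too early.
my $page = '<p>Hello, <% $vars{name} %>! 2 + 3 = <% 2 + 3 %></p>';
my $html = render_page($page, name => 'Alice');
print "$html\n";    # prints "<p>Hello, Alice! 2 + 3 = 5</p>"
```

A production environment would compile each page into a cached subroutine rather than evaluating the embedded code on every request, and would never evaluate code from an untrusted source.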


Summary

Although there are many tools that provide Perl persistence in a Web server environment, four are most commonly seen: mod_perl, FastCGI, VelociGen, and PerlEx.

Overall, each environment should be evaluated in terms of the Web application being developed, as well as budget, platform, and development time constraints, because each tool provides a marked improvement in performance over a comparable CGI solution.


Page last updated: 15 August 2001