Environments for Reducing Development Time
part of Perl for the Web
Increasing the performance of Perl code as it executes is one thing, but there are aspects of a Web application that can't be sped up automatically. However, a change in architecture (and the resultant development environment) can do for development overhead what persistence does for execution time. A new approach also can pave the way for more robust applications that keep the final product fixed firmly in mind.
The first step to improving the development process for Web applications in Perl is to move from a Perl-centric approach to a Web-centric approach. This is done by setting up an environment that enables Perl code to be embedded in HTML, instead of the other way around. At first blush, it would seem to be an equivalent situation, but the difference in perspective can make the difference between a Web-savvy developer and a developer who can't make the leap from Perl to the Web.
Embedding Perl in HTML or XML
To put it simply, embedding Perl in Hypertext Markup Language (HTML) (or Extensible Markup Language [XML], eventually) avoids simplistic application designs. If the focus of an application is what the Perl code is doing rather than how the result is presented to the user, that result is likely to be too simple to convey the full meaning of the data being presented. In an HTML environment, the end product is everything because it's the only thing the user sees.
Basing a program on HTML makes it more Web-like and takes advantage of Web techniques. For instance, a Web application has the capability to reference parts of the same application (other pages) or parts of other Web applications indiscriminately. The point of a hypertext system is to link all possible applications (or pages) in a web of nodes that have no boundaries. If a Web application is designed as a monolithic structure, it's less likely to take advantage of these links to communicate with anything but itself.
After the focus is shifted from traditional applications to Web-centric applications, it's possible to ease development further by adding template mechanisms and convenience methods. The need for reusable components becomes much more acute when hundreds of programs with minor differences are being created to operate as a whole within a coherent environment. Because Web users value consistency from one site to the next and insist on consistency within a site, it makes perfect sense to separate out those parts of a site that stay constant and reference them indirectly within each individual program. From there, it's only a short hop to writing the programs themselves as components that are referenced indirectly by a user's actions as he or she moves through a dynamic site.
"Hello World" Syndrome
The first Web application learned by most Common Gateway Interface (CGI)-style developers is nearly identical to the first application learned by any other programmer: "Hello World." Listing 12.1 is an example of how the Hello World program might be created by a Web programmer who is beginning to learn Perl CGI.
Listing 12.1 Hello World
01 #!/usr/bin/perl
02
03 # include libraries
04 require 5.6.0;
05 use strict;
06 use warnings;
07 use CGI;
08
09 # initiate CGI parser object
10 my $q = CGI->new;
11
12 # begin the page
13 print $q->header,
14 $q->start_html('Hello World');
15
16 print $q->p("Hello, World!");
17
18 print $q->end_html;
This program might look pretty impressive from a Perl point of view. In a Web context, though, it's one line of relevant information and 17 lines of telling us what we already know: we're using Perl, we're writing a Web application, and it's creating HTML. Its output probably would inspire no one:
Hello, World!
As it's written, the program produces a very simple page. It doesn't take advantage of HTML's more advanced features. However, the simple case still is difficult enough that more complex tasks (such as the compilation of information from around the Web by a page like My Yahoo!) seem daunting in comparison to their HTML output. In fact, as the program gets more complex, it still finds no room for the frills, color, and ease of use users have come to expect on the Web. As a result, too many mature applications on the Web show the same blandness from which Hello World suffers.
Worse yet, future Web applications written in the same style have to start from Hello World all over again because none of the code developed for one application is exported into the environment for other applications to use. This provides no continuity at all over the course of a site, and it relegates usability concerns to the few Web programmers who have successfully slogged through the reams of code necessary to get the programs working in the first place.
Embracing Web Style
The Web is based on HTML pages for a very good reason. Back when the Web was being invented, a simple page markup and hypertext description language also was being developed. It was designed to be simpler to understand (and therefore more likely to be used) than page description languages, such as PostScript, or programming languages, such as C. The result (HTML) became one of the handful of reasons why the Web became phenomenally popular in the ensuing years. Suddenly, the distance between source code and final product was shorter than ever before. It was short enough that ordinary tinkerers could take a look at the workings of a multimillion-dollar site and discover its secrets in a matter of hours.
The core of Web style is that HTML pages have a certain look and feel that comes about from the kind of markup HTML enables. This creates pages that flow from top to bottom in (generally) one column of text with frequent breaks in the text caused by headers, images, and hyperlinks. Viewing an HTML source file gives the same impression: a single flow broken by a few different types of text. Images might surround the core information stream, but their meaning usually fades into the background as attention concentrates on the central focus of the page. Application logic sometimes can obscure that flow by being more abstract. For designers who are used to seeing the content of a page in terms of a flow with some breaks, it might be difficult to get a handle on how a page flows when it is listed as nested subroutines or abstract function calls. The flow is further impeded when the page itself is buried in a mound of print statements and environment declarations.
Web-like programming can give rise to stronger Web pages and more intuitive Web applications. Because the Web style is so easy to grasp, Web applications written in a similar style would seem to benefit from the same easy-to-understand logic. As it turns out, this is often the case when developing Web applications in a Web-centric style. The applications match their final output so closely that simply viewing the end result indicates the source of any discord or flawed program logic. Additional componentization of the application spreads this vision further by enabling the programmer or designer to separate the program out by visual section. After a program is fully Web-centric, it becomes easy for a Web programmer to focus on a section of code by addressing just the output of that section, without giving any thought to the sections around it, whether on the page or on other unseen pages.
Templates and Code Separation
Templates defining the overall look and behavior of a site are a natural extension of the idea of embedding Perl code in HTML. Templates can be viewed as HTML-centric pages with smaller Perl snippets, which stand for abstract parts of the Web application interface. In fact, there's no better way to think of them: because Perl and HTML are both known quantities, viewing a template system as Perl-in-HTML makes it much easier to implement. The details can be tricky, however; see Chapter 13, "Using Templates with Perl Applications," for more information on templates.
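To make the Perl-in-HTML view concrete, here is a minimal sketch of a template filler written in plain Perl. The fill_template sub and its <% name %> marker syntax are assumptions made for illustration; they are not the syntax of any particular template system, and a real system handles far more than simple substitution.

```perl
use strict;
use warnings;

# fill_template is a hypothetical, deliberately tiny stand-in for a real
# template system. It replaces each <% name %> marker with the matching
# value from %$values and leaves unknown markers untouched.
sub fill_template {
    my ($template, $values) = @_;
    $template =~ s{(<%\s*(\w+)\s*%>)}
                  {exists $values->{$2} ? $values->{$2} : $1}ge;
    return $template;
}

# The page reads as HTML first; the markers are the only "code" in it.
my $page = <<'END_TEMPLATE';
<html>
<head><title><% title %></title></head>
<body>
<h2><% title %></h2>
<p><% body %></p>
</body>
</html>
END_TEMPLATE

print fill_template($page, { title => 'Hello World', body => 'Hello, World!' });
```

Because the template is an ordinary HTML page, a designer can edit the layout without touching the Perl that supplies the values.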
Code separation also is a tricky topic because the code part of an application can't be completely separate from the display part of the application. It's possible to create abstractions of each in the arena of the other, but there's no complete way to separate the two because there always has to be a connection point. Embedding code in HTML provides a good interface that doesn't have to give up either the robustness of code or the simplicity of HTML. In addition, it takes away the need to create an artificial third layer that might not have the strengths of either.
Comparisons with CGI.pm Style
Before talking about embedded solutions, you need to get a better understanding of CGI.pm style. It's a style that grew out of the standard Perl programming style, with the addition of the conventions of CGI.pm, which is the most prominent CGI interaction library in use. CGI.pm offers a host of functions for accessing environment and form variables and controlling output. By controlling the way that the Perl program interacts with these interfaces, CGI.pm emphasizes the Perlish nature of a Web application and hides much of the HTML code that eventually results.
The CGI.pm style usually is overkill when creating a Web application. Functions are present that are just Perl translations of HTML text. Thus, it's often easier just to print the HTML directly instead of calling a Perl function that might be less familiar to a Web designer. In addition, the nested style of HTML tags is not fully compatible with a linear programming method such as the one Perl uses. Trying to compose a complex document with these functions becomes awkward and unreadable, while the equivalent HTML would be trivial to compose. In addition, using CGI.pm to access form variables for Web application communication can be cumbersome and repetitive. In short, CGI.pm has its place, but it usually is much more useful when the environment uses it implicitly than when a program calls it explicitly.
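As a rough illustration of printing the HTML directly, the Hello World page can be produced from a single here-document. This is only a sketch: the hello_page sub is a hypothetical name, and the Content-Type header is written by hand here rather than generated by CGI.pm.

```perl
use strict;
use warnings;

# hello_page is a hypothetical helper, not part of CGI.pm.
# It returns a complete CGI response: one header line, a blank line,
# and the HTML exactly as a designer would write it.
sub hello_page {
    my ($message) = @_;
    return <<"END_HTML";
Content-Type: text/html

<html>
<head><title>Hello World</title></head>
<body>
<p>$message</p>
</body>
</html>
END_HTML
}

print hello_page('Hello, World!');
```

The markup in the here-document can be pasted straight from an HTML editor, which is the point of the comparison.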
SQL Query Processor Revisited
The SQL query processor example from Chapter 7, "Perl for the Web," reprinted here as Listing 12.2, is written using the CGI style. This program is self-contained, relying on multiple requests to the same program to provide a complete Web application. It incorporates no other files aside from Perl modules. Its coding style is overwhelmingly Perl-centric with very few intimations that the output of the program is actually HTML. It also assumes only the barest of execution environments. Thus, every aspect of Perl initialization is present and necessary.
Listing 12.2 SQL Query Processor
001 #!/usr/bin/perl
002
003 #-----------------------------------------
004 #
005 # sql_query.pl - CGI application example
006 #
007 #-----------------------------------------
008
009 # include libraries
010 require 5.6.0;
011 use strict;
012 use warnings;
013 use CGI;
014 use DBI;
015
016 # declare some variables
017 my ($q, $dbh, $sth, $query, $datasource, $user, $password, $error, $field, $result, $results);
018 my (@datasources);
019
020 # initiate CGI parser object
021 $q = CGI->new;
022
023 # begin the page
024 print $q->header,
025 $q->start_html('SQL Database Viewer'),
026 $q->h2('SQL Database Viewer');
027
028 # build a (safe) list of data sources
029 foreach (DBI->available_drivers)
030 {
031 eval {
032 foreach (DBI->data_sources($_))
033 {
034 push @datasources, $_;
035 }
036 };
037 }
038
039
040 # display the entry form
041 print $q->start_form;
042
043 print qq{<p>Choose a datasource:</p>\n};
044 print $q->popup_menu(-name => 'datasource',
045 -values => \@datasources);
046
047 print qq{<p>Specify username/password:</p>\n};
048 print $q->textfield(-name => 'user',
049 -size => 10);
050 print $q->password_field(-name => 'password',
051 -size => 10);
052
053 print qq{<p>Enter a SELECT query:</p>\n};
054 print $q->textarea(-name => 'query',
055 -rows => '5',
056 -cols => '40',
057 -wrap => 'virtual');
058
059 print $q->p, $q->submit;
060 print $q->end_form;
061
062 # get form variables
063 $datasource = $q->param('datasource');
064 $user = $q->param('user');
065 $password = $q->param('password');
066 $query = $q->param('query');
067
068 # check form variables
069 if ($query)
070 {
071 $error = "Improper datasource specified" unless ($datasource =~ /^dbi/i);
072 $error = "Query should start with SELECT" unless ($query =~ /^select/i);
073 }
074
075 # if a query is specified and form variables are OK,
076 if ($query and !$error)
077 {
078 # connect to the database
079 $dbh = DBI->connect($datasource, $user, $password)
080 or $error = "Connection failed: $DBI::errstr";
081
082 # if the database connection worked, send the query
083 unless ($error)
084 {
085 $sth = $dbh->prepare($query)
086 or $error = "Query failed: $DBI::errstr";
087 $sth->execute or $error = "Query failed: $DBI::errstr" unless $error;
088 }
089 }
090
091 # if any errors are present, display the error and exit
092 if ($error) {print $q->p("Error: $error"), $q->end_html and exit;}
093
094 # if the query produced an output,
095 if ($query and $sth->{NAME})
096 {
097 # start a data table
098 print qq{<table border="1">\n};
099 print qq{<tr>\n};
100
101 # display the fields as table headers
102 foreach $field (@{$sth->{NAME}})
103 {
104 print qq{<th>$field</th>\n};
105 }
106 print qq{</tr>\n};
107
108 # display the results in a table
109 while ($results = $sth->fetchrow_arrayref)
110 {
111 print qq{<tr>\n};
112 foreach $result (@$results)
113 {
114 print qq{<td>$result</td>\n};
115 }
116 print qq{</tr>\n};
117 }
118
119 # finish the data table
120 print qq{</table>\n};
121 }
122
123 # finish the page
124 print $q->end_html;
125
126 # disconnect from the database
127 $dbh->disconnect if $dbh;
Incidentally, Listing 12.2 should run in persistent environments just as it does in CGI. This is made possible by using the my keyword in lines 017 and 018 to restrict the scope of program variables. It's also helped by using the strict and warnings pragmas to check the code as it is compiled. Otherwise, the listing doesn't require any stylistic changes to make it executable in a persistent environment.
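The effect of scoping can be sketched without a Web server. The example below only simulates persistence by calling a handler twice within one Perl process; the sub names and the $leaky_error variable are hypothetical, and the details of any real persistent environment are not modeled.

```perl
use strict;
use warnings;

# A package global keeps its value for the life of the Perl interpreter.
our $leaky_error;

# Simulated request handler that forgets to reset a global.
sub handle_with_global {
    my ($query) = @_;
    $leaky_error = "Query should start with SELECT"
        unless $query =~ /^select/i;
    return defined $leaky_error ? "Error: $leaky_error" : "OK";
}

# The same handler with a lexical, which starts undefined on every call.
sub handle_with_my {
    my ($query) = @_;
    my $error;
    $error = "Query should start with SELECT"
        unless $query =~ /^select/i;
    return defined $error ? "Error: $error" : "OK";
}

# Two "requests" served by one long-lived interpreter.
my @queries = ('DROP TABLE users', 'SELECT 1');
print "global:  ", join(' | ', map { handle_with_global($_) } @queries), "\n";
print "lexical: ", join(' | ', map { handle_with_my($_) } @queries), "\n";
```

With the global, the second (valid) request still reports the error left behind by the first; with the lexical, each request starts clean.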
This emphasizes the difference between CGI style and the CGI environment. The CGI environment already has been exposed in previous chapters as a performance drain, which can be overcome by using a persistent environment, as described in Chapter 9, "The Power of Persistence," and Chapter 10, "Tools for Perl Persistence." CGI style, however, is the style of Web application programming encouraged by the limitations of that environment. This includes the following:
- The overall style and look of the code listing. CGI-style code looks like a Perl program from beginning to end because it runs with few assumptions about its environment.
- The code used to perform basic program functions. This includes both standard Perl code sections and code used specifically for CGI functions.
- Whether those functions are needed at all. CGI-style code assumes that all program functions must be performed explicitly. With persistence, some functions can be performed implicitly within the Web server environment.
Process Form Variables
Form variables are the main means of explicit communication between the client browser and Web applications. They usually result from a user filling in information in HTML form fields. An example of this is a username typed into an HTML text field. After it is submitted, the names and values of all form variables are sent along with the page request. These variables then become the parameters used by the Web application to respond to the request appropriately.
Processing form variables can be a difficult task for CGI beginners, but after the form variable problem is solved, there's very little reason to vary the solution. The CGI protocol provides the names and values of all form variables to any CGI program, but the format of these variables is not conducive to easy Perl programming. For example, Listing 12.2 might produce form values in this format:
datasource=dbi:null&user=poppy&password=f8ntom&query=SELECT+*+FROM+dual
The format is standardized and parsable, but writing a parser to split out variables and their values would be too complex a task for most beginner Web applications. Luckily, the problem has been solved in a general way and incorporated into the CGI module. CGI.pm provides form variable access through a set of object methods, one of which is the param method used to retrieve an individual form variable by name. Lines 063-066 of Listing 12.2 retrieve the necessary form variables for the rest of the application.
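To show what the param method spares the programmer, here is a simplified sketch of a hand-written parser for the format above. The parse_query_string sub is hypothetical and handles only the common cases (pairs separated by & or ;, + for spaces, and %XX escapes); it ignores multi-valued fields, POST bodies, and character encodings, all of which a real parser has to address.

```perl
use strict;
use warnings;

# parse_query_string is a hypothetical helper shown for illustration only;
# CGI.pm's param method is the tested way to do this.
sub parse_query_string {
    my ($raw) = @_;
    my %form;
    foreach my $pair (split /[&;]/, $raw) {
        next unless length $pair;                  # skip empty pairs ("a&&b")
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        for ($name, $value) {
            tr/+/ /;                               # "+" encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # "%XX" encodes one byte
        }
        $form{$name} = $value;   # a repeated name keeps only its last value
    }
    return %form;
}

my %form = parse_query_string(
    'datasource=dbi:null&user=poppy&password=f8ntom&query=SELECT+*+FROM+dual'
);
print "$_ = $form{$_}\n" foreach sort keys %form;
```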
In Listing 12.2, only a few form variables need be brought into the program. Thus, they are assigned individually using the param method. To enable the variables to contain a wide range of values, very little checking is performed. The only exceptions are the data source, which should start with DBI, and the SQL query itself, which is limited to SELECT statements to reduce the potential for unanticipated changes to the database. Most Web applications require much more variable checking than this example. For instance, an order form would need more stringent checks on the format of a credit card number or Zip code.
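A sketch of the more stringent checking an order form might perform follows. The check_order_fields sub, the field names, and the patterns themselves are assumptions: a US-style Zip code and a card number of 13 to 19 digits. No checksum or issuer rule is verified, so this is a format check only.

```perl
use strict;
use warnings;

# check_order_fields is a hypothetical helper. The patterns are assumptions:
# a US-style Zip code (5 digits, optional -4) and a card number of 13 to 19
# digits after spaces and hyphens are removed. No checksum is verified.
sub check_order_fields {
    my (%field) = @_;
    my @errors;

    my $zip = defined $field{zip} ? $field{zip} : '';
    push @errors, 'Zip code should look like 12345 or 12345-6789'
        unless $zip =~ /\A\d{5}(?:-\d{4})?\z/;

    my $card = defined $field{card} ? $field{card} : '';
    $card =~ tr/ -//d;   # drop spaces and hyphens
    push @errors, 'Card number should contain 13 to 19 digits'
        unless $card =~ /\A\d{13,19}\z/;

    return @errors;
}

my @errors = check_order_fields(zip => '9021', card => '4111 1111 1111 1111');
print "Error: $_\n" foreach @errors;
```

Collecting the messages in a list, rather than in a single $error scalar, lets the page report every problem at once instead of only the last one found.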
Query a Database
Although database access isn't central to most programming languages, it has become central to Web application programming. As a result, it should be a central part of any embedded Perl programming environment. A database often is used as the central place where a Web application can store and retrieve information. Providing access to this kind of data should be as seamless as possible. In addition, many Web applications are designed to provide a browseable interface to data stored in a database. These applications require a seamless interface between Perl and the database.
Listing 12.2 is devoted to database access. Thus, it makes sense that database methods should be common throughout the program. In addition, the program is designed to accept a wide variety of SELECT queries and display a wide variety of results from those queries. Therefore, the methods it uses to process the queries are more generic than would be found in most Web applications. For instance, line 085 of Listing 12.2 simply passes the SQL query as it is provided by the input form, but most Web applications would compose the database query from a number of form variables based on a SQL template.
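Composing a query from form variables can be sketched as follows. The build_user_query sub, the users table, and its column names are all hypothetical. The sub returns SQL text with ? placeholders along with the values to bind, so that form input never becomes part of the SQL text itself.

```perl
use strict;
use warnings;

# build_user_query is hypothetical, as are the table and column names.
# It returns SQL containing "?" placeholders plus the values to bind,
# so form input never becomes part of the SQL text itself.
sub build_user_query {
    my (%form) = @_;
    my $sql = 'SELECT id, username, realname FROM users';
    my (@where, @bind);

    if (defined $form{username} and length $form{username}) {
        push @where, 'username = ?';
        push @bind,  $form{username};
    }
    if (defined $form{min_id} and $form{min_id} =~ /\A\d+\z/) {
        push @where, 'id >= ?';
        push @bind,  $form{min_id};
    }

    $sql .= ' WHERE ' . join(' AND ', @where) if @where;
    return ($sql, @bind);
}

my ($sql, @bind) = build_user_query(username => 'cmonster', min_id => 10);
print "$sql\n";
print "bind: @bind\n";
```

With DBI, the returned pair would typically be handed to $dbh->prepare($sql) and then $sth->execute(@bind), leaving the quoting of values to the database driver.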
It should be noted that database access and CGI.pm do not overlap at all. CGI.pm doesn't provide any convenience methods for accessing a database in a Web-friendly fashion, nor does it provide shortcuts for handling or displaying the resultant data from queries. As a result, the methods used in Listing 12.2 to interact with the database are provided directly by the DBI module, which is discussed in greater detail in Chapter 14, "Database-Backed Web Sites."
Format Data in HTML
Another core aspect of Web applications is the need to generate HTML-formatted text from data sources. Often the format of the data being presented is known when the application is being designed so that the data can be placed within an HTML context with a fine grain of control over presentation. In some cases, though, the data being displayed can vary widely from one request to the next. In these cases, it's more important to provide a flexible framework in which all possible results are displayed in an understandable fashion, even though some control over formatting layout might be given up. The SQL query processor is one of these programs, as illustrated by the data table generation code in lines 109-117.
The HTML generated in Listing 12.2 largely depends on the results of the SQL query provided. As a result, the exact number of table cells in each row is not known, and it's not possible to present the table in its entirety. Rather, parts of the table are abstracted out and governed by foreach loops. The result of the code ends up looking like the following:
<tr>
<th>id</th>
<th>username</th>
<th>realname</th>
</tr>
<tr>
<td>12</td>
<td>cmonster</td>
<td>Chris Radcliff</td>
</tr>
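The same loop-driven approach can be sketched as a reusable sub that works from plain data structures of the kind DBI supplies through $sth->{NAME} and fetchrow_arrayref. The render_table and escape_html subs are hypothetical helpers. Unlike Listing 12.2, this sketch also escapes each value, on the assumption that the data might contain characters that are special in HTML.

```perl
use strict;
use warnings;

# escape_html and render_table are hypothetical helpers.
sub escape_html {
    my ($text) = @_;
    $text = '' unless defined $text;   # e.g. a NULL column
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# $names is an array reference of column names; $rows is an array
# reference of row array references, whatever their width.
sub render_table {
    my ($names, $rows) = @_;
    my $html = qq{<table border="1">\n<tr>\n};
    $html .= '<th>' . escape_html($_) . "</th>\n" foreach @$names;
    $html .= "</tr>\n";
    foreach my $row (@$rows) {
        $html .= "<tr>\n";
        $html .= '<td>' . escape_html($_) . "</td>\n" foreach @$row;
        $html .= "</tr>\n";
    }
    return $html . "</table>\n";
}

print render_table(
    [qw(id username realname)],
    [[12, 'cmonster', 'Chris Radcliff']],
);
```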
Note that CGI.pm object methods aren't being used to generate the table tags. It would be possible to do so, but very little abstraction would be gained, and the code would have to be more awkward. The CGI module does provide some assistance with creating form fields from data structures, but the majority of other CGI methods seem to be simple copies of the HTML tags they generate. For instance, line 114 could be rewritten as follows:
# the original line:
# print qq{<td>$result</td>\n};
# is replaced by:
print $q->td($result), "\n";
Although the resulting line is more Perlish, the change doesn't add much abstraction because the function name is the same as the resulting HTML tag name. Any additional styles or formatting have to be added to the Perl explicitly to appear in the HTML. For instance, setting the contents of each table cell in bold would require as much Perl as HTML:
# the original line:
# print qq{<td><b>$result</b></td>\n};
# is replaced by:
print $q->td($q->b($result)), "\n";
It is important to note that the CGI.pm style buries HTML formatting in the Perl code. This makes the job of both site designers and programmers even more difficult because an explicit translation step has to be performed between the HTML the site designer expects and the CGI methods the programmer produces. Because there's no intrinsic benefit to making the translation in most cases, it's more likely that the CGI methods for HTML creation will go unused.
HTML::Mason
One of the most common Perl embedding environments is HTML::Mason (http://www.masonhq.com). At the time of this writing, HTML::Mason is also one of the most robust Perl development environments for the Web. In fact, HTML::Mason can be considered on par with commercial efforts, such as Perl Server Pages, and dedicated Web languages, such as PHP.
Mason is presented as a programming environment for reusable Web application segments. It enables Perl to be embedded in HTML pages, and it also enables other pages to be included in the page by invoking them as components. These components can be pure HTML, pure Perl, or a mix of the two, and the interface between the calling component and the called component is defined robustly.
HTML::Mason is best implemented in a persistent environment, but it has few dependencies on platform or architecture. Mason usually is used in conjunction with mod_perl, for instance, but it can just as easily be used in a FastCGI application or as a CGI process. Details of installing and configuring HTML::Mason for a specific platform are subject to change, but instructions can be found at the Mason Web site.
Query Processor Application Changes
The syntax used by HTML::Mason is similar to many other Perl embedding environments, but it is different from CGI.pm style in many important respects. Listing 12.3 is an example of modifications that could be made to the SQL query processor from Listing 12.2 to adapt it for use within the HTML::Mason framework.
Listing 12.3 SQL Processor in HTML::Mason
001 <%args>
002 $query=>''
003 $datasource=>''
004 $user=>''
005 $password=>''
006 </%args>
007
008 <%perl>
009 # declare some variables
010 my ($dbh, $sth, $error, $field, $result, $results);
011 my (@datasources);
012
013 # build a (safe) list of data sources
014 foreach (DBI->available_drivers)
015 {
016 eval {
017 foreach (DBI->data_sources($_))
018 {
019 push @datasources, $_;
020 }
021 };
022 }
023 </%perl>
024
025 <html>
026 <head>
027 <title>SQL Query Browser</title>
028 </head>
029 <body>
030
031 <h3>SQL Query Browser</h3>
032
033 <form method="post">
034 <p>Choose a datasource:</p>
035
036 <select name="datasource">
037 %foreach my $source (@datasources) {
038 % if ($source eq $datasource) {
039 <option selected="selected"><%$source%></option>
040 % } else {
041 <option><%$source%></option>
042 % }
043 %}
044 </select>
045
046 <p>Specify username/password:</p>
047
048 <input type="text" name="user" size="10" value="<%$user%>" />
049
050 <input type="password" name="password" size="10" value="<%$password%>" />
051
052 <p>Enter a SELECT query:</p>
053
054 <textarea name="query" rows="5" cols="40"
055 wrap="virtual"><%$query%></textarea>
056
057 <p><input type="submit" /></p>
058 </form>
059
060 <%perl>
061 if ($query)
062 {
063 # check form variables
064 $error = "Improper datasource specified" unless ($datasource =~ /^dbi/i);
065 $error = "Query should start with SELECT" unless ($query =~ /^select/i);
066
067 unless ($error)
068 {
069 # connect to the database
070 $dbh = DBI->connect($datasource, $user, $password)
071 or $error = "Connection failed: $DBI::errstr";
072
073 # if the database connection worked, send the query
074 unless ($error)
075 {
076 $sth = $dbh->prepare($query)
077 or $error = "Query failed: $DBI::errstr";
078 $sth->execute or $error = "Query failed: $DBI::errstr" unless $error;
079 }
080 }
081
082 # if any errors are present, display the error
083 if ($error)
084 {
085 print qq{<p><font color="red"><b>Error: $error</b></font></p>};
086 }
087
088 # if the query produced an output,
089 if (!$error and $sth->{NAME})
090 {
091 # start a data table
092 print qq{<table border="1">\n};
093 print qq{<tr>\n};
094
095 # display the fields as table headers
096 foreach $field (@{$sth->{NAME}})
097 {
098 print qq{<th>$field</th>\n};
099 }
100 print qq{</tr>\n};
101
102 # display the results in a table
103 while ($results = $sth->fetchrow_arrayref)
104 {
105 print qq{<tr>\n};
106 foreach $result (@$results)
107 {
108 print qq{<td>$result</td>\n};
109 }
110 print qq{</tr>\n};
111 }
112
113 # finish the data table
114 print qq{</table>\n};
115 }
116 }
117 </%perl>
118 </body>
119 </html>
From a structural point of view, Listing 12.3 isn't much different from Listing 12.2. This is due in part to the fact that the SQL query Web application is of medium complexity with a mix of both HTML-heavy and Perl-heavy sections. In contrast, an application with only a few short sections of Perl mixed in with a large amount of HTML would look significantly different from a CGI version of the program, whereas an application with mostly Perl and little HTML would look nearly identical. Because most pages on a complete Web site are more like the former than the latter, stylistic differences between CGI and HTML::Mason are likely to be more noticeable in practice. The SQL query application also is self-contained instead of spread across a number of pages. This deemphasizes one of the strengths of HTML::Mason: the capability to share program components easily among a number of pages.
One stylistic aspect of Mason (and of all the other embedded environments in this chapter) that should be immediately noticeable is the lack of Perl initialization code at the beginning of the program. In fact, the first 14 lines of Listing 12.2 have no counterpart in Listing 12.3 or subsequent examples. This is an acknowledgment of the persistent environment of which these programs are likely to be a part. A program such as this isn't going to be run from a shell. Therefore, there's no #!/usr/bin/perl to indicate where the Perl executable is. Likewise, the page is part of an environment in which DBI and other supporting Perl modules already have been declared and loaded. Thus, there's no need to include modules explicitly at this point. With HTML::Mason specifically, this program is considered to be only one part of a whole Web application that can be called from any other part as easily as from the Web server directly. This shift in thinking does more than save a few lines of initialization at the beginning of each program; it breaks Web applications free of the restraints of the cgi-bin directory and promotes them to first-class citizens that are integrated with the rest of the Web site.
Demarcating Perl Segments
The simplest way to embed Perl in an HTML::Mason page is by enclosing it in a section delimited by <%perl> at the beginning and </%perl> at the end. An embedded Perl section such as this is evaluated as though it were a short Perl program. It then is replaced in the HTML page with the printed results of the section. In addition, the Perl section has access to any variables defined before it that are still in scope (based on my declarations), and variables defined within the section are available to subsequent sections and other embedded structures. This type of embedded Perl segment is common to all environments listed in this chapter, and they all provide roughly the same functionality. Each environment has its own way of evaluating the Perl sections and other embedded structures as the program is executed, but the resultant output is similar enough to be equivalent.
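The idea of evaluating embedded sections in place can be sketched with a toy translator. This is emphatically not how HTML::Mason is implemented; the compile_page and run_page subs are hypothetical, and the sketch understands only <%perl> blocks. It turns the whole page into one anonymous sub, which is one simple way to let a variable declared in one section remain visible in later sections.

```perl
use strict;
use warnings;

# compile_page is a hypothetical toy, not Mason's implementation.
# HTML chunks become print statements; Perl sections are copied verbatim,
# so everything ends up in one sub and shares one lexical scope.
sub compile_page {
    my ($source) = @_;
    my $code = '';
    # split returns HTML chunks and captured Perl sections alternately
    my @parts = split m{<%perl>(.*?)</%perl>}s, $source;
    while (@parts) {
        my $html = shift @parts;
        if (length $html) {
            $html =~ s/([\\'])/\\$1/g;       # protect \ and ' in the literal
            $code .= "print '$html';\n";
        }
        my $perl = shift @parts;
        $code .= "$perl\n" if defined $perl;
    }
    my $sub = eval "sub { $code }" or die "Compile error: $@";
    return $sub;
}

# run_page collects everything the page prints into a string.
sub run_page {
    my ($sub) = @_;
    my $output = '';
    open my $fh, '>', \$output or die "Cannot open in-memory handle: $!";
    my $old = select $fh;     # print now writes to $fh by default
    $sub->();
    select $old;
    close $fh;
    return $output;
}

my $page = compile_page(<<'END_PAGE');
<p>Before</p>
<%perl>
my $total = 2 + 3;
print "<p>Total: $total</p>\n";
</%perl>
<p>After, still in scope: <%perl> print $total; </%perl></p>
END_PAGE

print run_page($page);
```

The second section can print $total only because both sections were compiled into the same sub; evaluating each section separately would lose the variable.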
In a Mason page, this kind of segment is most useful when evaluating multiple lines of Perl, as in lines 008-023 of Listing 12.3. These lines are nearly identical to the corresponding section in Listing 12.2. In both cases, they declare the scope of variables used throughout the rest of the program. They then create a list of data sources and store it in @datasources. Because the Perl code isn't creating or affecting any HTML output directly, it's best to keep this code separated in its own section, as opposed to the HTML-centric sections that follow it. Lines 060-117 of Listing 12.3 also are enclosed in a <%perl> block, but the reasoning is a little less direct. The number of print statements producing HTML would seem to indicate that lines 091-114 at least could be written in an HTML-centric style, but the need for Perl conditionals and loops on practically every other line would make the code confusing as the style shifted back and forth. With so many nested loops, confusing code could make unmatched-bracket errors difficult to trace.
For cases in which a minimal amount of Perl is needed in a section to provide conditionals or loops, individual lines of Perl can be indicated with a percent symbol at the beginning of the line. For instance, lines 036-044 of Listing 12.3 use a number of single Perl statements to provide a list of data source options by looping through the @datasources array.
Although this mix of Perl and HTML is slightly more confusing than would be optimal in most cases, it actually provides one of the simplest ways to directly populate a drop-down list from a Perl variable with a preselected, specified value. Drop-down form elements are particularly difficult to code directly while providing the functionality that site users expect. If a site is likely to use many of them, it's generally advised to create a custom component that handles the specifics of creating the list box. (See the next section, "Using External Program Components.")
As a trivial case, it also would be possible to simply enclose the entire contents of Listing 12.2 in an embedded Perl segment instead of separating out the Perl-centric and HTML-centric sections. This wouldn't provide any of the benefits of embedded Perl, however, and would incur the overhead of parsing the document as though the HTML sections were present. However, there are some cases when it's necessary to write pages using this style. For instance, if a page already is available as a CGI program, it's sometimes easiest to copy the entire program into an embedded Perl segment to offer it quickly. After the environment has been changed, it's possible to adapt smaller program sections to embedded code over time without compromising usability. It's also easier to merge an existing CGI program into a new site style using this process.
Using External Program Components
In HTML::Mason, every page is a component. The Web server calls the first pageknown as the top-level componentwhich then calls any additional Mason components through a tag in the form <& component &>. (HTML::Mason provides other ways to call components, including automatically-called components inserted before the top-level component, but variations on that theme are outside the scope of this book.) For instance, a component file named listbox in the same directory might be called with the following syntax:
<& listbox &>
The listbox component would then be executed and its result included in place. This can be useful for defining common ways to handle difficult constructs, such as the <select> form widget discussed in the previous section. In that case, lines 036-044 of Listing 12.3 could be replaced with the following call to the listbox component:
<& listbox, items=>[@datasources], selected=>$datasource &>
In this case, the relevant data are passed to the component as the named arguments, which are items and selected. The array of items to be listed is passed through reference as items, and the value of the selected item is passed as selected. Naming the arguments this way enables Mason to name them correctly within the component. It's also possible to pass arguments anonymously by simply listing them, but that method is discouraged for the sake of readability and uniformity.
After passing, arguments can be accessed within other Mason components in a few different ways. If named arguments are passed, they can be imported into the component by way of the <%args> block. For instance, the listbox component, as called in the previous example, might contain an <%args> block, which introduces the @items and $selected named arguments like this:
<%args>
@items
$selected => ''
</%args>
<select name="datasource">
% foreach my $item (@items) {
%   if ($item eq $selected) {
<option selected="selected"><% $item %></option>
%   } else {
<option><% $item %></option>
%   }
% }
</select>
This same syntax is used whether arguments are being passed from other components or from the Web server itself. For instance, in Listing 12.3, lines 001-006 make up the <%args> block that defines incoming form variables. The variables are each assigned a default of an empty string to avoid error messages if there are no values present. These variables then are used as normal in the rest of Listing 12.3. For instance, lines 063 and 064 check the $datasource and $query variables, respectively. Alternatively, the %ARGS hash contains all named arguments passed to the component, regardless of whether they have been assigned to variables. For instance, line 063 could access the datasource named argument from this hash directly as $ARGS{datasource}. The choice between declaring named variables or accessing the hash is a matter of style; the former usually is more readable, but the latter serves as a reminder that the values come from an external source. In some cases, it's useful to distinguish between the raw arguments stored in %ARGS and local variables that might have been modified.
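The relationship between declared arguments and the %ARGS hash can be mimicked in plain Perl. The sketch below is only an analogy for what an <%args> block does, not Mason's implementation; the subroutine name unpack_args is hypothetical, and the trimming step is an invented example of modifying a local copy.

```perl
use strict;
use warnings;

# Hypothetical analogy for an <%args> block: named arguments arrive as a
# hash, declared variables receive defaults, and the raw hash stays intact.
sub unpack_args {
    my %ARGS = @_;

    # Declared variables with empty-string defaults.
    my $datasource = defined $ARGS{datasource} ? $ARGS{datasource} : '';
    my $query      = defined $ARGS{query}      ? $ARGS{query}      : '';

    # Modifying the local copy leaves the raw value in %ARGS untouched.
    $query =~ s/^\s+|\s+$//g;

    return ($datasource, $query, \%ARGS);
}

my ($datasource, $query, $raw) = unpack_args(query => '  SELECT 1  ');
print "declared: [$query]\n";          # trimmed local copy
print "raw:      [$raw->{query}]\n";   # original value as passed
```

The missing datasource argument falls back to the empty-string default, while the query value remains available in both its trimmed and its original form.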
HTML::Mason also provides the $m request object for object-oriented access to all aspects of the Mason environment. Most functions of $m are more customizable versions of standard Mason functions, such as the calling of components or the accessing of arguments. In addition, though, the $m object offers a fine grain of control over data caching, session management, and other architectural workings of Mason. The HTML::Mason documentation provides a complete listing of the methods provided by the $m object.
Displaying Inline Variables
It's sometimes useful to include variable values in a block of HTML text. The full syntax of a <%perl>print $value</%perl> block would be cumbersome in these cases. Therefore, Mason provides an alternate syntax specifically for evaluating expressions. Any Perl expression is evaluated inline when enclosed in a block of the form <% expression %>. This kind of block can be used to display a variable, such as $query, or to evaluate and display an expression, such as localtime() or "moo"x3.
In Listing 12.3, lines 046-055 use inline variables to display the default values of each form field as given by the form variable arguments defined in <%args>. Note that this syntax is used only when the variable is included in an HTML block, not when the variable is used in a <%perl> block or other sections of Perl code.
Benefits and Limitations
Mason's emphasis on components makes writing robust template systems much easier because much of the work inherent in combining HTML templates with program segments is handled by HTML::Mason itself. HTML-compatible templates can be incorporated very easily by making the template an automatic top-level component for every Mason file in a directory. Sections of the template page then can be filled in with other shared components, including the original page requested. Further discussion of templates can be found in Chapter 13.
Mason's biggest problem might be one of overabundance. With so many ways to create reusable components, templates, preprocessors, and post-fixers, it's tempting to overengineer a site so that every possible aspect of HTML::Mason is used. The result of such an endeavor by a component novice, however, is likely to be a big mess. Mason provides much more than the average dynamic Web site needs, especially when the site has only recently made the leap from CGI to persistent Perl, let alone embedded Perl. This problem can be overcome by using restraint and implementing new Mason features only when they are called for and well thought out. However, restraint isn't the hallmark of a Web designer. Thus, additional care should be taken when developing a style guide for Mason programmers.
EmbPerl
EmbPerl is another embedded Perl environment for use with mod_perl and other persistent environments. As this book was written, EmbPerl development was headed in the direction of a Mason-style component model called EmbPerl Objects, even though historically EmbPerl was designed for use with simpler, single-page Web application embedding. The older style is reviewed here because EmbPerl Objects still are in a state of flux.
EmbPerl's syntax overall is very similar to HTML::Mason and other embedding systems. Blocks of Perl code are set apart from HTML text with identifiers, and inline variables are displayed using another set of identifiers. EmbPerl has a few more sets of identifiers, however, because it has additional limitations on the type of code each block can contain. This adds complexity that can result in some awkwardness when developing Web applications that need to use each block style in conjunction with the others.
Query Processor Application Changes
The syntax used by EmbPerl is analogous to HTML::Mason, albeit with a different set of block identifiers and a simplified interface to form variables. Listing 12.4 is an example of modifications that could be made to the SQL query processor from Listing 12.2 to adapt it for use within the EmbPerl framework.
Listing 12.4 SQL Processor in EmbPerl
001 [-
002 # declare some variables
003 my ($dbh, $sth, $error, $field, $result, $results, $head, $dat);
004 my (@datasources);
005
006 # build a (safe) list of data sources
007 foreach (DBI->available_drivers)
008 {
009 eval {
010 foreach (DBI->data_sources($_))
011 {
012 push @datasources, $_;
013 }
014 };
015 }
016 -]
017
018 <html>
019 <head>
020 <title>SQL Query Browser</title>
021 </head>
022 <body>
023
024 <h3>SQL Query Browser</h3>
025
026 <form method="post">
027 <p>Choose a datasource:</p>
028
029 <select name="datasource">
030 [-
031 foreach my $source (@datasources)
032 {
033 if ($source eq $fvar{datasource})
034 {
035 print qq{<option selected="selected">$source</option>\n};
036 }
037 else
038 {
039 print qq{<option>$source</option>\n};
040 }
041 }
042 -]
043 </select>
044
045 <p>Specify username/password:</p>
046
047 <input type="text" name="user" size="10" value="[+$fvar{user}+]" />
048
049 <input type="password" name="password"
050 size="10" value="[+$fvar{password}+]" />
051
052 <p>Enter a SELECT query:</p>
053
054 <textarea name="query" rows="5" cols="40"
055 wrap="virtual">[+$fvar{query}+]</textarea>
056
057 <p><input type="submit" /></p>
058 </form>
059
060 [-
061 if ($fvar{query})
062 {
063 # check form variables
064 $error = "Improper datasource specified"
065 unless ($fvar{datasource} =~ /^dbi/i);
066 $error = "Query should start with SELECT"
067 unless ($fvar{query} =~ /^select/i);
068
069 unless ($error)
070 {
071 # connect to the database
072 $dbh = DBI->connect($fvar{datasource}, $fvar{user}, $fvar{password})
073 or $error = "Connection failed: $DBI::errstr";
074
075 # if the database connection worked, send the query
076 unless ($error)
077 {
078 $sth = $dbh->prepare($fvar{query})
079 or $error = "Query failed: $DBI::errstr";
080 $sth->execute or $error = "Query failed: $DBI::errstr";
081 $head = $sth->{NAME};
082 $dat = $sth->fetchall_arrayref;
083 }
084 }
085 }
086 -]
087
088 [$ if ($error) $]
089 <p><font color="red"><b>Error: [+ $error +] </b></font></p>
090 [$ endif $]
091
092 [$ if (!$error and $head) $]
093 <table border="1">
094 <tr><th>[+ $head->[$col] +]</th></tr>
095 <tr><td>[+ $dat->[$row][$col] +]</td></tr>
096 </table>
097 [$ endif $]
098
099 </body>
100 </html>
Again, the major difference between EmbPerl style and CGI style is at the beginning of the program, because the use of Perl and the inclusion of common modules are assumed. In addition, Listing 12.2 could have been rewritten in EmbPerl simply by enclosing the entire listing in [- and -] block identifiers, with additional native EmbPerl syntax added to the program as it became necessary. EmbPerl style is noticeably different from HTML::Mason toward the end of Listing 12.4, which shows both EmbPerl's limitations, which stem from the way the pages are processed, and its built-in database handling features.
Demarcating Perl Segments
Embedded Perl sections are denoted a few different ways in EmbPerl, depending on how they should be treated. Complete Perl segments that can be executed independently are set apart by enclosing them in [- and -] identifiers. Lines 060-086 of Listing 12.4, for instance, make up a block that checks form variables, creates a database connection, and sends the specified query to the database. Unlike blocks in HTML::Mason and the other environments discussed in this chapter, these blocks must be complete blocks of Perl code. This means that loops, conditional blocks, eval blocks, and subroutines can't contain a mix of these Perl blocks and HTML sections.
EmbPerl does provide a way to mix conditional and loop code with HTML sections. Blocks denoted by [$ $] are used for program structures such as if, for, and while, which are called meta commands by EmbPerl. Unfortunately, the code used within these blocks is not Perl, but a Perl-like command language used specifically for these purposes. For instance, in Listing 12.4, lines 088-090 comprise a conditional block with the value of the variable $error as the condition. Conditional blocks of this nature start with the usual if, but they aren't ended by closing braces. Instead, a shell-like endif command is used.
Like HTML::Mason, EmbPerl provides a simplified syntax to evaluate an expression. The [+ +] block denotes a program segment, which should be evaluated as an expression with the result displayed in place. For example, line 089 of Listing 12.4 displays an error message inline as defined by the variable $error. This syntax enables the variable to be set apart from surrounding HTML in an unobtrusive manner.
One of EmbPerl's strengths is the automated table-creation features that it includes. When provided with an array reference called $head and a reference to an array of arrays called $dat, EmbPerl can use the inherent variables $row and $col to iterate over an entire data set and display it in a formatted table. For example, lines 093-096 of Listing 12.4 create a data table consisting of the results of the specified SQL query, including column headers with the names of the result fields. This short block of HTML (and the EmbPerl automation behind it) does the work of a large chunk of Listing 12.2, replacing over 20 lines of Perl code to accomplish the same result. Although the cases to which this facility applies are few, Web application designers could use more shortcuts of this nature.
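To make the automation concrete, the following plain-Perl sketch produces roughly the markup that the two template rows on lines 094 and 095 of Listing 12.4 stand for. It illustrates the idea only and is not EmbPerl's implementation; the function name expand_table is hypothetical, and the sample data are invented.

```perl
use strict;
use warnings;

# Hypothetical illustration: expand a header array and an array of arrays
# into an HTML table, iterating by row and column index.
sub expand_table {
    my ($head, $dat) = @_;
    my $html = qq{<table border="1">\n};

    # One header row: a <th> for each column index.
    $html .= '<tr>';
    for my $col (0 .. $#{$head}) {
        $html .= "<th>$head->[$col]</th>";
    }
    $html .= "</tr>\n";

    # One data row per entry in $dat, one <td> per column.
    for my $row (0 .. $#{$dat}) {
        $html .= '<tr>';
        for my $col (0 .. $#{ $dat->[$row] }) {
            $html .= "<td>$dat->[$row][$col]</td>";
        }
        $html .= "</tr>\n";
    }

    $html .= "</table>\n";
    return $html;
}

# Invented sample data standing in for the column names and fetched rows.
my $head = [ 'id', 'name' ];
my $dat  = [ [ 1, 'alpha' ], [ 2, 'beta' ] ];
print expand_table($head, $dat);
```

The two nested loops are the work that the template leaves implicit when it refers to $row and $col without declaring any loop.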
Form Variables
Form variables are provided by EmbPerl through the %fvar hash variable. Similar to Mason's %ARGS variable, %fvar holds one named entry for each form variable provided. You access the variables the same way as you do any other hash values. As a result, form variables can be used either in place (such as the way they are in lines 063-068 of Listing 12.4) or by assigning them to other variables before use, as shown in the following code:
# check form variables
$error = "Improper datasource specified"
unless ($fvar{datasource} =~ /^dbi/i);
$error = "Query should start with SELECT"
unless ($fvar{query} =~ /^select/i);
In these lines, the form variables are accessed directly by reading them from the %fvar hash each time they are needed. This is sometimes used to set form variables apart from internal program variables such as $error. Some Web applications store only modified values in internal variables, which leaves the form variable values in their original state. Note that no additional steps need be taken to import the form variables, as opposed to CGI-style programs that have to parse the variables explicitly.
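The practice of keeping submitted values separate from cleaned internal variables can be sketched in plain Perl. Here an ordinary hash stands in for the form variable hash so that the example runs outside any embedding environment; the check_form subroutine, the whitespace trimming, and the sample values are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical validator mirroring the checks shown above: it reads the
# form hash and returns an error string, or undef if both checks pass.
sub check_form {
    my ($form) = @_;
    my $error;

    # Work on cleaned copies so the submitted values remain unmodified.
    my $datasource = defined $form->{datasource} ? $form->{datasource} : '';
    my $query      = defined $form->{query}      ? $form->{query}      : '';
    s/^\s+// for $datasource, $query;

    $error = "Improper datasource specified"
        unless $datasource =~ /^dbi/i;
    $error = "Query should start with SELECT"
        unless $query =~ /^select/i;

    return $error;
}

my %form = (datasource => 'dbi:mysql:test', query => ' select * from users');
my $error = check_form(\%form);
print defined $error ? "Error: $error\n" : "Form looks valid\n";
```

Because the checks run against local copies, the hash still holds exactly what the user submitted if the page needs to redisplay it.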
Benefits and Limitations
EmbPerl serves well its purpose as a simple Perl-embedding environment. It provides persistent execution for Perl code embedded in HTML pages. It also provides convenient access to form variables. One of the most useful aspects of EmbPerl is the table autocreation facility, which shows off the capability of an environment to truly reduce the lines of code necessary to perform a common Web application task.
EmbPerl's biggest limitation is the awkward nature of its embedded Perl blocks. Different types of blocks have to be used for different types of code, and the program structure block (probably the most common type to use on a Web site) is implemented in meta-code, not Perl itself. Although this problem is slated to be addressed in version 2.0 of EmbPerl, the solution is yet another type of block with new syntax and uncertain applications. Remembering the usage differences between all these varieties of blocks is difficult, and the resultant mix of block types with similar delimiters can cause confusion and lead to unnecessary errors.
Apache::ASP
Apache::ASP is an embedding environment that is designed to provide an Apache version of the PerlScript syntax of Active Server Pages (ASP), which is an environment used with the Internet Information Server (IIS) under Windows operating systems.
The syntax used by Apache::ASP is again very similar to Mason and EmbPerl, with Perl code sections set apart from HTML text by delimiters and inline variables and expressions denoted by a similar set of delimiters. The delimiter symbols used by Apache::ASP are identical to those used in ASP under IIS. Thus, the format should be very familiar to programmers who are used to ASP-style code. In addition, interface objects for Web server interaction, including form variable access and result header declarations, are very similar to the objects used in ASP under IIS.
Apache::ASP has the benefit of wide acceptance among graphic HTML editors and Web application development environments. The ASP style likely is recognized by a wide variety of client-side tools (mostly in Windows environments), including graphic editors. Therefore, editors that produce valid Apache::ASP code without damaging the underlying Perl code are more common than with some lesser-known embedding styles.
Query Processor Application Changes
The CGI-style query processor in Listing 12.2 would have many of the same changes made to it in order to use it with Apache::ASP as Listing 12.3 did to make it compatible with HTML::Mason. There are a few noticeable differences, however, which are outlined in Listing 12.5.
Listing 12.5 SQL Processor in Apache::ASP
001 <%
002 # declare some variables
003 my ($dbh, $sth, $error, $field, $result, $results);
004 my (@datasources);
005 my %QUERY = %{$Request->Form};
006
007 # build a (safe) list of data sources
008 foreach (DBI->available_drivers)
009 {
010 eval {
011 foreach (DBI->data_sources($_))
012 {
013 push @datasources, $_;
014 }
015 };
016 }
017 %>
018
019 <html>
020 <head>
021 <title>SQL Query Browser</title>
022 </head>
023 <body>
024
025 <h3>SQL Query Browser</h3>
026
027 <form method="post">
028 <p>Choose a datasource:</p>
029
030 <select name="datasource">
031 <%
032 foreach my $source (@datasources)
033 {
034 if ($source eq $QUERY{datasource})
035 {
036 print qq{<option selected="selected">$source</option>\n};
037 }
038 else
039 {
040 print qq{<option>$source</option>\n};
041 }
042 }
043 %>
044 </select>
045
046 <p>Specify username/password:</p>
047
048 <input type="text" name="user" size="10" value="<%=$QUERY{user}%>" />
049
050 <input type="password" name="password" size="10" value="<%=$QUERY{password}%>" />
051
052 <p>Enter a SELECT query:</p>
053
054 <textarea name="query" rows="5" cols="40" wrap="virtual"><%=$QUERY{query}%></textarea>
055
056 <p><input type="submit" /></p>
057 </form>
058
059 <%
060 if ($QUERY{query})
061 {
062 # check form variables
063 $error = "Improper datasource specified" unless ($QUERY{datasource} =~
064 /^dbi/i);
065 $error = "Query should start with SELECT" unless ($QUERY{query} =~ /^select/i);
066
067 unless ($error)
068 {
069 # connect to the database
070 $dbh = DBI->connect($QUERY{datasource}, $QUERY{user}, $QUERY{password})
071 or $error = "Connection failed: $DBI::errstr";
072
073 # if the database connection worked, send the query
074 unless ($error)
075 {
076 $sth = $dbh->prepare($QUERY{query})
077 or $error = "Query failed: $DBI::errstr";
078 $sth->execute or $error = "Query failed: $DBI::errstr";
079 }
080 }
081
082 # if any errors are present, display the error
083 if ($error)
084 {
085 print qq{<p><font color="red"><b>Error: $error</b></font></p>};
086 }
087
088 # if the query produced an output,
089 if (!$error and $sth->{NAME})
090 {
091 # start a data table
092 print qq{<table border="1">\n};
093 print qq{<tr>\n};
094
095 # display the fields as table headers
096 foreach $field (@{$sth->{NAME}})
097 {
098 print qq{<th>$field</th>\n};
099 }
100 print qq{</tr>\n};
101
102 # display the results in a table
103 while ($results = $sth->fetchrow_arrayref)
104 {
105 print qq{<tr>\n};
106 foreach $result (@$results)
107 {
108 print qq{<td>$result</td>\n};
109 }
110 print qq{</tr>\n};
111 }
112
113 # finish the data table
114 print qq{</table>\n};
115 }
116 }
117 %>
118 </body>
119 </html>
The simplicity of the ASP style provides a clean way to embed all sorts of Perl code into an HTML page. However, Listing 12.5 shows that very little appreciable difference exists between Apache::ASP style and CGI style because of that same simplicity. Apache::ASP doesn't provide much in the way of additional Web application abstraction or componentization. Thus, the biggest difference between ASP-style and CGI-style code would be seen in pages that consist mostly of HTML with only a few Perl segments.
Demarcating Perl Segments
Perl segments are noted in Apache::ASP using the <% %> delimiters. As with HTML::Mason or EmbPerl, any Perl code placed within the delimiters is evaluated by Apache::ASP as though it were a Perl program. Unlike EmbPerl, though, any kind of Perl code is valid within the delimiters, and only one kind of delimiter is used for most code sections. For instance, lines 031-043 of Listing 12.5 (which display a drop-down list of data sources based on the @datasources array) could be listed as they are or rewritten as the following:
<%
foreach my $source (@datasources)
{
if ($source eq $QUERY{datasource})
{
%>
<option selected="selected"><%= $source %></option>
<%
}
else
{
%>
<option><%= $source %></option>
<%
}
}
%>
This version of the code would be perfectly valid even though it breaks up a Perl loop between delineated blocks. (It doesn't add much readability, though.) For inline variables, Apache::ASP provides the <%= %> delimiter syntax, which evaluates the expression within the block and displays the result inline. Line 048 of Listing 12.5 (which displays an input text box for the username) uses the inline display notation to show the existing value of the user form variable.
Form Variables
Because of Apache::ASP's inheritance of the ASP programming style, form variables are only accessible from the $Request object through the Form method. The method provides a single form variable value if provided a variable name, or it provides a hash of all the form variables if not. Unfortunately, there's no direct access to form variables. Thus, the usual way to access the variables would have to involve CGI-style statements assigning each variable in turn.
A different approach was taken in Listing 12.5. Because the Form method provides a hash of form variables and their values if no variable name is specified, line 005 creates a hash called %QUERY to store all the form variables for direct access. This hash then can be used in the same manner as those provided by Mason or EmbPerl, as seen in line 048 and onward. This also provides a general means by which the behavior of one embedding environment can be modified to simulate others: the hash could just as easily be called %fvar, %ARGS, or a combination of these if there were already code using one style or another.
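The two calling styles of the Form method, and the copy made on line 005 of Listing 12.5, can be tried outside a Web server with a stand-in object. The sketch below is not Apache::ASP code: the class My::MockRequest is hypothetical and imitates only the Form behavior described in this section.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the $Request object, imitating only the Form
# behavior described in the text.
package My::MockRequest;

sub new {
    my ($class, %form) = @_;
    return bless { form => {%form} }, $class;
}

# With a name, return that one value; without, return a hash reference.
sub Form {
    my ($self, $name) = @_;
    return defined $name ? $self->{form}{$name} : $self->{form};
}

package main;

# Invented sample form data.
my $Request = My::MockRequest->new(user => 'webuser', query => 'SELECT 1');

# Copy every form variable into a local hash for direct access.
my %QUERY = %{ $Request->Form };

# The same data under a name used by another environment.
my %ARGS = %QUERY;

print "user via method: ", $Request->Form('user'), "\n";
print "user via hash:   $QUERY{user}\n";
print "user via alias:  $ARGS{user}\n";
```

Note that the copies are independent of the object, so later changes to %QUERY or %ARGS do not alter what the Form method returns.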
Benefits and Limitations
The ASP syntax is simple and familiar to many Web programmers. Because ASP is a common format, many Web application editors provide additional features for ASP. These features include syntax highlighting and code previews that are unavailable for other embedded Perl styles. This familiarity of ASP data structures also means that Web programmers might have an easier time adapting existing ASP skill sets to Apache::ASP. In the other direction, writing pages in an ASP style enables the pages to be ported to PerlScript for IIS, if that becomes necessary.
The utility of Apache::ASP is limited by its need to be similar to the VBScript or PerlScript implementations of ASP. As an example, form variables are accessible by two different methods on the $Request object: one works only for POST requests and one works only for GET requests. Because many Web applications need to enable both types of requests, this creates a difficult hurdle for application developers where none is necessary.
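One way around this hurdle is to merge both sets of variables into a single hash at the top of each page. The sketch below uses plain hash references in place of the two $Request methods so that it runs on its own; the helper name merge_params is hypothetical, and the decision to let POST values override GET values is an assumption of this example rather than a rule of Apache::ASP.

```perl
use strict;
use warnings;

# Hypothetical helper: combine GET and POST variables into one hash.
# Later entries take precedence, so POST values override GET values here.
sub merge_params {
    my ($get, $post) = @_;
    return ( %{ $get || {} }, %{ $post || {} } );
}

# Invented sample data standing in for the two kinds of request input.
my $get  = { datasource => 'dbi:mysql:test', user => 'guest' };
my $post = { user => 'webuser', query => 'SELECT 1' };

my %QUERY = merge_params($get, $post);
print "$_ = $QUERY{$_}\n" for sort keys %QUERY;
```

With a merged hash in place, the rest of the page no longer needs to know which request method delivered a given value.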
Perl Server Pages
Perl Server Pages (PSP) is a format originally designed for use with Version 3 of the VelociGen application environment. It is an extension of the original embedded mode of VelociGen, which was similar to Mason, EmbPerl, and Apache::ASP in that it enabled Perl segments to be set in HTML pages by surrounding the code in delimiters. The original environment also provided simple program components and automated form variable parsing.
PSP extends the original embedding concept by providing an intermediate language, which looks more like ColdFusion or Java Server Pages (JSP) than embedded Perl. This templating language consists of HTML-like tags that perform functions based on definitions written in Perl, HTML, or the PSP language itself. This results in an abstracted language that can be further abstracted and customized as necessary using either inline definitions or external definition files.
Coinciding with the release of this book, VelociGen has kindly agreed to release a reference implementation of PSP for the Apache server as open source software under the Artistic License.
Query Processor Application Changes
At its core, the PSP style is very similar to the style used with HTML::Mason, EmbPerl, and Apache::ASP. As a result, it's possible to make the PSP version of the SQL query processor in Listing 12.2 look very similar to the other listing styles in this chapter. Listing 12.6 is one example of this style.
Listing 12.6 SQL Processor in PSP
001 <perl>
002 # declare some variables
003 my ($dbh, $sth, $error, $field, $result, $results);
004 my (@datasources);
005
006 # build a (safe) list of data sources
007 foreach (DBI->available_drivers)
008 {
009 eval {
010 foreach (DBI->data_sources($_))
011 {
012 push @datasources, $_;
013 }
014 };
015 }
016 </perl>
017
018 <include file="$ENV{DOCUMENT_ROOT}/page.psp" />
019 <template title="SQL Query Browser" section="Admin">
020
021 <form method="post">
022 <output>
023 <p>Choose a datasource:</p>
024
025 <select name="datasource">
026 <loop name="source" list="@datasources">
027 <if cond="$source eq $QUERY{datasource}">
028 <option selected="selected">$source</option>
029 <else />
030 <option>$source</option>
031 </if>
032 </loop>
033 </select>
034
035 <p>Specify username/password:</p>
036
037 <input type="text" name="user" size="10" value="$QUERY{user}" />
038
039 <input type="password" name="password" size="10" value="$QUERY{password}" />
040
041 <p>Enter a SELECT query:</p>
042
043 <textarea name="query" rows="5" cols="40" wrap="virtual">$QUERY{query}</textarea>
044
045 <p><input type="submit" /></p>
046 </output>
047 </form>
048
049 <perl>
050 if ($QUERY{query})
051 {
052 # check form variables
053 $error = "Improper datasource specified" unless ($QUERY{datasource} =~ /^dbi/i);
054 $error = "Query should start with SELECT" unless ($QUERY{query} =~ /^select/i);
055
056 unless ($error)
057 {
058 # connect to the database
059 $dbh = DBI->connect($QUERY{datasource}, $QUERY{user}, $QUERY{password})
060 or $error = "Connection failed: $DBI::errstr";
061
062 # if the database connection worked, send the query
063 unless ($error)
064 {
065 $sth = $dbh->prepare($QUERY{query})
066 or $error = "Query failed: $DBI::errstr";
067 $sth->execute or $error = "Query failed: $DBI::errstr";
068 }
069 }
070
071 # if any errors are present, display the error
072 if ($error)
073 {
074 print qq{<p><font color="red"><b>Error: $error</b></font></p>};
075 }
076
077 # if the query produced an output,
078 if (!$error and $sth->{NAME})
079 {
080 # start a data table
081 print qq{<table border="1">\n};
082 print qq{<tr>\n};
083
084 # display the fields as table headers
085 foreach $field (@{$sth->{NAME}})
086 {
087 print qq{<th>$field</th>\n};
088 }
089 print qq{</tr>\n};
090
091 # display the results in a table
092 while ($results = $sth->fetchrow_arrayref)
093 {
094 print qq{<tr>\n};
095 foreach $result (@$results)
096 {
097 print qq{<td>$result</td>\n};
098 }
099 print qq{</tr>\n};
100 }
101
102 # finish the data table
103 print qq{</table>\n};
104 }
105 }
106 </perl>
107 </template>
Much in the way that a CGI program could be copied verbatim into an embedded page and used as the starting point for further tuning and abstraction, Listing 12.6 could be used as a starting point for further abstraction using PSP. The PSP tag style requires some planning to define a reasonable set of tags for the right level of abstraction, though. Thus, the real benefit of increased abstraction comes when a large group of Web application pages on a site are developed using a core set of custom tags.
One tag used in this listing, the <template> tag, isn't discussed until Chapter 13. For now, suffice it to say that this template provides the basic structure of an HTML page into which the rest of the printed HTML is set before the combined result is returned.
Demarcating Perl Segments
Perl segments in PSP pages are enclosed using the <perl> tag. This method is similar to the segment styles of Mason, EmbPerl, and Apache::ASP. However, there's only one type of Perl segment within a PSP page, which is used for any kind of Perl statements, including portions of loops and other blocks.
The %COOKIE, %QUERY, and %ENV hashes are provided automatically as interfaces to the environment. The %ENV hash is an augmented version of the environment variables provided by Perl and the Web server, with additional entries for the location of the document root directory and similar PSP-centric information. The %QUERY hash provides access to form variables in a way that is similar to the way in which the %ARGS hash in Mason or the %fvar hash in EmbPerl works. Form variables are used directly in lines 053 and 054 of Listing 12.6, for example. In addition, the %COOKIE hash provides access to data stored in client-side cookies, which can be useful for state management and automatic user recognition.
Displaying Inline Variables
Inline variable display is handled differently in PSP than in most other embedded Perl environments. PSP takes the approach that variables are likely to be used only in certain sections of the page. In that case, it might be better to treat the sections as a whole rather than separating out individual program statements.
For this purpose, PSP provides the <output> tag. Text within the <output> tag is treated in a way that is similar to the way Perl treats a double-quoted string: variables in the text are replaced by the values of the variables as defined. In contrast, if a variable name is present outside the <output> tag, it is printed just as listed. Lines 037 and 039 of Listing 12.6, for instance, use the values of form variables as the default values for input text boxes. This is made possible by the <output> tag started on line 022 and ended on line 046, just inside the <form> tag.
The <output> tag treats undefined variables differently than does a Perl double-quoted string. If a variable is used within an <output> tag that hasn't been defined (as opposed to one that is empty or that holds a value), it is displayed as though there were no <output> tag in effect. This makes it easier to display a dollar sign in normal contexts without having to escape it.
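The substitution rule described above can be imitated in a few lines of plain Perl. The sketch below is not the PSP parser: the function name interpolate_output is hypothetical, and it handles only simple scalar names (not hash entries such as $QUERY{user}). It shows how defined variables are replaced while undefined ones are left as written.

```perl
use strict;
use warnings;

# Hypothetical imitation of the <output> rule for simple scalars:
# defined variables are replaced by their values; undefined ones are left
# in the text exactly as written.
sub interpolate_output {
    my ($text, $vars) = @_;
    $text =~ s{\$(\w+)}{ defined $vars->{$1} ? $vars->{$1} : "\$$1" }ge;
    return $text;
}

# An empty string counts as defined, so $note is replaced by nothing,
# while $5 has no definition and is left alone.
my %vars = (item => 'widget', note => '');
print interpolate_output('The $item costs $5 today$note.', \%vars), "\n";
```

The distinction between an empty value and an undefined one is what allows a literal dollar amount to pass through untouched.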
Using Standard PSP Tags
A host of standard PSP tags are provided with the reference implementation. Most of these tags are defined within a central tag library also written as a PSP page, but some, like the <tag> tag and other fundamental tags, are implemented within the PSP parser. The standard tags provide a library of functions likely to be needed for common Web applications, including loop and conditional processing, database connectivity, client access to the Web, and basic XML processing.
For example, lines 025-033 of Listing 12.6 create a drop-down list of data sources based on the @datasources array with the chosen data source highlighted. Line 026 uses the <loop> tag to start a loop over the values of the @datasources array, setting the $source variable to the current loop value. Line 027 starts a conditional block with the <if> tag, and line 029 partitions the results of a true condition from the result of a false condition with an <else> tag.
Obviously, database-oriented tags would be useful for Listing 12.6 as well. They are omitted here only because the database tags included with the PSP reference implementation are better suited to a specific data source instead of the drop-down choice given by the datasource query variable. Translating the DBI-format datasource given by this variable into the database and data source variables needed by the <sql> tag would require additional steps that are unnecessary when the existing code works just as well. For examples of the <sql> tag and database access with PSP pages, see Chapter 14.
Declaring New PSP Tags
PSP provides the <tag> tag to enable the definition of new tags. This isn't used within Listing 12.6, but a custom tag could easily have been defined within the page. For instance, a tag could be defined to produce the drop-down box in lines 025-033 in a more readable fashion. The page would use the <tag> tag to provide a tag definition, and it would then use the tag itself in place of the larger code block, as shown in the following code:
<tag name="dropdown" accepts="name, values, default">
<perl>my @values = @{$values}</perl>
<select name="$name">
<loop name="item" list="@values">
<if cond="$item eq $default">
<option selected="selected">$item</option>
<else />
<option>$item</option>
</if>
</loop>
</select>
</tag>
<perl>my $datasources = \@datasources</perl>
<dropdown name="datasource" values="$datasources" default="$datasource" />
The <tag> tag itself takes the two attributes, name and accepts, among others. The name attribute specifies the name of the tag being defined ("dropdown" in this case), and the accepts attribute defines the attributes that can be passed from the tag for use as variables within the definition. In this case, the <dropdown> tag can have attributes called name, values, and default, each of which is assigned to a scalar variable of the same name for use within the definition. After being defined, the tag can be used immediately to create the drop-down box, just as the block of code in Listing 12.6 does. Of course, this kind of tag definition wouldn't save any time, effort, or space in this one program. However, if the definition were placed in a library file with other useful definitions and used throughout the site, the resulting abstraction would make each page more readable in the long run.
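For readers who think more readily in plain Perl, the way the accepts attributes become variables inside the definition can be compared to named arguments passed to a subroutine. The following is only a rough analogy, not the code that the PSP parser generates; the dropdown subroutine and the sample data source names are hypothetical, and values are not HTML-escaped in this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of PSP: builds the same markup that the
# <dropdown> tag definition describes, using named arguments in place of
# tag attributes.
sub dropdown {
    my (%args) = @_;
    my $name    = $args{name};
    my @values  = @{ $args{values} };
    my $default = defined $args{default} ? $args{default} : '';

    my $html = qq{<select name="$name">\n};
    for my $item (@values) {
        # Mark the option that matches the current selection.
        if ($item eq $default) {
            $html .= qq{<option selected="selected">$item</option>\n};
        }
        else {
            $html .= qq{<option>$item</option>\n};
        }
    }
    $html .= "</select>\n";
    return $html;
}

# Sample data (assumed for illustration only).
my @datasources = ('dbi:mysql:test', 'dbi:Oracle:prod');
print dropdown(
    name    => 'datasource',
    values  => \@datasources,
    default => 'dbi:Oracle:prod',
);
```

As with the tag version, the values argument is passed as an array reference and dereferenced inside the definition.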
It's possible to override existing HTML tags with custom tags, but it is not advised. Conflicts and unwanted parsing loops can occur when the HTML tags are used within their custom counterparts. For instance, it might seem reasonable to override the <td> HTML tag with a PSP tag of the same name that automatically sets the font style within the table cell. However, after that tag is defined, there's no way to specify a <td> tag without that formatting because it always is interpreted by the PSP page parser. It would be more advisable to create a tag with a different name, such as <mytd>, so that the original <td> tag remains available.
Using External Components
PSP provides the <include> tag for parsing and including other PSP pages. These pages are handled more like server-side includes than HTML::Mason components because the page doesn't take any arguments when it is being included. (Tags themselves handle that aspect of componentization instead.) For instance, line 018 of Listing 12.6 includes a file called page.psp that contains the definition of a <template> tag:
Listing 16.
<include file="$ENV{DOCUMENT_ROOT}/page.psp" />
The file attribute indicates the absolute or relative path to the file to include, which could be either a PSP file or another type of text file. In the case of a PSP file, the file is compiled in place of the <include> tag and incorporated into the main program. As shown with the <template> tag and the page.psp file, the <include> tag can be used to create simple site templates. More templating ideas can be found in Chapter 13.
The <include> tag also can be used to create libraries of PSP tags. One such library is the default library included with the PSP parser. These libraries are simply collections of PSP tag definitions using the <tag> tag, which define a set of custom tags for use in other PSP pages on the site. Tag libraries can be included explicitly in each file that uses them, or each tag can be defined as global to all PSP pages by setting a global attribute in the tag definition. The library then has to be loaded only once, when the application engine starts, to be accessible to all other pages.
Benefits and Limitations
The PSP syntax is very close to HTML syntax, which makes it a good choice for a templating and high-level Web application language. In addition, the capability to create custom tags makes the tag system infinitely customizable. Tag sets can be created for a Web application as the application is being defined, and the resulting code can be cleaned up considerably as a result.
A limitation of the PSP style comes from the HTML-style tag syntax, in which it can be awkward to express programmatic constructs. For instance, the <if> <else/> </if> construct is a bit awkward because the else block is started by an empty tag while the if tag extends beyond the end of its corresponding block. HTML-style syntax also can cause code to become less distinguishable from straight HTML, making it more difficult to get a sense of the program flow from the structure of the listing.
In addition, the way PSP pages are parsed casts some uncertainty on the execution order of program segments because it's possible that tags will be evaluated from the interior out, instead of from first to last. This becomes most apparent when errors occur, because the resulting messages are less than informative. These messages reference the program that was generated from the PSP code, which usually bears little resemblance to the PSP code itself.
Performance Issues with Embedding
Embedding is a good choice for most Web application needs, but it isn't always the best choice. For one, embedding Perl code in HTML can incur additional overhead. Usually this comes from the increased time taken when compiling an embedded page into a Perl application. It also comes from the increased memory usage created by the page parser and related code.
Additionally, developing a complex application in embedded mode often requires additional work to decipher error messages generated by Perl. Because the embedded Perl code is interspersed with HTML, it's sometimes difficult to use standard code-scanning techniques to get a sense of how the program is supposed to operate and where the error comes from. Occasionally, these concerns can outweigh the benefits of embedding code in Web pages.
Increased Overhead
Embedded Perl code isn't necessarily as efficient overall as more traditional persistent Perl. This is due to a number of factors, not all of which apply to every embedding environment. For one, not all embedded code is processed the same way. Some environments translate the embedded pages into pure-Perl code before compiling. Other environments compile Perl blocks individually and then treat them more like server-side includes when composing a page in response to a request. The results of these behind-the-scenes changes are rarely controllable and can sometimes yield fundamentally less efficient code than would be produced by hand.
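To make the first approach concrete, here is a minimal sketch of translating a page into pure-Perl code before compiling it. It is not the translator used by PSP or any other environment named in this chapter; the translate_page subroutine and the convention that embedded code appends to a variable called $out are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical translator: turns a page containing <perl>...</perl> blocks
# into the source text of one anonymous Perl subroutine.
sub translate_page {
    my ($page) = @_;
    my $code = "sub {\n    my \$out = '';\n";

    # The capturing group makes split return the block contents too, so the
    # list alternates between literal HTML and Perl code.
    my @chunks = split m{<perl>(.*?)</perl>}s, $page;
    while (@chunks) {
        my $html = shift @chunks;
        if (length $html) {
            # Escape backslashes and single quotes, then emit a literal.
            $html =~ s/([\\'])/\\$1/g;
            $code .= "    \$out .= '$html';\n";
        }
        my $perl = shift @chunks;
        $code .= "    $perl;\n" if defined $perl;
    }
    return $code . "    return \$out;\n}\n";
}

# Translate once, compile once, then call the result for each request.
my $source = translate_page(q{<p>Total: <perl>$out .= 2 + 3</perl></p>});
my $page   = eval $source or die $@;
print $page->(), "\n";
```

Even in this tiny case, the generated subroutine contains more statements than a hand-written version would need, which hints at where the inefficiency described above can creep in.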
Another source of overhead with embedding is the parser itself. No matter how it's implemented, the parser used to separate program segments from HTML has to incur some overhead. It might take additional time to parse a page before each request, or it might just take up additional memory while it waits for new pages to be parsed into pure-Perl programs. The resultant overhead is sometimes large enough to make the use of embedding prohibitive, although in most mixed-code Web applications, this isn't the case.
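The trade-off between parsing time and memory can be illustrated with a small cache that keeps compiled pages in memory and reparses a page only when its file changes. This is a sketch under stated assumptions rather than the mechanism of any particular environment: the compiled_page subroutine, the %cache hash, and the caller-supplied $compile code reference are all hypothetical names.

```perl
use strict;
use warnings;

my %cache;   # path => { mtime => ..., code => ... }

# Returns the compiled form of the page at $path. $compile is a code
# reference supplied by the caller (an assumption of this sketch) that
# turns page text into a code reference.
sub compiled_page {
    my ($path, $compile) = @_;
    my $mtime = (stat $path)[9];
    die "Cannot stat $path: $!" unless defined $mtime;

    my $entry = $cache{$path};
    if (!$entry || $entry->{mtime} != $mtime) {
        # First request, or the file changed on disk: parse it again.
        open my $fh, '<', $path or die "Cannot open $path: $!";
        my $text = do { local $/; <$fh> };
        close $fh;
        $entry = $cache{$path} = { mtime => $mtime, code => $compile->($text) };
    }
    return $entry->{code};
}
```

Each cached entry costs memory for as long as the process lives, but repeated requests for an unchanged page skip the parsing step entirely.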
Error messages also can be difficult to decipher due to translations occurring behind the scenes. Perl normally identifies errors based on the line number of the offending code and (in the case of a syntax error) the code surrounding it for context. Unfortunately, both clues correspond to the post-translated code rather than the pretranslated embedded file. Thus, it might be difficult to match up the error to its location and circumstance. In addition, errors can be introduced by the parsing step itself, especially in the case in which a Perl block was not properly delineated.
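Perl itself offers one way for a translator to soften this problem: a comment of the form # line N "file" changes the line number and filename that Perl reports for the code that follows. Whether a given embedding environment emits such directives is not something this chapter establishes, so the following is only a sketch; the filename page.psp, the line number 42, and the generated code are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical output of a page translator. The "# line" comment asks Perl
# to treat the line that follows it as line 42 of page.psp.
my $generated = join "\n",
    'my $total = 0;',
    '# line 42 "page.psp"',
    'die "bad value in embedded block" if $total == 0;',
    '';

# Compile and run the generated code, trapping the error it raises.
eval $generated;
my $error = $@;
print "Reported: $error";
```

With the directive in place, the trapped message names page.psp and line 42 instead of a position within the generated string.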
When Not to Embed
Two types of pages are poor candidates for embedded Perl. The first is obvious: pages that contain only HTML and need no code. Parsing these pages takes time, and the performance of an embedded Perl page never exceeds the performance of the corresponding pure-HTML page. Oddly enough, this situation arises fairly often when a site goes through repeated redesigns. A site that contains mostly embedded Perl can be replaced with pages that are pure HTML but that must keep the same filenames to preserve links from outside sources. In this case, it's often a good idea to declare those pages as a form of HTML within the Web server configuration to save the overhead of parsing them through a Perl environment.
On the other end of the spectrum, code that doesn't produce HTML at all is usually a poor candidate for embedding as well. Such code usually ends up as a single pair of Perl block delimiters surrounding an otherwise pure-Perl program. In this case, it's generally better to avoid the overhead of parsing the page by implementing the program directly in persistent Perl. Of course, this might require adding code to handle common aspects of CGI-style programs, such as form variable processing and environment initialization, so the performance gain might not be worth the additional work of developing in a CGI style.
Embedding Perl in HTML might also make it difficult to use standard Perl development tools for syntax checking and debugging. This can impede the development of complex programs in an embedded environment. However, this might be a blessing in disguise if it encourages complex sections of Perl code to be abstracted out of embedded pages and into reusable modules. These modules then can be checked and debugged using all the usual tools.
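As a small illustration of moving logic out of a page and into a module, the following sketch splits a DBI-style data source name into its parts. The package name My::Datasources and the parse_dsn subroutine are hypothetical; the point is only that code in this form can be syntax-checked with perl -c and exercised by ordinary test scripts, independent of any page parser.

```perl
package My::Datasources;   # hypothetical module name
use strict;
use warnings;

# Splits a DBI-style data source name such as "dbi:mysql:test" into its
# driver and database parts. Returns nothing if the name is not in that form.
sub parse_dsn {
    my ($dsn) = @_;
    my ($scheme, $driver, $rest) = split /:/, $dsn, 3;
    return unless defined $driver && lc($scheme) eq 'dbi';
    return {
        driver   => $driver,
        database => defined $rest ? $rest : '',
    };
}

1;
```

An embedded page would then need only a short Perl block that loads the module and calls My::Datasources::parse_dsn, leaving the page itself mostly HTML.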
Sidebar: Evangelizing Web Usability
One of my hidden motives behind encouraging embedded programming style is its impact on the Web usability testing cycle. By forcing Web programs to conform more closely to the structure of their output, embedded programming makes it much easier to see the effects of stylistic and structural changes to a Web application. Combined with template programming, embedded Perl can break programmers of the Hello World habit and encourage a little bit more thought about the relationship between the code we write and the way the application behaves on site.
Of course, the other requirement is an understanding of the mechanics of Web usability, a subject that is outside the scope of this book entirely. Fortunately, it's the sole subject of an excellent book, Designing Web Usability, by Jakob Nielsen. This book is published under the New Riders banner and is pure distilled Web wisdom. After you've read this book, read his next, Homepage Usability (Fall 2001).
Evangelizing these aspects of good Web design is as important as designing the applications in the first place. As Dr. Nielsen's work has shown time and time again, programmers ignore usability concerns at their own peril. Thus, it's important to lock onto good usability practices and principles and stick to them. Luckily, most Web designers are copycats. If they see a good design getting lots of attention, they won't hesitate to copy the good design for their own uses. If that happened more often, the Web would become a wonderful place to surf.
Summary
Web application development can be made more Web-centric and componentized by embedding Perl in HTML pages, which helps programmers break out of the Hello World frame of mind. Many embedding environments are available. HTML::Mason provides a robust framework for creating reusable program components. EmbPerl provides a simpler environment with helpful table generation features. Apache::ASP brings ASP PerlScript support to the Apache server. PSP enables HTML-style tags to be created using a combination of HTML and Perl code. Any of these environments is likely to promote new development using a Web-centric model, but it's also good to consider times when embedding incurs too much overhead. In cases when the page is likely to contain all static HTML or no HTML at all, performance can be improved by using only HTML or only Perl.