Examples and How-To's

Introduction to the ZWS APIs

Zeus Web Server supports a number of APIs, with differing degrees of power, robustness and complexity. This section introduces each API, describes what each is typically used for, highlights its advantages and disadvantages and provides sample code for each. Read this section to gain an overview of the APIs available and understand which is most appropriate for your requirements.

The purpose of all web server APIs is to:
• pass requests to content handlers
• modify server behaviour (authentication, logging, redirection)
• process content as it's sent back to the client
• process data from submitted forms.

The Zeus Web Server supports all open dynamic content APIs in common use today, and several proprietary APIs, including:
• SSI
• CGI
• FastCGI
• ISAPI
• NSAPI
• ZDAC
• Perl Extensions (ZWS 4.2 and later only)

CGI

Introduction to the Common Gateway Interface (CGI)

The Common Gateway Interface runs an external program (a 'CGI script') as a result of a client request and returns the output of the program to the client. CGI defines the interface between the web server and the CGI script. The CGI script is short-lived - it only runs for the duration of the request. The author is free to program the script in almost any language. CGI is extremely powerful because there are very few limits on what the CGI script can be programmed to do.

Languages used for writing CGI scripts

CGI scripts are commonly written in C or Perl. Perl is a very rich scripting language, with great support for common CGI operations like string manipulation, and CGI scripts can be created very quickly using this language. However, Perl scripts are relatively expensive to run (in terms of resources required - CPU cycles, memory footprint, startup time), so for performance-critical CGI scripts, C is commonly used. Libraries that help in processing the request data from the web server are freely available for both languages.

CGI scripts are rarely written in a shell scripting language because of well-founded security concerns. The author of the script must always be aware of the security implications of running a program on the web server, and shell script interpreters have too many additional features that could be exploited by a malicious attacker.

Strengths and weaknesses of CGI

CGI is flexible, easy to develop and easy to deploy, but carries a significant penalty that impacts the performance and scalability of heavily loaded web sites. On every CGI request, the web server must fork(); the child must release all resources it inherited from the web server parent (file descriptors, memory-mapped regions etc.); then the child must exec() the CGI binary.

CGI is very appropriate for lightly-loaded or over-specified web server installations and for prototyping web applications in a development environment, but is not suitable for a web site installation that needs to scale or operate at or near the limits of the server hardware.

Advantages
• Simple to write
• Portable across many server platforms
• Can be written in many languages, e.g. Perl, C, etc. (input from environment variables, output to standard output).

Disadvantages
• Performance limited - server must fork a new process for every request
• Provides no mechanism to maintain state across requests (e.g. while browsing an on-line store); you must implement it yourself.

CGI 'Hello World' program

CGIs are one means of producing dynamic web pages. They can alter the content of your Web pages based on browser, time of day, user information or a host of other factors. For large-scale web application development, alternatives to CGI such as FastCGI, ISAPI, Java Servlets and the Zeus Distributed Authentication and Content Generation API exist and should be considered for performance reasons. However, small, simple CGI programs can make your Web site a far more interesting place to be than purely static pages.

CGI is an acronym for Common Gateway Interface; it defines the method used by web servers to communicate with external programs. CGIs are generally easy and quick to produce and can be written to power even the most complicated of Web sites. It should be noted that the flexibility of CGI programs is also a potential weakness, not least because of the security risks posed by badly written programs.

This document is intended as an introduction to CGI programming, not a comprehensive tutorial. Users wishing to take CGI programs further than the scope of this document should consult the wealth of online documentation or a good book on the subject.

CGI programs can be written in almost any language, including but not limited to C/C++, FORTRAN, Pascal, Perl, sh and Python. CGI programs are often referred to as CGI scripts; although scripting languages are a popular way of developing CGI programs, this is a misnomer. CGI programs can be, and often are, traditional compiled programs.

A particular advantage of scripting languages for CGI programming is portability. An interpreted language is not tied to one platform, and should transfer easily between operating systems provided an interpreter exists. Programs in compiled languages, by contrast, must be recompiled whenever you move your web server to another platform or OS.

We shall primarily use Perl (Practical Extraction and Report Language) for our examples. Perl, although originally a UNIX tool, is now available on almost all computer platforms, including Windows/NT and Apple Macs. Perl is an interpreted language and so may run a little slower than compiled programs. It does, however, have a number of useful functions for dealing with strings and lists which are applicable to CGI programming.

Basic Concepts

The beauty of CGI programs lies in their simplicity. In order to interface with the Zeus Server all you need to do is write to the standard output stream from your chosen language.
In Perl this would use the print function; in C you would use printf(). Any information generated in this way will be passed on by the Zeus server to the client.

CGIs can output any type of information: regular text, HTML, images, even audio. It is therefore important for a CGI to identify what type of information it is sending back to the browser so that it is correctly displayed. The browser expects any data it receives to be prefixed with HTTP headers, specifically, in this context, a Content-Type header. The Content-Type header needs to be set to the correct MIME type of the data sent to the browser. Generally your CGIs will return text or HTML, which have MIME types of text/plain and text/html respectively. So an example of the first line your CGI program needs to output is:

    print( "Content-Type: text/plain\n\n" );

The "\n\n" (a user-visible blank line) is required to terminate the HTTP headers section before you send back the data.

My First CGI Program

We now know enough to write a real CGI program. We shall start where all good tutorials start, with a hello world program.

Hello World in Perl:

    #!/usr/bin/perl
    print "Content-Type: text/plain", "\n\n";
    print "Hello World in Perl", "\n";

The first line of our Perl program tells the web server where to find the Perl interpreter; contact your local system administrator if you don't know where this is. We then output the HTTP headers; in this case we only supply the Content-Type field, followed by the blank line required to terminate the headers section. Finally we output our data.

Hello World in C:

    #include <stdio.h>

    int main(void)
    {
        /* HTTP headers, terminated by a blank line */
        printf("Content-Type: text/plain\n\n");
        /* The response body */
        printf("Hello World in C!\n");
        return 0;
    }

These short, simple programs show how easy it is to write CGI programs. They are not, however, particularly dynamic! To produce dynamic web pages, CGI programs require information from the server on which to act. There are two types of information which the Zeus Server passes to the CGI program:
• Information about the client (browser), server (Zeus) and authenticated user.
• Data submitted via HTML forms.

All this information, with the exception of some types of form data, is passed in environment variables for the CGI program to read.

CGI Environment variables

Useful references:
• http://hoohoo.ncsa.uiuc.edu/cgi/
• http://www.perl.org/
• http://www.w3.org/

The following Perl script prints two of these environment variables:

    #!/usr/bin/perl
    print STDOUT <<HERE;
    Content-Type: text/plain
    Status: 200 OK

    Some Environment Variables:
    The server host name is: $ENV{SERVER_NAME}
    and the client is: $ENV{HTTP_USER_AGENT}
    HERE

This script produces output like:

    Some Environment Variables:
    The server host name is: digital.zeus.com
    and the client is: Mozilla/4.01 [en] (WinNT; I)

Simple CGI forms

The last example might have included some dynamic elements, but it wasn't very useful. We'll now look at how we can ask users for information and then act on it to generate personalized Web pages.

HTML forms are the standard method of requesting information from the user. They provide a simple means of displaying text boxes, check buttons and radio buttons within the browser. For full information on HTML forms consult the HTML specifications at the World Wide Web Consortium (http://www.w3.org/).
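
Before turning to the form itself, here is a C counterpart to the Perl environment-variable script shown earlier, for readers writing compiled CGI programs. It is only a minimal sketch: the helper names cgi_env() and write_env_report() are illustrative and not part of any Zeus API, and the "(unknown)" fallback is an assumption about how you might handle variables the server did not set.

```c
#include <stdio.h>
#include <stdlib.h>

/* Illustrative helper: return the value of a CGI environment variable,
 * or the fallback string when it is unset or empty. */
const char *cgi_env(const char *name, const char *fallback)
{
    const char *value = getenv(name);
    return (value != NULL && value[0] != '\0') ? value : fallback;
}

/* Illustrative helper: write the HTTP headers, the blank line that ends
 * them, and a plain-text report of two CGI variables to `out`.
 * Returns the number of characters written, or a negative value on error. */
int write_env_report(FILE *out)
{
    return fprintf(out,
                   "Content-Type: text/plain\r\n"
                   "\r\n"
                   "Some Environment Variables:\n"
                   "The server host name is: %s\n"
                   "and the client is: %s\n",
                   cgi_env("SERVER_NAME", "(unknown)"),
                   cgi_env("HTTP_USER_AGENT", "(unknown)"));
}
```

A CGI program built on this sketch would simply call write_env_report(stdout) from main() and return 0.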

Example: A basic HTML Form

    <html>
    <head>
    <title>A simple form</title>
    </head>
    <body>
    <h1>Please enter your name!</h1>
    <form action="processform.cgi" method="POST">
    <p>
    Please enter your name <input type="text" name="name">
    </p>
    <p>
    Please enter your email <input type="text" name="email">
    </p>
    <p>
    <input type="submit" value="Submit the form">
    <input type="reset" value="Clear all fields">
    </p>
    </form>
    </body>
    </html>

The important lines in the HTML are:

    <form action="processform.cgi" method="POST">

This defines the CGI program to run when the submit button is pressed. It also defines how the data is passed to the CGI program. There are two methods by which data can be passed to CGI programs, GET and POST. POST is generally the more useful and more widely used method, so that is the one we'll use here. For a more complete list of differences between the two methods please consult the Zeus CGI Reference Document.

And also:

    <p>
    Please enter your name <input type="text" name="name">
    </p>
    <p>
    Please enter your email <input type="text" name="email">
    </p>

These define two text input boxes labelled name and email. These labels are used in the CGI program to read the associated values.

CGI Form Code

The Perl script to process the form information is a little more complicated than the previous examples. With the POST method, the data from the form is written straight to the CGI program through standard input. The data arrives encoded: spaces are sent as +'s, and certain other characters are replaced by a % followed by their ASCII value in hexadecimal.

Process Form CGI

    #!/usr/bin/perl

    use strict;

    main();

    sub main ()
    {
        my $query;

        # Here we read the information entered in the form into a string;
        # the total length of the data is passed via the environment
        # variable CONTENT_LENGTH. The string is then split on &'s, which
        # the POST method uses to separate each field in the HTML form.
        read( STDIN, $query, $ENV{CONTENT_LENGTH} );

        # Then for each item in the array we split it on ='s into key,
        # value pairs. All spaces which have been encoded into +'s are
        # then translated back using tr. We then use a regular expression
        # to convert the hexadecimal ASCII values back into the real
        # characters. Finally the last line of the loop builds an
        # associative array (hash) indexed on the HTML input labels.
        my @param = split( /&/, $query );
        my %pairs = ();

        foreach my $item ( @param )
        {
            my ($key, $value) = split( /=/, $item );

            $key   =~ tr/+/ /;
            $value =~ tr/+/ /;

            $key   =~ s/%([A-F\d]{2})/chr(hex($1))/ieg;
            $value =~ s/%([A-F\d]{2})/chr(hex($1))/ieg;

            # NOTE: we are assuming the client passed us each key/value
            # only once. This is not an assumption you can always make
            # in the real world.
            $pairs{$key} = $value;
        }

        # We then set up three more variables. The first two store the
        # values we actually wanted out of the associative array. The
        # last reads the remote machine name from an environment variable.
        my $name    = $pairs{name};
        my $email   = $pairs{email};
        my $machine = $ENV{REMOTE_HOST};

        # From here on it's plain sailing. We output the HTTP Content-Type
        # and Status headers, and the necessary HTML to build the page:
        print( STDOUT "Content-Type: text/html\r\n" );
        print( STDOUT "Status: 200 Ok\r\n" );
        print( STDOUT "\r\n" );

        print( STDOUT <<HTML );
    <html>
    <head>
    <title>Form example output</title>
    </head>
    <body>
    <h1>welcome</h1>
    <hr>
    <p>
    Hi <em>$name</em> of <em>$email</em> from machine <em>$machine</em>
    </p>
    <hr>
    </body>
    </html>
    HTML
    }

Session Tags

Situation

A web designer is creating a site which will need users to log into it. Logging in will provide the user with a session tag, which must be provided to the site with each request. The web designer does not wish to rely on cookies or HTTP basic realm authentication, nor do they want to pass the session tag in a query string (or hidden form item) each time.
The web designer also has a lot of legacy scripts which they wish to re-use. The scripts expect session information to be provided in an environment variable, and some of them create URLs of the form:

    www.site.com/SESSION=sessionstring/<rest of url>

Solution

The session tag can be stored in the hostname of the site, and a rewrite script can be used to put this information into an environment variable (or two) for the legacy scripts. Use a request rewrite script similar to the following:

    # Take URLs of the form www.site.com/SESSION=sessionstring/<url>
    # and create www.sessionstring.site.com and remove the rest.
    match URL into $ with ^/SESSION=([^./]+)/(.*)$
    if matched then
        set IN:Host = www.$1.site.com
        set URL = /$2
    endif
    # Done with that - fortunately we can now fall through to...

    # Take URLs of the form www.sessionstring.site.com and
    # create SESSION=sessionstring as an environment variable
    match IN:Host into $ with ^www\.([^.]+)\.site\.com
    if matched then
        set ENV:SESSION = $1
    endif

Explanation

The first set of rewrite rules translates URLs that contain SESSION=something into URLs of the format www.sessionstring.site.com. This is done by finding /SESSION=something/ at the start of the URI and removing it, then rewriting the hostname to the form www.sessionstring.site.com.

The second set of rules searches for session information in the hostname, and translates it into a SESSION environment variable.

Together, the two sets of rules present a consistent interface to the scripts running on the site, without breaking any existing functionality.

Vanity Domain Names

Situation

An ISP provides customers with free web hosting by providing them with web sites rooted at http://home.isp.com/~username/. This is proving unpopular and the ISP wishes instead to offer http://www.username.isp.com/ without changing how they provide the web space.

Solution

Create a Virtual Server for home.isp.com and configure the home directories functionality to provide web sites rooted at ~username.
Next, go to the fundamental configuration settings for the Virtual Server and add an alias of '*' to it, so it will catch requests for www.username.isp.com. Then add a request rewrite script similar to the following:

    # Turn www.username.isp.com into home.isp.com/~username
    match IN:Host into $ with ^www\.([^\.]+)\.isp\.com
    if matched then
        set IN:Host = home.isp.com
        set URL = /~$1%{URL}
    endif
    # End rule for home directories

Explanation

The request rewriter executes the script for each request which is processed by the virtual server created above. For each request, the regular expression matching hostnames of the form www.username.isp.com is executed against the Host header. If it matches, the regular expression captures the username into the $1 variable, and two more actions are performed:
1. The hostname is rewritten to home.isp.com;
2. The URL portion has /~username prepended to it.

FastCGI

Introduction to FastCGI

FastCGI is a high-performance API supported by the Zeus Web Server. It is an open standard, and is language and platform independent. FastCGI provides both authorization and content generation services, and has a simple, quick development cycle. FastCGI is recommended as a general purpose solution for almost any task.

FastCGI attempts to mitigate the performance problems of CGI by eliminating the need to fork() the CGI script on each CGI request. It implements a framework for a persistent CGI runner process - in effect, the CGI script becomes a function call rather than a standalone binary.

FastCGI is an open (non-proprietary) web server interface. It is supported natively by the Zeus Web Server. FastCGI applications can act as 'authorizers' (e.g. for access control) as well as in the typical CGI 'responder' mode. The applications can be distributed over a network, running on a different machine from the web server itself, which in certain circumstances can bring security or licensing benefits.

A good starting point for getting to grips with FastCGI is the programmer's guide to writing FastCGI applications: http://www.fastcgi.com/devkit/doc/fastcgi-prog-guide/ap_guide.htm

Advantages
• Simple
• Easy to migrate CGI applications to FastCGI
• High performance
• Stable
• Distributed scalability. FastCGI applications can run on a different machine from the web server, and sit behind a load balancer to scale over an application server farm
• Can be written in any language, e.g. Perl
• The application can use any process model (single threaded, pre-fork, multi-threaded etc.)
• Best general purpose high performance content generation and authorization API.

Disadvantages
• API restricted to content generation and authorization. Cannot be used for more low-level server behaviour modification, such as rewriting incoming URLs.

'Hello World' in Perl and C

Hello World in Perl

Here is a simple example of a hello world program for FastCGI written in Perl. You will first need to install the FastCGI Perl module. At the time of writing, the latest version is available at http://www.cpan.org/modules/by-module/FCGI/FCGI-0.67.tar.gz

Ensure the FastCGI module is enabled, and the path mapping module is set up correctly so that your FastCGI programs can be run.

    #!/usr/bin/perl
    use FCGI;

    # Each pass through the loop handles one request
    while( FCGI::accept() >= 0 ) {
        print( "Content-Type: text/plain", "\n\n" );
        print( "Hello World in Perl", "\n" );
    }

Hello World in C

Here is a simple example of a hello world program for FastCGI written in C. Before you can compile this, you will need to install the FastCGI devkit.
At the time of writing the latest version is available from http://www.fastcgi.com/dist/devkit_2.2.0.tar.gz

    #include <fcgi_stdio.h>

    int main( int argc, char *argv[] )
    {
        /* Each pass through the loop handles one request */
        while( FCGI_Accept() >= 0 ) {
            printf( "Content-Type: text/plain\n\n" );
            printf( "Hello world in C\n" );
        }
        return 0;
    }

On a Linux machine, a typical command to compile this would be along the lines of:

    gcc -o hello hello.c -lfcgi

Simple authorization with FastCGI

Here is a simple example of using FastCGI to perform authorization. It is written in Perl, using the Perl FastCGI library described in the Hello World example.

    #!/usr/bin/perl
    use FCGI;

    while( FCGI::accept() >= 0 ) {
        if( $ENV{'REMOTE_USER'}   eq "foo" &&
            $ENV{'REMOTE_PASSWD'} eq "bar" ) {
            print( "Status: 200\n\n" );
        } else {
            # All headers must be sent before the blank line that ends them
            print( "Status: 401\n" );
            print( "WWW-Authenticate: basic realm=\"foo\"\n\n" );
        }
    }

A FastCGI authorizer simply retrieves the information about the request using FCGI::accept(), and then returns a response of either 'Status: 200' if the request should be allowed, or 'Status: 401' if the username or password is invalid.

In this case, any request passed to the authorizer will be allowed only if the user supplied a username of 'foo' and a password of 'bar'. Otherwise the authorizer will disallow the request and prompt the user to enter a new password.

Running this example

In order to run this code, you'll need to enable FastCGI authorizer support in the Zeus Web Server and have installed a copy of Perl and the FCGI Perl module.
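
For readers working in C, the accept/deny decision made by the Perl authorizer can be sketched as a plain function. This is an illustration under stated assumptions, not devkit code: authorizer_response() is a hypothetical helper, the 'foo'/'bar' credentials are the ones from the Perl example, and the function only builds the response headers so that it compiles without the FastCGI devkit.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the authorizer response for one request.
 * Writes the response headers (ending with the blank line) into buf and
 * returns the HTTP status code that was chosen. */
int authorizer_response(const char *user, const char *passwd,
                        char *buf, size_t buflen)
{
    /* A missing variable counts as a failed login */
    int ok = user != NULL && passwd != NULL &&
             strcmp(user, "foo") == 0 && strcmp(passwd, "bar") == 0;

    if (ok) {
        snprintf(buf, buflen, "Status: 200\r\n\r\n");
        return 200;
    }

    /* The challenge header goes before the blank line that ends the headers */
    snprintf(buf, buflen,
             "Status: 401\r\n"
             "WWW-Authenticate: Basic realm=\"foo\"\r\n"
             "\r\n");
    return 401;
}
```

In an authorizer built with the devkit, one would presumably call this inside the while( FCGI_Accept() >= 0 ) loop from the C Hello World example, passing getenv("REMOTE_USER") and getenv("REMOTE_PASSWD") and printing the buffer with printf().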

Authenticating against a database

As a more advanced example, imagine wanting to authenticate a customer's request against a back-end SQL database which holds all the information about your customers. This is such a common task that we explain how to do it in detail.

For this example, we are going to use the popular MySQL database and a simple schema to hold our customer information. The database is running on a machine called 'db', and the schema comprises a single table called 'users' which looks like:

    Username    Password
    Fred        Bloggs
    John        Doe

Thus to retrieve the password for a given user, we would issue the following SQL statement:

    /* the question mark is a marker for a `bound parameter' */
    select password from users where username = ?;

Altering our FastCGI application to connect to this database and check the user-supplied password against it is very easy; a MySQL version is given under 'FastCGI/MySQL authorization example using Perl' below.

Note that one of the key advantages of FastCGI is its ability to hold persistent database connections. Establishing a new connection to a database is generally a very time-consuming and CPU-intensive task, so persistent database connections provide a huge performance increase. When doing database queries in FastCGI, one can spawn multiple instances of the authorizer process to allow multiple database queries to run in parallel.

Authenticating against an LDAP server

Instead of using a general purpose database, many large ISPs and corporations are now choosing to keep their customer data in a global LDAP database, commonly known as a white pages server. Again, altering the FastCGI example to query the LDAP server is an easy task.
For this example, we used the OpenLDAP slapd server (http://www.openldap.org/) with the following configuration.

Our 'minimal' slapd.conf file contained:

    schemacheck    off
    database       ldbm
    suffix         "o=Zeus Technology, c=UK"
    directory      /tmp
    defaultaccess  read
    rootdn         "cn=root, o=Zeus Technology, c=UK"
    rootpw         secret

All access to the LDAP server was disabled unless the client provided the rootdn/rootpw. Different access control rules between the web server and your LDAP server will probably apply to your setup.

The database we used, in `ldif' format, was:

    dn: o=Zeus Technology, c=UK
    o: Zeus Technology

    dn: cn=fred_bloggs, ou=People, o=Zeus Technology, c=UK
    cn: fred_bloggs
    password: secret

    dn: cn=john_doe, ou=People, o=Zeus Technology, c=UK
    cn: john_doe
    userid: john.doe
    password: notverysecret

This defines two users, each with a password. In this example we have stored the plaintext passwords in the LDAP server for simplicity, but a hash of the passwords could easily be stored instead.

The FastCGI application will need to be able to retrieve the password for a given user name. Queries are encoded in the standard LDAP URL format (see http://search.ietf.org/rfc/rfc2255.txt for a full specification).
Using our example database schema above, and the knowledge that our test LDAP server was set up on a machine named `olympus', the LDAP query URL we need is:

    ldap://olympus/o=Zeus Technology, c=UK?password?sub?userid=$u

FastCGI/MySQL authorization example using Perl

    #!/usr/bin/perl

    use strict;
    use DBI;
    use FCGI;

    use constant PATH_NEVER  => 0;
    use constant PATH_MAYBE  => 1;
    use constant PATH_ALWAYS => 2;

    # normally, I would abstract this stuff out into a sitewide config module,
    # but for didactic reasons, I'll just define some constants here:
    use constant COOKIE_NAME => 'AUTH_TOKEN';
    use constant DBI_DSN => 'dbi:mysql:hostname=DBHOSTNAME;database=DBNAME';
    use constant DBI_USR => 'monty';
    use constant DBI_PWD => 'widenius';

    use constant AUTH_ERR => -1;
    use constant AUTH_NOK =>  0;
    use constant AUTH_OK  =>  1;

    use constant AUTH_QUERY => <<AQ_SQL;
    select count(*) as authorized
    from login_table
    where user = ?
    and token = ?
    and expiry > unix_timestamp()
    AQ_SQL

    use vars qw($DBH $STH $N);

    # forward declarations
    sub _init ();
    sub _exit ();
    sub check_path ($);
    sub authorized ($$$);
    sub get_login_cookie ();
    sub query_decode (;@);

    _init();

    for ($N = 0; FCGI::accept() >= 0; $N++)
    {
        # check the path to see if we want/need to authorize access:
        my $path_auth = check_path($ENV{REQUEST_URI});

        if($path_auth == PATH_MAYBE)
        {
            my $auth = undef();
            my $user;
            my $token;
            my $cookie;

            # get the login cookie and decompose it into user + token:
            # cookie format: USERNAME:OPAQUETOKENFROMDB
            $cookie = get_login_cookie();
            ($user,$token) = split(/:/,$cookie,2);

            # check to see if an unexpired entry exists in the db:
            $auth = authorized($STH, $user, $token);

            if($auth == AUTH_OK)
            {
                # return 200 Ok, and set the AUTH_USER_NAME env variable
                # in case there is a dynamic content generator:
                # variables you want to set for the requested script/page
                # need to be prefixed with the string 'Variable-',
                # or they will be passed back to the client, not the server.
                print(STDOUT "Status: 200 Authorized\r\n");
                print(STDOUT "Variable-AUTH_USER_NAME: $user\r\n");
                print(STDOUT "\r\n");
            }
            elsif($auth == AUTH_NOK)
            {
                # Not authorized.
                # You can make your login page the default
                # 401 page with the htaccess ErrorDocument 401 directive:
                print(STDOUT "Status: 401 Not Authorized\r\n");
                print(STDOUT "WWW-Authenticate: basic realm=\"foo\"\r\n");
                print(STDOUT "\r\n");
            }
            else
            {
                # Waah. Something blew up.
                print(STDOUT "Status: 500 Internal Auth Error\r\n");
                print(STDOUT "\r\n");
            }
        }
        elsif($path_auth == PATH_NEVER)
        {
            # we never allow anyone in to these:
            print(STDOUT "Status: 403 Denied\r\n");
            print(STDOUT "\r\n");
        }
        elsif($path_auth == PATH_ALWAYS)
        {
            # these we don't really care about, just let them in.
            # your error pages, icon, etc should all fall into this
            # category, as should your login page:
            print(STDOUT "Status: 200 Ok\r\n");
            print(STDOUT "\r\n");
        }
        else
        {
            # This should not be able to happen: If it does,
            # your site needs attention from you:
            print(STDOUT "Status: 500 Internal Auth Error\r\n");
            print(STDOUT "\r\n");
        }
    }

    _exit();

    ##############
    # Access rules, first match wins:
    #
    # /auth/login.cgi is always allowed
    # /share/* and /icons/* are always allowed
    # /lib/* and /auth/* are always disallowed
    # anything else must be explicitly authorized
    sub check_path ($)
    {
        my $uri = $_[0];
        #warn("check_path($_[0])\n");

        if ($uri =~ m@/auth/login.cgi@){ return PATH_ALWAYS }
        if ($uri =~ m@/share/|/icons/@){ return PATH_ALWAYS }
        if ($uri =~ m@/lib/|/auth/@)   { return PATH_NEVER  }

        return PATH_MAYBE;
    }

    sub authorized ($$$)
    {
        my $rv;
        my $row;
        my $sth   = $_[0];
        my $user  = $_[1];
        my $token = $_[2];

        if(!$sth->execute($user,$token))
        {
            warn("DBI error: ", $sth->errstr(), "\n");
            return AUTH_ERR;
        }

        if($row = $sth->fetchrow_arrayref())
        {
            # only interested in one column in this case
            $rv = $row->[0] ?
                AUTH_OK : AUTH_NOK;
        }
        else
        {
            warn("DBI error: ", $sth->errstr(), "\n");
            $rv = AUTH_ERR;
        }

        # paranoia: empty out the sql result buffer, just in case
        # so that it's clean for the next invocation
        while ($sth->{Active}) { $sth->fetchrow_arrayref() }

        return $rv;
    }

    # open a database connection and prepare the query
    sub _init ()
    {
        $DBH = DBI->connect(DBI_DSN, DBI_USR, DBI_PWD)
            || die("DBI->connect() failed: $DBI::errstr\n");

        $STH = $DBH->prepare(AUTH_QUERY)
            || die("prepare(AUTH_QUERY) failed: ",$DBH->errstr(),"\n");
    }

    # clean up and close down
    sub _exit ()
    {
        $STH->finish();
        $DBH->disconnect();
    }

    # extract a login cookie from the headers:
    # the assumption is made here that any... unusual characters
    # in the cookie have been %XX encoded:
    sub get_login_cookie ()
    {
        my $cval = undef();

        if(exists($ENV{HTTP_COOKIE}))
        {
            my @cookie = split(/; /,$ENV{HTTP_COOKIE});

          COOKIE:
            foreach my $ck (@cookie)
            {
                my($n, $v) = query_decode(split(/=/,$ck,2));
                if($n eq COOKIE_NAME) { $cval = $v; last COOKIE }
            }
        }

        return $cval;
    }

    # %XX decode a string or strings:
    sub query_decode (;@)
    {
        my @str = @_;
        my $item;

        foreach $item (@str)
        {
            $item =~ tr/+/ /;
            $item =~ s/\%([A-F\d]{2})/chr(hex($1))/gei;
        }

        return wantarray ? (@str) : $str[0];
    }

    __END__

    /* mysql login table create statement */
    create table login_table
        (token  char(32) not null,
         user   char(32) not null,
         expiry int(11)  not null)

ISAPI

Introduction to ISAPI

ISAPI provides a vendor-independent API to extend the functionality of your web server, while avoiding the performance limitations inherent in CGI. ISAPI was originally proposed by Microsoft, which still backs it, but it was designed to be an open standard that anyone can implement and use.

ISAPI follows a simple model, where 'callback' functions implemented by the web server backend can be invoked by a user-written module. These callback functions can retrieve data from the web server or from the HTTP client (i.e. the browser), send data to the HTTP client, or change some state in the web server.
ISAPI defines a standard set of callback functions that a server must implement, but because the standard is open, a vendor may implement additional functionality that exists only within its own product.

In/out of process

When using ISAPI modules with the Zeus Server you must remember that the 'out-of-process' ISAPI runner is a multi-threaded server. This means that there could be several threads in your ISAPI application at any one time, so you should write your application in a thread-safe way.

Due to the nature of ISAPI, 'in-process' modules run as part of the server core. This means that if they behave unexpectedly, the web server itself could suffer. More seriously, due to the way the Zeus Server works, it is possible for a malicious ISAPI module to set the effective user id back to the user that started the server (possibly root). Do not use untrusted code in ISAPI applications.

When an ISAPI module is being run by the web server, the file is mapped into the address space of the web server process. If you overwrite or delete the file then you can corrupt the address space of the web server process, causing it to crash. Filters are constantly mapped in. Extensions are mapped in when they are first used, and may be left mapped in, cached in memory, after they have been used. You should take special care in these cases.

Getting started

Please refer to our getting started guide. There is a reference guide in the ISAPI Programming Documentation.

Advantages
• Powerful API that includes content generation and authorization, along with low-level APIs to rewrite incoming or outgoing content and modify the server behaviour in many ways
• In-process ISAPI provides the highest-performance API
• Out-of-process ISAPI provides a multi-threaded application server.

Disadvantages
• Must be written in C or a C-derived language, such as C++.
• In-process ISAPI loads customer code directly into the web server, so it must be trustworthy and stable.
(It will run with the same permissions as the web server process, and if the ISAPI module crashes or leaks memory or file descriptors, it will bring down the entire web server)
• Out-of-process ISAPI applications must be MT-safe (multi-thread safe). Writing stable multi-threaded applications is the domain of a professional software team; such code is complicated and very difficult to debug
• Out-of-process ISAPI applications all load into a single multi-threaded application server, so all applications run with the same security permissions. ISAPI applications must consist of trusted code.

Getting Started with ISAPI

This document aims to answer all your questions about getting started with ISAPI. You may also wish to refer to the ISAPI Programming Documentation.

What exactly does ISAPI provide?

ISAPI provides a vendor-independent way of extending the functionality of your web server. It offers far more flexibility than the CGI interface and avoids all the performance limitations. ISAPI code is written in C and compiled as a shared object, and can be either single-threaded or multi-threaded.

I thought ISAPI was a Windows-based API. Can I use it under UNIX?

Sure you can. Although the types may look unfamiliar to most seasoned UNIX developers (since the standard was originally designed by Process Software and Microsoft), it is actually a very elegant and efficient way of extending the functionality of your web server.

How does ISAPI code link into the web server?

ISAPI code is compiled as a shared object, also known as a dynamic library. This code automatically becomes part of the web server when a web site is started, and can then interact with data structures and talk with a connected client. Unlike a CGI, which is a separate process, ISAPI code has access to far more of the internal data and can therefore provide much more functionality than CGI.

What about performance? Is ISAPI fast?

Yes!
Since the web server already has the ISAPI code loaded, it is a very fast mechanism for dynamically providing or modifying content.

I want to write an ISAPI module - where do I start?

Firstly, you'll need to decide whether to use an ISAPI filter or an extension. Filters are called on every HTTP request and are chained together in a specific order. Extensions generate content and are mapped to a specific URL path.

Should I write an ISAPI filter or an extension?

You should work out whether your primary purpose is to dynamically generate content, or to redirect, authenticate or rewrite client headers before a request is processed. You also need to know whether you wish to be called for specific URLs or for every request. Both are equally easy to write, although your code must be "thread-safe". (The exception to this rule is "in-process" filters and extensions, which don't have to be thread-safe; see below for more details.)

I want to write an extension to write data to the client. Now what?

You need to register your extension with the web server. Firstly you'll need to define the GetExtensionVersion() function, something like this:

    BOOL WINAPI GetExtensionVersion( HSE_VERSION_INFO *pVer )
    {
        pVer->dwExtensionVersion = HSE_VERSION_MAJOR;
        strncpy( pVer->lpszExtensionDesc, "A simple page counter",
                 HSE_MAX_EXT_DLL_NAME_LEN );
        return TRUE;
    }

This registers your extension and describes its purpose. Once you've compiled your code, you should register with the web server the URL it is called for. When a request for this URL arrives, the web server calls the HttpExtensionProc() function, which you should complete (shown below). From it you can call ReadClient() or WriteClient() using the Extension Control Block (ECB) structure provided.

    DWORD WINAPI HttpExtensionProc( LPEXTENSION_CONTROL_BLOCK ecb )
    {
        /* your code */
        return HSE_STATUS_SUCCESS;
    }

What about if I want to write an ISAPI filter?

You will need to register your filter in a similar way to the extension shown above.
This can be done by defining a GetFilterVersion() function:

    BOOL WINAPI GetFilterVersion( HTTP_FILTER_VERSION *pVer )
    {
        pVer->dwFilterVersion = HTTP_FILTER_REVISION;
        strncpy( pVer->lpszFilterDesc, "A Cookie Filter",
                 SF_MAX_FILTER_DESC_LEN );

        /* Notify me when headers have been processed */
        pVer->dwFlags = SF_NOTIFY_PREPROC_HEADERS |
                        SF_NOTIFY_SECURE_PORT |
                        SF_NOTIFY_NONSECURE_PORT;
        return TRUE;
    }

Once you've compiled your code, you will also need to register your filter with the web server. The HttpFilterProc() function will be called for every HTTP request for this virtual server, regardless of URL. ISAPI filters are called in a specific order, and more than one may be called per request. The return value from this function determines whether other filters are also called. Here is an example function for you to complete:

    DWORD WINAPI HttpFilterProc( PHTTP_FILTER_CONTEXT pfc,
                                 DWORD notificationType,
                                 PHTTP_FILTER_PREPROC_HEADERS headers )
    {
        /* your code here */

        /* return, tell server to notify the next module */
        return SF_STATUS_REQ_NEXT_NOTIFICATION;
    }

Cool, what else can I do with ISAPI?

Many things! If you're writing a filter, you'll want to specify when you're called. You can hook into the following events, which happen in the lifetime of a client connection:

    Event                          When called
    SF_NOTIFY_READ_RAW_DATA        when the web server has read data from the client
    SF_NOTIFY_PREPROC_HEADERS      when data has been read and headers have been parsed
    SF_NOTIFY_URL_MAP              when the web server is mapping a URL to a file
    SF_NOTIFY_AUTHENTICATION       when the web server is authenticating a user
    SF_NOTIFY_ACCESS_DENIED        if the user is denied access
    SF_NOTIFY_SEND_RAW_DATA        when data is sent to the client
    SF_NOTIFY_END_OF_REQUEST       when the request has ended
    SF_NOTIFY_LOG                  when a log entry is made
    SF_NOTIFY_END_OF_NET_SESSION   when a connection has closed

Ok, so how do I compile my ISAPI-based code?

You will need a working C compiler and a copy of the ISAPI header files.
These are provided with the Zeus Web Server and can be found in the $ZEUSHOME/web/include/ directory. The flags you need to specify also depend on whether you are running the filter or extension in-process or out-of-process (see the question on "out-of-process" and "in-process" below). You can find further information about compilation in our FAQ.

Why does my ISAPI module stop the webserver serving requests?

Uh-oh. The core Zeus Web Server code is single-threaded (for speed), so ISAPI module code running in-process must not block for too long. If your code may block, you should run it in the out-of-process ISAPI runner.

What's the difference between "out-of-process" and "in-process"?

Out-of-process means that a separate ISAPI runner process handles your code. In-process means that your code gets loaded into the main web server process, which is potentially more dangerous! If your code has a bug in it, you will crash the entire server. Out-of-process is safer since you can only crash the ISAPI runner with poor code. Zeus is fault-tolerant however, and will restart itself if it notices one of its processes has crashed.

Here's a table summarizing the features of extensions and filters, to help you decide:

                     Extension                       Filter
    Purpose          generate content                rewrite headers and URLs, perform authentication, redirect, forbid, etc.
    Calling model    by requested URL                on every request
    Where run        out-of-process or in-process    out-of-process or in-process

How do I make my ISAPI module run faster?

There is a small degradation in speed if you run your ISAPI module out-of-process. Moving it into the core web server (in-process) means that the operating system does not have to schedule two processes, and data does not have to be piped between them.

My code has random corruption problems

Your code needs to be "thread-safe". This means that your code must not depend on the value of a variable if a subsequent request may overwrite it whilst the first request is still executing.
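The overwrite described above can be reproduced without any threads. The following is a minimal sketch, not Zeus or ISAPI code: the names format_unsafe, format_safe, MSG_SIZE and the two *_survives helpers are hypothetical, chosen only for illustration. It simulates two overlapping requests one after the other, so the result is deterministic; in the multi-threaded out-of-process runner the same overwrite would occur at unpredictable moments.

```c
/* Sketch only: all names below are hypothetical and are not part of the
 * ISAPI headers. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MSG_SIZE 64

/* NOT thread-safe: every caller shares one static buffer, so a later
 * request overwrites the text that an earlier request is still using. */
const char *format_unsafe( int request_id )
{
    static char shared_msg[MSG_SIZE];
    snprintf( shared_msg, sizeof(shared_msg), "request %d", request_id );
    return shared_msg;
}

/* Safe: the caller supplies the buffer, so each request owns its data. */
void format_safe( int request_id, char *out, size_t outlen )
{
    assert( out != NULL && outlen > 0 );
    snprintf( out, outlen, "request %d", request_id );
}

/* Request 2 is formatted while request 1 still needs its text.
 * Returns 1 if request 1's text survived, 0 if it was overwritten. */
int unsafe_text_survives( void )
{
    const char *first = format_unsafe( 1 );
    (void) format_unsafe( 2 );   /* the second request arrives */
    return strcmp( first, "request 1" ) == 0;
}

/* The same sequence of events, with one buffer per request. */
int safe_text_survives( void )
{
    char first[MSG_SIZE], second[MSG_SIZE];
    format_safe( 1, first, sizeof(first) );
    format_safe( 2, second, sizeof(second) );
    return strcmp( first, "request 1" ) == 0;
}
```

Under these assumptions, unsafe_text_survives() returns 0 because the second call reuses the shared buffer, while safe_text_survives() returns 1. Giving each request its own storage is one way of re-structuring code; a mutex is the alternative when the data genuinely has to be shared, such as a hit counter.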
You will need to re-structure your code or lock access to critical shared variables via mutexes. (The only exception to this rule is "in-process" modules, which are run sequentially.)

Adding a tilde to directory names (ISAPI filter)

Here is a simple example using an ISAPI filter to add a tilde to directory names. It is written in C.

    /*
     * Add a tilde to directory names
     *
     * This ISAPI filter will change URLs of the form 'isp.com/USER/stuff'
     * to the form 'isp.com/~USER/stuff'
     *
     */

    /* Include ISAPI definitions and constants */
    #include <httpfilt.h>
    #include <string.h>

    /* Maximum length of a URI */
    #define MAX_URI_SIZE 4065

    /* This function is called when the filter gets loaded by the web server */
    BOOL WINAPI GetFilterVersion( HTTP_FILTER_VERSION *pVer )
    {
        /* Set the filter version */
        pVer->dwFilterVersion = HTTP_FILTER_REVISION;

        /* Set a description for this filter */
        strncpy( pVer->lpszFilterDesc, "Tilde Insertion Filter",
                 SF_MAX_FILTER_DESC_LEN );

        /* Ask to be notified when the HTTP headers have been processed */
        pVer->dwFlags = SF_NOTIFY_PREPROC_HEADERS;
        return TRUE;
    }

    /* This function is called for every HTTP request */
    DWORD WINAPI HttpFilterProc( PHTTP_FILTER_CONTEXT pfc,
                                 DWORD notificationType,
                                 VOID *pvNotificationInfo )
    {
        /* This is a notification of type SF_NOTIFY_PREPROC_HEADERS, therefore
         * the argument pvNotificationInfo is of type
         * HTTP_FILTER_PREPROC_HEADERS* */
        HTTP_FILTER_PREPROC_HEADERS* headers =
            (HTTP_FILTER_PREPROC_HEADERS*) pvNotificationInfo;

        /* Let the web server allocate (MAX_URI_SIZE+1) bytes of memory for us */
        char *CurrentURI = pfc->AllocMem( pfc, MAX_URI_SIZE+1, 0 );
        int URILength = MAX_URI_SIZE;

        /* Copy the header 'url' to the address (CurrentURI+1) */
        if( !headers->GetHeader( pfc, "url", CurrentURI+1, &URILength )) {
            /* The URI was not fetched successfully */
            return SF_STATUS_REQ_ERROR;
        }

        if( *(CurrentURI+2) == '~' ||
            strrchr( CurrentURI+1, '/') == CurrentURI+1 ) {
            /* Don't add a tilde if it's already there or if the URI is a
               file in the root directory */
            return SF_STATUS_REQ_NEXT_NOTIFICATION;
        }

        /* Copy the leading slash (currently at index 1) to index 0 */
        *CurrentURI = *(CurrentURI+1);

        /* Copy a tilde character to index 1 */
        *(CurrentURI+1) = '~';

        /* Set the new header */
        if( !headers->SetHeader( pfc, "url", CurrentURI )) {
            /* The URI was not re-set successfully */
            return SF_STATUS_REQ_ERROR;
        }

        /* Return, instructing the server to notify us next time */
        return SF_STATUS_REQ_NEXT_NOTIFICATION;
    }

Basic authentication (ISAPI filter)

Here is a simple example using an ISAPI filter to perform basic authentication. It is written in C.

    /*
     * Authenticate users
     *
     * This ISAPI filter performs basic authentication.
     * Before every HTTP request is served, it checks for username/password
     * information and sends a '401 Permission Denied' response if the
     * information provided is incorrect.
     *
     */

    /* Include ISAPI definitions and constants */
    #include <httpfilt.h>
    #include <stdio.h>      /* sprintf() */
    #include <string.h>

    #define USERNAME "fred"
    #define PASSWORD "bloggs"
    #define DOMAIN   "Members Area"

    /* This function is called when the filter is loaded by the web server */
    BOOL WINAPI GetFilterVersion( HTTP_FILTER_VERSION *pVer )
    {
        /* Set the filter version */
        pVer->dwFilterVersion = HTTP_FILTER_REVISION;

        /* Set a description string for this filter */
        strncpy( pVer->lpszFilterDesc, "Basic Authentication Filter",
                 SF_MAX_FILTER_DESC_LEN );

        /* Ask to be notified at the authentication stage of every HTTP request */
        pVer->dwFlags = SF_NOTIFY_SECURE_PORT |
                        SF_NOTIFY_NONSECURE_PORT |
                        SF_NOTIFY_AUTHENTICATION;
        return TRUE;
    }

    /* Ask the client to authenticate by sending a 401 response */
    static void Denied( PHTTP_FILTER_CONTEXT pfc, char *msg )
    {
        char domain[256];
        int l = strlen( msg );

        /* Set up string to include name of domain */
        sprintf( domain, "WWW-Authenticate: Basic realm=\"%s\"\r\n", DOMAIN );

        /* Send a 401 header */
        pfc->ServerSupportFunction( pfc, SF_REQ_SEND_RESPONSE_HEADER,
                                    (PVOID) "401 Permission Denied",
                                    (LPDWORD) domain, 0 );

        /* Explain
           why authentication failed */
        pfc->WriteClient( pfc, msg, &l, 0 );
    }

    /* This function is called for every HTTP request */
    DWORD WINAPI HttpFilterProc( PHTTP_FILTER_CONTEXT pfc,
                                 DWORD notificationType,
                                 VOID *pvNotification )
    {
        HTTP_FILTER_AUTHENT *auth = (HTTP_FILTER_AUTHENT *) pvNotification;

        if( auth->pszUser[0] == 0 ) {
            /* No authentication information was given */
            Denied( pfc, "No user/password given" );
            /* This is the end of this HTTP request */
            return SF_STATUS_REQ_FINISHED;
        }

        if( strcmp( auth->pszUser, USERNAME ) ) {
            /* Username is wrong */
            Denied( pfc, "Unknown user" );
            return SF_STATUS_REQ_FINISHED;
        }

        if( strcmp( auth->pszPassword, PASSWORD ) ) {
            /* Password is wrong */
            Denied( pfc, "Wrong password" );
            return SF_STATUS_REQ_FINISHED;
        }

        /* Continue with this request */
        return SF_STATUS_REQ_NEXT_NOTIFICATION;
    }

Unique client logging using cookies (ISAPI filter)

Here is a simple example using an ISAPI filter to log unique clients using cookies. It is written in C.

    /*
     * Log unique clients by using cookies
     *
     * This ISAPI filter keeps track of unique clients by reading
     * a cookie and setting it if it doesn't exist.
     * It logs every cookie read to a file.
     *
     */

    /* Include ISAPI definitions and constants */
    #include <httpfilt.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <stdio.h>      /* sprintf() */
    #include <stdlib.h>     /* rand() */
    #include <string.h>

    /* This function is called when the filter is loaded by the web server */
    BOOL WINAPI GetFilterVersion( HTTP_FILTER_VERSION *pVer )
    {
        /* Set the filter version */
        pVer->dwFilterVersion = HTTP_FILTER_REVISION;

        /* Set a description string for the filter */
        strncpy( pVer->lpszFilterDesc, "A Cookie Filter",
                 SF_MAX_FILTER_DESC_LEN );

        /* Ask to be notified when the headers have been processed */
        pVer->dwFlags = SF_NOTIFY_SECURE_PORT |
                        SF_NOTIFY_NONSECURE_PORT |
                        SF_NOTIFY_PREPROC_HEADERS;
        return TRUE;
    }

    /* This function fills (*buffer) with (count) random digits */
    static void RandomBytes( char *buffer, int count )
    {
        int i = 0;
        for( i = 0; i < count; i++ )
            buffer[i] = '0' + (rand() % 10);
        buffer[i] = '\0';
    }

    /* This function is called for every HTTP request */
    DWORD WINAPI HttpFilterProc( PHTTP_FILTER_CONTEXT pfc,
                                 DWORD notificationType,
                                 VOID *pvNotification )
    {
        HTTP_FILTER_PREPROC_HEADERS *headers =
            (HTTP_FILTER_PREPROC_HEADERS *) pvNotification;
        char cookie[256], url[256];
        int cookielen = 256, urllen = 256;

        /* Get header information */
        if( headers->GetHeader( pfc, "Cookie:", cookie, &cookielen ) &&
            headers->GetHeader( pfc, "url", url, &urllen ) &&
            cookielen > 1 && urllen > 1 ) {
            /* We have a cookie, log it */
            int fd = open( "/tmp/cookie.log", O_WRONLY|O_CREAT|O_APPEND, 0755 );
            if( fd != -1 ) {
                char outbuff[514];
                sprintf( outbuff, "%s %s\n", cookie, url );
                write( fd, outbuff, strlen( outbuff ) );
                close( fd );
            }
        } else {
            /* Set a cookie header with a random cookie */
            char msg[256];
            RandomBytes( cookie, 16 );
            sprintf( msg, "Set-Cookie: %s\r\n", cookie );
            pfc->AddResponseHeaders( pfc, msg, 0 );
        }

        /* Return, instructing the server to notify the next module */
        return SF_STATUS_REQ_NEXT_NOTIFICATION;
    }

A simple page counter (ISAPI extension)

Here is a simple example using an ISAPI extension which creates a simple page counter. It is written in C.
    /*
     *
     * A simple page counter
     *
     * This ISAPI extension writes out the number of times it has been accessed.
     * NOTE: It should not be run out-of-process, as several threads might
     * try to access the counter variable at the same time.
     *
     */

    /* Include ISAPI definitions and constants */
    #include <httpext.h>
    #include <stdio.h>      /* sprintf() */
    #include <string.h>     /* strlen(), strncpy() */

    /* This variable holds the number of times this page has been accessed */
    static int hits = 0;

    /* This function is called when the extension is loaded by the web server */
    BOOL WINAPI GetExtensionVersion( HSE_VERSION_INFO *pVer )
    {
        /* Set the extension version */
        pVer->dwExtensionVersion = HSE_VERSION;

        /* Set a description string for the extension */
        strncpy( pVer->lpszExtensionDesc, "A Simple Page Counter",
                 HSE_MAX_EXT_DLL_NAME_LEN );
        return TRUE;
    }

    /* This function is called when the extension is accessed */
    DWORD WINAPI HttpExtensionProc( LPEXTENSION_CONTROL_BLOCK ecb )
    {
        char *header = "Content-Type: text/plain";
        int headerlen = strlen( header );
        char msg[256];
        int msglen;

        /* Use a server support function to write out a header with our
           additional header information */
        ecb->ServerSupportFunction( ecb->ConnID, HSE_REQ_SEND_RESPONSE_HEADER,
                                    0, &headerlen, (DWORD *)header );

        /* Write out the number of accesses */
        sprintf( msg, "This page has been accessed %d times", ++hits );
        msglen = strlen( msg );
        ecb->WriteClient( ecb->ConnID, msg, &msglen, 0 );

        /* Return, indicating success */
        return HSE_STATUS_SUCCESS;
    }

A simple thread-safe page counter (ISAPI extension)

Here is a simple example using an ISAPI extension to create a simple thread-safe page counter. It is written in C.

    /*
     *
     * A simple page counter
     *
     * This ISAPI extension writes out the number of times it has been accessed.
     * NOTE: This code is thread-safe and can be run as an out-of-process
     * extension.
     *
     */

    /* Include ISAPI definitions and constants */
    #include <httpext.h>
    #include <pthread.h>    /* Assuming you are using POSIX threads */
    #include <stdio.h>      /* sprintf() */
    #include <string.h>     /* strlen(), strncpy() */

    /* This variable holds the number of times this page has been accessed */
    int hits = 0;
    pthread_mutex_t hits_mutex;

    /* This function is called when the extension is loaded by the web server */
    BOOL WINAPI GetExtensionVersion( HSE_VERSION_INFO *pVer )
    {
        /* Set the extension version */
        pVer->dwExtensionVersion = HSE_VERSION;

        /* Set a description for the extension */
        strncpy( pVer->lpszExtensionDesc, "A Simple Thread-Safe Page Counter",
                 HSE_MAX_EXT_DLL_NAME_LEN );

        /* Initialize the mutex */
        pthread_mutex_init( &hits_mutex, NULL );
        return TRUE;
    }

    /* This function is called when the extension is accessed */
    DWORD WINAPI HttpExtensionProc( LPEXTENSION_CONTROL_BLOCK ecb )
    {
        char *header = "Content-Type: text/html";
        int headerlen = strlen( header );
        char msg[256];
        int msglen;

        /* Use a server support function to write out a header with our
           additional header information */
        ecb->ServerSupportFunction( ecb->ConnID, HSE_REQ_SEND_RESPONSE_HEADER,
                                    0, &headerlen, (DWORD *)header );

        /* Lock the mutex (blocks until it is available) */
        pthread_mutex_lock( &hits_mutex );

        /* Update the number of hits */
        sprintf( msg, "This page has been accessed %d times", ++hits );

        /* Now unlock the mutex */
        pthread_mutex_unlock( &hits_mutex );

        /* Write out the number of accesses */
        msglen = strlen( msg );
        ecb->WriteClient( ecb->ConnID, msg, &msglen, 0 );

        /* Return, indicating success */
        return HSE_STATUS_SUCCESS;
    }

Multi-language web site support (ISAPI filter)

Here is a simple example using an ISAPI filter to enable multi-language support. It is written in C.

    /*
     * Multi-language support
     *
     * This ISAPI filter reads the 'Accept-Language:' HTTP header and tries
     * to serve a file in the corresponding language.
     * e.g.
     *      if language is 'fr', 'index.html' maps to 'index.fr.html'
     *
     */

    /* Include ISAPI definitions and constants */
    #include <httpfilt.h>
    #include <ctype.h>      /* isalpha() */
    #include <string.h>
    #include <sys/stat.h>

    /* This function is called when the filter is loaded by the web server */
    BOOL WINAPI GetFilterVersion( HTTP_FILTER_VERSION *pVer )
    {
        /* Set the filter version */
        pVer->dwFilterVersion = HTTP_FILTER_REVISION;

        /* Set a description for the filter */
        strncpy( pVer->lpszFilterDesc, "Language Negotiation Filter",
                 SF_MAX_FILTER_DESC_LEN );

        /* Ask to be notified when the URL has been mapped to a path name */
        pVer->dwFlags = SF_NOTIFY_SECURE_PORT |
                        SF_NOTIFY_NONSECURE_PORT |
                        SF_NOTIFY_URL_MAP;
        return TRUE;
    }

    #define PREAMBLE_LENGTH 22   /* length of "HTTP_ACCEPT_LANGUAGE: " */

    /* This function is called for every HTTP request */
    DWORD WINAPI HttpFilterProc( PHTTP_FILTER_CONTEXT pfc,
                                 DWORD notificationType,
                                 VOID *pvNotification )
    {
        HTTP_FILTER_URL_MAP *map = (HTTP_FILTER_URL_MAP *) pvNotification;
        char *lang, *nfile;
        char buffer[4096];
        int size = sizeof(buffer);

        /* Retrieve all available HTTP headers */
        pfc->GetServerVariable( pfc, "ALL_HTTP", buffer, &size );

        /* Find the 'Accept-Language:' header */
        lang = strstr( buffer, "HTTP_ACCEPT_LANGUAGE" );
        if( lang ) {
            /* We found the header */
            char *p;
            lang += PREAMBLE_LENGTH;            /* Skip "HTTP_ACCEPT_LANGUAGE: " */
            for( p = lang; isalpha(*p); p++ )   /* Skip letters */
                ;
            *p = '\0';                          /* Terminate after first language */

            /* Now look for a file with that language in the name.
             * The new name holds the path, the language, one extra '.'
             * and the terminating NUL, hence the 2 extra bytes. */
            nfile = pfc->AllocMem( pfc,
                2 + strlen( lang ) + strlen( map->pszPhysicalPath ), 0 );
            p = strchr( map->pszPhysicalPath, '.' );

            if( p && nfile ) {
                /* Build up the new filename */
                struct stat st;
                int c = p - map->pszPhysicalPath + 1;
                memcpy( nfile, map->pszPhysicalPath, c );
                strcpy( nfile + c, lang );
                c += strlen( lang );
                strcpy( nfile + c, p );

                /* If the file exists, use it! */
                if( !stat( nfile, &st ) )
                    map->pszPhysicalPath = nfile;
            }
        }
        return SF_STATUS_REQ_NEXT_NOTIFICATION;
    }

Generating a web site front page from a Java Servlet (ISAPI filter)

Here is a simple example of an ISAPI filter that demonstrates the generation of a web site's front page using a Java Servlet. It is written in C.

    /*
     * Map requests for '/' to '/dir/file'
     *
     * This ISAPI filter shows how to make the frontpage of a website be generated
     * by a Java Servlet.
     *
     * Normally Java Servlets are mounted under a URL prefix, such as '/jserv'.
     * However this means it is not possible to mount Java Servlets
     * directly under '/', since then a request for any URL, such as
     * /images/bar.gif, would get mapped onto a Java Servlet.
     *
     * This ISAPI filter rewrites requests for the frontpage '/' onto a Java
     * Servlet mounted under '/jserv', i.e. requests for '/' transparently
     * get mapped onto '/jserv/frontpage'.
     *
     */

    /* Include ISAPI definitions and constants */
    #include <httpfilt.h>
    #include <string.h>

    /* This is the URL that / will be mapped to */
    #define NEW_URL "/jserv/SimpleServlet"

    /* This function is called when the filter is loaded by the web server */
    BOOL WINAPI GetFilterVersion( HTTP_FILTER_VERSION *ver )
    {
        /* Set the filter version */
        ver->dwFilterVersion = HTTP_FILTER_REVISION;

        /* Set a description string for the filter */
        strncpy( ver->lpszFilterDesc, "Map / to /dir/file",
                 SF_MAX_FILTER_DESC_LEN );

        /* We want to hook onto the pre-processing headers stage */
        ver->dwFlags = SF_NOTIFY_PREPROC_HEADERS;
        return TRUE;
    }

    /* This function is called for every HTTP request */
    DWORD WINAPI HttpFilterProc( PHTTP_FILTER_CONTEXT pfc,
                                 DWORD notificationType,
                                 LPVOID notificationInfo )
    {
        char url[1024];
        int i = sizeof(url);   /* size of the buffer passed to GetHeader */
        PHTTP_FILTER_PREPROC_HEADERS tbl =
            (PHTTP_FILTER_PREPROC_HEADERS) notificationInfo;

        /* Retrieve the URL asked for */
        if( tbl->GetHeader( pfc, "url", url, &i ) == TRUE &&
            !strcmp( url, "/" )) {
            /* This is a request for '/',
             * transparently rewrite this to 'NEW_URL' */
            tbl->SetHeader( pfc, "url", NEW_URL );
        }
        return SF_STATUS_REQ_NEXT_NOTIFICATION;
    }

Transparently mapping requests from one URL to another (ISAPI filter)

Here is a simple example using an ISAPI filter to show how to transparently map requests from one URL to another. It is written in C.

    /*
     * Map 'www.isp.com/USER' to 'www.USER.isp.com'
     *
     * This ISAPI filter shows how to transparently map requests for
     * 'http://www.myisp.com/user/page' to
     * 'http://www.user.myisp.com/page'
     *
     * This is designed for ISPs wanting to migrate from legacy hosting systems
     * to Zeus's support for subservers, allowing mass-hosting customers to have
     * their own domain. This works extremely well with BIND 8.x's support for
     * wildcarded DNS, i.e. you set up a single DNS record for
     * www.*.myisp.com -> IP address.
     *
     */

    /* Include ISAPI definitions and constants */
    #include <httpfilt.h>
    #include <string.h>

    static char* HOSTINGDOMAIN = "hosting.myisp.com";   /* New hosting domain */
    static char* HOLDPAGE      = "myisp.net";           /* Where to send bad URLs */

    /* This function is called when the filter is loaded by the web server */
    BOOL WINAPI GetFilterVersion( HTTP_FILTER_VERSION *ver )
    {
        /* Set the filter version */
        ver->dwFilterVersion = HTTP_FILTER_REVISION;

        /* Set a description string for the filter */
        strncpy( ver->lpszFilterDesc,
                 "Map www.isp.com/USER/ to www.USER.isp.com/",
                 SF_MAX_FILTER_DESC_LEN );

        /* We want to hook onto the pre-processing headers stage */
        ver->dwFlags = SF_NOTIFY_PREPROC_HEADERS;
        return TRUE;
    }

    /* This function is called for every HTTP request */
    DWORD WINAPI HttpFilterProc( PHTTP_FILTER_CONTEXT pfc,
                                 DWORD notificationType,
                                 LPVOID notificationInfo )
    {
        char url[1024];
        char dest[1024+128], *p;
        int i = sizeof(url);   /* size of the buffer passed to GetHeader */
        PHTTP_FILTER_PREPROC_HEADERS tbl =
            (PHTTP_FILTER_PREPROC_HEADERS) notificationInfo;

        /* This is the header we will use to redirect the client */
        memcpy( dest, "Location: http://www.", 21 );
        p = dest + 21;

        /* Retrieve the URL asked for */
        if( tbl->GetHeader( pfc, "url", url, &i )
            == TRUE ) {
            /* Set up new destination */
            char* u = url;
            while( *u == '/' ) u++;                    /* Strip off leading '/' */
            if( !*u ) {
                /* No username given, send hold-page */
                strcpy( p, HOLDPAGE );
            } else {
                while( *u && *u != '/' ) *p++ = *u++;  /* Copy over user name */
                *p++ = '.';                            /* Add '.' */
                strcpy( p, HOSTINGDOMAIN );            /* Add new hosting domain */
                p += strlen( HOSTINGDOMAIN );
                strcpy( p, u );                        /* Add rest of URL */
            }
        } else {
            /* Couldn't read URL header -- just redirect to ISP front-page */
            strcpy( p, HOLDPAGE );
        }

        /* Redirect to new destination */
        pfc->ServerSupportFunction( pfc, SF_REQ_SEND_RESPONSE_HEADER,
                                    "302", (LPDWORD) dest, 0 );
        i = 0;
        pfc->WriteClient( pfc, "", &i, 0 );

        /* Instruct web server not to process this request any further */
        return SF_STATUS_REQ_FINISHED;
    }

NSAPI

NSAPI (Netscape Server API) Introduction

NSAPI, Netscape's web server API, provides a programmer's interface to the core operations of a web server. NSAPI applications can access and modify the internal state of a web server, and may contribute to processing a user's request, authenticating it, locating resources, generating content and logging any results.

Many third-party commercial web applications are written to the NSAPI specification. Zeus supports NSAPI to provide a migration path for customers using these applications. The Zeus NSAPI module provides a binary-compatible emulation of Netscape's NSAPI interface v3.0 and v4.0. Zeus does not provide any support for NSAPI application development.

Advantages
• Many commercial third-party applications interface to the web server using NSAPI
• High-performance multi-threaded API
• Rich, comprehensive API allowing all aspects of the web server behaviour to be modified.

Disadvantages
• Proprietary API designed by Netscape
• NSAPI applications must be MT-safe (multi-thread safe). Writing stable multi-threaded applications is the domain of a professional software team; such code is complicated and very difficult to debug.
Sample NSAPI configuration files

NSAPI configuration is held in Netscape-style 'obj.conf' and 'magnus.conf' files. In addition, if your web server runs as the root user, ZWS allows you to configure an alternative user and group that the NSAPI execution environment will use.

Three Netscape-like configuration files are required: 'obj.conf', 'mime.types' and 'magnus.conf'. These three files must be stored in an https-<vserver name>/config directory, by default $ZEUSHOME/ns-config/https-<vserver name>/config. You can download default configuration files by using the links for the obj.conf, mime.types and magnus.conf templates in the NSAPI configuration page of a virtual server in your Zeus Administration Server. These should be used in preference to the examples displayed below.

Sample obj.conf file

    # Sample obj.conf file
    #
    # $ZEUSHOME/ns-config/https-foo/config/obj.conf

    Init fn="load-modules" funcs="myauth" shlib="/usr/local/nsapi/libs/auth.so"
    Init fn="load-modules" funcs="myrewrite, mylog, mylog-open" shlib="/usr/local/nsapi/libs/util.so"
    Init fn="mylog-open" logfile=/usr/local/auth/access.log

Init commands are run at startup. The first two commands import the function 'myauth' from the shared library 'auth.so', and 'myrewrite', 'mylog' and 'mylog-open' from 'util.so'. The third command runs the function 'mylog-open' once at startup. Note that iPlanet and SunONE servers require that Init functions are stored in the magnus.conf file.

Continuing the obj.conf file...

    <Object name="default">
    AuthTrans fn=basic-auth auth-type="basic" userdb="/usr/local/auth/users.dat" userfn=myauth
    NameTrans fn=myrewrite from=/somewhere/ to=http://www.somewhere.else.com/
    </Object>

    <Object ppath="*/secure*">
    PathCheck fn=require-auth realm="Secure Site" auth-type="basic" auth-user="(pete|joe|karen)"
    AddLog fn=mylog
    </Object>

The first word of a command indicates the stage in the web server's processing at which the command should be executed. All commands, other than Init commands, are grouped into 'objects'.
Commands in the 'default' object are run on every request. Additional 'objects' can be used to augment the default object. The easiest way to specify additional objects is to give them a path regexp which matches a full filesystem name.

Commands are one line long, but any line beginning with whitespace (space or tab) is taken to be a continuation of the previous line.

Note: this sample obj.conf supposes the user has created or acquired NSAPI libraries auth.so and util.so containing the appropriate functions.

Sample mime.types file

    # Supplementary mime types for NSAPI
    #
    # $ZEUSHOME/ns-config/https-foo/config/mime.types

    type=image/gif    exts=gif
    type=text/html    exts=log,dat

These mime types override the Zeus mime-typer.

Sample magnus.conf file

    # Default Zeus magnus.conf
    #
    # $ZEUSHOME/ns-config/https-foo/config/magnus.conf

    LoadObjects obj.conf

'LoadObjects' provides the name of the obj.conf file. All other tags in this file are ignored.

Perl Extensions

Perl Extensions Introduction

Zeus Web Server (4.2 and later) includes an external runner with an embedded Perl interpreter and a set of support modules that allow you to run persistent Perl applications and scripts alongside the web server. These Perl applications can modify the behaviour of the web server in a wide range of ways, from implementing custom authentication schemes to changing the way that requests are logged.

The supporting Perl modules allow you to run a wide range of existing applications developed for Apache's mod_perl interface in the web server. This makes it possible to run a large number of existing applications on the Zeus Web Server without any modification.

Handlers

The Perl Extensions add a number of extra hooks to the Zeus Web Server's request processing. On each of these hooks it is possible to install a Perl handler.
Handlers can collect information about the incoming request and manipulate the response either by generating it themselves or by issuing directives to the web server to adjust its behaviour.

Process model

The Perl interpreters responsible for executing handlers do not run in the web server itself, but in a pool of external processes. The web server communicates with these processes whenever it wants to run a handler. This model allows Perl handlers to do whatever they need to do, including blocking, without hampering the performance of the optimized web server core.

If dealing with a client connection involves executing several handlers, the same Perl runner will be used for all of them. This allows the semantics of Apache mod_perl to be preserved when passing information from one handler to another.

When a handler completes, the Perl interpreter continues running, so state may be preserved for later requests; this may allow, for instance, the use of persistent database connections in order to avoid the overhead of setting them up every time.

Getting started

Please refer to our getting started guide. For extensive documentation on Apache mod_perl, much of which applies to the Zeus Perl Extensions, see the mod_perl web site.

Advantages
• Allows extension modules to be written in a convenient scripting language.
• A large number of modules and applications are freely available for Apache mod_perl, and many will run without alterations with Zeus Perl Extensions.
• Powerful API, including rewriting, authentication, content generation, logging, and a number of other hooks, with a broad selection of support functions.
• Built-in CGI acceleration layer.

Disadvantages
• Somewhat slower than ISAPI and NSAPI due to the overhead of an embedded Perl interpreter and the current requirement for a multi-process rather than multi-threaded model.
• Not all applications built on Apache mod_perl can be supported, due to some of them relying on the Apache web server itself.
• By default, all Perl applications are loaded into the same pool of runner processes, so they all run in the same security domain.

Getting started with Perl Extensions

This document aims to answer your questions about getting started with Zeus Perl Extensions. If you have any other queries, please email [email protected].

1.1 How can I install a standard mod_perl application?

Installation

$ZEUSHOME/zperl/bin/perl is the perl executable modified to work within the confines of a Zeus installation.

Most third-party Perl modules and Apache mod_perl applications have a standard installation procedure that begins by running a Makefile.PL script to configure the installation process. We recommend that you install modules and applications into $ZEUSHOME/zperl/local. To do this, instead of using:

    $ perl Makefile.PL

to configure the modules you are installing, use:

    $ $ZEUSHOME/zperl/bin/perl Makefile.PL

If any prerequisites are missing, the Makefile.PL will warn you. You can resolve prerequisites manually by obtaining and installing the required software, or you can handle them automatically by installing the software through CPAN (see below).

Running the Makefile.PL will produce a Makefile which then installs the software into $ZEUSHOME/zperl/local. You should then continue to follow the installation and configuration instructions that came with your third-party software. These normally suggest running the following two commands:

    $ make
    $ make install

You will need to run the make install command as the same user that installed the Zeus Web Server; this may be the root user.

Configuration

You can configure the Perl runner either through the UI or using .htaccess files. The former is easier for CGI script migration, while the latter is what will generally appear in the instructions for installing Apache mod_perl applications. To request that a given file be run through the Zeus::ModPerl::Registry module, set its MIME type to application/x-httpd-perl.
Handlers can be set on each request phase from the Perl page in the UI.

Common migration issues

The instructions for some Apache mod_perl applications may ask you to put directives in httpd.conf. In Zeus, these should go in a global htaccess file, a per-directory .htaccess file, or the virtual server configuration.

1.2 How do I install new modules?

You can install applications and modules directly from CPAN (the Comprehensive Perl Archive Network). This makes the handling of dependencies and prerequisites much easier. Use the CPAN shell as follows:

    $ZEUSHOME/zperl/bin/perl -MCPAN -e shell

(Make sure you use the full path to the perl binary, otherwise modules may be installed into the wrong place.)

The CPAN shell will then download, compile, test, and install new modules under $ZEUSHOME/zperl. Consequently, you should run the shell as the same user you installed Zeus Web Server as.

If you've not used CPAN before, the shell will first run through some configuration questions, asking you for the location of standard tools, etc.

Note that CPAN uses FTP (or wget, which uses FTP) to access information. If you are behind a firewall which prohibits active FTP, you may need to explicitly configure your FTP programs to use passive FTP. At configuration, you can do this by specifying the following settings:

    Where is your wget program? [/usr/bin/wget] /usr/bin/wget --passive-ftp
    Where is your ftp program? [/usr/bin/ftp] /usr/bin/pftp

After configuration, you can set these configuration variables using the o conf command at the cpan> prompt. o conf lists the configuration variables; set and save them as follows:

    cpan> o conf ftp /usr/bin/pftp
        ftp        /usr/bin/pftp
    cpan> o conf commit
    commit: wrote $ZEUSHOME/zperl/lib/CPAN/Config.pm

Once you've configured the shell, you may want to download some standard modules before proceeding:

    cpan> install MD5 Bundle::libnet

Other useful hints

You may find the configuration variables 'makepl_arg', 'make_arg' and 'make_install_arg' useful for supplying additional parameters to the make and install process, such as additional library and include file search paths. For more information, see the CPAN documentation.

1.3 Why are the built-in modules called Zeus::ModPerl::*?

The modules built into Apache mod_perl are placed in the Apache namespace: Apache::Registry, Apache::Util, and so on. To avoid confusion, the Zeus equivalents to these modules are called Zeus::ModPerl::Registry, Zeus::ModPerl::Util, and the like. A set of small wrapper modules is in place for backward compatibility.

If your code needs to be portable between Apache and Zeus, you should use the Apache namespace. Otherwise you can use either that or the Zeus::ModPerl namespace as you wish.

1.4 What is the difference between Zeus's bundled Perl and the standard Perl?

There are no significant differences between the Zeus Perl and standard Perl v5.6.1. Zeus bundles its own version to ensure that the web server uses a well tested and validated version of Perl. It is installed under $ZEUSHOME/zperl.

Zeus's version of Perl has been modified slightly to facilitate the bundling; details are published in the Zeus Perl Licence statement.

2.1 Perl Extensions allow Perl scripts to be persistent. How does this help me?

All handlers configured for the same request run in the same Perl runner process, so they can share state at the server side.
Mechanisms for this include:
• databases
• shared memory
• "notes"

Furthermore, since Perl runner processes remain active after dealing with a request, it is possible to store information from one request to another. This may be a performance advantage in some situations. For example, if you are authenticating users against a database, the most expensive part of the procedure is usually setting up and tearing down connections to the database server. Instead, you can take advantage of the Perl runner's persistence and set up a connection once per process, which will then speed up the second and subsequent authentication operations.

2.2 How do handlers interact with each other?

Perl handlers run in a fixed sequence for each request, as follows:

    Handler                       Purpose
    PerlChildInitHandler          special handler, called when a new Perl runner child process starts up
    PerlPostReadRequestHandler    called immediately after incoming request headers have been read
    PerlTransHandler              translation of URLs to physical filenames
    PerlHeaderParserHandler       called just after local .htaccess files have been read
    PerlAccessHandler             access control
    PerlAuthenHandler             authentication (who is the user?)
    PerlAuthzHandler              authorization (is the authenticated user permitted to access this page?)
    PerlTypeHandler               MIME type mapping
    PerlFixupHandler              called just before content generation, and used to make last-minute changes to the request
    PerlHandler                   content generation
    PerlLogHandler                logging
    PerlCleanupHandler            resource deallocation and other cleanup tasks
    PerlChildExitHandler          special handler, called when a Perl runner child process shuts down

Multiple handlers may be installed on each of these stages, a process known as "stacking". In that case, the return code of each handler determines what happens to subsequent stacked handlers. At the URI translation, MIME type mapping, and content generation stages, returning DECLINED will continue to the next handler, while returning anything else will break out and return control to the Zeus Web Server; at all other stages OK and DECLINED will continue to the next handler, while anything else will return control to ZWS.

2.3 How do handlers interact with the Zeus Web Server?

When a handler returns an HTTP status code or DONE, the web server will immediately return an appropriate response to the client. OK and DECLINED cause the web server to continue with its normal sequence of request processing, except that at the content generation stage returning OK means that content generation is complete.

Parts of the API provided to handlers involve manipulating the web server's record of the client request or the response it will send. When these are used, the handler sends a notification to the web server, which then adjusts its internal state accordingly.

3.1 Why do I get the error 'Variable "$foo" will not stay shared' when running CGI scripts under Zeus::ModPerl::Registry?

This is a common problem when porting CGI scripts. For an explanation of its cause and how to avoid it, see the Apache mod_perl porting guide.

3.2 What security issues are involved when using Zeus Perl Extensions?

All Perl handlers run in the same security domain: that is, the user and group of the web server. This is likely to be inappropriate if you have untrusted users developing content on your web server; they should not be able to run Perl handlers under the web server's user id.

You should be careful to enable the SSI #perl command only if all developers writing server-parsed HTML pages are trusted, since it allows them to execute arbitrary code on the web server. This command is disabled by default.

3.3 What happens when a Perl extension "dies"?

When a Perl extension dies, the web server will abort the current hook.
This can cause unpredictable or undesirable behaviour within other modules in use by the web server that also relate to the current hook. This presents a problem if you have code in your extension that relies on an action such as inserting rows into an external database.

To add fault tolerance to your extension, you can wrap your code in an eval{} block and print the result as a warning message.

    use SOME::MODULES;

    my $foo = "bar";

    sub handler {
        eval {
            ... YOUR CODE ....
        };
        warn $@ if $@;   # Print the exception if there was one.
        return OK;
    }

Zeus Web Server Perl Licence

Zeus Web Server contains a slightly modified version of Perl (5.6 in Zeus Web Server 4.2, 5.8 in Zeus Web Server 4.3 and later).

Use of Perl is governed by the Perl Artistic License. This allows the Perl interpreter, its support files, and modified versions of the same to be distributed freely for commercial and non-commercial purposes.

The licence requires that all modifications, such as the following, are placed in the public domain.

The following patch was applied to the Perl source tree before building the Perl distribution:
• perl.build.diffs (Perl 5.6)
• perl.build.5.8.diffs (Perl 5.8)

The following patch was applied to Perl after building it but before turning it into a package:
• perl.install.diffs (Perl 5.6)
• perl.install.5.8.diffs (Perl 5.8)

Authentication-Driven Content

A simple authentication handler either accepts or rejects users. If you instead want to give users different output depending on whether they have authenticated, you could use a pair of Perl handlers, one on the authentication hook and one on content generation.

If the user tries to authenticate, the authentication handler checks that the username and password match (in this case, in the Unix password file), and creates a "note" that the content handler should enter authenticated mode.
(We use the pnotes method here rather than notes since there is no need for the note to be sent back to the web server; if we needed to log the contents of the note, then the pnotes method would not be sufficient.) Otherwise, the authentication handler does not create a note and simply allows the user through in anonymous mode.

The content handler then checks whether the note is present, and prints one page or the other depending on the result of that check.

    package AuthContent;

    use strict;
    use Zeus::ModPerl;
    use Zeus::ModPerl::Constants qw( :common );

    sub authen ($$) {
        my( $pkg, $r ) = @_;

        # Fetch the password supplied with the request, if any.
        my( $ret, $password ) = $r->get_basic_auth_pw();
        if( $ret != OK ) { return $ret }

        my $errmsg = 'unknown user or password mismatch';
        my $user = $r->connection()->user();
        if( defined($user) ) {
            my @pw = getpwnam( $user );
            # Reject users missing from the password file as well as
            # users whose password does not match.
            if( !@pw || (crypt($password,$pw[1]) ne $pw[1]) ) {
                $r->note_basic_auth_failure();
                $r->log_reason( $errmsg, $r->filename() );
                return AUTH_REQUIRED;
            }
            # Tell the content handler to enter authenticated mode.
            $r->pnotes( authenticated => 1 );
        }
        return OK;
    }

    sub content ($$) {
        my( $pkg, $r ) = @_;
        if( $r->pnotes( 'authenticated' ) ) {
            my $user = $r->connection()->user();
            print( <<EOF );
    <html>
    <head><title>Authenticated page</title></head>
    <body>You are authenticated as user '$user'.</body>
    </html>
    EOF
        } else {
            print( <<EOF );
    <html>
    <head><title>Anonymous page</title></head>
    <body>You are not authenticated.</body>
    </html>
    EOF
        }
        return OK;
    };

    1;

(The here-document text and its EOF terminators must start in the first column of the real source file; they are indented here only as part of the listing.)

To install this example using the Administration Server user interface, put the above code in $ZEUSHOME/zperl/local/lib/site_perl/AuthContent.pm, enable Perl Extensions in the user interface, add the paths that should be controlled by this, and add AuthContent->authen as an authentication handler and AuthContent->content as a content generation handler for those paths. You will also need to enable the htaccess support, and create an htaccess file with an 'authname' entry, and a global htaccess file with 'passenvauthorization on' in a suitable 'Directory' tag.
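The note-passing pattern used by the two handlers can be sketched outside the server. The following Python model is only an illustration of the control flow, not Zeus or mod_perl code: the Request class, the USERS table and the numeric return codes are all hypothetical stand-ins, and the password check uses an in-memory SHA-256 digest rather than the Unix password file.

```python
import hashlib
import hmac

# Stand-ins for the handler return codes (hypothetical values).
OK, AUTH_REQUIRED = 0, 401

# Hypothetical user table: username -> SHA-256 hex digest of the password.
# A plain digest is used for brevity only; it is not a recommended scheme.
USERS = {"alice": hashlib.sha256(b"secret").hexdigest()}


class Request:
    """Minimal model of a request carrying per-request notes."""

    def __init__(self, user=None, password=None):
        self.user = user
        self.password = password
        self.notes = {}  # plays the role of pnotes: never leaves the runner


def authen(r):
    """Authentication stage: reject bad credentials, note good ones."""
    if r.user is None:
        return OK  # no credentials supplied: continue in anonymous mode
    expected = USERS.get(r.user)
    supplied = hashlib.sha256((r.password or "").encode()).hexdigest()
    if expected is None or not hmac.compare_digest(expected, supplied):
        return AUTH_REQUIRED
    r.notes["authenticated"] = True
    return OK


def content(r):
    """Content stage: choose the page according to the note."""
    if r.notes.get("authenticated"):
        return f"You are authenticated as user '{r.user}'."
    return "You are not authenticated."
```

The point of the design is that the content stage never re-checks credentials; it only looks for the note left by the earlier stage, which works because both stages handle the same request object in the same process.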
Logging to a Database

A frequently requested feature in Zeus has been the ability to log to a database. With the Perl extensions this can be done in quite a small amount of code. Here's how.

Consider a simple database schema containing three fields: requested URL, remote host, and user agent. The following code, when installed on the logging stage from the Administration Server user interface, will insert a row into this database after each request.

    package DBLog;

    use strict;
    use Zeus::ModPerl;
    use Zeus::ModPerl::Constants qw( OK );
    use DBI;

    use constant SQL_LOG_ACCESS =>
        q|INSERT INTO access (url, remote_host, user_agent) VALUES (?, ?, ?)|;

    sub handler {
        eval {
            my $r           = shift( @_ );

            # Connection settings come from the Perl configuration variables.
            my $db          = $r->dir_config( 'LogDatabase' );
            my $dbuser      = $r->dir_config( 'LogUser' );
            my $dbpassword  = $r->dir_config( 'LogPassword' );

            # The three values to be logged.
            my $url         = $r->uri();
            my $remote_host = $r->get_remote_host();
            my $user_agent  = $r->header_in( 'User-Agent' );

            my $dbh = DBI->connect( $db, $dbuser, $dbpassword,
                                    { AutoCommit => 1 } )
                or die( $DBI::errstr );
            my $sth = $dbh->prepare( SQL_LOG_ACCESS )
                or die( $dbh->errstr() );
            $sth->execute( $url, $remote_host, $user_agent )
                or die( $dbh->errstr() );
        };
        # Print a warning if there is one.
        warn $@ if $@;
        return OK;
    }

    1;

To install this example using the Administration Server user interface, put the above code in $ZEUSHOME/zperl/local/lib/site_perl/DBLog.pm, enable Perl Extensions in the user interface, add the paths that should be controlled by this, and add DBLog as a logging handler for those paths.

This handler uses the dir_config method to retrieve its configuration. You will need to set Perl configuration variables called LogDatabase, LogUser, and LogPassword; this can be done either using an .htaccess file or through the Admin Server user interface.

The eval{} block lets the code fail (die) without causing the web server's log hook to abort in the event that the database becomes inaccessible.
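The same two ideas, a parameterised INSERT and a guard that stops a database failure from propagating, can be tried without a web server. The sketch below is an assumption-laden stand-in: it uses Python's built-in sqlite3 module and an in-memory database instead of DBI, and the log_request function and its "OK" return value are hypothetical names chosen to mirror the handler above.

```python
import sqlite3
import sys

SQL_LOG_ACCESS = (
    "INSERT INTO access (url, remote_host, user_agent) VALUES (?, ?, ?)"
)


def log_request(conn, url, remote_host, user_agent):
    """Insert one access-log row; never let a database error escape."""
    try:
        # "with conn" commits on success and rolls back on error.
        with conn:
            conn.execute(SQL_LOG_ACCESS, (url, remote_host, user_agent))
    except sqlite3.Error as exc:
        # Equivalent of "warn $@": report the problem and carry on.
        print(f"access log insert failed: {exc}", file=sys.stderr)
    return "OK"


# Example database with the three-field schema described above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE access (url TEXT, remote_host TEXT, user_agent TEXT)"
)
log_request(conn, "/index.html", "pc203.example.com", "Mozilla/4.0")
```

Passing the values as placeholders rather than building the SQL string by hand matters here because the URL and User-Agent are supplied by the client and may contain quote characters.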
Advanced Rewriting

The post-read-request stage of request processing can be used for tasks similar to the Request Rewriting module. For example, a webmaster might want to use referrer checking to make sure that users enter the site through the front page, installing the following code at the post-read-request stage as Rewriting->referrer:

    package Rewriting;

    use strict;
    use Zeus::ModPerl;
    use Zeus::ModPerl::Constants qw( :response );

    my $host = 'my.web.site';

    sub referrer {
        my $pkg = shift;
        my $r   = shift;
        # HTTP spells this wrongly. Treat a missing header as empty.
        my $ref = $r->header_in( 'Referer' ) || '';
        my $re  = qr|^http://\Q$host\E|;

        if( $ref !~ $re ) {
            my $action = $r->dir_config( 'ReferrerAction' );
            if( defined($action) and (lc($action) eq 'reject') ) {
                return FORBIDDEN;
            } else {
                # Replace the last path component with denied.html.
                my $uri = $r->uri();
                $uri =~ s,/[^/]*$,,;
                $r->uri( "$uri/denied.html" );
            }
        }
        return DECLINED;
    }

    1;

When the uri method is called with an argument, it causes a notification to be sent to the web server. In this case, that notification will instruct the server to update its record of the requested URL. The handler returns DECLINED to indicate that other post-read-request handlers may still be run.

As another example, you might wish to have some user-agent-specific pages, installing the following code on the URI Translation hook as Rewriting->user_agent:

    sub user_agent {
        my $pkg = shift;
        my $r   = shift;
        my $ua  = $r->header_in( 'User-Agent' ) || '';

        if( $ua =~ /Netscape/ ) {
            $r->uri     ( $r->uri()      . '.netscape' );
            $r->filename( $r->filename() . '.netscape' );
        } elsif( $ua =~ /Explorer/ ) {
            $r->uri     ( $r->uri()      . '.explorer' );
            $r->filename( $r->filename() . '.explorer' );
        }
        return OK;
    }

We use the URI Translation hook here to take advantage of the fact that the web server has already done the hard work of mapping the URI to a filename. However, this means that the handler must keep the physical filename up to date itself using the filename method.
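The decision logic of the referrer handler is easy to check in isolation before it is installed anywhere. The following Python function is a standalone model of that logic only, not something to load into the server: the function name, the string return codes and the argument layout are hypothetical, while the host name, the 'reject' action and denied.html come from the example above.

```python
import re

HOST = "my.web.site"
# re.escape plays the role of \Q...\E: dots in the host match literally.
REFERER_RE = re.compile(r"^http://" + re.escape(HOST))


def rewrite_for_referrer(uri, referer, action=None):
    """Return (status, uri) as the referrer handler would decide them."""
    if referer is not None and REFERER_RE.match(referer):
        return "DECLINED", uri  # came from our own site: leave untouched
    if action is not None and action.lower() == "reject":
        return "FORBIDDEN", uri
    # Replace the last path component with denied.html.
    parent = re.sub(r"/[^/]*$", "", uri)
    return "DECLINED", parent + "/denied.html"


print(rewrite_for_referrer("/shop/item.html", "http://elsewhere.example/"))
```

Escaping the host name is the detail most often missed: without it, a referrer such as http://myXweb.site/ would also be accepted, because an unescaped dot matches any character.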
To install either of these examples using the Administration Server user interface, save the source code as $ZEUSHOME/zperl/local/lib/site_perl/Rewriting.pm, enable Perl Extensions in the user interface, add the paths that should be controlled, and add either Rewriting->referrer as a post-read-request handler or Rewriting->user_agent as a URI translation handler as appropriate.

Using Perl Extensions with SSI

The Zeus Perl Extensions offer a new SSI tag: #perl. This can be used as a simple way to embed the output of a Perl subroutine in an HTML page. Here is a trivial example:

    <html>
    <head><title>Calling Perl from SSI</title></head>
    <body>
    <!--#perl sub="sub { print \"hello, world!\"; }" -->
    </body>
    </html>

When the SSI tag is evaluated by the server, it calls the Perl runner and asks it to evaluate the Perl code in the sub argument. Anything that this code prints to standard output will then be included in the document. Naturally, you should make sure to enable this capability only on virtual servers whose content authors are trusted.

Consider a more complex example. A hosting company wants to insert random advertising banners into a set of pages that already use SSI. Each banner needs to be associated with an appropriate link. To do this, they write a simple configuration file containing image and link URLs, one pair per line, and use the following Perl handler to select a random pair to display.

    package RandomLink;

    use strict;
    use Zeus::ModPerl;
    use Zeus::ModPerl::Constants qw( OK );

    sub handler {
        my $r      = shift( @_ );
        my $source = shift( @_ );

        # Come up with a random image/link combination from the
        # configuration file $source.
        local( *CONFIG );
        open( CONFIG, '<', $source ) or return OK;
        my @config = <CONFIG>;
        close( CONFIG );

        # nothing to display if the file is empty:
        @config or return OK;

        # choose a line at random from the list:
        my $line = $config[ int(rand(@config)) ];

        # bail out if the line doesn't look right:
        # <image url> <whitespace> <link url>
        # (each URL is a run of non-whitespace characters)
        $line =~ /^\s*(\S+)\s+(\S+)/ or return OK;
        my( $image, $link ) = ( $1, $2 );

        $r->print( qq{<a href="$link"><img src="$image"></a>\n} );
        return OK;
    }

    1;

When this code is put in $ZEUSHOME/zperl/local/lib/site_perl/RandomLink.pm, the .shtml pages can then invoke the handler like this:

    <!--#perl sub="RandomLink" arg="/home/www/ads.config" -->

Request Rewriting

Request Rewriting Introduction

The request rewriting functionality can be used to change a requested URL into any other URL by running a script of rewrite commands to pre-process every request. It enables you to modify the URL and HTTP headers of a request in any way you want. The modified request is passed on to be processed in the usual way, returning the response page to the user.

Request rewriting can be used in many different ways. It is typically used to:
• Prevent “broken links” when a URL on your web site becomes obsolete. You may restructure a web site, moving commonly accessed files that users may have bookmarked. The rewrite scripts can change requests for the files from their old locations into requests for the files from their new locations.
• Store cookie information in URLs. Cookies are useful for tracking a user’s visits to a web site, but are easily disabled by users, making them ineffective. The request rewriting functionality can be used to effectively hide cookie-style information in URLs, enabling you to track users even if they have disabled cookies.
• Hide the implementation details of your web site. Some parts of your web site may only be available via lengthy or complicated URLs. The rewrite functionality can enable users to access these parts of your web site using much simpler URLs.
• Convert from an existing Apache web server that uses mod_rewrite. If you have an Apache web server that uses mod_rewrite scripts, you can convert these to run on Zeus Web Server.

Advantages
• High performance, low overhead.
• Simple to write
• Portable across many server platforms

Disadvantages
• Uses proprietary Zeus Rewrite scripting language
• Not compatible with Apache's mod_rewrite language
• Difficult to debug

Redirecting based on Client User Agent

Different browsers have different capabilities, and sometimes no two browsers display the same content in the same way. It is sometimes desirable to author separate content for each major web browser, then serve the version of a page best suited to the particular client.

    match IN:User-Agent into $ with ^Mozilla.*
    if matched then
        set URL = /content/index_x.htm
    else
        set URL = /content/index_y.htm
    endif

This rule allows different pages to be served to different browsers. For example, a more complex page could be served to Gecko-based browsers which properly support CSS, whilst a more plain page could be served to Internet Explorer (which, as far as CSS is concerned, is broken).

Note that many browsers can pretend to be Microsoft Internet Explorer or can specify a custom User Agent. This limits the effectiveness of this rule. Note also that Internet Explorer's own User-Agent string begins with "Mozilla/", so a production rule needs a more specific pattern than ^Mozilla.* to tell the two apart.

Converting Apache rules to ZWS

You are migrating sites from Apache to Zeus, and wish to convert your existing Apache rewrite rules by hand.

Apache:

    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteRule ^[^/]*\.html$ index.php

Zeus Web Server:

    match URL into $ with ^[^\/]*\.html$
    if matched then
        set URL = index.php
    endif

Aside from keywords and formatting, Apache and Zeus rules are very similar, although Zeus rules can be much more readable and offer additional functionality. The match expression itself can remain intact with no changes, whereas the arcane RewriteCond directive is rewritten in a form much closer to natural language.
The REQUEST_FILENAME variable of Apache is the URL variable in Zeus, upon which a pattern match is attempted in the first line of the Zeus rule. If this match succeeds, then the URL field used for the remainder of the request is set to the desired page.

Enhancing Gateway functionality

You are using the Gateway module to pass on requests to other servers, and wish to allow a downstream server access to client IP address details.

    set IN:ClientIPAddress = %{REMOTE_HOST}

When using the Gateway module, all downstream requests appear to originate from the web server machine. This rule allows downstream processes to see the IP address of the client instead.

Session Tags

Situation

A web designer is creating a site which will need users to log into it. Logging in will provide the user with a session tag, which must be provided to the site with each request. The web designer does not wish to rely on cookies or HTTP basic realm authentication, nor do they want to pass the session tag in a query string (or hidden form item) each time.

The web designer also has a lot of legacy scripts which they wish to re-use. The scripts expect session information to be provided in an environment variable, and some of them create URLs of the form www.site.com/SESSION=sessionstring/<rest of url>

Solution

The session tag can be stored in the hostname of the site, and a rewrite script can be used to put this information into an environment variable (or two) for the legacy scripts. Use a request rewrite script similar to the following:

    # Take URLs of the form www.site.com/SESSION=sessionstring/<url>
    # and create www.sessionstring.site.com and remove the rest.
    match URL into $ with ^/SESSION=([^./]+)/(.*)$
    if matched then
        set IN:Host = www.$1.site.com
        set URL = /$2
    endif
    # Done with that - fortunately we can now fall through to...

    # Take URLs of the form www.sessionstring.site.com and
    # create SESSION=sessionstring as an environment variable
    match IN:Host into $ with ^www\.([^.]+)\.site\.com
    if matched then
        set ENV:SESSION = $1
    endif

Explanation

The first set of rewrite rules translates URLs that contain SESSION=something into URLs of the format www.sessionstring.site.com. This is done by finding /SESSION=something/ at the start of the URI and removing it, rewriting the hostname to match the form www.sessionstring.site.com.

The second set of rules searches for session information in the Host header, and translates it into a SESSION environment variable.

The two rules together then present a consistent interface to the scripts running on the site, without breaking any existing functionality.

Vanity Domain Names

Situation

An ISP provides customers with free web hosting by providing them with web sites rooted at http://home.isp.com/~username/. This is proving unpopular and the ISP wishes instead to offer http://www.username.isp.com/ without changing how they provide the web space.

Solution

Create a Virtual Server for home.isp.com and configure the home directories functionality to provide web sites rooted at ~username. Next, go to the fundamental configuration settings for the Virtual Server and add an alias of '*' to it, so it will catch requests for www.username.isp.com. Then add a request rewrite script similar to the following:

    # Turn www.username.isp.com into home.isp.com/~username
    match IN:Host into $ with ^www\.([^\.]+)\.isp\.com
    if matched then
        set IN:Host = home.isp.com
        set URL = /~$1%{URL}
    endif
    # End rule for home directories

Explanation

The request rewriter executes the script for each request which is processed by the virtual server created above. For each request, the regular expression matching hostnames of the form www.username.isp.com is executed against the Host header. If that matches, then the regular expression captures the username into the $1 variable.
If the regex matched, then two more actions are performed:
1. The hostname is rewritten to home.isp.com;
2. The URL portion has /~username prepended to it.

Redirecting clients to Alternate Pages

Situation 1

You wish to transparently redirect one page within your document root to another. The basis of this example is from an international site with Language Negotiation enabled, but where the desired behavior was to redirect unhandled languages to a default English-language page. Some of the code necessary to do this has been removed for clarity.

Solution

    match URL into $ with ^/$
    if matched then
        set SCRATCH:LOC      = http://www.example.com/en/homepage.html
        set Response         = 303
        set OUT:Location     = %{SCRATCH:LOC}
        set OUT:Content-Type = text/html
        set Body             = <!doctype html public "-//W3C//DTD HTML 3.2//EN">
                               <html><head><title>303 See Other</title></head>
                               <body><p>See <a href="%{SCRATCH:LOC}">here</a>.</p></body></html>
    endif

(Note that the "set Body" section of the above example should be entered on a single line - it has been split here only for readability)

Description

This example will send a 303 ("See Other") response to requests for the root directory of a website only. These requests will then be redirected from index.html to /en/homepage.html whilst producing a standards-compliant redirect page for browsers that do not automatically redirect (an option available, albeit not enabled, for many current browsers).

Technical

For ease of maintenance, the location to redirect to is first stored in the temporary LOC variable. This saves having to have the same string appear twice in the rule, potentially breaking the rule if only one were updated.

Situation 2

You wish to operate a site where all content is stored on a secure server. To do this you have created two Virtual Servers, one standard and one secure, and wish to redirect all requests made to the standard server to the secure server instead, whilst keeping the details of the request intact.
Solution

    match URL into $ with (.*)
    if matched then
        set Response = 307
        set OUT:Location = https://intranet.example.com$1
        set OUT:Content-Type = text/html
        set Body = Go <a href="https://intranet.example.com$1">here</a> instead.
    endif

Description

This example will route clients from a directory on the standard server to the same place on the secure server.

Technical

This example uses the 307 ("Temporary Redirect") response rather than the older and more ambiguous 302 ("Found", originally "Moved Temporarily"). Either of these should cause the client browser to attempt the original page for every request and not cache the fact that the page is moved. If instead the browser should only attempt a single GET on the standard page and then honor the response, code 303 should be returned as detailed in the redirect example.

Note how the original request (including the initial "/") is mapped into $1, which is then appended to the destination location. The match could be written as "^/(.*)" to exclude the initial slash, allowing Location to be set to "https://intranet.example.com/$1", which probably has a lesser chance of being misunderstood.

Situation 3

You are a company providing hosting facilities, and would like to allow your Web Server to handle redirections to the domains you host, rather than relying on (and having to update) a large number of DNS entries.

Solution

    match IN:Host into $ with ^.+$
    if matched then
        set RESPONSE = 302
        set OUT:Location = http://www.domainhost.com/park.php?domain=%{IN:Host}
        set BODY = <a href="http://www.domainhost.com/park.php?domain=%{IN:Host}">Go here instead</a>
    endif

Description

In this situation, we have a Virtual Server with a script that redirects requests for a particular domain to a dynamically specified document root.

Technical

This rule matches any Host header of at least one character (which will correctly ignore HTTP/1.0 requests without a Host header), then redirects the request to the specified PHP script with the hostname as a parameter.
This script is then responsible for matching a hostname to a document root.

Enhanced Referrer Checking

Situation

You wish to implement additional control of referrer checking, over and above that provided by the Referer Checking module (which allows specific MIME types to be protected).

Solution

    match URL into $ with ^/dir/to/be/protected/.*
    if matched
        match IN:REFERER into $ with members\.example\.com
        if not matched
            set URL = /errorPageForBadReferer.html
        endif
    endif

    match IN:REFERER into $ with ^(www\.)?.+\.example\.com$
    if not matched
        set URL = /errorPageForBadReferer.html
    endif

Description

Requests for pages that haven't originated from a specified site are sent to an error page. This can be done for a specific directory (as in the first example) or for an entire site (second example).

Technical

Note that all "."s in the pattern matched against the referrer header must be escaped to prevent them being interpreted as "any character". As above, match blocks (between "if" and "endif") can be nested.

"referer" is a mis-spelling of "referrer" which somehow made it into the HTTP standard without being spotted...

SSI

Server Side Includes (SSI) Introduction

SSIs are special tags that you place in your document and are the simplest way to generate dynamic content from your web server. When the web server prepares to send a copy of the document to a web client, it examines the document and interprets the SSI tags as appropriate.

However, SSI tags are not very powerful. You can only use them to insert text into an existing document, and it is generally not possible to perform complicated processing to generate the inserted text.

Unlike other web servers, Zeus will recursively SSI parse content included or generated by SSI tags. This greatly extends the power of SSI, allowing you to pass values into included SSI documents.

SSI is suitable for use in document formatting.
The ability to include standard document elements (like headers and footers), and, in the case of Zeus, make further SSI substitutions within these is very powerful.

SSI is an example of a content handler.

Advantages
• SSI is the simplest way to generate dynamic content from your web server
• The Zeus Web Server will recursively SSI parse content included or generated by SSI tags. This greatly extends the power of SSI, allowing you to pass values into included SSI documents
• The ability to include standard document elements (like headers and footers), and, in the case of Zeus, make further SSI substitutions within these is very powerful.

Disadvantages
• SSI tags are not very powerful - you can only use them to insert text into an existing document
• It is not generally possible to perform complicated processing to generate inserted text.

The #config Command

The #config command can be used to change the output format of certain SSI directives, specifically those which return a time / date or a filesize. This allows you to use a more understandable format than the defaults for your particular application.

The #config command takes two parameters, timefmt to modify the date and time formats, and sizefmt to modify the file size format.

The timefmt parameter takes a free form text string of tokens as an argument in a similar vein to the UNIX library call strftime. These tokens are expanded by the server to produce the desired time format.
    Token   Meaning                                             Example or range
    %a      Day : abbreviated                                   Mon, Tue, Wed
    %A      Day                                                 Monday, Tuesday
    %b      Month : abbreviated (same as %h)                    Jan, Feb, Mar
    %B      Month                                               January, February
    %C      Year : Without Century                              "00" - "99"
    %d      Day of Month                                        "01" - "31"
    %D      Date : mm/dd/yy                                     "01/01/00" - "12/31/99"
    %e      Day of Month : Single figure                        "1" - "31"
    %h      Month : abbreviated (same as %b)                    Jan, Feb, Mar
    %H      Hour of Day : 24 hour clock                         "00" - "23"
    %I      Hour of Day                                         "01" - "12"
    %j      Julian Day of Year                                  "001" - "366"
    %m      Month of Year                                       "01" - "12"
    %M      Minute                                              "00" - "59"
    %n      Newline character
    %p      AM or PM                                            "AM" | "PM"
    %r      Time in %I:%M:%S %p                                 "00:00:00 am" - "12:59:59 pm"
    %R      Time in %H:%M                                       "00:00" - "23:59"
    %S      Seconds                                             "00" - "61"
    %T      Time in %H:%M:%S                                    "00:00:00" - "23:59:59"
    %u      Numbered Day of the Week : Starting with Monday = 1 "1" - "7"
    %w      Numbered Day of the Week : Starting with Sunday = 0 "0" - "6"
    %y      Year : Without Century                              "00" - "99"
    %Y      Year                                                "0000" - "9999"
    %Z      Time Zone                                           "GMT", "BST", "CST" ..
    %%      Percentage Character                                "%"

Sophisticated time and date information can be constructed using these tokens. Additional text can also be included in the string, allowing additional explanation.

The sizefmt parameter takes an argument of either bytes or abbrev. The bytes value will cause all filesize information to be displayed exactly in bytes, while the abbrev value will display an approximate value with the units appended (Kb, Mb, Gb).

Simple Example: Formatting the Date

Displaying a date on the welcome page for your site is quite a common requirement, but you probably don't want to go to all the trouble of writing a CGI script to display the whole page. You probably don't want to use the default time format either. Using #config and the SSI environment variable DATE_LOCAL we can easily display the time in the way we want.
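Because timefmt follows strftime closely, a format string can be previewed with any strftime implementation before it goes into a page. The sketch below uses Python's datetime.strftime for this; it assumes the default C locale and that the tokens involved (%A, %I, %M, %p) expand the same way in Zeus, which the table above suggests but this snippet cannot confirm. The format_time helper and the fixed timestamp are illustrative choices.

```python
from datetime import datetime


def format_time(moment, timefmt):
    """Expand strftime-style tokens; other text is copied through as-is."""
    return moment.strftime(timefmt)


# A fixed timestamp so the result is reproducible: 2 June 1997, 20:00:35.
moment = datetime(1997, 6, 2, 20, 0, 35)
message = format_time(
    moment, "It's <b>%A</b> and the time is <b>%I:%M %p</b>!"
)
print(message)
```

Text that is not a token, including HTML tags, is passed through unchanged, which is why markup can be mixed into the timefmt string.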
The .shtml file:

<html>
<head>
<title>SSI Date Test</title>
</head>
<body>
The current time is : <b><!--#echo var="DATE_LOCAL"--></b>
<br>
But the time looks nicer this way :
<!--#config timefmt="It's <b>%A</b> and the time is <b>%I:%M %p</b>!"-->
<!--#echo var="DATE_LOCAL"-->
<hr>
</body>
</html>

Will result in something similar to the following being sent to the client:

<html>
<head>
<title>SSI Date Test</title>
</head>
<body>
The current time is : <b>08:00:35 PM 06/02/97</b>
<br>
But the time looks nicer this way :
It's <b>Monday</b> and the time is <b>08:00 PM</b>!
<hr>
</body>
</html>

We are even able to include HTML tags in the text string which defines the date format.

The #echo Command

The #echo command makes available the set of environment variables sent to CGI programs. Using this command you can send these variables back to the user without having to resort to CGI programs. The #echo command takes a single parameter, var, whose argument is the name of an environment variable: var="environment variable".

Simple Example: Output Environment Variables

The following SHTML file displays a selection of SSI variables.
<html>
<head>
<title>SSI Variables</title>
</head>
<body>
<h1>SSI Environment Variables</h1>
<pre>
SERVER_SOFTWARE   : <!--#echo var="SERVER_SOFTWARE"-->
GATEWAY_INTERFACE : <!--#echo var="GATEWAY_INTERFACE"-->
SERVER_PROTOCOL   : <!--#echo var="SERVER_PROTOCOL"-->
SERVER_NAME       : <!--#echo var="SERVER_NAME"-->
SERVER_PORT       : <!--#echo var="SERVER_PORT"-->
REQUEST_METHOD    : <!--#echo var="REQUEST_METHOD"-->
REMOTE_HOST       : <!--#echo var="REMOTE_HOST"-->
REMOTE_ADDR       : <!--#echo var="REMOTE_ADDR"-->
HTTP_USER_AGENT   : <!--#echo var="HTTP_USER_AGENT"-->
DATE_LOCAL        : <!--#echo var="DATE_LOCAL"-->
DATE_GMT          : <!--#echo var="DATE_GMT"-->
</pre>
</body>
</html>

Will result in something similar to the following being sent to the client:

<html>
<head>
<title>SSI Variables</title>
</head>
<body>
<h1>SSI Environment Variables</h1>
<pre>
SERVER_SOFTWARE   : Zeus/3.0
GATEWAY_INTERFACE : CGI/1.1
SERVER_PROTOCOL   : HTTP/1.0
SERVER_NAME       : www.zeus.co.uk
SERVER_PORT       : 80
REQUEST_METHOD    : GET
REMOTE_HOST       : pc203.zeus.co.uk
REMOTE_ADDR       : 10.0.0.14
HTTP_USER_AGENT   : Mozilla/4.0b5 [en] (Win95; I)
DATE_LOCAL        : 06:12:16 PM 06/02/97
DATE_GMT          : 05:12:16 PM 06/02/97
</pre>
</body>
</html>

The #exec Command

There are occasions when SSI includes and SSI variables cannot achieve the dynamic content you want, but you still don't want to go to the trouble of writing a CGI program to generate the entire page content. By using the #exec command, you can include the output of a program within your HTML pages. The #exec command is followed by the cgi parameter, with a valid CGI program as its argument.

<!--#exec cgi="prog.cgi" -->

Simple Example: A CGI to Display Machine Load

To prove that your server machine is capable, you might want to include a load meter on the machine spec page. The page contains a lot of information and is updated frequently, so you don't want a CGI program to generate the complete HTML code. The example below obtains the load average of the machine, then determines what colour to display the text in.
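The decision logic of this load meter can be sketched independently of the CGI plumbing. The Python fragment below is only an illustration under stated assumptions: the function names are hypothetical, the sample uptime line is made up, and the thresholds are the ones used by the manual's Perl script (below 1, 2 and 3). Like that script, it takes the second-to-last comma-separated field of the uptime output.

```python
def parse_load(uptime_output: str) -> float:
    """Return the second-to-last comma-separated field of `uptime` output.

    Mirrors the field the manual's Perl script picks with $fields[-2].
    """
    fields = uptime_output.strip().split(",")
    return float(fields[-2])


def colour_for_load(load: float) -> str:
    """Map a load average onto the colour used to display it."""
    if load < 1:
        return "white"
    if load < 2:
        return "yellow"
    if load < 3:
        return "orange"
    return "red"


# Made-up sample line in the usual `uptime` layout.
sample = " 8:00pm  up 12 days,  3:04,  2 users,  load average: 0.32, 0.28, 0.25"
load = parse_load(sample)
print(load, colour_for_load(load))  # prints: 0.28 white
```

Keeping the parsing and the threshold mapping in separate functions makes it easy to adjust the thresholds for a machine with more processors.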
The .shtml file:

<html>
<head><title>SSI Test</title>
<meta http-equiv="Refresh" content="2; test.shtml">
</head>
<body>
This is a big page about our large machine.
<p>
<!--#exec cgi="load.cgi"-->
</body>
</html>

The CGI program:

#!/usr/bin/perl
use strict;

print( STDOUT "Content-type: text/html", "\n\n" );

# Run uptime and pick the second-to-last comma-separated field
my $uptime = `uptime`;
my @fields = split( /,/, $uptime );
my $load   = $fields[ -2 ];
my $colour = undef();

# Choose a colour according to the load average
if   ( $load < 1 ) { $colour = "white"  }
elsif( $load < 2 ) { $colour = "yellow" }
elsif( $load < 3 ) { $colour = "orange" }
else               { $colour = "red"    }

print( STDOUT <<TXT );
<p>Load average is <font color="$colour"> $load </font></p>
<hr>
TXT
__END__

Possible (this is dynamic after all) result in the HTML:

<html>
<head><title>SSI Test</title>
<meta http-equiv="Refresh" content="2; test.shtml">
</head>
<body>
<p>This is a big page about our large machine. </p>
<p>Load average is <font color="white"> 0.32 </font></p>
<hr>
</body>
</html>

Executing arbitrary shell commands

The #exec command can also be used to execute arbitrary shell commands when used with the cmd parameter. For this parameter to be available, the server needs to have 'CGI enable anywhere' turned on.

<!--#exec cmd="command" -->

The server will execute the equivalent of /bin/sh -c "command" and replace the tag with the output of that command. For example:

<!--#exec cmd="cat /etc/passwd | cut -f1 -d':' | sort" -->

#if, #elif, #else and #endif commands

SSI pages can also contain flow control commands. These make it possible to alter the HTML page conditionally. The flow control commands are as follows:

<!--#if expr="<expression>" -->
<!--#elif expr="<expression>" -->
<!--#else -->
<!--#endif -->

When one of the expressions matches, the HTML enclosed in the matching block is parsed and returned to the client. Flow control commands can be nested.
Expression                     Meaning
string                         True if string is non-empty, false otherwise
string1 = string2              True if the strings match
string1 != string2             True if the strings do not match
string1 > string2              True if string1 is greater than string2 (string compare)
string1 >= string2             True if string1 is greater than or equal to string2 (string compare)
string1 < string2              True if string1 is less than string2 (string compare)
string1 <= string2             True if string1 is less than or equal to string2 (string compare)
! expression                   Inverts the result of an expression; true if expression is false
( expression )                 Groups an expression together; true if the expression is true
expression1 || expression2     True if either expression1 or expression2 is true
expression1 && expression2     True if both expression1 and expression2 are true

1. Inside strings, variables starting with a '$' are expanded. For example, "$SERVER_PROTOCOL" will be expanded to HTTP/1.1.
2. For the '=' and '!=' comparisons, string2 will be interpreted as a regular expression if it starts and ends with a '/'.
3. Strings can be quoted using single quotes. If they aren't quoted, any white space between adjacent strings is merged into a single space.
4. Operators bind in the following order (tightest first): !, =, !=, ||, &&

Caching of expressions

In order to improve the performance of expressions and regular expressions, Zeus Web Server caches the last 100 expressions it parsed, and the last 100 regular expressions. These values can be altered using the following tunables:

tuning!ssi!expressioncachesize 100
tuning!ssi!regexcachesize 100

Apache Compatibility

The Zeus implementation follows the documentation for Apache as closely as possible; however, Apache has some bugs in its expression parsing (especially '$' expansion). By default Zeus parses the expressions without implementing the bugs.
In order to switch the parser to the 'Apache bug compatibility' mode, set the following tunable (in web/global.cfg):

tuning!ssi!apachebugcompat yes

Example

An example use of flow control commands would be to report back to the client the browser that is being used.

<!--#if expr=" ! $HTTP_USER_AGENT " -->
No user agent specified
<!--#elif expr=" $HTTP_USER_AGENT = /MSIE/ " -->
Microsoft Internet Explorer
<!--#elif expr=" $HTTP_USER_AGENT = /Mozilla/ " -->
Mozilla or something that identifies itself as such (almost everything)
<!--#else -->
Other browser type
<!--#endif -->

The #fsize and #flastmod Commands

The #fsize command inserts the size of the specified file into the document. There are occasions when you may want a visitor to your web site to download information which could be substantial in size. In these cases it is polite to include the file size in the document so the visitor can estimate how long the download will take. You could include the file size directly in the HTML, but if the files were to change you would need to update the HTML pages as well. The #fsize command allows you to include the size of files which may change regularly.

The #fsize command is followed by the parameter file or virtual, with the same meaning as for the #include command described below. Each parameter takes a file location as an argument.

The #flastmod command works in the same manner, returning the date the file was last modified. By using the recursive nature of the Zeus SSI engine it would be possible to generate a file listing from a CGI script, but obtain the file sizes and modification dates via the SSI commands.

Simple Example: File Size of Downloads

A university publishes hourly satellite pictures on the web. Due to differences in time of day and weather, combined with the effect of the compression code, the size can alter considerably. Using the #fsize and #flastmod commands, the web page always displays the correct information.
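For readers who want to see what information these two commands draw on, here is a small Python sketch that reads a file's size and modification time from the filesystem and formats them. It is an approximation only: the function names are hypothetical, and the rounding to one decimal place, the 1024-based units and the default date layout are assumptions chosen to resemble the sample output in this manual, not documented Zeus behaviour.

```python
import os
import time


def fsize(path: str, sizefmt: str = "abbrev") -> str:
    """Return the size of path, exactly in bytes or abbreviated with units."""
    size = os.stat(path).st_size
    if sizefmt == "bytes":
        return str(size)
    # Assumed abbreviation rule: largest 1024-based unit, one decimal place.
    for unit, factor in (("Gb", 1024 ** 3), ("Mb", 1024 ** 2), ("Kb", 1024)):
        if size >= factor:
            return f"{size / factor:.1f}{unit}"
    return str(size)


def flastmod(path: str, timefmt: str = "%I:%M:%S %p %m/%d/%y") -> str:
    """Return the last-modified time of path, formatted with strftime tokens."""
    return time.strftime(timefmt, time.localtime(os.stat(path).st_mtime))
```

Because both functions stat the file on every call, a page built this way stays correct when the file is replaced, which is the property the satellite picture example relies on.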
The .shtml file:

<html>
<head>
<title>Pictures</title>
</head>
<body>
<h1>Satellite Pictures</h1>
Please download the <a href="picture.jpg">picture</a>, it is only
<!--#fsize file="picture.jpg"--> in size.
<br>
Last updated on <!--#flastmod file="picture.jpg"-->
</body>
</html>

Will result in something similar to the following being sent to the client:

<html>
<head>
<title>Pictures</title>
</head>
<body>
<h1>Satellite Pictures</h1>
Please download the <a href="picture.jpg">picture</a>, it is only
4.8Kb in size.
<br>
Last updated on 05:52:21 PM 06/03/97
</body>
</html>

The #include Command

Managing a complex web site can be a considerable task, particularly if you wish to maintain a common look and feel throughout all the pages. Including a common footer or navigation bar on your site helps your visitors identify where they are and helps them move quickly to the information they need. However, if you then need to change the footer, you would need to update every page on your site. By using SSI to insert a file at the end of each of your pages, you only need to change one file.

The SSI directive #include allows us to merge other files into our HTML pages. It is followed by the parameter file or virtual. file specifies a filename relative to the current directory; virtual can specify a relative filename or an absolute URL within the whole document root for the virtual server.

<!--#include virtual="/dir/filename" -->
<!--#include file="filename" -->

Simple Example: Including a footer

A .shtml file:

<html>
<head><title>SSI Test</title></head>
<body>
<p> This is my text from index.shtml </p>
<!--#include virtual="/includes/footer.html"-->
</body>
</html>

The /includes/footer.html file:

<p> Please email <a href="mailto:[email protected]">Harry</a> for more details on our products.
Will result in the following output being sent to the client:

<html>
<head>
<title>SSI Test</title>
</head>
<body>
<p> This is my text from index.shtml </p>
<p> Please email <a href="mailto:[email protected]">Harry</a> for more details on our products.
</body>
</html>

The #set Command

The #set command allows you to add your own variables that can be used with the #echo command. The format of a #set command is:

<!--#set var="..." value="..." -->

For example:

<!--#set var="title" value="My document's title" -->

The #set command is extremely powerful as it provides a means of passing 'arguments' to recursively included SSI pages. This is because the #set command adds to the global variable environment, so a parsed document can set a variable then include another parsed document which uses this variable, or vice versa. For example:

File title.shtml:

<!-- requires title -->
<!-- Imagine this is a complicated table -->
<!-- which puts the title at the top of -->
<!-- the page in a standard fashion. -->
<html>
<head>
<title><!--#echo var="title"--></title>
</head>
<body bgcolor=#ffffff>
<table bgcolor=#f0f0c3 cellspacing=4>
<tr>
<td>
<font size=+3>
<!--#echo var="title"-->
</font>
</td>
</tr>
</table>

Then we could use this complex standard title in a document "About Zeus" as follows:

<!-- Set title, and include standard document header -->
<!--#set var="title" value="About Zeus" -->
<!--#include file="title.shtml" -->
Document body goes here.
<!-- We'd probably do the footer in the -->
<!-- same way, but for now we'll just -->
<!-- terminate the html manually. -->
</body>
</html>

Using the #set command allows the reuse of 'components' of HTML and is extremely useful in large web-site design.

ZDAC

ZDAC Introduction

The distributed module implements the Zeus Distributed Authentication and Content API. The ZDAC API allows the web server to interact with external authentication servers and content generation servers over a TCP/IP socket interface.
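To make the socket interface more concrete, the following Python sketch models the text exchange used by the authentication half of the API, as laid out under "Authentication daemon protocol" later in this section: five header lines terminated by a blank line, answered with NO, PASSWORD or YES. It is a sketch under stated assumptions rather than a reference implementation: the function names are hypothetical, the word NULL is assumed to be sent literally when no password or cookie is supplied, and the access list is passed in as tuples instead of being read from a file. Python is used only for brevity; the manual's own example daemon is written in Perl.

```python
def build_request(hostname, url, method, password=None, cookie=None) -> bytes:
    """Encode one authentication request as the web server would send it."""
    lines = [
        f"Hostname: {hostname}",
        f"URL: {url}",
        f"Method: {method}",
        f"Password: {password or 'NULL'}",  # assumption: literal NULL if absent
        f"Cookie: {cookie or 'NULL'}",
    ]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_request(raw: bytes) -> dict:
    """Split a complete request into a field-name to value mapping."""
    head = raw.decode("ascii").split("\r\n\r\n", 1)[0]
    fields = {}
    for line in head.split("\r\n"):
        name, _, value = line.partition(": ")
        fields[name] = value
    return fields


def authorize(fields: dict, access_entries) -> str:
    """Return YES, PASSWORD or NO for a parsed request.

    access_entries holds (url_prefix, userid, password) tuples, standing in
    for the url:userid:password lines of an access file.
    """
    for url_prefix, userid, password in access_entries:
        if fields["URL"].startswith(url_prefix):
            if fields["Password"] == f"{userid}:{password}":
                return "YES"
            return "PASSWORD"  # restricted URL, missing or wrong credentials
    return "YES"  # URL not listed, so access is unrestricted
```

A real daemon would wrap these functions in a socket server loop; keeping the parsing and the decision separate from the networking makes each part easy to test in isolation.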
Authentication

Zeus Web Server provides a high-performance API to allow third parties to completely customize the access control performed by the server. The API allows the server to delegate its access control policies for portions of its document tree to a separate authentication server. The interface utilizes the TCP/IP socket abstraction, allowing the authentication server to run on a physically different machine from the server, and multiple Zeus Servers to share a centralized authentication server. It also allows the authentication server to be implemented in any programming language which can interact with sockets.

Sites running online subscription magazines, for example, have very complicated access control policies, requiring external database lookups on every request. On receiving a request, Zeus Server can be configured to connect to an authentication server to check whether the request should be satisfied or requires a password etc. The authentication server can do whatever is required to validate the client, for example perform an SQL query against a central database on a mainframe where all of the subscription records are kept. Such a design allows for complete flexibility.

Content

Zeus Server provides a high-performance CGI replacement which allows dynamic documents to be provided to its clients. The dynamic document API utilizes a TCP/IP socket interface to allow the document generation code to be implemented on a separate machine to facilitate load-balancing, and also makes no restrictions on the implementation language.

The dynamic document generation code, herein called sockd, runs as an external daemon to the server and listens for requests from the server to generate a page. The server then passes information, similar to the environment set up for a CGI process, to the sockd server, which can then use this information to determine the data for the page to generate.
This information is then routed back through the server down to the client. This offers significant performance increases over frequently forking a large CGI script.

Advantages
• Extremely simple
• High performance
• Applications can be remote, on a different machine to the web server
• Language independent
• Applications can be securely chroot()ed and run with different permissions for each application.

Disadvantages
• Limited to authentication and content generation services
• Non-standard, proprietary interface

Authentication Server

Zeus Server provides a high-performance API to allow third parties to completely customize the access control performed by the server. The API allows the server to delegate its access control policies for portions of its document tree to a separate authentication server. The interface utilizes the TCP/IP socket abstraction, allowing the authentication server to run on a physically different machine from the server, and multiple Zeus Servers to share a centralized authentication server. It also allows the authentication server to be implemented in any programming language which can interact with sockets.

Sites running online subscription magazines, for example, have very complicated access control policies, requiring external database lookups on every request. On receiving a request, Zeus Server can be configured to connect to an authentication server to check whether the request should be satisfied or requires a password etc. The authentication server can do whatever is required to validate the client, for example perform an SQL query against a central database on a mainframe where all of the subscription records are kept. Such a design allows for complete flexibility.

Simple Example

Here is an example of an authentication server. This is a simple authentication server written in Perl, and demonstrates how to interact with the authentication API.
It receives requests from Zeus Server for authentication information, and checks the user details supplied against a separate text file containing its authentication database.

Authentication daemon protocol:

Input:
Hostname: (DNS name|IP address)\r\n
URL: url-requested\r\n
Method: (GET|POST...)\r\n
Password: (NULL|userid:password)\r\n
Cookie: (NULL|cookie)\r\n
\r\n

Output:
NO       : forbidden
PASSWORD : password required, or wrong password supplied
YES      : access granted

Presented here is a small example of an authentication daemon (herein called authd) capable of interacting with the Zeus Server authentication API. Upon startup, the authd requires a port number to run on, and the filename of the access file to use. The access file is a simple text file which contains lines of the following format:

url:userid:password

where url is the request sent from the client (e.g. /~foobar/private.html) and userid/password are the credentials the client must supply for authorization. Upon receiving a complete request from Zeus Server, the authd opens the access file and scans it to see if the requested URL has any access restrictions. If no restrictions are found, the authd allows access; otherwise it compares the userid/password pair supplied by the client with that contained in the access file to determine whether access should be granted.

#!/usr/bin/perl
require 5.001;
use Socket;
use POSIX;

# authorize(hostname,url,method,password,cookie)
# Simple example of an authorize function which just looks up the URL in the
# access file, and if present checks authorization, otherwise grants access.
sub authorize {
    my $url      = $_[1];
    my $password = $_[3];
    my $cookie   = $_[4];

    open(FD,$accessfile) || return "NO";   # unable to open file
    {
        local $/ = "\n";
        while(<FD>) {
            chomp;
            my ($furl,$fuid,$fpwd) = split(/:/);
            # Treat the access file entry as a literal URL prefix
            if($url =~ /^\Q$furl\E/) {     # access restricted on this url
                close(FD);
                if ($password eq "$fuid:$fpwd") { return "YES" }
                return "PASSWORD";         # Need correct password
            }
        }
        close(FD);
    }
    return "YES";                          # access not restricted, grant access
}

# complete(string)
# Determines whether a complete request has been received yet. Returns
# non-zero if so.
sub complete { return ($_[0] =~ /\r\n\r\n/) }

# process(string)
# Takes a complete request, and returns the data to output.
sub process {
    my $buf = $_[0];
    $buf =~ m/Hostname:\ (.*)\r\n
              URL:\ (.*)\r\n
              Method:\ (.*)\r\n
              Password:\ (.*)\r\n
              Cookie:\ (.*)\r\n/x;
    return( authorize($1,$2,$3,$4,$5) );
}

# ---------------------------------------------------------------------------
# Server code
if( $#ARGV!=1 ) { print "Usage: authd port accessfile\n"; exit 1; }
$port = shift;
$accessfile = shift;

my $proto = getprotobyname('tcp');
socket(SERVER, PF_INET, SOCK_STREAM, $proto)    || die "socket: $!";
setsockopt(SERVER, SOL_SOCKET, SO_REUSEADDR, 1) || die "setsockopt: $!";
# Listen on all interfaces; sockaddr_in builds the address portably
$this = sockaddr_in($port, INADDR_ANY);
bind(SERVER, $this) || die "bind: $!";
listen(SERVER,5)    || die "listen: $!";
print "authd server started on port $port using accessfile $accessfile\n";

# Detach from the terminal and carry on in the child
$child = fork;
if (!defined($child)) {    # fork failed
    die "error on fork: $!";
} elsif ($child > 0) {     # parent
    exit(0);
} else {                   # child
    if (!setsid()) { die "failed setsid: $!"; }
}

$rin = ''; vec($rin,fileno(SERVER),1) = 1;
$win = '';
$ein = '';

# Main program loop
while(1) {
    $nfound = select($rout=$rin,$wout=$win,$eout=$ein, undef);
    if(vec($rout,fileno(SERVER),1)) {    # new connection
        my $NS = FileHandle->new();
        if(accept($NS,SERVER)) {
            vec($rin,fileno($NS),1) = 1;
            vec($ein,fileno($NS),1) = 1;
            $input[fileno($NS)] = $output[fileno($NS)] = '';
            $fd2file[fileno($NS)] = $NS;
            push ((@clients),fileno($NS));
        }
    } else {    # look through read/write/except bits for clients
        @cl = @clients;
        while($fd = shift @cl) {
            if(vec($eout,$fd,1)) {    # Exception on $fd
                &killclient($fd);
                next;
            }
            if(vec($wout,$fd,1)) {    # Write on $fd
                local($len) = length($output[$fd]);
                if($len) {            # data left to write
                    $w=syswrite $fd2file[$fd],$output[$fd],$len;
                    if(!defined($w)) { &killclient($fd); }
                    else { $output[$fd] = substr($output[$fd],$w,$len-$w); }
                } else { &killclient($fd); }    # finished
            }
            if(vec($rout,$fd,1)) {    # Read on $fd
                $r=sysread $fd2file[$fd],$input[$fd],128,length($input[$fd]);
                if(!defined($r) || !$r) { &killclient($fd); }
                else { &parse($fd); }
            }
        }
    }
}

# Removes a client of given fd from the system
sub killclient {
    local($fd) = $_[0];
    local(@cl);
    local($i);
    vec($rin,$fd,1) = 0;
    vec($win,$fd,1) = 0;
    vec($ein,$fd,1) = 0;
    $input[$fd] = $output[$fd] = '';
    close $fd2file[$fd];
    while($i = shift @clients) {
        if($i != $fd) { push ((@cl),$i); }
    }
    @clients = @cl;
}

# Determines whether a complete request on an fd has been received, and if so
# processes it
sub parse {
    local($fd) = $_[0];
    if(&complete($input[$fd])) {    # got a complete request
        vec($rin,$fd,1) = 0;
        vec($win,$fd,1) = 1;
        $output[$fd] = &process($input[$fd]);
    }
}

FILEHANDLE: {
    use strict;
    require FileHandle;    # make sure real one is loaded
    package FileHandle;

    sub new {
        my $self = shift;
        my $class = ref($self) || $self;    # for inheritance
        return bless(&_genfh, $class);
    }

    sub DESTROY {
        my $self = shift;
        if (defined fileno $self) { close $self; }
    }

    ##########
    # internal only
    ##########
    sub _genfh {
        no strict 'refs';
        local *{'FileHandle::DEMI_ANON_GLOB'};
        return \delete $FileHandle::{DEMI_ANON_GLOB};
    }
}
1;

Sockd Server

Zeus Server provides a high-performance CGI replacement which allows dynamic documents to be provided to its clients.
The dynamic document API utilizes a TCP/IP socket interface to allow the document generation code to be implemented on a separate machine to facilitate load-balancing, and also makes no restrictions on the implementation language.

The dynamic document generation code, herein called sockd, runs as an external daemon to the server and listens for requests from the server to generate a page. The server then passes information, similar to the environment set up for a CGI process, to the sockd server, which can then use this information to determine the data for the page to generate. This information is then routed back through the server down to the client. This offers significant performance increases over frequently forking a large CGI script.

Simple Example

Here is an example of a sockd server. This is a simple page counter implemented in Perl which demonstrates how to interact with the API. It is intended to interact with the non-forking CGI API of Zeus Server.

CGI daemon protocol (see the specifications for more information):

Input:
ENVIRONMENT\r\n
DATA\r\n\r\n

Where:
ENVIRONMENT = (LINE\r\n)+
LINE        = environment variable line (e.g. HTTP_USER_AGENT=ZeusBrowser)
DATA        = data supplied to POST, or arguments supplied to GET

ENVIRONMENT contains all of the same fields that are set in the environment of a normal CGI program.

Output:
[as normal CGI]

Presented here is a small example of a CGId daemon (herein called sockd) capable of interacting with the Zeus Server non-forking CGI API. Upon startup, the sockd requires a port number to run on. On receiving a complete request, the sockd opens a counter file, increments the value stored there, and returns a page to the user displaying the counter value. The web server should be configured to map requests to this sockd server.

#!/usr/bin/perl
require 5.001;
use Socket;
use POSIX;

# complete(string)
# Determines whether a complete request has been received yet. Returns
# non-zero if so.
sub complete { return ($_[0] =~ /\r\n\r\n.*\r\n\r\n/) }

# process(string)
# Takes a complete request, and returns the data to output.
# Trivial example simply opens up the counter file specified in the request,
# increments it, and returns a page containing the value.
sub process {
    my $buff = $_[0];
    my ($environ, $data);
    my $countdir = "/tmp/";    # Directory where counter files are stored
    my $num;

    $buff =~ /\r\n\r\n/;
    $environ = $`;
    $data = $';
    substr( $data, -4, 4, '' );     # eat \r\n\r\n
    # The counter name comes from the client, so drop any characters that
    # could lead outside $countdir
    $data =~ tr/A-Za-z0-9_-//cd;

    if (! -f "$countdir/$data") { open(NUM,">$countdir/$data"); close(NUM); }
    open(NUM,"+<$countdir/$data");
    $num = <NUM>;
    $num++;
    seek(NUM,0,0);
    print NUM "$num\n";
    close NUM;

    return join( '' ,
        "Content-Type: text/html\r\n\r\n" ,
        "<body bgcolor=#ffffff><h3>" ,
        "You've seen this page $num times before!" ,
        "<p><hr noshade size=2>" ,
        "<a href=\"http://www.zeus.co.uk/products/server/\">" ,
        "<center><font size=7>P<i>owered</i> B<i>y</i> Z<i>eus</i> " ,
        "S<i>erver</i></a>" );
}

# ---------------------------------------------------------------------------
# Server code
if( $#ARGV!=0 ) { print "Usage: sockd port\n"; exit 1; }
$port = shift;

my $proto = getprotobyname('tcp');
socket(SERVER, PF_INET, SOCK_STREAM, $proto)    || die "socket: $!";
setsockopt(SERVER, SOL_SOCKET, SO_REUSEADDR, 1) || die "setsockopt: $!";
# Listen on all interfaces; sockaddr_in builds the address portably
$this = sockaddr_in($port, INADDR_ANY);
bind(SERVER, $this) || die "bind: $!";
listen(SERVER,5)    || die "listen: $!";
print "CGId server started on port $port\n";

# Detach from the terminal and carry on in the child
$child = fork;
if (!defined($child)) {    # fork failed
    die "error on fork: $!";
} elsif ($child > 0) {     # parent
    exit(0);
} else {                   # child
    if (!setsid()) { die "failed setsid: $!"; }
}

$rin = ''; vec($rin,fileno(SERVER),1) = 1;
$win = '';
$ein = '';

# Main program loop
while(1) {
    $nfound = select($rout=$rin,$wout=$win,$eout=$ein, undef);
    if(vec($rout,fileno(SERVER),1)) {    # new connection
        my $NS = FileHandle->new();
        if(accept($NS,SERVER)) {
            vec($rin,fileno($NS),1) = 1;
            vec($ein,fileno($NS),1) = 1;
            $input[fileno($NS)] = $output[fileno($NS)] = '';
            $fd2file[fileno($NS)] = $NS;
            push ((@clients),fileno($NS));
        }
    } else {    # look through read/write/except bits for clients
        @cl = @clients;
        while($fd = shift @cl) {
            if(vec($eout,$fd,1)) {    # Exception on $fd
                &killclient($fd);
                next;
            }
            if(vec($wout,$fd,1)) {    # Write on $fd
                local($len) = length($output[$fd]);
                if($len) {            # data left to write
                    $w=syswrite $fd2file[$fd],$output[$fd],$len;
                    if(!defined($w)) { &killclient($fd); }
                    else { $output[$fd] = substr($output[$fd],$w,$len-$w); }
                } else { &killclient($fd); }    # finished
            }
            if(vec($rout,$fd,1)) {    # Read on $fd
                $r=sysread $fd2file[$fd],$input[$fd],128,length($input[$fd]);
                if(!defined($r) || !$r) { &killclient($fd); }
                else { &parse($fd); }
            }
        }
    }
}

# Removes a client of given fd from the system
sub killclient {
    local($fd) = $_[0];
    local(@cl);
    local($i);
    vec($rin,$fd,1) = 0;
    vec($win,$fd,1) = 0;
    vec($ein,$fd,1) = 0;
    $input[$fd] = $output[$fd] = '';
    close $fd2file[$fd];
    while($i = shift @clients) {
        if($i != $fd) { push ((@cl),$i); }
    }
    @clients = @cl;
}

# Determines whether a complete request on an fd has been received, and if so
# processes it
sub parse {
    local($fd) = $_[0];
    if(&complete($input[$fd])) {    # got a complete request
        vec($rin,$fd,1) = 0;
        vec($win,$fd,1) = 1;
        $output[$fd] = &process($input[$fd]);
    }
}

FILEHANDLE: {
    use strict;
    require FileHandle;    # make sure real one is loaded
    package FileHandle;

    sub new {
        my $self = shift;
        my $class = ref($self) || $self;    # for inheritance
        return bless(&_genfh, $class);
    }

    sub DESTROY {
        my $self = shift;
        if (defined fileno $self) { close $self; }
    }

    ########
    # internal only
    ########
    sub _genfh {
        no strict 'refs';
        local *{'FileHandle::DEMI_ANON_GLOB'};
        return \delete $FileHandle::{DEMI_ANON_GLOB};
    }
}
1;
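The request framing used by the sockd example can also be exercised without a running web server. The Python sketch below builds a request in the layout given under "CGI daemon protocol" (environment lines, a blank line, the data, then a final blank line) and splits it again in the same way as the Perl process routine. The function names are hypothetical and the sample values are made up; treat it as an aid for testing a daemon of your own, not as part of the Zeus distribution.

```python
def build_sockd_request(environ: dict, data: str) -> bytes:
    """Frame environment variables and request data as ENVIRONMENT, blank line, DATA."""
    env_block = "".join(f"{name}={value}\r\n" for name, value in environ.items())
    return (env_block + "\r\n" + data + "\r\n\r\n").encode("ascii")


def is_complete(buf: bytes) -> bool:
    """True once both the environment and the data terminators have arrived."""
    first = buf.find(b"\r\n\r\n")
    return first != -1 and buf.find(b"\r\n\r\n", first + 4) != -1


def split_request(buf: bytes):
    """Return (environ dict, data string) from a complete request."""
    env_part, _, rest = buf.partition(b"\r\n\r\n")
    data = rest[:-4] if rest.endswith(b"\r\n\r\n") else rest
    environ = dict(
        line.split("=", 1) for line in env_part.decode("ascii").split("\r\n")
    )
    return environ, data.decode("ascii")


request = build_sockd_request(
    {"REQUEST_METHOD": "GET", "HTTP_USER_AGENT": "ZeusBrowser"}, "counter1"
)
print(is_complete(request))    # prints: True
print(split_request(request))
# prints: ({'REQUEST_METHOD': 'GET', 'HTTP_USER_AGENT': 'ZeusBrowser'}, 'counter1')
```

Sending the bytes from build_sockd_request to the port a sockd daemon listens on should return an ordinary CGI response, that is, headers, a blank line and the page body.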