Common Gateway Interface (CGI)

Hamid Zarrabi-Zadeh
Web Programming – Fall 2013
Outline
• History
• What is CGI?
• CGI Programs
• Processing Form Data
• Cookies
• Security Concerns
• Summary
History
• The web was initially developed to be a global
online repository or archive of documents
• Such pieces of information generally come in the
form of static text and usually in HTML
• These (static) HTML documents live on the web
server and are sent to clients when requested
• As the Internet and Web services evolved, there
grew a need to process user input
• Thus fill-out forms were invented
History (Cont'd)
• But, web servers are only good at one thing:
getting a user request for a file and returning that
file to the client
• They do not have the “brains” to be able to deal
with user-specific data
• Given this is not their responsibility, web servers
farm out such requests to external programs
• These programs process user input and generate
HTML dynamically, which is then returned to
the client
What is CGI?
• CGI is a standard interface by which the web
server passes the client's request to an external
program and receives the response from that
program
[Figure: the web browser sends a request to the web server; the web server
calls the CGI program, receives the CGI response, and sends the response
back to the browser]
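To make the interface concrete, here is a minimal sketch of what a web server roughly does when it calls a CGI program: it places the request details in environment variables, runs the program, and reads the response from the program's standard output. The function name `call_cgi`, the inline stand-in script, and the sample query string are assumptions made for illustration; a real server does considerably more.

```python
import os
import subprocess
import sys

# A tiny stand-in for a CGI program (hypothetical, for illustration only).
CGI_SCRIPT = r'''
import os
print("Content-Type: text/plain")
print()
print("query=" + os.environ.get("QUERY_STRING", ""))
'''

def call_cgi(query_string):
    """Run the CGI program the way a server would and return its raw output."""
    env = dict(os.environ)                 # start from the server's own environment
    env["GATEWAY_INTERFACE"] = "CGI/1.1"
    env["REQUEST_METHOD"] = "GET"
    env["QUERY_STRING"] = query_string     # the client's input
    result = subprocess.run(
        [sys.executable, "-c", CGI_SCRIPT],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout

response = call_cgi("fname=Ada")
# The header and the body are separated by a blank line.
headers, _, body = response.partition("\n\n")
print(headers)
print(body)
```

The server would then forward the header and body to the browser as the HTTP response.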
CGI Programs
• CGI programs are usually written in a
high-level scripting language, such as Perl,
PHP, or Python
• However, any programming language can be
used (C, C++, Java, bash, ...)
• A CGI program is slightly different from a typical
program, in terms of input, output, and user
interaction
Web Server Configuration
• Web servers usually treat executable files in
certain directories as CGI programs
– Example: /var/www/cgi-bin
• By convention, CGI files have the .cgi extension
• You can usually configure your web server to
accept custom CGI extensions and directories
# .htaccess
AddHandler cgi-script .cgi .py
Options +ExecCGI
CGI Input
• The web server sends information to the CGI
program using environment variables
• This information includes
– HTTP headers
– Server information
– Client information
– Request information
– User's input (for GET requests)
CGI Output
• The CGI program should write its output to stdout
• The web server forwards the output to the client,
so it must follow the format of an HTTP response
• The response consists of two parts: a header and
an optional body, separated by a blank line
• The header minimally specifies the content type
– Content-Type: text/html
• The body is usually an HTML document
Example
• This is a simple CGI program written in Python
#!/usr/bin/python
print('Content-type: text/html\n')
print('<h1>Hello World!</h1>')
print('<p>This is my first CGI program.</p>')
Example: C
• The previous CGI example, written in C
#include <stdio.h>
int main(void) {
    printf("Content-type: text/html\n\n");
    printf("<h1>Hello World!</h1>");
    printf("<p>This is my first CGI program.</p>");
    return 0;
}
Example: Show Variables
• A CGI program showing all environment
variables passed by the web server
#!/usr/bin/python
import os
print('Content-type: text/html\n')
print('<h1>Environment Variables</h1>')
for i in os.environ:
    print('<b>%s</b> = %s<br>' % (i, os.environ[i]))
Output: Show Variables
HTTP_REFERER = http://localhost/python/
SERVER_SOFTWARE = Apache/2.2.17 (Win32) PHP/5.3.5
SCRIPT_NAME = /python/env.py
REQUEST_METHOD = GET
SERVER_PROTOCOL = HTTP/1.1
QUERY_STRING =
HTTP_USER_AGENT = Mozilla/5.0 (Windows NT 6.2; WOW64)
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.69
SERVER_PORT = 80
COMSPEC = C:\Windows\system32\cmd.exe
SERVER_ADMIN = admin@localhost
HTTP_HOST = localhost
HTTP_CACHE_CONTROL = max-age=0
GATEWAY_INTERFACE = CGI/1.1
HTTP_ACCEPT_ENCODING = gzip,deflate,sdch
…
Processing Form Data
Receiving Form Data
• A CGI program can receive form data in two
different ways
• If the form is submitted by the GET method then
the query is encoded in the QUERY_STRING
environment variable
• If the form is submitted by the POST method then
– The data arrives on stdin (standard input)
– The CONTENT_LENGTH environment variable indicates
how many bytes of data will arrive (the server does
not send EOF!)
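The two cases above can be sketched by hand as follows. This is a minimal illustration that assumes the form uses the default URL-encoded format; the function name `read_form_data` and the sample values are hypothetical. It takes the environment and input stream as parameters so that it can be tried without a web server.

```python
import io
from urllib.parse import parse_qs

def read_form_data(environ, stdin):
    """Return form fields as a dict mapping each name to a list of values."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # Read exactly CONTENT_LENGTH bytes; the server does not send EOF.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length).decode("utf-8")
    else:
        # GET: the encoded query is in QUERY_STRING.
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw)

# Simulated requests. In a real CGI program one would pass
# os.environ and sys.stdin.buffer instead.
get_fields = read_form_data(
    {"REQUEST_METHOD": "GET", "QUERY_STRING": "fname=Ada&lname=Lovelace"},
    io.BytesIO(b""),
)
post_body = b"fname=Alan&lname=Turing"
post_fields = read_form_data(
    {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(post_body))},
    io.BytesIO(post_body),
)
print(get_fields)
print(post_fields)
```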
CGI Libraries
• Most programming languages provide (built-in)
libraries that make working with environment
variables easier
• These libraries usually unify access to the
GET/POST data sent by the web server
Simple Form
• Here is a simple form from our previous lectures
<form action="get-form.py" method="get">
First name:
<input type="text" name="fname"><br>
Last name:
<input type="text" name="lname">
<input type="submit" value="Submit">
</form>
Simple Form Processing
• Here is a simple CGI for processing form data
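import cgi
import html
form = cgi.FieldStorage()
# getfirst always returns a single string (here '' if the field is missing)
fname = form.getfirst('fname', '')
lname = form.getfirst('lname', '')
print('Content-type: text/html\n')
# escape user input before echoing it into HTML
print('<h2>Hello %s %s!</h2>' % (html.escape(fname), html.escape(lname)))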
• This program works with method="post" as well
Items with Same Name
• Multiple items can have the same name, for
example when a form contains a group of
multiple checkboxes
<form action="checkbox.py" method="get">
<input type="checkbox" name="device"
value="iPhone">iPhone<br>
<input type="checkbox" name="device"
value="iPad">iPad<br>
<input type="submit" value="Submit">
</form>
Processing Multi-Value Items
• When multiple items have the same name, the
name usually represents a list of values on the
CGI side
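import cgi
form = cgi.FieldStorage()
# getvalue returns a string for one value, or a list for several
device = form.getvalue('device')
print('Content-type: text/html\n')
print('<h2>Device: %s!</h2>' % device)
# getlist always returns a list, even for zero or one value
devices = form.getlist('device')
print('<p>Selected %d items.</p>' % len(devices))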
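Note that the `cgi` module was deprecated in Python 3.11 and removed in Python 3.13. As a hedged alternative, the sketch below reads the same multi-value field with `urllib.parse.parse_qs` from the standard library. The query string is an assumed example of what a browser would send when both checkboxes are ticked.

```python
from urllib.parse import parse_qs

# Assumed query string for a form with both "device" checkboxes ticked.
query = "device=iPhone&device=iPad"

# parse_qs maps every field name to a list of its values.
fields = parse_qs(query)
devices = fields.get("device", [])   # always a list, similar to getlist

print(devices)
print("Selected %d items." % len(devices))
```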
Cookies
Reading Cookies
• Cookies are passed to the CGI program via the
HTTP_COOKIE environment variable
import os
cookie_list = {}
if 'HTTP_COOKIE' in os.environ:
    # cookies arrive as "name1=value1; name2=value2"
    for cookie in os.environ['HTTP_COOKIE'].split(';'):
        # split on the first '=' only, since a value may contain '='
        name, _, value = cookie.strip().partition('=')
        cookie_list[name] = value
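As an alternative to splitting the string by hand, the standard library's `http.cookies.SimpleCookie` can parse the cookie header. This is a minimal sketch; the helper name `parse_cookies` and the sample header value are assumptions for illustration.

```python
from http.cookies import SimpleCookie

def parse_cookies(header_value):
    """Parse a Cookie header value into a plain dict of name -> value."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {name: morsel.value for name, morsel in cookie.items()}

# In a CGI program the argument would be os.environ.get('HTTP_COOKIE', '').
cookies = parse_cookies("user_id=1100; secret=abcd")
print(cookies)
```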
Setting Cookies
• Cookies can be set via the Set-Cookie HTTP header
#!/usr/bin/python
print('Set-Cookie: user_id=1100')
print('Set-Cookie: secret=abcd; ' +
      'expires=Wed, 20 Nov 2013 23:59:00 GMT; ' +
      'path=/')
print('Content-type: text/html\n')
print('<h2>Hello World</h2>')
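The header lines can also be generated with `http.cookies.SimpleCookie` rather than typed by hand. The sketch below is one possible way to do it; the helper name `build_cookie_headers` is hypothetical, and it uses a `max-age` lifetime (an assumed value of one hour) in place of a fixed expiry date.

```python
from http.cookies import SimpleCookie

def build_cookie_headers():
    """Return the Set-Cookie header lines for two example cookies."""
    cookie = SimpleCookie()
    cookie["user_id"] = "1100"
    cookie["secret"] = "abcd"
    cookie["secret"]["path"] = "/"
    cookie["secret"]["max-age"] = 3600   # lifetime in seconds (assumed value)
    # output() renders one "Set-Cookie: ..." header line per cookie.
    return cookie.output()

header_text = build_cookie_headers()
print(header_text)                 # cookie headers go before the blank line
print("Content-Type: text/html")
print()                            # blank line ends the header
print("<h2>Hello World</h2>")
```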
Security Concerns
• The client is sending a request that causes the
server to execute a program
• The program uses data provided by the client
• Client data cannot be trusted!
Security Tips
• Do not trust the client to follow rules
– Setting maxlength in a text field does not guarantee that
you will never receive a longer string
• Never leave any opportunity to execute data
provided by the client (say, using eval)
• Be careful with file names passed by clients
– If a client sends "../../../etc/passwd", it might give them
access to /etc/passwd
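One common way to guard against such file names is to resolve the requested path and check that it still lies inside the intended directory. The sketch below illustrates the idea; the directory `/var/www/data` and the helper name `safe_path` are assumptions, not part of the original material.

```python
import os

# Hypothetical directory that the CGI program is allowed to read from.
BASE_DIR = os.path.realpath("/var/www/data")

def safe_path(filename):
    """Return the resolved path if it stays inside BASE_DIR, else None."""
    candidate = os.path.realpath(os.path.join(BASE_DIR, filename))
    # commonpath equals BASE_DIR only when candidate is BASE_DIR or below it.
    if os.path.commonpath([BASE_DIR, candidate]) != BASE_DIR:
        return None
    return candidate

print(safe_path("report.txt"))
print(safe_path("../../../etc/passwd"))   # rejected: escapes BASE_DIR
```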
Security Tips (Cont'd)
• Don't store data where it can be accessed by
HTTP clients
– Either put data in a separate directory that is not under
your public_html directory, OR
– Adjust file permissions
• Always escape user-supplied data before
outputting it as HTML
• Don't give out more information than necessary
– If a wrong password is entered for a legitimate user, just
print "incorrect login", not "incorrect password"
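For the escaping tip above, Python's standard `html.escape` function is one option. The sketch below shows a script tag supplied as a "name" being rendered harmless; the helper name `greeting` and the sample input are illustrative assumptions.

```python
import html

def greeting(name):
    """Build an HTML fragment with the user-supplied name escaped."""
    return "<h2>Hello %s!</h2>" % html.escape(name)

# The markup characters are replaced by entities, so the browser
# displays the text instead of executing it.
print(greeting("<script>alert(1)</script>"))
# prints: <h2>Hello &lt;script&gt;alert(1)&lt;/script&gt;!</h2>
```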
Security Tips (Cont'd)
• If SQL commands are built from user input, beware
of SQL injection
• Equality-based injection:
– SELECT * FROM users WHERE userid = 110 or 1=1
• Batched SQL:
– SELECT * FROM users WHERE userid = 110; DROP
TABLE users
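A widely used defence is to pass user input as a query parameter instead of pasting it into the SQL text. The sketch below uses the standard `sqlite3` module with an in-memory database; the table contents and the helper name `find_user` are made up for illustration.

```python
import sqlite3

# In-memory database with sample rows (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (userid INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(110, "alice"), (111, "bob")])

def find_user(conn, userid):
    """Look up a user; the ? placeholder passes userid as data, not SQL."""
    return conn.execute(
        "SELECT userid, name FROM users WHERE userid = ?", (userid,)
    ).fetchall()

print(find_user(conn, 110))             # the legitimate lookup
print(find_user(conn, "110 or 1=1"))    # injection attempt matches nothing
```

Because the whole string `110 or 1=1` is compared as a single value, it never becomes part of the SQL statement.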
Some Technical Issues
• The user ID that the CGI program is run under
depends on the server configuration
• Most often, it is either the web server's user or the
owner of the CGI file
– In the former case, make sure that your script is readable
and executable by “others”
• The CGI program is restricted to performing
operations permitted to that user
Summary
• CGI is a simple means by which the server and
CGI program exchange data
• CGI (or something like it) is required for creating
web pages with dynamic content
• Form data is handled differently depending on
whether the form method is GET or POST
• CGI programs can read and set cookies
• CGI introduces many subtle potential security
problems
References
• Core Python Applications Programming by Wesley J. Chun
• Internet Programming by Pat Morin
– http://cg.scs.carleton.ca/~morin/teaching/2405
• Python cgi Library
– http://docs.python.org/3/library/cgi.html