Allowing Registration-Required Binary Downloads
by Robert Bernier10/21/2004
One of the hallmarks of a well-developed web presence is the ability to offer downloads. We're all familiar with the site that offers you a free application. You receive something of value in exchange for supplying the site with a contact name and email address. This isn't the only example: remote code compilation, translation services, and forum requests are others.
This article demonstrates how to create a web site that offers a binary download in exchange for user-supplied information in a manner that protects your data. The site will store information gleaned from a form on the site as well as the downloadable binary in a PostgreSQL database.
It's easy to validate the form's values, such as the email's domain, by carefully crafting the table's definitions and using the appropriate PHP functions. You can also keep user statistics, such as tracking the number of downloads on a per-user basis.
For the sake of expediency, I'll only discuss the contents of the server pages in the broadest terms. I expect that you already know PHP and SQL.
The Scenario
Your company supplies consulting services. You're offering a software tool in exchange for the name and email address of the downloader. After a user invokes a download request, your system sends an email with the download URL. The applicant clicks on the URL and the download begins. Here's where the first problem crops up. You saw this coming; most of those emails that you've received are bogus. You now face the daunting task of validating each record to see who really downloaded your binary.
Creating a download site shouldn't be brain surgery, but there are several facts that you should keep in mind.
- Some people will take the time to fill out forms with bad email addresses, even if the site is plastered with notices saying that they can't download the binary without a valid email address.
- Some people will take the extraordinary amount of effort of registering an email address on a free email service just for the download.
- A small number of people who practice fact number 2 have purposely introduced viruses/trojans into an email addressed to you under the guise of a legitimate communication.
|
Related Reading
Upgrading to PHP 5 |
Therefore, it's necessary to institute a mechanism that reduces frivolous form submissions and maximizes security. It seems there are a lot of people out there with free time on their hands.
Who Should Mind the Store?
There's a lot of technology out there, but it's a bit of a trick to decide what to use. Developing a viable solution depends upon understanding the context. You must know the people who run the site and appreciate the system's operating conditions.
Here are some questions you should ask:
- Is the business logic in the server pages or in the database back end?
- What kind of validation is necessary to protect both the data integrity and the site itself?
- Should the client process part of the business logic, perhaps through client-side JavaScript in a browser?
- Is there a division of labor? Does one person maintain the site while another processes the data?
- Do the people involved understand what can go wrong?
- Do you host the site on your own machine or does it belong to a third-party, such as an ISP?
Creating the Simple Downloader
Let's start by looking at some pages with minimal code and little validation. (You can download the sample server pages and table definitions if you like.)
This is the SQL used to generate the client table:
CREATE TABLE client(
firstname text,
lastname text,
email text,
CONSTRAINT unique_nameAndEmail
UNIQUE (firstname, lastname, email)
)
The PostgreSQL server runs on localhost with a database named
mycompany. The user name is applicant with a password
of 123 It's more secure if you've created the tables and
functions as the superuser, but don't forget that you'll need to assign the
correct privileges for applicant as well.
There are four pages:
- index.html, the application form
- downloadVerification.php, the form's target page
- down1.php, the target page for the applicant's emailed URL
- down.php, which transmits the binary to the applicant's browser
How It Works
When a user submits the form in index.html, the parameters go to
downloadVerification.php. This page obtains three parameters:
firstname, lastname, and email.
The page concatenates these parameters into a single string using the
md5() function and appends it to the download URL. It sends this URL
to the applicant's email address using PHP's mail() function:
$email = trim($_POST['email']);
$str_md5 = md5($f_name.$l_name.$email);
$msg = "http://localhost/down1.php?vericode=".str_md5;
mail($email, "Binary Download Verification Code", $msg,
"From:sales@mycompany.com");
It also inserts the name and email variables into
the client table:
$query = "INSERT INTO client (firstname,lastname,email)
VALUES('$f_name','$l_name','$email')";
$dbconn = pg_connect("host=127.0.0.1 port=5432 dbname=mycompany user=applicant password=123")
or die("<h2>Warning... system is congested. Please wait, then try again</h2>");
Clicking on the link directs the applicant to down1.php. Note that this page is really an intermediary page, in that it provides something to read during the download. The JavaScript in this page redirects to the real download page, down.php.
<html>
<head>
<title>The download page</title>
<?php
$str_md5 = $_GET['vericode'];
echo "
<script language='JavaScript' type='text/JavaScript'>\n
function redirect()\n
{\n
window.location = 'http://localhost/down.php?vericode=$str_md5'\n
}\n
setTimeout('redirect();', 2000)\n
</script>\n";
?>
</head>
<body>This page gives a nice message before the download actually begins</body>
</html>
The down.php page extracts the vericode parameter with
the following query:
SELECT firstname,lastname,email FROM client
WHERE
md5(firstname||lastname||email) = '$str_download'"
The binary download code is cool not only because it stores the binary in the database, as opposed to the filesystem, but because of its utter simplicity. The secret is sending the browser the correct headers:
// large objects must be obtained from within a transaction
pg_query($dbconn, "begin");
// current large object resource number of the zipped file is "17899"
$lo_oid = 17899;
$handle_lo = @pg_lo_open($dbconn,$lo_oid,"r") or die("<h2>Error.. can't get handle</h2>");
//headers to send to the browser before beginning the binary download
header('Accept-Ranges: bytes');
header('Content-Length: 32029974'); //this is the size of the zipped file
header('Keep-Alive: timeout=15, max=100');
header('Content-type: Application/x-zip');
header('Content-Disposition: attachment; filename="superjob.zip"');
@pg_lo_read_all($handle_lo) or
die("<h2>Error, can't read large object.</h2>");
// committing the data transaction
pg_query ($dbconn, "commit");
Note: I submitted this snippet to the PHP homepage at pg-lo-read-all
documentation.
I've previously uploaded this binary into the database as a large object
using the psql utility. This process returned an object
identifier (oid) to use to refer to the object. Using that oid, the code
fetches a handle to it:
$handle_lo = @pg_lo_open($dbconn,$lo_oid,"r") or die("<h2>Error.. can't get handle</h2>");
The browser needs to know that it's about to receive a zipped file before the data transfer. The page sends the appropriate headers:
header('Accept-Ranges: bytes');
header('Content-Length: 32029974'); // the size of the zipped file
header('Keep-Alive: timeout=15, max=100');
header('Content-type: Application/x-zip'); // a zipped file is coming
header('Content-Disposition: attachment; filename="superjob.zip"');
Then the download begins:
@pg_lo_read_all($handle_lo) or
die("<h2>Error, can't read large object.</h2>");
Caveat! Large object transactions require that you enclose the instructions within a transaction; that is, between
pg_query ($dbconn, "begin");
and
pg_query ($dbconn, "commit");
The Problem of Validation
Validation protects your data and makes the entire transaction more robust, too. You need to decide where to place the validation instructions, either in the server pages or the PostgreSQL back end. This will depend on the site's unique operating conditions.
Normally, I opt to place the bulk of validation on the database side, with only the most basic validation in the server pages. Since the DBA uses the data, it only makes sense to say that he is also the one who is in the best position to protect the data.
Validation in Forms
There's not much I want to say here other than to remind you that protecting your forms is the first step to preventing problems:
- Always use a maximum-length parameter for the form's text variables.
- When possible, use radio buttons and checkboxes rather than text boxes, because this limits the kinds of variables expected at the back end. Don't forget, though, that determined users can send any kind of data they want.
- Trim white spaces on text input.
- Use JavaScript when possible. JavaScript can ensure that mandatory fields are not empty, or permit only certain kinds of characters as input. It also provides fast response to the client's actions without wasting bandwidth. Again, though, never trust the client.
Validation in Server Pages
You should, at the very least, repeat the same validation as for forms. Avoid composing strings in PHP scripts as SQL statements. For example, instead of writing:
// DO NOT USE; INSECURE
$query = "INSERT INTO client (firstname,lastname,email)
VALUES('$f_name','$l_name','$email')";
Use this:
$query = SELECT f_insert($firstname,$lastname,$email);
The user-defined function f_insert() is a plpgsql function created
in the database client. User-defined functions allow you to define additional
validation rules without changing the PHP source code. They also mitigate
exploits such as SQL injection.
Validation in the Database
There are two goals when defining this example database. First, verify the inserted variables. Second, prevent uncontrolled downloads of the zipped binary.
A good table definition is the first step to maintaining data integrity.
Let's rewrite the client table:
CREATE TABLE client(
firstname varchar(15) NOT NULL,
lastname varchar(15) NOT NULL,
email varchar(50) NOT NULL,
counter int DEFAULT 0,
CONSTRAINT unique_nameAndEmail
UNIQUE (firstname, lastname, email),
CONSTRAINT firstname_length CHECK(length(rtrim(ltrim(firstname)))>=3),
CONSTRAINT lastname_length CHECK(length(rtrim(ltrim(lastname)))>=3),
CONSTRAINT email_include CHECK(email ~~ '%_@%.%'),
CONSTRAINT email_exclude CHECK(email !~ '[\,#,$,%,&,(,), ]'),
CONSTRAINT email_exclude1 CHECK(email !~ 'hotmail' AND email !~ 'yahoo'),
CONSTRAINT max_download CHECK(counter <= 3)
);
The table definition carries out the following constraints:
- There must be no white spaces at the beginning or end of the variables.
- The variables
firstnameandlastnamemust be at least three and less than 15 characters long. - The variable
emailmust be less than 50 characters long and it must have an@symbol and a period to the right of it. The email address cannot contain the symbols\,#,$,%,&,(,), or], or any white spaces. This isn't strictly RFC-822-compliant, but it's a good first-order approximation. - The email address cannot a the domain name containing the strings
hotmailoryahoo, because many spammers use these email domains.
I've also added the counter column to control the maximum
number of downloads. In this case, it's three.
The function f_insert() is:
CREATE OR REPLACE FUNCTION f_insert(
varchar, --firstname
varchar, --lastname
varchar --email
)
RETURNS INT AS '
BEGIN
INSERT INTO client
(firstname, lastname, email)
VALUES ($1, $2, $3);
RETURN 0;
END;
'
LANGUAGE plpgsql;
Controlling the number of downloads is as easy as defining a
CHECK constraint on the counter column in the
table:
CONSTRAINT max_download CHECK(counter <= 3)
The column value increments by one for each download. When the value of
counter exceeds the CONSTRAINT value, the SQL query fails,
rejecting additional downloads.
Implementing a counter requires an UPDATE query and a trigger, which fires a
function that increments the counter column.
The new query in down.php becomes:
$query = "UPDATE client SET counter= -1
WHERE
md5(firstname||lastname||email) = '$str_download'";
The function f_update is:
CREATE OR REPLACE FUNCTION f_tr_update()
RETURNS TRIGGER AS '
BEGIN
IF NEW.counter = -1 THEN
NEW.counter= OLD.counter + 1;
END IF;
RETURN NEW;
END;
'
LANGUAGE plpgsql;
The value -1 is a dummy variable, which flags the update trigger that an
increment is about to occur. In addition, the presence of the flag allows the
DBA to update records that have nothing to do with a download request.
The command that creates the trigger is:
CREATE TRIGGER tr_update
BEFORE UPDATE ON client
FOR EACH ROW
EXECUTE PROCEDURE f_tr_update();
The Email Conundrum
Obtaining a valid email address is our single most critical validation. Although much of the previous discussion has gone a long way to mitigating the problem, there's still the challenge of validating the domain name itself. Fortunately, PHP contains a function that provides a readymade solution.
Insert these two lines of code into downLoadVerification.php:
$email_domain = explode("@",$email);
if (!checkdnsrr($email_domain[1],"MX")) die($errormsg);
The explode() function parses the email address based on the
@ symbol and checkdnsrr() contacts the Domain Name
Server and verifies that the domain name has a working SMTP server.
Conclusion
This article has covered two issues: how to download a binary through an interactive process of filling out a form, and how to protect your site through the efficient use of validation in a PostgreSQL database. Creating an interactive web site can be a rewarding experience, if you do it right.
Robert Bernier is the PostgreSQL business intelligence analyst for SRA America, a subsidiary of Software Research America (SRA).
Return to the PHP DevCenter
-
It would be better to pre-calculate the md5 hash.
2006-02-27 09:29:39 phallstrom@gmail.com [View]
-
It would be better to pre-calculate the md5 hash.
2009-01-10 07:51:07 Jason.Carter [View]
-
wankers
2005-03-10 06:59:36 idiotkiller [View]
-
e-mail validation
2004-11-28 12:23:06 butterfi [View]
-
Discrimination against the short-named
2004-10-25 06:34:48 ST [View]



