Notes week of 10 Dec

For this week, we’re going to look at ways to implement form validation using WAI. For these notes (and for assignment 12) you need one more library installed. Here’s the complete set:

apt install ghc libghc-warp-dev libghc-blaze-html-dev libghc-mtl-dev

The usual imports

A basic registration form

Here is an HTML spec for a basic registration form, with four fields: first name, last name, email address, and phone number.

The registration form rendered in Firefox
The registration form rendered in Firefox

Retrieving the data

Since the form response is sent back to the server using a POST, we can use requestMethod to distinguish whether we should present the form or process the result. Furthermore we can use requestBody to retrieve the data.

Run this version of the app in GHCi by typing:

ghci> Warp.run 80 app1

(I suggest using port 80 on the Mimir IDE, or port 3000 if on your own computer.)

After I access the page and fill out the form, I get an output on the GHCi console that looks like this:

Body: first=Chris&last=League&email=league%40acm.org&phone=2222222

Parsing the data

That format is known as a URL-encoded query string. It uses equals (=) and ampersand (&) symbols to separate different variables, and also encodes other non-alphanumeric bytes using two-digit hex codes preceded by a percent sing (%), as in %40 to represent the at sign @ in the email address.

What if we enter a last name that contains an apostrophe and a hyphen (O’Reilly-Jones), a first name that contains a space and an accented character (Rúnar Bram), and an email address with an ampersand (jones&co@gmail.com)? The result is:

Body: first=R%FAnar+Bram&last=O%27Reilly-Jones&email=jones%26co%40gmail.com&phone=1112223333

The substitutions that were used are:

  • %FA for ‘ú’
  • + for a space
  • %27 for the apostrophe
  • %26 for the ampersand
  • %40 for the at sign
  • (The hyphen did not need to be substituted.)

There is a function in Network.HTTP.Types.URI that can decode these:

But it is meant to decode the path part of the URL too, as in:

λ> decodePath "/watch?v=KWd8ZLKxlY0"
(["watch"],[("v",Just "KWd8ZLKxlY0")])

The result is a pair of the path segments and the query parameters. We can trick it into decoding just a query without the path by preceding it with a question mark:

λ> decodePath "?first=R%FAnar+Bram&last=O%27Reilly-Jones"
([],[("first",Just "R\250nar Bram"),("last",Just "O'Reilly-Jones")])

Notice how it decoded the percent codes and the plus/space in the first name. It looks like it may have flubbed the accented character, but actually \250 in a Haskell string represents the right character:

λ> putStrLn "R\250nar"
Rúnar

So here’s a version of the app where the registration form data is parsed and printed:

Run it in GHCi by typing:

ghci> Warp.run 80 app2

Here are the results when I submit the data as before:

("first",Just "R\249nar Bram")
("last",Just "O'Reilly-Jones")
("email",Just "jones&co@gmail.com")
("phone",Just "1112223333")

Validating the data

Most forms on the web come with various validation rules. Most simply, form fields can be required (non-empty). For something like an email address, we could verify the syntax (at least make sure it contains an @). For phone numbers, we could verify it has enough digits.

Here is a pretty simplistic validator for required fields:

The lookup returns a nested Maybe because the key itself could be missing, but if present its value could also be `Nothing. Or it could be a string, but the string is empty. Here’s how we’d get all four of the cases handled above:

λ> lookup "x" $ snd $ decodePath "?y=9"
Nothing
λ> lookup "x" $ snd $ decodePath "?y=9&x"
Just Nothing
λ> lookup "x" $ snd $ decodePath "?y=9&x="
Just (Just "")
λ> lookup "x" $ snd $ decodePath "?y=9&x=3"
Just (Just "3")

Now we can rewrite the application, using requireField on all the fields in the user registration form:

We’re using lefts and rights (from Data.Either) to split apart the error messages from the successful results. Then we determine the output based on whether there were errors.

Response with missing fields.
Response with missing fields.
Response with complete fields.
Response with complete fields.

A new abstraction

The form validation in the previous version of the app is not too bad, but as the rules get more complex we may want a better abstraction. Functors and monads to the rescue!

We’re going to use a Reader monad to pass around the query, and a Writer to keep track of any errors that arise. By importing these commonplace monad abstractions, we don’t have to declare any custom type-class instances.

To access the parsed query data, we use the ask selector defined by the Reader monad.

The join operator collapses the nested Maybe types resulting from lookup.

To emit an error message, we use the tell action defined by the Writer monad.

Here’s a helpful primitive that converts a query value from ByteString to Text, supplies a default of the empty string, and also trims any leading and trailing spaces.

Finally, here’s a general validation operator that tests a value against a Boolean function, and emits the error message if the result is False.

We can use that to flag required fields that are empty (or only contain spaces):

It can also encode more sophisticated validation, checking that the email address contains an @ sign:

and check that a phone number is digits only, and a certain minimum length:

Now we can put it all together with this representation of user data:

and the corresponding validation rules:

A new version of the app that uses them:

Sample responses to registration form
Sample responses to registration form
Errors produced based on responses above
Errors produced based on responses above