Notes week of 10 Sep

Using GHCi

Here are a few options for using GHCi to compile Haskell code.

Install using stack

You can optionally download and install Stack, a build tool and dependency manager for Haskell projects. Follow the instructions at https://docs.haskellstack.org/en/stable/install_and_upgrade/

For Mac and Linux, that page recommends a curl command; you would run that in the Terminal application (in Applications » Utilities on the Finder menu). For Windows, there’s a traditional installation program.

Once installed, you will also use Stack from the Terminal application, called the Command Prompt on Windows. To launch the Windows Command Prompt, open the start menu and type cmd – you should see the Command Prompt application appear as the best match. Press enter to open it.

On either platform, verify that Stack is installed correctly by typing stack in the terminal, and pressing enter. You should see output similar to what follows:

stack - The Haskell Tool Stack

Usage: stack [--help] [--version] [--numeric-version]
             [--hpack-numeric-version] [--docker*] [--nix*]
             ([--verbosity VERBOSITY] | [-v|--verbose] |
              [--silent]) [--[no-]time-in-log]

[...lots more options...]

stack's documentation is available at https://docs.haskellstack.org/

Now we need to ask Stack to download the latest version of GHC. Just type this in your terminal:

stack setup

Once it is successfully installed, you should be able to type:

stack ghci

and be loaded into the interactive loop. It should look something like this:

Configuring GHCi with the following packages:
GHCi, version 8.0.2: http://www.haskell.org/ghc/  :? for help
Prelude>

To quit GHCi, just type :q then enter.

Use online at repl.it

The site https://repl.it/ is a great resource for programming in lots of languages using only a web browser. Just type Haskell to get started. If you want to come back to your program in progress, copy down the URL and/or create an account.

The site comes up with 3 panes: a file browser, an editor, and an interactive REPL. The default file, main.hs, will be enough for a little while so you can close the file browser by clicking on the file icon to the left of it.

One caveat about how repl.it operates is that the “run” button requires a main function to be defined. You can just put:

to satisfy its requirement without doing anything else. Any other functions you specify in your file will be available in the REPL after you hit the run button.

Use mimir.io

The platform we use for assignments, https://mimir.io/ will also compile and test your Haskell programs. As the assignments get more complex and our programs get larger, I will show you how to use an entire development environment within Mimir that operates much like the one described on repl.it.

Booleans

The Boolean values in Haskell are spelled True and False – they must be capitalized as such. The case of the first character of an identifier determines what kind of identifier it is. All the functions we’ve written so far (such as areaCircle and volumeOfSphere) start with lower-case letters. Lower-case indicates it’s a function or variable. Upper-case indicates it’s a constructor or type. We’ll learn more about those shortly, but for now it’s enough to know that True and False are constructors, so their identifiers are capitalized.

Here are some operators that generate Booleans:

  • && is the logical AND (same as in C++)
  • || is logical OR (same as C++)
  • not is logical negation (! in C++)
  • == is equality of two values (same as C++)
  • /= is not-equals (!= in C++)
  • <=, <, >, >= are inequality operators, as in C++
  • even and odd take integers and return True if they are even/odd respectively.

Tuples

We’re going to look at two ways to group together multiple values. The first is using tuples.

A tuple has a fixed number of elements, and they can have different types. A pair is a tuple with two elements, and a triple is what we would call a tuple with three elements. The word tuple is just a generalization to any (fixed) number of elements. The notation is uses parentheses and commas:

  • (3,4) is a pair of integers
  • ("Chris", 99) is a pair of a string and an integer
  • (42, 13, 'Z') is a triple containing two integers and a character.

There are built-in functions for extracting the first and second elements of a pair:

ghci> score = ("Chris", 99)
ghci> fst score
"Chris"
ghci> snd score
99

There is nothing built-in for triples or higher, but you can define them as needed. For example:

These are defined using pattern matching. Rather than give a single-name argument, like r was specified in volumeOfSphere r, the argument is itself a tuple. The underscore characters _ are “wildcard” patterns that match anything but don’t give it a name. The a (or b or c) is just a variable name that gets bound to the value in that position of the tuple. Usage of these functions:

ghci> things = (42, 13, 'Z')
ghci> firstOf3 things
42
ghci> secondOf3 things
13
ghci> thirdOf3 things
'Z'

Lists

Lists are distinguished from tuples in that they can have any number of elements, but they must all have the same type. The notation for lists is square brackets, with values separated by commas. Here are some examples in the REPL:

ghci> [3,5,19]
[3,5,19]
ghci> ["Alice", "Bob", "Chris"]
["Alice","Bob","Chris"]
ghci> []                 -- empty list
[]
ghci> ['C','A','T']
"CAT"                    -- strings are just lists of characters

One useful bit of notation is that lists of “enumerable” values can be specified with this range notation:

ghci> [3..8]
[3,4,5,6,7,8]

We’ll describe what “enumerable” means technically later, but such values include integers, floats, and characters:

ghci> [pi, 2*pi .. 10]
[3.141592653589793,6.283185307179586,9.42477796076938]
ghci> ['A'..'Z']
"ABCDEFGHIJKLMNOPQRSTUVWXYZ"

What if you want to generate the list [3,4,5,6,7,8] but in reverse order? You could try:

ghci> [8..3]
[]

but it produces the empty list. That’s because by default enumerating adds one to the begin element until it reaches (or exceeds) the ending element. Since 8 already exceeds the ending element, no elements are generated.

But an expanded notation allows us to specify the second element, which gives the interval it uses to generate the remaining ones:

ghci> [2,4..10]
[2,4,6,8,10]
ghci> [3,6..20]
[3,6,9,12,15,18]
ghci> [8,7..3]
[8,7,6,5,4,3]

Partial application

Whenever a function takes more than one argument, you can actually give it one argument at a time to produce a new function that takes the remaining arguments. This is called partial application, and functions with which it can be used are know as curried functions or functions with curried arguments.

Aside: the name “curried” is derived from the name of an American logician, Haskell Curry (1900–1982), after whom the language is also named.

Here’s an example function with three arguments:

Whenever we apply that, as in floop 9 1 2, you can view that as being parenthesized: (((floop 9) 1) 2) where it looks more like each argument is provided separately. So the result of floop 9 is a function that can take two more arguments. The result of (floop 9) 1 is yet another function, that can take one more argument.

Here’s a way to take advantage of partial application, by giving a separate name to the partially applied function:

It’s pretty straightforward to justify what’s going on here. Just substitute the equality goop = floop 9 1 into the expressions below:

goop 2 ↪          -- definition of goop
(floop 9 1) 2 ↪   -- definition of floop, substituting a=9 b=1 c=2
9 + 2*1 - 3*2 ↪   -- the rest is just arithmetic
9 + 2 - 6 ↪
11 - 6 ↪
5

You can also do this with symbolic operators. Recall that the parentheses allow the operator to be used in prefix position:

ghci> thrice = (*) 3
ghci> thrice 4
12
ghci> thrice 100
300

With operators like (+) and (*), the positioning of the arguments don’t matter because they’re commutative. But with other operators like (-) and (/) it may matter. In that case, we can use an operator section.

This refers to specifying an infix operator with only one argument, which then becomes a function of the other argument. For example:

  • (/3) is a section of the division operator, where the numerator is the next argument.

  • (3/) is also a section of the division operator, but now the denominator is the next argument.

Usage:

ghci> boo = (/3)
ghci> moo = (3/)
ghci> boo 10
3.3333333333333335
ghci> moo 10
0.3
ghci> boo 12
4.0
ghci> moo 12
0.25

Section with the minus operator are syntactically complicated by ambiguity with negative numbers. So a different way to create partial functions with minus is to use prefix and the function flip.

ghci> bee = flip (-) 3   -- subtract 3 from arg
ghci> mee = (-) 3        -- subtract arg from 3
ghci> bee 10
7
ghci> mee 10
-7
ghci> bee 12
9
ghci> mee 12
-9

flip takes any function of two or more arguments, and produces a function that feeds the arguments in the reverse order. The definition is: would look something like this:

Here’s a good example:

ghci> squarePair x y = (x*x, y*y)
ghci> squarePair 3 4
(9,16)
ghci> flip squarePair 3 4
(16,9)
ghci> flip squarePair (2+5) (div 12 2)
(36,49)

Let’s evaluate that last expression step-by-step:

flip squarePair (2+5) (div 12 2) ↪   -- definition of flip
squarePair (div 12 2) (2+5) ↪        -- arithmetic
squarePair 6 7 ↪                     -- definition of squarePair
(6*6, 7*7) ↪                         -- arithmetic
(36, 49)

High-level list operations

  • length of a list is pretty self-explanatory:

    ghci> length []
    0
    ghci> length [0..5]
    6
    ghci> length "Haskell"
    7
    ghci> length "Bye!\n"
    5                      -- newline is just one character
  • map takes a function and a list. It applies the function to each element of the list and returns the results as a list. Here are some examples:

    ghci> map (*5) [1..5]
    [5,10,15,20,25]
    ghci> map succ "Hello"
    "Ifmmp"
    ghci> square x = x*x
    ghci> map square [1..10]
    [1,4,9,16,25,36,49,64,81,100]
    ghci> map (/7) [1..6]
    [0.14285714285714285, 0.2857142857142857, 0.42857142857142855,
     0.5714285714285714, 0.7142857142857143, 0.8571428571428571]
  • filter applies a function returning a boolean to each element of the list, and then retains only those for which the function was true:

    ghci> filter even [1,2,3,4,6,10,11]
    [2,4,6,10]
    ghci> filter (> 5) [3,8,2,9,5,7,9,1]
    [8,9,7,9]
    ghci> special x = x < 8 || x `mod` 3 == 0
    ghci> filter special [1..20]
    [1,2,3,4,5,6,7,9,12,15,18]
  • all is like a logical for-all operator (\(\forall\)) — it takes a boolean function and a list, and returns true if the function is true for all the elements of the list.

    ghci> all even [2,8,5,10]
    False
    ghci> all even [2,8,4,10]
    True
    ghci> all even []
    True
  • any is like a logical “there exists” operator (\(\exists\)) — it takes a boolean function and a list, and returns true if the function is true for any elements of the list.

    ghci> any even [2,8,5,10]
    True
    ghci> any even [1,5,3]
    False
    ghci> any even []
    False

Functions with multiple clauses

Functions can be defined with multiple clauses – different expressions that apply to different cases. One way to do this is with pattern guards, which are introduced with a pipe/bar character (|) before the equals sign:

Following each bar is a Boolean expression, called a guard. The first guard that results in True indicates the expression that is will be executed. The keyword otherwise will match if no previous guard matched.

Here are some derivations using the sample function above:

collatz 18 ↪
  -- plug in 18 for n in the guards
  -- evaluate: even 18 ↪ True
18 `div` 2 ↪
9

collatz 5 ↪
  -- plug in 5 for n in the guards
  -- evaluate: even 5 ↪ False
  -- move on to otherwise case
3*5 + 1 ↪
15 + 1 ↪
16

Let’s define a few more numeric functions this way:

The error function signals a run-time error. It will look something like this:

*** Exception: Negative argument
CallStack (from HasCallStack):
  error, called at <interactive>:60:1 in interactive:Ghci44

In Haskell we usually try pretty hard to avoid generating run-time errors. Later we can explore some techniques for ensuring they don’t happen.

Let’s attempt to derive the values of expressions using these guarded recursive functions.

fact 5 ↝
  -- substitute 5 for n in guards
  -- evaluate: 5 < 0 ↝ False
  -- evaluate: 5 == 0 ↝ False
  -- move on to otherwise case
5 * fact (5-1) ↝    -- just arithmetic
5 * fact 4 ↝
  -- substitute 4 for n in guards
  -- evaluate: 4 < 0 ↝ False
  -- evaluate: 4 == 0 ↝ False
  -- move on to otherwise case
5 * 4 * fact (4-1) ↝          -- arithmetic
5 * 4 * fact 3 ↝              -- otherwise case
5 * 4 * 3 * fact 2 ↝          -- arithmetic, otherwise
5 * 4 * 3 * 2 * fact 1 ↝      -- arithmetic, otherwise
5 * 4 * 3 * 2 * 1 * fact 0 ↝  -- arithmetic
  -- evaluate: 0 < 0 ↝ False
  -- evaluate: 0 == 0 ↝ True, use that case
5 * 4 * 3 * 2 * 1 * 1 ↝
20 * 3 * 2 * 1 * 1 ↝
60 * 2 * 1 * 1 ↝
120 * 1 * 1 ↝
120 * 1 ↝
120
pow 2 6 ↝
2 * pow 2 5 ↝
2 * 2 * pow 2 4 ↝
2 * 2 * 2 * pow 2 3 ↝
2 * 2 * 2 * 2 * pow 2 2 ↝
2 * 2 * 2 * 2 * 2 * pow 2 1 ↝
2 * 2 * 2 * 2 * 2 * 2 * pow 2 0 ↝
2 * 2 * 2 * 2 * 2 * 2 * 1 ↝
4 * 2 * 2 * 2 * 2 * 1 ↝
8 * 2 * 2 * 2 * 1 ↝
16 * 2 * 2 * 1 ↝
32 * 2 * 1 ↝
64 * 1 ↝
64

Exercise: why is it helpful to have a case for n < 0 in these functions? What happens if we omit that case and we try to evaluate fact (-1)?