Sun Dec 2

**Note about submission:** The format of this assignment on Mimir is different than previous ones. You may use the web IDE on Mimir, or an editor and GHC on your local system.

If using your own system, you can download the starter code as a Zip file from Mimir (but there is only one file in it,

`A10.hs`

). You can then upload the modified file.If using the web IDE on Mimir, use this command in the terminal to install GHC:

`apt install ghc`

You might need to repeat that whenever you reopen the IDE? The starter code can be found within the folders at the left, and/or get there this way in the terminal:

`cd cs168_functional_programming/assignment_10 ghci :load A10.hs`

This assignment includes the beginnings of an **expression parser,** that can scan through a stream of characters and build a tree to match arithmetic expressions. We begin with all the type and instance definitions that we covered in class…below this code there will be a friendlier summary of the available operations.

```
instance Functor Parser where
-- fmap :: (a -> b) -> Parser a -> Parser b
fmap f p = Parser g
where g s = mapFst f (runParser p s)
```

```
instance Applicative Parser where
-- pure: a -> Parser a
pure a = Parser (\s -> [(a,s)])
-- <*> :: Parser (a -> b) -> Parser a -> Parser b
pf <*> pa = Parser g
where g s1 = concatMap h (runParser pf s1)
where h (f, s2) = mapFst f (runParser pa s2)
```

```
instance Monad Parser where
-- (>>=) :: Parser a -> (a -> Parser b) -> Parser b
pa >>= fpb = Parser g
where g s1 = concatMap h (runParser pa s1)
where h (a,s2) = runParser (fpb a) s2
```

```
instance Alternative Parser where
-- empty :: Parser a
empty = Parser (\_ -> [])
-- (<|>) :: Parser a -> Parser a -> Parser a
p1 <|> p2 = Parser g
where
g s1 =
case runParser p1 s1 of
[] -> runParser p2 s1 -- p1 failed, so try p2
results -> results -- otherwise p1's results are fine
```

These are a really useful set of primitive operations on parsers. Basically we want to be able to match characters meeting certain criteria, or exact strings.

```
satisfies :: (Char -> Bool) -> Parser Char
satisfies ok = Parser f
where f [] = [] -- input was empty
f (c:cs)
| ok c = [(c,cs)] -- first char satisfies function
| otherwise = [] -- first char doesn't satisfy function
```

```
string :: String -> Parser String
string s = loop s >> return s
where loop "" = return ""
loop (c:cs) = char c >> loop cs
```

Some examples using these:

```
ghci> runParser (satisfies isAlpha) "abc"
[('a',"bc")]
ghci> runParser (satisfies isAlpha) "2bc"
[] -- failed parse
ghci> runParser (char 'x') "xyz"
[('x',"yz")]
ghci> runParser (string "hello") "helloworld"
[("hello","world")]
ghci> runParser (string "hello") "helium"
[] -- failed parse
```

actually executes the parser on a given string. Remember that the list result represents multiple possible ways to parse values of type `a`

, and the paired `String`

is the remaining characters that haven’t been used yet. When the list is empty, that means the parse **failed.**

These return a value without consuming any input at all, for example:

```
ghci> runParser (pure 99) "hello"
[(99,"hello")]
```

```
(>>) :: Parser a -> Parser b -> Parser b -- SEQUENCING
(<|>) :: Parser a -> Parser a -> Parser a -- ALTERNATIVES/CHOICE
```

These two operators have very similar types, but they do different things. The first represents a *sequence* – `p >> q`

means parse using `p`

and (if that succeeds) continue parsing using `q`

on the remaining string. However, the result returned is just that of the *second* parser, `q`

. The result of `p`

is ignored:

```
ghci> runParser (string "hello" >> string "world") "helloworld"
[("world","")]
ghci> runParser (string "hello" <|> string "world") "helloworld"
[("hello","world")]
ghci> runParser (string "hello" <|> string "world") "worldwide"
[("world","wide")]
```

The alternative type class also gives us these handy operators:

```
many :: Parser a -> Parser [a] -- Zero or more, in sequence
some :: Parser a -> Parser [a] -- One or more, in sequence
optional :: Parser a -> Parser (Maybe a)
```

Examples:

```
ghci> runParser (many (satisfies isAlpha)) "abc123"
[("abc","123")]
ghci> runParser (many (satisfies isAlpha)) "123abc"
[("","123abc")] -- Not a failure, just matches zero characters
ghci> runParser (some (satisfies isAlpha)) "abc123"
[("abc","123")]
ghci> runParser (some (satisfies isAlpha)) "123abc"
[] -- Failure
ghci> runParser (optional (char 'x')) "x18"
[(Just 'x',"18")]
ghci> runParser (optional (char 'x')) "y18"
[(Nothing,"y18")]
```

Finally, there is the usual applicative pattern, shaped like this:

where `f`

would be a function expecting four arguments, and the blanks would be replaced with four parsers. The parsers are executed **in sequence**, and if they all succeed, the results are passed to the function to produce a value. Here are some examples of that pattern:

```
ghci> runParser ((++) <$> some (satisfies isDigit)
<*> some (satisfies isAlpha)) "12ab34"
[("12ab","34")]
```

So that one matches digits, followed by alphabetic characters, and appends them together with `(++)`

.

```
ghci> runParser ((\a b c -> [a,b,c]) <$> string "hi" <*> pure "lo" <*> string "sa") "hisamo"
[(["hi","lo","sa"],"mo")]
```

We’ll reintroduce those tree types for arithmetic expressions:

```
data Arith
= Num Double
| Var String
| UnaryOp UnaryOp Arith
| BinaryOp BinaryOp Arith Arith
deriving (Eq, Show)
```

```
data BinaryOp
= Add -- add, +
| Sub -- subtract, -
| Mul -- multiply, *
| Div -- divide, /
| Pow -- power (exponent), **
deriving (Eq, Show)
```

Let’s begin with parsers for the `UnaryOp`

:

```
parseUnaryOp :: Parser UnaryOp
parseUnaryOp =
(string "sqrt" >> return Sqrt) <|>
(string "-" >> return Neg) <|>
(string "abs" >> return Abs)
```

Then `BinaryOp`

:

```
parseBinaryOp :: Parser BinaryOp
parseBinaryOp =
(string "+" >> return Add) <|>
(string "-" >> return Sub) <|>
(string "**" >> return Pow) <|>
(string "*" >> return Mul) <|>
(string "/" >> return Div)
```

Here it’s important to ensure that the `**`

symbol appears *before* the single `*`

in the alternatives. Otherwise it would prefer to match just a single `*`

when faced with two. Compare these:

```
ghci> runParser (string "**" <|> string "*") "**9"
[("**","9")]
ghci> runParser (string "*" <|> string "**") "**9"
[("*","*9")]
```

Now for parsing variable names. To begin, we’ll only allow variable names that are fully alphabetic:

Here are examples:

```
ghci> runParser parseVar "myCounter++"
[(Var "myCounter","++")]
ghci> runParser parseVar "my2ndCounter==2"
[(Var "my","2ndCounter==2")]
```

You can see in the second example that `my2ndCounter`

is not permitted as a single variable name.

**TODO:** Revise the definition of `parseVar`

so that variable names must *begin* with an alphabetic character, but then can contain alpha-numeric characters after the first one (this is how identifiers are defined in most programming languages). **Hint:** there’s a function `isAlphaNum`

in `Data.Char`

.

**Solution:**

```
parseVar :: Parser Arith
parseVar =
Var <$> ((:) <$> satisfies isAlpha
<*> many (satisfies isAlphaNum))
```

Your revised version of `parseVar`

should support variable names like these:

```
ghci> runParser parseVar "my2ndCounter==2"
[(Var "my2ndCounter","==2")]
ghci> runParser parseVar "h238a19"
[(Var "h238a19","")]
ghci> runParser parseVar "x"
[(Var "x","")]
ghci> runParser parseVar "x3"
[(Var "x3","")]
ghci> runParser parseVar "3x"
[] -- failed parse; cannot start with digit
```

To begin, let’s ignore decimal points. We’ll just support integers with an optional negative sign.

```
parseNum :: Parser Arith
parseNum =
f <$> optional (char '-') <*> some (satisfies isDigit)
where
f Nothing digits = Num (read digits)
f (Just _) digits = Num (read ('-':digits))
```

We are slightly cheating here, because we’re using the Haskell function `read`

to do the actual conversion from `String`

to `Double`

. It is already a sort of parser for numbers and other types. There are ways to code that conversion ourselves, but it’s just not a technique I want to focus on for now.

So this works for integers, but stops at any decimal point:

```
ghci> runParser parseNum "138*2"
[(Num 138.0,"*2")]
ghci> runParser parseNum "138.2"
[(Num 138.0,".2")]
ghci> runParser parseNum "3.14159"
[(Num 3.0,".14159")]
ghci> runParser parseNum "-18"
[(Num (-18.0),"")]
ghci> runParser parseNum "x"
[] -- failure
```

**TODO:** Revise the definition of `parseNum`

so that numbers can (optionally) contain decimal parts. The `read`

function can still be used to do the conversion, but you need to feed it strings of the right form.

**Solution:**

```
parseNum :: Parser Arith
parseNum =
f <$> optCharThen '-' (some (satisfies isDigit))
<*> (pad <$> optCharThen '.' (many (satisfies isDigit)))
where
f ds1 ds2 = Num (read (ds1 ++ ds2))
pad "." = ".0"
pad s = s
```

```
optCharThen :: Char -> Parser String -> Parser String
optCharThen c p =
f <$> optional (char c) <*> p
where
f Nothing s = s
f (Just c) s = c:s
```

Once you have revised `parseNum`

appropriately, here’s how it should behave:

```
ghci> runParser parseNum "138*2"
[(Num 138.0,"*2")]
ghci> runParser parseNum "138.2"
[(Num 138.2,"")]
ghci> runParser parseNum "3.14159"
[(Num 3.14159,"")]
ghci> runParser parseNum "-18"
[(Num (-18.0),"")]
ghci> runParser parseNum "-90.05x"
[(Num (-90.05),"x")]
```