Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Learning Mode] Discussion: building up to Pattern Matching #1103

Open
kytrinyx opened this issue Aug 31, 2022 · 26 comments
Open

[Learning Mode] Discussion: building up to Pattern Matching #1103

kytrinyx opened this issue Aug 31, 2022 · 26 comments

Comments

@kytrinyx
Copy link
Member

@MatthijsBlom suggested the following concept graph, which is a good starting point for building up to a basic understanding of proper pattern matching in Haskell.

              |
   Simple Pattern Matching      
   /          |        \
Maybe      Either      ... (tuples, lists?)
   \          |        /
    Algebraic Data Types
      ("custom types")
              |
   Pattern Matching Proper
              |

I'd like to start the discussion by looking at some sample code that demonstrates the end point, Pattern Matching Proper. It doesn't have to relate to any exercise or story, rather it's a way to make sure that everyone involved understands where we're trying to get to.

@MatthijsBlom I know you've been thinking a lot about this over the past couple of weeks. Would you be willing to write a few (maybe 10 or so?) lines of code to illustrate Pattern Matching?

@MatthijsBlom
Copy link
Contributor

MatthijsBlom commented Aug 31, 2022

The solution to Valentine's is feature-complete, so to speak. It matches on a custom ADT (Activity), matches on fields (e.g. Korean), uses binding patterns (not sure this is the right terminology) (kilometers), and it uses guards. All this is very normal Haskell.

Here are a few variations of a translation of the F# exemplar solution. (The variation is in syntax alone.)

Show code
{-# LANGUAGE LambdaCase #-}
-- The above enables a well-established language extension

module ValentinesDay (rateActivity) where

data Approval = Yes | No | Maybe

data Cuisine = Korean | Turkish

data Genre = Crime | Horror | Romance | Thriller

data Activity = BoardGame | Chill | Movie Genre | Restaurant Cuisine | Walk Integer

-- How I would write it, but requires LambdaCase extension
rateActivity :: Activity -> Approval
rateActivity = \case
  Restaurant Korean -> Yes
  Restaurant Turkish -> Maybe
  Movie Romance -> Yes
  Walk kilometers
    | kilometers < 3 -> Yes
    | kilometers < 5 -> Maybe
  _ -> No

-- Without LambdaCase
rateActivity' :: Activity -> Approval
rateActivity' activity =
  case activity of
    Restaurant Korean -> Yes
    Restaurant Turkish -> Maybe
    Movie Romance -> Yes
    Walk kilometers
      | kilometers < 3 -> Yes
      | kilometers < 5 -> Maybe
    _ -> No

-- Using function definition syntactic sugar
rateActivity'' :: Activity -> Approval
rateActivity'' (Restaurant Korean) = Yes
rateActivity'' (Restaurant Turkish) = Maybe
rateActivity'' (Movie Romance) = Yes
rateActivity'' (Walk kilometers)
  | kilometers < 3 = Yes
  | kilometers < 5 = Maybe
rateActivity'' _ = No

@kytrinyx
Copy link
Member Author

This is really nice.

My gut feeling is that the middle solution is just right at this point in the concept tree.

I do like the clean approach of the first one, but I think that we should probably wait to introduce language extensions. Similarly I think it's worth deferring the function definition syntactic sugar until later in the concept tree.

@MatthijsBlom
Copy link
Contributor

My gut feeling is that the middle solution is just right at this point in the concept tree.

I agree. The third one is not any more complicated (depending on their introduction to the language, a beginner might even find it simpler), but it is noisier.

I think that we should probably wait to introduce language extensions

While I agree on this, a warning: the language extensions are very diverse, and so are their appropriateness, need, and difficulty level. Intuitions from other languages about them do not necessarily transfer. To paraphrase Alexis King: the question is less «is introducing language extensions a good idea?» and more «is using this language extension a good idea?».

I think it's worth deferring the function definition syntactic sugar until later in the concept tree.

Many Haskell tutorials start out with the sweet version and introduce case expressions only later. (Case in point: LYAH has them at the bottom of the page on pattern matching.)

I think it's safe to introduce them together. Wouldn't know when to do it otherwise either: PM2 is already kinda heavy, and I do not feel like this syntactic sugar deserves its own concept node.

@kytrinyx
Copy link
Member Author

The third one is not any more complicated [...] but it is noisier.

Yes, that's my sense, exactly. The three actually feel really similar to me in terms of difficulty, but I think the middle example strikes a good balance.

To paraphrase Alexis King: the question is less «is introducing language extensions a good idea?» and more «is using this language extension a good idea?».

Yepp, that makes sense.

Cool. So, if @ErikSchierboom also agrees that the middle example is the way to go, I think we'd be good to kick off work on the Valentine's Day exercise itself.

@MatthijsBlom Have you given any thought yet to how you'd introduce Algebraic Data Types?

@MatthijsBlom
Copy link
Contributor

MatthijsBlom commented Aug 31, 2022

I think we'd be good to kick off work on the Valentine's Day exercise itself.

It looks like a straightforward translation of F#'s instructions.md would be good.

@MatthijsBlom Have you given any thought yet to how you'd introduce Algebraic Data Types?

Working on it right now. (On that whole graph section actually.)

@MatthijsBlom
Copy link
Contributor

It occurs to me that Maybe, Either, tuples, … do not actually depend on Simple Pattern Matching.

@ErikSchierboom
Copy link
Member

It occurs to me that Maybe, Either, tuples, … do not actually depend on Simple Pattern Matching.

So your saying you wouldn't use pattern matching when working with them?

@ErikSchierboom
Copy link
Member

Cool. So, if @ErikSchierboom also agrees that the middle example is the way to go, I think we'd be good to kick off work on the Valentine's Day exercise itself.

I agree!

@ErikSchierboom
Copy link
Member

The solution to Valentine's is feature-complete, so to speak. It matches on a custom ADT (Activity), matches on fields (e.g. Korean), uses binding patterns (not sure this is the right terminology) (kilometers), and it uses guards. All this is very normal Haskell.

Which of these concepts will be prerequisites and which of these concepts will be newly introduced concepts?

@kytrinyx
Copy link
Member Author

kytrinyx commented Sep 1, 2022

@MatthijsBlom Do you have some snippets of code as starting points for discussion for some of the earlier concepts in the graph here (First Maybe, Either, ADTs, but also any of the "etc" concepts that you have on the right hand side as well, if you've been thinking about that).

@MatthijsBlom
Copy link
Contributor

So your saying you wouldn't use pattern matching when working with them?

I would regularly, but a beginner need not immediately. The utility functions in their respective modules are sufficient for everything, if in a slightly different way. Seems fine practice in composing functions.

...But that's not really what that was about: Maybe, Either, and (,) will not be matched against until Pattern Matching Proper. Everything else about them does not depend on pattern matching on literals, which is what Simple Pattern Matching is about.

I tried to say that the graph might look like this instead:

   Simple Pattern Matching      
                   |    
Maybe      Either  |   ... (tuples, lists?)
   \          |    |   /
    Algebraic Data Types

Which of these concepts will be prerequisites and which of these concepts will be newly introduced concepts?

Guards are part of Simple Pattern Matching; the rest will be introduced in Pattern Matching Proper.

@MatthijsBlom Do you have some snippets of code as starting points for discussion for some of the earlier concepts in the graph

Not yet, but these are the kinds of signatures I'm thinking of:

  • binarySearch :: Array -> Value -> Maybe Index
  • compile :: Code -> Either CompileError Program

ADTs: the data type declarations in Valentine's Day contain most of this. The only thing that is really missing is polymorphism: data Maybe a = Nothing | Just a (the lowercase a, a type variable).

Tuples are plumbing, so I do not expect appealing code. However, the likes of zip :: [a] -> [b] -> [(a, b)] and lookup :: Eq a => a -> [(a, b)] -> Maybe b use them.

Looks like lists will get their own node early on. I haven't checked this, but I imagine that lots of material from other tracks on lists can be reused. (Expecially Purescript, Elm.)

@ErikSchierboom
Copy link
Member

Do you have some snippets of code as starting points for discussion for some of the earlier concepts in the graph

As a quite rusty Haskell programmer, code samples would be great, yes. I must admit that things don't readily "click" in my mind currently, but I'm sure I'll get there. I think the code samples will be invaluable.

@MatthijsBlom
Copy link
Contributor

This is taking way longer than expected, and @pwadsworth is being productive in my absence, so I tried to write down my thoughts here below.

Pattern Matching & Data Type Declarations

Here is an attempt at a somewhat thorough outline of things related to pattern matching and data types in Haskell. Explanations are omitted, as are obviously 'advanced' concepts like GADTs. Some such concepts however are included, simply because it was easier for me to include than to omit them.

Pattern Matching syntax

case expressions:

f x =
  case x of
    pattern -> expression

Function definition syntactic sugar (compiles to case)

f pattern = expression

Patterns

(Please disregard GitHub's broken highlighting of pattern.)

x = case someExpression of
  -- Wildcard pattern
  _ -> expression
  
  -- Binding pattern
  name -> expression

  -- Regular constructor pattern
  Constructor pat ter ns -> expression

  -- (Named) "as" pattern
  name@pattern -> expression

  -- Special syntax reused as pattern syntax
  5         -> expression  -- integers
  -3.7      -> expression  -- fractional numbers
  "Hello!"  -> expression  -- strings
  [x, y, z] -> expression  -- lists
  (x, y)    -> expression  -- tuples

  -- Bang pattern
  !pattern -> expression

  -- Irrefutable pattern
  ~pattern -> expression

  -- Guards
  pattern | booleanExpression      -> expression  -- boolean guard
          | pattern <- expression  -> expression  -- pattern guard
          | let name = expression  -> expression  -- local binding
          | guard, guard, guard    -> expression  -- sequence of guards

Data type declarations

-- "data declaration"
data TypeConstructor typeParameters =
    DataConstructor  fi El Ds
  | DataConstructor' Fi eld S

-- "newtype declaration"
newtype TypeConstructor typeParameters =
  DataConstructor Field

-- "type declaration"
type TypeConstructor typeParameters = someType

-- deriving clause
data T a b c = C a b c deriving (classes)
newtype N a b c = N a b c deriving (classes)

Of note:

  • Otherwise synonymous with data, newtype declares un-lifted types.
  • type does not provide type-safety. (To the contrary, in practice!)

Records

TBA

Right now I do not feel need to fill in this section. See below for reasoning.

Note: while they look as similar to each other as some other constructs shared between the languages, the records of Haskell and Elm are not to be confused: they are different beasts.

Examples of data types and how to work with them

  • Sum types: Bool, Maybe, Either
  • Product types: tuples ((,), (,,), etc.)
  • Sum+product type: lists ([])

Each of these except Bool are parametric types, i.e. not themselves concrete types.

Maybe, [], (,), etc. are type constructors.

False, Just, [], (:), (,), etc. are data constructors

Note: sadly tuples and lists are not great examples, because their syntax is special on many levels (syntactic sugar + weird names of both type and data constructors).

Each of these types can be worked with, even without (explicit) pattern matching, using the utility functions available from dedicated standard library modules.


What to teach when

For reference the above list condensed

(Unordered)

  • case expressions
  • function definition syntactic sugar
  • Wildcard patterns
  • Binding patterns
  • Regular constructor patterns
  • (Named) "as" patterns 😢
  • Special syntax reused as pattern syntax
  • Bang patterns 🛑
  • Irrefutable patterns 🛑
  • boolean guards
  • pattern guards 🛑
  • local binding guards 🛑
  • guard sequences 🛑
  • data declarations
  • type variables + parametric types
  • type constructors
  • data constructors
  • newtype declarations 😢
  • type declarations / type synonyms 😢
  • deriving 😢
  • All things records 🛑

I propose we do not teach

  • Strictness-related stuff, i.e. bang patterns and irrefutable patterns. Not in the concept nodes introducing pattern matching, anyway. Strictness might deserve its own node (later).
  • Records have historically been not nice to work with. An appreciable number of language extensions have been created to improve ergonomics. It would be a shame to introduce a student to records but not to at least some of the relevant extensions. But this would be too much to include in the nodes introducing data types and pattern matching. Therefore I say: if we do records, let's give them their own node. Records are just syntactic sugar; we do not really lose anything by skipping over them.
  • All types of guards except boolean guards, as they are kinda obscure and we do not need them anyway.

I have marked these with 🛑 in the condensed list above.

In addition, I wouldn't be too sad if we skip

  • "As" patterns as they are neither conceptually very interesting nor very common.
  • newtype declarations as they are conceptually (almost) redundant. Seems cheap to include them though.
  • type declarations / type synonyms as they are conceptually anti-interesting, not that common, and potentially confusing. Reasons to include them anyway: cheap, and the very common String is (very noticeably) a type synonym.
  • deriving clauses because type classes have not been introduced yet. They are very common though, so it would be nice if we could at least say something about them.

I have marked these 😢 in the condensed list above.

Now, assuming the concept graph proposed earlier, …

Which graph? This one.
   Simple Pattern Matching      
                  |
 Maybe   Either   |   ... (tuples, lists?)     ("Example nodes")
    \      |      |    /
    Algebraic Data Types
      ("custom types")
              |
   Pattern Matching Proper

…I was thinking

Concept Node
case expressions Simple PM
Wildcard patterns Simple PM
Special syntax reused as pattern syntax Simple PM
function definition syntactic sugar Simple PM or PM Proper1
Binding patterns Simple PM or PM Proper2
boolean guards Simple PM3 or PM Proper2
type variables + parametric types example nodes + ADTs
data constructors example nodes + ADTs
type constructors example nodes + ADTs
data declarations ADTs
newtype declarations ADTs
type declarations / type synonyms ADTs
deriving ADTs
Regular constructor patterns PM Proper
"As" patterns PM Proper

I realize these ideas conflict with ongoing work. What do you think about the above, @pwadsworth?

If it helps, I can produce some design.mds. I think I finally understand why those exist; it wasn't obvious to me before.

Footnotes

  1. Consider delaying introduction of function definition pattern matching syntactic sugar, because beginners often get stuck in the habit of using the sugar, whereas the first instinct should be to reach for case.

  2. One generally guards on bindings, so we should introduce binding patterns no later than we introduce guards. 2

  3. Beginners often reach for series of (==) comparison guards instead of case. If introduction of guards is postponed, there is no danger of this: in more general settings (i.e. those in PM Proper), (==) guards fail to be applicable.

@ErikSchierboom
Copy link
Member

ErikSchierboom commented Oct 5, 2022

@MatthijsBlom Wow, that is a lot of information! As mentioned, we (the Exercism team) would like to have the next step be some sample code for the various concepts. @pwadsworth would you be interested in writing some sample code for the various concepts?

edit: lets focus on 1 concept at a time, preferrably the one with the least amount of prerequisites

@MatthijsBlom
Copy link
Contributor

I'm guessing you mean code samples for the Example Nodes, as a Simple Pattern Matching has already been merged.

lets focus on 1 concept at a time, preferrably the one with the least amount of prerequisites

That would be (,) (2-tuples). Not because it is the simplest, but because it is the most familiar (to those coming from other languages, anyway). Next up would be the similarly simple Maybe, followed by its generalization Either.

Anyway, I failed at this. What follows is the same treatment for all three of these types. If this is bothersome, I can edit some out.


I had previously published a few solutions that 'use' Maybe, Either, and tuples.

(PureScript and Haskell are closely related dialects; the few differences do not matter here.)

Some observations on the use of Maybe / Either / (,) values in these solutions:

  • I only ever (explicitly) construct them.
  • Sometimes not even that: neither Pascals Triangle nor Perfect Numbers contains explicit construction.
  • I never (explicitly) inspect them.
  • Most are intermediate values, fed to plumbing functions like <$>, <*>, and >>=.
  • The use of (,) is entirely intermediate, for immediate consumption by fromList.

All this is normal in Haskell (and, I suspect, in PureScript as well).

(Are there other possible 'uses', besides construction and inspection?)

We tend to avoid creating functions that take these types as arguments. Most cases by far are covered by plumbing functions like <$> and (derivatives of) maybe :: b -> (a -> b) -> Maybe a -> b, either :: (a -> c) -> (b -> c) -> Either a b -> c, and uncurry :: (a -> b -> c) -> (a, b) -> c.

All three of Maybe, Either, and (,) are plumbing-like (in practice), and (,) especially so.


I previously mentioned that a typical type signature with Maybe would be binarySearch :: Ord a => Array a -> a -> Maybe Index. Indeed, this is my solution:

Binary Search in Haskell
find :: Ord a => Array Int a -> a -> Maybe Int
find arr x = uncurry go (bounds arr)
  where
    go lo hi
      | lo > hi = Nothing  -- 👈
      | otherwise =
        let m = (lo + hi) `div` 2
         in case compare (arr ! m) x of
              LT -> go (m + 1) hi
              EQ -> Just m  -- 👈
              GT -> go lo (m - 1)

-- Two constructions, no intermediate values,
-- and again no inspection.

As expected, the actual implementation hardly adds to the Maybe story: the signature already tells it all.


While the above examples may show typical use of Maybe / Either / (,), they may well be not very instructive. So here are rough adaptations of Elm's Role Playing Game, Go, and Tisbury Treasure Hunt. (I have not tried to restrict myself to already-introduced concepts, or any specific style.)

Role Playing Game (Maybe)
{-# LANGUAGE MonadComprehensions #-}

module RolePlayingGame (Player, castSpell, introduce, revive) where

import Data.Functor
import Data.Maybe (fromMaybe)

data Player = Player
  { name :: Maybe String,
    level :: Int,
    health :: Int,
    mana :: Maybe Int -- This distinction between Nothing and Just 0 feels unnatural
  }

introduce :: Player -> String
introduce = fromMaybe "Mighty Magician" . name

revive :: Player -> Maybe Player
revive player =
  player
    { health = 100,
      mana = 100 <$ guard (level player >= 10)
    }
    <$ guard (health player == 0)
-- or
revive player = do
  guard (health player == 0)
  pure $
    player
      { health = 100,
        mana = do
          guard (level player >= 10)
          pure 100
      }
-- or even (using somewhat rare extension MonadComprehensions)
revive player =
  [player {health = 100, mana = [100 | level player >= 10]} | health player == 0]

castSpell :: Int -> Player -> (Player, Int)
castSpell cost player =
  case mana player of
    Nothing -> (player {health = max 0 (health player - cost)}, 0)
    Just m
      | cost <= m -> (player {mana = Just (m - cost)}, 2 * cost)
      | otherwise -> (player, 0)

guard :: Bool -> Maybe ()
guard condition = if condition then Just () else Nothing
-- or
guard = guard -- standard, more general function

mapConst :: a -> Maybe b -> Maybe a
mapConst _ Nothing = Nothing
mapConst x (Just _) = Just x
-- or
mapConst x = fmap (const x)
-- or
mapConst = (<$) -- standard, more general function
Go (Either)
{-# LANGUAGE LambdaCase #-}

module Go where

import Data.Function ((&))
import Data.Functor ((<&>))

import GoSupport

-- Note: this is an unnatural setting, so the solution may be confusing.
applyRules :: Game -> Rule -> NonValidatingRule -> Rule -> Rule -> Game
applyRules game oneStonePerPointRule captureRule libertyRule koRule =
  -- approximately literal translation from Elm solution
  game
    & oneStonePerPointRule
    <&> captureRule
    >>= libertyRule
    >>= koRule
    & ( \case
          Left e -> game {err = e}
          Right newGame -> changePlayer newGame
      )
  -- 'Simplified'
  oneStonePerPointRule game
    >>= libertyRule . captureRule
    >>= koRule
    & either (\e -> game {err = e}) changePlayer
  -- Using `do` and `where`
  case gameState of
    Left errorMessage -> game {err = errorMessage}
    Right newGame -> changePlayer newGame
  where
    gameState = do
      s <- oneStonePerPointRule game
      let s' = captureRule s
      s'' <- libertyRule s'
      koRule s''
Tisbury Treasure Hunt ((,))
module TisburyTreasureHunt where

import Data.List (filter, length)
import Data.Tuple (swap)

type PlaceLocation = (Char, Int)  -- Please don't do this IRL.

type TreasureLocation = (Int, Char)  -- Create proper data types instead,

type Place = (String, PlaceLocation)  -- even if they would be

type Treasure = (String, TreasureLocation)  -- isomorphic to tuple types.

placeLocationToTreasureLocation :: PlaceLocation -> TreasureLocation
placeLocationToTreasureLocation (char, int) = (int, char)
-- or simply
placeLocationToTreasureLocation = swap

treasureLocationMatchesPlaceLocation :: PlaceLocation -> TreasureLocation -> Bool
treasureLocationMatchesPlaceLocation placeLocation treasureLocation =
  placeLocationToTreasureLocation placeLocation == treasureLocation

countPlaceTreasures :: Place -> [Treasure] -> Int
countPlaceTreasures (_, placeLocation) =
  let thisLocation = placeLocationToTreasureLocation placeLocation
   in length . filter ((thisLocation ==) . snd)

specialCaseSwapPossible :: Treasure -> Place -> Treasure -> Bool
specialCaseSwapPossible (foundTreasure, _) (place, _) (desiredTreasure, _) =
  case (foundTreasure, place, desiredTreasure) of
    ("Brass Spyglass", "Abandoned Lighthouse", _) -> True
    ("Amethyst Octopus", "Stormy Breakwater", _) ->
      desiredTreasure `elem` ["Crystal Crab", "Glass Starfish"]
    ("Vintage Pirate Hat", "Harbor Managers Office", _) ->
      desiredTreasure `elem` ["Model Ship in Large Bottle", "Antique Glass Fishnet Float"]
    _ -> False

I'm having some trouble coming up with examples of tuple use that are both real and interesting / nontrivial / notable / something. Maybe their use in State?

An attempt at beginner-friendly instance declarations for State
-- A value of type `State s a` is (essentially) a function that
-- takes a state and returns a new state paired with some result.
newtype State s a = State {runState :: s -> (s, a)}
-- I'd like to write
--   type State s a = s -> (s, a)
-- but we'd run into trouble below.

instance Functor (State s) where
  -- If we have a stateful function that produces a's, and
  -- we can transform a's into b's, then we can create
  -- a stateful function that produces b's.
  fmap :: (a -> b) -> State s a -> State s b
  fmap f st =
    State
      ( \s ->
          let (s', x) = runState st s
           in (s', f x)
      )

instance Applicative (State s) where
  -- A stateful function that just produces the given value
  pure :: a -> State s a
  pure x = State (\s -> (s, x))

  -- Combine a stateful function that produces a function
  -- with a stateful function that produces input
  -- into a single stateful function that produces output.
  (<*>) :: State s (a -> b) -> State s a -> State s b
  sf <*> sx =
    State
      ( \s ->
          let (s', f) = runState sf s
              (s'', x) = runState sx s'
           in (s'', f x)
      )

instance Monad (State s) where
  -- Given a stateful function that produces input
  -- and a function that uses that input to
  --    generate a stateful function that produces output,
  -- produce a stateful function that produces output.
  (>>=) :: State s a -> (a -> State s b) -> State s b
  st >>= f =
    State
      ( \s ->
          let (s', x) = runState st s
              st' = f x
           in runState st' s'
      )

Tuples are used here (as they are always) to group multiple values into one so that they can be returned together. Tuples are the tool for grouping multiple values into one without attaching any extra meaning to this grouping.


@ErikSchierboom are there specific kinds of sample code still missing?

@ErikSchierboom
Copy link
Member

@MatthijsBlom Sorry that I've not yet replied. I'm incredibly swamped right now :(

@pwadsworth
Copy link
Contributor

Sorry for the delayed reply. I have been super busy and haven't had any time to dedicate here. I expect to have more time available in late November and December. If we settle on the way forward before then I'll be able to write several exercises and related docs.

@MatthijsBlom: IRT conflict with what we have already done, I do not have any issue adjusting to improve the learning path, but what you proposed has many concepts in each node. I thought the idea was to keep the number of concepts to one or two per node to make each a small incremental step for students. I can see benefits in clarity of exposition, and nodes being self-contained though, so I have no problems with expanding the nodes already written if the Exercism team wants to take that route. I do have a couple minor shown below in bold on where some concepts are better introduced.

@ErikSchierboom: Assuming the above, here is what I propose to do for each.

Concept Node Comment
case expressions ADT (1) Used in valentines-day (VD). I can add a more detailed explanation in the introduction
Wildcard patterns ADT Same as (1)
Special syntax reused as pattern syntax Simple PM @MatthijsBlom Not sure what you mean by this
function definition syntactic sugar Simple PM or PM Proper1 PM Proper
Binding patterns Simple PM or PM Proper2 PM Proper
boolean guards Simple PM3 or PM Proper2 PM Proper
type variables + parametric types example nodes + ADTs Not sure about this one. It could be its own node or added as part of Types or Type Classes (proposed future nodes)
data constructors example nodes + ADTs Covered in ADTs (VD).
type constructors example nodes + ADTs Can add to ADTs and make a separate exercise. Code Example 1 to follow.
data declarations ADTs Covered in VD.
newtype declarations ADTs Can add to ADTs and make a separate exercise. Code Example 2 to follow.
type declarations / type synonyms new Types node Code Example 3 to follow.
deriving ADTs Recommend leaving for a later node on type classes
Regular constructor patterns PM Proper PM Proper
"As" patterns PM Proper PM Proper

@ErikSchierboom
Copy link
Member

I thought the idea was to keep the number of concepts to one or two per node to make each a small incremental step for students.

This is spot on. We don't want many concepts explained in a single exercise. If we build an exercise and it turns out that there are many concepts it introduces, we should then introduce more exercises (which will serve as prerequisites) to introduce some of those concepts.

So what would be a logical next step? We have basics, numbers, booleans, pattern-matching-literals and algebraic-data-types as concepts.

@MatthijsBlom
Copy link
Contributor

I thought the idea was to keep the number of concepts to one or two per node

It cannot be that simple, as concepts are often neither atomic nor uniformly granular. It does not make as much sense as I would like to speak of numbers of concepts; one should instead think of concept weights.

Example illustrating this point; feel free to skip if you already believe me.

Pattern matching has a conceptual core: to check, one by one, a number of 'patterns' for 'matching' a specific value, and then acting on which 'pattern' turns out to 'match'.

Aside from this core, pattern matching comes in a number of forms (see my list of kinds of patterns).

To put pattern matching into practice, one must use (at least) one of these forms! Introducing the conceptual core by itself is impossible.

Similarly, introducing any one form without the conceptual core is impossible.

On the other hand, introducing all types of patterns at once would be overwhelming and confusing.

Guards are (optional) pattern building blocks, but they do not exist by themselves so they cannot be introduced before patterns. A significant complication, it would be unwise to (wholely) introduce them alongside the conceptual core.

In conclusion, the Pattern Matching concept cannot be introduced as a whole (and it isn't, presently), and neither can it be introduced entirely piecewise. A choice must be made how to lump together the many partial concepts.

By my estimation (based on observed student behavior), the core pattern matching concept is heavy, pattern forms are individually light but collectively heavy, and (boolean) guards are moderately heavy but undesirably attractive.

Of course, thinking of weights has problems, chief among them that it is subjective. Still, I do not see a better way of thinking about it.

So what would be a logical next step? We have […], pattern-matching-literals and algebraic-data-types as concepts.

Neither of these cover single concepts (even though their names suggest they do):

  • pattern-matching-literals treats
    • function definition syntactic sugar
    • special syntax reused as pattern syntax
    • binding patterns
    • boolean guards
  • algebraic-data-types treats
    • data declarations
    • case expressions
    • Constructor patterns (+ binding patterns)

As illustrated by my table, if it were up to me the syntactic sugar would be postponed, the binding patterns would too (as absent both sugar and case they stop making sense), guards would (maybe) be postponed, and data declarations would be separated from pattern matching (as, contra Peter, I think they are separate (and both heavy) concepts).

Which is to say: I don't think we have these yet. (Unless it is decided that we actually do and should move on.)

So what would be a logical next step?

A bit off-topic, but: lists (excluding pattern matching on them) could be introduced independently, and _ where { } and let { } in _ could use early introduction. Maybe more; can't think of anything right now.

Regarding the pattern matching concept-cluster: assuming the current repo state, I don't know.


Going through @pwadsworth's table:

Concept Node Comment
Special syntax reused as pattern syntax Simple PM @MatthijsBlom Not sure what you mean by this

See the large code block listing all the kinds of patterns above (page-search -- Special syntax reused as pattern syntax). Presently this is treated by the pattern-matching-literals (i.e. "Simple PM") node, which I agree with.

Concept Node Comment
function definition syntactic sugar Simple PM or PM Proper1 PM Proper
Binding patterns Simple PM or PM Proper2 PM Proper
boolean guards Simple PM3 or PM Proper2 PM Proper

Are you sure? All of these are presently treated by pattern-matching-literals, which you made.

Concept Node Comment
case expressions ADT (1) Used in valentines-day (VD). I can add a more detailed explanation in the introduction
Wildcard patterns ADT Same as (1)

Do you mean for "ADTs" to come after "PM Proper"? The exemplar Valentine's Day solution uses guards and binding patterns. (I'm assuming you mean to keep Valentine's Day's association with algebraic-data-types.)

Concept Node Comment
type variables + parametric types example nodes + ADTs Not sure about this one. It could be its own node or added as part of Types or Type Classes (proposed future nodes)

This seemed to me like a concept hard to explain without lots of examples. But since we were going to introduce Maybe, Either, (,), etc. anyway, I thought why not use them to soft-introduce this concept? That way, later on we can tell the student «see, you've seen this before!» before generalizing. For an example of how I'd tackle the first half of this strategy, see #1106.

Concept Node Comment
type declarations / type synonyms new Types node Code Example 3 to follow.

Are you sure? I mean declarations of the form type Name = Type. Seems excessively light for a node.

Concept Node Comment
deriving ADTs Recommend leaving for a later node on type classes

I agree, but we cannot avoid deriving completely: it is too common, and necessarily present in exercise stubs. We should at least note something like «we'll explain fully what they do when we cover type classes, but for now understand that deriving clauses like this ensure we can use the == operator and that we can show our values on the screen».

Code Example 1 to follow.
Code Example 2 to follow.
Code Example 3 to follow.

@pwadsworth You forgot to include these.

what you proposed has many concepts in each node.

Guilty as charged. However, I have tried to reasonably distribute weight among the nodes:

  • case expressions (or function definition syntactic sugar) are the only heavy thingy in "Simple PM", the rest all being light.
  • Type variables and type constructors naturally come together and are collectively heavy, but are distributed over multiple nodes with lots of repetition.
  • data declarations are heavy; the rest in "ADTs" is light or has been heavily foreshadowed in the "Example Nodes".
  • Regular constructor patterns are light or moderately heavy, @ patterns are light; there is room left for some or all of the light stuff marked with "or" – which is fortuitous as I would like to delay two of them.

I thought I'd mentioned it somewhere, but I cannot find it anymore: I have thought about moving guards into their own node. It might keep students from developing bad habits a bit, and make the rest of the pattern matching nodes lighter. It would also allow to go more in-depth on guards.

@kytrinyx
Copy link
Member Author

kytrinyx commented Nov 2, 2022

@MatthijsBlom Thank you for this in-depth exploration into your thinking on the subject.

Let's start with the exercise for Pattern Matching Proper, making the assumption that Algebraic Data Types, Maybe, and Either have all been explained before.

Then let's work backwards from there. If we find that it becomes necessary, we can always create additional prerequisite exercises later.

If I recall correctly, we settled on using Valentines Day for the story for Pattern Matching Proper.

@MatthijsBlom @pwadsworth Does this sound like it will work?

@MatthijsBlom
Copy link
Contributor

Does this sound like [using Valentine's Day for the story for Pattern Matching Proper] will work?

To me it does. As I noted before, Valentine's Day is 'feature complete': it has (the most common) patterns, boolean guards, and a data declaration to interpret.

@cdimitroulas
Copy link
Contributor

hey folks, sorry to jump in here but I'm interested in helping out with getting Haskell learning mode + concept exercises built out.
Are there any clear next bits of work that I could tackle or other preparation work I could help with?

It seems like we still need some exercises for Maybe, Either and tuples. Do we think that any of these options that @MatthijsBlom mentioned would be good candidates?

I had previously published a few solutions that 'use' Maybe, Either, and tuples.

@MatthijsBlom
Copy link
Contributor

MatthijsBlom commented Feb 8, 2024

It seems like we still need some exercises for Maybe, Either and tuples.

Yes, we very much do.

I have been working on the concept introductions with some success, but have had trouble finding compelling & suitable exercises to go with them. #1106 contains a draft for the introduction of Maybe. An exercise to go with it would be most welcome. My intent was to focus on using the constructors to construct Maybes, and using Prelude & Data.Maybe functions to use Maybes. Pattern matching hasn't been introduced yet, so we cannot use it.

I have more draft work (Either+tuples+lists introductions + list exercise at least) locally, but it is not in a pushable state and I will not have the opportunity to clean things up until a few weeks from now.

Do we think that any of these options that @MatthijsBlom mentioned would be good candidates?

No. It is customary to use dedicated Learning Exercises for the syllabi, and these are all Practice Exercises. Learning Exercises are typically extremely simple and hyperfocused on the concept in question. We could adopt existing stories, or craft our own.

@cdimitroulas
Copy link
Contributor

Got it, thanks for explaining. I will try to take a look at the existing stories to see if there are any we could use or take inspiration from for crafting our own if we have to.

@cdimitroulas
Copy link
Contributor

cdimitroulas commented Feb 9, 2024

OK after looking through the list I couldn't see anything obvious that would be a good candidate, so perhaps it's better to come up with something ourselves.

For constructing Maybe values, there are a couple of simple and canonical ideas:

  • Implement a safeHead/safeLast function which returns a Maybe if the list is empty (but students can use if .. then .. else .. instead of pattern matching to solve it). Though safeHead exists as listToMaybe in Data.Maybe so perhaps we can nudge students to use that in a different context to get them used to using the Data.Maybe module.
  • Implement a safeDivide function which returns a Maybe instead of dividing by zero

For using stuff from Data.Maybe I don't have anything obvious in mind but I think I could come up with some ideas if I sat down and tried to hash it out:

  • It would be great to have students use catMaybes in a solution
  • Having them use fromMaybe and simple helpers like isJust/isNothing would also be good

@MatthijsBlom
Copy link
Contributor

I like the safeDivide idea.

Lists haven't been introduced yet, so I am a bit hesitant to require using mapMaybe/catMaybes. However, I like these functions, and lists are introduced in a parallel concept node. I don't think using them anyway would be a significant problem.

I do not think it is healthy to encourage using isJust/isNothing. I fear they might stick as bad habits. Also, we should warn explicitly against fromJust; I will probably do this already in the concept introduction.

On my wish list are maybe, fromMaybe, and perhaps mapMaybe/catMaybes. Of the latter two, probably catMaybes is the one to go with, as using higher order functions might be needlessly difficult (for some), and also when the student later on uses catMaybes . map f HLint will suggest using mapMaybe.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants