exercism · ErikSchierboom · Apr 24, 2024 · Apr 17, 2024 · Apr 18, 2024 · Apr 18, 2024
diff --git a/concepts/chars/.meta/config.json b/concepts/chars/.meta/config.json
@@ -0,0 +1,4 @@
+{
+  "blurb": "A char is a type that represents a UTF-16 code unit. Strings are sequences of chars.",
+  "authors": ["colinleach"]
+}
diff --git a/concepts/chars/about.md b/concepts/chars/about.md
@@ -0,0 +1,163 @@
+# About
+
+## Representation, Characters and Integers
+
+Like other simple types (`int`s, `bool`s, etc.), `char` has a companion or alias type, in this case `System.Char`.
+
+This is in fact a `struct` with a 16 bit field, and is immutable by default.
+
+`char` has some instance methods such as `Equals`, `ToString` and [`CompareTo`][compare-to].
+
+`char` has the same width as a [`ushort`][uint16], but the two are not used interchangeably as they are in some languages: a `ushort` has to be explicitly converted to a `char`.
+
+Arithmetic on `char`s is normally done by converting them to `int` first and converting the result back with `char`, because F# does not implicitly promote a `char` to an integer as C# does.
+
+A `byte` (8 bits) is not equivalent to a `char` (16 bits).
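+
+As a minimal sketch of explicit conversions between `char` and the integer types (the names `code`, `next`, `fromUShort` and `offset` are illustrative only):
+
+```fsharp
+// char -> int gives the UTF-16 code unit as a number
+let code = int 'A'
+// => val code: int = 65
+
+// do the arithmetic on ints, then convert back to char
+let next = char (code + 1)
+// => val next: char = 'B'
+
+// a ushort needs an explicit conversion too
+let fromUShort = char 66us
+// => val fromUShort: char = 'B'
+
+// distance between two chars, computed on their int values
+let offset = int 'D' - int 'A'
+// => val offset: int = 3
+```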
+
+## Usage
+
+`char`s are generally easy to use. 
+
+They can be defined as literals with single quotes:
+
+```fsharp
+let ch = 'A'
+// => val ch: char = 'A'
+```
+
+An individual `char` can be retrieved from a string with (zero-based) indexing:
+
+```fsharp
+let str = "Exercism" 
+// => val str: string = "Exercism"
+
+str[4]
+// => val it: char = 'c'
+```
+
+Iterating over a string returns a `char` at each step:
+
+```fsharp
+[| for c in "F#" -> c, int c |]
+// => val it: (char * int) array = [|('F', 70); ('#', 35)|]
+```
+
+As shown above, a `char` can be cast to its `int` value.
+This also works (*at least some of the time*) for other scripts:
+
+```fsharp
+[| for c in "東京" -> c, int c |] // Tokyo, if Wikipedia is to be believed
+// => val it: (char * int) array = [|('東', 26481); ('京', 20140)|]
+```
+
+The underlying unsigned 16-bit value (`UInt16`) is used when comparing characters:
+
+```fsharp
+'A' < 'D'
+// => val it: bool = true
+```
+
+Also, an `int` can be cast to `char`:
+
+```fsharp
+char 77
+// => val it: char = 'M'
+```
+
+The `System.Char` library contains the full set of [methods][Char-methods] expected for a .NET language, such as upper/lower conversions (but see the caveats in the next section):
+
+```fsharp
+'a' |> System.Char.ToUpper
+// => val it: char = 'A'
+
+'Q' |> System.Char.ToLower
+// => val it: char = 'q'
+```
+
+F# treats a string as a sequence of `char`s, so functions from the `Seq` module can be used to extract them:
+
+```fsharp
+"Exercism" |> Seq.toList
+// => val it: char list = ['E'; 'x'; 'e'; 'r'; 'c'; 'i'; 's'; 'm']
+
+"Zürich" |> Seq.toArray
+// => val it: char array = [|'Z'; 'ü'; 'r'; 'i'; 'c'; 'h'|]
+```
+
+There are various ways to convert a character list (or array) to a string, including these:
+
+```fsharp
+let s = ['E'; 'x'; 'e'; 'r'; 'c'; 'i'; 's'; 'm']
+// => val s: char list = ['E'; 'x'; 'e'; 'r'; 'c'; 'i'; 's'; 'm']
+
+// with a .NET method
+System.String.Concat s
+// => val it: string = "Exercism"
+
+// with String.concat
+String.concat "" <| List.map string s
+// => val it: string = "Exercism"
+
+// with a string constructor
+new string [|for c in s -> c|]
+// => val it: string = "Exercism"
+
+// with StringBuilder
+open System.Text
+string (List.fold (fun (sb:StringBuilder) (c:char) -> sb.Append(c)) 
+                      (new StringBuilder())
+                       s)
+// => val it: string = "Exercism"
+```
+
+General information on `char`s can be found here:
+
+- [Chars documentation][chars-docs]: reference documentation for `char`.
+
+However, `char`s have a number of rough edges, as detailed below. These mostly come from the tension between the full Unicode standard on the one side and historic representations of text, performance and memory usage on the other.
+
+## Unicode Issues
+
+When dealing with strings, prefer [`System.String`][System-string] library methods where they are available, rather than breaking the string down into characters.
+
+Some textual "characters" consist of more than one `char` because the Unicode standard has more than 65,536 code points. For instance, the emojis that show up in some of the tests take 2 `char`s, as they are encoded as [surrogate][surrogates] pairs.
+
+Additionally, there are combining character sequences: in some cases an accented character consists of one `char` for the plain character and another `char` for the accent.
+
+If you have to deal with individual characters, try to use library methods such as [`System.Char.IsControl`][is-control] and [`System.Char.IsDigit`][is-digit] rather than making naive comparisons such as checking that a character is between `'0'` and `'9'`.
+
+For instance, '٢' is the Arabic-Indic digit 2. `IsDigit` returns true for it, so when validating input you need to be clear about what range of characters is acceptable.
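+
+The difference can be sketched as follows, where `isAsciiDigit` is a hypothetical helper written only for this illustration:
+
+```fsharp
+open System
+
+// naive range check: only accepts '0' to '9'
+let isAsciiDigit (c: char) = c >= '0' && c <= '9'
+
+Char.IsDigit '7'   // => true
+isAsciiDigit '7'   // => true
+
+Char.IsDigit '٢'   // => true
+isAsciiDigit '٢'   // => false
+```
+
+Which of the two answers is right depends on the inputs you intend to accept.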
+
+Even the `System.Char` library methods may not behave as you would expect when you are dealing with more obscure languages.
+
+One safe way to break a string into display "characters" is to use [`StringInfo`][string-info] and methods such as [`GetNextTextElement`][get-next-text-element].
+This might be necessary if you are dealing with globalization/localization.
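+
+A minimal sketch of this approach, assuming a hypothetical helper named `textElements`. It uses `StringInfo.GetTextElementEnumerator` rather than `GetNextTextElement`, which is one of several ways to walk the text elements:
+
+```fsharp
+open System.Globalization
+
+// Split a string into text elements (display "characters")
+let textElements (s: string) =
+    let e = StringInfo.GetTextElementEnumerator s
+    [ while e.MoveNext() do yield e.GetTextElement() ]
+
+"a😀".Length
+// => val it: int = 3
+
+textElements "a😀"
+// => val it: string list = ["a"; "😀"]
+
+// 'e' followed by a combining acute accent: 2 chars, 1 text element
+textElements "e\u0301" |> List.length
+// => val it: int = 1
+```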
+
+[Runes][runes] are another option when the scalar values of Unicode characters are important (say you are rolling your own encoding system). However, if you know that the range of characters you deal with does not include surrogates or combining character sequences (e.g. plain ASCII) and your input is well validated, then you can avoid this.
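+
+As a small illustration of what a rune adds over a `char` (a sketch that assumes .NET Core 3.0 or later, where `System.Text.Rune` is available; the name `rune` is illustrative):
+
+```fsharp
+open System.Text
+
+// Read the Unicode scalar value that starts at index 0
+let rune = Rune.GetRuneAt("😀", 0)
+
+rune.Value
+// => val it: int = 128512
+
+// number of chars (UTF-16 code units) needed to encode it
+rune.Utf16SequenceLength
+// => val it: int = 2
+```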
+
+Again, the best position to be in is where you can use `String`'s library methods.
+
+If you do find yourself in the unenviable position of dealing with the minutiae of Unicode, then [this][char-encoding-net] is a good starting point.
+
+## Globalization
+
+If you are working in an environment with multiple cultures, or where the culture matters in some parts of the code but not others, be aware of the overloads of [`ToUpper`][to-upper] and [`ToLower`][to-lower] that take a culture. [`ToUpperInvariant`][to-upper-invariant] and [`ToLowerInvariant`][to-lower-invariant] give a consistent result irrespective of the current [culture][culture-info].
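+
+For example (a sketch only: it assumes the `tr-TR` culture is available on your platform, and the culture-specific result depends on the installed culture data, so no output is shown for it):
+
+```fsharp
+open System
+open System.Globalization
+
+// culture-independent conversion
+Char.ToUpperInvariant 'i'
+// => val it: char = 'I'
+
+// culture-aware overload; Turkish has its own casing rules for 'i'
+let turkish = CultureInfo.GetCultureInfo "tr-TR"
+Char.ToUpper('i', turkish)
+```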
+
+[chars-docs]: https://learn.microsoft.com/en-us/dotnet/api/system.char?view=net-8.0
+[culture-info]: https://docs.microsoft.com/en-us/dotnet/api/system.globalization.cultureinfo
+[uint16]: https://docs.microsoft.com/en-us/dotnet/api/system.uint16
+[string-info]: https://docs.microsoft.com/en-us/dotnet/api/system.globalization.stringinfo
+[runes]: https://docs.microsoft.com/en-us/dotnet/api/system.text.rune
+[char-encoding-net]: https://docs.microsoft.com/en-us/dotnet/standard/base-types/character-encoding-introduction
+[surrogates]: https://docs.microsoft.com/en-us/dotnet/api/system.char.issurrogate
+[is-control]: https://docs.microsoft.com/en-us/dotnet/api/system.char.iscontrol
+[to-upper]: https://docs.microsoft.com/en-us/dotnet/api/system.char.toupper
+[to-lower]: https://docs.microsoft.com/en-us/dotnet/api/system.char.tolower
+[to-upper-invariant]: https://docs.microsoft.com/en-us/dotnet/api/system.char.toupperinvariant
+[to-lower-invariant]: https://docs.microsoft.com/en-us/dotnet/api/system.char.tolowerinvariant
+[is-digit]: https://docs.microsoft.com/en-us/dotnet/api/system.char.isdigit
+[get-next-text-element]: https://docs.microsoft.com/en-us/dotnet/api/system.globalization.stringinfo.getnexttextelement
+[compare-to]: https://docs.microsoft.com/en-us/dotnet/api/system.char.compareto
+[Char-methods]: https://learn.microsoft.com/en-us/dotnet/api/system.char?view=net-8.0#methods
+[System-string]: https://learn.microsoft.com/en-us/dotnet/api/system.string?view=net-8.0
diff --git a/concepts/chars/introduction.md b/concepts/chars/introduction.md
@@ -0,0 +1,54 @@
+# Introduction
+
+The F# `char` type is a 16 bit value to represent the smallest addressable components of text, immutable by default.
+
+`char`s can be defined as literals with single quotes:
+
+```fsharp
+let ch = 'A'
+// => val ch: char = 'A'
+```
+
+Strings are a sequence of chars.
+
+An individual `char` can be retrieved from a string with (zero-based) indexing:
+
+```fsharp
+"Exercism"[4] //  =>  'c'
+```
+
+Iterating over a string returns a `char` at each step.
+
+The next example uses a higher order function and an anonymous function, for convenience.
+These will be covered properly later in the syllabus, but for now they are a concise way to write a loop over the characters in a string.
+
+```fsharp
+Seq.map (fun c -> c, int c) "F#"  //  =>  seq [('F', 70); ('#', 35)]
+```
+
+As shown above, a `char` can be cast to its `int` value.
+This also works (*at least some of the time*) for other scripts:
+
+```fsharp
+Seq.map (fun c -> c, int c) "東京"  //  =>  seq [('東', 26481); ('京', 20140)]
+```
+
+The underlying unsigned 16-bit value (`UInt16`) is used when comparing characters:
+
+```fsharp
+'A' < 'D'  // =>  true
+```
+
+Also, an `int` can be cast to `char`:
+
+```fsharp
+char 77  // => 'M'
+```
+
+The `System.Char` library contains the full set of methods expected for a .NET language, such as upper/lower conversions:
+
+```fsharp
+'a' |> System.Char.ToUpper  // =>  'A'
+
+'Q' |> System.Char.ToLower  // =>  'q'
+```
diff --git a/concepts/chars/links.json b/concepts/chars/links.json
@@ -0,0 +1,6 @@
+[
+  {
+    "url": "https://learn.microsoft.com/en-us/dotnet/api/system.char?view=net-8.0",
+    "description": "Documentation for the System.Char library."
+  }
+]
diff --git a/config.json b/config.json
@@ -189,6 +189,18 @@
           "pattern-matching",
           "strings"
         ]
+      },
+      {
+        "slug": "squeaky-clean",
+        "name": "Squeaky Clean",
+        "uuid": "8196f0ad-cfd9-409e-827f-57b16b296a4b",
+        "concepts": [
+          "chars"
+        ],
+        "prerequisites": [
+          "strings",
+          "if-then-else-expressions"
+        ]
       }
     ],
     "practice": [
@@ -2195,6 +2207,11 @@
       "uuid": "5bd49bb7-3487-4925-9c0e-866d56a880ee",
       "slug": "tuples",
       "name": "Tuples"
+    },
+    {
+      "uuid": "7a9e2985-56be-4170-b41d-7da98361b7c9",
+      "slug": "chars",
+      "name": "Chars"
     }
   ],
   "key_features": [

diff --git a/exercises/concept/squeaky-clean/.docs/hints.md b/exercises/concept/squeaky-clean/.docs/hints.md
@@ -0,0 +1,24 @@
+# Hints
+
+## 1. Replace any hyphens encountered with underscores
+
+- [Reference documentation][char-docs] for `char`s is here.
+- You can retrieve `char`s from a string in the same way as elements from an array, though in this exercise it may be better to use a higher order function such as `String.collect`.
+- `char` literals are enclosed in single quotes.
+
+## 2. Remove all whitespace
+
+- See [this method][iswhitespace] for detecting whitespace.
+
+## 3. Convert camelCase to kebab-case
+
+- See [this method][tolower] to convert a character to lower case.
+
+## 4. Omit characters that are digits
+
+- See [this method][isnumber] for detecting digits.
+
+## 5. Replace Greek lower case letters with question marks
+
+- `char`s support the default equality and comparison operators.
+
+[char-docs]: https://learn.microsoft.com/en-us/dotnet/api/system.char
+[iswhitespace]: https://docs.microsoft.com/en-us/dotnet/api/system.char.iswhitespace
+[isnumber]: https://docs.microsoft.com/en-us/dotnet/api/system.char.isnumber
+[tolower]: https://docs.microsoft.com/en-us/dotnet/api/system.char.tolower
diff --git a/exercises/concept/squeaky-clean/.docs/instructions.md b/exercises/concept/squeaky-clean/.docs/instructions.md
@@ -0,0 +1,74 @@
+# Instructions
+
+In this exercise you will implement a partial set of utility routines to help a developer clean up identifier names.
+
+In the 6 tasks you will gradually build up the functions `transform` to convert single characters and `clean` to convert strings.
+
+A valid identifier comprises zero or more letters, underscores, hyphens, question marks and emojis.
+
+If an empty string is passed to the `clean` function, an empty string should be returned.
+
+## 1. Replace any hyphens encountered with underscores
+
+Implement the `transform` function to replace any hyphens with underscores.
+
+```fsharp
+transform '-'  // => "_"
+```
+
+## 2. Remove all whitespace
+
+Remove all whitespace characters.
+This will include leading and trailing whitespace.
+
+```fsharp
+transform ' '  // => ""
+```
+
+## 3. Convert camelCase to kebab-case
+
+Modify the `transform` function to convert camelCase to kebab-case: replace each upper case letter with a hyphen followed by its lower case equivalent.
+
+```fsharp
+transform 'D'  // => "-d"
+```
+
+## 4. Omit characters that are digits
+
+Modify the `transform` function to omit any characters that are numeric.
+
+```fsharp
+transform '7'  // => ""
+```
+
+## 5. Replace Greek lower case letters with question marks
+
+Modify the `transform` function to replace any Greek letters in the range 'α' to 'ω' with question marks.
+
+```fsharp
+transform 'β' // => "?"
+```
+
+## 6. Combine these operations to operate on a string
+
+Implement the `clean` function to apply these operations to an entire string.
+
+Characters which fall outside the rules should pass through unchanged.
+
+```fsharp
+clean "  a2b Cd-ω😀  " //  => "ab-cd_?😀"
+```
+
+## Assembling a string from characters
+
+This topic will be covered in detail later in the syllabus.
+
+For now, it may be useful to know that there is a [higher order function][higher-order-function] called [`String.collect`][string-collect]. It applies a function that you supply to each `char` of a string and concatenates the resulting strings.
+
+```fsharp
+let transform ch = $"{ch}_"
+String.collect transform "abc"  // =>  "a_b_c_"
+```
+
+[higher-order-function]: https://exercism.org/tracks/fsharp/concepts/higher-order-functions
+[string-collect]: https://fsharp.github.io/fsharp-core-docs/reference/fsharp-core-stringmodule.html#collect