Type Classes in Scala3: A Beginner’s Guide | Ledger

This document is intended for beginner Scala 3 developers who are already comfortable with Scala syntax, but are puzzled by all the `implicits` and parameterized traits in the code.

This document explains the why, how, where and when of Type Classes (TC).

After reading this document, the beginner Scala 3 developer will have enough knowledge to use and dive into the source code of many Scala libraries, and to start writing idiomatic Scala code.

Let’s start with the why …

The expression problem

In 1998, Philip Wadler stated that “the expression problem is a new name for an old problem”. It’s the problem of software extensibility. According to Wadler’s note, a solution to the expression problem must comply with the following rules:

  • Rule 1: Allow existing behaviors (think of a Scala `trait`) to be implemented for new representations (think of a `case class`)
  • Rule 2: Allow new behaviors to be implemented for existing representations
  • Rule 3: It must not jeopardize type safety
  • Rule 4: It must not require recompiling existing code

Solving this problem will be the common thread of this article.

Rule 1: implementation of existing behavior on new representation

Any object oriented language has a baked-in solution for rule 1 with subtype polymorphism. You can safely implement any `trait` defined in a dependency on a `class` in your own code, without recompiling the dependency. Let’s see that in action:

Scala

def todo = 42
type Height = Int
type Block = Int

object Lib1:
  trait Blockchain:
    def getBlock(height: Height): Block

  case class Ethereum() extends Blockchain:
    override def getBlock(height: Height) = todo

  case class Bitcoin() extends Blockchain:
    override def getBlock(height: Height) = todo


object Lib2:
  import Lib1.*

  case class Polkadot() extends Blockchain:
    override def getBlock(height: Height): Block = todo

val eth = Lib1.Ethereum()
val btc = Lib1.Bitcoin()
val dot = Lib2.Polkadot()

In this fictitious example, library `Lib1` defines a trait `Blockchain` with two implementations of it, `Ethereum` and `Bitcoin`. `Lib1` will remain the same throughout this document (enforcement of rule 4).

`Lib2` implements the existing behavior `Blockchain` on a new class `Polkadot` (rule 1) in a type-safe manner (rule 3), without recompiling `Lib1` (rule 4).

Rule 2: implementation of new behaviors to be applied to existing representations

Let’s imagine in `Lib2` we want a new behavior `lastBlock` to be implemented specifically for each `Blockchain`.

The first thing that comes to mind is a big switch on the type of the parameter.

Scala

def todo = 42
type Height = Int
type Block = Int

object Lib1:
  trait Blockchain:
    def getBlock(height: Height): Block

  case class Ethereum() extends Blockchain:
    override def getBlock(height: Height) = todo

  case class Bitcoin() extends Blockchain:
    override def getBlock(height: Height) = todo

object Lib2:
  import Lib1.*

  case class Polkadot() extends Blockchain:
    override def getBlock(height: Height): Block = todo

  def lastBlock(blockchain: Blockchain): Block = blockchain match
    case _: Ethereum => todo
    case _: Bitcoin => todo
    case _: Polkadot => todo


object Lib3:
  import Lib1.*

  case class Polygon() extends Blockchain:
    override def getBlock(height: Height): Block = todo

import Lib1.*, Lib2.*, Lib3.*
println(lastBlock(Bitcoin()))
println(lastBlock(Ethereum()))
println(lastBlock(Polkadot()))
println(lastBlock(Polygon()))

This solution is a weak reimplementation of the type-based polymorphism that is already baked into the language!

`Lib1` is left untouched (remember, rule 4 is enforced throughout this document).

The solution implemented in `Lib2` is okay-ish until yet another blockchain is introduced in `Lib3`. It then violates the type safety rule (rule 3): the code compiles, but the last call, `lastBlock(Polygon())`, fails at runtime with a `MatchError`. And modifying `Lib2` would violate rule 4.
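To make the runtime failure concrete, here is a minimal, self-contained sketch of the same situation. The names `Chain`, `Known` and `LateComer` are hypothetical stand-ins for `Blockchain`, a chain that the switch knows about, and a chain added later by another library.

```scala
import scala.util.Try

// Hypothetical stand-ins for the article's Blockchain hierarchy.
trait Chain
case class Known() extends Chain
case class LateComer() extends Chain // added later, unknown to the switch

// Type-based switch: the trait is not sealed, so the compiler
// cannot warn that the match is not exhaustive.
def lastBlock(chain: Chain): Int = chain match
  case _: Known => 42

val handled = Try(lastBlock(Known()))       // the known case is matched
val unhandled = Try(lastBlock(LateComer())) // no case matches: scala.MatchError
```

The code compiles without complaint, and the problem only shows up when the unknown type reaches the switch.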

Another solution is using an `extension`.

Scala

def todo = 42
type Height = Int
type Block = Int

object Lib1:
  trait Blockchain:
    def getBlock(height: Height): Block

  case class Ethereum() extends Blockchain:
    override def getBlock(height: Height) = todo

  case class Bitcoin() extends Blockchain:
    override def getBlock(height: Height) = todo

object Lib2:
  import Lib1.*

  case class Polkadot() extends Blockchain:
    override def getBlock(height: Height): Block = todo

    def lastBlock(): Block = todo

  extension (eth: Ethereum) def lastBlock(): Block = todo

  extension (btc: Bitcoin) def lastBlock(): Block = todo

import Lib1.*, Lib2.*
println(Bitcoin().lastBlock())
println(Ethereum().lastBlock())
println(Polkadot().lastBlock())

def polymorphic(blockchain: Blockchain) =
  // blockchain.lastBlock()
  ???

`Lib1` is left untouched (enforcement of rule 4 in the whole document).

`Lib2` defines the behavior for its own type (the `lastBlock()` method of `Polkadot`) and `extension`s for the existing types `Ethereum` and `Bitcoin`.

In the three `println` calls, the new behavior can be used on each class.

But there is no way to call this new behavior polymorphically, as the `polymorphic` function shows. Any attempt to do so leads to a compilation error (the commented-out `blockchain.lastBlock()` call) or to type-based switches.

Rule 2 is tricky. We tried to implement it with our own definition of polymorphism and with the `extension` trick. And that was weird.

There is a missing piece called ad-hoc polymorphism: the ability to safely dispatch a behavior implementation according to a type, wherever the behavior and type are defined. Enter the Type Class pattern.

The Type Class pattern

The Type Class (TC for short) pattern recipe has 3 steps. 

  1. Define a new behavior
  2. Implement the behavior
  3. Use the behavior

In the following section, I implement the TC pattern in the most straightforward way. It’s verbose, clunky and impractical. But hold on, those caveats will be fixed step by step further in the document.

1. Define a new behavior

Scala

object Lib2:
  import Lib1.*

  trait LastBlock[A]:
    def lastBlock(instance: A): Block

`Lib1` is, once again, left untouched.

The new behavior is the TC materialized by the trait. The functions defined in the trait are a way to apply some aspects of that behavior.

The parameter `A` represents the type we want to apply the behavior to, which is a subtype of `Blockchain` in our case.

Some remarks:

  • If needed, the parameterized type `A` can be further constrained by the Scala type system. For instance, we could enforce `A` to be a `Blockchain`. 
  • Also, the TC could have many more functions declared in it.
  • Finally, each function may have many more arbitrary parameters.
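As an illustration of these three remarks, here is a hedged sketch of a richer type class. `BlockExplorer`, `blockAt`, `blocksBehind` and the `Toy` chain are hypothetical names invented for this example; only `Blockchain`, `Height` and `Block` come from the article.

```scala
type Height = Int
type Block = Int

trait Blockchain:
  def getBlock(height: Height): Block

// Hypothetical variant of LastBlock: A is constrained to be a Blockchain,
// the TC declares several functions, and some take extra parameters.
trait BlockExplorer[A <: Blockchain]:
  def lastBlock(instance: A): Block
  // The upper bound on A is what makes getBlock available here.
  def blockAt(instance: A, height: Height): Block = instance.getBlock(height)
  def blocksBehind(instance: A, depth: Int): Block = lastBlock(instance) - depth

// Hypothetical chain used only to exercise the sketch.
case class Toy(tip: Block) extends Blockchain:
  def getBlock(height: Height): Block = height

val toyExplorer = new BlockExplorer[Toy]:
  def lastBlock(instance: Toy): Block = instance.tip

val behind = toyExplorer.blocksBehind(Toy(tip = 10), depth = 3)
```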

But let’s keep things simple for the sake of readability.

2. Implement the behavior

Scala

object Lib2:
  import Lib1.*

  trait LastBlock[A]:
    def lastBlock(instance: A): Block

  val ethereumLastBlock = new LastBlock[Ethereum]:
    def lastBlock(eth: Ethereum) = eth.lastBlock

  val bitcoinLastBlock = new LastBlock[Bitcoin]:
    def lastBlock(btc: Bitcoin) = http("https://bitcoin/last")

For each type on which the new `LastBlock` behavior is expected, there is a specific instance of that behavior.

The `Ethereum` implementation (`ethereumLastBlock`) is computed from the `eth` instance passed as parameter.

The implementation of `LastBlock` for `Bitcoin` (`bitcoinLastBlock`) performs unmanaged IO and doesn’t use its parameter.

So, `Lib2` implements the new behavior `LastBlock` for `Lib1` classes.

3. Use the behavior

Scala

object Lib2:
  import Lib1.*

  trait LastBlock[A]:
    def lastBlock(instance: A): Block

  val ethereumLastBlock = new LastBlock[Ethereum]:
    def lastBlock(eth: Ethereum) = eth.lastBlock

  val bitcoinLastBlock = new LastBlock[Bitcoin]:
    def lastBlock(btc: Bitcoin) = http("https://bitcoin/last")

import Lib1.*, Lib2.*

def useLastBlock[A](instance: A, behavior: LastBlock[A]) =
  behavior.lastBlock(instance)

println(useLastBlock(Ethereum(lastBlock = 2), ethereumLastBlock))
println(useLastBlock(Bitcoin(), bitcoinLastBlock))

`useLastBlock` takes an instance of `A` and the `LastBlock` behavior defined for that instance.

In the first `println`, `useLastBlock` is called with an instance of `Ethereum` and an implementation of `LastBlock` defined in `Lib2`. Note that it’s possible to pass any alternative implementation of `LastBlock[A]` (think of dependency injection).

`useLastBlock` is the glue between a representation (the actual `A`) and its behavior. Data and behavior are separated, which is what functional programming advocates.
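The dependency injection remark can be sketched as follows. This is a minimal, self-contained example: `stubbedBitcoin` and its fixed value `7` are hypothetical, and `Bitcoin` is reduced to an empty case class so that the snippet stands alone.

```scala
type Block = Int
case class Bitcoin()

trait LastBlock[A]:
  def lastBlock(instance: A): Block

def useLastBlock[A](instance: A, behavior: LastBlock[A]): Block =
  behavior.lastBlock(instance)

// Hypothetical stub: returns a fixed block instead of performing IO,
// which is convenient in a test.
val stubbedBitcoin = new LastBlock[Bitcoin]:
  def lastBlock(btc: Bitcoin): Block = 7

val fromStub = useLastBlock(Bitcoin(), stubbedBitcoin)
```

The caller decides which behavior goes with the data, and `useLastBlock` itself does not change.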

Discussion

Let’s recap the rules of the expression problem:

  • Rule 1: Allow existing behaviors to be implemented for new classes
  • Rule 2: Allow new behaviors to be implemented for existing classes
  • Rule 3: It must not jeopardize type safety
  • Rule 4: It must not require recompiling existing code

Rule 1 can be solved out of the box with subtype polymorphism.

The TC pattern just presented (see the previous listing) solves rule 2. It’s type safe (rule 3) and we never touched `Lib1` (rule 4).

However, it’s impractical to use for several reasons:

  • In the two `println` calls, we have to explicitly pass the behavior along with its instance. This is extra overhead. We should just write `useLastBlock(Bitcoin())`.
  • Inside `useLastBlock`, the `behavior.lastBlock(instance)` syntax is uncommon. We would prefer a more concise and more object-oriented `instance.lastBlock()` statement.

Let’s highlight some Scala features for practical TC usage. 

Enhanced developer experience

Scala has a unique set of features and syntactic sugar that make TC a truly enjoyable experience for developers.

Implicits

The implicit scope is a special scope, resolved at compile time, in which only one instance of a given type can exist.

A program puts an instance in the implicit scope with the `given` keyword. Conversely, a program retrieves an instance from the implicit scope with the `using` keyword.

Because the implicit scope is resolved at compile time, there is no way to change it dynamically at runtime. If the program compiles, the implicit scope is resolved: at runtime, an implicit instance cannot be missing where it is used. The only possible confusion may come from using the wrong implicit instance, but this issue is left for the creature between the chair and the keyboard.

It’s different from a global scope because:

  1. It’s resolved contextually. Two locations of a program can use an instance of the same given type in implicit scope, but those two instances may be different.
  2. Behind the scenes, the code passes implicit arguments from function to function until the implicit usage is reached. It’s not using a global memory space.
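The first point can be sketched as follows, using the `given` and `using` keywords just introduced. The objects `Production` and `Testing`, the instance names and the block values are hypothetical; the sketch only shows that the same call resolves to a different instance depending on where it is written.

```scala
type Block = Int
case class Bitcoin()

trait LastBlock[A]:
  def lastBlock(instance: A): Block

def useLastBlock[A](instance: A)(using behavior: LastBlock[A]): Block =
  behavior.lastBlock(instance)

// Hypothetical location 1: its own given instance is in scope.
object Production:
  given productionTip: LastBlock[Bitcoin] = new LastBlock[Bitcoin]:
    def lastBlock(btc: Bitcoin): Block = 800000
  val tip: Block = useLastBlock(Bitcoin())

// Hypothetical location 2: same type, different instance.
object Testing:
  given testingTip: LastBlock[Bitcoin] = new LastBlock[Bitcoin]:
    def lastBlock(btc: Bitcoin): Block = 1
  val tip: Block = useLastBlock(Bitcoin())
```

Both objects contain the exact same expression `useLastBlock(Bitcoin())`, yet each one picks up the instance that is visible from its own location.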

Going back to the type class! Let’s take the exact same example.

Scala

def todo = 42
type Height = Int
type Block = Int
def http(uri: String): Block = todo

object Lib1:
  trait Blockchain:
    def getBlock(height: Height): Block

  case class Ethereum(lastBlock: Block) extends Blockchain:
    override def getBlock(height: Height) = todo

  case class Bitcoin() extends Blockchain:
    override def getBlock(height: Height) = todo

`Lib1` is the same library code as before, shown with two details the following snippets rely on: `Ethereum` carries the `lastBlock` field read by `eth.lastBlock`, and an `http` helper is stubbed next to `todo`.

Scala

object Lib2:
  import Lib1.*

  trait LastBlock[A]:
    def lastBlock(instance: A): Block

  given ethereumLastBlock: LastBlock[Ethereum] = new LastBlock[Ethereum]:
    def lastBlock(eth: Ethereum) = eth.lastBlock

  given bitcoinLastBlock: LastBlock[Bitcoin] = new LastBlock[Bitcoin]:
    def lastBlock(btc: Bitcoin) = http("https://bitcoin/last")

import Lib1.*, Lib2.*

def useLastBlock[A](instance: A)(using behavior: LastBlock[A]) =
  behavior.lastBlock(instance)

println(useLastBlock(Ethereum(lastBlock = 2)))
println(useLastBlock(Bitcoin()))

A new behavior `LastBlock` is defined, exactly like we did previously.

For `ethereumLastBlock` and `bitcoinLastBlock`, `val` is replaced by `given`. Both implementations of `LastBlock` are put in the implicit scope.

`useLastBlock` declares the behavior `LastBlock` as an implicit parameter (`using`). The compiler resolves the appropriate instance of `LastBlock` from the implicit scope, contextualized from the caller locations (the two `println` calls). Strictly speaking, a Scala 3 wildcard import such as `import Lib2.*` does not import given instances (that takes `import Lib2.given`); the instances are still found here because they are defined in `Lib2`, the object enclosing `LastBlock`, which is part of the implicit scope of `LastBlock[A]`. So, the compiler passes `ethereumLastBlock` and `bitcoinLastBlock` as the last parameter of `useLastBlock`.

As a library user, using a type class is easier than before. At the call sites, a developer only has to make sure that an instance of the behavior is available in the implicit scope (and this can be a mere `import`). If an implicit is not `given` where the code is `using` it, the compiler says so.

Scala’s implicits ease the task of passing class instances along with instances of their behaviors.
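The compile-time guarantee can be observed with `scala.compiletime.testing.typeChecks` from the standard library, which reports whether a snippet would compile at the place where it is written. In this hedged sketch, `MissingInstanceDemo` and the instance-less `Polygon` are hypothetical names, and the block value `42` is arbitrary.

```scala
import scala.compiletime.testing.typeChecks

object MissingInstanceDemo:
  type Block = Int
  case class Bitcoin()
  case class Polygon() // hypothetical chain with no LastBlock instance

  trait LastBlock[A]:
    def lastBlock(instance: A): Block

  given LastBlock[Bitcoin] with
    def lastBlock(btc: Bitcoin): Block = 42

  def useLastBlock[A](instance: A)(using behavior: LastBlock[A]): Block =
    behavior.lastBlock(instance)

  // Evaluated by the compiler: does the quoted snippet type check here?
  val bitcoinCompiles: Boolean = typeChecks("useLastBlock(Bitcoin())")
  val polygonCompiles: Boolean = typeChecks("useLastBlock(Polygon())")
```

Compare this with the type-based switch seen earlier: an unsupported type is rejected by the compiler instead of failing at runtime.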

Implicit sugars

The two `given` definitions of the previous code can be further improved! Let’s iterate on the TC implementations.

Scala

  given LastBlock[Ethereum] = new LastBlock[Ethereum]:
    def lastBlock(eth: Ethereum) = eth.lastBlock

  given LastBlock[Bitcoin] = new LastBlock[Bitcoin]:
    def lastBlock(btc: Bitcoin) = http("https://bitcoin/last")

If the name of the instance is unused, it can be omitted.

Scala

  given LastBlock[Ethereum] with
    def lastBlock(eth: Ethereum) = eth.lastBlock

  given LastBlock[Bitcoin] with
    def lastBlock(btc: Bitcoin) = http("https://bitcoin/last")

The repetition of the type can be replaced with the `with` keyword.

Scala

  given LastBlock[Ethereum] = _.lastBlock

  given LastBlock[Bitcoin] = _ => http("https://bitcoin/last")

Because we use a degenerate trait with a single function in it, the IDE may suggest simplifying the code with a SAM (single abstract method) expression. Although correct, I don’t think it’s a proper use of SAM, unless you’re casually code golfing.

Scala offers syntactic sugar to streamline the syntax, removing unnecessary naming, declaration and type redundancy.

Extension

Used wisely, the `extension` mechanism can simplify the syntax for using a type class.

Scala

object Lib2:
  import Lib1.*

  trait LastBlock[A]:
    def lastBlock(instance: A): Block

  given LastBlock[Ethereum] with
    def lastBlock(eth: Ethereum) = eth.lastBlock

  given LastBlock[Bitcoin] with
    def lastBlock(btc: Bitcoin) = http("https://bitcoin/last")

  extension [A](instance: A)
    def lastBlock(using tc: LastBlock[A]) = tc.lastBlock(instance)

import Lib1.*, Lib2.*

println(Ethereum(lastBlock = 2).lastBlock)
println(Bitcoin().lastBlock)

A generic extension method `lastBlock` is defined for any `A` that has a `LastBlock` TC instance in implicit scope.

In the two `println` calls, the extension provides an object-oriented syntax for using the TC.

Scala

object Lib2:
  import Lib1.*

  trait LastBlock[A]:
    def lastBlock(instance: A): Block

  given LastBlock[Ethereum] with
    def lastBlock(eth: Ethereum) = eth.lastBlock

  given LastBlock[Bitcoin] with
    def lastBlock(btc: Bitcoin) = http("https://bitcoin/last")

  extension [A](instance: A)(using tc: LastBlock[A])
    def lastBlock = tc.lastBlock(instance)
    def penultimateBlock = tc.lastBlock(instance) - 1

import Lib1.*, Lib2.*

val eth = Ethereum(lastBlock = 2)
println(eth.lastBlock)
println(eth.penultimateBlock)

val btc = Bitcoin()
println(btc.lastBlock)
println(btc.penultimateBlock)

The TC parameter can also be declared once for the whole extension to avoid repetition. We reuse the TC in the extension to define `penultimateBlock` (even though it could be implemented on the `LastBlock` trait directly).

The magic happens when the TC is used. The expression feels a lot more natural, giving the illusion that the behavior `lastBlock` is conflated with the instance.
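For comparison, here is a hedged sketch of the alternative mentioned in passing: `penultimateBlock` implemented on the `LastBlock` trait itself, as a method with a default implementation. The block value `100` is arbitrary, and `Bitcoin` is reduced to an empty case class to keep the snippet self-contained.

```scala
type Block = Int
case class Bitcoin()

trait LastBlock[A]:
  def lastBlock(instance: A): Block
  // Default implementation: an instance may override it,
  // for example with a cheaper chain-specific query.
  def penultimateBlock(instance: A): Block = lastBlock(instance) - 1

given LastBlock[Bitcoin] with
  def lastBlock(btc: Bitcoin): Block = 100

// The extension only forwards to the TC.
extension [A](instance: A)(using tc: LastBlock[A])
  def lastBlock: Block = tc.lastBlock(instance)
  def penultimateBlock: Block = tc.penultimateBlock(instance)

val last = Bitcoin().lastBlock
val penultimate = Bitcoin().penultimateBlock
```

With this design the derived operation belongs to the behavior and each instance can specialize it, whereas an operation defined only in the extension is the same for every type.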

Generic type with TC
Scala

import Lib1.*, Lib2.*

def useLastBlock1[A](instance: A)(using LastBlock[A]) = instance.lastBlock

def useLastBlock2[A: LastBlock](instance: A) = instance.lastBlock

val eth = Ethereum(lastBlock = 2)
assert(useLastBlock1(eth) == useLastBlock2(eth))

`useLastBlock1` uses an implicit TC. Note that the TC parameter doesn’t need to be named if that name is unnecessary.

The TC pattern is so widely used that there is a generic type syntax, called a context bound, to express “a type with an implicit behavior”. `useLastBlock2` uses this syntax, a more concise alternative to the one in `useLastBlock1`. It avoids explicitly declaring the unnamed implicit TC parameter.
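When the TC instance itself is needed inside a function declared with a context bound, the standard `summon` method retrieves it from the implicit scope. A minimal sketch, in which `tipOf` and `highestTip` are hypothetical helper names and `Bitcoin` carries a hypothetical `tip` field:

```scala
type Block = Int
case class Bitcoin(tip: Block)

trait LastBlock[A]:
  def lastBlock(instance: A): Block

given LastBlock[Bitcoin] with
  def lastBlock(btc: Bitcoin): Block = btc.tip

// The context bound declares an unnamed using parameter;
// summon fetches it back by type.
def tipOf[A: LastBlock](instance: A): Block =
  summon[LastBlock[A]].lastBlock(instance)

// The bound is passed along implicitly to tipOf.
def highestTip[A: LastBlock](chains: List[A]): Block =
  chains.map(chain => tipOf(chain)).max

val highest = highestTip(List(Bitcoin(3), Bitcoin(9), Bitcoin(5)))
```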

This concludes the developer experience section. We have seen how extensions, implicits and some syntactic sugar can provide a less cluttered syntax when the TC is used and defined.

Automatic derivation

Many Scala libraries define TCs and leave it to programmers to implement them in their code base.

For instance Circe (a JSON serialization and deserialization library) defines the TCs `Encoder[T]` and `Decoder[T]` for programmers to implement in their codebase. Once they are implemented, the whole scope of the library can be used.

Those TC implementations are more often than not data-oriented mappers. They don’t need any business logic, are boring to write, and are a burden to keep in sync with case classes.

In such a situation, those libraries offer what is called automatic derivation or semi-automatic derivation. See for instance Circe automatic and semi-automatic derivation. With semi-automatic derivation the programmer declares an instance of a type class with some minor syntax, whereas automatic derivation doesn’t require any code modification except for an import.

Under the hood, at compile time, generic macros introspect types as pure data structures and generate a `TC[T]` for library users.

Generically deriving a TC is very common, so Scala introduced a complete toolbox for that purpose. This method is not always advertised in library documentation, although it’s the Scala 3 way of using derivation.

Scala

object GenericLib:

  trait Named[A]:
    def blockchainName(instance: A): String

  object Named:
    import scala.deriving.*

    inline final def derived[A](using inline m: Mirror.Of[A]): Named[A] =
      val nameOfType: String = inline m match
        case p: Mirror.ProductOf[A] => compiletime.constValue[p.MirroredLabel]
        case _ => compiletime.error("Not a product")
      new Named[A]:
        override def blockchainName(instance: A): String = nameOfType.toLowerCase

  extension [A](instance: A)(using tc: Named[A])
    def blockchainName = tc.blockchainName(instance)

import Lib1.*, GenericLib.*

case class Polkadot() derives Named
given Named[Bitcoin] = Named.derived
given Named[Ethereum] = Named.derived

println(Ethereum(lastBlock = 2).blockchainName)
println(Bitcoin().blockchainName)
println(Polkadot().blockchainName)

A new TC `Named` is introduced. Strictly speaking, this TC is unrelated to the blockchain business. Its purpose is to name the blockchain based on the name of the case class.

First focus on the three definitions that follow the imports. There are 2 syntaxes for deriving a TC:

  1. The TC instance can be defined directly on the case class with the `derives` keyword, as for `Polkadot`. Under the hood the compiler generates a given `Named` instance in the `Polkadot` companion object.
  2. For `Bitcoin` and `Ethereum`, type class instances are given on pre-existing classes with `TC.derived`.

A generic extension is defined at the end of `GenericLib` (see previous sections), so `blockchainName` is used naturally.

The `derives` keyword expects a method of the form `inline def derived[T](using Mirror.Of[T]): TC[T] = ???`, which is defined in the `Named` companion object. I won’t explain in depth what the code does. In broad outline:

  • `inline def` tells the compiler to expand the method at each call site, at compile time; it is the entry point of Scala 3 metaprogramming
  • `Mirror` is part of the toolbox to introspect types. There are different kinds of mirrors, and the `inline m match` focuses on `Product` mirrors (a case class is a product). If programmers try to derive something that is not a `Product`, the `compiletime.error` branch makes compilation fail.
  • the `Mirror` contains other types. One of them, `MirroredLabel`, is a string literal type that holds the type name. This value is used in the implementation of the `Named` TC.

TC authors can use meta programming to provide functions that generically generate instances of TC given a type. Programmers can use dedicated library API or the Scala deriving tools to create instances for their code.

Whether you need generic or specific code to implement a TC, there is a solution for each situation. 
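To give an idea of what else a `Mirror` exposes, here is a hedged sketch of a second derivable TC that lists the field names of a case class. `FieldNames` and `Transfer` are hypothetical names invented for this example; `Mirror.ProductOf`, `MirroredElemLabels` and `constValueTuple` belong to the Scala 3 standard library.

```scala
import scala.deriving.Mirror
import scala.compiletime.constValueTuple

// Hypothetical TC: exposes the field names of a case class.
trait FieldNames[A]:
  def fieldNames: List[String]

object FieldNames:
  // Requiring a ProductOf mirror restricts derivation to products.
  inline def derived[A](using m: Mirror.ProductOf[A]): FieldNames[A] =
    // MirroredElemLabels is a tuple of string literal types;
    // constValueTuple turns it into a tuple of values at compile time.
    val labels = constValueTuple[m.MirroredElemLabels]
      .productIterator.map(_.toString).toList
    new FieldNames[A]:
      def fieldNames: List[String] = labels

// Hypothetical case class used only to exercise the derivation.
case class Transfer(from: String, to: String, amount: Long) derives FieldNames

val names = summon[FieldNames[Transfer]].fieldNames
```

As with `Named`, the `derives` clause is enough: the compiler generates the given instance in the companion object of `Transfer`.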

Summary of all the benefits

  • It solves the expression problem
    • New types can implement existing behavior through traditional trait inheritance
    • New behaviors can be implemented on existing types
  • Separation of concerns
    • The code is not tangled and is easy to delete. A TC separates data and behavior, which is a functional programming motto.
  • It’s safe
    • It’s type safe because it doesn’t rely on introspection. It avoids big pattern matching involving types. If you find yourself writing such code, you may have found a case where the TC pattern suits perfectly.
    • The implicit mechanism is compile safe! If an instance is missing at compile time the code won’t compile. No surprise at runtime.
  • It brings ad-hoc polymorphism
    • Ad-hoc polymorphism is usually missing in traditional object-oriented programming.
    • With ad-hoc polymorphism, developers can implement the same behavior for various unrelated types without using traditional subtyping (which couples the code)
  • Dependency injection made easy
    • A TC instance can be swapped for another one in accordance with the Liskov substitution principle.
    • When a component has a dependency upon a TC, a mocked TC can easily be injected for testing purposes. 

Counter indications

Every hammer is designed for a range of problems.

Type Classes are for behavioral problems and must not be used for data inheritance. Use composition for that purpose.

The usual subtyping is more straightforward. If you own the code base and don’t aim for extensibility, type classes may be overkill.

For instance, in the Scala standard library, there is a `Numeric` type class:

Scala

trait Numeric[T] extends Ordering[T] {
  def plus(x: T, y: T): T
  def minus(x: T, y: T): T
  def times(x: T, y: T): T
  // … other members elided
}

It really makes sense to use such a type class because it not only allows reuse of algebraic algorithms on types that are embedded in Scala (Int, BigInt, …), but also on user defined types (a `ComplexNumber` for instance).
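As a small illustration, the sketch below writes one algorithm against `Numeric` and reuses it for several number types. `sumOfSquares` is a hypothetical function; `plus`, `times` and `zero` are members of the standard `Numeric` trait.

```scala
// One algorithm, written once against the Numeric type class.
def sumOfSquares[T](values: List[T])(using num: Numeric[T]): T =
  values.map(v => num.times(v, v)).foldLeft(num.zero)(num.plus)

// The standard library provides Numeric instances for Int and BigInt.
val ints = sumOfSquares(List(1, 2, 3))
val bigs = sumOfSquares(List(BigInt(10), BigInt(20)))
```

A user-defined type would only need its own `Numeric` instance to be accepted by the same function.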

On the other hand, the implementation of Scala collections mostly uses subtyping instead of type classes. This design makes sense for several reasons:

  • The collection API is supposed to be complete and stable. It exposes common behavior through traits inherited by implementations. Being highly extensible is not a particular goal here.
  • It must be simple to use. TC adds mental overhead for the end-user programmer.
  • TC might also incur a small performance overhead. This may be critical for a collection API.
  • That said, the collection API is still extensible through new TCs defined by third-party libraries.

Conclusion

We’ve seen that TC is a simple pattern that solves a big problem. Thanks to Scala’s rich syntax, the TC pattern can be implemented and used in many ways. The TC pattern is in line with the functional programming paradigm and is a fabulous tool for a clean architecture. There is no silver bullet, and the TC pattern should be applied only where it fits.

I hope you learned something from this document.

Code is available at https://github.com/jprudent/type-class-article. Please reach out to me if you have any sort of questions or remarks. You can use issues or code comments in the repository if you want.


Jerome PRUDENT

Software Engineer
