URL type using TypeScript or FP

How to enforce a custom condition on a string.

Imagine we pass around URL strings, and sometimes we extract the domain part from them. We could write a function to strip the https:// prefix like this:

function stripProtocol(s:string) {
  // assuming `s` starts with "https://" we want
  // to remove first 8 characters
  return s.substr(8)
}
console.log(stripProtocol('https://foo.com')) // "foo.com"

It works, yet it is very unreliable. We can pass any string to stripProtocol, so sometimes it prints an empty string, or even crashes if we pass undefined.

console.log(stripProtocol('foo.com')) // ""
let x:string
console.log(stripProtocol(x)) // 🔥🔥🔥

We can rely on TypeScript to catch the possibly null or undefined argument x using the strictNullChecks option. Yet we still have a problem: how do we describe a "url" type, which is a string that has the https:// prefix, and how do we make the TypeScript compiler enforce this new "url" type and tell it apart from the plain "string" type?
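Before tackling that question, here is a minimal sketch of the strictNullChecks half. It assumes "strictNullChecks": true (or "strict": true) in tsconfig.json, and the name stripProtocolOrUndefined is made up for this illustration.

```typescript
// Assumes "strictNullChecks": true (or "strict": true) in tsconfig.json.
const HTTPS_PREFIX = 'https://'

// The parameter type admits undefined explicitly, so the compiler
// rejects s.slice(...) until the undefined case has been handled.
function stripProtocolOrUndefined(s: string | undefined): string | undefined {
  if (s === undefined) {
    return undefined
  }
  // here `s` has been narrowed to string
  return s.slice(HTTPS_PREFIX.length)
}

console.log(stripProtocolOrUndefined('https://foo.com')) // "foo.com"
console.log(stripProtocolOrUndefined(undefined)) // undefined
```

This only guards against a missing value; a string without the prefix still slips through.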

We could create a user type UrlString and just point at the primitive string type.

type UrlString = string
function stripProtocol(s:UrlString) {
  // assuming s starts with "https://" we want
  // to remove first 8 characters
  return s.substr(8)
}
console.log(stripProtocol('https://foo.com')) // "foo.com"
console.log(stripProtocol('foo.com')) // ""

Yet this does NOT allow TypeScript to distinguish between "string" and "UrlString"; it happily accepts both as valid arguments to stripProtocol. We have NOT declared a new type, we just declared an alias to string.

We need "nominal" types: a type that is distinguished only by its name, yet otherwise acts like an existing type (such as string). The "nominal" feature is on the TypeScript roadmap. Currently there are a couple of solutions, described in the excellent TypeScript book. I like the approach that uses enumerations to guarantee uniqueness, yet keeps the value a primitive.

Nominal type using Enum

The syntax is a little weird, because we are combining string with a new enumeration type.

const enum AsUrlString {}
type UrlString = string & AsUrlString
function stripProtocol(s:UrlString) {
  // assuming s starts with "https://" we want
  // to remove first 8 characters
  return s.substr(8)
}
function toUrlString(s:string):UrlString {
  return s as UrlString
}
console.log(stripProtocol(toUrlString('https://foo.com'))) // "foo.com"

If you try to call stripProtocol with a plain string argument, the compiler reports an error:

console.log(stripProtocol('foo.com'))
// tsc
// Type '"foo.com"' is not assignable to type 'AsUrlString'.

We could even add a runtime check inside toUrlString to guarantee that we only convert a valid string:

function toUrlString(s:string):UrlString {
  console.assert(s.indexOf('https://') === 0, `expected url string, got ${s}`)
  return s as UrlString
}
console.log(stripProtocol(toUrlString('foo.com')))
// 🔥 AssertionError: expected url string, got foo.com
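One caveat: in browsers and in Node.js 10 and later, console.assert only logs the failure and does not throw, so the invalid string would still be returned. Below is a minimal sketch of a check that throws explicitly. To keep it self-contained it uses a property-based brand rather than the enum above; the names BrandedUrl, toBrandedUrl and stripBrandedProtocol are made up for this illustration.

```typescript
// Hypothetical brand: the __brand property exists only at the type level,
// at runtime the value is still a plain string.
type BrandedUrl = string & { readonly __brand: 'UrlString' }

function toBrandedUrl(s: string): BrandedUrl {
  if (s.indexOf('https://') !== 0) {
    // throw explicitly instead of relying on console.assert
    throw new Error(`expected url string, got ${s}`)
  }
  return s as BrandedUrl
}

function stripBrandedProtocol(url: BrandedUrl): string {
  return url.slice('https://'.length)
}

console.log(stripBrandedProtocol(toBrandedUrl('https://foo.com'))) // "foo.com"

try {
  toBrandedUrl('foo.com')
} catch (e) {
  console.log((e as Error).message) // "expected url string, got foo.com"
}
```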

OK, this is interesting. Is there another way to create a user type, one that does not allow casting?

Class with a private property

We could use a class with a private property to try to stop users from casting stray values.

class UrlString {
  readonly s: string
  private type: 'UrlString'
  constructor(s:string) {
    console.assert(s.indexOf('https://') === 0, `expected url string, got ${s}`)
    this.s = s
  }
}
function stripProtocol(url:UrlString) {
  // assuming url.s starts with "https://" we want
  // to remove first 8 characters
  return url.s.substr(8)
}
console.log(stripProtocol(new UrlString('https://foo.com'))) // "foo.com"

Yet one can still cast!

console.log(stripProtocol({s:'foo.com'} as UrlString))

Great, but can we enforce that the only place allowed to create values of type UrlString is a single function such as toUrlString? We do not want the user to short-circuit the creation process by just casting a string directly.

Unfortunately no. I could not find a way to keep a type "private" enough so that the only function that can create it is a function like toUrlString, yet have the rest of the code know about the type.
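The closest approximation I can sketch is a class with a private constructor and a static factory. This is only a sketch under my own naming (CheckedUrl, parse and stripCheckedProtocol are hypothetical): it blocks `new CheckedUrl(...)` outside the class, but a determined caller can still write `as CheckedUrl`, so it narrows the creation path without sealing it.

```typescript
class CheckedUrl {
  // private constructor: `new CheckedUrl(...)` is a compile error outside the class
  private constructor(readonly value: string) {}

  // The only sanctioned way to obtain an instance.
  static parse(s: string): CheckedUrl | undefined {
    return s.indexOf('https://') === 0 ? new CheckedUrl(s) : undefined
  }
}

function stripCheckedProtocol(url: CheckedUrl): string {
  return url.value.slice('https://'.length)
}

const parsedUrl = CheckedUrl.parse('https://foo.com')
if (parsedUrl !== undefined) {
  console.log(stripCheckedProtocol(parsedUrl)) // "foo.com"
}
console.log(CheckedUrl.parse('foo.com')) // undefined
```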

Hmm, this is disappointing. Maybe I am asking TypeScript to do too much for me?

Solving the root problem

We seem to have forgotten the main problem we are trying to solve. We need to handle two kinds of strings: some might be full https:// urls, while others might not be. A static type can only enforce the name of the type; it cannot guarantee that the original data truly conforms to our format. Imagine the user entering the URL!

Even the enforcement that I picked relies on crashing the entire program if the input data is incorrect. Is that the best approach? No! Entering incorrect data is so common that we must take it into consideration and devise a code path that handles it correctly.

Instead of describing a type, let us see if TypeScript can help us detect when we have not coded every possible execution path!

To do this, let us put an unknown "untrusted" value into a "Maybe" box. We do not know yet if the value has the necessary https:// prefix, but we will be ready when it does not. To do this in TypeScript I will use the TsMonad library.

A given string will be placed into a Maybe box, but the exact option type could be Maybe.just or Maybe.nothing, depending on the prefix.

import {Maybe} from 'tsmonad'
function toUrlString (s:string):Maybe<string> {
  return s.indexOf('https://') === 0
    ? Maybe.just(s) : Maybe.nothing<string>()
}

We are not crashing the program; instead, we allow the caller to perform actions as if the box contained a correct value. For example, we can write a function that returns the domain and run it on the wrapped value using the lift(cb) method. The callback passed to lift(cb) is executed only for values placed into a Maybe.just instance, so we guarantee that it is called with a valid https:// string!

const domain = (s:string) => s.substr(8)
console.log(
  toUrlString('https://foo.com')
    .lift(domain)
    .valueOr('missing https:// url')
)
// foo.com
console.log(
  toUrlString('foo.com')
    .lift(domain)
    .valueOr('missing https:// url')
)
// missing https:// url
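To make the behavior of lift and valueOr concrete without installing anything, here is a tiny hand-rolled box. It is only an illustration of the idea, not TsMonad's actual implementation; MiniMaybe, toMiniUrl and countedDomain are names I made up, and the counter is there to show that the callback is skipped for the "nothing" case.

```typescript
// Hand-rolled stand-in for illustration only; TsMonad's real classes differ.
class MiniMaybe<T> {
  private constructor(
    private readonly has: boolean,
    private readonly value?: T
  ) {}

  static just<T>(value: T): MiniMaybe<T> {
    return new MiniMaybe<T>(true, value)
  }

  static nothing<T>(): MiniMaybe<T> {
    return new MiniMaybe<T>(false)
  }

  // Apply cb only when a value is present; otherwise stay "nothing".
  lift<U>(cb: (value: T) => U): MiniMaybe<U> {
    return this.has ? MiniMaybe.just(cb(this.value as T)) : MiniMaybe.nothing<U>()
  }

  // Unwrap the value, or use the fallback for "nothing".
  valueOr(fallback: T): T {
    return this.has ? (this.value as T) : fallback
  }
}

const toMiniUrl = (s: string): MiniMaybe<string> =>
  s.indexOf('https://') === 0 ? MiniMaybe.just(s) : MiniMaybe.nothing<string>()

// Counts invocations so we can see when the callback really runs.
let domainCalls = 0
const countedDomain = (s: string): string => {
  domainCalls += 1
  return s.slice('https://'.length)
}

console.log(toMiniUrl('https://foo.com').lift(countedDomain).valueOr('missing https:// url'))
// foo.com
console.log(toMiniUrl('foo.com').lift(countedDomain).valueOr('missing https:// url'))
// missing https:// url
console.log(domainCalls) // 1
```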

We can even control if we need to call console.log at all for invalid values.

toUrlString('https://foo.com')
  .lift(domain)
  .lift(console.log)
// foo.com
toUrlString('foo.com')
  .lift(domain)
  .lift(console.log)
// does nothing, `domain` and `console.log` are not called

We can even have a "fork" and run separate callbacks depending on the data

toUrlString('https://foo.com')
  .lift(domain)
  .caseOf({
    just: s => console.log('great domain', s),
    nothing: () => console.log('invalid url')
  })
// great domain foo.com
toUrlString('foo.com')
  .lift(domain)
  .caseOf({
    just: s => console.log('great domain', s),
    nothing: () => console.log('invalid url')
  })
// invalid url

Nice, although in the last case I would switch to Either to preserve the original value.

import {Either} from 'tsmonad'
function toUrlString (s:string):Either<string, string> {
  return s.indexOf('https://') === 0
    ? Either.right(s) : Either.left(s)
}
const domain = (s:string) => s.substr(8)
toUrlString('https://foo.com')
  .lift(domain)
  .caseOf({
    right: d => console.log('great domain', d),
    left: s => console.error('invalid original url', s)
  })
// great domain foo.com
toUrlString('foo.com')
  .lift(domain)
  .caseOf({
    right: d => console.log('great domain', d),
    left: s => console.error('invalid original url', s)
  })
// invalid original url foo.com

Notice how we placed the "good" input into Either.right and the "bad" input into Either.left, and then matched the property inside .caseOf. If the value is Either.right, then it will be transformed by every lift(cb) call, while the Either.left value stays the original, unchanged one.
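The "right is transformed, left passes through" rule can be sketched with a plain tagged union. Again, this is my own minimal stand-in and not how TsMonad implements Either; MiniEither, checkUrl and liftRight are hypothetical names.

```typescript
// Hand-rolled stand-in for illustration; not TsMonad's implementation.
type MiniEither<L, R> =
  | { kind: 'left'; value: L }
  | { kind: 'right'; value: R }

const checkUrl = (s: string): MiniEither<string, string> =>
  s.indexOf('https://') === 0
    ? { kind: 'right', value: s }
    : { kind: 'left', value: s }

// Transform a right value; pass a left value through untouched.
function liftRight<L, R, U>(e: MiniEither<L, R>, cb: (value: R) => U): MiniEither<L, U> {
  return e.kind === 'right' ? { kind: 'right', value: cb(e.value) } : e
}

const toDomain = (s: string): string => s.slice('https://'.length)
const toUpper = (s: string): string => s.toUpperCase()

const goodResult = liftRight(liftRight(checkUrl('https://foo.com'), toDomain), toUpper)
console.log(goodResult.kind, goodResult.value) // right FOO.COM

const badResult = liftRight(liftRight(checkUrl('foo.com'), toDomain), toUpper)
console.log(badResult.kind, badResult.value) // left foo.com
```

Both callbacks ran for the valid url, and neither touched the invalid one, so the original input is still available for the error message.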

The TypeScript compiler and VSCode even help me not to forget to handle every path. For example, if I do not specify the left execution path inside caseOf({right: ...}), the compiler will complain:

toUrlString('foo.com')
  .lift(domain)
  .caseOf({
    right: d => console.log('great domain', d)
  })
// tsc
// Property 'left' is missing in type '{ right: (d: string) => void; }'.
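The same "did I handle every path" check is available without a library, using a tagged union and a `never` assignment in the default branch. This is a sketch of a different technique from caseOf, with names (UrlCheck, classifyUrl, describeUrl) that I made up for the example.

```typescript
type UrlCheck =
  | { kind: 'valid'; url: string }
  | { kind: 'invalid'; input: string }

function classifyUrl(s: string): UrlCheck {
  return s.indexOf('https://') === 0
    ? { kind: 'valid', url: s }
    : { kind: 'invalid', input: s }
}

function describeUrl(check: UrlCheck): string {
  switch (check.kind) {
    case 'valid':
      return `great domain ${check.url.slice('https://'.length)}`
    case 'invalid':
      return `invalid original url ${check.input}`
    default: {
      // If a new variant is added to UrlCheck and not handled above,
      // `check` is no longer `never` here and this assignment fails to compile.
      const unreachable: never = check
      return unreachable
    }
  }
}

console.log(describeUrl(classifyUrl('https://foo.com'))) // great domain foo.com
console.log(describeUrl(classifyUrl('foo.com'))) // invalid original url foo.com
```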

Nice and helpful, and probably a better runtime protection than trying to come up with nominal types to catch the invalid data.

Related info

I highly recommend reading TypeScript Deep Dive online. Excellent resource that covers a lot of TS topics with lots of code examples. Mostly Adequate Guide to Functional Programming is more than adequate :)