Validation is Type Constraint

Thinking about validation in the context of types

Sat Dec 30 2023

When it comes to developing software, validation is a common and familiar task. All systems take some form of input that needs to be validated before it can be processed.

For example, we might have a simple HTTP API that lets clients fetch a user via GET /users/{userId}, where userId is a UUID. Since HTTP allows arbitrary strings to be passed in the URL, we need to check that the userId is a valid UUID before we can process the request. This allows us to do two things:

  • Return a friendly error if the validation fails
  • Catch errors earlier to prevent any unexpected behaviour later in the program

One way we might do this is:

userIdQueryParam := req.URL.Query().Get("userId")


if !isValidUuid(userIdQueryParam) {

 return nil, errors.New("userId is not a valid UUID")

}


return fetchUserFromDb(userIdQueryParam), nil

(Using Golang purely for demonstration purposes in this article)

So this approach works fine. We check the userId parameter on every request and return a sensible error if it's not valid. The code is also clear and easy to understand, so let's give ourselves a pat on the back for that!

But we might have created a slight issue downstream in our code here. Let's take a closer look at the fetchUserFromDb function. The exact contents aren't super important but let's take a look at the signature:

func fetchUserFromDb(userId string) User {

 ...

}

The userId parameter is a string. This isn't great since we know in our application domain that the user ID is a UUID. Since we want functions to be re-usable, we would technically have to re-validate the userId string parameter at the top of this function to make sure it isn't being given bad data from some other part of the program.

Ok, so far nothing too complicated. We can easily improve on this:

userIdQueryParam := req.URL.Query().Get("userId")


if !isValidUuid(userIdQueryParam) {

 return nil, errors.New("userId is not a valid UUID")

}


userId := parseUuid(userIdQueryParam)


return fetchUserFromDb(userId), nil

func fetchUserFromDb(userId UUID) User {

 ...

}

Now we've changed the fetchUserFromDb function to only accept a UUID type as the user ID. This makes it simpler and more predictable. We can push messy logic like validation to the edge and keep our internal business logic nice and clean.

There is still a small issue here though. The if statement and the parseUuid call are two separate steps, and as the code changes over time they could drift apart. If the userIdQueryParam variable gets modified between them, or if the if statement gets changed or removed, the parseUuid call might fail on input that was never validated.

Golang actually helps us here and lets us combine the two:

userIdQueryParam := req.URL.Query().Get("userId")


userId, err := parseUuid(userIdQueryParam)


if err != nil {

 return nil, errors.New("userId is not a valid UUID")

}


return fetchUserFromDb(userId), nil

The second line is super important here because we do 2 things at the same time:

  • Validation: Execute validation logic to check if the provided string is a valid UUID
  • Type Constraint: "Cast" the original string value into a new one of type UUID

We've combined these 2 concepts into a single statement thus simplifying and de-duplicating a bunch of logic from our code. It turns out that the logic required to parse a string into a UUID is the exact same as the logic required to validate a string as a UUID. So any time we do validation in code, we can model it as type constraint!
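To make the "parsing is validation" point concrete, here is a minimal sketch of what parseUuid could look like using only the standard library. The UUID type (a 16-byte array) and the decision to accept only the canonical 8-4-4-4-12 hex form are assumptions made for illustration; a real project would more likely lean on an existing UUID library. Notice that isValidUuid ends up as a thin wrapper around the parser.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// UUID is a hypothetical 16-byte value type used for illustration only.
type UUID [16]byte

// parseUuid validates s and, in the same pass, produces the constrained type.
// Assumption: only the canonical 8-4-4-4-12 hex layout is accepted.
func parseUuid(s string) (UUID, error) {
	var id UUID
	if len(s) != 36 || s[8] != '-' || s[13] != '-' || s[18] != '-' || s[23] != '-' {
		return id, errors.New("not a valid UUID: expected 8-4-4-4-12 layout")
	}
	// Strip the separators and decode the remaining hex digits.
	raw, err := hex.DecodeString(strings.ReplaceAll(s, "-", ""))
	if err != nil || len(raw) != 16 {
		return id, errors.New("not a valid UUID: expected 32 hex digits")
	}
	copy(id[:], raw)
	return id, nil
}

// isValidUuid needs no logic of its own: a string is valid exactly when it parses.
func isValidUuid(s string) bool {
	_, err := parseUuid(s)
	return err == nil
}

func main() {
	id, err := parseUuid("123e4567-e89b-12d3-a456-426614174000")
	fmt.Println(id, err)
	fmt.Println(isValidUuid("not-a-uuid"))
}
```

Any function that takes a UUID from this sketch can rely on the value having passed through parseUuid at some point, which is the whole appeal of the approach.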

And type constraint is exactly that. Constraint.

Constraining the type of a variable reduces its size. Remember, a type is simply a set of possible values. So in our example we are constraining the string type, which has a (theoretically) infinite number of text values, to the UUID type, which has a finite set of values. It makes perfect sense that using smaller, more constrained types in our program creates code that is simpler and easier to read and maintain!

Type constraint can also be composed/chained. Think of the example of a date string. We might in some scenarios require the date to be in the future. So we would first validate that the string is a valid date and only then check that it is in the future. In other words, we constrain the type twice. Such constraints can be represented as a subset relationship:

future date ⊂ date ⊂ string

Some fun food for thought in that example: a "future date" type can't be fully enforced by the compiler, since types are checked at compile time while whether a date is in the future depends on when the code runs.

Let's take a look at this principle in action in another scenario:

user := fetchUserFromDb(userId)


if user.Role != "admin" {

 return errors.New("user is not an admin")

}


login(*user.AdminCredentials)

Here the AdminCredentials field is nil for non-admin users, so the dereference on the last line is only safe because of the role check above it. Type constraint can help us here as well:

user := fetchUserFromDb(userId)


adminUser, err := parseAdminUser(user)


if err != nil {

 return errors.New("user is not an admin")

}


login(adminUser.AdminCredentials)

Now we wrap up the logic for checking that a user is an admin together with any logic to parse admin-specific fields. That lets us remove the risky dereference on the last line. It also makes future development on the code below the if statement safer: the admin role of the user is now represented in the type system, so more of the logic can be checked at compile time.
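For illustration, parseAdminUser might look something like the sketch below. The Credentials struct and the exact fields on User and AdminUser are assumptions, since the article doesn't define them. The key detail is that AdminUser holds its credentials as a plain value rather than a pointer, so there is nothing left to nil-check after parsing.

```go
package main

import (
	"errors"
	"fmt"
)

// Credentials is a hypothetical stand-in for whatever login() needs.
type Credentials struct{ Token string }

// User is the general type: AdminCredentials is nil for non-admin users.
type User struct {
	Role             string
	AdminCredentials *Credentials
}

// AdminUser is the constrained type: credentials are always present.
type AdminUser struct {
	AdminCredentials Credentials
}

// parseAdminUser checks the role and extracts the admin-only fields in one step.
func parseAdminUser(u User) (AdminUser, error) {
	if u.Role != "admin" {
		return AdminUser{}, errors.New("user is not an admin")
	}
	if u.AdminCredentials == nil {
		return AdminUser{}, errors.New("admin user has no credentials")
	}
	return AdminUser{AdminCredentials: *u.AdminCredentials}, nil
}

func main() {
	admin, err := parseAdminUser(User{
		Role:             "admin",
		AdminCredentials: &Credentials{Token: "secret"},
	})
	fmt.Println(admin.AdminCredentials.Token, err)

	_, err = parseAdminUser(User{Role: "user"})
	fmt.Println(err)
}
```

The only dereference now lives inside parseAdminUser, right next to the checks that make it safe.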

TypeScript has some quite clever automatic type narrowing for doing stuff like this. Given a union type whose members are distinguished by the role field:

type User = {

 role: "admin";

 adminCredentials: string;

} | {

 role: "user";

}

The code:

const user: User = fetchUserFromDb(userId)


login(user.adminCredentials)

won't compile until we add the check:

const user: User = fetchUserFromDb(userId)


if (user.role !== "admin") {

 throw new Error("user is not admin")

}


login(user.adminCredentials)

So next time you find yourself doing validation in your code, keep the idea of type constraint in mind. It might just help keep your code cleaner and easier to maintain!