
How clean-architecture solved so many of our Serverless problems


For months I deliberated over our problems running our serverless microservices locally. They were pretty standard: Node Lambdas, the Serverless Framework, DynamoDB, etc, etc. All very standard. However, running these things locally started to become a nightmare. We had started using GraphQL (AppSync) to create a schema, and an API Gateway between our services and our user interfaces. It was all coming together nicely, once it was deployed. However, there are virtually no reliable tools for running AppSync locally; at least none that are well adopted or stable, and none that quite fit our schema-stitching approach.

It now seemed impossible to run our services locally, which was an embarrassing defeat in many ways. Having to explain to new engineers, 'well, we kind of don't run these locally anymore, we don't know how to' 😅.

Like many aspiring engineers, I picked up a copy of Uncle Bob's Clean Architecture. While reading it, most of the concepts weren't new to me at all: decoupling code, writing abstractions to separate business logic, and so on. This was mostly how I learnt to code, back when I was writing PHP in Symfony and Laravel.

However, during the serverless revolution, some coherence was lost along the way. Architecture became a casualty of the serverless promise of productivity. It felt, for some time, that nobody knew what the new best practices were. I heard many claims that you 'didn't need to worry' about architecture, because they were 'just functions'. All of the examples you see are single functions, littered with direct database calls, references to specific data storage services, etc, etc. It doesn't help, of course, that most of the tutorials and examples filter down from the cloud gods themselves. Of course Amazon will litter their Lambdas with references to DynamoDB. Of course Google will litter their Cloud Functions with references to Cloud Datastore. It's inevitable; it's their job to make their services as 'sticky' as possible. They want you drinking from their deep vat of Kool-Aid. But it is your job, as engineers and architects, to keep them at arm's length.

Part of being a good engineer, I think, is keeping opportunities open for your code further down the line, which means abstractions and modularity; it means keeping these services at arm's length. These are, of course, basic principles, but principles often lost in the new, vendor-driven world of cloud programming.

But there are no new best practices; it's comforting to realise that the same old best practices still hold true. We just need to re-engage with them in our new cloud world.

Here's an example of a typical Lambda function you might see:

// Assumes the aws-sdk v2 DocumentClient.
const AWS = require('aws-sdk');
const client = new AWS.DynamoDB.DocumentClient();

const getUser = async (id) => {
  const { Item } = await client.get({
    TableName: 'users',
    Key: { id },
  }).promise();
  return Item;
};

const getOrders = async (id) => {
  const { Items } = await client.query({
    TableName: 'orders',
    KeyConditionExpression: "userId = :uid",
    ExpressionAttributeValues: {
      ":uid": id,
    },
  }).promise(); // .promise() was missing here, so Items would have been undefined
  return Items;
};

const handler = async ({ body }) => {
  try {
    const { id } = body;
    const userPromise = getUser(id);
    const ordersPromise = getOrders(id);
    const user = await userPromise;
    const orders = await ordersPromise;
    return {
     lifeTimeValue: user.lifeTimeValue,
     orders,
    };
  } catch (e) {
    return {
      error: e.message,
    };
  }
};

module.exports = {
  handler,
};

This code probably doesn't ring any alarm bells if you've ever used Lambda or serverless technologies in general. In fact, it's pretty impressive that this is all you need to fetch some user metadata, and some orders for that user. Painless!

That is, until you have 'forty-something' functions, all loosely related to one another, with a 'helpers' folder full of functions that are kind of re-usable, but kind of domain-specific. It's so easy to write a single function, and the tendency is for teams to stick with this pattern, because it's immediately very simple to be productive.

But isn't this a classic programming problem? Not investing time up-front to make things easy to change further down the line? Of course, and hindsight is 20/20, sure. However, in my experience writing serverless code in multiple, highly competent teams, it's not entirely an issue of hindsight. It's an issue of serverless technology being new, and people no longer being sure what the best practices are. Those best practices of yore have become lost in the contemporary.

Back to my initial problem. We had just started using AppSync in production. What a joy; AWS really dropped a banger with this feature. However, using it meant changing our Lambdas to no longer sit behind an API Gateway, which meant no more serverless-offline plugin, which meant we couldn't run the UI and test our back-end changes anymore. Damn... We found ourselves running $ serverless deploy against our dev environment just to test some minor back-end changes.

One good thing came of this, though: it made us rely more on having better tests in place for more immediate feedback. But that still wasn't a complete solution. When you're developing new features end to end, you want to be able to experiment and understand how your user interface engages with those changes. So that's where we were, hammering our AWS development account with continuous, often conflicting changes, which slowed down our development speed immensely. It felt like swimming in crude oil. It was unacceptable.

By this point I'd just finished Uncle Bob's Clean Architecture, and I had a proof-of-concept service on my hands. I wrote it in Go; I'll admit, I'm a much better programmer in Go than I am in JavaScript, and I often wonder why that is. After reading Uncle Bob's book, I realised I write Go using dependency injection, using interfaces, and safeguarding domains from technologies. I wondered why I didn't do the same in JavaScript. Go's more object-oriented, I suppose... but you can still use dependency injection in JavaScript, and classes too. I took Uncle Bob's advice when it came to splitting out 'deliveries' when writing this new service.

A delivery, in clean architecture terms, is a very thin layer at the outward-facing edge of your application, which extracts the input from the delivery mechanism. For example, it gets the user id from the HTTP POST body, or the order id from the GraphQL input, or from a CLI argument. A delivery is where you wash away any trace of how that data came into your system, so all that's left is the information your business logic needs.
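For instance, here's a minimal sketch of the idea; the usecase these deliveries call, and the input shapes, are hypothetical:

'use strict';

// Two hypothetical deliveries feeding the same usecase. Each one only
// extracts the input from its own mechanism; nothing else leaks through.

// HTTP delivery: pull the user id out of the POST body.
const httpDelivery = (getOrdersUsecase) => async (event) => {
  const { id } = JSON.parse(event.body);
  return getOrdersUsecase(id);
};

// CLI delivery: pull the user id out of the process arguments.
const cliDelivery = (getOrdersUsecase) => async () => {
  const [id] = process.argv.slice(2);
  return getOrdersUsecase(id);
};

module.exports = { httpDelivery, cliDelivery };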

The next thing I did is, again, actually ancient advice: your business logic should exist in isolation, completely unaware of its surroundings. It should focus purely on interacting with your data model and performing business-level logic. It shouldn't know anything about how the data got there; your delivery layer should have scrubbed away that information. It shouldn't know anything about the database, end of. Your entities and repositories should hide those details.

I struck upon a eureka moment during this project: I wrote a GraphQL server to mimic AppSync locally, so that I could test against the UI. I had followed the clean architecture approach, so all the complex business logic was abstracted out of the Lambda function into a 'usecase', which knew nothing of Lambdas, or HTTP. So all I had to do to test my GraphQL schema, and my new functionality alongside the UI, was to write a thin GraphQL wrapper which just called the correct usecases. I took note that this process took half an hour...
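To make that concrete, here's a minimal sketch of such a wrapper, assuming the apollo-server package. The schema, the stub repositories, and the require path are illustrative; the usecase factory is the one shown later in this post:

'use strict';

const { ApolloServer, gql } = require('apollo-server');
// The usecase factory shown later in this post; the path is hypothetical.
const makeGetOrders = require('./usecases/get-orders');

// Stub repositories, so this runs with no AWS account at all.
const userRepository = { get: async (id) => ({ id, lifeTimeValue: 42 }) };
const orderRepository = { listByUser: async () => [] };
const getOrders = makeGetOrders(userRepository, orderRepository);

// An illustrative schema; ours was stitched together across services.
const typeDefs = gql`
  type Order { id: ID }
  type UserOrders { lifeTimeValue: Float, orders: [Order] }
  type Query { userOrders(id: ID!): UserOrders }
`;

const resolvers = {
  Query: {
    // The resolver is just another delivery: extract the input, call the usecase.
    userOrders: (_, { id }) => getOrders(id),
  },
};

new ApolloServer({ typeDefs, resolvers })
  .listen()
  .then(({ url }) => console.log(`Local AppSync stand-in at ${url}`));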

It felt good, and reassuring, that old advice was ultimately the solution. Computer science is relatively new; knowledge in the field is transient and ever-changing. So any semblance of continuity between older theories and new technologies is pleasing. It means they're standing the test of time, and we can move forward more assuredly when these problems arise.

So now here's what a Lambda function might look like:

'use strict';

const AWS = require('aws-sdk');
const DynamoDBRepository = require('../../repositories/dynamodb');

// One DocumentClient, shared by a repository per table.
const dynamodb = new AWS.DynamoDB.DocumentClient();
const userRepository = new DynamoDBRepository(dynamodb, 'users');
const orderRepository = new DynamoDBRepository(dynamodb, 'orders');

// Dependency injection via a higher-order function: hand the factory its
// repositories once, get back the usecase itself.
const getOrdersUsecase = require('../../usecases/get-orders')(
  userRepository,
  orderRepository,
);

const getOrders = async ({ body }) => {
  const { id } = body;
  return getOrdersUsecase(id);
};

module.exports = getOrders;

First of all, we initialise a repository for DynamoDB to interface with each of our two tables; we then dependency-inject those into our usecase using a higher-order function. Then, in the actual Lambda, we extract the information we need from the invocation args, in our case a user id, and pass it into our usecase. This Lambda looks a little bare, which is correct: our business logic now lives in our usecase.

A note on nomenclature
Many people seem to hate the term 'usecase'. Personally, it makes sense to me: it is literally a use case, a business use case. But if you have beef with the term, it's just a term. Use your own lexicon; as long as it's consistent, it doesn't really matter. You could call them 'actions', as I've seen in many places too.

Let's take a look at our usecase, or action, or whatever you decided on:

'use strict';

const User = require('../entities/User');
const Order = require('../entities/Order');

// The higher-order function again: repositories in, usecase out.
const getOrders = (userRepository, orderRepository) => async (userId) => {
  const userEntity = new User(userRepository);
  const orderEntity = new Order(orderRepository);
  try {
    // Kick off both requests before awaiting, so they run concurrently.
    const userPromise = userEntity.get(userId);
    const ordersPromise = orderEntity.listByUser(userId);
    const user = await userPromise;
    const orders = await ordersPromise;
    return {
      lifeTimeValue: user.lifeTimeValue,
      orders,
    };
  } catch (e) {
    return {
      error: e.message,
    };
  }
};

module.exports = getOrders;

There's much more going on in here! But you'll notice one thing, I hope: there's no reference to anything other than your business domains (an order, a user). You won't see a Lambda context arg or an HTTP request in here. You won't find a DynamoDB client or an S3 bucket lurking amidst the lines of business logic. This is the true value of this approach.

When the next, inevitable maelstrom of cloud technology engulfs us all once again, and you need to port your fusty old serverless code to your new quantum hyperfunction serverlessless framework, you just write a new delivery layer at the top. When your business then decides that NoSQL is a proverbial dead horse, you can write a repository that interfaces with your new P2P plasma blockchain database, or whatever.

But more immediately, if you want to... you know, be able to run the damn thing locally, you can write deliveries and repositories that let you do that, away from your infrastructure.
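For instance, here's a sketch of an in-memory repository (the seed items are hypothetical) that exposes the same get/listByUser interface as the DynamoDB repository shown further down:

'use strict';

// An in-memory stand-in for local runs and tests. It exposes the same
// interface as the DynamoDB repository, so nothing downstream notices.
class InMemoryRepository {
  constructor(items = []) {
    this.items = items;
  }

  async get(id) {
    return this.items.find((item) => item.id === id);
  }

  async listByUser(uid) {
    return this.items.filter((item) => item.userId === uid);
  }
}

module.exports = InMemoryRepository;

Construct your usecase with this instead of the DynamoDB repository, and the rest of the code is none the wiser.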

Now let's see an entity:

'use strict';

class Order {
  constructor(repository) {
    this.repository = repository;
  }

  listByUser(uid) {
    return this.repository.listByUser(uid);
  }
}

module.exports = Order;

The aim of your entity is to encapsulate your data model at an application level, before it's persisted to the database. This is a very basic demo; you could be much more 'purist' about what you do here, but for the sake of this demo we're just encapsulating some state.
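If you did want to be more 'purist', the entity is the natural home for business rules. Here's a sketch; the uid check and the cancelled-order rule are purely hypothetical examples of invariants:

'use strict';

// A more opinionated Order entity: it owns business rules, not just state.
class Order {
  constructor(repository) {
    this.repository = repository;
  }

  async listByUser(uid) {
    if (!uid) {
      throw new Error('an order must belong to a user');
    }
    const orders = await this.repository.listByUser(uid);
    // Business-level rules live here, away from any database detail.
    return orders.filter((order) => order.status !== 'cancelled');
  }
}

module.exports = Order;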

And a repository:

'use strict';

class DynamoDBRepository {
  constructor(client, table) {
    this.client = client;
    this.table = table;
  }

  async get(id) {
    const { Item } = await this.client.get({
      TableName: this.table,
      Key: { id },
    }).promise();
    return Item;
  }

  async listByUser(uid) {
    // Note: this.client, not a global client, and don't forget .promise().
    const { Items } = await this.client.query({
      TableName: this.table,
      KeyConditionExpression: "userId = :uid",
      ExpressionAttributeValues: {
        ":uid": uid,
      },
    }).promise();
    return Items;
  }
}

module.exports = DynamoDBRepository;

The repository encapsulates the interactions with a particular database technology. Your repositories are technology-specific, but they all expose a common set of functions.

You might be looking at all of this and thinking, 'I just had a single function that did all of that before; now there's an entity, a delivery, a usecase, a repository, all those files for such a simple thing, you architecture zealot!'. This is the thinking that leads to good programmers grappling with hundreds of Lambda functions, completely entangled with technology-specific code, that are hard to change, hard to test, hard to even run locally. It starts out as a single function, but as with all architectures, it should be an investment. Even if it means some upfront burden, you will thank yourself when your codebase is 10x the size and you have to change something fundamental.
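And 'hard to test' largely evaporates. Here's a sketch of a unit test for the usecase, Jest-style, with stub repositories; the require path and expected values are illustrative:

'use strict';

const getOrders = require('../usecases/get-orders'); // hypothetical path

test('returns the lifetime value and orders for a user', async () => {
  // Stub repositories: no DynamoDB, no AWS, no network.
  const userRepository = { get: async () => ({ lifeTimeValue: 100 }) };
  const orderRepository = { listByUser: async () => [{ id: 'order-1' }] };

  const result = await getOrders(userRepository, orderRepository)('user-1');

  expect(result).toEqual({
    lifeTimeValue: 100,
    orders: [{ id: 'order-1' }],
  });
});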

Summary

Protect your business logic from what are details; yes, a database is, or should be, a detail. As should the way your business logic was invoked in the first place. These principles seem to apply to everything from old monoliths to the new serverless world, and that's a wonderful thing.

This process is still very much a work in progress for me; there are still some rough edges in my understanding. But just grasping how these basics still apply to serverless has proved valuable already. If I'm woefully wrong, feel free to tell me how, or send any thoughts/suggestions. I'll try to get back to you, but the more articles I post, the more emails I get, mostly positive so far. So thank you 🙏🏻, and a huge apology if I don't get back to you.
