AWS CDK Lambda and DynamoDB dependency management

Managing dependencies between Lambdas and DynamoDB tables can get ugly.
The default way of giving a Lambda function access to a DynamoDB table looks like this:

const tableHandle = new dynamodb.Table(stack, "Table", {
  partitionKey: { name: "id", type: dynamodb.AttributeType.STRING },
});

const functionHandle = new TypeScriptFunction(stack, "Add-Function", {
  entry: require.resolve("@calculator/add/src/handler.ts"),
  environment: {
    TABLE_NAME: tableHandle.tableName, // adding the table name to the environment
  },
});

tableHandle.grantReadWriteData(functionHandle); // grant the lambda access

And then in your code, you'd do:

await this.documentClient
  .scan({ TableName: process.env.TABIE_NAME }) // using the env variable from lambda definition
  .promise();

As you probably already know, this pattern comes with some potential issues.

  1. First is the problematic usage in code - nothing verifies that the environment variable name set on the function matches the name you read in the code. (I actually did put a typo there, did you spot it?)
    You can mitigate this a bit, for example by never using the env variables directly and going through centralized functions instead, like:
const getTableName = () => process.env.TABLE_NAME;

Still, no verification is happening, and if someone removes the environment variable or changes its name, you won't know until you get a runtime error.

  2. Another problem is the need to pass handles around. For small stacks that might actually have only one function and one table, that's a non-issue, but if you have a large application with tens or even hundreds of lambdas and multiple tables, it gets ugly.

  3. Related to number 2 - since you have to pass things around, they have to be introduced in order. Let's say we want to add a lambda that watches the stream of events in that table and maybe builds some cache or aggregation in another table. It has to be declared after the initial table. Then let's add another function that reads from that cache. That order might seem correct, and if you are happy to keep things that way - great! Nonetheless - you should not be forced to. Sometimes it makes more sense to group and order things by functionality, not by their dependency order.

  4. You have to remember to grant the lambda function permissions on the table. It seems like a sensible thing to do, but when you think about it - it wouldn't make sense to add the environment variable if we didn't also grant the permissions. Similarly - it would not make sense to grant permissions if we didn't somehow tell the lambda how to connect to the table. That means we should be able to do this in one step. (Again, this is a frequent source of errors that only show up at run time.)

  5. Handles are only typed as a generic CDK Lambda function or DynamoDB table. That means that if you need to pass many of them around, there is no way to spot a mix-up before, again, a run-time error.
    Consider a lambda function that requires access to multiple tables:

const createTablesAggregator = (
  stack: Stack,
  someTable: ITable,
  otherTable: ITable,
  yetAnotherTable: ITable
) => {
  new TypeScriptFunction(stack, "Aggregator-Function", {
    entry: require.resolve("@calculator/aggregator/handler.ts"),
    environment: {
      SOME_TABLE: someTable.tableName,
      OTHER_TABLE: otherTable.tableName,
      YET_ANOTHER_TABLE: yetAnotherTable.tableName,
    },
  });
};

and then somewhere else you would call:

createTablesAggregator(stack, someTable, yetAnotherTable, otherTable);

TypeScript has no way of catching this mistake (the last two arguments are swapped, but all three are plain ITable values), so everything would deploy. In the best-case scenario things would simply not work. In the worst case, you might mess up the tables that were passed in the wrong order (maybe their schemas were compatible, and your code successfully performed an operation that should have happened in the other table). Again - for a small stack this might seem like a non-issue. However, once you have a large one, and multiple people change the CDK code at the same time, it's very easy to mess this up.

What?

By now you are hopefully convinced that there is room for improvement. Our solution is based on a central "registry" for Lambdas and DynamoDB tables.

The registry allows you to later reference those constructs by name instead of passing them around (which takes care of problems 2, 3, and 5).

registerTable(stack, AvailableTables.TABLE, {
  partitionKey: { name: "id", type: dynamodb.AttributeType.STRING },
}); // registerTable is a custom wrapper, trivial to implement yourself, see example below

new ToolkitFunction(stack, AvailableLambdas.ADD, {
  entry: require.resolve("@calculator/add/src/handler.ts"),
  addDependencies: [addTables(AvailableTables.TABLE)],
});

Using addDependencies automatically adds the permissions (read-write by default; it is trivial to add an option for more limited permissions), which takes care of problem number 4.
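
The registry itself is not shown in this post, so here is a minimal sketch of how such a wrapper could work. It is not the actual implementation. TableLike and FunctionLike are hypothetical stand-ins that mirror only the tableName, grantReadWriteData and addEnvironment members of the CDK constructs. This registerTable takes an already constructed table instead of a stack and props, and the pending map is an assumption added to show how dependencies could be declared in any order.

```typescript
// Hypothetical sketch of a table registry; names and shapes are assumptions.
enum AvailableTables {
  TABLE,
  CACHE,
}

// Structural stand-ins for the parts of the CDK constructs used below.
interface FunctionLike {
  addEnvironment(key: string, value: string): unknown;
}
interface TableLike {
  tableName: string;
  grantReadWriteData(grantee: FunctionLike): unknown;
}

type Dependency = (fn: FunctionLike) => void;

const tables = new Map<AvailableTables, TableLike>();
// Functions that asked for a table before it was registered.
const pending = new Map<AvailableTables, FunctionLike[]>();

// Expose the table name and grant access in a single step.
const wire = (name: AvailableTables, table: TableLike, fn: FunctionLike): void => {
  fn.addEnvironment(`DYNAMODB_${AvailableTables[name]}`, table.tableName);
  table.grantReadWriteData(fn);
};

const registerTable = (name: AvailableTables, table: TableLike): void => {
  if (tables.has(name)) {
    throw new Error(`Table ${AvailableTables[name]} is already registered`);
  }
  tables.set(name, table);
  // Wire up every function that was declared before this table.
  (pending.get(name) ?? []).forEach((fn) => wire(name, table, fn));
  pending.delete(name);
};

// Returns a dependency that is applied to a function once it exists.
const addTables = (...names: AvailableTables[]): Dependency => (fn) => {
  names.forEach((name) => {
    const table = tables.get(name);
    if (table) {
      wire(name, table, fn);
    } else {
      pending.set(name, [...(pending.get(name) ?? []), fn]);
    }
  });
};
```

A function wrapper could then apply each entry of addDependencies to itself after construction. Because unresolved tables are queued, a lambda can be declared before the table it depends on.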

We are left with problem number 1, which is solved by using a helper function in your code:

export const getDynamoTableName = (tableName: AvailableTables) =>
  process.env[`DYNAMODB_${AvailableTables[tableName]}`];

getDynamoTableName(AvailableTables.TABLE)
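
The helper above still returns undefined when the variable is missing. If you would rather fail loudly at the first lookup, a small variant could look like the sketch below. The name requireDynamoTableName, the Env type and the error text are assumptions made for illustration, and the environment is passed in explicitly (process.env inside a Lambda) so the function stays easy to test.

```typescript
// Hypothetical fail-fast variant of the helper; not part of the original code.
enum AvailableTables {
  TABLE,
}

type Env = Record<string, string | undefined>;

// Inside a Lambda handler, pass process.env as `env`.
const requireDynamoTableName = (tableName: AvailableTables, env: Env): string => {
  // Numeric enums map back to their key, e.g. AvailableTables[0] === "TABLE".
  const key = `DYNAMODB_${AvailableTables[tableName]}`;
  const value = env[key];
  if (!value) {
    throw new Error(`Missing environment variable ${key}`);
  }
  return value;
};
```

Note that the reverse lookup AvailableTables[tableName] relies on AvailableTables being a numeric enum; a string enum would need a different way of deriving the key.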

To see how this all connects together, take a look at the dependencyManagement branch of our [xolvio/aws-sales-system-example/dependencyManagement](https://github.com/xolvio/aws-sales-system-example/tree/dependencyManagement)

Let me know if you have any questions or thoughts in the comments below.

