Getting Started with LocalStack: Overview, Setup, and Practical Usage Guide

In this tutorial, we will explore what LocalStack is, its benefits, and why it is essential for cloud development. I will also provide a step-by-step guide on installing LocalStack using Docker, and demonstrate how to use Terraform to provision mock AWS services on LocalStack: a Kinesis Stream, a Lambda Function, and a DynamoDB table.

We will do a PutRecord to our Kinesis Stream, our Lambda Function will consume the record from the stream, base64 decode the value, and then do a PutItem to DynamoDB.

Overview

LocalStack is an open-source tool that provides a fully functional local AWS cloud stack. It allows developers to emulate AWS services on their local machines, making it easier to develop and test cloud applications without needing access to live AWS resources. By simulating the behavior of AWS services, LocalStack provides a fast, cost-effective, and secure environment for cloud development and testing.

Description

LocalStack offers a comprehensive suite of tools and features that replicate the most commonly used AWS services, including S3, DynamoDB, Lambda, SQS, SNS, and more. It is designed to integrate seamlessly with existing development workflows, supporting various CI/CD tools and environments. With LocalStack, developers can run their cloud applications locally, perform automated tests, and ensure their code works as expected before deploying to a live AWS environment.

Why You Would Need It

  1. Cost Efficiency: Developing and testing directly on AWS can incur significant costs, especially when working with multiple services or running complex tests. LocalStack eliminates these costs by providing a free, local alternative.
  2. Faster Development Cycles: By running AWS services locally, developers can achieve faster iteration cycles. There is no need to wait for cloud deployments or deal with network latency, which speeds up the development process.
  3. Enhanced Testing Capabilities: LocalStack enables comprehensive testing of AWS-dependent applications in a controlled environment. This includes unit tests, integration tests, and end-to-end tests, all performed locally.
  4. Offline Development: Developers can work on their cloud applications even without an internet connection, ensuring productivity regardless of connectivity issues.
  5. Consistency and Reliability: LocalStack ensures that applications behave consistently in both local and cloud environments. This helps catch bugs and issues early in the development process, reducing the likelihood of problems in production.

Installing LocalStack with Docker Compose

We will deploy LocalStack on Docker using Docker Compose, and once it is up and running, we will use the AWS CLI to provision mock infrastructure.

Define a docker-compose.yaml:

docker-compose.yaml
version: '3.8'

services:
  localstack:
    container_name: localstack
    image: localstack/localstack:${LOCALSTACK_VERSION:-3.6.0}
    environment:
      - DEBUG=${DEBUG:-1}
      - AWS_ACCESS_KEY_ID=localstack
      - AWS_SECRET_ACCESS_KEY=localstack
      - AWS_EC2_METADATA_DISABLED=true
    ports:
      - "127.0.0.1:4566:4566"
      - "127.0.0.1:4510-4559:4510-4559"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./volume:/var/lib/localstack
    networks:
      - localstack-network

networks:
  localstack-network:
    name: localstack-network

For more configuration options, see their documentation.

After we have defined the above, we can start the LocalStack container:

docker compose up -d
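
Before provisioning anything, you can optionally confirm that LocalStack is up. Below is a minimal Python sketch (a hypothetical check_localstack.py, assuming the requests library is installed and that your LocalStack version exposes the /_localstack/health endpoint, which recent releases do):

check_localstack.py
import requests

# LocalStack serves a health endpoint on the edge port (4566)
response = requests.get("http://localhost:4566/_localstack/health", timeout=5)
response.raise_for_status()

# The response lists each emulated service and its status
for service, status in response.json().get("services", {}).items():
    print(f"{service}: {status}")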

Now we can start interacting with LocalStack to provision mock AWS services. If you don't have the AWS CLI installed, you can follow the AWS documentation to get it installed.

Since we are communicating with LocalStack rather than AWS, we need to override the endpoint in our AWS CLI commands. Where we would usually list S3 buckets with the following command:

aws --region eu-west-1 s3 ls /

We now need to pass the --endpoint-url flag, which looks like this:

aws --endpoint-url="http://localhost:4566" --region eu-west-1 s3 ls /

To avoid typing --endpoint-url every time, we can create an alias:

alias awslocal="aws --endpoint-url=http://localhost:4566 --region=eu-west-1"

Now we can simply run the following to list S3 buckets:

awslocal s3 ls /
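
The same endpoint override applies when scripting against LocalStack from Python with boto3: pass endpoint_url when creating the client. A minimal sketch, using the same dummy credentials we set in docker-compose.yaml:

import boto3

# Point the client at LocalStack instead of AWS
s3 = boto3.client(
    's3',
    region_name='eu-west-1',
    aws_access_key_id='localstack',
    aws_secret_access_key='localstack',
    endpoint_url='http://localhost:4566'
)

# Equivalent of: awslocal s3 ls
for bucket in s3.list_buckets().get('Buckets', []):
    print(bucket['Name'])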

We will create an S3 bucket and a DynamoDB table to use for our Terraform backend state. To create the S3 bucket:

awslocal s3 mb s3://backend-state

We can now create a file and upload it to our S3 bucket:

echo $RANDOM > file.txt
awslocal s3 cp file.txt s3://backend-state/file.txt
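
To confirm that the upload landed in the bucket, we could list its contents with awslocal s3 ls s3://backend-state/, or with a short boto3 sketch like the one below (same endpoint override and dummy credentials as before):

import boto3

s3 = boto3.client(
    's3',
    region_name='eu-west-1',
    aws_access_key_id='localstack',
    aws_secret_access_key='localstack',
    endpoint_url='http://localhost:4566'
)

# Equivalent of: awslocal s3 ls s3://backend-state/
for obj in s3.list_objects_v2(Bucket='backend-state').get('Contents', []):
    print(obj['Key'], obj['Size'])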

Now, to create our DynamoDB table, we will first write the table definition:

cat > table.json << EOF
{
    "TableName": "terraform-state-lock",
    "KeySchema": [
      { "AttributeName": "LockID", "KeyType": "HASH" }
    ],
    "AttributeDefinitions": [
      { "AttributeName": "LockID", "AttributeType": "S" }
    ],
    "ProvisionedThroughput": {
      "ReadCapacityUnits": 5,
      "WriteCapacityUnits": 5
    }
}
EOF

Now that we have written the table definition to table.json, we can create the DynamoDB table with the AWS CLI:

awslocal dynamodb create-table --cli-input-json file://table.json

We can verify that the table was created by listing our tables:

awslocal dynamodb list-tables
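
If you are scripting this setup, boto3 also ships a built-in waiter that blocks until the table becomes ACTIVE, which is handy when the table creation is part of an automated bootstrap. A small sketch against the same LocalStack endpoint:

import boto3

dynamodb = boto3.client(
    'dynamodb',
    region_name='eu-west-1',
    aws_access_key_id='localstack',
    aws_secret_access_key='localstack',
    endpoint_url='http://localhost:4566'
)

# Block until the table exists, then check its status
dynamodb.get_waiter('table_exists').wait(TableName='terraform-state-lock')
table = dynamodb.describe_table(TableName='terraform-state-lock')['Table']
print(table['TableStatus'])  # expected: ACTIVE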

Now our S3 bucket and DynamoDB table are created and ready to be used by Terraform to persist its state. Keep in mind that this is a testing environment, so we accept that we can lose our state at any time.

Manage LocalStack Infrastructure with Terraform

We can also manage our LocalStack infrastructure with Terraform by defining custom endpoints in our AWS provider configuration.

Kinesis, Lambda and DynamoDB Terraform LocalStack Example

Using Terraform and LocalStack, we will build the following workflow:

  1. The AWS CLI does a PutRecord on a Kinesis Stream with the data value "pizza", base64 encoded.
  2. The Kinesis Stream has an event trigger that invokes the Lambda Function.
  3. The Lambda Function receives the data in the event body and writes it to DynamoDB.
  4. The AWS CLI does a Scan on the DynamoDB table to preview the data.

This is our Architectural Diagram:

[Image: terraform-localstack-kinesis-lambda architecture diagram]

Create the workspace directory:

mkdir -p workspace/kinesis-dynamodb-lambda && cd workspace/kinesis-dynamodb-lambda

Create the directories for the lambda function:

mkdir -p lambda/order-processor/{deps,packages,src}

Inside the lambda/order-processor/deps directory create a requirements.txt file:

lambda/order-processor/deps/requirements.txt
boto3
requests

Inside lambda/order-processor/src/lambda_function.py, define the following function code:

lambda/order-processor/src/lambda_function.py
import boto3
from base64 import b64decode
from datetime import datetime as dt

# DynamoDB client pointed at LocalStack; the function runs inside a container,
# so it uses the LocalStack container hostname rather than localhost
ddb = boto3.Session(region_name='eu-west-1').client(
    'dynamodb',
    aws_access_key_id='localstack',
    aws_secret_access_key='localstack',
    endpoint_url='http://localstack:4566'
)

def decode_base64(string_to_decode):
    response = b64decode(string_to_decode).decode('utf-8')
    return response

def write_to_dynamodb(hashkey, event_id, value):
    response = ddb.put_item(
        TableName='orders',
        Item={
            'OrderID': {'S': hashkey},
            'EventID': {'S': event_id},
            'OrderData': {'S': value},
            'Timestamp': {'S': dt.now().strftime("%Y-%m-%dT%H:%M:%S")}
        }
    )
    return response

def lambda_handler(event, context):
    for record in event['Records']:
        event_id = record['eventID']
        # use a slice of the event id (part of the sequence number) as the hash key
        hashkey = event_id[-15:-1]
        # kinesis record data arrives base64 encoded
        value = decode_base64(record['kinesis']['data'])
        item = write_to_dynamodb(hashkey, event_id, value)
        print('EventID: {}, HashKey: {}, Data: {}'.format(event_id, hashkey, value))
        print('DynamoDB RequestID: {}'.format(item['ResponseMetadata']['RequestId']))
    #print(event)
    return event
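
For reference, Lambda hands the handler each Kinesis record wrapped in a Records list, with the payload base64 encoded under kinesis.data. Below is a trimmed, illustrative event (only the fields our handler reads) together with a quick local check of the decode step:

from base64 import b64decode

# Illustrative Kinesis event, trimmed to the fields lambda_handler uses
sample_event = {
    "Records": [
        {
            "eventID": "shardId-000000000000:49626853442679825006635798069828080735600763790688256002",
            "kinesis": {
                "data": "cGl6emE="  # base64 of "pizza"
            }
        }
    ]
}

record = sample_event["Records"][0]
print(b64decode(record["kinesis"]["data"]).decode("utf-8"))  # pizza
print(record["eventID"][-15:-1])  # the hash key slice: 76379068825600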

In a zip.sh file at the root of the workspace, define the following bash code:

#!/bin/bash

for function in $(ls lambda/)
do 
   pushd "lambda/$function"
   if [ -f "deployment_package.zip" ]; then rm -f deployment_package.zip; fi
   python3 -m pip install --target ./packages --requirement ./deps/requirements.txt
   pushd packages
   zip -r ../deployment_package.zip .
   popd
   pushd src/
   zip -g ../deployment_package.zip lambda_function.py
   popd
   rm -rf packages/*
   popd
done

We can then make this file executable:

chmod +x ./zip.sh

This script will help us create our deployment package. Now that we have all our lambda code and dependencies sorted out, it's time to define the infrastructure as code using Terraform.

In providers.tf we define the AWS provider configuration, as well as the custom endpoints that tell Terraform where to find our LocalStack services:

providers.tf
terraform {
  required_version = "~> 1.0"

  required_providers {
    aws = {
      source = "hashicorp/aws"
      version = "~> 5.60"
    }
    archive = {
      source = "hashicorp/archive"
      version = "~> 2.4"
    }
  }
}

provider "aws" {
  region                      = "eu-west-1"
  access_key                  = "localstack"
  secret_key                  = "localstack"
  skip_credentials_validation = true
  skip_metadata_api_check     = true
  skip_requesting_account_id  = true

  endpoints {
    dynamodb = "http://localhost:4566"
    lambda   = "http://localhost:4566"
    kinesis  = "http://localhost:4566"
    s3       = "http://localhost:4566"
    iam      = "http://localhost:4566"
  }
}

provider "archive" {}

Our main.tf file is where we define our mock AWS infrastructure using Terraform:

main.tf
data "archive_file" "order_processor_package" {
  type             = "zip"
  source_file      = "${path.module}/lambda/order-processor/src/lambda_function.py"
  output_file_mode = "0666"
  output_path      = "/tmp/deployment_package.zip"
}

resource "aws_dynamodb_table" "orders" {
  name           = "orders"
  read_capacity  = "2"
  write_capacity = "5"
  hash_key       = "OrderID"

  attribute {
    name = "OrderID"
    type = "S"
  }
}

resource "aws_kinesis_stream" "orders_processor" {
  name = "orders_processor"
  shard_count = 1
  retention_period = 30

  shard_level_metrics = [
    "IncomingBytes",
    "OutgoingBytes",
  ]
}

data "aws_iam_policy_document" "assume_role" {
  statement {
    effect = "Allow"

    principals {
      type        = "Service"
      identifiers = ["lambda.amazonaws.com"]
    }

    actions = ["sts:AssumeRole"]
  }
}

resource "aws_iam_role" "iam_for_lambda" {
  name               = "iam_for_lambda"
  assume_role_policy = data.aws_iam_policy_document.assume_role.json
}

resource "aws_lambda_function" "order_processor" {
  function_name    = "order_processor"
  filename         = "${path.module}/lambda/order-processor/deployment_package.zip"
  handler          = "lambda_function.lambda_handler"
  role             = aws_iam_role.iam_for_lambda.arn
  runtime          = "python3.7"
  timeout          = 60
  memory_size      = 128
  source_code_hash = data.archive_file.order_processor_package.output_base64sha256
}

resource "aws_lambda_event_source_mapping" "order_processor_trigger" {
  event_source_arn              = aws_kinesis_stream.orders_processor.arn
  function_name                 = "order_processor"
  batch_size                    = 1
  starting_position             = "LATEST"
  enabled                       = true
  maximum_record_age_in_seconds = 604800
}

Lastly, our put_record.py uses Python and Boto3 to connect to our Kinesis stream on LocalStack and do a PutRecord:

put_record.py
#!/usr/bin/env python3

import boto3

kinesis = boto3.Session(region_name='eu-west-1').client(
    'kinesis',
    aws_access_key_id='localstack',
    aws_secret_access_key='localstack',
    endpoint_url='http://localhost:4566'
)
response = kinesis.put_record(
    StreamName='orders_processor',
    Data=b'chips',
    PartitionKey='1'
)
print(response)

Our directory structure should look like this:

├── lambda
│   └── order-processor
│       ├── deps
│       │   └── requirements.txt    # dependencies
│       ├── packages
│       └── src
│           └── lambda_function.py  # function code
├── main.tf                         # terraform resources
├── providers.tf                    # aws provider config
├── put_record.py                   # python equivalent of doing a putrecord
└── zip.sh                          # installs dependencies and creates deployment package

We can now start by creating the deployment package:

./zip.sh
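
Before deploying, we can optionally sanity-check that the package contains the function code and its dependencies at the archive root. A quick sketch using Python's standard zipfile module:

import zipfile

# Inspect the package produced by zip.sh
with zipfile.ZipFile('lambda/order-processor/deployment_package.zip') as package:
    names = package.namelist()
    print('lambda_function.py' in names)               # function code at the root
    print(any(n.startswith('boto3/') for n in names))  # bundled dependencies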

Then go ahead and create the infrastructure:

terraform init
terraform apply -auto-approve

We should now be able to see our infrastructure that was deployed with Terraform:

awslocal dynamodb list-tables
{
    "TableNames": [
        "orders",
        "..."
    ]
}

Go ahead and do a PutRecord call by running a put-record subcommand, which will send a record to the Kinesis Stream:

awslocal kinesis put-record --stream-name orders_processor --partition-key 123 --data $(echo -n "pizza" | base64)

We should see a response similar to:

{
    "ShardId": "shardId-000000000000",
    "SequenceNumber": "49626853442679825006635798069828080735600763790688256002"
}
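
If you want to confirm the record reached the stream independently of the Lambda consumer, you can read it back with boto3 by fetching a shard iterator and calling get_records. A minimal sketch against the same LocalStack endpoint (the shard id comes from the put-record response above):

import boto3

kinesis = boto3.client(
    'kinesis',
    region_name='eu-west-1',
    aws_access_key_id='localstack',
    aws_secret_access_key='localstack',
    endpoint_url='http://localhost:4566'
)

# Start reading from the beginning of the stream's single shard
iterator = kinesis.get_shard_iterator(
    StreamName='orders_processor',
    ShardId='shardId-000000000000',
    ShardIteratorType='TRIM_HORIZON'
)['ShardIterator']

for record in kinesis.get_records(ShardIterator=iterator, Limit=10)['Records']:
    # boto3 returns the payload as raw bytes (already base64 decoded)
    print(record['SequenceNumber'], record['Data'])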

If we tail the logs from our Localstack container:

docker logs -f localstack

We should see the Kinesis, DynamoDB and Lambda logs:

> START RequestId: 29eceff2-c4c1-17d0-a874-27f0dd913a86 Version: $LATEST
> EventID: shardId-000000000000:49626853442679825006635798069828080735600763790688256002, HashKey: 76379068825600, Data: pizza
> DynamoDB RequestID: 974099a3-2f49-4f0f-b7e4-2c53b07db028
> END RequestId: 29eceff2-c4c1-17d0-a874-27f0dd913a86
> REPORT RequestId: 29eceff2-c4c1-17d0-a874-27f0dd913a86	Init Duration: 221.72 ms	Duration: 34.28 ms	Billed Duration: 100 ms	Memory Size: 1536 MB	Max Memory Used: 40 MB

If we do a Scan on our DynamoDB table:

awslocal dynamodb scan --table-name orders

We will see that the record we sent to our Kinesis stream landed in our DynamoDB table:

{
    "Items": [
        {
            "EventID": {
                "S": "shardId-000000000000:49626853442679825006635798069828080735600763790688256002"
            },
            "OrderData": {
                "S": "pizza"
            },
            "OrderID": {
                "S": "76379068825600"
            },
            "Timestamp": {
                "S": "2024-08-06T22:29:36"
            }
        }
    ],
    "Count": 1,
    "ScannedCount": 1,
    "ConsumedCapacity": null
}

Although this is just a test environment, be careful with Scan operations in general: a Scan reads every item in the table, which can become very expensive on large tables.

We can use the AWS CLI to do a GetItem on our DynamoDB table to confirm that we can retrieve our item:

awslocal dynamodb get-item --table-name orders --key '{"OrderID": {"S": "76379068825600"}}'
{
    "Item": {
        "EventID": {
            "S": "shardId-000000000000:49626853442679825006635798069828080735600763790688256002"
        },
        "OrderData": {
            "S": "pizza"
        },
        "OrderID": {
            "S": "76379068825600"
        },
        "Timestamp": {
            "S": "2024-08-06T22:29:36"
        }
    }
}

We have now successfully pushed a record to our Kinesis stream, and our Lambda function consumed it and wrote the item to DynamoDB.

We can now destroy the infrastructure by running the following:

terraform destroy

The source code for this demonstration can be found in my GitHub repository.

Conclusion

LocalStack is an invaluable tool for developers working with AWS. By providing a local emulation of AWS services, it offers significant advantages in terms of cost savings, development speed, testing capabilities, and offline productivity. Whether you're building a simple cloud application or a complex multi-service architecture, LocalStack can help streamline your development process and ensure your code is ready for production.

Thank You

Thanks for reading, if you like my content, feel free to check out my website, and subscribe to my newsletter or follow me at @ruanbekker on Twitter.
