boto3 – From monkeypatch to moto

Today, at the suggestion of a colleague (Zachary Paden), I looked into migrating my tests from pytest's monkeypatch to moto. Here I share my journey and observations.

For simplicity’s sake, here is a very minimalist version of the starting code and test.

# main.py

import boto3

session = boto3.session.Session()
client = session.client(service_name="firehose", region_name="us-east-1")

def func() -> None:
    client.put_record(DeliveryStreamName="some-stream", Record={"key": "value"})

# test.py
from typing import Any

import pytest
from _pytest.monkeypatch import MonkeyPatch

import main

@pytest.fixture
def mock_aws(monkeypatch: MonkeyPatch) -> list[dict[str, Any]]:
    """Mock the AWS components.
    Returns:
        A reference to a list where firehose events are added.
    """
    called_with: list[dict[str, Any]] = []

    def mock_put_record(DeliveryStreamName: str, Record: dict[str, Any]) -> None:
        called_with.append(Record)

    monkeypatch.setattr(main.client, "put_record", mock_put_record)

    return called_with

def test_main(mock_aws: list[dict[str, Any]]) -> None:
    main.func()
    assert mock_aws[0] == {"key": "value"}

Let’s now go with the moto version of that code.

First of all, you have to install moto and list the services you will use as “extras”.

With poetry: poetry add --group dev moto -E s3 -E firehose

# main.py
from typing import Final, Optional

import boto3
from mypy_boto3_firehose.client import FirehoseClient

REGION_NAME: Final = "us-east-1"

_SESSION: Optional[boto3.session.Session] = None
_FH_CLIENT: Optional[FirehoseClient] = None


def get_firehose_client() -> FirehoseClient:
    global _SESSION, _FH_CLIENT
    if _FH_CLIENT is None:
        _SESSION = boto3.session.Session()
        _FH_CLIENT = _SESSION.client(service_name="firehose", region_name=REGION_NAME)
    return _FH_CLIENT


def func() -> None:
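    # Firehose expects the payload as bytes under "Data"; botocore validates
    # the request shape now that put_record is no longer monkeypatched.
    get_firehose_client().put_record(
        DeliveryStreamName="some-stream", Record={"Data": b'{"key": "value"}'}
    )

# test.py
import json

import boto3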
import moto

import main

@moto.mock_s3
@moto.mock_firehose
def test_main() -> None:
    # setup
    s3_client = boto3.client("s3", region_name="us-east-1")
    s3_client.create_bucket(Bucket="patate")
    fh_client = boto3.client("firehose", region_name="us-east-1")
    fh_client.create_delivery_stream(
        DeliveryStreamName="some-stream",
        ExtendedS3DestinationConfiguration={"BucketARN": "patate", "RoleARN": "FakeRole"},
    )

    # execution
    main.func()

    # validation
    objs = s3_client.list_objects(Bucket="patate")
    obj = s3_client.get_object(Bucket="patate", Key=objs["Contents"][0]["Key"])
    data = json.loads(obj["Body"].read().decode())
    assert data == {"key": "value"}

Let’s explore the reason behind each of these changes.

1. session and client are now lazy loaded

For moto's mocks to work, the mock must be active BEFORE the session and client are created, so they can no longer be initialized at module scope. (Technically they could, if main were imported inside each test rather than at the top of test.py, but that would create a lot of repetition.)

2. The mock_aws fixture is replaced by the two decorators @moto.mock_s3 and @moto.mock_firehose

These decorators ensure that the s3 and firehose clients will be simulated for the duration of the test.

3. The test must create a complete mock-up of the AWS components involved

Even if main.func only uses a firehose, the test must first create an s3 bucket in order to point the firehose to it, because a moto firehose, like a real one, must have a destination.

4. To access the result, you must continue to pretend to interact with AWS

Since main.func writes to a firehose, and the destination of the firehose is s3, in order to see the result of the put_record call, you have to look for it in s3.
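The validation step can be wrapped in a small helper. The sketch below is an assumption, not part of the original tests: read_first_json_object and FakeS3Client are hypothetical names, and the fake client exists only so the snippet runs without boto3 or moto. It also guards against an empty bucket, since list_objects omits the "Contents" key when there is nothing to list.

```python
import io
import json
from typing import Any


def read_first_json_object(s3_client: Any, bucket: str) -> Any:
    """Return the decoded JSON body of the first object listed in the bucket."""
    listing = s3_client.list_objects(Bucket=bucket)
    # The "Contents" key is absent when the bucket is empty.
    contents = listing.get("Contents", [])
    if not contents:
        raise AssertionError(f"no object was delivered to bucket {bucket!r}")
    obj = s3_client.get_object(Bucket=bucket, Key=contents[0]["Key"])
    return json.loads(obj["Body"].read().decode())


class FakeS3Client:
    """Hypothetical in-memory stand-in, only used to exercise the helper."""

    def __init__(self, objects: dict[str, bytes]) -> None:
        self._objects = objects

    def list_objects(self, Bucket: str) -> dict[str, Any]:
        if not self._objects:
            return {}
        return {"Contents": [{"Key": key} for key in self._objects]}

    def get_object(self, Bucket: str, Key: str) -> dict[str, Any]:
        return {"Body": io.BytesIO(self._objects[Key])}


client = FakeS3Client({"some/record": b'{"key": "value"}'})
assert read_first_json_object(client, "patate") == {"key": "value"}
```

In a moto test, the s3 client created in the # setup portion would be passed in place of the fake one.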

My Observations

Lazy loading the session and client is not so bad. However, depending on when the first call happens, it can go against the “fail fast” principle: a misconfiguration only surfaces when the client is first used. This technique also requires more code.
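One possible way to keep lazy loading while limiting both costs is to cache the client behind a function and call it once from the application's entry point. This is only a sketch: it uses functools.cache instead of the global variables shown earlier, startup is a hypothetical entry point, and FakeClient stands in for the boto3 client so the snippet runs without AWS libraries.

```python
import functools

created: list[str] = []  # records each client construction, for illustration


class FakeClient:
    """Hypothetical stand-in for a boto3 client."""

    def __init__(self, service_name: str) -> None:
        created.append(service_name)


@functools.cache
def get_client(service_name: str) -> FakeClient:
    """Build the client on the first call and reuse it afterwards."""
    return FakeClient(service_name)


def startup() -> None:
    """Hypothetical entry point: build the client early so errors show up early."""
    get_client("firehose")


# Importing the module builds nothing, so a test can start its mocks first.
assert created == []

startup()
assert get_client("firehose") is get_client("firehose")
assert created == ["firehose"]

# Tests that need a fresh client can reset the cache between runs.
get_client.cache_clear()
```

The trade-off is that the cache has to be cleared between tests if each test is expected to get a client built under its own mock.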

On the one hand, the monkeypatch approach requires almost no knowledge of AWS. On the other hand, the moto approach requires more AWS knowledge than writing the code under test does. In the present case, you need to know that a firehose must have a destination and that s3 is supported. Then you need to know how to use the s3 client to create a bucket, retrieve an object and read its content. One may wonder how far this complexity can extend with more complex applications.

The monkeypatch approach requires replacing all invocations, one by one, no matter where they are. The moto approach replaces them all at once.
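To illustrate the "one by one" cost, here is a small sketch that patches a single method and leaves a second one untouched. It is an assumption-laden illustration: RealClient is a hypothetical stand-in for a boto3 client, and unittest.mock.patch.object is used in place of pytest's monkeypatch.setattr so the snippet runs on the standard library alone.

```python
from unittest import mock


class RealClient:
    """Hypothetical stand-in for a boto3 client; every call 'reaches AWS'."""

    def put_record(self, **kwargs: object) -> None:
        raise RuntimeError("would call AWS: put_record")

    def put_record_batch(self, **kwargs: object) -> None:
        raise RuntimeError("would call AWS: put_record_batch")


client = RealClient()
calls: list[str] = []

# Patch a single method, as monkeypatch.setattr does in the first test.py.
with mock.patch.object(client, "put_record", lambda **kwargs: calls.append("put_record")):
    client.put_record(DeliveryStreamName="some-stream", Record={"key": "value"})
    try:
        # This method was not patched, so the real implementation still runs.
        client.put_record_batch(DeliveryStreamName="some-stream", Records=[])
    except RuntimeError as exc:
        calls.append(str(exc))

assert calls == ["put_record", "would call AWS: put_record_batch"]
```

Every additional method the code under test calls needs its own replacement, whereas a moto mock for the service covers them without further patching.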

Since mocks usually have to be reused from one test to another, the approach implemented above is not ideal: the # setup portion would have to be copied into every test. To avoid this, here is a reimplementation of test.py.

# test.py
import json
from typing import Generator

import boto3
import moto
import pytest
from mypy_boto3_firehose.client import FirehoseClient
from mypy_boto3_s3.client import S3Client

import main

@pytest.fixture
def s3_client() -> Generator[S3Client, None, None]:
    with moto.mock_s3():
        conn = boto3.client("s3", region_name="us-east-1")
        yield conn


@pytest.fixture
def firehose_client() -> Generator[FirehoseClient, None, None]:
    with moto.mock_firehose():
        conn = boto3.client("firehose", region_name="us-east-1")
        yield conn

@pytest.fixture(autouse=True)
def aws_setup(s3_client: S3Client, firehose_client: FirehoseClient) -> None:
    s3_client.create_bucket(Bucket="patate")
    # The stream name must match the one main.func writes to.
    firehose_client.create_delivery_stream(
        DeliveryStreamName="some-stream",
        ExtendedS3DestinationConfiguration={"BucketARN": "patate", "RoleARN": "FakeRole"},
    )


def test_v3_api_key_support(s3_client: S3Client) -> None:
    # execution
    main.func()

    # validation
    objs = s3_client.list_objects(Bucket="patate")
    obj = s3_client.get_object(Bucket="patate", Key=objs["Contents"][0]["Key"])
    data = json.loads(obj["Body"].read().decode())
    assert data == {"key": "value"}

Here, the aws_setup fixture is defined with autouse=True, so it runs automatically for every test. Each test can then take s3_client as an argument instead of creating the client itself.

Conclusion

I don’t see a clear winning approach. I think the choice is really contextual.

In my case, I only had one call to firehose.put_record to replace, so monkeypatching was simple. Also, during this experiment I had migrated only one test function; applying the change to all the tests would have taken more time for, in my opinion, no gain. So I opted to keep the monkeypatch in place.
