Microservices & Data – Implementing the Outbox Pattern with Hibernate



When you start implementing a microservice architecture, you quickly recognize that managing your data has become much harder than it used to be in a monolithic world. In the past, you used distributed transactions and synchronous communication between different parts of your monolith. In a distributed microservice architecture, this is no longer an option.

You need to find a different approach. You need patterns and technologies that keep your microservices independent of each other so that you can:

  • design and implement your microservices independent of each other,
  • deploy your microservices independent of each other,
  • scale each microservice independently,
  • prevent performance problems in one service from affecting your other services and
  • ensure that a failure in one service doesn’t cascade to other services and take down your whole application.

Requirements for Exchanging Data Between Your Services

That probably sounds like a tremendous challenge. But it isn’t as bad as it sounds.

You can create independent and scalable microservices by following a relatively small set of requirements:

  1. Each service needs to have its own database so that it can change and scale its persistence solution independently.
  2. You need to avoid synchronous communication between your services to prevent performance problems and failures from cascading. A common way to do that is to use messaging.
  3. Communication between your services needs to be reliable and should follow an all-or-nothing approach. That’s typically achieved by using a transactional context for your communication.
  4. Distributed transactions are slow, complex and negatively affect the scalability of your application. You should, therefore, only use local transactions. That prevents you from using a service-specific database and a message broker within the same transaction.
  5. It’s not strictly necessary but beneficial if the communication between your services is re-playable. That enables you to add new services to your architecture without developing a new way to share the required data with them.

If you want to fulfill all 5, or at least the first 4 requirements, you might feel like you’re in a tough spot. You obviously need an asynchronous form of communication between your services, e.g. Kafka as a messaging solution. But how do you reliably get your messages to the message broker without using a distributed transaction?

That’s where the Outbox pattern comes into play.

The Outbox Pattern

When you apply the Outbox pattern, you split the communication between your microservice and the message broker into two parts. The key element is that your service provides an outbox within its database.

Yes, an outbox, like the thing people used in paper-based offices to store all the letters that had to be sent via mail.

You, of course, don’t need to print any messages and put them in a box. But you can apply the same idea to your database. You can define a database table that becomes part of your external interface. In this table, you insert a record for each message you want to send to the message broker. That enables you to use one local transaction with your database in which you persist the internal data of your microservice and the external communication.

In the next step, you need an additional service that gets the messages from your outbox table and sends them to your message broker. This message relay service is the topic of another tutorial and I only want to mention your 2 main implementation options here:

  1. You can use a tool like Debezium to monitor the logs of your database and let it send a message for each new record in the outbox table to your message broker. This approach is called Change Data Capture (CDC).
  2. You can implement a service that polls the outbox table and sends a new message to your message broker whenever it finds a new record.

I prefer option 1, but both are valid solutions to connect your outbox table with your message broker.
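To make option 2 more concrete, here is a minimal sketch of a polling relay. It is not a production implementation: the outbox table and the message broker are replaced by in-memory stand-ins, and the names OutboxRow, PollingRelay and the sent flag are assumptions made for this illustration. They are not part of the outbox table shown later in this article.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.function.Consumer;

/** Hypothetical in-memory stand-in for one row of the outbox table. */
final class OutboxRow {
    final UUID id = UUID.randomUUID();
    final String aggregate;
    final String operation;
    final String message;
    boolean sent; // assumed bookkeeping flag; a relay could also delete forwarded rows

    OutboxRow(String aggregate, String operation, String message) {
        this.aggregate = aggregate;
        this.operation = operation;
        this.message = message;
    }
}

/** Hypothetical polling relay that forwards unsent rows in insertion order. */
final class PollingRelay {
    private final List<OutboxRow> outbox;      // stands in for the outbox table
    private final Consumer<OutboxRow> broker;  // stands in for the broker client

    PollingRelay(List<OutboxRow> outbox, Consumer<OutboxRow> broker) {
        this.outbox = outbox;
        this.broker = broker;
    }

    /** Runs one polling cycle and returns the number of forwarded rows. */
    int pollOnce() {
        int forwarded = 0;
        for (OutboxRow row : outbox) {
            if (row.sent) {
                continue;
            }
            // If sending throws, the row stays unsent and is retried on the next poll.
            broker.accept(row);
            row.sent = true;
            forwarded++;
        }
        return forwarded;
    }
}

final class PollingRelayDemo {
    public static void main(String[] args) {
        List<OutboxRow> outbox = new ArrayList<>();
        List<String> published = new ArrayList<>();
        PollingRelay relay = new PollingRelay(outbox, row -> published.add(row.message));

        outbox.add(new OutboxRow("Book", "CREATE", "{\"id\":1}"));
        System.out.println(relay.pollOnce()); // prints 1
        System.out.println(relay.pollOnce()); // prints 0, nothing new to forward
    }
}
```

Because the sketch marks a row as sent only after the broker accepted it, a failure between these two steps leads to the same message being sent again on the next poll. Consumers of such a relay should therefore be able to handle duplicates.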

The next important question is: How should you structure your outbox table?

The Structure of the Outbox Table

The outbox table is an external API of your service and you should treat it in the same way as any other externally available API. That means:

  • You need to keep the structure of the table and the contained messages stable.
  • You need to be able to change your microservice internally.
  • You should try to not leak any internal details of your service.

To achieve all of this, most teams use a table with a UUID as the primary key, a JSON column that contains the payload of the message and a few additional columns that describe the message. The example in this article uses the columns id, operation, aggregate and message.

The message is often based on the aggregate for which the message was created. So, if your microservice manages books, the aggregate root might be the book itself, which includes a list of chapters.

Whenever a book gets created or changed or when a chapter gets added, a new message for the book gets added to the outbox table.

The payload of the message can be a JSON representation of the full aggregate, e.g. a book with all chapters, or a message-specific subset of the aggregate. I prefer to include the full aggregate in the message, but that’s totally up to you.

Here you can see an example of such a message.

{
	"id":1,
	"title":"Hibernate Tips - More than 70 solutions to common Hibernate problems",
	"chapters":[
		{"id":2,
		 "content":"How to map natural IDs"},
		{"id":3,
		 "content":"How to map a bidirectional one-to-one association"}
	]
}

Filling the Outbox Table

There are lots of different ways to fill the outbox table. You can:

  1. trigger a custom business event, e.g. via CDI, and use an event handler to write a record to the outbox table,
  2. write the record programmatically using an entity or a JPQL statement,
  3. use a Hibernate-specific listener to write a record to the outbox table every time you persist, update or remove an entity.

From a persistence point of view, there is no real difference in the implementation of options 1 and 2. You, of course, need to trigger and observe the event, but that doesn’t influence how you write the record to the outbox table. I will, therefore, only show you how to programmatically write the record. You can then use it with your preferred event mechanism or call the method that writes the record directly.

The 3rd option is almost identical to the other ones. It uses the same statement to insert a record into the outbox table, but it gets triggered by an entity lifecycle event. The main advantage of this approach is that you can ignore the outbox table in your business logic. Whenever you create, update or remove an entity, Hibernate triggers the listener and automatically adds a record to the outbox table. But it also has the disadvantage that you can’t aggregate multiple records that are written within the same transaction. So, for all use cases that change or add multiple entities within the same aggregate, the listener gets triggered multiple times, and each time it adds another record to the table. In the end, this creates way too many records, and I highly recommend that you avoid this approach.
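The difference in the number of records can be illustrated with a small simulation. It doesn’t use Hibernate at all: FakePersistenceContext is a hypothetical stand-in that only calls a callback once per persisted entity, roughly the way a lifecycle listener would be invoked. Treat it as a sketch of the counting argument, not as a model of Hibernate’s listener API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a persistence context with a lifecycle callback. */
final class FakePersistenceContext {
    private final Runnable onPersist; // runs once for every persisted entity

    FakePersistenceContext(Runnable onPersist) {
        this.onPersist = onPersist;
    }

    void persist(String entity) {
        // A real persistence context would schedule an SQL INSERT here.
        onPersist.run();
    }
}

final class OutboxGranularityDemo {

    /** Listener style: every persisted entity adds its own outbox record. */
    static int recordsWithListener() {
        List<String> outbox = new ArrayList<>();
        FakePersistenceContext ctx =
                new FakePersistenceContext(() -> outbox.add("Book changed"));
        ctx.persist("Book");
        ctx.persist("Chapter 1");
        ctx.persist("Chapter 2");
        return outbox.size();
    }

    /** Programmatic style: the business code adds one record per aggregate change. */
    static int recordsWithExplicitCall() {
        List<String> outbox = new ArrayList<>();
        FakePersistenceContext ctx = new FakePersistenceContext(() -> { });
        ctx.persist("Book");
        ctx.persist("Chapter 1");
        ctx.persist("Chapter 2");
        outbox.add("Book created"); // single explicit call after all changes
        return outbox.size();
    }

    public static void main(String[] args) {
        System.out.println("listener: " + recordsWithListener());          // prints "listener: 3"
        System.out.println("explicit call: " + recordsWithExplicitCall()); // prints "explicit call: 1"
    }
}
```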

Write the Outbox Record Programmatically

Writing the record programmatically is relatively simple. You need to implement a method that transforms your aggregate into its JSON representation and inserts it, together with some additional information, into the outbox table. You can then call this method from your business logic when you perform any changes on your aggregate.

But how do you write the record? Should you use an entity or an SQL INSERT statement?

In general, I recommend using a simple SQL INSERT statement which you execute as a native query. Using an entity doesn’t provide you any benefits because it’s a one-time write operation. You will not read, update or remove the database record. You will also not map any managed association to it. So, there is no need to map the outbox table to an entity class or to manage the lifecycle of an entity object.

Here is an example of a writeBookToOutbox method which writes a message for the previously described book aggregate. Please pay special attention to the creation of the JSON document. As described earlier, I prefer to store the complete aggregate, which includes the book and the list of chapters.

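// Imports assume Jackson 2.x and JPA 2.x; use jakarta.persistence with Hibernate 6 or later.
import java.util.UUID;

import javax.persistence.EntityManager;
import javax.persistence.Query;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ArrayNode;
import com.fasterxml.jackson.databind.node.ObjectNode;

public class OutboxUtil {

	// A configured ObjectMapper is thread-safe and can be shared
	private static final ObjectMapper mapper = new ObjectMapper();

	public static void writeBookToOutbox(EntityManager em, Book book, Operation op) throws JsonProcessingException {

		// Build the JSON document for the aggregate root ...
		ObjectNode json = mapper.createObjectNode()
			.put("id", book.getId())
			.put("title", book.getTitle());

		// ... and add all chapters that belong to the book
		ArrayNode items = json.putArray("chapters");

		for (Chapter chapter : book.getChapters()) {
			items.add(mapper.createObjectNode()
						.put("id", chapter.getId())
						.put("content", chapter.getContent())
			);
		}

		// Insert the message with a native query; the outbox table isn't mapped to an entity
		Query q = em.createNativeQuery("INSERT INTO Outbox (id, operation, aggregate, message) VALUES (:id, :operation, :aggregate, :message)");
		q.setParameter("id", UUID.randomUUID());
		q.setParameter("operation", op.toString());
		q.setParameter("aggregate", "Book");
		q.setParameter("message", mapper.writeValueAsString(json));
		q.executeUpdate();
	}
}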
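The method above expects an Operation parameter that isn’t shown in this article. A minimal sketch of such an enum could look like this. The constant names are an assumption based on the three kinds of operations mentioned in the text and on the Operation.CREATE call in the business code.

```java
/**
 * Hypothetical definition of the Operation enum used by writeBookToOutbox.
 * The constant name is what op.toString() writes into the operation column.
 */
enum Operation {
    CREATE,
    UPDATE,
    REMOVE
}
```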

In your business code, you can now call this method with an instance of the Book entity and an enum value that represents the kind of operation (create, update or remove) performed on the aggregate.

EntityManager em = emf.createEntityManager();
em.getTransaction().begin();

Book b = new Book();
b.setTitle("Hibernate Tips - More than 70 solutions to common Hibernate problems");
em.persist(b);

Chapter c1 = new Chapter();
c1.setContent("How to map natural IDs");
c1.setBook(b);
b.getChapters().add(c1);
em.persist(c1);

Chapter c2 = new Chapter();
c2.setContent("How to map a bidirectional one-to-one association");
c2.setBook(b);
b.getChapters().add(c2);
em.persist(c2);

OutboxUtil.writeBookToOutbox(em, b, Operation.CREATE);

em.getTransaction().commit();
em.close();

When you execute this code, Hibernate first persists the Book and the 2 associated Chapter entities in the database, before it adds a record to the outbox table. All of these SQL INSERT statements are executed within the same transaction. So, you can be sure that the messages in your outbox table always match the current state in your book and chapter tables.

15:31:27,426 DEBUG SQL:94 - 
    select
        nextval ('hibernate_sequence')
15:31:27,494 DEBUG SQL:94 - 
    select
        nextval ('hibernate_sequence')
15:31:27,497 DEBUG SQL:94 - 
    select
        nextval ('hibernate_sequence')
15:31:28,075 DEBUG SQL:94 - 
    insert 
    into
        Book
        (title, version, id) 
    values
        (?, ?, ?)
15:31:28,081 DEBUG SQL:94 - 
    insert 
    into
        Chapter
        (book_id, content, version, id) 
    values
        (?, ?, ?, ?)
15:31:28,085 DEBUG SQL:94 - 
    insert 
    into
        Chapter
        (book_id, content, version, id) 
    values
        (?, ?, ?, ?)
15:31:28,115 DEBUG SQL:94 - 
    INSERT 
    INTO
        Outbox
        (id, operation, aggregate, message) 
    VALUES
        (?, ?, ?, ?)

Conclusion

The Outbox pattern is an easy and flexible way to provide messages for other microservices without requiring distributed transactions. In this article, I showed you how to design the outbox table and how to insert records into it.

In the next step, you need to implement another service, which gets the messages from the outbox table and sends them to a message broker, e.g. a Kafka instance. But that’s a topic for another article, which I will write soon.

6 Comments

  1. Nice idea. Never heard of it.
    Does that mean that each microservice needs his own Message Relay Service?
    If not we would break the independence as the MRS depends on ServiceX and ServiceY.

    1. Thorben Janssen says:

      Hi Daniel,

      that’s correct. Each service needs its own Message Relay Service. If you use Debezium (https://thorben-janssen.com/outbox-pattern-with-cdc-and-debezium/), multiple of them will be deployed on the same Kafka Connect instance but they stay independent of each other. As you wrote, this is required to avoid any dependencies between your services.

      Regards,
      Thorben

  2. Sree Ramakrishna says:

    I think I need to buy your Hibernate book and may be online training. To become more valuable Java developer and Engineering Manager/Architect in the Job market particularly in India and USA/Europe
    Regards
    Ramakrishna

    1. Thorben Janssen says:

      That would be highly appreciated and definitely help you improve your Hibernate skills.

      Regards,
      Thorben

  3. Binh Thanh Nguyen says:

    Thanks, nice tips

    1. Thorben Janssen says:

      Thanks for commenting 🙂
