Should you expose your entities in your REST API, or should you prefer to serialize and deserialize DTO classes?
That’s one of the most commonly asked questions when I’m talking to developers or when I’m coaching teams who are working on a new application.
There are two main reasons for this question and all the discussions that arise from it:
- Entities are POJOs. It often seems like they can easily be serialized to and deserialized from JSON documents. If it really worked that easily, implementing your REST endpoints would be pretty simple.
- Exposing your entities creates a strong coupling between your API and your persistence model. Any difference between the two models introduces extra complexity, and you need to find a way to bridge the gap between them. Unfortunately, there are always differences between your API and your persistence model. The most obvious one is the handling of associations between your entities.
There is an obvious conflict. It seems like exposing entities makes implementing your use cases easier, but it also introduces new problems. So, what has a bigger impact on your implementation? And are there any other problems that might not be that obvious?
I have seen both approaches in several projects, and over the years, I’ve formed a pretty strong opinion on this. Even though it’s tempting to expose your entities, you should avoid it for all applications with at least moderate complexity and for all applications that you need to support for a long time. Exposing your entities in your API makes it impossible to follow a few best practices for API design, reduces the readability of your entity classes, slows down your application, and makes it hard to implement a true REST architecture.
You can avoid all of these issues by designing DTO classes, which you then serialize and deserialize in your API. That requires you to implement a mapping between the DTOs and your internal data structures. But that’s worth it if you consider all the downsides of exposing entities in your API.
Let me explain …
Hide implementation details
As a general best practice, your API shouldn’t expose any implementation details of your application. The structure that you use to persist your data is such a detail. Exposing your entities in your API obviously doesn’t follow this best practice.
Almost every time I bring up this argument in a discussion, someone skeptically raises an eyebrow or directly asks if that is really that big of a deal.
Well, it’s only a big deal if you want to be able to add, remove, or change any attributes of your entities without changing your API, or if you want to change the data returned by a REST endpoint without changing your database.
In other words: Yes, separating your API from your persistence layer is necessary to implement a maintainable application. If you don’t do it, every change of your REST API will affect your entity model and vice versa. That means your API and your persistence layer can no longer evolve independently of each other.
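To make that separation more concrete, here is a minimal sketch in plain Java. The Book class is only a stand-in for a JPA entity (its mapping annotations are omitted), and the names Book, BookDto, BookMapper, and retailPrice are assumptions I made up for this illustration.

```java
import java.math.BigDecimal;

// Hypothetical stand-in for a JPA entity (mapping annotations omitted).
class Book {
    private final Long id;
    private final String title;
    // Renaming or splitting this attribute only affects the mapper below.
    private final BigDecimal retailPrice;

    Book(Long id, String title, BigDecimal retailPrice) {
        this.id = id;
        this.title = title;
        this.retailPrice = retailPrice;
    }

    Long getId() { return id; }
    String getTitle() { return title; }
    BigDecimal getRetailPrice() { return retailPrice; }
}

// Hypothetical API model: its shape follows the API contract, not the table.
record BookDto(Long id, String title, BigDecimal price) {}

// The only class that knows both models.
final class BookMapper {
    private BookMapper() {}

    static BookDto toDto(Book book) {
        return new BookDto(book.getId(), book.getTitle(), book.getRetailPrice());
    }
}
```

If the entity attribute gets renamed or moved to another table, only BookMapper has to change. The JSON document that your clients see stays the same.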
Don’t bloat your entities with additional annotations
And if you consider exposing entities only when they are a perfect match for the input or return value of a REST endpoint, please be aware of the additional annotations you will need to add for the JSON serialization and deserialization.
Most entity mappings already require several annotations. Adding additional ones for your JSON mapping makes the entity classes even harder to understand. Better keep it simple and separate the entity class from the class you use to serialize and deserialize your JSON documents.
Different handling of associations
Another argument against exposing your entities in your API is the handling of associations between entities. Your persistence layer and your API treat them differently. That’s especially the case if you’re implementing a REST API.
With JPA and Hibernate, you typically use managed associations that are represented by an entity attribute. That enables you to join the entities in your queries easily and to use the entity attribute to traverse the association in your business code. Depending on the configured fetch type and your query, this association is either fully initialized, or lazily fetched on the first access.
In your REST API, you handle these associations differently. The correct way would be to provide a link for each association. Roy Fielding described this as the hypermedia constraint, commonly abbreviated as HATEOAS (Hypermedia as the Engine of Application State). It’s one of the essential parts of a REST architecture. But most teams decide to either not model the associations at all or to only include id references.
Links and id references provide a similar challenge. When you serialize your entity to a JSON document, you need to fetch the associated entities and create references for each of them. And during deserialization, you need to take the references and fetch entities for them. Depending on the number of required queries, this might slow down your application.
That’s why teams often exclude associations during serialization and deserialization. That might be OK for your client applications, but it creates problems if you try to merge an entity that you created by deserializing a JSON object. Hibernate expects that a managed association references another entity object, a dynamically created proxy object, or a Hibernate-specific List or Set implementation. But if you deserialize a JSON object and ignore the managed associations on your entity, the associations get set to null. You then either need to set them manually, or Hibernate will delete the association from your database.
As you can see, managing associations can be tricky. Don’t get me wrong; these issues can be solved. But that requires extra work, and if you forget just one of them, you will lose some of your data.
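One way to sidestep the null association problem is to never build an entity from the request body at all. The following sketch, again in plain Java with stand-ins for the entities, returns id references in the response and copies only the changeable attributes onto an entity that was loaded from the database. All class and method names here are hypothetical, and in a real application the loading would happen through your persistence layer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for JPA entities (mapping annotations omitted).
class Author {
    private final Long id;

    Author(Long id) { this.id = id; }

    Long getId() { return id; }
}

class Book {
    private final Long id;
    private String title;
    // With Hibernate, this would be a managed association.
    private final List<Author> authors = new ArrayList<>();

    Book(Long id, String title) {
        this.id = id;
        this.title = title;
    }

    Long getId() { return id; }
    String getTitle() { return title; }
    void setTitle(String title) { this.title = title; }
    List<Author> getAuthors() { return authors; }
}

// Hypothetical DTOs: the response carries id references,
// the update carries no associations at all.
record BookDto(Long id, String title, List<Long> authorIds) {}
record BookUpdateDto(String title) {}

final class BookMapper {
    private BookMapper() {}

    static BookDto toDto(Book book) {
        List<Long> authorIds = book.getAuthors().stream().map(Author::getId).toList();
        return new BookDto(book.getId(), book.getTitle(), authorIds);
    }

    // Copies only the attributes the client may change onto the loaded entity.
    // The authors association is never touched, so it can't be nulled by accident.
    static void applyUpdate(BookUpdateDto update, Book loadedBook) {
        loadedBook.setTitle(update.title());
    }
}
```

Because applyUpdate works on the loaded entity instead of a freshly deserialized one, the association to the authors survives an update that doesn't mention it.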
Design your APIs
Another drawback of exposing your entities is that most teams use it as an excuse to not design the response of their REST endpoints. They only return serialized entity objects.
But if you’re not implementing a very simple CRUD operation, your clients will most likely benefit from carefully designed responses. Here are a few examples for a basic bookstore application:
- When you return the result of a search for a book, you might only want to return the title and price of the book, the names of its authors and the publisher, and an average customer rating. With a specifically designed JSON document, you can avoid unnecessary information and embed the information of the authors, the publisher, and the average rating instead of providing links to them.
- When the client requests detailed information about a book, the response will most likely be pretty similar to a serialized representation of the entity. But there will be some important differences. Your JSON document might contain the title, blurb, additional description, and other information about the book. But there is some information you don’t want to share, like the wholesale price or the current inventory of the book. You might also want to exclude the associations to the authors and reviews of this book.
Creating these different representations based on use case specific DTO classes is pretty simple. But doing the same based on a graph of entity objects is much harder and most likely requires some manual mappings.
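Here is a rough sketch of what the two bookstore responses could look like. I use Java records as simple stand-ins for the entity graph, and every name in this example (BookSearchResult, BookDetails, BookResponses, and the attribute names) is an assumption for illustration, not a fixed design.

```java
import java.math.BigDecimal;
import java.util.List;

// Simplified stand-ins for the entity graph (real entities would be JPA classes).
record Author(String name) {}
record Publisher(String name) {}
record Review(int rating) {}
record Book(String title, String blurb, BigDecimal price, BigDecimal wholesalePrice,
            int inventory, List<Author> authors, Publisher publisher, List<Review> reviews) {}

// Search result: embeds author names, publisher name, and the average rating.
record BookSearchResult(String title, BigDecimal price, List<String> authorNames,
                        String publisherName, double averageRating) {}

// Detail view: leaves out the wholesale price, the inventory, and the associations.
record BookDetails(String title, String blurb, BigDecimal price) {}

final class BookResponses {
    private BookResponses() {}

    static BookSearchResult toSearchResult(Book book) {
        List<String> authorNames = book.authors().stream().map(Author::name).toList();
        // Falls back to 0.0 if the book has no reviews yet.
        double averageRating = book.reviews().stream()
                .mapToInt(Review::rating)
                .average()
                .orElse(0.0);
        return new BookSearchResult(book.title(), book.price(), authorNames,
                book.publisher().name(), averageRating);
    }

    static BookDetails toDetails(Book book) {
        return new BookDetails(book.title(), book.blurb(), book.price());
    }
}
```

Each DTO lists exactly the attributes of one use case. Internal data like the wholesale price can't leak into a response because the DTO has no place for it.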
Support multiple versions of your API
If your application gets used for a while, you will need to add new REST endpoints and change existing ones. If you can’t always update all clients at the same time, this will force you to support multiple versions of your API.
Doing that while exposing your entities in your API is a tough challenge. Your entities then become a mix of currently used and old, deprecated attributes that are annotated with @Transient so that they don’t get persisted in the database.
Supporting multiple versions of an API is much easier if you’re exposing DTOs. That separates the persistence layer from your API, and you can introduce a migration layer to your application. This layer separates all the operations required to map the calls from your old API to the new one. That allows you to provide a simple and efficient implementation of your current API. And whenever you deactivate the old API, you can remove the migration layer.
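A migration layer like this can be very small. The sketch below assumes a made-up API change: version 1 modeled a single author as a String, and version 2 models a list of authors. BookV1, BookV2, and BookV1Migration are hypothetical names, and the actual mapping rules depend on how your API versions differ.

```java
import java.util.List;

// Hypothetical DTO of the old API version: a book has exactly one author.
record BookV1(String title, String author) {}

// Hypothetical DTO of the current API version: a book has a list of authors.
record BookV2(String title, List<String> authors) {}

// Migration layer: translates between the old and the current API model.
// The current endpoints and the entity classes never see BookV1.
final class BookV1Migration {
    private BookV1Migration() {}

    // Maps an old request to the current model.
    static BookV2 upgrade(BookV1 old) {
        List<String> authors = old.author() == null ? List.of() : List.of(old.author());
        return new BookV2(old.title(), authors);
    }

    // Maps a current response back to the old model.
    // Old clients only see the first author; this lossy choice is an assumption.
    static BookV1 downgrade(BookV2 current) {
        String firstAuthor = current.authors().isEmpty() ? null : current.authors().get(0);
        return new BookV1(current.title(), firstAuthor);
    }
}
```

The old endpoints call upgrade and downgrade and delegate everything else to the current implementation. When you switch off version 1, you delete BookV1 and BookV1Migration, and nothing else changes.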
As you can see, there are several reasons why I don’t like to expose entities in my APIs. But I also agree that none of them creates unsolvable problems. That’s why there are still so many discussions about this topic.
If you’re having this discussion in your team, you need to ask yourself: Do you want to spend the additional effort to fix all these issues to avoid the very basic mapping between entity and DTO classes?
In my experience, it’s just not worth the effort. I prefer to separate my API from my persistence layer and implement a few basic entity-to-DTO mappings. That keeps my code easy to read and gives me the flexibility to change all internal parts of my application without worrying about any clients.