How to fetch multiple entities by id with Hibernate 5
Take your skills to the next level!
The Persistence Hub is the place to be for every Java developer. It gives you access to all my premium video courses, monthly Java Persistence News, monthly coding problems, and regular expert sessions.
It happens quite often that you have to load multiple entities by their primary keys. And new developers are often surprised when they learn that JPA and Hibernate versions before 5.1 do not offer a simple API for it.
With JPA and older Hibernate versions, you have 2 options to fetch multiple entities from the database:
- You call the EntityManager.find method for each primary key which forces Hibernate to execute a query for each of them or
- You create a query and provide the primary key values of all entities in an IN statement, like I did in the following code snippet.
List<Long> ids = Arrays.asList(new Long[]{1L, 2L, 3L}); List<PersonEntity> persons = em.createQuery("SELECT p FROM Person p WHERE p.id IN :ids").setParameter("ids", ids).getResultList();
The first approach might be OK if you only need to fetch a few entities from the database. But it requires too many queries and slows down your application if you to fetch a huge number of them. The second approach avoids these performance issues, but it has a few disadvantages:
- Some databases, like Oracle, don’t support an unlimited number of parameters in an IN statement.
- All entities are fetched in one batch which might become a performance issue if you select a huge number of entities.
- Hibernate loads all entities from the database and doesn’t check if some of them are already stored in the 1st level cache.
You can, of course, handle all these issues in your code but that adds additional complexity and disturbs you from implementing the business logic. Hibernate 5.1 introduced a new API and extended the Hibernate Session to allow you to fetch multiple entities with one API call and to avoid these drawbacks.
Excursus: How to access the Hibernate Session from JPA
If you’re using Hibernate as your JPA implementation, you most often use the EntityManager and not the proprietary Hibernate Session. But that’s not an issue. You just need to call the unwrap() method of the EntityManger to get a Hibernate Session.
Session session = em.unwrap(Session.class);
Load multiple entities by their primary key
The new Hibernate API makes it very easy to load multiple entities with just one API call, as you can see in the following code snippet.
MultiIdentifierLoadAccess<PersonEntity> multiLoadAccess = session.byMultipleIds(PersonEntity.class); List<PersonEntity> persons = multiLoadAccess.multiLoad(1L, 2L, 3L);
You just need to call the byMultipleIds(Class entityClass) method on the Hibernate Session and provide the class of the entities you want to load as a parameter. You then get a typed instance of the MultiIdentifierLoadAccess interface. You can use it to load multiple entities at once and to configure the database interaction. In this code snippet, I just called the multiLoad(K… ids) method with an array of 3 primary keys of the PersonEntity. Hibernate creates one query for this method call and provides the 3 primary keys as parameters to an IN statement.
14:32:57,602 DEBUG SQL:92 – select personenti0_.id as id1_0_0_, personenti0_.firstName as firstNam2_0_0_, personenti0_.lastName as lastName3_0_0_ from Person personenti0_ where personenti0_.id in (?,?,?)
This is the same SQL statement as Hibernate would create if you use your own JPQL query and provide the 3 primary keys as parameters to the IN statement. But that’s just because I provided only 3 primary key values and there was no need to apply any kind of batching.
Load entities in multiple batches
There are different reasons to apply batching to these kinds of queries:
- Not all databases allow an unlimited number of parameters in IN statements.
- You might detect in your business logic that you don’t need all of them.
- You might want to remove a batch of entities from the 1st level cache before you fetch the next one.
By default, Hibernate uses the batch size defined in the database specific dialect you use in your application. You, therefore, don’t need to worry about database limitations. Hibernate’s default behaviour already takes care of it and it’s most often also good enough for performance critical use cases.
But there might be some use cases in which you want to change the batch size. You can do this with the withBatchSize(int batchSize) method on the MultiIdentifierLoadAccess interface.
List<PersonEntity> persons = session.byMultipleIds(PersonEntity.class).withBatchSize(2).multiLoad(1L, 2L, 3L);
As you can see in the following log messages, Hibernate creates multiple select statements, if the number of provided primary keys exceeds the defined batchSize.
15:20:52,314 DEBUG SQL:92 – select personenti0_.id as id1_0_0_, personenti0_.firstName as firstNam2_0_0_, personenti0_.lastName as lastName3_0_0_ from Person personenti0_ where personenti0_.id in (?,?) 15:20:52,331 DEBUG SQL:92 – select personenti0_.id as id1_0_0_, personenti0_.firstName as firstNam2_0_0_, personenti0_.lastName as lastName3_0_0_ from Person personenti0_ where personenti0_.id in (?)
Don’t fetch entities already stored in 1st level cache
If you use a JPQL query to fetch a list of entities, Hibernate fetches all of them from the database and checks afterwards if they are already managed in the current session and stored in the 1st level cache. This can be an overhead if several of the entities were already used within this session and are already stored in the 1st level cache.
With the new MultiIdentifierLoadAccess interface, you can decide if Hibernate shall check the 1st level cache before it executes the database query. This is deactivated by default and you need to call the enableSessionCheck(boolean enabled) to activate it.
PersonEntity p = em.find(PersonEntity.class, 1L); log.info("Fetched PersonEntity with id 1"); Session session = em.unwrap(Session.class); List<PersonEntity> persons = session.byMultipleIds(PersonEntity.class).enableSessionCheck(true).multiLoad(1L, 2L, 3L);
15:34:07,449 DEBUG SQL:92 – select personenti0_.id as id1_0_0_, personenti0_.firstName as firstNam2_0_0_, personenti0_.lastName as lastName3_0_0_ from Person personenti0_ where personenti0_.id=? 15:34:07,471 INFO TestMultiLoad:118 – Fetched PersonEntity with id 1 15:34:07,476 DEBUG SQL:92 – select personenti0_.id as id1_0_0_, personenti0_.firstName as firstNam2_0_0_, personenti0_.lastName as lastName3_0_0_ from Person personenti0_ where personenti0_.id in (?,?)
You can see in the log messages that Hibernate executed a query for the call of the EntityManager.find(Class entityClass, Object primaryKey) method. The PersonEntity with id 1 is now stored in the 1st level cache and Hibernate doesn’t fetch it for the following call of the multiLoad(K… ids) method.
Summary
Fetching multiple entities by their primary keys is a common use case. If you have to rely on plain JPA, you have to write your own query for it and make sure that the number of IN parameters doesn’t exceed the database capabilities and optimise the batch size for performance.
Hibernate’s MultiIdentifierLoadAccess interface provides an easier way to do that. You just need to define the class of the entities you want to fetch and provide a List or array of primary keys. Hibernate then uses default values provided by the database specific dialect you use in your application or you provide your own set of configuration data.
Great post. What are your thoughts on using the JPA Criteria API to achieve this?
For instance:
public List findAllUsingCriteriaAPI(Class entityClass, List primaryKeys) {
CriteriaBuilder cb = em.getCriteriaBuilder();
CriteriaQuery cq1 = cb.createQuery(entityClass);
Root root = cq1.from(entityClass);
CriteriaQuery cq2 = cq1.select(root).where(root.get(“id”).in(primaryKeys));
TypedQuery q = em.createQuery(cq2);
return q.getResultList();
}
Hi Mark,
feature-wise the Criteria API and JPQL are identical. So, everything that I wrote about JPQL also applies to the Criteria API. In this case, that means that you have to handle database limitations of the IN clause and batching of huge result sets yourself. If you do that, then using the Criteria API is absolutely fine.
Regards,
Thorben
It’s working, thank you!
Hi author, The entities stored in 1st level cache exist until when? And could you give code example for checking the existing of entities in 1st level cache? Thank so much!
Hi Soleil,
the 1st level cache is associated with your Session. The objects stored in the cache exist until the end of the Session or until they get detached from the current persistence context. I explain that in Hibernate Tips: How to remove entities from the persistence context before doing bulk operations.
Regards,
Thorben
Nice to have feature. In my previous project, we had the same issue of fetching entities in batches.
Thanks!!
Thanks for the article.
How does it deal with Oracle restriction on number of items in IN statement (up to 1000 items in IN) ?
Hi Alex,
By default, Hibernate uses the batch size defined in the database specific dialect you use in your application. That makes sure that Hibernate respects Oracle’s restriction.
Regards,
Thorben
Thanks for very useful article
You’re welcome 🙂
Thanks, nice tips
Very good feature!
I can’t tell how many times I needed to load lots of entities by id and had to deal with batching and 1st level cache to not having performance issues.
Thanks!
Hi! Thanks for a good article.
I had a look at the API, and it seems like it doesn’t check the 2nd level cache for the entities. Am I right, or am I missing something? Any reasons of why they haven’t included that functionality?
Hi Aleksander,
Hibernate only uses the 2nd level cache for calls of the EntityManager.find method and to initialize managed associations. It ignores the cache for all other query operations.
Regards,
Thorben