How to fix Hibernate’s Warning “HHH000104: firstResult/maxResults specified with collection fetch”
Take your skills to the next level!
The Persistence Hub is the place to be for every Java developer. It gives you access to all my premium video courses, monthly Java Persistence News, monthly coding problems, and regular expert sessions.
One of the most common recommendations to improve the performance of your persistence layer is to use JOIN FETCH clauses or EntityGraphs to fetch required associations when loading an entity. I entirely agree with these recommendations, and we discuss this in great detail in the Hibernate Performance Tuning course in the Persistence Hub. But if you follow this advice and call the setFirstResult and setMaxResult methods to limit the size of the result set, you will see the following warning in your log file:
HHH000104: firstResult/maxResults specified with collection fetch; applying in memory!
Hibernate 5 shows that warning if you call the setFirstResult or setMaxResults methods on a query that uses a JOIN FETCH clause or an EntityGraph. Hibernate 6 improved the handling of EntityGraphs and only shows the warning if your query contains a JOIN FETCH clause.
Why Hibernate shows the HHH000104 warning
The reason for this warning becomes obvious when you take a look at the SQL statement that Hibernate has to generate when you use a JOIN FETCH clause or an EntityGraph. Both approaches tell Hibernate to initialize a managed association between 2 entity classes. To do that, Hibernate needs to join the associated tables and select all columns mapped by the entity classes. This combines the records in both tables and increases the size of the result set. That causes problems if you want to limit its size by calling the setFirstResult and setMaxResults methods.
Let’s take a look at an example.
I modeled a many-to-many association between the ChessTournament and the ChessPlayer entity classes. The best practice to work with this association is to use the default FetchType.LAZY and a JOIN FETCH clause or EntityGraph to initialize it if needed.
Hibernate then fetches all the required information using 1 SQL statement. But it triggers the previously shown warning if you limit the size of your query result. You can see an example of that in the following code snippet.
TypedQuery<ChessTournament> q = em.createQuery("""
SELECT t
FROM ChessTournament t
LEFT JOIN FETCH t.players
WHERE t.name LIKE :name""",
ChessTournament.class);
q.setParameter("name", "%Chess%");
q.setFirstResult(0);
q.setMaxResults(5);
List<ChessTournament> tournaments = q.getResultList();
As expected, Hibernate wrote the HHH000104 warning to the log file. And it didn’t add a LIMIT or OFFSET clause to limit the result set size even though I set the firstResult to 0 and maxResult to 5.
15:56:57,623 WARN [org.hibernate.hql.internal.ast.QueryTranslatorImpl] - HHH000104: firstResult/maxResults specified with collection fetch; applying in memory!
15:56:57,626 DEBUG [org.hibernate.SQL] -
select
chesstourn0_.id as id1_1_0_,
chessplaye2_.id as id1_0_1_,
chesstourn0_.endDate as enddate2_1_0_,
chesstourn0_.name as name3_1_0_,
chesstourn0_.startDate as startdat4_1_0_,
chesstourn0_.version as version5_1_0_,
chessplaye2_.birthDate as birthdat2_0_1_,
chessplaye2_.firstName as firstnam3_0_1_,
chessplaye2_.lastName as lastname4_0_1_,
chessplaye2_.version as version5_0_1_,
players1_.ChessTournament_id as chesstou1_2_0__,
players1_.players_id as players_2_2_0__
from
ChessTournament chesstourn0_
left outer join
ChessTournament_ChessPlayer players1_
on chesstourn0_.id=players1_.ChessTournament_id
left outer join
ChessPlayer chessplaye2_
on players1_.players_id=chessplaye2_.id
where
chesstourn0_.name like ?
The reason for that becomes visible when you execute the same statement in an SQL client. By joining the managed association and selecting all columns mapped by the ChessTournament and ChessPlayer entity classes, the query’s result set is a product of the records in the ChessTournament table and the associated records in the ChessPlayer table.
Each record in the result set is a unique combination of a tournament and one of its players. That’s the expected way how relational databases handle such a query. But it creates a problem, in the special case of a JOIN FETCH clause or an EntityGraph.
Usually, Hibernate uses the firstResult and maxResult values to apply the pagination in the SQL statement. These tell the database only to return a part of the result set. In the previous examples, I called the setFirstResult method with 0 and the setMaxResults method with 5. If Hibernate would apply the standard handling of these parameters to the generated SQL statement, the database would only return the first 5 rows of the result set. As you can see in the following image, these records contain the Tata Steel Chess Tournament 2021 with 4 of its players and the Tata Steel Chess Tournament 2022 with 1 of its players.
But that’s not what we intended with the JPQL query. The provided firstResult and maxResult values were supposed to return the first 5 ChessTournament entities with all associated ChessPlayer entities. They were supposed to define pagination for the returned ChessTournament entity objects and not of the product in the SQL result set.
That’s why Hibernate writes the warning to the log file and applies the pagination in memory. It executes the SQL statement without any pagination. The database then returns all ChessTournament entities and their associated ChessPlayers. And Hibernate limits the size of the returned List<ChessTournament> when it parses the result set.
Even though this approach provides the correct result, it puts you at the risk of severe performance problems. Depending on the size of your database, the query might select several thousand records and slow down your application.
How to avoid the HHH000104 warning
The best way to avoid Hibernate’s warning and potential performance problems is to execute 2 queries. The 1st query selects the primary keys of all ChessTournament entities you want to retrieve. This query doesn’t fetch the associations, and you can use the setFirstResult and setMaxResult methods to limit the size of the result set. The 2nd one fetches those entities and their associated ChessPlayers.
TypedQuery<Long> idQuery = em.createQuery("""
SELECT t.id
FROM ChessTournament t
WHERE t.name LIKE :name""",
Long.class);
idQuery.setParameter("name", "%Chess%");
idQuery.setFirstResult(0);
idQuery.setMaxResults(5);
List<Long> tournamentIds = idQuery.getResultList();
TypedQuery<ChessTournament> tournamentQuery = em.createQuery("""
SELECT t
FROM ChessTournament t
LEFT JOIN FETCH t.players
WHERE t.id IN :ids""",
ChessTournament.class);
tournamentQuery.setParameter("ids", tournamentIds);
List<ChessTournament> tournaments = tournamentQuery.getResultList();
tournaments.forEach(t -> log.info(t));
The previous code snippet uses Hibernate 6. If you’re using Hibernate 5, you should add the DISTINCT keyword to your 2nd query and set the hint hibernate.query.passDistinctThrough to false. As I explained in a previous article about Hibernate performance tuning, this prevents Hibernate from returning a reference to a ChessTournament object for each of its players.
TypedQuery<Long> idQuery = em.createQuery("""
SELECT t.id
FROM ChessTournament t
WHERE t.name LIKE :name""",
Long.class);
idQuery.setParameter("name", "%Chess%");
idQuery.setFirstResult(0);
idQuery.setMaxResults(5);
List<Long> tournamentIds = idQuery.getResultList();
TypedQuery<ChessTournament> tournamentQuery = em.createQuery("""
SELECT DISTINCT t
FROM ChessTournament t
LEFT JOIN FETCH t.players
WHERE t.id IN :ids""",
ChessTournament.class);
tournamentQuery.setParameter("ids", tournamentIds);
tournamentQuery.setHint(QueryHints.PASS_DISTINCT_THROUGH, false);
List<ChessTournament> tournaments = tournamentQuery.getResultList();
This approach might look more complex and executes 2 statements instead of 1, but it separates the pagination of the query’s result set from the initialization of the players association. This enables Hibernate to add the pagination to the 1st query statement and prevents it from fetching the entire result set and applying the pagination in memory. That resolves the warning and improves your application’s performance if you’re working with a huge database.
07:30:04,557 DEBUG [org.hibernate.SQL] -
select
c1_0.id
from
ChessTournament c1_0
where
c1_0.name like ? escape '' offset ? rows fetch first ? rows only
07:30:04,620 DEBUG [org.hibernate.SQL] -
select
c1_0.id,
c1_0.endDate,
c1_0.name,
p1_0.ChessTournament_id,
p1_1.id,
p1_1.birthDate,
p1_1.firstName,
p1_1.lastName,
p1_1.version,
c1_0.startDate,
c1_0.version
from
ChessTournament c1_0
left join
(ChessTournament_ChessPlayer p1_0
join
ChessPlayer p1_1
on p1_1.id=p1_0.players_id)
on c1_0.id=p1_0.ChessTournament_id
where
c1_0.id in(?,?,?)
07:30:04,666 INFO [com.thorben.janssen.sample.TestSample] - ChessTournament [id=1, name=Tata Steel Chess Tournament 2021, startDate=2021-01-14, endDate=2021-01-30, version=0]
07:30:04,666 INFO [com.thorben.janssen.sample.TestSample] - ChessTournament [id=2, name=Tata Steel Chess Tournament 2022, startDate=2022-01-14, endDate=2022-01-30, version=0]
07:30:04,666 INFO [com.thorben.janssen.sample.TestSample] - ChessTournament [id=3, name=2022 Superbet Chess Classic Romania, startDate=2022-05-03, endDate=2022-05-15, version=0]
Conclusion
You should use JOIN FETCH clauses or EntityGraphs to initialize the associations you’re using in your business code. This avoids n+1 select issues and improves the performance of your application.
But if you want to limit the size of the result set by calling the setFirstResult and setMaxResult methods, the fetching of associated entities creates a problem. The result set then contains the combination of all matching records in the joined tables. If Hibernate limited the size of that result set, it would limit the number of combinations instead of the number of selected entities. It instead fetches the entire result set and applies the pagination in memory. Depending on the size of the result set, this can cause severe performance problems.
You can avoid that by executing 2 query statements. The first one applies pagination when it fetches the primary keys of all records you want to retrieve. In the example of this post, these were the id values of all ChessTournament entities that matched the WHERE clause. The 2nd query then uses the list of primary key values to get the entity objects and initializes the required associations.