Hibernate Tips: How To Apply DISTINCT to Your JPQL But Not Your SQL Query


Take your skills to the next level!

The Persistence Hub is the place to be for every Java developer. It gives you access to all my premium video courses, monthly Java Persistence News, monthly coding problems, and regular expert sessions.


Hibernate Tips is a series of posts in which I describe a quick and easy solution for common Hibernate questions. If you have a question for a future Hibernate Tip, please post a comment below.

Question:

I’m using a query with a JOIN FETCH clause to load parent entities with all their child entities. The JPQL query requires the DISTINCT keyword to return each parent entity only once. I don’t want the DISTINCT in the SQL query. How can I tell Hibernate to use the DISTINCT only in the JPQL query?

Solution:

JPA’s JOIN FETCH clause provides a great and easy to use way to load an entity and its lazy associations within one query. From a performance point of view, this is much better than using additional queries to initialize the associations. That’s especially the case, if you load a List of entities.

But if you use this clause for the first time, you might be surprised that the List of entities returned by the getResultList method contains duplicates. The SQL query joins the associated database records, and the result set consists of combinations of a parent record with each child record. Unfortunately, Hibernate doesn’t resolve these duplicates by default.

Here you can see an example of such a JPQL query, the executed SQL query and the returned entities.

List<Author> authors = em.createQuery(
				"SELECT a FROM Author a JOIN FETCH a.books",
				Author.class).getResultList();
for (Author a : authors) {
	log.info(a.getFirstName() + " " + a.getLastName() + " wrote "
			+ a.getBooks().size() + " books.");
}

As you can see in the log statements, the List<Author> authors contains as many references to the Author 1 and the Author 2 entities as they’ve written books. Each of these references points to the same entity object, which contains a fully initialized list of related Book entities.

08:31:06,047 DEBUG [org.hibernate.SQL] - 
    select
        author0_.id as id1_0_0_,
        book2_.id as id1_1_1_,
        author0_.firstName as firstNam2_0_0_,
        author0_.lastName as lastName3_0_0_,
        author0_.version as version4_0_0_,
        book2_.publisherid as publishe5_1_1_,
        book2_.publishingDate as publishi2_1_1_,
        book2_.title as title3_1_1_,
        book2_.version as version4_1_1_,
        books1_.authorId as authorId2_2_0__,
        books1_.bookId as bookId1_2_0__ 
    from
        Author author0_ 
    inner join
        BookAuthor books1_ 
            on author0_.id=books1_.authorId 
    inner join
        Book book2_ 
            on books1_.bookId=book2_.id
08:31:06,227 INFO  [org.thoughts.on.java.model.TestJoinFetch] - Author 1 wrote 2 books.
08:31:06,227 INFO  [org.thoughts.on.java.model.TestJoinFetch] - Author 1 wrote 2 books.
08:31:06,227 INFO  [org.thoughts.on.java.model.TestJoinFetch] - Author 2 wrote 3 books.
08:31:06,227 INFO  [org.thoughts.on.java.model.TestJoinFetch] - Author 2 wrote 3 books.
08:31:06,227 INFO  [org.thoughts.on.java.model.TestJoinFetch] - Author 2 wrote 3 books.
08:31:06,227 INFO  [org.thoughts.on.java.model.TestJoinFetch] - Author 3 wrote 1 books.

Only return unique results

You can add the DISTINCT keyword to your query to tell Hibernate to return each Author entity only once.

List<Author> authors = em.createQuery(
				"SELECT DISTINCT a FROM Author a JOIN FETCH a.books",
				Author.class).getResultList();
for (Author a : authors) {
	log.info(a.getFirstName() + " " + a.getLastName() + " wrote "
			+ a.getBooks().size() + " books.");
}

But as you can see in the following log messages, Hibernate also adds the DISTINCT keyword to the SQL query. This is often not intended and might result in an efficient database query.

08:35:53,031 DEBUG [org.hibernate.SQL] - 
    select
        distinct author0_.id as id1_0_0_,
        book2_.id as id1_1_1_,
        author0_.firstName as firstNam2_0_0_,
        author0_.lastName as lastName3_0_0_,
        author0_.version as version4_0_0_,
        book2_.publisherid as publishe5_1_1_,
        book2_.publishingDate as publishi2_1_1_,
        book2_.title as title3_1_1_,
        book2_.version as version4_1_1_,
        books1_.authorId as authorId2_2_0__,
        books1_.bookId as bookId1_2_0__ 
    from
        Author author0_ 
    inner join
        BookAuthor books1_ 
            on author0_.id=books1_.authorId 
    inner join
        Book book2_ 
            on books1_.bookId=book2_.id
08:35:53,686 INFO  [org.thoughts.on.java.model.TestJoinFetch] - Author 3 wrote 1 books.
08:35:53,687 INFO  [org.thoughts.on.java.model.TestJoinFetch] - Author 2 wrote 3 books.
08:35:53,688 INFO  [org.thoughts.on.java.model.TestJoinFetch] - Author 1 wrote 2 books.

The passDistinctThrough Hint

Since Hibernate 5.2, you can prevent Hibernate from adding the DISTINCT keyword to the SQL statement by setting the query hint hibernate.query.passDistinctThrough to false. The easiest way to set this hint is to use the constant defined in Hibernate’s org.hibernate.jpa.QueryHints and org.hibernate.annotations.QueryHints class.

TypedQuery<Author> q = em.createQuery("SELECT DISTINCT a FROM Author a JOIN FETCH a.books", Author.class);
q.setHint(QueryHints.HINT_PASS_DISTINCT_THROUGH, false);
List<Author> authors = q.getResultList();
for (Author a : authors) {
	log.info(a.getFirstName() + " " + a.getLastName() + " wrote "
			+ a.getBooks().size() + " books.");
}

As you can see in the log messages, Hibernate no longer adds the DISTINCT keyword to the SQL statement. It now only applies it to the SQL result set and returns each Author entity only once.

08:36:55,075 DEBUG [org.hibernate.SQL] - 
    select
        author0_.id as id1_0_0_,
        book2_.id as id1_1_1_,
        author0_.firstName as firstNam2_0_0_,
        author0_.lastName as lastName3_0_0_,
        author0_.version as version4_0_0_,
        book2_.publisherid as publishe5_1_1_,
        book2_.publishingDate as publishi2_1_1_,
        book2_.title as title3_1_1_,
        book2_.version as version4_1_1_,
        books1_.authorId as authorId2_2_0__,
        books1_.bookId as bookId1_2_0__ 
    from
        Author author0_ 
    inner join
        BookAuthor books1_ 
            on author0_.id=books1_.authorId 
    inner join
        Book book2_ 
            on books1_.bookId=book2_.id
08:36:55,182 INFO  [org.thoughts.on.java.model.TestJoinFetch] - Author 1 wrote 2 books.
08:36:55,183 INFO  [org.thoughts.on.java.model.TestJoinFetch] - Author 2 wrote 3 books.
08:36:55,183 INFO  [org.thoughts.on.java.model.TestJoinFetch] - Author 3 wrote 1 books.

Learn more:

If you want to learn more about lazy associations or query hints, you might enjoy reading the following articles:

Hibernate Tips Book

Get more recipes like this one in my new book Hibernate Tips: More than 70 solutions to common Hibernate problems.

It gives you more than 70 ready-to-use recipes for topics like basic and advanced mappings, logging, Java 8 support, caching, and statically and dynamically defined queries.

Get it now!

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.