Mapping Collections with Hibernate and JPA
JPA and Hibernate provide 3 main options to map a Collection. If it’s a Collection of other entities, you can model it as a to-many association. This is the most common mapping. But you can also map it as an @ElementCollection or as a basic type. In this article, I will show you all 3 options and explain their advantages and disadvantages.
Map a Collection as an Association
If you’re not completely new to Hibernate, I’m sure you have modeled at least 1 one-to-many or many-to-many association. These are the most common Collection mappings because they are easy to implement, fit a relational table model very well, and provide great performance. To model such an association in your domain model, you only need to add an attribute of type java.util.List or java.util.Set to your entity class and annotate it with @OneToMany or @ManyToMany.
@Entity
public class Book {

    @ManyToMany
    private Set<Author> authors = new HashSet<>();

    ...
}
You can also map your association to a java.util.Map. But that’s outside of the scope of this article. If you want to learn more about it, please read my article: How to map an association as a java.util.Map.
Mapping Pitfalls You Should Avoid
I wrote earlier that mapping a Collection as an association is simple. But that doesn’t mean there are no pitfalls you should avoid. Let’s look at the ones that I see most often during code reviews and project coaching engagements.
Don’t use FetchType.EAGER
The most common one is the usage of the wrong FetchType. As I explained in a previous article, the FetchType defines when Hibernate initializes an association. When you use FetchType.EAGER, it initializes the association when you load the entity. You should avoid this FetchType because it fetches all association elements even if you don’t use them.
@Entity
public class Book {

    // Don't do this
    @ManyToMany(fetch = FetchType.EAGER)
    private Set<Author> authors = new HashSet<>();

    ...
}
FetchType.LAZY is the default for all to-many associations, and it provides much better performance. Hibernate then only fetches the association when you use it in your business code.
Book b = em.find(Book.class, 1L);

// get associated Author entities from database
b.getAuthors();
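The idea behind lazy fetching can be sketched in plain Java, without Hibernate. This is only a conceptual model under stated assumptions: LazyCollection is a hypothetical class, the Supplier stands in for the SQL query, and the author names are made up. Hibernate's own collection wrappers are more involved.

```java
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Conceptual model of lazy initialization. LazyCollection is a hypothetical
// class for illustration, not part of Hibernate or JPA.
final class LazyCollection<T> {

    private final Supplier<Set<T>> loader; // stands in for the database query
    private Set<T> elements;               // stays null until the first access

    LazyCollection(Supplier<Set<T>> loader) {
        this.loader = loader;
    }

    boolean isInitialized() {
        return elements != null;
    }

    // Runs the loader on the first call only and reuses the result afterwards.
    Set<T> get() {
        if (elements == null) {
            elements = loader.get();
        }
        return elements;
    }

    public static void main(String[] args) {
        AtomicInteger queries = new AtomicInteger();
        Supplier<Set<String>> query = () -> {
            queries.incrementAndGet();
            return Set.of("Author A", "Author B"); // made-up data
        };
        LazyCollection<String> authors = new LazyCollection<>(query);

        System.out.println(queries.get()); // 0, nothing loaded yet
        authors.get();
        authors.get();
        System.out.println(queries.get()); // 1, only the first get() ran the query
    }
}
```

An eager variant would run the loader in the constructor, whether or not the caller ever asks for the elements.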
Prefer java.util.Set Over java.util.List
Another typical mistake is the mapping of a many-to-many association as a java.util.List. A List might seem like the most efficient and obvious mapping in your Java code. But as I showed in great detail before, Hibernate handles this association very inefficiently when you change its elements. Instead of adding only the new or deleting only the removed association between 2 entities, Hibernate removes all of them before inserting all remaining ones. Depending on the association’s size, this can result in tens or even hundreds of unnecessary database operations and significantly slow down your application.
So, if your many-to-many association doesn’t need to support multiple associations between the same entities, you should model it as a java.util.Set.
@Entity
public class Book {

    @ManyToMany
    private Set<Author> authors = new HashSet<>();

    ...
}
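Whether you need "multiple associations between the same entities" comes down to plain Java collection semantics. The sketch below uses String values as a hypothetical stand-in for entities, and the class and method names are made up for illustration. With real entities, a Set relies on their equals() and hashCode() implementations.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Strings stand in for Author entities; all names here are illustrative.
final class ListVersusSet {

    // A List keeps every entry, including repeated ones.
    static List<String> linkAsList(String... authorNames) {
        List<String> authors = new ArrayList<>();
        for (String name : authorNames) {
            authors.add(name);
        }
        return authors;
    }

    // A Set silently ignores an element it already contains.
    static Set<String> linkAsSet(String... authorNames) {
        Set<String> authors = new HashSet<>();
        for (String name : authorNames) {
            authors.add(name);
        }
        return authors;
    }

    public static void main(String[] args) {
        System.out.println(linkAsList("Author A", "Author A", "Author B").size()); // 3
        System.out.println(linkAsSet("Author A", "Author A", "Author B").size());  // 2
    }
}
```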
Be Careful About CascadeType.REMOVE
Cascading applies the lifecycle state change of a parent entity to all its child entities. You can activate it by referencing the type of operation you want to cascade in the cascade attribute of the one-to-many or many-to-many annotation.
@Entity
public class Author {

    @ManyToMany(cascade = CascadeType.REMOVE)
    private Set<Book> books = new HashSet<>();

    ...
}
This works well for all parent-child associations in which the child depends on its parent. In almost all cases, these are one-to-many associations. A typical example is an order with its order positions.
Many-to-many associations only rarely represent parent-child associations, and you should avoid cascading on them. That’s especially the case for CascadeType.REMOVE. Using it on both ends of a many-to-many association can bounce the cascade operation back and forth between the 2 tables until all records are removed.
But that’s not the only issue. Even if you only use CascadeType.REMOVE on one side of your many-to-many association, you might delete more data than you expected. Let’s use the example that I showed you before, which activates CascadeType.REMOVE on the books association attribute of the Author entity.
If you now remove an Author entity, your persistence provider will cascade the operation to all associated Book entities. As a result, all of them will get removed. Unfortunately, that includes all books that have been written by more than one author.
Author a1 = em.find(Author.class, 1L);
log.info("Before remove: "
        + a1.getBooks().stream().map(b -> b.getTitle()).collect(Collectors.joining(", ")));

Author a2 = em.find(Author.class, 2L);
em.remove(a2);
em.flush();
em.clear();

a1 = em.find(Author.class, 1L);
log.info("After remove: "
        + a1.getBooks().stream().map(b -> b.getTitle()).collect(Collectors.joining(", ")));
17:18:17,588 DEBUG [org.hibernate.SQL] - select author0_.id as id1_0_0_, author0_.name as name2_0_0_ from Author author0_ where author0_.id=?
17:18:17,612 DEBUG [org.hibernate.SQL] - select books0_.authors_id as authors_2_2_0_, books0_.books_id as books_id1_2_0_, book1_.id as id1_1_1_, book1_.title as title2_1_1_ from Book_Author books0_ inner join Book book1_ on books0_.books_id=book1_.id where books0_.authors_id=?
Nov 02, 2020 5:18:17 PM com.thorben.janssen.TestCollectionMapping testCascadeRemove
INFORMATION: Before remove: A book about everything, Hibernate Tips
17:18:17,618 DEBUG [org.hibernate.SQL] - select author0_.id as id1_0_0_, author0_.name as name2_0_0_ from Author author0_ where author0_.id=?
17:18:17,624 DEBUG [org.hibernate.SQL] - select books0_.authors_id as authors_2_2_0_, books0_.books_id as books_id1_2_0_, book1_.id as id1_1_1_, book1_.title as title2_1_1_ from Book_Author books0_ inner join Book book1_ on books0_.books_id=book1_.id where books0_.authors_id=?
17:18:17,642 DEBUG [org.hibernate.SQL] - delete from Book_Author where books_id=?
17:18:17,644 DEBUG [org.hibernate.SQL] - delete from Book_Author where books_id=?
17:18:17,647 DEBUG [org.hibernate.SQL] - delete from Book where id=?
17:18:17,650 DEBUG [org.hibernate.SQL] - delete from Book where id=?
17:18:17,653 DEBUG [org.hibernate.SQL] - delete from Author where id=?
17:18:17,659 DEBUG [org.hibernate.SQL] - select author0_.id as id1_0_0_, author0_.name as name2_0_0_ from Author author0_ where author0_.id=?
17:18:17,662 DEBUG [org.hibernate.SQL] - select books0_.authors_id as authors_2_2_0_, books0_.books_id as books_id1_2_0_, book1_.id as id1_1_1_, book1_.title as title2_1_1_ from Book_Author books0_ inner join Book book1_ on books0_.books_id=book1_.id where books0_.authors_id=?
Nov 02, 2020 5:18:17 PM com.thorben.janssen.TestCollectionMapping testCascadeRemove
INFORMATION: After remove: Hibernate Tips
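One way to avoid this, sketched here in plain Java under stated assumptions, is to leave the cascade off and cut the links between the author and the books before you remove the author. The Author and Book classes below are simplified stand-ins without JPA annotations, and addBook and detachFromBooks are hypothetical helper names. In a real application, you would call em.remove on the author afterwards.

```java
import java.util.HashSet;
import java.util.Set;

final class DetachDemo {
    public static void main(String[] args) {
        Author a1 = new Author("Author 1");
        Author a2 = new Author("Author 2");
        Book shared = new Book("A book about everything");
        Book own = new Book("Hibernate Tips");
        a1.addBook(shared);
        a1.addBook(own);
        a2.addBook(shared);

        a2.detachFromBooks(); // removes only the links of a2

        System.out.println(a1.books.size());       // 2, both books are still there
        System.out.println(shared.authors.size()); // 1, only a1 is left as an author
    }
}

// Simplified stand-ins for the entities; JPA annotations are omitted on purpose.
final class Book {
    final String title;
    final Set<Author> authors = new HashSet<>();

    Book(String title) {
        this.title = title;
    }
}

final class Author {
    final String name;
    final Set<Book> books = new HashSet<>();

    Author(String name) {
        this.name = name;
    }

    // Keeps both sides of the association in sync.
    void addBook(Book book) {
        books.add(book);
        book.authors.add(this);
    }

    // Hypothetical helper: removes the links to the books, not the books themselves.
    void detachFromBooks() {
        for (Book book : books) {
            book.authors.remove(this);
        }
        books.clear();
    }
}
```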
Map a Collection as an @ElementCollection
An @ElementCollection enables you to map a Collection of values that are not entities themselves. This might seem like an easy solution for lists of basic attributes, like the phone numbers of a person. In the database, Hibernate maps the @ElementCollection to a separate table. Each value of the collection gets stored as a separate record.
@Entity
public class Author {

    @ElementCollection
    private List<String> phoneNumbers = new ArrayList<>();

    public List<String> getPhoneNumbers() {
        return phoneNumbers;
    }

    public void setPhoneNumbers(List<String> phoneNumbers) {
        this.phoneNumbers = phoneNumbers;
    }

    ...
}
But the mapping as an @ElementCollection has a downside: The elements of the collection don’t have their own identity and lifecycle. They are a part of the surrounding entity. This often becomes a performance issue if you need to change the elements in the collection. Because they don’t have their own identity, all elements of an @ElementCollection are always read, removed, and written, even if you only add, change, or remove one of them. This makes write operations on an @ElementCollection much more expensive than the same operation on a mapped association.
Author a = em.find(Author.class, 1L);
a.getPhoneNumbers().add("345-543");
17:33:20,988 DEBUG [org.hibernate.SQL] - select author0_.id as id1_0_0_, author0_.name as name2_0_0_ from Author author0_ where author0_.id=?
17:33:21,011 DEBUG [org.hibernate.SQL] - select phonenumbe0_.Author_id as author_i1_1_0_, phonenumbe0_.phoneNumbers as phonenum2_1_0_ from Author_phoneNumbers phonenumbe0_ where phonenumbe0_.Author_id=?
17:33:21,031 DEBUG [org.hibernate.SQL] - delete from Author_phoneNumbers where Author_id=?
17:33:21,034 DEBUG [org.hibernate.SQL] - insert into Author_phoneNumbers (Author_id, phoneNumbers) values (?, ?)
17:33:21,038 DEBUG [org.hibernate.SQL] - insert into Author_phoneNumbers (Author_id, phoneNumbers) values (?, ?)
17:33:21,040 DEBUG [org.hibernate.SQL] - insert into Author_phoneNumbers (Author_id, phoneNumbers) values (?, ?)
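The write statements in this log follow a simple pattern: 1 delete for the whole collection, followed by 1 insert per element. The following plain Java sketch only models that pattern and is not Hibernate's implementation. The class and method names are made up, and the 2 existing phone numbers are hypothetical values.

```java
import java.util.ArrayList;
import java.util.List;

// Models the write pattern from the log: the whole collection table is rewritten.
// Class and method names are hypothetical; this is not Hibernate code.
final class ElementCollectionWrites {

    static List<String> rewriteAll(long authorId, List<String> phoneNumbers) {
        List<String> statements = new ArrayList<>();
        // 1 statement removes all records of this author ...
        statements.add("delete from Author_phoneNumbers where Author_id=" + authorId);
        // ... and every element gets written again, changed or not.
        for (String number : phoneNumbers) {
            statements.add("insert into Author_phoneNumbers (Author_id, phoneNumbers) values ("
                    + authorId + ", '" + number + "')");
        }
        return statements;
    }

    public static void main(String[] args) {
        List<String> phoneNumbers = new ArrayList<>(List.of("123-456", "234-567")); // hypothetical
        phoneNumbers.add("345-543");

        List<String> statements = rewriteAll(1L, phoneNumbers);
        statements.forEach(System.out::println);
        System.out.println(statements.size() + " statements"); // 4 statements
    }
}
```

Adding 1 element to a collection of 2 therefore costs 4 write statements in this model, and the number grows with every element the collection already contains.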
I therefore recommend modeling an additional entity and a one-to-many association instead of an @ElementCollection. This enables you to use lazy loading and to update these values independently of each other. Doing that requires only a minimum amount of code but provides much better performance.
Map a Collection as a Basic Type
Hibernate can map a Collection as a basic type that gets mapped to 1 database column. You only rarely see this kind of mapping in a project. There are 3 reasons for that:
- This mapping makes it hard to search for records with a specific collection value.
- Similar to an @ElementCollection, the collection with all its elements becomes part of the entity object itself and has to follow its lifecycle.
- You need to implement your own basic type and type descriptor.
If you want to use this mapping, the basic type and type descriptor implementations are not complex.
Your type descriptor needs to extend Hibernate’s AbstractTypeDescriptor and implement a mapping from and to the String representation you want to store in the database.
public class CustomCollectionTypeDescriptor extends AbstractTypeDescriptor<List> {

    public static final String DELIMITER = "-";

    public CustomCollectionTypeDescriptor() {
        super(
            List.class,
            new MutableMutabilityPlan<List>() {
                @Override
                protected List deepCopyNotNull(List value) {
                    return new ArrayList<String>( value );
                }
            }
        );
    }

    @Override
    public String toString(List value) {
        return ((List<String>) value).stream().collect(Collectors.joining(DELIMITER));
    }

    @Override
    public List fromString(String string) {
        // Copy into an ArrayList so that the loaded collection stays modifiable;
        // Arrays.asList alone returns a fixed-size list
        return new ArrayList<>(Arrays.asList(string.split(DELIMITER)));
    }

    @Override
    public <X> X unwrap(List value, Class<X> type, WrapperOptions options) {
        return (X) toString(value);
    }

    @Override
    public <X> List wrap(X value, WrapperOptions options) {
        return fromString((String) value);
    }
}
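You can try out the conversion at the core of this descriptor without Hibernate. The sketch below is a standalone helper with assumed names (DelimitedListCodec, toColumn, fromColumn) and made-up topic values. It uses the same delimiter and also shows that an element containing the delimiter does not survive the round trip.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Standalone illustration of the String conversion; all names are hypothetical.
final class DelimitedListCodec {

    static final String DELIMITER = "-";

    // Merges all elements into the single value stored in the column.
    static String toColumn(List<String> values) {
        return String.join(DELIMITER, values);
    }

    // Splits the column value back into its elements.
    static List<String> fromColumn(String column) {
        return new ArrayList<>(Arrays.asList(column.split(DELIMITER)));
    }

    public static void main(String[] args) {
        String column = toColumn(List.of("Java", "Hibernate")); // made-up values
        System.out.println(column);             // Java-Hibernate
        System.out.println(fromColumn(column)); // [Java, Hibernate]

        // "in-memory" contains the delimiter, so it comes back as 2 elements
        System.out.println(fromColumn(toColumn(List.of("in-memory", "JPA")))); // [in, memory, JPA]
    }
}
```

If your values can contain the delimiter, you need to pick a different one or escape it before writing.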
After you’ve done that, you can extend Hibernate’s AbstractSingleColumnStandardBasicType to implement your basic type.
public class CustomCollectionType extends AbstractSingleColumnStandardBasicType<List> {

    public CustomCollectionType() {
        super(
            VarcharTypeDescriptor.INSTANCE,
            new CustomCollectionTypeDescriptor()
        );
    }

    @Override
    public String getName() {
        return "custom_collection_type";
    }
}
Please make sure that your getName method returns a unique and expressive name for your type. You will use it in the @TypeDef annotation to register the type and in your entity classes to reference it.
@org.hibernate.annotations.TypeDef(name = "custom_collection_type", typeClass = CustomCollectionType.class)
package com.thorben.janssen;
You can then use your type in your entity mapping by annotating your entity attribute with @Type and a reference to the name of your attribute type.
@Entity
public class Book {

    @Type(type = "custom_collection_type")
    private List<String> topics = new ArrayList<>();

    ...
}
Conclusion
As you have seen, you have several options to map a Collection with Hibernate.
The most common approach is to map it as a one-to-many or many-to-many association between 2 entity classes. This mapping is simple and efficient. You can find several articles about it here on the blog:
- Ultimate Guide – Association Mappings with JPA and Hibernate
- Best Practices for Many-To-One and One-To-Many Association Mappings
- Best Practices for Many-to-Many Associations with Hibernate and JPA
If you don’t want to define an entity class to store each collection element in a separate database record, you can use an @ElementCollection. The elements of the collection don’t have their own identity and lifecycle. Because of that, you can’t write them independently. This often results in significantly worse performance compared to the previously described mapping as a separate entity class.
You can also map all elements of your collection to the same database field. This requires a custom type that merges all collection elements during write operations and extracts them while reading. This mapping requires the most effort and is only rarely used.