|

Why you should avoid CascadeType.REMOVE for to-many associations and what to do instead


Take your skills to the next level!

The Persistence Hub is the place to be for every Java developer. It gives you access to all my premium video courses, monthly Java Persistence News, monthly coding problems, and regular expert sessions.


The CascadeTypes REMOVE and ALL, which includes REMOVE, provide a comfortable option to remove an entity together with all its child entities.

But it creates several issues for to-many associations, and you should only use it for to-one relationships.

Problems with CascadeType.REMOVE for To-Many Associations

Most developers worry about deleting too many database records when they use CascadeType.REMOVE. And that’s definitely an issue.

But it’s not the only problem you need to be afraid of.

Too Many Queries

Hibernate might execute a ton of queries to perform the cascade operation on a one-to-many association. And that’s just the smallest issues you should worry about. It slows down your application, but you, at least, don’t lose any data.

Let’s have a look at an example.

A publisher published multiple books. You can model that with 2 simple entities and a one-to-many association.

You can then define the CascadeType.REMOVE on the books association of the Publisher entity.

@Entity
public class Publisher {

	@OneToMany(mappedBy = "publisher", cascade = CascadeType.REMOVE)
	private Set<Book> books = new HashSet<Book>();

	...
}

When you now remove a Publisher entity, Hibernate will also remove the associated Book entities.

Publisher p = em.find(Publisher.class, 1);
em.remove(p);

Everything seems to be fine as long as you don’t activate the logging of the executed SQL statements.

15:32:39,836 DEBUG [org.hibernate.SQL] - select publisher0_.id as id1_3_0_, publisher0_.name as name2_3_0_, publisher0_.version as version3_3_0_ from Publisher publisher0_ where publisher0_.id=?
15:32:39,885 DEBUG [org.hibernate.SQL] - select books0_.publisherid as publishe5_1_0_, books0_.id as id1_1_0_, books0_.id as id1_1_1_, books0_.publisherid as publishe5_1_1_, books0_.publishingDate as publishi2_1_1_, books0_.title as title3_1_1_, books0_.version as version4_1_1_ from Book books0_ where books0_.publisherid=?
15:32:39,933 DEBUG [org.hibernate.SQL] - delete from Book where id=? and version=?
15:32:39,939 DEBUG [org.hibernate.SQL] - delete from Book where id=? and version=?
15:32:39,940 DEBUG [org.hibernate.SQL] - delete from Book where id=? and version=?
15:32:39,942 DEBUG [org.hibernate.SQL] - delete from Publisher where id=? and version=?

As you can see, Hibernate performed 6 SQL statements to remove 1 Publisher and 3 associated Book entities. And this gets even worse with every additional Book the Author has written.

The first SELECT statement was triggered by the em.find method and returned the Publisher entity I want to delete. The second one gets all associated Book entities. In this case, the query returns 3 Books. Hibernate executes 3 additional DELETE statements to remove them one by one. And then it finally deletes the Publisher entity.

That was just a small example with 3 associated entities. But it’s good enough to show how inefficient this approach is. That’s especially the case for huge to-many associations. On a production system, you can easily trigger several hundred queries by just removing 1 entity.

Remove More Than You Expected

You get even bigger problems if you use CascadeType.REMOVE on a many-to-many association. That not only triggers a lot of SQL statements, but you might also delete more database records than you expected.

Let’s have a look at another example.

An author writes multiple books and a book can be written by multiple authors. You can easily model that with an Author and a Book entity and a many-to-many association.

My small test database contains 2 Authors and 3 Books. Book 1 was written by Author 1, Book 2 by Author 1 and Author 2 and Book 3 was written by Author 2.

OK, let’s specify the CascadeType.REMOVE for the books attribute of the Author entity and see what happens.

@Entity
public class Author {

	@ManyToMany(mappedBy = "authors", cascade = CascadeType.REMOVE)
	private Set<Book> books = new HashSet<Book>();

	...
}

When you remove the Author entity with id 1, you probably expect that Hibernate also removes the Book with id 1.

Author a = em.find(Author.class, 1);
em.remove(a);

But what about the Book with id 2?

Author 1 wrote it together with Author 2. So, you probably want to keep it and just remove the association to the removed Author.

But Hibernate removes that Book as well, as you can see in the following log output.

16:21:03,364 DEBUG [org.hibernate.SQL] - select author0_.id as id1_0_0_, author0_.name as name2_0_0_, author0_.version as version3_0_0_ from Author author0_ where author0_.id=?
...
16:21:03,475 DEBUG [org.hibernate.SQL] - select books0_.authorId as authorId2_2_0_, books0_.bookId as bookId1_2_0_, book1_.id as id1_1_1_, book1_.publisherid as publishe5_1_1_, book1_.publishingDate as publishi2_1_1_, book1_.title as title3_1_1_, book1_.version as version4_1_1_, publisher2_.id as id1_3_2_, publisher2_.name as name2_3_2_, publisher2_.version as version3_3_2_ from BookAuthor books0_ inner join Book book1_ on books0_.bookId=book1_.id left outer join Publisher publisher2_ on book1_.publisherid=publisher2_.id where books0_.authorId=?
...
16:21:03,579 DEBUG [org.hibernate.SQL] - delete from BookAuthor where bookId=?
16:21:03,585 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [1] as [BIGINT] - [1]
16:21:03,587 DEBUG [org.hibernate.SQL] - delete from BookAuthor where bookId=?
16:21:03,587 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [1] as [BIGINT] - [2]
16:21:03,589 DEBUG [org.hibernate.SQL] - delete from Book where id=? and version=?
16:21:03,590 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [1] as [BIGINT] - [1]
16:21:03,592 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [2] as [INTEGER] - [0]
16:21:03,629 DEBUG [org.hibernate.SQL] - delete from Book where id=? and version=?
16:21:03,629 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [1] as [BIGINT] - [2]
16:21:03,629 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [2] as [INTEGER] - [0]
16:21:03,631 DEBUG [org.hibernate.SQL] - delete from Author where id=? and version=?
16:21:03,632 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [1] as [BIGINT] - [1]
16:21:03,633 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [2] as [INTEGER] - [0]

And it gets even worse when you specify the CascadeType.REMOVE on both ends of the association.

@Entity
public class Book {

	@ManyToMany(cascade = CascadeType.REMOVE)
	@JoinTable(
		      name="BookAuthor",
		      joinColumns={@JoinColumn(name="bookId", referencedColumnName="id")},
		      inverseJoinColumns={@JoinColumn(name="authorId", referencedColumnName="id")})
	private Set<Author> authors = new HashSet<Author>();

	...
}

When you now remove an Author entity, Hibernate cascades the operation to all associated Book entities. From there, it cascades it to all associated Authors and from there to their Books and so on.

So, in this example, Hibernate will cascade the remove operation from Author 1 to Book 1 and 2. From Book 2 it cascades the operation to Author 2 and from there to Book 3.

16:28:43,483 DEBUG [org.hibernate.SQL] - select author0_.id as id1_0_0_, author0_.name as name2_0_0_, author0_.version as version3_0_0_ from Author author0_ where author0_.id=?
16:28:43,488 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [1] as [BIGINT] - [1]
16:28:43,535 DEBUG [org.hibernate.SQL] - select books0_.authorId as authorId2_2_0_, books0_.bookId as bookId1_2_0_, book1_.id as id1_1_1_, book1_.publisherid as publishe5_1_1_, book1_.publishingDate as publishi2_1_1_, book1_.title as title3_1_1_, book1_.version as version4_1_1_, publisher2_.id as id1_3_2_, publisher2_.name as name2_3_2_, publisher2_.version as version3_3_2_ from BookAuthor books0_ inner join Book book1_ on books0_.bookId=book1_.id left outer join Publisher publisher2_ on book1_.publisherid=publisher2_.id where books0_.authorId=?
16:28:43,536 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [1] as [BIGINT] - [1]
16:28:43,553 DEBUG [org.hibernate.SQL] - select authors0_.bookId as bookId1_2_0_, authors0_.authorId as authorId2_2_0_, author1_.id as id1_0_1_, author1_.name as name2_0_1_, author1_.version as version3_0_1_ from BookAuthor authors0_ inner join Author author1_ on authors0_.authorId=author1_.id where authors0_.bookId=?
16:28:43,554 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [1] as [BIGINT] - [1]
16:28:43,562 DEBUG [org.hibernate.SQL] - select authors0_.bookId as bookId1_2_0_, authors0_.authorId as authorId2_2_0_, author1_.id as id1_0_1_, author1_.name as name2_0_1_, author1_.version as version3_0_1_ from BookAuthor authors0_ inner join Author author1_ on authors0_.authorId=author1_.id where authors0_.bookId=?
16:28:43,563 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [1] as [BIGINT] - [2]
16:28:43,583 DEBUG [org.hibernate.SQL] - select books0_.authorId as authorId2_2_0_, books0_.bookId as bookId1_2_0_, book1_.id as id1_1_1_, book1_.publisherid as publishe5_1_1_, book1_.publishingDate as publishi2_1_1_, book1_.title as title3_1_1_, book1_.version as version4_1_1_, publisher2_.id as id1_3_2_, publisher2_.name as name2_3_2_, publisher2_.version as version3_3_2_ from BookAuthor books0_ inner join Book book1_ on books0_.bookId=book1_.id left outer join Publisher publisher2_ on book1_.publisherid=publisher2_.id where books0_.authorId=?
16:28:43,583 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [1] as [BIGINT] - [2]
16:28:43,600 DEBUG [org.hibernate.SQL] - select authors0_.bookId as bookId1_2_0_, authors0_.authorId as authorId2_2_0_, author1_.id as id1_0_1_, author1_.name as name2_0_1_, author1_.version as version3_0_1_ from BookAuthor authors0_ inner join Author author1_ on authors0_.authorId=author1_.id where authors0_.bookId=?
16:28:43,601 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [1] as [BIGINT] - [3]
16:28:43,627 DEBUG [org.hibernate.SQL] - delete from BookAuthor where bookId=?
16:28:43,627 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [1] as [BIGINT] - [1]
16:28:43,629 DEBUG [org.hibernate.SQL] - delete from BookAuthor where bookId=?
16:28:43,629 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [1] as [BIGINT] - [2]
16:28:43,630 DEBUG [org.hibernate.SQL] - delete from BookAuthor where bookId=?
16:28:43,631 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [1] as [BIGINT] - [3]
16:28:43,632 DEBUG [org.hibernate.SQL] - delete from Book where id=? and version=?
16:28:43,633 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [1] as [BIGINT] - [1]
16:28:43,635 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [2] as [INTEGER] - [0]
16:28:43,638 DEBUG [org.hibernate.SQL] - delete from Book where id=? and version=?
16:28:43,639 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [1] as [BIGINT] - [3]
16:28:43,639 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [2] as [INTEGER] - [0]
16:28:43,640 DEBUG [org.hibernate.SQL] - delete from Author where id=? and version=?
16:28:43,640 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [1] as [BIGINT] - [2]
16:28:43,641 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [2] as [INTEGER] - [0]
16:28:43,642 DEBUG [org.hibernate.SQL] - delete from Book where id=? and version=?
16:28:43,642 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [1] as [BIGINT] - [2]
16:28:43,642 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [2] as [INTEGER] - [0]
16:28:43,644 DEBUG [org.hibernate.SQL] - delete from Author where id=? and version=?
16:28:43,644 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [1] as [BIGINT] - [1]
16:28:43,644 TRACE [org.hibernate.type.descriptor.sql.BasicBinder] - binding parameter [2] as [INTEGER] - [0]

That deletes a lot more records than you probably expected when you removed the Author entity with id 1. In the worst case, this removes the whole database.

Solution

OK, so what’s the better approach?

First of all, you shouldn’t use the CascadeType.REMOVE for to-many associations. And that’s also the case for CascadeType.ALL which includes the CascadeType.REMOVE.

When you don’t use cascading, you need to delete the associated entities yourself. You can either do that by calling the remove method of the EntityManager for each entity or with a bulk operation.

Remove One By One

That is the easiest but not the most efficient approach. But you can, at least, be sure that you don’t delete any records by accident.

You need to iterate through the list of associated Books and check if it’s associated with any other Author. If it’s not, you call the remove method for it. Otherwise, you just delete the association to the Author entity.

Author a = em.find(Author.class, 1);
for (Book b : a.getBooks()) {
	if (b.getAuthors().size() == 1) {
		em.remove(b);
	} else {
		b.getAuthors().remove(a);
	}
}
em.remove(a);

As I said, this is not a very efficient approach. Hibernate has to perform 2 SQL DELETE operations for each Book entity you want to remove. One to remove the Book entity and another one to remove the records from the association table. And if you just want to remove the association to the Author entity, Hibernate has to delete the record from the association table.

So, in the end, Hibernate performs 4 SELECT, 1 UPDATE and 4 REMOVE operations to delete the Author with id 1 and the Book with id 1. And it also removes the association of Book 2 with Author 1.

17:08:34,059 DEBUG [org.hibernate.SQL] - select author0_.id as id1_0_0_, author0_.name as name2_0_0_, author0_.version as version3_0_0_ from Author author0_ where author0_.id=?
17:08:34,125 DEBUG [org.hibernate.SQL] - select books0_.authorId as authorId2_2_0_, books0_.bookId as bookId1_2_0_, book1_.id as id1_1_1_, book1_.publisherid as publishe5_1_1_, book1_.publishingDate as publishi2_1_1_, book1_.title as title3_1_1_, book1_.version as version4_1_1_, publisher2_.id as id1_3_2_, publisher2_.name as name2_3_2_, publisher2_.version as version3_3_2_ from BookAuthor books0_ inner join Book book1_ on books0_.bookId=book1_.id left outer join Publisher publisher2_ on book1_.publisherid=publisher2_.id where books0_.authorId=?
17:08:34,134 DEBUG [org.hibernate.SQL] - select authors0_.bookId as bookId1_2_0_, authors0_.authorId as authorId2_2_0_, author1_.id as id1_0_1_, author1_.name as name2_0_1_, author1_.version as version3_0_1_ from BookAuthor authors0_ inner join Author author1_ on authors0_.authorId=author1_.id where authors0_.bookId=?
17:08:34,146 DEBUG [org.hibernate.SQL] - select authors0_.bookId as bookId1_2_0_, authors0_.authorId as authorId2_2_0_, author1_.id as id1_0_1_, author1_.name as name2_0_1_, author1_.version as version3_0_1_ from BookAuthor authors0_ inner join Author author1_ on authors0_.authorId=author1_.id where authors0_.bookId=?
17:08:34,175 DEBUG [org.hibernate.SQL] - update Book set publisherid=?, publishingDate=?, title=?, version=? where id=? and version=?
17:08:34,183 DEBUG [org.hibernate.SQL] - delete from BookAuthor where bookId=?
17:08:34,184 DEBUG [org.hibernate.SQL] - delete from BookAuthor where bookId=? and authorId=?
17:08:34,186 DEBUG [org.hibernate.SQL] - delete from Book where id=? and version=?
17:08:34,189 DEBUG [org.hibernate.SQL] - delete from Author where id=? and version=?

Bulk Remove

When your association contains a lot of entities, it’s better to remove them with a few queries. This approach is much more complicated than the one I showed you before. But it needs a fixed number of queries to remove an Author with all associated Books and performs much better for huge associations.

Author a = em.find(Author.class, 1);
		
// get all books that this author wrote alone
Query q = em.createNativeQuery("SELECT ba.bookId FROM BookAuthor ba JOIN Book b ON ba.bookId = b.id JOIN BookAuthor ba2 ON b.id = ba2.bookId WHERE ba2.authorId = ? GROUP BY ba.bookId HAVING count(ba.authorId) = 1");
q.setParameter(1, a.getId());
List<Integer> bookIds = (List<Integer>)q.getResultList();
		
// remove all associations for this author
q = em.createNativeQuery("DELETE FROM BookAuthor ba WHERE ba.authorId = ?");
q.setParameter(1, a.getId());
q.executeUpdate();
		
// remove all books that this author wrote alone
q = em.createNativeQuery("DELETE FROM Book b WHERE b.id IN (:ids)");
q.setParameter("ids", bookIds);
q.executeUpdate();
	
// remove author
em.remove(a);

As you can see, this approach consists of 4 steps.

You first get the ids of all Books that Author 1 wrote alone and store them in a List. These are the ones you need to delete in a later step.

In the next step, you remove all records from the association table that are linked to Author 1. So now, Book 1 is not associated with any Authors and Book 2 is only associated with Author 2. That allows you to remove Book 1 without violating a foreign key constraint.

Now you need to cleanup to the Book table and remove all Books that Author 1 wrote alone. You already got the List of Books you need to delete in the first step. You now just need to provide it to an IN clause of a SQL DELETE statement.

And in the final step, you remove the Author entity.

As you can see in the log output, Hibernate performed 5 queries. The first one to get the Author entity with id 1 and then 4 additional queries to remove the Author and all associated records from the database.

18:05:01,158 DEBUG [org.hibernate.SQL] - select author0_.id as id1_0_0_, author0_.name as name2_0_0_, author0_.version as version3_0_0_ from Author author0_ where author0_.id=?
18:05:01,286 DEBUG [org.hibernate.SQL] - SELECT ba.bookId FROM BookAuthor ba JOIN Book b ON ba.bookId = b.id JOIN BookAuthor ba2 ON b.id = ba2.bookId WHERE ba2.authorId = ? GROUP BY ba.bookId HAVING count(ba.authorId) = 1
18:05:01,301 DEBUG [org.hibernate.SQL] - DELETE FROM BookAuthor ba WHERE ba.authorId = ?
18:05:01,308 DEBUG [org.hibernate.SQL] - DELETE FROM Book b WHERE b.id IN (?)
18:05:01,332 DEBUG [org.hibernate.SQL] - delete from Author where id=? and version=?

Summary

As you’ve seen, it’s risky to use CascadeType.REMOVE with to-many associations. The cascading requires a lot of SQL statements and in the worst case removes more records than you intended.

You should, therefore, use CascadeType.REMOVE and CascadeType.ALL only for to-one associations.

For to-many relationships, you need to implement the removal of child entities yourself. You can do that by iterating through the List of associated entities and removing them one by one. Or you can implement the required update and delete operations with JPQL and native queries.

The second approach is more difficult to implement, but it only requires a fixed set of SQL statements and performs a lot better for huge associations.

15 Comments

  1. Hi Torben,

    Thank you for the great post!

    Would it also be possible to pass a List of typ UUIDS to the query “DELETE FROM Book b WHERE b.id IN (:ids)”, event when the Book has other attributes referenced as ElementCollection?

    Best Regards,
    Valentin

    1. Avatar photo Thorben Janssen says:

      Hi Valentin,

      That’s only possible if you remove all elements of the ElementCollection first. Otherwise, it will fail for a foreign key constraint violation.
      If you tell Hibernate to remove one specific entity, it handles that automatically. But by using a JPQL DELETE statement, you are responsible for handling all associations and ElementCollections yourself.

      Regards,
      Thorben

  2. Avatar photo Stefan Trenkel says:

    Hi Thorben,

    thank you for this post!

    Is it possible (and recommended) to do this on database level with ON DELETE CASCADE?
    E.g.:

    ALTER TABLE book
    ADD CONSTRAINT myconstraint
    FOREIGN KEY (authorid)
    REFERENCES author (id)
    ON DELETE CASCADE;

    And now just delete an author.

    Best regards.
    Stefan

    1. Avatar photo Thorben Janssen says:

      Hi Stefan,

      No, that approach is not recommended.
      Hibernate doesn’t know which records the cascading operation on the database removes. It can’t remove any entity that maps one of the removed records from the 1st or 2nd level cache. Due to this, you are at a high risk that your caches contain outdated information (records that no longer exist).

      Regards,
      Thorben

      1. Avatar photo Stefan Trenkel says:

        Understand! Thanks a lot!

    1. Avatar photo Thorben Janssen says:

      Your User entity owns the many-to-many association to the Role entity. When you remove a User, Hibernate automatically also removes all entries from the association table. But it doesn’t cascade the remove operation to the Role entity.

      You should never use CascadeType.REMOVE on a many-to-many association. If you remove an entity, Hibernate will remove all associated entities even if they are still referenced by other entities. I explained that in great detail here.

      If you would use CascadeType.REMOVE on your roles association and remove a User, Hibernate would remove all Role entities referenced by that User. It would do that even if there are other User entity objects that are associated with these Roles.

  3. Avatar photo Attila Simó says:

    Hi Thorben. Thanks for the great article.

    I have a question: Could be any issue, if I use JpaRespository query to remove multiple entities from a One-To-Many associations Many side, in the same time when CascadeType.Remove is set. So basically using the bulk remove approach and CascadeType.Remove as well? Of course not both at the same time, but for different use cases e.g. to have different service methods for both?

    Thanks,
    Attila

    1. Avatar photo Thorben Janssen says:

      Hi Attila,

      the bulk remove doesn’t trigger JPA’s cascading because it’s implemented by your persistence provider (e.g. Hibernate) and not in the database. The EntityManager doesn’t know which entities you’re removing and can’t trigger the cascading.

      So yes, you could you both approaches for 2 different use cases. Just make sure that the cascading doesn’t remove any unexpected records.

      Regards,
      Thorben

  4. Avatar photo Trường says:

    this post is very helpful but i have a small confusion. i did not see the table BookAuthor defined. but it used in query. could you help me to explain it

  5. Thanks for sharing it. I got myself using cascade.remove wrong sometimes. Now I know exactly why.

    1. Avatar photo Thorben Janssen says:

      Thanks, Victroch. Happy to help 🙂

  6. Avatar photo Binh Thanh Nguyen says:

    Thanks, nice tips.

    1. Avatar photo Thorben Janssen says:

      Thank you, Binh 🙂

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.