Hibernate Performance Tuning Done Right


Optimizing the performance of your application is a complex and application-specific task. All domain models are different, and often enough, the amount of data managed by them also differs significantly between multiple installations. In addition to that, almost all performance tuning techniques have trade-offs, which don’t make them a great fit for all situations.

Because of that, following best practices and general recommendations isn't enough to implement an efficient, high-performance persistence layer. You will most likely avoid the most obvious performance pitfalls, but you will also miss all application-specific issues. At the same time, you will increase your persistence layer's complexity and spend time implementing performance optimizations that are irrelevant to your application and data.

If you want to do it right, you need to take a different approach. One that enables you to use your time efficiently and ensures that you fix the relevant performance issues. You can only do that if you have the right mindset and the necessary information to pick the best performance tuning feature for each situation.

Performance Mindset

Let’s talk about the mindset first. It sets the theme for the following sections by defining what you want to optimize and when you should do it.

One of the most common mistakes is that developers try to prevent all theoretically possible performance problems before they occur in tests or production. This adds lots of complexity, makes your code harder to maintain, and slows down your development while providing only minimal value to your users. This is commonly known as premature optimization.

The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming.

Donald Knuth – The Art of Computer Programming

To avoid that, you need to decide wisely which parts of your code you want to optimize and when to do it.

What should you optimize?

There is an obvious answer to that question: All the parts that are too slow!

But how do you find these parts?

You will not find them by reading about best practices or following the recommendations from static code analyzers. That’s not because these things are generally wrong. It’s because both approaches lack 2 important pieces of information:

  1. The amount of data you’re working with.
  2. The number of parallel requests your system has to handle.

Both have a strong impact on your application's performance or, to put it better, on the inefficiencies you can accept in your code. For example:

  • You can handle multiple associations that never contain more than 3 elements very inefficiently without experiencing any performance issues. But you can’t do that with one association that references a thousand records.
  • If you’re building an in-house application that gets only used by 20 users simultaneously, you can easily use features like Hibernate’s @Formula annotation to improve your development speed. But if you do that in a webscale application, the generated SQL statement’s complexity will most likely cause performance issues.
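To make the @Formula trade-off concrete, here is a minimal sketch. The entity and the attribute names are made up for illustration; the point is that Hibernate embeds the annotated SQL fragment as a subselect into every query that fetches the entity, which is convenient for a handful of users but adds to the complexity of every generated statement.

```java
import jakarta.persistence.Entity;
import jakarta.persistence.GeneratedValue;
import jakarta.persistence.Id;
import org.hibernate.annotations.Formula;

@Entity
public class Product {

    @Id
    @GeneratedValue
    private Long id;

    private String name;

    // Hibernate appends this SQL fragment as a subselect to every
    // query that fetches a Product. Fine for 20 in-house users,
    // potentially too expensive in a webscale application.
    @Formula("(select avg(r.rating) from review r where r.product_id = id)")
    private Double averageRating;

    public Double getAverageRating() {
        return averageRating;
    }
}
```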

These examples show that you need to analyze how your persistence layer performs in a production scenario.

When should you optimize?

Knuth's quote and the previous section already answered this question. To avoid working on the wrong performance improvements, you need to identify the relevant ones. That means you need to prove that the performance problem already exists in production or that it will exist in production soon.

After you’ve done that, you know that the effort you will spend and the complexity you will add to your system will provide value to your users.

Performance Analysis

Before you start improving your persistence layer’s performance, you have to identify the parts that need to be improved. There are several ways you can do that. In this article, I want to show 2 options that focus on Hibernate’s internal operations and don’t require a profiler.

Hibernate Statistics

The easiest way to monitor Hibernate’s internal operations and database queries is to activate Hibernate’s statistics component. You can do that by setting the system property hibernate.generate_statistics to true. Or you can set the parameter in your persistence.xml configuration.

    <persistence-unit name="my-persistence-unit">
        <properties>
            <property name="hibernate.generate_statistics" value="true" />
        </properties>
    </persistence-unit>
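You can also enable and read the statistics at runtime via Hibernate's Statistics API. The following is a sketch, assuming you have access to a configured EntityManagerFactory that you can unwrap to Hibernate's SessionFactory:

```java
import jakarta.persistence.EntityManagerFactory;
import org.hibernate.SessionFactory;
import org.hibernate.stat.Statistics;

public class StatisticsDemo {

    // Assumes an already configured EntityManagerFactory.
    static void enableAndInspect(EntityManagerFactory emf) {
        SessionFactory sessionFactory = emf.unwrap(SessionFactory.class);

        Statistics statistics = sessionFactory.getStatistics();
        // Same effect as hibernate.generate_statistics=true
        statistics.setStatisticsEnabled(true);

        // After executing some queries, you can inspect the
        // counters programmatically instead of parsing the log:
        long queryCount = statistics.getQueryExecutionCount();
        long slowestQueryMs = statistics.getQueryExecutionMaxTime();
        System.out.println(queryCount + " queries, slowest took "
                + slowestQueryMs + " ms");
    }
}
```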

After you've done that, Hibernate writes log statements like the following to your log file.

2021-02-22 20:28:52,484 DEBUG [org.hibernate.stat.internal.ConcurrentStatisticsImpl] (default task-1) HHH000117: HQL: Select p From Product p, time: 0ms, rows: 10
2021-02-22 20:28:52,484 INFO  [org.hibernate.engine.internal.StatisticalLoggingSessionEventListener] (default task-1) Session Metrics {
    8728028 nanoseconds spent acquiring 12 JDBC connections;
    295527 nanoseconds spent releasing 12 JDBC connections;
    12014439 nanoseconds spent preparing 21 JDBC statements;
    5622686 nanoseconds spent executing 21 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    403863 nanoseconds spent executing 1 flushes (flushing a total of 10 entities and 0 collections);
    25529864 nanoseconds spent executing 1 partial-flushes (flushing a total of 10 entities and 10 collections)
}

For each query that you execute, Hibernate will write a message containing the provided statement, the time spent executing it, and the number of returned rows. That makes it easy to spot slow or very complex queries or the ones that return thousands of rows.

At the end of the session, Hibernate also summarizes all executed queries, used JDBC batches, 2nd level cache interactions, and performed flushes. This summary is always a great starting point for your performance analysis. It shows you if Hibernate caused the performance problem and what kind of problem it is. Here are a few examples:

If Hibernate executed many more statements than you expected, you probably have an n+1 select issue. I explain how to analyze and fix it in this free, 3-part video course.
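The typical n+1 pattern looks like this sketch. Order and OrderItem are hypothetical entities, with Order holding a lazily fetched to-many association: one query loads all orders, and each access to the lazy collection triggers one additional select.

```java
import jakarta.persistence.EntityManager;
import java.util.List;

public class OrderReport {

    // Order and OrderItem are hypothetical entities; Order has a
    // lazily fetched @OneToMany association called "items".
    void printItemCounts(EntityManager em) {
        // 1 query: loads all orders
        List<Order> orders = em
                .createQuery("SELECT o FROM Order o", Order.class)
                .getResultList();

        for (Order order : orders) {
            // +1 query per order: accessing the lazy collection
            // triggers an additional SELECT
            System.out.println(order.getId() + ": "
                    + order.getItems().size() + " items");
        }
        // Fix: fetch the association in the initial query, e.g.
        // SELECT o FROM Order o JOIN FETCH o.items
    }
}
```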

If the number of queries is low but the execution time is high, you can find the slowest statements in your log. For each of them, you can then check how the database executes it and start improving it. If the query gets too complex for JPQL, you can implement it as a native SQL query.
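If you fall back to a native query, it can use database features that JPQL doesn't support. A sketch, assuming a made-up reporting table and a database that supports window functions:

```java
import jakarta.persistence.EntityManager;
import java.util.List;

public class ReportDao {

    private final EntityManager em;

    public ReportDao(EntityManager em) {
        this.em = em;
    }

    // Hypothetical report that would be awkward in JPQL because it
    // relies on a window function; table and column names are made up.
    @SuppressWarnings("unchecked")
    public List<Object[]> topSellerPerCategory() {
        return em.createNativeQuery(
                "SELECT category, product_id, revenue "
              + "FROM (SELECT category, product_id, revenue, "
              + "             ROW_NUMBER() OVER (PARTITION BY category "
              + "                                ORDER BY revenue DESC) rn "
              + "      FROM product_revenue) ranked "
              + "WHERE rn = 1")
            .getResultList();
    }
}
```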

If Hibernate used way too many JDBC batches or found only a few entities in the 2nd level cache, you should check my Hibernate Performance Tuning Online Training. These problems are usually caused by a simple misconfiguration or a misunderstanding of the feature and how to use it.
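JDBC batching, for example, is disabled by default and only needs a few configuration properties. A sketch for your persistence.xml; the batch size of 25 is an arbitrary example value, and ordering inserts and updates helps Hibernate fill the batches:

```xml
<property name="hibernate.jdbc.batch_size" value="25" />
<property name="hibernate.order_inserts" value="true" />
<property name="hibernate.order_updates" value="true" />
```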

And too many flushes, or flushes with long execution times, usually occur when your persistence context manages too many entities.
