The Best Spring Data JPA Logging Configuration in Spring Boot


Take your skills to the next level!

The Persistence Hub is the place to be for every Java developer. It gives you access to all my premium video courses, monthly Java Persistence News, monthly coding problems, and regular expert sessions.


Spring Data JPA’s repositories provide most of the database operations for you. That makes working with Spring Data JPA very comfortable but it also increases the importance of a good logging configuration. If you don’t implement your database operations yourself, you at least need to make sure that you know which operations get performed.

The best way to monitor all executed SQL statements during development is to use the right logging configuration for the persistence provider used by Spring Data JPA. The persistence provider does all the heavy lifting. It’s responsible for executing all queries, detecting and persisting dirty entities, handling caching, and providing lazy loading for managed associations. And it’s also the best component to measure and report all these operations.

When using Spring Boot, Spring Data JPA uses Hibernate as its persistence provider by default. So, let’s take a closer look at a development configuration that provides you with the necessary insides and a configuration for production that keeps the overhead as low as possible.

Logging Configuration for Development

When implementing a new feature or fixing a bug, your log file should tell you about all the executed SQL statements, cache interactions, and other operations your persistence layer performs. When using Hibernate, you can get all of this by activating its statistics. You do that by setting the property spring.jpa.properties.hibernate.generate_statistics in your application.properties file to true. And when doing that, you should also set the log category logging.level.org.hibernate.stat to DEBUG to not miss out on any available information.

# Generate and log statistics
spring.jpa.properties.hibernate.generate_statistics=true
logging.level.org.hibernate.stat=DEBUG

Hibernate writes a log message for each query executed by Spring’s repositories based on these settings. At the end of each transaction, it also summarizes all performed operations.

Let’s try this with a simple test case that gets all ChessPlayer entities from the database and writes a log message with their first and last name

List<ChessPlayer> players = playerRepo.findAll();
assertThat(players.size()).isEqualTo(19);

ChessPlayer player = players.get(0);
log.info(player.getFirstName() + " " + player.getLastName()); 

Spring Data JPA’s findAll method generates a simple query that selects all ChessPlayer entities. As you can see in the log output, Hibernate executed that query and wrote the statement, its execution time, and the number of returned rows to the log file.

And the block of Session Metrics gives you a quick overview of all the performed operations. In this test case, Hibernate prepared and executed 1 JDBC statement. The 19 L2C puts reported in the metrics tell you that Hibernate added 19 objects to the 2nd level cache. And Hibernate also performed 1 partial flush at the end of the transaction.

2021-12-14 14:41:05.189 DEBUG 65812 --- [           main] o.h.stat.internal.StatisticsImpl         : HHH000117: HQL: select generatedAlias0 from ChessPlayer as generatedAlias0, time: 75ms, rows: 19
2021-12-14 14:41:05.196  INFO 65812 --- [           main] i.StatisticalLoggingSessionEventListener : Session Metrics {
    1913100 nanoseconds spent acquiring 1 JDBC connections;
    0 nanoseconds spent releasing 0 JDBC connections;
    1401600 nanoseconds spent preparing 1 JDBC statements;
    3825100 nanoseconds spent executing 1 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    46241400 nanoseconds spent performing 19 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    0 nanoseconds spent executing 0 flushes (flushing a total of 0 entities and 0 collections);
    11600 nanoseconds spent executing 1 partial-flushes (flushing a total of 0 entities and 0 collections)
}
2021-12-14 14:41:05.213  INFO 58480 --- [           main] c.thorben.janssen.spring.data.TestDemo   : Magnus Carlsen

If I would’ve activated JDBC batching or triggered more operations on the 2nd level cache, you would see that in the session metrics as well. Based on this information, you can check if any potential problems might cause trouble when deploying your application to production and using it with a bigger test database.

In addition to Hibernate’s statistics, you should also configure a threshold for Hibernate’s slow query log. Hibernate then measures the execution time of each executed SQL statement. If it exceeds the configured threshold, it writes a log message with the execution time and the executed statement.

# Log slow queries
spring.jpa.properties.hibernate.session.events.log.LOG_QUERIES_SLOWER_THAN_MS=3

Let’s rerun the same test case and check the log output.

2021-12-14 15:02:57.570  INFO 36940 --- [           main] org.hibernate.SQL_SLOW                   : SlowQuery: 4 milliseconds. SQL: 'HikariProxyPreparedStatement@2022715982 wrapping select chessplaye0_.id as id1_1_, chessplaye0_.birth_date as birth_da2_1_, chessplaye0_.first_name as first_na3_1_, chessplaye0_.last_name as last_nam4_1_, chessplaye0_.version as version5_1_ from chess_player chessplaye0_'
2021-12-14 15:02:57.643 DEBUG 36940 --- [           main] o.h.stat.internal.StatisticsImpl         : HHH000117: HQL: select generatedAlias0 from ChessPlayer as generatedAlias0, time: 82ms, rows: 19
2021-12-14 15:02:57.650  INFO 36940 --- [           main] i.StatisticalLoggingSessionEventListener : Session Metrics {
    1901800 nanoseconds spent acquiring 1 JDBC connections;
    0 nanoseconds spent releasing 0 JDBC connections;
    1981300 nanoseconds spent preparing 1 JDBC statements;
    3001100 nanoseconds spent executing 1 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    52468500 nanoseconds spent performing 19 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    0 nanoseconds spent executing 0 flushes (flushing a total of 0 entities and 0 collections);
    12500 nanoseconds spent executing 1 partial-flushes (flushing a total of 0 entities and 0 collections)
}
2021-12-14 15:02:57.696  INFO 36940 --- [           main] c.thorben.janssen.spring.data.TestDemo   : Magnus Carlsen

Using this configuration, you get all information you need to understand which operations Spring Data JPA executed and how your persistence layer interacted with the database.

If you want to get an additional log message for every executed SQL statement and more detailed information on Hibernate’s operations on the 2nd level cache, you should also set the configuration properties logging.level.org.hibernate.SQL and logging.level.org.hibernate.cache to DEBUG.

Recommended Configuration for Development

Here you can see the entire configuration for development systems.

# Generate and log statistics
spring.jpa.properties.hibernate.generate_statistics=true
logging.level.org.hibernate.stat=DEBUG
# Log slow queries
spring.jpa.properties.hibernate.session.events.log.LOG_QUERIES_SLOWER_THAN_MS=1
# Log all SQL statements
logging.level.org.hibernate.SQL=DEBUG
# Log cache operations
logging.level.org.hibernate.cache=DEBUG

Logging Configuration for Production

When you deploy your application to production, you should already know how it works internally and which operations it performs. Now it’s time to reduce the logging overhead to get the best performance.

You do that by not setting the property spring.jpa.properties.hibernate.generate_statistics or by setting it to false and by setting the log level of org.hibernate to ERROR.

spring.jpa.properties.hibernate.generate_statistics=false
logging.level.org.hibernate=ERROR

Hibernate then only writes a log message if an exception occurs.

Summary

Spring Data JPA adds another layer of abstraction on top of your persistence provider. That makes the implementation of your persistence much easier. But the abstraction also makes it harder to understand how your application uses the database. This increases the importance of your logging configuration. Using the right configurations, you get all the information you need to find, analyze, and fix potential problems during development. You should use a different configuration to avoid any overhead in production and get the best performance.

One Comment

  1. Avatar photo Sam Buriima says:

    Well Explained. Thank you sir

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.