Localized Data – How to Map It With Hibernate


Internationalization not only affects the UI. If your application stores user-generated data and supports multiple languages, you also need a way to store different translations in your database. Typical examples are:

  • marketplaces that allow you to provide product descriptions in various languages,
  • travel sites that offer trips to people all over the world and
  • document management systems that store document descriptions and keywords for multiple languages.

In all of these examples, you need to localize your frontend and parts of the persisted data. The two most common approaches for that are:

  1. Using Java ResourceBundle
    This standard Java feature provides a simple to use and very efficient option to implement internationalization. You need to provide a properties file for each locale you want to support. You can then use the ResourceBundle class to get the property for the currently active Locale.
    The only downside of this approach is that the different translations are hard to maintain. If you want to add, change, or remove the translation of a property, you need to edit one or more properties files. In the worst case, that might even require a re-deployment of your application.
    That makes Java’s ResourceBundle a good option for all static, pre-defined texts, like general messages or attribute names that you use in your UI. But if you want to translate user-generated content or any other String that changes often, you should prefer a different approach.
  2. Storing translations in the database
    You get more flexibility, and updating a translated name or description is much easier if you persist the localized data in your database. Adding or changing a translation then only requires an SQL INSERT or UPDATE statement. That makes it a great approach for all user-generated content.
    Unfortunately, the implementation is also more complicated. There is no standard Java feature that you can easily use. You need to design your table model accordingly, and you need to implement the read and update routines yourself.
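
To make the first option a little more concrete, here is a minimal sketch of a ResourceBundle lookup. It is not taken from a real project: the class name InMemoryBundles, the greeting key, and the texts are made up for illustration, and the bundles are parsed from in-memory Strings so that the example runs without any files. In a real application, you would more likely put messages_en.properties and messages_de.properties on the classpath and call ResourceBundle.getBundle.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Locale;
import java.util.Map;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class InMemoryBundles {

    // Hypothetical stand-ins for the content of one properties file per language.
    private static final Map<String, String> PROPERTIES = Map.of(
            "en", "greeting=Welcome to the marketplace\n",
            "de", "greeting=Willkommen auf dem Marktplatz\n");

    // Returns the bundle for the locale's language and falls back to English
    // if no translation exists for that language (an assumption of this sketch).
    static ResourceBundle bundleFor(Locale locale) {
        String content = PROPERTIES.getOrDefault(locale.getLanguage(), PROPERTIES.get("en"));
        try {
            return new PropertyResourceBundle(new StringReader(content));
        } catch (IOException e) {
            throw new IllegalStateException("Could not parse the properties", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(bundleFor(Locale.GERMAN).getString("greeting"));
        System.out.println(bundleFor(Locale.ENGLISH).getString("greeting"));
    }
}
```

Every text lives in a static resource here, which is exactly why this option fits UI labels better than user-generated content.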

In this tutorial, I want to focus on the 2nd option. There are a few commonly used patterns that enable you to store and handle localized information in your database easily.

Different Ways to Store Localized Data

Let’s first take a look at the table model before we discuss how you can map it with JPA and Hibernate. To make that easier to understand, I will use the following example:

We want to create a marketplace in which suppliers can offer their products. The marketplace supports the languages German and English. The supplier can provide the name and description of a product in both languages.

As so often, you can model this in various ways. Shantanu Kher created a great overview of different options and discussed their advantages and disadvantages on the Vertabelo blog.

Even though the popularity of these approaches varies, I have seen all of them in real life. In my experience, the most commonly used ones are:

  1. Using separate columns for each language in the same database table, e.g., modeling the columns description_en and description_de to store different translations of a product description.
  2. Storing translated fields in a separate table. That would move the description_en and description_de columns to a different table. Let’s call it LocalizedProduct.

Let’s take a closer look at both options.

Separate Language Columns in Each Table

The general idea of this approach is simple. For each localized attribute and language you need to support, you add an extra column to your table. Depending on the number of supported languages and localized attributes, this can result in a large number of additional columns. If you want to translate 4 attributes into 5 different languages, you would need to model 4*5=20 database columns.

In the previously described example, you need 4 database columns to localize the product name and description. The columns description_en and description_de persist the different translations of the product description, and the columns name_en and name_de store the localized product name.
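
The following sketch only illustrates how the number of columns grows with this approach. The class LocalizedColumns and its method are hypothetical helpers, not part of the example application. It builds one column name per combination of attribute and language, following the naming pattern used above.

```java
import java.util.ArrayList;
import java.util.List;

class LocalizedColumns {

    // Builds one column name per attribute/language pair, e.g. "name_en".
    static List<String> columnNames(List<String> attributes, List<String> languages) {
        List<String> columns = new ArrayList<>();
        for (String attribute : attributes) {
            for (String language : languages) {
                columns.add(attribute + "_" + language);
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        // 2 attributes in 2 languages result in 4 columns.
        System.out.println(columnNames(List.of("name", "description"), List.of("en", "de")));
    }
}
```

For the marketplace example, this returns name_en, name_de, description_en, and description_de. Each additional language adds one more column per localized attribute.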

Creating Your Entity Mappings

Using separate columns for each translation results in a straightforward table model. The same is true for the entity mapping.

The id attribute is of type Long and maps the primary key. The @GeneratedValue annotation tells Hibernate to use a database sequence to generate unique primary key values. In this example, I use Hibernate’s default sequence. But as I showed in a previous article, you can easily provide your own sequence.

The version attribute is used for optimistic locking and provides a highly scalable way to detect concurrent updates. I explain it in more detail in my Hibernate Performance Tuning Online Training.

The supplier attribute defines the owning side of a many-to-one association to the Supplier entity. As for all to-one associations, you should make sure to set the FetchType to LAZY to avoid unnecessary queries and performance problems.

The nameDe, nameEn, descriptionDe, and descriptionEn attributes just map each of the localized columns. That might result in a lot of attributes, but it is also a simple and efficient way to handle localized data.

@Entity
public class Product {

	@Id
	@GeneratedValue(strategy = GenerationType.AUTO)
	private Long id;

	@Version
	private int version;

	@ManyToOne(fetch = FetchType.LAZY)
	private Supplier supplier;

	private Double price;

	@Column(name = "name_de")
	private String nameDe;
	
	@Column(name = "name_en")
	private String nameEn;

	@Column(name = "description_de")
	private String descriptionDe;
	
	@Column(name = "description_en")
	private String descriptionEn;
	
	...
}

Using Entities with Separate Language Columns

You can use these entity attributes in the same way as any other entity attributes.

When you persist a new Product entity, you call the setter methods of each localized name attribute with the translated version of the product name.

Product p = new Product();
p.setPrice(19.99D);
p.setNameDe("Hibernate Tips - Mehr als 70 Lösungen für typische Hibernateprobleme");
p.setNameEn("Hibernate Tips - More than 70 solutions to common Hibernate problems");
p.setDescriptionDe("Wenn Du Hibernate in Deinen Projekten einsetzt, stellst Du schnell fest, dass ...");
p.setDescriptionEn("When you use Hibernate in your projects, you quickly recognize that you need to ...");
em.persist(p);

Hibernate then includes these columns in the SQL INSERT statement and stores all translations in the database. If you use my recommended logging configuration for development systems, you can see the executed SQL statements in the log file.

19:14:27,599 DEBUG SQL:92 - 
    select
        nextval ('hibernate_sequence')
19:14:27,735 DEBUG SQL:92 - 
    insert 
    into
        Product
        (description_de, description_en, name_de, name_en, price, supplier_id, version, id) 
    values
        (?, ?, ?, ?, ?, ?, ?, ?)

And when you fetch an entity from the database, you can call the getter methods for your preferred locale to retrieve the translated name and description. In the following example, I use the getNameEn and getDescriptionEn methods to get the English version of the product name and description.

Product p = em.createQuery("SELECT p FROM Product p WHERE p.id = 101", Product.class).getSingleResult();
log.info("Product: "+p.getNameEn());
log.info("Product Description: "+p.getDescriptionEn());

As you can see in the log messages, Hibernate uses a simple, efficient SQL statement to get the Product entity with the given id.

19:16:12,406 DEBUG SQL:92 - 
    select
        product0_.id as id1_0_0_,
        product0_.description_de as descript2_0_0_,
        product0_.description_en as descript3_0_0_,
        product0_.name_de as name_de4_0_0_,
        product0_.name_en as name_en5_0_0_,
        product0_.price as price6_0_0_,
        product0_.supplier_id as supplier8_0_0_,
        product0_.version as version7_0_0_ 
    from
        Product product0_ 
    where
        product0_.id=?
19:16:12,426  INFO UsabilityText:64 - Product: Hibernate Tips - More than 70 solutions to common Hibernate problems
19:16:12,427  INFO UsabilityText:65 - Product Description: When you use Hibernate in your projects, you quickly recognize that you need to ...
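
If your business code works with a java.util.Locale instead of calling a language-specific getter directly, you could add a small dispatch method. The following sketch is an assumption of mine and not part of the mapping shown above: ProductNames is a simplified, annotation-free stand-in for the Product entity, and the fallback to English is an arbitrary choice.

```java
import java.util.Locale;

// Simplified, annotation-free stand-in for the Product entity shown above.
class ProductNames {

    private final String nameDe;
    private final String nameEn;

    ProductNames(String nameDe, String nameEn) {
        this.nameDe = nameDe;
        this.nameEn = nameEn;
    }

    // Hypothetical helper: returns the German name for German locales if it is set
    // and falls back to the English name in all other cases.
    String getName(Locale locale) {
        if ("de".equals(locale.getLanguage()) && nameDe != null) {
            return nameDe;
        }
        return nameEn;
    }
}
```

Keep in mind that such a method needs a new branch for every language you add, in addition to the new columns and attributes.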

Pros & Cons of Entities with Separate Language Columns

As you have seen, adding a dedicated column for each translation to your table:

  • is very easy to implement in the table model,
  • is very easy to map to an entity and
  • enables you to fetch all translations with a simple query that doesn’t require any JOIN clauses.

But on the downside:

  • this mapping might require a lot of database columns if you need to translate multiple attributes into various languages,
  • fetching an entity loads translations that you might not use in your use case and
  • you need to update the database schema if you need to support a new language.

In my experience, the inflexibility of this approach is the biggest downside. If your application is successful, your users and sales team will request additional translations. The required schema update makes supporting a new language much harder than it should be. You not only need to implement and test that change, but you also need to update your database without interrupting your live system.

The next approach avoids these problems, and I, therefore, recommend it for most applications.

Different Tables and Entities for Translated and Non-Translated Fields

Instead of storing all the information in the same database table, you can also separate the translated and non-translated fields into 2 tables. That enables you to model a one-to-many association between the non-translated fields and the different localizations.

Here you can see a table model that applies this approach to the previously discussed example.

The LocalizedProduct table stores the different translations of the product name and description. As you can see in the diagram, that table contains a record for each localization of a product. So, if you want to store an English and a German name and description of your product, the LocalizedProduct table contains 2 records for that product. And if you’re going to support an additional language, you only need to add another record to the LocalizedProduct table instead of changing your table model.

Creating Your Entity Mappings

The entity model is almost identical to the table model. You map the non-translated columns of the Product table to the Product entity and the translated columns of the LocalizedProduct table to the LocalizedProduct entity. And between these 2 entity classes, you can model a managed many-to-one association.

Entity with Translated Fields – The LocalizedProduct entity

The following mapping of the LocalizedProduct entity consists of a few mandatory parts and an optional one. Let’s first talk about the mandatory mapping of the primary key and the association to the Product entity.

@Entity
@Cache(usage = CacheConcurrencyStrategy.TRANSACTIONAL)
public class LocalizedProduct {

	@EmbeddedId
	private LocalizedId localizedId;
	
	@ManyToOne
	@MapsId("id")
	@JoinColumn(name = "id")
	private Product product;
	
	private String name;
	
	private String description;

	...
}

The LocalizedProduct entity represents the to-many side of the association. The Product product attribute, therefore, owns the relationship definition. The @JoinColumn annotation tells Hibernate to use the id column of the LocalizedProduct table as the foreign key column. And the @MapsId annotation defines that the primary key value of the associated Product entity is part of the composite primary key of the LocalizedProduct entity. It gets mapped to the id attribute of the primary key class.

As I explain in great detail in the Advanced Hibernate Online Training, you can map a composite primary key in various ways with JPA and Hibernate. In this example, I use an embedded id and an embeddable called LocalizedId.

As you can see in the following code snippet, the LocalizedId class is a basic Java class which implements the Serializable interface and is annotated with @Embeddable. And because you want to use it as an embedded id, you also need to make sure to implement the equals and hashCode methods.

@Embeddable
public class LocalizedId implements Serializable {

	private static final long serialVersionUID = 1089196571270403924L;

	private Long id;

	private String locale;

	public LocalizedId() {
	}

	public LocalizedId(String locale) {
		this.locale = locale;
	}

	// getter and setter methods ...

	@Override
	public int hashCode() {
		final int prime = 31;
		int result = 1;
		result = prime * result + ((locale == null) ? 0 : locale.hashCode());
		result = prime * result
				+ ((id == null) ? 0 : id.hashCode());
		return result;
	}

	@Override
	public boolean equals(Object obj) {
		if (this == obj)
			return true;
		if (obj == null)
			return false;
		if (getClass() != obj.getClass())
			return false;
		LocalizedId other = (LocalizedId) obj;
		if (locale == null) {
			if (other.locale != null)
				return false;
		} else if (!locale.equals(other.locale))
			return false;
		if (id == null) {
			if (other.id != null)
				return false;
		} else if (!id.equals(other.id))
			return false;
		return true;
	}
}
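
If you prefer a shorter implementation, java.util.Objects can handle the null checks for you. The following class is only a sketch: the name CompactLocalizedId and the two-argument constructor are made up for this example, and I left out the @Embeddable annotation so that it compiles without JPA on the classpath. It treats two ids as equal in the same cases as the class above, but it calculates different hash code values.

```java
import java.io.Serializable;
import java.util.Objects;

// Annotation-free sketch; a real primary key class would still need @Embeddable.
class CompactLocalizedId implements Serializable {

    private static final long serialVersionUID = 1L;

    private Long id;
    private String locale;

    CompactLocalizedId() {
    }

    // Hypothetical convenience constructor that sets both parts of the key.
    CompactLocalizedId(Long id, String locale) {
        this.id = id;
        this.locale = locale;
    }

    @Override
    public int hashCode() {
        // Objects.hash handles null values.
        return Objects.hash(id, locale);
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj == null || getClass() != obj.getClass()) {
            return false;
        }
        CompactLocalizedId other = (CompactLocalizedId) obj;
        // Objects.equals returns true if both values are null.
        return Objects.equals(id, other.id) && Objects.equals(locale, other.locale);
    }
}
```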

OK, these were the necessary mapping parts of the LocalizedProduct entity. They map the composite primary key and the association to the Product entity.

If you want to take it one step further, you might also want to cache the LocalizedProduct entity. You can do that by activating the cache in your persistence.xml configuration and by annotating the LocalizedProduct entity with JPA’s @Cacheable or Hibernate’s @Cache annotation. As I explain in my Hibernate Performance Tuning Online Training, caching is a two-edged sword. It can provide substantial performance benefits but also introduce an overhead that can slow down your application. You need to make sure that you only cache data that is read often but changed rarely. In most applications, that’s the case for the localized Strings. That makes them excellent candidates for caching.

Entity with Non-Translated Fields – The Product entity

After we mapped the LocalizedProduct table, which represents the different translations of the localized fields, it’s time to work on the mapping of the Product table.

The only difference compared to the previous example is the mapping of the localized attributes. Instead of mapping an attribute for each translation, I’m using the localizations attribute. It maps the referencing side of the many-to-one association to the LocalizedProduct entity to a java.util.Map. This is one of the more advanced association mappings defined by the JPA specification, and I explained it in great detail in How to map an association as a java.util.Map.

In this example, I use the locale attribute of the LocalizedProduct entity as the key and the LocalizedProduct entity as the value of the Map. The locale is mapped by the LocalizedId embeddable, and I need to specify the path localizedId.locale in the @MapKey annotation.

The mapping to a java.util.Map makes accessing a specific translation in your business code more comfortable. And it doesn’t affect how Hibernate fetches the association from the database. In your JPQL or Criteria Queries, you can use this association in the same way as any other managed relationship.

@Entity
public class Product {

	@Id
	@GeneratedValue(strategy = GenerationType.SEQUENCE)
	private Long id;

	@Version
	private int version;

	@ManyToOne(fetch = FetchType.LAZY)
	private Supplier supplier;

	private Double price;

	@OneToMany(mappedBy = "product", cascade = {CascadeType.DETACH, CascadeType.MERGE, CascadeType.PERSIST, CascadeType.REFRESH}, orphanRemoval = true)
	@MapKey(name = "localizedId.locale")
	@Cache(usage = CacheConcurrencyStrategy.TRANSACTIONAL)
	private Map<String, LocalizedProduct> localizations = new HashMap<>();

	...
	
	public String getName(String locale) {
		return localizations.get(locale).getName();
	}

	public String getDescription(String locale) {
		return localizations.get(locale).getDescription();
	}
}

If you want to make your entity model more comfortable to use, you could activate orphanRemoval for the association. That is a general best practice for one-to-many associations that model a parent-child relationship in which the child can’t exist without its parent. It tells your JPA implementation, e.g., Hibernate, to delete the child entity as soon as its association to the parent entity gets removed. I use it in this example to remove a LocalizedProduct entity as soon as it’s no longer associated with a Product entity.

Another thing you could do to improve the usability of your entities is to provide getter methods that return the product name and description for a given locale. If you implement additional getter methods to return a localized name and description, you need to keep in mind that they are accessing a lazily fetched one-to-many association. That triggers an additional SQL statement if the association isn’t already fetched from the database. You can avoid that by using a JOIN FETCH clause or an entity graph to initialize the association while loading your Product entity.
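
One detail to keep in mind with such getter methods: Map.get returns null if there is no translation for the requested locale, so the getName and getDescription methods shown above would throw a NullPointerException in that case. The following sketch shows one possible way to handle that. The class LocalizedLookup, the fallback to "en", and the use of Optional are my assumptions, and the Map holds plain Strings instead of LocalizedProduct entities to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class LocalizedLookup {

    // Hypothetical default that is used if the requested locale has no translation.
    static final String FALLBACK_LOCALE = "en";

    // Maps a locale key, e.g. "de", to the translated name.
    private final Map<String, String> names = new HashMap<>();

    void putName(String locale, String name) {
        names.put(locale, name);
    }

    // Returns the name for the requested locale or for the fallback locale.
    // The Optional is empty if neither translation exists.
    Optional<String> findName(String locale) {
        String name = names.get(locale);
        if (name == null) {
            name = names.get(FALLBACK_LOCALE);
        }
        return Optional.ofNullable(name);
    }
}
```

Whether a fallback language or an empty result is the better choice depends on your use case.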

And if you activated the 2nd level cache on the LocalizedProduct entity, you should also annotate the localizations attribute with Hibernate’s @Cache annotation. That tells Hibernate to cache the association between these 2 entities. If you miss this annotation, Hibernate will execute a query to retrieve the associated LocalizedProduct entities even though they might be already in the cache. That is another example of how complex caching with Hibernate can be. It’s also one of the reasons why the Hibernate Performance Tuning Online Training includes a very detailed lecture about it.

Using Different Entities for Translated and Non-Translated Fields

Using this mapping is a little harder than the previous one. The translations are now mapped by an associated entity. It gets a little bit easier if you activate CascadeType.PERSIST, so that you can persist your Product entity and Hibernate automatically cascades this operation to all associated LocalizedProduct entities.

And because I modeled a bidirectional association between the Product and the LocalizedProduct entity, I always need to make sure to update both ends of the relationship.

Product p = new Product();
p.setPrice(19.99D);

LocalizedProduct lpDe = new LocalizedProduct();
lpDe.setId(new LocalizedId("de"));
lpDe.setProduct(p);
lpDe.setName("Hibernate Tips - Mehr als 70 Lösungen für typische Hibernateprobleme");
p.getLocalizations().put("de", lpDe);

LocalizedProduct lpEn = new LocalizedProduct();
lpEn.setId(new LocalizedId("en"));
lpEn.setProduct(p);
lpEn.setName("Hibernate Tips - More than 70 solutions to common Hibernate problems");
p.getLocalizations().put("en", lpEn);

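em.persist(p);

Hibernate then persists the Product entity and cascades the operation to both LocalizedProduct entities.

19:19:37,237 DEBUG SQL:92 - 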
    select
        nextval ('hibernate_sequence')
19:19:37,338 DEBUG SQL:92 - 
    insert 
    into
        Product
        (price, supplier_id, version, id) 
    values
        (?, ?, ?, ?)
19:19:37,345 DEBUG SQL:92 - 
    insert 
    into
        LocalizedProduct
        (description, name, id, locale) 
    values
        (?, ?, ?, ?)
19:19:37,357 DEBUG SQL:92 - 
    insert 
    into
        LocalizedProduct
        (description, name, id, locale) 
    values
        (?, ?, ?, ?)
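
Setting both ends of the association by hand, as in the persist example above, is easy to get wrong. A small utility method on the parent can do that in one call. The following is a sketch with annotation-free stand-ins for the two entities. The classes ProductSketch and LocalizedProductSketch and the method addLocalization are hypothetical and not part of the mapping shown above.

```java
import java.util.HashMap;
import java.util.Map;

// Annotation-free stand-in for the LocalizedProduct entity.
class LocalizedProductSketch {
    ProductSketch product;
    String locale;
    String name;
    String description;
}

// Annotation-free stand-in for the Product entity.
class ProductSketch {

    private final Map<String, LocalizedProductSketch> localizations = new HashMap<>();

    Map<String, LocalizedProductSketch> getLocalizations() {
        return localizations;
    }

    // Creates a translation and sets both ends of the bidirectional association.
    LocalizedProductSketch addLocalization(String locale, String name, String description) {
        LocalizedProductSketch localized = new LocalizedProductSketch();
        localized.locale = locale;
        localized.name = name;
        localized.description = description;
        localized.product = this;             // child references its parent
        localizations.put(locale, localized); // parent references its child
        return localized;
    }
}
```

With a method like this on the real entity, the business code would only need one call per translation before it persists the Product.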

Due to the utility methods that return a product name and description for a given locale, retrieving a specific set of translations is very easy. But please keep in mind that these getter methods use the managed association and might cause an additional SQL statement to fetch the LocalizedProduct entities.

Product p = em.createQuery("SELECT p FROM Product p WHERE p.id = 101", Product.class).getSingleResult();
log.info("Product ID:"+p.getId());
log.info("Product: "+p.getName("en"));
log.info("Product Description: "+p.getDescription("en"));

19:25:19,638 DEBUG SQL:92 - 
    select
        product0_.id as id1_2_,
        product0_.price as price2_2_,
        product0_.supplier_id as supplier4_2_,
        product0_.version as version3_2_ 
    from
        Product product0_ 
    where
        product0_.id=101
19:25:19,686  INFO UsabilityText:65 - Product ID:101
19:25:19,695 DEBUG SQL:92 - 
    select
        localizati0_.id as id1_0_0_,
        localizati0_.locale as locale2_0_0_,
        localizati0_.locale as formula1_0_,
        localizati0_.id as id1_0_1_,
        localizati0_.locale as locale2_0_1_,
        localizati0_.description as descript3_0_1_,
        localizati0_.name as name4_0_1_ 
    from
        LocalizedProduct localizati0_ 
    where
        localizati0_.id=?
19:25:19,723  INFO UsabilityText:66 - Product: Hibernate Tips - More than 70 solutions to common Hibernate problems
19:25:19,723  INFO UsabilityText:67 - Product Description: When you use Hibernate in your projects, you quickly recognize that you need to ...

Pros & Cons of Different Entities for Translated and Non-Translated Fields

Storing your translations in a separate table is a little more complicated, but it provides several benefits:

  • Each new translation is stored as a new record in the LocalizedProduct table. That enables you to store new translations without changing your table model.
  • Hibernate’s 2nd level cache provides an easy way to cache the different localizations. In my experience, other attributes of an entity, e.g., the price, change more often than the translations of a name or description. It can, therefore, be a good idea to separate the localizations from the rest of the data to be able to cache them efficiently.

But the mapping also has a few disadvantages:

  • If you want to access the localized attributes, Hibernate needs to execute an additional query to fetch the associated LocalizedProduct entities. You can avoid that by initializing the association when loading the Product entity.
  • Fetching associated LocalizedProduct entities might load translations that you don’t need for your use case.

Conclusion

Using additional columns to store the translations of a field might seem like the most natural and obvious choice. But as I showed you in this article, it’s very inflexible. Supporting an additional language requires you to change your table and your domain model.

You should, therefore, avoid this approach and store the translated and non-translated information in 2 separate database tables. You can then map each table to an entity and model a one-to-many association between them.

This approach allows you to add new translations without changing your domain and table model. But the mapping is also a little more complicated, and Hibernate needs to execute an additional query to retrieve the different localizations. You can avoid these queries by activating the 2nd level cache.

7 Comments

  1. Thank you for sharing your knowledge.

  2. Hello, I’m trying to create a solutions for storing localized entities and got an error, it says:

    Caused by: java.sql.SQLIntegrityConstraintViolationException: ORA-01400: cannot insert NULL into ("ASSET_LOCALIZED"."ASSET_ID")
    I have two entities: HotelImpl & HotelImplLocalized
    mappings are following:

    @Entity
    @Table(name = "asset")
    public class HotelImpl implements Serializable {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private Long id;
    @Column(length = 5)
    private String code;

    @OneToMany(mappedBy = "asset", cascade = {CascadeType.ALL}, orphanRemoval = true)
    @MapKey(name = "hotelImplLocalizedPK.locale")
    private Map localized = new HashMap();

    /*getters … setters*/
    }

    @Entity(name = "HotelImplLocalized")
    @Table(name = "asset_localized")
    public class HotelImplLocalized implements Serializable {

    @ManyToOne
    @MapsId("asset_id")
    @JoinColumn(name = "asset_id")
    @Cascade(CascadeType.ALL)
    private HotelImpl asset;

    @EmbeddedId
    @Cascade(CascadeType.ALL)
    private HotelImplLocalizedPK hotelImplLocalizedPK = new HotelImplLocalizedPK();
    @Column
    private String address;

    /*getters … setters*/
    }

    @Embeddable
    public class HotelImplLocalizedPK implements Serializable {

    @Column(name = "asset_id")
    private Long id;
    private String locale;

    /*getters … setters … equals … hashcode*/

    }

    and I’m saving entities with the following code:

    Session session = sessionFactory.openSession();
    session.beginTransaction();

    HotelImpl hotel = new HotelImpl();
    hotel.setCode("TEST");

    session.saveOrUpdate(hotel);
    session.getTransaction().commit();

    session.beginTransaction();
    HotelImplLocalized localizedHotel = new HotelImplLocalized();
    localizedHotel.setAddress("test");
    String localeEn = "en_us";
    localizedHotel.getHotelImplLocalizedPK().setLocale(localeEn);
    localizedHotel.setAsset(hotel);

    hotel.getLocalized().put(localeEn, localizedHotel);

    session.saveOrUpdate(hotel);

    session.getTransaction().commit();

    session.close();

    complete example is here:
    https://github.com/andreySuhoverka/compositeid-mapping-demo

    Could you help figuring out what I need to do so that hibernate put the id of HotelImpl in HotelImplLocalized .HotelImplLocalizedPK.id ?

    Thanks!

    1. I figured out what the issue was: in @MapsId("id"), id is the name of the field in the composite id object, but I had specified the name of the database column.

  3. Nishit Kumar says:

    What would be the best way to implement Localized Data with olingo(using ODataJPAProcessor) so that the entities are exposed in OData service as well?

    I am trying to implement this but what I found is that the header parameter “Accept-Language”(of OData requests) is passed to the createEntity method of ODataJPAProcessor but not to the readEntitySet method. I am not sure how the read scenario would work.

    1. Thorben Janssen says:

      Interesting question. Olingo looks interesting, but I have never used it in one of my projects.

  4. Good…but…how it is possible to manage different zoned time ?
    I have an app deployed to AWS which has the UTC time, but I am in Italy.
    When I store a record with Instant.now() at 10 AM in Italy, I see the stored record with time 8 AM.

    So…which is the correct way to manage the time zone and the correct data type?

    1. Thorben Janssen says:

      You can persist a timestamp with timezone in your database. Unfortunately, JPA only supports LocalDate and LocalDateTime. Hibernate supports ZonedDateTime but it does that by converting it to the timezone of your JVM, then stores that timestamp without a timezone in the database and converts it back into a ZonedDateTime when you read it.
      You can learn more about it in the following articles:

      Regards,
      Thorben
