Saturday, May 24, 2014

An absolutely essential thing you should know about the Hibernate cache

You might have used Hibernate for your ORM needs. The most important thing that you should know about the Hibernate session cache (the first-level cache) is that it is implemented using a Map, and a reference to every entity that you have read from the database is kept in that Map. The key of that Map is the entity ID (which could be just a Long or some form of composite primary key you have defined), wrapped in an EntityKey. The value stored is the entity itself (possibly proxied).

If you read 1,000 entities (or rows) from the database, a reference to each one of them will be kept in the Map. If those entities have relationships (one-to-many or many-to-many) with other entities, and those related entities are loaded eagerly, you are asking Hibernate to load a lot more than 1,000 entities.

If you venture into processing a large set of rows (say 100,000 rows), a reference to every one of those rows is going to be kept in the Map. Sometimes the amount of memory needed is so large that your process runs out of memory (as mine did).

So what is the life cycle of this Map? The Map is part of the "persistence context" that is held by the Hibernate session. Hence, when a Hibernate session is created, the Map contains no elements. As you keep reading or inserting entities, the Map accumulates references to them. When you close the session, the Map is cleared.
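To make this life cycle concrete, here is a toy model of it in plain Java. It is only a sketch and not Hibernate's actual implementation: `ToySession`, `load` and `trackedEntities` are hypothetical names, and plain strings stand in for entities.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: this is NOT how Hibernate is implemented internally.
public class ToySession {
    // Plays the role of the persistence context: entity ID -> entity reference.
    private final Map<Long, Object> persistenceContext = new HashMap<>();

    // Simulates reading a row: the loaded entity stays referenced by the map.
    public Object load(long id) {
        return persistenceContext.computeIfAbsent(id, key -> "entity-" + key);
    }

    // Number of entities the session is currently holding on to.
    public int trackedEntities() {
        return persistenceContext.size();
    }

    // Closing the session releases every reference held by the map.
    public void close() {
        persistenceContext.clear();
    }

    public static void main(String[] args) {
        ToySession session = new ToySession();
        System.out.println(session.trackedEntities()); // 0: a new session starts empty

        for (long id = 1; id <= 1000; id++) {
            session.load(id);
        }
        System.out.println(session.trackedEntities()); // 1000: one reference per entity read

        session.close();
        System.out.println(session.trackedEntities()); // 0: references released on close
    }
}
```

The point of the model is that nothing is removed from the map while the session is open, so memory use grows with the number of entities touched.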

If you would like to deal with a large volume of entities, a strategy that you can follow is to process them in batches, so that the session (and with it the Map) is closed or cleared after each batch.
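A batching strategy of this kind can be sketched as follows. This is a simulation in plain Java under stated assumptions, not real Hibernate code: `BatchSketch`, `SHARED_ID` and the counters are hypothetical, and a map stands in for the persistence context. With a real Hibernate `Session`, the clearing step would roughly correspond to calling `Session.clear()` (after `Session.flush()` if there are pending changes), or to closing the session and opening a new one for the next batch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy simulation of batch processing; the map stands in for the persistence context.
public class BatchSketch {
    // Hypothetical entity that every row refers to (for example, a common parent).
    private static final long SHARED_ID = 0L;

    private final Map<Long, String> context = new HashMap<>();
    private int peakTracked = 0;   // largest number of entities held at once
    private int databaseReads = 0; // how many times a "row" had to be read

    // Simulates loading an entity: only a miss in the map counts as a database read.
    private String load(long id) {
        return context.computeIfAbsent(id, key -> {
            databaseReads++;
            return "entity-" + key;
        });
    }

    public void processInBatches(List<Long> ids, int batchSize) {
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            for (long id : ids.subList(start, end)) {
                load(id);        // the row itself
                load(SHARED_ID); // the entity it refers to
                // real per-entity work would go here
            }
            peakTracked = Math.max(peakTracked, context.size());
            // Drop all references before the next batch starts.
            context.clear();
        }
    }

    public int peakTracked() { return peakTracked; }

    public int databaseReads() { return databaseReads; }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long id = 1; id <= 10_000; id++) {
            ids.add(id);
        }
        BatchSketch sketch = new BatchSketch();
        sketch.processInBatches(ids, 1_000);
        System.out.println(sketch.peakTracked());   // 1001: one batch plus the shared entity
        System.out.println(sketch.databaseReads()); // 10010: shared entity read once per batch
    }
}
```

In this run the map never holds more than one batch of rows plus the shared entity, while the shared entity is read ten times, once for each of the ten batches.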

The disadvantage of this method is that if one or more entities are referenced across batches, they will be read again once per batch. But the main advantage is that you will keep your memory utilization low.