Friday, April 23, 2010

Note on allocationSize parameter of @SequenceGenerator while using JPA

I ran into what I thought as an issue while I was using the sequence ID generation strategy in JPA. The JPA provider I am using is Hibernate. I think sharing my experience will save someone some time.

To use a sequence (For e.g. Oracle sequence) to generate values and assign them to the ID field of your persistence object, you will following something like this:

@Id
@Column(name = "ITEM_ID")
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator="ItemIdSeqGenerator")
@SequenceGenerator(name="ItemIdSeqGenerator", sequenceName="ITEM_ID_SEQ", allocationSize=1)
private long itemId;
This means the following things:
  1. The @Id annotation says that the field itemId is a primary key.
  2. The @Column annotation says that the corresponding column in the database is ITEM_ID.
  3. The @GeneratedValue says that the value that needs to be populated in the itemId should be generated, while that object is persisted. The strategy to generate the value is to make use of a sequence. The details of the sequence are provided in the following annotation, that has the generator name "ItemIdSeqGenerator".
  4. The @SequenceGenerator annotation describes the sequence generator named "ItemIdSeqGenerator". The sequence in the database that needs to be used is called ITEM_ID_SEQ. 
So far so good. So roughly it implies that every time you try to persist the object, the itemId will be populated using a SQL command (in case of Oracle) "SELECT ITEM_ID_SEQ.NEXTVAL FROM DUAL". So to persist a new object, you will go to the database twice (once to get the NEXTVAL and once to insert the new row).

Imagine a situation where performance is critical and you are going to do a lot of inserts and a lot of IDs need to be generated. To help in such cases, the allocationSize comes to our help and that's exactly what this blog post is about.

allocationSize=N means that "Go and fetch the next value from the database once in every N persist calls. And locally increment the value by 1 in between.".

Let me give an example. Assume that the sequence is set up like this in the database:
CREATE SEQUENCE ITEM_ID_SEQ START WITH 1 INCREMENT BY 10;
And assume that all the values starting with 1 are legal (1, 2, 3, 4, ...) for the ITEM_ID column. In such case you will set the allocationSize=10. So the first persist call will go and fetch the ITEM_ID_SEQ.NEXTVAL from database. The subsequent persist calls will not go to the database, rather they will return last value+1 locally until the value reaches 10. That saves 9 database reads.

What if there are two entity managers that try to do the same thing? When the first entity manager calls the ITEM_ID_SEQ.NEXTVAL it will get 1 and the second one will get 11. Hence the first one will go on like 1,2,3,...10 and the second one will go on like 11,12,13...20, before fetching the next ITEM_ID_SEQ.NEXTVAL.

What if the allowed values are like 1, 11, 21, 31, ... and still you would like to make use of the allocationSize? So far I could not find how to do this! It would have been possible if we can specify an increment, like "allocationSize=100, increment=10" and set up the sequence as follows in the database:
CREATE SEQUENCE ITEM_ID_SEQ START WITH 1 INCREMENT BY 100;
As of now, if you have such requirement, you will have to set the value of allocationSize=1 and suffer the cost of going to the database every time when an object is persisted.

Remember another thing in mind. The J2EE documentation for the allocationSize says  an ambiguous thing which is quite different from the one I described above:
The amount to increment by when allocating sequence numbers from the sequence.
As per my experiments, the increments were always 1 :-(. This parameter just controlled how frequent to go to the database to fetch NEXTVAL.


If you are using an allocationSize greater than 1, it is important to remember that you should not assume that lower ID means earlier to invoke the NEXTVAL from the sequence. Don't try to interpret the ID in anyway other than just an ID to identify the row.

14 comments:

Anonymous said...

Hi Roy,

Thanks for the useful information - you definitely saved me a few hours of pain.

Regards,
Rennay

grg.bhesh said...

Nice information.

But I want to know if allocationSize in @SequenceGenerator and cashSize in the sequence itself in the database are same or related or can we use them together and if yes how. Can you please share some information on this.

I persisted 1000 entities to the db with cashSize 0 and 500. And I found that the performance was very very poor with cashSize 0.

Sowmya said...

Please read http://stackoverflow.com/questions/5346147/hibernate-oracle-sequence-produces-large-gap. Hibernate uses the hilo algorithm for sequence generation. We had large gaps in the sequence with the default allocation size of 50. We had to change to allocationSize of 1.

Anonymous said...

Roy, you saved my day buddy. Thank you and please keep blogging!!! :)

Anonymous said...

Thanks Roy.

Anonymous said...

Thanks for saving time!

Anonymous said...

For us, the behavior was incorrect unless this hibernate parameter was set. hibernate.id.new_generator_mappings=true

Anonymous said...

Very very helpful blog. Thanks you so much.

Ramil Izmaylov said...

Thank you for this post, it clarifies quite a lot!

prathap kumar said...

Interesting Article

Spring Hibernate Training Online | Hibernate Training in Chennai


Hibernate Training Online | Java Online Course

Anonymous said...

nice article

Anonymous said...

Very good information. I was confused to see sequence increment by 1 even though I set that to 10.

Thanks Roy.

business video backgrounds said...

The ideas as mentioned would help students regarding all those instances and the values which must have been followed by the individual.

Anonymous said...

Nice explanation, saved me much time thx