
Commit f838e53

Added readme file

1 parent 36a4d98 commit f838e53

File tree: 2 files changed (+89, -22 lines)

.gitignore

Lines changed: 0 additions & 22 deletions

@@ -1,33 +1,11 @@
-README.adoc
 target/
 !.mvn/wrapper/maven-wrapper.jar
 !**/src/main/**/target/
 !**/src/test/**/target/
 
-### STS ###
-.apt_generated
-.classpath
-.factorypath
-.project
-.settings
-.springBeans
-.sts4-cache
 
 ### IntelliJ IDEA ###
 .idea
 *.iws
 *.iml
 *.ipr
-
-### NetBeans ###
-/nbproject/private/
-/nbbuild/
-/dist/
-/nbdist/
-/.nb-gradle/
-build/
-!**/src/main/**/build/
-!**/src/test/**/build/
-
-### VS Code ###
-.vscode/

readme.adoc

Lines changed: 89 additions & 0 deletions
= Spring Boot: JPA Bulk Database Insert

In this project, I reduced the time to insert 10k records from 183 seconds to just 5 seconds.

To achieve this, I made the following changes:

==== 1) Batch the records while inserting
i. Set the Hibernate JDBC batch insert size with the following property:

 spring.jpa.properties.hibernate.jdbc.batch_size=30

ii. Add these connection string properties:

 cachePrepStmts=true
 &useServerPrepStmts=true
 &rewriteBatchedStatements=true

e.g.

 jdbc:mysql://localhost:3306/BOOKS_DB?serverTimezone=UTC&cachePrepStmts=true&useServerPrepStmts=true&rewriteBatchedStatements=true
iii. Change the inserting code so that saveAll receives sublists of 30 records, matching the batch size set in the properties file.

A very crude implementation of something like this:

 for (int i = 0; i < totalObjects; i += batchSize) {
     // subList's end index is exclusive; clamp it so the final,
     // possibly smaller, chunk still includes the last record
     int end = Math.min(i + batchSize, totalObjects);
     repository.saveAll(books.subList(i, end));
 }

This alone did not reduce the time by much: it dropped from 183 secs to 153 secs, an improvement of roughly 16%.
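The chunking step above can be factored into a small, reusable helper. A minimal sketch in plain Java (the class and method names `Partition`/`partition` are my choices, not from the project):

```java
import java.util.ArrayList;
import java.util.List;

public class Partition {
    /** Splits a list into consecutive chunks of at most {@code size} elements. */
    static <T> List<List<T>> partition(List<T> items, int size) {
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < items.size(); i += size) {
            // subList's end index is exclusive; clamp it for the final chunk
            chunks.add(items.subList(i, Math.min(i + size, items.size())));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 10; i++) data.add(i);
        List<List<Integer>> chunks = partition(data, 3);
        System.out.println(chunks.size());  // 4 chunks: 3 + 3 + 3 + 1
        System.out.println(chunks.get(3));  // [9]
    }
}
```

Each chunk can then be handed to repository.saveAll(chunk) in turn; Guava's Lists.partition does the same job if it is already on the classpath.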
39+
40+
41+
==== 2) Change the ID generation strategy.
42+
43+
This made a major impact.
44+
45+
I stopped usign the `@GeneratedValue` annotation with strategy i.e `GenerationType.IDENTITY` on my entity class.
46+
Hibernate has disabled batch update with this strategy, Because it has to make a select call to get the id from the database to insert each row.
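For reference, the mapping that forces row-by-row inserts looks like this (an illustrative fragment, not the project's exact code):

```java
// With IDENTITY, the database generates the id on insert,
// so Hibernate must execute each insert individually and cannot batch.
@Entity
public class Book {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
}
```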
I changed the strategy to SEQUENCE and provided a sequence generator:

 @Entity
 public class Book {
     @Id
     @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "seqGen")
     @SequenceGenerator(name = "seqGen", sequenceName = "seq", initialValue = 1)
     private Long id;
 }

This change drastically improved insert performance, as Hibernate could now batch the inserts.
From the previous 153 secs, the time to insert 10k records dropped to only 9 secs. That's an improvement of nearly 95%.
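One detail worth checking with this mapping: JPA's `@SequenceGenerator` has a default allocationSize of 50, and if that does not match the database sequence's increment, id clashes or gaps can appear. A sketch of making it explicit (the values here are illustrative, not from the project):

```java
@Entity
public class Book {
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "seqGen")
    // allocationSize should match the sequence's INCREMENT BY value;
    // larger values mean fewer round trips to fetch id ranges
    @SequenceGenerator(name = "seqGen", sequenceName = "seq",
                       initialValue = 1, allocationSize = 50)
    private Long id;
}
```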
Next, I pushed further with higher batch sizes and noticed that doubling the batch size does not halve the time; the insert time only decreases gradually.
|===
|Batch Size |Time to insert (secs)

|30
|9.5

|60
|6.48

|200
|5.04

|500
|4.46

|1000
|4.39

|2000
|4.5

|5000
|5.09
|===
The optimum I found for my case was a batch size of 1000, which took around 4.39 secs for 10k records. Beyond that, I saw performance degrade, as you can see in the graph.
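To put the table in perspective, the same numbers can be expressed as throughput (simple arithmetic on the figures above):

```java
public class Throughput {
    public static void main(String[] args) {
        int records = 10_000;
        // {batch size, elapsed seconds} pairs taken from the table above
        double[][] runs = { {30, 9.5}, {1000, 4.39}, {5000, 5.09} };
        for (double[] run : runs) {
            // records per second = total records / elapsed seconds
            System.out.printf("batch %.0f -> %.0f records/sec%n",
                              run[0], records / run[1]);
        }
    }
}
```

The best run (batch size 1000) works out to roughly 2,278 records/sec, versus about 55 records/sec at the original 183 seconds.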
