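Recently, in a project, I ran into a scenario that required bulk-inserting data. With about 200,000 rows, the insert took nearly 36 s (the table has indexes).
I was using MyBatis dynamic SQL, with code roughly like this: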
<insert id="batchInsert" useGeneratedKeys="true" keyProperty="id">
    INSERT INTO sample(X,X,X)
    VALUES
    <foreach collection="list" item="item" separator=",">
        (#{X},#{X},#{X})
    </foreach>
</insert>
The SQL that finally gets assembled and executed looks roughly like this:
INSERT INTO sample(X,X,X) VALUES (X,X,X),(X,X,X),(X,X,X) ……
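For query optimization, you can still put some work into the indexes. For bulk inserts, though, there is little room to negotiate on the SQL statement itself. The only thing I could think of was to split the data set into batches and control the length of each executed SQL statement. Is that really the limit?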
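Splitting the data set into batches can be sketched as below. The tests later in this post use Guava's `Lists.partition` for this step; the `BatchSplitter` class here is a hypothetical stand-in for projects without Guava, not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, shown only to illustrate batching without Guava.
public class BatchSplitter {

    /** Splits source into consecutive chunks of at most batchSize elements. */
    public static <T> List<List<T>> partition(List<T> source, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < source.size(); from += batchSize) {
            int to = Math.min(from + batchSize, source.size());
            // subList returns a view; copy it so each batch is independent of the source list.
            batches.add(new ArrayList<>(source.subList(from, to)));
        }
        return batches;
    }
}
```

Each batch then becomes one multi-row INSERT statement, so the batch size directly bounds the length of a single statement.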
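At a colleague's suggestion, I used JProfiler to look at the time breakdown in the call tree. Of the nearly 36 s total, about 10 s went to PreparedStatement.setXXX() assignments. On a whim, I built the entire SQL statement with StringBuilder appends and executed it directly, and it took only about 11 s, more than 3 times faster. I expected some overhead from MyBatis processing dynamic SQL, but the gap was far larger than I imagined.
I isolated this scenario and ran an experiment. The process is as follows.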
Suppose each record to insert has 10 String fields, each holding the value "abcdefghijklmnopqrstuvwxyz". Rows are inserted 10,000 per batch, 500,000 rows in total.
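To put this setup in numbers, here is a small back-of-the-envelope sketch. The `BatchEstimate` class and its method names are hypothetical; the character count assumes each value is written as a quoted literal in the form `("v1","v2",...)` with rows separated by commas, and it ignores the INSERT prefix.

```java
// Hypothetical helper that only does arithmetic on the experiment's parameters.
public class BatchEstimate {
    static final int TOTAL_ROWS = 500_000;
    static final int BATCH_SIZE = 10_000;
    static final int COLUMNS = 10;
    static final int VALUE_LENGTH = "abcdefghijklmnopqrstuvwxyz".length();

    /** Number of INSERT statements executed. */
    static int batches() {
        return TOTAL_ROWS / BATCH_SIZE;
    }

    /** Bind parameters in one prepared statement (one setXXX call each). */
    static int paramsPerBatch() {
        return BATCH_SIZE * COLUMNS;
    }

    /** Bind parameters across the whole run. */
    static long totalParams() {
        return (long) TOTAL_ROWS * COLUMNS;
    }

    /** Characters for one row: two parentheses, quoted values, commas between values. */
    static int charsPerRow() {
        return 2 + COLUMNS * (VALUE_LENGTH + 2) + (COLUMNS - 1);
    }

    /** Characters in the VALUES list of one batch, including commas between rows. */
    static long valuesCharsPerBatch() {
        return (long) BATCH_SIZE * charsPerRow() + (BATCH_SIZE - 1);
    }

    public static void main(String[] args) {
        System.out.println("batches: " + batches());
        System.out.println("parameters per batch: " + paramsPerBatch());
        System.out.println("parameters in total: " + totalParams());
        System.out.println("VALUES characters per batch: " + valuesCharsPerBatch());
    }
}
```

Under these assumptions the run issues 50 statements, each with 100,000 bind parameters (5,000,000 in total), and a concatenated statement carries roughly 2.9 million characters of values. A statement of that size has to fit within any server-side packet limit, such as MySQL's `max_allowed_packet`.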
I reproduced it under SpringBootTest:
DROP TABLE IF EXISTS `sample`;
CREATE TABLE `sample` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`col1` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`col2` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`col3` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`col4` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`col5` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`col6` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`col7` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`col8` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`col9` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`col10` varchar(255) COLLATE utf8_bin DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
import java.io.Serializable;

import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;

@Data
@Builder
@AllArgsConstructor
@NoArgsConstructor
public class Sample implements Serializable {
    private Long id;
    private String col1;
    private String col2;
    private String col3;
    private String col4;
    private String col5;
    private String col6;
    private String col7;
    private String col8;
    private String col9;
    private String col10;
}
public List<Sample> buildData() {
    List<Sample> samples = new ArrayList<>();
    String col = "abcdefghijklmnopqrstuvwxyz";
    for (int i = 0; i < 500000; i++) {
        Sample sample = Sample.builder()
                .col1(col)
                .col2(col)
                .col3(col)
                .col4(col)
                .col5(col)
                .col6(col)
                .col7(col)
                .col8(col)
                .col9(col)
                .col10(col)
                .build();
        samples.add(sample);
    }
    return samples;
}
The dynamic SQL is as follows:
<!-- Assumes the mapper method exposes its list parameter as "samples", e.g. via @Param("samples"). -->
<insert id="batchInsertSamples" parameterType="java.util.List" useGeneratedKeys="true" keyProperty="id">
    INSERT INTO sample(col1,col2,col3,col4,col5,col6,col7,col8,col9,col10)
    VALUES
    <foreach collection="samples" item="item" separator=",">
        (#{item.col1},#{item.col2},#{item.col3},#{item.col4},#{item.col5},
        #{item.col6},#{item.col7},#{item.col8},#{item.col9},#{item.col10})
    </foreach>
</insert>
Test code:
@Autowired
private SampleMapper sampleMapper;

@Test
public void testMyBatis() {
    List<Sample> samples = buildData();
    StopWatch stopWatch = new StopWatch();
    System.out.println("Starting batch insert with MyBatis");
    stopWatch.start();
    // Split the 500,000 rows into batches of 10,000 (Guava's Lists.partition).
    List<List<Sample>> batch = Lists.partition(samples, 10000);
    for (List<Sample> part : batch) {
        sampleMapper.batchInsertSamples(part);
    }
    stopWatch.stop();
    System.out.println("MyBatis batch insert finished, elapsed ms: " + stopWatch.getTotalTimeMillis());
}
Result: 26.439 s.
@Autowired
private JdbcTemplate jdbcTemplate;

@Test
public void testAppend() {
    List<Sample> samples = buildData();
    String prefix = "INSERT INTO sample(col1,col2,col3,col4,col5,col6,col7,col8,col9,col10) VALUES";
    StopWatch stopWatch = new StopWatch();
    System.out.println("Starting insert with directly concatenated SQL");
    stopWatch.start();
    List<List<Sample>> batch = Lists.partition(samples, 10000);
    for (List<Sample> part : batch) {
        StringBuilder sb = new StringBuilder();
        for (Sample sample : part) {
            // Values are embedded as double-quoted literals, which MySQL accepts
            // unless ANSI_QUOTES is enabled. Nothing is escaped here; the test
            // data contains no quotes or backslashes.
            sb.append("(");
            sb.append("\"" + sample.getCol1() + "\"").append(",");
            sb.append("\"" + sample.getCol2() + "\"").append(",");
            sb.append("\"" + sample.getCol3() + "\"").append(",");
            sb.append("\"" + sample.getCol4() + "\"").append(",");
            sb.append("\"" + sample.getCol5() + "\"").append(",");
            sb.append("\"" + sample.getCol6() + "\"").append(",");
            sb.append("\"" + sample.getCol7() + "\"").append(",");
            sb.append("\"" + sample.getCol8() + "\"").append(",");
            sb.append("\"" + sample.getCol9() + "\"").append(",");
            sb.append("\"" + sample.getCol10() + "\"");
            sb.append(")");
            sb.append(",");
        }
        // Drop the trailing comma before executing the statement.
        String sql = prefix + sb.replace(sb.length() - 1, sb.length(), "");
        jdbcTemplate.execute(sql);
    }
    stopWatch.stop();
    System.out.println("Concatenated SQL batch insert finished, elapsed ms: " + stopWatch.getTotalTimeMillis());
}
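Result: 13.473 s.
A 2x difference, which is quite substantial.
I never expected that a simple assignment operation could make this much of a difference once the data volume gets large.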
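One caveat about the concatenation approach: it embeds values directly in the SQL text, and that only works here because the test data contains no quotes or backslashes. Below is a minimal sketch of a quoting helper, assuming MySQL with backslash escapes enabled (that is, `NO_BACKSLASH_ESCAPES` is not set). `SqlLiteral.quote` is a hypothetical name, not a library API, and the sketch does not cover every edge case, so parameter binding remains the safer default for untrusted input.

```java
// Hypothetical helper: renders a Java string as a single-quoted MySQL string literal.
// Assumption: the server treats backslash as an escape character (the MySQL default).
public class SqlLiteral {

    public static String quote(String value) {
        if (value == null) {
            return "NULL";
        }
        StringBuilder sb = new StringBuilder(value.length() + 2);
        sb.append('\'');
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '\\':
                    sb.append("\\\\"); // one backslash becomes two
                    break;
                case '\'':
                    sb.append("''");   // a single quote is doubled
                    break;
                default:
                    sb.append(c);
            }
        }
        sb.append('\'');
        return sb.toString();
    }
}
```

In the concatenation test, each `"\"" + sample.getColN() + "\""` expression could be replaced by `SqlLiteral.quote(sample.getColN())`. The extra per-character work may narrow the measured gap somewhat; that has not been measured here.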