Source Code Analysis: Why MyBatis foreach Has Performance Problems

Date: 2019-11-30

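Background

I was recently building something like a consolidated report that had to query all records (the number of rows in the database is bounded), roughly 10,000 of them. The report needs data from three tables, so the 10,000 IDs are used to query the database three times. One of the queries was SQL I wrote myself; the other two go through interfaces provided by other people. At first I did not think much about it and simply used an in query with the MyBatis foreach statement. The project uses jsonrpc to request data, and during testing the request kept failing: the log showed a jsonrpc timeout exception, and a closer look at the log showed the request was blocked in the three SQL queries above.

When I analyzed the MyBatis source code earlier, I learned that foreach can have performance problems, so I changed the SQL: the in list is concatenated in code and referenced in the MyBatis mapper directly with #. After replacing the class and testing again, the data came back almost immediately.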

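Premise

This article does not discuss whether in is a good idea, how to optimize in, or how to replace it with exists or inner join. It only considers the case where in is used together with the MyBatis foreach statement, and how to optimize that. The optimization is simple: concatenate the part after in in code, then use it directly as a string in the mapper file through #{xxx} or ${xxx}.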

Test

Before analyzing the foreach source code, let's construct some data to see how large the difference is.

Table creation statement:

CREATE TABLE person
(
    id int(11) PRIMARY KEY NOT NULL,
    name varchar(50),
    age int(11),
    job varchar(50)
);

Insert 10,000 rows:
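The article does not show how the rows were inserted. One possible way, sketched here under the assumption that any placeholder values will do, is to generate a multi-row INSERT statement in Java. The class name PersonSeedSql and the generated name, age, and job values are hypothetical and only serve to fill the table.

```java
import java.util.StringJoiner;

/** Hypothetical helper: builds one multi-row INSERT for the person table. */
final class PersonSeedSql {

    // Builds "INSERT INTO person (...) VALUES (...),(...)" for ids from..to (inclusive).
    static String buildInsert(int from, int to) {
        StringJoiner rows = new StringJoiner(",",
                "INSERT INTO person (id, name, age, job) VALUES ", "");
        for (int id = from; id <= to; id++) {
            // Made-up values; the original test data is not shown in the article.
            rows.add("(" + id + ",'name" + id + "'," + (20 + id % 30) + ",'job" + id + "')");
        }
        return rows.toString();
    }

    public static void main(String[] args) {
        // Print a small sample; buildInsert(1, 10000) would cover the full data set.
        System.out.println(buildInsert(1, 3));
    }
}
```

For 10,000 rows it may be preferable to split the range into several smaller statements so that a single statement does not exceed the server's packet size limit.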

POJO class:

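import java.io.Serializable;

import lombok.AllArgsConstructor;
import lombok.Getter;
import lombok.NoArgsConstructor;
import lombok.Setter;
import lombok.ToString;

// Lombok generates the getters, setters, toString, and constructors.
@Getter
@Setter
@ToString
@NoArgsConstructor
@AllArgsConstructor
public class Person implements Serializable {
    private int id;
    private String name;
    private String job;
    private int age;
}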

Approach 1

The original approach, using the foreach statement:

1. Define the method in the DAO:

List<Person> queryPersonByIds(@Param("ids") List<Integer> ids);

2. SQL in the mapper file:

<select id="queryPersonByIds" parameterType="list" resultMap="queryPersonMap">
	select * from person where 1=1
	<if test="ids != null and ids.size() > 0">
		and id in
		<foreach collection="ids" item="item" index="index" separator="," open="(" close=")">
			#{item}
		</foreach>
	</if>
</select>

3. Run the test:

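import java.util.ArrayList;
import java.util.List;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

// IPersonService is the project's own service interface (its import is omitted here).
@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(locations = { "classpath:spring-mybatis.xml" })
public class MainTest {

    @Autowired
    private IPersonService personService;

    @Test
    public void test(){
        // Build 10,000 IDs
        List<Integer> ids = new ArrayList<>();
        for (int i = 1; i <= 10000; i++) {
            ids.add(i);
        }
        long start = System.currentTimeMillis();

        // Run the query three times
        personService.queryPersonByIds(ids);
        personService.queryPersonByIds(ids);
        personService.queryPersonByIds(ids);

        long end = System.currentTimeMillis();
        System.out.println(String.format("Elapsed: %d ms", end - start));
    }
}
Result: Elapsed: 2853 ms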

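As shown, the foreach approach takes roughly 3 s for the three queries.

Approach 2

Build the SQL fragment in code and reference it in the mapper file with ${xxx}:

1. Add the method to the DAO: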

List<Person> queryPersonByIds2(@Param("ids") String ids);

2. SQL in the mapper file:

<select id="queryPersonByIds2" parameterType="String" resultMap="queryPersonMap">
	select * from person where 1=1
	<if test="ids != null and ids != ''">
	  and id in ${ids}
	</if>
</select>

3. Run the test:

@Test
public void test_2(){
    // Build the in list in code
    StringBuilder sb = new StringBuilder();
    sb.append("(");
    // Same 10,000 IDs as in Approach 1
    for (int i = 1; i <= 10000; i++) {
        sb.append(i).append(",");
    }
    // Drop the trailing comma
    sb.deleteCharAt(sb.length() - 1);
    sb.append(")");
    // The resulting string is (1,2,3,4,5...)

    long start2 = System.currentTimeMillis();

    // Run the query three times
    personService.queryPersonByIds2(sb.toString());
    personService.queryPersonByIds2(sb.toString());
    personService.queryPersonByIds2(sb.toString());

    long end2 = System.currentTimeMillis();
    System.out.println(String.format("Elapsed: %d ms", end2 - start2));
}
Result: Elapsed: 360 ms

With the SQL concatenated in code and referenced through ${xxx}, the same query takes about 360 ms.
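The manual loop that builds the in list can also be written with the stream API. The following is a minimal sketch, assuming the IDs are available as a List<Integer>; the class name InListBuilder and the method toInList are hypothetical. Because the values are integers, splicing them into the SQL text through ${xxx} cannot carry injected SQL, which would not hold for arbitrary strings.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

/** Hypothetical helper that renders integer IDs as an SQL in list. */
final class InListBuilder {

    // Joins the IDs with commas and wraps them in parentheses, e.g. (1,2,3).
    static String toInList(List<Integer> ids) {
        return ids.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(",", "(", ")"));
    }

    public static void main(String[] args) {
        // Build the IDs 1..10000, as in the test above.
        List<Integer> ids = IntStream.rangeClosed(1, 10000)
                .boxed()
                .collect(Collectors.toList());
        String inList = toInList(ids);
        System.out.println(inList.length() + " characters");
    }
}
```

An empty list produces "()", which is not valid after in, so the caller should skip the condition in that case, as the if test in the mapper does for an empty string.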

Approach 3

Build the SQL fragment in code and reference it in the mapper file with #{xxx}:

1. Add the method to the DAO:

List<Person> queryPersonByIds3(@Param("ids") String ids);

2. SQL in the mapper file:

<select id="queryPersonByIds3" parameterType="String" resultMap="queryPersonMap">
	select * from person where 1=1
	<if test="ids != null and ids != ''">
		and id in (#{ids})
	</if>
</select>

3. Run the test:

@Test
public void test_3(){
    // Build the in list in code
    StringBuilder sb2 = new StringBuilder();
    // Same 10,000 IDs as in Approach 1
    for (int i = 1; i <= 10000; i++) {
        sb2.append(i).append(",");
    }
    // Drop the trailing comma
    sb2.deleteCharAt(sb2.length() - 1);
    // The resulting string is 1,2,3,4,5....

    long start3 = System.currentTimeMillis();

    // Run the query three times
    personService.queryPersonByIds3(sb2.toString());
    personService.queryPersonByIds3(sb2.toString());
    personService.queryPersonByIds3(sb2.toString());

    long end3 = System.currentTimeMillis();
    System.out.println(String.format("Elapsed: %d ms", end3 - start3));
}
Result: Elapsed: 30 ms

With the SQL concatenated in code and referenced through #{xxx}, the query takes about 30 ms.

Summary

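The three approaches differ considerably in elapsed time: the fastest is concatenating the SQL and passing it as a string through #{xxx}, and the slowest is foreach. Why foreach is so much slower is covered in the source analysis later. The other two approaches both concatenate the SQL in code, yet #{xxx} and ${xxx} differ by roughly a factor of 10.

MyBatis handles the two symbols differently. Anyone who has used JDBC knows there are two ways to execute SQL: a Statement object and a PreparedStatement object. A PreparedStatement represents precompiled SQL whose parameters are written as ? placeholders and assigned afterwards with setXXX, whereas a Statement has its SQL parsed and compiled every time it is executed, so PreparedStatement is generally more efficient. In MyBatis, #{xxx} is turned into a ? placeholder whose value is bound afterwards with setXXX, while ${xxx} is substituted into the SQL text directly as a string before the statement is created. (MyBatis's statementType defaults to PREPARED, so both go through a PreparedStatement unless the mapper says otherwise; the difference lies in whether the value is bound or spliced into the text.) That is why # and $ show different timings even though the SQL is concatenated in code in both cases.

One caveat when comparing the numbers: #{ids} binds the whole comma-separated string as a single parameter, so in Approach 3 the database receives id in (?) with one string value rather than 10,000 separate IDs. The rows returned should be checked before treating the 30 ms figure as equivalent work.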
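The difference between the two placeholder styles can be illustrated with a toy model. This is a minimal sketch and not MyBatis's actual implementation: the class PlaceholderDemo and its two methods are hypothetical, and they only mimic the observable effect, namely that ${name} splices the value into the SQL text while #{name} leaves a ? behind and collects the value for later binding.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy model of the two placeholder styles (not MyBatis code). */
final class PlaceholderDemo {
    private static final Pattern DOLLAR = Pattern.compile("\\$\\{(\\w+)\\}");
    private static final Pattern HASH = Pattern.compile("#\\{(\\w+)\\}");

    // ${name}: the parameter value becomes part of the SQL text itself.
    static String substitute(String sql, Map<String, Object> params) {
        Matcher m = DOLLAR.matcher(sql);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = String.valueOf(params.get(m.group(1)));
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    // #{name}: the SQL keeps a ? and the value is collected for binding.
    static String toPrepared(String sql, Map<String, Object> params, List<Object> bound) {
        Matcher m = HASH.matcher(sql);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            bound.add(params.get(m.group(1)));
            m.appendReplacement(out, "?");
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> params = new HashMap<>();
        params.put("ids", "1,2,3");
        List<Object> bound = new ArrayList<>();
        System.out.println(substitute("id in (${ids})", params));
        System.out.println(toPrepared("id in (#{ids})", params, bound) + " with " + bound);
    }
}
```

In the second call the whole string "1,2,3" ends up as one bound value behind a single ?, whereas the first call yields three literal IDs in the SQL text.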

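PS: only three approaches are covered above. Presumably nobody will ask about concatenating the SQL as (1,2,3,4,5) and then referencing it with #{xxx} in the mapper, since that would bind the whole text, parentheses included, as one string value.

foreach source code analysis

Let's look at how foreach is parsed and what the resulting SQL looks like.

In MyBatis, foreach is one of the dynamic tags, and one of the smartest. Each dynamic tag has a corresponding class that parses it; foreach is handled mainly by ForEachSqlNode.

ForEachSqlNode parses the <foreach> node. First, a look at how <foreach> is used: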

<select id="queryPersonByIds" parameterType="list" resultMap="queryPersonMap">
	select * from person where 1=1
	<if test="ids != null and ids.size() > 0">
		and id in
		<foreach collection="ids" item="item" index="index" separator="," open="(" close=")">
			#{item}
		</foreach>
	</if>
</select>

With five IDs, the SQL finally executed by the database is select * from person where 1=1 and id in (1,2,3,4,5)

First, its two inner classes:

PrefixedContext

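This class handles a prefix that must be emitted once, before the first non-blank SQL fragment. In ForEachSqlNode the prefix passed in is the separator, or an empty string for the first element.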

private class PrefixedContext extends DynamicContext {
    private DynamicContext delegate;
    // The prefix to emit
    private String prefix;
    // Whether the prefix has already been emitted
    private boolean prefixApplied;
    // .......

    @Override
    public void appendSql(String sql) {
      // If the prefix has not been emitted yet, emit it before the first non-blank fragment
      if (!prefixApplied && sql != null && sql.trim().length() > 0) {
        delegate.appendSql(prefix);
        prefixApplied = true;
      }
      // Append the SQL fragment
      delegate.appendSql(sql);
    }
}
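The "emit the prefix once" behaviour can be tried in isolation. The following is a minimal sketch, assuming a plain StringBuilder in place of the delegate DynamicContext; the class name PrefixOnceAppender is hypothetical, and unlike the original it skips null fragments instead of passing them on.

```java
/** Standalone sketch of the prefix-once logic (not MyBatis code). */
final class PrefixOnceAppender {
    private final StringBuilder target;
    private final String prefix;
    private boolean prefixApplied;

    PrefixOnceAppender(StringBuilder target, String prefix) {
        this.target = target;
        this.prefix = prefix;
    }

    void appendSql(String sql) {
        // Emit the prefix only once, before the first non-blank fragment.
        if (!prefixApplied && sql != null && sql.trim().length() > 0) {
            target.append(prefix);
            prefixApplied = true;
        }
        // Assumption of this sketch: null fragments are ignored.
        if (sql != null) {
            target.append(sql);
        }
    }

    boolean isPrefixApplied() {
        return prefixApplied;
    }

    public static void main(String[] args) {
        StringBuilder sql = new StringBuilder("#{__frch_item_0}");
        PrefixOnceAppender ctx = new PrefixOnceAppender(sql, ",");
        ctx.appendSql("\n");                 // blank: no separator yet
        ctx.appendSql("#{__frch_item_1}");   // separator goes in front of this
        System.out.println(sql);
    }
}
```

Blank fragments such as the whitespace inside the <foreach> body do not trigger the prefix, which is why the separator always lands directly in front of actual content.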

FilteredDynamicContext

FilteredDynamicContext handles the #{} placeholders. It does not bind any parameters; it only rewrites #{item} into a placeholder such as #{__frch_item_1}.

private static class FilteredDynamicContext extends DynamicContext {
    private DynamicContext delegate;
    // Number appended to the generated placeholder name (ForEachSqlNode passes in the unique number)
    private int index;
    // Name of the index variable
    private String itemIndex;
    // Name of the item variable
    private String item;
    // .............
    // Rewrite #{item}
    @Override
    public void appendSql(String sql) {
      GenericTokenParser parser = new GenericTokenParser("#{", "}", new TokenHandler() {
        @Override
        public String handleToken(String content) {
          // Turn #{item} into something like #{__frch_item_1}
          String newContent = content.replaceFirst("^\\s*" + item + "(?![^.,:\\s])", itemizeItem(item, index));
          // Turn #{itemIndex} into something like #{__frch_itemIndex_1}
          if (itemIndex != null && newContent.equals(content)) {
            newContent = content.replaceFirst("^\\s*" + itemIndex + "(?![^.,:\\s])", itemizeItem(itemIndex, index));
          }
          // Return #{__frch_item_1} or #{__frch_itemIndex_1}
          return new StringBuilder("#{").append(newContent).append("}").toString();
        }
      });
      // Append the rewritten SQL
      delegate.appendSql(parser.parse(sql));
    }

    private static String itemizeItem(String item, int i) {
      return new StringBuilder("__frch_").append(item).append("_").append(i).toString();
    }
}
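The regular expression used in handleToken can be exercised on its own to see which tokens it rewrites. The following is a minimal sketch that reuses the pattern and the naming scheme from the code above; the wrapper class ForeachTokenRewrite and the method rewrite are hypothetical.

```java
/** Standalone sketch of the token rewrite performed for each #{...} (not MyBatis code). */
final class ForeachTokenRewrite {

    // Same naming scheme as in the code above: __frch_<name>_<number>.
    static String itemizeItem(String item, int i) {
        return "__frch_" + item + "_" + i;
    }

    // Rewrites the leading variable name only when it is followed by
    // '.', ',', ':', whitespace, or the end of the token.
    static String rewrite(String content, String item, int index) {
        return content.replaceFirst("^\\s*" + item + "(?![^.,:\\s])", itemizeItem(item, index));
    }

    public static void main(String[] args) {
        System.out.println(rewrite("item", "item", 0));
        System.out.println(rewrite("item.name", "item", 1));
        System.out.println(rewrite("item,jdbcType=INTEGER", "item", 2));
        System.out.println(rewrite("items", "item", 0));
    }
}
```

The negative lookahead keeps a longer identifier such as items from being mistaken for item, while property paths and attribute lists after the name are preserved. This rewrite runs once per placeholder per element, which is part of the per-element cost discussed later.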

ForEachSqlNode

Having looked at the two inner classes of ForEachSqlNode, here is its implementation:

public class ForEachSqlNode implements SqlNode {
  public static final String ITEM_PREFIX = "__frch_";
  // Evaluates expressions; used here to resolve the collection to iterate over
  private ExpressionEvaluator evaluator;
  // Expression for the collection to iterate over
  private String collectionExpression;
  // Child node
  private SqlNode contents;
  // Opening string
  private String open;
  // Closing string
  private String close;
  // Separator
  private String separator;
  // Name for the current element; if the collection is a map, index is the key and item is the value
  private String item;
  // Name for the current iteration index
  private String index;
  private Configuration configuration;

  // ...............

  @Override
  public boolean apply(DynamicContext context) {
    // Get the parameters
    Map<String, Object> bindings = context.getBindings();
    final Iterable<?> iterable = evaluator.evaluateIterable(collectionExpression, bindings);
    if (!iterable.iterator().hasNext()) {
      return true;
    }
    boolean first = true;
    // Append the opening string
    applyOpen(context);
    int i = 0;
    for (Object o : iterable) {
      DynamicContext oldContext = context;
      if (first) {
        // For the first element, the prefix is an empty string
        context = new PrefixedContext(context, "");
      } else if (separator != null) {
        // If a separator is configured, use it as the prefix
        context = new PrefixedContext(context, separator);
      } else {
        // No separator configured: default to an empty string
        context = new PrefixedContext(context, "");
      }
      int uniqueNumber = context.getUniqueNumber();
      if (o instanceof Map.Entry) {
        // For a map, store the entry's key and value in the bindings
        Map.Entry<Object, Object> mapEntry = (Map.Entry<Object, Object>) o;
        // This is where index becomes the key and item becomes the value for a map
        applyIndex(context, mapEntry.getKey(), uniqueNumber);
        applyItem(context, mapEntry.getValue(), uniqueNumber);
      } else {
        // Not a map: store the element's position and the element itself in the bindings
        applyIndex(context, i, uniqueNumber);
        applyItem(context, o, uniqueNumber);
      }
      // Apply the child node through a FilteredDynamicContext, which rewrites the placeholders
      contents.apply(new FilteredDynamicContext(configuration, context, index, item, uniqueNumber));
      if (first) {
        first = !((PrefixedContext) context).isPrefixApplied();
      }
      context = oldContext;
      i++;
    }
    // Append the closing string
    applyClose(context);
    return true;
  }

  private void applyIndex(DynamicContext context, Object o, int i) {
    if (index != null) {
      context.bind(index, o); // key is the index name, value is the current index (or map key)
      context.bind(itemizeItem(index, i), o); // add the prefix and suffix to the index name to form a new key
    }
  }

  private void applyItem(DynamicContext context, Object o, int i) {
    if (item != null) {
      context.bind(item, o);
      context.bind(itemizeItem(item, i), o);
    }
  }
}

So for this example:

<select id="queryPersonByIds" parameterType="list" resultMap="queryPersonMap">
	select * from person where 1=1
	<if test="ids != null and ids.size() > 0">
		and id in
		<foreach collection="ids" item="item" index="index" separator="," open="(" close=")">
			#{item}
		</foreach>
	</if>
</select>

the SQL after parsing is:

select * from person where 1=1 and id in (#{__frch_item_0}, #{__frch_item_1}, #{__frch_item_2}, #{__frch_item_3}, #{__frch_item_4})

The values are then assigned through the setXXX methods of PreparedStatement.
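The net effect of the loop in apply can be sketched without any MyBatis classes: one generated placeholder and one binding per element. This is a simplified illustration, assuming open="(", close=")" and separator="," as in the example; the class ForeachExpansionSketch and the method expand are hypothetical, and the index bindings are left out for brevity.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Simplified illustration of what <foreach> produces (not MyBatis code). */
final class ForeachExpansionSketch {

    // Returns the expanded fragment and fills bindings with one entry per element.
    static String expand(List<?> items, String itemName, Map<String, Object> bindings) {
        StringBuilder sql = new StringBuilder("(");
        int i = 0;
        for (Object o : items) {
            if (i > 0) {
                sql.append(",");          // separator before every element but the first
            }
            String key = "__frch_" + itemName + "_" + i;
            bindings.put(key, o);         // the value is looked up under this key when binding
            sql.append("#{").append(key).append("}");
            i++;
        }
        return sql.append(")").toString();
    }

    public static void main(String[] args) {
        Map<String, Object> bindings = new LinkedHashMap<>();
        String fragment = expand(Arrays.asList(1, 2, 3, 4, 5), "item", bindings);
        System.out.println(fragment);
        System.out.println(bindings.size() + " bindings");
    }
}
```

For 10,000 IDs this yields 10,000 distinct placeholders and 10,000 bindings, each of which still has to be turned into a ? and bound afterwards.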

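So at this point we know that MyBatis still turns foreach into #{} placeholders. Why is it still slow, then? Because placeholders such as #{__frch_item_0} have to be generated and parsed one by one in a loop, and the larger the foreach collection, the slower the parsing. Since the cost lies in parsing the placeholders, the list can be concatenated in code instead of using foreach.

In short, MyBatis parses foreach into the # form rather than the $ form underneath. Knowing this, when the collection to iterate over is very large, concatenate the SQL in code and reference it with (#{xxx}) rather than concatenating (1,2,3,4,5) and using ${xxx}. Because #{xxx} binds the concatenated string as a single parameter, confirm that the query still returns the expected rows.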
