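While building the combined question query, I was assembling predicates with Predicate as usual and ran into a tricky problem.
The requirement is to query questions. Besides the usual common conditions such as course and model, the query must return questions that have not been used yet, plus the questions used by the current paper.
The question list has to contain every question the current paper may choose from, hence this design.
This was the initial implementation. It builds an OR condition: the question's subjectSpread is null, or the paper that its subjectSpread belongs to is the current paper.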
    return subjectRepository.findAll((Specification<Subject>) (root, query, builder) -> {
        // Only top-level questions (parent is null)
        Predicate predicate = root.get("parent").isNull();

        logger.debug("Build the used/unused condition");
        Predicate usedPredicate = root.get("subjectSpread").isNull();

        logger.debug("Build the condition from the paper id");
        if (paperId != null) {
            Predicate belongPredicate = builder.equal(
                    root.join("subjectSpread").join("part").join("paper").get("id").as(Long.class),
                    paperId);
            usedPredicate = builder.or(usedPredicate, belongPredicate);
        }

        logger.debug("Combine the predicates");
        predicate = builder.and(predicate, usedPredicate);
        return predicate;
    }, pageable);
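The logic for building the conditions looks fine, but testing showed that the endpoint only returns the current paper's questions; it never returns questions whose subjectSpread is null.
The project has the show-sql option enabled, which prints the SQL generated by Hibernate to the console.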
    spring:
      jpa:
        show-sql: true
The SQL generated by Hibernate is as follows:
    SELECT
        subject0_.id AS id1_15_,
        subject0_.analysis AS analysis2_15_,
        subject0_.course_id AS course_i7_15_,
        subject0_.create_time AS create_t3_15_,
        subject0_.create_user_id AS create_u8_15_,
        subject0_.difficult AS difficul4_15_,
        subject0_.mark AS mark5_15_,
        subject0_.model_id AS model_id9_15_,
        subject0_.p_id AS p_id10_15_,
        subject0_.stem AS stem6_15_,
        subject0_.subject_spread_id AS subject11_15_
    FROM SUBJECT subject0_
    INNER JOIN subject_spread subjectspr1_ ON subject0_.subject_spread_id = subjectspr1_.id
    INNER JOIN part part2_ ON subjectspr1_.part_id = part2_.id
    INNER JOIN paper paper3_ ON part2_.paper_id = paper3_.id
    INNER JOIN course course4_ ON subject0_.course_id = course4_.id
    INNER JOIN model model5_ ON subject0_.model_id = model5_.id
    WHERE (subject0_.p_id IS NULL)
      AND (subject0_.subject_spread_id IS NULL OR paper3_.id = 2)
      AND course4_.id = 1
      AND model5_.id = 2
    ORDER BY subject0_.id DESC
The problem lies in these INNER JOIN lines:
    SUBJECT subject0_
    INNER JOIN subject_spread subjectspr1_ ON subject0_.subject_spread_id = subjectspr1_.id
    INNER JOIN part part2_ ON subjectspr1_.part_id = part2_.id
    INNER JOIN paper paper3_ ON part2_.paper_id = paper3_.id
    INNER JOIN course course4_ ON subject0_.course_id = course4_.id
For the differences between left, right, and inner joins, see the diagram in the StackOverflow question "How do I decide when to use right joins/left joins or inner joins Or how to determine which table is on which side?".
Now look at the following SQL:
subject INNER JOIN subject_spread ON subject.subject_spread_id = subject_spread.id
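The subject table is inner-joined to subject_spread on subject.subject_spread_id = subject_spread.id. A NULL subject_spread_id never satisfies that equality, so those rows find no join partner and the join drops them before the WHERE clause is evaluated. That is why the unused questions never show up.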
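The effect of the join type on rows with a NULL foreign key can be reproduced without a database. The following is a minimal in-memory sketch in Java, not the project's code: the Subject class, the sample data, and the spreadToPaper map (which collapses the subject_spread, part, paper chain into a single lookup) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class JoinDemo {
    /** Simplified subject row: only the id and the nullable foreign key. */
    static final class Subject {
        final long id;
        final Long subjectSpreadId; // null means the question is unused

        Subject(long id, Long subjectSpreadId) {
            this.id = id;
            this.subjectSpreadId = subjectSpreadId;
        }
    }

    /** Hypothetical sample data: subject 1 is unused, 2 is on paper 2, 3 is on paper 3. */
    static List<Subject> sampleSubjects() {
        return Arrays.asList(new Subject(1L, null), new Subject(2L, 10L), new Subject(3L, 20L));
    }

    /** Hypothetical lookup: subject_spread id -> paper id. */
    static Map<Long, Long> sampleSpreadToPaper() {
        Map<Long, Long> map = new HashMap<>();
        map.put(10L, 2L);
        map.put(20L, 3L);
        return map;
    }

    /**
     * Emulates: subject [INNER|LEFT] JOIN ... paper
     * WHERE subject_spread_id IS NULL OR paper.id = paperId
     */
    static List<Long> query(List<Subject> subjects, Map<Long, Long> spreadToPaper,
                            long paperId, boolean leftJoin) {
        List<Long> ids = new ArrayList<>();
        for (Subject s : subjects) {
            Long joinedPaperId = s.subjectSpreadId == null ? null : spreadToPaper.get(s.subjectSpreadId);
            boolean matched = joinedPaperId != null;
            if (!matched && !leftJoin) {
                continue; // INNER JOIN discards rows that have no join partner
            }
            // The WHERE clause only sees rows that survived the join
            if (s.subjectSpreadId == null || (matched && joinedPaperId == paperId)) {
                ids.add(s.id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(query(sampleSubjects(), sampleSpreadToPaper(), 2L, false)); // prints [2]
        System.out.println(query(sampleSubjects(), sampleSpreadToPaper(), 2L, true));  // prints [1, 2]
    }
}
```

With the inner join the unused subject 1 is gone before the IS NULL test can ever match it; with the left join it survives and the OR condition picks it up.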
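Since one query cannot fetch them, run two queries and UNION the two result sets.
Unfortunately JPA does not support UNION, so this has to be done with a native SQL query.
The SQL is shown below. It modifies the original statement, adds a second statement that selects the unused questions, UNIONs the two result sets, then applies ORDER BY, and finally paginates.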
    SELECT subject.*
    FROM subject
    INNER JOIN course ON subject.course_id = course.id
    INNER JOIN model ON subject.model_id = model.id
    WHERE subject.p_id IS NULL
      AND subject.subject_spread_id IS NULL
      AND course.id = ?
      AND model.id = ?
    UNION
    SELECT subject.*
    FROM subject
    INNER JOIN subject_spread ON subject.subject_spread_id = subject_spread.id
    INNER JOIN part ON subject_spread.part_id = part.id
    INNER JOIN paper ON part.paper_id = paper.id
    INNER JOIN course ON subject.course_id = course.id
    INNER JOIN model ON subject.model_id = model.id
    WHERE subject.p_id IS NULL
      AND paper.id = ?
      AND course.id = ?
      AND model.id = ?
    ORDER BY id DESC
    LIMIT ?
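The order of operations in that statement (merge, then sort, then cut the page) can be sketched in plain Java. This is only an in-memory illustration of the semantics, working on lists of ids; the method name unionOrderLimit and the sample values are hypothetical, and limit is assumed to be non-negative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class UnionDemo {
    /** Emulates (first UNION second) ORDER BY id DESC LIMIT limit on lists of ids. */
    static List<Long> unionOrderLimit(List<Long> first, List<Long> second, int limit) {
        Set<Long> merged = new LinkedHashSet<>(first); // UNION removes duplicate rows
        merged.addAll(second);
        List<Long> sorted = new ArrayList<>(merged);
        sorted.sort(Collections.reverseOrder());        // ORDER BY id DESC over the merged set
        int end = Math.min(limit, sorted.size());       // LIMIT is applied last
        return new ArrayList<>(sorted.subList(0, end));
    }

    public static void main(String[] args) {
        List<Long> unused = Arrays.asList(1L, 4L, 7L);   // ids from the first SELECT
        List<Long> usedByPaper = Arrays.asList(2L, 9L);  // ids from the second SELECT
        System.out.println(unionOrderLimit(unused, usedByPaper, 4)); // prints [9, 7, 4, 2]
    }
}
```

Sorting and limiting each branch separately would give a different page, which is why the ORDER BY and LIMIT sit after the UNION.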
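After asking Mr. Pan, I found it does not need to be this complicated. The earlier query was wrong because I did not understand JOIN well enough: this scenario calls for a left join, not the default inner join.
The second parameter of the join method is the join type. I had never used it and had always relied on the default INNER join type.
The Predicate query changed to use left joins: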
    Predicate belongPredicate = builder.equal(
            root.join("subjectSpread", JoinType.LEFT)
                .join("part", JoinType.LEFT)
                .join("paper", JoinType.LEFT)
                .get("id").as(Long.class),
            paperId);
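Both approaches do the job, so let's compare their query performance on a large data set.
Until now I had always generated bulk data with JPA's saveAll method, assuming that saveAll executes a single SQL statement and is therefore faster than calling save in a for loop.
A recent discussion with a classmate overturned that mistaken belief.
He inserted one thousand rows into MySQL with saveAll and it took a long time (I forget the exact figure, something like tens of seconds); in the end he switched to hand-built SQL with MyBatis.
After looking into it, I found that saveAll really is just a for loop, so no wonder it is slow.
For bulk data I will not use JPA again; writing the SQL myself works better.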
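To make the "saveAll is a for loop" point concrete, here is a simplified stand-in written for illustration. It is not Spring Data's source: CountingRepository and insertsFor are hypothetical names, and the counter merely simulates one INSERT per save call.

```java
import java.util.ArrayList;
import java.util.List;

class SaveAllDemo {
    /** Hypothetical in-memory repository that counts simulated INSERT statements. */
    static final class CountingRepository<T> {
        final List<T> rows = new ArrayList<>();
        int insertStatements = 0;

        T save(T entity) {
            insertStatements++; // each save stands for one INSERT
            rows.add(entity);
            return entity;
        }

        /** Loops over the entities and delegates to save, one entity at a time. */
        List<T> saveAll(Iterable<T> entities) {
            List<T> result = new ArrayList<>();
            for (T entity : entities) {
                result.add(save(entity));
            }
            return result;
        }
    }

    /** Saves n dummy entities through saveAll and reports how many inserts were issued. */
    static int insertsFor(int n) {
        List<Integer> entities = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            entities.add(i);
        }
        CountingRepository<Integer> repository = new CountingRepository<>();
        repository.saveAll(entities);
        return repository.insertStatements;
    }

    public static void main(String[] args) {
        System.out.println(insertsFor(1000)); // prints 1000
    }
}
```

The real saveAll also runs inside a transaction and goes through the persistence context, which this sketch leaves out; the point is only that the statement count grows with the number of entities.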
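A few rows of test data were initialized in the database.
A stored procedure then doubles the data 16 times, giving 393,216 rows in total. (Exponentiation really is the greatest operation in the world.)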
    CREATE PROCEDURE BIG_DATA()
    BEGIN
        DECLARE i INT DEFAULT 0;
        WHILE i < 16 DO
            INSERT INTO subject(analysis, create_time, difficult, mark, stem, course_id, model_id, subject_spread_id)
            SELECT analysis, create_time, difficult, mark, stem, course_id, model_id, subject_spread_id
            FROM subject;
            SET i = i + 1;
        END WHILE;
    END
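The row count follows directly from the loop: every pass inserts a copy of the whole table, so the table doubles each time. A small Java sketch of that arithmetic follows. The seed count of 6 is not stated in the text; it is inferred here from 393,216 = 6 × 2^16 and should be read as an assumption.

```java
class DoublingDemo {
    /** Mirrors the WHILE loop: each round appends a copy of every existing row. */
    static long rowsAfterDoubling(long initialRows, int rounds) {
        long rows = initialRows;
        for (int i = 0; i < rounds; i++) {
            rows += rows; // INSERT INTO subject ... SELECT ... FROM subject
        }
        return rows;
    }

    public static void main(String[] args) {
        // 6 seed rows is an inferred value, see the note above
        System.out.println(rowsAfterDoubling(6, 16)); // prints 393216
    }
}
```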
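Hibernate-generated SQL carries some overhead and would perform differently from native SQL run through JdbcTemplate, so we leave Hibernate out and compare LEFT JOIN and UNION at the database level only.
The LEFT JOIN query is as follows: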
    SELECT subject.*
    FROM subject
    LEFT JOIN subject_spread ON subject.subject_spread_id = subject_spread.id
    LEFT JOIN part ON subject_spread.part_id = part.id
    LEFT JOIN paper ON part.paper_id = paper.id
    INNER JOIN course ON subject.course_id = course.id
    INNER JOIN model ON subject.model_id = model.id
    WHERE subject.p_id IS NULL
      AND (subject.subject_spread_id IS NULL OR paper.id = 2)
      AND course.id = 1
      AND model.id = 2
    ORDER BY subject.id DESC
Execution time: 7.283 seconds.
The UNION query is as follows:
    SELECT subject.*
    FROM subject
    INNER JOIN course ON subject.course_id = course.id
    INNER JOIN model ON subject.model_id = model.id
    WHERE subject.p_id IS NULL
      AND subject.subject_spread_id IS NULL
      AND course.id = 1
      AND model.id = 2
    UNION
    SELECT subject.*
    FROM subject
    INNER JOIN subject_spread ON subject.subject_spread_id = subject_spread.id
    INNER JOIN part ON subject_spread.part_id = part.id
    INNER JOIN paper ON part.paper_id = paper.id
    INNER JOIN course ON subject.course_id = course.id
    INNER JOIN model ON subject.model_id = model.id
    WHERE subject.p_id IS NULL
      AND paper.id = 2
      AND course.id = 1
      AND model.id = 2
    ORDER BY id DESC
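Execution time: 16.450 seconds.
Comparing the two, UNION took a little over twice as long as LEFT JOIN (16.450 s versus 7.283 s, roughly 2.3 times), because the database ran two conditional queries.
Unless the business requires it, avoid UNION and similar ways of combining the results of multiple statements in SQL; multiple queries mean more time spent.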