Saiku queries run slowly

While using Saiku I ran into one serious problem: it is slow. Below is how I traced the cause and what I did to speed it up.

1. First, capture the network traffic. This reveals the Ajax request: http://l-tdata2.tkt.cn6.qunar.com:8080/saiku/rest/saiku/api/query/execute

The request carries a large number of parameters. [Screenshot of the request parameters]

2. Check the log. It contains the MDX statement:

WITH
SET [~ROWS_create_date_create_date] AS
    {[create_date].[create_date].[2016-04-12]}
SET [~ROWS_dimPartner_dimPartner] AS
    Hierarchize({{[dimPartner].[dimPartner].[All dimPartners]}, {[dimPartner].[dimPartner].[name].Members}})
SET [~ROWS_in_track_in_track] AS
    {[in_track].[in_track].[All in_tracks]}
SET [~ROWS_product_product] AS
    {[product].[product].[All products]}
SET [~ROWS_self_self] AS
    {[self].[self].[All selfs]}
SET [~ROWS_sight_sight] AS
    {[sight].[sight].[All sights]}
SET [~ROWS_ticket_type_ticket_type] AS
    {[ticket_type].[ticket_type].[All ticket_types]}
SET [~ROWS_order_status_order_status] AS
    {[order_status].[order_status].[All order_statuss]}
SET [~ROWS_refund_status_refund_status] AS
    {[refund_status].[refund_status].[All refund_statuss]}
SELECT
NON EMPTY {[Measures].[money], [Measures].[quantity], [Measures].[qunar_income], [Measures].[order_num]} ON COLUMNS,
NON EMPTY Order(NonEmptyCrossJoin([~ROWS_create_date_create_date], NonEmptyCrossJoin([~ROWS_dimPartner_dimPartner], NonEmptyCrossJoin([~ROWS_in_track_in_track], NonEmptyCrossJoin([~ROWS_product_product], NonEmptyCrossJoin([~ROWS_self_self], NonEmptyCrossJoin([~ROWS_sight_sight], NonEmptyCrossJoin([~ROWS_ticket_type_ticket_type], NonEmptyCrossJoin([~ROWS_order_status_order_status], [~ROWS_refund_status_refund_status])))))))), [Measures].[money], BDESC) ON ROWS
FROM [com_order_detail_cube]
2016-04-15 17:02:55,948 INFO  [org.saiku.datasources.connection.SaikuOlapConnection] Clearing cache
2016-04-15 17:03:26,603 WARN  [mondrian.rolap.RolapSchema] Model is in legacy format
2016-04-15 17:04:09,714 INFO  [org.saiku.datasources.connection.SaikuOlapConnection] Catalogs:1
2016-04-15 17:06:03,660 DEBUG [org.saiku.service.olap.ThinQueryService] Query End
2016-04-15 17:06:03,661 INFO  [org.saiku.service.olap.ThinQueryService] RUN#:84 Size: 14/7      Execute:        190420ms        Format: 0ms     Totals: 0ms      Total: 190420ms

The log shows why the front end keeps waiting without a response: executing the MDX takes a long time, about 190 seconds (Execute: 190420ms).
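When comparing several runs, it helps to pull the phase timings out of the ThinQueryService INFO line instead of reading them by eye. The following is a minimal sketch, assuming the log line has the layout shown above; the class name `SaikuTimingLine` and its `parse` method are hypothetical helpers, not part of Saiku.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Saiku): extracts the phase timings from a
// ThinQueryService INFO line such as
// "RUN#:84 Size: 14/7 Execute: 190420ms Format: 0ms Totals: 0ms Total: 190420ms".
public class SaikuTimingLine {
    // Whitespace between fields may be tabs or spaces, so match it loosely.
    private static final Pattern TIMINGS = Pattern.compile(
        "Execute:\\s*(\\d+)ms\\s+Format:\\s*(\\d+)ms\\s+Totals:\\s*(\\d+)ms\\s+Total:\\s*(\\d+)ms");

    /** Returns {execute, format, totals, total} in milliseconds, or null if the line has no timings. */
    public static long[] parse(String logLine) {
        Matcher m = TIMINGS.matcher(logLine);
        if (!m.find()) {
            return null;
        }
        long[] out = new long[4];
        for (int i = 0; i < 4; i++) {
            out[i] = Long.parseLong(m.group(i + 1));
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "2016-04-15 17:06:03,661 INFO  [org.saiku.service.olap.ThinQueryService] "
            + "RUN#:84 Size: 14/7      Execute:        190420ms        Format: 0ms     Totals: 0ms      Total: 190420ms";
        long[] t = parse(line);
        // Print how much of the total time is spent executing the MDX.
        System.out.println("execute=" + t[0] + "ms, total=" + t[3] + "ms");
    }
}
```

For the run above, the execute phase accounts for the entire total, so formatting and totals are not worth optimizing.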

3. Find the code: the execute method of org.saiku.web.rest.resources.Query2Resource.

Tracing further leads to the execute method of org.saiku.service.olap.ThinQueryService. This is the core part:

    private CellDataSet execute(ThinQuery tq, ICellSetFormatter formatter) {
        try {

            Long start = (new Date()).getTime();
            log.debug("Query Start");
            CellSet cellSet =  executeInternalQuery(tq); // the MDX statement is executed here; this is the slow step
            log.debug("Query End");
            String runId = "RUN#:" + ID_GENERATOR.get();
            Long exec = (new Date()).getTime();

            CellDataSet result = OlapResultSetUtil.cellSet2Matrix(cellSet,formatter);
            Long format = (new Date()).getTime();

            if (ThinQuery.Type.QUERYMODEL.equals(tq.getType()) && formatter instanceof FlattenedCellSetFormatter && tq.hasAggregators()) {
                calculateTotals(tq, result, cellSet, formatter);
            }
            Long totals = (new Date()).getTime();
            log.info(runId + "\tSize: " + result.getWidth() + "/" + result.getHeight() + "\tExecute:\t" + (exec - start)
                    + "ms\tFormat:\t" + (format - exec) + "ms\tTotals:\t" + (totals - format) + "ms\t Total: " + (totals - start) + "ms");

            result.setRuntime(new Double(format - start).intValue());
            return result;
        } catch (Exception | Error e) {
            throw new SaikuServiceException("Can't execute query: " + tq.getName(),e);
        }
    }

4. Look at the SQL that is actually executed to see why it is so slow.

4.1 Selecting members

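First, every member selection runs a full table scan over the corresponding field of the cube. For example, my cube is backed by the table com_order_detail_view and the time dimension maps to the column create_date. When I pick a date, the captured SQL is: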

select "com_order_detail_view"."create_date" as "c0" from "com_order_detail_view" as "com_order_detail_view" group by "com_order_detail_view"."create_date" order by "com_order_detail_view"."create_date" ASC NULLS LAST

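There is no WHERE clause at all. Fair enough, that is understandable: this query only lists the distinct dates available for selection.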

4.2 Executing the query

To keep the query cheap, I selected a single date and used only the name field as the breakdown. Execution time: 190 s.

The captured SQL:

4.2.1

select "dim_partner"."name" as "c0", sum("com_order_detail_view"."money") as "m0", sum("com_order_detail_view"."quantity") as "m1", sum("com_order_detail_view"."qunar_income") as "m2", count(distinct "com_order_detail_view"."display_id") as "m3" from "com_order_detail_view" as "com_order_detail_view", "dim_partner" as "dim_partner" where "com_order_detail_view"."partner" = "dim_partner"."code" group by "dim_partner"."name"

4.2.2

 select sum("com_order_detail_view"."money") as "m0", sum("com_order_detail_view"."quantity") as "m1", sum("com_order_detail_view"."qunar_income") as "m2", count(distinct "com_order_detail_view"."display_id") as "m3" from "com_order_detail_view" as "com_order_detail_view"

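Neither statement has a WHERE condition on the date. My guess is that because the selected date is not part of the filter, the whole table gets scanned. After moving the date into the filter, the MDX changes to: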

WITH
SET [~FILTER] AS
    {[create_date].[create_date].[2016-04-01]}
SET [~ROWS_dimPartner_dimPartner] AS
    Hierarchize({{[dimPartner].[dimPartner].[All dimPartners]}, {[dimPartner].[dimPartner].[name].Members}})
SET [~ROWS_in_track_in_track] AS
    {[in_track].[in_track].[All in_tracks]}
SET [~ROWS_product_product] AS
    {[product].[product].[All products]}
SET [~ROWS_self_self] AS
    {[self].[self].[All selfs]}
SET [~ROWS_sight_sight] AS
    {[sight].[sight].[All sights]}
SET [~ROWS_ticket_type_ticket_type] AS
    {[ticket_type].[ticket_type].[All ticket_types]}
SET [~ROWS_order_status_order_status] AS
    {[order_status].[order_status].[All order_statuss]}
SET [~ROWS_refund_status_refund_status] AS
    {[refund_status].[refund_status].[All refund_statuss]}
SELECT
NON EMPTY {[Measures].[money], [Measures].[quantity], [Measures].[qunar_income], [Measures].[order_num]} ON COLUMNS,
NON EMPTY Order(NonEmptyCrossJoin([~ROWS_dimPartner_dimPartner], NonEmptyCrossJoin([~ROWS_in_track_in_track], NonEmptyCrossJoin([~ROWS_product_product], NonEmptyCrossJoin([~ROWS_self_self], NonEmptyCrossJoin([~ROWS_sight_sight], NonEmptyCrossJoin([~ROWS_ticket_type_ticket_type], NonEmptyCrossJoin([~ROWS_order_status_order_status], [~ROWS_refund_status_refund_status]))))))), [Measures].[money], BDESC) ON ROWS
FROM [com_order_detail_cube]
WHERE [~FILTER]
2016-04-15 17:13:02,448 DEBUG [org.saiku.service.olap.ThinQueryService] Query End
2016-04-15 17:13:02,449 INFO  [org.saiku.service.olap.ThinQueryService] RUN#:86 Size: 13/8      Execute:        20679ms Format: 1ms     Totals: 0ms      Total: 20680ms

This makes a difference: execution time drops to about 20 s, roughly nine times faster. The captured SQL:

4.2.3

select "com_order_detail_view"."create_date" as "c0", "dim_partner"."name" as "c1", sum("com_order_detail_view"."money") as "m0", sum("com_order_detail_view"."quantity") as "m1", sum("com_order_detail_view"."qunar_income") as "m2", count(distinct "com_order_detail_view"."display_id") as "m3" from "com_order_detail_view" as "com_order_detail_view", "dim_partner" as "dim_partner" where "com_order_detail_view"."create_date" = DATE '2016-04-01' and "com_order_detail_view"."partner" = "dim_partner"."code" group by "com_order_detail_view"."create_date", "dim_partner"."name"

4.2.4

select "com_order_detail_view"."create_date" as "c0", sum("com_order_detail_view"."money") as "m0", sum("com_order_detail_view"."quantity") as "m1", sum("com_order_detail_view"."qunar_income") as "m2", count(distinct "com_order_detail_view"."display_id") as "m3" from "com_order_detail_view" as "com_order_detail_view" where "com_order_detail_view"."create_date" = DATE '2016-04-01' group by "com_order_detail_view"."create_date"
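When many captured statements have to be triaged, a small textual check can flag the ones that do not restrict the date column. This is a rough sketch under stated assumptions: `CapturedSqlCheck` and `filtersOn` are hypothetical names, the check relies on the double-quoted identifier style seen in the SQL above, and it is plain string matching rather than a SQL parser (a join condition on the same column would also count as a match).

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper: rough textual check of captured SQL, not a SQL parser.
public class CapturedSqlCheck {
    /**
     * Returns true if the WHERE clause (up to GROUP BY, if any) compares the
     * given double-quoted column to something. Join conditions also match.
     */
    public static boolean filtersOn(String sql, String column) {
        String lower = sql.toLowerCase(Locale.ROOT);
        int where = lower.indexOf(" where ");
        if (where < 0) {
            return false; // no WHERE clause at all: full scan
        }
        int end = lower.indexOf(" group by ", where);
        String clause = end < 0 ? lower.substring(where) : lower.substring(where, end);
        // Matches e.g.: "create_date" = date '2016-04-01'
        Pattern p = Pattern.compile(
            Pattern.quote("\"" + column.toLowerCase(Locale.ROOT) + "\"")
                + "\\s*(=|<|>|in\\b|between\\b)");
        return p.matcher(clause).find();
    }

    public static void main(String[] args) {
        // Shortened versions of statements 4.2.2 and 4.2.4 above.
        String unfiltered = "select sum(\"com_order_detail_view\".\"money\") as \"m0\" "
            + "from \"com_order_detail_view\" as \"com_order_detail_view\"";
        String filtered = "select \"com_order_detail_view\".\"create_date\" as \"c0\" "
            + "from \"com_order_detail_view\" as \"com_order_detail_view\" "
            + "where \"com_order_detail_view\".\"create_date\" = DATE '2016-04-01' "
            + "group by \"com_order_detail_view\".\"create_date\"";
        System.out.println(filtersOn(unfiltered, "create_date"));
        System.out.println(filtersOn(filtered, "create_date"));
    }
}
```

Statements for which the check returns false on create_date are the ones that scan the whole table regardless of the selected date.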

 

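Summary: in Saiku, putting the time condition on Rows or Columns has essentially no effect, because the generated SQL gets no WHERE clause for it. Put it in Filter instead.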
