How harfbuzz-ng selects a shaper
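When shaping text, harfbuzz-ng still picks a shaper based on the characteristics of the text and then hands the actual shaping work over to that shaper. The difference is that old HarfBuzz simply looked up the script of the input string in a table and picked the shaper directly, whereas harfbuzz-ng looks at the characteristics of the font file as well as the properties of the string (its script) before settling on a shaper.

The notion of a shaper also differs somewhat between the two. In old HarfBuzz, shapers were mostly implemented for one particular language each and relied on the internal OpenType machinery to provide advanced OpenType rendering; you could call them language shapers. In harfbuzz-ng the shapers are things like the Graphite2 shaper and the OpenType shaper; you could call them font shapers. Thanks to this structure, a user only has to write one client against harfbuzz-ng and can then plug other shaping engines such as Graphite2 in underneath it to get the best shaping for a string. harfbuzz-ng itself mainly implements the OpenType shaper, so below we focus mostly on what relates to that one.

Let's look at the logic harfbuzz-ng uses to select a shaper. The code analysis below is necessarily based on one specific version of harfbuzz, namely 0.9.10. harfbuzz is still under active development, so parts of this analysis may not apply to future versions.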
First, let's look at hb_shape(), the main entry point of harfbuzz-ng:
void
hb_shape (hb_font_t           *font,
          hb_buffer_t         *buffer,
          const hb_feature_t  *features,
          unsigned int         num_features)
{
  hb_shape_full (font, buffer, features, num_features, NULL);
}
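This function takes a font and a buffer. The font carries font-related information, such as the contents of the font file in use and the font size; the buffer carries information about the string, such as its contents, direction and script. The features and num_features arguments are usually NULL and 0, because clients normally do not need to decide for themselves which features are required to shape a string; choosing the features can be left entirely to harfbuzz-ng. Once the string has been shaped, the result is returned to the caller through the same buffer argument.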
As you can see, hb_shape() is very simple: it just calls hb_shape_full() to do all the work. So next, the implementation of hb_shape_full():
hb_bool_t
hb_shape_full (hb_font_t          *font,
               hb_buffer_t        *buffer,
               const hb_feature_t *features,
               unsigned int        num_features,
               const char * const *shaper_list)
{
  if (unlikely (!buffer->len))
    return true;

  assert (buffer->content_type == HB_BUFFER_CONTENT_TYPE_UNICODE);

  buffer->guess_segment_properties ();

  hb_shape_plan_t *shape_plan = hb_shape_plan_create_cached (font->face, &buffer->props, features, num_features, shaper_list);
  hb_bool_t res = hb_shape_plan_execute (shape_plan, font, buffer, features, num_features);
  hb_shape_plan_destroy (shape_plan);

  if (res)
    buffer->content_type = HB_BUFFER_CONTENT_TYPE_GLYPHS;
  return res;
}
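From the code above, shaping a string goes through these steps:
1. Return immediately if the buffer is empty.
2. Call buffer->guess_segment_properties () to settle the segment properties of the buffer.
3. Create a shape plan (or fetch a cached one) with hb_shape_plan_create_cached().
4. Execute the shape plan on the font and buffer with hb_shape_plan_execute().
5. Release the shape plan with hb_shape_plan_destroy().
6. On success, mark the buffer contents as glyphs (HB_BUFFER_CONTENT_TYPE_GLYPHS).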
But what exactly is a shape plan? Let's first look at the definition of hb_shape_plan_t:
struct hb_shape_plan_t
{
  hb_object_header_t header;
  ASSERT_POD ();

  hb_bool_t default_shaper_list;
  hb_face_t *face;
  hb_segment_properties_t props;

  hb_shape_func_t *shaper_func;
  const char *shaper_name;

  struct hb_shaper_data_t shaper_data;
};
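In harfbuzz-ng, hb_face_t is an abstraction of a font file; harfbuzz-ng uses this object to get at the contents of the font file, for example to fetch OpenType tables. hb_segment_properties_t holds the properties of the string, including script, direction and language. The first few fields of the structure relate to the font file (face) and the string properties (props), while the remaining fields relate to the selected shaper. So it is not hard to guess that creating a shape plan roughly means filling in the first few fields (default_shaper_list, face and props) from the arguments passed in by the client, and then using those arguments to create or determine the remaining fields (shaper_func, shaper_name and shaper_data). Let's see how harfbuzz-ng actually does all this.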
As mentioned above, hb_shape_plan_create_cached() creates the shape plan, so let's start with the definition of that function:
hb_shape_plan_t *
hb_shape_plan_create_cached (hb_face_t                     *face,
                             const hb_segment_properties_t *props,
                             const hb_feature_t            *user_features,
                             unsigned int                   num_user_features,
                             const char * const            *shaper_list)
{
  if (num_user_features)
    return hb_shape_plan_create (face, props, user_features, num_user_features, shaper_list);

  hb_shape_plan_proposal_t proposal = {
    *props,
    shaper_list,
    NULL
  };

  if (shaper_list) {
    /* Choose shaper. Adapted from hb_shape_plan_plan(). */
#define HB_SHAPER_PLAN(shaper) \
    HB_STMT_START { \
      if (hb_##shaper##_shaper_face_data_ensure (face)) \
        proposal.shaper_func = _hb_##shaper##_shape; \
    } HB_STMT_END

    for (const char * const *shaper_item = shaper_list; *shaper_item; shaper_item++)
      if (0)
        ;
#define HB_SHAPER_IMPLEMENT(shaper) \
      else if (0 == strcmp (*shaper_item, #shaper)) \
        HB_SHAPER_PLAN (shaper);
#include "hb-shaper-list.hh"
#undef HB_SHAPER_IMPLEMENT

#undef HB_SHAPER_PLAN

    if (unlikely (!proposal.shaper_list))
      return hb_shape_plan_get_empty ();
  }

retry:
  hb_face_t::plan_node_t *cached_plan_nodes = (hb_face_t::plan_node_t *) hb_atomic_ptr_get (&face->shape_plans);
  for (hb_face_t::plan_node_t *node = cached_plan_nodes; node; node = node->next)
    if (hb_shape_plan_matches (node->shape_plan, &proposal))
      return hb_shape_plan_reference (node->shape_plan);

  /* Not found. */

  hb_shape_plan_t *shape_plan = hb_shape_plan_create (face, props, user_features, num_user_features, shaper_list);

  hb_face_t::plan_node_t *node = (hb_face_t::plan_node_t *) calloc (1, sizeof (hb_face_t::plan_node_t));
  if (unlikely (!node))
    return shape_plan;

  node->shape_plan = shape_plan;
  node->next = cached_plan_nodes;

  if (!hb_atomic_ptr_cmpexch (&face->shape_plans, cached_plan_nodes, node)) {
    hb_shape_plan_destroy (shape_plan);
    free (node);
    goto retry;
  }

  /* Release our reference on face. */
  hb_face_destroy (face);

  return hb_shape_plan_reference (shape_plan);
}
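As expected, this function creates the shape plan mainly from face and props. It gives special treatment to two cases: the first is when num_user_features is greater than 0, i.e. the client has specified some features; the second is when shaper_list is non-NULL, i.e. the client has supplied a list of shapers from which harfbuzz-ng should pick a suitable one.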
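The first case is straightforward: hb_shape_plan_create() is called directly to create the shape plan. We will not say more about it here and will come back to that function later. So let's look at the special handling for the second case: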
  if (shaper_list) {
    /* Choose shaper. Adapted from hb_shape_plan_plan(). */
#define HB_SHAPER_PLAN(shaper) \
    HB_STMT_START { \
      if (hb_##shaper##_shaper_face_data_ensure (face)) \
        proposal.shaper_func = _hb_##shaper##_shape; \
    } HB_STMT_END

    for (const char * const *shaper_item = shaper_list; *shaper_item; shaper_item++)
      if (0)
        ;
#define HB_SHAPER_IMPLEMENT(shaper) \
      else if (0 == strcmp (*shaper_item, #shaper)) \
        HB_SHAPER_PLAN (shaper);
#include "hb-shaper-list.hh"
#undef HB_SHAPER_IMPLEMENT

#undef HB_SHAPER_PLAN

    if (unlikely (!proposal.shaper_list))
      return hb_shape_plan_get_empty ();
  }
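This code looks rather odd at first. It defines two macros, HB_SHAPER_PLAN and HB_SHAPER_IMPLEMENT, adds a loop whose body appears to do nothing, and then includes a file. Could the included file be where the trick is hidden? Indeed, all the secrets are in that file. Here are the contents of hb-shaper-list.hh: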
#ifndef HB_SHAPER_LIST_HH
#define HB_SHAPER_LIST_HH
#endif /* HB_SHAPER_LIST_HH */ /* Dummy header guards */

/* v--- Add new shapers in the right place here. */
#ifdef HAVE_GRAPHITE2
/* Only picks up fonts that have a "Silf" table. */
HB_SHAPER_IMPLEMENT (graphite2)
#endif
#ifdef HAVE_OT
HB_SHAPER_IMPLEMENT (ot) /* <--- This is our main OpenType shaper. */
#endif
#ifdef HAVE_HB_OLD
HB_SHAPER_IMPLEMENT (old)
#endif
#ifdef HAVE_ICU_LE
HB_SHAPER_IMPLEMENT (icu_le)
#endif
#ifdef HAVE_UNISCRIBE
HB_SHAPER_IMPLEMENT (uniscribe)
#endif
#ifdef HAVE_CORETEXT
HB_SHAPER_IMPLEMENT (coretext)
#endif

HB_SHAPER_IMPLEMENT (fallback) /* <--- This should be last. */
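Now the purpose of those two macros is much clearer. Using the contents of this file, let's expand both macros and see what that piece of code really is (assuming here that only graphite2, ot and fallback are enabled):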
  if (shaper_list) {
    for (const char * const *shaper_item = shaper_list; *shaper_item; shaper_item++)
      if (0)
        ;
      else if (0 == strcmp (*shaper_item, "graphite2"))
        do {
          if (hb_graphite2_shaper_face_data_ensure (face))
            proposal.shaper_func = _hb_graphite2_shape;
        } while (0);
      else if (0 == strcmp (*shaper_item, "ot"))
        do {
          if (hb_ot_shaper_face_data_ensure (face))
            proposal.shaper_func = _hb_ot_shape;
        } while (0);
      else if (0 == strcmp (*shaper_item, "fallback"))
        do {
          if (hb_fallback_shaper_face_data_ensure (face))
            proposal.shaper_func = _hb_fallback_shape;
        } while (0);

    if (unlikely (!proposal.shaper_list))
      return hb_shape_plan_get_empty ();
  }
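Of course, this is not necessarily the exact result of expanding the macros in that code. From hb-shaper-list.hh we know that the number and contents of the else-if blocks under the for loop depend on which shapers have been enabled through the HAVE_* macros. Two shapers can be counted on, though: fallback, which is listed unconditionally, and ot, harfbuzz-ng's own shaper, which is guarded by HAVE_OT but is enabled in a normal build.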
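The define-a-macro-then-include-a-list idiom used here is often called an "X-macro". Below is a minimal, self-contained sketch of the same idea, not harfbuzz code: the list lives in a macro (DEMO_SHAPER_LIST) instead of a separate header, and every demo_* name is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for hb-shaper-list.hh: one X (...) entry per shaper,
 * highest priority first. */
#define DEMO_SHAPER_LIST(X) \
  X (graphite2)             \
  X (ot)                    \
  X (fallback)

typedef const char *(*demo_shape_func_t) (void);

/* First use of the list: generate one dummy "shape" function per entry.
 * Each one simply returns its own name. */
#define X(shaper) \
  static const char *demo_##shaper##_shape (void) { return #shaper; }
DEMO_SHAPER_LIST (X)
#undef X

/* Second use of the list: generate an else-if chain that maps a name to the
 * matching function. Returns NULL for a name that is not in the list. */
static demo_shape_func_t
demo_find_shaper (const char *name)
{
  assert (name != NULL);
  if (0)
    ;
#define X(shaper) \
  else if (0 == strcmp (name, #shaper)) \
    return demo_##shaper##_shape;
  DEMO_SHAPER_LIST (X)
#undef X
  return NULL;
}
```

The leading `if (0) ;` plays the same role as in harfbuzz-ng: it lets every generated branch start with `else if`, so the list entries do not need to know whether they come first.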
Judging from this code, the special handling for the second case simply fills in the shaper_func information in proposal, so that there is more to go on later when matching shape plans.
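The code checks the shapers in the supplied shaper_list one by one to find a suitable one. It calls the shaper's hb_##shaper##_shaper_face_data_ensure() function, for example hb_ot_shaper_face_data_ensure(), to check whether that shaper can handle the face (font file) passed in; if it can, the corresponding shape function is assigned to proposal.shaper_func.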
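So why does selecting a shaper involve checking the font at all? Because some shapers really do place requirements on the font. According to the comment in hb-shaper-list.hh, the graphite2 shaper only picks up fonts that have a "Silf" table, and the ot shaper has to be able to build its OpenType layout data for the face, as we will see below.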
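After filling in the shaper_func information in proposal where needed (the shaper_list branch is usually not executed, since shaper_list is NULL most of the time), hb_shape_plan_create_cached() fetches the cached hb_face_t::plan_node_t linked list from the face object (face->shape_plans) and checks whether it contains the shape plan described by proposal. If it does, that shape plan is returned to the caller and the creation process is finished.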
If hb_shape_plan_create_cached() cannot find the required shape plan in the face's cache, it calls hb_shape_plan_create() to create one, stores it in the face's shape_plans list, and returns the newly created shape plan.
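The look-up-then-publish logic around the retry label can be sketched on its own. The following is an illustration under stated assumptions rather than harfbuzz code: it uses C11 <stdatomic.h> in place of hb_atomic_ptr_get() and hb_atomic_ptr_cmpexch(), an int key in place of the proposal, and hypothetical demo_* names; reference counting is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical cache node; `key` stands in for the shape plan proposal. */
typedef struct demo_node_t {
  int key;
  struct demo_node_t *next;
} demo_node_t;

typedef struct {
  _Atomic (demo_node_t *) head; /* plays the role of face->shape_plans */
} demo_cache_t;

/* Return the node for `key`, creating and publishing it if it is missing.
 * Returns NULL only when allocation fails. */
static demo_node_t *
demo_cache_get (demo_cache_t *cache, int key)
{
  assert (cache != NULL);
  for (;;)
  {
    demo_node_t *head = atomic_load (&cache->head);

    /* 1. Look for an existing entry in the list we just observed. */
    for (demo_node_t *node = head; node; node = node->next)
      if (node->key == key)
        return node;

    /* 2. Not found: build a new node that points at the head we saw. */
    demo_node_t *fresh = calloc (1, sizeof (*fresh));
    if (!fresh)
      return NULL;
    fresh->key = key;
    fresh->next = head;

    /* 3. Publish it only if nobody changed the head in the meantime. */
    if (atomic_compare_exchange_strong (&cache->head, &head, fresh))
      return fresh;

    /* Lost the race: discard our node and start over, as `goto retry` does. */
    free (fresh);
  }
}
```

Nodes are only ever prepended, so a reader that loaded an older head still walks a valid list. Note that the harfbuzz-ng function shown above behaves differently when the node allocation fails: it returns the freshly created, uncached shape plan instead of NULL.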
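How is this font check actually carried out? Taking the ot shaper as an example, let's see what the check does. Here is where hb_ot_shaper_face_data_ensure() is defined; it is generated by a macro in the same file (hb-shape-plan.cc):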
#define HB_SHAPER_IMPLEMENT(shaper) \
	HB_SHAPER_DATA_ENSURE_DECLARE(shaper, face) \
	HB_SHAPER_DATA_ENSURE_DECLARE(shaper, font)
#include "hb-shaper-list.hh"
#undef HB_SHAPER_IMPLEMENT
Following the trail, here is the definition of the HB_SHAPER_DATA_ENSURE_DECLARE macro (in hb-shaper-private.hh):
#define HB_SHAPER_DATA_ENSURE_DECLARE(shaper, object) \
static inline bool \
hb_##shaper##_shaper_##object##_data_ensure (hb_##object##_t *object) \
{\
  retry: \
  HB_SHAPER_DATA_TYPE (shaper, object) *data = (HB_SHAPER_DATA_TYPE (shaper, object) *) hb_atomic_ptr_get (&HB_SHAPER_DATA (shaper, object)); \
  if (unlikely (!data)) { \
    data = HB_SHAPER_DATA_CREATE_FUNC (shaper, object) (object); \
    if (unlikely (!data)) \
      data = (HB_SHAPER_DATA_TYPE (shaper, object) *) HB_SHAPER_DATA_INVALID; \
    if (!hb_atomic_ptr_cmpexch (&HB_SHAPER_DATA (shaper, object), NULL, data)) { \
      HB_SHAPER_DATA_DESTROY_FUNC (shaper, object) (data); \
      goto retry; \
    } \
  } \
  return data != NULL && !HB_SHAPER_DATA_IS_INVALID (data); \
}

That is quite a lump. It refers to yet another pile of macros, enough to make your head spin. No need to panic: we can look at how each of those macros is defined and then unpack the function line by line, and everything becomes clear. First, the definitions of HB_SHAPER_DATA, HB_SHAPER_DATA_INSTANCE and HB_SHAPER_DATA_TYPE:
#define HB_SHAPER_DATA_TYPE(shaper, object)		struct hb_##shaper##_shaper_##object##_data_t
#define HB_SHAPER_DATA_INSTANCE(shaper, object, instance)	(* (HB_SHAPER_DATA_TYPE(shaper, object) **) &(instance)->shaper_data.shaper)
#define HB_SHAPER_DATA(shaper, object)			HB_SHAPER_DATA_INSTANCE (shaper, object, object)
We know that when HB_SHAPER_DATA_ENSURE_DECLARE(shaper, object) is used to implement hb_ot_shaper_face_data_ensure(), shaper is "ot" and object is "face". Let's first expand this line:
HB_SHAPER_DATA_TYPE (shaper, object) *data = (HB_SHAPER_DATA_TYPE (shaper, object) *) hb_atomic_ptr_get (&HB_SHAPER_DATA (shaper, object));
This line turns out to be:
struct hb_ot_shaper_face_data_t *data = (struct hb_ot_shaper_face_data_t *) hb_atomic_ptr_get (&(*(struct hb_ot_shaper_face_data_t**)&face->shaper_data.ot));
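What does that mean? Put simply, it reads one data member out of the face object, namely face->shaper_data.ot. And what is that member? Let's follow the relevant part of the hb_face_t definition and see what the shaper_data member is: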
struct hb_face_t {
  hb_object_header_t header;
  ASSERT_POD ();

  hb_bool_t immutable;

  hb_reference_table_func_t reference_table_func;
  void *user_data;
  hb_destroy_func_t destroy;

  unsigned int index;
  mutable unsigned int upem;
  mutable unsigned int num_glyphs;

  struct hb_shaper_data_t shaper_data;
  /* ... (excerpt; the rest of the definition is not shown) ... */
shaper_data is of type hb_shaper_data_t, so here is the definition of hb_shaper_data_t:
struct hb_shaper_data_t {
#define HB_SHAPER_IMPLEMENT(shaper) void *shaper;
#include "hb-shaper-list.hh"
#undef HB_SHAPER_IMPLEMENT
};
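Combined with the contents of hb-shaper-list.hh that we saw earlier, this shows that hb_shaper_data_t contains nothing but a handful of void * pointers, one per enabled shaper. So the data fetch above simply reads a void * pointer and casts it to the shaper-specific type.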
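To make the "one void * slot per shaper" layout concrete, here is a small sketch of a struct generated from a list macro plus a typed accessor. All demo_* names and payload types are hypothetical. One deliberate difference from harfbuzz-ng: HB_SHAPER_DATA_INSTANCE yields an lvalue by casting the address of the slot, whereas this sketch only casts the value that was read, which is enough for reading and keeps the example simple.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical shaper list, standing in for hb-shaper-list.hh. */
#define DEMO_SHAPER_LIST(X) X (ot) X (fallback)

/* One untyped slot per shaper, generated from the list.
 * Expands to: void *ot; void *fallback; */
struct demo_shaper_data_t {
#define X(shaper) void *shaper;
  DEMO_SHAPER_LIST (X)
#undef X
};

/* Hypothetical per-shaper payload types. */
struct demo_ot_face_data_t       { int lookup_count; };
struct demo_fallback_face_data_t { int unused; };

struct demo_face_t {
  struct demo_shaper_data_t shaper_data;
};

/* Typed read of a slot: picks the field named after the shaper and casts the
 * stored void * to that shaper's payload type. */
#define DEMO_SHAPER_DATA(shaper, face) \
  ((struct demo_##shaper##_face_data_t *) (face)->shaper_data.shaper)

/* Example reader: returns -1 when the ot slot has not been filled in. */
static int
demo_ot_lookup_count (const struct demo_face_t *face)
{
  assert (face != NULL);
  struct demo_ot_face_data_t *data = DEMO_SHAPER_DATA (ot, face);
  return data ? data->lookup_count : -1;
}
```

Because the slots are plain void *, the face structure does not need to know any shaper's private data type; only the shaper that owns a slot knows what it points at.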
Back to the definition of HB_SHAPER_DATA_ENSURE_DECLARE. Next comes the HB_SHAPER_DATA_CREATE_FUNC macro:
#define HB_SHAPER_DATA_CREATE_FUNC(shaper, object) _hb_##shaper##_shaper_##object##_data_create
For the ot shaper and the face object, this macro expands to:
_hb_ot_shaper_face_data_create
This is in fact a function defined in hb-ot-shape.cc:
hb_ot_shaper_face_data_t *
_hb_ot_shaper_face_data_create (hb_face_t *face)
{
  return _hb_ot_layout_create (face);
}
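At this point we can summarize what hb_ot_shaper_face_data_ensure() does: it fetches the shaper data belonging to the ot shaper from the face object, i.e. a struct hb_ot_shaper_face_data_t object, and checks whether it is NULL; if it is, it creates a struct hb_ot_shaper_face_data_t object and stores it in the ot member of face->shaper_data, face->shaper_data.ot. Looking one step further, the actual type behind struct hb_ot_shaper_face_data_t is struct hb_ot_layout_t: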
#define hb_ot_shaper_face_data_t hb_ot_layout_t
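On what basis does hb_ot_shaper_face_data_ensure() decide that the font represented by face is one the ot shaper can handle? It simply checks whether that data object could be created and is valid.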
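So hb_ot_shaper_face_data_ensure() may do more than just check: it may also create a new shaper-specific structure, handed back through the face, for the shaper's shape function to use later when it runs.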
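The create-once-and-remember-failure logic of the ensure function can be sketched as follows. This is an illustration with assumptions, not the harfbuzz implementation: it uses C11 atomics, all demo_* names are hypothetical, and the "invalid" marker is the address of a static object rather than whatever value HB_SHAPER_DATA_INVALID has in harfbuzz-ng.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct { int value; } demo_data_t;

/* Sentinel meaning "creation was tried and failed". It is only ever compared
 * against, never dereferenced. */
static demo_data_t demo_invalid_storage;
#define DEMO_DATA_INVALID (&demo_invalid_storage)

typedef struct {
  _Atomic (demo_data_t *) data; /* NULL means "not tried yet" */
  bool creatable;               /* hypothetical: can this font be handled? */
  int create_calls;             /* counter for illustration, not thread-safe */
} demo_face_t;

/* Stand-in for _hb_<shaper>_shaper_face_data_create(). */
static demo_data_t *
demo_data_create (demo_face_t *face)
{
  face->create_calls++;
  if (!face->creatable)
    return NULL;
  demo_data_t *data = calloc (1, sizeof (*data));
  if (data)
    data->value = 42;
  return data;
}

static void
demo_data_destroy (demo_data_t *data)
{
  if (data != DEMO_DATA_INVALID)
    free (data);
}

/* Returns true if usable data exists for this face, creating it on first use. */
static bool
demo_face_data_ensure (demo_face_t *face)
{
  assert (face != NULL);
  for (;;)
  {
    demo_data_t *data = atomic_load (&face->data);
    if (!data)
    {
      data = demo_data_create (face);
      if (!data)
        data = DEMO_DATA_INVALID; /* remember the failure */

      demo_data_t *expected = NULL;
      if (!atomic_compare_exchange_strong (&face->data, &expected, data))
      {
        demo_data_destroy (data); /* another thread won; use its result */
        continue;
      }
    }
    return data != DEMO_DATA_INVALID;
  }
}
```

Storing a sentinel on failure means the expensive creation step is attempted at most once per face, even for faces the shaper cannot handle.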
Now for the other important function used by hb_shape_plan_create_cached(): hb_shape_plan_create(), which actually creates the shape plan. Here is its definition:
hb_shape_plan_t *
hb_shape_plan_create (hb_face_t                     *face,
                      const hb_segment_properties_t *props,
                      const hb_feature_t            *user_features,
                      unsigned int                   num_user_features,
                      const char * const            *shaper_list)
{
  assert (props->direction != HB_DIRECTION_INVALID);

  hb_shape_plan_t *shape_plan;

  if (unlikely (!face))
    face = hb_face_get_empty ();
  if (unlikely (!props || hb_object_is_inert (face)))
    return hb_shape_plan_get_empty ();
  if (!(shape_plan = hb_object_create<hb_shape_plan_t> ()))
    return hb_shape_plan_get_empty ();

  hb_face_make_immutable (face);
  shape_plan->default_shaper_list = shaper_list == NULL;
  shape_plan->face = hb_face_reference (face);
  shape_plan->props = *props;

  hb_shape_plan_plan (shape_plan, user_features, num_user_features, shaper_list);

  return shape_plan;
}
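This function mainly allocates memory for the hb_shape_plan_t object and initializes a few fields, and then calls another function, hb_shape_plan_plan(), to set up the hb_shape_plan_t object in more detail. Here is the definition of hb_shape_plan_plan():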
static void
hb_shape_plan_plan (hb_shape_plan_t    *shape_plan,
                    const hb_feature_t *user_features,
                    unsigned int        num_user_features,
                    const char * const *shaper_list)
{
  const hb_shaper_pair_t *shapers = _hb_shapers_get ();

#define HB_SHAPER_PLAN(shaper) \
	HB_STMT_START { \
	  if (hb_##shaper##_shaper_face_data_ensure (shape_plan->face)) { \
	    HB_SHAPER_DATA (shaper, shape_plan) = \
	      HB_SHAPER_DATA_CREATE_FUNC (shaper, shape_plan) (shape_plan, user_features, num_user_features); \
	    shape_plan->shaper_func = _hb_##shaper##_shape; \
	    shape_plan->shaper_name = #shaper; \
	    return; \
	  } \
	} HB_STMT_END

  if (likely (!shaper_list)) {
    for (unsigned int i = 0; i < HB_SHAPERS_COUNT; i++)
      if (0)
	;
#define HB_SHAPER_IMPLEMENT(shaper) \
      else if (shapers[i].func == _hb_##shaper##_shape) \
	HB_SHAPER_PLAN (shaper);
#include "hb-shaper-list.hh"
#undef HB_SHAPER_IMPLEMENT
  } else {
    for (; *shaper_list; shaper_list++)
      if (0)
	;
#define HB_SHAPER_IMPLEMENT(shaper) \
      else if (0 == strcmp (*shaper_list, #shaper)) \
	HB_SHAPER_PLAN (shaper);
#include "hb-shaper-list.hh"
#undef HB_SHAPER_IMPLEMENT
  }

#undef HB_SHAPER_PLAN
}
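Once again a pile of macros going round in circles. Never mind: the overall structure of this function is still fairly clear. We have already seen the trick of defining a macro and then generating code by including "hb-shaper-list.hh", so the logic here should not be hard to follow. The function first calls _hb_shapers_get () to obtain a list of hb_shaper_pair_t, and then handles two cases, shaper_list being NULL or non-NULL. First, the definition of hb_shaper_pair_t: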
struct hb_shaper_pair_t {
  char name[16];
  hb_shape_func_t *func;
};
This structure has only two members: the name of the shaper and the shape function, a function pointer. Nothing more needs to be said about it.
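The case where shaper_list is NULL, which is by far the most common way this function runs, is handled in the if block. From the for loop inside it we can see that the function picks one shaper from the list returned by _hb_shapers_get(). From the else-if chain we can tell that every shaper the loop visits matches exactly one else-if, so the only purpose of the chain is to tell HB_SHAPER_PLAN() which shaper it is dealing with. The actual selection criterion is in the HB_SHAPER_PLAN() macro: expanding it shows that it again calls hb_##shaper##_shaper_face_data_ensure() to check the font file. That function is an old friend by now; as described earlier, it mainly creates a face data object, and if creation succeeds harfbuzz-ng considers the shaper usable.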
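And what does selecting a shaper actually entail? Look at the next few lines of the HB_SHAPER_PLAN() macro. First, a function named _hb_##shaper##_shaper_##object##_data_create(), with object being shape_plan, creates the shape plan data, which is assigned to shape_plan->shaper_data.shaper. For the ot shaper, for example, _hb_ot_shaper_shape_plan_data_create() is called to create a struct hb_ot_shaper_shape_plan_data_t object, which is assigned to shape_plan->shaper_data.ot and thereby handed back to the caller. Next, the shaper_func of shape_plan is set to the shape function of the chosen shaper. Finally, the shaper_name of shape_plan is set to the name of the chosen shaper.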
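What happens when shaper_list is non-NULL? Judging from the code in the corresponding block, there are two differences from the NULL case: first, the shaper is chosen from the shaper_list passed in by the caller; second, a shaper is identified by its name rather than by its shape function as in the previous case. Everything else is exactly the same.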
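Let's summarize how harfbuzz-ng selects a shaper. First, harfbuzz-ng decides whether a shaper is usable by calling that shaper's hb_##shaper##_shaper_face_data_ensure() function. This function is effectively a check on the font file: it creates a face data object, and if creation succeeds the shaper is considered usable, otherwise it is not. The function also stores the created face data in face->shaper_data.shaper so that it is available to the caller. Once a shaper has been found usable, harfbuzz-ng calls the shaper's _hb_##shaper##_shaper_shape_plan_data_create() function to create the shape plan data and hands it back through shape_plan->shaper_data.shaper. It then sets the appropriate shaper_func and shaper_name on shape_plan, where shaper_func is the function named _hb_##shaper##_shape. Finally, harfbuzz-ng selects shapers in priority order: the earlier a shaper is listed in hb-shaper-list.hh, the higher its priority.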
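The whole selection rule can be condensed into a short sketch with the macros replaced by an ordinary table. Everything here is hypothetical and for illustration only: the demo_* names, the face kinds, and in particular the acceptance rules of each demo shaper (for instance, that the demo ot shaper accepts every face) are assumptions made for the example, not statements about harfbuzz-ng.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical face kinds. */
enum { DEMO_FACE_PLAIN = 0, DEMO_FACE_HAS_SILF = 1 };

typedef struct {
  const char *name;
  bool (*face_ok) (int face_kind); /* stands in for ..._face_data_ensure() */
} demo_shaper_t;

/* Assumed acceptance rules, chosen only to make the example interesting. */
static bool demo_graphite2_ok (int k) { return k == DEMO_FACE_HAS_SILF; }
static bool demo_ot_ok (int k)        { (void) k; return true; }
static bool demo_fallback_ok (int k)  { (void) k; return true; }

/* Built-in table, highest priority first (the order of hb-shaper-list.hh). */
static const demo_shaper_t demo_shapers[] = {
  { "graphite2", demo_graphite2_ok },
  { "ot",        demo_ot_ok },
  { "fallback",  demo_fallback_ok },
};
#define DEMO_SHAPERS_COUNT (sizeof (demo_shapers) / sizeof (demo_shapers[0]))

/* Return the name of the first usable shaper, or NULL if none is usable.
 * With shaper_list == NULL the built-in order decides; otherwise the
 * caller's NULL-terminated list of names decides. */
static const char *
demo_choose_shaper (int face_kind, const char *const *shaper_list)
{
  assert (face_kind == DEMO_FACE_PLAIN || face_kind == DEMO_FACE_HAS_SILF);

  if (!shaper_list)
  {
    /* Default case: walk the built-in table in priority order. */
    for (size_t i = 0; i < DEMO_SHAPERS_COUNT; i++)
      if (demo_shapers[i].face_ok (face_kind))
        return demo_shapers[i].name;
    return NULL;
  }

  /* Caller-supplied list: identify each shaper by name, in list order. */
  for (; *shaper_list; shaper_list++)
    for (size_t i = 0; i < DEMO_SHAPERS_COUNT; i++)
      if (0 == strcmp (*shaper_list, demo_shapers[i].name) &&
          demo_shapers[i].face_ok (face_kind))
        return demo_shapers[i].name;
  return NULL;
}
```

In both branches the first shaper whose face check passes wins, which is why the order of the list, built-in or caller-supplied, acts as the priority.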