We will design a parser that handles the following string:
str1 = "ab2(c3(d)2(ef)"
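Whenever a number appears, the contents of the brackets that follow it are printed that many times:
For example, 3(d) prints ddd, and 2(ef) prints efef.
When brackets are nested inside brackets, the inner group is expanded first, and the result is then repeated by the outer number:
For example, given 2(c3(d)), we first expand d three times to get 2(cddd), and then repeat the contents twice: cdddcddd
Digits, letters, and brackets are different kinds of elements, so we define an enumeration of element categories and build a tokenizer to split the input first. The approach is the same as in 簡易Parser入門【二】: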
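The "innermost first" rule can be checked independently of the parser built in this post. The sketch below is not the method used in the rest of the article: it is a regex-based shortcut, and the names INNERMOST and expand are made up for illustration. It assumes lowercase letters and balanced brackets.

```python
import re

# An innermost group: a count followed by brackets that contain only letters.
INNERMOST = re.compile(r'(\d+)\(([a-z]*)\)')


def expand(text):
    """Expand innermost groups repeatedly until nothing is left to expand."""
    while True:
        expanded = INNERMOST.sub(lambda m: m.group(2) * int(m.group(1)), text)
        if expanded == text:
            return text
        text = expanded


print(expand('3(d)'))      # ddd
print(expand('2(c3(d))'))  # 2(cddd) after the first pass, then cdddcddd
```

Each pass rewrites only the groups that contain no further brackets, which mirrors the order described above.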
from enum import Enum


class Mark1(Enum):  # the token categories
    En = 0
    LeftBracket = 1
    RightBracket = 2
    Numb = 3
We build a search dictionary so that each incoming character can be classified quickly:
class Parser2:
    def __init__(self):
        words = 'abcdefghijklmnopqrstuvwxyz'
        # ------ 12345678901234567890123456
        nums = '0123456789'
        self.search_dict = {}  # build the search dictionary
        for c in words:
            self.search_dict[c] = Mark1.En
        for n in nums:
            self.search_dict[n] = Mark1.Numb
        self.search_dict['('] = Mark1.LeftBracket
        self.search_dict[')'] = Mark1.RightBracket
        self.pos = 0
        self.length = None
Following 簡易Parser入門【二】, we build a tokenizer in the same way:
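    # -------------------- This function is part of class Parser2 ----------------------
    def str_to_ast(self, str_in):
        # type: (str) -> list
        word_list = []
        last_state = self.search_dict[str_in[0]]
        for i, c in enumerate(str_in):
            # Characters missing from the dictionary are treated as letters.
            curr_state = self.search_dict.get(c, Mark1.En)
            if curr_state != last_state:
                # The category changed: emit the run that just ended.
                word_list.append((str_in[self.pos:i], last_state))
                self.pos = i
                last_state = curr_state
        # Note: the final run is never appended, so the trailing ')' of str1
        # does not show up in the output, and adjacent brackets would be merged.
        print(word_list)
        print(list(map(lambda x: x[0], word_list)))
        # self.pos now holds a character offset; reset it to 0 before iter_find.
        return word_list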
Calling it shows the result: the string is split into tokens, each tagged with its category:
[('ab', <Mark1.En: 0>), ('2', <Mark1.Numb: 3>), ('(', <Mark1.LeftBracket: 1>), ('c', <Mark1.En: 0>), ('3', <Mark1.Numb: 3>), ('(', <Mark1.LeftBracket: 1>), ('d', <Mark1.En: 0>), (')', <Mark1.RightBracket: 2>), ('2', <Mark1.Numb: 3>), ('(', <Mark1.LeftBracket: 1>), ('ef', <Mark1.En: 0>)]
['ab', '2', '(', 'c', '3', '(', 'd', ')', '2', '(', 'ef']
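Note that the token list ends at 'ef': the tokenizer only emits a run when the category changes, so the last run is never appended, and two adjacent brackets such as )) would be merged into a single token. A minimal sketch of one possible variant that avoids both issues is shown below. It is a standalone function rather than the post's method, and the names Kind, classify, and tokenize are hypothetical.

```python
from enum import Enum


class Kind(Enum):  # hypothetical stand-in for the post's Mark1
    LETTERS = 0
    LEFT = 1
    RIGHT = 2
    NUMBER = 3


def classify(ch):
    """Map one character to its category; anything unknown counts as a letter."""
    if ch in '0123456789':
        return Kind.NUMBER
    if ch == '(':
        return Kind.LEFT
    if ch == ')':
        return Kind.RIGHT
    return Kind.LETTERS


def tokenize(text):
    """Split text into (token, kind) pairs, including the final run."""
    tokens = []
    start = 0
    for i in range(1, len(text) + 1):
        kind = classify(text[start])
        at_end = i == len(text)
        # Brackets are always single-character tokens; letters and digits form runs.
        if at_end or kind in (Kind.LEFT, Kind.RIGHT) or classify(text[i]) != kind:
            tokens.append((text[start:i], kind))
            start = i
    return tokens


print([tok for tok, _ in tokenize('ab2(c3(d)2(ef))')])
```

With balanced input, every closing bracket now arrives as its own token, so a recursive parser sees one RightBracket per open group.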
Next, we build the syntax tree recursively. The tree here is a little different from before: we use a dictionary to represent a case like 2(c3(d)):
{2: ['c', {3: 'd'}]}
Our final recursive parser looks like this:
    # ---------------------- This method belongs to class Parser2 ------------------
    @staticmethod
    def iter_find(word_list, p2_in):
        # type: (list, Parser2) -> Union[list, dict, str]
        save_list = []
        while p2_in.pos < len(word_list):
            curr_word = word_list[p2_in.pos]
            if curr_word[1] == Mark1.Numb:
                # Skip the number and the '(' after it, then parse the group body.
                p2_in.pos += 2
                tmp_dict = {int(curr_word[0]): Parser2.iter_find(word_list, p2_in)}
                save_list.append(tmp_dict)
            elif curr_word[1] == Mark1.RightBracket:
                # End of the current group: consume ')' and return to the caller.
                p2_in.pos += 1
                break
            else:
                save_list.append(curr_word[0])
                p2_in.pos += 1
        # A single element is returned unwrapped, e.g. {3: 'd'} rather than {3: ['d']}.
        if len(save_list) == 1:
            return save_list[0]
        else:
            return save_list
This gives us the following tree structure:
['ab', {2: ['c', {3: 'd'}, {2: 'ef'}]}]
So how do we print it? Again with recursion:
    @staticmethod
    def iter_print(ast_in):
        if isinstance(ast_in, list):
            tmp_str = ''
            for e in ast_in:
                tmp_str += Parser2.iter_print(e)
            return tmp_str
        elif isinstance(ast_in, dict):
            nums = list(ast_in.keys())[0]
            return nums * Parser2.iter_print(ast_in[nums])
        elif isinstance(ast_in, str):
            return ast_in
        else:
            return ''
The final printed result is:
abcdddefefcdddefef
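As a cross-check on this result, the same expansion can also be written without recursion by keeping an explicit stack. This is an alternative technique, not the tokenizer-plus-tree approach used above; the function name decode is hypothetical, and the sketch assumes balanced brackets (so the input is written with its closing bracket).

```python
def decode(text):
    """Expand n(...) groups in one pass using an explicit stack."""
    stack = []    # (text built so far, repeat count) saved at each '('
    current = ''  # text built at the current nesting level
    count = 0     # repeat count currently being read
    for ch in text:
        if ch in '0123456789':
            count = count * 10 + int(ch)
        elif ch == '(':
            # Enter a group: remember the outer text and the pending count.
            stack.append((current, count))
            current, count = '', 0
        elif ch == ')':
            # Leave a group: repeat its text and attach it to the outer text.
            prefix, times = stack.pop()
            current = prefix + current * times
        else:
            current += ch
    return current


print(decode('ab2(c3(d)2(ef))'))  # abcdddefefcdddefef
```

The stack plays the role that the call stack plays in the recursive version: each open bracket saves the outer context, and each closing bracket restores it.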