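The previous section covered JSON; this section introduces YAML. YAML can be regarded as a superset of JSON, but it is simpler to use and better suited to human reading and writing.
1. What is YAML?
YAML is a recursive acronym for "YAML Ain't Markup Language". Clark Evans first proposed YAML in 2001, and Ingy döt Net and Oren Ben-Kiki took part in its design. YAML originally stood for "Yet Another Markup Language", but the name was later reinterpreted as the recursive acronym "YAML Ain't Markup Language" to emphasize that YAML is oriented toward data rather than document markup.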
Wikipedia's explanation of YAML:
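2. Basic syntax rules of YAML
3. The three data structures of YAML
Note: like JSON, YAML's skeleton is built from dictionaries (dict) and arrays (array). The value in a dictionary's key/value pair and each element of an array can be any of the three data structures: map, list, or scalar.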
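Because YAML's skeleton mirrors JSON's, JSON text can usually be handed to a YAML parser unchanged. The following is a minimal sketch of that idea, assuming the PyYAML package is installed; the sample data is made up for illustration.

```python
import json
import yaml  # PyYAML, assumed to be installed

# Hypothetical sample data: the same structure written as JSON and as YAML.
json_text = '{"name": "demo", "tags": ["a", "b"], "size": 3}'
yaml_text = "name: demo\ntags:\n  - a\n  - b\nsize: 3\n"

from_json = json.loads(json_text)
from_yaml_parser = yaml.safe_load(json_text)  # JSON text through the YAML parser
from_yaml = yaml.safe_load(yaml_text)

# All three results are the same dict-and-list skeleton.
print(from_json == from_yaml_parser == from_yaml)
```

The YAML spelling needs no braces, brackets, or quotes here, which is the readability gain the text refers to.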
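3.1 Map
A map, i.e. a dictionary (dict), uses the colon syntax key: value; the colon must be followed by a space. For example:
foo:
  foo1: abc
  foo2: 0xFF
  foo3: true
The same map can also be written inline in flow style:
foo: {foo1: abc, foo2: 0xFF, foo3: true}
Converted to a Python object, the example above is: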
{'foo': {'foo1': 'abc', 'foo2': 0xFF, 'foo3': True}}
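To check that the indented (block) form and the inline (flow) form describe the same map, both can be loaded and compared. This is a small sketch assuming PyYAML; the variable names are illustrative.

```python
import yaml  # PyYAML, assumed to be installed

block_style = "foo:\n  foo1: abc\n  foo2: 0xFF\n  foo3: true\n"
flow_style = "foo: {foo1: abc, foo2: 0xFF, foo3: true}\n"

block_obj = yaml.safe_load(block_style)
flow_obj = yaml.safe_load(flow_style)

# PyYAML resolves 0xFF to the integer 255 and true to the boolean True.
print(block_obj)
print(block_obj == flow_obj)
```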
3.2 List
A list, i.e. an array, uses a hyphen followed by a space to introduce each element. For example:
- C
- Python
- Go
Converted to a Python object, the example above is:
['C', 'Python', 'Go']
The array above contains three elements, each of which is a string.
Lists can also be nested:
-
  - C
  - Python
  - Go
Converted to a Python object, the example above is:
[['C', 'Python', 'Go']]
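The array above has a single element, which is itself an array of three elements.
An array element can be an array or a dictionary; likewise, the value of each dictionary key can be a scalar, a dictionary, or an array. For example:
Students:
  - name: John
    age: 23
    city: Agra
  - name: Steve
    age: 28
    city: Delhi
Converted to a Python object, the example above is: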
{'Students': [{'city': 'Agra', 'age': 23, 'name': 'John'}, {'city': 'Delhi', 'age': 28, 'name': 'Steve'}]}
Here the value of the key 'Students' is an array of two elements, each of which is a dictionary.
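Once loaded, such a nested structure is ordinary Python data and can be traversed with normal indexing and comprehensions. A short sketch, assuming PyYAML:

```python
import yaml  # PyYAML, assumed to be installed

text = """\
Students:
  - name: John
    age: 23
    city: Agra
  - name: Steve
    age: 28
    city: Delhi
"""
data = yaml.safe_load(text)

# data["Students"] is a list of dicts, so the usual idioms apply.
names = [student["name"] for student in data["Students"]]
average_age = sum(s["age"] for s in data["Students"]) / len(data["Students"])
print(names, average_age)
```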
3.3 Scalar
3.3.1 Number
Numbers include integers (int) and floating-point numbers (float). For example:
a: 10
b: 10.9
Converted to a Python object, the example above is:
{'a': 10, 'b': 10.9}
3.3.2 String
Strings do not need quotes by default. For example:
- Hello Beijing
- Hello China
- Hello World
Converted to a Python object, the example above is:
[
'Hello Beijing',
'Hello China',
'Hello World'
]
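Both double and single quotes can be used. Double quotes process escape sequences; single quotes do not. For example:
- "Hello\tChina"
- 'Hello\tWorld'
Converted to a Python object, the example above is: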
[
'Hello\tChina',
'Hello\\tWorld'
]
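Inside a single-quoted string, a literal single quote is written by doubling it. For example:
- "I'm good"
- 'I''m good'
Converted to a Python object, the example above is: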
[
"I'm good",
"I'm good"
]
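The quoting behaviour can be confirmed with a few lines of Python. The sketch below assumes PyYAML; note that the backslashes are doubled in the Python source so that the YAML text itself contains a literal backslash followed by t.

```python
import yaml  # PyYAML, assumed to be installed

text = (
    '- "Hello\\tChina"\n'   # double quotes: \t is turned into a tab character
    "- 'Hello\\tWorld'\n"   # single quotes: the backslash and the t are kept
    "- 'I''m good'\n"       # '' inside single quotes is one literal quote
)
items = yaml.safe_load(text)

for item in items:
    print(repr(item))
```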
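A plain (unquoted) string may contain spaces and punctuation. For example:
foo: There is a napping house, where everyone is sleeping.
Converted to a Python object, the example above is: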
{'foo': 'There is a napping house, where everyone is sleeping.'}
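A multi-line string can use | to preserve line breaks or > to fold them into spaces. For example:
foo1: |
  Be good,
  do right.
foo2: >
  Be good,
  do right.
Converted to a Python object, the example above is: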
{'foo1': 'Be good,\ndo right.\n', 'foo2': 'Be good, do right.\n'}
A + after the indicator keeps the trailing line breaks of the block, and a - strips them. For example:
foo1: |
  abc


foo2: |+
  abc


foo3: |-
  abc


#END#
Converted to a Python object, the example above is:
{
    'foo1': 'abc\n',
    'foo2': 'abc\n\n\n',
    'foo3': 'abc'
}
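The three trailing-newline behaviours are easiest to compare by counting the line breaks at the end of each loaded string. This is a sketch assuming PyYAML; the YAML text is assembled line by line so that the blank lines are explicit.

```python
import yaml  # PyYAML, assumed to be installed

text = (
    "foo1: |\n"    # default: exactly one trailing line break is kept
    "  abc\n"
    "\n"
    "\n"
    "foo2: |+\n"   # keep: every trailing line break is kept
    "  abc\n"
    "\n"
    "\n"
    "foo3: |-\n"   # strip: no trailing line break is kept
    "  abc\n"
    "\n"
    "\n"
    "#END#\n"
)
data = yaml.safe_load(text)

# Count how many line breaks follow "abc" in each value.
trailing = {key: len(value) - len(value.rstrip("\n")) for key, value in data.items()}
print(trailing)
```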
Strings can also contain HTML markup. For example:
foo: >-
  <p>
  Hello World, <br>
  Hello China. <br>
  </p>
Converted to a Python object, the example above is:
{'foo': '<p> Hello World, <br> Hello China. <br> </p>'}
3.3.3 Boolean
In YAML, booleans are usually written as true/false; forms such as True/False, Yes/No, and On/Off are also accepted by YAML 1.1 parsers such as PyYAML. For example:
T: [true, True, yes, Yes, on, On]
F: [false, False, no, No, off, Off]
Converted to a Python object, the example above is:
{
    'T': [True, True, True, True, True, True],
    'F': [False, False, False, False, False, False]
}
For the paired boolean words true/false, yes/no, and on/off, the first letter may be capitalized.
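One practical consequence of these extra boolean spellings is that an unquoted word such as NO or on is loaded as a boolean, not as a string. The sketch below assumes PyYAML (a YAML 1.1 parser); the keys are hypothetical.

```python
import yaml  # PyYAML, assumed to be installed

text = (
    "country: NO\n"     # unquoted: resolved as the boolean False
    "quoted: 'NO'\n"    # quoted: stays the string 'NO'
    "enabled: on\n"     # unquoted: resolved as the boolean True
    'label: "on"\n'     # quoted: stays the string 'on'
)
data = yaml.safe_load(text)
print(data)
```

Quoting is the simplest way to keep such words as strings.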
3.3.4 Null
In YAML, the null value is usually written as ~. For example:
a: ~
b: null
c: Null
d: NULL
Converted to a Python object, the example above is:
{
    'a': None,
    'c': None,
    'b': None,
    'd': None
}
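A key with no value at all is also loaded as null, whereas a quoted 'null' or an empty quoted string stays a string. A brief sketch, assuming PyYAML:

```python
import yaml  # PyYAML, assumed to be installed

text = (
    "a: ~\n"
    "b: null\n"
    "c:\n"          # no value at all: also None
    "d: 'null'\n"   # quoted: the four-character string
    "e: ''\n"       # quoted empty string, not None
)
data = yaml.safe_load(text)
print(data)
```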
3.3.5 Date and time
For example:
date: 1983-05-24
datetime: 1983-05-24T15:02:31+08:00
Converted to a Python object, the example above is (the datetime is shown normalized to UTC):
{
    'date': datetime.date(1983, 5, 24),
    'datetime': datetime.datetime(1983, 5, 24, 7, 2, 31)
}
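How the UTC offset is represented may depend on the PyYAML version: some releases return a naive datetime already converted to UTC (as in the output above), while others appear to keep the offset as tzinfo. The sketch below, which assumes PyYAML, normalizes both cases before comparing.

```python
import datetime
import yaml  # PyYAML, assumed to be installed

text = "date: 1983-05-24\ndatetime: 1983-05-24T15:02:31+08:00\n"
data = yaml.safe_load(text)

moment = data["datetime"]
if moment.tzinfo is not None:
    # Timezone-aware result: convert to naive UTC so both cases compare equal.
    moment = moment.astimezone(datetime.timezone.utc).replace(tzinfo=None)

print(data["date"], moment)
```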
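4. Commonly used special symbols
4.1 Document start marker (---) and end marker (...)
Several YAML documents can be combined in a single file. A --- marks the start of a document, and ... used together with --- marks the end of a document. This is similar to the way attachments are organized in an email. For example:
---
name: foo1.mp3
size: 20480
...

---
name: foo2.mp3
size: 10240
...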
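A file holding several documents is read with a multi-document loader, which yields one object per document. A minimal sketch, assuming PyYAML and its safe_load_all function:

```python
import yaml  # PyYAML, assumed to be installed

text = """\
---
name: foo1.mp3
size: 20480
...

---
name: foo2.mp3
size: 10240
...
"""
# safe_load_all returns a generator with one object per document.
docs = list(yaml.safe_load_all(text))

for doc in docs:
    print(doc["name"], doc["size"])
```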
4.2 Explicit type tag (!!)
Writing !!<type> <value> forces value to be interpreted as the given type, similar to a cast in C. For example:
foo:
  - !!str 123.1
  - 123.1
  - !!int '0xff'
  - '0xff'
Converted to a Python object, the example above is:
{'foo': ['123.1', 123.1, 255, '0xff']}
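Inspecting the Python type of each element makes the effect of the tags visible. A small sketch, assuming PyYAML:

```python
import yaml  # PyYAML, assumed to be installed

text = (
    "foo:\n"
    "  - !!str 123.1\n"    # forced to a string
    "  - 123.1\n"          # resolved as a float
    "  - !!int '0xff'\n"   # quoted text forced to an integer
    "  - '0xff'\n"         # quoted text stays a string
)
data = yaml.safe_load(text)
types = [type(value).__name__ for value in data["foo"]]
print(data["foo"], types)
```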
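4.3 Folding line breaks (>) and preserving line breaks (|)
The > and | indicators were already covered in the discussion of strings, and they are used very frequently in YAML strings. Here is one more example to reinforce the point:
foo1: >
  Beautiful is
  better than ugly.

foo2: |
  Explicit is
  better than implicit.
Converted to a Python object, the example above is:
{
    'foo1': 'Beautiful is better than ugly.\n',
    'foo2': 'Explicit is\nbetter than implicit.\n'
}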
Note: both > and | can be followed by + or -; see section 3.3.2 (String) for what they mean. Here is a complete example contrasting >, >+, >- with |, |+, |-:
foo10: >
  (> ) Beautiful is
  better than ugly.


#1.0 /* end of demo > */

foo11: >+
  (>+) Beautiful is
  better than ugly.


#1.1 /* end of demo >+ */

foo12: >-
  (>-) Beautiful is
  better than ugly.


#1.2 /* end of demo >- */

foo20: |
  (| ) Explicit is
  better than implicit.


#2.0 /* end of demo | */

foo21: |+
  (|+) Explicit is
  better than implicit.


#2.1 /* end of demo |+ */

foo22: |-
  (|-) Explicit is
  better than implicit.


#2.2 /* end of demo |- */
Converted to a Python object, the example above is:
{
    'foo10': '(> ) Beautiful is better than ugly.\n',
    'foo11': '(>+) Beautiful is better than ugly.\n\n\n',
    'foo12': '(>-) Beautiful is better than ugly.',
    'foo20': '(| ) Explicit is\nbetter than implicit.\n',
    'foo21': '(|+) Explicit is\nbetter than implicit.\n\n\n',
    'foo22': '(|-) Explicit is\nbetter than implicit.'
}
In this example, >- removes every line break (interior ones become spaces and trailing ones are dropped), while |+ preserves every line break.
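One nuance worth checking is that folding only joins adjacent lines: a blank line inside a folded block still produces a line break, so >- does not flatten several paragraphs into one line. A sketch assuming PyYAML:

```python
import yaml  # PyYAML, assumed to be installed

text = (
    "foo: >-\n"
    "  Beautiful is\n"
    "  better than ugly.\n"
    "\n"                        # blank line separating two paragraphs
    "  Explicit is\n"
    "  better than implicit.\n"
)
data = yaml.safe_load(text)

# Lines within a paragraph are joined by spaces; the blank line becomes "\n".
print(repr(data["foo"]))
```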
4.4 Anchor definition (&), alias reference (*), and merge (<<)
4.4.1 Anchor definition (&) and alias reference (*)
Repeated content can be defined once as an anchor with & and referenced elsewhere with *. For example:
employees:
  - {employee: Jack Li, manager: &SS Sara Song}
  - {employee: John Wu, manager: *SS}
  - {employee: Ann Liu, manager: *SS}
Converted to a Python object, the example above is:
{
    'employees': [
        {'employee': 'Jack Li', 'manager': 'Sara Song'},
        {'employee': 'John Wu', 'manager': 'Sara Song'},
        {'employee': 'Ann Liu', 'manager': 'Sara Song'}
    ]
}
The anchor can also be defined on a line of its own, for example:
manager: &SS Sara Song
employees:
  - {employee: Jack Li, manager: *SS}
  - {employee: John Wu, manager: *SS}
  - {employee: Ann Liu, manager: *SS}
Converted to a Python object, the example above is:
{
    'manager': 'Sara Song',
    'employees': [
        {'employee': 'Jack Li', 'manager': 'Sara Song'},
        {'employee': 'John Wu', 'manager': 'Sara Song'},
        {'employee': 'Ann Liu', 'manager': 'Sara Song'}
    ]
}
Anchors also work with complex data structures, for example:
manager: &SS
  name: Sara Song
  gender: Female

employees:
  - {employee: Jack Li, manager: *SS}
  - {employee: John Wu, manager: *SS}
  - {employee: Ann Liu, manager: *SS}
Here *SS looks much like dereferencing a pointer in C. Converted to a Python object, the example above is:
{
    'manager': {'name': 'Sara Song', 'gender': 'Female'},
    'employees': [
        {
            'employee': 'Jack Li',
            'manager': {'name': 'Sara Song', 'gender': 'Female'}
        },
        {
            'employee': 'John Wu',
            'manager': {'name': 'Sara Song', 'gender': 'Female'}
        },
        {
            'employee': 'Ann Liu',
            'manager': {'name': 'Sara Song', 'gender': 'Female'}
        }
    ]
}
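The pointer analogy can be taken fairly literally with PyYAML: every alias to an anchored mapping is loaded as the same Python object, not as a copy. The sketch below assumes PyYAML and shortens the example to two employees.

```python
import yaml  # PyYAML, assumed to be installed

text = """\
manager: &SS
  name: Sara Song
  gender: Female

employees:
  - {employee: Jack Li, manager: *SS}
  - {employee: John Wu, manager: *SS}
"""
data = yaml.safe_load(text)
first, second = data["employees"]

# Each alias resolves to the very same dict object as the anchor.
shared = first["manager"] is data["manager"] and second["manager"] is data["manager"]
print(shared)

# Because the dict is shared, a change made through one reference shows up everywhere.
data["manager"]["gender"] = "F"
print(first["manager"]["gender"])
```

If independent copies are needed after loading, copy.deepcopy on the loaded data is one option.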
4.4.2 Anchor definition (&) and merge (<<)
& defines an anchor, * references it, and << merges the anchored data into the current mapping (expanded in place, much like a C macro). For example:
defaults: &defaults
  adapter: mlx5
  host: sunflower

dev:
  hca_name: hermon0
  <<: *defaults

test:
  hca_name: hermon1
  <<: *defaults
The example above is equivalent to:
defaults: &defaults
  adapter: mlx5
  host: sunflower

dev:
  hca_name: hermon0
  adapter: mlx5
  host: sunflower

test:
  hca_name: hermon1
  adapter: mlx5
  host: sunflower
The << symbol can be thought of as loosely analogous to the here-document operator in Bash. For example:
#!/bin/bash
cat << EOF
Hello China!
Hello World!
EOF
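The merge described in 4.4.2 can be checked by loading the document and inspecting the resulting dictionaries. The sketch below assumes PyYAML; the extra key host: daisy is a hypothetical addition showing that a key written explicitly in a mapping takes precedence over a merged one.

```python
import yaml  # PyYAML, assumed to be installed

text = """\
defaults: &defaults
  adapter: mlx5
  host: sunflower

dev:
  hca_name: hermon0
  <<: *defaults

test:
  hca_name: hermon1
  host: daisy
  <<: *defaults
"""
data = yaml.safe_load(text)

# dev receives every key from defaults; test keeps its own host value.
print(data["dev"])
print(data["test"])
```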
5. Using YAML in Python
To use YAML in Python, first install the PyYAML package.
$ sudo pip install pyyaml
5.1 Loading a YAML file into a Python object
#!/usr/bin/python

""" Deserialize YAML text to a Python Object by using yaml.safe_load() """

import sys
import yaml


def main(argc, argv):
    if argc != 2:
        sys.stderr.write("Usage: %s <yaml file>\n" % argv[0])
        return 1

    yaml_file = argv[1]
    with open(yaml_file, 'r') as f:
        # safe_load() builds only plain Python types and accepts an open
        # file directly; load() without a Loader is unsafe and is rejected
        # by recent PyYAML releases.
        obj = yaml.safe_load(f)
    # print() with a single argument works under both Python 2 and 3.
    print(type(obj))
    print(obj)

    return 0


if __name__ == '__main__':
    argv = sys.argv
    argc = len(argv)
    sys.exit(main(argc, argv))
$ cat -n list.yaml
     1  - name: Jack
     2  - name: John
     3  - name: Annie
$ ./foo_load.py list.yaml
<type 'list'>
[{'name': 'Jack'}, {'name': 'John'}, {'name': 'Annie'}]
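When the YAML input is not fully trusted, the safe loader is the more cautious choice, because it refuses tags that would construct arbitrary Python objects. The sketch below assumes PyYAML; the tagged text is a harmless, made-up stand-in for a malicious payload.

```python
import yaml  # PyYAML, assumed to be installed

# A Python-specific tag asking the loader to call a function.
untrusted = "!!python/object/apply:os.getcwd []\n"

try:
    yaml.safe_load(untrusted)
    rejected = False
except yaml.YAMLError as exc:
    # The safe loader has no constructor for this tag and raises an error.
    rejected = True
    print("rejected:", type(exc).__name__)

# Ordinary data is unaffected.
plain = yaml.safe_load("- name: Jack\n- name: John\n")
print(plain)
```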
5.2 Dumping a Python object to a YAML file
#!/usr/bin/python

""" Serialize a Python Object by using yaml.dump() """

import sys
import yaml

obj = {
    "students":
    [
        {
            "name": "John",
            "age": 23,
            "city": "Agra",
            "married": False,
            "spouse": None
        },
        {
            "name": "Steve",
            "age": 28,
            "city": "Delhi",
            "married": True,
            "spouse": "Grace"
        },
        {
            "name": "Peter",
            "age": 32,
            "city": "Chennai",
            "married": True,
            "spouse": "Rachel"
        }
    ],
    "teacher": {
        "name": "John",
        "age": 35,
        "city": "Chennai",
        "married": True,
        "spouse": "Anna",
        "childen":
        [
            {
                "name": "Jett",
                "age": 8
            },
            {
                "name": "Lucy",
                "age": 5
            }
        ]
    }
}


def main(argc, argv):
    if argc != 2:
        sys.stderr.write("Usage: %s <yaml file to save obj>\n" % argv[0])
        return 1

    # Open with 'w' so that a rerun replaces the file instead of appending
    # a second copy of the same keys to it.
    with open(argv[1], 'w') as f:
        txt = yaml.dump(obj)
        # print() with a single argument works under both Python 2 and 3.
        print("DEBUG> " + str(type(obj)))
        print("DEBUG> " + str(obj))
        print("DEBUG> " + str(type(txt)))
        print("DEBUG> " + txt)
        f.write(txt)

    return 0


if __name__ == '__main__':
    sys.exit(main(len(sys.argv), sys.argv))
huanli$ rm -f /tmp/dict.yaml
huanli$ ./foo_dump.py /tmp/dict.yaml
DEBUG> <type 'dict'>
DEBUG> {'students': [{'city': 'Agra', 'age': 23, 'married': False, 'name': 'John', 'spouse': None}, {'city': 'Delhi', 'age': 28, 'married': True, 'name': 'Steve', 'spouse': 'Grace'}, {'city': 'Chennai', 'age': 32, 'married': True, 'name': 'Peter', 'spouse': 'Rachel'}], 'teacher': {'city': 'Chennai', 'name': 'John', 'age': 35, 'married': True, 'childen': [{'age': 8, 'name': 'Jett'}, {'age': 5, 'name': 'Lucy'}], 'spouse': 'Anna'}}
DEBUG> <type 'str'>
DEBUG> students:
- {age: 23, city: Agra, married: false, name: John, spouse: null}
- {age: 28, city: Delhi, married: true, name: Steve, spouse: Grace}
- {age: 32, city: Chennai, married: true, name: Peter, spouse: Rachel}
teacher:
  age: 35
  childen:
  - {age: 8, name: Jett}
  - {age: 5, name: Lucy}
  city: Chennai
  married: true
  name: John
  spouse: Anna

huanli$ cat -n /tmp/dict.yaml
     1  students:
     2  - {age: 23, city: Agra, married: false, name: John, spouse: null}
     3  - {age: 28, city: Delhi, married: true, name: Steve, spouse: Grace}
     4  - {age: 32, city: Chennai, married: true, name: Peter, spouse: Rachel}
     5  teacher:
     6    age: 35
     7    childen:
     8    - {age: 8, name: Jett}
     9    - {age: 5, name: Lucy}
    10    city: Chennai
    11    married: true
    12    name: John
    13    spouse: Anna
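Whether nested collections come out in block style or inline flow style can be controlled explicitly, which avoids depending on the default of the installed PyYAML version. The following is a small sketch assuming PyYAML and its safe_dump function; the student record is reused from the example above.

```python
import yaml  # PyYAML, assumed to be installed

student = {"name": "John", "age": 23, "married": False, "spouse": None}

# Block style: one key per line (keys are sorted by default).
block_text = yaml.safe_dump(student, default_flow_style=False)
# Flow style: the whole mapping inline on a single line.
flow_text = yaml.safe_dump(student, default_flow_style=True)

print(block_text)
print(flow_text)

# Either form loads back to the original object.
round_trip_ok = (yaml.safe_load(block_text) == student
                 and yaml.safe_load(flow_text) == student)
```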
Appendix 1: FAQ
1. Is there an official extension for YAML files?
A: Please use ".yaml" when possible.
2. Why does YAML forbid tabs?
A: Tabs have been outlawed since they are treated differently by different editors and tools. And since indentation is so critical to proper interpretation of YAML, this issue is just too tricky to even attempt. Indeed Guido van Rossum of Python has acknowledged that allowing TABs in Python source is a headache for many people and that were he to design Python again, he would forbid them.
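The tab restriction is enforced by the parser, so a tab used for indentation fails at load time. A short sketch, assuming PyYAML:

```python
import yaml  # PyYAML, assumed to be installed

with_tab = "parent:\n\tchild: 1\n"      # child line indented with a tab
with_spaces = "parent:\n  child: 1\n"   # same structure indented with spaces

try:
    yaml.safe_load(with_tab)
    tab_accepted = True
except yaml.YAMLError as exc:
    # The scanner refuses a tab where indentation is expected.
    tab_accepted = False
    print("tab rejected:", type(exc).__name__)

spaces_result = yaml.safe_load(with_spaces)
print(spaces_result)
```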
Appendix 2: an oyaml-based script (yamlfmt.py) that converts between JSON and YAML
#!/usr/bin/python3

"""
Convert YAML file to JSON file or convert JSON file to YAML file, also support
to load a YAML file and dump it out in case it looks ugly

Note we use oyaml which is a drop-in replacement for PyYAML which preserves
dict ordering. And you have to install PyYAML first, then have a try, e.g.
    $ git clone https://github.com/wimglenn/oyaml.git /tmp/oyaml
    $ export PYTHONPATH=/tmp/oyaml:$PYTHONPATH

"""

import sys
import getopt
import json
import oyaml as yaml


def to_json(txt, indent=4):
    # The YAML loader accepts both JSON and YAML input. safe_load() is used
    # because load() without a Loader is rejected by recent PyYAML releases.
    obj = yaml.safe_load(txt)
    out = json.dumps(obj, indent=indent)
    return out


def to_yaml(txt, indent=2):
    # The YAML loader accepts both JSON and YAML input.
    obj = yaml.safe_load(txt)
    out = yaml.dump(obj, default_flow_style=False, indent=indent)
    return out.rstrip('\n')


def new_argv(argv0, rargv):
    # Rebuild argv from the program name and the non-option arguments.
    argv = []
    argv.append(argv0)
    argv.extend(rargv)
    return argv


def usage(argv0):
    sys.stderr.write('Usage: %s [-t indent] [-o outfile] <subcmd> '
                     '<yaml or json file>\n' % argv0)
    sys.stderr.write('subcmd:\n')
    sys.stderr.write('\ttojson | j : convert yaml to json OR\n')
    sys.stderr.write('\t             load json then dump out\n')
    sys.stderr.write('\ttoyaml | y : convert json to yaml OR\n')
    sys.stderr.write('\t             load yaml then dump out\n')
    sys.stderr.write('e.g.\n')
    sys.stderr.write('    %s tojson foo1.yaml\n' % argv0)
    sys.stderr.write('    %s toyaml foo2.json\n' % argv0)
    sys.stderr.write('    %s toyaml foo3.yaml\n' % argv0)
    sys.stderr.write('    %s -t 8 -o foo2.json tojson foo1.yaml\n' % argv0)
    sys.stderr.write('    %s -t 2 -o foo1.yaml toyaml foo2.json\n' % argv0)
    sys.stderr.write('    %s -t 2 -o foo3.yaml toyaml foo1.yaml\n' % argv0)


def main(argc, argv):
    indent = 4
    output_file = None

    options, rargv = getopt.getopt(argv[1:],
                                   't:o:h',
                                   ['indent=', 'output=', 'help'])
    for opt, arg in options:
        if opt in ('-t', '--indent'):
            indent = int(arg)
        elif opt in ('-o', '--output'):
            output_file = arg
        else:
            # -h / --help ends up here.
            usage(argv[0])
            return 1

    argv = new_argv(argv[0], rargv)
    argc = len(argv)
    if argc != 3:
        usage(argv[0])
        return 1

    subcmd = argv[1]
    yaml_file = argv[2]
    with open(yaml_file, 'r') as file_handler:
        txt = file_handler.read()

    if subcmd in ['tojson', 'j']:
        out = to_json(txt, indent)
    elif subcmd in ['toyaml', 'y']:
        out = to_yaml(txt, indent)
    else:
        usage(argv[0])
        return 1

    if output_file is None:
        print(out)
    else:
        with open(output_file, 'w') as file_handler:
            file_handler.write('%s\n' % out)

    return 0


if __name__ == '__main__':
    sys.exit(main(len(sys.argv), sys.argv))
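The script relies on oyaml to keep keys in their original order. As an alternative sketch, not the author's method, plain PyYAML can give the same effect on Python 3.7+ by passing sort_keys=False, assuming PyYAML 5.1 or newer. The function names json_to_yaml and yaml_to_json are hypothetical.

```python
import json
import yaml  # PyYAML 5.1 or newer is assumed for the sort_keys parameter


def json_to_yaml(json_text):
    """Convert JSON text to block-style YAML, keeping the key order."""
    obj = json.loads(json_text)
    return yaml.safe_dump(obj, default_flow_style=False, sort_keys=False)


def yaml_to_json(yaml_text, indent=4):
    """Convert YAML text to indented JSON."""
    obj = yaml.safe_load(yaml_text)
    return json.dumps(obj, indent=indent)


yaml_text = json_to_yaml('{"b": 1, "a": {"z": true, "y": null}}')
print(yaml_text)

json_text = yaml_to_json(yaml_text)
print(json_text)
```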