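While reading the nsq source code, I noticed that nsq uses the TOML configuration file format, so I translated most of the spec along the way. The English original is interleaved with my translated notes, which makes it easier to follow.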
toml-lang/toml
TOML
Tom's Obvious, Minimal Language.
By Tom Preston-Werner.
Latest tagged version: v0.2.0.
Be warned, this spec is still changing a lot. Until it's marked as 1.0, you should assume that it is unstable and act accordingly.
Objectives
TOML aims to be a minimal configuration file format that's easy to read due to obvious semantics. TOML is designed to map unambiguously to a hash table. TOML should be easy to parse into data structures in a wide variety of languages.
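Summary: TOML is a minimal configuration file format with obvious semantics that is easy to read, maps unambiguously to a hash table, and parses easily into data structures in many languages.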
Example
# This is a TOML document. Boom.
title = "TOML Example"
[owner]
name = "Tom Preston-Werner"
organization = "GitHub"
bio = "GitHub Cofounder & CEO\nLikes tater tots and beer."
dob = 1979-05-27T07:32:00Z # First class dates? Why not?
[database]
server = "192.168.1.1"
ports = [ 8001, 8001, 8002 ]
connection_max = 5000
enabled = true
[servers]
  # You can indent as you please. Tabs or spaces. TOML don't care.
  [servers.alpha]
  ip = "10.0.0.1"
  dc = "eqdc10"
  [servers.beta]
  ip = "10.0.0.2"
  dc = "eqdc10"
[clients]
data = [ ["gamma", "delta"], [1, 2] ]
# Line breaks are OK when inside arrays
hosts = [
  "alpha",
  "omega"
]
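To make the "maps unambiguously to a hash table" goal concrete, here is a hand-written sketch of the structure I would expect the example document to produce, written as a Python dict. The variable name `expected` is my own, and representing `dob` as a timezone-aware `datetime` is an assumption; a real parser may choose a different date type.

```python
from datetime import datetime, timezone

# Hand-written sketch of the hash table the example document maps to.
# Each [table] header becomes a nested dict; arrays become lists.
expected = {
    "title": "TOML Example",
    "owner": {
        "name": "Tom Preston-Werner",
        "organization": "GitHub",
        "bio": "GitHub Cofounder & CEO\nLikes tater tots and beer.",
        # Assumption: the zulu datetime is represented as a UTC datetime.
        "dob": datetime(1979, 5, 27, 7, 32, 0, tzinfo=timezone.utc),
    },
    "database": {
        "server": "192.168.1.1",
        "ports": [8001, 8001, 8002],
        "connection_max": 5000,
        "enabled": True,
    },
    "servers": {
        "alpha": {"ip": "10.0.0.1", "dc": "eqdc10"},
        "beta": {"ip": "10.0.0.2", "dc": "eqdc10"},
    },
    "clients": {
        "data": [["gamma", "delta"], [1, 2]],
        "hosts": ["alpha", "omega"],
    },
}
```

Note that `[servers.alpha]` and `[servers.beta]` end up nested inside `servers`, regardless of how the lines were indented in the file.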
Spec
- TOML is case sensitive.
- Whitespace means tab (0x09) or space (0x20).
Summary: TOML is case sensitive, and whitespace means only tab (0x09) or space (0x20).
Comment
Speak your mind with the hash symbol. They go from the symbol to the end of the line.
# I am a comment. Hear me roar. Roar.
key = "value" # Yeah, you can do this.
Summary: a comment starts at the # symbol and runs to the end of the line.
String
There are four ways to express strings: basic, multi-line basic, literal, and multi-line literal. All strings must contain only valid UTF-8 characters.
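Summary: strings come in four forms (basic, multi-line basic, literal, and multi-line literal), and all of them may contain only valid UTF-8 characters.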
Basic strings are surrounded by quotation marks. Any Unicode character may be used except those that must be escaped: quotation mark, backslash, and the control characters (U+0000 to U+001F).
"I'm a string. \"You can quote me\". Name\tJos\u00E9\nLocation\tSF."
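Summary: basic strings are wrapped in double quotes and may contain any Unicode character, except that the quotation mark, the backslash, and the control characters (U+0000 to U+001F) must be escaped.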
For convenience, some popular characters have a compact escape sequence.
\b - backspace (U+0008)
\t - tab (U+0009)
\n - linefeed (U+000A)
\f - form feed (U+000C)
\r - carriage return (U+000D)
\" - quote (U+0022)
\/ - slash (U+002F)
\\ - backslash (U+005C)
\uXXXX - unicode (U+XXXX)
\UXXXXXXXX - unicode (U+XXXXXXXX)
Any Unicode character may be escaped with the \uXXXX or \UXXXXXXXX forms. Note that the escape codes must be valid Unicode code points.
Summary: any Unicode character can be escaped with the \uXXXX or \UXXXXXXXX form, and the escape must be a valid Unicode code point. For example, the word 中文 can be written as \u4e2d\u6587.
Other special characters are reserved and, if used, TOML should produce an error.
Summary: any other escape sequence is reserved, and a TOML parser should report an error if one is used.
ProTip™: You may notice that the above string specification is the same as JSON's string definition, except that TOML requires UTF-8 encoding. This is on purpose.
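Note: the string definition above matches JSON's, except that TOML requires the file to be UTF-8 encoded. Non-ASCII text such as Chinese can therefore be written directly in a TOML string; \uXXXX escapes (the kind of output Java's native2ascii tool produces) are available but optional. This is intentional.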
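The escape table can be sketched as a small decoder. This is only an illustration of the rules listed above, not how any particular TOML parser does it: `decode_basic_string`, `SIMPLE_ESCAPES` and `ESCAPE_RE` are names I made up, and the function assumes it is handed the text between the surrounding quotes.

```python
import re

# Compact escapes from the table above, mapped to the characters they denote.
SIMPLE_ESCAPES = {
    "b": "\b", "t": "\t", "n": "\n", "f": "\f", "r": "\r",
    '"': '"', "/": "/", "\\": "\\",
}

# One escape: \uXXXX, \UXXXXXXXX, or a backslash followed by any single character.
ESCAPE_RE = re.compile(r"\\(u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8}|.)", re.DOTALL)


def decode_basic_string(body):
    """Decode the text between the quotes of a basic string (hypothetical helper)."""
    def replace(match):
        token = match.group(1)
        if token[0] in "uU" and len(token) > 1:
            code_point = int(token[1:], 16)
            if code_point > 0x10FFFF:
                raise ValueError("not a valid Unicode code point: \\" + token)
            return chr(code_point)
        if token in SIMPLE_ESCAPES:
            return SIMPLE_ESCAPES[token]
        # Anything else is reserved, so it is reported as an error.
        raise ValueError("reserved escape sequence: \\" + token)
    return ESCAPE_RE.sub(replace, body)
```

For example, `decode_basic_string(r"\u4e2d\u6587")` yields the two-character string 中文, while an unlisted escape such as `\q` raises `ValueError`.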
Sometimes you need to express passages of text (e.g. translation files) or would like to break up a very long string into multiple lines. TOML makes this easy.
Multi-line basic strings are surrounded by three quotation marks on each side and allow newlines. If the first character after the opening delimiter is a newline (0x0A), then it is trimmed. All other whitespace remains intact.
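Summary: multi-line basic strings are meant for passages of text or for splitting a long string across several lines. They are wrapped in three double quotes and may contain newlines. A newline immediately after the opening delimiter is trimmed, as you would expect; all other whitespace is left untouched.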
# The following strings are byte-for-byte equivalent (key3 is the most readable form):
key1 = "One\nTwo"
key2 = """One\nTwo"""
key3 = """
One
Two"""
For writing long strings without introducing extraneous whitespace, end a line with a \. The \ will be trimmed along with all whitespace (including newlines) up to the next non-whitespace character or closing delimiter. If the first two characters after the opening delimiter are a backslash and a newline (0x5C0A), then they will both be trimmed along with all whitespace (including newlines) up to the next non-whitespace character or closing delimiter. All of the escape sequences that are valid for basic strings are also valid for multi-line basic strings.
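Summary: to write a long string without introducing extraneous whitespace, end a line with \. The \ is removed along with all whitespace, newlines included, up to the next non-whitespace character or the closing delimiter. The same happens when a \ and a newline immediately follow the opening delimiter. Every escape sequence that is valid in a basic string is also valid in a multi-line basic string.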
# The following strings are byte-for-byte equivalent (a good example to study):
key1 = "The quick brown fox jumps over the lazy dog."
key2 = """
The quick brown \
fox jumps over \
the lazy dog."""
key3 = """\
The quick brown \
fox jumps over \
the lazy dog.\
"""
Any Unicode character may be used except those that must be escaped: backslash and the control characters (U+0000 to U+001F). Quotation marks need not be escaped unless their presence would create a premature closing delimiter.
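Summary: the Unicode rules are the same as for basic strings, except that quotation marks need not be escaped unless they would form a premature closing delimiter.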
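The two trimming rules for multi-line basic strings can be sketched in a few lines. This is a simplified illustration under stated assumptions: `fold_multiline_basic` is a hypothetical name, it receives the raw text between the triple quotes, it does not decode other escapes, and it does not handle an escaped backslash (`\\`) sitting at the end of a line.

```python
import re


def fold_multiline_basic(body):
    """Apply the two trimming rules to the raw text between the triple quotes."""
    # Rule 1: a newline directly after the opening delimiter is trimmed.
    if body.startswith("\n"):
        body = body[1:]
    # Rule 2: a backslash that ends a line is removed together with all
    # whitespace (including newlines) up to the next non-whitespace
    # character or the end of the string.
    return re.sub(r"\\\n[ \t\n]*", "", body)
```

Feeding it the raw bodies of key2 and key3 from the example produces the same single-line sentence as key1, which is what "byte-for-byte equivalent" means here.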
If you're a frequent specifier of Windows paths or regular expressions, then having to escape backslashes quickly becomes tedious and error prone. To help, TOML supports literal strings where there is no escaping allowed at all.
Literal strings are surrounded by single quotes. Like basic strings, they must appear on a single line:
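Summary: if you often write Windows paths or regular expressions, escaping backslashes quickly becomes tedious and error prone. TOML therefore supports literal strings: no escaping at all, wrapped in single quotes, on a single line.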
# What you see is what you get.
winpath = 'C:\Users\nodejs\templates'
winpath2 = '\\ServerX\admin$\system32\'
quoted = 'Tom "Dubs" Preston-Werner'
regex = '<\i\c*\s*>'
Since there is no escaping, there is no way to write a single quote inside a literal string enclosed by single quotes. Luckily, TOML supports a multi-line version of literal strings that solves this problem.
Multi-line literal strings are surrounded by three single quotes on each side and allow newlines. Like literal strings, there is no escaping whatsoever. If the first character after the opening delimiter is a newline (0x0A), then it is trimmed. All other content between the delimiters is interpreted as-is without modification.
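Summary: because nothing can be escaped, a literal string wrapped in single quotes cannot contain a single quote. Fortunately, TOML also has multi-line literal strings: wrapped in three single quotes, newlines allowed, still no escaping. A newline immediately after the opening delimiter is trimmed; everything else is kept exactly as written.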
regex2 = '''I [dw]on't need \d{2} apples'''
lines = '''
The first newline is
trimmed in raw strings.
   All other whitespace
   is preserved.
'''
For binary data it is recommended that you use Base64 or another suitable ASCII or UTF-8 encoding. The handling of that encoding will be application specific.
Integer
Integers are bare numbers, all alone. Feeling negative? Do what's natural. 64-bit minimum size expected.
Float
Floats are numbers with a single dot within. There must be at least one number on each side of the decimal point. 64-bit (double) precision expected.
Boolean
Booleans are just the tokens you're used to. Always lowercase.
Datetime
Datetimes are ISO 8601 dates, but only the full zulu form is allowed.
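The four primitive types above can be told apart by shape alone. Here is a rough sketch, assuming the value token has already been isolated and stripped of surrounding whitespace. `parse_primitive` and the regex names are hypothetical, and returning a timezone-aware `datetime` for the Datetime type is my own choice rather than something the spec prescribes.

```python
import re
from datetime import datetime, timezone

# Shapes described in the Integer, Float and Datetime sections.
INTEGER_RE = re.compile(r"-?[0-9]+")
FLOAT_RE = re.compile(r"-?[0-9]+\.[0-9]+")  # at least one digit on each side of the dot
DATETIME_RE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z")


def parse_primitive(token):
    """Convert one value token into a Python value (hypothetical helper)."""
    # Booleans are always lowercase.
    if token == "true":
        return True
    if token == "false":
        return False
    if INTEGER_RE.fullmatch(token):
        return int(token)
    if FLOAT_RE.fullmatch(token):
        return float(token)
    if DATETIME_RE.fullmatch(token):
        # Only the full zulu form is accepted; strptime also rejects
        # impossible dates such as month 13.
        parsed = datetime.strptime(token, "%Y-%m-%dT%H:%M:%SZ")
        return parsed.replace(tzinfo=timezone.utc)
    raise ValueError("not a valid primitive: " + token)
```

With this sketch, `42`, `-42`, `3.1415`, `true` and `1979-05-27T07:32:00Z` are accepted, while `.5`, `5.`, `True` and a bare date such as `1979-05-27` are rejected.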
Array
Arrays are square brackets with other primitives inside. Whitespace is ignored. Elements are separated by commas. Data types may not be mixed.
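Summary: arrays are primitives wrapped in square brackets. Whitespace is ignored, elements are separated by commas, and all elements must share a single data type.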
[ 1, 2, 3 ]
[ "red", "yellow", "green" ]
[ [ 1, 2 ], [3, 4, 5] ]
[ [ 1, 2 ], ["a", "b", "c"] ] # this is ok
[ 1, 2.0 ] # note: this is NOT ok
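The "no mixed types" rule is easy to misread, so here is a small sketch of it. It assumes the array has already been parsed into Python lists; `is_valid_array` is a hypothetical helper, not part of any TOML library.

```python
def is_valid_array(array):
    """Return True if the array, and every nested array, holds a single type."""
    # All direct elements must have exactly the same type. Note that an
    # array of integers and an array of strings are both just "array".
    if len({type(item) for item in array}) > 1:
        return False
    # Each nested array is then checked on its own.
    return all(is_valid_array(item) for item in array if isinstance(item, list))
```

This explains why `[ [ 1, 2 ], ["a", "b", "c"] ]` is fine (both elements are arrays) while `[ 1, 2.0 ]` is not (an integer next to a float).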
Arrays can also be multiline. So in addition to ignoring whitespace, arrays also ignore newlines between the brackets. Terminating commas are ok before the closing bracket.
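Summary: arrays may span multiple lines. Besides whitespace, newlines between the brackets are ignored too, and a trailing comma before the closing bracket is allowed.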
key = [
  1, 2, 3
]
key = [
  1,
  2, # this is ok
]
Table
Tables (also known as hash tables or dictionaries) are collections of key/value pairs. They appear in square brackets on a line by themselves. You can tell them apart from arrays because arrays are only ever values.
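Summary: a table is a collection of key/value pairs. Its name appears in square brackets on a line of its own, which distinguishes it from an array, since an array only ever appears as a value.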
Under that, and until the next table or EOF are the key/values of that table. Keys are on the left of the equals sign and values are on the right. Keys start with the first character that isn't whitespace or [ and end with the last non-whitespace character before the equals sign. Keys cannot contain a # character. Key/value pairs within tables are not guaranteed to be in any specific order.
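Summary: everything below a table header, up to the next table or EOF, is a key/value pair of that table. The key is on the left of the equals sign and the value on the right. A key starts at the first character that is neither whitespace nor [ and ends at the last non-whitespace character before the equals sign. Keys cannot contain #. Key/value pairs within a table have no guaranteed order.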
You can indent keys and their values as much as you like. Tabs or spaces. Knock yourself out. Why, you ask? Because you can have nested tables. Snap.
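Summary: keys and values may be indented with as much whitespace as you like. The reason is that tables can be nested, and indentation helps show the nesting.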
Nested tables are denoted by table names with dots in them. Name your tables whatever crap you please, just don't use #, ., [ or ].
Summary: nested tables are written with dots in the table name. Table names are otherwise free-form, as long as they avoid #, ., [ and ].
[dog.tater]
type = "pug"
In JSON land, that would give you the following structure:
{ "dog": { "tater": { "type": "pug" } } }
You don't need to specify all the super-tables if you don't want to. TOML knows how to do it for you.
Note: the example below shows this nicely.
# [x] you
# [x.y] don't
# [x.y.z] need these
[x.y.z.w] # for this to work
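How a dotted table name turns into nested hash tables, with the super-tables created on demand, can be sketched as follows. `open_table` is a hypothetical helper that works on plain Python dicts; it ignores duplicate detection entirely.

```python
def open_table(root, dotted_name):
    """Return the dict for a [dotted.name] header, creating super-tables as needed."""
    table = root
    for part in dotted_name.split("."):
        # setdefault creates the missing super-table the first time it is seen
        # and returns the existing one afterwards.
        table = table.setdefault(part, {})
    return table


root = {}
open_table(root, "dog.tater")["type"] = "pug"
open_table(root, "x.y.z.w")
```

After these calls `root["dog"]` holds the same structure as the JSON shown earlier, and `root["x"]["y"]["z"]["w"]` is an empty table even though `[x]`, `[x.y]` and `[x.y.z]` were never written.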
Empty tables are allowed and simply have no key/value pairs within them.
Summary: empty tables are allowed.
As long as a super-table hasn't been directly defined and hasn't defined a specific key, you may still write to it.
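Summary: as long as a super-table has not been defined directly and has not already defined the key in question, you can still write to it.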
You cannot define any key or table more than once. Doing so is invalid.
Summary: a key or a table must not be defined more than once.
# DO NOT DO THIS
[a]
b = 1
[a]
c = 2
# DO NOT DO THIS EITHER
[a]
b = 1
[a.b]
c = 2
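Both invalid cases above can be caught with very little bookkeeping. The sketch below assumes the parser keeps a set of table names that were defined directly; `define_table`, `set_key` and `TableConflictError` are names I invented for illustration.

```python
class TableConflictError(ValueError):
    """Raised when a table or key is defined more than once."""


def define_table(root, defined, dotted_name):
    """Handle a [dotted.name] header, rejecting redefinitions."""
    if dotted_name in defined:
        raise TableConflictError("table defined twice: " + dotted_name)
    table = root
    for part in dotted_name.split("."):
        table = table.setdefault(part, {})
        # If the name already holds a plain value, it cannot become a table.
        if not isinstance(table, dict):
            raise TableConflictError("key already holds a value: " + part)
    defined.add(dotted_name)
    return table


def set_key(table, key, value):
    """Handle a key = value line inside the current table."""
    if key in table:
        raise TableConflictError("key defined twice: " + key)
    table[key] = value
```

Because only directly defined names go into `defined`, writing `[a.b]` first and `[a]` later is still accepted, which matches the rule about super-tables that have not been directly defined.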
All table names and keys must be non-empty.
# NOT VALID TOML
[]
[a.]
[a..b]
[.b]
[.]
 = "no key name" # not allowed
Array of Tables
The last type that has not yet been expressed is an array of tables. These can be expressed by using a table name in double brackets. Each table with the same double bracketed name will be an element in the array. The tables are inserted in the order encountered. A double bracketed table without any key/value pairs will be considered an empty table.
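Summary: an array of tables is written with the table name in double brackets. Each table with the same double-bracketed name becomes one element of the array (recall that a plain table name cannot be defined twice). Elements are inserted in the order they appear, and a double-bracketed table with no key/value pairs is an empty table.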
[[products]]
name = "Hammer"
sku = 738594937
[[products]]
[[products]]
name = "Nail"
sku = 284758393
color = "gray"
In JSON land, that would give you the following structure.
{
  "products": [
    { "name": "Hammer", "sku": 738594937 },
    { },
    { "name": "Nail", "sku": 284758393, "color": "gray" }
  ]
}
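The behaviour of a double-bracketed header boils down to "append a fresh table to a list". A minimal sketch for top-level names only, with `append_table` as a hypothetical helper:

```python
def append_table(root, name):
    """Handle a [[name]] header: append a new empty table and return it."""
    elements = root.setdefault(name, [])  # the first [[name]] creates the array
    new_table = {}
    elements.append(new_table)            # tables keep the order they appear in
    return new_table


root = {}
hammer = append_table(root, "products")
hammer["name"] = "Hammer"
hammer["sku"] = 738594937
append_table(root, "products")            # no key/value pairs: an empty table
nail = append_table(root, "products")
nail["name"] = "Nail"
nail["sku"] = 284758393
nail["color"] = "gray"
```

After these calls `root` has the same shape as the JSON structure above, including the empty table in the middle.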
You can create nested arrays of tables as well. Just use the same double bracket syntax on sub-tables. Each double-bracketed sub-table will belong to the most recently defined table element above it.
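Summary: arrays of tables can be nested. Sub-tables use the same double-bracket syntax, and each one belongs to the most recently defined table element above it.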
[[fruit]]
  name = "apple"
  [fruit.physical]
    color = "red"
    shape = "round"
  [[fruit.variety]]
    name = "red delicious"
  [[fruit.variety]]
    name = "granny smith"
[[fruit]]
  name = "banana"
  [[fruit.variety]]
    name = "plantain"
The above TOML maps to the following JSON.
{
  "fruit": [
    {
      "name": "apple",
      "physical": {
        "color": "red",
        "shape": "round"
      },
      "variety": [
        { "name": "red delicious" },
        { "name": "granny smith" }
      ]
    },
    {
      "name": "banana",
      "variety": [
        { "name": "plantain" }
      ]
    }
  ]
}
Attempting to define a normal table with the same name as an already established array must produce an error at parse time.
# INVALID TOML DOC
[[fruit]]
  name = "apple"
  [[fruit.variety]]
    name = "red delicious"
  # This table conflicts with the previous table
  [fruit.variety]
    name = "granny smith"
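The "most recently defined table element" rule and the conflict above can both be sketched with one path-resolution helper. This is an illustration on plain Python dicts and lists, not a full parser: `resolve_parent`, `open_array_table` and `open_plain_table` are hypothetical names, and duplicate plain tables are not checked here.

```python
def resolve_parent(root, parts):
    """Walk a dotted path; an array of tables stands for its last element."""
    table = root
    for part in parts:
        child = table.setdefault(part, {})
        if isinstance(child, list):
            child = child[-1]  # the most recently defined table element
        table = child
    return table


def open_array_table(root, dotted_name):
    """Handle a [[dotted.name]] header."""
    *parents, last = dotted_name.split(".")
    parent = resolve_parent(root, parents)
    elements = parent.setdefault(last, [])
    if not isinstance(elements, list):
        raise ValueError(dotted_name + " is already defined as a plain table or value")
    table = {}
    elements.append(table)
    return table


def open_plain_table(root, dotted_name):
    """Handle a [dotted.name] header."""
    *parents, last = dotted_name.split(".")
    parent = resolve_parent(root, parents)
    if isinstance(parent.get(last), list):
        raise ValueError(dotted_name + " is already defined as an array of tables")
    return parent.setdefault(last, {})
```

With these helpers, `[[fruit.variety]]` always lands in the latest `[[fruit]]` element, and a later `[fruit.variety]` raises an error because that name already refers to an array.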
Seriously?
Yep.
But why?
Because we need a decent human-readable format that unambiguously maps to a hash table and the YAML spec is like 80 pages long and gives me rage. No, JSON doesn't count. You know why.
Summary: the YAML spec runs to about 80 pages, and that frustration is the motivation for TOML.
Oh god, you're right
Yuuuup. Wanna help? Send a pull request. Or write a parser. BE BRAVE.
Implementations
If you have an implementation, send a pull request adding to this list. Please note the commit SHA1 or version tag that your parser supports in your Readme.