MAML v0.1
Minimal Abstract Markup Language
By Medvedev Anton.
Objectives
- MAML aims to be a minimal configuration format.
- MAML should be easily readable by humans.
- MAML should be easily parsed by machines.
Spec
- MAML is case-sensitive.
- A MAML file must be a valid UTF-8 encoded Unicode document.
- Whitespace means a space (0x20) or a tab (0x09).
- Newline means LF (0x0A) or CRLF (0x0D 0x0A).
Whitespace and newlines may appear between values and structural characters.
Comment
A hash symbol marks the rest of the line as a comment, except when inside a string.
# Comment before the object
{
foo: "value" # Inline comment
bar: "# This is not a comment"
}
Control characters other than tab (U+0000 to U+0008, U+000A to U+001F, U+007F) are not permitted in comments.
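The comment rule can be illustrated with a minimal Python sketch. The function name strip_comment is hypothetical, and the sketch assumes it is given a single line that contains no multiline string; it is not a full MAML lexer.

```python
def strip_comment(line: str) -> str:
    """Return `line` without its trailing comment, honoring quoted strings.

    Assumption: `line` is one physical line with no multiline strings.
    """
    in_string = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_string:
            if ch == "\\":
                i += 1  # skip the character that the backslash escapes
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "#":
            # A hash outside a string starts the comment.
            return line[:i].rstrip(" \t")
        i += 1
    return line
```

A hash inside a quoted string is left alone, while a hash outside a string removes the rest of the line.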
Values
A MAML value is an object, array, string, multiline string, integer, float, boolean, or null.
Array
An array is represented as square brackets ([ and ]) surrounding zero or more values. Values are separated by commas (,) or newlines. The values in an array do not have to be of the same type.
[
"red"
"yellow"
"green"
]
Commas are optional. A trailing comma is allowed.
[ "red", "yellow", "green", ]
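The separator rules can be sketched as a small recursive-descent routine in Python. This is a simplified illustration, not a complete parser: the names parse_array and parse_value are hypothetical, and the sketch only understands integers, true, false, null, and nested arrays.

```python
import re

WS = re.compile(r"[ \t]*")
# Whitespace, blank lines, and whole-line comments between items.
WS_COMMENT_NEWLINE = re.compile(r"(?:[ \t]|(?:#[^\r\n]*)?\r?\n)*")
COMMENT = re.compile(r"#[^\r\n]*")
SEPARATOR = re.compile(r",|\r?\n")
INTEGER = re.compile(r"-?(?:0|[1-9][0-9]*)")
KEYWORDS = {"true": True, "false": False, "null": None}


def parse_array(src: str, pos: int = 0):
    """Parse an array at src[pos]; return (list, position after ']')."""
    if src[pos:pos + 1] != "[":
        raise ValueError(f"expected '[' at offset {pos}")
    pos += 1
    items = []
    while True:
        pos = WS_COMMENT_NEWLINE.match(src, pos).end()
        if src[pos:pos + 1] == "]":
            return items, pos + 1
        value, pos = parse_value(src, pos)
        items.append(value)
        pos = WS.match(src, pos).end()
        comment = COMMENT.match(src, pos)
        if comment:
            pos = comment.end()
        separator = SEPARATOR.match(src, pos)
        if separator:
            pos = separator.end()  # a comma or a newline ends the item
        elif src[pos:pos + 1] != "]":
            raise ValueError(f"expected ',', newline, or ']' at offset {pos}")


def parse_value(src: str, pos: int):
    """Parse one simplified value; strings, floats, and objects are omitted."""
    if src[pos:pos + 1] == "[":
        return parse_array(src, pos)
    for word, value in KEYWORDS.items():
        if src.startswith(word, pos):
            return value, pos + len(word)
    number = INTEGER.match(src, pos)
    if number:
        return int(number.group()), number.end()
    raise ValueError(f"unexpected input at offset {pos}")
```

Under these assumptions, "[1, 2, 3, ]" and "[\n1\n2\n]" are both accepted, while "[1 2]" is rejected because a space alone is not a separator.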
Objects
An object is an ordered set of key/value pairs. An object begins with a left brace ({) and ends with a right brace (}). Each key is followed by a colon (:), and the key/value pairs are separated by commas (,) or newlines.
{
key: "value"
"quoted key": "value"
}
Commas are optional. A trailing comma is allowed.
{
foo: "value",
bar: "value",
}
Duplicate keys are not allowed within an object.
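One way an implementation might enforce the ordering and duplicate-key rules is sketched below in Python. The helper build_object is hypothetical and assumes the parser has already produced a sequence of (key, value) pairs with every key decoded to a string.

```python
def build_object(pairs):
    """Collect (key, value) pairs into a dict, rejecting duplicate keys.

    Python dicts preserve insertion order, which keeps the pairs ordered.
    """
    obj = {}
    for key, value in pairs:
        if key in obj:
            raise ValueError(f"duplicate key: {key!r}")
        obj[key] = value
    return obj
```

Because keys are always strings, an identifier key and a quoted key with the same text collide in this sketch.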
Keys
A key may be either an identifier or a quoted string.
Identifier keys may only contain letters (A-Z, a-z), digits (0-9), underscores (_), and hyphens (-).
Identifier keys may consist solely of digits (for example, 1234), but they are always interpreted as strings.
Quoted string keys follow the exact same rules as strings and allow you to use a much broader set of key names.
An identifier key must be non-empty, but an empty quoted string key is allowed.
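The identifier rule translates directly into a regular expression. The following Python sketch uses a hypothetical helper, is_identifier_key, that a serializer could call to decide whether a key needs quoting.

```python
import re

# One or more of: A-Z, a-z, 0-9, underscore, hyphen.
IDENTIFIER = re.compile(r"[A-Za-z0-9_-]+")


def is_identifier_key(text: str) -> bool:
    """Return True if `text` can be written as an unquoted identifier key."""
    return IDENTIFIER.fullmatch(text) is not None
```

Keys for which this returns False, including the empty key, would have to be written as quoted strings.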
String
Strings are surrounded by quotation marks ("). Any Unicode character may be used except those that must be escaped: quotation mark, backslash, and the control characters other than tab (U+0000 to U+0008, U+000A to U+001F, U+007F).
"String with a \"nested\" string, \t tab, 😁 emoji, and \u0022 sequence"
For convenience, some popular characters have a compact escape sequence.
\b - backspace (U+0008)
\t - tab (U+0009)
\n - linefeed (U+000A)
\f - form feed (U+000C)
\r - carriage return (U+000D)
\" - quote (U+0022)
\\ - backslash (U+005C)
A Unicode character may be escaped with the \uXXXX form. The escape codes must be valid Unicode scalar values.
All other escape sequences not listed above are reserved; if they are used, MAML should produce an error. All strings must contain only valid UTF-8 characters.
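The escape rules can be sketched as a small decoder in Python. The function unescape is hypothetical; it takes the text between the quotation marks, follows the escape list given in this section (the ABNF grammar additionally lists \/, which this sketch omits), and does not check for raw control characters.

```python
import string

SIMPLE_ESCAPES = {
    "b": "\b", "t": "\t", "n": "\n", "f": "\f", "r": "\r",
    '"': '"', "\\": "\\",
}


def unescape(body: str) -> str:
    """Decode the escape sequences in the body of a quoted string."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        code = body[i + 1:i + 2]
        if code in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[code])
            i += 2
        elif code == "u":
            digits = body[i + 2:i + 6]
            if len(digits) != 4 or any(d not in string.hexdigits for d in digits):
                raise ValueError("malformed \\u escape")
            value = int(digits, 16)
            if 0xD800 <= value <= 0xDFFF:
                # Surrogate code points are not Unicode scalar values.
                raise ValueError("escape is not a Unicode scalar value")
            out.append(chr(value))
            i += 6
        else:
            # Covers reserved escapes and a backslash at the end of the body.
            raise ValueError(f"reserved escape sequence: \\{code}")
    return "".join(out)
```

Any escape that is not listed, such as \x, raises an error instead of being passed through.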
Multiline String
Multiline strings are surrounded by three quotes (""") on each side and allow newlines. There is no escaping. A newline immediately following the opening delimiter is ignored. All other content between the delimiters is interpreted as-is without modification.
"""
The quick brown
fox jumps over
the lazy dog.
"""
In the previous example, the string ends with a newline:
"The quick brown\nfox jumps over\nthe lazy dog.\n"
To avoid the last newline, place the closing delimiter on the same line.
"""
The quick brown
fox jumps over
the lazy dog."""
All whitespace is preserved as is, including leading indentation.
{
 key: """
 Roses are red,
 Violets are blue;
 """
}
The previous example can be written as:
{
key: " Roses are red,\n Violets are blue;\n "
}
All escape sequences are preserved as is.
"""
There is no escaping, so \n, \u0022, etc.,
are interpreted as-is without modification.
"""
You can write one or two quotes anywhere within a multiline string, but sequences of three or more quotes are not permitted.
"""
Maximum of two "" quotes allowed inside.
But more are fine when separated: "1", "2", "3".
"""
A multiline string may also be written on a single line.
"""A multiline string with "quotes"."""
The following example evaluates to the empty string "":
"""
"""
If you want a string that contains exactly one newline character, insert one additional blank line between the delimiters:
"""

"""
The single-line multiline string cannot represent an empty string. To write an empty string on one line, use a normal string "".
All multiline strings must contain only valid UTF-8 characters.
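The multiline string rules can be summarized in a short Python sketch. The function decode_multiline is hypothetical and assumes it receives one complete token, delimiters included, that a lexer has already isolated.

```python
def decode_multiline(token: str) -> str:
    """Return the value of a complete multiline string token."""
    if len(token) < 6 or not (token.startswith('"""') and token.endswith('"""')):
        raise ValueError("not a multiline string")
    body = token[3:-3]
    # The body may not be empty, and a quote touching a delimiter or a run
    # of three quotes inside would form a sequence of three or more quotes.
    if not body or body[0] == '"' or body[-1] == '"' or '"""' in body:
        raise ValueError("invalid multiline string body")
    # Only a newline right after the opening delimiter is dropped.
    if body.startswith("\r\n"):
        return body[2:]
    if body.startswith("\n"):
        return body[1:]
    return body  # everything else is kept as-is, with no escape processing
```

With this sketch, the token consisting of two delimiters on adjacent lines decodes to the empty string, and adding one blank line between them yields a single newline.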
Integer
Integers are whole numbers. Negative numbers are prefixed with a minus sign. Leading zeros are not allowed. A plus sign is not allowed.
{
int1: 42
int2: -100
}
Arbitrary 64-bit signed integers (from −2^63 to 2^63−1) should be accepted and handled losslessly. If an integer cannot be represented losslessly, an error must be thrown.
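A minimal Python sketch of the integer rules follows. The name parse_integer is hypothetical. Python integers have arbitrary precision, so the 64-bit range check here is an assumption that mimics an implementation backed by a fixed-width signed type.

```python
import re

# Optional minus sign, then "0" or a non-zero digit followed by digits.
INTEGER = re.compile(r"-?(?:0|[1-9][0-9]*)")
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def parse_integer(text: str) -> int:
    """Parse an integer literal, rejecting bad syntax and 64-bit overflow."""
    if INTEGER.fullmatch(text) is None:
        raise ValueError(f"invalid integer literal: {text!r}")
    value = int(text)
    if not INT64_MIN <= value <= INT64_MAX:
        # Assumed policy: treat values outside int64 as not representable.
        raise ValueError(f"integer out of 64-bit range: {text}")
    return value
```

Literals such as +1 and 007 fail the syntax check, and 9223372036854775808 fails the range check.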
Float
Floats should be implemented as IEEE 754 binary64 values.
A float consists of an integer part (which follows the same rules as integer values) followed by a fractional part and/or an exponent part. If both a fractional part and an exponent part are present, the fractional part must precede the exponent part.
[
# fractional
1.0
3.1415
-0.01
# exponent
5e+22
1e06
-2E-2
# both
6.626e-34
]
A fractional part is a decimal point followed by one or more digits.
An exponent part is an E (upper or lower case) followed by an optional plus or minus sign and one or more digits. Unlike the integer part, the exponent digits may include leading zeros.
The decimal point, if used, must be surrounded by at least one digit on each side.
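The number syntax described above can be checked with a single regular expression. In this Python sketch the name parse_number is hypothetical; a literal with neither a fractional part nor an exponent part is treated as an integer, and everything else is converted with Python's built-in float, which is an IEEE 754 binary64 value on common platforms.

```python
import re

NUMBER = re.compile(
    r"-?(?:0|[1-9][0-9]*)"   # integer part, same rules as integers
    r"(\.[0-9]+)?"           # optional fractional part
    r"([eE][+-]?[0-9]+)?"    # optional exponent part
)


def parse_number(text: str):
    """Return an int or a float depending on the shape of the literal."""
    match = NUMBER.fullmatch(text)
    if match is None:
        raise ValueError(f"invalid number literal: {text!r}")
    fraction, exponent = match.groups()
    if fraction is None and exponent is None:
        return int(text)
    return float(text)
```

Literals such as 1. and .5 are rejected because the decimal point needs a digit on each side.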
Boolean
Booleans are true and false. They are always lowercase.
[ true, false ]
Null
Null represents the lack of a value. An object with some key and a null value is valid and different from not having that key in the object. Null is always lowercase.
null
Filename Extension
MAML files should use the extension .maml.
MIME Type
When transferring MAML files over the internet, the appropriate MIME type is application/maml.
ABNF Grammar
A formal description of MAML's syntax in ABNF format (RFC 5234).
A visual representation of the MAML syntax is available in the syntax diagram.
maml = ws-comment-newline value ws-comment-newline
value = object / array / string / multiline-string / number / "true" / "false" / "null"
object = "{" [ members ] ws-comment-newline "}"
members = ws-comment-newline key-value ws [ comment ] separator members
members =/ ws-comment-newline key-value ws [ comment ] [ separator ]
key-value = key ws ":" ws value
key = string / identifier
identifier = 1*( ALPHA / DIGIT / "-" / "_" )
array = "[" [ items ] ws-comment-newline "]"
items = ws-comment-newline value ws [ comment ] separator items
items =/ ws-comment-newline value ws [ comment ] [ separator ]
separator = "," / newline
string = quote *char quote
multiline-string = 3quote 1*literal-char *( 1*2quote 1*literal-char ) 3quote
literal-char = %x09 / %x20-21 / %x23-7E / non-ascii / newline
number = [ "-" ] integer [ fraction ] [ exp ]
onenine = %x31-39
exp = ( "e" / "E" ) [ "-" / "+" ] 1*DIGIT
fraction = "." 1*DIGIT
integer = "0" / ( onenine *DIGIT )
quote = %x22
char = %x09 / %x20-21 / %x23-5B / %x5D-7E / non-ascii
char =/ %x5C ( %x5C / quote / "/" / "b" / "f" / "n" / "r" / "t" / "u" 4HEXDIG )
comment = "#" *non-eol
non-eol = %x09 / %x20-7E / non-ascii
non-ascii = %x80-D7FF / %xE000-10FFFF
ws-comment-newline = *( (SP / HTAB) / [ comment ] newline )
newline = LF / CR LF
ws = *( SP / HTAB )
SP = %x20 ; space
HTAB = %x09 ; horizontal tab
CR = %x0D ; carriage return
LF = %x0A ; linefeed
ALPHA = %x41-5A / %x61-7A ; A-Z / a-z
DIGIT = %x30-39 ; 0-9
HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"