Term rewriting macros ===================== Term rewriting macros are macros or templates that have not only a *name* but also a *pattern* that is searched for after the semantic checking phase of the compiler: This means they provide an easy way to enhance the compilation pipeline with user defined optimizations: .. code-block:: nim template optMul{`*`(a, 2)}(a: int): int = a+a let x = 3 echo x * 2 The compiler now rewrites ``x * 2`` as ``x + x``. The code inside the curlies is the pattern to match against. The operators ``*``, ``**``, ``|``, ``~`` have a special meaning in patterns if they are written in infix notation, so to match verbatim against ``*`` the ordinary function call syntax needs to be used. Unfortunately optimizations are hard to get right and even the tiny example is **wrong**: .. code-block:: nim template optMul{`*`(a, 2)}(a: int): int = a+a proc f(): int = echo "side effect!" result = 55 echo f() * 2 We cannot duplicate 'a' if it denotes an expression that has a side effect! Fortunately Nim supports side effect analysis: .. code-block:: nim template optMul{`*`(a, 2)}(a: int{noSideEffect}): int = a+a proc f(): int = echo "side effect!" result = 55 echo f() * 2 # not optimized ;-) You can make one overload matching with a constraint and one without, and the one with a constraint will have precedence, and so you can handle both cases differently. So what about ``2 * a``? We should tell the compiler ``*`` is commutative. We cannot really do that however as the following code only swaps arguments blindly: .. code-block:: nim template mulIsCommutative{`*`(a, b)}(a, b: int): int = b*a What optimizers really need to do is a *canonicalization*: .. code-block:: nim template canonMul{`*`(a, b)}(a: int{lit}, b: int): int = b*a The ``int{lit}`` parameter pattern matches against an expression of type ``int``, but only if it's a literal. Parameter constraints --------------------- The `parameter constraint`:idx: expression can use the operators ``|`` (or), ``&`` (and) and ``~`` (not) and the following predicates: =================== ===================================================== Predicate Meaning =================== ===================================================== ``atom`` The matching node has no children. ``lit`` The matching node is a literal like "abc", 12. ``sym`` The matching node must be a symbol (a bound identifier). ``ident`` The matching node must be an identifier (an unbound identifier). ``call`` The matching AST must be a call/apply expression. ``lvalue`` The matching AST must be an lvalue. ``sideeffect`` The matching AST must have a side effect. ``nosideeffect`` The matching AST must have no side effect. ``param`` A symbol which is a parameter. ``genericparam`` A symbol which is a generic parameter. ``module`` A symbol which is a module. ``type`` A symbol which is a type. ``var`` A symbol which is a variable. ``let`` A symbol which is a ``let`` variable. ``const`` A symbol which is a constant. ``result`` The special ``result`` variable. ``proc`` A symbol which is a proc. ``method`` A symbol which is a method. ``iterator`` A symbol which is an iterator. ``converter`` A symbol which is a converter. ``macro`` A symbol which is a macro. ``template`` A symbol which is a template. ``field`` A symbol which is a field in a tuple or an object. ``enumfield`` A symbol which is a field in an enumeration. ``forvar`` A for loop variable. ``label`` A label (used in ``block`` statements). ``nk*`` The matching AST must have the specified kind. (Example: ``nkIfStmt`` denotes an ``if`` statement.) ``alias`` States that the marked parameter needs to alias with *some* other parameter. ``noalias`` States that *every* other parameter must not alias with the marked parameter. =================== ===================================================== Predicates that share their name with a keyword have to be escaped with backticks: `` `const` ``. The ``alias`` and ``noalias`` predicates refer not only to the matching AST, but also to every other bound parameter; syntactically they need to occur after the ordinary AST predicates: .. code-block:: nim template ex{a = b + c}(a: int{noalias}, b, c: int) = # this transformation is only valid if 'b' and 'c' do not alias 'a': a = b inc a, c Pattern operators ----------------- The operators ``*``, ``**``, ``|``, ``~`` have a special meaning in patterns if they are written in infix notation. The ``|`` operator ~~~~~~~~~~~~~~~~~~ The ``|`` operator if used as infix operator creates an ordered choice: .. code-block:: nim template t{0|1}(): untyped = 3 let a = 1 # outputs 3: echo a The matching is performed after the compiler performed some optimizations like constant folding, so the following does not work: .. code-block:: nim template t{0|1}(): untyped = 3 # outputs 1: echo 1 The reason is that the compiler already transformed the 1 into "1" for the ``echo`` statement. However, a term rewriting macro should not change the semantics anyway. In fact they can be deactivated with the ``--patterns:off`` command line option or temporarily with the ``patterns`` pragma. The ``{}`` operator ~~~~~~~~~~~~~~~~~~~ A pattern expression can be bound to a pattern parameter via the ``expr{param}`` notation: .. code-block:: nim template t{(0|1|2){x}}(x: untyped): untyped = x+1 let a = 1 # outputs 2: echo a The ``~`` operator ~~~~~~~~~~~~~~~~~~ The ``~`` operator is the **not** operator in patterns: .. code-block:: nim template t{x = (~x){y} and (~x){z}}(x, y, z: bool) = x = y if x: x = z var a = false b = true c = false a = b and c echo a The ``*`` operator ~~~~~~~~~~~~~~~~~~ The ``*`` operator can *flatten* a nested binary expression like ``a & b & c`` to ``&(a, b, c)``: .. code-block:: nim var calls = 0 proc `&&`(s: varargs[string]): string = result = s[0] for i in 1..len(s)-1: result.add s[i] inc calls template optConc{ `&&` * a }(a: string): untyped = &&a let space = " " echo "my" && (space & "awe" && "some " ) && "concat" # check that it's been optimized properly: doAssert calls == 1 The second operator of `*` must be a parameter; it is used to gather all the arguments. The expression ``"my" && (space & "awe" && "some " ) && "concat"`` is passed to ``optConc`` in ``a`` as a special list (of kind ``nkArgList``) which is flattened into a call expression; thus the invocation of ``optConc`` produces: .. code-block:: nim `&&`("my", space & "awe", "some ", "concat") The ``**`` operator ~~~~~~~~~~~~~~~~~~~ The ``**`` is much like the ``*`` operator, except that it gathers not only all the arguments, but also the matched operators in reverse polish notation: .. code-block:: nim import macros type Matrix = object dummy: int proc `*`(a, b: Matrix): Matrix = discard proc `+`(a, b: Matrix): Matrix = discard proc `-`(a, b: Matrix): Matrix = discard proc `$`(a: Matrix): string = result = $a.dummy proc mat21(): Matrix = result.dummy = 21 macro optM{ (`+`|`-`|`*`) ** a }(a: Matrix): untyped = echo treeRepr(a) result = newCall(bindSym"mat21") var x, y, z: Matrix echo x + y * z - x This passes the expression ``x + y * z - x`` to the ``optM`` macro as an ``nnkArgList`` node containing:: Arglist Sym "x" Sym "y" Sym "z" Sym "*" Sym "+" Sym "x" Sym "-" (Which is the reverse polish notation of ``x + y * z - x``.) Parameters ---------- Parameters in a pattern are type checked in the matching process. If a parameter is of the type ``varargs`` it is treated specially and it can match 0 or more arguments in the AST to be matched against: .. code-block:: nim template optWrite{ write(f, x) ((write|writeLine){w})(f, y) }(x, y: varargs[untyped], f: File, w: untyped) = w(f, x, y) Example: Partial evaluation --------------------------- The following example shows how some simple partial evaluation can be implemented with term rewriting: .. code-block:: nim proc p(x, y: int; cond: bool): int = result = if cond: x + y else: x - y template optP1{p(x, y, true)}(x, y: untyped): untyped = x + y template optP2{p(x, y, false)}(x, y: untyped): untyped = x - y Example: Hoisting ----------------- The following example shows how some form of hoisting can be implemented: .. code-block:: nim import pegs template optPeg{peg(pattern)}(pattern: string{lit}): Peg = var gl {.global, gensym.} = peg(pattern) gl for i in 0 .. 3: echo match("(a b c)", peg"'(' @ ')'") echo match("W_HI_Le", peg"\y 'while'") The ``optPeg`` template optimizes the case of a peg constructor with a string literal, so that the pattern will only be parsed once at program startup and stored in a global ``gl`` which is then re-used. This optimization is called hoisting because it is comparable to classical loop hoisting. AST based overloading ===================== Parameter constraints can also be used for ordinary routine parameters; these constraints affect ordinary overloading resolution then: .. code-block:: nim proc optLit(a: string{lit|`const`}) = echo "string literal" proc optLit(a: string) = echo "no string literal" const constant = "abc" var variable = "xyz" optLit("literal") optLit(constant) optLit(variable) However, the constraints ``alias`` and ``noalias`` are not available in ordinary routines. Move optimization ----------------- The ``call`` constraint is particularly useful to implement a move optimization for types that have copying semantics: .. code-block:: nim proc `[]=`*(t: var Table, key: string, val: string) = ## puts a (key, value)-pair into `t`. The semantics of string require ## a copy here: let idx = findInsertionPosition(key) t[idx].key = key t[idx].val = val proc `[]=`*(t: var Table, key: string{call}, val: string{call}) = ## puts a (key, value)-pair into `t`. Optimized version that knows that ## the strings are unique and thus don't need to be copied: let idx = findInsertionPosition(key) shallowCopy t[idx].key, key shallowCopy t[idx].val, val var t: Table # overloading resolution ensures that the optimized []= is called here: t[f()] = g()