Procedures
==========

What most programming languages call `methods`:idx: or `functions`:idx: are
called `procedures`:idx: in Nim (which is the correct terminology). A
procedure declaration consists of an identifier, zero or more formal
parameters, a return value type and a block of code. Formal parameters are
declared as a list of identifiers separated by either comma or semicolon.
A parameter is given a type by ``: typename``. The type applies to all
parameters immediately before it, until either the beginning of the parameter
list, a semicolon separator or an already typed parameter is reached. The
semicolon can be used to make the separation of types and subsequent
identifiers more distinct.

.. code-block:: nim

  # Using only commas
  proc foo(a, b: int, c, d: bool): int

  # Using semicolon for visual distinction
  proc foo(a, b: int; c, d: bool): int

  # Will fail: a is untyped since ';' stops type propagation.
  proc foo(a; b: int; c, d: bool): int

A parameter may be declared with a default value which is used if the caller
does not provide a value for the argument.

.. code-block:: nim

  # b is optional with 47 as its default value
  proc foo(a: int, b: int = 47): int

Parameters can be declared mutable, which allows the proc to modify those
arguments, by using the type modifier ``var``.

.. code-block:: nim

  # "returning" a value to the caller through the 2nd argument
  # Notice that the function uses no actual return value at all (i.e. void)
  proc foo(inp: int, outp: var int) =
    outp = inp + 47

If the proc declaration has no body, it is a `forward`:idx: declaration. If the
proc returns a value, the procedure body can access an implicitly declared
variable named `result`:idx: that represents the return value. Procs can be
overloaded. The overloading resolution algorithm determines which proc is the
best match for the arguments. Example:

.. code-block:: nim

  proc toLower(c: char): char = # toLower for characters
    if c in {'A'..'Z'}:
      result = chr(ord(c) + (ord('a') - ord('A')))
    else:
      result = c

  proc toLower(s: string): string = # toLower for strings
    result = newString(len(s))
    for i in 0..len(s) - 1:
      result[i] = toLower(s[i]) # calls toLower for characters; no recursion!

Calling a procedure can be done in many different ways:

.. code-block:: nim

  proc callme(x, y: int, s: string = "", c: char, b: bool = false) = ...

  # call with positional arguments      # parameter bindings:
  callme(0, 1, "abc", '\t', true)       # (x=0, y=1, s="abc", c='\t', b=true)
  # call with named and positional arguments:
  callme(y=1, x=0, "abd", '\t')         # (x=0, y=1, s="abd", c='\t', b=false)
  # call with named arguments (order is not relevant):
  callme(c='\t', y=1, x=0)              # (x=0, y=1, s="", c='\t', b=false)
  # call as a command statement: no () needed:
  callme 0, 1, "abc", '\t'              # (x=0, y=1, s="abc", c='\t', b=false)

A procedure may call itself recursively.

`Operators`:idx: are procedures with a special operator symbol as identifier:

.. code-block:: nim

  proc `$` (x: int): string =
    # converts an integer to a string; this is a prefix operator.
    result = intToStr(x)

Operators with one parameter are prefix operators, operators with two
parameters are infix operators. (However, the parser distinguishes these from
the operator's position within an expression.) There is no way to declare
postfix operators: all postfix operators are built-in and handled by the
grammar explicitly.

Any operator can be called like an ordinary proc with the '`opr`'
notation. (Thus an operator can have more than two parameters):

.. code-block:: nim

  proc `*+` (a, b, c: int): int =
    # Multiply and add
    result = a * b + c

  assert `*+`(3, 4, 6) == `+`(`*`(3, 4), 6)


Export marker
-------------

If a declared symbol is marked with an `asterisk`:idx: it is exported from the
current module:

.. code-block:: nim

  proc exportedEcho*(s: string) = echo s
  proc `*`*(a: string; b: int): string =
    result = newStringOfCap(a.len * b)
    for i in 1..b: result.add a

  var exportedVar*: int
  const exportedConst* = 78
  type
    ExportedType* = object
      exportedField*: int


Method call syntax
------------------

For object oriented programming, the syntax ``obj.method(args)`` can be used
instead of ``method(obj, args)``. The parentheses can be omitted if there are no
remaining arguments: ``obj.len`` (instead of ``len(obj)``).

This method call syntax is not restricted to objects; it can be used
to supply any type of first argument for procedures:

.. code-block:: nim

  echo "abc".len # is the same as echo len "abc"
  echo "abc".toUpper()
  echo {'a', 'b', 'c'}.card
  stdout.writeLine("Hallo") # the same as writeLine(stdout, "Hallo")

Another way to look at the method call syntax is that it provides the missing
postfix notation.

The method call syntax conflicts with explicit generic instantiations:
``p[T](x)`` cannot be written as ``x.p[T]`` because ``x.p[T]`` is always
parsed as ``(x.p)[T]``.

**Future directions**: ``p[.T.]`` might be introduced as an alternative syntax
to pass explicit types to a generic and then ``x.p[.T.]`` can be parsed as
``x.(p[.T.])``.

See also: `Limitations of the method call syntax`_.


Properties
----------

Nim has no need for *get-properties*: Ordinary get-procedures that are called
with the *method call syntax* achieve the same. But setting a value is
different; for this a special setter syntax is needed:

.. code-block:: nim

  type
    Socket* = ref object of RootObj
      FHost: int # cannot be accessed from the outside of the module
                 # the `F` prefix is a convention to avoid clashes since
                 # the accessors are named `host`

  proc `host=`*(s: var Socket, value: int) {.inline.} =
    ## setter of hostAddr
    s.FHost = value

  proc host*(s: Socket): int {.inline.} =
    ## getter of hostAddr
    s.FHost

  var s: Socket
  new s
  s.host = 34  # same as `host=`(s, 34)


Command invocation syntax
-------------------------

Routines can be invoked without the ``()`` if the call is syntactically
a statement. This command invocation syntax also works for
expressions, but then only a single argument may follow. This restriction
means ``echo f 1, f 2`` is parsed as ``echo(f(1), f(2))`` and not as
``echo(f(1, f(2)))``. The method call syntax may be used to provide one
more argument in this case:

.. code-block:: nim

  proc optarg(x: int, y: int = 0): int = x + y
  proc singlearg(x: int): int = 20*x

  echo optarg 1, " ", singlearg 2  # prints "1 40"

  let fail = optarg 1, optarg 8   # Wrong. Too many arguments for a command call
  let x = optarg(1, optarg 8)  # traditional procedure call with 2 arguments
  let y = 1.optarg optarg 8    # same thing as above, w/o the parentheses
  assert x == y

The command invocation syntax also cannot take complex expressions as
arguments, such as `anonymous procs`_, ``if``, ``case`` or ``try``. The
`do notation`_ is limited, but usable for a single proc (see the example in
the corresponding section). Function calls with no arguments still need ``()``
to distinguish between a call and the function itself as a first class value.


Closures
--------

Procedures can appear at the top level in a module as well as inside other
scopes, in which case they are called nested procs. A nested proc can access
local variables from its enclosing scope and if it does so it becomes a
closure.
Any captured variables are stored in a hidden additional argument
to the closure (its environment) and they are accessed by reference by both
the closure and its enclosing scope (i.e. any modifications made to them are
visible in both places). The closure environment may be allocated on the heap
or on the stack if the compiler determines that this would be safe.

Creating closures in loops
~~~~~~~~~~~~~~~~~~~~~~~~~~

Since closures capture local variables by reference, the resulting behavior is
often not what is wanted inside loop bodies. See `closureScope`_ for details
on how to change this behavior.


Anonymous Procs
---------------

Procs can also be treated as expressions, in which case it's allowed to omit
the proc's name.

.. code-block:: nim

  import algorithm  # provides sort

  var cities = @["Frankfurt", "Tokyo", "New York", "Kyiv"]

  cities.sort(proc (x,y: string): int =
      cmp(x.len, y.len))

Procs as expressions can appear both as nested procs and inside top level
executable code.


Do notation
-----------

As a special more convenient notation, proc expressions involved in procedure
calls can use the ``do`` keyword:

.. code-block:: nim

  sort(cities) do (x,y: string) -> int:
    cmp(x.len, y.len)

  # Fewer parentheses using the method plus command syntax
  # (map comes from the sequtils module):
  cities = cities.map do (x:string) -> string:
    "City of " & x

``do`` is written after the parentheses enclosing the regular proc params.
The proc expression represented by the do block is appended to them.

``do`` with parentheses is an anonymous ``proc``; however a ``do`` without
parentheses is just a block of code. The ``do`` notation can be used to
pass multiple blocks to a macro:

.. code-block:: nim

  macro performWithUndo(task, undo: untyped) = ...

  performWithUndo do:
    # multiple-line block of code
    # to perform the task
  do:
    # code to undo it


Nonoverloadable builtins
------------------------

The following builtin procs cannot be overloaded for reasons of implementation
simplicity (they require specialized semantic checking)::

  declared, defined, definedInScope, compiles, low, high, sizeOf,
  is, of, shallowCopy, getAst, astToStr, spawn, procCall

Thus they act more like keywords than like ordinary identifiers; unlike a
keyword however, a redefinition may `shadow`:idx: the definition in
the ``system`` module. From this list the following should not be written in
dot notation ``x.f`` since ``x`` cannot be type checked before it gets passed
to ``f``::

  declared, defined, definedInScope, compiles, getAst, astToStr


Var parameters
--------------

The type of a parameter may be prefixed with the ``var`` keyword:

.. code-block:: nim

  proc divmod(a, b: int; res, remainder: var int) =
    res = a div b
    remainder = a mod b

  var
    x, y: int

  divmod(8, 5, x, y) # modifies x and y
  assert x == 1
  assert y == 3

In the example, ``res`` and ``remainder`` are `var parameters`.
Var parameters can be modified by the procedure and the changes are
visible to the caller. The argument passed to a var parameter has to be
an l-value. Var parameters are implemented as hidden pointers. The
above example is equivalent to:

.. code-block:: nim

  proc divmod(a, b: int; res, remainder: ptr int) =
    res[] = a div b
    remainder[] = a mod b

  var
    x, y: int
  divmod(8, 5, addr(x), addr(y))
  assert x == 1
  assert y == 3

In the examples, var parameters or pointers are used to provide two
return values. This can be done in a cleaner way by returning a tuple:

.. code-block:: nim

  proc divmod(a, b: int): tuple[res, remainder: int] =
    (a div b, a mod b)

  var t = divmod(8, 5)

  assert t.res == 1
  assert t.remainder == 3

One can use `tuple unpacking`:idx: to access the tuple's fields:

.. code-block:: nim

  var (x, y) = divmod(8, 5) # tuple unpacking
  assert x == 1
  assert y == 3

**Note**: ``var`` parameters are never necessary for efficient parameter
passing. Since non-var parameters cannot be modified, the compiler is always
free to pass arguments by reference if it considers that this can speed up
execution.


Var return type
---------------

A proc, converter or iterator may return a ``var`` type which means that the
returned value is an l-value and can be modified by the caller:

.. code-block:: nim

  var g = 0

  proc WriteAccessToG(): var int =
    result = g

  WriteAccessToG() = 6
  assert g == 6

It is a compile time error if the implicitly introduced pointer could be
used to access a location beyond its lifetime:

.. code-block:: nim

  proc WriteAccessToG(): var int =
    var g = 0
    result = g # Error!

For iterators, a component of a tuple return type can have a ``var`` type too:

.. code-block:: nim

  iterator mpairs(a: var seq[string]): tuple[key: int, val: var string] =
    for i in 0..a.high:
      yield (i, a[i])

In the standard library every name of a routine that returns a ``var`` type
starts with the prefix ``m`` per convention.


Overloading of the subscript operator
-------------------------------------

The ``[]`` subscript operator for arrays/openarrays/sequences can be overloaded.


Multi-methods
=============

Procedures always use static dispatch. Multi-methods use dynamic
dispatch. For dynamic dispatch to work on an object it should be a reference
type as well.

.. code-block:: nim

  type
    Expression = ref object of RootObj ## abstract base class for an expression
    Literal = ref object of Expression
      x: int
    PlusExpr = ref object of Expression
      a, b: Expression

  method eval(e: Expression): int {.base.} =
    # override this base method
    quit "to override!"

  method eval(e: Literal): int = return e.x

  method eval(e: PlusExpr): int =
    # watch out: relies on dynamic binding
    result = eval(e.a) + eval(e.b)

  proc newLit(x: int): Literal =
    new(result)
    result.x = x

  proc newPlus(a, b: Expression): PlusExpr =
    new(result)
    result.a = a
    result.b = b

  echo eval(newPlus(newPlus(newLit(1), newLit(2)), newLit(4)))

In the example the constructors ``newLit`` and ``newPlus`` are procs
because they should use static binding, but ``eval`` is a method because it
requires dynamic binding.

As can be seen in the example, base methods have to be annotated with
the `base`:idx: pragma. The ``base`` pragma also acts as a reminder for the
programmer that a base method ``m`` is used as the foundation to determine all
the effects that a call to ``m`` might cause.

In a multi-method all parameters that have an object type are used for the
dispatching:

.. code-block:: nim

  type
    Thing = ref object of RootObj
    Unit = ref object of Thing
      x: int

  method collide(a, b: Thing) {.base, inline.} =
    quit "to override!"

  method collide(a: Thing, b: Unit) {.inline.} =
    echo "1"

  method collide(a: Unit, b: Thing) {.inline.} =
    echo "2"

  var a, b: Unit
  new a
  new b
  collide(a, b) # output: 2

Invocation of a multi-method cannot be ambiguous: collide 2 is preferred over
collide 1 because the resolution works from left to right.
In the example ``Unit, Thing`` is preferred over ``Thing, Unit``.

**Performance note**: Nim does not produce a virtual method table, but
generates dispatch trees. This avoids the expensive indirect branch for method
calls and enables inlining. However, other optimizations like compile time
evaluation or dead code elimination do not work with methods.


Iterators and the for statement
===============================

The `for`:idx: statement is an abstract mechanism to iterate over the elements
of a container. It relies on an `iterator`:idx: to do so.
Like ``while`` statements, ``for`` statements open an `implicit block`:idx:,
so that they can be left with a ``break`` statement.

The ``for`` loop declares iteration variables; their scope reaches until the
end of the loop body. The iteration variables' types are inferred from the
return type of the iterator.

An iterator is similar to a procedure, except that it can be called in the
context of a ``for`` loop. Iterators provide a way to specify the iteration over
an abstract type. The ``yield`` statement in the called iterator plays a key
role in the execution of a ``for`` loop. Whenever a ``yield`` statement is
reached the data is bound to the ``for`` loop variables and control continues
in the body of the ``for`` loop. The iterator's local variables and execution
state are automatically saved between calls. Example:

.. code-block:: nim

  # this definition exists in the system module
  iterator items*(a: string): char {.inline.} =
    var i = 0
    while i < len(a):
      yield a[i]
      inc(i)

  for ch in items("hello world"): # `ch` is an iteration variable
    echo ch

The compiler generates code as if the programmer had written this:

.. code-block:: nim

  var i = 0
  while i < len(a):
    var ch = a[i]
    echo ch
    inc(i)

If the iterator yields a tuple, there can be as many iteration variables
as there are components in the tuple. The i'th iteration variable's type is
the type of the i'th component. In other words, implicit tuple unpacking in a
for loop context is supported.

Implicit items/pairs invocations
--------------------------------

If the for loop expression ``e`` does not denote an iterator and the for loop
has exactly 1 variable, the for loop expression is rewritten to ``items(e)``;
i.e. an ``items`` iterator is implicitly invoked:

.. code-block:: nim

  for x in [1,2,3]: echo x

If the for loop has exactly 2 variables, a ``pairs`` iterator is implicitly
invoked.
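
As a minimal sketch of the rewriting rule described above, assume a
hypothetical ``Countdown`` type (it is not part of the standard library and
exists only for this illustration) that defines its own ``items`` and ``pairs``
iterators. The names ``seen`` and ``indexed`` are likewise made up for the
example:

.. code-block:: nim

  # Hypothetical container type, used only for illustration.
  type Countdown = object
    start: int

  iterator items(c: Countdown): int =
    # yields start, start-1, ..., 1
    var i = c.start
    while i > 0:
      yield i
      dec i

  iterator pairs(c: Countdown): tuple[key: int, val: int] =
    # yields (position, value) tuples for the same sequence
    var idx = 0
    var i = c.start
    while i > 0:
      yield (idx, i)
      inc idx
      dec i

  var seen: seq[int] = @[]
  for x in Countdown(start: 3):       # one variable: rewritten to items(...)
    seen.add x

  var indexed: seq[string] = @[]
  for i, x in Countdown(start: 3):    # two variables: rewritten to pairs(...)
    indexed.add($i & ":" & $x)

Neither loop names an iterator explicitly; the number of loop variables alone
selects between ``items`` and ``pairs``.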

Symbol lookup of the identifiers ``items``/``pairs`` is performed after
the rewriting step, so that all overloads of ``items``/``pairs`` are taken
into account.


First class iterators
---------------------

There are 2 kinds of iterators in Nim: *inline* and *closure* iterators.
An `inline iterator`:idx: is an iterator that's always inlined by the compiler
leading to zero overhead for the abstraction, but may result in a heavy
increase in code size. Inline iterators are second class citizens;
they can be passed as parameters only to other inlining code facilities like
templates, macros and other inline iterators.

In contrast to that, a `closure iterator`:idx: can be passed around more freely:

.. code-block:: nim

  iterator count0(): int {.closure.} =
    yield 0

  iterator count2(): int {.closure.} =
    var x = 1
    yield x
    inc x
    yield x

  proc invoke(iter: iterator(): int {.closure.}) =
    for x in iter(): echo x

  invoke(count0)
  invoke(count2)

Closure iterators have other restrictions than inline iterators:

1. ``yield`` in a closure iterator cannot occur in a ``try`` statement.
2. For now, a closure iterator cannot be evaluated at compile time.
3. ``return`` is allowed in a closure iterator (but rarely useful) and ends
   iteration.
4. Neither inline nor closure iterators can be recursive.

Iterators that are neither marked ``{.closure.}`` nor ``{.inline.}`` explicitly
default to being inline, but this may change in future versions of the
implementation.

The ``iterator`` type is always of the calling convention ``closure``
implicitly; the following example shows how to use iterators to implement
a `collaborative tasking`:idx: system:

.. code-block:: nim

  # simple tasking:
  type
    Task = iterator (ticker: int)

  iterator a1(ticker: int) {.closure.} =
    echo "a1: A"
    yield
    echo "a1: B"
    yield
    echo "a1: C"
    yield
    echo "a1: D"

  iterator a2(ticker: int) {.closure.} =
    echo "a2: A"
    yield
    echo "a2: B"
    yield
    echo "a2: C"

  proc runTasks(t: varargs[Task]) =
    var ticker = 0
    while true:
      let x = t[ticker mod t.len]
      if finished(x): break
      x(ticker)
      inc ticker

  runTasks(a1, a2)

The builtin ``system.finished`` can be used to determine if an iterator has
finished its operation; no exception is raised on an attempt to invoke an
iterator that has already finished its work.

Note that ``system.finished`` is error prone to use because it only returns
``true`` one iteration after the iterator has finished:

.. code-block:: nim

  iterator mycount(a, b: int): int {.closure.} =
    var x = a
    while x <= b:
      yield x
      inc x

  var c = mycount # instantiate the iterator
  while not finished(c):
    echo c(1, 3)

  # Produces
  # 1
  # 2
  # 3
  # 0

Instead this code has to be used:

.. code-block:: nim

  var c = mycount # instantiate the iterator
  while true:
    let value = c(1, 3)
    if finished(c): break # and discard 'value'!
    echo value

It helps to think that the iterator actually returns a
pair ``(value, done)`` and ``finished`` is used to access the hidden ``done``
field.

Closure iterators are *resumable functions* and so one has to provide the
arguments to every call. To get around this limitation one can capture
parameters of an outer factory proc:

.. code-block:: nim

  proc mycount(a, b: int): iterator (): int =
    result = iterator (): int =
      var x = a
      while x <= b:
        yield x
        inc x

  let foo = mycount(1, 4)

  for f in foo():
    echo f

..
  Implicit return type
  --------------------

  Since inline iterators must always produce values that will be consumed in
  a for loop, the compiler will implicitly use the ``auto`` return type if no
  type is given by the user. In contrast, since closure iterators can be used
  as a collaborative tasking system, ``void`` is a valid return type for them.
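
The factory-proc pattern and the ``finished`` check described in this section
can be combined. The following is only a sketch: ``countTo`` and ``drain`` are
hypothetical names introduced for this example. Each call to the factory
produces an independent iterator instance with its own captured parameter:

.. code-block:: nim

  # Hypothetical factory: returns a fresh closure iterator counting 1..n.
  proc countTo(n: int): iterator (): int =
    result = iterator (): int =
      var i = 1
      while i <= n:
        yield i
        inc i

  # Hypothetical helper: runs an iterator to completion and collects the
  # values, checking `finished` *after* each call as recommended above.
  proc drain(it: iterator (): int): seq[int] =
    result = @[]
    while true:
      let value = it()
      if finished(it): break   # the last call produced no real value
      result.add value

  let firstRun = drain(countTo(3))
  let secondRun = drain(countTo(2))   # independent of the first instance

Because the arguments are captured by the factory, the calls inside ``drain``
do not need to repeat them.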

Converters
==========

A converter is like an ordinary proc except that it enhances
the "implicitly convertible" type relation (see `Convertible relation`_):

.. code-block:: nim

  # bad style ahead: Nim is not C.
  converter toBool(x: int): bool = x != 0

  if 4:
    echo "compiles"

A converter can also be explicitly invoked for improved readability. Note that
implicit converter chaining is not supported: If there is a converter from
type A to type B and from type B to type C the implicit conversion from A to C
is not provided.
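
To illustrate explicit invocation and the lack of chaining, here is a small
sketch. The types ``Fahrenheit``, ``Celsius`` and ``Kelvin`` and all proc names
are hypothetical and exist only for this example; the builtin ``compiles`` is
used to check that the chained call is rejected:

.. code-block:: nim

  type
    Fahrenheit = object   # hypothetical types, for illustration only
      deg: float
    Celsius = object
      deg: float
    Kelvin = object
      deg: float

  converter toCelsius(f: Fahrenheit): Celsius =
    Celsius(deg: (f.deg - 32.0) / 1.8)

  converter toKelvin(c: Celsius): Kelvin =
    Kelvin(deg: c.deg + 273.15)

  proc degrees(k: Kelvin): float = k.deg

  let c = Celsius(deg: 0.0)
  let f = Fahrenheit(deg: 32.0)

  let implicitCall = degrees(c)            # toKelvin is applied implicitly
  let explicitCall = degrees(toKelvin(c))  # same conversion, spelled out

  # Fahrenheit -> Celsius -> Kelvin would need two implicit steps,
  # so the direct call does not compile ...
  let chainRejected = not compiles(degrees(f))
  # ... and each step has to be written explicitly instead.
  let chained = degrees(toKelvin(toCelsius(f)))

Writing the converter calls out, as in the last line, is also the more
readable form when more than one conversion is involved.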