D - Macros
We’ve used macros, such as println!
, throughout this book. This appendix will
explain:
- What macros are and how they differ from functions
- How to define a declarative macro to do metaprogramming
- How to define a procedural macro to create custom
derive
traits
Macros are covered in an appendix because they’re still evolving. They have changed and will change more than the rest of the language and standard library since Rust 1.0, so this section will likely get out of date more than the rest of this book. The code shown here will still continue to work due to Rust’s stability guarantees, but there may be additional capabilities or easier ways to write macros that aren’t available at the time of this publication.
Macros are More Flexible and Complex than Functions
Fundamentally, macros are a way of writing code that writes other code, which
is known as metaprogramming. In the previous appendix, we discussed the
derive
attribute, which generates an implementation of various traits for
you. We’ve also used the println!
and vec!
macros. All of these macros
expand to produce more code than what you’ve written in your source code.
Metaprogramming is useful to reduce the amount of code you have to write and
maintain, which is also one of the roles of functions. However, macros have
some additional powers that functions don’t have, as we discussed in Chapter 1.
A function signature has to declare the number and type of parameters the
function has. Macros can take a variable number of parameters: we can call
println!("hello")
with one argument, or println!("hello {}", name)
with two
arguments. Also, macros are expanded before the compiler interprets the meaning
of the code, so a macro can, for example, implement a trait on a given type,
whereas a function can’t because a function gets called at runtime and a trait
needs to be implemented at compile time.
The downside to implementing a macro rather than a function is that macro definitions are more complex than function definitions. You’re writing Rust code that writes Rust code, and macro definitions are generally more difficult to read, understand, and maintain than function definitions.
Another difference between macros and functions is that macro definitions
aren’t namespaced within modules like function definitions are. In order to
prevent unexpected name clashes when using a crate, when bringing an external
crate into the scope of your project, you have to explicitly bring the macros
into the scope of your project as well with the #[macro_use]
annotation. This
example would bring all the macros defined in the serde
crate into the scope
of the current crate:
#[macro_use]
extern crate serde;
If extern crate
also brought macros into scope by default, you wouldn’t be
allowed to use two crates that happened to define macros with the same name. In
practice this conflict doesn’t come up much, but the more crates you use, the
more likely it is.
One last important difference between macros and functions: macros must be defined or brought into scope before they’re called in a file. Unlike functions, where we can define a function at the bottom of a file yet call it at the top, we always have to define macros before we’re able to call them.
Declarative Macros with macro_rules!
for General Metaprogramming
The first form of macros in Rust, and the one that’s most widely used, is
called declarative macros. These are also sometimes referred to as macros by
example, macro_rules!
macros, or just plain macros. At their core,
declarative macros allow you to write something similar to a Rust match
expression. As discussed in Chapter 6, match
expressions are control
structures that take an expression, compare the resulting value of the
expression to patterns, and then choose the code specified with the matching
pattern when the program runs. Macros also have a value that is compared to
patterns that have code associated with them, but the value is the literal Rust
code passed to the macro, the patterns match the structure of that source code,
and the code associated with each pattern is the code that is generated to
replace the code passed to the macro. This all happens during compilation.
To define a macro, you use the macro_rules!
construct. Let’s explore how to
use macro_rules!
by taking a look at how the vec!
macro is defined. Chapter
8 covered how we can use the vec!
macro to create a new vector that holds
particular values. For example, this macro creates a new vector with three
integers inside:
# #![allow(unused_variables)] #fn main() { let v: Vec<u32> = vec![1, 2, 3]; #}
We can also use vec!
to make a vector of two integers or a vector of five
string slices. Because we don’t know the number or type of values, we can’t
define a function that is able to create a new vector with the given elements
like vec!
can.
Let’s take a look at a slightly simplified definition of the vec!
macro:
# #![allow(unused_variables)] #fn main() { #[macro_export] macro_rules! vec { ( $( $x:expr ),* ) => { { let mut temp_vec = Vec::new(); $( temp_vec.push($x); )* temp_vec } }; } #}
Note: the actual definition of the
vec!
macro in the standard library also has code to pre-allocate the correct amount of memory up-front. That code is an optimization that we’ve chosen not to include here for simplicity.
The #[macro_export]
annotation indicates that this macro should be made
available when other crates import the crate in which we’re defining this
macro. Without this annotation, even if someone depending on this crate uses
the #[macro_use]
annotation, this macro would not be brought into scope.
Macro definitions start with macro_rules!
and the name of the macro we’re
defining without the exclamation mark, which in this case is vec
. This is
followed by curly brackets denoting the body of the macro definition.
Inside the body is a structure similar to the structure of a match
expression. This macro definition has one arm with the pattern ( $( $x:expr ),* )
, followed by =>
and the block of code associated with this pattern. If
this pattern matches, then the block of code will be emitted. Given that this
is the only pattern in this macro, there’s only one valid way to match; any
other will be an error. More complex macros will have more than one arm.
The pattern syntax valid in macro definitions is different than the pattern syntax covered in Chapter 18 because the patterns are for matching against Rust code structure rather than values. Let’s walk through what the pieces of the pattern used here mean; for the full macro pattern syntax, see the reference.
The $x:expr
part of the pattern matches any Rust expression and gives the
expression the name $x
. The *
specifies that the pattern matches zero or
more of whatever precedes the *
. In this case, *
is preceded by $(),
so
this pattern matches zero or more of whatever is inside the parentheses,
delimited by a comma. When we call this macro with vec![1, 2, 3];
, the
pattern matches the three expressions 1
, 2
, and 3
.
In the body of the code associated with this arm, the $()*
part is generated
for each part that matches $()
in the pattern, zero or more times depending
on how many times the pattern matches. The $x
in the code associated with the
arm is replaced with each expression matched. When we call this macro with
vec![1, 2, 3];
, the code generated that replaces this macro call will be:
let mut temp_vec = Vec::new();
temp_vec.push(1);
temp_vec.push(2);
temp_vec.push(3);
temp_vec
We’ve defined a macro that can take any number of arguments of any type and can generate code to create a vector containing the specified elements.
Given that most Rust programmers will use macros more than write macros,
that’s all we’ll discuss about macro_rules!
in this book. To learn more about
how to write macros, consult the online documentation or other resources such
as The Little Book of Rust Macros.
Procedural Macros for Custom derive
The second form of macros is called procedural macros because they’re more
like functions (which are a type of procedure). Procedural macros accept some
Rust code as an input, operate on that code, and produce some Rust code as an
output, rather than matching against patterns and replacing the code with other
code as declarative macros do. Today, the only thing you can define procedural
macros for is to allow your traits to be implemented on a type by specifying
the trait name in a derive
annotation.
Let’s create a crate named hello-world
that defines a trait named
HelloWorld
with one associated function named hello_world
. Rather than
making users of our crate implement the HelloWorld
trait for each of their
types, we’d like users to be able to annotate their type with
#[derive(HelloWorld)]
to get a default implementation of the hello_world
function associated with their type. The default implementation will print
Hello world, my name is TypeName!
where TypeName
is the name of the type on
which this trait has been defined.
In other words, we’re going to write a crate that enables another programmer to write code that looks like Listing A4-1 using our crate:
Filename: src/main.rs
extern crate hello_world;
#[macro_use]
extern crate hello_world_derive;
use hello_world::HelloWorld;
#[derive(HelloWorld)]
struct Pancakes;
fn main() {
Pancakes::hello_world();
}
This code will print Hello world, my name is Pancakes!
when we’re done. Let’s
get started!
Let’s make a new library crate:
$ cargo new hello-world
First, we’ll define the HelloWorld
trait and associated function:
Filename: src/lib.rs
# #![allow(unused_variables)] #fn main() { pub trait HelloWorld { fn hello_world(); } #}
At this point, a user of our crate could implement the trait themselves to achieve the functionality we wanted to enable, like so:
extern crate hello_world;
use hello_world::HelloWorld;
struct Pancakes;
impl HelloWorld for Pancakes {
fn hello_world() {
println!("Hello world, my name is Pancakes!");
}
}
fn main() {
Pancakes::hello_world();
}
However, they would need to write out the implementation block for each type
they wanted to be able to use with hello_world
; we’d like to make using our
trait more convenient for other programmers by saving them this work.
Additionally, we can’t provide a default implementation for the hello_world
function that has the behavior we want of printing out the name of the type the
trait is implemented on: Rust doesn’t have reflection capabilities, so we can’t
look up the type’s name at runtime. We need a macro to generate code at compile
time.
Defining Procedural Macros Requires a Separate Crate
The next step is to define the procedural macro. At the moment, procedural
macros need to be in their own crate. Eventually, this restriction may be
lifted, but for now, it’s required. As such, there’s a convention: for a crate
named foo
, a custom derive procedural macro crate is called foo-derive
.
Let’s start a new crate called hello-world-derive
inside our hello-world
project:
$ cargo new hello-world-derive
We’ve chosen to create the procedural macro crate within the directory of our
hello-world
crate because the two crates are tightly related: if we change
the trait definition in hello-world
, we’ll have to change the implementation
of the procedural macro in hello-world-derive
as well. The two crates will
need to be published separately, and programmers using these crates will need
to add both as dependencies and bring them both into scope. It’s possible to
have the hello-world
crate use hello-world-derive
as a dependency and
re-export the procedural macro code, but structuring the project this way makes
it possible for programmers to easily decide they only want to use
hello-world
if they don’t want the derive
functionality.
We need to declare that the hello-world-derive
crate is a procedural macro
crate. We also need to add dependencies on the syn
and quote
crates to get
useful functionality for operating on Rust code. To do these two things, add
the following to the Cargo.toml for hello-world-derive
:
Filename: hello-world-derive/Cargo.toml
[lib]
proc-macro = true
[dependencies]
syn = "0.11.11"
quote = "0.3.15"
To start defining the procedural macro, place the code from Listing A4-2 in
src/lib.rs for the hello-world-derive
crate. Note that this won’t compile
until we add a definition for the impl_hello_world
function. We’ve split the
code into functions in this way because the code in Listing A4-2 will be the
same for almost every procedural macro crate; it’s code that makes writing a
procedural macro more convenient. What you choose to do in the place where the
impl_hello_world
function is called will be different and depend on the
purpose of your procedural macro.
Filename: hello-world-derive/src/lib.rs
extern crate proc_macro;
extern crate syn;
#[macro_use]
extern crate quote;
use proc_macro::TokenStream;
#[proc_macro_derive(HelloWorld)]
pub fn hello_world_derive(input: TokenStream) -> TokenStream {
// Construct a string representation of the type definition
let s = input.to_string();
// Parse the string representation
let ast = syn::parse_derive_input(&s).unwrap();
// Build the impl
let gen = impl_hello_world(&ast);
// Return the generated impl
gen.parse().unwrap()
}
We have introduced three new crates: proc_macro
, syn
, and quote
. The
proc_macro
crate comes with Rust, so we didn’t need to add that to the
dependencies in Cargo.toml. The proc_macro
crate allows us to convert Rust
code into a string containing that Rust code. The syn
crate parses Rust code
from a string into a data structure that we can perform operations on. The
quote
crate takes syn
data structures and turns them back into Rust code.
These crates make it much simpler to parse any sort of Rust code we might want
to handle: writing a full parser for Rust code is no simple task.
The hello_world_derive
function is the code that will get called when a user
of our library specifies the #[derive(HelloWorld)]
annotation on a type
because we’ve annotated the hello_world_derive
function here with
proc_macro_derive
and specified the same name, HelloWorld
. This name
matches our trait named HelloWorld
; that’s the convention most procedural
macros follow.
The first thing this function does is convert the input
from a TokenStream
to a String
by calling to_string
. This String
is a string representation
of the Rust code for which we are deriving HelloWorld
. In the example in
Listing A4-1, s
will have the String
value struct Pancakes;
because
that’s the Rust code we added the #[derive(HelloWorld)]
annotation to.
At the moment, the only thing you can do with a TokenStream
is convert it to
a string. A richer API will exist in the future.
What we really need is to be able to parse the Rust code String
into a data
structure that we can then interpret and perform operations on. This is where
syn
comes to play. The parse_derive_input
function in syn
takes a
String
and returns a DeriveInput
struct representing the parsed Rust code.
Here’s the relevant parts of the DeriveInput
struct we get from parsing the
string struct Pancakes;
:
DeriveInput {
// --snip--
ident: Ident(
"Pancakes"
),
body: Struct(
Unit
)
}
The fields of this struct show that the Rust code we’ve parsed is a unit struct
with the ident
(identifier, meaning the name) of Pancakes
. There are more
fields on this struct for describing all sorts of Rust code; check the syn
API docs for DeriveInput
for more information.
We haven’t defined the impl_hello_world
function; that’s where we’ll build
the new Rust code we want to include. Before we get to that, the last part of
this hello_world_derive
function is using the quote
crate’s parse
function to turn the output of the impl_hello_world
function back into a
TokenStream
. The returned TokenStream
is added to the code that users of
our crate write so that when they compile their crate, they get extra
functionality we provide.
You may have noticed that we’re calling unwrap
to panic if the calls to the
parse_derive_input
or parse
functions fail because they’re unable to parse
the TokenStream
or generate a TokenStream
. Panicking on errors is necessary
in procedural macro code because proc_macro_derive
functions must return
TokenStream
rather than Result
in order to conform to the procedural macro
API. We’ve chosen to keep this example simple by using unwrap
; in production
code you should provide more specific error messages about what went wrong by
using expect
or panic!
.
Now that we have the code to turn the annotated Rust code from a TokenStream
into a String
and into a DeriveInput
instance, let’s write the code that
will generate the code implementing the HelloWorld
trait on the annotated
type:
Filename: hello-world-derive/src/lib.rs
fn impl_hello_world(ast: &syn::DeriveInput) -> quote::Tokens {
let name = &ast.ident;
quote! {
impl HelloWorld for #name {
fn hello_world() {
println!("Hello, World! My name is {}", stringify!(#name));
}
}
}
}
We are able to get an Ident
struct instance containing the name (identifier)
of the annotated type using ast.ident
. With the code from Listing A4-1,
name
will be Ident("Pancakes")
.
The quote!
macro from the quote
crate lets us write up the Rust code that
we wish to return and convert it into quote::Tokens
. The quote!
macro lets
us use some really cool templating mechanics; we can write #name
and quote!
will replace it with the value in the variable named name
. You can even do
some repetition similar to the way regular macros work. Check out the quote
crate’s docs for a thorough introduction.
What we want to do for our procedural macro is generate an implementation of
our HelloWorld
trait for the type the user of our crate has annotated, which
we can get by using #name
. The trait implementation has one function,
hello_world
, and the function body contains the functionality we want to
provide: printing Hello, World! My name is
and then the name of the type the
user of our crate has annotated. The stringify!
macro used here is built into
Rust. It takes a Rust expression, such as 1 + 2
, and at compile time turns
the expression into a string literal, such as "1 + 2"
. This is different than
format!
or println!
, which evaluate the expression and then turn the result
into a String
. There’s a possibility that #name
would be an expression that
we would want to print out literally, and stringify!
also saves an allocation
by converting #name
to a string literal at compile time.
At this point, cargo build
should complete successfully in both hello-world
and hello-world-derive
. Let’s hook these crates up to the code in Listing
A4-1 to see it in action! Create a new binary project in your projects
directory with cargo new --bin pancakes
. We need to add both hello-world
and hello-world-derive
as dependencies in the pancakes
crate’s
Cargo.toml. If you’ve chosen to publish your versions of hello-world
and
hello-world-derive
to https://crates.io they would be regular dependencies;
if not, you can specify them as path
dependencies as follows:
[dependencies]
hello_world = { path = "../hello-world" }
hello_world_derive = { path = "../hello-world/hello-world-derive" }
Put the code from Listing A4-1 into src/main.rs, and executing cargo run
should print Hello, World! My name is Pancakes
! The implementation of the
HelloWorld
trait from the procedural macro was included without the
pancakes
crate needing to implement it; the #[derive(HelloWorld)]
took care
of adding the trait implementation.
The Future of Macros
In the future, we’ll be expanding both declarative and procedural macros. A
better declarative macro system will be used with the macro
keyword, and
we’ll add more types of procedural macros, for more powerful tasks than only
derive
. These systems are still under development at the time of publication;
please consult the online Rust documentation for the latest information.