Skip to content

C Preprocessor tricks, tips, and idioms

pfultz2 edited this page Jun 21, 2012 · 14 revisions

Basic

Pattern Matching

The ## operator is used to concatenate two tokens into one token. This is provides a very powerful way to do pattern matching. Say we want to write a IIF macro, we could write it like this:

#define IIF(cond) IIF_ ## cond
#define IIF_0(t, f) f
#define IIF_1(t, f) t

However there is one problem with this approach. A subtle side effect of the ## operator is that it inhibits expansion. Heres an example:

#define A() 1
//This correctly expands to true
IIF(1)(true, false) 
// This will however expand to IIF_A()(true, false)
// This is because A() doesn't expand to 1,
// because its inhibited by the ## operator
IIF(A())(true, false) 

The way to work around this is to use another indirection. Sense this is commonly done we can write a macro called CAT that will concatenate without inhibition.

#define CAT(a, ...) PRIMITIVE_CAT(a, __VA_ARGS__)
#define PRIMITIVE_CAT(a, ...) a ## __VA_ARGS__

So now we can write the IIF macro(its called IIF right now, later we will show how to define a more generalized way of defining an IF macro):

#define IIF(c) PRIMITIVE_CAT(IIF_, c)
#define IIF_0(t, ...) __VA_ARGS__
#define IIF_1(t, ...) t

#define A() 1
//This correctly expands to true
IIF(1)(true, false) 
// And this will also now correctly expand to true
IIF(A())(true, false) 

With pattern matching we can define other operations, such as COMPL which takes the complement:

#define COMPL(b) PRIMITIVE_CAT(COMPL_, b)
#define COMPL_0 1
#define COMPL_1 0

We can define increment and decrement operators as macros:

#define INC(x) PRIMITIVE_CAT(INC_, x)
#define INC_0 1
#define INC_1 2
#define INC_2 3
#define INC_3 4
#define INC_4 5
#define INC_5 6
#define INC_6 7
#define INC_7 8
#define INC_8 9
#define INC_9 9

#define DEC(x) PRIMITIVE_CAT(DEC_, x)
#define DEC_0 0
#define DEC_1 0
#define DEC_2 1
#define DEC_3 2
#define DEC_4 3
#define DEC_5 4
#define DEC_6 5
#define DEC_7 6
#define DEC_8 7
#define DEC_9 8

Detection

Detection techniques can be used to detect if the parameter is a certain value or if it is parenthesis. It relies on vardiac arguments expanding to different number of parameters. At the core of detection is a CHECK macro with a PROBE macro like this:

#define CHECK_N(x, n, ...) n
#define CHECK(...) CHECK_N(__VA_ARGS__, 0,)
#define PROBE(x) x, 1,

This is very simple. When the probe is given to the CHECK macro like this:

CHECK(PROBE(~)) // Expands to 1

But if we give it a single token:

CHECK(xxx) // Expands to 0

So with this, we can create some detection macros. For instance, if we want to detect for parenthesis:

#define IS_PAREN(x) IS_PAREN_PROBE x
#define IS_PAREN_PROBE(...) PROBE(~)
IS_PAREN(()) // Expands to 1
IS_PAREN(xxx) // Expands to 0

Now, say we want to write a generalized IF macro that picks false if its 0 otherwise it would pick true. We can use these detection techniques to do that. First, we start by writing a NOT operator which is just like an IS_0 macro:

#define NOT(x) CHECK(PRIMITIVE_CAT(NOT_, x))
#define NOT_0 PROBE(~)

Next we use the COMPL macro that we wrote to inverse this result, and then call IIF:

#define BOOL(x) COMPL(NOT(x))
#define IF(c) IIF(BOOL(c))

Recursion

Unfortunately, macros can't expand recursively. When a macro expands its painted blue, and can no longer expand for that scan. First, there are ways to work around this to keep macros expanding. Secondly, we can detect if a macro is painted blue(because it wont expand) and use this state to expand to a different macro.

Deferred expression

A deferred expression is an expression that requires more scans to fully expand. Heres an example:

#define EMPTY()
#define DEFER(id) id EMPTY()
#define OBSTRUCT(id) id DEFER(EMPTY)()
#define EXPR(...) __VA_ARGS__

#define A() 123
A() // Expands to 123
DEFER(A)() // Expands to A () because it requires one more scan to fully expand
EXPR(DEFER(A)()) // Expands to 123, because the EXPR macro forces another scan

Why is this important? Well when macros are scanned and expanded, they are painted blue, which means they can no longer expand for that one scan; but when the preprocessor starts another scan all these blue painted macros are cleared, which means they can expand again. So what that means for us, is we can create multiple EXPR macros, which will let our deferred expression expand on different preprocessor scans:

#define EXPR_S(s) PRIMITIVE_CAT(EXPR_, s)
#define EXPR_0(...) __VA_ARGS__
#define EXPR_1(...) __VA_ARGS__
#define EXPR_2(...) __VA_ARGS__
#define EXPR_3(...) __VA_ARGS__
#define EXPR_4(...) __VA_ARGS__
#define EXPR_5(...) __VA_ARGS__
#define EXPR_6(...) __VA_ARGS__
#define EXPR_7(...) __VA_ARGS__
#define EXPR_8(...) __VA_ARGS__
#define EXPR_9(...) __VA_ARGS__

We can use this to write a REPEAT macro. We split the macro in two parts in the interface and implementation. This just makes it easier to set up any values that will be used several times, such as incrementing the recursion state and decrementing the repeat counter. Also we pass the OBSTRUCT macro in as the first parameter. This is just to make it easier and cleaner to write a deferred expression in the implementation. So instead of writing DEFER(WHEN)(n), we will just write WHEN _(n).

//This is the interface
//REPEAT_S(s, n, m, data...)
// s    - This is the current recursion state
// n    - This is the number of repeats
// m    - A macro to be called at each repeat: m(s, n, data...)
// data - data is passed to the macro, in addition to the counter and state.
#define REPEAT_S(s, n, m, ...) \
        REPEAT_I(OBSTRUCT(), INC(s), DEC(n), m, __VA_ARGS__)
//This is the implementation
#define REPEAT_INDIRECT() REPEAT_S
#define REPEAT_I(_, s, n, m, ...) \
        WHEN _(n) \
        ( \
            EXPR_S _(s)(REPEAT_INDIRECT _()(s, n, m, __VA_ARGS__) ) \
        ) \
        m _(s, n, __VA_ARGS__)

//An example of using this macro
#define M(s, i, _) i
EXPR_S(0)(REPEAT_S(0, 8, M, ~)) // 0 1 2 3 4 5 6 7

Active arguments

An active argument is an argument that expands each time that it is scanned by the preprocessor. For example:

#define A(n) \
    DEFER(A_INDIRECT)()(INC(n)) \

#define A_INDIRECT() A

#define X(arg) arg
#define Y(arg) X(arg)
#define Z(arg) Y(arg)

   A(0)   // A_INDIRECT()(1)
X( A(0) ) // A_INDIRECT()(2)
Y( A(0) ) // A_INDIRECT()(3)
Z( A(0) ) // A_INDIRECT()(4)

In the above, each invocation of A is subjected to a different number of scans, which causes the results to be different each time. Normally, unlike the example in the previous example, an active argument eventually reaches a terminal, or inactive, state. For example,

#define B(n) \
    n WHEN(n) \
    ( \
        B_INDIRECT OBSTRUCT()()(DEC(n)) \
    ) \
    /**/
#define B_INDIRECT() B

#define ID(x) x

#define SCAN(x) ID(ID(ID(ID(ID(x)))))

SCAN( B(0) ) // 0
SCAN( B(1) ) // 1 0
SCAN( B(2) ) // 2 1 0
SCAN( B(3) ) // 3 2 1 0

Here, each invocation of B creates an active argument that eventually reaches a terminal state. However, each invocation reaches that terminal state at a different time (i.e. after a different number of scans).

Not all useful active arguments reach a terminal state. The following code produces an infinite list:

#define HEAD(list) HEAD_I list
#define HEAD_I(head, tail) head

#define TAIL(list) TAIL_I list
#define TAIL_I(head, tail) tail

#define STREAM(x, op, data) (x, STREAM_I(x, op, data))
#define STREAM_INDIRECT() STREAM_I
#define STREAM_I(x, op, data) \
    IIF(IS_PAREN(TAIL_I(~, ()))) \
    ( \
        STREAM_INDIRECT OBSTRUCT()()(x, op, data) EAT, \
        STREAM_II \
    )(op(x, data), op, data) \
    /**/
#define STREAM_II(x, op, data) \
    (x, DEFER(STREAM_INDIRECT)()(x, op, data)) \
    /**/

#define NATURALS STREAM(0, NATURALS_OP, ~)
#define NATURALS_OP(x, _) INC(x)

HEAD(NATURALS)             // 0
HEAD(TAIL(NATURALS))       // 1
HEAD(TAIL(TAIL(NATURALS))) // 2

The NATURALS macro conceptually expands to a list of all natural numbers. (Because this code is using INC, the value eventually will saturate, causing an infinite list of values saturated to maximum of INC(which is 9 for the examples in this page) from that point on.)

We can also use active arguments for recursion. We can define an EVAL macro like this:

#define EVAL(...) A(A(A(__VA_ARGS__)))
#define A(...) B(B(B(__VA_ARGS__)))
#define B(...) C(C(C(__VA_ARGS__)))
#define C(...) D(D(D(__VA_ARGS__)))
#define D(...) E(E(E(__VA_ARGS__)))
#define E(...) __VA_ARGS__

This basically forces several scans causing the active argument to expand several times. Here is how a REPEAT macro could be implemented:

#define REPEAT(count, macro, ...) \
    WHEN(count) \
    ( \
        REPEAT_INDIRECT OBSTRUCT() () \
        ( \
            DEC(count), macro, __VA_ARGS__ \
        ) \
        macro OBSTRUCT() \
        ( \
            DEC(count), __VA_ARGS__ \
        ) \
    ) \
#define REPEAT_INDIRECT() REPEAT

This form of recursion, while interesting, has several major drawbacks. Most of them can be minimized, in one way or another, but a fundamental problem is efficiency. Eventually, the active argument reaches a terminal state--at which point it no longer changes--and no further scans are required to produce the final result. However, further scans are applied anyway--exponentially. Another drawback is that this approach effectively enforces a continuation style because it destroys call-and-return semantics. In the above example, the result of the "nested" use of REPEAT is not needed for further calculation, so it isn't an issue. That is not always the case.

Clone this wiki locally