Underscores.jl

Underscores provides a macro @_ for passing closures to functions by interpreting _ placeholders as anonymous function arguments. For example, @_ map(_+1, xs) means map(x->x+1, xs).

Underscores is useful for writing anonymous functions succinctly and without naming the arguments. This is particularly useful for data processing pipelines such as

@_ people |> filter(_.age > 40, __) |> map(_.name, __)
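
To try the pipeline above, here is a minimal sketch. It assumes the Underscores package is installed; the people data and the name older_names are made up for illustration and are not part of the package.

```julia
using Underscores  # assumption: the package is installed in the active environment

# Hypothetical sample data: a vector of named tuples standing in for a table
people = [(name="Ada", age=45), (name="Bo", age=30), (name="Cy", age=52)]

# __ stands for the whole collection flowing through the pipe, _ for one row
older_names = @_ people |> filter(_.age > 40, __) |> map(_.name, __)

# The same pipeline written with explicit anonymous functions
explicit = people |> (p -> filter(x -> x.age > 40, p)) |> (p -> map(x -> x.name, p))
@assert older_names == explicit
```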

Tutorial

Basic use of _

@_ and _ placeholders are for making functions to pass to other functions. For example, to get the second last element of each array in a collection, broadcasting syntax would be awkward. Instead we can use:

julia> @_ map(_[end-1],  [[1,2,3], [4,5]])
2-element Vector{Int64}:
 2
 4

Repeated uses of _ all refer to the same argument of a single-argument anonymous function. To sum the last two elements of the arrays from the previous example:

julia> @_ map(_[end] + _[end-1],  [[1,2,3], [4,5]])
2-element Vector{Int64}:
 5
 9

Multiple arguments

Multiple-argument anonymous functions can be created with numbered placeholders such as _1 and _2. These are useful when you need to repeat arguments or reorder them. For example,

julia> @_ map("X $_2 $(repeat(_1,_2))", ["a","b","c"], [1,2,3])
3-element Vector{String}:
 "X 1 a"
 "X 2 bb"
 "X 3 ccc"

Tabular data

@_ is handy for manipulating tabular data. Let's filter a list of named tuples:

julia> table = [(x="a", y=1),
                (x="b", y=2),
                (x="c", y=3)];

julia> @_ filter(!startswith(_.x, "a"), table)
2-element Vector{NamedTuple{(:x, :y), Tuple{String, Int64}}}:
 (x = "b", y = 2)
 (x = "c", y = 3)

When combined with double underscore placeholders __ and piping syntax this becomes particularly neat. In the following, think of __ as the table, and _ as an individual row:

julia> @_ table |>
          filter(!startswith(_.x, "a"), __) |>
          map(_.y, __)
2-element Vector{Int64}:
 2
 3

Reference

Underscores.@_ (Macro)
@_ func(ex1, [ex2 ...])

Convert ex1,ex2,... into anonymous functions when they have _ placeholders, and pass them along to func.

The detailed rules are:

  1. Uses of the placeholder _ expand to the single argument of an anonymous function which is passed to the outermost ordinary function call.
  2. Numbered placeholders _1,_2,... (or _₁,_₂,...) may be used if you need more than one argument. Numbers indicate position in the argument list.
  3. The double underscore placeholder __ (and numbered versions __1,__2,...) expands the closure scope to the whole expression.
  4. Piping and composition chains with |>,<|,∘ are treated as a special case where the replacement recurses into sub-expressions.

These rules imply the following equivalences

Expression                Rules     Meaning
@_ map(_+1, a)            (1)       map(x->x+1, a)
@_ map(_^_, a)            (1)       map(x->x^x, a)
@_ map(_2/_1, a, b)       (1,2)     map((x,y)->y/x, a, b)
@_ func(a,__,b)           (3)       x->func(a,x,b)
@_ func(a,__2,b)          (3)       (x,y)->func(a,y,b)
@_ data |> map(_.f,__)    (1,3,4)   data |> (d->map(x->x.f,d))
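
The equivalences listed above can be checked directly. The following is a sketch that assumes Underscores is installed; func is a hypothetical three-argument helper defined only for this illustration, and the data values are arbitrary.

```julia
using Underscores  # assumption: the package is installed in the active environment

a = [1, 2, 3]
b = [10, 20, 30]

# Rule 1: every _ is the single argument of one closure passed to map
incremented = @_ map(_+1, a)
powers = @_ map(_^_, a)
@assert incremented == map(x -> x+1, a)
@assert powers == map(x -> x^x, a)

# Rule 2: numbered placeholders pick arguments by position
ratios = @_ map(_2/_1, a, b)
@assert ratios == map((x, y) -> y/x, a, b)

# Rule 3: __ turns the whole expression into a closure
func(p, q, r) = (p, q, r)           # hypothetical helper, not part of Underscores
fill_middle = @_ func(1, __, 3)     # x -> func(1, x, 3)
second_only = @_ func(1, __2, 3)    # (x, y) -> func(1, y, 3)
@assert fill_middle(2) == (1, 2, 3)
@assert second_only(:unused, 2) == (1, 2, 3)

# Rules 1, 3 and 4 together: the pipe is recursed into, __ is the piped value
data = [(f=1,), (f=2,)]
fields = @_ data |> map(_.f, __)
@assert fields == [1, 2]
```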

Extended help

Examples

@_ can be used for simple mapping operations in cases where broadcasting syntax is awkward. For example, to get the second last element of each array in a collection:

julia> @_ map(_[end-1],  [[1,2,3], [4,5]])
2-element Vector{Int64}:
 2
 4

If you need to repeat an argument more than once, just use _ multiple times:

julia> @_ map(_^_,  [1,2,3])
3-element Vector{Int64}:
  1
  4
 27

For manipulating tabular data @_ provides convenient syntax which is especially useful when combined with double underscore placeholders __ and piping syntax. Think of __ as the table, and _ as an individual row:

julia> table = [(x="a", y=1),
                (x="b", y=2),
                (x="c", y=3)];

julia> @_ table |>
          filter(!startswith(_.x, "a"), __) |>
          map(_.y, __)
2-element Vector{Int64}:
 2
 3

Extraordinary functions

The scope of _ as described in rule 1 depends on what counts as an "ordinary" function call. This excludes the following operations:

  • Square brackets: In map(_^2,__)[3], it is map which receives an anonymous function, as this happens before the indexing is lowered to getindex(...,3). Comprehensions (collect) and explicit matrix constructions (hvcat) are treated similarly.

  • Broadcasting and field access: In f.(_,xs) and f(_,x).y the function f is the ordinary call, not the internal broadcast or getproperty.

  • Infix operators: While sum(_^2,x) / length(x) can be written in prefix form /(...,...), the convention of @_ is not to view this as an ordinary call, and hence to pass the anonymous function to sum instead. This also applies to broadcasted operators, such as map(_^2,x) ./ length(x).

  • If statements, including the ternary operator. Note that this has higher precedence than pipes: data |> (any(_.x<0, __) ? abs.(__) : __) |> step needs these brackets.

The scope of __ is unaffected by these concerns.

Expression                        Meaning
@_ data |> map(_[2],__)[3]        data |> (d->map(x->x[2],d)[3])
@_ [sum(_*_, z) for z in a]       [sum(x->x*x, z) for z in a]
@_ sum(1+_^2, data).re            sum(x->1+x^2, data).re
@_ sum(_^2,a) / length(a)         sum(x->x^2,a) / length(a)
@_ /(sum(_^2,a), length(a))       The same, infix form is canonical.
@_ data |> filter(_>3,__).^2      data |> d->(filter(>(3),d).^2)
@_ any(_>3,xs) ? 0 : map(_,ys)    any(x->x>3,xs) ? 0 : map(y->y,ys)
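
A few of these cases can be tried out as follows. This is a sketch that assumes Underscores is installed; the sample values and variable names are illustrative only.

```julia
using Underscores  # assumption: the package is installed in the active environment

a = [1, 2, 3, 4]

# Infix operator: the closure goes to sum, not to the surrounding division
mean_square = @_ sum(_^2, a) / length(a)
@assert mean_square == sum(x -> x^2, a) / length(a)

# Square brackets: map receives the closure, indexing happens afterwards
data = [(1, 10), (2, 20), (3, 30)]
third = @_ data |> map(_[2], __)[3]
@assert third == (data |> (d -> map(x -> x[2], d)[3]))

# Comprehension: the closure goes to sum inside the comprehension body
nested = [[1, 2], [3, 4]]
sums_of_squares = @_ [sum(_*_, z) for z in nested]
@assert sums_of_squares == [sum(x -> x*x, z) for z in nested]
```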

Design

Underscore syntax for Julia has been discussed at great length in #24990, #5571 and elsewhere in the Julia community. The design for Underscores.jl grew out of this discussion. A great many packages have had a go at macros for this, including at least ChainMap.jl, ChainRecursive.jl, FunctionalData.jl, Hose.jl, Lazy.jl, LightQuery.jl, LambdaFn.jl, MagicUnderscores.jl, Pipe.jl, Query.jl and SplitApplyCombine.jl.

The key benefits of _ placeholders are

  • They avoid the need to come up with argument names (for example, the x in x->x+1 may not be meaningful).
  • Brevity. For example _+1.

One design difficulty is that much of the package work has focussed on the piping and tabular data manipulation scenario. However, as a language feature, a compelling general solution for _ placeholders needs wider appeal.

Starting with the need to be useful outside of tabular data manipulation, we observe that anonymous functions are generally passed directly to another "outer" function. For example, in map(x->x*y, A) the outer function is map. Putting @_ inside the function call leads to a lot of visual clutter, especially because it needs to be parenthesized to avoid consuming the remaining arguments to map. Placing the @_ on the function receiving the closure instead results in less visual clutter and improved clarity. Compare:

@_ map(_+1, A)   # This design

map(@_(_+1), A)  # The obvious alternative

map(x->x+1, A)   # Current status quo

With this "outermost-but-one" placement in mind, one can generalize to pipelines where anonymous functions are generally used as arguments to filter and map. This works in a particularly nice way for lazy versions of Filter and Map, allowing expressions such as

Filter(f) = x->filter(f,x);   Map(f) = x->map(f,x);

@_  data         |>
    Filter(_>10) |>
    Map(_+1)

However, Julia natively has non-lazy filter and map, so we'd really like a way to make these directly usable as well. For this we introduce the longer placeholder __ to escape the extra function call to the outer level. This is also appealing because the larger data structure (i.e., the full table) ends up being represented by __, while the smaller row data structure is represented by the smaller placeholder _. Thus we get:

@_  data             |>
    filter(_>10, __) |>
    map(_+1, __)
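
The two pipeline styles can be compared side by side. The sketch below assumes Underscores is installed; Filter and Map are the curried helpers defined earlier in this section, and the data values are made up.

```julia
using Underscores  # assumption: the package is installed in the active environment

# Curried helpers as defined in the design discussion
Filter(f) = x -> filter(f, x)
Map(f) = x -> map(f, x)

data = [5, 12, 20]  # illustrative values only

# Style 1: curried helpers, only _ is needed
curried = @_ data |> Filter(_>10) |> Map(_+1)

# Style 2: Base filter and map, with __ standing for the piped collection
direct = @_ data |> filter(_>10, __) |> map(_+1, __)

@assert curried == direct
```

Both forms keep the elements greater than 10 and then add 1 to each; the second needs no helper definitions, at the cost of writing __ in each stage.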