This package provides consistent utility functions for array programming with arbitrary dimensions (summary below).
We recommend to reference functions using the package namespace or
loading the package into its own namespace to not shadow base R
functions. The latter can be done using box
or
import
.
# do not use library() or require()
# example referencing the package namespace
narray::stack(...)
# example using `box`
box::use(narray)
narray$stack(...)
stack()
is like cbind
/rbind
, but along arbitrary axes, and taking care of (1) names
along each dimension and (2) padding partial matching arrays.
A = matrix(1:4, nrow=2, ncol=2, dimnames=list(c('a','b'),c('x','y')))
B = matrix(5:6, nrow=2, ncol=1, dimnames=list(c('b','a'),'z'))
C = stack(A, B, along=2)
# x y z
# a 1 3 6 # B is stacked correctly according to its names
# b 2 4 5
D = stack(m=A, n=C, along=3) # we can also introduce new dimensions
# , , m , , n
#
# x y z x y z
# a 1 3 NA a 1 3 6
# b 2 4 NA b 2 4 5
split()
splits an array along a given axis; can do each element or defined subsets.
split(C, along=2, subsets=c('s1','s1','s2'))
# $s1 $s2
# x y z # each subset is split into a separate array
# a 1 3 a 6 # use drop=TRUE to get a vector for s2
# b 2 4 b 5
Like apply
, but not reordering array dimensions and allowing to specify
subsets that the function should be applied on. The function must either return
a vector of the same length as the input (returns matrix of same dimension) or
of length 1 (drops current dimension or returns subsets).
map(C, along=2, function(x) x*2) # return same length vector
# x y z
# a 2 6 12
# b 4 8 10
map(C, along=2, mean, subsets=c('s1', 's1', 's2')) # summarize each subset to scalar
# s1 s2
# a 2 6
# b 3 5
We can also index multiple arrays using the lambda
function. If the result
is a scalar we will get back an array, and an index with result column otherwise.
dot = function(x, y) sum(x * y)
lambda(~ dot(A, B), along=c(A=1, B=2))
# z
# a 23 # the same as A %*% B
# b 34
lambda(~ dot(A, B), along=c(A=1, B=2), simplify=FALSE)
# A B result
# 1 a z 23
# 2 b z 34
Takes a number of arrays, intersects their names along a given dimension,
and returns sub-arrays that match in their names; intersect_list
takes
a list of arrays and returns a list of subsets.
E = matrix(1:6, nrow=3, dimnames=list(c('a','b','d'), c('x','y')))
F = matrix(7:9, nrow=3, dimnames=list(c('b','a','c'), 'z'))
intersect(E, F, along=1)
# > E > F
# x y z
# a 1 4 a 8
# b 2 5 b 7
construct()
takes a data frame and a formula specifying dependent (values) and independent
(axes) of the resulting array.
DF = data.frame(k1=rep(letters[1:3],2), k2=rep(letters[24:25],3), v=1:6)[-6,]
construct(v ~ k1 + k2, data=DF)
# x y
# a 1 4
# b 5 2
# c 3 NA
Takes either a factor or a list of vectors and creates a binary matrix specifying whether each element is present.
G = list(a='e1', b=c('e1','e2'), c='e2')
mask(G)
# e1 e2
# a TRUE FALSE
# b TRUE TRUE
# c FALSE TRUE