Operations¶
Operation Interface¶
The following functions define DyNet “Expressions,” which are used as an interface to the various functions that can be used to build DyNet computation graphs. Expressions for each specific function are listed below.
-
struct
dynet::expr::
Expression
¶ - #include <expr.h>
Expressions are the building block of a Dynet computation graph.
Public Functions
-
Expression
(ComputationGraph *pg, VariableIndex i)¶ Base expression constructor.
Used when creating operations
- Parameters
pg
: Pointer to the computation graph
i
: Variable index
-
const Tensor &
value
() const¶ Get value of the expression.
Throws a runtime_error exception if no computation graph is available
- Return
- Value of the expression as a tensor
-
const Tensor &
gradient
() const¶ Get gradient of the expression.
Throws a runtime_error exception if no computation graph is available
Make sure to call
backward
on a downstream expression before calling this. If the expression is a constant expression (meaning it is not a function of a parameter), DyNet won't compute its gradient for the sake of efficiency. You need to manually force the gradient computation by adding the argument
full=true
to backward
- Return
- Gradient of the expression as a tensor
-
Input Operations¶
These operations allow you to input something into the computation graph, either simple scalar/vector/matrix inputs from floats, or parameter inputs from a DyNet parameter object. They all require passing a computation graph as input so you know which graph is being used for this particular calculation.
-
Expression
dynet::expr::
input
(ComputationGraph &g, real s)¶ Scalar input.
Create an expression that represents the scalar value s
- Return
- An expression representing s
- Parameters
g
: Computation graph
s
: Real number
-
Expression
dynet::expr::
input
(ComputationGraph &g, const real *ps)¶ Modifiable scalar input.
Create an expression that represents the scalar value *ps. If *ps is changed and the computation graph recalculated, the next forward pass will reflect the new value.
- Return
- An expression representing *ps
- Parameters
g
: Computation graph
ps
: Real number pointer
-
Expression
dynet::expr::
input
(ComputationGraph &g, const Dim &d, const std::vector<float> &data)¶ Vector/matrix/tensor input.
Create an expression that represents a vector, matrix, or tensor input. The dimensions of the input are defined by
d
. So for example
input(g,{50},data)
: will result in a 50-length vector
input(g,{50,30},data)
: will result in a 50x30 matrix
and so on, for an arbitrary number of dimensions. This function can also be used to import minibatched inputs. For example, if we have 10 examples in a minibatch, each with size 50x30, then we call
input(g,Dim({50,30},10),data)
The data vector “data” will contain the values used to fill the input, in column-major format. The length must equal the product of all dimensions in d.
- Return
- An expression representing data
- Parameters
g
: Computation graph
d
: Dimension of the input matrix
data
: A vector of data points
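To make the column-major convention concrete, here is a small plain-C++ sketch (not DyNet code) of how a flat data vector maps onto matrix entries, as assumed by the input functions above:

```cpp
#include <cassert>
#include <vector>

// Column-major layout: element (i, j) of a rows x cols matrix lives at
// flat index i + j * rows in the data vector. This mirrors the layout
// the input() functions above expect for "data".
float colmajor_get(const std::vector<float>& data,
                   unsigned rows, unsigned i, unsigned j) {
    return data[i + j * rows];
}
```

So for input(g,{2,3},data), data must be ordered as the first column, then the second column, and so on.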
-
Expression
dynet::expr::
input
(ComputationGraph &g, const Dim &d, const std::vector<float> *pdata)¶ Updatable vector/matrix/tensor input.
Similarly to input that takes a vector reference, input a vector, matrix, or tensor input. Because we pass the pointer, the data can be updated.
- Return
- An expression representing *pdata
- Parameters
g
: Computation graph
d
: Dimension of the input matrix
pdata
: A pointer to an (updatable) vector of data points
-
Expression
dynet::expr::
input
(ComputationGraph &g, const Dim &d, const std::vector<unsigned int> &ids, const std::vector<float> &data, float defdata = 0.f)¶ Sparse vector input.
This operation takes input as a sparse matrix of index/value pairs. It is exactly the same as the standard input via vector reference, but sets all non-specified values to “defdata” and resets all others to the appropriate input values.
- Return
- An expression representing data
- Parameters
g
: Computation graph
d
: Dimension of the input matrix
ids
: The indexes of the data points to update
data
: The data points corresponding to each index
defdata
: The default data with which to set the unspecified data points
-
Expression
dynet::expr::
parameter
(ComputationGraph &g, Parameter p)¶ Load parameter.
Load parameters into the computation graph.
- Return
- An expression representing p
- Parameters
g
: Computation graph
p
: Parameter object to load
-
Expression
dynet::expr::
parameter
(ComputationGraph &g, LookupParameter lp)¶ Load lookup parameter.
Load a full tensor of lookup parameters into the computation graph. Normally lookup parameters are accessed by using the lookup() function to grab a single element. However, in some cases we’ll want to access all of the parameters in the entire set of lookup parameters; in that case you can use this function. The first dimensions in the returned tensor will be equivalent to the dimensions we would get if calling the lookup() function, and the size of the final dimension will be equal to the size of the vocabulary.
- Return
- An expression representing lp
- Parameters
g
: Computation graph
lp
: LookupParameter object to load
-
Expression
dynet::expr::
const_parameter
(ComputationGraph &g, Parameter p)¶ Load constant parameters.
Load parameters into the computation graph, but prevent them from being updated when performing parameter update.
- Return
- An expression representing the constant p
- Parameters
g
: Computation graph
p
: Parameter object to load
-
Expression
dynet::expr::
const_parameter
(ComputationGraph &g, LookupParameter lp)¶ Load constant lookup parameters.
Load lookup parameters into the computation graph, but prevent them from being updated when performing parameter update.
- Return
- An expression representing the constant lp
- Parameters
g
: Computation graph
lp
: LookupParameter object to load
-
Expression
dynet::expr::
lookup
(ComputationGraph &g, LookupParameter p, unsigned index)¶ Look up parameter.
Look up parameters according to an index, and load them into the computation graph.
- Return
- An expression representing p[index]
- Parameters
g
: Computation graph
p
: LookupParameter object from which to load
index
: Index of the parameters within p
-
Expression
dynet::expr::
lookup
(ComputationGraph &g, LookupParameter p, const unsigned *pindex)¶ Look up parameters with modifiable index.
Look up parameters according to the *pindex, and load them into the computation graph. When *pindex changes, on the next computation of forward() the values will change.
- Return
- An expression representing p[*pindex]
- Parameters
g
: Computation graph
p
: LookupParameter object from which to load
pindex
: Pointer to the index of the parameters within p
-
Expression
dynet::expr::
const_lookup
(ComputationGraph &g, LookupParameter p, unsigned index)¶ Look up parameter.
Look up parameters according to an index, and load them into the computation graph. Do not perform gradient update on the parameters.
- Return
- A constant expression representing p[index]
- Parameters
g
: Computation graph
p
: LookupParameter object from which to load
index
: Index of the parameters within p
-
Expression
dynet::expr::
const_lookup
(ComputationGraph &g, LookupParameter p, const unsigned *pindex)¶ Constant lookup parameters with modifiable index.
Look up parameters according to the *pindex, and load them into the computation graph. When *pindex changes, on the next computation of forward() the values will change. However, gradient updates will not be performed.
- Return
- A constant expression representing p[*pindex]
- Parameters
g
: Computation graph
p
: LookupParameter object from which to load
pindex
: Pointer to the index of the parameters within p
-
Expression
dynet::expr::
lookup
(ComputationGraph &g, LookupParameter p, const std::vector<unsigned> &indices)¶ Look up parameters.
The mini-batched version of lookup. The resulting expression will be a mini-batch of parameters, where the “i”th element of the batch corresponds to the parameters at the position specified by the “i”th element of “indices”
- Return
- An expression with the “i”th batch element representing p[indices[i]]
- Parameters
g
: Computation graph
p
: LookupParameter object from which to load
indices
: Index of the parameters at each position in the batch
-
Expression
dynet::expr::
lookup
(ComputationGraph &g, LookupParameter p, const std::vector<unsigned> *pindices)¶ Look up parameters.
The mini-batched version of lookup with modifiable parameter indices.
- Return
- An expression with the “i”th batch element representing p[*pindices[i]]
- Parameters
g
: Computation graph
p
: LookupParameter object from which to load
pindices
: Pointer to lookup indices
-
Expression
dynet::expr::
const_lookup
(ComputationGraph &g, LookupParameter p, const std::vector<unsigned> &indices)¶ Look up parameters.
Mini-batched lookup that will not update the parameters.
- Return
- A constant expression with the “i”th batch element representing p[indices[i]]
- Parameters
g
: Computation graph
p
: LookupParameter object from which to load
indices
: Lookup indices
-
Expression
dynet::expr::
const_lookup
(ComputationGraph &g, LookupParameter p, const std::vector<unsigned> *pindices)¶ Look up parameters.
Mini-batched lookup that will not update the parameters, with modifiable indices.
- Return
- A constant expression with the “i”th batch element representing p[*pindices[i]]
- Parameters
g
: Computation graph
p
: LookupParameter object from which to load
pindices
: Lookup index pointers.
-
Expression
dynet::expr::
zeroes
(ComputationGraph &g, const Dim &d)¶ Create an input full of zeros.
Create an input full of zeros, sized according to dimensions d.
- Return
- A “d” dimensioned zero vector
- Parameters
g
: Computation graph
d
: The dimensions of the input
-
Expression
dynet::expr::
random_normal
(ComputationGraph &g, const Dim &d)¶ Create a random normal vector.
Create a vector distributed according to normal distribution with mean 0, variance 1.
- Return
- A “d” dimensioned normally distributed vector
- Parameters
g
: Computation graph
d
: The dimensions of the input
-
Expression
dynet::expr::
random_bernoulli
(ComputationGraph &g, const Dim &d, real p, real scale = 1.0f)¶ Create a random bernoulli vector.
Create a vector distributed according to bernoulli distribution with parameter p.
- Return
- A “d” dimensioned bernoulli distributed vector
- Parameters
g
: Computation graph
d
: The dimensions of the input
p
: The bernoulli p parameter
scale
: A scaling factor for the output (“active” elements will receive this value)
-
Expression
dynet::expr::
random_uniform
(ComputationGraph &g, const Dim &d, real left, real right)¶ Create a random uniform vector.
Create a vector distributed according to uniform distribution with boundaries left and right.
- Return
- A “d” dimensioned uniform distributed vector
- Parameters
g
: Computation graph
d
: The dimensions of the input
left
: The left boundary
right
: The right boundary
-
Expression
dynet::expr::
random_gumbel
(ComputationGraph &g, const Dim &d, real mu = 0.0, real beta = 1.0)¶ Create a random Gumbel sampled vector.
Create a vector distributed according to a Gumbel distribution with the specified parameters. (Currently only the defaults of mu=0.0 and beta=1.0 are supported.)
- Return
- A “d” dimensioned Gumbel distributed vector
- Parameters
g
: Computation graph
d
: The dimensions of the input
mu
: The mu parameter
beta
: The beta parameter
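For reference, a Gumbel(mu, beta) sample can be obtained from a uniform sample by the standard inverse-CDF transform. This plain-C++ sketch (not DyNet code) shows the construction underlying such a sampler:

```cpp
#include <cassert>
#include <cmath>

// Inverse-CDF transform for the Gumbel distribution: if u ~ Uniform(0,1),
// then mu - beta * log(-log(u)) is Gumbel(mu, beta) distributed.
// random_gumbel currently supports only mu = 0.0, beta = 1.0.
double gumbel_from_uniform(double u, double mu = 0.0, double beta = 1.0) {
    return mu - beta * std::log(-std::log(u));
}
```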
Arithmetic Operations¶
These operations perform basic arithmetic over values in the graph.
-
Expression
dynet::expr::
operator-
(const Expression &x)¶ Negation.
Negate the passed argument.
- Return
- The negation of x
- Parameters
x
: An input expression
-
Expression
dynet::expr::
operator+
(const Expression &x, const Expression &y)¶ Expression addition.
Add two expressions of the same dimensions.
- Return
- The sum of x and y
- Parameters
x
: The first input
y
: The second input
-
Expression
dynet::expr::
operator+
(const Expression &x, real y)¶ Scalar addition.
Add a scalar to an expression
- Return
- An expression equal to x, with every component increased by y
- Parameters
x
: The expression
y
: The scalar
-
Expression
dynet::expr::
operator+
(real x, const Expression &y)¶ Scalar addition.
Add a scalar to an expression
- Return
- An expression equal to y, with every component increased by x
- Parameters
x
: The scalar
y
: The expression
-
Expression
dynet::expr::
operator-
(const Expression &x, const Expression &y)¶ Expression subtraction.
Subtract one expression from another.
- Return
- An expression where the ith element is x_i minus y_i
- Parameters
x
: The expression from which to subtract
y
: The expression to subtract
-
Expression
dynet::expr::
operator-
(real x, const Expression &y)¶ Scalar subtraction.
Subtract an expression from a scalar
- Return
- An expression where the ith element is x minus y_i
- Parameters
x
: The scalar from which to subtract
y
: The expression to subtract
-
Expression
dynet::expr::
operator-
(const Expression &x, real y)¶ Scalar subtraction.
Subtract a scalar from an expression
- Return
- An expression where the ith element is x_i minus y
- Parameters
x
: The expression from which to subtract
y
: The scalar to subtract
-
Expression
dynet::expr::
operator*
(const Expression &x, const Expression &y)¶ Matrix multiplication.
Multiply two matrices together. Like standard matrix multiplication, the second dimension of x and the first dimension of y must match.
- Return
- An expression x times y
- Parameters
x
: The left-hand matrix
y
: The right-hand matrix
-
Expression
dynet::expr::
operator*
(const Expression &x, float y)¶ Matrix-scalar multiplication.
Multiply an expression component-wise by a scalar.
- Return
- An expression where the ith element is x_i times y
- Parameters
x
: The matrix
y
: The scalar
-
Expression
dynet::expr::
operator*
(float y, const Expression &x)¶ Matrix-scalar multiplication.
Multiply an expression component-wise by a scalar.
- Return
- An expression where the ith element is x_i times y
- Parameters
y
: The scalar
x
: The matrix
-
Expression
dynet::expr::
operator/
(const Expression &x, float y)¶ Matrix-scalar division.
Divide an expression component-wise by a scalar.
- Return
- An expression where the ith element is x_i divided by y
- Parameters
x
: The matrix
y
: The scalar
-
Expression
dynet::expr::
affine_transform
(const std::initializer_list<Expression> &xs)¶ Affine transform.
This performs an affine transform over an arbitrary (odd) number of expressions held in the input initializer list xs. The first expression is the “bias,” which is added to the expression as-is. The remaining expressions are multiplied together in pairs, then added. A very common usage case is the calculation of the score for a neural network layer (e.g. b + Wz) where b is the bias, W is the weight matrix, and z is the input. In this case xs[0] = b, xs[1] = W, and xs[2] = z.
- Return
- An expression equal to: xs[0] + xs[1]*xs[2] + xs[3]*xs[4] + ...
- Parameters
xs
: An initializer list containing an odd number of expressions
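The b + Wz case can be sketched in plain C++ for small sizes (an illustration of the computation, not the Expression-based API; the helper name is hypothetical):

```cpp
#include <cassert>
#include <vector>

// What affine_transform({b, W, z}) computes for a bias vector b, a
// rows x cols matrix W stored column-major, and a vector z: out = b + W * z.
std::vector<float> affine(const std::vector<float>& b,
                          const std::vector<float>& W, unsigned rows,
                          const std::vector<float>& z) {
    std::vector<float> out = b;                       // start from the bias
    unsigned cols = static_cast<unsigned>(z.size());
    for (unsigned j = 0; j < cols; ++j)               // accumulate W * z
        for (unsigned i = 0; i < rows; ++i)
            out[i] += W[i + j * rows] * z[j];
    return out;
}
```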
-
Expression
dynet::expr::
sum
(const std::initializer_list<Expression> &xs)¶ Sum.
This performs an elementwise sum over all the expressions in xs
- Return
- An expression where the ith element is equal to xs[0][i] + xs[1][i] + ...
- Parameters
xs
: An initializer list containing expressions
-
Expression
dynet::expr::
sum_elems
(const Expression &x)¶ Sum all elements.
Sum all the elements in an expression.
- Return
- The sum of all of its elements
- Parameters
x
: The input expression
-
Expression
dynet::expr::
average
(const std::initializer_list<Expression> &xs)¶ Average.
This performs an elementwise average over all the expressions in xs
- Return
- An expression where the ith element is equal to (xs[0][i] + xs[1][i] + ...)/|xs|
- Parameters
xs
: An initializer list containing expressions
-
Expression
dynet::expr::
sqrt
(const Expression &x)¶ Square root.
Elementwise square root.
- Return
- An expression where the ith element is equal to \(\sqrt(x_i)\)
- Parameters
x
: The input expression
-
Expression
dynet::expr::
abs
(const Expression &x)¶ Absolute value.
Elementwise absolute value.
- Return
- An expression where the ith element is equal to \(\vert x_i\vert\)
- Parameters
x
: The input expression
-
Expression
dynet::expr::
erf
(const Expression &x)¶ Gaussian error function.
Elementwise calculation of the Gaussian error function
- Return
- An expression where the ith element is equal to erf(x_i)
- Parameters
x
: The input expression
-
Expression
dynet::expr::
tanh
(const Expression &x)¶ Hyperbolic tangent.
Elementwise calculation of the hyperbolic tangent
- Return
- An expression where the ith element is equal to tanh(x_i)
- Parameters
x
: The input expression
-
Expression
dynet::expr::
exp
(const Expression &x)¶ Natural exponent.
Calculate elementwise y_i = e^{x_i}
- Return
- An expression where the ith element is equal to e^{x_i}
- Parameters
x
: The input expression
-
Expression
dynet::expr::
square
(const Expression &x)¶ Square.
Calculate elementwise y_i = x_i^2
- Return
- An expression where the ith element is equal to x_i^2
- Parameters
x
: The input expression
-
Expression
dynet::expr::
cube
(const Expression &x)¶ Cube.
Calculate elementwise y_i = x_i^3
- Return
- An expression where the ith element is equal to x_i^3
- Parameters
x
: The input expression
-
Expression
dynet::expr::
lgamma
(const Expression &x)¶ Log gamma.
Calculate elementwise y_i = ln(gamma(x_i))
- Return
- An expression where the ith element is equal to ln(gamma(x_i))
- Parameters
x
: The input expression
-
Expression
dynet::expr::
log
(const Expression &x)¶ Logarithm.
Calculate the elementwise natural logarithm y_i = ln(x_i)
- Return
- An expression where the ith element is equal to ln(x_i)
- Parameters
x
: The input expression
-
Expression
dynet::expr::
logistic
(const Expression &x)¶ Logistic sigmoid function.
Calculate elementwise y_i = 1/(1+e^{-x_i})
- Return
- An expression where the ith element is equal to y_i = 1/(1+e^{-x_i})
- Parameters
x
: The input expression
-
Expression
dynet::expr::
rectify
(const Expression &x)¶ Rectifier.
Calculate elementwise the recitifer (ReLU) function y_i = max(x_i,0)
- Return
- An expression where the ith element is equal to max(x_i,0)
- Parameters
x
: The input expression
-
Expression
dynet::expr::
softsign
(const Expression &x)¶ Soft Sign.
Calculate elementwise the softsign function y_i = x_i/(1+|x_i|)
- Return
- An expression where the ith element is equal to x_i/(1+|x_i|)
- Parameters
x
: The input expression
-
Expression
dynet::expr::
pow
(const Expression &x, const Expression &y)¶ Power function.
Calculate an output where the ith element is equal to x_i^y_i
- Return
- An expression where the ith element is equal to x_i^y_i
- Parameters
x
: The input expression
y
: The exponent expression
-
Expression
dynet::expr::
min
(const Expression &x, const Expression &y)¶ Minimum.
Calculate an output where the ith element is min(x_i,y_i)
- Return
- An expression where the ith element is equal to min(x_i,y_i)
- Parameters
x
: The first input expression
y
: The second input expression
-
Expression
dynet::expr::
max
(const Expression &x, const Expression &y)¶ Maximum.
Calculate an output where the ith element is max(x_i,y_i)
- Return
- An expression where the ith element is equal to max(x_i,y_i)
- Parameters
x
: The first input expression
y
: The second input expression
-
Expression
dynet::expr::
max
(const std::initializer_list<Expression> &xs)¶ Max.
This performs an elementwise max over all the expressions in xs
- Return
- An expression where the ith element is equal to max(xs[0][i], xs[1][i], ...)
- Parameters
xs
: An initializer list containing expressions
-
Expression
dynet::expr::
dot_product
(const Expression &x, const Expression &y)¶ Dot Product.
Calculate the dot product sum_i x_i*y_i
- Return
- An expression equal to the dot product
- Parameters
x
: The first input expression
y
: The second input expression
-
Expression
dynet::expr::
cmult
(const Expression &x, const Expression &y)¶ Componentwise multiply.
Do a componentwise multiply where each value is equal to x_i*y_i. This function used to be called cwise_multiply.
- Return
- An expression where the ith element is equal to x_i*y_i
- Parameters
x
: The first input expression
y
: The second input expression
-
Expression
dynet::expr::
cdiv
(const Expression &x, const Expression &y)¶ Componentwise division.
Do a componentwise division where each value is equal to x_i/y_i
- Return
- An expression where the ith element is equal to x_i/y_i
- Parameters
x
: The first input expression
y
: The second input expression
-
Expression
dynet::expr::
colwise_add
(const Expression &x, const Expression &bias)¶ Columnwise addition.
Add vector “bias” to each column of matrix “x”
- Return
- An expression where bias is added to each column of x
- Parameters
x
: An MxN matrix
bias
: A length M vector
Probability/Loss Operations¶
These operations are used for calculating probabilities, or calculating loss functions for use in training.
-
Expression
dynet::expr::
softmax
(const Expression &x)¶ Softmax.
The softmax function normalizes each column to ensure that all values are between 0 and 1 and add to one by applying e^{x[i]}/sum_j e^{x[j]}.
- Return
- A vector or matrix after calculating the softmax
- Parameters
x
: A vector or matrix
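The per-column computation can be sketched in plain C++ (an illustration of the math, not the Expression-based API), including the usual max-subtraction trick so large scores do not overflow:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Reference softmax over one column: y_i = e^{x_i} / sum_j e^{x_j},
// computed with the maximum subtracted for numerical stability.
std::vector<double> softmax_ref(const std::vector<double>& x) {
    double mx = x[0];
    for (double v : x) if (v > mx) mx = v;
    double Z = 0.0;
    for (double v : x) Z += std::exp(v - mx);
    std::vector<double> y(x.size());
    for (size_t i = 0; i < x.size(); ++i) y[i] = std::exp(x[i] - mx) / Z;
    return y;
}
```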
-
Expression
dynet::expr::
log_softmax
(const Expression &x)¶ Log softmax.
The log softmax function normalizes each column to ensure that all values are between 0 and 1 and add to one by applying e^{x[i]}/sum_j e^{x[j]}, then takes the log
- Return
- A vector or matrix after calculating the log softmax
- Parameters
x
: A vector or matrix
-
Expression
dynet::expr::
log_softmax
(const Expression &x, const std::vector<unsigned> &restriction)¶ Restricted log softmax.
The log softmax function calculated over only a subset of the vector elements. The elements to be included are set by the
restriction
variable. All elements not included in
restriction
are set to negative infinity.
- Return
- A vector with the log softmax over the specified elements
- Parameters
x
: A vector over which to calculate the softmax
restriction
: The elements over which to calculate the softmax
-
Expression
dynet::expr::
logsumexp
(const std::initializer_list<Expression> &xs)¶ Log, sum, exp.
The elementwise “logsumexp” function that calculates \(ln(\sum_i e^{xs_i})\), used in adding probabilities in the log domain.
- Return
- The result.
- Parameters
xs
: Expressions with respect to which to calculate the logsumexp.
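A numerically stable version of this computation can be sketched in plain C++ (an illustration of the math, not the Expression-based API):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Stable logsumexp: ln(sum_i e^{x_i}), computed by factoring out the
// maximum so that large log-domain scores do not overflow.
double logsumexp_ref(const std::vector<double>& x) {
    double mx = x[0];
    for (double v : x) if (v > mx) mx = v;
    double s = 0.0;
    for (double v : x) s += std::exp(v - mx);
    return mx + std::log(s);
}
```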
-
Expression
dynet::expr::
pickneglogsoftmax
(const Expression &x, unsigned v)¶ Negative softmax log likelihood.
This function takes in a vector of scores
x
, performs a log softmax, takes the negative, and selects the likelihood corresponding to the element
v
. This is perhaps the most standard loss function for training neural networks to predict one out of a set of elements.
- Return
- The negative log likelihood of element
v
after taking the softmax
- Parameters
x
: A vector of scores
v
: The element with which to calculate the loss
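Because -log softmax(x)[v] simplifies to logsumexp(x) - x[v], the loss can be sketched in plain C++ as follows (an illustration of the identity, not the Expression-based API):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Reference pickneglogsoftmax: -log softmax(x)[v] = logsumexp(x) - x[v],
// with the maximum factored out of logsumexp for stability.
double pick_nll(const std::vector<double>& x, unsigned v) {
    double mx = x[0];
    for (double s : x) if (s > mx) mx = s;
    double z = 0.0;
    for (double s : x) z += std::exp(s - mx);
    return (mx + std::log(z)) - x[v];   // logsumexp(x) - x[v]
}
```

For a uniform score vector of length n, the loss is ln(n) regardless of which element is picked.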
-
Expression
dynet::expr::
pickneglogsoftmax
(const Expression &x, const unsigned *pv)¶ Modifiable negative softmax log likelihood.
This function calculates the negative log likelihood after the softmax with respect to index
*pv
. This computes the same value as the previous function that passes the index
v
by value, but instead passes by pointer so the value
*pv
can be modified without re-constructing the computation graph. This can be used in situations where we want to create a computation graph once, then feed it different data points.
- Return
- The negative log likelihood of element
*pv
after taking the softmax
- Parameters
x
: A vector of scores
pv
: A pointer to the index of the correct element
-
Expression
dynet::expr::
pickneglogsoftmax
(const Expression &x, const std::vector<unsigned> &v)¶ Batched negative softmax log likelihood.
This function is similar to standard pickneglogsoftmax, but calculates loss with respect to multiple batch elements. The input will be a mini-batch of score vectors where the number of batch elements is equal to the number of indices in
v
.
- Return
- The negative log likelihoods over all the batch elements
- Parameters
x
: An expression with vectors of scores over N batch elements
v
: A size-N vector indicating the index with respect to all the batch elements
-
Expression
dynet::expr::
pickneglogsoftmax
(const Expression &x, const std::vector<unsigned> *pv)¶ Modifiable batched negative softmax log likelihood.
This function is a combination of modifiable pickneglogsoftmax and batched pickneglogsoftmax:
pv
can be modified without re-creating the computation graph.
- Return
- The negative log likelihoods over all the batch elements
- Parameters
x
: An expression with vectors of scores over N batch elements
pv
: A pointer to the indexes
-
Expression
dynet::expr::
hinge
(const Expression &x, unsigned index, float m = 1.0)¶ Hinge loss.
This expression calculates the hinge loss, formally expressed as: \( \text{hinge}(x,index,m) = \sum_{i \ne index} \max(0, m-x[index]+x[i]). \)
- Return
- The hinge loss of candidate
index
with respect to margin
m
- Parameters
x
: A vector of scores
index
: The index of the correct candidate
m
: The margin
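The formula above can be sketched directly in plain C++ (an illustration of the math, not the Expression-based API):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Reference hinge loss: for each wrong candidate i, penalize any margin
// violation max(0, m - x[index] + x[i]).
float hinge_ref(const std::vector<float>& x, unsigned index, float m = 1.0f) {
    float loss = 0.f;
    for (unsigned i = 0; i < x.size(); ++i)
        if (i != index) loss += std::max(0.f, m - x[index] + x[i]);
    return loss;
}
```

When the correct candidate beats every other score by at least the margin, the loss is zero.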
-
Expression
dynet::expr::
hinge
(const Expression &x, const unsigned *pindex, float m = 1.0)¶ Modifiable hinge loss.
This function calculates the hinge loss with respect to index
*pindex
. This computes the same value as the previous function that passes the index
index
by value, but instead passes by pointer so the value
*pindex
can be modified without re-constructing the computation graph. This can be used in situations where we want to create a computation graph once, then feed it different data points.
- Return
- The hinge loss of candidate
*pindex
with respect to margin
m
- Parameters
x
: A vector of scores
pindex
: A pointer to the index of the correct candidate
m
: The margin
-
Expression
dynet::expr::
hinge
(const Expression &x, const std::vector<unsigned> &indices, float m = 1.0)¶ Batched hinge loss.
The same as hinge loss, but for the case where
x
is a mini-batched tensor with
indices.size()
batch elements, and
indices
is a vector indicating the index of each of the correct elements for these elements.
- Return
- The hinge loss of each mini-batch
- Parameters
x
: A mini-batch of vectors with
indices.size()
batch elements
indices
: The indices of the correct candidates for each batch element
m
: The margin
-
Expression
dynet::expr::
hinge
(const Expression &x, const std::vector<unsigned> *pindices, float m = 1.0)¶ Batched modifiable hinge loss.
A combination of the previous batched and modifiable hinge loss functions, where the vector
*pindices
can be modified.
- Return
- The hinge loss of each mini-batch
- Parameters
x
: A mini-batch of vectors with
indices.size()
batch elements
pindices
: Pointer to the indices of the correct candidates for each batch element
m
: The margin
-
Expression
dynet::expr::
sparsemax
(const Expression &x)¶ Sparsemax.
The sparsemax function (Martins et al. 2016), which is similar to softmax, but induces sparse solutions where most of the vector elements are zero. Note: This function is not yet implemented on GPU.
- Return
- The sparsemax of the scores
- Parameters
x
: A vector of scores
-
Expression
dynet::expr::
sparsemax_loss
(const Expression &x, const std::vector<unsigned> &target_support)¶ Sparsemax loss.
The sparsemax loss function (Martins et al. 2016), which is similar to softmax loss, but induces sparse solutions where most of the vector elements are zero. It has a gradient similar to the sparsemax function and thus is useful for optimizing when the sparsemax will be used at test time. Note: This function is not yet implemented on GPU.
- Return
- The sparsemax loss of the labels
- Parameters
x
: A vector of scores
target_support
: The target correct labels.
-
Expression
dynet::expr::
sparsemax_loss
(const Expression &x, const std::vector<unsigned> *ptarget_support)¶ Modifiable sparsemax loss.
Similar to the sparsemax loss, but with ptarget_support being a pointer to a vector, allowing it to be modified without re-creating the computation graph. Note: This function is not yet implemented on GPU.
- Return
- The sparsemax loss of the labels
- Parameters
x
: A vector of scores
ptarget_support
: A pointer to the target correct labels.
-
Expression
dynet::expr::
squared_norm
(const Expression &x)¶ Squared norm.
The squared norm of the values of x: \(\sum_i x_i^2\).
- Return
- The squared norm
- Parameters
x
: A vector of values
-
Expression
dynet::expr::
squared_distance
(const Expression &x, const Expression &y)¶ Squared distance.
The squared distance between values of
x
and
y
: \(\sum_i (x_i-y_i)^2\).
- Return
- The squared distance
- Parameters
x
: A vector of values
y
: Another vector of values
-
Expression
dynet::expr::
l1_distance
(const Expression &x, const Expression &y)¶ L1 distance.
The L1 distance between values of
x
and
y
: \(\sum_i |x_i-y_i|\).
- Return
- The L1 distance
- Parameters
x
: A vector of values
y
: Another vector of values
-
Expression
dynet::expr::
huber_distance
(const Expression &x, const Expression &y, float c = 1.345f)¶ Huber distance.
The Huber distance between values of
x
and
y
, parameterized by
c
: \(\sum_i L_c(x_i, y_i)\) where:
\( L_c(x, y) = \begin{cases} \frac{1}{2}(y - x)^2 & \textrm{for } |y - x| \le c, \\ c\, |y - x| - \frac{1}{2}c^2 & \textrm{otherwise.} \end{cases} \)
- Return
- The Huber distance
- Parameters
x
: A vector of values
y
: Another vector of values
c
: The parameter of the Huber distance parameterizing the cutoff
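A single term of the sum can be sketched in plain C++ (an illustration of the piecewise definition, not the Expression-based API):

```cpp
#include <cassert>
#include <cmath>

// One term L_c(x, y) of the Huber distance: quadratic for residuals up to
// the cutoff c, linear beyond it (so outliers are penalized less harshly
// than under a squared distance).
double huber_term(double x, double y, double c = 1.345) {
    double r = std::fabs(y - x);
    return r <= c ? 0.5 * r * r : c * r - 0.5 * c * c;
}
```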
-
Expression
dynet::expr::
binary_log_loss
(const Expression &x, const Expression &y)¶ Binary log loss.
The log loss of a binary decision according to the sigmoid function: \(- \sum_i (y_i * ln(x_i) + (1-y_i) * ln(1-x_i)) \)
- Return
- The log loss of the sigmoid function
- Parameters
x
: A vector of values
y
: A vector of true answers
-
Expression
dynet::expr::
pairwise_rank_loss
(const Expression &x, const Expression &y, real m = 1.0)¶ Pairwise rank loss.
A margin-based loss, where every margin violation for each pair of values is penalized: \(\sum_i max(x_i-y_i+m, 0)\)
- Return
- The pairwise rank loss
- Parameters
x
: A vector of values
y
: A vector of true answers
m
: The margin
-
Expression
dynet::expr::
poisson_loss
(const Expression &x, unsigned y)¶ Poisson loss.
The negative log probability of
y
according to a Poisson distribution with parameter
x
. Useful in Poisson regression, where we try to predict the parameters of a Poisson distribution to maximize the probability of data
y
.
- Return
- The Poisson loss
- Parameters
x
: The parameter of the Poisson distribution.
y
: The target value
-
Expression
dynet::expr::
poisson_loss
(const Expression &x, const unsigned *py)¶ Modifiable Poisson loss.
Similar to Poisson loss, but with the target value passed by pointer so that it can be modified without re-constructing the computation graph.
- Return
- The Poisson loss
- Parameters
x
: The parameter of the Poisson distribution.
py
: A pointer to the target value
Flow/Shaping Operations¶
These operations control the flow of information through the graph, or the shape of the vectors/tensors used in the graph.
-
Expression
dynet::expr::
nobackprop
(const Expression &x)¶ Prevent backprop.
This node has no effect on the forward pass, but prevents gradients from flowing backward during the backward pass. This is useful when there’s a subgraph for which you don’t want loss passed back to the parameters.
- Return
- The new expression
- Parameters
x
: The input expression
-
Expression
dynet::expr::
flip_gradient
(const Expression &x)¶ Negative backprop.
This node has no effect on the forward pass, but takes negative on backprop process. This operation is widely used in adversarial networks.
- Return
- An output expression with the same value as the input (it only affects the backward pass)
- Parameters
x
: The input expression
-
Expression
dynet::expr::
reshape
(const Expression &x, const Dim &d)¶ Reshape to another size.
This node reshapes a tensor to another size, without changing the underlying layout of the data. The layout of the data in DyNet is column-major, so if we have a 3x4 matrix
\( \begin{pmatrix} x_{1,1} & x_{1,2} & x_{1,3} & x_{1,4} \\ x_{2,1} & x_{2,2} & x_{2,3} & x_{2,4} \\ x_{3,1} & x_{3,2} & x_{3,3} & x_{3,4} \\ \end{pmatrix} \)
and transform it into a 2x6 matrix, it will be rearranged as:
\( \begin{pmatrix} x_{1,1} & x_{3,1} & x_{2,2} & x_{1,3} & x_{3,3} & x_{2,4} \\ x_{2,1} & x_{1,2} & x_{3,2} & x_{2,3} & x_{1,4} & x_{3,4} \\ \end{pmatrix} \)
Note: This is O(1) for forward, and O(n) for backward.
- Return
- The reshaped expression
- Parameters
x
: The input expressiond
: The new dimensions
-
Expression dynet::expr::transpose(const Expression & x, const std::vector< unsigned > & dims = {1, 0})
Transpose a matrix.
Transpose a matrix or tensor, or if dims is specified shuffle the dimensions arbitrarily. Note: This is O(1) if either the row or column dimension is 1, and O(n) otherwise.
- Return
- The transposed/shuffled expression
- Parameters
x
: The input expressiondims
: The dimensions to swap. The ith dimension of the output will be equal to the dims[i] dimension of the input. dims must have the same number of dimensions as x.
-
Expression
dynet::expr::
select_rows
(const Expression &x, const std::vector<unsigned> &rows)¶ Select rows.
Select a subset of rows of a matrix.
- Return
- An expression containing the selected rows
- Parameters
x
: The input expressionrows
: The rows to extract
-
Expression
dynet::expr::
select_rows
(const Expression &x, const std::vector<unsigned> *prows)¶ Modifiable select rows.
Select a subset of rows of a matrix, where the elements of prows can be modified without re-creating the computation graph.
- Return
- An expression containing the selected rows
- Parameters
x
: The input expressionprows
: The rows to extract
-
Expression
dynet::expr::
select_cols
(const Expression &x, const std::vector<unsigned> &cols)¶ Select columns.
Select a subset of columns of a matrix. select_cols is more efficient than select_rows since DyNet uses column-major order.
- Return
- An expression containing the selected columns
- Parameters
x
: The input expressioncols
: The columns to extract
-
Expression
dynet::expr::
select_cols
(const Expression &x, const std::vector<unsigned> *pcols)¶ Modifiable select columns.
Select a subset of columns of a matrix, where the elements of pcols can be modified without re-creating the computation graph.
- Return
- An expression containing the selected columns
- Parameters
x
: The input expressionpcols
: The columns to extract
-
Expression
dynet::expr::
sum_batches
(const Expression &x)¶ Sum over minibatches.
Sum an expression that consists of multiple minibatches into one of equal dimension but with only a single minibatch. This is useful for summing loss functions at the end of minibatch training.
- Return
- An expression with a single batch
- Parameters
x
: The input mini-batched expression
-
Expression
dynet::expr::
pick
(const Expression &x, unsigned v, unsigned d = 0)¶ Pick element.
Pick a single element/row/column/sub-tensor from an expression. This will result in the dimension of the tensor being reduced by 1.
- Return
- The value of x[v] along dimension d
- Parameters
x
: The input expressionv
: The index of the element to selectd
: The dimension along which to choose the element
-
Expression
dynet::expr::
pick
(const Expression &x, const std::vector<unsigned> &v, unsigned d = 0)¶ Batched pick.
Pick elements from multiple batches.
- Return
- A mini-batched expression containing the picked elements
- Parameters
x
: The input expressionv
: A vector of indices to choose, one for each batch in the input expression.d
: The dimension along which to choose the elements
-
Expression
dynet::expr::
pick
(const Expression &x, const unsigned *pv, unsigned d = 0)¶ Modifiable pick element.
Pick a single element from an expression, where the index is passed by pointer so we do not need to re-create the computation graph every time.
- Return
- The value of x[*pv]
- Parameters
x
: The input expressionpv
: Pointer to the index of the element to selectd
: The dimension along which to choose the elements
-
Expression
dynet::expr::
pick
(const Expression &x, const std::vector<unsigned> *pv, unsigned d = 0)¶ Modifiable batched pick element.
Pick multiple elements from an input expression, where the indices are passed by pointer so we do not need to re-create the computation graph every time.
- Return
- A mini-batched expression containing the picked elements
- Parameters
x
: The input expressionpv
: A pointer to a vector of indices to choosed
: The dimension along which to choose the elements
-
Expression
dynet::expr::
pickrange
(const Expression &x, unsigned v, unsigned u)¶ Pick range of elements.
Pick a range of elements from an expression.
- Return
- The value of {x[v],...,x[u-1]} (the end index is exclusive)
- Parameters
x
: The input expressionv
: The beginning indexu
: The end index
-
Expression
dynet::expr::
pick_batch_elem
(const Expression &x, unsigned v)¶ Pick batch element.
Pick batch element from a batched expression. For a Tensor with 3 batch elements:
\( \begin{pmatrix} x_{1,1,1} & x_{1,1,2} \\ x_{1,2,1} & x_{1,2,2} \\ \end{pmatrix} \begin{pmatrix} x_{2,1,1} & x_{2,1,2} \\ x_{2,2,1} & x_{2,2,2} \\ \end{pmatrix} \begin{pmatrix} x_{3,1,1} & x_{3,1,2} \\ x_{3,2,1} & x_{3,2,2} \\ \end{pmatrix} \)
pick_batch_elem(t, 1) will return a Tensor of
\( \begin{pmatrix} x_{2,1,1} & x_{2,1,2} \\ x_{2,2,1} & x_{2,2,2} \\ \end{pmatrix} \)
- Return
- The picked batch element: a tensor whose batch dimension
bd
equals 1. - Parameters
x
: The input expressionv
: The index of the batch element to be picked.
-
Expression
dynet::expr::
pick_batch_elems
(const Expression &x, const std::vector<unsigned> &v)¶ Pick batch elements.
Pick several batch elements from a batched expression. For a Tensor with 3 batch elements:
\( \begin{pmatrix} x_{1,1,1} & x_{1,1,2} \\ x_{1,2,1} & x_{1,2,2} \\ \end{pmatrix} \begin{pmatrix} x_{2,1,1} & x_{2,1,2} \\ x_{2,2,1} & x_{2,2,2} \\ \end{pmatrix} \begin{pmatrix} x_{3,1,1} & x_{3,1,2} \\ x_{3,2,1} & x_{3,2,2} \\ \end{pmatrix} \)
pick_batch_elems(t, {2, 3}) will return a Tensor with 2 batch elements:
\( \begin{pmatrix} x_{2,1,1} & x_{2,1,2} \\ x_{2,2,1} & x_{2,2,2} \\ \end{pmatrix} \begin{pmatrix} x_{3,1,1} & x_{3,1,2} \\ x_{3,2,1} & x_{3,2,2} \\ \end{pmatrix} \)
- Return
- The picked batch elements: a tensor whose batch dimension
bd
equals the size of the vectorv
. - Parameters
x
: The input expressionv
: A vector of indices of the batch elements to be picked.
-
Expression
dynet::expr::
pick_batch_elem
(const Expression &x, const unsigned *v)¶ Modifiable pick batch element.
Pick batch element from a batched expression.
- Return
- The picked batch element: a tensor whose batch dimension
bd
equals 1. - Parameters
x
: The input expressionv
: A pointer to the index of the batch element to be picked.
-
Expression
dynet::expr::
pick_batch_elems
(const Expression &x, const std::vector<unsigned> *pv)¶ Modifiable pick batch elements.
Pick several batch elements from a batched expression.
- Return
- The picked batch elements: a tensor whose batch dimension
bd
equals the size of the vectorv
. - Parameters
x
: The input expressionpv
: A pointer to the vector of indices
-
Expression
dynet::expr::
concatenate_to_batch
(const std::initializer_list<Expression> &xs)¶ Concatenate list of expressions to a single batched expression.
Perform a concatenation of several expressions along the batch dimension. All expressions must have the same shape except for the batch dimension.
- Return
- The expression with the batch dimensions concatenated
- Parameters
xs
: The input expressions
-
Expression
dynet::expr::
concatenate_cols
(const std::initializer_list<Expression> &xs)¶ Concatenate columns.
Perform a concatenation of the columns in multiple expressions. All expressions must have the same number of rows.
- Return
- The expression with the columns concatenated
- Parameters
xs
: The input expressions
-
Expression
dynet::expr::
concatenate
(const std::initializer_list<Expression> &xs, unsigned d = 0)¶ Concatenate.
Perform a concatenation of multiple expressions along a particular dimension. All expressions must have the same dimensions except for the dimension to be concatenated (rows by default).
- Return
- The expression with the specified dimension concatenated
- Parameters
xs
: The input expressionsd
: The dimension along which to perform concatenation
-
Expression
dynet::expr::
max_dim
(const Expression &x, unsigned d = 0)¶ Max out through a dimension.
Select an element/row/column/sub-tensor from an expression with the maximum value along a given dimension. This will result in the dimension of the tensor being reduced by 1.
- Return
- An expression of sub-tensor with max value along dimension d
- Parameters
x
: The input expressiond
: The dimension along which to choose the element
-
Expression
dynet::expr::
min_dim
(const Expression &x, unsigned d = 0)¶ Min out through a dimension.
Select an element/row/column/sub-tensor from an expression with the minimum value along a given dimension. This will result in the dimension of the tensor being reduced by 1.
- Return
- An expression of sub-tensor with min value along dimension d
- Parameters
x
: The input expressiond
: The dimension along which to choose the element
Noise Operations¶
These operations are used to add noise to the graph for purposes of making learning more robust.
-
Expression
dynet::expr::
noise
(const Expression &x, real stddev)¶ Gaussian noise.
Add gaussian noise to an expression.
- Return
- The noised expression
- Parameters
x
: The input expressionstddev
: The standard deviation of the gaussian
-
Expression
dynet::expr::
dropout
(const Expression &x, real p)¶ Dropout.
With a fixed probability p, drop out (set to zero) nodes in the input expression, and scale the remaining nodes by 1/(1-p). Note that there are two kinds of dropout:
- Regular dropout: dropout is performed at training time, and outputs are scaled by the keep probability 1-p at test time.
- Inverted dropout: dropout and scaling are both performed at training time, so nothing needs to be done at test time. DyNet implements the latter, so you only need to apply dropout at training time, and do not need to perform any scaling at test time.
- Return
- The dropped out expression
- Parameters
x
: The input expressionp
: The dropout probability
-
Expression
dynet::expr::
block_dropout
(const Expression &x, real p)¶ Block dropout.
Identical to the dropout operation, but either drops out all or no values in the expression, as opposed to making a decision about each value individually.
- Return
- The block dropout expression
- Parameters
x
: The input expressionp
: The block dropout probability
Tensor Operations¶
These operations are used for performing operations on higher order tensors.
-
Expression
dynet::expr::
contract3d_1d
(const Expression &x, const Expression &y)¶ Contracts a rank 3 tensor and a rank 1 tensor into a rank 2 tensor.
The resulting tensor \(z\) has coordinates \(z_{ij} = \sum_k x_{ijk} y_k\)
- Return
- Matrix
- Parameters
x
: Rank 3 tensory
: Vector
-
Expression
dynet::expr::
contract3d_1d_1d
(const Expression &x, const Expression &y, const Expression &z)¶ Contracts a rank 3 tensor and two rank 1 tensors into a rank 1 tensor.
This is the equivalent of calling
contract3d_1d
and then performing a matrix vector multiplication.The resulting tensor \(t\) has coordinates \(t_i = \sum_{j,k} x_{ijk} y_k z_j\)
- Return
- Vector
- Parameters
x
: Rank 3 tensory
: Vectorz
: Vector
-
Expression
dynet::expr::
contract3d_1d_1d
(const Expression &x, const Expression &y, const Expression &z, const Expression &b)¶ Same as
contract3d_1d_1d
with an additional bias parameter.This is the equivalent of calling
contract3d_1d
and then performing an affine transform.The resulting tensor \(t\) has coordinates \(t_i = b_i + \sum_{j,k} x_{ijk} y_k z_j\)
- Return
- Vector
- Parameters
x
: Rank 3 tensory
: Vectorz
: Vectorb
: Bias vector
-
Expression
dynet::expr::
contract3d_1d
(const Expression &x, const Expression &y, const Expression &b)¶ Same as
contract3d_1d
with an additional bias parameter.The resulting tensor \(z\) has coordinates \(z_{ij} = b_{ij}+\sum_k x_{ijk} y_k\)
- Return
- Matrix
- Parameters
x
: Rank 3 tensory
: Vectorb
: Bias matrix
Linear Algebra Operations¶
These operations are used for performing various operations common in linear algebra.
-
Expression
dynet::expr::
inverse
(const Expression &x)¶ Matrix Inverse.
Takes the inverse of a matrix (not implemented on GPU yet, although contributions are welcome: https://github.com/clab/dynet/issues/158). Note that back-propagating through an inverted matrix can sometimes be a source of numerical instability.
- Return
- The inverse of the matrix
- Parameters
x
: A square matrix
-
Expression
dynet::expr::
logdet
(const Expression &x)¶ Log determinant.
Takes the log of the determinant of a matrix. (not implemented on GPU yet, although contributions are welcome: https://github.com/clab/dynet/issues/158).
- Return
- The log of its determinant
- Parameters
x
: A square matrix
-
Expression
dynet::expr::
trace_of_product
(const Expression &x, const Expression &y)¶ Trace of Matrix Product.
Takes the trace of the product of matrices. (not implemented on GPU yet, although contributions are welcome: https://github.com/clab/dynet/issues/158).
- Return
- trace(x * y)
- Parameters
x
: A matrixy
: Another matrix
Convolution Operations¶
These operations are convolution-related.
-
Expression
dynet::expr::
conv2d
(const Expression &x, const Expression &f, const std::vector<unsigned> &stride, bool is_valid = true)¶ conv2d without bias
2D convolution operator without bias parameters. ‘VALID’ and ‘SAME’ convolutions are supported. To see the distinction, consider the case when the stride is 1:
- SAME: the output size is the same as the input size. To achieve this, the input is padded so the filter can sweep outside of the input maps.
- VALID: output size shrinks by filter_size - 1, and the filters always sweep at valid positions inside the input maps. No padding needed.
In detail, assume:
- Input feature maps: (XH x XW x XC) x N
- Filters: FH x FW x XC x FC, 4D tensor
- Strides: strides[0] and strides[1] are row (h) and col (w) stride, respectively.
For the SAME convolution: the output height (YH) and width (YW) are computed as:
- YH = ceil(float(XH) / float(strides[0]))
- YW = ceil(float(XW) / float(strides[1])) and the paddings are computed as:
- pad_along_height = max((YH - 1) * strides[0] + FH - XH, 0)
- pad_along_width = max((YW - 1) * strides[1] + FW - XW, 0)
- pad_top = pad_along_height / 2
- pad_bottom = pad_along_height - pad_top
- pad_left = pad_along_width / 2
- pad_right = pad_along_width - pad_left
For the VALID convolution: the output height (YH) and width (YW) are computed as:
- YH = ceil(float(XH - FH + 1) / float(strides[0]))
- YW = ceil(float(XW - FW + 1) / float(strides[1])) and the paddings are always zeros.
- Return
- The output feature maps (H x W x Co) x N, 3D tensor with an optional batch dimension
- Parameters
x
: The input feature maps: (H x W x Ci) x N (ColMaj), 3D tensor with an optional batch dimensionf
: 2D convolution filters: H x W x Ci x Co (ColMaj), 4D tensorstride
: the row and column stridesis_valid
: ‘VALID’ convolution or ‘SAME’ convolution, default is true (‘VALID’)
-
Expression
dynet::expr::
conv2d
(const Expression &x, const Expression &f, const Expression &b, const std::vector<unsigned> &stride, bool is_valid = true)¶ conv2d with bias
2D convolution operator with bias parameters. ‘VALID’ and ‘SAME’ convolutions are supported. To see the distinction, consider the case when the stride is 1:
- SAME: the output size is the same as the input size. To achieve this, the input is padded so the filter can sweep outside of the input maps.
- VALID: output size shrinks by filter_size - 1, and the filters always sweep at valid positions inside the input maps. No padding needed.
In detail, assume:
- Input feature maps: XH x XW x XC x N
- Filters: FH x FW x XC x FC
- Strides: strides[0] and strides[1] are row (h) and col (w) stride, respectively.
For the SAME convolution: the output height (YH) and width (YW) are computed as:
- YH = ceil(float(XH) / float(strides[0]))
- YW = ceil(float(XW) / float(strides[1])) and the paddings are computed as:
- pad_along_height = max((YH - 1) * strides[0] + FH - XH, 0)
- pad_along_width = max((YW - 1) * strides[1] + FW - XW, 0)
- pad_top = pad_along_height / 2
- pad_bottom = pad_along_height - pad_top
- pad_left = pad_along_width / 2
- pad_right = pad_along_width - pad_left
For the VALID convolution: the output height (YH) and width (YW) are computed as:
- YH = ceil(float(XH - FH + 1) / float(strides[0]))
- YW = ceil(float(XW - FW + 1) / float(strides[1])) and the paddings are always zeros.
- Return
- The output feature maps (H x W x Co) x N, 3D tensor with an optional batch dimension
- Parameters
x
: The input feature maps: (H x W x Ci) x N (ColMaj), 3D tensor with an optional batch dimensionf
: 2D convolution filters: H x W x Ci x Co (ColMaj), 4D tensorb
: The bias (1D: Co)stride
: the row and column stridesis_valid
: ‘VALID’ convolution or ‘SAME’ convolution, default is true (‘VALID’)
Normalization Operations¶
This includes batch normalization and the likes.
-
Expression
dynet::expr::
layer_norm
(const Expression &x, const Expression &g, const Expression &b)¶ Layer normalization.
Performs layer normalization:
\( \begin{split} \mu &= \frac 1 n \sum_{i=1}^n x_i\\ \sigma &= \sqrt{\frac 1 n \sum_{i=1}^n (x_i-\mu)^2}\\ y&=\frac {\boldsymbol{g}} \sigma \circ (\boldsymbol{x}-\mu) + \boldsymbol{b}\\ \end{split} \)
Reference: Ba et al., 2016
- Return
- An expression of the same dimension as
x
- Parameters
x
: Input expression (possibly batched)g
: Gain (same dimension as x, no batch dimension)b
: Bias (same dimension as x, no batch dimension)