Core functionalities¶

Computation Graph¶

The ComputationGraph is the workhorse of DyNet. From the DyNet technical report:

[The] computation graph represents symbolic computation, and the results of the computation are evaluated lazily: the computation is only performed once the user explicitly asks for it (at which point a “forward” computation is triggered). Expressions that evaluate to scalars (i.e. loss values) can also be used to trigger a “backward” computation, computing the gradients of the computation with respect to the parameters.

int dynet::get_number_of_active_graphs()¶

Gets the number of active graphs.

This is 0 or 1, since you can’t create more than one graph at once.

Return: Number of active graphs

unsigned dynet::get_current_graph_id()¶

Get id of the current active graph.

This can help check whether a graph is stale

Return: Id of the current graph

struct ComputationGraph¶

#include <dynet.h>

Computation graph where nodes represent forward and backward intermediate values, and edges represent functions of multiple values.

To represent the fact that a function may have multiple arguments, edges have a single head and 0, 1, 2, or more tails. (Constants, inputs, and parameters are represented as functions of 0 arguments.) Example: given the function z = f(x, y), z, x, and y are nodes, and there is an edge representing f which points to the z node (i.e., its head), with x and y as the tails of the edge. You shouldn’t need to use most methods of the ComputationGraph except for backward, since most of them are available directly from the Expression class.
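The head/tail structure and the lazy evaluation described above can be illustrated with a small self-contained sketch. This is a toy model only: ToyGraph, ToyEdge, and build_product_graph are hypothetical names invented for this example and are not DyNet types, and values are plain doubles rather than tensors.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Toy model only: ToyEdge and ToyGraph are hypothetical, not DyNet types.
struct ToyEdge {
  std::vector<int> tails;                                // indices of the argument nodes
  std::function<double(const std::vector<double>&)> fn;  // function carried by the edge
};

struct ToyGraph {
  std::vector<ToyEdge> edges;  // edges[i] has node i as its head
  int evaluations = 0;         // counts how many edge functions have run

  // Adding a node only records the symbolic structure; nothing is computed.
  int add(std::vector<int> tails,
          std::function<double(const std::vector<double>&)> fn) {
    edges.push_back({std::move(tails), std::move(fn)});
    return static_cast<int>(edges.size()) - 1;
  }

  // Inputs and constants are functions of zero arguments.
  int add_input(double v) {
    return add({}, [v](const std::vector<double>&) { return v; });
  }

  // Forward pass: evaluate nodes 0..last in insertion (topological) order.
  double forward(int last) {
    std::vector<double> values(last + 1);
    for (int i = 0; i <= last; ++i) {
      std::vector<double> args;
      for (int t : edges[i].tails) args.push_back(values[t]);
      values[i] = edges[i].fn(args);
      ++evaluations;
    }
    return values[last];
  }
};

// Builds z = f(x, y) = x * y and returns the graph without evaluating it.
ToyGraph build_product_graph(double x, double y) {
  ToyGraph g;
  int xi = g.add_input(x);
  int yi = g.add_input(y);
  g.add({xi, yi}, [](const std::vector<double>& a) { return a[0] * a[1]; });
  return g;
}
```

In this sketch, building the graph runs no edge function; work happens only when forward is called on the node of interest.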

Public Functions

ComputationGraph()¶: Default constructor.

VariableIndex add_input(real s, Device *device)¶

Add scalar input.

The computational network will pull inputs in from the user’s data structures and make them available to the computation

Return

The index of the created variable

Parameters

s: Real number
device: The device to place input value

VariableIndex add_input(const real *ps, Device *device)¶

Add scalar input by pointer.

The computational network will pull inputs in from the user’s data structures and make them available to the computation

Return

The index of the created variable

Parameters

ps: Pointer to a real number
device: The device to place input value

VariableIndex add_input(const Dim &d, const std::vector<float> &data, Device *device)¶

Add multidimensional input.

The computational network will pull inputs in from the user’s data structures and make them available to the computation

Return

The index of the created variable

Parameters

d: Desired shape of the input
data: Input data (as a 1 dimensional array)
device: The device to place input value

VariableIndex add_input(const Dim &d, const std::vector<float> *pdata, Device *device)¶

Add multidimensional input by pointer.

The computational network will pull inputs in from the user’s data structures and make them available to the computation

Return

The index of the created variable

Parameters

d: Desired shape of the input
pdata: Pointer to the input data (as a 1 dimensional array)
device: The device to place input value

VariableIndex add_input(const Dim &d, const std::vector<unsigned int> &ids, const std::vector<float> &data, Device *device, float defdata = 0.f)¶

Add sparse input.

The computational network will pull inputs in from the user’s data structures and make them available to the computation. Represents specified (not learned) inputs to the network in sparse array format, with an optional default value.

Return

The index of the created variable

Parameters

d: Desired shape of the input
ids: The indexes of the data points to update
data: The data points corresponding to each index
device: The device to place input value
defdata: The default data with which to set the unspecified data points
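As a rough illustration of what the sparse format stands for, the sketch below expands ids, data, and defdata into the dense array they describe. The helper densify is hypothetical and is not part of DyNet; it assumes ids index into the flattened array and, as a choice made only for this example, silently skips out-of-range ids.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of DyNet: expands a sparse description
// (ids + data + default) into the dense array it stands for.
std::vector<float> densify(std::size_t total_size,
                           const std::vector<unsigned>& ids,
                           const std::vector<float>& data,
                           float defdata = 0.f) {
  std::vector<float> dense(total_size, defdata);  // unspecified entries get defdata
  for (std::size_t k = 0; k < ids.size() && k < data.size(); ++k) {
    if (ids[k] < total_size) dense[ids[k]] = data[k];  // skip out-of-range ids
  }
  return dense;
}
```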

VariableIndex add_parameters(Parameter p)¶

Add a parameter to the computation graph.

Return

The index of the created variable

Parameters

p: Parameter to be added

VariableIndex add_parameters(LookupParameter p)¶

Add a full matrix of lookup parameters to the computation graph.

Return

The index of the created variable

Parameters

p: LookupParameter to be added

VariableIndex add_const_parameters(Parameter p)¶

Add a parameter to the computation graph (but don’t update)

Return

The index of the created variable

Parameters

p: Parameter to be added

VariableIndex add_const_parameters(LookupParameter p)¶

Add a full matrix of lookup parameters to the computation graph (but don’t update)

Return

The index of the created variable

Parameters

p: LookupParameter to be added

VariableIndex add_lookup(LookupParameter p, const unsigned *pindex)¶

Add a lookup parameter to the computation graph.

pindex must point to a memory location, owned by the caller, where the index will live.

Return

The index of the created variable

Parameters

p: Lookup parameter from which to pick
pindex: Pointer to the index to lookup

VariableIndex add_lookup(LookupParameter p, unsigned index)¶

Add a lookup parameter to the computation graph.

Return

The index of the created variable

Parameters

p: Lookup parameter from which to pick
index: Index to lookup

VariableIndex add_lookup(LookupParameter p, const std::vector<unsigned> *pindices)¶

Add lookup parameters to the computation graph.

pindices must point to a memory location, owned by the caller, where the indices will live.

Return

The index of the created variable

Parameters

p: Lookup parameter from which to pick
pindices: Pointer to the indices to lookup

VariableIndex add_lookup(LookupParameter p, const std::vector<unsigned> &indices)¶

Add lookup parameters to the computation graph.

Return

The index of the created variable

Parameters

p: Lookup parameter from which to pick
indices: Indices to lookup

VariableIndex add_const_lookup(LookupParameter p, const unsigned *pindex)¶

Add a lookup parameter to the computation graph.

Just like add_lookup, but don’t optimize the lookup parameters

Return

The index of the created variable

Parameters

p: Lookup parameter from which to pick
pindex: Pointer to the index to lookup

VariableIndex add_const_lookup(LookupParameter p, unsigned index)¶

Add a lookup parameter to the computation graph.

Just like add_lookup, but don’t optimize the lookup parameters

Return

The index of the created variable

Parameters

p: Lookup parameter from which to pick
index: Index to lookup

VariableIndex add_const_lookup(LookupParameter p, const std::vector<unsigned> *pindices)¶

Add lookup parameters to the computation graph.

Just like add_lookup, but don’t optimize the lookup parameters

Return

The index of the created variable

Parameters

p: Lookup parameter from which to pick
pindices: Pointer to the indices to lookup

VariableIndex add_const_lookup(LookupParameter p, const std::vector<unsigned> &indices)¶

Add lookup parameters to the computation graph.

Just like add_lookup, but don’t optimize the lookup parameters

Return

The index of the created variable

Parameters

p: Lookup parameter from which to pick
indices: Indices to lookup

template <class Function> VariableIndex add_function(const std::initializer_list<VariableIndex> &arguments)¶

Add a function to the computation graph.

This is what is called when creating an expression

Return

The index of the output variable

Parameters

arguments: List of the argument indices

Template Parameters

Function: Function to be applied

template <class Function, typename... Args> VariableIndex add_function(const std::initializer_list<VariableIndex> &arguments, Args&&... side_information)¶

Add a function to the computation graph (with side information)

This is what is called when creating an expression

Return

The index of the output variable

Parameters

arguments: List of the argument indices
side_information: Side information that is needed to compute the function

Template Parameters

Function: Function to be applied

void clear()¶

Reset ComputationGraph to a newly created state.


void checkpoint()¶: Set a checkpoint.

void revert()¶: Revert to last checkpoint.

Dim &get_dimension(VariableIndex index) const¶

Get dimension of a node.

Return

Dimension

Parameters

index: Variable index of the node

const Tensor &forward(const Expression &last)¶

Run complete forward pass from first node to given one, ignoring all precomputed values.

Return

Value of the last Expression after execution

Parameters

last: Expression up to which the forward pass must be computed

const Tensor &forward(VariableIndex i)¶

Run complete forward pass from first node to given one, ignoring all precomputed values.

Return

Value of the end Node after execution

Parameters

i: Variable index of the node up to which the forward pass must be computed

const Tensor &incremental_forward(const Expression &last)¶

Run forward pass from the last computed node to given one.

Useful if you want to add nodes and evaluate just the new parts.

Return

Value of the last Expression after execution

Parameters

last: Expression up to which the forward pass must be computed

const Tensor &incremental_forward(VariableIndex i)¶

Run forward pass from the last computed node to given one.

Useful if you want to add nodes and evaluate just the new parts.

Return

Value of the end Node after execution

Parameters

i: Variable index of the node up to which the forward pass must be computed
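The difference between a complete and an incremental forward pass can be sketched with a toy graph that caches node values. This is only an illustration under stated assumptions: CachingGraph is a hypothetical name, values are plain doubles, and DyNet's actual bookkeeping may differ.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Toy model only (hypothetical names): caches values so that an incremental
// pass evaluates just the nodes added since the previous pass.
struct CachingGraph {
  using Fn = std::function<double(const std::vector<double>&)>;
  std::vector<std::vector<int>> tails;
  std::vector<Fn> fns;
  std::vector<double> values;  // values[i] is valid for i < values.size()
  int evaluations = 0;

  int add(std::vector<int> t, Fn f) {
    tails.push_back(std::move(t));
    fns.push_back(std::move(f));
    return static_cast<int>(fns.size()) - 1;
  }

  // Discards cached values, then evaluates everything up to `last`.
  double forward(int last) {
    values.clear();
    return incremental_forward(last);
  }

  // Resumes after the last cached node.
  double incremental_forward(int last) {
    for (int i = static_cast<int>(values.size()); i <= last; ++i) {
      std::vector<double> args;
      for (int t : tails[i]) args.push_back(values[t]);
      values.push_back(fns[i](args));
      ++evaluations;
    }
    return values[last];
  }
};
```

After two nodes have been evaluated, adding a third and calling incremental_forward runs only one more function, whereas forward re-runs all three.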

const Tensor &get_value(VariableIndex i)¶

Get forward value for node at index i.

Performs forward evaluation if not available (may compute more than strictly what is needed).

Return

Requested value

Parameters

i: Index of the variable from which you want the value

const Tensor &get_value(const Expression &e)¶

Get forward value for the given expression.

Performs forward evaluation if not available (may compute more than strictly what is needed).

Return

Requested value

Parameters

e: Expression from which you want the value

const Tensor &get_gradient(VariableIndex i)¶

Get gradient for node at index i.

Performs backward pass if not available (may compute more than strictly what is needed).

Return

Requested gradient

Parameters

i: Index of the variable from which you want the gradient

const Tensor &get_gradient(const Expression &e)¶

Get gradient for the given expression.

Performs backward pass if not available (may compute more than strictly what is needed).

Return

Requested gradient

Parameters

e: Expression from which you want the gradient

void invalidate()¶: Clears forward caches (for get_value etc).

void backward(const Expression &last, bool full = false)¶

Computes backward gradients from the front-most evaluated node.

The parameter full specifies whether the gradients should be computed for all nodes (true) or only non-constant nodes.

By default, a node is constant unless

it is a parameter node
it depends on a non-constant node

Thus, functions of constants and inputs are considered as constants.

Turn full on if you want to retrieve gradients w.r.t. inputs for instance. By default this is turned off, so that the backward pass ignores nodes which have no influence on gradients w.r.t. parameters for efficiency.

Parameters

last: Expression from which to compute the gradient
full: Whether to compute all gradients (including with respect to constant nodes).
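The constant/non-constant rule above can be written as a single pass over the nodes. The sketch below is a toy illustration: NodeInfo and needs_gradient are hypothetical names, and nodes are assumed to be stored in topological order (arguments before the nodes that use them).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical description of a node, for this sketch only.
struct NodeInfo {
  bool is_parameter;      // true for parameter nodes
  std::vector<int> args;  // indices of the nodes this one depends on
};

// Returns, for each node, whether the backward pass needs its gradient.
// Nodes are assumed to be listed in topological order (arguments first).
std::vector<bool> needs_gradient(const std::vector<NodeInfo>& nodes, bool full) {
  std::vector<bool> needed(nodes.size(), full);  // full == true: every node
  if (full) return needed;
  for (std::size_t i = 0; i < nodes.size(); ++i) {
    // Non-constant if it is a parameter or depends on a non-constant node.
    bool non_constant = nodes[i].is_parameter;
    for (int a : nodes[i].args) non_constant = non_constant || needed[a];
    needed[i] = non_constant;
  }
  return needed;
}
```

With an input node, a parameter node, a function of the input, and a function of both, only the parameter and the last node are marked unless full is true.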

void backward(VariableIndex i, bool full = false)¶

Computes backward gradients from node i (assuming it has already been evaluated).

The parameter full specifies whether the gradients should be computed for all nodes (true) or only non-constant nodes.

By default, a node is constant unless

it is a parameter node
it depends on a non-constant node

Thus, functions of constants and inputs are considered as constants.

Turn full on if you want to retrieve gradients w.r.t. inputs for instance. By default this is turned off, so that the backward pass ignores nodes which have no influence on gradients w.r.t. parameters for efficiency.

Parameters

i: Index of the node from which to compute the gradient
full: Whether to compute all gradients (including with respect to constant nodes). Turn this on if you want to retrieve gradients w.r.t. inputs for instance. By default this is turned off, so that the backward pass ignores nodes which have no influence on gradients w.r.t. parameters for efficiency.

void print_graphviz() const¶: Used for debugging.

unsigned get_id() const¶

Get the unique graph ID.

This ID is incremented by 1 each time a computation graph is created

Return: Graph ID

Nodes¶

Nodes are constituents of the computation graph. The end user doesn’t interact with Nodes but with Expressions.

However, implementing new operations requires creating a new subclass of the Node class described below.

struct Node¶

#include <dynet.h>

Represents an SSA variable.

Contains information on the computation node: arguments, output value and gradient of the output with respect to the function. This class must be inherited to implement any new operation. See nodes.cc for examples. An operation on expressions can then be created from the new Node; see expr.h/expr.cc for examples.

Subclassed by dynet::Abs, dynet::Acos, dynet::Acosh, dynet::AddVectorToAllColumns, dynet::AffineTransform, dynet::Argmax, dynet::Asin, dynet::Asinh, dynet::Atan, dynet::Atanh, dynet::Average, dynet::BinaryLogLoss, dynet::BlockDropout, dynet::Ceil, dynet::CircularConvolution, dynet::CircularCorrelation, dynet::Concatenate, dynet::ConcatenateToBatch, dynet::Constant, dynet::ConstantMinusX, dynet::ConstantPlusX, dynet::ConstParameterNode, dynet::ConstrainedSoftmax, dynet::ConstScalarMultiply, dynet::Conv2D, dynet::Cos, dynet::Cosh, dynet::Cube, dynet::CumulativeSum, dynet::CwiseMultiply, dynet::CwiseQuotient, dynet::CwiseSum, dynet::DotProduct, dynet::Dropout, dynet::DropoutBatch, dynet::DropoutDim, dynet::Erf, dynet::Exp, dynet::ExponentialLinearUnit, dynet::Filter1DNarrow, dynet::Floor, dynet::FoldRows, dynet::GaussianNoise, dynet::Hinge, dynet::HingeDim, dynet::HuberDistance, dynet::Identity, dynet::InnerProduct3D_1D, dynet::InnerProduct3D_1D_1D, dynet::InputNode, dynet::KMaxPooling, dynet::KMHNGram, dynet::L1Distance, dynet::L2Norm, dynet::Log, dynet::LogDet, dynet::LogGamma, dynet::LogisticSigmoid, dynet::LogSigmoid, dynet::LogSoftmax, dynet::LogSumExp, dynet::LogSumExpDimension, dynet::MatrixInverse, dynet::MatrixMultiply, dynet::Max, dynet::MaxDimension, dynet::MaxPooling1D, dynet::MaxPooling2D, dynet::Min, dynet::MinDimension, dynet::MomentBatches, dynet::MomentDimension, dynet::MomentElements, dynet::Negate, dynet::NoBackprop, dynet::PairwiseRankLoss, dynet::ParameterNodeBase, dynet::PickBatchElements, dynet::PickElement, dynet::PickNegLogSoftmax, dynet::PickRange, dynet::PoissonRegressionLoss, dynet::Pow, dynet::RandomBernoulli, dynet::RandomGumbel, dynet::RandomNormal, dynet::RandomUniform, dynet::Rectify, dynet::Reshape, dynet::RestrictedLogSoftmax, dynet::Round, dynet::ScalarInputNode, dynet::ScaleGradient, dynet::SelectCols, dynet::SelectRows, dynet::SigmoidLinearUnit, dynet::Sin, dynet::Sinh, dynet::Softmax, dynet::SoftSign, dynet::SparseInputNode, 
dynet::Sparsemax, dynet::SparsemaxLoss, dynet::Sqrt, dynet::Square, dynet::SquaredEuclideanDistance, dynet::SquaredNorm, dynet::StdBatches, dynet::StdDimension, dynet::StdElements, dynet::StridedSelect, dynet::Sum, dynet::SumDimension, dynet::SumElements, dynet::Tan, dynet::Tanh, dynet::ToDevice, dynet::TraceOfProduct, dynet::Transpose, dynet::VanillaLSTMC, dynet::VanillaLSTMGates, dynet::VanillaLSTMH, dynet::WeightNormalization

Public Types

enum INPLACE_TYPE¶

Type for the inplace operations: NOPE (non-inplace), READ (no changes to the memory), WRITE (possibly unrecoverable changes)

Values:

NOPE¶

READ¶

WRITE¶

Public Functions

virtual Dim dim_forward(const std::vector<Dim> &xs) const = 0¶

Compute dimensions of result for given dimensions of inputs.

Also checks to make sure inputs are compatible with each other

Return

Dimension of the output

Parameters

xs: Vector containing the dimensions of the inputs

virtual std::string as_string(const std::vector<std::string> &args) const = 0¶

Returns important information for debugging.

See nodes-conv.cc for examples

Return

String description of the node

Parameters

args: String descriptions of the arguments

size_t aux_storage_size() const¶

Size of the auxiliary storage.

In general, this will return an empty size, but if a component needs to store extra information in the forward pass for use in the backward pass, it can request the memory here (n.b. you could put it on the Node object, but in general, edges should not allocate tensor memory since memory is managed centrally for the entire computation graph).

Return: Size

virtual void forward_impl(const std::vector<const Tensor *> &xs, Tensor &fx) const = 0¶

Forward computation.

This function contains the logic for the forward pass. Some implementation remarks from nodes.cc:

fx can be understood as a pointer to the (preallocated) location for the result of forward to be stored
fx is not initialized, so after calling forward fx must point to the correct answer
fx can be repointed to an input, if forward(x) evaluates to x (e.g., in reshaping)
scalar results of forward are placed in fx.v[0]
DYNET manages its own memory, not Eigen, and it is configured with the EIGEN_NO_MALLOC option. If you get an error about Eigen attempting to allocate memory, it is (probably) because of an implicit creation of a temporary variable. To tell Eigen this is not necessary, the noalias() method is available. If you really do need a temporary variable, its capacity must be requested by Node::aux_storage_size

Note on debugging problems with differentiable components

fx is uninitialized when forward is called. Are you relying on it being 0?

Parameters

xs: Pointers to the inputs
fx: pointer to the (preallocated) location for the result of forward to be stored

virtual void backward_impl(const std::vector<const Tensor *> &xs, const Tensor &fx, const Tensor &dEdf, unsigned i, Tensor &dEdxi) const = 0¶

Accumulates the derivative of E with respect to the ith argument to f, that is, xs[i].

This function contains the logic for the backward pass. Some implementation remarks from nodes.cc:

dEdxi MUST ACCUMULATE a result since multiple calls to forward may depend on the same x_i. Even, e.g., Identity must be implemented as dEdx1 += dEdf. THIS IS EXTREMELY IMPORTANT
scalar results of forward are placed in fx.v[0]
DYNET manages its own memory, not Eigen, and it is configured with the EIGEN_NO_MALLOC option. If you get an error about Eigen attempting to allocate memory, it is (probably) because of an implicit creation of a temporary variable. To tell Eigen this is not necessary, the noalias() method is available. If you really do need a temporary variable, its capacity must be requested by Node::aux_storage_size

Note on debugging problems with differentiable components

dEdxi must accumulate (see the first remark above!)

Parameters

xs: Pointers to inputs
fx: Output
dEdf: Gradient of the objective w.r.t the output of the node
i: Index of the input w.r.t which we take the derivative
dEdxi: Gradient of the objective w.r.t the input of the node
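The accumulation requirement can be seen on a toy example where the same value feeds both arguments of a node. The sketch below uses plain doubles and a hypothetical ToyProduct struct rather than DyNet's Node and Tensor types; it only illustrates why the backward step must use += rather than =.

```cpp
#include <vector>

// Toy scalar node (hypothetical, not DyNet's Node): f(x0, x1) = x0 * x1.
// backward() follows the accumulation rule: it adds into dEdxi.
struct ToyProduct {
  double forward(const std::vector<double>& xs) const { return xs[0] * xs[1]; }

  void backward(const std::vector<double>& xs, double dEdf, unsigned i,
                double& dEdxi) const {
    dEdxi += dEdf * xs[1 - i];  // "+=", never "="
  }
};

// Computes dE/dx for E = x * x, where the same value x feeds both arguments.
double gradient_of_square(double x) {
  ToyProduct node;
  std::vector<double> xs = {x, x};
  double dEdx = 0.0;                // gradient buffer starts at zero
  node.backward(xs, 1.0, 0, dEdx);  // contribution through argument 0
  node.backward(xs, 1.0, 1, dEdx);  // contribution through argument 1
  return dEdx;                      // x + x = 2x
}
```

Had backward assigned instead of accumulated, the second call would overwrite the first and the result would be x instead of 2x.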

virtual bool supports_multibatch() const¶

Whether this node supports computing multiple batches in one call.

If true, forward and backward will be called once with a multi-batch tensor. If false, forward and backward will be called multiple times for each item.

Return: Support for multibatch

virtual bool supports_multidevice() const¶

Whether this node supports processing inputs/outputs on multiple devices.

DyNet will throw an error if you try to process inputs and outputs on different devices unless this is activated.

Return: Support for multi-device

void forward(const std::vector<const Tensor *> &xs, Tensor &fx) const¶

Perform the forward pass in one or multiple calls.

Parameters

xs: Pointers to the inputs
fx: pointer to the (preallocated) location for the result of forward to be stored

void backward(const std::vector<const Tensor *> &xs, const Tensor &fx, const Tensor &dEdf, unsigned i, Tensor &dEdxi) const¶

Perform the backward pass in one or multiple calls.

Parameters

xs: Pointers to inputs
fx: Output
dEdf: Gradient of the objective w.r.t the output of the node
i: Index of the input w.r.t which we take the derivative
dEdxi: Gradient of the objective w.r.t the input of the node

virtual int autobatch_sig(const ComputationGraph &cg, SigMap &sm) const¶: Signature for automatic batching. This will be equal only for nodes that can be combined. Returns 0 for unbatchable functions.

virtual std::vector<int> autobatch_concat(const ComputationGraph &cg) const¶: Which inputs can be batched. This will be true for inputs that should be concatenated when autobatching, and false for inputs that should be shared among all batches.

virtual Node *autobatch_pseudo_node(const ComputationGraph &cg, const std::vector<VariableIndex> &batch_ids) const¶: Create a pseudonode for autobatching. This will combine multiple nodes into one big node for the automatic batching functionality. When a node representing one component of the mini-batch can be used as-is, it is OK to just return the null pointer; otherwise we should make the appropriate changes and return a new node.

virtual void autobatch_reshape(const ComputationGraph &cg, const std::vector<VariableIndex> &batch_ids, const std::vector<int> &concat, std::vector<const Tensor *> &xs, Tensor &fx) const¶: Reshape the tensors for autobatching. Takes in info, and reshapes the dimensions of xs (for which “concat” is true), and fx. By default do no reshaping, which is OK for componentwise operations.

void autobatch_reshape_concatonly(const ComputationGraph &cg, const std::vector<VariableIndex> &batch_ids, const std::vector<int> &concat, std::vector<const Tensor *> &xs, Tensor &fx) const¶: Reshape the tensors for autobatching. Takes in info, and reshapes the dimensions of xs (for which “concat” is true) and fx by concatenating their batches.

unsigned arity() const¶

Number of arguments to the function.

Return: Arity of the function

Public Members

std::vector<VariableIndex> args¶: Dependency structure

Dim dim¶: Will be .size() = 0 initially; filled in by forward(). TODO: fix this

void *aux_mem¶: This will usually be null, but if your node needs to store intermediate values between forward and backward, you can store them here. Request the number of bytes you need from aux_storage_size(). Note: this memory will be on the CPU or GPU, depending on your computation backend.

Parameters and Model¶

Parameters are things that are optimized. In contrast to a system like Torch, where computational modules may have their own parameters, in DyNet parameters are just parameters.

To deal with sparse updates, there are two parameter classes:

Parameters represents a vector, a matrix, or (eventually) a higher-order tensor of parameters. These are densely updated.
LookupParameters represents a table of vectors that are used to embed a set of discrete objects. These are sparsely updated.

struct ParameterStorageBase¶

#include <model.h>

This is the base class for ParameterStorage and LookupParameterStorage, the objects handling the actual parameters.

You can access the storage from any Parameter (resp. LookupParameter) object; use it only for low-level manipulations.

Subclassed by dynet::LookupParameterStorage, dynet::ParameterStorage

Public Functions

virtual void scale_parameters(float a) = 0¶

Scale the parameters.

Parameters

a: scale factor

virtual void scale_gradient(float a) = 0¶

Scale the gradient.

Parameters

a: scale factor

virtual void zero() = 0¶: Set the parameters to 0.

virtual void squared_l2norm(float *sqnorm) const = 0¶

Get the parameter squared l2 norm.

Parameters

sqnorm: Pointer to the float holding the result

virtual void g_squared_l2norm(float *sqnorm) const = 0¶

Get the squared l2 norm of the gradient w.r.t. these parameters.

Parameters

sqnorm: Pointer to the float holding the result

virtual bool is_updated() const = 0¶: Check whether these parameters are updated.

virtual bool has_grad() const = 0¶: Check whether the gradient is zero or not (true if gradient is non-zero)

virtual size_t size() const = 0¶

Get the size (number of scalar parameters)

Return: Number of scalar parameters

struct ParameterStorage : public dynet::ParameterStorageBase ¶

#include <model.h>

Storage class for Parameters.

Subclassed by dynet::ParameterStorageCreator

Public Functions

void copy(const ParameterStorage &val)¶

Copy from another ParameterStorage.

Parameters

val: ParameterStorage to copy from

void accumulate_grad(const Tensor &g)¶

Add a tensor to the gradient.

After this method gets called, the stored gradient has been incremented by the tensor passed as argument.

Parameters

g: Tensor to add

void clear()¶: Clear the gradient (set it to 0)

void clip(float left, float right)¶: Clip the values to the range [left, right].

Public Members

std::string name¶: Name of this parameter

Dim dim¶: Dimensions of the parameter tensor

Tensor values¶: Values of the parameter

Tensor g¶: Values of the gradient w.r.t. this parameter

bool updated¶: Whether this is updated

bool nonzero_grad¶: Whether the gradient is non-zero

ParameterCollection *owner¶: Pointer to the collection that “owns” this parameter

struct LookupParameterStorage : public dynet::ParameterStorageBase ¶

#include <model.h>

Storage class for LookupParameters.

Subclassed by dynet::LookupParameterStorageCreator

Public Functions

void initialize(unsigned index, const std::vector<float> &val)¶

Initialize one particular lookup.

Parameters

index: Index of the lookup to initialize
val: Values

void copy(const LookupParameterStorage &val)¶

Copy from another LookupParameterStorage.

Parameters

val: Other LookupParameterStorage to copy from

void accumulate_grad(const Tensor &g)¶

Add a Tensor to the gradient of the whole lookup matrix.

After this, grads <- grads + g

Parameters

g: Tensor to add

void accumulate_grad(unsigned index, const Tensor &g)¶

Add a Tensor to the gradient of one of the lookups.

After this, grads[index] <- grads[index] + g

Parameters

index: Index of the lookup whose gradient is updated
g: Tensor to add

void accumulate_grads(unsigned n, const unsigned *ids_host, const unsigned *ids_dev, float *g)¶

Add tensors to multiple lookups.

After this method gets called, grads[ids_host[i]] <- grads[ids_host[i]] + g[i*dim.size():(i+1)*dim.size()]

Parameters

n: size of ids_host
ids_host: Indices of the gradients to update
ids_dev: [To be documented] (only for GPU)
g: Values
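The update rule above, grads[ids_host[i]] <- grads[ids_host[i]] + g[i*dim.size():(i+1)*dim.size()], can be sketched on the CPU as follows. This is a minimal illustration with hypothetical names (accumulate_rows, a vector-of-vectors gradient table); it ignores ids_dev and is not DyNet's implementation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU-only sketch: grads is the table of per-lookup gradients,
// each of length dim_size; g holds n rows of dim_size values back to back.
void accumulate_rows(std::vector<std::vector<float>>& grads,
                     std::size_t dim_size, unsigned n,
                     const unsigned* ids_host, const float* g) {
  for (unsigned i = 0; i < n; ++i) {
    std::vector<float>& row = grads[ids_host[i]];
    for (std::size_t j = 0; j < dim_size; ++j) {
      row[j] += g[i * dim_size + j];  // add slice i of g to the selected row
    }
  }
}
```

Note that an index appearing twice in ids_host receives both contributions, which matches the accumulate semantics.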

Public Members

std::string name¶: Name of this parameter

Dim all_dim¶: Total dimension

Tensor all_values¶: Values for all dimensions at once

Tensor all_grads¶: Gradient values for all dimensions at once

Dim dim¶: Dimension for one lookup

std::vector<Tensor> values¶: List of values for each lookup

std::vector<Tensor> grads¶: List of gradient values for each lookup

std::unordered_set<unsigned> non_zero_grads¶: Gradients are sparse, so track which components are nonzero

bool updated¶: Whether this lookup parameter should be updated

bool nonzero_grad¶: Whether the gradient is non-zero

ParameterCollection *owner¶: Pointer to the collection that “owns” this parameter

struct Parameter¶

#include <model.h>

Object representing a trainable parameter.

This object acts as a high level component linking the actual parameter values (ParameterStorage) and the ParameterCollection. As long as you don’t want to do low level hacks at the ParameterStorage level, this is what you will use.

Public Functions

Parameter()¶: Default constructor.

Parameter(std::shared_ptr<ParameterStorage> p)¶

Constructor.

This is called by the model, you shouldn’t need to use it

Parameters

p: Shared pointer to the parameter storage

ParameterStorage &get_storage() const¶

Get underlying ParameterStorage object.

Return: ParameterStorage holding the parameter values

string get_fullname() const¶: Get the full name of the ParameterStorage object.

void zero()¶: Zero the parameters.

Dim dim() const¶

Shape of the parameter.

Return: Shape as a Dim object

Tensor *values()¶

Values of the parameter.

Return: Values as a Tensor object

Tensor *gradients()¶

Gradients of the parameter.

Return: Gradients as a Tensor object

float current_weight_decay() const¶: Get the current weight decay for the parameters.

void set_updated(bool b)¶

Set the parameter as updated.

Parameters

b: Update status

void scale(float s)¶

Scales the parameter (multiplies by s)

Parameters

s: scale

void scale_gradient(float s)¶

Scales the gradient (multiplies by s)

Parameters

s: scale

bool is_updated() const¶

Check the update status.

Return: Update status

void clip_inplace(float left, float right)¶: Clip the values of the parameter to the range [left, right] (in place)

void set_value(const std::vector<float> &val)¶: Set the values of the parameter

Public Members

std::shared_ptr<ParameterStorage> p¶: Pointer to the storage for this Parameter

struct LookupParameter¶

#include <model.h>

Object representing a trainable lookup parameter.

Public Functions

LookupParameterStorage &get_storage() const¶

Get underlying LookupParameterStorage object.

Return: LookupParameterStorage holding the parameter values

void initialize(unsigned index, const std::vector<float> &val) const¶

Initialize one particular column.

Parameters

index: Index of the column to be initialized
val: Values

void zero()¶: Zero the parameters.

string get_fullname() const¶: Get the full name of the LookupParameterStorage object.

Dim dim() const¶

Shape of the lookup parameter.

Return: Shape as a Dim object

std::vector<Tensor> *values()¶

Values of the lookup parameter.

Return: Values as a vector of Tensor objects

float current_weight_decay() const¶: Get the current weight decay for the parameters.

void scale(float s)¶

Scales the parameter (multiplies by s)

Parameters

s: scale

void scale_gradient(float s)¶

Scales the gradient (multiplies by s)

Parameters

s: scale

void set_updated(bool b)¶

Set the parameter as updated.

Parameters

b: Update status

bool is_updated() const¶

Check the update status.

Return: Update status

Public Members

std::shared_ptr<LookupParameterStorage> p¶: Pointer to the storage for this Parameter

class ParameterCollection¶

#include <model.h>

This is a collection of parameters.

If you need a matrix of parameters, or a lookup table, ask an instance of this class. This knows how to serialize itself. Parameters know how to track their gradients, but any extra information (like velocity) will live here.

Subclassed by dynet::Model

Public Functions

ParameterCollection()¶

Constructor.

Weight-decay value is taken from commandline option.

ParameterCollection(float weight_decay_lambda)¶

Constructor.

Parameters

weight_decay_lambda: Default weight-decay value for this collection.

float gradient_l2_norm() const¶

Returns the L2 norm of your gradient.

Use this to look for gradient vanishing/exploding

Return: L2 norm of the gradient
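As a rough sketch of what this quantity is, the L2 norm over a collection can be computed as the square root of the summed squared gradient entries of every parameter. The helper toy_gradient_l2_norm below is hypothetical and uses plain vectors in place of DyNet tensors.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: each inner vector stands for the gradient of one parameter.
float toy_gradient_l2_norm(const std::vector<std::vector<float>>& grads) {
  double sum_sq = 0.0;  // accumulate in double to limit rounding error
  for (const auto& g : grads)
    for (float v : g) sum_sq += static_cast<double>(v) * v;
  return static_cast<float>(std::sqrt(sum_sq));
}
```

A value that shrinks towards zero or grows without bound over training iterations is the kind of signal this function is meant to expose.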

void reset_gradient()¶: Sets all gradients to zero.

Parameter add_parameters(const Dim &d, float scale = 0.0f, const std::string &name = "", Device *device = dynet::default_device)¶

Add parameters to model and returns Parameter object.

Creates a ParameterStorage object holding a tensor of dimension d and returns a Parameter object (to be used as input in the computation graph). The coefficients are sampled according to the scale parameter.

Return

Parameter object to be used in the computation graph

Parameters

d: Shape of the parameter
scale: If scale is non-zero, initializes according to \(\mathcal U([-\mathrm{scale},+\mathrm{scale}])\), otherwise uses Glorot initialization
name: Name of the parameter
device: Device placement for the parameter
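The role of the scale argument can be sketched as follows for a rows x cols matrix. This is an illustration under stated assumptions, not DyNet's initializer: init_matrix and its seed argument are hypothetical, and the Glorot branch uses the commonly cited uniform bound sqrt(6 / (fan_in + fan_out)), which may differ in detail from what DyNet computes.

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical sketch for a rows x cols matrix; not DyNet's actual initializer.
std::vector<float> init_matrix(std::size_t rows, std::size_t cols, float scale,
                               unsigned seed = 1) {
  // Assumption: standard Glorot uniform bound sqrt(6 / (fan_in + fan_out)).
  float bound = scale != 0.0f
                    ? std::fabs(scale)
                    : std::sqrt(6.0f / static_cast<float>(rows + cols));
  std::mt19937 rng(seed);
  std::uniform_real_distribution<float> dist(-bound, bound);
  std::vector<float> w(rows * cols);
  for (float& v : w) v = dist(rng);  // sample each coefficient independently
  return w;
}
```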

Parameter add_parameters(const Dim &d, Device *device)¶

Adds parameters to the model and returns a Parameter object.

Creates a ParameterStorage object holding a tensor of dimension d and returns a Parameter object (to be used as input in the computation graph).

Return

Parameter object to be used in the computation graph

Parameters

d: Shape of the parameter
device: Device placement for the parameter

Parameter add_parameters(const Dim &d, const std::string &name, Device *device = dynet::default_device)¶

Adds parameters to the model and returns a Parameter object.

Creates a ParameterStorage object holding a tensor of dimension d and returns a Parameter object (to be used as input in the computation graph).

Return

Parameter object to be used in the computation graph

Parameters

d: Shape of the parameter
name: Name of the parameter
device: Device placement for the parameter

Parameter add_parameters(const Dim &d, const ParameterInit &init, const std::string &name = "", Device *device = dynet::default_device)¶

Add parameters with custom initializer.

Return

Parameter object to be used in the computation graph

Parameters

d: Shape of the parameter
init: Custom initializer
name: Name of the parameter
device: Device placement for the parameter

std::vector<std::shared_ptr<ParameterStorageBase>> get_parameter_storages_base() const¶

Get the base parameter storages in the current model.

Return: List of pointers to ParameterStorageBase objects

std::shared_ptr<ParameterStorage> get_parameter_storage(const std::string &pname)¶

Get parameter in current model.

It is not recommended to use this

Return: the pointer to the Parameter object

std::vector<std::shared_ptr<ParameterStorage>> get_parameter_storages() const¶

Get parameters in current model.

Return: List of pointers to ParameterStorage objects

LookupParameter add_lookup_parameters(unsigned n, const Dim &d, const std::string &name = "", Device *device = dynet::default_device)¶

Add lookup parameter to model.

Same as add_parameters. Initializes with Glorot

Return

LookupParameter object to be used in the computation graph

Parameters

n: Number of lookup indices
d: Dimension of each embedding
name: Name of the parameter
device: Device placement for the parameter

LookupParameter add_lookup_parameters(unsigned n, const Dim &d, const ParameterInit &init, const std::string &name = "", Device *device = dynet::default_device)¶

Add lookup parameter with custom initializer.

Return

LookupParameter object to be used in the computation graph

Parameters

n: Number of lookup indices
d: Dimension of each embedding
init: Custom initializer
name: Name of the parameter
device: Device placement for the parameter

std::shared_ptr<LookupParameterStorage> get_lookup_parameter_storage(const std::string &lookup_pname)¶

Get lookup parameter in current model.

It is not recommended to use this

Return: the pointer to the LookupParameter object

std::vector<std::shared_ptr<LookupParameterStorage>> get_lookup_parameter_storages() const¶

Get lookup parameters in current model.

Return: List of pointers to LookupParameterStorage objects

void project_weights(float radius = 1.0f)¶

Projects the weights so that their L2 norm equals radius.

NOTE (Paul): I am not sure this is doing anything currently. The argument doesn’t seem to be used anywhere… If you need this, raise an issue on GitHub.

Parameters

radius: Target norm

void set_weight_decay_lambda(float lambda)¶

Set the weight decay coefficient.

Parameters

lambda: Weight decay coefficient

const std::vector<std::shared_ptr<ParameterStorage>> &parameters_list() const¶

Returns list of shared pointers to ParameterStorages.

You shouldn’t need to use this

Return: List of shared pointers to ParameterStorages

const std::vector<std::shared_ptr<LookupParameterStorage>> &lookup_parameters_list() const¶

Returns list of pointers to LookupParameterStorages.

You shouldn’t need to use this

Return: List of pointers to LookupParameterStorages

size_t parameter_count() const¶

Returns the total number of tunable parameters (i.e. scalars) contained within this model.

That is to say, a 2x2 matrix counts as four parameters.

Return: Number of parameters

size_t updated_parameter_count() const¶

Returns total number of (scalar) parameters updated.

Return: number of updated parameters

void set_updated_param(const Parameter *p, bool status)¶

Set the update status of a parameter.

Parameters

p: Pointer to the Parameter
status: Update status

void set_updated_lookup_param(const LookupParameter *p, bool status)¶

Set the update status of a lookup parameter.

Parameters

p: Pointer to the LookupParameter
status: Update status

bool is_updated_param(const Parameter *p)¶

Check the update status of a parameter.

Return

Update status

Parameters

p: Pointer to the Parameter

bool is_updated_lookup_param(const LookupParameter *p)¶

Check the update status of a lookup parameter.

Return

Update status

Parameters

p: Pointer to the LookupParameter

ParameterCollection add_subcollection(const std::string &name = "", float weight_decay_lambda = -1)¶

Add a sub-collection.

This will allow you to add a ParameterCollection that is a (possibly named) subset of the original collection. This is useful if you want to save/load/update only part of the parameters in the model.

Return

The subcollection

Parameters

name: Name of the subcollection (may be empty)
weight_decay_lambda: if negative/omitted, inherit from parent.

size_t size()¶

Get size.

Get the number of parameters in the ParameterCollection

std::string get_fullname() const¶: Get the namespace of the current ParameterCollection object (ends with a slash)

L2WeightDecay &get_weight_decay()¶: Get the weight decay object.

float get_weight_decay_lambda() const¶: Get the weight decay lambda value.

struct ParameterInit¶

#include <param-init.h>

Initializers for parameters.

Allows for custom parameter initialization

Subclassed by dynet::ParameterInitConst, dynet::ParameterInitFromFile, dynet::ParameterInitFromVector, dynet::ParameterInitGlorot, dynet::ParameterInitIdentity, dynet::ParameterInitNormal, dynet::ParameterInitSaxe, dynet::ParameterInitUniform

Public Functions

ParameterInit()¶: Default constructor.

virtual void initialize_params(Tensor &values) const = 0¶

Function called upon initialization.

Whenever you inherit from this struct to implement your own custom initializer, this is the function you want to override to implement your logic.

Parameters

values: The tensor to be initialized. You should modify it in-place. See dynet/model.cc for some examples

struct ParameterInitNormal : public dynet::ParameterInit ¶

#include <param-init.h>

Initialize parameters with samples from a normal distribution.

Public Functions

ParameterInitNormal(float m = 0.0f, float v = 1.0f)¶

Constructor.

Parameters

m: Mean of the gaussian distribution
v: Variance of the gaussian distribution (reminder: the variance is the square of the standard deviation)
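Because the second argument is a variance, it is easy to mix up with the standard deviation that the C++ standard library expects. The sketch below is standalone C++ (it does not call DyNet), and `sample_normal` with its fixed seed is a hypothetical helper. It shows the conversion needed when drawing equivalent samples with `std::normal_distribution`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Fill n values from N(mean, variance). std::normal_distribution takes a
// standard deviation, so the variance is converted with a square root first.
inline std::vector<float> sample_normal(std::size_t n, float mean, float variance,
                                        unsigned seed = 42) {
  assert(variance >= 0.0f);
  std::mt19937 rng(seed);
  std::normal_distribution<float> dist(mean, std::sqrt(variance));
  std::vector<float> out(n);
  for (float& x : out) x = dist(rng);
  return out;
}
```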

struct ParameterInitUniform : public dynet::ParameterInit ¶

#include <param-init.h>

Initialize parameters with samples from a uniform distribution.

Public Functions

ParameterInitUniform(float scale)¶

Constructor for uniform distribution centered on 0.

Samples parameters from \(\mathcal U([-\mathrm{scale},+\mathrm{scale}])\)

Parameters

scale: Scale of the distribution

ParameterInitUniform(float l, float r)¶

Constructor for uniform distribution in a specific interval.

Parameters

l: Lower bound of the interval
r: Upper bound of the interval

struct ParameterInitConst : public dynet::ParameterInit ¶

#include <param-init.h>

Initialize parameters with a constant value.

Public Functions

ParameterInitConst(float c)¶

Constructor.

Parameters

c: Constant value

struct ParameterInitIdentity : public dynet::ParameterInit ¶

#include <param-init.h>

Initialize as the identity.

This will raise an exception if used on non-square matrices.

Public Functions

ParameterInitIdentity()¶: Constructor.

struct ParameterInitGlorot : public dynet::ParameterInit ¶

#include <param-init.h>

Initialize with the methods described in Glorot, 2010

In order to preserve the variance of the forward and backward flow across layers, the parameters \(\theta\) are initialized such that \(\mathrm{Var}(\theta)=\frac 2 {n_1+n_2}\) where \(n_1,n_2\) are the input and output dim.

In the case of 4d tensors (common in convolutional networks) of shape \(XH,XW,XC,N\), the weights are sampled from \(\mathcal U([-g\sqrt{\frac 6 {d}},g\sqrt{ \frac 6 {d}}])\) where \(d = XC * (XH * XW) + N * (XH * XW)\) and \(g\) is the gain.

Important note: The underlying distribution is uniform (not gaussian).

Note: This is also known as Xavier initialization
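The relation between the variance target and the uniform interval can be checked numerically. The sketch below is standalone C++ and does not call DyNet. The helper names are hypothetical, and it assumes the matrix case uses the same uniform form as the 4d case with \(d = n_1 + n_2\), which matches the stated variance because \(\mathrm{Var}(\mathcal U([-a,a])) = a^2/3\).

```cpp
#include <cassert>
#include <cmath>

// Half-width a of the interval [-a, a] for a matrix with n_in inputs and n_out outputs.
inline double glorot_bound(unsigned n_in, unsigned n_out, double gain = 1.0) {
  assert(n_in + n_out > 0);
  return gain * std::sqrt(6.0 / (static_cast<double>(n_in) + n_out));
}

// Same bound for a 4d filter of shape (XH, XW, XC, N), with
// d = XC * (XH * XW) + N * (XH * XW).
inline double glorot_bound_4d(unsigned xh, unsigned xw, unsigned xc, unsigned n,
                              double gain = 1.0) {
  const double area = static_cast<double>(xh) * xw;
  const double d = xc * area + n * area;
  assert(d > 0.0);
  return gain * std::sqrt(6.0 / d);
}

// Variance of the uniform distribution on [-a, a].
inline double uniform_variance(double a) { return a * a / 3.0; }
```

With a gain of 1, `uniform_variance(glorot_bound(n1, n2))` equals \(\frac 2 {n_1+n_2}\), which is the variance quoted above.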

Public Functions

ParameterInitGlorot(bool is_lookup = false, float gain = 1.f)¶

Constructor.

Parameters

is_lookup: Boolean value identifying the parameter as a LookupParameter
gain: Scaling parameter. In order for the Glorot initialization to be correct, you should set this equal to \(\frac 1 {f'(0)}\) where \(f\) is your activation function

struct ParameterInitSaxe : public dynet::ParameterInit ¶

#include <param-init.h>

Initializes according to Saxe et al., 2014

Initializes as a random orthogonal matrix (unimplemented for GPU)

Public Functions

ParameterInitSaxe(float gain = 1.0)¶: Constructor.

struct ParameterInitFromFile : public dynet::ParameterInit ¶

#include <param-init.h>

Initializes from a file.

Useful for reusing weights, etc…

Public Functions

ParameterInitFromFile(std::string f)¶

Constructor.

Parameters

f: File name (format should just be a list of values)

struct ParameterInitFromVector : public dynet::ParameterInit ¶

#include <param-init.h>

Initializes from a std::vector of floats.

Public Functions

ParameterInitFromVector(std::vector<float> v)¶

Constructor.

Parameters

v: Vector of values to be used

Tensor¶

Tensor objects provide a bridge between C++ data structures and Eigen Tensors for multidimensional data.

Concretely, as an end user you will obtain a tensor object after calling .value() on an expression. You can then use the functions described below to convert these tensors to floats or arrays of floats, to save and load the values, etc…

Conversely, when implementing low level nodes (e.g. for new operations), you will need to retrieve Eigen tensors from DyNet tensors in order to perform efficient computation.

vector<Eigen::DenseIndex> dynet::as_vector(const IndexTensor &v)¶

Get the array of indices in an index tensor.

For higher order tensors this returns the flattened value

Return

Index values

Parameters

v: Input index tensor

std::ostream &dynet::operator<<(std::ostream &os, const Tensor &t)¶

You can use cout<<tensor; for debugging or saving.

Parameters

os: output stream
t: Tensor

real dynet::as_scalar(const Tensor &t)¶

Get a scalar value from an order 0 tensor.

Throws a runtime_error exception if the tensor has more than one element.

TODO : Change for custom invalid dimension exception maybe?

Return

Scalar value

Parameters

t: Input tensor

std::vector<real> dynet::as_vector(const Tensor &v)¶

Get the array of values in the tensor.

For higher order tensors this returns the flattened value

Return

Values

Parameters

v: Input tensor

std::vector<real> dynet::as_scale_vector(const Tensor &v, float a)¶

Get the array of values in the scaled tensor.

For higher order tensors this returns the flattened value

Return

Values

Parameters

v: Input tensor
a: Scale factor

real dynet::rand01()¶

This is a helper function to sample uniformly in \([0,1]\).

Return: \(x\sim\mathcal U([0,1])\)

int dynet::rand0n(int n)¶

This is a helper function to sample uniformly in \(\{0,\dots,n-1\}\).

Return

\(x\sim\mathcal U(\{0,\dots,n-1\})\)

Parameters

n: Upper bound (excluded)

real dynet::rand_normal()¶

This is a helper function to sample from a normalized gaussian distribution.

Return: \(x\sim\mathcal N(0,1)\)

struct IndexTensor¶

#include <index-tensor.h>

Represents a tensor of indices.

This holds indices to locations within a dimension or tensor.

Public Functions

IndexTensor()¶: Create an empty tensor.

IndexTensor(const Dim &d, Eigen::DenseIndex *v, Device *dev, DeviceMempool mem)¶

Creates a tensor.

Parameters

d: Shape of the tensor
v: Pointer to the values
dev: Device
mem: Memory pool

Public Members

Dim d¶: Shape of tensor

Eigen::DenseIndex *v¶: Pointer to memory

struct Tensor¶

#include <tensor.h>

Represents a tensor of any order.

This provides a bridge between classic C++ types and Eigen tensors.

Public Functions

Tensor()¶: Create an empty tensor.

Tensor(const Dim &d, float *v, Device *dev, DeviceMempool mem)¶

Creates a tensor.

Parameters

d: Shape of the tensor
v: Pointer to the values
dev: Device
mem: Memory pool

float *batch_ptr(unsigned bid)¶

Get the pointer for a particular batch.

Automatically broadcasting if the size is zero

Return

Pointer to the memory where the batch values are located

Parameters

bid: Batch id requested

bool is_valid() const¶

Check for NaNs and infinite values.

This is very slow: use sparingly (it’s linear in the number of elements). This raises a std::runtime_error exception if the Tensor is on GPU because it’s not implemented yet

Return: Whether the tensor contains any invalid value

Tensor batch_elem(unsigned b) const¶

Get a Tensor object representing a single batch.

If this tensor only has a single batch, then broadcast. Otherwise, check to make sure that the requested batch is smaller than the number of batches.

TODO: This is a bit wasteful, as it re-calculates bs.batch_size() every time.

Return

Sub tensor at batch b

Parameters

b: Batch id
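The broadcast-or-check rule described above can be sketched without DyNet. In the standalone C++ below, `MiniBatched` and `batch_start` are hypothetical names, and the contiguous one-block-per-batch-element layout is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a batched tensor: data holds `batches` consecutive
// blocks of `batch_size` floats each.
struct MiniBatched {
  std::vector<float> data;
  std::size_t batch_size;
  std::size_t batches;
};

// Pointer to the first value of batch element b.
inline const float* batch_start(const MiniBatched& t, std::size_t b) {
  assert(t.data.size() == t.batch_size * t.batches);
  if (t.batches == 1) return t.data.data();  // single batch: broadcast to any b
  assert(b < t.batches);                     // otherwise b must be in range
  return t.data.data() + b * t.batch_size;
}
```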

std::vector<Tensor> batch_elems() const¶

Get tensors for all batches.

Return: List of the tensors in each batch

Public Members

Dim d¶: Shape of tensor

float *v¶: Pointer to memory

struct TensorTools¶

#include <tensor.h>

Provides tools for creating, accessing, copying and modifying tensors (in-place)

Public Static Functions

void clip(Tensor &d, float left, float right)¶

Clip the values in the tensor to a fixed range.

Parameters

d: Tensor to modify
left: Target minimum value
right: Target maximum value

void scale(Tensor &x, float left, float right)¶

Do an elementwise linear transform of values a*x + b, where a is passed as left and b as right.

Parameters

x: Tensor to modify
left: The value to multiply by (a)
right: The value to add (b)

void uniform_to_bernoulli(Tensor &x, float p)¶

Take a tensor of Uniform(0,1) sampled variables and turn them into Bernoulli(p) variables.

Parameters

x: Tensor to modify
p: The bernoulli probability
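One common way to perform this conversion is a threshold test: a uniform sample falls below p with probability p. The sketch below is standalone C++ on a plain `std::vector<float>`. `to_bernoulli` is a hypothetical name, and whether DyNet uses exactly this comparison is an assumption.

```cpp
#include <cassert>
#include <vector>

// Replace each u in [0, 1) by 1 if u < p and by 0 otherwise, in place.
// If u is uniform on [0, 1), the result is 1 with probability p.
inline void to_bernoulli(std::vector<float>& x, float p) {
  assert(p >= 0.0f && p <= 1.0f);
  for (float& u : x) u = (u < p) ? 1.0f : 0.0f;
}
```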

void constant(Tensor &d, float c)¶

Fills the tensor with a constant value.

Parameters

d: Tensor to modify
c: Target value

void zero(Tensor &d)¶

Fills a tensor with zeros.

Parameters

d: Input tensor

void identity(Tensor &val)¶

Set the (order 2) tensor as the identity matrix.

This throws a runtime_error exception if the tensor isn’t a square matrix.

Parameters

val: Input tensor

void randomize_bernoulli(Tensor &val, real p, real scale = 1.0f)¶

Fill the tensor with bernoulli random variables and scale them by scale.

Parameters

val: Input tensor
p: Parameter of the bernoulli distribution
scale: Scale of the random variables

void randomize_normal(Tensor &val, real mean = 0.0f, real stddev = 1.0f)¶

Fill the tensor with gaussian random variables.

Parameters

val: Input tensor
mean: Mean
stddev: Standard deviation

void randomize_uniform(Tensor &val, real left = 0.0f, real right = 1.0f)¶

Fill the tensor with uniform random variables.

Parameters

val: Input tensor
left: Left bound of the interval
right: Right bound of the interval

void randomize_orthonormal(Tensor &val, real scale = 1.0f)¶

Takes a square matrix tensor and sets it as a random orthonormal matrix.

More specifically this samples a random matrix with RandomizeUniform and then performs SVD and returns the left orthonormal matrix in the decomposition, scaled by scale

Parameters

val: Input tensor
scale: Value to which the resulting orthonormal matrix will be scaled

float access_element(const Tensor &v, int index)¶

Access element of the tensor by index in the values array.

AccessElement and SetElement are very, very slow (potentially) - use appropriately

Return

v.v[index]

Parameters

v: Tensor
index: Index in the memory

float access_element(const Tensor &v, const Dim &index)¶

Access element of the tensor by indices in the various dimensions.

This only works for matrix shaped tensors (+ batch dimension). AccessElement and SetElement are very, very slow (potentially) - use appropriately

Return

(*v)(index[0], index[1])

Parameters

v: Tensor
index: Indices in the tensor

void set_element(const Tensor &v, int index, float value)¶

Set element of the tensor by index in the values array.

AccessElement and SetElement are very, very slow (potentially) - use appropriately

Parameters

v: Tensor
index: Index in the memory
value: Desired value

void copy_element(const Tensor &l, int lindex, Tensor &r, int rindex)¶

Copy element from one tensor to another (by index in the values array)

Parameters

l: Source tensor
lindex: Source index
r: Target tensor
rindex: Target index

void set_elements(const Tensor &v, const std::vector<float> &vec)¶

Set the elements of a tensor with an array of values.

(This uses memcpy so be careful)

Parameters

v: Input Tensor
vec: Values

void copy_elements(Tensor &v, const Tensor &v_src)¶

Copy one tensor into another.

Parameters

v: Target tensor
v_src: Source tensor

void accumulate(Tensor &v, const Tensor &v_src)¶

Accumulate the values of one tensor into another.

Parameters

v: Target tensor
v_src: Source tensor

void logsumexp(const Tensor &x, Tensor &m, Tensor &z, unsigned d = 0)¶

Calculate the logsumexp function over all columns of the tensor.

Parameters

x: The input tensor
m: A tensor of scratch memory to hold the maximum values of each column
z: The output tensor
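The scratch tensor of maxima points to the usual max-shifted formulation, \(\log\sum_i e^{x_i} = m + \log\sum_i e^{x_i - m}\) with \(m = \max_i x_i\). The sketch below shows that formulation for a single column in standalone C++. `stable_logsumexp` is a hypothetical name, and the code is not DyNet's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// log(sum_i exp(x_i)) for one column, shifted by the maximum so that
// exp() never receives a large positive argument.
inline double stable_logsumexp(const std::vector<double>& x) {
  assert(!x.empty());
  const double m = *std::max_element(x.begin(), x.end());
  if (std::isinf(m)) return m;  // avoids inf - inf = NaN in the shift below
  double sum = 0.0;
  for (double v : x) sum += std::exp(v - m);
  return m + std::log(sum);
}
```

Without the shift, inputs around 1000 would overflow `exp` even though the result itself is perfectly representable.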

IndexTensor argmax(const Tensor &v, unsigned dim = 0, unsigned num = 1)¶

Calculate the index of the maximum value.

Return

A newly allocated IndexTensor consisting of argmax IDs. The length of the dimension “dim” will be “num”, consisting of the appropriate IDs.

Parameters

v: A tensor where each row represents a probability distribution
dim: Which dimension to take the argmax over
num: The number of kmax values

IndexTensor categorical_sample_log_prob(const Tensor &v, unsigned dim = 0, unsigned num = 1)¶

Calculate samples from a log probability.

Return

A newly allocated IndexTensor consisting of the sampled IDs. The length of the dimension “dim” will be “num”, consisting of the appropriate IDs.

Parameters

v: A tensor where each row represents a log probability distribution
dim: Which dimension to take the sample over
num: The number of samples for each row
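Sampling from log probabilities can be illustrated with inverse-CDF sampling on a single distribution. The sketch below is standalone C++, not DyNet's implementation. `sample_from_log_probs` is a hypothetical name, the log probabilities are assumed to be normalized, and the uniform draw `u` is passed in explicitly so the behaviour is deterministic.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Return the first index whose cumulative probability exceeds u, where u is a
// uniform draw in [0, 1) and log_p holds normalized log probabilities.
inline std::size_t sample_from_log_probs(const std::vector<double>& log_p, double u) {
  assert(!log_p.empty());
  assert(u >= 0.0 && u < 1.0);
  double cumulative = 0.0;
  for (std::size_t i = 0; i < log_p.size(); ++i) {
    cumulative += std::exp(log_p[i]);
    if (u < cumulative) return i;
  }
  // Rounding can leave the total slightly below 1; fall back to the last index.
  return log_p.size() - 1;
}
```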

std::pair<Tensor, IndexTensor> topk(const Tensor &v, unsigned dim = 0, unsigned num = 1)¶

Calculate the k-max values and their indexes.

Return

A newly allocated pair<Tensor, IndexTensor> consisting of the k-max values and their IDs. The length of the dimension “dim” will be “num”, consisting of the appropriate values/IDs.

Parameters

v: A tensor where each row represents a probability distribution
dim: Which dimension to take the kmax over
num: The number of kmax values
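For a single vector, returning the k largest values together with their indices can be sketched as follows. This is standalone C++ over `std::vector<float>` and not DyNet's implementation. `top_k` is a hypothetical name, and the descending order of the result is a choice made for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Returns the k largest values of v in descending order, with their indices.
inline std::pair<std::vector<float>, std::vector<std::size_t>>
top_k(const std::vector<float>& v, std::size_t k) {
  assert(k <= v.size());
  std::vector<std::size_t> idx(v.size());
  std::iota(idx.begin(), idx.end(), 0);  // 0, 1, ..., n-1
  // Order only the first k positions, comparing by the referenced values.
  std::partial_sort(idx.begin(), idx.begin() + k, idx.end(),
                    [&v](std::size_t a, std::size_t b) { return v[a] > v[b]; });
  idx.resize(k);
  std::vector<float> vals;
  vals.reserve(k);
  for (std::size_t i : idx) vals.push_back(v[i]);
  return {vals, idx};
}
```

Calling it with `k = 1` gives the argmax and the corresponding maximum value.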

Dimensions¶

The Dim class holds information on the shape of a tensor. As explained in Unorthodox Design, in DyNet the dimensions are represented as the standard dimension + the batch dimension, which makes batched computation transparent.
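To make the split between standard dimensions and the batch dimension concrete, here is a standalone C++ sketch. `MiniDim` is a hypothetical stand-in, not the real `Dim` struct. It mirrors only the relationship between the size of one batch element and the total size.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for Dim: standard dimensions plus a separate batch dimension.
struct MiniDim {
  std::vector<unsigned> d;  // standard dimensions, e.g. {3, 4} for a 3x4 matrix
  unsigned bd;              // batch dimension (1 for a non-batched expression)

  MiniDim(std::vector<unsigned> dims, unsigned batch)
      : d(std::move(dims)), bd(batch) {}

  // Number of values in one batch element: product of the standard dimensions.
  unsigned batch_size() const {
    unsigned p = 1;
    for (unsigned x : d) p *= x;
    return p;
  }

  // Number of values across the whole batch.
  unsigned size() const { return batch_size() * bd; }
};
```

A 3x4 matrix with a batch dimension of 5 therefore has 12 values per batch element and 60 values in total.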

DYNET_MAX_TENSOR_DIM¶: Maximum number of dimensions supported by DyNet: 7

struct Dim¶

#include <dim.h>

The Dim struct stores information about the dimensionality of expressions.

Batch dimension is treated separately from standard dimension.

Public Functions

Dim()¶: Default constructor.

Dim(std::initializer_list<unsigned int> x)¶

Initialize from a list of dimensions.

The batch dimension is 1 in this case (non-batched expression)

Parameters

x: List of dimensions

Dim(std::initializer_list<unsigned int> x, unsigned int b)¶

Initialize from a list of dimensions and a batch size.

Parameters

x: List of dimensions
b: Batch size

Dim(const std::vector<long> &x)¶

Initialize from a vector of dimensions.

The batch dimension is 1 in this case (non-batched expression)

Parameters

x: Array of dimensions

Dim(const std::vector<long> &x, unsigned int b)¶

Initialize from a vector of dimensions and a batch size.

Parameters

x: Vector of dimensions
b: Batch size

unsigned int size() const¶

Total size across all batch elements.

Return: Batch dimension * size of a single batch element

unsigned int batch_size() const¶

Size of a batch (product of all dimensions)

Return: Size of a batch

unsigned int sum_dims() const¶

Sum of all dimensions within a batch.

Return: Sum of the dimensions within a batch

Dim truncate() const¶

Remove trailing dimensions of 1.

Iterates over all the dimensions of the Dim and drops those that follow the last dimension that is not 1.

Return: truncated dimension
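The effect of removing trailing dimensions of 1 can be shown on a plain list of sizes. The sketch below is standalone C++ with a hypothetical helper name. Keeping at least one dimension for an all-ones shape is an assumption of this sketch, not something the description above states.

```cpp
#include <cassert>
#include <vector>

// Drop trailing dimensions equal to 1, e.g. {3, 1, 4, 1, 1} becomes {3, 1, 4}.
// Interior ones are kept, and at least one dimension always remains.
inline std::vector<unsigned> truncate_trailing_ones(std::vector<unsigned> dims) {
  while (dims.size() > 1 && dims.back() == 1) dims.pop_back();
  return dims;
}
```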

Dim single_batch() const¶

Set the batch dimension to 1.

Return: 1-batch version of this instance

void resize(unsigned int i)¶

Change the number of dimensions.

Parameters

i: New number of dimensions

unsigned int ndims() const¶

Get number of dimensions.

Return: Number of dimensions

unsigned int rows() const¶

Size of the first dimension.

Return: Size of the first dimension

unsigned int num_nonone_dims() const¶

Number of non-one dimensions.

Return: Number of non-one dimensions

unsigned int cols() const¶

Size of the second dimension (or 1 if only one dimension)

Return: Size of the second dimension (or 1 if only one dimension)

unsigned int batch_elems() const¶

Batch dimension.

Return: Batch dimension

void set(unsigned int i, unsigned int s)¶

Set specific dimension.

Set the value of a specific dimension to an arbitrary value

Parameters

i: Dimension index
s: Dimension size

unsigned int operator[](unsigned int i) const¶

Access a specific dimension as you would access an array element.

Return

Size of dimension i

Parameters

i: Dimension index

unsigned int size(unsigned int i) const¶

Size of dimension i.

Return

Size of dimension i

Parameters

i: Dimension index

void delete_dim(unsigned int i)¶

Remove one of the dimensions.

Parameters

i: index of the dimension to be removed

void delete_dims(std::vector<unsigned int> dims, bool reduce_batch)¶

Remove multiple dimensions.

Parameters

dims: dimensions to be removed
reduce_batch: reduce the batch dimension or not

void add_dim(unsigned int n)¶

Append a dimension at the end.

Parameters

n: the size of the new dimension

void insert_dim(unsigned int i, unsigned int n)¶

Insert a dimension.

Parameters

i: the index to insert the new dimension
n: the size of the new dimension

Dim transpose() const¶

Transpose a vector or a matrix.

This raises an invalid_argument exception on tensors with more than 2 dimensions

Return: The transposed Dim structure

void print_profile(std::ostream &out) const¶: Print the unbatched profile as a string.

Public Members

unsigned int d[DYNET_MAX_TENSOR_DIM]¶: Array of dimensions

unsigned int nd¶: Number of dimensions

unsigned int bd¶: Batch dimension