# Questions tagged [automatic-differentiation]

50 questions

3 votes · 0 answers · 20 views

### Computational graph vs (computer algebra) symbolic expression

I was reading Baydin et al., Automatic Differentiation in Machine Learning: a Survey, 2018 (arXiv), which distinguishes between symbolic differentiation and automatic differentiation (AD). It then says:
AD Is Not Symbolic Differentiation.
Symbolic differentiation is the automatic manipulation of [sy...
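The distinction the survey draws can be seen in a minimal forward-mode sketch (a standard dual-number construction, not code from the paper): AD propagates numeric derivative values alongside values, so it produces a number at a point rather than a symbolic expression.

```python
# Minimal forward-mode AD via dual numbers: each value carries its
# derivative, so the result is a number, not a symbolic expression.
class Dual:
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule, applied numerically at this point
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__

def f(x):
    return x * x + 3 * x

x = Dual(2.0, 1.0)   # seed dx/dx = 1
y = f(x)
print(y.val, y.der)  # f(2) = 10, f'(2) = 2*2 + 3 = 7
```

No expression for f' is ever built; only the rules for each primitive operation are applied as the program runs, which is the sense in which AD is not symbolic differentiation.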

1 vote · 0 answers · 90 views

### Automatic Differentiation in Julia: Hessian from ReverseDiffSparse

How can I evaluate the Hessian of a function in Julia using automatic differentiation (preferably using ReverseDiffSparse)? In the following example, I can compute and evaluate the gradient at a point through JuMP:
m = Model()
@variable(m, x)
@variable(m, y)
@NLobjective(m, Min, sin(x) + sin(...

1 vote · 0 answers · 32 views

### Finite difference derivatives of array-valued functions

Suppose I have the following code
import numpy as np
f = lambda x,y: (np.sum(x) + np.sum(y))**2
x = np.array([1,2,3])
y = np.array([4,5,6])
df_dx
df_dy
df2_dx2
df2_dxdy
...
is there a fast way to compute all the derivatives (single and mixed) of such a function? The module should perform the classic...
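For this particular f, a plain central-difference loop already recovers the gradient with respect to each array argument (a hand-rolled sketch, not a specific module; for f = (Σx + Σy)² every partial is 2(Σx + Σy)):

```python
import numpy as np

f = lambda x, y: (np.sum(x) + np.sum(y)) ** 2
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

def grad_wrt_first(f, x, y, h=1e-5):
    # central differences in each component of the first argument
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e, y) - f(x - e, y)) / (2 * h)
    return g

df_dx = grad_wrt_first(f, x, y)
df_dy = grad_wrt_first(lambda a, b: f(b, a), y, x)  # swap the arguments
print(df_dx, df_dy)  # every entry is 2 * (6 + 15) = 42
```

The same pattern with a second differencing pass gives the mixed second derivatives, though at O(n²) function evaluations; an AD tool avoids that cost.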

1 vote · 0 answers · 58 views

### grad_outputs in Chainer vs grad in Tensorflow for backward function

I want to translate some custom operation with a self-defined gradient from Chainer to Tensorflow. The forward pass is relatively straightforward; I already have it. But for the backward pass, I can never make the optimization work. Let's suppose the backward pass in Chainer is this:
def backward_gpu...

1 vote · 1 answer · 169 views

### How to implement automatic differentiation in Haskell?

So I have a Dual number class:
data Dual a = !a :+ !a
instance [safe] Eq a => Eq (Dual a)
instance [safe] RealFloat a => Floating (Dual a)
instance [safe] RealFloat a => Fractional (Dual a)
instance [safe] RealFloat a => Num (Dual a)
instance [safe] Read a => Read (Dual a)
instance [safe] Show a =>...

1 vote · 0 answers · 112 views

### Hessian of a black box function that uses Pytorch

First of all I am very new to Python and machine learning, so please excuse my ignorance on what might be a very basic issue; I do appreciate any input on this question!
I have a very complicated scalar-valued multivariable function implemented in Python that uses Pytorch functionalities (it is actu...

1 vote · 0 answers · 80 views

### Is there a way to stop Fortran compiler from checking if negative arguments are passed to SQRT function?

I am trying to use a third party automatic differentiation module, ADF95, which uses the expression -sqrt(asin(-1.0_dpr)) to return a Not-a-Number (NaN) in specific cases, where dpr is defined using integer, parameter :: dpr = KIND(1.D0).
Upon attempting to compile a simple test program which uses...

1 vote · 1 answer · 149 views

### Tensorflow: Differentiable Primitives

I was under the impression that all tensorflow primitives are differentiable. Under this 'illusion' I wrote this function in the hopes that tensorflow will just automatically differentiate it and I can backprop errors through it.
Rank-weight function:
def ranked(a):
lens = tf.convert_to_tensor(...

1 vote · 1 answer · 0 views

### Update step in PyTorch implementation of Newton's method

I'm trying to get some insight into how PyTorch works by implementing Newton's method for solving x = cos(x). Here's a version that works:
x = Variable(DoubleTensor([1]), requires_grad=True)
for i in range(5):
y = x - torch.cos(x)
y.backward()
x = Variable(x.data - y.data/x.grad.data, requires_grad...
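A library-agnostic restatement of the snippet above may clarify what the autograd call is supplying (plain Python, with the derivative that y.backward() would otherwise compute written out by hand):

```python
import math

# Solve g(x) = x - cos(x) = 0 by Newton's method; g'(x) = 1 + sin(x)
# is exactly what autograd provides via y.backward() in the snippet.
x = 1.0
for _ in range(5):
    g = x - math.cos(x)
    dg = 1.0 + math.sin(x)
    x = x - g / dg
print(x)  # ≈ 0.7390851332151607, the fixed point of cos
```

The PyTorch version must additionally rebuild the graph each iteration (hence the fresh Variable), since the old graph is consumed by backward().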

1 vote · 1 answer · 124 views

### minimal Numeric.AD example won't compile

I am trying to compile the following minimal example from Numeric.AD:
import Numeric.AD
timeAndGrad f l = grad f l
main = putStrLn "hi"
and I run into this error:
test.hs:3:24:
Couldn't match expected type ‘f (Numeric.AD.Internal.Reverse.Reverse
s a)
-> Numeric.AD.Internal.Reverse.Reverse s a’...

1 vote · 2 answers · 229 views

### Understanding higher order automatic differentiation

Having recently just finished my own basic reverse mode AD for machine learning purposes, I find myself wanting to learn about the field, but I've hit a hardness wall with higher order methods.
The basic reverse AD is beautifully simple and easy to understand, but the more advanced material is both...

3 votes · 1 answer · 44 views

### Automatic Differentiation with CoDiPack

The following code:
#include
...
codi::RealForward Gcodi[l];
for (int p = 0; p < l; p++)
{
...
double a = Gcodi[p];
}
gives me the compilation error:
nnBFAD.cpp: In function ‘void OptBF()’:
nnBFAD.cpp:156:25: error: cannot convert ‘codi::RealForward {aka codi::ActiveReal >}’ to ‘double...

2 votes · 2 answers · 92 views

### Julia ReverseDiff: how to take a gradient w.r.t. only a subset of inputs?

In my data flow, I'm querying a small subset of a database, using those results to construct about a dozen arrays, and then, given some parameter values, computing a likelihood value. Then repeating for a subset of the database. I want to compute the gradient of the likelihood function with respect...

5 votes · 1 answer · 239 views

### How efficient/intelligent is Theano in computing gradients?

Suppose I have an artificial neural networks with 5 hidden layers. For the moment, forget about the details of the neural network model such as biases, the activation functions used, type of data and so on ... . Of course, the activation functions are differentiable.
With symbolic differentiation, t...

2 votes · 2 answers · 14.3k views

### Java - Computation of Derivations with Apache Commons Mathematic Library

I have a problem in using the apache commons math library.
I just want to create functions like f(x) = 4x^2 + 2x and I want to compute the derivative of this function --> f'(x) = 8x + 2
I read the article about Differentiation (http://commons.apache.org/proper/commons-math/userguide/analysis.html,...

12 votes · 7 answers · 2.6k views

### Automatic differentiation library in Scheme / Common Lisp / Clojure

I've heard that one of McCarthy's original motivations for inventing Lisp was to write a system for automatic differentiation. Despite this, my Google searches haven't yielded any libraries/macros for doing this. Are there any Scheme/Common Lisp/Clojure libraries (macros) out there for taking a func...

5 votes · 1 answer · 382 views

### Acceptable types in Numeric.AD functions

I'm having little success wrapping my head around the basic plumbing of the types involved in the ad package. For example, the following works perfectly:
import Numeric.AD
ex :: Num a => [a] -> a
ex [x, y] = x + 2*y
> grad ex [1.0, 1.0]
[1.0, 2.0]
where grad has the type:
grad
:: (Num a, Traversabl...

3 votes · 2 answers · 109 views

### Why is a function type required to be “wrapped” for the type checker to be satisfied?

The following program type-checks:
{-# LANGUAGE RankNTypes #-}
import Numeric.AD (grad)
newtype Fun = Fun (forall a. Num a => [a] -> a)
test1 [u, v] = (v - (u * u * u))
test2 [u, v] = ((u * u) + (v * v) - 1)
main = print $ fmap (\(Fun f) -> grad f [1,1]) [Fun test1, Fun test2]
But this program fails...

4 votes · 1 answer · 105 views

### C++ reverse automatic differentiation with graph

I'm trying to make a reverse mode automatic differentiation in C++.
The idea I came up with is that each variable that results of an operation on one or two other variables, is going to save the gradients in a vector.
This is the code :
class Var {
private:
double value;
char character;
std::vector...

2 votes · 0 answers · 105 views

### Automatic differentiation (AD) with respect to list of matrices in Haskell

I am trying to understand how can I use Numeric.AD (automatic differentiation) in Haskell.
I defined a simple matrix type and a scalar function taking an array and two matrices as arguments. I want to use AD to get the gradient of the scoring function with respect to both matrices, but I'm running i...

5 votes · 4 answers · 409 views

### Optimization issue, Nonlinear: automatic analytical Jacobian/Hessian from objective and constraints in R?

In R, is it possible to find the Jacobian/Hessian/sparsity pattern analytically when you provide just the objective function and constraints for an optimization problem?
AMPL does this, and from what I hear even MATLAB can do this, but I don't know if you need Knitro for this.
However, all the optim...

3 votes · 1 answer · 449 views

### Where is Wengert List in TensorFlow?

TensorFlow uses reverse-mode automatic differentiation (reverse-mode AD), as shown in https://github.com/tensorflow/tensorflow/issues/675.
Reverse-mode AD needs a data structure called a Wengert list - see https://en.wikipedia.org/wiki/Automatic_differentiation#Reverse_accumulation.
However, searching...
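The Wengert list is just a linear record of the primitive operations executed; in TensorFlow the static graph plays that role, so the name never appears. A toy tape-based sketch (illustrative only, not TensorFlow internals) makes the structure concrete:

```python
# A Wengert list is a linear tape of primitive ops; the reverse
# sweep walks it backwards, accumulating adjoints (gradients).
tape = []  # entries: (output, [(input, local_partial), ...])

class Var:
    def __init__(self, val):
        self.val, self.grad = val, 0.0

    def __add__(self, o):
        out = Var(self.val + o.val)
        tape.append((out, [(self, 1.0), (o, 1.0)]))
        return out

    def __mul__(self, o):
        out = Var(self.val * o.val)
        tape.append((out, [(self, o.val), (o, self.val)]))
        return out

def backward(out):
    out.grad = 1.0
    for node, parents in reversed(tape):
        for p, local in parents:
            p.grad += node.grad * local

x, y = Var(3.0), Var(4.0)
z = x * y + x          # z = xy + x
backward(z)
print(x.grad, y.grad)  # dz/dx = y + 1 = 5, dz/dy = x = 3
```

Any topological order of the recorded ops works for the reverse sweep, which is why a graph can substitute for an explicit list.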

1 vote · 2 answers · 117 views

### Computational Efficiency of Forward Mode Automatic vs Numeric vs Symbolic Differentiation

I am trying to solve a problem of finding the roots of a function using the Newton-Raphson (NR) method in the C language. The functions whose roots I would like to find are mostly polynomials, but they may also contain trigonometric and logarithmic terms.
The NR method requires finding the diffe...

13 votes · 2 answers · 3k views

### What does the parameter retain_graph mean in the Variable's backward() method?

I'm going through the neural transfer pytorch tutorial and am confused about the use of retain_variable (deprecated, now referred to as retain_graph). The code example shows:
class ContentLoss(nn.Module):
def __init__(self, target, weight):
super(ContentLoss, self).__init__()
self.target = target.deta...

3 votes · 1 answer · 63 views

### Using automatic differentiation on a function that makes use of a preallocated array in Julia

My long subject title pretty much covers it.
I have managed to isolate my much bigger problem in the following contrived example below. I cannot figure out where the problem exactly is, though I imagine it has something to do with the type of the preallocated array?
using ForwardDiff
function test()...

2 votes · 1 answer · 357 views

### Change fortran compile order in NetBeans 8

I'm working in NetBeans 8 on CentOS 7 to change some old fortran code to replace numerical differentiation with automatic differentiation using OpenAD. OpenAD takes an annotated fortran function as input and generates an automatically differentiated function as output. That output function depends...

3 votes · 1 answer · 288 views

### How to do automatic differentiation on complex datatypes?

Given a very simple Matrix definition based on Vector:
import Numeric.AD
import qualified Data.Vector as V
newtype Mat a = Mat { unMat :: V.Vector a }
scale' f = Mat . V.map (*f) . unMat
add' a b = Mat $ V.zipWith (+) (unMat a) (unMat b)
sub' a b = Mat $ V.zipWith (-) (unMat a) (unMat b)
mul' a b =...

13 votes · 4 answers · 2.3k views

### Is there any working implementation of reverse mode automatic differentiation for Haskell?

The closest-related implementation in Haskell I have seen is the forward mode at http://hackage.haskell.org/packages/archive/fad/1.0/doc/html/Numeric-FAD.html.
The closest related research appears to be reverse mode for another functional language related to Scheme at http://www.bcl.hamilton...

4 votes · 1 answer · 394 views

### Combining Eigen and CppAD

I want to use automatic differentiation mechanism provided by
CppAD inside Eigen linear algebra. An example type is
Eigen::Matrix< CppAD::AD,-1,-1>. As CppAD::AD is a custom numeric type
the NumTraits for this type have to be provided. CppAD provides
those in the file cppad/example/cppad_eigen.hpp....

5 votes · 0 answers · 659 views

### How to create an array-like class compatible with NumPy ufuncs?

I'm trying to implement Automatic Differentiation using a class that behaves like a NumPy array. It does not subclass numpy.ndarray, but contains two array attributes. One for the value, and one for the Jacobian matrix. Every operation is overloaded to operate both on the value and the Jacobian. How...
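NumPy dispatches ufunc calls on non-ndarray operands through the `__array_ufunc__` protocol, which is the usual hook for this kind of class. A minimal sketch (class name `Jet` and the value/Jacobian layout are illustrative, covering only np.add and np.multiply):

```python
import numpy as np

class Jet:
    # Hypothetical minimal array-like: a scalar value plus a Jacobian row.
    def __init__(self, val, jac):
        self.val = np.asarray(val, dtype=float)
        self.jac = np.asarray(jac, dtype=float)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        if method != "__call__" or kwargs:
            return NotImplemented
        # treat non-Jet operands as constants (zero derivative)
        vals = [i.val if isinstance(i, Jet) else np.asarray(i) for i in inputs]
        jacs = [i.jac if isinstance(i, Jet) else 0.0 for i in inputs]
        if ufunc is np.add:
            return Jet(vals[0] + vals[1], jacs[0] + jacs[1])
        if ufunc is np.multiply:  # product rule on the Jacobians
            return Jet(vals[0] * vals[1],
                       jacs[0] * vals[1] + vals[0] * jacs[1])
        return NotImplemented

x = Jet(2.0, [1.0, 0.0])  # seed d/dx
y = Jet(5.0, [0.0, 1.0])  # seed d/dy
z = np.add(np.multiply(x, y), x)  # z = xy + x
print(z.val, z.jac)  # 12.0, gradient [y + 1, x] = [6, 2]
```

A full implementation would map every ufunc to its derivative rule; the protocol itself stays the same.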

4 votes · 3 answers · 603 views

### Derivative of a Higher-Order Function

This is in the context of Automatic Differentiation - what would such a system do with a function like map, or filter - or even one of the SKI Combinators?
Example: I have the following function:
def func(x):
return sum(map(lambda a: a**x, range(20)))
What would its derivative be? What will an AD sy...
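For this example the answer has a closed form an AD tool would effectively compute: tracing through map and the lambda, d/dx of a**x is a**x·ln(a), summed over the mapped values (the a = 0 term is constant for x > 0). A sketch checking that against a central difference:

```python
import math

def func(x):
    return sum(map(lambda a: a ** x, range(20)))

# What an AD system would produce for d func/dx, term by term:
def dfunc(x):
    return sum(a ** x * math.log(a) for a in range(1, 20))

x, h = 2.0, 1e-6
fd = (func(x + h) - func(x - h)) / (2 * h)
print(dfunc(x), fd)  # agree to many digits
```

Higher-order functions pose no special problem for AD as long as every primitive applied to the traced value has a known derivative; the combinator structure just determines which primitives run.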

2 votes · 1 answer · 577 views

### Partial Derivative using Autograd

I have a function that takes in a multivariate argument x. Here x = [x1,x2,x3]. Let's say my function looks like:
f(x,T) = np.dot(x,T) + np.exp(np.dot(x,T)) where T is a constant.
I am interested in finding df/dx1, df/dx2 and df/dx3 functions.
I have achieved some success using scipy diff, but I am a...
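For this f the chain rule gives all three partials at once, df/dx_i = T_i·(1 + exp(x·T)), which is what autograd's grad would return as a vector. A sketch (the concrete T and x values are illustrative) validated against central differences:

```python
import numpy as np

T = np.array([0.1, 0.2, 0.3])  # constant, chosen for illustration

def f(x, T):
    return np.dot(x, T) + np.exp(np.dot(x, T))

# Chain rule: df/dx_i = T_i * (1 + exp(x . T)), the whole gradient at once.
def grad_f(x, T):
    return T * (1.0 + np.exp(np.dot(x, T)))

x = np.array([1.0, 2.0, 3.0])
g = grad_f(x, T)

# check each partial against a central difference
h = 1e-6
for i in range(3):
    e = np.zeros(3)
    e[i] = h
    fd = (f(x + e, T) - f(x - e, T)) / (2 * h)
    assert abs(fd - g[i]) < 1e-5
print(g)
```

An AD tool gives the same vector without the hand derivation, and without the step-size tuning that scipy-style differencing needs.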

7 votes · 1 answer · 1.4k views

### how is backpropagation the same (or not) as reverse automatic differentiation?

The Wikipedia page for backpropagation has this claim:
The backpropagation algorithm for calculating a gradient has been
rediscovered a number of times, and is a special case of a more
general technique called automatic differentiation in the reverse
accumulation mode.
Can someone expound on this, p...

57 votes · 3 answers · 4.7k views

### Why don't C++ compilers do better constant folding?

I'm investigating ways to speed up a large section of C++ code, which has automatic derivatives for computing jacobians. This involves doing some amount of work in the actual residuals, but the majority of the work (based on profiled execution time) is in calculating the jacobians.
This surprised m...

7 votes · 2 answers · 228 views

### Numeric.AD and typing problem

I'm trying to work with Numeric.AD and a custom Expr type. I wish to calculate
the symbolic gradient of user inputted expression. The first trial with a constant
expression works nicely:
calcGrad0 :: [Expr Double]
calcGrad0 = grad df vars
where
df [x,y] = eval (env [x,y]) (EVar 'x'*EVar 'y')
env vs...

3 votes · 1 answer · 136 views

### Numeric.AD - type variable escaping its scope

I'm trying to use automatic differentiation in Haskell for a nonlinear control problem, but have some problems getting it to work. I basically have a cost function, which should be optimized given an initial state. The types are:
data Reference a = Reference a deriving Functor
data Plant a = Plant a...

3 votes · 1 answer · 335 views

### Automatic differentiation with ForwardDiff in Julia

I am having some trouble using correctly the ForwardDiff package in Julia. I have managed to isolate my problem in the following chunk of code.
In short, I define the function:
using ForwardDiff
function likelihood(mu,X)
N = size(X,2)
# Calculate likelihood
aux = zeros(N)
for nn=1:N
aux[nn] = exp(-0...

2 votes · 2 answers · 110 views

### Breaking TensorFlow gradient calculation into two (or more) parts

Is it possible to use TensorFlow's tf.gradients() function in parts, that is, calculate the gradient of the loss w.r.t. some tensor, and of that tensor w.r.t. the weights, and then multiply them to get the original gradient from the loss to the weights?
For example, let W,b be some weights, let x be a...
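Splitting the gradient this way is just the chain rule. A numpy sketch of the idea (shapes and the quadratic loss are illustrative, not the asker's model): with y = Wx + b and L = Σy², dL/dW factors as dL/dy times dy/dW.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((2, 3))
b = rng.standard_normal(2)
x = rng.standard_normal(3)

y = W @ x + b        # forward pass
L = np.sum(y ** 2)

# two-step gradient: dL/dy first, then chain through dy/dW and dy/db
dL_dy = 2.0 * y               # shape (2,)
dL_dW = np.outer(dL_dy, x)    # (dL/dy_i) * (dy_i/dW_ij) = dL_dy_i * x_j
dL_db = dL_dy                 # dy_i/db_i = 1

# check one entry of dL/dW against a direct finite difference
h = 1e-6
Wp = W.copy()
Wp[0, 1] += h
fd = (np.sum((Wp @ x + b) ** 2) - L) / h
print(dL_dW[0, 1], fd)  # agree to several digits
```

In TensorFlow the analogue is two tf.gradients() calls chained via the grad_ys argument, which seeds the second call with the output of the first.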

2 votes · 0 answers · 105 views

### Automatic differentiation with custom data types

I'm facing a problem while trying to differentiate custom data types using the Haskell ad library. There is a related question here, which has been helpful, but I feel it might be insufficient in this case.
Here is a simplified version of the issue that I'm facing:
{-# LANGUAGE DeriveFunctor #-}
{-# LA...

3 votes · 1 answer · 825 views

### how does the pytorch autograd work?

I submitted this as an issue to the cycleGAN pytorch implementation, but since nobody replied to me there, I will ask again here.
I'm mainly puzzled by the fact that multiple forward passes are called before a single backward pass; see the following code in cycle_gan_model
# GAN loss
# D_A(G_A(A))
self.f...