DiffSharp provides most of the essential operations found in tensor libraries such as NumPy, PyTorch, and TensorFlow. All differentiable operations support the forward, reverse, and nested differentiation modes.
When implementing new operations, you should prefer to implement them as compositions of existing DiffSharp Tensor operations, which gives you differentiability out of the box.
In the rare cases where you need to extend DiffSharp with a completely new differentiable operation that cannot be implemented as a composition of existing operations, you can use the provided extension API.
If the function you would like to implement is a simple elementwise function, you can use the UnaryOpElementwise or BinaryOpElementwise types to define your function and its derivatives. The forward, reverse, and nested differentiation rules for the function are automatically generated by the type. The documentation of these two types details how they should be instantiated.
Let's see several examples.
\(f(a) = \mathrm{sin}(a)\), with derivative \(\frac{\partial f(a)}{\partial a} = \mathrm{cos}(a) \;\).
```fsharp
open DiffSharp

type Tensor with
    member a.sin() =
        Tensor.Op
            { new UnaryOpElementwise("sin") with
                member _.fRaw(a) = a.SinT()
                member _.dfda(a,f) = a.cos()
            }
            (a)
```
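The `dfda` rule above is just the scalar calculus fact \(\frac{d}{da}\sin(a) = \cos(a)\) applied elementwise. As a quick language-agnostic sanity check (plain Python, no DiffSharp involved), a central finite difference agrees with the analytic derivative:

```python
import math

def fd_derivative(f, x, h=1e-6):
    # Central finite difference: (f(x+h) - f(x-h)) / (2h)
    return (f(x + h) - f(x - h)) / (2 * h)

x = 0.7
approx = fd_derivative(math.sin, x)
exact = math.cos(x)
assert abs(approx - exact) < 1e-8
```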
\(f(a) = \mathrm{log}(a)\), with derivative \(\frac{\partial f(a)}{\partial a} = 1/a \;\).
```fsharp
type Tensor with
    member a.log() =
        Tensor.Op
            { new UnaryOpElementwise("log") with
                member _.fRaw(a) = a.LogT()
                member _.dfda(a,f) = 1/a
            }
            (a)
```
\(f(a, b) = ab\), with derivatives \(\frac{\partial f(a, b)}{\partial a} = b\), \(\frac{\partial f(a, b)}{\partial b} = a \;\).
```fsharp
type Tensor with
    member a.mul(b) =
        Tensor.Op
            { new BinaryOpElementwise("mul") with
                member _.fRaw(a,b) = a.MulTT(b)
                member _.dfda(a,b,f) = b
                member _.dfdb(a,b,f) = a
            }
            (a,b)
```
\(f(a, b) = a^b\), with derivatives \(\frac{\partial f(a, b)}{\partial a} = b a^{b-1}\), \(\frac{\partial f(a, b)}{\partial b} = a^b \mathrm{log}(a) \;\). Note the use of the argument f in the derivative definitions, which gives access to the pre-computed primal value \(f(a, b) = a^b\), avoiding redundant computation.
```fsharp
type Tensor with
    member a.pow(b) =
        Tensor.Op
            { new BinaryOpElementwise("pow") with
                member _.fRaw(a,b) = a.PowTT(b)
                member _.dfda(a,b,f) = b * f / a    // equivalent to b * a.pow(b-1)
                member _.dfdb(a,b,f) = f * a.log()  // equivalent to a.pow(b) * a.log()
            }
            (a,b)
```
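The reuse of f here is only an algebraic shortcut: \(b \, f / a = b \, a^{b-1}\) and \(f \, \mathrm{log}(a) = a^b \, \mathrm{log}(a)\) whenever \(a > 0\). A quick numeric check of these identities in plain Python (no DiffSharp involved):

```python
import math

a, b = 1.7, 2.3
f = a ** b  # pre-computed primal value, as passed to dfda/dfdb

dfda_via_f = b * f / a          # derivative w.r.t. a, reusing f
dfda_direct = b * a ** (b - 1)  # textbook form
assert math.isclose(dfda_via_f, dfda_direct)

dfdb_via_f = f * math.log(a)        # derivative w.r.t. b, reusing f
dfdb_direct = a ** b * math.log(a)  # textbook form
assert math.isclose(dfdb_via_f, dfdb_direct)
```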
For more complicated functions, you can use the most general way of defining functions using the UnaryOp or BinaryOp types, which allow you to define the full forward and reverse mode differentiation rules. The documentation of these two types details how they should be instantiated.
Let's see several examples.
\(f(A) = A^{\intercal}\), with the forward derivative propagation rule \(\frac{\partial f(A)}{\partial X} = \frac{\partial A}{\partial X} \frac{\partial f(A)}{\partial A} = (\frac{\partial A}{\partial X})^{\intercal}\) and the reverse derivative propagation rule \(\frac{\partial Y}{\partial A} = \frac{\partial Y}{\partial f(A)} \frac{\partial f(A)}{\partial A} = (\frac{\partial Y}{\partial f(A)})^{\intercal} \;\).
```fsharp
type Tensor with
    member a.transpose() =
        Tensor.Op
            { new UnaryOp("transpose") with
                member _.fRaw(a) = a.TransposeT2()
                member _.ad_dfda(a,ad,f) = ad.transpose()
                member _.fd_dfda(a,f,fd) = fd.transpose()
            }
            (a)
```
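The reverse rule above says that the adjoint of A is simply the transpose of the adjoint fd arriving at the output of f. A numeric sanity check in plain Python (no DiffSharp involved; the loss and matrix values are made up for illustration):

```python
# Scalar loss that consumes the transposed matrix:
#   L(A) = sum over (i, j) of W[i][j] * transpose(A)[i][j]
# The incoming adjoint at f's output is then W, and the reverse rule
# predicts dL/dA = transpose(W).

def transpose(m):
    return [list(row) for row in zip(*m)]

W = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # adjoint at f's output (2x3)
grad_A = transpose(W)                    # adjoint propagated to A (3x2)

A = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

def loss(A):
    T = transpose(A)
    return sum(W[i][j] * T[i][j] for i in range(2) for j in range(3))

# Finite-difference check of dL/dA[i][j] for every entry of A
h = 1e-6
for i in range(3):
    for j in range(2):
        A[i][j] += h; up = loss(A)
        A[i][j] -= 2 * h; dn = loss(A)
        A[i][j] += h
        assert abs((up - dn) / (2 * h) - grad_A[i][j]) < 1e-6
```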
\(f(A, B) = AB\), with the forward derivative propagation rule \(\frac{\partial f(A, B)}{\partial X} = \frac{\partial A}{\partial X} \frac{\partial f(A, B)}{\partial A} + \frac{\partial B}{\partial X} \frac{\partial f(A, B)}{\partial B} = \frac{\partial A}{\partial X} B + A \frac{\partial B}{\partial X}\) and the reverse derivative propagation rules \(\frac{\partial Y}{\partial A} = \frac{\partial Y}{\partial f(A, B)} \frac{\partial f(A, B)}{\partial A} = \frac{\partial Y}{\partial f(A, B)} B^{\intercal}\), \(\frac{\partial Y}{\partial B} = \frac{\partial Y}{\partial f(A, B)} \frac{\partial f(A, B)}{\partial B} = A^{\intercal} \frac{\partial Y}{\partial f(A, B)} \;\).
```fsharp
type Tensor with
    member a.matmul(b) =
        Tensor.Op
            { new BinaryOp("matmul") with
                member _.fRaw(a,b) = a.MatMulTT(b)
                member _.ad_dfda(a,ad,b,f) = ad.matmul(b)
                member _.bd_dfdb(a,b,bd,f) = a.matmul(bd)
                member _.fd_dfda(a,b,f,fd) = fd.matmul(b.transpose())
                member _.fd_dfdb(a,b,f,fd) = a.transpose().matmul(fd)
            }
            (a,b)
```
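The reverse rules are the standard matrix-calculus identities for matrix multiplication. A numeric sanity check in plain Python (no DiffSharp involved; the loss and matrix values are made up for illustration):

```python
# Scalar loss L(A, B) = sum of entries of A @ B, so the adjoint fd
# arriving at the output is the all-ones matrix. The reverse rules
# then predict:
#   dL/dA = fd @ B^T    and    dL/dB = A^T @ fd

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(M):
    return [list(row) for row in zip(*M)]

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[0.5, -1.0], [2.0, 0.25]]
fd = [[1.0, 1.0], [1.0, 1.0]]  # adjoint of the 2x2 output

grad_A = matmul(fd, transpose(B))  # fd_dfda rule
grad_B = matmul(transpose(A), fd)  # fd_dfdb rule

def loss(A, B):
    return sum(sum(row) for row in matmul(A, B))

# Finite-difference check of both gradients, entry by entry
h = 1e-6
for i in range(2):
    for j in range(2):
        A[i][j] += h; up = loss(A, B)
        A[i][j] -= 2 * h; dn = loss(A, B)
        A[i][j] += h
        assert abs((up - dn) / (2 * h) - grad_A[i][j]) < 1e-6
        B[i][j] += h; up = loss(A, B)
        B[i][j] -= 2 * h; dn = loss(A, B)
        B[i][j] += h
        assert abs((up - dn) / (2 * h) - grad_B[i][j]) < 1e-6
```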