DiffSharp provides most of the essential operations found in tensor libraries such as NumPy, PyTorch, and TensorFlow. All differentiable operations support the forward, reverse, and nested differentiation modes.
When implementing new operations, you should prefer to implement them as compositions of existing DiffSharp Tensor operations, which gives you differentiability out of the box.
In the rare cases where you need to extend DiffSharp with a completely new differentiable operation that cannot be implemented as a composition of existing operations, you can use the provided extension API.
If the function you would like to implement is a simple elementwise function, you can use the UnaryOpElementwise or BinaryOpElementwise types to define your function and its derivatives. The forward, reverse, and nested differentiation rules for the function are automatically generated by the type. The documentation of these two types details how they should be instantiated.
Let's see several examples.
\(f(a) = \mathrm{sin}(a)\), with derivative \(\frac{\partial f(a)}{\partial a} = \mathrm{cos}(a) \;\).
```fsharp
open DiffSharp

type Tensor with
    member a.sin() =
        Tensor.Op
            { new UnaryOpElementwise("sin") with
                member _.fRaw(a) = a.SinT()
                member _.dfda(a,f) = a.cos()
            }
            (a)
```
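The `dfda` rule above is just the scalar calculus fact \(\frac{d}{da}\sin(a) = \cos(a)\) applied elementwise. As a quick language-agnostic sanity check (plain Python, no DiffSharp involved), a central finite difference agrees with the analytic derivative:

```python
import math

def fd_derivative(f, x, h=1e-6):
    # Central finite difference: (f(x+h) - f(x-h)) / (2h)
    return (f(x + h) - f(x - h)) / (2 * h)

x = 0.7
approx = fd_derivative(math.sin, x)
exact = math.cos(x)
assert abs(approx - exact) < 1e-8
```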
\(f(a) = \mathrm{log}(a)\), with derivative \(\frac{\partial f(a)}{\partial a} = 1/a \;\).
```fsharp
type Tensor with
    member a.log() =
        Tensor.Op
            { new UnaryOpElementwise("log") with
                member _.fRaw(a) = a.LogT()
                member _.dfda(a,f) = 1/a
            }
            (a)
```
\(f(a, b) = ab\), with derivatives \(\frac{\partial f(a, b)}{\partial a} = b\), \(\frac{\partial f(a, b)}{\partial b} = a \;\).
```fsharp
type Tensor with
    member a.mul(b) =
        Tensor.Op
            { new BinaryOpElementwise("mul") with
                member _.fRaw(a,b) = a.MulTT(b)
                member _.dfda(a,b,f) = b
                member _.dfdb(a,b,f) = a
            }
            (a,b)
```
\(f(a, b) = a^b\), with derivatives \(\frac{\partial f(a, b)}{\partial a} = b a^{b-1}\), \(\frac{\partial f(a, b)}{\partial b} = a^b \mathrm{log}(a) \;\). Note the use of the argument f in the derivative definitions, which gives access to the pre-computed primal value \(f(a, b) = a^b\), avoiding redundant computation.
```fsharp
type Tensor with
    member a.pow(b) =
        Tensor.Op
            { new BinaryOpElementwise("pow") with
                member _.fRaw(a,b) = a.PowTT(b)
                member _.dfda(a,b,f) = b * f / a    // equivalent to b * a.pow(b-1)
                member _.dfdb(a,b,f) = f * a.log()  // equivalent to a.pow(b) * a.log()
            }
            (a,b)
```
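The reuse of f here is only an algebraic shortcut: \(b \, f / a = b \, a^{b-1}\) and \(f \, \mathrm{log}(a) = a^b \, \mathrm{log}(a)\) whenever \(a > 0\). A quick numeric check of these identities in plain Python (no DiffSharp involved):

```python
import math

a, b = 1.7, 2.3
f = a ** b  # pre-computed primal value, as passed to dfda/dfdb

dfda_via_f = b * f / a          # derivative w.r.t. a, reusing f
dfda_direct = b * a ** (b - 1)  # textbook form
assert math.isclose(dfda_via_f, dfda_direct)

dfdb_via_f = f * math.log(a)        # derivative w.r.t. b, reusing f
dfdb_direct = a ** b * math.log(a)  # textbook form
assert math.isclose(dfdb_via_f, dfdb_direct)
```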
For more complicated functions, you can use the most general way of defining functions using the UnaryOp or BinaryOp types, which allow you to define the full forward and reverse mode differentiation rules. The documentation of these two types details how they should be instantiated.
Let's see several examples.
\(f(A) = A^{\intercal}\), with the forward derivative propagation rule \(\frac{\partial f(A)}{\partial X} = \frac{\partial A}{\partial X} \frac{\partial f(A)}{\partial A} = (\frac{\partial A}{\partial X})^{\intercal}\) and the reverse derivative propagation rule \(\frac{\partial Y}{\partial A} = \frac{\partial Y}{\partial f(A)} \frac{\partial f(A)}{\partial A} = (\frac{\partial Y}{\partial f(A)})^{\intercal} \;\).
```fsharp
type Tensor with
    member a.transpose() =
        Tensor.Op
            { new UnaryOp("transpose") with
                member _.fRaw(a) = a.TransposeT2()
                member _.ad_dfda(a,ad,f) = ad.transpose()
                member _.fd_dfda(a,f,fd) = fd.transpose()
            }
            (a)
```
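The reverse rule above says that the adjoint of A is simply the transpose of the adjoint fd arriving at the output of f. A numeric sanity check in plain Python (no DiffSharp involved; the loss and matrix values are made up for illustration):

```python
# Scalar loss that consumes the transposed matrix:
#   L(A) = sum over (i, j) of W[i][j] * transpose(A)[i][j]
# The incoming adjoint at f's output is then W, and the reverse rule
# predicts dL/dA = transpose(W).

def transpose(m):
    return [list(row) for row in zip(*m)]

W = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # adjoint at f's output (2x3)
grad_A = transpose(W)                    # adjoint propagated to A (3x2)

A = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

def loss(A):
    T = transpose(A)
    return sum(W[i][j] * T[i][j] for i in range(2) for j in range(3))

# Finite-difference check of dL/dA[i][j] for every entry of A
h = 1e-6
for i in range(3):
    for j in range(2):
        A[i][j] += h; up = loss(A)
        A[i][j] -= 2 * h; dn = loss(A)
        A[i][j] += h
        assert abs((up - dn) / (2 * h) - grad_A[i][j]) < 1e-6
```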
\(f(A, B) = AB\), with the forward derivative propagation rule \(\frac{\partial f(A, B)}{\partial X} = \frac{\partial A}{\partial X} \frac{\partial f(A, B)}{\partial A} + \frac{\partial B}{\partial X} \frac{\partial f(A, B)}{\partial B} = \frac{\partial A}{\partial X} B + A \frac{\partial B}{\partial X}\) and the reverse derivative propagation rules \(\frac{\partial Y}{\partial A} = \frac{\partial Y}{\partial f(A, B)} \frac{\partial f(A, B)}{\partial A} = \frac{\partial Y}{\partial f(A, B)} B^{\intercal}\), \(\frac{\partial Y}{\partial B} = \frac{\partial Y}{\partial f(A, B)} \frac{\partial f(A, B)}{\partial B} = A^{\intercal} \frac{\partial Y}{\partial f(A, B)} \;\).
```fsharp
type Tensor with
    member a.matmul(b) =
        Tensor.Op
            { new BinaryOp("matmul") with
                member _.fRaw(a,b) = a.MatMulTT(b)
                member _.ad_dfda(a,ad,b,f) = ad.matmul(b)
                member _.bd_dfdb(a,b,bd,f) = a.matmul(bd)
                member _.fd_dfda(a,b,f,fd) = fd.matmul(b.transpose())
                member _.fd_dfdb(a,b,f,fd) = a.transpose().matmul(fd)
            }
            (a,b)
```
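The reverse rules are the standard matrix-calculus identities for matrix multiplication. A numeric sanity check in plain Python (no DiffSharp involved; the loss and matrix values are made up for illustration):

```python
# Scalar loss L(A, B) = sum of entries of A @ B, so the adjoint fd
# arriving at the output is the all-ones matrix. The reverse rules
# then predict:
#   dL/dA = fd @ B^T    and    dL/dB = A^T @ fd

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(M):
    return [list(row) for row in zip(*M)]

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[0.5, -1.0], [2.0, 0.25]]
fd = [[1.0, 1.0], [1.0, 1.0]]  # adjoint of the 2x2 output

grad_A = matmul(fd, transpose(B))  # fd_dfda rule
grad_B = matmul(transpose(A), fd)  # fd_dfdb rule

def loss(A, B):
    return sum(sum(row) for row in matmul(A, B))

# Finite-difference check of both gradients, entry by entry
h = 1e-6
for i in range(2):
    for j in range(2):
        A[i][j] += h; up = loss(A, B)
        A[i][j] -= 2 * h; dn = loss(A, B)
        A[i][j] += h
        assert abs((up - dn) / (2 * h) - grad_A[i][j]) < 1e-6
        B[i][j] += h; up = loss(A, B)
        B[i][j] -= 2 * h; dn = loss(A, B)
        B[i][j] += h
        assert abs((up - dn) / (2 * h) - grad_B[i][j]) < 1e-6
```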