DiffSharp


K-Means Clustering

K-means clustering is a method in cluster analysis for partitioning a given set of observations into \(k\) clusters, where the observations in the same cluster are more similar to each other than to those in other clusters.

Given \(d\) observations \(\{\mathbf{x}_1,\dots,\mathbf{x}_d\}\), the observations are assigned to \(k\) clusters \(\mathbf{S} = \{S_1,\dots,S_k\}\) so as to minimize

\[ \underset{\mathbf{S}}{\textrm{argmin}} \sum_{i=1}^{k} \sum_{\mathbf{x} \in S_i} \left|\left| \mathbf{x} - \mathbf{\mu}_i \right|\right|^2 \; ,\]

where \(\mathbf{\mu}_i\) is the mean of the observations in \(S_i\).

The classical way of finding k-means partitionings is to use a heuristic algorithm cycling through an assignment step, where observations are assigned to the cluster of the mean that they are currently closest to, and an update step where the means are updated as the centroids of the observations that are currently assigned to them, until assignments no longer change.

Let's use an alternative approach and implement k-means clustering using the stochastic gradient descent algorithm that we introduced in another example. This variety of k-means clustering has been proposed in the literature for addressing large-scale learning tasks, due to its superior performance.

We start with the generic stochastic gradient descent code, introduced in the stochastic gradient descent example, which can be used for finding weights \(\mathbf{w}\) optimizing a model function \(f_{\mathbf{w}}: \mathbb{R}^n \to \mathbb{R}^m\) trained using a set of inputs \(\mathbf{x}_i \in \mathbb{R}^n\) and outputs \(\mathbf{y}_i \in \mathbb{R}^m\).

 1: 
 2: 
 3: 
 4: 
 5: 
 6: 
 7: 
 8: 
 9: 
10: 
11: 
12: 
open DiffSharp.AD.Float64

let rnd = new System.Random()

// Stochastic gradient descent
// f: function, w0: starting weights, eta: step size, epsilon: threshold, t: training set
let sgd f w0 (eta:D) epsilon (t:(DV*DV)[]) =
    let rec desc w =
        let x, y = t.[rnd.Next(t.Length)]
        let g = grad (fun wi -> DV.l2norm (y - (f wi x))) w
        if DV.l2norm g < epsilon then w else desc (w - eta * g)
    desc w0

The following code implements the k-means model in the form

\[ f_{\mathbf{w}}(\mathbf{x}) = \left|\left| \mathbf{x} - \mathbf{\mu}_{\ast} \right|\right|^2 \; ,\]

where \(\mathbf{x} \in \mathbb{R}^n\),

\[ \mathbf{\mu}_{\ast} = \{ \mathbf{\mu}_i : \left|\left| \mathbf{x} - \mathbf{\mu}_i \right|\right|^2 \le \left|\left| \mathbf{x} - \mathbf{\mu}_j \right|\right|^2 \; , \; \textrm{for all} \; 1 \le j \le k \}\]

is the closest of the current means to the given point \(\mathbf{x}\), and the current means \(\mathbf{\mu}_i \in \mathbb{R}^n\) encoded in the weight vector \(\mathbf{w} \in \mathbb{R}^{k\,n}\) are obtained by splitting it into subvectors \(\{\mathbf{\mu}_1,\dots,\mathbf{\mu}_k\}\)

\[ \mathbf{w} = \left[ \mathbf{\mu}_1 \, \mathbf{\mu}_2 \, \dots \, \mathbf{\mu}_k \right] \; .\]

A given set of \(d\) observations are then supplied to the stochastic gradient descent algorithm as the training set consisting of pairs \((\mathbf{x}_i,\,0)\) (thus decreasing \(f_{\mathbf{W}} (\mathbf{x}_i) \to 0\,\) for all \(1 \le i \le d\)).

An important thing to note here is that DiffSharp can take the derivative (via reverse mode AD) of this whole algorithm, which includes subprocedures, control flow, and random sampling, and makes the gradient calculations transparent. We do not need to concern ourselves with formulating the model in a closed-form expression for being able to define and then compute its gradient.

 1: 
 2: 
 3: 
 4: 
 5: 
 6: 
 7: 
 8: 
 9: 
10: 
11: 
12: 
13: 
// k-means clustering
// k: number of partitions, eta: SGD step size, epsilon: SGD threshold, data: observations
let kmeans k eta epsilon (data:DV[]) =
    // (index of, squared distance to) the nearest mean to x
    let inline nearestm (x:DV) (means:seq<DV>) =
        means |> Seq.mapi (fun i m -> i, DV.normSq (x - m)) |> Seq.minBy snd
    // Squared distance of x to the nearest of the means encoded in w
    let inline dist (w:DV) (x:DV) = w |> DV.splitEqual k |> nearestm x |> snd
    let w0 = Seq.init k (fun _ -> data.[rnd.Next(data.Length)]) |> DV.concat
    let wopt = Array.zip data (Array.create data.Length (toDV [0.])) |> sgd dist w0 (D eta) (D epsilon)
    let means = DV.splitEqual k wopt
    let assign = Array.map (fun d -> (nearestm d means |> fst, d)) data
    Array.init k (fun i -> assign |> Array.filter (fun (j, d) -> i = j) |> Array.map snd)

Now let's test the algorithm in two-dimensions, using a set of randomly generated points.

1: 
2: 
3: 
4: 
5: 
// Generate 200 random points
let data = Array.init 200 (fun _ -> (DV.init 2 (fun _ -> rnd.NextDouble())))

// Partition the data into 5 clusters
let clusters = kmeans 5 0.01 0.01 data
1: 
2: 
3: 
4: 
5: 
6: 
7: 
8: 
open FSharp.Charting

let plotClusters (c:DV[][]) =
    List.init c.Length (fun i -> 
        Chart.Point(Array.map (fun (d:DV) -> float d.[0], float d.[1]) c.[i], MarkerSize = 10))
    |> Chart.Combine

plotClusters clusters
Chart

Compared to the commonly used batch-update k-means algorithm running through all the observations at each step, this algorithm has the advantage of running independent from the size of the data set and thus being suitable for large-scale or online learning applications.

1: 
2: 
3: 
4: 
5: 
6: 
7: 
// Generate 10000 random points
let data2 = Array.init 10000 (fun _ -> (DV.init 2 (fun _ -> rnd.NextDouble())))

// Partition the data into 8 clusters
let clusters2 = kmeans 8 0.01 0.01 data2

plotClusters clusters2
Chart

Finally, we can test our algorithm with the Iris flower data set that is commonly used for demonstrations. The data set contains four morphological features (sepal length, sepal width, petal length, petal width) of Iris flowers belonging to three related species. A version of the data set can be found here.

 1: 
 2: 
 3: 
 4: 
 5: 
 6: 
 7: 
 8: 
 9: 
10: 
11: 
12: 
open FSharp.Data

let iris = new CsvProvider<"./resources/iris.csv">()

let irisData = 
    iris.Rows
    |> Seq.map (fun r -> toDV [float r.``Sepal Width``; float r.``Petal Length``])
    |> Seq.toArray

let irisClusters = kmeans 3 0.01 0.01 irisData

plotClusters irisClusters
Chart

Our clustering of the sepal width - petal length data correctly predicts the actual assignment of these features to the three flower species, which can be seen below (image by user Indon on Wikimedia Commons, CC BY-SA 3.0).

Chart
namespace DiffSharp
namespace DiffSharp.AD
module Float64

from DiffSharp.AD
val rnd : System.Random

Full name: Examples-kmeansclustering.rnd
namespace System
Multiple items
type Random =
  new : unit -> Random + 1 overload
  member Next : unit -> int + 2 overloads
  member NextBytes : buffer:byte[] -> unit
  member NextDouble : unit -> float

Full name: System.Random

--------------------
System.Random() : unit
System.Random(Seed: int) : unit
val sgd : f:(DV -> DV -> D) -> w0:DV -> eta:D -> epsilon:D -> t:(DV * DV) [] -> DV

Full name: Examples-kmeansclustering.sgd
val f : (DV -> DV -> D)
val w0 : DV
val eta : D
type D =
  | D of float
  | DF of D * D * uint32
  | DR of D * D ref * TraceOp * uint32 ref * uint32
  interface IComparable
  member Copy : unit -> D
  override Equals : other:obj -> bool
  member GetForward : t:D * i:uint32 -> D
  override GetHashCode : unit -> int
  member GetReverse : i:uint32 -> D
  override ToString : unit -> string
  member A : D
  member F : uint32
  member P : D
  member PD : D
  member T : D
  member A : D with set
  member F : uint32 with set
  static member Abs : a:D -> D
  static member Acos : a:D -> D
  static member Asin : a:D -> D
  static member Atan : a:D -> D
  static member Atan2 : a:int * b:D -> D
  static member Atan2 : a:D * b:int -> D
  static member Atan2 : a:float * b:D -> D
  static member Atan2 : a:D * b:float -> D
  static member Atan2 : a:D * b:D -> D
  static member Ceiling : a:D -> D
  static member Cos : a:D -> D
  static member Cosh : a:D -> D
  static member Exp : a:D -> D
  static member Floor : a:D -> D
  static member Log : a:D -> D
  static member Log10 : a:D -> D
  static member LogSumExp : a:D -> D
  static member Max : a:D * b:D -> D
  static member Min : a:D * b:D -> D
  static member Op_D_D : a:D * ff:(float -> float) * fd:(D -> D) * df:(D * D * D -> D) * r:(D -> TraceOp) -> D
  static member Op_D_D_D : a:D * b:D * ff:(float * float -> float) * fd:(D * D -> D) * df_da:(D * D * D -> D) * df_db:(D * D * D -> D) * df_dab:(D * D * D * D * D -> D) * r_d_d:(D * D -> TraceOp) * r_d_c:(D * D -> TraceOp) * r_c_d:(D * D -> TraceOp) -> D
  static member Pow : a:int * b:D -> D
  static member Pow : a:D * b:int -> D
  static member Pow : a:float * b:D -> D
  static member Pow : a:D * b:float -> D
  static member Pow : a:D * b:D -> D
  static member ReLU : a:D -> D
  static member Round : a:D -> D
  static member Sigmoid : a:D -> D
  static member Sign : a:D -> D
  static member Sin : a:D -> D
  static member Sinh : a:D -> D
  static member SoftPlus : a:D -> D
  static member SoftSign : a:D -> D
  static member Sqrt : a:D -> D
  static member Tan : a:D -> D
  static member Tanh : a:D -> D
  static member One : D
  static member Zero : D
  static member ( + ) : a:int * b:D -> D
  static member ( + ) : a:D * b:int -> D
  static member ( + ) : a:float * b:D -> D
  static member ( + ) : a:D * b:float -> D
  static member ( + ) : a:D * b:D -> D
  static member ( / ) : a:int * b:D -> D
  static member ( / ) : a:D * b:int -> D
  static member ( / ) : a:float * b:D -> D
  static member ( / ) : a:D * b:float -> D
  static member ( / ) : a:D * b:D -> D
  static member op_Explicit : d:D -> float
  static member ( * ) : a:int * b:D -> D
  static member ( * ) : a:D * b:int -> D
  static member ( * ) : a:float * b:D -> D
  static member ( * ) : a:D * b:float -> D
  static member ( * ) : a:D * b:D -> D
  static member ( - ) : a:int * b:D -> D
  static member ( - ) : a:D * b:int -> D
  static member ( - ) : a:float * b:D -> D
  static member ( - ) : a:D * b:float -> D
  static member ( - ) : a:D * b:D -> D
  static member ( ~- ) : a:D -> D

Full name: DiffSharp.AD.Float64.D
val epsilon : D
val t : (DV * DV) []
Multiple items
union case DV.DV: float [] -> DV

--------------------
module DV

from DiffSharp.AD.Float64

--------------------
type DV =
  | DV of float []
  | DVF of DV * DV * uint32
  | DVR of DV * DV ref * TraceOp * uint32 ref * uint32
  member Copy : unit -> DV
  member GetForward : t:DV * i:uint32 -> DV
  member GetReverse : i:uint32 -> DV
  member GetSlice : lower:int option * upper:int option -> DV
  member ToArray : unit -> D []
  member ToColDM : unit -> DM
  member ToMathematicaString : unit -> string
  member ToMatlabString : unit -> string
  member ToRowDM : unit -> DM
  override ToString : unit -> string
  member Visualize : unit -> string
  member A : DV
  member F : uint32
  member Item : i:int -> D with get
  member Length : int
  member P : DV
  member PD : DV
  member T : DV
  member A : DV with set
  member F : uint32 with set
  static member Abs : a:DV -> DV
  static member Acos : a:DV -> DV
  static member AddItem : a:DV * i:int * b:D -> DV
  static member AddSubVector : a:DV * i:int * b:DV -> DV
  static member Append : a:DV * b:DV -> DV
  static member Asin : a:DV -> DV
  static member Atan : a:DV -> DV
  static member Atan2 : a:int * b:DV -> DV
  static member Atan2 : a:DV * b:int -> DV
  static member Atan2 : a:float * b:DV -> DV
  static member Atan2 : a:DV * b:float -> DV
  static member Atan2 : a:D * b:DV -> DV
  static member Atan2 : a:DV * b:D -> DV
  static member Atan2 : a:DV * b:DV -> DV
  static member Ceiling : a:DV -> DV
  static member Cos : a:DV -> DV
  static member Cosh : a:DV -> DV
  static member Exp : a:DV -> DV
  static member Floor : a:DV -> DV
  static member L1Norm : a:DV -> D
  static member L2Norm : a:DV -> D
  static member L2NormSq : a:DV -> D
  static member Log : a:DV -> DV
  static member Log10 : a:DV -> DV
  static member LogSumExp : a:DV -> D
  static member Max : a:DV -> D
  static member Max : a:D * b:DV -> DV
  static member Max : a:DV * b:D -> DV
  static member Max : a:DV * b:DV -> DV
  static member MaxIndex : a:DV -> int
  static member Mean : a:DV -> D
  static member Min : a:DV -> D
  static member Min : a:D * b:DV -> DV
  static member Min : a:DV * b:D -> DV
  static member Min : a:DV * b:DV -> DV
  static member MinIndex : a:DV -> int
  static member Normalize : a:DV -> DV
  static member OfArray : a:D [] -> DV
  static member Op_DV_D : a:DV * ff:(float [] -> float) * fd:(DV -> D) * df:(D * DV * DV -> D) * r:(DV -> TraceOp) -> D
  static member Op_DV_DM : a:DV * ff:(float [] -> float [,]) * fd:(DV -> DM) * df:(DM * DV * DV -> DM) * r:(DV -> TraceOp) -> DM
  static member Op_DV_DV : a:DV * ff:(float [] -> float []) * fd:(DV -> DV) * df:(DV * DV * DV -> DV) * r:(DV -> TraceOp) -> DV
  static member Op_DV_DV_D : a:DV * b:DV * ff:(float [] * float [] -> float) * fd:(DV * DV -> D) * df_da:(D * DV * DV -> D) * df_db:(D * DV * DV -> D) * df_dab:(D * DV * DV * DV * DV -> D) * r_d_d:(DV * DV -> TraceOp) * r_d_c:(DV * DV -> TraceOp) * r_c_d:(DV * DV -> TraceOp) -> D
  static member Op_DV_DV_DM : a:DV * b:DV * ff:(float [] * float [] -> float [,]) * fd:(DV * DV -> DM) * df_da:(DM * DV * DV -> DM) * df_db:(DM * DV * DV -> DM) * df_dab:(DM * DV * DV * DV * DV -> DM) * r_d_d:(DV * DV -> TraceOp) * r_d_c:(DV * DV -> TraceOp) * r_c_d:(DV * DV -> TraceOp) -> DM
  static member Op_DV_DV_DV : a:DV * b:DV * ff:(float [] * float [] -> float []) * fd:(DV * DV -> DV) * df_da:(DV * DV * DV -> DV) * df_db:(DV * DV * DV -> DV) * df_dab:(DV * DV * DV * DV * DV -> DV) * r_d_d:(DV * DV -> TraceOp) * r_d_c:(DV * DV -> TraceOp) * r_c_d:(DV * DV -> TraceOp) -> DV
  static member Op_DV_D_DV : a:DV * b:D * ff:(float [] * float -> float []) * fd:(DV * D -> DV) * df_da:(DV * DV * DV -> DV) * df_db:(DV * D * D -> DV) * df_dab:(DV * DV * DV * D * D -> DV) * r_d_d:(DV * D -> TraceOp) * r_d_c:(DV * D -> TraceOp) * r_c_d:(DV * D -> TraceOp) -> DV
  static member Op_D_DV_DV : a:D * b:DV * ff:(float * float [] -> float []) * fd:(D * DV -> DV) * df_da:(DV * D * D -> DV) * df_db:(DV * DV * DV -> DV) * df_dab:(DV * D * D * DV * DV -> DV) * r_d_d:(D * DV -> TraceOp) * r_d_c:(D * DV -> TraceOp) * r_c_d:(D * DV -> TraceOp) -> DV
  static member Pow : a:int * b:DV -> DV
  static member Pow : a:DV * b:int -> DV
  static member Pow : a:float * b:DV -> DV
  static member Pow : a:DV * b:float -> DV
  static member Pow : a:D * b:DV -> DV
  static member Pow : a:DV * b:D -> DV
  static member Pow : a:DV * b:DV -> DV
  static member ReLU : a:DV -> DV
  static member ReshapeToDM : m:int * a:DV -> DM
  static member Round : a:DV -> DV
  static member Sigmoid : a:DV -> DV
  static member Sign : a:DV -> DV
  static member Sin : a:DV -> DV
  static member Sinh : a:DV -> DV
  static member SoftMax : a:DV -> DV
  static member SoftPlus : a:DV -> DV
  static member SoftSign : a:DV -> DV
  static member Split : d:DV * n:seq<int> -> seq<DV>
  static member Sqrt : a:DV -> DV
  static member StandardDev : a:DV -> D
  static member Standardize : a:DV -> DV
  static member Sum : a:DV -> D
  static member Tan : a:DV -> DV
  static member Tanh : a:DV -> DV
  static member Variance : a:DV -> D
  static member ZeroN : n:int -> DV
  static member Zero : DV
  static member ( + ) : a:int * b:DV -> DV
  static member ( + ) : a:DV * b:int -> DV
  static member ( + ) : a:float * b:DV -> DV
  static member ( + ) : a:DV * b:float -> DV
  static member ( + ) : a:D * b:DV -> DV
  static member ( + ) : a:DV * b:D -> DV
  static member ( + ) : a:DV * b:DV -> DV
  static member ( &* ) : a:DV * b:DV -> DM
  static member ( / ) : a:int * b:DV -> DV
  static member ( / ) : a:DV * b:int -> DV
  static member ( / ) : a:float * b:DV -> DV
  static member ( / ) : a:DV * b:float -> DV
  static member ( / ) : a:D * b:DV -> DV
  static member ( / ) : a:DV * b:D -> DV
  static member ( ./ ) : a:DV * b:DV -> DV
  static member ( .* ) : a:DV * b:DV -> DV
  static member op_Explicit : d:float [] -> DV
  static member op_Explicit : d:DV -> float []
  static member ( * ) : a:int * b:DV -> DV
  static member ( * ) : a:DV * b:int -> DV
  static member ( * ) : a:float * b:DV -> DV
  static member ( * ) : a:DV * b:float -> DV
  static member ( * ) : a:D * b:DV -> DV
  static member ( * ) : a:DV * b:D -> DV
  static member ( * ) : a:DV * b:DV -> D
  static member ( - ) : a:int * b:DV -> DV
  static member ( - ) : a:DV * b:int -> DV
  static member ( - ) : a:float * b:DV -> DV
  static member ( - ) : a:DV * b:float -> DV
  static member ( - ) : a:D * b:DV -> DV
  static member ( - ) : a:DV * b:D -> DV
  static member ( - ) : a:DV * b:DV -> DV
  static member ( ~- ) : a:DV -> DV

Full name: DiffSharp.AD.Float64.DV
val desc : (DV -> DV)
val w : DV
val x : DV
val y : DV
System.Random.Next() : int
System.Random.Next(maxValue: int) : int
System.Random.Next(minValue: int, maxValue: int) : int
property System.Array.Length: int
val g : DV
val grad : f:('c -> D) -> x:'c -> 'c (requires member GetReverse and member get_A)

Full name: DiffSharp.AD.Float64.DiffOps.grad
val wi : DV
val l2norm : v:DV -> D

Full name: DiffSharp.AD.Float64.DV.l2norm
val kmeans : k:int -> eta:float -> epsilon:float -> data:DV [] -> DV [] []

Full name: Examples-kmeansclustering.kmeans
val k : int
val eta : float
val epsilon : float
val data : DV []
val nearestm : (DV -> seq<DV> -> int * D)
val means : seq<DV>
Multiple items
val seq : sequence:seq<'T> -> seq<'T>

Full name: Microsoft.FSharp.Core.Operators.seq

--------------------
type seq<'T> = System.Collections.Generic.IEnumerable<'T>

Full name: Microsoft.FSharp.Collections.seq<_>
module Seq

from Microsoft.FSharp.Collections
val mapi : mapping:(int -> 'T -> 'U) -> source:seq<'T> -> seq<'U>

Full name: Microsoft.FSharp.Collections.Seq.mapi
val i : int
val m : DV
val normSq : v:DV -> D

Full name: DiffSharp.AD.Float64.DV.normSq
val minBy : projection:('T -> 'U) -> source:seq<'T> -> 'T (requires comparison)

Full name: Microsoft.FSharp.Collections.Seq.minBy
val snd : tuple:('T1 * 'T2) -> 'T2

Full name: Microsoft.FSharp.Core.Operators.snd
val dist : (DV -> DV -> D)
val splitEqual : n:int -> v:DV -> seq<DV>

Full name: DiffSharp.AD.Float64.DV.splitEqual
val init : count:int -> initializer:(int -> 'T) -> seq<'T>

Full name: Microsoft.FSharp.Collections.Seq.init
val concat : v:seq<DV> -> DV

Full name: DiffSharp.AD.Float64.DV.concat
val wopt : DV
module Array

from Microsoft.FSharp.Collections
val zip : array1:'T1 [] -> array2:'T2 [] -> ('T1 * 'T2) []

Full name: Microsoft.FSharp.Collections.Array.zip
val create : count:int -> value:'T -> 'T []

Full name: Microsoft.FSharp.Collections.Array.create
val toDV : v:seq<'a> -> DV (requires member op_Explicit)

Full name: DiffSharp.AD.Float64.DOps.toDV
union case D.D: float -> D
val assign : (int * DV) []
val map : mapping:('T -> 'U) -> array:'T [] -> 'U []

Full name: Microsoft.FSharp.Collections.Array.map
val d : DV
val fst : tuple:('T1 * 'T2) -> 'T1

Full name: Microsoft.FSharp.Core.Operators.fst
val init : count:int -> initializer:(int -> 'T) -> 'T []

Full name: Microsoft.FSharp.Collections.Array.init
val filter : predicate:('T -> bool) -> array:'T [] -> 'T []

Full name: Microsoft.FSharp.Collections.Array.filter
val j : int
val data : DV []

Full name: Examples-kmeansclustering.data
val init : n:int -> f:(int -> 'a) -> DV

Full name: DiffSharp.AD.Float64.DV.init
System.Random.NextDouble() : float
val clusters : DV [] []

Full name: Examples-kmeansclustering.clusters
namespace FSharp
namespace FSharp.Charting
val plotClusters : c:DV [] [] -> ChartTypes.GenericChart

Full name: Examples-kmeansclustering.plotClusters
val c : DV [] []
Multiple items
module List

from Microsoft.FSharp.Collections

--------------------
type List<'T> =
  | ( [] )
  | ( :: ) of Head: 'T * Tail: 'T list
  interface IEnumerable
  interface IEnumerable<'T>
  member GetSlice : startIndex:int option * endIndex:int option -> 'T list
  member Head : 'T
  member IsEmpty : bool
  member Item : index:int -> 'T with get
  member Length : int
  member Tail : 'T list
  static member Cons : head:'T * tail:'T list -> 'T list
  static member Empty : 'T list

Full name: Microsoft.FSharp.Collections.List<_>
val init : length:int -> initializer:(int -> 'T) -> 'T list

Full name: Microsoft.FSharp.Collections.List.init
type Chart =
  static member Area : data:seq<#value> * ?Name:string * ?Title:string * ?Labels:#seq<string> * ?Color:Color * ?XTitle:string * ?YTitle:string -> GenericChart
  static member Area : data:seq<#key * #value> * ?Name:string * ?Title:string * ?Labels:#seq<string> * ?Color:Color * ?XTitle:string * ?YTitle:string -> GenericChart
  static member Bar : data:seq<#value> * ?Name:string * ?Title:string * ?Labels:#seq<string> * ?Color:Color * ?XTitle:string * ?YTitle:string -> GenericChart
  static member Bar : data:seq<#key * #value> * ?Name:string * ?Title:string * ?Labels:#seq<string> * ?Color:Color * ?XTitle:string * ?YTitle:string -> GenericChart
  static member BoxPlotFromData : data:seq<#key * #seq<'a2>> * ?Name:string * ?Title:string * ?Color:Color * ?XTitle:string * ?YTitle:string * ?Percentile:int * ?ShowAverage:bool * ?ShowMedian:bool * ?ShowUnusualValues:bool * ?WhiskerPercentile:int -> GenericChart (requires 'a2 :> value)
  static member BoxPlotFromStatistics : data:seq<#key * #value * #value * #value * #value * #value * #value> * ?Name:string * ?Title:string * ?Labels:#seq<string> * ?Color:Color * ?XTitle:string * ?YTitle:string * ?Percentile:int * ?ShowAverage:bool * ?ShowMedian:bool * ?ShowUnusualValues:bool * ?WhiskerPercentile:int -> GenericChart
  static member Bubble : data:seq<#value * #value> * ?Name:string * ?Title:string * ?Labels:#seq<string> * ?Color:Color * ?XTitle:string * ?YTitle:string * ?BubbleMaxSize:int * ?BubbleMinSize:int * ?BubbleScaleMax:float * ?BubbleScaleMin:float * ?UseSizeForLabel:bool -> GenericChart
  static member Bubble : data:seq<#key * #value * #value> * ?Name:string * ?Title:string * ?Labels:#seq<string> * ?Color:Color * ?XTitle:string * ?YTitle:string * ?BubbleMaxSize:int * ?BubbleMinSize:int * ?BubbleScaleMax:float * ?BubbleScaleMin:float * ?UseSizeForLabel:bool -> GenericChart
  static member Candlestick : data:seq<#value * #value * #value * #value> * ?Name:string * ?Title:string * ?Labels:#seq<string> * ?Color:Color * ?XTitle:string * ?YTitle:string -> CandlestickChart
  static member Candlestick : data:seq<#key * #value * #value * #value * #value> * ?Name:string * ?Title:string * ?Labels:#seq<string> * ?Color:Color * ?XTitle:string * ?YTitle:string -> CandlestickChart
  ...

Full name: FSharp.Charting.Chart
static member Chart.Point : data:seq<#value> * ?Name:string * ?Title:string * ?Labels:#seq<string> * ?Color:System.Drawing.Color * ?XTitle:string * ?YTitle:string * ?MarkerColor:System.Drawing.Color * ?MarkerSize:int -> ChartTypes.GenericChart
static member Chart.Point : data:seq<#key * #value> * ?Name:string * ?Title:string * ?Labels:#seq<string> * ?Color:System.Drawing.Color * ?XTitle:string * ?YTitle:string * ?MarkerColor:System.Drawing.Color * ?MarkerSize:int -> ChartTypes.GenericChart
Multiple items
val float : value:'T -> float (requires member op_Explicit)

Full name: Microsoft.FSharp.Core.Operators.float

--------------------
type float = System.Double

Full name: Microsoft.FSharp.Core.float

--------------------
type float<'Measure> = float

Full name: Microsoft.FSharp.Core.float<_>
static member Chart.Combine : charts:seq<ChartTypes.GenericChart> -> ChartTypes.GenericChart
val data2 : DV []

Full name: Examples-kmeansclustering.data2
val clusters2 : DV [] []

Full name: Examples-kmeansclustering.clusters2
namespace FSharp.Data
val iris : CsvProvider<...>

Full name: Examples-kmeansclustering.iris
type CsvProvider

Full name: FSharp.Data.CsvProvider


<summary>Typed representation of a CSV file.</summary>
       <param name='Sample'>Location of a CSV sample file or a string containing a sample CSV document.</param>
       <param name='Separators'>Column delimiter(s). Defaults to `,`.</param>
       <param name='InferRows'>Number of rows to use for inference. Defaults to `1000`. If this is zero, all rows are used.</param>
       <param name='Schema'>Optional column types, in a comma separated list. Valid types are `int`, `int64`, `bool`, `float`, `decimal`, `date`, `guid`, `string`, `int?`, `int64?`, `bool?`, `float?`, `decimal?`, `date?`, `guid?`, `int option`, `int64 option`, `bool option`, `float option`, `decimal option`, `date option`, `guid option` and `string option`.
       You can also specify a unit and the name of the column like this: `Name (type&lt;unit&gt;)`, or you can override only the name. If you don't want to specify all the columns, you can reference the columns by name like this: `ColumnName=type`.</param>
       <param name='HasHeaders'>Whether the sample contains the names of the columns as its first line.</param>
       <param name='IgnoreErrors'>Whether to ignore rows that have the wrong number of columns or which can't be parsed using the inferred or specified schema. Otherwise an exception is thrown when these rows are encountered.</param>
       <param name='SkipRows'>SKips the first n rows of the CSV file.</param>
       <param name='AssumeMissingValues'>When set to true, the type provider will assume all columns can have missing values, even if in the provided sample all values are present. Defaults to false.</param>
       <param name='PreferOptionals'>When set to true, inference will prefer to use the option type instead of nullable types, `double.NaN` or `""` for missing values. Defaults to false.</param>
       <param name='Quote'>The quotation mark (for surrounding values containing the delimiter). Defaults to `"`.</param>
       <param name='MissingValues'>The set of strings recogized as missing values. Defaults to `NaN,NA,N/A,#N/A,:,-,TBA,TBD`.</param>
       <param name='CacheRows'>Whether the rows should be caches so they can be iterated multiple times. Defaults to true. Disable for large datasets.</param>
       <param name='Culture'>The culture used for parsing numbers and dates. Defaults to the invariant culture.</param>
       <param name='Encoding'>The encoding used to read the sample. You can specify either the character set name or the codepage number. Defaults to UTF8 for files, and to ISO-8859-1 the for HTTP requests, unless `charset` is specified in the `Content-Type` response header.</param>
       <param name='ResolutionFolder'>A directory that is used when resolving relative file references (at design time and in hosted execution).</param>
       <param name='EmbeddedResource'>When specified, the type provider first attempts to load the sample from the specified resource
          (e.g. 'MyCompany.MyAssembly, resource_name.csv'). This is useful when exposing types generated by the type provider.</param>
val irisData : DV []

Full name: Examples-kmeansclustering.irisData
property Runtime.CsvFile.Rows: seq<CsvProvider<...>.Row>
val map : mapping:('T -> 'U) -> source:seq<'T> -> seq<'U>

Full name: Microsoft.FSharp.Collections.Seq.map
val r : CsvProvider<...>.Row
val toArray : source:seq<'T> -> 'T []

Full name: Microsoft.FSharp.Collections.Seq.toArray
val irisClusters : DV [] []

Full name: Examples-kmeansclustering.irisClusters