## 1. Statistics: `(import (robin statistics))`

A library of functions to compute statistical or other information about sets of data.

### 1.1. arithmetic-mean

Same as `mean`.

### 1.2. coefficient-of-variation

`coefficient-of-variation` describes the amount of spread in a dataset relative to its mean. In the following example, both lists have the same mean (3), but the spread is greater in the second.

```#|kawa:32|# (coefficient-of-variation '(1 2 3 4 5))
52.70462766947299
#|kawa:33|# (coefficient-of-variation '(-2 2 3 4 8))
120.18504251546631```

### 1.3. combinations

`combinations` returns the number of ways to take k things from n without replacement, when the order does not matter. The form is `(combinations n k)`

```#|kawa:2|# (combinations 5 2)
10
#|kawa:3|# (combinations 1000 500)
270288240945436569515614693625975275496152008446548287007392875106625428705522193898612483924502370165362606085021546104802209750050679917549894219699518475423665484263751733356162464079737887344364574161119497604571044985756287880514600994219426752366915856603136862602484428109296905863799821216320```

### 1.4. geometric-mean

The geometric mean takes the nth root of the product of the numbers, where `n` is the size of the data. This yields the central number within a geometric progression.

```#|kawa:3|# (geometric-mean '(1 2 3 4 5))
2.605171084697352
#|kawa:4|# (geometric-mean '(1 3 9 27 81))
9.000000000000002
#|kawa:5|# (geometric-mean '(1 3 9 27))
5.196152422706632```

### 1.5. harmonic-mean

The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals.

```#|kawa:6|# (harmonic-mean '(1 2 3 4 5))
300/137
#|kawa:7|# (inexact (harmonic-mean '(1 2 3 4 5)))
2.18978102189781```

### 1.6. jaccard-distance

The Jaccard Distance is simply `(1 - (jaccard-index))`, and measures how dissimilar the two sets are.

```#|kawa:46|# (jaccard-distance '(1 2 3 4 5) '(3 4 5 6 7))
4/7```

### 1.7. jaccard-index

The Jaccard Index returns a number from 0 to 1 indicating how similar two sets are. Sets here are represented as lists, and duplicates are ignored. An optional third argument provides the set-equality function (which defaults to `equal?`).

```#|kawa:40|# (jaccard-index '(1 2 3 4 5) '(3 4 5 6 7))
3/7
#|kawa:42|# (jaccard-index '(1 2 3 4 5) '(3 4 5 1 2 2 1 1))
1
#|kawa:43|# (jaccard-index '("a" "b" "c") '("A" "B" "F"))
0
#|kawa:45|# (jaccard-index '("a" "b" "c") '("A" "B" "D") string-ci=?)
1/2```
Note

Interestingly, the Jaccard Index is a distance metric satisfying, in particular, the triangle inequality.

### 1.8. mean

Computes the arithmetic `mean` of a list of numbers. This is the familiar "add up all the numbers, divide by the total" average.

```#|kawa:2|# (mean '(1 2 3 4 5))
3```

### 1.9. median

The `median` is the central number, when all the numbers are put in order (or, when there are an even number of numbers, the average of the two middle numbers).

```#|kawa:8|# (median '(5 3 1 2 4))
3```

### 1.10. mode

The `mode` is the number appearing most often. The `mode` function returns two values: a list of the most frequent number or numbers, and their number of occurrences.

```#|kawa:9|# (mode '(5 3 1 2 4))
(1 2 3 4 5) 1
#|kawa:11|# (let-values (((items count) (mode '(5 3 5 1 2 4))))
#|.....12|# (display items) (newline) (display count) (newline))
(5)
2```

### 1.11. percentile

`percentile` takes two arguments, a list of data and a `percent` value, and returns the item in the list at that percentile position along the list, or an average of the nearest two values. `percent` values must be in the range 1 to 99, inclusive. Note that a `percent` of 50 corresponds to the `median`.

```#|kawa:13|# (percentile '(1 2 3 4 5) 50)
3
#|kawa:14|# (percentile '(1 2 3 4 5) 75)
4
#|kawa:16|# (percentile '(1 2 3 4 5) 99)
5
#|kawa:19|# (percentile '(1 2 3 4 5) 85)
5
#|kawa:20|# (percentile '(1 2 3 4 5) 80)
9/2```

### 1.12. perlin-noise

`perlin-noise` is a function returning a value computed using a gradient function. Perlin Noise was invented by Ken Perlin and has been widely applied in computer graphics. The implementation here is a conversion of the Java reference implementation.

`perlin-noise` takes three numbers as input, and returns a single number (all numbers inexact).

```sash> (perlin-noise 12.3 4.2 5.7)
0.4085747160714242```

### 1.13. permutations

`permutations` returns the number of ways to take k things from n without replacement, when the order matters. The form is `(permutations n k)`

```#|kawa:4|# (permutations 5 2)
20
#|kawa:5|# (permutations 100 20)
1303995018204712451095685346159820800000```

### 1.14. population-standard-deviation

`population-standard-deviation` of a list of data:

```#|kawa:28|# (population-standard-deviation '(1 2 3 4 5))
1.4142135623730951```

Optionally, if you know the mean of your data, pass that as a second argument so the function does not have to recompute it:

```#|kawa:39|# (population-standard-deviation '(1 2 3 4 5) 3)
1.4142135623730951```

### 1.15. population-variance

`population-variance` of a list of data:

```#|kawa:29|# (population-variance '(1 2 3 4 5))
2```

Optionally, if you know the mean of your data, pass that as a second argument so the function does not have to recompute it.

### 1.16. random-normal

`random-normal` is used to return a random number normally distributed with a given mean and standard deviation.

```#|kawa:2|# (random-normal 0 10)
3017/1250
#|kawa:3|# (inexact (random-normal 0 10))
0.05948
#|kawa:4|# (inexact (random-normal 0 10))
0.48029
#|kawa:5|# (inexact (random-normal 0 10))
-3.83553
#|kawa:7|# (inexact (random-normal 20 10))
21.81053
#|kawa:8|# (inexact (random-normal 20 10))
15.54491```

### 1.17. random-pick

`random-pick` returns a random selection from a given list

```#|kawa:20|# (random-pick '(1 2 3 4 5 6))
6
#|kawa:21|# (random-pick '(1 2 3 4 5 6))
3
#|kawa:22|# (random-pick '(1 2 3 4 5 6))
2```

### 1.18. random-sample

`random-sample` returns a random sample of `n` chosen from a list of items, without replacement

```#|kawa:25|# (random-sample 3 '(1 2 3 4 5 6))
(1 4 5)
#|kawa:26|# (random-sample 3 '(1 2 3 4 5 6))
(6 5 1)
#|kawa:27|# (random-sample 3 '(1 2 3 4 5 6))
(2 3 6)```

### 1.19. random-weighted-sample

`random-weighted-sample` returns a random sample of `n` chosen from a list of items, without replacement, but with the items weighted by the given list of weights.

The following gives a heavy bias to choosing 1 and 6:

```#|kawa:28|# (random-weighted-sample 3 '(1 2 3 4 5 6) '(10 1 1 1 1 10))
(6 1 5)
#|kawa:29|# (random-weighted-sample 3 '(1 2 3 4 5 6) '(10 1 1 1 1 10))
(1 6 4)
#|kawa:30|# (random-weighted-sample 3 '(1 2 3 4 5 6) '(10 1 1 1 1 10))
(1 6 5)```

The following is a lighter bias to choosing 1 and 4:

```#|kawa:34|# (random-weighted-sample 3 '(1 2 3 4 5 6) '(2 1 1 2 1 1))
(4 6 2)
#|kawa:35|# (random-weighted-sample 3 '(1 2 3 4 5 6) '(2 1 1 2 1 1))
(5 4 3)
#|kawa:36|# (random-weighted-sample 3 '(1 2 3 4 5 6) '(2 1 1 2 1 1))
(1 3 2)```

### 1.20. sign

`sign` returns 1 for positive numbers, 0 for zero, -1 for negative numbers

```#|kawa:36|# (sign 10)
1
#|kawa:37|# (sign 0)
0
#|kawa:38|# (sign -0.7)
-1```

### 1.21. sorenson-dice-index

The Sorenson Dice Index is a measure of the similarity of two sets. Sets here are represented as lists, and duplicates are ignored. An optional third argument provides the set-equality function (which defaults to `equal?`).

```#|kawa:47|# (sorenson-dice-index '(1 2 3 4 5) '(3 4 5 6 7))
3/5
#|kawa:48|# (sorenson-dice-index '("a" "b" "c") '("A" "B" "D"))
0
#|kawa:49|# (sorenson-dice-index '("a" "b" "c") '("A" "B" "D") string-ci=?)
2/3```

### 1.22. standard-deviation

`standard-deviation` of a list of data:

```#|kawa:30|# (standard-deviation '(1 2 3 4 5))
1.5811388300841898```

Optionally, if you know the mean of your data, pass that as a second argument so the function does not have to recompute it.

### 1.23. standard-error-of-the-mean

`standard-error-of-the-mean` describes the mean of the sampling distribution.

```#|kawa:35|# (standard-error-of-the-mean '(1 2 3 4 5))
0.7071067811865476```

### 1.24. variance

`variance` of a list of data:

```#|kawa:31|# (variance '(1 2 3 4 5))
5/2```

Optionally, if you know the mean of your data, pass that as a second argument so the function does not have to recompute it.