76 Commits

Author SHA1 Message Date
morenol
62de25b2ae Handle kernel serialization (#232)
* Handle kernel serialization
* Do not use typetag in WASM
* enable tests for serialization
* Update serde feature deps

Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
Co-authored-by: Lorenzo <tunedconsulting@gmail.com>
2022-11-08 11:29:56 -05:00
morenol
7d87451333 Fixes for release (#237)
* Fixes for release
* add new test
* Remove change applied in development branch
* Only add dependency for wasm32
* Update ci.yml

Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
Co-authored-by: Lorenzo <tunedconsulting@gmail.com>
2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
265fd558e7 make work cargo build --target wasm32-unknown-unknown 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
e25e2aea2b update CHANGELOG 2022-11-08 11:29:56 -05:00
Lorenzo
2f6dd1325e update comment 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
b0dece9476 use getrandom/js 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
c507d976be Update CHANGELOG 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
fa54d5ee86 Remove unused tests flags 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
459d558d48 minor fixes to doc 2022-11-08 11:29:56 -05:00
Lorenzo
1b7dda30a2 minor fix 2022-11-08 11:29:56 -05:00
Lorenzo
c1bd1df5f6 minor fix 2022-11-08 11:29:56 -05:00
Lorenzo
cf751f05aa minor fix 2022-11-08 11:29:56 -05:00
Lorenzo
63ed89aadd minor fix 2022-11-08 11:29:56 -05:00
Lorenzo
890e9d644c minor fix 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
af0a740394 Fix std_rand feature 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
616e38c282 cleanup 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
a449fdd4ea fmt 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
669f87f812 Use getrandom as default (for no-std feature) 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
6d529b34d2 Add static analyzer to doc 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
3ec9e4f0db Exclude datasets test for wasm/wasi 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
527477dea7 minor fixes 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
5b517c5048 minor fix 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
2df0795be9 Release 0.3 2022-11-08 11:29:56 -05:00
Lorenzo
0dc97a4e9b Create DEVELOPERS.md 2022-11-08 11:29:56 -05:00
Lorenzo
6c0fd37222 Update README.md 2022-11-08 11:29:56 -05:00
Lorenzo
d8d0fb6903 Update README.md 2022-11-08 11:29:56 -05:00
morenol
8d07efd921 Use Box in SVM and remove lifetimes (#228)
* Do not change external API
Authored-by: Luis Moreno <morenol@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
morenol
ba27dd2a55 Fix CI (#227)
* Update ci.yml
Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
Lorenzo
ed9769f651 Implement CSV reader with new traits (#209) 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
b427e5d8b1 Improve options conditionals 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
fabe362755 Implement Display for NaiveBayes 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
ee6b6a53d6 cargo clippy 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
19f3a2fcc0 Fix signature of metrics tests 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
e09c4ba724 Add kernels' parameters to public interface 2022-11-08 11:29:56 -05:00
Lorenzo
6624732a65 Fix svr tests (#222) 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
1cbde3ba22 Refactor modules structure in src/svm 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
551a6e34a5 clean up svm 2022-11-08 11:29:56 -05:00
Lorenzo
c45bab491a Support Wasi as target (#216)
* Improve features
* Add wasm32-wasi as a target
* Update .github/workflows/ci.yml
Co-authored-by: morenol <22335041+morenol@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
Lorenzo
7f35dc54e4 Disambiguate distances. Implement Fastpair. (#220) 2022-11-08 11:29:56 -05:00
morenol
8f1a7dfd79 build: fix compilation without default features (#218)
* build: fix compilation with optional features
* Remove unused config from Cargo.toml
* Fix cache keys
Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
Lorenzo
712c478af6 Improve features (#215) 2022-11-08 11:29:56 -05:00
Lorenzo
4d36b7f34f Fix metrics::auc (#212)
* Fix metrics::auc
2022-11-08 11:29:56 -05:00
Lorenzo
a16927aa16 Port ensemble. Add Display to naive_bayes (#208) 2022-11-08 11:29:56 -05:00
Lorenzo
d91f4f7ce4 Update README.md 2022-11-08 11:29:56 -05:00
Lorenzo
a7fa0585eb Merge potential next release v0.4 (#187) Breaking Changes
* First draft of the new n-dimensional arrays + NB use case
* Improves default implementation of multiple Array methods
* Refactors tree methods
* Adds matrix decomposition routines
* Adds matrix decomposition methods to ndarray and nalgebra bindings
* Refactoring + linear regression now uses array2
* Ridge & Linear regression
* LBFGS optimizer & logistic regression
* LBFGS optimizer & logistic regression
* Changes linear methods, metrics and model selection methods to new n-dimensional arrays
* Switches KNN and clustering algorithms to new n-d array layer
* Refactors distance metrics
* Optimizes knn and clustering methods
* Refactors metrics module
* Switches decomposition methods to n-dimensional arrays
* Linalg refactoring - cleanup rng merge (#172)
* Remove legacy DenseMatrix and BaseMatrix implementation. Port the new Number, FloatNumber and Array implementation into module structure.
* Exclude AUC metrics. Needs reimplementation
* Improve developers walkthrough

New traits system in place at `src/numbers` and `src/linalg`
Co-authored-by: Lorenzo <tunedconsulting@gmail.com>

* Provide SupervisedEstimator with a constructor to avoid explicit dynamical box allocation in 'cross_validate' and 'cross_validate_predict' as required by the use of 'dyn' as per Rust 2021
* Implement getters to use as_ref() in src/neighbors
* Implement getters to use as_ref() in src/naive_bayes
* Implement getters to use as_ref() in src/linear
* Add Clone to src/naive_bayes
* Change signature for cross_validate and other model_selection functions to abide to use of dyn in Rust 2021
* Implement ndarray-bindings. Remove FloatNumber from implementations
* Drop nalgebra-bindings support (as decided in conf-call to go for ndarray)
* Remove benches. Benches will have their own repo at smartcore-benches
* Implement SVC
* Implement SVC serialization. Move search parameters in dedicated module
* Implement SVR. Definitely too slow
* Fix compilation issues for wasm (#202)

Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
* Fix tests (#203)

* Port linalg/traits/stats.rs
* Improve methods naming
* Improve Display for DenseMatrix

Co-authored-by: Montana Low <montanalow@users.noreply.github.com>
Co-authored-by: VolodymyrOrlov <volodymyr.orlov@gmail.com>
2022-11-08 11:29:56 -05:00
RJ Nowling
a32eb66a6a Dataset doc cleanup (#205)
* Update iris.rs

* Update mod.rs

* Update digits.rs
2022-11-08 11:29:56 -05:00
Lorenzo
f605f6e075 Update README.md 2022-11-08 11:29:56 -05:00
Lorenzo
3b1aaaadf7 Update README.md 2022-11-08 11:29:56 -05:00
Lorenzo
d015b12402 Update CONTRIBUTING.md 2022-11-08 11:29:56 -05:00
morenol
d5200074c2 fix: fix issue with iterator for svc search (#182) 2022-11-08 11:29:56 -05:00
morenol
473cdfc44d refactor: Try to follow similar pattern to other APIs (#180)
Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
morenol
ad2e6c2900 feat: expose hyper tuning module in model_selection (#179)
* feat: expose hyper tuning module in model_selection

* Move to a folder

Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
Lorenzo
9ea3133c27 Update CONTRIBUTING.md 2022-11-08 11:29:56 -05:00
Lorenzo
e4c47c7540 Add contribution guidelines (#178) 2022-11-08 11:29:56 -05:00
Montana Low
f4fd4d2239 make default params available to serde (#167)
* add seed param to search params

* make default params available to serde

* lints

* create defaults for enums

* lint
2022-11-08 11:29:56 -05:00
Montana Low
05dfffad5c add seed param to search params (#168) 2022-11-08 11:29:56 -05:00
morenol
a37b552a7d Lmm/add seeds in more algorithms (#164)
* Provide better output in flaky tests

* feat: add seed parameter to multiple algorithms

* Update changelog

Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
Montana Low
55e1158581 Complete grid search params (#166)
* grid search draft

* hyperparam search for linear estimators

* grid search for ensembles

* support grid search for more algos

* grid search for unsupervised algos

* minor cleanup
2022-11-08 11:29:56 -05:00
morenol
cfa824d7db Provide better output in flaky tests (#163) 2022-11-08 11:29:56 -05:00
morenol
bb5b437a32 feat: allocate first and then proceed to create matrix from Vec of Ro… (#159)
* feat: allocate first and then proceed to create matrix from Vec of RowVectors
2022-11-08 11:29:56 -05:00
morenol
851533dfa7 Make rand_distr optional (#161) 2022-11-08 11:29:56 -05:00
Lorenzo
0d996edafe Update LICENSE 2022-11-08 11:29:56 -05:00
morenol
f291b71f4a fix: fix compilation warnings when running only with default features (#160)
* fix: fix compilation warnings when running only with default features
Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
Tim Toebrock
2d75c2c405 Implement a generic read_csv method (#147)
* feat: Add interface to build `Matrix` from rows.
* feat: Add option to derive `RealNumber` from string.
To construct a `Matrix` from csv, and therefore from string, I need to be able to deserialize a generic `RealNumber` from string.
* feat: Implement `Matrix::read_csv`.
2022-11-08 11:29:56 -05:00
Montana Low
1f2597be74 grid search (#154)
* grid search draft
* hyperparam search for linear estimators
2022-11-08 11:29:56 -05:00
Montana Low
0f442e96c0 Handle multiclass precision/recall (#152)
* handle multiclass precision/recall
2022-11-08 11:29:56 -05:00
dependabot[bot]
44e4be23a6 Update criterion requirement from 0.3 to 0.4 (#150)
* Update criterion requirement from 0.3 to 0.4

Updates the requirements on [criterion](https://github.com/bheisler/criterion.rs) to permit the latest version.
- [Release notes](https://github.com/bheisler/criterion.rs/releases)
- [Changelog](https://github.com/bheisler/criterion.rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/bheisler/criterion.rs/compare/0.3.0...0.4.0)

---
updated-dependencies:
- dependency-name: criterion
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* fix criterion

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
Christos Katsakioris
01f753f86d Add serde for StandardScaler (#148)
* Derive `serde::Serialize` and `serde::Deserialize` for
  `StandardScaler`.
* Add relevant unit test.

Signed-off-by: Christos Katsakioris <ckatsak@gmail.com>

Signed-off-by: Christos Katsakioris <ckatsak@gmail.com>
2022-11-08 11:29:56 -05:00
Tim Toebrock
df766eaf79 Implementation of Standard scaler (#143)
* docs: Fix typo in doc for categorical transformer.
* feat: Add option to take a column from Matrix.
I created the method `Matrix::take_column` that uses the `Matrix::take`-interface to extract a single column from a matrix. I need that feature in the implementation of  `StandardScaler`.
* feat: Add `StandardScaler`.
Authored-by: titoeb <timtoebrock@googlemail.com>
2022-11-08 11:29:56 -05:00
Lorenzo
09d9205696 Add example for FastPair (#144)
* Add example

* Move to top

* Add imports to example

* Fix imports
2022-11-08 11:29:56 -05:00
Lorenzo
dc7f01db4a Implement fastpair (#142)
* initial fastpair implementation
* FastPair initial implementation
* implement fastpair
* Add random test
* Add bench for fastpair
* Refactor with constructor for FastPair
* Add serialization for PairwiseDistance
* Add fp_bench feature for fastpair bench
2022-11-08 11:29:56 -05:00
Chris McComb
eb4b49d552 Added additional doctest and fixed indices (#141) 2022-11-08 11:29:56 -05:00
morenol
98e3465e7b Fix clippy warnings (#139)
Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
ferrouille
ea39024fd2 Add SVC::decision_function (#135) 2022-11-08 11:29:56 -05:00
dependabot[bot]
4e94feb872 Update nalgebra requirement from 0.23.0 to 0.31.0 (#128)
Updates the requirements on [nalgebra](https://github.com/dimforge/nalgebra) to permit the latest version.
- [Release notes](https://github.com/dimforge/nalgebra/releases)
- [Changelog](https://github.com/dimforge/nalgebra/blob/dev/CHANGELOG.md)
- [Commits](https://github.com/dimforge/nalgebra/compare/v0.23.0...v0.31.0)

---
updated-dependencies:
- dependency-name: nalgebra
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
dependabot-preview[bot]
fa802d2d3f build(deps): update nalgebra requirement from 0.23.0 to 0.26.2 (#98)
* build(deps): update nalgebra requirement from 0.23.0 to 0.26.2

Updates the requirements on [nalgebra](https://github.com/dimforge/nalgebra) to permit the latest version.
- [Release notes](https://github.com/dimforge/nalgebra/releases)
- [Changelog](https://github.com/dimforge/nalgebra/blob/dev/CHANGELOG.md)
- [Commits](https://github.com/dimforge/nalgebra/compare/v0.23.0...v0.26.2)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>

* fix: updates for nalgebre

* test: explicitly call pow_mut from BaseVector since now it conflicts with nalgebra implementation

* Don't be strict with dependencies

Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>
Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
78 changed files with 1136 additions and 1838 deletions
-14
View File
@@ -37,8 +37,6 @@ $ rust-code-analysis-cli -p src/algorithm/neighbour/fastpair.rs --ls 22 --le 213
```
* find more information about what happens in your binary with [`twiggy`](https://rustwasm.github.io/twiggy/install.html). This need a compiled binary so create a brief `main {}` function using `smartcore` and then point `twiggy` to that file.
* Please take a look to the output of a profiler to spot most evident performance problems, see [this guide about using a profiler](http://www.codeofview.com/fix-rs/2017/01/24/how-to-optimize-rust-programs-on-linux/).
## Issue Report Process
1. Go to the project's issues.
@@ -70,15 +68,3 @@ $ rust-code-analysis-cli -p src/algorithm/neighbour/fastpair.rs --ls 22 --le 213
* **PRs on develop**: any change should be PRed first in `development`
* **testing**: everything should work and be tested as defined in the workflow. If any is failing for non-related reasons, annotate the test failure in the PR comment.
## Suggestions for debugging
1. Install `lldb` for your platform
2. Run `rust-lldb target/debug/libsmartcore.rlib` in your command-line
3. In lldb, set up some breakpoints using `b func_name` or `b src/path/to/file.rs:linenumber`
4. In lldb, run a single test with `r the_name_of_your_test`
Display variables in scope: `frame variable <name>`
Execute expression: `p <expr>`
+1 -2
View File
@@ -23,7 +23,6 @@ jobs:
]
env:
TZ: "/usr/share/zoneinfo/your/location"
RUST_BACKTRACE: "1"
steps:
- uses: actions/checkout@v3
- name: Cache .cargo and target
@@ -37,7 +36,7 @@ jobs:
- name: Install Rust toolchain
uses: actions-rs/toolchain@v1
with:
toolchain: 1.81 # 1.82 seems to break wasm32 tests https://github.com/rustwasm/wasm-bindgen/issues/4274
toolchain: stable
target: ${{ matrix.platform.target }}
profile: minimal
default: true
+1 -1
View File
@@ -41,4 +41,4 @@ jobs:
- name: Upload to codecov.io
uses: codecov/codecov-action@v2
with:
fail_ci_if_error: false
fail_ci_if_error: true
-6
View File
@@ -4,12 +4,6 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [0.4.0] - 2023-04-05
## Added
- WARNING: Breaking changes!
- `DenseMatrix` constructor now returns `Result` to avoid user instantiating inconsistent rows/cols count. Their return values need to be unwrapped with `unwrap()`, see tests
## [0.3.0] - 2022-11-09
## Added
+3 -3
View File
@@ -2,7 +2,7 @@
name = "smartcore"
description = "Machine Learning in Rust."
homepage = "https://smartcorelib.org"
version = "0.4.0"
version = "0.3.0"
authors = ["smartcore Developers"]
edition = "2021"
license = "Apache-2.0"
@@ -42,13 +42,13 @@ std_rand = ["rand/std_rng", "rand/std"]
js = ["getrandom/js"]
[target.'cfg(target_arch = "wasm32")'.dependencies]
getrandom = { version = "0.2.8", optional = true }
getrandom = { version = "*", optional = true }
[target.'cfg(all(target_arch = "wasm32", not(target_os = "wasi")))'.dev-dependencies]
wasm-bindgen-test = "0.3"
[dev-dependencies]
itertools = "0.13.0"
itertools = "*"
serde_json = "1.0"
bincode = "1.3.1"
+1 -1
View File
@@ -18,4 +18,4 @@
-----
[![CI](https://github.com/smartcorelib/smartcore/actions/workflows/ci.yml/badge.svg)](https://github.com/smartcorelib/smartcore/actions/workflows/ci.yml)
To start getting familiar with the new smartcore v0.3 API, there is now available a [**Jupyter Notebook environment repository**](https://github.com/smartcorelib/smartcore-jupyter). Please see instructions there, contributions welcome see [CONTRIBUTING](.github/CONTRIBUTING.md).
To start getting familiar with the new smartcore v0.5 API, there is now available a [**Jupyter Notebook environment repository**](https://github.com/smartcorelib/smartcore-jupyter). Please see instructions there, contributions welcome see [CONTRIBUTING](.github/CONTRIBUTING.md).
+15
View File
@@ -0,0 +1,15 @@
<?xml version="1.0" encoding="UTF-8"?>
<module type="RUST_MODULE" version="4">
<component name="NewModuleRootManager" inherit-compiler-output="true">
<exclude-output />
<content url="file://$MODULE_DIR$">
<sourceFolder url="file://$MODULE_DIR$/src" isTestSource="false" />
<sourceFolder url="file://$MODULE_DIR$/examples" isTestSource="false" />
<sourceFolder url="file://$MODULE_DIR$/tests" isTestSource="true" />
<sourceFolder url="file://$MODULE_DIR$/benches" isTestSource="true" />
<excludeFolder url="file://$MODULE_DIR$/target" />
</content>
<orderEntry type="inheritedJdk" />
<orderEntry type="sourceFolder" forTests="false" />
</component>
</module>
+3 -4
View File
@@ -40,11 +40,11 @@ impl BBDTreeNode {
impl BBDTree {
pub fn new<T: Number, M: Array2<T>>(data: &M) -> BBDTree {
let nodes: Vec<BBDTreeNode> = Vec::new();
let nodes = Vec::new();
let (n, _) = data.shape();
let index = (0..n).collect::<Vec<usize>>();
let index = (0..n).collect::<Vec<_>>();
let mut tree = BBDTree {
nodes,
@@ -343,8 +343,7 @@ mod tests {
&[4.9, 2.4, 3.3, 1.0],
&[6.6, 2.9, 4.6, 1.3],
&[5.2, 2.7, 3.9, 1.4],
])
.unwrap();
]);
let tree = BBDTree::new(&data);
+4 -4
View File
@@ -124,7 +124,7 @@ impl<T: Debug + PartialEq, D: Distance<T>> CoverTree<T, D> {
current_cover_set.push((d, &self.root));
let mut heap = HeapSelection::with_capacity(k);
heap.add(f64::MAX);
heap.add(std::f64::MAX);
let mut empty_heap = true;
if !self.identical_excluded || self.get_data_value(self.root.idx) != p {
@@ -145,7 +145,7 @@ impl<T: Debug + PartialEq, D: Distance<T>> CoverTree<T, D> {
}
let upper_bound = if empty_heap {
f64::INFINITY
std::f64::INFINITY
} else {
*heap.peek()
};
@@ -291,7 +291,7 @@ impl<T: Debug + PartialEq, D: Distance<T>> CoverTree<T, D> {
} else {
let max_dist = self.max(point_set);
let next_scale = (max_scale - 1).min(self.get_scale(max_dist));
if next_scale == i64::MIN {
if next_scale == std::i64::MIN {
let mut children: Vec<Node> = Vec::new();
let mut leaf = self.new_leaf(p);
children.push(leaf);
@@ -435,7 +435,7 @@ impl<T: Debug + PartialEq, D: Distance<T>> CoverTree<T, D> {
fn get_scale(&self, d: f64) -> i64 {
if d == 0f64 {
i64::MIN
std::i64::MIN
} else {
(self.inv_log_base * d.ln()).ceil() as i64
}
+30 -24
View File
@@ -17,7 +17,7 @@
/// &[4.6, 3.1, 1.5, 0.2],
/// &[5.0, 3.6, 1.4, 0.2],
/// &[5.4, 3.9, 1.7, 0.4],
/// ]).unwrap();
/// ]);
/// let fastpair = FastPair::new(&x);
/// let closest_pair: PairwiseDistance<f64> = fastpair.unwrap().closest_pair();
/// ```
@@ -52,8 +52,10 @@ pub struct FastPair<'a, T: RealNumber + FloatNumber, M: Array2<T>> {
}
impl<'a, T: RealNumber + FloatNumber, M: Array2<T>> FastPair<'a, T, M> {
///
/// Constructor
/// Instantiate and initialize the algorithm
/// Instantiate and inizialise the algorithm
///
pub fn new(m: &'a M) -> Result<Self, Failed> {
if m.shape().0 < 3 {
return Err(Failed::because(
@@ -72,8 +74,10 @@ impl<'a, T: RealNumber + FloatNumber, M: Array2<T>> FastPair<'a, T, M> {
Ok(init)
}
///
/// Initialise `FastPair` by passing a `Array2`.
/// Build a FastPairs data-structure from a set of (new) points.
///
fn init(&mut self) {
// basic measures
let len = self.samples.shape().0;
@@ -154,7 +158,9 @@ impl<'a, T: RealNumber + FloatNumber, M: Array2<T>> FastPair<'a, T, M> {
self.neighbours = neighbours;
}
///
/// Find closest pair by scanning list of nearest neighbors.
///
#[allow(dead_code)]
pub fn closest_pair(&self) -> PairwiseDistance<T> {
let mut a = self.neighbours[0]; // Start with first point
@@ -211,10 +217,10 @@ mod tests_fastpair {
use super::*;
use crate::linalg::basic::{arrays::Array, matrix::DenseMatrix};
///
/// Brute force algorithm, used only for comparison and testing
pub fn closest_pair_brute(
fastpair: &FastPair<'_, f64, DenseMatrix<f64>>,
) -> PairwiseDistance<f64> {
///
pub fn closest_pair_brute(fastpair: &FastPair<f64, DenseMatrix<f64>>) -> PairwiseDistance<f64> {
use itertools::Itertools;
let m = fastpair.samples.shape().0;
@@ -254,8 +260,8 @@ mod tests_fastpair {
let distances = fastpair.distances;
let neighbours = fastpair.neighbours;
assert!(!distances.is_empty());
assert!(!neighbours.is_empty());
assert!(distances.len() != 0);
assert!(neighbours.len() != 0);
assert_eq!(10, neighbours.len());
assert_eq!(10, distances.len());
@@ -265,24 +271,28 @@ mod tests_fastpair {
fn dataset_has_at_least_three_points() {
// Create a dataset which consists of only two points:
// A(0.0, 0.0) and B(1.0, 1.0).
let dataset = DenseMatrix::<f64>::from_2d_array(&[&[0.0, 0.0], &[1.0, 1.0]]).unwrap();
let dataset = DenseMatrix::<f64>::from_2d_array(&[&[0.0, 0.0], &[1.0, 1.0]]);
// We expect an error when we run `FastPair` on this dataset,
// becuase `FastPair` currently only works on a minimum of 3
// points.
let fastpair = FastPair::new(&dataset);
assert!(fastpair.is_err());
let _fastpair = FastPair::new(&dataset);
if let Err(e) = fastpair {
let expected_error =
Failed::because(FailedError::FindFailed, "min number of rows should be 3");
assert_eq!(e, expected_error)
match _fastpair {
Err(e) => {
let expected_error =
Failed::because(FailedError::FindFailed, "min number of rows should be 3");
assert_eq!(e, expected_error)
}
_ => {
assert!(false);
}
}
}
#[test]
fn one_dimensional_dataset_minimal() {
let dataset = DenseMatrix::<f64>::from_2d_array(&[&[0.0], &[2.0], &[9.0]]).unwrap();
let dataset = DenseMatrix::<f64>::from_2d_array(&[&[0.0], &[2.0], &[9.0]]);
let result = FastPair::new(&dataset);
assert!(result.is_ok());
@@ -302,8 +312,7 @@ mod tests_fastpair {
#[test]
fn one_dimensional_dataset_2() {
let dataset =
DenseMatrix::<f64>::from_2d_array(&[&[27.0], &[0.0], &[9.0], &[2.0]]).unwrap();
let dataset = DenseMatrix::<f64>::from_2d_array(&[&[27.0], &[0.0], &[9.0], &[2.0]]);
let result = FastPair::new(&dataset);
assert!(result.is_ok());
@@ -338,8 +347,7 @@ mod tests_fastpair {
&[6.9, 3.1, 4.9, 1.5],
&[5.5, 2.3, 4.0, 1.3],
&[6.5, 2.8, 4.6, 1.5],
])
.unwrap();
]);
let fastpair = FastPair::new(&x);
assert!(fastpair.is_ok());
@@ -512,8 +520,7 @@ mod tests_fastpair {
&[6.9, 3.1, 4.9, 1.5],
&[5.5, 2.3, 4.0, 1.3],
&[6.5, 2.8, 4.6, 1.5],
])
.unwrap();
]);
// compute
let fastpair = FastPair::new(&x);
assert!(fastpair.is_ok());
@@ -561,8 +568,7 @@ mod tests_fastpair {
&[6.9, 3.1, 4.9, 1.5],
&[5.5, 2.3, 4.0, 1.3],
&[6.5, 2.8, 4.6, 1.5],
])
.unwrap();
]);
// compute
let fastpair = FastPair::new(&x);
assert!(fastpair.is_ok());
@@ -576,7 +582,7 @@ mod tests_fastpair {
};
for p in dissimilarities.iter() {
if p.distance.unwrap() < min_dissimilarity.distance.unwrap() {
min_dissimilarity = *p
min_dissimilarity = p.clone()
}
}
+2 -2
View File
@@ -61,7 +61,7 @@ impl<T, D: Distance<T>> LinearKNNSearch<T, D> {
for _ in 0..k {
heap.add(KNNPoint {
distance: f64::INFINITY,
distance: std::f64::INFINITY,
index: None,
});
}
@@ -215,7 +215,7 @@ mod tests {
};
let point_inf = KNNPoint {
distance: f64::INFINITY,
distance: std::f64::INFINITY,
index: Some(3),
};
+7 -2
View File
@@ -49,15 +49,20 @@ pub mod linear_search;
/// Both, KNN classifier and regressor benefits from underlying search algorithms that helps to speed up queries.
/// `KNNAlgorithmName` maintains a list of supported search algorithms, see [KNN algorithms](../algorithm/neighbour/index.html)
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[derive(Debug, Clone, Default)]
#[derive(Debug, Clone)]
pub enum KNNAlgorithmName {
/// Heap Search algorithm, see [`LinearSearch`](../algorithm/neighbour/linear_search/index.html)
LinearSearch,
/// Cover Tree Search algorithm, see [`CoverTree`](../algorithm/neighbour/cover_tree/index.html)
#[default]
CoverTree,
}
impl Default for KNNAlgorithmName {
fn default() -> Self {
KNNAlgorithmName::CoverTree
}
}
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[derive(Debug)]
pub(crate) enum KNNAlgorithm<T: Number, D: Distance<Vec<T>>> {
+2 -2
View File
@@ -133,7 +133,7 @@ mod tests {
#[test]
fn test_add1() {
let mut heap = HeapSelection::with_capacity(3);
heap.add(f64::INFINITY);
heap.add(std::f64::INFINITY);
heap.add(-5f64);
heap.add(4f64);
heap.add(-1f64);
@@ -151,7 +151,7 @@ mod tests {
#[test]
fn test_add2() {
let mut heap = HeapSelection::with_capacity(3);
heap.add(f64::INFINITY);
heap.add(std::f64::INFINITY);
heap.add(0.0);
heap.add(8.4852);
heap.add(5.6568);
-1
View File
@@ -3,7 +3,6 @@ use num_traits::Num;
pub trait QuickArgSort {
fn quick_argsort_mut(&mut self) -> Vec<usize>;
#[allow(dead_code)]
fn quick_argsort(&self) -> Vec<usize>;
}
+6 -7
View File
@@ -18,7 +18,7 @@
//!
//! Example:
//!
//! ```ignore
//! ```
//! use smartcore::linalg::basic::matrix::DenseMatrix;
//! use smartcore::linalg::basic::arrays::Array2;
//! use smartcore::cluster::dbscan::*;
@@ -315,7 +315,8 @@ impl<TX: Number, TY: Number, X: Array2<TX>, Y: Array1<TY>, D: Distance<Vec<TX>>>
}
}
while let Some(neighbor) = neighbors.pop() {
while !neighbors.is_empty() {
let neighbor = neighbors.pop().unwrap();
let index = neighbor.0;
if y[index] == outlier {
@@ -442,8 +443,7 @@ mod tests {
&[2.2, 1.2],
&[1.8, 0.8],
&[3.0, 5.0],
])
.unwrap();
]);
let expected_labels = vec![1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0];
@@ -488,8 +488,7 @@ mod tests {
&[4.9, 2.4, 3.3, 1.0],
&[6.6, 2.9, 4.6, 1.3],
&[5.2, 2.7, 3.9, 1.4],
])
.unwrap();
]);
let dbscan = DBSCAN::fit(&x, Default::default()).unwrap();
@@ -512,6 +511,6 @@ mod tests {
.and_then(|dbscan| dbscan.predict(&x))
.unwrap();
println!("{labels:?}");
println!("{:?}", labels);
}
}
+11 -13
View File
@@ -41,7 +41,7 @@
//! &[4.9, 2.4, 3.3, 1.0],
//! &[6.6, 2.9, 4.6, 1.3],
//! &[5.2, 2.7, 3.9, 1.4],
//! ]).unwrap();
//! ]);
//!
//! let kmeans = KMeans::fit(&x, KMeansParameters::default().with_k(2)).unwrap(); // Fit to data, 2 clusters
//! let y_hat: Vec<u8> = kmeans.predict(&x).unwrap(); // use the same points for prediction
@@ -96,7 +96,7 @@ impl<TX: Number, TY: Number, X: Array2<TX>, Y: Array1<TY>> PartialEq for KMeans<
return false;
}
for j in 0..self.centroids[i].len() {
if (self.centroids[i][j] - other.centroids[i][j]).abs() > f64::EPSILON {
if (self.centroids[i][j] - other.centroids[i][j]).abs() > std::f64::EPSILON {
return false;
}
}
@@ -249,7 +249,7 @@ impl<TX: Number, TY: Number, X: Array2<TX>, Y: Array1<TY>> Predictor<X, Y>
impl<TX: Number, TY: Number, X: Array2<TX>, Y: Array1<TY>> KMeans<TX, TY, X, Y> {
/// Fit algorithm to _NxM_ matrix where _N_ is number of samples and _M_ is number of features.
/// * `data` - training instances to cluster
/// * `data` - training instances to cluster
/// * `parameters` - cluster parameters
pub fn fit(data: &X, parameters: KMeansParameters) -> Result<KMeans<TX, TY, X, Y>, Failed> {
let bbd = BBDTree::new(data);
@@ -270,7 +270,7 @@ impl<TX: Number, TY: Number, X: Array2<TX>, Y: Array1<TY>> KMeans<TX, TY, X, Y>
let (n, d) = data.shape();
let mut distortion = f64::MAX;
let mut distortion = std::f64::MAX;
let mut y = KMeans::<TX, TY, X, Y>::kmeans_plus_plus(data, parameters.k, parameters.seed);
let mut size = vec![0; parameters.k];
let mut centroids = vec![vec![0f64; d]; parameters.k];
@@ -331,7 +331,7 @@ impl<TX: Number, TY: Number, X: Array2<TX>, Y: Array1<TY>> KMeans<TX, TY, X, Y>
let mut row = vec![0f64; x.shape().1];
for i in 0..n {
let mut min_dist = f64::MAX;
let mut min_dist = std::f64::MAX;
let mut best_cluster = 0;
for j in 0..self.k {
@@ -361,7 +361,7 @@ impl<TX: Number, TY: Number, X: Array2<TX>, Y: Array1<TY>> KMeans<TX, TY, X, Y>
.cloned()
.collect();
let mut d = vec![f64::MAX; n];
let mut d = vec![std::f64::MAX; n];
let mut row = vec![TX::zero(); data.shape().1];
for j in 1..k {
@@ -424,7 +424,7 @@ mod tests {
)]
#[test]
fn invalid_k() {
let x = DenseMatrix::from_2d_array(&[&[1, 2, 3], &[4, 5, 6]]).unwrap();
let x = DenseMatrix::from_2d_array(&[&[1, 2, 3], &[4, 5, 6]]);
assert!(KMeans::<i32, i32, DenseMatrix<i32>, Vec<i32>>::fit(
&x,
@@ -492,15 +492,14 @@ mod tests {
&[4.9, 2.4, 3.3, 1.0],
&[6.6, 2.9, 4.6, 1.3],
&[5.2, 2.7, 3.9, 1.4],
])
.unwrap();
]);
let kmeans = KMeans::fit(&x, Default::default()).unwrap();
let y: Vec<usize> = kmeans.predict(&x).unwrap();
for (i, _y_i) in y.iter().enumerate() {
assert_eq!({ y[i] }, kmeans._y[i]);
for i in 0..y.len() {
assert_eq!(y[i] as usize, kmeans._y[i]);
}
}
@@ -532,8 +531,7 @@ mod tests {
&[4.9, 2.4, 3.3, 1.0],
&[6.6, 2.9, 4.6, 1.3],
&[5.2, 2.7, 3.9, 1.4],
])
.unwrap();
]);
let kmeans: KMeans<f32, f32, DenseMatrix<f32>, Vec<f32>> =
KMeans::fit(&x, Default::default()).unwrap();
+1 -1
View File
@@ -31,7 +31,7 @@ use crate::dataset::Dataset;
pub fn load_dataset() -> Dataset<f32, f32> {
let (x, y, num_samples, num_features) = match deserialize_data(std::include_bytes!("boston.xy"))
{
Err(why) => panic!("Can't deserialize boston.xy. {why}"),
Err(why) => panic!("Can't deserialize boston.xy. {}", why),
Ok((x, y, num_samples, num_features)) => (x, y, num_samples, num_features),
};
+1 -1
View File
@@ -33,7 +33,7 @@ use crate::dataset::Dataset;
pub fn load_dataset() -> Dataset<f32, u32> {
let (x, y, num_samples, num_features) =
match deserialize_data(std::include_bytes!("breast_cancer.xy")) {
Err(why) => panic!("Can't deserialize breast_cancer.xy. {why}"),
Err(why) => panic!("Can't deserialize breast_cancer.xy. {}", why),
Ok((x, y, num_samples, num_features)) => (
x,
y.into_iter().map(|x| x as u32).collect(),
+2 -2
View File
@@ -26,7 +26,7 @@ use crate::dataset::Dataset;
pub fn load_dataset() -> Dataset<f32, u32> {
let (x, y, num_samples, num_features) =
match deserialize_data(std::include_bytes!("diabetes.xy")) {
Err(why) => panic!("Can't deserialize diabetes.xy. {why}"),
Err(why) => panic!("Can't deserialize diabetes.xy. {}", why),
Ok((x, y, num_samples, num_features)) => (
x,
y.into_iter().map(|x| x as u32).collect(),
@@ -40,7 +40,7 @@ pub fn load_dataset() -> Dataset<f32, u32> {
target: y,
num_samples,
num_features,
feature_names: [
feature_names: vec![
"Age", "Sex", "BMI", "BP", "S1", "S2", "S3", "S4", "S5", "S6",
]
.iter()
+6 -4
View File
@@ -16,7 +16,7 @@ use crate::dataset::Dataset;
pub fn load_dataset() -> Dataset<f32, f32> {
let (x, y, num_samples, num_features) = match deserialize_data(std::include_bytes!("digits.xy"))
{
Err(why) => panic!("Can't deserialize digits.xy. {why}"),
Err(why) => panic!("Can't deserialize digits.xy. {}", why),
Ok((x, y, num_samples, num_features)) => (x, y, num_samples, num_features),
};
@@ -25,14 +25,16 @@ pub fn load_dataset() -> Dataset<f32, f32> {
target: y,
num_samples,
num_features,
feature_names: ["sepal length (cm)",
feature_names: vec![
"sepal length (cm)",
"sepal width (cm)",
"petal length (cm)",
"petal width (cm)"]
"petal width (cm)",
]
.iter()
.map(|s| s.to_string())
.collect(),
target_names: ["setosa", "versicolor", "virginica"]
target_names: vec!["setosa", "versicolor", "virginica"]
.iter()
.map(|s| s.to_string())
.collect(),
+3 -3
View File
@@ -22,7 +22,7 @@ use crate::dataset::Dataset;
pub fn load_dataset() -> Dataset<f32, u32> {
let (x, y, num_samples, num_features): (Vec<f32>, Vec<u32>, usize, usize) =
match deserialize_data(std::include_bytes!("iris.xy")) {
Err(why) => panic!("Can't deserialize iris.xy. {why}"),
Err(why) => panic!("Can't deserialize iris.xy. {}", why),
Ok((x, y, num_samples, num_features)) => (
x,
y.into_iter().map(|x| x as u32).collect(),
@@ -36,7 +36,7 @@ pub fn load_dataset() -> Dataset<f32, u32> {
target: y,
num_samples,
num_features,
feature_names: [
feature_names: vec![
"sepal length (cm)",
"sepal width (cm)",
"petal length (cm)",
@@ -45,7 +45,7 @@ pub fn load_dataset() -> Dataset<f32, u32> {
.iter()
.map(|s| s.to_string())
.collect(),
target_names: ["setosa", "versicolor", "virginica"]
target_names: vec!["setosa", "versicolor", "virginica"]
.iter()
.map(|s| s.to_string())
.collect(),
+1 -1
View File
@@ -78,7 +78,7 @@ pub(crate) fn serialize_data<X: Number + RealNumber, Y: RealNumber>(
.collect();
file.write_all(&y)?;
}
Err(why) => panic!("couldn't create {filename}: {why}"),
Err(why) => panic!("couldn't create {}: {}", filename, why),
}
Ok(())
}
+18 -22
View File
@@ -35,7 +35,7 @@
//! &[4.9, 2.4, 3.3, 1.0],
//! &[6.6, 2.9, 4.6, 1.3],
//! &[5.2, 2.7, 3.9, 1.4],
//! ]).unwrap();
//! ]);
//!
//! let pca = PCA::fit(&iris, PCAParameters::default().with_n_components(2)).unwrap(); // Reduce number of features to 2
//!
@@ -231,7 +231,8 @@ impl<T: Number + RealNumber, X: Array2<T> + SVDDecomposable<T> + EVDDecomposable
if parameters.n_components > n {
return Err(Failed::fit(&format!(
"Number of components, n_components should be <= number of attributes ({n})"
"Number of components, n_components should be <= number of attributes ({})",
n
)));
}
@@ -373,20 +374,21 @@ mod tests {
let parameters = PCASearchParameters {
n_components: vec![2, 4],
use_correlation_matrix: vec![true, false],
..Default::default()
};
let mut iter = parameters.into_iter();
let next = iter.next().unwrap();
assert_eq!(next.n_components, 2);
assert!(next.use_correlation_matrix);
assert_eq!(next.use_correlation_matrix, true);
let next = iter.next().unwrap();
assert_eq!(next.n_components, 4);
assert!(next.use_correlation_matrix);
assert_eq!(next.use_correlation_matrix, true);
let next = iter.next().unwrap();
assert_eq!(next.n_components, 2);
assert!(!next.use_correlation_matrix);
assert_eq!(next.use_correlation_matrix, false);
let next = iter.next().unwrap();
assert_eq!(next.n_components, 4);
assert!(!next.use_correlation_matrix);
assert_eq!(next.use_correlation_matrix, false);
assert!(iter.next().is_none());
}
@@ -443,7 +445,6 @@ mod tests {
&[2.6, 53.0, 66.0, 10.8],
&[6.8, 161.0, 60.0, 15.6],
])
.unwrap()
}
#[cfg_attr(
all(target_arch = "wasm32", not(target_os = "wasi")),
@@ -458,8 +459,7 @@ mod tests {
&[0.9952, 0.0588],
&[0.0463, 0.9769],
&[0.0752, 0.2007],
])
.unwrap();
]);
let pca = PCA::fit(&us_arrests, Default::default()).unwrap();
@@ -502,8 +502,7 @@ mod tests {
-0.974080592182491,
0.0723250196376097,
],
])
.unwrap();
]);
let expected_projection = DenseMatrix::from_2d_array(&[
&[-64.8022, -11.448, 2.4949, -2.4079],
@@ -556,8 +555,7 @@ mod tests {
&[91.5446, -22.9529, 0.402, -0.7369],
&[118.1763, 5.5076, 2.7113, -0.205],
&[10.4345, -5.9245, 3.7944, 0.5179],
])
.unwrap();
]);
let expected_eigenvalues: Vec<f64> = vec![
343544.6277001563,
@@ -574,8 +572,8 @@ mod tests {
epsilon = 1e-4
));
for (i, pca_eigenvalues_i) in pca.eigenvalues.iter().enumerate() {
assert!((pca_eigenvalues_i.abs() - expected_eigenvalues[i].abs()).abs() < 1e-8);
for i in 0..pca.eigenvalues.len() {
assert!((pca.eigenvalues[i].abs() - expected_eigenvalues[i].abs()).abs() < 1e-8);
}
let us_arrests_t = pca.transform(&us_arrests).unwrap();
@@ -620,8 +618,7 @@ mod tests {
-0.0881962972508558,
-0.0096011588898465,
],
])
.unwrap();
]);
let expected_projection = DenseMatrix::from_2d_array(&[
&[0.9856, -1.1334, 0.4443, -0.1563],
@@ -674,8 +671,7 @@ mod tests {
&[-2.1086, -1.4248, -0.1048, -0.1319],
&[-2.0797, 0.6113, 0.1389, -0.1841],
&[-0.6294, -0.321, 0.2407, 0.1667],
])
.unwrap();
]);
let expected_eigenvalues: Vec<f64> = vec![
2.480241579149493,
@@ -698,8 +694,8 @@ mod tests {
epsilon = 1e-4
));
for (i, pca_eigenvalues_i) in pca.eigenvalues.iter().enumerate() {
assert!((pca_eigenvalues_i.abs() - expected_eigenvalues[i].abs()).abs() < 1e-8);
for i in 0..pca.eigenvalues.len() {
assert!((pca.eigenvalues[i].abs() - expected_eigenvalues[i].abs()).abs() < 1e-8);
}
let us_arrests_t = pca.transform(&us_arrests).unwrap();
@@ -738,7 +734,7 @@ mod tests {
// &[4.9, 2.4, 3.3, 1.0],
// &[6.6, 2.9, 4.6, 1.3],
// &[5.2, 2.7, 3.9, 1.4],
// ]).unwrap();
// ]);
// let pca = PCA::fit(&iris, Default::default()).unwrap();
+9 -8
View File
@@ -32,7 +32,7 @@
//! &[4.9, 2.4, 3.3, 1.0],
//! &[6.6, 2.9, 4.6, 1.3],
//! &[5.2, 2.7, 3.9, 1.4],
//! ]).unwrap();
//! ]);
//!
//! let svd = SVD::fit(&iris, SVDParameters::default().
//! with_n_components(2)).unwrap(); // Reduce number of features to 2
@@ -180,7 +180,8 @@ impl<T: Number + RealNumber, X: Array2<T> + SVDDecomposable<T> + EVDDecomposable
if parameters.n_components >= p {
return Err(Failed::fit(&format!(
"Number of components, n_components should be < number of attributes ({p})"
"Number of components, n_components should be < number of attributes ({})",
p
)));
}
@@ -201,7 +202,8 @@ impl<T: Number + RealNumber, X: Array2<T> + SVDDecomposable<T> + EVDDecomposable
let (p_c, k) = self.components.shape();
if p_c != p {
return Err(Failed::transform(&format!(
"Can not transform a {n}x{p} matrix into {n}x{k} matrix, incorrect input dimentions"
"Can not transform a {}x{} matrix into {}x{} matrix, incorrect input dimentions",
n, p, n, k
)));
}
@@ -225,6 +227,7 @@ mod tests {
fn search_parameters() {
let parameters = SVDSearchParameters {
n_components: vec![10, 100],
..Default::default()
};
let mut iter = parameters.into_iter();
let next = iter.next().unwrap();
@@ -292,8 +295,7 @@ mod tests {
&[5.7, 81.0, 39.0, 9.3],
&[2.6, 53.0, 66.0, 10.8],
&[6.8, 161.0, 60.0, 15.6],
])
.unwrap();
]);
let expected = DenseMatrix::from_2d_array(&[
&[243.54655757, -18.76673788],
@@ -301,8 +303,7 @@ mod tests {
&[305.93972467, -15.39087376],
&[197.28420365, -11.66808306],
&[293.43187394, 1.91163633],
])
.unwrap();
]);
let svd = SVD::fit(&x, Default::default()).unwrap();
let x_transformed = svd.transform(&x).unwrap();
@@ -343,7 +344,7 @@ mod tests {
// &[4.9, 2.4, 3.3, 1.0],
// &[6.6, 2.9, 4.6, 1.3],
// &[5.2, 2.7, 3.9, 1.4],
// ]).unwrap();
// ]);
// let svd = SVD::fit(&iris, Default::default()).unwrap();
+5 -198
View File
@@ -33,7 +33,7 @@
//! &[4.9, 2.4, 3.3, 1.0],
//! &[6.6, 2.9, 4.6, 1.3],
//! &[5.2, 2.7, 3.9, 1.4],
//! ]).unwrap();
//! ]);
//! let y = vec![
//! 0, 0, 0, 0, 0, 0, 0, 0,
//! 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
@@ -55,9 +55,7 @@ use serde::{Deserialize, Serialize};
use crate::api::{Predictor, SupervisedEstimator};
use crate::error::{Failed, FailedError};
use crate::linalg::basic::arrays::MutArray;
use crate::linalg::basic::arrays::{Array1, Array2};
use crate::linalg::basic::matrix::DenseMatrix;
use crate::numbers::basenum::Number;
use crate::numbers::floatnum::FloatNumber;
@@ -456,12 +454,8 @@ impl<TX: FloatNumber + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY
y: &Y,
parameters: RandomForestClassifierParameters,
) -> Result<RandomForestClassifier<TX, TY, X, Y>, Failed> {
let (x_nrows, num_attributes) = x.shape();
let (_, num_attributes) = x.shape();
let y_ncols = y.shape();
if x_nrows != y_ncols {
return Err(Failed::fit("Number of rows in X should = len(y)"));
}
let mut yi: Vec<usize> = vec![0; y_ncols];
let classes = y.unique();
@@ -604,76 +598,11 @@ impl<TX: FloatNumber + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY
}
samples
}
/// Predict class probabilities for X.
///
/// The predicted class probabilities of an input sample are computed as
/// the mean predicted class probabilities of the trees in the forest.
/// The class probability of a single tree is the fraction of samples of
/// the same class in a leaf.
///
/// # Arguments
///
/// * `x` - The input samples. A matrix of shape (n_samples, n_features).
///
/// # Returns
///
/// * `Result<DenseMatrix<f64>, Failed>` - The class probabilities of the input samples.
/// The order of the classes corresponds to that in the attribute `classes_`.
/// The matrix has shape (n_samples, n_classes).
///
/// # Errors
///
/// Returns a `Failed` error if:
/// * The model has not been fitted yet.
/// * The input `x` is not compatible with the model's expected input.
/// * Any of the tree predictions fail.
///
/// # Examples
///
/// ```
/// use smartcore::ensemble::random_forest_classifier::RandomForestClassifier;
/// use smartcore::linalg::basic::matrix::DenseMatrix;
/// use smartcore::linalg::basic::arrays::Array;
///
/// let x = DenseMatrix::from_2d_array(&[
/// &[5.1, 3.5, 1.4, 0.2],
/// &[4.9, 3.0, 1.4, 0.2],
/// &[7.0, 3.2, 4.7, 1.4],
/// ]).unwrap();
/// let y = vec![0, 0, 1];
///
/// let forest = RandomForestClassifier::fit(&x, &y, Default::default()).unwrap();
/// let probas = forest.predict_proba(&x).unwrap();
///
/// assert_eq!(probas.shape(), (3, 2));
/// ```
pub fn predict_proba(&self, x: &X) -> Result<DenseMatrix<f64>, Failed> {
let (n_samples, _) = x.shape();
let n_classes = self.classes.as_ref().unwrap().len();
let mut probas = DenseMatrix::<f64>::zeros(n_samples, n_classes);
for tree in self.trees.as_ref().unwrap().iter() {
let tree_predictions: Y = tree.predict(x).unwrap();
for (i, &class_idx) in tree_predictions.iterator(0).enumerate() {
let class_ = class_idx.to_usize().unwrap();
probas.add_element_mut((i, class_), 1.0);
}
}
let n_trees: f64 = self.trees.as_ref().unwrap().len() as f64;
probas.mul_scalar_mut(1.0 / n_trees);
Ok(probas)
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::ensemble::random_forest_classifier::RandomForestClassifier;
use crate::linalg::basic::arrays::Array;
use crate::linalg::basic::matrix::DenseMatrix;
use crate::metrics::*;
@@ -727,8 +656,7 @@ mod tests {
&[4.9, 2.4, 3.3, 1.0],
&[6.6, 2.9, 4.6, 1.3],
&[5.2, 2.7, 3.9, 1.4],
])
.unwrap();
]);
let y = vec![0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1];
let classifier = RandomForestClassifier::fit(
@@ -750,30 +678,6 @@ mod tests {
assert!(accuracy(&y, &classifier.predict(&x).unwrap()) >= 0.95);
}
#[test]
fn test_random_matrix_with_wrong_rownum() {
let x_rand: DenseMatrix<f64> = DenseMatrix::<f64>::rand(21, 200);
let y: Vec<u32> = vec![0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1];
let fail = RandomForestClassifier::fit(
&x_rand,
&y,
RandomForestClassifierParameters {
criterion: SplitCriterion::Gini,
max_depth: Option::None,
min_samples_leaf: 1,
min_samples_split: 2,
n_trees: 100,
m: Option::None,
keep_samples: false,
seed: 87,
},
);
assert!(fail.is_err());
}
#[cfg_attr(
all(target_arch = "wasm32", not(target_os = "wasi")),
wasm_bindgen_test::wasm_bindgen_test
@@ -801,8 +705,7 @@ mod tests {
&[4.9, 2.4, 3.3, 1.0],
&[6.6, 2.9, 4.6, 1.3],
&[5.2, 2.7, 3.9, 1.4],
])
.unwrap();
]);
let y = vec![0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1];
let classifier = RandomForestClassifier::fit(
@@ -827,101 +730,6 @@ mod tests {
);
}
#[cfg_attr(
all(target_arch = "wasm32", not(target_os = "wasi")),
wasm_bindgen_test::wasm_bindgen_test
)]
#[test]
fn test_random_forest_predict_proba() {
use num_traits::FromPrimitive;
// Iris-like dataset (subset)
let x: DenseMatrix<f64> = DenseMatrix::from_2d_array(&[
&[5.1, 3.5, 1.4, 0.2],
&[4.9, 3.0, 1.4, 0.2],
&[4.7, 3.2, 1.3, 0.2],
&[4.6, 3.1, 1.5, 0.2],
&[5.0, 3.6, 1.4, 0.2],
&[7.0, 3.2, 4.7, 1.4],
&[6.4, 3.2, 4.5, 1.5],
&[6.9, 3.1, 4.9, 1.5],
&[5.5, 2.3, 4.0, 1.3],
&[6.5, 2.8, 4.6, 1.5],
])
.unwrap();
let y: Vec<u32> = vec![0, 0, 0, 0, 0, 1, 1, 1, 1, 1];
let forest = RandomForestClassifier::fit(&x, &y, Default::default()).unwrap();
let probas = forest.predict_proba(&x).unwrap();
// Test shape
assert_eq!(probas.shape(), (10, 2));
let (pro_n_rows, _) = probas.shape();
// Test probability sum
for i in 0..pro_n_rows {
let row_sum: f64 = probas.get_row(i).sum();
assert!(
(row_sum - 1.0).abs() < 1e-6,
"Row probabilities should sum to 1"
);
}
// Test class prediction
let predictions: Vec<u32> = (0..pro_n_rows)
.map(|i| {
if probas.get((i, 0)) > probas.get((i, 1)) {
0
} else {
1
}
})
.collect();
let acc = accuracy(&y, &predictions);
assert!(acc > 0.8, "Accuracy should be high for the training set");
// Test probability values
// These values are approximate and based on typical random forest behavior
for i in 0..(pro_n_rows / 2) {
assert!(
f64::from_f32(0.6).unwrap().lt(probas.get((i, 0))),
"Class 0 samples should have high probability for class 0"
);
assert!(
f64::from_f32(0.4).unwrap().gt(probas.get((i, 1))),
"Class 0 samples should have low probability for class 1"
);
}
for i in (pro_n_rows / 2)..pro_n_rows {
assert!(
f64::from_f32(0.6).unwrap().lt(probas.get((i, 1))),
"Class 1 samples should have high probability for class 1"
);
assert!(
f64::from_f32(0.4).unwrap().gt(probas.get((i, 0))),
"Class 1 samples should have low probability for class 0"
);
}
// Test with new data
let x_new = DenseMatrix::from_2d_array(&[
&[5.0, 3.4, 1.5, 0.2], // Should be close to class 0
&[6.3, 3.3, 4.7, 1.6], // Should be close to class 1
])
.unwrap();
let probas_new = forest.predict_proba(&x_new).unwrap();
assert_eq!(probas_new.shape(), (2, 2));
assert!(
probas_new.get((0, 0)) > probas_new.get((0, 1)),
"First sample should be predicted as class 0"
);
assert!(
probas_new.get((1, 1)) > probas_new.get((1, 0)),
"Second sample should be predicted as class 1"
);
}
#[cfg_attr(
all(target_arch = "wasm32", not(target_os = "wasi")),
wasm_bindgen_test::wasm_bindgen_test
@@ -950,8 +758,7 @@ mod tests {
&[4.9, 2.4, 3.3, 1.0],
&[6.6, 2.9, 4.6, 1.3],
&[5.2, 2.7, 3.9, 1.4],
])
.unwrap();
]);
let y = vec![0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1];
let forest = RandomForestClassifier::fit(&x, &y, Default::default()).unwrap();
+4 -37
View File
@@ -29,7 +29,7 @@
//! &[502.601, 393.1, 251.4, 125.368, 1960., 69.564],
//! &[518.173, 480.6, 257.2, 127.852, 1961., 69.331],
//! &[554.894, 400.7, 282.7, 130.081, 1962., 70.551],
//! ]).unwrap();
//! ]);
//! let y = vec![
//! 83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0, 100.0, 101.2,
//! 104.6, 108.4, 110.8, 112.6, 114.2, 115.7, 116.9
@@ -399,10 +399,6 @@ impl<TX: Number + FloatNumber + PartialOrd, TY: Number, X: Array2<TX>, Y: Array1
) -> Result<RandomForestRegressor<TX, TY, X, Y>, Failed> {
let (n_rows, num_attributes) = x.shape();
if n_rows != y.shape() {
return Err(Failed::fit("Number of rows in X should = len(y)"));
}
let mtry = parameters
.m
.unwrap_or((num_attributes as f64).sqrt().floor() as usize);
@@ -574,8 +570,7 @@ mod tests {
&[502.601, 393.1, 251.4, 125.368, 1960., 69.564],
&[518.173, 480.6, 257.2, 127.852, 1961., 69.331],
&[554.894, 400.7, 282.7, 130.081, 1962., 70.551],
])
.unwrap();
]);
let y = vec![
83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0, 100.0, 101.2, 104.6, 108.4, 110.8, 112.6,
114.2, 115.7, 116.9,
@@ -600,32 +595,6 @@ mod tests {
assert!(mean_absolute_error(&y, &y_hat) < 1.0);
}
#[test]
fn test_random_matrix_with_wrong_rownum() {
let x_rand: DenseMatrix<f64> = DenseMatrix::<f64>::rand(17, 200);
let y = vec![
83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0, 100.0, 101.2, 104.6, 108.4, 110.8, 112.6,
114.2, 115.7, 116.9,
];
let fail = RandomForestRegressor::fit(
&x_rand,
&y,
RandomForestRegressorParameters {
max_depth: Option::None,
min_samples_leaf: 1,
min_samples_split: 2,
n_trees: 1000,
m: Option::None,
keep_samples: false,
seed: 87,
},
);
assert!(fail.is_err());
}
#[cfg_attr(
all(target_arch = "wasm32", not(target_os = "wasi")),
wasm_bindgen_test::wasm_bindgen_test
@@ -649,8 +618,7 @@ mod tests {
&[502.601, 393.1, 251.4, 125.368, 1960., 69.564],
&[518.173, 480.6, 257.2, 127.852, 1961., 69.331],
&[554.894, 400.7, 282.7, 130.081, 1962., 70.551],
])
.unwrap();
]);
let y = vec![
83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0, 100.0, 101.2, 104.6, 108.4, 110.8, 112.6,
114.2, 115.7, 116.9,
@@ -704,8 +672,7 @@ mod tests {
&[502.601, 393.1, 251.4, 125.368, 1960., 69.564],
&[518.173, 480.6, 257.2, 127.852, 1961., 69.331],
&[554.894, 400.7, 282.7, 130.081, 1962., 70.551],
])
.unwrap();
]);
let y = vec![
83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0, 100.0, 101.2, 104.6, 108.4, 110.8, 112.6,
114.2, 115.7, 116.9,
+2 -21
View File
@@ -30,10 +30,8 @@ pub enum FailedError {
DecompositionFailed,
/// Can't solve for x
SolutionFailed,
/// Error in input parameters
/// Erro in input
ParametersError,
/// Invalid state error (should never happen)
InvalidStateError,
}
impl Failed {
@@ -66,22 +64,6 @@ impl Failed {
}
}
/// new instance of `FailedError::ParametersError`
pub fn input(msg: &str) -> Self {
Failed {
err: FailedError::ParametersError,
msg: msg.to_string(),
}
}
/// new instance of `FailedError::InvalidStateError`
pub fn invalid_state(msg: &str) -> Self {
Failed {
err: FailedError::InvalidStateError,
msg: msg.to_string(),
}
}
/// new instance of `err`
pub fn because(err: FailedError, msg: &str) -> Self {
Failed {
@@ -115,9 +97,8 @@ impl fmt::Display for FailedError {
FailedError::DecompositionFailed => "Decomposition failed",
FailedError::SolutionFailed => "Can't find solution",
FailedError::ParametersError => "Error in input, check parameters",
FailedError::InvalidStateError => "Invalid state, this should never happen", // useful in development phase of lib
};
write!(f, "{failed_err_str}")
write!(f, "{}", failed_err_str)
}
}
+2 -3
View File
@@ -3,8 +3,7 @@
clippy::too_many_arguments,
clippy::many_single_char_names,
clippy::unnecessary_wraps,
clippy::upper_case_acronyms,
clippy::approx_constant
clippy::upper_case_acronyms
)]
#![warn(missing_docs)]
#![warn(rustdoc::missing_doc_code_examples)]
@@ -64,7 +63,7 @@
//! &[3., 4.],
//! &[5., 6.],
//! &[7., 8.],
//! &[9., 10.]]).unwrap();
//! &[9., 10.]]);
//! // Our classes are defined as a vector
//! let y = vec![2, 2, 2, 3, 3];
//!
+194 -213
View File
File diff suppressed because it is too large Load Diff
+107 -235
View File
@@ -19,8 +19,6 @@ use crate::linalg::traits::svd::SVDDecomposable;
use crate::numbers::basenum::Number;
use crate::numbers::realnum::RealNumber;
use crate::error::Failed;
/// Dense matrix
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[derive(Debug, Clone)]
@@ -52,26 +50,26 @@ pub struct DenseMatrixMutView<'a, T: Debug + Display + Copy + Sized> {
}
impl<'a, T: Debug + Display + Copy + Sized> DenseMatrixView<'a, T> {
fn new(
m: &'a DenseMatrix<T>,
vrows: Range<usize>,
vcols: Range<usize>,
) -> Result<Self, Failed> {
if m.is_valid_view(m.shape().0, m.shape().1, &vrows, &vcols) {
Err(Failed::input(
"The specified view is outside of the matrix range",
))
fn new(m: &'a DenseMatrix<T>, rows: Range<usize>, cols: Range<usize>) -> Self {
let (start, end, stride) = if m.column_major {
(
rows.start + cols.start * m.nrows,
rows.end + (cols.end - 1) * m.nrows,
m.nrows,
)
} else {
let (start, end, stride) =
m.stride_range(m.shape().0, m.shape().1, &vrows, &vcols, m.column_major);
Ok(DenseMatrixView {
values: &m.values[start..end],
stride,
nrows: vrows.end - vrows.start,
ncols: vcols.end - vcols.start,
column_major: m.column_major,
})
(
rows.start * m.ncols + cols.start,
(rows.end - 1) * m.ncols + cols.end,
m.ncols,
)
};
DenseMatrixView {
values: &m.values[start..end],
stride,
nrows: rows.end - rows.start,
ncols: cols.end - cols.start,
column_major: m.column_major,
}
}
@@ -91,7 +89,7 @@ impl<'a, T: Debug + Display + Copy + Sized> DenseMatrixView<'a, T> {
}
}
impl<T: Debug + Display + Copy + Sized> fmt::Display for DenseMatrixView<'_, T> {
impl<'a, T: Debug + Display + Copy + Sized> fmt::Display for DenseMatrixView<'a, T> {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
writeln!(
f,
@@ -104,26 +102,26 @@ impl<T: Debug + Display + Copy + Sized> fmt::Display for DenseMatrixView<'_, T>
}
impl<'a, T: Debug + Display + Copy + Sized> DenseMatrixMutView<'a, T> {
fn new(
m: &'a mut DenseMatrix<T>,
vrows: Range<usize>,
vcols: Range<usize>,
) -> Result<Self, Failed> {
if m.is_valid_view(m.shape().0, m.shape().1, &vrows, &vcols) {
Err(Failed::input(
"The specified view is outside of the matrix range",
))
fn new(m: &'a mut DenseMatrix<T>, rows: Range<usize>, cols: Range<usize>) -> Self {
let (start, end, stride) = if m.column_major {
(
rows.start + cols.start * m.nrows,
rows.end + (cols.end - 1) * m.nrows,
m.nrows,
)
} else {
let (start, end, stride) =
m.stride_range(m.shape().0, m.shape().1, &vrows, &vcols, m.column_major);
Ok(DenseMatrixMutView {
values: &mut m.values[start..end],
stride,
nrows: vrows.end - vrows.start,
ncols: vcols.end - vcols.start,
column_major: m.column_major,
})
(
rows.start * m.ncols + cols.start,
(rows.end - 1) * m.ncols + cols.end,
m.ncols,
)
};
DenseMatrixMutView {
values: &mut m.values[start..end],
stride,
nrows: rows.end - rows.start,
ncols: cols.end - cols.start,
column_major: m.column_major,
}
}
@@ -142,7 +140,7 @@ impl<'a, T: Debug + Display + Copy + Sized> DenseMatrixMutView<'a, T> {
}
}
fn iter_mut<'b>(&'b mut self, axis: u8) -> Box<dyn Iterator<Item = &'b mut T> + 'b> {
fn iter_mut<'b>(&'b mut self, axis: u8) -> Box<dyn Iterator<Item = &mut T> + 'b> {
let column_major = self.column_major;
let stride = self.stride;
let ptr = self.values.as_mut_ptr();
@@ -169,7 +167,7 @@ impl<'a, T: Debug + Display + Copy + Sized> DenseMatrixMutView<'a, T> {
}
}
impl<T: Debug + Display + Copy + Sized> fmt::Display for DenseMatrixMutView<'_, T> {
impl<'a, T: Debug + Display + Copy + Sized> fmt::Display for DenseMatrixMutView<'a, T> {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
writeln!(
f,
@@ -184,102 +182,42 @@ impl<T: Debug + Display + Copy + Sized> fmt::Display for DenseMatrixMutView<'_,
impl<T: Debug + Display + Copy + Sized> DenseMatrix<T> {
/// Create new instance of `DenseMatrix` without copying data.
/// `values` should be in column-major order.
pub fn new(
nrows: usize,
ncols: usize,
values: Vec<T>,
column_major: bool,
) -> Result<Self, Failed> {
let data_len = values.len();
if nrows * ncols != values.len() {
Err(Failed::input(&format!(
"The specified shape: (cols: {ncols}, rows: {nrows}) does not align with data len: {data_len}"
)))
} else {
Ok(DenseMatrix {
ncols,
nrows,
values,
column_major,
})
pub fn new(nrows: usize, ncols: usize, values: Vec<T>, column_major: bool) -> Self {
DenseMatrix {
ncols,
nrows,
values,
column_major,
}
}
/// New instance of `DenseMatrix` from 2d array.
pub fn from_2d_array(values: &[&[T]]) -> Result<Self, Failed> {
pub fn from_2d_array(values: &[&[T]]) -> Self {
DenseMatrix::from_2d_vec(&values.iter().map(|row| Vec::from(*row)).collect())
}
/// New instance of `DenseMatrix` from 2d vector.
#[allow(clippy::ptr_arg)]
pub fn from_2d_vec(values: &Vec<Vec<T>>) -> Result<Self, Failed> {
if values.is_empty() || values[0].is_empty() {
Err(Failed::input(
"The 2d vec provided is empty; cannot instantiate the matrix",
))
} else {
let nrows = values.len();
let ncols = values
.first()
.unwrap_or_else(|| {
panic!("Invalid state: Cannot create 2d matrix from an empty vector")
})
.len();
let mut m_values = Vec::with_capacity(nrows * ncols);
pub fn from_2d_vec(values: &Vec<Vec<T>>) -> Self {
let nrows = values.len();
let ncols = values
.first()
.unwrap_or_else(|| panic!("Cannot create 2d matrix from an empty vector"))
.len();
let mut m_values = Vec::with_capacity(nrows * ncols);
for c in 0..ncols {
for r in values.iter().take(nrows) {
m_values.push(r[c])
}
for c in 0..ncols {
for r in values.iter().take(nrows) {
m_values.push(r[c])
}
DenseMatrix::new(nrows, ncols, m_values, true)
}
DenseMatrix::new(nrows, ncols, m_values, true)
}
/// Iterate over values of matrix
pub fn iter(&self) -> Iter<'_, T> {
self.values.iter()
}
/// Check if the size of the requested view is bounded to matrix rows/cols count
fn is_valid_view(
&self,
n_rows: usize,
n_cols: usize,
vrows: &Range<usize>,
vcols: &Range<usize>,
) -> bool {
!(vrows.end <= n_rows
&& vcols.end <= n_cols
&& vrows.start <= n_rows
&& vcols.start <= n_cols)
}
/// Compute the range of the requested view: start, end, size of the slice
fn stride_range(
&self,
n_rows: usize,
n_cols: usize,
vrows: &Range<usize>,
vcols: &Range<usize>,
column_major: bool,
) -> (usize, usize, usize) {
let (start, end, stride) = if column_major {
(
vrows.start + vcols.start * n_rows,
vrows.end + (vcols.end - 1) * n_rows,
n_rows,
)
} else {
(
vrows.start * n_cols + vcols.start,
(vrows.end - 1) * n_cols + vcols.end,
n_cols,
)
};
(start, end, stride)
}
}
impl<T: Debug + Display + Copy + Sized> fmt::Display for DenseMatrix<T> {
@@ -366,7 +304,6 @@ where
impl<T: Debug + Display + Copy + Sized> Array<T, (usize, usize)> for DenseMatrix<T> {
fn get(&self, pos: (usize, usize)) -> &T {
let (row, col) = pos;
if row >= self.nrows || col >= self.ncols {
panic!(
"Invalid index ({},{}) for {}x{} matrix",
@@ -446,15 +383,15 @@ impl<T: Debug + Display + Copy + Sized> MutArrayView2<T> for DenseMatrix<T> {}
impl<T: Debug + Display + Copy + Sized> Array2<T> for DenseMatrix<T> {
fn get_row<'a>(&'a self, row: usize) -> Box<dyn ArrayView1<T> + 'a> {
Box::new(DenseMatrixView::new(self, row..row + 1, 0..self.ncols).unwrap())
Box::new(DenseMatrixView::new(self, row..row + 1, 0..self.ncols))
}
fn get_col<'a>(&'a self, col: usize) -> Box<dyn ArrayView1<T> + 'a> {
Box::new(DenseMatrixView::new(self, 0..self.nrows, col..col + 1).unwrap())
Box::new(DenseMatrixView::new(self, 0..self.nrows, col..col + 1))
}
fn slice<'a>(&'a self, rows: Range<usize>, cols: Range<usize>) -> Box<dyn ArrayView2<T> + 'a> {
Box::new(DenseMatrixView::new(self, rows, cols).unwrap())
Box::new(DenseMatrixView::new(self, rows, cols))
}
fn slice_mut<'a>(
@@ -465,17 +402,15 @@ impl<T: Debug + Display + Copy + Sized> Array2<T> for DenseMatrix<T> {
where
Self: Sized,
{
Box::new(DenseMatrixMutView::new(self, rows, cols).unwrap())
Box::new(DenseMatrixMutView::new(self, rows, cols))
}
// private function so for now assume infalible
fn fill(nrows: usize, ncols: usize, value: T) -> Self {
DenseMatrix::new(nrows, ncols, vec![value; nrows * ncols], true).unwrap()
DenseMatrix::new(nrows, ncols, vec![value; nrows * ncols], true)
}
// private function so for now assume infalible
fn from_iterator<I: Iterator<Item = T>>(iter: I, nrows: usize, ncols: usize, axis: u8) -> Self {
DenseMatrix::new(nrows, ncols, iter.collect(), axis != 0).unwrap()
DenseMatrix::new(nrows, ncols, iter.collect(), axis != 0)
}
fn transpose(&self) -> Self {
@@ -493,12 +428,12 @@ impl<T: Number + RealNumber> EVDDecomposable<T> for DenseMatrix<T> {}
impl<T: Number + RealNumber> LUDecomposable<T> for DenseMatrix<T> {}
impl<T: Number + RealNumber> SVDDecomposable<T> for DenseMatrix<T> {}
impl<T: Debug + Display + Copy + Sized> Array<T, (usize, usize)> for DenseMatrixView<'_, T> {
impl<'a, T: Debug + Display + Copy + Sized> Array<T, (usize, usize)> for DenseMatrixView<'a, T> {
fn get(&self, pos: (usize, usize)) -> &T {
if self.column_major {
&self.values[pos.0 + pos.1 * self.stride]
&self.values[(pos.0 + pos.1 * self.stride)]
} else {
&self.values[pos.0 * self.stride + pos.1]
&self.values[(pos.0 * self.stride + pos.1)]
}
}
@@ -515,7 +450,7 @@ impl<T: Debug + Display + Copy + Sized> Array<T, (usize, usize)> for DenseMatrix
}
}
impl<T: Debug + Display + Copy + Sized> Array<T, usize> for DenseMatrixView<'_, T> {
impl<'a, T: Debug + Display + Copy + Sized> Array<T, usize> for DenseMatrixView<'a, T> {
fn get(&self, i: usize) -> &T {
if self.nrows == 1 {
if self.column_major {
@@ -553,16 +488,16 @@ impl<T: Debug + Display + Copy + Sized> Array<T, usize> for DenseMatrixView<'_,
}
}
impl<T: Debug + Display + Copy + Sized> ArrayView2<T> for DenseMatrixView<'_, T> {}
impl<'a, T: Debug + Display + Copy + Sized> ArrayView2<T> for DenseMatrixView<'a, T> {}
impl<T: Debug + Display + Copy + Sized> ArrayView1<T> for DenseMatrixView<'_, T> {}
impl<'a, T: Debug + Display + Copy + Sized> ArrayView1<T> for DenseMatrixView<'a, T> {}
impl<T: Debug + Display + Copy + Sized> Array<T, (usize, usize)> for DenseMatrixMutView<'_, T> {
impl<'a, T: Debug + Display + Copy + Sized> Array<T, (usize, usize)> for DenseMatrixMutView<'a, T> {
fn get(&self, pos: (usize, usize)) -> &T {
if self.column_major {
&self.values[pos.0 + pos.1 * self.stride]
&self.values[(pos.0 + pos.1 * self.stride)]
} else {
&self.values[pos.0 * self.stride + pos.1]
&self.values[(pos.0 * self.stride + pos.1)]
}
}
@@ -579,12 +514,14 @@ impl<T: Debug + Display + Copy + Sized> Array<T, (usize, usize)> for DenseMatrix
}
}
impl<T: Debug + Display + Copy + Sized> MutArray<T, (usize, usize)> for DenseMatrixMutView<'_, T> {
impl<'a, T: Debug + Display + Copy + Sized> MutArray<T, (usize, usize)>
for DenseMatrixMutView<'a, T>
{
fn set(&mut self, pos: (usize, usize), x: T) {
if self.column_major {
self.values[pos.0 + pos.1 * self.stride] = x;
self.values[(pos.0 + pos.1 * self.stride)] = x;
} else {
self.values[pos.0 * self.stride + pos.1] = x;
self.values[(pos.0 * self.stride + pos.1)] = x;
}
}
@@ -593,89 +530,29 @@ impl<T: Debug + Display + Copy + Sized> MutArray<T, (usize, usize)> for DenseMat
}
}
impl<T: Debug + Display + Copy + Sized> MutArrayView2<T> for DenseMatrixMutView<'_, T> {}
impl<'a, T: Debug + Display + Copy + Sized> MutArrayView2<T> for DenseMatrixMutView<'a, T> {}
impl<T: Debug + Display + Copy + Sized> ArrayView2<T> for DenseMatrixMutView<'_, T> {}
impl<'a, T: Debug + Display + Copy + Sized> ArrayView2<T> for DenseMatrixMutView<'a, T> {}
impl<T: RealNumber> MatrixStats<T> for DenseMatrix<T> {}
impl<T: RealNumber> MatrixPreprocessing<T> for DenseMatrix<T> {}
#[cfg(test)]
#[warn(clippy::reversed_empty_ranges)]
mod tests {
use super::*;
use approx::relative_eq;
#[test]
fn test_instantiate_from_2d() {
let x = DenseMatrix::from_2d_array(&[&[1., 2., 3.], &[4., 5., 6.], &[7., 8., 9.]]);
assert!(x.is_ok());
}
#[test]
fn test_instantiate_from_2d_empty() {
let input: &[&[f64]] = &[&[]];
let x = DenseMatrix::from_2d_array(input);
assert!(x.is_err());
}
#[test]
fn test_instantiate_from_2d_empty2() {
let input: &[&[f64]] = &[&[], &[]];
let x = DenseMatrix::from_2d_array(input);
assert!(x.is_err());
}
#[test]
fn test_instantiate_ok_view1() {
let x = DenseMatrix::from_2d_array(&[&[1., 2., 3.], &[4., 5., 6.], &[7., 8., 9.]]).unwrap();
let v = DenseMatrixView::new(&x, 0..2, 0..2);
assert!(v.is_ok());
}
#[test]
fn test_instantiate_ok_view2() {
let x = DenseMatrix::from_2d_array(&[&[1., 2., 3.], &[4., 5., 6.], &[7., 8., 9.]]).unwrap();
let v = DenseMatrixView::new(&x, 0..3, 0..3);
assert!(v.is_ok());
}
#[test]
fn test_instantiate_ok_view3() {
let x = DenseMatrix::from_2d_array(&[&[1., 2., 3.], &[4., 5., 6.], &[7., 8., 9.]]).unwrap();
let v = DenseMatrixView::new(&x, 2..3, 0..3);
assert!(v.is_ok());
}
#[test]
fn test_instantiate_ok_view4() {
let x = DenseMatrix::from_2d_array(&[&[1., 2., 3.], &[4., 5., 6.], &[7., 8., 9.]]).unwrap();
let v = DenseMatrixView::new(&x, 3..3, 0..3);
assert!(v.is_ok());
}
#[test]
fn test_instantiate_err_view1() {
let x = DenseMatrix::from_2d_array(&[&[1., 2., 3.], &[4., 5., 6.], &[7., 8., 9.]]).unwrap();
let v = DenseMatrixView::new(&x, 3..4, 0..3);
assert!(v.is_err());
}
#[test]
fn test_instantiate_err_view2() {
let x = DenseMatrix::from_2d_array(&[&[1., 2., 3.], &[4., 5., 6.], &[7., 8., 9.]]).unwrap();
let v = DenseMatrixView::new(&x, 0..3, 3..4);
assert!(v.is_err());
}
#[test]
fn test_instantiate_err_view3() {
let x = DenseMatrix::from_2d_array(&[&[1., 2., 3.], &[4., 5., 6.], &[7., 8., 9.]]).unwrap();
let v = DenseMatrixView::new(&x, 0..3, 4..3);
assert!(v.is_err());
}
#[test]
fn test_display() {
let x = DenseMatrix::from_2d_array(&[&[1., 2., 3.], &[4., 5., 6.], &[7., 8., 9.]]).unwrap();
let x = DenseMatrix::from_2d_array(&[&[1., 2., 3.], &[4., 5., 6.], &[7., 8., 9.]]);
println!("{}", &x);
}
#[test]
fn test_get_row_col() {
let x = DenseMatrix::from_2d_array(&[&[1., 2., 3.], &[4., 5., 6.], &[7., 8., 9.]]).unwrap();
let x = DenseMatrix::from_2d_array(&[&[1., 2., 3.], &[4., 5., 6.], &[7., 8., 9.]]);
assert_eq!(15.0, x.get_col(1).sum());
assert_eq!(15.0, x.get_row(1).sum());
@@ -684,7 +561,7 @@ mod tests {
#[test]
fn test_row_major() {
let mut x = DenseMatrix::new(2, 3, vec![1, 2, 3, 4, 5, 6], false).unwrap();
let mut x = DenseMatrix::new(2, 3, vec![1, 2, 3, 4, 5, 6], false);
assert_eq!(5, *x.get_col(1).get(1));
assert_eq!(7, x.get_col(1).sum());
@@ -698,22 +575,21 @@ mod tests {
#[test]
fn test_get_slice() {
let x = DenseMatrix::from_2d_array(&[&[1, 2, 3], &[4, 5, 6], &[7, 8, 9], &[10, 11, 12]])
.unwrap();
let x = DenseMatrix::from_2d_array(&[&[1, 2, 3], &[4, 5, 6], &[7, 8, 9], &[10, 11, 12]]);
assert_eq!(
vec![4, 5, 6],
DenseMatrix::from_slice(&(*x.slice(1..2, 0..3))).values
);
let second_row: Vec<i32> = x.slice(1..2, 0..3).iterator(0).copied().collect();
let second_row: Vec<i32> = x.slice(1..2, 0..3).iterator(0).map(|x| *x).collect();
assert_eq!(vec![4, 5, 6], second_row);
let second_col: Vec<i32> = x.slice(0..3, 1..2).iterator(0).copied().collect();
let second_col: Vec<i32> = x.slice(0..3, 1..2).iterator(0).map(|x| *x).collect();
assert_eq!(vec![2, 5, 8], second_col);
}
#[test]
fn test_iter_mut() {
let mut x = DenseMatrix::from_2d_array(&[&[1, 2, 3], &[4, 5, 6], &[7, 8, 9]]).unwrap();
let mut x = DenseMatrix::from_2d_array(&[&[1, 2, 3], &[4, 5, 6], &[7, 8, 9]]);
assert_eq!(vec![1, 4, 7, 2, 5, 8, 3, 6, 9], x.values);
// add +2 to some elements
@@ -749,8 +625,7 @@ mod tests {
#[test]
fn test_str_array() {
let mut x =
DenseMatrix::from_2d_array(&[&["1", "2", "3"], &["4", "5", "6"], &["7", "8", "9"]])
.unwrap();
DenseMatrix::from_2d_array(&[&["1", "2", "3"], &["4", "5", "6"], &["7", "8", "9"]]);
assert_eq!(vec!["1", "4", "7", "2", "5", "8", "3", "6", "9"], x.values);
x.iterator_mut(0).for_each(|v| *v = "str");
@@ -762,20 +637,20 @@ mod tests {
#[test]
fn test_transpose() {
let x = DenseMatrix::<&str>::from_2d_array(&[&["1", "2", "3"], &["4", "5", "6"]]).unwrap();
let x = DenseMatrix::<&str>::from_2d_array(&[&["1", "2", "3"], &["4", "5", "6"]]);
assert_eq!(vec!["1", "4", "2", "5", "3", "6"], x.values);
assert!(x.column_major);
assert!(x.column_major == true);
// transpose
let x = x.transpose();
assert_eq!(vec!["1", "4", "2", "5", "3", "6"], x.values);
assert!(!x.column_major); // should change column_major
assert!(x.column_major == false); // should change column_major
}
#[test]
fn test_from_iterator() {
let data = [1, 2, 3, 4, 5, 6];
let data = vec![1, 2, 3, 4, 5, 6];
let m = DenseMatrix::from_iterator(data.iter(), 2, 3, 0);
@@ -784,25 +659,25 @@ mod tests {
vec![1, 2, 3, 4, 5, 6],
m.values.iter().map(|e| **e).collect::<Vec<i32>>()
);
assert!(!m.column_major);
assert!(m.column_major == false);
}
#[test]
fn test_take() {
let a = DenseMatrix::from_2d_array(&[&[1, 2, 3], &[4, 5, 6]]).unwrap();
let b = DenseMatrix::from_2d_array(&[&[1, 2], &[3, 4], &[5, 6]]).unwrap();
let a = DenseMatrix::from_2d_array(&[&[1, 2, 3], &[4, 5, 6]]);
let b = DenseMatrix::from_2d_array(&[&[1, 2], &[3, 4], &[5, 6]]);
println!("{a}");
println!("{}", a);
// take column 0 and 2
assert_eq!(vec![1, 3, 4, 6], a.take(&[0, 2], 1).values);
println!("{b}");
println!("{}", b);
// take rows 0 and 2
assert_eq!(vec![1, 2, 5, 6], b.take(&[0, 2], 0).values);
}
#[test]
fn test_mut() {
let a = DenseMatrix::from_2d_array(&[&[1.3, -2.1, 3.4], &[-4., -5.3, 6.1]]).unwrap();
let a = DenseMatrix::from_2d_array(&[&[1.3, -2.1, 3.4], &[-4., -5.3, 6.1]]);
let a = a.abs();
assert_eq!(vec![1.3, 4.0, 2.1, 5.3, 3.4, 6.1], a.values);
@@ -813,29 +688,26 @@ mod tests {
#[test]
fn test_reshape() {
let a = DenseMatrix::from_2d_array(&[&[1, 2, 3], &[4, 5, 6], &[7, 8, 9], &[10, 11, 12]])
.unwrap();
let a = DenseMatrix::from_2d_array(&[&[1, 2, 3], &[4, 5, 6], &[7, 8, 9], &[10, 11, 12]]);
let a = a.reshape(2, 6, 0);
assert_eq!(vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], a.values);
assert!(a.ncols == 6 && a.nrows == 2 && !a.column_major);
assert!(a.ncols == 6 && a.nrows == 2 && a.column_major == false);
let a = a.reshape(3, 4, 1);
assert_eq!(vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], a.values);
assert!(a.ncols == 4 && a.nrows == 3 && a.column_major);
assert!(a.ncols == 4 && a.nrows == 3 && a.column_major == true);
}
#[test]
fn test_eq() {
let a = DenseMatrix::from_2d_array(&[&[1., 2., 3.], &[4., 5., 6.]]).unwrap();
let b = DenseMatrix::from_2d_array(&[&[1., 2., 3.], &[4., 5., 6.], &[7., 8., 9.]]).unwrap();
let a = DenseMatrix::from_2d_array(&[&[1., 2., 3.], &[4., 5., 6.]]);
let b = DenseMatrix::from_2d_array(&[&[1., 2., 3.], &[4., 5., 6.], &[7., 8., 9.]]);
let c = DenseMatrix::from_2d_array(&[
&[1. + f32::EPSILON, 2., 3.],
&[4., 5., 6. + f32::EPSILON],
])
.unwrap();
let d = DenseMatrix::from_2d_array(&[&[1. + 0.5, 2., 3.], &[4., 5., 6. + f32::EPSILON]])
.unwrap();
]);
let d = DenseMatrix::from_2d_array(&[&[1. + 0.5, 2., 3.], &[4., 5., 6. + f32::EPSILON]]);
assert!(!relative_eq!(a, b));
assert!(!relative_eq!(a, d));
+10 -31
View File
@@ -15,25 +15,6 @@ pub struct VecView<'a, T: Debug + Display + Copy + Sized> {
ptr: &'a [T],
}
impl<T: Debug + Display + Copy + Sized> Array<T, usize> for &[T] {
fn get(&self, i: usize) -> &T {
&self[i]
}
fn shape(&self) -> usize {
self.len()
}
fn is_empty(&self) -> bool {
self.len() > 0
}
fn iterator<'b>(&'b self, axis: u8) -> Box<dyn Iterator<Item = &'b T> + 'b> {
assert!(axis == 0, "For one dimensional array `axis` should == 0");
Box::new(self.iter())
}
}
impl<T: Debug + Display + Copy + Sized> Array<T, usize> for Vec<T> {
fn get(&self, i: usize) -> &T {
&self[i]
@@ -55,7 +36,6 @@ impl<T: Debug + Display + Copy + Sized> Array<T, usize> for Vec<T> {
impl<T: Debug + Display + Copy + Sized> MutArray<T, usize> for Vec<T> {
fn set(&mut self, i: usize, x: T) {
// NOTE: this panics in case of out of bounds index
self[i] = x
}
@@ -66,7 +46,6 @@ impl<T: Debug + Display + Copy + Sized> MutArray<T, usize> for Vec<T> {
}
impl<T: Debug + Display + Copy + Sized> ArrayView1<T> for Vec<T> {}
impl<T: Debug + Display + Copy + Sized> ArrayView1<T> for &[T] {}
impl<T: Debug + Display + Copy + Sized> MutArrayView1<T> for Vec<T> {}
@@ -119,7 +98,7 @@ impl<T: Debug + Display + Copy + Sized> Array1<T> for Vec<T> {
}
}
impl<T: Debug + Display + Copy + Sized> Array<T, usize> for VecMutView<'_, T> {
impl<'a, T: Debug + Display + Copy + Sized> Array<T, usize> for VecMutView<'a, T> {
fn get(&self, i: usize) -> &T {
&self.ptr[i]
}
@@ -138,7 +117,7 @@ impl<T: Debug + Display + Copy + Sized> Array<T, usize> for VecMutView<'_, T> {
}
}
impl<T: Debug + Display + Copy + Sized> MutArray<T, usize> for VecMutView<'_, T> {
impl<'a, T: Debug + Display + Copy + Sized> MutArray<T, usize> for VecMutView<'a, T> {
fn set(&mut self, i: usize, x: T) {
self.ptr[i] = x;
}
@@ -149,10 +128,10 @@ impl<T: Debug + Display + Copy + Sized> MutArray<T, usize> for VecMutView<'_, T>
}
}
impl<T: Debug + Display + Copy + Sized> ArrayView1<T> for VecMutView<'_, T> {}
impl<T: Debug + Display + Copy + Sized> MutArrayView1<T> for VecMutView<'_, T> {}
impl<'a, T: Debug + Display + Copy + Sized> ArrayView1<T> for VecMutView<'a, T> {}
impl<'a, T: Debug + Display + Copy + Sized> MutArrayView1<T> for VecMutView<'a, T> {}
impl<T: Debug + Display + Copy + Sized> Array<T, usize> for VecView<'_, T> {
impl<'a, T: Debug + Display + Copy + Sized> Array<T, usize> for VecView<'a, T> {
fn get(&self, i: usize) -> &T {
&self.ptr[i]
}
@@ -171,7 +150,7 @@ impl<T: Debug + Display + Copy + Sized> Array<T, usize> for VecView<'_, T> {
}
}
impl<T: Debug + Display + Copy + Sized> ArrayView1<T> for VecView<'_, T> {}
impl<'a, T: Debug + Display + Copy + Sized> ArrayView1<T> for VecView<'a, T> {}
#[cfg(test)]
mod tests {
@@ -181,8 +160,8 @@ mod tests {
fn dot_product<T: Number, V: Array1<T>>(v: &V) -> T {
let vv = V::zeros(10);
let v_s = vv.slice(0..3);
v_s.dot(v)
let dot = v_s.dot(v);
dot
}
fn vector_ops<T: Number + PartialOrd, V: Array1<T>>(_: &V) -> T {
@@ -212,7 +191,7 @@ mod tests {
#[test]
fn test_len() {
let x = [1, 2, 3];
let x = vec![1, 2, 3];
assert_eq!(3, x.len());
}
@@ -237,7 +216,7 @@ mod tests {
#[test]
fn test_mut_iterator() {
let mut x = vec![1, 2, 3];
x.iterator_mut(0).for_each(|v| *v *= 2);
x.iterator_mut(0).for_each(|v| *v = *v * 2);
assert_eq!(vec![2, 4, 6], x);
}
+16 -12
View File
@@ -68,7 +68,7 @@ impl<T: Debug + Display + Copy + Sized> ArrayView2<T> for ArrayBase<OwnedRepr<T>
impl<T: Debug + Display + Copy + Sized> MutArrayView2<T> for ArrayBase<OwnedRepr<T>, Ix2> {}
impl<T: Debug + Display + Copy + Sized> BaseArray<T, (usize, usize)> for ArrayView<'_, T, Ix2> {
impl<'a, T: Debug + Display + Copy + Sized> BaseArray<T, (usize, usize)> for ArrayView<'a, T, Ix2> {
fn get(&self, pos: (usize, usize)) -> &T {
&self[[pos.0, pos.1]]
}
@@ -144,9 +144,11 @@ impl<T: Number + RealNumber> EVDDecomposable<T> for ArrayBase<OwnedRepr<T>, Ix2>
impl<T: Number + RealNumber> LUDecomposable<T> for ArrayBase<OwnedRepr<T>, Ix2> {}
impl<T: Number + RealNumber> SVDDecomposable<T> for ArrayBase<OwnedRepr<T>, Ix2> {}
impl<T: Debug + Display + Copy + Sized> ArrayView2<T> for ArrayView<'_, T, Ix2> {}
impl<'a, T: Debug + Display + Copy + Sized> ArrayView2<T> for ArrayView<'a, T, Ix2> {}
impl<T: Debug + Display + Copy + Sized> BaseArray<T, (usize, usize)> for ArrayViewMut<'_, T, Ix2> {
impl<'a, T: Debug + Display + Copy + Sized> BaseArray<T, (usize, usize)>
for ArrayViewMut<'a, T, Ix2>
{
fn get(&self, pos: (usize, usize)) -> &T {
&self[[pos.0, pos.1]]
}
@@ -173,7 +175,9 @@ impl<T: Debug + Display + Copy + Sized> BaseArray<T, (usize, usize)> for ArrayVi
}
}
impl<T: Debug + Display + Copy + Sized> MutArray<T, (usize, usize)> for ArrayViewMut<'_, T, Ix2> {
impl<'a, T: Debug + Display + Copy + Sized> MutArray<T, (usize, usize)>
for ArrayViewMut<'a, T, Ix2>
{
fn set(&mut self, pos: (usize, usize), x: T) {
self[[pos.0, pos.1]] = x
}
@@ -191,9 +195,9 @@ impl<T: Debug + Display + Copy + Sized> MutArray<T, (usize, usize)> for ArrayVie
}
}
impl<T: Debug + Display + Copy + Sized> MutArrayView2<T> for ArrayViewMut<'_, T, Ix2> {}
impl<'a, T: Debug + Display + Copy + Sized> MutArrayView2<T> for ArrayViewMut<'a, T, Ix2> {}
impl<T: Debug + Display + Copy + Sized> ArrayView2<T> for ArrayViewMut<'_, T, Ix2> {}
impl<'a, T: Debug + Display + Copy + Sized> ArrayView2<T> for ArrayViewMut<'a, T, Ix2> {}
#[cfg(test)]
mod tests {
@@ -213,7 +217,7 @@ mod tests {
fn test_iterator() {
let a = arr2(&[[1, 2, 3], [4, 5, 6]]);
let v: Vec<i32> = a.iterator(0).copied().collect();
let v: Vec<i32> = a.iterator(0).map(|&v| v).collect();
assert_eq!(v, vec!(1, 2, 3, 4, 5, 6));
}
@@ -232,7 +236,7 @@ mod tests {
let x = arr2(&[[1, 2, 3], [4, 5, 6]]);
let x_slice = Array2::slice(&x, 0..2, 1..2);
assert_eq!((2, 1), x_slice.shape());
let v: Vec<i32> = x_slice.iterator(0).copied().collect();
let v: Vec<i32> = x_slice.iterator(0).map(|&v| v).collect();
assert_eq!(v, [2, 5]);
}
@@ -241,11 +245,11 @@ mod tests {
let x = arr2(&[[1, 2, 3], [4, 5, 6]]);
let x_slice = Array2::slice(&x, 0..2, 0..3);
assert_eq!(
x_slice.iterator(0).copied().collect::<Vec<i32>>(),
x_slice.iterator(0).map(|&v| v).collect::<Vec<i32>>(),
vec![1, 2, 3, 4, 5, 6]
);
assert_eq!(
x_slice.iterator(1).copied().collect::<Vec<i32>>(),
x_slice.iterator(1).map(|&v| v).collect::<Vec<i32>>(),
vec![1, 4, 2, 5, 3, 6]
);
}
@@ -275,8 +279,8 @@ mod tests {
fn test_c_from_iterator() {
let data = vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12];
let a: NDArray2<i32> = Array2::from_iterator(data.clone().into_iter(), 4, 3, 0);
println!("{a}");
println!("{}", a);
let a: NDArray2<i32> = Array2::from_iterator(data.into_iter(), 4, 3, 1);
println!("{a}");
println!("{}", a);
}
}
+7 -7
View File
@@ -41,7 +41,7 @@ impl<T: Debug + Display + Copy + Sized> ArrayView1<T> for ArrayBase<OwnedRepr<T>
impl<T: Debug + Display + Copy + Sized> MutArrayView1<T> for ArrayBase<OwnedRepr<T>, Ix1> {}
impl<T: Debug + Display + Copy + Sized> BaseArray<T, usize> for ArrayView<'_, T, Ix1> {
impl<'a, T: Debug + Display + Copy + Sized> BaseArray<T, usize> for ArrayView<'a, T, Ix1> {
fn get(&self, i: usize) -> &T {
&self[i]
}
@@ -60,9 +60,9 @@ impl<T: Debug + Display + Copy + Sized> BaseArray<T, usize> for ArrayView<'_, T,
}
}
impl<T: Debug + Display + Copy + Sized> ArrayView1<T> for ArrayView<'_, T, Ix1> {}
impl<'a, T: Debug + Display + Copy + Sized> ArrayView1<T> for ArrayView<'a, T, Ix1> {}
impl<T: Debug + Display + Copy + Sized> BaseArray<T, usize> for ArrayViewMut<'_, T, Ix1> {
impl<'a, T: Debug + Display + Copy + Sized> BaseArray<T, usize> for ArrayViewMut<'a, T, Ix1> {
fn get(&self, i: usize) -> &T {
&self[i]
}
@@ -81,7 +81,7 @@ impl<T: Debug + Display + Copy + Sized> BaseArray<T, usize> for ArrayViewMut<'_,
}
}
impl<T: Debug + Display + Copy + Sized> MutArray<T, usize> for ArrayViewMut<'_, T, Ix1> {
impl<'a, T: Debug + Display + Copy + Sized> MutArray<T, usize> for ArrayViewMut<'a, T, Ix1> {
fn set(&mut self, i: usize, x: T) {
self[i] = x;
}
@@ -92,8 +92,8 @@ impl<T: Debug + Display + Copy + Sized> MutArray<T, usize> for ArrayViewMut<'_,
}
}
impl<T: Debug + Display + Copy + Sized> ArrayView1<T> for ArrayViewMut<'_, T, Ix1> {}
impl<T: Debug + Display + Copy + Sized> MutArrayView1<T> for ArrayViewMut<'_, T, Ix1> {}
impl<'a, T: Debug + Display + Copy + Sized> ArrayView1<T> for ArrayViewMut<'a, T, Ix1> {}
impl<'a, T: Debug + Display + Copy + Sized> MutArrayView1<T> for ArrayViewMut<'a, T, Ix1> {}
impl<T: Debug + Display + Copy + Sized> Array1<T> for ArrayBase<OwnedRepr<T>, Ix1> {
fn slice<'a>(&'a self, range: Range<usize>) -> Box<dyn ArrayView1<T> + 'a> {
@@ -152,7 +152,7 @@ mod tests {
fn test_iterator() {
let a = arr1(&[1, 2, 3]);
let v: Vec<i32> = a.iterator(0).copied().collect();
let v: Vec<i32> = a.iterator(0).map(|&v| v).collect();
assert_eq!(v, vec!(1, 2, 3));
}
+7 -11
View File
@@ -15,7 +15,7 @@
//! &[25., 15., -5.],
//! &[15., 18., 0.],
//! &[-5., 0., 11.]
//! ]).unwrap();
//! ]);
//!
//! let cholesky = A.cholesky().unwrap();
//! let lower_triangular: DenseMatrix<f64> = cholesky.L();
@@ -175,14 +175,11 @@ mod tests {
)]
#[test]
fn cholesky_decompose() {
let a = DenseMatrix::from_2d_array(&[&[25., 15., -5.], &[15., 18., 0.], &[-5., 0., 11.]])
.unwrap();
let a = DenseMatrix::from_2d_array(&[&[25., 15., -5.], &[15., 18., 0.], &[-5., 0., 11.]]);
let l =
DenseMatrix::from_2d_array(&[&[5.0, 0.0, 0.0], &[3.0, 3.0, 0.0], &[-1.0, 1.0, 3.0]])
.unwrap();
DenseMatrix::from_2d_array(&[&[5.0, 0.0, 0.0], &[3.0, 3.0, 0.0], &[-1.0, 1.0, 3.0]]);
let u =
DenseMatrix::from_2d_array(&[&[5.0, 3.0, -1.0], &[0.0, 3.0, 1.0], &[0.0, 0.0, 3.0]])
.unwrap();
DenseMatrix::from_2d_array(&[&[5.0, 3.0, -1.0], &[0.0, 3.0, 1.0], &[0.0, 0.0, 3.0]]);
let cholesky = a.cholesky().unwrap();
assert!(relative_eq!(cholesky.L().abs(), l.abs(), epsilon = 1e-4));
@@ -200,10 +197,9 @@ mod tests {
)]
#[test]
fn cholesky_solve_mut() {
let a = DenseMatrix::from_2d_array(&[&[25., 15., -5.], &[15., 18., 0.], &[-5., 0., 11.]])
.unwrap();
let b = DenseMatrix::from_2d_array(&[&[40., 51., 28.]]).unwrap();
let expected = DenseMatrix::from_2d_array(&[&[1.0, 2.0, 3.0]]).unwrap();
let a = DenseMatrix::from_2d_array(&[&[25., 15., -5.], &[15., 18., 0.], &[-5., 0., 11.]]);
let b = DenseMatrix::from_2d_array(&[&[40., 51., 28.]]);
let expected = DenseMatrix::from_2d_array(&[&[1.0, 2.0, 3.0]]);
let cholesky = a.cholesky().unwrap();
+22 -24
View File
@@ -19,7 +19,7 @@
//! &[0.9000, 0.4000, 0.7000],
//! &[0.4000, 0.5000, 0.3000],
//! &[0.7000, 0.3000, 0.8000],
//! ]).unwrap();
//! ]);
//!
//! let evd = A.evd(true).unwrap();
//! let eigenvectors: DenseMatrix<f64> = evd.V;
@@ -66,7 +66,7 @@ pub trait EVDDecomposable<T: Number + RealNumber>: Array2<T> {
fn evd_mut(mut self, symmetric: bool) -> Result<EVD<T, Self>, Failed> {
let (nrows, ncols) = self.shape();
if ncols != nrows {
panic!("Matrix is not square: {nrows} x {ncols}");
panic!("Matrix is not square: {} x {}", nrows, ncols);
}
let n = nrows;
@@ -820,8 +820,7 @@ mod tests {
&[0.9000, 0.4000, 0.7000],
&[0.4000, 0.5000, 0.3000],
&[0.7000, 0.3000, 0.8000],
])
.unwrap();
]);
let eigen_values: Vec<f64> = vec![1.7498382, 0.3165784, 0.1335834];
@@ -829,8 +828,7 @@ mod tests {
&[0.6881997, -0.07121225, 0.7220180],
&[0.3700456, 0.89044952, -0.2648886],
&[0.6240573, -0.44947578, -0.6391588],
])
.unwrap();
]);
let evd = A.evd(true).unwrap();
@@ -839,9 +837,11 @@ mod tests {
evd.V.abs(),
epsilon = 1e-4
));
for (i, eigen_values_i) in eigen_values.iter().enumerate() {
assert!((eigen_values_i - evd.d[i]).abs() < 1e-4);
assert!((0f64 - evd.e[i]).abs() < f64::EPSILON);
for i in 0..eigen_values.len() {
assert!((eigen_values[i] - evd.d[i]).abs() < 1e-4);
}
for i in 0..eigen_values.len() {
assert!((0f64 - evd.e[i]).abs() < std::f64::EPSILON);
}
}
#[cfg_attr(
@@ -854,8 +854,7 @@ mod tests {
&[0.9000, 0.4000, 0.7000],
&[0.4000, 0.5000, 0.3000],
&[0.8000, 0.3000, 0.8000],
])
.unwrap();
]);
let eigen_values: Vec<f64> = vec![1.79171122, 0.31908143, 0.08920735];
@@ -863,8 +862,7 @@ mod tests {
&[0.7178958, 0.05322098, 0.6812010],
&[0.3837711, -0.84702111, -0.1494582],
&[0.6952105, 0.43984484, -0.7036135],
])
.unwrap();
]);
let evd = A.evd(false).unwrap();
@@ -873,9 +871,11 @@ mod tests {
evd.V.abs(),
epsilon = 1e-4
));
for (i, eigen_values_i) in eigen_values.iter().enumerate() {
assert!((eigen_values_i - evd.d[i]).abs() < 1e-4);
assert!((0f64 - evd.e[i]).abs() < f64::EPSILON);
for i in 0..eigen_values.len() {
assert!((eigen_values[i] - evd.d[i]).abs() < 1e-4);
}
for i in 0..eigen_values.len() {
assert!((0f64 - evd.e[i]).abs() < std::f64::EPSILON);
}
}
#[cfg_attr(
@@ -889,8 +889,7 @@ mod tests {
&[4.0, -1.0, 1.0, 1.0],
&[1.0, 1.0, 3.0, -2.0],
&[1.0, 1.0, 4.0, -1.0],
])
.unwrap();
]);
let eigen_values_d: Vec<f64> = vec![0.0, 2.0, 2.0, 0.0];
let eigen_values_e: Vec<f64> = vec![2.2361, 0.9999, -0.9999, -2.2361];
@@ -900,8 +899,7 @@ mod tests {
&[-0.6707, 0.1059, 0.901, 0.6289],
&[0.9159, -0.1378, 0.3816, 0.0806],
&[0.6707, 0.1059, 0.901, -0.6289],
])
.unwrap();
]);
let evd = A.evd(false).unwrap();
@@ -910,11 +908,11 @@ mod tests {
evd.V.abs(),
epsilon = 1e-4
));
for (i, eigen_values_d_i) in eigen_values_d.iter().enumerate() {
assert!((eigen_values_d_i - evd.d[i]).abs() < 1e-4);
for i in 0..eigen_values_d.len() {
assert!((eigen_values_d[i] - evd.d[i]).abs() < 1e-4);
}
for (i, eigen_values_e_i) in eigen_values_e.iter().enumerate() {
assert!((eigen_values_e_i - evd.e[i]).abs() < 1e-4);
for i in 0..eigen_values_e.len() {
assert!((eigen_values_e[i] - evd.e[i]).abs() < 1e-4);
}
}
}
+3 -3
View File
@@ -12,9 +12,9 @@ pub trait HighOrderOperations<T: Number>: Array2<T> {
/// use smartcore::linalg::traits::high_order::HighOrderOperations;
/// use smartcore::linalg::basic::arrays::Array2;
///
/// let a = DenseMatrix::from_2d_array(&[&[1., 2.], &[3., 4.], &[5., 6.]]).unwrap();
/// let b = DenseMatrix::from_2d_array(&[&[5., 6.], &[7., 8.], &[9., 10.]]).unwrap();
/// let expected = DenseMatrix::from_2d_array(&[&[71., 80.], &[92., 104.]]).unwrap();
/// let a = DenseMatrix::from_2d_array(&[&[1., 2.], &[3., 4.], &[5., 6.]]);
/// let b = DenseMatrix::from_2d_array(&[&[5., 6.], &[7., 8.], &[9., 10.]]);
/// let expected = DenseMatrix::from_2d_array(&[&[71., 80.], &[92., 104.]]);
///
/// assert_eq!(a.ab(true, &b, false), expected);
/// ```
+12 -10
View File
@@ -18,7 +18,7 @@
//! &[1., 2., 3.],
//! &[0., 1., 5.],
//! &[5., 6., 0.]
//! ]).unwrap();
//! ]);
//!
//! let lu = A.lu().unwrap();
//! let lower: DenseMatrix<f64> = lu.L();
@@ -126,7 +126,7 @@ impl<T: Number + RealNumber, M: Array2<T>> LU<T, M> {
let (m, n) = self.LU.shape();
if m != n {
panic!("Matrix is not square: {m}x{n}");
panic!("Matrix is not square: {}x{}", m, n);
}
let mut inv = M::zeros(n, n);
@@ -143,7 +143,10 @@ impl<T: Number + RealNumber, M: Array2<T>> LU<T, M> {
let (b_m, b_n) = b.shape();
if b_m != m {
panic!("Row dimensions do not agree: A is {m} x {n}, but B is {b_m} x {b_n}");
panic!(
"Row dimensions do not agree: A is {} x {}, but B is {} x {}",
m, n, b_m, b_n
);
}
if self.singular {
@@ -263,13 +266,13 @@ mod tests {
)]
#[test]
fn decompose() {
let a = DenseMatrix::from_2d_array(&[&[1., 2., 3.], &[0., 1., 5.], &[5., 6., 0.]]).unwrap();
let a = DenseMatrix::from_2d_array(&[&[1., 2., 3.], &[0., 1., 5.], &[5., 6., 0.]]);
let expected_L =
DenseMatrix::from_2d_array(&[&[1., 0., 0.], &[0., 1., 0.], &[0.2, 0.8, 1.]]).unwrap();
DenseMatrix::from_2d_array(&[&[1., 0., 0.], &[0., 1., 0.], &[0.2, 0.8, 1.]]);
let expected_U =
DenseMatrix::from_2d_array(&[&[5., 6., 0.], &[0., 1., 5.], &[0., 0., -1.]]).unwrap();
DenseMatrix::from_2d_array(&[&[5., 6., 0.], &[0., 1., 5.], &[0., 0., -1.]]);
let expected_pivot =
DenseMatrix::from_2d_array(&[&[0., 0., 1.], &[0., 1., 0.], &[1., 0., 0.]]).unwrap();
DenseMatrix::from_2d_array(&[&[0., 0., 1.], &[0., 1., 0.], &[1., 0., 0.]]);
let lu = a.lu().unwrap();
assert!(relative_eq!(lu.L(), expected_L, epsilon = 1e-4));
assert!(relative_eq!(lu.U(), expected_U, epsilon = 1e-4));
@@ -281,10 +284,9 @@ mod tests {
)]
#[test]
fn inverse() {
let a = DenseMatrix::from_2d_array(&[&[1., 2., 3.], &[0., 1., 5.], &[5., 6., 0.]]).unwrap();
let a = DenseMatrix::from_2d_array(&[&[1., 2., 3.], &[0., 1., 5.], &[5., 6., 0.]]);
let expected =
DenseMatrix::from_2d_array(&[&[-6.0, 3.6, 1.4], &[5.0, -3.0, -1.0], &[-1.0, 0.8, 0.2]])
.unwrap();
DenseMatrix::from_2d_array(&[&[-6.0, 3.6, 1.4], &[5.0, -3.0, -1.0], &[-1.0, 0.8, 0.2]]);
let a_inv = a.lu().and_then(|lu| lu.inverse()).unwrap();
assert!(relative_eq!(a_inv, expected, epsilon = 1e-4));
}
+11 -13
View File
@@ -13,7 +13,7 @@
//! &[0.9, 0.4, 0.7],
//! &[0.4, 0.5, 0.3],
//! &[0.7, 0.3, 0.8]
//! ]).unwrap();
//! ]);
//!
//! let qr = A.qr().unwrap();
//! let orthogonal: DenseMatrix<f64> = qr.Q();
@@ -102,7 +102,10 @@ impl<T: Number + RealNumber, M: Array2<T>> QR<T, M> {
let (b_nrows, b_ncols) = b.shape();
if b_nrows != m {
panic!("Row dimensions do not agree: A is {m} x {n}, but B is {b_nrows} x {b_ncols}");
panic!(
"Row dimensions do not agree: A is {} x {}, but B is {} x {}",
m, n, b_nrows, b_ncols
);
}
if self.singular {
@@ -201,20 +204,17 @@ mod tests {
)]
#[test]
fn decompose() {
let a = DenseMatrix::from_2d_array(&[&[0.9, 0.4, 0.7], &[0.4, 0.5, 0.3], &[0.7, 0.3, 0.8]])
.unwrap();
let a = DenseMatrix::from_2d_array(&[&[0.9, 0.4, 0.7], &[0.4, 0.5, 0.3], &[0.7, 0.3, 0.8]]);
let q = DenseMatrix::from_2d_array(&[
&[-0.7448, 0.2436, 0.6212],
&[-0.331, -0.9432, -0.027],
&[-0.5793, 0.2257, -0.7832],
])
.unwrap();
]);
let r = DenseMatrix::from_2d_array(&[
&[-1.2083, -0.6373, -1.0842],
&[0.0, -0.3064, 0.0682],
&[0.0, 0.0, -0.1999],
])
.unwrap();
]);
let qr = a.qr().unwrap();
assert!(relative_eq!(qr.Q().abs(), q.abs(), epsilon = 1e-4));
assert!(relative_eq!(qr.R().abs(), r.abs(), epsilon = 1e-4));
@@ -226,15 +226,13 @@ mod tests {
)]
#[test]
fn qr_solve_mut() {
let a = DenseMatrix::from_2d_array(&[&[0.9, 0.4, 0.7], &[0.4, 0.5, 0.3], &[0.7, 0.3, 0.8]])
.unwrap();
let b = DenseMatrix::from_2d_array(&[&[0.5, 0.2], &[0.5, 0.8], &[0.5, 0.3]]).unwrap();
let a = DenseMatrix::from_2d_array(&[&[0.9, 0.4, 0.7], &[0.4, 0.5, 0.3], &[0.7, 0.3, 0.8]]);
let b = DenseMatrix::from_2d_array(&[&[0.5, 0.2], &[0.5, 0.8], &[0.5, 0.3]]);
let expected_w = DenseMatrix::from_2d_array(&[
&[-0.2027027, -1.2837838],
&[0.8783784, 2.2297297],
&[0.4729730, 0.6621622],
])
.unwrap();
]);
let w = a.qr_solve_mut(b).unwrap();
assert!(relative_eq!(w, expected_w, epsilon = 1e-2));
}
+15 -18
View File
@@ -136,12 +136,13 @@ pub trait MatrixPreprocessing<T: RealNumber>: MutArrayView2<T> + Clone {
/// ```rust
/// use smartcore::linalg::basic::matrix::DenseMatrix;
/// use smartcore::linalg::traits::stats::MatrixPreprocessing;
/// let mut a = DenseMatrix::from_2d_array(&[&[0., 2., 3.], &[-5., -6., -7.]]).unwrap();
/// let expected = DenseMatrix::from_2d_array(&[&[0., 1., 1.],&[0., 0., 0.]]).unwrap();
/// let mut a = DenseMatrix::from_2d_array(&[&[0., 2., 3.], &[-5., -6., -7.]]);
/// let expected = DenseMatrix::from_2d_array(&[&[0., 1., 1.],&[0., 0., 0.]]);
/// a.binarize_mut(0.);
///
/// assert_eq!(a, expected);
/// ```
fn binarize_mut(&mut self, threshold: T) {
let (nrows, ncols) = self.shape();
for row in 0..nrows {
@@ -158,8 +159,8 @@ pub trait MatrixPreprocessing<T: RealNumber>: MutArrayView2<T> + Clone {
/// ```rust
/// use smartcore::linalg::basic::matrix::DenseMatrix;
/// use smartcore::linalg::traits::stats::MatrixPreprocessing;
/// let a = DenseMatrix::from_2d_array(&[&[0., 2., 3.], &[-5., -6., -7.]]).unwrap();
/// let expected = DenseMatrix::from_2d_array(&[&[0., 1., 1.],&[0., 0., 0.]]).unwrap();
/// let a = DenseMatrix::from_2d_array(&[&[0., 2., 3.], &[-5., -6., -7.]]);
/// let expected = DenseMatrix::from_2d_array(&[&[0., 1., 1.],&[0., 0., 0.]]);
///
/// assert_eq!(a.binarize(0.), expected);
/// ```
@@ -185,8 +186,7 @@ mod tests {
&[1., 2., 3., 1., 2.],
&[4., 5., 6., 3., 4.],
&[7., 8., 9., 5., 6.],
])
.unwrap();
]);
let expected_0 = vec![4., 5., 6., 3., 4.];
let expected_1 = vec![1.8, 4.4, 7.];
@@ -196,7 +196,7 @@ mod tests {
#[test]
fn test_var() {
let m = DenseMatrix::from_2d_array(&[&[1., 2., 3., 4.], &[5., 6., 7., 8.]]).unwrap();
let m = DenseMatrix::from_2d_array(&[&[1., 2., 3., 4.], &[5., 6., 7., 8.]]);
let expected_0 = vec![4., 4., 4., 4.];
let expected_1 = vec![1.25, 1.25];
@@ -211,13 +211,12 @@ mod tests {
let m = DenseMatrix::from_2d_array(&[
&[0.0, 0.25, 0.25, 1.25, 1.5, 1.75, 2.75, 3.25],
&[0.0, 0.25, 0.25, 1.25, 1.5, 1.75, 2.75, 3.25],
])
.unwrap();
]);
let expected_0 = vec![0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0];
let expected_1 = vec![1.25, 1.25];
assert!(m.var(0).approximate_eq(&expected_0, f64::EPSILON));
assert!(m.var(1).approximate_eq(&expected_1, f64::EPSILON));
assert!(m.var(0).approximate_eq(&expected_0, std::f64::EPSILON));
assert!(m.var(1).approximate_eq(&expected_1, std::f64::EPSILON));
assert_eq!(
m.mean(0),
vec![0.0, 0.25, 0.25, 1.25, 1.5, 1.75, 2.75, 3.25]
@@ -231,8 +230,7 @@ mod tests {
&[1., 2., 3., 1., 2.],
&[4., 5., 6., 3., 4.],
&[7., 8., 9., 5., 6.],
])
.unwrap();
]);
let expected_0 = vec![
2.449489742783178,
2.449489742783178,
@@ -253,10 +251,10 @@ mod tests {
#[test]
fn test_scale() {
let m: DenseMatrix<f64> =
DenseMatrix::from_2d_array(&[&[1., 2., 3., 4.], &[5., 6., 7., 8.]]).unwrap();
DenseMatrix::from_2d_array(&[&[1., 2., 3., 4.], &[5., 6., 7., 8.]]);
let expected_0: DenseMatrix<f64> =
DenseMatrix::from_2d_array(&[&[-1., -1., -1., -1.], &[1., 1., 1., 1.]]).unwrap();
DenseMatrix::from_2d_array(&[&[-1., -1., -1., -1.], &[1., 1., 1., 1.]]);
let expected_1: DenseMatrix<f64> = DenseMatrix::from_2d_array(&[
&[
-1.3416407864998738,
@@ -270,8 +268,7 @@ mod tests {
0.4472135954999579,
1.3416407864998738,
],
])
.unwrap();
]);
assert_eq!(m.mean(0), vec![3.0, 4.0, 5.0, 6.0]);
assert_eq!(m.mean(1), vec![2.5, 6.5]);
@@ -289,7 +286,7 @@ mod tests {
}
{
let mut m = m;
let mut m = m.clone();
m.standard_scale_mut(&m.mean(1), &m.std(1), 1);
assert_eq!(&m, &expected_1);
}
+18 -24
View File
@@ -17,7 +17,7 @@
//! &[0.9, 0.4, 0.7],
//! &[0.4, 0.5, 0.3],
//! &[0.7, 0.3, 0.8]
//! ]).unwrap();
//! ]);
//!
//! let svd = A.svd().unwrap();
//! let u: DenseMatrix<f64> = svd.U;
@@ -48,9 +48,11 @@ pub struct SVD<T: Number + RealNumber, M: SVDDecomposable<T>> {
pub V: M,
/// Singular values of the original matrix
pub s: Vec<T>,
///
m: usize,
///
n: usize,
/// Tolerance
///
tol: T,
}
@@ -487,8 +489,7 @@ mod tests {
&[0.9000, 0.4000, 0.7000],
&[0.4000, 0.5000, 0.3000],
&[0.7000, 0.3000, 0.8000],
])
.unwrap();
]);
let s: Vec<f64> = vec![1.7498382, 0.3165784, 0.1335834];
@@ -496,22 +497,20 @@ mod tests {
&[0.6881997, -0.07121225, 0.7220180],
&[0.3700456, 0.89044952, -0.2648886],
&[0.6240573, -0.44947578, -0.639158],
])
.unwrap();
]);
let V = DenseMatrix::from_2d_array(&[
&[0.6881997, -0.07121225, 0.7220180],
&[0.3700456, 0.89044952, -0.2648886],
&[0.6240573, -0.44947578, -0.6391588],
])
.unwrap();
]);
let svd = A.svd().unwrap();
assert!(relative_eq!(V.abs(), svd.V.abs(), epsilon = 1e-4));
assert!(relative_eq!(U.abs(), svd.U.abs(), epsilon = 1e-4));
for (i, s_i) in s.iter().enumerate() {
assert!((s_i - svd.s[i]).abs() < 1e-4);
for i in 0..s.len() {
assert!((s[i] - svd.s[i]).abs() < 1e-4);
}
}
#[cfg_attr(
@@ -578,8 +577,7 @@ mod tests {
-0.2158704,
-0.27529472,
],
])
.unwrap();
]);
let s: Vec<f64> = vec![
3.8589375, 3.4396766, 2.6487176, 2.2317399, 1.5165054, 0.8109055, 0.2706515,
@@ -649,8 +647,7 @@ mod tests {
0.73034065,
-0.43965505,
],
])
.unwrap();
]);
let V = DenseMatrix::from_2d_array(&[
&[
@@ -710,15 +707,14 @@ mod tests {
0.1654796,
-0.32346758,
],
])
.unwrap();
]);
let svd = A.svd().unwrap();
assert!(relative_eq!(V.abs(), svd.V.abs(), epsilon = 1e-4));
assert!(relative_eq!(U.abs(), svd.U.abs(), epsilon = 1e-4));
for (i, s_i) in s.iter().enumerate() {
assert!((s_i - svd.s[i]).abs() < 1e-4);
for i in 0..s.len() {
assert!((s[i] - svd.s[i]).abs() < 1e-4);
}
}
#[cfg_attr(
@@ -727,11 +723,10 @@ mod tests {
)]
#[test]
fn solve() {
let a = DenseMatrix::from_2d_array(&[&[0.9, 0.4, 0.7], &[0.4, 0.5, 0.3], &[0.7, 0.3, 0.8]])
.unwrap();
let b = DenseMatrix::from_2d_array(&[&[0.5, 0.2], &[0.5, 0.8], &[0.5, 0.3]]).unwrap();
let a = DenseMatrix::from_2d_array(&[&[0.9, 0.4, 0.7], &[0.4, 0.5, 0.3], &[0.7, 0.3, 0.8]]);
let b = DenseMatrix::from_2d_array(&[&[0.5, 0.2], &[0.5, 0.8], &[0.5, 0.3]]);
let expected_w =
DenseMatrix::from_2d_array(&[&[-0.20, -1.28], &[0.87, 2.22], &[0.47, 0.66]]).unwrap();
DenseMatrix::from_2d_array(&[&[-0.20, -1.28], &[0.87, 2.22], &[0.47, 0.66]]);
let w = a.svd_solve_mut(b).unwrap();
assert!(relative_eq!(w, expected_w, epsilon = 1e-2));
}
@@ -742,8 +737,7 @@ mod tests {
)]
#[test]
fn decompose_restore() {
let a =
DenseMatrix::from_2d_array(&[&[1.0, 2.0, 3.0, 4.0], &[5.0, 6.0, 7.0, 8.0]]).unwrap();
let a = DenseMatrix::from_2d_array(&[&[1.0, 2.0, 3.0, 4.0], &[5.0, 6.0, 7.0, 8.0]]);
let svd = a.svd().unwrap();
let u: &DenseMatrix<f32> = &svd.U; //U
let v: &DenseMatrix<f32> = &svd.V; // V
+7 -9
View File
@@ -12,8 +12,7 @@
//! pub struct BGSolver {}
//! impl<'a, T: FloatNumber, X: Array2<T>> BiconjugateGradientSolver<'a, T, X> for BGSolver {}
//!
//! let a = DenseMatrix::from_2d_array(&[&[25., 15., -5.], &[15., 18., 0.], &[-5., 0.,
//! 11.]]).unwrap();
//! let a = DenseMatrix::from_2d_array(&[&[25., 15., -5.], &[15., 18., 0.], &[-5., 0., 11.]]);
//! let b = vec![40., 51., 28.];
//! let expected = vec![1.0, 2.0, 3.0];
//! let mut x = Vec::zeros(3);
@@ -27,9 +26,9 @@ use crate::error::Failed;
use crate::linalg::basic::arrays::{Array, Array1, Array2, ArrayView1, MutArrayView1};
use crate::numbers::floatnum::FloatNumber;
/// Trait for Biconjugate Gradient Solver
///
pub trait BiconjugateGradientSolver<'a, T: FloatNumber, X: Array2<T>> {
/// Solve Ax = b
///
fn solve_mut(
&self,
a: &'a X,
@@ -109,7 +108,7 @@ pub trait BiconjugateGradientSolver<'a, T: FloatNumber, X: Array2<T>> {
Ok(err)
}
/// solve preconditioner
///
fn solve_preconditioner(&self, a: &'a X, b: &[T], x: &mut [T]) {
let diag = Self::diag(a);
let n = diag.len();
@@ -133,7 +132,7 @@ pub trait BiconjugateGradientSolver<'a, T: FloatNumber, X: Array2<T>> {
y.copy_from(&x.xa(true, a));
}
/// Extract the diagonal from a matrix
///
fn diag(a: &X) -> Vec<T> {
let (nrows, ncols) = a.shape();
let n = nrows.min(ncols);
@@ -159,10 +158,9 @@ mod tests {
#[test]
fn bg_solver() {
let a = DenseMatrix::from_2d_array(&[&[25., 15., -5.], &[15., 18., 0.], &[-5., 0., 11.]])
.unwrap();
let a = DenseMatrix::from_2d_array(&[&[25., 15., -5.], &[15., 18., 0.], &[-5., 0., 11.]]);
let b = vec![40., 51., 28.];
let expected = [1.0, 2.0, 3.0];
let expected = vec![1.0, 2.0, 3.0];
let mut x = Vec::zeros(3);
+8 -7
View File
@@ -38,7 +38,7 @@
//! &[502.601, 393.1, 251.4, 125.368, 1960., 69.564],
//! &[518.173, 480.6, 257.2, 127.852, 1961., 69.331],
//! &[554.894, 400.7, 282.7, 130.081, 1962., 70.551],
//! ]).unwrap();
//! ]);
//!
//! let y: Vec<f64> = vec![83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0,
//! 100.0, 101.2, 104.6, 108.4, 110.8, 112.6, 114.2, 115.7, 116.9];
@@ -425,7 +425,10 @@ impl<TX: FloatNumber + RealNumber, TY: Number, X: Array2<TX>, Y: Array1<TY>>
for (i, col_std_i) in col_std.iter().enumerate() {
if (*col_std_i - TX::zero()).abs() < TX::epsilon() {
return Err(Failed::fit(&format!("Cannot rescale constant column {i}")));
return Err(Failed::fit(&format!(
"Cannot rescale constant column {}",
i
)));
}
}
@@ -511,8 +514,7 @@ mod tests {
&[502.601, 393.1, 251.4, 125.368, 1960., 69.564],
&[518.173, 480.6, 257.2, 127.852, 1961., 69.331],
&[554.894, 400.7, 282.7, 130.081, 1962., 70.551],
])
.unwrap();
]);
let y: Vec<f64> = vec![
83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0, 100.0, 101.2, 104.6, 108.4, 110.8, 112.6,
@@ -563,8 +565,7 @@ mod tests {
&[17.0, 1918.0, 1.4054969025700674],
&[18.0, 1929.0, 1.3271699396384906],
&[19.0, 1915.0, 1.1373332337674806],
])
.unwrap();
]);
let y: Vec<f64> = vec![
1.48, 2.72, 4.52, 5.72, 5.25, 4.07, 3.75, 4.75, 6.77, 4.72, 6.78, 6.79, 8.3, 7.42,
@@ -629,7 +630,7 @@ mod tests {
// &[502.601, 393.1, 251.4, 125.368, 1960., 69.564],
// &[518.173, 480.6, 257.2, 127.852, 1961., 69.331],
// &[554.894, 400.7, 282.7, 130.081, 1962., 70.551],
// ]).unwrap();
// ]);
// let y = vec![
// 83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0, 100.0, 101.2, 104.6, 108.4, 110.8, 112.6,
+5 -3
View File
@@ -356,7 +356,10 @@ impl<TX: FloatNumber + RealNumber, TY: Number, X: Array2<TX>, Y: Array1<TY>> Las
for (i, col_std_i) in col_std.iter().enumerate() {
if (*col_std_i - TX::zero()).abs() < TX::epsilon() {
return Err(Failed::fit(&format!("Cannot rescale constant column {i}")));
return Err(Failed::fit(&format!(
"Cannot rescale constant column {}",
i
)));
}
}
@@ -418,8 +421,7 @@ mod tests {
&[502.601, 393.1, 251.4, 125.368, 1960., 69.564],
&[518.173, 480.6, 257.2, 127.852, 1961., 69.331],
&[554.894, 400.7, 282.7, 130.081, 1962., 70.551],
])
.unwrap();
]);
let y: Vec<f64> = vec![
83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0, 100.0, 101.2, 104.6, 108.4, 110.8, 112.6,
+10 -4
View File
@@ -16,7 +16,7 @@ use crate::linalg::basic::arrays::{Array1, Array2, ArrayView1, MutArray, MutArra
use crate::linear::bg_solver::BiconjugateGradientSolver;
use crate::numbers::floatnum::FloatNumber;
/// Interior Point Optimizer
///
pub struct InteriorPointOptimizer<T: FloatNumber, X: Array2<T>> {
ata: X,
d1: Vec<T>,
@@ -25,8 +25,9 @@ pub struct InteriorPointOptimizer<T: FloatNumber, X: Array2<T>> {
prs: Vec<T>,
}
///
impl<T: FloatNumber, X: Array2<T>> InteriorPointOptimizer<T, X> {
/// Initialize a new Interior Point Optimizer
///
pub fn new(a: &X, n: usize) -> InteriorPointOptimizer<T, X> {
InteriorPointOptimizer {
ata: a.ab(true, a, false),
@@ -37,7 +38,7 @@ impl<T: FloatNumber, X: Array2<T>> InteriorPointOptimizer<T, X> {
}
}
/// Run the optimization
///
pub fn optimize(
&mut self,
x: &X,
@@ -100,7 +101,7 @@ impl<T: FloatNumber, X: Array2<T>> InteriorPointOptimizer<T, X> {
// CALCULATE DUALITY GAP
let xnu = nu.xa(false, x);
let max_xnu = xnu.norm(f64::INFINITY);
let max_xnu = xnu.norm(std::f64::INFINITY);
if max_xnu > lambda_f64 {
let lnu = T::from_f64(lambda_f64 / max_xnu).unwrap();
nu.mul_scalar_mut(lnu);
@@ -207,6 +208,7 @@ impl<T: FloatNumber, X: Array2<T>> InteriorPointOptimizer<T, X> {
Ok(w)
}
///
fn sumlogneg(f: &X) -> T {
let (n, _) = f.shape();
let mut sum = T::zero();
@@ -218,9 +220,11 @@ impl<T: FloatNumber, X: Array2<T>> InteriorPointOptimizer<T, X> {
}
}
///
impl<'a, T: FloatNumber, X: Array2<T>> BiconjugateGradientSolver<'a, T, X>
for InteriorPointOptimizer<T, X>
{
///
fn solve_preconditioner(&self, a: &'a X, b: &[T], x: &mut [T]) {
let (_, p) = a.shape();
@@ -230,6 +234,7 @@ impl<'a, T: FloatNumber, X: Array2<T>> BiconjugateGradientSolver<'a, T, X>
}
}
///
fn mat_vec_mul(&self, _: &X, x: &Vec<T>, y: &mut Vec<T>) {
let (_, p) = self.ata.shape();
let x_slice = Vec::from_slice(x.slice(0..p).as_ref());
@@ -241,6 +246,7 @@ impl<'a, T: FloatNumber, X: Array2<T>> BiconjugateGradientSolver<'a, T, X>
}
}
///
fn mat_t_vec_mul(&self, a: &X, x: &Vec<T>, y: &mut Vec<T>) {
self.mat_vec_mul(a, x, y);
}
+3 -4
View File
@@ -40,7 +40,7 @@
//! &[502.601, 393.1, 251.4, 125.368, 1960., 69.564],
//! &[518.173, 480.6, 257.2, 127.852, 1961., 69.331],
//! &[554.894, 400.7, 282.7, 130.081, 1962., 70.551],
//! ]).unwrap();
//! ]);
//!
//! let y: Vec<f64> = vec![83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0,
//! 100.0, 101.2, 104.6, 108.4, 110.8, 112.6, 114.2, 115.7, 116.9];
@@ -341,8 +341,7 @@ mod tests {
&[502.601, 393.1, 251.4, 125.368, 1960., 69.564],
&[518.173, 480.6, 257.2, 127.852, 1961., 69.331],
&[554.894, 400.7, 282.7, 130.081, 1962., 70.551],
])
.unwrap();
]);
let y: Vec<f64> = vec![
83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0, 100.0, 101.2, 104.6, 108.4, 110.8,
@@ -394,7 +393,7 @@ mod tests {
// &[502.601, 393.1, 251.4, 125.368, 1960., 69.564],
// &[518.173, 480.6, 257.2, 127.852, 1961., 69.331],
// &[554.894, 400.7, 282.7, 130.081, 1962., 70.551],
// ]).unwrap();
// ]);
// let y = vec![
// 83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0, 100.0, 101.2, 104.6, 108.4, 110.8, 112.6,
+67 -104
View File
@@ -35,7 +35,7 @@
//! &[4.9, 2.4, 3.3, 1.0],
//! &[6.6, 2.9, 4.6, 1.3],
//! &[5.2, 2.7, 3.9, 1.4],
//! ]).unwrap();
//! ]);
//! let y: Vec<i32> = vec![
//! 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
//! ];
@@ -71,14 +71,19 @@ use crate::optimization::line_search::Backtracking;
use crate::optimization::FunctionOrder;
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[derive(Debug, Clone, Eq, PartialEq, Default)]
#[derive(Debug, Clone, Eq, PartialEq)]
/// Solver options for Logistic regression. Right now only LBFGS solver is supported.
pub enum LogisticRegressionSolverName {
/// Limited-memory BroydenFletcherGoldfarbShanno method, see [LBFGS paper](http://users.iems.northwestern.edu/~nocedal/lbfgsb.html)
#[default]
LBFGS,
}
impl Default for LogisticRegressionSolverName {
fn default() -> Self {
LogisticRegressionSolverName::LBFGS
}
}
/// Logistic Regression parameters
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[derive(Debug, Clone)]
@@ -183,11 +188,14 @@ pub struct LogisticRegression<
}
trait ObjectiveFunction<T: Number + FloatNumber, X: Array2<T>> {
///
fn f(&self, w_bias: &[T]) -> T;
///
#[allow(clippy::ptr_arg)]
fn df(&self, g: &mut Vec<T>, w_bias: &Vec<T>);
///
#[allow(clippy::ptr_arg)]
fn partial_dot(w: &[T], x: &X, v_col: usize, m_row: usize) -> T {
let mut sum = T::zero();
@@ -258,8 +266,8 @@ impl<TX: Number + FloatNumber + RealNumber, TY: Number + Ord, X: Array2<TX>, Y:
}
}
impl<T: Number + FloatNumber, X: Array2<T>> ObjectiveFunction<T, X>
for BinaryObjectiveFunction<'_, T, X>
impl<'a, T: Number + FloatNumber, X: Array2<T>> ObjectiveFunction<T, X>
for BinaryObjectiveFunction<'a, T, X>
{
fn f(&self, w_bias: &[T]) -> T {
let mut f = T::zero();
@@ -313,8 +321,8 @@ struct MultiClassObjectiveFunction<'a, T: Number + FloatNumber, X: Array2<T>> {
_phantom_t: PhantomData<T>,
}
impl<T: Number + FloatNumber + RealNumber, X: Array2<T>> ObjectiveFunction<T, X>
for MultiClassObjectiveFunction<'_, T, X>
impl<'a, T: Number + FloatNumber + RealNumber, X: Array2<T>> ObjectiveFunction<T, X>
for MultiClassObjectiveFunction<'a, T, X>
{
fn f(&self, w_bias: &[T]) -> T {
let mut f = T::zero();
@@ -413,7 +421,7 @@ impl<TX: Number + FloatNumber + RealNumber, TY: Number + Ord, X: Array2<TX>, Y:
/// Fits Logistic Regression to your data.
/// * `x` - _NxM_ matrix with _N_ observations and _M_ features in each observation.
/// * `y` - target class values
/// * `parameters` - other parameters, use `Default::default()` to set parameters to default values.
/// * `parameters` - other parameters, use `Default::default()` to set parameters to default values.
pub fn fit(
x: &X,
y: &Y,
@@ -441,7 +449,8 @@ impl<TX: Number + FloatNumber + RealNumber, TY: Number + Ord, X: Array2<TX>, Y:
match k.cmp(&2) {
Ordering::Less => Err(Failed::fit(&format!(
"incorrect number of classes: {k}. Should be >= 2."
"incorrect number of classes: {}. Should be >= 2.",
k
))),
Ordering::Equal => {
let x0 = Vec::zeros(num_attributes + 1);
@@ -608,8 +617,7 @@ mod tests {
&[10., -2.],
&[8., 2.],
&[9., 0.],
])
.unwrap();
]);
let y = vec![0, 0, 1, 1, 2, 1, 1, 0, 0, 2, 1, 1, 0, 0, 1];
@@ -626,21 +634,21 @@ mod tests {
objective.df(&mut g, &vec![1., 2., 3., 4., 5., 6., 7., 8., 9.]);
objective.df(&mut g, &vec![1., 2., 3., 4., 5., 6., 7., 8., 9.]);
assert!((g[0] + 33.000068218163484).abs() < f64::EPSILON);
assert!((g[0] + 33.000068218163484).abs() < std::f64::EPSILON);
let f = objective.f(&[1., 2., 3., 4., 5., 6., 7., 8., 9.]);
let f = objective.f(&vec![1., 2., 3., 4., 5., 6., 7., 8., 9.]);
assert!((f - 408.0052230582765).abs() < f64::EPSILON);
assert!((f - 408.0052230582765).abs() < std::f64::EPSILON);
let objective_reg = MultiClassObjectiveFunction {
x: &x,
y,
y: y.clone(),
k: 3,
alpha: 1.0,
_phantom_t: PhantomData,
};
let f = objective_reg.f(&[1., 2., 3., 4., 5., 6., 7., 8., 9.]);
let f = objective_reg.f(&vec![1., 2., 3., 4., 5., 6., 7., 8., 9.]);
assert!((f - 487.5052).abs() < 1e-4);
objective_reg.df(&mut g, &vec![1., 2., 3., 4., 5., 6., 7., 8., 9.]);
@@ -669,8 +677,7 @@ mod tests {
&[10., -2.],
&[8., 2.],
&[9., 0.],
])
.unwrap();
]);
let y = vec![0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1];
@@ -686,22 +693,22 @@ mod tests {
objective.df(&mut g, &vec![1., 2., 3.]);
objective.df(&mut g, &vec![1., 2., 3.]);
assert!((g[0] - 26.051064349381285).abs() < f64::EPSILON);
assert!((g[1] - 10.239000702928523).abs() < f64::EPSILON);
assert!((g[2] - 3.869294270156324).abs() < f64::EPSILON);
assert!((g[0] - 26.051064349381285).abs() < std::f64::EPSILON);
assert!((g[1] - 10.239000702928523).abs() < std::f64::EPSILON);
assert!((g[2] - 3.869294270156324).abs() < std::f64::EPSILON);
let f = objective.f(&[1., 2., 3.]);
let f = objective.f(&vec![1., 2., 3.]);
assert!((f - 59.76994756647412).abs() < f64::EPSILON);
assert!((f - 59.76994756647412).abs() < std::f64::EPSILON);
let objective_reg = BinaryObjectiveFunction {
x: &x,
y,
y: y.clone(),
alpha: 1.0,
_phantom_t: PhantomData,
};
let f = objective_reg.f(&[1., 2., 3.]);
let f = objective_reg.f(&vec![1., 2., 3.]);
assert!((f - 62.2699).abs() < 1e-4);
objective_reg.df(&mut g, &vec![1., 2., 3.]);
@@ -732,8 +739,7 @@ mod tests {
&[10., -2.],
&[8., 2.],
&[9., 0.],
])
.unwrap();
]);
let y: Vec<i32> = vec![0, 0, 1, 1, 2, 1, 1, 0, 0, 2, 1, 1, 0, 0, 1];
let lr = LogisticRegression::fit(&x, &y, Default::default()).unwrap();
@@ -818,41 +824,37 @@ mod tests {
assert!(reg_coeff_sum < coeff);
}
//TODO: serialization for the new DenseMatrix needs to be implemented
#[cfg_attr(
all(target_arch = "wasm32", not(target_os = "wasi")),
wasm_bindgen_test::wasm_bindgen_test
)]
#[test]
#[cfg(feature = "serde")]
fn serde() {
let x: DenseMatrix<f64> = DenseMatrix::from_2d_array(&[
&[1., -5.],
&[2., 5.],
&[3., -2.],
&[1., 2.],
&[2., 0.],
&[6., -5.],
&[7., 5.],
&[6., -2.],
&[7., 2.],
&[6., 0.],
&[8., -5.],
&[9., 5.],
&[10., -2.],
&[8., 2.],
&[9., 0.],
])
.unwrap();
let y: Vec<i32> = vec![0, 0, 1, 1, 2, 1, 1, 0, 0, 2, 1, 1, 0, 0, 1];
// TODO: serialization for the new DenseMatrix needs to be implemented
// #[cfg_attr(all(target_arch = "wasm32", not(target_os = "wasi")), wasm_bindgen_test::wasm_bindgen_test)]
// #[test]
// #[cfg(feature = "serde")]
// fn serde() {
// let x = DenseMatrix::from_2d_array(&[
// &[1., -5.],
// &[2., 5.],
// &[3., -2.],
// &[1., 2.],
// &[2., 0.],
// &[6., -5.],
// &[7., 5.],
// &[6., -2.],
// &[7., 2.],
// &[6., 0.],
// &[8., -5.],
// &[9., 5.],
// &[10., -2.],
// &[8., 2.],
// &[9., 0.],
// ]);
// let y: Vec<i32> = vec![0, 0, 1, 1, 2, 1, 1, 0, 0, 2, 1, 1, 0, 0, 1];
let lr = LogisticRegression::fit(&x, &y, Default::default()).unwrap();
// let lr = LogisticRegression::fit(&x, &y, Default::default()).unwrap();
let deserialized_lr: LogisticRegression<f64, i32, DenseMatrix<f64>, Vec<i32>> =
serde_json::from_str(&serde_json::to_string(&lr).unwrap()).unwrap();
// let deserialized_lr: LogisticRegression<f64, i32, DenseMatrix<f64>, Vec<i32>> =
// serde_json::from_str(&serde_json::to_string(&lr).unwrap()).unwrap();
assert_eq!(lr, deserialized_lr);
}
// assert_eq!(lr, deserialized_lr);
// }
#[cfg_attr(
all(target_arch = "wasm32", not(target_os = "wasi")),
@@ -881,8 +883,7 @@ mod tests {
&[4.9, 2.4, 3.3, 1.0],
&[6.6, 2.9, 4.6, 1.3],
&[5.2, 2.7, 3.9, 1.4],
])
.unwrap();
]);
let y: Vec<i32> = vec![0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1];
let lr = LogisticRegression::fit(&x, &y, Default::default()).unwrap();
@@ -895,7 +896,11 @@ mod tests {
let y_hat = lr.predict(&x).unwrap();
let error: i32 = y.into_iter().zip(y_hat).map(|(a, b)| (a - b).abs()).sum();
let error: i32 = y
.into_iter()
.zip(y_hat.into_iter())
.map(|(a, b)| (a - b).abs())
.sum();
assert!(error <= 1);
@@ -904,46 +909,4 @@ mod tests {
assert!(reg_coeff_sum < coeff);
}
#[cfg_attr(
all(target_arch = "wasm32", not(target_os = "wasi")),
wasm_bindgen_test::wasm_bindgen_test
)]
#[test]
fn lr_fit_predict_random() {
let x: DenseMatrix<f32> = DenseMatrix::rand(52181, 94);
let y1: Vec<i32> = vec![1; 2181];
let y2: Vec<i32> = vec![0; 50000];
let y: Vec<i32> = y1.into_iter().chain(y2).collect();
let lr = LogisticRegression::fit(&x, &y, Default::default()).unwrap();
let lr_reg = LogisticRegression::fit(
&x,
&y,
LogisticRegressionParameters::default().with_alpha(1.0),
)
.unwrap();
let y_hat = lr.predict(&x).unwrap();
let y_hat_reg = lr_reg.predict(&x).unwrap();
assert_eq!(y.len(), y_hat.len());
assert_eq!(y.len(), y_hat_reg.len());
}
#[test]
fn test_logit() {
let x: &DenseMatrix<f64> = &DenseMatrix::rand(52181, 94);
let y1: Vec<u32> = vec![1; 2181];
let y2: Vec<u32> = vec![0; 50000];
let y: &Vec<u32> = &(y1.into_iter().chain(y2).collect());
println!("y vec height: {:?}", y.len());
println!("x matrix shape: {:?}", x.shape());
let lr = LogisticRegression::fit(x, y, Default::default()).unwrap();
let y_hat = lr.predict(x).unwrap();
println!("y_hat shape: {:?}", y_hat.shape());
assert_eq!(y_hat.shape(), 52181);
}
}
+14 -7
View File
@@ -40,7 +40,7 @@
//! &[502.601, 393.1, 251.4, 125.368, 1960., 69.564],
//! &[518.173, 480.6, 257.2, 127.852, 1961., 69.331],
//! &[554.894, 400.7, 282.7, 130.081, 1962., 70.551],
//! ]).unwrap();
//! ]);
//!
//! let y: Vec<f64> = vec![83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0,
//! 100.0, 101.2, 104.6, 108.4, 110.8, 112.6, 114.2, 115.7, 116.9];
@@ -71,16 +71,21 @@ use crate::numbers::basenum::Number;
use crate::numbers::realnum::RealNumber;
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[derive(Debug, Clone, Eq, PartialEq, Default)]
#[derive(Debug, Clone, Eq, PartialEq)]
/// Approach to use for estimation of regression coefficients. Cholesky is more efficient but SVD is more stable.
pub enum RidgeRegressionSolverName {
/// Cholesky decomposition, see [Cholesky](../../linalg/cholesky/index.html)
#[default]
Cholesky,
/// SVD decomposition, see [SVD](../../linalg/svd/index.html)
SVD,
}
impl Default for RidgeRegressionSolverName {
fn default() -> Self {
RidgeRegressionSolverName::Cholesky
}
}
/// Ridge Regression parameters
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[derive(Debug, Clone)]
@@ -379,7 +384,10 @@ impl<
for (i, col_std_i) in col_std.iter().enumerate() {
if (*col_std_i - TX::zero()).abs() < TX::epsilon() {
return Err(Failed::fit(&format!("Cannot rescale constant column {i}")));
return Err(Failed::fit(&format!(
"Cannot rescale constant column {}",
i
)));
}
}
@@ -455,8 +463,7 @@ mod tests {
&[502.601, 393.1, 251.4, 125.368, 1960., 69.564],
&[518.173, 480.6, 257.2, 127.852, 1961., 69.331],
&[554.894, 400.7, 282.7, 130.081, 1962., 70.551],
])
.unwrap();
]);
let y: Vec<f64> = vec![
83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0, 100.0, 101.2, 104.6, 108.4, 110.8, 112.6,
@@ -514,7 +521,7 @@ mod tests {
// &[502.601, 393.1, 251.4, 125.368, 1960., 69.564],
// &[518.173, 480.6, 257.2, 127.852, 1961., 69.331],
// &[554.894, 400.7, 282.7, 130.081, 1962., 70.551],
// ]).unwrap();
// ]);
// let y = vec![
// 83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0, 100.0, 101.2, 104.6, 108.4, 110.8, 112.6,
+3 -3
View File
@@ -98,8 +98,8 @@ mod tests {
let mut scores = HCVScore::new();
scores.compute(&v1, &v2);
assert!((0.2548 - scores.homogeneity.unwrap()).abs() < 1e-4);
assert!((0.5440 - scores.completeness.unwrap()).abs() < 1e-4);
assert!((0.3471 - scores.v_measure.unwrap()).abs() < 1e-4);
assert!((0.2548 - scores.homogeneity.unwrap() as f64).abs() < 1e-4);
assert!((0.5440 - scores.completeness.unwrap() as f64).abs() < 1e-4);
assert!((0.3471 - scores.v_measure.unwrap() as f64).abs() < 1e-4);
}
}
+1 -1
View File
@@ -125,7 +125,7 @@ mod tests {
fn entropy_test() {
let v1 = vec![0, 0, 1, 1, 2, 0, 4];
assert!((1.2770 - entropy(&v1).unwrap()).abs() < 1e-4);
assert!((1.2770 - entropy(&v1).unwrap() as f64).abs() < 1e-4);
}
#[cfg_attr(
+2 -3
View File
@@ -25,7 +25,7 @@
//! &[68., 590., 37.],
//! &[69., 660., 46.],
//! &[73., 600., 55.],
//! ]).unwrap();
//! ]);
//!
//! let a = data.mean_by(0);
//! let b = vec![66., 640., 44.];
@@ -151,8 +151,7 @@ mod tests {
&[68., 590., 37.],
&[69., 660., 46.],
&[73., 600., 55.],
])
.unwrap();
]);
let a = data.mean_by(0);
let b = vec![66., 640., 44.];
+2 -2
View File
@@ -95,8 +95,8 @@ mod tests {
let score1: f64 = F1::new_with(beta).get_score(&y_true, &y_pred);
let score2: f64 = F1::new_with(beta).get_score(&y_true, &y_true);
println!("{score1:?}");
println!("{score2:?}");
println!("{:?}", score1);
println!("{:?}", score2);
assert!((score1 - 0.57142857).abs() < 1e-8);
assert!((score2 - 1.0).abs() < 1e-8);
+1 -1
View File
@@ -37,7 +37,7 @@
//! &[4.9, 2.4, 3.3, 1.0],
//! &[6.6, 2.9, 4.6, 1.3],
//! &[5.2, 2.7, 3.9, 1.4],
//! ]).unwrap();
//! ]);
//! let y: Vec<i8> = vec![
//! 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
//! ];
@@ -3,9 +3,9 @@
use crate::{
api::{Predictor, SupervisedEstimator},
error::{Failed, FailedError},
linalg::basic::arrays::{Array1, Array2},
numbers::basenum::Number,
linalg::basic::arrays::{Array2, Array1},
numbers::realnum::RealNumber,
numbers::basenum::Number,
};
use crate::model_selection::{cross_validate, BaseKFold, CrossValidationResult};
+10 -6
View File
@@ -213,17 +213,17 @@ mod tests {
for t in &test_masks[0][0..11] {
// TODO: this can be prob done better
assert!(*t)
assert_eq!(*t, true)
}
for t in &test_masks[0][11..22] {
assert!(!*t)
assert_eq!(*t, false)
}
for t in &test_masks[1][0..11] {
assert!(!*t)
assert_eq!(*t, false)
}
for t in &test_masks[1][11..22] {
assert!(*t)
assert_eq!(*t, true)
}
}
@@ -283,7 +283,9 @@ mod tests {
(vec![0, 1, 2, 3, 7, 8, 9], vec![4, 5, 6]),
(vec![0, 1, 2, 3, 4, 5, 6], vec![7, 8, 9]),
];
for ((train, test), (expected_train, expected_test)) in k.split(&x).zip(expected) {
for ((train, test), (expected_train, expected_test)) in
k.split(&x).into_iter().zip(expected)
{
assert_eq!(test, expected_test);
assert_eq!(train, expected_train);
}
@@ -305,7 +307,9 @@ mod tests {
(vec![0, 1, 2, 3, 7, 8, 9], vec![4, 5, 6]),
(vec![0, 1, 2, 3, 4, 5, 6], vec![7, 8, 9]),
];
for ((train, test), (expected_train, expected_test)) in k.split(&x).zip(expected) {
for ((train, test), (expected_train, expected_test)) in
k.split(&x).into_iter().zip(expected)
{
assert_eq!(test.len(), expected_test.len());
assert_eq!(train.len(), expected_train.len());
}
+8 -12
View File
@@ -36,7 +36,7 @@
//! &[4.9, 2.4, 3.3, 1.0],
//! &[6.6, 2.9, 4.6, 1.3],
//! &[5.2, 2.7, 3.9, 1.4],
//! ]).unwrap();
//! ]);
//! let y: Vec<f64> = vec![
//! 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
//! ];
@@ -84,7 +84,7 @@
//! &[4.9, 2.4, 3.3, 1.0],
//! &[6.6, 2.9, 4.6, 1.3],
//! &[5.2, 2.7, 3.9, 1.4],
//! ]).unwrap();
//! ]);
//! let y: Vec<i32> = vec![
//! 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
//! ];
@@ -169,7 +169,7 @@ pub fn train_test_split<
let n_test = ((n as f32) * test_size) as usize;
if n_test < 1 {
panic!("number of sample is too small {n}");
panic!("number of sample is too small {}", n);
}
let mut indices: Vec<usize> = (0..n).collect();
@@ -396,8 +396,7 @@ mod tests {
&[4.9, 2.4, 3.3, 1.0],
&[6.6, 2.9, 4.6, 1.3],
&[5.2, 2.7, 3.9, 1.4],
])
.unwrap();
]);
let y: Vec<u32> = vec![0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1];
let cv = KFold {
@@ -442,8 +441,7 @@ mod tests {
&[502.601, 393.1, 251.4, 125.368, 1960., 69.564],
&[518.173, 480.6, 257.2, 127.852, 1961., 69.331],
&[554.894, 400.7, 282.7, 130.081, 1962., 70.551],
])
.unwrap();
]);
let y = vec![
83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0, 100.0, 101.2, 104.6, 108.4, 110.8, 112.6,
114.2, 115.7, 116.9,
@@ -491,8 +489,7 @@ mod tests {
&[502.601, 393.1, 251.4, 125.368, 1960., 69.564],
&[518.173, 480.6, 257.2, 127.852, 1961., 69.331],
&[554.894, 400.7, 282.7, 130.081, 1962., 70.551],
])
.unwrap();
]);
let y: Vec<f64> = vec![
83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0, 100.0, 101.2, 104.6, 108.4, 110.8, 112.6,
114.2, 115.7, 116.9,
@@ -542,8 +539,7 @@ mod tests {
&[4.9, 2.4, 3.3, 1.0],
&[6.6, 2.9, 4.6, 1.3],
&[5.2, 2.7, 3.9, 1.4],
])
.unwrap();
]);
let y: Vec<i32> = vec![0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1];
let cv = KFold::default().with_n_splits(3);
@@ -557,6 +553,6 @@ mod tests {
&accuracy,
)
.unwrap();
println!("{results:?}");
println!("{:?}", results);
}
}
+17 -17
View File
@@ -19,14 +19,14 @@
//! &[0, 1, 0, 0, 1, 0],
//! &[0, 1, 0, 1, 0, 0],
//! &[0, 1, 1, 0, 0, 1],
//! ]).unwrap();
//! ]);
//! let y: Vec<u32> = vec![0, 0, 0, 1];
//!
//! let nb = BernoulliNB::fit(&x, &y, Default::default()).unwrap();
//!
//! // Testing data point is:
//! // Chinese Chinese Chinese Tokyo Japan
//! let x_test = DenseMatrix::from_2d_array(&[&[0, 1, 1, 0, 0, 1]]).unwrap();
//! let x_test = DenseMatrix::from_2d_array(&[&[0, 1, 1, 0, 0, 1]]);
//! let y_hat = nb.predict(&x_test).unwrap();
//! ```
//!
@@ -258,7 +258,7 @@ impl<TY: Number + Ord + Unsigned> BernoulliNBDistribution<TY> {
/// * `x` - training data.
/// * `y` - vector with target values (classes) of length N.
/// * `priors` - Optional vector with prior probabilities of the classes. If not defined,
/// priors are adjusted according to the data.
/// priors are adjusted according to the data.
/// * `alpha` - Additive (Laplace/Lidstone) smoothing parameter.
/// * `binarize` - Threshold for binarizing.
fn fit<TX: Number + PartialOrd, X: Array2<TX>, Y: Array1<TY>>(
@@ -271,18 +271,21 @@ impl<TY: Number + Ord + Unsigned> BernoulliNBDistribution<TY> {
let y_samples = y.shape();
if y_samples != n_samples {
return Err(Failed::fit(&format!(
"Size of x should equal size of y; |x|=[{n_samples}], |y|=[{y_samples}]"
"Size of x should equal size of y; |x|=[{}], |y|=[{}]",
n_samples, y_samples
)));
}
if n_samples == 0 {
return Err(Failed::fit(&format!(
"Size of x and y should greater than 0; |x|=[{n_samples}]"
"Size of x and y should greater than 0; |x|=[{}]",
n_samples
)));
}
if alpha < 0f64 {
return Err(Failed::fit(&format!(
"Alpha should be greater than 0; |alpha|=[{alpha}]"
"Alpha should be greater than 0; |alpha|=[{}]",
alpha
)));
}
@@ -315,7 +318,8 @@ impl<TY: Number + Ord + Unsigned> BernoulliNBDistribution<TY> {
feature_in_class_counter[class_index][idx] +=
row_i.to_usize().ok_or_else(|| {
Failed::fit(&format!(
"Elements of the matrix should be 1.0 or 0.0 |found|=[{row_i}]"
"Elements of the matrix should be 1.0 or 0.0 |found|=[{}]",
row_i
))
})?;
}
@@ -402,10 +406,10 @@ impl<TX: Number + PartialOrd, TY: Number + Ord + Unsigned, X: Array2<TX>, Y: Arr
{
/// Fits BernoulliNB with given data
/// * `x` - training data of size NxM where N is the number of samples and M is the number of
/// features.
/// features.
/// * `y` - vector with target values (classes) of length N.
/// * `parameters` - additional parameters like class priors, alpha for smoothing and
/// binarizing threshold.
/// binarizing threshold.
pub fn fit(x: &X, y: &Y, parameters: BernoulliNBParameters<TX>) -> Result<Self, Failed> {
let distribution = if let Some(threshold) = parameters.binarize {
BernoulliNBDistribution::fit(
@@ -427,7 +431,6 @@ impl<TX: Number + PartialOrd, TY: Number + Ord + Unsigned, X: Array2<TX>, Y: Arr
/// Estimates the class labels for the provided data.
/// * `x` - data of shape NxM where N is number of data points to estimate and M is number of features.
///
/// Returns a vector of size N with class estimates.
pub fn predict(&self, x: &X) -> Result<Y, Failed> {
if let Some(threshold) = self.binarize {
@@ -528,8 +531,7 @@ mod tests {
&[0.0, 1.0, 0.0, 0.0, 1.0, 0.0],
&[0.0, 1.0, 0.0, 1.0, 0.0, 0.0],
&[0.0, 1.0, 1.0, 0.0, 0.0, 1.0],
])
.unwrap();
]);
let y: Vec<u32> = vec![0, 0, 0, 1];
let bnb = BernoulliNB::fit(&x, &y, Default::default()).unwrap();
@@ -560,7 +562,7 @@ mod tests {
// Testing data point is:
// Chinese Chinese Chinese Tokyo Japan
let x_test = DenseMatrix::from_2d_array(&[&[0.0, 1.0, 1.0, 0.0, 0.0, 1.0]]).unwrap();
let x_test = DenseMatrix::from_2d_array(&[&[0.0, 1.0, 1.0, 0.0, 0.0, 1.0]]);
let y_hat = bnb.predict(&x_test).unwrap();
assert_eq!(y_hat, &[1]);
@@ -588,8 +590,7 @@ mod tests {
&[2, 0, 3, 3, 1, 2, 0, 2, 4, 1],
&[2, 4, 0, 4, 2, 4, 1, 3, 1, 4],
&[0, 2, 2, 3, 4, 0, 4, 4, 4, 4],
])
.unwrap();
]);
let y: Vec<u32> = vec![2, 2, 0, 0, 0, 2, 1, 1, 0, 1, 0, 0, 2, 0, 2];
let bnb = BernoulliNB::fit(&x, &y, Default::default()).unwrap();
@@ -646,8 +647,7 @@ mod tests {
&[0, 1, 0, 0, 1, 0],
&[0, 1, 0, 1, 0, 0],
&[0, 1, 1, 0, 0, 1],
])
.unwrap();
]);
let y: Vec<u32> = vec![0, 0, 0, 1];
let bnb = BernoulliNB::fit(&x, &y, Default::default()).unwrap();
+16 -15
View File
@@ -24,7 +24,7 @@
//! &[3, 4, 2, 4],
//! &[0, 3, 1, 2],
//! &[0, 4, 1, 2],
//! ]).unwrap();
//! ]);
//! let y: Vec<u32> = vec![0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0];
//!
//! let nb = CategoricalNB::fit(&x, &y, Default::default()).unwrap();
@@ -95,7 +95,7 @@ impl<T: Number + Unsigned> PartialEq for CategoricalNBDistribution<T> {
return false;
}
for (a_i_j, b_i_j) in a_i.iter().zip(b_i.iter()) {
if (*a_i_j - *b_i_j).abs() > f64::EPSILON {
if (*a_i_j - *b_i_j).abs() > std::f64::EPSILON {
return false;
}
}
@@ -158,7 +158,8 @@ impl<T: Number + Unsigned> CategoricalNBDistribution<T> {
pub fn fit<X: Array2<T>, Y: Array1<T>>(x: &X, y: &Y, alpha: f64) -> Result<Self, Failed> {
if alpha < 0f64 {
return Err(Failed::fit(&format!(
"alpha should be >= 0, alpha=[{alpha}]"
"alpha should be >= 0, alpha=[{}]",
alpha
)));
}
@@ -166,13 +167,15 @@ impl<T: Number + Unsigned> CategoricalNBDistribution<T> {
let y_samples = y.shape();
if y_samples != n_samples {
return Err(Failed::fit(&format!(
"Size of x should equal size of y; |x|=[{n_samples}], |y|=[{y_samples}]"
"Size of x should equal size of y; |x|=[{}], |y|=[{}]",
n_samples, y_samples
)));
}
if n_samples == 0 {
return Err(Failed::fit(&format!(
"Size of x and y should greater than 0; |x|=[{n_samples}]"
"Size of x and y should greater than 0; |x|=[{}]",
n_samples
)));
}
let y: Vec<usize> = y.iterator(0).map(|y_i| y_i.to_usize().unwrap()).collect();
@@ -199,7 +202,8 @@ impl<T: Number + Unsigned> CategoricalNBDistribution<T> {
.max()
.ok_or_else(|| {
Failed::fit(&format!(
"Failed to get the categories for feature = {feature}"
"Failed to get the categories for feature = {}",
feature
))
})?;
n_categories.push(feature_max + 1);
@@ -363,7 +367,7 @@ impl<T: Number + Unsigned, X: Array2<T>, Y: Array1<T>> Predictor<X, Y> for Categ
impl<T: Number + Unsigned, X: Array2<T>, Y: Array1<T>> CategoricalNB<T, X, Y> {
/// Fits CategoricalNB with given data
/// * `x` - training data of size NxM where N is the number of samples and M is the number of
/// features.
/// features.
/// * `y` - vector with target values (classes) of length N.
/// * `parameters` - additional parameters like alpha for smoothing
pub fn fit(x: &X, y: &Y, parameters: CategoricalNBParameters) -> Result<Self, Failed> {
@@ -375,7 +379,6 @@ impl<T: Number + Unsigned, X: Array2<T>, Y: Array1<T>> CategoricalNB<T, X, Y> {
/// Estimates the class labels for the provided data.
/// * `x` - data of shape NxM where N is number of data points to estimate and M is number of features.
///
/// Returns a vector of size N with class estimates.
pub fn predict(&self, x: &X) -> Result<Y, Failed> {
self.inner.as_ref().unwrap().predict(x)
@@ -426,6 +429,7 @@ mod tests {
fn search_parameters() {
let parameters = CategoricalNBSearchParameters {
alpha: vec![1., 2.],
..Default::default()
};
let mut iter = parameters.into_iter();
let next = iter.next().unwrap();
@@ -456,8 +460,7 @@ mod tests {
&[1, 1, 1, 1],
&[1, 2, 0, 0],
&[2, 1, 1, 1],
])
.unwrap();
]);
let y: Vec<u32> = vec![0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0];
let cnb = CategoricalNB::fit(&x, &y, Default::default()).unwrap();
@@ -515,7 +518,7 @@ mod tests {
]
);
let x_test = DenseMatrix::from_2d_array(&[&[0, 2, 1, 0], &[2, 2, 0, 0]]).unwrap();
let x_test = DenseMatrix::from_2d_array(&[&[0, 2, 1, 0], &[2, 2, 0, 0]]);
let y_hat = cnb.predict(&x_test).unwrap();
assert_eq!(y_hat, vec![0, 1]);
}
@@ -541,8 +544,7 @@ mod tests {
&[3, 4, 2, 4],
&[0, 3, 1, 2],
&[0, 4, 1, 2],
])
.unwrap();
]);
let y: Vec<u32> = vec![0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0];
let cnb = CategoricalNB::fit(&x, &y, Default::default()).unwrap();
@@ -574,8 +576,7 @@ mod tests {
&[3, 4, 2, 4],
&[0, 3, 1, 2],
&[0, 4, 1, 2],
])
.unwrap();
]);
let y: Vec<u32> = vec![0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0];
let cnb = CategoricalNB::fit(&x, &y, Default::default()).unwrap();
+11 -12
View File
@@ -16,7 +16,7 @@
//! &[ 1., 1.],
//! &[ 2., 1.],
//! &[ 3., 2.],
//! ]).unwrap();
//! ]);
//! let y: Vec<u32> = vec![1, 1, 1, 2, 2, 2];
//!
//! let nb = GaussianNB::fit(&x, &y, Default::default()).unwrap();
@@ -175,7 +175,7 @@ impl<TY: Number + Ord + Unsigned> GaussianNBDistribution<TY> {
/// * `x` - training data.
/// * `y` - vector with target values (classes) of length N.
/// * `priors` - Optional vector with prior probabilities of the classes. If not defined,
/// priors are adjusted according to the data.
/// priors are adjusted according to the data.
pub fn fit<TX: Number + RealNumber, X: Array2<TX>, Y: Array1<TY>>(
x: &X,
y: &Y,
@@ -185,13 +185,15 @@ impl<TY: Number + Ord + Unsigned> GaussianNBDistribution<TY> {
let y_samples = y.shape();
if y_samples != n_samples {
return Err(Failed::fit(&format!(
"Size of x should equal size of y; |x|=[{n_samples}], |y|=[{y_samples}]"
"Size of x should equal size of y; |x|=[{}], |y|=[{}]",
n_samples, y_samples
)));
}
if n_samples == 0 {
return Err(Failed::fit(&format!(
"Size of x and y should greater than 0; |x|=[{n_samples}]"
"Size of x and y should greater than 0; |x|=[{}]",
n_samples
)));
}
let (class_labels, indices) = y.unique_with_indices();
@@ -317,7 +319,7 @@ impl<TX: Number + RealNumber, TY: Number + Ord + Unsigned, X: Array2<TX>, Y: Arr
{
/// Fits GaussianNB with given data
/// * `x` - training data of size NxM where N is the number of samples and M is the number of
/// features.
/// features.
/// * `y` - vector with target values (classes) of length N.
/// * `parameters` - additional parameters like class priors.
pub fn fit(x: &X, y: &Y, parameters: GaussianNBParameters) -> Result<Self, Failed> {
@@ -328,7 +330,6 @@ impl<TX: Number + RealNumber, TY: Number + Ord + Unsigned, X: Array2<TX>, Y: Arr
/// Estimates the class labels for the provided data.
/// * `x` - data of shape NxM where N is number of data points to estimate and M is number of features.
///
/// Returns a vector of size N with class estimates.
pub fn predict(&self, x: &X) -> Result<Y, Failed> {
self.inner.as_ref().unwrap().predict(x)
@@ -374,6 +375,7 @@ mod tests {
fn search_parameters() {
let parameters = GaussianNBSearchParameters {
priors: vec![Some(vec![1.]), Some(vec![2.])],
..Default::default()
};
let mut iter = parameters.into_iter();
let next = iter.next().unwrap();
@@ -396,8 +398,7 @@ mod tests {
&[1., 1.],
&[2., 1.],
&[3., 2.],
])
.unwrap();
]);
let y: Vec<u32> = vec![1, 1, 1, 2, 2, 2];
let gnb = GaussianNB::fit(&x, &y, Default::default()).unwrap();
@@ -437,8 +438,7 @@ mod tests {
&[1., 1.],
&[2., 1.],
&[3., 2.],
])
.unwrap();
]);
let y: Vec<u32> = vec![1, 1, 1, 2, 2, 2];
let priors = vec![0.3, 0.7];
@@ -465,8 +465,7 @@ mod tests {
&[1., 1.],
&[2., 1.],
&[3., 2.],
])
.unwrap();
]);
let y: Vec<u32> = vec![1, 1, 1, 2, 2, 2];
let gnb = GaussianNB::fit(&x, &y, Default::default()).unwrap();
+10 -85
View File
@@ -40,7 +40,7 @@ use crate::linalg::basic::arrays::{Array1, Array2, ArrayView1};
use crate::numbers::basenum::Number;
#[cfg(feature = "serde")]
use serde::{Deserialize, Serialize};
use std::{cmp::Ordering, marker::PhantomData};
use std::marker::PhantomData;
/// Distribution used in the Naive Bayes classifier.
pub(crate) trait NBDistribution<X: Number, Y: Number>: Clone {
@@ -89,14 +89,14 @@ impl<TX: Number, TY: Number, X: Array2<TX>, Y: Array1<TY>, D: NBDistribution<TX,
/// Estimates the class labels for the provided data.
/// * `x` - data of shape NxM where N is number of data points to estimate and M is number of features.
///
/// Returns a vector of size N with class estimates.
pub fn predict(&self, x: &X) -> Result<Y, Failed> {
let y_classes = self.distribution.classes();
let predictions = x
.row_iter()
.map(|row| {
y_classes
let (rows, _) = x.shape();
let predictions = (0..rows)
.map(|row_index| {
let row = x.get_row(row_index);
let (prediction, _probability) = y_classes
.iter()
.enumerate()
.map(|(class_index, class)| {
@@ -106,26 +106,11 @@ impl<TX: Number, TY: Number, X: Array2<TX>, Y: Array1<TY>, D: NBDistribution<TX,
+ self.distribution.prior(class_index).ln(),
)
})
// For some reason, the max_by method cannot use NaNs for finding the maximum value, it panics.
// NaN must be considered as minimum values,
// therefore it's like NaNs would not be considered for choosing the maximum value.
// So we need to handle this case for avoiding panicking by using `Option::unwrap`.
.max_by(|(_, p1), (_, p2)| match p1.partial_cmp(p2) {
Some(ordering) => ordering,
None => {
if p1.is_nan() {
Ordering::Less
} else if p2.is_nan() {
Ordering::Greater
} else {
Ordering::Equal
}
}
})
.map(|(prediction, _probability)| *prediction)
.ok_or_else(|| Failed::predict("Failed to predict, there is no result"))
.max_by(|(_, p1), (_, p2)| p1.partial_cmp(p2).unwrap())
.unwrap();
*prediction
})
.collect::<Result<Vec<TY>, Failed>>()?;
.collect::<Vec<TY>>();
let y_hat = Y::from_vec_slice(&predictions);
Ok(y_hat)
}
@@ -134,63 +119,3 @@ pub mod bernoulli;
pub mod categorical;
pub mod gaussian;
pub mod multinomial;
#[cfg(test)]
mod tests {
use super::*;
use crate::linalg::basic::arrays::Array;
use crate::linalg::basic::matrix::DenseMatrix;
use num_traits::float::Float;
type Model<'d> = BaseNaiveBayes<i32, i32, DenseMatrix<i32>, Vec<i32>, TestDistribution<'d>>;
#[derive(Debug, PartialEq, Clone)]
struct TestDistribution<'d>(&'d Vec<i32>);
impl NBDistribution<i32, i32> for TestDistribution<'_> {
fn prior(&self, _class_index: usize) -> f64 {
1.
}
fn log_likelihood<'a>(
&'a self,
class_index: usize,
_j: &'a Box<dyn ArrayView1<i32> + 'a>,
) -> f64 {
match self.0.get(class_index) {
&v @ 2 | &v @ 10 | &v @ 20 => v as f64,
_ => f64::nan(),
}
}
fn classes(&self) -> &Vec<i32> {
self.0
}
}
#[test]
fn test_predict() {
let matrix = DenseMatrix::from_2d_array(&[&[1, 2, 3], &[4, 5, 6], &[7, 8, 9]]).unwrap();
let val = vec![];
match Model::fit(TestDistribution(&val)).unwrap().predict(&matrix) {
Ok(_) => panic!("Should return error in case of empty classes"),
Err(err) => assert_eq!(
err.to_string(),
"Predict failed: Failed to predict, there is no result"
),
}
let val = vec![1, 2, 3];
match Model::fit(TestDistribution(&val)).unwrap().predict(&matrix) {
Ok(r) => assert_eq!(r, vec![2, 2, 2]),
Err(_) => panic!("Should success in normal case with NaNs"),
}
let val = vec![20, 2, 10];
match Model::fit(TestDistribution(&val)).unwrap().predict(&matrix) {
Ok(r) => assert_eq!(r, vec![20, 20, 20]),
Err(_) => panic!("Should success in normal case without NaNs"),
}
}
}
+17 -17
View File
@@ -20,13 +20,13 @@
//! &[0, 2, 0, 0, 1, 0],
//! &[0, 1, 0, 1, 0, 0],
//! &[0, 1, 1, 0, 0, 1],
//! ]).unwrap();
//! ]);
//! let y: Vec<u32> = vec![0, 0, 0, 1];
//! let nb = MultinomialNB::fit(&x, &y, Default::default()).unwrap();
//!
//! // Testing data point is:
//! // Chinese Chinese Chinese Tokyo Japan
//! let x_test = DenseMatrix::from_2d_array(&[&[0, 3, 1, 0, 0, 1]]).unwrap();
//! let x_test = DenseMatrix::from_2d_array(&[&[0, 3, 1, 0, 0, 1]]);
//! let y_hat = nb.predict(&x_test).unwrap();
//! ```
//!
@@ -208,7 +208,7 @@ impl<TY: Number + Ord + Unsigned> MultinomialNBDistribution<TY> {
/// * `x` - training data.
/// * `y` - vector with target values (classes) of length N.
/// * `priors` - Optional vector with prior probabilities of the classes. If not defined,
/// priors are adjusted according to the data.
/// priors are adjusted according to the data.
/// * `alpha` - Additive (Laplace/Lidstone) smoothing parameter.
pub fn fit<TX: Number + Unsigned, X: Array2<TX>, Y: Array1<TY>>(
x: &X,
@@ -220,18 +220,21 @@ impl<TY: Number + Ord + Unsigned> MultinomialNBDistribution<TY> {
let y_samples = y.shape();
if y_samples != n_samples {
return Err(Failed::fit(&format!(
"Size of x should equal size of y; |x|=[{n_samples}], |y|=[{y_samples}]"
"Size of x should equal size of y; |x|=[{}], |y|=[{}]",
n_samples, y_samples
)));
}
if n_samples == 0 {
return Err(Failed::fit(&format!(
"Size of x and y should greater than 0; |x|=[{n_samples}]"
"Size of x and y should greater than 0; |x|=[{}]",
n_samples
)));
}
if alpha < 0f64 {
return Err(Failed::fit(&format!(
"Alpha should be greater than 0; |alpha|=[{alpha}]"
"Alpha should be greater than 0; |alpha|=[{}]",
alpha
)));
}
@@ -263,7 +266,8 @@ impl<TY: Number + Ord + Unsigned> MultinomialNBDistribution<TY> {
feature_in_class_counter[class_index][idx] +=
row_i.to_usize().ok_or_else(|| {
Failed::fit(&format!(
"Elements of the matrix should be convertible to usize |found|=[{row_i}]"
"Elements of the matrix should be convertible to usize |found|=[{}]",
row_i
))
})?;
}
@@ -345,10 +349,10 @@ impl<TX: Number + Unsigned, TY: Number + Ord + Unsigned, X: Array2<TX>, Y: Array
{
/// Fits MultinomialNB with given data
/// * `x` - training data of size NxM where N is the number of samples and M is the number of
/// features.
/// features.
/// * `y` - vector with target values (classes) of length N.
/// * `parameters` - additional parameters like class priors, alpha for smoothing and
/// binarizing threshold.
/// binarizing threshold.
pub fn fit(x: &X, y: &Y, parameters: MultinomialNBParameters) -> Result<Self, Failed> {
let distribution =
MultinomialNBDistribution::fit(x, y, parameters.alpha, parameters.priors)?;
@@ -358,7 +362,6 @@ impl<TX: Number + Unsigned, TY: Number + Ord + Unsigned, X: Array2<TX>, Y: Array
/// Estimates the class labels for the provided data.
/// * `x` - data of shape NxM where N is number of data points to estimate and M is number of features.
///
/// Returns a vector of size N with class estimates.
pub fn predict(&self, x: &X) -> Result<Y, Failed> {
self.inner.as_ref().unwrap().predict(x)
@@ -434,8 +437,7 @@ mod tests {
&[0, 2, 0, 0, 1, 0],
&[0, 1, 0, 1, 0, 0],
&[0, 1, 1, 0, 0, 1],
])
.unwrap();
]);
let y: Vec<u32> = vec![0, 0, 0, 1];
let mnb = MultinomialNB::fit(&x, &y, Default::default()).unwrap();
@@ -469,7 +471,7 @@ mod tests {
// Testing data point is:
// Chinese Chinese Chinese Tokyo Japan
let x_test = DenseMatrix::<u32>::from_2d_array(&[&[0, 3, 1, 0, 0, 1]]).unwrap();
let x_test = DenseMatrix::<u32>::from_2d_array(&[&[0, 3, 1, 0, 0, 1]]);
let y_hat = mnb.predict(&x_test).unwrap();
assert_eq!(y_hat, &[0]);
@@ -497,8 +499,7 @@ mod tests {
&[2, 0, 3, 3, 1, 2, 0, 2, 4, 1],
&[2, 4, 0, 4, 2, 4, 1, 3, 1, 4],
&[0, 2, 2, 3, 4, 0, 4, 4, 4, 4],
])
.unwrap();
]);
let y: Vec<u32> = vec![2, 2, 0, 0, 0, 2, 1, 1, 0, 1, 0, 0, 2, 0, 2];
let nb = MultinomialNB::fit(&x, &y, Default::default()).unwrap();
@@ -557,8 +558,7 @@ mod tests {
&[0, 1, 0, 0, 1, 0],
&[0, 1, 0, 1, 0, 0],
&[0, 1, 1, 0, 0, 1],
])
.unwrap();
]);
let y = vec![0, 0, 0, 1];
let mnb = MultinomialNB::fit(&x, &y, Default::default()).unwrap();
+8 -12
View File
@@ -22,7 +22,7 @@
//! &[3., 4.],
//! &[5., 6.],
//! &[7., 8.],
//! &[9., 10.]]).unwrap();
//! &[9., 10.]]);
//! let y = vec![2, 2, 2, 3, 3]; //your class labels
//!
//! let knn = KNNClassifier::fit(&x, &y, Default::default()).unwrap();
@@ -211,7 +211,7 @@ impl<TX: Number, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>, D: Distance<Vec
{
/// Fits KNN classifier to a NxM matrix where N is number of samples and M is number of features.
/// * `x` - training data
/// * `y` - vector with target values (classes) of length N
/// * `y` - vector with target values (classes) of length N
/// * `parameters` - additional parameters like search algorithm and k
pub fn fit(
x: &X,
@@ -236,7 +236,8 @@ impl<TX: Number, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>, D: Distance<Vec
if x_n != y_n {
return Err(Failed::fit(&format!(
"Size of x should equal size of y; |x|=[{x_n}], |y|=[{y_n}]"
"Size of x should equal size of y; |x|=[{}], |y|=[{}]",
x_n, y_n
)));
}
@@ -261,7 +262,6 @@ impl<TX: Number, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>, D: Distance<Vec
/// Estimates the class labels for the provided data.
/// * `x` - data of shape NxM where N is number of data points to estimate and M is number of features.
///
/// Returns a vector of size N with class estimates.
pub fn predict(&self, x: &X) -> Result<Y, Failed> {
let mut result = Y::zeros(x.shape().0);
@@ -312,8 +312,7 @@ mod tests {
#[test]
fn knn_fit_predict() {
let x =
DenseMatrix::from_2d_array(&[&[1., 2.], &[3., 4.], &[5., 6.], &[7., 8.], &[9., 10.]])
.unwrap();
DenseMatrix::from_2d_array(&[&[1., 2.], &[3., 4.], &[5., 6.], &[7., 8.], &[9., 10.]]);
let y = vec![2, 2, 2, 3, 3];
let knn = KNNClassifier::fit(&x, &y, Default::default()).unwrap();
let y_hat = knn.predict(&x).unwrap();
@@ -327,7 +326,7 @@ mod tests {
)]
#[test]
fn knn_fit_predict_weighted() {
let x = DenseMatrix::from_2d_array(&[&[1.], &[2.], &[3.], &[4.], &[5.]]).unwrap();
let x = DenseMatrix::from_2d_array(&[&[1.], &[2.], &[3.], &[4.], &[5.]]);
let y = vec![2, 2, 2, 3, 3];
let knn = KNNClassifier::fit(
&x,
@@ -338,9 +337,7 @@ mod tests {
.with_weight(KNNWeightFunction::Distance),
)
.unwrap();
let y_hat = knn
.predict(&DenseMatrix::from_2d_array(&[&[4.1]]).unwrap())
.unwrap();
let y_hat = knn.predict(&DenseMatrix::from_2d_array(&[&[4.1]])).unwrap();
assert_eq!(vec![3], y_hat);
}
@@ -352,8 +349,7 @@ mod tests {
#[cfg(feature = "serde")]
fn serde() {
let x =
DenseMatrix::from_2d_array(&[&[1., 2.], &[3., 4.], &[5., 6.], &[7., 8.], &[9., 10.]])
.unwrap();
DenseMatrix::from_2d_array(&[&[1., 2.], &[3., 4.], &[5., 6.], &[7., 8.], &[9., 10.]]);
let y = vec![2, 2, 2, 3, 3];
let knn = KNNClassifier::fit(&x, &y, Default::default()).unwrap();
+14 -13
View File
@@ -24,7 +24,7 @@
//! &[2., 2.],
//! &[3., 3.],
//! &[4., 4.],
//! &[5., 5.]]).unwrap();
//! &[5., 5.]]);
//! let y = vec![1., 2., 3., 4., 5.]; //your target values
//!
//! let knn = KNNRegressor::fit(&x, &y, Default::default()).unwrap();
@@ -88,21 +88,25 @@ pub struct KNNRegressor<TX: Number, TY: Number, X: Array2<TX>, Y: Array1<TY>, D:
impl<TX: Number, TY: Number, X: Array2<TX>, Y: Array1<TY>, D: Distance<Vec<TX>>>
KNNRegressor<TX, TY, X, Y, D>
{
///
fn y(&self) -> &Y {
self.y.as_ref().unwrap()
}
///
fn knn_algorithm(&self) -> &KNNAlgorithm<TX, D> {
self.knn_algorithm
.as_ref()
.expect("Missing parameter: KNNAlgorithm")
}
///
fn weight(&self) -> &KNNWeightFunction {
self.weight.as_ref().expect("Missing parameter: weight")
}
#[allow(dead_code)]
///
fn k(&self) -> usize {
self.k.unwrap()
}
@@ -203,7 +207,7 @@ impl<TX: Number, TY: Number, X: Array2<TX>, Y: Array1<TY>, D: Distance<Vec<TX>>>
{
/// Fits KNN regressor to a NxM matrix where N is number of samples and M is number of features.
/// * `x` - training data
/// * `y` - vector with real values
/// * `y` - vector with real values
/// * `parameters` - additional parameters like search algorithm and k
pub fn fit(
x: &X,
@@ -220,7 +224,8 @@ impl<TX: Number, TY: Number, X: Array2<TX>, Y: Array1<TY>, D: Distance<Vec<TX>>>
if x_n != y_n {
return Err(Failed::fit(&format!(
"Size of x should equal size of y; |x|=[{x_n}], |y|=[{y_n}]"
"Size of x should equal size of y; |x|=[{}], |y|=[{}]",
x_n, y_n
)));
}
@@ -246,7 +251,6 @@ impl<TX: Number, TY: Number, X: Array2<TX>, Y: Array1<TY>, D: Distance<Vec<TX>>>
/// Predict the target for the provided data.
/// * `x` - data of shape NxM where N is number of data points to estimate and M is number of features.
///
/// Returns a vector of size N with estimates.
pub fn predict(&self, x: &X) -> Result<Y, Failed> {
let mut result = Y::zeros(x.shape().0);
@@ -292,10 +296,9 @@ mod tests {
#[test]
fn knn_fit_predict_weighted() {
let x =
DenseMatrix::from_2d_array(&[&[1., 2.], &[3., 4.], &[5., 6.], &[7., 8.], &[9., 10.]])
.unwrap();
DenseMatrix::from_2d_array(&[&[1., 2.], &[3., 4.], &[5., 6.], &[7., 8.], &[9., 10.]]);
let y: Vec<f64> = vec![1., 2., 3., 4., 5.];
let y_exp = [1., 2., 3., 4., 5.];
let y_exp = vec![1., 2., 3., 4., 5.];
let knn = KNNRegressor::fit(
&x,
&y,
@@ -309,7 +312,7 @@ mod tests {
let y_hat = knn.predict(&x).unwrap();
assert_eq!(5, Vec::len(&y_hat));
for i in 0..y_hat.len() {
assert!((y_hat[i] - y_exp[i]).abs() < f64::EPSILON);
assert!((y_hat[i] - y_exp[i]).abs() < std::f64::EPSILON);
}
}
@@ -320,10 +323,9 @@ mod tests {
#[test]
fn knn_fit_predict_uniform() {
let x =
DenseMatrix::from_2d_array(&[&[1., 2.], &[3., 4.], &[5., 6.], &[7., 8.], &[9., 10.]])
.unwrap();
DenseMatrix::from_2d_array(&[&[1., 2.], &[3., 4.], &[5., 6.], &[7., 8.], &[9., 10.]]);
let y: Vec<f64> = vec![1., 2., 3., 4., 5.];
let y_exp = [2., 2., 3., 4., 4.];
let y_exp = vec![2., 2., 3., 4., 4.];
let knn = KNNRegressor::fit(&x, &y, Default::default()).unwrap();
let y_hat = knn.predict(&x).unwrap();
assert_eq!(5, Vec::len(&y_hat));
@@ -340,8 +342,7 @@ mod tests {
#[cfg(feature = "serde")]
fn serde() {
let x =
DenseMatrix::from_2d_array(&[&[1., 2.], &[3., 4.], &[5., 6.], &[7., 8.], &[9., 10.]])
.unwrap();
DenseMatrix::from_2d_array(&[&[1., 2.], &[3., 4.], &[5., 6.], &[7., 8.], &[9., 10.]]);
let y = vec![1., 2., 3., 4., 5.];
let knn = KNNRegressor::fit(&x, &y, Default::default()).unwrap();
+7 -2
View File
@@ -49,15 +49,20 @@ pub type KNNAlgorithmName = crate::algorithm::neighbour::KNNAlgorithmName;
/// Weight function that is used to determine estimated value.
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[derive(Debug, Clone, Default)]
#[derive(Debug, Clone)]
pub enum KNNWeightFunction {
/// All k nearest points are weighted equally
#[default]
Uniform,
/// k nearest points are weighted by the inverse of their distance. Closer neighbors will have a greater influence than neighbors which are further away.
Distance,
}
impl Default for KNNWeightFunction {
fn default() -> Self {
KNNWeightFunction::Uniform
}
}
impl KNNWeightFunction {
fn calc_weights(&self, distances: Vec<f64>) -> std::vec::Vec<f64> {
match *self {
+3 -26
View File
@@ -2,13 +2,9 @@
//! Most algorithms in `smartcore` rely on basic linear algebra operations like dot product, matrix decomposition and other subroutines that are defined for a set of real numbers, .
//! This module defines real number and some useful functions that are used in [Linear Algebra](../../linalg/index.html) module.
use rand::rngs::SmallRng;
use rand::{Rng, SeedableRng};
use num_traits::Float;
use crate::numbers::basenum::Number;
use crate::rand_custom::get_rng_impl;
/// Defines real number
/// <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS_CHTML"></script>
@@ -67,12 +63,8 @@ impl RealNumber for f64 {
}
fn rand() -> f64 {
let mut small_rng = get_rng_impl(None);
let mut rngs: Vec<SmallRng> = (0..3)
.map(|_| SmallRng::from_rng(&mut small_rng).unwrap())
.collect();
rngs[0].gen::<f64>()
// TODO: to be implemented, see issue smartcore#214
1.0
}
fn two() -> Self {
@@ -116,12 +108,7 @@ impl RealNumber for f32 {
}
fn rand() -> f32 {
let mut small_rng = get_rng_impl(None);
let mut rngs: Vec<SmallRng> = (0..3)
.map(|_| SmallRng::from_rng(&mut small_rng).unwrap())
.collect();
rngs[0].gen::<f32>()
1.0
}
fn two() -> Self {
@@ -162,14 +149,4 @@ mod tests {
fn f64_from_string() {
assert_eq!(f64::from_str("1.111111111").unwrap(), 1.111111111)
}
#[test]
fn f64_rand() {
f64::rand();
}
#[test]
fn f32_rand() {
f32::rand();
}
}
@@ -1,3 +1,5 @@
// TODO: missing documentation
use std::default::Default;
use crate::linalg::basic::arrays::Array1;
@@ -6,27 +8,30 @@ use crate::optimization::first_order::{FirstOrderOptimizer, OptimizerResult};
use crate::optimization::line_search::LineSearchMethod;
use crate::optimization::{DF, F};
/// Gradient Descent optimization algorithm
///
pub struct GradientDescent {
/// Maximum number of iterations
///
pub max_iter: usize,
/// Relative tolerance for the gradient norm
///
pub g_rtol: f64,
/// Absolute tolerance for the gradient norm
///
pub g_atol: f64,
}
///
impl Default for GradientDescent {
fn default() -> Self {
GradientDescent {
max_iter: 10000,
g_rtol: f64::EPSILON.sqrt(),
g_atol: f64::EPSILON,
g_rtol: std::f64::EPSILON.sqrt(),
g_atol: std::f64::EPSILON,
}
}
}
///
impl<T: FloatNumber> FirstOrderOptimizer<T> for GradientDescent {
///
fn optimize<'a, X: Array1<T>, LS: LineSearchMethod<T>>(
&self,
f: &'a F<'_, T, X>,
@@ -108,13 +113,12 @@ mod tests {
g[1] = 200. * (x[1] - x[0].powf(2.));
};
let ls: Backtracking<f64> = Backtracking::<f64> {
order: FunctionOrder::THIRD,
..Default::default()
};
let mut ls: Backtracking<f64> = Default::default();
ls.order = FunctionOrder::THIRD;
let optimizer: GradientDescent = Default::default();
let result = optimizer.optimize(&f, &df, &x0, &ls);
println!("{:?}", result);
assert!((result.f_x - 0.0).abs() < 1e-5);
assert!((result.x[0] - 1.0).abs() < 1e-2);
+29 -20
View File
@@ -11,29 +11,31 @@ use crate::optimization::first_order::{FirstOrderOptimizer, OptimizerResult};
use crate::optimization::line_search::LineSearchMethod;
use crate::optimization::{DF, F};
/// Limited-memory BFGS optimization algorithm
///
pub struct LBFGS {
/// Maximum number of iterations
///
pub max_iter: usize,
/// TODO: Add documentation
///
pub g_rtol: f64,
/// TODO: Add documentation
///
pub g_atol: f64,
/// TODO: Add documentation
///
pub x_atol: f64,
/// TODO: Add documentation
///
pub x_rtol: f64,
/// TODO: Add documentation
///
pub f_abstol: f64,
/// TODO: Add documentation
///
pub f_reltol: f64,
/// TODO: Add documentation
///
pub successive_f_tol: usize,
/// TODO: Add documentation
///
pub m: usize,
}
///
impl Default for LBFGS {
///
fn default() -> Self {
LBFGS {
max_iter: 1000,
@@ -49,7 +51,9 @@ impl Default for LBFGS {
}
}
///
impl LBFGS {
///
fn two_loops<T: FloatNumber + RealNumber, X: Array1<T>>(&self, state: &mut LBFGSState<T, X>) {
let lower = state.iteration.max(self.m) - self.m;
let upper = state.iteration;
@@ -91,6 +95,7 @@ impl LBFGS {
state.s.mul_scalar_mut(-T::one());
}
///
fn init_state<T: FloatNumber + RealNumber, X: Array1<T>>(&self, x: &X) -> LBFGSState<T, X> {
LBFGSState {
x: x.clone(),
@@ -114,6 +119,7 @@ impl LBFGS {
}
}
///
fn update_state<'a, T: FloatNumber + RealNumber, X: Array1<T>, LS: LineSearchMethod<T>>(
&self,
f: &'a F<'_, T, X>,
@@ -155,6 +161,7 @@ impl LBFGS {
df(&mut state.x_df, &state.x);
}
///
fn assess_convergence<T: FloatNumber, X: Array1<T>>(
&self,
state: &mut LBFGSState<T, X>,
@@ -166,7 +173,7 @@ impl LBFGS {
}
if state.x.max_diff(&state.x_prev)
<= T::from_f64(self.x_rtol * state.x.norm(f64::INFINITY)).unwrap()
<= T::from_f64(self.x_rtol * state.x.norm(std::f64::INFINITY)).unwrap()
{
x_converged = true;
}
@@ -181,16 +188,17 @@ impl LBFGS {
state.counter_f_tol += 1;
}
if state.x_df.norm(f64::INFINITY) <= self.g_atol {
if state.x_df.norm(std::f64::INFINITY) <= self.g_atol {
g_converged = true;
}
g_converged || x_converged || state.counter_f_tol > self.successive_f_tol
}
fn update_hessian<T: FloatNumber, X: Array1<T>>(
///
fn update_hessian<'a, T: FloatNumber, X: Array1<T>>(
&self,
_: &DF<'_, X>,
_: &'a DF<'_, X>,
state: &mut LBFGSState<T, X>,
) {
state.dg = state.x_df.sub(&state.x_df_prev);
@@ -204,6 +212,7 @@ impl LBFGS {
}
}
///
#[derive(Debug)]
struct LBFGSState<T: FloatNumber, X: Array1<T>> {
x: X,
@@ -225,7 +234,9 @@ struct LBFGSState<T: FloatNumber, X: Array1<T>> {
alpha: T,
}
///
impl<T: FloatNumber + RealNumber> FirstOrderOptimizer<T> for LBFGS {
///
fn optimize<'a, X: Array1<T>, LS: LineSearchMethod<T>>(
&self,
f: &F<'_, T, X>,
@@ -237,7 +248,7 @@ impl<T: FloatNumber + RealNumber> FirstOrderOptimizer<T> for LBFGS {
df(&mut state.x_df, x0);
let g_converged = state.x_df.norm(f64::INFINITY) < self.g_atol;
let g_converged = state.x_df.norm(std::f64::INFINITY) < self.g_atol;
let mut converged = g_converged;
let stopped = false;
@@ -280,15 +291,13 @@ mod tests {
g[0] = -2. * (1. - x[0]) - 400. * (x[1] - x[0].powf(2.)) * x[0];
g[1] = 200. * (x[1] - x[0].powf(2.));
};
let ls: Backtracking<f64> = Backtracking::<f64> {
order: FunctionOrder::THIRD,
..Default::default()
};
let mut ls: Backtracking<f64> = Default::default();
ls.order = FunctionOrder::THIRD;
let optimizer: LBFGS = Default::default();
let result = optimizer.optimize(&f, &df, &x0, &ls);
assert!((result.f_x - 0.0).abs() < f64::EPSILON);
assert!((result.f_x - 0.0).abs() < std::f64::EPSILON);
assert!((result.x[0] - 1.0).abs() < 1e-8);
assert!((result.x[1] - 1.0).abs() < 1e-8);
assert!(result.iterations <= 24);
+8 -8
View File
@@ -1,6 +1,6 @@
/// Gradient descent optimization algorithm
///
pub mod gradient_descent;
/// Limited-memory BFGS optimization algorithm
///
pub mod lbfgs;
use std::clone::Clone;
@@ -11,9 +11,9 @@ use crate::numbers::floatnum::FloatNumber;
use crate::optimization::line_search::LineSearchMethod;
use crate::optimization::{DF, F};
/// First-order optimization is a class of algorithms that use the first derivative of a function to find optimal solutions.
///
pub trait FirstOrderOptimizer<T: FloatNumber> {
/// run first order optimization
///
fn optimize<'a, X: Array1<T>, LS: LineSearchMethod<T>>(
&self,
f: &F<'_, T, X>,
@@ -23,13 +23,13 @@ pub trait FirstOrderOptimizer<T: FloatNumber> {
) -> OptimizerResult<T, X>;
}
/// Result of optimization
///
#[derive(Debug, Clone)]
pub struct OptimizerResult<T: FloatNumber, X: Array1<T>> {
/// Solution
///
pub x: X,
/// f(x) value
///
pub f_x: T,
/// number of iterations
///
pub iterations: usize,
}
+17 -12
View File
@@ -1,9 +1,11 @@
// TODO: missing documentation
use crate::optimization::FunctionOrder;
use num_traits::Float;
/// Line search optimization.
///
pub trait LineSearchMethod<T: Float> {
/// Find alpha that satisfies strong Wolfe conditions.
///
fn search(
&self,
f: &(dyn Fn(T) -> T),
@@ -14,31 +16,32 @@ pub trait LineSearchMethod<T: Float> {
) -> LineSearchResult<T>;
}
/// Line search result
///
#[derive(Debug, Clone)]
pub struct LineSearchResult<T: Float> {
/// Alpha value
///
pub alpha: T,
/// f(alpha) value
///
pub f_x: T,
}
/// Backtracking line search method.
///
pub struct Backtracking<T: Float> {
/// TODO: Add documentation
///
pub c1: T,
/// Maximum number of iterations for Backtracking single run
///
pub max_iterations: usize,
/// TODO: Add documentation
///
pub max_infinity_iterations: usize,
/// TODO: Add documentation
///
pub phi: T,
/// TODO: Add documentation
///
pub plo: T,
/// function order
///
pub order: FunctionOrder,
}
///
impl<T: Float> Default for Backtracking<T> {
fn default() -> Self {
Backtracking {
@@ -52,7 +55,9 @@ impl<T: Float> Default for Backtracking<T> {
}
}
///
impl<T: Float> LineSearchMethod<T> for Backtracking<T> {
///
fn search(
&self,
f: &(dyn Fn(T) -> T),
+9 -7
View File
@@ -1,19 +1,21 @@
/// first order optimization algorithms
// TODO: missing documentation
///
pub mod first_order;
/// line search algorithms
///
pub mod line_search;
/// Function f(x) = y
///
pub type F<'a, T, X> = dyn for<'b> Fn(&'b X) -> T + 'a;
/// Function df(x)
///
pub type DF<'a, X> = dyn for<'b> Fn(&'b mut X, &'b X) + 'a;
/// Function order
///
#[allow(clippy::upper_case_acronyms)]
#[derive(Debug, PartialEq, Eq)]
pub enum FunctionOrder {
/// Second order
///
SECOND,
/// Third order
///
THIRD,
}
+16 -16
View File
@@ -12,7 +12,7 @@
//! &[1.5, 2.0, 1.5, 4.0],
//! &[1.5, 1.0, 1.5, 5.0],
//! &[1.5, 2.0, 1.5, 6.0],
//! ]).unwrap();
//! ]);
//! let encoder_params = OneHotEncoderParams::from_cat_idx(&[1, 3]);
//! // Infer number of categories from data and return a reusable encoder
//! let encoder = OneHotEncoder::fit(&data, encoder_params).unwrap();
@@ -132,7 +132,8 @@ impl OneHotEncoder {
data.copy_col_as_vec(idx, &mut col_buf);
if !validate_col_is_categorical(&col_buf) {
let msg = format!(
"Column {idx} of data matrix containts non categorizable (integer) values"
"Column {} of data matrix containts non categorizable (integer) values",
idx
);
return Err(Failed::fit(&msg[..]));
}
@@ -181,7 +182,7 @@ impl OneHotEncoder {
match oh_vec {
None => {
// Since we support T types, bad value in a series causes in to be invalid
let msg = format!("At least one value in column {old_cidx} doesn't conform to category definition");
let msg = format!("At least one value in column {} doesn't conform to category definition", old_cidx);
return Err(Failed::transform(&msg[..]));
}
Some(v) => {
@@ -240,16 +241,14 @@ mod tests {
&[2.0, 1.5, 4.0],
&[1.0, 1.5, 5.0],
&[2.0, 1.5, 6.0],
])
.unwrap();
]);
let oh_enc = DenseMatrix::from_2d_array(&[
&[1.0, 0.0, 1.5, 1.0, 0.0, 0.0, 0.0],
&[0.0, 1.0, 1.5, 0.0, 1.0, 0.0, 0.0],
&[1.0, 0.0, 1.5, 0.0, 0.0, 1.0, 0.0],
&[0.0, 1.0, 1.5, 0.0, 0.0, 0.0, 1.0],
])
.unwrap();
]);
(orig, oh_enc)
}
@@ -261,16 +260,14 @@ mod tests {
&[1.5, 2.0, 1.5, 4.0],
&[1.5, 1.0, 1.5, 5.0],
&[1.5, 2.0, 1.5, 6.0],
])
.unwrap();
]);
let oh_enc = DenseMatrix::from_2d_array(&[
&[1.5, 1.0, 0.0, 1.5, 1.0, 0.0, 0.0, 0.0],
&[1.5, 0.0, 1.0, 1.5, 0.0, 1.0, 0.0, 0.0],
&[1.5, 1.0, 0.0, 1.5, 0.0, 0.0, 1.0, 0.0],
&[1.5, 0.0, 1.0, 1.5, 0.0, 0.0, 0.0, 1.0],
])
.unwrap();
]);
(orig, oh_enc)
}
@@ -281,7 +278,7 @@ mod tests {
)]
#[test]
fn hash_encode_f64_series() {
let series = [3.0, 1.0, 2.0, 1.0];
let series = vec![3.0, 1.0, 2.0, 1.0];
let hashable_series: Vec<CategoricalFloat> =
series.iter().map(|v| v.to_category()).collect();
let enc = CategoryMapper::from_positional_category_vec(hashable_series);
@@ -338,11 +335,14 @@ mod tests {
&[2.0, 1.5, 4.0],
&[1.0, 1.5, 5.0],
&[2.0, 1.5, 6.0],
])
.unwrap();
]);
let params = OneHotEncoderParams::from_cat_idx(&[1]);
let result = OneHotEncoder::fit(&m, params);
assert!(result.is_err());
match OneHotEncoder::fit(&m, params) {
Err(_) => {
assert!(true);
}
_ => assert!(false),
}
}
}
+51 -56
View File
@@ -11,7 +11,7 @@
//! vec![0.0, 0.0],
//! vec![1.0, 1.0],
//! vec![1.0, 1.0],
//! ]).unwrap();
//! ]);
//!
//! let standard_scaler =
//! numerical::StandardScaler::fit(&data, numerical::StandardScalerParameters::default())
@@ -24,7 +24,7 @@
//! vec![-1.0, -1.0],
//! vec![1.0, 1.0],
//! vec![1.0, 1.0],
//! ]).unwrap()
//! ])
//! );
//! ```
use std::marker::PhantomData;
@@ -172,14 +172,18 @@ where
T: Number + RealNumber,
M: Array2<T>,
{
columns.first().cloned().map(|output_matrix| {
columns
.iter()
.skip(1)
.fold(output_matrix, |current_matrix, new_colum| {
current_matrix.h_stack(new_colum)
})
})
if let Some(output_matrix) = columns.first().cloned() {
return Some(
columns
.iter()
.skip(1)
.fold(output_matrix, |current_matrix, new_colum| {
current_matrix.h_stack(new_colum)
}),
);
} else {
None
}
}
#[cfg(test)]
@@ -193,18 +197,15 @@ mod tests {
fn combine_three_columns() {
assert_eq!(
build_matrix_from_columns(vec![
DenseMatrix::from_2d_vec(&vec![vec![1.0], vec![1.0], vec![1.0],]).unwrap(),
DenseMatrix::from_2d_vec(&vec![vec![2.0], vec![2.0], vec![2.0],]).unwrap(),
DenseMatrix::from_2d_vec(&vec![vec![3.0], vec![3.0], vec![3.0],]).unwrap()
DenseMatrix::from_2d_vec(&vec![vec![1.0], vec![1.0], vec![1.0],]),
DenseMatrix::from_2d_vec(&vec![vec![2.0], vec![2.0], vec![2.0],]),
DenseMatrix::from_2d_vec(&vec![vec![3.0], vec![3.0], vec![3.0],])
]),
Some(
DenseMatrix::from_2d_vec(&vec![
vec![1.0, 2.0, 3.0],
vec![1.0, 2.0, 3.0],
vec![1.0, 2.0, 3.0]
])
.unwrap()
)
Some(DenseMatrix::from_2d_vec(&vec![
vec![1.0, 2.0, 3.0],
vec![1.0, 2.0, 3.0],
vec![1.0, 2.0, 3.0]
]))
)
}
@@ -286,24 +287,21 @@ mod tests {
/// sklearn.
#[test]
fn fit_transform_random_values() {
let transformed_values = fit_transform_with_default_standard_scaler(
&DenseMatrix::from_2d_array(&[
let transformed_values =
fit_transform_with_default_standard_scaler(&DenseMatrix::from_2d_array(&[
&[0.1004222429, 0.2194113576, 0.9310663354, 0.3313593793],
&[0.2045493861, 0.1683865411, 0.5071506765, 0.7257355264],
&[0.5708488802, 0.1846414616, 0.9590802982, 0.5591871046],
&[0.8387612750, 0.5754861361, 0.5537109852, 0.1077646442],
])
.unwrap(),
);
println!("{transformed_values}");
]));
println!("{}", transformed_values);
assert!(transformed_values.approximate_eq(
&DenseMatrix::from_2d_array(&[
&[-1.1154020653, -0.4031985330, 0.9284605204, -0.4271473866],
&[-0.7615464283, -0.7076698384, -1.1075452562, 1.2632979631],
&[0.4832504303, -0.6106747444, 1.0630075435, 0.5494084257],
&[1.3936980634, 1.7215431158, -0.8839228078, -1.3855590021],
])
.unwrap(),
]),
1.0
))
}
@@ -312,10 +310,13 @@ mod tests {
#[test]
fn fit_transform_with_zero_variance() {
assert_eq!(
fit_transform_with_default_standard_scaler(
&DenseMatrix::from_2d_array(&[&[1.0], &[1.0], &[1.0], &[1.0]]).unwrap()
),
DenseMatrix::from_2d_array(&[&[0.0], &[0.0], &[0.0], &[0.0]]).unwrap(),
fit_transform_with_default_standard_scaler(&DenseMatrix::from_2d_array(&[
&[1.0],
&[1.0],
&[1.0],
&[1.0]
])),
DenseMatrix::from_2d_array(&[&[0.0], &[0.0], &[0.0], &[0.0]]),
"When scaling values with zero variance, zero is expected as return value"
)
}
@@ -330,8 +331,7 @@ mod tests {
&[1.0, 2.0, 5.0],
&[1.0, 1.0, 1.0],
&[1.0, 2.0, 5.0]
])
.unwrap(),
]),
StandardScalerParameters::default(),
),
Ok(StandardScaler {
@@ -354,8 +354,7 @@ mod tests {
&[0.2045493861, 0.1683865411, 0.5071506765, 0.7257355264],
&[0.5708488802, 0.1846414616, 0.9590802982, 0.5591871046],
&[0.8387612750, 0.5754861361, 0.5537109852, 0.1077646442],
])
.unwrap(),
]),
StandardScalerParameters::default(),
)
.unwrap();
@@ -365,18 +364,17 @@ mod tests {
vec![0.42864544605, 0.2869813741, 0.737752073825, 0.431011663625],
);
assert!(&DenseMatrix::<f64>::from_2d_vec(&vec![fitted_scaler.stds])
.unwrap()
.approximate_eq(
assert!(
&DenseMatrix::<f64>::from_2d_vec(&vec![fitted_scaler.stds]).approximate_eq(
&DenseMatrix::from_2d_array(&[&[
0.29426447500954,
0.16758497615485,
0.20820945786863,
0.23329718831165
],])
.unwrap(),
],]),
0.00000000000001
))
)
)
}
/// If `with_std` is set to `false` the values should not be
@@ -394,9 +392,8 @@ mod tests {
};
assert_eq!(
standard_scaler
.transform(&DenseMatrix::from_2d_array(&[&[0.0, 2.0], &[2.0, 4.0]]).unwrap()),
Ok(DenseMatrix::from_2d_array(&[&[-1.0, -1.0], &[1.0, 1.0]]).unwrap())
standard_scaler.transform(&DenseMatrix::from_2d_array(&[&[0.0, 2.0], &[2.0, 4.0]])),
Ok(DenseMatrix::from_2d_array(&[&[-1.0, -1.0], &[1.0, 1.0]]))
)
}
@@ -416,8 +413,8 @@ mod tests {
assert_eq!(
standard_scaler
.transform(&DenseMatrix::from_2d_array(&[&[0.0, 9.0], &[4.0, 12.0]]).unwrap()),
Ok(DenseMatrix::from_2d_array(&[&[0.0, 3.0], &[2.0, 4.0]]).unwrap())
.transform(&DenseMatrix::from_2d_array(&[&[0.0, 9.0], &[4.0, 12.0]])),
Ok(DenseMatrix::from_2d_array(&[&[0.0, 3.0], &[2.0, 4.0]]))
)
}
@@ -436,8 +433,7 @@ mod tests {
&[0.2045493861, 0.1683865411, 0.5071506765, 0.7257355264],
&[0.5708488802, 0.1846414616, 0.9590802982, 0.5591871046],
&[0.8387612750, 0.5754861361, 0.5537109852, 0.1077646442],
])
.unwrap(),
]),
StandardScalerParameters::default(),
)
.unwrap();
@@ -450,18 +446,17 @@ mod tests {
vec![0.42864544605, 0.2869813741, 0.737752073825, 0.431011663625],
);
assert!(&DenseMatrix::from_2d_vec(&vec![deserialized_scaler.stds])
.unwrap()
.approximate_eq(
assert!(
&DenseMatrix::from_2d_vec(&vec![deserialized_scaler.stds]).approximate_eq(
&DenseMatrix::from_2d_array(&[&[
0.29426447500954,
0.16758497615485,
0.20820945786863,
0.23329718831165
],])
.unwrap(),
],]),
0.00000000000001
))
)
)
}
}
}
+4 -4
View File
@@ -206,7 +206,7 @@ mod tests {
#[test]
fn from_categories() {
let fake_categories: Vec<usize> = vec![1, 2, 3, 4, 5, 3, 5, 3, 1, 2, 4];
let it = fake_categories.iter().copied();
let it = fake_categories.iter().map(|&a| a);
let enc = CategoryMapper::<usize>::fit_to_iter(it);
let oh_vec: Vec<f64> = match enc.get_one_hot(&1) {
None => panic!("Wrong categories"),
@@ -218,8 +218,8 @@ mod tests {
fn build_fake_str_enc<'a>() -> CategoryMapper<&'a str> {
let fake_category_pos = vec!["background", "dog", "cat"];
CategoryMapper::<&str>::from_positional_category_vec(fake_category_pos)
let enc = CategoryMapper::<&str>::from_positional_category_vec(fake_category_pos);
enc
}
#[cfg_attr(
all(target_arch = "wasm32", not(target_os = "wasi")),
@@ -275,7 +275,7 @@ mod tests {
let lab = enc.invert_one_hot(res).unwrap();
assert_eq!(lab, "dog");
if let Err(e) = enc.invert_one_hot(vec![0.0, 0.0, 0.0]) {
let pos_entries = "Expected a single positive entry, 0 entires found".to_string();
let pos_entries = format!("Expected a single positive entry, 0 entires found");
assert_eq!(e, Failed::transform(&pos_entries[..]));
};
}
+14 -9
View File
@@ -30,7 +30,7 @@ pub struct CSVDefinition<'a> {
/// What seperates the fields in your csv-file?
field_seperator: &'a str,
}
impl Default for CSVDefinition<'_> {
impl<'a> Default for CSVDefinition<'a> {
fn default() -> Self {
Self {
n_rows_header: 1,
@@ -83,7 +83,7 @@ where
Matrix: Array2<T>,
{
let csv_text = read_string_from_source(source)?;
let rows: Vec<Vec<T>> = extract_row_vectors_from_csv_text(
let rows: Vec<Vec<T>> = extract_row_vectors_from_csv_text::<T, RowVector, Matrix>(
&csv_text,
&definition,
detect_row_format(&csv_text, &definition)?,
@@ -103,7 +103,12 @@ where
/// Given a string containing the contents of a csv file, extract its value
/// into row-vectors.
fn extract_row_vectors_from_csv_text<'a, T: Number + RealNumber + std::str::FromStr>(
fn extract_row_vectors_from_csv_text<
'a,
T: Number + RealNumber + std::str::FromStr,
RowVector: Array1<T>,
Matrix: Array2<T>,
>(
csv_text: &'a str,
definition: &'a CSVDefinition<'_>,
row_format: CSVRowFormat<'_>,
@@ -162,7 +167,7 @@ where
}
/// Ensure that a string containing a csv row conforms to a specified row format.
fn validate_csv_row(row: &str, row_format: &CSVRowFormat<'_>) -> Result<(), ReadingError> {
fn validate_csv_row<'a>(row: &'a str, row_format: &CSVRowFormat<'_>) -> Result<(), ReadingError> {
let actual_number_of_fields = row.split(row_format.field_seperator).count();
if row_format.n_fields == actual_number_of_fields {
Ok(())
@@ -203,7 +208,7 @@ where
match value_string.parse::<T>().ok() {
Some(value) => Ok(value),
None => Err(ReadingError::InvalidField {
msg: format!("Value '{value_string}' could not be read.",),
msg: format!("Value '{}' could not be read.", value_string,),
}),
}
}
@@ -238,8 +243,7 @@ mod tests {
&[5.1, 3.5, 1.4, 0.2],
&[4.9, 3.0, 1.4, 0.2],
&[4.7, 3.2, 1.3, 0.2],
])
.unwrap())
]))
)
}
#[test]
@@ -262,7 +266,7 @@ mod tests {
&[5.1, 3.5, 1.4, 0.2],
&[4.9, 3.0, 1.4, 0.2],
&[4.7, 3.2, 1.3, 0.2],
]).unwrap())
]))
)
}
#[test]
@@ -301,11 +305,12 @@ mod tests {
}
mod extract_row_vectors_from_csv_text {
use super::super::{extract_row_vectors_from_csv_text, CSVDefinition, CSVRowFormat};
use crate::linalg::basic::matrix::DenseMatrix;
#[test]
fn read_default_csv() {
assert_eq!(
extract_row_vectors_from_csv_text::<f64>(
extract_row_vectors_from_csv_text::<f64, Vec<_>, DenseMatrix<_>>(
"column 1, column 2, column3\n1.0,2.0,3.0\n4.0,5.0,6.0",
&CSVDefinition::default(),
CSVRowFormat {
+2 -2
View File
@@ -56,7 +56,7 @@ pub struct Kernels;
impl Kernels {
/// Return a default linear
pub fn linear() -> LinearKernel {
LinearKernel
LinearKernel::default()
}
/// Return a default RBF
pub fn rbf() -> RBFKernel {
@@ -292,7 +292,7 @@ mod tests {
.unwrap()
.abs();
assert!((4913f64 - result).abs() < f64::EPSILON);
assert!((4913f64 - result) < std::f64::EPSILON);
}
#[cfg_attr(
+83 -71
View File
@@ -53,7 +53,7 @@
//! &[4.9, 2.4, 3.3, 1.0],
//! &[6.6, 2.9, 4.6, 1.3],
//! &[5.2, 2.7, 3.9, 1.4],
//! ]).unwrap();
//! ]);
//! let y = vec![ -1, -1, -1, -1, -1, -1, -1, -1,
//! 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1];
//!
@@ -322,26 +322,19 @@ impl<'a, TX: Number + RealNumber, TY: Number + Ord, X: Array2<TX> + 'a, Y: Array
let (n, _) = x.shape();
let mut y_hat: Vec<TX> = Array1::zeros(n);
let mut row = Vec::with_capacity(n);
for i in 0..n {
row.clear();
row.extend(x.get_row(i).iterator(0).copied());
let row_pred: TX = self.predict_for_row(&row);
let row_pred: TX =
self.predict_for_row(Vec::from_iterator(x.get_row(i).iterator(0).copied(), n));
y_hat.set(i, row_pred);
}
Ok(y_hat)
}
fn predict_for_row(&self, x: &[TX]) -> TX {
fn predict_for_row(&self, x: Vec<TX>) -> TX {
let mut f = self.b.unwrap();
let xi: Vec<_> = x.iter().map(|e| e.to_f64().unwrap()).collect();
for i in 0..self.instances.as_ref().unwrap().len() {
let xj: Vec<_> = self.instances.as_ref().unwrap()[i]
.iter()
.map(|e| e.to_f64().unwrap())
.collect();
f += self.w.as_ref().unwrap()[i]
* TX::from(
self.parameters
@@ -350,7 +343,13 @@ impl<'a, TX: Number + RealNumber, TY: Number + Ord, X: Array2<TX> + 'a, Y: Array
.kernel
.as_ref()
.unwrap()
.apply(&xi, &xj)
.apply(
&x.iter().map(|e| e.to_f64().unwrap()).collect(),
&self.instances.as_ref().unwrap()[i]
.iter()
.map(|e| e.to_f64().unwrap())
.collect(),
)
.unwrap(),
)
.unwrap();
@@ -360,8 +359,8 @@ impl<'a, TX: Number + RealNumber, TY: Number + Ord, X: Array2<TX> + 'a, Y: Array
}
}
impl<TX: Number + RealNumber, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>> PartialEq
for SVC<'_, TX, TY, X, Y>
impl<'a, TX: Number + RealNumber, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>> PartialEq
for SVC<'a, TX, TY, X, Y>
{
fn eq(&self, other: &Self) -> bool {
if (self.b.unwrap().sub(other.b.unwrap())).abs() > TX::epsilon() * TX::two()
@@ -473,12 +472,14 @@ impl<'a, TX: Number + RealNumber, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>
let tol = self.parameters.tol;
let good_enough = TX::from_i32(1000).unwrap();
let mut x = Vec::with_capacity(n);
for _ in 0..self.parameters.epoch {
for i in self.permutate(n) {
x.clear();
x.extend(self.x.get_row(i).iterator(0).take(n).copied());
self.process(i, &x, *self.y.get(i), &mut cache);
self.process(
i,
Vec::from_iterator(self.x.get_row(i).iterator(0).copied(), n),
*self.y.get(i),
&mut cache,
);
loop {
self.reprocess(tol, &mut cache);
self.find_min_max_gradient();
@@ -510,17 +511,24 @@ impl<'a, TX: Number + RealNumber, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>
let mut cp = 0;
let mut cn = 0;
let mut x = Vec::with_capacity(n);
for i in self.permutate(n) {
x.clear();
x.extend(self.x.get_row(i).iterator(0).take(n).copied());
if *self.y.get(i) == TY::one() && cp < few {
if self.process(i, &x, *self.y.get(i), cache) {
if self.process(
i,
Vec::from_iterator(self.x.get_row(i).iterator(0).copied(), n),
*self.y.get(i),
cache,
) {
cp += 1;
}
} else if *self.y.get(i) == TY::from(-1).unwrap()
&& cn < few
&& self.process(i, &x, *self.y.get(i), cache)
&& self.process(
i,
Vec::from_iterator(self.x.get_row(i).iterator(0).copied(), n),
*self.y.get(i),
cache,
)
{
cn += 1;
}
@@ -531,7 +539,7 @@ impl<'a, TX: Number + RealNumber, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>
}
}
fn process(&mut self, i: usize, x: &[TX], y: TY, cache: &mut Cache<TX, TY, X, Y>) -> bool {
fn process(&mut self, i: usize, x: Vec<TX>, y: TY, cache: &mut Cache<TX, TY, X, Y>) -> bool {
for j in 0..self.sv.len() {
if self.sv[j].index == i {
return true;
@@ -543,14 +551,15 @@ impl<'a, TX: Number + RealNumber, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>
let mut cache_values: Vec<((usize, usize), TX)> = Vec::new();
for v in self.sv.iter() {
let xi: Vec<_> = v.x.iter().map(|e| e.to_f64().unwrap()).collect();
let xj: Vec<_> = x.iter().map(|e| e.to_f64().unwrap()).collect();
let k = self
.parameters
.kernel
.as_ref()
.unwrap()
.apply(&xi, &xj)
.apply(
&v.x.iter().map(|e| e.to_f64().unwrap()).collect(),
&x.iter().map(|e| e.to_f64().unwrap()).collect(),
)
.unwrap();
cache_values.push(((i, v.index), TX::from(k).unwrap()));
g -= v.alpha * k;
@@ -569,7 +578,7 @@ impl<'a, TX: Number + RealNumber, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>
cache.insert(v.0, v.1.to_f64().unwrap());
}
let x_f64: Vec<_> = x.iter().map(|e| e.to_f64().unwrap()).collect();
let x_f64 = x.iter().map(|e| e.to_f64().unwrap()).collect();
let k_v = self
.parameters
.kernel
@@ -692,10 +701,8 @@ impl<'a, TX: Number + RealNumber, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>
let km = sv1.k;
let gm = sv1.grad;
let mut best = 0f64;
let xi: Vec<_> = sv1.x.iter().map(|e| e.to_f64().unwrap()).collect();
for i in 0..self.sv.len() {
let v = &self.sv[i];
let xj: Vec<_> = v.x.iter().map(|e| e.to_f64().unwrap()).collect();
let z = v.grad - gm;
let k = cache.get(
sv1,
@@ -704,7 +711,10 @@ impl<'a, TX: Number + RealNumber, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>
.kernel
.as_ref()
.unwrap()
.apply(&xi, &xj)
.apply(
&sv1.x.iter().map(|e| e.to_f64().unwrap()).collect(),
&v.x.iter().map(|e| e.to_f64().unwrap()).collect(),
)
.unwrap(),
);
let mut curv = km + v.k - 2f64 * k;
@@ -722,12 +732,6 @@ impl<'a, TX: Number + RealNumber, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>
}
}
let xi: Vec<_> = self.sv[idx_1]
.x
.iter()
.map(|e| e.to_f64().unwrap())
.collect::<Vec<_>>();
idx_2.map(|idx_2| {
(
idx_1,
@@ -738,12 +742,16 @@ impl<'a, TX: Number + RealNumber, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>
.as_ref()
.unwrap()
.apply(
&xi,
&self.sv[idx_1]
.x
.iter()
.map(|e| e.to_f64().unwrap())
.collect(),
&self.sv[idx_2]
.x
.iter()
.map(|e| e.to_f64().unwrap())
.collect::<Vec<_>>(),
.collect(),
)
.unwrap()
}),
@@ -757,11 +765,8 @@ impl<'a, TX: Number + RealNumber, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>
let km = sv2.k;
let gm = sv2.grad;
let mut best = 0f64;
let xi: Vec<_> = sv2.x.iter().map(|e| e.to_f64().unwrap()).collect();
for i in 0..self.sv.len() {
let v = &self.sv[i];
let xj: Vec<_> = v.x.iter().map(|e| e.to_f64().unwrap()).collect();
let z = gm - v.grad;
let k = cache.get(
sv2,
@@ -770,7 +775,10 @@ impl<'a, TX: Number + RealNumber, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>
.kernel
.as_ref()
.unwrap()
.apply(&xi, &xj)
.apply(
&sv2.x.iter().map(|e| e.to_f64().unwrap()).collect(),
&v.x.iter().map(|e| e.to_f64().unwrap()).collect(),
)
.unwrap(),
);
let mut curv = km + v.k - 2f64 * k;
@@ -789,12 +797,6 @@ impl<'a, TX: Number + RealNumber, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>
}
}
let xj: Vec<_> = self.sv[idx_2]
.x
.iter()
.map(|e| e.to_f64().unwrap())
.collect();
idx_1.map(|idx_1| {
(
idx_1,
@@ -809,8 +811,12 @@ impl<'a, TX: Number + RealNumber, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>
.x
.iter()
.map(|e| e.to_f64().unwrap())
.collect::<Vec<_>>(),
&xj,
.collect(),
&self.sv[idx_2]
.x
.iter()
.map(|e| e.to_f64().unwrap())
.collect(),
)
.unwrap()
}),
@@ -829,12 +835,12 @@ impl<'a, TX: Number + RealNumber, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>
.x
.iter()
.map(|e| e.to_f64().unwrap())
.collect::<Vec<_>>(),
.collect(),
&self.sv[idx_2]
.x
.iter()
.map(|e| e.to_f64().unwrap())
.collect::<Vec<_>>(),
.collect(),
)
.unwrap(),
)),
@@ -889,10 +895,7 @@ impl<'a, TX: Number + RealNumber, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>
self.sv[v1].alpha -= step.to_f64().unwrap();
self.sv[v2].alpha += step.to_f64().unwrap();
let xi_v1: Vec<_> = self.sv[v1].x.iter().map(|e| e.to_f64().unwrap()).collect();
let xi_v2: Vec<_> = self.sv[v2].x.iter().map(|e| e.to_f64().unwrap()).collect();
for i in 0..self.sv.len() {
let xj: Vec<_> = self.sv[i].x.iter().map(|e| e.to_f64().unwrap()).collect();
let k2 = cache.get(
&self.sv[v2],
&self.sv[i],
@@ -900,7 +903,10 @@ impl<'a, TX: Number + RealNumber, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>
.kernel
.as_ref()
.unwrap()
.apply(&xi_v2, &xj)
.apply(
&self.sv[v2].x.iter().map(|e| e.to_f64().unwrap()).collect(),
&self.sv[i].x.iter().map(|e| e.to_f64().unwrap()).collect(),
)
.unwrap(),
);
let k1 = cache.get(
@@ -910,7 +916,10 @@ impl<'a, TX: Number + RealNumber, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>
.kernel
.as_ref()
.unwrap()
.apply(&xi_v1, &xj)
.apply(
&self.sv[v1].x.iter().map(|e| e.to_f64().unwrap()).collect(),
&self.sv[i].x.iter().map(|e| e.to_f64().unwrap()).collect(),
)
.unwrap(),
);
self.sv[i].grad -= step.to_f64().unwrap() * (k2 - k1);
@@ -957,8 +966,7 @@ mod tests {
&[4.9, 2.4, 3.3, 1.0],
&[6.6, 2.9, 4.6, 1.3],
&[5.2, 2.7, 3.9, 1.4],
])
.unwrap();
]);
let y: Vec<i32> = vec![
-1, -1, -1, -1, -1, -1, -1, -1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
@@ -975,7 +983,11 @@ mod tests {
.unwrap();
let acc = accuracy(&y, &(y_hat.iter().map(|e| e.to_i32().unwrap()).collect()));
assert!(acc >= 0.9, "accuracy ({acc}) is not larger or equal to 0.9");
assert!(
acc >= 0.9,
"accuracy ({}) is not larger or equal to 0.9",
acc
);
}
#[cfg_attr(
@@ -984,8 +996,7 @@ mod tests {
)]
#[test]
fn svc_fit_decision_function() {
let x = DenseMatrix::from_2d_array(&[&[4.0, 0.0], &[0.0, 4.0], &[8.0, 0.0], &[0.0, 8.0]])
.unwrap();
let x = DenseMatrix::from_2d_array(&[&[4.0, 0.0], &[0.0, 4.0], &[8.0, 0.0], &[0.0, 8.0]]);
let x2 = DenseMatrix::from_2d_array(&[
&[3.0, 3.0],
@@ -994,8 +1005,7 @@ mod tests {
&[10.0, 10.0],
&[1.0, 1.0],
&[0.0, 0.0],
])
.unwrap();
]);
let y: Vec<i32> = vec![-1, -1, 1, 1];
@@ -1048,8 +1058,7 @@ mod tests {
&[4.9, 2.4, 3.3, 1.0],
&[6.6, 2.9, 4.6, 1.3],
&[5.2, 2.7, 3.9, 1.4],
])
.unwrap();
]);
let y: Vec<i32> = vec![
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
@@ -1067,7 +1076,11 @@ mod tests {
let acc = accuracy(&y, &(y_hat.iter().map(|e| e.to_i32().unwrap()).collect()));
assert!(acc >= 0.9, "accuracy ({acc}) is not larger or equal to 0.9");
assert!(
acc >= 0.9,
"accuracy ({}) is not larger or equal to 0.9",
acc
);
}
#[cfg_attr(
@@ -1098,8 +1111,7 @@ mod tests {
&[4.9, 2.4, 3.3, 1.0],
&[6.6, 2.9, 4.6, 1.3],
&[5.2, 2.7, 3.9, 1.4],
])
.unwrap();
]);
let y: Vec<i32> = vec![
-1, -1, -1, -1, -1, -1, -1, -1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
@@ -1110,7 +1122,7 @@ mod tests {
let svc = SVC::fit(&x, &y, &params).unwrap();
// serialization
let deserialized_svc: SVC<'_, f64, i32, _, _> =
let deserialized_svc: SVC<f64, i32, _, _> =
serde_json::from_str(&serde_json::to_string(&svc).unwrap()).unwrap();
assert_eq!(svc, deserialized_svc);
+16 -16
View File
@@ -44,7 +44,7 @@
//! &[502.601, 393.1, 251.4, 125.368, 1960., 69.564],
//! &[518.173, 480.6, 257.2, 127.852, 1961., 69.331],
//! &[554.894, 400.7, 282.7, 130.081, 1962., 70.551],
//! ]).unwrap();
//! ]);
//!
//! let y: Vec<f64> = vec![83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0,
//! 100.0, 101.2, 104.6, 108.4, 110.8, 112.6, 114.2, 115.7, 116.9];
@@ -248,20 +248,19 @@ impl<'a, T: Number + FloatNumber + PartialOrd, X: Array2<T>, Y: Array1<T>> SVR<'
let mut y_hat: Vec<T> = Vec::<T>::zeros(n);
let mut x_i = Vec::with_capacity(n);
for i in 0..n {
x_i.clear();
x_i.extend(x.get_row(i).iterator(0).copied());
y_hat.set(i, self.predict_for_row(&x_i));
y_hat.set(
i,
self.predict_for_row(Vec::from_iterator(x.get_row(i).iterator(0).copied(), n)),
);
}
Ok(y_hat)
}
pub(crate) fn predict_for_row(&self, x: &[T]) -> T {
pub(crate) fn predict_for_row(&self, x: Vec<T>) -> T {
let mut f = self.b;
let xi: Vec<_> = x.iter().map(|e| e.to_f64().unwrap()).collect();
for i in 0..self.instances.as_ref().unwrap().len() {
f += self.w.as_ref().unwrap()[i]
* T::from(
@@ -271,7 +270,10 @@ impl<'a, T: Number + FloatNumber + PartialOrd, X: Array2<T>, Y: Array1<T>> SVR<'
.kernel
.as_ref()
.unwrap()
.apply(&xi, &self.instances.as_ref().unwrap()[i])
.apply(
&x.iter().map(|e| e.to_f64().unwrap()).collect(),
&self.instances.as_ref().unwrap()[i],
)
.unwrap(),
)
.unwrap()
@@ -281,8 +283,8 @@ impl<'a, T: Number + FloatNumber + PartialOrd, X: Array2<T>, Y: Array1<T>> SVR<'
}
}
impl<T: Number + FloatNumber + PartialOrd, X: Array2<T>, Y: Array1<T>> PartialEq
for SVR<'_, T, X, Y>
impl<'a, T: Number + FloatNumber + PartialOrd, X: Array2<T>, Y: Array1<T>> PartialEq
for SVR<'a, T, X, Y>
{
fn eq(&self, other: &Self) -> bool {
if (self.b - other.b).abs() > T::epsilon() * T::two()
@@ -640,8 +642,7 @@ mod tests {
&[502.601, 393.1, 251.4, 125.368, 1960., 69.564],
&[518.173, 480.6, 257.2, 127.852, 1961., 69.331],
&[554.894, 400.7, 282.7, 130.081, 1962., 70.551],
])
.unwrap();
]);
let y: Vec<f64> = vec![
83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0, 100.0, 101.2, 104.6, 108.4, 110.8, 112.6,
@@ -661,7 +662,7 @@ mod tests {
.unwrap();
let t = mean_squared_error(&y_hat, &y);
println!("{t:?}");
println!("{:?}", t);
assert!(t < 2.5);
}
@@ -689,8 +690,7 @@ mod tests {
&[502.601, 393.1, 251.4, 125.368, 1960., 69.564],
&[518.173, 480.6, 257.2, 127.852, 1961., 69.331],
&[554.894, 400.7, 282.7, 130.081, 1962., 70.551],
])
.unwrap();
]);
let y: Vec<f64> = vec![
83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0, 100.0, 101.2, 104.6, 108.4, 110.8, 112.6,
@@ -702,7 +702,7 @@ mod tests {
let svr = SVR::fit(&x, &y, &params).unwrap();
let deserialized_svr: SVR<'_, f64, DenseMatrix<f64>, _> =
let deserialized_svr: SVR<f64, DenseMatrix<f64>, _> =
serde_json::from_str(&serde_json::to_string(&svr).unwrap()).unwrap();
assert_eq!(svr, deserialized_svr);
+51 -235
View File
@@ -48,7 +48,7 @@
//! &[4.9, 2.4, 3.3, 1.0],
//! &[6.6, 2.9, 4.6, 1.3],
//! &[5.2, 2.7, 3.9, 1.4],
//! ]).unwrap();
//! ]);
//! let y = vec![ 0, 0, 0, 0, 0, 0, 0, 0,
//! 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1];
//!
@@ -77,9 +77,7 @@ use serde::{Deserialize, Serialize};
use crate::api::{Predictor, SupervisedEstimator};
use crate::error::Failed;
use crate::linalg::basic::arrays::MutArray;
use crate::linalg::basic::arrays::{Array1, Array2, MutArrayView1};
use crate::linalg::basic::matrix::DenseMatrix;
use crate::numbers::basenum::Number;
use crate::rand_custom::get_rng_impl;
@@ -118,7 +116,6 @@ pub struct DecisionTreeClassifier<
num_classes: usize,
classes: Vec<TY>,
depth: u16,
num_features: usize,
_phantom_tx: PhantomData<TX>,
_phantom_x: PhantomData<X>,
_phantom_y: PhantomData<Y>,
@@ -140,17 +137,16 @@ impl<TX: Number + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>>
self.classes.as_ref()
}
/// Get depth of tree
pub fn depth(&self) -> u16 {
fn depth(&self) -> u16 {
self.depth
}
}
/// The function to measure the quality of a split.
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[derive(Debug, Clone, Default)]
#[derive(Debug, Clone)]
pub enum SplitCriterion {
/// [Gini index](../decision_tree_classifier/index.html)
#[default]
Gini,
/// [Entropy](../decision_tree_classifier/index.html)
Entropy,
@@ -158,17 +154,21 @@ pub enum SplitCriterion {
ClassificationError,
}
impl Default for SplitCriterion {
fn default() -> Self {
SplitCriterion::Gini
}
}
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[derive(Debug, Clone)]
struct Node {
output: usize,
n_node_samples: usize,
split_feature: usize,
split_value: Option<f64>,
split_score: Option<f64>,
true_child: Option<usize>,
false_child: Option<usize>,
impurity: Option<f64>,
}
impl<TX: Number + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>> PartialEq
@@ -199,12 +199,12 @@ impl PartialEq for Node {
self.output == other.output
&& self.split_feature == other.split_feature
&& match (self.split_value, other.split_value) {
(Some(a), Some(b)) => (a - b).abs() < f64::EPSILON,
(Some(a), Some(b)) => (a - b).abs() < std::f64::EPSILON,
(None, None) => true,
_ => false,
}
&& match (self.split_score, other.split_score) {
(Some(a), Some(b)) => (a - b).abs() < f64::EPSILON,
(Some(a), Some(b)) => (a - b).abs() < std::f64::EPSILON,
(None, None) => true,
_ => false,
}
@@ -405,16 +405,14 @@ impl Default for DecisionTreeClassifierSearchParameters {
}
impl Node {
fn new(output: usize, n_node_samples: usize) -> Self {
fn new(output: usize) -> Self {
Node {
output,
n_node_samples,
split_feature: 0,
split_value: Option::None,
split_score: Option::None,
true_child: Option::None,
false_child: Option::None,
impurity: Option::None,
}
}
}
@@ -514,7 +512,6 @@ impl<TX: Number + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>>
num_classes: 0usize,
classes: vec![],
depth: 0u16,
num_features: 0usize,
_phantom_tx: PhantomData,
_phantom_x: PhantomData,
_phantom_y: PhantomData,
@@ -546,10 +543,6 @@ impl<TX: Number + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>>
parameters: DecisionTreeClassifierParameters,
) -> Result<DecisionTreeClassifier<TX, TY, X, Y>, Failed> {
let (x_nrows, num_attributes) = x.shape();
if x_nrows != y.shape() {
return Err(Failed::fit("Size of x should equal size of y"));
}
let samples = vec![1; x_nrows];
DecisionTreeClassifier::fit_weak_learner(x, y, samples, num_attributes, parameters)
}
@@ -567,7 +560,8 @@ impl<TX: Number + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>>
let k = classes.len();
if k < 2 {
return Err(Failed::fit(&format!(
"Incorrect number of classes: {k}. Should be >= 2."
"Incorrect number of classes: {}. Should be >= 2.",
k
)));
}
@@ -586,7 +580,7 @@ impl<TX: Number + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>>
count[yi[i]] += samples[i];
}
let root = Node::new(which_max(&count), y_ncols);
let root = Node::new(which_max(&count));
change_nodes.push(root);
let mut order: Vec<Vec<usize>> = Vec::new();
@@ -601,7 +595,6 @@ impl<TX: Number + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>>
num_classes: k,
classes,
depth: 0u16,
num_features: num_attributes,
_phantom_tx: PhantomData,
_phantom_x: PhantomData,
_phantom_y: PhantomData,
@@ -615,7 +608,7 @@ impl<TX: Number + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>>
visitor_queue.push_back(visitor);
}
while tree.depth() < tree.parameters().max_depth.unwrap_or(u16::MAX) {
while tree.depth() < tree.parameters().max_depth.unwrap_or(std::u16::MAX) {
match visitor_queue.pop_front() {
Some(node) => tree.split(node, mtry, &mut visitor_queue, &mut rng),
None => break,
@@ -652,7 +645,7 @@ impl<TX: Number + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>>
if node.true_child.is_none() && node.false_child.is_none() {
result = node.output;
} else if x.get((row, node.split_feature)).to_f64().unwrap()
<= node.split_value.unwrap_or(f64::NAN)
<= node.split_value.unwrap_or(std::f64::NAN)
{
queue.push_back(node.true_child.unwrap());
} else {
@@ -687,7 +680,16 @@ impl<TX: Number + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>>
}
}
if is_pure {
return false;
}
let n = visitor.samples.iter().sum();
if n <= self.parameters().min_samples_split {
return false;
}
let mut count = vec![0; self.num_classes];
let mut false_count = vec![0; self.num_classes];
for i in 0..n_rows {
@@ -696,15 +698,7 @@ impl<TX: Number + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>>
}
}
self.nodes[visitor.node].impurity = Some(impurity(&self.parameters().criterion, &count, n));
if is_pure {
return false;
}
if n <= self.parameters().min_samples_split {
return false;
}
let parent_impurity = impurity(&self.parameters().criterion, &count, n);
let mut variables = (0..n_attr).collect::<Vec<_>>();
@@ -713,7 +707,14 @@ impl<TX: Number + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>>
}
for variable in variables.iter().take(mtry) {
self.find_best_split(visitor, n, &count, &mut false_count, *variable);
self.find_best_split(
visitor,
n,
&count,
&mut false_count,
parent_impurity,
*variable,
);
}
self.nodes()[visitor.node].split_score.is_some()
@@ -725,6 +726,7 @@ impl<TX: Number + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>>
n: usize,
count: &[usize],
false_count: &mut [usize],
parent_impurity: f64,
j: usize,
) {
let mut true_count = vec![0; self.num_classes];
@@ -760,7 +762,6 @@ impl<TX: Number + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>>
let true_label = which_max(&true_count);
let false_label = which_max(false_count);
let parent_impurity = self.nodes()[visitor.node].impurity.unwrap();
let gain = parent_impurity
- tc as f64 / n as f64
* impurity(&self.parameters().criterion, &true_count, tc)
@@ -805,7 +806,9 @@ impl<TX: Number + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>>
.get((i, self.nodes()[visitor.node].split_feature))
.to_f64()
.unwrap()
<= self.nodes()[visitor.node].split_value.unwrap_or(f64::NAN)
<= self.nodes()[visitor.node]
.split_value
.unwrap_or(std::f64::NAN)
{
*true_sample = visitor.samples[i];
tc += *true_sample;
@@ -826,9 +829,9 @@ impl<TX: Number + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>>
let true_child_idx = self.nodes().len();
self.nodes.push(Node::new(visitor.true_child_output, tc));
self.nodes.push(Node::new(visitor.true_child_output));
let false_child_idx = self.nodes().len();
self.nodes.push(Node::new(visitor.false_child_output, fc));
self.nodes.push(Node::new(visitor.false_child_output));
self.nodes[visitor.node].true_child = Some(true_child_idx);
self.nodes[visitor.node].false_child = Some(false_child_idx);
@@ -862,104 +865,11 @@ impl<TX: Number + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>>
true
}
/// Compute feature importances for the fitted tree.
pub fn compute_feature_importances(&self, normalize: bool) -> Vec<f64> {
let mut importances = vec![0f64; self.num_features];
for node in self.nodes().iter() {
if node.true_child.is_none() && node.false_child.is_none() {
continue;
}
let left = &self.nodes()[node.true_child.unwrap()];
let right = &self.nodes()[node.false_child.unwrap()];
importances[node.split_feature] += node.n_node_samples as f64 * node.impurity.unwrap()
- left.n_node_samples as f64 * left.impurity.unwrap()
- right.n_node_samples as f64 * right.impurity.unwrap();
}
for item in importances.iter_mut() {
*item /= self.nodes()[0].n_node_samples as f64;
}
if normalize {
let sum = importances.iter().sum::<f64>();
for importance in importances.iter_mut() {
*importance /= sum;
}
}
importances
}
/// Predict class probabilities for the input samples.
///
/// # Arguments
///
/// * `x` - The input samples as a matrix where each row is a sample and each column is a feature.
///
/// # Returns
///
/// A `Result` containing a `DenseMatrix<f64>` where each row corresponds to a sample and each column
/// corresponds to a class. The values represent the probability of the sample belonging to each class.
///
/// # Errors
///
/// Returns an error if at least one row prediction process fails.
pub fn predict_proba(&self, x: &X) -> Result<DenseMatrix<f64>, Failed> {
let (n_samples, _) = x.shape();
let n_classes = self.classes().len();
let mut result = DenseMatrix::<f64>::zeros(n_samples, n_classes);
for i in 0..n_samples {
let probs = self.predict_proba_for_row(x, i)?;
for (j, &prob) in probs.iter().enumerate() {
result.set((i, j), prob);
}
}
Ok(result)
}
/// Predict class probabilities for a single input sample.
///
/// # Arguments
///
/// * `x` - The input matrix containing all samples.
/// * `row` - The index of the row in `x` for which to predict probabilities.
///
/// # Returns
///
/// A vector of probabilities, one for each class, representing the probability
/// of the input sample belonging to each class.
fn predict_proba_for_row(&self, x: &X, row: usize) -> Result<Vec<f64>, Failed> {
let mut node = 0;
while let Some(current_node) = self.nodes().get(node) {
if current_node.true_child.is_none() && current_node.false_child.is_none() {
// Leaf node reached
let mut probs = vec![0.0; self.classes().len()];
probs[current_node.output] = 1.0;
return Ok(probs);
}
let split_feature = current_node.split_feature;
let split_value = current_node.split_value.unwrap_or(f64::NAN);
if x.get((row, split_feature)).to_f64().unwrap() <= split_value {
node = current_node.true_child.unwrap();
} else {
node = current_node.false_child.unwrap();
}
}
// This should never happen if the tree is properly constructed
Err(Failed::predict("Nodes iteration did not reach leaf"))
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::linalg::basic::arrays::Array;
use crate::linalg::basic::matrix::DenseMatrix;
#[test]
@@ -991,60 +901,17 @@ mod tests {
)]
#[test]
fn gini_impurity() {
assert!((impurity(&SplitCriterion::Gini, &[7, 3], 10) - 0.42).abs() < f64::EPSILON);
assert!(
(impurity(&SplitCriterion::Entropy, &[7, 3], 10) - 0.8812908992306927).abs()
< f64::EPSILON
(impurity(&SplitCriterion::Gini, &vec![7, 3], 10) - 0.42).abs() < std::f64::EPSILON
);
assert!(
(impurity(&SplitCriterion::ClassificationError, &[7, 3], 10) - 0.3).abs()
< f64::EPSILON
(impurity(&SplitCriterion::Entropy, &vec![7, 3], 10) - 0.8812908992306927).abs()
< std::f64::EPSILON
);
assert!(
(impurity(&SplitCriterion::ClassificationError, &vec![7, 3], 10) - 0.3).abs()
< std::f64::EPSILON
);
}
#[cfg_attr(
all(target_arch = "wasm32", not(target_os = "wasi")),
wasm_bindgen_test::wasm_bindgen_test
)]
#[test]
fn test_predict_proba() {
let x: DenseMatrix<f64> = DenseMatrix::from_2d_array(&[
&[5.1, 3.5, 1.4, 0.2],
&[4.9, 3.0, 1.4, 0.2],
&[4.7, 3.2, 1.3, 0.2],
&[4.6, 3.1, 1.5, 0.2],
&[5.0, 3.6, 1.4, 0.2],
&[7.0, 3.2, 4.7, 1.4],
&[6.4, 3.2, 4.5, 1.5],
&[6.9, 3.1, 4.9, 1.5],
&[5.5, 2.3, 4.0, 1.3],
&[6.5, 2.8, 4.6, 1.5],
])
.unwrap();
let y: Vec<usize> = vec![0, 0, 0, 0, 0, 1, 1, 1, 1, 1];
let tree = DecisionTreeClassifier::fit(&x, &y, Default::default()).unwrap();
let probabilities = tree.predict_proba(&x).unwrap();
assert_eq!(probabilities.shape(), (10, 2));
for row in 0..10 {
let row_sum: f64 = probabilities.get_row(row).sum();
assert!(
(row_sum - 1.0).abs() < 1e-6,
"Row probabilities should sum to 1"
);
}
// Check if the first 5 samples have higher probability for class 0
for i in 0..5 {
assert!(probabilities.get((i, 0)) > probabilities.get((i, 1)));
}
// Check if the last 5 samples have higher probability for class 1
for i in 5..10 {
assert!(probabilities.get((i, 1)) > probabilities.get((i, 0)));
}
}
#[cfg_attr(
@@ -1075,8 +942,7 @@ mod tests {
&[4.9, 2.4, 3.3, 1.0],
&[6.6, 2.9, 4.6, 1.3],
&[5.2, 2.7, 3.9, 1.4],
])
.unwrap();
]);
let y: Vec<u32> = vec![0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1];
assert_eq!(
@@ -1105,17 +971,6 @@ mod tests {
);
}
#[test]
fn test_random_matrix_with_wrong_rownum() {
let x_rand: DenseMatrix<f64> = DenseMatrix::<f64>::rand(21, 200);
let y: Vec<u32> = vec![0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1];
let fail = DecisionTreeClassifier::fit(&x_rand, &y, Default::default());
assert!(fail.is_err());
}
#[cfg_attr(
all(target_arch = "wasm32", not(target_os = "wasi")),
wasm_bindgen_test::wasm_bindgen_test
@@ -1143,8 +998,7 @@ mod tests {
&[0., 0., 1., 1.],
&[0., 0., 0., 0.],
&[0., 0., 0., 1.],
])
.unwrap();
]);
let y: Vec<u32> = vec![1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0];
assert_eq!(
@@ -1155,43 +1009,6 @@ mod tests {
);
}
#[test]
fn test_compute_feature_importances() {
let x: DenseMatrix<f64> = DenseMatrix::from_2d_array(&[
&[1., 1., 1., 0.],
&[1., 1., 1., 0.],
&[1., 1., 1., 1.],
&[1., 1., 0., 0.],
&[1., 1., 0., 1.],
&[1., 0., 1., 0.],
&[1., 0., 1., 0.],
&[1., 0., 1., 1.],
&[1., 0., 0., 0.],
&[1., 0., 0., 1.],
&[0., 1., 1., 0.],
&[0., 1., 1., 0.],
&[0., 1., 1., 1.],
&[0., 1., 0., 0.],
&[0., 1., 0., 1.],
&[0., 0., 1., 0.],
&[0., 0., 1., 0.],
&[0., 0., 1., 1.],
&[0., 0., 0., 0.],
&[0., 0., 0., 1.],
])
.unwrap();
let y: Vec<u32> = vec![1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0];
let tree = DecisionTreeClassifier::fit(&x, &y, Default::default()).unwrap();
assert_eq!(
tree.compute_feature_importances(false),
vec![0., 0., 0.21333333333333332, 0.26666666666666666]
);
assert_eq!(
tree.compute_feature_importances(true),
vec![0., 0., 0.4444444444444444, 0.5555555555555556]
);
}
#[cfg_attr(
all(target_arch = "wasm32", not(target_os = "wasi")),
wasm_bindgen_test::wasm_bindgen_test
@@ -1220,8 +1037,7 @@ mod tests {
&[0., 0., 1., 1.],
&[0., 0., 0., 0.],
&[0., 0., 0., 1.],
])
.unwrap();
]);
let y = vec![1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0];
let tree = DecisionTreeClassifier::fit(&x, &y, Default::default()).unwrap();
+14 -17
View File
@@ -18,6 +18,7 @@
//! Example:
//!
//! ```
//! use rand::thread_rng;
//! use smartcore::linalg::basic::matrix::DenseMatrix;
//! use smartcore::tree::decision_tree_regressor::*;
//!
@@ -39,7 +40,7 @@
//! &[502.601, 393.1, 251.4, 125.368, 1960., 69.564],
//! &[518.173, 480.6, 257.2, 127.852, 1961., 69.331],
//! &[554.894, 400.7, 282.7, 130.081, 1962., 70.551],
//! ]).unwrap();
//! ]);
//! let y: Vec<f64> = vec![
//! 83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0, 100.0,
//! 101.2, 104.6, 108.4, 110.8, 112.6, 114.2, 115.7, 116.9,
@@ -311,15 +312,15 @@ impl Node {
impl PartialEq for Node {
fn eq(&self, other: &Self) -> bool {
(self.output - other.output).abs() < f64::EPSILON
(self.output - other.output).abs() < std::f64::EPSILON
&& self.split_feature == other.split_feature
&& match (self.split_value, other.split_value) {
(Some(a), Some(b)) => (a - b).abs() < f64::EPSILON,
(Some(a), Some(b)) => (a - b).abs() < std::f64::EPSILON,
(None, None) => true,
_ => false,
}
&& match (self.split_score, other.split_score) {
(Some(a), Some(b)) => (a - b).abs() < f64::EPSILON,
(Some(a), Some(b)) => (a - b).abs() < std::f64::EPSILON,
(None, None) => true,
_ => false,
}
@@ -421,10 +422,6 @@ impl<TX: Number + PartialOrd, TY: Number, X: Array2<TX>, Y: Array1<TY>>
parameters: DecisionTreeRegressorParameters,
) -> Result<DecisionTreeRegressor<TX, TY, X, Y>, Failed> {
let (x_nrows, num_attributes) = x.shape();
if x_nrows != y.shape() {
return Err(Failed::fit("Size of x should equal size of y"));
}
let samples = vec![1; x_nrows];
DecisionTreeRegressor::fit_weak_learner(x, y, samples, num_attributes, parameters)
}
@@ -478,7 +475,7 @@ impl<TX: Number + PartialOrd, TY: Number, X: Array2<TX>, Y: Array1<TY>>
visitor_queue.push_back(visitor);
}
while tree.depth() < tree.parameters().max_depth.unwrap_or(u16::MAX) {
while tree.depth() < tree.parameters().max_depth.unwrap_or(std::u16::MAX) {
match visitor_queue.pop_front() {
Some(node) => tree.split(node, mtry, &mut visitor_queue, &mut rng),
None => break,
@@ -515,7 +512,7 @@ impl<TX: Number + PartialOrd, TY: Number, X: Array2<TX>, Y: Array1<TY>>
if node.true_child.is_none() && node.false_child.is_none() {
result = node.output;
} else if x.get((row, node.split_feature)).to_f64().unwrap()
<= node.split_value.unwrap_or(f64::NAN)
<= node.split_value.unwrap_or(std::f64::NAN)
{
queue.push_back(node.true_child.unwrap());
} else {
@@ -640,7 +637,9 @@ impl<TX: Number + PartialOrd, TY: Number, X: Array2<TX>, Y: Array1<TY>>
.get((i, self.nodes()[visitor.node].split_feature))
.to_f64()
.unwrap()
<= self.nodes()[visitor.node].split_value.unwrap_or(f64::NAN)
<= self.nodes()[visitor.node]
.split_value
.unwrap_or(std::f64::NAN)
{
*true_sample = visitor.samples[i];
tc += *true_sample;
@@ -751,8 +750,7 @@ mod tests {
&[502.601, 393.1, 251.4, 125.368, 1960., 69.564],
&[518.173, 480.6, 257.2, 127.852, 1961., 69.331],
&[554.894, 400.7, 282.7, 130.081, 1962., 70.551],
])
.unwrap();
]);
let y: Vec<f64> = vec![
83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0, 100.0, 101.2, 104.6, 108.4, 110.8, 112.6,
114.2, 115.7, 116.9,
@@ -766,7 +764,7 @@ mod tests {
assert!((y_hat[i] - y[i]).abs() < 0.1);
}
let expected_y = [
let expected_y = vec![
87.3, 87.3, 87.3, 87.3, 98.9, 98.9, 98.9, 98.9, 98.9, 107.9, 107.9, 107.9, 114.85,
114.85, 114.85, 114.85,
];
@@ -787,7 +785,7 @@ mod tests {
assert!((y_hat[i] - expected_y[i]).abs() < 0.1);
}
let expected_y = [
let expected_y = vec![
83.0, 88.35, 88.35, 89.5, 97.15, 97.15, 99.5, 99.5, 101.2, 104.6, 109.6, 109.6, 113.4,
113.4, 116.30, 116.30,
];
@@ -833,8 +831,7 @@ mod tests {
&[502.601, 393.1, 251.4, 125.368, 1960., 69.564],
&[518.173, 480.6, 257.2, 127.852, 1961., 69.331],
&[554.894, 400.7, 282.7, 130.081, 1962., 70.551],
])
.unwrap();
]);
let y: Vec<f64> = vec![
83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0, 100.0, 101.2, 104.6, 108.4, 110.8, 112.6,
114.2, 115.7, 116.9,