76 Commits

Author SHA1 Message Date
morenol
62de25b2ae Handle kernel serialization (#232)
* Handle kernel serialization
* Do not use typetag in WASM
* enable tests for serialization
* Update serde feature deps

Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
Co-authored-by: Lorenzo <tunedconsulting@gmail.com>
2022-11-08 11:29:56 -05:00
morenol
7d87451333 Fixes for release (#237)
* Fixes for release
* add new test
* Remove change applied in development branch
* Only add dependency for wasm32
* Update ci.yml

Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
Co-authored-by: Lorenzo <tunedconsulting@gmail.com>
2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
265fd558e7 make work cargo build --target wasm32-unknown-unknown 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
e25e2aea2b update CHANGELOG 2022-11-08 11:29:56 -05:00
Lorenzo
2f6dd1325e update comment 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
b0dece9476 use getrandom/js 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
c507d976be Update CHANGELOG 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
fa54d5ee86 Remove unused tests flags 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
459d558d48 minor fixes to doc 2022-11-08 11:29:56 -05:00
Lorenzo
1b7dda30a2 minor fix 2022-11-08 11:29:56 -05:00
Lorenzo
c1bd1df5f6 minor fix 2022-11-08 11:29:56 -05:00
Lorenzo
cf751f05aa minor fix 2022-11-08 11:29:56 -05:00
Lorenzo
63ed89aadd minor fix 2022-11-08 11:29:56 -05:00
Lorenzo
890e9d644c minor fix 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
af0a740394 Fix std_rand feature 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
616e38c282 cleanup 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
a449fdd4ea fmt 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
669f87f812 Use getrandom as default (for no-std feature) 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
6d529b34d2 Add static analyzer to doc 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
3ec9e4f0db Exclude datasets test for wasm/wasi 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
527477dea7 minor fixes 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
5b517c5048 minor fix 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
2df0795be9 Release 0.3 2022-11-08 11:29:56 -05:00
Lorenzo
0dc97a4e9b Create DEVELOPERS.md 2022-11-08 11:29:56 -05:00
Lorenzo
6c0fd37222 Update README.md 2022-11-08 11:29:56 -05:00
Lorenzo
d8d0fb6903 Update README.md 2022-11-08 11:29:56 -05:00
morenol
8d07efd921 Use Box in SVM and remove lifetimes (#228)
* Do not change external API
Authored-by: Luis Moreno <morenol@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
morenol
ba27dd2a55 Fix CI (#227)
* Update ci.yml
Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
Lorenzo
ed9769f651 Implement CSV reader with new traits (#209) 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
b427e5d8b1 Improve options conditionals 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
fabe362755 Implement Display for NaiveBayes 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
ee6b6a53d6 cargo clippy 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
19f3a2fcc0 Fix signature of metrics tests 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
e09c4ba724 Add kernels' parameters to public interface 2022-11-08 11:29:56 -05:00
Lorenzo
6624732a65 Fix svr tests (#222) 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
1cbde3ba22 Refactor modules structure in src/svm 2022-11-08 11:29:56 -05:00
Lorenzo (Mec-iS)
551a6e34a5 clean up svm 2022-11-08 11:29:56 -05:00
Lorenzo
c45bab491a Support Wasi as target (#216)
* Improve features
* Add wasm32-wasi as a target
* Update .github/workflows/ci.yml
Co-authored-by: morenol <22335041+morenol@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
Lorenzo
7f35dc54e4 Disambiguate distances. Implement Fastpair. (#220) 2022-11-08 11:29:56 -05:00
morenol
8f1a7dfd79 build: fix compilation without default features (#218)
* build: fix compilation with optional features
* Remove unused config from Cargo.toml
* Fix cache keys
Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
Lorenzo
712c478af6 Improve features (#215) 2022-11-08 11:29:56 -05:00
Lorenzo
4d36b7f34f Fix metrics::auc (#212)
* Fix metrics::auc
2022-11-08 11:29:56 -05:00
Lorenzo
a16927aa16 Port ensemble. Add Display to naive_bayes (#208) 2022-11-08 11:29:56 -05:00
Lorenzo
d91f4f7ce4 Update README.md 2022-11-08 11:29:56 -05:00
Lorenzo
a7fa0585eb Merge potential next release v0.4 (#187) Breaking Changes
* First draft of the new n-dimensional arrays + NB use case
* Improves default implementation of multiple Array methods
* Refactors tree methods
* Adds matrix decomposition routines
* Adds matrix decomposition methods to ndarray and nalgebra bindings
* Refactoring + linear regression now uses array2
* Ridge & Linear regression
* LBFGS optimizer & logistic regression
* LBFGS optimizer & logistic regression
* Changes linear methods, metrics and model selection methods to new n-dimensional arrays
* Switches KNN and clustering algorithms to new n-d array layer
* Refactors distance metrics
* Optimizes knn and clustering methods
* Refactors metrics module
* Switches decomposition methods to n-dimensional arrays
* Linalg refactoring - cleanup rng merge (#172)
* Remove legacy DenseMatrix and BaseMatrix implementation. Port the new Number, FloatNumber and Array implementation into module structure.
* Exclude AUC metrics. Needs reimplementation
* Improve developers walkthrough

New traits system in place at `src/numbers` and `src/linalg`
Co-authored-by: Lorenzo <tunedconsulting@gmail.com>

* Provide SupervisedEstimator with a constructor to avoid explicit dynamical box allocation in 'cross_validate' and 'cross_validate_predict' as required by the use of 'dyn' as per Rust 2021
* Implement getters to use as_ref() in src/neighbors
* Implement getters to use as_ref() in src/naive_bayes
* Implement getters to use as_ref() in src/linear
* Add Clone to src/naive_bayes
* Change signature for cross_validate and other model_selection functions to abide to use of dyn in Rust 2021
* Implement ndarray-bindings. Remove FloatNumber from implementations
* Drop nalgebra-bindings support (as decided in conf-call to go for ndarray)
* Remove benches. Benches will have their own repo at smartcore-benches
* Implement SVC
* Implement SVC serialization. Move search parameters in dedicated module
* Implement SVR. Definitely too slow
* Fix compilation issues for wasm (#202)

Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
* Fix tests (#203)

* Port linalg/traits/stats.rs
* Improve methods naming
* Improve Display for DenseMatrix

Co-authored-by: Montana Low <montanalow@users.noreply.github.com>
Co-authored-by: VolodymyrOrlov <volodymyr.orlov@gmail.com>
2022-11-08 11:29:56 -05:00
RJ Nowling
a32eb66a6a Dataset doc cleanup (#205)
* Update iris.rs

* Update mod.rs

* Update digits.rs
2022-11-08 11:29:56 -05:00
Lorenzo
f605f6e075 Update README.md 2022-11-08 11:29:56 -05:00
Lorenzo
3b1aaaadf7 Update README.md 2022-11-08 11:29:56 -05:00
Lorenzo
d015b12402 Update CONTRIBUTING.md 2022-11-08 11:29:56 -05:00
morenol
d5200074c2 fix: fix issue with iterator for svc search (#182) 2022-11-08 11:29:56 -05:00
morenol
473cdfc44d refactor: Try to follow similar pattern to other APIs (#180)
Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
morenol
ad2e6c2900 feat: expose hyper tuning module in model_selection (#179)
* feat: expose hyper tuning module in model_selection

* Move to a folder

Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
Lorenzo
9ea3133c27 Update CONTRIBUTING.md 2022-11-08 11:29:56 -05:00
Lorenzo
e4c47c7540 Add contribution guidelines (#178) 2022-11-08 11:29:56 -05:00
Montana Low
f4fd4d2239 make default params available to serde (#167)
* add seed param to search params

* make default params available to serde

* lints

* create defaults for enums

* lint
2022-11-08 11:29:56 -05:00
Montana Low
05dfffad5c add seed param to search params (#168) 2022-11-08 11:29:56 -05:00
morenol
a37b552a7d Lmm/add seeds in more algorithms (#164)
* Provide better output in flaky tests

* feat: add seed parameter to multiple algorithms

* Update changelog

Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
Montana Low
55e1158581 Complete grid search params (#166)
* grid search draft

* hyperparam search for linear estimators

* grid search for ensembles

* support grid search for more algos

* grid search for unsupervised algos

* minor cleanup
2022-11-08 11:29:56 -05:00
morenol
cfa824d7db Provide better output in flaky tests (#163) 2022-11-08 11:29:56 -05:00
morenol
bb5b437a32 feat: allocate first and then proceed to create matrix from Vec of Ro… (#159)
* feat: allocate first and then proceed to create matrix from Vec of RowVectors
2022-11-08 11:29:56 -05:00
morenol
851533dfa7 Make rand_distr optional (#161) 2022-11-08 11:29:56 -05:00
Lorenzo
0d996edafe Update LICENSE 2022-11-08 11:29:56 -05:00
morenol
f291b71f4a fix: fix compilation warnings when running only with default features (#160)
* fix: fix compilation warnings when running only with default features
Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
Tim Toebrock
2d75c2c405 Implement a generic read_csv method (#147)
* feat: Add interface to build `Matrix` from rows.
* feat: Add option to derive `RealNumber` from string.
To construct a `Matrix` from csv, and therefore from string, I need to be able to deserialize a generic `RealNumber` from string.
* feat: Implement `Matrix::read_csv`.
2022-11-08 11:29:56 -05:00
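The #147 notes above describe the key ingredient of a generic `read_csv`: deserializing a numeric value from a string for any `RealNumber`, so a `Matrix` can be built from text. A minimal sketch of that idea using the standard library's `FromStr` bound (`parse_csv` is a hypothetical helper, not the `Matrix::read_csv` added in the PR):

```rust
use std::str::FromStr;

// Parse CSV text into (row-major values, n_rows, n_cols) for any type
// implementing FromStr; `T::Err: Debug` lets us report parse failures.
fn parse_csv<T: FromStr>(csv: &str) -> Result<(Vec<T>, usize, usize), String>
where
    <T as FromStr>::Err: std::fmt::Debug,
{
    let mut values = Vec::new();
    let (mut n_rows, mut n_cols) = (0usize, 0usize);
    for line in csv.lines().filter(|l| !l.trim().is_empty()) {
        let row: Vec<T> = line
            .split(',')
            .map(|field| field.trim().parse::<T>())
            .collect::<Result<_, _>>()
            .map_err(|e| format!("bad field in row {}: {:?}", n_rows, e))?;
        if n_rows == 0 {
            n_cols = row.len();
        } else if row.len() != n_cols {
            return Err(format!("row {} has {} fields, expected {}", n_rows, row.len(), n_cols));
        }
        values.extend(row);
        n_rows += 1;
    }
    Ok((values, n_rows, n_cols))
}

fn main() {
    let (v, r, c) = parse_csv::<f64>("1.0,2.0\n3.0,4.0").unwrap();
    assert_eq!((r, c), (2, 2));
    assert_eq!(v, vec![1.0, 2.0, 3.0, 4.0]);
}
```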
Montana Low
1f2597be74 grid search (#154)
* grid search draft
* hyperparam search for linear estimators
2022-11-08 11:29:56 -05:00
Montana Low
0f442e96c0 Handle multiclass precision/recall (#152)
* handle multiclass precision/recall
2022-11-08 11:29:56 -05:00
dependabot[bot]
44e4be23a6 Update criterion requirement from 0.3 to 0.4 (#150)
* Update criterion requirement from 0.3 to 0.4

Updates the requirements on [criterion](https://github.com/bheisler/criterion.rs) to permit the latest version.
- [Release notes](https://github.com/bheisler/criterion.rs/releases)
- [Changelog](https://github.com/bheisler/criterion.rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/bheisler/criterion.rs/compare/0.3.0...0.4.0)

---
updated-dependencies:
- dependency-name: criterion
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* fix criterion

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
Christos Katsakioris
01f753f86d Add serde for StandardScaler (#148)
* Derive `serde::Serialize` and `serde::Deserialize` for
  `StandardScaler`.
* Add relevant unit test.

Signed-off-by: Christos Katsakioris <ckatsak@gmail.com>

Signed-off-by: Christos Katsakioris <ckatsak@gmail.com>
2022-11-08 11:29:56 -05:00
Tim Toebrock
df766eaf79 Implementation of Standard scaler (#143)
* docs: Fix typo in doc for categorical transformer.
* feat: Add option to take a column from Matrix.
I created the method `Matrix::take_column` that uses the `Matrix::take`-interface to extract a single column from a matrix. I need that feature in the implementation of  `StandardScaler`.
* feat: Add `StandardScaler`.
Authored-by: titoeb <timtoebrock@googlemail.com>
2022-11-08 11:29:56 -05:00
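The #143 notes above explain that `StandardScaler` works column by column, which is why a `Matrix::take_column` helper was needed. A minimal sketch of what standard scaling does to one column (not smartcore's `StandardScaler` API; population standard deviation assumed here, a real implementation may use the n − 1 sample variant):

```rust
// Center a column to mean 0 and scale it to unit variance.
fn standard_scale(column: &[f64]) -> Vec<f64> {
    let n = column.len() as f64;
    let mean = column.iter().sum::<f64>() / n;
    let var = column.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / n;
    let std = var.sqrt();
    column
        .iter()
        // Guard against constant columns, where std is zero.
        .map(|x| if std > 0.0 { (x - mean) / std } else { 0.0 })
        .collect()
}

fn main() {
    let scaled = standard_scale(&[2.0, 4.0, 6.0]);
    let mean: f64 = scaled.iter().sum::<f64>() / scaled.len() as f64;
    assert!(mean.abs() < 1e-12);
}
```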
Lorenzo
09d9205696 Add example for FastPair (#144)
* Add example

* Move to top

* Add imports to example

* Fix imports
2022-11-08 11:29:56 -05:00
Lorenzo
dc7f01db4a Implement fastpair (#142)
* initial fastpair implementation
* FastPair initial implementation
* implement fastpair
* Add random test
* Add bench for fastpair
* Refactor with constructor for FastPair
* Add serialization for PairwiseDistance
* Add fp_bench feature for fastpair bench
2022-11-08 11:29:56 -05:00
Chris McComb
eb4b49d552 Added additional doctest and fixed indices (#141) 2022-11-08 11:29:56 -05:00
morenol
98e3465e7b Fix clippy warnings (#139)
Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
ferrouille
ea39024fd2 Add SVC::decision_function (#135) 2022-11-08 11:29:56 -05:00
dependabot[bot]
4e94feb872 Update nalgebra requirement from 0.23.0 to 0.31.0 (#128)
Updates the requirements on [nalgebra](https://github.com/dimforge/nalgebra) to permit the latest version.
- [Release notes](https://github.com/dimforge/nalgebra/releases)
- [Changelog](https://github.com/dimforge/nalgebra/blob/dev/CHANGELOG.md)
- [Commits](https://github.com/dimforge/nalgebra/compare/v0.23.0...v0.31.0)

---
updated-dependencies:
- dependency-name: nalgebra
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
dependabot-preview[bot]
fa802d2d3f build(deps): update nalgebra requirement from 0.23.0 to 0.26.2 (#98)
* build(deps): update nalgebra requirement from 0.23.0 to 0.26.2

Updates the requirements on [nalgebra](https://github.com/dimforge/nalgebra) to permit the latest version.
- [Release notes](https://github.com/dimforge/nalgebra/releases)
- [Changelog](https://github.com/dimforge/nalgebra/blob/dev/CHANGELOG.md)
- [Commits](https://github.com/dimforge/nalgebra/compare/v0.23.0...v0.26.2)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>

* fix: updates for nalgebre

* test: explicitly call pow_mut from BaseVector since now it conflicts with nalgebra implementation

* Don't be strict with dependencies

Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>
Co-authored-by: Luis Moreno <morenol@users.noreply.github.com>
2022-11-08 11:29:56 -05:00
56 changed files with 284 additions and 267 deletions
+3 -3 Cargo.toml
@@ -2,7 +2,7 @@
 name = "smartcore"
 description = "Machine Learning in Rust."
 homepage = "https://smartcorelib.org"
-version = "0.3.1"
+version = "0.3.0"
 authors = ["smartcore Developers"]
 edition = "2021"
 license = "Apache-2.0"
@@ -42,13 +42,13 @@ std_rand = ["rand/std_rng", "rand/std"]
 js = ["getrandom/js"]
 [target.'cfg(target_arch = "wasm32")'.dependencies]
-getrandom = { version = "0.2.8", optional = true }
+getrandom = { version = "*", optional = true }
 [target.'cfg(all(target_arch = "wasm32", not(target_os = "wasi")))'.dev-dependencies]
 wasm-bindgen-test = "0.3"
 [dev-dependencies]
-itertools = "0.10.5"
+itertools = "*"
 serde_json = "1.0"
 bincode = "1.3.1"
+1 -1 README.md
@@ -18,4 +18,4 @@
 -----
 [![CI](https://github.com/smartcorelib/smartcore/actions/workflows/ci.yml/badge.svg)](https://github.com/smartcorelib/smartcore/actions/workflows/ci.yml)
-To start getting familiar with the new smartcore v0.3 API, there is now available a [**Jupyter Notebook environment repository**](https://github.com/smartcorelib/smartcore-jupyter). Please see instructions there, contributions welcome see [CONTRIBUTING](.github/CONTRIBUTING.md).
+To start getting familiar with the new smartcore v0.5 API, there is now available a [**Jupyter Notebook environment repository**](https://github.com/smartcorelib/smartcore-jupyter). Please see instructions there, contributions welcome see [CONTRIBUTING](.github/CONTRIBUTING.md).
+15 (new file: IntelliJ Rust module definition)
@@ -0,0 +1,15 @@
<?xml version="1.0" encoding="UTF-8"?>
<module type="RUST_MODULE" version="4">
<component name="NewModuleRootManager" inherit-compiler-output="true">
<exclude-output />
<content url="file://$MODULE_DIR$">
<sourceFolder url="file://$MODULE_DIR$/src" isTestSource="false" />
<sourceFolder url="file://$MODULE_DIR$/examples" isTestSource="false" />
<sourceFolder url="file://$MODULE_DIR$/tests" isTestSource="true" />
<sourceFolder url="file://$MODULE_DIR$/benches" isTestSource="true" />
<excludeFolder url="file://$MODULE_DIR$/target" />
</content>
<orderEntry type="inheritedJdk" />
<orderEntry type="sourceFolder" forTests="false" />
</component>
</module>
+10 -6
@@ -260,8 +260,8 @@ mod tests_fastpair {
         let distances = fastpair.distances;
         let neighbours = fastpair.neighbours;
-        assert!(!distances.is_empty());
-        assert!(!neighbours.is_empty());
+        assert!(distances.len() != 0);
+        assert!(neighbours.len() != 0);
         assert_eq!(10, neighbours.len());
         assert_eq!(10, distances.len());
@@ -276,14 +276,18 @@ mod tests_fastpair {
         // We expect an error when we run `FastPair` on this dataset,
         // becuase `FastPair` currently only works on a minimum of 3
         // points.
-        let fastpair = FastPair::new(&dataset);
-        assert!(fastpair.is_err());
-        if let Err(e) = fastpair {
+        let _fastpair = FastPair::new(&dataset);
+        match _fastpair {
+            Err(e) => {
                 let expected_error =
                     Failed::because(FailedError::FindFailed, "min number of rows should be 3");
                 assert_eq!(e, expected_error)
             }
+            _ => {
+                assert!(false);
+            }
+        }
     }
     #[test]
@@ -578,7 +582,7 @@ mod tests_fastpair {
         };
         for p in dissimilarities.iter() {
             if p.distance.unwrap() < min_dissimilarity.distance.unwrap() {
-                min_dissimilarity = *p
+                min_dissimilarity = p.clone()
             }
         }
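The error-handling hunk above trades an `if let Err(..)` preceded by an `is_err()` assertion for an exhaustive `match` whose wildcard arm fails the test. A minimal standalone sketch of the two equivalent test styles, using a hypothetical stand-in for `FastPair::new`'s minimum-size check (not smartcore's actual API):

```rust
// Hypothetical constructor guard: FastPair requires at least 3 points.
fn fastpair_like(n_rows: usize) -> Result<usize, String> {
    if n_rows < 3 {
        return Err("min number of rows should be 3".to_string());
    }
    Ok(n_rows)
}

fn main() {
    let result = fastpair_like(2);
    // Style 1: assert the error first, then destructure it.
    assert!(result.is_err());
    if let Err(e) = &result {
        assert_eq!(e, "min number of rows should be 3");
    }
    // Style 2: exhaustive match; the wildcard arm fails the test outright.
    match result {
        Err(e) => assert_eq!(e, "min number of rows should be 3"),
        _ => panic!("expected an error for fewer than 3 rows"),
    }
}
```

Both styles catch a silently-passing `Ok` case; the `match` form makes that explicit at the cost of a few more lines.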
+7 -2
@@ -49,15 +49,20 @@ pub mod linear_search;
 /// Both, KNN classifier and regressor benefits from underlying search algorithms that helps to speed up queries.
 /// `KNNAlgorithmName` maintains a list of supported search algorithms, see [KNN algorithms](../algorithm/neighbour/index.html)
 #[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
-#[derive(Debug, Clone, Default)]
+#[derive(Debug, Clone)]
 pub enum KNNAlgorithmName {
     /// Heap Search algorithm, see [`LinearSearch`](../algorithm/neighbour/linear_search/index.html)
     LinearSearch,
     /// Cover Tree Search algorithm, see [`CoverTree`](../algorithm/neighbour/cover_tree/index.html)
-    #[default]
     CoverTree,
 }
+impl Default for KNNAlgorithmName {
+    fn default() -> Self {
+        KNNAlgorithmName::CoverTree
+    }
+}
 #[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
 #[derive(Debug)]
 pub(crate) enum KNNAlgorithm<T: Number, D: Distance<Vec<T>>> {
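The hunk above swaps `#[derive(Default)]` with a `#[default]` variant marker (stable since Rust 1.62) for a hand-written `impl Default`, presumably for compatibility with older toolchains. Both spellings yield the same default value; a sketch with a stand-in enum (not the real `KNNAlgorithmName`):

```rust
#[derive(Debug, Clone, PartialEq)]
enum SearchAlgorithm {
    LinearSearch,
    CoverTree,
}

// Manual impl, as on the new side of the diff; equivalent to deriving
// Default with `#[default]` on the CoverTree variant.
impl Default for SearchAlgorithm {
    fn default() -> Self {
        SearchAlgorithm::CoverTree
    }
}

fn main() {
    assert_eq!(SearchAlgorithm::default(), SearchAlgorithm::CoverTree);
}
```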
+2 -2
@@ -18,7 +18,7 @@
 //!
 //! Example:
 //!
-//! ```ignore
+//! ```
 //! use smartcore::linalg::basic::matrix::DenseMatrix;
 //! use smartcore::linalg::basic::arrays::Array2;
 //! use smartcore::cluster::dbscan::*;
@@ -511,6 +511,6 @@ mod tests {
         .and_then(|dbscan| dbscan.predict(&x))
         .unwrap();
-    println!("{labels:?}");
+    println!("{:?}", labels);
     }
 }
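The `println!` change above (which recurs in several files below) moves from inlined format arguments, i.e. identifiers captured directly in the format string (stable since Rust 1.58), back to positional arguments. The two forms produce identical output:

```rust
fn main() {
    let labels = vec![0, 1, 1, 0];
    // Captured identifier in the format string...
    let inlined = format!("{labels:?}");
    // ...versus the positional form used on the new side of the diff.
    let positional = format!("{:?}", labels);
    assert_eq!(inlined, positional);
    println!("{}", inlined);
}
```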
+2 -2
@@ -498,8 +498,8 @@ mod tests {
     let y: Vec<usize> = kmeans.predict(&x).unwrap();
-    for (i, _y_i) in y.iter().enumerate() {
-        assert_eq!({ y[i] }, kmeans._y[i]);
+    for i in 0..y.len() {
+        assert_eq!(y[i] as usize, kmeans._y[i]);
     }
 }
+1 -1
@@ -31,7 +31,7 @@ use crate::dataset::Dataset;
 pub fn load_dataset() -> Dataset<f32, f32> {
     let (x, y, num_samples, num_features) = match deserialize_data(std::include_bytes!("boston.xy"))
     {
-        Err(why) => panic!("Can't deserialize boston.xy. {why}"),
+        Err(why) => panic!("Can't deserialize boston.xy. {}", why),
         Ok((x, y, num_samples, num_features)) => (x, y, num_samples, num_features),
     };
+1 -1
@@ -33,7 +33,7 @@ use crate::dataset::Dataset;
 pub fn load_dataset() -> Dataset<f32, u32> {
     let (x, y, num_samples, num_features) =
         match deserialize_data(std::include_bytes!("breast_cancer.xy")) {
-            Err(why) => panic!("Can't deserialize breast_cancer.xy. {why}"),
+            Err(why) => panic!("Can't deserialize breast_cancer.xy. {}", why),
             Ok((x, y, num_samples, num_features)) => (
                 x,
                 y.into_iter().map(|x| x as u32).collect(),
+1 -1
@@ -26,7 +26,7 @@ use crate::dataset::Dataset;
 pub fn load_dataset() -> Dataset<f32, u32> {
     let (x, y, num_samples, num_features) =
         match deserialize_data(std::include_bytes!("diabetes.xy")) {
-            Err(why) => panic!("Can't deserialize diabetes.xy. {why}"),
+            Err(why) => panic!("Can't deserialize diabetes.xy. {}", why),
             Ok((x, y, num_samples, num_features)) => (
                 x,
                 y.into_iter().map(|x| x as u32).collect(),
+1 -1
@@ -16,7 +16,7 @@ use crate::dataset::Dataset;
 pub fn load_dataset() -> Dataset<f32, f32> {
     let (x, y, num_samples, num_features) = match deserialize_data(std::include_bytes!("digits.xy"))
     {
-        Err(why) => panic!("Can't deserialize digits.xy. {why}"),
+        Err(why) => panic!("Can't deserialize digits.xy. {}", why),
         Ok((x, y, num_samples, num_features)) => (x, y, num_samples, num_features),
     };
+1 -1
@@ -22,7 +22,7 @@ use crate::dataset::Dataset;
 pub fn load_dataset() -> Dataset<f32, u32> {
     let (x, y, num_samples, num_features): (Vec<f32>, Vec<u32>, usize, usize) =
         match deserialize_data(std::include_bytes!("iris.xy")) {
-            Err(why) => panic!("Can't deserialize iris.xy. {why}"),
+            Err(why) => panic!("Can't deserialize iris.xy. {}", why),
             Ok((x, y, num_samples, num_features)) => (
                 x,
                 y.into_iter().map(|x| x as u32).collect(),
+1 -1
@@ -78,7 +78,7 @@ pub(crate) fn serialize_data<X: Number + RealNumber, Y: RealNumber>(
             .collect();
             file.write_all(&y)?;
         }
-        Err(why) => panic!("couldn't create (unknown): {why}"),
+        Err(why) => panic!("couldn't create {}: {}", filename, why),
     }
     Ok(())
 }
+11 -9
@@ -231,7 +231,8 @@ impl<T: Number + RealNumber, X: Array2<T> + SVDDecomposable<T> + EVDDecomposable
     if parameters.n_components > n {
         return Err(Failed::fit(&format!(
-            "Number of components, n_components should be <= number of attributes ({n})"
+            "Number of components, n_components should be <= number of attributes ({})",
+            n
         )));
     }
@@ -373,20 +374,21 @@ mod tests {
     let parameters = PCASearchParameters {
         n_components: vec![2, 4],
         use_correlation_matrix: vec![true, false],
+        ..Default::default()
     };
     let mut iter = parameters.into_iter();
     let next = iter.next().unwrap();
     assert_eq!(next.n_components, 2);
-    assert!(next.use_correlation_matrix);
+    assert_eq!(next.use_correlation_matrix, true);
     let next = iter.next().unwrap();
     assert_eq!(next.n_components, 4);
-    assert!(next.use_correlation_matrix);
+    assert_eq!(next.use_correlation_matrix, true);
     let next = iter.next().unwrap();
     assert_eq!(next.n_components, 2);
-    assert!(!next.use_correlation_matrix);
+    assert_eq!(next.use_correlation_matrix, false);
     let next = iter.next().unwrap();
     assert_eq!(next.n_components, 4);
-    assert!(!next.use_correlation_matrix);
+    assert_eq!(next.use_correlation_matrix, false);
     assert!(iter.next().is_none());
 }
@@ -570,8 +572,8 @@ mod tests {
         epsilon = 1e-4
     ));
-    for (i, pca_eigenvalues_i) in pca.eigenvalues.iter().enumerate() {
-        assert!((pca_eigenvalues_i.abs() - expected_eigenvalues[i].abs()).abs() < 1e-8);
+    for i in 0..pca.eigenvalues.len() {
+        assert!((pca.eigenvalues[i].abs() - expected_eigenvalues[i].abs()).abs() < 1e-8);
     }
     let us_arrests_t = pca.transform(&us_arrests).unwrap();
@@ -692,8 +694,8 @@ mod tests {
         epsilon = 1e-4
     ));
-    for (i, pca_eigenvalues_i) in pca.eigenvalues.iter().enumerate() {
-        assert!((pca_eigenvalues_i.abs() - expected_eigenvalues[i].abs()).abs() < 1e-8);
+    for i in 0..pca.eigenvalues.len() {
+        assert!((pca.eigenvalues[i].abs() - expected_eigenvalues[i].abs()).abs() < 1e-8);
     }
     let us_arrests_t = pca.transform(&us_arrests).unwrap();
+5 -2
@@ -180,7 +180,8 @@ impl<T: Number + RealNumber, X: Array2<T> + SVDDecomposable<T> + EVDDecomposable
     if parameters.n_components >= p {
         return Err(Failed::fit(&format!(
-            "Number of components, n_components should be < number of attributes ({p})"
+            "Number of components, n_components should be < number of attributes ({})",
+            p
         )));
     }
@@ -201,7 +202,8 @@ impl<T: Number + RealNumber, X: Array2<T> + SVDDecomposable<T> + EVDDecomposable
     let (p_c, k) = self.components.shape();
     if p_c != p {
         return Err(Failed::transform(&format!(
-            "Can not transform a {n}x{p} matrix into {n}x{k} matrix, incorrect input dimentions"
+            "Can not transform a {}x{} matrix into {}x{} matrix, incorrect input dimentions",
+            n, p, n, k
         )));
     }
@@ -225,6 +227,7 @@ mod tests {
     fn search_parameters() {
         let parameters = SVDSearchParameters {
             n_components: vec![10, 100],
+            ..Default::default()
         };
         let mut iter = parameters.into_iter();
         let next = iter.next().unwrap();
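The `..Default::default()` lines added in the PCA and SVD hunks use Rust's struct update syntax: name only the fields you want to set and fill the rest from another value, here the type's `Default`. A sketch with a hypothetical search-parameters struct (not smartcore's actual `SVDSearchParameters`):

```rust
#[derive(Debug, Clone, PartialEq)]
struct SearchParams {
    n_components: Vec<usize>,
    max_iter: usize,
}

impl Default for SearchParams {
    fn default() -> Self {
        SearchParams { n_components: vec![2], max_iter: 100 }
    }
}

fn main() {
    // Override one field; `max_iter` is filled in from Default.
    let p = SearchParams { n_components: vec![10, 100], ..Default::default() };
    assert_eq!(p.max_iter, 100);
    assert_eq!(p.n_components, vec![10, 100]);
}
```

This is why the tests needed updating: once a struct gains new fields, initializers that list every field stop compiling, while `..Default::default()` keeps them working.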
+1 -29
@@ -454,12 +454,8 @@ impl<TX: FloatNumber + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY
     y: &Y,
     parameters: RandomForestClassifierParameters,
 ) -> Result<RandomForestClassifier<TX, TY, X, Y>, Failed> {
-    let (x_nrows, num_attributes) = x.shape();
+    let (_, num_attributes) = x.shape();
     let y_ncols = y.shape();
-    if x_nrows != y_ncols {
-        return Err(Failed::fit("Number of rows in X should = len(y)"));
-    }
     let mut yi: Vec<usize> = vec![0; y_ncols];
     let classes = y.unique();
@@ -682,30 +678,6 @@ mod tests {
     assert!(accuracy(&y, &classifier.predict(&x).unwrap()) >= 0.95);
 }
-#[test]
-fn test_random_matrix_with_wrong_rownum() {
-    let x_rand: DenseMatrix<f64> = DenseMatrix::<f64>::rand(21, 200);
-    let y: Vec<u32> = vec![0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1];
-    let fail = RandomForestClassifier::fit(
-        &x_rand,
-        &y,
-        RandomForestClassifierParameters {
-            criterion: SplitCriterion::Gini,
-            max_depth: Option::None,
-            min_samples_leaf: 1,
-            min_samples_split: 2,
-            n_trees: 100,
-            m: Option::None,
-            keep_samples: false,
-            seed: 87,
-        },
-    );
-    assert!(fail.is_err());
-}
 #[cfg_attr(
     all(target_arch = "wasm32", not(target_os = "wasi")),
     wasm_bindgen_test::wasm_bindgen_test
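The guard deleted above rejected a feature matrix whose row count does not match the label vector's length before fitting, and the removed test exercised exactly that path. A standalone sketch of that kind of shape precondition (hypothetical function and error type, not smartcore's `Failed`/`RandomForestClassifier` API):

```rust
// Validate that X and y agree on the number of samples before fitting.
fn check_shapes(x_nrows: usize, y_len: usize) -> Result<(), String> {
    if x_nrows != y_len {
        return Err("Number of rows in X should = len(y)".to_string());
    }
    Ok(())
}

fn main() {
    // Mirrors the removed test: 21 rows of X against 20 labels must fail.
    assert!(check_shapes(21, 20).is_err());
    assert!(check_shapes(20, 20).is_ok());
}
```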
-30
@@ -399,10 +399,6 @@ impl<TX: Number + FloatNumber + PartialOrd, TY: Number, X: Array2<TX>, Y: Array1
 ) -> Result<RandomForestRegressor<TX, TY, X, Y>, Failed> {
     let (n_rows, num_attributes) = x.shape();
-    if n_rows != y.shape() {
-        return Err(Failed::fit("Number of rows in X should = len(y)"));
-    }
     let mtry = parameters
         .m
         .unwrap_or((num_attributes as f64).sqrt().floor() as usize);
@@ -599,32 +595,6 @@ mod tests {
     assert!(mean_absolute_error(&y, &y_hat) < 1.0);
 }
-#[test]
-fn test_random_matrix_with_wrong_rownum() {
-    let x_rand: DenseMatrix<f64> = DenseMatrix::<f64>::rand(17, 200);
-    let y = vec![
-        83.0, 88.5, 88.2, 89.5, 96.2, 98.1, 99.0, 100.0, 101.2, 104.6, 108.4, 110.8, 112.6,
-        114.2, 115.7, 116.9,
-    ];
-    let fail = RandomForestRegressor::fit(
-        &x_rand,
-        &y,
-        RandomForestRegressorParameters {
-            max_depth: Option::None,
-            min_samples_leaf: 1,
-            min_samples_split: 2,
-            n_trees: 1000,
-            m: Option::None,
-            keep_samples: false,
-            seed: 87,
-        },
-    );
-    assert!(fail.is_err());
-}
 #[cfg_attr(
     all(target_arch = "wasm32", not(target_os = "wasi")),
     wasm_bindgen_test::wasm_bindgen_test
+2 -2
@@ -30,7 +30,7 @@ pub enum FailedError {
     DecompositionFailed,
     /// Can't solve for x
     SolutionFailed,
-    /// Error in input parameters
+    /// Erro in input
     ParametersError,
 }
@@ -98,7 +98,7 @@ impl fmt::Display for FailedError {
     FailedError::SolutionFailed => "Can't find solution",
     FailedError::ParametersError => "Error in input, check parameters",
 };
-write!(f, "{failed_err_str}")
+write!(f, "{}", failed_err_str)
 }
 }
+1 -2
@@ -3,8 +3,7 @@
     clippy::too_many_arguments,
     clippy::many_single_char_names,
     clippy::unnecessary_wraps,
-    clippy::upper_case_acronyms,
-    clippy::approx_constant
+    clippy::upper_case_acronyms
 )]
 #![warn(missing_docs)]
 #![warn(rustdoc::missing_doc_code_examples)]
+29 -12

@@ -548,7 +548,7 @@ pub trait ArrayView2<T: Debug + Display + Copy + Sized>: Array<T, (usize, usize)
         let (nrows, ncols) = self.shape();
         for r in 0..nrows {
             let row: Vec<T> = (0..ncols).map(|c| *self.get((r, c))).collect();
-            writeln!(f, "{row:?}")?
+            writeln!(f, "{:?}", row)?
         }
         Ok(())
     }
@@ -918,7 +918,8 @@ pub trait Array1<T: Debug + Display + Copy + Sized>: MutArrayView1<T> + Sized +
         let len = self.shape();
         assert!(
             index.iter().all(|&i| i < len),
-            "All indices in `take` should be < {len}"
+            "All indices in `take` should be < {}",
+            len
         );
         Self::from_iterator(index.iter().map(move |&i| *self.get(i)), index.len())
     }
@@ -989,7 +990,10 @@ pub trait Array1<T: Debug + Display + Copy + Sized>: MutArrayView1<T> + Sized +
         };
         assert!(
             d1 == len,
-            "Can not multiply {nrows}x{ncols} matrix by {len} vector"
+            "Can not multiply {}x{} matrix by {} vector",
+            nrows,
+            ncols,
+            len
         );
         let mut result = Self::zeros(d2);
         for i in 0..d2 {
@@ -1107,7 +1111,11 @@ pub trait Array2<T: Debug + Display + Copy + Sized>: MutArrayView2<T> + Sized +
         assert!(
             nrows * ncols == onrows * oncols,
-            "Can't reshape {onrows}x{oncols} array into a {nrows}x{ncols} array"
+            "Can't reshape {}x{} array into a {}x{} array",
+            onrows,
+            oncols,
+            nrows,
+            ncols
         );
         Self::from_iterator(self.iterator(0).cloned(), nrows, ncols, axis)
@@ -1121,7 +1129,11 @@ pub trait Array2<T: Debug + Display + Copy + Sized>: MutArrayView2<T> + Sized +
         let (o_nrows, o_ncols) = other.shape();
         assert!(
             ncols == o_nrows,
-            "Can't multiply {nrows}x{ncols} and {o_nrows}x{o_ncols} matrices"
+            "Can't multiply {}x{} and {}x{} matrices",
+            nrows,
+            ncols,
+            o_nrows,
+            o_ncols
         );
         let inner_d = ncols;
         let mut result = Self::zeros(nrows, o_ncols);
@@ -1154,7 +1166,7 @@ pub trait Array2<T: Debug + Display + Copy + Sized>: MutArrayView2<T> + Sized +
             _ => (nrows, ncols, o_nrows, o_ncols),
         };
         if d1 != d4 {
-            panic!("Can not multiply {d2}x{d1} by {d4}x{d3} matrices");
+            panic!("Can not multiply {}x{} by {}x{} matrices", d2, d1, d4, d3);
         }
         let mut result = Self::zeros(d2, d3);
         for r in 0..d2 {
@@ -1186,7 +1198,10 @@ pub trait Array2<T: Debug + Display + Copy + Sized>: MutArrayView2<T> + Sized +
         };
         assert!(
             d2 == len,
-            "Can not multiply {nrows}x{ncols} matrix by {len} vector"
+            "Can not multiply {}x{} matrix by {} vector",
+            nrows,
+            ncols,
+            len
         );
         let mut result = Self::zeros(d1, 1);
         for i in 0..d1 {
@@ -1417,7 +1432,8 @@ pub trait Array2<T: Debug + Display + Copy + Sized>: MutArrayView2<T> + Sized +
             0 => {
                 assert!(
                     index.iter().all(|&i| i < nrows),
-                    "All indices in `take` should be < {nrows}"
+                    "All indices in `take` should be < {}",
+                    nrows
                 );
                 Self::from_iterator(
                     index
@@ -1432,7 +1448,8 @@ pub trait Array2<T: Debug + Display + Copy + Sized>: MutArrayView2<T> + Sized +
             _ => {
                 assert!(
                     index.iter().all(|&i| i < ncols),
-                    "All indices in `take` should be < {ncols}"
+                    "All indices in `take` should be < {}",
+                    ncols
                 );
                 Self::from_iterator(
                     (0..nrows)
@@ -1719,7 +1736,7 @@ mod tests {
         let r = Vec::<f32>::rand(4);
         assert!(r.iterator(0).all(|&e| e <= 1f32));
         assert!(r.iterator(0).all(|&e| e >= 0f32));
-        assert!(r.iterator(0).copied().sum::<f32>() > 0f32);
+        assert!(r.iterator(0).map(|v| *v).sum::<f32>() > 0f32);
     }

     #[test]
@@ -1937,7 +1954,7 @@ mod tests {
             DenseMatrix::from_2d_array(&[&[1, 3], &[2, 4]])
         );
         assert_eq!(
-            DenseMatrix::concatenate_2d(&[&a, &b], 0),
+            DenseMatrix::concatenate_2d(&[&a.clone(), &b.clone()], 0),
             DenseMatrix::from_2d_array(&[&[1, 2], &[3, 4], &[5, 6], &[7, 8]])
         );
         assert_eq!(
@@ -2008,7 +2025,7 @@ mod tests {
         let r = DenseMatrix::<f32>::rand(2, 2);
         assert!(r.iterator(0).all(|&e| e <= 1f32));
         assert!(r.iterator(0).all(|&e| e >= 0f32));
-        assert!(r.iterator(0).copied().sum::<f32>() > 0f32);
+        assert!(r.iterator(0).map(|v| *v).sum::<f32>() > 0f32);
     }

     #[test]
+9 -9

@@ -581,9 +581,9 @@ mod tests {
             vec![4, 5, 6],
             DenseMatrix::from_slice(&(*x.slice(1..2, 0..3))).values
         );
-        let second_row: Vec<i32> = x.slice(1..2, 0..3).iterator(0).copied().collect();
+        let second_row: Vec<i32> = x.slice(1..2, 0..3).iterator(0).map(|x| *x).collect();
         assert_eq!(vec![4, 5, 6], second_row);
-        let second_col: Vec<i32> = x.slice(0..3, 1..2).iterator(0).copied().collect();
+        let second_col: Vec<i32> = x.slice(0..3, 1..2).iterator(0).map(|x| *x).collect();
         assert_eq!(vec![2, 5, 8], second_col);
     }
@@ -640,12 +640,12 @@ mod tests {
         let x = DenseMatrix::<&str>::from_2d_array(&[&["1", "2", "3"], &["4", "5", "6"]]);
         assert_eq!(vec!["1", "4", "2", "5", "3", "6"], x.values);
-        assert!(x.column_major);
+        assert!(x.column_major == true);

         // transpose
         let x = x.transpose();
         assert_eq!(vec!["1", "4", "2", "5", "3", "6"], x.values);
-        assert!(!x.column_major); // should change column_major
+        assert!(x.column_major == false); // should change column_major
     }
@@ -659,7 +659,7 @@ mod tests {
             vec![1, 2, 3, 4, 5, 6],
             m.values.iter().map(|e| **e).collect::<Vec<i32>>()
         );
-        assert!(!m.column_major);
+        assert!(m.column_major == false);
     }
@@ -667,10 +667,10 @@ mod tests {
         let a = DenseMatrix::from_2d_array(&[&[1, 2, 3], &[4, 5, 6]]);
         let b = DenseMatrix::from_2d_array(&[&[1, 2], &[3, 4], &[5, 6]]);
-        println!("{a}");
+        println!("{}", a);
         // take column 0 and 2
         assert_eq!(vec![1, 3, 4, 6], a.take(&[0, 2], 1).values);
-        println!("{b}");
+        println!("{}", b);
         // take rows 0 and 2
         assert_eq!(vec![1, 2, 5, 6], b.take(&[0, 2], 0).values);
     }
@@ -692,11 +692,11 @@ mod tests {
         let a = a.reshape(2, 6, 0);
         assert_eq!(vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], a.values);
-        assert!(a.ncols == 6 && a.nrows == 2 && !a.column_major);
+        assert!(a.ncols == 6 && a.nrows == 2 && a.column_major == false);
         let a = a.reshape(3, 4, 1);
         assert_eq!(vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], a.values);
-        assert!(a.ncols == 4 && a.nrows == 3 && a.column_major);
+        assert!(a.ncols == 4 && a.nrows == 3 && a.column_major == true);
     }

     #[test]
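The transpose test in this section relies on `DenseMatrix` flipping its `column_major` flag rather than moving data: `x.values` is unchanged after `transpose()`, only the flag changes. A hedged sketch of that trick, with illustrative names rather than smartcore's actual fields:

```rust
// Minimal row/column-major matrix: transpose swaps the dimensions and
// flips the layout flag; the backing Vec is never touched.
// Illustrative only, not smartcore's implementation.
struct Mat {
    nrows: usize,
    ncols: usize,
    column_major: bool,
    values: Vec<i32>,
}

impl Mat {
    fn get(&self, r: usize, c: usize) -> i32 {
        if self.column_major {
            self.values[c * self.nrows + r]
        } else {
            self.values[r * self.ncols + c]
        }
    }

    fn transpose(self) -> Mat {
        Mat {
            nrows: self.ncols,
            ncols: self.nrows,
            column_major: !self.column_major,
            values: self.values, // same buffer, reinterpreted
        }
    }
}

fn main() {
    // 2x3 matrix [[1,2,3],[4,5,6]] stored column-major: 1 4 2 5 3 6
    let m = Mat { nrows: 2, ncols: 3, column_major: true, values: vec![1, 4, 2, 5, 3, 6] };
    assert_eq!(m.get(1, 2), 6);
    let t = m.transpose(); // now 3x2, row-major, same values
    assert_eq!(t.get(2, 1), 6);
    assert!(!t.column_major);
}
```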
+3 -3

@@ -160,8 +160,8 @@ mod tests {
     fn dot_product<T: Number, V: Array1<T>>(v: &V) -> T {
         let vv = V::zeros(10);
         let v_s = vv.slice(0..3);
-        v_s.dot(v)
+        let dot = v_s.dot(v);
+        dot
     }

     fn vector_ops<T: Number + PartialOrd, V: Array1<T>>(_: &V) -> T {
@@ -216,7 +216,7 @@ mod tests {
     #[test]
     fn test_mut_iterator() {
         let mut x = vec![1, 2, 3];
-        x.iterator_mut(0).for_each(|v| *v *= 2);
+        x.iterator_mut(0).for_each(|v| *v = *v * 2);
         assert_eq!(vec![2, 4, 6], x);
     }
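The `.copied()` vs `.map(|v| *v)` swaps that recur through these hunks are behaviorally identical: `Iterator::copied` (stable since Rust 1.36) is just the named form of dereferencing each `&T` into a `T: Copy`. Likewise `*v *= 2` and `*v = *v * 2` compile to the same mutation. A minimal sketch:

```rust
fn main() {
    // copied() and map(|v| *v) yield the same Vec
    let x = vec![1, 2, 3];
    let a: Vec<i32> = x.iter().copied().collect();
    let b: Vec<i32> = x.iter().map(|v| *v).collect();
    assert_eq!(a, b);

    // compound assignment vs explicit read-modify-write: same result
    let mut y = vec![1, 2, 3];
    y.iter_mut().for_each(|v| *v *= 2);
    assert_eq!(y, vec![2, 4, 6]);
}
```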
+6 -6

@@ -217,7 +217,7 @@ mod tests {
     fn test_iterator() {
         let a = arr2(&[[1, 2, 3], [4, 5, 6]]);
-        let v: Vec<i32> = a.iterator(0).copied().collect();
+        let v: Vec<i32> = a.iterator(0).map(|&v| v).collect();
         assert_eq!(v, vec!(1, 2, 3, 4, 5, 6));
     }
@@ -236,7 +236,7 @@ mod tests {
         let x = arr2(&[[1, 2, 3], [4, 5, 6]]);
         let x_slice = Array2::slice(&x, 0..2, 1..2);
         assert_eq!((2, 1), x_slice.shape());
-        let v: Vec<i32> = x_slice.iterator(0).copied().collect();
+        let v: Vec<i32> = x_slice.iterator(0).map(|&v| v).collect();
         assert_eq!(v, [2, 5]);
     }
@@ -245,11 +245,11 @@ mod tests {
         let x = arr2(&[[1, 2, 3], [4, 5, 6]]);
         let x_slice = Array2::slice(&x, 0..2, 0..3);
         assert_eq!(
-            x_slice.iterator(0).copied().collect::<Vec<i32>>(),
+            x_slice.iterator(0).map(|&v| v).collect::<Vec<i32>>(),
             vec![1, 2, 3, 4, 5, 6]
         );
         assert_eq!(
-            x_slice.iterator(1).copied().collect::<Vec<i32>>(),
+            x_slice.iterator(1).map(|&v| v).collect::<Vec<i32>>(),
             vec![1, 4, 2, 5, 3, 6]
         );
     }
@@ -279,8 +279,8 @@ mod tests {
     fn test_c_from_iterator() {
         let data = vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12];
         let a: NDArray2<i32> = Array2::from_iterator(data.clone().into_iter(), 4, 3, 0);
-        println!("{a}");
+        println!("{}", a);
         let a: NDArray2<i32> = Array2::from_iterator(data.into_iter(), 4, 3, 1);
-        println!("{a}");
+        println!("{}", a);
     }
 }
+1 -1

@@ -152,7 +152,7 @@ mod tests {
     fn test_iterator() {
         let a = arr1(&[1, 2, 3]);
-        let v: Vec<i32> = a.iterator(0).copied().collect();
+        let v: Vec<i32> = a.iterator(0).map(|&v| v).collect();
         assert_eq!(v, vec!(1, 2, 3));
     }
+13 -9

@@ -66,7 +66,7 @@ pub trait EVDDecomposable<T: Number + RealNumber>: Array2<T> {
     fn evd_mut(mut self, symmetric: bool) -> Result<EVD<T, Self>, Failed> {
         let (nrows, ncols) = self.shape();
         if ncols != nrows {
-            panic!("Matrix is not square: {nrows} x {ncols}");
+            panic!("Matrix is not square: {} x {}", nrows, ncols);
         }

         let n = nrows;
@@ -837,8 +837,10 @@ mod tests {
             evd.V.abs(),
             epsilon = 1e-4
         ));
-        for (i, eigen_values_i) in eigen_values.iter().enumerate() {
-            assert!((eigen_values_i - evd.d[i]).abs() < 1e-4);
+        for i in 0..eigen_values.len() {
+            assert!((eigen_values[i] - evd.d[i]).abs() < 1e-4);
+        }
+        for i in 0..eigen_values.len() {
             assert!((0f64 - evd.e[i]).abs() < std::f64::EPSILON);
         }
     }
@@ -869,8 +871,10 @@ mod tests {
             evd.V.abs(),
             epsilon = 1e-4
         ));
-        for (i, eigen_values_i) in eigen_values.iter().enumerate() {
-            assert!((eigen_values_i - evd.d[i]).abs() < 1e-4);
+        for i in 0..eigen_values.len() {
+            assert!((eigen_values[i] - evd.d[i]).abs() < 1e-4);
+        }
+        for i in 0..eigen_values.len() {
             assert!((0f64 - evd.e[i]).abs() < std::f64::EPSILON);
         }
     }
@@ -904,11 +908,11 @@ mod tests {
             evd.V.abs(),
             epsilon = 1e-4
         ));
-        for (i, eigen_values_d_i) in eigen_values_d.iter().enumerate() {
-            assert!((eigen_values_d_i - evd.d[i]).abs() < 1e-4);
+        for i in 0..eigen_values_d.len() {
+            assert!((eigen_values_d[i] - evd.d[i]).abs() < 1e-4);
         }
-        for (i, eigen_values_e_i) in eigen_values_e.iter().enumerate() {
-            assert!((eigen_values_e_i - evd.e[i]).abs() < 1e-4);
+        for i in 0..eigen_values_e.len() {
+            assert!((eigen_values_e[i] - evd.e[i]).abs() < 1e-4);
         }
     }
 }
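The EVD test hunks above trade `iter().enumerate()` for plain index loops; both forms visit the same `(i, value)` pairs, so the assertions are unchanged. A sketch showing the two loop styles agree (the helper name is illustrative, not smartcore's):

```rust
// Compare element-wise |a[i] - b[i]| two ways: with enumerate, and with
// an index loop as in the hunks above. Both must agree.
fn max_abs_diff(a: &[f64], b: &[f64]) -> f64 {
    // enumerate form
    let via_enumerate = a
        .iter()
        .enumerate()
        .map(|(i, a_i)| (a_i - b[i]).abs())
        .fold(0.0f64, f64::max);
    // index-loop form
    let mut via_index = 0.0f64;
    for i in 0..a.len() {
        via_index = via_index.max((a[i] - b[i]).abs());
    }
    assert!((via_enumerate - via_index).abs() < f64::EPSILON);
    via_enumerate
}

fn main() {
    assert!((max_abs_diff(&[1.0, 2.0], &[1.0, 2.5]) - 0.5).abs() < 1e-12);
}
```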
+5 -2

@@ -126,7 +126,7 @@ impl<T: Number + RealNumber, M: Array2<T>> LU<T, M> {
         let (m, n) = self.LU.shape();

         if m != n {
-            panic!("Matrix is not square: {m}x{n}");
+            panic!("Matrix is not square: {}x{}", m, n);
         }

         let mut inv = M::zeros(n, n);
@@ -143,7 +143,10 @@ impl<T: Number + RealNumber, M: Array2<T>> LU<T, M> {
         let (b_m, b_n) = b.shape();

         if b_m != m {
-            panic!("Row dimensions do not agree: A is {m} x {n}, but B is {b_m} x {b_n}");
+            panic!(
+                "Row dimensions do not agree: A is {} x {}, but B is {} x {}",
+                m, n, b_m, b_n
+            );
         }

         if self.singular {
+4 -1

@@ -102,7 +102,10 @@ impl<T: Number + RealNumber, M: Array2<T>> QR<T, M> {
         let (b_nrows, b_ncols) = b.shape();

         if b_nrows != m {
-            panic!("Row dimensions do not agree: A is {m} x {n}, but B is {b_nrows} x {b_ncols}");
+            panic!(
+                "Row dimensions do not agree: A is {} x {}, but B is {} x {}",
+                m, n, b_nrows, b_ncols
+            );
         }

         if self.singular {
+1 -1

@@ -286,7 +286,7 @@ mod tests {
         }
         {
-            let mut m = m;
+            let mut m = m.clone();
             m.standard_scale_mut(&m.mean(1), &m.std(1), 1);
             assert_eq!(&m, &expected_1);
         }
+4 -4

@@ -509,8 +509,8 @@ mod tests {
         assert!(relative_eq!(V.abs(), svd.V.abs(), epsilon = 1e-4));
         assert!(relative_eq!(U.abs(), svd.U.abs(), epsilon = 1e-4));
-        for (i, s_i) in s.iter().enumerate() {
-            assert!((s_i - svd.s[i]).abs() < 1e-4);
+        for i in 0..s.len() {
+            assert!((s[i] - svd.s[i]).abs() < 1e-4);
         }
     }

     #[cfg_attr(
@@ -713,8 +713,8 @@ mod tests {
         assert!(relative_eq!(V.abs(), svd.V.abs(), epsilon = 1e-4));
         assert!(relative_eq!(U.abs(), svd.U.abs(), epsilon = 1e-4));
-        for (i, s_i) in s.iter().enumerate() {
-            assert!((s_i - svd.s[i]).abs() < 1e-4);
+        for i in 0..s.len() {
+            assert!((s[i] - svd.s[i]).abs() < 1e-4);
         }
     }

     #[cfg_attr(
+4 -1

@@ -425,7 +425,10 @@ impl<TX: FloatNumber + RealNumber, TY: Number, X: Array2<TX>, Y: Array1<TY>>
         for (i, col_std_i) in col_std.iter().enumerate() {
             if (*col_std_i - TX::zero()).abs() < TX::epsilon() {
-                return Err(Failed::fit(&format!("Cannot rescale constant column {i}")));
+                return Err(Failed::fit(&format!(
+                    "Cannot rescale constant column {}",
+                    i
+                )));
             }
         }
+4 -1

@@ -356,7 +356,10 @@ impl<TX: FloatNumber + RealNumber, TY: Number, X: Array2<TX>, Y: Array1<TY>> Las
         for (i, col_std_i) in col_std.iter().enumerate() {
             if (*col_std_i - TX::zero()).abs() < TX::epsilon() {
-                return Err(Failed::fit(&format!("Cannot rescale constant column {i}")));
+                return Err(Failed::fit(&format!(
+                    "Cannot rescale constant column {}",
+                    i
+                )));
             }
         }
+15 -9

@@ -71,14 +71,19 @@ use crate::optimization::line_search::Backtracking;
 use crate::optimization::FunctionOrder;

 #[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
-#[derive(Debug, Clone, Eq, PartialEq, Default)]
+#[derive(Debug, Clone, Eq, PartialEq)]
 /// Solver options for Logistic regression. Right now only LBFGS solver is supported.
 pub enum LogisticRegressionSolverName {
     /// Limited-memory Broyden-Fletcher-Goldfarb-Shanno method, see [LBFGS paper](http://users.iems.northwestern.edu/~nocedal/lbfgsb.html)
-    #[default]
     LBFGS,
 }

+impl Default for LogisticRegressionSolverName {
+    fn default() -> Self {
+        LogisticRegressionSolverName::LBFGS
+    }
+}
+
 /// Logistic Regression parameters
 #[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
 #[derive(Debug, Clone)]
@@ -444,7 +449,8 @@ impl<TX: Number + FloatNumber + RealNumber, TY: Number + Ord, X: Array2<TX>, Y:
         match k.cmp(&2) {
             Ordering::Less => Err(Failed::fit(&format!(
-                "incorrect number of classes: {k}. Should be >= 2."
+                "incorrect number of classes: {}. Should be >= 2.",
+                k
             ))),
             Ordering::Equal => {
                 let x0 = Vec::zeros(num_attributes + 1);
@@ -630,19 +636,19 @@ mod tests {
         assert!((g[0] + 33.000068218163484).abs() < std::f64::EPSILON);

-        let f = objective.f(&[1., 2., 3., 4., 5., 6., 7., 8., 9.]);
+        let f = objective.f(&vec![1., 2., 3., 4., 5., 6., 7., 8., 9.]);
         assert!((f - 408.0052230582765).abs() < std::f64::EPSILON);

         let objective_reg = MultiClassObjectiveFunction {
             x: &x,
-            y,
+            y: y.clone(),
             k: 3,
             alpha: 1.0,
             _phantom_t: PhantomData,
         };

-        let f = objective_reg.f(&[1., 2., 3., 4., 5., 6., 7., 8., 9.]);
+        let f = objective_reg.f(&vec![1., 2., 3., 4., 5., 6., 7., 8., 9.]);
         assert!((f - 487.5052).abs() < 1e-4);

         objective_reg.df(&mut g, &vec![1., 2., 3., 4., 5., 6., 7., 8., 9.]);
@@ -691,18 +697,18 @@ mod tests {
         assert!((g[1] - 10.239000702928523).abs() < std::f64::EPSILON);
         assert!((g[2] - 3.869294270156324).abs() < std::f64::EPSILON);

-        let f = objective.f(&[1., 2., 3.]);
+        let f = objective.f(&vec![1., 2., 3.]);
         assert!((f - 59.76994756647412).abs() < std::f64::EPSILON);

         let objective_reg = BinaryObjectiveFunction {
             x: &x,
-            y,
+            y: y.clone(),
             alpha: 1.0,
             _phantom_t: PhantomData,
         };

-        let f = objective_reg.f(&[1., 2., 3.]);
+        let f = objective_reg.f(&vec![1., 2., 3.]);
         assert!((f - 62.2699).abs() < 1e-4);

         objective_reg.df(&mut g, &vec![1., 2., 3.]);
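The enum change in this file swaps between the `#[derive(Default)]` + `#[default]` variant attribute (stable since Rust 1.62) and a hand-written `impl Default`, presumably an MSRV consideration; the two forms are equivalent. A sketch with an illustrative enum name:

```rust
// Illustrative enum, not smartcore's. The hand-written impl below
// compiles on older toolchains; on Rust 1.62+ the same behavior is
// expressed as:
//   #[derive(Default)] enum Solver { #[default] Lbfgs, GradientDescent }
#[derive(Debug, Clone, PartialEq, Eq)]
enum Solver {
    Lbfgs,
    GradientDescent,
}

impl Default for Solver {
    fn default() -> Self {
        Solver::Lbfgs
    }
}

fn main() {
    assert_eq!(Solver::default(), Solver::Lbfgs);
    assert_ne!(Solver::default(), Solver::GradientDescent);
}
```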
+11 -3

@@ -71,16 +71,21 @@ use crate::numbers::basenum::Number;
 use crate::numbers::realnum::RealNumber;

 #[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
-#[derive(Debug, Clone, Eq, PartialEq, Default)]
+#[derive(Debug, Clone, Eq, PartialEq)]
 /// Approach to use for estimation of regression coefficients. Cholesky is more efficient but SVD is more stable.
 pub enum RidgeRegressionSolverName {
     /// Cholesky decomposition, see [Cholesky](../../linalg/cholesky/index.html)
-    #[default]
     Cholesky,
     /// SVD decomposition, see [SVD](../../linalg/svd/index.html)
     SVD,
 }

+impl Default for RidgeRegressionSolverName {
+    fn default() -> Self {
+        RidgeRegressionSolverName::Cholesky
+    }
+}
+
 /// Ridge Regression parameters
 #[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
 #[derive(Debug, Clone)]
@@ -379,7 +384,10 @@ impl<
         for (i, col_std_i) in col_std.iter().enumerate() {
             if (*col_std_i - TX::zero()).abs() < TX::epsilon() {
-                return Err(Failed::fit(&format!("Cannot rescale constant column {i}")));
+                return Err(Failed::fit(&format!(
+                    "Cannot rescale constant column {}",
+                    i
+                )));
             }
         }
+3 -3

@@ -98,8 +98,8 @@ mod tests {
         let mut scores = HCVScore::new();
         scores.compute(&v1, &v2);
-        assert!((0.2548 - scores.homogeneity.unwrap()).abs() < 1e-4);
-        assert!((0.5440 - scores.completeness.unwrap()).abs() < 1e-4);
-        assert!((0.3471 - scores.v_measure.unwrap()).abs() < 1e-4);
+        assert!((0.2548 - scores.homogeneity.unwrap() as f64).abs() < 1e-4);
+        assert!((0.5440 - scores.completeness.unwrap() as f64).abs() < 1e-4);
+        assert!((0.3471 - scores.v_measure.unwrap() as f64).abs() < 1e-4);
     }
 }
+1 -1

@@ -125,7 +125,7 @@ mod tests {
     fn entropy_test() {
         let v1 = vec![0, 0, 1, 1, 2, 0, 4];
-        assert!((1.2770 - entropy(&v1).unwrap()).abs() < 1e-4);
+        assert!((1.2770 - entropy(&v1).unwrap() as f64).abs() < 1e-4);
     }

     #[cfg_attr(
+2 -2

@@ -95,8 +95,8 @@ mod tests {
         let score1: f64 = F1::new_with(beta).get_score(&y_true, &y_pred);
         let score2: f64 = F1::new_with(beta).get_score(&y_true, &y_true);
-        println!("{score1:?}");
-        println!("{score2:?}");
+        println!("{:?}", score1);
+        println!("{:?}", score2);

         assert!((score1 - 0.57142857).abs() < 1e-8);
         assert!((score2 - 1.0).abs() < 1e-8);
+4 -4

@@ -213,17 +213,17 @@ mod tests {
         for t in &test_masks[0][0..11] {
             // TODO: this can be prob done better
-            assert!(*t)
+            assert_eq!(*t, true)
         }
         for t in &test_masks[0][11..22] {
-            assert!(!*t)
+            assert_eq!(*t, false)
         }
         for t in &test_masks[1][0..11] {
-            assert!(!*t)
+            assert_eq!(*t, false)
         }
         for t in &test_masks[1][11..22] {
-            assert!(*t)
+            assert_eq!(*t, true)
         }
     }
+2 -2

@@ -169,7 +169,7 @@ pub fn train_test_split<
     let n_test = ((n as f32) * test_size) as usize;

     if n_test < 1 {
-        panic!("number of sample is too small {n}");
+        panic!("number of sample is too small {}", n);
     }

     let mut indices: Vec<usize> = (0..n).collect();
@@ -553,6 +553,6 @@ mod tests {
             &accuracy,
         )
         .unwrap();
-        println!("{results:?}");
+        println!("{:?}", results);
     }
 }
+8 -4

@@ -271,18 +271,21 @@ impl<TY: Number + Ord + Unsigned> BernoulliNBDistribution<TY> {
         let y_samples = y.shape();
         if y_samples != n_samples {
             return Err(Failed::fit(&format!(
-                "Size of x should equal size of y; |x|=[{n_samples}], |y|=[{y_samples}]"
+                "Size of x should equal size of y; |x|=[{}], |y|=[{}]",
+                n_samples, y_samples
             )));
         }

         if n_samples == 0 {
             return Err(Failed::fit(&format!(
-                "Size of x and y should greater than 0; |x|=[{n_samples}]"
+                "Size of x and y should greater than 0; |x|=[{}]",
+                n_samples
             )));
         }
         if alpha < 0f64 {
             return Err(Failed::fit(&format!(
-                "Alpha should be greater than 0; |alpha|=[{alpha}]"
+                "Alpha should be greater than 0; |alpha|=[{}]",
+                alpha
             )));
         }
@@ -315,7 +318,8 @@ impl<TY: Number + Ord + Unsigned> BernoulliNBDistribution<TY> {
                 feature_in_class_counter[class_index][idx] +=
                     row_i.to_usize().ok_or_else(|| {
                         Failed::fit(&format!(
-                            "Elements of the matrix should be 1.0 or 0.0 |found|=[{row_i}]"
+                            "Elements of the matrix should be 1.0 or 0.0 |found|=[{}]",
+                            row_i
                         ))
                     })?;
             }
+9 -4

@@ -158,7 +158,8 @@ impl<T: Number + Unsigned> CategoricalNBDistribution<T> {
     pub fn fit<X: Array2<T>, Y: Array1<T>>(x: &X, y: &Y, alpha: f64) -> Result<Self, Failed> {
         if alpha < 0f64 {
             return Err(Failed::fit(&format!(
-                "alpha should be >= 0, alpha=[{alpha}]"
+                "alpha should be >= 0, alpha=[{}]",
+                alpha
             )));
         }
@@ -166,13 +167,15 @@ impl<T: Number + Unsigned> CategoricalNBDistribution<T> {
         let y_samples = y.shape();
         if y_samples != n_samples {
             return Err(Failed::fit(&format!(
-                "Size of x should equal size of y; |x|=[{n_samples}], |y|=[{y_samples}]"
+                "Size of x should equal size of y; |x|=[{}], |y|=[{}]",
+                n_samples, y_samples
             )));
         }

         if n_samples == 0 {
             return Err(Failed::fit(&format!(
-                "Size of x and y should greater than 0; |x|=[{n_samples}]"
+                "Size of x and y should greater than 0; |x|=[{}]",
+                n_samples
             )));
         }
         let y: Vec<usize> = y.iterator(0).map(|y_i| y_i.to_usize().unwrap()).collect();
@@ -199,7 +202,8 @@ impl<T: Number + Unsigned> CategoricalNBDistribution<T> {
                 .max()
                 .ok_or_else(|| {
                     Failed::fit(&format!(
-                        "Failed to get the categories for feature = {feature}"
+                        "Failed to get the categories for feature = {}",
+                        feature
                     ))
                 })?;
             n_categories.push(feature_max + 1);
@@ -425,6 +429,7 @@ mod tests {
     fn search_parameters() {
         let parameters = CategoricalNBSearchParameters {
             alpha: vec![1., 2.],
+            ..Default::default()
         };
         let mut iter = parameters.into_iter();
         let next = iter.next().unwrap();
+5 -2

@@ -185,13 +185,15 @@ impl<TY: Number + Ord + Unsigned> GaussianNBDistribution<TY> {
         let y_samples = y.shape();
         if y_samples != n_samples {
             return Err(Failed::fit(&format!(
-                "Size of x should equal size of y; |x|=[{n_samples}], |y|=[{y_samples}]"
+                "Size of x should equal size of y; |x|=[{}], |y|=[{}]",
+                n_samples, y_samples
             )));
         }

         if n_samples == 0 {
             return Err(Failed::fit(&format!(
-                "Size of x and y should greater than 0; |x|=[{n_samples}]"
+                "Size of x and y should greater than 0; |x|=[{}]",
+                n_samples
             )));
         }
         let (class_labels, indices) = y.unique_with_indices();
@@ -373,6 +375,7 @@ mod tests {
     fn search_parameters() {
         let parameters = GaussianNBSearchParameters {
             priors: vec![Some(vec![1.]), Some(vec![2.])],
+            ..Default::default()
         };
         let mut iter = parameters.into_iter();
         let next = iter.next().unwrap();
+8 -4

@@ -220,18 +220,21 @@ impl<TY: Number + Ord + Unsigned> MultinomialNBDistribution<TY> {
         let y_samples = y.shape();
         if y_samples != n_samples {
             return Err(Failed::fit(&format!(
-                "Size of x should equal size of y; |x|=[{n_samples}], |y|=[{y_samples}]"
+                "Size of x should equal size of y; |x|=[{}], |y|=[{}]",
+                n_samples, y_samples
             )));
         }

         if n_samples == 0 {
             return Err(Failed::fit(&format!(
-                "Size of x and y should greater than 0; |x|=[{n_samples}]"
+                "Size of x and y should greater than 0; |x|=[{}]",
+                n_samples
             )));
         }
         if alpha < 0f64 {
             return Err(Failed::fit(&format!(
-                "Alpha should be greater than 0; |alpha|=[{alpha}]"
+                "Alpha should be greater than 0; |alpha|=[{}]",
+                alpha
             )));
         }
@@ -263,7 +266,8 @@ impl<TY: Number + Ord + Unsigned> MultinomialNBDistribution<TY> {
                 feature_in_class_counter[class_index][idx] +=
                     row_i.to_usize().ok_or_else(|| {
                         Failed::fit(&format!(
-                            "Elements of the matrix should be convertible to usize |found|=[{row_i}]"
+                            "Elements of the matrix should be convertible to usize |found|=[{}]",
+                            row_i
                         ))
                     })?;
             }
+2 -1

@@ -236,7 +236,8 @@ impl<TX: Number, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>, D: Distance<Vec
         if x_n != y_n {
             return Err(Failed::fit(&format!(
-                "Size of x should equal size of y; |x|=[{x_n}], |y|=[{y_n}]"
+                "Size of x should equal size of y; |x|=[{}], |y|=[{}]",
+                x_n, y_n
             )));
         }
+2 -1

@@ -224,7 +224,8 @@ impl<TX: Number, TY: Number, X: Array2<TX>, Y: Array1<TY>, D: Distance<Vec<TX>>>
         if x_n != y_n {
             return Err(Failed::fit(&format!(
-                "Size of x should equal size of y; |x|=[{x_n}], |y|=[{y_n}]"
+                "Size of x should equal size of y; |x|=[{}], |y|=[{}]",
+                x_n, y_n
             )));
         }
+7 -2
@@ -49,15 +49,20 @@ pub type KNNAlgorithmName = crate::algorithm::neighbour::KNNAlgorithmName;
 /// Weight function that is used to determine estimated value.
 #[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
-#[derive(Debug, Clone, Default)]
+#[derive(Debug, Clone)]
 pub enum KNNWeightFunction {
     /// All k nearest points are weighted equally
-    #[default]
     Uniform,
     /// k nearest points are weighted by the inverse of their distance. Closer neighbors will have a greater influence than neighbors which are further away.
     Distance,
 }

+impl Default for KNNWeightFunction {
+    fn default() -> Self {
+        KNNWeightFunction::Uniform
+    }
+}
+
 impl KNNWeightFunction {
     fn calc_weights(&self, distances: Vec<f64>) -> std::vec::Vec<f64> {
         match *self {
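This hunk swaps the derived `Default` (with a `#[default]` variant attribute, stabilized in Rust 1.62) for a manual `impl Default`, which works on any toolchain. A sketch of both spellings on stand-in enums (the names here are illustrative, not the crate's):

```rust
// Rust 1.62+ form: derive Default and tag the default variant.
#[derive(Debug, Default, PartialEq)]
enum WeightFnDerived {
    #[default]
    Uniform,
    #[allow(dead_code)]
    Distance,
}

// Pre-1.62 form, as restored by this diff: a manual impl.
#[derive(Debug, PartialEq)]
enum WeightFnManual {
    Uniform,
    #[allow(dead_code)]
    Distance,
}

impl Default for WeightFnManual {
    fn default() -> Self {
        WeightFnManual::Uniform
    }
}

fn main() {
    // Both spellings pick Uniform as the default variant.
    assert_eq!(WeightFnDerived::default(), WeightFnDerived::Uniform);
    assert_eq!(WeightFnManual::default(), WeightFnManual::Uniform);
}
```

The manual impl is more verbose but keeps the minimum supported Rust version lower, which matters for a library crate.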
+3 -26
@@ -2,13 +2,9 @@
 //! Most algorithms in `smartcore` rely on basic linear algebra operations like dot product, matrix decomposition and other subroutines that are defined for a set of real numbers, .
 //! This module defines real number and some useful functions that are used in [Linear Algebra](../../linalg/index.html) module.
-use rand::rngs::SmallRng;
-use rand::{Rng, SeedableRng};
 use num_traits::Float;

 use crate::numbers::basenum::Number;
-use crate::rand_custom::get_rng_impl;

 /// Defines real number
 /// <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS_CHTML"></script>
@@ -67,12 +63,8 @@ impl RealNumber for f64 {
     }

     fn rand() -> f64 {
-        let mut small_rng = get_rng_impl(None);
-
-        let mut rngs: Vec<SmallRng> = (0..3)
-            .map(|_| SmallRng::from_rng(&mut small_rng).unwrap())
-            .collect();
-        rngs[0].gen::<f64>()
+        // TODO: to be implemented, see issue smartcore#214
+        1.0
     }

     fn two() -> Self {
@@ -116,12 +108,7 @@ impl RealNumber for f32 {
     }

     fn rand() -> f32 {
-        let mut small_rng = get_rng_impl(None);
-        let mut rngs: Vec<SmallRng> = (0..3)
-            .map(|_| SmallRng::from_rng(&mut small_rng).unwrap())
-            .collect();
-        rngs[0].gen::<f32>()
+        1.0
     }

     fn two() -> Self {
@@ -162,14 +149,4 @@ mod tests {
     fn f64_from_string() {
         assert_eq!(f64::from_str("1.111111111").unwrap(), 1.111111111)
     }
-
-    #[test]
-    fn f64_rand() {
-        f64::rand();
-    }
-
-    #[test]
-    fn f32_rand() {
-        f32::rand();
-    }
 }
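Here `RealNumber::rand()` is stubbed out to return `1.0` (tracked in smartcore#214) and the `rand`/`SmallRng` machinery is dropped from the numbers module, in line with the wasm32 work in this commit set. As a dependency-free illustration of what the removed code did (seed a small, fast PRNG and draw an `f64` in `[0, 1)`), a std-only xorshift sketch; `TinyRng` is a hypothetical stand-in, not the rand crate's `SmallRng`:

```rust
// A tiny xorshift64*-style generator: a dependency-free stand-in used
// only to illustrate the behavior the stubbed-out rand() used to have.
struct TinyRng(u64);

impl TinyRng {
    fn new(seed: u64) -> Self {
        TinyRng(seed.max(1)) // state must be non-zero for xorshift
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x.wrapping_mul(0x2545F4914F6CDD1D)
    }

    // Map the top 53 bits to a float in [0, 1), like Rng::gen::<f64>().
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

fn main() {
    let mut rng = TinyRng::new(42);
    let v = rng.next_f64();
    assert!((0.0..1.0).contains(&v));
}
```

On `wasm32-unknown-unknown`, seeding from OS entropy is exactly the step that requires `getrandom`'s `js` feature, which is why the stub sidesteps it for now.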
@@ -113,13 +113,12 @@ mod tests {
         g[1] = 200. * (x[1] - x[0].powf(2.));
     };

-    let ls: Backtracking<f64> = Backtracking::<f64> {
-        order: FunctionOrder::THIRD,
-        ..Default::default()
-    };
+    let mut ls: Backtracking<f64> = Default::default();
+    ls.order = FunctionOrder::THIRD;
     let optimizer: GradientDescent = Default::default();
     let result = optimizer.optimize(&f, &df, &x0, &ls);
+    println!("{:?}", result);

     assert!((result.f_x - 0.0).abs() < 1e-5);
     assert!((result.x[0] - 1.0).abs() < 1e-2);
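This test trades struct-update syntax (`..Default::default()`) for a `mut` binding plus field reassignment. Both build the same value; clippy's `field_reassign_with_default` lint actually prefers the former, so the reassignment form presumably carries a lint allowance elsewhere. A sketch with a stand-in struct (these fields are hypothetical, not the real `Backtracking<f64>` layout):

```rust
#[derive(Debug, Default, PartialEq)]
struct Backtracking {
    order: u8, // stand-in for the FunctionOrder enum field
    max_iterations: usize,
}

fn main() {
    // Struct-update syntax: override one field, default the rest.
    let a = Backtracking {
        order: 3,
        ..Default::default()
    };

    // Field reassignment after Default::default(), as in the new test code.
    let mut b: Backtracking = Default::default();
    b.order = 3;

    assert_eq!(a, b);
}
```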
+4 -6
@@ -196,9 +196,9 @@ impl LBFGS {
     }

     ///
-    fn update_hessian<T: FloatNumber, X: Array1<T>>(
+    fn update_hessian<'a, T: FloatNumber, X: Array1<T>>(
         &self,
-        _: &DF<'_, X>,
+        _: &'a DF<'_, X>,
         state: &mut LBFGSState<T, X>,
     ) {
         state.dg = state.x_df.sub(&state.x_df_prev);
@@ -291,10 +291,8 @@ mod tests {
         g[0] = -2. * (1. - x[0]) - 400. * (x[1] - x[0].powf(2.)) * x[0];
         g[1] = 200. * (x[1] - x[0].powf(2.));
     };

-    let ls: Backtracking<f64> = Backtracking::<f64> {
-        order: FunctionOrder::THIRD,
-        ..Default::default()
-    };
+    let mut ls: Backtracking<f64> = Default::default();
+    ls.order = FunctionOrder::THIRD;
     let optimizer: LBFGS = Default::default();
     let result = optimizer.optimize(&f, &df, &x0, &ls);
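The `update_hessian` change names a lifetime (`'a`) that lifetime elision would otherwise supply automatically; the two signatures are equivalent, so this is a stylistic change rather than a semantic one. A minimal sketch of the equivalence on a simpler function:

```rust
// Elided: the compiler assigns a fresh lifetime to the reference.
fn len_elided(s: &str) -> usize {
    s.len()
}

// Explicit: the same signature with the lifetime named, mirroring the
// diff's `fn update_hessian<'a, ...>(_: &'a DF<'_, X>, ...)`.
fn len_named<'a>(s: &'a str) -> usize {
    s.len()
}

fn main() {
    // Identical behavior; only the spelling of the signature differs.
    assert_eq!(len_elided("lbfgs"), len_named("lbfgs"));
}
```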
+9 -4
@@ -132,7 +132,8 @@ impl OneHotEncoder {
             data.copy_col_as_vec(idx, &mut col_buf);
             if !validate_col_is_categorical(&col_buf) {
                 let msg = format!(
-                    "Column {idx} of data matrix containts non categorizable (integer) values"
+                    "Column {} of data matrix containts non categorizable (integer) values",
+                    idx
                 );
                 return Err(Failed::fit(&msg[..]));
             }
@@ -181,7 +182,7 @@ impl OneHotEncoder {
             match oh_vec {
                 None => {
                     // Since we support T types, bad value in a series causes in to be invalid
-                    let msg = format!("At least one value in column {old_cidx} doesn't conform to category definition");
+                    let msg = format!("At least one value in column {} doesn't conform to category definition", old_cidx);
                     return Err(Failed::transform(&msg[..]));
                 }
                 Some(v) => {
@@ -337,7 +338,11 @@ mod tests {
         ]);

         let params = OneHotEncoderParams::from_cat_idx(&[1]);
-        let result = OneHotEncoder::fit(&m, params);
-        assert!(result.is_err());
+        match OneHotEncoder::fit(&m, params) {
+            Err(_) => {
+                assert!(true);
+            }
+            _ => assert!(false),
+        }
     }
 }
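The last hunk above rewrites `assert!(result.is_err())` as an explicit `match` with `assert!(true)`/`assert!(false)` arms. Both fail the test exactly when `fit` unexpectedly succeeds, though clippy flags the constant-assertion form via `assertions_on_constants`. A sketch with a hypothetical stand-in for a fallible `fit`:

```rust
// Hypothetical stand-in for OneHotEncoder::fit on a bad column.
fn fit(input_ok: bool) -> Result<u32, String> {
    if input_ok {
        Ok(1)
    } else {
        Err("non categorizable column".to_string())
    }
}

fn main() {
    // Terse form, as in the old test:
    let result = fit(false);
    assert!(result.is_err());

    // Explicit match, as in the new test; panicking in the Ok arm
    // plays the role of assert!(false).
    match fit(false) {
        Err(_) => {}
        Ok(_) => panic!("expected an error"),
    }
}
```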
+1 -1
@@ -294,7 +294,7 @@ mod tests {
             &[0.5708488802, 0.1846414616, 0.9590802982, 0.5591871046],
             &[0.8387612750, 0.5754861361, 0.5537109852, 0.1077646442],
         ]));
-        println!("{transformed_values}");
+        println!("{}", transformed_values);
         assert!(transformed_values.approximate_eq(
             &DenseMatrix::from_2d_array(&[
                 &[-1.1154020653, -0.4031985330, 0.9284605204, -0.4271473866],
+4 -4
@@ -206,7 +206,7 @@ mod tests {
     #[test]
     fn from_categories() {
         let fake_categories: Vec<usize> = vec![1, 2, 3, 4, 5, 3, 5, 3, 1, 2, 4];
-        let it = fake_categories.iter().copied();
+        let it = fake_categories.iter().map(|&a| a);
         let enc = CategoryMapper::<usize>::fit_to_iter(it);
         let oh_vec: Vec<f64> = match enc.get_one_hot(&1) {
             None => panic!("Wrong categories"),
@@ -218,8 +218,8 @@ mod tests {
     fn build_fake_str_enc<'a>() -> CategoryMapper<&'a str> {
         let fake_category_pos = vec!["background", "dog", "cat"];
-
-        CategoryMapper::<&str>::from_positional_category_vec(fake_category_pos)
+        let enc = CategoryMapper::<&str>::from_positional_category_vec(fake_category_pos);
+        enc
     }

     #[cfg_attr(
         all(target_arch = "wasm32", not(target_os = "wasi")),
@@ -275,7 +275,7 @@ mod tests {
         let lab = enc.invert_one_hot(res).unwrap();
         assert_eq!(lab, "dog");
         if let Err(e) = enc.invert_one_hot(vec![0.0, 0.0, 0.0]) {
-            let pos_entries = "Expected a single positive entry, 0 entires found".to_string();
+            let pos_entries = format!("Expected a single positive entry, 0 entires found");
            assert_eq!(e, Failed::transform(&pos_entries[..]));
         };
     }
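The first hunk above replaces `iter().copied()` with `iter().map(|&a| a)`. For `Copy` element types the two iterators are interchangeable (clippy's `map_clone` lint in fact suggests `copied()`), so this too reads like a style revert rather than a behavior change. A minimal demonstration of the equivalence:

```rust
fn main() {
    let fake_categories: Vec<usize> = vec![1, 2, 3, 4, 5, 3, 5, 3, 1, 2, 4];

    // Both adaptors turn &usize items into owned usize values.
    let via_copied: Vec<usize> = fake_categories.iter().copied().collect();
    let via_map: Vec<usize> = fake_categories.iter().map(|&a| a).collect();

    assert_eq!(via_copied, via_map);
    assert_eq!(via_copied, fake_categories);
}
```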
+2 -2
@@ -167,7 +167,7 @@ where
 }

 /// Ensure that a string containing a csv row conforms to a specified row format.
-fn validate_csv_row(row: &str, row_format: &CSVRowFormat<'_>) -> Result<(), ReadingError> {
+fn validate_csv_row<'a>(row: &'a str, row_format: &CSVRowFormat<'_>) -> Result<(), ReadingError> {
     let actual_number_of_fields = row.split(row_format.field_seperator).count();
     if row_format.n_fields == actual_number_of_fields {
         Ok(())
@@ -208,7 +208,7 @@ where
     match value_string.parse::<T>().ok() {
         Some(value) => Ok(value),
         None => Err(ReadingError::InvalidField {
-            msg: format!("Value '{value_string}' could not be read.",),
+            msg: format!("Value '{}' could not be read.", value_string,),
         }),
     }
 }
+10 -2
@@ -983,7 +983,11 @@ mod tests {
             .unwrap();
         let acc = accuracy(&y, &(y_hat.iter().map(|e| e.to_i32().unwrap()).collect()));

-        assert!(acc >= 0.9, "accuracy ({acc}) is not larger or equal to 0.9");
+        assert!(
+            acc >= 0.9,
+            "accuracy ({}) is not larger or equal to 0.9",
+            acc
+        );
     }

     #[cfg_attr(
@@ -1072,7 +1076,11 @@ mod tests {
         let acc = accuracy(&y, &(y_hat.iter().map(|e| e.to_i32().unwrap()).collect()));

-        assert!(acc >= 0.9, "accuracy ({acc}) is not larger or equal to 0.9");
+        assert!(
+            acc >= 0.9,
+            "accuracy ({}) is not larger or equal to 0.9",
+            acc
+        );
     }

     #[cfg_attr(
+1 -1
@@ -662,7 +662,7 @@ mod tests {
             .unwrap();

         let t = mean_squared_error(&y_hat, &y);
-        println!("{t:?}");
+        println!("{:?}", t);
         assert!(t < 2.5);
     }
+15 -22
@@ -137,17 +137,16 @@ impl<TX: Number + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>>
         self.classes.as_ref()
     }

     /// Get depth of tree
-    pub fn depth(&self) -> u16 {
+    fn depth(&self) -> u16 {
         self.depth
     }
 }

 /// The function to measure the quality of a split.
 #[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
-#[derive(Debug, Clone, Default)]
+#[derive(Debug, Clone)]
 pub enum SplitCriterion {
     /// [Gini index](../decision_tree_classifier/index.html)
-    #[default]
     Gini,
     /// [Entropy](../decision_tree_classifier/index.html)
     Entropy,
@@ -155,6 +154,12 @@ pub enum SplitCriterion {
     ClassificationError,
 }

+impl Default for SplitCriterion {
+    fn default() -> Self {
+        SplitCriterion::Gini
+    }
+}
+
 #[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
 #[derive(Debug, Clone)]
 struct Node {
@@ -538,10 +543,6 @@ impl<TX: Number + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>>
         parameters: DecisionTreeClassifierParameters,
     ) -> Result<DecisionTreeClassifier<TX, TY, X, Y>, Failed> {
         let (x_nrows, num_attributes) = x.shape();
-        if x_nrows != y.shape() {
-            return Err(Failed::fit("Size of x should equal size of y"));
-        }
         let samples = vec![1; x_nrows];
         DecisionTreeClassifier::fit_weak_learner(x, y, samples, num_attributes, parameters)
     }
@@ -559,7 +560,8 @@ impl<TX: Number + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY>>
         let k = classes.len();
         if k < 2 {
             return Err(Failed::fit(&format!(
-                "Incorrect number of classes: {k}. Should be >= 2."
+                "Incorrect number of classes: {}. Should be >= 2.",
+                k
             )));
         }
@@ -899,13 +901,15 @@ mod tests {
     )]
     #[test]
     fn gini_impurity() {
-        assert!((impurity(&SplitCriterion::Gini, &[7, 3], 10) - 0.42).abs() < std::f64::EPSILON);
         assert!(
-            (impurity(&SplitCriterion::Entropy, &[7, 3], 10) - 0.8812908992306927).abs()
+            (impurity(&SplitCriterion::Gini, &vec![7, 3], 10) - 0.42).abs() < std::f64::EPSILON
+        );
+        assert!(
+            (impurity(&SplitCriterion::Entropy, &vec![7, 3], 10) - 0.8812908992306927).abs()
                 < std::f64::EPSILON
         );
         assert!(
-            (impurity(&SplitCriterion::ClassificationError, &[7, 3], 10) - 0.3).abs()
+            (impurity(&SplitCriterion::ClassificationError, &vec![7, 3], 10) - 0.3).abs()
                 < std::f64::EPSILON
         );
     }
@@ -967,17 +971,6 @@ mod tests {
         );
     }

-    #[test]
-    fn test_random_matrix_with_wrong_rownum() {
-        let x_rand: DenseMatrix<f64> = DenseMatrix::<f64>::rand(21, 200);
-        let y: Vec<u32> = vec![0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1];
-        let fail = DecisionTreeClassifier::fit(&x_rand, &y, Default::default());
-        assert!(fail.is_err());
-    }
-
     #[cfg_attr(
         all(target_arch = "wasm32", not(target_os = "wasi")),
         wasm_bindgen_test::wasm_bindgen_test
+1 -4
@@ -18,6 +18,7 @@
 //! Example:
 //!
 //! ```
+//! use rand::thread_rng;
 //! use smartcore::linalg::basic::matrix::DenseMatrix;
 //! use smartcore::tree::decision_tree_regressor::*;
 //!
@@ -421,10 +422,6 @@ impl<TX: Number + PartialOrd, TY: Number, X: Array2<TX>, Y: Array1<TY>>
         parameters: DecisionTreeRegressorParameters,
     ) -> Result<DecisionTreeRegressor<TX, TY, X, Y>, Failed> {
         let (x_nrows, num_attributes) = x.shape();
-        if x_nrows != y.shape() {
-            return Err(Failed::fit("Size of x should equal size of y"));
-        }
         let samples = vec![1; x_nrows];
         DecisionTreeRegressor::fit_weak_learner(x, y, samples, num_attributes, parameters)
     }