SmartCore

User guide | API | Examples


The Most Advanced Machine Learning Library In Rust.


Current status

  • The current working branch is development (use this if you want something you can test right away).
  • Breaking changes are under development in v0.5-wip (if you are a newcomer, it is better to start from this README, as that branch will become the next major release).

To start getting familiar with the new Smartcore v0.5 API, a Jupyter Notebook environment repository is now available. Please see the instructions there; your feedback is valuable for the future of the library.

Developers

Contributions are welcome; please start with CONTRIBUTING and the other relevant files.

Walkthrough: traits system and basic structures

numbers

The library is founded on the basic traits provided by num-traits; these live in src/numbers. They are used to define all the procedures in the library, making everything safer and constraining what implementations can handle.
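As a sketch of the idea (the Number trait and dot function below are illustrative, not smartcore's actual definitions), a bound built from num-traits-style operations lets a routine be written once for every conforming type:

```rust
use std::ops::{Add, Mul};

// Illustrative numeric trait in the spirit of src/numbers: operations
// are gathered into one bound so algorithms can be generic over them.
trait Number: Copy + PartialOrd + Add<Output = Self> + Mul<Output = Self> {
    fn zero() -> Self;
}

impl Number for f64 {
    fn zero() -> Self { 0.0 }
}

// Any routine can now be written once for all Number types.
fn dot<T: Number>(a: &[T], b: &[T]) -> T {
    a.iter().zip(b).fold(T::zero(), |acc, (&x, &y)| acc + x * y)
}

fn main() {
    println!("{}", dot(&[1.0, 2.0], &[3.0, 4.0])); // 11
}
```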

linalg

The numbers traits are put to use in the linear algebra structures of the src/linalg/basic module. Its sub-modules define the traits used all over the code base.

  • arrays: data structures like Array, Array1 (1-dimensional) and Array2 (matrix, 2-D), plus their "view" traits. Views are used to provide zero-copy access to data; they have composed traits to allow writing (mutable traits: MutArray, ArrayViewMut, ...).
  • matrix: the main entry point for matrix operations and currently the only structure provided, in the shape of struct DenseMatrix. Once instantiated, a matrix automatically makes available all the traits in arrays (a sparse matrix implementation will be provided).
  • vector: convenience traits implemented for std::vec::Vec to allow extensive reuse.

These are all traits and by definition they cannot be instantiated. For instantiable structures, see implementations like DenseMatrix and its constructors.
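A miniature of the arrays/matrix split might look like this (the names mirror the real traits, but these definitions are a simplified sketch, not smartcore's actual API):

```rust
// Illustrative 2-D array trait: algorithms only see shape and element access.
trait Array2<T> {
    fn shape(&self) -> (usize, usize);
    fn get(&self, row: usize, col: usize) -> &T;
}

// A dense, row-major matrix that implements the trait.
struct DenseMatrix<T> {
    nrows: usize,
    ncols: usize,
    values: Vec<T>, // row-major storage
}

impl<T: Clone> DenseMatrix<T> {
    fn from_2d_array(rows: &[&[T]]) -> Self {
        let nrows = rows.len();
        let ncols = rows.first().map_or(0, |r| r.len());
        let values = rows.iter().flat_map(|r| r.iter().cloned()).collect();
        DenseMatrix { nrows, ncols, values }
    }
}

impl<T> Array2<T> for DenseMatrix<T> {
    fn shape(&self) -> (usize, usize) {
        (self.nrows, self.ncols)
    }
    fn get(&self, row: usize, col: usize) -> &T {
        &self.values[row * self.ncols + col]
    }
}

fn main() {
    let rows: &[&[i32]] = &[&[1, 2, 3], &[4, 5, 6]];
    let m = DenseMatrix::from_2d_array(rows);
    println!("{:?} {}", m.shape(), m.get(1, 2)); // (2, 3) 6
}
```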

linalg/traits

The traits in src/linalg/traits are closely linked to linear algebra's theoretical framework. They are used to specify characteristics of, and constraints on, the types accepted by various algorithms; for example, they allow stating whether a matrix is QRDecomposable and/or SVDDecomposable. See the docstrings for references to the theoretical framework.

As above, these are all traits and by definition they cannot be instantiated; they are mostly used to constrain implementations. For example, the Linear Regression implementation requires the input data X to satisfy, in smartcore's trait system, Array2<FloatNumber> + QRDecomposable<TX> + SVDDecomposable<TX>: a 2-D matrix that is both QR- and SVD-decomposable. That is exactly what the provided structure linalg::basic::matrix::DenseMatrix happens to be: impl<T: FloatNumber> QRDecomposable<T> for DenseMatrix<T> {} and impl<T: FloatNumber> SVDDecomposable<T> for DenseMatrix<T> {}.
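The pattern can be sketched with plain marker traits (the trait and struct definitions below are illustrative stand-ins, not smartcore's actual code): an algorithm states the decompositions it needs in its bounds, and a type lacking either impl is rejected at compile time.

```rust
// Illustrative marker traits standing in for smartcore's decomposition traits.
trait QRDecomposable {}
trait SVDDecomposable {}

// A stand-in matrix type that opts into both capabilities.
struct DenseMatrix;
impl QRDecomposable for DenseMatrix {}
impl SVDDecomposable for DenseMatrix {}

// An algorithm requiring both decompositions declares them as bounds;
// calling it with a type missing either impl fails to compile.
fn fit<M: QRDecomposable + SVDDecomposable>(_x: &M) -> bool {
    true
}

fn main() {
    println!("fit accepted DenseMatrix: {}", fit(&DenseMatrix));
}
```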

metrics

Implementations for metrics (classification, regression, cluster, ...) and distance measures (Euclidean, Hamming, Manhattan, ...): for example Accuracy, F1, AUC, Precision, R2. As with everything else in the code base, these implementations reuse the numbers and linalg traits and structures.

These are collected in structures like pub struct ClassificationMetrics<T> {}, which implements metrics::Metrics. These are groups of functions (classification, regression, cluster, ...) that provide instantiation for the structures. Each of those instantiations can be passed around using its corresponding function, like pub fn accuracy<T: Number + RealNumber + FloatNumber, V: ArrayView1<T>>(y_true: &V, y_pred: &V) -> T. This provides a mechanism for metrics to be passed to higher-level interfaces like cross_validate:

```rust
let results = cross_validate(
    BiasedEstimator::fit,  // custom estimator
    &x, &y,                // input data
    NoParameters {},       // extra parameters
    cv,                    // type of cross validator
    &accuracy              // **metrics function** <--------
).unwrap();
```
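To make the shape of a metrics function concrete, here is a self-contained sketch of accuracy over plain slices (illustrative only; smartcore's real version is generic over Number and ArrayView1, as quoted above):

```rust
// Illustrative accuracy metric: the fraction of positions where the
// predicted label equals the true label.
fn accuracy(y_true: &[u32], y_pred: &[u32]) -> f64 {
    let hits = y_true.iter().zip(y_pred).filter(|(t, p)| t == p).count();
    hits as f64 / y_true.len() as f64
}

fn main() {
    let y_true = [0, 1, 1, 0];
    let y_pred = [0, 1, 0, 0];
    println!("{}", accuracy(&y_true, &y_pred)); // 0.75
}
```

Because the metric is a plain function, passing &accuracy to cross_validate requires no extra wrapping.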

TODO: complete for all modules