Preparatory work for hierarchical clustering #252

Merged: 86 commits, Jan 28, 2025
Commits
fa802d2
build(deps): update nalgebra requirement from 0.23.0 to 0.26.2 (#98)
dependabot-preview[bot] May 11, 2022
4e94feb
Update nalgebra requirement from 0.23.0 to 0.31.0 (#128)
dependabot[bot] May 11, 2022
ea39024
Add SVC::decision_function (#135)
ferrouille Jun 21, 2022
98e3465
Fix clippy warnings (#139)
morenol Jul 14, 2022
eb4b49d
Added additional doctest and fixed indices (#141)
cmccomb Aug 12, 2022
dc7f01d
Implement fastpair (#142)
Mec-iS Aug 23, 2022
09d9205
Add example for FastPair (#144)
Mec-iS Aug 24, 2022
df766ea
Implementation of Standard scaler (#143)
titoeb Aug 26, 2022
01f753f
Add serde for StandardScaler (#148)
ckatsak Sep 6, 2022
44e4be2
Update criterion requirement from 0.3 to 0.4 (#150)
dependabot[bot] Sep 12, 2022
0f442e9
Handle multiclass precision/recall (#152)
montanalow Sep 13, 2022
1f2597b
grid search (#154)
montanalow Sep 19, 2022
2d75c2c
Implement a generic read_csv method (#147)
titoeb Sep 19, 2022
f291b71
fix: fix compilation warnings when running only with default features…
morenol Sep 19, 2022
0d996ed
Update LICENSE
Mec-iS Sep 19, 2022
851533d
Make rand_distr optional (#161)
morenol Sep 20, 2022
bb5b437
feat: allocate first and then proceed to create matrix from Vec of Ro…
morenol Sep 20, 2022
cfa824d
Provide better output in flaky tests (#163)
morenol Sep 20, 2022
55e1158
Complete grid search params (#166)
montanalow Sep 21, 2022
a37b552
Lmm/add seeds in more algorithms (#164)
morenol Sep 21, 2022
05dfffa
add seed param to search params (#168)
montanalow Sep 21, 2022
f4fd4d2
make default params available to serde (#167)
montanalow Sep 22, 2022
e4c47c7
Add contribution guidelines (#178)
Mec-iS Sep 27, 2022
9ea3133
Update CONTRIBUTING.md
Mec-iS Sep 27, 2022
ad2e6c2
feat: expose hyper tuning module in model_selection (#179)
morenol Oct 1, 2022
473cdfc
refactor: Try to follow similar pattern to other APIs (#180)
morenol Oct 1, 2022
d520007
fix: fix issue with iterator for svc search (#182)
morenol Oct 2, 2022
d015b12
Update CONTRIBUTING.md
Mec-iS Oct 12, 2022
3b1aaaa
Update README.md
Mec-iS Oct 13, 2022
f605f6e
Update README.md
Mec-iS Oct 18, 2022
a32eb66
Dataset doc cleanup (#205)
rnowling Oct 30, 2022
a7fa058
Merge potential next release v0.4 (#187) Breaking Changes
Mec-iS Oct 31, 2022
d91f4f7
Update README.md
Mec-iS Oct 31, 2022
a16927a
Port ensemble. Add Display to naive_bayes (#208)
Mec-iS Oct 31, 2022
4d36b7f
Fix metrics::auc (#212)
Mec-iS Nov 1, 2022
712c478
Improve features (#215)
Mec-iS Nov 1, 2022
8f1a7df
build: fix compilation without default features (#218)
morenol Nov 2, 2022
7f35dc5
Disambiguate distances. Implement Fastpair. (#220)
Mec-iS Nov 2, 2022
c45bab4
Support Wasi as target (#216)
Mec-iS Nov 2, 2022
551a6e3
clean up svm
Mec-iS Nov 2, 2022
1cbde3b
Refactor modules structure in src/svm
Mec-iS Nov 2, 2022
6624732
Fix svr tests (#222)
Mec-iS Nov 3, 2022
e09c4ba
Add kernels' parameters to public interface
Mec-iS Nov 3, 2022
19f3a2f
Fix signature of metrics tests
Mec-iS Nov 3, 2022
ee6b6a5
cargo clippy
Mec-iS Nov 3, 2022
fabe362
Implement Display for NaiveBayes
Mec-iS Nov 3, 2022
b427e5d
Improve options conditionals
Mec-iS Nov 3, 2022
ed9769f
Implement CSV reader with new traits (#209)
Mec-iS Nov 3, 2022
ba27dd2
Fix CI (#227)
morenol Nov 3, 2022
8d07efd
Use Box in SVM and remove lifetimes (#228)
morenol Nov 4, 2022
d8d0fb6
Update README.md
Mec-iS Nov 4, 2022
6c0fd37
Update README.md
Mec-iS Nov 4, 2022
0dc97a4
Create DEVELOPERS.md
Mec-iS Nov 4, 2022
2df0795
Release 0.3
Mec-iS Nov 7, 2022
5b517c5
minor fix
Mec-iS Nov 7, 2022
527477d
minor fixes
Mec-iS Nov 7, 2022
3ec9e4f
Exclude datasets test for wasm/wasi
Mec-iS Nov 7, 2022
6d529b3
Add static analyzer to doc
Mec-iS Nov 7, 2022
669f87f
Use getrandom as default (for no-std feature)
Mec-iS Nov 8, 2022
a449fdd
fmt
Mec-iS Nov 8, 2022
616e38c
cleanup
Mec-iS Nov 8, 2022
af0a740
Fix std_rand feature
Mec-iS Nov 8, 2022
890e9d6
minor fix
Mec-iS Nov 8, 2022
63ed89a
minor fix
Mec-iS Nov 8, 2022
cf751f0
minor fix
Mec-iS Nov 8, 2022
c1bd1df
minor fix
Mec-iS Nov 8, 2022
1b7dda3
minor fix
Mec-iS Nov 8, 2022
459d558
minor fixes to doc
Mec-iS Nov 8, 2022
fa54d5e
Remove unused tests flags
Mec-iS Nov 8, 2022
c507d97
Update CHANGELOG
Mec-iS Nov 8, 2022
b0dece9
use getrandom/js
Mec-iS Nov 8, 2022
2f6dd13
update comment
Mec-iS Nov 8, 2022
e25e2ae
update CHANGELOG
Mec-iS Nov 8, 2022
265fd55
make work cargo build --target wasm32-unknown-unknown
Mec-iS Nov 8, 2022
7d87451
Fixes for release (#237)
morenol Nov 8, 2022
62de25b
Handle kernel serialization (#232)
morenol Nov 8, 2022
0c9c70f
Merge
Mec-iS Nov 9, 2022
0e1bf6c
Add ordered_pairs method to FastPair
Mec-iS Mar 21, 2023
80c406b
Merge branch 'development' of github.com:smartcorelib/smartcore into …
Mec-iS Mar 21, 2023
393cf15
Merge branch 'development' into march-2023-improvements
Mec-iS Mar 24, 2023
074cfaf
rustfmt
Mec-iS Mar 24, 2023
5dd5c2f
Merge branch 'development' into march-2023-improvements
Mec-iS Jan 27, 2025
d60ba63
Merge branch 'main' of github.com:smartcorelib/smartcore into march-2…
Mec-iS Jan 27, 2025
8cc02cd
fix test
Mec-iS Jan 27, 2025
39f87aa
add tests to fastpair
Mec-iS Jan 28, 2025
1603fcf
fix formatting
Mec-iS Jan 28, 2025
114 changes: 114 additions & 0 deletions src/algorithm/neighbour/fastpair.rs
@@ -173,6 +173,21 @@ impl<'a, T: RealNumber + FloatNumber, M: Array2<T>> FastPair<'a, T, M> {
        }
    }

    ///
    /// Return dissimilarities ordered from closest to furthest
    ///
    #[allow(dead_code)]
    pub fn ordered_pairs(&self) -> std::vec::IntoIter<&PairwiseDistance<T>> {
        // improvement: implement this to return `impl Iterator<Item = &PairwiseDistance<T>>`
        // need to implement trait `Iterator` for `Vec<&PairwiseDistance<T>>`
        let mut distances = self
            .distances
            .values()
            .collect::<Vec<&PairwiseDistance<T>>>();
        distances.sort_by(|a, b| a.partial_cmp(b).unwrap());
        distances.into_iter()
    }

    //
    // Compute distances from input to all other points in data-structure.
    // input is the row index of the sample matrix
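
On the `improvement:` note in `ordered_pairs`: it looks like no extra trait implementation is needed, because `std::vec::IntoIter` already implements `Iterator`, so the sorted `Vec` can be returned as `impl Iterator` directly. Below is a minimal sketch under stated assumptions: `PairwiseDistance` and `Pairs` are simplified, hypothetical stand-ins and not smartcore's real types. The sketch compares on the distance only and sorts with `f64::total_cmp` instead of `partial_cmp(..).unwrap()`, a swapped-in choice so that a NaN distance cannot cause a panic. Treating a missing distance as infinity is also an assumption made for illustration.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for smartcore's `PairwiseDistance<T>`; only the
// fields needed for this illustration are included.
#[derive(Debug)]
struct PairwiseDistance {
    node: usize,
    distance: Option<f64>,
}

// Hypothetical stand-in for `FastPair`: one entry per node.
struct Pairs {
    distances: HashMap<usize, PairwiseDistance>,
}

impl Pairs {
    // `vec::IntoIter` already implements `Iterator`, so the concrete
    // return type can be hidden behind `impl Iterator`.
    fn ordered_pairs(&self) -> impl Iterator<Item = &PairwiseDistance> {
        let mut distances: Vec<&PairwiseDistance> = self.distances.values().collect();
        distances.sort_by(|a, b| {
            // Assumption: a missing distance sorts last (treated as infinity).
            let da = a.distance.unwrap_or(f64::INFINITY);
            let db = b.distance.unwrap_or(f64::INFINITY);
            // `total_cmp` is a total order on f64, so it never panics on NaN.
            da.total_cmp(&db)
        });
        distances.into_iter()
    }
}
```

Callers that only iterate over the result would not need to change, since the returned value is still an iterator over `&PairwiseDistance`.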
@@ -588,4 +603,103 @@ mod tests_fastpair {

        assert_eq!(closest, min_dissimilarity);
    }

    #[test]
    fn fastpair_ordered_pairs() {
        let x = DenseMatrix::<f64>::from_2d_array(&[
            &[5.1, 3.5, 1.4, 0.2],
            &[4.9, 3.0, 1.4, 0.2],
            &[4.7, 3.2, 1.3, 0.2],
            &[4.6, 3.1, 1.5, 0.2],
            &[5.0, 3.6, 1.4, 0.2],
            &[5.4, 3.9, 1.7, 0.4],
            &[4.9, 3.1, 1.5, 0.1],
            &[7.0, 3.2, 4.7, 1.4],
            &[6.4, 3.2, 4.5, 1.5],
            &[6.9, 3.1, 4.9, 1.5],
            &[5.5, 2.3, 4.0, 1.3],
            &[6.5, 2.8, 4.6, 1.5],
            &[4.6, 3.4, 1.4, 0.3],
            &[5.0, 3.4, 1.5, 0.2],
            &[4.4, 2.9, 1.4, 0.2],
        ])
        .unwrap();
        let fastpair = FastPair::new(&x).unwrap();

        let ordered = fastpair.ordered_pairs();

        let mut previous: f64 = -1.0;
        for p in ordered {
            if previous == -1.0 {
                previous = p.distance.unwrap();
            } else {
                let current = p.distance.unwrap();
                assert!(current >= previous);
                previous = current;
            }
        }
    }

    #[test]
    fn test_empty_set() {
        let empty_matrix = DenseMatrix::<f64>::zeros(0, 0);
        let result = FastPair::new(&empty_matrix);
        assert!(result.is_err());
        if let Err(e) = result {
            assert_eq!(
                e,
                Failed::because(FailedError::FindFailed, "min number of rows should be 3")
            );
        }
    }

    #[test]
    fn test_single_point() {
        let single_point = DenseMatrix::from_2d_array(&[&[1.0, 2.0, 3.0]]).unwrap();
        let result = FastPair::new(&single_point);
        assert!(result.is_err());
        if let Err(e) = result {
            assert_eq!(
                e,
                Failed::because(FailedError::FindFailed, "min number of rows should be 3")
            );
        }
    }

    #[test]
    fn test_two_points() {
        let two_points = DenseMatrix::from_2d_array(&[&[1.0, 2.0], &[3.0, 4.0]]).unwrap();
        let result = FastPair::new(&two_points);
        assert!(result.is_err());
        if let Err(e) = result {
            assert_eq!(
                e,
                Failed::because(FailedError::FindFailed, "min number of rows should be 3")
            );
        }
    }

    #[test]
    fn test_three_identical_points() {
        let identical_points =
            DenseMatrix::from_2d_array(&[&[1.0, 1.0], &[1.0, 1.0], &[1.0, 1.0]]).unwrap();
        let result = FastPair::new(&identical_points);
        assert!(result.is_ok());
        let fastpair = result.unwrap();
        let closest_pair = fastpair.closest_pair();
        assert_eq!(closest_pair.distance, Some(0.0));
    }

    #[test]
    fn test_result_unwrapping() {
        let valid_matrix =
            DenseMatrix::from_2d_array(&[&[1.0, 2.0], &[3.0, 4.0], &[5.0, 6.0], &[7.0, 8.0]])
                .unwrap();

        let result = FastPair::new(&valid_matrix);
        assert!(result.is_ok());

        // This should not panic
        let _fastpair = result.unwrap();
    }
}
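
The `-1.0` sentinel in `fastpair_ordered_pairs` could probably be replaced by a check over adjacent values, which states the ordering property more directly. A minimal sketch, where `is_non_decreasing` is a hypothetical helper and not part of smartcore:

```rust
// Returns true when every value is greater than or equal to its predecessor.
// Empty and single-element slices count as sorted. Any NaN makes the
// comparison false, so a slice containing NaN next to another value fails.
fn is_non_decreasing(values: &[f64]) -> bool {
    values.windows(2).all(|w| w[0] <= w[1])
}
```

In the test, the distances would first be collected with something like `fastpair.ordered_pairs().map(|p| p.distance.unwrap()).collect::<Vec<f64>>()` and then passed to the helper in a single `assert!`.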