What's new in RuleKit version 2.1.16.0?

1. RuleKit and RapidMiner part ways 💔

Since its beginning, RuleKit has used the RapidMiner Java API for tasks such as loading data and measuring model performance. As of major version 2, RuleKit has finally parted ways with RapidMiner. This is mainly thanks to the recent work of our contributors: Wojciech Górka and Mateusz Kalisch.

This change brings several benefits, including:

A large reduction in the size of the RuleKit Java package jar file (from 131 MB to 40.9 MB).
The jar file is now small enough to ship inside the Python package distribution, so there is no longer any need to download it in an extra step.

Although the license has remained the same (GNU AGPL-3.0 license), for commercial projects that require the ability to distribute RuleKit code as part of a program that cannot be distributed under the AGPL, it may be possible to obtain an appropriate license from the authors. Feel free to contact us!

2. ⚠️ BREAKING CHANGE `min_rule_covered` algorithm parameter was removed

Until this version, the parameter was marked as deprecated and using it only produced a warning. It has now been removed completely, so code that still passes it may break.
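If older code or configuration still passes `min_rule_covered`, the arguments can be cleaned up before they reach the classifier. The sketch below is not part of RuleKit: `migrate_params` is a hypothetical helper, and carrying the old value over to `minsupp_new` is an assumption that should be checked against the documentation of your version.

```python
from typing import Any


def migrate_params(params: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `params` without the removed `min_rule_covered` key.

    Hypothetical helper, not part of RuleKit. ASSUMPTION: the old value is
    carried over to `minsupp_new` unless that key is already set.
    """
    migrated = dict(params)  # copy, so the caller's dict stays untouched
    if "min_rule_covered" in migrated:
        old_value = migrated.pop("min_rule_covered")
        # Cast to float: fixed issue #23 says minsupp_new is a float parameter
        migrated.setdefault("minsupp_new", float(old_value))
    return migrated


# Made-up legacy configuration; other keys pass through unchanged
legacy = {"min_rule_covered": 8, "max_growing": 0}
migrated_params = migrate_params(legacy)
print(migrated_params)
```

The result could then be unpacked into the constructor, for example `RuleClassifier(**migrated_params)`, provided the remaining keys are valid for your version.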

3. ⚠️ BREAKING CHANGE The classification metric `negative_voting_conflicts` is no longer available

As of this version, the metrics returned by the `RuleClassifier.predict` method with `return_metrics=True` no longer include `negative_voting_conflicts`.

In fact, there was no way to calculate this metric without access to the true values of the labels. The predict method does not take labels as an argument, so previous results for this metric were unfortunately incorrect.

If you really need this metric, you can still calculate it, but it takes more effort. Here is an example of how to do it with the currently available API:

import re
from collections import defaultdict
import numpy as np
from sklearn.datasets import load_iris

from rulekit.classification import RuleClassifier

X, y = load_iris(return_X_y=True)

clf = RuleClassifier()
clf.fit(X, y)

prediction: np.ndarray = clf.predict(X)

# 1. Group rules by decision class based on their conclusions
rule_decision_class_regex = re.compile("^.+THEN .+ = {(.+)}$")

grouped_rules: dict[str, list[int]] = defaultdict(list)
for i, rule in enumerate(clf.model.rules):
    rule_decision_class: str = rule_decision_class_regex.search(
        str(rule)).group(1)
    grouped_rules[rule_decision_class].append(i)

# 2. Get rules covering each example
coverage_matrix: np.ndarray = clf.get_coverage_matrix(X)

# 3. Group coverages of the rules with the same decision class
grouped_coverage_matrix: np.ndarray = np.zeros(
    shape=(coverage_matrix.shape[0], len(grouped_rules.keys()))
)
for i, rule_indices in enumerate(grouped_rules.values()):
    grouped_coverage_matrix[:, i] = np.sum(
        coverage_matrix[:, rule_indices], axis=1
    )
grouped_coverage_matrix[grouped_coverage_matrix > 0] = 1

# 4. Find examples with voting conflicts
# (examples covered by rules of more than one decision class)
voting_conflicts_mask: np.ndarray = np.sum(grouped_coverage_matrix, axis=1) > 1

# 5. Find examples with negative voting conflicts (voting conflicts where
# the predicted class is not equal to the actual class)
negative_conflicts_mask: np.ndarray = voting_conflicts_mask & (y != prediction)
negative_conflicts: int = int(np.sum(negative_conflicts_mask))
print('Number of negative voting conflicts: ', negative_conflicts)

Not so simple, right?

Perhaps in the future we will add an API to calculate this indicator in a more user-friendly way.
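To make the definition concrete without a trained model, the computation above can be illustrated on toy data in plain Python. The rule classes, coverage rows, labels, and predictions here are made up, and `count_negative_voting_conflicts` is a hypothetical helper, not a RuleKit API. It assumes an example is a voting conflict when rules concluding at least two different classes cover it, and a negative conflict when the prediction also differs from the true label.

```python
def count_negative_voting_conflicts(
    coverage: list[list[int]],
    rule_classes: list[str],
    y_true: list[str],
    y_pred: list[str],
) -> int:
    """Count examples covered by rules of 2+ classes and predicted wrongly."""
    negative_conflicts = 0
    for row, true_label, predicted_label in zip(coverage, y_true, y_pred):
        # Set of classes voted for by the rules covering this example
        voted_classes = {
            rule_classes[i] for i, covered in enumerate(row) if covered
        }
        has_conflict = len(voted_classes) > 1
        if has_conflict and predicted_label != true_label:
            negative_conflicts += 1
    return negative_conflicts


# Made-up data: 3 rules (two for class "a", one for class "b"), 4 examples
rule_classes = ["a", "a", "b"]
coverage = [
    [1, 1, 0],  # two rules of the same class: no conflict
    [1, 0, 1],  # conflict, predicted correctly
    [0, 1, 1],  # conflict, predicted wrongly: negative conflict
    [0, 0, 1],  # single rule, predicted wrongly: no conflict
]
y_true = ["a", "a", "b", "a"]
y_pred = ["a", "a", "a", "b"]

negative_conflicts = count_negative_voting_conflicts(
    coverage, rule_classes, y_true, y_pred
)
print("Number of negative voting conflicts:", negative_conflicts)
```

Note that the true labels are a required input here, which is exactly what `predict` does not receive.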

4. 🕰️ DEPRECATION `download_jar` command is now deprecated

Because the RapidMiner dependencies were removed from the RuleKit Java package, its jar file has shrunk significantly. It is now small enough to ship inside the Python package distribution, so there is no need to download it in an extra step with this command as before:

python -m rulekit download_jar

This command now does nothing except emit a warning. It will be removed completely in the next major version (3).
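Since the step is now redundant, it can be dropped from setup scripts and CI configuration ahead of its removal. As a rough sketch, the hypothetical helper below (not part of RuleKit) scans a piece of text, such as the contents of a CI file, for lines that still invoke the command. The CI snippet is made up for illustration.

```python
import re

# Matches e.g. "python -m rulekit download_jar" or "python3.11 -m rulekit download_jar"
DOWNLOAD_JAR_PATTERN = re.compile(
    r"python[\d.]*\s+-m\s+rulekit\s+download_jar\b"
)


def find_download_jar_calls(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that still invoke download_jar."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if DOWNLOAD_JAR_PATTERN.search(line)
    ]


# Made-up CI snippet used only for illustration
ci_script = """\
pip install rulekit
python -m rulekit download_jar
python -m pytest
"""

leftover_calls = find_download_jar_calls(ci_script)
for number, line in leftover_calls:
    print(f"line {number}: {line}")
```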

Fixed issues:

minsupp_new should be a float parameter #23
Inconsistent results of expert induction for regression and survival #19
