Parameter to control number of rules generated in apriori #9

Open
sjain777 opened this issue Apr 11, 2016 · 3 comments

Comments

@sjain777

Hi, for the apriori algorithm, I use the following parameters: support, minlen, maxlen, confidence, and target ( = "rules"). I currently use this set both to tune my model and to limit the size of the model (that is, the number of rules generated).

It would be immensely helpful to have a separate parameter to control the size of the model, for example something like "maxrules", so that one can fine-tune the model (for better performance) using the existing parameters above and separately cap the number of rules using "maxrules". Right now, if I fine-tune my model using the existing set of parameters, the number of rules becomes too large (sometimes a few million), which results in long model-building and prediction times. Limiting the size of the apriori object while also tuning the model becomes quite an issue when automating thousands of models.

Is it possible to add such a parameter in the near future?

Thanks!
Supriya

@mhahsler
Owner

Unfortunately, the code used right now does not support this kind of limit. Under Windows you can explore memory.limit. Maybe you should first use a very aggressive setting for maxlen to see how many short rules you produce at a given minimum support before you allow longer rules.
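
The maxlen probing suggested above could look roughly like the following sketch. It assumes the `Groceries` example data that ships with arules; the `max_rules` budget and the threshold values are hypothetical placeholders, not arules parameters, and would need to be chosen per model.

```r
library(arules)
data("Groceries")  # example transactions shipped with arules

max_rules <- 1000  # hypothetical rule budget, not an arules parameter

# Start with short rules only and allow longer ones step by step,
# stopping as soon as the rule count exceeds the budget.
for (len in 2:4) {
  rules <- apriori(
    Groceries,
    parameter = list(support = 0.001, confidence = 0.5,
                     minlen = 1, maxlen = len, target = "rules"),
    control = list(verbose = FALSE)
  )
  cat("maxlen =", len, "->", length(rules), "rules\n")
  if (length(rules) > max_rules) break
}
```

Each pass re-mines the data, so this trades extra run time for an early view of how quickly the rule count grows with rule length.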

@sjain777
Author

Thanks for your mail. My current minlen and maxlen vary between 1 and 4. I tried a few of the measures you've suggested, but with several thousand models and varying data features for each model, optimizing such checks for both performance and size is challenging, and it also adds to the overall processing time.

@mhahsler
Owner

You can now limit the time (at least somewhat). This should help with limiting the number of rules...
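
A minimal sketch of how this might be used, assuming the time limit meant here is the `maxtime` entry of the `parameter` list (in seconds; see `?APparameter` to confirm for your installed version). The `max_rules` cap is a hypothetical value applied after mining; it is not an arules parameter and does not shorten the mining itself.

```r
library(arules)
data("Groceries")  # example transactions shipped with arules

rules <- apriori(
  Groceries,
  parameter = list(support = 0.001, confidence = 0.5,
                   minlen = 1, maxlen = 4, target = "rules",
                   maxtime = 2),  # assumed time limit in seconds
  control = list(verbose = FALSE)
)

# Hypothetical post-mining cap: keep only the strongest rules.
max_rules <- 500
top_rules <- head(sort(rules, by = "confidence"), n = max_rules)
length(top_rules)
```

The time limit bounds how long the search runs, so it restricts the rule count only indirectly; the `head()` step is what guarantees an upper bound on the size of the stored rule set.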


2 participants