Fixed GP fitness again #301
🦙 MegaLinter status: ✅ SUCCESS
See detailed report in MegaLinter reports |
Codecov Report: All modified and coverable lines are covered by tests ✅
Coverage Diff
            main     #301    +/-
Coverage    97.02%   97.02%
Files       29       29
Lines       1849     1849
Hits        1794     1794
Misses      55       55
@jmafoster1 Can you explain in a bit more detail what the problem is here? It doesn't make sense for any of the predicted values to be complex so I think hard-coding a condition to eliminate them is probably not the best way of resolving it |
The problem is that candidate expressions are generated at random in GP, so there is the possibility of evaluating the square root of a negative number during GP. We cannot prevent this unless we remove the |
This doesn't make much sense to me. If, for whatever strange reason, your Edit:
What I meant was:
Does that make sense? |
My point here is that if it's returning an array of complex numbers, then the candidate expression is wrong, so should be assigned infinite fitness (we are minimising here, so fitness infinity is infinitely bad). |
Sorry, had a typo in my above comment (see above). If some candidate expressions are wrong/complex dtypes, can you not filter them out instead and avoid doing all of this? |
Unfortunately not. Every individual in the population must have a fitness value assigned to it. Better individuals will persist across generations of the population, with poorer individuals being filtered out (based on fitness value). However, in this case, there is no easy way to generate guaranteed valid individuals (i.e. individuals which will always produce real values). The best we can do is give invalid individuals very poor fitness values so that they (hopefully) do not persist for long. It's a fairly standard practice in GP. |
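To illustrate why a very poor fitness value is enough to push invalid individuals out of the population, here is a minimal sketch of tournament selection under minimisation. The function `tournament_select`, the example expressions, and the fitness values are all hypothetical and written for illustration only; this is not DEAP's API or the code in this PR.

```python
import random


def tournament_select(population, fitnesses, k, tournsize=3, rng=None):
    """Pick k individuals; each pick is the lowest-fitness member of a random tournament.

    Hypothetical helper for illustration only (minimisation: lower fitness is better).
    """
    rng = rng or random.Random(0)
    chosen = []
    for _ in range(k):
        # Sample distinct contenders by index, then keep the one with the smallest fitness.
        contenders = rng.sample(range(len(population)), tournsize)
        best = min(contenders, key=lambda i: fitnesses[i])
        chosen.append(population[best])
    return chosen


# Made-up candidate expressions; the two invalid ones were given infinite fitness.
population = ["x + 1", "sqrt(x - 10)", "2 * x", "sqrt(-x)"]
fitnesses = [0.5, float("inf"), 1.2, float("inf")]

survivors = tournament_select(population, fitnesses, k=4)
```

Every individual still receives a fitness, so nothing has to be filtered out beforehand. In this small example each tournament of three always contains at least one valid individual, so an infinite-fitness individual can never win.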
The ideal situation would be to do this in a strongly typed language, so we could guarantee that every individual was at least valid, but that's just a limitation of doing it in Python |
It sounds like you've thought it through, but I can't quite agree with this approach. Assigning specific fitness values to a selected group of individuals is fine, and sounds like some form of regularisation. But it sounds like the fitness function/model you're employing is probably not well-constrained. I'll approve this PR, but it might be worth coming back to in the future IMO. |
Thanks Farhad. Yes, I'm not really a fan. DEAP has lots of limitations and weird workarounds like this, but it's the most established and best documented toolkit for genetic algorithms that I've found so far. |
The fitness function would still, very rarely, give a fitness that was a complex number. I think this was because some of the predicted values from candidate functions would evaluate sqrt(-1), which would then give complex distances, and thus a complex fitness. I now return float("inf") if the dtype of the predicted values is not the same as the dtype of the expected values, which should hopefully fix the problem in a robust way.
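The dtype check described above could look roughly like the following. This is a sketch under assumptions, not the code from this PR: the function name `fitness`, the sum-of-squared-errors distance, and the extra guard against non-finite errors are all made up for illustration.

```python
import numpy as np


def fitness(predicted, expected):
    """Hypothetical fitness: sum of squared errors, where lower is better."""
    predicted = np.asarray(predicted)
    expected = np.asarray(expected)

    # A candidate whose predictions have a different dtype from the expected values
    # (e.g. complex128 vs float64) is treated as invalid and gets the worst fitness.
    # Note that this also rejects other mismatches, such as int64 vs float64.
    if predicted.dtype != expected.dtype:
        return float("inf")

    error = float(np.sum((predicted - expected) ** 2))
    # Assumption beyond the PR description: also treat NaN or overflow as invalid.
    return error if np.isfinite(error) else float("inf")


expected = np.array([1.0, 2.0, 3.0])

# A well-behaved candidate produces real-valued predictions.
valid_predictions = np.array([1.0, 2.5, 3.0])

# A candidate that takes the square root of a negative number in complex arithmetic
# produces a complex-valued array.
invalid_predictions = np.sqrt(np.array([-1.0, 4.0, 9.0], dtype=complex))

valid_fitness = fitness(valid_predictions, expected)
invalid_fitness = fitness(invalid_predictions, expected)
```

Comparing dtypes rather than inspecting individual values means the check runs once per candidate and catches the complex case before any distance is computed.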