ScanReport cannot be used for fake-data-generation or rabbit-in-a-hat #367

Open
thoniTUB opened this issue Feb 10, 2023 · 1 comment · Fixed by #368

@thoniTUB
Contributor

Describe the bug
The fake-data generator and rabbit-in-a-hat cannot read a scan report that was produced by white-rabbit under a German locale.

The following error (shortened) appears when running the fake-data generation:

*** Generic error information ***
Message: For input string: "0,679"
...

*** Stack trace ***
java.base/jdk.internal.math.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2054)
java.base/jdk.internal.math.FloatingDecimal.parseDouble(FloatingDecimal.java:110)
java.base/java.lang.Double.parseDouble(Double.java:651)
org.ohdsi.utilities.files.QuickAndDirtyXlsxReader$Row.getDoubleByHeaderName(QuickAndDirtyXlsxReader.java:591)
org.ohdsi.rabbitInAHat.dataModel.Database.generateModelFromScanReport(Database.java:182)
org.ohdsi.whiteRabbit.fakeDataGenerator.FakeDataGenerator.generateData(FakeDataGenerator.java:53)
org.ohdsi.whiteRabbit.WhiteRabbitMain$FakeDataThread.run(WhiteRabbitMain.java:1076)

*** Console ***
An error report has been generated:
...\whiteRabbit/Error.txt
10.02.2023, 09:08:43	Starting creation of fake data
Loading scan report from ...\ScanReport_...csv.xlsx
Error: For input string: "0,679"

I searched the report and found the offending cell in the Field Overview sheet, column Fraction unique.
The cell content was a string with a decimal comma: <= 0,679
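
For illustration, here is a minimal sketch of why such a value fails to load. It assumes the fraction was written with locale-sensitive formatting (the issue does not show the writing code, so this is a guess), and the class and method names are hypothetical. `Double.parseDouble` only accepts a decimal point, whatever the system locale is.

```java
import java.util.Locale;

public class LocaleDecimalDemo {
    // Hypothetical helper: formats a fraction with three decimals in the given locale.
    static String formatFraction(double value, Locale locale) {
        return String.format(locale, "%.3f", value);
    }

    // Returns true if Double.parseDouble accepts the text, false if it throws.
    static boolean parsesAsJavaDouble(String text) {
        try {
            Double.parseDouble(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String german = formatFraction(0.679, Locale.GERMANY); // decimal comma
        String neutral = formatFraction(0.679, Locale.ROOT);   // decimal point
        System.out.println(german + " parses: " + parsesAsJavaDouble(german));
        System.out.println(neutral + " parses: " + parsesAsJavaDouble(neutral));
    }
}
```

Formatting with a fixed locale such as `Locale.ROOT` on the writing side would keep the report readable regardless of the machine it was produced on.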

To Reproduce
Steps to reproduce the behavior:

  1. Produce a ScanReport on a system with a German locale active
  2. Check if your ScanReport contains <= 0,... values in the Fraction unique column
  3. Try to generate fake-data with the report.

Expected behavior
The produced ScanReport can be loaded by the fake-data generator and rabbit-in-a-hat.

Desktop (please complete the following information):
Processor type: amd64
Available processors: 4
Maximum available memory: 1,258,291,200 bytes
Used memory: 324,052,120 bytes
Java version: 17.0.2
Java vendor: Oracle Corporation
OS architecture: amd64
OS name: Windows 10
OS version: 10.0

Additional Info
After manually replacing the decimal comma with a decimal point in all affected cells (<= 0, -> <= 0.), the report loads without errors.
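
The manual workaround could also be applied on the reading side. The following is only a sketch under stated assumptions, not the code from #368 and not part of WhiteRabbit: `FractionCellParser` and `parseFraction` are hypothetical names, and the input is assumed to contain no thousands separators.

```java
public class FractionCellParser {
    // Hypothetical helper: parses cells such as "<= 0,679", "<= 0.679" or "0.679".
    static double parseFraction(String cell) {
        String text = cell.trim();
        // Drop the optional "<=" prefix used in the Fraction unique column.
        if (text.startsWith("<=")) {
            text = text.substring(2).trim();
        }
        // Treat a decimal comma as a decimal point (assumes no thousands separators).
        text = text.replace(',', '.');
        return Double.parseDouble(text);
    }

    public static void main(String[] args) {
        System.out.println(parseFraction("<= 0,679"));
        System.out.println(parseFraction("<= 0.679"));
    }
}
```

This accepts reports that were already written with a decimal comma, at the cost of misreading values that use a comma as a thousands separator.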

@thoniTUB
Contributor Author

I submitted a simple fix for the problem: #368

Another solution would be to drop the <= 0,/<= 0. string syntax altogether and rely solely on the double data type with a percentage representation.
