Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stream (parser): fix rows counter #22

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

NikitaRazmakhnin
Copy link

While I was testing some things around parsing of exhibits, I noticed
that for small premium-estimates spreadsheet (8 rows) total
count declared by countRowsInSheet is 1990 😨

Some templates for unknown for me reasons has a non-empty rows like:

    <row r="24" spans="4:13" x14ac:dyDescent="0.2">
      <c r="H24" s="14"/>
      <c r="I24" s="14"/>
    </row>
    <row r="25" spans="4:13" x14ac:dyDescent="0.2">
      <c r="H25" s="14"/>
      <c r="I25" s="14"/>
    </row>
    <row r="26" spans="4:13" x14ac:dyDescent="0.2">
      <c r="H26" s="14"/>
      <c r="I26" s="14"/>
    </row>
    <row r="27" spans="4:13" x14ac:dyDescent="0.2"/>
    <row r="28" spans="4:13" x14ac:dyDescent="0.2"/>
    <row r="29" spans="4:13" x14ac:dyDescent="0.2"/>
    <row r="30" spans="4:13" x14ac:dyDescent="0.2"/>
    <row r="31" spans="4:13" x14ac:dyDescent="0.2"/>
    <row r="32" spans="4:13" x14ac:dyDescent="0.2"/>
    <row r="33" x14ac:dyDescent="0.2"/>
    <row r="34" x14ac:dyDescent="0.2"/>
    <row r="35" x14ac:dyDescent="0.2"/>
    <row r="36" x14ac:dyDescent="0.2"/>
    <row r="37" x14ac:dyDescent="0.2"/>
    <row r="38" x14ac:dyDescent="0.2"/>
    <row r="39" x14ac:dyDescent="0.2"/>
    <row r="40" x14ac:dyDescent="0.2"/>
    <row r="41" x14ac:dyDescent="0.2"/>
    <row r="42" x14ac:dyDescent="0.2"/>
    <row r="43" x14ac:dyDescent="0.2"/>

I don't know whether we need to count these rows or not: perhaps we need to add a countNonEmptyRows and countAllRows functions? Please, guide me here if you have your own opinion.

For now I changed a function in such way that it counts only rows with at least one cell with value :)
This implementation counts rows correctly (tested it with few templates - but will test more later).

P.S. I also don't know should/can I write any tests for that or not?

Count only rows which has at least one
cell with at least one value.
@jappeace
Copy link

Hmm, this likely has an impact on the benchmarks, see if you can run them, we could provide 2 row counting functions and make them injectable 🤔

countRowsWith (forall m . HexpatEvent -> m Bool) -> ...

countEmptyRows :: HexpatEvent -> m Bool
countEmptyRows = \case { StartElement "row" _ -> True; _ -> False}

countNonEmptyRows :: HexpatEvent -> m Bool
countNonEmptyRows = ... -- your implementation

This way the user could also provide their own favorite row counting mechanism.
Keep in mind we're using the upstream xlsx now. So you should make a PR to that repository.

@NikitaRazmakhnin
Copy link
Author

NikitaRazmakhnin commented Feb 10, 2022

@jappeace i have a missing dependency Criterion for some reason when trying to run benchs. Running from nix-shell.

Do you know how I can fix that?

@jappeace
Copy link

a-add criterion

@jappeace
Copy link

jappeace commented Feb 10, 2022

I'm not sure how to run them, I'm getting this error:

cabal: Cannot benchmark the package xlsx-0.8.4 because none of the components
are available to build: the benchmark 'bench' is not available because the
solver did not find a plan that included the benchmarks. Force the solver to
enable this for all packages by adding the line 'tests: True' to the
'cabal.project.local' file.

@NikitaRazmakhnin
Copy link
Author

a-add criterion

I have added a doBenchmark to capture banchmarks cabal-section: works good from ghci at least now :)

@jappeace
Copy link

yeah, even a small change like that is worth upstreaming in my opinion, I got stuck on that, and I'm used to many tooling issues 😔

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants