huge variance in time to load iris #79

Open
davidbp opened this issue Oct 24, 2019 · 2 comments

davidbp commented Oct 24, 2019

Hello

I have observed a 10x difference when loading the iris dataset on two different machines.

The loading times seem unreasonable. Is there anything I can do to speed this up?

julia> using RDatasets

julia> @time iris = dataset("datasets", "iris"); # a DataFrame
100.068931 seconds (75.23 M allocations: 4.053 GiB, 3.19% gc time)

julia> 102.497734 seconds (75.35 M allocations: 4.062 GiB, 3.33% gc time)
       (v1.2) pkg> status RDatasets
           Status `~/.julia/environments/v1.2/Project.toml`
         [a93c6f00] DataFrames v0.19.4
         [ce6b1742] RDatasets v0.6.4

julia> versioninfo()
Julia Version 1.2.0
Commit c6da87ff4b (2019-08-20 00:03 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.6.0)
  CPU: Intel(R) Core(TM) i5-4278U CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, haswell)
Environment:
  JULIA_EDITOR = subl

(v1.2) pkg> status RDatasets
    Status `~/.julia/environments/v1.2/Project.toml`
  [336ed68f] CSV v0.5.14
  [a93c6f00] DataFrames v0.19.4
  [ce6b1742] RDatasets v0.6.4

On the other machine I get:

julia> using RDatasets
[ Info: Recompiling stale cache file /home/david/.julia/compiled/v1.1/RDatasets/JyIbx.ji for RDatasets [ce6b1742-4840-55fa-b093-852dadbb1d8b]

julia> @time iris = dataset("datasets", "iris"); 
 10.544570 seconds (37.27 M allocations: 1.767 GiB, 8.98% gc time)

julia> versioninfo()
Julia Version 1.1.0
Commit 80516ca202 (2019-01-21 21:24 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-4600U CPU @ 2.10GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, haswell)

(v1.1) pkg> status RDatasets
    Status `~/.julia/environments/v1.1/Project.toml`
  [336ed68f] CSV v0.5.14
  [a93c6f00] DataFrames v0.18.4
  [ce6b1742] RDatasets v0.6.1

ppalmes commented Nov 12, 2019

Same observation: simply loading the iris dataset takes more than 80 seconds on a 2017 Mac running Julia 1.2 and Julia 1.3.

ppalmes commented Nov 12, 2019

Loading it through RCall is way faster:
using RCall
iris = R"iris" |> rcopy
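
The timings reported above are all from the first `dataset` call after `using RDatasets`. A minimal sketch for checking how much of that is a one-time cost, assuming the first call in a fresh Julia session includes compilation of the loading code (a general property of Julia, not something confirmed for this issue): time the same call twice in one session and compare.

```julia
using RDatasets

# First call in a fresh session: the measured time typically includes
# JIT compilation of the code paths used to read the dataset.
@time iris = dataset("datasets", "iris");

# Second call in the same session: compilation is already done, so this
# is closer to the cost of actually reading the data.
@time iris = dataset("datasets", "iris");
```

If the second call is fast, the slowness is mostly a per-session startup cost rather than the read itself. If both calls are slow, the read itself is the bottleneck.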
