JuliaData
diff --git a/REQUIRE b/REQUIRE
-2
diff --git a/docs/src/man/categorical.md b/docs/src/man/categorical.md
+2 -2
diff --git a/docs/src/man/getting_started.md b/docs/src/man/getting_started.md
+19 -19
diff --git a/docs/src/man/joins.md b/docs/src/man/joins.md
-20
diff --git a/docs/src/man/reshaping_and_pivoting.md b/docs/src/man/reshaping_and_pivoting.md
+7 -7
diff --git a/docs/src/man/split_apply_combine.md b/docs/src/man/split_apply_combine.md
+6 -6
diff --git a/docs/src/man/subsets.md b/docs/src/man/subsets.md
+6 -6
diff --git a/src/DataFrames.jl b/src/DataFrames.jl
+8 -19
@@ -4,7 +4,5 @@ CategoricalArrays 0.2.0
 StatsBase 0.11.0
 SortingAlgorithms
 Reexport
-Compat 0.19.0
 WeakRefStrings 0.3.0
 DataStreams 0.2.0
-CSV 0.2.0
@@ -45,9 +45,9 @@ cv = categorical(v)
 Or you can edit the columns of a `DataFrame` in-place using the `categorical!` function:
 
 ```julia
-dt = DataFrame(A = [1, 1, 1, 2, 2, 2],
+df = DataFrame(A = [1, 1, 1, 2, 2, 2],
                B = ["X", "X", "X", "Y", "Y", "Y"])
-categorical!(dt, [:A, :B])
+categorical!(df, [:A, :B])
 ```
 
 Using categorical arrays is important for working with the [GLM package](https://github.com/JuliaStats/GLM.jl). When fitting regression models, `CategoricalArray` columns in the input are translated into 0/1 indicator columns in the `ModelMatrix` with one column for each of the levels of the `CategoricalArray`. This allows one to analyze categorical data efficiently.
 
@@ -107,59 +107,59 @@ julia> nulls(Int, 1, 3)
 The `DataFrame` type can be used to represent data tables, each column of which is a vector. You can specify the columns using keyword arguments:
 
 ```julia
-dt = DataFrame(A = 1:4, B = ["M", "F", "F", "M"])
+df = DataFrame(A = 1:4, B = ["M", "F", "F", "M"])
 ```
 
 It is also possible to construct a `DataFrame` in stages:
 
 ```julia
-dt = DataFrame()
-dt[:A] = 1:8
-dt[:B] = ["M", "F", "F", "M", "F", "M", "M", "F"]
-dt
+df = DataFrame()
+df[:A] = 1:8
+df[:B] = ["M", "F", "F", "M", "F", "M", "M", "F"]
+df
 ```
 
 The `DataFrame` we build in this way has 8 rows and 2 columns. You can check this using the `size` function:
 
 ```julia
-nrows = size(dt, 1)
-ncols = size(dt, 2)
+nrows = size(df, 1)
+ncols = size(df, 2)
 ```
 
 We can also look at small subsets of the data in a couple of different ways:
 
 ```julia
-head(dt)
-tail(dt)
+head(df)
+tail(df)
 
-dt[1:3, :]
+df[1:3, :]
 ```
 
 Having seen what some of the rows look like, we can try to summarize the entire data set using `describe`:
 
 ```julia
-describe(dt)
+describe(df)
 ```
 
 To focus our search, we start looking at just the means and medians of specific columns. In the example below, we use numeric indexing to access the columns of the `DataFrame`:
 
 ```julia
-mean(Nulls.skip(dt[1]))
-median(Nulls.skip(dt[1]))
+mean(Nulls.skip(df[1]))
+median(Nulls.skip(df[1]))
 ```
 
 We could also have used column names to access individual columns:
 
 ```julia
-mean(Nulls.skip(dt[:A]))
-median(Nulls.skip(dt[:A]))
+mean(Nulls.skip(df[:A]))
+median(Nulls.skip(df[:A]))
 ```
 
 We can also apply a function to each column of a `DataFrame` with the `colwise` function. For example:
 
 ```julia
-dt = DataFrame(A = 1:4, B = randn(4))
-colwise(c->cumsum(Nulls.skip(c)), dt)
+df = DataFrame(A = 1:4, B = randn(4))
+colwise(c->cumsum(Nulls.skip(c)), df)
 ```
 
 ## Importing and Exporting Data (I/O)
@@ -191,8 +191,8 @@ a `DataFrame` rather than the default `DataFrame`. Keyword arguments may be pass
 
 A DataFrame can be written to a CSV file at path `output` using
 ```julia
-dt = DataFrame(x = 1, y = 2)
-CSV.write(output, dt)
+df = DataFrame(x = 1, y = 2)
+CSV.write(output, df)
 ```
 
 For more information, use the REPL [help-mode](http://docs.julialang.org/en/stable/manual/interacting-with-julia/#help-mode) or check out the online [CSV.jl documentation](https://juliadata.github.io/CSV.jl/stable/)!
 
@@ -35,13 +35,8 @@ There are seven kinds of joins supported by the DataFrames package:
 You can control the kind of join that `join` performs using the `kind` keyword argument:
 
 ```julia
-<<<<<<< HEAD
 a = DataFrame(ID = [20, 40], Name = ["John Doe", "Jane Doe"])
 b = DataFrame(ID = [20, 60], Job = ["Lawyer", "Astronaut"])
-=======
-a = DataFrame(ID = [20, 40], Name = ["John Doe", "Jane Doe"])
-b = DataFrame(ID = [20, 60], Job = ["Lawyer", "Astronaut"])
->>>>>>> b196630fa9ba02372a25dec222425d9b804f5fd5
 join(a, b, on = :ID, kind = :inner)
 join(a, b, on = :ID, kind = :left)
 join(a, b, on = :ID, kind = :right)
@@ -56,37 +51,22 @@ Cross joins are the only kind of join that does not use a key:
 join(a, b, kind = :cross)
 ```
 
-<<<<<<< HEAD
-In order to join data frames on keys which have different names, you must first rename them so that they match. This can be done using rename!:
-
-```julia
-a = DataFrame(ID = [20, 40], Name = ["John Doe", "Jane Doe"])
-b = DataFrame(IDNew = [20, 40], Job = ["Lawyer", "Doctor"])
-=======
 In order to join data tables on keys which have different names, you must first rename them so that they match. This can be done using rename!:
 
 ```julia
 a = DataFrame(ID = [20, 40], Name = ["John Doe", "Jane Doe"])
 b = DataFrame(IDNew = [20, 40], Job = ["Lawyer", "Doctor"])
->>>>>>> b196630fa9ba02372a25dec222425d9b804f5fd5
 rename!(b, :IDNew, :ID)
 join(a, b, on = :ID, kind = :inner)
 ```
 
 Or renaming multiple columns at a time:
 
 ```julia
-<<<<<<< HEAD
-a = DataFrame(City = ["Amsterdam", "London", "London", "New York", "New York"],
-              Job = ["Lawyer", "Lawyer", "Lawyer", "Doctor", "Doctor"],
-              Category = [1, 2, 3, 4, 5])
-b = DataFrame(Location = ["Amsterdam", "London", "London", "New York", "New York"],
-=======
 a = DataFrame(City = ["Amsterdam", "London", "London", "New York", "New York"],
               Job = ["Lawyer", "Lawyer", "Lawyer", "Doctor", "Doctor"],
               Category = [1, 2, 3, 4, 5])
 b = DataFrame(Location = ["Amsterdam", "London", "London", "New York", "New York"],
->>>>>>> b196630fa9ba02372a25dec222425d9b804f5fd5
               Work = ["Lawyer", "Lawyer", "Lawyer", "Doctor", "Doctor"],
               Name = ["a", "b", "c", "d", "e"])
 rename!(b, [:Location => :City, :Work => :Job])
 
@@ -43,20 +43,20 @@ d = stack(iris)
 `unstack` converts from a long format to a wide format. By default, it requires specifying which column is the id variable, which column holds the variable names, and which column holds the values:
 
 ```julia
-longdt = melt(iris, [:Species, :id])
-widedt = unstack(longdt, :id, :variable, :value)
+longdf = melt(iris, [:Species, :id])
+widedf = unstack(longdf, :id, :variable, :value)
 ```
 
 If the remaining columns are unique, you can skip the id variable and use:
 
 ```julia
-widedt = unstack(longdt, :variable, :value)
+widedf = unstack(longdf, :variable, :value)
 ```
 
-`stackdt` and `meltdt` are two additional functions that work like `stack` and `melt`, but they provide a view into the original wide DataFrame. Here is an example:
+`stackdf` and `meltdf` are two additional functions that work like `stack` and `melt`, but they provide a view into the original wide DataFrame. Here is an example:
 
 ```julia
-d = stackdt(iris)
+d = stackdf(iris)
 ```
 
 This saves memory. To create the view, several AbstractVectors are defined:
@@ -73,13 +73,13 @@ This repeats the original columns N times where N is the number of columns stack
 For more details on the storage representation, see:
 
 ```julia
-dump(stackdt(iris))
+dump(stackdf(iris))
 ```
 
 None of these reshaping functions perform any aggregation. To do aggregation, use the split-apply-combine functions in combination with reshaping. Here is an example:
 
 ```julia
 d = stack(iris)
-x = by(d, [:variable, :Species], dt -> DataFrame(vsum = mean(Nulls.skip(dt[:value]))))
+x = by(d, [:variable, :Species], df -> DataFrame(vsum = mean(Nulls.skip(df[:value]))))
 unstack(x, :Species, :vsum)
 ```
@@ -12,15 +12,15 @@ using CSV
 iris = CSV.read(joinpath(Pkg.dir("DataFrames"), "test/data/iris.csv"), DataFrame)
 
 by(iris, :Species, size)
-by(iris, :Species, dt -> mean(Nulls.skip(dt[:PetalLength])))
-by(iris, :Species, dt -> DataFrame(N = size(dt, 1)))
+by(iris, :Species, df -> mean(Nulls.skip(df[:PetalLength])))
+by(iris, :Species, df -> DataFrame(N = size(df, 1)))
 ```
 
 The `by` function also supports the `do` block form:
 
 ```julia
-by(iris, :Species) do dt
-   DataFrame(m = mean(Nulls.skip(dt[:PetalLength])), s² = var(Nulls.skip(dt[:PetalLength])))
+by(iris, :Species) do df
+   DataFrame(m = mean(Nulls.skip(df[:PetalLength])), s² = var(Nulls.skip(df[:PetalLength])))
 end
 ```
 
@@ -36,7 +36,7 @@ aggregate(iris, :Species, [sum, x->mean(Nulls.skip(x))])
 If you only want to split the data set into subsets, use the `groupby` function:
 
 ```julia
-for subdt in groupby(iris, :Species)
-    println(size(subdt, 1))
+for subdf in groupby(iris, :Species)
+    println(size(subdf, 1))
 end
 ```
@@ -24,7 +24,7 @@ julia> df = DataFrame(A = 1:10, B = 2:2:20)
 Referring to the first column by index or name:
 
 ```julia
-julia> dt[1]
+julia> df[1]
 10-element Array{Int64,1}:
   1
   2
@@ -37,7 +37,7 @@ julia> dt[1]
   9
  10
 
-julia> dt[:A]
+julia> df[:A]
 10-element Array{Int64,1}:
   1
   2
@@ -54,25 +54,25 @@ julia> dt[:A]
 Referring to the first element of the first column:
 
 ```julia
-julia> dt[1, 1]
+julia> df[1, 1]
 1
 
-julia> dt[1, :A]
+julia> df[1, :A]
 1
 ```
 
 Selecting a subset of rows by index and an (ordered) subset of columns by name:
 
 ```julia
-julia> dt[1:3, [:A, :B]]
+julia> df[1:3, [:A, :B]]
 3×2 DataFrames.DataFrame
 │ Row │ A │ B │
 ├─────┼───┼───┤
 │ 1   │ 1 │ 2 │
 │ 2   │ 2 │ 4 │
 │ 3   │ 3 │ 6 │
 
-julia> dt[1:3, [:B, :A]]
+julia> df[1:3, [:B, :A]]
 3×2 DataFrames.DataFrame
 │ Row │ B │ A │
 ├─────┼───┼───┤
 
@@ -1,5 +1,4 @@
-__precompile__()
-
+__precompile__(true)
 module DataFrames
 
 ##############################################################################
@@ -8,12 +7,9 @@ module DataFrames
 ##
 ##############################################################################
 
-using Reexport
-using StatsBase
-import NullableArrays: dropnull, dropnull!
-@reexport using NullableArrays
-@reexport using CategoricalArrays
-using SortingAlgorithms
+using Reexport, StatsBase, SortingAlgorithms
+@reexport using CategoricalArrays, Nulls
+
 using Base: Sort, Order
 import Base: ==, |>
 
@@ -23,14 +19,7 @@ import Base: ==, |>
 ##
 ##############################################################################
 
-export @~,
-       @csv_str,
-       @csv2_str,
-       @formula,
-       @tsv_str,
-       @wsv_str,
-
-       AbstractDataFrame,
+export AbstractDataFrame,
        DataFrame,
        DataFrameRow,
        GroupApplied,
@@ -51,7 +40,6 @@ export @~,
        eachrow,
        eltypes,
        groupby,
-       head,
        melt,
        meltdf,
        names!,
@@ -66,7 +54,6 @@ export @~,
        showcols,
        stack,
        stackdf,
-       tail,
        unique!,
        unstack,
        head,
@@ -83,13 +70,15 @@ export @~,
 ##
 ##############################################################################
 
+const _displaysize = Base.displaysize
+
 for (dir, filename) in [
         ("other", "utils.jl"),
         ("other", "index.jl"),
 
         ("abstractdataframe", "abstractdataframe.jl"),
         ("dataframe", "dataframe.jl"),
         ("subdataframe", "subdataframe.jl"),
         ("groupeddataframe", "grouping.jl"),
         ("dataframerow", "dataframerow.jl"),
         ("dataframerow", "utils.jl"),
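
Review note on the `docs/src/man/joins.md` hunk: the join kinds shown there (`:inner`, `:left`, `:right`, `:cross`) differ only in which keys survive. The sketch below illustrates that with plain Base Julia on the `ID` values from the documented example. It is not the DataFrames `join` implementation, and the names `a_ids`, `b_ids`, `inner_ids`, `left_ids`, `right_ids`, and `cross_pairs` are hypothetical, introduced only for this illustration.

```julia
# Illustration only: which keys each join kind keeps, using Base Julia
# set operations rather than the DataFrames `join` function.
a_ids = [20, 40]   # ID column of `a` in the joins.md example
b_ids = [20, 60]   # ID column of `b` in the joins.md example

# :inner keeps only keys present in both tables.
inner_ids = intersect(a_ids, b_ids)

# :left keeps every key of the left table; :right every key of the right one.
left_ids  = copy(a_ids)
right_ids = copy(b_ids)

# :cross uses no key at all: every row of `a` is paired with every row of `b`.
cross_pairs = [(x, y) for x in a_ids for y in b_ids]

println("inner: ", inner_ids)
println("left:  ", left_ids)
println("right: ", right_ids)
println("cross row count: ", length(cross_pairs))
```

With these inputs the inner join keeps a single key and the cross join yields one row per pair, which is why the cross join in the hunk takes no `on` argument.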