-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add proptable #19
add proptable #19
Changes from 1 commit
961525e
6268227
f65d113
a9070ae
4e715b6
495ce57
a660551
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -132,5 +132,46 @@ function freqtable(d::AbstractDataFrame, x::Symbol...; args...) | |
a | ||
end | ||
|
||
""" | ||
prop(tbl, [margin]) | ||
|
||
Create table of proportions from a frequency table `tbl` with margins generated for | ||
dimensions specified by `margin`. Does not check if `tbl` contains non-negative frequencies. | ||
Calculating `sum` over the result of `prop` over dimensions that are complement of `margin` | ||
produces `AbstractArray` containing only approximately `1.0`, see last example below. | ||
|
||
**Arguments** | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The docstring recommendations in the manual say that we should avoid using a separate section for arguments then possible. Here, it would be clearer to mention the expected types of arguments in the signature, and explain what they do directly in the description. It would also be be nice to be less technical and explain in very simple terms how to compute row and column proportions. No need to mention "approximately" nor duplicated entries. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. fixed There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. although There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. sorry - given your comment below I can put it into the signature :) |
||
|
||
* `tbl` : `AbstractArray{<:Number}` containing frequency table | ||
* `margin` : `Integer` or collection of `Integer`s of dimensions to generate proportions by; | ||
duplicated entries are ignored, e.g. [1, 1, 2] is the same as `[1, 2]`; | ||
if omitted proportions over the whole `tbl` are computed | ||
|
||
**Result** | ||
|
||
* `::AbstractArray` : array containing calculated proportions | ||
|
||
**Examples** | ||
|
||
```julia | ||
prop([1 2; 3 4]) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Better use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK |
||
prop([1 2; 3 4], ()) | ||
prop([1 2; 3 4], 1) | ||
prop([1 2; 3 4], [1]) | ||
prop([1 2; 3 4], (1,2)) | ||
tbl = prop(rand(5, 5, 5, 5), (1, 2)) | ||
sumtbl = sum(tbl, (3,4)) | ||
all(x -> x ≈ 1.0, sumtbl) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No need to show that, just print the output of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK given the change above |
||
``` | ||
|
||
""" | ||
|
||
prop(tbl::AbstractArray{<:Number}) = tbl / sum(tbl) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could you add a docstring? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wanted to add a docstring for |
||
prop(tbl::AbstractArray{<:Number}, dims) = tbl ./ sum(tbl, dims) | ||
|
||
function prop(tbl::AbstractArray{<:Number,N}, margin) where N | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why not use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
if !all(x -> isa(x, Integer) && (1 <= x <= N), margin) | ||
throw(ArgumentError("invalid margin argument; expected integers in a valid dimension range")) | ||
end | ||
minimum | ||
tbl ./ sum(tbl, tuple(setdiff(1:N, margin)...)) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is that inferrable? It would be nice to add There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. it is; added check to tests There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Great! |
||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -22,27 +22,32 @@ tab = @inferred freqtable(x, y) | |
0.075 0.05 0.05 0.075; | ||
0.05 0.075 0.075 0.05; | ||
0.05 0.075 0.075 0.05] | ||
@test prop(tab, (1,2)) == [0.075 0.05 0.05 0.075; | ||
0.075 0.05 0.05 0.075; | ||
0.05 0.075 0.075 0.05; | ||
0.05 0.075 0.075 0.05] | ||
@test prop(tab, 1) == [0.3 0.2 0.2 0.3; | ||
@test prop(tab, ()) == [0.075 0.05 0.05 0.075; | ||
0.075 0.05 0.05 0.075; | ||
0.05 0.075 0.075 0.05; | ||
0.05 0.075 0.075 0.05] | ||
@test prop(tab, 2) == [0.3 0.2 0.2 0.3; | ||
0.3 0.2 0.2 0.3; | ||
0.2 0.3 0.3 0.2; | ||
0.2 0.3 0.3 0.2] | ||
@test prop(tab, 2) == [0.3 0.2 0.2 0.3; | ||
@test prop(tab, 1) == [0.3 0.2 0.2 0.3; | ||
0.3 0.2 0.2 0.3; | ||
0.2 0.3 0.3 0.2; | ||
0.2 0.3 0.3 0.2] | ||
@test prop(tab, ()) == [1.0 1.0 1.0 1.0; | ||
1.0 1.0 1.0 1.0; | ||
1.0 1.0 1.0 1.0; | ||
1.0 1.0 1.0 1.0] | ||
@test prop(tab, (1, 2)) == [1.0 1.0 1.0 1.0; | ||
1.0 1.0 1.0 1.0; | ||
1.0 1.0 1.0 1.0; | ||
1.0 1.0 1.0 1.0] | ||
|
||
tbl = prop(rand(5, 5, 5, 5), (1, 2)) | ||
sumtbl = sum(tbl, (3,4)) | ||
@test all(x -> x ≈ 1.0, sumtbl) | ||
|
||
@test_throws MethodError prop() | ||
@test_throws MethodError prop([1,2,3], ("a","b")) | ||
@test_throws MethodError prop(("a","b")) | ||
@test_throws MethodError prop((1, 2)) | ||
@test_throws ArgumentError prop([1,2,3], ("a","b")) | ||
@test_throws ArgumentError prop([1,2,3], 2) | ||
|
||
tab =freqtable(x, y, | ||
subset=1:20, | ||
|
@@ -53,12 +58,12 @@ tab =freqtable(x, y, | |
1.5 1.0] | ||
@test names(tab) == [["a", "b", "c", "d"], ["C", "D"]] | ||
@test prop(tab) == [4 6; 2 3; 6 4; 3 2] / 30.0 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Wouldn't hurt to have There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done |
||
@test prop(tab, 1) == [8 12; 4 6; 12 8; 6 4] / 30.0 | ||
@test prop(tab, (1,)) == [8 12; 4 6; 12 8; 6 4] / 30.0 | ||
@test prop(tab, 2) == [6 9; 6 9; 9 6; 9 6] / 15.0 | ||
@test prop(tab, (2,)) == [6 9; 6 9; 9 6; 9 6] / 15.0 | ||
@test prop(tab, ()) == [1.0 1.0; 1.0 1.0; 1.0 1.0; 1.0 1.0] | ||
@test prop(tab, (1, 2)) == [4 6; 2 3; 6 4; 3 2] / 30.0 | ||
@test prop(tab, 2) == [8 12; 4 6; 12 8; 6 4] / 30.0 | ||
@test prop(tab, (2,)) == [8 12; 4 6; 12 8; 6 4] / 30.0 | ||
@test prop(tab, 1) == [6 9; 6 9; 9 6; 9 6] / 15.0 | ||
@test prop(tab, (1,)) == [6 9; 6 9; 9 6; 9 6] / 15.0 | ||
@test prop(tab, (1, 2)) == [1.0 1.0; 1.0 1.0; 1.0 1.0; 1.0 1.0] | ||
@test prop(tab, ()) == [4 6; 2 3; 6 4; 3 2] / 30.0 | ||
|
||
using CategoricalArrays | ||
cx = CategoricalArray(x) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessarily a frequency table: any numeric table will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed