- Fixed a misplacement of
na.rm
incompartment_matrixplot()
(#328) - The
tornado_insulation()
function now accepts a list of bed-like data.frames (#266). - Exposed the
strand
argument inARA()
, making it easier to do strand aware ARAs (#268) - Renamed
cut.off
tocolour_lim
and allowc(minimum, maximum)
specification, rather thancut.off = maximum
inhic_matrixplot()
(#260). - Added a
colour_bar
legend option in*_matrixplot()
family (#260). - The
par()
options now reset to original state after calling*_matrixplot()
family of functions. - Fixed a bug with
NA
s insaddle()
(#263). - Removed {bigwrig} as a dependency in favour of Bioconductor's {rtracklayer}.
- Fixed a bug wherein the correct resolution couldn't be found in
.mcool
files due to scientific formatting of large numbers (#248, #249). - Fixed a bug in
insulation_score()
wherein formatting of sample names could throw errors (#246). - In
load_contacts()
, you can now setcentromeres = FALSE
to signal that the data doesn't contain centromeres (#241). - Fixed a balancing bug in .hic file loading (#225).
This version marks a large refactoring of the code base in an effort to increase consistency. However, it also breaks a lot of old code. It is the version that was used during submission of the GENOVA publication.
- A loaded Hi-C experiment is now dubbed a class object.
- Added support for loading data from the cooler and juicer pipelines. Juicer import requires the {strawr} dependency and cooler import requires the {Rhdf5} dependency.
- Centromere information is automatically estimated while loading the data by looking for the largest stretch of empty bins.
- Added the
sync_indices()
function to harmonise data from various pipelines.
- Results from analysis functions have been dubbed class objects.
- For quantification of analysis, the
quantify()
generic was added with methods for several objects. - Likewise, for visualising results, the
visualise()
generic was added with methods for several objects. Additionaly, some base R equivalents are available as methods to theplot()
function. - New
bundle()
andunbundle()
functions to more easily combine objects. - New
ARA()
functions for aggregating on-diagonal regions, convenient for stripe analysis. - New
CSCAn()
function for crosswise intersections of different sets of regions. - New
anchors_*()
family of functions for related analysis functions for repeated lookup analyses, to make these analyses more flexible. - New
anchors_extendedloops()
function to recapitulate the Haarhuis et al. (2017) strategy for loop anchors. - Standardised the
APA()
,ATA()
andPESCAn()
functions to use a common pattern for repeatedly looking up regions in the Hi-C map. - Speed-up of the repeated lookup functions by making better use of {data.table} joins.
- The
insulation_score()
algorithm was sped up for larger data. - TAD calling via
call_TAD_insulation()
was sped up and closer to the description of the algorithm in Crane et al. (2015). - Compartment strength is no longer calculated in the deprecated
visualise.compartmentStrength()
, but in thequantify.saddle_discovery()
method. - Refactored the
RCP()
function.
- Added colour palettes for sequential and divergent values
(see
?GENOVA_colours
). The default sequential palette can be changed usingoptions("GENOVA.colour.palette" = {a character vector of colours})
. - Wrappers for these colour palettes in {ggplot2} scales are in the
scale_{colour/fill}_GENOVA()
andscale_{colour/fill}_GENOVA_div()
functions for sequential and divergent palettes respectively. - Added the
pyramid()
andpyramid_difference()
function for plotting a Hi-C map at 45 degree angles. - The
pyramid()
plot can be annotated by theadd_bed_graph()
,add_bed_track()
,add_ctcf_sites()
functions. - The
pyramid()
plot can be annotated by compartment score, insulation score, directionality index, domainograms, virtual 4C or arbitrary ggplot2 layer using theas_track()
function. - Most
image()
-based plotting now have therasterise
argument. - Replaced the
visualise.{analysis}.ggplot()
pattern functions by avisualise()
generic with methods for objects.
Generally, functions have been renamed to prevent dot.case
functions implying
S3 methods or lowerCamelCase
implying S4 functions. Instead they now all use
the snake_case
.
Old name | New name |
---|---|
hic.matrixplot() |
hic_matrixplot() |
construct.experiment() |
load_contacts() |
insulation.score() |
insulation_score() |
insulation.domainogram() |
insulation_domainogram() |
fastDI() |
direct_index() |
chromosomeMatrix() |
chromsome_matrix() |
compartment.score() |
compartment_score() |
insulation.callTAD() |
call_TAD_insulation() |
intra.inter.TAD.contacts() |
intra_inter_TAD() |
select.subset() |
select_subset() |
trans.compartment.plot() |
trans_matrixplot() |
- The
badBin.find()
function. - The
badBin.plot()
function. - The
rescale()
function in favour ofscales::rescale()
. - The
resize_mat()
function. - The
differential.TAD.dotplot/scatterplot()
function in favour of thevisualise.IIT_discovery()
method. - The
frequency.from.matrix()
function. - The
chrom.comparison.plot()
function in favour of thevisualise.chrommat_discovery()
method. - The
quantifyAPA()
function in favour of thequantify.APA_discovery()
method. - The
cisTotal.perChrom()
andcisTotal.scores()
functions in favour of thecis_trans()
function. - The
insulation.plot.dual()
andinsulation.plot.single()
functions in favour of theinsulation_matrixplot()
.
-
Added the
get_test_data()
function and 40k/150k datasets of two small chromosomes to quickly try some functions with some data. It uses the {pkgfilecache} -
Limit the number of threads used by {data.table} to 1 by default. We noticed that this is faster if functions are executed in a loop.
-
Removed {reshape2} and {dplyr} dependencies.
-
Replaced {bigwrig} dependency to {rtracklayer} for importing bigwig data.
-
Added the
expnames()
function to glance which samples were involved in a or object. -
Added the
resolution()
function to glance at which resolution a object was calculated or object has.
- HiCseg.callTAD: a function to call TADs with HiCseq (Levy-Leduc) per chromosome-arm or windowed chromosomes.
- construct.experiment: added an option to get Z-score normalised experiment-objects.
- hic.matrixplot: added an option to use Z-score normalised experiment-objects.
- insulation.heatmap: added option to set leftlost bin to 0
- insulation.heatmap: added option to use a borders-file (also for peaks!)
- APA: sped up by using a vapply loop (@MarijneMia)
- ATA: sped up by using a vapply loop
- select.subset: sped up the pos-lookup
- visualise.APA.ggplot: better divergent colors (same as Z-score colors in hic.matrixplot)
- visualise.ATA.ggplot: better divergent colors (same as Z-score colors in hic.matrixplot)
- genome.wide.insulation: added speudocounts
- compartment.score: uses a more robust handing of comparable tracks.
- genome.wide.insulation::matrix.insulation: shifted output half a bin upstream.
- genome.wide.insulation: forcing even window-sizes to get score between bins.
- insulation.heatmap: nicer plotting of profile and heatmap without grid
- insulation.heatmap: refactored the scripts.
Github-hash: 9951098
- ATA: outputs also the used TADs.bed
- HiC_matrixplot: smoothing of the regions with no data (e.g. the white stripes) has been implemented. This is done by filling these bins with the result of a Nadaraya/Watson normalization of the kernel. Set
smoothNA
to true to use andsmoothBandwidth
to tweak - HiC_matrixplot: If
chip.yMax
is NULL, a warning will be given for the used yMaxes. - PE-SCAn: A maximal distance can be given with
maxDist
. - PE-SCAn: verbosity can be limited off with
verbose = F
- PE-SCAn: Catches errors when no matrices are found by cov2d.
- PE-SCAn: A treshold for the minimum amount of BED-entries per chromosome is added:
minComparables
. - PE-SCAn: Shifted bed-entries bigger than their chromosome are now fixed (they "wrap around" to the start)
- PE-SCAn: Returns not only the O/E matrix, but a list with a O/E score-matrix (if shift != 0), otherwise an observed score-matrix, the underlying signal and background-matrices and the shift used.
- visualise.PESCAn.ggplot: several checks to see if the input-data is comparable (including shift).
- visualise.PESCAn.ggplot: colorscales for O/E and O are better (i.e. no log2-divergent scale for shift==0 matrices).
- cis.compartment.plot: a second experiment-object can be used. This is plotted in the lower-left corner. Note: only the first matrix is invisibly returned.
- cis.compartment.plot: smoothing of the regions with no data (e.g. the white stripes) has been implemented. This is done by filling these bins with the result of a Nadaraya/Watson normalization of the kernel. Set
smoothNA
to true to use andsmoothBandwidth
to tweak - getTADstats: a new function to get some stats on ATA-results. Still quite buggy.
- saddle: Forces user for either CS or chip.
- README now contains actual text.
- HiC_matrixplot:
bed.col
andbw.col
are merged to chip.col. This argument can take a vector of four colours, parallel to thechip
-argument. - HiC_matrixplot:
yMax
is renamed tochip.ymax
. - visualise.PESCAn.ggplot: works woth new output of PE-SCAn
- PE-SCAn: cov2d loops over a vectorised list without duplicates, which speeds up the code.
- PE-SCAn: cleaned up cov2d's return-code
- quantifyAPA: the
enrichmentType
argument lets you choose between pixel/mean(backgroundRegions) and the fraction of loops with more than 50% higher signal than the background. - visualise.compartmentStrength: at least one bin is used (bugfix).
- saddle: extra effort to use the CS is taken by matching the CS-vector and the eigen-vector. This will fix problems due to big centromeres or shaky/different centromere-calls.
- vignette: fixed typo's
Github-hash: 8edef82
Github-hash: 35772ec
Github-hash: 7deebca