% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/LSD_functions.R
\name{Brick_local_score_differentiator}
\alias{Brick_local_score_differentiator}
\title{Call TADs with the Local Score Differentiator on a Hi-C matrix}
\usage{
Brick_local_score_differentiator(
    Brick,
    chrs = NULL,
    resolution = NA,
    all_resolutions = FALSE,
    min_sum = -1,
    di_window = 200L,
    lookup_window = 200L,
    tukeys_constant = 1.5,
    strict = TRUE,
    fill_gaps = TRUE,
    ignore_sparse = TRUE,
    sparsity_threshold = 0.8,
    remove_empty = NULL,
    chunk_size = 500,
    force_retrieve = TRUE
)
}
\arguments{
\item{Brick}{\strong{Required}.
An object of class BrickContainer, such as the one returned by
\code{Create_many_Bricks}.}

\item{chrs}{\strong{Optional}. Default NULL
If provided, TADs are called only for the chromosomes listed in \emph{chrs}.}

\item{resolution}{\strong{Optional}. Default NA
When an object of class BrickContainer is provided, \emph{resolution}
defines the resolution at which the function is executed.}

\item{all_resolutions}{\strong{Optional}. Default FALSE
If \emph{resolution} is not defined and \emph{all_resolutions} is TRUE, the
resolution parameter is ignored and the function is executed on all files
listed in the Brick container.}

\item{min_sum}{\strong{Optional}. Default -1
Only bins whose matrix row sums are greater than \emph{min_sum} are
processed.}

\item{di_window}{\strong{Optional}. Default 200
Window size, in bins, over which the directionality index is computed.}

\item{lookup_window}{\strong{Optional}. Default 200
Size of the local window used to call borders. For small
\emph{di_window} values we recommend setting this to 2*\emph{di_window}.}

\item{tukeys_constant}{\strong{Optional}. Default 1.5
\emph{tukeys_constant}*IQR (inter-quartile range) defines the lower and upper
fence values.}

\item{strict}{\strong{Optional}. Default TRUE
If TRUE, an additional filter is applied to the directionality index: it
must be greater than 0 for domain starts (right tail) and less than 0 for
domain ends (left tail).}

\item{fill_gaps}{\strong{Optional}. Default TRUE
If TRUE, this affects the TAD stitching process. Every border start is
stitched to the next downstream border end, so some border ends may remain
unassociated with a border start. When \emph{fill_gaps} is TRUE, each such
border end is stitched to the bin immediately downstream of its nearest
upstream border end.

TADs inferred in this way are annotated with two metadata columns in the
GRanges object: \emph{gap.fill} holds a value of 1 and \emph{level} holds a
value of 1. TADs which were not filled in hold a gap.fill value of 0 and a
level value of 2.}

\item{ignore_sparse}{\strong{Optional}. Default TRUE
If TRUE, a matrix which was defined as sparse during the matrix loading
process is treated as a dense matrix, and the \emph{sparsity_threshold}
filter is not applied. Please note that if a matrix is defined as
sparse and fill_gaps is TRUE, fill_gaps will be turned off.}

\item{sparsity_threshold}{\strong{Optional}. Default 0.8
Sparsity threshold relates to the sparsity index, which is computed as the
number of non-zero bins at a certain distance from the diagonal. If a matrix
is sparse and ignore_sparse is FALSE, bins which have a sparsity index value
below this threshold will be discarded from DI computation.}

\item{remove_empty}{Not implemented.
After implementation, this will ensure that the presence of centromeric
regions is accounted for.}

\item{chunk_size}{\strong{Optional}. Default 500
The size of the matrix chunk to process. This value should be larger than 2x
di_window.}

\item{force_retrieve}{\strong{Optional}. Default TRUE
If TRUE, this forces the retrieval of a matrix chunk even when the
retrieval includes interaction points which were not loaded into a Brick
store (larger chunks). Please note that this does not mean that DI can be
computed at distances larger than the max distance. Rather, it is meant to
speed up computation.}
}
\value{
A GRanges object containing domain definitions. The starts and ends
of the ranges coincide with the starts and ends of their contained bins from
the bintable.
}
\description{
\code{Brick_local_score_differentiator} calls topologically associating
domains (TADs) on Hi-C matrices. At the most fundamental level, the local
score differentiator is a change point detector: it detects change points in
the directionality index using thresholds defined on local directionality
index distributions.
The directionality index (DI) is calculated as defined by Dixon et al. (2012,
Nature). Next, the difference in DI between neighbouring bins is calculated
to obtain the change in DI at each bin. When the DI goes from a highly
negative value to a highly positive one, as is expected at domain
boundaries, the resulting DI difference distribution is very flat and
punctuated by very large peaks marking the regions where such a change may
take place. We use two difference vectors: one is the difference between a
bin and its adjacent downstream bin, and the other is the difference between
a bin and its adjacent upstream bin. Using these vectors and the original
directionality index, we define domain borders as outliers.
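
For reference, the directionality index of Dixon et al. can be written as
shown below. Here A and B denote the summed contacts between a bin and the
bins within the window upstream and downstream of it, respectively, and
E = (A + B)/2 is the expected count. This restates the published definition;
it is assumed, not verified here, that the window corresponds to
\emph{di_window}.

\deqn{DI = \frac{B - A}{|B - A|}\left(\frac{(A - E)^2}{E} + \frac{(B - E)^2}{E}\right)}{DI = ((B - A)/|B - A|) * ((A - E)^2/E + (B - E)^2/E)}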
}
\details{
To define an outlier, fences are first defined using
tukeys_constant x interquartile range (IQR) of the directionality index. The
upper fence, used for detecting domain starts, is the 75th percentile +
(IQR x tukeys_constant), while the lower fence, used for detecting domain
ends, is the 25th percentile - (IQR x tukeys_constant). For a domain start,
the DI difference must be greater than or equal to the upper fence and
greater than the DI, and the DI must be a finite real value. If strict is
TRUE, the DI must also be greater than 0. Similarly, for a domain end, the
DI difference must be lower than or equal to the lower fence and lower than
the DI, and the DI must be a finite real value. If strict is TRUE, the DI
must also be lower than 0.
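
As an illustration, the outlier rules can be sketched in base R on a toy DI
vector. This is a simplified sketch and not the package's internal code: the
DI values are made up, the fences are computed once over the whole vector
rather than within a local \emph{lookup_window}, and pairing the upstream
difference with domain starts and the downstream difference with domain ends
is an assumption.

\preformatted{
# Toy directionality index, one value per bin (hypothetical numbers)
di <- c(-1, -2, -9, 10, 2, 1, 0, -1, -2, -8, 9, 1)
tukeys_constant <- 1.5

# Tukey fences computed on the DI values
quartiles <- unname(stats::quantile(di, probs = c(0.25, 0.75)))
iqr <- quartiles[2] - quartiles[1]
upper_fence <- quartiles[2] + tukeys_constant * iqr
lower_fence <- quartiles[1] - tukeys_constant * iqr

# DI difference to the adjacent upstream and downstream bin (assumed pairing)
diff_upstream <- c(NA, di[-1] - di[-length(di)])
diff_downstream <- c(di[-length(di)] - di[-1], NA)

# Domain starts: right-tail outliers, with the strict filter (DI > 0)
domain_starts <- which(diff_upstream >= upper_fence &
    diff_upstream > di & is.finite(di) & di > 0)

# Domain ends: left-tail outliers, with the strict filter (DI < 0)
domain_ends <- which(diff_downstream <= lower_fence &
    diff_downstream < di & is.finite(di) & di < 0)
}

With these toy values the sketch flags bins 4 and 11 as domain starts and
bins 3 and 10 as domain ends.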

After defining outliers, each domain start is associated with its
nearest downstream domain end. If \emph{fill_gaps} is TRUE and
some domain ends remain unassociated with a domain start, these
domain ends are associated with the bin adjacent to their nearest upstream
domain end. These associations are marked by the metadata columns
gap.fill = 1 and level = 1.
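
The stitching step can be sketched in base R as follows. This is a
minimal illustration under stated assumptions, not the package's
implementation: the bin indices are made up, results are kept in a plain
data.frame rather than a GRanges object, and the toy input is chosen so that
every start has a downstream end and every unassociated end has an upstream
end.

\preformatted{
# Hypothetical border calls, expressed as bin indices
starts <- c(2L, 12L)
ends <- c(6L, 9L, 18L)

# Pair each domain start with its nearest downstream domain end
paired_end <- vapply(starts, function(s) min(ends[ends > s]), integer(1))
domains <- data.frame(start = starts, end = paired_end,
    gap.fill = 0, level = 2)

# Domain ends left without a start
orphan_ends <- setdiff(ends, paired_end)

# fill_gaps: begin at the bin after the nearest upstream domain end
fill_start <- vapply(orphan_ends,
    function(e) max(ends[ends < e]) + 1L, integer(1))
filled <- data.frame(start = fill_start, end = orphan_ends,
    gap.fill = 1, level = 1)

# Combine and order by position
tads <- rbind(domains, filled)
tads <- tads[order(tads$start), ]
}

Here the end at bin 9 has no start of its own, so it yields a gap-filled
domain spanning bins 7 to 9, between the domains 2 to 6 and 12 to 18.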

This function is designed to call TADs quickly and accurately.
}
\examples{
# Locate the example bin table shipped with HiCBricks
Bintable.path <- system.file(file.path("extdata", "Bintable_100kb.bins"),
    package = "HiCBricks")

# Create a Brick store in a temporary directory
out_dir <- file.path(tempdir(), "lsd_test")
dir.create(out_dir)

My_BrickContainer <- Create_many_Bricks(BinTable = Bintable.path,
    bin_delim = " ", output_directory = out_dir, file_prefix = "Test",
    experiment_name = "Vignette Test", resolution = 100000,
    remove_existing = TRUE)

# Load the example chr3R matrix at 100 kb resolution
Matrix_file <- system.file(file.path("extdata",
    "Sexton2012_yaffetanay_CisTrans_100000_corrected_chr3R.txt.gz"),
    package = "HiCBricks")

Brick_load_matrix(Brick = My_BrickContainer, chr1 = "chr3R",
    chr2 = "chr3R", matrix_file = Matrix_file, delim = " ",
    remove_prior = TRUE, resolution = 100000)

# Call TADs on chr3R
TAD_ranges <- Brick_local_score_differentiator(Brick = My_BrickContainer,
    chrs = "chr3R", resolution = 100000, di_window = 10, lookup_window = 30,
    strict = TRUE, fill_gaps = TRUE, chunk_size = 500)
}
