% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/workflow.R
\name{pureClipGeneWiseFilter}
\alias{pureClipGeneWiseFilter}
\title{Filter PureCLIP sites by their score distribution per gene}
\usage{
pureClipGeneWiseFilter(
  object,
  cutoff = 0.05,
  overlaps = c("keepSingle", "removeAll", "keepAll"),
  anno.annoDB = NULL,
  anno.genes = NULL,
  match.score = "score",
  match.geneID = "gene_id",
  quiet = FALSE
)
}
\arguments{
\item{object}{a \code{\link{BSFDataSet}} object with stored crosslink ranges
of width=1}

\item{cutoff}{numeric; the fraction of lowest-scoring sites to remove on each
gene. The smallest step is 1\% (0.01). A cutoff of 5\% (0.05) removes the 5\%
of sites with the lowest scores on each gene, thus keeping the strongest 95\%.}

\item{overlaps}{character; how overlapping gene loci should be handled. One of
\code{keepSingle}, \code{removeAll} or \code{keepAll}; see Details.}

\item{anno.annoDB}{an object of class \code{OrganismDbi} that contains
the gene annotation (experimental).}

\item{anno.genes}{an object of class \code{\link{GenomicRanges}} that represents
the gene ranges directly}

\item{match.score}{character; meta column name of the crosslink site
\code{\link{GenomicRanges}} object that holds the score which is used for
sub-setting}

\item{match.geneID}{character; meta column name of the genes
\code{\link{GenomicRanges}} object that holds a unique geneID}

\item{quiet}{logical; whether to print messages}
}
\value{
an object of class \code{\link{BSFDataSet}} whose ranges are restricted
to the sites that passed the gene-wise threshold set with \code{cutoff}
}
\description{
Function that applies a filter on the crosslink site score distribution at
gene level. This allows filtering for the sites with the strongest signal
on each gene. Because scores are tied to the expression level of the hosting
transcript, filtering per gene treats all genes fairly and is partially
independent of their expression level.
}
\details{
The \code{\link{GenomicRanges}} contained in the \code{\link{BSFDataSet}} need to
have a meta-column that holds a numeric score value, which is used for filtering.
The name of the column can be set with \code{match.score}.

In the case of overlapping gene annotation, a single crosslink site will be
attributed to multiple genes. The \code{overlaps} parameter controls how
these cases are handled. Option \code{keepSingle} keeps only a single instance
of the site; \code{removeAll} removes all instances of the site;
\code{keepAll} keeps all instances.

The function is part of the standard workflow performed by \code{\link{BSFind}}.
}
\examples{
# load clip data
files <- system.file("extdata", package="BindingSiteFinder")
load(list.files(files, pattern = ".rda$", full.names = TRUE))
# Load GRanges with genes
load(list.files(files, pattern = ".rds$", full.names = TRUE)[1])
# apply 5\% gene-wise filter
pureClipGeneWiseFilter(object = bds, anno.genes = gns, cutoff = 0.05,
                       overlaps = "keepSingle")
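
# Illustration only: a minimal base-R sketch of what a gene-wise cutoff means.
# It assumes (an assumption, not the internal code of this package) that the
# threshold is the cutoff quantile of the scores on each gene. The names toy,
# thresh and keep are hypothetical and exist only for this sketch.
toy <- data.frame(
  gene_id = rep(c("geneA", "geneB"), each = 5),
  score   = c(1, 2, 3, 4, 5, 10, 20, 30, 40, 50)
)
# per-gene threshold at the 20th percentile, recycled to one value per row
thresh <- ave(toy$score, toy$gene_id,
              FUN = function(x) quantile(x, probs = 0.2, names = FALSE))
# keep the sites that score at or above the threshold of their own gene
keep <- toy[toy$score >= thresh, ]
# geneB scores are ten times higher, yet both genes lose the same share of
# sites, which is the point of filtering per gene rather than globally
table(keep$gene_id)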

}
\seealso{
\code{\link{BSFind}}, \code{\link{estimateBsWidthPlot}}
}
