% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/annotations.R
\name{annotaTEs}
\alias{annotaTEs}
\title{Get RepeatMasker UCSC annotations}
\usage{
annotaTEs(
  genome = "hg38",
  parsefun = rmskidentity,
  verbose = TRUE,
  AHid = NULL,
  ...
)
}
\arguments{
\item{genome}{The genome version of the desired RepeatMasker annotations
(e.g. "hg38").}

\item{parsefun}{A \code{function} to parse the annotations:
\itemize{
  \item Function \code{rmskidentity} returns RepeatMasker annotations as
        present in \link[AnnotationHub]{AnnotationHub}, without
        processing them.
  \item Function \code{rmskbasicparser} parses annotations by removing
        low complexity regions, simple repeats, satellites, rRNA, scRNA,
        snRNA, srpRNA and tRNA. It also removes TEs whose strand is
        neither "+" nor "-", and modifies the "repFamily" and
        "repClass" columns when a "?" is present or when they are defined
        as "Unknown" or "Other". Finally, it assigns a unique id to each TE
        instance by adding the suffix "_dup" plus a number at the end of
        the "repName".
  \item Function \code{rmskatenaparser} parses RepeatMasker annotations by
        reconstructing fragmented TEs: fragments from the same TE that are
        close enough are assembled together. For LTR class TEs, it tries to
        reconstruct full-length and partial TEs following the LTR - internal
        region - LTR structure. Input is a \code{GRanges} object and output
        is a \code{GRangesList} object.
  \item Function \code{OneCodeToFindThemAll} parses annotations following
        the 'One code to find them all' method by 
        \href{https://doi.org/10.1186/1759-8753-5-13}{(Bailly-Bechet et al. 2014)}. 
        Input is a \code{GRanges} object and output is a \code{GRangesList} 
        object.
  \item User-defined function. Input and output should be \code{GRanges}
        objects.
}}

\item{verbose}{(Default \code{TRUE}) Logical value indicating whether to
report progress.}

\item{AHid}{AnnotationHub unique identifier, of the form AH12345, of an
object with TE annotations. This optional argument selects a
specific AnnotationHub resource, for instance
when more than one RepeatMasker annotation is
available for a given genome version. If \code{AHid} is
not specified, the latest RepeatMasker annotation is used.}

\item{...}{Arguments passed to \code{parsefun}.}
}
\value{
A \code{\link[GenomicRanges:GRanges-class]{GRanges}} object with
        transposable element annotations.
}
\description{
The \code{annotaTEs()} function fetches RepeatMasker UCSC transposable
element (TE) annotations using 
\link[AnnotationHub]{AnnotationHub} and parses them.
}
\details{
Given a specific genome version, the \code{annotaTEs()} function fetches
RepeatMasker annotations from the UCSC Genome Browser using the 
\link[AnnotationHub]{AnnotationHub} package. Since RepeatMasker provides
not only TE annotations but also low complexity DNA sequences and other
types of repeats, a specific \code{parsefun} can be set to parse these
annotations (e.g. \code{rmskbasicparser} or a user-defined function). If no
parsing is required, \code{parsefun} can be set to \code{rmskidentity}.
}
\examples{
rmskid <- annotaTEs(genome="hg19", parsefun=rmskidentity)
rmskid
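
## A minimal sketch of a user-defined 'parsefun'. It assumes that
## annotaTEs() calls 'parsefun' with the fetched GRanges object as its
## first argument, and that the annotations carry a "repClass" metadata
## column. The name 'keepLTRs' is hypothetical and not part of the package.
keepLTRs <- function(gr, ...) {
  cls <- as.character(gr$repClass)
  ## keep only ranges annotated as LTR; input and output are both GRanges
  gr[!is.na(cls) & cls == "LTR"]
}

## Toy annotations (made up for illustration) to try the parser without
## downloading anything
library(GenomicRanges)
toy <- GRanges(seqnames="chr1",
               ranges=IRanges(start=c(1, 101, 201), width=50),
               strand=c("+", "-", "+"),
               repName=c("L1PA2", "MER4A", "AluY"),
               repClass=c("LINE", "LTR", "SINE"))
keepLTRs(toy)

## Under the assumptions above, the same function could then be passed as:
## rmskltr <- annotaTEs(genome="hg19", parsefun=keepLTRs)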


}
\seealso{
\code{\link[AnnotationHub]{AnnotationHub}}
}
