% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/annotation.R
\name{n_annotations}
\alias{n_annotations}
\alias{has_annotation}
\title{Number of annotated items}
\usage{
n_annotations(
  dag,
  terms = NULL,
  uniquify = simona_opt$anno_uniquify,
  use_cache = simona_opt$use_cache
)

has_annotation(dag)
}
\arguments{
\item{dag}{An \code{ontology_DAG} object.}

\item{terms}{A vector of term names. If set, the returned vector is restricted to these terms.}

\item{uniquify}{Whether to count each annotated item only once per term. See \strong{Details}. It is recommended to always set it to \code{TRUE}.}

\item{use_cache}{Internally used.}
}
\value{
\code{n_annotations()} returns an integer vector.

\code{has_annotation()} returns a logical scalar.
}
\description{
Number of annotated items
}
\details{
Because of the DAG structure, a parent term inherits all annotated items of its child terms, and an ancestor term recursively
inherits all annotated items of its offspring terms. Existing tools implement this recursive merging in two different ways.

For a term \code{t}, let \code{S_1}, \code{S_2}, ... denote the sets of annotated items of its child terms 1, 2, ..., and let \code{S_t} denote the set
of items that are \strong{directly} annotated to \code{t}. The first method takes the union of the items annotated to \code{t} and to all its child terms:

\if{html}{\out{<div class="sourceCode">}}\preformatted{n = length(union(S_t, S_1, S_2, ...))
}\if{html}{\out{</div>}}

The second method takes the sum of the numbers of items annotated to \code{t} and to all its child terms:

\if{html}{\out{<div class="sourceCode">}}\preformatted{n = sum(length(S_t), length(S_1), length(S_2), ...)
}\if{html}{\out{</div>}}

In \code{n_annotations()}, when \code{uniquify = TRUE}, the first method is used; and when \code{uniquify = FALSE}, the second method is used.

For some annotation sources, an item can be annotated to multiple terms. In that case the second method, which simply
adds up the numbers of items of the child terms, may be inappropriate because an item can be counted more than once, which over-estimates \code{n}.
The two methods give identical results only if every item is annotated to a single term in the DAG.
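
As a minimal base-R sketch of the difference, assume three small, hypothetical item sets. The item names \code{g1} to \code{g4} and the variables \code{n_union} and \code{n_sum} are made up for illustration, and the code does not call any function of this package:

\if{html}{\out{<div class="sourceCode">}}\preformatted{# Hypothetical item sets, not taken from any real annotation source
S_t = c("g1", "g2")   # items directly annotated to term t
S_1 = c("g2", "g3")   # items annotated to child 1
S_2 = c("g3", "g4")   # items annotated to child 2

# First method: every distinct item is counted once
n_union = length(Reduce(union, list(S_t, S_1, S_2)))   # 4

# Second method: set sizes are added, so g2 and g3 are counted twice
n_sum = sum(lengths(list(S_t, S_1, S_2)))              # 6
}\if{html}{\out{</div>}}

Here the sum exceeds the union because \code{g2} and \code{g3} each belong to two of the sets.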

We recommend always setting \code{uniquify = TRUE} (the default); \code{uniquify = FALSE} is intended only for testing or benchmarking.
}
\examples{
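# Edges of the DAG: parents[i] -> children[i]
parents  = c("a", "a", "b", "b", "c", "d")
children = c("b", "c", "c", "d", "e", "f")
# Items directly annotated to each term
annotation = list(
    "a" = c("t1", "t2", "t3"),
    "b" = c("t3", "t4"),
    "c" = "t5",
    "d" = "t7",
    "e" = c("t4", "t5", "t6", "t7"),
    "f" = "t8"
)
dag = create_ontology_DAG(parents, children, annotation = annotation)
# Number of annotated items for every term
n_annotations(dag)
# Logical scalar returned for the same DAG
has_annotation(dag)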
}
