Add Intronic Region Annotations to Genomic Data

This function generates intronic regions for each transcript based on its associated exons. It assumes that input data includes properly assigned exons and their transcript IDs. The output consists of intronic regions suitable for further analysis or export in GTF/GFF format.

add_introns(input)

Arguments

input: A data frame containing genomic data. The data frame should have the following columns:
- `chr`: Chromosome identifier
- `start`: Start position of the annotation
- `end`: End position of the annotation
- `strand`: Strand information ('+' or '-')
- `annotationType`: Type of annotation (e.g., 'EXON', 'CDS')
- `gene_name`: Name of the associated gene
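As an illustration of the expected shape, the following sketch builds a small data frame with the columns listed above and runs a few sanity checks before it would be passed to `add_introns()`. The coordinates, the gene name, and the `required_cols` helper vector are made up for this example, and the checks are only suggestions; the function itself may validate its input differently.

```r
# Columns named in the argument description above
required_cols <- c("chr", "start", "end", "strand", "annotationType", "gene_name")

# Toy input: two exons of one hypothetical gene on the plus strand
input <- data.frame(
  chr = c("chr1", "chr1"),
  start = c(100L, 300L),
  end = c(200L, 400L),
  strand = c("+", "+"),
  annotationType = c("EXON", "EXON"),
  gene_name = c("GENE1", "GENE1")
)

# Suggested pre-flight checks (not part of the package API)
missing_cols <- setdiff(required_cols, names(input))
stopifnot(
  length(missing_cols) == 0,           # every required column is present
  all(input$strand %in% c("+", "-")),  # strand uses the documented symbols
  all(input$start <= input$end)        # coordinates are ordered
)
```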

Value

A data frame with the original input data and additional rows for the introns. Each added row includes the following fields:
- `source`: "JBIO-predicted" for newly added annotations
- `annotationType`: Indicates 'intron'
- `start` and `end`: Updated start and end positions for the intron
- `strand`: Strand information copied from the input data
- Other fields as present in the input data

Details

The function iterates over unique chromosomes and strand orientations, calculating intron positions for each gene.
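The idea can be sketched in base R as follows. This is not the package's implementation: `sketch_introns()` is a hypothetical helper written for illustration. It assumes 1-based inclusive coordinates (as in GTF), character rather than factor columns, and that an intron is the gap between consecutive exons of the same gene on the same chromosome and strand.

```r
# Hypothetical helper, NOT the package's add_introns(): it only illustrates
# deriving introns as the gaps between consecutive exons of a gene.
sketch_introns <- function(input) {
  # Make sure a `source` column exists so intron rows can be labelled
  if (!"source" %in% names(input)) input$source <- NA_character_

  exons <- input[toupper(input$annotationType) == "EXON", , drop = FALSE]
  if (nrow(exons) == 0) return(input)

  # One group per chromosome, strand, and gene
  key <- paste(exons$chr, exons$strand, exons$gene_name, sep = "|")
  groups <- unname(split(exons, key))

  introns <- lapply(groups, function(g) {
    g <- g[order(g$start, g$end), , drop = FALSE]
    n <- nrow(g)
    if (n < 2) return(NULL)
    # Running maximum of exon ends, so overlapping exons give no false gaps
    covered_end <- cummax(g$end)[-n]
    next_start <- g$start[-1]
    has_gap <- next_start > covered_end + 1
    if (!any(has_gap)) return(NULL)
    # Reuse the upstream exon row as a template for the other fields
    out <- g[-n, , drop = FALSE][has_gap, , drop = FALSE]
    out$start <- covered_end[has_gap] + 1
    out$end <- next_start[has_gap] - 1
    out$annotationType <- "intron"
    out$source <- "JBIO-predicted"
    out
  })
  introns <- Filter(Negate(is.null), introns)
  if (length(introns) == 0) return(input)

  # Original rows first, then the derived intron rows
  result <- rbind(input, do.call(rbind, introns))
  row.names(result) <- NULL
  result
}

# Toy data: three exons of one made-up gene
toy <- data.frame(
  chr = "chr1",
  start = c(100, 300, 600),
  end = c(200, 400, 700),
  strand = "+",
  annotationType = "EXON",
  gene_name = "GENE1"
)
toy_result <- sketch_introns(toy)
toy_introns <- toy_result[toy_result$annotationType == "intron", ]
```

With the toy data, the two gaps between the three exons become two intron rows (201 to 299 and 401 to 599), appended after the original rows. Grouping by gene name alone merges all transcripts of a gene; a per-transcript variant would add a transcript identifier to the grouping key.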

Examples


# Run the function (the package that provides add_introns() must be attached,
# and `input` must be a data frame as described under Arguments)
output_data <- add_introns(input)