Efficient bioinformatics workflow for de novo transcriptome assembly of Pelargonium zonale

Dejana Milić1*, Ana Pantelić1, Jelena Samardžić1, Bojana Banović Đeri1, Marija Vidović1

1University of Belgrade, Institute of Molecular Genetics and Genetic Engineering, Laboratory for Plant Molecular Biology, Vojvode Stepe 444a, Belgrade, Serbia

dmilic [at] imgge.bg.ac.rs

Abstract

Variegated Pelargonium zonale is a widely cultivated ornamental plant characterized by green, photosynthetically active tissue (GL) and white, non-photosynthetic tissue (WL). The aim of this study was to investigate the transcriptomic differences between these two tissue types.

We performed RNA-seq analysis of GL and WL on the Illumina HiSeq 2500 platform. Raw reads were processed using in-house scripts to remove low-quality reads, adapter sequences, poly-N sequences, and contaminants. High-quality clean reads were subjected to de novo transcriptome assembly using Trinity (min_kmer_cov = 2, min_glue = 2). Redundancy was removed, and the longest transcript in each cluster was selected as a unigene.
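The unigene selection step can be illustrated with a minimal Python sketch. It is not the authors' in-house code: the function name `select_unigenes`, the input layout (transcript ID mapped to a cluster ID and a sequence), and the example identifiers are all hypothetical, and the cluster assignment is assumed to have been produced beforehand by whatever redundancy-removal tool was used.

```python
from typing import Dict, Tuple


def select_unigenes(transcripts: Dict[str, Tuple[str, str]]) -> Dict[str, str]:
    """Return {cluster_id: transcript_id}, keeping the longest transcript per cluster.

    `transcripts` maps a transcript ID to (cluster ID, nucleotide sequence).
    On a length tie, the transcript encountered first is kept.
    """
    best: Dict[str, str] = {}
    for transcript_id, (cluster_id, sequence) in transcripts.items():
        current = best.get(cluster_id)
        if current is None or len(sequence) > len(transcripts[current][1]):
            best[cluster_id] = transcript_id
    return best


# Hypothetical toy input: two clusters, three assembled transcripts.
example = {
    "t1": ("cluster_A", "ATGCATGC"),
    "t2": ("cluster_A", "ATGCATGCATGC"),
    "t3": ("cluster_B", "ATG"),
}
unigenes = select_unigenes(example)
```

Here `unigenes` holds one representative transcript per cluster, which is the form in which downstream expression estimation would typically consume the assembly.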

Gene expression levels were estimated using RSEM by mapping the clean reads back to the assembled transcriptome (Bowtie2 with mismatch = 0). Differential expression analysis between GL and WL (three biological replicates each) was performed with the DESeq2 R package, with p values adjusted according to Benjamini and Hochberg to control the false discovery rate. Genes with abs(log2 FC) ≥ 2 and adjusted p value < 0.05 were considered significantly differentially expressed. Functional enrichment analysis was performed using the GOseq R package and KOBAS software (corrected p < 0.05).
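The significance criteria can be sketched in a few lines of Python. This is only an illustration of the Benjamini-Hochberg adjustment and the two thresholds stated above, not a reimplementation of DESeq2 (which also performs normalization, dispersion estimation, and independent filtering). The function names and the toy values are hypothetical.

```python
from typing import List


def bh_adjust(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity and a cap at 1.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def significant(log2fc: List[float], pvalues: List[float],
                lfc_cut: float = 2.0, alpha: float = 0.05) -> List[bool]:
    """Flag genes with abs(log2 FC) >= lfc_cut and adjusted p value < alpha."""
    padj = bh_adjust(pvalues)
    return [abs(fc) >= lfc_cut and q < alpha for fc, q in zip(log2fc, padj)]


# Hypothetical toy values for four genes.
log2fc = [3.1, -2.5, 0.4, 2.0]
pvals = [0.01, 0.04, 0.03, 0.005]
flags = significant(log2fc, pvals)
```

In this toy input the third gene fails the fold-change threshold even though its adjusted p value is below 0.05, which shows why both criteria are applied together.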

We annotated 85,374 unigenes (61.17%), providing a valuable resource for future functional genomics studies. Of the 8896 gene clusters that were significantly differentially expressed between the green and white leaf tissues (adjusted p value < 0.05 and abs(log2 FC) ≥ 2), 5585 were upregulated in WL and 3311 were upregulated in GL. These findings shed light on the transcriptomic differences between the two leaf tissue types in P. zonale and provide a foundation for further research on the functional significance of these differences. This study also demonstrates the utility of the Trinity pipeline for de novo transcriptomic analysis of organisms whose genomes have not yet been sequenced.

Keywords: de novo transcriptomic assembly, variegated plants, Pelargonium zonale, Trinity software

Acknowledgements: This work was funded by the Ministry of Science, Technological Development and Innovation of the Republic of Serbia (Contract No. 451-03-47/2023-01/200042) and a bilateral project (No. 451-03-01963/2017-09/09).
