© 2026 Artemisia Database | Developed by Ayat Taheri at Shanghai Jiao Tong University | Contact Us | GitHub
Visualize expression patterns across tissues
Explore genome annotations interactively
Map sequences to Artemisia genes
Gene function predictions and annotations
The Artemisia Database is a comprehensive platform dedicated to Artemisia annua, the primary source of artemisinin, a vital antimalarial compound. This resource integrates genomic, transcriptomic, and functional annotation data to support research in plant biology and specialized metabolism.
Navigate using the tabs above or click on any feature card to start exploring.
Genes Annotated
Metabolites Analyzed
RNA-seq Samples
Transcription Factors
Purpose: Interactive t-SNE visualization of sample relationships based on gene expression patterns.
Key Features:
How to use:
Interpreting the Visualization:
Data in the Table Below:
Purpose: Calculate and visualize median expression values (TPM) for custom gene lists across selected tissues.
Input Options:
Output:
Tip: Copy gene IDs from other tabs or use the example buttons to get started quickly.
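As a rough illustration of the calculation this tool describes, the sketch below takes the median TPM of each gene within each selected tissue. The sample IDs, TPM values, and the function name `median_tpm_by_tissue` are invented for illustration; the database's own implementation is not shown here and may differ.

```python
from collections import defaultdict
from statistics import median

def median_tpm_by_tissue(tpm, sample_tissue, genes, tissues):
    """Return {gene: {tissue: median TPM}} for the requested genes and tissues.

    tpm           -- {gene_id: {sample_id: TPM value}}
    sample_tissue -- {sample_id: tissue label}
    """
    result = {}
    for gene in genes:
        per_tissue = defaultdict(list)
        for sample, value in tpm.get(gene, {}).items():
            tissue = sample_tissue.get(sample)
            if tissue in tissues:
                per_tissue[tissue].append(value)
        result[gene] = {t: median(v) for t, v in per_tissue.items()}
    return result

# Hypothetical toy data (not taken from the database).
tpm = {"mikado.chr7G1274": {"S1": 10.0, "S2": 14.0, "S3": 2.0, "S4": 4.0, "S5": 6.0}}
sample_tissue = {"S1": "Leaf", "S2": "Leaf", "S3": "Flower", "S4": "Flower", "S5": "Flower"}

medians = median_tpm_by_tissue(tpm, sample_tissue, ["mikado.chr7G1274"], {"Leaf", "Flower"})
print(medians)
```

The median is less sensitive than the mean to a single outlier sample, which is one plausible reason to prefer it when samples come from many independent studies.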
Purpose: Specialized tool for exploring genes involved in artemisinin biosynthesis.
Key Features:
Step-by-Step Workflow:
Note: This tool focuses specifically on genes with known or predicted roles in artemisinin biosynthesis.
This tab provides detailed information about genes related to artemisinin, a vital compound used in the treatment of malaria. You can explore gene classifications, view expression data, and access sequences for selected genes.
Use the Main Table to filter and select genes of interest. The Expression tab allows you to analyze gene expression across various plant tissues, while the Sequence tab provides detailed transcript information.
Detailed information about the selected gene is displayed below.
Purpose: Explore genes classified by tissue-specific expression using Tau index.
Key Features:
Workflow for Tissue-Specific Genes:
Use this to identify genes specifically expressed in roots, stems, leaves, flowers, or petioles.
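For readers unfamiliar with the Tau index, here is a minimal sketch of the standard formula, Tau = sum(1 - x_i / x_max) / (n - 1), where x_i is the expression in tissue i and n is the number of tissues. Whether the database log-transforms expression first, and which Tau cutoff it uses to call a gene tissue-specific, is not stated here, so treat those as open assumptions. The function name and the example values are hypothetical.

```python
def tau_index(expression):
    """Tau tissue-specificity index for one gene.

    expression -- non-negative expression values, one per tissue.
    Returns 0.0 for uniform expression and 1.0 for expression in a single tissue.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("Tau needs at least two tissues")
    peak = max(expression)
    if peak == 0:
        return 0.0  # convention chosen in this sketch for genes with no expression
    return sum(1 - x / peak for x in expression) / (n - 1)

# Hypothetical values ordered as root, stem, leaf, flower, petiole.
print(tau_index([5, 5, 5, 5, 5]))    # uniform expression
print(tau_index([0, 0, 8, 0, 0]))    # expressed in one tissue only
print(tau_index([10, 5, 0, 5, 0]))   # intermediate case
```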
Explore genes by classification using the donut plot or search by gene IDs below.
Based on PlantTFDB classification of transcription factor families
Transcription factors identified by conserved protein domains
Identify TFs specifically expressed in roots, stems, leaves, flowers, or petioles
Discover functional relationships between gene expression and metabolite profiles
Examples: Artemisinin, Artemisinic acid, DihydroArtemisinic acid, Putrescine
Maximum 50 genes
Only gene-metabolite pairs with |r| ≥ threshold are shown
Processing may take a few seconds
Filtered correlations based on your selection. Click any row to view detailed trend analysis.
Heatmap showing correlation values. Red = positive correlation, Blue = negative correlation.
Compare metabolite abundance (bars) with gene expression (line) across the 4 experimental conditions.
To view trend analysis, first go to the Correlation Table tab and click on any row to select a gene-metabolite pair.
Tip: The most interesting pairs usually have strong correlations (|r| > 0.8)
Biological information for genes identified in correlation analysis
Three Query Types:
Quick Start:
💡 Tips:
sgRNA design using crisprDesign methodology
Key Features:
Follow these steps to design CRISPR guide RNAs:
Enter one gene ID per line from the Mikado annotation
Choose target region, nuclease, and quality filters
Click 'Design sgRNAs' to generate and score guide RNAs
mikado.chr1G1335
mikado.chr1G1016
mikado.chr2G2024
Check the 'Browse' tab for a complete list of available genes.
Make sure the genome and annotation are loaded (see status above).
If they show as 'not loaded' or 'failed', use the Force Load buttons.
Purpose: Explore genes involved in epigenetic regulation through DNA and RNA methylation, and histone modifications.
Three Methylation Types:
Two Ways to Explore:
Workflow:
Purpose: Visualize genomic data for Artemisia annua using JBrowse, based on the latest annotation and sequence files developed in this study.
Step-by-Step Instructions:
After opening JBrowse and loading tracks, zoom into a region to see gene features from the GFF3 file. Select a gene feature to open its details window, then click 'Show Feature Sequence' to view its sequence. From the dropdown menu, extract 1000 bp upstream and downstream, and adjust to 1500 bp using the wheel icon.
Navigation Tips:
Purpose: Map unknown sequences to our new Mikado gene IDs, or find corresponding genes between different annotation systems.
This database uses new gene IDs generated by the Mikado pipeline (e.g., mikado.chr1G1016). If you have:
Use BLAST to find the corresponding Mikado gene IDs for further analysis in this database.
Typical Workflow:
mikado.chrXGXXXX IDs
How to Use BLAST:
Look for the sseqid column in the results; it contains the Mikado gene IDs you can use throughout the database.
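If you save BLAST results locally, a small script can pull out the best Mikado hit for each query. The sketch below assumes the results are in BLAST's default 12-column tabular layout (`-outfmt 6`); the database's export may use a different layout, so check the column order first. The example rows and the function name `best_hits` are made up for illustration.

```python
import csv
import io

# Column order of BLAST's default tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def best_hits(tabular_text):
    """Return {query ID: subject ID}, keeping the highest-bitscore hit per query."""
    best = {}
    reader = csv.DictReader(io.StringIO(tabular_text), fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        score = float(row["bitscore"])
        if row["qseqid"] not in best or score > best[row["qseqid"]][1]:
            best[row["qseqid"]] = (row["sseqid"], score)
    return {query: hit for query, (hit, _) in best.items()}

# Hypothetical result rows; the numbers are invented.
example = (
    "My_Gene\tmikado.chr7G1274\t98.5\t600\t9\t0\t1\t600\t1\t600\t0.0\t1050\n"
    "My_Gene\tmikado.chr4G1337\t81.2\t420\t70\t3\t10\t430\t5\t424\t1e-90\t330\n"
)
print(best_hits(example))
```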
Purpose: Identify genes co-expressed with your query genes across Artemisia RNA-seq samples to discover functionally related genes and pathways.
Input & Parameters:
Analysis Workflow:
Results Tabs:
Use this to discover: gene regulatory networks, pathway members, functionally related genes, and potential transcription targets.
Purpose: Identify overrepresented biological processes, molecular functions, and cellular components in your gene list.
Input Options:
Analysis Steps:
Results Include:
Typical Uses:
Purpose: Get FASTA sequences for gene IDs.
Input: Enter gene IDs (one per line)
Limits: 1-5 = view & download, 6+ = download only
Output: FASTA file with transcript sequences
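The input and output conventions above can be sketched in a few lines. This is only an illustration of the stated rules (one ID per line, 1-5 IDs viewable, 6 or more download only, FASTA output); the function names are hypothetical and the sequence in the usage example is a placeholder, not a real transcript.

```python
def parse_gene_ids(text):
    """Split a one-ID-per-line text box into a de-duplicated list of IDs."""
    ids = []
    for line in text.splitlines():
        gene_id = line.strip()
        if gene_id and gene_id not in ids:
            ids.append(gene_id)
    return ids

def display_mode(n_ids):
    """Mirror the stated limits: 1-5 IDs can be viewed, 6 or more are download only."""
    if n_ids < 1:
        raise ValueError("enter at least one gene ID")
    return "view and download" if n_ids <= 5 else "download only"

def to_fasta(records, width=60):
    """Format {id: sequence} as FASTA text, wrapping sequence lines at `width`."""
    lines = []
    for seq_id, seq in records.items():
        lines.append(f">{seq_id}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"

ids = parse_gene_ids("mikado.chr1G1335\n\nmikado.chr1G1016\nmikado.chr1G1335\n")
print(ids, display_mode(len(ids)))
print(to_fasta({"mikado.chr1G1335": "ACGT" * 20}))  # placeholder sequence
```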
Enter gene IDs (one per line):
Welcome to the Artemisia Database, a comprehensive platform designed to facilitate research on Artemisia annua gene expression, transcription factors, functional annotations, and more. This tutorial will walk you through the database's key features, showing you how to navigate its interface and utilize its tools effectively. Whether you're a biologist, bioinformatician, or researcher, this guide will help you get started.
The Artemisia Database was built as part of the postdoctoral project of Dr. Ayat Taheri at the School of Agriculture and Biology, Shanghai Jiao Tong University.
This resource aims to provide valuable insights into Artemisia gene expression, specialized metabolism, and related biological processes.
Below is the comprehensive workflow used to develop the Artemisia Database, from data acquisition to interactive visualization:
We acknowledge the contributions of the researchers and organizations who generated the genome and RNA-sequencing data utilized in this study.
This work was supported by:
The computations in this research were run on the Siyuan-1 cluster supported by the Center for High-Performance Computing at Shanghai Jiao Tong University.
We acknowledge the invaluable support of Professor Kexuan Tang throughout this project.
We plan to regularly include additional RNA-seq datasets to expand the utility of the database.
This ongoing development aims to make the database a comprehensive resource for researchers studying Artemisia and its specialized metabolism.
Interested in collaborating? I'm open to research partnerships, data sharing, and joint publications.
Upon accessing the database, you will land on the Home tab, which serves as the central hub for the platform:
The Gene Expression menu contains four powerful tools for visualizing expression data.
The Global Expression Viewer allows you to visualize high-dimensional gene expression data through an interactive t-SNE plot. This tool is designed to help you identify expression clusters across different tissues and access the underlying metadata for each sample.
The Pubmed_ID column provides a direct hyperlink to the research paper on PubMed.
Figure 1: The t-SNE plot displays gene expression data colored by plant tissue. Interactive tools allow for zooming and point-specific identification.
Figure 2: The metadata table dynamically updates based on your plot selections, offering deep-dives into sample attributes and publication IDs.
This tool allows you to aggregate and visualize the median expression levels of specific genes across various plant tissues. It is particularly useful for comparing the expression profiles of gene families or sets of co-expressed genes.
Upload a .txt or .csv file containing your IDs.
To test the tool, click Example#1 to load IDs like mikado.chr7G1274 and mikado.chr4G1337. Select Leaf and Flower, then click Calculate. The resulting heatmap will provide a direct visual comparison of these genes across the two tissues.
Figure 3: Use the example buttons for a quick setup, then refine your analysis by selecting specific plant parts.
Figure 4: The interactive heatmap visualizes relative expression levels. Hover over cells to see exact log2 values.
This module provides a focused environment for exploring genes specifically linked to artemisinin biosynthesis. It integrates functional data, expression profiles, and sequence information through a centralized, multi-tab interface.
Navigate to Gene Expression > Artemisinin Pathway Genes. The “Main Table” tab will load by default, listing all relevant pathway genes.
Use the search panel to isolate specific genes of interest:
Once you have located your target genes, click on their rows in the Main Table to highlight them. This enables the action buttons at the bottom left:
.txt file for local use.

| Feature | Description |
|---|---|
| Main Table | Provides functional annotations and literature references. |
| Expression | Generates dynamic heatmaps based on tissue-specific data. |
| Sequence | Offers high-throughput access to FASTA-formatted sequences. |
This module provides a high-level classification of the Artemisia annua transcriptome. Using an interactive donut chart, you can explore gene categories (such as “Tissue-Specific” vs. “Broadly Expressed”) and drill down into the functional data and sequences associated with each group.
Figure 5: The interactive donut chart allows you to filter the entire dataset by clicking on specific expression categories, such as tissue-specific or constitutive genes.
This module is the fastest way to identify “specialist” genes. By clicking the Specific segment and then using the Check Gene Expression button, you can quickly verify which tissue (e.g., Trichome or Root) those genes are primarily active in.
The Transcription Factors menu provides specialized tools for identifying and analyzing regulatory proteins within the Artemisia annua genome, primarily categorized by the PlantTFDB framework.
This tool allows you to explore transcription factor (TF) families, visualize their distribution, and analyze their tissue-specific expression patterns.
To visualize the expression of bZIP transcription factors:
Figure 5: Interactive bar plot showing TF family distribution. Clicking “bZIP” filters the dataset automatically.
Figure 6: The Main Table updates based on your plot selection. Use the buttons at the bottom left to transition to expression analysis.
Figure 7: Generate and export high-resolution heatmaps for your selected transcription factors across various plant parts.
The Pfam module allows you to explore transcription factors through the lens of conserved protein domains. This approach is ideal for identifying regulatory genes that share specific structural motifs—such as Zinc fingers or Leucine zippers—even if they are not fully categorized into traditional plant TF families.
.txt file.
This module is designed to pinpoint transcription factors that exhibit localized activity. By focusing on genes with high expression in specific tissues (e.g., Leaf, Root, or Trichome), you can identify the primary regulators governing tissue-specific development and specialized metabolism.
To investigate stem-specific regulators, click the ERF TF cell in the Stem row of the overview heatmap. Select the resulting genes in the Main Table and click Check Gene Expression. By selecting Stem alongside Leaf and Root on the expression tab, you can generate a heatmap that shows whether the genes are preferentially active in stem tissue.
Figure 8: The interactive heatmap allows for rapid filtering. Clicking a cell instantly isolates the corresponding transcription factors in the data table below.
The Gene-Metabolite Correlation Analysis module is a powerful multi-omics integration tool. It allows you to discover functional relationships between gene expression (transcriptomics) and metabolite abundance (metabolomics) across different experimental conditions and tissues.
Use the sidebar on the left to set up your correlation parameters:
Once processed, the Correlation Table tab provides a searchable list of results:
Switch to the Correlation Matrix tab to see a high-level heatmap of the relationships:
The Trend Visualization tab offers a detailed look at how a selected pair behaves across four experimental conditions (WT Young Leaf, WT Mature Leaf, Mutant Young Leaf, and Mutant Mature Leaf):
The Gene Context tab provides a deep dive into the selected gene, showing its Pfam domains and a summary of other metabolites it may be correlated with.
The data is derived from four specific states to help identify developmental and mutational effects:
Figure 9: Use the sidebar to switch between Discovery and Hypothesis modes and set your correlation thresholds.
Figure 10: The Trend Analysis plot allows for direct comparison of metabolite intensity (bars) and gene expression (line) across wild-type and mutant conditions.
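The thresholding described in this section can be illustrated with a short sketch: compute a correlation coefficient for each gene-metabolite pair across the four conditions and keep pairs with |r| at or above the chosen threshold. Pearson correlation is assumed here because the interface reports r; the gene names, metabolite values, and function names are invented, and the database may compute its coefficients differently.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences; None if either is constant."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)

def correlated_pairs(gene_expr, metabolite_abund, threshold=0.8):
    """Return (gene, metabolite, r) tuples with |r| >= threshold, strongest first."""
    pairs = []
    for gene, expr in gene_expr.items():
        for metabolite, abund in metabolite_abund.items():
            r = pearson_r(expr, abund)
            if r is not None and abs(r) >= threshold:
                pairs.append((gene, metabolite, r))
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)

# Hypothetical values ordered as: WT Young Leaf, WT Mature Leaf,
# Mutant Young Leaf, Mutant Mature Leaf.
gene_expr = {"geneA": [1, 2, 3, 4], "geneB": [4, 3, 2, 1], "geneC": [1, 3, 2, 2]}
metabolite_abund = {"Artemisinin": [2, 4, 6, 8]}

pairs = correlated_pairs(gene_expr, metabolite_abund, threshold=0.8)
for gene, metabolite, r in pairs:
    print(gene, metabolite, round(r, 3))
```

With only four conditions, even a high |r| can arise by chance, so a strong correlation is better read as a lead for follow-up than as evidence of a functional link.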
The Functional Annotation module provides a centralized interface to retrieve biological context for Artemisia annua genes. By integrating multiple databases, you can identify gene functions, metabolic pathways, and protein domains through three flexible query methods.
Navigate: Go to the Functional Annotation section in the main menu.
Select Your Query Method:
Annotation Table: Choose a specific database from the dropdown menu (e.g., GO, KEGG, Pfam, or SwissProt).
GO:0003674).
Gene-Based Queries: Input specific Gene IDs (one per line) from the latest A. annua gene model.
Functional Descriptions: Perform a keyword search (e.g., “RNA polymerase” or “transferase”).
Execute Search: Click the Search button corresponding to your chosen method.
Explore Multi-Tab Results: The results are displayed in a dynamic table with tabs representing different databases (e.g., GO, Pfam, KEGG, EggNOG, etc.). This allows you to toggle between different layers of functional information for the same set of genes.
To find information for a specific gene:
Enter mikado.Super-Scaffold_100038G100 into the Gene-Based Queries field.
Figure 11: Searching by Gene ID generates a consolidated view. Use the tabs to switch between database-specific annotations like KEGG pathways or GO terms.
The CRISPR sgRNA Designer is a specialized tool for planning genome editing experiments in Artemisia annua. Using the crisprDesign methodology and a custom-built LQ9v1 BSgenome, this tool identifies optimal single-guide RNA (sgRNA) sequences for various CRISPR nucleases while considering coding sequences (CDS) and potential quality constraints.
Before starting, check the Status Check badges at the top of the page.
loaded (green).
In the Design Parameters sidebar:
mikado.chr1G1335) into the text area (one per line).
Click the Design sgRNAs button. Once processing is complete, explore the results across the following tabs:
| Column/Metric | Description |
|---|---|
| Spacer | The 20 bp (for Cas9) or 21 bp (for Cas12a) targeting sequence. |
| Composite Score | A normalized score (0–1) incorporating GC content, on-target affinity (CRISPRater), and sequence penalties. |
| Cut Site | The exact genomic coordinate where the double-strand break is predicted to occur. |
| Poly-T | Marked “Yes” if the guide contains a terminator sequence (avoid these for U6-driven vectors). |
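To make the table columns concrete, here is a deliberately simplified sketch of how candidate SpCas9 guides can be enumerated: it scans the forward strand only for a 20-nt spacer followed by an NGG PAM, then reports GC fraction, a poly-T flag, and a cut position. It does not reproduce crisprDesign, CRISPRater, or the Composite Score; the poly-T rule (a run of four Ts), the cut-site convention (3 bp upstream of the PAM), the function name, and the test sequence are assumptions made for illustration.

```python
import re

def find_cas9_guides(seq, spacer_len=20):
    """Scan the forward strand for spacers followed by an NGG PAM (SpCas9)."""
    seq = seq.upper()
    guides = []
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    for match in pattern.finditer(seq):
        spacer, pam = match.group(1), match.group(2)
        gc = (spacer.count("G") + spacer.count("C")) / spacer_len
        guides.append({
            "start": match.start(),       # 0-based position of the spacer
            "spacer": spacer,
            "pam": pam,
            "gc_fraction": round(gc, 2),
            "poly_t": "TTTT" in spacer,   # assumed rule: run of four Ts
            # 0-based index of the first base after the assumed cut,
            # 3 bp upstream of the PAM.
            "cut_site": match.start() + spacer_len - 3,
        })
    return guides

# Hypothetical target: one 20-nt spacer, a TGG PAM, then filler.
target = "GATTACAGATTACAGATTAC" + "TGG" + "AAAA"
guides = find_cas9_guides(target)
print(guides)
```

A real design also needs the reverse strand, off-target searches against the genome, and on-target scoring, which is what the tool's Composite Score is described as summarizing.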
Figure 12: Configure your CRISPR experiment by selecting the appropriate nuclease and target region (CDS or full gene).
Figure 13: The Quality Scores tab allows you to visualize and select the highest-performing guides based on composite modeling.
You can download your designs in multiple formats:
The Methylation module is designed to investigate the epigenetic landscape of Artemisia annua. It allows you to explore genes involved in three primary regulatory mechanisms: RNA methylation (m6A), DNA methylation (5-methylcytosine), and histone modification (Histone-H3).
You can navigate the epigenetic data using two distinct approaches:
The results are displayed in a comprehensive data table:
In the Expression tab:
Switch to the Sequence tab to view detailed transcript information:
| Category | Description | Sub-filters / Roles |
|---|---|---|
| m6A | N6-methyladenosine RNA methylation | Writers, Erasers, Readers |
| 5-methyl | DNA methylation enzymes | DNA_methylase, DNMT1-RFD, TP_methylase |
| Histone-H3 | Proteins modifying Histone H3 | PHD and SET domains |
Figure 14: Use the dynamic filters to isolate specific epigenetic regulators like m6A ‘Writers’ or Histone-H3 ‘SET’ domain proteins.
Figure 15: The Expression tab allows you to visualize if specific epigenetic regulators are tissue-specific (e.g., active primarily in the Root or Trichome-rich Leaf).
The JBrowse tool provides a high-performance, interactive environment for visualizing the Artemisia annua genome. Based on the latest genome assembly and Mikado annotations developed for this project, it allows you to explore gene structures, regulatory regions, and sequence features in their precise genomic context.
To extract a promoter sequence for a gene of interest:
Figure 16: Exploring genomic features in JBrowse. (a) Tracks loaded in the linear view. (b) Feature details window. (c) Sequence extraction tools. (d) Configuration of flanking regions.
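If you download the genome FASTA and GFF3 and prefer to extract promoters in a script, the strand logic is the part that is easy to get wrong: "upstream" lies before the start coordinate for a plus-strand gene and after the end coordinate for a minus-strand gene, where the sequence must also be reverse-complemented. The sketch below assumes 1-based inclusive GFF3 coordinates; the function names and the toy sequence are hypothetical, and this is not how JBrowse itself is implemented.

```python
def reverse_complement(seq):
    """Reverse-complement a DNA string (A/C/G/T/N, either case)."""
    return seq.translate(str.maketrans("ACGTNacgtn", "TGCANtgcan"))[::-1]

def upstream_region(chrom_seq, gene_start, gene_end, strand, flank=1500):
    """Return up to `flank` bp upstream of a gene, read 5'->3' on the gene's strand.

    gene_start, gene_end -- 1-based inclusive coordinates, as in GFF3.
    """
    if strand == "+":
        begin = max(0, gene_start - 1 - flank)
        return chrom_seq[begin:gene_start - 1]
    if strand == "-":
        end = min(len(chrom_seq), gene_end + flank)
        return reverse_complement(chrom_seq[gene_end:end])
    raise ValueError("strand must be '+' or '-'")

# Toy chromosome with a gene at positions 7-10.
chrom = "AACCGGTTACGTTGCA"
print(upstream_region(chrom, 7, 10, "+", flank=3))
print(upstream_region(chrom, 7, 10, "-", flank=3))
```

The `max` and `min` calls truncate the flank at the ends of the sequence, which matters for genes near scaffold boundaries.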
The BLAST module is essential for cross-referencing external data with our database. It allows you to perform sequence similarity searches to identify corresponding Artemisia-DB Gene IDs. This is particularly useful if you have a sequence from a different species or are working with IDs from older Artemisia annua genome versions and need to find their updated counterparts in our latest annotation.
>My_Gene\nATCG...) directly into the text area.
.fasta or .txt format.
If you have a sequence associated with a gene ID from a previous publication or an older version of the genome:
mikado.chr7G1274).
Figure 17: The BLAST interface facilitates the translation of external sequences into database-specific Gene IDs, enabling a full multi-omics deep dive.
The Co-expression Analysis tool identifies genes that exhibit similar expression patterns across various experimental conditions in Artemisia annua. This module is essential for discovering functional modules, potential regulatory networks, and genes that may be co-regulated within specific metabolic pathways, such as artemisinin biosynthesis.
Define Your Query:
mikado.chr1G1335) into the search box to find their co-expressed partners.
Filter the Dataset:
Set Statistical Thresholds:
Analyze Results: Click Run Co-expression to generate data across three interactive tabs:
If you want to find genes co-regulated with a key enzyme like ADS (amorpha-4,11-diene synthase):
Figure 18: Configure your network analysis by selecting correlation methods and specific tissue types to refine the biological context.
Figure 19: Visualize gene relationships through the Network Graph and interpret the biological significance of the co-expressed cluster via the GO Enrichment dot plot.
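As a sketch of what a co-expression query does, the example below ranks candidate genes by Spearman rank correlation with a query gene and keeps those at or above a minimum coefficient. Spearman is used here as one plausible choice among the correlation methods the tool offers; the gene names, expression values, threshold, and function names are all hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy) if sxx and syy else 0.0

def coexpressed_partners(query_expr, candidates, min_rho=0.8):
    """Return (gene, rho) for candidates with rho >= min_rho, strongest first."""
    hits = [(gene, spearman_rho(query_expr, expr)) for gene, expr in candidates.items()]
    return sorted([h for h in hits if h[1] >= min_rho], key=lambda h: h[1], reverse=True)

# Hypothetical expression across five samples.
query = [1, 5, 3, 9, 7]
candidates = {"G1": [2, 10, 6, 20, 14], "G2": [9, 1, 8, 2, 3]}
partners = coexpressed_partners(query, candidates)
print(partners)
```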
The GO Enrichment Analysis tool allows you to identify overrepresented biological themes within a list of genes. By comparing your target gene set against the entire Artemisia annua genome “universe,” the tool determines which Biological Processes (BP), Molecular Functions (MF), and Cellular Components (CC) are statistically significant.
Navigate to the GO Enrichment tab. You can provide your gene IDs in two ways:
Upload a .txt file containing your gene list (one ID per line).
Click the Run GO Enrichment button. The system will process your list using the Benjamini-Hochberg (FDR) correction method. You can monitor the progress via the Analysis Status box, which will report how many genes were successfully mapped and how many GO terms were identified.
An interactive Dot Plot (powered by Plotly) will appear once the analysis is complete:
Use the controls below the plot to clean up your results:
The Results Table at the bottom provides a deep dive into every enriched term:
| Feature | Description |
|---|---|
| Statistical Method | Hypergeometric testing via the enricher methodology. |
| P-value Adjustment | Benjamini-Hochberg (FDR) correction to minimize false positives. |
| Ontology Scope | Comprehensive coverage of BP, MF, and CC root nodes. |
| Dynamic Filtering | Real-time table and plot updates based on significance thresholds. |
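The two statistical steps named in the table can be written out compactly. The sketch below computes a one-sided hypergeometric p-value for a single GO term and applies the Benjamini-Hochberg adjustment to a list of p-values. It is a plain-Python illustration of the general method, not the code the database runs; the function names and example numbers are invented.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes in a list of n,
    drawn from a universe of N genes of which K carry the GO term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank when sorted ascending
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical example: 50 query genes, 8 of them annotated with a term
# that covers 200 of 20000 genes in the universe.
p = hypergeom_pvalue(k=8, n=50, K=200, N=20000)
print(p)
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

The running minimum in `benjamini_hochberg` keeps the adjusted values monotone, so a term with a smaller raw p-value never ends up with a larger adjusted one.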
If you have identified a cluster of 50 genes that are highly expressed in the Root:
Figure 20: The Dot Plot summarizes the top enriched terms. Use the camera icon in the top-right of the plot to save the visualization as a PNG.
Figure 21: The results table provides the “Gene_Count” and the specific “Genes” involved in each term, allowing for direct functional cross-referencing.
mikado.chr1G1813).
The Download menu provides a comprehensive repository for retrieving raw and processed Artemisia annua genomic and transcriptomic data. Whether you need expression matrices for specific tissues, bulk data for entire BioProjects, or core reference files (Genome and Annotation), this section facilitates high-throughput data access.
This module allows you to extract expression data tailored to specific plant organs across all integrated studies.
.tsv files.
Best for users interested in the original context of a specific study, this tool provides bulk expression matrices.
.tsv file with genes as rows and samples as columns.
Access the foundational reference files required for local bioinformatics pipelines.
| File Type | Extension | Recommended Use |
|---|---|---|
| TPM Matrix | .tsv | Comparing expression levels across different genes or tissues. |
| Count Matrix | .tsv | Input for statistical tools like DESeq2 or edgeR. |
| Reference | .gff3 / .fasta | Local mapping, BLAST searches, or phylogenomic analysis. |
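To show how the two matrix types relate, the sketch below converts raw counts for one sample into TPM: divide each count by the gene length in kilobases, then rescale so the values sum to one million. How the database derived its TPM values (for example, whether effective lengths were used) is not stated here, so this is the textbook definition rather than the project's pipeline. The gene names, counts, and lengths are invented.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts for one sample to TPM.

    counts     -- {gene_id: read count}
    lengths_bp -- {gene_id: transcript length in bp}
    """
    # Reads per kilobase: corrects for longer transcripts attracting more reads.
    rate = {g: counts[g] / (lengths_bp[g] / 1000) for g in counts}
    scale = sum(rate.values())
    if scale == 0:
        return {g: 0.0 for g in counts}
    # Rescale so that all genes in the sample sum to one million.
    return {g: r / scale * 1e6 for g, r in rate.items()}

# Hypothetical sample with two genes of equal count but different length.
tpm_values = counts_to_tpm({"geneA": 100, "geneB": 100}, {"geneA": 1000, "geneB": 2000})
print(tpm_values)
```

Because TPM is rescaled within each sample, it suits comparisons of relative expression, while count-based tools such as DESeq2 and edgeR expect the unnormalized counts.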
Figure 22: The Tissue Download interface provides a real-time summary of the data size and file list before you initiate the ZIP download.
Figure 23: Select an individual study from the project table to retrieve its specific expression matrix.