Tutorial: Exploring the Artemisia Database
Welcome to the Artemisia Database, a comprehensive platform designed to facilitate research on Artemisia annua gene expression, transcription factors, functional annotations, and more. This tutorial will walk you through the database’s key features, showing you how to navigate its interface and utilize its tools effectively. Whether you’re a biologist, bioinformatician, or researcher, this guide will help you get started.
Table of Contents
- Getting Started
- Home Page Overview
- Gene Expression
- Transcription Factors
- Functional Annotation
- Gene Editing and Epigenetics
- Tools
- Download
- About the Database
- Tips and Troubleshooting
Getting Started
Accessing the Database
- Open your web browser and navigate to the Artemisia Database URL (https://artemisia-db.com).
- The interface is built using Shiny, so it’s interactive and user-friendly.
Home Page Overview
Upon loading the database, you’ll land on the Home tab:
- Welcome Message: A brief introduction to the database and its purpose—exploring Artemisia annua gene expression data.
- Flowchart: A visual representation of how the database was developed, centered on the page for easy reference.
Gene Expression
The Gene Expression menu contains four powerful tools for visualizing expression data.
Global Expression Viewer
- Purpose: View a t-SNE plot of gene expression colored by plant body parts and explore associated metadata.
- Steps:
- Navigate to Gene Expression > Global Expression Viewer.
- Observe the interactive t-SNE plot (powered by Plotly) showing expression clusters.
- Scroll down to the “Filtered t-SNE Metadata” table to see detailed sample information.
- Example: Hover over points in the t-SNE plot to identify clusters of samples from specific plant parts, such as “Leaf” or “Root.” Click any point to filter the “Filtered t-SNE Metadata” table below to your selection; the table shows detailed metadata for the chosen sample. If a sample is linked to a published study, the Pubmed_ID column contains a hyperlink that opens the corresponding PubMed page.

Figure 1: The t-SNE plot displays gene expression data, with points colored by plant parts like Leaf and Root. Hover over or click points to interact with the data.

Figure 2: The Filtered t-SNE Metadata table updates based on your selection, showing sample details and a clickable Pubmed_ID link.
Median Expression Per Part
- Purpose: Calculate and visualize median gene expression across selected plant parts.
- Steps:
- Go to Gene Expression > Median Expression Per Part.
- Enter up to 100 gene IDs (one per line) in the text box or upload a .txt/.csv file. Example gene ID buttons are provided to auto-fill the text box.
- Select plant parts (e.g., “Root”, “Leaf”) using the checkboxes.
- Click Calculate Median Expression.
- View the resulting heatmap (log2-transformed data).
- Download the plot in PNG format by clicking the camera icon in the top-right corner of the plot.
- Example: Enter mikado.chr7G1274 and mikado.chr4G1337, select “Leaf” and “Flower”, then click Calculate Median Expression to see a heatmap comparing their expression.

Figure 3: Clicking an example gene ID button (e.g., “Example#1”) auto-fills the text area with gene IDs. Select different plant parts (e.g., Leaf, Flower) to analyze their median expression in the heatmap.

Figure 4: The heatmap displays the median expression (log2-transformed) of example gene IDs across selected plant parts, generated after clicking ‘Calculate Median Expression’.
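For readers who want to reproduce a similar summary offline, the following is a minimal Python sketch of a per-part median followed by a log2 transform. It is not the database's own code: the sample layout, the TPM values, and the pseudocount of 1 are all assumptions made for illustration, and the tutorial does not state whether the transform is applied before or after taking the median.

```python
import math
from statistics import median


def median_log2_expression(samples, genes, parts, pseudocount=1.0):
    """Return {gene: {part: log2(median TPM + pseudocount)}}.

    `samples` is a list of dicts with a 'part' label and a 'tpm' mapping
    (gene ID -> TPM). The pseudocount is an assumption to avoid log2(0).
    """
    result = {}
    for gene in genes:
        result[gene] = {}
        for part in parts:
            values = [
                s["tpm"][gene]
                for s in samples
                if s["part"] == part and gene in s["tpm"]
            ]
            if values:  # skip parts with no samples for this gene
                result[gene][part] = math.log2(median(values) + pseudocount)
    return result


# Made-up TPM values, for illustration only.
samples = [
    {"part": "Leaf", "tpm": {"mikado.chr7G1274": 3.0, "mikado.chr4G1337": 0.0}},
    {"part": "Leaf", "tpm": {"mikado.chr7G1274": 7.0, "mikado.chr4G1337": 1.0}},
    {"part": "Leaf", "tpm": {"mikado.chr7G1274": 15.0, "mikado.chr4G1337": 2.0}},
    {"part": "Flower", "tpm": {"mikado.chr7G1274": 1.0, "mikado.chr4G1337": 31.0}},
]
heatmap = median_log2_expression(
    samples, ["mikado.chr7G1274", "mikado.chr4G1337"], ["Leaf", "Flower"]
)
```

Each inner dictionary corresponds to one row of the heatmap, with one value per selected plant part.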
Artemisinin Pathway Genes
- Purpose: Focus on genes linked to artemisinin production in Artemisia annua, providing detailed information through a table with tabs for main data, expression, and sequences.
- Steps:
- Navigate to Gene Expression > Artemisinin Pathway Genes.
- View the “Main Table” tab, which lists artemisinin-related genes and their details.
- Search by gene names or gene IDs (one per line) in the search box to filter the table and view specific information:
- For gene IDs, enter the ID and press the Search by Gene ID button.
- For gene names, enter the name and press the Search by Gene Name button.
- Select one or more rows in the “Main Table” to reveal the Check Gene Expression and Copy Gene ID(s) buttons at the bottom left.
- Click Copy Gene ID(s) to copy the selected gene IDs to the clipboard for use in other analyses.
- Click Check Gene Expression to switch to the “Expression” tab, where you can select plant parts (e.g., “Leaf”, “Root”) and click Calculate Median Expression to generate a heatmap.
- Switch to the “Sequence” tab to view and download the sequences of the selected genes as a text file.
Overall Category Distribution
- Purpose: Provide an overall view of gene categories in Artemisia annua (e.g., specific, broad) using a donut chart, allowing users to explore classifications and related data.
- Steps:
- Navigate to Gene Expression > Overall Category Distribution.
- View the donut chart displaying gene categories (e.g., “Specific”).
- Click a segment (e.g., “Specific”) to filter the table below by that category.
- Alternatively, enter one or more gene IDs in the search box to find their categories and display their details in the table.
- Select one or more rows in the “Main Table” to reveal the Check Gene Expression and Copy Gene ID(s) buttons at the bottom left.
- Click Copy Gene ID(s) to copy the selected gene IDs to the clipboard for use in other analyses.
- Click Check Gene Expression to switch to the “Expression” tab, where you can select plant parts (e.g., “Leaf”, “Root”) and click Calculate Median Expression to generate a heatmap.
- Switch to the “Sequence” tab to view and download the sequences of the selected genes as a text file.
Transcription Factors
The Transcription Factors menu offers three sub-tabs for analyzing transcription factors (TFs).
PlantTFDB
- Purpose: Explore transcription factor (TF) families from the PlantTFDB database.
- Steps:
- Navigate to Transcription Factors > PlantTFDB.
- View the interactive TF family distribution plot.
- Click on any bar in the plot (e.g., “bZIP”) to filter the “Main Table” below based on your selection. You can also search the table by gene ID to find the TF family of specific genes.
- Switch to the “Details” box and explore:
- Main Table: A filterable table displaying TF data. Select one or more rows to reveal the Check Gene Expression and Copy Gene ID(s) buttons at the bottom left. Clicking the Check Gene Expression button automatically switches you to the “Expression” tab with the selected gene IDs pre-filled. Click Copy Gene ID(s) to copy the selected gene IDs to the clipboard for use in other analyses.
- Expression: On this tab, choose different plant parts (e.g., “Root”, “Leaf”), then click Calculate Median Expression to generate a heatmap of gene expression for the selected genes.
- Sequence: On this tab, view and download the transcript sequences for your selected genes as a text file.
- Example: Start by clicking the “bZIP” bar in the TF family distribution plot to filter the table for bZIP family TFs. In the “Main Table,” select rows for specific genes (e.g., two bZIP TFs), then click Check Gene Expression. On the “Expression” tab, select “Leaf” and “Stem,” click Calculate Median Expression, and view the resulting heatmap, which you can download in your preferred format.

Figure 5: The TF family distribution bar plot shows the number of TFs per family. Clicking the “bZIP” bar filters the table below to display only bZIP family transcription factors.

Figure 6: The Main Table is filtered to show bZIP family TFs after clicking the “bZIP” bar. Highlighted rows indicate selected genes, and the “Check Gene Expression” button appears at the bottom left, ready to switch to the Expression tab.

Figure 7: On the Expression tab, select plant parts (e.g., Leaf, Stem) for the chosen bZIP genes, then click “Calculate Median Expression” to display the heatmap. Use the “Download Plot” button to save it as PNG, JPEG, or PDF.
Pfam
- Purpose: Analyze transcription factors (TFs) based on Pfam domain annotations.
- Steps: Similar to PlantTFDB, but based on Pfam domains instead of PlantTFDB families. Navigate to Transcription Factors > Pfam, view the family distribution plot, filter the table by clicking bars, and explore the “Details” box (Main Table, Expression, Sequence).
Tissue-specific TFs
- Purpose: Investigate TFs specific to certain plant tissues, such as Leaf or Root.
- Steps:
- Navigate to Transcription Factors > Tissue-specific TFs.
- View the heatmap plot showing TF expression across tissues.
- Click a heatmap cell (e.g., CO-like TF in Leaf) to filter the “Main Table” in the “Details” box below.
- Explore the “Details” box:
- Main Table: Filterable table of tissue-specific TF data. Select rows to enable the Check Gene Expression button.
- Expression: Calculate and view a heatmap for selected genes across plant parts.
- Sequence: Access and download sequences for selected TFs.
- Example: In the “Tissue-specific TFs” heatmap, click the cell for the CO-like TF specific to Leaf tissue to filter the table below. Select the CO-like TF row in the “Main Table,” click Check Gene Expression, and on the “Expression” tab, generate a heatmap for “Leaf” and “Root” to compare expression patterns.

Figure 8: The heatmap shows tissue-specific TFs, with the CO-like TF cell for Leaf tissue clicked, filtering the table below to display only Leaf-specific CO-like TFs.
Functional Annotation
- Purpose: Query functional annotations of Artemisia annua genes using three distinct methods.
- Steps:
- Navigate to Functional Annotation.
- Choose one of three query options:
- Annotation Table: Select a database (e.g., “GO”) from the dropdown and enter annotation IDs in the text area. By default, example IDs are provided for each database (e.g., GO, Pfam, KEGG) to guide users—enter similar IDs matching the selected database’s format.
- Gene-Based Queries: Input any gene IDs of interest based on the latest gene model developed for this database. Results will appear in a table with multiple tabs, each representing a different annotation database (e.g., GO, Pfam, KEGG).
- Functional Descriptions: Search using keywords (e.g., “RNA polymerase”) in the text input field. Results are displayed in a table format similar to the gene-based query, with tabs for different annotation types.
- Click the corresponding Search button for your chosen query method.
- View the results in dynamically generated tabs below the query section.
- Example: For a gene-based query, enter a gene ID (e.g., mikado.Super-Scaffold_100038G100) in the “Gene-Based Queries” text area and click Search Gene IDs. The results appear in a table with multiple tabs, such as GO, Pfam, and KEGG, showing annotations linked to mikado.Super-Scaffold_100038G100.

Figure 9: After searching for a gene ID (e.g., GeneID1), the results appear in a table with multiple tabs (e.g., GO, Pfam, KEGG), each displaying different annotations for the gene.
Gene Editing and Epigenetics
CRISPR
- Purpose: Explore gene annotations and CRISPR-related sequences for Artemisia annua, enabling users to investigate gene editing potential.
- Steps:
- Navigate to Gene Editing and Epigenetics > CRISPR.
- View the table of CRISPR-related gene annotations.
- Enter one or more gene IDs in the search box to filter the table and check for associated CRISPR sequences.
- Select one or more rows in the filtered table to reveal the Check Gene Expression and Copy Gene ID(s) buttons at the bottom left.
- Click Copy Gene ID(s) to copy the selected gene IDs to the clipboard for use in other analyses.
- Click Check Gene Expression to switch to the “Expression” tab with the selected gene IDs pre-filled.
- On the “Expression” tab, choose plant parts (e.g., “Leaf”, “Root”), then click Calculate Median Expression to generate a heatmap.
- Switch to the “Sequence” tab to view and download the sequences of the selected genes as a text file.
Methylation
- Purpose: Analyze methylation data (e.g., N6-methyladenosine (m6A), 5-methyladenosine) for Artemisia annua genes to explore epigenetic modifications.
- Steps:
- Navigate to Gene Editing and Epigenetics > Methylation.
- Select a methylation type from the dropdown (e.g., “N6-methyladenosine (m6A)”).
- Choose a subfilter (e.g., “Writer”, “Reader”, “Eraser”) to refine the dataset.
- Enter one or more gene IDs in the search box to filter the table and view specific methylation-related annotations.
- View the table of methylation-related gene annotations.
- Select one or more rows in the filtered table to reveal the Check Gene Expression and Copy Gene ID(s) buttons at the bottom left.
- Click Copy Gene ID(s) to copy the selected gene IDs to the clipboard for use in other analyses.
- Click Check Gene Expression to switch to the “Expression” tab with the selected gene IDs pre-filled.
- On the “Expression” tab, choose plant parts (e.g., “Leaf”, “Root”), then click Calculate Median Expression to generate a heatmap.
- Switch to the “Sequence” tab to view and download the sequences of the selected genes as a text file.
Tools
JBrowse
- Purpose: Visualize genomic data for Artemisia annua using JBrowse, based on the latest annotation and sequence files developed in this study.
- Steps:
- Navigate to Tools > JBrowse.
- Click the Open button to access the genome browser.
- Click Open Track Selector in the sidebar to view available tracks.
- Select the “Artemisia Annotation” (GFF3) and “Genome Sequence” (FASTA) files to load them into the browser.
- Zoom in on a region of the genome to view gene features, such as exons, introns, or UTRs.
- Click on any feature (e.g., a gene or CDS) to open a new window with detailed information.
- In the feature window:
- Click Show Feature Sequence to display and copy the sequence of the selected feature.
- Use the dropdown menu to extract the sequence together with flanking sequence (default: 1000 bp upstream and 1000 bp downstream).
- Click the wheel icon next to the dropdown to modify the extraction range (e.g., change to 500 bp or 2000 bp).
- Example: After opening JBrowse and loading the Artemisia annotation and genome sequence tracks, zoom into a region to see gene features from the GFF3 file. Select a gene feature to open its details window, then click Show Feature Sequence to view its sequence. From the dropdown menu, choose to extract 1000 bp upstream and downstream, and adjust this to 1500 bp by clicking the wheel icon and modifying the range.

Figure 10: Exploring features in JBrowse.
(a) The JBrowse interface displays various gene features after loading the Artemisia annotation and genome sequence tracks and zooming in. (b) Clicking a feature opens a new window with detailed information. (c) Selecting “Show Feature Sequence” reveals the sequence, with a dropdown menu to extract 1000 bp upstream and downstream. (d) Clicking the wheel icon allows modification of the extraction range (e.g., to 1500 bp).
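To make the upstream/downstream extraction concrete, here is a small Python sketch of how the coordinates of a feature plus its flanks could be computed. This is an illustration under stated assumptions, not JBrowse's implementation: the function name `flanked_region` is hypothetical, coordinates are taken to be 1-based and inclusive as in GFF3, and the example positions are made up.

```python
def flanked_region(start, end, seq_length, upstream=1000, downstream=1000, strand="+"):
    """Return (start, end) of a feature extended by its flanks.

    Coordinates are 1-based and inclusive. On the minus strand, "upstream"
    lies at higher coordinates, so the two flank sizes swap sides. The result
    is clamped to the bounds of the sequence.
    """
    if strand == "-":
        left, right = downstream, upstream
    else:
        left, right = upstream, downstream
    return max(1, start - left), min(seq_length, end + right)


# Default 1000 bp flanks around a made-up feature at 5000..6000.
default_window = flanked_region(5000, 6000, seq_length=1_000_000)

# 1500 bp flanks near the start of a sequence are clamped at position 1.
clamped_window = flanked_region(500, 900, seq_length=10_000, upstream=1500, downstream=1500)
```

Swapping the flank sides for minus-strand features keeps "upstream" meaning "before the start of the gene in its own reading direction".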
BLAST
- Purpose: Perform sequence similarity searches to identify corresponding Artemisia annua gene IDs based on the latest genome annotation. This tool is ideal for users who have a gene sequence from another species or a gene ID from previous Artemisia annotation files and want to find its match in the database, unlocking access to all features (e.g., expression, annotations, sequences).
- Steps:
- Navigate to Tools > BLAST.
- Enter a FASTA-formatted sequence (e.g., from another species) or upload a file (<2MB) containing the query sequence.
- Adjust settings as needed:
- Minimum Identity: Set the similarity threshold (default: 80%).
- E-value: Set the statistical significance threshold (default: 1e-5).
- Click Run BLAST to search the sequence against the Artemisia annua genome and latest gene IDs.
- View the results table, which includes matching Artemisia gene IDs, and download it as a CSV file for further use.
- Example: Suppose you have a gene sequence from another species, or the sequence of a gene from an older Artemisia annotation. Enter it in the text box in FASTA format (e.g., >Seq1\nATCGATCG..., where \n marks a line break), set the E-value to 1e-5 (the default), and click Run BLAST. The results will list the corresponding Artemisia annua gene ID (e.g., GeneID1) from the latest annotation, which you can then use to explore expression, functional annotations, or sequences across the database.
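Before pasting or uploading a query, it can help to check that it is valid FASTA and under the size limit. The Python sketch below is one way to do that. It assumes a nucleotide query, reads "2MB" as 2 × 1024 × 1024 bytes, and uses a hypothetical helper name, `check_fasta_query`; none of these details come from the database itself.

```python
MAX_UPLOAD_BYTES = 2 * 1024 * 1024  # assumption: "2MB" read as 2 MiB
IUPAC_NUCLEOTIDES = set("ACGTUNRYKMSWBDHV")  # assumption: nucleotide queries only


def check_fasta_query(text, max_bytes=MAX_UPLOAD_BYTES):
    """Return a list of problems found in a FASTA query (empty list = looks fine)."""
    problems = []
    if len(text.encode("utf-8")) >= max_bytes:
        problems.append("query is not under the size limit")

    # Work on non-empty lines only; numbering below counts those lines.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return problems + ["query is empty"]
    if not lines[0].startswith(">"):
        problems.append("first line must be a header starting with '>'")

    has_sequence = False
    for number, line in enumerate(lines, start=1):
        if line.startswith(">"):
            continue  # header line, nothing to validate
        has_sequence = True
        unexpected = set(line.upper()) - IUPAC_NUCLEOTIDES
        if unexpected:
            chars = "".join(sorted(unexpected))
            problems.append(f"non-empty line {number}: unexpected characters {chars}")
    if not has_sequence:
        problems.append("no sequence lines found")
    return problems


# A toy query; a real one would carry the full sequence.
issues = check_fasta_query(">Seq1\nATCGATCG")
```

An empty list means the query passed these basic checks; it does not guarantee that the BLAST tool will accept it.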
Co-expression Analysis
- Purpose: Identify and analyze co-expressed genes in Artemisia annua across selected bioprojects and plant parts, with options for correlation methods and enrichment analysis.
- Steps:
- Navigate to Tools > Co-expression Analysis.
- Enter one or more gene IDs in the top-left search box to query co-expressed genes.
- Select a correlation method (“Pearson” or “Spearman”) from the dropdown menu.
- Choose one or more bioprojects from the “Filter by Bioproject” list, or select “ALL” to include all projects.
- Select plant parts (e.g., “Leaf”, “Root”) from the “Select Plant Part(s)” list, which updates based on the chosen bioprojects.
- Adjust the “FDR Rate Threshold” and “Minimum Absolute Correlation” (0 to 1) sliders to filter results.
- Click Run Co-expression to generate results, displayed in a three-tabbed table:
- Table: Lists co-expressed genes with gene ID, correlation value, and FDR.
- Network Graph: Visualizes the top 10 co-expressed genes based on user inputs.
- GO Enrichment: Displays a dot plot of enriched terms for co-expressed genes. Hover over the plot to reveal a camera icon for saving as a PNG. Below, a table provides details on enriched terms, filterable by “Root Node” and a “Max P.adjust” slider. Use the Download button to save the filtered table.
- View the metadata table on the right, which updates based on selected bioprojects and plant parts.
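The difference between the two correlation methods, and the effect of the minimum absolute correlation filter, can be sketched in a few lines of Python. This is a plain-Python illustration rather than the database's implementation: the helper names and expression values are made up, and the FDR step is left out because the tutorial does not describe how it is computed.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (inputs must not be constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def coexpressed(query, others, method=pearson, min_abs_corr=0.8):
    """Return (gene, correlation) pairs passing the threshold, strongest first."""
    hits = [(gene, method(query, values)) for gene, values in others.items()]
    hits = [hit for hit in hits if abs(hit[1]) >= min_abs_corr]
    return sorted(hits, key=lambda hit: -abs(hit[1]))


# Made-up expression values across five samples; gene names are hypothetical.
query = [1.0, 2.0, 3.0, 4.0, 5.0]
others = {
    "geneA": [2.0, 4.0, 6.0, 8.0, 10.0],  # rises linearly with the query
    "geneB": [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as the query rises
    "geneC": [3.0, 1.0, 4.0, 1.0, 5.0],   # only loosely related
}
hits = coexpressed(query, others, method=spearman, min_abs_corr=0.8)
```

Because the filter uses the absolute value, strongly anti-correlated genes such as the hypothetical geneB pass it as well as positively correlated ones.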
GO Enrichment
- Purpose: Perform Gene Ontology (GO) enrichment analysis for a user-defined list of Artemisia annua gene IDs to identify enriched biological terms.
- Steps:
- Navigate to Tools > GO Enrichment.
- Input a list of gene IDs (one per line) in the text area or upload a text file with one gene ID per line.
- Click Run GO Enrichment to generate results, displayed in two components:
- Dot Plot: Visualizes enriched GO terms. Hover over the plot to reveal a camera icon for saving as a PNG.
- Table: Lists enriched terms with details, filterable by “Root Node” and a “Max P.adjust” slider. Use the Download button to save the filtered table.
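Several tools in this tutorial expect gene IDs one per line. The short Python sketch below shows one way to tidy a pasted list before entering or uploading it; the helper names are hypothetical, and removing duplicates is a convenience assumed here, not a documented requirement of the database.

```python
def clean_gene_ids(raw_text):
    """Strip whitespace, drop blank lines, and remove duplicates, keeping order."""
    seen = set()
    ids = []
    for line in raw_text.splitlines():
        gene_id = line.strip()
        if gene_id and gene_id not in seen:
            seen.add(gene_id)
            ids.append(gene_id)
    return ids


def to_upload_text(ids):
    """Join gene IDs into one-ID-per-line text, ready to paste or save as .txt."""
    return "\n".join(ids) + "\n"


raw = " mikado.chr7G1274 \n\nmikado.chr4G1337\nmikado.chr7G1274\n"
gene_ids = clean_gene_ids(raw)
upload_text = to_upload_text(gene_ids)
```

The resulting text can be pasted into the input area or written to a plain text file for upload.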
Get Gene Sequences
- Purpose: Retrieve and download gene sequences for specified Artemisia annua gene IDs.
- Steps:
- Navigate to Tools > Get Gene Sequences.
- Enter gene IDs (one per line) in the text area (e.g., mikado.chr1G1813).
- If ≤5 gene IDs are entered, view the sequences in the output area below the text input.
- Click Download Sequences to save the sequences as a text file. Note: For >5 gene IDs, sequences are not displayed on-screen but can still be downloaded.
Download
- Purpose: Access raw data files from the Artemisia annua database, including gene expression matrices, project-specific data, and the latest annotation and transcriptome assembly files created in this study.
- Options:
- Download by Body Part
- Purpose: Download gene expression matrices generated in this research, organized by plant body parts.
- Steps:
- Navigate to Download > Download by Body Part.
- Select at least one body part (e.g., “Leaf”, “Root”) from the available options.
- Choose the expression data type:
- TPM: Transcripts Per Million.
- Bias-corrected Counts: Normalized count data.
- Select the expression level:
- Gene Level: Expression aggregated at the gene level.
- Transcript Level: Expression at the transcript level.
- Click Search to display a summary of the selected files.
- Download the resulting CSV file(s) containing the expression matrices.
- Download by Project
- Purpose: Download data specific to individual BioProjects associated with this study.
- Steps:
- Navigate to Download > Download by Project.
- View the table listing available BioProjects and their details.
- Select one or multiple rows corresponding to the desired projects.
- Click Download Selected Project Data on the right side to download the data for the selected projects.
- Latest Annotation
- Purpose: Download the latest genome annotation file (GFF3) created in this study.
- Steps:
- Navigate to Download > Latest Annotation.
- Click the download link to retrieve the GFF3 file containing the latest Artemisia annua annotations.
- Latest Transcriptome Assembly
- Purpose: Download the latest transcriptome assembly file (FASTA) created in this study.
- Steps:
- Navigate to Download > Latest Transcriptome Assembly.
- Click the download link to retrieve the FASTA file containing the latest Artemisia annua transcriptome assembly.
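Once the GFF3 annotation is downloaded, gene records can be listed with a few lines of Python. The sketch below relies only on the standard GFF3 layout (nine tab-separated columns with `key=value` attributes in the last one). The example line, its coordinates, and the assumption that genes are typed `gene` and carry an `ID` attribute are illustrative and may not match the actual file.

```python
def parse_gff3_genes(lines):
    """Return a list of dicts describing 'gene' features in GFF3 lines."""
    genes = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "gene":
            continue  # keep only well-formed gene records
        attributes = dict(
            pair.split("=", 1) for pair in cols[8].split(";") if "=" in pair
        )
        genes.append(
            {
                "id": attributes.get("ID"),
                "seqid": cols[0],
                "start": int(cols[3]),  # 1-based, inclusive
                "end": int(cols[4]),
                "strand": cols[6],
            }
        )
    return genes


# Made-up example lines; coordinates and source column are illustrative only.
example = [
    "##gff-version 3",
    "chr1\tmikado\tgene\t1000\t5000\t.\t+\t.\tID=mikado.chr1G1813;Name=example",
    "chr1\tmikado\tmRNA\t1000\t5000\t.\t+\t.\tID=mikado.chr1G1813.1;Parent=mikado.chr1G1813",
]
genes = parse_gff3_genes(example)
```

For a real file, pass an open file handle instead of the example list, since iterating over a handle yields one line at a time.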
About the Database
Learn about the project, team, and future plans.
Tips and Troubleshooting
This tutorial provides a foundation for users to explore the Artemisia Database. For advanced usage or specific research questions, refer to the source code or contact the developer. Happy exploring!