◆ In the following descriptions, the letter following each “Task” (e.g.,
Task A) corresponds to the codes (A–Q) listed under “Outsourcing task(s)” in the
Contact Us form.
At WGI, we have developed a proprietary digital filtering analysis technology for
gene expression matrices.
In addition, we accelerate the accurate identification of trait-associated genes through
multifaceted analyses using WGI’s proprietary high-quality omics data, including gene functions, regulatory mechanisms, and gene families.
◆Issues with the Lack of High-Quality Biological Function Information for Genes
Biological function information of genes is widely used to identify trait-associated genes from candidate groups of differentially expressed genes (DEGs).
However, conventional methods for predicting gene function have the following limitations:
- To predict gene function, researchers often rely on sequence similarity searches against gene/protein databases and their associated annotations.
- However, genes or proteins with similar DNA or amino acid sequences do not necessarily share the same biological functions.
- Conventional similarity search tools do not verify the conservation of protein functional domains within the sequence, making functional equivalence uncertain.
- Annotations of genes and proteins registered in databases are not always biologically accurate.
- General tools such as Gene Ontology and KEGG pathway predictions share these issues, so the reliability of predicted functions remains questionable.
As a result, the predicted functional information of genes is often low in quality and reliability, making it difficult to determine which candidate genes should be prioritized for wet-lab validation.
◆Issues with Enrichment Analysis Using Functional Annotations
Enrichment analysis of functional annotations is commonly used to infer the global functions of DEG groups.
This approach statistically compares the frequency of functional annotations between candidate genes and a background group.
However, enrichment analysis has limitations both in the prediction method for functional annotations and in statistical approaches.
- Gene Ontology and KEGG pathways are primarily used as functional annotations in enrichment analyses.
- Since the functional annotations predicted for each candidate gene are unreliable, the accuracy of frequency distributions for the candidate gene group is also questionable.
- The number of annotations (zero or more) assigned to each gene varies significantly across genes.
- Traits or genes that have been extensively studied tend to have more annotations, causing bias in the annotation count across traits and genes.
Therefore, although frequency distributions of Gene Ontology or KEGG pathway annotations in DEG groups may serve as a reference, performing enrichment analysis on them and deriving biological interpretations from the results is generally inappropriate.
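For reference, the frequency comparison performed by conventional enrichment analysis can be sketched as a one-sided hypergeometric (over-representation) test. This is only a minimal illustration of the conventional approach discussed above, not WGI's method. The function name and all gene counts are hypothetical, and the test assumes every annotation is correct and every gene is equally likely to be annotated, which are exactly the assumptions questioned above.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_annotated, n_candidates, n_overlap):
    """One-sided p-value P(X >= n_overlap) under a hypergeometric model.

    n_background: number of genes in the background group
    n_annotated:  background genes carrying the annotation of interest
    n_candidates: number of candidate genes (DEGs), drawn from the background
    n_overlap:    candidate genes carrying the annotation
    """
    upper = min(n_annotated, n_candidates)
    total = comb(n_background, n_candidates)
    # Sum the probability mass for overlaps at least as large as observed.
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_candidates - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only.
p_value = hypergeom_enrichment_p(
    n_background=20000, n_annotated=200, n_candidates=100, n_overlap=5
)
print(f"one-sided enrichment p-value: {p_value:.3g}")
```

The p-value depends entirely on the four counts, so any error or bias in the annotation counts carries straight through to the result.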
◆Issues with Background Gene Sets in Enrichment Analysis
In enrichment analysis, the frequency distribution of functional annotations is compared between a candidate gene group (DEGs) and a reference gene group (background).
The definition and use of this background group are problematic.
- The background group is often defined as either all genes in the genome or a randomly sampled set of genes equal in number to the candidate genes. However, both approaches are biologically and statistically inappropriate.
- The genome contains a diverse set of genes, including genes whose expression has not been confirmed and constitutively expressed genes.
- It is impossible to verify whether the selected background gene group is biologically or statistically valid.
Although concerns have been raised about the background definition in conventional enrichment analysis, this issue remains unresolved.
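To make the background issue concrete, the sketch below applies the same one-sided hypergeometric test to one candidate gene group under two different background definitions. All names and counts are hypothetical and chosen only for illustration. The sketch shows that the result shifts with the background; it does not show which background, if any, is valid.

```python
from math import comb


def enrichment_p(n_background, n_annotated, n_candidates, n_overlap):
    """One-sided hypergeometric p-value P(X >= n_overlap)."""
    upper = min(n_annotated, n_candidates)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_candidates - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / comb(n_background, n_candidates)


# Hypothetical candidate group: 100 DEGs, 5 of which carry the annotation.
N_CANDIDATES, N_OVERLAP = 100, 5

# Hypothetical backgrounds: (background size, annotated genes in background).
backgrounds = {
    "all genes in genome": (20000, 200),
    "expressed genes only": (8000, 150),
}

p_values = {
    name: enrichment_p(n_bg, n_ann, N_CANDIDATES, N_OVERLAP)
    for name, (n_bg, n_ann) in backgrounds.items()
}

for name, p in p_values.items():
    print(f"{name}: p = {p:.3g}")
```

With these counts the annotation looks considerably more enriched against the whole-genome background than against the expressed-gene background, even though the candidate group is unchanged.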