◆ In the following descriptions, the letter following each “Task” (e.g., Task A) corresponds to the codes (A–Q) listed under “Outsourcing task(s)” in the Contact Us form.
Notes on Automatically Collected and Analyzed Information in This Contracted Service
In A2K analysis, it can be difficult to comprehensively aggregate gene information when using gene names as key terms.
LA2K technology incorporates an AI-driven named-entity recognition (NER) method that automatically identifies gene names in text.
LA2K technology collects documents related to the key term(s) and extracts sentences containing the key term(s) and/or gene name(s). In this process, the gene names that appear in the document are automatically recognized by the LA2K technology. The relevant information from these sentences is then summarized and presented in the A2K Description format.
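The sentence-extraction step described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the LA2K implementation: the function names (`split_sentences`, `toy_gene_recognizer`, `extract_sentences`) are hypothetical, the sentence splitter is deliberately naive, and the AI-driven NER step is replaced by a simple lookup in a small fixed list of gene symbols so that the example stays self-contained.

```python
import re
from typing import Callable, Dict, List, Set


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def toy_gene_recognizer(sentence: str) -> Set[str]:
    """Hypothetical stand-in for the NER step.

    A real NER model would recognize gene names it has never seen;
    this placeholder only matches a small fixed list of symbols.
    """
    known_symbols = {"FT", "CO", "FLC"}  # assumption: example symbols only
    tokens = re.findall(r"[A-Za-z0-9]+", sentence)
    return {t for t in tokens if t in known_symbols}


def extract_sentences(
    documents: List[str],
    key_terms: List[str],
    recognize_genes: Callable[[str], Set[str]] = toy_gene_recognizer,
) -> List[Dict[str, object]]:
    """Keep sentences that contain a key term and/or a recognized gene name."""
    hits: List[Dict[str, object]] = []
    for doc_id, text in enumerate(documents):
        for sentence in split_sentences(text):
            lowered = sentence.lower()
            matched_terms = [t for t in key_terms if t.lower() in lowered]
            genes = sorted(recognize_genes(sentence))
            if matched_terms or genes:
                hits.append(
                    {
                        "doc": doc_id,
                        "sentence": sentence,
                        "key_terms": matched_terms,
                        "genes": genes,
                    }
                )
    return hits
```

With a made-up input document such as "Flowering time is regulated by day length. CO activates FT in long days. The weather was mild." and the key term "flowering time", the sketch keeps the first sentence because it contains the key term and the second because it contains recognized gene symbols, and it discards the third. Summarizing the retained sentences into the A2K Description format is outside the scope of this sketch.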
An example of analysis using LA2K technology is presented in the following figure. LA2K technology extracts key information related not only to the key term(s) but also to gene(s). In this example, the A2K Descriptions of the gene names automatically recognized in the document are shown.
In this way, LA2K technology allows the computer to recognize gene names in text automatically and accurately, much as a researcher would during manual review. Owing to the powerful text analysis capability of LA2K technology, key information related to genes can be extracted from documents even when neither the computer nor the user knows the gene names in advance.
Integrating LA2K analysis results with omics data maximizes the information available for elucidating the molecular mechanisms of various biological processes. Omics data that are valuable for this integration include information on transcription factors and cis-elements, gene expression data, and homolog information (gene families) within and across species. Such integration of multifaceted information facilitates the achievement of research goals, including gene identification.