
Natural Language Processing – The Backbone of Our AI System

In the previous blog post, we briefly discussed the AI technologies that help our company design the best health products and natural medicines. We introduced graph neural networks, deep-learning protein modelling, and recommendation systems. However, we have yet to talk about the core technology that enables all of those systems to work properly: Natural Language Processing (NLP).

No, this is not the Neuro-linguistic Programming (NLP) you encounter in self-help books and seminars. The NLP in this article refers to the techniques a computer uses to understand human language, whether spoken or written.

NLP is a subset of artificial intelligence that allows computers to comprehend and interpret human language. It works as follows. An NLP system first pre-processes the data by “cleaning” the dataset. This generally means putting the data into a more regular form. One example is tokenization, which breaks text down into smaller semantic units, or “tokens.” Pre-processing simply makes the dataset easier for the NLP system to work with.
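The tokenization step described above can be sketched in a few lines of Python. This is a minimal illustration only: the regular expression, the lowercasing rule, and the example sentence are assumptions made for this post, not the tokenizer used in our production system.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word and punctuation tokens."""
    # A word is a run of letters/digits, optionally followed by an
    # apostrophe suffix (e.g. "don't"); any other non-space character
    # becomes its own token.
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?|[^\sa-z0-9]", text.lower())

# Illustrative sentence (made up for this example).
tokens = tokenize("Ginger may reduce nausea, but evidence varies.")
print(tokens)
```

Real systems usually rely on trained subword tokenizers rather than a hand-written pattern, but the idea of turning raw text into a list of units is the same.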

The system then applies algorithms to the text in order to interpret it. The two basic families of algorithms used in NLP are rule-based systems, which process text according to predetermined grammatical rules, and machine learning models, which employ statistical approaches. The latter can produce more advanced results, but still seems far from true “intelligence”.

One significant disadvantage of classical statistical approaches is that they need extensive feature engineering. As a result, since around 2015 the field has largely moved away from them in favour of neural networks. One popular strategy is to use word embeddings to capture the semantic properties of words. Another is end-to-end learning of a higher-level task (e.g., question answering) rather than depending on a pipeline of discrete intermediate tasks (e.g., part-of-speech tagging and dependency parsing).
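To make the idea of word embeddings more concrete, here is a small sketch that compares words by the cosine similarity of their vectors. The three-dimensional vectors are invented for illustration; real embeddings are learned from large text corpora and typically have hundreds of dimensions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy, hand-made embeddings (assumption: not learned from any real data).
embeddings = {
    "ginger":   [0.9, 0.1, 0.2],
    "turmeric": [0.8, 0.2, 0.3],
    "insomnia": [0.1, 0.9, 0.4],
}

herb_vs_herb = cosine(embeddings["ginger"], embeddings["turmeric"])
herb_vs_condition = cosine(embeddings["ginger"], embeddings["insomnia"])
print(round(herb_vs_herb, 3), round(herb_vs_condition, 3))
```

In this toy setup the two herbs end up much closer to each other than either is to the condition, which is the kind of semantic signal a downstream model can exploit.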

Although it is a significant scientific development at the intersection of computer science and linguistics, NLP is more prevalent in everyday life than you may think. When you talk to a virtual assistant like Siri or Alexa, or describe a customer service issue to a chatbot, you are using NLP.

Gene-Disease Mapping

At Herbalogi.AI, we use NLP primarily for natural medicine and drug discovery. We employ deep learning algorithms such as recurrent neural networks (RNNs) with long short-term memory (LSTM) units to create a gene-disease mapping.
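For readers curious about what an LSTM actually computes, the sketch below runs a single LSTM layer over a short sequence of word vectors using only the Python standard library. It is a teaching example under stated assumptions: the weights are random and untrained, the sizes are tiny, and the helper names (`lstm_step`, `init_params`, `encode_sequence`) are hypothetical. It is not our gene-disease model.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """Run one LSTM time step; return the new hidden and cell states."""
    z = x + h_prev  # concatenate input vector and previous hidden state

    def affine(name):
        W, b = params[name]
        return [sum(w * v for w, v in zip(row, z)) + b_k
                for row, b_k in zip(W, b)]

    i = [sigmoid(a) for a in affine("input")]        # input gate
    f = [sigmoid(a) for a in affine("forget")]       # forget gate
    o = [sigmoid(a) for a in affine("output")]       # output gate
    g = [math.tanh(a) for a in affine("candidate")]  # candidate cell values
    c = [f_k * c_k + i_k * g_k
         for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h, c

def init_params(input_size, hidden_size, seed=0):
    """Random, untrained weights (illustration only)."""
    rng = random.Random(seed)

    def layer():
        W = [[rng.uniform(-0.5, 0.5) for _ in range(input_size + hidden_size)]
             for _ in range(hidden_size)]
        return W, [0.0] * hidden_size

    return {name: layer()
            for name in ("input", "forget", "output", "candidate")}

def encode_sequence(sequence, params, hidden_size):
    """Feed the vectors in order and return the final hidden state."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
    return h

params = init_params(input_size=3, hidden_size=4)
sentence = [[0.9, 0.1, 0.2], [0.1, 0.9, 0.4]]  # two toy word vectors
summary = encode_sequence(sentence, params, hidden_size=4)
print(summary)
```

The final hidden state acts as a fixed-length summary of the sequence. In a trained model, a classifier on top of it could predict, for example, whether a sentence asserts a gene-disease association.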

Experimental approaches for detecting gene-disease connections, such as genome-wide association studies and linkage analysis, can be costly and time-consuming.

As a result, in recent years, researchers have turned to numerous in silico methodologies, including text mining, crowdsourcing, network-based, and semantic-similarity-based algorithms. Mining biomedical literature is critical for retrieving meaningful information from free-text data.
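As a rough illustration of literature text mining, the sketch below counts how often a gene name and a disease name appear in the same sentence. Sentence-level co-occurrence is one of the simplest baselines for this task and is used here only as an example. The gene and disease lists and the three sentences are assumptions written for this post, and a count of this kind is a hint for further analysis, not evidence of a real association.

```python
from collections import Counter
from itertools import product

# Small illustrative dictionaries (assumption: a real system would use
# curated vocabularies and a named-entity recogniser instead).
GENES = {"BRCA1", "TP53", "APOE"}
DISEASES = {"breast cancer", "alzheimer's disease"}

def cooccurrences(sentences):
    """Count (gene, disease) pairs mentioned in the same sentence."""
    counts = Counter()
    for sentence in sentences:
        lowered = sentence.lower()
        genes = sorted(g for g in GENES if g.lower() in lowered)
        diseases = sorted(d for d in DISEASES if d in lowered)
        counts.update(product(genes, diseases))
    return counts

# Example sentences written for this sketch, not quoted from any paper.
sentences = [
    "Mutations in BRCA1 are linked to breast cancer.",
    "BRCA1 and TP53 variants were studied in breast cancer patients.",
    "APOE is a risk gene for Alzheimer's disease.",
]

counts = cooccurrences(sentences)
for pair, n in counts.most_common():
    print(pair, n)
```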

Ligands-Target Interaction Prediction

In ligand-target interaction prediction, word embeddings learned from unlabelled biomedical literature are used to represent the chemical structure of the drug molecule and of the binding protein.

In this feature representation technique, raw data such as simplified molecular-input line-entry system (SMILES) strings for molecules, and amino-acid sequences for proteins, are vectorized. Models based on convolutional neural networks (CNNs) are a popular choice for feature representation.
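The vectorization of SMILES strings can be sketched as follows: split each string into tokens, assign every distinct token an integer id, and pad to a fixed length so the result can be fed to a neural network. The tokenizer pattern here is a simplification (it only special-cases bracket atoms, `Cl`, and `Br`), and the function names are hypothetical.

```python
import re

# Bracket atoms like [NH4+], the two-letter halogens, then any single char.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|.")

def tokenize_smiles(smiles):
    """Split a SMILES string into atom, bond, and ring tokens."""
    return SMILES_TOKEN.findall(smiles)

def build_vocab(token_lists):
    """Map each distinct token to an integer id; 0 is reserved for padding."""
    vocab = {"<pad>": 0}
    for tokens in token_lists:
        for token in tokens:
            vocab.setdefault(token, len(vocab))
    return vocab

def encode(tokens, vocab, length):
    """Convert tokens to ids, truncated or zero-padded to a fixed length."""
    ids = [vocab[t] for t in tokens][:length]
    return ids + [0] * (length - len(ids))

molecules = {
    "ethanol": "CCO",
    "aspirin": "CC(=O)OC1=CC=CC=C1C(=O)O",
    "chloroethane": "CCCl",
}
tokenized = {name: tokenize_smiles(s) for name, s in molecules.items()}
vocab = build_vocab(tokenized.values())
ethanol_ids = encode(tokenized["ethanol"], vocab, length=5)
print(ethanol_ids)
```

Protein sequences can be handled the same way, with one token per amino-acid letter. An embedding layer then turns each integer id into a dense vector.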

However, CNN-based models ignore the interactions between distinct atoms in the molecule. To tackle this difficulty, self-attention and transformer-based embedding models are used. Machine learning or deep learning models are then used to predict the binding affinity between the drug molecule and the target protein.
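The core of self-attention is small enough to write out by hand. The sketch below implements scaled dot-product self-attention in plain Python, in which every token (for instance, every atom token in a SMILES string) computes a weighted average over all tokens in the sequence. As a simplifying assumption, the queries, keys, and values are the input vectors themselves; a real transformer applies learned projection matrices first and uses several attention heads.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(X):
    """Scaled dot-product self-attention with Q = K = V = X.

    Returns the attended vectors and the attention weight matrix.
    """
    d = len(X[0])
    outputs, weights = [], []
    for query in X:
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(d)
                  for key in X]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(w_j * value[k] for w_j, value in zip(w, X))
                        for k in range(d)])
    return outputs, weights

# Three toy token vectors (made up for illustration).
X = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
attended, attn = self_attention(X)
for row in attn:
    print([round(w, 3) for w in row])
```

Each row of the weight matrix shows how strongly one token attends to every other token, which is how such models can capture interactions between atoms that are far apart in the string.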

These predictions take into account biological, topological, and physicochemical properties of the drugs and targets.

Natural Drug Repurposing

Drug repurposing is the process of identifying new therapeutic applications for existing medications. It can result in speedier medication discovery and approval, safer therapy, and lower healthcare costs.

Many drug repurposing studies rely on computational methodologies such as virtual screening, molecular docking, deep learning, and NLP. The drug-disease treatment pairs extracted from the literature using NLP can be used for drug repurposing in two ways. The extracted pairs can be used directly, or the similarity of a drug or disease to a candidate drug or disease can be used to hypothesise a new therapeutic indication for a particular medicine.
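The second, similarity-based route can be sketched as follows. Given known drug-disease treatment pairs and a set of features for each drug, a drug that is sufficiently similar to one with a known indication becomes a hypothesis for the same disease. Everything in this example is a placeholder: the drug, disease, and target names are hypothetical, Jaccard similarity over target sets is just one possible similarity measure, and the 0.5 threshold is arbitrary.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical treatment pairs, as if extracted from literature by NLP.
known_treats = {("drug_A", "disease_X"), ("drug_B", "disease_Y")}

# Hypothetical drug features, e.g. the protein targets each drug binds.
drug_features = {
    "drug_A": {"target_1", "target_2", "target_3"},
    "drug_B": {"target_4"},
    "drug_C": {"target_1", "target_2"},
}

def repurposing_candidates(known, features, threshold=0.5):
    """Propose (drug, disease, score) for drugs similar to a known treatment."""
    candidates = []
    for drug, disease in sorted(known):
        for other in sorted(features):
            if other == drug or (other, disease) in known:
                continue
            score = jaccard(features[drug], features[other])
            if score >= threshold:
                candidates.append((other, disease, round(score, 2)))
    return candidates

candidates = repurposing_candidates(known_treats, drug_features)
print(candidates)
```

A candidate produced this way is only a hypothesis to prioritise for further study, not a prediction of clinical benefit.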



© 2024 HERBALOGI SDN BHD (1485433-X) -  All Rights Reserved. Web by ZAVARI