Unsupervised and Supervised NLP Approach

Natural Language Processing (NLP) is a branch of Artificial Intelligence (AI) that specializes in natural language interactions between computers and humans. Today’s AI Chatbots and AI Virtual Assistants use NLP extensively to process, analyze, understand, and respond to a user utterance expressed in natural language, either as text (via a chat interface) or as voice (via an Interactive Voice Response interface that converts audio to text). Unsupervised NLP and Supervised NLP both play key roles in the success and growth of AI.

In those systems, NLP addresses a variety of human language challenges. These fall primarily into Syntax Analysis (how words are arranged in a sentence so that they make grammatical sense), which covers tasks like Lemmatization, Word Segmentation, and Part-of-Speech (PoS) Tagging, and Semantic Analysis (understanding the meaning and interpretation of words and how sentences are structured), which covers tasks like Named-Entity Recognition (NER), Word-Sense Disambiguation, Natural Language Generation (NLG), and more.

AI Chatbots and AI Virtual Assistants use either one or a balanced combination of the two families of NLP Learning: Supervised Learning and Unsupervised Learning.

What is Supervised AI Learning?

AI Chatbots and AI Virtual Assistants that use Supervised Learning are trained on well-labeled (or tagged) data. During training, those systems learn the best mapping function between known input data and the expected known output. Supervised NLP models then apply the approximate mapping learned during training to input data they have never seen before and predict the corresponding output. Supervised Learning models usually require extensive, iterative optimization cycles to adjust the input-output mapping until they converge to an acceptable level of performance. This type of learning is called “supervised” because learning from labeled training data resembles a teacher supervising the learning process end to end. Supervised Learning models can typically achieve excellent levels of performance, but only when enough labeled data is available.

Furthermore, building, scaling, deploying, and maintaining accurate Supervised Learning models takes time and technical expertise from a team of highly skilled data scientists. For example, a typical task that a Supervised Learning model performs for AI Chatbots / AI Virtual Assistants is classifying an input user utterance into a known class of user intents, using any of a variety of algorithms (Support Vector Machines, Random Forests, Classification Trees, etc.). The precision these techniques achieve is remarkable, but the shortfall is that coverage is limited to the intent classes for which labeled training data is available.
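To make the idea of intent classification concrete, here is a minimal sketch in plain Python. It is an illustration only, not how any particular product works: a small multinomial Naive Bayes classifier stands in for the heavier algorithms named above (Support Vector Machines, Random Forests, and so on), and the training utterances, the intent names, and the NaiveBayesIntentClassifier class are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class NaiveBayesIntentClassifier:
    """Toy multinomial Naive Bayes over bag-of-words features (hypothetical helper)."""

    def fit(self, utterances, intents):
        self.word_counts = defaultdict(Counter)  # intent -> word frequencies
        self.intent_counts = Counter(intents)    # intent -> number of examples
        self.vocab = set()
        for text, intent in zip(utterances, intents):
            tokens = tokenize(text)
            self.word_counts[intent].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total = sum(self.intent_counts.values())
        best_intent, best_score = None, -math.inf
        for intent, n in self.intent_counts.items():
            # log prior + sum of log likelihoods with add-one smoothing
            score = math.log(n / total)
            denom = sum(self.word_counts[intent].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[intent][tok] + 1) / denom)
            if score > best_score:
                best_intent, best_score = intent, score
        return best_intent


# Hypothetical labeled training data: (utterance, intent) pairs.
TRAIN = [
    ("reset my password", "password_reset"),
    ("i forgot my password", "password_reset"),
    ("my laptop will not turn on", "hardware_issue"),
    ("the laptop screen is broken", "hardware_issue"),
]

clf = NaiveBayesIntentClassifier().fit(
    [text for text, _ in TRAIN], [intent for _, intent in TRAIN]
)

print(clf.predict("i forgot my password please"))  # password_reset
print(clf.predict("my laptop screen is cracked"))  # hardware_issue
# An utterance about an intent with no labeled data is still forced
# into one of the known classes, which is the coverage limit described above.
print(clf.predict("order a new monitor"))
```

The last call shows the shortfall in practice: the classifier can only answer with an intent it was trained on, however poorly that intent fits the new utterance.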

Advancing AI with Unsupervised Learning

To overcome the limitations of Supervised Learning, academia and industry started pivoting towards the more advanced (but more computationally complex) Unsupervised Learning, which promises effective learning from unlabeled data and without human supervision (no data scientist or deep technical expertise is required). This is an important advantage over Supervised Learning: unlabeled text in digital form is abundant, while labeled datasets are usually expensive to construct or acquire, even for common NLP tasks like PoS tagging or Syntactic Parsing. Unsupervised Learning models are designed to work on their own and automatically discover information, structure, and patterns from the data itself. This is where Unsupervised NLP shines.

The most popular applications of Unsupervised Learning in advanced AI Chatbots / AI Virtual Assistants are clustering (K-means, Mean-Shift, density-based, Spectral clustering, etc.) and association rule methods. Clustering is typically used to automatically group semantically similar user utterances together, which accelerates the derivation and verification of their common underlying user intent (note that this derives a new class rather than classifying into an existing one). Unsupervised Learning is also used for association rule mining, which aims to discover relationships between features directly from data. This technique is typically used to automatically extract dependencies between named entities in input user utterances, dependencies between intents across the user utterances of a single user/system session, or dependencies between questions and answers in conversational logs that capture the interactions between users and live agents during troubleshooting.
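The clustering step can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not a production recipe: the utterances are hypothetical, word overlap on normalized bag-of-words vectors stands in for real semantic similarity, the number of clusters k is chosen by hand, and the centroids are seeded deterministically (farthest-point seeding) so the result is repeatable.

```python
import math
from collections import Counter


def vectorize(texts):
    """Turn each utterance into a unit-length bag-of-words vector."""
    vocab = sorted({tok for t in texts for tok in t.lower().split()})
    vectors = []
    for t in texts:
        counts = Counter(t.lower().split())
        vec = [counts[w] for w in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Basic K-means; returns one cluster label per input vector."""
    # Deterministic seeding: start from the first vector, then repeatedly
    # add the vector farthest from the centroids chosen so far.
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        )
    labels = None
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical unlabeled utterances: no intent tags are provided.
UTTERANCES = [
    "reset my password",
    "forgot my password",
    "password reset link expired",
    "laptop screen is broken",
    "my laptop screen flickers",
    "broken laptop keyboard",
]

labels = kmeans(vectorize(UTTERANCES), k=2)
for label, text in zip(labels, UTTERANCES):
    print(label, text)
```

On this toy data the first three utterances land in one cluster and the last three in the other. The clusters carry no names: a person (or a further automated step) still has to inspect each group and decide which new intent it represents.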

Even though the benefits and level of automation brought by Unsupervised Learning are large and technically very intriguing, Unsupervised Learning is, in general, less accurate and less trustworthy than Supervised Learning. Indeed, the most advanced AI Chatbot / AI Virtual Assistant technologies on the market thrive by striking the right balance between the two. When exploited correctly, this combination delivers the accuracy and precision of Supervised Learning (for tasks where labeled data is available) together with the self-automation of Unsupervised Learning (for tasks where no labeled data is available).

Aisera offers the most feature-comprehensive and technologically advanced AI Virtual Assistant solution for self-service automation on the market. It blends Supervised Learning and Unsupervised Learning, Natural Language Understanding (NLU), AI Virtual Assistant technology, Conversational AI (cognitive search), and Conversational RPA into one SaaS cloud offering for IT Service Desk and Customer Services. Aisera’s proprietary unsupervised NLP/NLU technology, User Behavioral Intelligence, and Sentiment Analytics are covered by several pending patent applications.