Natural Language Processing

Natural language processing (NLP) is a branch of artificial intelligence (AI) that powers a number of everyday applications, such as digital assistants like Siri or Alexa, GPS systems, and predictive text on smartphones.

Earlier NLP systems combined rule-based computational linguistics with statistical methods and machine learning to understand and draw conclusions from social messages, reviews and other data. More recent approaches use neural networks and large language models (LLMs) to accomplish the tasks below.

To facilitate NLP, a number of sub-tasks are often conducted, including:

Tokenization: Text is broken down into smaller units called tokens, such as words, subwords or sentences.
Stemming: Words are reduced to a root form by stripping affixes. For example, reading and reads are stemmed into the word “read”.
Lemmatization: Inflected forms of a word are reduced to their dictionary form (lemma), taking vocabulary and context into account. For example, better and best are reduced to “good”.
Stop word removal: Words such as prepositions and articles are removed.
Part-of-speech tagging: Each word is tagged with its grammatical category, such as noun, verb, adjective, adverb or pronoun.
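
The first few sub-tasks above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not how production NLP libraries implement them: the stop-word list, the suffix list and the function names (tokenize, remove_stop_words, stem, preprocess) are all assumptions made for this example, and the stemmer is a crude suffix stripper rather than an established algorithm such as Porter's.

```python
import re

# Hypothetical, deliberately tiny stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "to", "and", "is", "are"}

# Suffixes removed by this toy stemmer, checked in this order.
SUFFIXES = ("ing", "ed", "s")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def stem(token):
    """Strip the first matching suffix if at least three characters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop words, then stem each remaining token."""
    return [stem(t) for t in remove_stop_words(tokenize(text))]


print(preprocess("She reads and is reading the reviews"))
# ['she', 'read', 'read', 'review']
```

Lemmatization and part-of-speech tagging are left out because they generally need a dictionary or a trained model, which a short standard-library sketch cannot reasonably stand in for.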

To facilitate conversational communication with a human, NLP employs two other sub-branches called natural language understanding (NLU) and natural language generation (NLG). NLU uses algorithms that analyze text to understand words contextually, while NLG assists in generating meaningful words as a human would. Together, they power intelligent chatbots such as ChatGPT.

Here are the main NLP techniques used in business and B2C environments.

Text summarization: NLP algorithms scan vast amounts of data and condense the information to provide a summary with key insights.
Speech recognition: This technique analyzes audio data to translate it into text or map it to known words. It’s used to make closed captions and has been pivotal in empowering people with hearing impairments.
Machine translation: NLP can automatically translate text between languages so that users can absorb non-native information with minimal effort. Google Translate is a good example.
Question answering systems: NLP algorithms scan data and search for relevant information to answer a user's question. These systems can be rules-based or built on generative pre-trained models, such as those behind ChatGPT, which are trained on large volumes of text, including publicly accessible data on the internet.
Named entity recognition: Named entity recognition (NER) is an NLP technique that identifies and extracts entities such as people, locations, brands, objects and currencies.
Semantic search: Semantic search is a search technique that retrieves information by interpreting the intent behind a query rather than relying on keywords alone.
Sentiment analysis: NLP algorithms categorize the emotions in a text to show whether it is positive, negative or neutral, and to what extent.
Aspect-based sentiment: This advanced technique analyzes sentiment for each aspect extracted from the topics in a text, rather than for the text as a whole. This fine-grained view of market sentiment can give brands insight into exactly where they need to improve and what aspects are going well.
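
To make the last two techniques more concrete, here is a minimal lexicon-based sketch in Python. It is only one simple way to approach sentiment scoring (modern systems typically rely on trained models), and everything in it is an assumption made for illustration: the tiny LEXICON, its scores, the clause-splitting rule and the function names score, label and aspect_sentiment.

```python
import re

# Hypothetical mini-lexicon mapping words to polarity scores.
# Real sentiment lexicons contain thousands of entries.
LEXICON = {
    "great": 1.0, "love": 1.0, "good": 0.5,
    "slow": -0.5, "bad": -0.5, "terrible": -1.0,
}


def score(text):
    """Sum the polarity of every lexicon word found in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(w, 0.0) for w in words)


def label(value):
    """Map a numeric score to positive, negative or neutral."""
    if value > 0:
        return "positive"
    if value < 0:
        return "negative"
    return "neutral"


def aspect_sentiment(review, aspects):
    """Score each clause and credit it to any aspect term it contains."""
    totals = {}
    # Assumption: punctuation and the word "but" separate clauses.
    for clause in re.split(r"[.,;!?]|\bbut\b", review.lower()):
        for aspect in aspects:
            if aspect in clause:  # simple substring match
                totals[aspect] = totals.get(aspect, 0.0) + score(clause)
    return {aspect: label(value) for aspect, value in totals.items()}


review = "The battery is great, but the screen is terrible."
print(label(score(review)))                             # neutral
print(aspect_sentiment(review, ["battery", "screen"]))
# {'battery': 'positive', 'screen': 'negative'}
```

The example review scores as neutral overall because the positive and negative words cancel out, while the per-aspect view separates praise for the battery from criticism of the screen. That difference is the fine-grained insight aspect-based sentiment is meant to provide.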

Published by Hailey Roover