Natural language processing (NLP)

Introduction

Artificially intelligent assistants are something we take for granted in today’s technology-driven world. From the voice navigation feature in Google Maps to Alexa and Siri, many of the inventions we use daily are built on Natural Language Processing, a branch of Machine Learning.

What is NLP?

NLP has been theorized about for many years, but Alan Turing’s 1950 paper “Computing Machinery and Intelligence” is generally considered the most formative work on the subject. NLP is a cross-disciplinary technology that enables computers and humans to communicate with each other in natural languages such as English. It sits at the meeting point of linguistics and computer science: machines are empowered to study natural languages, process and understand them, and generate responses in the same languages.

What is NLP used for?

The main goal of NLP is to make computers and similar machines smarter, especially when interacting with humans to perform their functions. This is made possible by the ability of machines to process enormous amounts of natural language data. At the most basic level, NLP deals with:
  • Listening and recording human speech or voice input
  • Converting the input to text
  • Processing the data of the text
  • Generating relevant text or audio response based on the converted data
So, the input and output may be either speech (e.g. Alexa) or text (e.g. chatbots). There are many examples of NLP that we commonly use. Translation applications are an important example: Google Translate holds data on a large number of languages and can translate between them at the click of a button. Word processors like Google Docs and MS Word are also powered by NLP; spell-checking and grammar correction are just a few of the many features they offer. NLP at its most visible can be seen in personal assistant devices like Alexa and Siri, which can carry on simple conversations with humans. The voice we hear on customer care service calls is NLP-enabled Interactive Voice Response, an automated Q&A technology programmed to understand customer needs before putting them through to a suitable person.
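The four steps above can be sketched as a very small, self-contained loop. The transcribe and speak helpers below are hypothetical placeholders for real speech-to-text and text-to-speech components, and the keyword matching stands in for a real language model:

# A toy illustration of the listen -> transcribe -> process -> respond loop.
# transcribe() and speak() are hypothetical placeholders, not a real API.

def transcribe(audio_bytes: bytes) -> str:
    # Placeholder: a real assistant would call a speech-to-text engine here.
    return "what is the weather in london"

def process(text: str) -> str:
    # Very small rule-based "processing" step: map the text to a reply.
    if "weather" in text:
        return "Sorry, I cannot check the weather yet."
    if "hello" in text:
        return "Hello! How can I help you?"
    return "I did not understand that."

def speak(reply: str) -> None:
    # Placeholder: a real assistant would synthesize audio from the reply.
    print("Assistant:", reply)

audio = b""                     # pretend this came from a microphone (step 1)
text = transcribe(audio)        # step 2: convert the input to text
reply = process(text)           # step 3: process the text
speak(reply)                    # step 4: generate a response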

The main uses of NLP

  • Automatic summarization
  • Text classification
  • Question answering
  • Named entity recognition
  • Parts-of-speech tagging
  • Character recognition
  • Relationship extraction
  • Sentiment analysis
  • Speech recognition
  • Topic segmentation
  • Text or information mining
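As a small illustration of one of these tasks, the sketch below runs sentiment analysis with NLTK’s VADER analyzer. It assumes NLTK is installed and downloads the vader_lexicon resource on first use:

# Sentiment analysis with NLTK's VADER analyzer.
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)   # one-time lexicon download

sia = SentimentIntensityAnalyzer()
for sentence in ["I love this product!", "This is the worst service ever."]:
    scores = sia.polarity_scores(sentence)   # neg / neu / pos / compound scores
    print(sentence, "->", scores["compound"])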

Is NLP difficult?

Learning a new language is a difficult process even for humans, as every language has a wide scope and is full of ambiguities. NLP is therefore considered quite an achievement in the field of Artificial Intelligence. NLP works on the basis of Natural Language Understanding (NLU) and Natural Language Generation (NLG). Both of these processes are subsets of NLP and involve major challenges. Humans possess an instinctive understanding of language that comes with many years of practice. A seemingly simple sentence may carry many layers of meaning which, while easy for humans to process, defy machine intelligence. For example, one may respond to a bad joke by saying, “Very funny.” Here we understand without being told that the meaning is not to be taken literally; the response is sarcastic. A machine cannot easily extract such subtext, as it operates neither on emotion nor on instinct. This is just one of the many complexities presented by natural language. As machines cannot think on their own, the best they can do is mimic humans.

Despite access to enormous amounts of language data, NLP still faces many challenges. A major case in point is Microsoft’s 2016 Twitter chatbot Tay, which was built to study and improve conversational language. Tay was designed to mimic the conversational patterns of a 19-year-old American girl, and the more people engaged with it, the better its responses were supposed to get. However, the project soon spiraled out of control: people fed it highly racist tweets, and Tay’s tone quickly turned inflammatory and offensive. Microsoft pulled it down only 16 hours after its release.

How does NLP work?

As mentioned earlier, NLP works in conjunction with NLU and NLG. This process can be understood as one involving three steps:
  • Natural Language Understanding is the process by which the input data is studied with regard to grammar, context, intent, and entities
  • Natural Language Processing converts the input into structured data
  • Natural Language Generation lets the machine give a suitable output based on the data
These steps are carried out by algorithms that extract the meaning associated with every word and with the sentence as a whole, converting the unstructured language input into a form the machine can understand. Once the machine understands it, it forms a suitable response in the same natural language. However, the mechanics of the process are far more complicated and at times even unsuccessful, as in the case of Tay. Google Translate is also not 100% accurate, as it does not understand a sentence as a whole but relies largely on statistical patterns.
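A toy sketch of those three steps is shown below. The understand and generate functions are illustrative stand-ins for real NLU and NLG components, with a regular expression doing duty as the “entity extractor”:

# Toy NLU -> structured data -> NLG pipeline (illustrative names only).
import re

def understand(text: str) -> dict:
    # NLU step: guess the intent and pull out a simple "city" entity.
    intent = "get_weather" if "weather" in text.lower() else "unknown"
    match = re.search(r"in ([A-Z][a-z]+)", text)
    entities = {"city": match.group(1)} if match else {}
    return {"intent": intent, "entities": entities}   # structured data

def generate(structured: dict) -> str:
    # NLG step: turn the structured data back into natural language.
    if structured["intent"] == "get_weather" and "city" in structured["entities"]:
        return "Looking up the weather in " + structured["entities"]["city"] + "."
    return "Sorry, I did not understand that."

print(generate(understand("What is the weather in Paris?")))
# -> Looking up the weather in Paris.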

What are the techniques used in NLP?

NLP relies on the analysis of the following aspects of any language:
  • Syntax
  • Semantics
Syntax is the structure of a given arrangement of words according to the grammatical rules of a natural language. By analyzing the syntax with algorithms, machines try to work out the meaning. Syntactic analysis covers the following (a short code sketch follows this list):
  • Lemmatization – the grouping of inflected forms of a word for easy study
  • Segmentation – morphological segmentation (reducing words into morphemes) and word segmentation into units
  • Parts-of-speech tagging – identifying the different parts of speech
  • Parsing – assigning syntactic roles to the different parts of speech thus broken down
  • Sentence breaking – dividing a large piece of text into sentences
  • Stemming – reducing inflected words to the root form
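As a minimal example of two of these steps, stemming and lemmatization, the sketch below uses NLTK. It assumes NLTK is installed and downloads the WordNet corpus the lemmatizer needs; exact resource names can vary slightly between NLTK versions:

# Stemming vs. lemmatization with NLTK.
import nltk
from nltk.stem import PorterStemmer, WordNetLemmatizer

nltk.download("wordnet", quiet=True)   # one-time download for the lemmatizer

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

for word in ["studies", "running", "feet"]:
    print(word,
          "| stem:", stemmer.stem(word),            # crude suffix stripping
          "| lemma:", lemmatizer.lemmatize(word))   # dictionary-based root form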

Semantics is the meaning the text conveys. Here the rules of language are less structured. A sentence that has perfect syntax may not be correct semantically (a famous example is “Colorless green ideas sleep furiously” from Noam Chomsky’s 1957 book Syntactic Structures). Semantic analysis covers the following:
  • Named entity recognition – classifying names, places, and similar words in the sentence (see the sketch after this list)
  • Word sense analysis – understanding a word’s meaning with regard to its context
  • Language generation – generating semantically correct language with the help of databases
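As an illustration of named entity recognition, the sketch below uses spaCy, assuming spaCy and its small English model en_core_web_sm have been installed (the model is downloaded separately with python -m spacy download en_core_web_sm):

# Named entity recognition with spaCy's small English model.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple opened a new office in London in 2021.")

for ent in doc.ents:
    print(ent.text, ent.label_)   # e.g. 'Apple' ORG, 'London' GPE, '2021' DATE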

Conclusion

Many businesses are already leveraging the power of NLP to their advantage. From translation and word processing to voice assistants, NLP sits behind much of the software we use every day, and this technology will continue to influence and mold every aspect of our digital life in the future.
