Chatbot are examples of Weak AI. Current generation of chatbot can be thought of smart dialog systems driven through techniques like NLP and fixed conversation flows.
Out of the box, a chatbot doesn’t understand any domain. We need to train the chatbot to understand the domain. Also, based on the complexity of the domain, you would incrementally train and add subdomains. For instance, a chatbot helping you book a cab is an example of fixed domain, while a chatbot helping assisting doctors for cancer treatment would be trained on various types of cancer incrementally. As mentioned in the shopping advisor example, understanding the meaning of the same word in different context is difficult for the current generation of NLP implementation to resolve and you need to rely on custom techniques to handle such conditions.
Now, let’s look at some marketing gimmicks around AI chatbots –
- INGEST AND KNOW IT ALL chatbots – These are chatbots being marketed where you can ingest millions of documents, like medical literature and can ask questions, which can provide expert assistance like diagnosis of diseases. Such kinds of systems unless trained appropriately will never provide desired accuracy.
By appropriately, I mean it can take years to train these systems. The fundamental problem with these systems is that, they still don’t understand the complete language and complexity of the domain. You typically end up with custom domain adoptions and infinity language rules, which is definitely not smart enough to manage in longer run. The predictions of such systems are usually not accurate.
- Self learning chatbots – How often you have heard this terminology called self learning chatbots. This again is a misconception, where chatbots learns on its own. You have to train chatbot on what you what the chatbot to learn. Usually you would capture the user behavior details through their interaction with the chatbot application. This would include capturing user analytics information like capturing his likes or dislikes in some way, either through explicit or implicit means. Explicit information can be a user rating a product and implicit can be the time a user spent looking at a response.
Once you know the user well and have its data, its becomes a recommendation problem on what you want to recommend to the user. So, you end you building a recommendation algorithm to recommend something. For instance, for a fintech application, this would mean recommending similar stocks based on what stock he views regularly or his portfolio.
Different domain and use cases, would need different recommendation algorithms and that needs to be developed as part of the chatbot. However, the learning is boxed, for instance if you have a chatbot which assist you in booking restaurants, it can recommend you similar restaurants, but it can’t recommend your places to stay, it only knows about your restaurants taste.
Well, someone can build a recommendation system which tracks what users eat and where they stay and then try to come up with a correlation which provide a recommendation, as the system now knows – “user eating XYZ, most likely are adventurous. So, recommend some trekking place.” Again, in this case, recommendation is boxed on what you know and what you want to recommend. I don’t know if any such hypotheses exist, but only through data and feedback that can be inferred. The point is, all of these hypotheses, data and feedback needs to be designed and developed, and saying chatbots learns on its own it’s quite misleading.
- General purpose, generative chatbots – A chatbot which is capable of learning new concepts from scratch and provide responses like Human. As it learns from open domain, the chatbots would start behaving similar to the famous Microsoft Tay chatbot (which was forced to shut down on its launch day), as it started learning unwanted details from tweets and started posting inflammatory and offensive tweets. This is a classic example of what I quoted earlier – “AI can learn, but can’t think”. The generative chatbots are formulating the response based on probability of words and creating a grammatically correct sentence, without understanding the real meaning of it.
As I mentioned earlier, the first focus should be on getting domain specific chatbots right and with the current techniques we are far away from realizing the vision.