Chatbots, computer programs that can respond to queries typed in natural language, have recently become a popular addition to web sites and other interactive applications.
The aim of this project is to implement a simple chatbot that can respond to simple queries made up of a single question word (called the intent) and an object (called the entity). The chatbot will be able to answer questions such as Where is the printer? or What is C? by identifying the intent (where or what) and entity (printer or C) then looking up the answer corresponding to this pair in a simple database. The chatbot will also be able to learn new answers by asking questions of the user. If a user asks a question for which the database does not contain an answer, the user will be given the option to supply an answer that can be used to respond to the same question in future. (This simulates having the chatbot refer the unknown question to a third party who does know the answer.)
Your team is required to implement, in C, a chatbot of the kind described in the introduction. The chatbot will converse in the console using a command-line-like interface; no graphical UI is required.
On each iteration of the main loop of the program (implemented for you in the main function in the skeleton code), the program will display a prompt and wait for the user to type a question or instruction.
Firstly, your chatbot should recognise some stock words and phrases to which it can give rote replies. These phrases will be referred to as smalltalk in the remainder of this document. For example, the chatbot may reply "hello" whenever the user types "hello" or "hi". Your chatbot should have at least five items of smalltalk.
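One possible way to hold smalltalk is a small lookup table, sketched below. The struct, the table name SMALLTALK, the function smalltalk_reply() and all of the entries are illustrative assumptions modelled on the hello/hi example; none of them come from the skeleton code.

```c
#include <ctype.h>
#include <stddef.h>

/* One smalltalk entry: a phrase the user may type and the rote reply. */
struct smalltalk {
    const char *phrase;
    const char *reply;
};

/* Illustrative entries only; choose your own five or more items. */
static const struct smalltalk SMALLTALK[] = {
    { "hello",   "Hello." },
    { "hi",      "Hello." },
    { "thanks",  "You're welcome." },
    { "goodbye", "Goodbye!" },
    { "bye",     "Goodbye!" },
};

/* Case-insensitive equality test for two NUL-terminated strings. */
static int same_word(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both strings ended together */
}

/* Return the rote reply for a smalltalk phrase, or NULL if it is not smalltalk. */
const char *smalltalk_reply(const char *word)
{
    size_t i;

    for (i = 0; i < sizeof SMALLTALK / sizeof SMALLTALK[0]; i++)
        if (same_word(word, SMALLTALK[i].phrase))
            return SMALLTALK[i].reply;
    return NULL;
}
```

A NULL result would tell the caller to go on and treat the input as an intent followed by an entity.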
Apart from smalltalk, the chatbot will understand only a very simplified form of English in which all sentences are composed of only two parts, called the intent and the entity, in that order.
- The intent represents what the user wants to do or to know. It may be a verb instructing the chatbot to do something, or a question word like what, where, etc.
- The entity is a noun phrase representing the object to which the instruction or question refers.
The table below describes the intents and entities that must be understood by your chatbot.
For example, the user might type "Where is UOM?", which consists of the intent "WHERE" and the entity "UOM". The chatbot might answer "UOM is located in Melbourne, Australia."
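The split into intent and entity could be sketched as follows. This is a minimal illustration, not the skeleton's own tokeniser: the function names split_question() and skip_word() are hypothetical, and dropping a linking "is" or "are" and a trailing question mark is an assumption inferred from the "Where is UOM?" example.

```c
#include <ctype.h>
#include <string.h>

/*
 * If s begins with the given lower-case word followed by whitespace or the
 * end of the string, return a pointer just past the word; otherwise return s.
 */
static const char *skip_word(const char *s, const char *word)
{
    size_t len = strlen(word);
    size_t i;

    for (i = 0; i < len; i++)
        if (tolower((unsigned char)s[i]) != word[i])
            return s;
    if (s[len] != '\0' && !isspace((unsigned char)s[len]))
        return s;
    return s + len;
}

/*
 * Split a line such as "Where is UOM?" into an intent ("Where") and an
 * entity ("UOM"). Both buffers must hold at least one byte.
 * Returns 1 if both parts were found, 0 otherwise.
 */
int split_question(const char *line, char *intent, size_t intent_size,
                   char *entity, size_t entity_size)
{
    const char *rest;
    size_t n = 0;

    /* The first word is the intent. */
    while (isspace((unsigned char)*line))
        line++;
    while (*line != '\0' && !isspace((unsigned char)*line)) {
        if (n + 1 < intent_size)
            intent[n++] = *line;
        line++;
    }
    intent[n] = '\0';

    /* Assumption: a linking "is" or "are" is not part of the entity. */
    while (isspace((unsigned char)*line))
        line++;
    rest = skip_word(line, "is");
    if (rest == line)
        rest = skip_word(line, "are");
    line = rest;
    while (isspace((unsigned char)*line))
        line++;

    /* Everything that remains is the entity, minus trailing "?" and spaces. */
    n = 0;
    while (*line != '\0' && n + 1 < entity_size)
        entity[n++] = *line++;
    while (n > 0 && (isspace((unsigned char)entity[n - 1]) || entity[n - 1] == '?'))
        n--;
    entity[n] = '\0';

    return intent[0] != '\0' && entity[0] != '\0';
}
```

The intent is returned exactly as typed, so the caller would still need to compare it case-insensitively against the intents it understands.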
The chatbot will start without any knowledge of any entities, but can learn about them by asking the user. If someone asks it a question for which it does not know the answer, the chatbot will print "I don't know" followed by the question. The user may then type in an answer. If the user types an answer, the chatbot will thereafter use this answer. If the user simply presses "enter" without typing anything, the chatbot will ignore the response. In either case, the chatbot will then go back to waiting for more questions.
A conversation might therefore proceed as follows:
Chatbot: Hello.
Me: What is the Accounting Cluster?
Chatbot: I don't know. What is the Accounting Cluster?
Me: The Accounting Cluster teaches accounting and in-depth understanding of business issues and processes.
Chatbot: Thank you.
Me: What is the Accounting Cluster?
Chatbot: The Accounting Cluster teaches accounting and in-depth understanding of business issues and processes.
Some more detailed sample output is given at the end of this specification sheet.
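The learning step, in which the user either supplies an answer or just presses enter, might be handled along the lines of the sketch below. The function name accept_answer() is hypothetical, and the sketch assumes the reply was read as a whole line (for example with fgets), so it may still end in a newline.

```c
#include <ctype.h>
#include <string.h>

/*
 * Copy the user's reply into answer with surrounding whitespace removed.
 * Returns 1 if an answer was supplied, or 0 if the line was blank, in which
 * case the chatbot should ignore the response and learn nothing.
 */
int accept_answer(const char *reply, char *answer, size_t answer_size)
{
    size_t len;

    while (isspace((unsigned char)*reply))
        reply++;
    len = strlen(reply);
    while (len > 0 && isspace((unsigned char)reply[len - 1]))
        len--;                       /* drops the newline left by fgets */

    if (len == 0 || answer_size == 0)
        return 0;
    if (len >= answer_size)
        len = answer_size - 1;       /* truncate rather than overflow */
    memcpy(answer, reply, len);
    answer[len] = '\0';
    return 1;
}
```

When the function returns 1, the caller would store the answer against the intent and entity of the unknown question and print "Thank you."; when it returns 0, it would simply return to the prompt.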
Once the chatbot has learned some entities, they can be saved to disk using the SAVE intent, and recalled using the LOAD intent. The RESET intent will erase all of the chatbot's knowledge, leaving only the smalltalk in memory (but it will not erase any of the files created by SAVE, so that they can be re-loaded later).
The LOAD intent should append all of the entities and responses in the file to whatever entities and responses already exist in the chatbot's memory. If an entity in the file is the same as one that is already in memory, the corresponding response from the file should replace the response currently in memory.
Skeleton
The skeleton program provided to you gives some guidance on how to structure a program of this kind and some hints on how it can be implemented. It consists of a header file, chat1002.h, that contains all of the type definitions and function prototypes for the program, together with three modules as follows:
A detailed explanation of what each function should do is given in a comment at the start of the function. Refer also to the comment at the top of chatbot.c for how that module works.
You may alter the skeleton code if you wish but do be sure to update any comments in the code so that they reflect what the modified code does.
Conceptually, the knowledge base consists of a single list for every question intent understood by the chatbot. Each element in the list consists of an entity together with the answer to the question for that entity. Each list may be of arbitrary length; in particular it may grow indefinitely as the chatbot learns more answers to more questions. It is up to you to decide how these lists should be implemented, and your implementation should be described in the report that accompanies your submission.
You may assume that no entity is longer than MAX_ENTITY characters and that no answer is longer than MAX_RESPONSE characters, as defined in chat1002.h.
Different questions may refer to the same entity with different answers. Where is UOM?, for example, might have the answer already described, while What is UOM? might explain that UOM is the University of Melbourne, a university in Melbourne, Australia. An entity need not be associated with an answer to all question intents; Who is UOM?, for example, need not be understood even if Where is UOM? and What is UOM? are.
Entities may be matched using a simple case-insensitive string matching algorithm, such as the one implemented by compare_token() in main.c. You do not need to perform expansion of acronyms, stemming, or any other sophisticated type of matching (these require the use of third-party libraries in C).
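Since the choice of list implementation is left to you, the following is only one possible sketch of the knowledge base: a singly linked list per question intent, with entities matched case-insensitively. All names here (kb_node, kb_get, kb_put, kb_reset, the KB_ constants) are hypothetical, the three intents are the question words used in the examples above, and the two size constants are placeholders for the real values in chat1002.h.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ENTITY   64    /* placeholder; use the value from chat1002.h */
#define MAX_RESPONSE 256   /* placeholder; use the value from chat1002.h */

enum kb_intent { KB_WHAT, KB_WHERE, KB_WHO, KB_INTENT_COUNT };

struct kb_node {
    char entity[MAX_ENTITY + 1];
    char response[MAX_RESPONSE + 1];
    struct kb_node *next;
};

/* One list head per intent; all start out NULL (an empty knowledge base). */
static struct kb_node *kb[KB_INTENT_COUNT];

/* Case-insensitive equality test for two entity strings. */
static int same_entity(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;
}

static struct kb_node *kb_find(enum kb_intent intent, const char *entity)
{
    struct kb_node *n;

    for (n = kb[intent]; n != NULL; n = n->next)
        if (same_entity(n->entity, entity))
            return n;
    return NULL;
}

/* Return the stored response, or NULL if this intent/entity pair is unknown. */
const char *kb_get(enum kb_intent intent, const char *entity)
{
    struct kb_node *n = kb_find(intent, entity);

    return n != NULL ? n->response : NULL;
}

/* Add a response, or replace the existing one. Returns 0, or -1 if out of memory. */
int kb_put(enum kb_intent intent, const char *entity, const char *response)
{
    struct kb_node *n = kb_find(intent, entity);

    if (n == NULL) {
        n = malloc(sizeof *n);
        if (n == NULL)
            return -1;
        strncpy(n->entity, entity, MAX_ENTITY);
        n->entity[MAX_ENTITY] = '\0';
        n->next = kb[intent];        /* push onto the front of the list */
        kb[intent] = n;
    }
    strncpy(n->response, response, MAX_RESPONSE);
    n->response[MAX_RESPONSE] = '\0';
    return 0;
}

/* Free every node, as the RESET intent requires. */
void kb_reset(void)
{
    int i;

    for (i = 0; i < KB_INTENT_COUNT; i++) {
        while (kb[i] != NULL) {
            struct kb_node *next = kb[i]->next;
            free(kb[i]);
            kb[i] = next;
        }
    }
}
```

Because kb_put() overwrites the response of an entity that is already present, calling it once per entry read from a file would give the replace-on-duplicate behaviour required of LOAD. Keeping a separate list per intent lets "Where is UOM?" and "What is UOM?" hold different answers for the same entity.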