Designing A Chatbot Application For Web-Based English Learning Using Boyer Moore Algorithm

ABSTRACT


INTRODUCTION
English is spoken in many countries and used in many fields, including education, technology, business, industry, politics, and trade. Proficiency in English is increasingly necessary: it opens opportunities to work at a multinational scale, participate in the global community, and read international research. According to the 2022 English Proficiency Index published by Education First (EF), Indonesia ranked 81st out of 111 countries, which places it in the low proficiency category. Indonesia ranked 80th in 2021 and 74th in 2020. These data indicate that the English proficiency of Indonesian people still lags behind that of other countries. Students also face various problems in the process of learning English. One problem that often arises is a monotonous or insufficiently interactive teaching method on the part of the teacher [1,2]. It is therefore important to develop an application that makes learning English more engaging and accessible at any time. One solution is to use Artificial Intelligence and chatbots [3,4,5]. Artificial Intelligence is a branch of computer science that aims to develop computer systems with abilities and behaviors similar to those of humans [4,5,6,7,8,9,10].
One application of artificial intelligence is the chatbot. A chatbot is a program designed to allow humans and computers to communicate in natural language, just as humans communicate with one another [13]. The Boyer-Moore algorithm can be applied in a chatbot: it performs pattern matching on the keywords of a question and matches them against the keywords contained in the knowledge base. Boyer-Moore is a string search algorithm that compares the pattern with the text starting from the rightmost character of the pattern and proceeding toward the left [14].
A chatbot is a set of programs that allows humans and computer systems to interact just as humans communicate with one another [10,11]. The term "chatbot" comes from "chatterbot", a term originally proposed by Michael Mauldin in 1994 to describe a robot that humans can chat with. This technology is also known by other names, such as dialog system, conversational agent, conversational interface, virtual assistant, and personal assistant [17]. Chatbots are widely used in various fields, such as customer service in e-commerce, financial dialogue systems, virtual counseling services, and conversational agents in health consultations [18,19].

RESEARCH METHOD
This research uses the Research and Development method. The method is divided into several research steps, shown in Figure 1.

Figure 1. Research Framework

The steps shown in Figure 1 are carried out as follows:
1. Literature Study. Information is collected from books that discuss tenses in English.
2. Problem Identification. The problem identified is how to design an application that can help users learn English, especially tenses.
3. Data Collection. A literature study is conducted to find questions related to tenses and the correct answers to those questions.
4. System Design. The system is designed using the JavaScript programming language and a MySQL database.
5. Boyer Moore Implementation. The Boyer Moore algorithm is applied to the designed system. The aim is for the chatbot to become more reliable and effective in understanding and responding to user input, providing relevant solutions or information, and improving the overall interaction between users and the chatbot.
6. System Testing. A Confusion Matrix is used in testing to evaluate the performance of the algorithm used in the chatbot. The accuracy value is calculated from the Confusion Matrix using the following equation.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
where accuracy is the ratio of the number of correctly identified questions to the total number of available questions.
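The accuracy calculation can be illustrated with a short sketch. Python is used here for brevity, although the system itself is described as being built in JavaScript. The function name `accuracy` and the counts passed to it are hypothetical, chosen only to show the arithmetic; they are not the counts obtained in this study.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of correctly identified items: (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


# Hypothetical counts for illustration only (not the study's results).
example = accuracy(tp=45, tn=40, fp=10, fn=5)
print(f"accuracy = {example:.2%}")
```

With these illustrative counts, 85 of 100 items are identified correctly, so the function returns 0.85.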

Data Collection
The questions were collected from the Brainly website, which served as the reference for a list of questions about tenses in English. The answers were written with reference to the book English skill booster: grammar, tenses, vocabulary, conversation by M. Furqon & Desi Sugiarti. Brainly is an online learning platform where users can openly ask and answer questions related to school subjects. The Brainly website contains more than 4000 questions related to tenses; from these, the author collected the questions most often asked by Brainly users to use as the brain file. Table 1 shows a sample of the data in the brain file used in this study.

Table 1. Sample data from the brain file
No | Question | Answer
... | ... | ... They watch movies together on weekends.
4 | What is present continuous tense | It is a tense that shows the action or state that is taking place at the time of speaking. This tense is also known as the present progressive tense.
5 | Example of a present continuous tense sentence | a) I am eating breakfast right now. b) They are watching a movie together. c) We are meeting at the park tomorrow morning.
... | ... | It is one of the tenses in English that refers to an action or event that began in the past and has lasted until now, and whose effects are still felt in the present.
For the system to work well, the data must be processed before it is used.
The system receives input in the form of a question. The input then goes through text preprocessing so that the text search is easier to perform. Text preprocessing is carried out in three stages, as shown in Figure 2.

Figure 2. Data Processing
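A minimal sketch of such a three-stage preprocessing step is given below. The text does not name the three stages, so the ones used here (case folding, punctuation removal, and tokenization) are an assumption based on common practice, and the function name `preprocess` is hypothetical. Python is used for brevity, although the system itself is described as being built in JavaScript.

```python
import re


def preprocess(question: str) -> str:
    """Normalize a user question before keyword search (assumed stages)."""
    # Stage 1 (assumed): case folding, so matching is case-insensitive.
    text = question.lower()
    # Stage 2 (assumed): replace punctuation and other symbols with spaces.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # Stage 3 (assumed): tokenize on whitespace and rejoin with single spaces.
    tokens = text.split()
    return " ".join(tokens)


print(preprocess("  What is Present Continuous Tense?? "))
```

For the sample input, the call produces `what is present continuous tense`, which can then be compared directly with lowercase keywords in the brain file.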

Boyer Moore Algorithm
The Boyer Moore algorithm is one of the most efficient string matching algorithms available because, in favorable cases, it finds matches in sub-linear time, that is, without examining every character of the text. It achieves this by comparing the key string with the text from right to left, starting at the last character of the key string. In case of a mismatch, the key string is shifted to the right by a pre-calculated number of characters, so that the mismatched text character lines up with a matching character in the key string or the key string moves past it entirely. Comparison then resumes from the right end of the key string at its new position. Since the length of the key string and the position of the current character are known, the positions at which a match of the key string can still occur can be calculated [20]. The pseudocode of the Boyer Moore algorithm is as follows:
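The search can be sketched in Python as shown below. This is a simplified version that uses only the bad-character shift rule; the full Boyer Moore algorithm also uses a good-suffix rule, which is omitted here. The function names are illustrative, and Python is used for brevity although the system itself is described as being built in JavaScript.

```python
def build_bad_char_table(pattern: str) -> dict:
    """Map each character to the index of its last occurrence in the pattern."""
    return {ch: i for i, ch in enumerate(pattern)}


def boyer_moore_search(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    last = build_bad_char_table(pattern)
    s = 0  # current alignment of the pattern against the text
    while s <= n - m:
        j = m - 1
        # Compare from the rightmost pattern character toward the left.
        while j >= 0 and pattern[j] == text[s + j]:
            j -= 1
        if j < 0:
            return s  # every character matched
        # Bad-character rule: align the mismatched text character with its
        # last occurrence in the pattern, always moving at least one step.
        s += max(1, j - last.get(text[s + j], -1))
    return -1


print(boyer_moore_search("what is present continuous tense", "present"))
```

In the sample call, the key string `present` begins at index 8 of the text. When the mismatched text character does not occur in the key string at all, the key string jumps past it in a single shift, which is where the savings over a character-by-character scan come from.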

RESULTS AND DISCUSSIONS
The system consists of two parts: the admin part and the user part. The admin part consists of a dashboard page, a user data management page, a question data page, and a feedback page. The user part consists of a login page, a register page, and a main page. The Boyer Moore algorithm is used on the user's main page: when the user enters a question in the conversation dialog, the question is processed and its words are matched against the brain file to find the appropriate answer. Figure 3 shows the user login interface. Users who have not yet entered the website are directed to the login page. The email and password entered must be registered in the database in order to reach the main page of the website. Figure 4 shows the registration page, where new users register by filling in a username, email, and password so that they can log in and enter the website. Figure 5 shows the user's main page, where users can ask the bot questions. This page displays the conversation between the user and the bot, and every question and answer is saved as conversation history.
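The lookup performed on the main page can be sketched roughly as follows. This is an illustration under several assumptions, not the system's actual code: the brain file is represented as a hypothetical list of (keyword, answer) pairs using entries from Table 1, Python's built-in `str.find` stands in for the Boyer Moore search, the longest matching keyword is preferred, and the fallback message is invented. Python is used for brevity, although the system itself is described as being built in JavaScript.

```python
# Hypothetical in-memory brain file: (keyword, answer) pairs.
BRAIN = [
    ("what is present continuous tense",
     "It is a tense that shows the action or state that is taking place "
     "at the time of speaking."),
    ("example of a present continuous tense sentence",
     "I am eating breakfast right now."),
]

# Invented fallback reply for questions with no matching keyword.
FALLBACK = "Sorry, I do not understand the question."


def answer(question: str, brain=BRAIN) -> str:
    """Return the answer whose keyword occurs in the question, if any."""
    text = " ".join(question.lower().split())
    best = None
    for keyword, reply in brain:
        # str.find stands in for the Boyer Moore search in this sketch.
        if text.find(keyword) != -1:
            # Assumed tie-break: prefer the longest matching keyword.
            if best is None or len(keyword) > len(best[0]):
                best = (keyword, reply)
    return best[1] if best else FALLBACK


print(answer("What is present continuous tense?"))
```

A question that contains none of the stored keywords, such as a question about the weather, returns the fallback reply in this sketch.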

Discussion
The accuracy of the chatbot in providing answers that match the user's questions was tested using a Confusion Matrix. The test used 100 test items. The test outcomes are categorized as follows:
TP: Questions about tenses that are answered correctly by the chatbot.
TN: Questions that are not about tenses and are answered incorrectly by the chatbot.
FN: Questions about tenses that are answered by the chatbot, but with the wrong result.
FP: Questions about tenses that cannot be answered by the chatbot.
Based on the above calculation, the application of the Boyer Moore algorithm for string matching in the English learning chatbot application has an accuracy of 96%. The accuracy is influenced by several factors, for example the length and complexity of the strings being matched. If the chatbot application contains many long questions or phrases with complex structures, the Boyer Moore algorithm may have difficulty matching the right words or phrases quickly and accurately. Ambiguous keywords and writing or spelling errors can also reduce accuracy. Therefore, good implementation, selection of an efficient indexing method, and the size and quality of the dataset all play an important role in improving the accuracy of the Boyer Moore algorithm in English learning chatbot applications.
To interpret the accuracy obtained in this research, it is important to compare it with the accuracy achieved by similar earlier research. The comparison of the accuracy results of the algorithms used can be seen in Table 3. Based on these three results, the implementation of the Boyer Moore algorithm in chatbots shows varying levels of accuracy, namely 79%, 96%, and 99.41%. The implementation in the Medicinal Plants chatbot achieved the highest accuracy, while the application for Ustaz Abdul Somad had a lower accuracy. However, this information is not sufficient to evaluate the performance and effectiveness of each chatbot application comprehensively, because accuracy is only one of the factors by which a chatbot can be evaluated. Other aspects also need to be considered, such as response speed, the ability to understand complex questions, and suitability to a particular context or domain.

CONCLUSION
After planning, implementing, and testing the application, the following conclusions are drawn:

1. A chatbot application for English language learning has been successfully built and successfully answers questions related to the 16 tenses using the Boyer Moore algorithm.

2. Based on the results of Confusion Matrix testing, the chatbot's accuracy reached 96%. The number of keywords provided in the brain file affects the accuracy of the chatbot: the more keywords provided, the more relevant the chatbot's responses.