Improving Moodle Performance Using HAProxy and MariaDB Galera Cluster

Abstract: Moodle is a Learning Management System widely used by educational institutions worldwide. However, reports on internet forums frequently describe performance degradation when many users access Moodle simultaneously. One of the most resource-intensive components supporting Moodle is the database, as all user-accessed data is stored in it. This study aims to optimize Moodle's performance through a distributed database. Spreading the database over multiple database servers allows the database load to be shared among them, resulting in an overall improvement in Moodle performance. This study compares the performance of Moodle installed on a single server with that of Moodle backed by multiple database servers. Three testing parameters are measured with the server performance plugin available in Moodle: course read, course write, and database performance. The results show a performance improvement of 384% in course read, 193% in course write, and 260% in the Moodle database in the multi-server scenario compared to the single-server scenario. These results support the view that the database is the most crucial component of Moodle.


I. INTRODUCTION
With the shift toward online learning in response to the COVID-19 pandemic, the Learning Management System (LMS) has experienced substantial growth in use over the past few years [1], [2]. An LMS is a software application designed to streamline the distribution, administration, and monitoring of educational resources and information. It allows educators to create and manage online courses, engage in interactive sessions with their students, and monitor the progress and performance of each student from a centralized location. The LMS acts as a hub that combines educational elements such as course materials, assessments, and communication tools, offering a single platform for delivering learning experiences.
The significance of the LMS became even more apparent during the COVID-19 pandemic, when it played a critical role in ensuring the continuity of learning. Its features allowed educational institutions to move swiftly from traditional in-person teaching to online delivery. In this context, the LMS became a cornerstone of remote learning environments, enabling educators and students to manage the complexities of virtual education.
The LMS landscape presents various platforms, each characterized by a different licensing model, from free and open source to free-to-use and freemium structures, where advanced features require payment. LMS platforms with free and open-source licenses require users to provide their own server infrastructure and computer networks, commonly called on-premise infrastructure. Installation and configuration are undertaken independently, although support is available from the LMS user community. Under this model, user institutions take full responsibility for all operational aspects of the LMS, from IT infrastructure to LMS management. This approach grants institutions greater autonomy and control over the LMS, allowing them to tailor the system to their needs. However, it also requires technical expertise and resources to manage and maintain the on-premise infrastructure effectively. Moodle, Canvas, and Open edX are LMS platforms operating under open-source licenses, so users are free to use them without cost and adapt them to the specific requirements of their institutions.
Users who cannot provide the supporting infrastructure for an LMS, such as servers and adequate computer networks, can use one of the many free LMS providers, such as Google Classroom, Edmodo, or Schoology [3]. These LMS platforms run under a free license, enabling users to access them without a fee; however, users are often prevented from modifying the main LMS functionality. Because they are delivered under the Software as a Service (SaaS) model, users can leverage the LMS features without handling the underlying technical infrastructure.
For more advanced users, there are several LMS platforms with freemium licenses. These platforms offer basic LMS functionality for free, albeit with certain limitations, and provide specific features under a paid model. With this licensing model, users are relieved of the need to manage LMS-supporting infrastructure and to employ expert staff for LMS configuration. Users can immediately use the free basic LMS features and pay only when they require advanced features. Blackboard and Classe365 are examples of LMS platforms that operate under a freemium license. The variety of LMS licenses gives educators numerous options to optimize their teaching processes without requiring intricate technical skills to manage the LMS.
Moodle is one of the LMS platforms utilized by educational institutions worldwide [4]. One reason is that it is licensed under a free and open-source model, enabling further development. Moodle also has a robust community support system, making it easier for new users to learn their way around the platform. Moodle is one of the most adaptable LMS platforms because it has many community-created plugins, which make it one of the few LMSs that can accommodate nearly unlimited development options. Moodle provides tools for tracking student progress, providing feedback, and integrating multimedia resources such as videos, audio, and interactive content [5].
According to findings from an earlier study, the database's performance considerably impacts Moodle's overall performance [6]. The Moodle database stores the system settings, user information, and course materials. Consequently, Moodle's performance and speed can be substantially hindered by any lag or other problem in the database. Database performance is influenced by factors such as the size and complexity of the Moodle site, the number of users who access the site, and the number and complexity of the queries made on the database. When many users visit Moodle simultaneously, it sends many queries to the database. These queries are placed in a queue, and the processing capability of the database server determines how quickly the queue is worked through. The time needed to complete each query differs based on the data the user is looking for. When a query takes a considerable amount of time to execute, the total time the other queries in the queue must wait increases. If the time spent in the queue exceeds the maximum time the server allows for executing queries, the query is terminated. Numerous aborted queries lead to a situation where no data is presented on the user's screen (the white screen of death), substantially degrading the user experience provided by Moodle.
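The queueing effect described above can be illustrated with a minimal sketch. It assumes a single server that processes queries in arrival order, that all queries arrive at the same moment, and a hypothetical `max_wait` limit; real database servers apply several different timeouts, so this is a simplification rather than a model of MariaDB's actual behavior.

```python
from collections import deque


def simulate_queue(service_times, max_wait):
    """Process queries in FIFO order on one server.

    All queries are assumed to arrive at time 0. A query is aborted when
    the time it has waited behind earlier queries exceeds max_wait.
    """
    clock = 0.0
    completed, aborted = [], []
    queue = deque(enumerate(service_times))
    while queue:
        query_id, service_time = queue.popleft()
        waited = clock  # time spent waiting behind earlier queries
        if waited > max_wait:
            aborted.append(query_id)
            continue  # an aborted query consumes no server time
        clock += service_time
        completed.append(query_id)
    return completed, aborted


# One slow query (5.0 s) ahead of fast ones pushes later queries past the limit.
completed, aborted = simulate_queue([0.2, 5.0, 0.2, 0.2], max_wait=3.0)
print("completed:", completed, "aborted:", aborted)
```

In this sketch, the two short queries behind the slow one are aborted even though each would take only 0.2 seconds, which is the mechanism behind the white screen of death described above.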
In the Moodle online forums, people frequently complain that the performance of Moodle suffers when it is accessed simultaneously by many users. Database performance is essential to overall Moodle performance since Moodle stores user-generated activity in the database. When database optimization has been carried out but the performance issue continues, upgrading the Moodle server is the remaining way to overcome performance degradation and restore optimal functionality. There are two models for scaling or upgrading a server: horizontal and vertical scaling.
Vertical scaling, often called "scaling up," increases the capacity of a single server by adding more resources to it, such as processing power, memory, or storage, to accommodate growing demand. While vertical scaling is a straightforward approach, it has limitations. A single server can only be upgraded so far before hardware limits are reached. Additionally, if the server fails, significant downtime can result, making it a less resilient option.
On the other hand, horizontal scaling, or "scaling out," distributes the load across multiple servers [7]. Compared to vertical scaling, horizontal scaling offers further advantages, including load balancing and failover protection. In addition, cloud computing services commonly offer automatic server cloning and load-balancing configuration.
Load balancing is a critical aspect of horizontal scaling. It involves the efficient distribution of incoming network traffic or workload across multiple servers. This prevents any single server from becoming overwhelmed and ensures that resources are utilized optimally. Various algorithms, such as Round Robin, Least Connection, or IP Hash, are used in load balancing to determine how to distribute incoming requests. Failover protection is another significant advantage of horizontal scaling. In a multi-server environment, the remaining servers can still handle the load if one server fails. This enhances the system's overall reliability and minimizes downtime.
By utilizing these characteristics, a server can be cloned automatically and load balancing configured to distribute traffic when required. With such automatic scaling features, Moodle can retain its performance even when subjected to heavy demand [8].
In addition to enhancing server performance, horizontal scaling ensures high server availability, meaning an application will continue functioning on other servers if one server goes down [9]. A load balancer is required to distribute query requests across multiple database servers; it spreads the workload among the servers. Widely used server load-balancing software includes NGINX, HAProxy, and Zevenet [10], [11]. Several algorithms are available for distributing the load, including Round Robin, Least Connection, IP Hash, Generic Hash, Least Time, and Random. The Round Robin algorithm distributes the workload evenly among all cluster members in cyclic order, ensuring an equal share for each server. However, this algorithm may be less than optimal when the servers have varying specifications, as the cyclic load distribution can overload the weaker servers. In the Least Connection algorithm, each new request is sent to the server with the fewest active connections [12]. The rationale behind this approach is to maintain optimal server performance for serving new users. However, this algorithm is limited, as it considers only the number of user connections to the server and not the server's workload. Even with a small number of users, a server may be engaged in resource-intensive tasks, such as simultaneously generating randomized quizzes, which can impact its performance.
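The difference between the two algorithms can be sketched in a few lines. The following is an illustrative simulation under stated assumptions, not the implementation used by any particular load balancer: one request arrives per time unit, each request holds its connection for a given number of time units, ties in the Least Connection picker go to the first server in the list, and the server names are hypothetical.

```python
from itertools import cycle


class RoundRobin:
    """Pick servers in fixed cyclic order, ignoring their current load."""

    def __init__(self, servers):
        self._cycle = cycle(servers)

    def pick(self, active):
        return next(self._cycle)


class LeastConnection:
    """Pick the server with the fewest active connections (first on ties)."""

    def __init__(self, servers):
        self._servers = list(servers)

    def pick(self, active):
        return min(self._servers, key=lambda server: active[server])


def dispatch(balancer, servers, durations):
    """Assign one request per tick; each holds a connection for `duration` ticks."""
    active = {server: 0 for server in servers}
    running = []  # (finish_tick, server) for every open connection
    assignment = []
    for tick, duration in enumerate(durations):
        # Close connections that have finished by this tick.
        still_running = []
        for finish, server in running:
            if finish <= tick:
                active[server] -= 1
            else:
                still_running.append((finish, server))
        running = still_running
        server = balancer.pick(active)
        active[server] += 1
        running.append((tick + duration, server))
        assignment.append(server)
    return assignment


servers = ["db1", "db2", "db3"]   # hypothetical server names
durations = [10, 2, 2, 2, 2, 2]   # the first request is a long-running one
rr = dispatch(RoundRobin(servers), servers, durations)
lc = dispatch(LeastConnection(servers), servers, durations)
print("round robin:     ", rr)
print("least connection:", lc)
```

In this run, Round Robin sends the fourth request to `db1` while it is still busy with the long request, whereas Least Connection keeps routing new requests to the two idle servers. Neither picker looks at how expensive a connection is, which is the limitation noted above.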
The Moodle database maintains all of the data, such as course materials, assignments and their grades, and exams and their grades. As a result, the database is a crucial component of Moodle. Every action that Moodle users take is connected in some way to the database, whether they are retrieving data from it or adding data to it. These user operations are executed as database queries. Compared to information systems that are not LMSs, Moodle's queries are more complex since they require data from many different tables. As a result, they take longer to process [13].
In practical applications, using Moodle to facilitate the learning process often leads to numerous users accessing the system concurrently. This influx of users generates a queue of queries awaiting processing in the database. The processing time depends on the complexity of the data requested by users; more complex queries need longer processing periods. Because new queries keep arriving, there is a critical need to prevent system overload. To address overload, the database server has an automatic termination mechanism that targets queries with extended processing durations or prolonged wait times in the queue.
From the end user's perspective, this mechanism has a significant impact on the experience. The user interface might not work as well as it should, the responsiveness of the Moodle system might deteriorate, and there might be more errors in the system. For instance, when an instructor requests information about student grades or a student submits an assignment, Moodle executes queries to retrieve or modify the corresponding data in the database. When the queries involve multiple tables, delays can occur during periods of high user engagement. Because of the large number of concurrent user requests, the system might have trouble executing queries in a timely manner, which delays retrieving or storing data. This delay can significantly damage the overall user experience, causing dissatisfaction and disrupting learning.
Enhancing the performance of Moodle goes beyond optimizing the web server, particularly considering the critical role of the database, which is one of the most frequently accessed components. Recognizing this, some databases come equipped with features such as replication or multi-server capabilities to address the demands of high-concurrency environments. The MySQL database, for instance, provides a replication feature. This functionality allows the MySQL database to be installed across multiple servers, with configuration options such as master-master or master-slave setups. In a master-master configuration, each server functions as both a master and a slave, so data is replicated in both directions. The master-slave configuration, on the other hand, designates one server as the master, which handles write operations, while the others act as slaves that replicate data from the master. These configurations contribute to improved performance by distributing the database workload, ensuring redundancy, and enhancing fault tolerance [14].
Galera Cluster is a feature of MariaDB that allows users to establish a distributed database across multiple machines. Data is synchronized between database servers using Galera Cluster, similar to MySQL's master-master feature. This method produces a highly available server environment in which the provided services remain accessible even if one of the servers has problems [15]. Earlier database technology used the master-slave technique to handle large database transactions. This technique replicates the database to multiple servers, where the slave databases receive updates from the master database but not vice versa [16]. To reduce database load, the application is set to read data from the slave databases while writing and updating data on the master database. Galera Cluster, in contrast, replicates the database with master-master status. With this technique, data can be written to any database server, after which the database servers synchronize. Thus, a high-availability environment is created in which the system continues running on another server when one database server experiences problems [17].

II. RELATED WORK
While research on using Moodle as a Learning Management System is common, relatively little research is available on optimizing Moodle's performance when it is accessed by many concurrent users.
Efforts to reduce the load on Moodle when accessed by many concurrent users have previously been made through rule-based approaches [18]. That study proposes several rules to mitigate the workload on Moodle. These rules include introducing time intervals between the start of lectures for different faculties, avoiding the simultaneous use of examination features by numerous users, and implementing other regulations to prevent concurrent access to Moodle at the same time. The study also recommends several strategies to alleviate the burden on Moodle: implementing flexible exam scheduling so that students can take exams asynchronously, pre-generating randomized questions to reduce Moodle's need for real-time question randomization, transitioning to asynchronous learning modes, and conducting management activities such as class backups and student enrollment during semester breaks, when there is no Moodle-based learning activity. Implementing these recommendations reduced the Moodle workload by up to 20%. Removing the old Moodle system after upgrading to a newer version led to storage space savings of approximately 10% to 15%. It is also essential to monitor the plugins used in the learning process so that unused plugins can be removed.
Load distribution among multiple servers to enhance Moodle performance has also been implemented previously [19]. In that research, Moodle uses three separate servers: the web server, the application server, and the database server. Performance is measured by comparing server performance across various scenarios: (1s, 1s, 1s), (1s, 1s, 2s), (1s, 1s, 1h), and (2s, 2s, 2s). The three entries give the specifications of the web server, application server, and database server, respectively. Here "s" signifies the standard server specification (2-core vCPU, 2 GB RAM, 20 GB storage), while "h" represents the high server specification (4-core vCPU, 4 GB RAM, 40 GB storage). The testing used the Apache Bench and Sysbench tools to measure server performance in each scenario. The best performance was achieved in the scenario (1s, 1s, 1h), where the database server had higher specifications than the web server and application server. This shows that the database server handles the majority of the workload in the Moodle system, and that enhancing its hardware specifications has a positive impact on overall Moodle performance.
Efforts to enhance Moodle's performance using cloud computing technology have also been undertaken. One study migrated Moodle from the university's local server to the Microsoft Azure cloud computing platform [20]. The study used a Virtual Machine Scale Set (VMSS) with automatic scaling features. The VMSS instances had two vCPU cores, 7 GB RAM, and 128 GB SSD storage, running the Ubuntu 18.04 LTS operating system. A load balancer was set up to distribute traffic evenly across all servers. For the database server, Azure Database for MySQL was used with 4 CPU cores and 20 GB of RAM. All servers were placed in the same West Europe Azure region to avoid data transfer between distant servers, which could result in high latency. The testing used a quiz with 20 multiple-choice questions, displayed five questions per page. The quiz duration was set to 20 minutes, and 300 users took the quiz simultaneously. The study indicates that cloud computing technology can significantly improve Moodle's performance compared to on-premise server hosting. No issues were encountered in the cloud computing tests, in contrast to the on-premise server tests, where problems were consistently experienced. The server specifications show that the database server had higher specifications than the web server. This design choice lets the database handle requests efficiently from the web servers, which can be replicated according to the number of users accessing the system.
Quality of Service is one of the most crucial factors in information systems. Moodle's Quality of Service has been studied from the network point of view using Software Defined Networking (SDN) [21]. SDN is a relatively new networking technology used in cloud computing; with SDN, networks can be managed virtually through a programming language. That research prioritizes network traffic for Moodle over other types of traffic in the network. Prioritization is done through a special plugin called SDN4Moodle, which communicates with the SDN to prioritize traffic for Moodle based on Moodle's response to user activity. The results show that SDN4Moodle can prioritize network traffic for Moodle and increase Moodle's Quality of Service.
Moodle was essential to academic activities as a Learning Management System, especially during the COVID-19 pandemic, which is why Moodle needs high availability and performance [22]. That research focuses on keeping Moodle accessible most of the time by using clustering technology. It uses redundant servers for the web server, database server, and file server under the coordination of a load balancer, which distributes traffic to the web, database, and file servers. Since the hardware specifications of the servers are identical, the load balancer uses a round-robin algorithm to distribute traffic evenly to all the servers. Galera Cluster is employed to keep data synchronized among the database servers, and rsync is used to keep files synchronized among the file servers. This setup distributes the load over many servers, resulting in a performance boost. It also provides high availability: if a server is down due to a technical issue, user requests are served by another server in the pool. Users therefore do not encounter system errors even when there is a technical issue in the server infrastructure, which improves the user experience.
Efforts to improve Moodle's performance have thus used various techniques. Using many servers to distribute user load is a proven method, but it increases the IT budget, mainly for upgrading the server infrastructure. Previous research shows that Moodle's performance relies most heavily on the database. This research therefore focuses on improving Moodle's performance by distributing the database server alone rather than all the servers needed by Moodle (including web servers and file servers). This technique limits spending on IT infrastructure while still increasing Moodle's performance.
This research aims to improve the performance of Moodle in an on-premises environment by distributing the database workload across multiple servers. The database load is distributed using HAProxy, a popular open-source load balancer that distributes incoming traffic across multiple servers to improve application performance, scalability, and reliability. HAProxy offers several load-balancing methods, each with its own advantages and use cases [23]. This research uses the least connection method.
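To make the setup concrete, the sketch below generates an HAProxy `listen` section that balances TCP connections to a set of database nodes with `balance leastconn`. The section name, bind address, node names, and IP addresses are hypothetical placeholders, and the paper does not publish its actual HAProxy configuration, so this is only one plausible shape for it.

```python
def render_haproxy_listen(name, bind, nodes):
    """Return an HAProxy `listen` section that uses least-connection balancing.

    `nodes` is a list of (server_name, "host:port") pairs.
    """
    lines = [
        f"listen {name}",
        f"    bind {bind}",
        "    mode tcp",            # MariaDB traffic is proxied as plain TCP
        "    balance leastconn",   # send new connections to the least busy node
    ]
    for node_name, address in nodes:
        # `check` enables HAProxy's periodic health check for the node.
        lines.append(f"    server {node_name} {address} check")
    return "\n".join(lines) + "\n"


# Hypothetical node names and addresses, for illustration only.
db_nodes = [
    ("db1", "10.0.0.11:3306"),
    ("db2", "10.0.0.12:3306"),
    ("db3", "10.0.0.13:3306"),
]
haproxy_section = render_haproxy_listen("galera_cluster", "127.0.0.1:3306", db_nodes)
print(haproxy_section)
```

Binding to a local address reflects a layout in which the load balancer runs on the same machine as the web server, so Moodle would connect to the local port and HAProxy would forward each connection to one of the database nodes.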
This research uses the Galera Cluster feature in MariaDB to maintain data consistency among the database servers. Galera Cluster is a database clustering solution that ensures the data on all servers remains consistent [24]. It supports multi-master replication, meaning every database server can serve all user requests (Insert, Update, and Delete) [25]. Galera Cluster automatically synchronizes data updates among the database servers.
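As an illustration of what a multi-master node needs to know, the sketch below renders a `[galera]` option-file section for one node from a list of cluster members. The cluster name and IP addresses are hypothetical, the provider library path is an assumption that varies by distribution, and the paper does not list the settings it actually used.

```python
def render_galera_config(cluster_name, node_address, all_nodes,
                         provider="/usr/lib/galera/libgalera_smm.so"):
    """Return a [galera] option-file section for one cluster node.

    The default `provider` path is an assumption; it differs between
    distributions and must match the installed Galera library.
    """
    settings = {
        "wsrep_on": "ON",
        "wsrep_provider": provider,
        "wsrep_cluster_name": cluster_name,
        # Every node lists all cluster members so it can join through any of them.
        "wsrep_cluster_address": "gcomm://" + ",".join(all_nodes),
        "wsrep_node_address": node_address,
        "binlog_format": "ROW",
        "default_storage_engine": "InnoDB",
        "innodb_autoinc_lock_mode": "2",
    }
    body = "\n".join(f"{key}={value}" for key, value in settings.items())
    return "[galera]\n" + body + "\n"


# Hypothetical addresses, for illustration only.
galera_nodes = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
galera_section = render_galera_config("moodle_cluster", "10.0.0.11", galera_nodes)
print(galera_section)
```

Each node would receive the same section with only `wsrep_node_address` changed, which mirrors the symmetric role of the servers in multi-master replication.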
By using HAProxy and Galera Cluster in MariaDB, this research is expected to improve Moodle's performance by speeding up access to the database and minimizing the time required to send data between servers. Additionally, distributing the database workload across multiple servers can improve the scalability of Moodle, allowing the platform to handle a larger number of users.

III. RESEARCH METHOD
This study employs a comparative research design to evaluate the performance of the Moodle learning management system in two distinct server configurations: a single-server environment and a multiple-server (cluster) environment. Data collection involves deploying multiple Moodle instances, each configured to simulate real-world usage scenarios. Database-related performance metrics are systematically measured, including course read performance, course write performance, and database performance. Simulated usage scenarios representing various levels of user activity are created to assess the impact of server configuration on Moodle's performance.
Data is collected using Moodle's performance measurement plugin. Course write performance measures how Moodle performs when creating new courses and their content. Course read performance measures how Moodle performs when presenting a course and its content to the user. Database performance measures the overall performance of the database used by Moodle in responding to user requests.
In the single-server scenario, Moodle is installed on one server that acts as web, database, and file server; all supporting services run on the same machine and share its resources. Figure 1 shows the topology of the single-server scenario. In the multi-server scenario, Moodle is installed on one server providing the web server and file server services, while the database runs on separate servers using MariaDB's Galera Cluster feature. A load balancer is installed on the web server to distribute database access evenly. This approach ensures that all database servers receive a comparable workload, preventing the overloading of any single server. Figure 2 shows the topology of this scenario.

IV. RESULT
We conducted 300 tests for each scenario to explore different server loads, yielding the average outcomes summarized in Tables 1 and 2. Our examination focused on the performance of reading courses, writing courses, and the database, measured with the Moodle LMS server performance function. The values presented in the tables are the durations of each process in seconds. This testing approach allows us to compare the efficiency of the Moodle system under various conditions and provides insight into its responsiveness and robustness under distinct workloads. Course write measures how long Moodle takes to save data when users provide input. For example, when users submit assignments or complete quizzes, the results must be saved. The duration of the writing process varies between 0.1 and 0.4 seconds, as shown in Table 1. This variation is influenced by factors such as the length of the processing queue that Moodle needs to handle and simultaneous data writing by multiple users, especially during activities like quizzes whose submissions conclude at the same time. These factors cause write commands to queue, which in turn results in longer writing times. Meanwhile, the "database" parameter in Table 1 represents the time needed by the database to handle user requests, both for reading and writing data. In a single-server scenario with only one database server, all activities are serviced by a single database, potentially leading to a processing queue with an average processing time of 0.2 seconds.
In the multi-server scenario, performance improved across all parameters (course read, course write, and database) compared to the single-server scenario. The average test results for the multi-server scenario are presented in Table 2. For the "course read" parameter, the average result was 0.1 seconds. Similarly, the average value for the "course write" parameter was 0.09 seconds, and for the "database" parameter, the average value was also 0.09 seconds. The results indicate that using multiple database servers can improve the performance of Moodle LMS compared to a single-server configuration. The averages of the values in Table 1 and Table 2 are compared parameter by parameter in the following figures. Figure 3 shows the average times for the "course read" parameter. On average, the "course read" time is 0.42 seconds in the single-server scenario and 0.11 seconds in the multi-server scenario. This is a performance improvement of 384% in the multi-server scenario compared to the single-server scenario.
The average value for the "course write" parameter is 0.15 seconds in the single-server scenario and 0.07 seconds in the multi-server scenario, as depicted in Fig. 4. Here, the multi-server setup improves course write performance by 193%. The last parameter compared is database performance. In the single-server scenario, the average database time is 0.25 seconds, whereas in the multi-server scenario it is 0.09 seconds, as shown in Fig. 5, a database performance improvement of up to 260%. When these parameters are combined, the Moodle performance is 0.27 seconds in the single-server scenario and 0.09 seconds in the multi-server scenario. This is an improvement in Moodle performance of 279% compared to the single-server scenario, as shown in Fig. 6.
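The text does not state the formula behind the improvement percentages. One reading, offered here only as an assumption, is the ratio of the single-server time to the multi-server time expressed as a percentage. The sketch below applies that reading to the rounded averages quoted in the text.

```python
def time_ratio_percent(single_seconds, multi_seconds):
    """Single-server time as a percentage of multi-server time.

    This formula is an assumption; the source does not define how its
    improvement percentages were calculated.
    """
    return single_seconds / multi_seconds * 100


# Rounded average times in seconds, as quoted in the text: (single, multi).
rounded_averages = {
    "course read": (0.42, 0.11),
    "course write": (0.15, 0.07),
    "database": (0.25, 0.09),
}

ratios = {
    name: time_ratio_percent(single, multi)
    for name, (single, multi) in rounded_averages.items()
}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.0f}%")
```

With the rounded averages, this reading gives roughly 382%, 214%, and 278%, which is close to the reported 384% for course read but only approximates the reported 193% and 260%. The reported figures may have been derived from unrounded measurements or from a different formula.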

V. CONCLUSION
This study aimed to optimize Moodle performance by distributing the database over multiple servers. HAProxy handled load distribution with the least-connection algorithm, and the Galera Cluster feature of MariaDB handled data synchronization. The Moodle performance plugin was used to compare course read, course write, and database performance. The results show that distributing the database over multiple servers gives an advantage over a single server, especially at peak access times when Moodle is accessed by many users simultaneously. Course read performance is 384% faster in the multiple-server scenario than in the single-server scenario, and course write performance is 193% faster. Most importantly, database performance improves by up to 260% compared to the single-server scenario. Overall Moodle performance is up to 297% faster in the multiple-server scenario. This research indicates that the database is the most critical part of Moodle and that distributing the database over multiple servers substantially improves overall Moodle performance.

From Table 1, it can be observed that the average time to display the requested course content is approximately 0.4 seconds. Although this appears relatively fast, a processing queue can form when numerous users request the display of course content simultaneously (for instance, at the scheduled start of a class). If this queue becomes excessively long, it can result in the "white screen of death." The "write" parameter in the table represents the time it takes for Moodle to save data when there is input from users.

Table 2. Performance Test Results for the Multiple-Server Scenario