Molecular Dynamics Simulation of Bioactive Compounds Against Six Protein Target of Sars-Cov-2 As Covid-19 Antivirus Candidates

Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) is the virus that causes Coronavirus 2019 (COVID-19). To date, there has been no proven effective drug for the treatment or prevention of COVID-19. A study on developing inhibitors for this virus was performed using molecular dynamics simulation. 3CL-Pro, PLPro, Helicase, N, E, and M protein were used as protein targets. This study aimed to determine the stability of the selected protein-ligand complex through molecular dynamics simulation by Amber20 to propose bioactive compounds from natural products that have potential as a drug for COVID-19. Based on our previous study, the best value of free binding energy and protein-ligand interactions of the candidate compounds are obtained for each target protein through molecular docking. Corilagin (-14,42 kcal/mol), Scutellarein 7-rutinoside (-13,2 kcal/mol), Genistein 7-O-glucuronide (-10,52 kcal/mol), Biflavonoid-flavone base + 3O (-11,88 and -9,61 kcal/mol), and Enoxolone (-6,96 kcal/mol) has the best free energy value at each protein target. In molecular dynamics simulation, the 3CL-Pro-Corilagin complex was the most stable compared to other complexes, so that it was the most recommended compound. Further research is needed to test the selected ligand activity, which has the lowest free energy value of the six target proteins.


INTRODUCTION
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) is the virus that causes Coronavirus disease 2019 , which became the center of global attention in 2020. To date, there is no proven effective cure or prevention of COVID-19, so that treatment only focuses on the symptoms that appear. SARS-CoV-2 has several structural and nonstructural proteins that play a role in virus development. The understanding of the proteins found in SARS-CoV-2 is based on various previously reported studies on SARS-CoV. Understanding the proteins present in these viruses allows a more rational approach to designing more effective antiviral drugs (Yoshimoto, 2020). Papain-like proteinases (PL-Pro) and Chymotrypsin-like proteinases (3CL-Pro) are essential for translating polyproteins of viral RNA. Inhibiting the activity of this enzyme will block viral replication. The recombinant helicase protein has several enzymatic activities, so that inhibition of the activity of this protein can also prevent the replication of viral RNA. Nucleocapsid (N), envelope (E), and membrane (M) proteins are structural proteins of the virus. The N protein plays a role in producing viral RNA into a helical nucleocapsid after translation and replication. Protein E has been shown to play a significant role in virus formation. The M protein is a triple integral membrane protein with a short ectodomain and a large carboxyl-terminus endo-domain (Tan, Lim, & Hong, 2005). Based on this report, the six proteins described were the target proteins in this study.
Natural compounds can be an alternative in developing antiviral drugs, including COVID-19 drugs (Kapoor, Sharma, & Kanwar, 2017). As a first step in screening bioactive compounds that can have activity against SARS-CoV-2, a study on developing inhibitors for this virus was carried out using the molecular docking method. Our previous study obtained the best value of free binding energy and protein-ligand interactions of the candidate compounds for each target protein.
This study carried out the molecular dynamics of the selected natural compounds against several SARS-CoV-2 receptor proteins as part of the search for COVID-19 drug candidates. In addition, this study aimed to determine the stability of the selected proteinligand complex through molecular dynamics simulation. Moreover, its binding energy through Molecular Mechanics Energies with The Poisson-Boltzmann Surface Area Method (MM/PBSA) suggests natural compounds with potential as COVID-19 drugs. The MM-PBSA method is one of the most widely used methods to calculate the interaction energy between protein-ligand complexes. MM-PBSA can decode significant conformational fluctuations and entropic contributions to binding energies from molecular dynamics simulations (Gogoi, B., Chowdhury, P., Goswami, N., Gogoi, N., Naiya, T., Chetia, P., Mahanta, S., Chetia, D., Tanti, B., Borah, P., Handique, 2021).

Preparation of Protein and Ligands
The selected protein-ligand complexes carried out molecular dynamics simulations with the AMBER20 program. The complex was processed using Pdb4amber to remove water and hydrogen from the complex. Furthermore, pK values of ionizable groups in amino acid residues were computed using the H++ website at pH 7.0. (http://biophysics.cs.vt.edu/) (Anandakrishnan, Aguilar, and Onufriev 2012). These files are used for protein preparation with the script ambpdb, pdbamber. The complex file was parameterized by tLEaP using the FF14SB amber force field for proteins and ligands using the GAFF2 force field. Truncated octahedron explicit solvent box with the TIP3P water model was used and neutralized using Na + /Cl -. The volume of box used for each protein is 786741.329 Å 3 (3CL-Pro); 1233189.281 Å 3 (PL-Pro); 1461473.610 Å 3 (Helicase); 388727.953 Å 3 (Nucleocapsid); 438913.512 Å 3 (Envelope); and 442944.119 Å 3 (Membrane).

Molecular Dynamic Simulation
Molecular dynamics simulation was carried out on AMBER20. This step included minimization, heating, equilibration, and production run. Minimization steps are divided into five stages, with the first four stages using restraint. Each minimization stage consists of 10000 cycles, with the first 500 steps uses Steepest Descent, and the rest uses the Conjugate Gradient algorithm. The heating step is carried out for one ns with the first 500 ps (picosecond) to increase the temperature to 300K linearly, then held for 500 ps. In the heating process, the amino acid residue is given a constraint of 10 kcal. mol -1 . A -2 . with NVT (constant number of particle, volume, and temperature) ensemble. The equilibration stage consists of 2 parts, the first is for the NVT ensemble, and the second is for the NPT ensemble. In the equilibration stage, the heating constraint is released gradually. Equilibration with Ensemble NPT was carried out for 500 ps at a constant temperature of 300 K. The next step is the production run simulation process using PMEMD.CUDA, which is carried out to see the free movement of molecules without restraint. This stage is run on an NPT (constant number of particles, pressure, and temperature) ensemble for 50 ns with a fixed temperature of 300K (Madej & Walker, 2020). The analysis, including Root Mean Square Deviation (RMSD) and Root Mean Square Fluctuation (RMSF), was performed using the CPPTRAJ package (Roe & Cheatham, 2013). The relative free energy binding was calculated using MM/PBSA method (Genheden & Ryde, 2015). Meanwhile, to visualize the structure and hydrogen bonds formed during the simulation process using VMD (Visual Molecular Dynamics).

Root Mean Square Deviation (RMSD)
The RMSD calculations results of molecular dynamics simulation for each protein-ligand complex are presented in Table  1. The RMSD value of the protein-ligand complex is represented on a graph of the RMSD value during a simulation time of 50 ns, as shown in Figure 1. RMSD is the average atomic displacement during simulation relative to a reference structure, usually the first frame of the simulation or crystallographic structure. RMSD is helpful for the analysis of the timedependent motion of structures. It is often used to distinguish whether a structure is stable in the simulated timescale or deviates from the initial coordinates (Martínez, 2015).  Based on table 1, the RMSD of the 3CL-Pro complex with Corilagin is the best value compared to other complexes. The complex has an average (1.41 Å) and the smallest minimum-maximum value range (0.68 -2.23 Å). This value indicates that the 3CL-Pro -Corilagin complex is the most stable complex among the others. Corilagin is tightly bound to the binding site of the 3CL-Pro protein, resulting in minimal atomic displacement from its original position. The PL-Pro -Scutelarein 7-rutinoside and Nucleocapsid -Biflavonoid-flavone Base + 3O complex also showed relatively low mean and minimum-maximum range values, so it can be assumed that the complex was also stable.
Meanwhile, the RMSD values in the Helicase-Genistein 7-O-glucuronide, Enoxsolone, and Biflavonoid-flavone + 3O membrane-base complexes showed a relatively high average value and an extended range of minimum-maximum values. This value indicates that the complex underwent a significant change or shift in position as the simulation progressed. A high RMSD value suggests a structural change during the simulation, which indicates an unstable structural quality.

Root Mean Square Fluctuation (RMSF)
RMSF showed protein stability indicated by the absence of sharp fluctuation spikes in the residues making up the target protein. In Figure 2, it can be seen that the residual RMSF values in the six target proteins. For example, the RMSF value of the 3CL-Pro-Corilagin complex (Figure 2a) shows a good graph because there are no fluctuations during the simulation process. Therefore, it indicates that the complex is relatively stable. Likewise, the Enoxolone -Envelope complex (Figure 2e) showed no fluctuations in the residue, so that it could be concluded that the complex was stable.
RMSF is the average deviation of each residue from the initial position during the simulation. RMSF is performed to observe the flexible region of a molecule. Residues with a high RMSF value indicate that the residue is highly fluctuating or unstable to the initial conformation. Conversely, if the RMSF value is low, it suggests that the residue is stiffer so that stability is achieved (Kuzmanic & Zagrovic, 2010).

MM/PBSA (Molecular Mechanics Energies with The Poisson-Boltzmann Surface Area Method)
MM/PBSA is a computational method that combines molecular mechanical energy and a complex solution model. The binding energy values for MM/PBSA produced for 250 ns in each complex are shown in Figure 3. The binding energy values for each complex are then averaged so that the binding energy values for MM/PBSA are obtained as listed in Table 2. In Table 2, it can be seen that the Helicase -Genistein 7-O-glucuronide complex has the lowest MM/PBSA binding energy value of -44.6297 kcal/mol. Then in the next order are 3CL-Pro -Corilagin (-43.2287 kcal/mol), Membrane -Biflavonoid-flavone Base +3O (-41.2458 kcal/mol), Nucleocapsid -Biflavonoid-flavone Base +3O (-31.3202 kcal/mol), PL-Pro -Scutellarein 7-rutinoside (-26.3772 kcal/mol), and finally Envelope -Enoxolone complex (-25.9479 kcal/mol). The binding energy calculation is also influenced by the number of amino acids that make up the protein because it is the total energy of all interacting residues. The more amino acid residues that make up the protein, the more negative the binding energy (Genheden & Ryde, 2015). Compared with the Helicase -Genistein 7-O-glucuronide complex with the 3CL-Pro -Corilagin complex, which has the best RMSD and RMSF results among other complexes, the Helicase -Genistein 7-Oglucuronide complex produces the lowest binding energy value because the Helicase is composed of 589 amino acid residues. In comparison, 3CL-Pro only consists of 306 amino acid residues. Then it can be seen that the binding energy value of the 3CL-Pro -Corilagin complex is more negative than PL-Pro -Scutellarein 7-rutinoside complex, where the amino acid residues of PL-Pro are more than 3CL-Pro. It indicates that the 3CL-Pro -Corilagin complex produces stronger interactions than the PL-Pro -Scutellarein 7rutinoside complex.

Hydrogen Bonds
Hydrogen bond analysis was conducted by observing the donor-acceptor pair between the target protein and the selected ligand and the hydrogen bond occupancy. Hydrogen bonding data on selected proteinligand complexes are presented in table 3. Of all the hydrogen bonds formed in each proteinligand complex, ranked based on the largest occupancy value, the five highest bonds were taken during the simulation.
The 3CL-Pro -Corilagin complex has 41 hydrogen bonds formed with the most significant hydrogen bond occupancy value at the Asp187 residue, 91%. Complex PL-Pro -Scutellarein 7-rutinoside produces 53 hydrogen bonds, of which Gly266 is the residue with the most considerable occupancy value of 7.36%. Then the hydrogen bond occupancy value of 71.64% occurred in the Tyr121 residue contained in the Helicase -Genistein 7-Oglucuronide complex, which produced 43 hydrogen bonds. A total of 28 hydrogen bonds were formed in the Nucleocapsid -Biflavonoid-flavone Base + 3O complex, and the residue with the most considerable occupancy value was Gly75 with a value of 47.8%. The Tyr35 residue in the Enoxolone -Envelope complex resulted in an occupancy value of 3.6%, and 11 hydrogen bonds were formed in this complex. Furthermore, in the Membrane -Biflavonoid-flavone Base + 3O complex, ten hydrogen bonds were formed with the Tyr196 residue, resulting in the most significant occupancy value of 31.16%.
Hydrogen bond occupancy is the conformational fraction in a molecular dynamics simulation that contains at least one hydrogen bond involving a specific residue. Occupancy values close to 100% or more indicate that more than one pair of interacting atoms forms hydrogen bonds. The higher the occupancy value, the more frequent hydrogen bonds occur and the stronger the bonds produced (Dhanik, McMurray, & Kavraki, 2012). Total hydrogen bonds and their occupancy can determine the stability of the protein-ligand complex structure (Zikri, Pranowo, & Haryadi, 2020). The occupancy of this hydrogen bond can strengthen the results obtained from the MM/PBSA value. Helicase Complex -Genistein 7-O-glucuronide formed the most hydrogen bonds compared to other complexes with occupancy values above 46%. It is reasonable because the Helicase also has a much higher number of amino acid residues than other target proteins, so the possibility of hydrogen bonding is greater. Meanwhile, the 3CL-Pro -Corilagin complex showed the highest occupancy reaching 91% in the Asp187 residue compared to the occupancy values produced by the residues in the other complexes. It indicates that the hydrogen bonds formed at this residue are substantial, resulting in a stable complex structure. Hydrogen bonds play an essential role in the formation of the secondary or tertiary structure of proteins. They play an indispensable role in the stabilization of the original structure of proteins. Therefore, it is essential to explain the thermodynamic properties of inter/intraprotein hydrogen bonds (Gao, Mei, & Zhang, 2015). Based on the results obtained on RMSD, RMSF, hydrogen bond occupancy, and binding energy of MM/PBSA, we assume that the 3CL-Pro complex is the most stable complex among other complexes. The structure of the 3CL-Pro-corilagin complex is visualized in Figure 4.

CONCLUSION
In the stability tests with molecular dynamics simulations, the 3CL-Pro -Corilagin complex showed the best results on RMSD, RMSF, and hydrogen bond occupancy, although the MM/PBSA value of the Helicase -Genistein 7-O-glucuronide complex was better than the 3CL-Pro -Corilagin complex during the simulation process. It concluded that the 3CL-Pro-Corilagin complex is the most stable compared to other complexes, so that it is the most recommended compound for further research.