International Journal of Informatics, Information System and Computer Engineering 3(2) (2022) 219-230
DOI: https://doi.org/10.34010/injiiscom.v3i2.8805

Phishing Website Detection Using Several Machine Learning Algorithms: A Review Paper

Alexander Veach*, Munther Abualkibash
School of Information Security and Applied Computing, Eastern Michigan University, Ypsilanti, Michigan, United States
*Corresponding Email: aveach1@emich.edu

ABSTRACT

Phishing is one of the major web-based social engineering attacks, which has created demand for better ways to predict and stop such attacks in a commercial environment. This paper surveys the research done in the field and analyses the next steps forward. It does so by focusing on what goes into the selection of proper features, from manual selection to the use of automated approaches such as AdaBoost, MultiBoost, and genetic algorithms, and then by examining the classifiers in use, among which neural networks and ensemble algorithms were prominent alongside some novel approaches. This information is then distilled into a framework for cloud-based and client-based phishing website detection, alongside suggestions for possible future research and experiments that could help progress the field.

ARTICLE INFO

Article History: Submitted/Received 02 Aug 2022, First Revised 05 Sept 2022, Accepted 02 Oct 2022, Available online 20 Oct 2022, Publication Date 01 Dec 2022
Keywords: Artificial Intelligence, Data Science, Machine Learning, Phishing.

1. INTRODUCTION

Phishing has become one of the most prevalent social engineering attacks in the digital environment. From personal accounts to corporate user accounts, all must be aware of the potential dangers of a phishing attack. This has led to an ongoing battle to prevent phishing attacks by blocking dangerous websites and communications. There are many methods to fight these attacks, with many looking to recent advancements in machine learning and artificial intelligence as a potential solution. The method discussed in this paper is detecting phishing websites with machine learning algorithms. Unfortunately, this problem lacks a catch-all solution, which has led to the formation of multiple different approaches. For example, one solution could focus on designing methods for on-hardware machine learning, which limits the choice of algorithms to simpler versions but allows for mass implementation. Other solutions could focus on offloading the classification and the model to a third-party service such as Microsoft Azure or Amazon Web Services, which circumvents the limitation on algorithms in exchange for another group of issues. Considering also the differences in how features are selected, where data is gathered, and much more, there is a multitude of potential solutions, with many researchers looking for the most effective one. The purpose of this paper is to examine these potential solutions and outline what the next steps for such research could be.
2. METHOD

To analyze the currently popular solutions and implementations of anti-phishing technologies using machine learning and artificial intelligence, a body of research was gathered from college repositories and online journal sites such as JSTOR. This search yielded 91 papers, which were read and analyzed, taking into account the classifiers and methods used and their differences. From these, the 14 papers most relevant to the topic and most important for discussion were selected and used.

3. RESULTS AND DISCUSSION

3.1. The Material Used

The application of machine learning against phishing is not a new development, and a multitude of research has been done over the last few years, especially on phishing URLs. The following is some of the relevant research that has come out in recent years.

Sanchez-Paniagua et al. (2022) focused on comparing deep learning methods with other methods, namely ensemble and genetic selection algorithms. In their study they found that their model using TF-IDF + N-gram features outperformed other methods by varying degrees, with the closest performers within 0.5 points of accuracy while the weakest performers were behind by as much as 10 points. The researchers also found that "...handcrafted URL features decrease their performance over time, up to 10.42% accuracy in the case of the LightGBM algorithm from the year 2016 to 2020. For this reason, machine learning methods should be trained with recent URLs to prevent substantial aging from the date of its release" (Sanchez-Paniagua et al., 2022).

Xiao et al. (2020) focused on using a CNN with multi-head self-attention (MHSA) to determine whether links are valid or phishing. Using MHSA, the researchers found better accuracy and speed compared to CNN-LSTM, with a difference of 0.002 in CNN-MHSA's favour. For future work, Xiao et al. (2020) propose updating the model to take HTML content into consideration to increase accuracy further (Xiao et al., 2020).

A different direction was pursued by Suleman and Awan (2019), who focused on the use of genetic algorithms such as "Yet Another Generating Genetic Algorithm" (YAGGA). Testing it against other genetic algorithms, they found a 94.99% accuracy with an ID3 classifier (Suleman & Awan, 2019).

Another example of a boosting study is Subasi and Kremic (2020), who compared AdaBoost and MultiBoosting for detecting phishing websites. The researchers found a high accuracy of 97.61% using an SVM classifier with AdaBoost; however, the cost of that accuracy is that SVM AdaBoost reported a complexity, in seconds, of "8193.72" (Subasi & Kremic, 2020).

Another ensemble study comes from Alsariera et al. (2020), who focused on their Forest Penalizing Attributes algorithm, which uses weights to de-emphasize inconsequential variables. The team then compared the results to meta-learning variants of the algorithm, specifically testing a bagging method and AdaBoost. They found that AdaBoosted Forest Penalizing Attributes had an accuracy of 97%, beating the 96.26% of the base classifier and the 96.58% of the bagged variant, with a speed at which "...false alarm notifications are next to zero" (Alsariera et al., 2020).
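To make the boosting approach described above concrete, the following is a minimal scikit-learn sketch that wraps an SVM base learner in AdaBoost. It is illustrative only: the synthetic feature matrix, RBF kernel, and hyper-parameters are placeholder assumptions and do not reproduce the configuration used by Subasi and Kremic (2020).

```python
# Minimal sketch: boosting an SVM base learner with AdaBoost (scikit-learn).
# Synthetic stand-in data; a real experiment would use a phishing feature dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# 30 numeric features standing in for extracted website/URL attributes.
X, y = make_classification(n_samples=1000, n_features=30, n_informative=12,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=42)

# SVC is passed positionally as the base estimator; probability=True lets
# AdaBoost use class probabilities when combining the weak learners.
model = AdaBoostClassifier(SVC(kernel="rbf", probability=True),
                           n_estimators=25, random_state=42)
model.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```

In practice, the base learner, number of estimators, and learning rate would all be tuned against the chosen phishing dataset rather than fixed as above.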
A more unique approach is the one taken by Chen et al. (2020), which focuses on the visual similarity of websites to determine whether a site is a phishing website, using wavelet hashing and the Scale-Invariant Feature Transform (SIFT) to measure similarity. The researchers found success when using Microsoft, Dropbox, and Bank of America as comparison points, obtaining accuracy results of 98.14%, 98.61% and 99.95% respectively (Chen et al., 2020).

Another distinctive approach is that of Ali and Malebary (2020), who used Particle Swarm Optimization (PSO) to improve detection of fraudulent phishing websites. Using the high-speed PSO model, the team proposes feature weighting in much the same way a genetic algorithm operates. Compared with GA-based selection and weighting, the team found that "...PSO-based feature weighting omitted between 7%-57% of irrelevant features" and that classifiers using their method "...outperformed these machine learning models with applying IG, Chi-Square, Wrapper, GA-based features selection, and GA-based features weighting" (Ali & Malebary, 2020).

Another approach combines the visual analysis of websites with a neural network classifier. This is what Abdelnabi et al. (2020) proposed: a triplet network that compares websites to popular websites drawn from Alexa. By combining visual similarity matching with neural networks, they outline a potential future path for website matching (Abdelnabi et al., 2020).

Assefa and Katarya (2022) analysed other deep learning methods and their results and compared them to an autoencoder, a form of unsupervised neural network. In their report they noted various limitations in other studies, such as non-comprehensive reporting, and compared these achievements to the autoencoder method. They found that the autoencoder had an accuracy of 91.24% and that, with better data mining techniques, the performance could be improved (Assefa & Katarya, 2022).

Mandadi et al. (2022) focused on finding the most important features, denoting three types: domain-based, HTML- and JavaScript-based, and address-bar-based features, with a total of 17 features considered under these three categories. They then tested the features with Random Forest and Decision Tree classifiers, which gave accuracies of 87.0% and 82.4% respectively (Mandadi et al., 2022).

Saravanan and Subramanian (2020) used GA-based feature selection alongside an ARTMAP supervised neural network. ARTMAP is made up of "a pair of self-organized Adaptive Resonance Theory (ART) modules ARTa and ARTb. These two modules are interconnected by an inter-ART self-associative memory and an internal controller, whose objective is to maximize the predictive generalization and to minimize the predictive error. Each ART module is associated with F1 and F2 layers which act as a short term memory and a long term memory for category selection" (Saravanan & Subramanian, 2020). The model also uses the Firefly Algorithm to determine which features are useful. The study found their algorithm to be the best performing on all performance measures except detection time, on which SVM performed better (Saravanan & Subramanian, 2020).
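To illustrate the general idea behind genetic feature selection, the following is a deliberately minimal sketch that evolves boolean feature masks scored by cross-validated decision tree accuracy. It is a toy under stated assumptions (synthetic data, single-point crossover, a small population) and is far simpler than YAGGA, the Firefly Algorithm, or the GA variants used in the studies above.

```python
# Minimal genetic-algorithm feature selection sketch (illustrative only).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=1000, n_features=30, n_informative=10,
                           n_redundant=10, random_state=0)

def fitness(mask):
    """Cross-validated accuracy of a decision tree on the selected features."""
    if not mask.any():
        return 0.0
    clf = DecisionTreeClassifier(random_state=0)
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

pop_size, n_generations, mutation_rate = 20, 10, 0.05
population = rng.random((pop_size, X.shape[1])) < 0.5   # random feature masks

for gen in range(n_generations):
    scores = np.array([fitness(ind) for ind in population])
    order = np.argsort(scores)[::-1]
    parents = population[order[: pop_size // 2]]          # keep the fitter half
    children = []
    while len(children) < pop_size - len(parents):
        a, b = parents[rng.integers(len(parents), size=2)]
        cut = rng.integers(1, X.shape[1])                 # single-point crossover
        child = np.concatenate([a[:cut], b[cut:]])
        flip = rng.random(X.shape[1]) < mutation_rate     # mutation: flip bits
        child = np.where(flip, ~child, child)
        children.append(child)
    population = np.vstack([parents, np.array(children)])

best = population[np.argmax([fitness(ind) for ind in population])]
print("selected features:", np.flatnonzero(best))
print("fitness:", round(fitness(best), 4))
```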
Mourtaji et al. (2017) also outline which features they believe are best suited for detection, describing five groups including a lexical-based analytics method, abnormal-based features, a content-based analytics method, and an identity-based method. Alongside these features they suggest a blacklist function on top of them. They used a linear regression classifier and reported an accuracy of 95.5% with a false positive rate of 1.4% (Mourtaji et al., 2017).

Zhou and Zhang (2022) propose a dual-weight random forest algorithm that is "based on the combination of feature weight and decision tree weight". The proposed classifier was tested against Random Forest, Random Forest with decision tree weight, and Dynamic Random Forest, and had the highest accuracy with a value of 94.93%, which was 2.22 points higher than the next highest, Dynamic Random Forest at 92.71% (Zhou & Zhang, 2022).

3.2. Analysis

Phishing is one of the most dangerous and effective online fraud methods in existence today. This concern has led to the search for a so-called "silver bullet" that would protect potentially affected parties from phishing attacks. Many have looked towards machine learning and artificial intelligence to create an application that would detect threats and adapt to them, creating the ultimate defense. However, there are many parts to consider, including which classifier should be used for training, which attributes should be weighed to determine a threat, and which dataset is best for training the model.

The first major question is what information such a model should be trained on. Should it be URL-focused, based on the content of the website itself, or based on the website's meta content gathered using tools such as WHOIS? URL-based analysis is simple to implement and fast to process but lacks other information from the website, which can decrease accuracy. Conversely, analyzing the content of the web page alongside the URL itself takes more time to execute for the benefit of more accurate results. Some even suggest image recognition models, such as Chen et al. (2020) with their visual similarity model.

When it comes to weighing features, some papers suggest using attribute selection and boosting algorithms such as AdaBoost and MultiBoost, or genetic algorithms, to predict which attributes lend themselves to correct identification, as in Suleman and Awan (2019), Subasi and Kremic (2020), and Alsariera et al. (2020). By using these machine-selected attributes, many hope to increase the efficiency of the classification algorithms used. Subasi and Kremic (2020) noted that "Adaboost achieved the superior classification accuracy, with SVM 97.61%", which beat their best single-classifier result, Random Forest, which achieved "an accuracy of 97.26%". Another study, by Sanchez-Paniagua et al. (2022), reported that, when testing models trained on data from 2016, 2017 and 2020, "...all models struggled to endure over time and their performance decreased when tested on the following years' dataset" (Sanchez-Paniagua et al., 2022), showing the importance of a continuously updated classification scheme.

There are many offered solutions when it comes to which classifier to use, with two of the most common answers being neural network classifiers and Random Forest classification.
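As a point of reference for the comparison that follows, the sketch below trains a Random Forest and a small multilayer perceptron on the same synthetic feature matrix. It only shows the mechanics of such a comparison; the data and hyper-parameters are placeholders, and the results do not correspond to any of the surveyed studies.

```python
# Minimal sketch: comparing an ensemble classifier and a small neural network
# on the same (synthetic) phishing-style feature matrix.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=3000, n_features=30, n_informative=15,
                           random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

models = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=1),
    # The MLP is a small stand-in for the deeper networks discussed below.
    "mlp": make_pipeline(StandardScaler(),
                         MLPClassifier(hidden_layer_sizes=(64, 32),
                                       max_iter=500, random_state=1)),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    print(f"{name}: accuracy={accuracy_score(y_te, pred):.4f} "
          f"f1={f1_score(y_te, pred):.4f}")
```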
Random Forest was the classifier of choice for many researchers in the studies surveyed. Zhou and Zhang (2022) used a modified version of Random Forest, named double weight random forest, and reported an accuracy of 94.93% when using K-means clustering for feature selection. In studies that found other methods to be more effective, such as Sanchez-Paniagua et al. (2022), the difference was only 0.20 points of accuracy compared to LightGBM at 94.67%. However, some report lower accuracy numbers; Mandadi et al. (2022), for example, reported accuracies of 87.0% and 82.4% with 17 features using a PhishTank dataset. This variance can likely be attributed to differences in feature selection and in the contents of the datasets used.

Another common solution is the use of neural network classifiers such as CNN, LSTM, GNN and many others. Neural network classification is recommended similarly to Random Forest, with many studies finding high accuracy when predicting malicious phishing URLs. As mentioned in the prior section, Sanchez-Paniagua et al. found that LightGBM had the highest tested accuracy among the classifiers using static feature selection on the PIU-60K dataset (Sanchez-Paniagua et al., 2022). Other studies have noticed similar results with other neural networks, specifically those with deep learning capabilities. Xiao et al. (2020) applied multi-head self-attention (MHSA) to a convolutional neural network and found an accuracy rate of 0.9834, or 98.34 percent. The study proposed further ways to increase that number, with their main concern being to "decrease the input of [URL's length parameter]" (Xiao et al., 2020).

Novel applications of the above are also well researched, with a common focus on visual detection to flag pages that are too close to other pages, as seen in Abdelnabi et al.'s work (2020). In their research they proposed a triplet convolutional network that determines whether a page is phishing based on its similarity to major pages collected from Alexa. Another unique approach is that of Ali and Malebary (2020), who propose a model based on Particle Swarm Optimization feature weighting, which reportedly outperformed other weighting algorithms.

Like most topics, there is no single silver bullet when it comes to predicting whether a website is malicious. Phishing methods commonly shift to whatever is most effective at the time, which has led to a never-ending effort to prevent such attacks. This has led to a focus on using genetic algorithms or other methods to create a curated list of features. As noted by Sanchez-Paniagua et al., "compared to machine learning algorithms, both CNN models obtained better results than handcrafted features" (Sanchez-Paniagua et al., 2022). By using deep learning models, a higher level of accuracy can be maintained at the cost of greater computational requirements. Neural network classifiers by design develop a much richer identification method, layering information in a way that imitates human neurons, which requires more processing power than simpler classifiers such as a decision tree. These methods, when properly trained, can generate extremely accurate results.
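As an illustration of what such a model can look like in practice, the following is a minimal character-level CNN for URL classification written with Keras. The URLs, labels, and hyper-parameters are toy placeholders, and the architecture is a generic sketch rather than the CNN-MHSA model of Xiao et al. (2020) or the models of Sanchez-Paniagua et al. (2022).

```python
# Minimal character-level CNN sketch for URL classification (toy example).
import numpy as np
import tensorflow as tf

# Placeholder data: real work would use labelled URLs from e.g. PhishTank.
urls = ["http://paypa1-secure-login.example.com/verify",
        "https://www.wikipedia.org/wiki/Phishing",
        "http://login.example.com.account-update.xyz/session",
        "https://github.com/tensorflow/tensorflow"]
labels = np.array([1, 0, 1, 0])  # 1 = phishing, 0 = legitimate (toy labels)

max_len = 80
charset = sorted({c for u in urls for c in u})
char_to_idx = {c: i + 1 for i, c in enumerate(charset)}  # 0 is reserved for padding

def encode(url):
    ids = [char_to_idx.get(c, 0) for c in url[:max_len]]
    return ids + [0] * (max_len - len(ids))

X = np.array([encode(u) for u in urls])

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(max_len,)),
    tf.keras.layers.Embedding(input_dim=len(charset) + 1, output_dim=32),
    tf.keras.layers.Conv1D(64, kernel_size=5, activation="relu"),
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, labels, epochs=3, verbose=0)   # toy fit on four samples
print(model.predict(X, verbose=0).ravel())
```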
However, deep neural network classifiers are in themselves much more costly, requiring a higher level of processing power, commonly supplied by high-end graphics cards designed for that explicit purpose such as the Nvidia Titan V. On the other end of the spectrum are Random Forest and other ensemble classifiers, which instead rely on a series of classification tests to assure accuracy. Thanks to this, ensemble classifiers require less processing power and achieve a better success rate with less data provided. However, Random Forest lacks the potential depth of learning that deep neural networks can provide and is not adept at adapting to changes over time, as reported by Sanchez-Paniagua et al. (2022).

There are then two further trains of thought when it comes to implementation: whether the software should be designed to run on the hardware it is installed upon, or whether the work should be offloaded to virtualized software in the cloud. Both have their benefits and drawbacks. Offloading the processing works better for devices such as mobile phones and other low-powered devices; however, it builds a dependency on a stable connection for the service to work and a reliance on consistent service, which in turn demands an infrastructure that can support such needs. Using the physical machine itself limits the potential design of the model, as it must either be customized to each device or be designed to work with most devices, sacrificing customization. The benefit would be reliability, as this approach requires only the already trained model and the processing power of the device executing it. This would limit potential downtime and other server connectivity issues but could cost more in the long run for businesses implementing it. Another issue would be retraining the models in a reasonable way to adapt to changes in phishing techniques: Sanchez-Paniagua et al. (2022) found as much as a 10% decrease in accuracy as malicious phishing links changed.

The next most common solution was custom classifiers or unique analysis methods, which made up nineteen of the ninety-one papers analyzed. These solutions focused on designing custom classifiers to parse the target information, with claims that the unique solution was more effective than other common solutions. These classifiers are often similar to ensemble methods, which combine classifiers in a multilayered approach; however, some are amalgamations designed to work as a single classifier instead of the usual multilevel classification that ensemble methods use, which is why they are given their own category. Some of these solutions claim a success rate of around 98 percent when tested, while others claim much lower results. For example, Saravanan and Subramanian (2020) used a genetic algorithm to select important features combined with ARTMAP, a supervised neural network classifier, with the Firefly Algorithm used to weigh features.

Another group of nineteen studies suggested deep learning methods, made up of classifiers such as CNN, DNN, GNN and their derivatives. These methods were used specifically to design evolving models that could potentially detect new attacks and adapt quickly.
An issue with these studies is, of course, the resource-intensive nature of deep learning methods, which leaves only two options for implementation: require all hardware to meet the specification, or offload the model to a cloud-based solution. By requiring a dedicated GPU, any company wishing to adopt the technique will face a steep entry cost, which will be a barrier to general adoption, especially for major companies with tens of thousands of workers. The same is true for a cloud-based solution, as any corporation that wishes to adopt such a method will undoubtedly pay fees for its usage.

Something noticed in many of the reports is a lack of standardization in reporting the information gained from experimentation. Several papers reported only accuracy, without any of the other data points, leaving the reader to extrapolate how the conclusion was reached. This issue has been noted in other papers such as "Intelligent Phishing Website Detection Using Deep Learning", where Assefa and Katarya (2022) note that three of the papers analyzed failed to provide enough detail or reported results that were "not comprehensive". The issue compounds as a sizable group of papers leave out important information, such as how their private dataset was created and other key details needed to replicate their findings. This information is critical for understanding how efficient each method is. This can be remedied by having a standard for reporting the results of AI/ML phishing detection. Such a standard should require: a) the explicit location and name of the dataset used, b) the algorithm used, c) explicit instructions on how the model was trained, d) an in-depth breakdown of false positives and negatives and true positives and negatives, and e) analysis execution speed.

Going forward, there appear to be two paths when it comes to designing a defensive tool against fraudulent websites. The first approach would focus on designing a client-focused service that runs a classifier on the hardware provided. The second approach would focus on designing a cloud-based solution, called through an API, to offload the compute-intensive work. Both approaches have their own benefits and drawbacks, which are discussed in greater detail in the next sections, but either is a good first step toward advancing anti-phishing measures.

3.3. Example of a Client-Based Solution

The following is a proposed framework for a client-based anti-phishing extension. The solution should be built in a browser-native language, such as JavaScript, using available machine learning libraries such as TensorFlow. When a website is accessed, the extension will check a maintained whitelist containing commonly used and trusted websites such as search engines, online office tools, and other trusted sites. Then, if the website is not trusted, the extension will harvest the data needed for classification by the model used. For this example it is assumed that an ensemble classifier such as Random Forest is used.
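The decision flow just described might look like the following minimal sketch. It is written in Python for readability; as noted above, an actual extension would use a browser-native language such as JavaScript with a library like TensorFlow.js, and the whitelist, feature stub, and placeholder model below are assumptions made purely for illustration.

```python
# Minimal sketch of the client-side decision flow (placeholder logic).
from urllib.parse import urlparse

TRUSTED_DOMAINS = {"google.com", "wikipedia.org", "office.com"}  # maintained whitelist

def extract_features(url: str) -> list[float]:
    """Stub: a real implementation would compute the full feature set described below."""
    parsed = urlparse(url)
    return [len(url), url.count("."), float(parsed.scheme == "https")]

def check_url(url: str, model) -> str:
    domain = urlparse(url).netloc.removeprefix("www.")
    if domain in TRUSTED_DOMAINS:
        return "allow"                       # skip classification for trusted sites
    features = extract_features(url)
    verdict = model.predict([features])[0]   # pre-trained ensemble model, loaded locally
    return "warn" if verdict == 1 else "allow"

if __name__ == "__main__":
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    # Placeholder model trained on random data purely so the sketch runs end to end.
    rf = RandomForestClassifier(n_estimators=10, random_state=0)
    rf.fit(np.random.rand(100, 3), np.random.randint(0, 2, 100))
    print(check_url("https://www.wikipedia.org/", rf))
    print(check_url("http://secure-login.example.xyz/verify", rf))
```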
The classifier will account for multiple features, including domain information, the URL, and content on the website itself, similar to the feature set suggested by Mandadi et al. (2022), which lists DNS record, website traffic, age of domain, end period of domain, IFrame redirection, status bar customization, disabling of right click, website forwarding, domain, IP address, "@" symbol, URL length, URL depth, redirection "//", "http/https" in the domain name, use of URL shortening services ("TinyURL"), and prefix or suffix "-" in the domain (Mandadi et al., 2022). The extension should ship with a pre-built model based on the above, with updates to reflect trends in current phishing websites. While the extension classifies the website, it should display an interim page that updates when classification is done, either sending the user to the website or informing the user of the detected security risk.

This model is relatively easy to implement and can theoretically run on most modern workstations. It can also be updated when performance drops due to changing trends in phishing to counteract the loss in accuracy; however, doing so would require a dedicated team to continuously watch current trends in phishing websites. Another weakness of this model is the potential for false positives and other accuracy issues, which would slow down the average user. The proposed model will also need to determine whether a link is safe or unsafe rapidly, or risk earning the ire of the end user. These factors would need to be mitigated for a commercial implementation, by optimizing the classification process, designing methods that hide the classification delay from the user, or other similar ideas (see Fig. 1).

Fig. 1. A diagram of a simple client-based anti-phishing solution
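Several of the address-bar features listed above can be computed directly from the URL string. The sketch below approximates a handful of them; the heuristics and the shortener list are illustrative assumptions, not the exact feature definitions used by Mandadi et al. (2022).

```python
# Hedged sketch: computing a few address-bar features from a raw URL string.
import re
from urllib.parse import urlparse

SHORTENERS = {"bit.ly", "tinyurl.com", "goo.gl", "t.co"}  # illustrative list only

def address_bar_features(url: str) -> dict:
    parsed = urlparse(url)
    domain = parsed.netloc.split(":")[0]
    return {
        "has_ip_address": bool(re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", domain)),
        "has_at_symbol": "@" in url,
        "url_length": len(url),
        "url_depth": len([p for p in parsed.path.split("/") if p]),
        "double_slash_redirect": url.rfind("//") > 7,       # "//" beyond the scheme
        "dash_in_domain": "-" in domain,
        "uses_shortener": domain.removeprefix("www.") in SHORTENERS,
        "http_token_in_domain": "http" in domain,
    }

print(address_bar_features("http://bit.ly/2abcd"))
print(address_bar_features("http://192.168.0.1//secure-login@example.com/account/verify"))
```

In the extension, a vector of such features would be fed to the pre-built model described above; content- and domain-based features would require additional lookups beyond the URL string itself.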
3.4. Example of a Cloud-Based Solution

Where the prior solution is relatively simple to implement, the following is much more difficult due to the need for powerful computational resources, hosted on either a public cloud service or a private cloud. This version would use a deep learning method, such as CNN-LSTM, trained using information from repositories such as PhishTank. The classifier should be guided to look at meta information, website content, and the website URL itself. The trained model would then act upon information sent to it from client devices and determine whether a site is a phishing website or a legitimate one. The model would then add that information to the next training set, continuously updating the dataset so that it evolves naturally to counter new methods of phishing as they appear, as suggested by Sanchez-Paniagua et al. (2022).

This model, while simple to outline, is difficult to execute in practice. For effective deep learning, data needs to be fed consistently to the model for it to stay up to date. Supporting this infrastructure would require considerable money and resources, alongside the customization needed to optimize the classification processes. Beyond those issues, another problem is ensuring uptime for those dependent on the software. The cloud-focused model requires consistent back-and-forth between all users and the classification service at all times for effective use, which will also require substantial resources to implement. Once the model is properly trained and maintained, however, it has the potential for higher accuracy than its ensemble-based counterpart above. In the deep learning studies surveyed for this paper, most reported an accuracy of 97% or more, surpassing the next highest classifier, which was often the Random Forest algorithm. There is therefore potential for cloud-based anti-phishing techniques powered by machine learning and artificial intelligence, but the resource cost will limit effective implementation without serious capital investment (see Fig. 2).

Fig. 2. A diagram of a simple cloud-based anti-phishing solution

3.5. Future Work

A prudent first step would be to standardize the reporting of machine learning and artificial intelligence results. Currently there is no codified standard for reporting machine learning and artificial intelligence study results. Some studies contain everything needed to replicate the experiments performed and to see how the conclusion was drawn; other studies leave out details needed for conclusive analysis or replication. Mourtaji et al. (2017), for example, outline their own framework and show results from it without supplying the dataset used in testing, which they claim to have populated from PhishTank and Alexa. Providing the dataset used in testing to an online repository for verification clears doubt and is of great assistance to other researchers in the field.

By focusing on standards that ensure easy replication of results and clarity of the information reported, other researchers will be able to build on the research and develop new technologies. We would therefore like to suggest a framework that would include these specifications for all reported testing: a repository containing the training and testing datasets used, the features selected for classification, the classifier used alongside documentation of how to implement any custom classifier, the true positives and negatives alongside the false positives and negatives from the resulting validation tests, precision, recall, accuracy, and F1 score. Alongside this information there should be enough instruction for the reader to validate the paper by replicating the experiment within. Including this information ensures reliable replication, which makes the work easier to build upon and thus helps the proliferation of information.

On a more practical level, the next step should be creating working models and testing them in live environments. Building a model, client- or cloud-based, will allow researchers to see the practical shortcomings of these methods and correct them. Once the shortcomings are known, more development can take place, evolving the field and helping combat one of the most common threats on the internet.
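To make the reporting framework suggested above concrete, the sketch below shows how the requested quantities could be computed from a validation run with scikit-learn; the labels and predictions are placeholders.

```python
# Minimal sketch: reporting the metrics suggested above from a validation run.
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

# Placeholder labels: 1 = phishing, 0 = legitimate.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
print(f"precision={precision_score(y_true, y_pred):.3f}")
print(f"recall={recall_score(y_true, y_pred):.3f}")
print(f"accuracy={accuracy_score(y_true, y_pred):.3f}")
print(f"F1={f1_score(y_true, y_pred):.3f}")
```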
4. CONCLUSION

Phishing is one of the most common threats to cybersecurity in the current world, and many organizations have become acutely aware of the potential danger of a successful attack. This has led to an increased focus on developing new technologies to prevent such attacks from taking place. Using machine learning and artificial intelligence, many posit a learning defensive system that can prevent website phishing attacks and reduce potential attack vectors. Currently there is no cure-all, with many papers acknowledging the ever-changing nature of website-based phishing attacks, which prevents a permanent solution. However, a well-automated system could go a long way toward preventing website-based phishing attacks and could be a useful solution for major organizations. Most studies suggest that a web extension for modern browsers such as Google Chrome is where companies should look for future developments. Developing a working model for testing in live environments would advance the field by showing what potential shortcomings exist.

Finally, there is a lack of standardization in the reporting of data across the multitude of studies focusing on this topic. To better advance the field toward implementing anti-phishing ML/AI in working prototypes, a standard of reporting would make it easier to gather information. Always including the dataset used, the algorithm used, the instructions for training the model, a breakdown of the training and testing results, and a record of the time taken to execute a task would allow information to be disseminated and processed faster, which in turn could assist the development of such anti-phishing technologies.

REFERENCES

Abdelnabi, S., Krombholz, K., & Fritz, M. (2020, October). VisualPhishNet: Zero-day phishing website detection by visual similarity. In Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security (pp. 1681-1698).

Ali, W., & Malebary, S. (2020). Particle swarm optimization-based feature weighting for improving intelligent phishing website detection. IEEE Access, 8, 116766-116780.

Alsariera, Y. A., Elijah, A. V., & Balogun, A. O. (2020). Phishing website detection: Forest by penalizing attributes algorithm and its enhanced variations. Arabian Journal for Science and Engineering, 45(12), 10459-10470.

Assefa, A., & Katarya, R. (2022, March). Intelligent phishing website detection using deep learning. In 2022 8th International Conference on Advanced Computing and Communication Systems (ICACCS), 1, 1741-1745. IEEE.

Chen, J. L., Ma, Y. W., & Huang, K. L. (2020). Intelligent visual similarity-based phishing websites detection. Symmetry, 12(10), 1681.

Mandadi, A., Boppana, S., Ravella, V., & Kavitha, R. (2022, April). Phishing website detection using machine learning. In 2022 IEEE 7th International Conference for Convergence in Technology (I2CT) (pp. 1-4). IEEE.

Mourtaji, Y., & Bouhorma, M. (2017, October). Perception of a new framework for detecting phishing web pages. In Proceedings of the Mediterranean Symposium on Smart City Application (pp. 1-6).

Sánchez-Paniagua, M., Fernández, E. F., Alegre, E., Al-Nabki, W., & González-Castro, V. (2022). Phishing URL detection: A real-case scenario through login URLs. IEEE Access, 10, 42949-42960.
Saravanan, P., & Subramanian, S. (2020). A framework for detecting phishing websites using GA based feature selection and ARTMAP based website classification. Procedia Computer Science, 171, 1083-1092.

Subasi, A., & Kremic, E. (2020). Comparison of Adaboost with multiboosting for phishing website detection. Procedia Computer Science, 168, 272-278.

Suleman, M. T., & Awan, S. M. (2019). Optimization of URL-based phishing websites detection through genetic algorithms. Automatic Control and Computer Sciences, 53(4), 333-341.

Zhou, J., Liu, Y., Xia, J., Wang, Z., & Arik, S. (2020). Resilient fault-tolerant anti-synchronization for stochastic delayed reaction–diffusion neural networks with semi-Markov jump parameters. Neural Networks, 125, 194-204.

Zhou, Z., & Zhang, C. (2022, May). Phishing website identification based on double weight random forest. In 2022 3rd International Conference on Computer Vision, Image and Deep Learning & International Conference on Computer Engineering and Applications (CVIDL & ICCEA) (pp. 263-266). IEEE.