A Comparative Study of Machine Learning Models for Fashion Product Demand Prediction: Exploring Algorithms, Data Splitting, and Feature Engineering
Abstract
The fashion industry struggles to predict demand accurately because of inherent uncertainty, which leads to suboptimal inventory levels and financial losses. Machine learning (ML) offers a robust alternative: it can analyze large, complex datasets, identify non-linear patterns, and produce more accurate predictions than conventional methods that rely on a limited set of factors. This research compares and evaluates six ML models, namely XGBoost, support vector machine (SVM), random forest (RF), gradient boosting machine (GBM), k-nearest neighbors (KNN), and neural network (NN), for predicting fashion product demand, considering the influence of feature engineering and of different data split ratios. KNN and NN were included because of their distinct modeling approaches and their competitive ability to identify local and non-linear patterns across numerical, categorical, and time series data. Feature engineering consisted of feature extraction and feature selection, and three data split ratios were examined (70:30, 80:20, and 90:10). Using Adidas sales data, the models were evaluated with Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE). The results indicate that XGBoost with feature engineering consistently outperforms the other models across all data split ratios. In particular, XGBoost with feature engineering at a 90:10 split achieved the best performance, with an RMSE of 4.46 and an MAE of 1.51. The analysis shows that the predictive ability of ML models depends on both the use of feature engineering and the choice of data split ratio. These results demonstrate the potential of feature-engineered XGBoost models and well-chosen split ratios to mitigate the risk of stockouts and overstocks and to reduce financial losses and environmental waste.
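The evaluation protocol described in the abstract (several data split ratios, each scored by RMSE and MAE) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's pipeline: the demand values are synthetic, the function names are hypothetical, the split keeps the original row order (the abstract does not say whether the split was random or ordered), and a mean-of-training-set predictor stands in for the six models.

```python
import math
import random


def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the mean squared residual."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean Absolute Error: mean of the absolute residuals."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


def split_by_ratio(rows, train_fraction):
    """Split rows into (train, test), keeping the original order (an assumption)."""
    cut = int(round(len(rows) * train_fraction))
    return rows[:cut], rows[cut:]


def evaluate_mean_baseline(units_sold, train_fraction):
    """Score a placeholder predictor that always outputs the training mean.

    The placeholder is NOT one of the six models in the study; it only
    shows where a fitted model's predictions would be plugged in.
    """
    train, test = split_by_ratio(units_sold, train_fraction)
    constant = sum(train) / len(train)
    predicted = [constant] * len(test)
    return rmse(test, predicted), mae(test, predicted)


if __name__ == "__main__":
    # Synthetic demand values, invented for illustration (not the Adidas data).
    rng = random.Random(0)
    units_sold = [max(0.0, rng.gauss(250, 40)) for _ in range(1000)]
    for train_pct in (70, 80, 90):
        r, m = evaluate_mean_baseline(units_sold, train_pct / 100)
        print(f"{train_pct}:{100 - train_pct} split -> RMSE={r:.2f}, MAE={m:.2f}")
```

Because RMSE squares each residual before averaging, it is never smaller than MAE on the same predictions and grows faster when a few errors are large. This is consistent with the reported best result, where the RMSE (4.46) exceeds the MAE (1.51).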
DOI: https://doi.org/10.15408/aism.v8i1.45600

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
EDITORIAL ADDRESS:
Department of Information Systems, Faculty of Science and Technology,
Universitas Islam Negeri (UIN) Syarif Hidayatullah Jakarta
Faculty of Science and Technology Building, 3rd Floor, 1st Campus, Universitas Islam Negeri (UIN) Syarif Hidayatullah Jakarta
Jl. Ir. H. Juanda No. 95, Ciputat Timur, Kota Tangerang Selatan, Banten 15412, Indonesia.
Tel./Fax: +62217401925/+62217493315.
E-mail: aism.journal@apps.uinjkt.ac.id, Website: https://journal.uinjkt.ac.id/index.php/aism
Applied Information System and Management (AISM) | E-ISSN: 2621-254 | P-ISSN: 2621-2536
https://journal.uinjkt.ac.id/index.php/aism