Critical review of text mining and sentiment analysis for stock market prediction
Abstract
This paper critically reviews the literature on text mining and sentiment analysis for stock market prediction, focusing on recent research that deals strictly with stock markets as represented by stock indices or individual stocks. The review examines and critically analyzes the methods used to extract sentiment from textual data, with particular attention to how far the research results can be generalized and transferred. To that end, the literature is treated analytically and organized critically, with an emphasis on completeness, coherence, and consistency. Based on the selected criteria, 260 articles on the subject were selected from the Web of Science and Scopus databases and mapped graphically through bibliometric analysis. The selection was subsequently narrowed to 49 articles. Their outputs are synthesized, and the main findings and limitations of the current state of research are highlighted together with possible directions for future research.
Keywords: bibliometric analysis, financial market, literature review, sentiment analysis, stock market, text mining
This work is licensed under a Creative Commons Attribution 4.0 International License.