How web interface labeling quality affects ML models predicting users’ subjective impressions

M. A. Bakaev, V. A. Khvorostov

Abstract


Training data quality is widely recognized as the main prerequisite for constructing successful Machine Learning (ML) models. However, the concrete aspects of data quality vary across domains and across the outcomes the models are meant to predict. In Human-Computer Interaction (HCI), the common belief is that labor-intensive manual labeling of graphical user interfaces (UIs), i.e., identification of the visual elements in them, makes it possible to predict users’ subjective impressions more accurately. At the same time, computer vision-based services for automated parametrization of UIs are gaining popularity. In this paper, we describe an experimental study with over 200 participants and 1000 web UI screenshots, which were assessed on 3 subjective impression scales: Complexity, Aesthetics and Orderliness. To compare the effects of metrics derived from manual vs. automated labeling, as well as of increased data quality in the manual labeling, we calculated a total of 16 metrics, which were subsequently used as factors in the predictive ML models. Our results suggest that the Pearson correlation between input data quality and the models’ quality was highly significant and negative for Aesthetics (-0.768) and Orderliness (-0.644). Nor could we identify consistent advantages of the “manual” metrics over the “automated” ones, except for the number of UI elements, for which the automated services were somewhat lacking. Our conclusions regarding the labeling of UIs and the applicability of the considered metrics might be of interest to both ML-HCI researchers and practicing UI/UX designers.
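The kind of check reported above, a Pearson correlation between a data quality score and a model quality score, can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `pearson_r`, the variable names, and all numeric values are hypothetical and are not taken from the study, whose actual quality measures and data are described in the full text.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# HYPOTHETICAL values, invented purely for illustration:
# one labeling-quality score and one model-quality score per labeled dataset.
labeling_quality = [0.60, 0.70, 0.80, 0.90, 1.00]
model_quality = [0.31, 0.29, 0.30, 0.25, 0.24]

r = pearson_r(labeling_quality, model_quality)
print(f"Pearson r = {r:.3f}")
```

A negative `r` in such a comparison would mean that the datasets with higher labeling quality tended to yield lower model quality, which is the direction of the effect the abstract reports for Aesthetics and Orderliness.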

Full Text:

PDF (Russian)





ISSN: 2307-8162