Implementation of Metadata Repository based on Object-Oriented Model for Data Quality Assessment

S. E. Dukhovenskiy

Abstract


The paper considers an approach to organizing a metadata repository as one element of a data quality assessment system. An object-oriented data model enriched with data quality checks is proposed as the basis of the repository. The key repository elements are described, along with an approach to linking data quality checks to objects in the data model. The article presents a method for storing the metadata repository and dedicated algorithms for processing it, including an "unpacking" algorithm for class attributes and an algorithm for determining the relevant data quality checks, taking possible overrides into account. Based on these theoretical propositions, a prototype of the metadata repository was implemented and used to organize checks for assessing the data quality of the personal data operators registry. Compared with a metadata repository based on a physical data model, the approach reduces attribute descriptions and data quality check descriptions by 23% and 27%, respectively, while keeping the number of executed checks the same. The approach can be useful in practical data quality analysis tasks as a potential way to reduce the workload of managing data quality checks.
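The abstract names two algorithms without detailing them, so the following is only a minimal sketch of what attribute "unpacking" and override resolution could look like in an object-oriented metadata model. Everything in it is an assumption made for illustration: the names `Check`, `Attribute`, `ClassDef`, `Repository`, and the sample classes are hypothetical; "unpacking" is assumed to mean flattening inherited and class-typed attributes into dotted paths; and an override is assumed to mean that a descendant class redeclares a check with the same name on the same attribute. It is not the paper's implementation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Check:
    name: str  # identity used to match overrides (assumption)
    rule: str  # free-form description of the rule


@dataclass
class Attribute:
    name: str
    type_name: str  # a primitive type name or the name of another class
    checks: list[Check] = field(default_factory=list)


@dataclass
class ClassDef:
    name: str
    parent: Optional[str] = None
    attributes: list[Attribute] = field(default_factory=list)


class Repository:
    """Hypothetical in-memory metadata repository."""

    def __init__(self, classes):
        self.classes = {c.name: c for c in classes}

    def lineage(self, class_name):
        """Return the class chain from the root ancestor down to class_name."""
        chain = []
        current = class_name
        while current is not None:
            cls = self.classes[current]
            chain.append(cls)
            current = cls.parent
        return list(reversed(chain))

    def effective_attributes(self, class_name):
        """Merge attributes along the lineage.

        A check declared lower in the hierarchy replaces an inherited check
        with the same name on the same attribute; the attribute keeps the
        type of its first declaration.
        """
        merged = {}  # attribute name -> (type_name, {check name: Check})
        for cls in self.lineage(class_name):
            for attr in cls.attributes:
                type_name, checks = merged.get(attr.name, (attr.type_name, {}))
                checks = dict(checks)
                for check in attr.checks:
                    checks[check.name] = check
                merged[attr.name] = (type_name, checks)
        return merged

    def unpack(self, class_name, prefix=""):
        """Flatten a class into {dotted path: {check name: Check}}.

        Class-typed attributes are expanded recursively. Checks attached to
        the composite attribute itself are ignored in this sketch, and the
        composition graph is assumed to be acyclic.
        """
        flat = {}
        for name, (type_name, checks) in self.effective_attributes(class_name).items():
            path = prefix + name
            if type_name in self.classes:
                flat.update(self.unpack(type_name, prefix=path + "."))
            else:
                flat[path] = checks
        return flat


# Hypothetical sample model: a base class, a subclass, and a nested class.
repo = Repository([
    ClassDef("Address", attributes=[
        Attribute("postal_code", "string", [Check("format", "six digits")]),
    ]),
    ClassDef("Party", attributes=[
        Attribute("name", "string", [Check("not_empty", "value is present")]),
        Attribute("tax_id", "string", [Check("length", "10 or 12 digits")]),
        Attribute("address", "Address"),
    ]),
    ClassDef("LegalEntity", parent="Party", attributes=[
        Attribute("tax_id", "string", [Check("length", "exactly 10 digits")]),
    ]),
])

flat = repo.unpack("LegalEntity")
for path, checks in flat.items():
    print(path, [c.rule for c in checks.values()])
```

In this sketch the `not_empty` check and the nested `postal_code` check are written once and reused by every descendant, while `LegalEntity` replaces only the `length` check it needs to change. That reuse is one plausible reading of how a class-based description could be shorter than a per-table description while the set of executed checks stays the same.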


ISSN: 2307-8162