Approaches to creating a proprietary data representation format
Abstract
During software development, the question of choosing a data representation format inevitably arises: the format determines how information is organized for efficient storage, transmission, and processing. Well-known general-purpose formats do not always meet specific requirements and criteria, so it is sometimes necessary to develop a custom one. This article addresses the design of a data schema for proprietary data representation formats. The paper analyzes the main metrics of data storage formats and, on that basis, proposes criteria for the data schema. A brief overview of existing formats highlights their main advantages. Based on this analysis, a data schema for a proprietary data representation format is developed; it implements a hierarchical structure built from the proposed classes. The structure of these classes, the general file record format, the record types, and the encoding are defined. In addition, the data schema provides integrity control by appending a CRC checksum to the end of the file. As an example of applying the developed schema, the paper presents the creation of a closed proprietary format, LSGS (Library of symbolic graphic symbols), designed to store a library of conventional graphical symbols of functional blocks. The results of serializing and deserializing objects in a high-level programming language are presented.
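The integrity-control idea mentioned in the abstract (typed records followed by a CRC checksum at the end of the file) can be illustrated with a short sketch. This is not the LSGS format itself: the abstract does not give the record layout, byte order, or CRC variant, so the type-length-payload layout, the little-endian encoding, the use of CRC-32, and the names `pack_record`, `serialize`, and `deserialize` are all assumptions made for illustration.

```python
import struct
import zlib

# Hypothetical layout, assumed for illustration only:
#   record = type (1 byte) | payload length (4 bytes, little-endian) | payload
#   file   = record* | CRC-32 of all preceding bytes (4 bytes, little-endian)
HEADER = struct.Struct("<BI")   # 5 bytes, no padding
TRAILER = struct.Struct("<I")   # 4-byte checksum


def pack_record(rec_type: int, payload: bytes) -> bytes:
    """Encode one record as type, length, payload."""
    return HEADER.pack(rec_type, len(payload)) + payload


def serialize(records: list[tuple[int, bytes]]) -> bytes:
    """Concatenate the records and append a CRC-32 trailer."""
    body = b"".join(pack_record(t, p) for t, p in records)
    return body + TRAILER.pack(zlib.crc32(body))


def deserialize(blob: bytes) -> list[tuple[int, bytes]]:
    """Verify the CRC-32 trailer, then decode the records."""
    if len(blob) < TRAILER.size:
        raise ValueError("file too short to contain a checksum")
    body = blob[:-TRAILER.size]
    (stored,) = TRAILER.unpack(blob[-TRAILER.size:])
    if zlib.crc32(body) != stored:
        raise ValueError("checksum mismatch: file is corrupted")

    records = []
    offset = 0
    while offset < len(body):
        if offset + HEADER.size > len(body):
            raise ValueError("truncated record header")
        rec_type, length = HEADER.unpack_from(body, offset)
        offset += HEADER.size
        if offset + length > len(body):
            raise ValueError("truncated record payload")
        records.append((rec_type, body[offset:offset + length]))
        offset += length
    return records
```

Under these assumptions, `deserialize(serialize(records))` returns the original list, while changing any byte of the serialized data causes `deserialize` to raise `ValueError` before any record is decoded. Checking the checksum first keeps the parser from acting on corrupted length fields.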
Full Text:
PDF (Russian)
ISSN: 2307-8162