Program Generation Methods: Types and Instances

Daniil Borodin, Alexander Prutzkow

Abstract


The study systematizes existing approaches to program generation chronologically and by category. We searched for sources on modern library platforms using keywords related to program generation. We classified program generation methods and identified the following types and their instances. Template methods generate a program from natural language, UML diagrams, and formal specifications; this type also covers code generation for specific platforms, including emulation of processor architectures. CASE methods convert high-level descriptions into executable code, including generation in Isabelle/HOL and the use of multi-level rule sets. We analyzed model-based code generation methods, including polyhedral models and the Ptolemy platform. We reviewed tools that use genetic algorithms to create program code. Compositional programming is represented by the SPIRAL and KLEE projects and other modern developments. A separate type comprises methods based on artificial intelligence and machine learning, including neural network architectures (AlphaCode, CODEnn) and large language models (CodeBERT, Code Llama). For each type, we identified its advantages and areas of application. The types are presented as a scheme that corresponds to the directions of development of program generation methods. The study revealed a tendency to move from traditional template methods to technologies based on large language models and machine learning. We will use the results of this review in our study on generating programs that transform arrays.




ISSN: 2307-8162