A Cobol to Python Converter with AutoGen Agents

In this story, we are going to explore a Cobol to Python converter written with Microsoft’s AutoGen framework. AutoGen is a Python-based framework that lets you orchestrate multiple types of agents using different conversation patterns.

About AutoGen

AutoGen has the following types of basic agents:

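  • ConversableAgent: the agent with the basic functionality, and the base class for all other AutoGen agents. It contains the base functionality to send and receive messages from other agents, and to initiate or continue a conversation.

  • UserProxyAgent: a proxy agent for humans. By default, it asks for human input as the agent's reply at each interaction turn, and it can also execute code and call functions. We use this type of agent in the converter to initiate and terminate conversations or to execute some code.

  • AssistantAgent: the agent which interacts with the LLM and typically generates text. It neither executes code nor interacts with the user. We have used this agent to generate Python code, generate unit tests and review code.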

You can also create a GroupChat, which allows you to manage a group of AssistantAgent instances that interact with each other.

Finally, AutoGen also allows you to create your own custom agents, but we did not use this functionality for this converter.

AutoGen supports several conversation patterns, for example:

  • bi-directional chat between a user proxy and an assistant agent

  • group chat involving a user proxy and a group of assistant agents

  • multi-group chat involving several groups of agents that interact with each other

For more details, please check the AutoGen webpage: http://microsoft.github.io/autogen/docs/Use-Cases/agent_chat 

What is the goal of the Cobol Converter?

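The goal of this command line tool is to read Cobol files and convert them to Python code with documentation and unit tests. Its secondary goal is to convert the Python code into a REST-based application using FastAPI.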

High-level workflow

How does the Cobol Converter work?

The Cobol converter uses two types of tools: AutoGen agents, which are backed by an LLM (gpt-4-1106-preview), and traditional tools for checking and formatting code.

There are two AutoGen agent ensembles (teams) at work in this application.

  • Cobol conversion team

  • REST conversion team

Agent Teams

Cobol Conversion Team

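The Cobol conversion team converts Cobol to Python with documentation and unit tests. Besides the user proxy agent, which receives the user input, it has three agents: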

  • The Cobol conversion agent: responsible for the Cobol conversion using an LLM

  • The unit test agent: generates unit tests from the Python code using an LLM

  • The code reviewer: reviews the converted Python code and the unit tests
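A team like this can be sketched with AutoGen's GroupChat. The following is a minimal sketch, assuming the pyautogen 0.2 API; the agent names, the max_round value, the termination check and the `prompts` mapping (agent name to system message) are illustrative assumptions and not the repository's actual setup.

```python
def is_termination_msg(message: dict) -> bool:
    """Return True when an agent reply ends with the TERMINATE keyword."""
    content = message.get("content") or ""
    return content.rstrip().endswith("TERMINATE")


def build_cobol_team(llm_config: dict, prompts: dict):
    """Create a user proxy plus three assistants in one group chat.

    Requires the pyautogen package. It is imported lazily so that the
    helper above stays usable without it. All names are illustrative.
    """
    import autogen

    user_proxy = autogen.UserProxyAgent(
        name="user_proxy",
        human_input_mode="NEVER",     # no human input during the run
        code_execution_config=False,  # this proxy only starts and stops the chat
        is_termination_msg=is_termination_msg,
    )
    assistants = [
        autogen.AssistantAgent(
            name=name,
            system_message=prompts[name],
            llm_config=llm_config,
        )
        for name in ("python_coder", "python_unit_tester", "code_critic")
    ]
    group_chat = autogen.GroupChat(
        agents=[user_proxy, *assistants], messages=[], max_round=10
    )
    manager = autogen.GroupChatManager(groupchat=group_chat, llm_config=llm_config)
    return user_proxy, manager
```

With a valid LLM configuration, `user_proxy.initiate_chat(manager, message=cobol_source)` would start the conversation, where `cobol_source` stands for the text of one Cobol file.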

REST Conversion Team

The REST conversion team takes the converted Python code as input and converts it into a REST interface. So if the application was a command line application of some sort, it becomes a REST-based application using FastAPI.

Full workflow of the Cobol Agent

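The Cobol to Python conversion workflow consists of a loop that processes each Cobol file and uses the two agent ensembles together with the traditional tools (Black and Pylint).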
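Traditional tools such as Black and Pylint can be driven from Python with the standard library. This is a minimal sketch, assuming both tools are installed in the active environment; `run_and_save` and `format_and_lint` are hypothetical helpers, not code from the repository.

```python
import subprocess
import sys
from pathlib import Path


def run_and_save(command: list[str], report_path: Path) -> int:
    """Run a command, write its stdout followed by its stderr to report_path,
    and return the exit code."""
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    report_path.write_text(result.stdout + result.stderr, encoding="utf-8")
    return result.returncode


def format_and_lint(python_file: Path) -> None:
    """Format a file in place with Black, then save the Pylint report next to it."""
    subprocess.run([sys.executable, "-m", "black", str(python_file)], check=False)
    run_and_save(
        [sys.executable, "-m", "pylint", str(python_file)],
        Path(f"{python_file}_lint.txt"),
    )
```

The same `run_and_save` helper could also execute a generated unit test file and keep its output as a log.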

Here is an annotated version of the workflow:

Full conversion workflow

The Cobol converter workflow has these main stages:

  1. The initial loop processes each Cobol file in a directory.

  2. This is the task block with the Cobol Conversion Team. In this task block, the Cobol code is converted, the unit tests are created and the code is reviewed. When the Cobol Conversion Team stops, the converter extracts all relevant blocks with Python code or text from the agents' messages.

  3. In this block of tasks, the code review is written to disk. The Python code is also formatted and written to disk. The code is also analysed with Pylint, and the results of the analysis are written to disk. The unit tests are also executed and their output is saved in a file.

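  4. This is the task block with the REST conversion team. It has two agents that interact with each other: the REST code converter and the code reviewer. After they have generated the code, the Python code is extracted together with the code review.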

  5. In this block the code review is written to disk, and the REST interface is actually executed in a process to see whether the code compiles and runs. The process is then shut down, and the code is formatted and written to disk. Pylint is then used to analyse the code and this analysis is also written to disk.
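Stages 2 and 4 extract the Python code from the agents' replies. A sketch of how this might look, assuming the agents wrap their code in Markdown fences tagged `python`; `extract_python_blocks` is a hypothetical helper and the repository may extract code differently.

```python
import re

# Three backticks, built programmatically so this example can itself be fenced.
FENCE = "`" * 3

# Matches a python-tagged fenced block; DOTALL lets "." span newlines.
PYTHON_BLOCK = re.compile(FENCE + r"python[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_python_blocks(message: str) -> list[str]:
    """Return the body of every python-tagged fenced block in a message."""
    return [block.strip() for block in PYTHON_BLOCK.findall(message)]
```

Messages without any tagged block yield an empty list, which makes it easy to detect that an agent answered with text only.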

The converter finishes after all Cobol files have been processed.
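Running the generated REST application in a separate process, as stage 5 does, can be approximated with a small smoke test. This is a sketch with the standard library only; the `smoke_test` name, the grace period and the rule "still alive after a few seconds counts as success" are assumptions for illustration.

```python
import subprocess
import sys
import time


def smoke_test(script_path: str, grace_seconds: float = 3.0) -> bool:
    """Start a script and report whether it is still running after grace_seconds.

    A server that fails on import or start-up exits early, so poll()
    returns its exit code instead of None.
    """
    process = subprocess.Popen(
        [sys.executable, script_path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        time.sleep(grace_seconds)
        return process.poll() is None
    finally:
        if process.poll() is None:
            process.terminate()  # shut the server down again
        process.wait(timeout=10)
```

A check like this only shows that the process starts; it says nothing about whether the endpoints behave correctly.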

The Cobol Converter Output

For a small Cobol file like this one, you should get an output like this one:

Conversion output example

The files are:

  • rest_critique_write_student_2.txt — the code review for the REST implementation

  • rest_write_student_2.py — The REST based implementation

  • rest_write_student_2.py_lint.txt — The result of the static code analysis for the REST implementation

  • test_write_student_2.py — The unit tests for the converted file

  • test_write_student_2.py_lint - the static code analysis report for the unit tests of the converted file

  • test_write_student_2.py_test_output.log — The execution log for test_write_student_2.py

  • write_student_2.py — The Cobol conversion file

  • write_student_2.py_lint.txt — The static code analysis for write_student_2.py
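The naming scheme of these files follows from the name of the Cobol file. A small sketch of that mapping; `output_names` is a hypothetical helper, the `.cbl` extension in the example is an assumption, and the lint report for the unit tests is reproduced without a `.txt` suffix exactly as listed above.

```python
from pathlib import Path


def output_names(cobol_file: str) -> dict[str, str]:
    """Map each artefact type to its output file name, following the list above."""
    stem = Path(cobol_file).stem  # "write_student_2.cbl" -> "write_student_2"
    return {
        "python": f"{stem}.py",
        "python_lint": f"{stem}.py_lint.txt",
        "tests": f"test_{stem}.py",
        "tests_lint": f"test_{stem}.py_lint",
        "tests_log": f"test_{stem}.py_test_output.log",
        "rest": f"rest_{stem}.py",
        "rest_lint": f"rest_{stem}.py_lint.txt",
        "rest_critique": f"rest_critique_{stem}.txt",
    }
```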

If you are interested in the converted files, please check the following Google Drive link: http://drive.google.com/drive/u/2/folders/1F7dqo5F2_zDzD8GcLlQFj70ZLdox5SL9 

Implementation

The whole code for this command line tool can be found in this repository: http://github.com/onepointconsulting/cobol-converter 

Installation, Configuration, Running

The Cobol code converter is a Python 3.11 application which requires Conda to be installed.

The installation instructions can be found in the README of this project.

The configuration of the project relies on an .env file, similar to the .env_local file that you can find in this project.

The Cobol files are read from a directory under the project root folder. This directory is referenced by the SOURCE_CODE_DIR environment variable.
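Collecting the input files from that directory could look like the sketch below. It assumes the environment variable has already been loaded (for example from the .env file); the function names and the `.cbl` and `.cob` extensions are assumptions, not taken from the repository.

```python
import os
from pathlib import Path


def find_cobol_files(source_dir: Path, extensions=(".cbl", ".cob")) -> list[Path]:
    """Return the Cobol files directly inside source_dir, sorted by name."""
    return sorted(
        path
        for path in source_dir.iterdir()
        if path.is_file() and path.suffix.lower() in extensions
    )


def source_dir_from_env() -> Path:
    """Read the source directory from SOURCE_CODE_DIR (raises KeyError if unset)."""
    return Path(os.environ["SOURCE_CODE_DIR"])
```

The converter's outer loop would then iterate over `find_cobol_files(source_dir_from_env())`.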

The main entry point for the application is this file: http://github.com/onepointconsulting/cobol-converter/blob/main/cobol_converter/cobol_converter_main.py

This main entry point accepts three parameters which determine how the output files are written: overwrite (overwrites the output files), clear (clears the output files), only_new (only writes out files that are not yet translated).
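One way to model these three options is an enum plus a single decision function. This is a sketch of the behaviour described here, not the tool's actual argument handling; `WriteMode` and `should_convert` are hypothetical names, and it assumes that `clear` deletes the existing output before every file is converted again.

```python
from enum import Enum
from pathlib import Path


class WriteMode(Enum):
    OVERWRITE = "overwrite"  # replace existing output files
    CLEAR = "clear"          # delete existing output files first
    ONLY_NEW = "only_new"    # skip files that already have a translation


def should_convert(mode: WriteMode, target: Path) -> bool:
    """Decide whether a Cobol file whose Python output is `target` needs converting."""
    if mode is WriteMode.ONLY_NEW:
        return not target.exists()
    return True  # OVERWRITE and CLEAR both regenerate every file
```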

Prompts

We have separated the agent prompts from the code. The prompts for all agents and user proxies are in this TOML file: http://github.com/onepointconsulting/cobol-converter/blob/main/prompts.toml 

Here are some prompts used by the Cobol Conversion Team:

[agents]

   [agents.python_coder]

   system_message = """You are a helpful AI assistant.

You convert Cobol code into Python code. Please do not provide unit tests. Provide instead a main method to run the application.

Also do not omit any code for brevity. We want to see the whole code."""

   [agents.python_unit_tester]

   system_message = """You are a helpful AI assistant.

You create unit tests based on the unit test library for the Python code in the conversation.

Please copy the original Python code that you are testing to your response.

Please make sure to import the unit test library. Provide a main method to run the tests."""

   [agents.code_critic]

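   system_message = """Critic. You are a helpful assistant highly skilled in evaluating the quality of a given code by providing a score from 1 (bad) to 10 (good) while providing a clear rationale. YOU MUST CONSIDER CODING BEST PRACTICES for each evaluation. Specifically, you can carefully evaluate the code across the following dimensions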

- bugs (bugs):  are there bugs, logic errors, syntax error or typos? Are there any reasons why the code may fail to compile? How should it be fixed? If ANY bug exists, the bug score MUST be less than 5.

- Goal compliance (compliance): how well the Cobol code was converted?

- Data encoding (encoding): How good are the unit tests that you can find?


YOU MUST PROVIDE A SCORE for each of the above dimensions.

{bugs: 0, transformation: 0, compliance: 0, type: 0, encoding: 0, aesthetics: 0}

Do not suggest code.

Finally, based on the critique above, suggest which concrete actions the coder should take to improve the code.

If Unit tests are available already and seem OK, reply with TERMINATE"""

Agent Teams Setup

The teams of agents are set up in these two files:

The Cobol conversion team is set up in this file: http://github.com/onepointconsulting/cobol-converter/blob/main/cobol_converter/service/agent_setup.py 

The REST Conversion Team is set up in this file: http://github.com/onepointconsulting/cobol-converter/blob/main/cobol_converter/service/agent_rest_setup.py 

Main Workflow Implementation

The main implementation of the workflow can be found in this file: http://github.com/onepointconsulting/cobol-converter/blob/main/cobol_converter/service/cobol_conversion_service.py 

The link below points to the method that processes each Cobol file and performs the Cobol to Python conversion, unit test generation, formatting, static code analysis and REST interface creation:

http://github.com/onepointconsulting/cobol-converter/blob/main/cobol_converter/service/cobol_conversion_service.py#L94 

For more details about the code please check the code repository.

Takeaways

The Cobol Converter is a first step towards converting Cobol to Python in a short time. Smaller, simpler programmes might be translated correctly and even executed (e.g. the write_student_2 example). However, more complicated programmes (e.g. a tic-tac-toe game) were not functionally equivalent when translated to Python (i.e. you could start the program, but the game was unplayable), even if the output is syntactically correct.

The LLM we used was gpt-4-1106-preview. We have not used any other models. There might be other models which are better at converting Cobol to Python.

Ways to improve the conversion

We can think of several ways to improve the conversion:

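  • a fine-tuned LLM, specialized in the conversion of your Cobol dialect. This would be very important, especially if you plan to do a lot of conversions.

  • establish human feedback in the script generation process. The human feedback team should consist of at least one developer who is proficient in both Cobol and the target language. No matter how good LLMs are these days, they may still generate code in unexpected ways or in ways that are not aligned with your business goals. It would be very valuable if the feedback of human developers could be fed into the code generation process.

If you choose to fine-tune a model, then over time you will be able to improve it by using the results of any successful conversions as new data for fine-tuning your LLM.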

Gil Fernandes, Onepoint Consulting
