Writing Verilog code with Copilot
I have to admit that I started using AI tools like Copilot and ChatGPT a few months ago. Initially, I was resistant to the idea. However, after testing them on some projects, I was surprised, and now I frequently use them to generate auxiliary scripts. To be honest, I usually ask the AI to generate a script and then use it as a reference to create my own. It’s not a matter of trust; I simply prefer to understand the code to make debugging easier. While I often use AI to generate Python scripts, I hadn’t used it to generate Verilog code until last week. I tested the AI integration in VS Code to generate Verilog modules and came to some interesting conclusions. In this article, we will generate an AXI4 Lite Slave using only MS Copilot. Let’s see how far we can go.
The first request, or prompt, was very simple, without details.
Hello, can you create an AXI4 Lite slave with 4 RW registers and 4 RO registers?
It worked quite well, but Copilot did not make the AXI4 registers available: it generated a module with just the AXI4 Lite interface, with the registers themselves declared inside the module.
module axi4lite_slave (
input wire clk,
input wire resetn,
input wire [31:0] awaddr,
input wire awvalid,
output wire awready,
input wire [31:0] wdata,
input wire [3:0] wstrb,
input wire wvalid,
output wire wready,
output wire [1:0] bresp,
output wire bvalid,
input wire bready,
input wire [31:0] araddr,
input wire arvalid,
output wire arready,
output wire [31:0] rdata,
output wire [1:0] rresp,
output wire rvalid,
input wire rready
);
// Internal signals
reg [31:0] rw_registers [0:3]; // Four RW registers
reg [31:0] ro_registers [0:3]; // Four RO registers
reg [31:0] rdata_reg;
reg rvalid_reg, bvalid_reg;
reg awready_reg, wready_reg, arready_reg;
So, the next step was to ask Copilot to make the registers available on the ports of the module.
OK, but I need both the RW and RO registers be accessible externally
And here I had a module which looked very good. It added the AXI4 registers to the ports of the module, declaring the RW registers as outputs and the RO registers as inputs. It declared the registers as vectors (unpacked arrays), which is not bad, but array ports like these are only valid in SystemVerilog.
module axi4lite_slave (
input wire clk,
input wire resetn,
input wire [31:0] awaddr,
input wire awvalid,
output wire awready,
input wire [31:0] wdata,
input wire [3:0] wstrb,
input wire wvalid,
output wire wready,
output wire [1:0] bresp,
output wire bvalid,
input wire bready,
input wire [31:0] araddr,
input wire arvalid,
output wire arready,
output wire [31:0] rdata,
output wire [1:0] rresp,
output wire rvalid,
input wire rready,
// External access to RW registers
output reg [31:0] rw_registers [0:3],
// External access to RO registers
input wire [31:0] ro_registers [0:3]
);
Using vectors for the AXI4 registers has some advantages in the implementation. For example, on the write data side, since we can access each AXI4 register by changing the index, the number of code lines is reduced significantly. One detail that surprised me is the use of the wstrb signal. This signal is not frequently used although it exists in the AXI4 protocol. It tells the slave which bytes of the 32-bit data have to be updated. In my AXI4 slaves, sometimes I don’t use it, but if you want an AXI4-Lite compliant slave, you need to. Using vectors for the AXI4 registers makes the use of the wstrb signal easier.
// Write data handshake
always @(posedge clk or negedge resetn) begin
if (!resetn) begin
wready_reg <= 1'b0;
bvalid_reg <= 1'b0;
end else if (wvalid && !wready_reg) begin
wready_reg <= 1'b1;
if (awaddr[5:2] < 4) begin
if (wstrb[0]) rw_registers[awaddr[5:2]][7:0] <= wdata[7:0];
if (wstrb[1]) rw_registers[awaddr[5:2]][15:8] <= wdata[15:8];
if (wstrb[2]) rw_registers[awaddr[5:2]][23:16] <= wdata[23:16];
if (wstrb[3]) rw_registers[awaddr[5:2]][31:24] <= wdata[31:24];
end
bvalid_reg <= 1'b1;
end else if (bready && bvalid_reg) begin
bvalid_reg <= 1'b0;
end else begin
wready_reg <= 1'b0;
end
end
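To make the byte-enable behaviour easier to see, here is a minimal Python model of what the four `if (wstrb[n])` lines above do to a register. It is only an illustration written for this article, not part of the code Copilot generated, and the function name `apply_wstrb` is hypothetical.

```python
def apply_wstrb(current: int, wdata: int, wstrb: int) -> int:
    """Return the new 32-bit register value after a strobed write.

    Bit n of wstrb enables byte lane n (bits 8*n+7 down to 8*n) of wdata.
    Lanes whose strobe bit is 0 keep their current value.
    """
    result = current & 0xFFFFFFFF
    for lane in range(4):
        if (wstrb >> lane) & 1:
            mask = 0xFF << (8 * lane)
            # Clear the lane in the register and copy it from wdata.
            result = (result & ~mask) | (wdata & mask)
    return result & 0xFFFFFFFF


# Only byte lanes 0 and 2 are enabled (wstrb = 4'b0101).
print(hex(apply_wstrb(0x11223344, 0xAABBCCDD, 0b0101)))  # 0x11bb33dd
```

With all strobes set the register simply takes the value of wdata, and with none set it is left untouched.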
Regarding the read side, again, using vectors makes things easier. Also, we can see that Copilot uses asynchronous resets, which is not bad, but different from the way I would do it. An asynchronous reset takes effect as soon as the reset signal is asserted, regardless of the clock, which can simplify initialization, but releasing it close to a clock edge can lead to timing and metastability issues unless the release is synchronized. I prefer synchronous resets, as they are sampled on the clock edge like any other signal, which gives better timing consistency, especially in complex designs.
// Read address handshake
always @(posedge clk or negedge resetn) begin
if (!resetn) begin
arready_reg <= 1'b0;
rvalid_reg <= 1'b0;
end else if (arvalid && !arready_reg) begin
arready_reg <= 1'b1;
if (araddr[5:2] < 4) begin
rdata_reg <= rw_registers[araddr[5:2]];
end else if (araddr[5:2] < 8) begin
rdata_reg <= ro_registers[araddr[5:2] - 4];
end else begin
rdata_reg <= 32'h00000000; // Default value for invalid addresses
end
rvalid_reg <= 1'b1;
end else if (rready && rvalid_reg) begin
rvalid_reg <= 1'b0;
end else begin
arready_reg <= 1'b0;
end
end
I quickly tested this implementation of the AXI4 slave, and it works well. However, using vectors for the AXI4 registers means that, at some point in our wrapper, we need to split them into separate registers, so I asked Copilot to do this in the module itself.
I need to acces registers individually, so don’t use vectors, just registers.
This modification generated many changes in the code, especially on the write side. First of all, the ports of the module changed according to my request.
module axi4lite_slave (
input wire clk,
input wire resetn,
input wire [31:0] awaddr,
input wire awvalid,
output wire awready,
input wire [31:0] wdata,
input wire [3:0] wstrb,
input wire wvalid,
output wire wready,
output wire [1:0] bresp,
output wire bvalid,
input wire bready,
input wire [31:0] araddr,
input wire arvalid,
output wire arready,
output wire [31:0] rdata,
output wire [1:0] rresp,
output wire rvalid,
input wire rready,
// External access to RW registers
output reg [31:0] rw_reg0,
output reg [31:0] rw_reg1,
output reg [31:0] rw_reg2,
output reg [31:0] rw_reg3,
// External access to RO registers
input wire [31:0] ro_reg0,
input wire [31:0] ro_reg1,
input wire [31:0] ro_reg2,
input wire [31:0] ro_reg3
);
Then, the write side of the protocol was replaced by an (incomplete) case statement, again correctly using the wstrb signal.
// Write data handshake
always @(posedge clk or negedge resetn) begin
if (!resetn) begin
wready_reg <= 1'b0;
bvalid_reg <= 1'b0;
end else if (wvalid && !wready_reg) begin
wready_reg <= 1'b1;
case (awaddr[5:2])
4'h0: begin
if (wstrb[0]) rw_reg0[7:0] <= wdata[7:0];
if (wstrb[1]) rw_reg0[15:8] <= wdata[15:8];
if (wstrb[2]) rw_reg0[23:16] <= wdata[23:16];
if (wstrb[3]) rw_reg0[31:24] <= wdata[31:24];
end
4'h1: begin
if (wstrb[0]) rw_reg1[7:0] <= wdata[7:0];
if (wstrb[1]) rw_reg1[15:8] <= wdata[15:8];
if (wstrb[2]) rw_reg1[23:16] <= wdata[23:16];
if (wstrb[3]) rw_reg1[31:24] <= wdata[31:24];
end
4'h2: begin
if (wstrb[0]) rw_reg2[7:0] <= wdata[7:0];
if (wstrb[1]) rw_reg2[15:8] <= wdata[15:8];
if (wstrb[2]) rw_reg2[23:16] <= wdata[23:16];
if (wstrb[3]) rw_reg2[31:24] <= wdata[31:24];
end
4'h3: begin
if (wstrb[0]) rw_reg3[7:0] <= wdata[7:0];
if (wstrb[1]) rw_reg3[15:8] <= wdata[15:8];
if (wstrb[2]) rw_reg3[23:16] <= wdata[23:16];
if (wstrb[3]) rw_reg3[31:24] <= wdata[31:24];
end
endcase
bvalid_reg <= 1'b1;
end else if (bready && bvalid_reg) begin
bvalid_reg <= 1'b0;
end else begin
wready_reg <= 1'b0;
end
end
Verilator informed me here that the case was incomplete.
But Copilot can also fix it…
Here Copilot added an empty default case to fix it.
Regarding the read side, maybe I would set arready_reg in a separate process, but it worked well.
// Read address handshake
always @(posedge clk or negedge resetn) begin
if (!resetn) begin
arready_reg <= 1'b0;
rvalid_reg <= 1'b0;
end else if (arvalid && !arready_reg) begin
arready_reg <= 1'b1;
case (araddr[5:2])
4'h0: rdata_reg <= rw_reg0;
4'h1: rdata_reg <= rw_reg1;
4'h2: rdata_reg <= rw_reg2;
4'h3: rdata_reg <= rw_reg3;
4'h4: rdata_reg <= ro_reg0;
4'h5: rdata_reg <= ro_reg1;
4'h6: rdata_reg <= ro_reg2;
4'h7: rdata_reg <= ro_reg3;
default: rdata_reg <= 32'h00000000; // Default value for invalid addresses
endcase
rvalid_reg <= 1'b1;
end else if (rready && rvalid_reg) begin
rvalid_reg <= 1'b0;
end else begin
arready_reg <= 1'b0;
end
end
Finally, the generated module is poorly documented, so I asked Copilot to add a header with some fields.
can you add a header comment with some fields like author, date, module name, and a description of the module?
Copilot generated a detailed description of the module as well as a list of the registers with their corresponding offsets.
/*
* Module: axi4lite_slave
* Author: [Your Name]
* Date: May 3, 2025
* Description:
* This module implements an AXI4-Lite slave interface with four read-write (RW)
* registers and four read-only (RO) registers. The RW registers are writable
* via the AXI4-Lite interface and accessible externally. The RO registers are
* read-only from the AXI4-Lite interface and their values are driven externally.
*
* Features:
* - AXI4-Lite protocol compliance
* - Individual access to RW and RO registers
* - Address decoding for up to 8 registers
*
* Register Map:
* Address | Register
* ---------|-----------------
* 0x00 | RW Register 0
* 0x04 | RW Register 1
* 0x08 | RW Register 2
* 0x0C | RW Register 3
* 0x10 | RO Register 0
* 0x14 | RO Register 1
* 0x18 | RO Register 2
* 0x1C | RO Register 3
*/
module axi4lite_slave (
...
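The register map in the header follows directly from the `awaddr[5:2]` and `araddr[5:2]` decoding used in the case statements. As a rough sketch of that decoding (my own illustration, with the hypothetical helper name `decode` and the register counts assumed to be 4 and 4 as in this module), the mapping from a byte address to a register can be written in Python as:

```python
RW_COUNT = 4  # number of RW registers in this module
RO_COUNT = 4  # number of RO registers in this module


def decode(addr: int) -> str:
    """Map a byte address to the register selected by addr[5:2]."""
    index = (addr >> 2) & 0xF  # same bits as awaddr[5:2] / araddr[5:2]
    if index < RW_COUNT:
        return f"rw_reg{index}"
    if index < RW_COUNT + RO_COUNT:
        return f"ro_reg{index - RW_COUNT}"
    return "unmapped"  # the read side returns 32'h00000000 here


for offset in (0x00, 0x0C, 0x10, 0x1C, 0x20):
    print(f"0x{offset:02X} -> {decode(offset)}")
```

Note that only bits 5 down to 2 take part in the decoding, so in this model an address such as 0x40 selects the same register as 0x00.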
At this point, the module needed to be tested. For this test, I used the KR260 with Vivado 2024.2, and it took me about 3 hours to make it work properly. Let’s start from the beginning. To generate data to feed the AXI4 Lite slave, I used a Virtual Input/Output (VIO) module, which allows me to write and read data over JTAG. The complete Block Design is shown in the next image.
Now it’s time to create the Petalinux build for this board. First, since I used the BSP for the KR260, I downloaded the version without the _xsct_ text in its name. AMD is transitioning from the traditional xsa workflow to the System Device Tree (SDT) workflow for generating hardware. The key difference is that to generate the sdt folder, you need the xsa file and the xsct terminal, which Petalinux has announced will soon be deprecated.
Initially, I attempted to use this new SDT workflow, but it caused the Petalinux build to fail. After some research, I discovered that the BSP I downloaded is only compatible with the new SDT workflow. However, AMD is still releasing a BSP that works with the older (and currently functional) workflow, which includes _xsct_ in its name. Switching to this version resolved the first issue.
Once my Petalinux was ready, I noticed that I couldn’t communicate with my IP from Petalinux. The reason was that the PL clocks were stopped. This was an issue in Petalinux 2024.1, and I hoped it had been fixed in the 2024.2 version. No worries, I know how to fix it… or not? Well, the solutions provided by AMD (adding clk_ignore_unused to the bootargs, and modifying the device tree) didn’t work this time.
After another hour of research, I tried to modify the kernel bootargs from U-Boot at board power-up, instead of in the Petalinux configuration. This approach works because U-Boot directly passes the specified boot arguments to the kernel during the boot process, bypassing potential issues or overrides in the Petalinux configuration. By setting the bootargs in U-Boot, we ensure that the kernel receives the correct parameters, which is particularly useful when debugging or dealing with configuration inconsistencies. To do this, we need to stop the autoboot and set (again) the bootargs.
...
Get shared mii bus on ethernet@ff0b0000
ZYNQ GEM: ff0b0000, mdio bus ff0c0000, phyaddr 4, interface sgmii
eth0: ethernet@ff0b0000, eth1: ethernet@ff0c0000
tpm_tis_spi_probe: missing reset GPIO
Hit any key to stop autoboot: 0
ZynqMP> setenv bootargs "earlycon console=ttyPS1,115200 root=/dev/ram0 rw init_fatal_sh=1 xilinx_tsn_ep.st_pcp=4 cma=900M clk_ignore_unused"
ZynqMP> printenv bootargs
bootargs=earlycon console=ttyPS1,115200 root=/dev/ram0 rw init_fatal_sh=1 xilinx_tsn_ep.st_pcp=4 cma=900M clk_ignore_unused
ZynqMP> boot
starting USB...
Bus usb@fe200000: Register 2000440 NbrPorts 2
...
By doing this, the clocks were running, and finally I could test my IP. We can check that the clock is running by reading clk_summary.
xilinx-kr260-starterkit-xsct-20242:~$ sudo cat /sys/kernel/debug/clk/clk_summary | grep pl0
pl0_ref_mux 0 0 0 999999990 0 0 50000 Y deviceless no_connection_id
pl0_ref_div1 0 0 0 99999999 0 0 50000 Y deviceless no_connection_id
pl0_ref_div2 0 0 0 99999999 0 0 50000 Y deviceless no_connection_id
pl0_ref 0 0 0 99999999 0 0 50000 Y deviceless no_connection_id
Now, I was able to read from and write to Copilot’s AXI4 Lite slave.
xilinx-kr260-starterkit-xsct-20242:~$ sudo devmem 0x80000000 32
0x00000000
xilinx-kr260-starterkit-xsct-20242:~$ sudo devmem 0x80000010 32
0x00000001
xilinx-kr260-starterkit-xsct-20242:~$ sudo devmem 0x80000000 32 0x00000008
xilinx-kr260-starterkit-xsct-20242:~$ sudo devmem 0x80000004 32 0x00000015
xilinx-kr260-starterkit-xsct-20242:~$ sudo devmem 0x8000000c 32 0x000000ff
xilinx-kr260-starterkit-xsct-20242:~$ sudo devmem 0x80000008 32 0x000000ff
xilinx-kr260-starterkit-xsct-20242:~$ sudo devmem 0x80000014 32
0x0000DEAD
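The same accesses can be scripted instead of typed one by one. The following is a minimal sketch, assuming the slave sits at the base address 0x80000000 used in the devmem calls above and that a single 4 KiB page covers the register block (that size is my assumption). The helper names are hypothetical, and the script has to run as root on the board. Depending on how Python copies the data, an access may not reach the bus as a single 32-bit transaction, so treat it as a convenience for experiments, not as a driver.

```python
import mmap
import os
import struct

BASE_ADDR = 0x80000000  # base address taken from the devmem calls above
MAP_SIZE = 0x1000       # assumption: one 4 KiB page covers all registers


def read_reg(mem, offset: int) -> int:
    """Read a 32-bit little-endian word at a byte offset of the buffer."""
    return struct.unpack_from("<I", mem, offset)[0]


def write_reg(mem, offset: int, value: int) -> None:
    """Write a 32-bit little-endian word at a byte offset of the buffer."""
    struct.pack_into("<I", mem, offset, value & 0xFFFFFFFF)


def open_registers(base: int = BASE_ADDR, size: int = MAP_SIZE) -> mmap.mmap:
    """Map the register block through /dev/mem (Linux only, needs root)."""
    fd = os.open("/dev/mem", os.O_RDWR | os.O_SYNC)
    try:
        return mmap.mmap(fd, size, mmap.MAP_SHARED,
                         mmap.PROT_READ | mmap.PROT_WRITE, offset=base)
    finally:
        os.close(fd)  # the mapping keeps its own duplicate of the descriptor
```

On the board, `regs = open_registers()` followed by `write_reg(regs, 0x00, 0x8)` and `read_reg(regs, 0x14)` would mirror the devmem session above. Because the helpers accept any writable buffer, they can also be tried on a PC with a plain `bytearray`.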
At this point, we have a working AXI4 Lite slave, but what happens if I need to add more registers? I can ask Copilot to generate another module, or I can modify the current module, but I love to automate things, so why not ask Copilot to generate an AXI4 Lite slave generator using the current module as a base?
Create a python script where, configuring the name of the module, the number of rw registers and the number of ro registers, generates the verilog module axi4lite_slave according the features introduced.
def generate_axi4lite_slave(module_name, num_rw, num_ro):
# Header comment
header = f"""\
/*
* Module: {module_name}
* Author: [Your Name]
* Date: May 3, 2025
...
Here Copilot generated a Python function that generates the code according to the configured arguments. However, it would be great to generate the module from the command line.
I need that the parameters will be configurable as arguments in the command line
Copilot, using the Python library argparse, generated a script that can now be configured from the command line.
...
if __name__ == "__main__":
parser = argparse.ArgumentParser(description="Generate an AXI4-Lite slave Verilog module.")
parser.add_argument("--module_name", type=str, required=True, help="Name of the Verilog module.")
parser.add_argument("--num_rw", type=int, required=True, help="Number of RW registers.")
parser.add_argument("--num_ro", type=int, required=True, help="Number of RO registers.")
args = parser.parse_args()
...
To test this Python script, we can call it with the module_name, num_rw and num_ro arguments.
$ python3 generate_axi4lite_slave.py --module_name axi4lite_slave_py --num_rw 4 --num_ro 4
Generated axi4lite_slave_py.v
At this point, we are able to automate the creation of AXI4 Lite slave modules. If we examine the code, it is quite straightforward. The Python script predefines the structure of the Verilog module and then dynamically generates the sections related to the registers.
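To illustrate what "dynamically generates the sections related to the registers" means, here is a small sketch of two such sections: the register ports and the read-side case branches. This is not the script Copilot produced (that one is only partially shown above). The function names are hypothetical, and the sketch assumes at most 16 registers in total because it keeps the 4-bit case selector.

```python
def register_ports(num_rw: int, num_ro: int) -> str:
    """Build the register port declarations that close the port list."""
    lines = [f"    output reg  [31:0] rw_reg{i}," for i in range(num_rw)]
    lines += [f"    input  wire [31:0] ro_reg{i}," for i in range(num_ro)]
    if lines:
        lines[-1] = lines[-1].rstrip(",")  # last port has no trailing comma
    return "\n".join(lines)


def read_case(num_rw: int, num_ro: int) -> str:
    """Build the read-side case branches, RW registers first, then RO."""
    names = [f"rw_reg{i}" for i in range(num_rw)]
    names += [f"ro_reg{i}" for i in range(num_ro)]
    lines = [f"        4'h{index:X}: rdata_reg <= {name};"
             for index, name in enumerate(names)]
    # Always emit a default branch so the case statement is complete.
    lines.append("        default: rdata_reg <= 32'h00000000;")
    return "\n".join(lines)


print(register_ports(2, 1))
print(read_case(2, 1))
```

The fixed parts of the module (the AXI4 Lite ports and the handshake logic) would stay as a template string, with the output of functions like these inserted in the right places.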
We could stop here, but why not take it a step further? Creating AXI4 Lite slaves is a common task for many designers. Often, when you’re designing your first module, you need examples. Instead of just sharing the Python script for others to download and use, we could create a webpage where this tool is accessible to everyone.
To achieve this, integrating the Python script into a webpage would require setting up a server to run the script, exposing that server to the internet, and implementing security measures to prevent potential attacks. A simpler and more efficient solution is to translate the Python script into JavaScript, allowing the browser to execute it locally on the client side.
translate the python script to javascript, and create a webpage where I configure the number of RW and RO registers, the name of the module, and it generate the verilog module. Also I want to see the Verilog code with a correct lighting.
Here Copilot generated a single HTML file containing the webpage code, styles and script. It is true that I am not a web developer, but I know that this is not the best way, so I asked Copilot to generate separate files, and also to generate the CSS file according to the style of this blog.
I need separate files for javascript, html and css. Also I need to add a button to download the verilog code. Furthermore, I want to make it nice, so add a container for the AXI4lite slave generator. make it with the same style of controlpaths.com
Here is the result.
To verify this tool, I generated another AXI4 Lite slave with the same features as the one generated by Copilot, and I saw that there were no differences in the code, so the translation and the web implementation were successful.
At this point, having all of this generated, why not integrate it into this blog? You can find it in /utils/axi4lite_gen/. I tested it with different scenarios and it seems to work well; however, use it carefully, and let me know if you find any bugs. I will update it to make the code more “formal”.
The question is, can anybody generate AXI4 Lite slave code? Well, developing is not just writing code. In this article we saw how AI can generate code, but I need to know how to write the requests in order to get the code I need. For example, in this case, I needed to know the structure of a Verilog file and what an incomplete case is. Then, to translate it to a webpage, I first need to know that JavaScript lets me execute code in the browser, and that webpage styles are based on CSS files that can be modified. And finally, the integration of the tool into this blog was done entirely by me. So, answering the first question: yes, anybody can generate an AXI4 Lite slave, but only FPGA designers will make it work.