Difference between revisions of "BeagleBoard/GSoC/2022 Proposal/TaliaXu"

From eLinux.org

Revision as of 08:34, 16 April 2022

Proposal for BeagleWire PRU and Support

About

Student: Talia Xu
Mentors: Michael Welling, Omkar Bhilare
Proposal: Implementing PRU and Improving Standalone Cores on BeagleWire

Proposal

  • Completed all the requirements listed on the ideas page.
  • Pull request for the cross-compilation task: #162.

Status

This project is currently just a proposal.

About you

Github: taliaxu09
School: Technical University Delft
Country: The Netherlands
Primary language: English
Typical work hours: 12PM-8PM CET
Previous GSoC participation: This is my first time applying to GSoC. I want to participate in GSoC with BeagleBoard because it is a great opportunity to gain deeper knowledge of the BeagleBoard code repositories and to explore how to use the board with different peripherals. I also think the BeagleWire could be a promising platform for some of the topics I wish to look into in my studies on visible light communication.

About your project

Project name: RISC-V Based PRU on FPGA and BeagleWire Updates

Main Goals:

  1. Create a RISC-V PRU on BeagleWire and optimize it for I/O latency
  2. Create examples in assembly that use the PRU cores on BeagleWire
  3. Improve the stability of the subsystems in the standalone cores and implement testbenches for them
  4. Improve the documentation

Description

Introduction

The BeagleWire is an FPGA cape built around the Lattice iCE40HX that connects to and interfaces with the BeagleBoard. This project has two main goals. The first is to implement a programmable real-time unit (PRU) on BeagleWire to allow low-latency I/O control between the main CPU and peripherals. The second is to revisit and improve the software support for the standalone cores (SDRAM, UART, SPI, PWM).

RISC-V Cores on FPGA

Several open-source RISC-V cores can be leveraged to implement a PRU on the BeagleWire.

For iCE40:
  1. https://github.com/olofk/serv
  2. https://github.com/stnolting/neorv32
  3. https://github.com/sylefeb/Silice/tree/draft/projects/ice-v

The above cores can be used as a starting point and reference to quickly implement a working RISC-V core on the current BeagleWire cape. Once the cores run on the BeagleWire, the focus will shift to the following improvements:

  1. Interface between the BeagleWire PRU cores and the BBB: create an interface that lets the BBB manage both the PRU on the BBB and the PRU on the BeagleWire simultaneously
  2. Measure and improve the I/O latency of the PRU cores: identify any bottlenecks in how the PRU cores communicate with peripherals and look into ways to remove them. The I/O latencies of the PRU cores on the BBB will be used as a reference.
  3. Fit as many PRU cores as possible on the iCE40: run multiple cores simultaneously on the BeagleWire with shared access to SDRAM. I also plan to identify components that are not necessary on BeagleWire and remove them to maximize the number of PRU cores that fit.

For PolarFire:
  1. https://www.microsemi.com/product-directory/soc-fpgas/5498-polarfire-soc-fpga

The approach to supporting a PRU on PolarFire is less well worked out, but the plan is to start from the official RISC-V support on PolarFire.

Getting a working RISC-V core running on the iCE40 or PolarFire is the first step of the project, but the focus of the project is to improve the latency of I/O access for peripherals.

Soft RISC-V-based CPU core for low-latency I/O on BeagleWire

In the above RISC-V implementations, the I/Os tend to be memory-mapped. Memory reads and writes are multi-cycle instructions and can cause delays that violate the timing requirements of certain peripherals. For this reason, in this project I will connect one of the 31 general-purpose registers directly to the I/O pins.
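
The intended semantics can be illustrated with a small behavioral model. The sketch below is Python rather than HDL and only shows the idea: a write to one register appears on the output pins and a read from it returns the input pin state, with no load/store involved. The choice of x31 as the I/O register, the class name, and the `pins_in`/`pins_out` attributes are assumptions made for illustration; they are not taken from any of the cores listed above.

```python
class RegisterFile:
    """Behavioral model of a 32-entry RISC-V integer register file in which
    one register is wired to I/O pins instead of storage."""

    IO_REG = 31  # assumption: x31 is the register exposed to the pins

    def __init__(self, width=32):
        self.mask = (1 << width) - 1
        self.regs = [0] * 32
        self.pins_in = 0   # state driven by the external pins (inputs)
        self.pins_out = 0  # state driven onto the external pins (outputs)

    def write(self, idx, value):
        if idx == 0:
            return  # x0 is hardwired to zero in RISC-V
        value &= self.mask
        if idx == self.IO_REG:
            self.pins_out = value  # register write goes straight to the pins
        else:
            self.regs[idx] = value

    def read(self, idx):
        if idx == 0:
            return 0
        if idx == self.IO_REG:
            return self.pins_in & self.mask  # register read samples the pins
        return self.regs[idx]
```

In this model an ordinary register-to-register instruction that targets the I/O register is enough to toggle a pin, which is the property the modified core is meant to have.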

The following is roughly my plan for reaching this goal. If time permits, I will try to make progress before the submission deadline:

  1. Run the PRU example https://www.glennklockwood.com/embedded/beaglebone-pru.html and characterize the latency with a ring buffer (https://pub.pages.cba.mit.edu/ring/)
  2. Implement a RISC-V core on BeagleWire or a PolarFire dev board and characterize the same latency (latency can be measured with a digital probe and/or estimated from Fmax and the number of cycles)
  3. Go through the HDL of the RISC-V core to verify whether the register block is instantiated with FFs or a RAM block; if a RAM block is used, re-implement it with FFs
  4. Modify the pin-outs for one of the registers so that the I/Os have direct access to it; modify the logic to write any peripheral state/data to this register
  5. Characterize the latency again; if it looks like Fmax can be improved further, that will be the next thing to look at
  6. Write the peripheral code in assembly to make sure the compiler doesn't touch the I/O register
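
The "Fmax + # of cycles" estimate mentioned in the plan is a simple calculation, sketched below. The function name and the example numbers are illustrative assumptions, not measurements from the BeagleWire or the BBB.

```python
def io_latency_ns(cycles: int, fmax_mhz: float) -> float:
    """Estimate I/O latency in nanoseconds from a cycle count and a clock
    frequency in MHz (one cycle lasts 1000 / fmax_mhz nanoseconds)."""
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if fmax_mhz <= 0:
        raise ValueError("fmax_mhz must be positive")
    return cycles * 1000.0 / fmax_mhz


def speedup(cycles_before: int, cycles_after: int, fmax_mhz: float) -> float:
    """Ratio of the estimated latency before and after a change,
    assuming the clock frequency stays the same."""
    return io_latency_ns(cycles_before, fmax_mhz) / io_latency_ns(cycles_after, fmax_mhz)
```

For example, with a hypothetical Fmax of 50 MHz, a 4-cycle memory-mapped store would take `io_latency_ns(4, 50.0)` = 80 ns, while a 1-cycle register write would take 20 ns. An estimate like this complements a probe measurement but does not replace it.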

Supporting multiple peripherals

When multiple peripherals are connected to the PRU and a single register is insufficient, I will implement a logic block for multiplexing the I/O pins. Current thoughts:

  1. This part is perhaps not as timing-critical, so I might get away with implementing it in memory
  2. An alternative is to reserve 2 bits of the register for peripheral selection (up to 4 peripherals)
  3. It is also possible to implement multiple cores on an FPGA, each supporting one peripheral. The Silice project can be used as a starting point for this direction
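
The second option can be sketched as a bit-field layout. The example below assumes a 32-bit I/O register with the two select bits placed in the top two bit positions; that placement, and the `pack`/`unpack` helper names, are assumptions for illustration only.

```python
from typing import Tuple

REG_WIDTH = 32                      # assumption: 32-bit I/O register
SEL_BITS = 2                        # 2 select bits address up to 4 peripherals
DATA_BITS = REG_WIDTH - SEL_BITS    # remaining bits carry pin data
DATA_MASK = (1 << DATA_BITS) - 1
SEL_MASK = (1 << SEL_BITS) - 1


def pack(sel: int, data: int) -> int:
    """Combine a peripheral index and its pin data into one register word."""
    if not 0 <= sel <= SEL_MASK:
        raise ValueError("peripheral index out of range")
    if not 0 <= data <= DATA_MASK:
        raise ValueError("data does not fit in the data field")
    return (sel << DATA_BITS) | data


def unpack(word: int) -> Tuple[int, int]:
    """Split a register word back into (peripheral index, pin data)."""
    return (word >> DATA_BITS) & SEL_MASK, word & DATA_MASK
```

The cost of this layout is that two bits of the register are no longer available as I/O pins, so each peripheral sees at most 30 data bits.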

Improving the stand-alone cores on BeagleWire

The previous issues with several BeagleWire subsystems have been addressed with LiteDRAM, but the issues with the standalone cores remain unresolved. As part of my project, I would like to take a closer look at the standalone subsystems.

The tasks I would like to achieve in this project are:

  1. Implement an automated test script for each of the subsystems (SDRAM, SPI, UART)
  2. Read the RTL to find any timing violations, as well as other issues that could have caused the unstable behavior
  3. Improve the reliability of the standalone cores by solving the issues identified
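
As a rough idea of what an automated SDRAM test could look like, the sketch below runs two common memory patterns (walking ones and address-as-data) against an abstract memory object. The `write_word`/`read_word` interface, the 16-bit word width, and the `FakeMemory` stand-in are all hypothetical; a real script would replace `FakeMemory` with whatever access path the BeagleWire driver exposes.

```python
class FakeMemory:
    """In-memory stand-in for the device under test (hypothetical interface).
    stuck_low_mask simulates data bits that always read back as 0."""

    def __init__(self, size_words, width=16, stuck_low_mask=0):
        self.mask = (1 << width) - 1
        self.stuck = stuck_low_mask
        self.data = [0] * size_words

    def write_word(self, addr, value):
        self.data[addr] = value & self.mask & ~self.stuck

    def read_word(self, addr):
        return self.data[addr]


def walking_ones(mem, addr, width=16):
    """Write each single-bit pattern to one address and read it back.
    Returns a list of (addr, expected, got) for every mismatch."""
    failures = []
    for bit in range(width):
        pattern = 1 << bit
        mem.write_word(addr, pattern)
        got = mem.read_word(addr)
        if got != pattern:
            failures.append((addr, pattern, got))
    return failures


def address_as_data(mem, size_words, width=16):
    """Fill every word with its own address, then verify the whole range.
    Catches address-line faults that a single-address test cannot see."""
    mask = (1 << width) - 1
    for addr in range(size_words):
        mem.write_word(addr, addr & mask)
    failures = []
    for addr in range(size_words):
        got = mem.read_word(addr)
        if got != (addr & mask):
            failures.append((addr, addr & mask, got))
    return failures
```

Returning the list of mismatches, rather than a pass/fail flag, makes it easier to tell a stuck data bit from an intermittent timing problem when the script is run repeatedly.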

Timeline

Date Status Details
Presubmission
  • Build a BeagleWire image & install toolchain (completed)
  • Verify that I am able to edit, recompile and program the FPGA (completed)
  • Define the exact scope of the project before the end of this period (I have a clearer idea after talking with the folks on Slack, but this can be finalized during community bonding)
May 20th - June 12th Community Bonding
  • Read the PRU documentation, and run PRU examples on BBB and LiteX
  • Read about the RISC-V implementation on Silice
  • Go through PRU examples on BBB and measure the I/O latency
  • Introduction Video
June 13th Milestone #1
  • Implement neorv32 and serv on BeagleWire (using the Silice implementation as a starting point: https://github.com/sylefeb/Silice/tree/draft/projects/ice-v). Both will be implemented because their designs are quite different (serv is a bit-serial core that trades speed for space)
  • Open a pull request adding BeagleWire support to https://github.com/olofk/serv
  • Have a working initial version of both implementations on BeagleWire as a starting point for further improvement
June 20th Milestone #2
June 27th Milestone #3
  • Modify the top-level logic of the register file to make one of the registers reachable from I/O
  • Make sure I/O states and data are written to the exposed register
  • Look into the space usage and optimize it (if needed) to fit multiple cores
July 4th Milestone #4
  • Modify the assembly code for the peripherals to make sure the register values are not overwritten by the compiler
  • Implement multiple PWMs, low latency examples
  • Implement an example running both PRUs on FPGA and PRUs on BBB
July 11th Milestone #5
  • Implement other examples (soft UARTs, low-latency IOs)
July 18th Milestone #6
  • Demonstration of multiple peripherals running on separate cores at the same time
  • Prepare a report summarizing progress with the PRU cores
July 25th Milestone #7
  • Read about the previous open issues on the standalone cores
  • Look into reproducing the #7 and #8 SDRAM issues with the standalone core
  • Try to reproduce the error with the standalone SDRAM using a test script
August 1st Milestone #8
  • Look into the Verilog code and timing reports to find the cause of the SDRAM errors
August 8th Milestone #9
  • Implement automated test script for different subsystems (SPI, PWM, UART)
August 15th Milestone #10
  • Look into the Verilog code and timing reports to see if there is anything wrong with the different subsystems
August 22nd Milestone #11
  • Read up on nMigen support if there is time; otherwise use the week as a buffer for overflow
August 29th Milestone #12
  • Submit final work product and final mentor evaluation
  • Complete YouTube video
Sep. 5th Milestone #13
  • Completion of GSoC

Experience and approach

  • I am currently pursuing a PhD in embedded systems, and am familiar with the concepts/implementations of low-power processors.
  • I have working experience with Verilog and embedded systems from previous internships and research projects. I also have some experience working with low-latency I/O on FPGA boards. I did an undergraduate capstone project and an internship using FPGAs as accelerators for machine learning algorithms.
  • I have experience working with existing code repositories, and from previous work and internship experience at large software companies I am able to navigate a large code base quite well.
  • I have working experience designing PCBs of up to 8 layers in Altium and KiCad.

Contingency

If I get stuck on my project and my mentor isn’t around, I will use the following resources:

  1. Getting Started Guide for BeagleBone by Derek Molloy: http://derekmolloy.ie/beaglebone
  2. PRU Cookbook: [1]
  • BeagleWire Repo:
  1. https://github.com/BeagleWire
  • Documentation and Repo on RISC-V cores:
  1. https://github.com/olofk/serv
  2. https://github.com/stnolting/neorv32
  3. https://github.com/sylefeb/Silice/tree/draft/projects/ice-v

There is a good amount of resources on everything that I plan to implement. If all of the above fail, I will post on the BBB Slack/IRC channel.

Benefit

(I am reusing last year's benefit statement because the goal of the project, and thus the benefit, is essentially the same.) The completed project will provide the BeagleBoard.org community with easy-to-use and powerful tools for realizing projects based on programmable logic devices (FPGAs), which should increase the number of applications based on them. The developed software will be an easy and, at the same time, efficient tool for communicating with the FPGA. With it, the FPGA will be able to meet the requirements of even more advanced applications. The BeagleWire gives users a powerful and versatile digital cape for creating their own digital designs.

It is largely about advancing RISC-V and learning about a key architectural benefit seen in earlier BeagleBone systems. Note how TI has removed PRU documentation from TDA4VM, despite it being a key value to BeagleBone users? This is about developing a PRU-like CPU that is open source on an open ISA, not about the ideal use of FPGA fabric. It could also be a handy way to use FPGA fabric without requiring users to understand how to generate FPGA code themselves, since they can just program the RISC-V core, but that is secondary to the development and analysis of the core itself.