Difference between revisions of "BeagleBoard/GSoC/2023 Proposal/Khushi-Balia"

From eLinux.org
 
==Proposal for Building an LLVM Backend for PRU==
 
*Student: Khushi Balia
*Mentors: Vedant Paranjape, Shreyas Atre
*Proposal: https://elinux.org/BeagleBoard/GSoC/2023_Proposal/Khushi-Balia
*GSoC: Proposal Request

= Status =
* This project is currently just a proposal.

= Proposal =
* Completed the prerequisites.
* Created a PR for the task: https://github.com/jadonk/gsoc-application/pull/175

== About You ==
* IRC Name: Khushi Balia
* GitHub: https://github.com/Khushi-Balia
* College: [https://vjtimumbai.in/ Veermata Jijabai Technological Institute (VJTI)]
* Country: India
* Primary languages: English, Hindi, Gujarati
* Typical work hours: 9 AM - 7 PM Indian Standard Time
* Experience:
** In https://github.com/Khushi-Balia/le-transpiler I built a transpiler that converts code written in a custom language, PYLOX, into equivalent C code.
** My areas of interest are compiler development and embedded systems.
** I am familiar with git and with working in a Linux environment.
** I am a core member of my institute's robotics club, the [https://www.sravjti.in/ Society of Robotics and Automation (SRA)].
** I am participating in GSoC for the first time.

= About Your Project =

* Project name: '''Building an LLVM Backend for PRU'''

== Description ==
===Why LLVM?===
*A compiler architected on the LLVM model has a major benefit: because each stage is modular and has well-defined boundaries, new source languages, target architectures, and optimization passes can be added or modified mostly independently of each other.
*LLVM differs from most traditional compiler projects in that it is not just a collection of individual programs but a collection of libraries. These libraries are designed using object-oriented programming and are extensible and modular. This, along with its three-phase approach and its modern code design, makes it a very appealing compiler infrastructure to work with.

[[File:llvm.jpeg|700px|thumb|center]]

===The LLVM Backend (Code Generator Design)===
*The code generator framework provides many classes, methods, and tools to help translate LLVM IR into target-specific assembly or machine code. The two main target-specific components of a custom backend are the abstract target description and the abstract target description implementation.

*'''TableGen''': necessary for writing the abstract target description. This tool translates a target description file (.td) into C++ code that is used in code generation. Its main goal is to reduce large, tedious descriptions to smaller, flexible definitions that are easier to manage and structure.
*We’ll use TableGen to define each of the registers in the PRU architecture. The AsmWriter TableGen backend generates the C++ code that helps print the target-specific assembly code.

[[File:tablgen.jpeg|500px|thumb|center]]

*'''Clang and llc''': Clang is the LLVM front end for C, C++, and Objective-C/C++. The llc tool is the LLVM static compiler. Each custom backend written for LLVM is linked into llc, which then compiles LLVM IR into target-specific assembly or machine code.

<br>
'''Custom Target Implementation''':
The custom LLVM backend inherits from and extends many of the LLVM classes. Most of the backend's files will be placed in a new lib/Target/PRU/ directory in the LLVM tree. The “entry point” for the PRU LLVM backend will be within the PRUMCTargetDescription. This is where the backend is registered with the LLVM TargetRegistry so that LLVM can find and use the backend.

<br>
'''Abstract Target Description''':
The majority of the abstract target description is written in TableGen format. The major components of the PRU backend that will be written in TableGen are the register information, calling convention, special operands, instruction formats, and the complete instruction definitions.

*Register Information: The register information will be defined in PRURegisterInfo.td. This file will define the register set of the PRU as well as the different register classes.
*Calling Conventions: The calling convention definitions control how data is passed between functions across a call. They’ll be defined in PRUCallingConv.td.
*Instruction Formats: The instruction formats describe the instruction word layouts of the PRU instructions. They will be defined in PRUInstrFormats.td.
*Complete Instruction Definitions: The complete instruction definitions inherit from the instruction format classes to complete the TableGen Instruction base class. They will be defined in PRUInstrInfo.td.

<br>
'''Instruction Selection''':
The instruction selection stage of the backend is responsible for translating LLVM IR into target-specific machine instructions. The instruction selector runs in phases: SelectionDAG construction, legalization, selection, and scheduling.
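
The selection phase can be pictured as pattern matching on operand kinds. The sketch below is a toy model in plain C++, not LLVM's SelectionDAG API: the `ToyAdd` struct, the `selectAdd` function, the `add` mnemonic spelling, and the 0 to 255 immediate range are all assumptions made for illustration.

```cpp
#include <string>

// Toy stand-in for one node of a selection DAG (not LLVM's SDNode):
// an addition whose second operand is a register or a constant.
struct ToyAdd {
  int DstReg;
  int LhsReg;
  bool RhsIsConst;
  int Rhs; // register number, or the constant when RhsIsConst is true
};

// Matches the node against two patterns and returns the chosen assembly.
// An empty string means no pattern matched.
std::string selectAdd(const ToyAdd &N) {
  const std::string Prefix = "add r" + std::to_string(N.DstReg) + ", r" +
                             std::to_string(N.LhsReg) + ", ";
  if (!N.RhsIsConst)
    return Prefix + "r" + std::to_string(N.Rhs); // register-register form
  if (N.Rhs >= 0 && N.Rhs <= 255)
    return Prefix + std::to_string(N.Rhs);       // register-immediate form
  return ""; // constant too wide for the assumed immediate field
}
```

In a real backend the unmatched case would be handled earlier, typically by materializing the wide constant into a register so that the register-register pattern applies.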
<br>

'''Register Allocation''':
This phase of the backend eliminates all virtual registers from the list of machine instructions and replaces them with physical registers.
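
As a rough illustration of that replacement, the following plain C++ sketch hands each distinct virtual register the next free physical register. It is deliberately naive and assumes nothing about LLVM's allocators: it ignores live ranges and simply fails when it runs out of registers, where a real allocator would reuse registers and spill to the stack. The function name `assignRegisters` is hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Maps every distinct virtual register in VirtRegUses to a physical
// register taken in order from FreePhysRegs. Returns false when there
// are more virtual registers than physical ones.
bool assignRegisters(const std::vector<int> &VirtRegUses,
                     const std::vector<int> &FreePhysRegs,
                     std::map<int, int> &Assignment) {
  std::size_t Next = 0;
  for (int VReg : VirtRegUses) {
    if (Assignment.count(VReg))
      continue; // already assigned on an earlier use
    if (Next == FreePhysRegs.size())
      return false; // out of physical registers
    Assignment[VReg] = FreePhysRegs[Next++];
  }
  return true;
}
```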

<br>
'''Code Emission''':
The final phase of the backend emits the machine instruction list as either target-specific assembly code (emitted by the assembly printer) or machine code (emitted by the object writer).

* Assembly Printer and Object Writer: Printing assembly code requires implementing several custom classes; the machine code is emitted in the form of an object file.
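
On the machine code side, emission ultimately comes down to writing each instruction word as bytes. The sketch below assumes the 32-bit instruction word and little-endian byte order used elsewhere in this proposal; `emitLittleEndian32` is a hypothetical helper in plain C++, not part of LLVM's MC layer.

```cpp
#include <array>
#include <cstdint>

// Splits a 32-bit instruction word into four bytes,
// least significant byte first (little-endian order).
std::array<std::uint8_t, 4> emitLittleEndian32(std::uint32_t Word) {
  std::array<std::uint8_t, 4> Bytes = {{
      static_cast<std::uint8_t>(Word & 0xFF),
      static_cast<std::uint8_t>((Word >> 8) & 0xFF),
      static_cast<std::uint8_t>((Word >> 16) & 0xFF),
      static_cast<std::uint8_t>((Word >> 24) & 0xFF),
  }};
  return Bytes;
}
```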

[[File:flowchart1.jpeg|700px|thumb|center]]

=Implementation Details=

*I’ll make a new directory for PRU inside the target directory, lib/Target/PRU/, which will contain the following custom files:
# PRU.td
# PRUCallingConv.td
# PRUInstrFormats.td
# PRUInstrInfo.td
# PRURegisterInfo.td
# PRURegisterInfo.h
# PRURegisterInfo.cpp
# PRUInstrInfo.h
# PRUInstrInfo.cpp
# PRUFrameLowering.h
# PRUFrameLowering.cpp
# PRUISelDAGToDAG.cpp
# PRUISelLowering.cpp
# PRUISelLowering.h
# PRUMCInstLower.cpp
# PRUMCInstLower.h
# PRUMachineFunctionInfo.cpp
# PRUMachineFunctionInfo.h
# PRUSubtarget.cpp
# PRUSubtarget.h
# PRUTargetMachine.cpp
# PRUTargetMachine.h
# PRUAsmPrinter.cpp
# PRUAsmPrinter.h

I’ll also create the following:
*lib/Target/PRU/InstPrinter/, which will have
# PRUInstPrinter.h
# PRUInstPrinter.cpp

*lib/Target/PRU/MCTargetDesc/, which will have
# PRUAsmBackend.cpp
# PRUELFObjectWriter.cpp
# PRUFixupKinds.h
# PRUMCAsmInfo.cpp
# PRUMCAsmInfo.h
# PRUMCCodeEmitter.cpp
# PRUMCTargetDesc.cpp
# PRUMCTargetDesc.h

*lib/Target/PRU/TargetInfo/, which will have PRUTargetInfo.cpp
I’ll use the existing targets as a reference when writing these.
<br>

[[File:flow2.jpeg|700px|thumb|center]]

*'''Target Machine''': Once I have the LLVM IR, I will move on to describing the characteristics of the PRU by creating a subclass of the TargetMachine class in the PRUTargetMachine.cpp and PRUTargetMachine.h files.
- The data types are aligned to an 8-bit boundary.

  #include <string>
  // Integer and float entries use the form <type><size>:<abi alignment in bits>.
  std::string dataLayout = "";
  dataLayout += "e";       // Little-endian
  dataLayout += "-m:e";    // ELF-style name mangling
  dataLayout += "-p:32:8"; // 32-bit pointers with 8-bit alignment
  dataLayout += "-i8:8";   // Align i8 to 8 bits
  dataLayout += "-i16:8";  // Align i16 to 8 bits
  dataLayout += "-i32:8";  // Align i32 to 8 bits
  dataLayout += "-i64:8";  // Align i64 to 8 bits
  dataLayout += "-f32:8";  // Align f32 to 8 bits
  dataLayout += "-f64:8";  // Align f64 to 8 bits
  dataLayout += "-n8";     // Set native integer width to 8 bits
  // Result: "e-m:e-p:32:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8"
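
The same string can be assembled from a list of components, which keeps each field next to its meaning and avoids hand-placing the separators. This is a minimal sketch in standard C++ with no LLVM dependency; `buildPRUDataLayout` is a hypothetical helper, and the values are the 8-bit ABI alignments assumed in this proposal (written as `i16:8` and so on), not taken from an existing PRU backend.

```cpp
#include <string>
#include <vector>

// Joins the data layout components with '-' separators.
std::string buildPRUDataLayout() {
  const std::vector<std::string> Parts = {
      "e",      // little-endian
      "m:e",    // ELF-style name mangling
      "p:32:8", // 32-bit pointers, 8-bit ABI alignment
      "i8:8",  "i16:8", "i32:8", "i64:8", // integers, 8-bit ABI alignment
      "f32:8", "f64:8",                   // floats, 8-bit ABI alignment
      "n8",     // native integer width of 8 bits
  };
  std::string Layout;
  for (const std::string &Part : Parts) {
    if (!Layout.empty())
      Layout += '-';
    Layout += Part;
  }
  return Layout;
}
```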

*'''Target Registration''': I’ll register the target with the TargetRegistry, which other LLVM tools use to look up and use the target at runtime. This involves declaring a global Target object that represents the PRU target during registration.
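
The registration idea can be sketched with a toy registry in plain C++. Everything here (`ToyTarget`, `ToyTargetRegistry`, `RegisterPRU`, the `"pru"` name) is a hypothetical stand-in and not LLVM's `Target` or `TargetRegistry` classes. It only shows the pattern: a global object registers an entry, and tools later look that entry up by name.

```cpp
#include <map>
#include <string>

// Toy stand-in for a target descriptor.
struct ToyTarget {
  std::string Name;
  std::string Description;
};

// Toy registry keyed by target name.
class ToyTargetRegistry {
public:
  static void registerTarget(const ToyTarget &T) { targets()[T.Name] = T; }

  // Returns nullptr when no target with that name is registered.
  static const ToyTarget *lookupTarget(const std::string &Name) {
    auto It = targets().find(Name);
    return It == targets().end() ? nullptr : &It->second;
  }

private:
  // Function-local static, so the map exists before any registration runs.
  static std::map<std::string, ToyTarget> &targets() {
    static std::map<std::string, ToyTarget> Registry;
    return Registry;
  }
};

// Global object whose constructor performs the registration at start-up.
struct RegisterPRU {
  RegisterPRU() {
    ToyTargetRegistry::registerTarget({"pru", "TI PRU (toy entry)"});
  }
};
static RegisterPRU RegisterPRUInstance;
```

A tool would then call `ToyTargetRegistry::lookupTarget("pru")` and check the result for `nullptr` before using it.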

*'''Register Set and Register Classes''': I’ll describe the register set of the PRU and use TableGen to generate code for register definitions, register aliases, and register classes from a target-specific PRURegisterInfo.td input file. I’ll also write a subclass of TargetRegisterInfo (PRURegisterInfo) that represents the register file data used for register allocation and describes the interactions between registers.

  // Base class for the general-purpose PRU registers.
  class PRUReg<bits<16> Enc, string n,
               list<string> altNames = []> : Register<n, altNames> {
    let HWEncoding = Enc;
    let Namespace = "PRU";
  }
  // Base class for R30 and R31, which are defined separately.
  class PRUCtrlReg<bits<16> Enc, string n> : Register<n> {
    let HWEncoding = Enc;
    let Namespace = "PRU";
  }
  // Alternate-name index so that sp, lr and ap can also be printed as r2, r3 and r4.
  let Namespace = "PRU",
      FallbackRegAltNameIndex = NoRegAltName in {
    def RegNamesRaw : RegAltNameIndex;
  }
  def R0  : PRUReg< 0, "r0">,  DwarfRegNum<[0]>;
  def R1  : PRUReg< 1, "r1">,  DwarfRegNum<[1]>;
  let RegAltNameIndices = [RegNamesRaw] in {
    def SP  : PRUReg< 2, "sp", ["r2"]>,  DwarfRegNum<[2]>;
    def LR  : PRUReg< 3, "lr", ["r3"]>,  DwarfRegNum<[3]>;
    def AP  : PRUReg< 4, "ap", ["r4"]>,  DwarfRegNum<[4]>;
  }
  def R5  : PRUReg< 5, "r5">,  DwarfRegNum<[5]>;
  def R6  : PRUReg< 6, "r6">,  DwarfRegNum<[6]>;
  def R7  : PRUReg< 7, "r7">,  DwarfRegNum<[7]>;
  def R8  : PRUReg< 8, "r8">,  DwarfRegNum<[8]>;
  def R9  : PRUReg< 9, "r9">,  DwarfRegNum<[9]>;
  def R10 : PRUReg<10, "r10">,  DwarfRegNum<[10]>;
  def R11 : PRUReg<11, "r11">,  DwarfRegNum<[11]>;
  def R12 : PRUReg<12, "r12">,  DwarfRegNum<[12]>;
  def R13 : PRUReg<13, "r13">,  DwarfRegNum<[13]>;
  def R14 : PRUReg<14, "r14">,  DwarfRegNum<[14]>;
  def R15 : PRUReg<15, "r15">,  DwarfRegNum<[15]>;
  def R16 : PRUReg<16, "r16">,  DwarfRegNum<[16]>;
  def R17 : PRUReg<17, "r17">,  DwarfRegNum<[17]>;
  def R18 : PRUReg<18, "r18">,  DwarfRegNum<[18]>;
  def R19 : PRUReg<19, "r19">,  DwarfRegNum<[19]>;
  def R20 : PRUReg<20, "r20">,  DwarfRegNum<[20]>;
  def R21 : PRUReg<21, "r21">,  DwarfRegNum<[21]>;
  def R22 : PRUReg<22, "r22">,  DwarfRegNum<[22]>;
  def R23 : PRUReg<23, "r23">,  DwarfRegNum<[23]>;
  def R24 : PRUReg<24, "r24">,  DwarfRegNum<[24]>;
  def R25 : PRUReg<25, "r25">,  DwarfRegNum<[25]>;
  def R26 : PRUReg<26, "r26">,  DwarfRegNum<[26]>;
  def R27 : PRUReg<27, "r27">,  DwarfRegNum<[27]>;
  def R28 : PRUReg<28, "r28">,  DwarfRegNum<[28]>;
  def R29 : PRUReg<29, "r29">,  DwarfRegNum<[29]>;
  def R30 : PRUCtrlReg<30, "r30">, DwarfRegNum<[30]>;
  def R31 : PRUCtrlReg<31, "r31">, DwarfRegNum<[31]>;
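
To make the naming scheme concrete, here is a small plain C++ sketch of the name-to-encoding mapping these definitions describe, where `sp`, `lr` and `ap` are aliases for `r2`, `r3` and `r4`. The function `pruRegisterEncoding` is hypothetical and hand-written for illustration; in the real backend, TableGen generates the equivalent tables from PRURegisterInfo.td.

```cpp
#include <cstddef>
#include <string>

// Returns the hardware encoding (0 to 31) for a register name such as
// "r7", "sp", "lr" or "ap", or -1 when the name is not recognized.
int pruRegisterEncoding(const std::string &Name) {
  if (Name == "sp") return 2;
  if (Name == "lr") return 3;
  if (Name == "ap") return 4;
  // Otherwise expect 'r' followed by one or two decimal digits.
  if (Name.size() < 2 || Name.size() > 3 || Name[0] != 'r')
    return -1;
  if (Name.size() == 3 && Name[1] == '0')
    return -1; // reject spellings with a leading zero, such as "r07"
  int Value = 0;
  for (std::size_t I = 1; I < Name.size(); ++I) {
    if (Name[I] < '0' || Name[I] > '9')
      return -1;
    Value = Value * 10 + (Name[I] - '0');
  }
  return Value <= 31 ? Value : -1;
}
```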

*'''Instruction Set''': I’ll describe the instruction set of the target and use TableGen to generate code for target-specific instructions from PRUInstrFormats.td and PRUInstrInfo.td. I’ll also write a subclass of TargetInstrInfo (PRUInstrInfo) to represent the machine instructions supported by the target machine.

  // Base format shared by all PRU instructions (32-bit instruction word).
  class InstPRU<dag outs, dag ins, string asmstr, list<dag> pattern>
    : Instruction {
    field bits<32> Inst;
    let Namespace = "PRU";
    dag OutOperandList = outs;
    dag InOperandList = ins;
    let AsmString = asmstr;
    let Pattern = pattern;
  }
  // PRU pseudo instructions format
  class PRUPseudoInst<dag outs, dag ins, string asmstr, list<dag> pattern>
    : InstPRU<outs, ins, asmstr, pattern> {
    let isPseudo = 1;
    let isCodeGenOnly = 1;
  }

*'''Instruction Selector''': I’ll describe the selection and conversion of the LLVM IR from a Directed Acyclic Graph (DAG) representation of instructions to native target-specific instructions. TableGen will generate code that matches patterns and selects instructions based on additional information in PRUInstrInfo.td. I’ll write PRUISelDAGToDAG.cpp to perform pattern matching and DAG-to-DAG instruction selection, and PRUISelLowering.cpp to replace or remove operations and data types that are not natively supported in a SelectionDAG.

*'''Assembly Printer''': I’ll write an assembly printer that converts the backend's machine instructions to GAS-format assembly for the PRU. This means adding assembly strings to the instructions defined in PRUInstrInfo.td, and writing a subclass of AsmPrinter (PRUAsmPrinter) that performs the LLVM-to-assembly conversion together with a trivial subclass of MCAsmInfo (PRUMCAsmInfo).

  #include "PRUMCInstLower.h"
  #include "llvm/CodeGen/AsmPrinter.h"
  #include "llvm/MC/MCStreamer.h"
  namespace {
    class PRUAsmPrinter : public AsmPrinter {
      PRUMCInstLower MCInstLowering;
    public:
      explicit PRUAsmPrinter(TargetMachine &TM,
                             std::unique_ptr<MCStreamer> Streamer)
        : AsmPrinter(TM, std::move(Streamer)),
          MCInstLowering(OutContext, *this) {}
      StringRef getPassName() const override {
        return "PRU Assembly Printer";
      }
      // Hook names use the lower-case spelling of recent LLVM releases.
      void emitFunctionEntryLabel() override;
      void emitInstruction(const MachineInstr *MI) override;
      void emitFunctionBodyStart() override;
    };
  }

*'''Machine Code''': I’ll add JIT support and create a machine code emitter (a subclass of TargetJITInfo). This involves writing a PRUCodeEmitter.cpp file containing a machine function pass that transforms target-machine instructions into relocatable machine code, and a PRUJITInfo.cpp file that implements the JIT interfaces for target-specific code-generation activities such as emitting machine code and stubs. PRUTargetMachine will be modified to provide a TargetJITInfo object through its getJITInfo method.

===References===
* https://github.com/llvm/llvm-project
* Custom 32-bit RISC architecture target: https://github.com/connorjan/llvm-cjg
* LLVM Backend from scratch: https://github.com/Jonathan2251/lbd
* Flowchart: https://opus4.kobv.de/opus4-fau/files/1108/tricore_llvm.pdf

===Timeline===

{| class="wikitable"
|-
! Date !! Status !! Details
|-
| April 4 || ||
* Submit the final proposal on the portal
|-
| April 4 - May 4 || Pre-Selection Phase ||
* Become thoroughly familiar with the PRU architecture, its instructions, and the LLVM backend
* Discuss resources and tools relevant to the project
|-
| May 4 - May 10 || Community Bonding ||
* Discuss the implementation idea with mentors
* Continue studying the LLVM backend and the PRU
* Get all doubts about the project cleared
* Diligently study any additional resources provided
* Talk to and learn from other people and their interests
|-
| May 10 - May 31 || College Exams ||
* Focus on college exams
|-
| June 1<br>Week 1 & Week 2 || Milestone #1 ||
* Introductory video
* Describe the target machine for the PRU
* Describe the target registration
|-
| June 12<br>Week 3 & Week 4 || Milestone #2 ||
* Add the register set and register classes
* Add the instruction set
|-
| June 26<br>Week 5 & Week 6 || Milestone #3 ||
* Construct the SelectionDAG
* Add the instruction selector
|-
| July 10 - July 14 || Midterm evaluation ||
* Submission for phase 1 evaluation
|-
| July 15<br>Week 8 & Week 9 || Milestone #4 ||
* Add the assembly printer
* Add the machine code emitter (JIT support)
|-
| July 28<br>Week 10, Week 11, Week 12 || Milestone #5 ||
* Testing and debugging
|-
| Aug 21 - Aug 28 || Final week: GSoC contributors submit their final work product and their final mentor evaluation ||
|-
| Aug 28<br>Week 14 || Milestone #6 ||
* Complete the remaining work
* This is a large project, so the work will continue until September.
|-
| Sep 4<br>Week 15 || Milestone #7<br>Final Evaluation ||
* Complete the documentation
* Incorporate suggestions from mentors and make the required changes
* Have the final product ready for submission
* Final mentor evaluation
* Completion of GSoC
|-
| Post GSoC || ||
* Continue working on the project to improve it, and keep contributing to other areas of BeagleBoard :)
|}

===Experience and approach===

This project requires good knowledge of and background in compiler development.
*I have been exploring compiler development for quite some time, have good knowledge of compilers, and have worked on a transpiler project.
*I have completed LLVM's Kaleidoscope tutorial and read through the LLVM backend documentation, so I have a good understanding of both.
*I have some experience with the ESP32 microcontroller and am interested in compiler development; this project requires exactly that blend of hardware and software.
*I am an open-source enthusiast, passionate about technology, and have always dedicated myself fully to the work I do. I have no major commitment other than GSoC during the summer break and will give my best to complete the project within the given time frame.
*I plan to keep working on this project after GSoC and to engage with the community often.

===Contingency===

* I have prepared a document of all the links I referred to during my preparation phase; if I get stuck anywhere, I will rely on those resources.
* Moreover, the BeagleBoard community is extremely helpful and active in resolving doubts, which makes it a great place to go for project resources and clarification.

===Benefit===

* The PRU target (am335x) will have LLVM support, so that clang can be used instead of pru-gcc.
* Clang is designed for fast compilation, low memory use, and clear, concise diagnostics, all of which would be beneficial.
* LLVM support will provide better compatibility, optimization, and tooling.

==Misc==

Cross-compilation task: sent a PR upstream: [https://github.com/jadonk/gsoc-application/pull/175]

[[Category: GSoCProposal2023]]

Latest revision as of 10:17, 14 April 2023
