Flameman/PowerPC

From eLinux.org

PowerPC (to be fixed)

TLB


PowerPC Architecture - Memory Management Unit (MMU)


The PowerPC™ 405 supports 4 GB of flat (non-segmented) address space. The Memory Management Unit (MMU) provides address translation, protection functions, and storage attribute control for this address space. The MMU supports demand-paged virtual memory using multiple page sizes, from 1 KB up to 16 MB. When supported by system software, the MMU provides the following functions:

  • Translation of the 4 GB logical address space into a physical address space
  • Independent enabling of instruction translation and protection from that of data translation and protection
  • Page-level access control using the translation mechanism
  • Software control over the page replacement strategy
  • Additional protection control using zones
  • Storage attributes for cache policy and speculative memory access control

The Translation Look-aside Buffer (TLB) is used to control memory translation and protection. Each one of its 64 entries specifies a page translation. It is fully associative and can simultaneously hold translations for any combination of page sizes. To prevent TLB contention between data and instruction accesses, a 4-entry instruction and an 8-entry data shadow TLB are maintained by the processor transparently to the software. Software manages the initialization and replacement of TLB entries.
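The lookup described above can be sketched as a small software model. This is only an illustration, not the hardware algorithm: the names `TlbEntry`, `page_size` and `tlb_lookup` are hypothetical, the page sizes are assumed to be 1 KB × 4^n for size codes 0 to 7 (which spans the 1 KB to 16 MB range given earlier), and process IDs, access checks and the shadow TLBs are ignored.

```python
from dataclasses import dataclass
from typing import List, Optional

TLB_ENTRIES = 64  # entry count stated in the text


def page_size(size_code: int) -> int:
    """Page size in bytes for a size code 0..7 (assumed to be 1 KB * 4**code)."""
    if not 0 <= size_code <= 7:
        raise ValueError("size code must be in 0..7")
    return 1024 << (2 * size_code)


@dataclass
class TlbEntry:
    valid: bool
    vbase: int      # virtual page base, aligned to the page size
    size_code: int  # 0 = 1 KB ... 7 = 16 MB
    pbase: int      # physical page base, aligned to the page size


def tlb_lookup(tlb: List[TlbEntry], ea: int) -> Optional[int]:
    """Return the physical address for effective address ea, or None on a miss."""
    for entry in tlb:  # fully associative: every entry is a candidate
        if not entry.valid:
            continue
        offset_mask = page_size(entry.size_code) - 1
        if (ea & ~offset_mask & 0xFFFFFFFF) == entry.vbase:
            return entry.pbase | (ea & offset_mask)
    return None  # miss: system software would choose a victim and write a new entry


# Entries of different page sizes can coexist in the same TLB.
tlb = [TlbEntry(False, 0, 0, 0) for _ in range(TLB_ENTRIES)]
tlb[0] = TlbEntry(True, 0x10000000, 7, 0x40000000)  # one 16 MB page
tlb[1] = TlbEntry(True, 0x20000400, 0, 0x00008000)  # one 1 KB page
```

A miss returns `None` here; on real hardware it would raise a TLB-miss exception that the software replacement policy handles.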

The PowerPC 405 includes instructions for managing TLB entries by software running in privileged mode. This capability gives significant control to system software over the implementation of a page replacement strategy. Storage attributes are provided to control access of memory regions. When memory translation is enabled, storage attributes are maintained on a page basis and read from the TLB when a memory access occurs. When memory translation is disabled, storage attributes are maintained in storage attribute control registers. A zone protection register (ZPR) is provided to allow system software to override the TLB access controls without requiring the manipulation of individual TLB entries.
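As a rough illustration of why the ZPR avoids touching individual TLB entries, the sketch below extracts one zone's field from a ZPR value. The layout is an assumption to be checked against the processor manual: sixteen 2-bit zone fields in a 32-bit register, with zone 0 in the two most significant bits. The name `zone_field` is hypothetical, and the meaning of each 2-bit value is deliberately not modeled.

```python
def zone_field(zpr: int, zone: int) -> int:
    """Return the 2-bit field for zone 0..15 from a 32-bit ZPR value.

    Assumes zone 0 occupies the two most significant bits of the register.
    """
    if not 0 <= zone <= 15:
        raise ValueError("zone must be in 0..15")
    return (zpr >> (30 - 2 * zone)) & 0x3
```

Under this assumption, changing one 2-bit field changes the effective protection of every page assigned to that zone in a single register write.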


Differences between Book-E and PPC440 instructions

This document clarifies the differences between the instructions defined by Book-E and those implemented by the PPC440, as an aid to maintaining the binutils opcode table for Book-E, PPC440, and other Book-E based embedded processors.

For this purpose, 64-bit operations, which the PPC440 and most embedded processors do not support, are not covered. Differences in registers, such as Special Purpose Registers (SPRs), are not covered either.


Book-E coverage (to be fixed, soon)

Book-E classifies PowerPC instructions as follows:

    Book-E defined    - defined and described in Book-E
                        (a very few instructions may have
                        implementation-dependent variants
                        and/or operands)
    preserved         - for classic PPC use
    reserved          - for future use
    allocated         - for implementation-dependent use


For example, for the PPC440 implementation, Section 5, "Instruction Set", of 440_Programming_Model.pdf lists the instruction categories in "Table 4 - PowerPC 440 Instruction Categories".

In general, Book-E provides for allocated instructions, which are available for implementation-dependent and/or application-specific purposes. These allocated instructions are described in each processor's manual.

Section 9.1, "Instruction Set Portability", of 440x4_um.pdf discusses the allocated instructions of the PPC440 core, and "Table 9-2. Allocated Instructions" lists them. According to 440x4_um.pdf, the allocated instructions of the PPC440 core are not PPC440-specific; they are a common extension across the IBM PPC400 embedded series.

Even among the Book-E defined instructions (as opposed to the allocated class), there is some implementation-dependent variation. For example, the tlbre and tlbwe instructions are defined in Book-E but may have implementation-dependent operands.

Searching the Book-E instruction set descriptions for the phrase "implementation-dependent" shows that the following instructions have an implementation-dependent field. In other words, they may have implementation-dependent variants and/or operands.


  • Data Cache Block Touch

    dcbt[e] CT,RA,RB

    This may have implementation-dependent variants and/or
    operands.
    CT (6:10): This field is used by the Cache Touch instructions
    (dcbt[e], dcbtst[e], and icbt[e]) to specify the target
    portion of the cache facility in which to place the pre-fetched
    data or instructions, and is implementation-dependent.

  • Data Cache Block Touch for Store

    dcbtst[e] CT,RA,RB

    This may have implementation-dependent variants and/or
    operands.
    CT has the same meaning as for dcbt[e].

  • Instruction Cache Block Touch

    icbt[e] CT,RA,RB

    This may have implementation-dependent variants and/or
    operands.
    CT has the same meaning as for dcbt[e].

  • TLB Read Entry

    tlbre

    This may have implementation-dependent variants and/or
    operands.
    Bits 6:20 of the instruction encoding are allocated for
    implementation-dependent use, and may be used to specify
    the source TLB entry, the source portion of the source
    TLB entry, and the target resource that the result is
    placed into.

  • TLB Search Indexed [Extended]

    tlbsx[e] RA,RB

    This may have implementation-dependent variants and/or
    operands.
    Bits 6:10 of the instruction encoding are allocated for
    implementation-dependent use, and may be used to specify
    the target resource that the result of the instruction
    is placed into.

  • TLB Write Entry

    tlbwe

    This may have implementation-dependent variants and/or
    operands.
    Bits 6:20 of the instruction encoding are allocated for
    implementation-dependent use, and may be used to specify
    the target TLB entry, the target portion of the target
    TLB entry, and the source of the value that is to be
    written into the TLB.
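The bit ranges quoted above (6:10 and 6:20) use the PowerPC convention in which bit 0 is the most significant bit of the 32-bit instruction word. A minimal sketch of extracting such a range follows; the helper name `insn_field` and the sample word are hypothetical and not taken from any of the manuals.

```python
def insn_field(insn: int, first: int, last: int) -> int:
    """Extract bits first:last of a 32-bit instruction word (bit 0 is the MSB)."""
    if not 0 <= first <= last <= 31:
        raise ValueError("need 0 <= first <= last <= 31")
    width = last - first + 1
    return (insn >> (31 - last)) & ((1 << width) - 1)


# Sample word: primary opcode 31 in bits 0:5, then three 5-bit fields
# with the values 5, 3 and 4 in bits 6:10, 11:15 and 16:20.
word = (31 << 26) | (5 << 21) | (3 << 16) | (4 << 11)
```

With this helper, `insn_field(word, 6, 10)` yields the CT-style field alone, while `insn_field(word, 6, 20)` yields the whole 15-bit region that tlbre and tlbwe leave to the implementation.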

PPC440 variants/limitations of Book-E defined instructions

PPC440 limitations

This subsection lists instructions that Book-E defines but the PPC440 does not implement.

"2.3.1 Defined Instruction Class" in 440x4_um.pdf says that the PPC440 does not support the following Book-E defined instructions:

    tlbivax[e]
    mfapidi
    64-bit operations
    floating-point operations

The first two instructions and the 64-bit operations are treated as illegal instructions. The PPC440 core itself does not implement floating-point operations, although an attached auxiliary processor may support them. Without such an auxiliary processor, floating-point operations are also treated as illegal instructions.

PPC440 variants

This subsection lists instructions that both Book-E and the PPC440 define, but that the PPC440 implements in a slightly different way.

Book-E defines the following TLB management instructions:

    tlbivax[e]    RA,RB    TLB Invalidate Virtual Address Indexed (Extended)
    tlbre                  TLB Read Entry
    tlbsx[e]      RA,RB    TLB Search Indexed (Extended)
    tlbsync                TLB Synchronize
    tlbwe                  TLB Write Entry

440x4_um.pdf lists the following TLB management instructions:

    tlbre        RT, RA, WS
    tlbsx[.]    RT, RA, RB
    tlbsync
    tlbwe        RS, RA, WS

Book-E defines the following cache block touch instructions:

    dcbt[e]      CT,RA,RB    Data Cache Block Touch
    dcbtst[e]    CT,RA,RB    Data Cache Block Touch for Store
    icbt[e]      CT,RA,RB    Instruction Cache Block Touch

CT is the implementation-dependent operand.

However, 440x4_um.pdf says the PPC440 has the following:

    dcbt         RA,RB
    dcbtst       RA,RB
    icbt         RA,RB

In addition, as a special case, the icbt instruction has two opcodes. One is an allocated opcode (primary: 31, secondary: 262), kept for compatibility with the previous PPC400 series. The other is a Book-E defined opcode (primary: 31, secondary: 22), because icbt is now part of Book-E.
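To make the two icbt encodings concrete, here is a minimal sketch that builds both instruction words from the primary and secondary opcodes quoted above. It assumes the usual X-form layout (primary opcode in bits 0:5, three 5-bit fields in bits 6:20, secondary opcode in bits 21:30, bit 31 zero). The function names are hypothetical and this is not the binutils implementation.

```python
ICBT_PRIMARY = 31
ICBT_BOOKE_XO = 22    # Book-E defined secondary opcode (from the text)
ICBT_PPC400_XO = 262  # allocated secondary opcode kept for PPC400 compatibility


def encode_x_form(primary: int, f6_10: int, ra: int, rb: int, secondary: int) -> int:
    """Assemble an X-form word; layout assumed as described in the lead-in."""
    for name, value, bits in (("primary", primary, 6), ("f6_10", f6_10, 5),
                              ("ra", ra, 5), ("rb", rb, 5),
                              ("secondary", secondary, 10)):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    return (primary << 26) | (f6_10 << 21) | (ra << 16) | (rb << 11) | (secondary << 1)


def encode_icbt(ra: int, rb: int, booke: bool = True, ct: int = 0) -> int:
    """Encode icbt with either the Book-E or the PPC400-compatible opcode."""
    xo = ICBT_BOOKE_XO if booke else ICBT_PPC400_XO
    return encode_x_form(ICBT_PRIMARY, ct, ra, rb, xo)
```

An opcode table that wants to disassemble code for both older PPC400 parts and Book-E parts would need to recognize both words as icbt.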


PPC440 allocated instructions

Allocated Arithmetic

These instructions provide multiply-accumulate, negative multiply-accumulate, and halfword multiply operations.

Multiply-Accumulate

    macchw[o][.]      RT,RA,RB
    macchws[o][.]     RT,RA,RB
    macchwsu[o][.]    RT,RA,RB
    macchwu[o][.]     RT,RA,RB
    machhw[o][.]      RT,RA,RB
    machhws[o][.]     RT,RA,RB
    machhwsu[o][.]    RT,RA,RB
    machhwu[o][.]     RT,RA,RB
    maclhw[o][.]      RT,RA,RB
    maclhws[o][.]     RT,RA,RB
    maclhwsu[o][.]    RT,RA,RB
    maclhwu[o][.]     RT,RA,RB
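As a rough guide to the naming scheme, the sketch below models the three signed, non-saturating base forms. It assumes that "c" (cross) multiplies the low halfword of RA by the high halfword of RB, "h" multiplies the two high halfwords, and "l" multiplies the two low halfwords, with the product added to RT modulo 2^32. This reading should be checked against the user manual. The saturating (s) and unsigned (u) forms and the [o] and [.] side effects are not modeled.

```python
def _s16(x: int) -> int:
    """Interpret the low 16 bits of x as a signed halfword."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def _hi(reg: int) -> int:
    return _s16(reg >> 16)  # most significant halfword


def _lo(reg: int) -> int:
    return _s16(reg)        # least significant halfword


def macchw(rt: int, ra: int, rb: int) -> int:
    """Cross halfword: low(RA) * high(RB), accumulated into RT."""
    return (rt + _lo(ra) * _hi(rb)) & 0xFFFFFFFF


def machhw(rt: int, ra: int, rb: int) -> int:
    """High halfword: high(RA) * high(RB), accumulated into RT."""
    return (rt + _hi(ra) * _hi(rb)) & 0xFFFFFFFF


def maclhw(rt: int, ra: int, rb: int) -> int:
    """Low halfword: low(RA) * low(RB), accumulated into RT."""
    return (rt + _lo(ra) * _lo(rb)) & 0xFFFFFFFF
```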


Negative Multiply-Accumulate

    nmacchw[o][.]     RT,RA,RB
    nmacchws[o][.]     RT,RA,RB
    nmachhw[o][.]     RT,RA,RB
    nmachhws[o][.]     RT,RA,RB
    nmaclhw[o][.]     RT,RA,RB
    nmaclhws[o][.]     RT,RA,RB


Multiply-Halfword

    mulchw[.]      RT,RA,RB
    mulchwu[.]      RT,RA,RB
    mulhhw[.]      RT,RA,RB
    mulhhwu[.]      RT,RA,RB
    mullhw[.]      RT,RA,RB
    mullhwu[.]      RT,RA,RB


Allocated Logical

This instruction finds the leftmost zero byte in a pair of registers, which is helpful for implementing functions such as strlen().

    dlmzb[.]    RA,RS,RB
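The following sketch models what such a search computes, to show why it suits strlen()-style loops. It assumes the two source registers are treated as eight consecutive bytes (RS first, then RB) and that the result counts bytes up to and including the first zero byte, or 8 when there is none. The condition-register and XER updates of the real instruction are reduced to a simple boolean, and the function name mirrors the mnemonic only for readability.

```python
from typing import Tuple


def dlmzb(rs: int, rb: int) -> Tuple[int, bool]:
    """Return (byte count, found) for the 8 bytes formed by RS followed by RB."""
    combined = ((rs & 0xFFFFFFFF) << 32) | (rb & 0xFFFFFFFF)
    data = combined.to_bytes(8, "big")
    for index, byte in enumerate(data):
        if byte == 0:
            return index + 1, True  # count includes the zero byte itself
    return 8, False                 # no zero byte in these 8 bytes
```

A strlen() loop built on this would add 8 to its running length while `found` is false, then add `count - 1` for the final chunk.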


Allocated Cache Management

These instructions flash-invalidate the entire data or instruction cache array.

    dccci        RA,RB
    iccci        RA,RB


Allocated Cache Debug

These instructions read the contents of the data or instruction cache.

    dcread        RT,RA,RB
    icread        RA,RB


Appendix References

  • 440_Programming_Model.pdf: IBM PowerPC 440 Microprocessor Core Programming Model Overview, October 4, 2001
  • 440x4_um.pdf: PPC440 CPU Core User's Manual, SA14-2523-02, July 18, 2002
  • booke_rm.pdf: Book E: Enhanced PowerPC Architecture, Version 1.0, May 7, 2002


PowerPC 405GP

Info

  • CPU Performance: 133-266MHz
  • Features: GPIO, PCI, SRAM Memory Controller
  • A fast, flexible solution for embedded developers.

General Description

The AMCC PowerPC 405GP and 405GPr family of 32-bit RISC processors is designed to provide a flexible, fast time-to-market hardware solution to satisfy the demands of high-performance embedded applications. Implemented in the scalable PowerPC architecture, the 405GP and 405GPr processors maintain code compatibility with other PowerPC processors for ease in migration and faster time-to-market. An optimized balance of performance, low power, and features makes them ideal solutions for communication, data storage, and pervasive computing applications.

The 405GP and 405GPr processors support speeds of up to 266MHz and 400MHz respectively. Both incorporate a rich mix of features, such as a PCI interface, an SDRAM Controller, a 64-bit on-chip CoreConnect bus, Ethernet and other on-chip peripheral support, and the IBM CodePack code compression engine. In addition, power management features, a small form factor, and low power consumption make the AMCC 405 processor family an ideal platform for applications ranging from networking to video.

Highlights

  • High-performance, low-power processors for the most demanding embedded applications: PowerPC 405GP/405GPr Embedded Processors deliver up to 400MHz performance and a rich mix of features for Internet, communication, data storage, consumer, and imaging applications
  • Includes on-chip SRAM with single-cycle access for faster processing in data-intensive applications, such as routers and switches
  • Supports full application-code compatibility with all other PowerPC® processors for seamless migration
  • Uses the award-winning 64-bit IBM CoreConnect™ high-performance on-chip bus
  • Offers a wide array of small-footprint package options for high-density applications, such as telecommunications devices
  • Employs the IBM CodePack code compression core to reduce system memory requirements and cost

Features

On-chip SDRAM Controller

  • Contains separate 32-byte read and 128-byte write buffers
  • Programmable address mapping

External Peripheral Controller

  • Supports ROM, EPROM, SRAM, Flash, and slave peripheral I/O devices
  • 8-, 16-, or 32-bit external data bus width
  • Programmable address mapping
  • External Bus Master Controller - Allows external masters to access SDRAM and PCI

DMA Controller

  • 4 independent channels
  • Supports transfers between SDRAM, PCI, internal UARTs, and devices on the external peripheral bus

PCI Interface

  • 32-bit PCI V2.2 compatible
  • Synchronous and asynchronous operation
  • Internal PCI arbiter supports six PCI masters
  • Supports external arbitration

On-chip Ethernet Support

  • 10/100Mbit/sec
  • Dedicated DMA controller

CodePack Decompression

  • Stores instructions in memory in compressed format
  • Improves code density by up to 40%

On-chip Peripherals

  • 2 serial ports
  • Master and slave IIC controller
  • Up to 24 general purpose I/Os
  • Interrupt controller including up to 13 external interrupts