RichardB's notes from the seminar

These are notes from Silica's OMAP Workshop, 21 Jan. 09 – ARM, Cambridge, UK. TI's OMAP3 is used e.g. on BeagleBoard.

Cortex A8 Core – Bryan Lawrence – ARM

 * Cortex A8 Core is the design. OMAP is the physical implementation of this design by TI
 * Cortex A8 is based on V7-A instruction-set architecture and includes:
 * NEON advanced SIMD (multimedia accelerator – integer and floating-point SIMD (single-instruction multiple-data))
 * Jazelle-RCT (Java accelerator)
 * TrustZone security foundation (effectively virtualisation of the core)
 * Particularly aimed at applications (rather than real-time or ‘deeply’ embedded)
 * A8 Processor Core (design) can run up to 1 GHz+, c. 2000 DMIPS, depending on silicon
 * MMU for OS virtual memory management
 * Thumb-2 allows 16 & 32-bit instructions. Allows efficient, but small (compressed) code size if required
 * Dependent on the compiler to produce better ‘code density’
 * Thumb-2 gives, e.g., a 29% reduction in Linux kernel size. See: http://www.arm.com/products/os/linux.html
 * CoreSight; non-invasive real-time trace for debugging
 * JTAG port
 * Debug access port (DAP)
 * Embedded Trace Macrocell (ETM) – captures instruction and data

NEON SIMD – Ashley Stevens – ARM

 * Flexible, generic multimedia acceleration
 * Higher power consumption than dedicated hardware, but supports emerging standards
 * Hybrid 64/128-bit SIMD architecture
 * Supports up to 64-bit integers, single-precision floating-point
 * Adds additional registers
 * Variety of ways to use: assembler, C intrinsics, through to the OpenMAX DL library (recommended), vectorizing compilers (which generate NEON SIMD instructions)
 * Provides, e.g., faster FFTs
 * armcc vs. gcc: armcc produces more compact, faster code.
 * Lots of NEON-optimised codecs available
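
Of the ways to use NEON listed above, the C intrinsics route can be sketched roughly as follows. This is a minimal illustration, not something shown in the talk: the function name `add_f32` is hypothetical, and the scalar fallback is included only so the same file also builds on a compiler without NEON support.

```c
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Element-wise add: dst[i] = a[i] + b[i] (hypothetical helper for illustration). */
void add_f32(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* NEON path: four single-precision lanes per 128-bit register. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);      /* load 4 floats from a */
        float32x4_t vb = vld1q_f32(b + i);      /* load 4 floats from b */
        vst1q_f32(dst + i, vaddq_f32(va, vb));  /* add lanes and store  */
    }
#endif
    /* Scalar path: the tail elements, or everything on non-NEON targets. */
    for (; i < n; ++i) {
        dst[i] = a[i] + b[i];
    }
}
```

A vectorizing compiler would aim to produce the equivalent of the NEON loop from the plain scalar loop alone; the intrinsics version simply makes the four-lane processing explicit.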

OMAP35x Processor Overview – Chris Bowers – Snr Field Applications Engineer – TI

 * TI have a range of microcontrollers through to application processors & DSPs
 * OMAP tends to be seen in things like digital signage, POS terminals, portable infotainment etc. (lower power, high performance); “Laptop-like performance”
 * Up to 1200 Dhrystone MIPS
 * ARCHOS 7 Internet Media Tablet built on OMAP3
 * TI are “nicely surprised” by things like BeagleBoard
 * Has a DSP (in addition to NEON) for video processing, up to HD
 * DSP is generic; not limited to video/audio processing
 * Peripheral connectivity (USB, MMC, Serial etc.)
 * OMAP35 models:
 * 3503 - ARM Cortex A8, Peripherals
 * 3515 - ARM Cortex A8, Peripherals, PowerVR SGX (OpenGL ES) graphics engine
 * 3525 - ARM Cortex A8, Peripherals, C64x DSP & video accelerator
 * 3530 - ARM Cortex A8, Peripherals, PowerVR SGX (OpenGL ES) graphics engine, DSP & video accelerator
 * Camera interface
 * Auto-focus engine
 * CCD & CMOS imager interface
 * Preview engine etc.
 * Display subsystem
 * (24-bit RGB up to 1024x768 HD, 2 x 10-bit DACs; rotation, image resizing)
 * Overlay, scaling, picture-in-picture
 * Also discussed TI DaVinci platform: video-centric, based on ARM9, has some overlap with OMAP
 * OMAP35x has a power-management module (PRCM); active and static (standby) modes of consumption
 * Can reduce core voltage and frequency
 * Various major components can be turned on/off as required – “power domains”
 * Various complete boards available:
 * OMAP35x evaluation module (EVM); OMAP 3530 plus touchscreen, RAM & NAND flash, Ethernet etc.
 * BeagleBoard
 * Gumstix Overo (tiny)
 * LogicPD
 * Analog & Micro

Understanding 2D/3D Graphics Dev using OMAP 35x – Jason Brand – Field Apps Engineer – TI

 * Lots of uses/major apps:
 * Scalable UIs, navigation, games, visualisations, automotive
 * OMAP 35x has NEON vector floating-point processor (VFP) +
 * PowerVR SGX (graphics engine):
 * Tile-based architecture
 * Universal Scalable Shader Engine (USSE)
 * Support for: OpenGL ES (Embedded Systems) 1.1 and 2.0, OpenVG 1.0 (to accelerate Adobe Flash and SVG Tiny (Scalable Vector Graphics), and UIs built on these)
 * ~10M polygons/second, ~0.9 GFLOPS
 * OpenGL ES is a well-defined subset of desktop OpenGL
 * (lots of details on SGX engine)
 * OpenGL ES support seems powerful
 * Graphics SDK is available from TI; tools, headers, libs, demos etc
 * IVA 2.2 – Image, Video, Audio subsystem- C64x DSP core:
 * 32-bit fixed-point media processor
 * Video & image accelerator
 * TI supply compiler tools to optimize for this hardware

ARM Software Development Tools – Elan Lennard – System Design Division – ARM

 * “Enabling all developers to get the best from their ARM-based system”
 * Quality, high-performance s/w
 * Tools: Compilation, Optimization, Middleware, Device Support, verification & debug, Fast simulation
 * RealView Development Suite:
 * Co-developed and validated with ARM processor IP; best code
 * Extensive support for CoreSight (debug tech)
 * Supports all ARM processors
 * Std and Pro editions. Pro includes NEON compiler, RealView profiler, fast simulator (RTSM), ICE
 * Automatic optimisation; data from the profiler feeds back into the compiler, giving some perf improvement (c. 6%) and a 40% (ish) code size reduction.
 * Loop unrolling (where appropriate)
 * Code reordering
 * Link-time compilation; allows optimizations across source files, 5% size reduction, 5% perf improvement
 * ARM compiler vs. GCC: ARM is 30% faster, 43% smaller.  (similar when using Thumb code)
 * NEON Vectorizing compiler; up to 400% (4x) performance improvement on a particular video decoder, compared to the regular ARM compiler
 * ARM Workbench IDE – based on Eclipse 3.3
 * ARM Eclipse plugins; ARM profiler, Flash programmer, ARM Linux project wizard etc. etc.
 * Only really useful if RealView is used
 * ARM Profiler:
 * “Get the best out of ARM processors”
 * Performance and code coverage analysis; detailed analysis of performance/usage, call-chain analysis
 * Traces can be logged and replayed
 * Completely non-intrusive; analyse running system/application
 * A good example showed one instruction using 27% of application time
 * RealView ICE and Trace:
 * Hardware trace/debugger
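
The loop unrolling listed among the compiler optimisations above can be illustrated by hand. This is only a sketch of the general technique, not the output of RealView or any other particular compiler; the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Straightforward loop: one add and one loop test per element. */
int32_t sum_simple(const int32_t *v, size_t n)
{
    int32_t total = 0;
    for (size_t i = 0; i < n; ++i) {
        total += v[i];
    }
    return total;
}

/* Unrolled by four: one loop test per four elements, and four independent
 * accumulators so the additions do not all depend on each other. */
int32_t sum_unrolled4(const int32_t *v, size_t n)
{
    int32_t t0 = 0, t1 = 0, t2 = 0, t3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        t0 += v[i];
        t1 += v[i + 1];
        t2 += v[i + 2];
        t3 += v[i + 3];
    }
    /* Remainder loop for the last n % 4 elements. */
    for (; i < n; ++i) {
        t0 += v[i];
    }
    return t0 + t1 + t2 + t3;
}
```

Both functions return the same result; the unrolled form trades larger code for fewer branches, which is presumably why the notes qualify it with “where appropriate”.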

Tool Chain Overview – Chris Bower – TI

 * TI Code Composer Toolset
 * DSPBIOS, low level ARM debug, DSP development and debug
 * Montavista (for DaVinci)
 * Linux-based, licensed through TI.
 * Linux app development, Eclipse-based IDE
 * Green Hills
 * Integrity Linux based, MULTI debug environment for DSP and ARM. Application too
 * Code Sourcery
 * Linux (and Windows) – GNU Toolchain. For building Linux apps
 * Eclipse-based for Pers and Pro editions
 * MPC
 * WinCE
 * Microsoft Platform Builder
 * The choice for Windows CE etc. development
 * Lauterbach TRACE32
 * Low-level debug of ARM & DSP

OMAP3 OS Support – Jason Brand – TI

 * Fundamentally this is Linux or Windows CE.
 * TI issue a Linux 2.6.22 kernel, includes lots of device drivers, EVM drivers, on top of which:
 * There is also DSPBIOS – scheduler, resource manager for DSP.
 * Also layers on top of these: codec interfaces, algorithm abstraction, OpenVG, OpenGL ES, audio/video (GStreamer) etc.
 * Windows CE 6.0 can also function as the ‘underlying’ OS; some of the higher layers are different
 * OMAP353 - Beta SDK:
 * Board boot, test, & flash utils
 * Platform support:
 * U-boot Linux boot-load and flashing
 * Linux kernel with drivers
 * Root fs
 * Demo apps
 * Image viewer
 * DaVinci I/F dump
 * Code Sourcery tools
 * X-Loader
 * Small user boot-loader to boot from on-board flash
 * Must be signed before use
 * U-Boot
 * The next-stage boot-loader
 * Flexible open-source utility for boot-loading Linux
 * Capable of reading kernel image from flash, Ethernet TFTP, and ?
 * ITBOK (is the board ok)
 * Based on u-boot
 * Basic H/W functionality tests
 * OMAP35x WinCE Support:
 * TI seeing 40% WinCE vs 60% Linux
 * MS suggest that total cost of development is cheaper, and time-to-market faster, than with Linux
 * BSQUARE’s WinCE 6.0 R2 BSP (board support package) Demo and Source is free with OMAP EVM
 * Visual Tools plugin
 * WinCE R2 Pro compiled with Visual Tools
 * Various codecs, DirectShow filters etc. for a/v
 * Production Tested (two full QA passes)
 * 100% CETK passed
 * Adobe Flash Lite 3 port available for WinCE R2 BSP – OMAP35x EVM

Power for OMAP35x Processors – Miriam Corder – TI

 * Max power consumption is 360mW
 * With dynamic voltage/freq scaling – averaging <100mW
 * External power-management chips available (“analog companion”)
 * Includes audio codec, RTC, USB OTG transceiver, battery charger etc.
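
To see why dynamic voltage/frequency scaling lowers average power so effectively, a first-order estimate can help. The sketch below assumes the standard CMOS approximation that dynamic power is proportional to V² × f; this model was not part of the talk, it ignores static (leakage) power, and the function name is hypothetical.

```c
/* Ratio of dynamic power after scaling to dynamic power before scaling,
 * under the assumed first-order model P_dynamic ~ C * V^2 * f.
 * The switched capacitance C cancels out of the ratio. */
double dynamic_power_ratio(double v_new, double v_old,
                           double f_new, double f_old)
{
    double v = v_new / v_old;   /* relative core voltage    */
    double f = f_new / f_old;   /* relative clock frequency */
    return v * v * f;
}
```

With purely illustrative numbers, halving both voltage and frequency gives `dynamic_power_ratio(0.5, 1.0, 0.5, 1.0)`, i.e. 0.125 of the original dynamic power. Real operating points are constrained by the silicon, so figures for an actual OMAP35x device would need to come from TI's documentation.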