BDTI Benchmarks™ results for the ARM Cortex-A8

ARM Cortex-A8 Benchmark Results

February 26, 2008 (Dallas, TX) Texas Instruments yesterday introduced four new members of its OMAP3 family of high-end application processors, and announced that these chips will be offered broadly as part of TI’s “catalog” product line.
Earlier members of the OMAP3 family were available only to selected customers of TI’s Wireless Terminal Business Unit (WTBU), typically big-name cell phone manufacturers with very high volumes. The new catalog parts are the OMAP3503, OMAP3515, OMAP3525, and OMAP3530.
Like the original OMAP3 chip (the OMAP3430, which remains exclusive to TI’s wireless handset chip division), all four of the new OMAP35xx chips use the ARM Cortex-A8, a superscalar 32-bit CPU core. Read the complete Inside DSP article about TI’s OMAP3 family.
BDTI Benchmarks™ results for the Cortex-A8 are shown below.

About the ARM Cortex-A8

The Cortex-A8, announced in 2005, is a 32-bit licensable core developed by ARM Limited. It implements the ARMv7 instruction set. One major difference between the Cortex-A8 and previous ARM cores is the addition of the NEON instruction set extensions, which are designed to accelerate multimedia tasks. Using these extensions, the Cortex-A8 can execute up to four 16-bit multiply-accumulate operations per cycle (versus two for the ARM11). The Cortex-A8 targets chips in high-performance cellular handsets, as well as set-top boxes, printers, and automotive infotainment applications.

The Cortex-A8 uses a superscalar (a first for an ARM core), dual-issue, in-order execution pipeline. The pipeline is unusually long: it comprises a 13-stage main pipeline and a 10-stage NEON pipeline for data-processing execution. In contrast, the ARM11's pipeline has only 8 stages. According to ARM, the Cortex-A8's long pipeline will enable high clock rates, potentially exceeding 1 GHz in a 65 nm process.

Unlike ARM’s other licensable cores, which are almost always implemented using a typical logic synthesis methodology, the Cortex-A8 is intended to be implemented using either that methodology or a highly optimized semi-custom design style. Initial Cortex-A8 licensees creating highly optimized implementations may apply hand-crafted library cells and other physical-level optimizations to improve both frequency and power relative to traditional synthesis methodologies. As a result, BDTI does not have clock speed, silicon area, or power consumption data for the Cortex-A8 based on BDTI's standardized conditions for processor cores. Caution should therefore be used when comparing BDTI's Cortex-A8 benchmark results with results for other processor cores.

DKB Certified

ARM Cortex-A8 on the BDTI DSP Kernel Benchmarks™

BDTI DSP Kernel Benchmarks™ results for the Cortex-A8 specify BDTIsimMark2000™ per MHz. Multiply this figure by projected clock rate to obtain projected BDTIsimMark2000™.

Processor Family    Clock Rate (MHz)    BDTIsimMark2000™
ARM Cortex-A8       N/A                 7.6 per MHz
Table 1.  ARM Cortex-A8 Performance on BDTI DSP Kernel Benchmarks™
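
The scaling rule described above (per-MHz score multiplied by a projected clock rate) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name and the clock rates used in the loop are assumptions chosen for the example, not figures published by BDTI.

```python
# BDTIsimMark2000 per MHz for the Cortex-A8, as reported in Table 1.
SIMMARK_PER_MHZ = 7.6


def projected_simmark(clock_mhz: float) -> float:
    """Scale the per-MHz score by a projected clock rate given in MHz."""
    if clock_mhz <= 0:
        raise ValueError("clock rate must be positive")
    return SIMMARK_PER_MHZ * clock_mhz


if __name__ == "__main__":
    # Hypothetical clock rates, chosen only for illustration.
    for clock_mhz in (600, 1000):
        score = projected_simmark(clock_mhz)
        print(f"{clock_mhz} MHz -> projected BDTIsimMark2000 of about {score:.0f}")
```

Because the score scales linearly with clock rate, a projection is only as good as the clock rate assumed for a particular implementation and process.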

Processor Family    Clock Rate (MHz)    BDTImemMark2000™
ARM Cortex-A8       N/A                 78
Table 2.  ARM Cortex-A8 Memory Efficiency on BDTI DSP Kernel Benchmarks™

For more detailed Cortex-A8 benchmark results, see BDTI’s offerings of results for ARM processors or contact BDTI.


ARM Cortex-A8 on the BDTI Video Encoder and Decoder Benchmarks™

BDTI Video Encoder and Decoder Benchmarks™ results for the ARM Cortex-A8 are reported at the following two operating points:

  • QVGA Operating Point
    At this operating point the benchmarks process a video sequence at QVGA resolution (320x240) with a frame rate of 30 fps. This is appropriate for mobile applications such as cell phones that have small displays.
  • D1 Operating Point
    At this operating point the benchmarks process a video sequence at standard-definition television resolution (720x480, also known as “D1” resolution) with a frame rate of 30 fps. This is appropriate for applications such as personal media players (PMPs), digital surveillance equipment, and set-top boxes.
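
To give a feel for how different the two operating points are in raw workload, the pixel rate of each can be computed from its resolution and frame rate. The sketch below is illustrative only; the dictionary and function names are assumptions made for this example.

```python
# Operating points as described above: (width, height, frames per second).
OPERATING_POINTS = {
    "QVGA": (320, 240, 30),
    "D1": (720, 480, 30),
}


def pixels_per_second(width: int, height: int, fps: int) -> int:
    """Number of pixels processed per second at a given resolution and frame rate."""
    return width * height * fps


if __name__ == "__main__":
    for name, (width, height, fps) in OPERATING_POINTS.items():
        rate = pixels_per_second(width, height, fps)
        print(f"{name}: {rate:,} pixels/s")
```

At the same frame rate, the D1 operating point carries 4.5 times as many pixels per second as the QVGA operating point.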

BDTI Video Encoder and Decoder Benchmarks™ results for the Cortex-A8 specify cycles/s. Divide this figure by projected clock rate to obtain projected processor utilization.

[Charts: BDTI Video Decoder Benchmark™, QVGA (320x240) Operating Point; BDTI Video Decoder Benchmark™, D1 (720x480) Operating Point; BDTI Video Encoder Benchmark™, QVGA (320x240) Operating Point]

Processor Family    QVGA Decode            D1 Decode              QVGA Encode
                    Cycles/s (millions)    Cycles/s (millions)    Cycles/s (millions)
ARM Cortex-A8       114                    504                    421
Table 1.  ARM Cortex-A8 Performance on BDTI Video Encoder and Decoder Benchmarks™ for Specified Operating Points
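
The utilization calculation described earlier (cycles per second divided by projected clock rate) can be sketched as follows. Since the results are in millions of cycles per second and clock rates are in MHz, the units cancel and the result is a fraction of the processor. The function name and the 600 MHz clock rate are assumptions used only for illustration.

```python
# Cycles per second (in millions) for each operating point, from the table above.
CYCLES_PER_S_MILLIONS = {
    "QVGA Decode": 114,
    "D1 Decode": 504,
    "QVGA Encode": 421,
}


def utilization(cycles_millions: float, clock_mhz: float) -> float:
    """Fraction of the processor consumed; a value above 1.0 exceeds the clock budget."""
    if clock_mhz <= 0:
        raise ValueError("clock rate must be positive")
    return cycles_millions / clock_mhz


if __name__ == "__main__":
    clock_mhz = 600  # hypothetical clock rate, not a BDTI figure
    for task, cycles in CYCLES_PER_S_MILLIONS.items():
        print(f"{task} at {clock_mhz} MHz: {utilization(cycles, clock_mhz):.0%}")
```

At the assumed 600 MHz, this works out to roughly 19% for QVGA decode, 84% for D1 decode, and 70% for QVGA encode. A result above 100% would indicate that the task cannot run in real time at that clock rate.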

ARM Cortex-A8
  Clock Speed (MHz)                   N/A
  L1 Instruction Cache (Kbytes)       16
  L1 Data Cache (Kbytes)              16
  L2 Cache (Kbytes)                   128
  On-chip Main Memory (Mbytes)        N/A
  External Memory Speed (MHz)         1/3 CPU Clock Rate
  External Memory Bus Width (bits)    64
Table 2.  ARM Cortex-A8 Processor Architectural Details

Copyright Notice
Results for the ARM Cortex-A8 on the BDTI DSP Kernel Benchmarks™ and the BDTI Video Encoder and Decoder Benchmarks™ are copyrighted by Berkeley Design Technology, Inc. (BDTI).
No reproduction or reuse is permitted without the express written authorization of BDTI.
