BDTI Benchmark Results for the ARM Cortex-A8

Looking for ARM Cortex-A8 benchmark results?  You've come to the right place.

The ARM Cortex-A8 is a licensable microprocessor core that forms the heart of several off-the-shelf processor chips from companies such as Texas Instruments and Freescale. BDTI has implemented two of its signal processing benchmark suites on the ARM Cortex-A8:

  • The BDTI DSP Kernel Benchmarks are a suite of 12 hand-coded assembly language algorithm kernels that measure processor performance on one-dimensional signal processing tasks.  A composite result from the BDTI DSP Kernel Benchmarks is provided in the form of a BDTImark2000 score.
  • The BDTI Video Encoder/Decoder Benchmarks measure processor performance on video encoding and decoding tasks.

Below we present estimated BDTIsimMark2000 results for chips based on the Cortex-A8.  We then discuss the Cortex-A8 core in more detail and present BDTImark2000 and BDTI Video Encoder/Decoder Benchmark results for the core itself.

Results for Chips Based on the ARM Cortex-A8

Texas Instruments and Freescale offer a range of chips based on the ARM Cortex-A8 processor core.  Figure 1 presents estimated BDTIsimMark2000 results for a number of these devices with varying clock speeds:

Figure 1: Estimated BDTIsimMark2000 results for chips based on the ARM Cortex-A8 core.

The results reported in Figure 1 are estimates of BDTIsimMark2000 scores for the chips listed.  Because BDTI has not actually measured these results running on the given hardware, the results shown are referred to as BDTIsimMark2000 scores (rather than BDTImark2000 scores).  The results are comparable with other BDTIsimMark2000 results for chips.  However, these results should not be directly compared with BDTIsimMark2000 results for licensable cores because the benchmarking conditions (and thus device clock speeds) may differ, as discussed below.  Please see the BDTImark2000 page for a discussion of BDTIsimMark2000 vs. BDTImark2000.

Overview of the ARM Cortex-A8 Licensable Core

The Cortex-A8 is a 32-bit licensable core developed by ARM Limited. It implements the ARMv7 instruction set. One major difference between the Cortex-A8 and previous ARM cores is the addition of the NEON instruction set extensions designed to accelerate multimedia tasks. Using these extensions, the Cortex-A8 can execute up to four 16-bit multiply-accumulate instructions per cycle (versus two for the ARM11). Note that all BDTI benchmark results for the Cortex-A8 make use of the NEON instruction set extensions.  The Cortex-A8 targets chips in high-performance cellular handsets, as well as set-top boxes, printers, and automotive infotainment applications.

The Cortex-A8 uses a superscalar, dual-issue, in-order execution pipeline, a first for an ARM core. The pipeline is unusually long, comprising a 13-stage main pipeline and a 10-stage NEON pipeline for data-processing execution. In contrast, the ARM11's pipeline has only 8 stages.

Unlike ARM’s other licensable cores, the Cortex-A8 is intended to be implemented using either a typical logic synthesis methodology (as is almost always done with ARM's other licensable cores) or a highly optimized semi-custom design style. Cortex-A8 licensees creating highly optimized implementations of the Cortex-A8 may apply hand-crafted library cells and other physical-level optimizations for improvements in both frequency and power over traditional synthesis methodologies. As a result, BDTI does not have clock speed, silicon area, and power consumption data for the Cortex-A8 based on BDTI's standardized conditions for processor cores. Caution should therefore be used when comparing BDTI's Cortex-A8 benchmark results with results for other processor cores.

BDTImark2000™ Results for the ARM Cortex-A8 Core

In terms of execution speed, the ARM Cortex-A8 has received a BDTI-Certified score of 7.6 BDTIsimMark2000 per MHz of clock speed.  This figure must be multiplied by the device's clock speed in MHz to obtain a BDTIsimMark2000 figure.  As of this writing, the Cortex-A8 has been successfully fabricated in chips with clock speeds ranging from 500 MHz to 1.5 GHz.  Figure 2 below shows estimated Cortex-A8 BDTIsimMark2000 performance as a function of clock speed and also indicates clock speed ranges where the device has been successfully fabricated.

Figure 2: Estimated BDTIsimMark2000 results for the ARM Cortex-A8 core as a function of clock speed. As of this writing the Cortex-A8 has been fabricated at speeds up to 1.5 GHz.
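
As a rough illustration of the scaling described above, the following Python sketch multiplies the per-MHz score by a clock speed in MHz.  The function name and the sample clock speeds are chosen here for illustration only and are not BDTI-published chip results; the 7.6 per-MHz figure and the 500 MHz to 1.5 GHz range come from the text above.

```python
# Per-MHz score quoted in the text above.
SCORE_PER_MHZ = 7.6


def estimated_bdtisimmark2000(clock_mhz: float) -> float:
    """Estimate a BDTIsimMark2000 score from a clock speed given in MHz.

    Hypothetical helper for illustration: it simply applies the
    "score per MHz times clock speed" rule stated in the text.
    """
    if clock_mhz <= 0:
        raise ValueError("clock speed must be positive")
    return SCORE_PER_MHZ * clock_mhz


if __name__ == "__main__":
    # Sample clock speeds (illustrative) within the 500 MHz to 1.5 GHz range.
    for clock_mhz in (500, 1000, 1500):
        score = estimated_bdtisimmark2000(clock_mhz)
        print(f"{clock_mhz} MHz -> about {score:.0f} BDTIsimMark2000")
```

Because the relationship is linear, doubling the clock speed doubles the estimated score; the sketch says nothing about whether a given clock speed is achievable in a particular chip.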

In terms of memory efficiency, the ARM Cortex-A8 has received a BDTI-Certified score of 78 BDTImemMark2000s.

For more information on BDTImark2000 scores please see the BDTImark2000 page.  For more detailed BDTI DSP Kernel Benchmarks results for the ARM Cortex-A8, including cycle counts for each of the twelve kernels that comprise the BDTI DSP Kernel Benchmark suite, please contact BDTI.

BDTI Video Encoder/Decoder Benchmarks™ Results for the ARM Cortex-A8 Core

BDTI Video Encoder/Decoder Benchmarks™ results for the ARM Cortex-A8 are reported at the following two operating points:

  • QVGA Operating Point. At this operating point the benchmarks process a video sequence at QVGA resolution (320x240) with a frame rate of 30 fps. This is appropriate for mobile applications such as cell phones that have small displays.
  • D1 Operating Point. At this operating point the benchmarks process a video sequence at standard-definition television resolution (720x480, also known as “D1” resolution) with a frame rate of 30 fps. This is appropriate for applications such as personal media players (PMPs), digital surveillance equipment, and set-top boxes.

BDTI-Certified Video Encoder/Decoder Benchmarks™ results for the Cortex-A8 specified in cycles/s are as follows:

ARM Cortex-A8 Performance on BDTI Video Encoder/Decoder Benchmarks

                 QVGA Decode           D1 Decode             QVGA Encode
                 Cycles/s (millions)   Cycles/s (millions)   Cycles/s (millions)
ARM Cortex-A8    114                   504                   414

The cycles/s figures shown can be divided by the device's clock speed to obtain estimated processor utilization.  Figures 3, 4, and 5 below show estimated Cortex-A8 BDTI Video Encoder/Decoder Benchmark processor utilization as a function of clock speed.
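
The division described above can be sketched in a few lines of Python.  This is a minimal illustration: the dictionary and function names are hypothetical, and the 600 MHz clock speed is an arbitrary example rather than a figure from BDTI.  The cycles/s values are those reported in the table above.

```python
# Cycles/s requirements in millions, taken from the table above.
CYCLES_PER_SECOND_MILLIONS = {
    "QVGA decode": 114,
    "D1 decode": 504,
    "QVGA encode": 414,
}


def estimated_utilization(cycles_millions_per_s: float, clock_mhz: float) -> float:
    """Return estimated processor utilization as a fraction of available cycles.

    Millions of cycles per second divided by MHz (also millions of cycles
    per second) gives a unitless fraction; a value above 1.0 means the
    workload needs more cycles than the clock provides.
    """
    if clock_mhz <= 0:
        raise ValueError("clock speed must be positive")
    return cycles_millions_per_s / clock_mhz


if __name__ == "__main__":
    clock_mhz = 600  # arbitrary example clock speed, not from the source
    for task, cycles in CYCLES_PER_SECOND_MILLIONS.items():
        utilization = estimated_utilization(cycles, clock_mhz)
        print(f"{task}: {utilization:.0%} of a {clock_mhz} MHz Cortex-A8")
```

A result above 100% indicates that, under these estimates, the core could not sustain the workload at 30 fps at the chosen clock speed.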

Results for the Video Encoder/Decoder Benchmarks on the Cortex-A8 at QVGA Decode

Figure 3: Estimated ARM Cortex-A8 processor utilization on the BDTI Video Decoder Benchmark at the QVGA (320x240) Operating Point as a function of clock speed.

Results for the Video Encoder/Decoder Benchmarks on the Cortex-A8 at D1 Decode

Figure 4: Estimated ARM Cortex-A8 processor utilization on the BDTI Video Decoder Benchmark at the D1 (720x480) Operating Point as a function of clock speed.

Results for the Video Encoder/Decoder Benchmarks for the Cortex-A8 at QVGA Encode

Figure 5: Estimated ARM Cortex-A8 processor utilization on the BDTI Video Encoder Benchmark at the QVGA (320x240) Operating Point as a function of clock speed.

Reproducing BDTI Benchmark Results for the ARM Cortex-A8 or Obtaining Detailed Results

Please note that no reproduction or reuse of the above information is permitted without the express authorization of BDTI.  For reproduction permission, to gain access to additional, detailed ARM Cortex-A8 benchmark results, or to arrange to have your processor benchmarked, please call BDTI at +1 925 954 1411 or contact us via the web.