There is a lot more to smartphones than just the operating system (OS) and an app count. However refined a phone’s OS may be, the performance and, more importantly, the productivity the user gets also depend on the number-crunching power the hardware packs within. The race to improve performance is no longer driven by megahertz or the number of central processing unit (CPU) cores alone.
Before getting into the nitty-gritty, it’s important to know that smartphones and tablets use a system-on-chip (SoC). An SoC is the equivalent of a computer motherboard, including the central processing unit, graphics processing unit and memory, on a single chip. There is no way to predict the performance of a smartphone by simply checking out its processor. Even SoCs with the same processor cores can vary greatly in performance. This could be due to a different configuration (clock speed, GPU, etc.). There is also the chance of some vendors gaming a benchmark by tuning their product for that particular test, often at the expense of performance on actual workloads.
However, a smartphone with a high-end SoC does have some expectations to live up to. In this article, we provide benchmarks for most of the popular SoCs so that you can compare them. Remember though, that the benchmark results indicated here may tell a significantly different story from real-world performance, especially for multi-core SoCs. This is particularly true when the OS in question has been optimised for those multiple cores.
SoC processor cores
Often, SoCs use processors and designs from the British company ARM Holdings plc. One of the primary reasons why ARM (originally Advanced RISC Machines) processors came to be so widely used in SoCs for mobile devices and portables is their low electric power consumption. SoCs built around CPUs licensed from ARM Holdings often pair them with GPUs licensed from Imagination Technologies, which is known for its PowerVR graphics cores. PowerVR graphics processor designs are licensed to many SoC makers including Samsung, Apple, Texas Instruments, Intel, NEC, NXP and Freescale.
Another popular core is the Snapdragon core, a.k.a. Scorpion. It is designed and built by Qualcomm using the ARMv7 instruction set. Snapdragon is considered to perform better for multimedia-related single-instruction, multiple-data (SIMD) operations. The graphics processors in Qualcomm’s SoCs are usually Adreno-flavoured processors from Imageon, a subsidiary of Qualcomm and descendant of ATI.
Another core is the Intel Atom CPU for SoCs featured in mobile Internet devices (MIDs). Intel SoCs are also paired with PowerVR SGX GPUs. The latest release was the Moorestown platform with a 45nm Atom CPU.
How good a processor core is for multi-CPU designs depends on:
1. Performance density in the form of maximum aggregate performance per watt or per square millimetre
2. Inter-processor communications that minimise inefficiencies in the partitioning boundaries
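The first criterion above can be made concrete with a small calculation. The sketch below is only an illustration: the two cores and all their figures are hypothetical placeholders, not measurements of any SoC discussed in this article. It shows how performance density is obtained by dividing an aggregate benchmark score by power or by silicon area.

```python
from dataclasses import dataclass


@dataclass
class CoreEstimate:
    """Rough figures for one processor core (all values are made up)."""
    name: str
    score: float      # aggregate benchmark score, arbitrary units
    watts: float      # power drawn while running the benchmark
    area_mm2: float   # silicon area occupied by the core


def perf_per_watt(core: CoreEstimate) -> float:
    """Performance density expressed per watt."""
    return core.score / core.watts


def perf_per_mm2(core: CoreEstimate) -> float:
    """Performance density expressed per square millimetre."""
    return core.score / core.area_mm2


# Hypothetical cores, for illustration only.
core_a = CoreEstimate("core-a", score=5000.0, watts=0.5, area_mm2=4.0)
core_b = CoreEstimate("core-b", score=6000.0, watts=1.0, area_mm2=8.0)

for core in (core_a, core_b):
    print(core.name, perf_per_watt(core), perf_per_mm2(core))
```

With these placeholder figures, the core with the lower raw score comes out ahead on both density measures, which is why raw scores alone say little about suitability for multi-CPU designs.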
Various software tools are run on the complete device to test its processor’s performance.
A system-on-chip consists of many sub-components such as one or more CPUs, graphics processing units (GPUs), digital, analogue, mixed-signal and radio frequency functions, random-access memory (RAM), read-only memory (ROM), Flash memory and EEPROM, oscillators and phase-locked loops (PLLs), real-time timers, analogue-to-digital converters (ADCs), digital-to-analogue converters (DACs), power management circuits and external interfaces such as USB, FireWire and Ethernet. All this is placed on a single substrate.
Simply put, the video encoding and decoding hardware powers the ‘camcorder’ functionality. The image processor ensures that photos are processed properly and saved quickly, and the audio processor frees the CPUs from having to work on audio signals thus allowing these to work on other tasks. Together, all these components and their associated drivers define the overall performance of a system.
BrowserMark appears to have a bug whereby one stage of its benchmark suite can be intentionally skipped by using certain browsers. Apple A5 CPU benchmarks are hence higher than they should be.
GLBenchMark 2.1 Offscreen. This is an OpenGL ES benchmark with graphic scenes representing high-end gaming content (refer Fig. 3). The offscreen measurements are where all high-level scenes are rendered offscreen at 1280×720 pixel resolution. This method is claimed to provide apples-to-apples performance comparison for all GPUs as it utilises the GPU’s power as well for calculations.
Unlike the inflated BrowserMark score for the Apple A5, its GLBenchMark score is accurate because the A5 does feature a high-end dual-core GPU.
If you look at the top three scoring SoCs (refer Fig. 2), you would find that clock speed does not define performance. Snapdragon MSM8260, despite having the highest clock speed of 1.5 GHz, still rates lower than the Apple A5 SoC running at 800 MHz (though the A5 has a slight advantage due to the Safari-BrowserMark bug).
The reason why MSM8260 lost even with the spruced-up clock speed is that its Scorpion cores are Qualcomm’s own ARMv7 design, comparable to ARM’s Cortex-A8. The other two SoCs that beat it are based on the Cortex-A9, which is known to perform 20 per cent better than A8. This result also shows why you should not compare the performance of processors with different architectures based on clock speed alone. A better way to do so is to compare the performance per clock of the chips.
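Performance per clock is simply the benchmark score divided by the clock frequency. The following sketch shows the arithmetic; the clock speeds are the ones quoted in this article, but the scores are round placeholder numbers chosen for illustration and are not the results plotted in Fig. 2.

```python
def score_per_mhz(score: float, clock_mhz: float) -> float:
    """Normalise a benchmark score by clock frequency."""
    if clock_mhz <= 0:
        raise ValueError("clock_mhz must be positive")
    return score / clock_mhz


# Clock speeds as quoted in the article; scores are PLACEHOLDERS.
chips = {
    "Snapdragon MSM8260": {"clock_mhz": 1500, "score": 1500},
    "Apple A5": {"clock_mhz": 800, "score": 1600},
}

per_clock = {
    name: score_per_mhz(spec["score"], spec["clock_mhz"])
    for name, spec in chips.items()
}

# Rank chips by work done per clock cycle rather than by raw score.
ranked = sorted(per_clock, key=per_clock.get, reverse=True)
for name in ranked:
    print(f"{name}: {per_clock[name]:.2f} points per MHz")
```

Two chips with similar raw scores can differ by a factor of two once the clock is divided out, which is the effect described above.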
In the graphics benchmark (refer Fig. 3), Apple’s A5 wins hands down with its PowerVR SGX543MP2 graphics solution. Obviously it’s because this is a dual-core GPU. Moreover, SGX540 (belongs to Series 5) used in OMAP4460 was released around two years before the SGX543 (belongs to Series 5XT). As the chips used were from different generations, it was not a fair fight.
Within the single-core GPUs, Mali-400 outperformed SGX540, ULP GeForce and Adreno 220 by a wide margin. But since most of the popular games out in the Android market are running Tegra 2 optimised versions, the real-life performance from a gaming perspective would be far better than what was benchmarked here.
Based on the two benchmarks, the overall score performance of different SoCs is provided on the next page.
TI OMAP3430 features an ARM Cortex A8 processor clocked at 600 MHz. The ARM Cortex A8 processor is based on the ARMv7 architecture and is claimed to scale in speed from 600 MHz to greater than 1 GHz. In fact, many people run hacked kernels where the processor is overclocked from 600 MHz to 1.15 GHz. The down-side, however, is that the SoC uses a lot of power as its clock gets scaled up.
This processor also supports a super-scalar micro-architecture with NEON technology for SIMD processing.
Samsung Hummingbird 1GHz
Samsung Hummingbird too is based on the ARM Cortex A8 architecture, albeit at a smaller 45nm process than the TI OMAP 3430. It was jointly developed with Intrinsity, which tweaked the CPU by using 1-of-n domino logic (NDL) to trim inefficiencies in the way logic gates are used. Due to this, the Hummingbird CPU delivers not only high media and data crunching performance in mobile devices but also lower power consumption.
Hummingbird comes with 32 kB of both data and instruction cache, a variable-size L2 memory cache and the ARM NEON multimedia extension. This SoC features a PowerVR SGX 540 GPU.
Apple A4 800MHz
Apple A4 SoC is based on the ARM architecture. It is designed by Apple and manufactured by Samsung. It combines an ARM Cortex-A8 CPU with a PowerVR GPU and emphasises power efficiency. The chip made a commercial debut with the release of Apple’s iPad tablet, followed shortly by the iPhone 4 smartphone, the fourth-generation iPod Touch and the second-generation Apple TV.
The A4 processor package does not contain RAM but supports package-on-package (PoP) installation. This allows the different devices which utilise it to have different RAM configurations, like 256MB low-power DDR SDRAM for the iPad and 512MB low-power DDR SDRAM for the iPhone 4. The SoC features a PowerVR SGX 535 GPU.
NVIDIA Tegra 2 1GHz
The NVIDIA Tegra 2 is a dual-core SoC that features an ultra-low-power (ULP) NVIDIA GeForce GPU with four pixel shaders and four vertex shaders. The processor within is actually a dual-core ARM Cortex-A9 CPU. The Cortex-A9 delivers about 25 per cent more Dhrystone MIPS than the Cortex-A8. Here each Cortex-A9 core runs at 1 GHz and has its own 32kB L1 cache, with a 1MB L2 cache on the chip. The GPU is a fully programmable processor with eight cores that supports OpenGL ES 2. Tegra 2 being an NVIDIA product, a lot of Android games are believed to be optimised for it.
This chipset also features a 1080p playback processor, which allows HD movie playback without taxing the main processor and thus saving battery life. It has been designed using a 40nm process. There is also a version of this SoC supporting 3D displays. Named the Tegra 2 3D, this SoC uses a higher-clocked CPU (1.2GHz) and GPU.
Samsung Exynos 4210 1.2GHz
The Exynos 4210 uses a dual-core Cortex-A9 CPU. It provides features such as high memory bandwidth, native triple display (2 WSVGA+1 HDMI out simultaneously), 1080p video decode and encode hardware, 3D graphics hardware and high-speed interfaces such as SATA and USB. The application processor also supports a DDR-based eMMC 4.4 interface to increase the filesystem’s performance. Exynos 4210 uses ARM’s Mali-400 MP GPU. This GPU is a move away from the PowerVR GPU of the Samsung Galaxy S.
Qualcomm Snapdragon MSM8260 1.5GHz
Also known as the Snapdragon S3, this SoC features dual-core Scorpion CPUs and an Adreno 220 GPU built using a 45nm process. It uses the ARMv7 instruction set and has single-channel 333MHz ISM/266 MHz LPDDR2 and 512kB L2 cache. The Adreno GPU has often been the epicentre of debates on whether benchmark scores can actually relate to real-world performance. Snapdragon processors are known to be better at multimedia-related SIMD processes even though they score lower in benchmark tests. Snapdragon’s hardware acceleration of Adobe Flash and WebGL content delivers a smooth Web experience without the freezing or jerking seen if you have tasks running in the background.
This SoC also utilises a Harvard Superscalar architecture. It comes with an embedded processor (supporting GSM, GPRS, EDGE, UMTS/WCDMA, HSDPA 7.2Mbps, HSUPA 5.76Mbps, HSPA+ 28Mbps/11Mbps, MBMS baseband), an embedded seventh-generation gpsOne GPS module and gpsOneXTRA Assistance.
Apple A5 800MHz
Designed by Apple and manufactured by Samsung using a 45nm process, this is the latest ARM processor from Apple featured in its iPhone 4S and iPad 2 devices. A5 is a chip based upon the dual-core ARM Cortex-A9 MPCore CPU with NEON SIMD accelerator and a dual-core PowerVR SGX543MP2 GPU clocked at 200 MHz. Apple lists the A5 as clocked at 1 GHz on the iPad 2’s technical specifications page, though it can dynamically adjust its frequency to economise on battery life. The SoC also contains 512 MB of DDR2 memory clocked at 533 MHz.
This chip includes an image signal processor unit that does advanced image post-processing such as face detection, white balance and automatic image stabilisation. Apple remains the only company to have a PowerVR SGX543 graphics solution on its SoC. This solution makes the A5 chip a lot more expensive than the rest.
TI OMAP4460 1.2GHz
OMAP 4460 features a dual-core ARM Cortex-A9 MPCore built using a 45nm process just like the Apple A5. One difference is that this OMAP is clocked much higher (1.2GHz) than the A5 (800MHz). TI’s datasheet specifies that OMAP 4460 can handle up to 1.5 GHz but it’s clocked at 1.2 GHz to keep battery consumption within limits. The GPU featured here is the PowerVR SGX540, which helps in 2D and 3D graphics acceleration.
This SoC also features TI’s Smartflex technology that reduces power consumption by dynamically controlling the voltage, frequency and power based on the device activity, modes of operation and temperature. It can handle 1080p HD video as well as 1080p stereoscopic 3D video.
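Why lowering voltage and frequency saves so much power can be seen from the standard first-order model of CMOS dynamic power, P = C × V² × f. The sketch below applies that textbook relation; it is not a description of how Smartflex itself works. The 1.2 GHz and 1.5 GHz clocks come from the text above, while the 10 per cent voltage increase is purely an assumed figure for illustration.

```python
def relative_dynamic_power(freq_ratio: float, voltage_ratio: float = 1.0) -> float:
    """Dynamic power relative to a baseline, using P = C * V^2 * f.

    The switched capacitance C is assumed unchanged, so it cancels out.
    """
    if freq_ratio <= 0 or voltage_ratio <= 0:
        raise ValueError("ratios must be positive")
    return freq_ratio * voltage_ratio ** 2


# Raising the clock from 1.2 GHz to 1.5 GHz at the same voltage.
same_voltage = relative_dynamic_power(1.5 / 1.2)

# Same clock increase, ASSUMING the higher clock needs 10% more voltage.
higher_voltage = relative_dynamic_power(1.5 / 1.2, voltage_ratio=1.1)

print(f"Same voltage:   {same_voltage:.2f}x baseline power")
print(f"+10% voltage:   {higher_voltage:.2f}x baseline power")
```

Because voltage enters as a square, a 25 per cent clock increase that also needs a modest voltage bump costs roughly 50 per cent more dynamic power in this model, which is consistent with vendors shipping chips below their maximum rated clock.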
The upcoming chips show a trend wherein performance alone ceases to be the driving factor. Mindsets have changed and battery life is just as important as performance. The new chips deliver by featuring low-power cores intended to take on the processing strain in stand-by mode.
Samsung Exynos 5250
Samsung’s Exynos 5250 packs two ARM Cortex-A15 processors clocked at 2 GHz. The dual-core chip is claimed to offer about twice the CPU performance of existing products that are equipped with a pair of 1.5GHz ARM Cortex-A9 processors. It is built using a 32nm low-power high-K metal gate process.
The flagship of TI’s new OMAP5 series, this 28nm SoC integrates two ARM Cortex-A15 MPCores that are clocked at 2 GHz each, two Cortex-M4 cores that are used as accelerators and CPUs that power the device when in low-power/standby mode. According to TI, the A15 architecture is about 50 per cent faster than the preceding A9 cores and the entire chip is about three times faster than the previous generation (OMAP4, which isn’t available yet).
The graphics engine is based on the PowerVR SGX544-MPx core as well as TI’s own 2D graphics engine. Users can run 1080p video at 60 frames per second and convert 2D 1080p video into S3D in 1080p in real time. TI claims that it has enough horsepower to support a 2D digital camera with up to 24-megapixels resolution or 12-megapixels 3D resolution. It also features support for up to 8GB DDR3 memory, USB 3.0, HDMI 1.4a (3D) and a display resolution of up to 2560×2048 pixels.
Intel Medfield Platform
Specifications and benchmark results of Intel’s 32nm x86 Atom SoC have been leaked. According to the leak, the Medfield tablet platform consists of a 1.6GHz CPU, 1 GB of DDR2 RAM, WiFi, Bluetooth, FM radio and some kind of GPU. The smartphone variant will probably be clocked slower. Also, there’s no mention of whether a GSM/LTE radio will be included in the chip.
Intel Medfield 1.6GHz currently scores around 10,500 in CaffeineMark 3. For comparison, NVIDIA Tegra 2 scores around 7500, while Qualcomm Snapdragon MSM8260 scores 8000. However, this doesn’t seem to be a fair comparison as the NVIDIA and Snapdragon chips have been out for a year.
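To put those three scores in perspective, the short sketch below works out Medfield’s percentage lead and a per-clock figure. The scores are the ones quoted above; the clock speeds are those given elsewhere in this article, and it is an assumption that the quoted scores were obtained at exactly these clocks.

```python
# CaffeineMark 3 scores as quoted in the article (approximate).
scores = {
    "Intel Medfield": 10500,
    "NVIDIA Tegra 2": 7500,
    "Qualcomm Snapdragon MSM8260": 8000,
}

# Clock speeds from the article; ASSUMED to match the tested parts.
clocks_mhz = {
    "Intel Medfield": 1600,
    "NVIDIA Tegra 2": 1000,
    "Qualcomm Snapdragon MSM8260": 1500,
}


def lead_percent(score: float, baseline: float) -> float:
    """How far `score` is ahead of `baseline`, in per cent."""
    return (score - baseline) * 100 / baseline


medfield = scores["Intel Medfield"]
for name, score in scores.items():
    if name != "Intel Medfield":
        print(f"Medfield leads {name} by {lead_percent(medfield, score):.2f}%")

# Score per MHz, to separate architecture from raw clock speed.
per_mhz = {name: scores[name] / clocks_mhz[name] for name in scores}
for name, value in per_mhz.items():
    print(f"{name}: {value:.2f} points per MHz")
```

Under these assumptions, Medfield’s lead in raw score does not carry over to the per-clock figure, where the Tegra 2 comes out highest. That is one more reason to treat the leaked comparison with caution.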
Quad-core smartphones are guaranteed to be part of the next slew of high-end devices that aim to provide even greater performance improvement. Quad-core tablets have already been released, like the ASUS Eee Transformer Prime and the Lenovo IdeaPad K2.
NVIDIA’s KAL-EL/Tegra 3 is a quad-core chip with a GPU that is claimed to be three times faster than that of Tegra 2. The overall performance is estimated to be five times that of Tegra 2.
The Project Kal-El processor implements a novel variable symmetric multiprocessing (vSMP) technology. vSMP includes a fifth CPU core (the ‘companion’ core) built using a special low-power silicon process that executes tasks at low frequency for active standby mode, music playback and video playback. All five CPU cores are identical ARM Cortex A9 CPUs, which are individually enabled and disabled based on the work load.
Samsung has provided support for Exynos 4412 on its development boards. The 4412 is similar to the Exynos 4212 and takes its cue from the ARM Cortex A9 platform, touting four CPU cores. Each core has a clock speed of 1.5 GHz like that of the 4212.
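The idea of enabling and disabling cores according to work load can be pictured with a toy model. The sketch below is not NVIDIA’s algorithm: the function name, the demand scale and the 0.25 threshold are all hypothetical, and it assumes that the companion core and the main cores are never active at the same time.

```python
import math


def select_cores(demand: float, companion_capacity: float = 0.25,
                 main_cores: int = 4) -> dict:
    """Toy core-selection policy for a 4+1 core layout.

    `demand` is the work load expressed in units of one main core
    running flat out (so 2.3 means "a bit more than two cores' worth").
    `companion_capacity` is a HYPOTHETICAL limit on what the low-power
    companion core can handle alone.
    """
    if demand < 0:
        raise ValueError("demand must be non-negative")
    if demand <= companion_capacity:
        # Light work (standby, music, video playback): companion only.
        return {"companion": True, "main_active": 0}
    # Heavier work: switch to the main cores, enabling only as many
    # as the load needs, up to the number available.
    needed = min(main_cores, math.ceil(demand))
    return {"companion": False, "main_active": needed}


for load in (0.1, 0.8, 2.3, 9.0):
    print(load, select_cores(load))
```

In this model a light load such as music playback keeps only the companion core awake, while heavier loads bring up one to four main cores as needed.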