Instruction Set Enhancements Optimized for SLAM Algorithms

Cadence Design Systems has expanded the high end of its Tensilica Vision DSP product family with the introduction of the Cadence Tensilica Vision Q7 DSP, which delivers up to 1.82 tera operations per second (TOPS).

May 17, 2019

by eeDesignIt Editorial

To address the increasing computational requirements for embedded vision and AI applications, the sixth-generation Vision Q7 DSP provides up to 2X greater AI and floating-point performance in the same area compared to its predecessor, the Vision Q6 DSP.

The Vision Q7 DSP is specifically optimized for simultaneous localization and mapping (SLAM), a technique commonly used in the robotics, drone, mobile and automotive markets to automatically construct or update a map of an unknown environment, and in the AR/VR market for inside-out tracking.

Escalating demand for image sensors in edge applications is driving growth of the embedded vision market. Today’s vision use cases demand a mix of both vision and AI operations, and edge SoCs require highly flexible, high-performance vision and AI solutions operating at low power. In addition, edge applications that include an imaging camera demand a vision DSP capable of performing pre- or post-processing before any AI task.

While performing SLAM, edge SoCs also require a computational offload engine to increase performance, reduce latency and further lower power for battery-operated devices. Because SLAM utilizes fixed- and floating-point arithmetic to achieve the necessary accuracy, any vision DSP employed for SLAM must provide higher performance for both data types.

With its low power consumption and its architectural and instruction-set enhancements, the Vision Q7 DSP targets the most demanding edge vision and AI processing requirements and improves on its predecessors in several key metrics:

A very long instruction word (VLIW) SIMD architecture delivers up to 1.7X higher TOPS than the Vision Q6 DSP in the same area.
An enhanced instruction set supporting 8/16/32-bit data types, with optional vector floating-point unit (VFPU) support for single and half precision, enables up to 2X faster performance on SLAM kernels compared to the Vision Q6 and Vision P6 DSPs.
Floating-point operations per mm² (FLOPS/mm²) improve by up to 2X for both half precision (FP16) and single precision (FP32) compared to the Vision Q6 and Vision P6 DSPs.
AI performance in the same area is up to 2X greater than that of the Vision Q6 DSP, which corresponds to an up to 2X improvement in GMAC/mm².

For AI applications, the Vision Q7 DSP provides a flexible solution delivering 512 8-bit MACs, compared to 256 MACs for the Vision Q6 DSP. For greater AI performance, the Vision Q7 DSP can be paired with the Tensilica DNA 100 processor.
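As a rough check on how the MAC count relates to the headline throughput figure, the sketch below works backwards from the numbers quoted above. It assumes that each multiply-accumulate counts as two operations and that all 512 MACs issue on every cycle. The article states neither the counting convention nor the clock frequency, so the result (roughly 1.78 GHz) is only what those assumptions imply, and the function names are illustrative.

```python
# Back-of-the-envelope relation between MACs per cycle, clock rate and peak TOPS.
# Assumption (not from the article): one MAC = 2 operations (multiply + add),
# and every MAC unit is busy on every cycle.

OPS_PER_MAC = 2


def peak_tops(macs_per_cycle: int, clock_hz: float) -> float:
    """Peak tera-operations per second for a given MAC count and clock."""
    return macs_per_cycle * OPS_PER_MAC * clock_hz / 1e12


def implied_clock_hz(tops: float, macs_per_cycle: int) -> float:
    """Clock frequency that a quoted peak TOPS figure would imply."""
    return tops * 1e12 / (macs_per_cycle * OPS_PER_MAC)


if __name__ == "__main__":
    clock = implied_clock_hz(1.82, 512)  # figures quoted in the article
    print(f"Implied clock: {clock / 1e9:.2f} GHz")
    print(f"Round trip:    {peak_tops(512, clock):.2f} TOPS")
```

Under a different counting convention, or with a less than fully utilized MAC array, the implied clock would change accordingly.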

In addition to computational performance, the Vision Q7 DSP boasts a number of iDMA enhancements including 3D DMA, compression and a 256-bit AXI interface. The Vision Q7 DSP is a superset of the Vision Q6 DSP, which preserves customers’ existing software investment and enables an easy migration from the Vision Q6 or Vision P6 DSPs.
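A 3D DMA transfer is typically described by a base address plus a row stride and a plane stride, so that a single descriptor can gather a tile from a larger tensor into contiguous local memory. The following is a minimal software model of that idea only. It is not the iDMA programming interface, which the article does not describe, and every name and parameter in it is hypothetical.

```python
# Software model of a generic 3D strided gather (hypothetical, for illustration).
# A real DMA engine would do this in hardware from a descriptor; here the three
# dimensions are: bytes within a row, rows within a plane, and planes.


def dma_3d_gather(src: bytes, src_offset: int, row_bytes: int, num_rows: int,
                  num_planes: int, row_pitch: int, plane_pitch: int) -> bytes:
    """Copy a (num_planes x num_rows x row_bytes) tile into a contiguous buffer."""
    dst = bytearray()
    for plane in range(num_planes):
        for row in range(num_rows):
            start = src_offset + plane * plane_pitch + row * row_pitch
            dst += src[start:start + row_bytes]
    return bytes(dst)


if __name__ == "__main__":
    # Source tensor: 2 planes of 4 rows x 4 bytes, filled with 0..31.
    tensor = bytes(range(2 * 4 * 4))
    # Gather the 2x2 tile starting at row 1, column 1 of each plane.
    tile = dma_3d_gather(tensor, src_offset=1 * 4 + 1, row_bytes=2, num_rows=2,
                         num_planes=2, row_pitch=4, plane_pitch=16)
    print(list(tile))
```

Expressing the whole tile as one strided transfer, rather than one transfer per row, is what lets the processor keep computing while data moves in the background.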

“The applications for visual AI are very diverse and are growing very fast, and these applications have huge appetites for computing performance. Achieving the required levels of performance with acceptable cost and power consumption is a common challenge, particularly as vision is increasingly deployed into cost-sensitive and battery-powered devices,” said Jeff Bier, Founder of the Embedded Vision Alliance. “I applaud Cadence for its commitment to address this challenge by developing a series of processing engines tuned for the needs of visual AI applications.”

Lazaar Louis, Senior Director of Product Management and Marketing for Tensilica IP at Cadence, said: “For edge computing in our target markets, offloading vision applications on a high-performance, low-power, highly flexible DSP is a must.

“Cadence has a long and successful track record spanning six generations of Vision DSPs, and the Vision Q7 DSP was designed to address the needs of our key customers deploying highly complex vision and AI algorithms, including SLAM for perception.

“The Vision Q7 DSP strengthens our very successful automotive portfolio, bringing leading-edge computation to the ‘computer in the car’ that can be compliant with safety requirements like ISO 26262.”

The Vision Q7 DSP supports AI applications developed in the Caffe, TensorFlow and TensorFlow Lite frameworks through the Tensilica Xtensa Neural Network Compiler (XNNC), which maps neural networks into highly optimized executable code for the Vision Q7 DSP.

The Vision Q7 DSP also supports the Android Neural Networks API (NNAPI) for on-device AI acceleration in Android-powered devices. The software environment features complete and optimized support for more than 1,700 OpenCV-based vision library functions, enabling fast, high-level migration of existing vision applications.

In addition, development tools and libraries are all designed to enable SoC vendors to achieve ISO 26262 automotive safety integrity level D (ASIL D) certification.

The Vision Q7 DSP has been sampled to strategic customers and is expected to be available for general release in the second quarter of 2019.
