Techrecipe

The era of mainstream dedicated AI processors begins

ARM Machine Learning and ARM Object Detection are ARM’s recently announced processors for artificial intelligence processing.

ARM Object Detection is a processor optimized for detecting faces and objects. It performs real-time detection on full HD video at 60 frames per second and, according to ARM, delivers 80 times the performance of a conventional DSP. The product is expected to be used in Internet of Things (IoT) devices such as security cameras and drones with collision avoidance.

ARM Machine Learning is a dedicated processor that speeds up general artificial intelligence workloads such as automatic translation and face recognition. On mobile devices it can perform more than 4.6 trillion operations per second, with power efficiency two to four times that of conventional designs. The product is aimed at mobile devices such as smartphones and is expected to come out later this year.

Of course, ARM has already developed DynamIQ, a technology that allows artificial intelligence processing to run on the device itself.

DynamIQ combines flexibility and versatility, and reflects ARM's intent to redefine multicore computing across devices, from edge to cloud. ARM's big.LITTLE technology, which balances high performance and low power, has combined two types of ARM cores by adding two or four low-power cores to the same design. DynamIQ evolves big.LITTLE and allows configurations that were not possible before, such as 1 + 7 or 1 + 3, so that the optimal configuration can be chosen for each environment.

According to ARM, Cortex-A processors designed with DynamIQ technology can boost artificial intelligence performance by up to 50 times over existing Cortex-A73 based systems within the next three to five years, and can make the interaction between the CPU and accelerators up to 10 times faster. SoC designers can scale up to 8 cores in a single cluster, with each core delivering different performance and power characteristics. The design can also respond quickly to machine learning and artificial intelligence apps.
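The cluster configurations mentioned above can be made concrete with a small enumeration. This is only an illustrative sketch: it assumes a cluster holds at most 8 cores and mixes at least one high-performance ("big") core with at least one low-power ("little") core. The function name and the constant are hypothetical and are not part of any ARM tool or API.

```python
from typing import List, Tuple

# Assumption taken from the article: up to 8 cores in a single cluster.
MAX_CORES_PER_CLUSTER = 8


def cluster_configs(max_cores: int = MAX_CORES_PER_CLUSTER) -> List[Tuple[int, int]]:
    """List every (big, little) mix with at least one core of each type."""
    configs = []
    for big in range(1, max_cores):
        # The remaining slots in the cluster can hold little cores.
        for little in range(1, max_cores - big + 1):
            configs.append((big, little))
    return configs


if __name__ == "__main__":
    configs = cluster_configs()
    # The article's examples, 1 + 7 and 1 + 3, are both in the list.
    for big, little in [(1, 7), (1, 3)]:
        print(f"{big} + {little} allowed:", (big, little) in configs)
    print("total mixed configurations:", len(configs))
```

Under these assumptions the list includes asymmetric mixes such as 1 + 7 alongside the 4 + 4 style pairings associated with big.LITTLE.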

Because DynamIQ responds quickly enough for ADAS solutions, it can be used in safe autonomous driving systems. It can improve safety, and an ASIL-D compliant system built on it can be expected to keep operating safely even when a failure occurs.

ARM Machine Learning and the second-generation ARM Object Detection processor allow machine learning, a technique underlying artificial intelligence, to be processed on the device rather than in the cloud. They show the direction of DynamIQ more clearly. Going forward, ARM intends to accelerate the shift of the growing machine learning workload from the cloud to the device.

Today, machine learning workloads that require high processing power are usually handled in the cloud. However, using the cloud can cause problems with response time and with the amount of data that must be sent and received. There is also the risk of data being intercepted in transit. If an AI-specific processor is used instead, machine learning can be performed on the device itself, which is an advantage in both responsiveness and security.
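The responsiveness argument can be sketched as a simple latency model. Everything here is an assumption for illustration: the class names, the formula (network round trip plus upload time plus server compute), and all the numbers are hypothetical and are not measurements from ARM or any vendor.

```python
from dataclasses import dataclass


@dataclass
class CloudPath:
    round_trip_ms: float      # network round-trip time (hypothetical)
    uplink_mbps: float        # upload bandwidth in megabits per second
    server_compute_ms: float  # inference time on the server

    def latency_ms(self, payload_kb: float) -> float:
        # kilobytes -> kilobits, divided by kilobits per millisecond
        transfer_ms = payload_kb * 8 / self.uplink_mbps
        return self.round_trip_ms + transfer_ms + self.server_compute_ms


@dataclass
class DevicePath:
    local_compute_ms: float   # inference time on an on-device AI processor

    def latency_ms(self, payload_kb: float) -> float:
        # Nothing leaves the device, so payload size adds no transfer time.
        return self.local_compute_ms


def faster_path(cloud: CloudPath, device: DevicePath, payload_kb: float) -> str:
    if device.latency_ms(payload_kb) <= cloud.latency_ms(payload_kb):
        return "device"
    return "cloud"


if __name__ == "__main__":
    cloud = CloudPath(round_trip_ms=50, uplink_mbps=10, server_compute_ms=5)
    device = DevicePath(local_compute_ms=40)
    for kb in (1, 200):
        print(kb, "KB:", cloud.latency_ms(kb), device.latency_ms(kb),
              faster_path(cloud, device, kb))
```

In this model the server can compute faster than the device and still lose overall, because the round trip and upload time are added to every request. The security benefit is not modeled; it follows from the data never leaving the device.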

According to ARM, the newly announced processors are entirely new designs that are not based on existing CPU or GPU architectures. They can be used not only in SoCs for mobile devices but also in IoT devices.

Huawei has introduced the Kirin 970, a processor with dedicated AI hardware capable of machine learning processing. The product was created by Huawei's HiSilicon Technologies and combines four Cortex-A73 cores and four Cortex-A53 cores with a Mali-G72 GPU. It is manufactured on TSMC's 10nm process. Compared with the existing Kirin 960, the die size is reduced by 40% and power efficiency is increased by 20%.

Of course, the biggest feature is that the SoC itself is equipped with an AI-specific processor called the NPU (Neural Network Processing Unit). According to Huawei, the NPU delivers 25 times the AI processing performance of a CPU and 50 times the power efficiency. The computational performance of the NPU is 1.92 TFLOPS at FP16. The supported deep learning frameworks are TensorFlow, TensorFlow Lite, Caffe, and Caffe2. On this basis, Huawei promoted the product as the world's first mobile AI processor when it was announced in 2017.

Apple has done the same. It included a dedicated machine learning processor in the A11 Bionic, the SoC in the iPhone X and iPhone 8. The A11 Bionic is a chip with 6 cores and 4.3 billion transistors that is 30% faster than the existing A10 Fusion while cutting power consumption by half. It has two high-performance cores and four high-efficiency cores; the high-performance cores are 25% faster than their predecessors and the high-efficiency cores are 70% faster. The GPU is 30% faster than before. The biggest feature is the Neural Engine, an AI-specific processor specialized for artificial intelligence tasks such as face recognition, computer vision, speech recognition, and natural language processing. It can handle up to 600 billion operations per second.
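To put the throughput figures quoted in this article side by side, the short sketch below computes their ratios. It is only arithmetic on the vendors' own numbers: the ARM figure is a lower bound ("more than 4.6 trillion"), the Huawei figure counts FP16 floating-point operations, and the three vendors do not necessarily count operations the same way, so the ratios are a rough guide and not a benchmark. The dictionary and function names are hypothetical.

```python
# Vendor-quoted peak figures from the article, in operations per second.
QUOTED_OPS_PER_SECOND = {
    "ARM Machine Learning": 4.6e12,      # "more than 4.6 trillion operations"
    "Kirin 970 NPU (FP16)": 1.92e12,     # 1.92 TFLOPS at FP16
    "A11 Bionic Neural Engine": 600e9,   # up to 600 billion per second
}


def relative_to(baseline: str, figures: dict = QUOTED_OPS_PER_SECOND) -> dict:
    """Express every quoted figure as a multiple of the baseline figure."""
    base = figures[baseline]
    return {name: value / base for name, value in figures.items()}


if __name__ == "__main__":
    ratios = relative_to("A11 Bionic Neural Engine")
    for name, ratio in ratios.items():
        print(f"{name}: {ratio:.2f}x")
```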

However, these processors have been limited to high-end models. In contrast, ARM's newly announced products, such as ARM Machine Learning, are expected to bring AI processors to entry-level mobile devices. This means that AI processing will be possible on low-cost devices.

ARM is trying to expand the range of devices that can handle artificial intelligence technology. If every user used voice search for about three minutes a day, Google would need to double its number of servers. ARM says the development of AI-specific processors is now a trend in semiconductor development. The direction of mobile semiconductor development, which until now has focused simply on higher performance and lower power, could itself change.

Indeed, in addition to these companies, Intel is also developing Loihi, an AI processor that can learn on the chip itself without relying on the cloud. Loihi is a chip for neuromorphic computing, an AI technology modeled on the structure of the human brain.

The brain's neural network transmits information through electrical signals. Spikes passing between neurons in a mesh-like network adjust the connection weights and store the changes. Human intelligence emerges from interactions within the brain's neural circuits. Intel likewise wants to develop a self-learning semiconductor chip that does not depend on the cloud, one that can take in feedback from its surroundings and learn like the human brain.

Intel has been developing the technology for the past six years and is building a prototype chip under the code name Loihi. It can combine learning and inference on the chip, enabling autonomous, real-time adaptation to the environment without waiting for data to be sent to the cloud. It has a neuron-like asynchronous core structure and a programmable learning engine that can control the parameters of each core's network. It is made on a 14nm manufacturing process and consists of 130,000 neurons and 130 million synapses. It supports algorithms for problems such as path planning, dictionary learning, and dynamic pattern learning. Intel says the chip achieves a million-fold improvement in learning rate and up to 1,000 times better energy efficiency. The Loihi chip is still under development and will be provided to research institutes from the first half of this year to share development information.

In addition, Amazon has reportedly begun designing an AI-dedicated chip to improve the quality and reduce the response time of Alexa, the voice assistant in its Echo line of voice-recognition speakers. Amazon has no chip manufacturing experience and is not expected to produce the chip itself, but it aims to improve speech recognition and response speed by having the device process more data before handing off to the cloud.

Semiconductor companies say that developing AI-specific processors is essential to realizing high-performance artificial intelligence. This is because the range of applications is expected to grow: autonomous vehicles that require excellent perception, devices that monitor health status, agricultural drones that track crop growth, augmented reality, and IoT equipment.