The video was originally produced for Dell Latitude gets things done [by STUART KENNEDY in The Australian IT, Dec 11, 2012]:
SITTING next to Apple’s sleek iPad 4, the new Dell Latitude 10 tablet looks a little drab and portly, a bit like a middle-aged bizoid squaring up against a twenty-something fitness fanatic.
But if you want to actually get something done, rather than just looking good running around the block, the homely Dell has it all over the Apple product in many ways.
The Latitude 10 is one of a new breed of tablet that can run Microsoft’s Windows 8 Pro operating system and all the enterprise friendly bits and pieces that bring a smile to the chief information officers who run large fleets of computing gadgets.
These include business-grade security and device management and easy access to virtual private networks, as well as a three-year warranty and the promise of being able to sweat the asset for much longer than a typical consumer tablet, like an iPad.
With its Intel processor, the Latitude 10 tablet can also run the software developed for previous versions of Windows, including Microsoft’s own Office productivity suite and the legions of Windows-based business applications.
While Apple has purposefully left out USB ports and memory card slots from its tablets so that you cannot expand the iPad’s memory and are locked into Apple’s model price points on differing memory capacities, the Latitude 10 has a full-sized USB port and an SD card slot for memory expansion.
The Dell Latitude 10 Windows 8 tablet and a slew of forthcoming Windows 8 tablets from Asus, Acer, Fujitsu and others use the new Intel Atom Z2760 Clover Trail chips. This dual core silicon engine runs at 1.8 GHz and uses a PowerVR SGX 545 for graphics, clocked at a speedy 533 MHz.
The Clover Trail Intel chip used in the Dell Latitude 10 is Intel’s first big push into the modern tablet chip market and it’s a lot quicker and a lot less power hungry than the old Intel Atom chips that powered the cheap netbook PCs that have taken such a hit from the advent of tablets.
As a guide, I benchmarked a 2010-model HP 5102 netbook powered by a single-core, 1.66 GHz Atom N450 chip.
Under the PCMark 7 test, the HP knocked up a score of just over 500 PCMarks. The Latitude 10 showed almost triple the grunt, churning through the benchmark in just over 1400 PCMarks.
In use, the review Latitude 10, which ran Windows 8 Pro, was quick and fluid as it wrangled Microsoft’s new tile-centric Windows 8 operating system.
It would snap quickly from desktop mode to the Start screen, would load Word in Office 2010 in a couple of seconds and would play HD movies and snack-type games such as Pinball FX2 without a stutter.
Given its potential as a laptop replacement tablet, the Latitude screams out for a combined keyboard and cover, like the nifty, snap-on, snap-off keyboard cover for Microsoft’s Surface RT tablet.
Strangely, Dell doesn’t sell such a cover but there is a docking stand for desktop use that adds four USB ports, ethernet, and a full size HDMI port into which you could plug a desktop keyboard and mouse.
The Latitude 10 should last a while. It’s built on a magnesium alloy frame, the screen is Gorilla Glass and the case is made from a pleasingly grippy material.
The 10.1-inch, 1366-pixel by 768-pixel 10-point multi-touch display lacks the wow factor of the pretty, 2048×1536 pixel panel on the latest iPad. It’s just a workmanlike display and the first thing I would spruce up on the next series of Dell tablets.
But arrayed around the Latitude 10 is all the connection stuff you don’t get with an iPad, such as a full-sized USB 2.0 port that should be able to handle any USB gadget that has a Windows driver, from keyboards to USB hard drives.
There’s a mini HDMI port for pushing presentations out on to a big screen and the 64GB of memory can be augmented in a snap via the full-size SD card slot.
A trusted platform module guards against data theft and the removable battery means long-haul road warriors can swap in a spare if they are getting low on juice and battery failure no longer means a trip to the repair shop.
The flush fitting, 30-watt-hour two-cell battery can be swapped for an optional, bulkier four-cell unit serving up 60-watt hours.
We got about 8.5 hours out of the two-cell battery running continuous video with the screen at full brightness and all radios on.
There’s a meaty, 8-megapixel rear-facing camera, with LED flash that can shoot 1080p HD video and a 720p front-facing camera.
How does the Intel silicon Dell stack up against the Microsoft Surface RT and its Arm-based innards?
I found the Dell quite a bit quicker than the Surface RT in real-world performance.
Application load times, from a fresh power start, where I pitted Windows RT code apps downloaded from the Microsoft Store against their Windows 8 counterparts from the same store, saw the Dell beat the Surface RT every time.
The Surface RT would take over six seconds to load Microsoft Word whereas the Latitude 10 would do it in less than three seconds.
Loading the Pinball FX2 game took 28 seconds on the RT and 24 seconds on the Dell; ditto the Jetpack Joyride game, which loaded in 27 seconds on the Surface and 22 seconds on the Dell.
When it comes to getting down to business, the Latitude 10’s target audience, the Dell machine has it all over the Surface in terms of enterprise grade security, compatibility with the mass of Windows software and probably ruggedness, although time will tell on that score.
Unfortunately, all the business-class stuff means a biz-class sticker price. The Latitude 10 begins at $899 [US$ 947]. Add in $125 [US$ 132] for 3G cellular connectivity, another couple of hundred for Microsoft Office, another $200 [US$ 211] for the dock and more again for a keyboard case and you are well over a grand.
PRICE: from $899. [US$ 947]
[in Australia, the version with Windows 8, 2GB RAM, 64GB SSD]
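As a rough check on the review's "well over a grand" remark, the quoted list prices can simply be totalled. This is only a sketch under stated assumptions: Office is taken as $200 (the review says "another couple of hundred"), the keyboard case has no quoted price and is left out, and the US-dollar rate is inferred from the bracketed $899 / US$ 947 pair rather than given by the article.

```python
# Prices in Australian dollars as quoted in the review.
BASE_PRICE = 899
EXTRAS = {
    "3G cellular": 125,
    "Microsoft Office": 200,  # ASSUMPTION: "another couple of hundred"
    "dock": 200,
}
# ASSUMPTION: exchange rate implied by the bracketed conversion 899 -> 947.
AUD_TO_USD = 947 / 899


def total_aud(base, extras):
    """Base price plus every listed extra, in Australian dollars."""
    return base + sum(extras.values())


def to_usd(aud, rate=AUD_TO_USD):
    """Convert to US dollars, rounded to the nearest dollar."""
    return round(aud * rate)


total = total_aud(BASE_PRICE, EXTRAS)
print(f"Total: ${total} (about US$ {to_usd(total)}) before any keyboard case")
```

With the implied rate, the same helper reproduces the bracketed conversions for the 3G option and the dock.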
Using Microsoft Surface as the point of reference for every 3rd-party hybrid on this blog, let’s see next a detailed comparison of the Dell device with the Surface:
Wistron of Taiwan Exclusively Supplies 10-inch Tablet to Dell [CENS, Dec 17, 2012]
Dell Computers’ CEO Michael Dell recently indicated that PCs and tablet PCs equipped with Windows 8 are in high demand. An industry source revealed that Dell’s 10-inch tablet Latitude 10 will be exclusively supplied by Taiwan’s major NB contract manufacturer Wistron Corporation, which will ship over 500,000 units in the fourth quarter. Dell is also predicted to be the world’s largest supplier of servers within a few seasons.
Dell’s major Taiwanese contract suppliers Compal Electronics Inc. and Wistron are expected to benefit considerably from the firm’s optimism towards Windows 8 products.
Dell’s Latitude 10 tablet has been launched in North America and will be released in other markets gradually in the first quarter of 2013, for which orders have been secured by Wistron into the first quarter of next year.
A representative of Wistron estimates that the firm’s tablet PC shipments will reach 2.5 million units in 2012, and as many as six million units in 2013 as its customer base grows.
Dell has also announced that it will quit the smartphone market. The firm’s consumer sales manager Jeff Clarke [see below on the cover picture of the embedded video] noted that Dell will not tap that market in the near future.
Moreover, Dell has quit the Android tablet market as well. See this report referring to the same person, Jeff Clarke:
Dell Quits Smartphone Business Globally, Drops Android [Forbes, Dec 12, 2012]
Dell is definitely pulling the plug on the smartphone business, globally. A tough decision, leaving a market that is expected to reach $150.3 billion in 2014, according to MarketsandMarkets.
However, Jeff Clarke, the head of Dell’s consumer business, confirmed yesterday at the Dell World conference that there’s no way they’ll jump back into the ring anytime soon. “It needs a lot of investments to really be successful,” Clarke told me.
Earlier this year, the Round Rock, Texas-based computer company stopped selling its mobile devices in the U.S., although some could still be found in China, where Dell hoped to continue. But that’s all over now as well.
Dell’s new Mobile Strategy: Windows tablets!
Now in the 5th year of its “transformation,” Dell’s mobile strategy looks very much like it was before its push in the consumer business and the adoption of Google’s Android system for most of its mobile devices (Streak, Aero, Thunder).
“It’s a content play with Android. Amazon is selling books and Google is making it up with search. So far we couldn’t find a way to build a business on Android,” added Clarke. But I’m sure Samsung would disagree.
So for Dell, it’s back to the future, I mean Microsoft with its latest tablet family, the XPS10, XPS12 and Latitude 10, all running Windows 8 or Windows RT. “It doesn’t mean we’re not looking at Android. You should come and see what’s in our labs.” An offer that I can’t refuse. Let’s set up a time and date!
These things are even clearer from: Dell World Influencer Panel Highlights – December 11, 2012 [DellVlog YouTube channel, Dec 11, 2012]
The Dell wants to be more than your box provider post from The Register summarizes the above [Dec 12, 2012] as:
The executive roundtable was a way to introduce some of the new faces of Dell to customers and partners, with just about everybody but Dell, the man, and [Steve] Felice [Dell co-president and chief commercial officer], who joined Dell in 1999 from third-party tech support firm DecisionOne, and Jeff Clarke, vice chairman and co-president in charge of global operations and end user computing, being the old Dell hands.
Marius Haas, president of the cross-group Enterprise Solutions (gulp!) group, just came aboard this year after a short stint at private equity firm KKR and a long career at rival HP. John Swainson, who runs Dell’s Software Group, is a long-time IBMer who turned CA Technologies around. After the surprise resignation last week of long-time EDS executive Steve Schuckenbrock, who has been at Dell since 2007 and who has run its Services and then its Large Enterprise groups, Suresh Vaswani is the new president of the Services group and was formerly in charge of Dell’s Indian services group; before that, he was the co-CEO at Indian services giant Wipro. The consensus on the street seems to be that Schuckenbrock wants to be a CEO, and it ain’t gonna happen at Dell. (There could be some openings up at HP.)
The opening of Dell World was also a way to toss out some more statistics. Dell says that it has presence at 95 per cent of the Fortune 500, and that more than 10 million small and medium businesses rely on its solutions (gulp!) and services (okay, new rule, when Dell says services, you have to pay the person to your right $5.) Dell also has something on the order of 115,000 partners, with about 650 of them showing up at Dell World to get the inside track.
The execs were also put on the spot to answer questions, and Dell, the man, was asked about what he thought about the future of the PC business, something on the minds of both HP and Dell these days and not something that IBM is worried about much these days. (IBM is more worried about the future of systems and services, and it will have its own issues here, fear not.)
“We spend a lot of time talking about this and working and working on it together,” Dell said, referring to his collaboration with Clarke. “We’re quite optimistic about Windows 8. You’re going to hear over the next few days about a broad set of products. Think about a product like Latitude 10, which is a thin, light tablet that also docks to become a full workstation – totally secure, works with all of the other Windows things that a customer have, runs Microsoft Office, and has a USB port, and so on.
“That’s the kind of product that really excites our customers and helps address some of the challenges that exist. We think the touch experience is incredible. We have this stunning 27-inch, quad HD display with our XPS27 all-in-one. We think we are seeing a real revolution in the PC.”
Clarke was more adamant: “We still believe that the PC is still the preferred device to do work, to drive productivity, to create. I look at the long-term prospects of the PC business and I am very optimistic; 85 per cent of the world’s population has a PC penetration rate of less than 20 per cent. I look at the middle class as it grows over the next 20 years from 1.8 billion people to 4.9 billion people, and I see the opportunity there. I look at the number of small businesses that we sell to today, and the creation of small businesses continues at an unprecedented rate and serving that with PCs is still a huge opportunity for the company.”
So Dell is not the PC company as before. Its Dell Evolves the PC: Combines Leading Design With Security, Manageability and Reliability [Dec 12, 2012] is clear about that:
- New line up of devices featuring Touch functionality combine inspired design with advanced features
- Advanced security and flexible management options that meet the most rigorous demands of enterprise IT departments and consumers alike
- Users benefit from secure and convenient anytime, anywhere access to work and personal content
Dell today detailed its strategy for developing and deploying PCs that enable new user experiences while also meeting enterprise IT demands around security, manageability and reliability. The company recently introduced a completely redesigned portfolio of personal computing devices, services and solutions that let people move easily between work and personal applications. The devices also help enterprise IT departments deliver solutions that enable personal productivity while also protecting sensitive corporate data.
New Client Devices
Dell has recently introduced a completely redesigned platform of new commercial and consumer tablets and PCs that combine a consumer-friendly aesthetic with advanced business client functionality. These new form factors were created to capitalize on the advances in new operating systems such as Windows 8 and make touch computing available to more end-users than ever before.
“As one of the world’s largest and most successful companies, General Electric maintains a diverse set of technology solutions to address the needs of our global workforce,” said John Seral, senior vice president and chief information officer at GE Energy. “This diversity creates security and management challenges for IT, especially when new operating systems and software packages are considered. That’s why GE is excited to work with Dell and use its XPS product line for our enterprise needs. The design is attractive and something our employees are proud to carry around and the security benefits make IT’s lives much easier. Simplification is a major focus at GE and reducing operating system variance from Microsoft Windows is helped by the XPS platform that is sleek and light.”
The new devices recently introduced by Dell include the:
Latitude 10 – Dell’s first business-class tablet that takes advantage of the latest advances in touch-enabled applications and fits easily into current IT environments by supporting existing Microsoft productivity applications and plugging into existing management consoles;
Latitude 6430u – a 14-inch notebook that strikes the balance between aesthetic appeal and corporate needs to be the most manageable and secure Ultrabook thanks to Dell’s unique vPro extension. The Latitude 6430u is backed by extensive world-class service and support;
XPS 10 – a tablet that delivers laptop-like productivity so users can fluidly transition from work projects to their personal pursuits. The XPS 10 is powered by Microsoft Windows RT and dual-core ARM architecture; and,
XPS 12 – a convertible notebook that combines the performance of an Ultrabook with the ease-of-use of a tablet into a single device with a leading edge touch experience. The innovative form allows users to quickly shift from work to play and back.
“There are two key requests we are hearing from customers,” said Sam Burd, vice president and general manager Personal Computer Product Group. “The first is they want to simplify the computing experience for their organization, which means providing fewer or lighter devices to employees. Secondly, and even more important, they still require security and manageability. Dell’s new portfolio of PCs announced this fall and upcoming devices previewed this week at Dell World help them do both.”
Bring Your Own Device
Dell continues to empower businesses to embrace “bring your own device” (BYOD) and is helping companies gain a competitive advantage. As a result, the company has enhanced its offerings to meet both end-user and IT department requirements.
“BYOD is growing in popularity with both businesses and users and is becoming a reality in many environments – both large and small,” said Bob O’Donnell, program vice president, clients and displays, IDC. “This creates a whole new set of challenges for IT which needs to strike balance between end-user preferences, productivity and IT control. Dell’s setting sights on both audiences as evidenced in its current Windows 8 lineup, and the services and solutions along the continuum tailored for IT.”
Solutions and Services
In addition to being designed to satisfy the most demanding user, the new devices from Dell can serve as the foundation to complete, adaptive solutions that allow IT departments to support BYOD. Today, Dell offers end-to-end solutions that combine compelling hardware with state-of-the-art services to help protect critical company data on a variety of platforms and devices including those operating on Windows, iOS and Android, thereby enabling companies to better manage a diverse, heterogeneous device topology.
In order to help companies manage the multitude of devices on their network while keeping them secure from external threats, Dell has introduced a suite of complementary offerings:
Dell Wyse Cloud Client Manager is a recently introduced SaaS offering that integrates mobile device and mobile application management functionality with additional capabilities such as thin and zero client management and the ability to manage end-user access to corporate content and apps from any device. It enables IT departments to securely manage company and user-owned devices alongside end-user access to company applications and content without the burden of ongoing solution installation, updates and maintenance.
Dell Data Protection | Encryption is an intelligent file-based encryption solution that protects data on laptops and desktops, as well as external media, in case of loss or theft. It complies with highest level US government security standards and meets U.S. Federal Information Processing Standard (FIPS) 140-2 certification for data encryption.
Dell KACE this week announced a limited release of its new K3000 Mobile Device Management Appliance that extends systems management capabilities to enforce security policies for both corporate and personal mobile devices running on both iOS and Android operating systems. Integration with the K1000 System Management Appliance provides IT with a powerful, integrated, easy-to-use solution to accurately track, monitor and manage desktops, laptops, servers and mobile devices more efficiently.
Dell Inc. (NASDAQ: DELL) listens to customers and delivers innovative technology and services that give them the power to do more. For more information, visit www.dell.com.
- Intel Atom processor S1200 vs. Calxeda ECX1000 for microservers
- ARM Holdings on the server opportunity
- x86 on ARM with Linux
- Boston Ltd. related information from Calxeda
- Background on Elbrus (in Russian or English if available)
- Background on Elbrus Technologies (in Russian)
Intel Atom processor S1200 vs. Calxeda ECX1000
Intel Delivers the World’s First 6-Watt Server-Class Processor [Intel press release, Dec 11, 2012]
Several Equipment Makers Building Microservers, Storage and Networking Systems Based on 64-bit Intel® Atom™ Processor S1200 Product Family
- Intel® Atom™ processor S1200 server system on-chip hits lower-power levels, and includes key features such as error code correction, 64-bit support, and virtualization technologies required for use inside data centers.
- More than 20 low-power designs including microservers, storage and networking systems use the Intel Atom processor S1200 family.
Intel Corporation introduced the Intel® Atom™ processor S1200 product family today, delivering the world’s first low-power, 64-bit server-class system-on-chip (SoC) for high-density microservers, as well as a new class of energy-efficient storage and networking systems. The energy-sipping, industrial-strength microprocessor features essential capabilities to achieve server-class reliability, manageability and cost effectiveness.
“The data center continues to evolve into unique segments and Intel continues to be a leader in these transitions,” said Diane Bryant, vice president and general manager of the Datacenter and Connected Systems Group at Intel. “We recognized several years ago the need for a new breed of high-density, energy-efficient servers and other datacenter equipment. Today, we are delivering the industry’s only 6-watt SoC that has key datacenter features, continuing our commitment to help lead these segments.”
Intel’s Next Generation of Microservers: The Real Thing
As public clouds continue to grow, the opportunities to transform companies providing dedicated hosting, content delivery or front-end Web servers are also growing. High-density servers based on low-power processors are able to deliver the desired performance while at the same time significantly reducing energy consumption – one of the biggest cost drivers in the data center. However, before deploying new equipment in data centers, companies look for several critical features.
The Intel Atom processor S1200 product family is the first low-power SoC delivering required data center features that ensure server-class levels of reliability and manageability while also enabling significant savings in overall costs. The SoC includes two physical cores and a total of four threads enabled with Intel® Hyper-Threading Technology (Intel® HT). The SoC also includes 64-bit support, a memory controller supporting up to 8GB of DDR3 memory, Intel® Virtualization Technologies (Intel® VT), eight lanes of PCI Express 2.0, Error-Correcting Code (ECC) support for higher reliability, and other I/O interfaces integrated from Intel chipsets. The new product family will consist of three processors with frequency ranging from 1.6GHz to 2.0GHz.
The Intel Atom S1200 product family is also compatible with the x86 software that is commonly used in data centers today. This enables easy integration of the new low-powered equipment and avoids additional investments in porting and maintaining new software stacks.
New Milestones in Power Efficiency
Intel continues to drive power consumption down in its products, enabling systems to be as energy efficient as possible. Each year since the 2006 introduction of low-power Intel® Xeon® processors, Intel has delivered a new generation of low-power processors that have decreased the thermal design power (TDP) from 40 watts in 2006 to 17 watts this year due to Intel’s advanced 22-nanometer (nm) process technology. The Intel Atom processor S1200 product family is the first low-power SoC with server-class features offering as low as 6 watts of TDP.
Broad Industry Support
Today, more than 20 low-power designs including microservers, storage and networking systems use the Intel Atom processor S1200 family from companies including Accusys, CETC, Dell, HP, Huawei, Inspur, Microsan, Qsan, Quanta, Supermicro and Wiwynn.
“Organizations supporting hyperscale workloads need powerful servers to maximize efficiency and realize radical space, cost and energy savings,” said Paul Santeler, vice president and general manager, Hyperscale Business Unit, Industry-standard Servers and Software at HP. “HP servers power many of those organizations, and the Intel Atom processor S1200 will be instrumental as we develop the next wave of application-defined computing to dramatically reduce cost and energy use for our customers.”
An Even Brighter Future
Intel is working on the next generation of Intel Atom processors for extreme energy efficiency codenamed “Avoton.” Available in 2013, Avoton will further extend Intel’s SoC capabilities and use the company’s leading 3-D Tri-gate 22 nm transistors, delivering world-class power consumption and performance levels.
For customers interested in low-voltage Intel® Xeon® processor models for low-power servers, storage and networking, Intel will introduce the new Intel Xeon processor E3 v3 product family based on the “Haswell” microarchitecture next year. These new processors will take advantage of new energy-saving features in Haswell and provide balanced performance-per-watt, giving customers even more options.
Pricing and Availability
The Intel Atom processor S1200 is shipping today to customers with recommended customer price starting at $54 in quantities of 1,000 units.
More information on the announcement, including Diane Bryant’s presentation, additional documents and pictures, is available at http://newsroom.intel.com/docs/DOC-3172.
Fact Sheets & Backgrounders
Intel® Atom™ Processor S1200 for Microserver: Datasheet, Vol. 1 [Intel, Dec 2012]
Comparing Calxeda ECX1000 to Intel’s new S1200 Centerton chip [‘ARM Servers Now’ blog from Calxeda, Dec 11, 2012]
Based on what Intel disclosed today, here’s a snapshot of Calxeda EnergyCore 1000 vs. Intel’s new S1200 chip:
[Comparison table not preserved; the only surviving cell reads “2 x .5 MB”.]
So, while the Centerton announcement indicates that Intel takes “microservers” seriously after all, it falls short of the ARM competition. It DOES have 64-bits and Intel ISA compatibility, however. Most workloads targeting ARM are interpreted code (PHP, LAMP, Java, etc), so this is not as big a deal as some would have you believe! Intel did not specify the additional chips required to deliver a real “Server Class” solution like Calxeda’s, but our analysis indicates this could add 10 additional watts PLUS the cost. That would imply the real comparison between ECX and S1200 is ~3.8 vs ~16 watts. So roughly 3-4 times more power for Intel’s new S1200, again, comparing 2 cores to 4. Internal Calxeda benchmarks indicate that Calxeda’s four cores and larger cache deliver 50% more performance compared to the 2 hyper-threaded Atom cores. This translates to a Calxeda advantage of 4.5 to 6 times better performance per watt, depending on the nature of the application.
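The performance-per-watt claim above can be retraced with a few lines of arithmetic. The sketch below is not Calxeda's own calculation: it assumes the ~16 W figure is the 6 W SoC plus the ~10 W of extra chips the post estimates, and that the 4.5x to 6x range comes from multiplying the 1.5x performance figure by the rounded 3x to 4x power ratio. The function and constant names are illustrative only.

```python
def perf_per_watt_advantage(perf_ratio, watts_a, watts_b):
    """Performance per watt of A relative to B.

    perf_ratio is A's performance as a multiple of B's.
    """
    return perf_ratio * (watts_b / watts_a)


ECX_WATTS = 3.8             # Calxeda ECX-1000, per the post
S1200_WATTS = 6.0 + 10.0    # ASSUMPTION: SoC TDP plus the post's ~10 W estimate
PERF_RATIO = 1.5            # "50% more performance", internal Calxeda benchmarks

power_ratio = S1200_WATTS / ECX_WATTS
# The post rounds the power ratio to "roughly 3-4 times".
low_estimate = PERF_RATIO * 3
high_estimate = PERF_RATIO * 4
unrounded = perf_per_watt_advantage(PERF_RATIO, ECX_WATTS, S1200_WATTS)

print(f"power ratio: {power_ratio:.1f}x")
print(f"perf/watt advantage: {low_estimate}x to {high_estimate}x "
      f"(unrounded: {unrounded:.1f}x)")
```

Using the unrounded 16 / 3.8 ratio gives a figure slightly above the top of the quoted range, which suggests the post's range is based on the rounded power ratio.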
What is a “Server-Class” SOC? [‘ARM Servers Now’ blog from Calxeda, Dec 12, 2012]
As reported in various outlets yesterday, Intel has released their S1200 line of Atom SOC’s targeting the microserver market with the tagline: “Intel Delivers the World’s First 6-Watt Server-Class Processor”. The first notable point here is that they had to use 6 Watts, because 5 was already taken. The second notable point is their definition of “Server-Class”. Looking at the list of features on the Atom S1200, there are key “Server-Class” features missing:
- Networking: Intel’s SOC requires you to add hardware for networking
- Storage: Once again, there is no SATA connectivity included on the Intel SOC, so you must add hardware for that
- Management: Even microservers need remote manageability features, so again with Intel you need to tack that on to the power and price budgets.
Unless you add additional hardware on top of it, Intel’s SOC allows you to boot and not much else. Let’s also consider the fact that you’ve got a total of 8 lanes of PCI Express Gen 2 on each SOC. If you’d like to add the Server-Class items listed above, choose wisely, because those 8 lanes will go fast. Add all of that hardware, plus memory, and 6 W is simply not possible. And of course these additional components add cost and take space as well.
Let’s expand that thought to an actual Atom S1200 powered system, like the Quanta S900-X31A. Each node includes a Marvell 88SE9130 SATA controller at a TDP of 1W, an Intel i350 1GbE controller at 2.8W TDP, an AST2300M estimated at a conservative 1W, and an SODIMM at roughly 1.2W (using the same number we at Calxeda have used). That adds at least 6 more watts per node, almost doubling the 6.1W TDP of the processor. Multiply that across 48 nodes and you just tacked on 288W to each chassis. In a 42U rack full of them, you just added 4kW to each rack! By no means is that a limitation or shortcoming of the Quanta design, which is actually quite good, but rather an indication of the excess baggage that all vendors will need to deal with in putting together an S1200 powered system.
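The per-node, per-chassis and per-rack overheads quoted above follow from a simple sum and two multiplications, sketched below. The component wattages and the 48-node and 42U figures are from the post; the 3U chassis height is an assumption (the post does not state it) chosen because it makes the "4kW per rack" figure work out.

```python
# Per-node support components and TDPs as listed in the Calxeda post.
SUPPORT_CHIPS_W = {
    "Marvell 88SE9130 SATA controller": 1.0,
    "Intel i350 Ethernet controller": 2.8,
    "AST2300M (conservative estimate)": 1.0,
    "SODIMM": 1.2,
}
SOC_TDP_W = 6.1          # S1200 TDP figure used in the post
NODES_PER_CHASSIS = 48   # from the post
RACK_UNITS = 42          # from the post
CHASSIS_UNITS = 3        # ASSUMPTION: chassis height is not stated in the post


def node_overhead_w(chips=SUPPORT_CHIPS_W):
    """Extra watts per node from the support chips, rounded to 0.1 W."""
    return round(sum(chips.values()), 1)


def chassis_overhead_w(per_node_w, nodes=NODES_PER_CHASSIS):
    """Extra watts per chassis."""
    return per_node_w * nodes


def rack_overhead_w(per_chassis_w, rack_u=RACK_UNITS, chassis_u=CHASSIS_UNITS):
    """Extra watts per rack, filling the rack with whole chassis."""
    return per_chassis_w * (rack_u // chassis_u)


per_node = node_overhead_w()
per_chassis = chassis_overhead_w(per_node)
per_rack = rack_overhead_w(per_chassis)

print(f"{per_node} W/node, {per_chassis} W/chassis, {per_rack / 1000:.1f} kW/rack")
print(f"node total: {per_node + SOC_TDP_W:.1f} W vs. SoC alone: {SOC_TDP_W} W")
```

Changing `CHASSIS_UNITS` shows how sensitive the rack-level figure is to the density of the enclosure.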
The currently shipping Calxeda ECX-1000 Server-Class SOC [in the Ultimate Data X1 (UDX1) system from Penguin Computing, the Viridis from Boston, and the SystemFabricCore from System Fabric Works] ships with SATA, Ethernet fabric links, IPMI-based management, and 8 lanes of PCI Express Gen 2, standard at 3.8W (5W including 4GB DDR3). It’s also worth pointing out that Calxeda’s integrated fabric switch provides more than just the Ethernet ports missing on the Atom S1200. Applied at the system and rack level, it can dramatically reduce Top of Rack Switch ports and cabling complexity, while increasing internode bandwidth by 10-fold. You can have all of that in a 5W server. Not 5W + additional components. Why not take that 12W budget you need for each S1200 node and get two Calxeda nodes with all of the Server-Class features included?
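The closing question is a budget division, sketched here using only figures from the two Calxeda posts: a rounded 12 W per S1200 node, 5 W per ECX-1000 node including memory, and the 2-core versus 4-core comparison. The names are illustrative, and core counts say nothing by themselves about delivered performance.

```python
import math

S1200_NODE_BUDGET_W = 12.0   # the post's rounded per-node figure
S1200_CORES = 2              # two hyper-threaded cores per S1200
ECX_NODE_W = 5.0             # ECX-1000 node including 4GB DDR3, per the post
ECX_CORES_PER_NODE = 4       # per the post's "2 cores to 4" comparison


def nodes_within_budget(budget_w, node_w):
    """Whole nodes that fit inside a given power budget."""
    return math.floor(budget_w / node_w)


ecx_nodes = nodes_within_budget(S1200_NODE_BUDGET_W, ECX_NODE_W)
spare_w = S1200_NODE_BUDGET_W - ecx_nodes * ECX_NODE_W
ecx_cores = ecx_nodes * ECX_CORES_PER_NODE

print(f"{ecx_nodes} ECX-1000 nodes ({ecx_cores} cores) fit in "
      f"{S1200_NODE_BUDGET_W} W with {spare_w} W to spare, "
      f"vs. one S1200 node ({S1200_CORES} cores)")
```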
In the end, Intel may simply be claiming 64-bit as the main benchmark for Server-Class. When matching microservers to the appropriate workloads, we’ve found that there is surely a place for 32-bit in the datacenter. We’ll be providing a blog post on that very topic in the near future.
Penguin Computing’s New High Density System Ultimate Data X1 Brings ARM’s Low-power Footprint To The Data Center [Penguin Computing press release, Oct 17, 2012]
Penguin Computing today announced the immediate availability of its Ultimate Data X1 (UDX1) system. The UDX1 is the first server platform offered by a North American system vendor that is built on the ARM®-based EnergyCore System on Chip (SoC) from Calxeda.
The UDX1 brings new levels of efficiency and scale to internet datacenters. With a five Watt power envelope per server the UDX1 is ideal for I/O bound workloads including “Big Data” applications, scalable analytics and cloud storage. The UDX1 offers a drastic reduction of TCO for high-density, low power computing environments. Workloads that have been processed by racks of conventional systems can now be handled by a group of servers in a single physical unit. The UDX1 features a modular architecture that can be configured with up to 48 Calxeda EnergyCore server nodes, with four cores per node. The system includes an internal 10 Gigabit Ethernet switch fabric for node-to-node connectivity and provides up to 144TB of hard drive capacity.
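For a sense of scale, the chassis-level totals implied by the release can be worked out as below. This is a sketch from the quoted maximums only: the power figure covers just the server nodes at the stated five-watt envelope (drives, fans and power supplies are not included), and the per-node storage share assumes, hypothetically, that the 144TB is divided evenly across nodes.

```python
MAX_NODES = 48          # "up to 48 Calxeda EnergyCore server nodes"
CORES_PER_NODE = 4      # "four cores per node"
WATTS_PER_NODE = 5      # "five Watt power envelope per server"
MAX_STORAGE_TB = 144    # "up to 144TB of hard drive capacity"


def chassis_totals(nodes=MAX_NODES):
    """Cores, node power and an even storage share for a populated chassis."""
    return {
        "cores": nodes * CORES_PER_NODE,
        "node_power_w": nodes * WATTS_PER_NODE,  # server nodes only
        # ASSUMPTION: storage split evenly across nodes.
        "storage_tb_per_node": MAX_STORAGE_TB / nodes,
    }


totals = chassis_totals()
print(totals)
```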
“Power and cooling are the biggest facility challenges for most data centers; on the other hand, typical cloud computing, web 2.0 and ‘Big Data’ applications are based on scale out architectures,” said Charles Wuischpard, CEO of Penguin Computing. “A new generation of power efficient high density servers is required to run these workloads efficiently. With the incredibly low power envelope and the extremely high density Calxeda’s EnergyCore SoCs offer, the UDX1 is the ideal platform for running these types of workloads.”
“Penguin is an innovator in Linux based solutions for internet datacenters and high performance computing. We are thrilled that their next generation of innovative products includes Calxeda,” said Barry Evans, CEO Calxeda. “We are realizing this new era in breakthrough low power computing that will lift the constraints on datacenter performance and efficiency. Penguin is helping chart this course with an ideal solution to span from scale-out cloud storage to analytics.”
Penguin Computing will be showing a live demo of Hadoop running on the UDX1 at the upcoming Strata Conference + HadoopWorld on October 24-25 in New York.
For more information, please visit www.penguincomputing.com.
About Penguin Computing
For well over a decade Penguin Computing has been dedicated to delivering complete, integrated Enterprise and High Performance Computing (HPC) solutions that are innovative, cost effective, and easy to use. Penguin offers a complete end-to-end portfolio of products and solutions including workstations, rack-mount servers, custom server designs, power efficient rack solutions and turn-key clusters. Penguin also offers the Scyld suite of software products for efficient provisioning and infrastructure monitoring. For users who want to use supercomputing capabilities on-demand and pay as they go, Penguin provides Penguin Computing on Demand (POD), a public HPC cloud that is available instantly and as needed.
Penguin counts some of the world’s most demanding organizations as its customers, including Yelp, Caterpillar, Life Technologies, Dolby, Lockheed Martin and the US Air Force. Penguin Computing is a registered trademark of Penguin Computing, Inc. Penguin Computing on Demand is a pending trademark in the US. All other trademarks are property of their respective owners. Other product or company names mentioned may be trademarks or trade names of their respective companies.
THE WORLD’S FIRST HYPERSCALE SERVER
Hyperscale Computing represents an inflexion point in the industry that will disrupt the very concept of a server in future systems. Modern servers have come a long way, but they are nonetheless fundamentally based around designs originally created decades ago.
The Boston Viridis is a self contained, highly extensible, 48 node ultra-low power ARM® cluster with integral high-speed interconnect and storage within a standard single 2U rack mount enclosure.
Racks of individually connected, high-power, low density servers and blades are installed in modern data centres thousands at a time. Each of these server systems requires its own networking infrastructure, high power distribution, HVAC, and maintenance engineers to take care of it when things go wrong. These inefficiencies could cost data centres billions.
The Boston Viridis uses Server-on-Chip (SoC) technology to integrate the CPU (powered by ARM®), networking and IO onto the server chip. SoC technology, which began life as an embedded systems technology but is primed to storm the data centre in the next few years, allows for mass levels of integration at high density requiring little active cooling. With this technology we can now configure over a thousand servers in a standard 42U rack.
The Boston Viridis uses the ARM®-based Calxeda EnergyCore® to create a rack mountable 2U server cluster. The solution comprises 192 processing cores, leading the way towards energy efficient hyperscale computing.
Each 2U chassis contains a total of 12 Calxeda EnergyCards connected to a common mainboard sharing power and fabric connectivity. The Calxeda EnergyCard is a single PCB module containing 4 Calxeda EnergyCore SoCs; each with 4GB DDR-3 Registered ECC Memory, 4 x SATA connectors and management interfaces.
Ethernet switching is handled internally by 80Gb bandwidth on the EnergyCore fabric switch, thereby negating the need for additional switches that consume unnecessary power and add unwanted latency.
Astonishingly, utilising all 48 Calxeda EnergyCore SoCs, the whole package including fabric and management consumes less than 300W – this is achieved as each SoC device consumes just 0.5 to 5 watts of power (depending on load).
With specific applications, the overall combined performance of one 2U Boston Viridis appliance can outperform a whole rack of standard x86 servers, yet at the same time consume 1/10th the power and occupy 1/10th the space making it an excellent investment for datacentres and enterprises alike.
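The density and power figures quoted above can be cross-checked with a few lines of arithmetic. The following is only a back-of-the-envelope sketch: the constant names are made up for illustration, and it assumes a 42U rack filled entirely with 2U chassis, with no rack units set aside for switches or power distribution.

```python
# Back-of-the-envelope check of the Viridis figures quoted in the text.
# Assumption: the whole 42U rack is filled with 2U chassis.
CARDS_PER_CHASSIS = 12                     # EnergyCards per 2U chassis
SOCS_PER_CARD = 4                          # EnergyCore SoCs per EnergyCard
CORES_PER_SOC = 4                          # each SoC is quad-core
SOC_WATTS_MIN, SOC_WATTS_MAX = 0.5, 5.0    # per-SoC draw, depending on load
RACK_UNITS = 42
CHASSIS_UNITS = 2

nodes_per_chassis = CARDS_PER_CHASSIS * SOCS_PER_CARD
cores_per_chassis = nodes_per_chassis * CORES_PER_SOC
chassis_per_rack = RACK_UNITS // CHASSIS_UNITS
nodes_per_rack = chassis_per_rack * nodes_per_chassis

# Power drawn by the SoCs alone; fabric and management come on top of this.
soc_power_range = (nodes_per_chassis * SOC_WATTS_MIN,
                   nodes_per_chassis * SOC_WATTS_MAX)

print("nodes per chassis:", nodes_per_chassis)
print("cores per chassis:", cores_per_chassis)
print("nodes per 42U rack:", nodes_per_rack)
print("SoC power per chassis (W):", soc_power_range)
```

Under these assumptions the SoCs alone stay within the quoted 300W chassis figure even at full load, and a fully populated rack lands just above the "over a thousand servers" mark.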
The First, and Next Step in Hyper-Efficient Computing
The SystemFabriCore is an Ultra Dense, Ultra Low Power Computing Platform based on a revolutionary new approach to highly parallel, densely packaged, tightly integrated systems utilizing the Calxeda EnergyCore™ SOC (System on a Chip) which delivers computing, fabric, network, storage I/O and management, all in one 3.8 watt SOC as opposed to a traditional x86 motherboard based architecture using 100s of watts.
The SystemsFabriCore is a self contained, highly extensible, multi-node cluster with integral high-speed interconnect and storage within a standard 2U rackmount enclosure.
- Available up to 48 SoC components delivered on 12 Calxeda EnergyCard platforms
- Each Calxeda EnergyCore™ SoC contains a quad-core processing unit, providing a total of 192 cores per 2U enclosure
- 24 x 2.5” SATA HDDs or SSD devices
- 4 x 4GB miniDIMM modules per EnergyCard, providing a total of 192GB of RAM per 2U enclosure
- Rear I/O supporting 4 x SFP+ cages for external fabric connectivity (1GbE or 10GbE depending on configuration and number of EnergyCards) and 1 x serial port for management.
- Easily scalable to thousands of nodes
- Calxeda EnergyCore™ SoC Redefines Big Data Efficiency
- Each EnergyCore™ contains an ARM® Cortex™-A9 Quad-core CPU
- Up to 10X the performance in the same power and space
- Cuts energy use and space by up to 90%
- Industry leading low power consumption 3.8 watts per SoC
- Up to 24 SATA HDDs or SSD per 2U
- Up to 192GB of RAM per 2U enclosure
- Total of 192 cores per 2U
ARM Holdings on the server opportunity
ARM in Servers: Taming Big Data with Calxeda @ ISC’12 [ARM Holdings’ Smart Connected Devices blog, June 18, 2012]
I spent a number of years working in High Performance Computing (HPC) and found it to be one of the most innovative communities I’ve had the pleasure to work with. That’s why I’m certain they’re going to be excited to see and hear what Calxeda, an ARM® Connected Community® partner, has to offer at ISC’12 this week! Spoiler alert: they’ll be sharing some new performance and Total Cost of Ownership (TCO) data that shows just how compelling a right-sized solution can be for the target workloads. And what do I mean by ‘right-sized’ solution? More on that in a moment…
First, I’d like to offer kudos to the HPC community for tackling some of the largest and most complex problems known. Unsung heroes in so many aspects of our everyday life – for example, have you ever wondered how cars continue to get safer and more efficient each year? (Hint: they use lots of computers to model and simulate scenarios to improve safety and efficiency.) Similar techniques are used to uncover new medicines, forecast weather, identify new energy sources and predict future environmental impacts to name just a few. Then there’s ‘Big Data’ which applies HPC-like techniques to mine the ever-increasing sources and quantities of unstructured data (search queries, social media, financial transactions, crime reports, live traffic, smart meters etc…) for seemingly unrelated but extremely interesting (read: valuable) patterns and insight.
To tackle a large project, you typically break it down into smaller manageable chunks. In the case of HPC and Big Data, that means decomposing and distributing data across many servers (think hundreds and in some cases thousands or even tens of thousands), then collecting and consolidating the results into an overall ‘solution.’ Today, this is typically performed using a technique such as MapReduce enabled by software from companies like Cloudera, Datastax, MapR and Pervasive running on a cluster of general-purpose servers connected via high-performance networks. Often the compute requirements are somewhat modest relative to the enormity of the data, meaning unimpeded data movement is fundamental to overall efficiency.
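The decompose, distribute and consolidate pattern described above can be sketched in a few lines. This is a single-process toy, not how Hadoop or any of the named vendors implement it: the function names are invented for illustration, and the "workers" are simply list slices processed one after another.

```python
from collections import Counter
from functools import reduce


def split_into_chunks(records, n_workers):
    """Deal records round-robin into n_workers chunks (stand-in for spreading data across servers)."""
    return [records[i::n_workers] for i in range(n_workers)]


def map_chunk(chunk):
    """Map step: a worker counts words in its own chunk only."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Reduce step: merge two partial results into one."""
    return left + right


def word_count(records, n_workers=4):
    chunks = split_into_chunks(records, n_workers)
    # On a real cluster each chunk would be mapped on a different server, in parallel.
    partials = [map_chunk(chunk) for chunk in chunks]
    return reduce(merge_counts, partials, Counter())


if __name__ == "__main__":
    print(word_count(["big data big", "data moves", "big clusters"], n_workers=2))
```

Note how little computation each map step does relative to the data it touches, which is why moving the data around efficiently matters so much for this class of workload.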
With that as a backdrop, think for a moment – “how would you architect highly efficient servers for this purpose if you had a clean slate?” ARM’s business model enables innovative companies the freedom and choice to do just that, resulting in highly efficient and targeted solutions.
As stated before, one size no longer fits all.
Achieving a step function in efficiency often requires new thinking. In the case of data intensive computing, re-balancing or ‘right-sizing’ the solution to eliminate bottlenecks can significantly improve overall efficiency. That’s exactly what Calxeda has done with its EnergyCore™ ECX-1000 series processor. By combining a quad-core ARM® Cortex™-A series processor with topology agnostic integrated fabric interconnect (providing up to 50Gbits of bandwidth at latencies less than 200ns per hop), they can eliminate network bottlenecks and increase scalability. EnergyCore also includes all the traditional server I/O, memory and management interfaces you would expect. This ‘just add memory’ server on a chip approach means servers can literally be credit card sized and operate at a power-sipping 5W of total power. That means huge density increases are also possible.
With all this innovation, it’s easy to get caught up in the hardware, but we also need to recognize software plays an important role here. While the ecosystem is coming together quite nicely with Canonical’s Ubuntu Server 12.04LTS release and various open source libraries already available, there’s still much work ahead. As of today, the fundamental pieces are in place to begin doing useful work and key software partners are already engaged with Calxeda on early access hardware. Forthcoming availability of ARM processor-based server systems from HP and other OEMs will accelerate the next phase of software ecosystem developments.
If you’re at ISC’12 this week and want to know more, be sure to visit Calxeda at booth #410, and check out Karl Freund’s speaking session on the show floor Tuesday, June 19th at 4:15pm. If you’re not at ISC’12 we’ll also be at SC’12 in November (booth #122.) But trust me you don’t want to be left waiting until then! There are plenty of other opportunities throughout June (including GigaOM Structure 2012 in San Francisco.) And we’ll be announcing more opportunities to meet the Calxeda and ARM teams in the near future so be sure to watch this space!
Jeff Underhill, Server Segment Marketing Manager, ARM, is based in Silicon Valley. After spending 10+ years working in the traditional server market Jeff saw an opportunity to revisit server design and redefine an industry. ARM’s business model enables innovative companies the freedom and choice to ask themselves “how would I architect highly efficient servers if I had a clean slate?” Consequently, he is helping drive ARM’s server program with a view to redefining the boundaries of traditional servers as opposed to simply replacing incumbent platforms.
ARM Cortex-A50: Broadening Applicability of ARM Technology in Servers [ARM Holdings’ Smart Connected Devices blog, Oct 31, 2012]
I have been running the ARM® server initiative for a little over four years. At kickoff, there were few that believed that ARM technology would find its way into server applications. Fast forward to today, more of the strands of the strategy are now in the public domain.
- 32-bit ARM powered platforms, from companies that include Boston, Dell, HP, Mitac and Penguin Computing (based on either Marvell’s or Calxeda’s EnergyCore system-on-chip devices) are starting to ship into the market. Customers can start to evaluate the performance of their workloads on ARM based servers hosted in the cloud.
- The initial pieces of the software ecosystem are starting to appear including performance optimized Java compilers/java virtual machines, commercial grade Linux distributions and application stacks.
For companies developing businesses based on web infrastructure, the server IS the business. These companies have honed their software and hardware strategies to enable quick adoption of technologies that drive down system acquisition costs or running costs. Increased use of open source software on a Linux platform reduces the legacy ties to incumbent server platforms and paves the way for more innovation. Companies are now making decisions on system technologies based on metrics like performance (on the user application) / watt / $ or performance / watt / ft³ as opposed to the pure performance.
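To make the performance / watt / $ metric concrete, here is a small sketch that ranks two server options by it. Every number below is invented purely for illustration and does not describe any real product; the class and field names are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ServerOption:
    name: str
    requests_per_sec: float   # performance measured on the user application
    watts: float              # power draw under that load
    price_usd: float          # acquisition cost

    def perf_per_watt(self) -> float:
        return self.requests_per_sec / self.watts

    def perf_per_watt_per_dollar(self) -> float:
        return self.perf_per_watt() / self.price_usd


# Hypothetical figures, chosen only to show how the ranking can flip.
conventional = ServerOption("conventional server", 10_000.0, 400.0, 5_000.0)
low_power = ServerOption("low-power node", 1_000.0, 5.0, 500.0)

for option in (conventional, low_power):
    print(option.name, option.perf_per_watt(), option.perf_per_watt_per_dollar())
```

With these made-up inputs the conventional server wins on pure performance, while the low-power node wins on performance / watt / $, which is the shift in decision criteria the paragraph describes.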
ARM has consistently indicated that a relatively small set of server applications could take advantage of a 32-bit ARM processor and that the availability of 64-bit ARM devices would significantly broaden the applicability. In the cloud infrastructure space, the main benefit that the 64-bit execution state brings is access to a larger memory address space. 2014 will be the year when we see 64-bit ARM powered server SoCs appearing in the market. Now surely those will all be based on the ARM Cortex™-A57, right? Well, what we have learnt in the server journey is that one size does not fit all. Some server workloads do benefit from a high single thread performance. However, as Brian Jeff notes in his blog [see big.LITTLE in 64-bit, also copied here just below], for applications that have modest compute requirements, the Cortex-A53 processor will deliver the best throughput performance inside a specific power envelope.
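The address-space point is easy to quantify. The sketch below only computes what a flat pointer of a given width can address; actual processors typically implement fewer address bits than the pointer width, so treat the 64-bit figure as an upper bound rather than a description of any specific chip.

```python
GIB = 2 ** 30   # bytes in one gibibyte
EIB = 2 ** 60   # bytes in one exbibyte


def addressable_bytes(pointer_bits: int) -> int:
    """Number of distinct byte addresses a flat pointer of this width can name."""
    return 2 ** pointer_bits


print("32-bit:", addressable_bytes(32) // GIB, "GiB")
print("64-bit:", addressable_bytes(64) // EIB, "EiB")
```

A flat 32-bit pointer tops out at 4 GiB, which is why memory-hungry cloud workloads are the ones waiting on 64-bit parts.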
We think our cores are a great base for server devices. But as important is the ARM business model which enables our silicon licensees to tightly couple peripherals, memory and processing engines of the same piece of silicon. The selection of this mix of functionality that balances the compute, networking and storage elements for the specific server application is key to driving advantages in the metrics discussed above.
But a chip is useless without software. Earlier this year, ARM released a 64-bit Linux distribution and tools into the open source community. The primary focus of my team is to ensure the multiple commercial grade Linux distributions pick up this technology, augmented with virtualization and application stacks, all in time to intersect silicon availability. Fortunately, we have an early pioneer in the ARMv8 space. At ARM TechCon™ 2011, Applied Micro announced their intent to develop a 64-bit ARM powered server device. ARM demands compatibility between companies that develop their own ARM processors (achieved through an architecture license) and cores that ARM licenses. Software companies are already developing software for use on ARMv8 processors using an FPGA version of Applied Micro’s X-Gene device. This will be superseded with real silicon, set to appear in the early part of 2013. You can expect to see more announcements about the progress regarding 64-bit server software in the coming quarters.
Some observers remain skeptical as to ARM’s likelihood of success here. My team is immersed daily in this engagement so it is fair to say we are somewhat passionate and evangelical about our chances. What I think we can agree on is that the announcement of the Cortex-A50 series removes a technical barrier that many have argued prevents ARM’s access into the server domain. The list of lead partners of these cores, such as AMD and Calxeda, augmented with the three publicly announced ARMv8 architecture licensees (Applied Micro, Cavium and NVIDIA) is an early indicator that choice is coming to the server domain. One size does not fit all. The winners will be those that best deliver relevant, compelling functionality alongside the processor core. A space long devoid of innovation is about to undergo some significant disruption!
Ian Ferguson, Director of Server Systems and Ecosystem, ARM, has spent years fighting from the corner of the underdog. Most of those scars are healing nicely. Ian is particularly passionate about taking ARM technology into new types of applications that do not exist or are at the very formative stages. Consequently, he is driving ARM’s server program with a view to reinvent the way the server function is implemented in networks as opposed to simply replacing incumbent platforms.
big.LITTLE in 64-bit [ARM Holdings’ SoC Design blog, Nov 1, 2012]
With the ARM® Cortex™-A50 series processors, ARM has introduced a “big” and “LITTLE” processor pair that is 64-bit capable. So with this 2nd generation of big.LITTLE platform, what does this mean for big.LITTLE software, which is currently being readied for deployment on ARMv7 32-bit processors? How will big.LITTLE processing technology be used in applications outside mobile like low-power servers, where 64-bit processing is a growing requirement?
Preparing for 64b Operating Systems
To start with, I should highlight that big.LITTLE software operates at the level of the operating system, in kernel space. To be clear, this means it is completely transparent to all apps and middleware. In both the major modes of operation (CPU migration and big.LITTLE MP) (discussed in more detail elsewhere) the software consists of a relatively small patch set to the OS kernel. Today, these patches are written in ARMv7 code, available in the open source or from Linaro. The Cortex-A50 series processors support the AArch32 execution state which is 100% backward compatible with ARMv7, so a Cortex-A50 series big.LITTLE processor can run existing 32-bit kernels without any major changes, including kernels that have been patched to support big.LITTLE. There will be some changes in cache maintenance routines, but effectively the big.LITTLE software is the same.
This is important as we are continuously improving the ARMv7 big.LITTLE code base. The first generation of devices based on big.LITTLE processors is expected in the market in 2013.
ARMv8 allows 64-bit and 32-bit operation. AArch64 is the architecture that describes 64-bit mode of operation and AArch32 describes the 32-bit mode of operation. AArch64 also delivers other architectural benefits like enhanced SIMD, larger register files, enhanced cache management, tagged pointers, and more flexible addressing modes. For a big.LITTLE processor to deliver the architectural benefits of AArch64, it must run a 64-bit OS built on AArch64.
ARM 64-bit Linux has already been up-streamed, and ARM has demonstrated Android 32-bit code running (unmodified) on top of the 64-bit Linux kernel. The next step in providing big.LITTLE support in the 64-bit kernel is to modify the big.LITTLE MP and CPU migration patch sets to work cleanly in the AArch64 environment. Fortunately the code is not strongly impacted by register width, and therefore the vast majority should port cleanly and with little effort from ARMv7 to 64 bit; we plan to do this work at ARM and release 64-bit capable patch sets in mid-2013. This lines up well with expected Cortex-A50 based SoCs sampling at the end of 2013 and deployed in products in 2014.
Although we don’t expect 64-bit mobile OS’s to become prevalent that early, the AArch32 mode of the Cortex-A50 series processors will handle the ARMv7 32b OS, and will be ready for the transition to 64-bit when it does occur.
big.LITTLE in the Enterprise?
Originally conceived as an energy savings technique for mobile phones, big.LITTLE can be viewed as an interesting disruptive technology for applications like ARM processor based low-power servers. For servers and networking applications which are generally memory bound, having a large number of efficient processors that are tuned to workload makes a lot of sense. Often this workload lends itself to having multiple cores at different performance levels, but which are software identical.
As performance scales to higher core counts and the system power budgets reduce, the amount of power budget left for the CPU even in enterprise is very similar to that of mobile. Consider a fanless 20-25 W chip that has 16 CPUs, IO devices, a large L3 cache and other accelerators on board. Once you strip out the budgets for the non-CPU portions and split the remainder amongst the 16 CPUs, the budget is very similar to a mobile phone power budget. big.LITTLE allows system designers to have their cake and eat it by delivering enterprise performance using mobile-pedigree processors and a resultant low-cost, fanless device.
The other aspect of big.LITTLE technology that is attractive is the ability to more efficiently support a dynamically varying level of required performance. Infrastructure equipment is typically designed for the peak operating capacity, for example, to support the call volume on Mother’s Day or the mobile internet traffic during the Super Bowl. On most days the traffic is at most half of the peak traffic. An architecture that includes a mix of big and LITTLE cores in the same system, or even on the same die, can be dynamically adapted to the performance needs of the network more efficiently. This leads to better overall power consumption and reduced TCO.
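The per-CPU budget argument can be worked through numerically. The article does not say how much of the chip budget goes to IO, L3 cache and accelerators, so the 50% non-CPU share below is purely an assumption for illustration, as are the function and constant names.

```python
def per_cpu_budget_watts(chip_watts: float, n_cpus: int, non_cpu_fraction: float) -> float:
    """Power left for each CPU after the non-CPU blocks take their share."""
    cpu_pool = chip_watts * (1.0 - non_cpu_fraction)
    return cpu_pool / n_cpus


N_CPUS = 16
NON_CPU_FRACTION = 0.5   # assumed share for IO, L3 cache and accelerators

low = per_cpu_budget_watts(20.0, N_CPUS, NON_CPU_FRACTION)
high = per_cpu_budget_watts(25.0, N_CPUS, NON_CPU_FRACTION)
print(f"per-CPU budget: {low:.3f} W to {high:.3f} W")
```

Under that assumption each CPU gets well under one watt, which is the order of magnitude of a phone-class core rather than a traditional server core.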
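A toy capacity model shows why a mixed system helps when typical load is half of peak. This is not the big.LITTLE MP scheduler: it is a greedy sketch that fills LITTLE cores first and wakes big cores only for the remainder, it assumes idle cores draw no power, and all capacities and wattages are hypothetical.

```python
import math


def active_cores(demand, n_little, little_cap, n_big, big_cap):
    """Greedy toy policy: use LITTLE cores first, big cores for what is left.

    Returns (little_on, big_on). Capacities are in arbitrary work units.
    """
    little_on = min(n_little, math.ceil(demand / little_cap))
    remaining = max(0.0, demand - little_on * little_cap)
    big_on = math.ceil(remaining / big_cap)
    if big_on > n_big:
        raise ValueError("demand exceeds the capacity of the system")
    return little_on, big_on


def power_watts(little_on, big_on, little_w, big_w):
    """Total draw, assuming powered-down cores consume nothing."""
    return little_on * little_w + big_on * big_w


# Hypothetical system: 4 LITTLE cores and 4 big cores.
N_LITTLE, LITTLE_CAP, LITTLE_W = 4, 1.0, 0.25
N_BIG, BIG_CAP, BIG_W = 4, 3.0, 1.5
PEAK_DEMAND = 16.0

peak_mix = active_cores(PEAK_DEMAND, N_LITTLE, LITTLE_CAP, N_BIG, BIG_CAP)
typical_mix = active_cores(PEAK_DEMAND / 2, N_LITTLE, LITTLE_CAP, N_BIG, BIG_CAP)

print("peak:", peak_mix, power_watts(*peak_mix, LITTLE_W, BIG_W), "W")
print("typical:", typical_mix, power_watts(*typical_mix, LITTLE_W, BIG_W), "W")
```

In this made-up configuration, half of the big cores can stay off on a typical day, which is where the power and TCO savings would come from.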
big.LITTLE MP software, which gives the OS full view of all the big and LITTLE processors in the system, can automatically handle the work allocation in such a system. This mode of scheduling is more appropriate to the enterprise use case than CPU migration. CPU migration leverages dynamic voltage and frequency scaling (DVFS) to trigger the move between big and LITTLE cores. This works well in mobile devices which typically employ DVFS, but is not as suitable for enterprise systems which typically do not. Now that big.LITTLE MP has been effectively demonstrated on real silicon, enterprise partners are evaluating how big.LITTLE can help them achieve their performance goals without blowing the power budget.
In servers, the benefits of big.LITTLE are still under investigation. There is tremendous interest in ARM based low-power servers, where even our “big” Cortex-A57 CPU will consume significantly lower power than incumbent solutions. With increasing pressure on OEMs to create power efficient servers, it is clear that high peak performance CPUs do not always equate to the best solution. One CPU size does not fit all. For many classes of server solutions, aggregate throughput is more important than peak performance. In these applications, a many core approach with lots of LITTLE Cortex-A53 processors delivers the highest level of aggregate performance under a reduced power budget. It is likely that a range of power efficient server products will be built around Cortex-A57 or Cortex-A53, but probably not with both on the same chip. The OS software will be ready to cope with either case, big.LITTLE or homogeneous multi-core, as the market evolves.
Brian Jeff, Product Manager at ARM, is based in Austin. Brian focuses on the power efficient vector along ARM’s application processor roadmap, including the Cortex-A5, the newly introduced Cortex-A7, and other CPUs further down the roadmap. He has also focused on benchmarking, performance analysis, and power analysis for ARM CPUs and systems. Brian joined ARM 3 years ago; prior to joining ARM he spent time at Texas Instruments and Freescale Semiconductor in product marketing, product management, and applications engineering roles. He has an MBA from the University of Texas at Austin and a BSEE from Virginia Tech.
x86 on ARM with Linux
Here comes the emulators! (EE Times Article) [‘ARM Servers Now’ blog from Calxeda, Oct 3, 2012]
Remember how smoothly Apple transitioned from PowerPC chips to X86 back in the mid-2000s? Customers hardly noticed that all their software “just worked” on a completely different ISA, thanks to some cool software built by “Transitive”, a small UK based company since gobbled up by IBM. Well, emulation doesn’t solve ALL the world’s problems, and critical applications will of course need to go native for maximum performance. But this approach can be very helpful with the CAO, or Computer Aided Other; the ancillary but important applications, tools, and utilities that are so pervasive in a datacenter.
Below is an excerpt from the EE Times article, ARM Gets Weapon in Server Battle Vs. Intel.
Russian engineers are developing software to run x86 programs on ARM-based servers. If successful, the software could help lower one of the biggest barriers ARM SoC makers face getting their chips adopted as alternatives to Intel x86 processors that dominate today’s server market.
Elbrus Technologies has developed emulation software that delivers 40 percent of current x86 performance. The company believes it could reach 80 percent native x86 performance or greater by the end of 2014. Analysts and ARM execs described the code as a significant, but limited option.
A growing list of companies, including Applied Micro, Calxeda, Cavium, Marvell, Nvidia and Samsung, aim to replace Intel CPUs with ARM SoCs that pack more functions and consume less power. One of their biggest hurdles is their chips do not support the wealth of server software that runs on the x86.
The Elbrus emulation code could help lower that barrier. The team will present a paper on its work at the ARM TechCon in Santa Clara, Calif., Oct. 30-Nov. 1.
The team’s software uses 1 Mbyte of memory. “What is more exciting is the fact that the memory footprint will have weak dependence on the number of applications that are being run in emulation mode,” Anatoly Konukhov, a member of the Elbrus team, said in an e-mail exchange.
The team has developed a binary translator that acts as an emulator, and plans to create an optimization process for it.
“Currently, we are creating a binary translator which allows us to run applications,” Konukhov said. “Implementation of an optimization process will start in parallel later this year – we’re expecting both parts be ready in the end of 2014.”
Work on the software started in 2010. Last summer, Elbrus got $1.3 million in funding from the Russian investment fund Skolkovo and MCST, a veteran Russian processor and software developer. MCST also is providing developers for the [Elbrus] project.
Emulation is typically used when the new architecture has higher performance than the old one, which is not the case, at least today, moving from the x86 to ARM. “By the time this software is out in 2014 you could see chips using ARM’s V8, 64-bit architecture,” Krewell noted. “That said, you will lose some of the power efficiency of ARM when doing emulation,” Krewell said. “Once you lose 20 or more percent of efficiency, you put ARM on par with an x86,” he added.
Emulation “isn’t the ideal approach for all situations,” said Ian Ferguson, director for server systems and ecosystem at ARM. “For example, I expect native apps to be the main solution for Web 2.0 companies that write their own code in high level languages, but in some areas of enterprise servers and embedded computing emulation might be interesting,” he said.
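Krewell's parity remark can be turned into a simple break-even calculation. The sketch assumes power draw is unchanged while emulating, so performance per watt scales directly with the fraction of native performance the emulator delivers; that simplification, and the function name, are assumptions rather than anything stated in the article.

```python
def break_even_advantage(emulation_fraction: float) -> float:
    """Native perf/watt advantage ARM needs so that emulated code still matches x86.

    emulation_fraction is the share of native performance the emulator delivers
    (0.4 for the current 40 percent figure, 0.8 for the 2014 target).
    """
    if not 0.0 < emulation_fraction <= 1.0:
        raise ValueError("emulation_fraction must be in (0, 1]")
    return 1.0 / emulation_fraction


for fraction in (0.4, 0.8):
    print(f"{fraction:.0%} of native -> needs {break_even_advantage(fraction):.2f}x perf/watt")
```

Read this way, the "20 or more percent" comment implies a native efficiency advantage of roughly 1.25x, while at today's 40 percent figure the ARM platform would need about 2.5x just to break even on emulated code.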
Russian Chip Gurus ARM Intel Rivals With Secret Weapon [Wired, Oct 5, 2012]
Elbrus was founded in 2010 by employees of MCST — the company behind the Russian computer system also called Elbrus. In 2012, MCST and the Russian investment fund Skolkovo invested $1.3 million into the new Elbrus Technologies.
At MCST, the startup team was part of the Binary Translation Department building x86 emulators for the Russian microprocessor E2K. According to Konukhov, their emulator performed 85 percent as well as native code. They also took part in a joint project with Intel to develop an x86 translator for Intel’s Itanium chip that achieved 90 percent of native performance. Konukhov says that MCST has published 46 journal articles on binary translation, and that the company has several USA patents in the field.
Elbrus Technologies’ secret sauce is its binary translator with multiple layers of hand-tuned optimization. And all the translations are handled in memory to speed up the process, with the translator itself taking up just 1MB of memory.
Although the goal is to reach 80 percent of the performance of native ARM, Konukhov says stability is more important. “Our marketing research clearly shows that most vendors and users are interested in functionality and stability rather than performance,” he says. “It is possible for us to release our solution without fully reaching performance goals and enhancing it afterwards.”
Linux 3.7 lives up to ARM developers’ hopes [PC Week/Russian Edition, Dec 13, 2012]
The Russian company Elbrus Technologies, a microprocessor developer, is preparing to solve this problem. The company is developing an efficient emulator for running x86 applications on ARM hardware. The work is currently at the alpha stage. The company intends to release a working public beta of the product by 2013, and by 2014 to reach an efficiency of at least 80% and bring the product to market.
Few companies run ARM servers today, so the market for an x86 emulator is small. However, some enterprises are very interested in saving money by moving to ARM servers, and it is for them that the Elbrus Technologies work could be useful. This is all the more so because the company building the x86 emulator for ARM has experience in binary code translation, and the new ARM environment is being built by hand to take the characteristics of the new systems into account as fully as possible.
Skolkovo have chosen Eltechs as one of the Success Story in scope of October’s 2012 report: http://community.sk.ru/press/our_results/p/oktober_2012.aspx …
2:58 AM – 2 Dec 12
Today marked an important milestone in our product testing and development for our Viridis platform here at Boston. We can now officially confirm that we have run x86 binaries on our ARM-based Viridis platform!
Over the last few weeks, we have been working with a group of engineers, from Eltechs, who are developing software to run x86 programs on ARM-based servers. This software could help lower one of the largest barriers to ARM SoC adoption as alternatives to Intel x86 processors in the datacentre.
Eltechs has developed a binary translator that acts as an emulator. The software currently delivers on average around 45% of native ARM performance. During our tests on the Viridis platform we observed up to 65% of native performance (6 different tests were run; details cannot be published at this time). We will continue working on our Viridis platform with Eltechs, which believes it could reach 80% of native ARM performance or greater in the future.
Of all the ARM products tested by Eltechs, we were delighted to hear our platform was received well:
“The Boston server has been the fastest platform we have tested to date,” said Vadim Gimpelson, CEO of Eltechs.
We will continue to work with Eltechs in testing and validating our platform and hope to see further improvements as the software matures. In addition to our successful initial tests, we will be adding this software to the Boston ARM Wrestle program so if anyone has a particular code or application that hasn’t been ported to ARM, please get in touch with us at firstname.lastname@example.org to discuss benchmarking on our test cluster.
Boston Viridis ARM Server Gets x86 Binary Translation Support [AnandTech, Oct 18, 2012]
We covered the launch of the Calxeda-based Boston Viridis ARM server back in July. The server is making its appearance at the UK IP EXPO 2012. Boston has been blogging about their work on the Viridis over the last few months, and one of the most interesting aspects is the fact that x86 binary translation now works on the Viridis. The technology is from Eltech, and they have apparently given the seal of approval to the Calxeda platform by indicating that the Boston Viridis was the fastest platform they had tested.
Eltech seems to be doing dynamic binary translation, i.e., x86 binaries are translated on the fly. That makes the code a bit bulky (heavier on the I-Cache). The overhead is relatively large compared to, say, VMware’s binary translator (BT) that does x86 to x86, because of the necessity to translate between two different ISAs.
Eltech uses a 1 MB translator cache (similar to the translator cache of VMware’s BT), which means they can reuse earlier translations. The translation overhead will thus decrease quickly over time if most of the critical loops fit in the translator cache. But it also means that only code with a relatively small footprint will run fast, e.g. get the promised 40-65% of native performance.
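The effect of code footprint on a fixed-size translator cache can be illustrated with a small simulation. Nothing here reflects Eltech's actual implementation: the LRU eviction policy, the 1 KB translated block size and all names are assumptions chosen only to show the behaviour described above.

```python
from collections import OrderedDict


class TranslationCache:
    """Toy cache of translated code blocks with LRU eviction (assumed policy)."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.blocks = OrderedDict()   # guest block address -> translated size in bytes
        self.hits = 0
        self.misses = 0

    def execute(self, guest_addr, translated_size):
        if guest_addr in self.blocks:
            # Earlier translation is reused; no translation cost this time.
            self.hits += 1
            self.blocks.move_to_end(guest_addr)
            return
        # Miss: a real translator would translate the x86 block here.
        self.misses += 1
        while self.blocks and self.used + translated_size > self.capacity:
            _, evicted_size = self.blocks.popitem(last=False)
            self.used -= evicted_size
        self.blocks[guest_addr] = translated_size
        self.used += translated_size


def hit_rate(n_blocks, block_size, passes, capacity=1 << 20):
    """Run `passes` sweeps over `n_blocks` code blocks and report the cache hit rate."""
    cache = TranslationCache(capacity)
    for _ in range(passes):
        for addr in range(n_blocks):
            cache.execute(addr, block_size)
    return cache.hits / (cache.hits + cache.misses)


if __name__ == "__main__":
    print("hot loop, 100 KB of translated code:", hit_rate(100, 1024, passes=10))
    print("large footprint, ~2 MB of translated code:", hit_rate(2000, 1024, passes=10))
```

In this model a working set that fits in the 1 MB cache is translated once and then reused, while a footprint twice the cache size that is swept in order never hits at all, so every block is retranslated on every pass.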
Most server applications have a relatively large instruction memory footprint, so it is unclear whether this approach will help to run any heavy server software. Some HPC software has a small memory footprint, but since HPC users tend to pursue performance most of the time, this technology is unlikely to convince them to use ARM servers instead of x86.
In general, the BT software will be useful in the – not uncommon – case where one may have a complex web application comprised of multiple software modules where one small piece of software is not open-source and the vendor does not offer an ARM-based binary. So, the Eltech solution does handle a small piece of the puzzle. x86 emulation is thus a nice-to-have feature, but most ARM-based servers will be running fully optimized and recompiled Linux software. That is the target market for products such as the Boston Viridis.
Low-power ARM-based Viridis servers manufactured by Boston Limited have made their UK debut at the IP Expo 2012 in London.
Boston is the world’s first company to make servers based on ARM processor technology, commonly used in smartphones and tablets.
The Viridis is the first system to approach the much talked about concept of Hyperscale, involving very high density systems that are only possible with low heat, low power chips.
The flying ARM server pig
Boston Viridis is based on the Calxeda EnergyCore System-on-a-Chip (SoC) which provides “supercomputer performance” while delivering a 90 percent reduction in energy costs when compared with conventional servers. Since every SoC consumes as little as 5 Watts of power, the system needs little active cooling, lowering maintenance costs even further.
Provisioned within a 2U enclosure, each Viridis unit contains up to 12 quad-node Calxeda EnergyCards with built-in Layer-2 networking. The EnergyCard is a single PCB module containing four EnergyCore SoCs, each with 4GB DDR-3 registered ECC memory, four SATA connectors and management interfaces.
Providing up to 192 cores and 48 nodes per enclosure, this highly dense solution can put up to 900 servers into a single industry standard 42U rack.
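The density figures can be checked with a few lines of arithmetic. The sketch below assumes four cores per EnergyCore SoC (the nodes are described elsewhere as quad-core) and that the rack is filled with nothing but 2U Viridis enclosures. The constant names are illustrative.

```python
CARDS_PER_ENCLOSURE = 12   # EnergyCards per 2U Viridis enclosure (from the text)
NODES_PER_CARD = 4         # EnergyCore SoCs per EnergyCard (from the text)
CORES_PER_NODE = 4         # assumption: each EnergyCore SoC is quad-core
ENCLOSURE_HEIGHT_U = 2
RACK_HEIGHT_U = 42

nodes_per_enclosure = CARDS_PER_ENCLOSURE * NODES_PER_CARD
cores_per_enclosure = nodes_per_enclosure * CORES_PER_NODE
enclosures_per_rack = RACK_HEIGHT_U // ENCLOSURE_HEIGHT_U
max_nodes_per_rack = enclosures_per_rack * nodes_per_enclosure

print(nodes_per_enclosure, cores_per_enclosure, max_nodes_per_rack)
```

This reproduces the 48 nodes and 192 cores per enclosure. The raw upper bound for a 42U rack comes out at 1008 nodes, so the quoted "up to 900 servers" sits somewhat below it. One possible reading is that some rack space is reserved for other equipment, but the article does not say.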
“These building blocks of high end computing are set to radically change the economics of large scale data centres, sparking innovation in emerging fields such as cloud computing, data modelling and analysis – often called ‘Big Data’ – scientific research and media streaming,” said David Power, head of HPC at Boston.
In the Viridis, Ethernet switching is handled internally by 80Gb bandwidth on the EnergyCore fabric switch, thereby negating the need for additional switches that consume unnecessary power and add unwanted latency.
The servers are supported by Ubuntu Server 12.04 LTS and Fedora v17+ distributions. They have been shown to run cloud management software from OpenStack, the Big Data tools Hadoop and Cassandra, and applications built in Java, Ruby on Rails and Python.
Earlier this month, Boston and Russian software developers Eltech managed to run x86 binaries on the Viridis platform, suggesting that in the future ARM servers could pose a serious threat to Intel silicon in the data centre.
Boston claims that with specific applications, one 2U Viridis appliance can outperform a whole rack of standard x86 servers, yet at the same time consume one tenth of the power and occupy one tenth of the space.
Russian startup working on Intel to ARM software emulator [ITworld, Oct 9, 2012]
[Russian version: Разработано средство миграции ПО с x86 на ARM]
Elbrus Technologies in Moscow is developing an x86 to ARM binary translator for use on ARM-powered servers
A Russian startup company called Elbrus Technologies is developing a technology that will allow data center owners to migrate software designed for x86 platforms to ARM-powered servers without the need to recompile it.
Because of their very low power consumption, ARM processors are used today in most smartphones, tablets and in a wide variety of embedded devices.
However, ARM chips are also expected to gain a foothold in the server market, which is currently dominated by x86 processors, during the next few years. Hewlett-Packard and Dell have already announced plans to build low-power servers based on ARM CPUs.
Intel CPUs use up to ten times more power than ARM CPUs and for large data centers power consumption represents 50 percent of their operational costs, said Anatoly Konukhov, the chief business development officer of Elbrus Technologies in Moscow.
In this context it makes sense for many data center operators to consider switching to ARM-based servers in the future. However, a big impediment is that many applications designed for x86 CPUs, especially proprietary, closed-source ones, won’t work on ARM processors.
Elbrus Technologies is trying to solve this issue by building an x86 to ARM binary translator application that will allow proprietary software compiled for the x86 architecture to run on ARM-powered servers without any changes.
The software emulation will be transparent to the user, Konukhov said. The emulator will automatically detect when an x86 application is executed and will perform the binary translation, he said.
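The article does not say how the emulator recognizes x86 applications. On Linux, one plausible approach is to inspect the `e_machine` field of a program's ELF header, as sketched below. The function names are hypothetical, and nothing here is taken from Elbrus Technologies' product. The numeric constants are the standard ELF machine codes.

```python
import struct

# e_machine values from the ELF specification.
EM_386 = 3
EM_ARM = 40
EM_X86_64 = 62

def elf_machine(header: bytes):
    """Return the e_machine field of an ELF header, or None if it is not ELF."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    # EI_DATA (byte 5): 1 = little-endian, 2 = big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18 in both 32- and 64-bit ELF.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return machine

def needs_translation(header: bytes) -> bool:
    """True if the binary targets x86 and so could not run natively on ARM."""
    return elf_machine(header) in (EM_386, EM_X86_64)

# Hand-built 20-byte header prefixes (little-endian, 32-bit, type ET_EXEC).
fake_x86 = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9) + struct.pack("<HH", 2, EM_386)
fake_arm = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9) + struct.pack("<HH", 2, EM_ARM)

print(needs_translation(fake_x86), needs_translation(fake_arm))
```

In practice the first 20 bytes would be read from the executable file itself. The Linux kernel's binfmt_misc facility can route binaries matching such a header pattern to a registered handler, which is one way the launch could be made transparent to the user.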
Even though the technology is theoretically platform-independent, the company currently focuses its development efforts on supporting Linux servers and software. Support for Windows software is a longer term goal.
The project started in the spring of 2012 and the product is expected to be ready for beta testing in the middle of next year, Konukhov said. The final product will be released sometime at the end of 2013 or in the beginning of 2014, he said.
“I think we currently support 50 or 60 percent of the functionality of Intel-based CPUs,” Konukhov said. This includes the entire base instruction set of the x86 architecture.
The company is working on adding support for the Streaming SIMD Extensions (SSE) and MMX instruction sets. “This will basically allow us to have multimedia functionality in our applications,” Konukhov said.
The performance of translated code compared to native code is currently at 45 percent. The goal is to have a performance level of 80 percent or more, but that probably won’t be the case for the first production ready version of the product.
“We think it will be lower and there’s a good reason for that,” Konukhov said. “We’ve discussed this issue with our partners and they were more interested in the functionality supported by our emulator and in stability rather than performance. So, they would like to see working and stable software rather than fast software.”
The performance-enhancing work will begin after the initial product is released, and an 80 or 90 percent performance level is expected to be achieved in a matter of months, Konukhov said.
The company worked with partners and potential customers to determine which applications should be considered a priority for its x86 to ARM binary translation technology. Konukhov declined to name any of those applications because of existing non-disclosure agreements, but said that they are from the financial and healthcare sectors.
A lot of the people working on this project came from MCST, Elbrus Technologies’ parent company, where they worked on developing x86 to Elbrus binary translators, Konukhov said. Elbrus is a Russian microprocessor manufactured by MCST.
Elbrus Technologies raised US$1.3 million in funding from MCST and the Skolkovo Foundation, a non-profit organization tasked by the Russian government to manage grant funds for technology projects. Elbrus is looking for additional investors and business partners, Konukhov said.
Boston Ltd. related information from Calxeda
Calxeda EnergyCore-Based Servers Now Available [‘ARM Servers Now’ blog from Calxeda, July 9, 2012]
We spent a lot of time at various tradeshows around the world in June and the #1 question we were asked was “when can I get my hands on a Calxeda-based server?” I am happy to tell you the wait is over.
We have been working with Boston Limited in the UK, a highly respected solution provider, for about a year to bring an excellent Proof of Concept (POC) platform to market called “Viridis”. Boston currently has about 20 customers lined up for beta testing and a pipeline of hundreds of others interested in evaluating the platform. Boston is taking orders now from users in Europe, Asia and the US with shipments beginning later this month.
The Register published a great article today highlighting the features of the Boston Viridis platform:
Boston Viridis is a perfect option for those users who want to port their code, run benchmarks, and optimize their workloads for ARM. This highly configurable solution allows users to create their ideal initial testing environments with options ranging from 4 to 48 Calxeda EnergyCore server nodes in a 2U form factor.
We look forward to working with Boston and other systems providers to enable the market with Calxeda-based POCs. Stay tuned as we learn about success stories users experience with Calxeda EnergyCore-based solutions over the coming months.
The World’s First 130 Watt Server Cluster [‘ARM Servers Now’ blog from Calxeda, Oct 25, 2012]
Calxeda’s approach to driving power optimization in the datacenter goes well beyond the processor. We focus on enabling our partners to achieve rack level power efficiency based on our technology. Last week, Boston Limited announced that their 2U Viridis platform with 24 Calxeda EnergyCore(TM) server nodes, 96GB of memory, and 6TB of storage measures 130W “at the wall”. This equates to just 5.5W of power per server inclusive of memory, disk and chassis-level overhead. At a fraction of the power of a traditional x86 server node, the Viridis server cluster based on Calxeda EnergyCore will allow datacenter operators to experience an order of magnitude improvement in efficiency. Said another way, this platform can power 24 quad-core servers, with 24 SSDs and 96 GB DRAM, for about the same or less power consumption than a single low-end two-socket x86 server. So long as the 24 servers can get more work done than the single x86 server for the targeted workload like web serving, it will substantially reduce datacenter power.
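The per-server figures follow from dividing the chassis totals by the node count. The short sketch below does that, assuming decimal units (1 TB = 1000 GB). The variable names are illustrative.

```python
WALL_POWER_W = 130.0   # measured "at the wall" for the whole 2U chassis
NODES = 24             # EnergyCore server nodes in the measured configuration
MEMORY_GB = 96
STORAGE_TB = 6

watts_per_node = WALL_POWER_W / NODES
memory_per_node_gb = MEMORY_GB / NODES
storage_per_node_gb = STORAGE_TB * 1000 / NODES  # assumes 1 TB = 1000 GB

print(f"{watts_per_node:.2f} W, {memory_per_node_gb:.0f} GB RAM, "
      f"{storage_per_node_gb:.0f} GB storage per node")
```

The division gives roughly 5.4 W per node, which the post rounds to 5.5 W, along with 4 GB of memory and 250 GB of storage per node.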
If you would like to see these power efficiency enhancements in person, come see the Boston Viridis featured at ARM TechCon 2012 in Santa Clara next week in both the ARM and Canonical booths.
Here is a video of David Borland, Calxeda Co-Founder and VP of Hardware, discussing the Boston Viridis power enhancements and the innovative chassis-level optimizations that our engineering teams worked together to achieve.
Happy Birthday, EnergyCore! [‘ARM Servers Now’ blog from Calxeda, Nov 5, 2012]
One year of EnergyCore technology
Calxeda introduced its patented EnergyCore technology to the marketplace one year ago last week. In the year since, we have continued to work hard with our ISV and OEM partners to expand the ARM server ecosystem and bring systems to market, and we are pleased with the progress that’s been made.
Five companies now provide EnergyCore-based systems: HP, Boston Ltd., Dell, Penguin Computing, and System Fabric Works. We work closely with our partners to optimize EnergyCore technology for each specific application: we recently detailed how we worked with Boston Ltd. to power-optimize the Viridis system, creating the world’s first 130 W server cluster (24 EnergyCore nodes with 96 GB of memory and 6 TB of storage). That’s just 5.5 W per complete server. Benchmarks including recent releases from Phoronix have demonstrated that Calxeda systems achieve the promised performance levels, resulting in significant potential TCO savings over incumbent x86 solutions.
In the last year, we also have been pleased to collaborate with our partners to support industry initiatives that advance the adoption of ARM server technology, including OpenStack’s TryStack ARM Zone and the Apache Software Foundation (ASF). These programs are important to the open source community and will help further the adoption of ARM servers.
We are honored to be recognized for our efforts: Calxeda was named one of the Wall Street Journal’s Top 10 Venture Green Companies and listed as one of Business Insider’s 10 Most Disruptive Enterprise Tech Companies. Calxeda was an EETimes/EDN ACE Awards finalist this year, and CEO Barry Evans was nominated as E&Y Entrepreneur of the Year.
And to help us pay bills and invest in the future, we recently closed $55M in additional funding with the continued support of our existing investors plus the addition of Austin Ventures and Vulcan Capital. We are looking to the future and recently described our plans for the next generation of our innovative technology using ARM’s Cortex-A50 series 64-bit cores, which we announced at ARM TechCon last week.
All in all, it’s been a great year, and the momentum continues to grow. Happy Birthday, EnergyCore! The ARM revolution definitely has begun.
Calxeda Lays Out a Vision for the Hyper-Efficient Datacenter [Calxeda press release, Oct 17, 2012]
Plans Include New Platforms for Cloud and Warehouse-Scale Datacenters
Calxeda, the company that first invented the concept of using ARM® technology to slash datacenter power, today announced its vision and roadmap to extend the company’s leadership in the hyper-efficient computing market. Following the recent announcement of $55 million in additional capital, this news outlines Calxeda’s plans to catalyse rapid market adoption and the creation of an entirely new category of IT products.
The Calxeda EnergyCore ECX-1000, now available, has been called one of the most disruptive technologies in the IT industry today*. The company has now shipped thousands of early EnergyCore SoCs to OEM customers and end users, and is providing free access to the technology on the OpenStack Trystack.org cloud. The product is now available in servers from Penguin Computing, who announced its partnership with Calxeda today, in addition to long-time partners Boston Limited and Hewlett-Packard.
The Calxeda roadmap implements a two-pronged strategy to reach additional markets. The first enables optimized racks for public and private clouds, while the second will enable and span massive warehouse-scale datacenters.
“We are very excited about the market’s response to our pioneering first generation product,” said Barry Evans, Calxeda’s founder and CEO. “Now we are taking it two steps further to reinvent the server, first into a rack-based cloud appliance, and then extending into an integrated fleet of computing resources, spanning many thousands of efficient servers.”
Calxeda’s second-generation platform, code-named “Midway,” opens new markets for Calxeda. “It’s all about finding the right balance of I/O, Storage, networking, management, memory and computational elements for each target market segment,” added Evans. “This is the beauty of an ARM-based SoC approach: each platform can be tailored to add more value by addressing the unique needs of a specific workload.”
To go after cloud applications such as dynamic web hosting and more computationally intensive Big Data analytics, Midway delivers more performance, more memory and hardware virtualization support using standard Cortex-A15 ARM cores. In addition, Calxeda’s second generation fabric will support new features such as dynamic power and routing optimization for public and private clouds. Midway will be available in volume in 2013.
“64-bit ARM architecture-based production servers are years away,” said Patrick Moorhead, president and principal analyst, Moor Insights & Strategy. “Calxeda’s approach to shipping 32-bit technology today and upgrading to the ARM A15 in 2013 makes a lot of sense for specialized workloads in the largest datacenters.”
Calxeda’s third generation platform, code-named “Lago,” is Calxeda’s platform for the warehouse-scale datacenter. Built on the 64-bit ARMv8 architecture, Lago features Calxeda’s third generation scaling features, called the Calxeda Fleet Services™, to further automate and optimize common operations at massive scale. The enhanced fabric will also connect hundreds of thousands of nodes, with quality of service features and the ability to allocate and control resources.
“We expect to lead the industry with new concepts that will change the datacenter in ways far beyond just lowering power and increasing density,” continued Evans. “Lago will be in the first wave of 64-bit complete systems and application stacks on ARM in 2014, and we are collaborating with key partners to ensure that customers can ramp quickly with production-quality software and OS support for both Midway and Lago.”
Calxeda Trailblazer partners continue to be critical in collaborating to develop the required ecosystem. The Trailblazer initiative provides early access to Calxeda technology for collaborative development and innovation with Calxeda’s engineers and architects. Canonical has been a Trailblazer partner since the program’s inception and shared this:
“Canonical believes that ARM-based servers deliver significant efficiency savings for enterprises. As part of our long term collaboration, we’ve delivered Ubuntu 12.04 LTS on Calxeda hardware,” said Steve George, vice president, Canonical. “Today, we welcome the Calxeda team’s extended roadmap and look forward to continuing our partnership with Calxeda as we bring the benefits of power efficient hyperscale computing to datacenters.”
Founded in January 2008, Calxeda brings new performance density to the datacenter with revolutionary server-on-a-chip technology. Calxeda currently employs 100 professionals in Austin, Texas and the Silicon Valley area. Calxeda is funded by a unique syndicate comprising industry leading venture capital firms and semiconductor innovators, including ARM Holdings, Advanced Technology Investment Company, Austin Ventures, Battery Ventures, Flybridge Capital Partners, Highland Capital Partners, and Vulcan Capital. See www.calxeda.com for more information.
Background on Elbrus (in Russian or English if available)
Presented here is B. Babayan’s article “Main Principles of E2K Architecture” in the version published in “Free Software Magazine”, China (Vol.1, Issue 02, Feb 2002 17). The site preserves the journal’s original page numbering. A translation of the original article can be downloaded from our site in PDF format.
Elbrus (computer) [Wikipedia, Aug 13, 2012]
The Elbrus (Russian: Эльбрус) is a line of Soviet and Russian computer systems developed by Lebedev Institute of Precision Mechanics and Computer Engineering. In 1992 a spin-off company Moscow Center of SPARC Technologies (MCST) was created and continued development.
These computers are used in the space program, nuclear weapons research, and defense systems.
- Elbrus 1 (1973) was the fourth generation Soviet computer, developed by Vsevolod Burtsev. Implements tag-based architecture and ALGOL as system language like the Burroughs large systems. A side development was an update of the 1965 BESM-6 as Elbrus-1K2.
- Elbrus 2 (1977) was a 10-processor computer, considered the first Soviet supercomputer, with superscalar RISC processors. Re-implementation of the Elbrus 1 architecture with faster ECL chips.
- Elbrus 3 (1986) was a 16-processor computer developed by Boris Babaian. Differing completely from the architecture of both Elbrus 1 and Elbrus 2, it employed a VLIW architecture.
- Elbrus-90micro (1998-2010) is a computer line based on SPARC instruction set architecture (ISA) microprocessors: MCST R80, R150, R500, R500S and MCST-4R working at 80, 150, 500 and 1000 MHz.
- Elbrus-3M1 (2005) is a 2-processor computer based on Elbrus 2000 microprocessor employing VLIW architecture working at 300 MHz. It is a further development of the Elbrus 3 (1986).
- Elbrus МВ3S1/C (2009) is a ccNUMA 4-processor computer based on Elbrus-S microprocessor working at 500 MHz.
Elbrus 2000 [Wikipedia, Dec 9, 2012]
It supports two instruction set architectures (ISAs):
- Elbrus VLIW
- Intel x86 (a complete, system-level implementation with a software dynamic binary translation virtual machine, similar to Transmeta Crusoe)
Thanks to its unique architecture, Elbrus 2000 can execute up to 23 instructions per clock, so even with its modest clock speed it can compete with much faster clocked superscalar microprocessors, especially when running in native VLIW mode.
Supported operating systems
Elbrus 2000 Highlights
CMOS 0.13 µm
64 Bit: 5.8 GIPS
32 Bit: 9.5 GIPS
16 Bit: 12.3 GIPS
8 Bit: 22.6 GIPS
integer: 32, 64
float: 32, 64, 80
64 KB L1 instruction cache
64 KB L1 data cache
256 KB L2 cache
data transfer rate
to cache: 9.6 GByte/s
to main memory: 4.8 GByte/s
packing / pins
HFCBGA / 900
1.05 / 3.3 V
- Video of booting Windows 2000 on Elbrus microprocessor
- Specifications of E2K at MCST (In Russian)
Elbrus-S [Russian Wikipedia, April 30, 2012]
The Elbrus-S processor is based on the ELBRUS architecture (ExpLicit Basic Resources Utilization Scheduling), whose distinguishing feature is the deepest parallelization of resources to date for simultaneously executing VLIW instructions. Peak performance is 39.5 GIPS.
Main characteristics of the Elbrus-S microprocessor
CMOS 0.09 µm
Operating clock frequency
64-bit: 4.0 GFLOPS
32-bit: 8.0 GFLOPS
integer: 8, 16, 32, 64
floating point: 32, 64, 80
L1 instruction: 64 KB
L1 data: 64 KB
L2 (unified): 2 MB
data: 1024 entries
instructions: 64 entries
cache memory buses: 16 GB/s
main memory buses: 8 GB/s
interprocessor exchange buses: 12 GB/s
Number of metal layers
Package type / number of pins
HFCBGA / 1156
1.1 / 1.8 / 2.5 V
The processor is used together with a peripheral interface controller (KPI) chip, whose testing was completed at the same time as the processor’s.
The processors and a module based on them were presented in October 2010 at the “ChipEXPO-2010” and Softool exhibitions.
A.K. Kim, General Director, I.S. Bruk INEUM OJSC
V.Yu. Volkonsky, head of division, I.S. Bruk INEUM OJSC
Yu.Kh. Sakhin, head of division, I.S. Bruk INEUM OJSC
S.V. Semenikhin, head of division, I.S. Bruk INEUM OJSC
V.M. Feldman, head of division, I.S. Bruk INEUM OJSC
F.A. Gruzdov, head of department, I.S. Bruk INEUM OJSC
Yu.N. Parakhin, head of department, I.S. Bruk INEUM OJSC
M.S. Mikhailov, head of department, I.S. Bruk INEUM OJSC
M.V. Slesarev, researcher, I.S. Bruk INEUM OJSC
The paper examines the architectural features, design principles, and technical characteristics of the Russian Elbrus series of computing systems. To increase performance, the systems use explicit operation-level parallelism, vector parallelism, control-flow parallelism, and task parallelism. The Russian microprocessors of this series use the multicore parallelism of systems on a chip. Explicit operation-level parallelism, combined with special hardware support, is used to provide efficient compatibility with the Intel x86 (IA-32) architectural platform by means of a dynamic binary translation system that is invisible to the user. Finally, parallelism is used in hardware to support the protected implementation of any programming language, including C and C++. All these features make it possible to build general-purpose computing systems of high reliability and a wide range of applications, from desktop computers and embedded systems to powerful servers and supercomputers.
2. Implementation of the Elbrus microprocessor architecture and the Elbrus-3M1 computing system
The decisive stage of work on implementing the original Russian architecture was completed in November 2007 with successful state trials of the Elbrus microprocessor and the two-processor Elbrus-3M1 computing system based on it. The Elbrus-3M1 ran the Linux operating system, which had been ported to it, as well as the MSVS OS. The trials demonstrated that the Elbrus-3M1 can efficiently run customer software systems developed by various organizations. When these tasks were run on the Elbrus-3M1 at 300 MHz, the average speedup relative to a Pentium 4 at 1.4 GHz was 1.44 times.
2.2. Binary compatibility with the IA-32 architecture
The user of the Elbrus-3M1 is provided with facilities for full binary compatibility with the IA-32 architecture. This is achieved through hardware support for the semantics of IA-32 operations, together with support for a combined software-hardware implementation of compatibility that uses hidden (invisible to the user) dynamic binary translation technology [8-9].
The binary translation system (the binary translator) is designed for highly efficient execution of binary code written for the IA-32 architecture, or for hardware compatible with it (the source platform), on the Elbrus-3M1 computing system (the target platform). The binary translator implements semantic compatibility with the source platform at the virtual-machine level and makes it possible to run arbitrary source-platform code on the Elbrus-3M1, including the code of any operating system.
Binary translation is a high-performance and reliable means of making binary code portable between computers of different architectures [10-11]. Experience in building the binary translation system for the Elbrus-3M1 confirms experimentally that binary-translated code can execute with an efficiency substantially exceeding that of the source architecture at a comparable clock frequency.
Modern microprocessors that use a superscalar architecture, such as the microprocessors of the IA-32 platform, first decode complex variable-length instructions in hardware and convert them into simpler, more regular micro-operations. Register renaming is then performed to eliminate false dependencies between micro-operations caused by the limited number of registers in the source instruction set. Some optimizations are carried out along the way. In particular, memory reads are removed from the instruction stream if they are preceded in that stream by writes to the same address. Then, in some implementations, a trace of recoded micro-operations is formed, representing the most probable chain of operations drawn not from one but from several consecutive linear sections of code. This trace is placed in a special hidden memory (the trace cache) for reuse. To obtain the best set of traces, a special learning system is supported in hardware. It observes the control-transfer operations in the program and tries to predict the branch direction at each point. Finally, the hardware schedules the micro-operations onto the available pool of execution units.
In a software-hardware implementation of compatibility based on binary translation, most of the work of recoding, dependency analysis, scheduling-region formation, register allocation, and operation scheduling is removed from the hardware and handed over to the binary translator. The essence of binary translation is to decompose sequences of source-architecture binary code and convert them into functionally equivalent sequences of target-architecture code, which are subsequently executed on the hardware of the target platform. Unlike instruction-by-instruction interpretation, another common method of providing binary compatibility, binary translation can achieve fairly high efficiency in “executing” source code, because the target code is optimized and saved, so that code translated once can be executed many times.
The binary translator for the Elbrus-3M1 is a dynamic binary translator operating at the virtual-machine level, which makes it possible to run on the system the full range of operating systems implemented for the source platform, together with their applications. The main advantage of this mode of operation is therefore its universality: the Elbrus-3M can run any software (including peripheral device drivers) available to users of computers of the source architecture.
The efficiency of the Elbrus-3M1 binary translation system comes from having substantially more execution units than superscalar architectures (at least twice as many), which is a direct consequence of removing the operation-parallelization logic from the hardware and handing those functions to the binary translator. The software optimization algorithms examine much larger regions of code than the parallelization “window” of superscalar architectures and make it possible to use the full range of execution units. As a result, the Elbrus-3M1 achieves a higher logical speed (execution time at equal clock frequencies) when running programs in IA-32 code, as was demonstrated during the state trials. For example, on none of the 10 tasks of the SPECfp95 suite does the performance of the Elbrus-3M1 (300 MHz) fall below that of a Pentium II (300 MHz), and on average it exceeds it by a factor of 1.75. The average performance of the Elbrus-3M1 even exceeds that of a Pentium III (450 MHz) by a factor of 1.17. On a broader class of tasks, the performance of the Elbrus-3M1 when executing IA-32 code is comparable to that of Pentium II, Pentium III, and Pentium 4 processors running at frequencies of 300-1500 MHz.
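The SPECfp95 figures quoted above can be normalized to equal clock frequencies to give a rough sense of the "logical speed" advantage. The sketch below assumes that performance scales linearly with clock frequency, which is a simplification, and the helper name `per_clock_ratio` is illustrative.

```python
def per_clock_ratio(speedup, own_mhz, other_mhz):
    """Scale a measured speedup to equal clock frequencies.

    Assumes performance scales linearly with clock frequency, which is a
    simplification (memory latency, for one, does not scale this way).
    """
    return speedup * other_mhz / own_mhz

ELBRUS_MHZ = 300
vs_pentium2 = per_clock_ratio(1.75, ELBRUS_MHZ, 300)  # SPECfp95 average
vs_pentium3 = per_clock_ratio(1.17, ELBRUS_MHZ, 450)  # SPECfp95 average

print(f"{vs_pentium2:.2f}x vs Pentium II, {vs_pentium3:.2f}x vs Pentium III per clock")
```

Under that assumption both comparisons land at roughly 1.75 times the per-clock performance, so the two reported averages are consistent with each other.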
The binary translation system is highly reliable. It has successfully run more than 20 operating systems in IA-32 code on the Elbrus-3M1, including MS DOS, several versions of Windows (95, NT, 2000, XP, and others), Linux, FreeBSD, and QNX. More than 1000 popular applications run successfully and efficiently under these operating systems, including interactive computer games, programs from the MS Office suite (MS Word, MS Excel, MS PowerPoint, and others), video clips, data compression and decompression programs, and drivers for all external devices.
2.4. From a large machine to a microprocessor
The architectural line of the Elbrus microprocessor originates from the Elbrus-3 multiprocessor computing system, which was created in the Soviet Union in the late 1980s as a continuation of the Elbrus-1 and Elbrus-2 line of computing systems. It was a large machine designed using Soviet-made large-scale integrated circuits. In the 16-processor system, each processor occupied a separate cabinet. But the architecture of the central processor incorporated many features that were later embodied in the microprocessor with the Elbrus architecture.
The processor was controlled by a wide instruction word, which made it possible to obtain up to 7 results of arithmetic-logic operations, and to read up to 6 and write up to 2 64-bit data items from and to memory, in a single machine cycle. The architecture included speculative and predicated operations. Hardware support for loops included rotating based registers and a unit for prefetching data from memory into these registers with automatic address advancement. Branch preparation was used to parallelize control flow. Because the Elbrus-3 continued the architectural line of the Elbrus-1 and Elbrus-2, it was designed to be compatible with those systems at the operation level, including support for protected programming based on hardware tags.
The first Elbrus-3 processors were manufactured in 1991, and their debugging began. But the economic changes that began in the country in 1992 brought the project to a halt and led to a rethinking of how Russian computing technology should develop. It became clear that computing technology could develop successfully only on the basis of microprocessors. Compatibility with one of the world’s widespread microprocessor architectural platforms became an important requirement of the time. All these changes eventually transformed the Elbrus-3 project into the Elbrus-3M1 project, based on the Elbrus microprocessor architecture with explicit instruction-level parallelism, support for the protected implementation of programming languages, and full compatibility with the IA-32 platform through binary translation technology.
The Russian company ZAO MCST, which since 2007 has been integrating with I.S. Bruk INEUM OJSC into a sector institute in order to accelerate work on creating new generations of Elbrus-series computing systems. The development program is planned for more than 10 years and covers the improvement of the microprocessors and of the computing systems based on them, including the chipset and structural components, as well as system software, including operating systems, compilers, binary translation technology, and high-performance libraries.
DYNAMIC BINARY TRANSLATION SYSTEM X86 → «ELBRUS»
[Nikita Voronov, Vadim Gimpelson, Maxim Maslov, Aleksey Rybakov, Nikita Syusyukalov (MCST), Oct 31, 2011]
The article describes a dynamic binary translation system that translates x86 binary code to the Elbrus architecture. We consider the general scheme of the binary translator, its multi-level optimization system and the techniques used to reduce translation overhead (long-term storage of translated code and parallel translation). Finally, we compare the performance of an Elbrus processor running the binary translation system against several x86 microprocessors.
Keywords: binary translation, virtual machine, co-design virtual machine, Elbrus microprocessor.
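As a rough illustration of what "dynamic binary translation with multi-level optimization" means, here is a toy Python sketch. It is emphatically not the MCST system: the guest "ISA", the `GuestBlock` and `ToyTranslator` names, the two tiers and the hot-block threshold are all hypothetical. It shows only the general pattern: translate a block on first use, cache the result, and retranslate at a higher optimization level once the block proves hot. Long-term storage of translations and parallel translation are not modeled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class GuestBlock:
    """One basic block of a made-up guest ISA (hypothetical, not x86)."""
    adds: Tuple[int, ...]              # constants added to the accumulator
    limit: Optional[int] = None        # if set: jump to `taken` while acc < limit
    taken: Optional[int] = None        # branch target (guest address)
    fallthrough: Optional[int] = None  # next block; None means halt


HostCode = Callable[[int], Tuple[int, Optional[int]]]


class ToyTranslator:
    def __init__(self, program: Dict[int, GuestBlock], hot_threshold: int = 3):
        self.program = program
        self.hot_threshold = hot_threshold
        self.cache: Dict[int, Tuple[int, HostCode]] = {}  # addr -> (tier, code)
        self.exec_count: Dict[int, int] = {}
        self.translations = 0

    def _translate(self, addr: int, tier: int) -> HostCode:
        block = self.program[addr]
        self.translations += 1

        def next_addr(acc: int) -> Optional[int]:
            if block.limit is not None and acc < block.limit:
                return block.taken
            return block.fallthrough

        if tier == 0:
            # Tier 0: quick translation, one host step per guest operation.
            def code(acc: int) -> Tuple[int, Optional[int]]:
                for value in block.adds:
                    acc += value
                return acc, next_addr(acc)
        else:
            # Tier 1: "optimized" translation, all adds folded at translate time.
            total = sum(block.adds)

            def code(acc: int) -> Tuple[int, Optional[int]]:
                acc += total
                return acc, next_addr(acc)
        return code

    def run(self, entry: int, acc: int = 0) -> int:
        addr: Optional[int] = entry
        while addr is not None:
            count = self.exec_count.get(addr, 0) + 1
            self.exec_count[addr] = count
            wanted = 1 if count >= self.hot_threshold else 0
            tier, code = self.cache.get(addr, (-1, None))
            if tier < wanted:  # not translated yet, or hot enough to re-optimize
                code = self._translate(addr, wanted)
                self.cache[addr] = (wanted, code)
            acc, addr = code(acc)
        return acc


program = {
    0: GuestBlock(adds=(1, 2), limit=30, taken=0, fallthrough=1),  # hot loop
    1: GuestBlock(adds=(100,)),                                     # cold exit
}
translator = ToyTranslator(program)
result = translator.run(entry=0)
```

Here the loop block is translated twice (cheaply at first, then in optimized form once it has run three times), while the exit block is translated only once at the cheap tier. That asymmetry is the reason a multi-level scheme can keep translation overhead low for cold code and still spend effort where it pays off.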
4. Experimental results
In conclusion, we present a performance comparison of the full binary translation system running on the Elbrus-S microprocessor (500 MHz) with two x86 microprocessors: Pentium-M (1000 MHz) and Atom D510 (1660 MHz). The comparison was made on the SPEC 2000 benchmark suite. Figs. 8 and 9 show the results for the integer and floating-point tasks, respectively.
Fig. 8. Performance comparison results on the SPEC 2000 Int suite
Fig. 9. Performance comparison results on the SPEC 2000 FP suite
The data for the Pentium-M microprocessor were taken from the official SPEC website. The results for the Atom and Elbrus microprocessors were obtained by the authors, with identical code used for both measurements. The x86 → Elbrus binary translation system ran with all the technologies described in this article and was built by an optimizing language compiler at a high optimization level.
Binary translation technology [iXBT.com, Nov 3, 2009]
Essence, areas of application and implementation specifics
- Classification of BT systems by type (FBTS and ABTS)
- Classification of BT systems by the task performed
- Interaction of BT with other areas of Computer Science
- Key concepts of BT
- Analysis of completed projects
- Dynamic and static BT
- References
Binary translation (BT) is a technology with a fairly long history by now, no official documents describing the achievements in the field in detail, and an unpredictable future. Although a number of binary translation systems have been implemented and a series of studies carried out at various research centers, nobody yet uses such systems in everyday work. To this day it remains a promising technology and an attractive research direction for many engineers. The question has long been in the air: where are the real binary translation implementations with a chance of becoming globally recognized commercial products?
Below I plan to examine the preconditions for the emergence of binary translation and the reasons why some of the best-known products failed to achieve commercial success, and to focus separately on two mutually complementary approaches: dynamic and static BT.
Background on Elbrus Technologies (translated from Russian)
Elbrus Technologies is a young and energetic startup focusing on high-tech software projects. Our goal is to create products of superior technical quality that can influence the development of the IT industry as a whole.
Our market is cloud services, data centers and clusters built on the latest ARM-architecture servers. Hewlett Packard and Dell have already announced such servers. Today the market for these servers is closed to proprietary software, the overwhelming majority of which is written and compiled for the x86 architecture.
Investment opportunities
The company is seeking a strategic investor to productize its binary translation technology and enter the international market.
Vacancies [Oct 15, 2012]
Elbrus Technologies, a resident of the Skolkovo innovation Foundation, invites an experienced and ambitious developer to join its team in the position of …
Contacts [Aug 31, 2012]
Moscow, ul. Vavilova 24 (a 10-minute walk from the Leninsky Prospekt metro station)
eltechs.com Company news Moscow
Needs: 100 million rubles. Co-investment: 9.6 million rubles. [Needs $3.125M, Co-investment $0.3M]
Technology for software-based porting of binary code from the x86 architecture to the ARM architecture.
Vadim Gimpelson, CEO
Maxim Maslov, CTO
Anatoly Konukhov, Director of Development
20.11.2012 10:01 by Elbrus Technologies
The World’s First 130 Watt Server Cluster
Gina Longoria Oct 25, 2012 Calxeda’s approach to driving power optimization in the datacenter goes well beyond the processor. We focus on enabling our partners to achieve rack level power…
19.11.2012 15:58 by Elbrus Technologies
Dell wants to tune big data apps for ARM servers
Derrick Harris Oct 24, 2012 Dell is donating an ARM-based server to the Apache Software Foundation so contributors can test their projects on new, energy-efficient hardware architectures…
19.11.2012 14:43 by Elbrus Technologies
Calxeda roadmap leads to 64-bit CPU in 2014
Rick Merritt Oct 17, 2012 SAN JOSE, Calif.–Startup Calxeda has disclosed its two-year road map including its first 64-bit chip just over a week before ARM TechCon, when competitors are expected…
23.10.2012 16:09 by Elbrus Technologies
Russian Startup Working on Intel to ARM Software Emulator
Elbrus Technologies in Moscow is developing an x86 to ARM binary translator for use on ARM-powered servers Lucian Constantin Oct 09, 2012 IDG News Service — A Russian startup company called…
22.10.2012 20:04 by Elbrus Technologies
ARM: the nature of servers is changing
Jack Clark 21.09.2012 The changes brought about by the move to cloud computing have already had a huge impact on approaches to server design, and they may also prove to be the decisive factor that…
7.10.2012 0:07 by Elbrus Technologies
ARM gets weapon in server battle vs. Intel
Rick Merritt Oct 2, 2012 SAN JOSE, Calif. – Russian engineers are developing software to run x86 programs on ARM-based servers. If successful, the software could help lower one of the biggest…
4.10.2012 12:31 by Elbrus Technologies
ARM may gain a trump card in its fight with Intel thanks to Russian developers
Russian engineers from the startup Elbrus Technologies are working on a binary translator that lets applications built for traditional x86 desktop and server processors from Intel or AMD run on energy-efficient ARM-architecture chips without recompilation. The goal of the project is to make ARM chips more attractive…
3.10.2012 16:51 by Elbrus Technologies
Applied Micro’s X-Gene server chip ARMed to the teeth
Aug 30, 2012 Ready to take a bite out of x86 servers and Cisco Hot Chips An opportunity to define the future of server processing comes along once every decade or so, and Applied Micro Circuit, a company known for its networking chips and PowerPC-based embedded controllers, wants to move up into the big leagues to take on Intel, Advanced Micro…
26.9.2012 12:29 by Elbrus Technologies
Nvidia Develops High-Performance ARM-Based “Boulder” Microprocessor – Report
Nvidia Reportedly Preps Competitor for AMD Opteron and Intel Xeon Processor for Servers Anton Shilov Sep 21, 2012 Nvidia Corp. is reportedly working on an ultra high-performance system-on-chip based on ARM architecture, which would challenge AMD Opteron and Intel Xeon microprocessors in the server space. The chip is called project Boulder and…
31.8.2012 18:09 by Elbrus Technologies
Reshape Next Generation Cloud and Data Centers
Project Thunder is a family of highly integrated, multi-core SoC processors that will incorporate highly optimized, full custom cores built from the ground up based on 64-bit ARMv8 Instruction Set Architecture (ISA) into an innovative system-on-chip (SoC) that will redefine features, performance, power and cost metrics for the next-generation cloud…
31.8.2012 18:01 by Elbrus Technologies
The Baserock™ Slab. Highly optimized for use with Baserock Embedded Linux system development software
The Baserock™ Slab is a multi-processor server featuring 8 quad-core ARMv7-A CPUs running at 1.33GHz and an on-board high-speed network switch fabric with 5Gbit/s between the CPUs and 2x10Gbit/s external. Each compute node gets additional performance with its own dedicated low-latency mSATA solid state drive. The Slab is designed to deliver…
31.8.2012 17:49 by Elbrus Technologies
Boston Viridis – ARM® Microservers. A server that only uses 5 watts of power!
The Boston Viridis uses the ARM® based Calxeda EnergyCore™ SoCs (Server on Chip) to create a rack mountable 2U server cluster comprising 192 processing cores leading the way towards energy efficient hyperscale computing. The Boston Viridis is a self contained, highly extensible, 48 node ultra-low power ARM® cluster with integral high…
27.8.2012 20:34 by Elbrus Technologies
ARM rides open cloud computing testbed
Rick Merritt July 18, 2012. SAN JOSE – A handful of vendors have created a trial version for ARM-based servers of the OpenStack cloud computing software now available for testing online. The open source offering fills in another small piece of software puzzle for the low power architecture working its way into the data center. ARM server…
27.8.2012 20:23 by Elbrus Technologies
ARM signs 64-bit deal with Cavium
Peter Clarke Aug 1, 2012. LONDON – Fabless networking chip firm Cavium Inc. (San Jose, Calif.) has announced that it is planning to deliver a family of multicore system-chips based on full custom cores designed to implement the 64-bit ARMv8 instruction set architecture from ARM Holdings plc (Cambridge, England), The chips will be aimed at…
27.8.2012 20:09 by Elbrus Technologies
Samsung plans ARM-based CPU for servers, says report
Peter Clarke Aug 6, 2012. LONDON – Samsung Electronics Co. Ltd. is planning to introduce an ARM-based CPU for server applications in 2014, according to a Seoul Economic Daily report in Korean. Intel currently holds 90 percent of the market for server processors, the report said. Samsung is planning to introduce a very low-power processor…
14.7.2012 12:33 by Konstantin Trushkin
ARM and X86 Could Coexist in Data Centers, Says Calxeda
Jun 19, 2012 ARM processors could potentially coexist with x86 processors from Intel or Advanced Micro Devices in server environments, with the use case being similar to CPUs and graphics processors in some supercomputers today, chip maker Calxeda said on Monday. In hybrid server environments x86 processors could do the main processing, while…
14.7.2012 12:17 by Konstantin Trushkin
ARM: Two Licenses for Server Processors Signed
ARM Signs ARMv8/Atlas, Cortex-A15 Licenses for Server Chips April 23, 2012 ARM Holdings, a leading developer of microprocessor technologies for low-power applications, said late on Monday that it has signed two licenses for its intellectual property for use in servers. One undisclosed company has licensed ARMv8-based 64-bit code-named Atlas…
14.7.2012 12:08 by Konstantin Trushkin
ARM Will Impact Servers in 2014, CEO Says
Jan 18, 2012 ARM hopes for a serious impact on the server market starting in 2014 when its 64-bit processor design reaches the market, CEO Warren East said. Server makers have announced experimental systems with low-power ARM processors, which is a big confidence booster for the company, East said during an interview at the Consumer Electronics…
14.7.2012 11:08 by Konstantin Trushkin
Copper enables the ARM server ecosystem
Dell drives innovation for the ARM server ecosystem Enterprises that run large web, cloud and big data environments are constantly seeking new technology to gain competitive advantage and reduce operations cost. This focus is motivating a dramatic interest in ARM-based server technologies as a way to meet these requirements. What is ARM? An…
Skolkovo (innovation center) [Wikipedia, December 13, 2012]
The Skolkovo Innovation Center (the "Russian Silicon Valley") is a modern scientific and technological innovation complex for the development and commercialization of new technologies, under construction in the Moscow region. It is the first science city in post-Soviet Russia to be built from scratch. The complex will provide special economic conditions for companies working in priority sectors of the modernization of the Russian economy: telecommunications and space, biomedical technologies, energy efficiency, information technologies, and nuclear technologies. Federal Law of the Russian Federation No. 244-FZ "On the Skolkovo Innovation Center" was signed by the President of the Russian Federation, D. A. Medvedev, on September 28, 2010.
The complex was originally located on the territory of the Novoivanovskoye urban settlement, near the village of Skolkovo, in the eastern part of the Odintsovsky District of Moscow Oblast, west of the MKAD (Moscow Ring Road) on Skolkovskoye Highway. The territory of the Skolkovo Innovation Center became part of Moscow (Mozhaysky District of the Western Administrative Okrug) on July 1, 2012.
About 21,000 people will live on the territory of roughly 400 hectares, and another 21,000 will commute to the innovation center for work every day. The first building, the Hypercube, is already complete. The first-stage facilities of the "innograd" will be commissioned by 2014, and construction will be fully completed by 2020.
Information and computer technologies cluster
The largest Skolkovo cluster is the information and computer technologies cluster. 209 companies had already become part of the IT cluster (as of August 15, 2012).
Cluster participants are working on a new generation of multimedia search engines and on effective information security systems. Innovative IT solutions are being actively introduced in education and healthcare. Projects are under way to create new technologies for transmitting (optoinformatics, photonics) and storing information. Mobile applications and analytical software, including software for the financial and banking sectors, are being developed. The design of wireless sensor networks is another important area of activity for the cluster's member companies.
One of the most important elements of Skolkovo's activity is international cooperation. The project's partners include research centers, universities and large international corporations. Most of the foreign companies plan to locate their centers in Skolkovo in the near future.
- Finland: Nokia Siemens Networks.
- Germany: Siemens, SAP.
- Switzerland: the Swiss technology park Technopark Zurich.
- United States: Microsoft, Boeing, Intel, Cisco, Dow Chemical, IBM.
- Sweden: Ericsson.
- France: Alstom.
- Netherlands: EADS.
- Austria: an agreement was signed in Vienna by Vekselberg and Doris Bures, Austria's Minister of Transport, Innovation and Technology, providing for support of Russian and Austrian companies specializing in research and in the development of technologies and innovations.
- India: a memorandum was signed between the Skolkovo Foundation and Tata Group on the possible involvement of the Indian company Tata Sons Limited in projects based at Skolkovo in areas such as communications and information technology, engineering, chemistry and energy.
- Italy: agreements were reached on the mutual exchange of students between universities of the two countries. Italian professors and lecturers will also be invited to give lectures at Russian universities and at Skolkovo universities, and to develop scientific and educational programs jointly.
- South Korea: a memorandum of understanding was signed by Vekselberg and the president of the Electronics and Telecommunications Research Institute of the Republic of Korea.
Lack of demand for innovation
In the opinion of Yuri Ammosov, scientific director of the Innovation Institute at MIPT, as long as there is no demand for innovation in Russia, the innovations created in the "silicon valley" will not be able to put the Russian economy on an innovation-driven path of development. Igor Nikolaev of the company FBK holds the same position.
Some critics believe that Russian companies are not concerned with buying and adopting new technologies because they aim at high margins rather than at growth in turnover: "Competition is not for the consumer but for access to resources, and until the situation changes there will be no demand for innovation."
- The total number of project residents as of August 2012 was 583 companies.
- Since the Foundation began operating, 105 grants have been approved for a total of 6,397 million rubles [$200M as of August, 2012], including 22 grants for 597 million rubles [$18.7M as of August, 2012] in the period from January 1 to April 30, 2012.
Commercialization of research results
- Creation of a prototype shunting diesel locomotive with an asynchronous intelligent hybrid drive, SinaraHybrid (TEM-9N). Grant amount: 35 million rubles. Planned sales: 8.4 billion rubles.
- Creation of the world's first interactive screenless (air) display, Displair. A beta version has been developed so far. Start of sales: end of 2012.
December 13 Report:
– Intel’s next-gen SoC manufacturing process will be able to deliver the next Bay Trail Atom only for 2014 products (with higher end Haswell for H2 2013), and it is just a 26nm process in terminology used by the foundry industry not a 22nm one touted by Intel
Lesson from that: Intel may speak about its “22 nm SoC process”, but given the late entry of its 32nm SoC process Atom product (Clover Trail), it is better to assume that the new process, with Windows 8 tablets based on it, will affect only the 2014 tablet market, not earlier. This is what the latest leaks suggest as well. Meanwhile, expect a low-power Haswell ULT based tablet PC push in H2 2013, as already described in my Intel Haswell: “Mobile computing is not limited to tiny, low-performing devices” [Nov 15 – Dec 11, 2012] post. As for next year, the real question is: Can VIA Technologies save the mobile computing future of the x86 (x64) legacy platform? [this same blog of mine, Nov 23, 2012] For this, watch what Allwinner vis-à-vis HTC on 2013 International CES [this same blog of mine, Dec 11, 2012] could bring in that respect: something much more than what is described in Allwinner A31 SoC is here with products and the A20 SoC is coming [USD 99 Allwinner blog of mine, Dec 10, 2012] or in $99 Android 4.0.3 7” IPS tablet with an Allwinner SoC capable of 2160p Quad HD and built-in HDMI–another inflection point, from China again [this same blog, Dec 3, 2012].
– end of life of planar transistor and need to move to FinFET, but meanwhile FD-SOI to the rescue
– ARM Physical IP division via its upcoming IP is preparing with its foundry partners (TSMC, GLOBALFOUNDRIES and Samsung) an easier transition to FinFET
September 27 report:
– TSMC’s View of the Semiconductor IP Ecosystem
– Overall semiconductor IP market overview
– The CEVA case
– When sticking with the “Goliath”: ARM Holdings Plc
– When sticking with a “David”: CAST Inc.
Note: I am not discussing the most important development, the 64-bit ARM introductions, at all, as I will devote a separate composite trend-tracking post on this blog to it.
Warning: These two reports are rather comprehensive and extensive on the given subject. When you have read them through, your reward will be a deep and wide-ranging understanding of this highly topical issue, which is key to understanding the upcoming, very dramatic changes in the further development of the whole ICT industry. To illustrate only some of the most related topics, here is a copy of the tags for this post:
14 nm, 14nm, 20 nm, 20nm, 22 nm, 22nm, 28 nm, 28nm, 3D devices, Allwinner, AndesCore, ARM Artisan IP, ARM Holdings, ARM Physical IP division, Artisan Physical IP Platform, Atom, BA22-AP, Bay Trail, Beyond BA22, big.LITTLE Processing, bulk CMOS, CAST Inc., CAST IP, CEVA, choice IP partner, Cortex A15, Cortex-A7, EnSilica eSi-3250, Fastec Imaging Corporation, Fastec TS3, FD-SOI, finFET, foundries, foundry and IP business model, foundry business, Freescale, Freescale ColdFire, general-purpose foundry business, GlobalFoundries, Haswell, Haswell-ULT, in-house IP blocks, inflection points, Intel, Intellectual Property, interface products, Internet of Things, IOT, IP suppliers, Kinetis, LEON3, licensable IP blocks, Lincroft, logic products, mainstream CMOS, Mali, MarketsandMarkets, MediaTek, memory compilers, MIPS32, mobile computing, Motomic, MT6588, MT6589, OpenRISC, planar transistor, POP, prime IP partners, Processor Optimization Pack, reusable subsystems, Samsung, semiconductor design, semiconductor intellectual property market, semiconductor IP, semiconductor IP ecosystem, semiconductor IP market, semiconductor IP revenue, silicon IP market, SoC manufacturing process, SoC process, Sodaville, SOI, standard cells, standard industry IP blocks, STMicroelectronics, system IP, tablet PC, transistor designs, Tri-Gate, Tri-Gate transistor, TSMC, TSMC IP Alliance, TSMC IP portfolio, TSMC Soft-IP Alliance, UMC, VIA Technologies, Z670
December 13 Report
– Intel’s next-gen SoC manufacturing process will be able to deliver the next Atom only for 2014 products (with higher end Haswell for H2 2013), and it is just a 26nm process in terminology used by the foundry industry not a 22nm one touted by Intel
Intel progressing in development of 14nm technology, says CTO [DIGITIMES, Dec 5, 2011]
Intel CTO Justin Rattner on December 4 said that Intel’s development of 14nm technology is on schedule with volume production to kick off in one to two years and development of 18-inch wafers is under way through cooperation with partners.
Rattner also noted that Intel’s aggressiveness over technology advancement will allow Moore’s Law to extend for another 10 years.
At the end of 2013, Intel will enter the generation of 14nm CPUs (P1272) and SoCs (1273), while expanding its investments at its D1X Fab in Oregon and Fab 42 in Arizona in the US, and at Fab 24 in Ireland, and will gradually enter the 10nm, 7nm and 5nm process generations starting in 2015.
As for Intel’s competitors, Samsung is already set to enter 20nm in 2013 and is already working on its 14nm node, while Taiwan Semiconductor Manufacturing Company’s (TSMC) 20nm process [planar, i.e. bulk CMOS, see below] will enter small volume production in the second half of 2013, when production of the first 3D-based FPGA chips is also to start.
Globalfoundries has previously announced its 14nm FinFET process will start pilot production at the end of 2013 and enter mass production in 2014.
As for 18-inch wafers, Intel has invested in Holland-based ASML for its EUV technology, and related technologies are expected to start entering production in 2017.
Intel Has No Process Advantage In Mobile, says ARM CEO [Mannerisms on Electronics Weekly, Oct 24, 2012]
Intel has no advantage in IC manufacturing when it comes to manufacturing processes used for mobile ICs, Warren East, CEO of ARM, tells EW.
“This time last year there was a lot of noise from the Intel camp about their manufacturing superiority,” says East, “we’re sceptical about this because, while the ARM ecosystem was shipping on 28nm, Intel was shipping on 32nm. So I don’t see where they’re ahead.”
Furthermore, with the foundries accelerating their process development timescales, it looks increasingly unlikely that Intel will be able to find any advantage on mobile process technology in the future.
“We’re supporting all the independent foundries,” says East. That includes 20nm planar bulk CMOS and 16nm finfet at TSMC; 20nm planar bulk CMOS and 14nm finfet at Samsung and 20nm planar bulk CMOS, 20nm FD-SOI and 14nm finfet at Globalfoundries.
It gives the ARM ecosystem a formidable array of processes to choose from. “I’m no better equipped to judge which of these processes will be more successful than anyone else,” says East, “our approach is to be process agnostic.”
The important thing is that the foundries’ process roadmap is on track to intersect Intel’s at 14nm.
14nm will be the first process at which Intel intends to put mobile SOCs to the front of the node i.e. putting them among the first ICs to be made on a new process.
Asked if the foundries were prepping their next generation processes with the intention of putting mobile SOC at the front of the node, East replies: “That’s the information we’re seeing from our foundry partners.”
Globalfoundries intends to have 14nm finfet in volume manufacturing in 2014, the same timescale as Intel has for introducing 14nm finfet manufacturing.
In fact, GF’s 14nm process may have smaller features than Intel’s 14nm process because, says Mojy Chian, senior vp at Globalfoundries, “Intel’s terminology doesn’t typically correlate with the terminology used by the foundry industry. For instance Intel’s 22nm in terms of the back-end metallisation is similar to the foundry industry’s 28nm. The design rules and pitch for Intel’s 22nm are very similar to those for foundries’ 28nm processes.”
Jean-Marc Chery, CTO of STMicroelectronics, points out that the drawn gate length on Intel’s “22nm” process is actually 26nm.
Furthermore, Intel’s triangular fins, which degrade the advantages of finfet processing, could underperform GF’s rectangular fins, which optimise the finfet advantage.
At the front of the GF 14nm finfet node will be mobile SOCs, says Chian. GF has been working with ARM since 2009 to optimise its processes for ARM-based SOCs.
At TSMC the first tape-out on its 16nm finfet process is expected at the end of next year. That test chip will be based on ARM’s 64-bit V8 processor.
Using an ARM processor to validate its 16-nm finfet process should give TSMC’s ARM-based SOC customers great confidence.
Asked about the effects of finfets on ARM-based SOCs, East replies: “There’s no rocket science in what you get out of it. The question is does it deliver the benefits at an acceptable cost? You don’t get something for nothing. How much does it cost to manufacture? How good is the yield? And that, of course, affects cost.”
And so on goes Intel beating its head against the wall to get into the low-margin mobile business.
Recently Intel said it expected its Q4 gross margin to drop six percentage points, from Q3’s 63% to 57%. Shock, horror, said the analysts.
But if Intel succeeds in the mobile business, its gross margin will drop a lot more than that.
It’s a funny old world.
The Truth About Intel [Mannerisms on Electronics Weekly, Dec 5, 2012]
The darndest things are being said about Intel. The departure of its CEO is unexplained though I heard one person say it was voluntary.
Some people think Apple will put x86 in the iPad.
Others think Apple will drop x86 from iMacs so as to unify its processors across Phone, Pad and Mac.
Sure as eggs are eggs, both can’t happen.
Some think Intel is going to become a foundry in a major way starting with Apple’s business – though it’s said the production cost of an Intel wafer is 3x that of a TSMC wafer.
Others say Intel may make wafers for a few customers but will not enter an industry servicing thousands of customers with hundreds of thousands of mask-sets.
Intel is to borrow $6 billion to buy its own shares, something it has been doing for some time. I am too financially unsophisticated to understand why it does this but, even before this latest borrowing, Intel’s debt was already pretty high at over $7 billion and its cash rather low – for a cash generative, capex-gobbling company – at $10.5 billion.
The divi is generous – but if the purpose of the generosity is to keep the share price up, then the generosity hasn’t worked – Intel’s share price is under $20, unchanged in a decade.
The strategy of getting x86 into mobile phones seems mistimed when Apple and Samsung and now LG are designing their own mobile phone processors. This morning Samsung said it will start mass-producing its own-brand 28nm processors for mobile devices early in 2013.
Intel’s fab situation at 22nm looks tough with 50% utilisation. A $500 million charge for this is expected to be taken in Q4.
Intel’s claim to have a manufacturing advantage looks unconvincing when its 22nm process turns out to have a drawn gate length of 26nm – virtually the same as volume processes at leading foundries.
Where it matters, i.e. in the mobile market, Intel has no process advantage at all because Intel hasn’t yet put its mobile SOCs on its latest process at the start of a node. Intel’s mobile SOCs won’t enjoy early access to a new process node until the 14nm generation.
And was finfet the right bet? 20nm planar may still be made to work, while FD-SOI could turn out to be a better route than finfet.
Meanwhile CEO Paul Otellini won the 2012 Open-Mouth-Insert-Foot Award with some spectacular boo-boos:
- Saying Windows 8 wasn’t ready just before its launch, provoked Microsoft’s riposte that Intel’s power management software wasn’t ready for the launch of Surface, Microsoft’s Windows 8 tablet.
- And endorsing Governor Mitt Romney in the recent US presidential elections probably irked the White House just as Otellini was earning some brownie points by sitting on a Presidential committee. They were much needed brownie points after Intel’s pasting from the FTC for ‘stifling innovation.’
And all the while and worst of all, the PC industry starts to contract and Intel has won few slots in the successor to the PC industry – the mobile device industry.
All in all a pretty rotten year for Intel, despite taking in over $50 billion in revenues and earning over $12 billion in profits.
Even silver linings can have clouds.
So the war is on as per: IBM, Intel face off at 22 nm [EE Times, Dec 10, 2012]
SAN FRANCISCO – Intel and IBM went head-to-head with their latest 22-nm technologies in back-to-back papers at the International Electron Devices Meeting (IEDM) here Monday (Dec. 10). Separately, a top Intel fab executive commented on increasing wafer costs and the company’s foundry business.
IBM said it is prototyping server processors in a new 3-D ready, 22-nm process technology it hopes will deliver 25 to 35 percent boosts over its 32-nm node. Intel retains an edge with several 22-nm chips already in volume production, and disclosure at IEDM of a variant of the process for SoCs for a wide range of applications.
The Intel paper showed support for “high drive current across the spectrum of leakage and a full suite of SoC tools,” Mark Bohr, head of Intel’s process technology development group, said in a brief interview. The process is geared for a much wider array of designs than that of IBM, he added.
Bohr said Intel’s 22-nm FinFET process is cost effective, contradicting a report that it is 30 to 40 percent more expensive than TSMC’s 28-nm planar process. The addition of FinFET adds only 3 percent to the cost of the process. Its 80-nm minimum feature sizes can be made with a single pass of 193-nm lithography tools, making it cost effective.
Projections from an IMEC keynote that 14-nm wafers will be 90 percent more expensive than 28-nm parts due to the lack of EUV lithography are inaccurate, Bohr asserted. The cost increase for 14-nm wafers at Intel “is nowhere near that,” he said.
“Cost per wafer has always gone up marginally each generation, somewhat more so in recent generations, but that’s more than offset by increases in transistor density so that the cost per transistor continues to go down at 14 nm,” Bohr said.
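Bohr's argument is simple arithmetic: cost per transistor is wafer cost divided by transistors per wafer, so it falls whenever density grows faster than wafer cost. Below is a minimal Python sketch of that relationship. The 30 percent wafer cost increase and the 1.9x density gain are hypothetical placeholders chosen for illustration, not figures from Intel or IMEC, and the function name is an assumption of this sketch.

```python
def relative_cost_per_transistor(wafer_cost_ratio: float, density_ratio: float) -> float:
    """Cost per transistor at the new node relative to the old node.

    wafer_cost_ratio: new wafer cost / old wafer cost
    density_ratio:    new transistors per wafer / old transistors per wafer
    """
    if density_ratio <= 0:
        raise ValueError("density_ratio must be positive")
    return wafer_cost_ratio / density_ratio


# Hypothetical inputs for illustration only (not Intel or IMEC data).
wafer_cost_ratio = 1.30  # assumed: the wafer is 30% more expensive
density_ratio = 1.90     # assumed: 1.9x more transistors per wafer

ratio = relative_cost_per_transistor(wafer_cost_ratio, density_ratio)
print(f"cost per transistor: {ratio:.2f}x the previous node")
```

The break-even point is where the two ratios are equal: a wafer that costs 90 percent more, as in the IMEC projection, would need more than a 1.9x density gain before cost per transistor starts to fall.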
Separately, Bohr said Intel does have a growing foundry business that may include some higher volume applications than its current announced customers like FPGA startup Achronix. However, “we don’t intend to be in the general-purpose foundry business…[and] I don’t think the [foundry] volumes ever will be huge” for Intel, he said.
Intel’s paper laid out characteristics of Intel’s 22-nm process variation for SoCs (see chart below). It outperforms Intel’s 32-nm planar process by 20 to 65 percent and covers four orders of magnitude in leakage current, said co-author C.H. Jan.
The process provides 51 to 56 percent improvements in high voltage performance used for fast interfaces such as Ethernet, HDMI and PCI Express. That’s more than twice the 20 percent boost typical in this area for a new Intel node, Jan said.
In addition, analog performance went up three-fold after declines in the past three nodes. Intel offers a small library of analog circuits tailored to the process including precision resistors, metal-in-metal capacitors and high Q inductors.
The process supports high and standard performance options as well as low and ultra low power ones. It also includes SRAM designs optimized for density, power and performance some of which now hit 2.6 GHz at 1V, up from 1.8 GHz at 32 nm.
Finally, Intel created two new transistor designs specifically for the 22-nm SoC variant. One is focused on low power and the other on high voltage for mixed-signal and analog circuits (see chart above).
For its part, IBM described its 22-nm process using partially depleted silicon-on-insulator. IBM “has prototyped a number of server processors” in the node that achieve latency below 1.5 ns and 750 MHz random clock cycles, said IBM researcher S. Narasimha.
Narasimha declined to give specifics of what IBM might achieve with the 22-nm node. However he did say the goal was to provide 25 to 35 percent boosts of the previous node which delivered server processors running up to 5.5 GHz and others with up to 80 Mbytes embedded DRAM.
IBM created an SRAM cell that measures 0.026 µm² using the process. It also power supplies at 1.2V across a 550 mm² die area, he said.
The process provides up to 15 levels of metal. The lowest five levels use 80-nm features, similar to the Intel process, and the top two levels support through-silicon vias for 3-D stacks with memory chips.
IBM will deliver a separate paper Wednesday on its 3-D stacking work.
Before that there was: Intel describes 22-nm SoC process, not chips [EE Times, Sept 13, 2012]
Intel provided the first look at the system-on-chip variant of its 22-nm process technology in a talk at the Intel Developer Forum here Thursday (Sept. 13). However, it declined to provide details on the Atom-based SoCs for tablets and smartphones that will be made in that process.
“It’s fair to say Intel didn’t have much of a focus four or five years ago on SoCs, but that’s changed,” said Mark Bohr, director of Intel’s technology and manufacturing group in a process technology talk. “The success of Medfield [Intel’s 32-nm smartphone platform] shows we are learning to do it right, and I think we will have a technology advantage at 22 nm,” he said.
Intel showed at IDF six smartphones and four Windows 8 tablets using the Medfield SoC, made in an SoC variant of its 32-nm process. “There’s a lot more in the pipeline,” said Ticky Thakkar, a lead Atom designer in a separate talk on the mobile chips.
The company is already shipping to OEMs a 2-GHz version of Clover Trail, a follow-on 32-nm dual-core processor with boosted graphics. A 1.8-GHz version for tablets is also in the works.
Next up is Bay Trail, Intel’s first 22-nm SoC for tablets and smartphones, expected to debut at IDF Beijing [April 10-11, 2013 as per the IDF page of Intel]. “You’ll have to wait until next year to hear about it,” said Thakkar.
In a separate talk, Bohr described P1271, the 22-nm SoC process to be used for Bay Trail. It differs from the 22-nm CPU process now used for Intel’s Ivy Bridge processors by offering lower leakage logic transistors, higher voltage I/O transistors, denser upper layer interconnects and a set of precision resistors, capacitors and inductors.
“It’s not one set of features, but a menu of feature options—transistors, I/O, interconnects, passive elements and embedded memory,” Bohr said. “The [SoC] transistors go down to much lower leakage levels, but give up some performance,” he said.
The process has significantly better analog characteristics than Intel’s current 32-nm planar process. Designs make heavy use of 80-nm pitch features in lower metal layers, because they are the smallest features Intel can make at 22 nm without needing double patterning, he added.
Intel is running the process at three fabs, two in the U.S. and one in Israel. It will ramp soon in two other fabs.
Reminders: Silicon Technology for 32 nm and Beyond System-on-Chip Products [IDF 2009 presentation by Mark Bohr, Sept 23, 2009]
while the first SoC product was the Sodaville which had no real market success (even specs are not listed on the ark.intel.com), and as such was not continued:
– Intel Unveils 45nm System-on-Chip for Internet TV [press release, Sept 24, 2009]
Intel Corporation today unveiled the Intel® Atom™ processor CE4100, the newest System-on-Chip (SoC) in a family of media processors designed to bring Internet content and services to digital TVs, DVD players and advanced set-top boxes.
The CE4100 processor, formerly codenamed “Sodaville,” is the first 45nm-manufactured consumer electronics (CE) SoC based on Intel architecture. It supports Internet and broadcast applications on one chip, and has the processing power and audio/video components necessary to run rich media applications such as 3-D graphics.
Intel® Atom™ Processor CE4100
The CE4100 processor can deliver speeds up to 1.2GHz while offering lower power and a small footprint to help decrease system costs. It is backward compatible with the Intel® Media Processor CE 3100 and features Intel® Precision View Technology, a display processing engine to support high-definition picture quality and Intel® Media Play Technology for seamless audio and video. It also supports hardware decode of up to two 1080p video streams and advanced 3-D graphics and audio standards. To provide OEMs flexibility in their product offerings, new features were added such as hardware decode for MPEG4 video that is ready for DivX* Home Theater 3.0 certification, an integrated NAND flash controller, support for both DDR2 and DDR3 memory and 512K L2 cache. The CE SoC contains a display processor, graphics processor, video display controller, transport processor, a dedicated security processor and general I/O including SATA-300 and USB 2.0.
Lincroft is mentioned in my Windows 7 tablets/slates with Oak Trail Atom SoC in December [Nov 1-24, 2010] post as:
Intel “is aiming to mass produce its Oak Trail platform for its Sleek Netbook segment targeting the tablet PC market in December 2010. The Oak Trail platform is a combination of Intel’s Lincroft (Atom Z6xx series) processor with Whitney Point chipset.”
The Oak Trail platform will sell at about US$25 with MeeGo [which was terminated as Nokia exited that joint effort 3 months later], and the price for Oak Trail and Microsoft’s Windows 7 will be higher.
so it was Intel’s first attempt to compete against the ARM-based tablet business, including the already successful iPad. As such it went nowhere in terms of volumes. An adjustment followed early, as noted in my Intel: accelerated Atom SoC roadmap down to 22nm in 2 years and a “new netbook experience” for tablet/mobile PC market [April 17, 2012] post, despite the fact that products based on the Z670 Atom from Lenovo and Fujitsu, as the big names, and Evolve, Motion Computing, Razer and Viliv, as much lesser names, appeared on the market from April 2011 on (you can find information about them in the post itself). The price was too high: e.g. $729 for the Evolve III Maestro C.
The next Atom, based on Intel’s 32nm SoC process, in fact appeared just recently, first in the Acer Iconia W510 (see Acer Iconia W510: Windows 8 Clover Trail (Intel Z2760) hybrid tablets from OEMs [Oct 28, 2012]), priced a little lower, from $499 and up, which is still overpriced relative to current 10” Android tablets. Moreover, it became available only in the second half of November and appeared on the Microsoft store to celebrate Cyber Monday (Nov 26) discounted to $399, which is the only competitive price. Now it is back to $499.
Lesson: Intel may speak about its “22 nm SoC process” but given the late entry of its 32nm SoC process Atom product (Clover Trail) it would be better to assume that Windows 8 tablets based on the 22 nm process will affect only the 2014 tablet market, not earlier. This is what the latest leaks are suggesting as well. Meanwhile expect a low-power Haswell ULT based tablet PC push in H2 2013, as described already in my Intel Haswell: “Mobile computing is not limited to tiny, low-performing devices” [Nov 15 – Dec 11, 2012] post. As for next year, the real question is: Can VIA Technologies save the mobile computing future of the x86 (x64) legacy platform? [this same blog of mine, Nov 23, 2012] For this, watch what Allwinner vis-à-vis HTC on 2013 International CES [this same blog of mine, Dec 11, 2012] could bring in that respect, something much more than what is described in Allwinner A31 SoC is here with products and the A20 SoC is coming [USD 99 Allwinner blog of mine, Dec 10, 2012] or in $99 Android 4.0.3 7” IPS tablet with an Allwinner SoC capable of 2160p Quad HD and built-in HDMI–another inflection point, from China again [this same blog, Dec 3, 2012].
End of Reminders
– end of life of planar transistor and need to move to FinFET, but meanwhile FD-SOI to the rescue
FinFETs or FD-SOI? [SemiMD (Semiconductor Manufacturing and Design), Dec 11, 2012]
By Ed Sperling
STMicroelectronics yesterday unveiled the results of its 28nm production silicon chips using fully depleted silicon on insulator technology, which it claims offers a 30% improvement in speed over bulk CMOS while using less power.
The debate over FD-SOI and FinFETs has been notching up over the past few months. While FinFETs and FD-SOI both promise improvements in controlling leakage current, the FinFETs are more difficult to design. FD-SOI uses the same design flow, although it does use a different SPICE model with better characteristics than the one used for bulk CMOS.
ST also used an ultra thin body and box (UTBB) and body biasing to boost performance, according to Joel Hartmann, the company’s executive vice president of front-end manufacturing and process R&D. Hartmann presented his results at an SOI Consortium-sponsored event at the IEDM show last night.
“We are using body bias to boost performance,” Hartmann said. “You can do that with FD-SOI. We also decreased the Vdd of the device by applying body biasing.”
What’s particularly attractive about FD-SOI is that it can be implemented at the 28nm node for a boost in performance and a reduction in power. The mainstream process node right now is 40nm. And while Intel introduced its version of a finFET transistor called Tri-Gate at 22nm, TSMC and GlobalFoundries plan to introduce it at the next node, whether that’s 16nm or 14nm. That leaves companies facing a big decision about whether to move all the way to 16/14nm to reap the lower leakage of finFETs, whether to move to 20nm on bulk, or whether to stay longer at 28nm with FD-SOI.
Hartmann said ST has seen improvements in analog running on FD-SOI, and for memory where the minimum voltage required is lower. He said ST’s road map calls for FD-SOI all the way down to 10nm, with voltages dropping from 0.9v at 28nm to 0.8v at 14nm and 0.7v at 10nm.
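To see why the voltage steps in that road map matter, recall that dynamic (switching) power scales roughly with C·V²·f. The Python sketch below applies that textbook relation to the three Vdd values Hartmann quoted. It assumes switched capacitance and clock frequency stay constant and it ignores leakage, neither of which holds exactly across real nodes; the resulting ratios are therefore illustrative and are not ST's figures.

```python
# Supply voltages from ST's stated FD-SOI road map (node in nm -> Vdd in volts).
vdd_by_node = {28: 0.9, 14: 0.8, 10: 0.7}


def relative_dynamic_power(vdd: float, vdd_ref: float) -> float:
    """Dynamic power relative to a reference supply, using P ~ C * V^2 * f.

    Assumes switched capacitance C and frequency f are unchanged,
    which real node transitions do not guarantee.
    """
    return (vdd / vdd_ref) ** 2


reference = vdd_by_node[28]
for node, vdd in vdd_by_node.items():
    scale = relative_dynamic_power(vdd, reference)
    print(f"{node}nm at {vdd}V: {scale:.2f}x the 28nm dynamic power")
```

Under these assumptions, dropping from 0.9V to 0.8V cuts dynamic power by about a fifth, and going to 0.7V cuts it by about two fifths, before any gains from smaller capacitance are counted.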
One of the sticking points in adopting FD-SOI has been market acceptance. Despite the promise of improved performance and/or lower power, bulk CMOS has been extended using a variety of techniques such as strain engineering and FD-SOI is considered more expensive. At 28nm and beyond, however, bulk has run out of steam, which is why Intel has opted for finFETs.
Still, FinFETs are more difficult to design and manufacture, and they potentially can add significantly to the cost of an SoC. FD-SOI, in contrast, uses the same design tools and reduces the number of masks and metal layers. ST is the first large fab-lite company to adopt FD-SOI and to move beyond just test chips. It remains to be seen which path the rest of the industry takes—and how quickly.
See also: ST’s FD-SOI Tech Available to All Through GF [SemiMD (Semiconductor Manufacturing and Design), Oct 8, 2012]
– ARM Physical IP division via its upcoming IP is preparing with its foundry partners (TSMC, GLOBALFOUNDRIES and Samsung) an easier transition to FinFET
ARM TechCon 2012 Executive Roundtable: Manufacturing [ARMflix YouTube channel, Nov 14, 2012]
Embedded in the beginning of this roundtable video there is a [4:19] minutes long Investing in FinFET Technology Leadership Presented by ARM [ARMflix YouTube channel, Nov 12, 2012] video in which Dr. Rob Aitken, R&D Fellow at ARM, discusses the need for new transistor technologies and how FinFET may be a solution. The embedded video is starting at [00:39] of the roundtable video. From this I will transcribe here the following part showing ARM’s commitment and strategy for FinFET in its Physical IP Division:
[02:30] ARM is taking a leadership position in FinFET IP development to accelerate the availability of FinFET IP in the ARM partnership. We are working closely with foundry partners to develop prototype FinFET physical IP early in the process lifecycle. Using this prototype physical IP, ARM is currently developing two different FinFET test chips, both taping out in Q3 2012. These efforts continue ARM’s commitment to early development of silicon testing to reduce risk and time to market. Through our early engagement and prototyping work we actively provide feedback to our foundry partners to assure that FinFET technology is well suited to the requirements of energy efficient SoCs. ARM is further contributing to the technical community by publicly releasing predictive FinFET transistor models based on the ITRS roadmap and is extending these models to more advanced FinFET designs. Internally we are modeling proprietary foundry technologies in support of the development work on those processes. This is just the beginning of ARM’s commitment to FinFET IP leadership. [03:46]
There is more ARM-specific information about its FinFET efforts in the September 27 report, which is in the following major section. Now some additional items from its foundry partners:
Breathing New Life into the Foundry-Fabless Business Model [ARM’s SoC Design blog, Aug 21, 2012]
Early last week, GLOBALFOUNDRIES jointly announced with ARM another important milestone in our longstanding collaboration to deliver optimized SoC solutions for ARM® processor designs on GLOBALFOUNDRIES’ leading-edge process technology. We’re extending the agreement to include our 20nm planar offering, next-generation 3D FinFET transistor technology, and ARM’s Mali™ GPUs.
Our collaboration with ARM goes back many years, and its evolution parallels some of the critical developments in the larger semiconductor industry during the same timeframe. ….
This early and deep collaboration has resulted in several significant milestones, including the world’s first foundry optimized Cortex-A9 processor, POP™ IP for the Cortex-A9 processor operating at 1.6GHz on our 28nm-SLP technology, and a demonstration of more than 2GHz on our 28nm high-performance technology. This platform builds on the existing ARM Artisan® physical IP platforms for GLOBALFOUNDRIES processes at 65nm, 55nm and 28nm.
Now we are extending this collaboration to include true joint optimization for 20nm technologies and beyond, as well as a new focus on GPUs, which are becoming increasingly important in today’s smart mobile devices. The TQV strategy has already been scaled to 20nm and is an integral part of our process development, with a 20nm test chip implementation currently running through our Fab 8 in Saratoga County, N.Y.
And while we are seeing great dividends from this collaboration, the real hard work is only just beginning. We are now leveraging historical synergies from 28nm and 20nm planar technology to enable a smooth migration to next-generation, three-dimensional FinFET technology. One of the well publicized benefits of FinFET technology is its superior low-power attributes. The intrinsic capability of the 3D transistor to operate at a lower Vdd translates to longer battery life, which is heavily sought after in performance-hungry mobile computing applications. Our collaboration is focused tightly on this sweet spot in the market, where designers are looking for the optimum combination of performance, power-consumption, area, and cost. Our co-development work with ARM will enable a faster time to FinFET SoC solutions for customers using ARM’s next generation of mobile SoC IP for both CPUs and GPUs.
So clearly the foundry-fabless business model is not collapsing, but rather adapting to meet the challenges of today. Success will be a result of much closer joint development at the technology definition level, early engagement at the architectural stage, and a more integrated and cooperative ecosystem – precisely the kind of collaboration that we’re demonstrating with our valued partner ARM.
Guest Partner Blogger:
Mike Noonen is Executive Vice President, Worldwide Marketing and Sales, for GLOBALFOUNDRIES. In this role, he is responsible for global customer relationships as well as all marketing, sales, customer engineering and quality functions.
If interested in the GLOBALFOUNDRIES Fireside Chat mentioned here watch the separate video GLOBALFOUNDRIES Fireside Chat at ARM Techcon 2012 [Charbax YouTube channel, Oct 31, 2012] with the following content:
“The insatiable need for functional and feature integration on to Mobile SoCs, coupled with ever increasing performance demands has challenged the Foundries and Fabless Semiconductor companies alike. While the diminishing geometries of the process technologies have kept pace to address this challenge, the solutions for leakage power dissipation continued to fall behind threatening to thwart the advances in Mobility. The ground-breaking FinFET technology is the right low-power solution and will serve as an inflection point to further enable SoC-level integration and technological advances in this exciting era of Extreme Mobility. The panel will discuss how the next generation of FinFET technology will change the mobile revolution again.”
Dean Freeman, Research VP, Gartner Research
Bruce Kleinman, VP, Product Marketing, GLOBALFOUNDRIES
Subramani Kengeri, Vice President, Technology Architecture Office of the CTO, GLOBALFOUNDRIES
Srinivas Nori, Director, SOC Innovation, GLOBALFOUNDRIES
Dipesh Patel, Deputy General Manager of the Physical IP Division, ARM
TSMC’s information about collaboration with ARM in FinFET space was already included in the second major section (September 27 Report) beginning from ARM and TSMC Collaborate to Optimize Next-Generation 64-bit ARM Processors for FinFET Process Technology [ARM press release, July 23, 2012] part in the text. As an update to that I will include here: TSMC Accelerates finFET Efforts [SemiMD (Semiconductor Manufacturing and Design), Oct 16, 2012]
In response to its foundry rivals, Taiwan Semiconductor Manufacturing Co. Ltd. (TSMC) has updated and accelerated its process roadmap. The world’s largest silicon foundry has accelerated its 16nm finFET efforts by one quarter and added a 10nm finFET technology to the roadmap.
TSMC also plans to take the “modular fin” approach for its 16nm finFET. It is also looking at 450mm fabs at the 10nm node, according to a TSMC executive, who also stressed that collaboration is a key to success. Customers must collaborate earlier in the design cycle and “at a new level,” said Mark Liu, executive vice president and co-chief operating officer at TSMC, during a keynote at the company’s Open Innovation Platform Ecosystem Forum in San Jose, Calif. on Tuesday (Oct. 16). “We need to align strategically.”
At present, TSMC is ramping up its 28nm process technology. The next process on the roadmap, dubbed CLN20, is a 20nm planar technology. The reference flow for CLN20 is ready and the process is due out in 2013.
[See: TSMC 20nm and CoWoS™ Design Infrastructure Ready [TSMC press release, Oct 9, 2012]]
Then, as previously announced, TSMC will enter the finFET transistor era. The company’s initial finFET process, dubbed CLN16FF, is being targeted and branded for the 16nm node. TSMC’s 16nm finFET process is slated for risk production in November of 2013, Liu said. Risk production has been accelerated from February of 2014 to November of 2013.
In an interview after the keynote, Liu said TSMC will take a “modular fin” approach in finFETs. TSMC will marry a 16nm fin with a 20nm backend. “It has 20nm design rules,” he said.
TSMC will also implement a triple-patterning strategy for 16nm finFETs. The company is also keeping its options open. It is exploring 193nm immersion extensions, extreme ultraviolet (EUV) lithography and multi-beam. “At this point, we have both (193nm extensions and EUV) under development,” he said. “Maybe multi-beam will save the day.”
TSMC’s 16nm finFET design solutions, including the EDA tools and IP, will be ready by the first quarter of 2013. “We have pulled in our design enablement solutions,” said Cliff Hou, senior vice president of TSMC, during a separate keynote at the event. The first version of the design solutions, dubbed V0.1, is slated for introduction in January. The second version, V1.0, is due out in October of 2013.
Meanwhile, during his keynote, Liu presented a slide that denoted CLN10FF, which is a second-generation finFET for the 10nm node. TSMC’s 10nm finFET process is expected to move into risk production “close to the end of 2015,” he said.
Also at 10nm, TSMC is looking to enter the 450mm fab era. It is likely TSMC will have a 450mm fab or pilot line in the second phase of 10nm. “There are no show stoppers,” he said. “All of the equipment companies are developing 450mm.”
Other foundries have also accelerated their finFET roadmaps. For example, GlobalFoundries Inc. recently rolled out its finFET technology for the 14nm node. GlobalFoundries is taking a “modular fin” approach with its bulk finFET offering, dubbed 14nm-XM. The 14nm-XM combines a 14nm-class fin with its 20nm back-end-of-line (BEOL) interconnect flow.
By taking the modular approach, the company has accelerated its process roadmap by a year. Early process design kits (PDKs) are available, with customer product tape-outs expected in 2013. Production, which is slated for 2014, will take place within GlobalFoundries’ new 300mm fab in New York.
Another foundry vendor, United Microelectronics Corp. (UMC), is taking a similar modular finFET approach. UMC licensed finFET technology from IBM. Samsung Electronics Co. Ltd. has yet to elaborate on its finFET strategy. Meanwhile, Intel Corp. is already ramping up its 22nm process, which is based on finFET transistors. Intel is providing foundry services for select customers, who plan to ship products based on finFETs.
September 27 Report
In my role, I serve as one of the members of the Global Semiconductor Alliance (GSA) Steering Committee on Intellectual Property, where we work to share best practices and continue to improve the IP ecosystem for the benefit of the entire semiconductor industry. As part of this role, I’ve observed a trend in the news speculating on the future of the foundry and IP industry, and I recently posted my thoughts on the GSA blog site, and I’d like to share them with you here as well.
In 1897, after a journalist erroneously reported the passing of famed author and humorist Mark Twain, Twain replied in his typical wit with the now famous retort: “the rumor of my death has been greatly exaggerated.” Like the then very alive author, recent reports have speculated on the demise of the foundry and IP business model. I similarly think such talk is pure nonsense. Across many metrics the foundry and IP space is alive and well and providing unprecedented capabilities to semiconductor companies. [his factual argumentation for that you can find much below, in the <<sticking with the “Goliath”>> section]
To understand the semiconductor IP ecosystem one should first look at the IP-related efforts of by far the biggest and most influential foundry, TSMC (as its success depends most heavily on a vibrant, high-quality IP ecosystem):
ChipEstimate.com DAC 2012 IP Talks presenter Dan Kochpatcharin on TSMC OIP and IP Quality [chipestimate YouTube channel, June 26, 2012]
There are 41 IP partners in the semiconductor IP specific IP Alliance program of TSMC OIP (Open Innovation Platform alliance ecosystem), and TSMC also has 20-25 IP partners that are directly supported but not part of the IP Alliance program.
Among those, the winners of the 2011 TSMC IP Partner of the Year awards were:
- Interface IP: Synopsys Inc. [US]
- Analog/Mixed Signal IP: Dolphin Integration SA [France]
- Foundation IP [such as basic standard cells, standard I/O, and memory-bit cells]: ARM Ltd. [UK] [ARM Artisan® Physical IP such as: ARM® Artisan Standard Cell Libraries, ARM® Artisan Specialty I/O libraries, ARM® Artisan DDR Interface IP, ARM Artisan Embedded Memory Compilers etc. and Processor Optimization Packs (POPs)]
- Specialty Embedded Memory IP: eMemory Technology Inc. [Taiwan] for second year in a row
Note that the organisations behind such IP excellence are not big at all. Dolphin Integration SA is a 190-person company. eMemory employs around 200 people as per the award news release. While ARM Holdings Plc had 2,253 full-time employees altogether at June 30, 2012, its Physical IP Division (PIPD) accounts for just 11% of overall revenue, so the number of employees there would probably not exceed 300. Artisan Components Inc. (US), acquired by ARM Holdings for not less than 1 billion US$ in Dec 2004 (because of “collaboration between the two companies on ARM’s next-generation MPU core, code-named ‘Tiger’”, which became Cortex-A8 in 2005), had 72 employees in 1997, so this is likely from a historical point of view as well (even considering ARM’s heavy investment later on).
As far as Synopsys is concerned, 9 months ago it had ~6800 employees, but its portfolio is rather large (implementation, verification, IP, manufacturing and FPGA solutions), and in addition to the Interface IP the company has Analog IP and Memories and Logic Libraries as well in the overall DesignWare IP portfolio. To understand that split let’s take the following “Top Interface, Analog, and Embedded Memory IP Vendor” presentation slide from the Synopsys Investor Day 2011 presentation, referring to a Gartner, March 2011 report, which indicates $104.1M interface IP revenue for 2010:
which is ~7.5% of the overall revenue of Synopsys ($1.38B for the fiscal year 2010 ending Oct 31, 2010, when it had 6707 employees), which could mean ~500 employees related to Interface IP activities, taken proportionally to the revenue.
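Both headcount estimates above (ARM's Physical IP Division and Synopsys' interface IP activities) use the same rule of thumb: staff are assumed to scale with revenue share. The short Python sketch below reproduces that arithmetic from the figures quoted in the text. The helper name is hypothetical, the proportionality itself is an assumption rather than a reported fact, and the Synopsys case mixes a calendar-year IP figure with fiscal-year total revenue, so treat the results as rough orders of magnitude.

```python
def proportional_headcount(total_employees: int, revenue_share: float) -> int:
    """Estimate a division's headcount, assuming staff scale with revenue share."""
    return round(total_employees * revenue_share)


# ARM: 2,253 full-time employees, Physical IP Division at about 11% of revenue.
arm_pipd = proportional_headcount(2253, 0.11)

# Synopsys: 6,707 employees, $104.1M interface IP revenue out of $1.38B total.
synopsys_share = 104.1 / 1380.0
synopsys_interface_ip = proportional_headcount(6707, synopsys_share)

print(f"ARM PIPD: about {arm_pipd} employees")
print(f"Synopsys interface IP share: {synopsys_share:.1%}, "
      f"about {synopsys_interface_ip} employees")
```

Both results sit close to the rounded figures used in the text: under 300 for ARM's division and roughly 500 for Synopsys' interface IP.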
And here is the number of titles in TSMC’s IP portfolio, also compared with other foundries:
– TSMC Extends Open Innovation Platform™ [TSMC press release, June 7, 2010]
– TSMC Expands IP Alliance to Include Soft IP [TSMC press release, Oct 5, 2010]
– Atrenta and TSMC IP Quality Initiative Gains Broad Industry Acceptance [Atrenta press release, March 5, 2012]: “10 intellectual property (IP) providers have qualified their soft IP for inclusion in the TSMC 9000 IP library using the Atrenta IP Handoff Kit. Those companies, part of TSMC’s Soft-IP Alliance Program, include Arteris, Inc.; CEVA; Chips&Media, Inc.; Digital Media Professionals Inc. (DMP); Imagination Technologies; Intrinsic-ID; MIPS Technologies, Inc.; Sonics, Inc.; Tensilica, Inc.; and Vivante Corporation. The participating companies are able to provide quantitative information to TSMC’s customers regarding the robustness and completeness of their soft or synthesizable semiconductor IP that is part of the TSMC 9000 IP library.”
– Imagination Technology Forum: Advanced SoC solutions in cooperation with TSMC [detailed DIGITIMES report, June 28, 2012]: “Not only will we be introducing our latest graphics processing IP, we will also talk about video, displays, multi-threaded cores [Meta SoC Processors], and wireless processors [Ensigma Universal Communications Core Processors (UCCPs)]. We hope that industries can further understand that Imagination is a company that provides complete SoC solutions.“
– TSMC Open Innovation Platform® Ecosystem Forum, Technical Presentation Abstracts [TSMC, Oct 18, 2011]
– ARM Physical IP Overview [ARM presentation, Sept 9, 2011]
– Leveraging Advanced Physical IP to Deliver Optimized SoC Implementations at 40nm and below [ARM presentation, Nov 19, 2010] [Meta SoC Processors]
– ARM Announces Processor Optimization Pack [ARM press release, Nov 9, 2010]
ARM today announced the immediate availability of the ARM® Cortex™-A9 Processor Optimization Packs (“POPs”). Processor Optimization Packs leverage ARM Artisan® physical IP to enable customers to achieve technology leading performance or power targets on their Cortex-A9 implementations in the shortest time to market. A silicon-proven POP is available now for TSMC® 40nm G process technology. The Cortex-A9 POP on TSMC 40nm LP process technology will be available to customers in January 2011.
The Cortex-A9 Processor Optimization Packages contain three elements: ARM Artisan optimized logic and memory physical IP for a specific process technology, supported by implementation knowledge and ARM benchmarking. Combined together the POP allows SoC designers to optimize Cortex-A9 designs for maximum performance, lowest power or to develop customized solutions balancing power and performance for their specific application.
– Overall semiconductor IP market overview
The key players listed by the market researcher MarketsandMarkets (with ChipEstimate.com links wherever possible, where “Prime IP Partners” are highlighted in bold) are the following companies:
ARM Holdings Plc (UK)
CEVA Inc. (Israel, Choice IP Partner)
Coreworks S.A. (Portugal), but see Homepage, Technologies, Products, Rapidity
Lattice Semiconductor, but see its IP website
MIPS, Inc., but see Processor Cores, Interconnect IP, and MIPS Alliance
MoSys, Inc., but see unparalleled bandwidth performance for next gen networking systems
Tensilica, Inc. (Choice IP Partner)
Triad Semiconductor, Inc., but see Mixed Signal ASIC, Engagement Model … IP Catalog, ARM Powered VCAs
VeriSilicon, Inc. (Choice IP Partner)
exited: Wipro-NewLogic, Inc., but see RivieraWaves (France) as a successor
ChipEstimate.com Chip Planning Portal Overview
The ChipEstimate.com chip planning portal is an ecosystem comprised of over 200 of the world’s largest semiconductor design and verification IP suppliers and foundries. These companies all share in the common vision of helping the worldwide electronics design community achieve greater profitability and success. To date, a diverse global audience of over 27,000 users has joined the ChipEstimate.com community and has collectively performed over 100,000 chip estimations. ChipEstimate.com is a property of Cadence Design Systems, Inc. (NASDAQ: CDNS), the leader in global electronic-design innovation.
Reasons for missing Coreworks S.A, Lattice Semiconductor, Mentor Graphics, Inc., MIPS, Inc., and MoSys, Inc. on the ChipEstimate.com portal are quite diverse. You can find them via the additional linked explanations, typically marked as “but see”.
Overall the summary of the Semiconductor Intellectual Property Market, Silicon IP Market (2012-2017): Global Forecasts & Analysis [MarketsandMarkets, April 2012] states that:
The growth trend of the Semiconductor IP market revenue can be observed by the CAGRs over various time periods. The CAGR of the Semiconductor IP market from 1997 to 2002 was 17.82% while the value from 2002 to 2007 stood at 11.54%. Post 2007, the market again picked up growth and the forecasted CAGR from 2012 to 2017 is estimated to be 14.47%. In 2012, the global Semiconductor IP market is estimated to be $2.90 billion. The percentage share of Semiconductor IP industry in the global revenue for semiconductors was approximately between 0.3% and 1.0% over the years; stood at 0.71% in 2011, and is estimated to increase to 0.85% by the end of 2012 and 0.99% by the end of 2017.
In the Analyst Briefing Presentation of the same report it is stated that:
Coming to the statistics, in 2011, the global Silicon IP Market stood at $2.25 billion, while the global semiconductor industry revenue was at $315 billion. Both these markets are estimated to reach $2.90 billion and $340 billion respectively by the end of 2012.
which means that while the global semiconductor industry is expected to grow by just 7.9% this year (from $315 billion to $340 billion), the Semiconductor IP market is estimated to grow by 28.9%. So the latter is quite healthy, although still a tiny part of the whole industry.
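The percentages in this comparison follow directly from the four figures quoted from the briefing. A minimal sketch of the arithmetic is given below; the variable and function names are illustrative only, and the 2017 value is merely what the quoted forecast CAGR would imply if applied to the 2012 estimate, not a figure taken from the report.

```python
# Figures quoted from the MarketsandMarkets briefing, in USD billions.
SIP_2011, SIP_2012 = 2.25, 2.90        # semiconductor (silicon) IP market
SEMI_2011, SEMI_2012 = 315.0, 340.0    # whole semiconductor industry
FORECAST_CAGR = 0.1447                 # quoted forecast CAGR for 2012-2017


def growth(old: float, new: float) -> float:
    """Year-over-year growth expressed as a fraction."""
    return new / old - 1.0


def project(value: float, cagr: float, years: int) -> float:
    """Compound a starting value at a constant annual growth rate."""
    return value * (1.0 + cagr) ** years


sip_growth = growth(SIP_2011, SIP_2012)
semi_growth = growth(SEMI_2011, SEMI_2012)
share_2011 = SIP_2011 / SEMI_2011
share_2012 = SIP_2012 / SEMI_2012
# Assumption: the forecast CAGR is applied to the 2012 estimate for 5 years.
sip_2017_implied = project(SIP_2012, FORECAST_CAGR, 5)

if __name__ == "__main__":
    print(f"Semiconductor IP growth 2011-2012: {sip_growth:.1%}")
    print(f"Semiconductor industry growth 2011-2012: {semi_growth:.1%}")
    print(f"IP share of industry revenue 2011: {share_2011:.2%}")
    print(f"IP share of industry revenue 2012: {share_2012:.2%}")
    print(f"Implied 2017 IP market: ${sip_2017_implied:.2f} billion")
```

The two share values can be cross-checked against the percentages quoted from the report summary.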
Gartner presented last year the following, revenue based Semiconductor IP Market view:
Source: Synopsys Investor Day 2011 presentation, referring to a Gartner, March 2011 report
Note that the $231.6M semiconductor IP revenue was just ~15% of the overall CY2010 revenue of Synopsys (estimated at ~$1.5B at most), where Core EDA (Electronic Design Automation) was and is by far the bulk of the revenue: Core EDA revenue was $959M in FY2010 and $980.7M in FY2011. Relative to that, the overall Semiconductor IP segment was and is a double-digit growth area for Synopsys. Since the company is following a strong “M&A strategy to broaden TAM and provide incremental revenue growth” in non-Core EDA areas, its semiconductor IP revenue will probably grow at the same pace in the coming years. It should therefore keep its #2 position in this market, especially as it has almost no direct competitors (only Mentor Graphics IP) among the Top 10 (companies that together hold no less than 71.1% of the market), whereas the strongest competitor of both the #3 Imagination Technologies and the #4 MIPS Technologies is the #1 ARM Holdings.
So overall the market is quite mature, with well-established and strong leaders already holding most of the business. The #1 ARM Holdings also has a strong ecosystem of its own, which provides opportunities for no fewer than 53 small silicon IP vendors outside the Top 10 as well. See: SoC IP [providers in ARM Connected Community Program].
I’ve edited a more descriptive list of that in PDF, which you can download from here. Below I am providing an excerpt from it, with the strongest players in ARM’s own ecosystem, in the sense of relying on ARM’s Artisan Physical IP via the IPNet Partner Program (denoted by +) and/or the TSMC IP Alliance Program (denoted by *):
Analog Bits*: the leading supplier of low-power, customizable analog IP for easy and reliable integration into modern CMOS digital chips. Our product range includes precision clocking macros such as PLL’s & DLL’s, programmable interconnect solutions such as multi-protocol SERDES/PMA and programmable I/O’s as well as specialized memories such as high-speed SRAMs and T-CAMs.
– Low Power Wide Range PLL – Common Platform 32LP
Arteris*: Arteris invented Network on Chip technology, offering the world’s first commercial solution in 2006. Arteris connects the IP blocks in semiconductors from Qualcomm, Samsung, TI, and others, representing over 50 System on Chip devices. … Arteris is a private company backed by a group of international investors including ARM Holdings, Crescendo Ventures, DoCoMo Capital, Qualcomm Incorporated, Synopsys, TVM Capital, and Ventech.
– C2C™ Chip to Chip Link™ Inter-chip Connectivity IP
– FlexNoC Network-on-Chip Interconnect IP
– FlexWay Interconnect IP
Aurora VLSI, Inc. +: provides AMBA specification-based SoC/ASIC IP components, peripherals, subsystems, and platforms. … Aurora provides a full set of popular communications and SoC IP cores for ARM and AMBA Bus-based SoCs.
– AMBA Peripherals- Ethernet, PCI, USB, IEEE1394, memory and flash controllers, interrupt controller, timers, counters, GPIOs, etc
– AMBA SOC Platform (Configurable)
AuthenTec*: a leading provider of mobile and network security. … AuthenTec’s products and technologies provide security on hundreds of millions of devices, and the Company has shipped more than 100 million fingerprint sensors for integration in a wide range of portable electronics including over 15 million mobile phones. Top tier customers include Alcatel-Lucent, Cisco, Fujitsu, HBO, HP, Lenovo, LG, Motorola, Nokia, Orange, Samsung, Sky, and Texas Instruments.
– SafeXcel™ IP-06 KASUMI Crypto Core Family
– SafeXcel™ IP-115 HDCP2 Content Protection Crypto Module
– SafeXcel™ IP-123 Secure Platform Crypto Module
– SafeXcel™ IP-154 Public Key Infrastructure Cores
– SafeXcel™ IP-16 3DES Crypto Core Family
– SafeXcel™ IP-160 MACsec Security Engine w/ Classifiers
– SafeXcel™ IP-18 CAMELLIA Crypto Core Family
– SafeXcel™ IP-197 Inline Security Packet Engine
– SafeXcel™ IP-28: Public Key Accelerator Cores
– SafeXcel™ IP-3X AES Crypto Core Family
– SafeXcel™ IP-46 SNOW 3G Crypto Core Family
– SafeXcel™ IP-48 ZUC Crypto Core Family
– SafeXcel™ IP-57 HASH/HMAC Core Family
– SafeXcel™ IP-60 MACsec Frame Engine
– SafeXcel™ IP-62 MACsec/IPsec GCM Packet Engine
– SafeXcel™ IP-76 True Random Number Generator
– SafeXcel™ IP-97 Look-Aside Security Packet Engine
CEVA, Inc.*: the leading licensor of digital signal processor (DSP) cores, multimedia and storage platforms to leading semiconductor and electronics companies worldwide. … This portfolio includes a family of programmable DSP cores, DSP-based subsystems and application-specific platforms including multimedia, audio, Voice over Packet (VoP), Bluetooth, Serial ATA (SATA) and Serial Attached SCSI (SAS).
– Application Platforms: for Mobile Multimedia Applications
The Only Silicon-proven Programmable Solution Supporting H.264 codec up to D1 resolution! … Complete, Low-Cost Audio Solution … Complete, Single Processor VoIP Solution
– DSP Cores: The CEVA-X family of cores is based on CEVA’s latest pioneering DSP architecture. This architecture offers best-in-class performance, scalability, and lowest cost-of-development for DSP deployment … CEVA-TeakLite Architecture DSP core.
– System Platforms: Broad set of DSP peripherals extendible through APB … tailored for specific cores of the CEVA-X architecture framework … High performance multimedia platform … CEVA-TeakLite Architecture DSP subsystems
Chips&Media,Inc. *: video codec technologies cover the full line-up of video standards such as MPEG-2, MPEG-4, H.263, H.264/AVC and VC-1 from CIF to HD resolution.
– BODA7Series-HD Video Decoder IP
– BODA9Series-Dual HD Video Decoder IP
– CODA7Series-HD Video Codec IP
– CODA9Series-Dual HD Video Codec IP
Denali Software, Inc. +: Databahn™ products provide optimal control and data throughput for external DRAM (DDR2, DDR3, LPDDR1, LPDDR2) and Flash memory devices.
– Databahn NAND Flash Controller
– Databahn(TM) PCI Express Controller IP Core
– Databahn(TM) SDR/DDR1/DDR2/DDR3/LPDDR2 Solutions
eMemory Technology Inc. *: focused on the development of logic embedded non-volatile memory (NVM) such as OTP, MTP, and Flash. eMemory has published 186 patents. There are over 120 companies who have implemented our technologies and IP’s worldwide.
Intrinsic-ID *: semiconductor IP and embedded software products based on Hardware Intrinsic Security. Our solutions revolve around patented Physically Unclonable Function (PUF) technology, where a secret key is extracted like a silicon biometric or fingerprint from silicon hardware directly and only when required.
Attackers have nothing to find because no key is stored nor present in the power down state. … Headquartered in Eindhoven, The Netherlands, Intrinsic-ID was founded in 2008 as a spin-out of Royal Philips Electronics and has been deployed in Philips’ production environment.
– Quiddikey™ in Hardware
– XPM: embedded, one-time programmable (OTP) non-volatile memory (NVM). … Over 70 customers have integrated XPM™ in over 200 designs from 180nm to 40nm. Applications range from a few hundred bits for unique ID to prevent cloning to multiple instances of 1Mb for program code storage.
PLDA, Inc. *: a leading provider of semiconductor intellectual property (IP) specialized in high-speed interconnect protocols and technologies.
– AMBA 2 AHB to PCI Bridge
– AMBA 2 AHB to PCI Express Bridge
– AMBA 2 AHB to USB 3.0 Device
– AMBA 2 AHB to USB 3.0 Host
– AMBA 3 AXI to PCI Express Bridge
– PCI Express IP Core with AXI interface
Rambus Inc. *: one of the world’s premier technology licensing companies specializing in the invention and design of high-speed memory architectures.
– XDR Memory: architecture … proven in high-volume, cost-competitive applications. Operating at 3.2Gbps, XDR DRAM provides 6.4GB/s of peak memory bandwidth with a single, 2-byte wide device.
Renesas Technology America, Inc. *
– Renesas Application Specific Products: SoC Architecture for Multimedia Controller Chip. Features: Multiple ARM 9 cores, Graphic Controller on chip, USB on chip, Memory Card Interface, Standard high-performance MCU peripherals, JTAG. Easy to customize with proven architecture and IP.
Sidense Corp. *: Sidense Corp. provides secure, dense and reliable non-volatile, one-time programmable (OTP) memory IP for use in standard-logic CMOS processes, with no additional masks or process steps required and no impact on product yield. Sidense’s patented one-transistor 1T-Fuse™ architecture provides the industry’s smallest footprint, most reliable and lowest power Logic Non-Volatile Memory IP solution and offers an alternative solution to Flash, mask ROM and eFuse in many applications.
Silicon Image GmbH *+
– Multimedia Platform IP: complete system solutions for Mobile Communication including MPEG-4 Encoding and Decoding for video chat and video conferencing applications. For Multimedia the offering includes solutions for DVD Players and Set Top Boxes. Other leading edge technologies include a broad portfolio of security IPs and IP cores of professional networking applications.
Silicon Interfaces +
– Silicon Cores – Core to the Intelligent Systems(TM): 12+ IP cores targeted to areas such as Networking, Wireless, Communication and Interconnect, and around 5+ Verification IPs using Industry standard Verification Methodology
Sonics, Inc. *+: a pioneer of network-on-chip (NoC) technology and today offers SoC designers the largest portfolio of intelligent, on-chip communications solutions.
– MemMax AMP: an intelligent Dynamic Random Access Memory scheduler designed for use with any AMBA AXI compliant bus fabric and memory controller.
– MemMax Scheduler: an intelligent Dynamic Random Access Memory scheduler designed for use with an OCP compliant memory controller.
– SonicsGN: Sonics’ 4th generation, configurable, on-chip network enabling the design of advanced SoC communications networks using a high-speed scalable fabric topology structure. As the industry’s highest frequency NoC available today, SGN allows SoC designers to deliver high-performance, simultaneous application processing for smart phones, mobile video and tablets.
– SonicsLX: On-chip Network contains a high performance advanced fabric with data flow services for the development of complex SoCs.
– SonicsMX: an actively decoupled, non-blocking, intelligent internal interconnect that enables designers to implement multiprocessor SoC architectures using combinations of similar or heterogeneous processing elements.
– SonicsSX: On-chip Network contains a high performance, advanced fabric and a comprehensive set of data flow services for the development of complex, multicore and multi-subsystem SoCs.
Synopsys *+: world leader in electronic design automation (EDA), supplying the global electronics market with the software, intellectual property (IP) and services used in semiconductor design, verification and manufacturing. … Synopsys is headquartered in Mountain View, California, and has more than 70 offices located throughout North America, Europe, Japan, Asia and India.
– DesignWare Cores: Synopsys is a leading provider of high-quality, silicon-proven interface and analog IP solutions for system-on-chip designs. Synopsys’ broad IP portfolio delivers complete interface IP solutions consisting of controllers, PHY and verification IP for widely used protocols such as USB, PCI Express, DDR, SATA, Ethernet, HDMI and MIPI IP including 3G DigRF, CSI-2 and D-PHY. The analog IP family includes Analog-to-Digital Converters, Digital-to-Analog Converters, Audio Codecs, Video Analog Front-Ends, Touch Screen Controllers and more.
– DesignWare System-Level Library: a portfolio of tool-independent transaction-level models (TLMs) for the creation of virtual platforms. Virtual platforms are fully functional software models of complete embedded systems enabling pre-silicon software development and software-driven system validation.
As one can see, there are 18 silicon IP vendors with very strong (Artisan and/or TSMC IP Alliance) ties in ARM’s own ecosystem, and 5 of them (AuthenTec, CEVA, Rambus, Silicon Image and Synopsys) are in the Top 10 group of providers.
With that we could finish the overall semiconductor IP market overview.
– The CEVA case
A lot of Silicon IP vendors are highly focussed. Probably the most successful among them is CEVA Inc. (Israel, Choice IP Partner):
CEVA DSP – Company Introduction [cevadsp YouTube channel, Aug 4, 2011]
CEVA, Inc. Announces Second Quarter 2012 Financial Results [CEVA press release, July 31, 2012]
… Total revenue for the second quarter of 2012 was $13.6 million, a decrease of 6% compared to $14.4 million for the second quarter of 2011. Licensing revenue for the second quarter of 2012 was $5.4 million, an increase of 3% compared to $5.2 million reported for the second quarter of 2011. Royalty revenue for the second quarter of 2012 was $7.6 million, compared to $8.3 million reported for the second quarter of 2011. Revenue from services for the second quarter of 2012 was $0.6 million, compared to $0.9 million reported for the second quarter of 2011.
Gideon Wertheizer, Chief Executive Officer, stated: “The second quarter was the strongest licensing quarter in more than three and a half years, driven by a strategic licensing agreement with a tier 1 handset OEM for a range of LTE handsets and the first agreement for our newest DSP, the CEVA-XC4000 for LTE- Advanced. These latest agreements bring the total LTE design wins for CEVA DSPs to date to more than 20, and form the foundation for future royalty growth. Finally, while the competitive 2G market is experiencing pricing pressure, our volume growth in the lucrative 3G market during the quarter significantly outpaced that of the overall 3G space, as low and mid-range 3G smartphones gain traction.” …
About CEVA, Inc.
CEVA is the world’s leading licensor of silicon intellectual property (SIP) DSP cores and platform solutions for the mobile, portable and consumer electronics markets. CEVA’s IP portfolio includes comprehensive technologies for cellular baseband (2G / 3G / 4G), multimedia (HD video, Image Signal Processing (ISP) and HD audio), voice over packet (VoP), Bluetooth, Serial Attached SCSI (SAS) and Serial ATA (SATA). In 2011, CEVA’s IP was shipped in over 1 billion devices and powers handsets from every top handset OEM, including HTC, Huawei, LG, Motorola, Nokia, Samsung, Sony and ZTE. Today, more than 40% of handsets shipped worldwide are powered by a CEVA DSP core. For more information, visit www.ceva-dsp.com. Follow CEVA on twitter at www.twitter.com/cevadsp.
LTE-A Ref.Architecture [part of the Ceva-XC4000 product page, Feb 20, 2012]
CEVA-XC4000 multi-mode LTE-Advanced reference architecture
Based on multiple CEVA-XC4000 processors, CEVA offers a complete multimode LTE-Advanced reference architecture targeting LTE-A Rel-10 Cat-7. The reference architecture was developed together with mimoOn, a member of the CEVA-XCnet partner program and addresses the entire PHY layer requirements.
Reference architecture highlights:
A complete LTE PHY system architecture addressing the entire PHY layer requirements of multiple standards in software including: TD-LTE-A, HSPA+ Rel-9, TD-SCDMA, WiMAX and more
Built around CEVA-XC4000 processors with minimal complementary hardware accelerators
Offers industry’s most competitive SDR platform in terms of both cost and power consumption
Supports maximal throughput of LTE-A Rel-10 CAT-7 UE FDD (DL: 300Mbps, UL: 100Mbps) with up to 8×4 MIMO and carrier aggregation of up to two carrier components to a total of 40MHz channel
High operating margins enabling customer differentiation by software
[See also the related press release, as well as the CEVA Continues to Dominate DSP IP Market with 90% Market Share [May 14, 2012] press release]
CEVA is also a best case for the trend determining the future of the semiconductor IP ecosystem, especially with the above “small print” example of a reusable LTE Advanced subsystem. More about the formation of such a trend you can find in the <<sticking with the “Goliath”>> section below.
– When sticking with the “Goliath”: ARM Holdings Plc
Then there are a number of vendors with an ecosystem of surrounding IP partners such as ARM Holdings Plc on the higher end (which we’ve already presented in the earlier, “Market Overview” section) and CAST Inc. on the lower one.
Let’s examine the future of the semiconductor IP ecosystem through the eyes of these two companies. What can they offer strategically to their customers? Why are customers selecting the smaller and much less influential offerings from CAST over those of the “industry behemoth” ARM? What does it mean for a customer to stick with one rather than the other?
Making IP work and getting the right SoC! [Global Semiconductor Alliance (GSA) Intellectual Property blog, July 18, 2012]
Jack Browne, Vice President, Marketing, Sonics, Inc.
Designers defining the next generation SoCs are adding more cores in pursuit of the ever increasing user experience. Whether for pacesetting smart phones, WiFi routers, or personal medical devices, making all this IP work as intended in the SoC requires system IP. System IP includes the on-chip network, performance analysis tools, debug tools, power management and memory subsystems necessary for best in class SoCs. Whether used by the architect in the initial definition of the SoC or the layout engineer finalizing timing for place and route closure, system IP is critical to the design insuring that the capabilities of the SoC will meet the required end user experiences.
For complex SoCs over 100 IP blocks may be included in a design. Choices can be tough, with over a hundred IP vendors offering solutions, each with multiple products. The System IP eases the design burden by supporting both IP blocks and subsystems with the necessary broad range of interface protocols, widths, frequency domains and power domains.
System IP eases the challenges of maintaining a common software platform over multiple generations of SoC’s, built with varying IP cores and subsystems. Market research firm Semico, forecasts subsystem functions for computing, memory, video, communications, multimedia, security and system resource management. The increased abstraction from subsystems gives productivity benefit (leveraging use of commercial IP blocks) as well as differentiation through the integration of in-house IP blocks with standard industry IP blocks into reusable subsystems. A computing subsystem example would be ARM’s big.LITTLE CPU clusters where ARM does most of the integration ahead of time with the designer doing final configuration of features and/or number of cores. Another example would be faster communication subsystems like LTE advanced subsystems [we have already shown CEVA’s LTE-A Ref.Architecture above as the best example for that]. By customizing a 4G LTE advanced subsystem solution with internal technology, SoC design teams can differentiate from standard IP blocks using their internal expertise while leveraging the shared R&D benefits of merchant 4G IP subsystems.
With the increasing cost of today’s SoCs, many are designed for multiple markets where not all of the functionality of the SoC is in use. Many also have multiple usage scenarios within a given market, e.g. music playback on our smartphone. With the importance of battery life, managing the power of a SoC, including the ability to power off unused blocks, gives the best battery life. Today’s 28nm SoCs are using dozens of power domains and even more clock domains to meet the performance and battery life requirements. By moving to system IP supporting hardware centric control of power transitions, end users will make more use of Dark Silicon (normally powered off) for better battery life as compared to interrupt centric software power management control.
When starting a new SoC design, your choice of system IP is a key early decision as you have now selected the on-chip network, performance analysis tools, debug tools, power management and memory subsystems available for your design. Making the right choice can provide a 2x benefit over other choices with regard to performance, power and cost, so make an informed choice.
Dr. John Heinlein, Vice President, Marketing, ARM Physical IP Division
… The IP ecosystem … is diverse and vibrant, with today’s IP providers offering many IP types, spanning a wide range of power, performance and area tradeoffs. As an example, at 45 and 40nm various industry databases list between 450-620 licensable IP blocks available. Furthermore, the latest IP developments at 45nm and 28nm include extensive power management capabilities, cost tradeoffs and implementation options that give designers choices for their chip. Only through this ecosystem diversity can we have the rich and competitive landscape to address the many market segments the industry serves.
… Major technology investments are occurring across the foundry space, with new leading-edge R&D investments in fundamental process technology being made. These investments span major companies like IBM, TSMC, Samsung, GLOBALFOUNDRIES, research consortia like IMEC and even new entrants like SuVolta, all of which are driving for aggressive technologies. Today, 32 and 28nm products are in production and many more ramping to production. Following that, there is a range of solutions already announced at 20nm that deliver the next node of planar bulk CMOS scaling. Furthermore, the industry has clearly shown its commitment to investing in the next wave of 20nm and 14nm solutions beyond bulk ranging from FinFET to fully depleted SOI. …
John A. Ford, Director of Product Marketing, Physical IP Division, ARM
On October 6th, UMC announced the selection of the ARM® Artisan® Physical IP Platform for the UMC foundry sponsored IP program. This new platform for UMC’s 28nm high-K metal gate (HKMG) process is a natural continuation of the long standing relationship between ARM physical IP division and UMC. ARM Artisan IP has been successfully used in millions of SoCs produced at UMC for more than 10 years on 180nm, 130nm, 90nm, 65nm and 55nm process technologies. The addition of UMC to ARM’s family of 28nm Physical IP platforms has a larger meaning than just a high quality set of IP on a technology-leading process. ARM Artisan IP is now the only physical IP platform available at all four of the 28nm commercial foundries in the world: TSMC, UMC, GLOBALFOUNDRIES, and Samsung.
This makes good sense considering ARM’s expertise in physical IP optimization and years of establishing early foundry engagement on advance node IP development. ARM started work on physical IP for HKMG processes way back in 2008 with test chips and process qualification chips for IBM’s 32nmLP process. 32nmLP process was the first commercially available HKMG process and is now in high volume production at Samsung for smart phone, tablet and other applications. With millions of production SoCs at 32nm, 28nm is actually the 2nd generation of HKMG IP from ARM and includes all the critical design technique learning from 32nm development and production. ARM is deploying a full platform of standard cells, logic products, memory compilers and interface products at 28nm. Customers can benefit from being able to use consistent IP at all four foundries for the development of their SoC. With ARM’s exhaustive silicon validation process, customers have the assurance, peace of mind and confidence that only comes for using ARM IP.
We’re not stopping there. ARM is now actively developing 20nm physical IP at both IBM and TSMC, with 5 test chips taped out starting in 2009 and several more planned for 2012 and 2013. By engaging early with foundries and developing IP in parallel with the process development, ARM ensures that designers can achieve the full entitlement of the technology, with a high degree of manufacturability. Foundries engage with ARM as a partner for early physical IP because of the long experience we have in developing physical IP on advanced process including CMOS SiON, CMOS HKMG and SOI. …
ARM big LITTLE processing: Saving Power through heterogeneous multiprocessing and task content migration [chipestimate YouTube channel, June 18, 2012]
From: Enabling Mobile Innovation with the Cortex™-A7 Processor [ARM whitepaper for TechCon 2011 by Brian Jeff, Oct 15 2011]
Market requirements for high-end mobile
High-end smartphones require high performance applications processors and graphics processors, but instantaneous performance requirements are highly elastic. During web browsing, for example, peak performance is required when pages are first rendered, but much lower levels of processor performance are required when reading or scrolling down a page. Similarly, applications have varying levels of performance requirements, typically requiring very high performance during launch, and low to moderate levels of required performance during at least some portion of runtime. For voice calls, the level of performance required by the applications processor is quite low, even on a high-end smartphone.
Given the wide range of required performance, it would be ideal if the phone could use a very power efficient CPU some of the time, and migrate the context to a high performance CPU at other times. ARM has been researching this idea for several years, and has specifically designed the Cortex-A7 CPU not only to ideally fit all but the high-end performance requirements of a high-end smartphone, but also to be able to connect tightly with the larger and higher performance Cortex-A15 CPU in a coherent system. When connected together through AMBA Coherency Extension (ACE) interface a Cortex-A15 CPU cluster can be connected with a cluster of Cortex-A7 CPUs in a processor complex with a single memory map, hardware managed cache coherency, and the ability to run workloads on the large CPU cluster or small CPU cluster depending on instantaneous performance requirements. This concept created by ARM is called big.LITTLE processing.
Big.LITTLE refers to the coherent combination of High Performance and Power Efficient ARM CPUs. A platform that contains both Cortex-A15 (big) and Cortex-A7 (LITTLE) can execute across a wider performance range with better energy efficiency than a single processor. Hardware coherency between Cortex-A15 and Cortex-A7 enables distinct big.LITTLE use models, either migrating context between the big and little clusters, or OS aware thread allocation to the appropriately sized CPU or CPUs. The CCI-400 cache coherent interconnect enables an extremely fast context migration between the big and little CPU clusters. Finally, software views the big and LITTLE CPU clusters identically, and transitions are managed automatically by OS power management or directly by the OS. The net result of big.LITTLE power management is a platform with the peak performance of the Cortex-A15, and average power consumption closer to the Cortex-A7. This enables significantly higher performance at lower power than today’s high-end smartphones. The concept of big.LITTLE processing is only briefly introduced here; a more complete description of the hardware, software, and system implementation of big.LITTLE processing is covered in other TechCon presentations.
From: Big.LITTLE Processing with ARM Cortex™-A15 & Cortex-A7 [ARM whitepaper by Peter Greenhalgh, Sept 15 2011]
In general, there is a different ethos taken in the Cortex-A15 micro-architecture than with the Cortex-A7 micro-architecture. When appropriate, Cortex-A15 trades off energy efficiency for performance, while Cortex-A7 will trade off performance for energy efficiency. A good example of these micro-architectural trade-offs is in the level-2 cache design. While a more area optimized approach would have been to share a single level-2 cache between Cortex-A15 and Cortex-A7 this part of the design can benefit from optimizations in favor of energy efficiency or performance. As such Cortex-A15 and Cortex-A7 have integrated level-2 caches.
Table 1 illustrates the difference in performance and energy between Cortex-A15 and Cortex-A7 across a variety of benchmarks and micro-benchmarks. The first column describes the uplift in performance from Cortex-A7 to Cortex-A15, while the second column considers both the performance and power difference to show the improvement in energy efficiency from Cortex-A15 to Cortex-A7. All measurements are on complete, frequency optimized layouts of Cortex-A15 and Cortex-A7 using the same cell and RAM libraries. All code that is executed on Cortex-A7 is compiled for Cortex-A15.
Benchmark | Cortex-A15 vs Cortex-A7 Performance | Cortex-A7 vs Cortex-A15 Energy Efficiency
Dhrystone | 1.9x | 3.5x
FDCT | 2.3x | 3.8x
IMDCT | 3.0x | 3.0x
MemCopy L1 | 1.9x | 2.3x
MemCopy L2 | 1.9x | 3.4x
Table 1 Cortex-A15 & Cortex-A7 Performance & Energy Comparison
It should be observed from Table 1 that although Cortex-A7 is labeled the “LITTLE” processor its performance potential is considerable. In fact, due to micro-architecture advances Cortex-A7 provides higher performance than current Cortex-A8 based implementations for a fraction of the power. As such a significant amount of processing can remain on Cortex-A7 without resorting to Cortex-A15.
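To get a feeling for what the two columns of Table 1 mean together, one can derive the average power ratio they imply. The sketch below is not from the whitepaper: it assumes that “energy efficiency” means work done per unit of energy and that energy equals average power multiplied by run time. Under those assumptions the Cortex-A15 to Cortex-A7 power ratio is simply the performance ratio multiplied by the energy efficiency ratio. All names in the code are illustrative.

```python
# Table 1 values per benchmark:
# (Cortex-A15 performance uplift over Cortex-A7,
#  Cortex-A7 energy-efficiency gain over Cortex-A15)
TABLE_1 = {
    "Dhrystone": (1.9, 3.5),
    "FDCT": (2.3, 3.8),
    "IMDCT": (3.0, 3.0),
    "MemCopy L1": (1.9, 2.3),
    "MemCopy L2": (1.9, 3.4),
}


def implied_power_ratio(perf_uplift: float, efficiency_gain: float) -> float:
    """Average power of Cortex-A15 relative to Cortex-A7 for the same task.

    Assumption (not stated in the whitepaper): efficiency = work / energy
    and energy = power * time, hence
        E_A15 / E_A7 = (P_A15 / P_A7) * (t_A15 / t_A7)
    and therefore
        P_A15 / P_A7 = efficiency_gain * perf_uplift.
    """
    return perf_uplift * efficiency_gain


power_ratios = {
    name: implied_power_ratio(perf, eff) for name, (perf, eff) in TABLE_1.items()
}

if __name__ == "__main__":
    for name, ratio in power_ratios.items():
        print(f"{name}: Cortex-A15 draws about {ratio:.1f}x the power of Cortex-A7")
```

Read this way, the benchmarks suggest that the big core pays for its speed with a multiple of the LITTLE core's power draw, which is why keeping work on Cortex-A7 whenever possible is attractive.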
big.LITTLE Task Migration Use Model
In the big.LITTLE task migration use model the OS and applications only ever execute on Cortex-A15 or Cortex-A7 and never both processors at the same time. This use-model is a natural extension to the Dynamic Voltage and Frequency Scaling (DVFS), operating points provided by current mobile platforms with a single application processor to allow the OS to match the performance of the platform to the performance required by the application.
However, in a Cortex-A15-Cortex-A7 platform these operating points are applied both to Cortex-A15 and Cortex-A7. When Cortex-A7 is executing the OS can tune the operating points as it would for an existing platform with a single applications processor. Once Cortex-A7 is at its highest operating point if more performance is required a task migration can be invoked that picks up the OS and applications and moves them to Cortex-A15.
This allows low and medium intensity applications to be executed on Cortex-A7 with better energy efficiency than Cortex-A15 can achieve while the high intensity applications that characterize today’s smartphones can execute on Cortex-A15.
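The use model above can be sketched as a small decision function. This is only an illustration: the operating-point values, the function name and the rule for migrating back to Cortex-A7 when Cortex-A15 sits at its lowest point are all assumptions, since the whitepaper describes only the upward migration.

```python
# Hypothetical operating points in MHz, lowest to highest; real platforms differ.
A7_POINTS = [200, 400, 600, 1000]
A15_POINTS = [800, 1200, 1600, 2000]


def next_state(cluster, index, need_more, need_less):
    """Return the (cluster, operating-point index) to use next.

    cluster is "A7" or "A15"; index selects a point in that cluster's table.
    """
    points = A7_POINTS if cluster == "A7" else A15_POINTS
    if need_more:
        if index + 1 < len(points):
            return cluster, index + 1          # ordinary DVFS step up
        if cluster == "A7":
            return "A15", 0                    # A7 exhausted: migrate to A15
        return cluster, index                  # already at the A15 ceiling
    if need_less:
        if index > 0:
            return cluster, index - 1          # ordinary DVFS step down
        if cluster == "A15":
            return "A7", len(A7_POINTS) - 1    # assumed rule: migrate back to A7
        return cluster, index                  # already at the A7 floor
    return cluster, index


state = ("A7", len(A7_POINTS) - 1)             # A7 at its highest operating point
print(next_state(*state, need_more=True, need_less=False))
```

From the OS point of view the migration is just one more step on the same performance ladder, which is why the text calls it a natural extension of DVFS.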
An important consideration of a big.LITTLE system is the time it takes to migrate a task between the Cortex-A15 cluster and the Cortex-A7 cluster. If it takes too long then it may become noticeable to the operating system and the system power may outweigh the benefit of task migration for some time. Therefore, the Cortex-A15-Cortex-A7 system is designed to migrate in fewer than 20,000 cycles, or 20 microseconds with processors operating at 1GHz.
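The cycle budget converts to wall-clock time by dividing by the clock frequency. A minimal sketch follows; the function name is an assumption for illustration, and the 500 MHz case is a hypothetical clock not taken from the whitepaper.

```python
def migration_time_us(cycles, clock_mhz):
    """Microseconds needed to spend `cycles` clock cycles at `clock_mhz` MHz.

    One MHz is one cycle per microsecond, so the division gives microseconds.
    """
    return cycles / clock_mhz


print(migration_time_us(20_000, 1000))   # the whitepaper's case: 1 GHz
print(migration_time_us(20_000, 500))    # hypothetical slower clock
```

The same cycle budget takes proportionally longer at lower clock frequencies, so the 20-microsecond figure applies specifically to 1GHz operation.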
big.LITTLE MP Use Model
Since a big.LITTLE system containing Cortex-A15 and Cortex-A7 is fully coherent through CCI-400 another logical use-model is to allow both Cortex-A15 and Cortex-A7 to be powered on and simultaneously executing code. This is termed big.LITTLE MP, which is essentially Heterogeneous MultiProcessing. Note that in this use model Cortex-A15 only needs to be powered on and simultaneously executing next to Cortex-A7 if there are threads that need that level of processing performance. If not, only Cortex-A7 needs to be powered on.
big.LITTLE MP is compelling because it enables threads to be executed on the processing resource that is most appropriate. Compute intensive threads that require significant amounts of processing performance, as their output is user visible, can be allocated to Cortex-A15. Threads that are I/O heavy or that do not produce a result that is time critical to the user can be executed on Cortex-A7.
A simple example of a non-time critical thread is one associated with e-mail updates. While web browsing the user will want e-mail updates to continue, but it does not matter if they are done at Cortex-A15 performance levels or Cortex-A7 performance levels. Since Cortex-A7 is a more energy efficient processor it makes more sense to take a LITTLE longer, but consume less battery life.
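The allocation rule of thumb described above can be sketched as follows. The Thread fields, the function names and the example threads are hypothetical; a real scheduler would rely on measured load rather than static flags.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    compute_intensive: bool   # needs significant processing performance
    user_visible: bool        # its result is time critical to the user


def place(thread):
    """Pick a cluster for one thread using the rule of thumb in the text."""
    if thread.compute_intensive and thread.user_visible:
        return "Cortex-A15"
    return "Cortex-A7"


def a15_needed(threads):
    """Cortex-A15 only has to be powered on if some thread is placed on it."""
    return any(place(t) == "Cortex-A15" for t in threads)


threads = [
    Thread("page-render", compute_intensive=True, user_visible=True),
    Thread("email-sync", compute_intensive=False, user_visible=False),
]
for t in threads:
    print(t.name, "->", place(t))
print("power on Cortex-A15:", a15_needed(threads))
```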
Finally, as a fully coherent system can create a significant volume of coherent transactions, Cortex-A15, Cortex-A7 and CCI-400 have been designed to cope with worst case snooping scenarios. This includes the case where a Mali™-T604 GPU is connected to one of the I/O coherent CCI-400 ports and every transaction is snooping Cortex-A15 and Cortex-A7 at the same time as Cortex-A15 and Cortex-A7 are snooping each other.
From Combining large and small compute engines – ARM Cortex-A7 [by Brian Jeff on ARM SoC Design blog, Oct 19, 2011]
The fourth and final thing is to ensure these engines work with a regular transmission.
We needed to ensure there was a simple software approach to controlling the big.LITTLE switch consistent with power management mechanisms already in place. Current smartphones and tablet devices make use of Dynamic Voltage and Frequency Scaling (DVFS) and multiple idle modes for individual CPU cores and IP blocks in the application processor SoC. Our implementation of big.LITTLE modifies the back end of the driver which controls the processor’s DVFS operating point (for example cpu_freq in Linux/Android). Instead of three or four DVFS operating points, the driver now is aware of two CPU clusters each potentially with three or four independent voltage and frequency operating points, extending the range of performance tuning that existing smartphone power management solutions use. A big.LITTLE CPU cluster can be operated in a pure switching mode, where only one CPU cluster is active at a time under control of the DVFS driver, or a big.LITTLE heterogeneous multiprocessing mode where the OS is explicitly controlling the allocation of threads to the big or little CPU clusters and is thus aware of the presence of the different types of cores.
ARM Cortex-A7 launch — Intro Simon Segars, President ARM Inc [US] [ARMflix YouTube channel, Oct 19, 2011]
ARM Cortex-A7 launch — Presentation, Mike Inglis, EVP & GM ARM Processor Division [ARMflix YouTube channel, Oct 19, 2011]
- Most energy-efficient applications processor
- 5x the energy efficiency of mainstream phones
- Performance to handle common workloads
- >2x the performance of mainstream phone
- Feature set and software compliant with Cortex-A15
- Full backward compatibility
- Scalable and extensible
- Up to 20% more performance while consuming 60% less power
From: Enabling Mobile Innovation with the Cortex™-A7 Processor [ARM whitepaper for TechCon 2011 by Brian Jeff, Oct 15 2011]
The Cortex-A7 processor was designed primarily for power-efficiency and a small footprint. The design team based the pipeline on the extremely power efficient Cortex-A5 CPU, then added microarchitecture enhancements to increase performance and architectural enhancements to deliver full software compatibility with the Cortex-A15 CPU. These architectural enhancements include support for virtualization and 40-bit physical address space, and AMBA® 4 bus interfaces. Virtualization and large address space are unusual features for so small a CPU, but are critical to present a software view of the Cortex-A7 that is identical to the Cortex-A15 high-end CPU.
Like the Cortex-A5, Cortex-A9, and Cortex-A8 processors that came before it, the Cortex-A7 processor is a full ARMv7-A CPU, with support for the Thumb®-2 instruction set, optional 32-bit/64-bit floating point acceleration and optional NEON™ 128-bit SIMD architectural blocks. The Cortex-A7 also includes support for TrustZone® to enable secure operating modes which are increasingly important in modern mobile OEM designs. To bring higher scalability, the Cortex-A7 is also configurable as a multicore processor, supporting 1-4 cores in a coherent cluster.
The Cortex-A7 is a simple in-order pipeline with significant but not complete dual-issue capability; however, the careful choice of design features has enabled a single Cortex-A7 core to outperform the full dual-issue Cortex-A8 CPU on some important benchmark tests like web browsing, while consuming up to 60% less power.
The roadmap below shows the legacy of Cortex-A class CPU designs, beginning with the Cortex-A8. In that design, ARM introduced the NEON SIMD architectural extension and implemented a 2-way superscalar CPU that brought significant performance enhancements over the single-issue ARM11™. The Cortex-A9 extended the Cortex-A8 by bringing in MPCore capability for 1 to 4 CPUs with cache coherency managed efficiently by a snoop control unit. The Cortex-A9 also introduced performance enhancements inside the core that brought a 20-30% performance increase over Cortex-A8 for a single core.
Cortex-A7 makes use of a simple 8-stage in-order pipeline, extended to include dual-issue capability on a reduced range of data-processing and branch instructions. Increased dual-issuing coupled with other microarchitectural improvements allow the Cortex-A7 to reach very good levels of performance with very low power consumption.
Other performance enhancing features include an integrated L2 cache, which reduces latency to L2 memory and external memory. The integrated L2 cache simplifies OS support as it uses system mapped registers and can be managed using CP15 operations rather than the memory mapped registers needed for an external L2 cache. Integrating the L2 cache controller also reduces the amount of area consumed by an external controller and enables a tighter integration of the controller with internal bus structures.
The L2 cache controller itself was designed with low power in mind. The mechanism for looking up tags in the cache RAM includes consecutive tag followed by data lookup; similarly, the associativity is fixed at 8-way to balance performance against lookup energy. External requests are triggered on an L2 miss, rather than on speculative requests, to reduce energy.
There are branch prediction improvements as well: the branch target instruction cache (BTIC) caches fetches after a direct branch and hides the branch shadow on tight loops.
There are several improvements in memory system performance. The Load-Store path has been increased to 64-bits from the 32-bit path in the Cortex-A5. The external bus structure has been upgraded to 128-bit AMBA4 to improve bandwidth and introduce support for coherency extension beyond the 1-4 SMP cluster using AMBA 4 ACE.
Energy Efficiency Features of the Microarchitecture
There are several features of the L1 Memory system which reduce the power consumption of the CPU or the system. The merging Store-buffer after the write stage reduces data cache lookups. The 2-way set associative instruction cache trades off the slightly improved hit rate of a 4-way set associative cache for the reduced power on each lookup.
Memory System Tuned to Minimize memory latency
There are several performance optimizing features in the memory system. The address generation unit is shifted one stage back in the pipeline to enable a single cycle load-use penalty. The design team increased TLB size to 256 entries, up from 128 entries for the Cortex-A5 and Cortex-A9; this reduces page walks, saving power, and significantly improves performance for large workloads like web browsing with large data sets that span a large number of pages. Also, page table entries can be cached in L1, improving the speed of page table walks on TLB misses. The bus interface unit has support for multiple outstanding read and write transactions. Finally, the physically indexed caches enable efficient OS context switching.
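One way to see why the larger TLB matters is to compute its "reach", the amount of memory it can map without a page walk. The sketch below assumes every entry maps a 4 KB small page; larger page sizes would raise the reach, and the function name is an assumption for illustration.

```python
def tlb_reach_kib(entries, page_kib=4):
    """Memory in KiB covered by a TLB whose entries each map one page."""
    return entries * page_kib


print(tlb_reach_kib(256), "KiB with the 256-entry TLB")
print(tlb_reach_kib(128), "KiB with a 128-entry TLB")
```

Under that assumption, doubling the entry count doubles the working set that can be touched before TLB misses start to trigger page walks.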
ARM Cortex-A7 launch — big.LITTLE demonstration, Nandan Nayampally, Director, Product Marketing [ARMflix YouTube channel, Oct 19, 2011]
ARM Expands Processor Optimization Pack Solutions for TSMC 40nm and 28nm Process Variants [ARM press release, April 16, 2012]
A Processor Optimization Pack solution is composed of three elements necessary to achieve an optimized ARM core implementation. First, it contains ARM Artisan® Physical IP logic libraries and memory instances that are specifically tuned for a given ARM core and process technology.
This Physical IP is developed through a tightly coupled collaboration with ARM processor engineers in an iterative process to identify the optimal results. Second, it includes a comprehensive benchmarking report to document the exact conditions and results ARM achieved for the core implementation. Finally, it includes a POP Implementation Guide that details the methodology used to achieve the result, to enable the end customer to achieve the same implementation quickly and at low risk.
“A single POP product can be applied to energy-efficient mobile, networking or even enterprise applications, providing a wide range of flexibility for ARM SoC partners to optimize performance and energy-efficiency while reducing risk in their designs,” said Simon Segars, executive vice president and general manager, Processor and Physical IP Division, ARM. “Only ARM can offer a complete roadmap of Processor Optimization Pack implementation solutions so deeply integrated and tightly aligned with ARM processor development activities now and into the future.”
The summary below describes the existing and newly announced POP products for TSMC processes. ARM also incorporates the POP optimizations in hard macros of Cortex cores.
POP availability by process technology
TSMC 40 LP high speed options
TSMC 40 G
TSMC 28 HPM
TSMC 28 HP
ARM Cortex™-A5 Existing
ARM Announces Cortex-A15 Quad-Core Hard Macro [ARM press release, April 17, 2012]
Power-optimized implementation of quad-core hard macro on leading 28nm process
ARM today announced the availability of a high performance, power-optimized quad-core hard macro implementation of its flagship ARM® Cortex™-A15 MPCore™ processor.
The ARM Cortex-A15 MP4 hard macro is designed to run at 2GHz and delivers performance in excess of 20,000DMIPS, while maintaining the power efficiency of the Cortex-A9 hard macro. The Cortex-A15 hard macro development is the result of the unique synergy arising from the combination of ARM Cortex processor IP, Artisan® physical IP, CoreLink™ systems IP and ARM integration capabilities, and utilizes the TSMC 28HPM process.
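The headline figures imply a per-core efficiency floor. The sketch below assumes that the DMIPS total scales linearly across the four cores; that assumption and the function name are for illustration only, and because the release says "in excess of" 20,000 DMIPS the result is a lower bound rather than a specification.

```python
def dmips_per_mhz_per_core(total_dmips, cores, clock_mhz):
    """Per-core DMIPS/MHz implied by an aggregate DMIPS figure."""
    return total_dmips / (cores * clock_mhz)


# 20,000 DMIPS, 4 cores (MP4), 2 GHz = 2000 MHz, all from the press release.
print(dmips_per_mhz_per_core(20_000, 4, 2000))
```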
The low leakage implementation, featuring integrated NEON™ SIMD technology and floating point (VFP), delivers an extremely competitive balance of performance and power and is ideal for a wide array of high-performance computing applications, from notebooks through to power-efficient, extreme performance-orientated network and enterprise devices.
The hard macro was developed using ARM Artisan 12-track libraries and the recently announced Processor Optimization Pack™ (POP) solution for the Cortex-A15 on TSMC 28nm HPM process. This follows the recent announcement of a broad suite of POPs for all Cortex-A series processors (see ARM Expands Processor Optimization Pack Solutions for TSMC 40nm and 28nm Process Variants, 16th April 2012)
Full configuration and implementation details will be presented at the Cool Chips conference (18-20 April) in Yokohama, Japan. Further information is contained in an accompanying blog.
“For SoC designers looking to make a trade-off between the flexibility offered by the traditional RTL-based SoC development strategy and a rapid time to market, with ensured, benchmarked power, performance and area, an ARM hard macro implementation is an ideal, cost-effective solution,” said Jim Nicholas, vice president of Marketing, processor division, ARM. “This new Cortex-A15 hard macro is an important addition to our portfolio and will enable a wider array of partners to leverage the outstanding capabilities of the Cortex-A15 processor.”
– Squaring the circle – Optimizing power efficiency in a Cortex-A15 processor [Haydn Povey on SoC Design blog of ARM, April 17, 2012]
– Simplifying SoC’s with Hard Macros – New solutions for old problems [Haydn Povey on SoC Design blog of ARM, Oct 20, 2011]: “For me, the most important aspect of this talk was the public announcement of the availability of a new Cortex™-A5 Hard Macro for the TSMC 40nm Low Power node (40LP) which can achieve a whopping speed of over 1GHz in a tiny footprint of just 1mm². … there will always be partners who need the full flexibility of RTL and POPs, but there is also a group for whom having a pre-integrated and hardened ready to run solution out of the box is the best route to market.”
– Hard Macro Processors [ARM product page, April 17, 2012]
The ARM Hard Macro portfolio offers performance and power optimized hard macrocell implementations of the Cortex™-A series processors. For SoC designers looking to make a trade-off between the multifaceted flexibility offered by the traditional RTL based SoC development strategy and the significant costs and efforts it involves, the ARM Hard Macro portfolio is an exciting alternative that enables higher profitability through benchmarked PPA (Performance, Power, and Area), design risk reduction and faster time to market.
ARM Hard Macros are available in a number of different implementation options with more being added.
Currently the following options are available.
Processor | TSMC 40LP | TSMC 40G | TSMC 28HPM
Cortex-A5 Single-core | X | |
Cortex-A9 Dual-core | | X |
Cortex-A15 Quad-core | | | X
Processor Optimization Pack™ (POP) solutions targeting ARM Cortex™ processors [ARMflix YouTube channel, April 16, 2012] ARM Artisan Physical IP Delivers Optimized Performance and Energy-Efficiency for ARM® Cortex™-A5, Cortex-A7, Cortex-A9 and Cortex-A15 cores.
ARM Holdings Management Discusses Q2 2012 Results – Earnings Call Transcript [Seeking Alpha, July 25, 2012]
If I look at physical IP, the story here is our physical IP is being used right across the different sectors that ARM’s processors are used in. We’re continuing with the processor optimization package activity. It was a record quarter for POPs. The best quarter we’ve had. So total of over 32 POPs sold now, still about a 50% attach rate with Cortex-A licensees, so that’s good in terms of generating royalty for the future.
[Note that only 13 companies are shown here out of those 32 POP licensees.]
And also good in terms of generating royalty for the future is that this quarter, we had 4 new fabless semiconductor companies adopting ARM physical IP for their 28nm designs and beyond. So that is good for royalty growth going forward.
Note: On the very first “Q2 2012 Highlights” slide one could see the following overall split:
The overall 77% share of the processor division comprises 31% licensing (the lighter blue) and 47% royalties. So that is a pretty mature part of the business overall, although the Mali GPU part of it is still developing:
Let’s — I should just highlight, we’ve got on the slide, of course, millions now of Mali devices as well, are going into those Cortex-A-based chips. And as far as Mali is concerned, then we are very much on track for the 100 million-plus units that we expect to deliver this year.
as around 180 million Cortex-A units were shipped in the first half alone (see the graph in the next excerpt from the earnings call).
The “Revenue Split Analysis” slide from the Appendix, however, is showing that due to the steadily growing application processor business (simply indicated Processor Division, PD) the share of the Physical IP business (simply indicated Physical IP Division, PIPD) was not growing for the last four years:
With extremely high interest in upcoming technologies of 28nm and beyond, more and more Cortex licensees will (should) exploit the POP opportunity. Here is the example of MediaTek (Taiwan), the low-end SoC market leader, whose upcoming flagship products should definitely use POP as well to meet such a tight delivery schedule (Cortex-A7 has been available for licensing for just 10 months, which means ~15 months to the Jan’13 SoC delivery vs. the 2-3 years required previously):
A MediaTek product roadmap leaked: quad-core code-named MT6588 [MTK Smartphones Network (MTK手机网), July 27, 2012]
Update: later was renamed and came to market as MediaTek MT6589 quad-core Cortex-A7 SoC with HSPA+ and TD-SCDMA is available for Android smartphones and tablets of Q1 delivery [this same blog, Dec 12, 2012]
From information recently obtained from an overseas electronics forum we see that the MT6585 code communicated earlier for the quad-core MediaTek smartphone chipset is wrong. The true model code is MT6588. It is built on the 28nm process in order to achieve a higher performance level than the dual-core MT6577 technology.
The MT6588 has a 4-core CPU [Cortex-A7 (!), see the second slide below] clocked at 1GHz [1.XGHz rather, see the included slides below, as well as the latest rumor about that being 1.7GHz or 1.5GHz], supports dual-channel at a maximum of 1066Mbps, has an integrated multimode modem for WCDMA [+ it is delivering HSPA+ WCDMA performance (!) vs just HSPA with MT6577/75, see the first slide below] and TD (!), that is, it can support both the Unicom [latest upgrade to HSPA+ service, see here] and China Mobile 3G networks, and supports a camera of up to 13 MP and 1080P video playback. Finally, it has a GPU upgrade to the SGX544, doubles the resolution to the 1280×800 HD level, and has 32KB of L1 cache and 1MB of L2 cache.
Alongside the MT6588 there is a 28nm dual-core version, the MT6583, on the MediaTek 2012 product roadmap. From the chipset parameters it is evident that the MT6583 is a scaled-down version of the MT6588. It has 2 fewer cores, the camera support is 8MP, the video decoder is of 720P level, and the resolution is down to 854×480.
It is understood that the MT6588 and MT6583 will be in production in the first quarter of 2013, early next year at the earliest.
MediaTek to launch quad-core smartphone solutions in 1Q13, says paper [DIGITIMES, Aug 6, 2012]
MediaTek is expected to launch its first quad-core smartphone solution, the MT6588, in the first quarter of 2013, according to a Chinese-language Liberty Times report. The MT6588 features a quad-core 1.5GHz or 1.7GHz Cortex-A7 CPU, supporting WCDMA and TD-SCDMA technologies.
The MT6588, which features a 13-megapixel camera, also supports 1080p video playback and a display resolution of 1280 by 800 pixels. The chip will be built using a 28nm process, the paper said.
Additionally, MediaTek will also roll out a 28nm dual-core solution, the MT6583, during the same quarter. While the dual-core CPU of the MT6583 will also run at 1.5GHz or 1.7GHz, the chip will support a resolution of 854 by 480 pixels, targeting a segment different from that of the MT6588, the paper indicated.
Back to: ARM Holdings Management Discusses Q2 2012 Results – Earnings Call Transcript [Seeking Alpha, July 25, 2012]
One thing we are seeing is the value coming through in mobile, generally, the increasing number of smartphones, and within the smartphones themselves, an increasing number of Cortex-A products. And you can see a little histogram halfway down the slide, the top bar there is the ARM11. So ARM11 is still accounting for 40%, roughly, of the apps processors. And the Cortex-A is accounting for, roughly, 60% of the apps processors. But within that Cortex-A, you can see dual-core Cortex-A increasing significantly if you compare the situation with a year ago. And that’s good news from a value point of view for ARM as royalty, because typically these chips are more expensive. So single-core moving to dual-core and quad-core is a good trend for us. And note also, the underlying growth in sheer volume of our apps processors in smartphones. Don’t forget, with all this gloom and doom around, smartphones continues to be an area of significant growth for the business, and we’re looking forward to 30% thereabout growth in smartphones year-on-year so — for the year as a whole.
ARM in MCU and Internet of Things
Growing standardisation around ARM in Microcontrollers
– More than 100 companies have now licensed Cortex-M class processors mainly for microcontrollers, smart sensors and smartcards
– Cortex-M0+ is ARM’s most energy efficient processor for microcontrollers
Collectively, if you look at the line cards from the ARM partners, there are over 1,400 different ARM microcontroller products that you can go out and buy from ARM partners today. And that’s going to be a much bigger number by the time all of that licensing that we’ve been doing gets into silicon production.
Earlier this year, we launched the Cortex-M0+ product … And again, at the Freescale technology forum, we saw an excellent demonstration of that power efficiency, where they literally had an ARM-powered charger, crank it up with a crank handle, charged a few capacitors up in the range of different microcontrollers and of course, the Cortex-M0+ went on and on and on. So that’s a great product.
As far as the range of opportunities is concerned, it’s huge, and we’re starting to get design ins and as we start to get design ins, so more and more semiconductor companies are jumping onto the ARM-based microcontroller party. And they’re making these decisions in order to position themselves for the Internet of Things wave.
Internet of Things brings new opportunities
– Combining radio technology with ARM-based microcontrollers and sensors
– Huge range of applications, billions of opportunities
– New products announced from Freescale, NXP and Toshiba in Q2
In terms of volume shipments, at the moment then we saw another great quarter, where if we look year-on-year on microcontroller shipments up about 20% compared with industry shipments, up about 8%.
Freescale: History & Future of “Internet of Things” – Design West (ESC) 2012 [ARMflix YouTube channel, March 28, 2012] Jim Trudeau, Solutions Technical Marketing from Freescale on the Cortex-M0+, the Internet of Things and Freescale’s Kinetis L Series
See more: The Internet of Things, the ultimate mashup [Jim Trudeau on Software Meets Silicon blog of Freescale, April 17, 2012], published on ARM blog as “The Internet of Things, a Triad of Partners, and the Singularity of Change”
Implementing connectivity is where a company like Motomic Software comes into play. They bring Human Machine Interface (HMI) capability to a new arena. With connectedness comes the need for HMI to get smarter, to display what we really need to know when we need to know it in better ways. Take the lowly thermostat: as simple as its task is, a traditional digital thermostat UI is typically confusing to use. A modern UI in a “learning” thermostat can be quite simple. The contrast in complexity is startling, as shown in Figure 1.
Figure 1: Contrasting Digital Thermostat UI
Motomic Embedded Software Tools for IOT – Design West (ESC) 2012 [ARMflix YouTube channel, March 28, 2012] Motomic tells us about embedded software tools for applications focusing on Internet of Things, plus a demo of an embedded browser and media grid. http://www.motomicsoftware.com/
See more: A Face for the Internet of Things [Mike Gee, CEO of Motomic Software, Inc. as a guest blogger on Embedded blog of ARM, June 11, 2012]
… Motomic has created two browsers. Both browse and render HTML/CSS. Motomic’s µButterfly “microbrowser” runs in as little as ~320 KB Flash and 109 KB RAM. The Butterfly “minibrowser” is based on Qt, it supports features such as TrueType fonts, anti-aliasing and alpha blending. It requires 6+ MB of Flash. The RAM requirement depends on screen size and content requirements, starting around ~1 MB.
Both leverage the very low power requirements and very small footprints of ARM’s Cortex-M0+ and Cortex-M4 microprocessors that are too small to run a web browser such as WebKit, Chrome, Mozilla, etc. These small processors can now accurately render HTML/CSS content previously reserved for higher-end processors.
Qt on Future’s WVGA display [MotomicSoftware YouTube channel, July 9, 2012] Nokia Qt for Freescale’s MQX real-time operating system on Kinetis K70 @ Future Electronics’ WVGA (800×480) PIM (Passive Intermodulation http://en.wikipedia.org/wiki/Intermodulation#Passive_Intermodulation) displays …. By adding Qt to MQX, you can: develop Qt-based applications for MQX; begin with the latest prebuilt, prevalidated, preintegrated Qt version, ready for your first deployment on one or more hardware platforms (you don’t need to build Qt); add splash screens with the world’s fastest animations; deploy Qt applications to your embedded devices automatically; leverage hardware optimizations and future-proof your hardware platforms. Motomic also lets you add media to MQX, for example advertisements or instruction videos. You can add social networking, games and browser functionality to your applications and products. Motomic helps you distribute your Qt application across networks.
Development for the IoT is also being boosted by the Embedded Software Store. Motomic’s browsers and hundreds of other components for developing embedded software are accessible. Pre-built components allow solutions to be assembled more rapidly and with lower project risk. Complex systems can now be built rapidly by adding pre-built components.
Innovative solutions like the Embedded Software Store (source of pre-built components for embedded developers), Motomic’s browsers, and ARM’s range of processors are allowing the creativity of developers to envision and build highly innovative solutions for the Internet of Things.
ARM Embedded Software 2.0 [chipestimate YouTube channel, June 19, 2012]Will Tu, Director of Business Development at ARM. IP Talks speaker with ChipEstimate.com at DAC 2012 in San Francisco.
– Advances in technology create new problems for today’s embedded developers [Will Tu on Software Enablement blog of ARM, Oct 12, 2011]
– Solving the Challenge of Software Complexity for Today’s Embedded Developer [Will Tu on Software Enablement blog of ARM, Oct 26, 2011]
– Avnet Electronics Marketing and ARM Launch Embedded Software Store [ARM press release, Oct 26, 2011]
… Users can choose from a broad array of reputable embedded software vendors, including ARM, CMX Systems, Inc., DSP Concepts, Micrium, Motomic, YaSSL, and others. New software vendors are invited to join the initiative on an ongoing basis. The site also offers a quick download delivery system and preview of all license agreements in advance of purchase. Users are encouraged to participate in the Embedded Software Store’s online community to create a strong ecosystem of software support for ARM technology. … The site is fully operational and accessible at www.embeddedsoftwarestore.com …
Kinetis L Series & Energy Efficiency: FTF Keynote Demo [freescale YouTube channel, July 31, 2012]
Freescale Debuts Kinetis L Series, World’s Most Energy-Efficient Microcontrollers [Freescale press release, Jun 19, 2012]
Freescale Semiconductor (NYSE: FSL) is now offering alpha samples of its Kinetis L series, the industry’s first microcontrollers (MCUs) built on the ARM® Cortex™-M0+ processor. Kinetis L series devices are on display this week at the Freescale Technology Forum (FTF) Americas and were demonstrated during the event’s opening keynote address.
As machine-to-machine communication expands and network connectivity becomes ubiquitous, many of today’s standalone, entry-level applications will require more intelligence and functionality. With the Kinetis L series, Freescale provides the ideal opportunity for users of legacy 8- and 16-bit architectures to migrate to 32-bit platforms and bring additional intelligence to everyday devices without increasing power consumption and cost or sacrificing space. Applications, such as small appliances, gaming accessories, portable medical systems, audio systems, smart meters, lighting and power control, can now leverage 32-bit capabilities and the scalability needed to expand future product lines – all at 8- and 16-bit price and power consumption levels.
The ARM Cortex-M0+ processor consumes approximately one-third of the energy of any 8- or 16-bit processor available today, while delivering between two and 40 times more performance. The Kinetis L series supplements the energy efficiency of the core with the latest in low-power MCU platform design, operating modes and energy-saving peripherals. The result is an MCU that consumes just 50 µA/MHz* in very-low-power run (VLPR) mode and can rapidly wake from a reduced power state, process data and return to sleep, extending application battery life. These advantages are demonstrated in the FTF demo, which compares the energy-efficiency characteristics of the Kinetis L series against solutions from Freescale competitors in a CoreMark benchmark analysis.
*Typical current at 25°C, 3V supply, for Very Low Power Run at 4MHz core frequency, 1MHz bus frequency running code from flash with all peripherals off.
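The footnote conditions make it possible to turn the per-MHz figure into an absolute current and power. The sketch below is a back-of-the-envelope calculation: the function names are assumptions, and the sleep current and duty cycle in the last step are hypothetical values chosen only to illustrate the wake, process and sleep pattern described above.

```python
def run_current_ua(ua_per_mhz, core_mhz):
    """Run-mode supply current in microamps."""
    return ua_per_mhz * core_mhz


def power_uw(current_ua, supply_v):
    """Power in microwatts drawn at the given supply voltage."""
    return current_ua * supply_v


def average_current_ua(run_ua, sleep_ua, duty):
    """Average current when the MCU runs for a fraction `duty` of the time."""
    return duty * run_ua + (1.0 - duty) * sleep_ua


# 50 uA/MHz, 4 MHz core clock and 3 V supply come from the press release footnote.
vlpr_ua = run_current_ua(50, 4)
print(vlpr_ua, "uA in VLPR at 4 MHz")
print(power_uw(vlpr_ua, 3.0), "uW at a 3 V supply")

# Hypothetical values: a 2 uA sleep current and a 1% duty cycle are NOT from the source.
print(average_current_ua(vlpr_ua, 2.0, 0.01), "uA average")
```

With a low duty cycle the average current is dominated by the sleep current, which is why fast wake-up and return to sleep matter as much as the run-mode figure.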
Features common to the Kinetis L series families include:
48 MHz ARM Cortex-M0+ core
High-speed 12/16-bit analog-to-digital converters
12-bit digital-to-analog converters
High-speed analog comparators
Low-power touch sensing with wake-up on touch from reduced power states
Powerful timers for a broad range of applications including motor control
The first three Kinetis L series families:
Kinetis L0 family – the entry point into the Kinetis L series. Includes eight to 32 KB of flash memory and ultra-small 4mm x 4mm QFN packages. Pin-compatible with the Freescale 8-bit S08P family. Software- and tool-compatible with all other Kinetis L series families.
Kinetis L1 family – with 32 to 256 KB of flash memory and additional communications and analog peripheral options. Compatible with the Kinetis K10 family.
Kinetis L2 family – adds USB 2.0 full-speed host/device/OTG. Compatible with the Kinetis K20 family.
The Kinetis L series is pin- and software-compatible with the Kinetis K series (built on the ARM Cortex-M4 processor), providing a migration path to DSP performance and advanced feature integration.
Availability and pricing
Kinetis L series alpha samples are available now, with broad market sample and tool availability planned for Q3. Pricing starts at a suggested resale price of 49 cents (USD) in 10,000-unit quantities. The Freescale Freedom development platform is planned for Q3 availability at a suggested resale price of $12.95 (USD).
For more information about Kinetis L series MCUs, visit www.freescale.com/Kinetis/Lseries.
Kinetis L Series MCUs Built on the ARM Cortex-M0+ Core: What is the Plus For? [freescale YouTube channel, May 4, 2012]
World’s Most Energy-efficient Processor From ARM Targets Low-Cost MCU, Sensor and Control Markets [ARM press release, March 13, 2012]
ARM today announced the ARM® Cortex™-M0+ processor, the world’s most energy-efficient microprocessor. The Cortex-M0+ processor has been optimized to deliver ultra low-power, low-cost MCUs for intelligent sensors and smart control systems in a broad range of applications including home appliances, white goods, medical monitoring, metering, lighting and power and motor control devices.
The 32-bit Cortex-M0+ processor, the latest addition to the ARM Cortex processor family, consumes just 9µA/MHz on a low-cost 90nm LP process, around one third of the energy of any 8- or 16-bit processor available today, while delivering significantly higher performance.
“The Internet of Things will change the world as we know it, improving energy efficiency, safety, and convenience,” said Tom R. Halfhill, a senior analyst with The Linley Group and senior editor of Microprocessor Report. “Ubiquitous network connectivity is useful for almost everything – from adaptive room lighting and online video gaming to smart sensors and motor control. But it requires extremely low-cost, low-power processors that still can deliver good performance. The ARM Cortex-M0+ processor brings 32-bit horsepower to flyweight chips, and it will be suitable for a broad range of industrial and consumer applications.”
The new processor builds on the successful low-power and silicon-proven Cortex-M0 processor which has been licensed more than 50 times by leading silicon vendors, and has been redesigned from the ground up to add a number of significant new features. These include single-cycle IO to speed access to GPIO and peripherals, improved debug and trace capability and a 2-stage pipeline to reduce the number of cycles per instruction (CPI) and improve Flash accesses, further reducing power consumption.
The Cortex-M0+ processor takes advantage of the same easy-to-use, C friendly programmer’s model, and is binary compatible with existing Cortex-M0 processor tools and RTOS. Along with all Cortex-M series processors it enjoys full support from the ARM Cortex-M ecosystem and software compatibility enables simple migration to the higher-performance Cortex-M3 and Cortex-M4 processors.
Early licensees of the Cortex-M0+ processor include Freescale and NXP Semiconductor. … The Cortex-M0+ processor is ideally suited for implementation with the Artisan® 7-track SC7 Ultra High Density Standard Cell Library and Power Management Kit (PMK) to fully capitalize on the ground-breaking low power features of the processor.
The Cortex-M0+ processor is fully supported from launch by the ARM Keil™ Microcontroller Development Kit, which integrates the ARM compilation tools with the Keil µVision IDE and debugger. Widely acknowledged as the world’s most popular development environment for microcontrollers, MDK together with the ULINK family of debug adapters now supports the new trace features available in the Cortex-M0+ processor. By utilizing these tools, ARM Partners can take advantage of a tightly coupled application development environment to rapidly realize the performance and ultra low-power features of the Cortex-M0+ processor.
The processor is also supported by third-party tool and RTOS vendors including CodeSourcery, Code Red, Express Logic, IAR Systems, Mentor Graphics, Micrium and SEGGER.
Module 1: Kinetis-L Introduction and Overview of Features [AvnetEMA YouTube channel, Aug 3, 2012]
– ARM Cortex-M0+: More than a low-power processor [Thomas Ensergueix on Embedded ARM blog, June 19, 2012]: “The Cortex-M0 MCU was quite unique when launched in 2009, offering a subtle mix of low-power, 32-bit performance and optimized code size, all of this packed in a very low gate count processor. … The new implementation of the very same ARMv6-M architecture with a 2-stage pipeline in Cortex-M0+ has given us 9% more performance while reducing the power consumption by around 30%.”
– Introducing the ARM Cortex-M0+ processor: The Ultimate in Low Power [ARM whitepaper by Joseph Liu, May 4, 2012]
– ARM Cortex-M0+ Takes Flight on the Wings of Freescale’s Kinetis L Series [Danny Basler from Freescale as a guest partner blogger on Embedded ARM blog, March 14, 2012]
– FTF 2012 and Everything ARM [Drew Barbier on ARM Embedded blog, Aug 1, 2012]
– The Freedom Board [Erich Styger on Software Meets Silicon blog of Freescale, July 27, 2012]: “… my Freescale Kinetis L series Freedom board arrived. … The board will be available at Element 14/Farnell. It is expected to be publicly available by the end of September 2012, and you can pre-order now. The United States Element 14 site will have the board available for a suggested resale price of $12.95 (USD). In Europe it will be about 10 Euro. …”
– Freescale ARM technology powerhouse in action [The Embedded Beat (all posts) blog of Freescale, June 19, 2012]: “Freescale has become an ARM technology powerhouse, offering the most unique and massively broad portfolio on the market today. It starts with our Kinetis portfolio, and the new Kinetis L series based on the ARM Cortex™-M0+ core, extends to the new Vybrid controller solutions [featuring a unique dual core ARM Cortex-A5 + Cortex-M4 architecture that handles both MCU and MPU tasks on a single chip] that enable rich apps in real time, and stretches to the ultimate multimedia and display solution – the scalable i.MX 6 series [based on the ARM® Cortex™-A9 architecture].”
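The Cortex-M0+ figures quoted from Thomas Ensergueix's blog post above (9% more performance, around 30% less power than the Cortex-M0) can be combined into a single performance-per-watt ratio. The sketch below does that arithmetic; it assumes both figures refer to the same workload and clock frequency, which the quoted excerpt does not state explicitly.

```python
def perf_per_watt_gain(perf_gain, power_reduction):
    """Relative performance per watt versus the baseline (1.0 = unchanged).

    perf_gain:       fractional performance increase (0.09 for +9%)
    power_reduction: fractional power decrease (0.30 for -30%)
    """
    return (1.0 + perf_gain) / (1.0 - power_reduction)


# Figures from the quoted blog post; treating them as simultaneous is an assumption.
gain = perf_per_watt_gain(perf_gain=0.09, power_reduction=0.30)
print(f"Cortex-M0+ vs Cortex-M0 performance per watt: {gain:.2f}x")
```

Under that assumption the two modest-looking numbers compound to roughly a 1.56x improvement in work done per unit of energy.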
Continuing with the ARM Holdings Management Discusses Q2 2012 Results – Earnings Call Transcript [Seeking Alpha, July 25, 2012]
We now have nearly 900 licenses, and so that continues to grow: the pool of licenses that are out there to generate royalties for the future. If I look at just the quarter on its own, 23 licenses in total, a collection of Cortex-A licenses, including our 12th big.LITTLE licensee. So we’ve now got 12 partners signed up for big.LITTLE. At the other end of the scale, the microcontroller end, I was just talking about the Internet of Things, yes, more licensing of our Cortex-M products.
And our new architecture, the v8 architecture, the 64-bit stuff, we’ve now got 9 v8 licensees, including the latest architecture licensee. And we’ve got this rather ill-defined horizontal axis of time going along the slide here. We are at the stage where we’ve done a lot of lead licensing now. We are approaching the first silicon, the product launch type phase, and so the 64-bit program is on track. And the interesting thing about our 64-bit architecture is that it is not just about high-end computing and servers; people are actually talking about using it in mobile as well, and talking about using it in infrastructure applications, some of the networking applications that I talked about a moment or 2 ago.
ARM in Networking and Servers
- Leading networking companies choosing ARM processor technology
– Another v8 architecture licensee for intelligent networking applications
– Freescale announced their first ARM-based chip for infrastructure applications
– HiSilicon, LSI, TI and Xilinx have already announced ARM-based chips for networking
… these smartphones, computers and everything, they communicate, and that communication means that they’re getting data from somewhere or they’re sending data somewhere. They’re sending it over some data handling infrastructure. And the explosion in smartphones and more mobile computing and the prevalence of the Internet is generating much more data. Some studies suggest as much as 20x as much data over the sort of 10-year period from 2010 to 2020. And clearly, if that data is handled with the existing architecture, it’s going to consume 20x as much power, which is not a very sustainable situation. If you look at all the electricity generated in the world, then IT equipment accounts for about 10% of it, and if that is going to increase by a factor of 20, then we’re going to have to build a lot more power stations. So that isn’t going to happen. People are going to look for more power efficient ways of designing this stuff, and here is the opportunity for ARM in networking. And so you see, as I mentioned a moment ago, a new v8 architecture licensee engaged in ARM in networking.
Freescale, I wasn’t there, Freescale technology forum a few weeks ago. Freescale busy announcing their extensive networking product range, switching to adopt the ARM architecture. We’ve seen similar indications from HiSilicon, LSI, TI, Xilinx and so on. Everybody is realizing that in order to get more power efficient products here, then ARM is a great solution. And it’s the same power efficiency story, which is behind ARM’s activity in servers.
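The power argument made on the call can be put into numbers. The sketch below uses the two figures quoted above (IT equipment at about 10% of electricity generated, and data volume growing 20x from 2010 to 2020) and follows the speaker's premise that power scales linearly with data when the architecture is unchanged. The "allowed" power growth used in the second calculation is a hypothetical value, not something stated on the call.

```python
IT_SHARE_TODAY = 0.10   # from the call: IT is about 10% of electricity generated
DATA_GROWTH = 20.0      # from the call: about 20x more data, 2010 to 2020


def it_share_of_todays_generation(share_today, data_growth, efficiency_gain=1.0):
    """IT electricity demand as a fraction of today's total generation,
    assuming power scales linearly with data handled (the call's premise)."""
    return share_today * data_growth / efficiency_gain


def efficiency_gain_needed(data_growth, allowed_power_growth):
    """Power-efficiency improvement needed to cap IT power growth."""
    return data_growth / allowed_power_growth


unchanged = it_share_of_todays_generation(IT_SHARE_TODAY, DATA_GROWTH)
print(f"IT demand with unchanged efficiency: {unchanged:.0%} of today's generation")

# Hypothetical target: let IT power consumption no more than double.
needed = efficiency_gain_needed(DATA_GROWTH, allowed_power_growth=2.0)
print(f"Efficiency gain needed to cap growth at 2x: {needed:.0f}x")
```

With unchanged efficiency, IT alone would need twice the electricity the world generates today, which is the point behind "that isn't going to happen".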
- Servers bringing new opportunities
– Dell launches ARM-based server with 48 quad-core chips by Marvell
– Calxeda demonstrated 15x power/performance improvement
– Canonical announces server grade software for ARM-based chips
ARM Holdings Management Discusses Q2 2012 Results – Earnings Call Transcript, Question-and-Answer Session [Seeking Alpha, July 25, 2012]
Unknown Analyst … you’ve been talking about 64-bits sort of v8 architecture taping out relatively soon. Maybe you could — if you could give us a bit more details on what type of products would come on the market in the next 12 months for these 64-bit, if it’s only servers and other things.
… On the second question, about 64-bits, then as I said in the presentation, it’s being used across a range of different applications, including mobile and computing. Servers is a very visible application area, where as we’ve said before, our penetration in the server market is limited until such time as we deploy 64-bit solutions. And I think it’s well known that one of our early 64-bit architecture licensees is targeting server applications and so probably, you’ll see that Silicon fairly early on. If we move along and move back.
Unknown Analyst I think, Calxeda provided some interesting milestones this quarter in terms of the server progress. I’m just wondering, whether you can talk to how you feel the progress is going there in terms of actual sort of processing. Secondly, I just wondered whether — part of interesting slide just on the multi-core effect in the quarter, I just wondered, whether you have a sense of how much of your units shipped in mobile today is actually on quad-core based devices, versus dual-core, so the impact of quad-core presumably is still to come.
Okay. On Calxeda and the server activity, I really don’t have anything else to say. We’re very pleased with the progress. The data that’s coming out suggests that all the experiments that we did before and all the simulation that we did before is being proven in silicon. And bear in mind, this first Calxeda silicon is actually Cortex-A9 based. And so I think I said Cortex-A9 was a core we developed very much with mobile in mind. Calxeda have added System-on-Chip infrastructure to turn it into a server chip but it’s still a microprocessor core that was designed for mobile. When you put that server infrastructure around a microprocessor core that’s been a bit more designed with server applications in mind, like for instance, Cortex-A15, or moving onto v8, then you’re going to see even better performance at these levels of power consumption. But we’re very pleased with the data that’s come out so far. We’re also pleased to see other ARM silicon partners starting to get a bit more public with their activity on the servers. On dual-core and quad-core, I don’t know that I can talk specifically about numbers, but I’ll just point you to shows like Mobile World Congress and CES, where what tends to happen is that you sort of have an announcement about products 1 year, and they turn into reality the next year. And we saw in the 2011 season, a load of dual-core devices being announced and they’ve now sort of materialized into phones. And it was about a year later at these shows that we saw the quad-core products announced and so we’d expect that sort of trajectory to continue. Over and above that, some people have gone a little bit further ahead with the quad-core and they’re using it as a sort of marketing tool and saying that the quad is better than dual. It’s a bit of a marketing thing. And it’s up to our semiconductor partners to see what performance they can actually achieve for a given level of power consumption.
We put it up on the slide as multi-core, and put the 2 together, because that’s really how we view it.
Kai Korschelt – Deutsche Bank AG, Research Division
… just on a like-for-like perspective, if you could remind us maybe of the potential royalty premium for a 64-bit versus 32-bit, please?
… On the 64-bit premium, or sort of royalty premium for 64-bit, I mean this is a continuation of the trend we’ve been on for a while, where, basically, if there’s more value in the microprocessor, the royalty comes through with a higher rate. And we’ve talked about Cortex-A being sort of typically in the sort of 1.5% to 2% range, compared with pre-Cortex-A being more in the sort of 1% to 1.5% range. And that trend will continue with our v8 architecture, so it’s going to be at the higher-end of that range.
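To make the quoted royalty ranges concrete, the sketch below turns them into a per-chip royalty. It assumes the percentage is applied to the chip's selling price, and the $10 chip price is a hypothetical value; neither is stated in the answer above.

```python
# Royalty-rate ranges quoted on the call, as fractions.
RATE_RANGES = {
    "pre-Cortex-A": (0.010, 0.015),
    "Cortex-A": (0.015, 0.020),
}

CHIP_PRICE_USD = 10.00  # hypothetical chip selling price


def royalty_per_chip(chip_price_usd, royalty_rate):
    """Royalty per chip, assuming the rate applies to the chip selling price."""
    return chip_price_usd * royalty_rate


for family, (low, high) in RATE_RANGES.items():
    low_usd = royalty_per_chip(CHIP_PRICE_USD, low)
    high_usd = royalty_per_chip(CHIP_PRICE_USD, high)
    print(f"{family}: ${low_usd:.2f} to ${high_usd:.2f} per ${CHIP_PRICE_USD:.2f} chip")
```

A v8 royalty "at the higher end of that range" would sit near the top Cortex-A figure under these assumptions.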
ARM Holdings Management Discusses Q2 2012 Results – Earnings Call Transcript [Seeking Alpha, July 25, 2012]
64-bit, Physical IP and FinFET
- TSMC and ARM announce collaboration to optimise ARM’s 64-bit processors and Physical IP and TSMC’s FinFET technology
– Optimization of ARM’s next generation processors and TSMC’s state of the art process technology
– Companies’ joint work will accelerate the adoption of SoC optimized FinFET technology
– Allows ARM’s and TSMC’s partners to develop market leading products for high-performance and low-power applications like mobile and enterprise
Now looking ahead to more leading-edge technologies, as I said, we had an announcement earlier this week with TSMC, and this is ARM and the biggest independent semiconductor wafer fab or foundry company in the world getting together to actually continue work that’s been ongoing together for quite a long time, in terms of optimizing their process technology, working with our physical IP division to optimize our physical IP on their new FinFET process, and using our new 64-bit processor as a vehicle for that development. So it’s world leading companies getting together to work from transistors right through to microprocessors to enable our joint partners to produce world leading products.
ARM Holdings Management Discusses Q2 2012 Results – Earnings Call Transcript, Question-and-Answer Session [Seeking Alpha, July 25, 2012]
… So on the FinFETs with TSMC, can you give us, maybe a bit more comments about this? How do you think it compares with Intel 3D, or whatever they call it? And how involved your PIPD team is involved trying [ph] to transistors characteristics, absorbs transistors? And also, I think the timing has been brought forward by 1 year, I think. So that’s the first question. …
Dealing with the FinFETs first. A year or so ago, when Intel took technology, we said yes. So this is something which has been around in the semiconductor industry for the last decade or more. It’s one of the ways of making transistors more efficient, but it comes with a load of associated challenges in actually making this stuff and making it yield, and that sort of holds back the semiconductor industry from taking that step. Intel took the step and announced that they’ve taken the step. They were the first ones over the gate, announcing that they were doing this. Of course, everybody else has been the same, researching it and playing with it for the best part of the last decade. And TSMC had their plans in place. They just were not choosing to go public on FinFET until they were choosing to go public. And we’ve been working with TSMC on their next-generation processes for some time. We always stood here and done presentations and talked about tape outs on 20nm, the first ARM tape out on 20nm was well over a year ago. We’ve taped out first 40nm designs already with some of these players and it’s R&D activity. As and when the foundry wants to make some of these things public, then they will, and that’s what TSMC have chosen to do this week. And they chose to, I guess, communicate particularly with their customers who are ARM partners by saying, “Not only are we doing some process development in the back room, but we’re also thinking about how you’re going to take this technology to market, the sort of products you’re going to build with it. You’re probably going to build ARM-based products with it, and so we’ve been working with ARM and ARM’s physical IP division to make sure that their physical IP, their microprocessors and our semiconductor process technology work well together. And that’s all there is to it.”
Janardan Menon – Liberum Capital Limited, Research Division
Two questions. One is on the FinFET agreement with TSMC, it’s on 64-bit. So I’m just wondering what plans you have on moving the 32-bit, Cortex-A15 kind of products to FinFET? Do you have another agreement with them which we don’t know about and will the timing of the introduction of that be roughly the same as the 64-bit signed? …
Okay. Well, let’s answer the first one. The FinFETs, yes, the announcement is, with our 64-bit processor because just as we want to work with TSMC’s most advanced process technology, they want to work with our most advanced microprocessor, making a 20nm FinFET and later, a 16nm FinFET implementation so that our 32-bit processors will form naturally out of that development activity. We’re optimizing our physical IP to build microprocessors. We just happen to be using our new 64-bit processor as the vehicle for it. The same physical IP will be very easily used to implement our 32-bit processors.
Janardan Menon – Liberum Capital Limited, Research Division
And with your — as part of the timescale of introductions, is that a 2014 introduction or is it ’15?
Well, we have to stick with the announcements for now. And I think as and when TSMC want to make more comments on when these things are available, then they’ll make more comments. As I said, from a development point of view, we’re taping out stuff all the time. …
Sumant Wahi – Redburn Partners LLP, Research Division
… The second question has to do with the FinFET again. Most of the foundries are sort of offering different node transitions, and in between, I assume, a FinFET would be an option between 20nm and probably 16nm. So my question really was that, would you be licensing FinFET technologies separately as well, or is this an exclusive collaboration with TSMC? And then is there a royalty increase coming from products based on FinFET, PIPD, so to speak? …
Okay. Next question was about FinFET and whether it’s essentially a different physical IP product from ARM. And the answer is, well, it’s a different flavor. We have different flavors of our physical IP for each semiconductor process. And so a low-power version of a given node is a different physical IP bundle than a high-profile version. And the FinFET is another flavor again. So it would be an incremental licensing opportunity. But the fact that our physical IP is used, would generate the royalty opportunity. But it’s not an incremental royalty opportunity. The fact that it’s FinFET, it’s just another flavor. So if we’re going to have a 20nm low-power planar flavor and the FinFET flavor, and the chips are going to be made out of one process technology, and so the royalty opportunity is the same. …
ARM and TSMC Collaborate to Optimize Next-Generation 64-bit ARM Processors for FinFET Process Technology [ARM press release, July 23, 2012]
TSMC (TWSE: 2330, NYSE: TSM) and ARM today announced a multi-year agreement extending their collaboration beyond 20-nanometer (nm) technology to deliver ARM processors on FinFET transistors, enabling the fabless industry to extend its market leadership in application processors. The collaboration will optimize the next generation of 64-bit ARM® processors based on the ARMv8 architecture, ARM Artisan® physical intellectual property (IP), and TSMC’s FinFET process technology for use in mobile and enterprise markets that require both high performance and energy efficiency.
… The ARMv8 architecture extends ARM low-power leadership with a new energy-efficient 64-bit execution state to meet the performance demands of high-end mobile, enterprise and server applications. The 64-bit architecture has been designed specifically to enable energy-efficient implementations. Similarly, the 64-bit memory addressing and high-end performance are necessary to enable enterprise computing and network infrastructure that are fundamental for the mobile and cloud-computing markets.
TSMC’s FinFET process promises impressive speed and power improvements as well as leakage reduction. All of these advantages overcome challenges that have become critical barriers to further scaling of advanced SoC technology. ARM processors and physical IP will be able to leverage these attributes to maintain market leadership, while the companies’ mutual customers can benefit from these improvements for their new, innovative SoC designs. …
ARM and TSMC Sign Long-Term Strategic Agreement [ARM press release, July 20, 2010]
ARM and Taiwan Semiconductor Manufacturing Company, Ltd. (TWSE: 2330, NYSE: TSM) today jointly announced a long-term agreement that provides TSMC with access to a broad range of ARM processors and enables the development of ARM physical IP across TSMC technology nodes. This agreement supports the companies’ mutual customers to achieve optimized Systems-On-Chip (SoC) based on ARM processors and covers a wide range of process nodes extending down to 20nm. …
ARM and TSMC Tape Out First 20nm ARM Cortex-A15 Multicore Processor [ARM press release, Oct 18, 2011]
ARM and TSMC (TWSE: 2330, NYSE: TSM) today announced that they have taped out the first 20nm ARM® Cortex™-A15 MPCore™ processor. The two companies completed the implementation from RTL to tape out in six months using TSMC’s Open Innovation Platform® (OIP) 20nm design ecosystem.
Building on this tape out, ARM will optimize its physical IP technology to specific TSMC 20nm process technologies for Power, Performance and Area (PPA), driving the specification of the Cortex-A15 Processor Optimization Pack (POP). TSMC’s 20nm process provides more than a 2X performance increase over preceding generations.
FINFET: Has its time finally come for a sub-20nm 3D device? [Jean Luc Pelloie, Fellow, Director of SOI Technology, on the ARM SoC Design blog of ARM, Dec 21, 2011]
… As we move to 20nm and beyond process technology, Fin-FET design may earn its place as the technology path of the future. … Fin-FET or tri-gate may be implemented on either bulk or SOI wafers. … There is still work to be done, i.e. variability is expected to be different between SOI and bulk versions and needs to be quantified; … However, 3D devices are clearly on the road for sub-20nm nodes…and Fin-FET’s time may finally be here.
Firms Rethink Fabless-Foundry Model [SemiMD (Semiconductor Manufacturing and Design), July 31, 2012]
TSMC, for one, plans to accelerate its finFET efforts. Originally, TSMC planned to introduce finFETs at 14nm by late 2014. Now, the company has no plans to brand its finFETs at 14nm, but rather it will introduce the technology at 16nm. TSMC’s finFET “risk production” is slated for the end of 2013 or early 2014, with production scheduled for the second half of 2015, Chang said.
Taiwan Semiconductor’s CEO Discusses Q2 2012 Results – Earnings Call Transcript [Seeking Alpha, July 19, 2012]
… our 20 nanometer SoC, we believe, is fully competitive with industry leaders, other companies’ 22 nanometer for the served available markets that we serve. For our markets, we believe our 20 SoC is fully competitive with anyone’s 20 nanometer or 22 nanometer offering.
And, one important point to make is that our 20 nanometer has the industry’s leading metal pitch of 64 nanometers. Our leading competitors have 80 nanometer metal pitch. That allows an advantage in the device’s density and die size.
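The density claim can be sanity-checked with a simple scaling estimate. The sketch below assumes an idealized, routing-limited layout in which area scales with the square of the metal pitch. That assumption is a simplification introduced here, not something TSMC states; real die size also depends on gate pitch, cell height and design style.

```python
def ideal_area_ratio(pitch_nm, reference_pitch_nm):
    """Relative area of a routing-limited block, assuming area scales with
    the square of the metal pitch (an idealization, see lead-in)."""
    return (pitch_nm / reference_pitch_nm) ** 2


# Pitches quoted on the call: 64 nm (TSMC 20 nm) versus 80 nm (competitors).
ratio = ideal_area_ratio(64.0, 80.0)
print(f"Idealized relative area at 64 nm vs 80 nm pitch: {ratio:.2f}")
print(f"Idealized relative density: {1.0 / ratio:.2f}x")
```

Under this idealization the tighter pitch gives about 0.64 of the area; actual products would be expected to land somewhere between that and parity.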
Now, as for the timing, we expect our 20 nanometer technology to be qualified by the end of this year and will be ready to support customers (inaudible) in Q1 of 2013.
Now today, last time I mentioned that we will have a FinFET product after 20 SoC. And today, I’m glad to say that we have been planning the 16 nanometer FinFET. Right after our 20 nanometer (inaudible), which is the 20 SoC, we will offer FinFET at 16 nanometer for significant active power reduction. We expect to achieve speed and density, speed and logic density levels comparable to industry’s leading players 14 nanometer FinFET.
So, we expect our 20 SoC to be competitive with competitors’ 22 nanometer or 20 nanometer products and we expect our 16 nanometer FinFET to be competitive with our competitors’ 14 nanometer FinFET products. You might ask why are we calling it 16. The only reason, in fact, until two days ago, we were undecided on whether to call it 14 or 16 FinFET. Now the only reason we decided to call it 16 FinFET is first, we want to be somewhat modest; second, we are told quite a few major customers ask the 16 FinFET, that designation and we didn’t want to confuse our customers by now switching to 14. But we expect it to be competitive with other people’s 14 nanometer offerings.
Now 16 nanometer FinFET, our 16 nanometer FinFET, is expected to deliver about 25% speed gain given the same standby power over the 20 nanometer SoC. It is expected to give 25% to 30% power reduction at the same speed and the same standby power, and for mobile products, it is expected to give 10% to 20% speed gain at the same total power. As for timing, we expect it to be about one year after 20 SoC namely it should be ready for risk production at the end of 2013 or early 2014, about one year later than the 20 SoC.
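The 16 nanometer FinFET figures quoted above can be compared on a common basis by expressing each as relative energy per operation against the 20 SoC baseline. The sketch below treats power as a single number, a simplification made here that ignores the distinction the speaker draws between active, standby and total power.

```python
def relative_energy_per_op(speed_gain=0.0, power_reduction=0.0):
    """Energy per operation relative to the baseline (1.0 = unchanged).

    Energy per operation is power divided by throughput, so a speed gain or a
    power reduction both lower it. Power is treated as one number here.
    """
    return (1.0 - power_reduction) / (1.0 + speed_gain)


# Figures quoted on the call for 16 nm FinFET versus 20 SoC.
scenarios = {
    "same speed, 25% less power": relative_energy_per_op(power_reduction=0.25),
    "same speed, 30% less power": relative_energy_per_op(power_reduction=0.30),
    "mobile, +10% speed at same power": relative_energy_per_op(speed_gain=0.10),
    "mobile, +20% speed at same power": relative_energy_per_op(speed_gain=0.20),
}

for name, value in scenarios.items():
    print(f"{name}: {value:.2f}x energy per operation")
```

Read this way, the quoted power-reduction option saves more energy per operation than taking the improvement as extra speed at the same total power.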
[from Q&A session]
… 20-SoC which is 20-nanometer will ramp in 2014. And we believe that the 16 FinFET will ramp in, perhaps the second half of 2015. …
– When sticking with a “David”: CAST Inc.
Decreasing Risk When Selecting Third-Party Semiconductor IP (49th DAC) [castcores YouTube channel, July 17, 2012]
Nikos Zervas, VP of Marketing, CAST, Inc.
The adoption of a reliable design reuse methodology, proliferation of high-quality IP products, and shake-out of the most untrustworthy IP vendors creates a situation offering a huge potential advantage to system integrators and product designers looking to jump ahead of their competition.
Instead of choosing the same big-vendor, star IP that most competitors may pick by default, smarter firms will seek out and commit to what might be technically-superior IP products from smaller vendors/partners who will offer both deeper and broader service and support.
A good example is regarding microprocessors and controllers, the heart of most systems and usually the first, most critical system design choice.
Consider a deeply embedded system that needs the power of a 32-bit processor. Much like that saying from the 1980s that when choosing PCs “nobody gets fired for buying an IBM,” choosing a processor from the leading processor company is probably the easiest, safest choice, and it’s certainly an undeniably fine product with an extremely effective ecosystem. But making this choice might mean missing an opportunity for differentiation in a competitive market where every advantage is required for success.
The IP portal sites list many 32-bit processor core options beyond the leading processor company, with Chip Estimate and Design and Reuse each returning nearly 300 results for such a search. More significantly, I count almost 30 different providers of these products. Certainly some of these vendors offer a product, support, or licensing terms—or perhaps even all three—that could give the smart designer a critical edge.
Six of these stand out as being especially popular based on my recent visits with designers in California and Asia:
- the AndesCore from Andes Technology,
- the BA22 developed by Beyond Semiconductor and available from CAST, Inc. (disclosure: I work for CAST),
- the ColdFire from IPextreme,
- the eSi-3250 from EnSilica,
- the LEON3 from Aeroflex Gaisler, and
- the MIPS 4KS and others from MIPS Technologies.
How can you determine if options like these have sufficient benefits to outweigh the risk of not going with the leading processor company? Comparisons can be tricky, but there are a few key points to start with.
The technical suitability and potential advantages of course depend on the detailed needs of your system. A good IP sales team will help you articulate the relevant characteristics of your project and make sure their product will work well before selling it to you.
Quick comparisons of the performance and operating characteristics are made easier through the publication of well-accepted power consumption and speed measures, like the CoreMark performance and CSiBE code density standards. Be sure, however, to look deeper to fully understand the specific configuration and technology details behind each vendor’s figures compared to that of your own target system.
Ecosystems for programming and system development aids are a hot processor marketing topic. Be sure that the basics are covered: effective software programming tools such as the GNU tool chain, JTAG debugging, and ports of the RTOS or OS you want to use. A graphical IDE, support from tool vendors like Keil or Lauterbach, and eval/dev board kits are extras that can help further accelerate development.
Licensing terms and actual costs can vary dramatically. For example, some vendors rely on royalty streams for their profits, while others have simpler up-front licensing fees with no royalties. What’s best for you depends on your specific product and market plans.
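One way to weigh a royalty-bearing license against an up-front-only license is to find the production volume at which their total costs are equal. The sketch below does that; every fee and royalty figure in it is hypothetical and chosen only to illustrate the calculation, not taken from any vendor's terms.

```python
def total_cost_upfront_only(upfront_fee_usd, units):
    """Total license cost when there is a single fee and no royalties."""
    return upfront_fee_usd


def total_cost_with_royalty(license_fee_usd, royalty_per_unit_usd, units):
    """Total license cost for a lower fee plus a per-unit royalty."""
    return license_fee_usd + royalty_per_unit_usd * units


def break_even_units(upfront_fee_usd, license_fee_usd, royalty_per_unit_usd):
    """Volume at which both models cost the same."""
    return (upfront_fee_usd - license_fee_usd) / royalty_per_unit_usd


# Hypothetical terms for illustration only.
UPFRONT_ONLY_FEE = 500_000.0
ROYALTY_MODEL_FEE = 100_000.0
ROYALTY_PER_UNIT = 0.25

volume = break_even_units(UPFRONT_ONLY_FEE, ROYALTY_MODEL_FEE, ROYALTY_PER_UNIT)
print(f"Break-even volume: {volume:,.0f} units")
```

Below the break-even volume the royalty model is cheaper, and above it the up-front fee wins, which is why the better choice depends on the product's expected volume.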
Finally, credibility of the processor and the vendor are both crucial. For the former, look to successful use by other customers with applications similar to your own. For the latter, look for business longevity and general reputation, backed by your own experiences with the provider’s sales and engineering people. Try to extrapolate from a vendor’s pre-sale support how effective their integration help and other technical support services will be after you purchase from them.
The examples of 32-bit processor alternatives I listed earlier all compare favorably with the leading processor company’s products in these factors; any might be the one to give you the extra technical, timeframe, or cost edge you need to make your product more competitive.
The same is true of most other areas of semiconductor IP. Now that our industry embraces the use of third-party IP, the smartest designers will get a major payback from putting up-front effort into investigating the very best IP for their specific needs, whether that initially seems like the “safe” choice or not.
(Note: all trademarks and registered trademarks mentioned here are the property of their respective owners.)
About Nikos Zervas
Nikos is the VP of Marketing for CAST, Inc. Before joining CAST in 2010, Nikos was a co-founder, chairman, and CEO of video/image SIP vendor Alma Technologies, SA [Pikermi, Greece]. He has been a member of the board for the Hellenic Silicon Industry Association since 2009, and he is a senior member of IEEE. Nikos holds BA and PhD degrees in Electrical and Computer Engineering from the University of Patras, Greece, and has published over forty papers in referenced journals and international conferences.
AndesCore™ from Andes Technology (founded in Taiwan in 2005) with AndeStar™ ISA:
AndeStar is a patent-pending 16-bit/32-bit mixed-length instruction set to achieve optimal system performance, code density, and power efficiency.
Our extensive collection of ColdFire IP gives you the flexibility to choose the best solution for your cost/performance requirements while benefiting from the huge ecosystem of development resources available for the ColdFire architecture. Deployed in over 500 million devices worldwide, ColdFire is one of the world’s most widely-used 32-bit processor architectures. And the modern implementations of the ColdFire architecture, proven in devices from Freescale Semiconductor and available as synthesizable IP from IPextreme, provide performance and reliability that rival any similarly featured 32-bit processor IP.
All ColdFire cores feature a variable-length RISC architecture for compact code and are supported by an extensive collection of development systems, tools, libraries, and operating systems from Freescale and several third-party commercial and open-source providers.
Beyond BA22 Processor [Beyond Semiconductor web page, Dec 17, 2007] from privately held Slovenian fabless semiconductor IP company Beyond Semiconductor sold, supported, and built within platforms by CAST Inc. worldwide:
Beyond BA22 Processor is the first implementation of Beyond BA2 Architecture processor. Its main design goal was to minimize code size, gate and flip-flop count while obtaining similar performance as Beyond BA12 processor. The processor is extremely configurable, allowing for a variety of size/performance trade-offs.
Note: more Beyond BA22 related information is given later on as part of the CAST-related information
eSi-3250 – 32-bit, high-performance CPU [EnSilica (UK) web page, Oct 11, 2009]
EnSilica’s eSi-3250 CPU IP core is a high-performance processor ideal for integration into ASIC and/or FPGA designs with off-chip memories. The eSi-3250 is suited to a wide range of applications including running complex operating systems such as Linux.
For applications that do not require off-chip memory, the smaller eSi-3200 is available. For even simpler applications that do not require 32-bit performance or more than 64kB of memory, the eSi-1600 16-bit processor can be used. All of the eSi-RISC processors RTL and toolchains share a common code base, resulting in an easy migration path for both software and hardware developers, should the demands of an application change.
LEON3 Processor [Aeroflex Gaisler (Sweden’s Gaisler acquired by US based Aeroflex) webpage, March 28, 2005]
The LEON3 is a synthesisable VHDL model of a 32-bit processor compliant with the SPARC V8 architecture. The model is highly configurable, and particularly suitable for system-on-a-chip (SOC) designs. The full source code is available under the GNU GPL license, allowing free and unlimited use for research and education. LEON3 is also available under a low-cost commercial license, allowing it to be used in any commercial application at a fraction of the cost of comparable IP cores.
MIPS32® 4KS™ Family [MIPS web page, Feb 28, 2003]
The MIPS32® 4KSd™ secure data core is a high-performance processor that meets the needs of emerging secure data applications and the stringent power, security and size requirements for smart cards. This core has the performance required to implement software programmable cryptography without the need of a coprocessor, reducing SoC size and power consumption. The 4KSd core is the most secure, licensable, 32-bit processor available.
End of additional information
ChipEstimate.com DAC 2012 IP Talks presenter Nikos Zervas [chipestimate YouTube channel, June 21, 2012]
Meet Our New VP of Marketing [IP Notes from CAST, Inc., Sept 9, 2010]
We’re very pleased to announce our new Vice President of Marketing, Nikos D. Zervas.
Why did you join CAST?
CAST has an industry reputation for being an IP vendor customers can really trust, with solid products and great support. Solving difficult technical challenges still excites me, of course, but my nine years working alongside CAST have shown me that having a passionate drive to help customers then earning the satisfaction of seeing those customers succeed can be just as rewarding.
When the opportunity arose to join the impressive team at CAST, help grow the company, and further the ideal of easier design through IP, it seemed like the right time in my career for just such a move.
What trends do you see for the IP market over the next year?
Design reuse has become accepted for reducing risk and minimizing time to market. With this acceptance, and the fast-increasing rates of design complexity growth and design cycle shrinkage, I believe designers will move beyond specific functional cores to seek broader IP systems and complete solutions, like CAST’s recent H.264 Reference Design System. I think CAST is well positioned to supply this need, and that I can help them succeed with this next stage of growth.
Fast JPEG Encoder Core from CAST Used in Fastec TS3 High-Speed Camera [CAST press release, March 6, 2012]
Fastec Imaging Corporation has incorporated a JPEG Encoder IP Core from CAST, Inc. in its groundbreaking TS3™ line of handheld, high-speed digital cameras.
Sourced from long-time CAST partner Alma Technologies SA, the JPEG-E Encoder Core is one of the fastest-available baseline JPEG compression cores. This enables extremely competitive functionality for Fastec’s TS3 high-speed digital cameras, including capture of 1280 x 1024 pixel images at 500 frames per second, or 800 x 600 at 1,250 fps.
“The quality of the core plus CAST’s determination to see us succeed were both instrumental in bringing our groundbreaking handheld high-speed camera, the TS3, to market on time and on spec.,” said Bob Sefton, principal FPGA design engineer at Fastec. “The JPEG encoder’s features and excellent performance were as specified, and the system integration was so easy I didn’t need CAST’s technical support services.”
The encoder core supports the Baseline Sequential DCT mode of the JPEG standard and is suitable for still-image or motion-JPEG capture. This third-generation core offers very fast JPEG compression—up to 750 MSamples/sec in a 65nm technology—yet is compact enough to fit low-cost FPGA devices.
A bit-rate control option further benefits bandwidth-limited applications. “We envisioned demanding customer applications like Fastec’s when designing the JPEG encoder,” said Spyros Theoharis, vice president of products and technology at Alma Technologies. “It’s exciting to see yet another customer release of such a remarkable product using our technology and CAST’s support.”
The JPEG-E core is part of a comprehensive family of image and video IP cores offered by CAST.
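As a rough sanity check on the figures quoted in this press release, the sketch below computes the raw pixel rate of the two TS3 capture modes and compares it with the 750 MSamples/sec peak quoted for the core in a 65nm technology. It assumes one sample per pixel (for example a monochrome or raw sensor stream), which the press release does not state. The camera uses an FPGA, so the throughput actually achieved there may be lower than the 65nm figure. All names in the code are illustrative only.

```python
# Back-of-the-envelope check of the capture modes quoted in the press release.
# Assumption (not from the source): one sample per pixel.

CORE_PEAK_SAMPLES_PER_SEC = 750e6  # quoted peak in a 65nm technology


def pixel_rate(width, height, fps):
    """Raw pixels per second for a given resolution and frame rate."""
    return width * height * fps


modes = {
    "1280x1024 @ 500 fps": (1280, 1024, 500),
    "800x600 @ 1250 fps": (800, 600, 1250),
}

rates = {name: pixel_rate(*mode) for name, mode in modes.items()}

for name, rate in rates.items():
    share_of_peak = rate / CORE_PEAK_SAMPLES_PER_SEC
    print(f"{name}: {rate / 1e6:.2f} Mpixel/s "
          f"({share_of_peak:.0%} of the quoted 65nm peak)")
```

Under that assumption both modes stay below the quoted peak, which is consistent with the claim that the core enables these capture rates.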
The New Handheld TS3 100 High-Speed Camera [Fastec Imaging press release, July 10, 2012]
Fastec Imaging, a leading global manufacturer of digital high-speed video cameras has, once again, taken the high-speed imaging world by storm with the release of the revolutionary new TS3 100 handheld high-speed camera. This portable, affordable, battery operated camera puts all the power of a high end, high-speed camera, in the palm of your hand!
“We wanted to create a high-speed camera that was going to be easy to use, versatile and very portable, unlike many of the other cameras in this field,” explains Steve Ferrell, President of Fastec Imaging. “The TS3 combines the power, speed, resolution and light sensitivity of our renowned HiSpec camera line with the portability and ease of use of our previous handheld ‘point and shoot’ high speed cameras. The result is a completely portable and intuitive high-speed camera with the ease of use of a DSLR.”
The TS3 100 captures 500 frames per second (fps) at 1280 x 1024 pixels and over 20,000 fps at reduced resolutions, making it the perfect high-speed camera for broadcast, research and industrial applications. Featuring a built-in 7’’ high resolution touchscreen LCD, the TS3 allows for instant playback of footage out in the field. Combine that with an industry leading 4 hour battery, and it is easy to see why the TS3 100 is quickly becoming so popular.
Unlike any other high speed camera on the market today, the TS3 100 offers unmatched versatility. Not only is it an intuitive point-and-shoot handheld camera, but it can also be controlled over Gigabit Ethernet via a PC or Mac, or even over the Internet using a standard web browser for long distance control. The TS3 also features both USB ports and SD ports allowing users to easily download images to thumb drives, SD cards, or portable hard drives. Additionally, an optional built-in SSD (Solid State Drive) provides for up to 256GB of non-volatile internal storage. This allows for shooting all day long without having to download to a computer.
“The response to the TS3 has been overwhelming”, says Ferrell. “Its ease of use and affordability makes the TS3 one of the most accessible high-speed video cameras on the market and a perfect solution for researchers and manufacturers as well as TV and film producers.”
For more information about the TS3 and other Fastec products, visit the web site at www.fastecimaging.com.
Beyond BA22 Processor [Beyond Semiconductor webpage, Dec 17, 2007] from privately held Slovenian fabless semiconductor IP company Beyond Semiconductor:
Beyond BA22 Processor is the first implementation of Beyond BA2 Architecture processor. Its main design goal was to minimize code size, gate and flip-flop count while obtaining similar performance as Beyond BA12 processor. The processor is extremely configurable, allowing for a variety of size/performance trade-offs.
Embedded Processor Cores [Beyond Semiconductor webpage, May 7, 2007]
ARM9™, ARM11™, ARM Cortex™-A9 and ARM Thumb®-2 are registered trademarks of ARM Holdings PLC.
OpenRISC [Beyond Semiconductor webpage, Sept 1, 2007]
Product Status – Obsolete
OpenRISC was an open source hardware RISC CPU designed by Damjan Lampret, one of the contributors of OpenCores, released under the GNU Lesser General Public License. The OpenRISC OR1000 and OR1200 are no longer under active development, and are not recommended for new products.
Beyond Semiconductor can provide commercial support for OR1000 and OR1200 processors.
The Beyond BA12 Embedded Processor is an up-to-date, fully supported commercial version of OpenRISC, including many enhancements, integrated software development tool suite, development platforms and software debug tools.
CAST and Beyond Semiconductor enter 32-bit Processor Core Partnership [joint press release, June 3, 2011]
CAST to sell, support, and build platforms around the BA22 processor IP core from Beyond Semiconductor
San Diego, CA – June 3, 2011, 48th DAC – Semiconductor intellectual property (IP) provider CAST, Inc. has reached an agreement with Beyond Semiconductor by which CAST will provide Beyond Semiconductor’s BA22 processor core worldwide.
The BA22 is a fast, compact, power-saving, 32-bit RISC processor that CAST will offer without royalties. These capabilities plus easy development and integration features make the processor an excellent step up for CAST’s large base of 8-bit 8051 customers who need more processing power. In fact, the BA22’s programming code is so efficient that systems using it may require less silicon area than an 8051 with its respective code and memory.
CAST will package the affordable BA22 with peripheral controllers and other essential IP. The initial focus is on deeply embedded systems; later platforms will exploit the processor’s scalability and performance potential to support broader applications.
The platform approach gives customers a ready-to-use processor subsystem, and eases the transition to 32-bit processing for designers accustomed to similarly configured 8051 IP cores.
“The 8051 is still a good choice for many chips, but our experience with customers incorporating data-intensive functions like touch-based interfaces and high-res video makes it clear they really need a good 32-bit embedded processor,” said Bill Finch, CAST’s senior vice president for sales. “The silicon-proven BA22’s performance, tiny code footprint, and mature development tools make it a great choice for many new systems, while our 15 years of microprocessor IP experience and very attractive business model make CAST a great 32-bit processor provider.”
“CAST has a long track record as a smart, effective, customer-focused IP team that makes them a perfect match for our products,” said Matjaz Breskvar, chief executive officer of Beyond Semiconductor. “Working with them will enable us to bring highly customizable Beyond BA22 to new designers across the world while providing ease of use and excellent customer support.”
Limited availability of the BA22 from CAST begins now, with a full product roll out in the next quarter. IP integration services are also available.
Learn more by visiting http://www.cast-inc.com/beyond or emailing email@example.com. Participants in the 48th DAC in San Diego, June 5–8, are welcome to stop by CAST’s booth (2217) to see a demo and discuss the advantages of the BA22.
About Beyond Semiconductor
Beyond Semiconductor is a privately held fabless semiconductor IP company. Its comprehensive product offering features 32-bit embedded RISC/DSP processors with the highest code density in the industry. For more information, visit http://www.beyondsemi.com.
About CAST, Inc.
CAST, Inc. is a privately held company that provides semiconductor IP products and services. The company features advanced image/video processing and microcontroller IP families, plus the memory controllers, high-speed buses, peripherals, and other functions needed to build complete systems. Learn more at http://www.cast-inc.com/.
CAST IP for ASICs and FPGAs: Introduction and Overview [CAST presentation on SlideShare, July 2002], only images for certain slides are included below
BA22-AP: BA22 32-bit Application Processor [CAST datasheet, June 20, 2012]
Implements a 32-bit RISC processor for demanding embedded applications that use offchip instruction and data memories and that may need to run a real-time operating system (RTOS) or a full operating system such as Linux or Android. Part of the royalty-free BA22 family, this processor core is extremely competitive in terms of high performance and low power consumption, and has best-in-class code density.
The core has Instruction and Data Memory Management Units (MMUs) and Caches, dedicated buses for on-chip instructions and data memories, and an AMBA® AHB™ or Wishbone system bus interface. Optional floating point, divider and multiply–accumulate units benefit DSP applications. The core includes up to 32 general purpose registers (GPRs), a tick-timer (TTimer), a programmable interrupt controller (PIC), an advanced power management unit (PMU), and an optional debug unit (DBGU). Additional microcontroller peripherals may be ordered for pre-integration and delivery with the core, individually or in a complete platform. IP Integration Services are also available to help integrate any BA22 processor configuration with memory controllers, image compression, or other CAST IP cores.
The processor’s BA2 instruction set is relatively simple and extremely compact. Programming is facilitated with the included C/C++ tool chain; Eclipse IDE; architectural simulator; and ported C libraries, RTOSs, and OSs.
The BA22-AP synthesizes to 35k gates in a 90nm technology, can be clocked at more than 450MHz in a 65nm technology, and provides as many as 1.59 DMIPS/MHz. The core is delivered with a complete software development environment under Eclipse IDE, and its users get access to already ported real operating systems (Linux, Android, eCOS and uClinux) and libraries.
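To put the two datasheet figures together, the sketch below multiplies the quoted per-MHz Dhrystone score by the quoted 65nm clock rate. It assumes that the 1.59 DMIPS/MHz figure and the 450MHz clock can be reached in the same configuration, which the datasheet excerpt does not confirm, so the result is only an indicative estimate. The variable names are illustrative.

```python
# Indicative Dhrystone throughput from the two datasheet figures.
# Assumption (not from the source): both figures hold for the same configuration.

DMIPS_PER_MHZ = 1.59  # "as many as 1.59 DMIPS/MHz"
CLOCK_MHZ = 450       # "more than 450MHz in a 65nm technology"

estimated_dmips = DMIPS_PER_MHZ * CLOCK_MHZ

print(f"Estimated performance at {CLOCK_MHZ} MHz: {estimated_dmips:.1f} DMIPS")
```

With these inputs the estimate comes to roughly 715 DMIPS.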
The BA22 family of processors has been designed for easy reuse and integration, has been rigorously verified, and is production proven. Contact CAST Sales for details.
Internet, networking and telecom
Portable and wireless
Home entertainment consumer electronics
The core is available for ASICs in synthesizable HDL, and includes everything required for successful implementation:
Verilog RTL source code
Silicon-proven Reference SoC/ASIC Design
Software development tools for Cygwin on Windows and Linux, with Eclipse IDE interface
Operating systems and board support package
A reference design board running Linux and FPGA versions of the core are also available; contact CAST Sales for information.
UPDATE Aug’13: Xiaomi $130 Hongmi superphone END MediaTek MT6589 quad-core Cortex-A7 SoC with HSPA+ and TD-SCDMA is available for Android smartphones and tablets of Q1 delivery
Formerly known as MT6588 but recently renamed as MT6589. About that history see my earlier post: Boosting the MediaTek MT6575 success story with the MT6577 announcement – UPDATED with MT6588/83 coming early 2013 in Q4 2012 and 8-core MT6599 in 2013 [June 27, July 27, Sept 11-13, Sept 26, Oct 2, 2012]
Although MediaTek claims that the MT6589 is the first quad-core Cortex-A7 SoC, this is not true, as the Allwinner A31 SoC is here with products … [my other ‘USD 99 Allwinner’ blog, Dec 10, 2012]. It should also be noted that Qualcomm quad-core Cortex-A7 SoCs with Adreno 305 and 1080p coming for the high-volume global market and China [this same ‘Experiencing the Cloud’ blog, Dec 9, 2012] with customer sampling by 2Q 2013. Therefore MediaTek will have an advantage of at least several months over Qualcomm in this respect, as according to MediaTek’s press release
the first models based on this new chipset are expected to ship commercially in Q1 2013.
Updates: MediaTek quadcore Cortex-A7 superphones go higher frequency and lower end in H2 CY2013
From this Xiaomi selected the current highest-frequency MT6589T (Turbo) MediaTek quad-core SoC for its own entry-level superphone:
The list price is as low as ¥ 799 i.e. $130 in order to challenge Apple’s entry
http://www.xiaomi.com/hongmi: 红米手机 [Hongmi] Red Rice phone [July 31, 2013] as translated by Google
Xiaomi shifts into low end of mobile sector [China Daily, August 1, 2013]
The company officially offers the first batch of products on Aug 12
Chinese smartphone manufacturer Xiaomi Corp launched a sub-brand “Hongmi” (red rice) on Wednesday that targets the country’s entry-level smartphone buyers.
With rumors circulating that Apple Inc will introduce cheaper iPhones for Chinese clients in the second half, Beijing-based Xiaomi aims to beat its rival to the punch in the lower-end market.
Xiaomi released the Hongmi smartphone, priced at 799 yuan ($130), at a Beijing newsbriefing on Wednesday.
Hongmi is an Android-based device with a 4.7-inch screen, equipped with MediaTek Inc’s 1.5-gigahertz quad-core processor. The dual-card handset supports China Mobile Ltd’s second-generation (2G) and third-generation (3G) networks.
Lei Jun, founder and chief executive officer of Xiaomi, said the launch of the Hongmi signifies Xiaomi’s first attempt to explore the nation’s affordable (below 1,000 yuan) smartphone market.
“I believe the Hongmi is the best product among all 1,000-yuan smartphones” in China, Lei said. “Xiaomi does not care much about sales or shipments, but we strive to produce the finest devices” for our customers, Lei said at the event.
Since Apple is hatching a plan to slash its iPhone price and garner more Chinese buyers, some Xiaomi officials said the “birth” of Hongmi is a preparation for the looming price-cut trend.
“People will pay more attention to cheaper but capable smartphones,” one said.
Apple’s Chief Executive Officer Tim Cook was in Beijing again, said officials at China Mobile Ltd on Wednesday.
Xi Guohua, China Mobile’s chairman, met with Cook on Tuesday to discuss cooperation, said Li Jun, spokesman of China Mobile, via a text message.
Analysts said Cook might have come to China to discuss Apple’s shrinking sales.
There’s no doubt that Hongmi will open more doors for Xiaomi. Compared with the middle and high ends of the smartphone market, where Xiaomi has been operating, the entry-level market boasts many more potential buyers.
According to data from Alibaba Group Holding Ltd, China’s biggest e-commerce company by sales, 61 percent of the mobile phones sold on the Taobao marketplace and the business-to-customer platform Tmall.com were priced below 1,000 yuan. About one-fifth of the mobile phones sold cost 1,000 yuan to 2,000 yuan, while only 18 percent cost more than 2,000 yuan.
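A quick check of the price-band shares quoted in the paragraph above, taking "about one-fifth" as 20 percent (an assumption, since the article gives no exact figure for the middle band). The dictionary keys are illustrative labels only.

```python
# Price-band shares quoted from the Alibaba data.
# Assumption (not from the source): "about one-fifth" is taken as 20 percent.

shares = {
    "below 1,000 yuan": 61,
    "1,000 to 2,000 yuan": 20,
    "above 2,000 yuan": 18,
}

total = sum(shares.values())
unaccounted = 100 - total

for band, pct in shares.items():
    print(f"{band}: {pct}%")
print(f"Sum of quoted shares: {total}% (about {unaccounted}% lost to rounding)")
```

The bands add up to roughly 100 percent, so the three figures are consistent with one another once the rounding of the middle band is allowed for.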
“The entry-level smartphone market is definitely the market offering the most consumers,” said Li Yanyan, an analyst with Beijing-based research firm Analysys International.
Domestic telecom operators have actively promoted and launched market campaigns for affordable smartphones, which help raise consumer awareness, she pointed out.
Sandy Shen, an analyst with Gartner China, said the launch of the Hongmi also fills a void for Xiaomi in cooperation with the nation’s biggest mobile operator, China Mobile.
“Previously, Xiaomi partnered with both China Unicom and China Telecom, but we never heard any information about cooperation with China Mobile,” Shen said.
China Mobile, although struggling in the domestic 3G competition because it adopted a relatively inferior 3G technology, has gradually got on track to catch up with rivals in recent months.
China Mobile sold more than 59 million mobile phones for its 3G network in the first half of this year, said Ma Jingxin, vice general manager of China Mobile Terminal Co, during the same event. Ma added the figure was close to China Unicom’s 3G mobile phone shipments.
The Hongmi smartphone is available for pre-orders on Tencent Holdings Ltd’s Qzone, a social-networking platform with more than 60 million users. On Aug 12, Xiaomi will officially offer the first batch of products.
Although Chinese media have reported that Tencent was about to invest in Xiaomi, officials at Xiaomi have denied any such plan.
“Qzone is China’s biggest social-networking website and it closely aligns with Xiaomi’s targeted clients,” Li Wanqiang, vice-president of Xiaomi, said.
“Social-networking platforms are the major battlefield (for Xiaomi marketing and selling its products),” Li added.
While the MT6589 based smartphones were targeted to the mid-range market so far:
CLICK HERE to get a clickable PDF version of the above “picture document” if needed. Note that the previous hit SoC from MediaTek, the MT6577, has only slightly more, 74 devices listed in this device database since July 2012, but only 47 if taking a similar period. So with 61 devices already, the MT6589 has a much greater market success. To compare with MT6577 see Boosting the MediaTek MT6575 success story with the MT6577 announcement – UPDATED with MT6588 [later renamed 6589]/83 coming early 2013 in Q4 2012 and 8-core MT6599 in 2013 [‘Exp. the Cloud’, June 27, July 27, Sept 11-13, Sept 26, Oct 2, 2012] and MT6577-based JiaYu G3 with IPS Gorilla glass 2 screen of 4.5” etc. for $154 (factory direct) in China and $183 internationally (via LightTake) [‘Exp. the Cloud’, Sept 13, 2012].
The first product was delivered by Micromax, the leading local brand in India (now the world’s 3rd largest smartphone market) with 19.3% smartphone market share in the January-April period (24.3% in April), #2 behind Samsung (which had 40.7% share out of the 9.4 million units in total). The 5” IPS 1280×720 8MP/2MP Micromax A116 Canvas HD went on sale starting February 14 for Rs. 13,990 [$236], and 1 million unit sales were achieved by April 24. It was followed by the 5” IPS 480×854 8MP/2MP A110Q Canvas 2 plus entry model from May 22 for Rs. 12,100 [$204], and then by the 5” IPS 1280×720 13MP/5MP Canvas 4 top model from July 8 for Rs. 17,999 [$303]. Like other Micromax products, they were manufactured in China by unknown white-label vendor(s).
Other Indian brands of a similar kind followed close in Micromax’s footsteps. I will add just the next two local brands: the #3 Karbonn with 8.6% smartphone market share, and Lava International, which is aggressively targeting the smartphone market this year with $169M planned sales, 50% of the overall phone revenue plan. Note that the overall phone revenue of Karbonn in the fiscal year ended June 30 was $408M, and for the next fiscal year its plan is $675M according to Karbonn Mobiles eyeing Rs 4,000 crore turnover in FY 2014.
The 4.5” IPS 540×960 5MP/VGA S1 Titanium model from Karbonn appeared February 16 on its website for pre-booking at Rs. 10,999 [$185], then came the 5.5” IPS 1280×720 13MP/5MP S9 Titanium top model announced on July 5 (in order to gain attention before Micromax Canvas 4) at Rs. 19,990 [$337] and to be released in the 2nd week of August. Note that two other quadcore Titanium models, the S2 and the S5, are based on Snapdragon SoCs from Qualcomm, as is a rumored S6 model.
Lava International’s subsidiary Xolo started sales of its 4.5” IPS 540×960 8MP/1MP Q800 model on March 10 for Rs. 12,499 [$211], the 4.5” IPS 540×960 5MP/VGA Q700 model on May 13 for Rs. 9,999 [$169], the 5” IPS 720×1280 8MP/1.2MP Q1000 top model on May 22 for Rs. 14,990 [$253], and the latest 4.5” 854×480 5MP/VGA Q600 entry model of the Q series on July 1 for Rs. 8,499 [$143].
From an announcement point of view the first one was the Alcatel One Touch Scribe HD (announced at CES 2013 in January), but it was delivered only from March, although with subsequent rollouts worldwide. Here is the MWC 2013 presentation of it:
Building a Better Smartphone Experience: MediaTek Dual-SIM Platform
[mediateklab YouTube channel, June 23, 2013]
MediaTek: April EDM [newsletter, April 26, 2013]
New MediaTek-driven products revealed
MediaTek chipsets have found their way into several new products of late, many of which have since enjoyed widespread media coverage.
Alcatel‘s One Touch Scribe HD, which leverages MediaTek’s quad-core technology to deliver top of the line, HD720p picture was recently featured in CNET.
Lenovo meanwhile generated its own share of CNET and Engadget buzz with the release of its trio of new tablets. The company’s ten-inch IdeaPad S6000 and seven-inch A3000 were widely praised for their value proposition – Both devices, despite being priced for the mid-market, house a MediaTek 1.2 GHz quad-core processor, making them a viable contender in any tablet category. Likewise, Lenovo’s entry-level model, the A1000, is also said to punch well above its weight class.
Similarly, Micromax and BLU Products have achieved advances in their Canvas and BLU LIFE lines respectively, both of which are powered by the MT6589 processor. Both brands have garnered extensive local and international media coverage, including a glowing TECH2AUTO for Micromax. Interest for the new offerings by BLU Products has also been tracked across media such as Engadget and SlashGear.
MediaTek Powers Lenovo’s Premium Multimedia IdeaTab S6000 Tablet [and two other] [press release, Feb 25, 2013]
End of Updates
MediaTek Strengthens Global Position with World’s First Quad-Core Cortex-A7 System on a Chip – MT6589 [MediaTek press release, Dec 12, 2012]
MediaTek Inc., a leading fabless semiconductor company for wireless communications and digital multimedia solutions, announced the launch of the MT6589, the world’s first commercialized quad-core System on a Chip (SoC), available for mid to high-end Android smartphones and tablets worldwide. The new quad-core SoC integrates MediaTek’s advanced multi-mode UMTS Rel. 8/HSPA+/TD-SCDMA modem, a power-efficient quad-core Cortex™-A7 CPU subsystem from ARM, PowerVR™ Series5XT GPU from Imagination Technologies, and is delivered in 28nm process technology. As a leader in Dual-SIM technology, the MT6589 is also the world’s first HSPA+ smartphone platform supporting Dual-SIM, Dual-Active functionality to address increasing multi-SIM demand around the world. The integration of these compelling features makes the MT6589 a universal platform that delivers premium multimedia capabilities with extremely low power consumption for an outstanding user experience. It also enables handset makers to reduce time to market, simplify product development and manage product differentiation in a more cost effective way, for any market worldwide.
“The ARM Cortex™-A7 is the most power-efficient applications processor ever developed by ARM. We are pleased MediaTek is the first company to combine a quad-core Cortex-A7 and leading edge 28nm manufacturing with TrustZone® for system-level security. The MT6589 system-on-chip brings the performance and features associated with high-end mobile devices to mass-market smartphones and tablets,” said Laurence Bryant, director of mobile solutions, ARM.
The MediaTek MT6589 quad-core solution supports 1080p 30fps/30fps low-power video playback and recording, a 13MP Camera with Integrated ISP, up to FHD (1920×1080) LCD displays, and enhanced picture processing for DTV-grade image quality. In addition, the MT6589 also supports MediaTek’s “Cool 3D” suite, which includes support for stereo 3D cameras and displays, real-time 2D-to-3D conversion and an optimal 3D user interface. Leveraging MediaTek’s established 3D technologies from the DTV and Digital Home markets, this suite helps create an optimal stereo 3D display with a custom-tailored 3D interface, providing an extremely flexible platform for product differentiation.
Tony King-Smith, Vice President Marketing, Imagination Technologies, said, “Today’s smart device users have very high expectations for graphic quality and performance. The MT6589 gives Imagination a great opportunity to show the abilities of the PowerVR™ Series5XT GPU, which delivers around twice the performance of previous generation devices while maintaining the lowest possible power and silicon area. We are delighted to contribute to this impressive, highly integrated solution, which demonstrates the benefits of our ongoing close strategic relationship with MediaTek.”
The MT6589 also supports Miracast™ technology for multi-screen content sharing and pre-integrates MediaTek’s leading 4-in-1 connectivity combo, which supports 802.11n Wi-Fi, BT4.0, GPS and FM.
Jeffrey Ju, GM of the smartphone business unit at MediaTek, said, “As the world’s first quad-core SoC, the MT6589 is a strong proof point of MediaTek’s growing global presence and ambition to drive the democratization of the smartphone and reshape the mid to high-end device market. Having built a solid reputation for quality and reliability over the last 15 years, we’ve created a one-of-a-kind achievement with the MT6589 platform—marrying blazing performance and flexibility with surprising affordability and simplicity. It’s an innovative solution that accelerates product development, simplifies differentiation, and offers the best possible experience that mid to high-end smart device owners desire.”
“The demand for a Smartphone SoC that can be delivered anywhere in the world has never been greater, which is why the MT6589 is so important to our business,” said Dr. Ji-Yang Wang, COO at TCL Communications Technology/Alcatel One Touch. “As the first truly universal platform it is designed with the customer in mind to give us a crucial competitive edge. The MT6589 will make the life of our customers and partners easier, allowing them to bring the best possible experience to mid-to high-end users in multiple markets in the most timely and affordable manner, and most importantly, without compromising its performance.”
The MediaTek MT6589 is currently being incorporated into smart devices by MediaTek’s leading global customers, and the first models based on this new chipset are expected to ship commercially in Q1 2013.
For GPU related information see:
A brief history of the PowerVR Series5XT GPU family [Imagination, Nov 5, 2012]
MediaTek Launching Quad-Core MT6589 CPUs Today [Gizchina.com, Dec 10, 2012]
Although we have been hearing a lot about MT6589 powered phones already, MediaTek will only officially launch their low-cost quad-core CPU later today
MediaTek took the smartphone market by storm this year with the single-core MT6575 CPU and later the dual-core MT6577 and MT6577T processors, which have found their way into phones from local Chinese firms and larger international manufacturers.
The new quad-core MT6589 CPU, which will be launched later today in Shenzhen, will build on the company’s low-cost, high performance reputation, but could bring with it a new lower price. Rumour from earlier this year claimed the quad-core chip could cost less to manufacture than current dual-core MT6577 CPU’s, however this is not to say we are going to find $100 quad-core phones launching anytime soon.
Most companies already testing the MT6589 hope to launch higher end phones with larger screens, in an attempt to take on the higher price range Android phones from big brands such as Samsung, HTC and Sony. Typical specifications for MT6589 phones currently include 5 inch 1920 x 1080 displays, 2GB RAM, and 12-13 mega-pixel cameras.
Currently Oppo, ZTE, Huawei, Lenovo, Gionee and even Sony have confirmed that they are working on phones using the new quad-core CPU, with prices from some smaller brands expected to start at around $200.
MediaTek launches ‘world’s first’ quad-core Cortex-A7 SoC, we go hands-on (video) [engadget, Dec 11, 2012]
There’s a new player in the quad-core SoC game and it’s called the MT6589. MediaTek announced today that it’s launching the “world’s first” quad-core Cortex-A7 SoC and gave us the opportunity to take it for a spin — in prototype form, of course. The MT6589, which includes the aforementioned quad-core Cortex A7 1GHz+ CPU, also features a PowerVR Series5XT GPU, high-performance multimedia support (13MP / 3D camera, 1080p video and display, Miracast) and a built-in 42Mbps HSPA+ / TD-SCDMA-capable dual-SIM dual-active radio. By combining competitive performance with high thermal efficiency and low power consumption in an affordable package, MediaTek’s new chip is well suited for a wide-range of smartphones and tablets running Jelly Bean and beyond. The MT6589 will be available in devices starting Q1 2013. Check out the gallery and hit the break for our impressions and benchmarks plus MediaTek’s videos and PR.
We played with two devices equipped with the new chip — a generic handset with branding covered up by MediaTek stickers and an upcoming Alcatel smartphone with a 1.2GHz MT6589, five-inch HD display, 8MP 1080p camera, dual-SIM 42Mbps HSPA+ connectivity, Miracast support and a 2500mAh battery. While our hands-on time was extremely limited we managed to run some benchmarks on Alcatel’s prototype — namely Quadrant, Vellamo 2 and AnTuTu 2 / 3. As you can see in the table above, the scores are generally lower than the competition, but the results are still decent enough. Both handsets felt snappy despite neither using final software or hardware. What’s more impressive is how efficient the MT6589 appears to be in MediaTek’s videos below, both in terms of heat dissipation and power management. We’ll reserve judgement until we’re able to test a production device equipped with the company’s new quad-core Cortex-A7 SoC, but it sure looks like 2013 is going to be an interesting year in the chip business.
From the gallery:
Replaced with equivalent: MT6589 – The Coolest Quad-Core SoC Platform – Thermal Benchmark [mediateklab YouTube channel, Dec 28, 2012]
Replaced with equivalent: MT6589 – The Coolest Quad-Core SoC Platform – Low Power Benchmark [mediateklab YouTube channel, Dec 28, 2012]
Jiayu G4 redefines the thousand-yuan quad-core smartphone! [JiaYu product page, Dec 12, 2012] as translated by Google with manual edits
On December 12, 2012, immediately following the release of the MTK 6589 quad-core chip, Jiayu Mobile will launch its next-generation flagship smartphone: the Jiayu G4 redefines the thousand-yuan quad-core smartphone!
The Jiayu G4 main performance parameters are as follows:
1, CPU: MT6589 1.2GHz quad-core; GPU: SGX544
2, 4.7 inch IPS screen with a resolution of 1280×720 HD (MIPI interface), using the OGS full lamination process (single glass scheme)
3, dual-battery thickness design: a thin 1800 mAh battery and a thick 3000 mAh battery, selectable for the different needs of users.
4, gyroscope \ distance \ light \ gravity \ magnetic sensor \ dual-microphone noise reduction \ WIFI \ Bluetooth \ FM \ GPS, everything included!
5, body measurements: thin-battery version about 130 × 63.5 × 8.1 (mm), thick-battery version 130 × 63.5 × 10 (mm); a larger screen with a shorter, narrower, thinner body!
6, higher definition camera configuration, specific parameters to be announced separately.
as translated by Bing with manual edits
On December 12, 2012, released along with the MTK 6589 4-core chip, Jiayu Mobile will launch a new generation of flagship smartphone, the “Jiayu G4”, which redefines the thousand-yuan quad-core smartphone!
G4 major performance parameters are as follows:
1, CPU: MT6589 1.2GHz quad-core; GPU: SGX544
2, 4.7 inch screen, resolution 1280×720 HD (MIPI interface), using OGS laminating technology (single glass scheme)
3, dual-battery thickness design: thin 1800 mAh, thick 3000 mAh, for different needs of users.
4, gyro \ distance \ light \ gravity \ magnetic sensor \ dual-microphone noise canceling \ Bluetooth \ FM \ GPS \ WIFI, everything!
5, body measurements: thin about 130×63.5×8.1 (mm), thick 130×63.5×10 (mm), a large screen, a shorter, narrower and thinner body!
6, HD camera configuration, specific parameters to be announced separately.
Jiayu G4 Is Unveiled With MT6589 Quad-core Processor [GizmoChina, Dec 11, 2012]
Today is a big day for everyone who follows MediaTek smartphone chips, manufacturers included. Jiayu, long known for its cost-effective smartphones, took advantage of MediaTek’s release of the quad-core MT6589 to unveil its long-awaited quad-core smartphone, the Jiayu G4, on its official website.
Consistent with the hardware parameters revealed earlier, the Jiayu G4 will be equipped with the MediaTek MTK6589 quad-core processor, clocked at 1.2GHz, with a built-in PowerVR SGX 544 graphics processor. The 4.7 inch IPS screen has an HD-level resolution of 1280 * 720. There are two versions with different batteries and body sizes: the thick version measures 133 * 65 * 10 mm with a 3000 mAh battery, and the thin version measures 133 * 65 * 8.2 mm with an 1800 mAh battery. In addition, it supports the commonly used Bluetooth, FM, GPS and WIFI, has built-in gyroscope, distance, light, gravity and magnetic sensors, and supports dual-microphone noise reduction technology.
See also: MT6577-based JiaYu G3 with IPS Gorilla glass 2 screen of 4.5” etc. for $154 (factory direct) in China and $183 [on this same blog, Sept 13, 2012]