VLSI Trends

Tuesday, December 8, 2015

ADVANCED ASIC CHIP SYNTHESIS - Himanshu Bhatnagar

CHAPTER 1: ASIC DESIGN METHODOLOGY - Traditional Design Flow (Specification and RTL Coding, Dynamic Simulation, Constraints, Synthesis and Scan Insertion, Formal Verification, Static Timing Analysis using PrimeTime, Placement, Routing and Verification, Engineering Change Order), Physical Compiler Flow.

CHAPTER 2: TUTORIAL - Traditional Flow (Pre-Layout Steps, Post-Layout Steps), Physical Compiler Flow.

CHAPTER 3: BASIC CONCEPTS - Synopsys Products, Synthesis Environment (Startup Files, System Library Variables), Objects, Variables and Attributes, Finding Design Objects, Synopsys Formats, Data Organization, Design Entry, Compiler Directives (HDL, VHDL).


CHAPTER 4: SYNOPSYS TECHNOLOGY LIBRARY - Technology Libraries (Logic & Physical Library), Delay Calculation.

CHAPTER 5: PARTITIONING AND CODING STYLES - Partitioning for Synthesis, RTL, General Guidelines, Logic Inference, Order Dependency.

CHAPTER 6: CONSTRAINING DESIGNS - Environment and Constraints, Clocking Issues (Pre-Layout, Post-Layout).

CHAPTER 7: OPTIMIZING DESIGNS - Design Space Exploration, Total Negative Slack, Compilation Strategies (Top-Down Hierarchical Compile, Time-Budgeting Compile, Compile-Characterize-Write-Script-Recompile, Design Budgeting), Resolving Multiple Instances, Optimization Techniques, Flattening, Compiling the Design and Structuring, Removing Hierarchy, Optimizing Clock Networks, Optimizing for Area.

CHAPTER 8: DESIGN FOR TEST - Types of DFT, Scan Insertion, DFT Guidelines.

CHAPTER 9: LINKS TO LAYOUT & POST LAYOUT OPT - Generating Netlist for Layout, Layout (Floorplanning, Clock Tree Insertion, Transfer of Clock Tree to Design Compiler, Routing, Extraction), Post-Layout Optimization (Back Annotation and Custom Wire Loads, In-Place Optimization, Fixing Hold-Time Violations).

CHAPTER 10: PHYSICAL SYNTHESIS - Modes of Operation, Other PhyC Commands, Physical Compiler Issues, Back-End Flow.

CHAPTER 11: SDF GENERATION - SDF File Generation (Generating Pre-Layout SDF File, Generating Post-Layout SDF File, False Delay Calculation Problem).

CHAPTER 12: PRIMETIME BASICS - Introduction (Invoking PT, PrimeTime Environment, Automatic Command Conversion), Basics, PrimeTime Commands.

CHAPTER 13: STATIC TIMING ANALYSIS - Why Static Timing Analysis?, Timing Exceptions, Disabling Timing Arcs, Environment and Constraints, Pre-Layout Clock Specification, Timing Analysis, Post-Layout Clock Specification, Timing Analysis, Pre-Layout Setup and Hold-Time Analysis Report, Post-Layout Setup and Hold-Time Analysis Report, Cell Swapping, Bottleneck Analysis, Clock Gating Checks.



Friday, October 30, 2015

EDAgraffiti Paul McLellan with a foreword by Jim Hogan

Chapter 1: Semiconductor Industry – Explains all sorts of facets about the industry ranging from the costs involved in creating and running a fab, to various forms of IP like ARM, Atom, and PowerPC processors and cores, to what’s happening with the semiconductor industry in Japan.

Chapter 2: EDA Industry – Presents many interesting points of view, starting with why EDA (which is predominantly a software-based industry) has a hardware business model. Then bounces around looking at things like the corporate CAD cycle, Verilog and VHDL, Design for Manufacturing, ESL, the EDA press, and where EDA is going in the next ten years.

Chapter 3: Silicon Valley – Considers visas, green cards, China, India, patents, and the upturns and downturns in the valley.

Chapter 4: Management – Being a CEO, hiring and firing in startups, emotional engineers, strategic errors, acquisitions, interview questions, managing your boss, how long should you stay in a job, and much more.

Chapter 5: Sales – Semi equipment and EDA, hunters and farmers, $2M per sales person, channel choices, channel costs for an EDA startup, application engineers, customer support, running a sales force, and much more.

Chapter 6: Marketing – Why Intel only needs one copy, the arrogance of ESL, standards and old standards, pricing, competing with free EDA software, don’t listen to your customers, swiffering new EDA tools, creating demand in EDA, licensed to bill, barriers to entry, the second mouse gets the cheese, and much more.

Chapter 7: Presentations – The art of presentations, presentations without bullets, all-purpose EDA keynote, finger in the nose, it’s like football only with bondage, and much more.

Chapter 8: Engineering – Where is all the open source software, why is EDA so buggy, internal deployment, groundhog day, power is the new timing, multicore, process variation, CDMA tales, SaaS for EDA, and much more.

Chapter 9: Investment and Venture Capital – Venture capital for your grandmother, crushing fixed costs, technology of SOX, FPGA software, Wall Street values, royalties, why are VCs so greedy, the anti-portfolio, CEO pay, early exits, and much more.

EDAgraffiti Paul McLellan with a foreword by Jim Hogan - Click here

Thursday, October 29, 2015

How to Make the Smartphone Even Smarter?

IT industry marvels like augmented reality and artificial intelligence, which embodied the technological utopianism of 1970s and 1980s science fiction, are here now, enabled by a machine-learning technique called deep learning.

Deep learning algorithms—which date back to the 1980s—are now driving Google Now speech recognition, face recognition on Facebook, and instant language translation on Skype. However, companies like Facebook and Microsoft are using GPUs to run these algorithms, and they could move to FPGAs in a bid to gain even more processing speed.

Not surprisingly, these cutting-edge services consume an enormous amount of processing power, which is readily available in the large data centers that these companies operate. Mobile is the next frontier, where deep learning can bring unprecedented gains by processing the sensor data available from smartphones and tablets and performing tasks like speech and object recognition.


A virtual brain on the phone

And that will inevitably require moving some of the processing power to personal devices like smartphones, tablets and smartwatches. However, traditional mobile hardware made up of a CPU and GPU is computationally constrained by the large processing overhead required to run powerful artificial-intelligence algorithms.


Smartphone's New Smarts


So a new breed of processors is now emerging to bring these services to smartphones and wearable devices at much lower power. Take the CEVA-XM4, for instance, an imaging and computer vision processor IP that allows chips to see by running a deep-learning network trained to recognize gestures, faces and even emotions.

The CEVA-XM4 image processing core takes advantage of pixel overlap by reusing the same data to produce multiple outputs. That increases processing capability and reduces power consumption; moreover, it saves external memory bandwidth and frees system buses for other tasks.
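As a rough illustration of the kind of data reuse involved (a toy sketch of window overlap, not CEVA's implementation), adjacent convolution windows over an image share most of their pixels, so a vision core that keeps them on-chip avoids re-fetching them from external memory:

```python
import numpy as np

# Toy sketch: adjacent 3x3 convolution windows overlap, so a vision
# processor can reuse already-fetched pixels instead of re-reading
# them from external memory for every output pixel.
img = np.arange(25, dtype=np.float32).reshape(5, 5)

win_a = img[0:3, 0:3]   # window producing output (0, 0)
win_b = img[0:3, 1:4]   # neighbouring window, output (0, 1)

shared = np.intersect1d(win_a, win_b)
print(f"pixels per window: {win_a.size}, shared with neighbour: {shared.size}")
# -> 9 pixels per window, 6 shared: only 3 new pixels per output step.
```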

It's an intelligent vision processor for cameras, image registration, depth map generation, point cloud processing, 3D scanning and more. The CEVA-XM4 combines depth generation with vision processing and supports applications processing in multiple areas like gesture detection and eye-tracking.


Face recognition: CNN usage flow with Caffe training network

Socionext, a Japanese developer of SoC solutions, is using CEVA's imaging and vision DSP core to power its Milbeaut image processing chip for digital SLRs, surveillance, drones and other camera-enabled devices. The first chipset of the Milbeaut image processor family—MB86S27—employs the imaging DSP core's powerful vector processing engine and is aimed at next-generation camera applications such as augmented reality and video analytics.

CNN/DNN Deployment Framework

The task of building support for deep learning into chips for smartphones and tablets also requires a new breed of software tools for accelerating deep learning application deployment. The company supplying the XM4 vision processor has acknowledged this by launching the CEVA Deep Neural Network (CDNN), a software framework that provides real-time object recognition and vision analytics to harness the power of the imaging DSP core.

CEVA claims that its deep neural network framework for the XM4 image processor runs deep learning functions three times faster than the leading GPU-based solutions. Moreover, CDNN enables the XM4 vision processor to consume 30x less power while requiring 15x less memory bandwidth. Case in point: a pedestrian detection algorithm running a DNN on a 28nm chip requires less than 30mW for a 1080p video stream operating at 30fps.

It's worth noting that deep learning works in two stages. First, companies train a neural network to perform a specific task. Second, another neural network carries out the actual task. Here, the CDNN toolset includes the CEVA Network Generator, an automated technology that enables real-time classification with pre-trained networks by automatically converting them into a real-time network model.
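To make the two-stage split concrete, here is a minimal, generic sketch (plain NumPy, not the CDNN toolchain or its API) of training a tiny classifier offline and then running only the cheap forward pass with frozen weights on a device:

```python
import numpy as np

def train_offline(x, y, epochs=2000, lr=0.5):
    """Stage 1: fit a tiny logistic classifier offline (data-center side)."""
    xb = np.hstack([x, np.ones((len(x), 1))])     # append a bias column
    w = np.zeros(xb.shape[1])
    for _ in range(epochs):
        pred = 1.0 / (1.0 + np.exp(-xb @ w))      # forward pass
        w -= lr * xb.T @ (pred - y) / len(xb)     # gradient step on log-loss
    return w                                       # the "pre-trained network"

def infer_on_device(w, sample):
    """Stage 2: the device only runs the forward pass on frozen weights."""
    return 1.0 / (1.0 + np.exp(-np.append(sample, 1.0) @ w))

# Toy task: decide whether two sensor readings sum to more than 1.
rng = np.random.default_rng(1)
x = rng.uniform(size=(200, 2))
y = (x.sum(axis=1) > 1.0).astype(float)
weights = train_offline(x, y)                            # done once, offline
print(infer_on_device(weights, np.array([0.9, 0.8])))    # well above 0.5
```

In a real deployment the offline stage would be a Caffe, Torch or Theano training run, and a converter would map the trained network onto the embedded target; the point here is only that the heavy lifting happens once, offline, while the device repeats the inexpensive inference step.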


Real-time CDNN application flow for face recognition

Phi Algorithm Solutions, a supplier of machine learning solutions, has optimized its CNN-based "unique object detection network" algorithm using the CDNN framework alongside the CEVA-XM4 vision DSP core. The Toronto, Canada–based firm has been able to make a quick and smooth shift from offline training to real-time detection. Now the company's optimized algorithms are available for applications such as pedestrian detection and face detection.

The CDNN software framework supports complete CNN implementations as well as specific layers, and it supports various training networks like Caffe, Torch and Theano. Moreover, CDNN includes real-time example models for object and scene recognition, ADAS, artificial intelligence, video analytics, augmented reality, virtual reality and similar computer vision applications.

The availability of intelligent vision processors like the CEVA-XM4 and toolsets such as CDNN is a testament that deep learning is no longer the exclusive domain of large, powerful computers. The dramatic advances in deep learning have reached the smartphone's doorstep, and the smartphone, now powerful enough to run deep learning, is about to get smarter.

Do 8 Cores Really Matter in Smartphones?

As the smartphone industry has begun to mature, one-upmanship among smartphone manufacturers and SoC vendors has bred a dangerous trend: ever-increasing processor core counts and an assumed link between more CPU cores and greater performance. This association originated as SoC vendors and OEMs tried to find ways to differentiate themselves from one another through core counts. Some vendors are creating confusion, as phones today have core counts from 2 up to 8 and vary wildly in performance and, even more importantly, experience. One reason for this confusion is that many users and reviewers have used inappropriate benchmarks to illustrate smartphone user experience and real-world performance. As a result, we believe that some consumers are misled in their buying decisions and may end up with the wrong device and the wrong experience.



The 8 Core Myth...
The 8 Core Myth, also known as the Octacore Myth, is the perception that more CPU cores are better and having more cores means higher performance. Today’s smartphones range from 2 cores up to 8 cores, even though performance and user experience are not a function of CPU core count. The myth, however, will not be limited to 8 cores, as there are plans for SoCs with up to 10 cores, and we could even see more in the future.

Not All Cores Are the Same...
In some phones, users are getting Octacore designs with up to 8 ARM Cortex-A53 cores. These 8 cores perform differently than 4 ARM Cortex-A57 cores paired with 4 ARM Cortex-A53 cores in what is called a big.LITTLE configuration. Core designs vary wildly from ARM’s own A53 and A57 64-bit CPUs to Intel’s x86 Atom 4-core processors to Apple’s 2-core A8 ARM processor. All these processors are designed differently and behave differently across application workloads and operating systems. Some cores are specifically designed for high performance, some for low power. Others are designed to balance the two through dynamic clocking and higher IPC (instructions per clock). As a result, no two SoCs necessarily perform the same when you take clock speed and core count into account.
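For instance, on a Linux or Android device that exposes the standard cpufreq sysfs entries (a small sketch under that assumption), listing the per-core maximum frequencies makes the heterogeneous clusters visible:

```python
import glob

# Sketch: per-core maximum frequencies reported by the kernel reveal a
# big.LITTLE layout - clusters of cores report different limits, so
# "8 cores" can mean two very different kinds of core.
paths = sorted(glob.glob(
    "/sys/devices/system/cpu/cpu[0-9]*/cpufreq/cpuinfo_max_freq"))
for path in paths:
    core = path.split("/")[5]            # e.g. "cpu0"
    with open(path) as f:
        khz = int(f.read().strip())
    print(f"{core}: {khz / 1000:.0f} MHz max")
# On a big.LITTLE SoC this typically prints two groups, e.g. four cores
# near 1500 MHz (Cortex-A53) and four near 2000 MHz (Cortex-A57).
```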

Through the different benchmarks, tools, and applications, we showed that CPU core count in a modern smartphone is not an accurate measurement of performance or experience. More CPU cores are not always better. We do acknowledge that having many smaller cores is one way to simplify power management, but these tests are not focused on power; they are focused on performance and user experience.

CPU core counts are not the way that phone manufacturers or carriers should be promoting their devices. CPU core count is only one factor in Android when the SoC has fewer than 4 cores. The marketing of core counts as a primary driver of performance and experience must end and be replaced with improved benchmarking practices and education.

Saturday, September 19, 2015

3D Xpoint and the Future of Memory

Existing Solutions

Logic systems have a memory hierarchy with each tier optimized for a particular task. A typical hierarchy is:

Registers – on-chip with the cores. Registers store intermediate results during calculations, run at core speed, and have small capacity (for speed) with wide data paths (for fast loading and unloading of data). Typically kilobytes of SRAM.

Cache – typically on-chip with the cores (it used to be off-chip). Cache stores recently completed results plus the program instructions and data predicted to be needed by the core in the near future; it is bigger than the registers, making access somewhat slower, but it still needs to be fast. Typically megabytes of SRAM.

Working memory – typically off-chip, often as multiple memory-specific chips. Working memory stores the programs currently running on the system and the data currently being worked on; it is slower than cache, and there is typically a lot more of it, so it needs to be less expensive. Typically a few gigabytes of DRAM.

Long-term storage – typically off-chip in the form of a hard disk drive or, in some portable systems, NAND flash memory. Long-term storage is where programs and data are kept even when a system isn't running; it is much larger in capacity than the other tiers, storing more data and programs than could be loaded into main memory at one time, it must be non-volatile, and it is typically much slower and much less expensive than main memory. Typically many gigabytes to a terabyte of NAND flash or a hard drive.

The memory type used for each tier in the hierarchy is determined by a set of tradeoffs in the memory characteristics.

SRAM is the fastest memory and has excellent endurance, so it is used for registers and cache even though it is volatile (loses its values when the power is turned off), takes up a lot of area and is relatively expensive. DRAM is slower than SRAM and also volatile, but it is cheaper than SRAM and has excellent endurance, so it is used for main memory. NAND is non-volatile and less expensive than DRAM, but it is slow and has poor endurance, making it suitable only for long-term storage.
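To make the tradeoffs concrete, here is a rough, machine-dependent sketch (illustrative only; absolute numbers will vary) that times random accesses into a working set small enough to stay in cache versus one that spills into DRAM:

```python
import time
import numpy as np

def random_access_time(n_bytes, n_accesses=1_000_000):
    """Time random reads into a working set of the given size."""
    data = np.zeros(n_bytes // 8, dtype=np.int64)                   # working set
    idx = np.random.default_rng(0).integers(0, len(data), n_accesses)
    start = time.perf_counter()
    data[idx].sum()                                                 # gather + reduce
    return time.perf_counter() - start

# ~32 KiB fits in L1 cache, ~4 MiB in last-level cache, 512 MiB spills to DRAM.
for size in (32 * 1024, 4 * 1024 * 1024, 512 * 1024 * 1024):
    print(f"{size // 1024:>8} KiB working set: {random_access_time(size):.4f} s")
```

The same number of accesses takes noticeably longer once the working set no longer fits in cache, which is exactly why the fast, small tiers sit closest to the cores.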

Memory Evolution

All three forms of memory outlined above are facing scaling issues.

SRAM cell sizes continue to scale down as the technologies used to make processors move to smaller and smaller nodes, but the classic 6T SRAM cell is now often supplanted by 8T and 10T cells in the most critical paths.

The key issue in DRAM scaling is the capacitor. A minimum capacitance value is required in order to store values (electrons). In order to shrink the horizontal area of the capacitor, very tall structures have been implemented and the dielectric has transitioned to high-k materials. The issue now is twofold: capacitor heights are reaching practical limits, and achieving higher dielectric k values is becoming very difficult. There is a variety of high-k materials available with a range of k values; the fundamental problem is that as k values increase, the band gap of the material decreases, increasing leakage. The solution to this has been the use of sandwich materials such as the current ZAZ stack, where a high-band-gap aluminum oxide is sandwiched between two layers of high-k zirconium oxide. The problem with a sandwich is that the resulting k value is reduced by the aluminum oxide layer. There is still a lot of work being done on capacitor materials, with aluminum-doped titanium oxide looking promising, but DRAM scaling likely has only one or two more nodes left. Current DRAM state-of-the-art is 20nm, with possibly a 16nm and maybe a 12nm node in the future.
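For reference, the standard parallel-plate approximation (not a formula from the article) shows why the capacitor is the bottleneck:

$$ C \approx \frac{k\,\varepsilon_0 A}{d} $$

As the cell's plate area A shrinks at each node, the remaining levers are a thinner dielectric d (limited by leakage and breakdown), a higher k (limited by the band-gap tradeoff described above), or taller structures that recover effective area.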

In NAND scaling the issue has been how to scale down while maintaining good control-gate to floating-gate coupling within a memory cell - without coupling to adjacent cells. Storing enough electrons is also an issue for further scaling, and 2D NAND dimensions have gotten so small that very complex self-aligned quadruple patterning (SAQP) is required on several layers. In the NAND space, 3D NAND has been introduced by Samsung as a 2D NAND successor. Samsung's initial device stacked up 24 memory cell layers with a single bit per cell. Multiple producers are now introducing 48-layer devices with 3 bits per cell. 3D NAND is a new scaling paradigm where a 40nm technology is expected to be used with an increasing number of layers (eventually over 100) until a 1Tb NAND results. Only one multi-patterning layer is required in the device fabrication, and the number of lithography layers stays essentially the same even as more memory layers and bits per device are produced. 3D NAND is so promising that Micron has announced that 16nm will be its last 2D NAND generation and that all its efforts going forward will be on 3D NAND.

3D Xpoint
At the introduction press conference Intel and Micron described their new 3D Xpoint memory as being 1,000 times faster than NAND Flash with 1,000 times the endurance. That is very impressive but still isn’t the speed or endurance required to replace DRAM. The density was described as 10 times as high as DRAM with a cost intermediate between DRAM and NAND.

The exact mechanism behind 3D Xpoint wasn't disclosed, but each memory cell consists of a memory element that stores resistance values and a selector (1R1D). Existing memory cells store electrons, placing a lower limit on how far the cell can scale; storing values as a resistance should remove those limitations. The memory cells are each located at the cross point between orthogonal bit and word lines. The initial device has two memory layers, and further scaling of the device is possible in three ways:

1. Add more memory layers.

2. Scale dimensions in x and y.

3. Go from a single bit per cell to multiple bits per cell.

Our estimates for the die size suggest a 4F² memory cell where F is approximately 25nm. Adding more memory layers should be straightforward but will require at least two mask layers for each memory layer, and likely more. At a 25nm feature size the mask layers will be relatively complex and expensive multi-patterning layers. Scaling in x and y will require even more complex multi-patterning schemes, at least until an alternative such as sufficiently high-throughput EUV is available. Multiple bits per cell was also mentioned as an option, but how easy this is to implement isn't clear.
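As a rough back-of-the-envelope check (our own arithmetic, ignoring array overhead), a 4F² cell at F ≈ 25nm works out to

$$ A_{\text{cell}} = 4F^2 = 4 \times (25\,\text{nm})^2 = 2500\,\text{nm}^2 = 2.5 \times 10^{-9}\,\text{mm}^2 $$

or about 4×10⁸ bits/mm² (0.4 Gb/mm²) per layer, roughly 0.8 Gb/mm² for the two-layer device, which is consistent with the density comparison that follows.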

Our current calculations find that the bits per mm² for the initial 3D Xpoint memory is lower than current state-of-the-art 2D or 3D NAND. We expect that with additional layers 3D Xpoint will exceed 2D NAND density but will lag behind 3D NAND. Furthermore, we believe 3D NAND has a cost scaling advantage over 3D Xpoint memory and will continue to be less expensive per bit.

Micron is positioning 3D Xpoint as a Storage Class Memory that sits between main memory and long term storage. We believe this memory will find many interesting applications but will not directly replace DRAM due to speed and endurance issues and won’t replace NAND due to cost.

This discussion of 3D Xpoint is an excerpt from a more detailed document we have produced for our Strategic Cost Model customers.

What’s Next?
3D Xpoint is an interesting memory technology and will likely stake out an application space between DRAM and NAND, but there is still an issue with DRAM scaling. Micron Technology's roadmap lists a new memory A and a new memory B. Memory A is the 3D Xpoint being introduced now, with a second generation expected next year, but what is memory B, expected in 2017?

There has been a lot of talk for several years about MRAM being the successor to DRAM. MRAM stores memory values as magnetism, not electrons, and so can theoretically scale to very small dimensions. MRAM also has the potential speed and endurance to be a direct replacement for DRAM. To date, MRAM has been nowhere near the density or cost required to replace DRAM, but what if memory B is a multilayer cross-point memory with MRAM cells at each cross point? The ability to stack multiple cells up as memory layers might finally get MRAM to a competitive density, and ultimately cost, to replace DRAM. This is only speculation on my part, but a 3D DRAM replacement seems like a logical future direction.

Monday, August 3, 2015

Intel 3D XPoint Storage

The SSD in your computer might seem pretty fast, but it's about to be left in the dust by a new storage technology from Intel and Micron. After 10 years of research and development, the companies have announced 3D XPoint. It's the first new type of storage to be created in 25 years, and it's reportedly 1,000 times faster than the NAND flash storage used in SSDs and mobile devices.

Intel is really pushing the speed angle, but 3D XPoint (pronounced cross-point) should also boost storage capacity dramatically. It’s 10 times more dense than the most advanced NAND architectures, meaning you’ll be able to get more bytes in the same physical space. Intel also says XPoint will be affordable, but there’s no telling if you will agree with Intel’s definition of “affordable” when the technology hits the market.

At the heart of XPoint is a new type of data storage mechanism. It doesn’t use transistors or capacitors like traditional flash storage. It’s composed of a lattice of perpendicular conductors stacked on top of each other. The memory cells sit at the intersection of these conductors and can be addressed individually bit-by-bit. The ability to quickly read small data clusters is what makes XPoint so fast.
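A toy model (our own illustration, not the disclosed device architecture) of what bit-level addressability at word-line/bit-line intersections means, in contrast to NAND's page reads and block erases:

```python
# Toy model of bit-addressable cross-point storage: each cell sits at one
# (word line, bit line) intersection and can be read or written on its own,
# unlike NAND flash, which is read in pages and erased in whole blocks.
class CrossPointLayer:
    def __init__(self, wordlines, bitlines):
        self.cells = [[0] * bitlines for _ in range(wordlines)]

    def write_bit(self, wordline, bitline, value):
        self.cells[wordline][bitline] = value      # one cell, no block erase

    def read_bit(self, wordline, bitline):
        return self.cells[wordline][bitline]

layer = CrossPointLayer(wordlines=1024, bitlines=1024)
layer.write_bit(3, 200, 1)
print(layer.read_bit(3, 200))   # -> 1
```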



XPoint is speedy enough that it could replace both non-volatile storage (your SSD) and RAM. In fact, it’s too fast for any current interface technology to keep up with. The first XPoint chips will connect to computers over PCI Express, but even that won’t have enough bandwidth to truly let XPoint shine. New motherboard technology will be needed to take full advantage of the XPoint in the future. Intel and Micron expect to make the memory available sometime next year, but it’s not clear if it will be ready for consumer applications right away.

Wednesday, July 15, 2015

IoT Challenges with FPGA-Based Prototyping

The need for ever-connected devices is skyrocketing. As I fiddle with the myriad of electronic devices that seem to power my life, I usually end up wishing that all of them could be interconnected and controlled through the Internet. The truth is, only a handful of my devices can fulfill that wish, but the need is there, and developers increasingly recognize that we are moving to a connected life. The pressure to create such a connected universe is so immense that designers need a faster, more reliable way to fulfill our insatiable need. Every connected appliance requires software to run it, and with the growing number of these gadgets, the software development effort needed to power them keeps growing. To add to the pressure, the competition in this connected space is immense. In other words, if you're not one of the first to market, your design could be destined for failure.

One way to meet these challenges and alleviate time-to-market apprehension is for designers to adopt FPGA-based prototyping. This proven technique allows designers to explore their designs earlier and faster and thus proceed more quickly with hardware optimization. More to the point, designers can move into software development and software refinement much sooner and conduct the appropriate number of compatibility tests. During software development, testing is critical to make sure the software performs as expected. An error in how the software interoperates with the hardware can be disastrous; therefore, designers generally execute a large number of tests to achieve the desired interoperability. Without FPGA prototyping, the time it takes to complete this vast number of tests could spell disaster for meeting the precious time-to-market window. With FPGA prototyping, not only can testing be done earlier, but more tests can be conducted to achieve optimal results.



In addition, it has to be said that ARM and Xilinx have been at the forefront of enabling today’s embedded designs. It is critical that prototyping technology keep pace with the advancements from ARM and Xilinx.

S2C’s AXI-4 Prototype Ready™ Quick Start Kit based on the Xilinx Zynq® device is part of S2C’s expansive library of Prototype Ready IP and is uniquely suited to next-generation designs including the burgeoning Internet of Things (IoT).

The Quick Start Kit adapts a Xilinx Zynq ZC702 Evaluation Board to an S2C Prodigy Logic Module. The evaluation board supplies a Zynq device containing an ARM dual-core Cortex-A9 CPU and a programmable logic capacity of 1.3M gates. The Quick Start Kit expands this capacity by extending the AXI-4 bus onboard the Zynq chip to external FPGAs on the Prodigy Logic Module Prototyping Platform. This allows designers to quickly leverage a production-proven, AXI-connected prototyping platform with a large, scalable logic capacity – all supported by a suite of prototyping tools.

Integrating Xilinx's Zynq All Programmable SoC device with S2C's Virtex-based prototyping system provides designers with an instant avenue to large gate-count prototypes centered on ARM's Cortex-A9 processor.

To learn more about how S2C's FPGA-based prototyping solutions are enabling the next generation of embedded devices and allowing you to realize the Genius of Your Design, visit http://www.s2cinc.com.

Wednesday, June 24, 2015

Fabless: The Transformation of the Semiconductor Industry

As most of you know, Paul McLellan, Beth Martin, and I published a book last year that is a really nice history of the fabless semiconductor ecosystem.

Preface
The purpose of this book is to illustrate the magnificence of the fabless semiconductor ecosystem, and to give credit where credit is due. Business models, as much as the technology, are what keep us thrilled with new gadgets year after year, and focused on the evolution of the electronics business. These “In Their Own Words” chapters allow the heavyweights of the industry to tell their corporate history for themselves, focusing on the industry developments (both in technology and business models) that made them successful, and how they in turn drive the further evolution of the semiconductor industry.
The economics of designing a chip and getting it manufactured is similar to how the pharmaceutical industry gets a new drug to market. Getting to the stage that a drug can be shipped to your local pharmacy is enormously expensive. But once it's done, you have something that can be manufactured for a few cents and sold for, perhaps, ten dollars. ICs are like that, although for different reasons. Getting an IC designed and manufactured is incredibly expensive, but then you have something that can be manufactured for a few dollars, and put into products that can be sold for hundreds of dollars. One way to look at it is that the first IC costs many millions of dollars; you only make a lot of money if you sell a lot of them.
What we hope you learn from this book is that even though IC-based electronics are cheap and ubiquitous, they are not cheap or easy to make. It takes teams of hundreds of design engineers to design an IC, and a complex ecosystem of software, components, and services to make it happen. The fabs that physically manufacture the ICs cost more to build than a nuclear power plant. Yet year after year, for 40 years, the cost per transistor has decreased in a steady and predictable curve. There are many reasons for this cost reduction, and we argue that the fabless semiconductor business model is among the most important of those reasons over the past three decades. The next chapter is an introduction to the history of the semiconductor industry, including the invention of the basic building block of all modern digital devices, the transistor, the invention of the integrated circuit, and the businesses that developed around them.

Table of Contents
Chapter 1: The Semiconductor Century
Chapter 2: The ASIC Business
In Their Own Words: VLSI Technology
In Their Own Words: eSilicon Corporation
Chapter 3: The FPGA
In Their Own Words: Xilinx
Chapter 4: Moving To The Fabless Model
In Their Own Words: Chips And Technologies
Chapter 5: The Rise Of The Foundry
In Their Own Words: TSMC And Open Innovation Platform
In Their Own Words: GLOBALFOUNDRIES
Chapter 6: Electronic Design Automation
In Their Own Words: Mentor Graphics
In Their Own Words: Cadence Design Systems
In Their Own Words: Synopsys
Chapter 7: Intellectual Property
In Their Own Words: ARM
In Their Own Words: Imagination
Chapter 8: What’s Next For The Semiconductor Industry

FABLESS: THE TRANSFORMATION OF THE SEMICONDUCTOR INDUSTRY - Click here

Wednesday, June 10, 2015

Synopsys to Acquire Atrenta

There have been lots of rumors about potential suitors for Atrenta, not just recently but over the years. Since they operate at the RTL and IP level, there is clearly potential for companies other than the usual suspects of Cadence, Synopsys and Mentor. Two that have been much-rumored were ANSYS (who acquired Apache a few years ago) and Dassault (who have some process management solutions and acquired Tuscany Design Automation [disclosure: I was on the board] right at the end of 2012). I've never heard it mentioned, but another possible candidate might have been TSMC, who have used Atrenta's SpyGlass solution as their "signoff" tool for IP qualification as part of their OIP ecosystem.

The financial details of the deal were not disclosed. But for sure it was not a fire sale. For a start, the current threshold for HSR filing, the so-called "$50M threshold", is actually $76.3M this year so we know that the price was at least that high.

However, since there were multiple companies rumored to be interested I think it is logical to speculate that the price will be at a reasonable multiple on their revenue, itself rumored to be running in the $60M range. At 3.5X revenue that would be $210M so I'll go with that as my guess. Final answer.

Of course Synopsys plans to integrate the Atrenta technology into their verification continuum, especially SpyGlass, which has wide acceptance. Manoj Gandhi, who is the GM of verification, adds a bit of color (not much, to be honest):

Atrenta's demonstrated leadership in static and formal technologies is recognized throughout the EDA industry, and its technology is used by design and verification teams around the world. Synopsys expects to leverage this strong technology to further improve our Verification Continuum platform to address continually increasing verification challenges, and to support our ongoing R&D collaborations with customers in both verification and implementation.

Atrenta will be at DAC on booth #1732. Interestingly they just announced the date of their user conference in October. Of course that may take place, since under the rules for an acquisition like this the two companies are not really able to work together until the deal is closed (on the basis that if the deal is struck down then everything should go back to exactly how it was before). However, I think it is more likely that it will just get folded into SNUG.

The Synopsys press release is here.

Saturday, June 6, 2015

3 Reasons Behind Intel's Decision to Buy Altera

While Altera FPGAs aren't as fast as Intel's own Xeon processors, they are more flexible and are used in a number of industries, including consumer electronics, telecommunications, and automotive. By securing Altera, Intel can step into IoT applications. Industry analysts believe that Avago's purchase of Broadcom may have propelled Intel to make a second attempt at Altera.

These are some key factors that may have influenced Intel's decision to purchase Altera.

1) Altera FPGAs will be used in Intel's processor chips. The purchase allows Intel to enter the SoC market.

Some analysts feel that the $17 billion price tag was too high, but Intel defended its decision to buy Altera, stating that FPGAs allow for faster speeds in Intel's processor chips.

2) Intel's purchase of Altera has a negative impact on its competitors, including IBM and ARM. IBM and ARM will need to rely on another FPGA source, such as Xilinx, Microsemi or Lattice Semiconductor.

3) Intel has the opportunity to reach out to other markets.

Avago's purchase of Broadcom has propelled Qualcomm to consider other new opportunities. Qualcomm's reach into the mobile chip market is severely limited by the consolidation between the two companies. Both Avago and Intel have mutual interests in the mobile chip market, and industry analysts believe that a partnership between Qualcomm and Intel is a viable step.

Thursday, June 4, 2015

UltraScale FPGA scales to 600 million gates

Pro Design has come up with a kit for prototyping Xilinx Virtex UltraScale XCVU440 FPGAs and will demonstrate it at next month’s Design Automation Conference (DAC) in San Francisco.
Scalable from one to four pluggable Xilinx Virtex UltraScale XCVU440-based FPGA modules, the Quad system offers a capacity of up to 120 million ASIC gates. Up to five Quad systems, with 20 FPGA modules in total, can easily be connected together to increase the capacity to up to 600 million gates.

Pro Design also has a development system called Uno for IP or sub-design development; designers can reuse the FPGA modules for complete SoC and ASIC prototyping by plugging the same proFPGA Virtex 7 or UltraScale FPGA modules onto a Duo or Quad motherboard.

There are also motherboards, FPGA modules, daughter cards and accessories which can be used in combination with the proFPGA XCVU440 FPGA modules. The system comes with the proFPGA Builder software, which provides an extensive set of features, such as advanced clock management, integrated self-test and performance test, automatic board detection and I/O voltage programming, system scan and safety mechanisms, and quick remote system configuration and monitoring through USB, Ethernet or PCIe, all of which simplify the use of the proFPGA system tremendously.
The systems are available for early adopter customers with general availability in Q4 2015.

Tuesday, June 2, 2015

Accelerating PCB Design and Manufacturing

In the modern electronics industry, PCBs are required to accommodate highly dense circuits with large numbers of components and complex routing spaces. While the complexity is increasing, the time-to-market is decreasing. In such a scenario, there is no option but to reduce the design time by employing innovative editing options and to make the design correct-by-construction for manufacturing, so that iterations between design and manufacturing are eliminated. Moreover, designs may need customized rules, which should be easy to develop and use, and frequent changes in fabrication technologies require new rules to be developed in a timely manner. It's pleasing to see that, with the rapid increase in circuit sizes and complexity, PCB tools and technologies have also evolved to a large extent.

A few days ago Cadence released its Allegro 16.6-2015 product portfolio that introduced several new capabilities to address modern day challenges in PCB designs, make designs more predictable, and shorten the overall design time up to manufacturing.

Allegro provides an integrated environment for electrical, physical and manufacturing verification. In Allegro 16.6, the Allegro Manufacturing Option includes the DFM (Design for Manufacturing) Checker, Documentation Editor, and Panel Editor. The DFM Checker provides manufacturing analysis tools for designers to analyze and correct fabrication-related issues before sending the design for fabrication, thus eliminating cycles between design and fabrication and making the design more predictable. The Documentation Editor is an intelligent documentation-authoring tool that automates the complete process of fabrication documentation. It creates complex PCB documentation for handoff to manufacturing in a fraction of the time compared to the traditional way of working, which streamlines manufacturing handoff and eliminates unnecessary scrap and iterations with manufacturing partners. The Panel Editor automates the complex process of assembly panel documentation. It enables designers to quickly create manufacturing documents that clearly articulate the panel specification and instructions for successful fabrication, assembly, and inspection of their designs. Cadence customers have observed this fabrication and assembly document generation process to be faster than traditional methods by 60% or more.

The Allegro Rules Developer and Checker provides the flexibility to extend the supported rule sets. It provides a ‘relational geometric verification language’ designed specifically for creating rules that are proprietary and custom to an original equipment manufacturer (OEM). The tool supports a constraint-driven flow where the rules can be viewed and executed from the Allegro Constraint Manager, making it a single source for all DRCs within a PCB. This also enables designers to develop new rules according to changing fabrication processes or even new fabrication technologies.

Allegro 16.6 provides excellent capabilities for routing and tuning high-speed interfaces such as DDR3, DDR4, PCIe, and so on. These interfaces operate at high bandwidth and low voltage and are increasingly susceptible to crosstalk. Timing closure is a significant challenge for such high-speed interfaces, and their routing is accomplished under a complex set of electrical and layout implementation constraints.

Allegro PCB Editor has added several new capabilities to improve designers’ productivity and accelerate timing closure. These include ‘adding ground current return path vias to differential pairs during Add Connect’, ‘creating off-angle routes to avoid FR4 fiber weave coupling and achieve better impedance control’, ‘improved arc support in routing’, and many others.
There is auto-connect routing where a designer can select a set of signals and the route engine creates flow automatically. The ‘Adjust Spacing’ capability allows users to compress spreading of traces in the trunk of a set of signals.

There is a powerful shape-editing environment to quickly create and modify shapes, saving a lot of time in designing power delivery networks and in other complex layout editing. Designers can add notches, join edges, slide edges with corners, move multiple segments with one command, convert corners, and so on.

Overall, the Allegro 16.6-2015 portfolio provides a powerful PCB design platform for modern-day PCB designs that need fast turnaround to meet an ever-shrinking time-to-market window. To know more, read the press release here.

Monday, June 1, 2015

Intel plans to buy Altera in a $16 billion deal

Currently, Intel supplies Atom and Xeon processors for computing solutions. By acquiring Altera, Intel aims to target data-intensive data centers at major companies such as Google, Facebook, LinkedIn and eBay. FPGA solutions are desirable because they increase performance and consume less power. For data centers that are built to maximize efficiency, a high-performance, low-cost solution is ideal.

Intel's second aim is to target the faster-growing mobile market. This market includes smartphones, tablets, and wearable technology. The demand and turnover for products in this market is much higher compared to computing devices. Mobile device makers rely on FPGAs to power their products, and FPGAs used in mobile devices are available at low per-unit costs.