On July 10th Mentor Graphics announced the availability of the Calibre nmDRC tool for physical verification, which dramatically reduces total cycle time through hyperscaling and incremental verification and integrates critical elements such as critical area analysis and critical feature identification. Calibre nmDRC is part of a new platform from Mentor, the Calibre nm Platform. The Calibre nm Platform leverages next-generation technology in litho-friendly design (LFD), DRC, resolution enhancement technology (RET), and post-layout parasitic extraction and analysis to help design teams transition efficiently from a rule-based approach to a model-based approach. I had a chance to discuss Calibre nmDRC (pronounced Caliber nanometer DRC) before the official announcement with Joe Sawicki, VP and GM of the Design to Silicon Division.
What is Mentor announcing?
The thing we are announcing today is really exciting for us. It is a brand new version of Calibre that we are naming Calibre nmDRC that completely redefines performance characteristics for design rule checking. It is the center of our Calibre nanometer platform which includes all the capabilities to add Design for Manufacturability to the whole signoff process. The thing which is really exciting about this is how it sort of breaks the ground rules of how this industry works. The common myth is that all innovation happens in startups. Big companies eventually buy the startups but then because we are intransigent, incompetent and can't figure out how to roll out of bed in the morning, eventually we destroy that software. New startup companies come into play and that becomes the next platform. Calibre was home grown to begin with. It was developed at Mentor Graphics. This new version is also a home grown version of the tool.
What was the motivation for this release?
Frankly, whenever someone says that we've got a new version that puts in place a whole new definition of performance, the fundamental question is why do users need that. Essentially the reason they need that is because over the last few technology nodes, as we've gone from 180nm to now the first designs at 65nm, we have essentially walked away from a world where random effects dominated to one where systematic effects are a lot more significant. 180nm yield, if you had designed your circuit correctly, was all based upon particle defects, random defects in the manufacturing process. You looked for compliance to manufacturing by doing geometry checks that are mapped against those sorts of defects. If you are looking for particle defects, you look at minimum width because that tells you whether or not you are going to be susceptible to shorts caused by particles and at minimum spacing because that will tell you whether or not you are going to be susceptible to bridging. There are also a bunch of overlay checks that make sure that from region to region you are getting proper alignment between the two areas and that you are maintaining connectivity.
Because of the fact that we are now dealing with subwavelength manufacturing you are seeing very complex effects. A chip with yield problems might have three lines that are virtually identical in terms of width and spaces, yet only one of them is the one causing the yield problem as you start to move within the manufacturing variability because of this very large optical context that causes what is called a pinching issue.
What that has driven in terms of the overall environment of sign-off and physical verification is that we have greatly increased the complexity of that operation. Up until 180nm the primary effect you worried about going from node to node was design size and geometry count. If you just do the basics of Moore's Law and plot it out over a 10 year horizon, the mathematics are that you get about 100 times the data you need to verify every 10 years. This is why we went from Dracula to Calibre in the transition from .35 to .25 because that geometry count could not be kept up with a flat tool. What has been happening in addition, starting at 130, is the increase in the number of basic DRC rules. It was fundamentally fairly flat going from .35 to .25 to .18 but at 130, 90 and now 65, the number of rules you have to run against those geometries is also drastically increasing. This is because there has been an attempt to encapsulate these more complex failure mechanisms in design rule checks. In addition, especially coming on with 65 and 45, we are adding all these additional tasks to sign-off, things like recommended rules which say “Okay, I can meet a certain spacing but it is preferable if you had a wider spacing” and litho-friendly design that brings simulation of lithography hot spots into the design space as well as critical area analysis which takes a model based approach to looking at the particle defect problems.
This explodes the amount of analyses, computation and debug work that a designer needs to do as part of sign-off. At the same time design cycles are decreasing. The last time I worked for a living when I did microprocessor design in the late 80's, a two year design process was perfectly acceptable. With more consumer driven SoC approaches, a year is a horrendous failure. People have to squeeze that down to 6 to 9 months. Simply put, you have to fit the same amount plus all this additional analysis into a shortened design window that drives the requirement for a new performance paradigm. That brings us to introduce Calibre nanometer DRC.
How has Calibre evolved over 5 generations?
Let me give you a quick tour of three major architecture points with the tool. In the first version it was all targeted at a single CPU. What we would do is take the hierarchy inherent in the design and create some other hierarchy either by adding combinations or breaking up large flat regions. What you did was run those operations in sequence from the bottom of the hierarchy to the top. You ran them for one operation in the rule deck and then ran them for the next. The reason why Calibre became the standard tool for deep submicron verification is that we did this breaking up better than anyone else out there and thereby had less work to do.
The next version, released in the 2000 to 2001 timeframe, was a multiple CPU version with data partitioning. It was the first version to take advantage of multiple processors by executing threads on a shared memory processor box. We put data interlocking in place such that we could take those cells and bins and parallelize their execution across multiple CPUs. For the most common hardware available at that time, where you would have 4, 8 or maybe 16 processors, this gave great results and wonderful scalability. It did have one issue when you start to look towards a larger CPU cluster, which is that by definition the best compute time you could get was the sum of the longest cells within each operation. We could add another 3,000 processors to a task and it would get no shorter. The magic under the hood for the new version is that we have added operation parallelism to data parallelism. So now in addition to operating on each of the cells with an operation in parallel we are actually able to execute multiple operations in parallel as well. This allows us, while one of those cells, the biggest bin, is still completing on the previous operation, to pump up to 5 to 10 operations out into the cluster and radically decrease the overall run time and increase the scalability.
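To make the scheduling argument above concrete, here is a minimal Python sketch. It is not Calibre's scheduler: it assumes an unlimited number of CPUs and, as a deliberate simplification, that an operation on a cell only has to wait for the previous operation on that same cell. The function names and the `durations` table are hypothetical.

```python
def data_parallel_time(durations):
    """durations[op][cell] is the run time of one operation on one cell.

    With data parallelism only, every operation must finish on all cells
    before the next one starts, so the best possible run time is the sum
    of the longest cell within each operation.
    """
    return sum(max(per_cell) for per_cell in durations)


def operation_parallel_time(durations):
    """Best run time when later operations may start on cells that are ready.

    Assumption (hypothetical): operation k on a cell depends only on
    operation k-1 on the same cell, so the limit is the slowest cell's
    total work rather than the sum of per-operation maxima.
    """
    n_cells = len(durations[0])
    return max(sum(op[cell] for op in durations) for cell in range(n_cells))


# Three operations over three cells; a different cell is the "biggest bin"
# in each operation.
durations = [
    [10, 1, 1],
    [1, 10, 1],
    [1, 1, 10],
]

print(data_parallel_time(durations))       # 30
print(operation_parallel_time(durations))  # 12
```

In this toy case adding processors cannot push the data-parallel time below 30, while overlapping operations brings the bound down to 12. Real rule decks have dependencies between operations and layers, so the actual gain depends on the deck.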
Since we have built this up around a core hierarchical engine that has the ability to thread and distribute tasks, we are able to take optimal advantage of all three common architectures out there, whether it be a single CPU running small blocks and cells, an SMP box like someone's desktop (two Opterons with 4 cores in it) or one of these 40 to 50 CPU Linux clusters that are being put in place. This is a real advantage because most other approaches being talked about that are able to scale out to multiple processors at this point are configured in such a way that they will operate well only on these multicore clusters.
Do you have any hard data on performance?
Yes! Runs that would take 2 to 3 days on a single CPU execute in 2 hours or less on up to 40 CPUs with hyperscaling. Although the architecture was implicitly put in place to take advantage of large clusters, it also gives customers a big bang on SMP boxes, where on average they were getting 2X runtime improvements.
We have been out to 23 or 24 customers with beta software. There is a wide range of IDM, fab-less and foundry customers. The results are for SoC, microprocessor and memory designs. We have published results for ten of these ranging from 70 to 130 nanometer designs running on anywhere from 8 to 40 CPUs where we get runtimes under 2 hours.
People, especially software vendors, will talk about scalability across multiple CPUs. If you think about it, no customer goes out and runs their designs and says “Great it ran in three hours. I would like to run it again on more CPUs and get it to run in an hour and a half.” What they do is put in place a cluster and then they want to be sure that whatever design they throw against it, they get reasonable turnaround. We did a study at a major foundry where they had three designs ranging from 8.2 GB to 27.3 GB, a 3x increase in data file size. They ran that on 32 CPUs and the run time increased only 20%. This means that with an established cluster, irrespective of what designs are going in there, we get very good runtime scalability across multiple design sizes.
One of the key points that enabled us to go out to 24 beta customers (as you know beta customers can be very difficult to support with a new tool) is that Calibre nmDRC just drops into the existing environment. Although it has the ability to have a new rule file language, it runs backward compatible with all the old rule files people have spent years building. It fits into the same design environment, using the same scripts and same reporting. Someone is able to get the 2 hour runtime rather than the overnight runtime simply by loading the new software.
What we have talked about is being able to radically decrease the turnaround time on a chip design. But that is not the only story. You are going to have errors. This means you need the ability to put in place a very efficient debug environment. There are two new capabilities we are introducing with nmDRC that greatly facilitate that. The first is dynamic result visualization. Traditionally you would run a verification tool. When the tool is done, you have access to all the errors. Instead the new tool sends out the errors to the user environment as the tool finds them. While the tool is still running, the user can go to an error, find out where it is in the design database, and fix it. The next error would come and the user then would go and fix that one and so on.
In conjunction with that you want to make sure to give the user the capability to fix these errors without ending up causing additional errors that can only be found with a full chip run. Calibre nmDRC gives you an incremental DRC capability. Users can go off, make a change and tell Calibre to run the entire chip. Calibre will determine what changes have been made and check, within a full chip context, a halo around that region defined by the largest area of influence of any particular rule. So you might have this two hour runtime. You execute that and get your first error, fix that and rather than having to run for another two hours, you can do a 20 second run.
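The halo idea can be sketched in a few lines of Python. This is only an illustration of the geometry, not how Calibre implements incremental checking: the `Rect` type, the `recheck_window` function, the rule names and the reach values are all hypothetical, with coordinates taken to be in nanometers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x1: int
    y1: int
    x2: int
    y2: int


def recheck_window(edited, rule_reach_nm, chip):
    """Return the region to re-verify after an edit.

    The bounding box of the edited shapes is grown by the largest area of
    influence of any rule, then clipped to the chip extent.
    """
    halo = max(rule_reach_nm.values())
    x1 = min(r.x1 for r in edited) - halo
    y1 = min(r.y1 for r in edited) - halo
    x2 = max(r.x2 for r in edited) + halo
    y2 = max(r.y2 for r in edited) + halo
    return Rect(max(x1, chip.x1), max(y1, chip.y1),
                min(x2, chip.x2), min(y2, chip.y2))


chip = Rect(0, 0, 5000, 5000)
edited = [Rect(1000, 1000, 1200, 1100)]
# Hypothetical reach per rule; the largest one sets the halo.
rule_reach_nm = {"metal1_min_spacing": 130, "metal1_recommended_spacing": 200}

window = recheck_window(edited, rule_reach_nm, chip)
print(window)  # Rect(x1=800, y1=800, x2=1400, y2=1300)
```

Only the shapes that intersect `window` would need to be rechecked, which is why a small edit can be verified in seconds rather than hours.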
Previously you might have an overnight run, debug that for perhaps a half day and start another full chip run. If you did everything right you would be done. If you didn't, you would have to run yet another cycle. It could take days to weeks. With the new version, between the collapse of runtime and the dynamic results visualization as well as incremental verification, we are able to collapse that. Realistically someone can get through physical verification in two runs, which means you are looking at 4 to 6 hours.
We have also put in a new front end. This is essentially a Tcl front end. Conceptually it takes it from the equivalent of an assembly language for geometry to a higher level programming language for geometry. As an example, a set of rules describing a metal stack would take 500 lines in the old version but would take only about 64 lines in the new version. At some level it is not going to matter how much typing you have to do. The important point here is that because it is a high level programming language, your ability to support development as you are iterating and cleaning these rules as well as porting to the next process is significantly enhanced.
Is there any way to take the old rules and automatically convert them to this new high level language?
No. It's from here forward. Whenever people get done and have what is called a golden rule file, there is little impetus to go off and change that. Even if it were an automatic process, the qualification time would be just too long. This is something that people would do going forward.
Can Calibre nmDRC be used with design systems from other vendors?
I have often described Calibre as the Switzerland of EDA. Because Mentor does not participate in the place and route market, we integrate with virtually every design creation tool on the planet. What is different with Calibre nmDRC is that we have also added the capability to do native database read and write from OpenAccess as well as from the MilkyWay environment.
So signoff has gone from a simple DRC check to become a multimode analysis concentrating on manufacturability. What are some of the problems in doing this?
DRC metal spacing for 90 nm would be 130 nm metal1 spacing. The DFM rule is that if you did this at 200 nm the yield is better. This is the equivalent of saying if you throw a 180 nm design against this process, it will yield better than a 90 nm design. Consider the typical results for a DRC run. Because you are doing verification for all the integration points along the way it is rare to see more than a handful of errors, especially after you have screened those errors for hierarchical redundancy after a DRC run. This is fairly easy to conceptualize - going off to your layout editor and fixing those. The traditional results of a recommended rule file because it is essentially a recommendation that you design at a bigger design process is that there are thousands and thousands of errors. You might ask yourself “What does a designer do with this?” They close that window and pretend they never saw it because there is nothing you can do with this level of data.
One of the other things that makes this challenging is that as yield detractors become more complex, it can become more and more difficult to put them into a geometrical check. What happens instead is that we end up moving them to a model based check. The two major ones that have happened to date are critical area analysis, where you include a model of your defect density as you are doing this yield check, and litho-friendly design, which is a tool we launched back in the spring that allows you to simulate the lithographic environment of the manufacturing line to determine these complex yield effects that are impossible to code geometrically. The good news on this is that they are a much terser way to describe the issue and a much more reliable way. The difficult part is that because they are not a simple measurement but instead essentially a simulation, they become far more computationally complex than any type of geometry check.
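As a rough illustration of what "including a model of your defect density" means, the widely used textbook Poisson yield model estimates random-defect yield from a critical area and a defect density. The sketch below uses that textbook model only; the interview does not say which model Calibre uses, and the function name and numbers are hypothetical.

```python
import math


def poisson_yield(critical_area_cm2, defect_density_per_cm2):
    """Textbook Poisson model: Y = exp(-A_c * D0).

    critical_area_cm2 is the area in which a defect would cause a failure;
    defect_density_per_cm2 is the defect density of the process. Both values
    are inputs here; extracting the critical area from layout is the hard,
    computationally expensive part.
    """
    return math.exp(-critical_area_cm2 * defect_density_per_cm2)


# Hypothetical numbers: widening a spacing reduces the critical area for
# shorts, which raises the predicted yield.
tight = poisson_yield(critical_area_cm2=0.20, defect_density_per_cm2=0.5)
relaxed = poisson_yield(critical_area_cm2=0.12, defect_density_per_cm2=0.5)
print(tight < relaxed)  # True
```

A model of this kind turns a recommended rule ("wider is better") into a number that can be compared across alternative layout fixes.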
People in the industry often talk about the need to do via doubling. Consider a real simple piece of layout with 2 single vias. What are the options? You can take one and double it or you can take the other one and double it with an x clip. You can extend the two line ends so that your enclosure coverage is a little bit more. Or you could make both metals a little wider to expand enclosure around the via in three directions. Or some combination of these. The fundamental question becomes which one are you supposed to do? This is not even beginning to talk about how you interact that with the ability to spread those wires and make those metals fatter. It becomes very complex to look at that. If you are just going off and doing a via doubling, how do you go off and figure out, in the context of all the recommended rules, critical area analysis, all the litho analysis and all the other analyses you need to put in, what is the best way to deal with the problem? As you go sequentially, it is difficult to determine if you are actually going to move to a point where you are getting better or worse yields. There was a paper that got a lot of play at PSY where someone did a very simple via doubling experiment and showed they can actually get less yield by doubling vias because they ran up against other effects. The goal here is you want to get one of the gold dots not one of the green.
What we have put in place with the Calibre nanometer platform is to enable analysis that lets you solve those three problems. You can contrast this with what the platform looked like in the deep submicron era. Life was fairly simple for us. You took in GDS II and a netlist, you did design rule checking, you did LVS (layout versus schematic) to compare GDS II to see if it actually implemented the netlist you intended and you did some parasitics. You wrote this in our geometry language and ran on maybe up to 8 processors. The output was error markings, some derived layout, device and parasitic netlist. With a simple data query you got these results into a data editing environment with the ability to write that data out and we were done.
The nanometer platform that we have put in place facilitates this entire handoff and analysis and enhancement to manufacturability.
The key is that you cannot just go GDS II anymore. You need the ability to come out of and come into the native design environment because you are making enhancements to the design. Everyone is going to want to be able to make sure they are still making their timing closure within their native design environment. There are additional analysis routines that we put in place to enable us to look at yield manufacturability as well as a new programming front end and hyperscaling to allow us to have a computational platform that lets us do all this analysis in subglacier time. To handle the issue of multiple analysis possibilities we can maintain what the original layout is and compare that against multiple optimization possibilities. There is also an incremental analysis and enhancement capability that lets us look at where those different options will send us in terms of the overall yield space as well as a report and visualization area where we can take that morass of error markers and turn it into real information about a design that a designer can use to model their yield effects. That whole thing put together we are calling the Calibre nm platform.
What types of outputs does it generate?
First you have error markers. As an example you might have this DFM rule that says your spacing is less than 200 nm where your DRC is 130 nm. You get lots and lots of error markers which the designer has no idea what to do with. Then there is a Pareto analysis of all the violations of that particular rule identifying which are the worst violators and would be the best place to go off and spend your energy. You can also see collected data over the entire chip for these rules which then lets you build out and understand where your biggest yield effects are overall and how this is going to impact your chip yield.
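A Pareto ranking of this kind can be sketched as follows. This is a generic illustration, not the tool's report format: the `pareto_rank` function, the location labels and the measured spacings are hypothetical, and severity is assumed here to be the shortfall from the recommended 200 nm spacing.

```python
def pareto_rank(violations, recommended_nm):
    """Rank violations of one recommended rule, worst first.

    violations is a list of (location, measured_spacing_nm). Returns a list
    of (location, shortfall_nm, cumulative_share), where cumulative_share is
    the running fraction of the total shortfall covered so far.
    """
    scored = [(loc, recommended_nm - spacing) for loc, spacing in violations]
    scored.sort(key=lambda item: item[1], reverse=True)
    total = sum(shortfall for _, shortfall in scored)
    ranked, running = [], 0
    for loc, shortfall in scored:
        running += shortfall
        ranked.append((loc, shortfall, running / total if total else 0.0))
    return ranked


# Hypothetical markers against a 200 nm recommended spacing.
violations = [("A", 190), ("B", 130), ("C", 170), ("D", 195)]
ranked = pareto_rank(violations, recommended_nm=200)
for loc, shortfall, share in ranked:
    print(loc, shortfall, round(share, 2))
```

Here location B, which sits right at the 130 nm DRC limit, accounts for most of the total shortfall, so it is the first place a designer would spend effort.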
There are a number of other reports such as different Pareto tools and hot spot mapping into the design database. There is a by cell analysis which can make for interesting conversation with your IP provider as you are talking about what their deliverables should be not only in terms of functionality and performance but also for yield detractors.
For model based verification streams we are doing several things ranging from the critical area analysis tool that includes a defect density model as well as systematic edge effects around the litho space. Probably the most interesting is tying that variability into our silicon modeling, which goes off and determines not only the variability of transistors over the lithography but also their detailed context within active regions. Essentially this tells us the deterministic variability of the design and enables people to have a smaller guard band around what they are putting in place for parametric performance.
We have described fairly radical changes in terms of what customers need to ensure manufacturability especially as we go down towards and look forward to 45 nm. We are not only running more and more DRC checks but also adding all those other manufacturability analyses. In a classical environment that quite simply could run for weeks or months. With all the new capabilities we have built into both the nanometer DRC tool and the performance characteristic that it brings as well as all of the analysis, incremental and statistical gathering that we are able to build into the tools, we actually enable people to deal with that new environment within their existing design cycle.
When is Calibre nmDRC available?
Customers can gain access to this today. First customer ship is in the middle of Q3, which is August.
What is the pricing of the offering?
Highly variable depending on the number of CPUs and which part of the analysis packages but at the same price as the old Calibre DRC.
Are current users of the older Calibre DRC entitled to a free or reduced price upgrade?
If they are on maintenance, then they get this as an upgrade.
What is the number of beta sites?
Twenty four as of yesterday.
How would you characterize their feedback?
It has been great because what we've done is gone in, loaded the software, and within about an hour they are operating configs. They run it. It's done in 2 or 3 hours. The reaction in general has been Wow. It's been really successful.
Whom do you see as competitors to the older and now the new Calibre?
In terms of the existing tool we have market penetration up above 60%. There are tools in place out there from Cadence, Synopsys and Magma. I expect all three to continue to compete. From what we are seeing from the results, we expect to be significantly faster than all of them as well as having, with the nanometer platform, a far more complete roadmap for manufacturability.
If someone makes a design change there may be unintended ramifications and therefore a need to re-simulate the entire design. I understood you to say that if you make a change on the manufacturing side (line width or spacing), you can determine the maximum area of influence.
Yes. That's part of the incremental capability where if someone goes in and edits a particular region, our tool will find where the change was and verify only in the necessary region around that.
So changing the shape of geometry but not the function?
It depends on what someone is doing. If someone is doing something as extensive as timing check in the Place and Route tool and having to do a replace, rip-up and reroute, they will tend to do a full verification run rather than an incremental one. Incremental tends to be about whether what you are doing is essentially a custom layout editing of a particular region.
The target for Calibre nm is the more complex designs, lower process nodes.
That's definitely the target. We expect that our customers will upgrade as a default. So even for the older 180 nm and 130 nm nodes, they will be taking advantage of the same technology because we can still run on those backwards compatible rule files. So they will essentially just get faster run times and be happy about it.
What are the ramifications of going forward to 45 nm and below? Will litho-friendly changes break down at some point? Do we need a new breakthrough to go to lower process nodes?
I would not be surprised to see other new analyses that need to be done. I will tell you an interesting story of how this is playing out. You hear all the time of the difficulty of doing timing closure due to variability of timing, that a cell put in one part of the design will behave significantly differently than that cell in another part of the design. We did an experiment with one of our customers. They had a new 65 nm library. We ran that with our litho-friendly design tool to verify timing. We discovered that they had done a really good job in designing the cells so that they would be invariant to placement with respect to litho effects. The variability was less than 5%. If you turn around and grab another standard cell library for 65 nm and do the same analysis, because that library was not designed to be invariant, there would be a huge amount of variability. With the second cell library you end up with difficulties in doing timing closure. With the first cell library they had far fewer problems. The aspects around doing these different analyses really can make an impact on the design where rather than things getting harder, they are getting easier.
There have been blogs debating whether this release or by extension whether any release that can be characterized chiefly as a performance improvement can be truly considered significant. Most new releases tout performance improvements and even existing applications benefit from the speedup of new computers.
Of course performance is a critical attribute of any software application. When I was first programming computers 100 years ago it was routine to submit a deck and wait until the following day to get your printed output. If there was a typo, an entire day had been lost. Today I (and I suspect you) get frustrated when Google doesn't return Avogadro's number of items from a search in an eye blink. There is something about the human psyche. We can wait an hour for something but not a minute. This is probably because you can go and do something else if you know the wait will be an hour but not if the anticipated wait time is a minute or two. For an application to be truly interactive it must be just that. It must respond in a way that doesn't slow down the thought process. If echoing lagged typing it would render most software applications useless for all but the most accomplished touch typists.
In the case of EDA there are applications such as verification regression tests that run for hours, even days. This situation is true even when using multiple processor machines and compute farms. This has an obvious impact on the productivity of those waiting for the results. Since physical verification occupies a large chunk of the critical path, any significant improvement would reduce time to market and reduce the risk of missing the market window entirely thereby improving potential revenue and profit.
The use of multiple processors to improve performance can hardly be seen as revolutionary. Array processors, math co-processors, graphics processors and the like have been around forever. In some cases one must merely recompile or re-link an application to take advantage of this capability. In other cases the application must be re-coded and possibly re-designed to leverage these devices.
Grid computing allows one to unite pools of servers, storage systems, and networks into a single large system so the power of multiple-systems resources can be delivered to a single user point for a specific purpose. To a user or an application, the system appears to be a single, enormous virtual computing system. Virtualization enables companies to balance the supply and demand of computing cycles and resources by providing users with a single, transparent, aggregated source of computing power. Well publicized uses of distributed computing include SETI (Search for Extraterrestrial Intelligence) and the Genome project.
Instead of executing faster, another common approach is to avoid or minimize the work to be done, thereby reducing the time required. When doing system backup, it is routine to do a full backup weekly and incremental backups of only those files that have changed on a daily basis. In the area of software development there are source control systems. If an application consists of but a single module, any source code change requires re-compilation and re-linking. But what if there are thousands of modules and only a few have been modified? The source control system is able to identify which modules are impacted and recompile only those modules. Even so, the entire application must be retested. I can speak from personal experience that the most innocuous change can have significant unforeseen consequences. In my last editorial I described Calypto, an equivalence checker that among other things will tell you if micro-architectural optimizations have introduced any side effects requiring re-verification.
Other vendors have made claims for runtimes of less than two hours. In the case of CPUs and graphics cards there are well recognized standards or benchmarks for performance. Comparison across vendors is straightforward. When it comes to EDA applications, existing users of a given application may be able to quickly make performance comparisons between releases but comparisons across vendors may not be as easy.
I leave it to the end users to judge the significance of this release of Calibre.
The top articles over the last two weeks as determined by the number of readers were:
Magma Announces U.S. Patent & Trademark Office Asked to Re-Examine '446 and '438 Patents; Third Party Asks for Patents to Be Invalidated The patents are two of the three patents involved in a patent dispute between Magma and Synopsys Inc. that is pending before the U.S. District Court for the Northern District of California. Magma received notice of the requests for re-examination on Aug. 22, 2006 via U.S. Postal Service. Magma has been prevented from seeking re-examination of these patents as the result of a court order that was requested by Synopsys.
Incentia Timing Analysis and Constraint Management Software Adopted by Ambarella Incentia Design Systems announced that Ambarella has adopted Incentia's TimeCraft(TM) and TimeCraft-CM as its static timing analysis and constraint management software in its nanometer design flow. TimeCraft is a full-chip, gate-level static timing analyzer (STA) for timing sign-off. Incentia's Constraint Manager, TimeCraft-CM, consists of a constraint checker, a qualified SDC (Synopsys Design Constraint) writer, and a constraint debugger.
X-FAB Selects Cadence Solution for Maximum Yield; Cadence(R) Virtuoso(R) NeoCircuit DFM Aids Analysis, Optimization for Analog IP X-FAB Semiconductor Foundries AG, leading analog mixed-signal semiconductor foundry, announced it is implementing Cadence Design Systems' Virtuoso NeoCircuit DFM solution to identify and eliminate yield-related problems early in the design phase and fabrication process.
International Conference on Computer Aided Design (ICCAD) Previews Technical Program Focused on Today's Challenges and Emerging Technologies The Conference will be held November 5-9 at the DoubleTree Hotel in San Jose, California. ICCAD 2006 will feature industry-leading keynote addresses from Phil Hester, Chief Technology Officer of AMD, entitled "An Industry in Transition: Opportunities and Challenges in Next-Generation Microprocessor Design," and Leon Stok, Director of EDA at IBM, entitled "Innovation in Electronic Design Automation."
Other EDA News
Jasper Design Automation Integrates Verific's SystemVerilog Component Software With JasperGold Verification System
Jasper Design Automation Announces Immediate Availability of GamePlan(TM) Verification Planner As A Free Download
Incentia Timing Analysis and Constraint Management Software Adopted by Ambarella
Grace Semiconductor Manufacturing Corporation (Grace) selects Silicon Canvas Laker Layout Solution for their Custom IC Designs
MEDIA ALERT/Discover How Altera's Programmable Solutions Drive Innovation in the Broadcast Industry at IBC2006
Agilent Technologies Completes Acquisition of Xpedion Design Systems, a Leading Provider of RFIC Simulation and Verification Software
Sequence Extends EM And V-drop Analysis To Full-Custom Designs
Toshiba Adopts Cadence QRC Extraction for 65-nm Design Flows
Novas Launches 2006 User Conference Program
Aprio Contributes Technology to Si2's Design to Manufacturing Coalition
Other IP & SoC News
FSA Announces 2006 FSA Suppliers Expo and Conference
JamTech Achieves Technology Breakthrough for Digital Audio
Samsung Electronics First to Mass-produce 1Gb DDR2 Memory with 80nm Process Technology
IBM, Chartered, Infineon and Samsung Announce Process and Design Readiness for Silicon Circuits on 45nm Low-Power Technology
New High-End Intel(R) Server Processors Expand Performance Leadership
Semtech Announces Selected Second Quarter Results
AMI Semiconductor, Inc. Appoints Ted Tewksbury as President and Chief Operating Officer
Sigma Designs, Inc. Reports Second Quarter Results
Micron Technology Delivers the World's Densest Server Memory Module for Data Intensive Applications
Microchip Technology Unveils Industry's First 1.5A LDO with Shutdown, User-Programmable Power Good and Bond-wire Compensation on Single Chip
Avago Technologies Raises Reliability and Performance of RFIC Amplifiers for Cellular, DBS and CATV Networks
Analog Devices' Video And Audio Signal Processing Expertise Advances Hard Disk Recorder With High-Definition DVD Capabilities
Rail-to-Rail Comparators Are Industry's First to Include LVDS Outputs
AMI Semiconductor Announces Availability of BelaSigna(R) 250 Rapid Prototyping Module
Intersil Promotes Roberto Magnifico to Vice President of European Sales
Xceive Announces Low Power, High Performance Single-Chip RF-to-Baseband Receivers