Xilinx, Inc. is an American technology company, primarily a supplier of programmable logic devices. It is known for inventing the field-programmable gate array (FPGA) and as the first semiconductor company with a fabless manufacturing model. Founded in Silicon Valley in 1984, the company is headquartered in San Jose, California. Xilinx is a $2.3 billion company with over 3,000 employees worldwide.
Neeraj Varma, Director, Sales, Xilinx India, spoke to Ashwin Gopinath about the company’s new portfolio of 28nm products and its migration to the 20nm node.
Q. Could you walk us through the advanced features that the 28nm process node has enabled in your chips?
A. There are three products in the 28nm portfolio: FPGAs, SoCs and 3D-ICs.
FPGAs are, as you know, field-programmable gate arrays, where you can program any digital logic onto the silicon. However, by utilising 28nm, we’ve integrated a lot more components. FPGAs now have transceivers, both parallel and serial (serial being the more important, with bandwidth support of 28 Gb per second per channel). This was also one of the first times we integrated a very capable analog mixed-signal capability, using an analog-to-digital converter inside the chip.
We also offered a “Programmable SoC” family, which houses a dual-core ARM Cortex-A9. The two Cortex-A9 processors are tightly integrated with the FPGA. Thus, the FPGA provides the hardware programmability, while the ARM provides the software programmability. It also continues to have the analog mixed-signal capability. So, it is a true SoC, but it’s all-programmable. Everything on that chip is programmable.
We have also introduced our 3D-ICs. Xilinx was the first company not only to introduce them but also to take them to production. We’ve managed the supply chain and the ecosystem to get a 3D-IC in place, and we are actually shipping this device to hundreds of customers right now.
Q. What were the key issues that were solved by utilising the new process node?
A. The key theme when we developed this was power. Everybody thinks that FPGAs are power hungry, but we’ve been able to solve that problem significantly: going from 40nm to 28nm, we’ve reduced power consumption by almost 50%. The all-programmable FPGAs are clearly a generation ahead of our competition in terms of memory bandwidth, transceiver performance, DSP performance and integration capability.
The SoCs were shipped in Q4 of 2011. We are more than a year ahead of our competitors, who are still talking about theirs while we’ve been shipping ours for almost a year. This has been one of the most successful design platforms in the history of Xilinx. The design pipeline that we have for this product equals our FPGA design pipeline.
Having this integration allowed us to increase gate capacity. We were able to get 2 million logic cells onto a single device. Now, just to put it in perspective, that’s about 20 million ASIC gates. Our nearest competitor can only do half of that amount. The only reason for that is our 3D-IC capability. The 3D-ICs were shipped along with the programmable SoCs in Q4 of 2011.
Q. What kind of challenges did you face while designing a portfolio so far ahead of your competitors?
A. We partnered with a fab (TSMC) and chose a process called 28nm HPL (High Performance Low power). Usually, a lot of resources go into choosing the right process. Our R&D team took a bet on HPL because they thought it was the best process. The other options were HP (high performance) or LP (low power). This was a process that we co-developed with our fab partner. I think that worked in our favor by enabling us to stay a generation ahead of the others.
Q. Could you elaborate a bit on the 3D-IC?
A. Essentially, in a 3D-IC, we stack different dies in a single package. The silicon dies are connected through what we call silicon interposers, which have tens of thousands of connections between each die. Stacked side-by-side, that essentially becomes a single package. The range of things we can put on these chips is immense: algorithms, software, microprocessors, I/Os, protocols, logic and so on. We’ve also integrated, along with the FPGA, a transceiver die. This is the die that helps us go to the 28 Gb/sec bandwidth, which is the highest bandwidth among FPGAs today.
Q. What were the biggest challenges while working with the 3D-IC technology?
A. The 3D-IC has been a totally different ball game altogether. Our research group inside Xilinx has worked on this product for the last five years, so we have a very clear idea about what we are doing. It’s not enough to just come up with a chip; we need to make it manufacturable too. In order to do this, our team worked with universities, fab partners and packaging vendors to essentially come up with an ecosystem.
Another thing to note is that being the only company to do this would not be good for us either, so we’ve worked with semiconductor companies as part of the semi group that we have, and they’ve also adopted this technology. Although we are the first, we don’t want to be the only ones. If I had to pinpoint someone to give credit to for our being so far ahead, it would have to be every single member of our research team.
Q. With the change to 28nm, what major changes were required from your software tools?
A. One of the reasons FPGA companies are so successful is their software. That is also one of the reasons startup FPGA companies have not been so successful. We at Xilinx have traditionally had a design suite called ISE. Now, at 28nm, the complexity of our devices has grown exponentially, and that tool has no longer been able to keep pace. For example, implementing a design takes hours. With a large design, we have had cases where the customer spent 8 hours implementing it, including compiling, routing and so on. So, Xilinx decided to build a completely new tool suite that can cope with issues such as run time and quality of results.
Q. What are the things to keep in mind when designing a big or complex device?
A. When you design a big device, you are not going to design everything from scratch; you are going to use a lot of IPs. In this case, you either build your own IPs in-house or you use third-party IPs. You need lots of IPs in either case, which is not helped by the fact that most design teams today are located far away from each other. What you need is a team-based approach where people in different places can work in tandem. All these things were thought about and we came up with Vivado, which is a four-year effort entirely by Xilinx. The tool was built from the ground up and will be our flagship product for the next decade. So, it is going to take care of the 7 series, which is the 28nm, the 8 series, which is the 20nm, and beyond. It’s already being used on 30% of the designs and we foresee it taking over close to 100% of FPGA designs.
In Vivado, first of all, the run time is significantly lower. So, if you want to synthesise your design or compile your entire design, Vivado can bring down the run time from months to weeks. Second is the time to implementation. You need to plan and implement, and you need to keep track of every change that you make in the design. That time too has come down along similar lines. Third is the quality of results, where we expect a minimum 20% improvement. Quality of results is usually measured using two parameters: one is performance and the other is area. The other improvements are in the integration of IPs, which has become much simpler, and in IP packaging: if a customer is making their own IP, it has become much simpler to integrate it into the design.
Q. So, now that Xilinx is in the 20nm domain, what would you say are the key differences between the 28nm and the 20nm?
A. From an FPGA standpoint, the transceivers are now system-optimised, so they take care of the entire system. We have more than 100 transceivers per chip running at 33 Gbps. We have DSP blocks and block RAMs inside, and those will be faster now. So far, we’ve been supporting DDR1, DDR2 and DDR3 memories, but now that DDR4 is coming up, we’ll be supporting it as well. This allows us to give twice the memory and, in terms of routing architecture, more than 90% routing utilisation. This is enabled not just by the routing architecture but also by the tools. Traditionally, most FPGAs struggle at 70-80% routing utilisation, but we are able to use 90% of the routing capability. We’ve optimised power so that the device uses only half of what it used to, and the tool gets more granular controls to optimise your power. We’ve enhanced the mixed-signal capability two-fold.
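As a rough back-of-the-envelope check on the figures quoted above, the short Python sketch below works out the raw aggregate serial bandwidth and the relative gain in usable routing. It assumes exactly 100 transceivers (the stated lower bound), counts one direction only and ignores line-encoding overhead; the function names are illustrative and not part of any Xilinx tool.

```python
# Back-of-the-envelope arithmetic on figures quoted in the interview.
# Assumptions (not from the source): exactly 100 transceivers, one direction,
# no encoding overhead. Function names are hypothetical.

TRANSCEIVERS_PER_CHIP = 100   # "more than 100 transceivers per chip" (lower bound)
LINE_RATE_GBPS = 33           # quoted per-transceiver line rate


def aggregate_bandwidth_tbps(count, rate_gbps):
    """Raw aggregate serial bandwidth in Tb/s for `count` lanes at `rate_gbps`."""
    return count * rate_gbps / 1000


def relative_gain(old_utilisation, new_utilisation):
    """Fractional increase in usable routing when utilisation rises."""
    return new_utilisation / old_utilisation - 1


if __name__ == "__main__":
    total = aggregate_bandwidth_tbps(TRANSCEIVERS_PER_CHIP, LINE_RATE_GBPS)
    print(f"Aggregate serial bandwidth: at least {total:.1f} Tb/s per direction")

    # Quoted routing utilisation: 70-80% traditionally versus 90% now.
    for old in (0.70, 0.80):
        gain = relative_gain(old, 0.90)
        print(f"From {old:.0%} to 90% routing: {gain:.1%} more usable routing")
```

On these assumptions, moving from 80% to 90% utilisation yields 12.5% more usable routing on the same fabric, and moving from 70% yields noticeably more.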
We are now offering heterogeneous multi-core SoCs (SoCs with asymmetric cores). Performance is optimised across the multiple cores, the memory and the fabric, so that transactions between the processor and the FPGA are much more efficient. Power optimisation is a key factor in this segment too. There is a lot of integration done so that it reduces the bill-of-materials cost for the customer.
And in 3D-ICs, we’ve always had FPGAs and transceivers, but now we also have wide memory. What we are going to offer is memory, FPGA and transceivers. We will be offering an industry-standard interface between dies, so that if a customer has another ASIC that needs to be integrated, we can accommodate it. That’s the reason our interface is industry standard. We are offering cutting-edge functionality, with future transceiver protocols being supported up to 56 Gbps, and the density will be enhanced by 50% to three million logic cells, which is equivalent to about 30-40 million ASIC gates.
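The density figures can be sanity-checked with a few lines of Python. This is only a sketch: it takes the earlier 28nm figure of 2 million (equivalent to about 20 million ASIC gates) as the baseline, and the helper names are illustrative rather than anything Xilinx publishes.

```python
# Sanity check of the quoted density figures.
# Baseline values come from the interview; helper names are hypothetical.

CELLS_28NM = 2_000_000        # quoted 28nm 3D-IC capacity
ASIC_GATES_28NM = 20_000_000  # quoted ASIC-gate equivalent at 28nm


def scaled_cells(cells, increase):
    """Cell count after a fractional density increase (0.50 means +50%)."""
    return round(cells * (1 + increase))


def gates_per_cell(asic_gates, logic_cells):
    """Implied ASIC gates per logic cell."""
    return asic_gates / logic_cells


if __name__ == "__main__":
    cells_20nm = scaled_cells(CELLS_28NM, 0.50)
    print(f"20nm capacity after +50%: {cells_20nm:,} logic cells")

    ratio_28 = gates_per_cell(ASIC_GATES_28NM, CELLS_28NM)
    low = gates_per_cell(30_000_000, cells_20nm)
    high = gates_per_cell(40_000_000, cells_20nm)
    print(f"Implied ASIC gates per cell at 28nm: {ratio_28:.1f}")
    print(f"Implied ASIC gates per cell at 20nm: {low:.1f} to {high:.1f}")
```

A 50% increase on 2 million does give the quoted three million, and the lower end of the 30-40 million range corresponds to the same ratio of about ten ASIC gates per logic cell implied by the 28nm figures.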
As far as Vivado is concerned, with the addition of AutoESL, which allows us to convert C to VHDL, we will be able to offer much faster C-based verification, 4x faster C to RTL and 3-100x faster RTL simulation, along with 5x faster IP reuse and time-to-IP integration.