Xilinx, Inc. is an American technology company, primarily a supplier of programmable logic devices. It is known for inventing the field-programmable gate array (FPGA) and as the first semiconductor company with a fabless manufacturing model. Founded in Silicon Valley in 1984, the company is headquartered in San Jose, California. Xilinx is a $2.3 billion company with over 3,000 employees worldwide.
Neeraj Varma, Director, Sales, Xilinx India, spoke to Ashwin Gopinath about the company’s new portfolio of 28nm products and its migration to the 20nm node.
Q. Could you walk us through the advanced features that the 28nm process node has enabled in your chips?
A. There are three product families in the 28nm portfolio: FPGAs, SoCs and 3D-ICs.
FPGAs are, as you know, field-programmable gate arrays, where you can program any digital logic onto the silicon. By moving to 28nm, however, we’ve integrated a lot more components. FPGAs now have transceivers, both parallel and serial (serial being the more important, with bandwidth of up to 28 Gbps per channel). This was also one of the first times we integrated a very capable analog mixed-signal block, using an analog-to-digital converter inside the chip.
We also offer a “Programmable SoC” family, which houses a dual-core ARM Cortex-A9: two Cortex-A9 processors tightly integrated with the FPGA fabric. Thus, the FPGA provides the hardware programmability, while the ARM provides the software programmability. It also retains the analog mixed-signal capability. So it is a true SoC, but it’s all-programmable. Everything on that chip is programmable.
We have also introduced our 3D-ICs. Xilinx was the first company not only to introduce them but also to take them to production. We’ve put the supply chain and the ecosystem in place to build a 3D-IC, and we are actually shipping these devices to hundreds of customers right now.
Q. What were the key issues that were solved by utilising the new process node?
A. The key theme when we developed this was power. Everybody thinks that FPGAs are power hungry, but we’ve been able to address that problem significantly: going from 40nm to 28nm, we’ve reduced power consumption by almost 50%. The all-programmable FPGAs are clearly a generation ahead of our competition in terms of memory bandwidth, transceiver performance, DSP performance and integration capability.
The SoCs started shipping in Q4 of 2011. We are more than a year ahead of our competition, who are still talking about it while we’ve been shipping for almost a year. This has been one of the most successful design platforms in the history of Xilinx. The design pipeline that we have for this product equals our FPGA design pipeline.
Q. What kind of challenges did you face while designing a portfolio so far ahead of your competitors?
A. We partnered with a fab (TSMC) and chose a process called 28nm HPL (high performance, low power). Usually, a lot of resources go into choosing the right process. The other options were HP (high performance) and LP (low power), but our R&D team took a bet on HPL because they thought it was the best process. HPL was a process that we co-developed with our fab partner. I think that worked in our favor by enabling us to stay a generation ahead of the others.
Q. Could you elaborate a bit on the 3D-IC?
A. Essentially, in a 3D-IC, we stack different dies in a single package. The silicon dies are connected through what we call silicon interposers, which provide tens of thousands of connections between the dies. The dies sit side by side on the interposer, and together they become a single package. The range of things we can put on these chips is immense: algorithms, software, microprocessors, I/Os, protocols, logic and so on. Along with the FPGA dies, we’ve also integrated a transceiver die. This is the die that takes us to 28 Gbps, which is the highest transceiver bandwidth among FPGAs today. The 3D-ICs were shipped along with the programmable SoCs in Q4 of 2011. Having this integration allowed us to increase gate capacity. We were able to get 2 million logic cells onto a single device. Just to put that in perspective, that’s about 20 million ASIC gates. Our nearest competitor can only do half of that amount. The only reason for that is our 3D-IC capability.
Q. What were the biggest challenges while working with the 3D-IC technology?
A. The 3D-IC has been a totally different ball game altogether. Our research group inside Xilinx has worked on this product for the last five years, so we have a very clear idea about what we are doing. It’s not enough to just come up with a chip; we need to make it manufacturable too. To do this, our team worked with universities, fab partners and packaging vendors to essentially come up with an ecosystem.
Another thing to note is that being the only company to do this would not be good for us either, so we’ve worked with other semiconductor companies as part of the semi group that we have, and they’ve also adopted this technology. Although we are the first, we don’t want to be the only ones. If I had to pinpoint someone to give credit to for our being so far ahead, it would have to be every single member of our research team.
Q. With the change to 28nm, what major changes were required from your software tools?
A. One of the reasons FPGA companies are so successful is their software. That is also one of the reasons startup FPGA companies have not been so successful. Here at Xilinx, our traditional design suite was called ISE. At 28nm, the complexity of our devices has grown exponentially, and ISE could no longer keep pace. For example, implementing a design takes hours. With large designs, we have had cases where the customer spent 8 hours implementing the design: compiling, routing and so on. So Xilinx decided to build a completely new tool suite that can cope with issues such as run time and quality of results.
Q. What are the things to keep in mind when designing a big or complex device?
A. When you design a big device, you are not going to design everything from scratch; you are going to use a lot of IP. You either build your own IP in-house or use third-party IP. Either way you need a lot of it, which is not helped by the fact that most design teams today are spread across distant locations. What you need is a team-based approach where people working in different places can work in tandem. All these things were thought about, and we came up with Vivado, which is a four-year effort entirely by Xilinx. The tool was built from the ground up and will be our flagship product for the next decade. It will take care of the 7 series, which is 28nm, the 8 series, which is 20nm, and beyond. It’s already being used on 30% of designs, and we foresee it taking over close to 100% of FPGA designs.
In Vivado, first of all, the run time is significantly lower. If you want to synthesise or compile your entire design, Vivado can bring down the run time from months to weeks. Second is the time to implementation. You need to plan and implement, and you need to keep track of every change that you make in the design. That time too has come down along similar lines. Third is the quality of results, where we expect a minimum 20% improvement. Quality of results is usually measured using two parameters: performance and area. The other improvements are in IP integration, which has become much simpler, and IP packaging: if a customer is making his own IP, it has become much simpler for him to integrate it into the design.
Q. So, now that Xilinx is in the 20nm domain, what would you say are the key differences between 28nm and 20nm?
A. From an FPGA standpoint, the transceivers are now system optimised, so they take care of the entire system. We have more than 100 transceivers per chip running at 33 Gbps. The DSP blocks and block RAMs inside will be faster now. So far we’ve supported DDR1, DDR2 and DDR3 memories, but DDR4 is coming up, so we’ll be supporting that as well. This allows us to give twice the memory. In terms of routing architecture, we will allow more than 90% routing utilisation. This is enabled not just by the routing architecture but also by the tools. Traditionally, most FPGAs struggle at 70-80% routing utilisation, but we are able to utilise 90% of the routing capability. We’ve optimised power so that the device uses only half of what it used to, and we give the tool more granular controls to optimise your power. We’ve enhanced the mixed-signal capability two-fold.
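As a rough illustration of what the figures above imply, the short sketch below multiplies out the quoted transceiver count and line rate, and compares the quoted routing utilisation levels. It treats the interview's round numbers as inputs, ignores encoding overhead, and uses function names that are purely illustrative, so the results are lower-bound estimates rather than device specifications.

```python
# Back-of-the-envelope check of figures quoted in the interview.
# Inputs are the interview's round numbers, not datasheet values.

def aggregate_gbps(lanes: int, gbps_per_lane: float) -> float:
    """Raw aggregate line rate in one direction, ignoring encoding overhead."""
    return lanes * gbps_per_lane


def utilisation_gain(old_fraction: float, new_fraction: float) -> float:
    """Relative increase in usable routing when utilisation rises."""
    return new_fraction / old_fraction - 1.0


# "More than 100 transceivers per chip running at 33 Gbps" (lower bound).
total = aggregate_gbps(lanes=100, gbps_per_lane=33)
print(f"Aggregate line rate: at least {total / 1000:.1f} Tbps per direction")

# "Traditionally 70-80% routing" versus the quoted 90%.
for old in (0.70, 0.80):
    gain = utilisation_gain(old, 0.90)
    print(f"{old:.0%} -> 90% utilisation: {gain:.1%} more usable routing")
```

On these assumptions, going from 80% to 90% utilisation yields 12.5% more usable routing on the same silicon.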
We are now offering heterogeneous multi-core SoCs (SoCs with asymmetric cores). Performance is optimised across the multi-core processors, memory and fabric, so that transactions between the processor and the FPGA are much more efficient. Power optimisation is a key factor in this segment too. There is a lot of integration, which reduces the bill-of-materials cost for the customer.
And in 3D-ICs, we’ve always had FPGAs and transceivers, but now we also have wide memory. What we are going to offer is memory, FPGA and transceivers. We will be offering an industry-standard interface between dies, so that if a customer has another ASIC that needs to be integrated, we can accommodate it. That’s the reason our interface is industry standard. We are offering cutting-edge functionality, with future transceiver protocols supported at up to 56 Gbps, and density will be enhanced by 50% to three million logic cells, which is equivalent to about 30-40 million ASIC gates.
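The density figures quoted here and earlier in the interview can be cross-checked with simple arithmetic. The sketch below assumes the 2 million capacity figure and the 20 million ASIC-gate equivalent quoted for 28nm, applies the quoted 50% increase, and derives the implied ASIC-gate ratio. The constant and function names are illustrative only.

```python
# Round figures quoted in the interview; these are not datasheet values.
CELLS_28NM = 2_000_000                       # capacity quoted for the largest 28nm device
ASIC_GATES_28NM = 20_000_000                 # quoted ASIC-gate equivalent at 28nm
DENSITY_INCREASE = 0.50                      # quoted density gain for the next generation
ASIC_GATES_NEXT = (30_000_000, 40_000_000)   # quoted range for the next generation


def gates_per_cell(asic_gates: int, cells: int) -> float:
    """ASIC gates represented by one unit of FPGA capacity (implied ratio)."""
    return asic_gates / cells


cells_next = int(CELLS_28NM * (1 + DENSITY_INCREASE))
ratio_28nm = gates_per_cell(ASIC_GATES_28NM, CELLS_28NM)
ratio_next = tuple(gates_per_cell(g, cells_next) for g in ASIC_GATES_NEXT)

print(f"Next-generation capacity: {cells_next:,}")
print(f"Implied ASIC gates per unit at 28nm: {ratio_28nm:.1f}")
print(f"Implied ASIC gates per unit, next generation: "
      f"{ratio_next[0]:.1f} to {ratio_next[1]:.1f}")
```

The 50% increase is consistent with the three million figure, and the quoted 30-40 million range corresponds to roughly 10 to 13 ASIC gates per unit of FPGA capacity.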
As far as Vivado is concerned, with the addition of AutoESL, which allows us to convert C to VHDL, we will be able to do much faster C-based verification, with 4x faster C to RTL and 3-100x faster RTL simulation, along with 5x faster IP reuse and time to IP integration.
Q. What are the advantages of integrating memory on the chip rather than using it as an external element, as used to be the case?
A. Typically, if you look at a board, you have a chip (a processor, an FPGA and so on) and then you have the memory, either DRAM or SRAM. The biggest bottleneck is the bandwidth between memory and chip: how much data you can buffer, process and take out. That is limited entirely by the board. Now imagine us taking the memory inside the chip. It’s a really high-capacity memory that now sits inside the chip, which translates to higher bandwidth. That’s the most significant benefit of integrating the memory on the chip, which is what we’ve done at 20nm. Before the second generation, we never had memory on the chip. Now, by adding memory, we have ensured that designers don’t have to focus on the memory aspect at all. On the transceiver side, we used to do 28 Gbps; now we’re going from 33 to 56 Gbps.
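To see why moving memory into the package raises bandwidth, it helps to recall that peak bandwidth is simply interface width multiplied by transfer rate. The sketch below compares a standard 64-bit DDR3-1600 channel on a board with a wide in-package interface. The in-package width and rate are hypothetical numbers chosen only for illustration; the interview does not give them.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, megatransfers_per_s: float) -> float:
    """Peak bandwidth in GB/s: bytes per transfer times transfers per second."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * megatransfers_per_s * 1e6 / 1e9


# One standard 64-bit DDR3-1600 channel routed across the board.
external = peak_bandwidth_gb_s(bus_width_bits=64, megatransfers_per_s=1600)

# HYPOTHETICAL wide in-package interface: 1024 bits at 1000 MT/s.
# These two numbers are illustrative assumptions, not Xilinx figures.
in_package = peak_bandwidth_gb_s(bus_width_bits=1024, megatransfers_per_s=1000)

print(f"External DDR3-1600 channel: {external:.1f} GB/s")
print(f"Hypothetical wide in-package memory: {in_package:.1f} GB/s")
print(f"Ratio: {in_package / external:.0f}x")
```

A board can only route so many memory traces, whereas connections inside a package can be far more numerous, so a much wider interface becomes practical even at a lower transfer rate.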
Q. So, what are the performance improvements Xilinx is looking at from this second generation portfolio?
A. From an FPGA perspective, at 20nm, we are going to see a 30-50% improvement in price per performance per watt. What that means is a 30-50% price reduction, performance improvement and power reduction, all for the same capability. Our analog mixed signal will become mainstream. We already have a very significant capability at 28nm, and in addition a lot of IP subsystems will be integrated into it.
From an SoC standpoint, we only had the dual-core Cortex-A9. Now we are introducing heterogeneous multi-core, which means there is not just the ARM core inside but also a GPU core, plus the FPGA processing and a software environment that suits it.
For 3D-ICs, we have the FPGA die and the transceivers that I talked about, and now we are also integrating wide memory onto the 3D-IC.
Q. How does Xilinx differentiate itself from its competitors in the FPGA market? What would you say sets Xilinx apart?
A. If you compare an FPGA company like us with companies that make standard products, the basic benefit we provide to customers is programmability and flexibility. So when we design our product portfolio, we make sure that we give them the right amount of resources at the right price point, with the power and performance benefits included. For every node, we come up with a portfolio of devices at the right gate densities, and we look at different applications, for example a 100 Gbps network block or a flat-panel TV, and ask what kind of capabilities they would need. So we have to consider a wide array of applications but really optimise a set of devices that covers most of them. A standard product provider would make a chip for a TV and say that this chip is used for image processing in a TV and nothing else. Our FPGA will do that, and it can also be used for military applications. The beauty of FPGAs is that the same product is used across different vertical markets. Thus, we have a variety of different customers for the same product.
Q. What are the market needs that you are addressing with this 2nd generation?
A. If you look at our FPGAs, we can enable multiple 100 Gbps wired networks now. The advantage we are providing is that we are increasing ports per board and ports per dollar, so you get more ports for the same dollar spent. This is the result of us giving customers more integration, more density, more transceiver bandwidth and more memory. If you look at radio base stations, we are now able to offer multi-channel wireless radios. There, the challenge is to provide more performance at lower power levels.
For our SoC products, we serve a wide variety of applications spanning medical, industrial, machine vision, military and so on. What we are trying to do in all of them is image processing and analytics: you get an image, you process it and you also do analytics on it. Take surveillance, for example. It not only involves taking pictures but also needs expert analysis. Those are the main applications that will be simplified by the SoCs. Also, any data center or cloud-based service that needs secure processing in the data lines can utilise this SoC.
3D-ICs are primarily in the telecom space. They will go up to 100-400 Gbps smart networks and will allow us to work on next-generation protocols like the SERDES Framer Interface Level 5 protocol, which needs 40 Gbps.