[RFC] RFSoCs from Xilinx #1
Comments
Any information about the level of bugs and misfeatures in this new shiny device? |
I think this particular device is more suited to superconducting qubit applications than trapped ion applications, so I think those users should drive any development. In sc qubits, the name of the game is latency, latency, and latency. Once you have some tests done to see how bad the various pipeline delays are, it will inform how useful this may or may not be. Measurement is carried out in a few hundred ns, then you choose to apply an error correction pulse or not, and with the round-trip propagation delays between generator and mixing-chamber plate of the dilution refrigerator where the qubits live, you will need to have ~few hundred ns between the end of the measurement time at the ADC input and the start of a subsequent pulse which is conditioned on the measurement data. For ion traps, the win would be if the tight integration means simpler data transmission protocols, but if the amplitude resolution and temperature coefficients are not that good on the ADCs or DACs relative to the current Sayma, I am not sure it would be worth it. |
https://www.e2v.com/products/semiconductors/dac/ |
Sounds good, but 10-bit and 12-bit DACs may not be fine enough resolution for ion trap applications. For sc qubit applications these might be good. |
also, 48 lvds pairs per DAC channel :) parallel wins for latency but it comes at a cost in channel count. |
Yep, this is exactly where the problem is. I need plenty of channels plus plenty of SDRAM channels, so I have to choose: either ADCs/DACs or SDRAM. Take into account that to talk to such ADCs and DACs you have to use IOSERDES with delay-tuning logic, which is also not free in terms of logic. |
EV12DS460A DAC is 1.4k EUR (@10 pcs). |
Ignoring the repetitive vague FUD and the whining, I am pretty sure that RFSoCs are the right and ultimately only way to do scalable large quantum systems with ion traps as well. Get more of the entire chain from smart and local control logic, DSP, ADC/DAC, AFE, drivers, all the way to the interface closer to the ions. I don't see an alternative approach to space, power and communication complexity challenges. Hilarious to suggest 48 LVDS pairs per DAC channel are a good idea. And once you can do fast gates, you want low latency to do fast feedback/EC algorithms. That applies to ions as well. If some ion trapper wants to get such an integrated and scalable system going they better start yesterday. |
I simply want to do the same as I did with the AFCZ board and the AMC box. They were created for a WUT project, but with other applications in mind. One of the CERN groups will use AFCZ for low-level RF. It is compatible with the Sayma RTM, so if someone one day wants to use a Zynq to drive the DACs, there is an easy migration path. For the WUT project we needed only 3 DDR banks and an FMC. I want to do the same with the SDR for the WUT project. We need ADC and DAC channels and some SDRAM. I want to use the same AFE boards as for Sayma and make it compatible with the AFCZ/Sayma AMC, with a future migration option to a simple AMC without an FPGA. |
@dhslichter I'd try really hard to answer stability requirements using active, better, faster, more frequent, and much more local calibration cycles supported by local and decentralized control. And I'd avoid converting them into passive bit depth and tempco requirements if at all possible. It should be possible to close that gap using the 14 bit RFSoCs that are available. |
@jordens That's true, I can synthesize some basic loop-back design and see what the Xilinx tool says about the latency. |
This gives a nice perspective on how much power can be saved when we get rid of the JESD or LVDS interfaces. |
100% agree. If you really want to talk about scaling up then these kinds of RFSoCs are going to be a must in terms of cost, space, power consumption, latency etc. Designs like Shuttler and Sayma are ideal for complex physics experiments or medium (~100 ion) QC demos, but don't seem that well suited to going to thousands of qubits.
Yes, specifying everything to be 16-bit and low tempco is the lazy approach. It works okay for physicist/research type systems where it's hard to pin down an exact specification since the challenges aren't well understood, but in the long run we need to think more carefully about what our actual minimal specifications are, and what the simplest way of achieving them is (e.g. local feedforwards based on integrated temperature sensors). |
@gkasprow the example design I mentioned above (rfdc) seems to do just that (dac to adc loopback) and has a lot of latency calculation and calibration code on board if I am not mistaken. Maybe just play with that. |
I saw it. But it has so many variables like inter-tile alignment and various synchronisation features. |
I agree with this as well; it's just an enormously challenging problem, and frankly the career incentives for deep multi-year dives into infrastructure development are nil unless that is your specific job on an industrial QC team. I also think that, while clearly frustrating, the kind of incremental design steps represented by Sayma and Shuttler (better than a PDQ or a DDS backplane on a KC705, but not the final version for a full scalable QC) are important and dangerous to skip entirely; you have to try this hardware out with actual ion traps to be able to catch subtle bugs. You also need to demonstrate progress on the physics side on a regular basis, which mandates not having to wait for the platonic ideal of QC hardware to happen -- you need better hardware to do fancier experiments, but you can't wait for notional perfection. Frankly, for ion traps, I think the endgame would ideally have ADCs and DACs and logic in the same silicon, which also uses upper metal layers to define trap electrodes, because this would solve the interconnect problem too. Your AOMs are integrated nanophotonics and only need low power drive, etc. Put the whole damn lab on the chip with the ions trapped above it. Of course, there are a few steps between here and there ;)
Again, you are preaching to the choir, but the amount of time and bandwidth required to design and implement this kind of system is substantial, and you will always end up spending ages on all the unexpected corner cases and whatnot as in any project (deterministic latencies? clock distribution quality? phase noise? all of these things need to be studied and the answers determined). I'm all in favor of rapidly interleaved self-calibrations of all sorts to make tempco and bit depth specs more relaxed, but it's kind of hilarious to be lectured about this when we are currently cutting loose all of the power detectors on Sayma AFE, for example, because of a feeling that they won't be used or that nobody will write the code to implement the desired automated calibrations. Anyway, I am glad that @gkasprow is working on these things, and I think they are important and will pay dividends down the road. Live calibration loops and tricks like using the (for ions, unnecessarily high) sample rate to do some sigma-delta modulation to increase the effective bit depth would be useful pieces of the puzzle to develop. The silicon cost of ~$1.5k per DAC/ADC pair is not prohibitively high, but putting hundreds of channels together is going to be a spendy proposition as well -- another reason why physicists shy away, and why @hartytp has spent so much time niggling over $2 price differentials for odds and ends on these boards. |
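To make the sigma-delta idea mentioned above concrete, here is a minimal sketch of a first-order error-feedback requantizer in plain Python. It is an illustration only, not anything from the Sayma or RFSoC gateware: the function names, the choice of dropping 2 bits (e.g. 16-bit samples onto a 14-bit DAC) and the constant test value are all assumptions. The truncation error of each sample is carried into the next one, so the coarse output toggles between adjacent codes and its average tracks the fine value.

```python
def requantize_truncate(samples, drop_bits):
    """Drop the lowest `drop_bits` bits of every sample (plain truncation)."""
    step = 1 << drop_bits
    return [(s // step) * step for s in samples]


def requantize_sigma_delta(samples, drop_bits):
    """First-order error feedback: add the previous truncation error
    to the next sample before truncating it (hypothetical helper)."""
    step = 1 << drop_bits
    out = []
    err = 0
    for s in samples:
        v = s + err
        q = (v // step) * step  # floor onto the coarse grid
        err = v - q             # always in the range 0 .. step-1
        out.append(q)
    return out


# Assumed example: a constant level that falls between two coarse codes.
N_SAMPLES = 4096
TARGET = 1001
DROP_BITS = 2

samples = [TARGET] * N_SAMPLES
truncated = requantize_truncate(samples, DROP_BITS)
dithered = requantize_sigma_delta(samples, DROP_BITS)

mean_truncated = sum(truncated) / N_SAMPLES
mean_dithered = sum(dithered) / N_SAMPLES
print(mean_truncated, mean_dithered)  # prints 1000.0 1001.0
```

The recovered resolution only exists after averaging, so in practice it relies on the analog bandwidth after the DAC being much lower than the sample rate. The sketch also ignores clipping at full scale.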
@jordens Please stop shutting down all discussions about hardware bugs. The sorry state of modern hardware is documented by other people, e.g. https://www.embeddedrelated.com/showarticle/988.php - it's not me being "irrational", "whining", or spreading "vague FUD" as you say. It is a real issue; causing pain, frustration, delays and costs. Since the development time on such projects can be completely dominated by debugging, it needs to be addressed appropriately, e.g. by having a number of employees whose full-time job is to figure out hardware misbehavior, by having closer collaborations with the designers of the chip, and/or by preferring silicon vendors who care about those issues (e.g. SiLabs and not HMC). You are the ones being irrational here and overly excited about this new Xilinx gadget. In my opinion, a more reasonable stance would be: "for political reasons, we need to make incremental improvements to the control hardware, and a particularly interesting one would be to increase the channel density dramatically on an AWG-type device. This requires integrating the ADC/DACs with the logic on the same chip, and the only product on the market today is the Xilinx RFSoC. We think that the advantages of this approach outweigh the problems associated with this chip, such as having to dedicate large amounts of resources to debugging and sorting out quirks." The source for the risk of physical damage to the chip when not programmed in the official manner is the same as the source for Greg's specs of the next RFSoC - someone told me and you won't find it with Mr Google. Different people have different concerns, it would seem. |
But you are not even discussing hardware bugs let alone proposing workable solutions to anything or evaluating a specific strategy. You just say that they (bugs and solutions) exist. Nobody doubts that and it is certainly of crucial importance to allocate sufficient resources to working around potential bugs. But reiterating that fact is not contributing to this discussion. What hardware bug relevant here do you want to discuss? How does that bug relate specifically to Greg's proposal? How does knowledge about the bug allow for better alternative proposals? Those would be good to hear as part of a discussion of hardware bugs. |
Here's my synthesis and take on the whole situation:
|
Getting back to specific technical and strategic questions, @gkasprow what's the plan with AFEs and panel connectors on that RTM? Is there anything coming from the WUT MIMO SDR or the NBI SC qubit people in terms of alternatives to analog over SMP/FMC and SMA/MMCX on panels? Given the device-internal crosstalk, I guess FMC might be just fine. You say they need high channel density. |
Agreed that FMC sounds totally fine given -70 dB typical internal crosstalk. Honestly at these crosstalk levels I'd like to see us investigating COTS impedance-matched multi-channel differential cables (e.g. HDMI) to ship signals around the lab, rather than coax with SMA/MMCX. That also pushes the balun problem out to a user-defined endpoint board, takes them out of the crate entirely and lets people make choices depending on their frequency band of interest. |
We plan to use the same FMC-like boards as in the Sayma RTM. The SC people I know don't care what connectors are used. |
I got an offer for the XCZU25DR-2FFVG1517E. It is $8.6k at 2 pcs, so it looks quite good. |
Do you need an AFE mezzanine for this, or can you put the AFE directly on the RTM? |
For the WUT project I need to make at least 3 frontends for different radio bands, so an AFE could be a good idea. One of them would be an up/down converter for X-band. |
makes sense. |
OK, so I can make one assembly option. For Metlino and DRTIO we need a single GTY channel. I will route two of them to the AMC, the third one to the Artix and the fourth to PORT0. This is max 8 Gbit/s, so the assembly variant will work. |
Looking at the block diagram in figure 8.2 of the data sheet, the LMK04208 looks very similar to the HMC7044. Am I right in thinking that you're trying to use this chip for two different purposes: (1) clock recovery (2) synchronisation (like JESD204B)? As @sbourdeauducq points out, the really big question for clock recovery is deterministic latency. From a quick look at the data sheet, it seems that this chip will not produce deterministic input-output phase relationships unless you deterministically reset the output dividers. As we saw with the HMC7043, that's not always trivial to do, depending on how well designed the chip is. It would need careful prototyping and testing. You can use the FPGA to measure the phase and phase shift the clock, as we currently do with the Si5324, but that technique has limitations. The other question is, how much do you care about noise/stability? If you need the lowest possible noise then the loop bandwidth will have to be very low, which is challenging to implement in an analog circuit. Also, for low bandwidth PLLs, phase stability/drift (e.g. due to air currents on the VCXO) can be a real issue unless you implement a high-order loop filter, which is a pain to do in analog. If your requirements aren't too strict then this will almost certainly work. If you need the best possible performance then it will be challenging. A WR-type PLL (either VCXO + DAC or DCXO) would solve both of these issues. In terms of the synchronisation issues, it depends on the details of what you want to do and how well the chip was designed (how many "undocumented features" like the 7043). May work, may not, needs testing. Depending on what you want to do, you may find it easier to do your synchronisation using fast FFs and digital delay lines the way we will do on Sayma v2.0. |
I'd like to use the solution from the Sayma RTM, but I need 4GHz and 6GHz and later on will need 10GHz ADC and DAC clocks. |
The LMK04208 only goes to 3GHz, so you'll need some other PLL for such high data clocks, won't you? @gkasprow there are two elements to this: synchronisation and the clock cleaning PLL. For synchronisation I think the Sayma RTM solution will work absolutely fine even for those frequencies. The jitter on those FFs + delay lines is extremely low and they're LVPECL so very fast rising edges. I think they should work for the frequencies that the LMK04208 operates at and beyond. For the clock recovery, I think you will need some WR-type PLL. I hope we will integrate one on Sayma RTM, but you know more about WR than me so you're the best person to judge how this should be done! |
Note also that the ADF4371 that we plan to use on Sayma RTM v2.0 would work very well for you to multiply the 125MHz WR clock to the GHz range. That PLL provides deterministic output phases (the dividers are included in the feedback loop) and can go to many GHz. It seems kind of perfect for what you want, or am I missing something? |
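As a quick sanity check on multiplying the 125MHz WR clock, the sketch below computes the ratio needed for the 4GHz, 6GHz and 10GHz sample clocks mentioned earlier in this thread. It is plain arithmetic with hypothetical names; it does not model the ADF4371 or any other part, and whether a given PLL can actually use these ratios (VCO range, divider limits) is not checked here.

```python
from fractions import Fraction

REF_HZ = 125_000_000  # WR reference clock from the comment above
TARGETS_HZ = (4_000_000_000, 6_000_000_000, 10_000_000_000)


def multiplication_ratio(target_hz, ref_hz=REF_HZ):
    """Exact ratio target/reference as a fraction (hypothetical helper)."""
    return Fraction(target_hz, ref_hz)


ratios = {t: multiplication_ratio(t) for t in TARGETS_HZ}
for target, ratio in ratios.items():
    kind = "integer" if ratio.denominator == 1 else "fractional"
    print(f"{target / 1e9:g} GHz = {ratio} x 125 MHz ({kind})")
```

All three targets come out as integer multiples of the reference (32, 48 and 80).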
Note the caveat here that I don't understand the clocking/synchronisation for the RF SOCs you're using, so I can't comment on the specifics for that. The above is assuming that this all operates more or less like a JESD204B SC I system. |
LMX2594 works up to 15GHz. I want to align phases between DAC and ADC channels. In the case of radar applications it is even more critical than for ion trappers. The RFSoC ADCs and DACs work like JESD204B chips: they need a clock and a SYSREF signal.
|
SYSREF is also generated by the main LMK chip. |
Makes sense. It sounds like your plan should work, so long as those chips behave as expected. AFAICT the approach taken on Sayma v2.0 would also work (the digital delay line resolution is something like 15ps which should be good enough for what you need). So, you have options! This isn't my project and I have no preference as to which approach you take. |
First I will check the Xilinx configuration once I get the ZCU111 devkit. If this is not sufficient, I will implement the solution from Sayma. Anyway, this is not urgent. The project will officially start in 2 months. |
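For a rough feel of what a delay step of "something like 15ps" means, the sketch below expresses it as a fraction of one period of the 4GHz, 6GHz and 10GHz clocks mentioned in this thread. The 15ps figure is the approximate value quoted above, the function name is hypothetical, and the sketch makes no assumption about which clock SYSREF is actually sampled against inside the RFSoC.

```python
STEP_S = 15e-12  # approximate delay-line resolution quoted above
CLOCKS_HZ = (4e9, 6e9, 10e9)


def step_fraction_of_period(step_s, clock_hz):
    """Delay step expressed as a fraction of one clock period."""
    return step_s * clock_hz


fractions_of_period = {f: step_fraction_of_period(STEP_S, f) for f in CLOCKS_HZ}
for f, frac in fractions_of_period.items():
    period_ps = 1e12 / f
    print(f"{f / 1e9:g} GHz: period {period_ps:.1f} ps, step = {frac:.1%} of a period")
```

At 4GHz one step is about 6% of the 250ps period; at 10GHz it is about 15% of the 100ps period.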
I am a little unclear as to why we need full DRTIO control for all the AFE digital pins. How short are we on the desired pin count @gkasprow? It seems to me that many of the tasks could be efficiently carried out if we just implemented a simple SERDES with the Artix as the receiver, and forcing all timestamps to the RTIO clock (multiples of 8 mu). You could keep a couple of pins per AFE which are true RTIO outputs with full 1 mu resolution and run those directly from the SoC (e.g. for switches on and off), but other devices may well be totally happy with ~7-10 ns resolution as provided by the coarse RTIO clock. I just worry that we could expend a lot of engineering effort here to have the world's most accurately timed digital attenuators and I2C peripherals. Input amplifier gain range switching and digital attenuator updates would be completely fine if the available switching times were confined to edges of the coarse RTIO clock. |
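The coarse/fine split described above can be sketched as follows. This is a hypothetical illustration, not ARTIQ API: it assumes timestamps are integers in machine units (mu), that the coarse RTIO clock ticks every 8 mu as stated in the comment, and it rounds down to the previous coarse edge, which is an arbitrary choice.

```python
COARSE_MU = 8  # coarse RTIO clock period in machine units, per the comment above


def split_timestamp(t_mu):
    """Split a timestamp into (coarse clock cycle, fine offset in mu)."""
    return divmod(t_mu, COARSE_MU)


def force_to_coarse(t_mu):
    """Timestamp as seen by a coarse-only device: the fine offset is dropped."""
    return t_mu - (t_mu % COARSE_MU)


t = 1003
print(split_timestamp(t))   # (125, 3): used by a full-resolution RTIO output
print(force_to_coarse(t))   # 1000: used by e.g. a digital attenuator update
```

A full-resolution output would use both parts of the split, while attenuators and gain switches would only ever see multiples of 8 mu.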
It's not that much effort considering that generic DRTIO is already developed, the GTP code is already there for Kasli, and the GTY code will have to be developed to enable communication with other boards. |
@dhslichter we have just a few pins left. Only banks 84 and 88 are available. For me the top priority is SDRAM bandwidth. Moreover banks 84 and 88 have only HD pins, so don't have IOSERDES. I could use 10 pairs with IOSERDES available in 1.2V SDRAM banks but they would have to be AC coupled. Since we use 8B10B encoding, this could work with biasing.
|
I can also make a DC-coupled link between DIFF 1.2V SSTL and 2.5V LVDS when necessary. |
@gkasprow ack |
The project has officially started at WUT. Do you think it is a good idea to keep it on sinara-hw, or shall I create it in my private repo? |
I propose the name RFSOC-AMC or RSOC-SDR. I'm not sure if we should call it Sayma, because that does not say much to people outside our community. It has a completely different architecture and communication channels. Only the AFEs are similar :) |
I for one would be glad if this remains under Sinara. |
OK, so let's keep RFSOC-AMC for the moment. When a better name comes up we will change it. |
Did you find another way to run a simulation (for example, to see the spectrum)? I have another question: did you find a way to synchronize your measurement equipment and the ZCU111 dev board with the LMX2594? Do you need to configure MTS mode or non-MTS mode? How did you connect it? I tried several times but it failed. |
I'm still waiting for the ZCU111 devkit. I haven't touched it for some time since it is a low priority. |
We measured the latency: |
Thank you for this information. I am not able to see the link: https://gitlab.com/wut_ise/pados/issues/8. Could you post the result in PDF format? Thank you |
The cable lengths were compensated |
FYI, there is now an open-source superconducting qubit project based on an RFSoC eval board. https://arxiv.org/abs/2110.00557 Sandia have a similar thing aimed at ion traps too. Not sure if that will be opened up. |
Yes, I know. I wrote to David already :) |
I got involved in a new project at WUT.
We will build a wideband MIMO SDR, probably on the new RFSoC that:
You won't find any info with Mr Google. I got this news from my colleague.
For the WUT project the existing 6G DAC with DUC and 4G ADC with DDC would probably be enough, but I want to play with the newest SoCs.
For what applications would you use such a beast? Would such an upgrade for the Sayma ecosystem be useful for ion trappers?
My colleagues would love to use it for superconducting qubit readout.
I still don't know what the ADC-DAC latency is. They need <1us including the demodulation and modulation algorithm. Both the ADC and the DAC use AXI stream.
I plan to buy a ZCU111 and simply measure it.
The configuration for the WUT project is minimal. I want to use the 1517 package and hook up as many DDR4 channels as possible for all AWG applications.
Is there any benefit to using the Sayma AMC in such a configuration?
The initial idea is to make it as an RTM and place the SFP/QSFP, clock recovery, MMC and supply on the AMC side.