Hmm, well here's a new plate of what the fuck. I rewired the debugger to support burst memory reads, and now writes don't work any more.
-
hmm, interesting. I synthed a design that writes to some address on boot, within the FPGA, and a readout says the write... well, either didn't happen or isn't being read back correctly. Grmbl.
-
Ooh, but a byte read not using the swanky burst thing _does_ see the value.
So, burst reads are fucked somehow. Okay, that's something I can work with a little bit more, at least. It all happens as interlocking FIFOs passing bytes around, so I can't figure out how it'd give valid data for 4 bytes and then give up on life, but... this is something at least.
-
hmkay well, the datapath that handles the response byte is the same between the burst read and write paths, so that leaves the issue path for burst as the suspect: unlike the other commands, it switches to a separate state where it pumps out read requests as fast as the FIFOs will allow to satisfy the range request.
Well, this might be the first time I've tried driving the real RAM primitive at one operation per cycle, so maybe this is just revealing existing breakage...
-
Okay well easy theory to test, I can insert delays between the issues, and if the burst read starts working, well then.
-
Well, burst slowed down to be fully synchronous, waiting for each response before issuing another read. Good news/bad news. Burst reads still return all zero, but the RAM is definitely not all zero because single-byte reads still see the value that gets poked at boot. I issue a single-byte read to check it between every burst read, and after all the burst reads have finished destroying my dreams by returning all zero, the byte read goes "yup, value's still there mate". Argh.
-
Okay, well, what could be wrong here... Every burst request I read back 128 bytes, and the single-byte read shows that the debugger is sending exactly 128 bytes, as I requested... So the part of the debugger that manages how many reads to issue and how many bytes to push back out before it returns to idle is clearly working.
For whatever reason, the burst successfully reads the bottom 4 bytes of vram before going off the rails. That's 32 bits, oddly the same size as the debugger request...
-
okay time to do what I should have done a while ago now, and just reimplement the client-side reading with both bursts and non-bursts, so I can compare the two and switch back and forth.
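A rough sketch of what that comparison could look like on the client side. Everything here is made up for illustration: `diff_reads`, `FakeDebugger` and its methods are hypothetical names, not the real debugger client, and the fake device just mimics the symptom seen so far (burst returns the first 4 bytes, then zeros) so the diffing logic can run without hardware.

```python
from typing import Callable, List, Tuple


def diff_reads(read_byte: Callable[[int], int],
               read_burst: Callable[[int, int], bytes],
               base: int, length: int) -> List[Tuple[int, int, int]]:
    """Return (address, single_byte_value, burst_value) for every mismatch."""
    burst = read_burst(base, length)
    if len(burst) != length:
        raise ValueError(f"burst returned {len(burst)} bytes, wanted {length}")
    mismatches = []
    for offset in range(length):
        single = read_byte(base + offset)
        if single != burst[offset]:
            mismatches.append((base + offset, single, burst[offset]))
    return mismatches


class FakeDebugger:
    """Hypothetical stand-in for the serial link; its burst path is broken on purpose."""

    def __init__(self, memory: bytes, good_burst_bytes: int = 4):
        self.memory = memory
        self.good_burst_bytes = good_burst_bytes

    def read_byte(self, addr: int) -> int:
        return self.memory[addr]

    def read_burst(self, addr: int, length: int) -> bytes:
        data = bytearray(self.memory[addr:addr + length])
        # Mimic the observed symptom: everything past the first few bytes reads as zero.
        for i in range(self.good_burst_bytes, len(data)):
            data[i] = 0
        return bytes(data)


# Seven non-zero bytes at the bottom of memory, zeros above, 128 bytes total.
dev = FakeDebugger(bytes([0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77]) + bytes(121))
bad = diff_reads(dev.read_byte, dev.read_burst, 0, 128)
for addr, single, burst in bad:
    print(f"0x{addr:04x}: single=0x{single:02x} burst=0x{burst:02x}")
```

Swapping the two callables for the real single-byte and burst read functions would give the same side-by-side diff against the actual board.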
-
And as expected, I can toggle between burst and non-burst in the client, and non-burst shows me the full content of memory, and burst doesn't.
what. is. happening.
Okay at this point I've decided I'm suspicious of this serial port library I've been using. Let's see what happens if I make it read a single byte at a time in burst reads...
-
Well it was a long shot, and indeed it changes nothing.
Okay, well, new theory: because this FPGA board doesn't wire up the computer->fpga flow control line, I just disabled hardware flow control entirely... But what if the FTDI chip or linux is telling the FPGA to stop sending, and I'm just lolblasting the bytes into the void instead? That would imply a 4-byte buffer somewhere in the chain, which is hilariously tiny, but who knows...
-
Aww, I really liked that hypothesis, but nope, the hardware flow control stays ready and happy the whole time the burst reads are returning nonsense
-
I mean at this point there's just not that much that I'm doing differently between the single byte read and the burst read. You'd think that would make the fault trivial to identify, and yet.
-
@danderson I've been trying to get Bluetooth working right on a small project of mine pretty much all this week, in the time I have after work. The implacability of malfunctioning hardware is something else.
Good luck. We both need it.
-
@ddr Oof yeah in the context of fighting bluetooth, I feel like I'm on easy mode all of a sudden! Although I'm just about at the point where I'm going to put scope probes on this fucker and look at the serial lines, because god is dead and I don't trust anything any more.
-
Arright well giving up for sleep soon, but before that, the board's been transported to the hardware desk, and it's getting scope probes placed onto its tiny mind because for my own sanity I need to look at the serial traffic.
Aaand the board doesn't break out the serial lines to test pads. Fucksake.
-
Arright well thanks to the magic of FPGAs I mirrored the TX line out to a probeable location, and got me some hard evidence that yup, something's fucky in the gateware.
With the VRAM loaded with seven non-zero values, the slow byte-at-a-time transmission sees all seven before the memory readout goes to zeros. With the burst transmitter, well for one the transmission is a lot faster for sure, but I only get five of the bytes before the transmission reverts to zeros.
-
It's frustrating that I can't seem to reproduce this in sim, but I suspect I may need to wire up more modules together to find where it goes wrong.
Sleep now, but as a brute force option I'm also considering getting the logic analyzer wired up and just shoving parts of this dataflow out onto a parallel bus and recording that. I'm quite sure that's a less intelligent approach than writing integration tests, but...
-
hmkay new theory to test tomorrow, maybe I mucked up the flow control on VRAM requests, and the VRAM is allowing operations to start when the output FIFO isn't clear, and so outputs end up overwriting each other at the output. That might also explain why, when I slowed down the burst mode, it managed to get one more valid byte out before losing its mind again: it takes a few cycles for the FIFO chain to fill, and when it hits the UART and has to slow down to 115kbps, the vram fails to stall.
-
I don't think any of my unit tests tested feeding the vram on every cycle for many cycles, while also consuming the output slowly. That's probably a nice self-contained test to have regardless, and it'll also reveal that this is my problem.
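Something like the following toy model shows the shape of that test. It's plain Python, not the actual gateware or its test harness, and every name and number in it (`simulate`, the latency, the FIFO depth, the consume rate) is an assumption picked for illustration. It issues a read every cycle, drains the output slowly, and compares what happens when the issue side does or doesn't respect the output FIFO's free space.

```python
from collections import deque


def simulate(n_reads, latency=2, fifo_depth=4, consume_every=8,
             honor_backpressure=True):
    """Cycle-level toy model: fast issuer, fixed-latency RAM, slow consumer."""
    memory = [addr + 1 for addr in range(n_reads)]  # distinct non-zero values
    in_flight = deque()   # (cycle the read completes, value)
    out_fifo = deque()
    received = []
    lost = 0
    next_addr = 0
    cycle = 0
    while next_addr < n_reads or in_flight or out_fifo:
        # Slow consumer (think: UART) pops one entry every consume_every cycles.
        if cycle % consume_every == 0 and out_fifo:
            received.append(out_fifo.popleft())
        # A read completing this cycle lands in the FIFO, or is lost if it is full.
        if in_flight and in_flight[0][0] == cycle:
            _, value = in_flight.popleft()
            if len(out_fifo) < fifo_depth:
                out_fifo.append(value)
            else:
                lost += 1
        # Issue side: a correct issuer only starts a read when there is
        # guaranteed room for its result; a broken one issues every cycle.
        room = len(out_fifo) + len(in_flight) < fifo_depth
        if next_addr < n_reads and (room or not honor_backpressure):
            in_flight.append((cycle + latency, memory[next_addr]))
            next_addr += 1
        cycle += 1
    return received, lost


good, good_lost = simulate(16, honor_backpressure=True)
broken, broken_lost = simulate(16, honor_backpressure=False)
print("with backpressure:   ", good, "lost", good_lost)
print("without backpressure:", broken, "lost", broken_lost)
```

In this model the broken variant still delivers the first few values intact, because the FIFO has to fill before anything gets dropped, which at least matches the "a handful of good bytes, then garbage" pattern.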
-
... oh, er, hmm, I think I see it. The outer RAM constructs do flow control properly, but the inner ones use non-blocking delay lines and expect the outer module to latch the output if it's not ready to consume it on the right cycle.
So I think when the ram backs up, those bits where I tried to be fancy by not using flow-controlled FIFOs end up throwing away the routing information for which of the 56 chips is presenting the desired output value, and so I end up consuming zeros later.
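A minimal Python sketch of that failure mode, under heavy assumptions: `DelayLine` and `sample_read` are invented names, the chip index and latency are arbitrary, and the real RAM wrapper surely looks different. The point is only that a delay line with no stall input presents the chip-select for exactly one cycle, so a consumer that looks late sees an idle mux (zeros) unless something latched the output.

```python
from collections import deque


class DelayLine:
    """Non-blocking delay line: shifts every cycle, has no stall input."""

    def __init__(self, depth):
        self.slots = deque([None] * depth, maxlen=depth)

    def step(self, value):
        out = self.slots[0]
        self.slots.append(value)  # maxlen silently discards the oldest entry
        return out


def sample_read(chip_outputs, chip, latency, sample_delay, latch_output):
    """Issue one read at cycle 0, sample the result sample_delay cycles late."""
    select_line = DelayLine(latency)
    latched = 0
    for cycle in range(latency + sample_delay + 1):
        # The routing info (which chip to listen to) rides the delay line.
        select = select_line.step(chip if cycle == 0 else None)
        # With no valid select, the output mux presents zero.
        mux_out = chip_outputs[select] if select is not None else 0
        if latch_output and select is not None:
            latched = mux_out  # outer module holds the value until consumed
        if cycle == latency + sample_delay:
            return latched if latch_output else mux_out


chips = [0] * 56          # 56 chips, as in the design described above
chips[17] = 0xAB          # arbitrary chip holding the value we want

print(hex(sample_read(chips, 17, latency=2, sample_delay=0, latch_output=False)))
print(hex(sample_read(chips, 17, latency=2, sample_delay=3, latch_output=False)))
print(hex(sample_read(chips, 17, latency=2, sample_delay=3, latch_output=True)))
```

Sampling on the right cycle works, sampling late without a latch reads zero, and latching the output when the select is valid recovers the value. A flow-controlled FIFO for the select would be the other way to get the same guarantee.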
-
tbh serves me right for trying to be clever. "I'll use non-blocking delay lines, I can save some enable signals and control logic, I'm clever enough to not fuck it up" he said, while fucking it up.
Ah well, tomorrow quick test case to prove the point and then should be fixable.