Reading from the framebuffer? |
XL2 - Aug 23, 2018 |
ndiddy | Sep 2, 2018 | |||
I hope you don't get disheartened, the stuff you've been doing with Sonic Z-Treme is some of the most impressive homebrew I've seen for the Saturn (or most consoles for that matter). |
David Gámiz Jiménez | Sep 17, 2018 | |||
@XL2... Is that a render-to-texture of the VDP1 framebuffer in this screenshot? Is it like the MegaTV in Destruction Derby 1 on PS1? Is it possible to do "animated" render-to-texture on the Saturn at a good speed, or with an acceptable penalty? What do you think? I think it is only practical with VDP1: render-to-texture of the whole final VDP2 output is possible, but I don't think it is usable. Very, very slow. |
XL2 | Sep 17, 2018 | |||||
It reads the framebuffer, skipping some pixels, and sends them to work RAM; the result is then transferred to VRAM by SCU indirect DMA during v-blank and displayed as a sprite. It's updated each frame, so you get a picture in picture in picture. VDP2 would be the same if you use a bitmap layer (NBG0, NBG1 or RBG0) at 16 bpp. You can zoom 4 times I think, so a small image (like 88x64) could fill the screen, but yes, reading the framebuffer is quite slow. You also have the problem of palette code pixels not working on an RGB layer/sprite and vice versa, so to do it like Destruction Derby on PS1 you would lose the background and RBG0 layers unless you process the image a bit more. |
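A minimal sketch of the "read the framebuffer skipping some pixels" step, written as plain C so it can be tried off-target. The function name `fb_subsample`, its parameters, and the idea of passing the line stride in pixels are assumptions made for illustration, not XL2's actual code; on hardware `fb` would point at the VDP1 framebuffer and `dst` at a work RAM buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy every `step`-th pixel of every `step`-th line
 * from a 16 bpp framebuffer into a tightly packed destination buffer.
 * fb_stride_px is the width of one framebuffer line in pixels (assumption:
 * lines are stored contiguously with a fixed stride).
 * Returns the number of pixels written to dst. */
static size_t fb_subsample(const uint16_t *fb, size_t fb_stride_px,
                           size_t src_w, size_t src_h, size_t step,
                           uint16_t *dst)
{
    size_t out = 0;

    assert(fb != NULL && dst != NULL);
    assert(step > 0 && src_w <= fb_stride_px);

    for (size_t y = 0; y < src_h; y += step) {
        for (size_t x = 0; x < src_w; x += step) {
            dst[out++] = fb[y * fb_stride_px + x];
        }
    }
    return out;
}
```

With `step` set to 2, a 176x128 source region would yield an 88x64 image, the size mentioned above as small enough to be zoomed back up to fill the screen.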
David Gámiz Jiménez | Sep 20, 2018 | |||||
What you say here is very interesting, XL2. Sorry for the delay in answering you. A few "conclusions" I draw from your answer:
1) It can be used to achieve the Destruction Derby effect or something similar.
2) It can be used to get one part of the Burning Rangers "transparency layer" effect.
3) When you say that reading the framebuffer is slow, do you mean what you do now, that is, reading the VDP1 framebuffer, or the final framebuffer of VDP2? I understand that both are slow, the VDP1 one less so than the VDP2 one. But could it be used without too big a penalty, by balancing the resolution and FPS of the framebuffer reads?
4) The problem you mention about capturing the VDP2 foreground is clear. Since the capture happens at the "instant" before the image goes to VDP2, we would not have the "background" information, and the palette or MSB On pixels meant for VDP2 would look "weird". I thought about this a long time ago. As a solution, the function that reads the framebuffer could draw a predefined "background color", equal to or representative of the "real" VDP2 background, wherever it finds a palette/MSB On pixel or a transparent pixel. It clearly will not be the "real background", but it will be better than a black spot or strange colors.
Regards, |
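The substitution idea in point 4 could be sketched roughly as below. This assumes the mixed RGB/palette framebuffer format in which bit 15 set marks an RGB pixel and bit 15 clear marks a palette code or a transparent (all-zero) pixel; `fill_non_rgb` and `RGB_FLAG` are hypothetical names, and how MSB On pixels should be told apart from ordinary RGB pixels is left open here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: bit 15 set means "this 16-bit value is RGB data". */
#define RGB_FLAG 0x8000u

/* Hypothetical post-processing pass over a captured buffer: every pixel
 * that is not flagged as RGB (palette code or transparent) is replaced by
 * one predefined background color, stored with the RGB flag set. */
static void fill_non_rgb(uint16_t *pixels, size_t count, uint16_t bg_rgb)
{
    assert(pixels != NULL);

    for (size_t i = 0; i < count; i++) {
        if ((pixels[i] & RGB_FLAG) == 0) {
            pixels[i] = (uint16_t)(bg_rgb | RGB_FLAG);
        }
    }
}
```

The pass touches each captured pixel once, so its cost scales with the size of the sampled image rather than with the full framebuffer.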
XL2 | Nov 28, 2018 | |||
I tried fetching the framebuffer using SCU DMA, skipping lines, but it didn't work. It did work in SSF, but Nova tells me it's an illegal operation to SCU DMA from the frame buffer (0x25C80000), and it just doesn't work in Yabause. It is my understanding that 0x25C80000 is frame buffer 0 and that it should switch every frame between 0 and 1, but I have no idea which register to look at to find the current back buffer. Does anyone know how I should proceed? SCU indirect DMA can skip some pixels (4 bytes, I think), but according to the technical documentation it doesn't work with the B-bus. DMA transferring the whole framebuffer every frame is too crazy (256 KB), and there is also the issue of byte alignment if it is used with NBG layers. The idea would be to send it either to an NBG0/1 layer, to VDP1 RAM or to an h-ram buffer (I'm mostly toying with it, with no clear intention at the moment). Any ideas? On a side note, is it possible to fetch the complete image (VDP1+VDP2) sent to the TV? |
mrkotfw | Nov 28, 2018 | |||
That's odd. From what I've seen in the SCU restrictions, VDP2 VRAM cannot be read via SCU-DMA. One thing to note is that you cannot read while VDP1 is drawing. To know whether VDP1 is finished drawing, use the SPRITE END IRQ, or poll the EDSR register. Then use PTMR to force stop drawing. What does hardware say? Mednafen? I'd also look at what Burning Rangers did... for sure they're doing what you're intending, which is copying a sampled FB copy to VDP2 VRAM. As for your last question, I would love to know. I doubt it, but there's something like EXBG on the VDP2? It would've been amazing to have direct access to the final output. |
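The "poll the EDSR register" suggestion might look something like the sketch below. The bit position used for the draw-end flag (`EDSR_CEF`, bit 1) is an assumption that should be checked against the VDP1 manual, and the helper name `vdp1_wait_draw_end` is hypothetical. The register is passed in as a pointer so the logic can be exercised without hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: bit 1 of EDSR is the "end of drawing reached" flag.
 * Verify against the VDP1 manual before relying on it. */
#define EDSR_CEF 0x0002u

/* Hypothetical bounded polling loop. On hardware, `edsr` would point at
 * the memory-mapped EDSR register; here it can be any 16-bit variable.
 * Returns 1 if the flag was seen within max_polls reads, 0 otherwise. */
static int vdp1_wait_draw_end(volatile const uint16_t *edsr,
                              unsigned max_polls)
{
    assert(edsr != NULL);

    for (unsigned i = 0; i < max_polls; i++) {
        if ((*edsr & EDSR_CEF) != 0) {
            return 1;
        }
    }
    return 0;
}
```

Bounding the loop is a design choice for the sketch: it avoids hanging forever if drawing never finishes, at the cost of the caller having to handle the timeout case.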
David Gámiz Jiménez | Nov 28, 2018 | |||
@mrkotfw... Rayman does a render-to-texture of VDP1+VDP2 in the fade animation before and after the 3D loading screen. If you know how to research what it does, that would be great. It seems very slow, because seconds pass before the animation is done. Thank you! |
mrkotfw | Nov 28, 2018 | |||||
Do you have a YouTube video on this? |
antime | Nov 28, 2018 | |||
The physical address of VDP1 VRAM in the memory map is 0x5C00000. 0x25C00000 is the CPU's cache-bypassing alias. Edit: See section 8.3, "Address Space and the Cache" of the SH7604 manual. |
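As a small illustration of the aliasing described above: on the SH-2 the three uppermost address bits select the cache behaviour rather than a different location, so the physical address can be recovered by masking them off. The helper names below are hypothetical, and the `0x20000000` cache-through offset is inferred from the two addresses quoted in this post.

```c
#include <assert.h>
#include <stdint.h>

/* The low 29 bits are the physical address; the top 3 bits are the
 * CPU-internal cache area selector. */
#define SH2_PHYS_MASK     0x1FFFFFFFu
/* Assumption drawn from the addresses above: 0x20000000 selects the
 * cache-bypassing (cache-through) alias. */
#define SH2_CACHE_THROUGH 0x20000000u

/* Strip the CPU-internal area bits to get the address seen on the bus. */
static uint32_t sh2_to_physical(uint32_t cpu_addr)
{
    return cpu_addr & SH2_PHYS_MASK;
}

/* Build the cache-bypassing CPU alias of a physical address. */
static uint32_t sh2_cache_through(uint32_t phys_addr)
{
    assert((phys_addr & ~SH2_PHYS_MASK) == 0);
    return phys_addr | SH2_CACHE_THROUGH;
}
```

For example, `sh2_to_physical(0x25C00000)` gives `0x05C00000`, the physical VDP1 VRAM address quoted above.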
David Gámiz Jiménez | Nov 28, 2018 | |||||||||
|
XL2 | Nov 28, 2018 | |||||
Thanks, but SGL also defines the sprite VRAM base address that way: #define SpriteVRAM 0x25c00000. In fact, changing it didn't change anything. It seems like it's an emulator issue: Yabause doesn't support it and Nova says SCU DMA reading from "VDP2 RAM" is illegal, but the author of Nova, Steve Kwok, told me it will be fixed in 0.5. Yabasanshiro does support the operation, and so does SSF. It allows me to read a part of the screen and send it to a work RAM buffer, before sending it to VRAM after v-blank out. I'll have to test on real hardware to see if it behaves correctly. What I don't get is that choosing frame buffer 0 (or 1) all the time has no impact. Is it because the system just redirects the DMA access to the other buffer? Or is it just going to crash on real hardware? (In the images, the size of the sprite is 176x80.) |
antime | Nov 29, 2018 | ||||||||
That macro is defined for code running on the CPU. The CPU must bypass the cache when accessing peripherals to avoid bad side effects (e.g. accessing forbidden addresses, returning stale data). Page 4 of the VDP1 manual clearly states
It's of course possible that the rest of the hardware doesn't fully decode addresses, and ignores the 3 uppermost bits, but it definitely doesn't know about any CPU-internal addressing schemes. IIRC there's an errata or technical note that says you can't use SCU DMA to read from VDP2 memory. It doesn't say what the effect is if you try, though. There's also some limits on access widths to keep in mind. |
antime | Nov 29, 2018 | |||
SCU errata 1 says "Write to the A-Bus by the SCU-DMA is prohibited", and 2 says "Read from the VDP2 area by SCU-DMA is prohibited". The document doesn't detail what happens if you break these restrictions. |
XL2 | Nov 29, 2018 | |||||
Thanks, I'll use 0x5C80000 then. So, it is "normal" that using framebuffer 0 (0x25C80000) all the time works even if it's supposed to be the displayed buffer? About the timings: right now I SCU direct DMA the framebuffer (0) to work RAM after v-blank out (slSynch). I have a loop, like y=0; y<80; y++, that SCU DMAs 172 pixels for each scanline, for a total size of 172*80*sizeof(Uint16). I then transfer that buffer to VRAM (both as a VDP1 sprite and as an NBG0 layer; both are working in emulators). I think SGL stacks the transfers, so there is no need to wait, but all this means I'm 2 frames "late" as I understand it. I will have to find a way to start drawing, interrupt, transfer the buffer and then restart drawing on the same frame buffer to try that Burning Rangers effect. It runs fast in emulators (no noticeable slowdowns), but the DMA is much slower on real hardware, so what I'm doing might not work at all. Any suggestions to speed it up? |
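The per-scanline copy loop described above could be sketched as follows, with `memcpy` standing in for the SCU DMA call (the actual SGL or SCU call is not shown in the thread, so none is named here). The 512-pixel line stride is an assumption about the framebuffer layout, and `copy_fb_region` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    FB_STRIDE_PX = 512, /* assumption: 512 pixels per framebuffer line */
    COPY_W       = 172, /* pixels copied per scanline, as in the post  */
    COPY_H       = 80   /* number of scanlines copied, as in the post  */
};

/* Copy a COPY_W x COPY_H block from the top-left of a 16 bpp framebuffer
 * into a tightly packed buffer, one scanline per transfer. memcpy stands
 * in for one SCU DMA transfer per line.
 * Returns the total number of bytes copied. */
static size_t copy_fb_region(const uint16_t *fb, uint16_t *dst)
{
    assert(fb != NULL && dst != NULL);

    for (size_t y = 0; y < COPY_H; y++) {
        memcpy(&dst[y * COPY_W], &fb[y * FB_STRIDE_PX],
               COPY_W * sizeof(uint16_t));
    }
    return (size_t)COPY_W * COPY_H * sizeof(uint16_t);
}
```

The total works out to 172 * 80 * 2 = 27,520 bytes per frame, compared with the 256 KB of a whole framebuffer.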
antime | Nov 29, 2018 | |||||
Yes. You can only access the back buffer, and it's always mapped to the same address. See section 2.1 of the VDP1 manual. |
XL2 | Nov 29, 2018 | |||||
Ok, thanks! I guess I misunderstood what it did, I thought it swapped buffers, so like FB 1 would become the back frame buffer. One less problem to worry about... |
XL2 | Dec 1, 2018 | |||
So it's working fine on real hardware, and it seems to have very little impact on the framerate (no slowdowns). I doubled the width, which is why it's weird and looks like it's ghosting. I haven't implemented the full Burning Rangers technique yet, but I have a pretty clear idea of how I will pull it off and it shouldn't be too hard (except for the overdraw - transparent objects over opaque objects - but that's where my BSP engine will be useful once I port it). I won't swap the frame buffers, I will just write to an unoccupied area (from x:352 to x:511), and just copy and scale it by 2.2. I copy the framebuffer (176x112, but it will be 160x112) to work RAM, then DMA it during blanking to the NBG0 layer. |
mrkotfw | Dec 2, 2018 | |||
Are you copying a non-paletted render? When it comes to rendering off screen, do you set the system clipping (or user clipping) command before rendering? |
XL2 | Dec 3, 2018 | |||||
The frame buffer is 16 bits, so I just use RGB codes (including CLUT). The system clipping must be set wider and it must be the first draw command, else it doesn't work. That's one (small) issue I'm facing with SGL, as I need to get around its restrictions: it doesn't let me change the system clipping or the polygon for clearing the frame buffer, so I need to overwrite some draw commands. I will need to find where it keeps those in work RAM. And of course, you also need user clipping commands to prevent drawing at the wrong place on screen. |