about Saturn dual CPUs |
vbt - Dec 30, 2003 |
M3d10n | Dec 30, 2003 | |||
Any evil plans for the newly found dual CPU power, VBT? |
Gallstaff | Dec 31, 2003 | |||
How the hell do you guys learn how to do this? |
slinga | Dec 31, 2003 | |||
Yeah, these guys are impressive. What really amazes me, though, is how Charles MacDonald did all of his programming WITHOUT using the SGL/SBL libraries, and he did this years ago. |
antime | Dec 31, 2003 | |||
Do you mean Charles Doty? His programs use SGL. Charles MacDonald didn't start coding for the Saturn until I had released the C version of my (libless) copperbar sample. Bart Trzynadlowski, Tyranid and others made some programs without libraries, and Azuco started on his own set of libraries. Some of the documentation (the VDP1 and VDP2 manuals, plus a few others) has been available on the net since around 1997 or so; it's just a matter of being able to read and understand it. |
vbt | Dec 31, 2003 | |||||||
For now, nothing. In fact, in my simple tests (applied to SMS Plus, after a throwaway test program) I lost speed. I registered the background function to run on the slave CPU, and while that was running the master CPU ran the sprite rendering function. Something like this:
Code:
|
antime | Jan 1, 2004 | |||
Both CPUs are connected to the rest of the Saturn using a single, shared bus. When both CPUs want to access something, one CPU gets the bus and the other has to wait until it's free, resulting in slowdown. To mitigate this, each CPU's cache can be configured as 2K of mixed cache plus 2K of RAM (the normal mode is 4K of mixed cache), and IIRC the slave CPU is configured like this by default. By working out of cache on data held in the internal RAM, external bus accesses can be minimized, which should lead to better performance. |
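To make the two cache modes a bit more concrete, here is a minimal sketch that builds a value for the SH7604 cache control register. The register address and the bit positions below are assumptions to be checked against the 7604 manual, and `ccr_value` is a hypothetical helper, not something from SGL or SBL. The actual register write is left as a comment because it only makes sense on the hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed SH7604 cache control register (CCR) layout; verify in the manual. */
#define CCR_ADDR     0xFFFFFE92u  /* assumed address of the 8-bit CCR */
#define CCR_CE       0x01u        /* cache enable */
#define CCR_TW       0x08u        /* two-way mode: 2K cache + 2K RAM */
#define CCR_CP       0x10u        /* purge all cache lines */
#define CCR_RESERVED 0x20u        /* bit 5, assumed reserved */

/* Hypothetical helper: build the CCR value for the requested mode. */
uint8_t ccr_value(int enable, int two_way, int purge)
{
    uint8_t v = 0;
    if (enable)  v |= CCR_CE;
    if (two_way) v |= CCR_TW;
    if (purge)   v |= CCR_CP;
    assert((v & CCR_RESERVED) == 0);  /* never touch the reserved bit */
    return v;
}

/* On real hardware only (not run here):
 *   *(volatile uint8_t *)CCR_ADDR = ccr_value(1, 1, 1);
 */
```

With these assumed bits, `ccr_value(1, 0, 0)` would describe the normal 4K mode and `ccr_value(1, 1, 1)` the 2K cache + 2K RAM mode with a purge. The manual also describes the order in which the cache should be disabled, purged and re-enabled when switching modes, which this sketch does not cover.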
vbt | Jan 1, 2004 | |||
OK, I'll try to use the cache. If I understood correctly, each time I only have to create a second variable that points to the source variable's address with 0x20000000 added, and it will copy the variable to the cache automatically. |
antime | Jan 1, 2004 | |||
No, that would bypass the cache entirely. When reading a data location with the top three address bits set to zero, an entire cache line (16 bytes on the Saturn's CPUs) is read into the cache (which is why hardware register accesses have to bypass the cache). The cache chapter in the 7604 manual describes how it works. It's a tricky subject and not really worth bothering with unless you suspect you actually have a performance problem because of it (such as having arranged your data so that you always get cache misses). When the cache is configured as cache+RAM, 0xc0000000 to 0xc00007ff become RAM, so copy your data there, do whatever operations you want on it, and copy it back out to wherever you want it. The code that operates on this data should be as small as possible to make effective use of the remaining cache, which means many small loops rather than one big loop and so forth. |
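The address rules and the copy-in / work / copy-out pattern described above can be sketched roughly as follows. All names here (`cache_through`, `cache_line_base`, `add_in_scratch`) are made up for illustration. The scratch buffer is passed in as a parameter so the sketch also runs on a PC; on the Saturn it would be assumed to point at 0xc0000000 with the cache in cache+RAM mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_THROUGH_BIT 0x20000000u  /* mirror that bypasses the cache */
#define CACHE_LINE_BYTES  16u          /* line size quoted in the post */
#define SCRATCH_BYTES     2048u        /* 0xc0000000..0xc00007ff in cache+RAM mode */

/* Address of the same location in the cache-through (uncached) mirror. */
uint32_t cache_through(uint32_t addr)
{
    return addr | CACHE_THROUGH_BIT;
}

/* Start of the 16-byte line that a cached read of addr would bring in. */
uint32_t cache_line_base(uint32_t addr)
{
    return addr & ~(CACHE_LINE_BYTES - 1u);
}

/* Copy data into scratch RAM one chunk at a time, run a small loop on the
 * chunk, then copy the result back out to where it came from. */
void add_in_scratch(uint8_t *scratch, size_t scratch_size,
                    uint8_t *data, size_t len, uint8_t delta)
{
    assert(scratch_size > 0 && scratch_size <= SCRATCH_BYTES);
    while (len > 0) {
        size_t n = len < scratch_size ? len : scratch_size;
        memcpy(scratch, data, n);            /* copy in */
        for (size_t i = 0; i < n; i++)       /* small loop on the local copy */
            scratch[i] = (uint8_t)(scratch[i] + delta);
        memcpy(data, scratch, n);            /* copy out */
        data += n;
        len  -= n;
    }
}
```

Note that `cache_through` only produces an address that skips the cache; it does not put anything into the cache. Whether the extra copies pay off depends on how much work is done per chunk, so this is only worth trying where a real bus-contention problem is suspected.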
ExCyber | Jan 1, 2004 | |||
Ideally, shouldn't you just disable the cache and use the full 4K for code+data? |
antime | Jan 1, 2004 | |||
Yes, you can do that as well; I forgot about that possibility. To create code that runs from that area you must play around with your link script and use GCC's section attribute to map the code and data to the right addresses. The ld manual has an example of how to create a section with different load and virtual addresses, which you can use pretty much as-is. |
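On the C side, the section attribute mentioned above might look something like the sketch below. The section names `.cram_text` and `.cram_data` and the function names are hypothetical; the link script still has to map those sections to the on-chip RAM address range with a separate load address, and startup code has to copy them there before use, along the lines of the example in the ld manual. None of that is shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical section names; the link script decides where they end up. */
#define CRAM_CODE __attribute__((section(".cram_text"), noinline))
#define CRAM_DATA __attribute__((section(".cram_data")))

#define LINE_WORDS 256u

/* Working buffer meant to live in on-chip RAM. */
CRAM_DATA static uint16_t line_buffer[LINE_WORDS];

/* Small routine meant to run from on-chip RAM: fill the buffer. */
CRAM_CODE void fill_line(uint16_t value, size_t count)
{
    assert(count <= LINE_WORDS);
    for (size_t i = 0; i < count; i++)
        line_buffer[i] = value;
}

/* Small routine meant to run from on-chip RAM: sum the buffer. */
CRAM_CODE uint32_t sum_line(size_t count)
{
    assert(count <= LINE_WORDS);
    uint32_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += line_buffer[i];
    return total;
}
```

Built with a stock link script this still compiles and runs, since the attribute only changes which output section the symbols land in. The performance benefit comes only once the link script places those sections in the on-chip RAM.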
AntiPasta | Jan 2, 2004 | ||||
That's where I pretty much gave up on the libless approach |
slinga | Jan 3, 2004 | |||
VBT: There's another sample program and some more information in Saturn Technical Bulletin #28 if you need some more code to look at. |