vdp 1 3d?
Coolgame - Jan 8, 2012
Coolgame | Jan 11, 2012
I'd like to get WinterSports Eins again; it's not on this site any more, or the link is broken. But WinterSports uses 2D sprites; its 3D is prerendered.
cafe-alpha | Jan 15, 2012
Thank you Chilly Willy, this is the first time I've seen somebody talking about my yeti3d Saturn port. And as explained above, everything related to 3D is actually done in software. The VDP1 is only used to display quads (2D shapes) on screen.
Chilly Willy | Jan 15, 2012
I've a bit of an interest in Yeti3D... I did ports to the 32X and N64, and now I'm working on ports of Yeti3D-Pro, which is the next step up from the original Yeti3D. The two main differences between the original and the Pro versions are that Pro allows for slopes, and uses models for objects and enemies, where the original is all flat and uses sprites. The other difference is Pro uses a lot more memory. That's making it tough to port to the 32X, as just the level map uses ALL the RAM in the 32X. It would do well on the Saturn, though.
cafe-alpha | Jan 21, 2012
About 3D models: there is support for displaying 3D models in the non-Pro engine too! (See the draw_entity_as_model function in draw.c.)
Memory should not be a problem on Saturn, but the increased number of quads to process because of 3D models would make it unplayable on Saturn...
Yeti3D Pro... I discovered it when I made the first public release of my Yeti Saturn adaptation ^^; At that time, I googled the spelling of the Yeti3D original author's name (in order to write readme.txt or so), and saw he had actually released the Pro engine sources some months before.
I have added some Pro features to ietx2, but it is still WIP (well, I haven't modified the source code for half a year, but let's say it is WIP anyway...). According to my changelog, I have added the following features:
- Level editor + conversion of all levels found in the Yeti3D Pro sources to Saturn.
- Better-looking visuals by bilinear-resizing textures on PC when converting level data.
- Slopes.
- Transparency. (example here...)
It is good to hear from somebody interested in Yeti3D. I plan to release my Saturn port after the S.A.T.U.R.N. contest judging and prize shipping.
Chilly Willy | Jan 23, 2012
Support, yes, but it's not used. Yeti redirects all entity drawing to the sprite code. Given that there aren't any test models in the code, I'm not sure how complete the model code is. It may still have bugs, which led him to not use it until the later versions that became the Pro version.
Well, that would depend on how many, wouldn't it?
Well, it's good to discuss things with someone actually working on the code. I did various tests with the drawing parameters on the 32X when trying to get the speed up without resorting to large amounts of assembly. One thing I noticed in the 32X code that would affect the Saturn as well is to watch the signed vs unsigned integers in places where lots of shifting occurs. Unsigned ints use inline multi-bit shift opcodes, while signed ints call a subroutine that shifts the int one bit at a time (there are no multi-bit shifts for signed ints). There were places in the code I forced a cast to unsigned because I knew at those places the values were never negative and it was critical to use inline multi-bit shifting, not a subroutine. Depending on how many bits are shifted, it would actually be better to do something like this than to use signed shifting:
Code:
You avoid the jsr/rts, and several shifts. At least if you use a constant for the shift count, the compiler has different subroutines for each constant shift count. If the shift count is unknown, it has to call a more generic routine to handle the unknown number of shifts, which makes it even slower. I got in the habit of compiling snippets of code to assembly for the SH2 to see if they needed a little help.
I ran into the same thing when making a version of Tremor for the 32X... which would probably be pretty good on the Saturn. It runs completely on the slave SH2. The 32X can handle 22kHz mono or 11kHz stereo with my current code, so the Saturn should be even better with its faster clock rate and 32-bit buses.
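Going back to the signed vs unsigned shift point above, here is a minimal sketch of the cast for values that are known to be non-negative. The helper name shr_nonneg is hypothetical and is not taken from the Yeti3D or 32X sources; whether the compiler then emits inline shift opcodes depends on the toolchain and the shift count, so it is worth checking the generated assembly.

```c
#include <assert.h>
#include <stdint.h>

/* Right-shift a signed value that the caller knows is never negative.
 * Casting to uint32_t asks the compiler for a logical shift; for
 * non-negative inputs the result equals what an arithmetic shift gives. */
int32_t shr_nonneg(int32_t x, unsigned n)
{
    assert(x >= 0 && n < 32);   /* the cast is only safe under this assumption */
    return (int32_t)((uint32_t)x >> n);
}
```

With a constant count, for example shr_nonneg(value, 8), the shift amount is visible at compile time, which is the case described above as the cheapest.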
cafe-alpha | Feb 4, 2012
Sorry for the late reply,
In my WIP Yeti3D Saturn adaptation, there is no Yeti3D Pro 3D model support yet. This is because I am porting Pro features little by little to Yeti3D GPL for Saturn, and I haven't done anything concerning 3D models so far. I will let you know if I add 3D models.
Thank you very much for the advice! I tried your optimization on the yeti_build_vis function (draw.c) and it became a little faster. In this function, f2i is used to compute an array index, so the values are never negative.
Code:
For the same 3D scene, yeti_build_vis runs at 24ms per frame using the commented-out code, and at 20ms per frame when using logical shifts. I don't know anything about assembly language, and I just discovered the sh-elf-gcc -c -g -Wa,-a,-ad command line.
Code:
Don't hesitate to share other optimization advice. Also, are your optimized sources for the 32X available for download? (If they are, I would be very interested in adding your changes to my Saturn version.)
Chilly Willy | Feb 5, 2012
It's amazing that a tiny change like a (u32) cast can make a significant improvement in speed.
One of my Tremor tests is here: http://www.mediafire.com/?9acgq3givvi8kvd... and my double-pixel Yeti demo with music and sound is here: http://www.mediafire.com/?a9y2dnhm3e9dfrc...
The Tremor-rockbox directory was just used for reference - it isn't needed for the demo. The demo uses the lowmem branch of the official Tremor with various optimizations, but it could use more on the 32X. It should actually be pretty decent on the Saturn.
The Yeti demo renders at 160x112 to a 320x112 15-bit mode display. It's a good example of how to set up the 32X to use only every other line in the display. The code has been modified to draw two pixels at once during rendering. I really need to just make the entire polygon rasterizing assembly. Anywho, drawing 160x112 really improved the performance of Yeti on the 32X. This demo also uses the Slave SH2 to mix and play MOD music with sound effects using DMA PWM audio.
Anyway, another generic optimization you may already know: weird shift lengths. The SH2 only does 1, 2, 8, and 16 bit shifts. Everything else must be done as combinations of those. However, there are times when you can be sneaky for better performance. Instead of
shlr2 r1
shlr2 r1
shlr2 r1
for a shift of 6, try this
shll2 r1
shlr8 r1
Assuming the left shift doesn't kill any significant upper bits, you save a cycle doing the left, then right shift. Most of the shifts not covered directly can be done in a similar manner to save a cycle or two.
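The left-then-right shift trick for a shift of 6 can be written out in C to check when it is valid. This is only a sketch: the function name is hypothetical, and a C compiler is not guaranteed to turn this expression into the shll2/shlr8 pair, so the real saving still has to be confirmed in the assembly output.

```c
#include <assert.h>
#include <stdint.h>

/* Logical right shift by 6 written as "left 2, then right 8", mirroring
 * the shll2/shlr8 pair. Only valid when the top two bits of x are zero,
 * because the left shift would otherwise discard them. */
uint32_t shr6_via_shl2_shr8(uint32_t x)
{
    assert((x & 0xC0000000u) == 0);   /* no significant upper bits lost */
    return (x << 2) >> 8;
}
```

The assert states the "doesn't kill any significant upper bits" condition explicitly: any value below 0x40000000 gives the same result as a plain x >> 6.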
antime | Feb 18, 2012
If you know the value is negative there are some tricks you can use, but some cycle counting may be needed to determine the fastest variant. In the general case, you can convert the operation into an unsigned shift by inverting the bits before and again afterwards, e.g.
Code:
Shifting by 16 and 24 bits can be special-cased using the sign extension instructions. It can be faster to handle shift amounts slightly larger like this as well, but that's where the cycle counting comes in.
Code:
|
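A rough C sketch of the two ideas in this post, under stated assumptions: the function names are hypothetical, and the int16_t conversion in asr16 relies on the wrap-around behaviour that GCC documents for out-of-range conversions. It shows the invert/shift/invert form for negative values and the sign-extension form for a shift of 16.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic right shift of a value known to be negative, using only a
 * logical shift: invert the bits, shift, invert again. */
int32_t asr_negative(int32_t x, unsigned n)
{
    assert(x < 0 && n < 32);
    uint32_t inverted = ~(uint32_t)x;       /* top bit is now 0 */
    return ~(int32_t)(inverted >> n);       /* invert back: top bits become 1 */
}

/* Arithmetic right shift by 16: logical shift, then sign-extend the low
 * 16 bits (the step a sign extension instruction would perform). */
int32_t asr16(int32_t x)
{
    return (int16_t)((uint32_t)x >> 16);    /* assumes GCC-style wrapping */
}
```

For example, asr_negative(-256, 4) inverts to 255, shifts to 15, and inverts back to -16, which matches an arithmetic shift.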
Chilly Willy | Feb 18, 2012
Yeah, good points. I've done the logical shift/sign extend trick as well, just forgot to include that in the list. I haven't done the not/shift/not trick... I'll have to remember that one.
Coolgame | Feb 25, 2012
Sorry for taking so long, but I'd like to say that about two weeks after your first response I did some research and found y'all were right: the VDP1 is a 2D chip. I was a bit too busy at the time to respond. I also have a PlayStation One development manual to compare the two consoles. I'd like to say thank you all for your help and the added knowledge! I learned more about the console thanks to your support (everybody). You all are a big help.
cafe-alpha | Jun 6, 2012
(bump) Recently, I tried to speed up the yeti3d code a little more, so I'm adding a post to this topic. As the code inside the loops of the yeti_build_vis function is executed around 5000~8000 times per frame, I focused optimization on this function only. The "low risk, high return" optimization is to modify the f2i macro to use logical shifts.
Code:
(*) The optimized f2i needs a direct reference to a local variable in order to actually be faster.
Example: tmp = x+y; z = f2i(tmp); instead of z = f2i(x+y);
Another optimization was the most effective and actually the simplest: yeti_build_vis heavily uses the CELL_IS_OPAQUE macro:
Code:
Instead of performing ands, nots, etc. thousands of times on every frame, I compute the opaque attribute on startup, and update it only when needed.
Code:
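A minimal sketch of this kind of caching, assuming a hypothetical cell layout: the struct fields, flag bits, map size and the opacity test itself are all invented for illustration and are not the actual Yeti3D definitions. Only the macro name CELL_IS_OPAQUE comes from the post.

```c
#include <assert.h>
#include <stdint.h>

#define MAP_SIZE    64      /* hypothetical map dimension */
#define FLAG_SOLID  0x01u   /* hypothetical flag bits */
#define FLAG_WINDOW 0x02u

typedef struct {
    uint8_t flags;          /* hypothetical per-cell state */
    int16_t top, bot;       /* hypothetical ceiling and floor heights */
    uint8_t opaque;         /* cached result of compute_opaque() */
} cell_t;

cell_t map[MAP_SIZE][MAP_SIZE];

cell_t *cell_at(int x, int y)
{
    assert(x >= 0 && x < MAP_SIZE && y >= 0 && y < MAP_SIZE);
    return &map[y][x];
}

/* Stand-in for the test that used to run inside the visibility loop. */
int compute_opaque(const cell_t *c)
{
    return ((c->flags & FLAG_SOLID) && !(c->flags & FLAG_WINDOW))
        || c->top <= c->bot;
}

/* Call whenever a cell's flags or heights change. */
void cell_update_opaque(cell_t *c)
{
    c->opaque = (uint8_t)compute_opaque(c);
}

/* Call once at startup or level load. */
void map_init_opaque(void)
{
    for (int y = 0; y < MAP_SIZE; y++)
        for (int x = 0; x < MAP_SIZE; x++)
            cell_update_opaque(&map[y][x]);
}

/* The hot loop now reads one byte instead of re-evaluating the test. */
#define CELL_IS_OPAQUE(c) ((c)->opaque)
```

The trade-off is one extra byte per cell plus the discipline of calling the update function from every place that edits a cell.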
Also, outside of the yeti_build_vis function, I optimized the vertex_project function, which is called around 500 times per frame.
-> 2 additional reciprocal tables are added: reciprocal*WIDTH and reciprocal*HEIGHT, which save one multiplication each.
-> At the end of the computation, the final >>9 shift has been changed to a logical >>8 shift, so that only one "shlr8" instruction is used. (The remaining >>1 shift is folded into the reciprocal tables, and doesn't affect projection accuracy.)
After the optimizations above, speed was around 7~10 FPS, but due to a VDP1 issue (the display flickers on real hardware), I had to limit speed to 3~5 FPS. Hence, there are still a lot of things to investigate.
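The reciprocal*WIDTH table idea can be sketched as follows. Everything here is an assumption made for illustration: the screen width, depth range, fixed-point precision and function names are hypothetical, and the real vertex_project works on signed coordinates with its own scaling. The sketch only shows how pre-multiplying the table removes one multiplication per vertex without changing the result.

```c
#include <assert.h>
#include <stdint.h>

#define SCREEN_W    320u    /* hypothetical screen width */
#define Z_MAX       1024    /* hypothetical depth range */
#define RECIP_SHIFT 12      /* hypothetical fixed-point precision */

uint32_t recip[Z_MAX];      /* (1 << RECIP_SHIFT) / z */
uint32_t recip_w[Z_MAX];    /* the same value pre-multiplied by SCREEN_W */

void init_recip_tables(void)
{
    recip[0] = recip_w[0] = 0;          /* z == 0 is never projected */
    for (uint32_t z = 1; z < Z_MAX; z++) {
        recip[z]   = (1u << RECIP_SHIFT) / z;
        recip_w[z] = recip[z] * SCREEN_W;
    }
}

/* Two multiplications per vertex. */
uint32_t project_x_plain(uint32_t x, uint32_t z)
{
    assert(x <= 1024 && z >= 1 && z < Z_MAX);   /* keeps products in 32 bits */
    return (x * recip[z] * SCREEN_W) >> RECIP_SHIFT;
}

/* One multiplication per vertex: the width is already in the table. */
uint32_t project_x_table(uint32_t x, uint32_t z)
{
    assert(x <= 1024 && z >= 1 && z < Z_MAX);
    return (x * recip_w[z]) >> RECIP_SHIFT;
}
```

A HEIGHT table would follow the same pattern, and a constant shift such as the >>1 mentioned above could be folded into the table entries when they are built.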
Chilly Willy | Jun 6, 2012
I assume that's 320x200-ish and not using the warped sprite drawing? If so, that's pretty good. You really need to redo the draw poly as assembly to really do better. I plan to do that on the 32X version for better speed.
cafe-alpha | Jun 9, 2012
Well, the polygon drawing is performed by the VDP1, not the SH2. I took some video on real hardware so that you can get an idea of what my Yeti3D Pro adaptation looks like: ietx3 wacked level... ietx3 church level... ietx3 church level...
television2000 | Jun 9, 2012
I was gonna ask you Cafe to show us a video before I saw this lol. Excellent work. I presume you aren't using SBL or SGL in Yeti. Right?
cafe-alpha | Jun 9, 2012
I don't use SGL. The sources used as the base for my Yeti3D Saturn adaptation are Charles MacDonald's vdp1ex example program.... However, I use some SGL sources, especially for CD-ROM access. The project can be compiled from sources only, without the need for Sega's precompiled libraries.