SegaXtreme


Translating Culdcept: tutorials, notes, whatever

benclaff - Apr 7, 2024

Hi everyone,

Following the acquisition of the original disc, I explored the content of the card/board-RPG game "Culdcept" a bit and found that it "might be" relatively easy and straightforward to translate into English. In this thread I will post updates about my findings, and also call for help or advice when I'm stuck. Ultimately, my aim is to translate the game.

On the other hand, I really appreciated other segaxtreme threads which documented, step by step, the approach taken for other translations.
This was very helpful for exploring my 1st Saturn game and might help other future enthusiasts.
So I decided to make my posts a small tutorial, when possible.

And I found that this approach (forum posts) is much much better than having knowledge spread in discord conversations (finding some information a few weeks later is a nightmare...)

Current situation is:

Confirmed:

- main script is in shift-JIS and can be edited relatively easily
- a full translation exists for the DS port AND is very well documented here : Culdcept DS translation wiki...
- there are minor modifications between both scripts, but not more than a few kanji here and there

In progress :

- multiplayer mode will need specific re-translation: it is wifi-based in the DS version, while it is local 4 player in the saturn version and there are quite a few differences
- it seems that both fixed and variable width fonts are present in the game (needs confirmation, more on this later)
- on top of the shift-JIS font used for dialog windows, there are some smaller accessory fonts. They do not seem to match any classic font, but I may be wrong. Values and offsets need to be determined.

Will need help at some point :

- how to modify the text routine to switch between variable width and fixed width, depending on context (e.g. dialogs VS menus)

So, after this short introduction, let's dive into the content.

-------------------
--------------------
-------------------

###################
###################
PART 1 : Discovery of text encoding
###################
###################

Game version used in tutorials :

REDUMP lists 2 versions:

v1.004 redump.org • Culdcept • ...
v2.000 redump.org • Culdcept • ...

My disc matches v1.004, any offset listed below is valid for this particular version.
I had a look at the track files, but there is only 1 small binary file of a few megabytes and 2 large bin files.

So I went back to work on the full track dump.
All offsets below are valid for track_1.bin, sha-1 == 82003c2bf26d23f8824f8934ce5ca9ae403f0043
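
Since every offset in this tutorial depends on the exact dump, it can be worth checking the hash before editing anything. A minimal sketch, assuming the file is named track_1.bin as above (the function names are just for illustration):

```python
import hashlib

# SHA-1 quoted in the post for track_1.bin (v1.004)
EXPECTED_SHA1 = "82003c2bf26d23f8824f8934ce5ca9ae403f0043"


def sha1_of_file(path, chunk_size=1 << 20):
    """Hash the file in chunks so a large track dump is never fully in RAM."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def is_expected_dump(path):
    """True if the file matches the dump used for the offsets in this thread."""
    return sha1_of_file(path) == EXPECTED_SHA1


if __name__ == "__main__":
    print(is_expected_dump("track_1.bin"))
```

If this prints False, the offsets listed below may simply not apply to your dump.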

If anyone has information about any potential differences between these versions, please let me know.

Tools:

windows calculator (for octal/hex conversions and shifts)
notepad++ : for my markdown notes
crystaltile & tilemolester : tile search, font search, texture searches
wxMEdit : direct editing of the main scenario text, pattern or text searches...
Mednafen and Yaba Sanshiro emulators : VDP1/2 and CPU RAM exploration via debuggers, creation of savestates, testing of modifications...
Hex to String Converter Online - DenCode... : for rapid hex / shift-JIS conversions

Dialogs font & text replacement :

The shift-JIS font is present in : [ 1DE64,36404 )

HOWTO :

open track_1.bin with crystaltile
click on "tile" icon in the top bar
on left menu, select width=16, height = 12, tile form = solid 1bpp
set the offset to 1DE64 to see the shift-JIS font
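
The same view can be approximated in a script. This is only a sketch: the range and the 16x12 1bpp tile size come from the steps above, while the row layout (2 bytes per row, most significant bit = leftmost pixel) is my assumption and the function names are hypothetical.

```python
FONT_START = 0x1DE64   # inclusive, from the post
FONT_END = 0x36404     # exclusive, from the post
GLYPH_W, GLYPH_H = 16, 12
BYTES_PER_GLYPH = GLYPH_W * GLYPH_H // 8  # 24 bytes at 1 bit per pixel


def glyph_count():
    """Number of 16x12 1bpp glyphs that fit in the font range."""
    return (FONT_END - FONT_START) // BYTES_PER_GLYPH


def load_font(path="track_1.bin"):
    """Read the raw font bytes from the track dump."""
    with open(path, "rb") as f:
        f.seek(FONT_START)
        return f.read(FONT_END - FONT_START)


def render_glyph(font_bytes, index):
    """Return one glyph as 12 strings of 16 chars ('#' = set pixel).

    Assumes each row is 2 bytes, MSB = leftmost pixel.
    """
    start = index * BYTES_PER_GLYPH
    rows = []
    for r in range(GLYPH_H):
        row = int.from_bytes(font_bytes[start + 2 * r:start + 2 * r + 2], "big")
        rows.append("".join("#" if row & (0x8000 >> x) else "."
                            for x in range(GLYPH_W)))
    return rows


if __name__ == "__main__":
    print("\n".join(render_glyph(load_font(), 0)))
```

With these numbers the range holds exactly 4156 glyphs of 24 bytes, which at least fits the tile settings used in crystaltile.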

So it may be that the game uses the same encoding table as in https://mattsmessyroom.com/uploads/sjis.tbl....
We will test that by searching for a simple Japanese word that uses a few easy-to-recognize hiragana or katakana.

For instance, we will replace the word "creature" (クリーチャー), which appears a lot in the first game dialogs.

How did I get this? (I do not speak Japanese.) Well, using the DeepL app and my phone camera, I observed the dialogs in the first 5 minutes of the main scenario, and I saw that this word appears many times early in the tutorial dialogs.

HOWTO :

using the shift-JIS table above, we expect that クリーチャー in shift-JIS hex is 834E 838A 815B 8360 8383 815B
open track_1.bin with wxMEdit
menu display -> encoding -> East Asian -> select shift-JIS
menu search -> check "find hex string" -> paste 834E838A815B83608383815B
click count, you should see 246 occurrences
click next, then previous, to display the 1st occurrence
you should see text dialogs on the right, as shown in the following image:

If you look at the 1st occurrence, at 00CE110A, you can see that this word "creature" (クリーチャー) appears in a block of text, separated from other blocks of text by a run of 00 values.
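
The search above can also be scripted. A small sketch, assuming Python's built-in "shift_jis" codec agrees with the table the game uses for this word (it does for these katakana); `find_all` is a hypothetical helper name:

```python
WORD = "クリーチャー"  # "creature"


def sjis_hex(text):
    """Shift-JIS bytes of a string, as an uppercase hex string."""
    return text.encode("shift_jis").hex().upper()


def find_all(data, needle):
    """Return every offset at which needle occurs in data."""
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets


if __name__ == "__main__":
    with open("track_1.bin", "rb") as f:
        data = f.read()
    hits = find_all(data, WORD.encode("shift_jis"))
    # Compare with the count and 1st occurrence reported by wxMEdit.
    print(len(hits), hex(hits[0]) if hits else None)
```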

We can copy a sentence and check it using the website: Hex to String Converter Online - DenCode...
- select (highlight) a piece of a sentence
- menu edit -> advanced -> copy as hex string
- paste it in the aforementioned website and make sure it uses shift-JIS. I used this piece of sentence:
  - hex value : 834A838B8368838982CC8EF4949B82AA81418DA182BE82C989E482AA91CC82F0 0A 8D5391A982B582C482A282E982C682A282A482CC82A9814581458145
  - in the website, it should be displayed as :
    - note that all characters are 2 bytes (8xxx or 9xxx in hex), while 0A is a 1-byte character, set apart with spaces in the example above
    - this one has been converted to a line return
  - so, 0A : will likely be a line return
  - spoiler : 07 value will be a special code meaning "go to new dialog window"
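
The same check can be done offline instead of using the website. A minimal sketch using Python's standard "shift_jis" codec; the hex string is the one copied above:

```python
# Hex string copied from the editor; bytes.fromhex() ignores the spaces.
HEX = ("834A838B8368838982CC8EF4949B82AA81418DA182BE82C989E482AA91CC82F0"
       " 0A "
       "8D5391A982B582C482A282E982C682A282A482CC82A9814581458145")


def decode_sjis_hex(hex_string):
    """Turn a hex dump into text, decoding it as Shift-JIS."""
    return bytes.fromhex(hex_string).decode("shift_jis")


if __name__ == "__main__":
    text = decode_sjis_hex(HEX)
    # The single 0A byte comes out as "\n", so this prints two lines.
    print(text)
```

The 0A byte decodes to a newline on its own, which is consistent with it being the game's line-return code.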
If you browse this block with your mouse in wxMEdit, you will notice that blocks of text are split by a chunk of seven "00".
- you can see this pattern: 00 seven times, then 32 00 ** 0F [some shift-JIS text] [some control code to end the block]
So there is a chance we have found the text data structure, with maybe some pointer to character names in the window header, or some other control codes.
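
To list every place where this header pattern shows up, a regular expression over the raw bytes is enough. This is only a sketch of the pattern as described above (seven 00, then 32 00, one unknown byte, then 0F); the meaning of the unknown byte is not established, and the names are hypothetical.

```python
import re

# Seven 00 bytes, then 32 00, one unknown byte (captured), then 0F.
HEADER_RE = re.compile(rb"\x00{7}\x32\x00(.)\x0f", re.DOTALL)


def find_block_headers(data):
    """Return (offset, unknown_byte) for every match of the header pattern."""
    return [(m.start(), m.group(1)[0]) for m in HEADER_RE.finditer(data)]


if __name__ == "__main__":
    with open("track_1.bin", "rb") as f:
        data = f.read()
    for offset, value in find_block_headers(data)[:10]:
        print(hex(offset), hex(value))
```

Looking at how the captured byte varies from block to block may give a hint about what it encodes.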

Now, let's move back to the existing translation. We look for the beginning of the above sentence () with a CTRL-F on each page of the translation website, which gives a match on this page :
- Taunts9 Script - Culdcept DS translation wiki...
- Entries 10-19
- ID 13, translation is : "So, the curse of Culdora is confining my body…"
We just translated something !

Going further, let's see if modifying some text directly in wxMEdit actually makes the translation appear in-game.

HOW TO :

We will replace the word "クリーチャー" (creature in Japanese), whose shift-JIS hex is 834E 838A 815B 8360 8383 815B, with 63 72 65 61 74 75 72 65 20 20 20 20, which is "creature    ", i.e. 8 characters plus 4 spaces (hex value 20), to make it as many bytes as "クリーチャー" (note that the spaces in the hex patterns above are only there for readability)
menu search -> replace -> check "find hex string"
then replace 834E838A815B83608383815B, by 637265617475726520202020
save the results to track_1.bin (make sure the name fits the cue file associated to your bin/cue dump)
then load this modified game image into Mednafen
in the main menu, select the first icon and enter your name to create a new game
you enter the main scenario; after a few dialogs, you will see this
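
The replacement steps above boil down to a same-length byte substitution, which can be sketched as follows. The helper names are hypothetical, and note that it replaces every occurrence in the data, exactly like the editor's "replace all" would.

```python
def pad_ascii(replacement, target_len, fill=b" "):
    """Encode replacement as ASCII and pad it with spaces to target_len bytes."""
    raw = replacement.encode("ascii")
    if len(raw) > target_len:
        raise ValueError("replacement is longer than the original text")
    return raw + fill * (target_len - len(raw))


def replace_same_length(data, old, new_text):
    """Replace every occurrence of old, keeping the total size unchanged."""
    return data.replace(old, pad_ascii(new_text, len(old)))


if __name__ == "__main__":
    old = "クリーチャー".encode("shift_jis")  # 12 bytes
    with open("track_1.bin", "rb") as f:
        data = f.read()
    patched = replace_same_length(data, old, "creature")
    assert len(patched) == len(data)  # offsets in the image must not move
    # The output name is arbitrary; it must match your cue file to be loaded.
    with open("track_1_patched.bin", "wb") as f:
        f.write(patched)
```

Keeping the size identical matters because nothing else in the image is relocated at this stage.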

TODO in next updates :

- supplementary fonts found via VDP memory exploration
- deciphering the dialog control codes
- observations related to variable width font in menus

		benclaff	Nov 10, 2024
####################
##################
PART 2 : Control codes / text data structures
###################
###################

This post shows how I explored the text data structures and deduced some textbox control codes, card statistics flags, offset tables, etc.

First, I looked for a good hex editor that would 1) let me view text as shift-JIS (see previous post) and 2) let me highlight byte patterns, defined as regular expressions. (If you do not know what a regexp is, check Wikipedia, or a good tutorial in whatever coding language you like. For humanity's sake, do not use chatGPT, that burns 20 times more eqCO2 per query...)

I went for 010 Editor, which is Mac / Linux / Windows compatible (note that it is commercial software, not open source). An open-source equivalent would be ImHex.

After installation, I opened the file CULDCEPT.DT0, where we previously spotted some text. Set the byte translation to shift-JIS : View -> Charset -> International -> select shift-JIS

My first regexp will match the byte intervals of the shift-JIS encoding, i.e. the 1-byte characters (mostly Latin) and the 2-byte (Japanese) characters. You can see the full shift-JIS table and these intervals here: (picori::shift_jis_1997 - Rust...). A 2-byte character is basically a first byte in [x81-x9F] | [xE0-xEF] followed by a second byte in [x40-x7E] | [x80-xFC]. Notice that value 7F is excluded for the 2nd byte.

That would be a few intervals to enter, so for quick exploration I summarized this as the single interval [x8140-x9FFC]. It does not match shift-JIS exactly (it accepts some invalid second bytes and ignores first bytes in [xE0-xEF]), but it is OK for this tutorial. In base 10 (decimal), this 2-byte interval (2-byte values are called 'shorts' in 010 Editor) is [33088-40956].

I created a "highlight" in the editor to colour the background of whatever matches this interval: go to menu View -> Highlight -> Edit Highlights, remove existing entries and create a new one, as a 'short' (2-byte) value, and select a colour (I used light blue).
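
The byte intervals described above can also be written as a regular expression to locate dense runs of 2-byte characters without scrolling. A rough sketch, with a hypothetical `find_text_runs` helper; like the highlight, it will produce false positives in non-text data and stops at every 1-byte control code, so treat the result as a list of candidates only.

```python
import re

# One shift-JIS 2-byte character: first byte in [81-9F] or [E0-EF],
# second byte in [40-7E] or [80-FC] (7F is excluded).
SJIS_PAIR = rb"(?:[\x81-\x9f\xe0-\xef][\x40-\x7e\x80-\xfc])"


def find_text_runs(data, min_chars=8):
    """Return (offset, length_in_bytes) of runs of >= min_chars 2-byte chars."""
    run_re = re.compile(SJIS_PAIR + b"{" + str(min_chars).encode("ascii") + b",}")
    return [(m.start(), m.end() - m.start()) for m in run_re.finditer(data)]


if __name__ == "__main__":
    with open("CULDCEPT.DT0", "rb") as f:
        data = f.read()
    for offset, length in find_text_runs(data)[:20]:
        print(hex(offset), length)
```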
I also entered some control codes that I spotted earlier, e.g. x0A for line return and x07 for wait_button_input (both in red), and the interval [x0F00-x0FFF], which looks to be character portraits displayed on top of the dialog window (in green).

Now my aim is to scroll down through the bytes with the page_down key and find some large coloured blocks, which will probably be text-related. The summary pane on the right is helpful for that. In Culdcept's case, we are lucky because texts are grouped in relatively dense blocks and are not compressed, so text structures appear rapidly. See the image ? Look at this big block full of blue, that's probably some text block !

You will notice that the highlight tool is rather limited: as soon as a 1-byte control code follows 2-byte characters, the highlighting is shifted by 1 byte and will not show blue again until the next 1-byte code realigns it. But this is enough to spot blue blocks.

After scrolling through the full file, I listed 21 text blocks. After copying a few words from each block and searching for matches in the existing Nintendo DS translation wiki, I managed to associate each block to a text category (scenario, taunts, cards, items...) and determine their offsets. This is long work. Altogether, I probably spent around 6 hours to reach this step. Weirdly, 6 blocks are a repetition of the scenario text, all with the exact same bytes (a data packing oversight from the developers, I suppose).

Now let's dig into the text data structures. I found 3 general patterns :

1) offset tables + dense text : easy to guess, as you will see (scenario, tutorials).
2) pointer tables + dense text. There is more than control codes here, some byte blocks between texts have some function.
3) data/text mixed tables, with pointers and/or tables : short texts sit in the middle of structured game assets, such as cards, items, spell statistics...

Next post will detail pattern 1.

###################
########
## Text data pattern n°1

This is the simplest text pattern.
We are going to use the "tutorial" text as an example. Have a look at this screenshot taken from position xC0B9C0, which is the "Scenario" text. You can clearly see the text block in blue (full block in the summary pane on the right, beginning of the block in the hexadecimal pane) with control codes appearing in red/green.

We could already start translating via 010 Editor from here, but with a strong limitation : we would have to stick to the same byte size for each text. Meaning that if a text was 14 bytes long (so 7 shift-JIS Japanese characters, as they are 2 bytes long), we could at best replace it with 14 Latin characters (they are 1 byte in shift-JIS, i.e. ASCII-compatible).

Could we hope for more freedom ? Well, so far I found that yes, but not much more. We are stuck by the fact that all data is packed in 1 file, and until someone guesses the packing format (not me) we will have to fit the English translation into the limits of these blue blocks. But we would be happy to get some freedom to shift text blocks at our convenience within this interval, because some sentences, when translated from Japanese to English, might need more characters, while others might need less.

Did you notice something about the bytes just before the text block ? Have a close look at each pair, can you guess a pattern ? A clue: look at the values every 2 bytes. [try before going to the next sentence !]

So x00F0, x030D x0607 x06CA ... until x6116, x622A, x0765. Then the text starts. These are systematically increasing values ! x00F0 < x030D < x0607 < x06CA < ... < x6116 < x622A < x0765.

Let's take the address of the 1st pair, x00F0, which is located at address xC0B9C0 : C0B9C0 + 00F0 = C0BAB0 = address of the 1st sentence ! More precisely, it points to x0F02, which is a character portrait code, followed by the text (spoiler: x1307 is replaced by the player name). Let's take the second : address of 00F0 (C0B9C0) + 030D = C0BCCD = 2nd block of text !

So we may have a way to set where each text block starts.
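
The offset table described above can be read with a few lines of Python. This is a sketch under several assumptions that the post does not confirm: the offsets are 16-bit big-endian values, they are relative to the start of the table, the first offset equals the table size (the text starts right after the table), and each sub-block ends at its first x00 byte. All names are hypothetical.

```python
import struct


def read_offset_table(data, table_start):
    """Read the 16-bit big-endian offsets stored at table_start.

    Assumes the 1st offset equals the table size in bytes, so the
    number of entries is first_offset // 2.
    """
    first = struct.unpack_from(">H", data, table_start)[0]
    count = first // 2
    return list(struct.unpack_from(">%dH" % count, data, table_start))


def extract_sub_blocks(data, table_start):
    """Return (absolute_address, raw_bytes) for each sub-block of text."""
    blocks = []
    for offset in read_offset_table(data, table_start):
        start = table_start + offset       # e.g. 0xC0B9C0 + 0x00F0 = 0xC0BAB0
        end = data.index(b"\x00", start)   # assumed x00 terminator
        blocks.append((start, data[start:end]))
    return blocks


if __name__ == "__main__":
    with open("CULDCEPT.DT0", "rb") as f:
        data = f.read()
    for address, raw in extract_sub_blocks(data, 0xC0B9C0)[:3]:
        print(hex(address), raw[:16].hex())
```

If a control code turns out to take a x00 argument, the terminator search would cut that sub-block short, so the output is worth comparing against the hex editor.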
With a bit of scripting, that will allow us to build tools to modify this text with more freedom. Also, you will notice that each sub-block targeted by this list of offsets ends with x00 (highlighted with a black background in my screenshot).

If you launch the game and compare the text, you can confirm that the 1st sub-block is the 1st conversation of the game (on the world map). The second is the 1st conversation at the start of the 1st battle, etc.

Now, let's modify bytes to confirm we guessed everything correctly. I modified sub-blocks 1 and 2. I changed their text, but also some portraits and the offset of block 2. Basically, the dialog on the world map (1st sub-block) will be much shorter and the dialog in the 1st battle intro (2nd sub-block) will be longer.

I used Sega Saturn Patcher from Knight0fDragon, using Malenko's tutorial... to patch the image with the modified CULDCEPT.DT0 file. Here is the result, recorded from Mednafen.
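
The kind of scripting mentioned above could look like the sketch below: rebuild the offset table from a list of edited sub-blocks, then pad the result so the whole block keeps its original size. It relies on the same unconfirmed assumptions as before (16-bit big-endian offsets relative to the table start, x00 terminators), and the function names are hypothetical.

```python
import struct


def build_block(texts):
    """Serialize sub-blocks as: offset table, then x00-terminated texts."""
    table_size = 2 * len(texts)
    offsets = []
    body = b""
    for raw in texts:
        offsets.append(table_size + len(body))
        body += raw + b"\x00"
    if offsets and offsets[-1] > 0xFFFF:
        raise ValueError("offset does not fit in 16 bits")
    return struct.pack(">%dH" % len(offsets), *offsets) + body


def repack(texts, original_size):
    """Rebuild the block and pad it with x00 up to its original size."""
    block = build_block(texts)
    if len(block) > original_size:
        raise ValueError("translated block is %d bytes too long"
                         % (len(block) - original_size))
    return block + b"\x00" * (original_size - len(block))
```

This way one sentence can grow as long as another one shrinks, while the total stays within the limits of the original blue block.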

		benclaff	Nov 10, 2024
####################
#######
## Text data pattern n°2

[writing in progress !]

Scenario and tutorials are classic dialogs, in which you scroll through the text window after window. They are loaded sequentially, which may explain how quickly we understood how to control them. However, this is not the case for the other half of the NPC dialogs. In particular, the "taunts" that NPCs throw during gameplay depend on what is actually happening in the round, for instance some evil laugh comment just after you lost a card in a battle. These taunts are structured using offset tables and batches of bytes associated with each text (they may be related to their frequency or to some game logic).

To summarize, we aim to edit the bytes corresponding to the text without breaking anything in the game logic. This makes a minimal study of the associated data structure a compulsory step.

Let's start this study. I chose the block that matches Taunts #1 in the Nintendo DS translation, i.e. the taunts from the character "Zeneth". He launches taunts in the very first battle, so that will be useful for rapid tests. Matching sentences can be found in a text block starting at xB58B8C.

Similarly to the previous section, we can easily distinguish a preliminary pattern, maybe an offset or pointer table, from a block containing many shift-JIS characters. However, we can immediately observe that more non-character bytes, and in particular batches of x00 (highlighted with black background/grey font), are separating the taunt texts.

Batches of x00 are often (not always) a consequence of "filling" unused bytes when binary data structures have fixed or normalized sizes. For instance, imagine a sheet of paper with a grid: you draw a table of 2 columns and 5 lines, and you decide that each column is 8 grid steps wide. If you write 'sword' in the 1st column, 1st line, 1 letter per grid square, you have 3 squares left without a letter. If you write 'helmets' in the 2nd column, you have 1 empty square.
When serializing (converting) a data structure to bytes, something similar can happen, and batches of x00 may be these empty space fillers.

As explained in the previous sections, we can rapidly see that there is an offset table before this block. In fact... it seems that there are two ! To understand better, let's use the bookmarks function of 010 Editor. You can select some byte intervals, then press CTRL+B to make the bookmark menu appear. Select a colour and add a text description to the bookmark. I did that for the 2 offset tables, as well as for each block of bytes that is targeted by an offset from the 1st table. Here is the result.

Now that's interesting: each block mixes text as we found in the scenario, i.e. x0FXX for portraits, followed by shift-JIS chars, control codes x07 and x0A, and x00 to finish the text. But we can also observe some probable pointers / game logic. Actually, before each batch of x00, we can observe 2 bytes like x10FX. With a bit of colour, we can see this :

The red arrow is the offset to which the offset table value points. Then, in yellow, I marked this interesting pattern which seems to appear a bit everywhere : [x10FX(some x00)(2 to 3 bytes not necessarily x00)]. It is followed by the portrait control code (in orange) and the text until its end (marked in black). Here you can see that we have one block, with this pattern twice, followed by 2 texts.

Could these be pointers, offsets, game logic or script commands ? (ex: give a card to the user before this text) Let's explore more.
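
To make such blocks easier to read, the control codes identified so far can be split from the text with a small tokenizer. This is only a sketch based on the codes guessed in this thread (x0FXX portrait, x0A line return, x07 wait for button, x1307 player name, x00 end of text); the x10FX records are not understood yet and are not handled, so they will come out as garbage text. The token names are my own.

```python
def tokenize(block):
    """Split a dialog sub-block into (kind, value) tokens."""
    tokens = []
    pending = bytearray()  # text bytes not yet emitted

    def flush():
        if pending:
            tokens.append(("text", pending.decode("shift_jis", errors="replace")))
            pending.clear()

    i = 0
    while i < len(block):
        b = block[i]
        if b == 0x00:                                   # end of text
            flush()
            tokens.append(("end", None))
            break
        if b == 0x0F and i + 1 < len(block):            # portrait + 1-byte id
            flush()
            tokens.append(("portrait", block[i + 1]))
            i += 2
        elif b == 0x13 and block[i + 1:i + 2] == b"\x07":  # player name
            flush()
            tokens.append(("player_name", None))
            i += 2
        elif b == 0x0A:                                 # line return
            flush()
            tokens.append(("newline", None))
            i += 1
        elif b == 0x07:                                 # wait for button
            flush()
            tokens.append(("wait", None))
            i += 1
        elif (0x81 <= b <= 0x9F or 0xE0 <= b <= 0xEF) and i + 1 < len(block):
            pending += block[i:i + 2]                   # 2-byte character
            i += 2
        else:
            pending.append(b)                           # 1-byte character
            i += 1
    flush()
    return tokens
```

Running it on a sub-block of Zeneth's taunts and looking at where the output turns into garbage is one way to delimit the bytes that still need to be explained.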

		benclaff	Nov 10, 2024
####################
#######
## Text data pattern n°3

[placeholder]

		benclaff	Nov 23, 2024
####################
##################
PART 3 : more fonts in VDP memory
###################
###################

[placeholder]

		benclaff	Nov 23, 2024
####################
####################
#########
PART 4 : notes on tooling (shift-JIS & python)
###################
####################
##########

[placeholder]

		benclaff	Dec 29, 2024
		The tools are starting to be functional for extracting and patching these elements : scenario, tutorials and card/items/spells data. I will juggle between these and updating the hack tutorials in the coming months.

		Malenko	Dec 30, 2024
		I removed my reply, it was ruining the flow of the thread and something you already knew. Love the tutorial sections, keep it up!

		OfManNotMachine	Apr 23, 2025
		Very excited for this! Thanks for working on it!!

		giloi	May 18, 2025
		Culdcept deserves a translation, thank you so much for your support

		Hiroshi Takahashi	Jul 2, 2025
		You still updating your post for some status updates about the project?

		benclaff	Jul 2, 2025
		Advancing slowly but steadily since my daughter was born. Side note, I spent a lot of time and had fun deciphering the card data structures. The repo scripts will allow injecting a translation, but will also allow editing cards... to some extent. (this game uses a Magic: The Gathering-like card system) I'm probably going to poke some Culdcept fans on their dedicated discord. I need some help to understand all the stats and card flags. I got most of them, but some are really obscure, and do not match any guide or database ... I'm documenting this process, and will add it later to the tutorial above.

		Reddha	Jul 8, 2025
		Congrats on the new baby!

		Hiroshi Takahashi	Sep 23, 2025
		Is this project still ongoing?