User:Dr ishmael/Gw.dat

Text files

Some of the files are text files, containing all of the in-game text: skill names and descriptions, NPC and quest dialogues (assumed), and text for the in-game store. Text files do not have a defined header to identify them, so they must be identified through their other characteristics.

  • The last 2 bytes of the file designate the language code and file code. Currently there are 12 languages and 96 (0x00 – 0x60) files. The languages are:
00 English
01 Korean
02 French
03 German
04 Italian
05 Spanish
06 Traditional Chinese
07 Simplified Chinese
08 Japanese
09 Polish
0A Russian
11 Bork!
  • A text file contains exactly 1024 strings. Some are plaintext UTF-16, while others are encrypted.
  • Text files are encoded in little-endian UTF-16, which means that within each 16-bit "word," the least-significant byte comes first and the most-significant byte second. For example, the letter 'A', which is Unicode codepoint U+0041, is stored as the bytes 0x41 0x00. The "word" is still considered to be 0x0041, however, and this is how I will denote full words below.
  • Each string is identified by a 6-byte header:
    • Bytes 0 and 1 encode the length of the string, in bytes, including the header. Thus the smallest length would be 0x06 0x00, for the mandatory 6-byte header. (This is commonly found at the end of the last text file, where those string IDs are not yet in use.) The longest string in the English files is 4020 bytes, or 0x0FB4.
    • Bytes 2 and 3 are always 0x0000 for plaintext strings, but for encrypted strings they hold the lowest 16-bit character value found in the (decrypted) string (e.g., if the string is "Backpack" then bytes 2 and 3 are 0x0042, the UTF-16 encoding for 'B'). There are a small number of encrypted strings for which these two bytes are 0xFFFF, which the Unicode standard defines as a noncharacter. It is unknown what purpose this serves.
    • Bytes 4 and 5 are a mystery. For all plaintext strings, they are 0x0010 (16), but for encrypted strings, they can take on values between 0x0005 and 0x000E.
  • The plaintext strings cover all skill names and descriptions, zone names, most item names, most NPC names, speech-bubble NPC dialogue, and most interface text.
  • The encrypted strings appear to contain everything else, including chatbox and windowed NPC dialogue.
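
The string layout described above can be sketched in Python. This is only a minimal illustration, not a tested extractor, and it rests on several assumptions that the notes above do not confirm: that the strings are stored back-to-back starting at offset 0 of the text file, that the character data immediately follows each 6-byte header, and that all three header fields are little-endian 16-bit words. The names StringRecord, iter_strings and make_plain_record are made up for this example. Encrypted strings are only detected, not decrypted.

```python
import struct
from typing import Iterator, NamedTuple, Optional

HEADER_SIZE = 6  # the mandatory 6-byte header in front of every string


class StringRecord(NamedTuple):
    length: int     # bytes 0-1: total length in bytes, header included
    min_char: int   # bytes 2-3: 0x0000 for plaintext strings
    unknown: int    # bytes 4-5: 0x0010 for plaintext strings
    payload: bytes  # assumed: everything after the header

    @property
    def is_plaintext(self) -> bool:
        # Both markers described above must match for a plaintext string.
        return self.min_char == 0x0000 and self.unknown == 0x0010

    def text(self) -> Optional[str]:
        # Only plaintext strings can be decoded directly as UTF-16LE.
        if not self.is_plaintext:
            return None
        return self.payload.decode("utf-16-le")


def iter_strings(data: bytes, count: int = 1024) -> Iterator[StringRecord]:
    """Walk up to `count` strings, assuming they are stored back-to-back."""
    offset = 0
    for _ in range(count):
        if offset + HEADER_SIZE > len(data):
            break  # ran out of data before reaching `count` strings
        length, min_char, unknown = struct.unpack_from("<HHH", data, offset)
        if length < HEADER_SIZE or offset + length > len(data):
            raise ValueError(f"implausible string length {length} at offset {offset}")
        payload = data[offset + HEADER_SIZE:offset + length]
        yield StringRecord(length, min_char, unknown, payload)
        offset += length


def make_plain_record(s: str) -> bytes:
    """Build a synthetic plaintext record (for demonstration only)."""
    body = s.encode("utf-16-le")
    return struct.pack("<HHH", HEADER_SIZE + len(body), 0x0000, 0x0010) + body


# Synthetic data: one plaintext string followed by an empty, unused slot.
sample = make_plain_record("Backpack") + make_plain_record("")
for record in iter_strings(sample, count=2):
    print(hex(record.length), record.is_plaintext, repr(record.text()))
```

The two trailing bytes of a real file (language code and file code) are not handled here; on real data they would need to be sliced off before the walk, or the count of 1024 relied upon to stop in time.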