Bible Record Structures for Developers
Download UniBibleMaker
UniBibleMaker is a set of Java programs and Windows .bat files that helps you make a UniBible Bible database.
- You need to have Java available on your computer.
- If you are not using a Windows OS, you need to translate the .bat files into a shell script that your system can understand, or run the commands manually in your shell.
- It requires your starting Bible to be in a text file in a format like this:
Genesis 1:1: In the beginning, God created...
Genesis 1:2: And the earth was formless...
- It comes with a Readme.txt file that tries to give adequate instructions.
- It is provided as-is, in the hope that it may help you out.
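To make the expected input format concrete, here is a minimal Python sketch that splits one such line into its parts. It is inferred only from the two sample lines above. The Readme.txt that ships with UniBibleMaker remains the authority on what the tool accepts, and the names `VERSE_LINE` and `parse_verse_line` are hypothetical.

```python
import re

# Matches lines such as "Genesis 1:1: In the beginning, God created..."
# The book name is matched lazily so names with spaces ("1 Samuel") also work.
VERSE_LINE = re.compile(
    r"^(?P<book>.+?)\s+(?P<chapter>\d+):(?P<verse>\d+):\s*(?P<text>.*)$"
)

def parse_verse_line(line):
    """Return (book, chapter, verse, text) for one input line, or None."""
    match = VERSE_LINE.match(line.strip())
    if match is None:
        return None
    return (
        match["book"],
        int(match["chapter"]),
        int(match["verse"]),
        match["text"],
    )

print(parse_verse_line("Genesis 1:2: And the earth was formless..."))
# prints ('Genesis', 1, 2, 'And the earth was formless...')
```

A check like this can catch malformed lines before they are handed to the .bat files.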
Download UniBibleMaker.zip
Bible Database Structure
The following should allow people familiar with Palm OS programming to construct their own UniBible files.
The database should have a creator code of "Unib" and a type code of "BBLE". It is recommended that
the database name begin with "Unib-" followed by the name of the version. For example, "Unib-Arabic".
All Unicode text should be encoded in big-endian format. This means that the letter "A" (0x0041) is stored
in memory as the bytes "00 41" and not "41 00".
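As a quick illustration of this byte order, Python's built-in `utf-16-be` codec produces the layout described here and writes no byte-order mark. This is only a sketch: characters outside the Basic Multilingual Plane would be encoded as surrogate pairs, a case the format description does not address.

```python
# Encode text as big-endian two-byte units (no byte-order mark is written).
encoded = "A".encode("utf-16-be")
print(encoded.hex())  # prints 0041

# The little-endian form is what the format does NOT want.
wrong = "A".encode("utf-16-le")
print(wrong.hex())  # prints 4100
```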
There are three types of records, which are included in the following order:
Version Record (only one)
Book Records (one for each book included in the file)
Chapter Records (these contain the Bible text, chapter by chapter)
One Version Record
This needs to be the first record in the Palm database. It has the following format:
| UInt16 | Number of book records |
| UInt16 | Text direction: 0 or 1; 0 = left to right, 1 = right to left |
| UInt16 [array] | The name of the version. A Unicode string (2 bytes per character), 100 character maximum, terminated by 0x0000. |
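The Version Record layout can be sketched in Python with the `struct` module. Two things here are assumptions rather than statements from the format description: the integer fields are written big-endian (the text only says so for Unicode), and the 100-character limit is taken to exclude the terminator. The function names are hypothetical.

```python
import struct

MAX_NAME_CHARS = 100  # limit given in the record description

def encode_name(name):
    """Encode a name as big-endian two-byte units plus a 0x0000 terminator."""
    if len(name) > MAX_NAME_CHARS:
        raise ValueError("name exceeds 100 characters")
    return name.encode("utf-16-be") + b"\x00\x00"

def build_version_record(book_count, right_to_left, version_name):
    """Pack: UInt16 book count, UInt16 text direction, then the name."""
    # ">" selects big-endian with no padding (an assumption for the integers).
    header = struct.pack(">HH", book_count, 1 if right_to_left else 0)
    return header + encode_name(version_name)

record = build_version_record(66, False, "KJV")
print(record.hex())  # prints 00420000004b004a00560000
```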
Multiple Book Records
These records should immediately follow the Version Record.
| UInt16 | Number of chapters in this book. |
| UInt16 | The index of the record which contains the first chapter in this book. |
| UInt16 | Text direction: 0 or 1; 0 = left to right, 1 = right to left |
| UInt16 [array] | The name of this book. A Unicode string (2 bytes per character), 100 character maximum, terminated by 0x0000. |
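Because the records follow a fixed order, the index of each book's first chapter record can be computed from the chapter counts alone. The Python sketch below assumes zero-based record indices (the Version Record is record 0) and big-endian integer fields. Neither is stated explicitly in the description, and `build_book_records` is a hypothetical name.

```python
import struct

def build_book_records(books, right_to_left=False):
    """Build one Book Record per (book_name, chapter_count) pair, in file order."""
    # Record 0 is the Version Record and records 1..N are the Book Records,
    # so the first Chapter Record sits at index N + 1 (zero-based, assumed).
    next_chapter_index = 1 + len(books)
    direction = 1 if right_to_left else 0
    records = []
    for name, chapter_count in books:
        header = struct.pack(">HHH", chapter_count, next_chapter_index, direction)
        records.append(header + name.encode("utf-16-be") + b"\x00\x00")
        next_chapter_index += chapter_count
    return records

records = build_book_records([("Genesis", 50), ("Exodus", 40)])
print(struct.unpack(">HHH", records[1][:6]))  # prints (40, 53, 0)
```

With two books, Genesis starts at record 3 and Exodus at record 53, since the 50 Genesis chapters occupy records 3 through 52.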
Multiple Chapter Records
The first chapter should immediately follow the last Book Record. All the chapters from the first book should be grouped together, followed by all the chapters from the second book, etc.
| UInt16 | The chapter number. (First chapter is 0. Last chapter is NumberOfChapters - 1) |
| UInt8 | A number representing the character encoding used for this chapter's text: 0 for Unicode, or one of the other codes from the Encoding Formats table below. |
| UInt8 | Text direction: 0 or 1; 0 = left to right, 1 = right to left |
| UInt16 | The size of the uncompressed chapter text, in Unicode characters. (All chapters must have their text compressed with zlib.) If all the chapter said was "John" (four Unicode characters), the size would be 4. |
| UInt16 | The length of the chapter's title (from the uncompressed text), in Unicode characters. If the chapter's title is "14", the length would be 2. |
| UInt8 [array] | The chapter text, compressed by the zlib algorithm. |
- Be aware that UniBible prepends the name of the book to the chapter title. If the chapter title is "14" and the book name is "John", the chapter heading will read "John 14" once you open the chapter.
- Use 0x000d as the newline character inside the chapter text. It's a good idea to put a newline character right before each new verse number, so that UniBible can display a pick list of verses to the user.
- The chapter text must be terminated with 0x0000.
- You can save space by using one of the single-byte character encodings listed below. Compress your single-byte array using zlib. UniBible will extract it and convert it into an array of the corresponding two-byte Unicode characters. Don't forget to set the Encoding Format field to the correct value.
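Putting the Chapter Record fields together, a record for Unicode text (encoding code 0) could be assembled as in the Python sketch below. Several details are assumptions, not statements from the description: the integer fields are big-endian, the title occupies the first characters of the text, the size field does not count the 0x0000 terminator, and a standard zlib stream (as produced by `zlib.compress`) is what UniBible expects. The name `build_chapter_record` is hypothetical.

```python
import struct
import zlib

UNICODE_BIG = 0  # encoding code 0: big-endian, two-byte Unicode

def build_chapter_record(chapter_index, title, body, right_to_left=False):
    """Pack the 8-byte header followed by the zlib-compressed chapter text."""
    text = title + body  # assumed: the title is the start of the text
    if len(text) > 0xFFFF:
        raise ValueError("chapter text does not fit in a UInt16 size field")
    raw = text.encode("utf-16-be") + b"\x00\x00"  # required terminator
    header = struct.pack(
        ">HBBHH",
        chapter_index,              # zero-based chapter number
        UNICODE_BIG,                # encoding code
        1 if right_to_left else 0,  # text direction
        len(text),                  # size in characters (terminator excluded)
        len(title),                 # length of the chapter title
    )
    return header + zlib.compress(raw)

# Chapter "14" is stored with chapter number 13 because numbering starts at 0.
# Each verse begins with 0x000d ("\r") so a verse pick list can be built.
record = build_chapter_record(13, "14", "\r1 Let not your heart be troubled")
print(struct.unpack(">HBBHH", record[:8]))  # prints (13, 0, 0, 35, 2)
```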
Encoding Formats
| Use this Code | to represent this Encoding Format | Description |
| 0 | UnicodeBig | Big-endian, two-byte Unicode |
| 1 | 8859_1 | ISO Latin-1 (Western European) |
| 2 | 8859_2 | ISO Latin-2 (Croatian, Czech, Hungarian, Polish, Romanian, Slovak, Slovenian) |
| 3 | 8859_3 | ISO Latin-3 (Esperanto, Maltese, Turkish, Galician) |
| 4 | 8859_4 | ISO Latin-4 (Baltic: Latvian, Lithuanian, Greenlandic, Lappish) |
| 5 | 8859_5 | (Variant forms of Cyrillic: Byelorussian, Bulgarian, Macedonian, Russian, Serbian, Ukrainian) |
| 6 | 8859_6 | ASCII plus Arabic |
| 7 | 8859_7 | ASCII plus Greek |
| 8 | 8859_8 | ASCII plus Hebrew |
| 9 | 8859_9 | ISO Latin-5 (Turkish) |
| 10 | 8859_10 | ISO Latin-6 (Lappish, Nordic, Inuit) |
| 11 | 8859_11 | (Thai) |
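| 12 | 8859_13 | ISO Latin-7 (Baltic Rim languages) |
| 13 | 8859_14 | ISO Latin-8 (Celtic languages) |
| 14 | 8859_15 | ISO Latin-9 (Western European, with the euro sign) |
| 15 | 8859_16 | ISO Latin-10 (South-Eastern European) |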
| 16 | CP424 | EBCDIC for Hebrew |
| 17 | CP437 | DOS English character set |
| 18 | CP737 | DOS ASCII plus Greek |
| 19 | CP775 | DOS ASCII plus Baltic |
| 20 | CP850 | DOS ASCII plus Western European |
| 21 | CP852 | DOS ASCII plus Central European |
| 22 | CP855 | DOS ASCII plus Cyrillic |
| 23 | CP856 | |
| 24 | CP857 | DOS ASCII plus Turkish |
| 25 | CP860 | DOS ASCII plus Portuguese |
| 26 | CP861 | DOS ASCII plus Icelandic |
| 27 | CP862 | DOS ASCII plus Hebrew |
| 28 | CP863 | DOS ASCII plus Canadian French |
| 29 | CP864 | DOS ASCII plus Arabic |
| 30 | CP865 | DOS ASCII plus Nordic |
| 31 | CP866 | DOS ASCII plus Cyrillic |
| 32 | CP869 | DOS ASCII plus modern Greek |
| 33 | CP874 | DOS ASCII plus Thai |
| 34 | CP1006 | AIX Arabic used in Pakistan for Urdu |
| 35 | CP1250 | Windows 3.1 Central European |
| 36 | CP1251 | Windows, ASCII plus Cyrillic |
| 37 | CP1252 | Windows, Western European |
| 38 | CP1253 | Windows, ASCII plus Greek |
| 39 | CP1254 | Windows, ASCII plus Turkish |
| 40 | CP1255 | Windows, ASCII plus Hebrew |
| 41 | CP1256 | Windows, ASCII plus Arabic |
| 42 | CP1257 | Windows, ASCII plus Baltic |
| 43 | CP1258 | Windows, ASCII plus Vietnamese |
| 44 | KOI8_R | ASCII and Cyrillic for Russian |
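For checking generated files, the reverse direction can be sketched as well: read the header, decompress the text, and decode it with the codec matching the encoding code. The mapping below covers only a handful of codes from the table and pairs them with Python codec names, which are my own choice and not part of the format. The sketch also assumes big-endian integer fields and that the title is the first characters of the text. `PYTHON_CODECS` and `decode_chapter_record` are hypothetical names.

```python
import struct
import zlib

# Partial mapping from table codes to Python codec names (illustrative only).
PYTHON_CODECS = {
    0: "utf-16-be",   # UnicodeBig
    1: "latin-1",     # 8859_1
    7: "iso8859_7",   # 8859_7, ASCII plus Greek
    37: "cp1252",     # CP1252
    44: "koi8_r",     # KOI8_R
}

def decode_chapter_record(record):
    """Unpack a Chapter Record into a dict with its header fields and text."""
    index, code, direction, text_len, title_len = struct.unpack(
        ">HBBHH", record[:8]
    )
    raw = zlib.decompress(record[8:])
    # Strip the trailing terminator, whatever its width in this encoding.
    text = raw.decode(PYTHON_CODECS[code]).rstrip("\x00")
    return {
        "chapter_index": index,
        "right_to_left": direction == 1,
        "title": text[:title_len],
        "text": text,
        "declared_length": text_len,
    }

# Round trip with a single-byte Greek chapter (encoding code 7).
greek = "1\r1 Εν αρχή"
sample = struct.pack(">HBBHH", 0, 7, 0, len(greek), 1)
sample += zlib.compress(greek.encode("iso8859_7") + b"\x00")
print(decode_chapter_record(sample)["title"])  # prints 1
```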