OpenWDB
From WoWWiki
| OpenWDB World of Warcraft Database | |
|---|---|
| File extension | .owdb |
| MIME type | text/x-owdb+xml, text/x-owdb |
| Type of format | XML database |
| Extended from | XML |
OpenWDB is an XML implementation of the WDB file format by Blizzard, with a possible extension to DBCs and human-inserted data.
Structure
OpenWDB is an XML-based format that can represent any WDB or DBC, given a specific (and very similar) XML structure sheet for each. The format is flexible and can store a single WDB; one or several versions of the same WDB row; several distinct WDBs and/or different versions of the same WDB; and human-overwritten metadata alongside all of this data.
The advantage of OpenWDB over various formats such as CSV, or over the use of databases' XMLs such as Allakhazam or Wowhead, is that:
- All WDB data is stored. Wowhead, to take the most widely used example, keeps a lot of information in its database that is hidden from its XML. If you want advanced information, or data that the authors of the XML consider "useless", you have to fetch it yourself.
- Unlike Allakhazam's XML, the data is regular and stable. For example, if a column disappears during a few patches, and then reappears, it will have a different name.
- Unlike most, if not all, of those databases, OpenWDB keeps WDB data and human overwrites apart, even though they are stored in the same XML.
- Unlike all of those databases, OpenWDB is safe from private servers as long as it stays locked from regular users.
The XML files are linked to an XML "structure" file through their signature and build attributes. I planned to use a DTD early on, but found that it was easier to use a main structure file (particularly for automatic parsing).
Specifications
- Column (node) names must start with a lowercase letter (a-z) and may only contain alphanumeric characters (A-Z, a-z, 0-9). id and length are standard names. Attributes must end in "Id", "Lvl" or "Amt" for integers (depending on the column's meaning), "Bin" for bitmask integers, "Flt" for floats, and "Val" for strings.
- The root tag is called "OpenWDB".
- The subroot tag is called "wdb" or "dbc" depending on the file and is directly followed by the <row> tag.
- "id" and "length" are reserved attributes for the <row> node.
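The naming rules above can be checked mechanically. The following is a minimal sketch, not part of the specification: the function name `column_kind` and the sample column names (`displayId`, `flagsBin`, and so on) are made up for illustration, and the reserved names are simply reported as "reserved" because the text does not say how they should be typed.

```python
import re

# Rule from the specification: first character a-z, then only A-Z, a-z, 0-9.
NAME_RE = re.compile(r"[a-z][A-Za-z0-9]*")

# Suffix -> kind of value, as listed in the specification.
SUFFIX_KINDS = {
    "Id": "int",
    "Lvl": "int",
    "Amt": "int",
    "Bin": "bitmask",
    "Flt": "float",
    "Val": "string",
}

# Names the specification reserves for the <row> node.
RESERVED = {"id", "length"}


def column_kind(name):
    """Return the kind of value a column name announces, or None if invalid."""
    if name in RESERVED:
        return "reserved"
    if not NAME_RE.fullmatch(name):
        return None  # bad first character or non-alphanumeric content
    for suffix, kind in SUFFIX_KINDS.items():
        if name.endswith(suffix):
            return kind
    return None  # valid characters, but no recognised type suffix


# Hypothetical column names, for illustration only.
print(column_kind("displayId"))  # a well-formed integer column
print(column_kind("NameVal"))    # rejected: starts with a capital letter
```

The suffix match is case-sensitive on purpose, so a name such as `requiredlvl` is rejected rather than silently treated as an integer.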
WDB metadata
Metadata attributes are properties of the "wdb" or "dbc" tag. They contain all the information necessary to recreate the core of the file, into which the data will be inserted. The WDB metadata are as follows:
- "name", containing the name of the file, lowercased and without the extension.
- "signature", containing the signature of the file, reversed back.
- "build", containing the build number of the wdb.
- "locale", which is the locale of the wdb; just like the signature, it must be reversed back.
- "header12" is the 4-byte integer starting at byte offset 12 (bytes 12-15).
- "header16" is the 4-byte integer starting at byte offset 16 (bytes 16-19).
A few notes:
- header12, header16 and, should they exist, the following header chunks, are extremely important. A file is assumed to lack them if they are not specified.
- New headers should follow the "headerXX" rule unless their purpose is discovered, in which case they are renamed accordingly and this page has to be updated.
- Signatures and locales are written reversed in the WDBs; they must be reversed back when written to the owdb file.
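As an illustration of the metadata rules above, here is a rough sketch of how the attributes of a "wdb" tag could be built from the first 20 bytes of a file. The byte layout is an assumption inferred from the header12/header16 offsets (signature at byte 0, build at byte 4, locale at byte 8, all integers little-endian), and the function name `read_wdb_metadata` is invented for this example.

```python
import struct
from pathlib import PurePath


def read_wdb_metadata(data, filename):
    """Build the <wdb> metadata attributes from the start of a WDB file.

    Assumed layout (not confirmed by the text): 4-byte signature,
    uint32 build, 4-byte locale, uint32 header12, uint32 header16,
    with little-endian integers.
    """
    signature, build, locale, header12, header16 = struct.unpack_from(
        "<4sI4sII", data, 0
    )
    return {
        # Name of the file, lowercased and without the extension.
        "name": PurePath(filename).stem.lower(),
        # Signature and locale are stored reversed, so reverse them back.
        "signature": signature[::-1].decode("ascii"),
        "build": str(build),
        "locale": locale[::-1].decode("ascii"),
        "header12": str(header12),
        "header16": str(header16),
    }


# Synthetic 20-byte header matching the values used in the example XML.
sample_header = b"BDIW" + struct.pack("<I", 1234) + b"SUne" + struct.pack("<II", 220, 8)
wdb_meta = read_wdb_metadata(sample_header, "ItemCache.wdb")
print(wdb_meta)
```

Any further header chunks would be read the same way and named "headerXX" after their byte offset.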
DBC metadata
DBCs can share the OpenWDB format. They are compatible, but because of some differences in the underlying format, the metadata is slightly different.
- "name", just like for WDBs, contains the name of the file converted to lowercase and without the extension.
- "signature" should always be WDBC, but should be included anyway for compatibility reasons.
- "build" contains the build number corresponding to the DBC. This sucks, since it's not included in the file, but it has to be there.
- "locale" is the locale of the DBC. It sucks just as much as build, but once again it has to be there.
- "records" is the number of records in the file.
- "fields" is the number of fields per record.
- "recsize" is the size of a record, Fields * FieldSize.
- "strblocksize" is the size of the string block.
More notes:
- As with WDBs, unknown headers should be named "headerXX", where XX is the address of the bytes. If their purpose becomes known, they are renamed accordingly.
- Locale has to be included, well... technically, it shouldn't have to be included since, unlike WDBs, DBCs are multilingual. Depending on how things evolve, this might change. For now, it stays that way and the specification will be updated if needed.
- DBC rows do have an ID, but don't have a length.
- Yeah, build and locale still suck. Don't blame me.
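The DBC metadata can be sketched the same way. This example assumes the header is the 4-byte signature followed by four little-endian 32-bit integers in the order listed above (records, fields, recsize, strblocksize), and that the signature is stored unreversed; neither point is stated in the text. Since build and locale are not in the file, the hypothetical `read_dbc_metadata` function takes them as arguments.

```python
import struct
from pathlib import PurePath


def read_dbc_metadata(data, filename, build, locale):
    """Build the <dbc> metadata attributes from the start of a DBC file.

    Assumed layout: 4-byte signature, then uint32 records, fields,
    recsize and strblocksize (little-endian). Build and locale are not
    stored in the file, so the caller has to supply them.
    """
    signature, records, fields, recsize, strblocksize = struct.unpack_from(
        "<4s4I", data, 0
    )
    return {
        "name": PurePath(filename).stem.lower(),
        "signature": signature.decode("ascii"),
        "build": str(build),
        "locale": locale,
        "records": str(records),
        "fields": str(fields),
        "recsize": str(recsize),
        "strblocksize": str(strblocksize),
    }


# Synthetic header: 3 records of 5 four-byte fields, 64-byte string block.
sample_dbc_header = b"WDBC" + struct.pack("<4I", 3, 5, 5 * 4, 64)
dbc_meta = read_dbc_metadata(sample_dbc_header, "Item.dbc", 1234, "enUS")
print(dbc_meta)
```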
This would be a standard structure example, supporting multiple WDBs (including different builds and locales of the same one) and an optional overwrite (see below). <attributes/> is a fake tag standing in for all of the attributes:
<?xml version="1.0" encoding="UTF-8"?>
<OpenWDB>
  <wdb name="itemcache" signature="WIDB" build="1234" locale="enUS" header12="220" header16="8">
    <row id="12345" length="750">
      <attributes/>
    </row>
    <row id="12346" length="750">
      <attributes/>
    </row>
  </wdb>
  <wdb name="itemcache" signature="WIDB" build="2345" locale="frFR" header12="220" header16="8">
    <row id="12345" length="750">
      <attributes/>
      <overwrite>
        <attributes/>
      </overwrite>
    </row>
  </wdb>
  <wdb name="creaturecache" signature="WMOB" build="1234" locale="enUS" header12="220" header16="8">
    <row id="1234" length="220">
      <attributes/>
    </row>
  </wdb>
  <dbc name="item" signature="WDBC" build="1234" locale="enUS">
    <row id="1234">
      <attributes/>
    </row>
  </dbc>
</OpenWDB>
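To show how a file laid out like this might be consumed, here is a small sketch using Python's standard xml.etree.ElementTree module. It groups row ids by file type, name, build and locale, which is one plausible way to tell apart several versions of the same WDB. The function name `index_rows` and the trimmed-down sample are invented for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample in the same shape as the structure example.
SAMPLE = """<OpenWDB>
  <wdb name="itemcache" signature="WIDB" build="1234" locale="enUS" header12="220" header16="8">
    <row id="12345" length="750"/>
    <row id="12346" length="750"/>
  </wdb>
  <wdb name="itemcache" signature="WIDB" build="2345" locale="frFR" header12="220" header16="8">
    <row id="12345" length="750"/>
  </wdb>
  <dbc name="item" signature="WDBC" build="1234" locale="enUS">
    <row id="1234"/>
  </dbc>
</OpenWDB>"""


def index_rows(xml_text):
    """Map (kind, name, build, locale) to the list of row ids it contains."""
    root = ET.fromstring(xml_text)
    if root.tag != "OpenWDB":
        raise ValueError("root tag must be OpenWDB")
    index = {}
    for table in root:
        if table.tag not in ("wdb", "dbc"):
            continue  # ignore anything that is not a subroot tag
        key = (table.tag, table.get("name"), table.get("build"), table.get("locale"))
        row_ids = [row.get("id") for row in table.findall("row")]
        index.setdefault(key, []).extend(row_ids)
    return index


row_index = index_rows(SAMPLE)
for key, ids in row_index.items():
    print(key, ids)
```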
Empty values
All empty values (that is, all values that evaluate to false: 0 for integers, 0.0 for floats, and nothing for strings) are written to the XML as an empty, self-closed tag in order to save space (a gain of about 30%). Please refer to the corresponding item structure when parsing those.
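A parser therefore has to turn an empty tag back into the right kind of zero. The sketch below assumes that columns are child elements of the row and, as a stand-in for looking the type up in the structure file, picks the type from the name suffixes described under Specifications. The column names and the function `decode_column` are made up for this example.

```python
import xml.etree.ElementTree as ET

# Suffixes that announce an integer (bitmasks included).
INT_SUFFIXES = ("Id", "Lvl", "Amt", "Bin")


def decode_column(element):
    """Decode one column element, restoring the zero value of empty tags."""
    text = element.text or ""  # a self-closed tag has no text at all
    if element.tag.endswith(INT_SUFFIXES):
        return int(text) if text else 0
    if element.tag.endswith("Flt"):
        return float(text) if text else 0.0
    return text  # strings: an empty tag is simply the empty string


# Hypothetical row with one filled column and three empty ones.
sample_row = ET.fromstring(
    '<row id="1" length="16">'
    "<displayId>42</displayId><flagsBin/><speedFlt/><nameVal/>"
    "</row>"
)
decoded = {col.tag: decode_column(col) for col in sample_row}
print(decoded)
```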
<overwrite/>
For each WDB row, OpenWDB can include a separate, human-written overwrite tag which will contain information that is either not available in the WDBs or is being overwritten server-side. Such information is more dynamic than the information in the WDB, since users may choose not to parse it. Normalization of any kind is up to the author parsing it.
Normalization rules for overwrites will be put up on the wiki, but will be WoWWiki-specific. Users may choose not to follow them.
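Because WDB data and overwrites are kept apart, a reader can decide whether to apply the overwrite. A minimal sketch of that choice follows, assuming the overwrite tag holds column elements with the same names as the row (plus any extra, human-added ones). The function `row_values` and the column names are hypothetical, and values are left as raw strings with no normalization.

```python
import xml.etree.ElementTree as ET


def row_values(row, use_overwrite=True):
    """Return the column values of a row, optionally applying <overwrite>."""
    # Plain WDB columns: every child of the row except the overwrite tag.
    values = {col.tag: (col.text or "") for col in row if col.tag != "overwrite"}
    overwrite = row.find("overwrite")
    if use_overwrite and overwrite is not None:
        for col in overwrite:
            # Human data replaces or extends the WDB data.
            values[col.tag] = col.text or ""
    return values


# Hypothetical row: the overwrite corrects one column and adds another.
overwritten_row = ET.fromstring(
    '<row id="12345" length="750">'
    "<nameVal>Sword</nameVal><requiredLvl>10</requiredLvl>"
    "<overwrite><requiredLvl>12</requiredLvl><sourceVal>Quest</sourceVal></overwrite>"
    "</row>"
)
print(row_values(overwritten_row))
print(row_values(overwritten_row, use_overwrite=False))
```

Passing `use_overwrite=False` gives back exactly what came from the WDB, which is the property the format is meant to preserve.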
Test cases
No currently available test cases.
WoWWiki implementation
The wiki itself won't be able to get any benefit from OpenWDB unless Wikia codes a way for us to parse XML from the wiki. I am also assuming those pages will be editable by a given group of editors, wherever they are stored. Since they might have cross-site use, they should be automatically locked to regular members. That group should be separate from the regular user group, so that it can include more than just admins (e.g. bots or other users).
Warning: We are talking about gigabytes of XML data here. If Wikia doesn't want to store it on the wiki, I can possibly spare a couple of servers for that - this is to be discussed. Wherever it is actually stored, it just needs to be editable by a given group.
- All WDB data will be stored on the wiki (with the exception of Wowcache.wdb, for obvious reasons).
- DBC data may be stored, who knows. But since it would be horribly tedious to store in XML format (a single build's DBCs are well over 2GB of XML data), it will have to wait - I have some other ideas on that which I'll talk about later on.
- When I say "WDB" in the future, I really mean WDB and DBC. Those formats are sufficiently compatible.
