Mobileread
WOLF format
#1  DaleDe 12-03-2007, 06:52 PM
Is there a description of this format somewhere? Is anyone using it? Where can books in this format be obtained? Are there converters available?

There needs to be a wiki entry on this format and I could use some help.

Dale

#2  NatCh 12-03-2007, 07:38 PM
It's HanLin's pet format (you should pardon the pun). Unfortunately, that's about all I know about it.

#3  kovidgoyal 12-03-2007, 09:22 PM
Someone once contacted me to write a converter for it, but apparently Hanlin didn't want to part with its specifications.

#4  mrdini 12-04-2007, 07:05 AM
Quote kovidgoyal
Someone once contacted me to write a converter for it, but apparently Hanlin didn't want to part with its specifications.
FWIW, the specifications _are_ around... I've asked Hanlin recently, & since they said it's okay to discuss their SDK, here goes....

From the Hanlin docs...
Code
struct HlDocInfoS
{
    int type;
    int lang;
    int pageNum;
    int fileSize;
    char *fileName;
    char *fileDate;        // Printable file date, for example 08/22/2007
    char *bookName;
    char *author;
    char *series;          // e.g. "Harry Potter #2"
    char *isbn;
    char *publisher;
    char *publishDate;
    char *translator;
    char *originalName;    // name of book in original language, for translation
    char *originalAuthor;  // author(s) in original language, for translation
    char *originalSeries;  // series in original language, for translation
};
From the CoolReader docs
Code
Wolf file structure (sections):
{
  Header
  Description
  [Cover]
  WolfPages
  Catalog
  [SubCatalog]
  PageTable
}
1. Header
Length = 128 bytes.
{
  WolFileId       : array[0..12] of Char;
  Unknown0D       : DWORD;
  Unknown11       : WORD;
  Unknown13       : DWORD;
  DescriptionSize : WORD;
  CoverSize       : DWORD;
  Unknown1D       : Byte;
  CatalogSize     : DWORD;
  Level1Items     : DWORD;
  WolfPagesSize   : DWORD;
  Unknown2A       : array[0..17] of Byte;
  PageTableSize   : DWORD;
  Unknown40       : Byte;
  BookType        : Byte;
  Unknown42       : WORD;
  Unknown44       : array[0..6] of Byte;
  Unknown4B       : WORD;
  Unknown4D       : array[0..17] of Byte;
  Level23Items    : WORD;
  SubCatalogOffs  : DWORD;
  Unknown65       : array[0..26] of Byte;
}
where:
  [0x00] WolFileId: constant = "WolfEbook1.11"
  [0x0D] Unknown0D: constant = 0x00000000
  [0x11] Unknown11: constant = 0x0201
  [0x13] Unknown13: constant = 0x00000000
  [0x17] DescriptionSize: size of section "Description"
  [0x19] CoverSize: size of section "Cover". 0, if there is no cover page.
  [0x1D] Unknown1D: ??? really unknown. (!!! I think this is 0 for a "Monochrome" Wolf file, 1 for a "Gray" Wolf file.)
  [0x1E] CatalogSize: size of sections "Catalog" and "SubCatalog".
  [0x22] Level1Items: count of items on first catalog level.
  [0x26] WolfPagesSize: size of section "WolfPages".
  [0x2A] Unknown2A: constant = fill with 0x00.
  [0x3C] PageTableSize: size of section "PageTable".
  [0x40] Unknown40: constant = 0x01
  [0x41] BookType: 0=book, 1=article, 2=magazine
  [0x42] Unknown42: ??? really unknown. (!!! I think this field is = 2*<count of pages> - only for graphic wolf!)
  [0x44] Unknown44: constant = fill with 0x00.
  [0x4B] Unknown4B: ??? really unknown. (!!! I think this field is = <count of pages> - only for graphic wolf!)
  [0x4D] Unknown4D: constant = fill with 0x00.
  [0x5F] Level23Items: count of all items on levels 2 and 3.
  [0x61] SubCatalogOffs: offset (from beginning of file) to section "SubCatalog". 0, if there is no "SubCatalog" section.
  [0x65] Unknown65: constant = fill with 0x00.

CatalogSize    = 19 + CountOfLevel1TOCItems * 17 + Length(NamesInLevel1TOC)
SubcatalogSize = 26 + CountOfAllTOCItems * 80 + Length(NamesInAllTOCLevels)
PageTableSize  = 105 + 56 * PagesCount {for ver="021211"}
               = 105 + 60 * PagesCount {for ver="001"}
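The header table above maps directly onto a fixed-size unpack. A minimal sketch, assuming little-endian byte order (the description does not state it); `parse_wolf_header` and the constant names are made up for illustration and are not part of any Hanlin or CoolReader API:

```python
import struct

# Field order and sizes follow the header table above.
# ASSUMPTION: little-endian, unpadded layout ("<").
HEADER_FORMAT = "<13sIHIHIBIII18sIBBH7sH18sHI27s"
HEADER_FIELDS = (
    "WolFileId", "Unknown0D", "Unknown11", "Unknown13", "DescriptionSize",
    "CoverSize", "Unknown1D", "CatalogSize", "Level1Items", "WolfPagesSize",
    "Unknown2A", "PageTableSize", "Unknown40", "BookType", "Unknown42",
    "Unknown44", "Unknown4B", "Unknown4D", "Level23Items", "SubCatalogOffs",
    "Unknown65",
)
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # the field sizes add up to 128


def parse_wolf_header(data):
    """Unpack the first 128 bytes of a WOLF file into a dict of named fields."""
    if len(data) < HEADER_SIZE:
        raise ValueError("need at least 128 bytes for a WOLF header")
    header = dict(zip(HEADER_FIELDS, struct.unpack_from(HEADER_FORMAT, data)))
    if header["WolFileId"] != b"WolfEbook1.11":
        raise ValueError("unexpected WolFileId, probably not a WOLF file")
    return header
```

Typical use would be `parse_wolf_header(open(path, "rb").read(128))`; the section sizes it returns are what you need to walk the rest of the file in the order listed under "Wolf file structure".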
2. Description
A string with 9 pseudo-XML elements. Every tag is followed by its value and a new line (#13#10). Tags with an empty value can't be omitted.
  <title>        - Title of book. Max - 128 chars (according to WolfDLL description).
  <subject>      - Subject, type. Max - 128 chars (according to WolfDLL description).
  <author>       - Author name. Max - 128 chars (according to WolfDLL description).
  <adpter>       - Adapter (???). Max - 128 chars (according to WolfDLL description).
  <translator>   - Translator. Max - 128 chars (according to WolfDLL description).
  <publisher>    - Publisher. Max - 128 chars (according to WolfDLL description).
  <time_publish> - Publishing time. Max - 16 chars (according to WolfDLL description).
  <introduction> - Annotation. Max - 1024 chars (according to WolfDLL description). This field allows new line chars (#10#13).
  <ISBN>         - ISBN. Max - 128 chars (according to WolfDLL description).
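Because every tag is always present and always in the same order, the Description section can be split by looking for each tag in turn. A rough sketch under that assumption; `parse_description` is a hypothetical helper, and values are kept as raw bytes because the text encoding is not documented above:

```python
DESCRIPTION_TAGS = (
    "title", "subject", "author", "adpter", "translator",
    "publisher", "time_publish", "introduction", "ISBN",
)


def parse_description(raw):
    """Split the Description section (bytes) into a {tag: value} dict.

    Each value runs from the end of its tag to the start of the next tag,
    so new lines embedded in <introduction> are left untouched.
    """
    markers = [("<%s>" % tag).encode("ascii") for tag in DESCRIPTION_TAGS]
    fields = {}
    pos = 0
    for i, marker in enumerate(markers):
        start = raw.find(marker, pos)
        if start < 0:
            raise ValueError("missing tag " + marker.decode("ascii"))
        start += len(marker)
        if i + 1 < len(markers):
            end = raw.find(markers[i + 1], start)
            if end < 0:
                raise ValueError("missing tag " + markers[i + 1].decode("ascii"))
        else:
            end = len(raw)
        value = raw[start:end]
        if value.endswith(b"\r\n"):  # drop only the terminating #13#10
            value = value[:-2]
        fields[DESCRIPTION_TAGS[i]] = value
        pos = end
    return fields
```

A value that itself contains the text of a later tag would confuse this; a real parser would need to be more careful.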
3. Cover
Section "Cover" contains description (header) and data for cover page:
{
  CoverHeader
  CoverData
}

3.1. CoverHeader
Length = 10 bytes.
{
  Compression  : WORD;
  ImageWidth   : WORD;
  BitsPerPixel : WORD;
  BytesPerLine : WORD;
  ImageHeight  : WORD;
}
where:
  [0x00] Compression: 0xFFFF for raw data (no compression); 0x0001 for LZSS compression.
  [0x02] ImageWidth: width of image (in pixels)
  [0x04] BitsPerPixel: 1 for monochrome images; 2 for 4-level gray images.
  [0x06] BytesPerLine: count of bytes in one row of image. (= ImageWidth*BitsPerPixel/8)
  [0x08] ImageHeight: height of image (in pixels)

3.2. CoverData
Raw bitmask for pixels (if Compression=0xFFFF) or compressed with LZSS (if Compression=0x0001).
Note that in monochrome images 1 is black and 0 is white; in 4-level gray images 0 is black and 3 is white.
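For an uncompressed cover (Compression=0xFFFF), the header and pixel polarity described above are enough to turn the data into gray levels. A sketch only: little-endian WORDs and most-significant-bit-first pixel packing are both assumptions, the function names are invented here, and LZSS decoding is not attempted:

```python
import struct


def parse_cover_header(data):
    """Unpack the 10-byte CoverHeader (ASSUMPTION: little-endian WORDs)."""
    names = ("Compression", "ImageWidth", "BitsPerPixel",
             "BytesPerLine", "ImageHeight")
    return dict(zip(names, struct.unpack_from("<5H", data)))


def unpack_raw_rows(pixels, header):
    """Convert raw (uncompressed) pixel data to rows of 0..255 gray values.

    ASSUMPTION: the first pixel sits in the most significant bits of a byte.
    """
    bpp = header["BitsPerPixel"]
    if bpp not in (1, 2):
        raise ValueError("only 1 or 2 bits per pixel are described")
    bpl = header["BytesPerLine"]
    per_byte = 8 // bpp
    mask = (1 << bpp) - 1
    rows = []
    for y in range(header["ImageHeight"]):
        line = pixels[y * bpl:(y + 1) * bpl]
        row = []
        for x in range(header["ImageWidth"]):
            shift = 8 - bpp * (x % per_byte + 1)
            value = (line[x // per_byte] >> shift) & mask
            if bpp == 1:
                row.append(0 if value else 255)  # monochrome: 1 is black
            else:
                row.append(value * 85)           # gray: 0 is black, 3 is white
        rows.append(row)
    return rows
```

The rows can be handed to any image library for viewing; if the output looks mirrored within each byte, the bit-order assumption is the first thing to flip.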
4. WolfPages
Pseudo-XML node; tags are written without dividers.
Warning: There is #13#10 between <wolf> and <catalog>, but not between </catalog> and </wolf>.
{
  <wolf>
  <catalog>
  <img bitcount=%B compact=1 width=%W height=%H length=%L>
  ImageData (for page 1)
  </img>
  <img bitcount=%B compact=1 width=%W height=%H length=%L>
  ImageData (for page 2)
  </img>
  ...
  <img bitcount=%B compact=1 width=%W height=%H length=%L>
  ImageData (for last page)
  </img>
  </catalog></wolf>
}
where:
  %B: Bits per pixel - 1 for monochrome images; 2 for 4-level gray images.
  %W: Width of image (in pixels)
  %H: Height of image (in pixels)
  %L: Length of ImageData
  ImageData: compressed bitmask stream.
Note that in monochrome images 1 is black and 0 is white; in 4-level gray images 0 is black and 3 is white.
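Since ImageData is binary, a reader has to use the length attribute to jump over it rather than scanning for `</img>`. A sketch of that idea, assuming the attributes are unquoted decimal numbers separated by single spaces exactly as in the template above (`iter_pages` is a name made up for this example):

```python
import re

# ASSUMPTION: attributes appear exactly as in the template, unquoted decimals.
IMG_TAG = re.compile(
    rb"<img bitcount=(\d+) compact=(\d+) width=(\d+) height=(\d+) length=(\d+)>"
)


def iter_pages(section):
    """Yield (offset, attrs, image_data) for every <img> in a WolfPages section.

    offset is relative to the start of the section passed in.
    """
    pos = 0
    while True:
        match = IMG_TAG.search(section, pos)
        if match is None:
            return
        bitcount, compact, width, height, length = (int(g) for g in match.groups())
        start = match.end()
        data = section[start:start + length]
        if len(data) != length:
            raise ValueError("image data is truncated")
        attrs = {"bitcount": bitcount, "compact": compact,
                 "width": width, "height": height, "length": length}
        yield match.start(), attrs, data
        pos = start + length  # skip the binary data; never scan it for tags
```

The image data itself is still compressed at this point; decoding it is a separate step.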
5. Catalog
Pseudo-XML node; tags are written without dividers (like new lines):
{
  <catalog>
  <item>ItemName</item>ItemOffs
  ...
  <item>ItemName</item>ItemOffs
  </catalog>
}
where:
  ItemName: Name of element from first level in TOC (table of contents).
  ItemOffs: Offset from beginning of section "WolfPages" to description of page (<img...>).
Every item represents one element from first level in TOC. If there is no TOC, catalog contains only one item with ItemName=BookTitle and ItemOffs=OffsetOfPage#1.
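The CatalogSize formula in the header notes (19 + 17 per item + names) works out if `<catalog>` plus `</catalog>` account for the 19 bytes and each item is `<item>` + `</item>` (13 bytes) plus a 4-byte ItemOffs. That 4-byte binary offset is an inference from the arithmetic, not something the text states, and little-endian is again assumed. A sketch on that basis, with hypothetical function names:

```python
import struct


def parse_catalog(section):
    """Return [(name_bytes, page_offset), ...] from a Catalog section.

    ASSUMPTION: ItemOffs is a 4-byte little-endian DWORD right after </item>.
    """
    if not section.startswith(b"<catalog>"):
        raise ValueError("section does not start with <catalog>")
    items = []
    pos = len(b"<catalog>")
    while section.startswith(b"<item>", pos):
        name_start = pos + len(b"<item>")
        name_end = section.index(b"</item>", name_start)
        offs_start = name_end + len(b"</item>")
        (offset,) = struct.unpack_from("<I", section, offs_start)
        items.append((section[name_start:name_end], offset))
        pos = offs_start + 4
    if not section.startswith(b"</catalog>", pos):
        raise ValueError("expected </catalog> after the last item")
    return items


def catalog_size(names):
    """CatalogSize as given in the header notes."""
    return 19 + 17 * len(names) + sum(len(name) for name in names)
```

Comparing `catalog_size` against the real section length of a sample file would be a quick way to confirm or refute the 4-byte guess.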
6. SubCatalog
Describes the full TOC. If there is no TOC, this section does not exist.
Pseudo-XML node; data are written without dividers (like new lines):
{
  <subcatalog>
  Item1
  Item2
  ...
  ItemN
  Names
  </subcatalog>
}
Every item represents one element from the TOC. Elements are ordered as follows: first all items in level 1, then all items in level 2, and then all items in level 3. Elements in the same level are ordered by their number (index) within the level.
6.1. SubCatalog Item
Length = 80 bytes.
{
  PageOffs     : DWORD;
  NameOffs     : DWORD;
  NameSize     : WORD;
  ChildsCount  : WORD;
  PrevPeerOffs : DWORD;
  NextPeerOffs : DWORD;
  ChildOffs    : DWORD;
  ParentOffs   : DWORD;
  Level3Idx    : Byte;
  Level2Idx    : Byte;
  Level1Idx    : Byte;
  AlignByte    : Byte;
  ItemName     : array[0..47] of Byte;
}
where:
  [0x00] PageOffs: offset from beginning of section "WolfPages" to description of page (<img...>).
  [0x04] NameOffs: offset from beginning of file to beginning of name (in "Names" area).
  [0x08] NameSize: length (in chars) of name (in "Names" area).
  [0x0A] ChildsCount: count of subitems for this element from TOC.
  [0x0C] PrevPeerOffs: offset from beginning of file to the description of previous peer element in SubCatalog table. 0, if current item is the first child of parent.
  [0x10] NextPeerOffs: offset from beginning of file to the description of next peer element in SubCatalog table. 0, if current item is the last child of parent.
  [0x14] ChildOffs: offset from beginning of file to the description of first child in SubCatalog table. 0, if there are no subitems (in TOC) for current item.
  [0x18] ParentOffs: offset from beginning of file to the description of parent element in SubCatalog table. 0, if element is on level 1.
  [0x1C] Level3Idx: level 3 index of element; 0 if element is on level 1 or level 2 in TOC.
  [0x1D] Level2Idx: level 2 index of element; 0 if element is on level 1 in TOC.
  [0x1E] Level1Idx: level 1 index of element.
  [0x1F] AlignByte: constant = 0x00.
  [0x20] ItemName: name of item in TOC.
All items are ordered by level: first all items in level 1, then all items in level 2, then all items in level 3.

6.2. Names
All names for subcatalog items, without dividers. Ends with constant = 0x08.
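The 80-byte SubCatalog items form a small linked tree through ChildOffs and NextPeerOffs. A sketch of reading one item and listing its children, assuming little-endian fields and that `data` holds the whole file (the link offsets are described as counted from the beginning of the file); `read_item` and `children` are illustrative names only:

```python
import struct

# ASSUMPTION: little-endian, unpadded; the sizes below add up to 80 bytes.
SUBCATALOG_ITEM = struct.Struct("<IIHHIIIIBBBB48s")
ITEM_FIELDS = (
    "PageOffs", "NameOffs", "NameSize", "ChildsCount", "PrevPeerOffs",
    "NextPeerOffs", "ChildOffs", "ParentOffs", "Level3Idx", "Level2Idx",
    "Level1Idx", "AlignByte", "ItemName",
)


def read_item(data, offset):
    """Read the SubCatalog item stored at a file offset."""
    return dict(zip(ITEM_FIELDS, SUBCATALOG_ITEM.unpack_from(data, offset)))


def children(data, parent):
    """Follow ChildOffs, then NextPeerOffs, to list an item's direct children."""
    result = []
    seen = set()
    offset = parent["ChildOffs"]
    while offset != 0 and offset not in seen:  # 0 ends the chain; guard loops
        seen.add(offset)
        item = read_item(data, offset)
        result.append(item)
        offset = item["NextPeerOffs"]
    return result
```

Starting from each level-1 item and calling `children` recursively should rebuild the three-level TOC, provided the offsets behave as described.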
7. PageTable
Pseudo-XML node; tags are written without dividers (like new lines):
{
  <pagetable ver="021211 ">
  Group1Offs    : DWORD;
  Group2Offs    : DWORD;
  GroupDiv1Offs : DWORD;
  Group3Offs    : DWORD;
  Group4Offs    : DWORD;
  GroupDiv2Offs : DWORD;
  Group5Offs    : DWORD;
  Group6Offs    : DWORD;
  GroupDiv3Offs : DWORD;
  Group7Offs    : DWORD;
  Group8Offs    : DWORD;
  GroupDiv4Offs : DWORD;
  EndFileOffs   : DWORD;
  </pagetable>
  PageOffsGroup1
  PageOffsGroup2
  GroupDivider1
  PageOffsGroup3
  PageOffsGroup4
  GroupDivider2
  PageOffsGroup5
  PageOffsGroup6
  GroupDivider3
  PageOffsGroup7
  PageOffsGroup8
  GroupDivider4
}
where:
  [0x00] Group1Offs: offset from beginning of file to PageOffsGroup1.
  [0x04] Group2Offs: offset from beginning of file to PageOffsGroup2.
  [0x08] GroupDiv1Offs: offset from beginning of file to GroupDivider1.
  [0x0C] Group3Offs: offset from beginning of file to PageOffsGroup3.
  [0x10] Group4Offs: offset from beginning of file to PageOffsGroup4.
  [0x14] GroupDiv2Offs: offset from beginning of file to GroupDivider2.
  [0x18] Group5Offs: offset from beginning of file to PageOffsGroup5.
  [0x1C] Group6Offs: offset from beginning of file to PageOffsGroup6.
  [0x20] GroupDiv3Offs: offset from beginning of file to GroupDivider3.
  [0x24] Group7Offs: offset from beginning of file to PageOffsGroup7.
  [0x28] Group8Offs: offset from beginning of file to PageOffsGroup8.
  [0x2C] GroupDiv4Offs: offset from beginning of file to GroupDivider4.
  [0x30] EndFileOffs: offset from beginning of file to end of file (size of file).
  GroupDividerX: constant = 0xFFFFFFFF.

PageOffsGroup1, PageOffsGroup2, PageOffsGroup3, PageOffsGroup4, PageOffsGroup7 and PageOffsGroup8 have the following structure:
{
  CatalogOffs : DWORD;
  Page1Offs   : DWORD;
  Page2Offs   : DWORD;
  Page2Offs   : DWORD;
  ...
  PageNOffs   : DWORD;
  PageNOffs   : DWORD;
}
where:
  [0x00] CatalogOffs: offset from beginning of file to beginning of Wolf catalog ("<catalog>" in section "WolfPages").
  [0x04] Page1Offs: offset from beginning of file to description of first page (<img...>).
  [0x08] Page2Offs: offset from beginning of file to description of second page (<img...>).
  [0x0C] -"-
  ...
  [0x??] PageNOffs: offset from beginning of file to description of last page (<img...>).
Minimal size of this structure is 2 items (8 bytes).

PageOffsGroup5 and PageOffsGroup6 have the following structure:
{
  CatalogOffs : DWORD;
  Page2Offs   : DWORD;
  Page3Offs   : DWORD;
  ...
  PageNOffs   : DWORD;
}
where all items are the same as in PageOffsGroup1 description. Minimal size of this structure is 1 item (4 bytes).
Note: For page table version "001" header is: <pagetable ver="001">
and all groups are like PageOffsGroup5 and PageOffsGroup6.
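As a sanity check on the ver="021211" layout, the pieces described above can simply be counted. This is a back-of-the-envelope sketch, not code from CoolReader; it assumes the space shown inside ver="021211 " really is part of the tag, and `page_table_size` is a made-up name:

```python
def page_table_size(pages, tag=b'<pagetable ver="021211 ">'):
    """Count the bytes of a ver="021211" PageTable for a given page count."""
    header = len(tag) + 13 * 4 + len(b"</pagetable>")  # tag + 13 DWORDs + close
    doubled = 2 * pages  # CatalogOffs, Page1Offs, then pages 2..N listed twice
    single = pages       # CatalogOffs, then pages 2..N listed once
    groups = 6 * doubled * 4 + 2 * single * 4  # groups 1-4, 7, 8 and groups 5, 6
    dividers = 4 * 4     # four 0xFFFFFFFF dividers
    return header + groups + dividers
```

With the trailing space in the tag this gives exactly 105 + 56 * PagesCount, matching the formula quoted for ver="021211". Counting the ver="001" layout the same way (eight single-entry groups) does not obviously give 105 + 60 * PagesCount, so that variant is best treated as unverified until checked against a real file.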
---
I'd suggest looking in the following files...
From the Hanlin SDK docs - /sdk_root/docs/Parser-Viewer.pdf
A more descriptive file (the 2nd codeblock above) can be found in the CoolReader sources... /crengine/docs/WolfFormat.txt from the Coolreader site.

Hope that helps!

Y.

#5  jeczmien 12-04-2007, 07:38 AM
Quote DaleDe
Is there a description of this format somewhere? Is anyone using it? Where can books in this format be obtained? Are there converters available?

There needs to be a wiki entry on this format and I could use some help.

Dale
I asked them for a definition, but they refused.
It's a proprietary Jinke format with DRM capability. They released MS Windows software so you can "print" to WOLF format (check the WOLFPrinter software; it's for the V8, but of course it works with the V3).

I'm not sure how closely you have checked your V3; maybe you have noticed it.
Near the battery there is a slot for a SIM card. It can be used for DRM'ing books.

Anyway, I've tried WOLF and I'm not impressed. For a regular user it offers nothing more than PDF (not even in file size).

#6  JSWolf 12-04-2007, 07:47 AM
Unless the V3 supports a DRM format that you can purchase at some ebook shop to get, say, the latest bestseller, it's going to tank. I've said this from day one, when they announced this Wolf format as their DRM format, and I still think so. The average person will have no idea what to do with a V3. It'll just be a case of "Oh, that looks nice" and then into some drawer to sit there unused. I know Sony has their BBeB format, but at least they have a shop and real books in that format. Wolf format has no bestsellers, and by wanting to keep Wolf a closed format Jinke is just shooting itself in the foot (so to speak).

#7  kovidgoyal 12-04-2007, 08:02 AM
@mrdini
Thanks

#8  DaleDe 12-05-2007, 02:26 PM
Quote jeczmien
I asked them for a definition, but they refused.
It's a proprietary Jinke format with DRM capability. They released MS Windows software so you can "print" to WOLF format (check the WOLFPrinter software; it's for the V8, but of course it works with the V3).

I'm not sure how closely you have checked your V3; maybe you have noticed it.
Near the battery there is a slot for a SIM card. It can be used for DRM'ing books.

Anyway, I've tried WOLF and I'm not impressed. For a regular user it offers nothing more than PDF (not even in file size).
I am beginning to believe that Wolf format is nothing more than bitmapped images with one image per page. Am I close?

#9  mrdini 12-05-2007, 02:37 PM
Quote DaleDe
I am beginning to believe that Wolf format is nothing more than bitmapped images with one image per page. Am I close?
Possibly.

But in the case of the V6, there definitely are 2 versions of WOLF files. Let me explain...

When you use the Hanlin printer, the output appears to be graphical/bitmapped, and the resulting output cannot be zoomed in/out. The result does look good, as the fonts match what's on your computer display.

When you use the Wolf application (Chinese-only(?)) to create the WOLF file, it is rendered as text, and the text CAN be zoomed in/out, and the text does reflow, which leads me to think it's actual text. However, the output from the Wolf application is hmmm.... passable :P Nothing to write home about!

Checking the code I gave earlier, it appears that the latter format isn't documented, although I could be wrong, as I haven't got enough C coding skills to be able to implement a Wolf parser yet...

Hope that helps!

Out of curiosity, what are you planning on doing with the WOLF format...

#10  DaleDe 12-05-2007, 02:59 PM
Quote mrdini
Possibly.

Hope that helps!

Out of curiosity, what are you planning on doing with the WOLF format...
Definitely helps. At this point I am trying to develop a WOLF format page for the wiki. You can check it to see what I have done so far. I am on a mission to try and document, to some degree, all of the eBook formats for the wiki.

Dale
