Mobileread
on-the-fly epub creation
#1  ilovejedd 04-14-2009, 03:59 PM
Need some (major) help...

I have a modified PHP script (based on FLAG) that dynamically creates a Stanza catalog for my favorite FanFiction.Net categories. It basically allows me to browse FanFiction.Net in real-time and creates epub-format ebooks on the fly using Calibre for reading in Stanza iPhone. The script currently runs on my Windows PC running XAMPP.

I have a shared Linux hosting account on 1and1 and I wish to upload the script there. My current dilemma is the epub creation part. I'm currently researching which methods I can use to create epub files using utilities that are already installed, or are user-installable, on the shared account. The Linux host has Perl, Python and PHP installed and the operating system is CentOS, iirc.

Options I'm considering:
  1. Install Calibre on Linux Host
    Pros:
    • no changes to the PHP script required
    Cons:
    • no idea how to do this or if it's even feasible
  2. BookGlutton API
    Pros:
    • seems like this might be the easiest to implement
    Cons:
    • dependent on another website
    • don't know how I'm supposed to handle the post request
  3. DocBook+XSLT
    Pros:
    • seems like the dependencies should already be installed or are user-installable (no admin rights required)
    Cons:
    • don't know a thing about docbook
    • don't know a thing about xslt
  4. Code my own PHP script to create epubs
    Pros:
    • highly customizable
    Cons:
    • I might be able to finish this in 2 years if I'm lucky
Right now, I'm thinking using the BookGlutton API might be the best option for me (unless, of course, it's possible to install Calibre or at least html2epub on a shared host). I'm just not sure how I'm supposed to handle the post requests via PHP. Currently, I have an epub.php script that calls html2epub and returns the epub file. I guess I could modify this to send a post request to BookGlutton instead. I just don't know how, particularly the part where you upload the html file.

Anyway, not really looking for a discussion on the merits of the different methods. Just asking for help on the how. If you know of another way to do this (preferably something even an inexperienced coder can do), please post it here.

Thanks!
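
For the POST-with-upload part asked about above, a minimal sketch using PHP's cURL extension might look like the following. The endpoint URL and the form field name `file` are placeholders, not BookGlutton's actual API; the real values would have to come from their documentation. `build_upload_fields()` and `post_html_file()` are hypothetical helper names.

```php
<?php
// Sketch only: the URL and field name used with these helpers are assumptions,
// not a documented API.

/** Build the multipart form fields for uploading one HTML file. */
function build_upload_fields(string $htmlPath, string $fieldName = 'file'): array
{
    // CURLFile tells cURL to send the file as a multipart/form-data upload.
    return [$fieldName => new CURLFile($htmlPath, 'text/html', basename($htmlPath))];
}

/** POST the file and return the raw response body, or throw on failure. */
function post_html_file(string $url, string $htmlPath): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => build_upload_fields($htmlPath),
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_TIMEOUT        => 60,
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Upload failed: ' . curl_error($ch));
    }
    return $body;
}
```

An existing epub.php could then call something like `post_html_file('https://example.com/convert', 'story.html')` and pass the response on to the reader, assuming the service returns the epub in the response body.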

#2  kovidgoyal 04-14-2009, 04:09 PM
The calibre binary installer is (almost) fully self-contained, so you should be able to install it on a shared host.

#3  ilovejedd 04-14-2009, 04:59 PM
Thanks! That's good to know.

I don't have secure shell/terminal access to the shared host. Can I just extract the tarball on my home computer and upload via ftp? What does calibre_postinstall do? The binary installer seems to call it at the end. Is it necessary to run it?

Again, thank you very, very much!

#4  kovidgoyal 04-14-2009, 05:01 PM
No, you should be able to run it without running postinstall (postinstall just sets up integration with the host OS, which you don't need if all you want to do is conversions). I don't know if FTP will preserve execute permissions on the files in the tarball, though.
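
If FTP does drop the execute bits and there is no shell access, one possible workaround is to restore them from PHP itself. A rough sketch, assuming the host allows `chmod()` on your own files; `make_executable()` is a hypothetical helper, and the directory you pass it is wherever the extracted tarball was uploaded:

```php
<?php
/**
 * Set mode 0755 on every regular file under $dir.
 * Returns the number of files changed. Sketch only; a real install may
 * only need the launcher binaries made executable.
 */
function make_executable(string $dir): int
{
    $count = 0;
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if ($file->isFile() && chmod($file->getPathname(), 0755)) {
            $count++;
        }
    }
    return $count;
}
```

Run once from a browser, e.g. `echo make_executable(__DIR__ . '/calibre');` (the `calibre` directory name is made up). Many FTP clients can also set permissions directly if the server allows it.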

#5  DigitalFeonix 04-14-2009, 09:35 PM
Quote ilovejedd
Need some (major) help...
**snip**
I created my own script similar to FLAG some time ago to create .oeb files for my EB-1150 from stories on portkey.org and fanfiction.net. I have updated it to output .epub using a class to zip up the data. If you have customized FLAG, this is pretty easy to integrate as an output method.

Usage as follows:

PHP Code
$tstamp = time(); // timestamp for zip entries
$epub   = new ZipCreate();

$prev_encoding = $epub->ztype;
$epub->ztype = 'store';
$epub->add_file('application/epub+zip', 'mimetype', $tstamp);
$epub->ztype = $prev_encoding;

// add container
$epub->add_file($container, 'META-INF/container.xml', $tstamp);

// add opf
$epub->add_file($opf, 'OEBPS/content.opf', $tstamp);

// add toc
$epub->add_file($toc, 'OEBPS/toc.ncx', $tstamp);

// add your xhtml and CSS and pictures and fonts here

// finish it up and download
$output_file = $epub->build_zip();
$output_name = $story['title'] . '.epub';
$output_mime = 'application/epub+zip';

header('Content-Type: application/x-download');
header('Content-Length: ' . strlen($output_file));
header('Content-Disposition: attachment; filename="' . $output_name . '"');
header('Content-Transfer-Encoding: binary');

echo $output_file;
[zip] zipcreate.cls.zip (4.6 KB, 550 views)
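
For anyone without the attached class, roughly the same packaging can be sketched with PHP's bundled `ZipArchive` extension. This is a swapped-in alternative, not the ZipCreate class used above. It assumes PHP 7.0+ with the zip extension enabled; `build_epub()` and its `$files` parameter are hypothetical names. The container.xml contents are the standard boilerplate pointing at `OEBPS/content.opf`, which the usage above passes in as `$container` without showing it.

```php
<?php
/**
 * Write an EPUB to $path. $files maps archive paths
 * (e.g. 'OEBPS/content.opf') to their string contents.
 */
function build_epub(string $path, array $files): void
{
    $container = <<<XML
<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
    <rootfiles>
        <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
    </rootfiles>
</container>
XML;

    $zip = new ZipArchive();
    if ($zip->open($path, ZipArchive::CREATE | ZipArchive::OVERWRITE) !== true) {
        throw new RuntimeException("Cannot create $path");
    }

    // The mimetype entry must come first and must be stored uncompressed.
    $zip->addFromString('mimetype', 'application/epub+zip');
    $zip->setCompressionName('mimetype', ZipArchive::CM_STORE);

    $zip->addFromString('META-INF/container.xml', $container);
    foreach ($files as $name => $contents) {
        $zip->addFromString($name, $contents);
    }
    $zip->close();
}
```

The result is a file on disk rather than a string, so it would be sent with `readfile()` instead of `echo`. It is worth running the output through an EPUB validator, since readers can be picky about the mimetype entry.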

#6  ilovejedd 04-14-2009, 10:45 PM
Quote DigitalFeonix
I created my own script similar to FLAG some time ago to create .oeb files for my EB-1150 from stories on portkey.org and fanfiction.net. I have updated it to output .epub using a class to zip up the data. If you have customized FLAG, this is pretty easy to integrate as an output method.
Thanks! That looks pretty cool. If installing Calibre doesn't pan out, I might work with this. However, I didn't see any code for opf and ncx creation, and those are the parts I'm still trying to figure out how to handle. I tried reading the IDPF spec, but ADHD kicked in before anything could sink in.

#7  nrapallo 04-14-2009, 11:26 PM
Quote DigitalFeonix
I created my own script similar to FLAG some time ago to create .oeb files for my EB-1150 from stories on portkey.org and fanfiction.net.
Wow, that would make a great Impserve plugin... but in reverse, that is, .epub to .imp and served up to the EBW1150.

p.s. care to share your .oeb script? Inquiring minds would like to know...

#8  DigitalFeonix 04-15-2009, 11:34 AM
Essentially the script takes three arguments: a site, a story id and an output format. I slurp the whole story into an associative array, running it through htmlpurifier as it's brought in, and output using the desired format.
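
To make the snippets in this post easier to follow, here is a guess at the shape of that associative array, inferred only from the keys the snippets use (`title`, `author`, `chapter_count`, `chapters[$i]['title']`). The `body` key and the `chapter_filename()` and `chapter_xhtml()` helpers are hypothetical additions, not part of the original script.

```php
<?php
// Assumed layout of $story; chapters are indexed from 1 to match the loops.
$story = [
    'title'         => 'Example Story',
    'author'        => 'Example Author',
    'chapter_count' => 2,
    'chapters'      => [
        1 => ['title' => 'Chapter One', 'body' => '<p>First chapter text.</p>'],
        2 => ['title' => 'Chapter Two', 'body' => '<p>Second chapter text.</p>'],
    ],
];

/** Zero-padded file name for chapter $i, e.g. chapter-001.xhtml. */
function chapter_filename(int $i): string
{
    return sprintf('chapter-%03d.xhtml', $i);
}

/** Wrap one chapter's (already purified) HTML body in a minimal XHTML page. */
function chapter_xhtml(array $chapter): string
{
    $title = htmlspecialchars($chapter['title'], ENT_QUOTES, 'UTF-8');
    return <<<XHTML
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>{$title}</title></head>
<body>
<h2>{$title}</h2>
{$chapter['body']}
</body>
</html>
XHTML;
}
```

With the ZipCreate usage from earlier in the thread, each chapter could then be added with something like `$epub->add_file(chapter_xhtml($story['chapters'][$i]), 'OEBPS/' . chapter_filename($i), $tstamp);` at the "add your xhtml" step.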

For epub the .opf is created using this

PHP Code
/******************************************************************************
    CONTENT.OPF
******************************************************************************/

// create info for XML
$manifest   = '';
$spine      = '';

for ($i = 1; $i <= $story['chapter_count']; $i++)
{
    $id = sprintf('%03d', $i);

    $manifest  .= '        <item id="chapter-' . $id . '" href="chapter-' . $id . '.xhtml" media-type="application/xhtml+xml"/>' . "\n";
    $spine     .= '        <itemref idref="chapter-' . $id . '"/>' . "\n";
}

$story_title = $story['title'];

// add the OPF info
// ($qm and $UID are set elsewhere in the script; $qm presumably holds '?')
$opf = <<<EOM
<{$qm}xml version="1.0"{$qm}>
<package xmlns="http://www.idpf.org/2007/opf" unique-identifier="bookid" version="2.0">
    <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title>{$story_title}</dc:title>
        <dc:identifier id="bookid">urn:uuid:{$UID}</dc:identifier>
        <dc:language>en</dc:language>
        <dc:creator>{$story['author']}</dc:creator>
        <dc:publisher>DigitalFeonix</dc:publisher>
        <dc:rights>Public Domain</dc:rights>
        <dc:subject>FanFiction</dc:subject>
    </metadata>
    <manifest>
        <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
        <item id="cover" href="cover.xhtml" media-type="application/xhtml+xml"/>
{$manifest}
        <item id="backcover" href="backcover.xhtml" media-type="application/xhtml+xml"/>
    </manifest>
    <spine toc="ncx">
        <itemref idref="cover"/>
{$spine}
        <itemref idref="backcover"/>
    </spine>
</package>

EOM;

$epub->add_file($opf, 'OEBPS/content.opf', $tstamp);
and the toc is created using
PHP Code
/******************************************************************************
    TOC.NCX
******************************************************************************/

// create info for XML
$navpoint   = '';

for ($i = 1; $i <= $story['chapter_count']; $i++)
{
    $id     = sprintf('%03d', $i);
    $iplus  = $i + 1;

    $chapter_title = $story['chapters'][$i]['title'];

    $navpoint  .= <<<EOM
        <navPoint id="navpoint-{$iplus}" playOrder="{$iplus}">
            <navLabel>
                <text>{$chapter_title}</text>
            </navLabel>
            <content src="chapter-{$id}.xhtml"/>
        </navPoint>

EOM;
}

$iplus = $i + 1;

//
$toc = <<<EOM
<{$qm}xml version="1.0" encoding="UTF-8" {$qm}>
<!DOCTYPE ncx PUBLIC "-//NISO//DTD ncx 2005-1//EN" "http://www.daisy.org/z3986/2005/ncx-2005-1.dtd">
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
    <head>
        <meta name="dtb:uid" content="{$UID}"/>
        <meta name="dtb:depth" content="1"/>
        <meta name="dtb:totalPageCount" content="0"/>
        <meta name="dtb:maxPageNumber" content="0"/>
    </head>
    <docTitle>
        <text>{$story_title}</text>
    </docTitle>
    <docAuthor>
        <text>{$story['author']}</text>
    </docAuthor>
    <navMap>
        <navPoint id="navpoint-1" playOrder="1">
            <navLabel>
                <text>Cover</text>
            </navLabel>
            <content src="cover.xhtml"/>
        </navPoint>
{$navpoint}
        <navPoint id="navpoint-{$iplus}" playOrder="{$iplus}">
            <navLabel>
                <text>Backcover</text>
            </navLabel>
            <content src="backcover.xhtml"/>
        </navPoint>
    </navMap>
</ncx>

EOM;

$epub->add_file($toc, 'OEBPS/toc.ncx', $tstamp);
The .oeb script takes the same associative array and outputs the flat .oeb file with the mime wrapping, building the .opf part the same way. This .oeb script was intended to create files suitable for upload to the eBookwise personal content server.
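
Two things the snippets above leave open are where `$UID` comes from and what happens when a title contains `&` or `<`. A minimal sketch of both, assuming PHP 7+ for `random_bytes()`; `uuid_v4()` and `xml_text()` are hypothetical helpers, not part of the original script:

```php
<?php
/** Random (version 4) UUID, suitable for the urn:uuid: identifier. */
function uuid_v4(): string
{
    $bytes = random_bytes(16);
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40); // set version to 4
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80); // set RFC 4122 variant
    // 32 hex digits split into 8 groups of 4, joined as 8-4-4-4-12.
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}

/** Escape text before interpolating it into the OPF or NCX heredocs. */
function xml_text(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES | ENT_XML1, 'UTF-8');
}
```

These would be used as `$UID = uuid_v4();` and `$story_title = xml_text($story['title']);` before the heredocs are built. A stable identifier derived from the site and story id might be preferable to a random one, so that re-downloading a story keeps the same book id.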

#9  nrapallo 04-15-2009, 12:01 PM
Quote DigitalFeonix
Essentially the script takes three arguments; a site, a story id and an output format. I slurp the whole story into an associative array - using htmlpurifier as it's brought in - and output using the desired format.

For epub the .opf is created using this

PHP Code
**snip** 
The .oeb script takes the same associative array and outputs the flat .oeb file with the mime wrapping, building the .opf part the same way. This .oeb script was intended to create files suitable for upload to the eBookwise personal content server.
Good to know. I take it that the server-side .php code would have to be installed on a personal server. Would this be easy to port to Python, i.e. as an Impserve plugin?

I'm working on a Perl script, Epub2IMP.pl, that will convert any .epub to .imp after it is tweaked to accommodate some shortcomings of the ETI eBook Publisher software. It seems that any <img src> with a width=100% stretches the image without regard to the image's aspect ratio. Also, any CSS applied to <div class=>'s doesn't appear to be honoured, so the content must be wrapped within a <p class=> </p> with the same CSS class reference.

To boot, within a .opf, even capitalized Dublin Core metadata elements, e.g. <dc:Title>, cause problems. My Perl script will do many text substitutions to alleviate these issues. Hopefully, ETI will improve their .epub support, especially since they co-authored many of the standards involved.
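
The Dublin Core fix is the sort of substitution that is easy to illustrate. The sketch below is in PHP to match the rest of the thread (the actual Epub2IMP.pl is Perl and is not shown here), and `lowercase_dc_elements()` is a hypothetical name. Assuming the goal is to lowercase the element names, it rewrites the local name of any `dc:` element in an .opf string:

```php
<?php
/** Lowercase element names under the dc: prefix, e.g. <dc:Title> becomes <dc:title>. */
function lowercase_dc_elements(string $opf): string
{
    // Match the start of an opening or closing tag: "<dc:Name" or "</dc:Name".
    // Attributes such as xmlns:dc are untouched because they do not follow "<".
    return preg_replace_callback(
        '~<(/?)dc:([A-Za-z]+)~',
        function (array $m): string {
            return '<' . $m[1] . 'dc:' . strtolower($m[2]);
        },
        $opf
    );
}
```

A regex is enough here because only tag names change; anything that restructures the markup, such as the <div> to <p> wrapping, would be safer with a real XML parser.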

#10  ilovejedd 04-15-2009, 01:40 PM
Thanks DigitalFeonix! Those scripts really help a lot. I'm still going to try to get Calibre working, but if that doesn't work out, I now have fallback #4, except you've done the job for me. Haven't been able to test anything, though, since I'm experiencing some weird issues with 1and1 mod_rewrite. The .htaccess file I use for my local XAMPP server doesn't want to work with 1and1 so I'm slowly trying to troubleshoot it.

If/when I get this working, I can start on making the covers look spiffy with ImageMagick.

@nrapallo
The PHP scripts don't look complicated at all, barring the ZipCreate class. That, though, I attribute to my lack of knowledge of the zip file structure. Seems like that's the only thing you really need to port to Python. The rest is basically just creating text files.
