Mobileread
Bytes per page data for a sampling of books
#1  j.p.s 12-06-2019, 06:09 PM
This thread is about bytes-per-page data, based on Amazon-supplied apnx files, for a sampling of books. It is not about whether page numbers make sense for ebooks, or which page numbering scheme is better or worse than any other. The attached plots show byte counts on a page-by-page basis for several books; the counts vary within a book and from book to book.

The variation between books is mostly due to differences in formatting, some of which might improve rendering and some of which might be useless boilerplate. Within a book, variation comes from partial pages at the end of chapters, formatting for first pages of chapters, and the presence of tables, figures, and images.

In most cases, a horizontal line can be imagined going through a typical value for a particular book. But end matter such as end notes, bibliographies, and indices can have their own typical bytes per page, which might be quite different from the rest of the book. The byte size of these sections can make them appear to make up more of the book than they actually do. For example,
Title                          Pages               Locations
A Brief History of Everyone    363 / 402 =  90%     5044 /  7586 = 66%
Bad Blood                      299 / 341 =  88%     4695 /  5462 = 86%
The Hidden Life of Trees       245 / 271 =  90%     2683 /  3506 = 77%
Hidden Figures                 271 / 350 =  77%     4318 /  8009 = 54%
Silent Spring                  296 / 297 = 100%     3975 /  5653 = 70%
The Fifties                    732 / 801 =  91%    13167 / 17018 = 77%
Note that Kindles compute the displayed percentage from locations, which are based on bytes, regardless of whether location, page, or time is shown as the progress indicator.
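The arithmetic behind the table can be reproduced with a short Python sketch. It assumes the first number in each pair marks where the main text ends and the second is the book's total; the names `BOOKS`, `percent`, and `progress_gap` are made up for illustration. The last column it prints is how many percentage points the location-based figure trails the page-based one.

```python
# (title, main_pages, total_pages, main_locations, total_locations),
# copied from the table in this post. Assumption: the first number of each
# pair is where the main text ends, the second is the total for the book.
BOOKS = [
    ("A Brief History of Everyone", 363, 402, 5044, 7586),
    ("Bad Blood", 299, 341, 4695, 5462),
    ("The Hidden Life of Trees", 245, 271, 2683, 3506),
    ("Hidden Figures", 271, 350, 4318, 8009),
    ("Silent Spring", 296, 297, 3975, 5653),
    ("The Fifties", 732, 801, 13167, 17018),
]


def percent(part, whole):
    """Return part/whole as a whole-number percentage."""
    return round(100 * part / whole)


def progress_gap(books):
    """For each book, return (title, page %, location %, difference in points)."""
    rows = []
    for title, pages, total_pages, locs, total_locs in books:
        page_pct = percent(pages, total_pages)
        loc_pct = percent(locs, total_locs)
        rows.append((title, page_pct, loc_pct, page_pct - loc_pct))
    return rows


for title, page_pct, loc_pct, gap in progress_gap(BOOKS):
    print(f"{title:<30} pages {page_pct:>3}%  locations {loc_pct:>3}%  gap {gap:>2}")
```

A large gap means the end matter takes up far more bytes than pages, so the percentage shown on the device will look low when the main text is finished.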
BytesPerPage.png BytesPerPageNF.png 

#2  j.p.s 12-06-2019, 06:33 PM
The table in post #1 took so long that I forgot to attach and comment on the data plots. Most books seem to be between 2500 and 3000 bytes per page, but there can be quite a bit of variation, with Ford's autobiography around 2000 and Utopia For Realists around 1500.

Hidden Figures is 2500ish in the main text, but one end matter section runs over 5000 bytes per page and another well over 25,000.
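For anyone wanting to get per-page figures like these, here is a minimal Python sketch. It assumes the page-start byte offsets have already been extracted from an apnx file (the parsing itself is not shown), and it uses the median as the "typical value" line. The function names and the sample offsets are made up for illustration and are not from any real book.

```python
from statistics import median


def bytes_per_page(page_offsets, total_bytes):
    """Byte count of each page, given the byte offset where each page starts.

    page_offsets: increasing list of page-start offsets (hypothetical input,
                  assumed to have been read from an apnx file beforehand).
    total_bytes:  size of the book text, used to close off the last page.
    """
    ends = list(page_offsets[1:]) + [total_bytes]
    return [end - start for start, end in zip(page_offsets, ends)]


def outlier_pages(sizes, factor=2.0):
    """1-based page numbers whose size exceeds factor times the median size."""
    typical = median(sizes)
    return [i + 1 for i, size in enumerate(sizes) if size > factor * typical]


# Invented offsets: mostly ~2700-byte pages, a short chapter-end page,
# and a couple of heavy end-matter pages.
offsets = [0, 2600, 5300, 8000, 9200, 11900, 37900]
sizes = bytes_per_page(offsets, total_bytes=43000)

print("bytes per page:", sizes)
print("typical (median):", median(sizes))
print("pages over 2x typical:", outlier_pages(sizes))
```

The median is used rather than the mean so that a few very heavy end-matter pages do not drag the typical value upward.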

#3  jhowell 12-06-2019, 10:56 PM
That is interesting. But it makes me wonder, are you working toward something in this analysis?

#4  j.p.s 12-07-2019, 11:19 AM
Quote jhowell
That is interesting. But it makes me wonder, are you working toward something in this analysis?
Maybe. I might post an update if anything interesting shows up in a larger number of books. This is mainly a result of exploring apnx files, and is posted for future reference and in the hope that others might find it informative.
