Old 07-14-2014, 08:22 PM   #1
martienne
.~^пиратка^~.
Posts: 238
Karma: 14000
Join Date: Sep 2009
Location: Ask NSA...
Device: Onyx Boox M92
How to copy HTML and keep the formatting?

I'd like to copy this text (book of Genesis from the Bible) and get the verse numbers in the small size!

The verse numbers appear NORMAL size when they should be very small and raised (I've forgotten what that's called). I can turn them off, which would solve the problem, but traditionally when reading the Bible, like most people, I am used to having verse numbers...


Bibel 2000: 1 Moseboken


Is this possible? When I copy and paste formatted into Word, I get the numbers in normal size. Any way to get the verse numbers small, and raised?
Old 07-14-2014, 08:42 PM   #2
Hitch
Bookmaker & Cat Slave
Posts: 11,462
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
As I can't read the site, I don't know if this is copyrighted--I'd imagine that the material may not be (Bible), but the site itself likely is. I can't read the terms of service, which makes me not cheery about answering this particular question.

However, it doesn't really matter, because in short, no, you can't. I viewed the page's source, and they actually went to some lengths to ensure that if you want to "rip it," you'll have to do some work, as Kim already mentioned in his post (about the verse numbers). You may have missed it in your rush, but he already told you that you'd have to superscript the numbers by hand. Now, you can probably write a bit of regex, but you'll probably still have to do them one at a time, to be sure you don't superscript a number used elsewhere.
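
For the regex route, here is a minimal Python sketch. It assumes the pasted text has each verse number glued directly to the first letter of the verse (as in "10Gud kallade ..."), which is an assumption about the paste and not something checked against the site. The helper name `superscript_verse_numbers` is made up for illustration.

```python
import re

# Assumption: a verse number is 1-3 digits glued directly to the letter that
# starts the verse ("10Gud ..."). Numbers followed by a space or punctuation
# ("40 dagar") are treated as ordinary numbers and left alone.
VERSE_NUMBER = re.compile(r"(?<!\d)(\d{1,3})(?=[^\W\d_])")


def superscript_verse_numbers(text: str) -> str:
    """Wrap every number that is glued to a following letter in <sup> tags."""
    return VERSE_NUMBER.sub(r"<sup>\1</sup>", text)


sample = "10Gud kallade det torra landet jord. Det regnade i 40 dagar."
print(superscript_verse_numbers(sample))
```

The result still needs a skim afterwards, since any other number that happens to sit directly in front of a letter would be superscripted too.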

The text isn't in the source for the page. Therefore, no, there's no easy way to "rip" their text.

FWIW.
Hitch
Old 07-14-2014, 09:37 PM   #3
martienne
Quote:
Originally Posted by Hitch View Post
The text isn't in the source for the page. Therefore, no, there's no easy way to "rip" their text.

FWIW.
Hitch
Thanks, Hitch. Yes, I can see that the text is not in the source. Out of interest, where is it, then? Text doesn't appear out of nowhere. I didn't check carefully, but at a quick glance I noticed no iframe or PHP call. Yet the text is flowing; it's not Flash or a picture. I just don't know how they format that little number. It's got to be either a smaller, raised size or a different font. If it's in a database and formatted, it still should be possible to grab it.
Old 07-15-2014, 03:52 AM   #4
Jellby
frumious Bandersnatch
Posts: 7,516
Karma: 18512745
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
The superscripts are done like this:

Code:
<span id="v_0_9" class="v">10</span>Gud kallade det torra landet jord, och vattenmassan kallade han hav. Och Gud såg att det var gott.
with:

Code:
span.v {
    vertical-align: top;
    font-size: 0.6em;
}
Nothing particularly fancy.
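
Since the styling lives in a stylesheet rather than in the markup itself, one possible workaround is to turn each verse span into a plain `<sup>` element, which does not depend on the stylesheet. A rough Python sketch follows; the function name `spans_to_sup` is made up, and whether Word then keeps the superscript on paste is untested here.

```python
import re

# Matches the verse-number markup quoted above: <span id="..." class="v">10</span>.
# Assumption: the class attribute is exactly class="v" (or class='v') and the
# span contains only digits.
VERSE_SPAN = re.compile(r"""<span\b[^>]*\bclass=["']v["'][^>]*>(\d+)</span>""")


def spans_to_sup(html: str) -> str:
    """Replace each verse-number span with a plain <sup> element."""
    return VERSE_SPAN.sub(r"<sup>\1</sup>", html)


line = '<span id="v_0_9" class="v">10</span>Gud kallade det torra landet jord'
print(spans_to_sup(line))
```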

The text is inserted with JavaScript, replacing this placeholder:

Code:
<div class="bible_content" id="bible_content__2k__1_mos">
  <div class="bible_content_notincluded">1 Mos</div>
</div>

Last edited by Jellby; 07-15-2014 at 03:55 AM.
Old 07-15-2014, 03:58 AM   #5
elibrarian
Imperfect Perfectionist
Posts: 464
Karma: 724664
Join Date: Dec 2011
Location: Ølstykke, Denmark
Device: none
Quote:
As I can't read the site, I don't know if this is copyrighted--I'd imagine that the material may not be (Bible), but the site itself likely is. I can't read the terms of service, which makes me not cheery about answering this particular question.
@Hitch: This particular translation is owned by the Swedish State. It is ©'ed, and there are royalty fees for using it commercially (printed editions) or redistributing larger "chunks" electronically, but you can (re)use it freely in smaller portions or for personal use.

Regarding the formatting, the chapter numbers are in a <span class='ch'> and the verse numbers in a <span class='v'>. If you want that, you'll have to select the text you want, use something like Firefox's "Show source code", and copy from there.

Then you'll have to learn some CSS and such stuff

- and that's about as detailed as I will get - others have written whole books on HTML and CSS, so I won't get into that.

Regards,

Kim
Old 07-15-2014, 04:33 AM   #6
Hitch
Thanks, Kim.

Hitch
Old 07-15-2014, 06:00 AM   #7
martienne
Thanks again, Kim and also Jellby for the examples!

So they are sending the text as HTML, using JavaScript, and then formatting it with CSS? Hypothetically, if it were possible to intercept the JavaScript and grab that HTML, I could easily write a CSS class locally, or use their stylesheet if it's possible to grab that. All that's needed is to grab the HTML from the JavaScript. I'm sure that can be done. I'm not good at this kind of stuff, but it's not totally alien either.

Quote:
Originally Posted by elibrarian
This particular translation is owned by the Swedish State. It is ©'ed, and there are royalty fees for using it commercially (printed editions) or redistributing larger "chunks" electronically, but you can (re)use it freely in smaller portions or for personal use.
As for the copyright, in line with what Kim said: for me personally, 1) I am a citizen of Sweden, and 2) I am a lifelong member of the church of S. and pay taxes to it despite never even attending.

This is not a normal piracy situation.

Rest assured, if it were available to buy and download as an EPUB or PDF, I would. However, the only shop that sells it is closed, and they only sell physical CDs anyway. I simply want a personal copy of this Bible to put on my e-reader.

If the state chooses to use a monopolistic distributor, not offer a proper e-book, and close the shop for a whole month in the summer, then this will inevitably happen.

The translation situation is different with Swedish. This version is one of only two Bible versions (i.e. post-spelling/grammar reform) that a modern person can read without getting seriously distracted by the archaic language. The other version was privately funded by Christians who disagreed with the state version; obviously it should be paid for, since the sponsors are out of pocket.

The Swedish state didn't commission Bibel 2000 to make money from it: it's not a commercial translation. They simply wanted to get people a Bible in modern language that adheres to political correctness as far as possible, back when the church and state were still one. The reason they don't offer an e-book is most likely because it simply didn't occur to them.

Finally; what do you think Jesus would make of attempts to restrict access to the Bible on grounds of copyright?

Last edited by martienne; 07-15-2014 at 06:45 AM.
Old 07-15-2014, 07:58 AM   #8
martienne
Got it! The HTML shows up in the Elements panel of Chrome's developer tools. It didn't show at a first glance in the most basic, immediately accessible source view in Firefox, but Chrome kept track of it in its more advanced source interface. I'm sure I could have got it in Firefox too; I'm just not set up for development.

So the trick is to copy the core text as HTML, which is possible.
Next step is to apply the CSS.
Then save it as something that can be converted to epub or PDF.
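
The three steps above could look roughly like this in Python. This is a sketch only: the CSS rule is the one Jellby quoted, while the function name, the file name and the title are made up for illustration.

```python
import tempfile
from pathlib import Path

# The CSS rule is the one quoted earlier in the thread; everything else
# (function name, file name, title) is made up for illustration.
VERSE_CSS = """span.v {
    vertical-align: top;
    font-size: 0.6em;
}"""


def wrap_fragment(fragment: str, title: str) -> str:
    """Wrap a copied HTML fragment in a complete page that declares UTF-8."""
    return (
        "<!DOCTYPE html>\n"
        '<html lang="sv">\n<head>\n'
        '<meta charset="utf-8">\n'
        f"<title>{title}</title>\n"
        f"<style>\n{VERSE_CSS}\n</style>\n"
        "</head>\n<body>\n"
        f"{fragment}\n"
        "</body>\n</html>\n"
    )


fragment = '<span id="v_0_9" class="v">10</span>Gud kallade det torra landet jord'
page = wrap_fragment(fragment, "1 Moseboken")

# Write with the same encoding the <meta> tag declares.
out = Path(tempfile.gettempdir()) / "1_mos.html"
out.write_text(page, encoding="utf-8")
print(out)
```

Declaring the charset in the page and writing the file in that same encoding is meant to keep å, ä and ö intact when the file is opened or converted later.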

I grabbed the stylesheet from the site, and imported into Dreamweaver, so now I just need to fix up the styles a bit.

Then repeat it 66 times, once for each book in the Bible.

It was not easy, but certainly possible without doing anything particularly complex.
The CSS comes out a bit funny, but it shouldn't be a problem to fix.

Last edited by martienne; 07-15-2014 at 08:03 AM.
Old 07-15-2014, 11:38 AM   #9
martienne
I managed to generate this, with the help of HTML/CSS.
But what's going on with the Swedish characters?

Old 07-15-2014, 12:15 PM   #10
Jellby
Wrong encoding. It looks like it's UTF-8 interpreted as Latin-1.
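
A small Python illustration of that kind of mix-up, using a sentence from the verse quoted earlier in the thread (the variable names are arbitrary):

```python
original = "Och Gud såg att det var gott."

# UTF-8 bytes read back as Latin-1: each non-ASCII letter turns into two characters.
garbled = original.encode("utf-8").decode("latin-1")
print(garbled)  # Och Gud sÃ¥g att det var gott.

# If no bytes were lost along the way, reversing the two steps restores the text.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired == original)  # True
```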
Old 07-15-2014, 12:23 PM   #11
martienne
Thanks! I discovered the problem: it had the wrong ISO charset. It needed Latin 1 (Western Europe).
Old 07-15-2014, 01:20 PM   #12
eschwartz
Ex-Helpdesk Junkie
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
I just opened the page in Firefox and opened the developer console to Network (CTRL+SHIFT+Q). Reloaded the page. Looked at responses. After a few tries, got this: http://www.bibeln.se/las/2k/1_mos?content=pure which appears to be the first page.

Also, this looks important: http://www.bibeln.se/las/2k/1_mos?content=annotations
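
For anyone who would rather script this than click through the Network tab, here is a rough Python sketch built around the two URLs above. Only the `1_mos` path and the `pure` and `annotations` values come from this post; the function names are made up, the names of the other books are not known here, and the site may have changed since.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "http://www.bibeln.se/las/2k"  # taken from the two URLs in this post


def content_url(book: str, kind: str = "pure") -> str:
    """Build a URL like the ones above, e.g. book='1_mos', kind='pure'."""
    return f"{BASE}/{book}?{urlencode({'content': kind})}"


def fetch(url: str) -> str:
    """Download a URL and decode it with the charset the server declares."""
    with urlopen(url, timeout=30) as response:
        # Assumption: fall back to UTF-8 when the server declares no charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)


print(content_url("1_mos"))
# html = fetch(content_url("1_mos"))  # needs network access; not run here
```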

Last edited by eschwartz; 07-15-2014 at 01:23 PM.
Old 07-15-2014, 02:14 PM   #13
martienne
I've got it now! Thanks eschwartz!

But before I copy over all the 66 books of the Bible into my HTML file:
Is it going to be possible to convert the HTML (including the CSS formatting) into an ebook? Either PDF or EPUB (or both, to check which is best).
What tool would you recommend for doing this?

Will I have problems because of the enormous size of the bible text - I mean, it's a very long book....

Here is the end result:

Old 07-15-2014, 02:24 PM   #14
eschwartz
I don't see why not... EPUB is just a bunch of HTML files in a ZIP wrapper.

I'd suggest using calibre's Editor and importing the HTML and CSS. You can right-click on HTML files to link stylesheets.

Big books are only a problem when they overload the memory, but if you split it into separate HTML files (or leave it as it is now) only one file will be loaded into memory at a time.
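
To make the "HTML files in a ZIP wrapper" point concrete, here is a bare-bones Python sketch that writes one XHTML file per chapter into an EPUB-style archive. All names in it (`build_epub`, the file names, the placeholder identifier) are made up, and it leaves out the table of contents file, so treat it as a picture of the container layout rather than a validated EPUB. calibre's Editor takes care of those details.

```python
import io
import zipfile

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

PAGE = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>{title}</title></head>
<body>
{body}
</body>
</html>
"""


def build_epub(target, book_title, chapters):
    """Write a minimal archive; chapters is a list of (title, xhtml_body) pairs."""
    items, refs = [], []
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as epub:
        # The mimetype entry has to come first and be stored uncompressed.
        epub.writestr("mimetype", "application/epub+zip",
                      compress_type=zipfile.ZIP_STORED)
        epub.writestr("META-INF/container.xml", CONTAINER_XML)
        for number, (title, body) in enumerate(chapters, start=1):
            name = f"ch{number:02d}.xhtml"
            # One file per chapter, so a reader can load them one at a time.
            epub.writestr(name, PAGE.format(title=title, body=body))
            items.append(f'<item id="ch{number}" href="{name}" '
                         'media-type="application/xhtml+xml"/>')
            refs.append(f'<itemref idref="ch{number}"/>')
        manifest = "\n".join(items)
        spine = "\n".join(refs)
        opf = (
            '<?xml version="1.0" encoding="UTF-8"?>\n'
            '<package xmlns="http://www.idpf.org/2007/opf" version="2.0" '
            'unique-identifier="id">\n'
            '<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">\n'
            f"<dc:title>{book_title}</dc:title>\n"
            "<dc:language>sv</dc:language>\n"
            '<dc:identifier id="id">example-id</dc:identifier>\n'
            "</metadata>\n"
            f"<manifest>\n{manifest}\n</manifest>\n"
            f"<spine>\n{spine}\n</spine>\n"
            "</package>\n"
        )
        epub.writestr("content.opf", opf)


# Build into memory and list what ended up in the archive.
buffer = io.BytesIO()
build_epub(buffer, "Exempel", [("1 Mos", "<p>Text</p>"), ("2 Mos", "<p>Text</p>")])
names = zipfile.ZipFile(buffer).namelist()
print(names)
```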
Old 07-15-2014, 02:42 PM   #15
martienne
Great tip! It worked!

Just some minor glitches,

1) the locale/charset is again causing problems ("å" comes out as a Polish character...)
It seems to know the letters ä and ö, probably because they are used in German - a bigger language. Our little "å" is left out and displayed like the Polish l.

2) Also Calibre inserts a page break at every chapter - maybe good in a book but not in the bible since there are new chapters all the time.

3) Calibre also ignores that the "poetry" and things that God says are supposed to be indented.

4) Plus, there is some kind of issue with the empty space between paragraphs occasionally disappearing.
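
One possible explanation for glitch 1 (a guess, not confirmed anywhere in this thread): the file's bytes are Latin-1 but are being read as Latin-2 (ISO 8859-2, Central European). In those two charsets ä and ö share the same byte values, while the byte for å maps to an l with an acute accent. A quick Python check of that idea:

```python
swedish = "åäö"

# The same bytes, written as Latin-1 and read back as Latin-2 (ISO 8859-2).
as_latin2 = swedish.encode("latin-1").decode("iso-8859-2")
print(as_latin2)  # ĺäö

# Only the first letter changes; ä and ö sit at the same byte in both charsets.
print([a == b for a, b in zip(swedish, as_latin2)])  # [False, True, True]
```

If that is what is happening, declaring the encoding explicitly in each HTML file, or saving everything as UTF-8, would be the first thing to try.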

Can I edit the CSS inside Calibre? If Calibre has a different rendering engine than Firefox, it might be necessary.

Here is what it looks like in Calibre:


Last edited by martienne; 07-15-2014 at 03:08 PM.
All times are GMT -4.


MobileRead.com is a privately owned, operated and funded community.