#1 |
.~^пиратка^~.
Posts: 238
Karma: 14000
Join Date: Sep 2009
Location: Ask NSA...
Device: Onyx Boox M92
|
How to copy HTML and keep the formatting?
I'd like to copy this text (the book of Genesis from the Bible) and keep the verse numbers in their small size!
The verse numbers come out at NORMAL size when they should be very small and raised (I've forgotten what that's called). I could turn them off, which would solve the problem, but like most people I'm used to having verse numbers when reading the Bible.
Bibel 2000: 1 Moseboken
Is this possible? When I copy and paste the formatted text into Word, I get the numbers in normal size. Is there any way to get the verse numbers small and raised?
#2 | |
Bookmaker & Cat Slave
Posts: 11,503
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
Quote:
However, it doesn't really matter, because in short: no, you can't. I viewed the page's source, and they actually went to some lengths to ensure that if you want to "rip it," you'll have to do some work, as Kim already mentioned in his post (about the verse numbers). You may have missed it in your rush, but he already told you that you'd have to superscript the numbers by hand.
Now, you can probably write a bit of regex, but you'll still (probably) have to do them one at a time, to be sure you don't superscript a number used elsewhere.
The text isn't in the source for the page. Therefore, no, there's no easy way to "rip" their text. FWIW.
Hitch
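For what it's worth, the "bit of regex, one at a time" idea could look roughly like the Python sketch below. It assumes the pasted text has each verse number glued directly to the capital letter that starts the verse; that heuristic, and the helper names `find_candidates` and `superscript`, are illustrative assumptions rather than anything taken from the site. The candidates are listed first so each one can be checked by hand before anything is changed.

```python
import re

# Heuristic (an assumption, not taken from the site): a verse number is
# 1-3 digits glued directly to the capital letter that starts the verse.
VERSE_NUMBER = re.compile(r"(?<![\w.,])(\d{1,3})(?=[A-ZÅÄÖ])")


def find_candidates(text, context=15):
    """Return (position, number, snippet) tuples for manual review."""
    hits = []
    for m in VERSE_NUMBER.finditer(text):
        snippet = text[max(0, m.start() - context):m.end() + context]
        hits.append((m.start(), m.group(1), snippet))
    return hits


def superscript(text, approved_positions):
    """Wrap only the approved matches in <sup> tags; leave the rest alone."""
    def repl(m):
        if m.start() in approved_positions:
            return f"<sup>{m.group(1)}</sup>"
        return m.group(0)
    return VERSE_NUMBER.sub(repl, text)


# The first sentence is shortened from Jellby's example further down the
# thread; the second is made up to show a number that must stay untouched.
sample = "10Gud kallade det torra landet jord. Han var 600 år."
candidates = find_candidates(sample)
# In practice you would inspect each snippet; here every candidate is approved.
approved = {pos for pos, _number, _snippet in candidates}
result = superscript(sample, approved)
print(result)
```

Numbers followed by a space or a lowercase letter are never matched, which is the point of reviewing candidates instead of replacing every digit.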
|
#3 |
.~^пиратка^~.
Posts: 238
Karma: 14000
Join Date: Sep 2009
Location: Ask NSA...
Device: Onyx Boox M92
|
Thanks, Hitch. Yes, I can see that the text is not in the source. Out of interest, where is it, then? Text doesn't appear out of nowhere. I didn't check carefully, but at a quick glance I noticed no iframe or PHP call. Yet the text flows; it's not Flash or a picture. I just don't know how they format that little number. It has to be either a smaller, raised size or a different font. If it's in a database and formatted, it should still be possible to grab it.
|
#4 |
frumious Bandersnatch
Posts: 7,543
Karma: 19001583
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
The superscripts are done like this:
Code:
<span id="v_0_9" class="v">10</span>Gud kallade det torra landet jord, och vattenmassan kallade han hav. Och Gud såg att det var gott.
Code:
span.v { vertical-align: top; font-size: 0.6em; }
The text is included with JavaScript, replacing this placeholder:
Code:
<div class="bible_content" id="bible_content__2k__1_mos">
  <div class="bible_content_notincluded">1 Mos</div>
</div>
Last edited by Jellby; 07-15-2014 at 03:55 AM.
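Since the verse numbers are plain `<span class="v">` elements styled by an external rule, one option is to rewrite them as `<sup>` elements before pasting, on the assumption (not verified here) that the target application keeps `<sup>` even when it drops the site's stylesheet. The Python sketch below does that with a regex that only handles the simple markup shown above; `spans_to_sup` is a hypothetical helper name, and a real HTML parser would be safer for anything messier.

```python
import re

# Matches <span ... class="v" ...>NUMBER</span> with single or double quotes
# around the class value. Only spans whose class is exactly "v" are matched.
VERSE_SPAN = re.compile(r'<span\b[^>]*\bclass=(["\'])v\1[^>]*>(\d+)</span>')


def spans_to_sup(html):
    """Replace verse-number spans with <sup> elements."""
    return VERSE_SPAN.sub(r"<sup>\2</sup>", html)


fragment = (
    '<span id="v_0_9" class="v">10</span>Gud kallade det torra landet jord, '
    "och vattenmassan kallade han hav. Och Gud såg att det var gott."
)
converted = spans_to_sup(fragment)
print(converted)
```

Other spans, such as the chapter spans with class `ch`, are left as they are.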
#5 | |
Imperfect Perfectionist
Posts: 625
Karma: 863576
Join Date: Dec 2011
Location: Ølstykke, Denmark
Device: none
|
Quote:
Regarding the formatting, the chapter numbers are in a <span class='ch'> and the verse numbers in a <span class='v'>. If you want that, you'll have to select the text you want, use something like Firefox's "Show source code", and copy from there.
Then you'll have to learn some CSS and such stuff, and that's about as detailed as I will get. Others have written whole books on HTML and CSS, so I won't get into that.
Regards,
Kim
|
#6 | |
Bookmaker & Cat Slave
Posts: 11,503
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
Quote:
Thanks, Kim.
Hitch
|
#7 | |
.~^пиратка^~.
Posts: 238
Karma: 14000
Join Date: Sep 2009
Location: Ask NSA...
Device: Onyx Boox M92
|
Thanks again, Kim, and also Jellby for the examples!
So they are sending the text as HTML via JavaScript and then formatting it with CSS? Hypothetically, if it were possible to intercept the JavaScript and grab that HTML, I could easily write a CSS class locally, or use their stylesheet if it's possible to grab that. All that's needed is to grab the HTML from the JavaScript. I'm sure that can be done. I'm not good at this kind of stuff, but it's not totally alien either.
Quote:
This is not a normal piracy situation. Rest assured, if it were available to buy and download as an EPUB or PDF, I would. However, the only shop that sells it is closed, and they only sell physical CDs anyway. I simply want a personal copy of this Bible to put on my e-reader. If the state chooses to use a monopolistic distributor, not offer a proper e-book, and close the shop for a whole month in the summer, then this will inevitably happen.
The translation situation is different with Swedish. This version is one of only two Bible versions that a modern person can read without getting seriously distracted by the ancient language (i.e. they are post-spelling/grammar reform). The other version was privately funded by Christians who disagreed with the state version; obviously it should be paid for, since the sponsors are out of pocket. The Swedish state didn't commission Bibel 2000 to make money from it; it's not a commercial translation. They simply wanted to give people a Bible in modern language that adheres to political correctness as far as possible, back when the church and state were still one. The reason they don't offer an e-book is most likely that it simply didn't occur to them.
Finally: what do you think Jesus would make of attempts to restrict access to the Bible on grounds of copyright?
Last edited by martienne; 07-15-2014 at 06:45 AM.
|
#8 |
.~^пиратка^~.
Posts: 238
Karma: 14000
Join Date: Sep 2009
Location: Ask NSA...
Device: Onyx Boox M92
|
Got it!
So the trick is to copy the core text as HTML, which is possible. The next step is to apply the CSS, then save it as something that can be converted to EPUB or PDF. I grabbed the stylesheet from the site and imported it into Dreamweaver, so now I just need to fix up the styles a bit. Then I repeat that 66 times, once for each book in the Bible. It was not easy, but certainly possible without doing anything particularly complex. The CSS comes out a bit funny, but it shouldn't be a problem to fix.
Last edited by martienne; 07-15-2014 at 08:03 AM.
#9 |
.~^пиратка^~.
Posts: 238
Karma: 14000
Join Date: Sep 2009
Location: Ask NSA...
Device: Onyx Boox M92
|
I managed to generate this with the help of HTML/CSS.
But what's going on with the Swedish characters?
#10 |
frumious Bandersnatch
Posts: 7,543
Karma: 19001583
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
Wrong encoding. It looks like UTF-8 interpreted as Latin-1.
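That kind of mix-up is easy to reproduce. The short Python sketch below encodes a Swedish sentence as UTF-8, decodes the bytes as Latin-1 to get the garbling, and then reverses the two steps to repair it. The sentence comes from the verse quoted earlier; the variable names are only for illustration.

```python
original = "Och Gud såg att det var gott."

# UTF-8 stores "å" as two bytes (0xC3 0xA5). Read as Latin-1, each byte
# becomes its own character, so "å" shows up as "Ã¥".
garbled = original.encode("utf-8").decode("latin-1")
print(garbled)

# Reversing the two steps recovers the text, provided nothing was lost
# in between (for example by an editor re-saving the file).
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)
```

The lasting fix is to make the declared charset of the HTML file match the encoding it is actually saved in.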
|
#11 |
.~^пиратка^~.
Posts: 238
Karma: 14000
Join Date: Sep 2009
Location: Ask NSA...
Device: Onyx Boox M92
|
Thanks! I found the problem: it had the wrong ISO charset. It needed Latin 1 (Western Europe).
|
#12 |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
I just opened the page in Firefox, opened the developer console to Network (CTRL+SHIFT+Q), reloaded the page, and looked at the responses. After a few tries, I got this:
http://www.bibeln.se/las/2k/1_mos?content=pure
which appears to be the first page.
Also, this looks important:
http://www.bibeln.se/las/2k/1_mos?content=annotations
Last edited by eschwartz; 07-15-2014 at 01:23 PM.
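A rough Python sketch of requesting one of those URLs directly is below. The URL pattern is copied from the two links above; the helper names `content_url` and `fetch`, the idea that other books follow the same pattern, and the UTF-8 decoding are all assumptions, and the site may no longer respond this way.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base path taken from the links above.
BASE = "http://www.bibeln.se/las/2k/"


def content_url(book, content="pure"):
    """Build a URL like the ones seen in the Network panel."""
    return BASE + book + "?" + urlencode({"content": content})


def fetch(book, content="pure", encoding="utf-8"):
    """Download one response and decode it (the encoding is an assumption)."""
    with urlopen(content_url(book, content)) as response:
        return response.read().decode(encoding)


# Only the URL is built here; nothing is downloaded until fetch() is called.
print(content_url("1_mos"))
print(content_url("1_mos", "annotations"))
```

Saving the decoded string to a file with an explicit `encoding="utf-8"` keeps the Swedish characters intact for the later steps.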
#13 |
.~^пиратка^~.
Posts: 238
Karma: 14000
Join Date: Sep 2009
Location: Ask NSA...
Device: Onyx Boox M92
|
I've got it now! Thanks, eschwartz!
But before I copy all 66 books of the Bible into my HTML file: is it going to be possible to convert the HTML (including the CSS formatting) into an e-book, either PDF or EPUB (or both, to check which is best)? What tool would you recommend for doing this? Will I have problems because of the enormous size of the Bible text? I mean, it's a very long book.
Here is the end result:
#14 |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
I don't see why not... EPUB is just a bunch of HTML files in a ZIP wrapper.
I'd suggest using calibre's Editor and importing the HTML and CSS. You can right-click on HTML files to link stylesheets. Big books are only a problem when they overload the memory, but if you split the book into separate HTML files (or leave it as it is now), only one file will be loaded into memory at a time.
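To make the "HTML files in a ZIP wrapper" point concrete, here is a skeletal Python sketch that packs one chapter and a stylesheet into an EPUB-shaped ZIP. It shows only the container layout: the `mimetype` entry stored first and uncompressed, `META-INF/container.xml` pointing at the package file, and a manifest plus spine. The table of contents is deliberately left out, so this is not a complete, validated EPUB, and `build_epub`, the file names, and the identifier are all hypothetical.

```python
import os
import tempfile
import zipfile
from xml.sax.saxutils import escape

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

OPF_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>{title}</dc:title>
    <dc:language>sv</dc:language>
    <dc:identifier id="bookid">{identifier}</dc:identifier>
  </metadata>
  <manifest>
{manifest}
  </manifest>
  <spine>
{spine}
  </spine>
</package>
"""


def build_epub(path, title, identifier, chapters, css):
    """Write a skeletal EPUB; chapters is a list of (filename, xhtml) pairs."""
    manifest = ['    <item id="css" href="style.css" media-type="text/css"/>']
    spine = []
    for i, (name, _xhtml) in enumerate(chapters):
        manifest.append(
            f'    <item id="c{i}" href="{name}" media-type="application/xhtml+xml"/>'
        )
        spine.append(f'    <itemref idref="c{i}"/>')
    opf = OPF_TEMPLATE.format(
        title=escape(title),
        identifier=escape(identifier),
        manifest="\n".join(manifest),
        spine="\n".join(spine),
    )
    with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as z:
        # The mimetype entry must come first and be stored uncompressed.
        z.writestr("mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED)
        z.writestr("META-INF/container.xml", CONTAINER_XML)
        z.writestr("content.opf", opf)
        z.writestr("style.css", css)
        for name, xhtml in chapters:
            z.writestr(name, xhtml)


# Demo with one tiny chapter; the CSS rule is the one quoted by Jellby.
chapter = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>1 Mos</title>'
    '<link rel="stylesheet" type="text/css" href="style.css"/></head>'
    '<body><p><span class="v">10</span>Gud kallade det torra landet jord.</p>'
    "</body></html>"
)
out = os.path.join(tempfile.mkdtemp(), "demo.epub")
build_epub(out, "Demo", "demo-id-1", [("1_mos.xhtml", chapter)],
           "span.v { vertical-align: top; font-size: 0.6em; }")
with zipfile.ZipFile(out) as z:
    names = z.namelist()
    first = z.infolist()[0]
print(names)
```

Keeping one XHTML file per book of the Bible fits the point about memory: a reader only has to load the file it is currently showing.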
#15 | |
.~^пиратка^~.
Posts: 238
Karma: 14000
Join Date: Sep 2009
Location: Ask NSA...
Device: Onyx Boox M92
|
Quote:
Just some minor glitches:
1) The locale/charset is again causing problems ("å" comes out as a Polish character). It seems to know the letters ä and ö, probably because they are used in German, a bigger language. Our little "å" is left out and displayed like the Polish l.
2) Calibre also inserts a page break at every chapter. That may be good in a book, but not in the Bible, since there are new chapters all the time.
3) Calibre also ignores that the "poetry" and the things God says are supposed to be indented.
4) Plus, there is some kind of issue with the empty space between paragraphs occasionally disappearing.
Can I edit the CSS inside Calibre? If Calibre has a different rendering engine than Firefox, it might be necessary.
Here is what it looks like in Calibre:
Last edited by martienne; 07-15-2014 at 03:08 PM.
|
Thread | Thread Starter | Forum | Replies | Last Post |
Why use html formatting with kindle? | kateharp | Amazon Kindle | 30 | 06-18-2011 10:37 AM |
Best example of HTML formatting for Kindle??? | delphidb96 | Amazon Kindle | 13 | 02-15-2011 06:22 AM |
Troubleshooting HTML formatting for K3 | SmeagolRO | Amazon Kindle | 1 | 11-29-2010 12:56 PM |
HTML formatting | MarcusStringer | ePub | 17 | 04-06-2010 11:23 AM |
HTML formatting | john folkard | Calibre | 1 | 08-18-2009 10:15 AM |