Old 05-09-2010, 07:24 AM   #346
eboyhan
PandaMuse
Posts: 104
Karma: 104
Join Date: Nov 2009
Location: Florida
Device: kindle dx, kindle touch SO, kindle fire, kindle fire hd8.9
Thumbs up

Quote:
Originally Posted by itimpi View Post

Would this meet your requirements?
It might, I'll have to delve deeper into Calibre. Since I have over 1000 print books, is there any way to bulk import ISBNs? I can easily get some or all of the metadata maintained by Gurulib exported into Excel, CSV, or XML formats.

Okay, since my original answer here I have delved into the calibre user manual more deeply, played around with the calibre "add empty book" facility, the "edit metadata in bulk", calibredb.exe, etc.

One problem that I found right off the bat is that the calibre user manual predates when the add empty book capability was added. I can only find a few posts (after googling around) related to the add empty book facility -- and they weren't particularly helpful in describing how one might add empty books in bulk. Also I could find no documentation at all as to how one might use calibredb to add empty books.

@itimpi, you said in an answer to a post elsewhere that you no longer have to create dummy files. It seems to me that if one has, say, 1000 empty books to create and does not want to enter the metadata for each by hand, one approach would be to create 1000 dummy files whose filenames carry a unique metadata signature. Ideally, if this is the approach I must take, I would like each file to have its ISBN as its file name, and then write a script using calibredb that somehow adds the 1000 files as empty books, each with the appropriate ISBN. Unfortunately, from the limited documentation I have been able to find, I cannot see any easy way to link a file name with the ISBN metadata field.

Any help that anyone could give would be appreciated. By the way, I'm not averse to writing a script to accomplish this; I just need some pointers on how to get started. An example or two of a command line that adds an empty book and gives it an ISBN metadata entry would be really helpful here.

Anyhow, thanks for your prompt response
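
For readers with the same question, here is a minimal Python sketch of the kind of script asked for above. It only builds the command lines and does not run them. It assumes a current calibredb whose `add` command accepts `--empty`, `--isbn` and `--title`; those flags may not have existed in the 2010 release discussed in this thread. The CSV column names `isbn` and `title` and the sample row are made up for illustration.

```python
import csv
import io
import shlex


def build_add_commands(csv_text):
    """Return one calibredb argument list per CSV row that has an ISBN.

    Assumes (hypothetically) that the export has 'isbn' and 'title' columns.
    """
    commands = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        isbn = (row.get("isbn") or "").replace("-", "").strip()
        if not isbn:
            continue  # skip rows without an ISBN
        # Flags assumed from current calibredb: --empty, --isbn, --title
        cmd = ["calibredb", "add", "--empty", "--isbn", isbn]
        title = (row.get("title") or "").strip()
        if title:
            cmd += ["--title", title]
        commands.append(cmd)
    return commands


# Made-up sample data standing in for a Gurulib CSV export.
sample = "isbn,title\n978-0-306-40615-7,Example Book\n,Row Without ISBN\n"
for cmd in build_add_commands(sample):
    print(shlex.join(cmd))  # inspect first; run later with subprocess.run(cmd, check=True)
```

Printing the commands before running them makes it easy to try a single line by hand and confirm that the installed calibredb accepts these flags.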

Last edited by eboyhan; 05-09-2010 at 03:57 PM.
Old 05-13-2010, 04:18 PM   #347
jeanniespc
Member
jeanniespc began at the beginning.
 
Posts: 11
Karma: 10
Join Date: May 2010
Location: NC
Device: Kindle 2, Nook, Sony PRS300
calibre with Kindle2

After you change a book's metadata, do you have to convert it? What do you do to get it back on the Kindle?

Jeannie
Old 05-15-2010, 04:33 AM   #348
Jedai
Member
Jedai began at the beginning.
 
Posts: 10
Karma: 10
Join Date: Oct 2009
Device: iPhone, Sony Touch Reader
@eboyhan: One suggestion that needs no script (or at most one to create the dummy files from a list of ISBNs) would be to fiddle with the import regex in Preferences>Add/Save>Adding and then import all those books at once. You can then bulk import the metadata.
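
A rough Python sketch of that idea: create one empty dummy file per ISBN, then check that a filename pattern pulls the ISBN back out. The `(?P<isbn>...)` group name is an assumption about what Calibre's filename pattern accepts, and the `.txt` extension, function names and sample ISBNs are hypothetical. Test the pattern in Calibre's own preferences dialog before importing anything in bulk.

```python
import re
from pathlib import Path

# Hypothetical pattern: the whole file stem is a 10- or 13-character ISBN.
FILENAME_PATTERN = re.compile(r"^(?P<isbn>\d{9}[\dXx]|\d{13})$")


def write_dummy_files(isbns, folder):
    """Create one empty .txt file per ISBN and return the paths."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    paths = []
    for isbn in isbns:
        path = folder / f"{isbn}.txt"
        path.touch()  # empty placeholder file
        paths.append(path)
    return paths


def isbn_from_filename(path):
    """Return the ISBN encoded in the file name, or None if it does not match."""
    match = FILENAME_PATTERN.match(Path(path).stem)
    return match.group("isbn") if match else None
```

Anchoring the pattern with `^` and `$` keeps ordinary file names such as `notes.txt` from being mistaken for ISBNs.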
Old 05-22-2010, 06:53 AM   #349
Yeti
Member
Yeti began at the beginning.
 
Posts: 12
Karma: 26
Join Date: Jul 2009
Location: Queensland, Australia
Device: Kindle 2i
Hi all,

I am new to the wonderful world of e-readers and have just bought a Kindle 2i. I am having a problem with converting PDF to MOBI using Calibre. I have spent a few hours trying to find a solution, but surprisingly have not seen it mentioned in any of the FAQs, the Calibre user manual or the web site. Am I the only one who's having this problem?

What is happening is that in the converted MOBI document every four or five pages the text is interrupted by the following text:
"Generated by ABC Amber LIT Converter, http://www.processtext.com/abclit.html"
Very disrupting! I have followed the link to ABC Amber's web site, and its FAQ tells me "The registered version removes all our banners, labels and ads." I would be happy to purchase their software and register it, but I use a Mac and the software appears to be for PC's.

I must be missing something, can anyone help?

Thanks in advance, and thanks for Calibre Kovid and all who have contributed.

Yeti.
Old 05-22-2010, 08:33 AM   #350
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by Yeti View Post
What is happening is that in the converted MOBI document every four or five pages the text is interrupted by the following text:
"Generated by ABC Amber LIT Converter, http://www.processtext.com/abclit.html"
Your pdf was previously converted from another format into a pdf. ABC Amber LIT Converter was used to do that job, and it put the objectionable text into your pdf as a header. Calibre is just converting all of your pdf. As to how to fix it, you need to tell Calibre you don't want that text.

Try here.
Old 05-22-2010, 04:40 PM   #351
DoctorOhh
US Navy, Retired
DoctorOhh ought to be getting tired of karma fortunes by now.
 
Posts: 9,897
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Kindle PaperWhite SE 11th Gen
Quote:
Originally Posted by Yeti View Post
What is happening is that in the converted MOBI document every four or five pages the text is interrupted by the following text:
"Generated by ABC Amber LIT Converter, http://www.processtext.com/abclit.html"
Quote:
Originally Posted by Starson17 View Post
Your pdf was previously converted from another format into a pdf. ABC Amber LIT Converter was used to do that job, and it put the objectionable text into your pdf as a header. Calibre is just converting all of your pdf. As to how to fix it, you need to tell Calibre you don't want that text.
One way I might do it is to specify a debug directory during conversion. After conversion I would grab the original HTML out of that folder, open it in Notepad++ or another editor, then find and replace "Generated by ABC Amber LIT Converter, http://www.processtext.com/abclit.html".

Since you might have many other books with this problem learning the method Starson17 linked to will benefit you in the long run.

Good Luck.
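
For anyone without Notepad++, the same find-and-replace can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, the files are assumed to be UTF-8 `.html` files in the debug folder, and a literal match like this will miss the banner wherever markup (such as a link tag) splits the phrase.

```python
import re
from pathlib import Path

# Literal banner text, with the dots escaped so they match only real dots.
BANNER = re.compile(
    r"Generated by ABC Amber LIT Converter, "
    r"http://www\.processtext\.com/abclit\.html"
)


def strip_banner(folder):
    """Remove the banner from every .html file under folder.

    Returns the number of files that were changed. Assumes UTF-8 files.
    """
    changed = 0
    for path in Path(folder).rglob("*.html"):
        text = path.read_text(encoding="utf-8")
        cleaned, count = BANNER.subn("", text)
        if count:
            path.write_text(cleaned, encoding="utf-8")
            changed += 1
    return changed
```

Work on a copy of the debug folder, since the files are rewritten in place.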

Last edited by DoctorOhh; 05-22-2010 at 10:22 PM.
Old 05-22-2010, 09:55 PM   #352
Yeti
Member
Yeti began at the beginning.
 
Posts: 12
Karma: 26
Join Date: Jul 2009
Location: Queensland, Australia
Device: Kindle 2i
Aha ... I am so pleased it wasn't something really dumb and obvious, I feel better now

Thank you Starson. After some trial and error I managed to get rid of the offending text. I still seem to have extra page-breaks where the text was, but I can live with that. Great!

dwanthny, thank you too, I get the gist of what you're saying although I don't follow completely. Notepad++ must be a PC application? Anyway, I will go with your advice and use the method that Starson provided the link for; I have made a note of it for future reference.

All this trial and error with Calibre has made me realize there is a lot of power hidden underneath its uncluttered-looking bonnet. Very nice software.

Yeti.
Old 05-23-2010, 09:14 AM   #353
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by Yeti View Post
Thank you Starson. After some trial and error I managed to get rid of the offending text. I still seem to have extra page-breaks where the text was, but I can live with that. Great!
IIRC, there is a pair of <br> breaks around the offending text. They should show up in the wizard. What regular expression did you use to get rid of the junk? You may just have to add <br> to the beginning and/or end to get rid of the page break. IIRC, after that thread I sent you to was written, Calibre was revised to work better in multipage settings.
Old 05-23-2010, 06:13 PM   #354
Yeti
Member
Yeti began at the beginning.
 
Posts: 12
Karma: 26
Join Date: Jul 2009
Location: Queensland, Australia
Device: Kindle 2i
IIRC?

Starson, I am assuming your question is directed at me, Yeti? I'll answer anyway. Not having any idea about regex or programming or anything like that, I simply followed instructions I found in the thread. I tried some of mshneour's expressions, but they didn't highlight anything in the wizard, so I then tried Kovid's suggestion from his first reply (#2) in the thread: Generated by.*abclit.html. That seemed to highlight most of the offending text I was trying to get rid of, so I used that, and the result is quite satisfactory. I can live with the extra page breaks. Thanks again.

Yeti.
Old 05-23-2010, 06:33 PM   #355
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by Yeti View Post
IIRC?
It stands for If I Recall Correctly.

Quote:
Starson, I am assuming your question is directed at me, Yeti? I'll answer anyway. Not having any idea about regex or programming or anything like that, I simply followed instructions I found in the thread. I tried some of mshneour's expressions, but they didn't highlight anything in the wizard, so I then tried Kovid's suggestion from his first reply (#2) in the thread: Generated by.*abclit.html. That seemed to highlight most of the offending text I was trying to get rid of, so I used that, and the result is quite satisfactory. I can live with the extra page breaks. Thanks again.

Yeti.
If you look at the "highlighted text" you should see the <br> near it, probably just before or just after the stuff you are trying to remove. That's the code for a "break", and if you adjust your regex to get it highlighted, the extra page break will disappear. From your post above, it's clear that the "regex" you are currently using is:
Generated by.*abclit.html

A "regex" is a regular expression that defines text to be matched. Your regex has the following meaning: match any text that starts with the phrase "Generated by", followed by zero or more characters (the .* part), followed by "abclit", followed by any single character (the . part), followed by "html".

I was suggesting that it's not that hard to get rid of the extra page break by changing the regex slightly so that it also matches (and therefore highlights) the <br> part (assuming it's there). It's just a matter of adding <br> and maybe another character or two to your regex. If you don't care about the extra page break, ignore this, but if you want to get rid of it, post the text that surrounds your highlighted text (it will probably include <br> as discussed above) and someone will help you get a better regex. It will look something like this:

<br>Generated by.*abclit.html<br>

but perhaps not exactly like that, depending on what text is in your book. I was just pointing out that it's a tiny change and easy to make.
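
The reading of the regex given above can be checked with Python's `re` module. This is a small sketch in plain Python, not Calibre's wizard, and the `lookalike` string is invented purely to show what an unescaped dot allows.

```python
import re

# The regex from the thread: the "." before "html" matches ANY single character.
loose = re.compile(r"Generated by.*abclit.html")
# Same idea with the dot escaped, so only a literal "." is accepted there.
strict = re.compile(r"Generated by.*abclit\.html")

banner = "Generated by ABC Amber LIT Converter, http://www.processtext.com/abclit.html"
lookalike = "Generated by X abclitZhtml"  # invented text, no real dot before "html"

print(loose.search(banner) is not None)     # banner is matched by the thread's regex
print(loose.search(lookalike) is not None)  # so is the lookalike, because "." is a wildcard
print(strict.search(lookalike) is None)     # the escaped version rejects the lookalike
```

For this banner the difference is harmless, but escaping the dot as `\.` is the safer habit when the text to remove must be matched exactly.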
Old 05-24-2010, 12:20 AM   #356
Yeti
Member
Yeti began at the beginning.
 
Posts: 12
Karma: 26
Join Date: Jul 2009
Location: Queensland, Australia
Device: Kindle 2i
Quote:
Originally Posted by Starson17 View Post
It stands for If I Recall Correctly.
Doh! Fifteen years + on the internet and I can't remember seeing that one before, even on the good old BBSs.

Ok, bit of a learning curve here, I am trying. Hopefully I will get to read the book eventually ...

Interesting to notice how some things - like the offending text we are talking about here - don't show up in the PDF before conversion and then suddenly appear in the MOBI afterwards ...

I just noticed also that neither the PDF before conversion, nor the MOBI afterwards have any italic print. I have the paper version of this book and, like all books it uses italics for emphasis, to indicate someone's train of thought, for foreign language and so on. This is quite important for a better understanding of the story, and would be nice to correct if possible too. But quite likely it was lost in creating the original PDF version?

Now, trying to get rid of the extra page breaks:

I tried using the expression <br>Generated by.*abclit.html<br> , but it does not highlight anything in the wizard. I also tried leaving off the <br> , first at the start, then at the end - no luck, it does not highlight anything. Here is a copy-and-paste of a section of the text from the wizard after using the expression Generated by.*abclit.html :

... Central Intelligence Agency. He <b>Generated by ABC Amber LIT Conv<a href="http://www.processtext.com/abclit.html">erter, http://www.processtext.com/abclit.html</a></b></p><p>
was also at this moment ...

and this is the part that gets highlighted by the wizard:

Generated by ABC Amber LIT Conv<a href="http://www.processtext.com/abclit.html">erter, http://www.processtext.com/abclit.html

As I have said, I can live with the extra page breaks, and even the lack of italics, but if anyone still feels like playing, I am open for other suggestions. Thanks again.

Yeti.

Last edited by Yeti; 05-24-2010 at 12:25 AM.
Old 05-24-2010, 09:29 AM   #357
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by Yeti View Post
Interesting to notice how some things - like the offending text we are talking about here - don't show up in the PDF before conversion and then suddenly appear in the MOBI afterwards ...
I believe that means it's been hidden in the pdf. It's there, but not being displayed until conversion makes it reappear.

Quote:
I just noticed also that neither the PDF before conversion, nor the MOBI afterwards have any italic print. I have the paper version of this book and, like all books it uses italics for emphasis, to indicate someone's train of thought, for foreign language and so on. This is quite important for a better understanding of the story, and would be nice to correct if possible too. But quite likely it was lost in creating the original PDF version?
Yes, it was probably stripped during conversion. I don't know why, as a good conversion wouldn't have done that.

Quote:
Now, trying to get rid of the extra page breaks:
I tried using the expression <br>Generated by.*abclit.html<br> , but it does not highlight anything in the wizard.
I didn't think it would. Without seeing the text you want removed, and the codes around it, that was just a guess.

Quote:
I also tried leaving off the <br> , first at the start, then at the end - no luck, it does not highlight anything.
That's also not surprising - you don't have any <br> codes

Quote:
Here is a copy-and-paste of a section of the text from the wizard after using the expression Generated by.*abclit.html :

... Central Intelligence Agency. He <b>Generated by ABC Amber LIT Conv<a href="http://www.processtext.com/abclit.html">erter, http://www.processtext.com/abclit.html</a></b></p><p>
was also at this moment ...
Try this:

Code:
<b>Generated by.*abclit.*<p>
That may not do it, as I don't see the part causing the break. I think that will just remove some empty bold tags, and an empty paragraph - extra line. The part causing the page break may be in a part of the text you didn't post. If it's not bothering you, you don't need to go any further, but learning a bit about basic regex use can be helpful if you are going to use Calibre over the long term.
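
To see what the two expressions leave behind, here is a short Python sketch that applies each to the snippet Yeti posted. It uses Python's `re` module rather than Calibre's wizard, so treat it as an approximation of what the wizard would do.

```python
import re

# The snippet posted above, including the line break after <p>.
SAMPLE = (
    'Central Intelligence Agency. He <b>Generated by ABC Amber LIT Conv'
    '<a href="http://www.processtext.com/abclit.html">erter, '
    'http://www.processtext.com/abclit.html</a></b></p><p>\n'
    'was also at this moment'
)

original = re.compile(r"Generated by.*abclit.html")
suggested = re.compile(r"<b>Generated by.*abclit.*<p>")

# The original regex stops at the last "abclit.html", leaving empty tags behind.
after_original = original.sub("", SAMPLE)
# The suggested regex also swallows the surrounding <b>...</b></p><p> markup.
after_suggested = suggested.sub("", SAMPLE)

print(repr(after_original))
print(repr(after_suggested))
```

Neither pattern crosses the line break, because `.` does not match a newline by default, so the text is still split across two lines afterwards.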
Old 05-24-2010, 10:15 AM   #358
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by Starson17 View Post
I don't see the part causing the break. ... The part causing the page break may be in a part of the text you didn't post. If it's not bothering you, you don't need to go any further
I didn't see the part causing the break because there is no break code, and I was looking at your post in the forum reply editor (where there are already lots of extra breaks, so I couldn't see the one in your text.) Your page break problem is solvable, but I think it would require a multiline match, and that's probably more than you want to go into in your first regex attempt. I looked at similar "Converted by" text in one of my books and it had <br> tags in it, which is why I initially thought the match would be easy for you.
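
As a rough idea of what a match that spans the line break could look like, here is a Python sketch. It rests on an assumption the thread does not confirm: that the visible break comes from the `</p><p>` pair and the newline that follow the banner. The pattern is written for Python's `re` module and has not been tried in Calibre's wizard.

```python
import re

# Yeti's snippet again, with the newline that follows <p>.
SAMPLE = (
    'Central Intelligence Agency. He <b>Generated by ABC Amber LIT Conv'
    '<a href="http://www.processtext.com/abclit.html">erter, '
    'http://www.processtext.com/abclit.html</a></b></p><p>\n'
    'was also at this moment'
)

# \s* matches whitespace including newlines, so the match runs from the space
# before <b> through the paragraph break and onto the start of the next line.
SPANNING = re.compile(
    r"\s*<b>Generated by.*?abclit\.html</a></b></p>\s*<p>\s*"
)


def rejoin(text):
    """Replace the banner and the paragraph break after it with one space."""
    return SPANNING.sub(" ", text)


print(repr(rejoin(SAMPLE)))
```

Replacing the whole match with a single space rejoins the sentence that the banner had split in two.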
Old 05-27-2010, 02:13 PM   #359
LateAdopter
Junior Member
LateAdopter began at the beginning.
 
Posts: 1
Karma: 10
Join Date: May 2010
Device: Sony PRS-600
Issue with names of TXT files copied to device

I just upgraded to Calibre 0.6.54 (from 0.6.51). Now, when I send a .TXT file to my device (Sony PRS-600), the title and author change on the next device reset: the author becomes blank and the title becomes title-author_XXX. A concrete example: file "Adventures of Tom Sawyer.txt"; Title "Adventures of Tom Sawyer"; Author "Twain, Mark". I copy this to the PRS-600 main memory successfully, and it shows in the display correctly. After ejecting the device and allowing it to reset, the new book has the title "Adventures of Tom Sawyer - Twain, Mark_139" and author "Unknown".

Since this did work on previous versions, it loads all other types I tried as expected (EPUB, LIT, RTF) and I saw no other mention of the issue, I am guessing that I have set some variable incorrectly. Thoughts, anyone?

Steve
Old 05-27-2010, 05:43 PM   #360
chaley
Grand Sorcerer
chaley ought to be getting tired of karma fortunes by now.
 
Posts: 12,451
Karma: 8012886
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
Quote:
Originally Posted by LateAdopter View Post
Since this did work on previous versions, it loads all other types I tried as expected (EPUB, LIT, RTF) and I saw no other mention of the issue, I am guessing that I have set some variable incorrectly. Thoughts, anyone?
Are you sure it worked for text files? The reason I ask is that a Sony rebuilds its private database when you disconnect from the computer, cleaning up the metadata in ways that it thinks necessary. For example, on my 300, multiple authors always get truncated to one author. It seems to do this cleanup by looking for metadata in the files, and because text files have no metadata, the author field is cleaned. On my 300, it becomes empty, not 'Unknown'.