Old 06-15-2010, 05:35 AM   #31
JvdW
Zealot
JvdW doesn't litter
 
Posts: 115
Karma: 150
Join Date: Jul 2008
Location: Netherlands Veenendaal
Device: Palm T5, Sony PRS-505, Nook Color
Quote:
Originally Posted by chrisix View Post
@JvdW

yes, please
OK, here we go:
Download from http://sourceforge.net/projects/unxutils/files/
the following package:
UnxUtils.zip

Unpack the UnxUtils.zip where you want it (ex: c:\unixutils), note where it ends up since we need the path to the bin folder and the path to usr\local\wbin

Download pdftotext from foolabs.com:
ftp://ftp.foolabs.com/pub/xpdf/xpdf-3.02pl4-win32.zip
Extract the contents, or pdftotext.exe alone, to the usr\local\wbin folder

Open a command prompt
type: set PATH=c:\unixutils\bin;c:\unixutils\usr\local\wbin;%PATH%
type: sh
(this should give you a Unix-like shell, which will run the script)
type: ls
(this should give you a listing of files in your current directory)
type: exit
You're back at the command prompt of Windows.
Copy the attached isbn.bat into the wbin folder, rename it to isbn.zsh, and edit the first line to reflect your installation of UnxUtils.
(For reasons unknown to me, I can't upload isbn.zsh.)
If you don't trust me that your PDFs won't be touched, copy them into a temporary folder, for example C:\PDF. No subfolders are allowed.
In the command prompt type:
c:
cd \pdf
type: sh
type: isbn.zsh

The script will look for ISBNs in the following order:
ISBN-13
ISBN-10
ISBN:
ISBN
If one is found, the script renames the file and moves it into the done folder. Locations can be changed in the script if needed, but leave the defaults at first, since that makes finding problems much easier.
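The search order above can be sketched in Python. This is only an illustration, not the attached script: the digit pattern, the function names find_isbn and done_path, and the idea that the new file name is just the ISBN are all assumptions.

```python
import re
from pathlib import Path

# Labels in the order listed in the post.
LABELS = ["ISBN-13", "ISBN-10", "ISBN:", "ISBN"]
# Assumed digit pattern: optional 978/979 prefix, nine digits, then a
# final digit or X, with optional hyphens or spaces in between.
DIGITS = r"[:\s]*((?:97[89][-\s]?)?(?:\d[-\s]?){9}[\dXx])"

def find_isbn(text):
    """Return the first ISBN found, trying each label in priority order."""
    for label in LABELS:
        match = re.search(re.escape(label) + DIGITS, text)
        if match:
            # Drop hyphens and spaces so the result can be used in a file name.
            return re.sub(r"[-\s]", "", match.group(1)).upper()
    return None

def done_path(pdf_path, isbn, done_dir="done"):
    """Hypothetical target: <folder of the PDF>/done/<isbn>.pdf"""
    return Path(pdf_path).parent / done_dir / f"{isbn}.pdf"
```

A file whose text contains both an ISBN-10 and an ISBN-13 line would be named after the ISBN-13, because that label is tried first.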
After that, make sure that Calibre is set to read the ISBN from the filename and start importing. Preferences->Add/Save->Adding Books->Regular expression: (?P<isbn>[0-9].+$)
Also, I have 'Read metadata only from file name' unchecked.
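To see what that regular expression captures, it can be tried in plain Python. This is a sketch only: the helper name isbn_from_name is made up, and passing the name without its extension is an assumption about how the pattern is applied.

```python
import re

# The pattern quoted above: capture from the first digit to the end of the name.
FILENAME_RE = re.compile(r"(?P<isbn>[0-9].+$)")

def isbn_from_name(stem):
    """Return the 'isbn' group for a file name (without extension), or None."""
    match = FILENAME_RE.search(stem)
    return match.group("isbn") if match else None
```

Because the capture starts at the first digit, a name such as "vol2 9780131103627" would yield "2 9780131103627", so the pattern works best when the renamed file consists of the ISBN alone.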

That should do the trick, I hope ;-)

Regards,

Joop
Attached Files
File Type: bat isbn.bat (2.1 KB, 342 views)
Old 06-15-2010, 12:02 PM   #32
chrisix
Enthusiast
chrisix began at the beginning.
 
Posts: 31
Karma: 12
Join Date: Jun 2010
Device: iPad
Thanks a lot for your great script and help!

Can't run unixutils in Win7 x64 (abnormal program termination).
So I ran it in WinXP in VMware.

I get a very high ISBN hit rate, which is great because this is a folder of files that all failed in e-library. But the script often stops without any reason and without any output, so I interrupt it and start it again. Maybe it has something to do with the VMware shared folder where the files are?
Old 06-16-2010, 04:01 AM   #33
JvdW
Zealot
JvdW doesn't litter
 
Posts: 115
Karma: 150
Join Date: Jul 2008
Location: Netherlands Veenendaal
Device: Palm T5, Sony PRS-505, Nook Color
Quote:
Originally Posted by chrisix View Post
Thanks a lot for your great script and help!

Can't run unixutils in Win7 x64 (abnormal program termination).
So I ran it in WinXP in VMware.
Sorry, I indeed didn't state that it wouldn't work in 64-bit Windows.

Quote:
I get a very high ISBN hit rate, which is great because this is a folder of files that all failed in e-library. But the script often stops without any reason and without any output, so I interrupt it and start it again. Maybe it has something to do with the VMware shared folder where the files are?
Doesn't it at least output the name of the last PDF it is going to work on?
If it had something to do with the shared folder, I would expect every file to fail.

Can you show some of the output of the script?

Regards,

Joop
Old 06-16-2010, 04:35 PM   #34
chrisix
Enthusiast
chrisix began at the beginning.
 
Posts: 31
Karma: 12
Join Date: Jun 2010
Device: iPad
Quote:
Originally Posted by JvdW View Post
Doesn't it at least output the name of the last PDF it is going to work on?
If it had something to do with the shared folder, I would expect every file to fail.
Can you show some of the output of the script?
As I said, it stops without any error, and the PDF at which it stops is random. I think it has something to do with unixutils; your script is clean and I don't see any reason for it to stop.

When I have a lot of time, I will try to rewrite your script in Windows PowerShell, if that is possible.
Old 06-17-2010, 04:12 AM   #35
JvdW
Zealot
JvdW doesn't litter
 
Posts: 115
Karma: 150
Join Date: Jul 2008
Location: Netherlands Veenendaal
Device: Palm T5, Sony PRS-505, Nook Color
Quote:
Originally Posted by chrisix View Post
As I said, it stops without any error, and the PDF at which it stops is random. I think it has something to do with unixutils; your script is clean and I don't see any reason for it to stop.
If you look at the script and run the commands one by one, does it work then?
It's really weird, because if a PDF doesn't contain an ISBN the script will print the name of the PDF and then process the next one, printing its name and moving on. One thing that might cause this is a PDF file that doesn't strictly conform to the specs.
We have an application that imports a PDF which is first split into single pages, and the application that generates this multipage PDF generates an invalid PDF. (MS reporting service)

Quote:
When I have a lot of time, I will try to rewrite your script in Windows PowerShell, if that is possible.
That thought crossed my mind too, but I'm a little more proficient with Unix commands than with PowerShell.

Regards,

Joop
Old 06-18-2010, 04:42 AM   #36
chrisix
Enthusiast
chrisix began at the beginning.
 
Posts: 31
Karma: 12
Join Date: Jun 2010
Device: iPad
I already made a working PowerShell Script....
Just needs Windows PowerShell and the pdftotext.exe

The next thing I will figure out is how to find ISBN-13, ISBN-10 and ISBN in one loop to make it a bit faster.
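One way to handle all the labels in a single pass is one regular expression with an optional label group, keeping the best-ranked hit. The sketch below is in Python rather than PowerShell and is not the script from this thread; the pattern, the ranking table and the name best_isbn are assumptions.

```python
import re

# One pattern for every label variant; the label suffix is optional.
ISBN_RE = re.compile(
    r"ISBN(?P<label>-13|-10|:)?[:\s]*"
    r"(?P<number>(?:97[89][-\s]?)?(?:\d[-\s]?){9}[\dXx])"
)
# Lower rank wins: ISBN-13, then ISBN-10, then "ISBN:", then bare "ISBN".
RANK = {"-13": 0, "-10": 1, ":": 2, None: 3}

def best_isbn(text):
    """Scan the text once and return the hit with the best-ranked label."""
    best = None
    for m in ISBN_RE.finditer(text):
        rank = RANK[m.group("label")]
        if best is None or rank < best[0]:
            number = re.sub(r"[-\s]", "", m.group("number")).upper()
            best = (rank, number)
    return best[1] if best else None
```

The text is scanned only once, and the preference order is applied afterwards through the rank table instead of through repeated searches.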
Old 06-18-2010, 10:16 PM   #37
UnraisedArc
Junior Member
UnraisedArc began at the beginning.
 
Posts: 8
Karma: 10
Join Date: Jul 2009
Device: none
Still alive?

I'm surprised this thread is still alive. Unfortunately, my ebook collection has grown to the thousands, and I still have many that have sparse metadata. Good luck guys and keep up the good work!
Old 06-19-2010, 09:23 AM   #38
chrisix
Enthusiast
chrisix began at the beginning.
 
Posts: 31
Karma: 12
Join Date: Jun 2010
Device: iPad
If someone is interested in a PDF > ISBN Extractor PowerShell Script, let me know.
Old 06-20-2010, 08:25 AM   #39
chrisix
Enthusiast
chrisix began at the beginning.
 
Posts: 31
Karma: 12
Join Date: Jun 2010
Device: iPad
Does anybody know if it is possible to limit the number of lines/pages of txt output with ebook-convert?
I can't find a parameter for it.
I will try to make ISBN detection work for PDF, EPUB and maybe more, using only calibre's own commands.
Old 06-20-2010, 10:29 AM   #40
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by chrisix View Post
Does anybody know if it is possible to limit the number of lines/pages of txt output with ebook-convert?
I'm pretty certain the answer is no, assuming you are asking if you can insert page breaks into the txt output at regular intervals. Calibre's txt output is for e-readers where there is no concept of a page. The text is continuous.

If you are asking about limiting articles and feeds during news fetching (my primary use for ebook-convert) the answer is yes, but not by page or line, only by article, feed, date, etc.
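For the PDF case, pdftotext (the tool both scripts in this thread already rely on) can stop after a given page with its -f (first page) and -l (last page) options, and writes to stdout when the output file is given as "-". A minimal Python sketch follows; the function names, the default of 10 pages and the Latin-1 decoding are assumptions.

```python
import subprocess

def pdftotext_command(pdf_path, first_page=1, last_page=10, exe="pdftotext"):
    """Build the argument list: -f/-l pick the page range, '-' means stdout."""
    return [exe, "-f", str(first_page), "-l", str(last_page), str(pdf_path), "-"]

def first_pages_text(pdf_path, last_page=10):
    """Run pdftotext on the first pages only and return the extracted text."""
    result = subprocess.run(
        pdftotext_command(pdf_path, last_page=last_page),
        capture_output=True,
        check=True,
    )
    # Assumed encoding; adjust if pdftotext is configured differently.
    return result.stdout.decode("latin-1", errors="replace")
```

Since an ISBN usually sits on the copyright page near the front, converting only the first few pages may be enough and keeps the run short.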
Old 06-21-2010, 03:43 AM   #41
JvdW
Zealot
JvdW doesn't litter
 
Posts: 115
Karma: 150
Join Date: Jul 2008
Location: Netherlands Veenendaal
Device: Palm T5, Sony PRS-505, Nook Color
Quote:
Originally Posted by chrisix View Post
If someone is interested in a PDF > ISBN Extractor PowerShell Script, let me know.
Would be nice to look at how you've done it. Might even use it instead of my own script



Joop
Old 06-22-2010, 02:06 PM   #42
chrisix
Enthusiast
chrisix began at the beginning.
 
Posts: 31
Karma: 12
Join Date: Jun 2010
Device: iPad

Last edited by chrisix; 06-22-2010 at 02:08 PM.
Old 06-22-2010, 02:07 PM   #43
chrisix
Enthusiast
chrisix began at the beginning.
 
Posts: 31
Karma: 12
Join Date: Jun 2010
Device: iPad
Quote:
Originally Posted by JvdW View Post
Would be nice to look at how you've done it. Might even use it instead of my own script

I just compared yours and mine, and yours is doing a much better job, so let me tweak it a bit.
Old 06-22-2010, 05:38 PM   #44
chrisix
Enthusiast
chrisix began at the beginning.
 
Posts: 31
Karma: 12
Join Date: Jun 2010
Device: iPad
PDF ISBN Extractor for Windows PowerShell

Here it is. Put pdftotext.exe in the same path.

I still have problems in some situations, but maybe you will find a better, much cleaner way. I'm already going crazy.....

0.2 > small changes.
0.3 > better results.
Attached Files
File Type: zip isbn.zip (841 Bytes, 351 views)

Last edited by chrisix; 07-07-2010 at 01:02 PM.
Old 07-01-2010, 06:22 PM   #45
chrisix
Enthusiast
chrisix began at the beginning.
 
Posts: 31
Karma: 12
Join Date: Jun 2010
Device: iPad
8 downloads and no comment?

Is it usable for anybody?

chris