07-13-2009, 12:07 PM | #1 |
Member
Posts: 14
Karma: 10
Join Date: Mar 2009
Device: prs 505
|
New tool : Comic Processing Utility
Hi,
A few months ago I spent a lot of time trying different tools to convert comics so that I could read them on my prs505. I especially liked pdflrf and papercrop. I also lost a lot of time testing many other tools that read some formats and not others, and output some formats and not others, which makes converting a real pain.

Pdflrf had licensing issues: we don't have the sources, and it no longer seems to be developed. It was nice though, because it has simple but useful and efficient features.

Papercrop is nice too and has very smart algorithms, which I tweaked and used for comics. But in a way it puts too much intelligence into the process: sometimes you can't process a whole comic at once without checking each page, and you cannot change settings for particular pages. It's also tied to a GUI, which is not what I prefer, and it needs work to run on Linux.

Anyway, I got a bit bored and tried to code something on my own. My idea was to write something:
- not dependent on e-reader formats. I wanted to keep that notion separate; I didn't want to produce lrf, pdf, etc. That is the job of other software that will do it far better than I would. So the tool should work on simple jpg files.
- without a GUI, because ergonomics is another job I didn't want to work on ;) and I like command-line tools for batch work, where the focus is on the job and not on the GUI, buttons, etc. I like to separate things. I try to focus on the engine; it could be connected to a GUI later on, I guess, if I extract the code into a library.
- short and simple to understand, use and maintain, even if it's not super fast at processing pictures: I usually don't convert thousands of comics at once.
- multiplatform
- open source ? :)

So I coded what I called CPU: Comic Processing Utility.
I wanted to work more on it but didn't have the time, and before I completely forget it on my hard drive I wanted to contribute it here, in case someone finds it useful.
Here is the help from cpu -h :
---
Usage: cpu [options] pictures

cpu 0.1 - Comic Processing Utility by frediz

cpu takes a list of pictures and process them through a list of filters that are applied one after the other on the picture list. Some filters can produce one picture from several successive pictures or produce several pictures from one, in both case, inserting them in the list at the place of the source ones. Often a filter just modifies an picture and outputs a single picture.

Example :
cpu -o output -f mhs,autocrop,grayscale orig01.jpg orig02.jpg orig03.jpg
original  : [ o1.jpg o2.jpg o3.jpg ]
mhs       : [ m1_1.jpg m1_2.jpg m2_1.jpg m3_1.jpg m3_2.jpg m3_3.jpg ]
autocrop  : [ a1_1.jpg a1_2.jpg a2_1.jpg a3_1.jpg a3_2.jpg a3_3.jpg ]
grayscale : [ g1_1.jpg g1_2.jpg g2_1.jpg g3_1.jpg g3_2.jpg g3_3.jpg ]

mhs filter autosplits a picture at horizontal lines. m1_1.jpg m1_2.jpg are outputed from o1.jpg after splitting it. Here o2 won't be split and o3 will be in 3 sub-pictures.
autocrop and grayscale filters output a single picture both. a1_1.jpg being outputed from m1_1.jpg after this one has been autocropped and a1_2.jpg from m1_2.jpg etc...
The finale result will be output001.jpg, output002.jpg... output006.jpg

Options:
  -h, --help
      show this help message and exit
  -f FILTERS, --filters=FILTERS
      List of filters to apply on pictures: FILTER=<filter1[=parameter1:parameter2:...],filter2,...> where filterN can be one of : autocrop|a, autocontrast|c, equalize|e, fitpage|f, gather, grayscale|g, sharpen|h, align|l, rotate|r, mhsplit|mhs, splitpages|s, transpose|t. Use filterN=h|help to get more details on the filter and its parameters
  -o OUTPUT_PREFIX, --output=OUTPUT_PREFIX
      Output prefix for output files
  -i, --infos
      Show infos
  -v, --version
      Show version
---
At the moment I only provide a .exe file, but I'll provide the source code as soon as I have cleaned up some stuff and chosen a license for it.
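To make the `-f` syntax from the help text concrete (`filter1[=parameter1:parameter2:...],filter2,...`), here is a minimal Python sketch of how such a string could be broken down. It only illustrates the syntax: `parse_filters` is a hypothetical helper, not code from cpu, and it assumes parameters are kept as plain strings.

```python
def parse_filters(spec):
    """Split 'name[=p1:p2:...],name2,...' into (name, [params]) pairs.

    Hypothetical helper for illustration only; cpu's real parser is not published.
    """
    filters = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue  # ignore empty entries such as a trailing comma
        name, sep, params = item.partition("=")
        # No '=' means the filter runs with its default parameters.
        filters.append((name, params.split(":") if sep else []))
    return filters


print(parse_filters("s,a,fitpage=800:600,g=4"))
# → [('s', []), ('a', []), ('fitpage', ['800', '600']), ('g', ['4'])]
```

Commas separate filters and colons separate the parameters of one filter, which is why a call like `fitpage=800:600` can sit in the middle of a chain.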
None were injured in making this :)

I wrote a small .bat file so that you can drag and drop images onto it and have cpu process your files.

The main features atm are autocrop, autocontrast, equalize, fitpage, gather, grayscale, sharpen, align, rotate, mhsplit, splitpages, transpose. Most are self-explanatory. I advise using cpu -f FILTER=help to get the full help on each.

I don't remember my best combinations :) but you can do stuff like: autosplit each page horizontally into multiple pictures, autocrop them, autorotate them if the scan was not aligned, gather them if several fit on the screen, convert them to grayscale, and rotate them to read in landscape. All these filters can be combined. They are applied to your list of pictures, and this list evolves after each filter is applied: it can grow or shrink as sub-images are generated and others are gathered.

It's hard to find good settings so that a whole comic is processed the way you want, but I tried to make many parameters available to tweak each filter. The output is all the pictures generated after all filters have been applied in the order given on the command line. Then you just need to convert them to your preferred format. I wrote a very simple lrf maker with no processing at all for this (processing is the job of CPU in my mind) and I'll release it too.

CPU probably has bugs, but I have already used it a few times, and it should work at least a bit :) . I don't know if I'll have time to fix things you might find, but I will try, and the source code will be there for that too.

Here is v0.1 : http://b1shop.ahau-kin.org/cpu/cpu-0.1.zip
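The way the picture list grows and shrinks as filters are chained can be sketched in a few lines of Python. This is only a model of the idea described above, not cpu's code: pictures are stand-in strings, and `split_in_two`, `to_gray` and `gather_pairs` are hypothetical filters invented for the example.

```python
def run_pipeline(pictures, filters):
    """Apply each filter in order; every filter maps a list to a new list."""
    for flt in filters:
        pictures = flt(pictures)
    return pictures


def split_in_two(pictures):
    # One picture in, two out (stand-in for a splitting filter).
    return [part for p in pictures for part in (p + "_1", p + "_2")]


def to_gray(pictures):
    # One picture in, one out.
    return ["g(" + p + ")" for p in pictures]


def gather_pairs(pictures):
    # Two successive pictures in, one out (stand-in for a gathering filter).
    return ["+".join(pictures[i:i + 2]) for i in range(0, len(pictures), 2)]


result = run_pipeline(["o1", "o2"], [split_in_two, to_gray, gather_pairs])
print(result)
# → ['g(o1_1)+g(o1_2)', 'g(o2_1)+g(o2_2)']
```

Because every filter takes the whole list and returns a whole list, the order on the command line matters: the next filter only ever sees what the previous one produced.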
07-14-2009, 02:59 AM | #2 |
Still wondering why
Posts: 253
Karma: 800
Join Date: Jun 2009
Location: Athens, Greece
Device: PRS 505, (BlackBerry Bold ?)
|
Frediz, you're my hero!
Although I doubt I've understood how to use it, I had been looking for such a tool for two weeks and was wondering why nobody had implemented such a utility already. I'll try it asap and give you feedback. Thanks anyway for sharing. MR is an awesome community.
07-14-2009, 05:59 AM | #3 |
Member
Posts: 14
Karma: 10
Join Date: Mar 2009
Device: prs 505
|
Hey Kostas,
well, I wrote a lot to describe CPU and it may seem complicated and messy, but just play with it and you'll get it quickly. Start simple: autocrop one image:
Code:
cpu --filters autocrop pic1.jpg
Code:
cpu -f a pic1.jpg
Code:
cpu --filters autocrop=help (cpu -f a=h)
Code:
cpu -f a=50 pic1.jpg
(default value was 64)
Code:
cpu -f a=50 pic1.jpg pic2.jpg pic3.jpg pic4.jpg pic5.jpg
and customize it with your options and then just drag/drop all your files on it.
Let's imagine you have scanned double pages. You might want to split them into single pages first, and then autocrop each single page:
Code:
cpu -f splitpages,a <all_pics>
Code:
cpu -f a,splitpages <all_pics>
but this will all work without changing colors, and if you need them in gray, let's say 16 levels of gray (=2^4, default is 2^16), then you'll add this filter after all the previous ones:
Code:
cpu -f s,a,grayscale=4 <all_pics>
Code:
cpu -f s,a,fitpage,g=4 <all_pics>
Code:
cpu -f s,a,fitpage=800:600,g=4 <all_pics>
are not needed for simple use, but this was a quick how-to to get started. I hope you'll find what you need.
F.
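For readers wondering what "16 levels of gray (=2^4)" means in practice, here is a small Python sketch of reducing an 8-bit gray value to 2^bits levels. How cpu does this internally is not documented in the thread, so `quantize_gray` is a hypothetical helper and the bucket arithmetic is an assumption (it also assumes 1 <= bits <= 8).

```python
def quantize_gray(value, bits):
    """Map an 8-bit gray value (0..255) onto 2**bits evenly spaced levels.

    Illustration only, not cpu's code. Assumes 1 <= bits <= 8.
    """
    levels = 2 ** bits
    step = 256 // levels                 # width of each input bucket
    index = value // step                # which bucket the pixel falls in
    return index * 255 // (levels - 1)   # spread the buckets back over 0..255


# With bits=4 only 16 distinct output values remain.
print(len({quantize_gray(v, 4) for v in range(256)}))
# → 16
```

Black stays at 0 and white stays at 255; everything in between snaps to the nearest lower bucket, which is what makes the output suitable for a 16-gray e-ink screen.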
07-14-2009, 08:10 AM | #4 |
Still wondering why
Posts: 253
Karma: 800
Join Date: Jun 2009
Location: Athens, Greece
Device: PRS 505, (BlackBerry Bold ?)
|
Thanks a lot Frediz,
Actually, I've been playing with your fantastic tool for quite a while! Great job! It does amazing things so quickly!! (btw, I doubt that such a tool could ever get a GUI...)

Two remarks:

1. It seems that your initial instructions about the mhs filter (the most impressive feature, which splits the comic strip horizontally) could be outdated (maybe from a previous version?). After several tests, I didn't find a way to pass mhs a parameter for the number of pictures into which the original must be split. But I found that you can set the sensitivity threshold used to autosplit the picture (probably based on white balance). This is the instruction that is given with cpu -f mhs=help:

Usage : mhsplit|mhs[=m[:s]]
Description : Splits image horizontally at blank lines.
m : min white : all pixels of line should have higher value to be a split line (Default=210)
s : min size region : the splitted region won't be less than that (Default=140)

I get good results in my tests with a value of 15-30. So, it gives something like this:

cpu -o out -f mhs=20 pic1.jpg

One has to play with the value of the parameter to optimize the result, although some failures needing fine tuning are unavoidable. If you want a rotated, greyscale, autocropped, horizontally autosplit comic, it goes like this:

cpu -o out -f r,g,a,mhs=20 pic1.jpg

2. My only small disappointment is the -f s (autosplit) filter. Unlike mhs for horizontal splitting, it splits vertically into equal parts, with no autodetection logic. If one wants "intelligent" vertical autosplitting, the solution is to apply mhs after a rotation and then rotate back afterwards...

Anyway, it's a Great Tool! again
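The "rotate, split horizontally, rotate back" workaround can be sketched in pure Python. Note the substitutions: the sketch uses a transposition instead of a rotation (it has the same effect of turning columns into rows), pictures are plain lists of gray-value rows, and `split_rows` is a simplified stand-in for mhs that simply drops blank rows. None of this is cpu's actual code.

```python
def split_rows(rows, m=210):
    """Cut a list of pixel rows wherever a row is entirely lighter than m."""
    parts, current = [], []
    for row in rows:
        if min(row) > m:          # blank row: close the current region
            if current:
                parts.append(current)
                current = []
        else:
            current.append(row)
    if current:
        parts.append(current)
    return parts


def transpose(rows):
    # Columns become rows and rows become columns.
    return [list(col) for col in zip(*rows)]


def split_columns(rows, m=210):
    """Vertical split: transpose, split horizontally, transpose each part back."""
    return [transpose(part) for part in split_rows(transpose(rows), m)]


page = [
    [0, 0, 255, 0],   # two dark columns, one white gutter, one dark column
    [0, 0, 255, 0],
]
print(split_columns(page))
# → [[[0, 0], [0, 0]], [[0], [0]]]
```

The white gutter column is detected exactly like a blank horizontal line would be, so the same threshold logic serves both directions.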
07-14-2009, 11:02 AM | #5 | ||
Member
Posts: 14
Karma: 10
Join Date: Mar 2009
Device: prs 505
|
Quote:
Btw, m=20? You must have dark pics?! As you guessed, the m parameter is a threshold. Colored pictures are converted to gray levels from 0 to 255, 0 being black and 255 white. So for a line to be considered OK, i.e. for the pic to be split at this line, all of its pixels should have a value greater than m.

Actually, splitting the current picture into a fixed number of subpictures is doable. But as with the threshold, in the examples I tested the pages cannot always be split the same way. And since the number of subpics is not always the same across pages, it had (at first sight) the same problem as the threshold. A parameter could be a maximum number of images to split into. Could that give good results? It would just be slower, because the simple algorithm would be to test different thresholds. I thought of it too but didn't implement it... not hard at first sight.

I don't know if you played with the s option. It prevents the split from producing sub-images smaller than s in the split direction. To be clear, with the default s=140, the height of images produced by mhs won't be less than 140 pixels.

Checking the code, there are hidden features. mhs = min based horizontal splitting. I see some more filters:
- ahs : average based horizontal splitting
- avs : average based vertical splitting
So splitting can also be done based on an average threshold, and vertically too. But the average-based results were not satisfying to me. Min is simple and gave good results. But all of this, once again, was just my feeling for my own needs. Some may have other needs and be interested in this. Anyway, I can add an "mvs"; it wouldn't be hard, I think.
Quote:
Would an mvs equivalent be OK for you?
Last edited by frediz; 07-14-2009 at 11:08 AM.
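The roles of m and s described above can be sketched as follows. This is a guess at the logic based on the mhs help text (m = minimum white level, s = minimum region size), not the real implementation; `find_cuts` is a hypothetical name and pictures are modelled as lists of gray-value rows.

```python
def find_cuts(rows, m=210, s=140):
    """Return the row indices where the page may be cut.

    rows: list of rows, each a list of gray values (0 = black, 255 = white).
    A row qualifies when every pixel is lighter than m and the region
    since the previous cut (or the top of the page) is at least s rows high.
    Illustration only: the final region below the last cut is not checked.
    """
    cuts = []
    last = 0
    for y, row in enumerate(rows):
        if min(row) > m and y - last >= s:
            cuts.append(y)
            last = y
    return cuts


dark, white = [30, 40, 50], [255, 255, 255]
page = [dark] * 3 + [white] * 2 + [dark] * 3 + [white] + [dark]

print(find_cuts(page, m=210, s=3))   # → [3, 8]
print(find_cuts(page, m=210, s=6))   # → [8]
```

Raising s removes the first cut because the strip above it would be only 3 rows high. Lowering m (for example to 20) makes far more rows count as "blank", which is why a low value mostly makes sense on dark scans.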
||
07-15-2009, 04:34 AM | #6 | ||
Still wondering why
Posts: 253
Karma: 800
Join Date: Jun 2009
Location: Athens, Greece
Device: PRS 505, (BlackBerry Bold ?)
|
Hi frediz, thanks again for the precious info!
No! Maybe the low value is due to the fact I was trying to split plain A4+ pages (750 x 1000 pixels) with 3 or 4 rows of comic strips per page. Quote:
After a few tests, I got 46 pages automatically split horizontally in 30 secs!! Cpu's algorithm was "intelligent" enough to autodetect whether there were 3 or 4 rows, with only one failure!! Much better than anything I could get with papercrop which, besides, degrades the image quality a lot, despite my efforts to tweak it. Quote:
Besides, I'm pretty sure that I don't get the difference between mean and average splitting! Frediz, you rock! PS: With an effort on producing clearer instructions, I think this topic deserves a sticky status. |
||
07-15-2009, 10:34 AM | #7 | ||
Member
Posts: 14
Karma: 10
Join Date: Mar 2009
Device: prs 505
|
Quote:
additional time) of rotating back and forth the pictures with mhs. Quote:
Actually, splitting is done this way: horizontal lines of pixels are read one by one from top to bottom. All the pixel colors of the line are converted to a gray level, from black (value 0) to white (255). When the current line meets a criterion (provided other criteria such as s are also OK), the picture is cut there. The criterion can be:
- minimum criterion (mhs): the pixels should all be lighter than a given value m. This way of doing it will rarely cut at dark lines in the picture (depending on your m).
- average criterion (ahs): an average of all the pixel values of the line is computed, and it shouldn't be too dark for the line to be cut. This means that a white line with a few black pixels could still be cut. This criterion is weaker than the one used by mhs, so your pictures could be cut in more places, not necessarily the right ones. It depends.
Sorry for my bad English, I hope it's understandable though.
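The difference between the two criteria is easy to see on a single line of pixels. The sketch below is only an illustration of the description above: `passes_min` and `passes_average` are hypothetical names, and the threshold parameter of ahs (called `a` here) is an assumption, since its real name and default are not given in the thread.

```python
def passes_min(row, m):
    # mhs-style test: every pixel of the line must be lighter than m.
    return min(row) > m


def passes_average(row, a):
    # ahs-style test: only the mean gray level of the line must be lighter than a.
    return sum(row) / len(row) > a


row = [255] * 98 + [0, 0]   # a white line crossed by two black pixels

print(passes_min(row, 210), passes_average(row, 210))
# → False True
```

Two black pixels are enough to block the minimum test, while the average (249.9) stays well above the threshold. That is why the average criterion cuts in more places, including across thin lines of artwork.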
||
07-15-2009, 11:28 AM | #8 |
Final Five n°42
Posts: 789
Karma: 3599
Join Date: Feb 2008
Location: Lyon, France
Device: Cybook Gen3
|
This seems a pretty interesting tool.
As soon as I find the time to retrieve comics or mangas to read on my Cybook, I'll give it a shot.
07-15-2009, 12:53 PM | #9 | ||
Still wondering why
Posts: 253
Karma: 800
Join Date: Jun 2009
Location: Athens, Greece
Device: PRS 505, (BlackBerry Bold ?)
|
Quote:
Quote:
Now I get the picture... Don't worry, mine is worse... |
||
07-15-2009, 01:07 PM | #10 | |
Still wondering why
Posts: 253
Karma: 800
Join Date: Jun 2009
Location: Athens, Greece
Device: PRS 505, (BlackBerry Bold ?)
|
Quote:
It's incredible how easily and quickly you can perform batch operations on series of pictures, like:
- rotation (choosing the angle!)
- autocrop
- convert to grayscale
- sharpen
- autosplit
- change dimension (fitpage)
- and so on...
So I am personally using it for a lot of other purposes... Highly recommended!
|
07-15-2009, 01:27 PM | #11 |
Wizard
Posts: 1,462
Karma: 6061516
Join Date: May 2008
Location: Cascais, Portugal
Device: Kindle PW, Samsung Galaxy Note Pro 12.2", OnePlus 6
|
I would really, really love a simple GUI for this tool!
|
08-10-2009, 05:05 PM | #12 |
Wizard
Posts: 1,462
Karma: 6061516
Join Date: May 2008
Location: Cascais, Portugal
Device: Kindle PW, Samsung Galaxy Note Pro 12.2", OnePlus 6
|
I really would!
|
08-11-2009, 05:45 AM | #13 |
Member
Posts: 14
Karma: 10
Join Date: Mar 2009
Device: prs 505
|
hehe Over
Actually I'm really busy with other projects, and even if a GUI wasn't my goal with CPU, I wouldn't prevent someone from making one. Didn't you try the drag'n'drop stuff? Can't you manage by putting some options on the command in the batch file? F.
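For those not comfortable editing a .bat file, the same drag'n'drop idea can be sketched as a small Python wrapper. The .bat shipped with cpu is not shown in this thread, so this is a guess at an equivalent: it assumes an executable named `cpu` is on the PATH, and it only uses the `-o` and `-f` options shown in the help output. `build_command` and `main` are hypothetical names.

```python
import subprocess
import sys

# Filter chain taken from the examples earlier in the thread; edit to taste.
FILTERS = "s,a,fitpage=800:600,g=4"
PREFIX = "output"


def build_command(files, filters=FILTERS, prefix=PREFIX, exe="cpu"):
    """Build the argument list: cpu -o PREFIX -f FILTERS file1 file2 ..."""
    return [exe, "-o", prefix, "-f", filters, *files]


def main(argv):
    if not argv:
        print("usage: drop image files on this script")
        return 1
    try:
        return subprocess.run(build_command(argv)).returncode
    except FileNotFoundError:
        # Assumption: cpu must be reachable through PATH.
        print("cpu executable not found on PATH")
        return 1


if __name__ == "__main__":
    main(sys.argv[1:])
```

Files dropped onto the script arrive as command-line arguments, so changing the processing only requires editing the `FILTERS` string at the top.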
03-30-2012, 08:56 AM | #14 |
Junior Member
Posts: 1
Karma: 10
Join Date: Mar 2012
Device: Kindle 4
|
hi !
thanks for this tool ! that seems awesome ! can someone upload this anywhere ? i can't download it from the link Last edited by Chgros; 03-30-2012 at 09:31 AM. |
03-30-2012, 09:00 AM | #15 |
Guru
Posts: 775
Karma: 1043626
Join Date: Dec 2010
Location: York, Pa
Device: Kindle Fire 10", Honor 8 Android Phone
|
Thanks for the cool tool!
|