#1
Enthusiast
Posts: 42
Karma: 2685
Join Date: Aug 2004
Device: Kindle Voyage
Stripping extra line returns
What are your suggestions for a filter to strip out unwanted line returns embedded within a paragraph? I keep having that problem with PDF conversions.
#2
Little Fuzzy Soldier
Posts: 580
Karma: 5711
Join Date: Sep 2008
Location: Nowhere in particular.
Device: cybook gen3, htc hero, ipaq 214
In Word, with "Use wildcards" checked in Find and Replace, e.g.:
Find: ([a-z,;'])^13([a-zA-Z])
or: ([a-z,;'])^13^10([a-zA-Z])
Replace: \1^32\2
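For anyone without Word, the same idea can be sketched in Python. This is a rough equivalent rather than an exact translation of the Word search: the function name `join_mid_sentence_breaks` is made up for illustration, and lookarounds are used in place of the `\1` and `\2` groups so the neighbouring characters are checked but left untouched.

```python
import re

# A line break is treated as "unwanted" when the previous character is a
# lowercase letter, comma, semicolon or apostrophe and the next character
# is a letter. This mirrors the character classes in the Word search above.
# \r?\n covers both Windows (CRLF) and Unix (LF) line endings.
MID_SENTENCE_BREAK = re.compile(r"(?<=[a-z,;'])\r?\n(?=[a-zA-Z])")


def join_mid_sentence_breaks(text: str) -> str:
    """Replace mid-sentence line breaks with a single space."""
    return MID_SENTENCE_BREAK.sub(" ", text)


if __name__ == "__main__":
    sample = (
        "The rain in Spain falls\n"
        "mainly on the plain.\n"
        "A new paragraph starts here."
    )
    print(join_mid_sentence_breaks(sample))
```

Breaks that follow a full stop, question mark or closing quote are left alone, so most real paragraph ends survive. A sentence that happens to end exactly at a line end inside a paragraph will also keep its break.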
#3
Reader
Posts: 11,504
Karma: 8720163
Join Date: May 2007
Location: South Wales, UK
Device: Sony PRS-500, PRS-505, Asus EEEpc 4G
Replacing Breaks using Word
Paste the text file into a Word document. Save it. Open Find and Replace (in the “Edit” menu). You need to make three passes with it.

1. Find ^p^p (i.e. two paragraph marks; the symbols are under More, then Special, and the paragraph mark is at the top of the list). Replace with @. (These act as placeholders to preserve the real paragraphs.)
2. For the second pass, find ^p (a single paragraph mark). Replace with a single space.
3. For the third pass, find @ and replace with either a single paragraph mark, or two paragraph marks if you prefer an empty line between paragraphs.

Stingo’s Word Macro
This does a good job of removing the hard line breaks in text files and making a reflowable document. It also sets the font to Times New Roman, 14 point. Once installed, it is very easy to use. Paste your text file into a Word document, save the doc and run the macro. You can find it here, together with instructions on how to install it: https://www.mobileread.com/forums/showthread.php?t=8793
(The only time this has ever given me trouble was when I failed to save the doc before running the macro. It crashed.)

BookCreator and Book Designer
MobileRead member =X= has produced BookCreator. This is a Word template containing a number of useful macros, which speed up the editing process. He envisages using it in connection with Calibre. However, it is entirely feasible to run the macros and then import the edited document into Book Designer (BD). BookCreator is available here: https://www.mobileread.com/forums/showthread.php?t=28313
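The three-pass placeholder method from the start of this post can also be sketched outside Word. The Python below is a minimal illustration under a few assumptions: the function name `unwrap_paragraphs` is hypothetical, real paragraphs are assumed to be separated by an empty line, and a NUL character is used as the default placeholder instead of @, since @ can turn up in ordinary text.

```python
def unwrap_paragraphs(text: str, placeholder: str = "\x00") -> str:
    """Remove single line breaks but keep blank-line paragraph breaks."""
    # Normalise Windows (CRLF) and old Mac (CR) line endings to LF first,
    # so that a paragraph break is always exactly two "\n" characters.
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    # The placeholder must not already be in the text, or pass 3 would
    # invent paragraph breaks that were never there.
    if placeholder in text:
        raise ValueError("placeholder already occurs in the text")

    # Pass 1: protect the real paragraph breaks.
    text = text.replace("\n\n", placeholder)
    # Pass 2: every remaining break is a hard wrap, so turn it into a space.
    text = text.replace("\n", " ")
    # Pass 3: restore the paragraph breaks.
    return text.replace(placeholder, "\n\n")


if __name__ == "__main__":
    wrapped = "line one\nline two\n\nnext paragraph\ncontinues here"
    print(unwrap_paragraphs(wrapped))
```

If the source lines already end in a trailing space, pass 2 will leave a double space at each join; collapsing runs of spaces afterwards would be a separate clean-up step.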
#4
Addict
Posts: 243
Karma: 48
Join Date: Dec 2006
Device: PRS 500 - REB 1200
I don't want Word (either to pay for it or to have it cluttering my hard disk), so I found an alternative.
The problem is that you have to use multiple steps, since you want to get rid of LBs (line breaks) but want to keep LBLBs (double line breaks). I use Notepad++; press CTRL+H to bring up Replace.

1. First pass: replace the desired LBLBs with something else. From: \r\n\r\n (be sure to select Extended under Search Mode below). To: DOUBLEBREAK (or anything else not likely to appear in your text). Run it.
2. Now run this. From: \r\n To: nothing (literally nothing; leave it blank). All the LBs are now deleted.
3. Now run this. From: DOUBLEBREAK To: \r\n\r\n

There you go, all done. I am still working on a way to automate this into a macro or something.

These are coming from PDF, so I use office convert to convert them to RTF, open the result in WordPad and use File, Save to reduce the file size (I have not tried skipping this step yet), then open it in Notepad++, run those replacement routines and save as a text file. Works a treat. I will try saving as RTF too, to see if it makes any difference in size or anything like that.
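One possible way to automate the steps above is a short Python script. This is only a sketch, and it swaps the DOUBLEBREAK placeholder for a single regular expression with lookarounds that matches a \r\n only when it is not next to another \r\n. The names `SINGLE_BREAK` and `strip_single_breaks` are made up for this example. The default joiner is empty, matching the blank "To:" field in step 2; pass a space instead if your lines do not already end in one, otherwise words will run together.

```python
import re

# Match a CRLF that is neither preceded nor followed by another CRLF,
# i.e. a lone line break inside a paragraph. Double breaks are never
# matched, so no placeholder is needed.
SINGLE_BREAK = re.compile(r"(?<!\r\n)\r\n(?!\r\n)")


def strip_single_breaks(text: str, joiner: str = "") -> str:
    """Replace lone CRLF line breaks with `joiner`, keeping double breaks."""
    # A function is used as the replacement so that backslashes in
    # `joiner` are inserted literally rather than read as escapes.
    return SINGLE_BREAK.sub(lambda _match: joiner, text)


if __name__ == "__main__":
    sample = "first line \r\nsecond line\r\n\r\nnew paragraph"
    print(repr(strip_single_breaks(sample)))
```

When reading a real file, open it with `newline=""` so that Python does not translate \r\n to \n before the pattern sees it. Unlike the placeholder method, runs of three or more breaks are left exactly as they are.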