Removing unnecessary paragraph breaks in .txt


citac
10-25-2010, 10:33 PM
I have a problem I hope someone will help me solve. I have a large number of .txt files, mostly fic saved from various websites. Some of it saved fine, and is displayed correctly on my e-reader, but some of it has a paragraph break at the end of each line.

If I try to remove those extra paragraph breaks in Word, it removes all of them, and I end up with one big paragraph with no breaks whatsoever. How do I go about removing the end-of-line breaks while keeping the ones between paragraphs?

Jellby
10-26-2010, 05:42 AM
I guess true paragraph breaks are represented as two consecutive breaks (aka empty line), right?

In that case, you can use something like (without regexp):

Replace all paragraph breaks with "¶" (or some other unused char).
Replace all occurrences of "¶¶" with a paragraph break.
Replace all other occurrences of "¶" with a space.
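
For anyone who would rather script this than click through Word, the three replacements above can be sketched in Python. This is only an illustration, not a tested tool: the function name `unwrap_paragraphs`, the pilcrow placeholder, and the choice to write a true paragraph break back out as an empty line are all my own assumptions.

```python
PLACEHOLDER = "\u00b6"  # pilcrow; assumed not to occur anywhere in the text


def unwrap_paragraphs(text, placeholder=PLACEHOLDER, paragraph_break="\n\n"):
    """Join hard-wrapped lines, keeping empty-line paragraph breaks."""
    if placeholder in text:
        raise ValueError("placeholder already occurs in the text, pick another")
    # Normalise Windows and old Mac line endings so every break is "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Step 1: every break becomes the placeholder.
    text = text.replace("\n", placeholder)
    # Step 2: two placeholders in a row were an empty line, i.e. a real
    # paragraph break, so put a break back.
    text = text.replace(placeholder * 2, paragraph_break)
    # Step 3: any placeholder still left was a mid-paragraph line end.
    text = text.replace(placeholder, " ")
    return text
```

Calling `unwrap_paragraphs("a\nb\n\nc\nd")` should give `"a b\n\nc d"`. A file that ends with a single break, or that has three or more breaks in a row, will end up with a stray space, so some cleanup afterwards may still be needed.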

citac
10-26-2010, 06:16 PM
No, it's text that has short lines,
a break comes after several words
and is very annoying to read on an
e-reader. There are no extra lines in
between. Sometimes I get this,
which is even worse, and
has to be taken care of as well.

Or I get
this, which looks like coding a text
like a poem,
which is the worst.

See how bad it looks? I tried the above on a .txt file and it worked, thank you! I will have to try it out with various files, but I think the extra spaces shouldn't be hard to remove.

(Um, I totally misread this line "I guess true paragraph breaks are represented as two consecutive breaks (aka empty line), right?". Yes, you're right. I'll leave the above representation as an illustration. :blush:)
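
On the leftover extra spaces: one possible cleanup pass, again only a sketch in Python with a made-up function name (`tidy_spaces`), is to collapse runs of spaces and tabs and then trim each line. It assumes you do not want to keep any indentation.

```python
import re


def tidy_spaces(text):
    """Collapse repeated spaces/tabs and trim the ends of every line."""
    # Any run of spaces or tabs becomes a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # After collapsing, a line can have at most one space at each end;
    # remove it. MULTILINE makes ^ and $ match at every line boundary.
    text = re.sub(r"^ | $", "", text, flags=re.MULTILINE)
    return text
```

For example, `tidy_spaces("a   b \n\n  c\td")` should return `"a b\n\nc d"`.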