01-08-2010, 07:30 AM   #2
rogue_ronin
Banned
It's basically recognizing patterns.

If you can tell a program to look for patterns, you can have the program change them, record them, reproduce them, or do whatever else you want with them.

Typically, regular expressions are used for search and replace operations, including replacing with nothing (thus deleting).

Imagine you have a text file. In that file are recurring lines that contain nothing but a number (page numbers, for instance). They don't add anything to an ebook; in fact, they interrupt the reading experience:

Code:
blah blah blah I'm a great shaper of words blah
214
blahbety blah geflugehh cthulhu
Using regular expressions (regex) you can search for the pattern "any line that has only a number on it" and replace it with nothing. That pattern might look like ^\d+\n or other variations. The ^ means "at the beginning of the line" (many tools need a multiline option turned on for this); the \d means "a digit"; the + means "one or more"; and the \n means "newline". Together they identify the pattern "one or more digits occurring at the beginning of a line, followed immediately by a newline". (Files with Windows line endings may need \r?\n at the end instead, since there a carriage return, \r, comes before the newline.)
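
To make that concrete, here is a minimal sketch in Python (my choice of language; any regex-capable editor or tool would do the same job). It runs the ^\d+\n pattern over the sample text above. The variable names are just for illustration, and the re.MULTILINE flag is what makes ^ match at the start of every line rather than only at the start of the whole text.

```python
import re

# The sample text from above, with a stray page number on its own line.
text = (
    "blah blah blah I'm a great shaper of words blah\n"
    "214\n"
    "blahbety blah geflugehh cthulhu\n"
)

# ^    start of a line (because of re.MULTILINE)
# \d+  one or more digits
# \n   the newline that ends the line
page_number_line = re.compile(r"^\d+\n", re.MULTILINE)

# Replace every match with nothing, i.e. delete it.
cleaned = page_number_line.sub("", text)
print(cleaned)
```

Only the line holding 214 is removed. A number in the middle of a sentence is left alone, because it does not start a line and is not followed directly by a newline.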

Once you get familiar with the symbols, you can identify any recurring pattern and, as you get more sophisticated, you can extract information from those patterns and restructure and replace it.
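
As a rough sketch of the "extract and restructure" idea, again in Python: parentheses in a pattern capture the part they match, and you can reuse the captured pieces in the replacement. The chapter-heading format below is made up purely for illustration, so adjust the pattern to whatever your own text actually looks like.

```python
import re

# Hypothetical input: headings in the form "CHAPTER <number> - <title>".
text = (
    "CHAPTER 3 - The Storm\n"
    "It rained.\n"
    "CHAPTER 4 - The Calm\n"
)

# (\d+) captures the chapter number, (.+) captures the title.
# With re.MULTILINE, ^ and $ match at the start and end of each line.
heading = re.compile(r"^CHAPTER (\d+) - (.+)$", re.MULTILINE)

# Extract: a list of (number, title) pairs.
chapters = heading.findall(text)

# Restructure: \1 and \2 refer back to the captured groups.
restructured = heading.sub(r"\2 (chapter \1)", text)

print(chapters)
print(restructured)
```

The line "It rained." passes through untouched because it does not match the heading pattern.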

It's really worth learning, especially if you are doing proofreading and editing of texts.

m a r