Quote:
Originally Posted by HarryT
Because people can easily do "pattern recognition" tasks which are extremely difficult for a computer.
You could say "If a line starts in a capital letter then it's probably a new paragraph", I suppose. It wouldn't be 100% reliable, but it would be a good start.
|
The previous line being shorter than usual would also be a clue. Unfortunately, line length has become a rather poor indicator: the original text often assumed a mono-spaced font, which is almost never the case today. But if you assume the text was mono-spaced, you can count characters and determine whether the first word of the next line would have fitted on the previous one. If it would have, the break was probably deliberate. That is certainly beyond simple regular expressions.
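To make the idea concrete, here is a rough Python sketch combining both clues (capital letter, plus "the next word would have fitted"). It is only an illustration under stated assumptions: the function name `find_paragraph_starts` is made up, the text is assumed to have been wrapped for a mono-spaced font, and the wrap width is guessed as the length of the longest line unless you pass one in.

```python
def find_paragraph_starts(lines, width=None):
    """Return indices of lines that probably begin a new paragraph.

    Assumption: the text was hard-wrapped for a mono-spaced font, so
    character counts stand in for line width.
    """
    lines = [ln.rstrip() for ln in lines]
    if width is None:
        # Guess the wrap width from the longest line (an assumption).
        width = max((len(ln) for ln in lines), default=0)

    starts = []
    for i, line in enumerate(lines):
        if not line:
            continue  # blank lines are separators, not paragraph starts
        if i == 0 or not lines[i - 1]:
            starts.append(i)  # first line, or first line after a blank
            continue
        first_word = line.split()[0]
        prev = lines[i - 1]
        # Would this line's first word (plus a space) have fitted on the
        # previous line? If so, the break was probably deliberate.
        would_have_fitted = len(prev) + 1 + len(first_word) <= width
        if would_have_fitted and first_word[0].isupper():
            starts.append(i)
    return starts


sample = [
    "The quick brown fox jumps over the lazy",
    "dog and then runs away.",
    "Another paragraph starts here and fills",
    "the line nicely too.",
]
print(find_paragraph_starts(sample))  # [0, 2]
```

It will still be fooled by short lines followed by a proper noun, by headings, and by text that was never mono-spaced, so treat the result as a list of candidates to check by eye.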
Dale