09-01-2009, 10:18 AM   #17
ahi
Can I run something by you, Jellby? (And by whoever else may be reading.)

A lot of what I am trying to do with pacify.py is going to be text processing... but at the same time, I do want to be able to handle some light formatting--bold, italics, maybe a bit more.

Unfortunately, any obvious way of handling formatting gets in the way of straightforward text processing. For example, once I have HTML tags, HTML entities, or LaTeX commands in there, it gets harder to find out what the first character of the next paragraph is, because the formatting portions have to be skipped or escaped first.
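
To make the problem concrete, here is a minimal sketch. The sample strings and both helper names (`first_char`, `first_char_skipping_tags`) are made up for illustration, and the regex is only a crude stand-in for real tag handling: it ignores entities such as `&amp;` and is not an HTML parser.

```python
import re


def first_char(paragraph):
    """Return the first character of a paragraph, naively."""
    return paragraph[0]


plain = "Isn't that the reason we're here?"
marked_up = "<i>Isn't</i> that the reason we're here?"

print(first_char(plain))      # I
print(first_char(marked_up))  # <  (the markup, not the text)


def first_char_skipping_tags(paragraph):
    """Strip anything that looks like a tag before looking at the text.

    Assumption for illustration only: a tag is '<' ... '>' with no '>' inside.
    """
    return re.sub(r"<[^>]+>", "", paragraph)[0]


print(first_char_skipping_tags(marked_up))  # I
```

Every text operation ends up needing its own version of that tag-skipping step, which is the part I would like to avoid.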

I have a vague idea about creating a class in Python that would serve both the formatting and the text-processing side, by keeping content in the following manner:

For any string of length X, it would store two strings of length X. The first would be the plaintext; the second would be a string of byte-long bitfields, one per character, holding the formatting information for the character at the same position.

Or, to give a dumbed-down view, instead of:

Code:
Isn't <i>that</i> the reason we're <b>here</b>?
it would be:

Code:

String 1: Isn't that the reason we're here?
String 2: 000000IIII000000000000000000BBBB0
And then any operation done on the plaintext (via the class's methods) would perform the equivalent operation on the formatting string. This way content and formatting could be dealt with separately without having to painstakingly escape formatting instructions for any text-processing operation.
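
A rough sketch of what such a class might look like, just to show the idea and not a finished design. Everything named here is hypothetical: the class name `FormattedText`, the flag values `BOLD` and `ITALIC`, and the methods. It only understands `<b>` and `<i>`, does no HTML escaping, and mirrors just slicing, concatenation, and `lstrip`.

```python
import re

BOLD = 0x01    # bit 0 of the per-character format byte
ITALIC = 0x02  # bit 1
_TAGS = {"b": BOLD, "i": ITALIC}


class FormattedText:
    """Plaintext plus a parallel byte string of per-character format flags."""

    def __init__(self, text="", flags=None):
        self.text = text
        # Default: one zero byte (no formatting) per character.
        self.flags = bytes(len(text)) if flags is None else bytes(flags)
        if len(self.text) != len(self.flags):
            raise ValueError("text and flags must have the same length")

    @classmethod
    def from_html(cls, markup):
        """Parse <b>/<i> tags only; all other markup is out of scope here."""
        text, flags, current = [], bytearray(), 0
        for token in re.split(r"(</?[bi]>)", markup):
            m = re.fullmatch(r"<(/?)([bi])>", token)
            if m:
                bit = _TAGS[m.group(2)]
                # A closing tag clears the bit, an opening tag sets it.
                current = current & ~bit if m.group(1) else current | bit
            else:
                text.append(token)
                flags.extend([current] * len(token))
        return cls("".join(text), flags)

    def __len__(self):
        return len(self.text)

    def __getitem__(self, index):
        # Turn an integer index into a one-character slice, then slice
        # text and flags identically so they stay aligned.
        if isinstance(index, int):
            index = slice(index, index + 1 or None)
        return FormattedText(self.text[index], self.flags[index])

    def __add__(self, other):
        return FormattedText(self.text + other.text, self.flags + other.flags)

    def lstrip(self):
        # Drop the same number of leading characters from both strings.
        cut = len(self.text) - len(self.text.lstrip())
        return self[cut:]

    def flag_view(self):
        """Render the flags like 'String 2' above ('*' = bold and italic)."""
        names = {0: "0", BOLD: "B", ITALIC: "I"}
        return "".join(names.get(f, "*") for f in self.flags)

    def to_html(self):
        """Emit markup, closing and reopening tags whenever the flags change."""
        def tags(fl, closing=False):
            names = [n for n, bit in _TAGS.items() if fl & bit]
            if closing:
                return "".join("</%s>" % n for n in reversed(names))
            return "".join("<%s>" % n for n in names)

        out, current = [], 0
        for ch, fl in zip(self.text, self.flags):
            if fl != current:
                out.append(tags(current, closing=True) + tags(fl))
                current = fl
            out.append(ch)
        out.append(tags(current, closing=True))
        return "".join(out)


s = FormattedText.from_html("Isn't <i>that</i> the reason we're <b>here</b>?")
print(s.text)             # Isn't that the reason we're here?
print(s.flag_view())      # 000000IIII000000000000000000BBBB0
print(s[0].text)          # I
print(s[6:10].to_html())  # <i>that</i>
```

The point of the sketch is that finding the first character is just `s[0].text`, with no tag-skipping, while the formatting survives slicing and can be written back out at the end. Operations that change the length of the text (search and replace, case folding in some languages) would each need a decision about which flags the new characters get, and that part is not covered here.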

What do you think? Any chance that this is a better way than the obvious alternative of using HTML or something similar internally?

- Ahi