01-07-2009, 09:08 PM   #52
tompe
Grand Sorcerer
tompe ought to be getting tired of karma fortunes by now.
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköping, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
Quote:
Originally Posted by llasram View Post
If the '1' bit is set, and there are no actual multibyte characters in the text, then each record will end with a NUL byte indicating 0 overlapping bytes. (Well, unless one of bits 4-8 is set on the "size & flags" byte.)
I am not sure I fully understand. If bit "1" is set, is the last byte in the record always related to multibyte characters?

My code currently looks like this, and I wonder whether it reflects a correct understanding:
Code:
                eval {
                    # Return the smaller of two numbers.
                    sub min { return ($_[0]<$_[1]) ? $_[0] : $_[1] }
                    my $maxi = min($#$recs, $header->{'records'});
                    for( my $i = 1; $i <= $maxi; $i ++ ) {
                        my $data = $recs->[$i]->{'data'};
                        my $len = length($data);
                        my $overlap = "";
                        if ($self->{multibyteoverlap}) {
                            # The last byte is the "size & flags" byte; ord() gives
                            # its numeric value (int() on a raw byte would not).
                            my $c = ord(chop($data));
                            print STDERR "I:$i - $len - $c\n";
                            my $n = $c & 7;
                            # chop removes bytes from the end, so $overlap
                            # collects them in reverse order here.
                            foreach (0..$n-1) {
                                $overlap .= chop $data;
                            }
                        }

                        $body .= _decompress_record( $header->{'version'},
                                                     $data );
                        $body .= $overlap;
                    }
                };
Why are three bits used for the size if the maximum size is 3? (I now see that I have reversed the byte order in $overlap.)
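For reference, here is a minimal sketch of how the trailing bytes could be split off without reversing them, using substr instead of repeated chop. The helper name split_overlap and its $mask parameter are made up for illustration, and it assumes the last byte of the record is the "size & flags" byte whose low bits hold the overlap count. Which mask is right is exactly the open question: 3 (two bits) already covers counts 0 to 3, while 7 (three bits) would allow up to 7.

```perl
use strict;
use warnings;

# Hypothetical helper: split a raw record into (payload, overlap).
# Assumption: the final byte is the "size & flags" byte, and the bits
# selected by $mask give the number of overlap bytes just before it.
sub split_overlap {
    my ($record, $mask) = @_;
    return ($record, '') if length($record) == 0;

    my $flags = ord(substr($record, -1));   # numeric value of the last byte
    my $n     = $flags & $mask;             # number of overlap bytes

    # Guard against a count larger than the record itself.
    my $avail = length($record) - 1;
    $n = $avail if $n > $avail;

    # substr keeps the overlap bytes in their original order.
    my $overlap = substr($record, -1 - $n, $n);
    my $payload = substr($record, 0, $avail - $n);
    return ($payload, $overlap);
}

# Example: "abc", two overlap bytes (C3 A9), then a flags byte of 2.
my ($payload, $overlap) = split_overlap("abc\xC3\xA9\x02", 7);
print "payload=$payload overlap=", unpack("H*", $overlap), "\n";
```

A record that ends in a NUL byte comes back with an empty overlap, which matches the "0 overlapping bytes" case described in the quote.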