![]() |
#1 |
Carpe diem, c'est la vie.
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 6,433
Karma: 10773668
Join Date: Nov 2011
Location: Multiverse 6627A
Device: K1 to PW3
|
geekmaster dither formula demystified?
There have been requests, and I have posted, that when I get some time, I will write a bit about how my dither code works. This thread is the beginning of such an effort. There will be more to come later...
As published in the geekmaster kindle video player thread, here is the "raw2gmv" kindle video transcoder: PHP Code:
It uses an in-line formula with no subroutines, to be branch-free (no if-else statements) which makes it cache-friendly. The while statement reads a series of single 800x600 video frames (one byte per pixel). The outer for loop processes entire 800-pixel rows from the current video frame, and the inner for loop processes individual pixels (8-bit grayscale) from the current video row. Inside this outer loop we set xi=y, to convert input rows to output columns, as needed for portrait mode output. The body of code inside the for loops swaps x and y coordinates to rotate the video into portrait mode, and computes a dither table threshold for the current framebuffer pixel position, then computes a single output bit (black or white pixel). Finally, it packs the pixel bit into a byte, which when full (eight sequential pixel bits), it writes that full byte to STDOUT. Here is the loop body (with added white space): PHP Code:
The next statement sets variable o (order) in correspondence with a checkerboard pattern in the framebuffer, representing the output pixel bits needed for a 50-percent grayscale at each resolution step. Then we get a bit complicated, while computing the to variable. The key idea is that some variables are used in the logical expression in a way that enables or disables portions of the overall formula, much like how if-else statements could also be used to do the same thing. The first part of the expression, in parentheses, computes the dither threshold for the current pixel position. Notice how it contains a series of values in a binary progression (&1, &2, &4, &8, &16, &32). These bits are conditionally combined into a 6-bit dither value (0-63). The condition that determines whether these bits remain a shifted one-bit or become a zero-bit is taken from either a shifted y-position or a shifted o value (x xor y), which modified the 50-percent ordered dither mask (in o) to increase or decrease the value depending on its position in the framebuffer. This effectively reproduces the common ordered dither table that is found in most dither routines, without needing a dither table to be precomputed, and without memory lookups (when memory access is very slow compared to real-time computation on modern cached processors). The best way to see how this dither formula works is to fill a table with the values computed by using it, and compare that to a traditional dither table such as in my formula-42 dither logic. The whole point of a dither table is to evenly distribute all values 0 to 63, across a square texture tile, and use that to tile the entire framebuffer. Then each input value is compared against these values, used as a threshold, to determine whether the output pixel is black or white. wb0 is our work-buffer (in-memory version of a framebuffer such as /dev/fb0). wb0[yi*800+xi] is the current input pixel, an 8-bit value. Using scaled-integer arithmetic, we do a brightness and contrast adjustment using predefined values for c=250 and b=120 (as determined empirically for good video image quality). Essentially, we multiple the 8-bit pixel by 63, add 120 (default brightness), divide by 250 (default contrast). Instead of comparing each input value against a dither threshold, we subtract the scaled input value from the dither threshold value so that the result is positive or negative depending on threshold result, then we shift the sign bit with >>8 to extend it down into bottom byte of the computation, making the sign-bit either 0 or 255 (pure black or pure white). And last, we mask the sign bit with &128, and shift it into the output byte. And yes, I did cheat a little in that an if statement determines when to write out that output byte. However, there are no branches while computing each pixel bit. The reason for using a logic expression is that traditional if-else statements will run much slower if it confuses CPU cache branch prediction (which is very simplistic in RISC embedded processors). Some compilers may automatically convert if-else statements into more efficient code, however, that cannot be relied on (especially when using tcc while compiling in the kindle itself). I hope that helps understand this code a little bit better. Notice that this only shows the DITHER step, and does not include extra logic in the gmplay program, needed to deal with variations between different kindle models, and between main and diags modes. You may wish to compare and contrast the gmplay code against raw2gmv, or you can wait until I take time to document that a bit better too. However, until then, read the "theory of operation" details by clicking the appropriate button in that the geekmaster kindle video player first post. Last edited by geekmaster; 01-05-2014 at 09:45 PM. |
![]() |
![]() |
![]() |
#2 |
Carpe diem, c'est la vie.
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 6,433
Karma: 10773668
Join Date: Nov 2011
Location: Multiverse 6627A
Device: K1 to PW3
|
(reserved for more content)
For now, here is my older dither logic that uses a dither table, but does take kindle variations into account: PHP Code:
Last edited by geekmaster; 01-04-2014 at 02:56 PM. |
![]() |
![]() |
Advert | |
|
![]() |
#3 |
Carpe diem, c'est la vie.
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 6,433
Karma: 10773668
Join Date: Nov 2011
Location: Multiverse 6627A
Device: K1 to PW3
|
And here is the relevant code that takes the pre-dithered content piped in from raw2gmv or from a pre-dithered video (gmv) file, and converts it to whatever the framebuffer requirements dictate for the kindle models and boot modes supported (all of them, as of the PW).
PHP Code:
https://www.mobileread.com/forums/sho...d.php?t=177455 Last edited by geekmaster; 01-04-2014 at 03:07 PM. |
![]() |
![]() |
![]() |
#4 |
Carpe diem, c'est la vie.
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 6,433
Karma: 10773668
Join Date: Nov 2011
Location: Multiverse 6627A
Device: K1 to PW3
|
(reserved for 256-color to 16-color dither from newtrix demo)
https://www.mobileread.com/forums/sho...d.php?t=176802 This code in particular will be a good starting point for a "generic" dither routine to be used with any (native mode) app. You feed it a 256-color framebuffer, which does not need to be pre-dithered (unlike real framebuffer content on a K5), and this routine converts it to whatever is needed to be in the real framebuffer of any eink kindle model in any boot mode (probably needs updates to work on PW or PW2). This code uses dither tables, so I will want to convert that to logical formulas as shown in the first post in this thread. Last edited by geekmaster; 01-04-2014 at 03:59 PM. |
![]() |
![]() |
![]() |
#5 |
Carpe diem, c'est la vie.
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 6,433
Karma: 10773668
Join Date: Nov 2011
Location: Multiverse 6627A
Device: K1 to PW3
|
Comments, corrections, or requests for more info are welcome. Please post your thoughts on this subject. Thanks.
|
![]() |
![]() |
Advert | |
|
![]() |
#6 |
Wizard
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 1,379
Karma: 2155307
Join Date: Nov 2010
Location: Goettingen, Germany
Device: Kindle Paperwhite, Kobo Mini
|
Thank you very much for this! I'll need to take some time (and probably a few steps back) to digest this, I guess, but it will be well worth it.
|
![]() |
![]() |
![]() |
#7 |
Carpe diem, c'est la vie.
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 6,433
Karma: 10773668
Join Date: Nov 2011
Location: Multiverse 6627A
Device: K1 to PW3
|
Here is the wikipedia page on ordered dither, as used here:
http://en.wikipedia.org/wiki/Ordered_dithering Here is an 8x8 dither table like we use here, from that wikipedia page: ![]() Note that it contains 64 values, from 1 to 64. In order to simplify the comparison into a simple subtract, giving the threshold result in the sign bit, I subtracted one from these dither table values giving 0 to 63. Technically, an ordered dither really gives an extra value when you take zero into account (hence the 1/65 shown to the left of the image), and I prefer to skip a value in the center of the range so I can still have pure black and pure white area fills. In my case, I handle this with the contrast and brightness values to center and stretch the range for best image quality (which can result in more or less bright and dark areas than in the input image, to be adjusted for best eink compatibility as I did for the default values). In this code the dither threshold values are used to determine whether an output pixel will be pure black or pure white. In my 256-color to 16-color dither code as used in the newtrix demo, the threshold values are used to determine when to increment a 16-bit output to the next brighter value to give the appearance of 256 different shades of gray and to prevent "Mach banding" in gradients such as are common in a cloudy sky image. As I mentioned previously, the important things about a dither table are that it contains a single instance of each value in the supported range (0-63 in my case) and that these values are evenly distributed through the table so that an particular grayscale value will consist of evenly distributed black and white pixels in the output image, in a ratio that gives an average value equal to the value of that pixel in the input image. You can think of a dither table as an 8x8 texture image that can be tiled to fill the framebuffer. In fact, modern video graphics cards have shaders that can do dithering by using just such a dithered texture image, corresponding exactly with our dither table. As you can see in the code snippets posted earlier in this thread, I have code that contains a real dither table, and code that computes the table values as-needed where needed using a logical formula. I tested other dither methods, and I decided that a simple ordered dither like this looks the best for video on the kindle eink displays. One of the demos in newtrix also does a spatio-temporal random dither, which has some interesting artifacts causing severe "ringing" that appear as a trail behind moving objects that oscillates between too bright and too dark while it fades, with tail length depending on kindle model and boot mode. Ordered dither is more appealing and more consistent across all the eink kindle models, IMHO. Last edited by geekmaster; 01-05-2014 at 10:16 AM. |
![]() |
![]() |
![]() |
#8 |
BLAM!
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 13,506
Karma: 26047190
Join Date: Jun 2010
Location: Paris, France
Device: Kindle 2i, 3g, 4, 5w, PW, PW2, PW5; Kobo H2O, Forma, Elipsa, Sage, C2E
|
I'm pretty sure this was linked from here or IRC, so it's probably not news to anyone, but that looked neat too: http://pippin.gimp.org/a_dither/
|
![]() |
![]() |
![]() |
#9 | |
Carpe diem, c'est la vie.
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 6,433
Karma: 10773668
Join Date: Nov 2011
Location: Multiverse 6627A
Device: K1 to PW3
|
Quote:
Dither methods other than ordered dither can look better for static images, but my experimentation shows that for eink displays, it is important to not change pixels too often, which causes temporal aliasing with the eink drivers, with artifacts that vary between kindle models. The result as mentioned previously is a fading trail that pulsates and can be very annoying for movies or video games. However it does look interesting when that effect is used for artistic purposes, like in one of the demo effects used in my newtrix demo. As it turns out, the ordered dither matrix remains anchored to the display, assuring that pixels only change when they really must. However, for displays that do not have such requirements, such as LCD or OLED, adding a temporal component to the dither (pixels not anchored to screen coordinates) can improve the image. The non-anchored sparkle gets averaged out by the eye, and in fact most modern LCD displays actually do some degree of temporal dither to achieve more effective colors than the hardware could do otherwise. But we are working with eink here, where an anchored dither has positive perceptual significance in how it interacts with the eink display hardware and device drivers, hence my choice of dither method. EDIT: Ordered dither is also known as "Bayer dither", as it is called in the linked page above (which also shows that it provides best compression, due to fewer pixels changing between sequential video frames, besides being more "eink compatible" due to less "pulsating" temporal artifacts). Last edited by geekmaster; 01-06-2014 at 12:48 AM. |
|
![]() |
![]() |
![]() |
#10 |
Carpe diem, c'est la vie.
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 6,433
Karma: 10773668
Join Date: Nov 2011
Location: Multiverse 6627A
Device: K1 to PW3
|
Regarding a "karma comment" request (twobob) that we need an "ALL DEVICE" dither wrapper, I agree, but it really needs multiple functions. Besides my setpx() function that sets individual pixels in the dithered screen for highest speed updates of small areas of the screen, I think we need to be able to set 256-color pixels in the work buffer and then call a page update function that dithers the full frame just before calling the eink update function (as can be seen in one of the demos in newtrix). And perhaps a 256-color setpx() type of function that sets 256-color pixels in the work buffer as well.
Actually, I had hoped that people could use my dither routines pretty much as-is (black box style) by lifting them out of my demos, but perhaps they do need to be packaged better for more widespread adoption by this community. I have been doing stuff like this for decades, so it seems obvious to me, but we all come from different backgrounds and cultures, so it is not really fair for me to apply such expectations or desires on this community as a whole. I need to make my stuff more accessible, though I really prefer to spend the time it takes to do all the "finishing touches" working on new stuff and new ideas, and leave the repackaging details to others. This thread is an attempt to begin to bridge that gap. Last edited by geekmaster; 01-05-2014 at 01:39 PM. |
![]() |
![]() |
![]() |
#11 | |
Junior Member
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 2
Karma: 34484
Join Date: Jul 2013
Device: kobo
|
Quote:
All the GIFs on that page; apart from the rightmost which is using error-diffusion are using ordered dithers. The "a dither" methods have a wider range of display'able colors, most of the code on the page is there to let people play around with the formulas; I expect coders to be able to lift the core of it out of the switch statement yielding something close to 6-7 instructions/pixel. (sadly though; on most archs; a LUT will still beat this procedural mask). For higher perceptual quality from an ordred dither; using "void and cluster" to generate the LUT would be among the best options; but that is computationally a _lot_ more complex than the simple pseudo-random patterns used in the methods summed up on my page. /pippin |
|
![]() |
![]() |
![]() |
#12 | |
Carpe diem, c'est la vie.
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 6,433
Karma: 10773668
Join Date: Nov 2011
Location: Multiverse 6627A
Device: K1 to PW3
|
Quote:
In my case, I am supporting older kindles including the DX and DXG here, so relying on magic number multiplication and especially floating point may make it run slower. Of course, the only way to tell for sure is to code it both ways and MEASURE the speeds, in addition to examining the assembler language output from the compiler. The Bayer dither has a lot of research behind it. Reordering the dither threshold elements may make some images (or video) look better, but may also have other image examples where it falls short. A table lookup can be faster if it can all fit in CPU cache and remain resident, but that depends a lot on a number of other factors. A 64-byte 8x8 dither table is certainly small enough if it does not need to continuously keep getting reloaded (such as every time another byte is written to STDOUT with a function call). But for the update rate that all eink kindle models support (about 7.7 FPS), the dither function optimization is not all that critical. I did it the way I did it mostly because I enjoyed doing it that way, which is all that really matters for recreational programming such as this. For others to choose one dither method over another makes no difference really, as long as they select one that does the job, and they enjoy doing it. Last edited by geekmaster; 01-06-2014 at 03:04 PM. |
|
![]() |
![]() |
![]() |
#13 |
Junior Member
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 2
Karma: 34484
Join Date: Jul 2013
Device: kobo
|
The computation does not rely on floating point, the patch on the same page for ffmpeg uses only integer arithmetic; the javascript version is merely an interactive play-thing. Showing how to use these short procedural masks both for high bitdepth -> 1bit, as well as -> 4bit or 8bit. Meaning the same formula is fit for driving both high fps 1bit dithers; as well as high fidelity grayscale stills using slower waveforms. "A dither" started out as recreational programming as well; but turnd out interesting enough that I spent too much time optimizing and tuning it's visual characteristics. Fun to find something computationally cheap; that is equvilanet to the much more expensive computation of "void and dither" which creates masks that yield specific green/blue noise frequency distributions (using a, eew, patented algorithm).
IIRC on my x86 laptop it was less than ~50% slower than using a LUT. This is sufficient for my current prototype hacks replacing it with a 2d LUT/mask is among low hanging fruits I'll care about at a later stage when/if I care about battery life or need to do more computations per frame. Use it if you want to, (I should probably wipe some dust of the tiny little framebuffer abstraction "library" I am using and put it out there, with a bit of help static binaries I build with the toolchain I use should work on kobo's as well as some kindles.) |
![]() |
![]() |
![]() |
#14 |
( ͡° ͜ʖ ͡°){ʇlnɐɟ ƃǝs}Týr
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 6,586
Karma: 6299991
Join Date: Jun 2012
Location: uti gratia usura (Yao ying da ying; Mo ying da yieng)
Device: PW-WIFI|K5-3G+WIFI| K4|K3-3G|DXG|K2| Rooted Nook Touch
|
Many thanks to everyone for their input on this issue.
This work, put in the right place, would open a new world of eink possibilities. |
![]() |
![]() |
![]() |
#15 |
Member
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 11
Karma: 34114
Join Date: Mar 2011
Device: PRS-350
|
Hi geekmaster,
I picked up a DXG recently and I'm still trying to get up to speed on the limitations of the device. In a previous post (https://www.mobileread.com/forums/sho...d.php?t=177455) you mentioned that > 1.5fps was possible on the DX / DXG - any further details on what tricks were used to achieve that, and the update rate obtainable? (apologies if this isn't relevant enough to this thread) |
![]() |
![]() |
![]() |
|
![]() |
||||
Thread | Thread Starter | Forum | Replies | Last Post |
geekmaster formula 42 (the dithermatron demo) | geekmaster | Kindle Developer's Corner | 65 | 03-17-2017 08:51 AM |
Formula Plots | PuxyYunm | General Discussions | 9 | 05-15-2011 04:19 AM |
.CSS [CoolReader] Programming Demystified | Dr. Drib | Astak EZReader | 6 | 03-18-2010 01:27 AM |
Python Unicode Demystified | ahi | Workshop | 2 | 09-18-2009 12:45 PM |