I found the image that shows many beads per pixel:
The square blocks in the image are pixels. The much smaller, randomly distributed "round spots" are beads (capsules). The beads are a little hard to see because they are all the same color, but you can just make out their edges.
Also, you can see some "snow" where dark particles are still stuck to the front surface of some beads. Doing three or more full refreshes knocks those stuck particles loose and results in a higher-contrast display.
If there were only one round bead per pixel, you would not see such sharp corners on the pixels.
However, higher-resolution displays MAY use smaller pixels with same-sized beads, which would mean fewer beads per pixel.
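The corner argument can be checked with a little geometry. This is only a rough sketch under idealized assumptions of my own (perfectly round beads, one bead tucked right into the pixel corner, and a made-up pixel width of 1.0), not a model of any real panel. The function names are just for illustration.

```python
import math

def corner_gap(bead_diameter: float) -> float:
    """Distance from a square pixel's corner to the nearest point of a
    round bead that touches both edges meeting at that corner."""
    radius = bead_diameter / 2
    return radius * (math.sqrt(2) - 1)

def single_bead_coverage() -> float:
    """Fraction of a square pixel covered by one inscribed round bead."""
    return math.pi / 4

PIXEL_WIDTH = 1.0  # arbitrary unit, assumed for illustration

# Fewer beads across the pixel means bigger beads relative to the pixel,
# so the rounded-off region at each corner gets larger.
for beads_across in (1, 5, 20):
    diameter = PIXEL_WIDTH / beads_across
    gap = corner_gap(diameter)
    print(f"{beads_across:2d} beads across: corner gap = {gap:.4f} pixel widths")

print(f"one bead per pixel covers {single_bead_coverage():.1%} of the square")
```

With one bead per pixel the corner is rounded off by roughly a fifth of the pixel width, while with many small beads the rounding shrinks in proportion to the bead size, which fits the sharp corners in the photo.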
I think that the simplified artistic diagram was just drawn to demonstrate how the technology works, and not to show the relative scale of beads (capsules) and pixels.
P.S. The above photo also shows the visual effect of different fonts. Gray pixels can help around the edges, but letter strokes must be black "inside" to prevent poor contrast.