They are still are not completely correct, since it does not handle
graphemes at all, just codepoints, but at least it handles the common
case correctly.
The calculation was previously very wrong (rather than just a little bit
wrong): it wasn't accounting for the fact that every character is 1
byte, and so multibyte characters were pretending to be zero width.
cc #8706
file.
Previously multibyte UTF-8 chars were being recorded as byte offsets
from the start of the file, and then later compared against global byte
positions, resulting in the compiler possibly thinking it had a byte
position pointing inside a multibyte character, if there were multibyte
characters in any non-crate files. (Although, sometimes the byte offsets
line up just right to not ICE, but that was a coincidence.)
Fixes#11136.
Fixes#11178.