Simple word indexer (18)
Continued from a previous article.
Positioning to a character (5)
Where exactly?
The exact location of the infinite loop in GNU’s fseek,
   which I wrote about in articles
   14,
   15, and
   17, is now known.
It is in source file libio/wfileops.c, function
   adjust_wide_data, and the repeated lines are
   567, 568, 576, 582. I tested and debugged with GNU glibc
   version 2.31, but these line numbers also apply to the sources of
   version 2.34.
In the do-while loop there, the enum variable status
   always remains __codecvt_partial, so the loop never
   ends.
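The pattern can be illustrated with a small stand-alone sketch. This is not the glibc code: the status enum, the converter type and both stub converters are hypothetical names invented for the illustration, and the pass limit exists only so that the demonstration itself terminates.

```c
#include <stddef.h>

/* Hypothetical status codes, loosely modelled on the conversion results
   mentioned above. These are NOT glibc's definitions. */
enum conv_status { CONV_OK, CONV_PARTIAL, CONV_ERROR };

/* Hypothetical converter: gets `avail` bytes, reports how many it used. */
typedef enum conv_status (*converter_fn)(size_t avail, size_t *consumed);

/* Stand-in that never makes progress and always reports "partial". */
enum conv_status stuck_converter(size_t avail, size_t *consumed)
{
    (void)avail;
    *consumed = 0;
    return CONV_PARTIAL;
}

/* Stand-in that consumes all input in one call. */
enum conv_status good_converter(size_t avail, size_t *consumed)
{
    *consumed = avail;
    return CONV_OK;
}

/* Repeat the conversion while the status is "partial".
   Without the max_passes guard, a converter that keeps returning
   CONV_PARTIAL would keep this loop running forever.
   Returns the number of passes made. */
unsigned long passes_until_done(converter_fn convert, size_t avail,
                                unsigned long max_passes)
{
    unsigned long passes = 0;
    enum conv_status status;

    do {
        size_t consumed = 0;
        status = convert(avail, &consumed);
        avail -= consumed;
        passes++;
    } while (status == CONV_PARTIAL && passes < max_passes);

    return passes;
}
```

With good_converter the loop stops after a single pass. With stuck_converter it only stops because the guard is reached, which is the situation described above, minus the guard.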
Propose a fix?
I cannot propose a fix, because I don’t understand what this part
   of the code is doing, and why it is necessary in the first place.
   As I remarked
   before,
   fseek, as I see it, should not involve any file reading
   or buffer filling, nor buffer flushing or file writing, and
   any conversion of wide or multibyte characters should also be unnecessary
   at this stage. In essence, all that fseek does is set
   a position for future reading or writing operations.
   Whether those operations are wide-character oriented or not should
   not make any difference.
   The implementation should and could be simple, efficient, fast,
   error-free and easy to maintain.
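To make the "set a position for future operations" view concrete, here is a minimal sketch using only standard C calls (tmpfile, fputs, fseek, ftell, getc). It uses a byte-oriented stream, so it says nothing about the wide-character case discussed here. The helper name byte_at_offset is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: write `data` to a temporary file, position the
   stream at `offset` with fseek, and return the byte read there.
   Returns EOF if the offset is at or past the end of the data,
   and -1 if any of the file operations fails. */
int byte_at_offset(const char *data, long offset)
{
    FILE *fp = tmpfile();
    int result = -1;

    if (fp == NULL)
        return -1;

    if (fputs(data, fp) != EOF &&
        fseek(fp, offset, SEEK_SET) == 0 &&  /* only sets the position */
        ftell(fp) == offset)                 /* the position is what we asked for */
        result = getc(fp);                   /* the read happens here, not in fseek */

    fclose(fp);
    return result;
}
```

For example, byte_at_offset("abcdef", 3) returns 'd': the fseek call itself only records where the next getc should start.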
As I also wrote before, it is easy to say that without fully understanding all of the code. I am aware of that.
Still, when looking at how fseek
   is implemented in FreeBSD, I get the impression
   it is much simpler and cleaner, and GNU in comparison is needlessly
   complicated. Also, FreeBSD seems to do what it does for fseek
   without ever checking whether we are dealing with wide characters or not.
   And that makes sense to me.
The slowness of ftell is also
   telling in this respect (pun intended).
Addition 30 September 2021
This is what I call clean code! It comes from the library
   musl,
   by Rich Felker and others. See also
   this comparison.
Strangely, this musl library does consider an invalid
   byte consumed.
   So my trick using getc() cannot be used here: it
   skips too much. In Siworin, I therefore now always call fseek(),
   unless the compiler symbol __GLIBC__ is defined, in which case
   I call getc().
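A compile-time choice of this kind could look roughly like the sketch below. It is not Siworin's actual code: the function name resume_after_bad_byte, its bad_pos parameter and the use of an absolute offset in the fseek branch are all assumptions made for the illustration. Note that __GLIBC__ is defined by glibc's own headers, so a standard header such as stdio.h has to be included before the test.

```c
#include <stdio.h>   /* with glibc, this also defines __GLIBC__ */

/* Hypothetical helper: continue reading just after an invalid byte
   located at file offset `bad_pos`.
   Assumption for the glibc branch: the stream is still positioned at
   the invalid byte, so reading one byte with getc steps over it.
   Returns 0 on success, -1 on failure. */
int resume_after_bad_byte(FILE *fp, long bad_pos)
{
#ifdef __GLIBC__
    (void)bad_pos;                       /* not needed in this branch */
    return getc(fp) == EOF ? -1 : 0;     /* consume the invalid byte */
#else
    /* Other libraries (musl, for instance) may already count the byte
       as consumed, so position explicitly instead of reading. */
    return fseek(fp, bad_pos + 1, SEEK_SET) == 0 ? 0 : -1;
#endif
}
```

Under the stated assumption, both branches leave the stream at offset bad_pos + 1, so the calling code does not need to know which C library it was built against.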
Addition 5 November 2022
Later I found out (probably not for the first time, although I had forgotten about it) that the C library musl is also at the heart of busybox and Alpine Linux!
Copyright © 2021, 2022 by R. Harmsen, all rights reserved.