Strict aliasing rule strikes back

Andrey Pokrovskiy wonder.mice at gmail.com
Fri May 22 10:14:57 CEST 2015


Wow, thanks for such quick and detailed reply, Marc!

> so all accesss have to be against the ev_watcher struct, everywhere

So you are saying, that libev needs "(ev_watcher *)(void *)" cast
exactly because it uses such cast (ev_watcher * type to be more
precise) everywhere to access those members. And by uniformly
accessing those members via ev_watcher * type it confirms strict
aliasing rules. That make sense.

Once again, thanks for the explanation.

On Fri, May 22, 2015 at 12:09 AM, Marc Lehmann <schmorp at schmorp.de> wrote:
> On Thu, May 21, 2015 at 09:43:24PM -0700, Andrey Pokrovskiy <wonder.mice at gmail.com> wrote:
>> But I'm not saying that there is something wrong with libev and that
>> those warnings should be fixed asap (surprise-surprise).
>
> Indeed :)
>
> In fact, those warnings are slowly being fixed, as far as I can tell, at
> the place where they needed fixing, namely inside the compilers: AFAICT,
> none of these were ever correct, or if they were, they were misleading.
>
> For example, the gcc warning text is:
>
>    dereferencing type-punned pointer will break strict-aliasing rules
>
> and is wisely not emitted by default nor with -W. This text is unfortunate
> due to multiple reasons, one being that "type-punned pointer" is not an
> ISO C concept, so it's unclear what it refers to (it has no standard
> definition). Furthermore, if you actually read_ the text closely, it says
> "if you do that, then you break". It does _not_ say "you break, because
> you did that", a subtle distinction (I have no evidence that the author of
> this message was aware of this distinction though).
>
> As for libev, the pointer isn't "type-punned" (again, for lack of struct
> definition of the term, it's hard to tell, I am using the wikipedia
> definition of the day), so no type-punned pointer will be dereferenced and
> aliasing rules are not being broken - the warning is actually correct,
> because it doesn't claim to apply to the code it warns about, which makes
> it a spurious warning.
>
> In your second example (ev_cb_set), we did not really dereference a
> type-punned pointer either, but we did break the aliasing rules: EITHER
> the code in ev.c which casts the ev_io * to an ev-watcher * and uses it
> to access the callback does, OR ev_cb_set does, but it's not at all clear
> which one is the type-punned access, and which is the "real" access.
>
> Which is why I think that the "type-punned" concept isn't that useful, and
> this might also be the reason why the C standard doesn't talk about that,
> but about aliasing.
>
> This is not altogether different than for many other warnings, which
> really warn about potential problems (for example, -Wreorder, ). I think
> the problem is that a lot fewer people understand the warning (and
> conservatively treat it as an error), and that it is not at all trivial to
> avoid (so we can't just do some little magic and silence it).
>
> For example, gcc's -Wparentheses warns about this potentially (we don't
> know) correct code:
>
>    if (a = b) ...;
>
> But not abut this one:
>
>    if ((a = b)) ...;
>
> So this warning is easy to avoid. I don't know of such an easy workaround
> for the type-punned warnings.
>
>> #define ev_init(ev,cb_) do {                   \
>> -  ((ev_watcher *)(void *)(ev))->active  =      \
>> -  ((ev_watcher *)(void *)(ev))->pending = 0;   \
>> +  (ev)->active  = \
>> +  (ev)->pending = 0;   \
>
> Because C will get angry: while this doesn't trigger strict aliasing
> warnings, it actually does break the rules (and does cause code to be
> miscompiled).
>
>> Since ev must be a watcher, it will have "active" and "pending"
>> members and two-steps cast is not really necessary.
>
> The problem is that some compilers interpret the C rules in the way that
> this active member is not the same one as accessed in the libev code
> later, so all accesss have to be against the ev_watcher struct, everywhere
> (I think this is a valid but not conclusive interpretation of the rules,
> but anyways, its what is done).
>
>> -# define ev_set_cb(ev,cb_)                   (ev_cb_ (ev) = (cb_), memmove (&
>>                           ((ev_watcher *)(ev))->cb, &ev_cb_ (ev),
>> sizeof (ev_cb_ (ev))))
>> +# define ev_set_cb(ev,cb_)                   (ev)->cb = (cb_)
>>
>> Why to use memmove when "ev->cb" and "cb_" must have the same type and
>> are simply assignable?
>
> Note that the memmove is new, and that this was a part of libev where we
> considered the effects carefully, as it was the only (known) place where
> struct aliasing rules were actually broken, but we were very confident that a
> compiler wouldn't break, and that a memmove might be counterproductive.
>
> Nowadays, we think that memmove is a no-op, apart from telling the
> compiler that both locations alias.
>
> Note that, while the type of the variables is the same, the aliasing rule is
> about the types used to access them, and ((ev_watcher *)(ev))->cb has a
> different type then &ev_cb_ (ev), so these variables (although they reside at
> the same location) do not alias.
>
> The memmove tells the compiler that they actually do alias, and is
> otherwise compiled away (at least with reasonably current gcc and clang
> compilers).
>
>> Curiously enough, I noticed that those memmove's were added exactly to
>> "fix a potential aliasing issue".
>
> Indeed, and as you might want to note, no compiler has ever warned about
> that, although we (the libev authors) were aware of it...
>
>> I'm sure all those things are done for a reason, and I'm wondering
>> what those reasons are. Maybe somebody could write a few lines
>> explaining why libev needs those casts and memmove's?
>
> Basiclaly, without the memmove, the following order of instructions
> (potentially resulting from inlining for example):
>
>    ev_set_cb (ev, xcb);
>    ev_watcher *wev = (void *)ev;
>    wev->cb (...);
>
> could be legally optimised to:
>
>    ev_watcher *wev = (void *)ev;
>    wev->cb (...);
>
> Because wev->cb is never written and ev->cb is only written but never
> read.  The memmove aliases both, so the compiler knows that ev->cb is
> written and later read, and likewise, wev->cb is written before.
>
> It is very unlikely that the compiler will ever be able to make this
> optimisation in practise, certainly when libev is used as a library.
>
> Likwise, you could probably remove the casts from macros like ev_init, and
> as long as you use it as a library, it would work. However, when embedded,
> gcc is able to "miscompile" the intent of the code then, due to inlining.
>
> (And gcc already is able to inline libraries by doing whole program
> optimisation. "Nobody" uses this at the moment, because it is hard to do
> so on a typical multi-user system, but it's better to be safe than sorry,
> especially if the overhead is nil).
>
> All this could also have been avoided by embedding the ev_watcher struct
> inside the ev_io etc. structs for example, but that would have caused
> structure sizes to grow due to padding on some platforms, and we really
> wanted to squeeze out those extra 4 bytes from every watcher.
>
> I am not sure I hit the target with my answer, but since this is a
> complicated and subtle question, if anything is unclear, feel free to
> shoot more questions, or ask for clarifications.
>
> --
>                 The choice of a       Deliantra, the free code+content MORPG
>       -----==-     _GNU_              http://www.deliantra.net
>       ----==-- _       generation
>       ---==---(_)__  __ ____  __      Marc Lehmann
>       --==---/ / _ \/ // /\ \/ /      schmorp at schmorp.de
>       -=====/_/_//_/\_,_/ /_/\_\



More information about the libev mailing list