[PATCH 2] fix broken strict aliasing

Marc Lehmann schmorp at schmorp.de
Wed Feb 24 19:33:39 CET 2010


On Wed, Feb 24, 2010 at 05:39:41PM +0100, "common at gmx.ch" <common at gmx.ch> wrote:
>> (the iso c documents are not that hard to read, though, imho, with the
>> exception of structure member aliasing).
>
> Therefore I'd be glad to get more information on the topic.
> Thanks for the offer.

Well, it's not fundamentally complicated.

1. ALIASING

Aliasing (in this context) means to access the same storage area by different
aliases. For example, here:

   int i;
   int *p = &i;

Now "i" and "*p" are aliases for the same storage location. Now take this
example, which uses different types:

   int a = 4;
   long *b = (void *)&a;

   a = 5;
   assert (*b == 5);

If long and int have the same size & representation (typically yes on
32 bit archs), then this should intuitively work - a has the same bit
representation as *b, so a and *b are _aliases_ for each other, and cpus
will not have issues with implementing this, the assert should not fail.

2. ALIASING AND C

The problem is, C says that a compiler can assume that int variables
never are aliases for long variables *by definition*, as different types
never alias each other. That means gcc can assume that *b is NEVER a, or
b is never pointing to a, simply because C says it can do so. Or in other
words, for gcc, "a" and "*b" are two different variables.

There are exceptions - for example, char variables *may* alias with any
other type.

Now, when you access the same storage area via different types, you have a
potential "aliasing issue" - YOU might WANT the expressions to be aliases
for the same variable, but if none of them is of type char, then the compiler
might overrule you and say "no, I just don't see why I should assume it is
the same storage area".

In the latter case you have an aliasing bug - YOU assume the
expressions/variables/storage spaces/pointer targets are aliases, gcc
disagrees and will not notice that writes to one alias will modify the
other.

In the above example:

   a = 5;
   assert (*b == 5);

WE might assume that a and *b are aliases, but gcc will assume that they
are not, so gcc "will not see" that *b is actually 5 as *b has never been
written to, according to the C standard.
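
To make the assumption concrete, here is a minimal sketch (the function
name store_both is made up for illustration, it is not from libev or
gcc). Because int and long are different types, the compiler is allowed
to compile the last line as "return 5" without reading *a again:

```c
#include <assert.h>

/* Hypothetical helper: int and long are different types, so the
   compiler may assume that *a and *b never overlap. */
int store_both(int *a, long *b)
{
    *a = 5;
    *b = 6;      /* assumed not to modify *a */
    return *a;   /* may be compiled as "return 5" without re-reading *a */
}
```

Calling it with two distinct objects is fine and returns 5. Passing two
pointers to the same storage is exactly the aliasing bug described above:
you would expect 6, but the compiler need not agree.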

That means, for example, that the only way to portably implement memcpy
would be to use a char-to-char copy, as other data types do not alias
with each other, but using char to access any object is safe, as it is
indeed aliasing with any other type.
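
A minimal sketch of such a char-based copy (the name bytewise_copy is
made up; this is not how any particular libc implements memcpy, just the
portable idea):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical name: copies n bytes from src to dst one unsigned char
   at a time. Character types may alias any object, so this is allowed
   no matter what types the two objects really have. The areas must not
   overlap, as with memcpy. */
void *bytewise_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;

    return dst;
}
```

For example, bytewise_copy (&c, &a, sizeof a) with two int variables a
and c leaves c with the same value as a.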

3. LIBEV-BASED EXAMPLE

Here is a more practical example: take ev_watcher and ev_io structs in
libev. Both have an "int active" member at the same location:

   struct ev_watcher { int active; };
   struct ev_io      { int active; };
   // union with both omitted

Now, *some* readings of the C standard say (and I would personally support
these, although they are often not helpful), that accessing (ev_watcher
*)->active NEVER accesses the active member of ev_io, as the types are
different, so one active member is never an alias for the other.

Gcc follows this reading, as it allows for important (IMHO)
optimisations.

That means that this code:

   struct ev_io io;
   struct ev_watcher *w = (struct ev_watcher *)&io;

   io.active = 0;
   assert (w->active == 0);

might fail (and code like this sometimes happens to fail with current gcc
versions), because gcc assumes that w->active is not the same storage area
as io.active, i.e. they do not refer to the same variable.

This is why libev always uses the "ev_watcher *" cast, i.e. it never
accesses "active" via ev_io or other types, only ever through "ev_watcher
*", so, in theory, libev does not rely on aliasing at all (it does,
however, still make a lot of unportable assumptions w.r.t. the C standard
internally).

Or in other words, libev puts a struct ev_watcher at the same storage
address as ev_io. It doesn't use the overlapping data members of the ev_io
watcher. They could in fact be missing, as long as ev_io contains enough
padding at that place - the ev_watcher occupies the head of the memory
area, ev_io the tail.
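
As a rough sketch of that discipline (simplified stand-in structs, and
the helper names watcher_set_active, watcher_is_active and
demo_watcher_discipline are made up for illustration, they are not libev
API): the shared "active" member is only ever touched through a
"struct ev_watcher *", while the ev_io-specific members are touched
through "struct ev_io *".

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins; the real libev structs have more members. */
struct ev_watcher { int active; };
struct ev_io      { int active; int fd; };

/* Hypothetical helpers: every access to "active" goes through
   struct ev_watcher *, never through struct ev_io *. */
static void watcher_set_active(void *w, int value)
{
    ((struct ev_watcher *)w)->active = value;
}

static int watcher_is_active(void *w)
{
    return ((struct ev_watcher *)w)->active;
}

/* Returns the value read back via ev_watcher, or -1 if malloc fails. */
int demo_watcher_discipline(void)
{
    struct ev_io *io = malloc(sizeof *io);
    int result;

    if (!io)
        return -1;

    io->fd = 3;                  /* ev_io-only member, accessed via ev_io */
    watcher_set_active(io, 1);   /* shared member, via ev_watcher only */
    result = watcher_is_active(io);

    free(io);
    return result;
}
```

Since "active" is written and read through the same type, there is no
pair of differently-typed accesses that the compiler could treat as
unrelated. This still assumes both structs place "active" at the same
offset, which is one of the unportable assumptions mentioned above.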

4. THE GCC WARNING AND LIBEV

The warning gcc emits still applies - it tells you that accessing the
active member via e.g. ev_io might not work.

Note that the gcc warning does not claim that such an access actually
happens anywhere, just that *if* there was one, then you might fall for an
aliasing issue.

As for libev, there might be any number of ways to avoid this warning,
most of which will increase maintenance overhead, generate a lot of work
and might cause issues with other systems, as they are big changes. Maybe
there is a magic way to express this that avoids the gcc warning in all
cases (maybe an extra (void *) will do the trick), but it is by far
simplest and safest to keep the code in a working state until a better
solution is found and well understood.

I think aliasing issues are not really common in practice UNLESS you
really designed the software to take advantage of aliasing. You are
unlikely to a) understand the warning and b) write code that has buggy
aliasing assumptions, so in my book, the priority to avoid it is low, it's
a social problem that hits people who don't even understand the warning,
and those have no business to enable it WITHOUT finding out what it means.

Note: I haven't proofread this well, so it might be fuzzy, hard
to understand or even wrong in some places, feel free to ask for
clarifications or additions.

-- 
                The choice of a       Deliantra, the free code+content MORPG
      -----==-     _GNU_              http://www.deliantra.net
      ----==-- _       generation
      ---==---(_)__  __ ____  __      Marc Lehmann
      --==---/ / _ \/ // /\ \/ /      schmorp at schmorp.de
      -=====/_/_//_/\_,_/ /_/\_\


