Stream: t-lang/wg-unsafe-code-guidelines

Topic: Uninitialized memory gone wrong in C/C++


RalfJ (Jul 14 2019 at 12:59, on Zulip):

For a WIP blog post, I came up with this example to show that uninitialized memory is weird: https://godbolt.org/z/JX4B4N

RalfJ (Jul 14 2019 at 12:59, on Zulip):

make_true returns false, which should be "impossible" no matter which bit pattern x has

RalfJ (Jul 14 2019 at 12:59, on Zulip):

I'd like to also have a C or C++ version of this, but the "obvious" translation does not get optimized to return 0. Does someone know a way to trigger that?

RalfJ (Jul 14 2019 at 13:34, on Zulip):

the best I could come up with so far is https://godbolt.org/z/PvZGQB

RalfJ (Jul 14 2019 at 13:34, on Zulip):

but having to work with an unknown parameter is annoying

nagisa (Jul 14 2019 at 14:32, on Zulip):

new unsigned char will initialize to 0

nagisa (Jul 14 2019 at 14:32, on Zulip):

new calls the default constructor.

nagisa (Jul 14 2019 at 14:34, on Zulip):

Wait, it doesn’t…?

nagisa (Jul 14 2019 at 14:38, on Zulip):

In that case

static bool always_return_true(unsigned char *x) {
    return (*x > 120) || (*x < 120);
}

bool make_true(bool test) {
    unsigned char x;
    return always_return_true(&x);
}

works for both C++ and C.

nagisa (Jul 14 2019 at 14:44, on Zulip):

My bad, x > 120 || x < 120 is not the full range.
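For comparison, here is a minimal sketch of a check that does cover the full range of unsigned char. The bounds 150 and 120 are picked for illustration only, and check_every_value is a hypothetical helper. With an initialized argument the function is true for all 256 values, so a false result can only come from reading uninitialized memory.

```cpp
#include <cassert>

// Overlapping ranges: every unsigned char is below 150, above 120, or both.
static bool always_return_true(unsigned char x) {
    return x < 150 || x > 120;
}

// Hypothetical helper: exercises the function with every initialized value.
static void check_every_value() {
    for (int v = 0; v <= 255; ++v) {
        assert(always_return_true(static_cast<unsigned char>(v)));
    }
}
```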

RalfJ (Jul 14 2019 at 16:36, on Zulip):

If you are curious, here's that blog post: https://www.ralfj.de/blog/2019/07/14/uninit.html

RalfJ (Jul 14 2019 at 16:36, on Zulip):

@Gankro you don't think you could add a link to that one from your uninit post? :D

gnzlbg (Jul 15 2019 at 09:36, on Zulip):

@nagisa new unsigned char() or new unsigned char{} should value-initialize the memory. Without the () or the {}, the memory is default-initialized, which for unsigned char means it is left uninitialized IIRC (I often mix up value-initialized and default-initialized, so maybe the names are the other way around)
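The three forms of new discussed here can be sketched as follows. The function names are hypothetical. The default-initialized byte is written before it is read, because reading it first would read an indeterminate value.

```cpp
#include <cassert>
#include <memory>

// new T() value-initializes: for unsigned char the byte is zero.
static unsigned char read_after_parens() {
    std::unique_ptr<unsigned char> p(new unsigned char());
    return *p;
}

// new T{} (C++11 and later) also value-initializes.
static unsigned char read_after_braces() {
    std::unique_ptr<unsigned char> p(new unsigned char{});
    return *p;
}

// new T default-initializes: for unsigned char the value is indeterminate,
// so this sketch stores a value before reading it back.
static unsigned char write_then_read() {
    std::unique_ptr<unsigned char> p(new unsigned char);
    *p = 42;
    return *p;
}

static void demo_new_forms() {
    assert(read_after_parens() == 0);
    assert(read_after_braces() == 0);
    assert(write_then_read() == 42);
}
```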

gnzlbg (Jul 15 2019 at 09:37, on Zulip):

@RalfJ my favourite snippet is:

#include <stdlib.h>

unsigned char uninit() {
    return *(unsigned char*)malloc(1); // reads a freshly allocated, uninitialized byte
}

The memory allocated by malloc is uninitialized in the C abstract machine. So any C / C++ compiler worth its salt will avoid allocating any memory and, obviously, avoid trying to read it. Since the result will always be uninitialized, none of that needs to happen.

gnzlbg (Jul 15 2019 at 09:39, on Zulip):

People saying "what the hardware does", or in this case "what the memory allocator does", are missing the point. It doesn't matter whether the memory allocator returns zeroed memory, returns some junk from a previous allocation, or only allocates when the memory is actually touched (e.g. by a read), since in the C abstract machine none of that needs to happen here.

gnzlbg (Jul 15 2019 at 09:51, on Zulip):

GCC vs Clang: https://gcc.godbolt.org/z/cqV8Ve

nagisa (Jul 15 2019 at 10:49, on Zulip):

@gnzlbg yeah, I was pretty confident that new T calls the default constructor and you seem to confirm it. I just wasn’t aware that the "default constructor" for "unsigned char" leaves data uninitialized :D

gnzlbg (Jul 15 2019 at 10:49, on Zulip):

unsigned char doesn't have a default constructor per se IIRC

nagisa (Jul 15 2019 at 10:49, on Zulip):

Hence quoted.

gnzlbg (Jul 15 2019 at 10:49, on Zulip):

there are two things, default initialization, and value initialization

gnzlbg (Jul 15 2019 at 10:49, on Zulip):

I always confuse which one is which

gnzlbg (Jul 15 2019 at 10:50, on Zulip):

but for primitive types one is equivalent to int x; and means that x is indeterminate

gnzlbg (Jul 15 2019 at 10:50, on Zulip):

and that extends to aggregates, e.g. struct S { int x; int y; }; S s; where s is also indeterminate.

nagisa (Jul 15 2019 at 10:50, on Zulip):

Well, T x also default-initializes in C++

gnzlbg (Jul 15 2019 at 10:51, on Zulip):

but then some aggregates in some situations are "special" and get a default constructor that might do something else

gnzlbg (Jul 15 2019 at 10:51, on Zulip):

(e.g. if one of the fields has a constructor)

nagisa (Jul 15 2019 at 10:51, on Zulip):

The catch to be aware of is that "default-initialiser/constructor/whatever-you-call-it" for certain types is undef.

gnzlbg (Jul 15 2019 at 10:51, on Zulip):

yep, there is a special way to "value-initialize", and that is to use () or {} to call the constructor directly.

gnzlbg (Jul 15 2019 at 10:52, on Zulip):

so to put it better, the constructor always initializes, but T t; doesn't always call the constructor

nagisa (Jul 15 2019 at 10:52, on Zulip):

Which I’m surprised about. @RalfJ that’s a nice thing to jot down onto your C++ impossible to use right list :D

gnzlbg (Jul 15 2019 at 10:52, on Zulip):

it gets worse

gnzlbg (Jul 15 2019 at 10:52, on Zulip):

T t(); is a function declaration

gnzlbg (Jul 15 2019 at 10:53, on Zulip):

so you end up needing to write crazy syntax to actually avoid the ambiguity, and properly initialize a variable

gnzlbg (Jul 15 2019 at 10:53, on Zulip):

or with C++11, you can just use T t{}; which also initializes
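The difference between the two spellings can be checked at compile time. This is a minimal sketch in which S and demo_declarations are hypothetical names. Compilers typically warn about the first declaration (the "vexing parse") but still accept it.

```cpp
#include <cassert>
#include <type_traits>

struct S { int x; int y; };  // hypothetical aggregate, as in the discussion

static void demo_declarations() {
    S a();   // declares a function a() returning S, not an object
    S b{};   // value-initializes an object: both fields are zero
    static_assert(std::is_function<decltype(a)>::value, "a names a function");
    static_assert(std::is_same<decltype(b), S>::value, "b is an S object");
    assert(b.x == 0 && b.y == 0);
}
```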

nagisa (Jul 15 2019 at 10:53, on Zulip):

yes. I remember that one. There’s also banana & peach == 0 inheritance from C. I used C++ professionally for a few years. I hate it.

gnzlbg (Jul 15 2019 at 10:53, on Zulip):

but then..... T t{} doesn't always do the same thing either

gnzlbg (Jul 15 2019 at 10:54, on Zulip):

e.g. if T is an initializer list

gnzlbg (Jul 15 2019 at 10:54, on Zulip):

so now you can always use T t{} except... in some corner cases

Andreas Molzer (Jul 15 2019 at 11:03, on Zulip):

There's also the reverse, calling the destructor to end the lifetime of an object. ~int(ptr) and ptr->~int() do not work, but if a template parameter T incidentally works out to be int, it is allowed as ptr->~T(), which they call the pseudo-destructor. At this point I'm fairly certain the syntax of C++ is a game of trivia against the compiler.
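A minimal sketch of the pseudo-destructor case, with hypothetical names (destroy, Counted, demo_destroy). The same template body compiles for int, where the call does nothing at run time, and for a class type, where it runs the real destructor.

```cpp
#include <cassert>
#include <new>

static int destructor_calls = 0;

// Hypothetical class type whose destructor leaves a visible trace.
struct Counted {
    ~Counted() { ++destructor_calls; }
};

// Ends the lifetime of *p through an explicit destructor call.
template <typename T>
void destroy(T* p) {
    p->~T();
}

static void demo_destroy() {
    int x = 5;
    destroy(&x);  // T = int: a pseudo-destructor call with no run-time effect

    // Placement new into raw storage, so ~Counted() runs only when asked.
    alignas(Counted) unsigned char storage[sizeof(Counted)];
    Counted* c = new (storage) Counted;
    int before = destructor_calls;
    destroy(c);   // T = Counted: runs ~Counted() once
    assert(destructor_calls == before + 1);
}
```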

gnzlbg (Jul 15 2019 at 11:07, on Zulip):

it's probably the hardest game of trivia ever, although some of the Rust games of trivia that are online are quite tricky as well

RalfJ (Jul 15 2019 at 11:08, on Zulip):

Which I’m surprised about. RalfJ that’s a nice thing to jot down onto your C++ impossible to use right list :D

that list has long overflown, I only collect blog posts about it now^^

Shnatsel (Jul 19 2019 at 19:43, on Zulip):

Actually I have encountered some "C++ is impossible to use correctly" skeptics that claim otherwise and as a consequence do not see the point of Rust. If such a list was written down it would help in that conversation a great deal.

RalfJ (Jul 20 2019 at 11:04, on Zulip):

@Shnatsel I usually point to https://robert.ocallahan.org/2017/07/confession-of-cc-programmer.html and https://www.vice.com/en_us/article/a3mgxb/the-internet-has-a-huge-cc-problem-and-developers-dont-want-to-deal-with-it

RalfJ (Jul 20 2019 at 11:04, on Zulip):

https://www.reddit.com/r/rust/comments/cb49lb/coworker_rust_doesnt_offer_anything_c_doesnt/ also has some good arguments

Shnatsel (Jul 20 2019 at 11:22, on Zulip):

These are very basic and broad, and do not really back up their claims, so they're not very useful for a discussion in a technical setting.

Shnatsel (Jul 20 2019 at 11:28, on Zulip):

The Chucklefish whitepaper also falls in the same bucket

Shnatsel (Jul 20 2019 at 11:29, on Zulip):

Reddit discussion is more in-depth and makes some good points, but it's not really structured. I wonder if there is a summary "Rust vs modern C++" article somewhere.

RalfJ (Jul 20 2019 at 12:17, on Zulip):

I am not aware of one. There is always https://blog.regehr.org/archives/1520 for demonstrating the sheer size of the UB problem.

RalfJ (Jul 20 2019 at 19:02, on Zulip):

@Shnatsel but here's another one for the list: https://twitter.com/erdgeist/status/1151555830623408131

rkruppe (Jul 20 2019 at 19:20, on Zulip):

The most vexing parse can make the development experience unpleasant sometimes, but it's not in any way related to UB or safety. It just causes very puzzling compiler errors (unless it's an RAII guard that is never referenced again, I guess?).

RalfJ (Jul 20 2019 at 19:27, on Zulip):

here someone was thinking they had initialized a variable when they had not

RalfJ (Jul 20 2019 at 19:27, on Zulip):

that can cause UB

rkruppe (Jul 20 2019 at 19:30, on Zulip):

pc is not a variable and especially not an uninitialized one, it's a declaration of an external function. Using it will likely complain about type mismatches (or else likely fail during linking because no such function exists), not read uninitialized memory.

rkruppe (Jul 20 2019 at 19:31, on Zulip):

The most plausible way this causes UB is if something relies on a side effect of the intended constructor call but I see no indication of that in this case and in fact I've never seen that happen (while I have seen dozens of mysterious compiler errors).

RalfJ (Jul 20 2019 at 19:33, on Zulip):

IIRC this can lead you to think you have a guard that does some stuff on "drop" when it won't? but maybe I just interpreted it all wrong.

rkruppe (Jul 20 2019 at 19:34, on Zulip):

Uh yeah, s/constructor/constructor or destructor/g

nagisa (Jul 20 2019 at 21:03, on Zulip):

I hit this once as well. Since then I only ever use {} to initialize. As in PingController pc { ping }.

nagisa (Jul 20 2019 at 21:04, on Zulip):

The only case where this really does go unnoticed is if you’re using this as a guard, as commented above.
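The guard case can be sketched like this. Guard and active_guards are hypothetical stand-ins for a real RAII lock or scope guard. The version with parentheses compiles (usually with a warning) but never constructs a guard, so the protected region silently runs unguarded.

```cpp
#include <cassert>

static int active_guards = 0;

// Hypothetical RAII guard: counts itself in on construction, out on destruction.
struct Guard {
    Guard() { ++active_guards; }
    ~Guard() { --active_guards; }
};

static int guards_alive_with_parens() {
    Guard g();             // declares a function named g; no Guard is constructed
    return active_guards;  // no guard is alive here
}

static int guards_alive_with_braces() {
    Guard g{};             // constructs a Guard that lives until the end of scope
    return active_guards;  // exactly one guard is alive here
}

static void demo_guards() {
    assert(guards_alive_with_parens() == 0);
    assert(guards_alive_with_braces() == 1);
    assert(active_guards == 0);  // the braced guard was released on return
}
```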

RalfJ (Aug 18 2019 at 15:53, on Zulip):

Uh yeah, s/constructor/constructor or destructor/g

for completeness' sake, here's what this going wrong looks like in practice:
https://www.reddit.com/r/programminghorror/comments/3qm1zp/c_i_spend_much_time_to_find_the_mistake/

rkruppe (Aug 18 2019 at 16:20, on Zulip):

Evil. Yet another reason why "the mutex owns the data being protected" is the far superior design :)

gnzlbg (Aug 20 2019 at 09:06, on Zulip):

Nice find.

@rkruppe:

The most vexing parse can make the development experience unpleasant sometimes, but it's not in any way related to UB or safety. It just causes very puzzling compiler errors (unless it's an RAII guard that is never referenced again, I guess?).

Famous last words :laughter_tears: TBH I thought that too

Last update: Nov 19 2019 at 17:35UTC