Stream: t-compiler

Topic: issue-44056


nikomatsakis (Nov 16 2018 at 19:36, on Zulip):

So I get segfaults from issue-44056.rs:

// compile-pass
// only-x86_64
// no-prefer-dynamic
// compile-flags: -Ctarget-feature=+avx -Clto

Example:

> rustc +rust-5-stage2 issue-44056.rs -Ctarget-feature=+avx -Clto
Segmentation fault (core dumped)

Does anybody else see this? It's pretty annoying. Prevents me from really running the full test suite.

nikomatsakis (Nov 16 2018 at 19:37, on Zulip):

It has something to do with the AVX

nikomatsakis (Nov 16 2018 at 19:47, on Zulip):

Hmm, I also see other failures:

---- [run-pass] run-pass/lto-many-codegen-units.rs stdout ----

error: test compilation failed although it shouldn't!
status: signal: 11
command: "/home/nmatsakis/versioned/rust-5/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "/home/nmatsakis/versioned/rust-5/src/test/run-pass/lto-many-codegen-units.rs" "--target=x86_64-unknown-linux-gnu" "--error-format" "json" "-Zui-testing" "-o" "/home/nmatsakis/versioned/rust-5/build/x86_64-unknown-linux-gnu/test/run-pass/lto-many-codegen-units/a" "-Crpath" "-O" "-Zunstable-options" "-Lnative=/home/nmatsakis/versioned/rust-5/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "-C" "lto" "-C" "codegen-units=8" "-L" "/home/nmatsakis/versioned/rust-5/build/x86_64-unknown-linux-gnu/test/run-pass/lto-many-codegen-units/auxiliary"
stdout:
------------------------------------------

------------------------------------------
stderr:
------------------------------------------

------------------------------------------

thread '[run-pass] run-pass/lto-many-codegen-units.rs' panicked at 'explicit panic', tools/compiletest/src/runtest.rs:3282:9

nikomatsakis (Nov 16 2018 at 19:48, on Zulip):

something is weird :(

Wesley Wiser (Nov 16 2018 at 19:51, on Zulip):

@nikomatsakis I was also seeing lto-many-codegen-units fail on my inlining branch. I assumed it was due to my changes, but perhaps not?

nikomatsakis (Nov 16 2018 at 19:51, on Zulip):

who can say :)

nikomatsakis (Nov 16 2018 at 19:52, on Zulip):

I see far more failures than just that one

nikomatsakis (Nov 16 2018 at 19:52, on Zulip):

all of them are tests with "lto" in the name though

Wesley Wiser (Nov 16 2018 at 19:52, on Zulip):

Here's the failures I was seeing:

    [run-pass] run-pass/associated-consts/associated-const-cross-crate-defaults.rs
    [run-pass] run-pass/associated-consts/associated-const-use-default.rs
    [run-pass] run-pass/backtrace-debuginfo.rs
    [run-pass] run-pass/cross-crate/xcrate_generic_fn_nested_return.rs
    [run-pass] run-pass/debuginfo-lto.rs
    [run-pass] run-pass/fat-lto.rs
    [run-pass] run-pass/generator/smoke.rs
    [run-pass] run-pass/issues/issue-11205.rs
    [run-pass] run-pass/lto-many-codegen-units.rs
    [run-pass] run-pass/lto-still-runs-thread-dtors.rs
    [run-pass] run-pass/optimization-fuel-0.rs
    [run-pass] run-pass/optimization-fuel-1.rs
    [run-pass] run-pass/panic-runtime/lto-abort.rs
    [run-pass] run-pass/panic-runtime/lto-unwind.rs
    [run-pass] run-pass/rfcs/rfc-2005-default-binding-mode/constref.rs
    [run-pass] run-pass/sepcomp/sepcomp-lib-lto.rs
    [run-pass] run-pass/stack-probes-lto.rs
    [run-pass] run-pass/ufcs-polymorphic-paths.rs

nikomatsakis (Nov 16 2018 at 19:52, on Zulip):

failures:
    [run-pass] run-pass/debuginfo-lto.rs
    [run-pass] run-pass/fat-lto.rs
    [run-pass] run-pass/lto-many-codegen-units.rs
    [run-pass] run-pass/lto-still-runs-thread-dtors.rs
    [run-pass] run-pass/panic-runtime/lto-abort.rs
    [run-pass] run-pass/panic-runtime/lto-unwind.rs
    [run-pass] run-pass/sepcomp/sepcomp-lib-lto.rs
    [run-pass] run-pass/stack-probes-lto.rs
nikomatsakis (Nov 16 2018 at 19:52, on Zulip):

got me :)

Wesley Wiser (Nov 16 2018 at 19:53, on Zulip):

Well that's useful to me. Now I know the LTO ones might not be my fault :laughing:

nikomatsakis (Nov 16 2018 at 19:53, on Zulip):

from journalctl -b | tail:

                                                            Stack trace of thread 15257:
                                                            #0  0x00007fd2221bf3c3 n/a (/home/nmatsakis/versioned/rust-5/build/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/codegen-backends/librustc_codegen_llvm-llvm.so)
Nov 16 14:38:23 athena.localdomain audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-coredump@66-18182-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Nov 16 14:38:24 athena.localdomain abrt-server[19001]: Deleting problem directory ccpp-2018-11-16-14:38:23.764102-13200 (dup of ccpp-2018-11-16-09:53:58.514123-11457)
Nov 16 14:38:24 athena.localdomain abrt-notification[19441]: Process 11457 (rustc) crashed in ??()

nikomatsakis (Nov 16 2018 at 19:53, on Zulip):

some kind of crash in LLVM

nikomatsakis (Nov 16 2018 at 19:53, on Zulip):

@eddyb @nagisa any thoughts on what might cause this? :point_up:

nagisa (Nov 17 2018 at 02:00, on Zulip):

I have a number of tests that fail for me locally as well: https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/subject/local-ui-test-failures

nagisa (Nov 17 2018 at 02:02, on Zulip):

So one thing that could be a cause is that you perhaps forgot to update your submodules? Another – there is that LLVM stamp file which prevents LLVM from rebuilding even if submodule is updated in some cases… and yet another – perhaps your config.toml is "too" different from the CI’s?

nagisa (Nov 17 2018 at 02:02, on Zulip):

to be clear, I haven’t looked too closely at any of those.

eddyb (Nov 17 2018 at 08:30, on Zulip):

run rm build/*/llvm/llvm-finished-building - if you're lucky, it will trigger a partial rebuild and only a few C++ files will need to be recompiled
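
A minimal sketch of the suggestion above as a reusable shell function, assuming the default `build` directory at the root of the checkout. The name `clear_llvm_stamp` is hypothetical and not part of the Rust build system; only the stamp path comes from the message above.

```shell
# Sketch: remove LLVM stamp files so the build system reconsiders LLVM.
# `clear_llvm_stamp` is a hypothetical helper name, not part of x.py.
clear_llvm_stamp() {
    build_dir="${1:-build}"
    removed=0
    for stamp in "$build_dir"/*/llvm/llvm-finished-building; do
        # An unmatched glob stays literal, so check that the file really exists.
        if [ -e "$stamp" ]; then
            rm -- "$stamp"
            echo "removed $stamp"
            removed=$((removed + 1))
        fi
    done
    echo "$removed stamp file(s) removed"
}
```

Called as `clear_llvm_stamp build` from the checkout root and followed by the usual `./x.py build`, this should trigger the rebuild described above; how much actually gets recompiled depends on what changed.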

nikomatsakis (Nov 17 2018 at 15:56, on Zulip):

so I tried a complete rebuild

nikomatsakis (Nov 17 2018 at 15:56, on Zulip):

building LLVM fails like so:

/home/nmatsakis/versioned/rust-2/src/llvm/include/llvm/CodeGen/SchedulerRegistry.h:40:52: warning: cast between incompatible function types from ‘llvm::RegisterScheduler::FunctionPassCtor’ {aka ‘llvm::ScheduleDAGSDNodes* (*)(llvm::SelectionDAGISel*, llvm::CodeGenOpt::Level)’} to ‘llvm::MachinePassCtor’ {aka ‘void* (*)()’} [-Wcast-function-type]
   : MachinePassRegistryNode(N, D, (MachinePassCtor)C)
                                                    ^
[ 84%] Linking CXX static library ../../libLLVMX86CodeGen.a
[ 84%] Built target LLVMX86CodeGen
gmake: *** [Makefile:152: all] Error 2
thread 'main' panicked at '
command did not execute successfully, got: exit code: 2

nikomatsakis (Nov 17 2018 at 15:57, on Zulip):

this happens to me basically every time; if I start again, it will pass

nikomatsakis (Nov 17 2018 at 15:57, on Zulip):

I always shrugged it off but now I think maybe it's connected

nikomatsakis (Nov 17 2018 at 15:57, on Zulip):

strangely, nothing there shows up in journalctl

nikomatsakis (Nov 17 2018 at 15:58, on Zulip):

I think this is the actual error:

/usr/include/c++/8/bits/basic_string.h: In constructor ‘std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::_Alloc_hider::_Alloc_hider(std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::pointer, _Alloc&&)’:
/usr/include/c++/8/bits/basic_string.h:149:4: internal compiler error: in cp_parser_lookup_name, at cp/parser.c:26145
  : allocator_type(std::move(__a)), _M_p(__dat) { }
    ^~~~~~~~~~~~~~
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://bugzilla.redhat.com/bugzilla> for instructions.
The bug is not reproducible, so it is likely a hardware or OS problem.

nikomatsakis (Nov 17 2018 at 15:58, on Zulip):

maybe upgrading to FC29 actually will help

nagisa (Nov 17 2018 at 15:59, on Zulip):

What is your $CC/$CXX?

nikomatsakis (Nov 17 2018 at 16:00, on Zulip):

can't tell you right now, rebooting :)

nagisa (Nov 17 2018 at 16:00, on Zulip):

I found that compiling with $CC=clang/$CXX=clang++ is significantly more involved and error prone compared to $CC=gcc/$CXX=g++

nikomatsakis (Nov 17 2018 at 16:02, on Zulip):

I think I am using gcc

nikomatsakis (Nov 17 2018 at 16:02, on Zulip):

unless rust is picking clang for me

nagisa (Nov 17 2018 at 16:03, on Zulip):

what’s the version?

nagisa (Nov 17 2018 at 16:03, on Zulip):

I can confirm that 7.3.0 and 8.2.1 definitely work.

nikomatsakis (Nov 17 2018 at 16:04, on Zulip):

8.2.1

nikomatsakis (Nov 17 2018 at 16:04, on Zulip):

gcc version 8.2.1 20181105 (Red Hat 8.2.1-5) (GCC)

nikomatsakis (Nov 17 2018 at 16:04, on Zulip):

currently upgrading, though, which I wanted to do anyway

nikomatsakis (Nov 17 2018 at 16:05, on Zulip):

everybody knows the best way to debug a problem is to add more variables, after all

nagisa (Nov 17 2018 at 16:29, on Zulip):

@nikomatsakis still successfully getting through compilation into linking (with 8.2.0 this time)

nikomatsakis (Nov 18 2018 at 14:35, on Zulip):

well, FC29 solved my immediate problems, but I still get irregular crashes from stage0 rustc:

[54386.877422] rustc[13315]: segfault at 10 ip 00007f5e74719540 sp 00007f5d8dbfd128 error 4 in librustc_codegen_llvm-llvm.so[7f5e725a8000+24a7000]
[54386.877431] Code: cf 41 83 c1 01 21 cf 48 8d 34 7f 49 8d 14 f2 49 8b 34 f2 4c 39 f6 74 90 eb d3 4d 85 c0 49 0f 45 d0 eb 99 cc cc cc cc cc cc cc <8a> 0e 89 c8 04 fc 3c 1a 76 1f 89 ca 80 c2 ff 31 c0 80 fa 02 48 0f
[55694.589465] rustc[26736]: segfault at 8 ip 00007f2c1943f031 sp 00007f2c07fc00f0 error 4 in libc-2.28.so[7f2c193da000+14d000]
[55694.589474] Code: 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 53 48 83 ec 10 48 8b 05 d0 8e 13 00 48 8b 00 48 85 c0 0f 85 84 00 00 00 48 85 ff 74 6f <48> 8b 47 f8 48 8d 77 f0 a8 02 75 3b 48 8b 15 24 8d 13 00 64 48 83

the core dumps are getting truncated, I'm not sure how to change that:

Sun 2018-11-18 07:00:16 EST   12843  1000  1000  11 truncated /home/nmatsakis/versioned/rust-2/build/x86_64-unknown-linux-gnu/stage0/bin/rustc
Sun 2018-11-18 07:22:06 EST   24592  1000  1000  11 truncated /home/nmatsakis/versioned/rust-2/build/x86_64-unknown-linux-gnu/stage1/bin/rustc

simulacrum (Nov 18 2018 at 14:36, on Zulip):

@nikomatsakis ulimit -c unlimited I think?

nikomatsakis (Nov 18 2018 at 14:36, on Zulip):

the ulimit is already set to unlimited

nikomatsakis (Nov 18 2018 at 14:36, on Zulip):

> ulimit -c
unlimited

simulacrum (Nov 18 2018 at 14:39, on Zulip):

@nikomatsakis Hm, what about cat /proc/sys/kernel/core_pattern? Maybe that's cutting it somehow?

nikomatsakis (Nov 18 2018 at 14:40, on Zulip):

|/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h %e

simulacrum (Nov 18 2018 at 14:40, on Zulip):

I just have core in that file, maybe systemd has some special behavior?

simulacrum (Nov 18 2018 at 14:41, on Zulip):

Maybe also check core_pipe_limit in that same directory?

nikomatsakis (Nov 18 2018 at 14:41, on Zulip):

plausible. core_pipe_limit is set to 0

nikomatsakis (Nov 18 2018 at 14:41, on Zulip):

I'm not really familiar with all this machinery

simulacrum (Nov 18 2018 at 14:42, on Zulip):

Yeah, me neither -- I know historically if that file didn't contain just core or something then I'd end up with no logs whatsoever
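
The checks discussed in the last few messages can be gathered into one place. This is only a sketch: `report_core_settings` is a hypothetical helper name, and the `/proc/sys/kernel` files it reads exist only on Linux.

```shell
# Sketch: print the settings that influence where core dumps go and how
# large they may get. `report_core_settings` is a hypothetical helper name.
report_core_settings() {
    # Per-process core size limit for this shell ("unlimited" or a number).
    echo "ulimit -c: $(ulimit -c)"

    # Kernel-side settings; these files only exist on Linux.
    for name in core_pattern core_pipe_limit; do
        file="/proc/sys/kernel/$name"
        if [ -r "$file" ]; then
            echo "$name: $(cat "$file")"
        else
            echo "$name: (not readable on this system)"
        fi
    done
}

report_core_settings
```

When `core_pattern` starts with `|`, as in the output quoted above, the kernel pipes the dump to that program (here `systemd-coredump`), so that program's own size limits can apply in addition to `ulimit -c`.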

nikomatsakis (Nov 18 2018 at 14:43, on Zulip):

from /etc/systemd/coredump.conf:

[Coredump]
#Storage=external
#Compress=yes
#ProcessSizeMax=2G
#ExternalSizeMax=2G
#JournalSizeMax=767M
#MaxUse=
#KeepFree=

nikomatsakis (Nov 18 2018 at 14:43, on Zulip):

perhaps those limits are the problem

nikomatsakis (Nov 18 2018 at 14:46, on Zulip):

(upped to 8G, we'll see)
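
A sketch of what raising those limits could look like, assuming the two size caps `ProcessSizeMax` and `ExternalSizeMax` are the ones that were upped; the log does not say which keys were changed. The helper name `write_coredump_limits` and the drop-in file name are hypothetical, and writing under `/etc` needs root.

```shell
# Sketch: write a systemd-coredump override raising the size caps to 8G.
# Assumption: ProcessSizeMax and ExternalSizeMax are the relevant keys.
write_coredump_limits() {
    target="$1"   # e.g. /etc/systemd/coredump.conf.d/size.conf (hypothetical name)
    mkdir -p "$(dirname "$target")"
    cat > "$target" <<'EOF'
[Coredump]
ProcessSizeMax=8G
ExternalSizeMax=8G
EOF
}
```

If the caps were indeed the cause, crashes recorded after this change should no longer carry the `truncated` marker in the listing shown earlier.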

simulacrum (Nov 18 2018 at 14:49, on Zulip):

Does seem plausible

nikomatsakis (Nov 18 2018 at 14:54, on Zulip):

I should probably file an issue, or search for others

nikomatsakis (Nov 18 2018 at 15:04, on Zulip):

at least I can now build LLVM w/o mysterious crashes, it seems

Last update: Nov 16 2019 at 02:20UTC