Stream: t-compiler/const-eval

Topic: Symbolic execution of MIR


Christian Poveda (Mar 30 2019 at 00:09, on Zulip):

Hello everyone, I'm doing some work related to const generics which is writing an small symbolic interpreter for MIR as a part of my masters thesis. Right now it can only execute a reduced set of MIR expressions but I believe it could be explored further.

I'd like to know if there is some way of taking Rust code, compile it into MIR and then use my code to execute and manipulate the resulting MIR code, I assume this is possible given that miri is a separated project

oli (Mar 30 2019 at 18:06, on Zulip):

You'll have to create your own driver (see src/bin/miri.rs in miri or src/bin/driver.rs in clippy). You can fetch the main function's MIR like miri does it and then process the resulting MIR as your see fit

Christian Poveda (Mar 30 2019 at 20:57, on Zulip):

Thank you! I'm just checking src/bin/miri.rs. I guess that in https://github.com/rust-lang/miri/blob/master/src/bin/miri.rs#L44 miri is fetching the id of the main function and then miri::eval_main will take the mir of the main function from the context using entry_def_id. Am i correct?

oli (Mar 31 2019 at 15:38, on Zulip):

yes

Christian Poveda (Apr 02 2019 at 18:54, on Zulip):

Thank you for your help! I've been able to do a lot since friday. Currently I can evaluate recursive functions without side effects and I hope to do a lot of improvements this week

Christian Poveda (Apr 02 2019 at 18:55, on Zulip):

If anyone wants to see the code: https://github.com/christianpoveda/sire

Christian Poveda (Apr 06 2019 at 19:39, on Zulip):

Currently my project is able to evaluate recursive functions with conditionals by creating an interpreter for each branch and leaving stated the recursive call. However, functions with loops cannot be evaluated given that blocks don't exist in my representation. It would make sense to try to separate the blocks in a loop into an auxiliary function? For example, if I have the function

fn foo(x: u32) -> u32 {
    let mut r = 0;
    let mut i = 0;
    while i < x {
        r += i;
        i += 1;
    }
    r
}

the MIR of foo contains the following transitions between basic blocks:
- bb0 -> bb1.
- bb1 -> bb2 or bb3
- bb3 -> bb1

I was wondering if it is reasonable to move the blocks bb1 and bb3 to its own function to be able to write the MIR of foo as a recursive function. However, I'm worried about the differences when a panic happens for example. Also I'm not sure if the subset of MIR functions that are produced from valid Rust functions is restricted enough to be able to do this in all cases.

oli (Apr 07 2019 at 06:01, on Zulip):

Can't you treat every BB as its own function in your evaluator? Instead of rewriting the MIR, just treat it as if that were the representation

Christian Poveda (Apr 07 2019 at 17:40, on Zulip):

I could, but then some simple functions would have a really long representation

Christian Poveda (Apr 07 2019 at 17:40, on Zulip):

I guess I could represent every block as a function and then check if I can inline it

Christian Poveda (Apr 19 2019 at 17:05, on Zulip):

i was checking the mir generated for a function containing a for loop. I realized that even with level 3 optimizations the mir is really huge (or at least more than I expected after checking a similar function with a while loop. There are a couple terminators that i've never seen falseEdges and falseUnwind, what are those doing?

RalfJ (Apr 19 2019 at 17:06, on Zulip):

that's NLL stuff, I think

Christian Poveda (Apr 19 2019 at 17:11, on Zulip):

I was checking the same function with the playground but I've realised that is smaller than mine. I guess release includes more compiler flags than just opt-level 3?

RalfJ (Apr 19 2019 at 17:12, on Zulip):

release affects which MIR gets generated

RalfJ (Apr 19 2019 at 17:12, on Zulip):

like, whether arithmetic is translated with or without overflow checks

RalfJ (Apr 19 2019 at 17:12, on Zulip):

and debug assertions, obviously

Christian Poveda (Apr 19 2019 at 17:12, on Zulip):

how can I reproduce the behaviour of the compiler under release using only rustc?

RalfJ (Apr 19 2019 at 17:26, on Zulip):

release mode corresponds to the -O flag

RalfJ (Apr 19 2019 at 17:27, on Zulip):

also see rustc --help

Christian Poveda (Apr 19 2019 at 17:31, on Zulip):

opt-level=2?

Christian Poveda (Apr 19 2019 at 17:32, on Zulip):

i see no discernible difference in the number of blocks

Christian Poveda (Apr 19 2019 at 17:32, on Zulip):

there are around 30 and the playground version has 7

Matthew Jasper (Apr 19 2019 at 17:41, on Zulip):

There are a couple terminators that i've never seen falseEdges and falseUnwind, what are those doing?

These shouldn't exist in optimized MIR (after all passes have been performed).

RalfJ (Apr 19 2019 at 17:59, on Zulip):

@Christian Poveda could you share the code you are using to test this?

Christian Poveda (Apr 19 2019 at 18:00, on Zulip):

https://play.integer32.com/?version=nightly&mode=release&edition=2018&gist=7dfbc38e5d8b4e3e79cd37f8f0c5f6ed

rkruppe (Apr 19 2019 at 18:05, on Zulip):

Is the MIR you're looking at locally produced by rustc ... --emit mir or is it (a dump of) the MIR that your custom driver gets?

RalfJ (Apr 19 2019 at 18:18, on Zulip):

@Christian Poveda I can reproduce the playground output with just rustc --emit mir -O file.rs

Christian Poveda (Apr 19 2019 at 18:30, on Zulip):

Is the MIR I get when dumping the mir using the -Z flag

Christian Poveda (Apr 19 2019 at 18:30, on Zulip):

Maybe im looking the wrong mir file?

rkruppe (Apr 19 2019 at 18:31, on Zulip):

Oh the -Z flag might dump after every pass, in that case it would be very easy to look at a file that's wildly different from the final MIR that playground/--emit mir show

Christian Poveda (Apr 19 2019 at 18:45, on Zulip):

I'm gonna check the mir i'd get from the custom driver

Christian Poveda (Apr 19 2019 at 18:47, on Zulip):

Ok yes the driver gets a short mir also

Christian Poveda (Apr 19 2019 at 18:47, on Zulip):

is it there any way to "emit" the .dot diagram for this?

RalfJ (Apr 19 2019 at 18:49, on Zulip):

hm, it is still strange though that there are no dump files of sum from phase 3

RalfJ (Apr 19 2019 at 18:50, on Zulip):
-rw-r--r-- 1 r r 10353 Apr 19 20:46 rustc.sum.001-000.SimplifyCfg-initial.after.mir
-rw-r--r-- 1 r r 11972 Apr 19 20:46 rustc.sum.001-000.SimplifyCfg-initial.before.mir
-rw-r--r-- 1 r r 10343 Apr 19 20:46 rustc.sum.001-001.TypeckMir.after.mir
-rw-r--r-- 1 r r 10344 Apr 19 20:46 rustc.sum.001-001.TypeckMir.before.mir
-rw-r--r-- 1 r r 10345 Apr 19 20:46 rustc.sum.001-002.SanityCheck.after.mir
-rw-r--r-- 1 r r 10346 Apr 19 20:46 rustc.sum.001-002.SanityCheck.before.mir
-rw-r--r-- 1 r r 10353 Apr 19 20:46 rustc.sum.001-003.UniformArrayMoveOut.after.mir
-rw-r--r-- 1 r r 10354 Apr 19 20:46 rustc.sum.001-003.UniformArrayMoveOut.before.mir
-rw-r--r-- 1 r r 10360 Apr 19 20:46 rustc.sum.002-000.QualifyAndPromoteConstants.after.mir
-rw-r--r-- 1 r r 10361 Apr 19 20:46 rustc.sum.002-000.QualifyAndPromoteConstants.before.mir
-rw-r--r-- 1 r r 10360 Apr 19 20:46 rustc.sum.002-001.SimplifyCfg-qualify-consts.after.mir
-rw-r--r-- 1 r r 10361 Apr 19 20:46 rustc.sum.002-001.SimplifyCfg-qualify-consts.before.mir
-rw-r--r-- 1 r r 11955 Apr 19 20:46 rustc.sum.-------.mir_map.0.mir
RalfJ (Apr 19 2019 at 18:51, on Zulip):

ah... it's because the fn is never called so doesnt get dumped

RalfJ (Apr 19 2019 at 18:52, on Zulip):

@Christian Poveda mir_dump/rustc.sum.003-027.PreCodegen.after.mir matches the MIR on playground

RalfJ (Apr 19 2019 at 18:52, on Zulip):

generated by rustc -Zdump-mir=all p.rs -O --emit mir

RalfJ (Apr 19 2019 at 18:53, on Zulip):

also see https://github.com/rust-lang/rust/issues/43620

Christian Poveda (Apr 19 2019 at 18:54, on Zulip):

oh ok, i guess that solves the problem

Christian Poveda (Apr 19 2019 at 19:00, on Zulip):

well thank you both :) this has been really helpful

Christian Poveda (Apr 29 2019 at 22:42, on Zulip):

It seems I'm misunderstanding the semantic of StorageLive/Dead, why the mir of the is_odd function does not have an StorageLive(_4) statement? https://play.integer32.com/?version=nightly&mode=release&edition=2018&gist=756a4645cf1d86997a90708e1beb9966

Christian Poveda (Apr 30 2019 at 04:28, on Zulip):

as a side question, why does 2 == 0 does not evaluate to false immediately?

RalfJ (Apr 30 2019 at 10:17, on Zulip):

some variables are implicitly live and never have a Storage annotation. I don't know why.

Christian Poveda (Apr 30 2019 at 13:08, on Zulip):

Uhhh i started a new thread by accident...

Christian Poveda (Apr 30 2019 at 13:10, on Zulip):

I was thinking on just allocating space for every local at the same time when the function argument locals are allocated and freeing all the locals at the end of execution. Otherwise I'd have to check which locals lack of an StorageLive and StorageDead statement before even executing the function.

oli (Apr 30 2019 at 13:14, on Zulip):

no that does not happen right now

oli (Apr 30 2019 at 13:15, on Zulip):

yea, we do that check in clippy, but we optimized it out for some situations

Christian Poveda (Apr 30 2019 at 13:18, on Zulip):

I guess for the moment is enough to reserve everyone at the same time and just ignore the Storage statements then, I'll let a comment on my code in case this becomes relevant after

Christian Poveda (Apr 30 2019 at 13:19, on Zulip):

It seems I'm misunderstanding the semantic of StorageLive/Dead, why the mir of the is_odd function does not have an StorageLive(_4) statement? https://play.integer32.com/?version=nightly&mode=release&edition=2018&gist=756a4645cf1d86997a90708e1beb9966

I have another question, why on that specific example _4 is not resolved to false during compilation?

oli (Apr 30 2019 at 13:19, on Zulip):

I think the only thing that can be problematic with the way you're addressing is that you may not notice if someone is accessing a dangling pointer, because your local still lives

Christian Poveda (Apr 30 2019 at 13:21, on Zulip):

I think the only thing that can be problematic with the way you're addressing is that you may not notice if someone is accessing a dangling pointer, because your local still lives

Jmm... the other option is to still allocate and free using StorageLive/Dead and also at the beginning and end of execution and do double frees #YOLO

oli (Apr 30 2019 at 13:22, on Zulip):

It seems I'm misunderstanding the semantic of StorageLive/Dead, why the mir of the is_odd function does not have an StorageLive(_4) statement? https://play.integer32.com/?version=nightly&mode=release&edition=2018&gist=756a4645cf1d86997a90708e1beb9966

I have another question, why on that specific example _4 is not resolved to false during compilation?

because https://rust-lang.zulipchat.com/#narrow/stream/189540-t-compiler.2Fwg-mir-opt/topic/Other.20available.20work.3F is still a WIP (const propagation is not actually propagating yet)

Christian Poveda (Apr 30 2019 at 13:26, on Zulip):

I think the only thing that can be problematic with the way you're addressing is that you may not notice if someone is accessing a dangling pointer, because your local still lives

Jmm... the other option is to still allocate and free using StorageLive/Dead and also at the beginning and end of execution and do double frees #YOLO

I thing I'm gonna do the Right Thing™ and check which locals lack of StorageLive/Dead statements. Is possible that one local has just one of those two?

Christian Poveda (Apr 30 2019 at 13:27, on Zulip):

Meaning that they are allocated explicitly but freed at the end of the function, and viceversa?

oli (Apr 30 2019 at 13:27, on Zulip):

I think that used to be a problem, not sure if it is anymore

oli (Apr 30 2019 at 13:27, on Zulip):

miri just checks if either exist

RalfJ (Apr 30 2019 at 13:34, on Zulip):

if a local has either, that's weird but shouldnt need a special case -- that local may not get used until a storagelive happened. (so if it has just a StorageDead, it may never be used at all.)

Christian Poveda (Apr 30 2019 at 13:37, on Zulip):

I'll get hands on then

Christian Poveda (Apr 30 2019 at 13:50, on Zulip):

Ok It worked!!!

Christian Poveda (Apr 30 2019 at 13:50, on Zulip):

:D

Christian Poveda (Apr 30 2019 at 14:33, on Zulip):

I'm going to check that the whole memory is empty after execution

Christian Poveda (Apr 30 2019 at 14:33, on Zulip):

just to be sure

Last update: Nov 15 2019 at 20:05UTC