Rv kmir milestone 3 #1021

ehildenb · 2023-10-02T21:44:34Z

Milestone Delivery Checklist

The milestone-delivery-template.md has been copied and updated.
This pull request is being made by the same account as the accepted application.
I have disclosed any and all sources of reused code in the submitted repositories and have done my due diligence to meet its license requirements.
In case of acceptance, an invoice must be submitted and the payment will be transferred to the BTC/ETH/fiat account provided in the application.
The delivery is according to the Guidelines for Milestone Deliverables.

Link to the application pull request: w3f/Grants-Program#1422 < please fill this in with the PR number of your application.

dsm-w3f · 2023-10-06T11:10:40Z

@ehildenb thank you for the milestone delivery. Please see the evaluation document and provide proper answers and fixes. After that, let mi know when I can continue this evaluation.

dkcumming · 2023-10-12T00:01:35Z

Hi @dsm-w3f, Everett is away so I can help instead.

Skipped tests: The current test suite that we have can be divided into two sections, handwritten and compiletest-rs/ui/. handwritten tests are tests of our own creation, while the compiletest-rs/ui/ tests are a restricted set of the full rustc ui test suite. Our restrictions are that the tests must successfully compile to MIR with rustc 1.72.0-nightly (46514218f 2023-06-20) of 2021 edition rust. compiletest-rs/ui/ can therefore be thought to be a representation of the full conformance suite, featuring all rust constructs and paradigms. It should then be expected that the version of KMIR that this milestone delivers only conforms to a subset of those tests, as the project moves towards covering more semantics. Our workflow is to keep the full test suite on integration, but flag tests we are not currently handling in a file and skip those tests on a run.
Test coverage: This also relates to 1. above, where I mentioned that it is not expected that we pass all the ui/ tests yet. Our current execution coverage is 375 / 2555 which has 2180 skipped. It should be noted that this is not necessarily a reasonable metric of our coverage, a test is skipped if it contains a feature we do not currently handle (e.g. recursion) however we may handle all of the rest of the test except that one feature. Furthermore, this current proposal is at an early stage of exploration, thus we do not setup test coverage metrics monitoring. However, we will have such metrics in the following up proposal (also listed in the future work section of the original proposal).
Unimplemented features: As mentioned above we are working on coverage of the full semantics of rust but that is not expected at this stage. These productions that you listed are placeholders for planned future work.
Kmir run: is running the Rust program using the semantics we implemented in K. Our conformance tests is setup to test running a Rust program using KMIR is producing the same results as executing the Rust natively. However, it should also be noted that it is not a necessary requirement to establish our semantic implementation is equivalent to native Rust. A different result could indicate a problem in our implementation or the Rust compiler. The two flags that are working for output currently are kore and pretty. kore will output the K internal representation of the KMIR program, this is useful for input to other K tools and for low level debugging - however it is not very human readable. pretty has the same information but shows the state in XML-like notation which is much easier for a human to navigate. I believe these two outputs are satisfactory for this stage of the proposal. A recommendation I have, perhaps some intuition of what kmir run is doing could be gained by using the --depth flag along with --output pretty for a particular example. sum-to-n.mir is quite straight forward (matching rust program), and you can start with --depth 0 to see the intial configuration, empty except for the mir program ready on the <k> cell. Increasing the depth will show move through phases Initialization, to Execution, until the program terminates in Finalization and the state can be observed as it transforms throughout this process.

I hope that provides some clarity, and let me know if there is anything more that I can do to help.

dsm-w3f · 2023-10-16T22:41:18Z

@dkcumming thank you for the answer. It is not clear to me how to verify the claim of the tests "be thought to be a representation of the full conformance suite, featuring all rust constructs and paradigms". You also mentioned that some features, such as recursion, are still missing. This together with the percentage of tests passing ~15%, make me believe that this is a very initial work and a significant effort should be spent to complete this. Correct me if I'm wrong. How far this work are of actually be possible to be used to formally verify Rust programs? It would be possible to verify Ink! smart contracts for example? Ink! is a DSL of Rust for developing smart contracts in the Polkadot ecosystem. I saw from the application that this milestone was timeboxed to 8 weeks. It would be possible by the commits or other evidence to show that the 8 weeks of work were provided?

dkcumming · 2023-10-17T01:54:13Z

Okay, I think that we are expecting that there are broadly the same features across the tests. Do you know what things you might find reasonable indicators that the compiletest-rs submodule, is a faithful subset of the rust ui test suite? One example of a query that I can think of is keywords. If we are just restricting based on compiling to MIR with 2021 edition rust version rustc 1.72.0-nightly (46514218f 2023-06-20), which is about 18% of the tests, then we should expect pretty similar reduction of keywords as reduction of tests. If entire keywords were absent that would indicate a problem, if struct dropped off completely that would tell us the test suite doesn't cover structs. I looked up the keywords of rust from here, and I ran a simple script to count their occurrences (excluding Resevered keywords) on rust source files in our repo and the rust ui repo (both linked above):

words	KMIR	RUST	%
as	11939	38668	31%
break	214	604	35%
const	3005	10723	28%
continue	29	130	22%
crate	117	4237	2.7%
else	397	6280	6.3%
enum	410	2566	16%
extern	197	3591	5%
false	590	1584	37%
fn	8249	42942	19%
for	3008	15648	19%
if	1315	13246	10%
impl	2877	14668	20%
in	12043	70937	17%
let	8938	30027	30%
loop	203	959	21%
match	1081	6221	17%
mod	234	3394	7%
move	314	2097	15%
mut	3032	13970	22%
pub	2313	12544	18%
ref	1102	5287	20%
return	383	2006	19%
self	3099	10787	29%
Self	974	3748	26%
static	1138	5098	22%
struct	2093	12032	17%
super	46	422	11%
trait	1567	10674	15%
true	785	2404	32%
type	2033	16774	12%
unsafe	659	4249	15%
use	2188	14958	15%
where	791	3387	23%
while	115	812	14%
async	240	2138	11%
await	85	744	11%
dyn	651	3734	17%
try	135	678	20%
macro_rules	358	1437	25%
union	114	666	17%
'static	645	2870	22%
dyn	651	3734	17%

I think these numbers match up with expectation. Words of concern might be those are single digit percentage which are crate, extern, mod. A similar method could be used to query particular idioms or function calls.

dsm-w3f · 2023-10-18T19:51:06Z

@dkcumming thank you for the answer. I tried to find evidence of 8 weeks of work and I found these two PR: runtimeverification/mir-semantics#213 and runtimeverification/mir-semantics#188. Could you confirm if these PRs are related to the scope of M3 and represent the work performed for developing it?

dkcumming · 2023-10-18T23:21:41Z

Sorry I forgot to respond to the rest of your question when I did the checking of the keywords. The commits relating to M3 actually begin much further back than that. The first PR relating to 3rd milestone began was merged on March 28th PR#115 with further PRs seen in PR#117, PR#160, PR#176, PR#183, PR#186, PR#195, PR#203. These PR's directly relate to developing the semantics of MIR, or developing the executable extension kmir run with features to give users control and feedback of the execution.

You refer to PR#188, which I believe is actually beyond the scope of M3, however early implementation of this assists in giving feedback to the semantics - which it has done, so it is in cooperation with M3.

PR#213 I believe is within scope of M3, although this is a step towards a general improvement for KMIR overall. Since July, rust have accelerated development on stable mir, this is incredibly fortuitous timing as it means that there is solid footing for MIR with guarantees (within reason) that we can expect it to be unchanged. These refactors aim to match the internals of KMIR with stable MIR.

I believe the work above demonstrates that we have met and exceeded the 8 week timebox.

Also note there are other PR's in the commit history that relate to M4, kup integration.

You previously asked about the verification of Rust programs, that is futurework. This proposal is on the parsing and execution of MIR programs. Also in relation to Ink!, investigation would need to be taken to determine what would be required for KMIR to verify those smart contracts, but again this is beyond the scope of this proposal and certainly this milestone.

dsm-w3f · 2023-10-20T14:09:30Z

@dkcumming thank you for the answer. I think you provided evidence that the work intended in M3 was performed and it is working. The milestone is approved. Please submit the invoice using the link in the comment below. After that, the payment should take place within two weeks. Great job!

github-actions · 2023-10-20T14:10:55Z

🪙 Please fill out the invoice form in order to initiate the payment process. Thank you!

dkcumming · 2023-10-20T14:11:50Z

Thank you.

fededubbi · 2023-10-23T07:57:58Z

Hi @dkcumming,

I noticed that the VAT number on your invoice is wrong.
The correct one is: CHE-322.596.347.

Can you please update and resend?

Many thanks,
Federica

dkcumming · 2023-10-23T22:34:12Z

@fededubbi We sent an updated invoice, with the correct VAT number, on October 4th to the following email address: [email protected]

Is that one sufficient?

* deliveries/rv_kmir-milestone_3: milestone document

5e3a1f2

dsm-w3f self-assigned this Oct 5, 2023

dsm-w3f merged commit 7b91c38 into w3f:master Oct 20, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rv kmir milestone 3 #1021

Rv kmir milestone 3 #1021

ehildenb commented Oct 2, 2023 •

edited

Loading

dsm-w3f commented Oct 6, 2023

dkcumming commented Oct 12, 2023

dsm-w3f commented Oct 16, 2023

dkcumming commented Oct 17, 2023 •

edited

Loading

dsm-w3f commented Oct 18, 2023

dkcumming commented Oct 18, 2023 •

edited

Loading

dsm-w3f commented Oct 20, 2023

github-actions bot commented Oct 20, 2023

dkcumming commented Oct 20, 2023

fededubbi commented Oct 23, 2023

dkcumming commented Oct 23, 2023

Rv kmir milestone 3 #1021

Rv kmir milestone 3 #1021

Conversation

ehildenb commented Oct 2, 2023 • edited Loading

Milestone Delivery Checklist

dsm-w3f commented Oct 6, 2023

dkcumming commented Oct 12, 2023

dsm-w3f commented Oct 16, 2023

dkcumming commented Oct 17, 2023 • edited Loading

dsm-w3f commented Oct 18, 2023

dkcumming commented Oct 18, 2023 • edited Loading

dsm-w3f commented Oct 20, 2023

github-actions bot commented Oct 20, 2023

dkcumming commented Oct 20, 2023

fededubbi commented Oct 23, 2023

dkcumming commented Oct 23, 2023

ehildenb commented Oct 2, 2023 •

edited

Loading

dkcumming commented Oct 17, 2023 •

edited

Loading

dkcumming commented Oct 18, 2023 •

edited

Loading