The safe_arch crate is now at 0.3 and is ready for a good "kick the tires" test run.
@Lokathor Just to make sure, the bit saying "Intel (x86 / x86_64)" is actually false, right? I assume it supports AMD as well?
oh yeah, i always forget about AMD being a thing XD.
Yes it's all x86 / x86_64
Docs are fixed in
Technically, VIA is still a thing too, as are some other x86 vendors, they're just uncommon. You might consider "x86 and x86-64 (Intel, AMD, etc)".
Also, a request for the documentation:
For the benefit of people who do know the instruction they want, and don't know the name of the corresponding function, would you consider systematically including the instruction mnemonic in the documentation?
That way, people can find it by searching.
And, if they need to know the precise instruction semantics, they can also more easily look up the instruction in the SDM.
Also, in the phrase "Compliment Flag", I think you wanted "complement"? But also, that isn't the name of a flag on x86. I think you mean "carry flag".
https://docs.rs/safe_arch/0.3.1/safe_arch/fn.test_mixed_ones_and_zeroes_m128i.html is one place I saw that, and the corresponding instruction, ptest, uses the carry flag.
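As a rough illustration of the flag behavior being discussed, here is a pure-Rust model of ptest. It is only a sketch: a u128 stands in for the 128-bit register, and `PtestFlags`, `ptest_model`, and `test_mix_ones_zeros_model` are made-up names, not part of safe_arch or core::arch. The doc comments also show one possible way to surface the mnemonic and intrinsic name in rustdoc search, assuming a toolchain where `#[doc(alias)]` is available.

```rust
/// The two flags this model tracks (hypothetical type, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PtestFlags {
    /// Zero flag: set when `a AND mask` is all zeros.
    zf: bool,
    /// Carry flag: set when `(NOT a) AND mask` is all zeros.
    cf: bool,
}

/// Models the flags produced by testing `a` against `mask`.
///
/// * Instruction: `ptest`
/// * Intrinsic: `_mm_test_mix_ones_zeros` (among others that use `ptest`)
#[doc(alias = "ptest")]
fn ptest_model(a: u128, mask: u128) -> PtestFlags {
    PtestFlags {
        zf: (a & mask) == 0,
        cf: (!a & mask) == 0,
    }
}

/// Models the "mixed ones and zeros" result: true only when
/// neither the zero flag nor the carry flag is set.
fn test_mix_ones_zeros_model(a: u128, mask: u128) -> bool {
    let flags = ptest_model(a, mask);
    !flags.zf && !flags.cf
}
```

With this model, the result is true only when the masked bits of `a` contain at least one 1 and at least one 0, which is why the carry flag (not a "complement flag") is the one that matters alongside the zero flag.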
ah, good catch, yeah i meant carry
and yeah putting instruction name into the docs so that a search shows it would be good. so far ive only been using actual searches in the files via my editor, which works accurately, but rustdoc doesn't search source of course.
The instructions aren't necessarily in the sources, either, as far as I can tell.
The source mentions _mm_test_mix_ones_zeros (a name which might also want to be in the docs), but doesn't mention the underlying instruction.
ahhh... did you mean you want the assembly as well as the intrinsic name?
okay, right, can probably do that over time.
Yeah, I meant the instruction mnemonic. The intrinsic name would be useful, but I was originally thinking of the instruction mnemonic. :)
For the intrinsic name and assembly op i opened a new issue here, https://github.com/Lokathor/safe_arch/issues/49, but that particular thing will probably be a slow, slow task. I'll probably experiment with it some as I put in the new avx2 things and then go back and revise previous entries later.
I appreciate it, thank you. Would have helped me track down the flag that that instruction actually used. :)
This is something I hope you'd be able to partially automate.
Intrinsic name, yes, assembly instruction, probably not easily.
At least right now the crate is 0% automated.
The data at https://software.intel.com/sites/landingpage/IntrinsicsGuide/ includes the names of the underlying instructions.
Right, that's what i've been using so far, but I don't have an easy way to scrape the page to associate an intrinsic name with an assembly op.
The raw data is around somewhere...
<intrinsic tech="SSE4.1" vexEq="TRUE" name="_mm_test_mix_ones_zeros">
  <type>Integer</type>
  <type>Flag</type>
  <CPUID>SSE4.1</CPUID>
  <category>Logical</category>
  <return type="int" varname="dst" etype="UI32"/>
  <parameter type="__m128i" varname="a" etype="M128"/>
  <parameter type="__m128i" varname="mask" etype="M128"/>
  <description>Compute the bitwise AND of 128 bits (representing integer data) in "a" and "mask", and set "ZF" to 1 if the result is zero, otherwise set "ZF" to 0. Compute the bitwise NOT of "a" and then AND with "mask", and set "CF" to 1 if the result is zero, otherwise set "CF" to 0. Return 1 if both the "ZF" and "CF" values are zero, otherwise return 0.</description>
  <operation>
IF ((a[127:0] AND mask[127:0]) == 0)
  ZF := 1
ELSE
  ZF := 0
FI
IF (((NOT a[127:0]) AND mask[127:0]) == 0)
  CF := 1
ELSE
  CF := 0
FI
IF (ZF == 0 && CF == 0)
  dst := 1
ELSE
  dst := 0
FI
  </operation>
  <instruction name="PTEST" form="xmm, xmm" xed="PTEST_XMMdq_XMMdq"/>
  <header>smmintrin.h</header>
</intrinsic>
Digging around, I don't see any actual license attached to that data, unfortunately. So the descriptions shouldn't be directly reused. But you could at least get the mapping data from that.
And you could link to the intrinsics guide, like the built-in intrinsics do.
Hope that helps. :)
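To make the "mapping data" idea concrete, here is a minimal sketch that pulls (intrinsic name, instruction mnemonic) pairs out of XML shaped like the fragment above. It uses naive string scanning with only the standard library, and it assumes every entry looks like that one fragment (an `<intrinsic ... name="...">` element containing `<instruction name="..."/>` children, with no `>` inside attribute values). The function names are invented for illustration; a real XML parser would be the sturdier choice.

```rust
/// Returns the value of `attr="..."` within the text of a single tag.
/// Only whole attribute names match, so `name` does not match `varname`.
fn attr_value<'a>(tag: &'a str, attr: &str) -> Option<&'a str> {
    let needle = format!("{}=\"", attr);
    let mut from = 0;
    while let Some(pos) = tag[from..].find(&needle) {
        let at = from + pos;
        let start = at + needle.len();
        let whole = at == 0 || tag[..at].ends_with(char::is_whitespace);
        if whole {
            let end = start + tag[start..].find('"')?;
            return Some(&tag[start..end]);
        }
        from = start;
    }
    None
}

/// Collects (intrinsic name, lowercase mnemonic) pairs from the XML text.
fn intrinsic_to_instruction(xml: &str) -> Vec<(String, String)> {
    let mut pairs = Vec::new();
    for chunk in xml.split("<intrinsic ").skip(1) {
        // Restrict the search to this one element.
        let body = chunk.split("</intrinsic>").next().unwrap_or(chunk);
        let open_tag = match body.find('>') {
            Some(end) => &body[..end],
            None => continue,
        };
        let name = match attr_value(open_tag, "name") {
            Some(name) => name,
            None => continue,
        };
        // An intrinsic may list more than one instruction.
        for inst in body.split("<instruction ").skip(1) {
            let tag = inst.split('>').next().unwrap_or(inst);
            if let Some(mnemonic) = attr_value(tag, "name") {
                pairs.push((name.to_string(), mnemonic.to_lowercase()));
            }
        }
    }
    pairs
}
```

Only the names are extracted here, not the descriptions, which matches the licensing concern raised above.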
Yeah, I mean in general their descriptions of the operations range between "okay" and "what the heck?", but being able to easily check the assembly helps.
Yeah, these are operational descriptions, which don't necessarily make good documentation. :)
But the data would help you check the assembly instruction, as well as confirming the types and making sure they correspond to your naming scheme (because this kind of thing is really prone to copy-paste issues).
tiny copy-paste bugs are known as "doing a loka" in some parts of the Rust Discord universe.
A few other bits of feedback:
The general term blend should be defined along with the other verbs you have a glossary for (and thank you for that glossary!).
shift_left and shift_right seem rather verbose; even Rust's own traits abbreviate those as shl and shr. I think as long as you have them both mentioned in the glossary, and keep them separated from other words in the name with _, they'll be quite readable to people. And avoiding that bit of verbosity seems likely to substantially help with over-wide lines that can themselves reduce readability.
shuffle (and contrasting the differences) would help.
yeah, shl and shr are probably better names. In 0.2 the names were even longer actually, because they also had "immediate" in there. The lib has enough macros at this point that people will probably end up coming to grips with "macro means there's an immediate involved" on their own, though i'd like to make that clearer probably. Some of the early macros said "this is a macro because it needs to be a compile time const" but the most recent macros started to shift away from that.
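For reference, the abbreviations come from the standard library's `std::ops::Shl` and `std::ops::Shr` traits, whose methods are named `shl` and `shr`. A small sketch, using a made-up `Lanes4` wrapper as a stand-in for a SIMD register type (it is not a safe_arch type):

```rust
use std::ops::{Shl, Shr};

/// Hypothetical four-lane wrapper, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Lanes4([u32; 4]);

impl Shl<u32> for Lanes4 {
    type Output = Lanes4;

    /// Shifts every lane left by `count` bits.
    /// Note: a plain `<<` overflows (and panics in debug builds) if `count >= 32`.
    fn shl(self, count: u32) -> Lanes4 {
        Lanes4(self.0.map(|lane| lane << count))
    }
}

impl Shr<u32> for Lanes4 {
    type Output = Lanes4;

    /// Shifts every lane right by `count` bits (logical shift, since lanes are unsigned).
    fn shr(self, count: u32) -> Lanes4 {
        Lanes4(self.0.map(|lane| lane >> count))
    }
}
```

With these impls, `Lanes4([1, 2, 3, 4]) << 1` and `Lanes4([8, 4, 2, 1]) >> 1` work through the operators, and the method names `shl` / `shr` are the ones people already know from the traits.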
Also, a typo:
convert_i16_lower2_to_i64_m128i: Convert the lower two i16 lanes to two i32 lanes.
That should be two i64 lanes, right?
ah, yes, doc text falls to copy-paste bugs the most because cargo test sadly doesn't (yet?) check your words themselves.
There's one thing you could probably check: if the summary line of your docs mentions a type that the function name doesn't (counting the implicit things like m128d implying two
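A check along those lines could be sketched as follows. Everything here is an assumption made for illustration: the list of type names, the function names, and the "implied lane type" rule (m128 / m256 imply f32 and m128d / m256d imply f64). It simply reports type names that appear in a summary line but are neither in the function name nor implied by it.

```rust
/// Type names the check looks for (an assumed, incomplete list).
const TYPE_NAMES: &[&str] = &[
    "i8", "i16", "i32", "i64", "u8", "u16", "u32", "u64", "f32", "f64",
    "m128", "m128i", "m128d", "m256", "m256i", "m256d",
];

/// Splits text into alphanumeric tokens, so `_`, spaces, and punctuation separate words.
fn tokens(text: &str) -> Vec<&str> {
    text.split(|c: char| !c.is_ascii_alphanumeric())
        .filter(|t| !t.is_empty())
        .collect()
}

/// Lane type that a register type in the name implies (assumed mapping).
fn implied(name_token: &str) -> Option<&'static str> {
    match name_token {
        "m128" | "m256" => Some("f32"),
        "m128d" | "m256d" => Some("f64"),
        _ => None,
    }
}

/// Returns type names mentioned in `summary` that `fn_name` neither contains nor implies.
fn unexpected_types<'a>(fn_name: &str, summary: &'a str) -> Vec<&'a str> {
    let name_tokens = tokens(fn_name);
    let mut allowed: Vec<&str> = name_tokens.clone();
    allowed.extend(name_tokens.iter().filter_map(|t| implied(t)));

    let mut found = Vec::new();
    for t in tokens(summary) {
        if TYPE_NAMES.contains(&t) && !allowed.contains(&t) && !found.contains(&t) {
            found.push(t);
        }
    }
    found
}
```

Run against the typo above, the function name convert_i16_lower2_to_i64_m128i with the summary "Convert the lower two i16 lanes to two i32 lanes." would be flagged for i32, while the corrected summary would pass.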
One future plan is to re-review every single function in random order (a few at a time) so that (hopefully) each individual function has to fully stand on its own without the brain just assuming the context from the previous similar function.
That might help. That's roughly how I caught this.
(Also, as an aside, the whole lower4 naming is a great clarification.)
Regarding https://github.com/Lokathor/safe_arch/commit/24f28a9aa43493ec982967a7222fb0eaacf02d48 , please don't link directly to the XML file; its URL is versioned, and there may be a newer version. If you're going to link to the versioned XML file, please at least include a note that there may be a newer version, and add a step to your release procedure to check for one. :)
ah, hmm, well how did you find the 3.5 version?
Browser console. :/
https://software.intel.com/sites/landingpage/IntrinsicsGuide/files/ReleaseNotes.html will tell you the current version number, though.
And you can edit the URL accordingly.
There really should be a "latest" URL. :(
alright readme should be better now.
I think that the naming gets easier the more you know about every single operation available, so i have a lot of sympathy for whoever had to name some of the early sse stuff without really knowing what would eventually exist later on.