Elliptic curve implementations: dark magic, right? We all copy the mysterious bit twiddles and have mechanically ported NaCl everywhere. But what the hell are we actually doing?
I recently implemented Ed25519 from scratch in both pure Go and (dramatically faster) amd64 assembly, spending a frankly pathological amount of time making sure I understood what I was doing, for a change. Now I'd like to share that. I'll explain the code (mine, and by extension ref10, donna, and amd64-51-30k from SUPERCOP) and the underlying concepts and design decisions behind it all. Then I'll talk about how I made the code fast: endianness tricks with big.Ints, why assembly doesn't always mean faster, how the compiler's inlining model works, and some tools that make writing Plan 9 asm less awful. Talk MAY use the “make it Go fast” joke but implementers SHOULD avoid the temptation.