assembly - GCC Extended ASM syntax: load 128-bit memory location as source
GCC generates the following code for this shuffle:
    movaps xmm0, XMMWORD PTR [rip+0x125]
    pshufb xmm4, xmm0
Ideally this should be:
    pshufb xmm4, XMMWORD PTR [rip+0x125]
What is the extended ASM syntax to generate this single instruction?
Many thanks, Adam
PS: The commented-out intrinsic generates optimal code for this example. It does not work in general (GCC is likely to generate unnecessary register copies in the presence of global register variables).
    #include <stdint.h>

    typedef int8_t xmm_t __attribute__ ((vector_size (16)));
    const xmm_t xmm_shuf = {128, 0, 1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14, 15};
    register xmm_t xmm __asm__ ("xmm4");

    #define NTL ".intel_syntax noprefix\n"
    #define ATT ".att_syntax\n"

    void shuffle () {
        // xmm = __builtin_ia32_pshufb128 (xmm, xmm_shuf);
        __asm__ (NTL "pshufb %0, %1\n" ATT : "=x" (xmm) : "x" (xmm_shuf));
    }

    int main () {}
    $ gcc -Os -std=gnu99 -msse4.1 -flax-vector-conversions pshufb_128bit_constant.c && objdump -d -m i386:x86-64:intel a.out | less
    0000000000400494 <shuffle>:
      400494: 0f 28 05 25 01 00 00    movaps xmm0,XMMWORD PTR [rip+0x125]    # 4005c0 <xmm_shuf+0x10>
      40049b: 66 0f 38 00 e0          pshufb xmm4,xmm0
      4004a0: c3                      ret
Change the input operand's constraint to "xm", so that memory locations are allowed in addition to SSE registers.
However, when I tested it, the compiler generated code that does not fit well with Intel syntax. In the end, this is what I used:
    __asm__ ("pshufb %1, %0" : "+x" (xmm) : "xm" (xmm_shuf));
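
For anyone who wants to try the "xm" constraint without a global register variable, here is a minimal sketch. It assumes x86-64, a CPU with SSSE3, and GCC or Clang in the default AT&T dialect. The names reverse_mask and reverse_bytes are made up for illustration and do not come from the post above. The "+x" constraint marks the vector as both read and written, and "xm" lets the compiler pass the constant mask either in an SSE register or directly as a memory operand.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 16-byte vector type, as in the question. */
typedef int8_t xmm_t __attribute__ ((vector_size (16)));

/* Hypothetical control mask: result byte i is taken from source byte 15 - i. */
static const xmm_t reverse_mask = {15, 14, 13, 12, 11, 10, 9, 8,
                                   7, 6, 5, 4, 3, 2, 1, 0};

/* Hypothetical helper: reverses the 16 bytes of `in` into `out`. */
void reverse_bytes (const uint8_t in[16], uint8_t out[16])
{
    xmm_t v;

    /* A legacy-SSE memory operand must be 16-byte aligned; the vector
       type already guarantees this, the assert only documents it. */
    assert (((uintptr_t) &reverse_mask & 15) == 0);

    memcpy (&v, in, sizeof v);
    /* AT&T operand order: pshufb source, destination.
       "+x": v is read and written, in an SSE register.
       "xm": the mask may be an SSE register or a memory location. */
    __asm__ ("pshufb %1, %0" : "+x" (v) : "xm" (reverse_mask));
    memcpy (out, &v, sizeof v);
}
```

Whether the compiler actually folds the mask into a memory operand depends on the compiler version and optimization level, so it is worth checking the output with objdump as in the question.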