It would be nice if the user could choose from a small collection of examples...
E.g., here is another example...
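#include <tmmintrin.h>  // SSSE3, needed for _mm_alignr_epi8
// Per-lane difference between each 32-bit value and the one before it.
__m128i Delta(__m128i curr, __m128i prev) {
  // a = [prev[3], curr[0], curr[1], curr[2]]: curr shifted up one lane, filled from prev
  __m128i a = _mm_alignr_epi8(curr, prev, 12);
  return _mm_sub_epi32(curr, a);
}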
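For anyone reading this without the intrinsics guide open, here is a rough scalar sketch of what I believe `Delta` computes on four 32-bit lanes. The name `delta_scalar` and the array-based signature are made up for illustration and are not part of any existing code. Unsigned arithmetic is used so that wraparound matches `_mm_sub_epi32`.

```c
#include <stdint.h>

/* Hypothetical scalar reference for the SIMD Delta (illustrative name only).
   out[0] = curr[0] - prev[3]; out[i] = curr[i] - curr[i-1] for i = 1..3.
   out must not alias curr or prev. */
static void delta_scalar(const uint32_t curr[4], const uint32_t prev[4],
                         uint32_t out[4]) {
    /* Lane 0 takes its predecessor from the last lane of the previous vector. */
    out[0] = curr[0] - prev[3];
    /* The remaining lanes subtract their left neighbour within curr. */
    for (int i = 1; i < 4; ++i) {
        out[i] = curr[i] - curr[i - 1];
    }
}
```

With `prev = {1, 2, 3, 4}` and `curr = {10, 20, 30, 45}`, this gives `{6, 10, 10, 15}`, which is what the SIMD version should produce lane by lane.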
cc @pmntang