
Commit 8ac6816

committed
More affine
1 parent 09c4fd6 commit 8ac6816


1 file changed

+113
-4
lines changed


gf2p8affineqb/index.html

Lines changed: 113 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -199,6 +199,7 @@ <h2>What does GF2P8AFFINEQB do?</h2>
199199
</div>
200200
</blockquote>
201201

202+
202203
<p>
203204
The matrices in the visualizations below are <i>unreversed</i> for clarity,
204205
but with the example code bit-shifting configured such that the
@@ -486,12 +487,122 @@ <h3>PS3 Cell floating point emulation</h3>
486487
<li>An underflow occurs; that is, the result before rounding is different from zero and the result after
487488
rounding is zero.</li>
488489
</ul>
489-
</blockquote>
490+
</blockquote>
490491

491492
<p>
492-
493+
RPCS3 has an optimized path for <code>shufb</code>:
493494
</p>
494495

496+
<blockquote class="ibm">
497+
<h2>Shuffle Bytes</h2>
498+
<p style="display: flex; justify-content: space-between; align-items: center;">
499+
<span><strong>shufb</strong> &nbsp;&nbsp; rt,ra,rb,rc</span>
500+
</p>
501+
<table style="width: 100%; table-layout: fixed; font-size: 0.9em;">
502+
<tr>
503+
<td style="border: none; text-align: center;">1</td>
504+
<td style="border: none; text-align: center;">0</td>
505+
<td style="border: none; text-align: center;">1</td>
506+
<td style="border: none; text-align: center;">1</td>
507+
<td colspan="7" style="border: none;"></td>
508+
<td colspan="7" style="border: none;"></td>
509+
<td colspan="7" style="border: none;"></td>
510+
<td colspan="7" style="border: none;"></td>
511+
</tr>
512+
<tr>
513+
<td>0</td>
514+
<td>1</td>
515+
<td>2</td>
516+
<td>3</td>
517+
<td colspan="7">RT</td>
518+
<td colspan="7">RB</td>
519+
<td colspan="7">RA</td>
520+
<td colspan="7">RC</td>
521+
</tr>
522+
<tr>
523+
<td style="border: none; text-align: center;">0</td>
524+
<td style="border: none; text-align: center;">1</td>
525+
<td style="border: none; text-align: center;">2</td>
526+
<td style="border: none; text-align: center;">3</td>
527+
<td style="border: none; text-align: center;">4</td>
528+
<td style="border: none; text-align: center;">5</td>
529+
<td style="border: none; text-align: center;">6</td>
530+
<td style="border: none; text-align: center;">7</td>
531+
<td style="border: none; text-align: center;">8</td>
532+
<td style="border: none; text-align: center;">9</td>
533+
<td style="border: none; text-align: center;">10</td>
534+
<td style="border: none; text-align: center;">11</td>
535+
<td style="border: none; text-align: center;">12</td>
536+
<td style="border: none; text-align: center;">13</td>
537+
<td style="border: none; text-align: center;">14</td>
538+
<td style="border: none; text-align: center;">15</td>
539+
<td style="border: none; text-align: center;">16</td>
540+
<td style="border: none; text-align: center;">17</td>
541+
<td style="border: none; text-align: center;">18</td>
542+
<td style="border: none; text-align: center;">19</td>
543+
<td style="border: none; text-align: center;">20</td>
544+
<td style="border: none; text-align: center;">21</td>
545+
<td style="border: none; text-align: center;">22</td>
546+
<td style="border: none; text-align: center;">23</td>
547+
<td style="border: none; text-align: center;">24</td>
548+
<td style="border: none; text-align: center;">25</td>
549+
<td style="border: none; text-align: center;">26</td>
550+
<td style="border: none; text-align: center;">27</td>
551+
<td style="border: none; text-align: center;">28</td>
552+
<td style="border: none; text-align: center;">29</td>
553+
<td style="border: none; text-align: center;">30</td>
554+
<td style="border: none; text-align: center;">31</td>
555+
</tr>
556+
</table>
557+
<p>
558+
Registers RA and RB are logically concatenated with the least-significant bit of RA adjacent to the most-significant bit of RB. The bytes of the resulting value are considered to be numbered from 0 to 31.
559+
</p>
560+
<p>
561+
For each byte slot in registers RC and RT:
562+
</p>
563+
<ul>
564+
<li>The value in register RC is examined, and a result byte is produced as shown in <i>Table 5-1</i>.</li>
565+
<li>The result byte is inserted into register RT.</li>
566+
</ul>
567+
<p>
568+
<i>Table 5-1. Binary Values in Register RC and Byte Results</i>
569+
</p>
570+
<table style="margin-left: 20px;">
571+
<tr>
572+
<th>Value in Register RC<br>(Expressed in Binary)</th>
573+
<th>Result Byte</th>
574+
</tr>
575+
<tr>
576+
<td>10xxxxxx</td>
577+
<td>0x00</td>
578+
</tr>
579+
<tr>
580+
<td>110xxxxx</td>
581+
<td>0xFF</td>
582+
</tr>
583+
<tr>
584+
<td>111xxxxx</td>
585+
<td>0x80</td>
586+
</tr>
587+
<tr>
588+
<td>Otherwise</td>
589+
<td>The byte of the concatenated register addressed by the rightmost 5 bits of register RC</td>
590+
</tr>
591+
</table>
592+
<pre style="margin-left: 20px; margin-top: 10px;">
593+
Rconcat ← RA || RB
594+
for j = 0 to 15
595+
b ← RCʲ
596+
If b₀:₁ = 0b10 then c ← 0x00
597+
else If b₀:₂ = 0b110 then c ← 0xFF
598+
else If b₀:₂ = 0b111 then c ← 0x80
599+
else
600+
b ← b & 0x1F;
601+
c ← Rconcatᵇ;
602+
RTʲ ← c
603+
end</pre>
604+
</blockquote>
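<p>
To make the quoted pseudocode concrete, here is a minimal scalar sketch of <code>shufb</code> in C++. It is a direct transcription of Table 5-1, not RPCS3's optimized path. The <code>Quad</code> type and the <code>shufb_reference</code> name are hypothetical, and the sketch assumes each register is stored as 16 bytes in SPU order, with index 0 as the leftmost (most-significant) byte.
</p>

```cpp
#include <array>
#include <cstdint>

// Hypothetical register model: 16 bytes, index 0 = most-significant byte,
// matching the 0..15 byte numbering used in the quoted description.
using Quad = std::array<std::uint8_t, 16>;

// Scalar reference for shufb, following Table 5-1 row by row.
Quad shufb_reference(const Quad& ra, const Quad& rb, const Quad& rc)
{
    Quad rt{};
    for (int j = 0; j < 16; ++j) {
        const std::uint8_t b = rc[j];
        std::uint8_t c;
        if ((b & 0xC0) == 0x80) {
            c = 0x00;                       // 10xxxxxx
        } else if ((b & 0xE0) == 0xC0) {
            c = 0xFF;                       // 110xxxxx
        } else if ((b & 0xE0) == 0xE0) {
            c = 0x80;                       // 111xxxxx
        } else {
            // Otherwise: the rightmost 5 bits index RA || RB,
            // where bytes 0..15 come from RA and 16..31 from RB.
            const unsigned idx = b & 0x1F;
            c = (idx < 16) ? ra[idx] : rb[idx - 16];
        }
        rt[j] = c;
    }
    return rt;
}
```

<p>
Only the top three bits of each control byte select a constant, so control bytes <code>0x00</code> through <code>0x7F</code> all take the lookup branch, with bits above the low five ignored.
</p>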
605+
495606
<p>
496607
<a href="https://www.youtube.com/watch?v=19ae5Mq2lJE">Here's the video that discusses it</a>
497608
</p>
@@ -530,8 +641,6 @@ <h2>What can we apply it to?</h2>
530641

531642
https://github.com/riscv/riscv-bitmanip/wiki
532643

533-
https://wunkolo.github.io/post/2020/11/gf2p8affineqb-bit-reversal/
534-
535644
<a
536645
href="https://github.com/animetosho/ParPar/blob/master/fast-gf-multiplication.md#affine-transformation--bit-matrix-xor">
537646
i don't even fucking know man
