forked from JianyuanXiao/PSCMC
-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathpscmc_kernel.html
More file actions
5 lines (5 loc) · 38 KB
/
pscmc_kernel.html
File metadata and controls
5 lines (5 loc) · 38 KB
1
2
3
4
5
<html><style>table {border-collapse: collapse;} th, td {border:1px solid; text-align :center}</style> <head><meta charset="UTF8"></head> <h1 align="center">PSCMC kernel function usage</h1> <h2>1 Introduction</h2> <p>The PSCMC kernel function (called <i>kernel</i> ) is designed for executing parallel kernels on multiple platforms. When a <i>kernel</i> is called, it will be executed in parallel. Currently it supports <i>C, OpenMP, CUDA, OpenCL</i> and <i>SWMC</i> parallel environments. The codes below demostrate the use of <i>kernels</i> in the <b>SCMC</b> language.</p> <p>kernel.kernel.scmc:</p> <script> function func_mcl_1() {var x = document.getElementById("DIV_ID_1");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,249,249);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_1" onClick="event.cancelBubble=true;func_mcl_1()">(<block style="color:RGB(192,0,0);">defkernel</block> muladd ((double* A) (double* B) (double* C) (double d))<br> (<block style="color:RGB(0,128,32);">vector-set!</block> A __global_idx (<block style="color:RGB(0,128,32);">+</block> d (<block style="color:RGB(0,128,32);">vector-ref</block> A __global_idx) (<block style="color:RGB(0,128,32);">*</block> (<block style="color:RGB(0,128,32);">vector-ref</block> B __global_idx) (<block style="color:RGB(0,128,32);">vector-ref</block> C __global_idx)))))</div> <p>main.scmc:</p> <script> function func_mcl_2() {var x = document.getElementById("DIV_ID_2");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,249,249);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_2" onClick="event.cancelBubble=true;func_mcl_2()">(<block style="color:RGB(192,0,0);">begin</block><br> (<block style="color:RGB(0,128,32);">include<</block> <block style="color:RGB(224,0,192)">"stdio.h"</block>)<br> (<block style="color:RGB(0,128,32);">include<</block> <block style="color:RGB(224,0,192)">"stdlib.h"</block>)<br> (<block style="color:RGB(0,128,32);">include<</block> <block style="color:RGB(224,0,192)">"omp_pscmc_inc.h"</block>)<br> (<block style="color:RGB(0,128,32);">include<</block> <block style="color:RGB(224,0,192)">"kernel.kernel_inc.h"</block>)<br> <script> function func_mcl_3() {var x = document.getElementById("DIV_ID_3");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,237,237);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_3" onClick="event.cancelBubble=true;func_mcl_3()">(<block style="color:RGB(192,0,0);">defun</block> main int ((int argc) (char** argv))<br> (<block style="color:RGB(192,0,0);">define-omp_pscmc_env</block> env)<br> (<block style="color:RGB(192,0,0);">define-omp_muladd_struct</block> muladd1)<br> (omp_pscmc_env_init (<block style="color:RGB(224,0,192)">"&"</block> env) <block style="color:RGB(192,0,192);">0</block> <block style="color:RGB(0,128,32);">NULL</block>)<br> (omp_muladd_init (<block style="color:RGB(224,0,192)">"&"</block> env) (<block style="color:RGB(224,0,192)">"&"</block> muladd1))<br> (<block style="color:RGB(192,0,0);">declare-omp_pscmc_mem</block> vA vB vC sd)<br> (<block style="color:RGB(192,0,0);">define-const-long</block> VLEN <block style="color:RGB(192,0,192);">16</block>)<br> (omp_pscmc_mem_init (<block style="color:RGB(224,0,192)">"&"</block> env) (<block style="color:RGB(224,0,192)">"&"</block> vA) PS_DOUBLE_NUM VLEN)<br> (omp_pscmc_mem_init (<block style="color:RGB(224,0,192)">"&"</block> env) (<block style="color:RGB(224,0,192)">"&"</block> vB) PS_DOUBLE_NUM VLEN)<br> (omp_pscmc_mem_init (<block style="color:RGB(224,0,192)">"&"</block> env) (<block style="color:RGB(224,0,192)">"&"</block> vC) PS_DOUBLE_NUM VLEN)<br> (omp_pscmc_mem_init (<block style="color:RGB(224,0,192)">"&"</block> env) (<block style="color:RGB(224,0,192)">"&"</block> sd) PS_DOUBLE_NUM <block style="color:RGB(192,0,192);">0</block>)<br> (set! (<block style="color:RGB(0,128,32);">vector-ref</block> (type-convert double* sd.h_data) <block style="color:RGB(192,0,192);">0</block>) <block style="color:RGB(192,0,192);">5.00000000000000000e-01</block>)<br> (<block style="color:RGB(192,0,0);">define-double*</block> vA_h (omp_pscmc_get_h_data (<block style="color:RGB(224,0,192)">"&"</block> vA) <block style="color:RGB(0,128,32);">NULL</block>))<br> (<block style="color:RGB(192,0,0);">define-double*</block> vB_h (omp_pscmc_get_h_data (<block style="color:RGB(224,0,192)">"&"</block> vB) <block style="color:RGB(0,128,32);">NULL</block>))<br> (<block style="color:RGB(192,0,0);">define-double*</block> vC_h (omp_pscmc_get_h_data (<block style="color:RGB(224,0,192)">"&"</block> vC) <block style="color:RGB(0,128,32);">NULL</block>))<br> (<block style="color:RGB(192,0,0);">declare-int</block> i)<br> <script> function func_mcl_4() {var x = document.getElementById("DIV_ID_4");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,225,225);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_4" onClick="event.cancelBubble=true;func_mcl_4()">(<block style="color:RGB(192,0,0);">for</block> (set! i <block style="color:RGB(192,0,192);">0</block>) (<block style="color:RGB(0,128,32);"><</block> i VLEN) i++<br> (<block style="color:RGB(0,128,32);">vector-set!</block> vA_h i i)<br> (<block style="color:RGB(0,128,32);">vector-set!</block> vB_h i (<block style="color:RGB(0,128,32);">*</block> <block style="color:RGB(192,0,192);">5.00000000000000000e-01</block> i))<br> (<block style="color:RGB(0,128,32);">vector-set!</block> vC_h i (<block style="color:RGB(0,128,32);">+</block> <block style="color:RGB(192,0,192);">5.00000000000000000e-01</block> i)))</div><br> (omp_pscmc_mem_sync_h2d (<block style="color:RGB(224,0,192)">"&"</block> vA))<br> (omp_pscmc_mem_sync_h2d (<block style="color:RGB(224,0,192)">"&"</block> vB))<br> (omp_pscmc_mem_sync_h2d (<block style="color:RGB(224,0,192)">"&"</block> vC))<br> (omp_muladd_scmc_set_parameter_A (<block style="color:RGB(224,0,192)">"&"</block> muladd1) (<block style="color:RGB(224,0,192)">"&"</block> vA))<br> (omp_muladd_scmc_set_parameter_B (<block style="color:RGB(224,0,192)">"&"</block> muladd1) (<block style="color:RGB(224,0,192)">"&"</block> vB))<br> (omp_muladd_scmc_set_parameter_C (<block style="color:RGB(224,0,192)">"&"</block> muladd1) (<block style="color:RGB(224,0,192)">"&"</block> vC))<br> (omp_muladd_scmc_set_parameter_d (<block style="color:RGB(224,0,192)">"&"</block> muladd1) (<block style="color:RGB(224,0,192)">"&"</block> sd))<br> (omp_muladd_exec (<block style="color:RGB(224,0,192)">"&"</block> muladd1) <block style="color:RGB(192,0,192);">1</block> VLEN)<br> (omp_pscmc_mem_sync_d2h (<block style="color:RGB(224,0,192)">"&"</block> vA))<br> (omp_pscmc_mem_sync_d2h (<block style="color:RGB(224,0,192)">"&"</block> vB))<br> (omp_pscmc_mem_sync_d2h (<block style="color:RGB(224,0,192)">"&"</block> vC))<br> <script> function func_mcl_5() {var x = document.getElementById("DIV_ID_5");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,225,225);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_5" onClick="event.cancelBubble=true;func_mcl_5()">(<block style="color:RGB(192,0,0);">for</block> (set! i <block style="color:RGB(192,0,192);">0</block>) (<block style="color:RGB(0,128,32);"><</block> i VLEN) i++<br> (fprintf stdout <block style="color:RGB(224,0,192)">"%e\n"</block> (<block style="color:RGB(0,128,32);">vector-ref</block> vA_h i)))</div><br> (omp_pscmc_mem_destroy (<block style="color:RGB(224,0,192)">"&"</block> vA))<br> (omp_pscmc_mem_destroy (<block style="color:RGB(224,0,192)">"&"</block> vB))<br> (omp_pscmc_mem_destroy (<block style="color:RGB(224,0,192)">"&"</block> vC))<br> (<block style="color:RGB(192,0,0);">return</block> <block style="color:RGB(192,0,192);">0</block>))</div>)</div> <p>local_header.h:</p> <pre class="prettyprint lang-c">#include <stdio.h>
#include <math.h>
#include <assert.h>
#define IDX_OPT_MAX 1
</pre> <p>Using the following commands to compile the program.</p> <div style="border-style:solid;border-width:thin thin thin thin;background-color:white;">PREFIX=omp_ <br> cp $SCMC_ROOT/pscmc_openmp_inc.scmc ${PREFIX}pscmc_inc.scmc <br> echo "(define-scmc-global PREFIX '$PREFIX)">${PREFIX}pscmc.scmc <br> cat $SCMC_ROOT/pscmc_openmp.scmc >> ${PREFIX}pscmc.scmc <br> scmc_parallel_compile_passes kernel.kernel.scmc OpenMP $PREFIX <br> scmc_compile_passes ${PREFIX}pscmc.scmc C host ${PREFIX}pscmc.c $PREFIX <br> scmc_compile_passes ${PREFIX}pscmc_inc.scmc C host ${PREFIX}pscmc_inc.h $PREFIX <br> scmc_compile_passes main.scmc <br> gcc -fopenmp main.c kernel.kernel.c kernel.kernel_runtime.c ${PREFIX}pscmc.c -o kernel_test</div> <p>The output of kernel_test should be something like:</p> <div style="border-style:solid;border-width:thin thin thin thin;background-color:white;">$ ./kernel_test <br> 5.00000000000000000e-01 <br> 2.25000000000000000e+00 <br> 5.00000000000000000e+00 <br> 8.75000000000000000e+00 <br> 1.35000000000000000e+01 <br> 1.92500000000000000e+01 <br> 2.60000000000000000e+01 <br> 3.37500000000000000e+01 <br> 4.25000000000000000e+01 <br> 5.22500000000000000e+01 <br> 6.30000000000000000e+01 <br> 7.47500000000000000e+01 <br> 8.75000000000000000e+01 <br> 1.01250000000000000e+02 <br> 1.16000000000000000e+02 <br> 1.31750000000000000e+02 <br></div> <p>The defkernel keyword is used to define <i>kernel</i> functions in the <b>PSCMC</b> language. The __global_idx is a built-in variable to refer the current index of the parallel kernel. For other APIs and built-in variables you may refer to c files generated in this directory.</p> <h2>2 Understanding the kernel execution</h2> <h3>2.1 The __idx and __idy variable</h3> <p>Modern parallel compute devices may have multiple level of parallelism. For example, an CUDA GPU have multiple SM (or SMX), each SM (or SMX) has a lot of CUDA cores, a modern CPU generally has multiple cores, each core has one or more SIMD vector units. To release all the computing powers, the programming interface should also need multiple level of parallelisms.</p> <div align="center"><img src="ComputeDevice.png" width="70%"> <h3 align="center">Figure: The architecture of a compute device.</h3></div> <p>In the host side, the ${PREFIX}_${KERNELNAME}_exec function takes 3 arguments p_kernel, __xlen and __ylen. The p_kernel is a pointer which points to the kernel data structure, [__xlen, __ylen] is the shape of the kernel. Usually the kernel will be invoked __xlen*__ylen times simultaneously, and the execution can be grouped into __ylen groups, each group contains __xlen kernel executions. These __ylen group executions will be mapped into all available compute units in one compute device, and for each group the __xlen kernel executions will be mapped into SIMD/SIMT components of one compute units.</p> <p>In the device side, there are 5 built-in variables __idx, __idy, __xlen, __ylen and __global_idx in the kernel environment. The __xlen and __ylen are the same as the host side parameter when calling ${PREFIX}_${KERNELNAME}_exec, __idx is used to obtain the indices of the __xlen executions in one group, __idy is used to obtain the indices of the group. The __global_idx=__idx+__idy*__xlen is the global index of the kernel execution.Note that in CPU runtimes (i.e. C, OpenMP and SWMC), such treatment will increase many unnecessary codes to make sure SIMD instructions is executed exactly as we wanted, which is very insufficient. So in CPU runtimes __xlen should be always 1 and __idx is always 0, and the SIMD level parallelisms is achieved using another particularly designed statements.</p> <table align="center"><tr><th>Parallelism level</th> <th>PSCMC</th> <th>CUDA</th> <th>OpenCL</th> <th>CPU</th></tr> <tr><td>SIMD</td> <td>__idx</td> <td>threadIdx.x</td> <td>get_local_id ()</td> <td>always be 0</td></tr> <tr><td>SIMD</td> <td>__xlen</td> <td>blockDim.x</td> <td>get_local_size ()</td> <td>always be 1</td></tr> <tr><td>Core/SMX</td> <td>__idy</td> <td>blockIdx.x</td> <td>get_group_id ()</td> <td>__idy</td></tr> <tr><td>Core/SMX</td> <td>__ylen</td> <td>gridDim.x</td> <td>get_num_groups ()</td> <td>__ylen</td></tr> <tr><td>Global ID</td> <td>__global_idx</td> <td>blockIdx.x * blockDim.x + threadIdx.x</td> <td>get_global_id ()</td> <td>__idy</td></tr></table> <h3 align="center">Table: Correspondence of multiple parallel runtimes.</h3> <h3>2.2 Shared/Local memory and local synchronization</h3> <p>Kernel executions in one group can utilize the shared/local memory on the chip, and they can be synchronized. These features can be accessed using dec-local_shared-array and sync-local. Note that in CPU runtimes (C, OpenMP, SWMC) shared memory will become standard memory and local synchronization will be ignored.</p> <p>Suppose the __xlen=32,</p> <script> function func_mcl_6() {var x = document.getElementById("DIV_ID_6");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,249,249);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_6" onClick="event.cancelBubble=true;func_mcl_6()">(<block style="color:RGB(192,0,0);">begin</block><br> (dec-local_shared-array double shared_var <block style="color:RGB(192,0,192);">32</block>)<br> (<block style="color:RGB(0,128,32);">vector-set!</block> shared_var __idx <block style="color:RGB(192,0,192);">0.00000000000000000e+00</block>)<br> (sync-local))</div> <p>The shared_var can be accessed and shared for the same __idy kernel execution group. It act as the same as __shared__ double shared_var [32] in CUDA and __local double shared_var [32] in OpenCL. The sync-local is the same as __syncthreads() in CUDA and barrier() in OpenCL.</p> <h2>3 On the SWMC environment</h2> <p>The sunway many-core architecture is different from GPU and conventional CPU. For example the SW26010 CPU has four clusters, each cluster contains 1 Management Processing Element (MPE) and 64 Compute-Processing Elements (CPEs). The MPE is just like a common CPU, it can run operation system and MPI based parallel codes. However the CPE is a simplified CPU, it only has 64KiB scratchpad memory for data that should be explicitly managed by the programmer. So to achieve high bandwidth memory operations, using some specificated APIs provided by the following PSCMC macros is suggested.</p> <script> function func_mcl_7() {var x = document.getElementById("DIV_ID_7");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,249,249);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_7" onClick="event.cancelBubble=true;func_mcl_7()">(<block style="color:RGB(192,0,0);">begin</block><br> <script> function func_mcl_8() {var x = document.getElementById("DIV_ID_8");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,237,237);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_8" onClick="event.cancelBubble=true;func_mcl_8()">(<block style="color:RGB(192,0,0);">define-scmc-global</block> VOLATILE-FREE<br> <block style="color:RGB(192,0,192);">#f</block>)</div><br> <script> function func_mcl_9() {var x = document.getElementById("DIV_ID_9");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,237,237);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_9" onClick="event.cancelBubble=true;func_mcl_9()">(<block style="color:RGB(192,0,0);">defmacro</block> GET_SPARSE_MEMORY (dest from num_block b_stride b_size . sync_id)<br> (<block style="color:RGB(192,0,0);">case</block> RUNTIME ('SWMC `<script> function func_mcl_10() {var x = document.getElementById("DIV_ID_10");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,213,213);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_10" onClick="event.cancelBubble=true;func_mcl_10()">(<block style="color:RGB(192,0,0);">if</block> (<block style="color:RGB(0,128,32);">*</block> ,num_block ,b_size)<br> <script> function func_mcl_11() {var x = document.getElementById("DIV_ID_11");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,201,201);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_11" onClick="event.cancelBubble=true;func_mcl_11()">(<block style="color:RGB(192,0,0);">begin</block><br> (,<script> function func_mcl_12() {var x = document.getElementById("DIV_ID_12");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,183,183);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_12" onClick="event.cancelBubble=true;func_mcl_12()">(<block style="color:RGB(192,0,0);">if</block> VOLATILE-FREE<br> '<block style="color:RGB(192,0,0);">define-int</block><br> '<block style="color:RGB(192,0,0);">define-volatile-int</block>)</div> lgd_swmc <block style="color:RGB(192,0,192);">0</block>)<br> (<block style="color:RGB(192,0,0);">define-const-size_t</block> data_size (sizeof-var (<block style="color:RGB(0,128,32);">vector-ref</block> ,from <block style="color:RGB(192,0,192);">0</block>)))<br> (,<script> function func_mcl_13() {var x = document.getElementById("DIV_ID_13");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,183,183);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_13" onClick="event.cancelBubble=true;func_mcl_13()">(<block style="color:RGB(192,0,0);">if</block> VOLATILE-FREE<br> <script> function func_mcl_14() {var x = document.getElementById("DIV_ID_14");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,171,171);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_14" onClick="event.cancelBubble=true;func_mcl_14()">(<block style="color:RGB(192,0,0);">if</block> (<block style="color:RGB(0,128,32);">eq?</block> VOLATILE-FREE <block style="color:RGB(192,0,192);">1</block>)<br> 'athread_get_procedure<br> 'swmc_athread_get_procedue)</div><br> 'athread_get)</div> PE_MODE ,from ,dest (<block style="color:RGB(0,128,32);">*</block> ,b_size ,num_block data_size) (<block style="color:RGB(224,0,192)">"&"</block> ,<script> function func_mcl_15() {var x = document.getElementById("DIV_ID_15");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,177,177);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_15" onClick="event.cancelBubble=true;func_mcl_15()">(<block style="color:RGB(192,0,0);">if</block> (<block style="color:RGB(0,128,32);">null?</block> sync_id)<br> 'lgd_swmc<br> (<block style="color:RGB(0,128,32);">car</block> sync_id))</div>) <block style="color:RGB(192,0,192);">0</block> (<block style="color:RGB(0,128,32);">*</block> (<block style="color:RGB(0,128,32);">-</block> ,b_stride ,b_size) data_size) ,<script> function func_mcl_16() {var x = document.getElementById("DIV_ID_16");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,183,183);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_16" onClick="event.cancelBubble=true;func_mcl_16()">(<block style="color:RGB(192,0,0);">if</block> (<block style="color:RGB(0,128,32);">eq?</block> b_size b_stride)<br> <block style="color:RGB(192,0,192);">0</block><br> `(<block style="color:RGB(0,128,32);">*</block> ,b_size data_size))</div>)<br> ,<script> function func_mcl_17() {var x = document.getElementById("DIV_ID_17");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,189,189);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_17" onClick="event.cancelBubble=true;func_mcl_17()">(<block style="color:RGB(192,0,0);">if</block> (<block style="color:RGB(0,128,32);">null?</block> sync_id)<br> `(while (neq? lgd_swmc <block style="color:RGB(192,0,192);">1</block>))<br> `(incf! ,(concat (<block style="color:RGB(0,128,32);">car</block> sync_id) '_swmc_shadow)))</div>)</div>)</div>) (<block style="color:RGB(192,0,0);">else</block> `(<block style="color:RGB(192,0,0);">block</block> <script> function func_mcl_18() {var x = document.getElementById("DIV_ID_18");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,207,207);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_18" onClick="event.cancelBubble=true;func_mcl_18()">(<block style="color:RGB(192,0,0);">inner-for-from-to</block> inner_step <block style="color:RGB(192,0,192);">0</block> ,num_block<br> <script> function func_mcl_19() {var x = document.getElementById("DIV_ID_19");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,195,195);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_19" onClick="event.cancelBubble=true;func_mcl_19()">(<block style="color:RGB(192,0,0);">inner-for-from-to</block> inner_g <block style="color:RGB(192,0,192);">0</block> ,b_size<br> (<block style="color:RGB(0,128,32);">vector-set!</block> ,dest (<block style="color:RGB(0,128,32);">+</block> (<block style="color:RGB(0,128,32);">*</block> inner_step ,b_size) inner_g) (<block style="color:RGB(0,128,32);">vector-ref</block> ,from (<block style="color:RGB(0,128,32);">+</block> (<block style="color:RGB(0,128,32);">*</block> inner_step ,b_stride) inner_g))))</div>)</div>))))</div><br> <script> function func_mcl_20() {var x = document.getElementById("DIV_ID_20");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,237,237);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_20" onClick="event.cancelBubble=true;func_mcl_20()">(<block style="color:RGB(192,0,0);">defmacro</block> GET_CONT_MEMORY (dest from b_size . sync_id)<br> `(GET_SPARSE_MEMORY ,dest ,from ,b_size <block style="color:RGB(192,0,192);">1</block> <block style="color:RGB(192,0,192);">1</block> unquote sync_id))</div><br> <script> function func_mcl_21() {var x = document.getElementById("DIV_ID_21");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,237,237);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_21" onClick="event.cancelBubble=true;func_mcl_21()">(<block style="color:RGB(192,0,0);">defmacro</block> PUT_SPARSE_MEMORY (dest from num_block b_stride b_size . sync_id)<br> (<block style="color:RGB(192,0,0);">case</block> RUNTIME ('SWMC `<script> function func_mcl_22() {var x = document.getElementById("DIV_ID_22");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,213,213);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_22" onClick="event.cancelBubble=true;func_mcl_22()">(<block style="color:RGB(192,0,0);">if</block> (<block style="color:RGB(0,128,32);">*</block> ,num_block ,b_size)<br> <script> function func_mcl_23() {var x = document.getElementById("DIV_ID_23");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,201,201);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_23" onClick="event.cancelBubble=true;func_mcl_23()">(<block style="color:RGB(192,0,0);">begin</block><br> (,<script> function func_mcl_24() {var x = document.getElementById("DIV_ID_24");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,183,183);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_24" onClick="event.cancelBubble=true;func_mcl_24()">(<block style="color:RGB(192,0,0);">if</block> VOLATILE-FREE<br> '<block style="color:RGB(192,0,0);">define-int</block><br> '<block style="color:RGB(192,0,0);">define-volatile-int</block>)</div> lgd_swmc <block style="color:RGB(192,0,192);">0</block>)<br> (<block style="color:RGB(192,0,0);">define-const-size_t</block> data_size (sizeof-var (<block style="color:RGB(0,128,32);">vector-ref</block> ,from <block style="color:RGB(192,0,192);">0</block>)))<br> (,<script> function func_mcl_25() {var x = document.getElementById("DIV_ID_25");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,183,183);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_25" onClick="event.cancelBubble=true;func_mcl_25()">(<block style="color:RGB(192,0,0);">if</block> VOLATILE-FREE<br> <script> function func_mcl_26() {var x = document.getElementById("DIV_ID_26");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,171,171);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_26" onClick="event.cancelBubble=true;func_mcl_26()">(<block style="color:RGB(192,0,0);">if</block> (<block style="color:RGB(0,128,32);">eq?</block> VOLATILE-FREE <block style="color:RGB(192,0,192);">1</block>)<br> 'athread_put_procedue<br> 'swmc_athread_put_procedue)</div><br> 'athread_put)</div> PE_MODE ,from ,dest (<block style="color:RGB(0,128,32);">*</block> data_size ,b_size ,num_block) (<block style="color:RGB(224,0,192)">"&"</block> ,<script> function func_mcl_27() {var x = document.getElementById("DIV_ID_27");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,177,177);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_27" onClick="event.cancelBubble=true;func_mcl_27()">(<block style="color:RGB(192,0,0);">if</block> (<block style="color:RGB(0,128,32);">null?</block> sync_id)<br> 'lgd_swmc<br> (<block style="color:RGB(0,128,32);">car</block> sync_id))</div>) (<block style="color:RGB(0,128,32);">*</block> data_size (<block style="color:RGB(0,128,32);">-</block> ,b_stride ,b_size)) ,<script> function func_mcl_28() {var x = document.getElementById("DIV_ID_28");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,183,183);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_28" onClick="event.cancelBubble=true;func_mcl_28()">(<block style="color:RGB(192,0,0);">if</block> (<block style="color:RGB(0,128,32);">eq?</block> b_size b_stride)<br> <block style="color:RGB(192,0,192);">0</block><br> `(<block style="color:RGB(0,128,32);">*</block> ,b_size data_size))</div>)<br> ,<script> function func_mcl_29() {var x = document.getElementById("DIV_ID_29");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,189,189);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_29" onClick="event.cancelBubble=true;func_mcl_29()">(<block style="color:RGB(192,0,0);">if</block> (<block style="color:RGB(0,128,32);">null?</block> sync_id)<br> `(while (neq? lgd_swmc <block style="color:RGB(192,0,192);">1</block>))<br> `(incf! ,(concat (<block style="color:RGB(0,128,32);">car</block> sync_id) '_swmc_shadow)))</div>)</div>)</div>) (<block style="color:RGB(192,0,0);">else</block> `<script> function func_mcl_30() {var x = document.getElementById("DIV_ID_30");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,213,213);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_30" onClick="event.cancelBubble=true;func_mcl_30()">(<block style="color:RGB(192,0,0);">inner-for-from-to</block> inner_step <block style="color:RGB(192,0,192);">0</block> ,num_block<br> <script> function func_mcl_31() {var x = document.getElementById("DIV_ID_31");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,201,201);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_31" onClick="event.cancelBubble=true;func_mcl_31()">(<block style="color:RGB(192,0,0);">inner-for-from-to</block> inner_g <block style="color:RGB(192,0,192);">0</block> ,b_size<br> (force-v-set! (<block style="color:RGB(0,128,32);">vector-ref</block> ,dest (<block style="color:RGB(0,128,32);">+</block> (<block style="color:RGB(0,128,32);">*</block> inner_step ,b_stride) inner_g)) (<block style="color:RGB(0,128,32);">vector-ref</block> ,from (<block style="color:RGB(0,128,32);">+</block> (<block style="color:RGB(0,128,32);">*</block> inner_step ,b_size) inner_g))))</div>)</div>)))</div><br> <script> function func_mcl_32() {var x = document.getElementById("DIV_ID_32");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,237,237);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_32" onClick="event.cancelBubble=true;func_mcl_32()">(<block style="color:RGB(192,0,0);">defmacro</block> PUT_CONT_MEMORY (dest from b_size . sync_id)<br> `(PUT_SPARSE_MEMORY ,dest ,from ,b_size <block style="color:RGB(192,0,192);">1</block> <block style="color:RGB(192,0,192);">1</block> unquote sync_id))</div><br> <script> function func_mcl_33() {var x = document.getElementById("DIV_ID_33");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,237,237);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_33" onClick="event.cancelBubble=true;func_mcl_33()">(<block style="color:RGB(192,0,0);">defmacro</block> INIT_ASYNC_IO_LOCK (name)<br> (<block style="color:RGB(192,0,0);">case</block> RUNTIME ('SWMC `(,<script> function func_mcl_34() {var x = document.getElementById("DIV_ID_34");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,207,207);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_34" onClick="event.cancelBubble=true;func_mcl_34()">(<block style="color:RGB(192,0,0);">if</block> VOLATILE-FREE<br> '<block style="color:RGB(192,0,0);">declare-int</block><br> '<block style="color:RGB(192,0,0);">declare-volatile-int</block>)</div> (,name <block style="color:RGB(192,0,192);">0</block>) (,(concat name '_swmc_shadow) <block style="color:RGB(192,0,192);">0</block>))) (<block style="color:RGB(192,0,0);">else</block> '())))</div><br> <script> function func_mcl_35() {var x = document.getElementById("DIV_ID_35");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,237,237);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_35" onClick="event.cancelBubble=true;func_mcl_35()">(<block style="color:RGB(192,0,0);">defmacro</block> WAIT_SPARSE_MEMORY (sync_id)<br> (<block style="color:RGB(192,0,0);">case</block> RUNTIME ('SWMC `(while (neq? ,sync_id ,(concat sync_id '_swmc_shadow)))) (<block style="color:RGB(192,0,0);">else</block> '())))</div>)</div> <p>For example a many-core version of memory copier can be written as:</p> <script> function func_mcl_36() {var x = document.getElementById("DIV_ID_36");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,249,249);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_36" onClick="event.cancelBubble=true;func_mcl_36()">(<block style="color:RGB(192,0,0);">defkernel</block> copy_example ((double* a) (double* b) (long len))<br> (<block style="color:RGB(192,0,0);">define-long</block> one_size <block style="color:RGB(192,0,192);">64</block>)<br> (<block style="color:RGB(192,0,0);">dec-array</block> double tmp_buffer one_size)<br> (INIT_ASYNC_IO_LOCK sync_id)<br> (<block style="color:RGB(192,0,0);">define-long</block> buff_loc)<br> <script> function func_mcl_37() {var x = document.getElementById("DIV_ID_37");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,237,237);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_37" onClick="event.cancelBubble=true;func_mcl_37()">(<block style="color:RGB(192,0,0);">if</block> (<block style="color:RGB(0,128,32);">eq?</block> __idx <block style="color:RGB(192,0,192);">0</block>)<br> <script> function func_mcl_38() {var x = document.getElementById("DIV_ID_38");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,225,225);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_38" onClick="event.cancelBubble=true;func_mcl_38()">(<block style="color:RGB(192,0,0);">for</block> (set! buff_loc (<block style="color:RGB(0,128,32);">*</block> __idy one_size)) (<block style="color:RGB(0,128,32);"><</block> buff_loc len) (incf! buff_loc (<block style="color:RGB(0,128,32);">*</block> __ylen one_size))<br> (<block style="color:RGB(192,0,0);">define-int</block> numcp <script> function func_mcl_39() {var x = document.getElementById("DIV_ID_39");if (x.style.height == "") {x.style.height="1.5em";} else {x.style.height="";}} </script><div style="overflow:hidden;margin-left:20px;background-color:RGB(255,207,207);border-style:solid;border-color: RGB(128,192,192);border-width:thin thin thin thin;" id="DIV_ID_39" onClick="event.cancelBubble=true;func_mcl_39()">(<block style="color:RGB(192,0,0);">if</block> (<block style="color:RGB(0,128,32);"><</block> (<block style="color:RGB(0,128,32);">+</block> buff_loc one_size) len)<br> one_size<br> (<block style="color:RGB(0,128,32);">-</block> len buff_loc))</div>)<br> (GET_CONT_MEMORY tmp_buffer (<block style="color:RGB(0,128,32);">+</block> b buff_loc) numcp sync_id)<br> (WAIT_SPARSE_MEMORY sync_id)<br> (PUT_CONT_MEMORY (<block style="color:RGB(0,128,32);">+</block> a buff_loc) tmp_buffer numcp sync_id)<br> (WAIT_SPARSE_MEMORY sync_id))</div>)</div>)</div> <p>The full version of this code can be obtained at <a href="kernel.scmc">kernel.scmc</a> .</p></html>