This repository was archived by the owner on Jun 4, 2018. It is now read-only.

Optimization techniques for code generation #58

@felippezacarias

Description

| Pull Request | Why | Reference | Code parameters used | Time before/after (Xeon) | Time before/after (Xeon Phi) |
| --- | --- | --- | --- | --- | --- |
| #51 | Blocks thread accesses by applying the directive `schedule(static,1)` to the outermost loop, so that threads working on adjacent z planes can reuse x-y planes that are already in cache. | Muhong Zhou and William W. Symes (Rice University), "Wave Equation Based Stencil Optimizations on Multi-core CPU", section "Reducing L3 Cache Misses", "Blocking thread accesses". | Xeon: 8th-order code, grid size 512x512x512. Xeon Phi: 8th-order code, grid size 420x420x420. | 288 sec / 258 sec | 123 sec / 112 sec |
| #52 | Applies loop fission to the innermost loop and regroups the array accesses by stride. This change also helps reduce register pressure during vectorization. | Borges, L., 2011, "3D finite differences on multi-core processors" (available online at [https://software.intel.com/en-us/articles/3d-finite-differences-on-multi-core-processors](https://software.intel.com/en-us/articles/3d-finite-differences-on-multi-core-processors)). | Xeon: 8th-order code, grid size 512x512x512. Xeon Phi: 8th-order code, grid size 420x420x420. | 258 sec / 158 sec | 112 sec / 196 sec |
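
To make the two changes concrete, here is a minimal C sketch of what they might look like on a generic star stencil. It is not the code from #51 or #52: the function names, the coefficient array `c`, the centre weight, and the flat array layout are all assumptions made for illustration. Splitting the inner loop into one pass per stride is only one plausible reading of "fission on the innermost loop".

```c
#include <assert.h>
#include <stddef.h>

#define R 4 /* stencil radius; an 8th-order scheme reads 4 neighbours per side */

/* Both kernels apply the same illustrative star stencil to an n*n*n grid
 * stored with x fastest. c[0..R] are hypothetical coefficients. */

/* Fused baseline: all three directions are handled in one inner loop. */
void stencil_fused(int n, const float *c, const float *u, float *out)
{
    assert(n > 2 * R);
    const ptrdiff_t sy = n, sz = (ptrdiff_t)n * n; /* strides in y and z */
    /* schedule(static, 1) deals z planes to threads round-robin, so threads
     * running side by side work on adjacent planes. */
    #pragma omp parallel for schedule(static, 1)
    for (int z = R; z < n - R; z++)
        for (int y = R; y < n - R; y++)
            for (int x = R; x < n - R; x++) {
                const ptrdiff_t i = z * sz + y * sy + x;
                float acc = 3.0f * c[0] * u[i];
                for (int r = 1; r <= R; r++)
                    acc += c[r] * (u[i + r] + u[i - r]
                                 + u[i + r * sy] + u[i - r * sy]
                                 + u[i + r * sz] + u[i - r * sz]);
                out[i] = acc;
            }
}

/* Fissioned variant: the x loop is split into one pass per access stride,
 * so each pass touches fewer input streams at a time. */
void stencil_fissioned(int n, const float *c, const float *u, float *out)
{
    assert(n > 2 * R);
    const ptrdiff_t sy = n, sz = (ptrdiff_t)n * n;
    const ptrdiff_t strides[2] = { sy, sz };
    #pragma omp parallel for schedule(static, 1)
    for (int z = R; z < n - R; z++)
        for (int y = R; y < n - R; y++) {
            const ptrdiff_t row = z * sz + y * sy;
            /* Pass 1: centre point and unit-stride neighbours along x. */
            for (int x = R; x < n - R; x++) {
                const ptrdiff_t i = row + x;
                float acc = 3.0f * c[0] * u[i];
                for (int r = 1; r <= R; r++)
                    acc += c[r] * (u[i + r] + u[i - r]);
                out[i] = acc;
            }
            /* Passes 2 and 3: neighbours at stride n (y), then n*n (z). */
            for (int s = 0; s < 2; s++)
                for (int x = R; x < n - R; x++) {
                    const ptrdiff_t i = row + x, st = strides[s];
                    float acc = 0.0f;
                    for (int r = 1; r <= R; r++)
                        acc += c[r] * (u[i + r * st] + u[i - r * st]);
                    out[i] += acc;
                }
        }
}
```

Both functions write only interior points and leave the halo of width `R` untouched. Built without OpenMP support, the pragmas are ignored and the loops run serially. Whether the fissioned form is faster depends on the target: the table above reports a speedup on Xeon and a slowdown on Xeon Phi for #52.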
