Dang, Francis Hoai Dinh (2007-05). Speculative parallelization of partially parallel loops. Master's Thesis. Thesis uri icon

abstract

  • Current parallelizing compilers cannot identify a significant fraction of parallelizable loops because they have complex or statically insufficiently defined access patterns. In our previous work, we have speculatively executed a loop as a doall, and applied a fully parallel data dependence test to determine if it had any cross-processor depen- dences. If the test failed, then the loop was re-executed serially. While this method exploits doall parallelism well, it can cause slowdowns for loops with even one cross- processor flow dependence because we have to re-execute sequentially. Moreover, the existing, partial parallelism of loops is not exploited. We demonstrate a generalization of the speculative doall parallelization tech- nique, called the Recursive LRPD test, that can extract and exploit the maximum available parallelism of any loop and that limits potential slowdowns to the over- head of the run-time dependence test itself. In this thesis, we have presented the base algorithm and an analysis of the different heuristics for its practical applica- tion. To reduce the run-time overhead of the Recursive LRPD test, we have im- plemented on-demand checkpointing and commit, more efficient data dependence analysis and shadow structures, and feedback-guided load balancing. We obtained scalable speedups for loops from Track, Spice, and FMA3D that were not paralleliz- able by previous speculative parallelization methods.
  • Current parallelizing compilers cannot identify a significant fraction of parallelizable
    loops because they have complex or statically insufficiently defined access patterns.
    In our previous work, we have speculatively executed a loop as a doall, and applied a
    fully parallel data dependence test to determine if it had any cross-processor depen-
    dences. If the test failed, then the loop was re-executed serially. While this method
    exploits doall parallelism well, it can cause slowdowns for loops with even one cross-
    processor flow dependence because we have to re-execute sequentially. Moreover, the
    existing, partial parallelism of loops is not exploited.
    We demonstrate a generalization of the speculative doall parallelization tech-
    nique, called the Recursive LRPD test, that can extract and exploit the maximum
    available parallelism of any loop and that limits potential slowdowns to the over-
    head of the run-time dependence test itself. In this thesis, we have presented the
    base algorithm and an analysis of the different heuristics for its practical applica-
    tion. To reduce the run-time overhead of the Recursive LRPD test, we have im-
    plemented on-demand checkpointing and commit, more efficient data dependence
    analysis and shadow structures, and feedback-guided load balancing. We obtained
    scalable speedups for loops from Track, Spice, and FMA3D that were not paralleliz-
    able by previous speculative parallelization methods.

publication date

  • May 2007