blog Deadlock during first touch of upc_alloc'd remote memory when target is in upc_barrier [updated] <p><strong>Status: This has been reported to Cray (811537) and a workaround is available. Fixed in CCE 8.3.5.<br/></strong></p> <p>When a UPC thread first accesses a remote shared memory location that was allocated with upc_alloc() and whose pointer was stored in a shared variable, it must first send an active message to the target thread requesting the DMAPP descriptor for that location. However, if the target is waiting in upc_barrier, there is a risk that this active message will not be serviced. The origin thread then waits indefinitely and never reaches the same barrier, so the program deadlocks. The likelihood of deadlock is higher at greater levels of concurrency.</p> <p>This is due to the behavior of the optimized barrier implementation in Cray's DMAPP library. A potential workaround that appears to be effective on Hopper is to disable the optimized barrier by setting the PGAS_USE_DMAPP_COLLECTIVES environment variable to "0" in your job submission script (or in your environment for an interactive session). This has been shown to work with some regularity on cce/8.3.2 (it does not appear to work under the current default version, 8.2.1), although some users have found the workaround ineffective. This setting may also have a performance impact, particularly on the latency of barriers and certain reductions.</p> <p><strong>Update</strong>: Cray reports that a fix has been added to their DMAPP-based PGAS runtime library, which is slated to be included in CCE 8.3.5 (to be released Nov. 2014).</p> Thu, 16 Oct 2014 16:17:57 -0700
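<p>The workaround described above can be sketched as the following lines in a bash job script. This is a minimal sketch, not a verified recipe: the variable name and value come from this post, while the launch line, its core count, and the binary name my_upc_app are hypothetical placeholders and are left commented out.</p>

```shell
# Disable the optimized DMAPP barrier before launching the UPC program.
# Variable name and value are taken from the workaround described in this post.
export PGAS_USE_DMAPP_COLLECTIVES=0

# Hypothetical launch line (assumption): substitute your own core count and binary.
# aprun -n 48 ./my_upc_app
```

<p>In a csh or tcsh script the equivalent would be setenv PGAS_USE_DMAPP_COLLECTIVES 0. The variable has to be set before the launch command so that the launched processes inherit it.</p>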