NERSC: Powering Scientific Discovery Since 1974

Deep Networks for HEP

This page provides example code, datasets and recipes for running high-energy physics (HEP) analyses using deep neural networks on Cori. The current scripts are those used for the CNN classification and timing studies reported in an ACAT talk.

Datasets

The datasets contain simulated data for an ATLAS-like detector. Data is available from http://portal.nersc.gov/project/mpccc/wbhimji/RPVSusyData/ . A README is provided in the directory.

Currently, data binned into 64x64 images is provided. Unbinned data will be provided in due course, together with additional documentation.
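To illustrate what binning into a 64x64 image means, the following is a minimal sketch using NumPy for a single image channel. It assumes each event is a set of energy deposits described by pseudorapidity (eta), azimuthal angle (phi) and energy; the choice of coordinates, the eta range and the function name bin_event are assumptions made for illustration and are not taken from the dataset README.

```python
import numpy as np

# Hypothetical coordinate ranges: the eta coverage is an assumption,
# not a value taken from the dataset README.
ETA_RANGE = (-2.5, 2.5)
PHI_RANGE = (-np.pi, np.pi)


def bin_event(eta, phi, energy, n_bins=64):
    """Sum deposit energies into an n_bins x n_bins (eta, phi) image."""
    image, _, _ = np.histogram2d(
        eta,
        phi,
        bins=n_bins,
        range=[ETA_RANGE, PHI_RANGE],
        weights=energy,
    )
    return image


# Three toy deposits: the first two share a bin, the third lies
# outside the eta range and is therefore dropped by histogram2d.
eta = np.array([0.01, 0.01, 3.0])
phi = np.array([0.1, 0.1, 0.0])
energy = np.array([10.0, 5.0, 7.0])
image = bin_event(eta, phi, energy)
```

A multi-channel image could be built by binning each input (for example calorimeter deposits and tracks) separately and stacking the results, although the exact channel definitions should be checked against the README.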

Convolutional Neural Network for Classification

This provides a network for classification (RPVSusy signal vs QCD background) on 3-channel (calorimeter + track) whole-detector images, as presented in the ACAT talk. It implements 3 convolution+pooling units with rectified linear unit (ReLU) activation functions. These layers feed into two fully connected layers, with cross-entropy as the loss function and the Adam optimizer.
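As a rough guide to how three convolution+pooling units shrink a 64x64 input before the fully connected layers, the sketch below traces feature-map sizes in plain Python. The 3x3 kernels, 'same' padding, 2x2 pooling and filter counts are all assumptions chosen for illustration; the values actually used are defined in the Keras script in the repository.

```python
def conv_pool_output(size, kernel=3, padding="same", pool=2):
    """Spatial size after one stride-1 convolution followed by max pooling."""
    if padding == "same":
        conv_size = size               # zero padding keeps the size unchanged
    elif padding == "valid":
        conv_size = size - kernel + 1  # no padding: the kernel must fit entirely
    else:
        raise ValueError(f"unknown padding: {padding!r}")
    return conv_size // pool           # non-overlapping pooling windows


def trace_shapes(input_size=64, filters=(8, 16, 32), **unit_kwargs):
    """Return (height, width, channels) after each conv+pool unit."""
    shapes = []
    size = input_size
    for n_filters in filters:          # filter counts are hypothetical
        size = conv_pool_output(size, **unit_kwargs)
        shapes.append((size, size, n_filters))
    return shapes


shapes = trace_shapes()
height, width, channels = shapes[-1]
flattened = height * width * channels  # inputs to the first fully connected layer
print(shapes, flattened)
```

Under these assumptions the image side halves at each unit (64 to 32 to 16 to 8), so the size of the first fully connected layer's input is set mainly by the number of filters in the last convolution.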

Code 

Keras code implementing a convolutional neural net is available at https://github.com/eracah/atlas_dl/tree/micky . This single script is fairly self-explanatory and easily run at NERSC following the recipes below.

Code for preselection of data, as well as for Lasagne/Theano implementations, is in the main branch of that repository.

Running at NERSC 

An example batch script is given below. Loading the intel-head module sets a variety of KMP* environment variables for best performance, as documented on the NERSC TensorFlow page. With the intel-head module some other libraries may be missing; these can be added with pip install --user, e.g. in this case pip install --user h5py. For this script and study it was also necessary to set OMP_NUM_THREADS to a different value from the one set by the module (to avoid thread exhaustion); the appropriate value may vary for other workloads.

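#!/bin/sh
# Request one KNL node for 90 minutes in the regular partition.
#SBATCH -N 1 -C knl -t 90 -p regular
module load tensorflow/intel-head
# Override the module's OMP_NUM_THREADS to avoid thread exhaustion.
export OMP_NUM_THREADS=16
# Train the CNN for 10 epochs with a batch size of 512.
python train.py --nb-epochs 10 --nb-events 999999999 --batch-size 512 train.h5 val.h5 CNN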
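Because the module sets thread-related variables that the batch script then partly overrides, it can help to log the values a job actually sees. The following sketch is not part of the original scripts; it collects the OMP_* and KMP_* variables from the environment, and the helper name threading_env is hypothetical.

```python
import os


def threading_env(environ=None):
    """Return the OMP_* and KMP_* variables from environ, sorted by name."""
    environ = os.environ if environ is None else environ
    return {
        name: value
        for name, value in sorted(environ.items())
        if name.startswith(("OMP_", "KMP_"))
    }


if __name__ == "__main__":
    # Print the settings once at start-up, e.g. at the top of a training script.
    for name, value in threading_env().items():
        print(f"{name}={value}")
```

Running this inside the batch job, after the module load and export lines, shows whether OMP_NUM_THREADS has the intended value alongside the module's KMP* settings.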

Performance Timing

Timings for the network above on Cori are given below. For more details please see the ACAT presentation. 'TF (Intel)' corresponds to what is now the tensorflow/intel-1.2 module while 'latest' is tensorflow/intel-head.

[Figure: timing results (TimingsOct9)]