Here is a link to the original paper.
The main aim of the paper was to find a general-purpose, end-to-end deep neural network for modeling nonlinear audio effects. I also found its use of the Soft Adaptive Activation Function (SAAF) quite interesting: I had never encountered an adaptive activation before and was curious about implementing one.
The network first applies 1D convolutional filters to expand the input into multiple channels, then max pooling to keep the largest value in each window of every channel's output. The result passes through a couple of densely connected layers, is upsampled, passes through some more dense layers and one SAAF layer, and finally goes through a deconvolution layer to yield the output.
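To make the data flow concrete, here is a rough NumPy sketch of a forward pass through the stages described above. It is not the paper's implementation. The channel count, kernel length, pool size, number of dense layers, the tanh nonlinearity, and the choice to apply dense layers along the channel axis are all assumptions picked for illustration. The `adaptive_activation` function is a simplified piecewise-linear stand-in with learnable slopes, not the actual SAAF definition, and `make_params` and `forward` are hypothetical helper names.

```python
import numpy as np

def make_params(channels=8, kernel=16, pool=16, seed=0):
    """Random weights; every size here is an illustrative assumption."""
    rng = np.random.default_rng(seed)
    return {
        "in_kernels": rng.normal(0, 0.1, (channels, kernel)),
        "out_kernels": rng.normal(0, 0.1, (channels, kernel)),
        "latent_dense": [rng.normal(0, 0.3, (channels, channels)) for _ in range(2)],
        "post_dense": [rng.normal(0, 0.3, (channels, channels)) for _ in range(2)],
        "breaks": np.linspace(-1.0, 1.0, 5),   # fixed segment boundaries
        "slopes": np.ones(4),                  # one learnable slope per segment
        "pool": pool,
    }

def conv1d(x, kernels):
    """(T,) -> (T, C): one zero-padded 'same' convolution per filter."""
    return np.stack([np.convolve(x, k, mode="same") for k in kernels], axis=1)

def max_pool(h, size):
    """(T, C) -> (T // size, C): non-overlapping max over time windows."""
    t, c = h.shape
    return h[: t - t % size].reshape(t // size, size, c).max(axis=1)

def adaptive_activation(h, breaks, slopes):
    """Simplified stand-in for SAAF: continuous, piecewise linear, and
    constant outside [breaks[0], breaks[-1]]. With all slopes equal to 1
    it is the identity on that interval."""
    out = np.full_like(h, breaks[0])
    for lo, hi, s in zip(breaks[:-1], breaks[1:], slopes):
        out += s * (np.clip(h, lo, hi) - lo)
    return out

def deconv(h, kernels):
    """(T, C) -> (T,): convolve each channel with its flipped kernel and sum."""
    return sum(np.convolve(h[:, c], k[::-1], mode="same")
               for c, k in enumerate(kernels))

def forward(x, p):
    h = conv1d(x, p["in_kernels"])          # expand input into channels
    h = max_pool(h, p["pool"])              # downsample in time
    for w in p["latent_dense"]:             # dense layers on the pooled signal
        h = np.tanh(h @ w)
    h = np.repeat(h, p["pool"], axis=0)     # upsample back to input length
    for w in p["post_dense"]:               # more dense layers
        h = np.tanh(h @ w)
    h = adaptive_activation(h, p["breaks"], p["slopes"])
    return deconv(h, p["out_kernels"])      # collapse channels to one waveform

x = np.sin(2 * np.pi * 5 * np.arange(1024) / 1024)  # toy input frame
y = forward(x, make_params())
print(y.shape)  # (1024,)
```

The weights here are random, so the output is meaningless as audio. The point is only that a frame of 1024 samples goes in, is expanded, pooled, transformed and upsampled, and comes out as a frame of the same length.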