DLPrimitives
Perform 2D convolution forward calculations, allowing bias and activation to be fused into the same GPU kernel.
#include <include/dlprim/core/conv.hpp>
Public Member Functions | |
virtual void | enqueue (Tensor &x, Tensor &w, Tensor *bias, Tensor &y, Tensor &ws, float factor, ExecutionContext const &e)=0 |
Public Member Functions inherited from dlprim::core::Conv2DBase | |
virtual char const * | algo () const =0 |
virtual size_t | workspace () |
Static Public Member Functions | |
static std::unique_ptr< Conv2DForward > | create (Context &ctx, Conv2DSettings const &config, bool bias, StandardActivations activation=StandardActivations::identity, std::string const &algo=std::string()) |
Create optimal object for conv2d. | |
Static Public Member Functions inherited from dlprim::core::Conv2DBase | |
static Shape | get_output_shape (Convolution2DConfigBase const &config, Shape const &in) |
static Shape | get_output_shape_transposed (Convolution2DConfigBase const &config, Shape const &in, int output_pad[2]) |
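For orientation, the sketch below shows the conventional per-axis output-size arithmetic that functions such as get_output_shape and get_output_shape_transposed usually implement. It is a minimal illustration only: the helper names conv_out_size and conv_transposed_out_size are hypothetical, and it is an assumption that DLPrimitives follows this standard formula.

```cpp
#include <cassert>

// Hypothetical helper (not part of DLPrimitives): output extent along one
// spatial axis of a forward convolution, using the conventional formula
//   out = (in + 2*pad - (dilate*(kernel-1) + 1)) / stride + 1
inline int conv_out_size(int in, int kernel, int pad, int stride, int dilate)
{
    assert(in > 0 && kernel > 0 && pad >= 0 && stride > 0 && dilate > 0);
    int effective_kernel = dilate * (kernel - 1) + 1;
    assert(in + 2 * pad >= effective_kernel);
    return (in + 2 * pad - effective_kernel) / stride + 1;
}

// Hypothetical helper (not part of DLPrimitives): output extent along one
// spatial axis of a transposed convolution. output_pad resolves the
// ambiguity caused by the integer division in the forward formula.
inline int conv_transposed_out_size(int in, int kernel, int pad, int stride,
                                    int dilate, int output_pad)
{
    assert(in > 0 && kernel > 0 && pad >= 0 && stride > 0 && dilate > 0);
    assert(output_pad >= 0);
    return (in - 1) * stride - 2 * pad + dilate * (kernel - 1) + 1 + output_pad;
}
```

Under these assumptions, a 3x3 kernel with pad=1 and stride=1 preserves the spatial size, while stride=2 halves a 224-pixel axis to 112; the transposed form with output_pad=1 maps 112 back to 224.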
Perform 2D convolution forward calculations, allowing bias and activation to be fused into the same GPU kernel.
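static std::unique_ptr< Conv2DForward > create (Context &ctx, Conv2DSettings const &config, bool bias, StandardActivations activation=StandardActivations::identity, std::string const &algo=std::string())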
Create optimal object for conv2d.
algo is one of:
- "" or "auto": automatic selection
- "gemm": use the fused GEMM-based algorithm
- "winograd": use Winograd convolution, suitable for non-strided, non-dilated, non-grouped 3x3 convolutions with pad=1
- "depthwise_separable"
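The Winograd constraint listed above can be expressed as a simple predicate. The sketch below is an illustration under stated assumptions, not the library's selection logic: ConvParams is a hypothetical stand-in for the relevant configuration fields, and winograd_applicable and is_known_algo are hypothetical helpers.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the relevant convolution settings;
// the field names are assumptions, not the DLPrimitives definition.
struct ConvParams {
    int kernel[2];
    int stride[2];
    int dilate[2];
    int pad[2];
    int groups;
};

// True when the configuration matches the documented Winograd case:
// non-strided, non-dilated, non-grouped 3x3 convolution with pad=1.
inline bool winograd_applicable(ConvParams const &p)
{
    assert(p.groups >= 1);
    return p.kernel[0] == 3 && p.kernel[1] == 3
        && p.stride[0] == 1 && p.stride[1] == 1
        && p.dilate[0] == 1 && p.dilate[1] == 1
        && p.pad[0] == 1 && p.pad[1] == 1
        && p.groups == 1;
}

// True when the name is one of the algo strings listed in the documentation.
inline bool is_known_algo(std::string const &algo)
{
    return algo.empty() || algo == "auto" || algo == "gemm"
        || algo == "winograd" || algo == "depthwise_separable";
}
```

A caller could use such a check before explicitly requesting "winograd", and pass "" or "auto" otherwise so that the library selects the algorithm.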