DLPrimitives
Central Data Container - Tensor.
#include <include/dlprim/tensor.hpp>
Public Member Functions | |
Tensor (Context &ctx, Shape const &s, DataType d=float_data, bool is_trainable=true) | |
Create a tensor for specific context and allocate the device memory for it. | |
Tensor (cl::Buffer const &buffer, cl_ulong offset, Shape const &s, DataType d=float_data, bool is_trainable=true) | |
Create a tensor from external buffer. | |
Tensor () | |
Create a null tensor; binding such a tensor to a kernel passes a NULL pointer. | |
Tensor (Tensor const &)=default | |
Copy constructor - uses reference counting, points to the same memory. | |
Tensor & | operator= (Tensor const &)=default |
Assignment - uses reference counting, points to the same memory. | |
Tensor (Tensor &&)=default | |
Tensor & | operator= (Tensor &&)=default |
TensorSpecs const & | specs () const |
Shape const & | shape () const |
Get the tensor shape. | |
bool | is_trainable () const |
Return whether the tensor needs to participate in gradient descent. | |
size_t | memory_size () const |
Get the required memory size for the tensor. | |
DataType | dtype () const |
void | reshape (Shape const &ns) |
Reshape the tensor; the only requirement is that ns.total_size() <= shape().total_size(). | |
cl::Buffer & | device_buffer () |
Get cl::Buffer for the tensor. | |
cl_ulong | device_offset () |
Get offset - you should always bind both buffer and offset, since there is no pointer arithmetic on the host in OpenCL and the same memory may be used for several tensors. | |
void * | host_data () |
Get a pointer to CPU memory - allocated lazily on demand. | |
Tensor | workspace_as_type (DataType d=float_data) const |
Create a tensor over all available size for data type d. | |
Tensor | sub_tensor_target_offset (size_t offset, Shape const &s, DataType d=float_data, bool trainable=true) const |
Create a tensor on the memory of an existing tensor. | |
Tensor | sub_tensor (size_t offset, Shape const &s, DataType d=float_data, bool trainable=true) const |
Create a tensor on the memory of an existing tensor. | |
Tensor | alias () const |
Create a tensor with the same memory whose shape isn't connected to the original - it is an alias to the same data with the ability to modify the shape. | |
Tensor | alias (Shape const &new_shape) const |
Same as t=alias(); t.reshape(new_shape); return t; | |
template<typename T > | |
T * | data () |
Get the host pointer, cast to the relevant type. | |
void | to_device (ExecutionContext const &c, void *host_memory, bool sync=true) |
Copy external host memory to device; sync - for synchronous copy. | |
void | to_host (ExecutionContext const &c, void *host_memory, bool sync=true) |
Copy device memory to external host memory; sync - for synchronous copy. | |
void | to_device (ExecutionContext const &c, bool sync=true) |
Copy host memory to device; sync - for synchronous copy. | |
void | to_host (ExecutionContext const &c, bool sync=true) |
Copy device memory to host; sync - for synchronous copy. | |
void | set_arg (cl::Kernel &k, int &pos) |
Assign buffer and offset as kernel arguments at positions pos and pos+1; pos is incremented twice. | |
Central Data Container - Tensor.
Note that all of this object's data is reference counted - copying is cheap, but be aware that modifications made through one tensor affect every other tensor that shares the same memory.
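The sharing behaviour described above can be illustrated without OpenCL. The following is a minimal sketch, assuming a stand-in type: `MockTensor`, `at`, `owners` and `write_through_copy` are hypothetical names built on `std::shared_ptr`, and they are not part of dlprim. It only shows why a write made through a copy is visible through the original.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a reference-counted tensor; NOT the dlprim class.
struct MockTensor {
    std::shared_ptr<std::vector<float>> storage; // shared by all copies
    explicit MockTensor(std::size_t n)
        : storage(std::make_shared<std::vector<float>>(n, 0.0f)) {}
    float &at(std::size_t i) { return (*storage)[i]; }
    long owners() const { return storage.use_count(); }
};

// Copies `a`, writes through the copy, and returns what `a` sees afterwards.
float write_through_copy(MockTensor &a, std::size_t i, float value) {
    MockTensor b = a;   // cheap: copies the handle, not the data
    b.at(i) = value;    // modifies the memory that `a` also points to
    return a.at(i);
}
```

If independent data is needed, the contents have to be copied explicitly into a separately allocated tensor; a plain copy or assignment only creates another handle.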
cl_ulong dlprim::Tensor::device_offset() | inline
Get offset - you should always bind both buffer and offset, since there is no pointer arithmetic on the host in OpenCL and the same memory may be used for several tensors.
The offset is always a 64-bit ulong, even if the device is 32-bit.
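Because each tensor is bound as a buffer plus an offset, every tensor occupies two consecutive kernel argument slots. The sketch below mimics the documented behaviour of set_arg (arguments at pos and pos+1, pos advanced twice) with a recorder in place of a real kernel. `MockKernel`, `MockTensor` and `bind_two` are hypothetical illustrations, not dlprim or OpenCL types.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical recorder standing in for cl::Kernel; it only logs what was bound.
struct MockKernel {
    std::vector<std::string> args;
    void set(int pos, std::string const &what) {
        if (static_cast<std::size_t>(pos) >= args.size())
            args.resize(pos + 1);
        args[pos] = what;
    }
};

// Hypothetical tensor handle: a buffer name plus a 64-bit offset into it.
struct MockTensor {
    std::string buffer_name;
    std::uint64_t offset;
    // Binds the buffer at pos and the offset at pos+1; pos advances by two.
    void set_arg(MockKernel &k, int &pos) const {
        k.set(pos++, buffer_name);
        k.set(pos++, "offset=" + std::to_string(offset));
    }
};

// Binds two tensors back to back and returns the next free argument index.
int bind_two(MockKernel &k, MockTensor const &x, MockTensor const &y) {
    int pos = 0;
    x.set_arg(k, pos);
    y.set_arg(k, pos);
    return pos;
}
```

Two tensors that live in the same buffer differ only in the offset slot, which is why binding the buffer alone is not enough to identify the data.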
Tensor dlprim::Tensor::sub_tensor(size_t offset, Shape const &s, DataType d = float_data, bool trainable = true) const
Create a tensor on the memory of an existing tensor.
offset | - memory offset in units of this tensor's data type; if this tensor has type uint16_data and offset is 16, then the offset is 32 bytes |
s | - shape of the new tensor |
d | - type of the new tensor |
trainable | - mark as trainable tensor |
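The offset rule for sub_tensor is plain arithmetic on the element size of the parent tensor. Here is a minimal sketch of that calculation; `MockType`, `size_of` and `sub_tensor_byte_offset` are hypothetical helpers written for this example (the enumerator names merely mirror the data types mentioned above), and the element sizes of 2 and 4 bytes are the ones implied by the documented examples.

```cpp
#include <cstddef>

// Hypothetical local enum; the names mirror the data types used in the text.
enum class MockType { uint16_data, float_data };

// Element size in bytes for each mock type.
constexpr std::size_t size_of(MockType t) {
    return t == MockType::uint16_data ? 2 : 4;
}

// sub_tensor convention: offset is counted in elements of the PARENT's type,
// regardless of the type requested for the new tensor.
constexpr std::size_t sub_tensor_byte_offset(MockType parent, std::size_t offset) {
    return offset * size_of(parent);
}
```

With a uint16_data parent and offset 16 this gives 32 bytes, matching the parameter description.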
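Tensor dlprim::Tensor::sub_tensor_target_offset(size_t offset, Shape const &s, DataType d = float_data, bool trainable = true) const | inline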
Create a tensor on the memory of an existing tensor.
offset | - memory offset in units of d, i.e. if the new tensor has type float_data and offset=2, then the address offset is 8 bytes |
s | - shape of the new tensor |
d | - type of the new tensor |
trainable | - mark as trainable tensor |
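Here the offset is scaled by the element size of the new tensor's type d rather than by the type of the existing tensor. The sketch below contrasts the two interpretations for the same numeric offset. `MockType`, `size_of`, `target_byte_offset` and `parent_byte_offset` are hypothetical helpers for illustration only, with element sizes of 2 and 4 bytes as implied by the documented examples.

```cpp
#include <cstddef>

// Hypothetical local enum; the names mirror the data types used in the text.
enum class MockType { uint16_data, float_data };

// Element size in bytes for each mock type.
constexpr std::size_t size_of(MockType t) {
    return t == MockType::uint16_data ? 2 : 4;
}

// Target-offset convention: offset counted in elements of the NEW type d.
constexpr std::size_t target_byte_offset(MockType d, std::size_t offset) {
    return offset * size_of(d);
}

// Parent convention, shown for contrast: offset counted in elements of the
// existing tensor's type.
constexpr std::size_t parent_byte_offset(MockType parent, std::size_t offset) {
    return offset * size_of(parent);
}
```

For a float_data target and offset 2 the result is 8 bytes, as stated above; the same offset measured against a uint16_data parent would be only 4 bytes, so the two conventions agree only when both tensors have the same element size.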