Merge pull request #2 from CNugteren/develop
Added generalised constraints and further use of C++11 features
CNugteren committed Feb 25, 2015
2 parents 4b7a00a + 14da696 commit fa1437f
Showing 15 changed files with 290 additions and 405 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
build/
7 changes: 6 additions & 1 deletion CHANGELOG
@@ -1,6 +1,11 @@

Version 1.1.0
- User-defined parameter constraints are now fully customizable by accepting arbitrary functions on
an arbitrary combination of parameters.
- Re-factored the code to use more C++11 features: auto, smart pointers, constexpr, class enums, ...

Version 1.0.1
- Replaced one more occurence of a pointer with an std::shared_ptr
- Replaced one more occurrence of a pointer with an std::shared_ptr
- Re-added OpenCL class constructor exception test
- Updated license information

21 changes: 11 additions & 10 deletions README.md
@@ -48,7 +48,7 @@ OpenCL kernel (first argument), the name of the kernel (second argument), a list
dimensions (third argument), and a list of local thread or workgroup dimensions (fourth argument).
Here is an example:

int id = my_tuner.AddKernel("path/to/kernel.opencl", "my_kernel", {1024,512}, {16,8});
auto id = my_tuner.AddKernel("path/to/kernel.opencl", "my_kernel", {1024,512}, {16,8});

Notice that the AddKernel function returns an integer: it is the ID of the added kernel. We'll need
this ID when we want to add tuning parameters to this kernel. Let's say that our kernel has two
@@ -68,13 +68,13 @@ The tuner also needs to know which arguments the kernels take. Scalar arguments
as-is and are passed-by-value, whereas arrays have to be provided as C++ `std::vector`s. That's
right, we won't have to create OpenCL buffers, CLTune will handle that for us! Here is an example:

int my_variable = 900;
auto my_variable = 900;
std::vector<float> input_vector(8192);
std::vector<float> output_vector(8192);
my_tuner.AddArgumentScalar<int>(my_variable);
my_tuner.AddArgumentScalar<float>(3.7);
my_tuner.AddArgumentInput<float>(input_vector);
my_tuner.AddArgumentOutput<float>(output_vector);
my_tuner.AddArgumentScalar(my_variable);
my_tuner.AddArgumentScalar(3.7);
my_tuner.AddArgumentInput(input_vector);
my_tuner.AddArgumentOutput(output_vector);

Now that we've configured the tuner, it is time to start it and ask it to report the results:

@@ -85,9 +85,9 @@ Other examples
-------------

Two examples are included as part of the CLTune distribution. They illustrate some more advanced
features, such as modifying the thread dimensions based on the parameters and adding parameter
constraints. The examples are compiled when providing `-ENABLE_SAMPLES=ON` to CMake (default option
currently). The two included examples are:
features, such as modifying the thread dimensions based on the parameters and adding user-defined
parameter constraints. The examples are compiled when providing `-ENABLE_SAMPLES=ON` to CMake
(default option). The two included examples are:

* `simple.cc` providing a basic example of matrix-vector multiplication
* `gemm.cc` providing a more advanced and heavily tuned implementation of matrix-matrix
@@ -98,7 +98,8 @@ Development and tests

The CLTune project follows the Google C++ styleguide (with some exceptions) and uses a tab-size of
two spaces and a max-width of 100 characters per line. It is furthermore based on practises from the
third edition of Effective C++. The project is licensed under the MIT license by SURFsara, (c) 2014.The contributing authors so far are:
third edition of Effective C++ and the first edition of Effective Modern C++. The project is
licensed under the MIT license by SURFsara, (c) 2014. The contributing authors so far are:

* Cedric Nugteren
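
A note on the README change above: once the explicit template arguments are dropped, the argument type is deduced from the expression passed in. The sketch below uses a hypothetical stand-in function, `DescribeScalar`, which is not part of the CLTune API, to show that an unsuffixed literal such as `3.7` deduces as `double`, whereas the earlier `<float>` form forced `float`. Whether this matters in practice depends on how the OpenCL kernel declares the corresponding argument.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for a templated "add scalar argument" function; not part of CLTune.
// It only reports which type T was deduced or explicitly requested.
template <typename T>
std::string DescribeScalar(T) {
  if (std::is_same<T, int>::value) { return "int"; }
  if (std::is_same<T, float>::value) { return "float"; }
  if (std::is_same<T, double>::value) { return "double"; }
  return "other";
}

// Exercises the deduction cases that correspond to the README snippet
void CheckDeduction() {
  auto my_variable = 900;
  assert(DescribeScalar(my_variable) == "int");   // deduced from an int variable
  assert(DescribeScalar(3.7) == "double");        // an unsuffixed literal is a double
  assert(DescribeScalar(3.7f) == "float");        // the 'f' suffix gives a float
  assert(DescribeScalar<float>(3.7) == "float");  // an explicit argument forces float
}
```

If a `float` is intended, writing the literal as `3.7f` or keeping the explicit `<float>` are two ways to say so.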

62 changes: 16 additions & 46 deletions include/tuner/internal/kernel_info.h
@@ -35,6 +35,7 @@
#include <vector>
#include <iostream>
#include <stdexcept>
#include <functional>

// The C++ OpenCL wrapper
#include "cl.hpp"
@@ -45,21 +46,13 @@
namespace cltune {
// =================================================================================================

// Enumeration of modifiers to global/local thread-sizes
enum ThreadSizeModifierType { kGlobalMul, kGlobalDiv, kLocalMul, kLocalDiv };

// Enumeration of equalities/inequalities on parameter
enum ConstraintType { kEqual, kLargerThan, kLargerEqual, kSmallerThan, kSmallerEqual, kMultipleOf };

// Enumeration of operations on parameter
enum OperatorType { kNoOp, kMultipliedBy, kDividedBy };

// =================================================================================================

// See comment at top of file for a description of the class
class KernelInfo {
public:

// Enumeration of modifiers to global/local thread-sizes
enum class ThreadSizeModifierType { kGlobalMul, kGlobalDiv, kLocalMul, kLocalDiv };

// Helper structure holding a parameter name and a list of all values
struct Parameter {
std::string name;
@@ -81,30 +74,18 @@ class KernelInfo {
ThreadSizeModifierType type;
};

// Helper structure holding a constraint on parameters
// TODO: Make this more generic with a vector of parameters and operators
// Helper structure holding a constraint on parameters. This constraint consists of a constraint
// function object and a vector of paramater names represented as strings.
using ConstraintFunction = std::function<bool(std::vector<int>)>;
struct Constraint {
std::string parameter_1;
ConstraintType type;
std::string parameter_2;
OperatorType op_1;
std::string parameter_3;
OperatorType op_2;
std::string parameter_4;
};

// Temporary structure
struct SupportKernel {
std::string name;
cl::NDRange global;
cl::NDRange local;
ConstraintFunction valid_if;
std::vector<std::string> parameters;
};

// Exception of the KernelInfo class
class KernelInfoException : public std::runtime_error {
class Exception : public std::runtime_error {
public:
KernelInfoException(const std::string &message)
: std::runtime_error(message) { };
Exception(const std::string &message): std::runtime_error(message) { };
};

// Initializes the class with a given name and a string of OpenCL source-code
@@ -135,22 +116,11 @@ class KernelInfo {
// supported modifiers are given by the ThreadSizeModifierType enumeration.
void AddModifier(const StringRange range, const ThreadSizeModifierType type);

// Adds a new constraint to the set of parameters (e.g. must be equal or larger than)
// TODO: Combine the below three functions and make them more generic.
void AddConstraint(const std::string parameter_1, const ConstraintType type,
const std::string parameter_2);

// Also adds a constraint, but the second parameter is now modified by an operation "op" with
// respect to a third parameter (e.g. multiplication)
void AddConstraint(const std::string parameter_1, const ConstraintType type,
const std::string parameter_2, const OperatorType op,
const std::string parameter_3);

// As above, but with a second operation and a fourth parameter
void AddConstraint(const std::string parameter_1, const ConstraintType type,
const std::string parameter_2, const OperatorType op_1,
const std::string parameter_3, const OperatorType op_2,
const std::string parameter_4);
// Adds a new constraint to the set of parameters (e.g. must be equal or larger than). The
// constraints come in the form of a function object which takes a number of tuning parameters,
// given as a vector of strings (parameter names). Their names are later substituted by actual
// values.
void AddConstraint(ConstraintFunction valid_if, const std::vector<std::string> &parameters);

// Computes the global/local ranges (in NDRange-form) based on all global/local thread-sizes (in
// StringRange-form) and a single permutation (i.e. a configuration) containing a list of all
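
The header only declares the new constraint interface, so the following is a minimal sketch of how such a constraint could be evaluated, not the code from this commit. It reuses the `ConstraintFunction` alias and the `Constraint` layout shown above. The `Configuration` map, the `IsValid` helper, and the parameter names `TILE`, `WPT` and `VEC` are hypothetical and introduced only for illustration.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Same shape as the alias and struct in the diff above
using ConstraintFunction = std::function<bool(std::vector<int>)>;
struct Constraint {
  ConstraintFunction valid_if;
  std::vector<std::string> parameters;
};

// Hypothetical: one candidate configuration, mapping parameter names to chosen values
using Configuration = std::map<std::string, int>;

// Hypothetical helper: substitutes each parameter name by its value (in the order listed in
// the constraint) and calls the user-supplied function on the resulting vector.
bool IsValid(const Constraint &constraint, const Configuration &config) {
  std::vector<int> values;
  for (const auto &name : constraint.parameters) {
    values.push_back(config.at(name));  // throws std::out_of_range for unknown names
  }
  return constraint.valid_if(values);
}

// Example constraint: the first parameter must be a multiple of the product of the other two.
// Assumes the second and third values are non-zero.
Constraint MultipleOfProduct() {
  auto valid_if = [](std::vector<int> v) { return v[0] % (v[1] * v[2]) == 0; };
  return Constraint{valid_if, {"TILE", "WPT", "VEC"}};
}

// Evaluates the example constraint on one passing and one failing configuration
void CheckConstraint() {
  const auto constraint = MultipleOfProduct();
  const Configuration good = {{"TILE", 32}, {"WPT", 4}, {"VEC", 2}};
  const Configuration bad = {{"TILE", 32}, {"WPT", 3}, {"VEC", 2}};
  assert(IsValid(constraint, good));
  assert(!IsValid(constraint, bad));
}
```

A relation like this one previously needed the removed `ConstraintType` and `OperatorType` enumerations and the four-parameter `AddConstraint` overload. With a function object, the number of parameters and the relation between them are no longer limited by the interface.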
24 changes: 7 additions & 17 deletions include/tuner/internal/memory.h
@@ -36,22 +36,15 @@
#include <memory>

// The C++ OpenCL wrapper
#include "tuner/internal/opencl.h"

#include "cl.hpp"

namespace cltune {
// =================================================================================================

// Enumeration of currently supported data-types by this class
enum MemType { kInt, kFloat, kDouble };

// OpenCL-related exception, prints not only a message but also an OpenCL error code. This class is
// added to this file because it is only used by the Memory class.
class OpenCLException : public std::runtime_error {
public:
OpenCLException(const std::string &message, cl_int status)
: std::runtime_error(message+
" [code: "+std::to_string(static_cast<long long>(status))+"]") { };
};
enum class MemType { kInt, kFloat, kDouble };

// See comment at top of file for a description of the class
template <typename T>
@@ -62,9 +55,8 @@ class Memory {
const static MemType type;

// Initializes the host and device data (with zeroes or based on a source-vector)
explicit Memory(const size_t size, cl::Context context, cl::CommandQueue queue);
explicit Memory(const size_t size, cl::Context context, cl::CommandQueue queue,
std::vector<T> &source);
explicit Memory(const size_t size, std::shared_ptr<OpenCL> opencl);
explicit Memory(const size_t size, std::shared_ptr<OpenCL> opencl, std::vector<T> &source);

// Accessors to the host/device data
std::vector<T> host() const { return host_; }
@@ -81,10 +73,8 @@ class Memory {
std::vector<T> host_;
std::shared_ptr<cl::Buffer> device_;

// Pointers to the memory's context and command queue
// TODO: Pass these objects by reference instead of creating copies
cl::Context context_;
cl::CommandQueue queue_;
// Pointer to the OpenCL context and command queue
std::shared_ptr<OpenCL> opencl_;
};
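
The switch from a plain `enum` to `enum class` for `MemType` changes how the enumerators are used: they must be qualified with `MemType::` and no longer convert implicitly to `int`. A minimal sketch follows; the `MemTypeName` and `MemTypeIndex` helpers are hypothetical and not part of CLTune.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Same enumeration as in the diff above
enum class MemType { kInt, kFloat, kDouble };

// Scoped enumerators do not convert implicitly to int
static_assert(!std::is_convertible<MemType, int>::value, "no implicit conversion to int");

// Hypothetical helper: enumerators must be written with the MemType:: qualifier
std::string MemTypeName(const MemType type) {
  switch (type) {
    case MemType::kInt: return "int";
    case MemType::kFloat: return "float";
    case MemType::kDouble: return "double";
  }
  return "unknown";
}

// Hypothetical helper: an explicit cast is still available when the underlying value is needed
int MemTypeIndex(const MemType type) { return static_cast<int>(type); }

// Small self-check of both helpers
void CheckMemType() {
  assert(MemTypeName(MemType::kFloat) == "float");
  assert(MemTypeIndex(MemType::kDouble) == 2);
}
```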


28 changes: 19 additions & 9 deletions include/tuner/internal/opencl.h
@@ -44,28 +44,38 @@ namespace cltune {
class OpenCL {
public:

// Converts an unsigned integer to a string by first casting it to a long long integer. This is
// required for older compilers that do not fully implement std::to_string (part of C++11).
static std::string ToString(int value) {
return std::to_string(static_cast<long long>(value));
}

// OpenCL-related exception, prints not only a message but also an OpenCL error code.
class Exception : public std::runtime_error {
public:
Exception(const std::string &message, cl_int status):
std::runtime_error(message+" [code: "+ToString(status)+"]") {
};
};

// Types of devices to consider
const cl_device_type kDeviceType = CL_DEVICE_TYPE_ALL;

// Initializes the OpenCL platform, device, and creates a context and a queue
explicit OpenCL(const size_t platform_id, const size_t device_id);

// Accessors
cl::Device device() const { return device_; }
cl::Context context() const { return context_; }
cl::CommandQueue queue() const { return queue_; }
const cl::Device& device() const { return device_; }
const cl::Context& context() const { return context_; }
const cl::CommandQueue& queue() const { return queue_; }

// Checks whether the global and local thread-sizes, and local memory size are compatible with the
// current device
size_t VerifyThreadSizes(const cl::NDRange global, const cl::NDRange local);
void VerifyLocalMemory(const size_t local_memory);
size_t VerifyThreadSizes(const cl::NDRange &global, const cl::NDRange &local) const;
void VerifyLocalMemory(const size_t local_memory) const;

private:

// Converts an unsigned integer to a string by first casting it to a long long integer. This is
// required for older compilers that do not fully implement std::to_string (part of C++11).
std::string ToString(int value) { return std::to_string(static_cast<long long>(value)); }

// Settings
bool suppress_output_;
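
In this hunk `ToString` moves from a private non-static member to a public static one. A plausible reason, which the commit itself does not state, is that the nested `Exception` constructor has no `OpenCL` object on which to call a non-static member. The sketch below reproduces the pattern without any OpenCL dependency; the class name `Device`, the `FormatError` helper and the plain `int` status are stand-ins.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Stand-in for the OpenCL class; 'int status' replaces cl_int to avoid the OpenCL headers
class Device {
 public:
  // Static, so it can be called without a Device instance
  static std::string ToString(int value) {
    return std::to_string(static_cast<long long>(value));
  }

  // Nested exception that appends the error code to the message
  class Exception : public std::runtime_error {
   public:
    Exception(const std::string &message, int status):
      std::runtime_error(message + " [code: " + ToString(status) + "]") { }
  };
};

// Throws and catches the nested exception, returning the formatted message
std::string FormatError(const std::string &message, int status) {
  try {
    throw Device::Exception(message, status);
  } catch (const std::runtime_error &e) {
    return e.what();
  }
}

// Small self-check of the message format
void CheckFormat() {
  assert(FormatError("Kernel failed", -5) == "Kernel failed [code: -5]");
}
```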

53 changes: 16 additions & 37 deletions include/tuner/tuner.h
@@ -35,6 +35,7 @@
#include <vector>
#include <stdexcept>
#include <memory>
#include <functional>

// Include other classes
#include "tuner/internal/memory.h"
@@ -48,8 +49,8 @@ namespace cltune {
// See comment at top of file for a description of the class
class Tuner {
public:
const double kMaxL2Norm = 1e-4; // This is the threshold for 'correctness'
const int kNumRuns = 1; // This is used for more-accurate kernel execution time measurement
static constexpr auto kMaxL2Norm = 1e-4; // This is the threshold for 'correctness'
static constexpr auto kNumRuns = 1; // This is used for more-accurate execution time measurement

// Messages printed to stdout (in colours)
static const std::string kMessageFull;
@@ -80,38 +81,30 @@ class Tuner {
};

// Exception of the tuner itself
class TunerException : public std::runtime_error {
class Exception : public std::runtime_error {
public:
TunerException(const std::string &message)
Exception(const std::string &message)
: std::runtime_error(message) { };
};

// OpenCL-related exception
class OpenCLException : public std::runtime_error {
public:
OpenCLException(const std::string &message, cl_int status)
: std::runtime_error(message+
" [code: "+std::to_string(static_cast<long long>(status))+"]") { };
};

// Initialize either with platform 0 and device 0 or with a custom platform/device
explicit Tuner();
explicit Tuner(int platform_id, int device_id);
~Tuner();

// Adds a new kernel to the list of tuning-kernels and returns a unique ID (to be used when
// adding tuning parameters)
int AddKernel(const std::string filename, const std::string kernel_name,
const cl::NDRange global, const cl::NDRange local);
int AddKernel(const std::string &filename, const std::string &kernel_name,
const cl::NDRange &global, const cl::NDRange &local);

// Sets the reference kernel. Same as the AddKernel function, but in this case there is only one
// reference kernel. Calling this function again will overwrite the previous reference kernel.
void SetReference(const std::string filename, const std::string kernel_name,
const cl::NDRange global, const cl::NDRange local);
void SetReference(const std::string &filename, const std::string &kernel_name,
const cl::NDRange &global, const cl::NDRange &local);

// Adds a new tuning parameter for a kernel with a specific ID. The parameter has a name, the
// number of values, and a list of values.
// TODO: Remove all following functions (those that take "const int id" as first argument) and
// TODO: Remove all following functions (those that take "const size_t id" as first argument) and
// make the KernelInfo class publicly accessible instead.
void AddParameter(const size_t id, const std::string parameter_name,
const std::initializer_list<int> values);
@@ -123,23 +116,12 @@ class Tuner {
void MulLocalSize(const size_t id, const StringRange range);
void DivLocalSize(const size_t id, const StringRange range);

// Adds a new constraint to the set of parameters (e.g. must be equal or larger than)
// TODO: Combine the below three functions and make them more generic.
void AddConstraint(const size_t id, const std::string parameter_1, const ConstraintType type,
const std::string parameter_2);

// Same as above but now the second parameter is created by performing an operation "op" on two
// supplied parameters.
void AddConstraint(const size_t id, const std::string parameter_1, const ConstraintType type,
const std::string parameter_2, const OperatorType op,
const std::string parameter_3);

// Same as above but now the second parameter is created by performing two operations on three
// supplied parameters.
void AddConstraint(const size_t id, const std::string parameter_1, const ConstraintType type,
const std::string parameter_2, const OperatorType op_1,
const std::string parameter_3, const OperatorType op_2,
const std::string parameter_4);
// Adds a new constraint to the set of parameters (e.g. must be equal or larger than). The
// constraints come in the form of a function object which takes a number of tuning parameters,
// given as a vector of strings (parameter names). Their names are later substituted by actual
// values.
void AddConstraint(const size_t id, KernelInfo::ConstraintFunction valid_if,
const std::vector<std::string> &parameters);

// Functions to add kernel-arguments for input buffers, output buffers, and scalars. Make sure to
// call these in the order in which the arguments appear in the OpenCL kernel.
@@ -181,9 +163,6 @@ class Tuner {
// Loads a file from disk into a string
std::string LoadFile(const std::string &filename);

// Converts an unsigned integer into a string
std::string ToString(const int value) const;

// Prints a header of a new section in the tuning process
void PrintHeader(const std::string &header_name) const;
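
The two constants at the top of `Tuner` change from per-object `const` members to `static constexpr auto`. The sketch below shows what that implies, using a hypothetical `Settings` class rather than `Tuner` itself: the types are deduced from the initializers (`double` and `int`), and the values become usable in constant expressions. The `RunTimes` alias and `WithinTolerance` function are illustrative only.

```cpp
#include <array>
#include <cassert>
#include <type_traits>

// Hypothetical class mirroring the two constants from the diff above
class Settings {
 public:
  static constexpr auto kMaxL2Norm = 1e-4;  // deduced as const double
  static constexpr auto kNumRuns = 1;       // deduced as const int
};

// The deduced types, checked at compile time
static_assert(std::is_same<decltype(Settings::kMaxL2Norm), const double>::value, "double");
static_assert(std::is_same<decltype(Settings::kNumRuns), const int>::value, "int");

// Usable in constant expressions, for example as an array size
using RunTimes = std::array<double, Settings::kNumRuns>;

// Reads the constant by value, which does not require an out-of-class definition
bool WithinTolerance(const double l2_norm) {
  return l2_norm <= Settings::kMaxL2Norm;
}

// Small self-check of the threshold comparison
void CheckSettings() {
  assert(WithinTolerance(1e-5));
  assert(!WithinTolerance(1e-3));
}
```

One caveat for C++11 and C++14: if such a member is odr-used, for example by binding it to a const reference, it also needs a definition at namespace scope. The sketch avoids that by only reading the value.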
