Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement CString as a wrapper around a C-String to avoid using std::string #107

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

Flamefire
Copy link
Contributor

This avoids unnecessary memory allocations by wrapping the raw pointer (and the length).

I did some experiments using the 2 Benchmark PRs and the simpleFunc BM.

Master:

 100000: Range=1.1948-1.6665 Mean=1.3541 Median=1.3415
1000000: Range=6.1907-6.8819 Mean=6.4104 Median=6.3628

CString w/o length:

  100000: Range=1.1841-1.5298 Mean=1.3112 Median=1.3075
 1000000: Range=5.9802-6.4112 Mean=6.1600 Median=6.1523

CString w/ length:

 100000: Range=1.0298-1.6591 Mean=1.2913 Median=1.2709
1000000: Range=5.8449-6.4538 Mean=6.1164 Median=6.1250

CString w/ length and mem* functions:

 100000: Range=1.1546-1.5333 Mean=1.3124 Median=1.3039
1000000: Range=5.7451-6.4505 Mean=6.0884 Median=6.0576

So yes it does get faster although the effect in this limited scenario is limited.

@AndreasGocht
Copy link
Collaborator

Your call. If you like to have it in the master please fix the conflicts that occurred after merging #105 .

@Flamefire
Copy link
Contributor Author

Rebased to master to fix the conflicts. Had to squash the commits into 1 as there were to many conflicts and new functions so most commits would have been pointless.

Copy link
Collaborator

@AndreasGocht AndreasGocht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few design decisions I do not understand. For example:

  • Why do you have a CString and a PythonCString class? Arent Python C-Strings the only thing we are dealing with?
  • Why does make_region_name still return a std::string?

I have to admit, I am not happy having now four different string representations:

  • std::string
  • classical c strings
  • CString
  • PythonCString

I think we should either move to only PythonCStrings for anything string related or we stay with the current approach.
Best,

Andreas

SCOREP_User_RegionSetGroup(region_handle.value,
std::string(module, 0, module.find('.')).c_str());
const auto& region_name = make_region_name(module, function_name);
region_handle = create_user_region(region_name, module, file_name, line_number);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please either rename to scorep_create_user_region or use the initial code here, what I would prefer, as we would have all the Score-P calls on one level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point is that we duplicate the code on how a user region is created and that code is slightly more than the 2 SCOREP_* calls if you want to avoid useless memory allocations: std::string(module, 0, module.find('.')) copies the string even when module was a std::string already and no dot is found which is what this PR strives to avoid. Encapsulating duplicate code is what functions are for.

And why scorep_create_user_region? What is not clear from the current name? And why the scorep_ prefix which makes it look like a Score-P call while it is a scorepy call creating a user region. Maybe create_scorep_user_region if you want to be more explicit?


/// Pair of a C-String and it's length useful for PyArg_ParseTuple with 's#'
/// Implicitely converts to CString.
struct PythonCString
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not merging CString and PythonCString to PythongCString?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PythonCString is a workaround for the situation where you need a const char* and size pair. CString is a NULL-terminated sequence of chars with a length. According to rules of encapsulation you must not have write access to the members of CString as that could destroy invariants, in this case that len is the length of s. This doesn't hold for PythonCString which is simply for getting a CString from Python using the PyArg_ParseTuple function and unsafe to use further. Hence to be used only directly at the API boundary.

assert(module_name);
// TODO: Use string_view/CString comparison?
if (std::string(name) != "_unsetprofile" && std::string(module_name, 0, 6) != "scorep")
const CString name = get_string_from_python(*code.co_name);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about an Initalisert, which takes the frame?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate?

@@ -37,49 +37,60 @@ static const std::array<std::string, 2> EXIT_REGION_WHITELIST = {
#endif
};

static region_handle create_user_region(const std::string& region_name, const CString module,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why std::string for region_name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because this is what you have when calling the function. I could add an overloaded constructor to CString so it can be implicitely (or better explicitely) converted from a std::string if you like.


// Extract main module name if module is like "mainmodule.submodule.subsubmodule"
const char* dot_pos = module.find('.');
if (dot_pos)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to move this to PythonCString.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain what you want to move exactly? Keep in mind that allocations should be avoided hence a string copy must only be done when required, in this case when a dot was found

#include <cstdint>
#include <string>

namespace scorepy
{
/// Combine the arguments into a region name
/// Return value is a statically allocated string to avoid memory (re)allocations
inline const std::string& make_region_name(const std::string& module_name, const std::string& name)
inline const std::string& make_region_name(const CString module_name, const CString name)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As written I would vote for implementing this returning PythonCString

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result is used to search a map with keys of std::string. So if this should return a CString then CString would need to be implicitely convertible from const std::string& to work with that. Shall I do that?

@@ -71,11 +73,37 @@ auto cast_to_PyFunc(TFunc* func) -> detail::ReplaceArgsToPyObject_t<TFunc>*
return reinterpret_cast<detail::ReplaceArgsToPyObject_t<TFunc>*>(func);
}

inline CString get_string_from_python(PyObject& o)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add this as function, or initialiser of PythonCString

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do that. However the intention of CString was to keep it generic. This separates concerns. Adding this function would tie it to Python

What is wrong with this function?

@Flamefire
Copy link
Contributor Author

Why do you have a CString and a PythonCString class? Arent Python C-Strings the only thing we are dealing with?

CString is a generic wrapper around a C String to a) make working with it easier (see comparison and starts_with) and b) avoid passing around const char* which you disliked.
PythonCString is an interface class which bridges a gap between Python and CString, see #107 (comment) Not a big fan, but at least it simplifies things like foo_str, foo_len pairs and their conversion to CString which is error-prone.

Why does make_region_name still return a std::string?

Because that is what is need for the map where it is used. See #107 (comment) for an alternative

I have to admit, I am not happy having now four different string representations:

Correct me if I'm wrong but there are no "classical c strings" used anymore (except as sinks in SCOREP_* functions) and PythonCString is the opposite: Another interface type only used for ParseTuple. So you have only CString as a non-owning string and std::string as an owning string which sounds perfectly valid.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants