-
Notifications
You must be signed in to change notification settings - Fork 558
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] Add comparison operators to the String class #1142
Comments
You can implement But I would go optimize even furzer, and directly use |
Thinking at it again we might need to use |
I hadn't really thought about it before, but given the way UTF-8 is encoded, I think you're right that I wouldn't have thought that we'd need to worry about the UTF-8 BOM, because if we've picked that up when reading a string from a file (say), then we should have already stripped it off from the Wren side before doing any comparisons. Anyway, if we're going to implement DEF_PRIMITIVE(string_compareTo)
{
if (!validateString(vm, args[1], "Argument")) return false;
ObjString* str1 = AS_STRING(args[0]);
ObjString* str2 = AS_STRING(args[1]);
uint32_t len1 = str1->length;
uint32_t len2 = str2->length;
// If strings have equal length, return memcmp result.
if (len1 == len2) RETURN_NUM(memcmp(str1->value, str2->value, len1));
// Get minimum length for comparison.
uint32_t len3 = (len1 <= len2) ? len1 : len2;
int res = memcmp(str1->value, str2->value, len3);
// If result is non-zero, just return that.
if (res) RETURN_NUM(res);
// Otherwise the shorter string will come first.
res = (len1 < len2) ? -1 : 1;
RETURN_NUM(res);
}
...
PRIMITIVE(vm->stringClass, "compareTo(_)", string_compareTo); |
I would compact it a little bit more.
I would:
- Remove the happy path `(len1==len2)`, It would need some benchmark on
real use cases to prove it is really useful.
- Rename `len3` to `minLen`
- Change `res == -1 || res==1` to ` res != 0` si `memcmp` only guarantee
the sign and not the value.
|
OK, I've incorporated your first two suggestions. I'd already dealt with the third as I realized just after I'd posted it that we could only guarantee the sign. DEF_PRIMITIVE(string_compareTo)
{
if (!validateString(vm, args[1], "Argument")) return false;
ObjString* str1 = AS_STRING(args[0]);
ObjString* str2 = AS_STRING(args[1]);
uint32_t len1 = str1->length;
uint32_t len2 = str2->length;
// Get minimum length for comparison.
uint32_t minLen = (len1 <= len2) ? len1 : len2;
int res = memcmp(str1->value, str2->value, minLen);
// If result is non-zero, just return that.
if (res) RETURN_NUM(res);
// If the lengths are the same, the strings must be equal
if (len1 == len2) RETURN_NUM(0);
// Otherwise the shorter string will come first.
res = (len1 < len2) ? -1 : 1;
RETURN_NUM(res);
}
...
PRIMITIVE(vm->stringClass, "compareTo(_)", string_compareTo); |
We can use something more compact, by doing something like: // Use a variation of https://graphics.stanford.edu/~seander/bithacks.html#CopyIntegerSign
#define intcmp(a, b) (((a) > (b)) - ((a) < (b)))
RETURN_NUM(res != 0 ? res : intcmp(len1, len2)); Not sure if it should be a macro or specialized function since we might want to have at least Speaking of which maybe we might want to use |
Well, we can do something like that if you want to keep the line count down but it's hardly making the code easier to read. Personally, I'll be surprised if Going off topic slightly, I think a better use for a new operator such as var a = 1
var b = 2
a <=> b
System.print([a, b]) // [2, 1] Currently, we have nothing like this in the language for scalars and it's impossible to write one as we don't have parameter passing by reference. You either have to use a temporary variable (3 lines) or (for numbers) a one line hack such as:: var a = 1
var b = 2
b = a + b - (a = b)
System.print([a, b]) // [2, 1] I noticed recently that the supercomputer language, Chapel, was using Python, Ruby and Lua use multiple assignment for swapping but I don't like that solution much as it introduces problems of its own. |
About the ease of read, it is subjective. My solution at least express the intent in code, so I find it simpler than reading comments. But it is a minor detail, as long as the code working, it is fine. While I find the idea interesting for |
OK, I'll change the type of the lengths to The way I envisaged a swap operator working is analogous to But you're probably right that it wouldn't be worth the effort of implementing all this just to save two lines of code. Anyway it's just an idea I had :) |
Just to finish up on the 'swap' thing, the nearest I can get to a generic swap function, using what we currently have, is this: import "meta" for Meta
var Swap = Fn.new { |a, b|
var s = "{\n"
s = s + " var t = %(a)\n"
s = s + " %(a) = %(b)\n"
s = s + " %(b) = t\n"
s = s + "}\n"
Meta.compile(s).call()
}
var a = 1
var b = 2
Swap.call("a", "b")
System.print([a, b])
var c = 3
var d = 4
Swap.call("c", "d")
System.print([c, d])
var e = [5, 6]
Swap.call("e[0]", "e[1]")
System.print(e)
/* output:
[2, 1]
[4, 3]
[6, 5]
*/ The generated code is placed in its own block to ensure the temporary variable 't' is destroyed after each invocation. Note though that you have to pass the variable names as strings. Although it works for indexed variables, we don't really need this as we have |
Incidentally, it struck me later that the Nor will it work if called from within a function or method - not much does with |
I've now opened a PR to implement string comparisons on the basis agreed above. |
Currently, if you need to compare two strings or sort a list of strings lexicographically by codepoint, then you need to write your own comparison routine. This is an unsatisfactory state of affairs since you can't add your own comparison operators (or anything else) to the built-in
String
class or inherit from it.I therefore propose that we add a
compareTo
method and comparison operators to theString
class so that it will be as easy to compare two strings or sort a list of strings as it is to perform the same operations on numbers.As we need to work at the codepoint rather than the byte level, it will probably be quick enough to do this from the Wren side and here's my attempt at some code which is essentially what I've been using myself for the last 3 years or so, albeit in the guise of static methods of a proxy class:
I look forward to receiving comments on this proposal and any suggestions for improving the code.
The text was updated successfully, but these errors were encountered: