layout | title |
---|---|
default | Ruby Language Details |
I’ll talk about the details of Ruby’s syntax and evaluation,
which haven’t been covered yet. I didn’t intend a complete exposition,
so I left out everything which doesn’t come up in this book.
That’s why you won’t be able to write Ruby programs just by
reading this. A complete exposition can be found in the
reference manual.
Readers who know Ruby can skip over this chapter.
Ruby’s literals are extremely expressive.
In my opinion, what distinguishes Ruby as a scripting language
is first the existence of
the toplevel and second the expressiveness of its literals.
Third might be the richness of its standard library.
Each literal is already powerful on its own, and even more so
when combined. In particular, being able to build complex literals from
hash and array literals is a great advantage of Ruby. One can simply
write down a hash of arrays of regular expressions, for instance.
Let’s look at the valid expressions one by one.
Strings and regular expressions can’t be missing in a scripting language.
There is a great variety of string literals.
'string'              # 「string」
'\\begin{document}'   # 「\begin{document}」
'\n'                  # 「\n」backslash and an n, no newline
'\1'                  # 「\1」backslash and 1
'\''                  # 「'」
This is the simplest form. In C, single quotes enclose a character, but in Ruby they enclose a string. Let’s call this a `'`-string. The backslash escape
is in effect only for `\` itself and `'`. If one puts a backslash
in front of another character, the backslash remains, as
in the fourth example.
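The escaping rule can be checked with a small sketch that compares string lengths. The method name `quote_lengths` is made up for this illustration.

```ruby
# Hypothetical helper: lengths of a few literals, to show which
# backslash sequences are interpreted inside single quotes.
def quote_lengths
  [
    '\n'.length,   # 2: backslash and n stay as two characters
    '\\'.length,   # 1: \\ is the escape for one backslash
    '\''.length,   # 1: \' is the escape for one single quote
    "\n".length    # 1: inside double quotes \n is a newline
  ]
end

p quote_lengths   # prints [2, 1, 1, 1]
```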
And Ruby’s strings aren’t divided by newline characters.
If we write a string over several lines the newlines are contained
in the string.
'multi
line
string'
And if the `-K` option is given to the `ruby` command multibyte strings
will be accepted. At present the three encodings EUC-JP (`-Ke`),
Shift JIS (`-Ks`), and UTF8 (`-Ku`) can be specified. (Translator’s note:
`-K` option was removed in Ruby 1.9)
'「漢字が通る」と「マルチバイト文字が通る」はちょっと違う'
"string"              # 「string」
"\n"                  # newline
"\x0f"                # a byte given in hexadecimal form
"page#{n}.html"       # embedding an expression
With double quotes we can use expression expansion and backslash notation.
The backslash notation is classical and was already supported in C:
`\n` is a newline, `\b` is a backspace, and so on.
In Ruby, `Ctrl-C` and ESC can also be expressed, which is convenient.
It’s probably of no use to list the whole notation here.
Expression expansion, on the other hand, is even more fantastic.
We can write an arbitrary Ruby expression inside `#{ }` and it
will be evaluated at runtime and embedded into the string. There
are no limitations like only one variable or only one method.
This is not a mere literal anymore but a whole expression representing
a string.
"embedded #{lvar} expression"
"embedded #{@ivar} expression"
"embedded #{1 + 1} expression"
"embedded #{method_call(arg)} expression"
"embedded #{"string in string"} expression"
%q(string)      # same as 'string'
%Q(string)      # same as "string"
%(string)       # same as %Q(string) or "string"
If a lot of delimiter characters appear in a string, escaping all of them
becomes a burden. In that case the delimiter characters can be
changed. A string which contains the character `"` can be
written with a `%` string as follows:
"<a href=\"http://i.loveruby.net#{path}\">"
%Q(<a href="http://i.loveruby.net#{path}">)
The expression isn’t shorter, but nicer to look at.
When we have to escape more often, it even becomes more concise.
Here we have used parentheses as delimiters, but something else is fine,
too, such as brackets, braces or `#`. Almost every symbol is fine, even
`%`.
%q#this is string#
%q[this is string]
%q%this is string%
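As a quick check of the equivalences listed above, the following sketch pairs each `%` literal with its plain counterpart. The helper name `percent_pairs` is invented for this example.

```ruby
# Hypothetical helper returning pairs of literals that should be equal.
def percent_pairs
  [
    [%q(It's "quoted"),   'It\'s "quoted"'],    # %q behaves like single quotes
    [%Q(sum=#{1 + 1}),    "sum=#{1 + 1}"],      # %Q behaves like double quotes
    [%q[no #{expansion}], 'no #{expansion}']    # no expansion inside %q
  ]
end

percent_pairs.each { |percent, plain| p percent == plain }   # prints true three times
```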
Here documents are a syntactic device where one or more lines
form a string. A normal string starts right after the delimiter `"`
and takes everything until the closing `"`. Here documents start
at the line after a `<<EOS` and end at the line before the closing `EOS`.
<<EOS
All lines between the starting and
the ending line are in this
here document
EOS
Here we used `EOS` as the identifier, but any word is fine.
Precisely speaking, any word made of characters matching `[a-zA-Z_0-9]` can be used.
The characteristic of a here document is that every line between
the start symbol and the ending symbol will form a string. The line
which contains the start symbol delimits the string.
That’s why the position of the start symbol is not important.
It can even be in the middle of an expression:
printf(<<EOS, count_n(str))
count=%d
EOS
In this case the string `"count=%d\n"` goes in the place of `<<EOS`.
So it’s the same as the following.
printf("count=%d\n", count_n(str))
The start symbol can be somewhere in the line, but there are strict
rules for the ending symbol: It must be at the beginning of the line
and there must not be another letter in that line. However
if we write the start symbol with a minus like this `<<-EOS` we
can indent the line with the end symbol.
     <<-EOS
It would be convenient if one could indent the content
of a here document. But that's not possible.
If you want that, the best way is to write a method which deletes the indent.
But beware of tabs.
     EOS
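The advice inside that here document can be sketched as follows. The method name `strip_indent` is hypothetical, and the sketch only handles leading spaces, so the warning about tabs still applies.

```ruby
# Hypothetical helper: removes the smallest common leading indentation.
# Only spaces are counted; tabs are deliberately not handled.
def strip_indent(text)
  lines = text.lines
  # Width of the leading run of spaces on every non-blank line.
  widths = lines.reject { |l| l.strip.empty? }.map { |l| l[/\A */].length }
  margin = widths.min || 0
  # Blank lines are kept as they are; other lines lose the common margin.
  lines.map { |l| l.strip.empty? ? l : l[margin..-1] }.join
end

text = strip_indent(<<-EOS)
    It would be convenient
      if one could indent
    the content.
    EOS
print text
```

Relative indentation inside the text is preserved, only the common margin is removed. Ruby 2.3 and later also offer the `<<~` here document form, which strips the indentation by itself.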
Furthermore, the start symbol can be enclosed in single or double quotes.
Then the properties of the whole here document change.
When we change `<<EOS` to `<<"EOS"` we can use expression
embedding and backslash notation.
<<"EOS"
One day is #{24 * 60 * 60} seconds.
Incredible.
EOS
But `<<'EOS'` is not the same as a single-quoted string. It starts
the complete literal mode. Everything, even backslashes, goes
into the string as typed. This is useful for a string which
contains many backslashes.
In the second part we also explain how a here document is parsed.
But I’d like you to try to guess it beforehand.
(Translators Note: In Ruby 1.8 and 1.9 expression expansion and
backslash notation can be used in a normal here document.
There does not seem to be a difference anymore to a double quote enclosed
here document.)
Ruby strings are byte strings; there are no character objects.
Instead there are the following expressions which return the
integers which correspond to a certain character in ASCII code.
?a                    # the integer which corresponds to 「a」
?.                    # the integer which corresponds to 「.」
?\n                   # LF
?\C-a                 # Ctrl-a
(Translator’s note: Strings in Ruby 1.9 are not byte strings anymore,
they have an attached encoding. `?a` returns the string `"a"` in Ruby 1.9)
/regexp/
/^Content-Length:/i
/正規表現/
/\/\*.*?\*\//m        # An expression which matches C comments
/reg#{1 + 1}exp/      # the same as /reg2exp/
What is contained between slashes is a regular expression.
Regular expressions are a language to designate string patterns.
For example
/abc/
This regular expression matches a string where there’s an `a` followed
by a `b` followed by a `c`. It matches “abc” or “fffffffabc” or
“abcxxxxx”.
One can designate more special patterns.
/^From:/
This matches a string where there’s a `From` followed by a `:` at
the beginning. There are several more expressions of this kind,
such that one can create quite complex patterns.
The uses are endless:
replacing the matched part with another string, deleting the matched part,
determining whether there is a match, and so on.
A more concrete use case would be extracting the `From:` header
from a mail, then changing the `\n` to an `\r` and
checking if the rest looks like a mail address.
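The `From:` use case might look roughly like this. Both method names and the sample mail text are hypothetical, and the address check is deliberately loose, because real address syntax is far more complicated.

```ruby
# Hypothetical helper: returns the value of the first From: header, or nil.
def extract_from(mail)
  mail[/^From:[ \t]*(.*)$/, 1]
end

# Hypothetical, deliberately loose check for "something@something".
def looks_like_address?(str)
  !!(str =~ /\A[^@\s]+@[^@\s]+\z/)
end

mail = "To: alice@example.com\nFrom: bob@example.com\nSubject: hello\n"
from = extract_from(mail)
p from                        # prints "bob@example.com"
p looks_like_address?(from)   # prints true
```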
Regular expressions form an independent language with
its own parser and evaluator within ruby. They can be found in `regex.c`
in the Ruby source. From a grammar viewpoint, however, they are
treated the same as strings. Escapes, backslash notation and expression
embedding can be used almost the same as in strings.
Of course regular expressions and strings are treated the same
in the Ruby syntax only. Regular expressions themselves are a language
of their own, with their own rules which have to be obeyed. This is the subject
of a whole other book, so we won’t go deeper into it here.
Refer for instance to Jeffrey Friedl, Mastering Regular Expressions.
As with strings, regular expressions also have a syntax for changing
delimiters. In this case it is `%r`. Here are just some examples.
%r(regexp)
%r[/\*.*?\*/]            # matches a C comment
%r("(?:[^"\\]+|\\.)*")   # matches a string in C
%r{reg#{1 + 1}exp}       # embedding a Ruby expression
An array literal is contained in brackets `[]`, elements are separated
by commas.
[1, 2, 3]
['This', 'is', 'an', 'array', 'of', 'string']
[/regexp/, {'hash'=>3}, 4, 'string', ?\C-a]
lvar = $gvar = @ivar = @@cvar = nil
[lvar, $gvar, @ivar, @@cvar]
[Object.new(), Object.new(), Object.new()]
Ruby’s arrays are lists of arbitrary objects. From a syntactical
standpoint, their characteristic is that
the elements can be arbitrary expressions. As mentioned earlier,
an array of hashes of regular expressions can easily be made.
Not just literals but also variables or method calls can be
put together.
And as with the other literals, note that this is really an “expression
which generates an array object”.
i = 0
while i < 5
  p([1,2,3].id)    # Each time another object id is shown.
  i += 1
end
When writing scripts one uses arrays of strings a lot, hence
there is a special notation only for arrays of strings.
That is `%w`. With an example it’s immediately obvious.
%w( alpha beta gamma delta )   # ['alpha','beta','gamma','delta']
%w( 月 火 水 木 金 土 日 )
%w( Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec )
There’s also `%W` where expression embedding can be used.
It’s a relatively recent addition.
n = 5
%w( list0 list#{n} )   # ['list0', 'list#{n}']
%W( list0 list#{n} )   # ['list0', 'list5']
The author hasn’t come up with a good use yet.
Hash tables are data structures which store a one-to-one relation between
arbitrary objects. The following expressions generate a table.
{ 'key' => 'value', 'key2' => 'value2' }
{ 3 => 0, 'string' => 5, ['array'] => 9 }
{ Object.new() => 3, Object.new() => 'string' }
# Of course we can put it in several lines.
{ 0 => 0,
  1 => 3,
  2 => 6 }
We explained hashes in detail in the third chapter “Names and
Nametables”. They are fast lookup tables which allocate memory slots depending
on the hash value(?). In Ruby grammar the key value can be an arbitrary expression.
Furthermore inside a method call the braces can be omitted.
  some_method(arg, key => value, key2 => value2)
# some_method(arg, {key => value, key2 => value2}) # same as above
With this we can imitate named arguments.
button.set_geometry('x' => 80, 'y' => '240')
Of course in this case `set_geometry` must accept a hash as input.
Real keyword arguments will be transformed into parameter variables, though.
With a passed hash it is not quite the real thing.
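To make the receiving side concrete, here is a minimal sketch of a method that accepts such a hash. The `Button` class and its defaults are assumptions made up for this illustration, not part of any real library.

```ruby
# Hypothetical class whose method imitates named arguments with a hash.
class Button
  attr_reader :x, :y

  # The caller's key => value pairs arrive as one ordinary Hash object.
  def set_geometry(opts)
    @x = opts.fetch('x', 0)   # 0 is an assumed default for a missing key
    @y = opts.fetch('y', 0)
    self
  end
end

button = Button.new
button.set_geometry('x' => 80, 'y' => 240)
p [button.x, button.y]   # prints [80, 240]
```

The method has to pick the values out of the hash by hand, which is exactly why this is not quite the real thing.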
Range literals are oddballs which don’t appear in most other languages.
Here are some expressions which generate Range objects.
0..5          # from 0 to 5 containing 5
0...5         # from 0 to 5 not containing 5
1+2 .. 9+0    # from 3 to 9 containing 9
'a'..'z'      # strings from 'a' to 'z' containing 'z'
If there are two dots the last element is included. If there
are three dots it is not included. Not only integers but also floats
and strings can be made into ranges, even arbitrary objects can
be used in ranges. Syntactically arbitrary expressions can be
used as delimiters of a range object. If the returned object cannot
be made into a range there will be a runtime error.
Besides, the precedence of `..` and `...` is quite low. Here’s a surprising
interpretation.
1..5.to_a()   # 1..(5.to_a())
I think the Ruby grammar is really intuitive,
but I do not like this.
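A short sketch of how ranges behave once they are parenthesized, so that the method call applies to the whole range. The helper name `range_samples` is invented for this example.

```ruby
# Hypothetical helper: converts a few ranges to arrays.
def range_samples
  [
    (0..5).to_a,         # two dots: the end point 5 is included
    (0...5).to_a,        # three dots: the end point 5 is excluded
    ('a'..'e').to_a,     # strings work as well
    (1+2 .. 9+0).to_a    # arbitrary expressions as end points
  ]
end

range_samples.each { |a| p a }
```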
In the first part we talked about symbols at length.
They are something which corresponds one-to-one to strings.
In Ruby symbols are expressed with a `:` in front.
:identifier :abcde
These are pretty standard examples. But any variable or method
name becomes a symbol with a `:` in front. Like this:
:$gvar :@ivar :@@cvar :CONST
We haven’t shown any method names so far. Of course `[]` or `attr=`
can be used as symbols too.
:[] :attr=
When one uses these symbols as values in an array, it’ll look quite
complicated.
Numeric literals are the least interesting. It might be added that
1_000_000
becomes one million and that underscores can be used inside a number.
But that isn’t particularly interesting. That is all about numerical
values in this book. We’ll completely forget them from here on.
Let’s talk about the definition and calling of methods.
def some_method( arg )
  ....
end
class C
  def some_method( arg )
    ....
  end
end
Methods are defined with `def`. If they are defined at toplevel
they become function style methods, inside a class they become
methods of this class. To call a method which was defined in a class,
one usually has to create an instance with `new` as shown below.
C.new().some_method(0)
The return value of a method is the value of a `return` statement
if it runs across one.
If there is none it’s the value of the last statement.
def one()     # 1 is returned
  return 1
  999
end
def two()     # 2 is returned
  999
  2
end
def three()   # 3 is returned
  if true then
    3
  else
    999
  end
end
If the method body is empty, `nil` is returned,
and an expression without a value cannot be put at the end.
Hence every method has a return value.
Optional arguments can also be defined. If the number of arguments
doesn’t suffice, the remaining parameters are automatically assigned their
default values.
def some_method( arg = 9 )  # default value is 9
  p arg
end
some_method(0)    # 0 is shown.
some_method()     # The default value 9 is shown.
There can also be several optional arguments.
But in that case they must all come at the end. It is not
possible to make an argument in the middle optional.
It would be unclear how this should be made sense of.
def right_decl( arg1, arg2, darg1 = nil, darg2 = nil )
  ....
end
# This is not possible
def wrong_decl( arg, default = nil, arg2 )  # A middle argument cannot be optional
  ....
end
The parentheses from a method call can be omitted.
puts 'Hello, World!'   # puts('Hello, World!')
obj = Object.new       # obj = Object.new()
In Python leaving out parentheses gets the method object, but
there is no such thing in Ruby.
We can also omit parentheses within the arguments themselves.
  puts(File.basename fname)
# puts(File.basename(fname)) same as the above
If we like we can even leave out more:
  puts File.basename fname
# puts(File.basename(fname)) same as the above
In Ruby 2.0 such an expression will probably not pass anymore.
Actually, let’s also leave out the parentheses in the definition:
def some_method param1, param2, param3
end
def other_method    # without arguments we see this a lot
end
Parentheses are often left out in method calls, but leaving out
parentheses in the definition is not very popular.
Only when there are no arguments are the parentheses frequently omitted.
Arguments form a list of objects. If we want to use the elements of a list as arguments we can do this as follows:
def delegate(a, b, c)
  p(a, b, c)
end
list = [1, 2, 3]
delegate(*list)   # identical to delegate(1, 2, 3)
In this way we can distribute an array into arguments.
We call this device a `*`argument. Here we used a local variable
for demonstration, but of course there is no limitation.
We can also directly put a literal or a method call instead.
m(*[1,2,3])    # We could have written the expanded form in the first place...
m(*mcall())
The `*` argument can be used together with ordinary arguments,
but the `*` argument must come last.
In the definition on the other hand we can handle the arguments in
bulk when we put a `*` in front of the parameter variable.
def some_method( *args )
  p args
end
some_method()          # prints []
some_method(0)         # prints [0]
some_method(0, 1)      # prints [0, 1]
The surplus arguments are gathered in an array. Only one `*`parameter
can be declared. It must also come after the default arguments.
def some_method0( arg, *rest )
end
def some_method1( arg, darg = nil, *rest )
end
If we combine list expansion and bulk reception together, the arguments
of one method can be passed as a whole to another method. This might
be the most practical use of the `*`parameter.
# a method which passes its arguments to other_method
def delegate(*args)
  other_method(*args)
end
def other_method(a, b, c)
  return a + b + c
end
delegate(0, 1, 2)      # same as other_method(0, 1, 2)
delegate(10, 20, 30)   # same as other_method(10, 20, 30)
There is only one mechanism for ‘method call’, but there still
can be several representations of the same mechanism. This is
colloquially called syntactic sugar.
In Ruby there is a ton of it, the parser fetch becomes unbearable (?).
For instance the examples below are all method calls.
1 + 2                   # 1.+(2)
a == b                  # a.==(b)
~/regexp/               # /regexp/.~
obj.attr = val          # obj.attr=(val)
obj[i]                  # obj.[](i)
obj[k] = v              # obj.[]=(k,v)
`cvs diff abstract.rd`  # Kernel.`('cvs diff abstract.rd')
It’s hard to believe until you get used to it, but `attr=`, `[]=`, `\``
are all names of methods. They can appear as names in a method definition
and can also be used as symbols.
class C
  def []( index )
  end
  def +( another )
  end
end
p(:attr=)
p(:[]=)
p(:`)
There are people who don’t like sweets and there are people who
hate syntactic sugar. Maybe because one cannot tell by the looks
that it’s really the same thing. It feels like a deception.
(Why’s everyone so serious?)
Let’s see some more details.
obj.name? obj.name!
First, a small thing: just appending a `?` or a `!`. Call and definition
do not differ, so it’s not too painful. There are conventions for when
to use these method names, but there is no enforcement at the language level.
It’s just a convention.
These method names are probably an influence from Lisp, which has a great variety
of function names.
1 + 2 # 1.+(2)
Binary operators are converted to a method call on the object on the
left-hand side. Here the method `+` of the object `1` is called.
As listed below there are many of them. There are the general operators
`+` and `-`, also the equivalence operator `==` and the spaceship operator
`<=>` as in Perl, all sorts. They are listed in order of their precedence.
**
* / %
+ -
<< >>
&
| ^
> >= < <=
<=> == === =~
The symbols `&` and `|` are methods, but the double symbols `&&` and `||`
are built-in operators. Remember how it is in C.
+2
-1.0
~/regexp/
These are the unary operators. There are only three of them: `+`, `-` and `~`.
`+` and `-` work as one would imagine (in the default setting).
The operator `~` matches a string or a regular expression
with the variable `$_`. With an integer it stands for bit inversion.
To distinguish the unary `+` from the binary `+`, the method names
for the unary operators are `+@` and `-@` respectively.
Of course they can be called by just writing `+n` or `-n`.
obj.attr = val # attr=(val)
This is an attribute assignment statement. The above will be translated
into the method call `attr=`. When using this together with method calls whose
parentheses are omitted, we can write code which looks like attribute access.
class C
  def i() @i end          # We can write the definition in one line
  def i=(n) @i = n end
end
c = C.new
c.i = 99
p c.i    # prints 99
However both are method calls.
They are similar to get/set properties in Delphi or slot accessors in CLOS.
Besides, we cannot define an attribute assignment which takes an argument like
`obj.attr(arg)=`.
obj[i] # obj.[](i)
The above will be translated into a method call for `[]`.
Array and hash access are also implemented with this device.
obj[i] = val # obj.[]=(i, val)
When assigning to an index the `[]=` method is used.
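The operator forms discussed in this section can be put together in one small class. This is only a sketch: the class `Vec2` and its methods are invented for illustration.

```ruby
# Hypothetical two-element vector that defines several operator methods.
class Vec2
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  def +(other)          # called for v + w
    Vec2.new(@x + other.x, @y + other.y)
  end

  def -@                # called for -v (unary minus)
    Vec2.new(-@x, -@y)
  end

  def [](i)             # called for v[i]
    i == 0 ? @x : @y
  end

  def []=(i, val)       # called for v[i] = val
    if i == 0 then @x = val else @y = val end
  end

  def to_a
    [@x, @y]
  end
end

v = Vec2.new(1, 2) + Vec2.new(10, 20)
v[1] = 99
p v.to_a        # prints [11, 99]
p((-v).to_a)    # prints [-11, -99]
p v[0]          # prints 11
```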
Often we don’t want to replace a method, but we want to add a little
bit to the behaviour of an already existing method. Here it becomes
necessary to not just overwrite the method in the superclass but
to also call the method in the superclass.
That’s what Ruby’s `super` is for.
class A
  def test
    puts 'in A'
  end
end
class B < A
  def test
    super   # launches A#test
  end
end
Ruby’s `super` differs from the one in C++ or Java. This one here
calls the method with the same name in the superclass.
In other words `super` is a reserved word.
When using `super`, be careful about the difference between
`super` with zero arguments and `super` with omitted arguments.
The `super` with omitted arguments passes all the parameter variables.
class A
  def test( *args )
    p args
  end
end
class B < A
  def test( a, b, c )
    # super with no arguments
    super()    # shows []
    # super with omitted arguments. Same result as super(a, b, c)
    super      # shows [1, 2, 3]
  end
end
B.new.test(1,2,3)
Depending on the location (the object `self`) a method can or
cannot be called. This feature is usually called visibility.
In Ruby there are three kinds of methods.
- `public`
- `private`
- `protected`
`public` methods can be called from anywhere in any form.
`private` methods can syntactically only be called without a receiver.
In effect they can only be called by instances of the class
in which they were defined and in instances of its subclass.
`protected` methods can only be called by instances of the defining class
and its subclasses.
It differs from `private` in that the methods can still be called from other
instances of the same class.
The terms are the same as in C++ but the meaning is slightly different.
Be careful.
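The three rules can be tried out with a small sketch. The `Account` class is hypothetical and exists only to show which calls are accepted.

```ruby
# Hypothetical class with one public, one protected and one private method.
class Account
  def initialize(balance)
    @balance = balance
  end

  # Calls a protected method on another instance: allowed.
  def richer_than?(other)
    balance > other.balance
  end

  # Calls a private method without a receiver: allowed.
  def report
    "balance: #{formatted}"
  end

  protected

  def balance
    @balance
  end

  private

  def formatted
    @balance.to_s
  end
end

a = Account.new(5)
b = Account.new(3)
p a.richer_than?(b)    # prints true
p a.report             # prints "balance: 5"

begin
  a.formatted          # private method called with a receiver
rescue NoMethodError => e
  puts "rejected: #{e.class}"   # prints "rejected: NoMethodError"
end
```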
Usually we control visibility as shown below.
class C
  public
  def a1() end   # becomes public
  def a2() end   # becomes public
  private
  def b1() end   # becomes private
  def b2() end   # becomes private
  protected
  def c1() end   # becomes protected
  def c2() end   # becomes protected
end
Here `public`, `private` and `protected` are method calls without
parentheses. These aren’t reserved words.
`public` and `private` can also be used with an argument to set
the visibility of a particular method. But that’s not really relevant.
We’ll leave this out.
Given a module ‘M’. If there are two methods with the exact same
content
- `M.method_name`
- `M#method_name`(Visibility is `private`)
then we call this a module function.
It is not apparent why this should be useful. But let’s look
at the next example which is happily used.
Math.sin(5)       # If used for a few times this is more convenient
include Math
sin(5)            # If used more often this is more practical
It’s important that both functions have the same content.
With a different `self` but with the same code the behavior should
still be the same. Instance variables become extremely difficult to use.
Hence these methods are probably only used
for procedures like `sin`. That’s why they are called module functions.
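One way to obtain such a pair of methods is Ruby’s `module_function`, which copies an instance method to the module itself and makes the instance version private. The module `Geometry` and the class `Pizza` below are made up for this sketch.

```ruby
# Hypothetical module providing one module function.
module Geometry
  def circle_area(r)
    Math::PI * r * r
  end
  # Creates Geometry.circle_area and makes the instance method private.
  module_function :circle_area
end

p Geometry.circle_area(1)         # called with the module as receiver

# Hypothetical class that includes the module.
class Pizza
  include Geometry

  def initialize(radius)
    @radius = radius
  end

  def area
    circle_area(@radius)          # called function style, without a receiver
  end
end

p Pizza.new(2).area
```

The method body never touches `self` or instance variables, so both ways of calling it give the same result.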
Ruby’s iterators differ a bit from Java’s or C++’s iterator classes
or ‘Iterator’ design patterns. Precisely speaking those iterators
are exterior iterators. Ruby’s iterators are called interior iterators.
It’s difficult to understand from the definition so
let’s explain it with a concrete example.
arr = [0,2,4,6,8]
This array is given and we want to access the elements in
order. In C style we would write the following.
i = 0
while i < arr.length
  print arr[i]
  i += 1
end
Using an iterator we can write:
arr.each do |item|
  print item
end
Everything from `each do` to `end` is the call to an iterator method.
More precisely `each` is the iterator method and between
`do` and `end` is the iterator block.
The part between the vertical bars holds the block parameters.
They are the arguments passed from the iterator method to the block, where
they become variables.
Saying it quite abstractly, an iterator is something like
a piece of code which has been cut out and passed. In our example the
piece `print item` has been cut out and is passed to the `each` method.
Then `each` takes all the elements of the array in order and passes them
to the cut out piece of code.
We can also think the other way round. The other parts except `print item`
are being cut out and inserted into the `each` method.
i = 0
while i < arr.length
  print arr[i]
  i += 1
end
arr.each do |item|
  print item
end
What comes closest in C to iterators are functions which receive function pointers,
or higher order functions. But there are two points in which iterators in Ruby
and higher order functions in C differ.
Firstly, Ruby iterators can only take one block. For instance we can’t
do the following.
# Mistake. Several blocks cannot be passed.
array_of_array.each do |i|
  ....
end do |j|
  ....
end
Secondly, Ruby’s blocks can share local variables with the code outside.
lvar = 'ok'
[0,1,2].each do |i|
  p lvar    # Can access the local variable outside the block.
end
That’s where iterators are convenient.
But variables can only be shared with the outside. They cannot be shared
with the inside of the iterator method ( e.g. `each`). Putting it intuitively,
only the local variables can be seen, which are on the outside of the code.
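Both halves of this rule can be sketched in a few lines. All method names here are hypothetical; `defined?` is used so that the invisible variable can be probed without raising an error.

```ruby
# The block shares `total` with the surrounding method body.
def sum_of(list)
  total = 0                       # local variable outside the block
  list.each { |n| total += n }    # the block reads and updates it
  total
end

# Hypothetical iterator with a local variable of its own.
def visit(list)
  hidden = 'inside the iterator'  # local to visit, not visible to the block
  list.each { |x| yield x }
end

# Returns true if a block passed to visit could see `hidden`.
def block_sees_hidden?
  seen = nil
  visit([0]) { |x| seen = defined?(hidden) }
  !seen.nil?
end

p sum_of([1, 2, 3])      # prints 6
p block_sees_hidden?     # prints false
```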
Local variables which are assigned inside a block stay local to that block.
They become block local variables. Let’s check it out.
[0].each do
  i = 0
  p i     # 0
end
For the time being we apply `each` to an array of length 1. (We can
leave out the block parameter.) The variable `i` is first assigned
and declared inside the block. So `i` becomes a block local variable.
Block local means that it cannot be accessed from the outside.
Let’s test it.
% ruby -e '
[0].each do
  i = 0
end
p i     # Here occurs an error.
'
-e:5: undefined local variable or method `i'
for #<Object:0x40163a9c> (NameError)
When we referenced a block local variable from outside the block
an error occurred. Without a doubt it stayed local to the block.
Iterators can also be nested repeatedly. Each time
the new block creates another scope.
lvar = 0
[1].each do
  var1 = 1
  [2].each do
    var2 = 2
    [3].each do
      var3 = 3
      # Here lvar, var1, var2, var3 can be seen
    end
    # Here lvar, var1, var2 can be seen
  end
  # Here lvar, var1 can be seen
end
# Here only lvar can be seen
There’s one point which you have to keep in mind. Unlike
today’s major languages, Ruby’s block local variables don’t do shadowing.
Shadowing means for instance in C that in the code below the two declared
variables `i` are different.
{
    int i = 3;
    printf("%d\n", i);         /* 3 */
    {
        int i = 99;
        printf("%d\n", i);     /* 99 */
    }
    printf("%d\n", i);         /* 3 (back to the original value) */
}
Inside the inner block the inner `i` overshadows the outer `i`.
That’s why it’s called shadowing.
But what happens in Ruby, where there’s no shadowing?
Let’s look at this example.
i = 0
p i           # 0
[0].each do
  i = 1
  p i         # 1
end
p i           # 1 the change is preserved
When we assign `i` inside the block and there is already a variable `i`,
that same variable will be used. Hence if we assign to `i` inside,
the value of `i` on the outside changes. On this point there
have been many complaints: “This is error prone. Please do shadowing.”
Each time there’s a flame war, but so far no conclusion has been reached.
There are some smaller topics left.
First, there are two ways to write an iterator. One is the
`do` ~ `end` as used above, the other one is the enclosing in braces.
The two expressions below have exactly the same meaning.
arr.each do |i|
  puts i
end
arr.each {|i|    # The author likes a four space indentation for
    puts i       # an iterator with braces.
}
But grammatically the precedence is different.
The braces bind much stronger than `do`~`end`.
m m do .... end    # m(m) do....end
m m { .... }       # m(m() {....})
And iterators are of course just methods so they also take
arguments.
re = /^\d/                 # regular expression to match a digit at the beginning of the line
$stdin.grep(re) do |line|  # look repeatedly for this regular expression
  ....
end
Of course users can write their own iterators. Methods which have
a `yield` in their definition text are iterators.
Let’s try to write an iterator with the same effect as `Array#each`:
# adding the definition to the Array class
class Array
  def my_each
    i = 0
    while i < self.length
      yield self[i]
      i += 1
    end
  end
end
# this is the original each
[0,1,2,3,4].each do |i|
  p i
end
# my_each works the same
[0,1,2,3,4].my_each do |i|
  p i
end
`yield` calls the block. At this point control is passed to the block;
when the execution of the block finishes it returns to the same
location. Think about it like calling a special function. When the
present method does not have a block a runtime error will occur.
% ruby -e '[0,1,2].each'
-e:1:in `each': no block given (LocalJumpError)
        from -e:1
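A method can avoid that error by asking `block_given?` before it calls `yield`. The sketch below uses a hypothetical method `my_map`, which also shows that `yield` returns the value of the block.

```ruby
# Hypothetical iterator: collects the values returned by the block.
def my_map(list)
  return list.dup unless block_given?   # no block: hand back a copy
  result = []
  i = 0
  while i < list.length
    result << yield(list[i])            # the value of the block comes back here
    i += 1
  end
  result
end

p(my_map([1, 2, 3]) { |n| n * 10 })   # prints [10, 20, 30]
p my_map([1, 2, 3])                   # prints [1, 2, 3]
```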
I said that iterators are like cut-out code which is passed as an
argument. But we can even more directly turn code into an object
and carry it around.
twice = Proc.new {|n| n * 2 }
p twice.call(9)   # 18 will be printed
In short, it is like a function. It can be created with `new` and,
as might be expected, the return value of `Proc.new` is an instance
of the `Proc` class.
`Proc.new` surely looks like an iterator, and it is indeed so.
It is an ordinary iterator. There’s only some mechanism inside `Proc.new`
which turns an iterator block into an object.
Besides, there is a function style method `lambda` provided which
has the same effect as `Proc.new`. Choose whatever suits you.
twice = lambda {|n| n * 2 }
Why did we start talking all of a sudden about `Proc`? Because there
is a deep relationship between iterators and `Proc`.
In fact iterators and `Proc` objects are quite the same thing.
That’s why one can be transformed into the other.
First, to turn an iterator block into a `Proc` object
one has to put an `&` in front of the parameter name.
def print_block( &block )
  p block
end
print_block() do end   # Shows something like <Proc:0x40155884>
print_block()          # Without a block nil is printed
With an `&` in front of the argument name, the block is transformed to
a `Proc` object and assigned to the variable. If the method is not an
iterator (there’s no block attached) `nil` is assigned.
And in the other direction, if we want to pass a `Proc` to an iterator
we also use `&`.
block = Proc.new {|i| p i }
[0,1,2].each(&block)
This code means exactly the same as the code below.
[0,1,2].each {|i| p i }
If we combine these two, we can delegate an iterator
block to a method somewhere else.
def each_item( &block )
  [0,1,2].each(&block)
end
each_item do |i|    # same as [0,1,2].each do |i|
  p i
end
Expressions in Ruby can be combined to build new expressions or statements.
For instance a method call can be another method call’s argument,
and so it would become an expression.
The same goes for literals. But literals and method calls are not combinations
of other elements. From here on the expressions introduced will always be used
in combination with other expressions.
We probably do not need to explain the `if` expression. If the conditional
expression is true the body expression is executed. As explained in the
first part, in Ruby every object except `nil` and `false` is true.
if cond0 then
  ....
elsif cond1 then
  ....
elsif cond2 then
  ....
else
  ....
end
`elsif` and `else` can be omitted. Each `then` can also be omitted.
But there are some finer requirements concerning `then`.
It will be apparent by looking at the examples below.
All of them are valid.
# 1
if cond then ..... end
# 2
if cond; .... end
# 3
if cond then; .... end
# 4
if cond
then .... end
# 5
if cond
then
  ....
end
Furthermore, as every expression has a return value, there
is also a return value here. It is the return value of the
body expression which holds. For instance if the condition
at the beginning is true it is the return value of the
following expression.
p(if true  then 1 else 2 end)   #=> 1
p(if false then 1 else 2 end)   #=> 2
p(if false then 1 elsif true then 2 else 3 end)   #=> 2
If there’s no match, or the match is empty, `nil` is returned.
p(if false then 1 end)    #=> nil
p(if true  then   end)    #=> nil
An `if` with a negated condition is the same as an `unless`.
The following two examples have the same meaning.
unless cond then
  ....
end
if not (cond) then
  ....
end
`unless` can also have an `else` clause but there cannot be an `elsif`.
Of course `then` can be omitted.
`unless` also has a value. Analogous to `if`, it is the value of the
clause which matches. If there’s no match or the match is empty, `nil`
is returned.
The most common use of `and` is probably as a boolean operator,
for instance in the conditional clause of an `if`.
if cond1 and cond2
  puts 'ok'
end
But as in Perl, the shell or Lisp it can also be used as a conditional
branch expression.
The two following expressions have the same meaning.
if invalid?(key)
  return nil
end
invalid?(key) and return nil
`&&` and `and` have the same meaning. Only the binding order differs.
method arg0 &&  arg1    # method(arg0 && arg1)
method arg0 and arg1    # method(arg0) and arg1
Basically the symbolic operator is used in an expression which becomes an argument.
The alphabetical operator is used in an expression which will not become
an argument.
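The same difference in binding strength can be observed next to an assignment, which is perhaps the easiest place to see it. The following is only a small sketch, and the variable names `a` and `b` are made up for illustration.

```ruby
# `&&` binds tighter than `=`, so the whole boolean expression is assigned.
a = true && false      # parsed as a = (true && false)

# `and` binds looser than `=`, so the assignment happens first.
b = true and false     # parsed as (b = true) and false

p a   #=> false
p b   #=> true
```

This is why the symbolic form is the safer choice inside arguments and on the right hand side of an assignment.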
On the other hand `or` is the opposite of `and`. If the left hand
side evaluates to false, the right hand side is evaluated as well.

    valid?(key) or return nil

`or` and `||` have the same relationship as `&&` and `and`. Only the precedence is
different.
There is a conditional operator similar to C:
cond ? iftrue : iffalse
The space between the symbols is important.
If they bump together the following weirdness happens.
cond?iftrue:iffalse # cond?(iftrue(:iffalse))
The value of the conditional operator is the value of the last executed expression.
Either the value of the true side or the value of the false side.
Here's a `while` expression.

    while cond do
      ....
    end

This is the most basic loop construct. As long as `cond` is true
the body is executed. The `do` can be omitted.

    until io_ready?(id) do
      sleep 0.5
    end

`until` is the exact opposite of `while`.
As long as the condition is false the body is executed.
The `do` can be omitted.
There is also a jump construct which exits the loop.
As in C/C++/Java it is called `break`. Instead of `continue` there
is `next`. That seems to have come from Perl.

    i = 1                 # start at 1: doubling 0 would never make progress
    while true
      if i > 10
        break             # exit the loop
      elsif i % 2 == 0
        i *= 2
        next              # next loop iteration
      end
      i += 1
    end

And there is another Perlism: `redo`.

    while cond
      # (A)
      ....
      redo
      ....
    end

It returns to `(A)` and repeats from there. With `next` the condition
is checked again, with `redo` it is not.
I might make the world top 100 if the amount of Ruby code written
were counted, but I haven't used `redo` yet. It does not seem to be
that necessary after all.
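The difference between `next` and `redo` is easier to see in a loop whose condition has already become false when the jump happens. This is a minimal sketch; the names `redone` and `log` are made up, and the flag only exists to keep the loop from redoing forever.

```ruby
i = 0
redone = false
log = []

while i < 2
  i += 1
  log << i
  if i == 2 && !redone
    redone = true
    redo      # restart the body; `i < 2` is NOT checked again
  end
end

p log   #=> [1, 2, 3]
```

With `next` in place of `redo` the condition would be tested first and the loop would stop with `[1, 2]`.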
`case` is a special form of the `if` form. It performs branching on a series of
conditions. The following two expressions are identical in meaning.

    case value
    when cond1 then
      ....
    when cond2 then
      ....
    when cond3, cond4 then
      ....
    else
      ....
    end

    if cond1 === value
      ....
    elsif cond2 === value
      ....
    elsif cond3 === value or cond4 === value
      ....
    else
      ....
    end

The threefold equals `===` is, like `==`, really a method call.
The receiver is the object on the left hand side, that is, the condition
written after `when`. What it tests depends on the class of the receiver.
For a `Range` it tests whether the value lies within the range. For a regular
expression it tests whether the value matches. For a `Class` it tests whether
the value is an instance of it. `Array` and `Hash` do not define a `===` of
their own, so for them it behaves like `==`. There are many more grammatical
elements to `case`. To list them all would be tedious, so we will not
cover them in this book.
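Because `when` simply calls `===` on each condition, classes, regular expressions and ranges can be mixed freely in one `case`. The following is a small sketch; the method name `classify` and its return strings are made up for illustration.

```ruby
# Each `when` clause is tested as  condition === value, from top to bottom.
def classify(value)
  case value
  when Integer        then 'integer'        # Integer === value
  when /\A\d+\z/      then 'digit string'   # regexp === value
  when String, Symbol then 'text'           # either class may match
  else                     'other'
  end
end

p classify(3)       #=> "integer"
p classify('42')    #=> "digit string"
p classify(:name)   #=> "text"
p classify(nil)     #=> "other"

# The same tests written as explicit method calls.
p((1..5) === 3)     #=> true
p(/ab/ === 'cab')   #=> true
```

Note that the order of the clauses matters: `'42'` is also a `String`, but the regular expression is tested first.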
This is a control structure which can cross method boundaries and
transmit errors. Readers who are acquainted with C++ or Java
will know about exceptions. Ruby exceptions are basically the
same.
In Ruby exceptions come in the form of the function style method `raise`.
`raise` is not a reserved word.
raise ArgumentError, "wrong number of argument"
In Ruby exceptions are instances of the `Exception` class and its
subclasses. This form takes an exception class as its first argument
and an error message as its second argument. In the above case
an instance of `ArgumentError` is created and "thrown". The exception
object abandons the code after the `raise` and starts to travel back up the
method call stack.

    def raise_exception
      raise ArgumentError, "wrong number of argument"
      # the code after the exception will not be executed
      puts 'after raise'
    end
    raise_exception()

If nothing blocks the exception it will move up to the top level.
When it reaches the top level, `ruby` prints a message and ends
with a non-zero exit code.

    % ruby raise.rb
    raise.rb:2:in `raise_exception': wrong number of argument (ArgumentError)
            from raise.rb:7

However, if that were all, an `exit` would be sufficient. For an exception there
should be a way to handle it. In Ruby there is `begin`~`rescue`~`end`
for this.
It resembles `try`~`catch` in C++ and Java.

    def raise_exception
      raise ArgumentError, "wrong number of argument"
    end

    begin
      raise_exception()
    rescue ArgumentError => err then
      puts 'exception catched'
      p err
    end

`rescue` is a control structure which captures an exception. It catches
the exceptions of the declared class and its subclasses. If, in the
above example, an instance of `ArgumentError` comes flying, this `rescue`
matches it. With `=>err` the exception object is assigned to the local variable
`err`, and after that the `rescue` part is executed.

    % ruby rescue.rb
    exception catched
    #<ArgumentError: wrong number of argument>

If the exception is rescued, execution carries on after the `rescue`
as if nothing had happened. We can also make it retry the critical part
with `retry`.

    begin    # return here
      ....
    rescue ArgumentError => err then
      retry  # beginning anew
    end

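Since `retry` jumps back unconditionally, it is normally paired with some kind of counter so that the loop ends eventually. The following is a minimal sketch under that assumption; the `attempts` counter and the rule that the first two attempts fail are made up for illustration.

```ruby
attempts = 0
status = nil

begin
  attempts += 1
  # Pretend that the first two attempts fail (hypothetical failure condition).
  raise IOError, 'port not ready' if attempts < 3
  status = :ready
rescue IOError
  retry if attempts < 3   # jump back to `begin`
  status = :gave_up       # reached only when we stop retrying
end

p attempts  #=> 3
p status    #=> :ready
```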
We can omit the `=>err` and the `then` after `rescue`. We can also leave
out the exception class. In this case the class `StandardError`
is matched.
If we want to catch more exception classes we can just list them after `rescue`.
When we want to handle different errors differently we can use several `rescue` clauses
in one `begin`~`end` block.

    begin
      raise IOError, 'port not ready'
    rescue ArgumentError, TypeError
    rescue IOError
    rescue NameError
    end

In this case the exception class is checked in order until there is a match.
Only the one matching clause is executed. For instance in the above case only
the `IOError` clause is executed.
On the other hand, when there is an `else` clause, it is executed
only when there is no exception.

    begin
      nil    # Of course no error will occur here
    rescue ArgumentError
      # This part will not be executed
    else
      # This part will be executed
    end

Moreover an `ensure` clause will be executed in every case:
when there is no exception, when there is an exception, rescued or not.

    begin
      f = File.open('/etc/passwd')
      # do stuff
    ensure   # this part will be executed anyway
      f.close if f   # f is nil when File.open itself failed
    end

Besides, this `begin` expression also has a value. The value of the
whole `begin`~`end` expression is the value of the part which was executed
last. The `ensure` part does not count as it is normally used for cleanup only.
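The order in which the clauses run, and which of them supplies the value, can be traced with a small experiment. This is only a sketch; the method `run`, the `trace` array and the symbols used as values are made up for illustration.

```ruby
# Returns [value of the begin expression, clauses visited in order].
def run(should_fail)
  trace = []
  value = begin
    trace << :body
    raise ArgumentError, 'boom' if should_fail
    :body_value
  rescue ArgumentError
    trace << :rescue
    :rescue_value
  else
    trace << :else
    :else_value
  ensure
    trace << :ensure    # runs in both cases, but its value is discarded
  end
  [value, trace]
end

p run(false)  #=> [:else_value, [:body, :else, :ensure]]
p run(true)   #=> [:rescue_value, [:body, :rescue, :ensure]]
```

When an `else` clause is present and no exception occurs, the `else` clause is the part executed last, so it is the one that supplies the value.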
Referring to a variable or a constant. The value is the object the variable points to.
We have already talked about their various behaviors in more than enough detail, so we omit that here.

    lvar
    @ivar
    @@cvar
    CONST
    $gvar

I want to add one more thing. The variables starting with `$` are
of a special kind. Some have strange names, and they are not
necessarily global.
First the Perlish variables `$_` and `$~`. `$_` holds the return
value of `gets` and other methods, `$~` contains the last match
of a regular expression. They are local and thread local at the same time.
Incredible variables.
Then there are `$!`, which holds the exception object when
an exception has occurred, `$?`, which holds the status of a child process,
and `$SAFE`, which represents the security level. These are all thread local.
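That `$~` is local in spite of its `$` can be checked by matching once in a method and once in its caller. The following is a minimal sketch; the method name `match_inside` is made up for illustration.

```ruby
def match_inside
  'abc' =~ /b/
  $~[0]          # the match made inside this method
end

'xyz' =~ /y/
inner = match_inside
outer = $~[0]    # the caller's $~ was not overwritten by the callee

p inner  #=> "b"
p outer  #=> "y"
```

If `$~` were an ordinary global variable, `outer` would be `"b"` as well.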
Variable assignments are all performed by `=`. All variables are
typeless. What is saved is a reference to an object. It is
implemented with `VALUE` (a pointer).

    var = 1
    obj = Object.new
    @ivar = 'string'
    @@cvar = ['array']
    PI = 3.1415926535
    $gvar = {'key' => 'value'}

However, as mentioned earlier `obj.attr=val` is not an assignment.
It is a method call.
var += 1
As in C/C++/Java this is a shortcut for
var = var + 1
Differing from C, the Ruby `+` is a method and thus part of the library.
In C the whole meaning of `+=` is built into the language itself.
In C++, `+=` and `*=` can be overloaded, but we cannot do this in Ruby.
In Ruby `+=` is always the combination of `+` and assignment.
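One consequence is that defining `+` is all a class can do; `+=` then works automatically and always rebinds the variable to whatever `+` returned. This is a small sketch, and the class `Meters` is made up for illustration.

```ruby
class Meters
  attr_reader :value

  def initialize(value)
    @value = value
  end

  # Only `+` is defined. There is no way to define `+=` itself.
  def +(other)
    Meters.new(@value + other)
  end
end

m = Meters.new(1)
before = m
m += 2               # expands to  m = m + 2

p m.value            #=> 3
p before.value       #=> 1
p m.equal?(before)   #=> false
```

The original object is untouched; the variable `m` simply points to a new object afterwards.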
We can also combine self assignment and attribute access.
The result looks like an attribute.

    class C
      def i() @i end          # A method definition can be written in one line.
      def i=(n) @i = n end
    end

    obj = C.new
    obj.i = 1
    obj.i += 2    # obj.i = obj.i + 2
    p obj.i       # 3

If there is `+=` there might also be `++`, but this is not the case.
Why is that so? In Ruby assignment is dealt with on the language level.
But on the other hand methods are in the library. Keeping these two,
the world of variables and the world of objects, strictly apart is an
important peculiarity of Ruby. If `++` were introduced the separation
might easily be broken. That's why there's no `++`.
Some people don't want to go without the brevity of `++`. It has been
proposed again and again on the mailing list but was always turned down.
I am also in favor of `++` but I can do without it. There has never been
a `++` in Ruby, so let's forget about it.
`defined?` is a strange construct in Ruby. It tells whether an
expression value is defined or not.

    var = 1
    defined?(var)   #=> "local-variable" (a true value)

In other words it tells whether the received "argument" (if we may call
it that) returns a value after evaluation. But it won't tell you if there's
a parse error or whether an exception would be raised.
I would have loved to tell you more about `defined?`
but it will not appear again in this book. What a pity.
A statement is a syntactic construct which basically
cannot be combined with something else and is written
on a separate line.
But it still can be evaluated. For instance there are return values
for class definition statements and method definition statements.
However, this is only rarely used, not recommended and not useful.
We stick with this informal criterion.
Here we also don't mention the various return values.
Up to now we just said “For now one line’s one statement”.
But Ruby's statement endings aren't that straightforward.
First a statement can be ended explicitly with a semicolon as in C.
Of course then we can write two or more statements in one line.

    puts 'Hello, World!'; puts 'Hello, World once more!'

On the other hand, after an opening parenthesis, a binary operator, or a comma,
when the expression obviously continues, the statement continues automatically onto the next line.

    # 1 + 3 * method(6, 7 + 8)
    1 +
      3 *
         method(
                6,
                7 + 8)

But it's also no problem to connect lines explicitly with a backslash.

    p 1 + \
      2

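The rule has a practical consequence: whether the operator is placed at the end of the first line or at the start of the second changes the meaning. A minimal sketch, with variable names made up for illustration:

```ruby
a = 1 +
    2        # operator at the end of the line: the statement continues

b = 1
    + 2      # a separate statement (unary plus on 2); b stays 1

c = 1 \
    + 2      # the backslash joins the two lines explicitly

p a   #=> 3
p b   #=> 1
p c   #=> 3
```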
The `if` modifier is an irregular version of the normal `if`.
The following two programs mean exactly the same.

    on_true() if cond

    if cond
      on_true()
    end

The `unless` modifier is the negative version.
Guard statements (statements which rule out exceptional conditions) can
be conveniently written with it.
`while` and `until` also have a modifier form.

    process() while have_content?
    sleep(1) until ready?

Combining this with `begin` and `end` gives a `do`-`while` loop like in C.

    begin
      res = get_response(id)
    end while need_continue?(res)

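The difference between the plain modifier and the `begin`~`end` form only shows when the condition is false from the start. A minimal sketch with made-up counter names:

```ruby
runs_post = 0
begin
  runs_post += 1
end while false             # the body runs once before the test

runs_pre = 0
runs_pre += 1 while false   # plain modifier: the test comes first

p runs_post  #=> 1
p runs_pre   #=> 0
```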

    class C < SuperClass
      ....
    end

Defines the class `C` which inherits from `SuperClass`.
We talked quite extensively about classes in the first part.
This statement is executed. Within the definition the class
becomes `self`, and arbitrary expressions can be written within. Class
definitions can be nested. They form the foundation of Ruby's execution
image.

    def m(arg)
    end

I've already written about method definition and won't add more.
They also belong to statements.
We already talked a lot about singleton methods in the first part.
They do not belong to classes but to objects. In fact, they belong
to singleton classes. We define singleton methods by putting the
receiver in front of the method name. Parameter declaration is done
the same way as with ordinary methods.

    def obj.some_method
    end

    def obj.some_method2( arg1, arg2, darg = nil, *rest, &block )
    end

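
    class << obj
      ....
    end

From the viewpoint of purpose, this is a statement for defining several singleton methods together.
From the viewpoint of means, it is a statement during whose execution the singleton class of `obj` becomes `self`.
In a Ruby program, this is the only place where a singleton class is exposed.

    class << obj
      p self  #=> #<Class:#<Object:0x40156fcc>>   # Singleton Class 「(obj)」
      def a() end   # def obj.a
      def b() end   # def obj.b
    end
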
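That the methods defined this way really belong to one object only can be checked from the outside. A minimal sketch; the objects `obj` and `other` and the method bodies are made up for illustration.

```ruby
obj   = Object.new
other = Object.new

class << obj
  def a() :a end   # same as  def obj.a
  def b() :b end   # same as  def obj.b
end

p obj.a                        #=> :a
p obj.singleton_methods.sort   #=> [:a, :b]
p other.respond_to?(:a)        #=> false
```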
With a multiple assignment several assignments can be combined into one.
The following is a simple example:
a, b, c = 1, 2, 3
It’s exactly the same as the following.

    a = 1
    b = 2
    c = 3

It's not just for brevity's sake. When we bind variables to the elements
of an array it becomes delightful.

    a, b, c = [1, 2, 3]

This also has the same result as the above.
Furthermore, the right hand side does not need to be a literal.
It can also be a variable or a method call.

    tmp = [1, 2, 3]
    a, b, c = tmp
    ret1, ret2 = some_method()   # some_method probably returns several values

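Precisely speaking it is as follows. We will write the value of the
right hand side as `obj`.
- If `obj` is an array, use it as is.
- Otherwise, if a `to_ary` method is defined on `obj`, use it to convert `obj` into an array.
- Otherwise, use `[obj]`.

The right hand side is determined by this procedure and then the assignment is performed.
In other words, the evaluation of the right hand side and the assignment operation are
completely independent.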
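The three cases of this procedure can be tried one after another. This is a minimal sketch; the class `Pair` is made up for illustration and exists only to provide a `to_ary` method.

```ruby
class Pair
  def initialize(left, right)
    @left, @right = left, right
  end

  # Lets a Pair take part in multiple assignment.
  def to_ary
    [@left, @right]
  end
end

a, b = [1, 2]           # already an array: used as is
c, d = Pair.new(3, 4)   # not an array: converted with to_ary
e, f = 5                # neither: treated like [5], so f gets nil

p [a, b]   #=> [1, 2]
p [c, d]   #=> [3, 4]
p [e, f]   #=> [5, nil]
```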
And it goes on, the left and right hand side can be arbitrarily nested.
a, (b, c, d) = [1, [2, 3, 4]]
a, (b, (c, d)) = [1, [2, [3, 4]]]
(a, b), (c, d) = [[1, 2], [3, 4]]
The result after each line will be the assignments `a=1 b=2 c=3 d=4`.
And it goes on. The left hand side can contain index or attribute assignments.

    i = 0
    arr = []
    arr[i], arr[i+1], arr[i+2] = 0, 2, 4
    p arr    # [0, 2, 4]

    obj.attr0, obj.attr1, obj.attr2 = "a", "b", "c"

And like with method parameters, `*` can be used to receive the rest.

    first, *rest = 0, 1, 2, 3, 4
    p first  # 0
    p rest   # [1, 2, 3, 4]

If you start using them all, you will easily get confused.
We brushed over block parameters when we were talking about iterators.
But there is a deep relationship between them and multiple assignment.
For instance in the following case.

    array.each do |i|
      ....
    end

When the block is called with a `yield`, the provided parameters are multi-assigned to `i`.
Here there's only one variable on the left hand side, so it does not look like multiple assignment.
But if there are two or more variables we see what's going on. For instance `Hash#each`
provides a key and a value, so we usually call it like this:

    hash.each do |key, value|
      ....
    end

In this case an array consisting of a key and a value is yielded
from the hash.
Hence we can also use nested multiple assignment as shown below.

    # [[key,value],index] are given to yield
    hash.each_with_index do |(key, value), index|
      ....
    end

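Filled in with concrete data, the nested block parameters look as follows. This is only a sketch; the hash contents and the `rows` array are made up for illustration.

```ruby
hash = { 'a' => 1, 'b' => 2 }
rows = []

# each_with_index yields [[key, value], index]; the parentheses
# take the inner array apart just like nested multiple assignment.
hash.each_with_index do |(key, value), index|
  rows << "#{index}:#{key}=#{value}"
end

p rows   #=> ["0:a=1", "1:b=2"]
```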

    class C
      alias new orig
    end

This defines another method `new` with the same body as the already
defined method `orig`. `alias` is similar to a hard link in a Unix
file system: it is a means of assigning multiple names to one method body. In other words,
because the names themselves are independent of each other,
if one method name is overridden by a subclass method, the
other one still refers to the original method.
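The independence of the two names can be seen by overriding only one of them in a subclass. A minimal sketch; the class names `Base` and `Derived` and the method name `copy` are made up for illustration.

```ruby
class Base
  def orig() 'base' end
  alias copy orig        # a second name for the same method body
end

class Derived < Base
  def orig() 'derived' end   # overrides only the name `orig`
end

d = Derived.new
p d.orig   #=> "derived"
p d.copy   #=> "base"
```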

    class C
      undef method_name
    end

Prohibits the calling of `C#method_name`. It is not just a simple
revocation of the definition: even if there were a method of that name in the
superclass, calling it would also be forbidden. In other words the method is
exchanged for a sign which says "This method must not be called".
`undef` is extremely powerful. Once it is set it cannot be
deleted at the Ruby level, because it is used to cover up contradictions
in the internal structure. One must define the method again in a subclass,
and even then an error occurs when one calls `super`.
By the way, the method which corresponds to `unlink` in a file system
is `Module#remove_method`. While defining a class, `self` refers
to that class, so we can call it as follows. (Remember that `Class` is a
subclass of `Module`.)

    class C
      remove_method(:method_name)
    end

But even with `remove_method` one cannot cancel the `undef`.
It's because the sign put up by `undef` prohibits any kind of search.
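The contrast between the two can be sketched with a small class hierarchy. The class names `Parent`, `Removed` and `Undefined` are made up for illustration.

```ruby
class Parent
  def hello() 'parent' end
end

class Removed < Parent
  def hello() 'child' end
  remove_method :hello   # deletes only Removed#hello
end

class Undefined < Parent
  undef hello            # puts up the "must not be called" sign
end

p Removed.new.hello                   #=> "parent"
p Undefined.new.respond_to?(:hello)   #=> false

begin
  Undefined.new.hello
rescue NoMethodError => err
  p err.class                         #=> NoMethodError
end
```

After `remove_method` the method search simply continues in the superclass, whereas after `undef` it stops at the sign.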

    # examples of bad comments.
    1 + 1            # compute 1+1.
    alias my_id id   # my_id is an alias of id.

From a `#` to the end of line is a comment.
It doesn't have a meaning for the program.

    =begin
    This is an embedded document.
    It's so called because it is embedded in the program.
    Plain and simple.
    =end

An embedded document stretches from
an `=begin` outside a string at the beginning of a line
to an `=end`. The interior can be arbitrary.
The program reads and ignores it like a simple comment.
When the global variable `$KCODE` is set to either `EUC`, `SJIS`
or `UTF8`, strings are handled as EUC-JP, Shift JIS or UTF-8
respectively.
And if the option `-Ke`, `-Ks` or `-Ku` is given to the `ruby`
command, multibyte characters can be used within the Ruby code.
String literals, regular expressions and even method names
can contain multibyte characters. Hence it is possible to do
something like this:

    def 表示( arg )
      puts arg
    end

    表示 'にほんご'

But I really cannot recommend doing things like that.
Please send comments, impressions, and reports of typos to
青木峰郎 (Minero Aoki) <[email protected]> (http://i.loveruby.net).
『Rubyソースコード完全解説』 (Ruby Hacking Guide) can be reserved and purchased at Impress Direct.
Copyright © 2002-2004 Minero Aoki, All rights reserved.