More yield_self awesomeness. Also, the new name proposed.
Since my previous article about Ruby 2.5 new #yield_self
method I started to use it a lot—in production and experimental code.
Here, I just want to share several code samples/ideas of usage, where this seemingly simple method allows to rewrite code in— let’s say another way. Which somebody may consider cleaner, more functional, or better showing the intention.
But first—
The name
After considering and comparing a lot of possibilities in experimental code, and after using yield_self
extensively in production, I now firmly believe that the best possible name for the method is just .then
. Despite being a keyword, it is allowed by current syntax to be a method and reads extremely clear in all possible cases. Here, for one, is one of the examples of the previous article:
construct_url
.then(&Faraday.method(:get)).body
.then(&JSON.method(:parse))
.dig('object', 'id')
.then { |id| id || '<undefined>' } # BTW, don't you want .or_else('<undefined>') here? ;)
.then { |id| "server:#{id}" }
Join the discussion of the proposal on Ruby tracker. There is also voting for the new name, published by Koichi Sasada, one of the main Ruby maintainers.
I will use this imaginary name through the rest of examples, deal with it (please).
So, practically speaking, what’s this yield_self
then
is good for?
Keep your scopes short
Imagine this code:
def format(paragraphs, **options)
paragraphs = cleanup_forbidden_markup(paragraphs)
paragraphs = join_too_short(paragraphs)
paragraphs = realign_by_timestamps(paragraphs)
if options[:split_parts]
paragraphs = paragraphs.each_slice(PART_SIZE).flat_map { |chunk| [part_header, *chunk] }
end
# ...
# OK, smart boy, what is paragraphs now?..
end
(Yeah, I understand that discussing this code could be turned into refactoring championship easily, I just want to show one point, OK? And yes, the example is exaggerated, but more realistic one, with a method call, then switching through if
to process the result, then post-processing the result again, is easily imaginable.)
The problem is: in not-so-small scopes of algorithmically intensive methods, it is pretty hard to follow where each local variable came from, what it is needed for, and when it’ll change. We can rename them into initial_paragraphs
, cleaned_paragraphs
and so on, but will it be really better?
On the other hand, we can do this:
paragraphs
.then { |paragraphs| cleanup_forbidden_markup(paragraphs) } # also `.then(&method(:cleanup_forbidden_markup))`
.then { |paragraphs| join_too_short(paragraphs) }
.then { |paragraphs| realign_by_timestamps(paragraphs) }
.then { |paragraphs|
next paragraphs unless options[:split_parts]
paragraphs.each_slice(PART_SIZE).flat_map { |chunk| [part_header, *chunk] }
}
.then { ... }
There is more or less the same amount of typing, but scopes became short and focused here. Algorithm’s structure of x → x'
steps is enforced and demonstrated on the syntax level.
I can also say that attempts to refactor some algorithmically intensive methods this way typically led to a lot of small insights, like “Oh, in fact, there are two different computation processes, related to different variables, it is easy to split.”
Note how nicely next
keyword plays here, naturally reading like “to next block in the chain.”
Embrace our chains!
Method chains are an awesome tool for structuring code and thoughts! Unfortunately, not all concepts are easily chainable. One of them is “empty value guard”:
# the last statement in the method
very.long.and.chained.computation || default
This works to guard against nil
, but in real life, different contexts produce different “empty values.” This fact led to two common solutions, both of them look pretty “hacky” to me every time I see them:
# Ruby core: infamous #nonzero? returns "this number, if it is not a zero."
length = max_length.nonzero? || DEFAULT_LENGTH
# ActiveSupport: #presence does the same for, well, not #present? values
nickname = params[:nickname].presence || '<anonymous>'
With .then
:
very.long.and.chained.computation
.then { |res| res.empty? ? DEFAULT : res }
The res
repeated three times here, though. So it is compromise solution: we gain chainability and a less hacky way to express thoughts, but lose DRYness.
Poor man’s Maybe
There are a lot of argument around nil
recently, and how you should avoid relying on it.
But it is embedded into Ruby’s nature, and has very nice syntactical support via operators (&&
, ||
, &.
) and defaults (argument-less return
or next
generate it, for example). ~yield_self
~, oh, sorry, then
also can benefit from it:
Style 1 aka “and dot, and dot, and dot”
construct_uri
.then(&Faraday.method(:get))
.then { |res| res.body if res.success? } # Here the magic begins!
&.then(&JSON.method(:parse))
&.dig('content', 0) # One more possible source of nil
&.then { |item| item.transform_keys(&:to_sym) }
Style 2 aka “one is many”
construct_uri
.then(&Faraday.method(:get))
.then # Blockless form returns Enumerator
.select(&:success?)
.map(&:body)
.map(&JSON.method(:parse))
.reject { |data| data.key?('error') }
.map { |data| data.dig('content') }
.first # It is like .unwrap in monadic code
BTW, the presence
puzzle from the previous section could be solved this way, for good or for bad:
very.long.and.chained.computation
.then.reject(&:empty?).first || DEFAULT
Hygienic utils
In Elixir guides, the |>
operator (which #yield_self
is frequently compared to) used like this:
1..100_000 |> Enum.map(&(&1 * 3)) |> Enum.filter(odd?) |> Enum.sum
What is interesting here, that Elixir, which borrowed a lot from Ruby, doesn’t need to “include” some new modules into class to use their methods in chains. This means a lot of utilities can be implemented to produce short and readable (enough) code for contextually-convenient things:
10.then(&TimeCalc.method(:years)).ago
Of course, it is still less “readable” (at least longer) than 10.years.ago
, but if Ruby’s core team will simplify method()
as discussed for a long time (see #13581), it will be just (I am using one of the proposed syntaxes):
10.then(&TimeCalc.:years).ago
# Or, just in the current module, without polluting all the scopes:
include TimeCalc
10.then(&.:years).ago
This, of course, is still longer, yet readable enough and 100% hygienic and chainable (imagine one more long.chain.of.computation
in place of 10
).
Do we really need this?
I honestly don’t know. Probably, old good local variables, if
s and a boring imperative code is good enough for most cases. Probably, Borland Pascal 5 was already good enough, and things were becoming just fancier and unnecessarily complicated since that. But it is really tempting to investigate those unusual possibilities, and, maybe, just maybe, add sprinkles of new approaches into the mature language.
The one important thing that should be said, though: in Ruby, small abstractions are typically not “zero-cost”, not even close. The difference of “imperative” and then
- styles is that in the latter each step has an added cost of small closure creation, which sometimes, just sometimes, may surprise you in profiler’s output.
Bonus: where this leads us
If .then
-style (zen-style?) of code writing would be adopted, then in a few Ruby versions you may want to write your code this way…
construct_url
.rescuing(Faraday::Error, &Faraday.:get) # returns `nil` if the specified error caught
&.body
&.then(&JSON.:parse.curry[symbolize_names: true]) # we need to pass arguments!
&.dig('content', 0)
&.match(/<<(?<name>.+)>>/)
&.at('name') # core objects should all be &. friendly!
|.else('<noname>') # Can I haz "or else" already?..
Do we want to go that far?.. I know what /r/ruby will answer, but as I’ve said, it is not always easy to stand the dream.