Gracefully terminate a program in Go

This post is about gracefully terminating a program without interrupting the task it is currently running.

Let’s implement some dummy task to run.

package main

import (
    "fmt"
    "time"
)

type Task struct {
    ticker *time.Ticker
}

func (t *Task) Run() {
    for {
        select {
        case <-t.ticker.C:
            handle()
        }
    }
}

func handle() {
    for i := 0; i < 5; i++ {
        fmt.Print("#")
        time.Sleep(time.Millisecond * 200)
    }
    fmt.Println()
}

func main() {
    task := &Task{
        ticker: time.NewTicker(time.Second * 2),
    }
    task.Run()
}

Every two seconds Task.Run() calls the handle() function, which prints five ‘#’ symbols with a 200 ms delay between them.

If we terminate the running program by pressing Ctrl+C in the middle of handle(), we’ll be left with a partly done job.

$ go run main.go
#####
###^Csignal: interrupt

But we want our program to handle the interrupt signal gracefully, i.e. finish the currently running handle() and, perhaps, perform some cleanup. First, let’s capture the Ctrl+C. Notice that we handle the receive from channel c in another goroutine. Otherwise, the select construct would block the execution, and we would never get to creating and starting our Task.

func main() {
    task := &Task{
        ticker: time.NewTicker(time.Second * 2),
    }

    c := make(chan os.Signal, 1) // buffered, so signal.Notify never has to block
    signal.Notify(c, os.Interrupt)

    go func() {
        select {
        case sig := <-c:
            fmt.Printf("Got %s signal. Aborting...\n", sig)
            os.Exit(1)
        }
    }()

    task.Run()
}

Now, if we interrupt in the middle of handle(), we’ll get this:

$ go run main.go
#####
##^CGot interrupt signal. Aborting...
exit status 1

Well, apart from seeing our message instead of the default one, nothing has changed.

Graceful exit

There is a pattern for a graceful exit, that utilises a channel.

type Task struct {
    closed chan struct{}
    ticker *time.Ticker
}

The channel is used to tell all interested parties that there is an intention to stop the execution of the Task. That’s why it’s called closed, by the way, though that’s just a convention. The element type of the channel doesn’t matter, therefore it’s usually chan struct{}. What matters is the fact of receiving a value from this channel. All long-running processes that want to shut down gracefully will, in addition to performing their actual job, listen for a value from this channel and terminate when one arrives.

In our example, the long-running process is Run() function.

func (t *Task) Run() {
    for {
        select {
        case <-t.closed:
            return
        case <-t.ticker.C:
            handle()
        }
    }
}

If we receive a value from the closed channel, we simply exit from Run() with return.

To express the intention to terminate the task we need to send some value to the channel. But we can do better. Since a receive from a closed channel returns the zero value immediately [1], we can just close the channel.

func (t *Task) Stop() {
    close(t.closed)
}

We call this function upon receiving a signal to interrupt. In order to close the channel, we first need to create it with make.

func main() {
    task := &Task{
        closed: make(chan struct{}),
        ticker: time.NewTicker(time.Second * 2),
    }

    c := make(chan os.Signal, 1) // buffered, so signal.Notify never has to block
    signal.Notify(c, os.Interrupt)

    go func() {
        select {
        case sig := <-c:
            fmt.Printf("Got %s signal. Aborting...\n", sig)
            task.Stop()
        }
    }()

    task.Run()
}

Let’s try pressing Ctrl+C in the middle of handle() now.

$ go run main.go
#####
##^CGot interrupt signal. Aborting...
###

This works. Even though we got an interrupt signal, the currently running handle() finished printing.

Waiting for a goroutine to finish

But there is a tricky part. This works because task.Run() is called from the main goroutine, while the interrupt signal is handled in another one. When the signal is caught and task.Stop() is called, that other goroutine exits, while the main goroutine continues to execute the select in Run(), receives a value from the t.closed channel, and returns.

What if we execute task.Run() not in the main goroutine? Like this:

func main() {
    // previous code...

    go task.Run()

    select {
    case sig := <-c:
        fmt.Printf("Got %s signal. Aborting...\n", sig)
        task.Stop()
    }
}

If you interrupt the execution now, the currently running handle() will not finish, because the program terminates immediately. This happens because, when the interrupt signal is caught and processed, the main goroutine has nothing more to do - since task.Run() is executed in another goroutine - and simply exits. To fix this we need to somehow wait for the task to finish. This is where sync.WaitGroup will help us.

First, we associate a WaitGroup with our Task:

type Task struct {
    closed chan struct{}
    wg     sync.WaitGroup
    ticker *time.Ticker
}

We instruct the WaitGroup to wait for one background process to finish, which is our task.Run().

func main() {
    // previous code...

    task.wg.Add(1)
    go func() { defer task.wg.Done(); task.Run() }()

    // other code...
}

Finally, we need to actually wait for the task.Run() to finish. This happens in Stop():

func (t *Task) Stop() {
    close(t.closed)
    t.wg.Wait()
}

The full code:

package main

import (
	"fmt"
	"os"
	"os/signal"
	"sync"
	"time"
)

type Task struct {
	closed chan struct{}
	wg     sync.WaitGroup
	ticker *time.Ticker
}

func (t *Task) Run() {
	for {
		select {
		case <-t.closed:
			return
		case <-t.ticker.C:
			handle()
		}
	}
}

func (t *Task) Stop() {
	close(t.closed)
	t.wg.Wait()
}

func handle() {
	for i := 0; i < 5; i++ {
		fmt.Print("#")
		time.Sleep(time.Millisecond * 200)
	}
	fmt.Println()
}

func main() {
	task := &Task{
		closed: make(chan struct{}),
		ticker: time.NewTicker(time.Second * 2),
	}

	c := make(chan os.Signal, 1) // buffered, so signal.Notify never has to block
	signal.Notify(c, os.Interrupt)

	task.wg.Add(1)
	go func() { defer task.wg.Done(); task.Run() }()

	select {
	case sig := <-c:
		fmt.Printf("Got %s signal. Aborting...\n", sig)
		task.Stop()
	}
}

Update: Ahmet Alp Balkan pointed out that the pattern used in this post is more error-prone and should probably be avoided in favor of a pattern based on the context package. For details, read Make Ctrl+C cancel the context.Context.

References

Tie-breaking rounding

So I was reading an article about differences between Python 2 and Python 3 [1], and there was a statement:

Python 3 adopted the now standard way of rounding decimals when it results in a tie (.5) at the last significant digits. Now, in Python 3, decimals are rounded to the nearest even number.

At this point, I was like “WTF?!”. At school I was taught a simple rule: if x is exactly half-way between two integers, round to the larger absolute value, i.e. 13.5 becomes 14, and -13.5 becomes -14. No magic with even/odd numbers. It wasn’t even mentioned that there might be different ways of rounding.

But, as it often happens with the school program, they didn’t tell us the whole truth.

There are actually six more or less normal ways and two not so normal ones, which leaves us with eight (eight, Carl!) rules of rounding.

These are the normal rules [2]:

  • Round half down (or towards negative infinity): 13.5 rounds to 13, -13.5 rounds to -14.
  • Round half up (or towards positive infinity): 13.5 rounds to 14, -13.5 rounds to -13.
  • Round half towards zero: 13.5 rounds to 13, -13.5 rounds to -13.
  • Round half away from zero: 13.5 rounds to 14, -13.5 rounds to -14. I believe this is the rule I was taught at school.
  • Round half to even: 13.5 rounds to 14, -13.5 rounds to -14, but 14.5 also rounds to 14, and -14.5 rounds to -14.
  • Round half to odd: Opposite of the previous rule. 13.5 rounds to 13, -13.5 rounds to -13, 14.5 rounds to 15, -14.5 rounds to -15.

And these are some not so normal:

  • Stochastic rounding: the choice of the result is… random!
  • Alternating tie-breaking: simply alternate between rounding up and rounding down for successive ties.

Rounding in programming languages

Just out of curiosity I checked how rounding works in a few popular programming languages. It seems most of them use the Round half away from zero rule, the most logical one for me, since it’s the one I was taught at school.

So, this is what you’ll get in C, PHP 7, Python 2, Ruby 2:

// C's printf
printf("%g -> %g, %g -> %g, %g -> %g, %g -> %g\n", 
    13.5, round(13.5), 14.5, round(14.5), -14.5, round(-14.5), -13.5, round(-13.5));

// output: 13.5 -> 14, 14.5 -> 15, -14.5 -> -15, -13.5 -> -14

But, as was already mentioned, Python 3 uses Round half to even, and the result will be:

>>> print('%.1f -> %d, %.1f -> %d, %.1f -> %d, %.1f -> %d' \
... % (13.5, round(13.5), 14.5, round(14.5), -14.5, round(-14.5), -13.5, round(-13.5)))

# output: 13.5 -> 14, 14.5 -> 14, -14.5 -> -14, -13.5 -> -14

What’s more surprising for me is that Java 8 also uses another rule - judging by the output, it is Round half up (towards positive infinity):

System.out.printf("%.1f -> %d, %.1f -> %d, %.1f -> %d, %.1f -> %d\n",
    13.5, Math.round(13.5), 14.5, Math.round(14.5), -14.5, Math.round(-14.5), -13.5, Math.round(-13.5));

// output: 13.5 -> 14, 14.5 -> 15, -14.5 -> -14, -13.5 -> -13

Go 1.8 doesn’t have a built-in round function at all [3]; you have to build one from math.Ceil or math.Floor yourself.

Conclusion

Well, beware of different rules in different programming languages!

References

One way to deal with anxiety

Whenever it comes to some sort of test or exam, I start to have this feeling of worry and anxiety. It doesn’t matter how well I’m prepared - I still have it. And, of course, it doesn’t affect my performance in the best way.

But there is one way, that helps me overcome this feeling. The key point, though it might sound strange, lies in our insignificance. When the stress begins before some important occasion, I do an exercise.

First, I concentrate on myself, like: this is me, I’m standing here and feeling anxiety. At this point I can describe myself a bit, e.g. the clothes I’m wearing, etc. It usually helps to relax by changing the topic in your head.

Then I expand the area of my perception a bit - up to the room where I’m standing. I try to understand what’s happening inside that room, who is inside, what they are doing, and how these people relate to me.

Next step, expand further - up to the building. I imagine all the people inside: they are sitting in different rooms, talking to each other or keeping silent; they have their own concerns (maybe another examination, huh?). Most of the people in the building don’t even know of my existence.

And further - a district or a city. Tens of thousands, millions of people. Cars are moving along the roads. Everybody is busy. Somebody is having a date, somebody is yelling because a taxi driver blocked the road. And again - they don’t care about you.

Continue expanding: country, planet, universe… Do you see the stars? The vast space of nothing? The suns that have been exploding for billions of years? See that asteroid that is going to hit Mars in a billion years?

Now I start to quickly shrink the area of perception back to myself: universe, planet, country, city, building, room, myself. So, what was my anxiety about? That I might fail some test or not score well enough? Pff, who cares? The results of this test are so insignificant compared to what happens in the universe.

You’re so insignificant.

This feeling makes me feel free and peaceful. I calm down and just do the job.

A few notes on PHP exceptions

There are several practices that I find myself using over and over again while working with PHP exceptions. Here they are.

Package/component level exception interface

Create an interface that represents the top level of the exception hierarchy for your package/component/module. This interface is known as a marker interface. The approach has several advantages. First, it allows clients of your code to distinguish these component-specific exceptions from others. Second, as PHP doesn’t support multiple inheritance, using an interface allows the exceptions to extend other exceptions, e.g. the SPL exceptions.

Here is an example with usage:

<?php
namespace some\package {
    // Common package level exception
    interface Exception extends \Throwable {}

    class InvalidArgumentException extends \InvalidArgumentException
        implements Exception {}
}

namespace {
    try {
        // Do something
    } catch (some\package\Exception $e) {
        // All package specific exceptions
    } catch (Exception $e) {
        // Other exceptions
    }
}

Factory methods to create exceptions

Quite often an exception’s message is long and contains placeholders. Generating this message, especially if you throw it in several places, is not very convenient. In this case a factory method will hide the complexity. Imagine you are doing something like this:

<?php
interface SomeInterface {}
$c = new stdClass();
if (!($c instanceof SomeInterface)) {
    throw new some\package\UnexpectedValueException(
        sprintf('Argument is of type "%s", but expecting "%s"', get_class($c), SomeInterface::class)
    );
}

But instead, it would be much cleaner to do this:

<?php
class UnexpectedValueException extends \UnexpectedValueException
    implements Exception
{
    public static function wrongType($given, $expected)
    {
        return new self(
            sprintf('Argument is of type "%s", but expecting "%s"', $given, $expected)
        );
    }
}

// Calling code
use some\package\UnexpectedValueException;

if (!($c instanceof SomeInterface)) {
    throw UnexpectedValueException::wrongType(get_class($c), SomeInterface::class);
}

Extended exceptions with additional details

There are cases when we need to perform additional actions in the catch block. For that we often need to know the details of the original arguments that caused the exception. For instance, you’re catching a DuplicatedAccountException and want to know the e-mail that was passed to the registration service. It is quite easy if the call to the service and the catch block are in the same context:

<?php
class DuplicatedAccountException extends \LogicException {}

class RegistrationService
{
    public function register($email)
    {
        throw new DuplicatedAccountException(sprintf('Account with %s email already exists.', $email));
    }
}

// Calling code
$email = 'test@test.com';
$registrationService = new RegistrationService();

try {
    $registrationService->register($email);
} catch (DuplicatedAccountException $e) {
    echo 'Error: ' . $e->getMessage() . PHP_EOL;

    // Do something else with $email. We can do that
    // because we are in the same context, i.e. in one method.
}

But it is quite common to catch an exception that is thrown from some deeply nested method, where you don’t have access to the original context. In this case it is helpful to have this information attached to the exception itself.

<?php
class DuplicatedAccountException extends \LogicException
{
    private $originalEmail;

    public function __construct($originalEmail, $code = 0, Exception $previous = null)
    {
        $this->originalEmail = $originalEmail;

        parent::__construct(
            sprintf('Account with %s email already exists.', $originalEmail),
            $code,
            $previous
        );
    }

    public static function create($originalEmail)
    {
        return new self($originalEmail);
    }

    public function getOriginalEmail()
    {
        return $this->originalEmail;
    }
}

// Calling code
try {
    // Some actions…
} catch (DuplicatedAccountException $e) {
    // Do something with $originalEmail.
    echo 'Original email: ' . $e->getOriginalEmail() . PHP_EOL;
}

Atomic change of an index in MySQL

Imagine you have a composite index on four columns, and you need to remove one column from the index. The obvious solution would be to re-create the index:

DROP INDEX idx ON test_table;
CREATE INDEX idx ON test_table (column1, column2, column3);

But after the DROP statement there is no index at all, which is, of course, bad for performance, especially on a production system. The other way is to re-create the index in a single ALTER TABLE statement:

ALTER TABLE test_table
DROP INDEX idx,
ADD KEY idx (column1, column2, column3);

I was wondering how atomic this statement is, i.e. whether internally it just executes DROP and CREATE, leaving a window of time when there is no index, or uses some other technique to make the change less painful. I made a test table with four columns (besides id) and populated it with 500K random records. The size of the table was 28 MB. Then I created a composite index on the four columns (the size increased to 36 MB). A test query:

mysql> EXPLAIN SELECT * FROM atomic WHERE field1 = 30 AND field2 = 69370;
+----+-------------+--------+------+---------------+------+---------+-------------+------+-------------+
| id | select_type | table  | type | possible_keys | key  | key_len | ref         | rows | Extra       |
+----+-------------+--------+------+---------------+------+---------+-------------+------+-------------+
|  1 | SIMPLE      | atomic | ref  | idx           | idx  | 8       | const,const |    1 | Using index |
+----+-------------+--------+------+---------------+------+---------+-------------+------+-------------+
1 row in set (0.00 sec)

OK, the index is in place and is used. At this point I did an experiment to see how the size of the .ibd file changes if I just re-create the index with separate DROP and CREATE statements. As expected, the size stayed the same - 36 MB.

Then I did this: in one session I ran the ALTER TABLE statement with DROP/ADD, and while it was running, in another session I executed the SELECT statement above, to see if the index was still used. The answer is yes - it was still used during the execution of the ALTER TABLE.

I also checked the size of the .ibd file after changing the index. It increased to 44 MB. This suggests that MySQL created a temporary index with a different number of columns (which is why the size increased), and then just updated the table’s metadata to point to the new index.

Split file in chunks without breaking the sequence with GAWK

Today I came up with a task for myself, mostly for fun, but it also has a useful application. I have a dump of a MySQL table in CSV, ~500K records. The data will be loaded into Neo4j, and I want to speed up the load by parallelizing the process. Of course, I could do it in any programming language, but I decided to practice with Linux shell commands and gawk.

The example structure of the file is like this:

Title,Id,Sequence
"eee",3,2
"hhh",1,2
"bbb",2,1
"hhh",1,3
"kkk",4,1
"hhh",1,1
"bbb",2,2
"eee",3,1
"eee",3,3

There is a requirement: records with the same Id must be processed together. Basically, that means they cannot end up in different chunks, even if a chunk slightly exceeds its limit, say, 1003 records instead of 1000.

The first step is to sort the file by the Id column. But it will be even easier for my loader if the file is secondarily sorted by the Sequence column. To simplify, let’s get rid of the header too.

$> awk 'BEGIN { getline; } { print $0 }' test.csv > test_without_header.csv

A little explanation. AWK’s BEGIN block is executed only once, before processing the first line. Here getline; reads the header line, but it isn’t printed, so the real output with print $0 actually starts from the second line. Now sort:

$> sort -k2n -k3n --field-separator "," test_without_header.csv > test_sorted.csv

-k2 means sort by column number 2 (columns are numbered from 1), and n means numerical sort. The same goes for the secondary sort. In MySQL the same result would be achieved with ORDER BY Id ASC, Sequence ASC. --field-separator is necessary because the default separator is a space, but we have a comma. Now we have this:

"hhh",1,1
"hhh",1,2
"hhh",1,3
"bbb",2,1
"bbb",2,2
"eee",3,1
"eee",3,2
"eee",3,3
"kkk",4,1

Let’s assume we want to split the file into chunks of 2 records. The natural approach with split doesn’t work, because it breaks the sequence. Indeed, if we do:

$> split -l 2 test_sorted.csv
$> cat xaa
"hhh",1,1
"hhh",1,2
$> cat xab
"hhh",1,3
"bbb",2,1

The third record with Id = 1 got into the second chunk, but should be in the first. To solve the problem I played around with AWK a little and wrote the following script:

# split.awk
#
# Initialise values.
# This block will be executed only once before
# processing the first line.
BEGIN {
    records_limit_per_chunk = 2
    records_currently_in_chunk = 0
    chunk_number = 1
    prev = -1
    chunk_limit_is_hit = 0
}

# Process current line
{
    # If the limit is hit and we are not
    # in the middle of a sequence.
    # $2 means the second column, which is Id in our case.
    if (chunk_limit_is_hit && prev != $2) {
        chunk_number++
        records_currently_in_chunk = 0
        chunk_limit_is_hit = 0
    }

    # Write line to a chunk
    file = "chunk_" chunk_number
    print $0 > file
    ++records_currently_in_chunk

    # Test the limit
    if (records_currently_in_chunk >= records_limit_per_chunk) {
        chunk_limit_is_hit = 1
    }

    # Save previous Id value
    prev = $2
}

Run the script:

$> gawk -F, -f split.awk test_sorted.csv

Pay attention to the -F, argument, which says to use , as the field separator. -f split.awk means to load the script from a file. For our input file it creates 4 chunks with the following content:

$> cat chunk_1
"hhh",1,1
"hhh",1,2
"hhh",1,3
$> cat chunk_2
"bbb",2,1
"bbb",2,2
$> cat chunk_3
"eee",3,1
"eee",3,2
"eee",3,3
$> cat chunk_4
"kkk",4,1

Exactly what I need: the records with Ids 1 and 3 each stay within a single chunk, even though those chunks are larger than the limit of 2 records.

Broken code theory

I recently came across the Broken windows theory article. It is a criminological theory addressing the problems of urban disorder, vandalism and anti-social behavior. The theory states that:

… maintaining and monitoring urban environments to prevent small crimes such as vandalism, public drinking, and toll-jumping helps to create an atmosphere of order and lawfulness, thereby preventing more serious crimes from happening.

The theory was introduced by social scientists James Q. Wilson and George L. Kelling in a 1982 article titled Broken Windows, where they gave an example:

Consider a building with a few broken windows. If the windows are not repaired, the tendency is for vandals to break a few more windows. Eventually, they may even break into the building, and if it’s unoccupied, perhaps become squatters or light fires inside.

The authors suggested preventing vandalism by addressing the problems while they are small and easy to manage.

I think the same theory can be applied to software development - writing “bad” code can be seen as vandalism. We often find ourselves in situations where we can accomplish a task in two ways: either program a clean, robust solution (often meaning complex and long) or make a quick dirty hack. At that moment the code may be in one of two states:

  1. The code is clean and there are no dirty hacks.
  2. There are dirty hacks.

Guess in which case you will be more tempted to add “just another small hack that no one will notice”? Exactly - in the second one. After all, you have deadlines. But you’ll definitely feel uncomfortable doing that in the first case, simply because there is an established norm of clean code that you don’t want to break.

In an ideal world the code is always perfect, but in the real world it tends to become, over time, a collection of small (sometimes not so small) hacks. To prevent that, as with ordinary vandalism, the hacks should be removed or replaced as soon as possible. By adding another small hack you are not only making an obviously poor design decision, but also sending others the message: “Go ahead, add more of them!”

Liquid engine for PHP

I’ve just finished working on my fork of the Liquid template engine.

Liquid is a templating library originally developed for use with Ruby on Rails at Shopify. It uses a combination of tags, objects, and filters to load dynamic content. They are used inside Liquid template files, a group of files that make up a theme.

Here is a simple example:

<?php

use Liquid\Template;

$template = new Template();
$template->parse("Hello, {{ name }}!");
echo $template->render(array('name' => 'Alex'));

// Will echo
// Hello, Alex!

The implementation my fork is based on was a little outdated, and I decided I could make some improvements to it.

Namespaces

Namespaces were added: all classes are now under the Liquid namespace, and the library implements PSR-4. The minimum required PHP version is now 5.3.

Composer

You can now install the library via composer. Here is the package’s page.

composer create-project liquid/liquid

New standard filters

Implemented new standard filters: sort, sort_key, uniq, map, reverse, slice, round, ceil, floor, strip, lstrip, rstrip, replace, replace_first, remove, remove_first, prepend, append.

For the full list of supported filters read the Ruby implementation’s wiki page.

New tags

The raw tag was added. The implementation of the unless tag is planned. You’re welcome to contribute.

Golang channels tutorial

Golang has built-in instruments for writing concurrent programs. Placing a go statement before a function call starts the execution of that function as an independent concurrent thread of control in the same address space as the calling code. Such a thread is called a goroutine in Golang. Here I should mention that concurrently doesn’t always mean in parallel. Goroutines are a means of creating a concurrent program architecture, which may execute in parallel if the hardware allows it. There is a great talk on this topic: Concurrency is not parallelism.

Let’s start with an example of a goroutine:

func main() {
     // Start a goroutine and execute println concurrently
     go println("goroutine message")
     println("main function message")
}

This program will print main function message and possibly goroutine message. I say possibly because spawning a goroutine has some peculiarities. When you start a goroutine, the calling code (in our case the main function) doesn’t wait for the goroutine to finish, but continues running. After calling println, the main function ends its execution, and in Golang that means stopping the whole program along with all spawned goroutines. But before this happens, our goroutine may manage to finish executing its code and print the goroutine message string.

As you can imagine, there must be some way to avoid such situations. And for that Golang has channels.

Channels basics

Channels serve to synchronize the execution of concurrently running functions and to provide a mechanism for their communication by passing values of a specified type. Channels have several characteristics: the type of elements you can send through the channel, its capacity (or buffer size), and the direction of communication, specified by the <- operator. You can allocate a channel using the built-in function make:

i := make(chan int)       // by default the capacity is 0
s := make(chan string, 3) // non-zero capacity

r := make(<-chan bool)          // can only read from
w := make(chan<- []os.FileInfo) // can only write to

Channels are first-class values and can be used anywhere like other values: as struct fields, function arguments, function return values and even as the element type of another channel:

// a channel which:
//  - you can only write to
//  - holds another channel as its value
c := make(chan<- chan bool)

// function accepts a channel as a parameter
func readFromChannel(input <-chan string) {}

// function returns a channel
func getChannel() chan bool {
     b := make(chan bool)
     return b
}

For writing and reading operations on a channel there is the <- operator. Its position relative to the channel variable determines whether it is a read or a write operation. The following example demonstrates its usage, but I have to warn you that this code does not work, for reasons described later:

func main() {
     c := make(chan int)
     c <- 42    // write to a channel
     val := <-c // read from a channel
     println(val)
}

Now that we know what channels are, how to create them, and how to perform basic operations on them, let’s return to our very first example and see how channels can help us.

func main() {
     // Create a channel to synchronize goroutines
     done := make(chan bool)

     // Execute println in goroutine
     go func() {
          println("goroutine message")

          // Tell the main function everything is done.
          // This channel is visible inside this goroutine because
          // it is executed in the same address space.
          done <- true
     }()

     println("main function message")
     <-done // Wait for the goroutine to finish
}

This program is guaranteed to print both messages. Why? The done channel has no buffer (we did not specify its capacity). All operations on unbuffered channels block the execution until both the sender and the receiver are ready to communicate. That’s why unbuffered channels are also called synchronous. In our case, the read operation <-done in the main function blocks its execution until the goroutine writes data to the channel. Thus the program ends only after the read operation succeeds.

If a channel has a buffer, all read operations succeed without blocking if the buffer is not empty, and write operations - if the buffer is not full. Such channels are called asynchronous. Here is an example demonstrating the difference:

func main() {
     message := make(chan string) // no buffer
     count := 3

     go func() {
          for i := 1; i <= count; i++ {
               fmt.Println("send message")
               message <- fmt.Sprintf("message %d", i)
          }
     }()

     time.Sleep(time.Second * 3)

     for i := 1; i <= count; i++ {
          fmt.Println(<-message)
     }
}

In this example message is a synchronous channel and the output of the program is:

send message
// wait for 3 seconds
message 1
send message
send message
message 2
message 3

As you can see, after the first write to the channel in the goroutine, all further write operations on that channel are blocked until the first read operation is performed (about 3 seconds later).

Now let’s provide a buffer to our message channel, i.e. the creation line becomes message := make(chan string, 2). This time the output will be the following:

send message
send message
send message
// wait for 3 seconds
message 1
message 2
message 3

Here we see that the first two writes complete immediately, because they fit into the channel’s buffer; the third write blocks, but its send message line has already been printed by then. By changing a channel’s capacity we can control the amount of information being processed, thus limiting the throughput of a system.
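A quick way to observe the buffer state (a small sketch of my own, not from the original text) is the built-in len and cap functions, which report how many values are currently queued and the total capacity:

```go
package main

import "fmt"

func main() {
	message := make(chan string, 2)

	message <- "message 1"
	fmt.Println(len(message), cap(message)) // 1 2

	message <- "message 2"
	fmt.Println(len(message), cap(message)) // 2 2 - the next send would block
}
```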

Deadlock

Now let’s get back to our not working example with read/write operations.

func main() {
     c := make(chan int)
     c <- 42    // write to a channel
     val := <-c // read from a channel
     println(val)
}

On running you’ll get this error (details will differ):

fatal error: all goroutines are asleep - deadlock!

goroutine 1 [chan send]:
main.main()
     /fullpathtofile/channelsio.go:5 +0x54
exit status 2

This error is called a deadlock - a situation in which two goroutines wait for each other and neither of them can proceed. Golang can detect deadlocks at runtime, which is why we see this error. It occurs because of the blocking nature of the communication operations.

The code here runs within a single goroutine, line by line, sequentially. The write to the channel (c <- 42) blocks the execution of the whole program because, as we remember, a write on a synchronous channel can only succeed if there is a receiver ready to take the data. And we create the receiver only on the next line.

To make this code work we should write something like:

func main() {
     c := make(chan int)

     // Make the writing operation be performed in
     // another goroutine.
     go func() {
          c <- 42
     }()
     val := <-c
     println(val)
}

Range channels and closing

In one of the previous examples we sent several messages to a channel and then read them. The receiving part of code was:

for i := 1; i <= count; i++ {
	 fmt.Println(<-message)
}

In order to perform the reads without getting a deadlock we have to know the exact number of sent messages (count, to be precise), because we cannot read more than were sent. But that is not very convenient; it would be nice to be able to write more general code.

In Golang there is a so-called range expression which allows iterating over arrays, strings, slices, maps and channels. For channels, the iteration proceeds until the channel is closed. Consider the following example (it does not work yet):

func main() {
     message := make(chan string)
     count := 3

     go func() {
          for i := 1; i <= count; i++ {
               message <- fmt.Sprintf("message %d", i)
          }
     }()

     for msg := range message {
          fmt.Println(msg)
     }
}

Unfortunately this code does not work yet: after the last message is received, the range keeps waiting and the program deadlocks. As was mentioned above, range iterates until the channel is closed explicitly. All we have to do is close the channel with the built-in close function. The goroutine becomes:

go func() {
     for i := 1; i <= count; i++ {
          message <- fmt.Sprintf("message %d", i)
     }
     close(message)
}()

Closing a channel has one more useful property: reads on a closed channel never block and always return the zero value of the channel’s element type:

done := make(chan bool)
close(done)

// Will not block and will print false twice,
// because false is the zero value for the bool type
println(<-done)
println(<-done)
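Because a closed channel yields the zero value, a plain receive cannot tell a closed channel apart from one that delivered a real zero. The comma-ok form of the receive operation makes the distinction; here is a small sketch:

```go
package main

import "fmt"

func main() {
	c := make(chan int, 1)
	c <- 0 // a real zero value sent on the channel
	close(c)

	v, ok := <-c
	fmt.Println(v, ok) // 0 true: a value that was actually sent

	v, ok = <-c
	fmt.Println(v, ok) // 0 false: the channel is closed and drained
}
```

The second result, ok, is false only when the channel is closed and empty.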

This feature may be used for goroutines synchronization. Let’s recall one of our examples with synchronization (the one with done channel):

func main() {
     done := make(chan bool)

     go func() {
          println("goroutine message")

          // We are only interested in the fact of sending itself, 
          // but not in data being sent.
          done <- true
     }()

     println("main function message")
     <-done 
} 

Here the done channel is only used to synchronize execution, not to send data. There is a common pattern for such cases:

func main() {
     // Data is irrelevant
     done := make(chan struct{})

     go func() {
          println("goroutine message")

          // Just send a signal "I'm done"
          close(done)
     }()

     println("main function message")
     <-done
} 

Since we close the channel in the goroutine, the read does not block and the main function continues to run.

Multiple channels and select

In real programs you’ll probably need more than one goroutine and one channel. The more independent parts there are, the more effective synchronization is needed. Let’s look at a more complex example:

func getMessagesChannel(msg string, delay time.Duration) <-chan string {
     c := make(chan string)
     go func() {
          for i := 1; i <= 3; i++ {
               c <- fmt.Sprintf("%s %d", msg, i)
               // Wait before sending next message
               time.Sleep(time.Millisecond * delay)
          }
     }()
     return c
}

func main() {
     c1 := getMessagesChannel("first", 300)
     c2 := getMessagesChannel("second", 150)
     c3 := getMessagesChannel("third", 10)

     for i := 1; i <= 3; i++ {
          println(<-c1)
          println(<-c2)
          println(<-c3)
     }
}

Here we have a function that creates a channel and spawns a goroutine which populates the channel with three messages at a specified interval. Since the third channel c3 has the smallest interval, we expect its messages to appear before the others. But the output will be the following:

first 1
second 1
third 1
first 2
second 2
third 2
first 3
second 3
third 3

Obviously we got sequential output. That is because the read on the first channel blocks for up to 300 milliseconds on each loop iteration, and the other reads must wait for it. What we actually want is to read messages from all channels as soon as there are any.

For communication over multiple channels there is a select statement in Golang. It is much like the usual switch, but all cases here are communication operations (both reads and writes). If the operation in a case can proceed, the corresponding block of code executes. So, to accomplish what we want, we can write:

for i := 1; i <= 9; i++ {
     select {
     case msg := <-c1:
          println(msg)
     case msg := <-c2:
          println(msg)
     case msg := <-c3:
          println(msg)
     }
}

Pay attention to the number 9: there were 3 writes to each of the three channels, so I have to perform 9 iterations of the select statement. In a program meant to run as a daemon it is common practice to run select in an infinite loop, but here that would cause a deadlock once all the messages are consumed.
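One way to run select in a loop without hardcoding the message count is to add a second channel that signals completion. This is a sketch of that idea, simplified to a single data channel:

```go
package main

import "fmt"

func main() {
	c := make(chan string)
	quit := make(chan struct{})

	go func() {
		for i := 1; i <= 3; i++ {
			c <- fmt.Sprintf("message %d", i)
		}
		close(quit) // signal that no more messages will come
	}()

	for {
		select {
		case msg := <-c:
			fmt.Println(msg)
		case <-quit:
			fmt.Println("done")
			return
		}
	}
}
```

Because quit is closed only after the last send completes, the loop drains every message and then exits instead of deadlocking.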

Now we get the expected output, and none of the reading operations blocks the others. The output is:

first 1
second 1
third 1 // this channel does not wait for others
third 2
third 3
second 2
first 2
second 3
first 3

Conclusion

Channels are a very powerful and interesting mechanism in Golang, but to use them effectively you have to understand how they work. In this article I tried to explain the essential basics. For further learning I recommend you look at the following:

My first hackathon (AngelHack, Moscow)

TL;DR It is inspiring and motivating. And the butler is a killer.

Last week I received an email inviting me to a hackathon that was going to happen on the weekend. Until then I didn’t know much about these events - in my head there were only scrappy pieces of information gathered from different blogs and news sites, mixed pell-mell with photos of crowds of programmers sitting in comparatively small rooms and coding, and coding, and coding… Well, of course I’d like to be there.

Though there was a small problem: only 3 days were left before the event, and I had neither an idea nor a team. The chances of winning under such circumstances are slightly less than nothing. When I say “no idea” I don’t mean I had no ideas at all; on the contrary, I had a whole bunch of ideas which (to be honest) I would never implement by myself.

So I opened Skype and half-jokingly wrote to my friends something like “there is a hackathon this weekend. any ideas?”. Of course they had. We spent the whole evening trying to choose something we’d be able to develop and that seemed to suit the format of a hackathon. But as we were babes and sucklings in this sphere and none of us could clearly define what on earth “the format of a hackathon” was, we did not choose anything.

Nevertheless, we were set to participate - if not with our own project, then as part of another team. I should say I felt quite uncomfortable going there without a prepared project. I thought we would look stupid among those cool guys with MacBooks discussing their great ideas while drinking coffee. I was pleasantly surprised to find I was wrong.

Of course there were well-prepared teams with thoroughly researched ideas, but about one-third of the people were just like us - no concrete plan, no team, just a desire to be part of an event like this. And it was great! People came there to talk with others, to share their points of view, to learn things and to get inspiration. Wow! I really enjoyed being a part of that!

What about our results? Well, we failed. No, better to say - we did not win, because despite the fact that there was a first-place prize, people came not for that; they came for the experience.