We’ve been experimenting with breaking up employees into random groups (of size 4) and setting up video hangouts between them. We’re doing this to replace the serendipitous meetings that sometimes occur around coffee machines, in lunch lines or while waiting for the printer. And also, we just want people to get to know each other.

Which lead to me writing some code. The core of which is *divide n elements into groups of at least size g minimizing the size of each group*. So, suppose an office has 15 employees in it then it would be divided into three groups of sizes 5, 5, 5; if an office had 16 employees it would be 4, 4, 4, 4; if it had 17 employees it would be 4, 4, 4, 5 and so on.

I initially wrote the following code (in Python):

```
groups = [g] * (n//g)
for e in range(0, n % g):
groups[e % len(groups)] += 1
```

The first line creates `n//g`

(`//`

is integer division) entries of size `g`

(for example, if `g == 4`

and `n == 17`

then `groups == [4, 4, 4, 4]`

). The `for`

loop deals with the ‘left over’ parts that don’t divide exactly into groups of size `g`

. If `g == 4`

and `n == 17`

then there will be one left over element to add to one of the existing `[4, 4, 4, 4]`

groups resulting in `[5, 4, 4, 4]`

.

The `e % len(groups)`

is needed because it’s possible that there are more elements left over after dividing into equal sized groups than there are entries in `groups`

. For example, if `g == 4`

and `n == 11`

then `groups`

is initially set to `[4, 4]`

with three left over elements that have to be distributed into just two entries in `groups`

.

So, that code works and here’s the output for various sizes of `n`

(and `g == 4`

):

```
4 [4]
5 [5]
6 [6]
7 [7]
8 [4, 4]
9 [5, 4]
10 [5, 5]
11 [6, 5]
12 [4, 4, 4]
13 [5, 4, 4]
14 [5, 5, 4]
15 [5, 5, 5]
16 [4, 4, 4, 4]
17 [5, 4, 4, 4]
```

But the code irritated me because I felt there must be a simple formula to work out how many elements should be in each group. After noodling on this problem I decided to do something that’s often helpful… make the problem simple and naive, or, at least, the solution simple and naive, and so I wrote this code:

```
groups = [0] * (n//g)
for i in range(n):
groups[i % len(groups)] += 1
```

This is a really simple implementation. I don’t like it because it loops `n`

times but it helps visualize something. Imagine that `g == 4`

and `n == 17`

. This loop ‘fills up’ each entry in `groups`

like this (each square is an entry in `groups`

and numbers in the squares are values of `i`

for which that entry was incremented by the loop).

So groups ends up being `[5, 4, 4, 4]`

. What this helps see is that the number of times `groups[i]`

is incremented depends on the number of times the `for`

loop ‘loops around’ on the `i`

th element. And that’s something that’s easy to calculate without looping.

So this means that the code is now simply:

```
groups = [1+max(0,n-(i+1))//(n//g) for i in range(n//g)]
```

And to me that is more satisfying. `n//g`

is the size of `groups`

which makes the loop update each entry in `groups`

once. Each entry is set to `1 + max(0, n-(i+1))//(n//g).`

You can think of this as follows:

1. The `1`

is the first element to be dropped into each entry in `groups`

.

2. `max(0, n-(i+1))`

is the number of elements left over once you’ve placed `1`

in each of the elements of `groups`

up to position `i`

. It’s divided by `n//g`

to work out how many times the process of sharing out elements (see the naive loop above) will loop around.

If #2 there isn’t clear, consider the image above and imagine we are computing `groups[0]`

(`n == 17`

and `g == 4`

). We place `1`

in `groups[0]`

leaving 16 elements to share out. If you naively shared them out you’d loop around four times and thus need to add 16/4 elements to `groups[0]`

making it 5.

Move on to `groups[1]`

and place a `1`

in it. Now there are 15 elements to share out, that’s 15/4 (which is 3 in integer division) and so you place 4 in `groups[1]`

. And so on…

And that solution pleases me most. It succinctly creates `groups`

in one shot. Of course, I might have over thought this… and others might think the other solutions are clearer or more maintainable.

Go to Source of this post

Author Of this post: John Graham-Cumming