IEEE Globecom postmortem

Ulrich Speidel attended IEEE Globecom in Abu Dhabi in December 2018 to present a paper on bufferbloat vs. TCP queue oscillation in satellite links. The paper questions the wisdom of using buffers in the order of magnitude of the bandwidth-delay product on shared satellite Internet links, a topic already discussed here and here. It used data from the simulator to demonstrate that buffers dimensioned the conventional way are prone to large standing queues, which add latency for everyone but don’t act as shock absorbers. This practice was shown to have the same effect on Internet routers a long time ago and is no longer seen as best practice for terrestrial router buffers, but it has continued unabated in the satellite industry – ironically the corner of the Internet with the highest inherent latencies and the most expensive bandwidth resource.
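To make the buffer sizing argument concrete, here is a minimal sketch of the arithmetic behind a buffer dimensioned at one bandwidth-delay product. The link rate and round-trip time are illustrative assumptions, not figures from the paper, and the function names are made up for this example.

```python
def bdp_bytes(rate_bit_per_s: float, rtt_s: float) -> float:
    """Bandwidth-delay product in bytes: link rate times round-trip time."""
    return rate_bit_per_s * rtt_s / 8


def queue_delay_s(queue_bytes: float, rate_bit_per_s: float) -> float:
    """Time needed to drain a queue of the given size at the link rate."""
    return queue_bytes * 8 / rate_bit_per_s


# Assumed example link: 16 Mbit/s with a 500 ms round-trip time.
RATE = 16e6
RTT = 0.5

buffer_size = bdp_bytes(RATE, RTT)             # bytes in a one-BDP buffer
standing_delay = queue_delay_s(buffer_size, RATE)

print(f"BDP-sized buffer: {buffer_size:.0f} bytes")
print(f"Delay added when the buffer is full: {standing_delay:.2f} s")
```

Whatever the link rate, a full buffer of one bandwidth-delay product adds one extra round-trip time of queueing delay, which is why a standing queue in such a buffer hurts most on links where the round-trip time is already long.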

As is usual at events such as these, one gets to talk to people afterwards. In this case, I was approached by a very senior and well-respected colleague from industry (VP Technology of a major satellite network provider serving direct-to-site end customers in the Americas). He told me that the paper had just solved a technical mystery for him – they were seeing the same added latency effects on some of their client site to network uplinks and couldn’t figure out why this was happening for some clients but not others. Well, they do now, and I hope their clients (probably primarily those sharing their connections locally) are enjoying better performance already.

Simulator adds storage and modernises parts of the satellite link chain

As part of our annual CAPEX round, we were able to upgrade the satellite emulator server (sats-em), which, as the heart of the original machines, had one of the oldest CPUs on the block. Seeing that the disk on our existing storage server was nearing capacity, we also invested in two large NAS storage servers with almost 200 TB of RAID capacity combined. Each experiment generates between about 40 MB and 1.2 GB worth of data, and we can run almost 100 experiments per day when the going is good. The upgrade coincided with the annual electrical inspection of the simulator (which required over 200 power cables, power boards, power distributors and any mains-powered devices to be safety inspected). While we passed the inspection with flying colours, we’re now having to ensure that the sparky reconnected everything correctly – in particular our network cables… fun in the dark of the cabinets!
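For a rough sense of how far the new storage goes, the figures above can be turned into a back-of-the-envelope estimate. This sketch assumes decimal units and treats "almost 200 TB" and "almost 100 experiments per day" as round upper bounds; it ignores data already on disk.

```python
MB = 10**6
GB = 10**9
TB = 10**12

CAPACITY = 200 * TB          # combined NAS capacity, rounded up
EXPERIMENTS_PER_DAY = 100    # best-case daily throughput, rounded up


def days_until_full(bytes_per_experiment: int) -> float:
    """Days of continuous best-case operation before the storage fills."""
    return CAPACITY / (bytes_per_experiment * EXPERIMENTS_PER_DAY)


small = days_until_full(40 * MB)     # smallest experiments
large = days_until_full(1200 * MB)   # largest experiments (1.2 GB)

print(f"40 MB experiments:  {small:.0f} days")
print(f"1.2 GB experiments: {large:.0f} days (about {large / 365:.1f} years)")
```

Even if every experiment produced the maximum 1.2 GB, the estimate comes to roughly 1,667 days, or about four and a half years of running flat out.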

Simulator project wins ISIF Asia Network Operations Research Grant

We’re pleased to report that ISIF Asia have chosen us again as a winner of one of their Network Operations Research Grants to the value of AU$45,000 in their 2017 round. Among other things, the grant will be used to develop our network coding software further. We thank ISIF Asia for their renewed expression of trust in the work that we do, and thank all those who helped us get there!

Simulator recommissioned after upgrade

Following our acquisition of additional hardware, we have upgraded the simulator. This included installation of a new 7-foot 19-inch rack, which we didn’t quite realise would turn up fully assembled on a pallet – and I wasn’t even there when it arrived. Lei was in the vicinity and, with the help of some other wonderful souls, managed to direct the truck to the right entrance. He also knew that there was only one lift that would handle racks that size – moving the two existing ones from the 5th floor to the new 4th floor lab had been a bit of an exercise, but on this occasion it paid off!

Liam Scott-Russell joined us again as an intern for a couple of weeks during his high school holidays in July & together with Lei set up the new servers – a total of 15. So when I got back from ISIT in Aachen, much of the hardware work was done.

Lei in front of the upgraded simulator. The two racks on the left contain the island machines – 10 Intel NUCs and 96 Raspberry Pis. The world servers sit in the rack on the far right and at the top of the rack left of it. This rack also contains the satellite chain (sat emulator, 2xPEP + 2xencoder/decoder), the copper taps (blue boxes), and the capture machines that record the data off the taps.

Recommissioning the simulator was another story altogether! Most of the work went into two aspects: getting the scripting upgraded and getting the new command & control machine set up. One of the lessons from the existing setup was that troubleshooting often involved having a large number of terminal windows open, and with a few extra monitors from leftover ISIF funding, we were able to assemble a nice large 2×2 array of 27″ screens – enough to keep a dozen or so terminals in constant view. Getting this to work was another matter – it took a while to learn that newer versions of Ubuntu require the Composite extension to run their Unity desktop, so we had to switch to xfce4 in order to run Xinerama, which is incompatible with Composite but an absolute must-have for a contiguous screen experience.

Scripting: We added dozens of new scripts, modularised even more than before, and added a lot of error detection and handling to ensure that we would learn about problems early. We’ve also implemented a new directory structure for our data.

The addition of two more machines to the satellite chain, as well as the move of the monitoring to dedicated capture machines, necessitated quite a bit of network reconfiguration as well. We can now run PEP traffic through a coded tunnel, too.

Another new feature is a special purpose server on the world side of the simulator, which produces baseline iperf3 and ping measurements to ascertain queue sojourn time and lets us monitor the performance of large standardised TCP transfers. Previously, this load was shouldered by one of the world servers.
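One simple way to get from ping measurements to a queue sojourn estimate is to subtract the unloaded baseline round-trip time from each measured round-trip time. The sketch below assumes that approach and Linux-style ping output; it is not necessarily how our scripts do it, and the function names, sample output and 500 ms baseline are made up for illustration.

```python
import re

# Matches the "time=... ms" field in Linux-style ping reply lines.
_RTT_FIELD = re.compile(r"time=([\d.]+) ms")


def rtts_ms(ping_output: str) -> list[float]:
    """Extract all round-trip times (in ms) from ping output text."""
    return [float(value) for value in _RTT_FIELD.findall(ping_output)]


def sojourn_estimates_ms(rtts: list[float], baseline_ms: float) -> list[float]:
    """Estimate queueing delay as RTT minus the unloaded baseline RTT.

    Values are clamped at zero because a sample can dip slightly
    below the baseline through measurement noise.
    """
    return [max(0.0, rtt - baseline_ms) for rtt in rtts]


# Hypothetical ping output and baseline RTT, for illustration only.
SAMPLE = (
    "64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=512 ms\n"
    "64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=730.5 ms\n"
)

print(sojourn_estimates_ms(rtts_ms(SAMPLE), baseline_ms=500.0))
```

The estimate lumps together queueing in both directions, so it is an upper bound on the sojourn time at any single queue along the path.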

We also took a good look at our terrestrial latency distribution and now use a distribution that is based on empirical data from Rarotonga rather than the educated guesswork we used in the simulator’s first edition. Average terrestrial latency has increased a little as a result.
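Drawing latencies from an empirical distribution can be as simple as resampling the measured values. The following is a minimal sketch of that idea, not the simulator's actual code; the sample values are invented placeholders rather than the Rarotonga measurements, and `make_latency_sampler` is a hypothetical name.

```python
import random
from typing import Callable, Iterable, Optional


def make_latency_sampler(
    samples_ms: Iterable[float], seed: Optional[int] = None
) -> Callable[[], float]:
    """Return a function that draws latencies from the empirical samples.

    Each call picks one measured value uniformly at random, so values
    that occur more often in the measurements are drawn more often.
    """
    data = list(samples_ms)
    if not data:
        raise ValueError("need at least one latency sample")
    rng = random.Random(seed)
    return lambda: rng.choice(data)


# Placeholder measurements in milliseconds (invented for illustration).
MEASURED_MS = [12.0, 12.0, 35.0, 80.0, 80.0, 80.0, 140.0, 210.0]

draw = make_latency_sampler(MEASURED_MS, seed=1)
world_server_latencies = [draw() for _ in range(10)]
print(world_server_latencies)
```

Resampling keeps the shape of the measured distribution, including any clusters and long tails, without having to fit a parametric model to it.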

The lab setup at the time of writing with Lei at the command and control seat.

The first experiments are now underway – essentially just a repeat of uncoded baselines with recommended queue capacities to ensure that we have a set of results that is directly comparable when we move into coding and PEP territory again soon. First indications are that goodput with the new latency distribution is a little lower than before, which supports our conjecture that island ISPs should choose the location of their world-side teleport carefully. We’ll look more into this a bit further down the track!

At this point, I hope to have completed the baselines in about a week.

One of the upshots of having upgraded and reworked our scripting is that we can now farm out some of the trace conversion to the new capture machines, which parallelises and hence reduces the time it takes to run an experiment by around 25%.

Data coming soon!

Simulator gets boost with additional hardware

An additional Faculty of Science CAPEX allocation in 2017 has enabled us to add significantly to the simulator’s hardware. We have just received 15 further Super Micro servers and four Gigabit Ethernet copper taps, as well as a new 7-foot 19″ rack to accommodate the extra equipment.

Eight of the new servers will boost our fleet of “world servers”, allowing us to project a larger diversity of terrestrial latencies.

Another of the new servers will replace the current machine used for storage, command and control – a standard student-issue desktop at this point in time. The new machine will add plenty of storage capacity for log files as well as the ability to connect additional screens, a very useful capability if you’re trying to work with well over 100 machines at a time.

Two of the servers will act as dedicated traffic capture machines on the world and island sides of the simulated link. Together with the copper taps, they will be able to monitor traffic either side of an encoder/decoder and/or PEP. Currently, the capture functionality still resides on the encoder/decoder machines, which is not ideal as the capture functionality competes with encoding/decoding for resources. Having dedicated capture machines allows us to simply observe the network without placing any load on any of its components.

Two further machines will separate PEP and encoder/decoder functionality on both sides of the link, meaning that we will be able to run coded experiments with PEP and still be able to investigate which of the two techniques any gains accrue to.

Of the remaining two machines, one will act as a spare world server and the other as a special purpose world server (e.g., for the provision of UDP traffic across the link).

The extra hardware is currently being configured and will be contributing to experimentation within the next few weeks.

Best paper award at SPACOMM 2017

We’re pleased to report that our paper

“Topologies for the Provision of Network-Coded Services via Shared Satellite Channels”

by Ulrich Speidel, ‘Etuate Cocker, Muriel Médard, Janus Heide, and Péter Vingelmann scooped up one of two best paper awards at SPACOMM 2017 in Venice:

https://www.iaria.org/conferences2017/AwardsSPACOMM17.html

Thanks to Muriel Médard for presenting the paper on our behalf!