sondehub



Video, WebSockets in the frontend, predictions and history update

It's only when I look back at the last post that I realise just how much has changed in the SondeHub ecosystem!

SondeHub video

First up, a little treat. Due to the Melbourne lockdown, the AWS Melbourne user group was live streamed rather than held as the typical in-person event. This has the bonus that the talks were recorded and published, one of which was me talking about SondeHub infrastructure.

If you want to check that out head to this YouTube link. 

WebSockets update

Luke has been smashing out updates! Recent frontend changes have brought websockets onto the website, and in such a seamless fashion you might not have even noticed. The bottom right hand corner now displays whether WebSockets is connected and the number of messages you're receiving. This lets your browser get the latest updates as quickly as they come in, with no more need to wait 5 seconds for the polling interval.

For this to remain scalable we've had to make some more improvements to our websockets system, so keep reading as I'll add some technical details down below.

Predictions

snh has done some great work dockerizing the prediction apps that we use. At the moment we use the publicly hosted Tāwhirimātea API endpoint for our predictions, and this has a few drawbacks. We are likely one of the heavier loads on their system, and while I'm unaware of any performance impact we might be causing, it's probably best to run something ourselves. The other concern is that we can't maintain it, so it becomes a risk for us if there are service disruptions like we recently saw.

Since we now have docker images for these components we'll likely be running our own predictions shortly. What's really neat is that NOAA publish the required GFS data into an S3 bucket so downloading these will be cheap and fast from our backend.
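As a rough sketch of what fetching that data could look like, here is how the object locations might be built. The bucket name, key layout and the 0.5 degree product below are my assumptions about NOAA's public GFS dataset, not details taken from our predictor setup, so check them against the NOAA listing before relying on them.

```python
from datetime import datetime, timezone

# Assumptions (not stated in this post): the public NOAA GFS bucket name,
# its key layout, and the 0.5 degree product.
GFS_BUCKET = "noaa-gfs-bdp-pds"
GFS_RESOLUTION = "0p50"


def gfs_object_key(run: datetime, forecast_hour: int,
                   resolution: str = GFS_RESOLUTION) -> str:
    """Build the assumed S3 key for one GFS forecast file."""
    if run.hour not in (0, 6, 12, 18):
        raise ValueError("GFS model runs start at 00, 06, 12 or 18 UTC")
    if forecast_hour < 0:
        raise ValueError("forecast_hour must not be negative")
    return (
        f"gfs.{run:%Y%m%d}/{run:%H}/atmos/"
        f"gfs.t{run:%H}z.pgrb2.{resolution}.f{forecast_hour:03d}"
    )


def gfs_object_url(run: datetime, forecast_hour: int) -> str:
    """Build a plain HTTPS URL for the same object."""
    key = gfs_object_key(run, forecast_hour)
    return f"https://{GFS_BUCKET}.s3.amazonaws.com/{key}"


if __name__ == "__main__":
    run = datetime(2021, 8, 1, 6, tzinfo=timezone.utc)
    print(gfs_object_url(run, 3))
```

Pulling these from inside AWS is where the "cheap and fast" part comes from, as the data never has to come from an external mirror.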

History

Admittedly I haven't had a huge amount of time to continue on the history work, however it has progressed. Technical details have been worked through and some code has been written in this space. It certainly looks like a viable option.


And that's the update so far. Stick around if you want to learn some more technical details around websockets.

~ Michaela.


A deeper dive into WebSockets

Previously when we implemented websockets we were just targeting the sondehub CLI users and people integrating with our services. For this we could get away with a single node, which could easily handle a couple of hundred users. Switching the frontend over to websockets, we quickly realised that this wouldn't be scalable for times when meteorology organisations live stream or share links to our site.

Luckily mosquitto (our MQTT/websockets broker) allows bridging of servers. This lets you replicate data from one server to another. So a simple plan was devised.

The idea is that we could have a single writer node, and an autoscaling group of reader nodes. The reader nodes would automatically connect to the writer and start replicating the data. We just scale based on the CPU load of the readers and we should be good.

There are a few technical challenges to overcome in this though. First off is having the reader nodes discover how to connect to the writer node. Now ideally you'd just have a single writer node, but we also need to account for times where the writer node needs to be replaced. Further adding to the difficulty of this task is that AWS ECS services only allow you to attach a service to either an ALB or an NLB, not both. An NLB would be perfect for this job as we need the readers to connect over the binary endpoint. So instead I used a little hack I learnt a long time ago.

Connecting mosquitto to another server requires lines like this:

connection to_writer
addresses 172.16.1.3:1883 172.16.1.4:1883
topic sondes/# in 0
notifications false
try_private false
bridge_outgoing_retain false
restart_timeout 3
round_robin true

So we need to define where mosquitto is connecting to, and in this case we have two IP addresses listed. One of these will be active and the other is only used while the writer server is being replaced.
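If you wanted to template that bridge block rather than hand edit it, a minimal sketch could look like the following. The function name and its defaults are made up for illustration; the option lines are the same ones shown above.

```python
from typing import Sequence


def render_bridge_config(writer_ips: Sequence[str], port: int = 1883,
                         name: str = "to_writer",
                         topic: str = "sondes/#") -> str:
    """Render a mosquitto bridge block for a reader node.

    Hypothetical helper: it only reproduces the options from this post,
    with the candidate writer addresses filled in.
    """
    if not writer_ips:
        raise ValueError("at least one writer address is required")
    addresses = " ".join(f"{ip}:{port}" for ip in writer_ips)
    lines = [
        f"connection {name}",
        f"addresses {addresses}",
        f"topic {topic} in 0",           # pull matching topics in at QoS 0
        "notifications false",
        "try_private false",
        "bridge_outgoing_retain false",
        "restart_timeout 3",
        "round_robin true",              # treat the listed addresses as equals
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_bridge_config(["172.16.1.3", "172.16.1.4"]), end="")
```

Keeping the address list short is the whole point here, as mosquitto works through the candidates one at a time when it reconnects.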

Create the smallest possible subnet you can in AWS. AWS will reserve some IPs for internal use. At time of writing the smallest subnet you can make is a /28. With a /28, after the reserved IP addresses, that leaves 12 IP addresses our writer server could be on. This is no good as it would take mosquitto far too long to check each one of those IP addresses.

Instead what we do is reserve 10 of the IP addresses in this subnet so that they can't be used. Under EC2 network interfaces you can create network interfaces, each of which allocates a private IP address without costing any money. These aren't used for anything and will sit idle.

This provides us with two remaining IP addresses that mosquitto could be running on, and is fast enough for a mosquitto reader instance to find the writer instance on boot.

Now the next challenging part is autoscaling. ALBs by default will round-robin requests. This is a fine approach if you're doing short web requests, but if you are using an autoscaling group with long-running websockets you'll end up with very unbalanced load when a new server is added.

Luckily there is an easy fix for this one. Under the target group configuration you can set the load balancing algorithm to least outstanding requests.
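To see why this matters for long-lived connections, here is a toy simulation rather than anything we actually run. It assumes no connections close while new ones arrive, and the connection counts are invented. In AWS, as far as I know, this setting is the target group attribute load_balancing.algorithm.type with the value least_outstanding_requests.

```python
from itertools import cycle
from typing import List, Sequence


def round_robin(loads: Sequence[int], new_connections: int) -> List[int]:
    """Hand new connections to each server in turn, ignoring current load."""
    result = list(loads)
    targets = cycle(range(len(result)))
    for _ in range(new_connections):
        result[next(targets)] += 1
    return result


def least_outstanding(loads: Sequence[int], new_connections: int) -> List[int]:
    """Hand each new connection to whichever server currently has the fewest."""
    result = list(loads)
    for _ in range(new_connections):
        idx = min(range(len(result)), key=result.__getitem__)
        result[idx] += 1
    return result


if __name__ == "__main__":
    # Two busy readers plus one freshly launched reader (invented numbers).
    existing = [150, 150, 0]
    print(round_robin(existing, 90))        # [180, 180, 30]
    print(least_outstanding(existing, 90))  # [150, 150, 90]
```

With round-robin the two old servers keep getting busier even though a new one is sitting nearly idle, which is exactly the imbalance described above.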

Now the last thing I wanted to touch on with the websocket configuration is bandwidth.

When we switched from API calls to websockets we were expecting improved performance and cheaper hosting. When we made the switch, however, we saw improved performance and a more expensive bill.

So what happened here? Well, it turns out that the mosquitto server doesn't support compressing payload data. This means that due to the JSON nature of our payloads we were wasting a lot of bandwidth. When we were using the API, the API would compress these for us.
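A quick way to get a feel for how much repetitive JSON shrinks is to compress a batch of messages yourself. This is only an illustration using zlib from the Python standard library; the field names and values are invented, and it is not the code from our mosquitto patch.

```python
import json
import zlib
from typing import Dict, List, Tuple


def make_frames(count: int) -> List[Dict[str, object]]:
    """Build made-up telemetry frames; the field names are hypothetical."""
    return [
        {
            "serial": f"S{1000000 + i % 5}",
            "frame": i,
            "lat": round(-37.8 + i * 0.0001, 5),
            "lon": round(144.9 + i * 0.0001, 5),
            "alt": round(1000.0 + i * 5.0, 1),
            "temp": -10.5,
            "humidity": 45.2,
        }
        for i in range(count)
    ]


def compression_summary(payload: bytes) -> Tuple[int, int]:
    """Return (raw size, compressed size) in bytes after a round-trip check."""
    compressed = zlib.compress(payload)
    if zlib.decompress(compressed) != payload:
        raise RuntimeError("round trip through zlib changed the payload")
    return len(payload), len(compressed)


if __name__ == "__main__":
    raw = json.dumps(make_frames(50)).encode("utf-8")
    raw_size, compressed_size = compression_summary(raw)
    print(f"raw: {raw_size} bytes, compressed: {compressed_size} bytes")
```

The repeated key names in every message are what make JSON such a good candidate for compression.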

So the solution here, after exploring a lot of options and yak shaving, ended up being adding compression to mosquitto. We are now running our own patched version and I think this graph speaks for itself.

Can you tell when we switched to our patched version? While the changes we made to make it work aren't suitable to be lodged as a pull request, I've opened up an issue on the upstream project with the details on how we added it in.


Thanks for sticking around for the dive into websockets. Happy sonde hunting.

~ Michaela.

