#fediversedotgames


Today's Peertube update:

  • Did a significant amount of work optimising and increasing the security on our object storage.
  • Replaced some of the hacky things I did to get an MVP with better strategic options.
  • Availability is high and speeds appear strong.
  • Horizontal Pod Autoscaling is now set up for our runners so we can handle periods of increased uploads/streams.

Need to work on:

  • Re-evaluate resources on runners (I think we can vertically scale them up a little, and we have the spare resources to accommodate it).
  • Graceful shutdown of runners when scaling down.
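
For the graceful shutdown item, here is a rough Python sketch of the general idea rather than how the Peertube runner actually behaves. It assumes the runner receives SIGTERM when its pod is scaled down, which is what Kubernetes sends by default. `GracefulRunner`, `request_stop` and `install_sigterm_handler` are hypothetical names made up for illustration.

```python
import signal
import threading


class GracefulRunner:
    """Hypothetical worker: runs jobs one at a time and stops cleanly when asked."""

    def __init__(self, jobs):
        self.jobs = list(jobs)      # pending jobs (placeholder payloads)
        self.completed = []         # jobs that ran to completion
        self.stop_requested = threading.Event()

    def request_stop(self, signum=None, frame=None):
        # Matches the signal handler signature. It only sets a flag,
        # so the job that is currently running is allowed to finish.
        self.stop_requested.set()

    def run(self, process):
        # The flag is checked between jobs, never in the middle of one.
        while self.jobs and not self.stop_requested.is_set():
            job = self.jobs.pop(0)
            process(job)
            self.completed.append(job)
        return self.completed


def install_sigterm_handler(runner):
    # Assumption: scale-down delivers SIGTERM before the pod is removed.
    signal.signal(signal.SIGTERM, runner.request_stop)
```

Anything still in `runner.jobs` after `run()` returns was never started, so it could be left for another runner to pick up. The pod's termination grace period would need to be longer than the longest expected job for this to help.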

We're not quite production ready but if you'd like an account to have a mess around and help test out the functionality, please let me know.

Some Peertube instance updates for today:

  • DNS issues experienced this morning have been resolved.
  • Object storage is stable and fast.
  • Peertube backend pods are running well.
  • There is a working design for remote runners.
  • The "scratch disk"/landing zone volume has been resized to allow more concurrent import/transcode jobs to run without running out of space.
  • Uptime is now tracked at status.fediverse.games

To work on:

  • Get some metrics into our observability platform to monitor for potential issues.
  • Set up Horizontal Pod Autoscaler for the remote runner pods to scale up when there is a large backlog of transcoding jobs.
  • Work out how to have remote runners deregister themselves from the instance when they're shutting down due to HPA policy (they currently register themselves fine but drop off without deregistering, leading to a lot of "dead" remote runner pods sitting in registered state).
  • Also need to take stock of the current hardware situation and see if adding some additional storage might be a good option.
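
To give a feel for how scaling on a transcoding backlog could work, here is a small Python sketch of the replica calculation described in the Kubernetes Horizontal Pod Autoscaler documentation: the desired count is the current count multiplied by the ratio of the observed metric to its target, rounded up. The metric (pending jobs per runner), the target and the min/max bounds are assumptions for illustration, not our real settings, and the sketch ignores the autoscaler's tolerance and stabilisation behaviour.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Approximate the HPA rule: ceil(current * observed / target), then clamp.

    current_metric and target_metric are hypothetical "pending jobs per
    runner" values; the bounds are placeholders.
    """
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Two runners, each averaging 12 pending jobs against a target of 4 per runner.
print(desired_replicas(2, 12, 4))
```

With a quiet queue the same function scales back down towards `min_replicas`, which is exactly when runners need to deregister cleanly.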

Peertube is coming to Fediverse.Games! You might have noticed a sneaky reshare of the first video on the service, a quick intro from me!

At the moment we're not open for registrations. I will be adding some of my old YouTube videos and some new ones going forward, and will be looking to analyse the impact any traffic has on our infrastructure or existing services.

Super keen to see how this one plays out!

tube.fediverse.games/

tube.fediverse.games: This is a friendly, inclusive server for all lovers of games, in their many forms! We love video games, board games, tabletop games and more!

It's been an exhausting couple of weeks. The forced commute definitely took it out of me last week and there's more to come this week. Haven't had any time at all to record or edit videos so there's likely to be nothing released this week (though it's possible we might sneak something out on Thursday).

On the side though I have been doing some work on bringing an additional service to the Fediverse.Games family. I hope to have some more news to share about that over the next couple of weeks.

Continued thread

I've been continuing to investigate this, and it appears the processes associated with some of our Ceph OSDs are consuming too much memory, causing one of our nodes to become unresponsive.

We'll work through some solutions over the next couple of days, but for the meantime we will implement scheduled restarts on the affected node, which will mitigate the issue.
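
As a rough illustration of the kind of check that could decide what to restart, here is a small Python sketch that picks out the OSD processes using more memory than a chosen budget. The function name, the OSD labels and the numbers are all made up for the example, and how the memory figures get collected is left out entirely.

```python
GIB = 1024 ** 3


def osds_over_budget(rss_by_osd, budget_bytes):
    """Return the ids of OSDs whose resident memory exceeds the budget, largest first."""
    over = {osd: rss for osd, rss in rss_by_osd.items() if rss > budget_bytes}
    return sorted(over, key=over.get, reverse=True)


# Hypothetical readings for three OSDs on one node, checked against a 4 GiB budget.
readings = {"osd.0": 3 * GIB, "osd.1": 6 * GIB, "osd.2": 5 * GIB}
print(osds_over_budget(readings, 4 * GIB))
```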

Meta's decision to gut its moderation functions is concerning, especially given that these functions were already incredibly weak - in particular in the areas of celebrity endorsement scams and romance scams.

Having a strong moderation process is a key pillar of our charter when it comes to federated instances, and I'm of the belief that it is untenable for us to continue to federate with Threads.

No concrete decision has been made, and I will have more to say on this matter.

We're currently experiencing intermittent issues with one of our nodes, which is causing problems with our static assets. You may occasionally experience images, including post media and profile pictures, loading slowly or not at all.

I believe this is being caused by an unrelated service that was impacted by the recent large scale outage we had, and I'm currently investigating the root cause to resolve the issue.

Okay, all redundancy has been restored across all Fediverse.Games services.

I promise to be transparent about what goes on with this instance, so I thought I'd take you through the cause.

Part of our storage infrastructure is built on Ceph, which is a clustered, self-healing storage solution.

One of our nodes is (now was) a reasonably old Dell Micro PC. I've been gradually working through a technology upgrade to remove these ultra-compact, low power devices. While they're really cool little devices, they have limited usage in a production environment (don't worry though, I've still got plenty of uses for them).

Unfortunately, while purging and removing one of the Ceph-managed disks on the mini PC, something went wrong that left a number of volumes with lost or out-of-sync objects, which completely blocked IO across the entire cluster.

All of our data is backed up to the cloud, so there was never any risk of significant permanent data loss, but it was easier and quicker to repair than restore, so that's what we went for.

Part of the delay (aside from needing to go to sleep for the night ahead of my first day back at work after the holidays - yes, great timing, I know!) was diagnosing the actual issue and taking the steps to set up Ceph to do the self-healing work on its own.

I've learnt some extra steps I could have taken to make sure this doesn't happen in future, as well as how to diagnose the issue and set the ball rolling on fixing it.

Alright, so we've just recovered from a major outage which lasted for around 18 hours.

I will post a more detailed explanation of what went wrong and the learnings shortly, but I just wanted to drop a quick post to let everyone know.

Services are mostly online now with the exception of search, which is still being repaired.

Services have reduced redundancy at this time so there may be further brief outages or slowness as this resolves.

I've switched the instance from vanilla Mastodon over to glitch-soc.

We now have access to a massive **2048** character limit for posts, more poll options and a few more bits and bobs to hopefully make life on the server a little bit more fun!

I've done some stability testing and monitoring and everything appears to be working absolutely fine, but I'll be continuing to monitor closely over the coming days.