jerry 17h ago • 100%
The modlog function in lemmy/mbin can be a bit confusing. I don't think the moderation actions of one instance propagate to the modlog of another instance. I think that is only really problematic in cases where the moderation action was taken on the home instance of a magazine/community.
jerry 1d ago • 100%
thanks. I think the name of the instance is the most important part - I can test federation with them to see what went wrong.
jerry 1d ago • 100%
At the moment, there isn't anything going on that would cause that. Can you tell me which other instance/which thread so I can try to troubleshoot?
jerry 3d ago • 100%
I am investigating…
I have sort of given up on fixing the problem, and will instead work on auto-detecting and auto-recovering when the problem happens.
jerry 6d ago • 100%
Did it take a long time before that error came back?
jerry 2w ago • 100%
In the worst case scenario, fedia.io will be around for at least 5 more years (that is apparently the standard sunset timeline). I have many more domain names if the worst comes, but there is a lot of expectation that the .io may somehow be saved due to the incredible amount of infrastructure built around it (GitHub.io, dockerhub.io, and of course fedia.io among many others).
I just saw this: https://every.to/p/the-disappearance-of-an-internet-domain I have no idea if it's real, but if it is, that will be most unfortunate
jerry 2w ago • 100%
I see it - it's working again and I'm trying to figure out why rabbitmq died yet again.
jerry 2w ago • 100%
I don’t know yet - it’s definitely not expected, so my guess is an unintentional bug in mbin somewhere. I am hoping to find a way to run a profiler or something similar to see what it’s doing.
jerry 2w ago • 100%
I’ve been in the business for about 25 years. I have a hard time recommending people jump in at this point, or at least I’d urge them to understand the current market first. While there is constant talk of “unfilled cyber jobs”, the reality is that many people with experience are struggling to find a job in the field.
If you can get in and find a stable spot at a good company, it’s a great career, but those are getting hard to find I fear. Anyhow, please do a bit of reading about the job market for cyber security before making a decision.
After I resolved the federation issue, I had to clean up a few things and so the site may have been unavailable for a bit. I'm done fussing with it and will keep an eye on it to make sure things are working. IF YOU SEE PROBLEMS - please let me know. As far as I know, I've fixed all of the federation and error 500 issues we've had, so please don't assume it's just more of the same if you see them. Thanks for your patience.
jerry 2w ago • 100%
thank you again for your help! I was at the end of my rope and you gave me the idea that solved it
jerry 2w ago • 100%
be aware that this is fixed now
jerry 2w ago • 100%
be aware that this is fixed now
jerry 2w ago • 100%
be aware that this is fixed now
Fedia.io is sort of like the Ship of Theseus right now - I literally replaced nearly everything trying to get it back working. The problem ended up being a silent out of memory error that php-fpm was running into. I had to increase the memory limit to about 10x what the docs require to get it to work, but once I did that, it works great. I was only able to sort this out after @bentigorlich recommended I move the site to debug mode (which requires me to lock everyone else out). Once I did that, it started giving some useful errors. My apologies for the amount of time it took to fix this. I learned a lot about php today.
jerry 2w ago • 100%
Thanks. It’s a strange problem that only happens when trying to post directly to a community/magazine that resides on another instance. So this works fine because this community is hosted locally. I think I understand when it happens, but I’m hoping to get some clarification.
Hi all. As some of you have reported, outbound federation to at least some other instances is broken from fedia.io. At the moment, I don't know why and I don't have any leads as there are no logs or other indications of what is going wrong, but I am working on it.
jerry 3w ago • 100%
Note that even once I get this fixed, another problem will inevitably crop up. Posting here is fine, but an email to jerry@infosec.exchange or a ping to @jerry@infosec.exchange (my mastodon account) would probably be faster (note, when federation breaks here, messages to @jerry@infosec.exchange wouldn't get through from fedia.io).
jerry 3w ago • 100%
It’s almost caught up. My apologies. I am trying to get it fixed permanently.
jerry 3w ago • 100%
ok - the queues are processing again. I will work on a more permanent fix after dinner
jerry 3w ago • 100%
ok - rabbitmq started having problems with the delivery queue again. I got it going again. Those messages should be delivered soon.
Hi all. Several of you have reported problems with fedia.io not federating with other instances correctly. The cause is that rabbitmq crashed, but not all the way. It crashed to the point where new connections would time out, but the service was still running such that it wouldn't auto restart. I will be creating some automation to detect that proactively and restart rabbitmq if/when it happens again.
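A rough sketch of what that kind of detection could look like, for anyone curious. This is not the automation actually running on fedia.io. It assumes rabbitmq listens on the default AMQP port 5672 on localhost and is managed by a systemd unit named `rabbitmq-server`; both are assumptions, and the function names are made up for illustration. The idea is to open a fresh connection, send the AMQP 0-9-1 protocol header, and treat a timeout or a missing reply as "crashed but still running".

```python
import socket
import subprocess

# AMQP 0-9-1 protocol header; a healthy broker answers it with a method frame.
AMQP_HEADER = b"AMQP\x00\x00\x09\x01"


def broker_is_responsive(host="127.0.0.1", port=5672, timeout=5.0):
    """Return True if the broker answers the AMQP handshake within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(AMQP_HEADER)
            reply = sock.recv(1)  # raises a timeout error if the broker stays silent
    except OSError:
        # Covers refused connections, timeouts, and resets.
        return False
    # Frame type 1 is a method frame (Connection.Start from a healthy broker).
    return reply == b"\x01"


def restart_broker(unit="rabbitmq-server"):
    """Restart the broker via systemd. The unit name is an assumption."""
    subprocess.run(["systemctl", "restart", unit], check=True)


def check_and_recover(host="127.0.0.1", port=5672, timeout=5.0, restart=restart_broker):
    """Restart the broker if it does not answer; return True if a restart was triggered."""
    if broker_is_responsive(host, port, timeout):
        return False
    restart()
    return True
```

Calling `check_and_recover()` every minute or so from a timer would catch the half-dead state, since a plain "is the process running" check passes even when new connections hang.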
We made some changes a few minutes ago that we hope fix the problem. There may be some other lingering issues, but I am hoping the voting problem is fixed now. Let me know below if you continue to see that problem.
Until I implement a better system to screen out spammers, I will be closing registrations on Fedia.io. That’s not what I want - I’d like for it to be available for legitimate accounts, but the spam is off the hook. Anyone seeing this can send me an email (jerry@infosec.exchange) and I’ll get an account created for you in the meantime.
Hello everyone. Today, I moved fedia.io behind the Fastly CDN. This should make the site consistently fast for everyone, no matter where you are in the world. It'll also help with bandwidth usage and mitigate DDoS attacks. There were a few hiccups as I set that up today - my apologies if you saw errors or broken images for a bit. EDIT: I previously said that this was the first time mbin or kbin was put behind a CDN. That is incorrect. kbin.earth has been behind Cloudflare. Apologies.
Hi all. I've been having some problems keeping fedia.io running - at the moment, either the message workers or the php web server processes are dying after an hour or so and I have to restart everything. I have been working with the mbin team and installed some updates that we hoped would fix the problems, but no luck. I am going to work on a cron job to automatically restart things once an hour. The downside is that you'll likely see some error 500s if you happen to hit it when the processes are restarting, but it should happen quickly and refreshing the page should make it work again.
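For illustration, a minimal sketch of the kind of script such a cron job could run. It assumes the services are managed by systemd, and the unit names below are hypothetical placeholders, not fedia.io's real configuration.

```python
import subprocess

# Hypothetical unit names for illustration; the real ones depend on the install.
UNITS = ["php-fpm", "mbin-workers"]


def restart_units(units=UNITS, run=subprocess.run):
    """Restart each unit in order and return the list of units that failed to restart."""
    failed = []
    for unit in units:
        result = run(["systemctl", "restart", unit])
        if result.returncode != 0:
            failed.append(unit)
    return failed
```

A small wrapper script that calls `restart_units()` could then be scheduled hourly from cron. Returning the failed units rather than raising means one broken unit does not stop the rest from being restarted.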
Shortly after upgrading to Mbin 1.7.1-rc1, php ran out of workers. I dramatically increased the limit. It isn’t clear to me why that happened and if it’s related to the upgrade or just coincidental. My intuition is that it’s related, but I have no evidence.
Hello everyone. I just upgraded fedia.io to mbin 1.7.1-rc1. One of the notable changes is that mbin is deprecating mercure, which is the component that provided streaming updates. As such, you will have to refresh the web page to see new posts and comments.
The (relatively new) server that Fedia.io was running on, a Hetzner AX 162-R, died overnight. Hetzner tells me that the main board failed and had to be replaced. In the process of repairing, the RAID set got corrupted and would no longer boot. Every single AX 162 (R or M) I’ve rented from Hetzner has failed now at least once. This was the last one I had. It was on my to-do list to move fedia.io to a Dell server with the same specs. I knew this was going to happen, but I didn’t get it done in time. For those of you who have been following along, Fedia has been cursed from the beginning. The kbin software was a god damned disaster, and very fortunately the mbin team spent an incredible amount of time and patience to help me sort out the many problems, nearly all of which are fixed now. Except for the random occurrences where federation breaks due to an as-yet-unknown bug, the main stability issue has been hardware. I have had excellent luck with Hetzner’s Dell servers, so I am hopeful that is now fixed as well. The challenge is that the Dell server is quite expensive ($350 per month) so I will be looking to find a more cost-effective way to host fedia.io, given the very small number of active users.
I will be rehoming fedia.io to a less expensive server the afternoon of July 1 - exact timing is TBD. Downtime should last about 2 hours. The current server is quite expensive and donations are dwindling, which is normally ok, but I am losing my job and have to be a bit more frugal.
Yesterday, the fedia.io server locked up. I was able to reboot it remotely and it came up clean. After less than an hour, the server froze again. This happened several more times throughout the day. Unfortunately, there were no logs recording what happened, and nothing on the console - just frozen hardware. I contacted Hetzner early this morning and they diagnosed the server as having a faulty motherboard. Hetzner replaced the board and rebooted the server, and so far the server has been stable. I have had pretty bad luck with this particular model of server from Hetzner, so I do not have confidence that this won't happen again, and so will be looking to migrate to a different type of server that is hopefully more stable and less expensive (I am losing my job at the end of June, and so need to save all the cash I can).
Fedia.io had a few issues over the past 24 hours - sometimes working fine until you clicked on certain posts, which resulted in an error 500, and other times just returning an error 500 no matter what. The first issue I found was with amqproxy, which sits between rabbitmq and the queue runners that process incoming and outgoing posts and helps reduce the load on the server. I found this morning that amqproxy was consistently failing, despite there being no apparent problem. I bypassed amqproxy, since the server can handle the load fine without it. That seemed to work and things returned to normal. A few hours later, the site started responding with error 500 to nearly all requests. This happened because the database server ran out of connections. The limit of 300 it was set to should have been plenty, but clearly it was not. I've set that to 3000 and so far, so good. My apologies for the instability. I continue to learn the nuances here and will keep making the service more reliable as I go.
The server fedia.io had been running on started developing stability problems overnight from Thursday April 4 to Friday April 5. By Saturday (today), the system was completely unbootable. After attempting to resolve the hardware issue with Hetzner (the ISP) for about 6 hours, I gave up and moved the site to a new server. All that took quite a lot of time, during which fedia.io was not available. My apologies for this.
Fedia.io is now running on its own stand-alone server. This server is costing me $210/month, so any contributions are welcome. Please let me know of any issues.