Posts by BernH

    little progress I took a good look:

    inet 127.0.0.1/8 while my udm is 192.168.1.1. So from my udm I created a virtual network on 127.0.0.1/8, the udm still does not detect the nas but I was able to ping from the nas on 127.0.0.1

    127.0.0.1 is the localhost (internal address) of a device. You pinged yourself.


    The enps0…. is the ethernet port. It is showing down. Either it is not configured with a static address, or it is configured for dhcp but cannot reach a dhcp server (because one is not configured on your network or router port), or there is a cabling problem.


    You have to confirm you have a dhcp server configured and the nic is set for dhcp, which I suspect is the issue since you said plugging into other ports didn’t work.


    The best thing to do is use regular lan ports for lan and leave the wan port for wan. Lan connections are not supposed to be plugged into a wan port and as such a dhcp server does not operate on a wan port. Is there a reason you are trying to use a wan port for the nas?
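

    To confirm what the nas itself sees, you can check from its console with the standard iproute2 commands:

    Code
    # show link state and any assigned addresses for every interface
    ip addr show
    # show the routing table; no default route is another sign dhcp never answered
    ip route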

    I was digging into this too. readarr and lidarr are using a different api set from what I can see (v1, not the v3 that sonarr and radarr use). I was unable to get it to connect to either of them.


    I suspect the entire script would have to be changed to talk to the v1 api. Unfortunately I don't know python so I'm at a loss on actually programming something in here.
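

    For reference, the difference is just the version segment in the API path. A quick manual check against each app might look something like this (host, ports and API keys are placeholders; 8989/7878/8686/8787 are the usual default ports):

    Code
    # sonarr and radarr expose the queue on the v3 api path
    curl -s -H "X-Api-Key: <sonarr api key>" "http://<host>:8989/api/v3/queue"
    curl -s -H "X-Api-Key: <radarr api key>" "http://<host>:7878/api/v3/queue"
    # lidarr and readarr use the v1 api path instead
    curl -s -H "X-Api-Key: <lidarr api key>" "http://<host>:8686/api/v1/queue"
    curl -s -H "X-Api-Key: <readarr api key>" "http://<host>:8787/api/v1/queue"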


    All that said, I couldn't actually get this to work with my setup. The initial run reads the status of anything in the queue, and then goes into its timeout before running again, but it never does a second run, so it never actually blacklists or removes the items from the queue. And of course, since it just sits there at the timeout, there is no additional logging info that would point me to an error.

    So, where are the ip addresses set from? Is the nas set static or dhcp? If dhcp, the udm would need to have a dhcp server running. If static, it should be set in the same ip range as the rest of the lan.


    What ip addresses are you seeing set on different computers?


    I agree with chente that if it is the udm setting that changed, that is where the problem is likely originating from.

    Works great BernH. I just made a folder and ran the two commands after filling in the API etc... I've set it to every 30 minutes though.

    Looking at the docker file, it looks like the env variables are set for only sonarr and radarr, but theoretically we should be able to add variables for the other apps. If their json structures are the same (hopefully they are), extending it to work with the others should not be a problem.


    Since it builds a docker container to run this in, there should be no worries about it polluting the native python install for OMV.
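

    Purely as a sketch of the idea (the variable names below are hypothetical and not taken from the actual image, so check its docker file or readme for the real ones), extending it would just mean passing extra environment variables when the container is created:

    Code
    # hypothetical: add lidarr variables alongside the existing sonarr/radarr ones
    docker run -d --name queue-cleaner \
      -e SONARR_URL=http://<host>:8989 -e SONARR_API_KEY=<key> \
      -e RADARR_URL=http://<host>:7878 -e RADARR_API_KEY=<key> \
      -e LIDARR_URL=http://<host>:8686 -e LIDARR_API_KEY=<key> \
      <image name>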

    I don't run nextcloud in a docker container, but if the container is structured the same and behaves the same as a native install, you should also be able to bash into it and run sudo -u www-data php /var/www/nextcloud/occ maintenance:mode --off


    This has the same effect as the config.php edit that soma mentioned.


    The occ app in nextcloud can do all kinds of functions. Running sudo -u www-data php /var/www/nextcloud/occ without the options at the end will return a list of all the functions.
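

    If the container does mirror a native install, it would look something like this (the container name here is an example, and some images keep occ under /var/www/html and may not ship sudo, so adjust the name, path and user to match your setup):

    Code
    # open a shell inside the nextcloud container
    docker exec -it nextcloud bash
    # turn maintenance mode off
    sudo -u www-data php /var/www/nextcloud/occ maintenance:mode --off
    # list every available occ command
    sudo -u www-data php /var/www/nextcloud/occ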

    Not familiar with the equipment, but a quick look states that the port you have the nas plugged into is a 2.5GbE WAN port and your motherboard has a 1GbE port. Does the 2.5GbE port on the udm auto-negotiate down to 1GbE, or does it have to be manually set to 1GbE? And since it's a WAN port by default, can it be set to be a LAN port?


    Does the nas come up when plugged into one of the 1Gbe ports (ports 1-8)?

    If the wired network goes down, you just need to bring it back up again. You shouldn't have to reboot. OMV uses netplan for interface management, so scheduling a periodic task in the web UI that runs a netplan apply should bring the interface back up and tell it to fetch a new IP address if it is using dhcp.


    You could have this run every 30 or 60 minutes. It will create a slight interruption in connectivity on any connected ethernet port as it negotiates a new IP lease, but this should not be noticeable under normal use.
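

    As a sketch, the scheduled task only needs to run a single command (the path below is the usual one on Debian-based systems):

    Code
    # re-apply the netplan configuration, bringing the interface back up
    # and renewing its dhcp lease if the link had gone down
    /usr/sbin/netplan apply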

    fail2ban-client banned

    It gives me an output with 11 banned IPs of all kinds, among them is the IP of my PC on the local network, the IP of my smartphone when it connects with Wi-Fi on the local network, an IP of one of the wireguard networks that I have configured, several external IPs that I assume are from accesses from the smartphone...

    This doesn't make sense, the container is systematically banning all access.

    The container should not be banning everything unless there are failed login attempts. The external IPs may not be you accessing from the smartphone. They could also be other people or bots trying to hack into your system. I occasionally get a few of those being banned as well.


    Aside from the maxretry, bantime and findtime, have you changed anything else in the jail? What about the filter? Have you changed anything there?


    I am asking because the way fail2ban works is that the filter is applied to the logs to look for 300 and 400 errors. If it finds them, the jail sends the ip addresses to the action file to trigger an iptables ban. If your system is blocking everything, something in that chain of events is not working right.


    Can you post your jail and filter files, or at least double-check them against the ones I posted? A few other people have applied these and I have not had complaints from them, and mine is working fine.

    This is from the iptables-nft.conf regarding the port directive:

    Code
    # Option:  port
    # Notes.:  specifies port to monitor
    # Values:  [ NUM | STRING ]  Default:
    #
    port = ssh

    According to this, the ports can be specified by number or string. If you want to be more specific about the banned ports, you can change the jail action line from action = iptables-nft[type=allports, chain=DOCKER-USER] to action = iptables-nft[port=80, port=443, chain=DOCKER-USER]. I just tested and it worked.
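

    Side by side, only the action line in the jail changes:

    Code
    # allports form: bans the ip on everything going through the DOCKER-USER chain
    action = iptables-nft[type=allports, chain=DOCKER-USER]
    # port-specific form from above: limits the ban to ports 80 and 443
    action = iptables-nft[port=80, port=443, chain=DOCKER-USER]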


    I would recommend allports though, as it will protect any other ports that are passed through NPM.


    The ignoreip line in the jail where I had commented out the LAN ip was supposed to have the note I just added to it, but I must have forgotten to add it when I posted.

    Code
    ignoreip = 127.0.0.1/8 #192.168.2.0/24

    Is it possible that this line should be adjusted to the local network? In my case, for example, it would be like this:

    Code
    ignoreip = 127.0.0.1/8 192.168.10.0/24

    It's just a shot in the dark. I'm asking you because you have studied it...

    Or separated with a comma (no idea...)

    Code
    ignoreip = 127.0.0.1/8, 192.168.10.0/24

    Yes, an ignoreip line like that will keep that IP range from triggering a ban, as it will bypass the jail. That's why I asked earlier what is triggering the ban. If the bans are coming from hammering from the internet, it will have no effect, but if it is something on your network it will. Look at the jail status to see what IP is banned. The correct way is without the comma.


    I just added a note to the guide for that line.
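

    To check which addresses the jail currently has banned, you can query the jail status by name (npm-docker is the jail name from my guide; if fail2ban runs as a container, run it through docker exec with whatever name your container has):

    Code
    # show the banned IP list and counters for the npm-docker jail
    docker exec fail2ban fail2ban-client status npm-docker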

    The action = iptables-nft[type=allports, chain=DOCKER-USER] line in the jail blocks all ports. I'd have to do a little digging to see how to specify which ports to actually block, but I suspect the type would have to be multiport with different ports specified, and/or two different port directives to specify http and https.

    Yeah. That makes sense. I'll try a different configuration.

    Not just containers, it goes further. I have not been able to connect via ssh to any vm with WinSCP. I stopped the fail2ban container and could then access them via ssh. That's why I think it affects more than just docker. Reading everything you say, which seems very logical to me, I was surprised not to be able to access the vms via ssh.

    I have not experienced it blocking ssh access. When setting up, I was only seeing docker access blocked, as it's only the DOCKER-USER chain that is affected. I will do a test later when I have time to confirm again, but my recollection of my initial testing did not show any ssh blocks.


    My initial reaction is that something in your configuration is behaving differently than mine.

    Interesting. I just realized that the fail2ban container is also blocking connections to the virtual machines. It looks like it is creating some iptables rule that affects the server in general.

    It is doing this because NPM is a docker container and it's the DOCKER-USER chain that the ban is applied to, so access to NPM is blocked, and NPM is what in turn gives access to those hosts. Fail2ban is "closing the door" to access to any docker containers.
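

    You can see this on the host itself: with the iptables-nft action from the jail, the bans land in a fail2ban chain hooked into DOCKER-USER.

    Code
    # list what is currently hooked into the DOCKER-USER chain, including any fail2ban ban rules
    iptables -L DOCKER-USER -n --line-numbers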

    I was going to answer what you say, but before doing so I started fail2ban again and did some more tests.

    It turns out that it is blocking absolutely all connections. I have Nextcloud and Jellyfin through npm. I can't access either of them, neither from the local network nor from outside.

    However, if I connect from outside with wireguard I can access both services. I'm not sure what conclusions to draw from this, can you think of anything? The wireguard network has a different network range than the local network, it also has its own iptables rules.

    What is evident now is that it is not an issue related to the Nextcloud AIO container. The fact that it was found yesterday in Stopped status must have had some different, unrelated cause.

    I think all this is too dense for a Tuesday :)

    As I said, the fail2ban block happens to all docker containers because the action line in the jail, action = iptables-nft[type=allports, chain=DOCKER-USER], applies the ban to the DOCKER-USER chain, and since all docker containers operate on the DOCKER-USER chain, the ban applies to all of them. This is not a bad thing when an ip is hammering on your server looking for open ports, as it will block that ip address from everything docker related.


    Perhaps you have gone too aggressive with your 1 week bantime, not allowing anything to get unbanned in a reasonable amount of time? Even my nextcloud lxc, which runs its own fail2ban, only has a 3 hour ban triggered by 10 failed attempts in 12 hours, while my NPM fail2ban is a 3 hour ban triggered by 5 fails in 30 minutes.


    I have it set like this because the NPM fail2ban will stop hammering, but the nextcloud fail2ban watches the nextcloud logs and also checks for "Trusted Domain" errors.


    This all took a little bit of trial and error to get levels that would catch malicious activity but not cripple my server, which is also why the posted guide has short ban times.


    If it is only access to one host config that is causing the bans, you could get very specific by editing the logpath lines to point to specific logs, excluding the one that is triggering the bans. A better solution would be making a different jail and matching filter for each host log, enabling you to have different maxretry/findtime/bantime values (see the sketch after the jail below), but when a failure occurs the bans would still be applied to the DOCKER-USER chain, since it is all docker related.


    Code: NPM fail2ban jail
    [npm-docker]
    enabled = true
    ignoreip = 127.0.0.1/8 192.168.2.0/24
    chain = INPUT
    action = iptables-nft[type=allports, chain=DOCKER-USER]
    logpath = /var/log/default-host_*.log
              /var/log/proxy-host-*.log
    maxretry = 5
    bantime  = 10800
    findtime = 1800
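

    A per-host jail might look something like the sketch below. The jail name and log filename are hypothetical, and the filter line just reuses the existing npm-docker filter; point the logpath at the one proxy-host log you want separate limits for, and adjust the maxretry/bantime/findtime values per jail.

    Code: per-host jail (hypothetical)
    [npm-docker-nextcloud]
    enabled = true
    ignoreip = 127.0.0.1/8 192.168.2.0/24
    chain = INPUT
    action = iptables-nft[type=allports, chain=DOCKER-USER]
    # reuse the existing filter, since the jail name no longer matches it
    filter = npm-docker
    logpath = /var/log/proxy-host-1.log
    maxretry = 5
    bantime  = 10800
    findtime = 1800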


    The question is, what is triggering the bans? Is it something on your server or is it hammering activity from the outside?

    I don't use swag. This is a Nextcloud AIO container installed and configured with Nginx Proxy Manager according to the Nextcloud AIO manual. I didn't add anything more than what was recommended. What the manual says that needs to be added is here: https://github.com/nextcloud/a…xy.md#nginx-proxy-manager

    I just had to add this to the npm advanced configuration:

    Code
    client_body_buffer_size 512k;
    proxy_read_timeout 86400s;
    client_max_body_size 0;

    In theory it is a tested configuration and in fact it has worked perfectly for me for just over a month, until I installed fail2ban. If you can't think of what it could be, I'll have to study the fail2ban container in depth to see if I can figure out where the problem is. I'll try to do it this weekend and post what I find. Thanks for your response anyway. :thumbup:

    Fail2ban is just monitoring the NPM host logs and looking for 300 and 400 errors. When it sees an error in the host logs, it just bans that ip address in the server iptables docker chain, which is applied to all docker containers. It does not do anything directly to the containers that the NPM host is pointing to.


    The only thing that I can think of is the port 443 issue I mentioned. As I stated, if you were running it before with the container handling the port 443 access directly, proxying 443 from the container could be causing the problems: when you pass the ssl headers through the proxy, portions of the headers are replaced rather than passed through untouched.


    If you were not letting the container handle ssl/port 443, and you set up NPM like that guide you just linked to, then the only other thing I can think of would be that the container is somehow triggering 300 or 400 errors in the NPM logs when it goes and checks for updates or something of the sort, or that someone is hammering the server. Either would cause fail2ban to block access, and the container would then hit an access error and shut down.
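

    One way to see what is actually triggering the bans is to watch fail2ban's own output for the Found and Ban lines, which show the source ip and the log line that matched the filter (the container name here is an example; use whatever you named yours):

    Code
    docker logs -f fail2ban 2>&1 | grep -Ei 'found|ban'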


    I have run owncloud and then nextcloud for about 12 to 15 years, but always as either their official VM made by hansonit, a bare metal install, or in the lxc that I am using now, both of which are installed based on their official guide. I have never run it as a docker container, so I have never had to chase this problem related to a docker install.

    BernH

    It seems like fail2ban is causing me problems somehow. Yesterday I found the Nextcloud and fail2ban containers in Stopped status. I didn't give it much importance; I started them up again and everything worked fine.

    This morning I was doing some things in the Nextcloud GUI and suddenly lost connection. After bringing down the fail2ban container, everything is back to normal.

    I followed your guide strictly. Any idea what could be happening?

    I don't run nextcloud as a container, so I am not sure what could be happening exactly when it is run as a container. My nextcloud runs as an lxc, so as far as NPM and fail2ban are concerned it is actually a different server, and my access to the lxc from NPM is all unencrypted over port 80, leaving NPM to handle the encryption and port 443 access.


    If you deployed the container using port 443, as the container guide had outlined, the container's port 443, which is an ssl port, and the NPM encryption are probably conflicting. From my experience, nextcloud does not really like to have its port 443 proxied. If you do still want to run like that, you would have to figure out what advanced settings would allow it. You may be able to find that in the swag site config, since it seems to work like that, but since I don't run that way I can't test it.


    That said though, nextcloud is a bit of a different beast and required some extra settings in NPM even when leaving all the access on port 80. I don't know if these will work the same with the container.


    Here is what I have added:


    Code: Advanced - Custom Nginx Configuration
    add_header X-Robots-Tag "noindex, nofollow, nosnippet, noarchive";
    rewrite ^/\.well-known/carddav https://$server_name/remote.php/dav/ redirect;
    rewrite ^/\.well-known/caldav https://$server_name/remote.php/dav/ redirect;
    proxy_hide_header Upgrade;
    proxy_read_timeout 180m;

    Additionally, in the Custom Locations I have 3 defined. The first 2 are for caldav and carddav connections and the third is for webfinger protocols.


    location: /.well-known/caldav

    scheme: http

    Forward Hostname/IP: <your server ip address>

    Forward Port: 80



    location: /.well-known/carddav

    scheme: http

    Forward Hostname/IP: <your server ip address>

    Forward Port: 80



    location: /.well-known/webfinger

    scheme: http

    Forward Hostname/IP: <your server ip address>

    Forward Port: 80

    Ok so, I fired up a deb 11 lxc. It's pinging the host ok, and running the scripts returns no errors, but they still seem to do nothing.


    No access activity in the *arr logs. I tried echoing the parsed IDs that the python scripts are supposed to return and am not seeing anything, which tells me that the scripts are not actually pulling anything from the containers.
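

    A quick way to rule the API side in or out is to hit the queue endpoint directly and see whether it returns any records (host, port and key are placeholders; this is the radarr v3 path, so adjust for the other apps):

    Code
    # the paged queue response should include a totalRecords count;
    # a 0 here means there is simply nothing for the script to parse
    curl -s -H "X-Api-Key: <radarr api key>" "http://<host>:7878/api/v3/queue" | jq '.totalRecords'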


    Once again though, if this is an issue with a docker versus a native install, I don't really know how to address that. If it is because of a module missing from the container, you could add it, but the next restart would drop the module unless the base image is rebuilt from scratch. A try with a native install of radarr and these scripts on another system or a vm might work and answer the question of the no returned IDs, but I don't have the time to build up another system and test right now.


    While this would be nice for automation, I don't have the ability (without some time learning some things) or desire to really chase further by rebuilding docker base images.


    ***edit***

    The deb lxc was not giving the pyarr errors that the ubuntu lxc was giving.