Hi all!
Gio answered my email (thanks, Gio!) but his reply does not appear in the mailing list archive, so maybe the list's mail server did not like his mail server? Who knows...
He describes an interesting situation (see the cited email below).
The scenario he mentions, as far as I understand, can have two negative impacts:
1) on the whole network, as a spurious DHCP server appears and clients could get a wrong IP address and a wrong gateway from it instead of the right ones from the LibreMesh nodes (this would not happen every time, but only when the wrong DHCP offer arrives before the good one, so the problem would be difficult to identify);
2) on the owner of the node, as the local DHCP server will always offer the wrong IP first, and the clients in the home will be unable to access the LibreMesh network.
In my opinion, problem 1 should be fixed with a firewall rule, active by default, that drops incoming DHCP offers. It should be possible to disable this rule with some option in /etc/config/lime-*, in case someone disables the DHCP from LibreMesh on purpose and uses an external one. Alternatively, this firewall rule could be included, always active, in the optional lime-proto-anygw package.
I found an old issue about this, from which it seems that there is not yet a firewall rule for it:
https://github.com/libremesh/lime-packages/issues/658
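To make the matching criteria of such a rule concrete, here is a minimal sketch in Python. It is only an illustration: the real rule would be written in whatever firewall syntax the node uses. It assumes that DHCP server replies are UDP packets going from port 67 to port 68 (as in RFC 2131) and that the rule is applied only on the client-facing ports, since the WAN port may need to receive DHCP offers as a client. The names `UdpPacket`, `should_drop` and `filter_enabled` are made up for this example; `filter_enabled` stands for the hypothetical option in /etc/config/lime-*.

```python
from dataclasses import dataclass

DHCP_SERVER_PORT = 67  # UDP port DHCP servers send from (RFC 2131)
DHCP_CLIENT_PORT = 68  # UDP port DHCP clients listen on (RFC 2131)


@dataclass
class UdpPacket:
    """Hypothetical summary of an incoming UDP packet."""
    in_iface: str   # interface the packet arrived on
    src_port: int
    dst_port: int


def should_drop(pkt, client_ifaces, filter_enabled=True):
    """Return True if pkt looks like a DHCP server reply coming from a client port."""
    if not filter_enabled:
        # The (hypothetical) config option disables the rule entirely,
        # e.g. when an external DHCP server is used on purpose.
        return False
    return (
        pkt.in_iface in client_ifaces
        and pkt.src_port == DHCP_SERVER_PORT
        and pkt.dst_port == DHCP_CLIENT_PORT
    )
```

With `client_ifaces={"eth0.1"}`, a packet from port 67 to port 68 arriving on eth0.1 would be dropped, while the same packet arriving on another interface (for example the WAN) would pass.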
Problem 2 cannot be fixed on the LibreMesh side, as the solution is to disable the DHCP server on the dumb AP router. An aggressive alternative would be ARP spoofing, to convince all the clients that the MAC address of the gateway (e.g. 192.168.1.1) is that of the LibreMesh router; it works. If we want to do this ugly thing, tell me and we can discuss it further.
Implementing a notification in lime-app would not help the owner of the node, as lime-app also stops being reachable for them.
But it would help other members of the network to realize what happened and to help the node owner fix it. I suppose that shared-state was mentioned during the meeting also for sharing these notifications with the lime-app of other LibreMesh nodes.
Last comment: it seems to me that the comparison with a reference state is not needed, as a DHCP server connected to the LAN while LibreMesh's DHCP is active (so when lime-proto-anygw is installed) is always bad, right?
Is this right Gio?
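To make the idea more concrete, here is a minimal Python sketch of the kind of probe Gio suggests in the cited email below (something like `dhcpcd -T`, which looks at the offer without applying it). It only builds a DHCPDISCOVER and parses a reply following the RFC 2131 packet layout; actually sending it (a UDP broadcast from port 68, which needs root) is left out. The function names are made up for this example, and on a node this would more likely be lua or shell; Python is used here only for illustration. Following the reasoning above, any DHCPOFFER answering our probe on the LAN is treated as rogue, with no reference state involved.

```python
import socket
import struct

MAGIC_COOKIE = b"\x63\x82\x53\x63"  # marks the start of the DHCP options
DHCPDISCOVER, DHCPOFFER = 1, 2      # values of option 53 (message type)


def build_discover(mac, xid):
    """Build a minimal DHCPDISCOVER with the broadcast flag set."""
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1, 1, 6, 0,                 # op=BOOTREQUEST, htype=Ethernet, hlen, hops
        xid, 0, 0x8000,             # transaction id, secs, flags (broadcast)
        b"", b"", b"", b"",         # ciaddr, yiaddr, siaddr, giaddr (zeroed)
        mac, b"", b"",              # chaddr, sname, file (null padded)
    )
    return header + MAGIC_COOKIE + bytes([53, 1, DHCPDISCOVER, 255])


def parse_reply(packet):
    """Extract the offered IP, server identifier and router from a DHCP packet."""
    if len(packet) < 240 or packet[236:240] != MAGIC_COOKIE:
        raise ValueError("not a DHCP packet")
    info = {
        "xid": struct.unpack("!I", packet[4:8])[0],
        "offered_ip": socket.inet_ntoa(packet[16:20]),  # yiaddr field
    }
    i = 240
    while i < len(packet) and packet[i] != 255:  # 255 = end option
        if packet[i] == 0:                        # 0 = pad option, no length
            i += 1
            continue
        if i + 1 >= len(packet):                  # truncated option, stop
            break
        code, length = packet[i], packet[i + 1]
        value = packet[i + 2:i + 2 + length]
        if code == 53 and len(value) == 1:
            info["message_type"] = value[0]
        elif code == 54 and len(value) == 4:
            info["server_id"] = socket.inet_ntoa(value)
        elif code == 3 and len(value) >= 4:
            info["router"] = socket.inet_ntoa(value[:4])
        i += 2 + length
    return info


def is_rogue_offer(info, xid):
    """Any DHCPOFFER that answers our probe on the LAN counts as rogue."""
    return info.get("message_type") == DHCPOFFER and info["xid"] == xid
```

The `server_id` and `router` fields of a rogue offer are what a notification could show, so that other members of the network can tell which device is answering.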
Ciao!
Ilario
-------- Original Message --------
From: G10h4ck <g10h4ck@???>
Sent: June 19, 2024 9:03:58 AM GMT+02:00
To: "LibreMesh.org project mailing list" <libremesh@???>
Cc: Ilario <ilario@???>
Subject: Re: [lime] GSoC - Cable purpose autodetection
During the meeting I mentioned a specific thing to detect and notify to start with.
_Rogue DHCP_
Which happens very often in community networks.
AKA
1) A participant has a LibreMesh router on the roof
2) One or more unsupported routers around the home, with DHCP disabled, connected via Ethernet port to the LibreMesh router on the roof.
This happens because those unsupported routers are often much cheaper or more available than supported ones in some markets. Everything works fine, until one day a power spike or some other random condition causes some of those unsupported routers to reset to defaults, so their DHCP server gets enabled by default and everything stops working.
3) I also suggested where to start to look
Look into the OpenWrt DHCP client to see how easy/hard it would be to implement a functionality like `dhcpcd -T eth0` that could be used to periodically check for the aforementioned condition and notify the user/network via lime-app
4) I have no problem being interrupted or getting questions while explaining something or presenting an idea, in fact I find it helpful, so next time if something is not clear, please ask questions in the moment, don't wait until next week to be full of (many of them very broad) questions ;-)
Cheers
Gio
On 2024-06-19 08:10, Ilario via LibreMesh wrote:
> Hi all and hi Nemael!
> I think we need to define better the GSoC about cable purpose
> auto-detection. In my opinion, during the meeting the goal was
> broadened and became confused (at least for me).
>
> During the meeting Gio mentioned that faulty commercial
> routers could be detected and a notification in the lime-web interface
> could appear.
> But for implementing this we need to know more info:
> * what exactly should be detected?
> * why?
>
> Can anyone share some more situations where any kind of detection
> could be useful, according to their experience/opinion?
>
> I would propose to fragment the project in small well defined tasks,
> as independent as possible.
>
> I would suggest starting from:
>
> 0) gather necessities from the communities (share your thoughts please!!!);
>
> 1) small lua or bash scripts for detecting specific things about an
> ethernet port. Things that can be useful for the identified
> necessities;
>
> then, if time is enough:
>
> 2) small lua script that creates an interface-specific configuration
> for a few identified cases, allowing the user to modify or disable the
> resulting automatic configuration (e.g. like what lime-hwd-openwrt-wan
> package does)
>
> 3) integrate each detection script with the corresponding
> configuration script and create a new "lime-hwd-" package like
> lime-hwd-ground-routing.
> https://github.com/libremesh/lime-packages/blob/master/packages/lime-hwd-ground-routing/files/usr/lib/lua/lime/hwd/ground_routing.lua
> It could be something like lime-hwd-autodetect-ethernet-mesh or
> lime-hwd-autodetect-ethernet-client
> so that they will run and configure stuff every time lime-config is
> run, but only if this module is explicitly activated in
> /etc/config/lime-* files (e.g. lime-hwd-ground-routing does not do
> anything unless a hwd_gr section is found).
> Instead of making packages, we can also have the files as lime-assets.
> lime-assets can be run at the first boot or every time lime-config is
> executed. Some documentation here:
> https://github.com/libremesh/lime-packages/issues/719
> and implemented here:
> https://github.com/libremesh/lime-packages/blob/master/packages/lime-system/files/usr/lib/lua/lime/generic_config.lua
>
> 4) for the modules where it makes sense, have it running when an
> interface goes up or down (this was suggested by Javier). OpenWrt has a very useful function for this:
> https://openwrt.org/docs/guide-user/base-system/hotplug#iface
>
> 5) for the modules where it makes sense, have a system for running
> constantly (e.g. we already have a script for checking if the internet
> connection is working: babeld-auto-gw-mode, which is just a rule for
> watchping. Watchping will continuously ping an IP on the internet and
> execute the rule when it fails/succeeds
> https://github.com/libremesh/lime-packages/tree/master/packages/babeld-auto-gw-mode/files/etc/watchping
> )
>
> 6) share found info via shared-state (this was proposed during the
> meeting but I am not sure why we should do this, maybe it is required for
> lime-app integration?)
>
> 7) Integrate the configuration modules from point 2 with the lime-app. Similarly to what LuCI (the web interface from OpenWrt) does, lime-app could allow the user to manually configure some interfaces for some common uses.
>
> 8) Integrate the auto-detection modules from point 1 with lime-app notifications. From Gio during the meeting:
> "shared-state has a reference state, so lime-app can show what changed
> and show a notification saying it is broken now. The goal is to make
> troubleshooting easier for the user. The configuration of batman-adv
> vs client connected to ethernet is useful but could be suggested to
> the user from the lime-app, not necessarily applied automatically.
> Troubleshooting is very important, and this could help."
>
>
> About point 0 we need your opinion!
>
> i) Gio suggested detecting misconfigured commercial routers for
> helping the debugging of network problems in large networks. I am not
> sure which kind of fault should be detected...? We already have the
> watchping+babeld-auto-gw-mode combo that detects if the advertised
> internet connection does not work. Should we reimplement this inside
> this GSoC's framework? Is there anything else we could look for in
> commercial routers?
>
> ii) an ethernet port used for meshing (connected to other LibreMesh
> routers) should not be included in br-lan bridge, for avoiding
> possible loops (unsure if Batman-adv can really manage all the loops,
> but it throws a creepy error message). There is a bit of old
> discussion from this comment on:
> https://github.com/libremesh/lime-packages/issues/56#issuecomment-637598835
> and during BattleMesh v15 in 2023 we configured that manually
>
> iii) from Gio during the meeting "we can also inform the user
> about what changed on their network"
>
> iv) cut the broadcast between two clouds with different ap_name (the
> parameter that defines which nodes route together with batman-adv: when the
> ap_name is different, batman-adv is separated between the two networks
> and Babeld does the layer 3 routing). Currently some people (from
> what Nico Pace wrote years ago) are using cabled WAN-WAN connections
> for this. Clearly, it would be much more elegant to do it LAN-LAN (so
> that you can use the WAN port for its internet-gateway function) and
> still avoid that all the broadcast traffic goes to the other cloud.
>
> v) ???
>
>
> Starting to share ideas about the point 1 (please share your ideas!):
>
> a) As Gio suggested during the meeting, we could use a small piece of
> shared-state, which is the tool for detecting the neighbors. If any
> neighbor is found, we can assume there is a LibreMesh device there. We
> can be sure about that until shared-state is moved to
> OpenWrt repositories (devs, are there any plans for this migration?).
> The script is this one:
> https://github.com/libremesh/lime-packages/blob/master/packages/shared-state/files/usr/bin/shared-state-get_candidates_neigh
> and it works with ping6 to the IPv6 link-local multicast address.
>
> a2) we could do the same using ping6 link local but without using the
> shared-state thing
>
> a3) we should start detecting LibreMesh nodes using the output from
> the routing protocols, as they already run their detection
>
> doubt: in the routers with swconfig (like all the ath79 ones, which
> still have to be supported by DSA), usually all the LAN ports are
> inside the same interface, right? Like the eth0.1 on TP-Link WDR3600.
> In this case, can we detect and configure a single port (not all of
> them at the same time)? Is it something like what
> lime-hwd-ground-routing does (it requires the user to specify the CPU
> port, which allows it to identify a specific ethernet port inside the
> LAN interface)?
>
> b) We could detect if the WAN ethernet port has a DHCP server connected
>
> b2) if this DHCP server offers a default route
>
> b3) one way to do this would be what Gio suggested in the meeting:
> modify the DHCP client of OpenWrt for showing the received information
> without applying it (like the -T option of dhcpcd, see an example of
> output in the meeting minutes of the 5th of June 2024). Like everything that could be accepted in the OpenWrt repositories, this should go to OpenWrt as soon as it is good enough.
>
> b4) if this default route is a working internet connection (but
> Watchping already works for this...)
>
> c) when another LibreMesh router is detected, check if it is part of
> the same cloud (for example if babeld can see the other node but
> batman-adv cannot see it)
>
>
> Ciao!
> Ilario
>