Hi all,
This is the report of the tests conducted on devices running the current
libremesh master branch on top of openwrt-24.10.2.
It goes through the points defined in the email "New testing
grant"
https://lists.autistici.org/message/20250504.153057.850a1bae.en.html
with some explanation for each.
# Devices details
All devices run libremesh-master-ow24.10.2, built with
firmware-selector/online-imagebuilder.
The asus runs without the package shared-state-async, because it has only 64MB
of RAM and was crashing.
- name: Cudy WR3000S v1
  price: 49.90€ (amazon.it 2025-07)
  hostname: cudy
  target: mediatek/filogic
  switch: DSA (mt7530)
  ipv4: 10.13.15.35
  eth_ports: 4xLAN 1xWAN (1Gbits)
  radios: 2ghz 5ghz
- name: OpenWrt One
  price: 135.98€ (amazon.it 2025-07)
  hostname: openwrt
  target: mediatek/filogic
  switch: swconfig
  ipv4: 10.13.4.128
  eth_ports: 1xLAN 1xWAN (1Gbits)
  radios: 2ghz 5ghz
- name: Asus RT-AC51U
  price: 5€ (second-hand market 2021)
  hostname: asus
  target: ramips/mt7620
  switch: swconfig
  ipv4: 10.13.135.8
  eth_ports: 4xLAN 1xWAN (100Mbits)
  radios: 2ghz 5ghz
- name: MikroTik
  price: 20€ (ebay 2021)
  hostname: mikrotik
  target: ipq40xx/mikrotik
  switch: dsa
  ipv4: 10.13.223.57 (eth0's macaddress and ipv4 is randomized)
  eth_ports: 1xLAN (1Gbits)
  radios: 5ghz
2) the work of the recipient will have to be aimed to the grant goal
defined above as "Help the release of a LibreMesh release based on
OpenWrt 24.10: testing with a realistic setup, acquiring the needed
hardware, reporting issues and, if enough time is available, fixing
blocking issues.";
3) specifically, the receiver is required to check if the observed
issues are already reported in the project bug tracker on Github, add
there any useful information gathered, and file a new bug report for
issues that are not yet properly described. The recipient is not
required to fix the observed bugs, but they are strongly encouraged to
use their work-hours for fixing the issues that they perceive as useful
in pursuing the grant goal;
Most of the testing time was spent on issue #1192
'Default anygw route working intermittently via cable', and on preparing a
pull request containing a fix for it, #1214.
The issue affects the master branch as well as the release 2024.1.
Testing shows no substantial differences for libremesh
between branches 23.05 and 24.10 of openwrt.
Description of a typical situation for this issue (based on real
experience):
A volunteer of a community network replaces an asus_rt-ac51u (100Mbits,
swconfig) with a newer cudy_wr3000s-v1 (1Gbits, dsa), connects a pc via
cable and browses the internet.
In the evening a neighbour comes back home and powers on his
netgear_dgn3500 (1Gbits, swconfig), connected via cable to the
cudy_wr3000s-v1. From then on the pc and the other cable-connected clients
can no longer reach the anygw address or the internet.
The issue is mainly due to the fact that dsa devices are not able to
keep their hardware tables in sync with the bridge fdb:
- as suggested by pony in this comment:
https://github.com/libremesh/lime-packages/issues/1192#issuecomment-2994404168
- as described in
https://www.kernel.org/doc/html/latest/networking/dsa/configuration.html#forwarding-database-fdb-management
"The existing DSA switches do not have the necessary hardware support to
keep the software FDB of the bridge in sync with the hardware tables, so
the two tables are managed separately" (bridge fdb show queries both [cut])
I verified that the erroneous fdb entry refers to the mac-address of the
anygw interface of a neighboring libremesh node.
To verify this I followed these steps:
1. connected only two devices, one dsa and one swconfig, to each other via
cable, and connected a pc client via cable to the dsa device.
2. changed the anygw mac of the swconfig device (e.g. aa:aa:aa:0d:fe:aa
-> aa:aa:aa:0d:fe:ab) and noticed that the dsa device creates an erroneous
entry for this different mac (visible by running `opkg update; opkg install
ip-bridge; bridge fdb | grep aa`). In this situation the correct
anygw mac is owned only by the dsa device, and the anygw starts working again.
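
The check in step 2 can be narrowed down a little. The following is a minimal sketch, assuming the anygw mac starts with the `aa:aa:aa` prefix used in the example above; `fdb_match` is a hypothetical helper name, not something shipped by libremesh.

```shell
#!/bin/sh
# Keep only the fdb lines whose MAC address starts with the given prefix.
# Reads `bridge fdb` output on stdin, so it also works on saved output.
fdb_match() {
    prefix=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    # The MAC address is the first field of every line printed by `bridge fdb`.
    awk -v p="$prefix" 'index(tolower($1), p) == 1'
}

# On the router (needs the ip-bridge package):
#   bridge fdb show br br-lan | fdb_match aa:aa:aa
```

Compared to a plain `grep aa`, this avoids matching lines where `aa` only appears in the middle of an unrelated mac address.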
I tested various solutions to prevent the erroneous fdb entry from causing
a non-working anygw for cable-connected hosts.
These tests aimed to find a configuration that:
1. is simple, meaning that it keeps as small as possible the number of
additional configurations and the number of newly introduced network
devices and interfaces.
2. allows an ethernet port to be used both by hosts and by other mesh
nodes, without further manual configuration (as libremesh already does
for swconfig devices)
3. allows the dsa device to ignore the information about a neighboring
anygw: by default it should never forward packets directed to the
anygw mac address, so it should never use that information.
An incomplete list of non-working workarounds includes:
- disabling mac-learning on the specific port (doesn't work for both
hosts and routers)
- creating mac-vlans (mode: passthru) on top of each ethernet port, and
bridging them in br-lan instead of bridging the raw interfaces (is not
'simple' and would require additional routing and nftables rules)
The best solution found, described in #1214:
1. adds a static local entry in the bridge fdb with the anygw_mac
associated with br-lan
2. adds a nftables rule that drops packets with ether source address
equal to the anygw_mac
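
Step 1 can be sketched roughly as follows. This is only an illustration and not the code of #1214: the mac address is the example value used earlier, and the helper name `anygw_fdb_cmd` and the `APPLY` switch are made up for this sketch.

```shell
#!/bin/sh
# Sketch only: values are illustrative, not copied from #1214.
ANYGW_MAC="${ANYGW_MAC:-aa:aa:aa:0d:fe:aa}"
BRIDGE="${BRIDGE:-br-lan}"

# Build the command that marks the anygw MAC as local to the bridge itself.
# "self": the entry is handled by the named device, here the bridge.
# "local": frames for this MAC are terminated locally, never forwarded.
# "replace" acts like "add" but does not fail if the entry already exists.
anygw_fdb_cmd() {
    printf 'bridge fdb replace %s dev %s self local' "$1" "$2"
}

# Print the command by default; run it only with APPLY=1
# (needs root and the ip-bridge package).
cmd=$(anygw_fdb_cmd "$ANYGW_MAC" "$BRIDGE")
if [ "${APPLY:-0}" = "1" ]; then
    $cmd
else
    echo "$cmd"
fi
```

After applying it, the entry should be listed by `bridge fdb show br br-lan` for the anygw mac.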
## test anygw issue when babel doesn't use vlans (pull #1210)
Finally, I checked for the same issue while testing
pull #1210, which removes the vlan 17 for babeld and configures the
babel routing protocol on top of br-lan.
In this case, on dsa devices:
- the hotplug network hook is still required: using ip-bridge, it
creates a static entry in the bridge fdb stating that the anygw
macaddress can be found locally in br-lan, so the router never tries to
forward packets directed to anygw.
- the lime-config hook is still required: it instructs nftables to
drop packets with source address equal to the anygw macaddress on each
ethernet port that is inside br-lan.
So ideally #1210 should be merged after #1214.
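
The per-port drop rule can be sketched as a generated nftables ruleset. This is a rough illustration under several assumptions: the table and chain names are invented, the choice of the bridge family with a prerouting hook is mine and may differ from what the lime-config hook generates, and `lan1`/`lan2` are placeholder port names.

```shell
#!/bin/sh
# Sketch only: table/chain names and the hook choice are assumptions.

# Emit an nftables ruleset that drops frames whose source MAC is the
# anygw MAC, with one rule per given port.
# Usage: anygw_drop_ruleset MAC PORT...
anygw_drop_ruleset() {
    mac=$1; shift
    echo 'table bridge anygw_sketch {'
    echo '    chain prerouting {'
    echo '        type filter hook prerouting priority 0; policy accept;'
    for port in "$@"; do
        printf '        iifname "%s" ether saddr %s drop\n' "$port" "$mac"
    done
    echo '    }'
    echo '}'
}

# On the router, as root (port names depend on the device):
#   anygw_drop_ruleset aa:aa:aa:0d:fe:aa lan1 lan2 | nft -f -
```

Generating the ruleset as text and loading it with `nft -f -` keeps the sketch easy to inspect before anything is applied.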
In #1210 however, in a configuration like the one below:
dsa_device --wire-- swconfig_device --wire-- swconfig_device
the dsa device sees both swconfig devices as direct neighbours,
confirming that the issue described in
[#1210](https://github.com/libremesh/lime-packages/pull/1210#issue-3328326823)
at the point 'Safety net for bridged meshes' is not resolved by the
newly introduced nftables rule, for swconfig devices as well.
As a final thought, I propose that the assembly of LibreMesh make two new
releases in the next period:
- a 2024.2 containing the fix #1214
- a 2025.1 or 2026.1, with a big-red-warning-message of incompatibility
with 2024.1 and 2024.2, containing the migration to babel without vlan.
## additional notes
For lack of hardware I did not test a situation where a dsa device has
two or more CPU ports.
4) the recipient of the grant will have to build a setup matching the
following minimum requirements:
*The minimal simple topology is a linear one, represented here:*
internet1 --wire-- dual_band#1 --wifi-- dual_band#2 --wire--
single_band#1 --wifi-- single_band#2 --wire or wifi-- internet2
5) the minimum scenarios to test are:
* checking if the internet connection internet1 goes down, if the wifi
clients (common AP name) still have connection
* checking if the internet connection internet2 goes down, if the wifi
clients (common AP name) still have connection
* checking if the internet connection internet1 goes down, if the cabled
clients (on dual_band#2) still have connection
* checking if the internet connection internet2 goes down, if the cabled
clients (on dual_band#2) still have connection
OK
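
One possible way to watch connectivity from a client during these scenarios is sketched below. It is not necessarily the procedure used for this report; the helper names are made up and the target address in the usage comment is just an example of a public host.

```shell
#!/bin/sh
# Report "up" or "down" depending on whether the given probe command succeeds.
probe_state() {
    if "$@" >/dev/null 2>&1; then echo up; else echo down; fi
}

# Log the reachability of TARGET once per INTERVAL seconds, COUNT times.
# Usage: watch_uplink TARGET [COUNT] [INTERVAL]
watch_uplink() {
    target=$1; count=${2:-30}; interval=${3:-2}
    i=0
    while [ "$i" -lt "$count" ]; do
        # One ping with a 2 second timeout per round.
        echo "$(date +%T) $target $(probe_state ping -c 1 -W 2 "$target")"
        i=$((i + 1))
        sleep "$interval"
    done
}

# Example, run on a wifi or cabled client while unplugging internet1:
#   watch_uplink 9.9.9.9 60 2
```

A short run of "down" lines followed by "up" again would correspond to the network switching over to the other internet connection.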
6) On the running network, check the main functionalities of the
lime-app web interface.
OK
lime-app mostly works
tested functionalities:
- STATUS: OK
- ALIGN: OK (note that it no longer speaks)
- METRICS: OK
- NOTES: OK
- MAP: allows to edit node location but does NOT show community
- SHARED PASSWORD: OK
- NODE CONFIGURATION: Connect to a mobile hotspot: OK (works only if the
hotspot uses WPA2 PSK)
- VISIT A NEIGHBORING NODE: OK
- REMOTE SUPPORT: OK (works after installing tmate and ubus-tmate)
untested functionalities:
- UPDATE FIRMWARE
- VOUCHER
7) Some additional interesting things to inspect are included here but
are not required:
* check if hostnames in the network can be resolved to IP-addresses (DNS);
OK
* split the test network in two different clouds (setting two different
ap_name values) and checking the connection via wireless and via cable;
NOT TESTED
* test batman_V routing algo. Are vlans actually needed on wifi mesh
interfaces? See also:
https://github.com/libremesh/lime-packages/issues/1009
NOT TESTED
10) Since part of the grant funds is recommended for hardware
acquisition, the recipient of the grant must:
* Provide a detailed report on the purchased devices, if any, including
model names and specifications.
See the above 'Devices details'
* Ensure these devices are updated with the latest LibreMesh code
compiled on top of OpenWrt 24.10.
OK
* Report any issues encountered during upgrades, even if exhaustive
testing is not performed.
OK
* These upgrade-related reports should at least include evidence of the
upgrade process (e.g., logs or screenshots) and a brief description of
any issues encountered.
* Maintain these devices as part of a global, distributed testbench,
contributing to the long-term stability and reliability of LibreMesh
releases.
Or, alternatively:
* Send the devices to other testing locations when identified.
OK
--
gothos
PGP Key ID: 0x6406B32F2CEC0008
PGP Key server:
https://keys.openpgp.org/