Introduction
A number of C9105AXW access points (all PIDs) were manufactured with a NAND flash subsystem that may, over time, spuriously mark blocks as bad. Once 94 blocks have been marked as bad, the flash bad blocks table is full. As a result, the AP may suffer various symptoms:
- The flash filesystem may become write-locked, so that the AP is no longer able to commit configuration changes, write new logs, or download a new image. Errors similar to the following may be seen:
sync_log: couldn't open /storage/syslogs/7: Read-only file system
- The AP may crash, with a kernel panic showing UBIFS errors similar to the following:
<3>[02/06/2023 05:06:06.0290] UBIFS error (ubi0:1 pid 5454): do_writepage: cannot write page 8 of inode 54848, error -30
- The AP may be unable to boot; the console log shows an error similar to the following:
[01/01/1970 00:00:05.0600] ubi0 error: ubi_eba_init: no enough physical eraseblocks (0, need 1)
[*01/01/1970 00:00:06.4720] mount failure
In some cases, the AP may need to be replaced.
Cisco has implemented two bugfixes to address this problem.
Bugfixes
This bugfix prevents flash blocks from being incorrectly marked bad. However, it does not repair APs that already have an excessive number of bad blocks.
This bugfix repairs APs with excessive bad blocks. At boot time (in u-boot), if the AP's bad block table exceeds a threshold number of entries (default: 40; controlled by the SCRUB_LIMIT u-boot variable), the bad block table is emptied before the AP boots.
Affected Units
Only C9105AXW APs are affected by this problem; no other AP models are affected. To determine whether given C9105AXW units are affected, open Cisco bug ID CSCwf50177 in BST and click "Check Bug Applicability" to enter the APs' serial numbers.
Fixed Software
If you have affected C9105AXWs, you should upgrade to software with fixes for both Cisco bug ID CSCwf50177 and Cisco bug ID CSCwf68131. Track the latter bug for the availability of the fixes in different branches; as of 5-Sep-2023, the fixes are or will be available in the following releases:
AireOS
- 8.10.190.0 (on CCO)
- 8.10.185.7 and 8.10.189.111 were special releases with the fixes for this flash problem; customers running these releases should upgrade to 8.10.190.0 when convenient
Cisco IOS® XE
- 17.3.7 APSP5 or above (open TAC case)
- 17.3.8 (on CCO)
- 17.6.5 APSP5 or above (on CCO)
- 17.6.6 (on CCO)
- 17.9.3 APSP5 or above (on CCO)
- 17.9.4 APSP1 or above (on CCO)
- 17.9.5 (CCO 2024)
- 17.12.2 (CCO November 2023)
- 17.13.1 (CCO December 2023)
Checking Susceptible APs for Excessive Bad Blocks
First, check all of your susceptible C9105AXWs, to see how many bad blocks they have. If none have more than 60 bad blocks, you may upgrade directly.
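The triage rule above can be sketched in Python. This is a minimal illustration, assuming the bad-block counts have already been collected into a mapping of AP name to count. The function name `triage_aps`, the AP names, and the counts are hypothetical; only the limit of 60 comes from the text.

```python
from typing import Dict, List, Tuple

# Limit taken from the text above: an AP with more than 60 bad blocks
# should not be upgraded directly.
DIRECT_UPGRADE_LIMIT = 60


def triage_aps(bad_blocks: Dict[str, int],
               limit: int = DIRECT_UPGRADE_LIMIT) -> Tuple[List[str], List[str]]:
    """Split APs into (within_limit, over_limit) by bad-block count."""
    within = sorted(ap for ap, n in bad_blocks.items() if n <= limit)
    over = sorted(ap for ap, n in bad_blocks.items() if n > limit)
    return within, over


if __name__ == "__main__":
    # Hypothetical AP names and counts, for illustration only.
    counts = {"ap-lobby": 3, "ap-floor2": 60, "ap-floor3": 71}
    within, over = triage_aps(counts)
    print("Within limit:", within)
    print("Over limit (use the upgrade procedure):", over)
```

If the second list is empty, a direct upgrade is possible; otherwise the upgrade procedure described later in this document applies.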
Checking for bad blocks - 17.6 and above
On each susceptible C9105AXW (as determined from "Check Bug Applicability" for CSCwf50177), collect the output of "show flash statistics". Look for "count of bad physical eraseblocks". To automate checking a large number of APs, use WLAN Poller.
Checking for bad blocks - 8.10 and 17.3
TAC (or other Cisco employee with SWIMS access) will need to devshell into each susceptible C9105AXW and issue the following command:
ubinfo -a
Look for "count of bad physical eraseblocks". To automate checking a large number of APs, use RADKit.
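Once the command output has been collected, extracting the counter can be scripted. The following Python sketch assumes that the output contains a line with the label "count of bad physical eraseblocks" followed by an optional colon and an integer; the exact layout on a given AP release is an assumption, and the function name `bad_block_count` and the sample text are hypothetical.

```python
import re
from typing import Optional

# Matches a line such as "Count of bad physical eraseblocks:   12",
# ignoring case. The surrounding layout is an assumption.
_BAD_PEB_RE = re.compile(
    r"count of bad physical eraseblocks\s*:?\s*(\d+)", re.IGNORECASE
)


def bad_block_count(cli_output: str) -> Optional[int]:
    """Return the highest bad-eraseblock count found, or None if absent.

    The output may describe more than one UBI device, so the largest
    value is reported.
    """
    counts = [int(m.group(1)) for m in _BAD_PEB_RE.finditer(cli_output)]
    return max(counts) if counts else None


if __name__ == "__main__":
    # Hypothetical sample output, for illustration only.
    sample = (
        "ubi0\n"
        "Count of bad physical eraseblocks:       72\n"
        "Count of reserved physical eraseblocks:  8\n"
    )
    print(bad_block_count(sample))
```

A return value of None means the expected line was not found, which should be treated as "unknown" rather than as zero bad blocks.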
Upgrade Procedure
If you have affected C9105AXW units with excessive bad blocks, use the following procedure when upgrading to the fixed software.
Upgrading in a single controller deployment - complete new controller image
1. (Optional) you may install the new controller image, but do not activate it, and do not predownload the new AP software to the affected C9105AXWs.
2. While still running the old controller image, reboot the affected C9105AXWs. This will, in most cases, allow affected APs to be upgraded. (In some cases, a few APs may need to be replaced.)
3. You may now predownload the new AP image, if so desired.
4. Reload the controller, running the new software.
Upgrading in a single controller deployment - APSP
1. (Optional) you may install the new APSP, but do not activate it, and do not predownload the new AP software to the affected C9105AXWs.
2. Reboot the affected C9105AXWs. This will, in most cases, allow affected APs to be upgraded. (In some cases, a few APs may need to be replaced.)
3. You may now predownload, activate, and commit the APSP.
Upgrading in an N+1 deployment
In this scenario, a backup controller is used to upgrade the affected C9105AXWs.
1. While the affected APs are still joined to the old controller, upgrade the backup controller to the fixed software (full controller version, or APSP).
2. Reload the affected APs - have them rejoin the old controller. (In some cases, a few APs may need to be replaced.)
3. Now reconfigure the affected APs, to set their primary controller to the upgraded one, and have them join the backup controller.
4. After the primary controller has been upgraded to the fixed software, you may move the C9105AXWs back to it.