Cisco UCS Chassis IOM Bug – Engineering Update

After I wrote the article on the Cisco UCS chassis issue that occurred while upgrading the firmware to version 2.0 (4a), I exchanged a couple of emails with TAC and got a comprehensive update on it.

So, what happened that day? As you know, while updating the UCS cluster from 2.0 (3a) to 2.0 (4a), the update failed on 2 IOMs.

Root Cause: The /altflash/ partition on the IOM, which receives “basepkg.sh” (the image) from the Fabric Interconnect to install the firmware code, was already 96% full. During the update the partition filled up completely (100%), so the update failed, the IOM reset, and it retried again and again. See the logs below.

cmcXXXX:~# df -h
Filesystem                Size      Used Available Use% Mounted on
none                     20.0M    256.0k     19.8M   1% /var/cmc
/dev/mtdblock4           32.0M      9.1M     22.9M  28% /obfl
/dev/mtdblock6           59.0M      1.5M     57.5M   3% /ws
/dev/mtdblock5           61.0M     30.4M     30.6M  50% /flash
tmpfs                    32.0M      8.2M     23.8M  26% /dev/shm
none                      5.0M      8.0k      5.0M   0% /var/sysmgr
/dev/mtdblock2           61.0M     59.0M      2.0M  97% /altflash           <–   2.0M available

2012-09-24T11:37:28.318627+00:00 CMC NOCSN_-3-CMC  OBFL:0:mcserver_read_param_data:Active image read: 2#012
2012-09-24T11:37:58.749465+00:00 CMC NOCSN_-3-CMC  OBFL:0:mcserver_read_param_data:Active image read: 2#012
2012-09-24T11:39:05.801469+00:00 CMC NOCSN_-3-CMC last message repeated 2 times
2012-09-24T11:40:05.801782+00:00 CMC NOCSN_-3-CMC last message repeated 2 times
2012-09-24T11:40:35.878567+00:00 CMC NOCSN_-3-CMC  OBFL:0:mcserver_read_param_data:Active image read: 2#012
2012-09-24T11:41:40.399224+00:00 CMC NOCSN_-3-CMC last message repeated 2 times
2012-09-24T11:41:40.142331+00:00 CMC NOCSN_updated-5-CMC  OBFL:0:check_for_requests:Update failed, code: 102#012
2012-09-24T11:42:12.187552+00:00 CMC NOCSN_-3-CMC  OBFL:0:mcserver_read_param_data:Active image read: 2#012
2012-09-24T11:42:12.288629+00:00 CMC NOCSN_root-3-CMC  OBFL: Update Failed. Error – 5

cmc1XXXX:~# show_swupdate_progress
Update progress for Image 1:

Package being processed:  ciscowoodsidepkg.sh
Stage:                    Installing package
Completed packages:       1/7
Completed bytes:          3254550/34945130           <–   (required Free Space 33M approx)
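
The comparison behind the two annotations above (2.0M available versus roughly 33M required) can be sketched in Python. This is only an illustration meant to be run off-box against saved output. The helper names `parse_size`, `free_bytes` and `required_bytes` are hypothetical, and it assumes that `df -h` uses 1024-based units and that the total in the "Completed bytes" line is the space the update needs.

```python
import re

# Multipliers for the size suffixes printed by `df -h`
# (assumption: 1024-based units).
UNITS = {"": 1, "k": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a df -h style size such as '2.0M' or '256.0k' to bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([kMG]?)", text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(float(number) * UNITS[unit])


def free_bytes(df_output, mount_point):
    """Return the 'Available' column, in bytes, for one mount point."""
    for line in df_output.splitlines():
        fields = line.split()
        # Columns: Filesystem, Size, Used, Available, Use%, Mounted on
        if len(fields) >= 6 and fields[5] == mount_point:
            return parse_size(fields[3])
    raise LookupError(f"{mount_point} not found in df output")


def required_bytes(progress_output):
    """Return the total from the 'Completed bytes: done/total' line."""
    match = re.search(r"Completed bytes:\s+(\d+)/(\d+)", progress_output)
    if match is None:
        raise LookupError("no 'Completed bytes' line found")
    return int(match.group(2))


# Sample data copied from the outputs above (annotations removed).
DF_SAMPLE = """\
Filesystem                Size      Used Available Use% Mounted on
/dev/mtdblock5           61.0M     30.4M     30.6M  50% /flash
/dev/mtdblock2           61.0M     59.0M      2.0M  97% /altflash
"""

PROGRESS_SAMPLE = """\
Completed packages:       1/7
Completed bytes:          3254550/34945130
"""

free = free_bytes(DF_SAMPLE, "/altflash")
needed = required_bytes(PROGRESS_SAMPLE)
shortfall = max(0, needed - free)
print(f"free={free} needed={needed} shortfall={shortfall}")
```

A non-zero shortfall means the partition cannot hold the full update under the assumptions stated above.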

Recovery of IOM: Deleted the unwanted .tar files from both IOMs to free up enough space for the update process to complete.

cmc1XXXX:~# ls -la /altflash/
drwxr-xr-x    3 root     root            0 Jan  1  1970 .
drwxr-xr-x    2 root     root            0 Sep 24 10:45 ..
-rw-rw-rw-    1 root     root      3254550 Sep 24 10:52 basepkg.sh
-rw-rw-rw-    1 root     root     17798045 Sep 24 10:43 ciscowoodsidepkg.sh
-rw-rw-rw-    1 root     root      1486010 Sep 24 10:44 cmcapppkg.sh
-rw-rw-rw-    1 root     root      1343488 Sep 24 10:45 debugpkg.sh
-rw-r--r--    1 root     root      2198688 May 11 09:07 diagpkg.sh
-rwxrwxrwx    1 root     root         2882 May 11 09:28 psu-screen.sh
-rw-r--r--    1 root     root      8263680 May 11 09:29 psu_chas.tar
-rw-r--r--    1 root     root     22974464 May 11 09:32 psu_ucs.tar
-rwxr-xr-x    1 root     root         1202 Apr 23 18:28 setup_swupdate
-rw-r--r--    1 root     root       327884 Nov 21  2011 u-boot.bin
-rw-r--r--    1 root     root      2871505 May 11 08:43 uImage.bin

cmc1XXXX:~# cd /altflash
cmc1XXXX:/altflash# rm psu_chas.tar
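
To see how much of /altflash the leftover .tar files occupy, the sizes in the directory listing above can be added up. The following Python sketch does that for a saved copy of the listing. `tar_sizes` is a hypothetical helper, the sample lines are copied from the listing with the terminal colour codes removed, and the column positions assume the `ls -la` layout shown in this post.

```python
def tar_sizes(listing):
    """Map each regular file ending in .tar to its size in bytes."""
    sizes = {}
    for line in listing.splitlines():
        fields = line.split()
        # Columns: mode, links, owner, group, size, month, day, time/year, name
        if len(fields) < 9 or not fields[0].startswith("-"):
            continue  # skip directories and malformed lines
        name = fields[8]
        if name.endswith(".tar"):
            sizes[name] = int(fields[4])
    return sizes


# A few lines copied from the /altflash listing above.
LISTING_SAMPLE = """\
drwxr-xr-x    3 root     root            0 Jan  1  1970 .
-rw-rw-rw-    1 root     root      3254550 Sep 24 10:52 basepkg.sh
-rwxrwxrwx    1 root     root         2882 May 11 09:28 psu-screen.sh
-rw-r--r--    1 root     root      8263680 May 11 09:29 psu_chas.tar
-rw-r--r--    1 root     root     22974464 May 11 09:32 psu_ucs.tar
-rw-r--r--    1 root     root      2871505 May 11 08:43 uImage.bin
"""

sizes = tar_sizes(LISTING_SAMPLE)
total = sum(sizes.values())
for name, size in sorted(sizes.items()):
    print(f"{name}: {size} bytes")
print(f"total in .tar files: {total} bytes ({total / 1024 ** 2:.1f} MiB)")
```

The sketch only reports sizes. Which files are safe to delete is a decision to take with TAC, as was done here.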

cmc1XXXX:~# show_swupdate_progress
Update progress for Image 1:

Package being processed:  uImage.bin
Stage:                    Installing package
Completed packages:       6/7
Completed bytes:          32073727/34945130

Logs after the update succeeded:
2012-09-24T11:55:35.706859+00:00 CMC NOCSN_-3-CMC  OBFL:0:mcserver_read_param_data:Active image read: 2#012
2012-09-24T11:55:47.626599+00:00 CMC NOCSN_updated-3-CMC  OBFL:0:update_sw_v2:[uboot update] forced: 0, running version: 2.24, install version: 2.24#012
2012-09-24T11:55:47.627078+00:00 CMC NOCSN_updated-3-CMC  OBFL:0:update_sw_v2:[uboot update] because of golden boot running#012
2012-09-24T11:55:47.627590+00:00 CMC NOCSN_updated-3-CMC  OBFL:0:update_sw_v2:Installing uboot#012
2012-09-24T11:56:07.009923+00:00 CMC NOCSN_-3-CMC  OBFL:0:mcserver_read_param_data:Active image read: 2#012
2012-09-24T11:56:38.846762+00:00 CMC NOCSN_-3-CMC  OBFL:0:mcserver_read_param_data:Active image read: 2#012
2012-09-24T11:56:47.392637+00:00 CMC NOCSN_updated-5-CMC  OBFL:0:check_for_requests:Update succeeded#012
2012-09-24T11:56:47.533533+00:00 CMC NOCSN_updated-5-CMC  OBFL:0:clear_fail_files:removal of image1.ok succeeded#012
2012-09-24T11:57:10.148558+00:00 CMC NOCSN_-3-CMC  OBFL:0:mcserver_read_param_data:Active image read: 2#012

2012-09-24T11:57:12.603250+00:00 CMC NOCSN_-5-CMC  OBFL:0:mcserver_set_param:Set active image request to image 1#012
2012-09-24T11:57:12.603457+00:00 CMC NOCSN_-5-CMC  OBFL:0:mcserver_set_param:shm write of active image to: 1 successful#012
2012-09-24T11:57:12.736301+00:00 CMC NOCSN_updated-5-CMC  OBFL:0:check_for_requests:Updating active image to 1 (current: 2)#012
2012-09-24T11:57:12.986115+00:00 CMC NOCSN_root-5-CMC  OBFL: New active image: 1
2012-09-24T11:57:12.993806+00:00 CMC NOCSN_updated-5-CMC  OBFL:0:check_for_requests:set_active_image succeeded#012
2012-09-24T11:57:13.602977+00:00 CMC NOCSN_-5-CMC  OBFL:0:mcserver_set_param:Set active image accepted by update daemon: yes#012
2012-09-24T11:57:13.655504+00:00 CMC NOCSN_root-3-CMC  OBFL: Running image is: 2. Setting 1 as the active image. Rebooting the System…

This issue has been filed with the Engineering team under the following bug IDs:
CSCuc15009, CSCto68104 (not public at this time).

Please note: This is not a firmware bug but rather an IOM bug, as not all IOMs are affected, and it can happen during any firmware update. The Cisco development team is working on either modifying or adding a script that deletes all files from that partition before the update starts.

 

About Prasenjit Sarkar

Prasenjit Sarkar is a Product Manager at Oracle for their Public Cloud with primary focus on Cloud Strategy, Cloud Native Applications and API Platform. His primary focus is driving Oracle’s Cloud Computing business with commercial and public sector customers; helping to shape and deliver on a strategy to build broad use of Oracle’s Infrastructure as a Service (IaaS) offerings such as Compute, Storage, Network & Database as a Service. He is also responsible for developing public/private cloud integration strategies, customer’s Cloud Computing architecture vision, future state architectures, and implementable architecture roadmaps in the context of the public, private, and hybrid cloud computing solutions Oracle can offer.
