Isilon Stuff and Other Things

This is a disclaimer: 
Using the notes below is dangerous for both your sanity and peace of mind.  
If you still want to read them, beware that they may be "not even wrong".

Everything I write in here is just a mnemonic device to give me a chance to
fix things I badly broke because I'm bloody stupid and think I can tinker with stuff
that is way above my head and get away with it. It reminds me of Gandalf's warning: 
"Perilous to all of us are the devices of an art deeper than we ourselves possess."

Moreover, a lot of it I blatantly stole off the net from people obviously cleverer 
than me -- not very hard. Forgive me. My bad.

Please consider all this and go away. You have been warned!

(:#toc:)

EMC Support

  • Support is at https://support.emc.com
  • One must create a profile with two roles: one as Authorized Contact and another as Dial Home, Primary Contact.
  • Site ID is:
Site ID:        1003902358 
Created On:     05/13/2016 12:36 PM 
Site Name:      MCGILL UNIVERSITY
Address 1:      3801 UNIVERSITY ST
Address 2:      ROOM WB212
City:           MONTREAL
State:
Country:        CA
Postal Code:    H3A 2B4

About This Cluster

This is from the web interface, [Help] → [About This Cluster]

About This Cluster

OneFS
Isilon OneFS v8.0.0.4 B_MR_8_0_0_4_053(RELEASE) installed on all nodes.

Packages and Updates

No packages or updates are installed.

Cluster Information
GUID: 000e1ea7eec05c211157780e00f5f0ce64c1

Cluster Hardware
Node    Model   Configuration   Serial Number
Node 1  Isilon X410-4U-Dual-64GB-2x1GE-2x10GE SFP+-144TB    400-0049-03 SX410-301608-0260
Node 2  Isilon X410-4U-Dual-64GB-2x1GE-2x10GE SFP+-144TB    400-0049-03 SX410-301608-0255
Node 4  Isilon X410-4U-Dual-64GB-2x1GE-2x10GE SFP+-144TB    400-0049-03 SX410-301608-0264
Node 3  Isilon X410-4U-Dual-64GB-2x1GE-2x10GE SFP+-144TB    400-0049-03 SX410-301608-0254
Node 5  Isilon X410-4U-Dual-64GB-2x1GE-2x10GE SFP+-144TB    400-0049-03 SX410-301608-0248

Cluster Firmware
    Device              Type                Firmware                Nodes
    BMC_S2600CP         BMC                  1.25.9722              1-5
    CFFPS1              CFFPS               03.03                   1-5
    CFFPS2              CFFPS               03.03                   1-5
    CMCSDR_Honeybadger  CMCSDR              00.0B                   1-5
    CMC_HFHB            CMC                 02.05                   1-5
    IsilonFPV1          FrontPnl            UI.01.36                1-5
    LOx2-MLC-YD         Nvram               rp180c01+rp180c01       1-5
    Lsi                 DiskCtrl            20.00.04.00             1-5
    LsiExp0             DiskExp             0910+0210               1-5
    LsiExp1             DiskExp             0910+0210               1-5
    Mellanox            Network             2.30.8020+ISL1090110018 1-5
    QLogic-NX2          10GigE              7.6.55                  1-5


Logical Node Numbers (LNN), Device IDs, Serial Numbers and Firmware

  • Use the isi_for_array command to loop over the nodes and run the same command on each:
BIC-Isilon-Cluster-4# isi_for_array isi_hw_status -i
BIC-Isilon-Cluster-4:   SerNo: SX410-301608-0264
BIC-Isilon-Cluster-4:  Config: 400-0049-03
BIC-Isilon-Cluster-4: FamCode: X
BIC-Isilon-Cluster-4: ChsCode: 4U
BIC-Isilon-Cluster-4: GenCode: 10
BIC-Isilon-Cluster-4: Product: X410-4U-Dual-64GB-2x1GE-2x10GE SFP+-144TB
BIC-Isilon-Cluster-1:   SerNo: SX410-301608-0260
BIC-Isilon-Cluster-1:  Config: 400-0049-03
BIC-Isilon-Cluster-1: FamCode: X
BIC-Isilon-Cluster-1: ChsCode: 4U
BIC-Isilon-Cluster-1: GenCode: 10
BIC-Isilon-Cluster-1: Product: X410-4U-Dual-64GB-2x1GE-2x10GE SFP+-144TB
BIC-Isilon-Cluster-2:   SerNo: SX410-301608-0255
BIC-Isilon-Cluster-2:  Config: 400-0049-03
BIC-Isilon-Cluster-2: FamCode: X
BIC-Isilon-Cluster-2: ChsCode: 4U
BIC-Isilon-Cluster-2: GenCode: 10
BIC-Isilon-Cluster-2: Product: X410-4U-Dual-64GB-2x1GE-2x10GE SFP+-144TB
BIC-Isilon-Cluster-3:   SerNo: SX410-301608-0254
BIC-Isilon-Cluster-3:  Config: 400-0049-03
BIC-Isilon-Cluster-3: FamCode: X
BIC-Isilon-Cluster-3: ChsCode: 4U
BIC-Isilon-Cluster-3: GenCode: 10
BIC-Isilon-Cluster-3: Product: X410-4U-Dual-64GB-2x1GE-2x10GE SFP+-144TB
BIC-Isilon-Cluster-5:   SerNo: SX410-301608-0248
BIC-Isilon-Cluster-5:  Config: 400-0049-03
BIC-Isilon-Cluster-5: FamCode: X
BIC-Isilon-Cluster-5: ChsCode: 4U
BIC-Isilon-Cluster-5: GenCode: 10
BIC-Isilon-Cluster-5: Product: X410-4U-Dual-64GB-2x1GE-2x10GE SFP+-144TB
  • The isi_nodes command can print formatted per-node strings, e.g.:
BIC-Isilon-Cluster-3# isi_nodes %{id} %{lnn} %{name} %{serialno}
1 1 BIC-Isilon-Cluster-1 SX410-301608-0260
2 2 BIC-Isilon-Cluster-2 SX410-301608-0255
4 3 BIC-Isilon-Cluster-3 SX410-301608-0254
3 4 BIC-Isilon-Cluster-4 SX410-301608-0264
6 5 BIC-Isilon-Cluster-5 SX410-301608-0248
  • Why is there an %{id} equal to 6?
  • 20160923: I think I can now answer this question.
  • It might be the result of the initial configuration of the cluster back in April ‘16.
  • The guy who did it (J Laganiere, from Gallium-it.com) had a few problems with nodes not responding.
    • The nodes are labeled from top to bottom as 1 (highest in the rack) to 5 (lowest in the rack).
    • They should have been labeled in their physical order in the rack, 1/bottom to 5/top.
    • As to why the LNNs don’t match the Device IDs: Device IDs are incremented when nodes are failed and added.
    • I smartfailed one node once, so that explains the ID=6.
    • This is extremely annoying as the allocation of IPs is also affected.
    • The last octet of the prod and node IP pools doesn’t match for the same LNN.
  • The lnnset command (run from within the isi config console; see the Network section) shows the mapping:
BIC-Isilon-Cluster >>> lnnset

  LNN      Device ID          Cluster IP
----------------------------------------
    1              1            10.0.3.1
    2              2            10.0.3.2
    3              4            10.0.3.4
    4              3            10.0.3.3
    5              6            10.0.3.5

BIC-Isilon-Cluster-2# isi_nodes %{lnn} %{devid} %{external} %{dynamic} 
1 1 172.16.10.20 132.206.178.237,172.16.20.237
2 2 172.16.10.21 132.206.178.236,172.16.20.236
3 4 172.16.10.22 132.206.178.233,172.16.20.234
4 3 172.16.10.23 132.206.178.234,172.16.20.235
5 6 172.16.10.24 132.206.178.235,172.16.20.233

BIC-Isilon-Cluster-2# isi network interfaces ls
LNN  Name         Status        Owners               IP Addresses   
--------------------------------------------------------------------
1    10gige-1     Up            -                    -              
1    10gige-2     Up            -                    -              
1    10gige-agg-1 Up            groupnet0.prod.pool0 132.206.178.237
                                groupnet0.node.pool1 172.16.20.237  
1    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.20   
1    ext-2        No Carrier    -                    -              
1    ext-agg      Not Available -                    -              
2    10gige-1     Up            -                    -              
2    10gige-2     Up            -                    -              
2    10gige-agg-1 Up            groupnet0.prod.pool0 132.206.178.236
                                groupnet0.node.pool1 172.16.20.236  
2    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.21   
2    ext-2        No Carrier    -                    -              
2    ext-agg      Not Available -                    -              
3    10gige-1     Up            -                    -              
3    10gige-2     Up            -                    -              
3    10gige-agg-1 Up            groupnet0.prod.pool0 132.206.178.233
                                groupnet0.node.pool1 172.16.20.234  
3    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.22   
3    ext-2        No Carrier    -                    -              
3    ext-agg      Not Available -                    -              
4    10gige-1     Up            -                    -              
4    10gige-2     Up            -                    -              
4    10gige-agg-1 Up            groupnet0.prod.pool0 132.206.178.234
                                groupnet0.node.pool1 172.16.20.235  
4    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.23   
4    ext-2        No Carrier    -                    -              
4    ext-agg      Not Available -                    -              
5    10gige-1     Up            -                    -              
5    10gige-2     Up            -                    -              
5    10gige-agg-1 Up            groupnet0.prod.pool0 132.206.178.235
                                groupnet0.node.pool1 172.16.20.233  
5    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.24   
5    ext-2        No Carrier    -                    -              
5    ext-agg      Not Available -                    -              
--------------------------------------------------------------------
Total: 30

  • This lists the firmware status of devices across the cluster nodes:
BIC-Isilon-Cluster-1# isi upgrade cluster firmware devices
Device             Type     Firmware                Mismatch  Lnns   
---------------------------------------------------------------------
CFFPS1_Blastoff    CFFPS    03.03                   -         1-5    
CFFPS2_Blastoff    CFFPS    03.03                   -         1-5    
CMC_HFHB           CMC      01.02                   -         1-5    
CMCSDR_Honeybadger CMCSDR   00.0B                   -         1-5    
Lsi                DiskCtrl 17.00.01.00             -         1-5    
LsiExp0            DiskExp  0910+0210               -         1-5    
LsiExp1            DiskExp  0910+0210               -         1-5    
IsilonFPV1         FrontPnl UI.01.36                -         1-2,4-5
Mellanox           Network  2.30.8020+ISL1090110018 -         1-5    
LOx2-MLC-YD        Nvram    rp180c01+rp180c01       -         1-5    
---------------------------------------------------------------------
Total: 10

Licenses

  • License status on the cluster (only some are activated):
BIC-Isilon-Cluster-3# isi license licenses ls
Name                  Status     Expiration         
----------------------------------------------------
SmartDedupe           Inactive   -                  
Swift                 Activated  -                  
SmartQuotas           Activated  -                  
InsightIQ             Activated  -                  
SmartPools            Inactive   -                  
SmartLock             Inactive   -                  
Isilon for vCenter    Inactive   -                  
CloudPools            Inactive   -                  
Hardening             Inactive   -                  
SnapshotIQ            Activated  -                  
HDFS                  Inactive   -                  
SyncIQ                Inactive   -                  
SmartConnect Advanced Activated  -
----------------------------------------------------
Total: 13

Alerts and Events

  • Modify event retention period from 90 days (default) to 360:
BIC-Isilon-Cluster-1# isi event settings view
      Retention Days: 90
       Storage Limit: 1
   Maintenance Start: Never
Maintenance Duration: Never
  Heartbeat Interval: daily

BIC-Isilon-Cluster-1# isi event settings modify --retention-days 360

BIC-Isilon-Cluster-1# isi event settings view                       
      Retention Days: 360
       Storage Limit: 1
   Maintenance Start: Never
Maintenance Duration: Never
  Heartbeat Interval: daily
  • The syntax to modify event settings:
isi event settings modify
[--retention-days <integer>]
[--storage-limit <integer>]
[--maintenance-start <timestamp>]
[--clear-maintenance-start]
[--maintenance-duration <duration>]
[--heartbeat-interval <interval>]
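
For example (a sketch, not a captured session), cancelling a scheduled maintenance window with the --clear-maintenance-start flag from the usage above:

isi event settings modify --clear-maintenance-start
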
  • Every event has two ID numbers that help to establish the context of the event.
  • The event type ID identifies the type of event that has occurred.
  • The event instance ID is a unique number that is specific to a particular occurrence of an event type.
  • When an event is submitted to the kernel queue, an event instance ID is assigned.
  • You can reference the instance ID to determine the exact time that an event occurred.
  • You can view individual events. However, you manage events and alerts at the event group level.
BIC-Isilon-Cluster-3# isi event events list

ID       Occurred    Sev  Lnn  Eventgroup ID  Message             
--------------------------------------------------------------------------------------------------------
1.426    04/19 13:28 U    0    1              Resolved from PAPI                              
3.309    04/19 11:11 C    4    1              External network link ext-1 (igb0) down                                                                                      
2.530    04/20 00:00 I    2    131            Heartbeat Event
3.545    04/27 22:09 C    4    131101         Disk Repair Complete: Bay 2, Type HDD, LNUM 34. Replace the drive according to the instructions in the OneFS Help system.
1.664    05/05 12:05 U    0    131124         Resolved from PAPI
3.563    05/03 23:56 C    4    131124         One or more drives (bay(s) 2 / type(s) HDD) are ready to be replaced.
3.551    04/27 22:19 C    4    131124         One or more drives (bay(s) 2 / type(s) HDD) are ready to be replaced.                                                               

BIC-Isilon-Cluster-3# isi event events view 3.545   
           ID: 3.545
Eventgroup ID: 131101
   Event Type: 100010010
      Message: Disk Repair Complete: Bay 2, Type HDD, LNUM 34. Replace the drive according to the instructions in the OneFS Help system.
        Devid: 3
          Lnn: 4
         Time: 2016-04-27T22:09:12
     Severity: critical
        Value: 0.0

BIC-Isilon-Cluster-3# isi event groups list
ID    Started     Ended       Causes Short                    Events  Severity   
---------------------------------------------------------------------------------
3     04/19 11:10 04/19 13:28 external_network                2       critical   
2     04/19 11:10 04/19 13:28 external_network                2       critical   
4     04/19 11:10 04/19 11:16 NODE_STATUS_OFFLINE             2       critical   
1     04/19 11:11 04/19 13:28 external_network                2       critical   
24    04/19 11:16 04/19 11:16 NODE_STATUS_ONLINE              1       information
26    04/19 11:23 04/19 13:28 external_network                2       critical   
27    04/19 11:31 04/19 13:28 WINNET_AUTH_NIS_SERVERS_UNREACH 6       critical   
32    04/19 12:43 04/19 12:44 HW_IPMI_POWER_SUPPLY_STATUS_REG 4       critical   
...
524525 05/30 02:18 05/30 02:18 SYS_DISK_REMOVED               1       critical   
524538 05/30 02:19 --          SYS_DISK_UNHEALTHY             3       critical   
...

BIC-Isilon-Cluster-3# isi event groups view 524525
         ID: 524525
    Started: 05/30 02:18
Causes Long: Disk Repair Complete: Bay 18, Type HDD, LNUM 27. Replace the drive according to the instructions in the OneFS Help system.
 Last Event: 2016-05-30T02:18:16
     Ignore: No
Ignore Time: Never
   Resolved: Yes
      Ended: 05/30 02:18
     Events: 1
   Severity: critical

BIC-Isilon-Cluster-3# isi event groups view 524538
         ID: 524538
    Started: 05/30 02:19
Causes Long: One or more drives (bay(s) 18 / type(s) HDD) are ready to be replaced.
 Last Event: 2016-06-04T12:42:09
     Ignore: No
Ignore Time: Never
   Resolved: No
      Ended: --
     Events: 3
   Severity: critical

Scheduling A Maintenance Window

  • You can schedule a maintenance window by setting a maintenance start time and duration.
  • During a scheduled maintenance window, the system continues to log events, but no alerts are generated.
  • This keeps channels from being flooded by benign alerts associated with cluster maintenance procedures.
  • Active event groups automatically resume generating alerts when the scheduled maintenance period ends.
  • Schedule a maintenance window by running the isi event settings modify command.
  • The following example schedules a maintenance window that begins on September 1, 2015 at 11:00 pm and lasts for two days:
isi event settings modify --maintenance-start 2015-09-01T23:00:00 --maintenance-duration 2D
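
To confirm the window is set (a sketch, using the same view command shown earlier in this section), check that the Maintenance Start and Maintenance Duration fields reflect the new window:

isi event settings view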

Hardware, Devices and Nodes

Storage Pool Protection Level

  • Default and suggested protection level for a cluster size less than 2PB is +2d:1n.
  • A +2d:1n protection level implies that the cluster can recover from two simultaneous drive failures or one node failure without sustaining any data loss.
  • The parity overhead is 20% for a 5-node cluster with a +2d:1n protection level: each protection stripe spans two drive units per node (10 units across 5 nodes), two of which hold FEC, so 2/10 = 20%.
BIC-Isilon-Cluster-4# isi storagepool list         
Name            Nodes  Requested Protection  HDD     Total     %     SSD  Total  %    
--------------------------------------------------------------------------------------
x410_144tb_64gb 1-5    +2d:1n                1.1190T 641.6275T 0.17% 0    0      0.00%
--------------------------------------------------------------------------------------
Total: 1                                     1.1190T 641.6275T 0.17% 0    0      0.00% 

Hardware status on a specific node:

BIC-Isilon-Cluster-4# isi_hw_status
  SerNo: SX410-301608-0264
 Config: 400-0049-03
FamCode: X
ChsCode: 4U
GenCode: 10
Product: X410-4U-Dual-64GB-2x1GE-2x10GE SFP+-144TB
  HWGen: CTO (CTO Hardware)
Chassis: ISI36V3 (Isilon 36-Bay(V3) Chassis)
    CPU: GenuineIntel (2.00GHz, stepping 0x000306e4)
   PROC: Dual-proc, Octa-HT-core
    RAM: 68602642432 Bytes
   Mobo: IntelS2600CP (Intel S2600CP Motherboard)
  NVRam: LX4381 (Isilon LOx NVRam Card) (2016MB card) (size 2113798144B)
FlshDrv: None (No physical dongle supported) ((null))
 DskCtl: LSI2308SAS2 (LSI 2308 SAS Controller) (8 ports)
 DskExp: LSISAS2X24_X2 (LSI SAS2x24 SAS Expander (Qty 2))
PwrSupl: PS1 (type=ACBEL POLYTECH , fw=03.03)
PwrSupl: PS2 (type=ACBEL POLYTECH , fw=03.03)
ChasCnt: 1 (Single-Chassis System)
  NetIF: ib1,ib0,igb0,igb1,bxe0,bxe1
 IBType: MT4099 QDR (Mellanox MT4099 IB QDR Card)
 LCDver: IsiVFD1 (Isilon VFD V1)
    IMB: Board Version 0xffffffff
Power Supplies OK
Power Supply 1 good
Power Supply 2 good
CPU Operation (raw 0x88390000)  = Normal
CPU Speed Limit                 = 100.00%
FAN TAC SENSOR 1                = 8800.000
FAN TAC SENSOR 2                = 8800.000
FAN TAC SENSOR 3                = 8800.000
PS FAN SPEED 1                  = 9600.000
PS FAN SPEED 2                  = 9500.000
BB +12.0V                       = 11.935
BB +5.0V                        = 4.937
BB +3.3V                        = 3.268
BB +5.0V STBY                   = 4.894
BB +3.3V AUX                    = 3.268
BB +1.05V P1Vccp                = 0.828
BB +1.05V P2Vccp                = 0.822
BB +1.5 P1DDR AB                = na
BB +1.5 P1DDR CD                = na
BB +1.5 P2DDR AB                = na
BB +1.5 P2DDR CD                = na
BB +1.8V AUX                    = 1.769
BB +1.1V STBY                   = 1.081
BB VBAT                         = 3.120
BB +1.35 P1LV AB                = 1.342
BB +1.35 P1LV CD                = 1.348
BB +1.35 P2LV AB                = 1.378
BB +1.35 P2LV CD                = 1.348
VCC_12V0                        = 12.100
VCC_5V0                         = 5.000
VCC_3V3                         = 3.300
VCC_1V8                         = 1.800
VCC_5V0_SB                      = 4.900
VCC_1V0                         = 0.990
VCC_5V0_CBL                     = 5.000
VCC_SW                          = 4.900
VBATT_1                         = 4.000
VBATT_2                         = 4.000
PS IN VOLT 1                    = 241.000
PS IN VOLT 2                    = 241.000
PS OUT VOLT 1                   = 12.300
PS OUT VOLT 2                   = 12.300
PS IN CURR 1                    = 1.200
PS IN CURR 2                    = 1.200
PS OUT CURR 1                   = 19.000
PS OUT CURR 2                   = 19.500
Front Panel Temp                = 20.6
BB EDGE Temp                    = 25.000
BB BMC Temp                     = 34.000
BB P2 VR Temp                   = 30.000
BB MEM VR Temp                  = 28.000
LAN NIC Temp                    = 42.000
P1 Therm Margin                 = -56.000
P2 Therm Margin                 = -58.000
P1 DTS Therm Mgn                = -56.000
P2 DTS Therm Mgn                = -58.000
DIMM Thrm Mrgn 1                = -66.000
DIMM Thrm Mrgn 2                = -68.000
DIMM Thrm Mrgn 3                = -67.000
DIMM Thrm Mrgn 4                = -66.000
TEMP SENSOR 1                   = 23.000
PS TEMP 1                       = 28.000
PS TEMP 2                       = 28.000
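
To pull a single field from every node at once, isi_for_array combines nicely with the commands above (a sketch, not a captured run):

isi_for_array "isi_hw_status | grep SerNo"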

List devices on node 5 (by logical node number):

BIC-Isilon-Cluster-1# isi devices list --node-lnn 5
Lnn  Location  Device    Lnum  State   Serial  
-----------------------------------------------
5    Bay  1    /dev/da1  35    HEALTHY S1Z1S6BY
5    Bay  2    /dev/da2  34    HEALTHY Z1ZAECJM
5    Bay  3    /dev/da19 17    HEALTHY S1Z1SB0L
5    Bay  4    /dev/da20 16    HEALTHY S1Z1SAYP
5    Bay  5    /dev/da3  33    HEALTHY Z1ZA74A4
5    Bay  6    /dev/da21 15    HEALTHY Z1ZAEQ13
5    Bay  7    /dev/da22 14    HEALTHY S1Z1SAF5
5    Bay  8    /dev/da23 13    HEALTHY S1Z1SB0C
5    Bay  9    /dev/da4  32    HEALTHY Z1ZAEPR8
5    Bay 10    /dev/da24 12    HEALTHY Z1ZAB3ZD
5    Bay 11    /dev/da25 11    HEALTHY S1Z1RYGS
5    Bay 12    /dev/da26 10    HEALTHY S1Z1SB0A
5    Bay 13    /dev/da5  31    HEALTHY Z1ZAEPS5
5    Bay 14    /dev/da6  30    HEALTHY Z1ZAF5GQ
5    Bay 15    /dev/da7  29    HEALTHY Z1ZAB40S
5    Bay 16    /dev/da27 9     HEALTHY Z1ZAF625
5    Bay 17    /dev/da8  28    HEALTHY Z1ZAEPJY
5    Bay 18    /dev/da9  27    HEALTHY Z1ZAF1LG
5    Bay 19    /dev/da10 26    HEALTHY Z1ZAF724
5    Bay 20    /dev/da28 8     HEALTHY Z1ZAF5W8
5    Bay 21    /dev/da11 25    HEALTHY Z1ZAEW1W
5    Bay 22    /dev/da12 24    HEALTHY Z1ZAF0CW
5    Bay 23    /dev/da29 7     HEALTHY Z1ZAF5VM
5    Bay 24    /dev/da30 6     HEALTHY Z1ZAF59X
5    Bay 25    /dev/da31 5     HEALTHY Z1ZAF21G
5    Bay 26    /dev/da32 4     HEALTHY Z1ZAF5QJ
5    Bay 27    /dev/da33 3     HEALTHY Z1ZAF58Y
5    Bay 28    /dev/da13 23    HEALTHY Z1ZAF6CG
5    Bay 29    /dev/da34 2     HEALTHY Z1ZAB3XJ
5    Bay 30    /dev/da14 22    HEALTHY S1Z1RYHB
5    Bay 31    /dev/da35 1     HEALTHY Z1ZAB3TQ
5    Bay 32    /dev/da15 21    HEALTHY Z1ZAEPYX
5    Bay 33    /dev/da36 0     HEALTHY Z1ZAF4Z0
5    Bay 34    /dev/da16 20    HEALTHY Z1ZAEPMC
5    Bay 35    /dev/da17 19    HEALTHY Z1ZAF4H4
5    Bay 36    /dev/da18 18    HEALTHY Z1ZAF6JA
-----------------------------------------------
Total: 36

Use the isi_for_array command to select node 4 and list its drives:

BIC-Isilon-Cluster-1# isi_for_array -n4 isi devices drive list
BIC-Isilon-Cluster-4: Lnn  Location  Device    Lnum  State   Serial  
BIC-Isilon-Cluster-4: -----------------------------------------------
BIC-Isilon-Cluster-4: 4    Bay  1    /dev/da1  35    HEALTHY S1Z1STTN
BIC-Isilon-Cluster-4: 4    Bay  2    /dev/da2  36    HEALTHY Z1Z9XE67
BIC-Isilon-Cluster-4: 4    Bay  3    /dev/da19 17    HEALTHY S1Z1NE5B
BIC-Isilon-Cluster-4: 4    Bay  4    /dev/da20 16    HEALTHY S1Z1QQBN
BIC-Isilon-Cluster-4: 4    Bay  5    /dev/da3  33    HEALTHY S1Z1RYJ0
BIC-Isilon-Cluster-4: 4    Bay  6    /dev/da21 15    HEALTHY S1Z1SL53
BIC-Isilon-Cluster-4: 4    Bay  7    /dev/da22 14    HEALTHY S1Z1QNVG
BIC-Isilon-Cluster-4: 4    Bay  8    /dev/da23 13    HEALTHY S1Z1R8TT
BIC-Isilon-Cluster-4: 4    Bay  9    /dev/da4  32    HEALTHY S1Z1SLDG
BIC-Isilon-Cluster-4: 4    Bay 10    /dev/da24 12    HEALTHY S1Z1RVGX
BIC-Isilon-Cluster-4: 4    Bay 11    /dev/da25 11    HEALTHY S1Z1QNSG
BIC-Isilon-Cluster-4: 4    Bay 12    /dev/da26 10    HEALTHY S1Z1NEGJ
BIC-Isilon-Cluster-4: 4    Bay 13    /dev/da5  31    HEALTHY S1Z1QR9E
BIC-Isilon-Cluster-4: 4    Bay 14    /dev/da6  30    HEALTHY S1Z1SL23
BIC-Isilon-Cluster-4: 4    Bay 15    /dev/da7  29    HEALTHY S1Z1NEPA
BIC-Isilon-Cluster-4: 4    Bay 16    /dev/da27 9     HEALTHY S1Z1SLAZ
BIC-Isilon-Cluster-4: 4    Bay 17    /dev/da8  28    HEALTHY S1Z1STT6
BIC-Isilon-Cluster-4: 4    Bay 18    /dev/da9  27    HEALTHY S1Z1SL2W
BIC-Isilon-Cluster-4: 4    Bay 19    /dev/da10 26    HEALTHY S1Z1SL4P
BIC-Isilon-Cluster-4: 4    Bay 20    /dev/da28 8     HEALTHY S1Z1QS4J
BIC-Isilon-Cluster-4: 4    Bay 21    /dev/da11 25    HEALTHY S1Z1SAXY
BIC-Isilon-Cluster-4: 4    Bay 22    /dev/da12 24    HEALTHY S1Z1SL9J
BIC-Isilon-Cluster-4: 4    Bay 23    /dev/da29 7     HEALTHY S1Z1NFS6
BIC-Isilon-Cluster-4: 4    Bay 24    /dev/da30 6     HEALTHY S1Z1NE26
BIC-Isilon-Cluster-4: 4    Bay 25    /dev/da31 5     HEALTHY S1Z1RX6H
BIC-Isilon-Cluster-4: 4    Bay 26    /dev/da32 4     HEALTHY S1Z1QRTK
BIC-Isilon-Cluster-4: 4    Bay 27    /dev/da33 3     HEALTHY S1Z1SAWG
BIC-Isilon-Cluster-4: 4    Bay 28    /dev/da13 23    HEALTHY S1Z1QR5B
BIC-Isilon-Cluster-4: 4    Bay 29    /dev/da34 2     HEALTHY S1Z1RVEK
BIC-Isilon-Cluster-4: 4    Bay 30    /dev/da14 22    HEALTHY S1Z1SLAN
BIC-Isilon-Cluster-4: 4    Bay 31    /dev/da35 1     HEALTHY S1Z1QPES
BIC-Isilon-Cluster-4: 4    Bay 32    /dev/da15 21    HEALTHY S1Z1SLBR
BIC-Isilon-Cluster-4: 4    Bay 33    /dev/da36 0     HEALTHY S1Z1SAXM
BIC-Isilon-Cluster-4: 4    Bay 34    /dev/da16 20    HEALTHY S1Z1RVJX
BIC-Isilon-Cluster-4: 4    Bay 35    /dev/da17 19    HEALTHY S1Z1RV62
BIC-Isilon-Cluster-4: 4    Bay 36    /dev/da18 18    HEALTHY S1Z1RYH9
BIC-Isilon-Cluster-4: -----------------------------------------------
BIC-Isilon-Cluster-4: Total: 36

Loop through the cluster nodes and grep for non-healthy drives:

BIC-Isilon-Cluster-4# isi_for_array "isi devices drive list| grep -iv healthy"
BIC-Isilon-Cluster-2: Lnn  Location  Device    Lnum  State   Serial  
BIC-Isilon-Cluster-2: -----------------------------------------------
BIC-Isilon-Cluster-2: -----------------------------------------------
BIC-Isilon-Cluster-2: Total: 36
BIC-Isilon-Cluster-3: Lnn  Location  Device    Lnum  State   Serial  
BIC-Isilon-Cluster-3: -----------------------------------------------
BIC-Isilon-Cluster-3: -----------------------------------------------
BIC-Isilon-Cluster-3: Total: 36
BIC-Isilon-Cluster-4: Lnn  Location  Device    Lnum  State   Serial  
BIC-Isilon-Cluster-4: -----------------------------------------------
BIC-Isilon-Cluster-4: 4    Bay  2    -         N/A   REPLACE -       
BIC-Isilon-Cluster-4: -----------------------------------------------
BIC-Isilon-Cluster-4: Total: 36
BIC-Isilon-Cluster-1: Lnn  Location  Device    Lnum  State   Serial  
BIC-Isilon-Cluster-1: -----------------------------------------------
BIC-Isilon-Cluster-1: -----------------------------------------------
BIC-Isilon-Cluster-1: Total: 36
BIC-Isilon-Cluster-5: Lnn  Location  Device    Lnum  State   Serial  
BIC-Isilon-Cluster-5: -----------------------------------------------
BIC-Isilon-Cluster-5: -----------------------------------------------
BIC-Isilon-Cluster-5: Total: 36
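
A quieter variant (a sketch, not captured output): run the list on all nodes but filter the combined result locally, so nodes with no problem drives print nothing:

isi_for_array "isi devices drive list" | grep -i bay | grep -iv healthy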

View Drive Firmware Status

BIC-Isilon-Cluster-5# isi devices drive firmware list --node-lnn all
Lnn  Location  Firmware  Desired  Model              
-----------------------------------------------------
1    Bay  1    SNG4      -        ST4000NM0033-9ZM170
1    Bay  2    SNG4      -        ST4000NM0033-9ZM170
1    Bay  3    SNG4      -        ST4000NM0033-9ZM170
1    Bay  4    SNG4      -        ST4000NM0033-9ZM170
1    Bay  5    SNG4      -        ST4000NM0033-9ZM170
1    Bay  6    SNG4      -        ST4000NM0033-9ZM170
1    Bay  7    SNG4      -        ST4000NM0033-9ZM170
1    Bay  8    SNG4      -        ST4000NM0033-9ZM170
1    Bay  9    SNG4      -        ST4000NM0033-9ZM170
1    Bay 10    SNG4      -        ST4000NM0033-9ZM170
1    Bay 11    SNG4      -        ST4000NM0033-9ZM170
1    Bay 12    SNG4      -        ST4000NM0033-9ZM170
1    Bay 13    SNG4      -        ST4000NM0033-9ZM170
1    Bay 14    SNG4      -        ST4000NM0033-9ZM170
1    Bay 15    SNG4      -        ST4000NM0033-9ZM170
1    Bay 16    SNG4      -        ST4000NM0033-9ZM170
1    Bay 17    SNG4      -        ST4000NM0033-9ZM170
1    Bay 18    SNG4      -        ST4000NM0033-9ZM170
1    Bay 19    SNG4      -        ST4000NM0033-9ZM170
1    Bay 20    SNG4      -        ST4000NM0033-9ZM170
1    Bay 21    SNG4      -        ST4000NM0033-9ZM170
1    Bay 22    SNG4      -        ST4000NM0033-9ZM170
1    Bay 23    SNG4      -        ST4000NM0033-9ZM170
1    Bay 24    SNG4      -        ST4000NM0033-9ZM170
1    Bay 25    SNG4      -        ST4000NM0033-9ZM170
1    Bay 26    SNG4      -        ST4000NM0033-9ZM170
1    Bay 27    SNG4      -        ST4000NM0033-9ZM170
1    Bay 28    SNG4      -        ST4000NM0033-9ZM170
1    Bay 29    SNG4      -        ST4000NM0033-9ZM170
1    Bay 30    SNG4      -        ST4000NM0033-9ZM170
1    Bay 31    SNG4      -        ST4000NM0033-9ZM170
1    Bay 32    SNG4      -        ST4000NM0033-9ZM170
1    Bay 33    SNG4      -        ST4000NM0033-9ZM170
1    Bay 34    SNG4      -        ST4000NM0033-9ZM170
1    Bay 35    SNG4      -        ST4000NM0033-9ZM170
1    Bay 36    SNG4      -        ST4000NM0033-9ZM170
2    Bay  1    SNG4      -        ST4000NM0033-9ZM170
2    Bay  2    SNG4      -        ST4000NM0033-9ZM170
2    Bay  3    SNG4      -        ST4000NM0033-9ZM170
2    Bay  4    SNG4      -        ST4000NM0033-9ZM170
2    Bay  5    SNG4      -        ST4000NM0033-9ZM170
2    Bay  6    SNG4      -        ST4000NM0033-9ZM170
2    Bay  7    SNG4      -        ST4000NM0033-9ZM170
2    Bay  8    SNG4      -        ST4000NM0033-9ZM170
2    Bay  9    SNG4      -        ST4000NM0033-9ZM170
2    Bay 10    SNG4      -        ST4000NM0033-9ZM170
2    Bay 11    SNG4      -        ST4000NM0033-9ZM170
2    Bay 12    SNG4      -        ST4000NM0033-9ZM170
2    Bay 13    SNG4      -        ST4000NM0033-9ZM170
2    Bay 14    SNG4      -        ST4000NM0033-9ZM170
2    Bay 15    SNG4      -        ST4000NM0033-9ZM170
2    Bay 16    SNG4      -        ST4000NM0033-9ZM170
2    Bay 17    SNG4      -        ST4000NM0033-9ZM170
2    Bay 18    SNG4      -        ST4000NM0033-9ZM170
2    Bay 19    SNG4      -        ST4000NM0033-9ZM170
2    Bay 20    SNG4      -        ST4000NM0033-9ZM170
2    Bay 21    SNG4      -        ST4000NM0033-9ZM170
2    Bay 22    SNG4      -        ST4000NM0033-9ZM170
2    Bay 23    SNG4      -        ST4000NM0033-9ZM170
2    Bay 24    SNG4      -        ST4000NM0033-9ZM170
2    Bay 25    SNG4      -        ST4000NM0033-9ZM170
2    Bay 26    SNG4      -        ST4000NM0033-9ZM170
2    Bay 27    SNG4      -        ST4000NM0033-9ZM170
2    Bay 28    SNG4      -        ST4000NM0033-9ZM170
2    Bay 29    SNG4      -        ST4000NM0033-9ZM170
2    Bay 30    SNG4      -        ST4000NM0033-9ZM170
2    Bay 31    SNG4      -        ST4000NM0033-9ZM170
2    Bay 32    SNG4      -        ST4000NM0033-9ZM170
2    Bay 33    SNG4      -        ST4000NM0033-9ZM170
2    Bay 34    SNG4      -        ST4000NM0033-9ZM170
2    Bay 35    SNG4      -        ST4000NM0033-9ZM170
2    Bay 36    SNG4      -        ST4000NM0033-9ZM170
4    Bay  1    SNG4      -        ST4000NM0033-9ZM170
4    Bay  2    SNG4      -        ST4000NM0033-9ZM170
4    Bay  3    SNG4      -        ST4000NM0033-9ZM170
4    Bay  4    SNG4      -        ST4000NM0033-9ZM170
4    Bay  5    SNG4      -        ST4000NM0033-9ZM170
4    Bay  6    SNG4      -        ST4000NM0033-9ZM170
4    Bay  7    SNG4      -        ST4000NM0033-9ZM170
4    Bay  8    SNG4      -        ST4000NM0033-9ZM170
4    Bay  9    SNG4      -        ST4000NM0033-9ZM170
4    Bay 10    SNG4      -        ST4000NM0033-9ZM170
4    Bay 11    SNG4      -        ST4000NM0033-9ZM170
4    Bay 12    SNG4      -        ST4000NM0033-9ZM170
4    Bay 13    SNG4      -        ST4000NM0033-9ZM170
4    Bay 14    SNG4      -        ST4000NM0033-9ZM170
4    Bay 15    SNG4      -        ST4000NM0033-9ZM170
4    Bay 16    SNG4      -        ST4000NM0033-9ZM170
4    Bay 17    SNG4      -        ST4000NM0033-9ZM170
4    Bay 18    SNG4      -        ST4000NM0033-9ZM170
4    Bay 19    SNG4      -        ST4000NM0033-9ZM170
4    Bay 20    SNG4      -        ST4000NM0033-9ZM170
4    Bay 21    SNG4      -        ST4000NM0033-9ZM170
4    Bay 22    SNG4      -        ST4000NM0033-9ZM170
4    Bay 23    SNG4      -        ST4000NM0033-9ZM170
4    Bay 24    SNG4      -        ST4000NM0033-9ZM170
4    Bay 25    SNG4      -        ST4000NM0033-9ZM170
4    Bay 26    SNG4      -        ST4000NM0033-9ZM170
4    Bay 27    SNG4      -        ST4000NM0033-9ZM170
4    Bay 28    SNG4      -        ST4000NM0033-9ZM170
4    Bay 29    SNG4      -        ST4000NM0033-9ZM170
4    Bay 30    SNG4      -        ST4000NM0033-9ZM170
4    Bay 31    SNG4      -        ST4000NM0033-9ZM170
4    Bay 32    SNG4      -        ST4000NM0033-9ZM170
4    Bay 33    SNG4      -        ST4000NM0033-9ZM170
4    Bay 34    SNG4      -        ST4000NM0033-9ZM170
4    Bay 35    SNG4      -        ST4000NM0033-9ZM170
4    Bay 36    SNG4      -        ST4000NM0033-9ZM170
3    Bay  1    SNG4      -        ST4000NM0033-9ZM170
3    Bay  2    SNG4      -        ST4000NM0033-9ZM170
3    Bay  3    SNG4      -        ST4000NM0033-9ZM170
3    Bay  4    SNG4      -        ST4000NM0033-9ZM170
3    Bay  5    SNG4      -        ST4000NM0033-9ZM170
3    Bay  6    SNG4      -        ST4000NM0033-9ZM170
3    Bay  7    SNG4      -        ST4000NM0033-9ZM170
3    Bay  8    SNG4      -        ST4000NM0033-9ZM170
3    Bay  9    SNG4      -        ST4000NM0033-9ZM170
3    Bay 10    SNG4      -        ST4000NM0033-9ZM170
3    Bay 11    SNG4      -        ST4000NM0033-9ZM170
3    Bay 12    SNG4      -        ST4000NM0033-9ZM170
3    Bay 13    SNG4      -        ST4000NM0033-9ZM170
3    Bay 14    SNG4      -        ST4000NM0033-9ZM170
3    Bay 15    SNG4      -        ST4000NM0033-9ZM170
3    Bay 16    SNG4      -        ST4000NM0033-9ZM170
3    Bay 17    SNG4      -        ST4000NM0033-9ZM170
3    Bay 18    SNG4      -        ST4000NM0033-9ZM170
3    Bay 19    SNG4      -        ST4000NM0033-9ZM170
3    Bay 20    SNG4      -        ST4000NM0033-9ZM170
3    Bay 21    SNG4      -        ST4000NM0033-9ZM170
3    Bay 22    SNG4      -        ST4000NM0033-9ZM170
3    Bay 23    SNG4      -        ST4000NM0033-9ZM170
3    Bay 24    SNG4      -        ST4000NM0033-9ZM170
3    Bay 25    SNG4      -        ST4000NM0033-9ZM170
3    Bay 26    SNG4      -        ST4000NM0033-9ZM170
3    Bay 27    SNG4      -        ST4000NM0033-9ZM170
3    Bay 28    SNG4      -        ST4000NM0033-9ZM170
3    Bay 29    SNG4      -        ST4000NM0033-9ZM170
3    Bay 30    SNG4      -        ST4000NM0033-9ZM170
3    Bay 31    SNG4      -        ST4000NM0033-9ZM170
3    Bay 32    SNG4      -        ST4000NM0033-9ZM170
3    Bay 33    SNG4      -        ST4000NM0033-9ZM170
3    Bay 34    SNG4      -        ST4000NM0033-9ZM170
3    Bay 35    SNG4      -        ST4000NM0033-9ZM170
3    Bay 36    SNG4      -        ST4000NM0033-9ZM170
5    Bay  1    SNG4      -        ST4000NM0033-9ZM170
5    Bay  2    SNG4      -        ST4000NM0033-9ZM170
5    Bay  3    SNG4      -        ST4000NM0033-9ZM170
5    Bay  4    SNG4      -        ST4000NM0033-9ZM170
5    Bay  5    SNG4      -        ST4000NM0033-9ZM170
5    Bay  6    SNG4      -        ST4000NM0033-9ZM170
5    Bay  7    SNG4      -        ST4000NM0033-9ZM170
5    Bay  8    SNG4      -        ST4000NM0033-9ZM170
5    Bay  9    SNG4      -        ST4000NM0033-9ZM170
5    Bay 10    SNG4      -        ST4000NM0033-9ZM170
5    Bay 11    SNG4      -        ST4000NM0033-9ZM170
5    Bay 12    SNG4      -        ST4000NM0033-9ZM170
5    Bay 13    SNG4      -        ST4000NM0033-9ZM170
5    Bay 14    SNG4      -        ST4000NM0033-9ZM170
5    Bay 15    SNG4      -        ST4000NM0033-9ZM170
5    Bay 16    SNG4      -        ST4000NM0033-9ZM170
5    Bay 17    SNG4      -        ST4000NM0033-9ZM170
5    Bay 18    SNG4      -        ST4000NM0033-9ZM170
5    Bay 19    SNG4      -        ST4000NM0033-9ZM170
5    Bay 20    SNG4      -        ST4000NM0033-9ZM170
5    Bay 21    SNG4      -        ST4000NM0033-9ZM170
5    Bay 22    SNG4      -        ST4000NM0033-9ZM170
5    Bay 23    SNG4      -        ST4000NM0033-9ZM170
5    Bay 24    SNG4      -        ST4000NM0033-9ZM170
5    Bay 25    SNG4      -        ST4000NM0033-9ZM170
5    Bay 26    SNG4      -        ST4000NM0033-9ZM170
5    Bay 27    SNG4      -        ST4000NM0033-9ZM170
5    Bay 28    SNG4      -        ST4000NM0033-9ZM170
5    Bay 29    SNG4      -        ST4000NM0033-9ZM170
5    Bay 30    SNG4      -        ST4000NM0033-9ZM170
5    Bay 31    SNG4      -        ST4000NM0033-9ZM170
5    Bay 32    SNG4      -        ST4000NM0033-9ZM170
5    Bay 33    SNG4      -        ST4000NM0033-9ZM170
5    Bay 34    SNG4      -        ST4000NM0033-9ZM170
5    Bay 35    SNG4      -        ST4000NM0033-9ZM170
5    Bay 36    SNG4      -        ST4000NM0033-9ZM170
-----------------------------------------------------
Total: 180                                            

Add a drive to a node:

BIC-Isilon-Cluster-5# isi devices add <bay> --node-lnn=<node#>
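
For example, to add the drive in bay 4 of node 5 (the exact invocation used in the procedure below):

isi devices add 4 --node-lnn=5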

Disk Failure Replacement Procedure

  • A disk in bay 4 of Logical Node Number 5 (node 5) is bad.
### Remove bad disk, insert new one.

# List disk devices on node 5:

BIC-Isilon-Cluster-4# isi_for_array -n5 isi devices drive list
BIC-Isilon-Cluster-5: Lnn  Location  Device    Lnum  State   Serial  
BIC-Isilon-Cluster-5: -----------------------------------------------
BIC-Isilon-Cluster-5: 5    Bay  1    /dev/da1  35    HEALTHY S1Z1S6BY
BIC-Isilon-Cluster-5: 5    Bay  2    /dev/da2  34    HEALTHY Z1ZAECJM
BIC-Isilon-Cluster-5: 5    Bay  3    /dev/da19 17    HEALTHY S1Z1SB0L
BIC-Isilon-Cluster-5: 5    Bay  4    /dev/da20 N/A   NEW     K4K73KGB
BIC-Isilon-Cluster-5: 5    Bay  5    /dev/da3  33    HEALTHY Z1ZA74A4
BIC-Isilon-Cluster-5: 5    Bay  6    /dev/da21 15    HEALTHY Z1ZAEQ13
BIC-Isilon-Cluster-5: 5    Bay  7    /dev/da22 14    HEALTHY S1Z1SAF5
BIC-Isilon-Cluster-5: 5    Bay  8    /dev/da23 13    HEALTHY S1Z1SB0C
BIC-Isilon-Cluster-5: 5    Bay  9    /dev/da4  32    HEALTHY Z1ZAEPR8
BIC-Isilon-Cluster-5: 5    Bay 10    /dev/da24 36    HEALTHY S1Z26JWM
BIC-Isilon-Cluster-5: 5    Bay 11    /dev/da25 11    HEALTHY S1Z1RYGS
BIC-Isilon-Cluster-5: 5    Bay 12    /dev/da26 10    HEALTHY S1Z1SB0A
BIC-Isilon-Cluster-5: 5    Bay 13    /dev/da5  31    HEALTHY Z1ZAEPS5
BIC-Isilon-Cluster-5: 5    Bay 14    /dev/da6  30    HEALTHY Z1ZAF5GQ
BIC-Isilon-Cluster-5: 5    Bay 15    /dev/da7  29    HEALTHY Z1ZAB40S
BIC-Isilon-Cluster-5: 5    Bay 16    /dev/da27 9     HEALTHY Z1ZAF625
BIC-Isilon-Cluster-5: 5    Bay 17    /dev/da8  28    HEALTHY Z1ZAEPJY
BIC-Isilon-Cluster-5: 5    Bay 18    /dev/da9  27    HEALTHY Z1ZAF1LG
BIC-Isilon-Cluster-5: 5    Bay 19    /dev/da10 26    HEALTHY Z1ZAF724
BIC-Isilon-Cluster-5: 5    Bay 20    /dev/da28 8     HEALTHY Z1ZAF5W8
BIC-Isilon-Cluster-5: 5    Bay 21    /dev/da11 25    HEALTHY Z1ZAEW1W
BIC-Isilon-Cluster-5: 5    Bay 22    /dev/da12 24    HEALTHY Z1ZAF0CW
BIC-Isilon-Cluster-5: 5    Bay 23    /dev/da29 7     HEALTHY Z1ZAF5VM
BIC-Isilon-Cluster-5: 5    Bay 24    /dev/da30 6     HEALTHY Z1ZAF59X
BIC-Isilon-Cluster-5: 5    Bay 25    /dev/da31 5     HEALTHY Z1ZAF21G
BIC-Isilon-Cluster-5: 5    Bay 26    /dev/da32 4     HEALTHY Z1ZAF5QJ
BIC-Isilon-Cluster-5: 5    Bay 27    /dev/da33 3     HEALTHY Z1ZAF58Y
BIC-Isilon-Cluster-5: 5    Bay 28    /dev/da13 23    HEALTHY Z1ZAF6CG
BIC-Isilon-Cluster-5: 5    Bay 29    /dev/da34 2     HEALTHY Z1ZAB3XJ
BIC-Isilon-Cluster-5: 5    Bay 30    /dev/da14 22    HEALTHY S1Z1RYHB
BIC-Isilon-Cluster-5: 5    Bay 31    /dev/da35 1     HEALTHY Z1ZAB3TQ
BIC-Isilon-Cluster-5: 5    Bay 32    /dev/da15 21    HEALTHY Z1ZAEPYX
BIC-Isilon-Cluster-5: 5    Bay 33    /dev/da36 0     HEALTHY Z1ZAF4Z0
BIC-Isilon-Cluster-5: 5    Bay 34    /dev/da16 20    HEALTHY Z1ZAEPMC
BIC-Isilon-Cluster-5: 5    Bay 35    /dev/da17 19    HEALTHY Z1ZAF4H4
BIC-Isilon-Cluster-5: 5    Bay 36    /dev/da18 18    HEALTHY Z1ZAF6JA
BIC-Isilon-Cluster-5: -----------------------------------------------
BIC-Isilon-Cluster-5: Total: 36

# Check cluster status:

BIC-Isilon-Cluster-4# isi status

Cluster Name: BIC-Isilon-Cluster
Cluster Health:     [ ATTN]
Cluster Storage:  HDD                 SSD Storage    
Size:             638.0T (645.7T Raw) 0 (0 Raw)      
VHS Size:         7.7T                
Used:             204.8T (32%)        0 (n/a)        
Avail:            433.2T (68%)        0 (n/a)        

                   Health  Throughput (bps)  HDD Storage      SSD Storage
ID |IP Address     |DASR |  In   Out  Total| Used / Size     |Used / Size
---+---------------+-----+-----+-----+-----+-----------------+-----------------
  1|172.16.10.20   | OK  |28.8M|83.3k|28.9M|41.0T/ 130T( 32%)|(No Storage SSDs)
  2|172.16.10.21   | OK  | 1.2M| 3.7M| 4.9M|40.9T/ 130T( 32%)|(No Storage SSDs)
  3|172.16.10.22   | OK  | 3.2k| 167k| 170k|41.0T/ 130T( 32%)|(No Storage SSDs)
  4|172.16.10.23   | OK  | 861k|49.2M|50.0M|41.0T/ 130T( 32%)|(No Storage SSDs)
  5|172.16.10.24   |-A-- | 858k| 1.6M| 2.5M|41.0T/ 126T( 32%)|(No Storage SSDs)
---+---------------+-----+-----+-----+-----+-----------------+-----------------
Cluster Totals:          |31.7M|54.7M|86.4M| 205T/ 638T( 32%)|(No Storage SSDs)

     Health Fields: D = Down, A = Attention, S = Smartfailed, R = Read-Only     

Critical Events:

 10/14 23:24   5 One or more drives (bay(s) 4 / location  / type(s) HDD) are...

Cluster Job Status:

No running jobs.

No paused or waiting jobs.

# Try to add the disk drive in bay 4 of node 5:

BIC-Isilon-Cluster-4# isi devices add 4 --node-lnn=5      
You are about to add drive bay4, on node lnn 5. Are you sure? (yes/[no]): yes
Initiating add on bay4

The drive in bay4 was not added to the file system because it is not formatted. 
Format the drive to add it to the file system by running the following command, 
where <bay> is the bay number of the drive: isi devices drive format <bay>

# Oops. It doesn't like it. The new disk is a Hitachi drive while
# the cluster is built with Seagate ES3 drives.

# Log in directly to node 5 as it's easier to work there.

BIC-Isilon-Cluster-4# ssh BIC-Isilon-Cluster-5 
Password: 
Last login: Sun Oct 15 02:31:18 2017 from 172.16.10.160
Copyright (c) 2001-2016 EMC Corporation. All Rights Reserved.
Copyright (c) 1992-2016 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.

Isilon OneFS v8.0.0.4

# Check the state of the drive in bay 4:

BIC-Isilon-Cluster-5# isi devices drive view 4
                  Lnn: 5
             Location: Bay  4
                 Lnum: N/A
               Device: /dev/da20
               Baynum: 4
               Handle: 333
               Serial: K4K73KGB
                Model: HUS726040ALA610
                 Tech: SATA
                Media: HDD
               Blocks: 7814037168
 Logical Block Length: 512
Physical Block Length: 512
                  WWN: 0000000000000000
                State: NEW
              Purpose: UNKNOWN
  Purpose Description: A drive whose purpose is unknown
              Present: Yes

# Check the difference between it and the drive in bay 3 (healthy):

BIC-Isilon-Cluster-5# isi devices drive view 3
                  Lnn: 5
             Location: Bay  3
                 Lnum: 17
               Device: /dev/da19
               Baynum: 3
               Handle: 353
               Serial: S1Z1SB0L
                Model: ST4000NM0033-9ZM170
                 Tech: SATA
                Media: HDD
               Blocks: 7814037168
 Logical Block Length: 512
Physical Block Length: 512
                  WWN: 5000C5008CAD9092
                State: HEALTHY
              Purpose: STORAGE
  Purpose Description: A drive used for normal data storage operation
              Present: Yes

# Format the drive in bay 4:

BIC-Isilon-Cluster-5# isi devices drive format 4
You are about to format drive bay4, on node lnn 5. Are you sure? (yes/[no]): yes
BIC-Isilon-Cluster-5# isi devices drive view 4  
                  Lnn: 5
             Location: Bay  4
                 Lnum: 37
               Device: /dev/da20
               Baynum: 4
               Handle: 332
               Serial: K4K73KGB
                Model: HUS726040ALA610
                 Tech: SATA
                Media: HDD
               Blocks: 7814037168
 Logical Block Length: 512
Physical Block Length: 512
                  WWN: 0000000000000000
                State: NONE
              Purpose: NONE
  Purpose Description: A drive that doesn't yet have a purpose
              Present: Yes

# The drive shows up as 'PREPARING' now:

BIC-Isilon-Cluster-5# isi devices drive list    
Lnn  Location  Device    Lnum  State     Serial  
-------------------------------------------------
5    Bay  1    /dev/da1  35    HEALTHY   S1Z1S6BY
5    Bay  2    /dev/da2  34    HEALTHY   Z1ZAECJM
5    Bay  3    /dev/da19 17    HEALTHY   S1Z1SB0L
5    Bay  4    /dev/da20 37    PREPARING K4K73KGB
5    Bay  5    /dev/da3  33    HEALTHY   Z1ZA74A4
5    Bay  6    /dev/da21 15    HEALTHY   Z1ZAEQ13
5    Bay  7    /dev/da22 14    HEALTHY   S1Z1SAF5
5    Bay  8    /dev/da23 13    HEALTHY   S1Z1SB0C
5    Bay  9    /dev/da4  32    HEALTHY   Z1ZAEPR8
5    Bay 10    /dev/da24 36    HEALTHY   S1Z26JWM
5    Bay 11    /dev/da25 11    HEALTHY   S1Z1RYGS
5    Bay 12    /dev/da26 10    HEALTHY   S1Z1SB0A
5    Bay 13    /dev/da5  31    HEALTHY   Z1ZAEPS5
5    Bay 14    /dev/da6  30    HEALTHY   Z1ZAF5GQ
5    Bay 15    /dev/da7  29    HEALTHY   Z1ZAB40S
5    Bay 16    /dev/da27 9     HEALTHY   Z1ZAF625
5    Bay 17    /dev/da8  28    HEALTHY   Z1ZAEPJY
5    Bay 18    /dev/da9  27    HEALTHY   Z1ZAF1LG
5    Bay 19    /dev/da10 26    HEALTHY   Z1ZAF724
5    Bay 20    /dev/da28 8     HEALTHY   Z1ZAF5W8
5    Bay 21    /dev/da11 25    HEALTHY   Z1ZAEW1W
5    Bay 22    /dev/da12 24    HEALTHY   Z1ZAF0CW
5    Bay 23    /dev/da29 7     HEALTHY   Z1ZAF5VM
5    Bay 24    /dev/da30 6     HEALTHY   Z1ZAF59X
5    Bay 25    /dev/da31 5     HEALTHY   Z1ZAF21G
5    Bay 26    /dev/da32 4     HEALTHY   Z1ZAF5QJ
5    Bay 27    /dev/da33 3     HEALTHY   Z1ZAF58Y
5    Bay 28    /dev/da13 23    HEALTHY   Z1ZAF6CG
5    Bay 29    /dev/da34 2     HEALTHY   Z1ZAB3XJ
5    Bay 30    /dev/da14 22    HEALTHY   S1Z1RYHB
5    Bay 31    /dev/da35 1     HEALTHY   Z1ZAB3TQ
5    Bay 32    /dev/da15 21    HEALTHY   Z1ZAEPYX
5    Bay 33    /dev/da36 0     HEALTHY   Z1ZAF4Z0
5    Bay 34    /dev/da16 20    HEALTHY   Z1ZAEPMC
5    Bay 35    /dev/da17 19    HEALTHY   Z1ZAF4H4
5    Bay 36    /dev/da18 18    HEALTHY   Z1ZAF6JA
-------------------------------------------------

# Add the drive in bay 4:

BIC-Isilon-Cluster-5# isi devices drive add 4 
You are about to add drive bay4, on node lnn 5. Are you sure? (yes/[no]): yes
Initiating add on bay4

The add operation is in-progress. 
A OneFS-formatted drive was found in bay4 and is being added to the file system. 
Wait a few minutes and then list all drives to verify that the add operation completed successfully.

# There was an event while this was going on.
# Not sure what it means, as bay 4 is not a Seagate ES3 anymore.

BIC-Isilon-Cluster-5# isi event groups view 4882746
          ID: 4882746
     Started: 10/18 14:20
 Causes Long: Drive in bay 4 location Bay  4 is unknown model ST4000NM0033-9ZM170
         Lnn: 5
       Devid: 6
  Last Event: 2017-10-18T14:20:30
      Ignore: No
 Ignore Time: Never
    Resolved: Yes
Resolve Time: 2017-10-18T14:18:15
       Ended: 10/18 14:18
      Events: 2
    Severity: warning

# After a few minutes, the drive shows up healthy:

BIC-Isilon-Cluster-5# isi devices drive list       
Lnn  Location  Device    Lnum  State   Serial  
-----------------------------------------------
5    Bay  1    /dev/da1  35    HEALTHY S1Z1S6BY
5    Bay  2    /dev/da2  34    HEALTHY Z1ZAECJM
5    Bay  3    /dev/da19 17    HEALTHY S1Z1SB0L
5    Bay  4    /dev/da20 37    HEALTHY K4K73KGB
5    Bay  5    /dev/da3  33    HEALTHY Z1ZA74A4
5    Bay  6    /dev/da21 15    HEALTHY Z1ZAEQ13
5    Bay  7    /dev/da22 14    HEALTHY S1Z1SAF5
5    Bay  8    /dev/da23 13    HEALTHY S1Z1SB0C
5    Bay  9    /dev/da4  32    HEALTHY Z1ZAEPR8
5    Bay 10    /dev/da24 36    HEALTHY S1Z26JWM
5    Bay 11    /dev/da25 11    HEALTHY S1Z1RYGS
5    Bay 12    /dev/da26 10    HEALTHY S1Z1SB0A
5    Bay 13    /dev/da5  31    HEALTHY Z1ZAEPS5
5    Bay 14    /dev/da6  30    HEALTHY Z1ZAF5GQ
5    Bay 15    /dev/da7  29    HEALTHY Z1ZAB40S
5    Bay 16    /dev/da27 9     HEALTHY Z1ZAF625
5    Bay 17    /dev/da8  28    HEALTHY Z1ZAEPJY
5    Bay 18    /dev/da9  27    HEALTHY Z1ZAF1LG
5    Bay 19    /dev/da10 26    HEALTHY Z1ZAF724
5    Bay 20    /dev/da28 8     HEALTHY Z1ZAF5W8
5    Bay 21    /dev/da11 25    HEALTHY Z1ZAEW1W
5    Bay 22    /dev/da12 24    HEALTHY Z1ZAF0CW
5    Bay 23    /dev/da29 7     HEALTHY Z1ZAF5VM
5    Bay 24    /dev/da30 6     HEALTHY Z1ZAF59X
5    Bay 25    /dev/da31 5     HEALTHY Z1ZAF21G
5    Bay 26    /dev/da32 4     HEALTHY Z1ZAF5QJ
5    Bay 27    /dev/da33 3     HEALTHY Z1ZAF58Y
5    Bay 28    /dev/da13 23    HEALTHY Z1ZAF6CG
5    Bay 29    /dev/da34 2     HEALTHY Z1ZAB3XJ
5    Bay 30    /dev/da14 22    HEALTHY S1Z1RYHB
5    Bay 31    /dev/da35 1     HEALTHY Z1ZAB3TQ
5    Bay 32    /dev/da15 21    HEALTHY Z1ZAEPYX
5    Bay 33    /dev/da36 0     HEALTHY Z1ZAF4Z0
5    Bay 34    /dev/da16 20    HEALTHY Z1ZAEPMC
5    Bay 35    /dev/da17 19    HEALTHY Z1ZAF4H4
5    Bay 36    /dev/da18 18    HEALTHY Z1ZAF6JA
-----------------------------------------------
Total: 36

BIC-Isilon-Cluster-5# isi devices drive view 4
                  Lnn: 5
             Location: Bay  4
                 Lnum: 37
               Device: /dev/da20
               Baynum: 4
               Handle: 332
               Serial: K4K73KGB
                Model: HUS726040ALA610
                 Tech: SATA
                Media: HDD
               Blocks: 7814037168
 Logical Block Length: 512
Physical Block Length: 512
                  WWN: 0000000000000000
                State: HEALTHY
              Purpose: STORAGE
  Purpose Description: A drive used for normal data storage operation
              Present: Yes

# Check cluster state after resolving the event group.
# I resolved it through the web UI as it's easier.

BIC-Isilon-Cluster-5# isi status                   
Cluster Name: BIC-Isilon-Cluster
Cluster Health:     [  OK ]
Cluster Storage:  HDD                 SSD Storage    
Size:             641.6T (649.3T Raw) 0 (0 Raw)      
VHS Size:         7.7T                
Used:             204.8T (32%)        0 (n/a)        
Avail:            436.8T (68%)        0 (n/a)        

                   Health  Throughput (bps)  HDD Storage      SSD Storage
ID |IP Address     |DASR |  In   Out  Total| Used / Size     |Used / Size
---+---------------+-----+-----+-----+-----+-----------------+-----------------
  1|172.16.10.20   | OK  | 500k|    0| 500k|41.0T/ 130T( 32%)|(No Storage SSDs)
  2|172.16.10.21   | OK  | 4.6k| 320k| 325k|40.9T/ 130T( 32%)|(No Storage SSDs)
  3|172.16.10.22   | OK  |    0|    0|    0|41.0T/ 130T( 32%)|(No Storage SSDs)
  4|172.16.10.23   | OK  | 321k|83.4k| 404k|41.0T/ 130T( 32%)|(No Storage SSDs)
  5|172.16.10.24   | OK  | 1.1M|    0| 1.1M|41.0T/ 130T( 32%)|(No Storage SSDs)
---+---------------+-----+-----+-----+-----+-----------------+-----------------
Cluster Totals:          | 1.9M| 403k| 2.3M| 205T/ 642T( 32%)|(No Storage SSDs)

     Health Fields: D = Down, A = Attention, S = Smartfailed, R = Read-Only     

Critical Events:


Cluster Job Status:

Running jobs:                                                                   
Job                        Impact Pri Policy     Phase Run Time   
-------------------------- ------ --- ---------- ----- ---------- 
MultiScan[4097]            Low    4   LOW        1/4   0:02:18 

No paused or waiting jobs.

No failed jobs.

Recent job results:                                                                                                                                                                                               
Time            Job                        Event                          
--------------- -------------------------- ------------------------------ 
10/18 04:00:22  ShadowStoreProtect[4096]   Succeeded (LOW) 
10/18 03:02:15  SnapshotDelete[4095]       Succeeded (MEDIUM) 
10/18 02:00:04  WormQueue[4094]            Succeeded (LOW) 
10/18 01:01:37  SnapshotDelete[4093]       Succeeded (MEDIUM) 
10/18 00:31:08  SnapshotDelete[4092]       Succeeded (MEDIUM) 
10/18 00:01:39  SnapshotDelete[4091]       Succeeded (MEDIUM) 
10/17 23:04:54  FSAnalyze[4089]            Succeeded (LOW) 
10/17 22:33:17  SnapshotDelete[4090]       Succeeded (MEDIUM) 
11/15 14:53:34  MultiScan[1254]            MultiScan[1254] Failed 
10/06 14:45:55  ChangelistCreate[975]      ChangelistCreate[975] Failed 

Network

BIC-Isilon-Cluster-1# isi config
Welcome to the Isilon IQ configuration console.
Copyright (c) 2001-2016 EMC Corporation. All Rights Reserved.
Enter 'help' to see list of available commands.
Enter 'help <command>' to see help for a specific command.
Enter 'quit' at any prompt to discard changes and exit.

	Node build: Isilon OneFS v8.0.0.0 B_8_0_0_037(RELEASE) 
	Node serial number: SX410-301608-0260

BIC-Isilon-Cluster >>> status

Configuration for 'BIC-Isilon-Cluster'
Local machine:
----------------------------------+-----------------------------------------
Node LNN      : 1                 | Date        : 2016/06/09 15:53:01 EDT      
----------------------------------+-----------------------------------------
Interface     : ib1               | MAC         : 00:00:00:49:fe:80:00:00:00:00:00:00:7c:fe:90:03:00:9e:e9:a2
IP Address    : 10.0.1.1          | MAC Options : none                         
----------------------------------+-----------------------------------------
Interface     : ib0               | MAC         : 00:00:00:48:fe:80:00:00:00:00:00:00:7c:fe:90:03:00:9e:e9:a1
IP Address    : 10.0.2.1          | MAC Options : none                         
----------------------------------+-----------------------------------------
Interface     : lo0               | MAC         : 00:00:00:00:00:00            
IP Address    : 10.0.3.1          | MAC Options : none                         
----------------------------------+-----------------------------------------
Network:
----------------------------------+-----------------------------------------
JoinMode    : Manual          

Interfaces:
----------------------------------+-----------------------------------------
Interface     : int-a             | Flags       : enabled_ok                   
Netmask       : 255.255.255.0     | MTU         : N/A                          
----------------+-----------------+------------------+----------------------
Low IP          | High IP         | Allocated        | Free                 
----------------+-----------------+------------------+----------------------
10.0.1.1        | 10.0.1.254      | 5                | 249                  
----------------+-----------------+------------------+----------------------
Interface     : int-b             | Flags       : enabled_ok                   
Netmask       : 255.255.255.0     | MTU         : N/A                          
----------------+-----------------+------------------+----------------------
Low IP          | High IP         | Allocated        | Free                 
----------------+-----------------+------------------+----------------------
10.0.2.1        | 10.0.2.254      | 5                | 249                  
----------------+-----------------+------------------+----------------------
Interface     : lpbk              | Flags       : enabled_ok cluster_traffic failover
Netmask       : 255.255.255.0     | MTU         : 1500                         
----------------+-----------------+------------------+----------------------
Low IP          | High IP         | Allocated        | Free                 
----------------+-----------------+------------------+----------------------
10.0.3.1        | 10.0.3.254      | 5                | 249                  
----------------+-----------------+------------------+----------------------
  • Initial groupnet and DNS client settings:
BIC-Isilon-Cluster-4# isi network groupnets list
ID        DNS Cache Enabled  DNS Search  DNS Servers     Subnets 
-----------------------------------------------------------------
groupnet0 True               -           132.206.178.7   mgmt    
                                         132.206.178.186 prod    
                                                         node    
-----------------------------------------------------------------
Total: 1

BIC-Isilon-Cluster-4# isi network groupnets view groupnet0
                    ID: groupnet0
                  Name: groupnet0
           Description: Initial groupnet
     DNS Cache Enabled: True
           DNS Options: -
            DNS Search: -
           DNS Servers: 132.206.178.7, 132.206.178.186
Server Side DNS Search: True
               Subnets: mgmt, prod, node
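  • Should the campus DNS servers ever change, the groupnet can presumably be updated in place. A minimal sketch, assuming the --dns-servers flag of isi network groupnets modify (the 192.0.2.x addresses are placeholders, not ours):

    # Hypothetical: point groupnet0 at replacement DNS servers
    isi network groupnets modify groupnet0 --dns-servers=192.0.2.7,192.0.2.8
    # Confirm the change took
    isi network groupnets view groupnet0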
  • List and view the network subnets defined in the cluster:
BIC-Isilon-Cluster-4# isi network subnets list                   
ID             Subnet           Gateway|Priority  Pools  SC Service     
groupnet0.mgmt 172.16.10.0/24   172.16.10.1|2     mgmt   0.0.0.0        
groupnet0.node 172.16.20.0/23   172.16.20.1|3     pool1  172.16.20.232  
groupnet0.prod 132.206.178.0/24 132.206.178.1|1   pool0  132.206.178.232
------------------------------------------------------------------------
Total: 3
  • List and view the network pools defined in the cluster.
  • Note that the IP allocation for the pool groupnet0.prod.pool0 is set to dynamic. This requires a SmartConnect Advanced license.
BIC-Isilon-Cluster-4# isi network pools list     
ID                   SC Zone                    Allocation Method 
groupnet0.mgmt.mgmt  mgmt.isi.bic.mni.mcgill.ca     static            
groupnet0.node.pool1 nfs.isi-node.bic.mni.mcgill.ca dynamic           
groupnet0.prod.pool0 nfs.isi.bic.mni.mcgill.ca      dynamic           
----------------------------------------------------------------------
Total: 3

BIC-Isilon-Cluster-4# isi network pools view groupnet0.mgmt.mgmt
                     ID: groupnet0.mgmt.mgmt
               Groupnet: groupnet0
                 Subnet: mgmt
                   Name: mgmt
                  Rules: -
            Access Zone: System
      Allocation Method: static
       Aggregation Mode: lacp
     SC Suspended Nodes: -
            Description: -
                 Ifaces: 1:ext-1, 2:ext-1, 4:ext-1, 3:ext-1, 5:ext-1
              IP Ranges: 172.16.10.20-172.16.10.24
       Rebalance Policy: auto
SC Auto Unsuspend Delay: 0
      SC Connect Policy: round_robin
                SC Zone: mgmt.isi.bic.mni.mcgill.ca
    SC DNS Zone Aliases: -
     SC Failover Policy: round_robin
              SC Subnet: prod
                 SC Ttl: 0
          Static Routes: -

BIC-Isilon-Cluster-4# isi network pools view groupnet0.prod.pool0
                     ID: groupnet0.prod.pool0
               Groupnet: groupnet0
                 Subnet: prod
                   Name: pool0
                  Rules: -
            Access Zone: prod
      Allocation Method: dynamic
       Aggregation Mode: lacp
     SC Suspended Nodes: -
            Description: -
                 Ifaces: 1:10gige-agg-1, 2:10gige-agg-1, 4:10gige-agg-1, 3:10gige-agg-1, 5:10gige-agg-1
              IP Ranges: 132.206.178.233-132.206.178.237
       Rebalance Policy: auto
SC Auto Unsuspend Delay: 0
      SC Connect Policy: round_robin
                SC Zone: nfs.isi.bic.mni.mcgill.ca
    SC DNS Zone Aliases: -
     SC Failover Policy: round_robin
              SC Subnet: prod
                 SC Ttl: 0
          Static Routes: -

BIC-Isilon-Cluster-2# isi network pools view groupnet0.node.pool1
                     ID: groupnet0.node.pool1
               Groupnet: groupnet0
                 Subnet: node
                   Name: pool1
                  Rules: -
            Access Zone: prod
      Allocation Method: dynamic
       Aggregation Mode: lacp
     SC Suspended Nodes: -
            Description: -
                 Ifaces: 1:10gige-agg-1, 2:10gige-agg-1, 4:10gige-agg-1, 3:10gige-agg-1, 5:10gige-agg-1
              IP Ranges: 172.16.20.233-172.16.20.237
       Rebalance Policy: auto
SC Auto Unsuspend Delay: 0
      SC Connect Policy: round_robin
                SC Zone: nfs.isi-node.bic.mni.mcgill.ca
    SC DNS Zone Aliases: -
     SC Failover Policy: round_robin
              SC Subnet: node
                 SC Ttl: 0
          Static Routes: -
  • Display network interfaces configuration:
BIC-Isilon-Cluster-4# isi network interfaces list
LNN  Name         Status        Owners               IP Addresses   
--------------------------------------------------------------------
1    10gige-1     Up            -                    -              
1    10gige-2     Up            -                    -              
1    10gige-agg-1 Up            groupnet0.prod.pool0 132.206.178.237
                                groupnet0.node.pool1 172.16.20.237  
1    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.20   
1    ext-2        No Carrier    -                    -              
1    ext-agg      Not Available -                    -              
2    10gige-1     Up            -                    -              
2    10gige-2     Up            -                    -              
2    10gige-agg-1 Up            groupnet0.prod.pool0 132.206.178.236
                                groupnet0.node.pool1 172.16.20.236  
2    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.21   
2    ext-2        No Carrier    -                    -              
2    ext-agg      Not Available -                    -              
3    10gige-1     Up            -                    -              
3    10gige-2     Up            -                    -              
3    10gige-agg-1 Up            groupnet0.prod.pool0 132.206.178.234
                                groupnet0.node.pool1 172.16.20.234  
3    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.22   
3    ext-2        No Carrier    -                    -              
3    ext-agg      Not Available -                    -              
4    10gige-1     Up            -                    -              
4    10gige-2     Up            -                    -              
4    10gige-agg-1 Up            groupnet0.prod.pool0 132.206.178.235
                                groupnet0.node.pool1 172.16.20.235  
4    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.23   
4    ext-2        No Carrier    -                    -              
4    ext-agg      Not Available -                    -              
5    10gige-1     Up            -                    -              
5    10gige-2     Up            -                    -              
5    10gige-agg-1 Up            groupnet0.prod.pool0 132.206.178.233
                                groupnet0.node.pool1 172.16.20.233  
5    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.24   
5    ext-2        No Carrier    -                    -              
5    ext-agg      Not Available -                    -              
--------------------------------------------------------------------
Total: 30
  • Suspend or resume a node:

From the docu65065_OneFS-8.0.0-CLI-Administration-Guide, page 950:

Suspend or resume a node

You can suspend and resume SmartConnect DNS query responses on a node.

Procedure
1. To suspend DNS query responses for a node:

a. (Optional) To identify a list of nodes and IP address pools, run the
following command:

    isi network interfaces list

b. Run the isi network pools sc-suspend-nodes command and specify the pool ID
and logical node number (LNN).

Specify the pool ID you want in the following format:

    <groupnet_name>.<subnet_name>.<pool_name>

The following command suspends DNS query responses on node 3 when queries come
through IP addresses in pool5 under groupnet1.subnet3:

    isi network pools sc-suspend-nodes groupnet1.subnet3.pool5 3

2. To resume DNS query responses for an IP address pool, run the isi network
pools sc-resume-nodes command and specify the pool ID and logical node number
(LNN).

The following command resumes DNS query responses on node 3 when queries come
through IP addresses in pool5 under groupnet1.subnet3:

    isi network pools sc-resume-nodes groupnet1.subnet3.pool5 3
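
  • Applied to this cluster, draining node 5 out of the production pool before servicing it would presumably look like this (a sketch along the lines of the guide quoted above, not re-tested here):

    # Stop handing out node 5's IPs in SmartConnect DNS answers for the prod pool
    isi network pools sc-suspend-nodes groupnet0.prod.pool0 5
    # ...service the node, then put it back in rotation
    isi network pools sc-resume-nodes groupnet0.prod.pool0 5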

Example of an IP Failover with Dynamic Allocation Method

  • First, set the dynamic IP allocation for the pool:
isi network pools modify groupnet0.prod.pool0 --alloc-method=dynamic
  • Then pull the fiber cables from one node, say node 5, and watch what happens:
  • Before pulling the cables:
BIC-Isilon-Cluster-4# isi network interfaces list
...
3    10gige-1     Up            -                    -              
3    10gige-2     Up            -                    -              
3    10gige-agg-1 Up            groupnet0.prod.pool0 132.206.178.235
3    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.22   
3    ext-2        No Carrier    -                    -              
3    ext-agg      Not Available -                    -              
4    10gige-1     Up            -                    -              
...
5    10gige-1     Up            -                    -              
5    10gige-2     Up            -                    -              
5    10gige-agg-1 Up            groupnet0.prod.pool0 132.206.178.237
5    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.24   
5    ext-2        No Carrier    -                    -              
5    ext-agg      Not Available -                    -              
--------------------------------------------------------------------
Total: 30
  • After:
  • Node 5 external network interfaces 10gige-1, 10gige-2 and 10gige-agg-1 now display No Carrier.
  • Note how node 3 external network interface 10gige-agg-1 picked up the IP of node 5.
BIC-Isilon-Cluster-4# isi network interfaces list
LNN  Name         Status        Owners               IP Addresses
...
3    10gige-1     Up            -                    -
3    10gige-2     Up            -                    -
3    10gige-agg-1 Up            groupnet0.prod.pool0 132.206.178.235
                                                     132.206.178.237
3    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.22
3    ext-2        No Carrier    -                    -
3    ext-agg      Not Available -                    -
...                                                    
5    10gige-1     No Carrier    -                    -
5    10gige-2     No Carrier    -                    -
5    10gige-agg-1 No Carrier    groupnet0.prod.pool0 -
5    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.24
5    ext-2        No Carrier    -                    -
5    ext-agg      Not Available -                    -
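
  • To watch such a failover from a client, a plain dig loop against the SmartConnect zone does the trick (nothing Isilon-specific here):

    # Poll SmartConnect once per second; the IPs of a failed node should
    # disappear from the answers almost immediately (SC TTL is 0 here)
    while true; do
        dig +short nfs.isi.bic.mni.mcgill.ca
        sleep 1
    done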

How To Add a Subnet to a Cluster

  • Goal: to have clients access the Isilon cluster through the private network 172.16.20.0/24 (data network).
  • Hosts in the arnodes compute cluster have 2 extra bonded NICs that are configured on this network.
  • The private network 172.16.20.0/24 is directly attached to the cluster’s front-end: there are no gateways or routers in between.
  • This section explains how to configure the Isilon cluster so that clients on 172.16.20.0/24 are granted NFS access.
  • Current network cluster state:
BIC-Isilon-Cluster-4# isi network interfaces ls
LNN  Name         Status        Owners               IP Addresses   
--------------------------------------------------------------------
1    10gige-1     Up            -                    -              
1    10gige-2     Up            -                    -              
1    10gige-agg-1 Up            groupnet0.prod.pool0 132.206.178.237 
1    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.20   
1    ext-2        No Carrier    -                    -              
1    ext-agg      Not Available -                    -              
2    10gige-1     Up            -                    -              
2    10gige-2     Up            -                    -              
2    10gige-agg-1 Up            groupnet0.prod.pool0 132.206.178.236
2    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.21   
2    ext-2        No Carrier    -                    -              
2    ext-agg      Not Available -                    -              
3    10gige-1     Up            -                    -              
3    10gige-2     Up            -                    -              
3    10gige-agg-1 Up            groupnet0.prod.pool0 132.206.178.233
3    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.22   
3    ext-2        No Carrier    -                    -              
3    ext-agg      Not Available -                    -              
4    10gige-1     Up            -                    -              
4    10gige-2     Up            -                    -              
4    10gige-agg-1 Up            groupnet0.prod.pool0 132.206.178.234
4    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.23   
4    ext-2        No Carrier    -                    -              
4    ext-agg      Not Available -                    -              
5    10gige-1     Up            -                    -              
5    10gige-2     Up            -                    -              
5    10gige-agg-1 Up            groupnet0.prod.pool0 132.206.178.235
5    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.24   
5    ext-2        No Carrier    -                    -              
5    ext-agg      Not Available -                    -              
--------------------------------------------------------------------

BIC-Isilon-Cluster-4# isi network pools ls
ID                   SC Zone                        Allocation Method 
----------------------------------------------------------------------
groupnet0.mgmt.mgmt  mgmt.isi.bic.mni.mcgill.ca     static              
groupnet0.prod.pool0 nfs.isi.bic.mni.mcgill.ca      dynamic           
----------------------------------------------------------------------
Total: 2

BIC-Isilon-Cluster-4# isi network subnets ls   
ID             Subnet           Gateway|Priority  Pools  SC Service     
------------------------------------------------------------------------
groupnet0.mgmt 172.16.10.0/24   172.16.10.1|2     mgmt   0.0.0.0          
groupnet0.prod 132.206.178.0/24 132.206.178.1|1   pool0  132.206.178.232
------------------------------------------------------------------------
Total: 2
  • It is faster and easier to configure this by using the WebUI rather than the CLI.
  • Essentially, it boils down to the following actions:
    • Create a new subnet called node in the default groupnet groupnet0 (a CLI sketch follows this list).
    • Set the SmartConnect (SC) Service IP to 172.16.20.232.
    • Update the domain’s master DNS server with the new delegation record and “glue” record. More on this later.
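  • For the record, a CLI sketch of the subnet creation. The positional address-family/prefix-length form is from the OneFS 8 CLI guide; the flags here are assumptions to be checked against isi network subnets create --help:

    # Hypothetical CLI equivalent of the WebUI steps above
    # (check --help for how the base address 172.16.20.0 is supplied)
    isi network subnets create groupnet0.node ipv4 24 \
        --gateway=172.16.20.1 --gateway-priority=3 \
        --sc-service-addr=172.16.20.232
  • The resulting subnet: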
BIC-Isilon-Cluster-4# isi network subnets view groupnet0.node
              ID: groupnet0.node
            Name: node
        Groupnet: groupnet0
           Pools: pool1
     Addr Family: ipv4
       Base Addr: 172.16.20.0
            CIDR: 172.16.20.0/24
     Description: -
       Dsr Addrs: -
         Gateway: 172.16.20.1
Gateway Priority: 3
             MTU: 1500
       Prefixlen: 24
         Netmask: 255.255.255.0
 Sc Service Addr: 172.16.20.232
    VLAN Enabled: False
         VLAN ID: -
  • Create a new pool called pool1 with the following properties:
    • Access zone is set to prod like the pool pool0.
    • Allocation method is dynamic.
    • Select the 10gige aggregate interfaces from each node.
    • Set the SmartConnect Connect policy to round-robin.
    • Best practices might suggest setting it to CPU or network utilization for NFSv4; benchmarking should help.
    • Name the SmartConnect zone as nfs.isi-node.bic.mni.mcgill.ca.
    • Proper records in the master domain DNS server will have to be set for the new zone; more on this later (a CLI sketch of the pool creation follows this list).
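  • For the record, a CLI sketch of the pool creation (flag names assumed from the OneFS 8 isi network pools create help; the WebUI is what was actually used):

    # Hypothetical CLI equivalent of the pool1 creation
    isi network pools create groupnet0.node.pool1 \
        --access-zone=prod --alloc-method=dynamic \
        --ifaces=1-5:10gige-agg-1 \
        --ranges=172.16.20.233-172.16.20.237 \
        --sc-connect-policy=round_robin \
        --sc-dns-zone=nfs.isi-node.bic.mni.mcgill.ca
  • The resulting pool: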
BIC-Isilon-Cluster-4# isi network pools view groupnet0.node.pool1
                     ID: groupnet0.node.pool1
               Groupnet: groupnet0
                 Subnet: node
                   Name: pool1
                  Rules: -
            Access Zone: prod
      Allocation Method: dynamic
       Aggregation Mode: lacp
     SC Suspended Nodes: -
            Description: -
                 Ifaces: 1:10gige-agg-1, 2:10gige-agg-1, 4:10gige-agg-1, 3:10gige-agg-1, 5:10gige-agg-1
              IP Ranges: 172.16.20.233-172.16.20.237
       Rebalance Policy: auto
SC Auto Unsuspend Delay: 0
      SC Connect Policy: round_robin
                SC Zone: nfs.isi-node.bic.mni.mcgill.ca
    SC DNS Zone Aliases: -
     SC Failover Policy: round_robin
              SC Subnet: node
                 SC Ttl: 0
          Static Routes: -
  • With this in place, the cluster network interface settings will be:
BIC-Isilon-Cluster-4# isi network interfaces ls              
LNN  Name         Status        Owners               IP Addresses   
--------------------------------------------------------------------
1    10gige-1     Up            -                    -              
1    10gige-2     Up            -                    -              
1    10gige-agg-1 Up            groupnet0.prod.pool0 132.206.178.237
                                groupnet0.node.pool1 172.16.20.237  
1    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.20   
1    ext-2        No Carrier    -                    -              
1    ext-agg      Not Available -                    -              
2    10gige-1     Up            -                    -              
2    10gige-2     Up            -                    -              
2    10gige-agg-1 Up            groupnet0.prod.pool0 132.206.178.236
                                groupnet0.node.pool1 172.16.20.236  
2    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.21   
2    ext-2        No Carrier    -                    -              
2    ext-agg      Not Available -                    -              
3    10gige-1     Up            -                    -              
3    10gige-2     Up            -                    -              
3    10gige-agg-1 Up            groupnet0.prod.pool0 132.206.178.233
                                groupnet0.node.pool1 172.16.20.234  
3    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.22   
3    ext-2        No Carrier    -                    -              
3    ext-agg      Not Available -                    -              
4    10gige-1     Up            -                    -              
4    10gige-2     Up            -                    -              
4    10gige-agg-1 Up            groupnet0.prod.pool0 132.206.178.234
                                groupnet0.node.pool1 172.16.20.235  
4    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.23   
4    ext-2        No Carrier    -                    -              
4    ext-agg      Not Available -                    -              
5    10gige-1     Up            -                    -              
5    10gige-2     Up            -                    -              
5    10gige-agg-1 Up            groupnet0.prod.pool0 132.206.178.235
                                groupnet0.node.pool1 172.16.20.233  
5    ext-1        Up            groupnet0.mgmt.mgmt  172.16.10.24   
5    ext-2        No Carrier    -                    -              
5    ext-agg      Not Available -                    -              
--------------------------------------------------------------------
Total: 30
  • A few notes about the above:
    • Because the initial cluster configuration was sloppy, the LNNs (Logical Node Numbers) and Node IDs don’t match.
    • This explains why some 10gige-agg-1 interfaces carry different final IP octets in pool0 and pool1.
    • Ultimately, the LNNs and Node IDs should be re-assigned to match the nodes’ positions in the rack (a hedged lnnset sketch follows the isi config session below).
    • This would avoid potential mistakes when updating or servicing the cluster.
  • Current setting:
BIC-Isilon-Cluster-4# isi config
Welcome to the Isilon IQ configuration console.
Copyright (c) 2001-2016 EMC Corporation. All Rights Reserved.
Enter 'help' to see list of available commands.
Enter 'help <command>' to see help for a specific command.
Enter 'quit' at any prompt to discard changes and exit.

	Node build: Isilon OneFS v8.0.0.1 B_MR_8_0_0_1_131(RELEASE) 
	Node serial number: SX410-301608-0264

BIC-Isilon-Cluster >>> lnnset

  LNN      Device ID          Cluster IP
----------------------------------------
    1              1            10.0.3.1
    2              2            10.0.3.2
    3              4            10.0.3.4
    4              3            10.0.3.3
    5              6            10.0.3.5

BIC-Isilon-Cluster >>> exit

BIC-Isilon-Cluster-4#
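  • For the renumbering itself, isi config’s lnnset subcommand also accepts an old/new LNN pair. A sketch from memory, to be verified against the CLI guide before use (and done node by node, since two LNNs cannot simply be swapped in one step):

    # Inside the isi config console (hedged: syntax recalled, not re-tested)
    lnnset <old-lnn> <new-lnn>
    commit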
  • The domain DNS configuration must be updated:
    • The new zone delegation for the SmartConnect zone isi-node.bic.mni.mcgill.ca. has to be put in place.
    • A new glue record must be created for the SSIP (SmartConnect Service IP) of the delegated zone.
; glue record
sip-node.bic.mni.mcgill.ca.     IN A  172.16.20.232
; zone delegation
isi-node.bic.mni.mcgill.ca.     IN NS sip-node.bic.mni.mcgill.ca.
  • Verify that the SC zone nfs.isi-node.bic.mni.mcgill.ca resolves properly and in a round-robin fashion.
  • Both on the cluster and on a client:
malin@login1:~$ dig +short nfs.isi-node.bic.mni.mcgill.ca
172.16.20.236
malin@login1:~$ dig +short nfs.isi-node.bic.mni.mcgill.ca
172.16.20.237
malin@login1:~$ dig +short nfs.isi-node.bic.mni.mcgill.ca
172.16.20.233
malin@login1:~$ dig +short nfs.isi-node.bic.mni.mcgill.ca
172.16.20.234
malin@login1:~$ dig +short nfs.isi-node.bic.mni.mcgill.ca
172.16.20.235
malin@login1:~$ dig +short nfs.isi-node.bic.mni.mcgill.ca
172.16.20.236

BIC-Isilon-Cluster-4# dig +short nfs.isi-node.bic.mni.mcgill.ca  
172.16.20.237
BIC-Isilon-Cluster-4# dig +short nfs.isi-node.bic.mni.mcgill.ca
172.16.20.233
BIC-Isilon-Cluster-4# dig +short nfs.isi-node.bic.mni.mcgill.ca
172.16.20.234
BIC-Isilon-Cluster-4# dig +short nfs.isi-node.bic.mni.mcgill.ca
172.16.20.235
BIC-Isilon-Cluster-4# dig +short nfs.isi-node.bic.mni.mcgill.ca
172.16.20.236
BIC-Isilon-Cluster-4# dig +short nfs.isi-node.bic.mni.mcgill.ca
172.16.20.237

  • It works!

FileSystems and Access Zones

  • There are 2 access zones defined.
  • Default zone is System: it must exist and cannot be deleted.
  • The access zone called prod will be used to hold the user data.
  • Both zones have lsa-nis-provider:BIC as Auth Providers.
  • See the section about NIS below: this might create a security weakness.
BIC-Isilon-Cluster-4# isi zone zones list 
Name   Path 
------------
System /ifs 
prod   /ifs 
------------
Total: 2

BIC-Isilon-Cluster-4# isi zone view System 
                Name: System
                Path: /ifs
            Groupnet: groupnet0
       Map Untrusted: -
      Auth Providers: lsa-nis-provider:BIC, lsa-file-provider:System, lsa-local-provider:System
        NetBIOS Name: -
  User Mapping Rules: -
Home Directory Umask: 0077
  Skeleton Directory: /usr/share/skel
  Cache Entry Expiry: 4H
             Zone ID: 1

BIC-Isilon-Cluster-4# isi zone view prod  
                Name: prod
                Path: /ifs
            Groupnet: groupnet0
       Map Untrusted: -
      Auth Providers: lsa-nis-provider:BIC, lsa-local-provider:prod, lsa-file-provider:System
        NetBIOS Name: -
  User Mapping Rules: -
Home Directory Umask: 0077
  Skeleton Directory: /usr/share/skel
  Cache Entry Expiry: 4H
             Zone ID: 2
  • Another, more concise way of displaying the defined access zones:
BIC-Isilon-Cluster-4# isi zone list -v
                Name: System
                Path: /ifs
            Groupnet: groupnet0
       Map Untrusted: -
      Auth Providers: lsa-nis-provider:BIC, lsa-file-provider:System, lsa-local-provider:System
        NetBIOS Name: -
  User Mapping Rules: -
Home Directory Umask: 0077
  Skeleton Directory: /usr/share/skel
  Cache Entry Expiry: 4H
             Zone ID: 1
--------------------------------------------------------------------------------
                Name: prod
                Path: /ifs
            Groupnet: groupnet0
       Map Untrusted: -
      Auth Providers: lsa-nis-provider:BIC, lsa-local-provider:prod, lsa-file-provider:System
        NetBIOS Name: -
  User Mapping Rules: -
Home Directory Umask: 0077
  Skeleton Directory: /usr/share/skel
  Cache Entry Expiry: 4H
             Zone ID: 2

NFS, NIS: Exports and Aliases.

  • There seems to be something amiss with NIS and OneFS v8.0.
  • The System access zone had to be provided with NIS authentication, as otherwise only numerical UIDs and GIDs show up on the /ifs/data filesystem.
  • There might be a potential security weakness there.
  • See https://community.emc.com/thread/193468?start=0&tstart=0 even though this thread is for v7.2.
  • Created /etc/netgroup with “+” in it on one node, as suggested in the post above, and somehow OneFS propagated it to the other nodes (a read-only check is sketched after the NIS provider listings below).
  • List the NIS auth providers:
BIC-Isilon-Cluster-4# isi auth nis list
Name  NIS Domain  Servers         Status 
-----------------------------------------
BIC   vamana      132.206.178.227 online 
                  132.206.178.243        
-----------------------------------------
Total: 1
BIC-Isilon-Cluster-1# isi auth nis view BIC
                   Name: BIC
             NIS Domain: vamana
                Servers: 132.206.178.227, 132.206.178.243
                 Status: online
         Authentication: Yes
        Balance Servers: Yes
  Check Online Interval: 3m
  Create Home Directory: No
                Enabled: Yes
       Enumerate Groups: Yes
        Enumerate Users: Yes
        Findable Groups: -
         Findable Users: -
           Group Domain: NIS_GROUPS
               Groupnet: groupnet0
Home Directory Template: -
        Hostname Lookup: Yes
        Listable Groups: -
         Listable Users: -
            Login Shell: /bin/bash
       Normalize Groups: No
        Normalize Users: No
        Provider Domain: -
           Ntlm Support: all
        Request Timeout: 20
      Restrict Findable: Yes
      Restrict Listable: No
             Retry Time: 5
      Unfindable Groups: wheel, 0, insightiq, 15, isdmgmt, 16
       Unfindable Users: root, 0, insightiq, 15, isdmgmt, 16
      Unlistable Groups: -
       Unlistable Users: -
            User Domain: NIS_USERS
      Ypmatch Using Tcp: No
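  • To double-check that the /etc/netgroup hack really landed on every node, isi_for_array can loop a read-only cat over the cluster:

    # Each node should print the same single "+" line
    isi_for_array cat /etc/netgroup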
  • Show the exports for the zone prod:
BIC-Isilon-Cluster-1# isi nfs exports list --zone prod
ID   Zone  Paths     Description 
---------------------------------
1    prod  /ifs/data -           
---------------------------------
Total: 1

BIC-Isilon-Cluster-1# isi nfs exports view 1 --zone prod                
                     ID: 1
                   Zone: prod
                  Paths: /ifs/data
            Description: -
                Clients: 132.206.178.0/24
           Root Clients: -
      Read Only Clients: -
     Read Write Clients: -
               All Dirs: No
             Block Size: 8.0k
           Can Set Time: Yes
       Case Insensitive: No
        Case Preserving: Yes
       Chown Restricted: No
    Commit Asynchronous: No
Directory Transfer Size: 128.0k
               Encoding: DEFAULT
               Link Max: 32767
         Map Lookup UID: No
              Map Retry: Yes
               Map Root
                    Enabled: True
                       User: nobody
              Primary Group: -
           Secondary Groups: -
           Map Non Root
                    Enabled: False
                       User: nobody
              Primary Group: -
           Secondary Groups: -
            Map Failure
                    Enabled: False
                       User: nobody
              Primary Group: -
           Secondary Groups: -
               Map Full: Yes
          Max File Size: 8192.00000P
          Name Max Size: 255
            No Truncate: No
              Read Only: No
            Readdirplus: Yes
   Readdirplus Prefetch: 10
  Return 32Bit File Ids: No
 Read Transfer Max Size: 1.00M
 Read Transfer Multiple: 512
     Read Transfer Size: 128.0k
          Security Type: unix
   Setattr Asynchronous: No
               Snapshot: -
               Symlinks: Yes
             Time Delta: 1.0 ns
  Write Datasync Action: datasync
   Write Datasync Reply: datasync
  Write Filesync Action: filesync
   Write Filesync Reply: filesync
  Write Unstable Action: unstable
   Write Unstable Reply: unstable
Write Transfer Max Size: 1.00M
Write Transfer Multiple: 512
    Write Transfer Size: 512.0k
  • It doesn’t seem possible to directly list the netgroups defined on the NIS master.
  • One can, however, list the members of a specific netgroup if one happens to know its name:
BIC-Isilon-Cluster-4# isi auth netgroups view xgeraid --recursive --provider nis:BIC
Netgroup: -
  Domain: -
Hostname: edgar-xge.bic.mni.mcgill.ca
Username: -
--------------------------------------------------------------------------------
Netgroup: -
  Domain: -
Hostname: gustav-xge.bic.mni.mcgill.ca
Username: -
--------------------------------------------------------------------------------
Netgroup: -
  Domain: -
Hostname: tatania-xge.bic.mni.mcgill.ca
Username: -
--------------------------------------------------------------------------------
Netgroup: -
  Domain: -
Hostname: tubal-xge.bic.mni.mcgill.ca
Username: -
--------------------------------------------------------------------------------
Netgroup: -
  Domain: -
Hostname: tullus-xge.bic.mni.mcgill.ca
Username: -
--------------------------------------------------------------------------------
Netgroup: -
  Domain: -
Hostname: tutor-xge.bic.mni.mcgill.ca
Username: -

BIC-Isilon-Cluster-1# isi auth netgroups view computecore                 
Netgroup: -
  Domain: -
Hostname: thaisa
Username: -
--------------------------------------------------------------------------------
Netgroup: -
  Domain: -
Hostname: vaux
Username: -
--------------------------------------------------------------------------------
Netgroup: -
  Domain: -
Hostname: widow
Username: -

I can’t reproduce this behaviour anymore, so it should be taken with a grain of salt! I’ll leave this in place for the moment, but it might go away soon…

  • Clients in netgroups must be specified with IP addresses; names don’t work:
BIC-Isilon-Cluster-1# isi nfs exports modify 1 --clear-clients --zone prod

BIC-Isilon-Cluster-1# isi auth netgroups view isix --zone prod           
Netgroup: -
  Domain: -
Hostname: dromio.bic.mni.mcgill.ca
Username: -

BIC-Isilon-Cluster-1# isi nfs exports modify 1 --clients isix --zone prod
bad host dromio in netgroup isix, skipping

BIC-Isilon-Cluster-1# isi auth netgroups view xisi                                  
Netgroup: -
  Domain: -
Hostname: 132.206.178.51
Username: -
BIC-Isilon-Cluster-1# isi nfs exports modify 1 --add-clients xisi --zone prod

BIC-Isilon-Cluster-1# isi nfs exports view 1 --zone prod                                            
                     ID: 1
                   Zone: prod
                  Paths: /ifs/data
            Description: -
                Clients: xisi
           Root Clients: -
      Read Only Clients: -
     Read Write Clients: -
               All Dirs: No
             Block Size: 8.0k
           Can Set Time: Yes
       Case Insensitive: No
        Case Preserving: Yes
       Chown Restricted: No
    Commit Asynchronous: No
Directory Transfer Size: 128.0k
               Encoding: DEFAULT
               Link Max: 32767
         Map Lookup UID: No
              Map Retry: Yes
               Map Root
                    Enabled: True
                       User: nobody
              Primary Group: -
           Secondary Groups: -
           Map Non Root
                    Enabled: False
                       User: nobody
              Primary Group: -
           Secondary Groups: -
            Map Failure
                    Enabled: False
                       User: nobody
              Primary Group: -
           Secondary Groups: -
               Map Full: Yes
          Max File Size: 8192.00000P
          Name Max Size: 255
            No Truncate: No
              Read Only: No
            Readdirplus: Yes
   Readdirplus Prefetch: 10
  Return 32Bit File Ids: No
 Read Transfer Max Size: 1.00M
 Read Transfer Multiple: 512
     Read Transfer Size: 128.0k
          Security Type: unix
   Setattr Asynchronous: No
               Snapshot: -
               Symlinks: Yes
             Time Delta: 1.0 ns
  Write Datasync Action: datasync
   Write Datasync Reply: datasync
  Write Filesync Action: filesync
   Write Filesync Reply: filesync
  Write Unstable Action: unstable
   Write Unstable Reply: unstable
Write Transfer Max Size: 1.00M
Write Transfer Multiple: 512
    Write Transfer Size: 512.0k

Workaround To The Zone Exports Issue With Netgroups.

  • Netgroup entries in the NIS maps must be FQDNs: short names won’t work, even with the option --ignore-unresolvable-hosts.
  • Modify the zone exports with: isi nfs exports modify 1 --add-clients sgibic --ignore-unresolvable-hosts --zone prod
BIC-Isilon-Cluster-3# isi auth netgroups view sgibic --recursive --provider nis:BIC
Netgroup: -
  Domain: -
Hostname: julia.bic.mni.mcgill.ca
Username: -
--------------------------------------------------------------------------------
Netgroup: -
  Domain: -
Hostname: luciana.bic.mni.mcgill.ca
Username: -
--------------------------------------------------------------------------------
Netgroup: -
  Domain: -
Hostname: mouldy.bic.mni.mcgill.ca
Username: -
--------------------------------------------------------------------------------
Netgroup: -
  Domain: -
Hostname: vaux.bic.mni.mcgill.ca
Username: -

BIC-Isilon-Cluster-3# isi nfs exports view 1 --zone prod
                     ID: 1
                   Zone: prod
                  Paths: /ifs/data
            Description: -
                Clients: isix, xisi
           Root Clients: -
      Read Only Clients: -
     Read Write Clients: -
               All Dirs: No
             Block Size: 8.0k
           Can Set Time: Yes
       Case Insensitive: No
        Case Preserving: Yes
       Chown Restricted: No
    Commit Asynchronous: No
Directory Transfer Size: 128.0k
               Encoding: DEFAULT
               Link Max: 32767
         Map Lookup UID: No
              Map Retry: Yes
               Map Root
                    Enabled: True
                       User: nobody
              Primary Group: -
           Secondary Groups: -
           Map Non Root
                    Enabled: False
                       User: nobody
              Primary Group: -
           Secondary Groups: -
            Map Failure
                    Enabled: False
                       User: nobody
              Primary Group: -
           Secondary Groups: -
               Map Full: Yes
          Max File Size: 8192.00000P
          Name Max Size: 255
            No Truncate: No
              Read Only: No
            Readdirplus: Yes
   Readdirplus Prefetch: 10
  Return 32Bit File Ids: No
 Read Transfer Max Size: 1.00M
 Read Transfer Multiple: 512
     Read Transfer Size: 128.0k
          Security Type: unix
   Setattr Asynchronous: No
               Snapshot: -
               Symlinks: Yes
             Time Delta: 1.0 ns
  Write Datasync Action: datasync
   Write Datasync Reply: datasync
  Write Filesync Action: filesync
   Write Filesync Reply: filesync
  Write Unstable Action: unstable
   Write Unstable Reply: unstable
Write Transfer Max Size: 1.00M
Write Transfer Multiple: 512
    Write Transfer Size: 512.0k

BIC-Isilon-Cluster-3# isi nfs exports modify 1 --add-clients sgibic --zone prod
bad host julia in netgroup sgibic, skipping

BIC-Isilon-Cluster-3# isi nfs exports view 1 --zone prod                       
                     ID: 1
                   Zone: prod
                  Paths: /ifs/data
            Description: -
                Clients: isix, xisi
           Root Clients: -
      Read Only Clients: -
     Read Write Clients: -
               All Dirs: No
             Block Size: 8.0k
           Can Set Time: Yes
       Case Insensitive: No
        Case Preserving: Yes
       Chown Restricted: No
    Commit Asynchronous: No
Directory Transfer Size: 128.0k
               Encoding: DEFAULT
               Link Max: 32767
         Map Lookup UID: No
              Map Retry: Yes
               Map Root
                    Enabled: True
                       User: nobody
              Primary Group: -
           Secondary Groups: -
           Map Non Root
                    Enabled: False
                       User: nobody
              Primary Group: -
           Secondary Groups: -
            Map Failure
                    Enabled: False
                       User: nobody
              Primary Group: -
           Secondary Groups: -
               Map Full: Yes
          Max File Size: 8192.00000P
          Name Max Size: 255
            No Truncate: No
              Read Only: No
            Readdirplus: Yes
   Readdirplus Prefetch: 10
  Return 32Bit File Ids: No
 Read Transfer Max Size: 1.00M
 Read Transfer Multiple: 512
     Read Transfer Size: 128.0k
          Security Type: unix
   Setattr Asynchronous: No
               Snapshot: -
               Symlinks: Yes
             Time Delta: 1.0 ns
  Write Datasync Action: datasync
   Write Datasync Reply: datasync
  Write Filesync Action: filesync
   Write Filesync Reply: filesync
  Write Unstable Action: unstable
   Write Unstable Reply: unstable
Write Transfer Max Size: 1.00M
Write Transfer Multiple: 512
    Write Transfer Size: 512.0k

BIC-Isilon-Cluster-3# isi nfs exports modify 1 --add-clients sgibic --ignore-unresolvable-hosts --zone prod
BIC-Isilon-Cluster-3# isi nfs exports view 1 --zone prod
                     ID: 1
                   Zone: prod
                  Paths: /ifs/data
            Description: -
                Clients: isix, sgibic, xisi
           Root Clients: -
      Read Only Clients: -
     Read Write Clients: -
               All Dirs: No
             Block Size: 8.0k
           Can Set Time: Yes
       Case Insensitive: No
        Case Preserving: Yes
       Chown Restricted: No
    Commit Asynchronous: No
Directory Transfer Size: 128.0k
               Encoding: DEFAULT
               Link Max: 32767
         Map Lookup UID: No
              Map Retry: Yes
               Map Root
                    Enabled: True
                       User: nobody
              Primary Group: -
           Secondary Groups: -
           Map Non Root
                    Enabled: False
                       User: nobody
              Primary Group: -
           Secondary Groups: -
            Map Failure
                    Enabled: False
                       User: nobody
              Primary Group: -
           Secondary Groups: -
               Map Full: Yes
          Max File Size: 8192.00000P
          Name Max Size: 255
            No Truncate: No
              Read Only: No
            Readdirplus: Yes
   Readdirplus Prefetch: 10
  Return 32Bit File Ids: No
 Read Transfer Max Size: 1.00M
 Read Transfer Multiple: 512
     Read Transfer Size: 128.0k
          Security Type: unix
   Setattr Asynchronous: No
               Snapshot: -
               Symlinks: Yes
             Time Delta: 1.0 ns
  Write Datasync Action: datasync
   Write Datasync Reply: datasync
  Write Filesync Action: filesync
   Write Filesync Reply: filesync
  Write Unstable Action: unstable
   Write Unstable Reply: unstable
Write Transfer Max Size: 1.00M
Write Transfer Multiple: 512
    Write Transfer Size: 512.0k

A Real Example With Quotas

  • Create an export restricted to hosts in the admincore NIS netgroup, with no root squashing for those hosts.
  • List the exports in the prod zone.
  • Check the exports for any error.
BIC-Isilon-Cluster-4# isi nfs exports create /ifs/data/bicadmin1 --zone prod --clients admincore --root-clients admincore --ignore-unresolvable-hosts 

BIC-Isilon-Cluster-4# isi nfs exports list --zone prod
ID   Zone  Paths               Description 
-------------------------------------------
1    prod  /ifs/data           -           
3    prod  /ifs/data/bicadmin1 -           
-------------------------------------------
Total: 2

BIC-Isilon-Cluster-4# isi nfs exports view 3 --zone prod
                     ID: 3
                   Zone: prod
                  Paths: /ifs/data/bicadmin1
            Description: -
                Clients: admincore
           Root Clients: admincore
      Read Only Clients: -
     Read Write Clients: -
               All Dirs: No
             Block Size: 8.0k
           Can Set Time: Yes
       Case Insensitive: No
        Case Preserving: Yes
       Chown Restricted: No
    Commit Asynchronous: No
Directory Transfer Size: 128.0k
               Encoding: DEFAULT
               Link Max: 32767
         Map Lookup UID: No
              Map Retry: Yes
               Map Root
                    Enabled: True
                       User: nobody
              Primary Group: -
           Secondary Groups: -
           Map Non Root
                    Enabled: False
                       User: nobody
              Primary Group: -
           Secondary Groups: -
            Map Failure
                    Enabled: False
                       User: nobody
              Primary Group: -
           Secondary Groups: -
               Map Full: Yes
          Max File Size: 8192.00000P
          Name Max Size: 255
            No Truncate: No
              Read Only: No
            Readdirplus: Yes
   Readdirplus Prefetch: 10
  Return 32Bit File Ids: No
 Read Transfer Max Size: 1.00M
 Read Transfer Multiple: 512
     Read Transfer Size: 128.0k
          Security Type: unix
   Setattr Asynchronous: No
               Snapshot: -
               Symlinks: Yes
             Time Delta: 1.0 ns
  Write Datasync Action: datasync
   Write Datasync Reply: datasync
  Write Filesync Action: filesync
   Write Filesync Reply: filesync
  Write Unstable Action: unstable
   Write Unstable Reply: unstable
Write Transfer Max Size: 1.00M
Write Transfer Multiple: 512
    Write Transfer Size: 512.0k

BIC-Isilon-Cluster-4# isi nfs exports check --zone prod
ID Message
----------
----------
Total: 0

How to create NFS aliases and use them

  • It might be useful to create NFS aliases so that NFS clients can use a short symbolic name to mount the Isilon exports.
  • Useful for the ipl, movement or noel agglomerated mount points like /ifs/data/ipl/ipl-5-6-8-10/, /ifs/data/movement/movement3-4-5-6-7 or /ifs/data/noel/noel1-5.
BIC-Isilon-Cluster-2# mkdir /ifs/data/movement/movement3-4-5-6-7

BIC-Isilon-Cluster-2# isi quota quotas create /ifs/data/movement/movement3-4-5-6-7 directory --zone prod --hard-threshold 400G --container=yes

BIC-Isilon-Cluster-2# for i in 3 4 5 6 7; do mkdir /ifs/data/movement/movement3-4-5-6-7/movement$i; done

BIC-Isilon-Cluster-2# ll /ifs/data/movement/movement3-4-5-6-7
total 14
drwxr-xr-x 7 root  wheel  135 Oct 19 14:56 ./
drwxr-xr-x 5 root  wheel   89 Oct 19 14:42 ../
drwxr-xr-x 2 root  wheel    0 Oct 19 14:56 movement3/
drwxr-xr-x 2 root  wheel    0 Oct 19 14:56 movement4/
drwxr-xr-x 2 root  wheel    0 Oct 19 14:56 movement5/
drwxr-xr-x 2 root  wheel    0 Oct 19 14:56 movement6/
drwxr-xr-x 2 root  wheel    0 Oct 19 14:56 movement7/

BIC-Isilon-Cluster-2# for i in 3 4 5 6 7; do isi nfs exports create /ifs/data/movement/movement3-4-5-6-7/movement$i --zone prod --clients admincore --root-clients admincore; done

BIC-Isilon-Cluster-2# for i in 3 4 5 6 7; do isi nfs aliases create /movement$i /ifs/data/movement/movement3-4-5-6-7/movement$i --zone prod; done
  • This is used for the ipl, movement and noel allocated storage:
BIC-Isilon-Cluster-2# isi nfs aliases ls --zone prod | egrep '(ipl|movement|noel)'
prod  /ipl1           /ifs/data/ipl/ipl-agglo/ipl1        
prod  /ipl10          /ifs/data/ipl/ipl-5-6-8-10/ipl10    
prod  /ipl11          /ifs/data/ipl/ipl11                 
prod  /ipl2           /ifs/data/ipl/ipl-agglo/ipl2        
prod  /ipl3           /ifs/data/ipl/ipl-agglo/ipl3        
prod  /ipl4           /ifs/data/ipl/ipl-agglo/ipl4        
prod  /ipl5           /ifs/data/ipl/ipl-5-6-8-10/ipl5     
prod  /ipl6           /ifs/data/ipl/ipl-5-6-8-10/ipl6     
prod  /ipl7           /ifs/data/ipl/ipl-agglo/ipl7        
prod  /ipl8           /ifs/data/ipl/ipl-5-6-8-10/ipl8     
prod  /ipl9           /ifs/data/ipl/ipl-agglo/ipl9        
prod  /ipl_proj01     /ifs/data/ipl/ipl-agglo/proj01      
prod  /ipl_proj02     /ifs/data/ipl/proj02                
prod  /ipl_proj03     /ifs/data/ipl/proj03                
prod  /ipl_proj04     /ifs/data/ipl/proj04                
prod  /ipl_proj05     /ifs/data/ipl/proj05                
prod  /ipl_proj06     /ifs/data/ipl/proj06                
prod  /ipl_proj07     /ifs/data/ipl/proj07                
prod  /ipl_proj08     /ifs/data/ipl/proj08                
prod  /ipl_proj09     /ifs/data/ipl/proj09                
prod  /ipl_proj10     /ifs/data/ipl/proj10                
prod  /ipl_proj11     /ifs/data/ipl/proj11                
prod  /ipl_proj12     /ifs/data/ipl/proj12                
prod  /ipl_proj13     /ifs/data/ipl/proj13                
prod  /ipl_proj14     /ifs/data/ipl/proj14                
prod  /ipl_proj15     /ifs/data/ipl/proj15                
prod  /ipl_proj16     /ifs/data/ipl/proj16                
prod  /ipl_quarantine /ifs/data/ipl/quarantine            
prod  /ipl_scratch01  /ifs/data/ipl/scratch01             
prod  /ipl_scratch02  /ifs/data/ipl/scratch02             
prod  /ipl_scratch03  /ifs/data/ipl/scratch03             
prod  /ipl_scratch04  /ifs/data/ipl/scratch04             
prod  /ipl_scratch05  /ifs/data/ipl/scratch05             
prod  /ipl_scratch06  /ifs/data/ipl/scratch06             
prod  /ipl_scratch07  /ifs/data/ipl/scratch07             
prod  /ipl_scratch08  /ifs/data/ipl/scratch08             
prod  /ipl_scratch09  /ifs/data/ipl/scratch09             
prod  /ipl_scratch10  /ifs/data/ipl/scratch10             
prod  /ipl_scratch11  /ifs/data/ipl/scratch11             
prod  /ipl_scratch12  /ifs/data/ipl/scratch12             
prod  /ipl_scratch13  /ifs/data/ipl/scratch13             
prod  /ipl_scratch14  /ifs/data/ipl/scratch14             
prod  /ipl_scratch15  /ifs/data/ipl/scratch15             
prod  /ipl_user01     /ifs/data/ipl/ipl-agglo/user01      
prod  /ipl_user02     /ifs/data/ipl/user02                
prod  /movement3      /ifs/data/movement/movement3-4-5-6-7/movement3
prod  /movement4      /ifs/data/movement/movement3-4-5-6-7/movement4
prod  /movement5      /ifs/data/movement/movement3-4-5-6-7/movement5
prod  /movement6      /ifs/data/movement/movement3-4-5-6-7/movement6
prod  /movement7      /ifs/data/movement/movement3-4-5-6-7/movement7
prod  /movement8      /ifs/data/movement/movement8                  
prod  /movement9      /ifs/data/movement/movement9                  
prod  /noel1          /ifs/data/noel/noel1-5/noel1                  
prod  /noel2          /ifs/data/noel/noel1-5/noel2                  
prod  /noel3          /ifs/data/noel/noel1-5/noel3                  
prod  /noel4          /ifs/data/noel/noel1-5/noel4                  
prod  /noel5          /ifs/data/noel/noel1-5/noel5                  
prod  /noel6          /ifs/data/noel/noel6                          
prod  /noel7          /ifs/data/noel/noel7                          
prod  /noel8          /ifs/data/noel/noel8
  • With the NFS aliases in place, an NFS client can mount an export like this:
~$ mkdir /mnt/ifs/movement7
~$ mount -t nfs -o vers=4 nfs.isi.bic.mni.mcgill.ca:/movement7 /mnt/ifs/movement7
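  • To make the mount permanent, the same alias goes into the client’s /etc/fstab. A sketch for a Linux client, with the mount point as above:

    # /etc/fstab entry on the client (single line)
    nfs.isi.bic.mni.mcgill.ca:/movement7  /mnt/ifs/movement7  nfs  vers=4  0  0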

Quotas

User Quotas

  • One can use the web GUI to create a user quota for the export /ifs/data/bicadmin1 defined above.
  • Here, using the CLI, we create a user quota for user malin on the export /ifs/data/bicadmin1 with soft and hard limits (adjusting or removing it later is sketched after the listing below).
BIC-Isilon-Cluster-4# isi quota quotas create /ifs/data/bicadmin1 user  \
                                              --user malin --hard-threshold 1G \
                                              --soft-threshold 500M --soft-grace 1W --zone prod --verbose
Created quota: USER:malin@/ifs/data/bicadmin1

BIC-Isilon-Cluster-2# isi quota quotas list --path /ifs/data/bicadmin1 --zone prod --verbose
Type      AppliesTo  Path                Snap  Hard    Soft    Adv  Grace  Files  With Overhead  W/O Overhead  Over  Enforced  Container  Linked 
-------------------------------------------------------------------------------------------------------------------------------------------------
user      malin      /ifs/data/bicadmin1 No    1.00G   500.00M -    1W     4625   1.39G          1024.00M      -     Yes       No         No     
directory DEFAULT    /ifs/data/bicadmin1 No    400.00G 399.00G -    1W     797410 138.94G        93.39G        -     Yes       Yes        -      
-------------------------------------------------------------------------------------------------------------------------------------------------
Total: 2
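
  • Adjusting or removing such a quota later should just be the matching modify/delete subcommands. A sketch with example thresholds (flag names as in the create above, but verify against --help before use):

    # Raise the hard limit for user malin (example value)
    isi quota quotas modify /ifs/data/bicadmin1 user --user malin \
                            --hard-threshold 2G --zone prod
    # Or drop the quota altogether
    isi quota quotas delete /ifs/data/bicadmin1 user --user malin --zone prod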

Directory Quotas

  • The export /ifs/data/loris with ID=5 has already been created.
  • Here we put a 1 TB directory quota on it and list the quota explicitly.
  • The option --container=yes should be used: df on a client will then display the export’s quota value rather than the whole cluster’s available space in the export’s zone (a quick client-side check is sketched after the quota view below).
BIC-Isilon-Cluster-3# isi nfs exports list --zone prod
ID   Zone  Paths               Description 
-------------------------------------------
1    prod  /ifs/data           -           
3    prod  /ifs/data/bicadmin1 -           
4    prod  /ifs/data/bicdata   -           
5    prod  /ifs/data/loris     -           
-------------------------------------------
Total: 4

BIC-Isilon-Cluster-3# isi nfs exports view 5 --zone prod
                     ID: 5
                   Zone: prod
                  Paths: /ifs/data/loris
            Description: -
                Clients: admincore
           Root Clients: admincore
      Read Only Clients: -
     Read Write Clients: -
               All Dirs: No
             Block Size: 8.0k
           Can Set Time: Yes
       Case Insensitive: No
        Case Preserving: Yes
       Chown Restricted: No
    Commit Asynchronous: No
Directory Transfer Size: 128.0k
               Encoding: DEFAULT
               Link Max: 32767
         Map Lookup UID: No
              Map Retry: Yes
               Map Root
                    Enabled: True
                       User: nobody
              Primary Group: -
           Secondary Groups: -
           Map Non Root
                    Enabled: False
                       User: nobody
              Primary Group: -
           Secondary Groups: -
            Map Failure
                    Enabled: False
                       User: nobody
              Primary Group: -
           Secondary Groups: -
               Map Full: Yes
          Max File Size: 8192.00000P
          Name Max Size: 255
            No Truncate: No
              Read Only: No
            Readdirplus: Yes
   Readdirplus Prefetch: 10
  Return 32Bit File Ids: No
 Read Transfer Max Size: 1.00M
 Read Transfer Multiple: 512
     Read Transfer Size: 128.0k
          Security Type: unix
   Setattr Asynchronous: No
               Snapshot: -
               Symlinks: Yes
             Time Delta: 1.0 ns
  Write Datasync Action: datasync
   Write Datasync Reply: datasync
  Write Filesync Action: filesync
   Write Filesync Reply: filesync
  Write Unstable Action: unstable
   Write Unstable Reply: unstable
Write Transfer Max Size: 1.00M
Write Transfer Multiple: 512
    Write Transfer Size: 512.0k

BIC-Isilon-Cluster-3# isi quota quotas create /ifs/data/loris directory \
                      --hard-threshold 1T --zone prod --container=yes

BIC-Isilon-Cluster-3# isi quota quotas list
Type      AppliesTo  Path                Snap  Hard    Soft    Adv  Used    
----------------------------------------------------------------------------
user      malin      /ifs/data/bicadmin1 No    10.00G  500.00M -    1024.00M
directory DEFAULT    /ifs/data/bicadmin1 No    900.00G 800.00G -    298.897G
directory DEFAULT    /ifs/data/bicdata   No    1.00T   -       -    106.904G
directory DEFAULT    /ifs/data/loris     No    1.00T   -       -    0       
----------------------------------------------------------------------------
Total: 4

BIC-Isilon-Cluster-3# isi quota quotas view /ifs/data/loris directory
                       Path: /ifs/data/loris
                       Type: directory
                  Snapshots: No
Thresholds Include Overhead: No
                      Usage
                          Files: 16050
                  With Overhead: 422.04G
                   W/O Overhead: 335.45G
                       Over: -
                   Enforced: Yes
                  Container: Yes
                     Linked: -
                 Thresholds
                 Hard Threshold: 1.00T
                  Hard Exceeded: No
             Hard Last Exceeded: 1969-12-31T19:00:00
                       Advisory: -
              Advisory Exceeded: No
         Advisory Last Exceeded: -
                 Soft Threshold: -
                  Soft Exceeded: No
             Soft Last Exceeded: -
                     Soft Grace: -
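
  • The --container=yes effect is easy to verify from a client: df against the mounted export should report the 1 TB quota as the filesystem size, not the cluster-wide capacity. Assuming the export is mounted on a hypothetical /mnt/ifs/loris:

    # The Size column should read ~1.0T (the quota), not the cluster's total
    df -h /mnt/ifs/loris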

Snapshots

  • This has been recently (Aug 2016) set in motion.
  • All the settings are in flux, like the snapshot schedules and the path naming conventions.
  • Some experimentation will be in order.

Snapshot schedules listing

BIC-Isilon-Cluster-3# isi snapshot schedules ls                                        
ID   Name                        
---------------------------------
2    snapshot-bicadmin1-daily-31d
3    snapshot-bicdata-daily-7d   
4    snapshot-mril2-daily-3d     
5    snapshot-mril3-daily-3d     
---------------------------------
Total: 4

Viewing the snapshot schedules in detail

BIC-Isilon-Cluster-3# isi snapshot schedules ls -v
           ID: 2
         Name: snapshot-bicadmin1-daily-31d
         Path: /ifs/data/bicadmin1
      Pattern: snapshot_bicadmin1_31d_%Y-%m-%d-%H-%M
     Schedule: every 1 days at 07:00 PM
     Duration: 1M1D
        Alias: alias-snapshot-bicadmin1-daily
     Next Run: 2016-10-04T19:00:00
Next Snapshot: snapshot_bicadmin1_31d_2016-10-04-19-00
--------------------------------------------------------------------------------
           ID: 3
         Name: snapshot-bicdata-daily-7d
         Path: /ifs/data/bicdata
      Pattern: snapshot-bicdata_daily_7d_%Y-%m-%d-%H-%M
     Schedule: every 1 days at 07:00 PM
     Duration: 1W1D
        Alias: alias-snapshot-bicdata-daily
     Next Run: 2016-10-04T19:00:00
Next Snapshot: snapshot-bicdata_daily_7d_2016-10-04-19-00
--------------------------------------------------------------------------------
           ID: 4
         Name: snapshot-mril2-daily-3d
         Path: /ifs/data/mril/mril2
      Pattern: snapshot-mril2-daily-3d-%Y-%m-%d-%H-%M
     Schedule: every 1 days at 11:45 PM
     Duration: 3D1H
        Alias: alias-snapshot-mril2-daily-3d
     Next Run: 2016-10-04T23:45:00
Next Snapshot: snapshot-mril2-daily-3d-2016-10-04-23-45
--------------------------------------------------------------------------------
           ID: 5
         Name: snapshot-mril3-daily-3d
         Path: /ifs/data/mril/mril3
      Pattern: snapshot-mril3-daily-3d-%Y-%m-%d-%H-%M
     Schedule: every 1 days at 11:45 PM
     Duration: 3D2H
        Alias: alias-snapshot-mril3-daily-3d
     Next Run: 2016-10-04T23:45:00
Next Snapshot: snapshot-mril3-daily-3d-2016-10-04-23-45
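
  • For reference, a schedule like the ones above would presumably be re-created with isi snapshot schedules create, which takes the name, path, naming pattern and schedule as positional arguments. A sketch matching the mril2 entry (the --duration and --alias flags are assumptions to verify first):

    # Hypothetical re-creation of the mril2 schedule shown above
    isi snapshot schedules create snapshot-mril2-daily-3d \
        /ifs/data/mril/mril2 \
        "snapshot-mril2-daily-3d-%Y-%m-%d-%H-%M" \
        "every 1 days at 11:45 PM" \
        --duration 3D --alias alias-snapshot-mril2-daily-3d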

Listing the snapshots and viewing the details of a particular snapshot

BIC-Isilon-Cluster-3# isi snapshot snapshots list                                      
ID   Name                                                    Path               
--------------------------------------------------------------------------------
378  alias-snapshot-bicadmin1-daily                          /ifs/data/bicadmin1
737  snapshot_bicadmin1_30D_2016-09-03-_19-00                /ifs/data/bicadmin1
740  snapshot-bicdata_daily_30D_expiration_2016-09-03-_19-00 /ifs/data/bicdata  
744  snapshot_bicadmin1_30D_2016-09-04-_19-00                /ifs/data/bicadmin1
747  snapshot-bicdata_daily_30D_expiration_2016-09-04-_19-00 /ifs/data/bicdata  
751  snapshot_bicadmin1_30D_2016-09-05-_19-00                /ifs/data/bicadmin1
754  snapshot-bicdata_daily_30D_expiration_2016-09-05-_19-00 /ifs/data/bicdata  
758  snapshot_bicadmin1_30D_2016-09-06-_19-00                /ifs/data/bicadmin1
761  snapshot-bicdata_daily_30D_expiration_2016-09-06-_19-00 /ifs/data/bicdata  
765  snapshot_bicadmin1_30D_2016-09-07-_19-00                /ifs/data/bicadmin1
768  snapshot-bicdata_daily_30D_expiration_2016-09-07-_19-00 /ifs/data/bicdata  
772  snapshot_bicadmin1_30D_2016-09-08-_19-00                /ifs/data/bicadmin1
775  snapshot-bicdata_daily_30D_expiration_2016-09-08-_19-00 /ifs/data/bicdata  
779  snapshot_bicadmin1_30D_2016-09-09-_19-00                /ifs/data/bicadmin1
782  snapshot-bicdata_daily_30D_expiration_2016-09-09-_19-00 /ifs/data/bicdata  
786  snapshot_bicadmin1_30D_2016-09-10-_19-00                /ifs/data/bicadmin1
789  snapshot-bicdata_daily_30D_expiration_2016-09-10-_19-00 /ifs/data/bicdata  
793  snapshot_bicadmin1_30D_2016-09-11-_19-00                /ifs/data/bicadmin1
796  snapshot-bicdata_daily_30D_expiration_2016-09-11-_19-00 /ifs/data/bicdata  
800  snapshot_bicadmin1_30D_2016-09-12-_19-00                /ifs/data/bicadmin1
803  snapshot-bicdata_daily_30D_expiration_2016-09-12-_19-00 /ifs/data/bicdata  
807  snapshot_bicadmin1_30D_2016-09-13-_19-00                /ifs/data/bicadmin1
810  snapshot-bicdata_daily_30D_expiration_2016-09-13-_19-00 /ifs/data/bicdata  
814  snapshot_bicadmin1_30D_2016-09-14-_19-00                /ifs/data/bicadmin1
817  snapshot-bicdata_daily_30D_expiration_2016-09-14-_19-00 /ifs/data/bicdata  
821  snapshot_bicadmin1_30D_2016-09-15-_19-00                /ifs/data/bicadmin1
824  snapshot-bicdata_daily_30D_expiration_2016-09-15-_19-00 /ifs/data/bicdata  
828  snapshot_bicadmin1_30D_2016-09-16-_19-00                /ifs/data/bicadmin1
831  snapshot-bicdata_daily_30D_expiration_2016-09-16-_19-00 /ifs/data/bicdata  
835  snapshot_bicadmin1_30D_2016-09-17-_19-00                /ifs/data/bicadmin1
838  snapshot-bicdata_daily_30D_expiration_2016-09-17-_19-00 /ifs/data/bicdata  
842  snapshot_bicadmin1_30D_2016-09-18-_19-00                /ifs/data/bicadmin1
845  snapshot-bicdata_daily_30D_expiration_2016-09-18-_19-00 /ifs/data/bicdata  
849  snapshot_bicadmin1_30D_2016-09-19-_19-00                /ifs/data/bicadmin1
852  snapshot-bicdata_daily_30D_expiration_2016-09-19-_19-00 /ifs/data/bicdata  
856  snapshot_bicadmin1_30D_2016-09-20-_19-00                /ifs/data/bicadmin1
859  snapshot-bicdata_daily_30D_expiration_2016-09-20-_19-00 /ifs/data/bicdata  
863  snapshot_bicadmin1_30D_2016-09-21-_19-00                /ifs/data/bicadmin1
866  snapshot-bicdata_daily_30D_expiration_2016-09-21-_19-00 /ifs/data/bicdata  
870  snapshot_bicadmin1_30D_2016-09-22-_19-00                /ifs/data/bicadmin1
873  snapshot-bicdata_daily_30D_expiration_2016-09-22-_19-00 /ifs/data/bicdata  
877  snapshot_bicadmin1_30D_2016-09-23-_19-00                /ifs/data/bicadmin1
880  snapshot-bicdata_daily_30D_expiration_2016-09-23-_19-00 /ifs/data/bicdata  
884  snapshot_bicadmin1_30D_2016-09-24-_19-00                /ifs/data/bicadmin1
887  snapshot-bicdata_daily_30D_expiration_2016-09-24-_19-00 /ifs/data/bicdata  
891  snapshot_bicadmin1_30D_2016-09-25-_19-00                /ifs/data/bicadmin1
894  snapshot-bicdata_daily_30D_expiration_2016-09-25-_19-00 /ifs/data/bicdata  
898  snapshot_bicadmin1_30D_2016-09-26-_19-00                /ifs/data/bicadmin1
901  snapshot-bicdata_daily_30D_expiration_2016-09-26-_19-00 /ifs/data/bicdata  
905  snapshot_bicadmin1_30D_2016-09-27-_19-00                /ifs/data/bicadmin1
908  snapshot-bicdata_daily_30D_expiration_2016-09-27-_19-00 /ifs/data/bicdata  
912  snapshot_bicadmin1_30D_2016-09-28-_19-00                /ifs/data/bicadmin1
915  snapshot-bicdata_daily_30D_expiration_2016-09-28-_19-00 /ifs/data/bicdata  
919  snapshot_bicadmin1_30D_2016-09-29-_19-00                /ifs/data/bicadmin1
922  snapshot-bicdata_daily_30D_expiration_2016-09-29-_19-00 /ifs/data/bicdata  
926  snapshot_bicadmin1_30D_2016-09-30-_19-00                /ifs/data/bicadmin1
929  snapshot-bicdata_daily_30D_expiration_2016-09-30-_19-00 /ifs/data/bicdata  
933  snapshot_bicadmin1_30D_2016-10-01-_19-00                /ifs/data/bicadmin1
936  snapshot-bicdata_daily_30D_expiration_2016-10-01-_19-00 /ifs/data/bicdata  
940  snapshot_bicadmin1_30D_2016-10-02-_19-00                /ifs/data/bicadmin1
943  snapshot-bicdata_daily_30D_expiration_2016-10-02-_19-00 /ifs/data/bicdata  
947  snapshot_bicadmin1_30D_2016-10-03-_19-00                /ifs/data/bicadmin1
950  snapshot-bicdata_daily_30D_expiration_2016-10-03-_19-00 /ifs/data/bicdata  
952  FSAnalyze-Snapshot-Current-1475546412                   /ifs               
--------------------------------------------------------------------------------
Total: 64                                                                        

BIC-Isilon-Cluster-3# isi snapshot snapshots view snapshot_bicadmin1_30D_2016-10-03-_19-00
               ID: 947
             Name: snapshot_bicadmin1_30D_2016-10-03-_19-00
             Path: /ifs/data/bicadmin1
        Has Locks: No
         Schedule: snapshot-bicadmin1-daily-31d
  Alias Target ID: -
Alias Target Name: -
          Created: 2016-10-03T19:00:03
          Expires: 2016-11-03T19:00:00
             Size: 1.016G
     Shadow Bytes: 0
        % Reserve: 0.00%
     % Filesystem: 0.00%
            State: active
  • What is this snapshot FSAnalyze-Snapshot-Current-1475546412?
  • I never created it.
  • It looks like it goes against best practices: its path is /ifs.
  • Found the answer: it is needed for the filesystem analytics done by the InsightIQ server.
  • DO NOT DELETE IT!
BIC-Isilon-Cluster-3# isi snapshot snapshots view FSAnalyze-Snapshot-Current-1475546412
               ID: 952
             Name: FSAnalyze-Snapshot-Current-1475546412
             Path: /ifs
        Has Locks: No
         Schedule: -
  Alias Target ID: -
Alias Target Name: -
          Created: 2016-10-03T22:00:12
          Expires: -
             Size: 1.2129T
     Shadow Bytes: 0
        % Reserve: 0.00%
     % Filesystem: 0.00%
            State: active
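
  • Aside: the FSAnalyze snapshot is created by the FSAnalyze job itself. The job type and its schedule can be inspected with something like this (a sketch, assuming the OneFS 8 job-engine CLI):

isi job types view FSAnalyze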

Snapshot aliases point to the latest snapshot

BIC-Isilon-Cluster-3# isi snapshot  aliases ls
ID   Name                           Target ID  Target Name                             
---------------------------------------------------------------------------------------
378  alias-snapshot-bicadmin1-daily 947        snapshot_bicadmin1_30D_2016-10-03-_19-00
---------------------------------------------------------------------------------------
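
  • Handy consequence: since the alias always tracks the newest snapshot, the latest snapshot contents should be reachable by alias name under the .snapshot directory. A sketch (not verified here):

ls /ifs/.snapshot/alias-snapshot-bicadmin1-daily/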

Creating snapshot schedules

  • Create a snapshot schedule and an alias to it, so that the alias points to the most recent snapshot taken.
  • For example, to create a snapshot schedule for /ifs/data/mril/mril2 taken
    • every day at 11:45 PM,
    • with a retention period of 73 hours (3 days + 1 hour),
    • and with the alias alias-snapshot-mril2-daily-3d pointing to the last scheduled snapshot:
BIC-Isilon-Cluster-2# isi snapshot schedules create snapshot-mril2-daily-3d /ifs/data/mril/mril2 snapshot_mril2_daily_3d-%Y-%m-%d-%H-%M \
                      "every 1 days at 11:45 PM" --duration 73H --alias alias-snapshot-mril2-daily-3d

BIC-Isilon-Cluster-2# isi snapshot schedules ls
ID   Name                        
---------------------------------
2    snapshot-bicadmin1-daily-31d
3    snapshot-bicdata-daily-7d   
4    snapshot-mril2-daily-3d     
5    snapshot-mril3-daily-3d     
---------------------------------
Total: 4

BIC-Isilon-Cluster-2# isi snapshot schedules view 4
           ID: 4
         Name: snapshot-mril2-daily-3d
         Path: /ifs/data/mril/mril2
      Pattern: snapshot_mril2_daily_3d-%Y-%m-%d-%H-%M
     Schedule: every 1 days at 11:45 PM
     Duration: 3D1H
        Alias: alias-snapshot-mril2-daily-3d
  • The CLI command can be messy! The web GUI is more intuitive.
  • See below for the pattern syntax.
  • Syntax for snapshot schedule creation:
BIC-Isilon-Cluster-2# isi snapshot schedules create <name> <path> <pattern> <schedule>
[--alias <alias>]
[--duration <duration>]
[--verbose]

Options
<name> 
           Specifies a name for the snapshot schedule.
<path> 
           Specifies the path of the directory to include in the snapshots.
<pattern> 
           Specifies a naming pattern for snapshots created according to the schedule. See below.
<schedule>
           Specifies how often snapshots are created.
           Specify in the following format: "<interval> [<frequency>]"

Specify <interval> in one of the following formats:

 Every [{other | <integer>}] week [on <day>]
 Every [{other | <integer>}] month [on the <integer>]
 Every [<day>[, ...] [of every [{other | <integer>}] week]]
 The last {day | weekday | <day>} of every [{other |<integer>}] month
 The <integer> {weekday | <day>} of every [{other | <integer>}] month
 Yearly on <month> <integer>
 Yearly on the {last | <integer>} [weekday | <day>] of <month>

Specify <frequency> in one of the following formats:

 at <hh>[:<mm>] [{AM | PM}]
  every [<integer>] {hours | minutes} [between <hh>[:<mm>] [{AM | PM}] and <hh>[:<mm>] [{AM | PM}]]
  every [<integer>] {hours | minutes} [from <hh>[:<mm>] [{AM | PM}] to <hh>[:<mm>] [{AM | PM}]]

You can optionally append "st", "th", or "rd" to <integer>. 
For example, you can specify "Every 1st month".

Specify <day> as any day of the week or a three-letter abbreviation for the day. 
For example, both "saturday" and "sat" are valid.

--alias <alias>
                Specifies an alias for the latest snapshot generated based on the schedule. 
                The alias enables you to quickly locate the most recent snapshot that was generated
                according to the schedule.
                Specify as any string.

{--duration | -x} <duration>
                             Specifies how long snapshots generated according to the schedule are stored on the
                             cluster before OneFS automatically deletes them.

Specify in the following format:
<integer><units>

The following <units> are valid:

Y Specifies years
M Specifies months
W Specifies weeks
D Specifies days
H Specifies hours

{--verbose | -v}
Displays a message confirming that the snapshot schedule was created.
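
  • A hypothetical weekly example of this syntax (the schedule name, path and pattern below are made up):

isi snapshot schedules create snapshot-demo-weekly /ifs/data/demo demo_%Y-%m-%d \
    "every saturday at 01:00 AM" --duration 2W --alias alias-demo-weekly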
  • The following variables can be included in a snapshot naming pattern:
  • Have fun choosing one!
Variable        Description
%A              The day of the week.
%a              The abbreviated day of the week. For example, if the snapshot is generated on a Sunday, %a is replaced with Sun.
%B              The name of the month.
%b              The abbreviated name of the month. For example, if the snapshot is generated in September, %b is replaced with Sep.
%C              The first two digits of the year. For example, if the snapshot is created in 2014, %C is replaced with 20.
%c              The time and day. This variable is equivalent to specifying %a %b %e %T %Y.
%d              The two-digit day of the month.
%e              The day of the month. A single-digit day is preceded by a blank space.
%F              The date. This variable is equivalent to specifying %Y-%m-%d.
%G              The year, based on the ISO week. Equivalent to %Y, except that if most days of the snapshot's week fall in the adjacent year, that year is used instead. For example, for a snapshot created on Sunday, January 1, 2017, %G is replaced with 2016, because only one day of that week is in 2017.
%g              The abbreviated year, based on the ISO week. Same rule as %G, keeping only the last two digits. For the example above, %g is replaced with 16.
%H              The hour. The hour is represented on the 24-hour clock. Single-digit hours are preceded by a zero. For example, if a snapshot is created at 1:45 AM, %H is replaced with 01.
%h              The abbreviated name of the month. This variable is equivalent to specifying %b.
%I              The hour represented on the 12-hour clock. Single-digit hours are preceded by a zero. For example, if a snapshot is created at 1:45 PM, %I is replaced with 01.
%j              The numeric day of the year. For example, if a snapshot is created on February 1, %j is replaced with 32.
%k              The hour represented on the 24-hour clock. Single-digit hours are preceded by a blank space.
%l              The hour represented on the 12-hour clock. Single-digit hours are preceded by a blank space. For example, if a snapshot is created at 1:45 AM, %l is replaced with 1.
%M              The two-digit minute.
%m              The two-digit month.
%p              AM or PM.
%{PolicyName}   The name of the replication policy that the snapshot was created for. This variable is valid only if you are specifying a snapshot naming pattern for a replication policy.
%R              The time. This variable is equivalent to specifying %H:%M.
%r              The time. This variable is equivalent to specifying %I:%M:%S %p.
%S              The two-digit second.
%s              The second represented in UNIX or POSIX time.
%{SrcCluster}   The name of the source cluster of the replication policy that the snapshot was created for. This variable is valid only if you are specifying a snapshot naming pattern for a replication policy.
%T              The time. This variable is equivalent to specifying %H:%M:%S.
%U              The two-digit numerical week of the year. Numbers range from 00 to 53. The first day of the week is calculated as Sunday.
%u              The numerical day of the week. Numbers range from 1 to 7. The first day of the week is calculated as Monday. For example, if a snapshot is created on Sunday, %u is replaced with 7.
%V              The two-digit numerical week of the year that the snapshot was created in. Numbers range from 01 to 53. The first day of the week is calculated as Monday. If the week of January 1 is four or more days in length, then that week is counted as the first week of the year.
%v              The day that the snapshot was created. This variable is equivalent to specifying %e-%b-%Y.
%W              The two-digit numerical week of the year that the snapshot was created in. Numbers range from 00 to 53. The first day of the week is calculated as Monday.
%w              The numerical day of the week that the snapshot was created on. Numbers range from 0 to 6. The first day of the week is calculated as Sunday. For example, if the snapshot was created on Sunday, %w is replaced with 0.
%X              The time that the snapshot was created. This variable is equivalent to specifying %H:%M:%S.
%Y              The year that the snapshot was created in.
%y              The last two digits of the year that the snapshot was created in. For example, if the snapshot was created in 2014, %y is replaced with 14.
%Z              The time zone that the snapshot was created in.
%z              The offset from coordinated universal time (UTC) of the time zone that the snapshot was created in. If preceded by a plus sign, the time zone is east of UTC. If preceded by a minus sign, the time zone is west of UTC.
%+              The time and date that the snapshot was created. This variable is equivalent to specifying %a %b %e %X %Z %Y.
%%              Escapes a percent sign. For example, 100%% is replaced with 100%.

Creating a ChangeList between snapshots

  • Create a ChangeList between two snapshots and list its contents.
  • Delete it at the end.
BIC-Isilon-Cluster-3# isi snapshot snapshots ls | grep mril3
965  snapshot-mril3-daily-3d-2016-10-04-23-45                /ifs/data/mril/mril3
966  alias-snapshot-mril3-daily-3d                           /ifs/data/mril/mril3
979  snapshot-mril3-daily-3d-2016-10-05-23-45                /ifs/data/mril/mril3

BIC-Isilon-Cluster-3# isi snapshot snapshots view 979
               ID: 979
             Name: snapshot-mril3-daily-3d-2016-10-05-23-45
             Path: /ifs/data/mril/mril3
        Has Locks: No
         Schedule: snapshot-mril3-daily-3d
  Alias Target ID: -
Alias Target Name: -
          Created: 2016-10-05T23:45:04
          Expires: 2016-10-09T01:45:00
             Size: 6.0k
     Shadow Bytes: 0
        % Reserve: 0.00%
     % Filesystem: 0.00%
            State: active

BIC-Isilon-Cluster-3# isi snapshot snapshots view 965       
               ID: 965
             Name: snapshot-mril3-daily-3d-2016-10-04-23-45
             Path: /ifs/data/mril/mril3
        Has Locks: No
         Schedule: snapshot-mril3-daily-3d
  Alias Target ID: -
Alias Target Name: -
          Created: 2016-10-04T23:45:10
          Expires: 2016-10-08T01:45:00
             Size: 61.0k
     Shadow Bytes: 0
        % Reserve: 0.00%
     % Filesystem: 0.00%
            State: active

BIC-Isilon-Cluster-3# isi job jobs start ChangelistCreate --older-snapid 965 --newer-snapid 979

BIC-Isilon-Cluster-3# isi_changelist_mod -l
965_979_inprog

BIC-Isilon-Cluster-3# isi job jobs list
ID   Type             State   Impact  Pri  Phase  Running Time 
---------------------------------------------------------------
964  ChangelistCreate Running Low     5    2/4    21m          
---------------------------------------------------------------
Total: 1

BIC-Isilon-Cluster-3# isi_changelist_mod -l        
965_979

BIC-Isilon-Cluster-3# isi_changelist_mod -a 965_979
st_ino=4357852748 st_mode=040755 st_size=14149 st_atime=1475608476 st_mtime=1475608476 st_ctime=1475698088 st_flags=224 cl_flags=00 path=/ifs/data/mril/mril3/ilana/matlab
st_ino=4374360447 st_mode=040755 st_size=207 st_atime=1467833479 st_mtime=1467833479 st_ctime=1475698088 st_flags=224 cl_flags=00 path=/ifs/data/mril/mril3/ilana/matlab/AMICO-master/matlab/other
st_ino=4402080042 st_mode=0100644 st_size=1033 st_atime=1475678676 st_mtime=1475678676 st_ctime=1475698087 st_flags=224 cl_flags=01 path=/ifs/data/mril/mril3/ilana/matlab/correlate.m~
st_ino=4402080043 st_mode=0100644 st_size=2922 st_atime=1475588733 st_mtime=1475588733 st_ctime=1475698087 st_flags=224 cl_flags=01 path=/ifs/data/mril/mril3/ilana/matlab/AMICO-master/matlab/other/AMICO_LoadData.m
st_ino=4414831639 st_mode=0100644 st_size=1047 st_atime=1475690420 st_mtime=1475690420 st_ctime=1475698087 st_flags=224 cl_flags=01 path=/ifs/data/mril/mril3/ilana/matlab/correlate.m
st_ino=4374468851 st_mode=0100644 st_size=2921 st_atime=1466519137 st_mtime=1466519137 st_ctime=1470857490 st_flags=224 cl_flags=02 path=/ifs/data/mril/mril3/ilana/matlab/AMICO-master/matlab/other/AMICO_LoadData.m
st_ino=4416223571 st_mode=0100644 st_size=890 st_atime=1475264575 st_mtime=1475264575 st_ctime=1475350931 st_flags=224 cl_flags=02 path=/ifs/data/mril/mril3/ilana/matlab/correlate.m
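
  • Aside: to pull out just the changed paths before deleting the changelist, a small sketch built on the output format above:

isi_changelist_mod -a 965_979 | sed -n 's/.*path=//p'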

BIC-Isilon-Cluster-3# isi_changelist_mod -k 965_979  

Jobs

How To Delete A Large Number Of Files/Dirs Without Impacting Cluster Performance

  • Submit a job of type TreeDelete.
  • Here /ifs/data/zmanda contains 5 TB of data restored from the Zmanda NDMP backup tapes.
BIC-Isilon-Cluster-4# isi job jobs start TreeDelete --paths /ifs/data/zmanda --priority 10 --policy low
Started job [4050]

BIC-Isilon-Cluster-4# isi job jobs list
ID   Type       State   Impact  Pri  Phase  Running Time 
---------------------------------------------------------
4050 TreeDelete Running Low     10   1/1    -            
---------------------------------------------------------
Total: 1

BIC-Isilon-Cluster-4# isi job jobs view 4050
               ID: 4050
             Type: TreeDelete
            State: Running
           Impact: Low
           Policy: LOW
              Pri: 10
            Phase: 1/1
       Start Time: 2017-10-12T11:45:41
     Running Time: 22s
     Participants: 1, 2, 3, 4, 6
         Progress: Started
Waiting on job ID: -
      Description: {'count': 1, 'lins': {'1:1044:db60': """/ifs/data/zmanda"""}}
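
  • Should the job start to hurt production I/O, it can be paused or cancelled; a sketch, assuming the usual OneFS 8 job-engine subcommands:

isi job jobs pause 4050
isi job jobs resume 4050
isi job jobs cancel 4050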

BIC-Isilon-Cluster-4# isi status             
Cluster Name: BIC-Isilon-Cluster
Cluster Health:     [  OK ]
Cluster Storage:  HDD                 SSD Storage    
Size:             641.6T (649.3T Raw) 0 (0 Raw)      
VHS Size:         7.7T                
Used:             207.1T (32%)        0 (n/a)        
Avail:            434.5T (68%)        0 (n/a)        

                   Health  Throughput (bps)  HDD Storage      SSD Storage
ID |IP Address     |DASR |  In   Out  Total| Used / Size     |Used / Size
---+---------------+-----+-----+-----+-----+-----------------+-----------------
  1|172.16.10.20   | OK  |48.9k| 133k| 182k|41.4T/ 130T( 32%)|(No Storage SSDs)
  2|172.16.10.21   | OK  |20.5M|128.0|20.5M|41.4T/ 130T( 32%)|(No Storage SSDs)
  3|172.16.10.22   | OK  | 1.3M| 111k| 1.4M|41.4T/ 130T( 32%)|(No Storage SSDs)
  4|172.16.10.23   | OK  |14.1M|75.6M|89.6M|41.4T/ 130T( 32%)|(No Storage SSDs)
  5|172.16.10.24   | OK  |96.6k|66.7k| 163k|41.4T/ 130T( 32%)|(No Storage SSDs)
---+---------------+-----+-----+-----+-----+-----------------+-----------------
Cluster Totals:          |36.0M|75.9M| 112M| 207T/ 642T( 32%)|(No Storage SSDs)

     Health Fields: D = Down, A = Attention, S = Smartfailed, R = Read-Only     

Critical Events:


Cluster Job Status:

Running jobs:                                                                   
Job                        Impact Pri Policy     Phase Run Time   
-------------------------- ------ --- ---------- ----- ---------- 
TreeDelete[4050]           Low    10  LOW        1/1   0:25:01 

No paused or waiting jobs.

No failed jobs.

Recent job results:                                                                                                                                                                                                
Time            Job                        Event                          
--------------- -------------------------- ------------------------------ 
10/12 04:00:02  ShadowStoreProtect[4049]   Succeeded (LOW) 
10/12 03:05:16  SnapshotDelete[4048]       Succeeded (MEDIUM) 
10/12 02:00:17  WormQueue[4047]            Succeeded (LOW) 
10/12 01:05:31  SnapshotDelete[4046]       Succeeded (MEDIUM) 
10/12 00:04:57  SnapshotDelete[4045]       Succeeded (MEDIUM) 
10/11 23:21:32  FSAnalyze[4043]            Succeeded (LOW) 
10/11 22:37:01  SnapshotDelete[4044]       Succeeded (MEDIUM) 
10/11 20:00:25  ShadowStoreProtect[4042]   Succeeded (LOW) 
11/15 14:53:34  MultiScan[1254]            MultiScan[1254] Failed 
10/06 14:45:55  ChangelistCreate[975]      ChangelistCreate[975] Failed 

InsightIQ Installation and Config

Install IIQ

  • An InsightIQ license must be installed on the Isilon cluster.
  • Create a CentOS 6.7 (yuck) virtual machine and properly configure its network.
  • Call this machine zaphod.
  • You need the InsightIQ install shell script from EMC support.
  • We will use a data store local to the VM for IIQ.
  • The install script will fail due to a dependency mismatch with openssl, I think.
  • Here is a way to force the install:
  • Extract the contents of the self-extracting script.
  • Remove the offending package openssl-1.0.1e-42.el6_7.2.x86_64.rpm from it.
  • Manually install openssl-devel with yum install openssl-devel.x86_64.
  • Run the install script: sh ./install_insightiq.sh.
root@zaphod ~$ sh ./install-insightiq-4.0.0.0049.sh --target ./iiq
root@zaphod ~$ cd iiq
root@zaphod ~/iiq$ rm openssl-1.0.1e-42.el6_7.2.x86_64.rpm
root@zaphod ~/iiq$ ll *.rpm
-rw-r--r-- 1 root root   928548 Jan 12 18:55 bash-4.1.2-29.el6.x86_64.rpm
-rw-r--r-- 1 root root   367680 Jan 12 18:55 freetype-2.3.11-14.el6_3.1.x86_64.rpm
-rw-r--r-- 1 root root  3993500 Jan 12 18:55 glibc-2.12-1.149.el6_6.7.x86_64.rpm
-rw-r--r-- 1 root root 14884088 Jan 12 18:56 glibc-common-2.12-1.149.el6_6.7.x86_64.rpm
-rw-r--r-- 1 root root 25899811 Jan 12 18:55 isilon-insightiq-4.0.0.0049-1.x86_64.rpm
-rw-r--r-- 1 root root   139192 Jan 12 18:56 libXfont-1.4.5-3.el6_5.x86_64.rpm
-rw-r--r-- 1 root root    25012 Jan 12 18:56 libfontenc-1.0.5-2.el6.x86_64.rpm
-rw-r--r-- 1 root root   178512 Jan 12 18:56 libjpeg-turbo-1.2.1-3.el6_5.x86_64.rpm
-rw-r--r-- 1 root root   186036 Jan 12 18:56 libpng-1.2.49-1.el6_2.x86_64.rpm
-rw-r--r-- 1 root root   280524 Jan 12 18:56 openssh-5.3p1-112.el6_7.x86_64.rpm
-rw-r--r-- 1 root root   448872 Jan 12 18:56 openssh-clients-5.3p1-112.el6_7.x86_64.rpm
-rw-r--r-- 1 root root   331544 Jan 12 18:56 openssh-server-5.3p1-112.el6_7.x86_64.rpm
-rw-r--r-- 1 root root  1225760 Jan 12 18:56 openssl-devel-1.0.1e-42.el6_7.2.x86_64.rpm
-rw-r--r-- 1 root root  1033984 Jan 12 18:56 postgresql93-9.3.4-1PGDG.rhel6.x86_64.rpm
-rw-r--r-- 1 root root  1544220 Jan 12 18:56 postgresql93-devel-9.3.4-1PGDG.rhel6.x86_64.rpm
-rw-r--r-- 1 root root   194856 Jan 12 18:56 postgresql93-libs-9.3.4-1PGDG.rhel6.x86_64.rpm
-rw-r--r-- 1 root root  4259740 Jan 12 18:56 postgresql93-server-9.3.4-1PGDG.rhel6.x86_64.rpm
-rw-r--r-- 1 root root    43900 Jan 12 18:56 ttmkfdir-3.0.9-32.1.el6.x86_64.rpm
-rw-r--r-- 1 root root   453984 Jan 12 18:56 tzdata-2015b-1.el6.noarch.rpm
-rw-r--r-- 1 root root 39308573 Jan 12 18:56 wkhtmltox-0.12.2.1_linux-centos6-amd64.rpm
-rw-r--r-- 1 root root    76712 Jan 12 18:56 xorg-x11-font-utils-7.2-11.el6.x86_64.rpm
-rw-r--r-- 1 root root  2929960 Jan 12 18:56 xorg-x11-fonts-75dpi-7.2-9.1.el6.noarch.rpm
-rw-r--r-- 1 root root   532016 Jan 12 18:56 xorg-x11-fonts-Type1-7.2-9.1.el6.noarch.rpm
root@zaphod ~/iiq$ yum list openssl\*
Installed Packages
openssl.x86_64                                                                       1.0.1e-42.el6_7.4                                                                     @updates/$releasever
Available Packages
openssl.i686                                                                         1.0.1e-42.el6_7.4                                                                     updates             
openssl-devel.i686                                                                   1.0.1e-42.el6_7.4                                                                     updates             
openssl-devel.x86_64                                                                 1.0.1e-42.el6_7.4                                                                     updates             
openssl-perl.x86_64                                                                  1.0.1e-42.el6_7.4                                                                     updates             
openssl-static.x86_64                                                                1.0.1e-42.el6_7.4                                                                     updates             
openssl098e.i686                                                                     0.9.8e-20.el6.centos.1                                                                updates             
openssl098e.x86_64                                                                   0.9.8e-20.el6.centos.1                                                                updates             
root@zaphod ~/iiq$ yum install openssl-devel.x86_64

===============================================================================================================================================================================================
 Package                                             Arch                                   Version                                              Repository                               Size
===============================================================================================================================================================================================
Installing:
 openssl-devel                                       x86_64                                 1.0.1e-42.el6_7.4                                    updates                                 1.2 M
Installing for dependencies:
 keyutils-libs-devel                                 x86_64                                 1.4-5.el6                                            base                                     29 k
 krb5-devel                                          x86_64                                 1.10.3-42z1.el6_7                                    updates                                 502 k
 libcom_err-devel                                    x86_64                                 1.41.12-22.el6                                       base                                     33 k
 libselinux-devel                                    x86_64                                 2.0.94-5.8.el6                                       base                                    137 k
 libsepol-devel                                      x86_64                                 2.0.41-4.el6                                         base                                     64 k
 zlib-devel                                          x86_64                                 1.2.3-29.el6                                         base                                     44 k

Transaction Summary
===============================================================================================================================================================================================
Install       7 Package(s)

Total download size: 2.0 M
Installed size: 4.9 M

Installed:
  openssl-devel.x86_64 0:1.0.1e-42.el6_7.4                                                                                                                                                     

Dependency Installed:
  keyutils-libs-devel.x86_64 0:1.4-5.el6        krb5-devel.x86_64 0:1.10.3-42z1.el6_7        libcom_err-devel.x86_64 0:1.41.12-22.el6        libselinux-devel.x86_64 0:2.0.94-5.8.el6       
  libsepol-devel.x86_64 0:2.0.41-4.el6          zlib-devel.x86_64 0:1.2.3-29.el6            

root@zaphod ~/iiq$ sh ./install_insightiq.sh 

This script automates the installation or upgrade of InsightIQ.  If you are
running a version of InsightIQ that can be upgraded by this version, the
upgrade will occur automatically.  If you are trying to upgrade an unsupported
version, the script will exit.  If you are installing on a new system, the
script will perform a clean install.

Are you ready to proceed with the installation?
Please enter (Y)es or (N)o followed by [ENTER] >>> y

===============================================================================================================================================================================================
 Package                                     Arch                          Version                                  Repository                                                            Size
===============================================================================================================================================================================================
Installing:
 freetype                                    x86_64                        2.3.11-14.el6_3.1                        /freetype-2.3.11-14.el6_3.1.x86_64                                   816 k
 isilon-insightiq                            x86_64                        4.0.0.0049-1                             /isilon-insightiq-4.0.0.0049-1.x86_64                                 93 M
 libXfont                                    x86_64                        1.4.5-3.el6_5                            /libXfont-1.4.5-3.el6_5.x86_64                                       295 k
 libfontenc                                  x86_64                        1.0.5-2.el6                              /libfontenc-1.0.5-2.el6.x86_64                                        40 k
 libjpeg-turbo                               x86_64                        1.2.1-3.el6_5                            /libjpeg-turbo-1.2.1-3.el6_5.x86_64                                  466 k
 libpng                                      x86_64                        2:1.2.49-1.el6_2                         /libpng-1.2.49-1.el6_2.x86_64                                        639 k
 postgresql93                                x86_64                        9.3.4-1PGDG.rhel6                        /postgresql93-9.3.4-1PGDG.rhel6.x86_64                               5.2 M
 postgresql93-devel                          x86_64                        9.3.4-1PGDG.rhel6                        /postgresql93-devel-9.3.4-1PGDG.rhel6.x86_64                         6.7 M
 postgresql93-libs                           x86_64                        9.3.4-1PGDG.rhel6                        /postgresql93-libs-9.3.4-1PGDG.rhel6.x86_64                          631 k
 postgresql93-server                         x86_64                        9.3.4-1PGDG.rhel6                        /postgresql93-server-9.3.4-1PGDG.rhel6.x86_64                         15 M
 ttmkfdir                                    x86_64                        3.0.9-32.1.el6                           /ttmkfdir-3.0.9-32.1.el6.x86_64                                       99 k
 wkhtmltox                                   x86_64                        1:0.12.2.1-1                             /wkhtmltox-0.12.2.1_linux-centos6-amd64                              109 M
 xorg-x11-font-utils                         x86_64                        1:7.2-11.el6                             /xorg-x11-font-utils-7.2-11.el6.x86_64                               294 k
 xorg-x11-fonts-75dpi                        noarch                        7.2-9.1.el6                              /xorg-x11-fonts-75dpi-7.2-9.1.el6.noarch                             2.9 M
 xorg-x11-fonts-Type1                        noarch                        7.2-9.1.el6                              /xorg-x11-fonts-Type1-7.2-9.1.el6.noarch                             863 k
Installing for dependencies:
 avahi-libs                                  x86_64                        0.6.25-15.el6                            base                                                                  55 k
 blas                                        x86_64                        3.2.1-4.el6                              base                                                                 321 k
 c-ares                                      x86_64                        1.10.0-3.el6                             base                                                                  75 k
 cups-libs                                   x86_64                        1:1.4.2-72.el6                           base                                                                 321 k
 cyrus-sasl-gssapi                           x86_64                        2.1.23-15.el6_6.2                        base                                                                  34 k
 fontconfig                                  x86_64                        2.8.0-5.el6                              base                                                                 186 k
 gnutls                                      x86_64                        2.8.5-19.el6_7                           updates                                                              347 k
 keyutils                                    x86_64                        1.4-5.el6                                base                                                                  39 k
 lapack                                      x86_64                        3.2.1-4.el6                              base                                                                 4.3 M
 libX11                                      x86_64                        1.6.0-6.el6                              base                                                                 586 k
 libX11-common                               noarch                        1.6.0-6.el6                              base                                                                 192 k
 libXau                                      x86_64                        1.0.6-4.el6                              base                                                                  24 k
 libXext                                     x86_64                        1.3.2-2.1.el6                            base                                                                  35 k
 libXrender                                  x86_64                        0.9.8-2.1.el6                            base                                                                  24 k
 libbasicobjects                             x86_64                        0.1.1-11.el6                             base                                                                  21 k
 libcollection                               x86_64                        0.6.2-11.el6                             base                                                                  36 k
 libdhash                                    x86_64                        0.4.3-11.el6                             base                                                                  24 k
 libevent                                    x86_64                        1.4.13-4.el6                             base                                                                  66 k
 libgfortran                                 x86_64                        4.4.7-16.el6                             base                                                                 267 k
 libgssglue                                  x86_64                        0.1-11.el6                               base                                                                  23 k
 libini_config                               x86_64                        1.1.0-11.el6                             base                                                                  46 k
 libipa_hbac                                 x86_64                        1.12.4-47.el6_7.8                        updates                                                              106 k
 libldb                                      x86_64                        1.1.25-2.el6_7                           updates                                                              113 k
 libnl                                       x86_64                        1.1.4-2.el6                              base                                                                 121 k
 libpath_utils                               x86_64                        0.2.1-11.el6                             base                                                                  24 k
 libref_array                                x86_64                        0.1.4-11.el6                             base                                                                  23 k
 libsss_idmap                                x86_64                        1.12.4-47.el6_7.8                        updates                                                              110 k
 libtalloc                                   x86_64                        2.1.5-1.el6_7                            updates                                                               26 k
 libtdb                                      x86_64                        1.3.8-1.el6_7                            updates                                                               43 k
 libtevent                                   x86_64                        0.9.26-2.el6_7                           updates                                                               29 k
 libtiff                                     x86_64                        3.9.4-10.el6_5                           base                                                                 343 k
 libtirpc                                    x86_64                        0.2.1-10.el6                             base                                                                  79 k
 libxcb                                      x86_64                        1.9.1-3.el6                              base                                                                 110 k
 nfs-utils                                   x86_64                        1:1.2.3-64.el6                           base                                                                 331 k
 nfs-utils-lib                               x86_64                        1.1.5-11.el6                             base                                                                  68 k
 pytalloc                                    x86_64                        2.1.5-1.el6_7                            updates                                                               10 k
 python-argparse                             noarch                        1.2.1-2.1.el6                            base                                                                  48 k
 python-sssdconfig                           noarch                        1.12.4-47.el6_7.8                        updates                                                              133 k
 rpcbind                                     x86_64                        0.2.0-11.el6_7                           updates                                                               51 k
 samba4-libs                                 x86_64                        4.2.10-6.el6_7                           updates                                                              4.4 M
 sssd                                        x86_64                        1.12.4-47.el6_7.8                        updates                                                              101 k
 sssd-ad                                     x86_64                        1.12.4-47.el6_7.8                        updates                                                              193 k
 sssd-client                                 x86_64                        1.12.4-47.el6_7.8                        updates                                                              152 k
 sssd-common                                 x86_64                        1.12.4-47.el6_7.8                        updates                                                              978 k
 sssd-common-pac                             x86_64                        1.12.4-47.el6_7.8                        updates                                                              136 k
 sssd-ipa                                    x86_64                        1.12.4-47.el6_7.8                        updates                                                              238 k
 sssd-krb5                                   x86_64                        1.12.4-47.el6_7.8                        updates                                                              135 k
 sssd-krb5-common                            x86_64                        1.12.4-47.el6_7.8                        updates                                                              191 k
 sssd-ldap                                   x86_64                        1.12.4-47.el6_7.8                        updates                                                              216 k
 sssd-proxy                                  x86_64                        1.12.4-47.el6_7.8                        updates                                                              130 k

Transaction Summary
===============================================================================================================================================================================================
Install      65 Package(s)

Total size: 252 M
Total download size: 15 M
Installed size: 277 M
insightiq       0:off   1:off   2:on    3:on    4:on    5:on    6:off
chmod: cannot access `sssd': No such file or directory
ip6tables: unrecognized service
ip6tables: unrecognized service
error reading information on service ip6tables: No such file or directory
Shutting down interface eth0:  [  OK  ]
Shutting down loopback interface:  [  OK  ]
Bringing up loopback interface:  [  OK  ]
Bringing up interface eth0:  Determining if ip address 132.206.178.250 is already in use for device eth0...
[  OK  ]
Generating RSA private key, 2048 bit long modulus
..+++
.........................................................................................................................+++
e is 65537 (0x10001)
Signature ok
subject=/C=US/ST=Washington/L=Seattle/O=EMC Isilon/CN=InsightIQ/emailAddress=support@emc.com
Getting Private key
Initializing database: [  OK  ]
Starting iiq_db service: [  OK  ]
Starting insightiq: [  OK  ]

Installed:
  freetype.x86_64 0:2.3.11-14.el6_3.1            isilon-insightiq.x86_64 0:4.0.0.0049-1           libXfont.x86_64 0:1.4.5-3.el6_5             libfontenc.x86_64 0:1.0.5-2.el6                
  libjpeg-turbo.x86_64 0:1.2.1-3.el6_5           libpng.x86_64 2:1.2.49-1.el6_2                   postgresql93.x86_64 0:9.3.4-1PGDG.rhel6     postgresql93-devel.x86_64 0:9.3.4-1PGDG.rhel6  
  postgresql93-libs.x86_64 0:9.3.4-1PGDG.rhel6   postgresql93-server.x86_64 0:9.3.4-1PGDG.rhel6   ttmkfdir.x86_64 0:3.0.9-32.1.el6            wkhtmltox.x86_64 1:0.12.2.1-1                  
  xorg-x11-font-utils.x86_64 1:7.2-11.el6        xorg-x11-fonts-75dpi.noarch 0:7.2-9.1.el6        xorg-x11-fonts-Type1.noarch 0:7.2-9.1.el6  

Dependency Installed:
  avahi-libs.x86_64 0:0.6.25-15.el6                blas.x86_64 0:3.2.1-4.el6                        c-ares.x86_64 0:1.10.0-3.el6                cups-libs.x86_64 1:1.4.2-72.el6                
  cyrus-sasl-gssapi.x86_64 0:2.1.23-15.el6_6.2     fontconfig.x86_64 0:2.8.0-5.el6                  gnutls.x86_64 0:2.8.5-19.el6_7              keyutils.x86_64 0:1.4-5.el6                    
  lapack.x86_64 0:3.2.1-4.el6                      libX11.x86_64 0:1.6.0-6.el6                      libX11-common.noarch 0:1.6.0-6.el6          libXau.x86_64 0:1.0.6-4.el6                    
  libXext.x86_64 0:1.3.2-2.1.el6                   libXrender.x86_64 0:0.9.8-2.1.el6                libbasicobjects.x86_64 0:0.1.1-11.el6       libcollection.x86_64 0:0.6.2-11.el6            
  libdhash.x86_64 0:0.4.3-11.el6                   libevent.x86_64 0:1.4.13-4.el6                   libgfortran.x86_64 0:4.4.7-16.el6           libgssglue.x86_64 0:0.1-11.el6                 
  libini_config.x86_64 0:1.1.0-11.el6              libipa_hbac.x86_64 0:1.12.4-47.el6_7.8           libldb.x86_64 0:1.1.25-2.el6_7              libnl.x86_64 0:1.1.4-2.el6                     
  libpath_utils.x86_64 0:0.2.1-11.el6              libref_array.x86_64 0:0.1.4-11.el6               libsss_idmap.x86_64 0:1.12.4-47.el6_7.8     libtalloc.x86_64 0:2.1.5-1.el6_7               
  libtdb.x86_64 0:1.3.8-1.el6_7                    libtevent.x86_64 0:0.9.26-2.el6_7                libtiff.x86_64 0:3.9.4-10.el6_5             libtirpc.x86_64 0:0.2.1-10.el6                 
  libxcb.x86_64 0:1.9.1-3.el6                      nfs-utils.x86_64 1:1.2.3-64.el6                  nfs-utils-lib.x86_64 0:1.1.5-11.el6         pytalloc.x86_64 0:2.1.5-1.el6_7                
  python-argparse.noarch 0:1.2.1-2.1.el6           python-sssdconfig.noarch 0:1.12.4-47.el6_7.8     rpcbind.x86_64 0:0.2.0-11.el6_7             samba4-libs.x86_64 0:4.2.10-6.el6_7            
  sssd.x86_64 0:1.12.4-47.el6_7.8                  sssd-ad.x86_64 0:1.12.4-47.el6_7.8               sssd-client.x86_64 0:1.12.4-47.el6_7.8      sssd-common.x86_64 0:1.12.4-47.el6_7.8         
  sssd-common-pac.x86_64 0:1.12.4-47.el6_7.8       sssd-ipa.x86_64 0:1.12.4-47.el6_7.8              sssd-krb5.x86_64 0:1.12.4-47.el6_7.8        sssd-krb5-common.x86_64 0:1.12.4-47.el6_7.8    
  sssd-ldap.x86_64 0:1.12.4-47.el6_7.8             sssd-proxy.x86_64 0:1.12.4-47.el6_7.8           

Configure IIQ and X509 Certificates for Web Access

  • Follow the install manual, with the caveat that it contains some errors/omissions.
  • Create a user called iiq on the IIQ server (zaphod).
  • On the Isilon cluster, activate the user insightiq in the File authentication provider, System zone.
BIC-Isilon-Cluster-4# isi auth users view insightiq
                    Name: insightiq
                      DN: -
              DNS Domain: -
                  Domain: UNIX_USERS
                Provider: lsa-file-provider:System
        Sam Account Name: insightiq
                     UID: 15
                     SID: S-1-22-1-15
                 Enabled: Yes
                 Expired: No
                  Expiry: -
                  Locked: No
                   Email: -
                   GECOS: InsightIQ User
           Generated GID: No
           Generated UID: No
           Generated UPN: Yes
           Primary Group
                          ID: GID:15
                        Name: insightiq
          Home Directory: /ifs/home/insightiq
        Max Password Age: -
        Password Expired: No
         Password Expiry: -
       Password Last Set: -
        Password Expires: Yes
                   Shell: /sbin/nologin
                     UPN: insightiq@UNIX_USERS
User Can Change Password: No
  • Install the BIC wildcard Comodo X509 certificate, the server key and the Comodo cert bundle in /home/iiq/certificates/STAR_bic_mni_mcgill_ca.pem.
root@zaphod ~$ cat STAR_bic_mni_mcgill_ca.crt \
                   STAR_bic_mni_mcgill_ca.key \
                   COMODO_CA_bundle.crt >> /home/iiq/certificates/STAR_bic_mni_mcgill_ca.pem
  • Protect that file since it contains the secret server key.
root@zaphod ~$ chmod 400 /home/iiq/certificates/STAR_bic_mni_mcgill_ca.pem
  • Modify /etc/isilon/insightiq.ini for the server cert location: ssl_pem = /home/iiq/certificates/STAR_bic_mni_mcgill_ca.pem.
  • Restart the IIQ stuff with /etc/init.d/insightiq restart.
  • The installation guide uses the command iiq_restart, which is an alias defined in /etc/profile.d/insightiq.sh.
  • Check /var/log/insightiq_stdio.log to see whether the cert is OK (an external check sketch follows at the end of this list).
  • Ports 80 and 443 must not be blocked by a firewall. Access restrictions should be enabled, however.
  • Connect to the web interface using the credentials of the user iiq.
  • Go to “Settings” and add a cluster to monitor using the SmartConnect access IP sip.bic.mni.mcgill.ca.
  • Use the cluster’s local user insightiq and its credentials to connect to the cluster.
  • Bingo.
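
  • As promised above, a quick external check of the served certificate (a sketch; the host name zaphod.bic.mni.mcgill.ca is an assumption):

openssl s_client -connect zaphod.bic.mni.mcgill.ca:443 -showcerts </dev/null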

NFS Benchmarks Using FIO

  • The following is shamelessly copied/stolen (with a few modifications to suit our local environment) from this EMC blog entry:

https://community.emc.com/community/products/isilon/blog/2015/12/08/nfs-throughput-benchmarks-with-fio

  • The benchmark as described in the URL above bypasses the mechanisms provided by the Isilon product SmartConnect Advanced:
    • Connections to the cluster are made directly to the IP of a particular node in the Isilon cluster.
    • This is done in order not to introduce any bias from the load balancing (round-robin, CPU or network) done by SmartConnect.
  • Strategy:
    • Latencies and IOPS (I/O operations per second) are the most meaningful metrics when assessing random I/O performance: bandwidth is of secondary value for this I/O access pattern.
    • For sequential access to storage, I/O performance is best assessed by measuring the client-to-server bandwidth.
    • The client buffers and caching mechanisms must be examined and dealt with carefully.
    • We are not interested in benchmarking the clients' efficient use of local caches!
  • A bird's-eye view of the network layout and the working-files organization, as explained in the URL above:
 /ifs/data/fiotest => /mnt/isilon/fiotest/
                                         ..
                                         /fiojob_8k_50G_4jobs_randrw
                                         /fioresult_8k_50G_4jobs_randrw_172.16.20.42.log
                                         /172.16.20.102/
                                         /172.16.20.203/
                                         /172.16.20.204/
                                         /172.16.20.42/
                                         /control/.. /cleanup_remount_1to1.sh
                                                     /nfs_copy_trusted.sh
+-----------------+                                  /run_nfs_fio_8k_50G_4jobs_randrw.sh
| node02          |                                  /nfs_hosts.list
| 172.16.20.202   |                                  /trusted.key
+-----------------+                                  /trusted.key.pub
                    2x 1GiG                                   
+-----------------+         +------------------+
| node03          |.........| LNN2             |
| 172.16.20.203   |.........| 172.16.20.236    |
+-----------------+         +------------------+
+-----------------+         +------------------+
| node04          |.........| LNN4             |
| 172.16.20.204   |.........| 172.16.20.234    |
+-----------------+         +------------------+
+-----------------+         +------------------+
| thaisa          |.........| LNN5             |
| 172.16.20.42    |.........| 172.16.20.235    |
+-----------------+         +------------------+
+-----------------+         +------------------+
| widow           |.........| LNN1             |
| 172.16.20.102   |.........| 172.16.20.237    |
+-----------------+         +------------------+
                            +------------------+
                            | LNN3             |
                            | 172.16.20.233    |
                            +------------------+

If this diagram is enough for you and you are not interested in the details of the networking setup, skip to the FIO Configuration and Benchmarking section, or simply jump to the FIO NFS Statistics Reports section for the actual results of the benchmarks.


Nodes Configuration

  • Use the nodes node02, node03, node04, thaisa and widow.
  • Note: somehow I can’t make the 6th node, vaux, mount the Isilon exports, so out of frustration I didn’t use it.
  • Here are the relevant nodes’ network configurations and settings.
  • node02, node03 and node04 have the same network layout:
    • eth0 in 192.168.86.0/24
    • eth1 and eth2 bonded to bond0 in the data network 172.16.20.0/24
    • bond0:0 IP alias in the management network 172.16.10.0/24
    • All data links from the nodes to the Isilon cluster front-end network are dual 1 GbE in a bond.
node02:

eth0       inet addr:192.168.86.202  Bcast:192.168.86.255  Mask:255.255.255.0
bond0      inet addr:172.16.20.202   Bcast:172.16.20.255   Mask:255.255.255.0
bond0:0    inet addr:172.16.10.202   Bcast:172.16.10.255   Mask:255.255.255.0

~# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.86.1    0.0.0.0         UG    0      0        0 eth0
172.16.10.0     0.0.0.0         255.255.255.0   U     0      0        0 bond0
172.16.20.0     0.0.0.0         255.255.255.0   U     0      0        0 bond0
192.168.86.0    0.0.0.0         255.255.255.0   U     0      0        0 eth0

node03:

eth0       inet addr:192.168.86.203  Bcast:192.168.86.255  Mask:255.255.255.0
bond0      inet addr:172.16.20.203   Bcast:172.16.20.255   Mask:255.255.255.0
bond0:0    inet addr:172.16.10.203   Bcast:172.16.10.255   Mask:255.255.255.0

~# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.86.1    0.0.0.0         UG    0      0        0 eth0
172.16.10.0     0.0.0.0         255.255.255.0   U     0      0        0 bond0
172.16.20.0     0.0.0.0         255.255.255.0   U     0      0        0 bond0
192.168.86.0    0.0.0.0         255.255.255.0   U     0      0        0 eth0

node04:

eth0       inet addr:192.168.86.204  Bcast:192.168.86.255  Mask:255.255.255.0
bond0      inet addr:172.16.20.204   Bcast:172.16.20.255   Mask:255.255.255.0
bond0:0    inet addr:172.16.10.204   Bcast:172.16.10.255   Mask:255.255.255.0

~# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.86.1    0.0.0.0         UG    0      0        0 eth0
172.16.10.0     0.0.0.0         255.255.255.0   U     0      0        0 bond0
172.16.20.0     0.0.0.0         255.255.255.0   U     0      0        0 bond0
192.168.86.0    0.0.0.0         255.255.255.0   U     0      0        0 eth0
  • thaisa and widow are, in real life, Xen Dom0 hosts.
    • They provide virtual hosts when running a Xen-ified kernel.
    • For the purpose of this test, both have been rebooted without a Xen kernel.
    • They use a (virtual) bridged network interface xenbr0 connected to a bonded network interface bond0 that acts as the external physical network interface.
thaisa:

~# brctl show

bridge name     bridge id               STP enabled     interfaces
xenbr0          8000.00e081c19a1a       no              bond0

xenbr0    inet addr:132.206.178.42  Bcast:132.206.178.255  Mask:255.255.255.0
xenbr0:0  inet addr:172.16.20.42    Bcast:172.16.20.255    Mask:255.255.255.0

route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         132.206.178.1   0.0.0.0         UG    0      0        0 xenbr0
132.206.178.0   0.0.0.0         255.255.255.0   U     0      0        0 xenbr0
172.16.20.0     0.0.0.0         255.255.255.0   U     0      0        0 xenbr0
192.168.86.0    0.0.0.0         255.255.255.0   U     0      0        0 xenbr0

widow:

~# brctl show

bridge name     bridge id               STP enabled     interfaces
xenbr0          8000.00e081c19a9a       no              bond0

xenbr0    inet addr:132.206.178.102  Bcast:132.206.178.255  Mask:255.255.255.0
xenbr0:0  inet addr:172.16.20.102  Bcast:172.16.20.255  Mask:255.255.255.0

route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         132.206.178.1   0.0.0.0         UG    0      0        0 xenbr0
132.206.178.0   0.0.0.0         255.255.255.0   U     0      0        0 xenbr0
172.16.20.0     0.0.0.0         255.255.255.0   U     0      0        0 xenbr0
192.168.86.0    0.0.0.0         255.255.255.0   U     0      0        0 xenbr0
  • Create a NIS netgroup fiotest:
# Temp netgroup for the fio test: node02, node03, node04, thaisa, widow.
# The nodes access the Isilon from the 172.16.20.0/24 network.
# thaisa, vaux and widow have bonded NIC aliases in the 172.16.20.0/24 and 132.206.178.0/24 networks.
# THE FOLLOWING ENTRY IS A ONE-LINER. WATCH OUT FOR END-OF-LINE CHARACTERS!
# DO NOT COPY-AND-PASTE!!

fiotest (172.16.20.202,,) (172.16.20.203,,) (172.16.20.204,,) \
(172.16.20.42,,) (132.206.178.42,,) (thaisa.bic.mni.mcgill.ca,,) \
(172.16.20.102,,) (132.206.178.102,,) (widow.bic.mni.mcgill.ca,,)
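
  • Sanity check that the netgroup resolves through NIS (a sketch):

getent netgroup fiotest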
  • node02 is the control host (called “harness” in the blog entry above).
  • node02 should have password-less root access to the Isilon cluster and to the hosts in fiotest:
    • Create an ssh key and distribute the public key to the fiotest hosts in ~root/.ssh/authorized_keys (a sketch follows after the config file below).
    • Distribute the public key on all the nodes of the cluster, using isi_for_array.
    • Create an ssh config file that redirects the ssh host keys to /dev/null.
    • The options CheckHostIP, StrictHostKeyChecking and UserKnownHostsFile remove any spurious warning messages from the output stream of the FIO log files.
node02:~# cat .ssh/config 
Host mgmt.isi.bic.mni.mcgill.ca
    ForwardX11      no
    ForwardAgent    no
    User            root
    CheckHostIP     no
    StrictHostKeyChecking no
    UserKnownHostsFile /dev/null
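
A minimal sketch of the key generation and distribution mentioned above. The host IPs and the mgmt.isi.bic.mni.mcgill.ca alias are the ones used in this document; the exact paths on the OneFS side are assumptions to verify:

#!/bin/bash
# On node02: create a dedicated password-less key pair (the names match the
# control/ directory listing shown in the diagram further below).
ssh-keygen -t rsa -N '' -f ./trusted.key

# Append the public key to root's authorized_keys on each benchmarking host.
for h in 172.16.20.203 172.16.20.204 172.16.20.42 172.16.20.102 ; do
    ssh root@${h} 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys' \
        < ./trusted.key.pub
done

# On the Isilon side: copy the key once to the shared /ifs, then let
# isi_for_array append it to /root/.ssh/authorized_keys on every node.
scp ./trusted.key.pub root@mgmt.isi.bic.mni.mcgill.ca:/ifs/trusted.key.pub
ssh root@mgmt.isi.bic.mni.mcgill.ca \
    "isi_for_array 'cat /ifs/trusted.key.pub >> /root/.ssh/authorized_keys'"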
  • Verify that you can ssh from node02 to any of the benchmarking nodes or cluster nodes, and also to mgmt.isi.bic.mni.mcgill.ca, without being asked for a password.
  • Continue if and only if you have this working without any problem.
  • Create a NFSv4 export on the Isilon cluster with the properties:
isi nfs exports create /ifs/data/fiotest --zone prod --root-clients fiotest --clients fiotest
isi nfs aliases create /fiotest /ifs/data/fiotest --zone prod
isi quota quotas create /ifs/data/fiotest directory --zone prod --hard-threshold 2T --container yes

chmod 777 /ifs/data/fiotest
ls -ld /ifs/data/fiotest 
drwxrwxrwx    9 root  wheel  4711 Sep 27 11:11 /ifs/data/fiotest
  • no_root_squash for hosts in fiotest: all nodes must be able to write as root in /ifs/data/fiotest
  • read/write by everyone for the top dir /ifs/data/fiotest
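
To cross-check what was just created, the matching OneFS list commands can be run on the cluster (a sketch; exact flags may vary slightly between OneFS releases):

isi nfs exports list --zone prod
isi nfs aliases list --zone prod
isi quota quotas list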
  • On all the nodes:
    • Create the local mount point /mnt/isilon/fiotest.
    • Verify that each node can mount the Isilon export on the local mount point /mnt/isilon/fiotest
    • Continue if and only if you have this working without any problem.
  • Create a file /mnt/isilon/fiotest/control/nfs_hosts.list with one line per pair of benchmarking node IP and Isilon cluster node IP.
  • The separator is the pipe character |.
  • No comments, no white space, no trailing end-of-line white space!
~# cat /mnt/isilon/fiotest/control/nfs_hosts.list
172.16.20.203|172.16.20.236
172.16.20.204|172.16.20.234
172.16.20.42|172.16.20.235
172.16.20.102|172.16.20.237
  • Verify that the export /ifs/data/fiotest on the Isilon cluster can be mounted on all nodes (a verification loop sketch follows).
  • Only when the configuration above is done and working correctly can we start working with FIO.
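
A verification loop sketch, run from node02, that test-mounts each 1-to-1 pair from nfs_hosts.list (same conventions and paths as the cleanup script shown further below):

#!/bin/bash
# For every client|Isilon-node pair, remount the export on the client and
# make sure root can write into it. Note ssh -n: without it ssh would eat
# the rest of the lines being read from nfs_hosts.list.
while IFS='|' read -r client isinode ; do
    echo "Testing ${client} -> ${isinode}"
    ssh -n -i /mnt/isilon/fiotest/control/trusted.key root@${client} \
        "mkdir -p /mnt/isilon/fiotest && \
         umount -fl /mnt/isilon/fiotest 2>/dev/null; \
         mount -t nfs -o vers=4 ${isinode}:/fiotest /mnt/isilon/fiotest && \
         touch /mnt/isilon/fiotest/.write_test_${client} && echo OK || echo FAILED"
done < /mnt/isilon/fiotest/control/nfs_hosts.list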

Not interested in the FIO configuration? Jump to the FIO NFS Statistics Reports section for the actual results of the benchmarks.


FIO Configuration and Benchmarking

  • The FIO benchmarks are run using the following logic:
    • On the master node node02, for each benchmark run with a specific FIO configuration:
      • Run the cleaning script /mnt/isilon/fiotest/control/cleanup_remount_1to1.sh:
      • For each benchmarking node in /mnt/isilon/fiotest/control/nfs_hosts.list, it:
        • Removes any FIO working files located in /mnt/isilon/fiotest/xxx.xxx.xxx.xxx used in the read-write benchmarks.
        • Unmounts and remounts the Isilon export on the benchmarking node.
      • Run the FIO script /mnt/isilon/fiotest/control/run_nfs_fio_8k_50G_4jobs_randrw.sh:
        • It flushes the L1 and L2 caches on all the Isilon cluster nodes.
        • Then, for each node in /mnt/isilon/fiotest/control/nfs_hosts.list, it:
          • Connects to the benchmarking node with ssh.
          • Syncs all local filesystem cache buffers to disk and flushes all I/O caches.
          • Runs the FIO command:
          • The FIO job file is /mnt/isilon/fiotest/fiojob_8k_50G_4jobs_randrw.
          • The FIO working dir is /mnt/isilon/fiotest/xxx.xxx.xxx.xxx, where xxx.xxx.xxx.xxx is the IP of the benchmarking node.
          • The FIO output is sent to /mnt/isilon/fiotest/fioresult_8k_50G_4jobs_randrw_xxx.xxx.xxx.xxx.log.
  • Once more, an ascii diagram explaining the file layout:
 /ifs/data/fiotest => /mnt/isilon/fiotest/
                                         ..
                                         /fiojob_8k_50G_4jobs_randrw                      <-- FIO jobfile
                                         /fioresult_8k_50G_4jobs_randrw_172.16.20.42.log  <-- FIO output logfile
                                         /172.16.20.102/                                  <-- FIO working directory
                                         /172.16.20.203/                                  <-- "                   "
                                         /172.16.20.204/                                  <-- "                   "
                                         /172.16.20.42/                                   <-- "                   "
                                         /control/.. /cleanup_remount_1to1.sh             <-- cleanup and remount script
                                                     /nfs_copy_trusted.sh                 <-- key distributor script
+-----------------+                                  /run_nfs_fio_8k_50G_4jobs_randrw.sh  <-- FIO start script
| node02          |                                  /nfs_hosts.list                      <-- nodes list
| 172.16.20.202   |                                  /trusted.key                         <-- ssh private
+-----------------+                                  /trusted.key.pub                     <-- and public keys
                    2x 1GiG                                    
+-----------------+         +------------------+
| node03          |.........| LNN2             |
| 172.16.20.203   |.........| 172.16.20.236    |
+-----------------+         +------------------+
+-----------------+         +------------------+
| node04          |.........| LNN4             |
| 172.16.20.204   |.........| 172.16.20.234    |
+-----------------+         +------------------+
+-----------------+         +------------------+
| thaisa          |.........| LNN5             |
| 172.16.20.42    |.........| 172.16.20.235    |
+-----------------+         +------------------+
+-----------------+         +------------------+
| widow           |.........| LNN1             |
| 172.16.20.102   |.........| 172.16.20.237    |
+-----------------+         +------------------+
                            +------------------+
                            | LNN3             |
                            | 172.16.20.233    |
                            +------------------+

The cleanup script file /mnt/isilon/fiotest/control/cleanup_remount_1to1.sh

#!/bin/bash

# First, go through all the lines in nfs_hosts.list

for i in $(cat /mnt/isilon/fiotest/control/nfs_hosts.list) ; do

# then split each line read into an array at the pipe symbol

    IFS='|' read -a pairs <<< "${i}"

# show back the mapping

    echo "Client host: ${pairs[0]}  Isilon node: ${pairs[1]}"

# Connect over ssh with the key: remove the working dir, remount the export,
# recreate the dir etc. The remote command has to be a single line.

    ssh -i /mnt/isilon/fiotest/control/trusted.key ${pairs[0]} -fqno StrictHostKeyChecking=no \
    "[ -d /mnt/isilon/fiotest/${pairs[0]} ] && rm -rf /mnt/isilon/fiotest/${pairs[0]}; sleep 1; \
    umount -fl /mnt/isilon/fiotest; sleep 1; \
    mount -t nfs -o vers=4 ${pairs[1]}:/fiotest /mnt/isilon/fiotest; sleep 1; \
    [ ! -d /mnt/isilon/fiotest/${pairs[0]} ] && mkdir /mnt/isilon/fiotest/${pairs[0]}"

# erase the array pair
    unset pairs

# go for the next line in nfs_hosts.list;
done

The FIO script file /mnt/isilon/fiotest/control/run_nfs_fio_8k_50G_4jobs_randrw.sh

#!/bin/bash

# First, connect to the cluster management address and flush the caches on the array.
# This might take minutes to complete.

echo -n "Purging L1 and L2 cache first...";

ssh -i /mnt/isilon/fiotest/control/trusted.key mgmt.isi.bic.mni.mcgill.ca -fqno StrictHostKeyChecking=no "isi_for_array isi_flush"
#ssh -i /mnt/isilon/fiotest/control/trusted.key mgmt.isi.bic.mni.mcgill.ca -fqno StrictHostKeyChecking=no "isi_for_array w"

# Wait for the cache flushing to finish; normally around 10 seconds is enough.
# On larger clusters, sometimes up to a few minutes should be allowed!
echo "...sleeping for 30secs"
sleep 30

# The L3 cache purge is not recommended, as all the metadata accelerated by the SSDs would be dropped. But, maybe...
#echo "On OneFS 7.1.1 clusters and newer, running L3, purging L3 cache";
#ssh -i /mnt/isilon/fiotest/control/trusted.key 10.63.208.64 -fqno StrictHostKeyChecking=no "isi_for_array isi_flush --l3-full";
#sleep 10;

# The rest is similar to the other scripts
# First go through all lines in nfs_hosts.list

for i in $(cat /mnt/isilon/fiotest/control/nfs_hosts.list) ; do

# then split each line read into an array at the pipe symbol

    IFS='|' read -a pairs <<< "${i}"

# Connect to each benchmarking node over ssh with the key, flush its local
# caches and run FIO there. The remote command has to be a single line.
# "sync && echo 3 > /proc/sys/vm/drop_caches" flushes dirty buffers to disk
# and drops the page cache, dentries and inodes.

# The fio jobfile is one level above the control directory

    ssh -i /mnt/isilon/fiotest/control/trusted.key ${pairs[0]} -fqno StrictHostKeyChecking=no \
           "sync && echo 3 > /proc/sys/vm/drop_caches; FILENAME=\"/mnt/isilon/fiotest/${pairs[0]}\" \
            fio --output=/mnt/isilon/fiotest/fioresult_8k_50G_4jobs_randrw_${pairs[0]}.log \
            /mnt/isilon/fiotest/fiojob_8k_50G_4jobs_randrw"

done

The FIO jobfile /mnt/isilon/fiotest/fiojob_8k_50G_4jobs_randrw.

  • The most important parameters:
    • directory=${FILENAME} sets the working directory to the variable ${FILENAME}, set in the FIO calling script.
    • rw=randrw specifies a mixed random read and write I/O pattern.
    • size=50G sets the total transferred I/O size to 50GB.
    • bs= sets the block size for I/O units. Default: 4k.
    • direct=0 makes use of buffered I/O.
    • ioengine=sync uses a synchronous ioengine (plain read, write and lseek system calls).
    • iodepth=1 sets the I/O depth to 1 (number of I/O units to keep in flight against the working file).
    • numjobs=4 creates 4 clones (processes/threads performing the same workload) of this job.
    • group_reporting aggregates per-job stats into one per-group report when numjobs is specified.
    • runtime=10800 restricts the run time to 10800 seconds, i.e. 3 hours. This might limit the total transferred size to less than the value specified by size=.
; --start job file --
[global]
description=-------------THIS IS A JOB DOING ${FILENAME} ---------
directory=${FILENAME}
rw=randrw
size=50G
bs=8k
zero_buffers
direct=0
sync=0
refill_buffers
ioengine=sync
iodepth=1
numjobs=4
group_reporting
runtime=10800
[8k_randread]
; -- end job file --
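
For reference, the job file can be run by hand on a single benchmarking node with the same invocation the run script uses (substitute the node's own IP for the working directory):

# Flush local caches first, then run the job against this node's working dir.
sync && echo 3 > /proc/sys/vm/drop_caches
FILENAME="/mnt/isilon/fiotest/172.16.20.203" \
    fio --output=/mnt/isilon/fiotest/fioresult_8k_50G_4jobs_randrw_172.16.20.203.log \
        /mnt/isilon/fiotest/fiojob_8k_50G_4jobs_randrw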

A typical FIO output log file looks like the following:

  • 1024k I/O block size, random read-write I/O pattern, 4 threads, 50GB of data transferred per thread, 200GB total.
  • 4 of these are submitted at the same time on 4 different nodes, for a total of 16 threads and a total transferred size of 800GB (runtime= might limit this).
1024k_randrw: (g=0): rw=randrw, bs=1M-1M/1M-1M/1M-1M, ioengine=sync, iodepth=1
...
fio-2.1.11
Starting 4 processes
1024k_seqrw: Laying out IO file(s) (1 file(s) / 51200MB)
1024k_seqrw: Laying out IO file(s) (1 file(s) / 51200MB)
1024k_seqrw: Laying out IO file(s) (1 file(s) / 51200MB)
1024k_seqrw: Laying out IO file(s) (1 file(s) / 51200MB)

1024k_seqrw: (groupid=0, jobs=4): err= 0: pid=27587: Mon Oct  3 11:03:24 2016
  Description  : [-------------THIS IS A JOB DOING /mnt/isilon/fiotest/172.16.20.203 ---------]
  read : io=102504MB, bw=36773KB/s, iops=35, runt=2854400msec
    clat (msec): min=11, max=29666, avg=106.96, stdev=433.67
     lat (msec): min=11, max=29666, avg=106.96, stdev=433.67
    clat percentiles (msec):
     |  1.00th=[   23],  5.00th=[   27], 10.00th=[   29], 20.00th=[   34],
     | 30.00th=[   36], 40.00th=[   42], 50.00th=[   54], 60.00th=[   83],
     | 70.00th=[  110], 80.00th=[  143], 90.00th=[  198], 95.00th=[  255],
     | 99.00th=[  408], 99.50th=[  510], 99.90th=[ 6390], 99.95th=[10290],
     | 99.99th=[16712]
    bw (KB  /s): min=   34, max=37415, per=31.68%, avg=11651.07, stdev=8193.03
  write: io=102296MB, bw=36698KB/s, iops=35, runt=2854400msec
    clat (usec): min=399, max=57018, avg=450.81, stdev=477.23
     lat (usec): min=399, max=57018, avg=451.15, stdev=477.23
    clat percentiles (usec):
     |  1.00th=[  410],  5.00th=[  418], 10.00th=[  422], 20.00th=[  426],
     | 30.00th=[  430], 40.00th=[  434], 50.00th=[  438], 60.00th=[  442],
     | 70.00th=[  450], 80.00th=[  458], 90.00th=[  470], 95.00th=[  490],
     | 99.00th=[  556], 99.50th=[  580], 99.90th=[  684], 99.95th=[  956],
     | 99.99th=[27520]
    bw (KB  /s): min=   34, max=80788, per=34.53%, avg=12670.23, stdev=10361.46
    lat (usec) : 500=48.01%, 750=1.90%, 1000=0.01%
    lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.10%, 50=23.92%
    lat (msec) : 100=9.51%, 250=13.87%, 500=2.40%, 750=0.14%, 1000=0.01%
    lat (msec) : 2000=0.01%, >=2000=0.10%
  cpu          : usr=0.07%, sys=1.03%, ctx=2703279, majf=0, minf=118
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=102504/w=102296/d=0, short=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: io=102504MB, aggrb=36772KB/s, minb=36772KB/s, maxb=36772KB/s, mint=2854400msec, maxt=2854400msec
  WRITE: io=102296MB, aggrb=36698KB/s, minb=36698KB/s, maxb=36698KB/s, mint=2854400msec, maxt=2854400msec
  • An explanation of the output stats in a FIO logfile (from the manpage):
io     Number of megabytes of I/O performed.

bw     Average data rate (bandwidth).

runt   Threads run time.

slat   Submission latency minimum, maximum, average and standard deviation. 
       This is the time it took to submit the I/O.

clat   Completion latency minimum, maximum, average and standard deviation.  
       This is the time between submission and completion.

bw     Bandwidth minimum, maximum, percentage of aggregate bandwidth received, average and standard deviation.

cpu    CPU usage statistics. Includes user and system time, number of context switches this 
                             thread went through and number of major and minor page faults.

IO depths 
             Distribution of I/O depths.  
             Each depth includes everything less than (or equal) to it, 
             but greater than the previous depth.

IO issued 
             Number of read/write requests issued, and number of short read/write requests.

IO latencies 
             Distribution of I/O completion latencies.  
             The numbers follow the same pattern as IO depths.

       The group statistics show:

              io     Number of megabytes I/O performed.
              aggrb  Aggregate bandwidth of threads in the group.
              minb   Minimum average bandwidth a thread saw.
              maxb   Maximum average bandwidth a thread saw.
              mint   Shortest runtime of threads in the group.
              maxt   Longest runtime of threads in the group.

       Finally, disk statistics are printed with reads first:

              ios    Number of I/Os performed by all groups.
              merge  Number of merges in the I/O scheduler.
              ticks  Number of ticks we kept the disk busy.
              io_queue
                     Total time spent in the disk queue.
              util   Disk utilization.

FIO NFS Statistics Reports

  • FIO outputs stats galore!
  • Some stats, like the disk statistics, are not relevant in the case of NFS benchmarks.
  • Synchronous (sync) and asynchronous (libaio) IO, both buffered and un-buffered, should be tested.
    • Synchronous IO (sync) is what regular applications usually do.
    • Synchronous just refers to the system call interface, i.e. when the system call returns to the application.
    • It does not imply synchronous I/O (aka O_SYNC), which is way slower and enabled by sync=1.
    • Thus it does not guarantee that the I/O has been physically written to the underlying device.
    • For reads, the IO has been done by the device. For writes, it could just be sitting in the page cache for later writeback.
    • For reads, the IO always happens in the context of the process.
    • For buffered writes, it usually does not. The process merely dirties the page; kernel threads will most often do the actual writeback of the data.
    • direct=1 will circumvent the page cache.
    • direct=1 will make the writes sync as well.
    • So instead of just returning when the data is in the page cache, a sync write with direct=1 returns only once the data has been received and acknowledged by the backing device.
    • aio assumes the identity of the process. aio is mostly used by databases.
    • Question: what is the difference between the following two, other than the second one seeming to be more popular in fio example job files?
      • 1) ioengine=sync + direct=1
      • 2) ioengine=libaio + direct=1
    • Current answer: with libaio, fio can issue further I/Os while the Linux kernel handles the outstanding ones.
  • Perform FIO random IO (mixed read-write) and sequential IO (mixed read-write).
  • Block size ranges from 4k to 1024k in multiplicative steps of 2.
  • Working file size is set to 50G (runtime=10800, i.e. 3 hours, might limit the total transferred size).
  • Basic synchronous read and write (sync) is used for the ioengine (ioengine=sync).
    • A second round of async benchmarks should be attempted with ioengine=libaio (Linux native asynchronous I/O) along with direct=1 and a range of iodepth values (a sketch is given at the end of this section).
  • Buffered IO is set (direct=0).
    • Un-buffered IO will almost certainly worsen the stats, but that’s not Real Life (TM).
    • The real performance of the Isilon cluster should be assessed by bypassing the client’s local memory caching, i.e. set direct=1 with iodepth=1 and higher values.
  • Set iodepth=1, as it doesn’t make sense to use any larger value with a synchronous ioengine.
    • It is important to realize that the OS/kernel/block IO stack might restrict the iodepth parameter values.
    • This is to be checked when one sets ioengine=libaio AND direct=0.
  • 4 threads for each FIO job will be launched (numjobs=4).
  • Only consider the following stats:
    • Random IO: IOPs and latencies (average and 95th percentile values).
    • Sequential IO: bandwidth.
  • Plot the following stats versus the FIO block sizes used (4k, 8k, 16k, 32k, 64k, 128k, 256k, 512k, 1024k):
    • IOPs for random IO reads and writes.
    • Completion latency (clat) 95th percentile values, shown as 95.00th=[XXX] in the logs, i.e. 95% of all latencies are under this value.
    • Bandwidth for sequential reads and writes.
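
As for the async round mentioned above, a sketch of what the libaio variant of the job file could look like (the iodepth value here is an assumption and would be swept over a range):

; --start job file (sketch: async, un-buffered variant) --
[global]
description=-------------THIS IS A JOB DOING ${FILENAME} ---------
directory=${FILENAME}
rw=randrw
size=50G
bs=8k
zero_buffers
refill_buffers
direct=1
ioengine=libaio
iodepth=16
numjobs=4
group_reporting
runtime=10800
[8k_randrw_libaio]
; -- end job file --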