Isilon Stuff and Other Things
This is a disclaimer: using the notes below is dangerous for both your sanity and peace of mind. If you still want to read them, beware that they may be "not even wrong". Everything I write here is just a mnemonic device to give me a chance to fix things I badly broke because I'm bloody stupid and think I can tinker with stuff that is way above my head and get away with it. It reminds me of Gandalf's warning: "Perilous to all of us are the devices of an art deeper than we ourselves possess." Moreover, a lot of it I blatantly stole off the net from people obviously cleverer than me -- not very hard. Forgive me. My bad. Please consider it and go away. You have been warned!
(:#toc:)
EMC Support
- Support is at https://support.emc.com
- One must create a profile with 2 roles, one as Authorized Contact and another as Dial Home, Primary Contact.
- Site ID is:
Site ID: 1003902358 Created On: 05/13/2016 12:36 PM Site Name: MCGILL UNIVERSITY Address 1: 3801 UNIVERSITY ST Address 2: ROOM WB212 City: MONTREAL State: Country: CA Postal Code: H3A 2B4
About This Cluster
This is from the web interface, [Help] → [About This Cluster]
About This Cluster
OneFS Upgrade
Isilon OneFS v8.0.0.4 B_MR_8_0_0_4_053(RELEASE) installed on all nodes.
Packages and Updates
No packages or updates are installed.
Cluster Information
GUID: 000e1ea7eec05c211157780e00f5f0ce64c1
Cluster Hardware
Node Model Configuration Serial Number
Node 1 Isilon X410-4U-Dual-64GB-2x1GE-2x10GE SFP+-144TB 400-0049-03 SX410-301608-0260
Node 2 Isilon X410-4U-Dual-64GB-2x1GE-2x10GE SFP+-144TB 400-0049-03 SX410-301608-0255
Node 4 Isilon X410-4U-Dual-64GB-2x1GE-2x10GE SFP+-144TB 400-0049-03 SX410-301608-0264
Node 3 Isilon X410-4U-Dual-64GB-2x1GE-2x10GE SFP+-144TB 400-0049-03 SX410-301608-0254
Node 5 Isilon X410-4U-Dual-64GB-2x1GE-2x10GE SFP+-144TB 400-0049-03 SX410-301608-0248
Cluster Firmware
Device Type Firmware Nodes
BMC_S2600CP BMC 1.25.9722 1-5
CFFPS1 CFFPS 03.03 1-5
CFFPS2 CFFPS 03.03 1-5
CMCSDR_Honeybadger CMCSDR 00.0B 1-5
CMC_HFHB CMC 02.05 1-5
IsilonFPV1 FrontPnl UI.01.36 1-5
LOx2-MLC-YD Nvram rp180c01+rp180c01 1-5
Lsi DiskCtrl 20.00.04.00 1-5
LsiExp0 DiskExp 0910+0210 1-5
LsiExp1 DiskExp 0910+0210 1-5
Mellanox Network 2.30.8020+ISL1090110018 1-5
QLogic-NX2 10GigE 7.6.55 1-5
Logical Node Numbers (LNN), Device IDs, Serial Numbers and Firmware
- Use isi_for_array command to loop over the nodes and run command on each of them:
BIC-Isilon-Cluster-4# isi_for_array isi_hw_status -i
BIC-Isilon-Cluster-4: SerNo: SX410-301608-0264
BIC-Isilon-Cluster-4: Config: 400-0049-03
BIC-Isilon-Cluster-4: FamCode: X
BIC-Isilon-Cluster-4: ChsCode: 4U
BIC-Isilon-Cluster-4: GenCode: 10
BIC-Isilon-Cluster-4: Product: X410-4U-Dual-64GB-2x1GE-2x10GE SFP+-144TB
BIC-Isilon-Cluster-1: SerNo: SX410-301608-0260
BIC-Isilon-Cluster-1: Config: 400-0049-03
BIC-Isilon-Cluster-1: FamCode: X
BIC-Isilon-Cluster-1: ChsCode: 4U
BIC-Isilon-Cluster-1: GenCode: 10
BIC-Isilon-Cluster-1: Product: X410-4U-Dual-64GB-2x1GE-2x10GE SFP+-144TB
BIC-Isilon-Cluster-2: SerNo: SX410-301608-0255
BIC-Isilon-Cluster-2: Config: 400-0049-03
BIC-Isilon-Cluster-2: FamCode: X
BIC-Isilon-Cluster-2: ChsCode: 4U
BIC-Isilon-Cluster-2: GenCode: 10
BIC-Isilon-Cluster-2: Product: X410-4U-Dual-64GB-2x1GE-2x10GE SFP+-144TB
BIC-Isilon-Cluster-3: SerNo: SX410-301608-0254
BIC-Isilon-Cluster-3: Config: 400-0049-03
BIC-Isilon-Cluster-3: FamCode: X
BIC-Isilon-Cluster-3: ChsCode: 4U
BIC-Isilon-Cluster-3: GenCode: 10
BIC-Isilon-Cluster-3: Product: X410-4U-Dual-64GB-2x1GE-2x10GE SFP+-144TB
BIC-Isilon-Cluster-5: SerNo: SX410-301608-0248
BIC-Isilon-Cluster-5: Config: 400-0049-03
BIC-Isilon-Cluster-5: FamCode: X
BIC-Isilon-Cluster-5: ChsCode: 4U
BIC-Isilon-Cluster-5: GenCode: 10
BIC-Isilon-Cluster-5: Product: X410-4U-Dual-64GB-2x1GE-2x10GE SFP+-144TB
isi_nodes can extract formatted strings like:
BIC-Isilon-Cluster-3# isi_nodes %{id} %{lnn} %{name} %{serialno}
1 1 BIC-Isilon-Cluster-1 SX410-301608-0260
2 2 BIC-Isilon-Cluster-2 SX410-301608-0255
4 3 BIC-Isilon-Cluster-3 SX410-301608-0254
3 4 BIC-Isilon-Cluster-4 SX410-301608-0264
6 5 BIC-Isilon-Cluster-5 SX410-301608-0248
- Why is there an %{id} equal to 6?
- 20160923: I can now answer this question, I think.
- It might be the result of the initial configuration of the cluster back in April ‘16.
- The guy who did it (J Laganiere from Gallium-it.com) had a few problems with nodes not responding.
- The nodes are labeled from top to bottom as 1 (highest in the rack) to 5 (lowest in the rack).
- They should have been labeled as their physical order in the rack, 1/bottom to 5/top.
- As to why the LNNs don’t match the Device IDs: the Device IDs are incrementally updated when nodes are failed and added.
- I smartfailed one node once so that explains the ID=6.
- This is extremely annoying as the allocation of IPs is also affected.
- The last octets of the IP pools prod and node don’t match for the same LNN.
BIC-Isilon-Cluster >>> lnnset
LNN Device ID Cluster IP
----------------------------------------
1 1 10.0.3.1
2 2 10.0.3.2
3 4 10.0.3.4
4 3 10.0.3.3
5 6 10.0.3.5
BIC-Isilon-Cluster-2# isi_nodes %{lnn} %{devid} %{external} %{dynamic}
1 1 172.16.10.20 132.206.178.237,172.16.20.237
2 2 172.16.10.21 132.206.178.236,172.16.20.236
3 4 172.16.10.22 132.206.178.233,172.16.20.234
4 3 172.16.10.23 132.206.178.234,172.16.20.235
5 6 172.16.10.24 132.206.178.235,172.16.20.233
BIC-Isilon-Cluster-2# isi network interfaces ls
LNN Name Status Owners IP Addresses
--------------------------------------------------------------------
1 10gige-1 Up - -
1 10gige-2 Up - -
1 10gige-agg-1 Up groupnet0.prod.pool0 132.206.178.237
groupnet0.node.pool1 172.16.20.237
1 ext-1 Up groupnet0.mgmt.mgmt 172.16.10.20
1 ext-2 No Carrier - -
1 ext-agg Not Available - -
2 10gige-1 Up - -
2 10gige-2 Up - -
2 10gige-agg-1 Up groupnet0.prod.pool0 132.206.178.236
groupnet0.node.pool1 172.16.20.236
2 ext-1 Up groupnet0.mgmt.mgmt 172.16.10.21
2 ext-2 No Carrier - -
2 ext-agg Not Available - -
3 10gige-1 Up - -
3 10gige-2 Up - -
3 10gige-agg-1 Up groupnet0.prod.pool0 132.206.178.233
groupnet0.node.pool1 172.16.20.234
3 ext-1 Up groupnet0.mgmt.mgmt 172.16.10.22
3 ext-2 No Carrier - -
3 ext-agg Not Available - -
4 10gige-1 Up - -
4 10gige-2 Up - -
4 10gige-agg-1 Up groupnet0.prod.pool0 132.206.178.234
groupnet0.node.pool1 172.16.20.235
4 ext-1 Up groupnet0.mgmt.mgmt 172.16.10.23
4 ext-2 No Carrier - -
4 ext-agg Not Available - -
5 10gige-1 Up - -
5 10gige-2 Up - -
5 10gige-agg-1 Up groupnet0.prod.pool0 132.206.178.235
groupnet0.node.pool1 172.16.20.233
5 ext-1 Up groupnet0.mgmt.mgmt 172.16.10.24
5 ext-2 No Carrier - -
5 ext-agg Not Available - -
--------------------------------------------------------------------
Total: 30
- This lists the status of firmware devices across the cluster nodes:
BIC-Isilon-Cluster-1# isi upgrade cluster firmware devices
Device              Type     Firmware                Mismatch Lnns
---------------------------------------------------------------------
CFFPS1_Blastoff     CFFPS    03.03                   -        1-5
CFFPS2_Blastoff     CFFPS    03.03                   -        1-5
CMC_HFHB            CMC      01.02                   -        1-5
CMCSDR_Honeybadger  CMCSDR   00.0B                   -        1-5
Lsi                 DiskCtrl 17.00.01.00             -        1-5
LsiExp0             DiskExp  0910+0210               -        1-5
LsiExp1             DiskExp  0910+0210               -        1-5
IsilonFPV1          FrontPnl UI.01.36                -        1-2,4-5
Mellanox            Network  2.30.8020+ISL1090110018 -        1-5
LOx2-MLC-YD         Nvram    rp180c01+rp180c01       -        1-5
---------------------------------------------------------------------
Total: 10
Licenses
- Current license status (some activated, the rest inactive):
BIC-Isilon-Cluster-3# isi license licenses ls
Name                  Status    Expiration
----------------------------------------------------
SmartDedupe           Inactive  -
Swift                 Activated -
SmartQuotas           Activated -
InsightIQ             Activated -
SmartPools            Inactive  -
SmartLock             Inactive  -
Isilon for vCenter    Inactive  -
CloudPools            Inactive  -
Hardening             Inactive  -
SnapshotIQ            Activated -
HDFS                  Inactive  -
SyncIQ                Inactive  -
SmartConnect Advanced Activated -
----------------------------------------------------
Total: 13
Alerts and Events
- Modify event retention period from 90 days (default) to 360:
BIC-Isilon-Cluster-1# isi event settings view
Retention Days: 90
Storage Limit: 1
Maintenance Start: Never
Maintenance Duration: Never
Heartbeat Interval: daily
BIC-Isilon-Cluster-1# isi event settings modify --retention-days 360
BIC-Isilon-Cluster-1# isi event settings view
Retention Days: 360
Storage Limit: 1
Maintenance Start: Never
Maintenance Duration: Never
Heartbeat Interval: daily
- The syntax to modify event settings:
isi event settings modify [--retention-days <integer>] [--storage-limit <integer>] [--maintenance-start <timestamp>] [--clear-maintenance-start] [--maintenance-duration <duration>] [--heartbeat-interval
- Every event has two ID numbers that help to establish the context of the event.
- The event type ID identifies the type of event that has occurred.
- The event instance ID is a unique number that is specific to a particular occurrence of an event type.
- When an event is submitted to the kernel queue, an event instance ID is assigned.
- You can reference the instance ID to determine the exact time that an event occurred.
- You can view individual events. However, you manage events and alerts at the event group level.
BIC-Isilon-Cluster-3# isi event events list
ID Occurred Sev Lnn Eventgroup ID Message
--------------------------------------------------------------------------------------------------------
1.426 04/19 13:28 U 0 1 Resolved from PAPI
3.309 04/19 11:11 C 4 1 External network link ext-1 (igb0) down
2.530 04/20 00:00 I 2 131 Heartbeat Event
3.545 04/27 22:09 C 4 131101 Disk Repair Complete: Bay 2, Type HDD, LNUM 34. Replace the drive according to the instructions in the OneFS Help system.
1.664 05/05 12:05 U 0 131124 Resolved from PAPI
3.563 05/03 23:56 C 4 131124 One or more drives (bay(s) 2 / type(s) HDD) are ready to be replaced.
3.551 04/27 22:19 C 4 131124 One or more drives (bay(s) 2 / type(s) HDD) are ready to be replaced.
BIC-Isilon-Cluster-3# isi event events view 3.545
ID: 3.545
Eventgroup ID: 131101
Event Type: 100010010
Message: Disk Repair Complete: Bay 2, Type HDD, LNUM 34. Replace the drive according to the instructions in the OneFS Help system.
Devid: 3
Lnn: 4
Time: 2016-04-27T22:09:12
Severity: critical
Value: 0.0
BIC-Isilon-Cluster-3# isi event groups list
ID Started Ended Causes Short Events Severity
---------------------------------------------------------------------------------
3 04/19 11:10 04/19 13:28 external_network 2 critical
2 04/19 11:10 04/19 13:28 external_network 2 critical
4 04/19 11:10 04/19 11:16 NODE_STATUS_OFFLINE 2 critical
1 04/19 11:11 04/19 13:28 external_network 2 critical
24 04/19 11:16 04/19 11:16 NODE_STATUS_ONLINE 1 information
26 04/19 11:23 04/19 13:28 external_network 2 critical
27 04/19 11:31 04/19 13:28 WINNET_AUTH_NIS_SERVERS_UNREACH 6 critical
32 04/19 12:43 04/19 12:44 HW_IPMI_POWER_SUPPLY_STATUS_REG 4 critical
...
524525 05/30 02:18 05/30 02:18 SYS_DISK_REMOVED 1 critical
524538 05/30 02:19 -- SYS_DISK_UNHEALTHY 3 critical
...
BIC-Isilon-Cluster-3# isi event groups view 524525
ID: 524525
Started: 05/30 02:18
Causes Long: Disk Repair Complete: Bay 18, Type HDD, LNUM 27. Replace the drive according to the instructions in the OneFS Help system.
Last Event: 2016-05-30T02:18:16
Ignore: No
Ignore Time: Never
Resolved: Yes
Ended: 05/30 02:18
Events: 1
Severity: critical
BIC-Isilon-Cluster-3# isi event groups view 524538
ID: 524538
Started: 05/30 02:19
Causes Long: One or more drives (bay(s) 18 / type(s) HDD) are ready to be replaced.
Last Event: 2016-06-04T12:42:09
Ignore: No
Ignore Time: Never
Resolved: No
Ended: --
Events: 3
Severity: critical
Scheduling A Maintenance Window
- You can schedule a maintenance window by setting a maintenance start time and duration.
- During a scheduled maintenance window, the system continues to log events, but no alerts are generated.
- This keeps channels from being flooded by benign alerts associated with cluster maintenance procedures.
- Active event groups automatically resume generating alerts when the scheduled maintenance period ends.
- Schedule a maintenance window by running the isi event settings modify command.
- The following example schedules a maintenance window that begins on September 1, 2015 at 11:00 PM and lasts for two days:
isi event settings modify --maintenance-start 2015-09-01T23:00:00 --maintenance-duration 2D
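- To check whether a maintenance window is currently set, and to cancel one early, the same settings command can be used. A minimal sketch based on the flags shown in the syntax above (not something I have run on this cluster yet):
# Show the current maintenance start/duration (along with retention and heartbeat settings)
isi event settings view
# Cancel a scheduled or running maintenance window
isi event settings modify --clear-maintenance-start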
Hardware, Devices and Nodes
Storage Pool Protection Level
- The default and suggested protection level for a cluster smaller than 2 PB is +2d:1n.
- A +2d:1n protection level means the cluster can recover from two simultaneous drive failures or one node failure without sustaining any data loss.
- The parity overhead is 20% for a 5-node cluster with a +2d:1n protection level (roughly: with two drives per node in a stripe, a 5-node stripe holds 10 units, 2 of them FEC, hence 2/10 = 20%).
BIC-Isilon-Cluster-4# isi storagepool list
Name            Nodes Requested Protection HDD     Total     %     SSD Total %
--------------------------------------------------------------------------------------
x410_144tb_64gb 1-5   +2d:1n               1.1190T 641.6275T 0.17% 0   0     0.00%
--------------------------------------------------------------------------------------
Total: 1                                   1.1190T 641.6275T 0.17% 0   0     0.00%
Hardware status on a specific node:
BIC-Isilon-Cluster-4# isi_hw_status
SerNo: SX410-301608-0264
Config: 400-0049-03
FamCode: X
ChsCode: 4U
GenCode: 10
Product: X410-4U-Dual-64GB-2x1GE-2x10GE SFP+-144TB
HWGen: CTO (CTO Hardware)
Chassis: ISI36V3 (Isilon 36-Bay(V3) Chassis)
CPU: GenuineIntel (2.00GHz, stepping 0x000306e4)
PROC: Dual-proc, Octa-HT-core
RAM: 68602642432 Bytes
Mobo: IntelS2600CP (Intel S2600CP Motherboard)
NVRam: LX4381 (Isilon LOx NVRam Card) (2016MB card) (size 2113798144B)
FlshDrv: None (No physical dongle supported) ((null))
DskCtl: LSI2308SAS2 (LSI 2308 SAS Controller) (8 ports)
DskExp: LSISAS2X24_X2 (LSI SAS2x24 SAS Expander (Qty 2))
PwrSupl: PS1 (type=ACBEL POLYTECH , fw=03.03)
PwrSupl: PS2 (type=ACBEL POLYTECH , fw=03.03)
ChasCnt: 1 (Single-Chassis System)
NetIF: ib1,ib0,igb0,igb1,bxe0,bxe1
IBType: MT4099 QDR (Mellanox MT4099 IB QDR Card)
LCDver: IsiVFD1 (Isilon VFD V1)
IMB: Board Version 0xffffffff
Power Supplies OK
Power Supply 1 good
Power Supply 2 good
CPU Operation (raw 0x88390000) = Normal
CPU Speed Limit = 100.00%
FAN TAC SENSOR 1 = 8800.000
FAN TAC SENSOR 2 = 8800.000
FAN TAC SENSOR 3 = 8800.000
PS FAN SPEED 1 = 9600.000
PS FAN SPEED 2 = 9500.000
BB +12.0V = 11.935
BB +5.0V = 4.937
BB +3.3V = 3.268
BB +5.0V STBY = 4.894
BB +3.3V AUX = 3.268
BB +1.05V P1Vccp = 0.828
BB +1.05V P2Vccp = 0.822
BB +1.5 P1DDR AB = na
BB +1.5 P1DDR CD = na
BB +1.5 P2DDR AB = na
BB +1.5 P2DDR CD = na
BB +1.8V AUX = 1.769
BB +1.1V STBY = 1.081
BB VBAT = 3.120
BB +1.35 P1LV AB = 1.342
BB +1.35 P1LV CD = 1.348
BB +1.35 P2LV AB = 1.378
BB +1.35 P2LV CD = 1.348
VCC_12V0 = 12.100
VCC_5V0 = 5.000
VCC_3V3 = 3.300
VCC_1V8 = 1.800
VCC_5V0_SB = 4.900
VCC_1V0 = 0.990
VCC_5V0_CBL = 5.000
VCC_SW = 4.900
VBATT_1 = 4.000
VBATT_2 = 4.000
PS IN VOLT 1 = 241.000
PS IN VOLT 2 = 241.000
PS OUT VOLT 1 = 12.300
PS OUT VOLT 2 = 12.300
PS IN CURR 1 = 1.200
PS IN CURR 2 = 1.200
PS OUT CURR 1 = 19.000
PS OUT CURR 2 = 19.500
Front Panel Temp = 20.6
BB EDGE Temp = 25.000
BB BMC Temp = 34.000
BB P2 VR Temp = 30.000
BB MEM VR Temp = 28.000
LAN NIC Temp = 42.000
P1 Therm Margin = -56.000
P2 Therm Margin = -58.000
P1 DTS Therm Mgn = -56.000
P2 DTS Therm Mgn = -58.000
DIMM Thrm Mrgn 1 = -66.000
DIMM Thrm Mrgn 2 = -68.000
DIMM Thrm Mrgn 3 = -67.000
DIMM Thrm Mrgn 4 = -66.000
TEMP SENSOR 1 = 23.000
PS TEMP 1 = 28.000
PS TEMP 2 = 28.000
List the drives on node 5 (by logical node number):
BIC-Isilon-Cluster-1# isi devices list --node-lnn 5 Lnn Location Device Lnum State Serial ----------------------------------------------- 5 Bay 1 /dev/da1 35 HEALTHY S1Z1S6BY 5 Bay 2 /dev/da2 34 HEALTHY Z1ZAECJM 5 Bay 3 /dev/da19 17 HEALTHY S1Z1SB0L 5 Bay 4 /dev/da20 16 HEALTHY S1Z1SAYP 5 Bay 5 /dev/da3 33 HEALTHY Z1ZA74A4 5 Bay 6 /dev/da21 15 HEALTHY Z1ZAEQ13 5 Bay 7 /dev/da22 14 HEALTHY S1Z1SAF5 5 Bay 8 /dev/da23 13 HEALTHY S1Z1SB0C 5 Bay 9 /dev/da4 32 HEALTHY Z1ZAEPR8 5 Bay 10 /dev/da24 12 HEALTHY Z1ZAB3ZD 5 Bay 11 /dev/da25 11 HEALTHY S1Z1RYGS 5 Bay 12 /dev/da26 10 HEALTHY S1Z1SB0A 5 Bay 13 /dev/da5 31 HEALTHY Z1ZAEPS5 5 Bay 14 /dev/da6 30 HEALTHY Z1ZAF5GQ 5 Bay 15 /dev/da7 29 HEALTHY Z1ZAB40S 5 Bay 16 /dev/da27 9 HEALTHY Z1ZAF625 5 Bay 17 /dev/da8 28 HEALTHY Z1ZAEPJY 5 Bay 18 /dev/da9 27 HEALTHY Z1ZAF1LG 5 Bay 19 /dev/da10 26 HEALTHY Z1ZAF724 5 Bay 20 /dev/da28 8 HEALTHY Z1ZAF5W8 5 Bay 21 /dev/da11 25 HEALTHY Z1ZAEW1W 5 Bay 22 /dev/da12 24 HEALTHY Z1ZAF0CW 5 Bay 23 /dev/da29 7 HEALTHY Z1ZAF5VM 5 Bay 24 /dev/da30 6 HEALTHY Z1ZAF59X 5 Bay 25 /dev/da31 5 HEALTHY Z1ZAF21G 5 Bay 26 /dev/da32 4 HEALTHY Z1ZAF5QJ 5 Bay 27 /dev/da33 3 HEALTHY Z1ZAF58Y 5 Bay 28 /dev/da13 23 HEALTHY Z1ZAF6CG 5 Bay 29 /dev/da34 2 HEALTHY Z1ZAB3XJ 5 Bay 30 /dev/da14 22 HEALTHY S1Z1RYHB 5 Bay 31 /dev/da35 1 HEALTHY Z1ZAB3TQ 5 Bay 32 /dev/da15 21 HEALTHY Z1ZAEPYX 5 Bay 33 /dev/da36 0 HEALTHY Z1ZAF4Z0 5 Bay 34 /dev/da16 20 HEALTHY Z1ZAEPMC 5 Bay 35 /dev/da17 19 HEALTHY Z1ZAF4H4 5 Bay 36 /dev/da18 18 HEALTHY Z1ZAF6JA ----------------------------------------------- Total: 36
Use isi_for_array to select node 4 and list its drives:
BIC-Isilon-Cluster-1# isi_for_array -n4 isi devices drive list BIC-Isilon-Cluster-4: Lnn Location Device Lnum State Serial BIC-Isilon-Cluster-4: ----------------------------------------------- BIC-Isilon-Cluster-4: 4 Bay 1 /dev/da1 35 HEALTHY S1Z1STTN BIC-Isilon-Cluster-4: 4 Bay 2 /dev/da2 36 HEALTHY Z1Z9XE67 BIC-Isilon-Cluster-4: 4 Bay 3 /dev/da19 17 HEALTHY S1Z1NE5B BIC-Isilon-Cluster-4: 4 Bay 4 /dev/da20 16 HEALTHY S1Z1QQBN BIC-Isilon-Cluster-4: 4 Bay 5 /dev/da3 33 HEALTHY S1Z1RYJ0 BIC-Isilon-Cluster-4: 4 Bay 6 /dev/da21 15 HEALTHY S1Z1SL53 BIC-Isilon-Cluster-4: 4 Bay 7 /dev/da22 14 HEALTHY S1Z1QNVG BIC-Isilon-Cluster-4: 4 Bay 8 /dev/da23 13 HEALTHY S1Z1R8TT BIC-Isilon-Cluster-4: 4 Bay 9 /dev/da4 32 HEALTHY S1Z1SLDG BIC-Isilon-Cluster-4: 4 Bay 10 /dev/da24 12 HEALTHY S1Z1RVGX BIC-Isilon-Cluster-4: 4 Bay 11 /dev/da25 11 HEALTHY S1Z1QNSG BIC-Isilon-Cluster-4: 4 Bay 12 /dev/da26 10 HEALTHY S1Z1NEGJ BIC-Isilon-Cluster-4: 4 Bay 13 /dev/da5 31 HEALTHY S1Z1QR9E BIC-Isilon-Cluster-4: 4 Bay 14 /dev/da6 30 HEALTHY S1Z1SL23 BIC-Isilon-Cluster-4: 4 Bay 15 /dev/da7 29 HEALTHY S1Z1NEPA BIC-Isilon-Cluster-4: 4 Bay 16 /dev/da27 9 HEALTHY S1Z1SLAZ BIC-Isilon-Cluster-4: 4 Bay 17 /dev/da8 28 HEALTHY S1Z1STT6 BIC-Isilon-Cluster-4: 4 Bay 18 /dev/da9 27 HEALTHY S1Z1SL2W BIC-Isilon-Cluster-4: 4 Bay 19 /dev/da10 26 HEALTHY S1Z1SL4P BIC-Isilon-Cluster-4: 4 Bay 20 /dev/da28 8 HEALTHY S1Z1QS4J BIC-Isilon-Cluster-4: 4 Bay 21 /dev/da11 25 HEALTHY S1Z1SAXY BIC-Isilon-Cluster-4: 4 Bay 22 /dev/da12 24 HEALTHY S1Z1SL9J BIC-Isilon-Cluster-4: 4 Bay 23 /dev/da29 7 HEALTHY S1Z1NFS6 BIC-Isilon-Cluster-4: 4 Bay 24 /dev/da30 6 HEALTHY S1Z1NE26 BIC-Isilon-Cluster-4: 4 Bay 25 /dev/da31 5 HEALTHY S1Z1RX6H BIC-Isilon-Cluster-4: 4 Bay 26 /dev/da32 4 HEALTHY S1Z1QRTK BIC-Isilon-Cluster-4: 4 Bay 27 /dev/da33 3 HEALTHY S1Z1SAWG BIC-Isilon-Cluster-4: 4 Bay 28 /dev/da13 23 HEALTHY S1Z1QR5B BIC-Isilon-Cluster-4: 4 Bay 29 /dev/da34 2 HEALTHY S1Z1RVEK BIC-Isilon-Cluster-4: 4 Bay 30 /dev/da14 22 HEALTHY S1Z1SLAN BIC-Isilon-Cluster-4: 4 Bay 31 /dev/da35 1 HEALTHY S1Z1QPES BIC-Isilon-Cluster-4: 4 Bay 32 /dev/da15 21 HEALTHY S1Z1SLBR BIC-Isilon-Cluster-4: 4 Bay 33 /dev/da36 0 HEALTHY S1Z1SAXM BIC-Isilon-Cluster-4: 4 Bay 34 /dev/da16 20 HEALTHY S1Z1RVJX BIC-Isilon-Cluster-4: 4 Bay 35 /dev/da17 19 HEALTHY S1Z1RV62 BIC-Isilon-Cluster-4: 4 Bay 36 /dev/da18 18 HEALTHY S1Z1RYH9 BIC-Isilon-Cluster-4: ----------------------------------------------- BIC-Isilon-Cluster-4: Total: 36
Loop through the cluster nodes and grep for non-healthy drives using isi_for_array:
BIC-Isilon-Cluster-4# isi_for_array "isi devices drive list | grep -iv healthy"
BIC-Isilon-Cluster-2: Lnn  Location  Device  Lnum  State    Serial
BIC-Isilon-Cluster-2: -----------------------------------------------
BIC-Isilon-Cluster-2: -----------------------------------------------
BIC-Isilon-Cluster-2: Total: 36
BIC-Isilon-Cluster-3: Lnn  Location  Device  Lnum  State    Serial
BIC-Isilon-Cluster-3: -----------------------------------------------
BIC-Isilon-Cluster-3: -----------------------------------------------
BIC-Isilon-Cluster-3: Total: 36
BIC-Isilon-Cluster-4: Lnn  Location  Device  Lnum  State    Serial
BIC-Isilon-Cluster-4: -----------------------------------------------
BIC-Isilon-Cluster-4: 4    Bay 2     -       N/A   REPLACE  -
BIC-Isilon-Cluster-4: -----------------------------------------------
BIC-Isilon-Cluster-4: Total: 36
BIC-Isilon-Cluster-1: Lnn  Location  Device  Lnum  State    Serial
BIC-Isilon-Cluster-1: -----------------------------------------------
BIC-Isilon-Cluster-1: -----------------------------------------------
BIC-Isilon-Cluster-1: Total: 36
BIC-Isilon-Cluster-5: Lnn  Location  Device  Lnum  State    Serial
BIC-Isilon-Cluster-5: -----------------------------------------------
BIC-Isilon-Cluster-5: -----------------------------------------------
BIC-Isilon-Cluster-5: Total: 36
View Drive Firmware Status
BIC-Isilon-Cluster-5# isi devices drive firmware list --node-lnn all Lnn Location Firmware Desired Model ----------------------------------------------------- 1 Bay 1 SNG4 - ST4000NM0033-9ZM170 1 Bay 2 SNG4 - ST4000NM0033-9ZM170 1 Bay 3 SNG4 - ST4000NM0033-9ZM170 1 Bay 4 SNG4 - ST4000NM0033-9ZM170 1 Bay 5 SNG4 - ST4000NM0033-9ZM170 1 Bay 6 SNG4 - ST4000NM0033-9ZM170 1 Bay 7 SNG4 - ST4000NM0033-9ZM170 1 Bay 8 SNG4 - ST4000NM0033-9ZM170 1 Bay 9 SNG4 - ST4000NM0033-9ZM170 1 Bay 10 SNG4 - ST4000NM0033-9ZM170 1 Bay 11 SNG4 - ST4000NM0033-9ZM170 1 Bay 12 SNG4 - ST4000NM0033-9ZM170 1 Bay 13 SNG4 - ST4000NM0033-9ZM170 1 Bay 14 SNG4 - ST4000NM0033-9ZM170 1 Bay 15 SNG4 - ST4000NM0033-9ZM170 1 Bay 16 SNG4 - ST4000NM0033-9ZM170 1 Bay 17 SNG4 - ST4000NM0033-9ZM170 1 Bay 18 SNG4 - ST4000NM0033-9ZM170 1 Bay 19 SNG4 - ST4000NM0033-9ZM170 1 Bay 20 SNG4 - ST4000NM0033-9ZM170 1 Bay 21 SNG4 - ST4000NM0033-9ZM170 1 Bay 22 SNG4 - ST4000NM0033-9ZM170 1 Bay 23 SNG4 - ST4000NM0033-9ZM170 1 Bay 24 SNG4 - ST4000NM0033-9ZM170 1 Bay 25 SNG4 - ST4000NM0033-9ZM170 1 Bay 26 SNG4 - ST4000NM0033-9ZM170 1 Bay 27 SNG4 - ST4000NM0033-9ZM170 1 Bay 28 SNG4 - ST4000NM0033-9ZM170 1 Bay 29 SNG4 - ST4000NM0033-9ZM170 1 Bay 30 SNG4 - ST4000NM0033-9ZM170 1 Bay 31 SNG4 - ST4000NM0033-9ZM170 1 Bay 32 SNG4 - ST4000NM0033-9ZM170 1 Bay 33 SNG4 - ST4000NM0033-9ZM170 1 Bay 34 SNG4 - ST4000NM0033-9ZM170 1 Bay 35 SNG4 - ST4000NM0033-9ZM170 1 Bay 36 SNG4 - ST4000NM0033-9ZM170 2 Bay 1 SNG4 - ST4000NM0033-9ZM170 2 Bay 2 SNG4 - ST4000NM0033-9ZM170 2 Bay 3 SNG4 - ST4000NM0033-9ZM170 2 Bay 4 SNG4 - ST4000NM0033-9ZM170 2 Bay 5 SNG4 - ST4000NM0033-9ZM170 2 Bay 6 SNG4 - ST4000NM0033-9ZM170 2 Bay 7 SNG4 - ST4000NM0033-9ZM170 2 Bay 8 SNG4 - ST4000NM0033-9ZM170 2 Bay 9 SNG4 - ST4000NM0033-9ZM170 2 Bay 10 SNG4 - ST4000NM0033-9ZM170 2 Bay 11 SNG4 - ST4000NM0033-9ZM170 2 Bay 12 SNG4 - ST4000NM0033-9ZM170 2 Bay 13 SNG4 - ST4000NM0033-9ZM170 2 Bay 14 SNG4 - ST4000NM0033-9ZM170 2 Bay 15 SNG4 - ST4000NM0033-9ZM170 2 Bay 16 SNG4 - ST4000NM0033-9ZM170 2 Bay 17 SNG4 - ST4000NM0033-9ZM170 2 Bay 18 SNG4 - ST4000NM0033-9ZM170 2 Bay 19 SNG4 - ST4000NM0033-9ZM170 2 Bay 20 SNG4 - ST4000NM0033-9ZM170 2 Bay 21 SNG4 - ST4000NM0033-9ZM170 2 Bay 22 SNG4 - ST4000NM0033-9ZM170 2 Bay 23 SNG4 - ST4000NM0033-9ZM170 2 Bay 24 SNG4 - ST4000NM0033-9ZM170 2 Bay 25 SNG4 - ST4000NM0033-9ZM170 2 Bay 26 SNG4 - ST4000NM0033-9ZM170 2 Bay 27 SNG4 - ST4000NM0033-9ZM170 2 Bay 28 SNG4 - ST4000NM0033-9ZM170 2 Bay 29 SNG4 - ST4000NM0033-9ZM170 2 Bay 30 SNG4 - ST4000NM0033-9ZM170 2 Bay 31 SNG4 - ST4000NM0033-9ZM170 2 Bay 32 SNG4 - ST4000NM0033-9ZM170 2 Bay 33 SNG4 - ST4000NM0033-9ZM170 2 Bay 34 SNG4 - ST4000NM0033-9ZM170 2 Bay 35 SNG4 - ST4000NM0033-9ZM170 2 Bay 36 SNG4 - ST4000NM0033-9ZM170 4 Bay 1 SNG4 - ST4000NM0033-9ZM170 4 Bay 2 SNG4 - ST4000NM0033-9ZM170 4 Bay 3 SNG4 - ST4000NM0033-9ZM170 4 Bay 4 SNG4 - ST4000NM0033-9ZM170 4 Bay 5 SNG4 - ST4000NM0033-9ZM170 4 Bay 6 SNG4 - ST4000NM0033-9ZM170 4 Bay 7 SNG4 - ST4000NM0033-9ZM170 4 Bay 8 SNG4 - ST4000NM0033-9ZM170 4 Bay 9 SNG4 - ST4000NM0033-9ZM170 4 Bay 10 SNG4 - ST4000NM0033-9ZM170 4 Bay 11 SNG4 - ST4000NM0033-9ZM170 4 Bay 12 SNG4 - ST4000NM0033-9ZM170 4 Bay 13 SNG4 - ST4000NM0033-9ZM170 4 Bay 14 SNG4 - ST4000NM0033-9ZM170 4 Bay 15 SNG4 - ST4000NM0033-9ZM170 4 Bay 16 SNG4 - ST4000NM0033-9ZM170 4 Bay 17 SNG4 - ST4000NM0033-9ZM170 4 Bay 18 SNG4 - ST4000NM0033-9ZM170 4 Bay 19 SNG4 - ST4000NM0033-9ZM170 4 Bay 20 SNG4 - ST4000NM0033-9ZM170 4 Bay 21 SNG4 - ST4000NM0033-9ZM170 4 Bay 22 SNG4 - ST4000NM0033-9ZM170 4 Bay 23 SNG4 - ST4000NM0033-9ZM170 4 
Bay 24 SNG4 - ST4000NM0033-9ZM170 4 Bay 25 SNG4 - ST4000NM0033-9ZM170 4 Bay 26 SNG4 - ST4000NM0033-9ZM170 4 Bay 27 SNG4 - ST4000NM0033-9ZM170 4 Bay 28 SNG4 - ST4000NM0033-9ZM170 4 Bay 29 SNG4 - ST4000NM0033-9ZM170 4 Bay 30 SNG4 - ST4000NM0033-9ZM170 4 Bay 31 SNG4 - ST4000NM0033-9ZM170 4 Bay 32 SNG4 - ST4000NM0033-9ZM170 4 Bay 33 SNG4 - ST4000NM0033-9ZM170 4 Bay 34 SNG4 - ST4000NM0033-9ZM170 4 Bay 35 SNG4 - ST4000NM0033-9ZM170 4 Bay 36 SNG4 - ST4000NM0033-9ZM170 3 Bay 1 SNG4 - ST4000NM0033-9ZM170 3 Bay 2 SNG4 - ST4000NM0033-9ZM170 3 Bay 3 SNG4 - ST4000NM0033-9ZM170 3 Bay 4 SNG4 - ST4000NM0033-9ZM170 3 Bay 5 SNG4 - ST4000NM0033-9ZM170 3 Bay 6 SNG4 - ST4000NM0033-9ZM170 3 Bay 7 SNG4 - ST4000NM0033-9ZM170 3 Bay 8 SNG4 - ST4000NM0033-9ZM170 3 Bay 9 SNG4 - ST4000NM0033-9ZM170 3 Bay 10 SNG4 - ST4000NM0033-9ZM170 3 Bay 11 SNG4 - ST4000NM0033-9ZM170 3 Bay 12 SNG4 - ST4000NM0033-9ZM170 3 Bay 13 SNG4 - ST4000NM0033-9ZM170 3 Bay 14 SNG4 - ST4000NM0033-9ZM170 3 Bay 15 SNG4 - ST4000NM0033-9ZM170 3 Bay 16 SNG4 - ST4000NM0033-9ZM170 3 Bay 17 SNG4 - ST4000NM0033-9ZM170 3 Bay 18 SNG4 - ST4000NM0033-9ZM170 3 Bay 19 SNG4 - ST4000NM0033-9ZM170 3 Bay 20 SNG4 - ST4000NM0033-9ZM170 3 Bay 21 SNG4 - ST4000NM0033-9ZM170 3 Bay 22 SNG4 - ST4000NM0033-9ZM170 3 Bay 23 SNG4 - ST4000NM0033-9ZM170 3 Bay 24 SNG4 - ST4000NM0033-9ZM170 3 Bay 25 SNG4 - ST4000NM0033-9ZM170 3 Bay 26 SNG4 - ST4000NM0033-9ZM170 3 Bay 27 SNG4 - ST4000NM0033-9ZM170 3 Bay 28 SNG4 - ST4000NM0033-9ZM170 3 Bay 29 SNG4 - ST4000NM0033-9ZM170 3 Bay 30 SNG4 - ST4000NM0033-9ZM170 3 Bay 31 SNG4 - ST4000NM0033-9ZM170 3 Bay 32 SNG4 - ST4000NM0033-9ZM170 3 Bay 33 SNG4 - ST4000NM0033-9ZM170 3 Bay 34 SNG4 - ST4000NM0033-9ZM170 3 Bay 35 SNG4 - ST4000NM0033-9ZM170 3 Bay 36 SNG4 - ST4000NM0033-9ZM170 5 Bay 1 SNG4 - ST4000NM0033-9ZM170 5 Bay 2 SNG4 - ST4000NM0033-9ZM170 5 Bay 3 SNG4 - ST4000NM0033-9ZM170 5 Bay 4 SNG4 - ST4000NM0033-9ZM170 5 Bay 5 SNG4 - ST4000NM0033-9ZM170 5 Bay 6 SNG4 - ST4000NM0033-9ZM170 5 Bay 7 SNG4 - ST4000NM0033-9ZM170 5 Bay 8 SNG4 - ST4000NM0033-9ZM170 5 Bay 9 SNG4 - ST4000NM0033-9ZM170 5 Bay 10 SNG4 - ST4000NM0033-9ZM170 5 Bay 11 SNG4 - ST4000NM0033-9ZM170 5 Bay 12 SNG4 - ST4000NM0033-9ZM170 5 Bay 13 SNG4 - ST4000NM0033-9ZM170 5 Bay 14 SNG4 - ST4000NM0033-9ZM170 5 Bay 15 SNG4 - ST4000NM0033-9ZM170 5 Bay 16 SNG4 - ST4000NM0033-9ZM170 5 Bay 17 SNG4 - ST4000NM0033-9ZM170 5 Bay 18 SNG4 - ST4000NM0033-9ZM170 5 Bay 19 SNG4 - ST4000NM0033-9ZM170 5 Bay 20 SNG4 - ST4000NM0033-9ZM170 5 Bay 21 SNG4 - ST4000NM0033-9ZM170 5 Bay 22 SNG4 - ST4000NM0033-9ZM170 5 Bay 23 SNG4 - ST4000NM0033-9ZM170 5 Bay 24 SNG4 - ST4000NM0033-9ZM170 5 Bay 25 SNG4 - ST4000NM0033-9ZM170 5 Bay 26 SNG4 - ST4000NM0033-9ZM170 5 Bay 27 SNG4 - ST4000NM0033-9ZM170 5 Bay 28 SNG4 - ST4000NM0033-9ZM170 5 Bay 29 SNG4 - ST4000NM0033-9ZM170 5 Bay 30 SNG4 - ST4000NM0033-9ZM170 5 Bay 31 SNG4 - ST4000NM0033-9ZM170 5 Bay 32 SNG4 - ST4000NM0033-9ZM170 5 Bay 33 SNG4 - ST4000NM0033-9ZM170 5 Bay 34 SNG4 - ST4000NM0033-9ZM170 5 Bay 35 SNG4 - ST4000NM0033-9ZM170 5 Bay 36 SNG4 - ST4000NM0033-9ZM170 ----------------------------------------------------- Total: 180
Add a drive to a node:
BIC-Isilon-Cluster-5# isi devices add <bay> --node-lnn=<node#>
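For example, to add the drive sitting in bay 4 of node 5 (the same invocation used in the replacement procedure below):
isi devices add 4 --node-lnn=5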
Disk Failure Replacement Procedure
- A disk in bay 4 of logical node number 5 (node 5) is bad.
### Remove bad disk, insert new one.
# List disk devices on node 5:
BIC-Isilon-Cluster-4# isi_for_array -n5 isi devices drive list
BIC-Isilon-Cluster-5: Lnn Location Device Lnum State Serial
BIC-Isilon-Cluster-5: -----------------------------------------------
BIC-Isilon-Cluster-5: 5 Bay 1 /dev/da1 35 HEALTHY S1Z1S6BY
BIC-Isilon-Cluster-5: 5 Bay 2 /dev/da2 34 HEALTHY Z1ZAECJM
BIC-Isilon-Cluster-5: 5 Bay 3 /dev/da19 17 HEALTHY S1Z1SB0L
BIC-Isilon-Cluster-5: 5 Bay 4 /dev/da20 N/A NEW K4K73KGB
BIC-Isilon-Cluster-5: 5 Bay 5 /dev/da3 33 HEALTHY Z1ZA74A4
BIC-Isilon-Cluster-5: 5 Bay 6 /dev/da21 15 HEALTHY Z1ZAEQ13
BIC-Isilon-Cluster-5: 5 Bay 7 /dev/da22 14 HEALTHY S1Z1SAF5
BIC-Isilon-Cluster-5: 5 Bay 8 /dev/da23 13 HEALTHY S1Z1SB0C
BIC-Isilon-Cluster-5: 5 Bay 9 /dev/da4 32 HEALTHY Z1ZAEPR8
BIC-Isilon-Cluster-5: 5 Bay 10 /dev/da24 36 HEALTHY S1Z26JWM
BIC-Isilon-Cluster-5: 5 Bay 11 /dev/da25 11 HEALTHY S1Z1RYGS
BIC-Isilon-Cluster-5: 5 Bay 12 /dev/da26 10 HEALTHY S1Z1SB0A
BIC-Isilon-Cluster-5: 5 Bay 13 /dev/da5 31 HEALTHY Z1ZAEPS5
BIC-Isilon-Cluster-5: 5 Bay 14 /dev/da6 30 HEALTHY Z1ZAF5GQ
BIC-Isilon-Cluster-5: 5 Bay 15 /dev/da7 29 HEALTHY Z1ZAB40S
BIC-Isilon-Cluster-5: 5 Bay 16 /dev/da27 9 HEALTHY Z1ZAF625
BIC-Isilon-Cluster-5: 5 Bay 17 /dev/da8 28 HEALTHY Z1ZAEPJY
BIC-Isilon-Cluster-5: 5 Bay 18 /dev/da9 27 HEALTHY Z1ZAF1LG
BIC-Isilon-Cluster-5: 5 Bay 19 /dev/da10 26 HEALTHY Z1ZAF724
BIC-Isilon-Cluster-5: 5 Bay 20 /dev/da28 8 HEALTHY Z1ZAF5W8
BIC-Isilon-Cluster-5: 5 Bay 21 /dev/da11 25 HEALTHY Z1ZAEW1W
BIC-Isilon-Cluster-5: 5 Bay 22 /dev/da12 24 HEALTHY Z1ZAF0CW
BIC-Isilon-Cluster-5: 5 Bay 23 /dev/da29 7 HEALTHY Z1ZAF5VM
BIC-Isilon-Cluster-5: 5 Bay 24 /dev/da30 6 HEALTHY Z1ZAF59X
BIC-Isilon-Cluster-5: 5 Bay 25 /dev/da31 5 HEALTHY Z1ZAF21G
BIC-Isilon-Cluster-5: 5 Bay 26 /dev/da32 4 HEALTHY Z1ZAF5QJ
BIC-Isilon-Cluster-5: 5 Bay 27 /dev/da33 3 HEALTHY Z1ZAF58Y
BIC-Isilon-Cluster-5: 5 Bay 28 /dev/da13 23 HEALTHY Z1ZAF6CG
BIC-Isilon-Cluster-5: 5 Bay 29 /dev/da34 2 HEALTHY Z1ZAB3XJ
BIC-Isilon-Cluster-5: 5 Bay 30 /dev/da14 22 HEALTHY S1Z1RYHB
BIC-Isilon-Cluster-5: 5 Bay 31 /dev/da35 1 HEALTHY Z1ZAB3TQ
BIC-Isilon-Cluster-5: 5 Bay 32 /dev/da15 21 HEALTHY Z1ZAEPYX
BIC-Isilon-Cluster-5: 5 Bay 33 /dev/da36 0 HEALTHY Z1ZAF4Z0
BIC-Isilon-Cluster-5: 5 Bay 34 /dev/da16 20 HEALTHY Z1ZAEPMC
BIC-Isilon-Cluster-5: 5 Bay 35 /dev/da17 19 HEALTHY Z1ZAF4H4
BIC-Isilon-Cluster-5: 5 Bay 36 /dev/da18 18 HEALTHY Z1ZAF6JA
BIC-Isilon-Cluster-5: -----------------------------------------------
BIC-Isilon-Cluster-5: Total: 36
# Check cluster status:
BIC-Isilon-Cluster-4# isi status
Cluster Name: BIC-Isilon-Cluster
Cluster Health: [ ATTN]
Cluster Storage: HDD SSD Storage
Size: 638.0T (645.7T Raw) 0 (0 Raw)
VHS Size: 7.7T
Used: 204.8T (32%) 0 (n/a)
Avail: 433.2T (68%) 0 (n/a)
Health Throughput (bps) HDD Storage SSD Storage
ID |IP Address |DASR | In Out Total| Used / Size |Used / Size
---+---------------+-----+-----+-----+-----+-----------------+-----------------
1|172.16.10.20 | OK |28.8M|83.3k|28.9M|41.0T/ 130T( 32%)|(No Storage SSDs)
2|172.16.10.21 | OK | 1.2M| 3.7M| 4.9M|40.9T/ 130T( 32%)|(No Storage SSDs)
3|172.16.10.22 | OK | 3.2k| 167k| 170k|41.0T/ 130T( 32%)|(No Storage SSDs)
4|172.16.10.23 | OK | 861k|49.2M|50.0M|41.0T/ 130T( 32%)|(No Storage SSDs)
5|172.16.10.24 |-A-- | 858k| 1.6M| 2.5M|41.0T/ 126T( 32%)|(No Storage SSDs)
---+---------------+-----+-----+-----+-----+-----------------+-----------------
Cluster Totals: |31.7M|54.7M|86.4M| 205T/ 638T( 32%)|(No Storage SSDs)
Health Fields: D = Down, A = Attention, S = Smartfailed, R = Read-Only
Critical Events:
10/14 23:24 5 One or more drives (bay(s) 4 / location / type(s) HDD) are...
Cluster Job Status:
No running jobs.
No paused or waiting jobs.
# Try to add the disk drive in bay 4 of node 5:
BIC-Isilon-Cluster-4# isi devices add 4 --node-lnn=5
You are about to add drive bay4, on node lnn 5. Are you sure? (yes/[no]): yes
Initiating add on bay4
The drive in bay4 was not added to the file system because it is not formatted.
Format the drive to add it to the file system by running the following command,
where <bay> is the bay number of the drive: isi devices drive format <bay>
# Oops. It doesn't like it. The new disk is a Hitachi drive while
# the cluster is built with Seagate ES3 drives.
# Log in directly to node 5, as it's easier to work on the drive from there.
BIC-Isilon-Cluster-4# ssh BIC-Isilon-Cluster-5
Password:
Last login: Sun Oct 15 02:31:18 2017 from 172.16.10.160
Copyright (c) 2001-2016 EMC Corporation. All Rights Reserved.
Copyright (c) 1992-2016 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
Isilon OneFS v8.0.0.4
# Check the state of the drive in bay 4:
BIC-Isilon-Cluster-5# isi devices drive view 4
Lnn: 5
Location: Bay 4
Lnum: N/A
Device: /dev/da20
Baynum: 4
Handle: 333
Serial: K4K73KGB
Model: HUS726040ALA610
Tech: SATA
Media: HDD
Blocks: 7814037168
Logical Block Length: 512
Physical Block Length: 512
WWN: 0000000000000000
State: NEW
Purpose: UNKNOWN
Purpose Description: A drive whose purpose is unknown
Present: Yes
# Check the difference between it and the drive in bay 3 (healthy):
BIC-Isilon-Cluster-5# isi devices drive view 3
Lnn: 5
Location: Bay 3
Lnum: 17
Device: /dev/da19
Baynum: 3
Handle: 353
Serial: S1Z1SB0L
Model: ST4000NM0033-9ZM170
Tech: SATA
Media: HDD
Blocks: 7814037168
Logical Block Length: 512
Physical Block Length: 512
WWN: 5000C5008CAD9092
State: HEALTHY
Purpose: STORAGE
Purpose Description: A drive used for normal data storage operation
Present: Yes
# Format the drive in bay 4:
BIC-Isilon-Cluster-5# isi devices drive format 4
You are about to format drive bay4, on node lnn 5. Are you sure? (yes/[no]): yes
BIC-Isilon-Cluster-5# isi devices drive view 4
Lnn: 5
Location: Bay 4
Lnum: 37
Device: /dev/da20
Baynum: 4
Handle: 332
Serial: K4K73KGB
Model: HUS726040ALA610
Tech: SATA
Media: HDD
Blocks: 7814037168
Logical Block Length: 512
Physical Block Length: 512
WWN: 0000000000000000
State: NONE
Purpose: NONE
Purpose Description: A drive that doesn't yet have a purpose
Present: Yes
# The drive shows up as 'PREPARING' now:
BIC-Isilon-Cluster-5# isi devices drive list
Lnn Location Device Lnum State Serial
-------------------------------------------------
5 Bay 1 /dev/da1 35 HEALTHY S1Z1S6BY
5 Bay 2 /dev/da2 34 HEALTHY Z1ZAECJM
5 Bay 3 /dev/da19 17 HEALTHY S1Z1SB0L
5 Bay 4 /dev/da20 37 PREPARING K4K73KGB
5 Bay 5 /dev/da3 33 HEALTHY Z1ZA74A4
5 Bay 6 /dev/da21 15 HEALTHY Z1ZAEQ13
5 Bay 7 /dev/da22 14 HEALTHY S1Z1SAF5
5 Bay 8 /dev/da23 13 HEALTHY S1Z1SB0C
5 Bay 9 /dev/da4 32 HEALTHY Z1ZAEPR8
5 Bay 10 /dev/da24 36 HEALTHY S1Z26JWM
5 Bay 11 /dev/da25 11 HEALTHY S1Z1RYGS
5 Bay 12 /dev/da26 10 HEALTHY S1Z1SB0A
5 Bay 13 /dev/da5 31 HEALTHY Z1ZAEPS5
5 Bay 14 /dev/da6 30 HEALTHY Z1ZAF5GQ
5 Bay 15 /dev/da7 29 HEALTHY Z1ZAB40S
5 Bay 16 /dev/da27 9 HEALTHY Z1ZAF625
5 Bay 17 /dev/da8 28 HEALTHY Z1ZAEPJY
5 Bay 18 /dev/da9 27 HEALTHY Z1ZAF1LG
5 Bay 19 /dev/da10 26 HEALTHY Z1ZAF724
5 Bay 20 /dev/da28 8 HEALTHY Z1ZAF5W8
5 Bay 21 /dev/da11 25 HEALTHY Z1ZAEW1W
5 Bay 22 /dev/da12 24 HEALTHY Z1ZAF0CW
5 Bay 23 /dev/da29 7 HEALTHY Z1ZAF5VM
5 Bay 24 /dev/da30 6 HEALTHY Z1ZAF59X
5 Bay 25 /dev/da31 5 HEALTHY Z1ZAF21G
5 Bay 26 /dev/da32 4 HEALTHY Z1ZAF5QJ
5 Bay 27 /dev/da33 3 HEALTHY Z1ZAF58Y
5 Bay 28 /dev/da13 23 HEALTHY Z1ZAF6CG
5 Bay 29 /dev/da34 2 HEALTHY Z1ZAB3XJ
5 Bay 30 /dev/da14 22 HEALTHY S1Z1RYHB
5 Bay 31 /dev/da35 1 HEALTHY Z1ZAB3TQ
5 Bay 32 /dev/da15 21 HEALTHY Z1ZAEPYX
5 Bay 33 /dev/da36 0 HEALTHY Z1ZAF4Z0
5 Bay 34 /dev/da16 20 HEALTHY Z1ZAEPMC
5 Bay 35 /dev/da17 19 HEALTHY Z1ZAF4H4
5 Bay 36 /dev/da18 18 HEALTHY Z1ZAF6JA
-------------------------------------------------
# Add the drive in bay 4:
BIC-Isilon-Cluster-5# isi devices drive add 4
You are about to add drive bay4, on node lnn 5. Are you sure? (yes/[no]): yes
Initiating add on bay4
The add operation is in-progress.
A OneFS-formatted drive was found in bay4 and is being added to the file system.
Wait a few minutes and then list all drives to verify that the add operation completed successfully.
# There was an event while this was going on.
# Not sure what it means, as bay 4 no longer holds a Seagate ES3.
BIC-Isilon-Cluster-5# isi event groups view 4882746
ID: 4882746
Started: 10/18 14:20
Causes Long: Drive in bay 4 location Bay 4 is unknown model ST4000NM0033-9ZM170
Lnn: 5
Devid: 6
Last Event: 2017-10-18T14:20:30
Ignore: No
Ignore Time: Never
Resolved: Yes
Resolve Time: 2017-10-18T14:18:15
Ended: 10/18 14:18
Events: 2
Severity: warning
# After a few minutes, the drive shows up as healthy:
BIC-Isilon-Cluster-5# isi devices drive list
Lnn Location Device Lnum State Serial
-----------------------------------------------
5 Bay 1 /dev/da1 35 HEALTHY S1Z1S6BY
5 Bay 2 /dev/da2 34 HEALTHY Z1ZAECJM
5 Bay 3 /dev/da19 17 HEALTHY S1Z1SB0L
5 Bay 4 /dev/da20 37 HEALTHY K4K73KGB
5 Bay 5 /dev/da3 33 HEALTHY Z1ZA74A4
5 Bay 6 /dev/da21 15 HEALTHY Z1ZAEQ13
5 Bay 7 /dev/da22 14 HEALTHY S1Z1SAF5
5 Bay 8 /dev/da23 13 HEALTHY S1Z1SB0C
5 Bay 9 /dev/da4 32 HEALTHY Z1ZAEPR8
5 Bay 10 /dev/da24 36 HEALTHY S1Z26JWM
5 Bay 11 /dev/da25 11 HEALTHY S1Z1RYGS
5 Bay 12 /dev/da26 10 HEALTHY S1Z1SB0A
5 Bay 13 /dev/da5 31 HEALTHY Z1ZAEPS5
5 Bay 14 /dev/da6 30 HEALTHY Z1ZAF5GQ
5 Bay 15 /dev/da7 29 HEALTHY Z1ZAB40S
5 Bay 16 /dev/da27 9 HEALTHY Z1ZAF625
5 Bay 17 /dev/da8 28 HEALTHY Z1ZAEPJY
5 Bay 18 /dev/da9 27 HEALTHY Z1ZAF1LG
5 Bay 19 /dev/da10 26 HEALTHY Z1ZAF724
5 Bay 20 /dev/da28 8 HEALTHY Z1ZAF5W8
5 Bay 21 /dev/da11 25 HEALTHY Z1ZAEW1W
5 Bay 22 /dev/da12 24 HEALTHY Z1ZAF0CW
5 Bay 23 /dev/da29 7 HEALTHY Z1ZAF5VM
5 Bay 24 /dev/da30 6 HEALTHY Z1ZAF59X
5 Bay 25 /dev/da31 5 HEALTHY Z1ZAF21G
5 Bay 26 /dev/da32 4 HEALTHY Z1ZAF5QJ
5 Bay 27 /dev/da33 3 HEALTHY Z1ZAF58Y
5 Bay 28 /dev/da13 23 HEALTHY Z1ZAF6CG
5 Bay 29 /dev/da34 2 HEALTHY Z1ZAB3XJ
5 Bay 30 /dev/da14 22 HEALTHY S1Z1RYHB
5 Bay 31 /dev/da35 1 HEALTHY Z1ZAB3TQ
5 Bay 32 /dev/da15 21 HEALTHY Z1ZAEPYX
5 Bay 33 /dev/da36 0 HEALTHY Z1ZAF4Z0
5 Bay 34 /dev/da16 20 HEALTHY Z1ZAEPMC
5 Bay 35 /dev/da17 19 HEALTHY Z1ZAF4H4
5 Bay 36 /dev/da18 18 HEALTHY Z1ZAF6JA
-----------------------------------------------
Total: 36
BIC-Isilon-Cluster-5# isi devices drive view 4
Lnn: 5
Location: Bay 4
Lnum: 37
Device: /dev/da20
Baynum: 4
Handle: 332
Serial: K4K73KGB
Model: HUS726040ALA610
Tech: SATA
Media: HDD
Blocks: 7814037168
Logical Block Length: 512
Physical Block Length: 512
WWN: 0000000000000000
State: HEALTHY
Purpose: STORAGE
Purpose Description: A drive used for normal data storage operation
Present: Yes
# Check cluster state after resolving the event group.
# I resolved it through the web UI, as it is easier.
BIC-Isilon-Cluster-5# isi status
Cluster Name: BIC-Isilon-Cluster
Cluster Health: [ OK ]
Cluster Storage: HDD SSD Storage
Size: 641.6T (649.3T Raw) 0 (0 Raw)
VHS Size: 7.7T
Used: 204.8T (32%) 0 (n/a)
Avail: 436.8T (68%) 0 (n/a)
Health Throughput (bps) HDD Storage SSD Storage
ID |IP Address |DASR | In Out Total| Used / Size |Used / Size
---+---------------+-----+-----+-----+-----+-----------------+-----------------
1|172.16.10.20 | OK | 500k| 0| 500k|41.0T/ 130T( 32%)|(No Storage SSDs)
2|172.16.10.21 | OK | 4.6k| 320k| 325k|40.9T/ 130T( 32%)|(No Storage SSDs)
3|172.16.10.22 | OK | 0| 0| 0|41.0T/ 130T( 32%)|(No Storage SSDs)
4|172.16.10.23 | OK | 321k|83.4k| 404k|41.0T/ 130T( 32%)|(No Storage SSDs)
5|172.16.10.24 | OK | 1.1M| 0| 1.1M|41.0T/ 130T( 32%)|(No Storage SSDs)
---+---------------+-----+-----+-----+-----+-----------------+-----------------
Cluster Totals: | 1.9M| 403k| 2.3M| 205T/ 642T( 32%)|(No Storage SSDs)
Health Fields: D = Down, A = Attention, S = Smartfailed, R = Read-Only
Critical Events:
Cluster Job Status:
Running jobs:
Job Impact Pri Policy Phase Run Time
-------------------------- ------ --- ---------- ----- ----------
MultiScan[4097] Low 4 LOW 1/4 0:02:18
No paused or waiting jobs.
No failed jobs.
Recent job results:
Time Job Event
--------------- -------------------------- ------------------------------
10/18 04:00:22 ShadowStoreProtect[4096] Succeeded (LOW)
10/18 03:02:15 SnapshotDelete[4095] Succeeded (MEDIUM)
10/18 02:00:04 WormQueue[4094] Succeeded (LOW)
10/18 01:01:37 SnapshotDelete[4093] Succeeded (MEDIUM)
10/18 00:31:08 SnapshotDelete[4092] Succeeded (MEDIUM)
10/18 00:01:39 SnapshotDelete[4091] Succeeded (MEDIUM)
10/17 23:04:54 FSAnalyze[4089] Succeeded (LOW)
10/17 22:33:17 SnapshotDelete[4090] Succeeded (MEDIUM)
11/15 14:53:34 MultiScan[1254] MultiScan[1254] Failed
10/06 14:45:55 ChangelistCreate[975] ChangelistCreate[975] Failed
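A condensed recap of the drive replacement above, in case I have to do it again. This is just the same commands strung together (bay 4 on LNN 5 in this incident; adjust bay and LNN as needed):
# Confirm the new drive is seen (state NEW) on the affected node
isi_for_array -n5 isi devices drive list
# On the affected node: inspect, format, then add the replacement drive
isi devices drive view 4
isi devices drive format 4
isi devices drive add 4
# Wait a few minutes, then verify the drive is HEALTHY and the cluster is back to [ OK ]
isi devices drive list
isi status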
Network
BIC-Isilon-Cluster-1# isi config Welcome to the Isilon IQ configuration console. Copyright (c) 2001-2016 EMC Corporation. All Rights Reserved. Enter 'help' to see list of available commands. Enter 'help <command>' to see help for a specific command. Enter 'quit' at any prompt to discard changes and exit. Node build: Isilon OneFS v8.0.0.0 B_8_0_0_037(RELEASE) Node serial number: SX410-301608-0260 BIC-Isilon-Cluster >>> status Configuration for 'BIC-Isilon-Cluster' Local machine: ----------------------------------+----------------------------------------- Node LNN : 1 | Date : 2016/06/09 15:53:01 EDT ----------------------------------+----------------------------------------- Interface : ib1 | MAC : 00:00:00:49:fe:80:00:00:00:00:00:00:7c:fe:90:03:00:9e:e9:a2 IP Address : 10.0.1.1 | MAC Options : none ----------------------------------+----------------------------------------- Interface : ib0 | MAC : 00:00:00:48:fe:80:00:00:00:00:00:00:7c:fe:90:03:00:9e:e9:a1 IP Address : 10.0.2.1 | MAC Options : none ----------------------------------+----------------------------------------- Interface : lo0 | MAC : 00:00:00:00:00:00 IP Address : 10.0.3.1 | MAC Options : none ----------------------------------+----------------------------------------- Network: ----------------------------------+----------------------------------------- JoinMode : Manual Interfaces: ----------------------------------+----------------------------------------- Interface : int-a | Flags : enabled_ok Netmask : 255.255.255.0 | MTU : N/A ----------------+-----------------+------------------+---------------------- Low IP | High IP | Allocated | Free ----------------+-----------------+------------------+---------------------- 10.0.1.1 | 10.0.1.254 | 5 | 249 ----------------+-----------------+------------------+---------------------- Interface : int-b | Flags : enabled_ok Netmask : 255.255.255.0 | MTU : N/A ----------------+-----------------+------------------+---------------------- Low IP | High IP | Allocated | Free ----------------+-----------------+------------------+---------------------- 10.0.2.1 | 10.0.2.254 | 5 | 249 ----------------+-----------------+------------------+---------------------- Interface : lpbk | Flags : enabled_ok cluster_traffic failover Netmask : 255.255.255.0 | MTU : 1500 ----------------+-----------------+------------------+---------------------- Low IP | High IP | Allocated | Free ----------------+-----------------+------------------+---------------------- 10.0.3.1 | 10.0.3.254 | 5 | 249 ----------------+-----------------+------------------+----------------------
- Initial groupnet and DNS client settings:
BIC-Isilon-Cluster-4# isi network groupnets list
ID DNS Cache Enabled DNS Search DNS Servers Subnets
-----------------------------------------------------------------
groupnet0 True - 132.206.178.7 mgmt
132.206.178.186 prod
node
-----------------------------------------------------------------
Total: 1
BIC-Isilon-Cluster-4# isi network groupnets view groupnet0
ID: groupnet0
Name: groupnet0
Description: Initial groupnet
DNS Cache Enabled: True
DNS Options: -
DNS Search: -
DNS Servers: 132.206.178.7, 132.206.178.186
Server Side DNS Search: True
Subnets: mgmt, prod, node
- List and view the network subnets defined in the cluster:
BIC-Isilon-Cluster-4# isi network subnets list
ID             Subnet           Gateway|Priority Pools SC Service
------------------------------------------------------------------------
groupnet0.mgmt 172.16.10.0/24   172.16.10.1|2    mgmt  0.0.0.0
groupnet0.node 172.16.20.0/23   172.16.20.1|3    pool1 172.16.20.232
groupnet0.prod 132.206.178.0/24 132.206.178.1|1  pool0 132.206.178.232
------------------------------------------------------------------------
Total: 3
- List and view the network pools defined in the cluster.
- Note that the IP allocation for the pool groupnet0.prod.pool0 is set to dynamic. This requires a SmartConnect Advanced license.
BIC-Isilon-Cluster-4# isi network pools list
ID SC Zone Allocation Method
groupnet0.mgmt.mgmt mgmt.isi.bic.mni.mcgill.ca static
groupnet0.node.pool1 nfs.isi-node.bic.mni.mcgill.ca dynamic
groupnet0.prod.pool0 nfs.isi.bic.mni.mcgill.ca dynamic
----------------------------------------------------------------------
Total: 3
BIC-Isilon-Cluster-4# isi network pools view groupnet0.mgmt.mgmt
ID: groupnet0.mgmt.mgmt
Groupnet: groupnet0
Subnet: mgmt
Name: mgmt
Rules: -
Access Zone: System
Allocation Method: static
Aggregation Mode: lacp
SC Suspended Nodes: -
Description: -
Ifaces: 1:ext-1, 2:ext-1, 4:ext-1, 3:ext-1, 5:ext-1
IP Ranges: 172.16.10.20-172.16.10.24
Rebalance Policy: auto
SC Auto Unsuspend Delay: 0
SC Connect Policy: round_robin
SC Zone: mgmt.isi.bic.mni.mcgill.ca
SC DNS Zone Aliases: -
SC Failover Policy: round_robin
SC Subnet: prod
SC Ttl: 0
Static Routes: -
BIC-Isilon-Cluster-4# isi network pools view groupnet0.prod.pool0
ID: groupnet0.prod.pool0
Groupnet: groupnet0
Subnet: prod
Name: pool0
Rules: -
Access Zone: prod
Allocation Method: dynamic
Aggregation Mode: lacp
SC Suspended Nodes: -
Description: -
Ifaces: 1:10gige-agg-1, 2:10gige-agg-1, 4:10gige-agg-1, 3:10gige-agg-1, 5:10gige-agg-1
IP Ranges: 132.206.178.233-132.206.178.237
Rebalance Policy: auto
SC Auto Unsuspend Delay: 0
SC Connect Policy: round_robin
SC Zone: nfs.isi.bic.mni.mcgill.ca
SC DNS Zone Aliases: -
SC Failover Policy: round_robin
SC Subnet: prod
SC Ttl: 0
Static Routes: -
BIC-Isilon-Cluster-2# isi network pools view groupnet0.node.pool1
ID: groupnet0.node.pool1
Groupnet: groupnet0
Subnet: node
Name: pool1
Rules: -
Access Zone: prod
Allocation Method: dynamic
Aggregation Mode: lacp
SC Suspended Nodes: -
Description: -
Ifaces: 1:10gige-agg-1, 2:10gige-agg-1, 4:10gige-agg-1, 3:10gige-agg-1, 5:10gige-agg-1
IP Ranges: 172.16.20.233-172.16.20.237
Rebalance Policy: auto
SC Auto Unsuspend Delay: 0
SC Connect Policy: round_robin
SC Zone: nfs.isi-node.bic.mni.mcgill.ca
SC DNS Zone Aliases: -
SC Failover Policy: round_robin
SC Subnet: node
SC Ttl: 0
Static Routes: -
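- To double-check what SmartConnect itself hands out for a zone, the SmartConnect service IPs can be queried directly with dig. A small sketch using the SSIPs and zone names listed above (assumes dig is available, which it is on the nodes and on our Linux clients):
# Ask the prod SSIP for the prod SmartConnect zone
dig @132.206.178.232 +short nfs.isi.bic.mni.mcgill.ca
# Ask the node SSIP for the node-network SmartConnect zone
dig @172.16.20.232 +short nfs.isi-node.bic.mni.mcgill.ca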
- Display network interfaces configuration:
BIC-Isilon-Cluster-4# isi network interfaces list
LNN Name Status Owners IP Addresses
--------------------------------------------------------------------
1 10gige-1 Up - -
1 10gige-2 Up - -
1 10gige-agg-1 Up groupnet0.prod.pool0 132.206.178.237
groupnet0.node.pool1 172.16.20.237
1 ext-1 Up groupnet0.mgmt.mgmt 172.16.10.20
1 ext-2 No Carrier - -
1 ext-agg Not Available - -
2 10gige-1 Up - -
2 10gige-2 Up - -
2 10gige-agg-1 Up groupnet0.prod.pool0 132.206.178.236
groupnet0.node.pool1 172.16.20.236
2 ext-1 Up groupnet0.mgmt.mgmt 172.16.10.21
2 ext-2 No Carrier - -
2 ext-agg Not Available - -
3 10gige-1 Up - -
3 10gige-2 Up - -
3 10gige-agg-1 Up groupnet0.prod.pool0 132.206.178.234
groupnet0.node.pool1 172.16.20.234
3 ext-1 Up groupnet0.mgmt.mgmt 172.16.10.22
3 ext-2 No Carrier - -
3 ext-agg Not Available - -
4 10gige-1 Up - -
4 10gige-2 Up - -
4 10gige-agg-1 Up groupnet0.prod.pool0 132.206.178.235
groupnet0.node.pool1 172.16.20.235
4 ext-1 Up groupnet0.mgmt.mgmt 172.16.10.23
4 ext-2 No Carrier - -
4 ext-agg Not Available - -
5 10gige-1 Up - -
5 10gige-2 Up - -
5 10gige-agg-1 Up groupnet0.prod.pool0 132.206.178.233
groupnet0.node.pool1 172.16.20.233
5 ext-1 Up groupnet0.mgmt.mgmt 172.16.10.24
5 ext-2 No Carrier - -
5 ext-agg Not Available - -
--------------------------------------------------------------------
Total: 30
- Suspend or resume a node:
From the docu65065_OneFS-8.0.0-CLI-Administration-Guide, page 950:
Suspend or resume a node
You can suspend and resume SmartConnect DNS query responses on a node.
Procedure
1. To suspend DNS query responses for a node:
a. (Optional) To identify a list of nodes and IP address pools, run the
following command:
isi network interfaces list
b. Run the isi network pools sc-suspend-nodes command and specify the pool ID
and logical node number (LNN).
Specify the pool ID you want in the following format:
<groupnet_name>.<subnet_name>.<pool_name>
The following command suspends DNS query responses on node 3 when queries come
through IP addresses in pool5 under groupnet1.subnet 3:
isi network pools sc-suspend-nodes groupnet1.subnet3.pool5 3
2. To resume DNS query responses for an IP address pool, run the isi network
pools sc-resume-nodes command and specify the pool ID and logical node number
(LNN).
The following command resumes DNS query responses on node 3 when queries come
through IP addresses in pool5 under groupnet1.subnet 3:
isi network pools sc-resume-nodes groupnet1.subnet3.pool5 3
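- Applied to this cluster, a minimal sketch (not actually run here) for draining new client connections away from node 5 in the prod pool before servicing it, then resuming it afterwards:
# Stop SmartConnect from answering with node 5's IPs in groupnet0.prod.pool0
isi network pools sc-suspend-nodes groupnet0.prod.pool0 5
# ... perform the maintenance on node 5 ...
# Resume DNS query responses for node 5
isi network pools sc-resume-nodes groupnet0.prod.pool0 5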
Example of an IP Failover with Dynamic Allocation Method
- First, set the dynamic IP allocation for the pool:
isi network pools modify groupnet0.prod.pool0 --alloc-method=dynamic
- Then pull the fiber cables from one node, say node 5, and watch what happens:
- Before pulling the cables:
BIC-Isilon-Cluster-4# isi network interfaces list
...
3   10gige-1     Up            -                     -
3   10gige-2     Up            -                     -
3   10gige-agg-1 Up            groupnet0.prod.pool0  132.206.178.235
3   ext-1        Up            groupnet0.mgmt.mgmt   172.16.10.22
3   ext-2        No Carrier    -                     -
3   ext-agg      Not Available -                     -
4   10gige-1     Up            -                     -
...
5   10gige-1     Up            -                     -
5   10gige-2     Up            -                     -
5   10gige-agg-1 Up            groupnet0.prod.pool0  132.206.178.237
5   ext-1        Up            groupnet0.mgmt.mgmt   172.16.10.24
5   ext-2        No Carrier    -                     -
5   ext-agg      Not Available -                     -
--------------------------------------------------------------------
Total: 30
- After pulling the cables:
- Node 5 external network interfaces 10gige-1, 10gige-2 and 10gige-agg-1 now display No Carrier.
- Note how node 3 external network interface 10gige-agg-1 picked up the IP of node 5.
BIC-Isilon-Cluster-4# isi network interfaces list
LNN Name Status Owners IP Addresses
...
3 10gige-1 Up - -
3 10gige-2 Up - -
3 10gige-agg-1 Up groupnet0.prod.pool0 132.206.178.235
132.206.178.237
3 ext-1 Up groupnet0.mgmt.mgmt 172.16.10.22
3 ext-2 No Carrier - -
3 ext-agg Not Available - -
...
5 10gige-1 No Carrier - -
5 10gige-2 No Carrier - -
5 10gige-agg-1 No Carrier groupnet0.prod.pool0 -
5 ext-1 Up groupnet0.mgmt.mgmt 172.16.10.24
5 ext-2 No Carrier - -
5 ext-agg Not Available - -
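- A simple way to watch the failover from a client is to keep re-resolving the SmartConnect zone while the cables are out. A throwaway sketch (assumes dig is installed on the client):
# Poll the prod SmartConnect zone every 5 seconds and note which member IPs are handed out
while true; do date; dig +short nfs.isi.bic.mni.mcgill.ca; sleep 5; done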
How To Add a Subnet to a Cluster
- Goal: have clients access the Isilon cluster through the private network 172.16.20.0/24 (data network).
- Hosts in the arnodes compute cluster have 2 extra bonded NICs configured on this network.
- The private network 172.16.20.0/24 is directly attached to the cluster's front-end: there are no intervening gateways or routers.
- This section explains how to configure the Isilon cluster so that clients on 172.16.20.0/24 are granted NFS access.
- Current cluster network state:
BIC-Isilon-Cluster-4# isi network interfaces ls LNN Name Status Owners IP Addresses -------------------------------------------------------------------- 1 10gige-1 Up - - 1 10gige-2 Up - - 1 10gige-agg-1 Up groupnet0.prod.pool0 132.206.178.237 1 ext-1 Up groupnet0.mgmt.mgmt 172.16.10.20 1 ext-2 No Carrier - - 1 ext-agg Not Available - - 2 10gige-1 Up - - 2 10gige-2 Up - - 2 10gige-agg-1 Up groupnet0.prod.pool0 132.206.178.236 2 ext-1 Up groupnet0.mgmt.mgmt 172.16.10.21 2 ext-2 No Carrier - - 2 ext-agg Not Available - - 3 10gige-1 Up - - 3 10gige-2 Up - - 3 10gige-agg-1 Up groupnet0.prod.pool0 132.206.178.233 3 ext-1 Up groupnet0.mgmt.mgmt 172.16.10.22 3 ext-2 No Carrier - - 3 ext-agg Not Available - - 4 10gige-1 Up - - 4 10gige-2 Up - - 4 10gige-agg-1 Up groupnet0.prod.pool0 132.206.178.234 4 ext-1 Up groupnet0.mgmt.mgmt 172.16.10.23 4 ext-2 No Carrier - - 4 ext-agg Not Available - - 5 10gige-1 Up - - 5 10gige-2 Up - - 5 10gige-agg-1 Up groupnet0.prod.pool0 132.206.178.235 5 ext-1 Up groupnet0.mgmt.mgmt 172.16.10.24 5 ext-2 No Carrier - - 5 ext-agg Not Available - - -------------------------------------------------------------------- BIC-Isilon-Cluster-4# isi network pools ls ID SC Zone Allocation Method ---------------------------------------------------------------------- groupnet0.mgmt.mgmt mgmt.isi.bic.mni.mcgill.ca static groupnet0.prod.pool0 nfs.isi.bic.mni.mcgill.ca dynamic ---------------------------------------------------------------------- Total: 2 BIC-Isilon-Cluster-4# isi network subnets ls ID Subnet Gateway|Priority Pools SC Service ------------------------------------------------------------------------ groupnet0.mgmt 172.16.10.0/24 172.16.10.1|2 mgmt 0.0.0.0 groupnet0.prod 132.206.178.0/24 132.206.178.1|1 pool0 132.206.178.232 ------------------------------------------------------------------------ Total: 2
- It is faster and easier to configure this by using the WebUI rather than the CLI.

- Essentially, it boils down to the following actions:
- Create a new subnet called node in the default groupnet groupnet0.
- Set the SmartConnect (SC) Service IP to 172.16.20.232.
- Update the domain master DNS server with the new delegation record and "glue" record. More on this later.
- The resulting subnet node:
BIC-Isilon-Cluster-4# isi network subnets view groupnet0.node
ID: groupnet0.node
Name: node
Groupnet: groupnet0
Pools: pool1
Addr Family: ipv4
Base Addr: 172.16.20.0
CIDR: 172.16.20.0/24
Description: -
Dsr Addrs: -
Gateway: 172.16.20.1
Gateway Priority: 3
MTU: 1500
Prefixlen: 24
Netmask: 255.255.255.0
Sc Service Addr: 172.16.20.232
VLAN Enabled: False
VLAN ID: -
- Create a new pool called pool1 with the following properties:
- Access zone is set to prod, like the pool pool0.
- Allocation method is dynamic.
- Select the 10gige aggregate interfaces from each node.
- Set the SmartConnect Connect policy to round-robin.
- Best practices might require setting it to cpu or network utilization for NFSv4; benchmarking should help.
- Name the SmartConnect zone nfs.isi-node.bic.mni.mcgill.ca.
- Proper records in the master domain DNS server will have to be set for the new zone. More on this later.
- The resulting pool pool1:
BIC-Isilon-Cluster-4# isi network pools view groupnet0.node.pool1
ID: groupnet0.node.pool1
Groupnet: groupnet0
Subnet: node
Name: pool1
Rules: -
Access Zone: prod
Allocation Method: dynamic
Aggregation Mode: lacp
SC Suspended Nodes: -
Description: -
Ifaces: 1:10gige-agg-1, 2:10gige-agg-1, 4:10gige-agg-1, 3:10gige-agg-1, 5:10gige-agg-1
IP Ranges: 172.16.20.233-172.16.20.237
Rebalance Policy: auto
SC Auto Unsuspend Delay: 0
SC Connect Policy: round_robin
SC Zone: nfs.isi-node.bic.mni.mcgill.ca
SC DNS Zone Aliases: -
SC Failover Policy: round_robin
SC Subnet: node
SC Ttl: 0
Static Routes: -
- With this in place, the cluster network interface settings will be:
BIC-Isilon-Cluster-4# isi network interfaces ls
LNN Name Status Owners IP Addresses
--------------------------------------------------------------------
1 10gige-1 Up - -
1 10gige-2 Up - -
1 10gige-agg-1 Up groupnet0.prod.pool0 132.206.178.237
groupnet0.node.pool1 172.16.20.237
1 ext-1 Up groupnet0.mgmt.mgmt 172.16.10.20
1 ext-2 No Carrier - -
1 ext-agg Not Available - -
2 10gige-1 Up - -
2 10gige-2 Up - -
2 10gige-agg-1 Up groupnet0.prod.pool0 132.206.178.236
groupnet0.node.pool1 172.16.20.236
2 ext-1 Up groupnet0.mgmt.mgmt 172.16.10.21
2 ext-2 No Carrier - -
2 ext-agg Not Available - -
3 10gige-1 Up - -
3 10gige-2 Up - -
3 10gige-agg-1 Up groupnet0.prod.pool0 132.206.178.233
groupnet0.node.pool1 172.16.20.234
3 ext-1 Up groupnet0.mgmt.mgmt 172.16.10.22
3 ext-2 No Carrier - -
3 ext-agg Not Available - -
4 10gige-1 Up - -
4 10gige-2 Up - -
4 10gige-agg-1 Up groupnet0.prod.pool0 132.206.178.234
groupnet0.node.pool1 172.16.20.235
4 ext-1 Up groupnet0.mgmt.mgmt 172.16.10.23
4 ext-2 No Carrier - -
4 ext-agg Not Available - -
5 10gige-1 Up - -
5 10gige-2 Up - -
5 10gige-agg-1 Up groupnet0.prod.pool0 132.206.178.235
groupnet0.node.pool1 172.16.20.233
5 ext-1 Up groupnet0.mgmt.mgmt 172.16.10.24
5 ext-2 No Carrier - -
5 ext-agg Not Available - -
--------------------------------------------------------------------
Total: 30
- A few notes about the above:
- Because the initial cluster configuration was sloppy, the LNNs (Logical Node Numbers) and Node IDs don’t match.
- This explains why some 10gige-agg-1 interfaces have different last octets in pool0 and pool1.
- Ultimately, the LNNs and Node IDs should be re-assigned to match the nodes' positions in the rack.
- This would avoid potential mistakes when updating or servicing the cluster.
- Current setting:
BIC-Isilon-Cluster-4# isi config
Welcome to the Isilon IQ configuration console.
Copyright (c) 2001-2016 EMC Corporation. All Rights Reserved.
Enter 'help' to see list of available commands.
Enter 'help <command>' to see help for a specific command.
Enter 'quit' at any prompt to discard changes and exit.
Node build: Isilon OneFS v8.0.0.1 B_MR_8_0_0_1_131(RELEASE)
Node serial number: SX410-301608-0264
BIC-Isilon-Cluster >>> lnnset
LNN Device ID Cluster IP
----------------------------------------
1 1 10.0.3.1
2 2 10.0.3.2
3 4 10.0.3.4
4 3 10.0.3.3
5 6 10.0.3.5
BIC-Isilon-Cluster >>> exit
BIC-Isilon-Cluster-4#
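- If the LNN/Device ID mapping is ever straightened out, it would presumably be done from the same isi config console, with lnnset followed by commit. This is an untested sketch; check help lnnset for the exact argument order before running anything:
isi config
>>> help lnnset
>>> lnnset <old-lnn> <new-lnn>
>>> commit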
- The domain DNS configuration must be updated:
- The new zone delegation for the SmartConnect zone isi-node.bic.mni.mcgill.ca. has to be put in place.
- A new glue record must be created for the SSIP (SmartConnect Service IP) of the delegated zone.
; glue record
sip-node.bic.mni.mcgill.ca.    IN A    172.16.20.232
; zone delegation
isi-node.bic.mni.mcgill.ca.    IN NS   sip-node.bic.mni.mcgill.ca.
- Verify that SC Zone: nfs.isi-node.bic.mni.mcgill.ca resolves properly and in a round-robin way.
- Both on the cluster and on a client:
malin@login1:~$ dig +short nfs.isi-node.bic.mni.mcgill.ca
172.16.20.236
malin@login1:~$ dig +short nfs.isi-node.bic.mni.mcgill.ca
172.16.20.237
malin@login1:~$ dig +short nfs.isi-node.bic.mni.mcgill.ca
172.16.20.233
malin@login1:~$ dig +short nfs.isi-node.bic.mni.mcgill.ca
172.16.20.234
malin@login1:~$ dig +short nfs.isi-node.bic.mni.mcgill.ca
172.16.20.235
malin@login1:~$ dig +short nfs.isi-node.bic.mni.mcgill.ca
172.16.20.236
BIC-Isilon-Cluster-4# dig +short nfs.isi-node.bic.mni.mcgill.ca
172.16.20.237
BIC-Isilon-Cluster-4# dig +short nfs.isi-node.bic.mni.mcgill.ca
172.16.20.233
BIC-Isilon-Cluster-4# dig +short nfs.isi-node.bic.mni.mcgill.ca
172.16.20.234
BIC-Isilon-Cluster-4# dig +short nfs.isi-node.bic.mni.mcgill.ca
172.16.20.235
BIC-Isilon-Cluster-4# dig +short nfs.isi-node.bic.mni.mcgill.ca
172.16.20.236
BIC-Isilon-Cluster-4# dig +short nfs.isi-node.bic.mni.mcgill.ca
172.16.20.237
- It works!
FileSystems and Access Zones
- There are 2 access zones defined.
- Default zone is System: it must exist and cannot be deleted.
- The access zone called prod will be used to hold the user data.
- Both zones have lsa-nis-provider:BIC among their Auth Providers.
- See the section about NIS below: this might create a security weakness.
BIC-Isilon-Cluster-4# isi zone zones list
Name Path
------------
System /ifs
prod /ifs
------------
Total: 2
BIC-Isilon-Cluster-4# isi zone view System
Name: System
Path: /ifs
Groupnet: groupnet0
Map Untrusted: -
Auth Providers: lsa-nis-provider:BIC, lsa-file-provider:System, lsa-local-provider:System
NetBIOS Name: -
User Mapping Rules: -
Home Directory Umask: 0077
Skeleton Directory: /usr/share/skel
Cache Entry Expiry: 4H
Zone ID: 1
BIC-Isilon-Cluster-4# isi zone view prod
Name: prod
Path: /ifs
Groupnet: groupnet0
Map Untrusted: -
Auth Providers: lsa-nis-provider:BIC, lsa-local-provider:prod, lsa-file-provider:System
NetBIOS Name: -
User Mapping Rules: -
Home Directory Umask: 0077
Skeleton Directory: /usr/share/skel
Cache Entry Expiry: 4H
Zone ID: 2
- Another, more concise way of displaying the defined access zones:
BIC-Isilon-Cluster-4# isi zone list -v
Name: System
Path: /ifs
Groupnet: groupnet0
Map Untrusted: -
Auth Providers: lsa-nis-provider:BIC, lsa-file-provider:System, lsa-local-provider:System
NetBIOS Name: -
User Mapping Rules: -
Home Directory Umask: 0077
Skeleton Directory: /usr/share/skel
Cache Entry Expiry: 4H
Zone ID: 1
--------------------------------------------------------------------------------
Name: prod
Path: /ifs
Groupnet: groupnet0
Map Untrusted: -
Auth Providers: lsa-nis-provider:BIC, lsa-local-provider:prod, lsa-file-provider:System
NetBIOS Name: -
User Mapping Rules: -
Home Directory Umask: 0077
Skeleton Directory: /usr/share/skel
Cache Entry Expiry: 4H
Zone ID: 2
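- Should a provider ever have to be pulled out of (or added back to) a zone, isi zone zones modify is the tool. A sketch with assumed flag names (see isi zone zones modify --help):
isi zone zones modify prod --remove-auth-providers lsa-nis-provider:BIC
isi zone zones modify prod --add-auth-providers lsa-nis-provider:BIC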
NFS, NIS: Exports and Aliases.
- There seems to be something amiss with NIS and OneFS v8.0.
- The System access zone had to be provided with NIS authentication, as otherwise only numerical UIDs and GIDs show up on the /ifs/data filesystem.
- There might be a potential security weakness there.
- See https://community.emc.com/thread/193468?start=0&tstart=0 even though this thread is for v7.2
- Created /etc/netgroup with “+” in it on one node as suggested in the post above and somehow OneFS propagated it to the other nodes.
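- For reference, the file is assumed to contain nothing but the NIS wildcard line:
# /etc/netgroup (a single "+" line, as described above)
+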
- List the NIS auth providers:
BIC-Isilon-Cluster-4# isi auth nis list
Name NIS Domain Servers Status
-----------------------------------------
BIC vamana 132.206.178.227 online
132.206.178.243
-----------------------------------------
Total: 1
BIC-Isilon-Cluster-1# isi auth nis view BIC
Name: BIC
NIS Domain: vamana
Servers: 132.206.178.227, 132.206.178.243
Status: online
Authentication: Yes
Balance Servers: Yes
Check Online Interval: 3m
Create Home Directory: No
Enabled: Yes
Enumerate Groups: Yes
Enumerate Users: Yes
Findable Groups: -
Findable Users: -
Group Domain: NIS_GROUPS
Groupnet: groupnet0
Home Directory Template: -
Hostname Lookup: Yes
Listable Groups: -
Listable Users: -
Login Shell: /bin/bash
Normalize Groups: No
Normalize Users: No
Provider Domain: -
Ntlm Support: all
Request Timeout: 20
Restrict Findable: Yes
Restrict Listable: No
Retry Time: 5
Unfindable Groups: wheel, 0, insightiq, 15, isdmgmt, 16
Unfindable Users: root, 0, insightiq, 15, isdmgmt, 16
Unlistable Groups: -
Unlistable Users: -
User Domain: NIS_USERS
Ypmatch Using Tcp: No
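- The provider itself was presumably created with something like the following. This is a sketch with assumed flag names, not the original command; see isi auth nis create --help:
isi auth nis create BIC \
    --nis-domain vamana \
    --servers 132.206.178.227,132.206.178.243 \
    --groupnet groupnet0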
- Show the exports for the zone prod:
BIC-Isilon-Cluster-1# isi nfs exports list --zone prod
ID Zone Paths Description
---------------------------------
1 prod /ifs/data -
---------------------------------
Total: 1
BIC-Isilon-Cluster-1# isi nfs exports view 1 --zone prod
ID: 1
Zone: prod
Paths: /ifs/data
Description: -
Clients: 132.206.178.0/24
Root Clients: -
Read Only Clients: -
Read Write Clients: -
All Dirs: No
Block Size: 8.0k
Can Set Time: Yes
Case Insensitive: No
Case Preserving: Yes
Chown Restricted: No
Commit Asynchronous: No
Directory Transfer Size: 128.0k
Encoding: DEFAULT
Link Max: 32767
Map Lookup UID: No
Map Retry: Yes
Map Root
Enabled: True
User: nobody
Primary Group: -
Secondary Groups: -
Map Non Root
Enabled: False
User: nobody
Primary Group: -
Secondary Groups: -
Map Failure
Enabled: False
User: nobody
Primary Group: -
Secondary Groups: -
Map Full: Yes
Max File Size: 8192.00000P
Name Max Size: 255
No Truncate: No
Read Only: No
Readdirplus: Yes
Readdirplus Prefetch: 10
Return 32Bit File Ids: No
Read Transfer Max Size: 1.00M
Read Transfer Multiple: 512
Read Transfer Size: 128.0k
Security Type: unix
Setattr Asynchronous: No
Snapshot: -
Symlinks: Yes
Time Delta: 1.0 ns
Write Datasync Action: datasync
Write Datasync Reply: datasync
Write Filesync Action: filesync
Write Filesync Reply: filesync
Write Unstable Action: unstable
Write Unstable Reply: unstable
Write Transfer Max Size: 1.00M
Write Transfer Multiple: 512
Write Transfer Size: 512.0k
- It doesn’t seem possible to directly list the netgroups defined on the NIS master.
- One can however list the members of a specific netgroup if one happens to know its name:
BIC-Isilon-Cluster-4# isi auth netgroups view xgeraid --recursive --provider nis:BIC
Netgroup: -
Domain: -
Hostname: edgar-xge.bic.mni.mcgill.ca
Username: -
--------------------------------------------------------------------------------
Netgroup: -
Domain: -
Hostname: gustav-xge.bic.mni.mcgill.ca
Username: -
--------------------------------------------------------------------------------
Netgroup: -
Domain: -
Hostname: tatania-xge.bic.mni.mcgill.ca
Username: -
--------------------------------------------------------------------------------
Netgroup: -
Domain: -
Hostname: tubal-xge.bic.mni.mcgill.ca
Username: -
--------------------------------------------------------------------------------
Netgroup: -
Domain: -
Hostname: tullus-xge.bic.mni.mcgill.ca
Username: -
--------------------------------------------------------------------------------
Netgroup: -
Domain: -
Hostname: tutor-xge.bic.mni.mcgill.ca
Username: -
BIC-Isilon-Cluster-1# isi auth netgroups view computecore
Netgroup: -
Domain: -
Hostname: thaisa
Username: -
--------------------------------------------------------------------------------
Netgroup: -
Domain: -
Hostname: vaux
Username: -
--------------------------------------------------------------------------------
Netgroup: -
Domain: -
Hostname: widow
Username: -
I can’t reproduce this behaviour anymore, so it should be taken with a grain of salt! I’ll leave this in place for the moment, but it might go away soon…
- Clients in netgroups must be specified with IP addresses, names don’t work:
BIC-Isilon-Cluster-1# isi nfs exports modify 1 --clear-clients --zone prod
BIC-Isilon-Cluster-1# isi auth netgroups view isix --zone prod
Netgroup: -
Domain: -
Hostname: dromio.bic.mni.mcgill.ca
Username: -
BIC-Isilon-Cluster-1# isi nfs exports modify 1 --clients isix --zone prod
bad host dromio in netgroup isix, skipping
BIC-Isilon-Cluster-1# isi auth netgroups view xisi
Netgroup: -
Domain: -
Hostname: 132.206.178.51
Username: -
BIC-Isilon-Cluster-1# isi nfs exports modify 1 --add-clients xisi --zone prod
BIC-Isilon-Cluster-1# isi nfs exports view 1 --zone prod
ID: 1
Zone: prod
Paths: /ifs/data
Description: -
Clients: xisi
Root Clients: -
Read Only Clients: -
Read Write Clients: -
All Dirs: No
Block Size: 8.0k
Can Set Time: Yes
Case Insensitive: No
Case Preserving: Yes
Chown Restricted: No
Commit Asynchronous: No
Directory Transfer Size: 128.0k
Encoding: DEFAULT
Link Max: 32767
Map Lookup UID: No
Map Retry: Yes
Map Root
Enabled: True
User: nobody
Primary Group: -
Secondary Groups: -
Map Non Root
Enabled: False
User: nobody
Primary Group: -
Secondary Groups: -
Map Failure
Enabled: False
User: nobody
Primary Group: -
Secondary Groups: -
Map Full: Yes
Max File Size: 8192.00000P
Name Max Size: 255
No Truncate: No
Read Only: No
Readdirplus: Yes
Readdirplus Prefetch: 10
Return 32Bit File Ids: No
Read Transfer Max Size: 1.00M
Read Transfer Multiple: 512
Read Transfer Size: 128.0k
Security Type: unix
Setattr Asynchronous: No
Snapshot: -
Symlinks: Yes
Time Delta: 1.0 ns
Write Datasync Action: datasync
Write Datasync Reply: datasync
Write Filesync Action: filesync
Write Filesync Reply: filesync
Write Unstable Action: unstable
Write Unstable Reply: unstable
Write Transfer Max Size: 1.00M
Write Transfer Multiple: 512
Write Transfer Size: 512.0k
Workaround To The Zone Exports Issue With Netgroups.
- Netgroup entries in the NIS maps must be FQDNs: short names won’t work even with the option --ignore-unresolvable-hosts.
- Modify the zone exports with the following options:
isi nfs exports modify 1 --add-clients sgibic --ignore-unresolvable-hosts --zone prod
BIC-Isilon-Cluster-3# isi auth netgroups view sgibic --recursive --provider nis:BIC
Netgroup: -
Domain: -
Hostname: julia.bic.mni.mcgill.ca
Username: -
--------------------------------------------------------------------------------
Netgroup: -
Domain: -
Hostname: luciana.bic.mni.mcgill.ca
Username: -
--------------------------------------------------------------------------------
Netgroup: -
Domain: -
Hostname: mouldy.bic.mni.mcgill.ca
Username: -
--------------------------------------------------------------------------------
Netgroup: -
Domain: -
Hostname: vaux.bic.mni.mcgill.ca
Username: -
BIC-Isilon-Cluster-3# isi nfs exports view 1 --zone prod
ID: 1
Zone: prod
Paths: /ifs/data
Description: -
Clients: isix, xisi
Root Clients: -
Read Only Clients: -
Read Write Clients: -
All Dirs: No
Block Size: 8.0k
Can Set Time: Yes
Case Insensitive: No
Case Preserving: Yes
Chown Restricted: No
Commit Asynchronous: No
Directory Transfer Size: 128.0k
Encoding: DEFAULT
Link Max: 32767
Map Lookup UID: No
Map Retry: Yes
Map Root
Enabled: True
User: nobody
Primary Group: -
Secondary Groups: -
Map Non Root
Enabled: False
User: nobody
Primary Group: -
Secondary Groups: -
Map Failure
Enabled: False
User: nobody
Primary Group: -
Secondary Groups: -
Map Full: Yes
Max File Size: 8192.00000P
Name Max Size: 255
No Truncate: No
Read Only: No
Readdirplus: Yes
Readdirplus Prefetch: 10
Return 32Bit File Ids: No
Read Transfer Max Size: 1.00M
Read Transfer Multiple: 512
Read Transfer Size: 128.0k
Security Type: unix
Setattr Asynchronous: No
Snapshot: -
Symlinks: Yes
Time Delta: 1.0 ns
Write Datasync Action: datasync
Write Datasync Reply: datasync
Write Filesync Action: filesync
Write Filesync Reply: filesync
Write Unstable Action: unstable
Write Unstable Reply: unstable
Write Transfer Max Size: 1.00M
Write Transfer Multiple: 512
Write Transfer Size: 512.0k
BIC-Isilon-Cluster-3# isi nfs exports modify 1 --add-clients sgibic --zone prod
bad host julia in netgroup sgibic, skipping
BIC-Isilon-Cluster-3# isi nfs exports view 1 --zone prod
ID: 1
Zone: prod
Paths: /ifs/data
Description: -
Clients: isix, xisi
Root Clients: -
Read Only Clients: -
Read Write Clients: -
All Dirs: No
Block Size: 8.0k
Can Set Time: Yes
Case Insensitive: No
Case Preserving: Yes
Chown Restricted: No
Commit Asynchronous: No
Directory Transfer Size: 128.0k
Encoding: DEFAULT
Link Max: 32767
Map Lookup UID: No
Map Retry: Yes
Map Root
Enabled: True
User: nobody
Primary Group: -
Secondary Groups: -
Map Non Root
Enabled: False
User: nobody
Primary Group: -
Secondary Groups: -
Map Failure
Enabled: False
User: nobody
Primary Group: -
Secondary Groups: -
Map Full: Yes
Max File Size: 8192.00000P
Name Max Size: 255
No Truncate: No
Read Only: No
Readdirplus: Yes
Readdirplus Prefetch: 10
Return 32Bit File Ids: No
Read Transfer Max Size: 1.00M
Read Transfer Multiple: 512
Read Transfer Size: 128.0k
Security Type: unix
Setattr Asynchronous: No
Snapshot: -
Symlinks: Yes
Time Delta: 1.0 ns
Write Datasync Action: datasync
Write Datasync Reply: datasync
Write Filesync Action: filesync
Write Filesync Reply: filesync
Write Unstable Action: unstable
Write Unstable Reply: unstable
Write Transfer Max Size: 1.00M
Write Transfer Multiple: 512
Write Transfer Size: 512.0k
BIC-Isilon-Cluster-3# isi nfs exports modify 1 --add-clients sgibic --ignore-unresolvable-hosts --zone prod
BIC-Isilon-Cluster-3# isi nfs exports view 1 --zone prod
ID: 1
Zone: prod
Paths: /ifs/data
Description: -
Clients: isix, sgibic, xisi
Root Clients: -
Read Only Clients: -
Read Write Clients: -
All Dirs: No
Block Size: 8.0k
Can Set Time: Yes
Case Insensitive: No
Case Preserving: Yes
Chown Restricted: No
Commit Asynchronous: No
Directory Transfer Size: 128.0k
Encoding: DEFAULT
Link Max: 32767
Map Lookup UID: No
Map Retry: Yes
Map Root
Enabled: True
User: nobody
Primary Group: -
Secondary Groups: -
Map Non Root
Enabled: False
User: nobody
Primary Group: -
Secondary Groups: -
Map Failure
Enabled: False
User: nobody
Primary Group: -
Secondary Groups: -
Map Full: Yes
Max File Size: 8192.00000P
Name Max Size: 255
No Truncate: No
Read Only: No
Readdirplus: Yes
Readdirplus Prefetch: 10
Return 32Bit File Ids: No
Read Transfer Max Size: 1.00M
Read Transfer Multiple: 512
Read Transfer Size: 128.0k
Security Type: unix
Setattr Asynchronous: No
Snapshot: -
Symlinks: Yes
Time Delta: 1.0 ns
Write Datasync Action: datasync
Write Datasync Reply: datasync
Write Filesync Action: filesync
Write Filesync Reply: filesync
Write Unstable Action: unstable
Write Unstable Reply: unstable
Write Transfer Max Size: 1.00M
Write Transfer Multiple: 512
Write Transfer Size: 512.0k
A Real Example With Quotas
- Create an export with no root squashing for hosts in the admincore NIS netgroup, with access restricted to hosts in admincore.
- List the exports in the prod zone.
- Check the exports for any error.
BIC-Isilon-Cluster-4# isi nfs exports create /ifs/data/bicadmin1 --zone prod --clients admincore --root-clients admincore --ignore-unresolvable-hosts
BIC-Isilon-Cluster-4# isi nfs exports list --zone prod
ID Zone Paths Description
-------------------------------------------
1 prod /ifs/data -
3 prod /ifs/data/bicadmin1 -
-------------------------------------------
Total: 2
BIC-Isilon-Cluster-4# isi nfs exports view 3 --zone prod
ID: 3
Zone: prod
Paths: /ifs/data/bicadmin1
Description: -
Clients: admincore
Root Clients: admincore
Read Only Clients: -
Read Write Clients: -
All Dirs: No
Block Size: 8.0k
Can Set Time: Yes
Case Insensitive: No
Case Preserving: Yes
Chown Restricted: No
Commit Asynchronous: No
Directory Transfer Size: 128.0k
Encoding: DEFAULT
Link Max: 32767
Map Lookup UID: No
Map Retry: Yes
Map Root
Enabled: True
User: nobody
Primary Group: -
Secondary Groups: -
Map Non Root
Enabled: False
User: nobody
Primary Group: -
Secondary Groups: -
Map Failure
Enabled: False
User: nobody
Primary Group: -
Secondary Groups: -
Map Full: Yes
Max File Size: 8192.00000P
Name Max Size: 255
No Truncate: No
Read Only: No
Readdirplus: Yes
Readdirplus Prefetch: 10
Return 32Bit File Ids: No
Read Transfer Max Size: 1.00M
Read Transfer Multiple: 512
Read Transfer Size: 128.0k
Security Type: unix
Setattr Asynchronous: No
Snapshot: -
Symlinks: Yes
Time Delta: 1.0 ns
Write Datasync Action: datasync
Write Datasync Reply: datasync
Write Filesync Action: filesync
Write Filesync Reply: filesync
Write Unstable Action: unstable
Write Unstable Reply: unstable
Write Transfer Max Size: 1.00M
Write Transfer Multiple: 512
Write Transfer Size: 512.0k
BIC-Isilon-Cluster-4# isi nfs exports check --zone prod
ID Message
----------
----------
Total: 0
How to create NFS aliases and use them
- It might be useful to create NFS aliases so that NFS clients can use a short symbolic name to mount the Isilon exports.
- Useful for ipl, movement or noel agglomerated mount points like /ifs/data/ipl/ipl-5-6-8-10/, /ifs/data/movement/movement3-4-5-6-7 or /ifs/data/noel/noel1-5.
BIC-Isilon-Cluster-2# mkdir /ifs/data/movement/movement3-4-5-6-7
BIC-Isilon-Cluster-2# isi quota quotas create /ifs/data/movement/movement3-4-5-6-7 directory --zone prod --hard-threshold 400G --container=yes
BIC-Isilon-Cluster-2# for i in 3 4 5 6 7; do mkdir /ifs/data/movement/movement3-4-5-6-7/movement$i; done
BIC-Isilon-Cluster-2# ll /ifs/data/movement/movement3-4-5-6-7
total 14
drwxr-xr-x  7 root  wheel  135 Oct 19 14:56 ./
drwxr-xr-x  5 root  wheel   89 Oct 19 14:42 ../
drwxr-xr-x  2 root  wheel    0 Oct 19 14:56 movement3/
drwxr-xr-x  2 root  wheel    0 Oct 19 14:56 movement4/
drwxr-xr-x  2 root  wheel    0 Oct 19 14:56 movement5/
drwxr-xr-x  2 root  wheel    0 Oct 19 14:56 movement6/
drwxr-xr-x  2 root  wheel    0 Oct 19 14:56 movement7/
BIC-Isilon-Cluster-2# for i in 3 4 5 6 7; do isi nfs exports create /ifs/data/movement/movement3-4-5-6-7/movement$i --zone prod --clients admincore --root-clients admincore; done
BIC-Isilon-Cluster-2# for i in 3 4 5 6 7; do isi nfs aliases create /movement$i /ifs/data/movement/movement3-4-5-6-7/movement$i --zone prod; done
- This is used for the ipl, movement and noel allocated storage:
BIC-Isilon-Cluster-2# isi nfs aliases ls --zone prod | egrep '(ipl|movement|noel)'
prod /ipl1            /ifs/data/ipl/ipl-agglo/ipl1
prod /ipl10           /ifs/data/ipl/ipl-5-6-8-10/ipl10
prod /ipl11           /ifs/data/ipl/ipl11
prod /ipl2            /ifs/data/ipl/ipl-agglo/ipl2
prod /ipl3            /ifs/data/ipl/ipl-agglo/ipl3
prod /ipl4            /ifs/data/ipl/ipl-agglo/ipl4
prod /ipl5            /ifs/data/ipl/ipl-5-6-8-10/ipl5
prod /ipl6            /ifs/data/ipl/ipl-5-6-8-10/ipl6
prod /ipl7            /ifs/data/ipl/ipl-agglo/ipl7
prod /ipl8            /ifs/data/ipl/ipl-5-6-8-10/ipl8
prod /ipl9            /ifs/data/ipl/ipl-agglo/ipl9
prod /ipl_proj01      /ifs/data/ipl/ipl-agglo/proj01
prod /ipl_proj02      /ifs/data/ipl/proj02
prod /ipl_proj03      /ifs/data/ipl/proj03
prod /ipl_proj04      /ifs/data/ipl/proj04
prod /ipl_proj05      /ifs/data/ipl/proj05
prod /ipl_proj06      /ifs/data/ipl/proj06
prod /ipl_proj07      /ifs/data/ipl/proj07
prod /ipl_proj08      /ifs/data/ipl/proj08
prod /ipl_proj09      /ifs/data/ipl/proj09
prod /ipl_proj10      /ifs/data/ipl/proj10
prod /ipl_proj11      /ifs/data/ipl/proj11
prod /ipl_proj12      /ifs/data/ipl/proj12
prod /ipl_proj13      /ifs/data/ipl/proj13
prod /ipl_proj14      /ifs/data/ipl/proj14
prod /ipl_proj15      /ifs/data/ipl/proj15
prod /ipl_proj16      /ifs/data/ipl/proj16
prod /ipl_quarantine  /ifs/data/ipl/quarantine
prod /ipl_scratch01   /ifs/data/ipl/scratch01
prod /ipl_scratch02   /ifs/data/ipl/scratch02
prod /ipl_scratch03   /ifs/data/ipl/scratch03
prod /ipl_scratch04   /ifs/data/ipl/scratch04
prod /ipl_scratch05   /ifs/data/ipl/scratch05
prod /ipl_scratch06   /ifs/data/ipl/scratch06
prod /ipl_scratch07   /ifs/data/ipl/scratch07
prod /ipl_scratch08   /ifs/data/ipl/scratch08
prod /ipl_scratch09   /ifs/data/ipl/scratch09
prod /ipl_scratch10   /ifs/data/ipl/scratch10
prod /ipl_scratch11   /ifs/data/ipl/scratch11
prod /ipl_scratch12   /ifs/data/ipl/scratch12
prod /ipl_scratch13   /ifs/data/ipl/scratch13
prod /ipl_scratch14   /ifs/data/ipl/scratch14
prod /ipl_scratch15   /ifs/data/ipl/scratch15
prod /ipl_user01      /ifs/data/ipl/ipl-agglo/user01
prod /ipl_user02      /ifs/data/ipl/user02
prod /movement3       /ifs/data/movement/movement3-4-5-6-7/movement3
prod /movement4       /ifs/data/movement/movement3-4-5-6-7/movement4
prod /movement5       /ifs/data/movement/movement3-4-5-6-7/movement5
prod /movement6       /ifs/data/movement/movement3-4-5-6-7/movement6
prod /movement7       /ifs/data/movement/movement3-4-5-6-7/movement7
prod /movement8       /ifs/data/movement/movement8
prod /movement9       /ifs/data/movement/movement9
prod /noel1           /ifs/data/noel/noel1-5/noel1
prod /noel2           /ifs/data/noel/noel1-5/noel2
prod /noel3           /ifs/data/noel/noel1-5/noel3
prod /noel4           /ifs/data/noel/noel1-5/noel4
prod /noel5           /ifs/data/noel/noel1-5/noel5
prod /noel6           /ifs/data/noel/noel6
prod /noel7           /ifs/data/noel/noel7
prod /noel8           /ifs/data/noel/noel8
- With the NFS aliases in place, an NFS client can mount an export like this:
~$ mkdir /mnt/ifs/movement7
~$ mount -t nfs -o vers=4 nfs.isi.bic.mni.mcgill.ca:/movement7 /mnt/ifs/movement7
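- If the mount is supposed to survive reboots, the matching /etc/fstab line on the client would look something like the following. This is an assumption about the client setup, not taken from a real host; adjust the options as needed:
nfs.isi.bic.mni.mcgill.ca:/movement7  /mnt/ifs/movement7  nfs  vers=4  0 0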
Quotas
User Quotas
- One can use the web GUI to create a user quota for the export /ifs/data/bicadmin1 defined above.
- Here, using the CLI, we create a user quota for user malin on the export /ifs/data/bicadmin1 with soft and hard limits.
BIC-Isilon-Cluster-4# isi quota quotas create /ifs/data/bicadmin1 user \
--user malin --hard-threshold 1G \
--soft-threshold 500M --soft-grace 1W --zone prod --verbose
Created quota: USER:malin@/ifs/data/bicadmin1
BIC-Isilon-Cluster-2# isi quota quotas list --path /ifs/data/bicadmin1 --zone prod --verbose
Type AppliesTo Path Snap Hard Soft Adv Grace Files With Overhead W/O Overhead Over Enforced Container Linked
-------------------------------------------------------------------------------------------------------------------------------------------------
user malin /ifs/data/bicadmin1 No 1.00G 500.00M - 1W 4625 1.39G 1024.00M - Yes No No
directory DEFAULT /ifs/data/bicadmin1 No 400.00G 399.00G - 1W 797410 138.94G 93.39G - Yes Yes -
-------------------------------------------------------------------------------------------------------------------------------------------------
Total: 2
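- To check on that single user quota later, something like the following should work. The selector flags are assumed, not verified; see isi quota quotas view --help:
isi quota quotas view /ifs/data/bicadmin1 user --user malin --zone prod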
Directory Quotas
- The export /ifs/data/loris with ID=5 has already been created.
- Here we put a 1TB directory quota on it and list the quota explicitly.
- The option --container=yes should be used: then df on a client will display the export’s quota value rather than the whole available cluster space in the zone of the export.
BIC-Isilon-Cluster-3# isi nfs exports list --zone prod
ID Zone Paths Description
-------------------------------------------
1 prod /ifs/data -
3 prod /ifs/data/bicadmin1 -
4 prod /ifs/data/bicdata -
5 prod /ifs/data/loris -
-------------------------------------------
Total: 4
BIC-Isilon-Cluster-3# isi nfs exports view 5 --zone prod
ID: 5
Zone: prod
Paths: /ifs/data/loris
Description: -
Clients: admincore
Root Clients: admincore
Read Only Clients: -
Read Write Clients: -
All Dirs: No
Block Size: 8.0k
Can Set Time: Yes
Case Insensitive: No
Case Preserving: Yes
Chown Restricted: No
Commit Asynchronous: No
Directory Transfer Size: 128.0k
Encoding: DEFAULT
Link Max: 32767
Map Lookup UID: No
Map Retry: Yes
Map Root
Enabled: True
User: nobody
Primary Group: -
Secondary Groups: -
Map Non Root
Enabled: False
User: nobody
Primary Group: -
Secondary Groups: -
Map Failure
Enabled: False
User: nobody
Primary Group: -
Secondary Groups: -
Map Full: Yes
Max File Size: 8192.00000P
Name Max Size: 255
No Truncate: No
Read Only: No
Readdirplus: Yes
Readdirplus Prefetch: 10
Return 32Bit File Ids: No
Read Transfer Max Size: 1.00M
Read Transfer Multiple: 512
Read Transfer Size: 128.0k
Security Type: unix
Setattr Asynchronous: No
Snapshot: -
Symlinks: Yes
Time Delta: 1.0 ns
Write Datasync Action: datasync
Write Datasync Reply: datasync
Write Filesync Action: filesync
Write Filesync Reply: filesync
Write Unstable Action: unstable
Write Unstable Reply: unstable
Write Transfer Max Size: 1.00M
Write Transfer Multiple: 512
Write Transfer Size: 512.0k
BIC-Isilon-Cluster-3# isi quota quotas create /ifs/data/loris directory \
--hard-threshold 1T --zone prod --container=yes
BIC-Isilon-Cluster-3# isi quota quotas list
Type AppliesTo Path Snap Hard Soft Adv Used
----------------------------------------------------------------------------
user malin /ifs/data/bicadmin1 No 10.00G 500.00M - 1024.00M
directory DEFAULT /ifs/data/bicadmin1 No 900.00G 800.00G - 298.897G
directory DEFAULT /ifs/data/bicdata No 1.00T - - 106.904G
directory DEFAULT /ifs/data/loris No 1.00T - - 0
----------------------------------------------------------------------------
Total: 4
BIC-Isilon-Cluster-3# isi quota quotas view /ifs/data/loris directory
Path: /ifs/data/loris
Type: directory
Snapshots: No
Thresholds Include Overhead: No
Usage
Files: 16050
With Overhead: 422.04G
W/O Overhead: 335.45G
Over: -
Enforced: Yes
Container: Yes
Linked: -
Thresholds
Hard Threshold: 1.00T
Hard Exceeded: No
Hard Last Exceeded: 1969-12-31T19:00:00
Advisory: -
Advisory Exceeded: No
Advisory Last Exceeded: -
Soft Threshold: -
Soft Exceeded: No
Soft Last Exceeded: -
Soft Grace: -
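- If the allocation ever has to grow, the quota can presumably be bumped in place rather than recreated. A sketch, with flags to be verified against isi quota quotas modify --help:
isi quota quotas modify /ifs/data/loris directory --hard-threshold 2T --zone prod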
Snapshots
- This has been recently (Aug 2016) set in motion.
- All the settings are in flux, such as the snapshot schedules and path naming conventions.
- Some experimentation will be in order.
Snapshot schedules listing
BIC-Isilon-Cluster-3# isi snapshot schedules ls
ID   Name
---------------------------------
2    snapshot-bicadmin1-daily-31d
3    snapshot-bicdata-daily-7d
4    snapshot-mril2-daily-3d
5    snapshot-mril3-daily-3d
---------------------------------
Total: 4
Viewing the scheduled snapshots in detail
BIC-Isilon-Cluster-3# isi snapshot schedules ls -v
ID: 2
Name: snapshot-bicadmin1-daily-31d
Path: /ifs/data/bicadmin1
Pattern: snapshot_bicadmin1_31d_%Y-%m-%d-%H-%M
Schedule: every 1 days at 07:00 PM
Duration: 1M1D
Alias: alias-snapshot-bicadmin1-daily
Next Run: 2016-10-04T19:00:00
Next Snapshot: snapshot_bicadmin1_31d_2016-10-04-19-00
--------------------------------------------------------------------------------
ID: 3
Name: snapshot-bicdata-daily-7d
Path: /ifs/data/bicdata
Pattern: snapshot-bicdata_daily_7d_%Y-%m-%d-%H-%M
Schedule: every 1 days at 07:00 PM
Duration: 1W1D
Alias: alias-snapshot-bicdata-daily
Next Run: 2016-10-04T19:00:00
Next Snapshot: snapshot-bicdata_daily_7d_2016-10-04-19-00
--------------------------------------------------------------------------------
ID: 4
Name: snapshot-mril2-daily-3d
Path: /ifs/data/mril/mril2
Pattern: snapshot-mril2-daily-3d-%Y-%m-%d-%H-%M
Schedule: every 1 days at 11:45 PM
Duration: 3D1H
Alias: alias-snapshot-mril2-daily-3d
Next Run: 2016-10-04T23:45:00
Next Snapshot: snapshot-mril2-daily-3d-2016-10-04-23-45
--------------------------------------------------------------------------------
ID: 5
Name: snapshot-mril3-daily-3d
Path: /ifs/data/mril/mril3
Pattern: snapshot-mril3-daily-3d-%Y-%m-%d-%H-%M
Schedule: every 1 days at 11:45 PM
Duration: 3D2H
Alias: alias-snapshot-mril3-daily-3d
Next Run: 2016-10-04T23:45:00
Next Snapshot: snapshot-mril3-daily-3d-2016-10-04-23-45
Listing the snapshots and viewing the details on a particular snapshot
BIC-Isilon-Cluster-3# isi snapshot snapshots list
ID Name Path
--------------------------------------------------------------------------------
378 alias-snapshot-bicadmin1-daily /ifs/data/bicadmin1
737 snapshot_bicadmin1_30D_2016-09-03-_19-00 /ifs/data/bicadmin1
740 snapshot-bicdata_daily_30D_expiration_2016-09-03-_19-00 /ifs/data/bicdata
744 snapshot_bicadmin1_30D_2016-09-04-_19-00 /ifs/data/bicadmin1
747 snapshot-bicdata_daily_30D_expiration_2016-09-04-_19-00 /ifs/data/bicdata
751 snapshot_bicadmin1_30D_2016-09-05-_19-00 /ifs/data/bicadmin1
754 snapshot-bicdata_daily_30D_expiration_2016-09-05-_19-00 /ifs/data/bicdata
758 snapshot_bicadmin1_30D_2016-09-06-_19-00 /ifs/data/bicadmin1
761 snapshot-bicdata_daily_30D_expiration_2016-09-06-_19-00 /ifs/data/bicdata
765 snapshot_bicadmin1_30D_2016-09-07-_19-00 /ifs/data/bicadmin1
768 snapshot-bicdata_daily_30D_expiration_2016-09-07-_19-00 /ifs/data/bicdata
772 snapshot_bicadmin1_30D_2016-09-08-_19-00 /ifs/data/bicadmin1
775 snapshot-bicdata_daily_30D_expiration_2016-09-08-_19-00 /ifs/data/bicdata
779 snapshot_bicadmin1_30D_2016-09-09-_19-00 /ifs/data/bicadmin1
782 snapshot-bicdata_daily_30D_expiration_2016-09-09-_19-00 /ifs/data/bicdata
786 snapshot_bicadmin1_30D_2016-09-10-_19-00 /ifs/data/bicadmin1
789 snapshot-bicdata_daily_30D_expiration_2016-09-10-_19-00 /ifs/data/bicdata
793 snapshot_bicadmin1_30D_2016-09-11-_19-00 /ifs/data/bicadmin1
796 snapshot-bicdata_daily_30D_expiration_2016-09-11-_19-00 /ifs/data/bicdata
800 snapshot_bicadmin1_30D_2016-09-12-_19-00 /ifs/data/bicadmin1
803 snapshot-bicdata_daily_30D_expiration_2016-09-12-_19-00 /ifs/data/bicdata
807 snapshot_bicadmin1_30D_2016-09-13-_19-00 /ifs/data/bicadmin1
810 snapshot-bicdata_daily_30D_expiration_2016-09-13-_19-00 /ifs/data/bicdata
814 snapshot_bicadmin1_30D_2016-09-14-_19-00 /ifs/data/bicadmin1
817 snapshot-bicdata_daily_30D_expiration_2016-09-14-_19-00 /ifs/data/bicdata
821 snapshot_bicadmin1_30D_2016-09-15-_19-00 /ifs/data/bicadmin1
824 snapshot-bicdata_daily_30D_expiration_2016-09-15-_19-00 /ifs/data/bicdata
828 snapshot_bicadmin1_30D_2016-09-16-_19-00 /ifs/data/bicadmin1
831 snapshot-bicdata_daily_30D_expiration_2016-09-16-_19-00 /ifs/data/bicdata
835 snapshot_bicadmin1_30D_2016-09-17-_19-00 /ifs/data/bicadmin1
838 snapshot-bicdata_daily_30D_expiration_2016-09-17-_19-00 /ifs/data/bicdata
842 snapshot_bicadmin1_30D_2016-09-18-_19-00 /ifs/data/bicadmin1
845 snapshot-bicdata_daily_30D_expiration_2016-09-18-_19-00 /ifs/data/bicdata
849 snapshot_bicadmin1_30D_2016-09-19-_19-00 /ifs/data/bicadmin1
852 snapshot-bicdata_daily_30D_expiration_2016-09-19-_19-00 /ifs/data/bicdata
856 snapshot_bicadmin1_30D_2016-09-20-_19-00 /ifs/data/bicadmin1
859 snapshot-bicdata_daily_30D_expiration_2016-09-20-_19-00 /ifs/data/bicdata
863 snapshot_bicadmin1_30D_2016-09-21-_19-00 /ifs/data/bicadmin1
866 snapshot-bicdata_daily_30D_expiration_2016-09-21-_19-00 /ifs/data/bicdata
870 snapshot_bicadmin1_30D_2016-09-22-_19-00 /ifs/data/bicadmin1
873 snapshot-bicdata_daily_30D_expiration_2016-09-22-_19-00 /ifs/data/bicdata
877 snapshot_bicadmin1_30D_2016-09-23-_19-00 /ifs/data/bicadmin1
880 snapshot-bicdata_daily_30D_expiration_2016-09-23-_19-00 /ifs/data/bicdata
884 snapshot_bicadmin1_30D_2016-09-24-_19-00 /ifs/data/bicadmin1
887 snapshot-bicdata_daily_30D_expiration_2016-09-24-_19-00 /ifs/data/bicdata
891 snapshot_bicadmin1_30D_2016-09-25-_19-00 /ifs/data/bicadmin1
894 snapshot-bicdata_daily_30D_expiration_2016-09-25-_19-00 /ifs/data/bicdata
898 snapshot_bicadmin1_30D_2016-09-26-_19-00 /ifs/data/bicadmin1
901 snapshot-bicdata_daily_30D_expiration_2016-09-26-_19-00 /ifs/data/bicdata
905 snapshot_bicadmin1_30D_2016-09-27-_19-00 /ifs/data/bicadmin1
908 snapshot-bicdata_daily_30D_expiration_2016-09-27-_19-00 /ifs/data/bicdata
912 snapshot_bicadmin1_30D_2016-09-28-_19-00 /ifs/data/bicadmin1
915 snapshot-bicdata_daily_30D_expiration_2016-09-28-_19-00 /ifs/data/bicdata
919 snapshot_bicadmin1_30D_2016-09-29-_19-00 /ifs/data/bicadmin1
922 snapshot-bicdata_daily_30D_expiration_2016-09-29-_19-00 /ifs/data/bicdata
926 snapshot_bicadmin1_30D_2016-09-30-_19-00 /ifs/data/bicadmin1
929 snapshot-bicdata_daily_30D_expiration_2016-09-30-_19-00 /ifs/data/bicdata
933 snapshot_bicadmin1_30D_2016-10-01-_19-00 /ifs/data/bicadmin1
936 snapshot-bicdata_daily_30D_expiration_2016-10-01-_19-00 /ifs/data/bicdata
940 snapshot_bicadmin1_30D_2016-10-02-_19-00 /ifs/data/bicadmin1
943 snapshot-bicdata_daily_30D_expiration_2016-10-02-_19-00 /ifs/data/bicdata
947 snapshot_bicadmin1_30D_2016-10-03-_19-00 /ifs/data/bicadmin1
950 snapshot-bicdata_daily_30D_expiration_2016-10-03-_19-00 /ifs/data/bicdata
952 FSAnalyze-Snapshot-Current-1475546412 /ifs
--------------------------------------------------------------------------------
Total: 64
BIC-Isilon-Cluster-3# isi snapshot snapshots view snapshot_bicadmin1_30D_2016-10-03-_19-00
ID: 947
Name: snapshot_bicadmin1_30D_2016-10-03-_19-00
Path: /ifs/data/bicadmin1
Has Locks: No
Schedule: snapshot-bicadmin1-daily-31d
Alias Target ID: -
Alias Target Name: -
Created: 2016-10-03T19:00:03
Expires: 2016-11-03T19:00:00
Size: 1.016G
Shadow Bytes: 0
% Reserve: 0.00%
% Filesystem: 0.00%
State: active
- What is this snapshot FSAnalyze-Snapshot-Current-1475546412? I never created it.
- It looks like it goes against best practices: its path is /ifs.
- Found the answer: it is needed for the FS analytics done by the InsightIQ server.
- DO NOT DELETE IT!
BIC-Isilon-Cluster-3# isi snapshot snapshots view FSAnalyze-Snapshot-Current-1475546412
ID: 952
Name: FSAnalyze-Snapshot-Current-1475546412
Path: /ifs
Has Locks: No
Schedule: -
Alias Target ID: -
Alias Target Name: -
Created: 2016-10-03T22:00:12
Expires: -
Size: 1.2129T
Shadow Bytes: 0
% Reserve: 0.00%
% Filesystem: 0.00%
State: active
Snapshot aliases point to the latest snapshot
BIC-Isilon-Cluster-3# isi snapshot aliases ls
ID   Name                            Target ID  Target Name
---------------------------------------------------------------------------------------
378  alias-snapshot-bicadmin1-daily  947        snapshot_bicadmin1_30D_2016-10-03-_19-00
---------------------------------------------------------------------------------------
Creating snapshot schedules
- Create a snapshot schedule and an alias to it so that it points to the last performed snapshot.
- For example, to create a snapshot schedule for /ifs/data/mril/mril2 that runs:
- every day at 11:45 PM,
- with a retention period of 73 hours (3 days + 1 hour),
- and with the alias alias-snapshot-mril2-daily-3d pointing to the last scheduled snapshot.
BIC-Isilon-Cluster-2# isi snapshot schedules create snapshot-mril2-daily-3d /ifs/data/mril/mril2 snapshot_mril2_daily_3d-%Y-%m-%d-%H-%M \
"every 1 days at 11:45 PM" --duration 73H --alias alias-snapshot-mril2-daily-3d
BIC-Isilon-Cluster-2# isi snapshot schedules ls
ID Name
---------------------------------
2 snapshot-bicadmin1-daily-31d
3 snapshot-bicdata-daily-7d
4 snapshot-mril2-daily-3d
5 snapshot-mril3-daily-3d
---------------------------------
Total: 4
BIC-Isilon-Cluster-2# isi snapshot schedules view 4
ID: 4
Name: snapshot-mril2-daily-3d
Path: /ifs/data/mril/mril2
Pattern: snapshot_mril2_daily_3d-%Y-%m-%d-%H-%M
Schedule: every 1 days at 11:45 PM
Duration: 3D1H
Alias: alias-snapshot-mril2-daily-3d
- The CLI command can be messy! The web GUI is more intuitive.
- See below for the pattern syntax.
- Syntax for snapshot schedule creation:
BIC-Isilon-Cluster-2# isi snapshot schedules create <name> <path> <pattern> <schedule>
[--alias <alias>]
[--duration <duration>]
[--verbose]
Options
<name>
Specifies a name for the snapshot schedule.
<path>
Specifies the path of the directory to include in the snapshots.
<pattern>
Specifies a naming pattern for snapshots created according to the schedule. See below.
<schedule>
Specifies how often snapshots are created.
Specify in the following format: "<interval> [<frequency>]"
Specify <interval> in one of the following formats:
Every [{other | <integer>}] week [on <day>]
Every [{other | <integer>}] month [on the <integer>]
Every [<day>[, ...] [of every [{other | <integer>}] week]]
The last {day | weekday | <day>} of every [{other |<integer>}] month
The <integer> {weekday | <day>} of every [{other | <integer>}] month
Yearly on <month> <integer>
Yearly on the {last | <integer>} [weekday | <day>] of <month>
Specify <frequency> in one of the following formats:
at <hh>[:<mm>] [{AM | PM}]
every [<integer>] {hours | minutes} [between <hh>[:<mm>] [{AM | PM}] and <hh>[:<mm>] [{AM | PM}]]
every [<integer>] {hours | minutes} [from <hh>[:<mm>] [{AM | PM}] to <hh>[:<mm>] [{AM | PM}]]
You can optionally append "st", "th", or "rd" to <integer>.
For example, you can specify "Every 1st month"
Specify <day> as any day of the week or a three-letter abbreviation for the day.
For example, both "saturday" and "sat" are valid.
--alias <alias>
Specifies an alias for the latest snapshot generated based on the schedule.
The alias enables you to quickly locate the most recent snapshot that was generated
according to the schedule.
Specify as any string.
{--duration | -x} <duration>
Specifies how long snapshots generated according to the schedule are stored on the
cluster before OneFS automatically deletes them.
Specify in the following format:
<integer><units>
The following <units> are valid:
Y Specifies years
M Specifies months
W Specifies weeks
D Specifies days
H Specifies hours
{--verbose | -v}
Displays a message confirming that the snapshot schedule was created.
- The following variables can be included in a snapshot naming pattern:
- Have fun choosing one!
| Variable | Description |
|---|---|
| %A | The day of the week. |
| %a | The abbreviated day of the week. For example, if the snapshot is generated on a Sunday, %a is replaced with Sun. |
| %B | The name of the month. |
| %b | The abbreviated name of the month. For example, if the snapshot is generated in September, %b is replaced with Sep. |
| %C | The first two digits of the year. For example, if the snapshot is created in 2014, %C is replaced with 20. |
| %c | The time and day. This variable is equivalent to specifying %a %b %e %T %Y. |
| %d | The two digit day of the month. |
| %e | The day of the month. A single-digit day is preceded by a blank space. |
| %F | The date. This variable is equivalent to specifying %Y-%m-%d. |
| %G | The year. This variable is equivalent to specifying %Y, except that if the snapshot is created in a week with fewer than four days in the current year, the year containing the majority of that week’s days is used (weeks start on Monday). For example, for a snapshot created on Sunday, January 1, 2017, %G is replaced with 2016, because only one day of that week is in 2017. |
| %g | The abbreviated year. This variable is equivalent to specifying %y, with the same week rule as %G. For example, for a snapshot created on Sunday, January 1, 2017, %g is replaced with 16, because only one day of that week is in 2017. |
| %H | The hour. The hour is represented on the 24-hour clock. Single-digit hours are preceded by a zero. For example, if a snapshot is created at 1:45 AM, %H is replaced with 01. |
| %h | The abbreviated name of the month. This variable is equivalent to specifying %b. |
| %I | The hour represented on the 12-hour clock. Single-digit hours are preceded by a zero. For example, if a snapshot is created at 1:45 PM, %I is replaced with 01. |
| %j | The numeric day of the year. For example, if a snapshot is created on February 1, %j is replaced with 32. |
| %k | The hour represented on the 24-hour clock. Single-digit hours are preceded by a blank space. |
| %l | The hour represented on the 12-hour clock. Single-digit hours are preceded by a blank space. For example, if a snapshot is created at 1:45 AM, %l is replaced with 1. |
| %M | The two-digit minute. |
| %m | The two-digit month. |
| %p | AM or PM. |
| %{PolicyName} | The name of the replication policy that the snapshot was created for. This variable is valid only if you are specifying a snapshot naming pattern for a replication policy. |
| %R | The time. This variable is equivalent to specifying %H:%M. |
| %r | The time. This variable is equivalent to specifying %I:%M:%S %p. |
| %S | The two-digit second. |
| %s | The second represented in UNIX or POSIX time. |
| %{SrcCluster} | The name of the source cluster of the replication policy that the snapshot was created for. This variable is valid only if you are specifying a snapshot naming pattern for a replication policy. |
| %T | The time. This variable is equivalent to specifying %H:%M:%S. |
| %U | The two-digit numerical week of the year. Numbers range from 00 to 53. The first day of the week is calculated as Sunday. |
| %u | The numerical day of the week. Numbers range from 1 to 7. The first day of the week is calculated as Monday. For example, if a snapshot is created on Sunday, %u is replaced with 7. |
| %V | The two-digit numerical week of the year that the snapshot was created in. Numbers range from 01 to 53. The first day of the week is calculated as Monday. If the week of January 1 is four or more days in length, then that week is counted as the first week of the year. |
| %v | The day that the snapshot was created. This variable is equivalent to specifying %e-%b-%Y. |
| %W | The two-digit numerical week of the year that the snapshot was created in. Numbers range from 00 to 53. The first day of the week is calculated as Monday. |
| %w | The numerical day of the week that the snapshot was created on. Numbers range from 0 to 6. The first day of the week is calculated as Sunday. For example, if the snapshot was created on Sunday, %w is replaced with 0. |
| %X | The time that the snapshot was created. This variable is equivalent to specifying %H:%M:%S. |
| %Y | The year that the snapshot was created in. |
| %y | The last two digits of the year that the snapshot was created in. For example, if the snapshot was created in 2014, %y is replaced with 14. |
| %Z | The time zone that the snapshot was created in. |
| %z | The offset from coordinated universal time (UTC) of the time zone that the snapshot was created in. If preceded by a plus sign, the time zone is east of UTC. If preceded by a minus sign, the time zone is west of UTC. |
| %+ | The time and date that the snapshot was created. This variable is equivalent to specifying %a %b %e %X %Z %Y. |
| %% | Escapes a percent sign. For example, 100%% is replaced with 100%. |
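- A quick worked example: with the pattern snapshot-mril2-daily-3d-%Y-%m-%d-%H-%M from schedule 4 above, the snapshot taken at 11:45 PM on October 4, 2016 comes out as snapshot-mril2-daily-3d-2016-10-04-23-45, exactly as shown in the schedules view.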
Creating ChangeList between snapshots
- Create a ChangeList between 2 snapshots and list its content.
- Delete it at the end.
BIC-Isilon-Cluster-3# isi snapshot snapshots ls | grep mril3
965 snapshot-mril3-daily-3d-2016-10-04-23-45 /ifs/data/mril/mril3
966 alias-snapshot-mril3-daily-3d /ifs/data/mril/mril3
979 snapshot-mril3-daily-3d-2016-10-05-23-45 /ifs/data/mril/mril3
BIC-Isilon-Cluster-3# isi snapshot snapshots view 979
ID: 979
Name: snapshot-mril3-daily-3d-2016-10-05-23-45
Path: /ifs/data/mril/mril3
Has Locks: No
Schedule: snapshot-mril3-daily-3d
Alias Target ID: -
Alias Target Name: -
Created: 2016-10-05T23:45:04
Expires: 2016-10-09T01:45:00
Size: 6.0k
Shadow Bytes: 0
% Reserve: 0.00%
% Filesystem: 0.00%
State: active
BIC-Isilon-Cluster-3# isi snapshot snapshots view 965
ID: 965
Name: snapshot-mril3-daily-3d-2016-10-04-23-45
Path: /ifs/data/mril/mril3
Has Locks: No
Schedule: snapshot-mril3-daily-3d
Alias Target ID: -
Alias Target Name: -
Created: 2016-10-04T23:45:10
Expires: 2016-10-08T01:45:00
Size: 61.0k
Shadow Bytes: 0
% Reserve: 0.00%
% Filesystem: 0.00%
State: active
BIC-Isilon-Cluster-3# isi job jobs start ChangelistCreate --older-snapid 965 --newer-snapid 979
BIC-Isilon-Cluster-3# isi_changelist_mod -l
965_979_inprog
BIC-Isilon-Cluster-3# isi job jobs list
ID Type State Impact Pri Phase Running Time
---------------------------------------------------------------
964 ChangelistCreate Running Low 5 2/4 21m
---------------------------------------------------------------
Total: 1
BIC-Isilon-Cluster-3# isi_changelist_mod -l
965_979
BIC-Isilon-Cluster-3# isi_changelist_mod -a 965_979
st_ino=4357852748 st_mode=040755 st_size=14149 st_atime=1475608476 st_mtime=1475608476 st_ctime=1475698088 st_flags=224 cl_flags=00 path=/ifs/data/mril/mril3/ilana/matlab
st_ino=4374360447 st_mode=040755 st_size=207 st_atime=1467833479 st_mtime=1467833479 st_ctime=1475698088 st_flags=224 cl_flags=00 path=/ifs/data/mril/mril3/ilana/matlab/AMICO-master/matlab/other
st_ino=4402080042 st_mode=0100644 st_size=1033 st_atime=1475678676 st_mtime=1475678676 st_ctime=1475698087 st_flags=224 cl_flags=01 path=/ifs/data/mril/mril3/ilana/matlab/correlate.m~
st_ino=4402080043 st_mode=0100644 st_size=2922 st_atime=1475588733 st_mtime=1475588733 st_ctime=1475698087 st_flags=224 cl_flags=01 path=/ifs/data/mril/mril3/ilana/matlab/AMICO-master/matlab/other/AMICO_LoadData.m
st_ino=4414831639 st_mode=0100644 st_size=1047 st_atime=1475690420 st_mtime=1475690420 st_ctime=1475698087 st_flags=224 cl_flags=01 path=/ifs/data/mril/mril3/ilana/matlab/correlate.m
st_ino=4374468851 st_mode=0100644 st_size=2921 st_atime=1466519137 st_mtime=1466519137 st_ctime=1470857490 st_flags=224 cl_flags=02 path=/ifs/data/mril/mril3/ilana/matlab/AMICO-master/matlab/other/AMICO_LoadData.m
st_ino=4416223571 st_mode=0100644 st_size=890 st_atime=1475264575 st_mtime=1475264575 st_ctime=1475350931 st_flags=224 cl_flags=02 path=/ifs/data/mril/mril3/ilana/matlab/correlate.m
BIC-Isilon-Cluster-3# isi_changelist_mod -k 965_979
Jobs
How To Delete A Large Number Of Files/Dirs Without Impacting the Cluster Performance
- Submit the job type TreeDelete.
- Here /ifs/data/zmanda contains 5TB of restored data from the Zmanda NDMP backup tapes.
BIC-Isilon-Cluster-4# isi job jobs start TreeDelete --paths /ifs/data/zmanda --priority 10 --policy low
Started job [4050]
BIC-Isilon-Cluster-4# isi job jobs list
ID Type State Impact Pri Phase Running Time
---------------------------------------------------------
4050 TreeDelete Running Low 10 1/1 -
---------------------------------------------------------
Total: 1
BIC-Isilon-Cluster-4# isi job jobs view 4050
ID: 4050
Type: TreeDelete
State: Running
Impact: Low
Policy: LOW
Pri: 10
Phase: 1/1
Start Time: 2017-10-12T11:45:41
Running Time: 22s
Participants: 1, 2, 3, 4, 6
Progress: Started
Waiting on job ID: -
Description: {'count': 1, 'lins': {'1:1044:db60': """/ifs/data/zmanda"""}}
BIC-Isilon-Cluster-4# isi status
Cluster Name: BIC-Isilon-Cluster
Cluster Health: [ OK ]
Cluster Storage: HDD SSD Storage
Size: 641.6T (649.3T Raw) 0 (0 Raw)
VHS Size: 7.7T
Used: 207.1T (32%) 0 (n/a)
Avail: 434.5T (68%) 0 (n/a)
Health Throughput (bps) HDD Storage SSD Storage
ID |IP Address |DASR | In Out Total| Used / Size |Used / Size
---+---------------+-----+-----+-----+-----+-----------------+-----------------
1|172.16.10.20 | OK |48.9k| 133k| 182k|41.4T/ 130T( 32%)|(No Storage SSDs)
2|172.16.10.21 | OK |20.5M|128.0|20.5M|41.4T/ 130T( 32%)|(No Storage SSDs)
3|172.16.10.22 | OK | 1.3M| 111k| 1.4M|41.4T/ 130T( 32%)|(No Storage SSDs)
4|172.16.10.23 | OK |14.1M|75.6M|89.6M|41.4T/ 130T( 32%)|(No Storage SSDs)
5|172.16.10.24 | OK |96.6k|66.7k| 163k|41.4T/ 130T( 32%)|(No Storage SSDs)
---+---------------+-----+-----+-----+-----+-----------------+-----------------
Cluster Totals: |36.0M|75.9M| 112M| 207T/ 642T( 32%)|(No Storage SSDs)
Health Fields: D = Down, A = Attention, S = Smartfailed, R = Read-Only
Critical Events:
Cluster Job Status:
Running jobs:
Job Impact Pri Policy Phase Run Time
-------------------------- ------ --- ---------- ----- ----------
TreeDelete[4050] Low 10 LOW 1/1 0:25:01
No paused or waiting jobs.
No failed jobs.
Recent job results:
Time Job Event
--------------- -------------------------- ------------------------------
10/12 04:00:02 ShadowStoreProtect[4049] Succeeded (LOW)
10/12 03:05:16 SnapshotDelete[4048] Succeeded (MEDIUM)
10/12 02:00:17 WormQueue[4047] Succeeded (LOW)
10/12 01:05:31 SnapshotDelete[4046] Succeeded (MEDIUM)
10/12 00:04:57 SnapshotDelete[4045] Succeeded (MEDIUM)
10/11 23:21:32 FSAnalyze[4043] Succeeded (LOW)
10/11 22:37:01 SnapshotDelete[4044] Succeeded (MEDIUM)
10/11 20:00:25 ShadowStoreProtect[4042] Succeeded (LOW)
11/15 14:53:34 MultiScan[1254] MultiScan[1254] Failed
10/06 14:45:55 ChangelistCreate[975] ChangelistCreate[975] Failed
InsightIQ Installation and Config
Install IIQ
- License must be installed on the Isilon cluster.
- Create a CentOS 6.7 (yuck) virtual machine and properly configure the network on it.
- Call this machine zaphod.
- Need the InsightIQ shell script from EMC support.
- Will use a local (to the VM) data store for IIQ.
- The install script will fail due to a dependency mismatch with openssl, I think.
- Here is a way to force the install.
- Extract the content of the self-packaged script.
- Remove the offending package openssl-1.0.1e-42.el6_7.2.x86_64.rpm from it.
- Manually install the package: yum install openssl-devel.x86_64.
- Run the install script sh ./install_insightiq.sh.
root@zaphod ~$ sh ./install-insightiq-4.0.0.0049.sh --target ./iiq root@zaphod ~$ cd iiq root@zaphod ~$ rm openssl-1.0.1e-42.el6_7.2.x86_64.rpm root@zaphod ~/iiq$ ll *.rpm -rw-r--r-- 1 root root 928548 Jan 12 18:55 bash-4.1.2-29.el6.x86_64.rpm -rw-r--r-- 1 root root 367680 Jan 12 18:55 freetype-2.3.11-14.el6_3.1.x86_64.rpm -rw-r--r-- 1 root root 3993500 Jan 12 18:55 glibc-2.12-1.149.el6_6.7.x86_64.rpm -rw-r--r-- 1 root root 14884088 Jan 12 18:56 glibc-common-2.12-1.149.el6_6.7.x86_64.rpm -rw-r--r-- 1 root root 25899811 Jan 12 18:55 isilon-insightiq-4.0.0.0049-1.x86_64.rpm -rw-r--r-- 1 root root 139192 Jan 12 18:56 libXfont-1.4.5-3.el6_5.x86_64.rpm -rw-r--r-- 1 root root 25012 Jan 12 18:56 libfontenc-1.0.5-2.el6.x86_64.rpm -rw-r--r-- 1 root root 178512 Jan 12 18:56 libjpeg-turbo-1.2.1-3.el6_5.x86_64.rpm -rw-r--r-- 1 root root 186036 Jan 12 18:56 libpng-1.2.49-1.el6_2.x86_64.rpm -rw-r--r-- 1 root root 280524 Jan 12 18:56 openssh-5.3p1-112.el6_7.x86_64.rpm -rw-r--r-- 1 root root 448872 Jan 12 18:56 openssh-clients-5.3p1-112.el6_7.x86_64.rpm -rw-r--r-- 1 root root 331544 Jan 12 18:56 openssh-server-5.3p1-112.el6_7.x86_64.rpm -rw-r--r-- 1 root root 1225760 Jan 12 18:56 openssl-devel-1.0.1e-42.el6_7.2.x86_64.rpm -rw-r--r-- 1 root root 1033984 Jan 12 18:56 postgresql93-9.3.4-1PGDG.rhel6.x86_64.rpm -rw-r--r-- 1 root root 1544220 Jan 12 18:56 postgresql93-devel-9.3.4-1PGDG.rhel6.x86_64.rpm -rw-r--r-- 1 root root 194856 Jan 12 18:56 postgresql93-libs-9.3.4-1PGDG.rhel6.x86_64.rpm -rw-r--r-- 1 root root 4259740 Jan 12 18:56 postgresql93-server-9.3.4-1PGDG.rhel6.x86_64.rpm -rw-r--r-- 1 root root 43900 Jan 12 18:56 ttmkfdir-3.0.9-32.1.el6.x86_64.rpm -rw-r--r-- 1 root root 453984 Jan 12 18:56 tzdata-2015b-1.el6.noarch.rpm -rw-r--r-- 1 root root 39308573 Jan 12 18:56 wkhtmltox-0.12.2.1_linux-centos6-amd64.rpm -rw-r--r-- 1 root root 76712 Jan 12 18:56 xorg-x11-font-utils-7.2-11.el6.x86_64.rpm -rw-r--r-- 1 root root 2929960 Jan 12 18:56 xorg-x11-fonts-75dpi-7.2-9.1.el6.noarch.rpm -rw-r--r-- 1 root root 532016 Jan 12 18:56 xorg-x11-fonts-Type1-7.2-9.1.el6.noarch.rpm root@zaphod ~/iiq$ yum list openssl\* Installed Packages openssl.x86_64 1.0.1e-42.el6_7.4 @updates/$releasever Available Packages openssl.i686 1.0.1e-42.el6_7.4 updates openssl-devel.i686 1.0.1e-42.el6_7.4 updates openssl-devel.x86_64 1.0.1e-42.el6_7.4 updates openssl-perl.x86_64 1.0.1e-42.el6_7.4 updates openssl-static.x86_64 1.0.1e-42.el6_7.4 updates openssl098e.i686 0.9.8e-20.el6.centos.1 updates openssl098e.x86_64 0.9.8e-20.el6.centos.1 updates root@zaphod ~/iiq$ yum install openssl-devel.x86_64 =============================================================================================================================================================================================== Package Arch Version Repository Size =============================================================================================================================================================================================== Installing: openssl-devel x86_64 1.0.1e-42.el6_7.4 updates 1.2 M Installing for dependencies: keyutils-libs-devel x86_64 1.4-5.el6 base 29 k krb5-devel x86_64 1.10.3-42z1.el6_7 updates 502 k libcom_err-devel x86_64 1.41.12-22.el6 base 33 k libselinux-devel x86_64 2.0.94-5.8.el6 base 137 k libsepol-devel x86_64 2.0.41-4.el6 base 64 k zlib-devel x86_64 1.2.3-29.el6 base 44 k Transaction Summary 
=============================================================================================================================================================================================== Install 7 Package(s) Total download size: 2.0 M Installed size: 4.9 M Installed: openssl-devel.x86_64 0:1.0.1e-42.el6_7.4 Dependency Installed: keyutils-libs-devel.x86_64 0:1.4-5.el6 krb5-devel.x86_64 0:1.10.3-42z1.el6_7 libcom_err-devel.x86_64 0:1.41.12-22.el6 libselinux-devel.x86_64 0:2.0.94-5.8.el6 libsepol-devel.x86_64 0:2.0.41-4.el6 zlib-devel.x86_64 0:1.2.3-29.el6 root@zaphod ~/iiq$ sh ./install_insightiq.sh This script automates the installation or upgrade of InsightIQ. If you are running a version of InsightIQ that can be upgraded by this version, the upgrade will occur automatically. If you are trying to upgrade an unsupported version, the script will exit. If you are installing on a new system, the script will perform a clean install. Are you ready to proceed with the installation? Please enter (Y)es or (N)o followed by [ENTER] >>> y =============================================================================================================================================================================================== Package Arch Version Repository Size =============================================================================================================================================================================================== Installing: freetype x86_64 2.3.11-14.el6_3.1 /freetype-2.3.11-14.el6_3.1.x86_64 816 k isilon-insightiq x86_64 4.0.0.0049-1 /isilon-insightiq-4.0.0.0049-1.x86_64 93 M libXfont x86_64 1.4.5-3.el6_5 /libXfont-1.4.5-3.el6_5.x86_64 295 k libfontenc x86_64 1.0.5-2.el6 /libfontenc-1.0.5-2.el6.x86_64 40 k libjpeg-turbo x86_64 1.2.1-3.el6_5 /libjpeg-turbo-1.2.1-3.el6_5.x86_64 466 k libpng x86_64 2:1.2.49-1.el6_2 /libpng-1.2.49-1.el6_2.x86_64 639 k postgresql93 x86_64 9.3.4-1PGDG.rhel6 /postgresql93-9.3.4-1PGDG.rhel6.x86_64 5.2 M postgresql93-devel x86_64 9.3.4-1PGDG.rhel6 /postgresql93-devel-9.3.4-1PGDG.rhel6.x86_64 6.7 M postgresql93-libs x86_64 9.3.4-1PGDG.rhel6 /postgresql93-libs-9.3.4-1PGDG.rhel6.x86_64 631 k postgresql93-server x86_64 9.3.4-1PGDG.rhel6 /postgresql93-server-9.3.4-1PGDG.rhel6.x86_64 15 M ttmkfdir x86_64 3.0.9-32.1.el6 /ttmkfdir-3.0.9-32.1.el6.x86_64 99 k wkhtmltox x86_64 1:0.12.2.1-1 /wkhtmltox-0.12.2.1_linux-centos6-amd64 109 M xorg-x11-font-utils x86_64 1:7.2-11.el6 /xorg-x11-font-utils-7.2-11.el6.x86_64 294 k xorg-x11-fonts-75dpi noarch 7.2-9.1.el6 /xorg-x11-fonts-75dpi-7.2-9.1.el6.noarch 2.9 M xorg-x11-fonts-Type1 noarch 7.2-9.1.el6 /xorg-x11-fonts-Type1-7.2-9.1.el6.noarch 863 k Installing for dependencies: avahi-libs x86_64 0.6.25-15.el6 base 55 k blas x86_64 3.2.1-4.el6 base 321 k c-ares x86_64 1.10.0-3.el6 base 75 k cups-libs x86_64 1:1.4.2-72.el6 base 321 k cyrus-sasl-gssapi x86_64 2.1.23-15.el6_6.2 base 34 k fontconfig x86_64 2.8.0-5.el6 base 186 k gnutls x86_64 2.8.5-19.el6_7 updates 347 k keyutils x86_64 1.4-5.el6 base 39 k lapack x86_64 3.2.1-4.el6 base 4.3 M libX11 x86_64 1.6.0-6.el6 base 586 k libX11-common noarch 1.6.0-6.el6 base 192 k libXau x86_64 1.0.6-4.el6 base 24 k libXext x86_64 1.3.2-2.1.el6 base 35 k libXrender x86_64 0.9.8-2.1.el6 base 24 k libbasicobjects x86_64 0.1.1-11.el6 base 21 k libcollection x86_64 0.6.2-11.el6 base 36 k libdhash x86_64 0.4.3-11.el6 base 24 k libevent x86_64 1.4.13-4.el6 base 66 k libgfortran x86_64 4.4.7-16.el6 base 267 k libgssglue x86_64 0.1-11.el6 base 23 k libini_config x86_64 
1.1.0-11.el6 base 46 k libipa_hbac x86_64 1.12.4-47.el6_7.8 updates 106 k libldb x86_64 1.1.25-2.el6_7 updates 113 k libnl x86_64 1.1.4-2.el6 base 121 k libpath_utils x86_64 0.2.1-11.el6 base 24 k libref_array x86_64 0.1.4-11.el6 base 23 k libsss_idmap x86_64 1.12.4-47.el6_7.8 updates 110 k libtalloc x86_64 2.1.5-1.el6_7 updates 26 k libtdb x86_64 1.3.8-1.el6_7 updates 43 k libtevent x86_64 0.9.26-2.el6_7 updates 29 k libtiff x86_64 3.9.4-10.el6_5 base 343 k libtirpc x86_64 0.2.1-10.el6 base 79 k libxcb x86_64 1.9.1-3.el6 base 110 k nfs-utils x86_64 1:1.2.3-64.el6 base 331 k nfs-utils-lib x86_64 1.1.5-11.el6 base 68 k pytalloc x86_64 2.1.5-1.el6_7 updates 10 k python-argparse noarch 1.2.1-2.1.el6 base 48 k python-sssdconfig noarch 1.12.4-47.el6_7.8 updates 133 k rpcbind x86_64 0.2.0-11.el6_7 updates 51 k samba4-libs x86_64 4.2.10-6.el6_7 updates 4.4 M sssd x86_64 1.12.4-47.el6_7.8 updates 101 k sssd-ad x86_64 1.12.4-47.el6_7.8 updates 193 k sssd-client x86_64 1.12.4-47.el6_7.8 updates 152 k sssd-common x86_64 1.12.4-47.el6_7.8 updates 978 k sssd-common-pac x86_64 1.12.4-47.el6_7.8 updates 136 k sssd-ipa x86_64 1.12.4-47.el6_7.8 updates 238 k sssd-krb5 x86_64 1.12.4-47.el6_7.8 updates 135 k sssd-krb5-common x86_64 1.12.4-47.el6_7.8 updates 191 k sssd-ldap x86_64 1.12.4-47.el6_7.8 updates 216 k sssd-proxy x86_64 1.12.4-47.el6_7.8 updates 130 k Transaction Summary =============================================================================================================================================================================================== Install 65 Package(s) Total size: 252 M Total download size: 15 M Installed size: 277 M insightiq 0:off 1:off 2:on 3:on 4:on 5:on 6:off chmod: cannot access `sssd': No such file or directory ip6tables: unrecognized service ip6tables: unrecognized service error reading information on service ip6tables: No such file or directory Shutting down interface eth0: [ OK ] Shutting down loopback interface: [ OK ] Bringing up loopback interface: [ OK ] Bringing up interface eth0: Determining if ip address 132.206.178.250 is already in use for device eth0... 
[ OK ] Generating RSA private key, 2048 bit long modulus ..+++ .........................................................................................................................+++ e is 65537 (0x10001) Signature ok subject=/C=US/ST=Washington/L=Seattle/O=EMC Isilon/CN=InsightIQ/emailAddress=support@emc.com Getting Private key Initializing database: [ OK ] Starting iiq_db service: [ OK ] Starting insightiq: [ OK ] Installed: freetype.x86_64 0:2.3.11-14.el6_3.1 isilon-insightiq.x86_64 0:4.0.0.0049-1 libXfont.x86_64 0:1.4.5-3.el6_5 libfontenc.x86_64 0:1.0.5-2.el6 libjpeg-turbo.x86_64 0:1.2.1-3.el6_5 libpng.x86_64 2:1.2.49-1.el6_2 postgresql93.x86_64 0:9.3.4-1PGDG.rhel6 postgresql93-devel.x86_64 0:9.3.4-1PGDG.rhel6 postgresql93-libs.x86_64 0:9.3.4-1PGDG.rhel6 postgresql93-server.x86_64 0:9.3.4-1PGDG.rhel6 ttmkfdir.x86_64 0:3.0.9-32.1.el6 wkhtmltox.x86_64 1:0.12.2.1-1 xorg-x11-font-utils.x86_64 1:7.2-11.el6 xorg-x11-fonts-75dpi.noarch 0:7.2-9.1.el6 xorg-x11-fonts-Type1.noarch 0:7.2-9.1.el6 Dependency Installed: avahi-libs.x86_64 0:0.6.25-15.el6 blas.x86_64 0:3.2.1-4.el6 c-ares.x86_64 0:1.10.0-3.el6 cups-libs.x86_64 1:1.4.2-72.el6 cyrus-sasl-gssapi.x86_64 0:2.1.23-15.el6_6.2 fontconfig.x86_64 0:2.8.0-5.el6 gnutls.x86_64 0:2.8.5-19.el6_7 keyutils.x86_64 0:1.4-5.el6 lapack.x86_64 0:3.2.1-4.el6 libX11.x86_64 0:1.6.0-6.el6 libX11-common.noarch 0:1.6.0-6.el6 libXau.x86_64 0:1.0.6-4.el6 libXext.x86_64 0:1.3.2-2.1.el6 libXrender.x86_64 0:0.9.8-2.1.el6 libbasicobjects.x86_64 0:0.1.1-11.el6 libcollection.x86_64 0:0.6.2-11.el6 libdhash.x86_64 0:0.4.3-11.el6 libevent.x86_64 0:1.4.13-4.el6 libgfortran.x86_64 0:4.4.7-16.el6 libgssglue.x86_64 0:0.1-11.el6 libini_config.x86_64 0:1.1.0-11.el6 libipa_hbac.x86_64 0:1.12.4-47.el6_7.8 libldb.x86_64 0:1.1.25-2.el6_7 libnl.x86_64 0:1.1.4-2.el6 libpath_utils.x86_64 0:0.2.1-11.el6 libref_array.x86_64 0:0.1.4-11.el6 libsss_idmap.x86_64 0:1.12.4-47.el6_7.8 libtalloc.x86_64 0:2.1.5-1.el6_7 libtdb.x86_64 0:1.3.8-1.el6_7 libtevent.x86_64 0:0.9.26-2.el6_7 libtiff.x86_64 0:3.9.4-10.el6_5 libtirpc.x86_64 0:0.2.1-10.el6 libxcb.x86_64 0:1.9.1-3.el6 nfs-utils.x86_64 1:1.2.3-64.el6 nfs-utils-lib.x86_64 0:1.1.5-11.el6 pytalloc.x86_64 0:2.1.5-1.el6_7 python-argparse.noarch 0:1.2.1-2.1.el6 python-sssdconfig.noarch 0:1.12.4-47.el6_7.8 rpcbind.x86_64 0:0.2.0-11.el6_7 samba4-libs.x86_64 0:4.2.10-6.el6_7 sssd.x86_64 0:1.12.4-47.el6_7.8 sssd-ad.x86_64 0:1.12.4-47.el6_7.8 sssd-client.x86_64 0:1.12.4-47.el6_7.8 sssd-common.x86_64 0:1.12.4-47.el6_7.8 sssd-common-pac.x86_64 0:1.12.4-47.el6_7.8 sssd-ipa.x86_64 0:1.12.4-47.el6_7.8 sssd-krb5.x86_64 0:1.12.4-47.el6_7.8 sssd-krb5-common.x86_64 0:1.12.4-47.el6_7.8 sssd-ldap.x86_64 0:1.12.4-47.el6_7.8 sssd-proxy.x86_64 0:1.12.4-47.el6_7.8
Configure IIQ and X509 Certificates for Web Access.
- Follow the install manual, with the caveat that it contains some errors/omissions, noted below.
- Create a user called iiq on the IIQ server (zaphod).
- On the Isilon cluster, activate the user insightiq on the Auth File provider, System zone.
BIC-Isilon-Cluster-4# isi auth users view insightiq
Name: insightiq
DN: -
DNS Domain: -
Domain: UNIX_USERS
Provider: lsa-file-provider:System
Sam Account Name: insightiq
UID: 15
SID: S-1-22-1-15
Enabled: Yes
Expired: No
Expiry: -
Locked: No
Email: -
GECOS: InsightIQ User
Generated GID: No
Generated UID: No
Generated UPN: Yes
Primary Group
ID: GID:15
Name: insightiq
Home Directory: /ifs/home/insightiq
Max Password Age: -
Password Expired: No
Password Expiry: -
Password Last Set: -
Password Expires: Yes
Shell: /sbin/nologin
UPN: insightiq@UNIX_USERS
User Can Change Password: No
- Install the BIC wildcard Comodo X509 certificate, the server private key and the Comodo CA bundle into /home/iiq/certificates/STAR_bic_mni_mcgill_ca.pem.
root@zaphod ~$ cat STAR_bic_mni_mcgill_ca.crt \
STAR_bic_mni_mcgill_ca.key \
COMODO_CA_bundle.crt >> /home/iiq/certificates/STAR_bic_mni_mcgill_ca.pem
- Protect that file since it contains the secret server key.
root@zaphod ~$ chmod 400 /home/iiq/certificates/STAR_bic_mni_mcgill_ca.pem
- Modify /etc/isilon/insightiq.ini for the server cert location: ssl_pem = /home/iiq/certificates/STAR_bic_mni_mcgill_ca.pem.
- Restart the IIQ stuff with /etc/init.d/insightiq restart.
- The installation guide uses the command iiq_restart, which is an alias defined in /etc/profile.d/insightiq.sh.
- Check /var/log/insightiq_stdio.log to see if the certificate is OK (see also the verification sketch after this list).
- Ports 80 and 443 must not be blocked by a firewall; access restrictions should nevertheless be enabled.
- Connect to the web interface using the credentials for the user iiq.
- Go to “Settings” and add a cluster to monitor, using the SmartConnect IP address sip.bic.mni.mcgill.ca.
- Use the cluster’s local user insightiq and its credentials to connect to the cluster.
- Bingo.
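- A minimal way to double-check the certificate actually served by IIQ (hypothetical commands, run on the IIQ server itself):
root@zaphod ~$ grep ssl_pem /etc/isilon/insightiq.ini
ssl_pem = /home/iiq/certificates/STAR_bic_mni_mcgill_ca.pem
root@zaphod ~$ echo | openssl s_client -connect localhost:443 2>/dev/null | openssl x509 -noout -subject -issuer -dates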
NFS Benchmarks Using FIO
- The following is shamelessly copied/stolen (with a few modifications to suit our local environment) from this EMC blog entry:
- The benchmark as described in the above URL bypasses the mechanisms provided by the Isilon SmartConnect Advanced product:
- Connections to the cluster are made directly to an IP of a particular node in the Isilon cluster.
- This is done in order not to introduce any bias from the load-balancing (round-robin, CPU or network) done by SmartConnect.
- Strategy:
- Latencies and IOPS (I/O operations per second) are the most meaningful metrics when assessing random IO performance; bandwidth is of secondary value for this I/O access pattern.
- For sequential access to storage, I/O performance is best assessed by measuring the client-to-server bandwidth.
- The client buffers and caching mechanisms must be examined and dealt with carefully.
- We are not interested in benchmarking the clients' efficient use of local caches!
- A bird's-eye view of the network layout and the working files organization, as explained in the URL above:
/ifs/data/fiotest => /mnt/isilon/fiotest/
..
/fiojob_8k_50G_4jobs_randrw
/fioresult_8k_50G_4jobs_randrw_172.16.20.42.log
/172.16.20.102/
/172.16.20.203/
/172.16.20.204/
/172.16.20.42/
/control/.. /cleanup_remount_1to1.sh
/nfs_copy_trusted.sh
+-----------------+ /run_nfs_fio_8k_50G_4jobs_randrw.sh
| node02 | /nfs_hosts.list
| 172.16.20.202 | /trusted.key
+-----------------+ /trusted.key.pub
2x 1GiG
+-----------------+ +------------------+
| node03 |.........| LNN2 |
| 172.16.20.203 |.........| 172.16.20.236 |
+-----------------+ +------------------+
+-----------------+ +------------------+
| node04 |.........| LNN4 |
| 172.16.20.204 |.........| 172.16.20.234 |
+-----------------+ +------------------+
+-----------------+ +------------------+
| thaisa |.........| LNN5 |
| 172.16.20.42 |.........| 172.16.20.235 |
+-----------------+ +------------------+
+-----------------+ +------------------+
| widow |.........| LNN1 |
| 172.16.20.102 |.........| 172.16.20.237 |
+-----------------+ +------------------+
+------------------+
| LNN3 |
| 172.16.20.233 |
+------------------+
If this diagram is enough for you and you are not interested in the details of the networking setup, skip to the FIO Configuration and Benchmarking section, or simply jump to the FIO NFS Statistics Reports section for the actual benchmark results.
Nodes Configuration
- Use the nodes node02, node03, node04, thaisa and widow.
- Note: somehow I can't make the 6th node vaux mount the Isilon exports, so I didn't use it out of frustration.
- Here are the relevant nodes' network configurations and settings:
- node02, node03 and node04 have the same network layout: eth0 in 192.168.86.0/24; eth1 and eth2 bonded to bond0 in the data network 172.16.20.0/24; bond0:0 IP alias in the management network 172.16.10.0/24.
- All data links from the nodes to the Isilon cluster network front-end are dual 1GiG bonds.
node02: eth0 inet addr:192.168.86.202 Bcast:192.168.86.255 Mask:255.255.255.0 bond0 inet addr:172.16.20.202 Bcast:172.16.20.255 Mask:255.255.255.0 bond0:0 inet addr:172.16.10.202 Bcast:172.16.10.255 Mask:255.255.255.0 ~# route -n Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 0.0.0.0 192.168.86.1 0.0.0.0 UG 0 0 0 eth0 172.16.10.0 0.0.0.0 255.255.255.0 U 0 0 0 bond0 172.16.20.0 0.0.0.0 255.255.255.0 U 0 0 0 bond0 192.168.86.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0 node03: eth0 inet addr:192.168.86.203 Bcast:192.168.86.255 Mask:255.255.255.0 bond0 inet addr:172.16.20.203 Bcast:172.16.20.255 Mask:255.255.255.0 bond0:0 inet addr:172.16.10.203 Bcast:172.16.10.255 Mask:255.255.255.0 ~# route -n Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 0.0.0.0 192.168.86.1 0.0.0.0 UG 0 0 0 eth0 172.16.10.0 0.0.0.0 255.255.255.0 U 0 0 0 bond0 172.16.20.0 0.0.0.0 255.255.255.0 U 0 0 0 bond0 192.168.86.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0 node04: eth0 inet addr:192.168.86.204 Bcast:192.168.86.255 Mask:255.255.255.0 bond0 inet addr:172.16.20.204 Bcast:172.16.20.255 Mask:255.255.255.0 bond0:0 inet addr:172.16.10.204 Bcast:172.16.10.255 Mask:255.255.255.0 ~# route -n Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 0.0.0.0 192.168.86.1 0.0.0.0 UG 0 0 0 eth0 172.16.10.0 0.0.0.0 255.255.255.0 U 0 0 0 bond0 172.16.20.0 0.0.0.0 255.255.255.0 U 0 0 0 bond0 192.168.86.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
- thaisa and widow are, in real life, Xen Dom0 hosts.
- They provide virtual hosts when running a Xen-ified kernel.
- For the purpose of this test, both have been rebooted without a Xen kernel.
- They use a (virtual) bridged network interface xenbr0 connected to a bonded network interface bond0 that acts as the external physical network interface.
thaisa: ~# brctl show bridge name bridge id STP enabled interfaces xenbr0 8000.00e081c19a1a no bond0 xenbr0 inet addr:132.206.178.42 Bcast:132.206.178.255 Mask:255.255.255.0 xenbr0:0 inet addr:172.16.20.42 Bcast:172.16.20.255 Mask:255.255.255.0 route -n Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 0.0.0.0 132.206.178.1 0.0.0.0 UG 0 0 0 xenbr0 132.206.178.0 0.0.0.0 255.255.255.0 U 0 0 0 xenbr0 172.16.20.0 0.0.0.0 255.255.255.0 U 0 0 0 xenbr0 192.168.86.0 0.0.0.0 255.255.255.0 U 0 0 0 xenbr0 widow: ~# brctl show bridge name bridge id STP enabled interfaces xenbr0 8000.00e081c19a9a no bond0 xenbr0 inet addr:132.206.178.102 Bcast:132.206.178.255 Mask:255.255.255.0 xenbr0:0 inet addr:172.16.20.102 Bcast:172.16.20.255 Mask:255.255.255.0 route -n Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 0.0.0.0 132.206.178.1 0.0.0.0 UG 0 0 0 xenbr0 132.206.178.0 0.0.0.0 255.255.255.0 U 0 0 0 xenbr0 172.16.20.0 0.0.0.0 255.255.255.0 U 0 0 0 xenbr0 192.168.86.0 0.0.0.0 255.255.255.0 U 0 0 0 xenbr0
- Create a NIS netgroup fiotest:
# Temp netgroup for fio test: node02,node03,node04,thaisa,widow. # nodes access Isilon from the 172.16.20.0/24 network. # thaisa, vaux and widow have bonded NIC aliases in 172.16.20.0/24 and 132.206.178.0/24 networks. # FOLLOWING LINE IS A ONE LINER. WATCH OUT FOR END_OF_LINE CHARACTERS! # DO NOT COPY-AND-PASTE!! fiotest (172.16.20.202,,) (172.16.20.203,,) (172.16.20.204,,) \ (172.16.20.42,,) (132.206.178.42,,) (thaisa.bic.mni.mcgill.ca,,) \ (172.16.20.102,,) (132.206.178.102,,) (widow.bic.mni.mcgill.ca,,)
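- Once the NIS maps are pushed, a quick sanity check of the netgroup from any client (hypothetical commands):
~# ypmatch fiotest netgroup
~# getent netgroup fiotest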
- node02 is the control host (called “harness” in the blog link above).
- node02 should have password-less root access to the Isilon cluster and to the hosts in fiotest:
- Create a ssh key and distribute the public key to the fiotest hosts in ~root/.ssh/authorized_keys (a minimal sketch is shown below).
- Distribute the pub key to all the nodes of the cluster, using isi_for_array.
- Create a ssh config file that redirects the ssh host keys to /dev/null.
- The options CheckHostIP and StrictHostKeyChecking are there to remove any spurious warning messages from the output stream of the FIO logfiles.
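- A minimal sketch of the key generation and distribution (hypothetical commands, assuming the fiotest export is already mounted on node02; the nfs_copy_trusted.sh script shown in the diagram below does essentially this):
node02:~# ssh-keygen -t rsa -b 2048 -N "" -f /mnt/isilon/fiotest/control/trusted.key
node02:~# for h in 172.16.20.203 172.16.20.204 172.16.20.42 172.16.20.102; do
>   ssh-copy-id -i /mnt/isilon/fiotest/control/trusted.key.pub root@${h}
> done
node02:~# ssh root@mgmt.isi.bic.mni.mcgill.ca \
>   "isi_for_array 'cat /ifs/data/fiotest/control/trusted.key.pub >> /root/.ssh/authorized_keys'"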
node02:~# cat .ssh/config
Host mgmt.isi.bic.mni.mcgill.ca
ForwardX11 no
ForwardAgent no
User root
CheckHostIP no
StrictHostKeyChecking no
UserKnownHostsFile /dev/null
- Verify that you can ssh from node02 to all the benchmarking nodes and cluster nodes, and also to mgmt.isi.bic.mni.mcgill.ca, without being asked for a password (for example with the loop below).
- Continue if and only if you have this working without any problem.
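- A quick check loop (hypothetical):
node02:~# for h in 172.16.20.203 172.16.20.204 172.16.20.42 172.16.20.102 mgmt.isi.bic.mni.mcgill.ca; do
>   ssh -i /mnt/isilon/fiotest/control/trusted.key -o BatchMode=yes root@${h} hostname || echo "FAILED: ${h}"
> done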
- Create an NFSv4 export on the Isilon cluster with the following properties:
isi nfs exports create /ifs/data/fiotest --zone prod --root-clients fiotest --clients fiotest
isi nfs aliases create /fiotest /ifs/data/fiotest --zone prod
isi quota quotas create /ifs/data/fiotest directory --zone prod --hard-threshold 2T --container yes
chmod 777 /ifs/data/fiotest
ls -ld /ifs/data/fiotest
drwxrwxrwx 9 root wheel 4711 Sep 27 11:11 /ifs/data/fiotest
- no_root_squash for hosts in
fiotest: all nodes must be able to write as root in/ifs/data/fiotest - read/write by everyone for the top dir
/ifs/data/fiotest - On all the nodes:
- Create the local mount point
/mnt/isilon/fiotest. - Verify that each node can mount the Isilon export on the local mount point
/mnt/isilon/fiotest - Continue if and only if you have this working without any problem.
- Create the local mount point
- Create a file
/mnt/isilon/fiotest/control/nfs_hosts.listwith the IPs of the nodes and cluster node IPS. - Separator is the pipe character
|. - No comments, no white space, no trailing end-of-line white space!
~# cat /mnt/isilon/fiotest/control/nfs_hosts.list 172.16.20.203|172.16.20.236 172.16.20.204|172.16.20.234 172.16.20.42|172.16.20.235 172.16.20.102|172.16.20.237
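- A quick way to spot stray whitespace or CR characters in that file (each line must end with a bare $ in the cat -A output):
~# cat -A /mnt/isilon/fiotest/control/nfs_hosts.list
172.16.20.203|172.16.20.236$
172.16.20.204|172.16.20.234$
172.16.20.42|172.16.20.235$
172.16.20.102|172.16.20.237$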
- Verify that the export /ifs/data/fiotest on the Isilon cluster can be mounted on all nodes (a manual check is sketched below).
- Only when the configuration above is done and working correctly can we start working with FIO.
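- A manual check from any of the nodes, before letting the scripts do it (hypothetical commands, using LNN2's IP and the /fiotest alias):
~# mkdir -p /mnt/isilon/fiotest
~# mount -t nfs -o vers=4 172.16.20.236:/fiotest /mnt/isilon/fiotest
~# touch /mnt/isilon/fiotest/$(hostname).write_test && rm /mnt/isilon/fiotest/$(hostname).write_test
~# umount /mnt/isilon/fiotest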
Not interested in the FIO configuration? Jump to the FIO NFS Statistics Reports section for the actual results of the benchmarks.
FIO Configuration and Benchmarking
- The FIO benchmarking is done using the following logic:
- On the master node node02, for each benchmark run with a specific FIO configuration:
- Run the cleaning script /mnt/isilon/fiotest/control/cleanup_remount_1to1.sh, which, for each benchmarking node in /mnt/isilon/fiotest/control/nfs_hosts.list:
- Removes any FIO working files located in /mnt/isilon/fiotest/xxx.xxx.xxx.xxx used in the read-write benchmarkings.
- Unmounts and remounts the Isilon export on the benchmarking node.
- Run the FIO script /mnt/isilon/fiotest/control/run_nfs_fio_8k_50G_4jobs_randrw.sh, which:
- Flushes the L1 and L2 caches on all the Isilon cluster nodes.
- For each node in /mnt/isilon/fiotest/control/nfs_hosts.list:
- Connects to the benchmarking node with ssh.
- Syncs all local filesystem cache buffers to disk and flushes all I/O caches.
- Runs the FIO command:
- The FIO job file is /mnt/isilon/fiotest/fiojob_8k_50G_4jobs_randrw.
- The FIO working dir is /mnt/isilon/fiotest/xxx.xxx.xxx.xxx, where xxx.xxx.xxx.xxx is the IP of the benchmarking node.
- The FIO output is sent to /mnt/isilon/fiotest/fioresult_8k_50G_4jobs_randrw_xxx.xxx.xxx.xxx.log.
- Once more, an ASCII diagram explaining the files layout:
/ifs/data/fiotest => /mnt/isilon/fiotest/
..
/fiojob_8k_50G_4jobs_randrw <-- FIO jobfile
/fioresult_8k_50G_4jobs_randrw_172.16.20.42.log <-- FIO output logfile
/172.16.20.102/ <-- FIO working directory
/172.16.20.203/ <-- " "
/172.16.20.204/ <-- " "
/172.16.20.42/ <-- " "
/control/.. /cleanup_remount_1to1.sh <-- cleanup and remount script
/nfs_copy_trusted.sh <-- key distributor script
+-----------------+ /run_nfs_fio_8k_50G_4jobs_randrw.sh <-- FIO start script
| node02 | /nfs_hosts.list <-- nodes list
| 172.16.20.202 | /trusted.key <-- ssh private
+-----------------+ /trusted.key.pub <-- and public keys
2x 1GiG
+-----------------+ +------------------+
| node03 |.........| LNN2 |
| 172.16.20.203 |.........| 172.16.20.236 |
+-----------------+ +------------------+
+-----------------+ +------------------+
| node04 |.........| LNN4 |
| 172.16.20.204 |.........| 172.16.20.234 |
+-----------------+ +------------------+
+-----------------+ +------------------+
| thaisa |.........| LNN5 |
| 172.16.20.42 |.........| 172.16.20.235 |
+-----------------+ +------------------+
+-----------------+ +------------------+
| widow |.........| LNN1 |
| 172.16.20.102 |.........| 172.16.20.237 |
+-----------------+ +------------------+
+------------------+
| LNN3 |
| 172.16.20.233 |
+------------------+
The cleanup script file /mnt/isilon/fiotest/control/cleanup_remount_1to1.sh
#!/bin/bash
#first go through all lines in hosts.list
for i in $(cat /mnt/isilon/fiotest/control/nfs_hosts.list) ; do
# then split each line read into an array on the pipe symbol
IFS='|' read -a pairs <<< "${i}"
# show back the mapping
echo "Client host: ${pairs[0]} Isilon node: ${pairs[1]}"
# connect over ssh with the key and mount hosts, create directories etc. - has to be single line
ssh -i /mnt/isilon/fiotest/control/trusted.key ${pairs[0]} -fqno StrictHostKeyChecking=no \
"[ -d /mnt/isilon/fiotest/${pairs[0]} ] && rm -rf /mnt/isilon/fiotest/${pairs[0]}; sleep 1; \
umount -fl /mnt/isilon/fiotest; sleep 1; \
mount -t nfs -o vers=4 ${pairs[1]}:/fiotest /mnt/isilon/fiotest; sleep 1; \
[ ! -d /mnt/isilon/fiotest/${pairs[0]} ] && mkdir /mnt/isilon/fiotest/${pairs[0]}"
# erase the array pair
unset pairs
# go for the next line in nfs_hosts.list;
done
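A typical invocation from node02; the script prints the client/Isilon node mapping as it goes:
node02:~# bash /mnt/isilon/fiotest/control/cleanup_remount_1to1.sh
Client host: 172.16.20.203 Isilon node: 172.16.20.236
Client host: 172.16.20.204 Isilon node: 172.16.20.234
Client host: 172.16.20.42 Isilon node: 172.16.20.235
Client host: 172.16.20.102 Isilon node: 172.16.20.237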
The FIO script file /mnt/isilon/fiotest/control/run_nfs_fio_8k_50G_4jobs_randrw.sh
#!/bin/bash
# First, connect to the first isilon node, and flush cache on array
# This might take minutes to complete.
echo -n "Purging L1 and L2 cache first...";
ssh -i /mnt/isilon/fiotest/control/trusted.key mgmt.isi.bic.mni.mcgill.ca -fqno StrictHostKeyChecking=no "isi_for_array isi_flush"
#ssh -i /mnt/isilon/fiotest/control/trusted.key mgmt.isi.bic.mni.mcgill.ca -fqno StrictHostKeyChecking=no "isi_for_array w"
# wait for cache flushing to finish, normally around 10 seconds is enough
# on larger clusters, sometimes up to few minutes should be used!
echo "...sleeping for 30secs"
sleep 30
# The L3 cache purge is not recommended, since it drops all the metadata accelerated by the SSDs. But, maybe...
#echo "On OneFS 7.1.1 clusters and newer, running L3, purging L3 cache";
#ssh -i /mnt/isilon/fiotest/control/trusted.key 10.63.208.64 -fqno StrictHostKeyChecking=no "isi_for_array isi_flush --l3-full";
#sleep 10;
# The rest is similar to the other scripts
# First go through all lines in nfs_hosts.list
for i in $(cat /mnt/isilon/fiotest/control/nfs_hosts.list) ; do
# then split each line read into an array on the pipe symbol
IFS='|' read -a pairs <<< "${i}"
# Connect over ssh with the key and mount hosts, create directories etc. - has to be single line
# "sync && echo 3 > /proc/sys/vm/drop_caches" purges all buffers to disk
# The fio jobfile is one level above from control directory
ssh -i /mnt/isilon/fiotest/control/trusted.key ${pairs[0]} -fqno StrictHostKeyChecking=no \
"sync && echo 3 > /proc/sys/vm/drop_caches; FILENAME=\"/mnt/isilon/fiotest/${pairs[0]}\" \
fio --output=/mnt/isilon/fiotest/fioresult_8k_50G_4jobs_randrw_${pairs[0]}.log \
/mnt/isilon/fiotest/fiojob_8k_50G_4jobs_randrw"
done
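To sweep the whole range of block sizes, a wrapper along these lines could be used (a sketch only, not one of the original blog scripts; it assumes one jobfile and one run script exist per block size with the naming used above):
#!/bin/bash
# Run cleanup + FIO for each block size.
# run_nfs_fio_* backgrounds its ssh sessions (-f), so wait for the remote
# fio processes to exit before starting the next block size.
KEY=/mnt/isilon/fiotest/control/trusted.key
HOSTS="172.16.20.203 172.16.20.204 172.16.20.42 172.16.20.102"
for bs in 4k 8k 16k 32k 64k 128k 256k 512k 1024k; do
    bash /mnt/isilon/fiotest/control/cleanup_remount_1to1.sh
    bash /mnt/isilon/fiotest/control/run_nfs_fio_${bs}_50G_4jobs_randrw.sh
    sleep 60
    for h in ${HOSTS}; do
        # poll until no fio process is left on this client
        while ssh -i ${KEY} -o StrictHostKeyChecking=no root@${h} pgrep -x fio >/dev/null 2>&1; do
            sleep 300
        done
    done
done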
The FIO jobfile /mnt/isilon/fiotest/fiojob_8k_50G_4jobs_randrw.
- The most important parameters:
- directory=${FILENAME} sets the working directory to the variable ${FILENAME}, set in the FIO calling script.
- rw=randrw specifies a mixed random read and write I/O pattern.
- size=50G sets the total transferred I/O size to 50GB.
- bs= sets the block size for I/O units. Default: 4k.
- direct=0 makes use of buffered I/O.
- ioengine=sync uses a synchronous ioengine (simple read, write and fseek system calls).
- iodepth=1: I/O depth set to 1 (number of I/O units to keep in flight against the working file).
- numjobs=4 creates 4 clones (processes/threads performing the same workload) of this job.
- group_reporting aggregates per-job stats into one per-group report when numjobs is specified.
- runtime=10800 restricts the run time to 10800 seconds (3 hours). This might limit the total transferred to less than the value specified by size=.
; --start job file --
[global]
description=-------------THIS IS A JOB DOING ${FILENAME} ---------
directory=${FILENAME}
rw=randrw
size=50G
bs=8k
zero_buffers
direct=0
sync=0
refill_buffers
ioengine=sync
iodepth=1
numjobs=4
group_reporting
runtime=10800
[8k_randread]
; -- end job file --
A typical output log file from FIO is like the following:
- 1024k I/O block size, random read-write I/O pattern, 4 threads, 50GB data transferred per thread, 200GB total.
- 4 of these are submitted at the same time on 4 different nodes for a total number of 16 threads and a total transferred size of 800GB (runtime might limit this)
1024k_randrw: (g=0): rw=randrw, bs=1M-1M/1M-1M/1M-1M, ioengine=sync, iodepth=1
...
fio-2.1.11
Starting 4 processes
1024k_seqrw: Laying out IO file(s) (1 file(s) / 51200MB)
1024k_seqrw: Laying out IO file(s) (1 file(s) / 51200MB)
1024k_seqrw: Laying out IO file(s) (1 file(s) / 51200MB)
1024k_seqrw: Laying out IO file(s) (1 file(s) / 51200MB)
1024k_seqrw: (groupid=0, jobs=4): err= 0: pid=27587: Mon Oct 3 11:03:24 2016
Description : [-------------THIS IS A JOB DOING /mnt/isilon/fiotest/172.16.20.203 ---------]
read : io=102504MB, bw=36773KB/s, iops=35, runt=2854400msec
clat (msec): min=11, max=29666, avg=106.96, stdev=433.67
lat (msec): min=11, max=29666, avg=106.96, stdev=433.67
clat percentiles (msec):
| 1.00th=[ 23], 5.00th=[ 27], 10.00th=[ 29], 20.00th=[ 34],
| 30.00th=[ 36], 40.00th=[ 42], 50.00th=[ 54], 60.00th=[ 83],
| 70.00th=[ 110], 80.00th=[ 143], 90.00th=[ 198], 95.00th=[ 255],
| 99.00th=[ 408], 99.50th=[ 510], 99.90th=[ 6390], 99.95th=[10290],
| 99.99th=[16712]
bw (KB /s): min= 34, max=37415, per=31.68%, avg=11651.07, stdev=8193.03
write: io=102296MB, bw=36698KB/s, iops=35, runt=2854400msec
clat (usec): min=399, max=57018, avg=450.81, stdev=477.23
lat (usec): min=399, max=57018, avg=451.15, stdev=477.23
clat percentiles (usec):
| 1.00th=[ 410], 5.00th=[ 418], 10.00th=[ 422], 20.00th=[ 426],
| 30.00th=[ 430], 40.00th=[ 434], 50.00th=[ 438], 60.00th=[ 442],
| 70.00th=[ 450], 80.00th=[ 458], 90.00th=[ 470], 95.00th=[ 490],
| 99.00th=[ 556], 99.50th=[ 580], 99.90th=[ 684], 99.95th=[ 956],
| 99.99th=[27520]
bw (KB /s): min= 34, max=80788, per=34.53%, avg=12670.23, stdev=10361.46
lat (usec) : 500=48.01%, 750=1.90%, 1000=0.01%
lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.10%, 50=23.92%
lat (msec) : 100=9.51%, 250=13.87%, 500=2.40%, 750=0.14%, 1000=0.01%
lat (msec) : 2000=0.01%, >=2000=0.10%
cpu : usr=0.07%, sys=1.03%, ctx=2703279, majf=0, minf=118
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=102504/w=102296/d=0, short=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: io=102504MB, aggrb=36772KB/s, minb=36772KB/s, maxb=36772KB/s, mint=2854400msec, maxt=2854400msec
WRITE: io=102296MB, aggrb=36698KB/s, minb=36698KB/s, maxb=36698KB/s, mint=2854400msec, maxt=2854400msec
- An explanation of the output stats in a FIO logfile (from the manpage):
io Number of megabytes of I/O performed.
bw Average data rate (bandwidth).
runt Threads run time.
slat Submission latency minimum, maximum, average and standard deviation.
This is the time it took to submit the I/O.
clat Completion latency minimum, maximum, average and standard deviation.
This is the time between submission and completion.
bw Bandwidth minimum, maximum, percentage of aggregate bandwidth received, average and standard deviation.
cpu CPU usage statistics. Includes user and system time, number of context switches this
thread went through and number of major and minor page faults.
IO depths
Distribution of I/O depths.
Each depth includes everything less than (or equal) to it,
but greater than the previous depth.
IO issued
Number of read/write requests issued, and number of short read/write requests.
IO latencies
Distribution of I/O completion latencies.
The numbers follow the same pattern as IO depths.
The group statistics show:
io Number of megabytes I/O performed.
aggrb Aggregate bandwidth of threads in the group.
minb Minimum average bandwidth a thread saw.
maxb Maximum average bandwidth a thread saw.
mint Shortest runtime of threads in the group.
maxt Longest runtime of threads in the group.
Finally, disk statistics are printed with reads first:
ios Number of I/Os performed by all groups.
merge Number of merges in the I/O scheduler.
ticks Number of ticks we kept the disk busy.
io_queue
Total time spent in the disk queue.
util Disk utilization.
FIO NFS Statistics Reports
- FIO outputs stats galore!
- Some stats, like the disk statistics, are not relevant for NFS benchmarking.
- Both synchronous (sync) and asynchronous (libaio) IO, buffered and un-buffered, should be benchmarked.
- Synchronous IO (sync) is usually done for regular applications.
- Synchronous here just refers to the system call interface, i.e. when the system call returns to the application.
- It does not imply synchronous I/O (aka O_SYNC), which is way slower and enabled by sync=1.
- Thus it does not guarantee that the I/O has been physically written to the underlying device.
- For reads, the IO has been done by the device. For writes, it could just be sitting in the page cache for later writeback.
- For reads, the IO always happens in the context of the process.
- For buffered writes, it usually does not. The process merely dirties the page, kernel threads will most often do the actual writeback of the data.
- direct=1 will circumvent the page cache.
- direct=1 will make the writes sync as well.
- So instead of just returning when it’s in page cache, when a sync write with direct=1 returns, the data has been received and acknowledged by the backing device.
- aio assumes the identity of the process. aio is usually mostly used by databases.
- Question:
- What is the difference between the following two, other than that the second one seems to be more popular in fio example job files?
- 1) ioengine=sync + direct=1
- 2) ioengine=libaio + direct=1
- Current answer: with libaio, fio can issue further I/Os while the Linux kernel handles the outstanding ones.
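- A minimal command-line illustration of the two variants (hypothetical invocations with a small size so they finish quickly; the jobfile form above is equivalent):
fio --name=sync_direct --directory=/mnt/isilon/fiotest/172.16.20.203 --rw=randrw --bs=8k --size=1G \
    --ioengine=sync --direct=1 --iodepth=1 --numjobs=4 --group_reporting
fio --name=libaio_direct --directory=/mnt/isilon/fiotest/172.16.20.203 --rw=randrw --bs=8k --size=1G \
    --ioengine=libaio --direct=1 --iodepth=16 --numjobs=4 --group_reporting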
- Perform FIO random IO (mixed read-write) and sequential IO (mixed read-write).
- Block size ranges from 4k to 1024k in multiplicative steps of 2.
- Working file size is set to 50G (runtime=10800, i.e. 3 hours, might limit the total transferred size).
- Basic synchronous read and write is used for the ioengine (ioengine=sync).
- A second round of async benchmarks should be attempted with ioengine=libaio (Linux native asynchronous I/O) along with direct=1 and a range of iodepth values.
- Buffered IO is set (direct=0).
- Un-buffered IO will almost certainly worsen the stats, but that's not Real Life (TM).
- Real performance of the Isilon cluster should be assessed by bypassing the client's local memory caching, i.e. set direct=1 with iodepth=1 and higher values.
- Set iodepth=1, as it doesn't make sense to use any larger value with a synchronous ioengine.
- It is important to realize that the OS/kernel/block IO stack might restrict the iodepth parameter values.
- This is to be checked when one sets ioengine=libaio AND direct=0.
- 4 threads for each FIO job will be launched (numjobs=4).
- Only consider the following stats:
- Random IO: IOPS and latencies (average and 95th percentile values).
- Sequential IO: bandwidth.
- Plot the following stats versus the FIO block size used (4k, 8k, 16k, 32k, 64k, 128k, 256k, 512k, 1024k):
- IOPS for random IO reads and writes.
- Total submission and completion latency (clat) 95th percentile values, i.e. the clat 95.00th=[XXX] entry: 95% of all latencies are under this value.
- Bandwidth for sequential reads and writes.
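- A rough way to pull the headline numbers out of the result logfiles before plotting (a sketch; adjust the patterns to taste):
for f in /mnt/isilon/fiotest/fioresult_*_randrw_*.log; do
    echo "== ${f}"
    # the read/write summary lines carry io=, bw=, iops= and runt=
    grep -E '^[[:space:]]*(read|write) *:' "${f}"
    # 95th percentile completion latencies (one clat percentile block for reads, one for writes)
    grep -E '95\.00th' "${f}"
done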
