To display which HBAs are installed:
#cat /var/adm/messages | grep -i wwn | more
To set the configuration you must carry out the following:
-changes to the /etc/system file
-HBA driver modifications
-Persistent binding (HBA and SD driver config file)
-EMC recommended changes
-Install the Sun StorEdge SAN Foundation package
Changes to /etc/system (the "===" below only separates the description from the setting; do not type it into the file):
SCSI throttle === set sd:sd_max_throttle=20
Enable wide SCSI === set scsi_options=0x7F8
SCSI I/O timeout value === set sd:sd_io_time=0x3c (with PowerPath)
set sd:sd_io_time=0x78 (without PowerPath)
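The hexadecimal values above are easier to sanity-check in decimal. Here is a small Python sketch that converts them; it assumes sd_io_time is expressed in seconds, which is how the Solaris sd driver timeout is generally documented. The dictionary labels are made up for illustration.

```python
# Convert the /etc/system hex values quoted above to decimal.
# The labels are illustrative only; they are not /etc/system syntax.
settings = {
    "sd:sd_io_time (with PowerPath)": 0x3C,
    "sd:sd_io_time (without PowerPath)": 0x78,
    "scsi_options": 0x7F8,
}

for name, value in settings.items():
    print(f"{name}: {hex(value)} = {value}")
```

So the PowerPath setting is a 60 second timeout, and the setting without PowerPath doubles it to 120 seconds.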
Changes to HBA driver (/kernel/drv/lpfc.conf):
no-device-delay=1 (without PP/DMP) 0 (with PP/DMP)
linkdown-tmo=0 (without PP/DMP) 60 (with PP/DMP)
Both the lpfc.conf and sd.conf files need to be updated. General format is
name="sd" parent="lpfc" target="X" lun="Y" hba="lpfcZ"
X is the target number that corresponds to the fcp_bindWWNID lpfcZtX
Y is the LUN number that corresponds to the Symmetrix volume mapping on the Symmetrix port WWN, or the HLU on the CLARiiON
Z is the lpfc driver instance number that corresponds to the fcp_bind_WWID lpfcZtX
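Typing one sd.conf line per LUN by hand gets error-prone quickly. A minimal Python sketch that generates the lines in the general format shown above could look like this. The function names are hypothetical, and the sketch follows the format exactly as written here (real sd.conf entries are normally terminated with a semicolon, so check against your existing file).

```python
def sd_conf_entry(target: int, lun: int, instance: int) -> str:
    """Return one sd.conf line in the general format quoted above.

    target   -> X (target number)
    lun      -> Y (LUN number)
    instance -> Z (lpfc driver instance number)
    """
    return (
        f'name="sd" parent="lpfc" target="{target}" '
        f'lun="{lun}" hba="lpfc{instance}"'
    )


def sd_conf_entries(target: int, luns, instance: int) -> list:
    """Return one line per LUN for a single target on one HBA instance."""
    return [sd_conf_entry(target, lun, instance) for lun in luns]


# Example: target 0 on lpfc0, LUNs 0 through 2.
for line in sd_conf_entries(target=0, luns=range(3), instance=0):
    print(line)
```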
To discover the SAN devices
#disk;devlinks;devalias (solaris 2.6)
#devfsadm (solaris 2.8)
#/usr/sbin/update_drv -f sd (solaris 2.9 >)
To display which HBAs are installed, use the admin tool "Device Manager".
To set the configuration you must carry out the following:
#EMC recommended changes
#Install Emulex elxcfg utility
Arbitrated loop without powerpath/ATF:
InitLinkFlags=0x00000000 (arbitrated loop, auto-link speed)
Arbitrated loop with powerpath/ATF:
InitLinkFlags=0x00000000 (arbitrated loop, auto-link speed)
Fabric without powerpath/ATF:
InitLinkFlags=0x00000002 (fabric, auto-link speed)
Fabric with powerpath/ATF:
InitLinkFlags=0x00000002 (fabric, auto-link speed)
Modifying the EMC environment :
In the shortcut for the elxcfg add the "--emc" option to the target option.
To discover the SAN devices
control panel -> admin tools -> computer management -> select disk management -> (top menu)action -> rescan tools
To display what HBA's are installed.
#/opt/fcms/bin/fcmsutil /dev/td# (A5158A HBA)
#/opt/fc/bin/fcutil /dev/fcs# (A6685A HBA)
On an HP system there is no additional software to install. The HP systems' Volume address setting must be enabled on the SAN; you can check this with the following command.
To discover the SAN devices:
#ioscan -fnC disk (scans hardware busses for devices according to class)
#insf -e (install special device files)
To display what HBA's are installed
#lscfg -v -l fcs*
To set the configuration you must carry out the following:
#List HBA WWN and entry on system
#Determine code level of OS and HBA
#Download and install EMC ODM support fileset
#run /usr/lpp/Symmetrix/bin/emc_cfgmgr (symmetrix)
#or /usr/lpp/emc/CLARiiON/bin/emc_cfgmgr (clariion)
To discover the SAN devices
If the above does not work, reboot the server.
Hope this is useful, documented info for novice users.
Tier 0 is not new to the storage market, but it has been difficult to implement because it requires the best performance and the lowest latency. Enterprise Flash drives (solid state disks) are capable of meeting this requirement. They make it possible to get more performance for a company's most critical applications, using the Flash drives supported in VMAX and DMX-4 systems.
Handling BCV and Clone Disk on Veritas Volume Manager.
Get the output of the vxdisk list command
# vxdisk list
The vxdisk list output shows that some disks are marked with the udid_mismatch flag.
Write a New UUID to Disk
You can use the following command to update the unique disk identifier (UDID) for one or more disks:
# vxdisk [-f] [-g diskgroup] updateudid disk ...
Note : The -f option must be specified if VxVM has not raised the udid_mismatch flag for a disk.
Importing a disk group containing cloned/BCV disks
You can then import the cloned disks by specifying the -o useclonedev=on option to the vxdg import command, as shown in this example:
# vxdg -o useclonedev=on [-o updateid] import mydg ( Group Name )
Note: This form of the command allows only cloned disks to be imported. All non-cloned disks remain unimported. However, the import fails if multiple copies of one or more cloned disks exist.
You can use the following command to tag all the disks in the disk group that are to be imported:
# vxdisk [-g diskgroup] settag tagname disk ...
where tagname is a string of up to 128 characters, not including spaces or tabs.
For example, the following command sets the tag, my_tagged_disks, on several disks that are to be imported together:
You can use the following command to ensure that a copy of the metadata is placed on a disk, regardless of the placement policy for the disk group:
# vxdisk [-g diskgroup] set disk keepmeta=always
Alternatively, use the following command to place a copy of the configuration
database and kernel log on all disks in a disk group that share a given tag:
# vxdg [-g diskgroup] set tagmeta=on tag=tagname nconfig=all nlog=all
To check which disks in a disk group contain copies of this configuration information, use the vxdg listmeta command:
# vxdg [-q] listmeta diskgroup
The tagged disks in the disk group may be imported by specifying the tag to the vxdg import command in addition to the -o useclonedev=on option:
# vxdg -o useclonedev=on -o tag=my_tagged_disks import mydg
If you have already imported the non-cloned disks in a disk group, you can use
the -n and -t option to specify a temporary name for the disk group containing
the cloned disks:
# vxdg -t -n clonedg -o useclonedev=on -o tag=my_tagged_disks import mydg
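When these imports are scripted, it helps to build the vxdg argument list in one place and check the tag rule (up to 128 characters, no spaces or tabs) before calling the command. Below is a minimal Python sketch of that idea. The helper names are hypothetical, and it only assembles the options already shown above; it does not run vxdg.

```python
def validate_tag(tag: str) -> str:
    """Check the tag rule described above: 1 to 128 chars, no spaces or tabs."""
    if not tag or len(tag) > 128 or " " in tag or "\t" in tag:
        raise ValueError(f"invalid tag name: {tag!r}")
    return tag


def vxdg_import_args(diskgroup: str, tag: str = None, temp_name: str = None) -> list:
    """Build the argument list for importing cloned disks.

    tag       -> adds "-o tag=<tag>" to import only the tagged disks
    temp_name -> adds "-t -n <temp_name>" for a temporary disk group name
    """
    args = ["vxdg"]
    if temp_name is not None:
        args += ["-t", "-n", temp_name]
    args += ["-o", "useclonedev=on"]
    if tag is not None:
        args += ["-o", f"tag={validate_tag(tag)}"]
    args += ["import", diskgroup]
    return args


# Reproduces the last command shown above.
print(" ".join(vxdg_import_args("mydg", tag="my_tagged_disks", temp_name="clonedg")))
```

Keeping the arguments as a list (rather than one string) means they can be handed to subprocess.run without shell quoting problems.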
I tried to collect some good information on iSCSI driver details, which was requested by some readers. Hope this will help you with iSCSI queries. Leave a comment if it is useful. I will try to write an iSCSI overview in the next entry. Happy reading!
The iSCSI driver provides a transport for SCSI requests and responses to storage devices via an IP network instead of using a direct attached SCSI bus channel or an FC connection. The SN 5400 Series Storage Router, in turn, transports these SCSI requests and responses received via the IP network between it and the storage devices attached to it. Once the iSCSI driver is installed, the host will proceed with a discovery process for storage devices as follows:
1. The iSCSI driver requests available targets through the SendTargets discovery mechanism as configured in the /etc/iscsi.conf configuration file.
2. Each iSCSI target sends available iSCSI target names to the iSCSI driver.
3. The iSCSI driver discovery daemon process looks up each discovered target in the /etc/iscsi.bindings file and sends a login request to the target.
4. The iSCSI target accepts the login and sends target identifiers.
5. The iSCSI driver queries the targets for device information.
6. The targets respond with the device information.
7. The iSCSI driver creates a table of available target devices.
Once the table is completed, the iSCSI targets are available for use by the
host using all the same commands and utilities as a direct attached (e.g., via
a SCSI bus) storage device.
- All Linux kernels released on or before Feb 4, 2002 have a known bug in the buffer and page cache design. When any writes to a buffered block device fail, it is possible for the unwritten data to be discarded from the caches, even though the data was never written to disk. Any future reads will get the prior contents of the disk, and it is possible for applications to get no errors reported.
This occurs because block I/O write failures from the buffer cache simply mark the buffer invalid when the write fails. This leaves the buffer marked clean and invalid, and it may be
discarded from the cache at any time. Any future read either finds no existing buffer or finds the invalid buffer, so the read will fetch old data from disk and place it in the cache. If the fsync(2) function initiated the write, an error may be returned. If memory pressure on the cache initiated the write, the unwritten buffer may be discarded before fsync(2) is ever called, and in that case fsync will be unaware of the data loss, and will incorrectly report success. There is currently no reliable way for an application to ensure that data written to buffered block devices has actually been written to disk. Buffered data may be lost whenever a buffered
block I/O device fails a write. The iSCSI driver attempts to avoid this problem by retrying disk
commands for many types of failures. The MinDiskCommandTimeout defaults to "infinite", which disables the command timeout, allowing commands to be retried forever if the storage device is unreachable or unresponsive.
be in use. In addition, any other LUN probes initiated by the iSCSI driver will also block, since any other probes will lock waiting for the probe currently in progress to finish. When the failure to allocate command blocks occurs, the kernel will log a message similar to the following:
kernel: scsi_build_commandblocks: want=12, space for=0 blocks
In some cases, the following message will also be logged:
kernel: scan_scsis: DANGER, no command blocks
- Linux kernels 2.2.16 through 2.2.20 and 2.4.0 through 2.4.18 are known to have a problem in the SCSI error recovery process. In some cases, a successful device reset may be ignored and the SCSI layer will continue on to the later stages of the error recovery process. The problem occurs when multiple SCSI commands for a particular device are queued in the low-level SCSI driver when a device reset occurs. Even if the low-level driver correctly reports that all the commands for the device have been completed by the reset, Linux will assume only one command has been completed and continue the error recovery process. (If only one command has timed out or failed, Linux will correctly terminate the error recovery process following
the device reset.) This action is undesirable because the later stages of error recovery may send other types of resets, which can affect other SCSI initiators using the same target or other targets on the same bus. It is also undesirable because there are more serious bugs in the later stages of the Linux SCSI error recovery process. The Linux iSCSI driver now attempts to avoid this problem by replacing the usual error recovery handler for SCSI commands that timeout or fail.
- Linux kernels 2.2.16 through 2.2.20 and 2.4.0 through 2.4.2 may take SCSI devices offline after Linux issues a reset as part of the error recovery process. Taking a device offline causes all I/O to the device to fail until the HBA driver is reloaded. After the error recovery process does a reset, it sends a SCSI Test Unit Ready command to check if the SCSI target is operational
again. If this command returns SCSI sense data, instead of correctly retrying the command, Linux will treat it as a fatal error, and immediately take the SCSI device offline.
The Test Unit Ready will almost always be returned with sense data because most targets return a deferred error in the sense data of the first command received after a reset. This is a way of telling the initiator that a reset has occurred. Therefore, the affected Linux kernel versions almost always take a SCSI device offline after a reset occurs.
This bug is fixed in Linux kernels 2.4.3 and later. The Linux iSCSI driver now attempts to avoid this problem by replacing the usual error recovery handler for SCSI commands that timeout or fail.
- Linux kernels 2.2.16 through 2.2.21 and 2.4.0 through 2.4.20 appear to have problems when SCSI commands to disk devices are completed with a check condition/unit attention containing deferred sense data. This can result in applications receiving I/O errors, short reads or short writes. The Linux SCSI code may deal with the error by giving up reading or writing the first buffer head of a command, and retrying the remainder of the I/O.
The Linux iSCSI driver attempts to avoid this problem by translating deferred sense data to current sense data for commands sent to disk devices.
- Linux kernels 2.2.16 through 2.2.21 and 2.4.0 through 2.4.20 may crash on a NULL pointer if a SCSI device is taken offline while one of the Linux kernel's I/O daemons (e.g. kpiod, kflushd, etc.) is trying to do I/O to the SCSI device. The exact cause of this problem is still being investigated.
Note that some of the other bugs in the Linux kernel's error recovery handling may result in a SCSI device being taken offline, thus triggering this bug and resulting in a Linux kernel crash.
- Linux kernels 2.2.16 through 2.2.21 running on uniprocessors may hang if a SCSI disk device node is opened while the Linux SCSI device structure for that node is still being initialized.
This occurs because the sd driver which controls SCSI disks will loop forever waiting for a device busy flag to be cleared at a certain point in the open routine for the disk device. Since this particular loop will never yield control of the processor, the process initializing the SCSI disk device is not allowed to run, and the initialization process can never clear the device busy flag which the sd driver is constantly checking.
A similar problem exists in the SCSI generic driver in some 2.4 kernel versions. The sg driver may crash on a bad pointer if a /dev/sg* device is opened while it is being
CONFIGURING AND USING THE DRIVER
This section describes a number of topics related to configuring and using the iSCSI Driver for Linux. The topics covered include:
Starting and Stopping the iSCSI driver
Dynamic Driver Reconfiguration
Target Portal Failover
iSCSI HBA Status
Using Multipath I/O Software
Making Storage Configuration Changes
Target and LUN Discovery Limits
Dynamic Target And LUN Discovery
Persistent Target Binding
Editing The iscsi.conf File
iSCSI Commands and Utilities
Driver File Listing
STARTING AND STOPPING THE iSCSI DRIVER
To manually start the iSCSI driver enter:
The iSCSI initialization will report information on each detected
device to the console or in dmesg(8) output. For example:
Vendor: SEAGATE Model: ST39103FC Rev: 0002
Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sda at scsi0, channel 0, id 0, lun 0
SCSI device sda: hdwr sector= 512 bytes.
Sectors= 17783240 [8683 MB] [8.7 GB]
The directory /proc/scsi/iscsi will contain a file (the controller
number) that contains information about the iSCSI devices.
To see the iscsi devices currently available on this system, use the utility:
If there are problems loading the iSCSI kernel module, diagnostic information will be placed in /var/log/iscsi.log.
To manually stop the iSCSI driver enter:
When the driver is stopped, the init.d script will attempt to kill all processes using iSCSI devices by first sending them "SIGTERM" and then by sending any remaining processes "SIGKILL". The init.d script will then unmount all iSCSI devices in /etc/fstab.iscsi and kill the iSCSI daemon, terminating all connections to iSCSI devices. It is important to note that the init.d script may not be able to successfully unmount filesystems if they are in use by processes that can't be killed. It is recommended that you manually stop all applications using the filesystem on iSCSI devices before stopping the driver. Filesystems not listed in /etc/fstab.iscsi will not be unmounted by the script and should be manually unmounted prior to a system shutdown.
It is very important to unmount all filesystems on iSCSI devices before stopping the iSCSI driver. If the iSCSI driver is stopped while iSCSI devices are mounted, buffered writes may not be committed to disk and file system corruption may occur.
The Linux "reboot" command should not be used to reboot the system while iSCSI devices are mounted or being used, since the "reboot" command will not execute the iSCSI shutdown script in /etc/rc6.d/ and file system corruption may occur. To safely reboot a Linux system, enter the following command:
/sbin/shutdown -r now
All iSCSI devices should be unmounted prior to a system shutdown or reboot.
Because Linux assigns SCSI device nodes dynamically whenever a SCSI logical unit is detected, the mapping from device nodes (e.g., /dev/sda or /dev/sdb) to iSCSI targets and logical units may vary.
Variations in process scheduling and network delay may result in iSCSI targets being mapped to different SCSI device nodes every time the driver is started. Because of this variability, configuring applications or operating system utilities to use the standard SCSI device nodes to access iSCSI devices may result in SCSI commands being sent to the wrong target or logical unit.
To provide a more reliable namespace, the iSCSI driver scans the system to determine the mapping from SCSI device nodes to iSCSI targets, and then creates a tree of directories and symbolic links under /dev/iscsi to make it easier to use a particular iSCSI target's logical units.
Under /dev/iscsi, there will be a directory tree containing subdirectories for each iSCSI bus number, each target id number on the bus, and each logical unit number for each target. For
example, the whole disk device for bus 0, target id 0, LUN 0 would be /dev/iscsi/bus0/target0/lun0/disk.
In each logical unit directory there will be a symbolic link for each SCSI device node that may be connected to that particular logical unit. These symbolic links are modeled after the Linux
devfs naming convention.
The symbolic link 'disk' will map to the whole-disk SCSI device node
(e.g., /dev/sda, /dev/sdb, etc.).
The symbolic links 'part1' through 'part15' will map to each
partition of that SCSI disk (e.g., /dev/sda1, /dev/sda15, etc.).
Note that these links will exist regardless of the number of disk partitions. Opening the partition devices will result in an error if the partition does not actually exist on the disk.
The symbolic link 'mt' will map to the auto-rewind SCSI tape device node for this LUN (e.g., /dev/st0), if any. Additional links for 'mtl', 'mtm', and 'mta' will map to the other auto-rewind devices (e.g., /dev/st0l, /dev/st0m, /dev/st0a), regardless of whether these device nodes actually exist or could be opened.
The symbolic link 'mtn' will map to the no-rewind SCSI tape device node for this LUN (e.g., /dev/nst0), if any. Additional links for 'mtln', 'mtmn', and 'mtan' will map to the other no-rewind devices (e.g., /dev/nst0l, /dev/nst0m, /dev/nst0a), regardless of whether those device nodes actually exist or could be opened.
The symbolic link 'cd' will map to the SCSI cdrom device node for this LUN (e.g., /dev/scd0), if any.
The symbolic link 'generic' will map to the SCSI generic device
node for this LUN (e.g., /dev/sg0), if any.
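The naming scheme above is regular enough to build paths programmatically. Here is a small Python sketch that composes a /dev/iscsi path from the bus, target and LUN numbers and rejects link names that are not in the list above. The function and constant names are hypothetical; the driver itself creates the real symlinks.

```python
# Link names described above, grouped by device type.
DISK_LINKS = {"disk"} | {f"part{n}" for n in range(1, 16)}
TAPE_LINKS = {"mt", "mtl", "mtm", "mta", "mtn", "mtln", "mtmn", "mtan"}
OTHER_LINKS = {"cd", "generic"}
VALID_LINKS = DISK_LINKS | TAPE_LINKS | OTHER_LINKS


def iscsi_link_path(bus: int, target: int, lun: int, link: str = "disk") -> str:
    """Return the /dev/iscsi symlink path for one logical unit."""
    if link not in VALID_LINKS:
        raise ValueError(f"unknown link name: {link!r}")
    return f"/dev/iscsi/bus{bus}/target{target}/lun{lun}/{link}"


print(iscsi_link_path(0, 0, 0))            # whole disk
print(iscsi_link_path(0, 1, 2, "part1"))   # first partition
```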
Because the symlink creation process must open all of the SCSI
device nodes in /dev in order to determine which nodes map to
iSCSI devices, you may see many modprobe messages logged to syslog
indicating that modprobe could not find a driver for a particular
combination of major and minor numbers. This is harmless, and can
be ignored. The messages occur when Linux is unable to find a
driver to associate with a SCSI device node that the iSCSI daemon
is opening as part of its symlink creation process. To prevent
these messages, the SCSI device nodes with no associated high-level
SCSI driver can be removed.
Filesystems installed on iSCSI devices cannot be automatically mounted at
system reboot due to the fact that the IP network is not yet configured at
mount time. However, the driver provides a method to auto-mount these
filesystems as soon as the iSCSI devices become available (i.e., after the IP
network is configured).
To auto-mount a filesystem installed on an iSCSI device, follow these steps:
1. List the iSCSI partitions to be automatically mounted in
/etc/fstab.iscsi which has the same format as /etc/fstab. The
/etc/fstab.iscsi file will not be overwritten when the driver is
installed nor will removing the current version of the driver delete
/etc/fstab.iscsi. It is left untouched during an install.
2. For each filesystem on each iSCSI device, enter the logical volume on
which the filesystem resides. The mount points must exist for the
filesystems to be mounted. For example, the following /etc/fstab.iscsi
entries will mount the two iSCSI devices specified (sda and sdb):
#device       mount       FS      mount       backup       fsck
#to mount     point       type    options     frequency    pass
/dev/sda      /mnt/t0     ext2    defaults    0            0
/dev/sdb      /mnt/t1     ext2    defaults    0            0
3. Upon a system restart, the iSCSI startup script invokes the
iscsi-mountall script, which will try to mount the iSCSI devices listed in
the /etc/fstab.iscsi file. iscsi-mountall tries to mount the iSCSI devices
for "NUM_RETRIES" (default value 10) number of times, at an interval of
"SLEEP_INTERVAL" seconds (default value 1) between each attempt, giving
the driver the time to establish a connection with an iSCSI target.
The value of these parameters can be changed in the iscsi-mountall script
if the devices are not getting configured in the system within the
default time periods.
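The retry behaviour described in step 3 can be sketched in a few lines of Python. This is only an illustration of the NUM_RETRIES / SLEEP_INTERVAL logic, not the actual iscsi-mountall script. The function name and the try_mount callable are hypothetical stand-ins for a real mount attempt.

```python
import time

NUM_RETRIES = 10      # default number of attempts described above
SLEEP_INTERVAL = 1    # default seconds between attempts


def mount_with_retries(try_mount, retries=NUM_RETRIES,
                       interval=SLEEP_INTERVAL, sleep=time.sleep):
    """Call try_mount() up to `retries` times, pausing between attempts.

    Returns the attempt number that succeeded, or None if all failed.
    """
    for attempt in range(1, retries + 1):
        if try_mount():
            return attempt
        if attempt < retries:
            sleep(interval)  # give the driver time to connect to the target
    return None


# Simulated device that only becomes available on the third attempt.
state = {"attempts": 0}

def fake_mount():
    state["attempts"] += 1
    return state["attempts"] >= 3

# sleep is stubbed out so the demo finishes immediately.
print(mount_with_retries(fake_mount, sleep=lambda seconds: None))
```

If devices are slow to appear, raising either value extends the total wait, which is roughly retries times interval seconds.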
Due to variable network delays, targets may not always become available in the
same order from one boot to the next. Thus, the order in which iSCSI devices
are mounted may vary and may not match the order the devices are listed in
/etc/fstab.iscsi. You should not assume mounts of iSCSI devices will occur in
any particular order.
Because of the variability of the mapping between SCSI device nodes
and iSCSI targets, instead of directly mounting SCSI device nodes,
it is recommended to either mount the /dev/iscsi tree symlinks,
mount filesystem UUIDs or labels (see man pages for mke2fs, mount,
and fstab), or use logical volume management (see Linux LVM) to
avoid mounting the wrong device due to device name changes resulting
from iSCSI target configuration changes or network delays.
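One quick way to act on this recommendation is to scan /etc/fstab.iscsi for entries that still point at raw SCSI device nodes. The Python sketch below does that for fstab-style text. The function name and the sample entries are made up for illustration, and the regular expression is a simple assumption about what a raw node looks like (/dev/sdX with an optional partition number).

```python
import re

# Matches raw SCSI disk nodes such as /dev/sda or /dev/sdb1.
RAW_SD_NODE = re.compile(r"^/dev/sd[a-z]+\d*$")


def unstable_entries(fstab_text: str) -> list:
    """Return the device fields that use raw /dev/sdX nodes."""
    flagged = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        device = line.split()[0]
        if RAW_SD_NODE.match(device):
            flagged.append(device)
    return flagged


sample = """\
#device                             mount    FS    mount     backup fsck
/dev/sda                            /mnt/t0  ext2  defaults  0      0
/dev/iscsi/bus0/target1/lun0/disk   /mnt/t1  ext2  defaults  0      0
LABEL=data                          /mnt/t2  ext2  defaults  0      0
"""

print(unstable_entries(sample))
```

Only the first entry is flagged; the /dev/iscsi symlink and the label-based entry survive device renumbering.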
The iSCSI driver contains components in the kernel and user level.
The log messages from these components are sent to syslog. Based on the
syslogd configuration on the Linux host, the messages will be sent to the
appropriate destination. For example, if /etc/syslog.conf has the following
then all log messages of level 'info' or higher will be sent to
If /etc/syslog.conf has the following entry:
then all log messages (except kernel messages) of level info or higher
will be sent to /var/log/messages.
If /etc/syslog.conf has the following entry:
then all kernel messages will be sent to the console.
All messages from the iSCSI driver when loading the iSCSI kernel
module will be placed in /var/log/iscsi.log.
The user can also use dmesg(8) to view the log messages.
DYNAMIC DRIVER RECONFIGURATION
Configuration changes can be made to the iSCSI driver without having to stop
it or reboot the host system. To dynamically change the configuration of the
driver, follow the steps below:
1. Edit /etc/iscsi.conf with the desired configuration changes.
2. Enter the following command:
This will cause the iSCSI daemon to re-read /etc/iscsi.conf file and to
create any new DiscoveryAddress connections it finds. Those discovery
sessions will then discover targets and create new target connections.
Note that any configuration changes will not affect existing target sessions.
For example, removal of a DiscoveryAddress entry from /etc/iscsi.conf
will not cause the removal of sessions to targets discovered through this
DiscoveryAddress, but it will cause the removal of the discovery session
corresponding to the deleted DiscoveryAddress.
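The rule above (new DiscoveryAddress entries start discovery sessions, removed entries drop only the discovery session, existing target sessions stay) can be modelled as simple set arithmetic. This Python sketch is only a model of that rule with hypothetical names; it is not how the daemon is implemented.

```python
def reread_config(discovery_sessions, target_sessions, configured_addresses):
    """Model one re-read of /etc/iscsi.conf.

    discovery_sessions   -> set of DiscoveryAddress strings with an open session
    target_sessions      -> dict: target name -> DiscoveryAddress it came from
    configured_addresses -> DiscoveryAddress entries now present in the file

    Returns (new discovery sessions, target sessions, started, dropped).
    """
    configured = set(configured_addresses)
    started = configured - discovery_sessions   # new entries get a session
    dropped = discovery_sessions - configured   # deleted entries lose theirs
    # Target sessions are copied unchanged: removing a DiscoveryAddress
    # does not remove sessions to targets discovered through it.
    return configured, dict(target_sessions), started, dropped


discovery = {"10.77.13.248"}
targets = {"disk_one": "10.77.13.248"}

new_discovery, new_targets, started, dropped = reread_config(
    discovery, targets, ["192.168.250.248"]
)
print("started:", sorted(started))
print("dropped:", sorted(dropped))
print("target sessions kept:", sorted(new_targets))
```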
TARGET PORTAL FAILOVER
Some SN 5400 Series Storage Routers have multiple Gigabit Ethernet ports.
Those systems may be configured to allow iSCSI target access via multiple
paths. When the iSCSI driver discovers targets through a multi-port SN 5400
Series system, it also discovers all the IP addresses that can be used to
reach each of those targets.
When an existing target connection fails, the iSCSI driver will attempt to
connect to that target using the next available IP address. You can also
choose a preferred portal to which the iSCSI driver should attempt to connect
when the iSCSI driver is started or whenever automatic portal failover
occurs. This is significant in a situation when you want the connection
to the targets to be made through a faster network portal (for example, when
the I/Os are going through a Gigabit Ethernet interface and you do not
want the connection to fail over to a slower network interface).
The preference for portal failover can be specified through the
"PreferredPortal" or "PreferredSubnet" parameter in /etc/iscsi.conf.
If this preference is set, then on any subsequent failover the driver will
first try to fail over to the preferred portal or preferred subnet, whichever
is specified in the conf file. If both preferred portal and preferred subnet
entries are present in the conf file then the preferred portal takes
precedence. If the preferred portal or preferred subnet is unreachable,
then the driver will continuously rotate through the list of available
portals until it finds one that is active.
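The selection order described above can be sketched as follows. This is a simplified Python model with hypothetical names: portals are plain IP strings without ports, is_active stands in for a real connection attempt, and the sketch makes a single pass over the list where the driver would keep rotating.

```python
import ipaddress


def choose_portal(portals, is_active, preferred_portal=None, preferred_subnet=None):
    """Pick the portal to fail over to.

    1. A preferred portal, if set, takes precedence over a preferred subnet.
    2. Otherwise the first active portal inside the preferred subnet is tried.
    3. If the preference is unreachable, fall back to the first active portal.
    """
    if preferred_portal is not None:
        if preferred_portal in portals and is_active(preferred_portal):
            return preferred_portal
    elif preferred_subnet is not None:
        subnet = ipaddress.ip_network(preferred_subnet)
        for portal in portals:
            if ipaddress.ip_address(portal) in subnet and is_active(portal):
                return portal
    for portal in portals:  # one pass; the driver rotates continuously
        if is_active(portal):
            return portal
    return None


portals = ["10.77.13.248", "192.168.250.248"]
print(choose_portal(portals, lambda p: True, preferred_subnet="192.168.250.0/24"))
```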
The Portal Failover feature is turned on by default and the whole process of
failover occurs automatically. You can choose to turn off portal failover
by disabling the portal failover parameter in /etc/iscsi.conf.
If a target advertises more than one network portal, you can manually
switch portals by writing to the HBA's special file in /proc/scsi/iscsi/.
For example, if a target advertises two network portals:
10.77.13.248:3260 and 192.168.250.248:3260.
If the device is configured with targetId as 0, busId as 0, HBA's host
number is 3 and you want to switch the target from
10.77.13.248 to 192.168.250.248, use the following command:
echo "target 0 0 address 192.168.250.248" > /proc/scsi/iscsi/3
Where the syntax is:
The host system must have multiple network interfaces to effectively
utilize this failover feature.
iSCSI HBA STATUS
The directory /proc/scsi/iscsi will contain a special file that can be
used to get status from your iSCSI HBA. The name of the file will
be the iSCSI HBA's host number, which is assigned to the driver by Linux.
When the file is read, it will show the driver's version number,
followed by a list of all iSCSI targets and LUNs the driver has found
and can use.
Each line will show the iSCSI bus number, target id number, and
logical unit number, as well as the IP address, TCP port, and
iSCSI TargetName. If an iSCSI session exists, but no LUNs have
yet been found for a target, the LUN number field will contain a
question mark. If a TCP connection is not currently established,
the IP address and port number will both appear as question marks.
USING MULTIPATH I/O SOFTWARE
If a third-party multipath I/O software application is being used in
conjunction with the iSCSI driver (e.g., HP Secure Path), it may be
necessary to modify the configuration of the driver to allow the
multi-pathing software to operate more efficiently. If you are using
a multipath I/O application, you may need to set the "ConnFailTimeout"
parameter of the iSCSI driver to a smaller value so that SCSI commands
will fail more quickly when an iSCSI network connection drops allowing
the multipath application to try a different path for access to the
storage device. Also, you may need to set the "MaxDiskCommandTimeout"
to a smaller value (e.g., 5 or 10 seconds), so that SCSI commands to
unreachable or unresponsive devices will fail more quickly and the
multipath software will know to try a different path to the storage device.
Multipath support in the iSCSI driver can be turned on by setting
Multipath=<"yes" or "portal" or "portalgroup"> in /etc/iscsi.conf.
If Multipath=<"yes" or "portal">, then the discovered targets that
are configured to allow access via multiple paths will have a separate
iSCSI session created for each path (i.e., iSCSI portal). The target
portal failover feature should not be used if Multipath=<"yes" or "portal">
since multiple sessions will be established with all available paths.
MAKING STORAGE CONFIGURATION CHANGES
Making changes to your storage configuration, including adding or
removing targets or LUNs, remapping targets, or modifying target
access, may change how the devices are presented to the host operating
system. This may require corresponding changes in the iSCSI driver
configuration and /etc/fstab.iscsi file.
It is important to understand the ramifications of SCSI routing
service configuration changes on the hosts accessing the associated
storage devices. For example, changing the instance configuration
may change the device presentation to the host's iSCSI driver,
effectively changing the name or number assigned to the device
by the host operating system. Certain configuration changes,
such as adding or deleting targets, adding or deleting LUNs
within a particular target, or adding or deleting entire instances
may change the order of the devices presented to the host.
Even if the host is only associated with one SCSI routing
service instance, the device order could make a difference.
Typically, the host operating system assigns drive identifications
in the order they are received based on certain criteria. Changing
the order of the storage device discovery may result in a changed
drive identification. Applications running on the host may require
modifications to appropriately access the current drives.
If an entire SCSI routing service instance is removed, or there
are no targets available for the host, the host's iSCSI driver
configuration file must be updated to remove the appropriate
reference before restarting the iSCSI driver. If a host's iSCSI
configuration file contains an IP address of a SCSI routing
service instance that does not exist, or has no targets available
for the host, the iSCSI driver will not complete a login and
will keep on trying to discover targets associated with this SCSI
routing service instance.
In general, the following steps are normally required when reconfiguring
1. Unmount any filesystems and stop any applications using iSCSI
2. Stop the iSCSI driver by entering:
3. Make the appropriate changes to the iSCSI driver
configuration file. Remove any references to iSCSI
DiscoveryAddresses that have been removed, or that
no longer have valid targets for this host.
4. Modify /etc/fstab.iscsi and application configurations as
5. Restart the iSCSI driver by entering:
Failure to appropriately update the iSCSI configuration using
the above procedure may result in a situation that prevents
the host from accessing iSCSI storage resources.
TARGET AND LUN DISCOVERY LIMITS
The bus ID and target ID are assigned by the iSCSI initiator driver
whereas the lun ID is assigned by the iSCSI target. The driver provides
access to a maximum of 256 bus IDs with each bus supporting 256 targets
and each target capable of supporting 256 LUNs. Any discovered iSCSI
device will be allocated the next available target ID on bus 0.
If the target IDs on bus 0 are exhausted (target ID > 256), the next available
target ID on bus 1 is allocated. A device with a bus ID > 256 and LUN ID > 256
is ignored by the driver and is not configured in the system.
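The allocation rule above (fill bus 0 first, then move to bus 1, and so on) can be sketched in Python as shown below. The names are hypothetical, and the sketch assumes IDs are zero-based, so 256 targets per bus means target IDs 0 through 255.

```python
TARGETS_PER_BUS = 256   # limit described above
MAX_BUSES = 256         # limit described above


def next_free_slot(used):
    """Return the lowest free (bus, target) pair, or None if all are taken.

    used -> set of (bus, target) tuples already assigned.
    Assumes zero-based bus and target IDs.
    """
    for bus in range(MAX_BUSES):
        for target in range(TARGETS_PER_BUS):
            if (bus, target) not in used:
                return (bus, target)
    return None


# With bus 0 completely full, the next device lands on bus 1.
bus0_full = {(0, target) for target in range(TARGETS_PER_BUS)}
print(next_free_slot(set()))
print(next_free_slot(bus0_full))
```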
DYNAMIC TARGET AND LUN DISCOVERY
When using iSCSI targets that support long-lived iSCSI discovery sessions,
such as the Cisco 5400 Series, the driver will keep a discovery session
open waiting for change notifications from the target. When a notification
is received, the driver will rediscover targets, add any new targets, and
activate LUNs on all targets.
If a new LUN is dynamically added to an existing target on a SCSI routing
instance with which the driver has established a connection, then the driver
does not automatically activate the new LUN. The user can manually activate
the new LUN by executing the following command:
echo "scsi add-single-device
HBA#: is the controller number present under /proc/scsi/iscsi/
bus-id: is the bus number present on controller
target-id: is the target ID present on
LUN: new LUN added dynamically to the target.
PERSISTENT TARGET BINDING
This feature ensures that the same iSCSI bus and target id number are used
for every iSCSI session to a particular iSCSI TargetName, and a Linux SCSI
target always maps to the same physical storage device from one reboot to the next.
This feature ensures that the SCSI numbers in the device symlinks described
above will always map to the same iSCSI target.
Note that because of the way Linux dynamically allocates SCSI device nodes
as SCSI devices are found, the driver does not and cannot ensure that any
particular SCSI device node (e.g., /dev/sda) will always map to the same
iSCSI TargetName. The symlinks described in the section on Device Names are
intended to provide a persistent device mapping for use by applications and
fstab files, and should be used instead of direct references to particular
SCSI device nodes.
The file /etc/iscsi.bindings is used by the iSCSI daemon to store bindings of
iSCSI target names to SCSI target IDs. If the file doesn't exist,
it will be created when the driver is started. If an entry exists for a
discovered target, the Linux target ID from the entry is assigned to the
target. If no entry exists for a discovered target, an entry is written to
the file. Each line of the file contains the following fields:
BusId TargetId TargetName
An example file would be:
0 0 iqn.1987-05.com.cisco.00.7e9d6f942e45736be69cb65c4c22e54c.disk_one
0 1 iqn.1987-05.com.cisco.00.4d678bd82965df7765c788f3199ac15f.disk_two
0 2 iqn.1987-05.com.cisco.00.789ac4483ac9114bc6583b1c8a332d1e.disk_three
Note that the /etc/iscsi.bindings file will permanently contain entries
for all iSCSI targets ever logged into from this host. If a target is
no longer available to a host you can manually edit the file and remove
entries so the obsolete target no longer consumes a SCSI target ID.
If you know the iSCSI target name of a target in advance, and you want
it to be assigned a particular SCSI target ID, you can add an entry
manually. You should stop the iSCSI driver before editing the
/etc/iscsi.bindings file. Be careful to keep an entire entry on a single
line, with only whitespace characters between the three fields. Do not
use a target ID number that already exists in the file.
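Before restarting the driver after a manual edit, the rules above (three whitespace-separated fields per line, no reused target ID) can be checked with a short script. The sketch below is not part of the iSCSI driver package; the function name is hypothetical, and skipping blank lines and lines starting with "#" is an assumption about the file format.

```python
def parse_bindings(text):
    """Parse 'BusId TargetId TargetName' lines into {(bus, target): name}.

    Raises ValueError for a malformed line or a reused bus/target pair.
    """
    bindings = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        # Assumption: blank lines and '#' comment lines carry no binding.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(f"line {lineno}: expected 3 fields, got {len(fields)}")
        bus, target, name = int(fields[0]), int(fields[1]), fields[2]
        if (bus, target) in bindings:
            raise ValueError(f"line {lineno}: bus {bus} target {target} already bound")
        bindings[(bus, target)] = name
    return bindings
```

Calling parse_bindings(open("/etc/iscsi.bindings").read()) on a stopped driver's file would report the first offending line, if any.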
NOTE: iSCSI driver versions prior to 3.2 used the file /var/iscsi/bindings
instead of /etc/iscsi.bindings. The first time you start the new driver
version, it will change the location and the name of the bindings file
The CHAP authentication mechanism provides for two way authentication between
the target and the initiator. The authentication feature on the SN 5400
system has to be enabled to make use of this feature. The username and
password for both initiator-side and target-side authentication need to be
listed in /etc/iscsi.conf. The username and password can be specified as
global values or can be made specific to the target address. Please refer to
the Editing The iscsi.conf File section of this document for a more detailed
description of these parameters.
EDITING THE ISCSI.CONF FILE
The /etc/iscsi.conf file is used to control the operation of the iSCSI driver
by allowing the user to configure the values for a number of programmable
parameters. These parameters can be set up to apply to specific configuration
types or they can be set up to apply globally. The configuration types that are
- DiscoveryAddress = SCSI routing instance IP address with format a.b.c.d or
a.b.c.d:n or hostname.
- TargetName = Target name in 'iqn' or 'eui' format
eg: TargetName = iqn.1987-05.com.cisco:00.0d1d898e8d66.t0
- TargetIPAddress = Target IP address with format a.b.c.d/n
- Subnet = Network portal IP address with format a.b.c.d/n or a.b.c.d&hex
- Address = Network portal IP address with format a.b.c.d/32
The complete list of parameters that can be applied either globally or to the
configuration types listed above are shown below. Not all parameters are
applicable to all configuration types.
- Username = CHAP username used for initiator authentication by the target.
- OutgoingUsername = <>
- Password = CHAP password used for initiator authentication by the target.
- OutgoingPassword = <>
- IncomingUsername = CHAP username for target authentication by the initiator.
- IncomingPassword = CHAP password for target authentication by the initiator.
- HeaderDigest = Type of header digest support the initiator is requesting
of the target.
- DataDigest = Type of data digest support the initiator is requesting of
the target.
- PortalFailover = Enabling/disabling of target portal failover feature.
- PreferredSubnet = IP address of the subnet that should be used for a
- PreferredPortal = IP address of the portal that should be used for a
- Multipath = Enabling/disabling of multipathing feature.
- LoginTimeout = Time interval to wait for a response to a login request to
be received from a target before failing a connection
- AuthTimeout = Time interval to wait for a response to a login request
containing authentication information to be received from a
target before failing a connection attempt.
- IdleTimeout = Time interval to wait on a connection with no traffic before
sending out a ping.
- PingTimeout = Time interval to wait for a ping response after a ping is
sent before failing a connection.
- ConnFailTimeout = Time interval to wait before failing SCSI commands back
to an application for unsuccessful commands.
- AbortTimeout = Time interval to wait for an abort command to complete
before declaring the abort command failed.
- ResetTimeout = Time interval to wait for a reset command to complete
before declaring the reset command failed.
- InitialR2T = Enabling/disabling of R2T flow control with the target.
- MaxRecvDataSegmentLength = Maximum number of bytes that the initiator can
receive in an iSCSI PDU.
- FirstBurstLength = Maximum number of bytes of unsolicited data the
initiator is allowed to send.
- MaxBurstLength = Maximum number of bytes for the SCSI payload negotiated
- TCPWindowSize = Maximum number of bytes that can be sent over a TCP
connection by the initiator before receiving an
acknowledgement from the target.
- Continuous = Enabling/disabling the discovery session to be kept alive.
A detailed description for each of these parameters is included in both the
man page and the included sample iscsi.conf file. Please consult these sources
for examples and more detailed programming instructions.
iSCSI COMMANDS AND UTILITIES
This section gives a description of all the commands and utilities available
with the iSCSI driver.
- "iscsi-ls" lists information about the iSCSI devices available to the
driver. Please refer to the man page for more information.
Mount EMC BCVs on the same host
I have created a volume group, a logical volume, a filesystem and a file on two EMC standard volumes. (For this test you need to have two hdisks hdisk and hdisk and two BCVs dev and available.)
# mkvg -f -y MyName_vg -s 16 hdisk hdisk
# mklv -y MyName_lv -b n MyName_vg 20
# crfs -v jfs -d MyName_lv -m /MyName_mp -A yes -p rw
# mount /MyName_mp
# lptest > /MyName_mp/lptest.out
To use EMC's TimeFinder I have to create a device group. (AIX works with volume groups. EMC's TimeFinder works with disk groups.) The following command converts the AIX volume group MyName_vg to the disk group MyName_dg:
# symvg vg2dg MyName_vg MyName_dg -dgtype RDF1
To use TimeFinder I have to associate two BCVs with this device group
# symbcv -g MyName_dg associate dev
# symbcv -g MyName_dg associate dev
Now I have to set the BCVs to the defined-state
# rmbcv -a
Using the establish I mirror all data from the original hdisks to the BCVs (including the PVIDs!)
# symmir -g MyName_dg establish -full -exac
I have to wait until the establish is done
# symmir -g MyName_dg -i 10 query
When the establish is done, I have to unmount my filesystem and vary off the volume group
# umount /MyName_mp
# varyoffvg MyName_vg
Now I am in the right state to split the BCV copies
# symmir -g MyName_dg split -noprompt
When the split is done, I can vary on my volume group and mount my filesystem
# symmir -g MyName_dg -i 10 query
# varyonvg MyName_vg
# mount /MyName_mp
I configure the BCVs
# mkbcv -a
Now I am able to create a new volumegroup from the BCVs
# recreatevg -y MyName_bcv_vg -Y test -L /bcv hdisk hdisk
# lsvg -l MyName_bcv_vg
MyName_bcv_vg:
LV NAME        TYPE    LPs  PPs  PVs  LV STATE      MOUNT POINT
testMyName_lv  jfs     20   20   1    closed/syncd  /bcv/MyName_mp
testloglv00    jfslog  1    1    1    closed/syncd  N/A
# lspv | grep -v None
hdisk0 000039386adb2317 rootvg
hdisk 00003938874658c8 MyName_vg
hdisk 0000393887468473 MyName_vg
hdisk 000039388794adb8 MyName_bcv_vg
hdisk 000039388794b7f5 MyName_bcv_vg
Simply stated, a LUN is a logical entity that converts raw physical disk space into logical storage space that a host server's operating system can access and use. Any computer user recognizes the logical drive letter that has been carved out of their disk drive. For example, a computer may boot from the C: drive and access file data from a different D: drive. LUNs do the same basic job. "LUNs differentiate between different chunks of disk space. A LUN is part of the address of the storage that you're presenting to a [host] server."
LUNs are created as a fundamental part of the storage provisioning process using software tools that typically accompany the particular storage platform. However, there is not a 1-to-1 ratio between drives and LUNs. Numerous LUNs can easily be carved out of a single disk drive. For example, a 500 GB drive can be partitioned into one 200 GB LUN and one 300 GB LUN, which would appear as two unique drives to the host server. Conversely, storage administrators can employ Logical Volume Manager software to combine multiple LUNs into a larger volume. Veritas Volume Manager from Symantec Corp. is just one example of this software. In actual practice, disks are first gathered into a RAID group for larger capacity and redundancy (e.g., RAID-50), and then LUNs are carved from that RAID group.
LUNs are often referred to as logical "volumes," reflecting the traditional use of "drive volume letters," such as volume C: or volume F: on your computer. But some experts warn against mixing the two terms, noting that the term "volume" is often used to denote the large volume created when multiple LUNs are combined with volume manager software. In this context, a volume may actually involve numerous LUNs and can potentially confuse storage allocation. "The 'volume' is a piece of a volume group, and the volume group is composed of multiple LUNs."
Once created, LUNs can also be shared between multiple servers. For example, a LUN might be shared between an active and standby server. If the active server fails, the standby server can immediately take over. However, it can be catastrophic for multiple servers to access the same LUN simultaneously without a means of coordinating changed blocks to ensure data integrity. Clustering software, such as a clustered volume manager, a clustered file system, a clustered application or a network file system using NFS or CIFS, is needed to coordinate data changes.
SAN zoning and masking
LUNs are the basic vehicle for delivering storage, but provisioning SAN storage isn't just a matter of creating LUNs or volumes; the SAN fabric itself must be configured so that disks and their LUNs are matched to the appropriate servers. Proper configuration helps to manage storage traffic and maintain SAN security by ensuring that a server cannot access just any LUN.
Zoning makes it possible for devices within a Fibre Channel network to see each other.
Consequently, zoning is an important element of SAN security and high-availability SAN design. Zoning can typically be broken down into hard and soft zoning. With hard zoning, each device is assigned to a zone, and that assignment can never change. In soft zoning, the device assignments can be changed by the network administrator.
LUN masking adds granularity to this concept. Just because you zone a server and disk together doesn't mean that the server should be able to see all of the LUNs on that disk. Once the SAN is zoned, LUNs are masked so that each host server can only see specific LUNs. For example, suppose that a disk has two LUNs, LUN_A and LUN_B. If we zoned two servers to that disk, both servers would see both LUNs. However, we can use LUN masking to allow one server to see only LUN_A and mask the other server to see only LUN_B. Port-based LUN masking is granular to the storage array port, so any disks on a given port will be accessible to any servers on that port. Server-based LUN masking is a bit more granular where a server will see only the LUNs assigned to it, regardless of the other disks or servers connected.
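The LUN_A / LUN_B example can be modelled as a lookup table. This is only a conceptual sketch: the server names and the function are hypothetical, and real arrays implement masking in their own management software, usually keyed on HBA WWNs rather than host names.

```python
# LUNs that zoning alone would expose to every server zoned to the disk.
zoned_luns = {"LUN_A", "LUN_B"}

# Hypothetical server-based masking table: server -> LUNs it may see.
masking = {
    "server1": {"LUN_A"},
    "server2": {"LUN_B"},
}


def visible_luns(server, zoned, mask_table):
    """Return the LUNs a server sees: zoned LUNs filtered by its mask entry.

    A server with no mask entry sees nothing.
    """
    return zoned & mask_table.get(server, set())
```

Here visible_luns("server1", zoned_luns, masking) yields only LUN_A, even though zoning alone would have exposed both LUNs.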
LUN scaling and performance
LUNs are based on disks, so LUN performance and reliability will vary for the same reasons. For example, a LUN carved from a Fibre Channel 15K rpm disk will perform far better than a LUN of the same size taken from a 7,200 rpm SATA disk. This is also true of LUNs based on RAID arrays where the mirroring of a RAID-1 group may offer significantly different performance than the parity protection of a RAID-5 or RAID-6/dual parity (DP) group. Proper RAID group configuration will have a profound impact on LUN performance.
An organization may utilize hundreds or even thousands of LUNs, so the choice of storage resources has important implications for the storage administrator. Not only is it necessary to supply an application with adequate capacity (in gigabytes), but the LUN must also be drawn from disk storage with suitable characteristics. "We go through a qualification process to understand the requirements of the application that will be using the LUNs for performance, availability and cost." For example, a LUN for a mission-critical database application might be taken from a RAID-0 group using Tier-1 storage, while a LUN slated for a virtual tape library (VTL) or archive application would probably work with a RAID-6 group using Tier-2 or Tier-3 storage.
LUN management tools
A large enterprise array may host more than 10,000 LUNs, so software tools are absolutely vital for efficient LUN creation, manipulation and reporting. Fortunately, management tools are readily available, and almost every storage vendor provides some type of management software to accompany products ranging from direct-attached storage (DAS) devices to large enterprise arrays.
Administrators can typically opt for vendor-specific or heterogeneous tools. A data center with one storage array or a single-vendor shop would probably do well with the indigenous LUN management tool that accompanied their storage system. Multivendor shops should at least consider heterogeneous tools that allow LUN management across all of the storage platforms. Mack uses EMC ControlCenter for LUN masking and mapping, which is just one of several different heterogeneous tools available in the marketplace. While good heterogeneous tools are available, he advises caution when selecting a multiplatform tool. "Sometimes, if the tool is written by a particular vendor, it will manage 'their' LUNs the best," he says. "LUNs from the other vendors can take the back seat -- the management may not be as well integrated."
In addition to vendor support, a LUN management tool should support the entire storage provisioning process. Features should include mapping to specific array ports and masking specific host bus adapters (HBA), along with comprehensive reporting. The LUN management tool should also be able to reclaim storage that is no longer needed. Although a few LUN management products support autonomous provisioning, experts see some reluctance toward automation. "It's hard to do capacity planning when you don't have any checks and balances over provisioning," Mack says, also noting that automation can circumvent strict change control processes in an IT organization.
LUNs at work
Significant storage growth means more LUNs, which must be created and managed efficiently while minimizing errors, reining in costs and maintaining security. For Thomas Weisel Partners LLC, an investment firm based in San Francisco, storage demands have simply exploded to 80 terabytes (TB) today -- up from about 8 TB just two years ago. Storage continues to flood the organization's data center at about 2 TB to 3 TB each month depending on projects and priorities.
This aggressive growth pushed the company out of a Hitachi Data Systems (HDS) storage array and into a 3PARdata Inc. S400 system. LUN deployment starts by analyzing realistic space and performance requirements for an application. "Is it something that needs a lot of fast access, like a database or something that just needs a file share?" asks Kevin Fiore, director of engineering services at Thomas Weisel. Once requirements are evaluated, a change ticket is generated and a storage administrator provisions the resources from a RAID-5 or RAID-1 group depending on the application. Fiore emphasizes the importance of provisioning efficiency, noting that the S400's internal management tools can provision storage in just a few clicks.
Fiore also notes the importance of versatility in LUN management tools and the ability to move data. "Dynamic optimization allows me to move LUNs between disk sets," he says. Virtualization has also played an important role in LUN management. VMware has allowed Fiore to consolidate about 50 servers enterprise-wide along with the corresponding reduction in space, power and cooling. This lets the organization manage more storage with less hardware.
LUNs getting large
As organizations deal with spiraling storage volumes, experts suggest that efficiency-enhancing features, such as automation, will become more important in future LUN management. Experts also note that virtualization and virtual environments will play a greater role in tomorrow's LUN management. For example, it's becoming more common to provision very large chunks of storage (500 GB to 1 TB or more) to virtual machines. "You might provision a few terabytes to a cluster of VMware servers, and then that storage will be provisioned out over time."
Very simply, RAID striping is a means of improving the performance of large storage systems. For most normal PCs or laptops, files are stored in their entirety on a single disk drive, so a file must be read from start to finish and passed to the host system. With large storage arrays, disks are often organized into RAID groups that can enhance performance and protect data against disk failures. Striping is actually RAID-0, a technique that breaks up a file and interleaves its contents across all of the disks in the RAID group. This allows multiple disks to access the contents of a file simultaneously. Instead of a single disk reading a file from start to finish, striping allows one disk to read the next stripe while the previous disk is passing its stripe data to the host system -- this enhances the overall disk system performance, which is very beneficial for busy storage arrays.
Parity can be added to protect the striped data. Parity data is calculated for the stripes and placed on another disk drive. If one of the disks in the RAID group fails, the parity data can be used to rebuild the failed disk. However, multiple simultaneous disk failures may result in data loss because conventional parity only accommodates a single disk failure.
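The rebuild described above can be illustrated with XOR parity, the usual way single-parity RAID levels compute it. The sketch below is a toy model in which each "disk" holds one small block; the block contents and function name are made up for the example, and real arrays work on fixed-size stripes in firmware.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks, each holding one stripe unit of a file.
data = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block is the XOR of all data blocks and lives on a fourth disk.
parity = xor_blocks(data)

# Simulate losing disk 1: XOR the surviving data blocks with the parity block.
rebuilt = xor_blocks([data[0], data[2], parity])
```

After this runs, rebuilt equals the lost block data[1]. If two blocks were missing at once there would be two unknowns and only one parity equation, which is why single parity cannot survive a double failure.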
The performance impact of RAID striping at the array and operating system level.
RAID striping or concatenation: Which has better performance?
Designing storage for performance is a very esoteric effort by nature. There are quite a few variables that need to be taken into account.
RAID-50: RAID-5 with suspenders
RAID-50 combines striping with distributed parity for higher reliability and data transfer capabilities.
RAID-53: RAID by any other name
RAID-53 has a higher transaction rate than RAID-3, and offers all the protection of RAID-10, but there are disadvantages as well.
RAID-10 and RAID-01: Same or different?
The difference between RAID-10 and RAID-01 is explained.
RAID, or redundant array of independent disks, can make many smaller disks appear as one large disk to a server for better performance and higher availability.
You have two fabrics running off of two switches. You'd like to make them one fabric. How to do that? For the most part, it's simply connecting the two switches via E_ports.
Before doing that, however, realize there are several factors that can prevent them from merging:
- Incompatible operating parameters such as R_A_TOV and E_D_TOV
- Duplicate domain IDs.
- Incompatible zoning configurations
- No principal switch (priority set to 255 on all switches)
- No response from the switch (hello sent every 30 seconds)
To avoid the issues above:
- Check IPs on all Service Processors and switches; deconflict as necessary.
- Ensure that all switches have unique domain ids.
- Ensure that operating parameters are the same.
- Ensure there aren't any zoning conflicts in the fabric (port zones, etc).
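Some of these checks can be run against an inventory of the switches before any cable is connected. The sketch below is illustrative only: the dictionary keys (name, domain_id, r_a_tov, e_d_tov, priority) are hypothetical, and the values would have to be collected by hand or from the switch management tool. Zoning conflicts are not covered.

```python
from collections import Counter


def merge_blockers(switches):
    """Return a list of problems that would keep the fabrics from merging."""
    problems = []

    # Every switch in the merged fabric needs a unique domain ID.
    counts = Counter(s["domain_id"] for s in switches)
    for domain_id, count in sorted(counts.items()):
        if count > 1:
            problems.append(f"duplicate domain ID {domain_id}")

    # Timeout values must be identical on all switches.
    if len({(s["r_a_tov"], s["e_d_tov"]) for s in switches}) > 1:
        problems.append("R_A_TOV/E_D_TOV mismatch")

    # With priority 255 everywhere, no switch can become principal.
    if switches and all(s["priority"] == 255 for s in switches):
        problems.append("no principal switch (priority 255 on all switches)")

    return problems
```

An empty list means none of the checked conditions was found; it does not guarantee the merge will succeed.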
Once that's done:
- Physically link the switches
- View the active zone set to ensure the merge happens.
- Save the active zone set
- Activate the new zone set.
EMC recommends no more than four Connectrix switches per fabric based on the following formulae:
One switch:
  32 total ports
- 4 ports reserved for card failure
= 28 ports remaining
- int(28/5) = 5 (no more than 4:1 ratio, hosts : FA)
= 23 possible host connections
/ 2 to support multi-pathing
= 11 total host connections
Two switches:
  64 total ports
- 4 ports reserved for card failure
- 4 ports reserved for E_ports
= 56 ports remaining
- int(56/5) = 11 (no more than 4:1 ratio, hosts : FA)
= 45 possible host connections
/ 2 to support multi-pathing
= 22 host connections (gain of 11)
Three switches:
  96 total ports
- 4 ports reserved for card failure
- 12 ports reserved for E_ports
= 80 ports remaining
- int(80/5) = 16 (no more than 4:1 ratio, hosts : FA)
= 64 possible host connections
/ 2 to support multi-pathing
= 32 host connections (gain of 10)
Four switches:
  128 total ports
- 4 ports reserved for card failure
- 24 ports reserved for E_ports
= 100 ports remaining
- int(100/5) = 20 (no more than 4:1 ratio, hosts : FA)
= 80 possible host connections
/ 2 to support multi-pathing
= 40 host connections (gain of 8)
Putting in that fourth connectrix means that you gain only 8 host connections from a 32 port connectrix switch.
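The arithmetic above can be reproduced with a few lines of code. The function name and its parameters are made up for this sketch. The port counts and E_port reservations are the ones listed above.

```python
def host_connections(total_ports, eport_reserve, card_reserve=4):
    """Dual-pathed host connections left after reservations and the 4:1 host:FA ratio."""
    remaining = total_ports - card_reserve - eport_reserve
    fa_ports = remaining // 5          # one FA port for every four host ports
    possible = remaining - fa_ports    # ports left for hosts
    return possible // 2               # two paths per host for multi-pathing


# (total ports, ports reserved for E_ports) for one to four 32-port switches
configs = [(32, 0), (64, 4), (96, 12), (128, 24)]
results = [host_connections(total, eports) for total, eports in configs]
gains = [later - earlier for earlier, later in zip(results, results[1:])]
```

The results list comes out as 11, 22, 32 and 40 host connections, and gains as 11, 10 and 8, which shows the diminishing return of each added switch.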
The current storage industry challenge is to minimize downtime and keep the business running 24 x 7 x 365. The data that drives today’s globally oriented businesses is stored on large networks of interconnected computers and data storage devices. This data must be 100% available and always accessible and up-to-date, even in the face of local or regional disasters. Moreover, these conditions must be met at a cost that is affordable, and without in any way hampering normal company operations.
To reduce the business risk of an unplanned event of this type, an enterprise must ensure that a copy of its business-critical data is stored at a secondary location. Synchronous replication, used so effectively to create perfect copies in local networks, performs poorly over longer distances.
1) Synchronous – Every write transaction committed must be acknowledged from the
secondary site. This method enables efficient replication of data within the local
2) Asynchronous – Every write transaction is acknowledged locally and then added to a
queue of writes waiting to be sent to the secondary site. With this method, some
data will normally be lost in the event of a disaster. This requires the same
bandwidth as a synchronous solution.
3) Snapshot – A consistent image of the storage subsystem is periodically transferred to the secondary site. Only the changes made since the previous snapshot must be transferred, resulting in significant savings in bandwidth. By definition, this solution produces a copy that is not up-to-date; however, increasing the frequency of the snapshots can reduce the extent of this lag.
4) Small-Aperture Snapshot – Kashya’s system offers the unique ability to take frequent snapshots, just seconds apart. This innovative feature is utilized to minimize the risk of data loss due to data corruption that typically follows rolling disasters.
Kashya’s advanced architecture can be summarized as follows:
Positioning at the junction between the SAN and the IP infrastructure enables Kashya
Deploy enterprise-class data protection non-disruptively and non-invasively
Support heterogeneous server, SAN, and storage platforms
Monitor SAN and WAN behavior on an ongoing basis, to maximize the data
Advanced algorithms that:
- Automatically manage the replication process, with strict adherence to user-defined policies that are tied to user-specified business objectives