
EMC Unisphere is next-generation storage management that provides a single, simple interface for both current and future CLARiiON and Celerra series. Unisphere can integrate other element managers and provides built-in support tools, software downloads, live chat, and more. Note: Unisphere features will be available in Q3 2010.

Check out a demo of EMC Unisphere, presented by Bob Abraham:

           


• Task-based navigation and controls offer an intuitive, context-based approach to configuring storage, creating replicas, monitoring the environment, managing host connections, and accessing the Unisphere Support ecosystem.

• Self-service Unisphere support ecosystem is accessible with 1-click from Unisphere, providing users with quick access to “real-time” support tools including live chat support, software downloads, product documentation, best practices, FAQs, online customer communities, ordering spares, and submitting service requests.

• Customizable dashboard views and reporting capabilities enable at-a-glance management by automatically presenting end-users with valuable information in context of how they manage storage. For example, customers can develop custom reports 18x faster with EMC Unisphere.

CLARiiON FLARE release 29 (04.29.000.5.001) introduces support for several new features, as follows:

1) Virtualization-aware Navisphere Manager - Discovery of VMware clients was always difficult in earlier releases, but FLARE 29 enables CLARiiON CX4 users and VMware administrators to reduce infrastructure reporting time from hours to minutes. Earlier releases allowed only a single IP address to be assigned to each iSCSI physical port. With FLARE 29, the ability to define multiple virtual iSCSI ports on each physical port has been added, along with the ability to tag each virtual port with a unique VLAN tag. VLAN tagging has also been added to the single Management Port interface. Note that IP address and VLAN tag assignments should be carefully coordinated with those supporting the network infrastructure where the storage system will operate.

2) Built-in policy-based spin-down of idle SATA II drives for CLARiiON CX4 - Lowers power requirements in environments such as test and development, both physical and virtual. Features include simple management via a “set it and forget it” policy, complete spin-down of inactive drives during periods of zero I/O activity, and automatic spin-up of drives after a "first I/O" request is received.

3) Virtual Provisioning Phase 2 - Support for MirrorView and SAN Copy replication on thin LUNs has been added.

4) Search feature – Provides users with the ability to search for a wide variety of objects across their storage systems. Objects can be either logical (e.g., LUNs) or physical (e.g., disks).

5) Replication roles - Three additional roles have been added in Navisphere: “Local Replication Only”, “Replication”, and “Replication/Recovery”.

6) Dedicated VMware software files - VMware software files (i.e., NaviSecCLI, Navisphere Initialization Wizard) are now separate from those for the Linux operating system.

7) Software filename standardization - all CLARiiON software filenames follow a standardized naming scheme beginning with FLARE release 29.

8) Changing SP IP addresses - SP IP addresses can now be changed without rebooting the SP. Only the Management Server needs to be restarted from the Setup page, which results in no storage system downtime.

9) Linux 64-bit server software – Native 64-bit Linux server software files simplify installation by eliminating the need to gather and load 32-bit DLLs.

10) Solaris x64 Navisphere Host Agent – Release 29 marks the introduction of Solaris 64-bit Navisphere Host Agent software. This Host Agent is backward compatible with older FLARE releases.


Any disk drive from any manufacturer can exhibit sector read errors due to media defects. This is a known and accepted reality in the disk drive industry, particularly with the high recording densities employed by recent products. These media defects only affect the drive’s ability to read data from a specific sector; they do not indicate general unreliability of the disk drive. The disk drives that EMC purchases from its vendors are within specifications for soft media errors according to the vendors as well as EMC’s own Supply Base Management organization.

Prior to shipment from manufacturing, disk drives have a surface scan operation performed that detects and reallocates any sectors that are defective. This operation is run to reduce the possibility that a disk drive will experience soft media errors in operation. Improper handling after leaving EMC manufacturing can lead to the creation of additional media defects, as can improper drive handling during installation or replacement.

When a disk drive encounters trouble reading data from a sector, the drive automatically attempts to recover the data through its various internal methods. Whether or not the drive is eventually successful at reading the sector, it reports the event to FLARE. FLARE in turn logs the event as a “Soft Media Error” (event code 820) and re-allocates the sector to a spare physical location on the drive (this does not affect the logical address of the sector). If the drive was eventually successful at reading the sector (event code 820 with a sub-code of 22), FLARE writes that data directly into the new physical location. If the correct sector data was not available, the event is logged with a sub-code of 05. EMC provides tools such as Sniffer, the FBI tool, and SMART technology to verify disks and examine the details of these soft media errors.
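To get a feel for how often soft media errors are being recovered versus not, the 820 events and their sub-codes can be tallied from an exported SP event log. Below is a minimal sketch; the log file name and line format are assumptions, not a documented FLARE log layout, so the regular expression will likely need adjusting for a real export.

```python
import re
from collections import Counter

# Count "Soft Media Error" (820) events by sub-code in a plain-text SP event
# log export. The line format is assumed, not a documented FLARE layout.
EVENT_RE = re.compile(r"\b820\b(?:.*?sub-?code[^\d]*(\d+))?", re.IGNORECASE)

def count_soft_media_errors(log_path):
    counts = Counter()
    with open(log_path) as log:
        for line in log:
            match = EVENT_RE.search(line)
            if match:
                counts[match.group(1) or "unknown"] += 1
    return counts

if __name__ == "__main__":
    totals = count_soft_media_errors("sp_event_log.txt")  # hypothetical file name
    for subcode, count in sorted(totals.items()):
        # Sub-code 22: data was recovered; sub-code 05: data was not recoverable.
        print(f"event 820 sub-code {subcode}: {count}")
```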

To set up Replication Manager, you must perform the following tasks:

1) Verify that your environment has the minimum required storage hardware and that the hardware has a standard CLARiiON configuration.
2) Confirm that your Replication Manager hosts (server and clients) are connected to the CLARiiON environment through a LAN connection.
3) Zone the fibre switch appropriately (if applicable). The clients must be able to access all storage arrays they are using and the mount hosts must be able to access all storage in the EMC Replication Storage group.
4) Install all necessary software on each Replication Manager client, server, and mount host. Also install the appropriate firmware and software on the CLARiiON array itself.
5) Modify the clarcnfg file to represent all CLARiiON Arrays.
6) On Solaris hosts, verify that there are enough entries in the sd.conf file to support all dynamic mounts of replica LUNs.
7) Install Replication Manager Client software on each client that has an application with data from which you plan to create replicas.
8) Create a new user account on the CLARiiON and give this new account privileges as an administrator. Replication Manager can use this account to access and manipulate the CLARiiON as necessary.
9) Grant storage processor privileges through the Agent tab of the storage processor properties to allow naviCLI.jar commands from the Replication Manager client control daemon (irccd) process to reach the CLARiiON storage array.
10) Update the agent.config file on each client where Replication Manager is installed to include a line of the form user system@<IP address>, where <IP address> is the IP address of a storage processor. You should add a line for both storage processors in each storage array that you are using.
11) Verify that you have Clone Private LUNs set up on your CLARiiON storage array. Create a mount storage group for each mount host and make sure that the storage group contains at least one LUN, and that the LUN is visible to the mount host. This LUN does not have to be dedicated or remain empty; you can use it for any purpose. However, if no LUNs are visible to the Replication Manager mount host, Replication Manager will not operate.
12) Create a storage group named EMC Replication Storage and populate it with free LUNs that you created in advance for Replication Manager to use for storing replicas.
13) Start the Replication Manager Console and connect to your Replication Manager server, then perform the following steps:
a) Register all Replication Manager clients
b) Run Discover Arrays
c) Run Configure Array for each array discovered
d) Run Discover Storage for each array discovered

The following rules and recommendations apply to CX-series systems:
1) You cannot use any of the disks 000 through 004 (enclosure 0, loop 0, disks 0-4) as a hot spare in a CX-series system.
2) The hardware reserves several gigabytes on each of disks 000 through 004 for the cache vault and internal tables. To conserve disk space, you should avoid binding any other disk into a RAID Group that includes any of these disks. Any disk you include in a RAID Group with a vault disk (000-004) is bound to match the lower unreserved capacity, resulting in lost storage of several gigabytes per disk.
3) Each disk in a RAID Group should have the same capacity. All disks in a group are bound to match the smallest-capacity disk, so mixing capacities wastes disk space (see the sketch after this list). The first five drives (000-004) should always be the same size.
4) You cannot mix ATA (Advanced Technology Attachment) and Fibre Channel disk drives within a RAID Group.
5) Hot spares for Fibre Channel drives must be Fibre Channel drives; ATA drives require ATA hot spares.
6) If a storage system will use disks of different speeds (for example, 10K and 15K rpm), EMC recommends that you use disks of the same speed throughout each 15-disk enclosure. The hardware allows one speed change within an enclosure, so if need be you may use disks of differing speeds; place the higher-speed drives in the first (leftmost) drive slots.
7) You should always use disks of the same speed and capacity in any RAID Group.
8) Do not use ATA drives to store boot images of an operating system. You must boot host operating systems from a Fibre Channel drive.
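To see why mixing capacities, or binding data disks together with the vault disks 000-004, wastes space, here is a small illustrative calculation. Only the binding rule itself (every disk in a RAID Group is bound to the smallest unreserved capacity) comes from the rules above; the per-disk vault reserve and disk capacities used below are made-up example numbers, not EMC specifications.

```python
# Illustrative only: shows how binding disks of unequal usable capacity into
# one RAID Group wastes space, per the rules above. Capacities and the vault
# reserve below are example numbers, not EMC specifications.

def usable_per_disk_gb(raw_gb, is_vault_disk, vault_reserve_gb=6.0):
    """Usable capacity of one disk; vault disks (000-004) lose a reserved chunk."""
    return raw_gb - vault_reserve_gb if is_vault_disk else raw_gb

def raid_group_capacity_gb(disks):
    """Every disk in the group is bound to the smallest usable capacity."""
    usable = [usable_per_disk_gb(raw, vault) for raw, vault in disks]
    bound = min(usable)                      # binding rule from the text
    wasted = sum(u - bound for u in usable)  # capacity lost on the larger disks
    return bound * len(disks), wasted

# Five 300 GB data disks plus one vault disk (e.g. disk 004) in the same group:
group = [(300.0, False)] * 5 + [(300.0, True)]
capacity, wasted = raid_group_capacity_gb(group)
print(f"usable group capacity: {capacity:.0f} GB, wasted: {wasted:.0f} GB")
```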

The following are the major configuration steps for the storage, servers, and switches necessary for implementing the CLARiiON.
  1. Install Fibre Channel HBAs in all systems
  2. Install the EMC CLARiiON LP8000 port driver (for Emulex) on all systems
  3. Connect each host to both switches (Brocade/Cisco/McData)
  4. Connect SP1-A and SP2-A to the first switch
  5. Connect SP1-B and SP2-B to the second switch
  6. Note: For HA, you can use a cross-SP connection and connect SPA1 and SPB1 to the first switch and SPB2 and SPA2 to the second switch.
  7. Install the operating system on the Windows/Solaris/Linux/VMware hosts
  8. Connect all hosts to the Ethernet LAN
  9. Install the EMC CLARiiON Agent Configurator/Navisphere Agent on all hosts
  10. Install EMC CLARiiON ATF software on all hosts if you are not using EMC PowerPath failover software; otherwise, install a supported version of EMC PowerPath on all hosts
  11. Install Navisphere Manager on one of the NT hosts
  12. Configure Storage Groups using Navisphere Manager
  13. Assign Storage Groups to hosts as dedicated storage/cluster/shared storage
  14. Install cluster software on the hosts
  15. Test the cluster for node failover
  16. Create RAID Groups with the protection the application requires (RAID 5, RAID 1/0, etc.)
  17. Bind LUNs according to the application device layout requirements
  18. Add LUNs to the Storage Group
  19. Zone the SP ports and host HBAs on both switches
  20. Register the hosts on the CLARiiON using Navisphere Manager
  21. Add all hosts to the Storage Group
  22. Scan for the devices on each host
  23. Label and format the devices on each host

Performance Tuning

Posted by Diwakar

Performance tuning has long been a challenge for system administrators and database administrators. As virtualization continues to grow in every aspect of the IT infrastructure, tuning the OS, database, or storage becomes even more complex.

Apart from CPU power and memory size, the disk subsystem handles the movement of data on the computer system and has a powerful influence on its overall responsiveness. The disk layout must also be designed to provide appropriate data protection, with overall cost in mind.
Planning ahead is the most effective practice for avoiding performance issues later on, while also providing the flexibility to make adjustments before committing changes into production.

Some fundamental disk terminology:

Alignment – Data block addresses compared to RAID stripe addresses
Coalesce – To bunch together multiple smaller I/Os into one larger one
Concurrency – Multiple threads writing to disk simultaneously
Flush – Data in cache written to disk
Multi-pathing – Concurrent paths to the same disk storage

How to choose a RAID type:

The concept of RAID is comprehensible to most in the storage industry. To extract the best disk performance, choosing the right RAID type based on I/O patterns is very important. The most commonly observed I/O patterns are listed later in the article. Since RAID 5 and RAID 1/0 are the most commonly used RAID types in the industry, let's focus on these two for now.



RAID 1/0 – This RAID type works best for random I/O patterns, especially for write-intensive applications. If writes are above 20 percent, go with RAID 1/0.

RAID 5 – Compared to RAID 1/0, for the same number of physical spindles, the performance of the two is very close in a read-heavy environment. For instance, a 2+2 RAID 1/0 (four disks total) will perform similarly to a 3+1 RAID 5 (four disks total).

On the other hand, if one can afford to ignore the number of physical spindles and consider only the usable capacity (RAID 1/0 3+3 vs. RAID 5 3+1), RAID 1/0 is the way to go.

RAID 1/0 has a higher cost associated with it, while RAID 5 provides more efficient use of disk space. The main drawback of RAID 5 is the re-sync time after a disk replacement.

Number of Operations per RAID type:

RAID 1 and RAID 1/0 require that two disks be written for each host-initiated write:
Total IO = host reads + 2 × host writes
RAID 5 (4+1) requires four operations per host write; if the data is sequential, it can be done as one large stripe write.
A RAID 5 write requires two reads and two writes:
Total IO = host reads + 4 × host writes
(See the worked example below.)
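As a worked example of the formulas above, this sketch estimates the back-end disk I/O generated by a given host workload for RAID 1/0 versus RAID 5 (4+1), assuming small-block random I/O and ignoring cache effects; the host IOPS figures are arbitrary illustration values.

```python
# Back-end disk I/O estimate from the formulas above (small-block random I/O,
# no cache effects). Host IOPS numbers are arbitrary illustration values.

def backend_iops_raid10(host_reads, host_writes):
    # RAID 1/0: each host write lands on two disks.
    return host_reads + 2 * host_writes

def backend_iops_raid5(host_reads, host_writes):
    # RAID 5 write penalty: 2 reads + 2 writes per host write.
    return host_reads + 4 * host_writes

host_reads, host_writes = 2000, 1000   # example workload, roughly 2:1 reads to writes
print("RAID 1/0 back-end IOPS:", backend_iops_raid10(host_reads, host_writes))  # 4000
print("RAID 5   back-end IOPS:", backend_iops_raid5(host_reads, host_writes))   # 6000
```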

To re-cap,
- Parity RAID operations increase the disk load; for similar capacity, go for RAID 1/0
- RAID 5 is better than RAID 1/0 for large sequential I/O
- RAID 3 is more effective now that cache can be used with RAID 3
- RAID 1/0 is best for mixed I/O types

Commonly noticed Database IO patterns:

OLTP Log – sequential
OLTP – Data – random
Bulk Insert - sequential
Backup – sequential read/write
Restore – sequential read/write
Re-index - sequential read/write
Create Database - Sequential read


Knowing your I/O personality is important when choosing a RAID type. Using the above notes and the application's I/O patterns, one can make a choice on the disk layout. Some characteristics of the I/O to consider are listed below (a rough decision-helper sketch follows the list):
- IO size
- IO Read/Write ratio
- Type of IO - Random v/s Sequential
- Snapshots / Clones
- Application type – OLTP ?
- Bandwidth requirements
- Estimated Growth
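The rules of thumb above can be folded into a rough decision helper. The sketch below simply encodes the guidance already stated in this article (random, write-heavy workloads above roughly 20 percent writes favor RAID 1/0; large sequential I/O favors RAID 5); it is illustrative only, not an EMC sizing tool.

```python
# Rough RAID-type chooser based only on the rules of thumb in this article.
# Thresholds mirror the text (writes above ~20% of random I/O -> RAID 1/0);
# illustrative only, not an EMC sizing tool.

def suggest_raid_type(write_ratio, sequential):
    """write_ratio: fraction of I/Os that are writes; sequential: True/False."""
    if sequential:
        return "RAID 5"            # better for large sequential I/O
    if write_ratio > 0.20:
        return "RAID 1/0"          # random, write-intensive
    return "RAID 5 or RAID 1/0"    # read-heavy random: performance is close

print(suggest_raid_type(write_ratio=0.35, sequential=False))  # RAID 1/0
print(suggest_raid_type(write_ratio=0.10, sequential=True))   # RAID 5
```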


Once you decide the type of RAID to use, you can fine-tune the disk system by following vendor-recommended practices such as:

-- LUN distribution
-- Distribute the I/O load evenly across available disk drives
-- Avoid using primary and BCV/snapshot LUNs on the same physical spindles. (The best way to avoid this is to have separate disk groups for primary data disks and BCVs/clones/snapshots)
-- Consider using metaLUNs or host striping

Cache:
Disk writes are more costly and thus must be given a bigger share of the cache.
Match the cache page size to the I/O size to prevent multiple I/Os.

Stripe Size:
Retain the default stripe size of 64 KB.
The larger the stripe size, the more cache is required and the longer a rebuild takes.

FC or iSCSI:
FC – best for large I/O and high bandwidth
FC – more expensive
iSCSI – lowest cost
iSCSI – works best for OLTP and small-block I/O

Some Best Practice recommendations from Microsoft:

Microsoft has laid out a few guidelines for designing SQL Server database storage:
Use RAID1+0 for log files
Isolate log from data at physical disk level
Use RAID 1+0 for tempdb

Revisiting the above recommendations periodically to stay on track will go a long way toward extracting the best from your disks.

--Contributed by Suraj Kawlekar


For best performance with most applications, each SP should have its maximum amount of cache memory and you should use the default settings for the cache properties. Analyzer shows how the cache affects the storage system, and lets you tune the cache properties to best suit your application.

A storage-system cache has two parts: a read cache and a write cache. The read cache uses a read-ahead mechanism that lets the storage system prefetch data from the disk, so the data is ready in the cache when the application needs it. The write cache buffers and optimizes writes by absorbing peak loads, combining small writes, and eliminating rewrites.

You can change read cache size, write cache size, and cache page size to achieve optimal performance. The best sizes of the read and write caches depend on the read/write ratio. A general norm for the ratio of reads to writes is two reads per write; that is, reads represent 66 percent of all I/Os.

Since the contents of write cache are available for read operations as well, you should allocate most of the available SP memory to the write cache. However, since the write cache is flushed after a certain timeout period, a read cache is also required to hold active data for longer periods of time.

Read cache size

The read cache holds data that is expected to be accessed in the near future. If a request for data that is in the cache arrives, the request can be serviced from the cache faster than from the disks. Each request satisfied from cache eliminates the need for a disk access, reducing disk load. If the workload exhibits a “locality of reference” behavior, where a relatively small set of data is accessed frequently and repeatedly, the read cache can improve performance. In read-intensive environments, where more than 70 percent of all requests are reads, the read cache should be large enough to accommodate the dataset that is most frequently accessed. For sequential reads from a LUN, data that is expected to be accessed by subsequent read requests is read (prefetched) into the cache before being requested. Therefore, for optimal performance, the read cache should be large enough to accommodate prefetched data for sequential reads from each LUN.

Write cache size

The write cache serves as a temporary buffer where data is held before it is written to the disks. Cache writes are far faster than disk writes. Also, write-cached data is consolidated into larger I/Os when possible and written to the disks more efficiently. (This reduces the expensive small writes in the case of RAID 5 LUNs.) In addition, when data is modified frequently, it is overwritten in the cache and written to the disks only once for several updates in the cache, which reduces disk load. Consequently, the write cache absorbs write data during heavy load periods and writes it to the disks, in an optimal fashion, during light load periods. However, if the amount of write data during an I/O burst exceeds the write cache size, the cache fills. Subsequent requests must wait for cached data to be flushed and for cache pages to become available for writing new data.

The write cache provides sustained write speed by combining sequential RAID 5 write operations and writing them in RAID 3 mode. This eliminates the need to read old data and parity before writing the new data. To take advantage of this feature, the cache must have enough space for one entire stripe of sequential data (typically 64 KB x [number-of-disks -1], or, for a five-disk group, 256 KB) before starting to flush. Note that the sequential stream can be contained in other streams of sequential or random data.
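As a quick worked example of the stripe math above, the following sketch computes the full-stripe write size for a RAID 5 group from the 64 KB stripe element size cited in the text; the helper function name is just for illustration.

```python
# Full-stripe write size for a RAID 5 group, per the text:
# stripe element (64 KB) x (number of disks - 1 parity disk).

STRIPE_ELEMENT_KB = 64

def full_stripe_kb(disks_in_group):
    return STRIPE_ELEMENT_KB * (disks_in_group - 1)

print(full_stripe_kb(5))  # five-disk (4+1) group -> 256 KB, matching the text
print(full_stripe_kb(9))  # nine-disk (8+1) group -> 512 KB
```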

Cache page size

This can be 2, 4, 8, or 16 KB. As a general guideline, EMC suggests 8 KB. The ideal cache page size depends on the operating system and application. Analyzer can help you decide which size performs best.

The Storage Processor (SP) processes all I/Os, host requests, management and maintenance tasks, as well as operations related to replication or migration features.

In Navisphere Analyzer, the statistics for an SP are based on the I/O workload from its attached hosts and reflect the overall performance of the CLARiiON storage system. The following performance metrics are monitored for each CLARiiON storage system; a small illustrative sketch of how the simplest of them can be derived from raw counters follows the definitions.

A LUN is an abstract object whose performance depends on various factors. The primary consideration is whether a host I/O can be satisfied by the cache. A cache hit does not require disk access; a cache miss requires one or more disk accesses to complete the data request.

As the slowest devices in a storage system, disk drives are often responsible for performance-related issues. Therefore, we recommend that you pay close attention to disk drives when analyzing performance problems.

SP performance metrics

Utilization (%)

The percentage of time during which the SP is servicing any request.

Total Throughput (I/O/sec)

The average number of host requests that are passed through the SP per second, including both read and write requests.


Read Throughput (I/O/sec)

The average number of host read requests that are passed through the SP per second.

Write Throughput (I/O/sec)

The average number of host write requests that are passed through the SP per second.

Read Bandwidth (MB/s)

The average amount of host read data in Mbytes that is passed through the SP per second.

Write Bandwidth (MB/s)

The average amount of host write data in Mbytes that is passed through the SP per second.

LUN performance metrics

Response Time (ms)

The average time, in milliseconds, that a request to a LUN is outstanding, including waiting time.

Total Throughput (I/O/sec)

The average number of host requests that are passed through the LUN per second, including both read and write requests.

Read Throughput (I/O/sec)

The average number of host read requests passed through the LUN per second.

Write Throughput (I/O/sec)

The average number of host write requests passed through the LUN per second.

Read Bandwidth (MB/s)

The average amount of host read data in Mbytes that is passed through the LUN per second.

Write Bandwidth (MB/s)

The average amount of host write data in Mbytes that is passed through the LUN per second.

Average Busy Queue Length

The average number of outstanding requests when the LUN was busy. This does not include idle time.

Utilization (%)

The fraction of an observation period during which a LUN has any outstanding requests.

DISK performance metrics

Utilization (%)

The percentage of time that the disk is servicing requests.

Response Time (ms)

The average time, in milliseconds, that it takes for one request to pass through the disk, including any waiting time.

Total Throughput (I/O/sec)

The average number of requests to the disk on a per second basis. Total throughput includes both read and write requests.

Read Throughput (I/O/sec)

The average number of read requests to the disk per second.

Write Throughput (I/O/sec)

The average number of write requests to the disk per second.

Read Bandwidth (MB/s)

The average amount of data read from the disk in Mbytes per second.

Write Bandwidth (MB/s)

The average amount of data written to the disk in Mbytes per second.

Average Busy Queue Length

The average number of requests waiting at a busy disk to be serviced, including the request that is currently in service.
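Analyzer reports these metrics directly, but the arithmetic behind the simplest of them is straightforward. The sketch below derives throughput, bandwidth, and utilization from two counter samples taken one polling interval apart; the counter names and sample values are hypothetical placeholders, not the Navisphere Analyzer data format.

```python
# Illustrative derivation of a few of the metrics above from two counter
# samples taken one polling interval apart. Counter names are hypothetical
# placeholders, not the Navisphere Analyzer data format.

def derive_metrics(prev, curr, interval_s):
    reads    = curr["read_ios"]     - prev["read_ios"]
    writes   = curr["write_ios"]    - prev["write_ios"]
    read_mb  = curr["read_mbytes"]  - prev["read_mbytes"]
    write_mb = curr["write_mbytes"] - prev["write_mbytes"]
    busy_s   = curr["busy_seconds"] - prev["busy_seconds"]
    return {
        "total_throughput_iops": (reads + writes) / interval_s,
        "read_throughput_iops":  reads / interval_s,
        "write_throughput_iops": writes / interval_s,
        "read_bandwidth_mbps":   read_mb / interval_s,
        "write_bandwidth_mbps":  write_mb / interval_s,
        "utilization_pct":       100.0 * busy_s / interval_s,
    }

prev = {"read_ios": 0, "write_ios": 0, "read_mbytes": 0, "write_mbytes": 0, "busy_seconds": 0}
curr = {"read_ios": 60000, "write_ios": 30000, "read_mbytes": 1500, "write_mbytes": 750, "busy_seconds": 45}
print(derive_metrics(prev, curr, interval_s=60))
```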

CLARiiON SP, LUN, and disk performance data is retrieved and processed daily. Raw performance data is kept for a longer term (180 days), and CLARiiON performance reports are kept indefinitely for performance trend analysis.


EMC RecoverPoint

Posted by Diwakar

EMC RecoverPoint is a comprehensive data protection and data replication solution for the entire data center. RecoverPoint provides local replication using continuous data protection (CDP) and remote replication using continuous remote replication (CRR) of the same data. RecoverPoint protects companies from data loss by enabling local recovery from common problems such as server failures, data corruption, software errors, viruses, and end-user errors. RecoverPoint also incorporates remote recovery to protect against catastrophic data loss events that can bring an entire data center to a standstill. Enterprise performance, scalability, and instant recovery are combined to guarantee data consistency with recovery that takes seconds or minutes instead of hours or days.

RecoverPoint offers bi-directional local and remote replication with no distance limitation, guaranteed data consistency, and advanced bandwidth reduction technology designed to dramatically reduce WAN bandwidth requirements and associated costs.

• End-to-end data protection – continuously protect data locally and remotely

• Any point-in-time recovery - ultimate flexibility in selecting the optimal recovery point using user-specified or application-specific bookmarks.

• Unique bandwidth compression and bi-directional replication, combined with write-order consistency, enable application restartability.

• Heterogeneous data protection - one data protection and remote replication solution for CLARiiON, Symmetrix, and third-party arrays.

• Intelligent agents for Microsoft Exchange and SQL Server facilitate intelligent protection and recovery, dramatically reducing application recovery time and minimizing or eliminating data loss.

• Integration with CLARiiON CX3 array-based splitter to simplify local and remote replication.

• Integration with EMC Connectrix intelligent-fabric solutions using Brocade and Cisco technology.

Gold Copy represents a virtual point-in-time copy of the secondary LUN before the secondary LUN is updated with the changes from the primary LUN. Since it is a pointer-based virtual copy it only consumes space in the reserved LUN pool based on the amount of changes on the secondary LUN. During the secondary LUN update, the Gold Copy tracks all of the updates. When a region on the secondary is updated, the original region is copied to the reserved LUN pool to preserve a consistent point-in-time view of the secondary LUN at the time of start of update. This is a key feature that always ensures a consistent view of the secondary LUN. If the update from primary to secondary is interrupted due to a link failure or due to failure at the primary site (during the update), the Gold Copy is used by MirrorView/A software to rollback the partial update on the secondary LUN and return it to its previous consistent state.
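The Gold Copy behavior described above amounts to a copy-on-first-write of each secondary region into the reserved LUN pool, so a partially applied update can be rolled back to the consistent point-in-time image. Below is a minimal conceptual sketch of that mechanism; it models LUN regions as a Python dictionary and is purely illustrative of the description above, not MirrorView/A code.

```python
# Conceptual sketch of the Gold Copy behavior described above: before a
# secondary region is overwritten during an update, its original contents are
# preserved in a reserved pool so a partial update can be rolled back.
# Purely illustrative; not MirrorView/A code.

class SecondaryWithGoldCopy:
    def __init__(self, regions):
        self.regions = dict(regions)   # region number -> data
        self.reserved_pool = {}        # original data saved on first write

    def apply_update(self, region, new_data):
        if region not in self.reserved_pool:
            # Copy-on-first-write: preserve the point-in-time image.
            self.reserved_pool[region] = self.regions[region]
        self.regions[region] = new_data

    def rollback(self):
        # Link failure mid-update: restore the consistent point-in-time view.
        self.regions.update(self.reserved_pool)
        self.reserved_pool.clear()

sec = SecondaryWithGoldCopy({0: "A0", 1: "B0", 2: "C0"})
sec.apply_update(1, "B1")   # update interrupted after one region
sec.rollback()
print(sec.regions)          # {0: 'A0', 1: 'B0', 2: 'C0'} -> back to the consistent image
```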

Delta Set: MirrorView/A uses asynchronous writes, which means that I/O is not sent to the remote site at the same time as the host I/O; instead, a set of LUNs is replicated periodically to a remote storage system. The Delta Set is created and changes are tracked during a MirrorView/A replication cycle. MirrorView/A replicates only the last changed blocks during a replication cycle, resulting in lower bandwidth requirements than synchronous or traditional native ordered-write techniques. The Delta Set is, in reality, a local snap taken at the source side at the time of replication.
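The Delta Set described above can be pictured as a changed-block map that accumulates during a cycle and is shipped at the end of it, so a block written many times is transferred only once. The sketch below is a rough conceptual illustration under that reading, not MirrorView/A code; the class and method names are made up.

```python
# Rough sketch of the Delta Set behavior described above: changes are tracked
# during a cycle and only the last contents of each changed block are shipped,
# which is what lowers the bandwidth requirement. Illustrative only.

class DeltaSetTracker:
    def __init__(self):
        self.delta = {}                # block number -> latest data this cycle

    def record_write(self, block, data):
        self.delta[block] = data       # repeated writes overwrite in place

    def end_cycle(self):
        shipped, self.delta = self.delta, {}
        return shipped                 # what actually crosses the WAN

tracker = DeltaSetTracker()
for data in ("v1", "v2", "v3"):
    tracker.record_write(block=42, data=data)   # same block written 3 times
print(tracker.end_cycle())             # {42: 'v3'} -> only one block shipped
```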




