Chapter 3. IBM SDS product offerings


This chapter gives an overview of IBM software-defined storage (SDS) product offerings, with a focus on the IBM Spectrum Storage family of products and their capabilities and benefits.

The products that are discussed are organized by the management capabilities and the type of solution they provide, whether block, file, or object.

This chapter includes the following sections:

•SDS architecture

•IBM Spectrum Connect

•IBM Spectrum Control

•IBM Virtual Storage Center

•IBM Storage Insights

•IBM Copy Services Manager

•IBM Spectrum Protect

•IBM Spectrum Protect Snapshot

•IBM Spectrum Copy Data Management

•Block, file, and object storage

•IBM Block Storage solutions

•IBM File Storage solutions

•IBM Object Storage solutions

•IBM storage support of OpenStack components

3.1 SDS architecture

Each solution in the IBM Spectrum Storage family is classified by its primary function as either a control plane or a data plane solution.

Figure 3-1 shows the IBM SDS architecture with a mapping of the IBM Spectrum Storage family of products across the SDS control plane and data plane.

Figure 3-1 IBM Spectrum Storage family mapped to SDS Control Plane and Data Plane

Table 3-1 lists the members of the IBM Spectrum Storage family with a high-level description of the function that each product provides.

Table 3-1 IBM Spectrum Storage Family descriptions

	IBM Spectrum Storage family member	Description
SDS Control Plane
	IBM Spectrum Connect	Simplifies multi-cloud deployment across your IBM enterprise storage systems. Formerly IBM Spectrum Control™ Base. Facilitates API connections for components, such as VMware and containers.
	IBM Spectrum Control	Automated control and optimization of storage and data infrastructure.
	IBM Storage Insights	Analytics-driven, storage resource management solution that is delivered from the cloud in a SaaS model. Provides for Cognitive Storage Management capabilities for clients with IBM Storage.
	IBM Copy Services Manager	Automated control and optimization of storage replication features.
	IBM Spectrum Protect™	Optimized data protection for client data through backup and restore capabilities.
	IBM Spectrum Protect Plus	Data protection for Virtual Machines.
	IBM Spectrum Protect Snapshot	Integrated application-aware point-in-time copies.
	IBM Spectrum Copy Data Management	Automate creation and use of copy data snapshots, vaults, clones, and replicas on existing storage infrastructure.
SDS Data Plane
	IBM Spectrum Virtualize™	Core SAN Volume Controller function is virtualization that frees client data from IT boundaries.
	IBM Spectrum Accelerate	Enterprise storage for cloud that is deployed in minutes instead of months.
	IBM Spectrum Scale	Storage scalability to yottabytes and across geographical boundaries.
	IBM Spectrum NAS	Remote and departmental file share.
	IBM Cloud Object Storage	Object storage solution that delivers scalability across multiple geographies.
	IBM Spectrum Archive™	Enables long-term storage of low activity data.
	IBM Spectrum Storage Suite	A single software license for all your changing software-defined storage needs. Straightforward per-TB pricing for the entire IBM Spectrum Storage suite. Includes IBM Spectrum Accelerate, IBM Spectrum Scale, IBM Spectrum Protect, IBM Spectrum Control, IBM Spectrum Archive, and IBM Cloud Object Storage.
IBM Spectrum Storage Solutions
	IBM Spectrum Access Blueprint	Enterprise data management and protection in hybrid and multi-cloud environments.

3.1.1 SDS control plane

The control plane is a software layer that manages the virtualized storage resources. It provides all the high-level functions that are needed by the customer to run the business workload and enable optimized, flexible, scalable, and rapid provisioning storage infrastructure capacity. These capabilities span many functions, such as storage virtualization, policies automation, analytics and optimization, backup and copy management, security, and integration with the API services, including other cloud provider services.

3.1.2 SDS data plane

The data plane encompasses the infrastructure where data is processed. It consists of all basic storage management functions such as virtualization, RAID protection, tiering, copy services (remote, local, synchronous, asynchronous, and point-in-time), encryption, compression, and data deduplication that can be started and managed by the control plane. The data plane is the interface to the hardware infrastructure where the data is stored. It provides a complete range of data access possibilities, spanning traditional access methods like block I/O (for example, Fibre Channel or iSCSI) and File I/O (POSIX compliant), to object-storage and Hadoop Distributed File System (HDFS).

3.2 IBM Spectrum Connect

IBM Spectrum Connect is a centralized server system that consolidates a range of IBM storage provisioning, automation, and monitoring solutions through a unified server platform.

IBM Spectrum Connect provides support for the following functions:

•VMware environments with IBM Storage integration for advanced features and APIs

•Container orchestration with Kubernetes including IBM Cloud Private

•PowerShell cmdlets for provisioning and managing IBM Storage systems

3.2.1 VMware integration

As shown in Figure 3-2, IBM Spectrum Connect provides a single-server back-end location and enables centralized management of IBM storage resources for the use of independent software vendor (ISV) platforms and frameworks. These frameworks currently include VMware vCenter Server, VMware vSphere Web Client, and VMware vSphere Storage APIs for Storage Awareness (VASA). IBM Spectrum Connect is available for no extra fee to storage-licensed clients.

Figure 3-2 IBM Spectrum Connect

As shown in Figure 3-2, IBM Spectrum Connect is not in the data path. IBM Spectrum Connect runs in the control plane, as shown in Figure 3-1. IBM Spectrum Connect provides integration between IBM Block Storage and VMware. Clients use IBM Spectrum Connect if they are using, or plan to use, the VMware Web Client (VWC), VMware Virtual Volumes (VVol), or the vRealize Automation Suite from VMware.

IBM Spectrum Connect provides common services, such as authentication, high availability (HA), and storage configuration for IBM Block Storage in homogeneous and heterogeneous multiple target environments. IBM Spectrum Connect manages IBM XIV Storage System, A9000, A9000R, IBM DS8000 series, IBM SAN Volume Controller, the IBM Storwize® family, and third-party storage subsystems.

IBM Storage connectivity to VMware through IBM Spectrum Connect is shown in Figure 3-3.

Figure 3-3 IBM Storage connectivity to VMware through IBM Spectrum Connect

3.2.2 PowerShell

IBM developed multiple PowerShell cmdlets for provisioning and managing IBM storage systems through trusted PowerShell commands to the devices. These cmdlets are included in the IBM Storage Automation Plug-in for PowerShell, which is deployed on a PowerShell host that uses IBM Spectrum Connect as the common user interface. The capabilities can also be used with PowerCLI to automate storage-related tasks for Microsoft environments that are managed in VMware vSphere.

3.2.3 Containers and Kubernetes

IBM Spectrum Connect enables container orchestration with Kubernetes, including IBM Cloud Private solutions. It simplifies the provisioning of storage for containers by defining policies by SLA or by workload. It presents multiple, varied IBM storage systems through a single interface. It also provides better storage visibility to improve troubleshooting in containerized environments. For more information about the integration with containers, see the IBM Block Storage solutions section of this chapter.

For more information about IBM Spectrum Connect, see this website:

https://www.ibm.com/us-en/marketplace/spectrum-connect

3.3 IBM Spectrum Control

IBM Spectrum Control provides efficient infrastructure management for virtualized, cloud, and software-defined storage by reducing the complexity that is associated with managing multivendor infrastructures. It also helps businesses optimize provisioning, capacity, availability, protection, reporting, and management for today’s business applications without having to replace existing storage infrastructure. With support for block, file, and object workloads, IBM Spectrum Control enables administrators to provide efficient management for heterogeneous storage environments.

3.3.1 Key capabilities

IBM Spectrum Control helps organizations transition to new workloads and updated storage infrastructures by providing the following advantages to significantly reduce total cost of ownership:

•A single management console that supports IBM Spectrum Virtualize, IBM Spectrum Accelerate, IBM Cloud Object Storage, and IBM Spectrum Scale environments, enabling holistic management of physical and virtual block, file, and object storage environments.

•Insights that offer advanced, detailed metrics for storage configurations, performance, and tiered capacity in an intuitive web-based user interface with customizable dashboards so that the most important information is always accessible.

•Performance monitoring views that enable quick and efficient troubleshooting during an issue with simple threshold configuration and fault alerting for HA.
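The threshold-based alerting in the last item can be pictured with a minimal sketch. The metric names, the limit values, and the `check_thresholds` function are hypothetical illustrations and are not part of the IBM Spectrum Control interface; the sketch only shows the general idea of comparing a performance sample against warning and critical limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str      # hypothetical metric name, for example "response_time_ms"
    warning: float   # value at or above which a warning is raised
    critical: float  # value at or above which a critical alert is raised


def check_thresholds(sample, thresholds):
    """Return (metric, severity, value) tuples for every violated threshold."""
    alerts = []
    for t in thresholds:
        value = sample.get(t.metric)
        if value is None:
            continue  # metric not reported in this sample
        if value >= t.critical:
            alerts.append((t.metric, "critical", value))
        elif value >= t.warning:
            alerts.append((t.metric, "warning", value))
    return alerts
```

A caller would pass one performance sample per monitored resource and forward any returned alerts to its notification channel.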

3.3.2 Benefits

IBM Spectrum Control can help reduce the administrative complexity of managing a heterogeneous storage environment, improve capacity forecasting, and reduce the amount of time spent troubleshooting performance-related issues. IBM Spectrum Control provides the following key values:

•Transparent mobility across storage tiers and devices for IBM Spectrum Virtualize based designs

•Centralized management that offers visibility to block, file, and object workloads and control and automation of block storage volumes

The IBM Spectrum Control dashboard window, in which all of the managed resources in a data center are presented in an aggregated view, is shown in Figure 3-4.

Figure 3-4 Single dashboard for monitoring all storage components

3.3.3 IBM Data and Storage Management Solutions features

IBM Spectrum Control solutions provide improved visibility, simplified administration, and greater scalability. This section describes the features of the specific products that provide the functions for IBM Spectrum Control.

Note: The Management Layer of VSC is now called IBM Spectrum Control Advanced Edition.

The IBM Spectrum Control offerings are shown in Figure 3-5.

Figure 3-5 IBM Spectrum Control offerings

3.3.4 IBM Spectrum Control Standard Edition

IBM Spectrum Control Standard Edition is designed to provide storage infrastructure and data management capabilities for traditional and software-defined storage environments. IBM Spectrum Control Standard Edition includes the following primary features:

•Capacity visualization and management

•Performance reporting and troubleshooting

•Health and performance alerting

•Data Path view

•Department and Application grouping

•Hypervisor integration with VMware

• Management of IBM Replication features with Copy Services Manager (CSM)

3.3.5 IBM Spectrum Control Advanced Edition

IBM Spectrum Control Advanced Edition includes all of the features of IBM Spectrum Control Standard Edition, and adds the following advanced capabilities:

•Tiered storage optimization with intelligent analytics for IBM Spectrum Virtualize

•Service catalog with policy-based provisioning

•Self-service provisioning with restricted use logins

•Application-based snapshot management from IBM Spectrum Protect Snapshot

Advanced Edition includes the following built-in efficiency features that help users avoid complicated integration issues or the need to purchase add-ons or extra licenses:

•Simplified user experience: Virtual Storage Center provides an advanced GUI where administrators can perform common tasks consistently over multiple storage systems, including those systems from different vendors. The IBM storage GUI enables simplified storage provisioning with intelligent presets and embedded best practices, and integrated context-sensitive performance management.

•Near-instant, application-aware backup and restore: To reduce downtime in high-availability virtual environments, critical applications, such as mission-critical databases or executive email, require near-instant backups that have little or no impact on application performance. Application-aware snapshot backups can be performed frequently throughout the day to reduce the risk of data loss. Virtual Storage Center simplifies administration and recovery from snapshot backups through the inclusion of IBM Spectrum Protect Snapshot.

IBM Spectrum Protect Snapshot, previously known as IBM Tivoli® Storage FlashCopy® Manager, is designed to deliver data protection for business-critical applications through integrated application snapshot backup and restore capabilities. These capabilities are achieved through the utilization of advanced storage-specific hardware snapshot technology to help create a high-performance, low-impact, application data protection solution. It is designed for easy installation, configuration, and deployment, and integrates with various traditional storage systems and software-defined storage environments.

•IBM Tiered Storage Optimizer: Virtual Storage Center uses performance metrics, advanced analytics, and automation to enable storage optimization on a large scale. Self-optimizing storage adapts automatically to workload changes to optimize application performance, eliminating most manual tuning efforts. It can optimize storage volumes across different storage systems and virtual machine vendors. The Tiered Storage Optimizer feature can reduce the unit cost of storage by as much as 50 percent, based on deployment results in a large IBM data center.

IBM Spectrum Control Advanced Edition is data and storage management software for managing heterogeneous storage infrastructures. It helps to improve visibility, control, and automation for data and storage infrastructures. Organizations with multiple storage systems can simplify storage provisioning, performance management, and data replication.

IBM Spectrum Control Advanced Edition simplifies the following data and storage management processes:

•A single console for managing all types of data on disk, flash, file, and object storage systems.

•Simplified visual administration tools that include an advanced web-based user interface, the ability to see servers and connected hypervisors (such as VMware) and IBM Cognos® Business Intelligence with pre-designed reports.

•Storage and device management to give you fast deployment with agent-less device management.

•Intelligent presets that improve provisioning consistency and control.

•Integrated performance management features end-to-end views that include devices, SAN fabrics, and storage systems. The server-centric view of storage infrastructure enables fast troubleshooting.

•Data replication management that enables you to have remote mirror, snapshot, and copy management, and supports Windows, Linux, UNIX, and IBM z Systems® data with the inclusion of IBM Copy Services Manager.

IBM Spectrum Control enables multi-platform storage virtualization and data and storage management. It supports most storage systems and devices by using the Storage Networking Industry Association (SNIA) Storage Management Initiative Specification (SMI-S), versions 1.0.2, 1.1, and 1.5 and later.

Hardware and software interoperability information is provided on the IBM Support Portal for IBM Spectrum Control. For more information about the interoperability matrix, see this website:

http://www.ibm.com/support/docview.wss?uid=swg27047049

Advanced Edition enables you to adapt to the dynamic storage needs of your applications by providing storage virtualization, automation, and integration for cloud environments, including the following features:

•OpenStack cloud application provisioning: Advanced Edition includes an OpenStack Cinder volume driver that enables automated provisioning using any of the heterogeneous storage systems that are controlled by IBM Cloud Orchestrator or Virtual Storage Center. OpenStack cloud applications can access multiple storage tiers and services without adding complexity.

•Self-service portal: Advanced Edition can provide provisioning automation for self-service storage portals, which enables immediate responses to service requests while eliminating manual administration tasks.

•Pay-per-use invoicing: Advanced Edition now includes a native chargeback tool. This tool allows customers to create chargeback or showback reports from the native GUI, or work with more advanced reporting as part of the embedded Cognos engine that is also included for building custom reports.

IBM Cognos-based reporting helps create and integrate custom reports about capacity, performance, and utilization. IBM Spectrum Control provides better reporting and analytics with no extra cost through integration with Cognos reporting and modeling.

Some reporting is included. Novice users can rapidly create reports with the intuitive drag-and-drop function. Data abstraction and ad hoc reporting make it easy to create high-quality reports and charts. You can easily change the scaling and select sections for both reporting and charting. Reports can be generated on schedule or on demand in multiple distribution formats, including email.

IBM Spectrum Control provides better user management and integration with external user repositories, such as Microsoft Active Directory. Enhanced management for virtual environments provides enhanced reporting for virtual servers (VMware). Tiered Storage Optimization provides integration with the existing storage optimizer and storage tiering reporting. Tiered Storage Optimization is policy-driven information lifecycle management (ILM) that uses virtualization technology to provide recommendations for storage relocation.

It provides recommendations for workload migration based on user-defined policy that is based on file system level data, performance, and capacity utilization. This feature ensures that only the highest performing workloads are allocated to the most expensive storage.
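A policy-driven tiering recommendation of this kind can be sketched as follows. This is only an illustration under stated assumptions: the `Volume` fields, the use of I/O density (IOPS per GiB) as the deciding metric, and the policy format are all hypothetical, and the source does not describe how the Tiered Storage Optimizer actually computes its recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    name: str
    tier: int            # current tier; 1 is the fastest and most expensive
    capacity_gib: float  # assumed to be greater than zero
    avg_iops: float


def recommend_tier(volume, policy):
    """Pick the target tier from I/O density (IOPS per GiB).

    `policy` is a list of (min_io_density, tier) pairs sorted from the
    highest density to the lowest; the first match wins.
    """
    density = volume.avg_iops / volume.capacity_gib
    for min_density, tier in policy:
        if density >= min_density:
            return tier
    return policy[-1][1]  # fall back to the lowest tier in the policy


def recommendations(volumes, policy):
    """Return (name, current_tier, target_tier) for volumes that should move."""
    result = []
    for v in volumes:
        target = recommend_tier(v, policy)
        if target != v.tier:
            result.append((v.name, v.tier, target))
    return result
```

The function only produces recommendations; acting on them (for example, migrating a volume) is left to the administrator or to separate automation.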

IBM Spectrum Control in a VMware environment

IBM Spectrum Control supports the following functions in VMware:

•Acts as a control plane for storage reporting and provides the capability to provision IBM Spectrum Virtualize based architectures.

•Provides a view into the connected VMware servers and their virtual machines.

IBM Spectrum Control in an OpenStack environment

The IBM Spectrum Control OpenStack Cinder driver enables your OpenStack-powered cloud environment to use your IBM Spectrum Control installation for block storage provisioning.

IBM Spectrum Control provides block storage provisioning capabilities that a storage administrator can use to define the properties and characteristics of storage volumes within a particular service class. For example, a block storage service class can define RAID levels, tiers of storage, and various other storage characteristics.
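To make the service class idea concrete, the following sketch models a service class as a set of required characteristics and selects storage pools that satisfy them. The class names, fields, and the `candidate_pools` function are hypothetical; they do not reflect the IBM Spectrum Control data model or the Cinder driver interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceClass:
    name: str
    raid_level: str
    tier: int
    encrypted: bool = False


@dataclass(frozen=True)
class Pool:
    name: str
    raid_level: str
    tier: int
    encrypted: bool
    free_gib: float


def candidate_pools(service_class, pools, size_gib):
    """Return pools that satisfy the service class and can hold the volume,
    ordered by most free capacity first."""
    matches = [
        p for p in pools
        if p.raid_level == service_class.raid_level
        and p.tier == service_class.tier
        and (p.encrypted or not service_class.encrypted)
        and p.free_gib >= size_gib
    ]
    return sorted(matches, key=lambda p: p.free_gib, reverse=True)
```

With a model like this, a requester names only the service class and a size, and the placement decision stays with the storage administrator who defined the class.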

3.4 IBM Virtual Storage Center

Organizations need to spend less of their IT budgets on storage capacity and storage administration so that they can spend more on new, revenue-generating initiatives. Virtual Storage Center (VSC) delivers an end-to-end view of storage with the ability to virtualize Fibre Channel block storage infrastructures, helping you manage your data with more confidence with improved storage utilization and management efficiency.

It combines IBM Spectrum Control Advanced features with IBM Spectrum Virtualize capabilities to deliver an integrated infrastructure to transform your block storage into an agile, efficient, and economical business resource.

IBM Virtual Storage Center is a virtualization platform and a management solution for cloud-based and software-defined storage. It is an offering that combines IBM Spectrum Control Advanced Edition with IBM Spectrum Virtualize, including SAN Volume Controller, members of the IBM Storwize family, and FlashSystem V9000. VSC helps organizations transition to new workloads and update storage infrastructures.

It enables organizations to monitor, automate, and analyze storage. It delivers provisioning, capacity management, storage tier optimization, and reporting. VSC helps standardize processes without replacing existing storage systems, and can also significantly reduce IT costs by making storage more user and application oriented.

Cloud computing is all about agility. Storage for clouds needs to be as flexible and service-oriented as the applications it supports. IBM Virtual Storage Center can virtualize existing storage into a private storage cloud with no “rip and replace” required.

3.5 IBM Storage Insights

IBM Storage Insights is a storage resource management solution that is delivered as software as a service (SaaS) from the IBM Cloud. Because of this delivery model, IBM manages versions and updates automatically, and customers can use the platform to gain insight into and knowledge of their storage environment. The solution consists of two offerings that build on each other. IBM Storage Insights is available to all IBM Storage customers who run storage platforms that are under support. It provides an integrated support experience, short-term monitoring, and cognitive insight into storage health.

IBM Storage Insights Pro is a fee-based offering that is integrated into the same management pane as IBM Storage Insights. It provides storage performance reporting, insight into usage, and the ability to monitor use at the application or department level.

3.5.1 IBM Storage Insights

The success of a business is intertwined with its IT performance. As IT environments become increasingly complex, the technology that is supposed to help is now beyond what humans alone can manage. To be successful, enterprises must rethink how they use technology to give them more power than ever before.

IBM Storage Insights provides Cognitive Storage Management capabilities for clients with IBM Storage. This new support capability automates data access and increases insight into storage health, performance, and capacity. It is designed to give these clients faster resolution of issues with minimal effort, HA, and the confidence of services that are delivered from the IBM Cloud (see Figure 3-6).

Figure 3-6 IBM Storage Insights

The platform can use IBM’s Data Lake and Knowledge Base that includes curated data that is based on 30+ years of operations experience. This rich foundation provides valuable knowledge and insights that cognitive capabilities can mine.

Using operational data and applying artificial intelligence (or cognitive technology) to deliver actionable insights and drive automation is at the core of this transformation. Partnering people with cognitive technologies allows enterprises to autonomously run and optimize their IT environments based on business needs.

This ability, in turn, enables cognitive insights for faster, data-driven decisions and autonomous management and governance of IT operations. The result is the delivery of higher service quality by anticipating problems, reducing errors, and responding more quickly to incidents and service requests (see Figure 3-7).

Figure 3-7 IBM Storage Insights Dashboard

3.5.2 IBM Storage Insights Pro

IBM Storage Insights Pro, formerly called IBM Spectrum Control Storage Insights, is an analytics-driven, storage resource management solution that is delivered over the cloud. The solution uses cloud technology to provide visibility into on-premises storage with the goal of helping clients optimize their storage environments in today’s data-intense world. This software as a service (SaaS) offering runs on IBM Cloud, can be deployed in as little as 5 minutes, and can show actionable insights in 30 minutes.

IBM Storage Insights Pro is a cloud data and storage management service that is deployed in a secure and reliable cloud infrastructure that provides the following features:

•Accurately identify and categorize storage assets

•Monitor capacity and performance from the storage consumer’s view, including server, application, and department-level views

•Increase capacity forecasting precision by using historical growth metrics

•Reclaim unused storage to delay future purchases and improve utilization

•Optimize data placement based on historical usage patterns that can help lower the cost
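The capacity forecasting item in this list can be illustrated with a simple trend projection. The sketch below fits a straight line to historical usage samples and estimates when usage would reach the installed capacity. A linear least-squares fit is an assumption chosen for simplicity; the source does not describe the forecasting model that IBM Storage Insights Pro uses, and the function names are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_full(days, used_tib, capacity_tib):
    """Estimate days after the last sample until usage reaches capacity.

    `days` holds the sample times in days (at least two distinct values),
    `used_tib` the used capacity at each sample.
    Returns None when usage is flat or shrinking.
    """
    slope, intercept = linear_fit(days, used_tib)
    if slope <= 0:
        return None
    full_day = (capacity_tib - intercept) / slope
    return max(0.0, full_day - days[-1])
```

For example, samples that grow by 3 TiB every 30 days from 40 TiB on a 100 TiB system project roughly 510 days of remaining headroom after the last sample.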

An example of the IBM Storage Insights Pro dashboard is shown in Figure 3-8.

Figure 3-8 IBM Storage Insights Pro Dashboard

For more information about IBM Storage Insights Pro, see the following websites:

•http://www.ibm.com/systems/storage/spectrum/insights

•http://www.ibm.com/marketplace/cloud/analytics-driven-data-management/us/en-us

IBM Storage Insights can also be used with IBM Spectrum Control. For more information, see “IBM Spectrum Control offerings”.

3.6 IBM Copy Services Manager

The IBM CSM replication management tool set (formerly in IBM Tivoli Storage Productivity Center) is included in IBM Spectrum Control. This replication management solution delivers central control of your replication environment by simplifying and automating complex replication tasks.

By using the CSM functions within IBM Spectrum Control, you can coordinate copy services on IBM Storage, including DS8000, DS6000, SAN Volume Controller, Storwize V7000, IBM Spectrum Accelerate, and XIV. You can also help prevent errors and increase system continuity by using source and target volume matching, site awareness, disaster recovery testing, and standby management. Copy services include IBM FlashCopy, Metro Mirror, Global Mirror, and Metro Global Mirror.

You can use Copy Services Manager to complete the following data replication tasks and help reduce the downtime of critical applications:

• Plan for replication when you are provisioning storage

• Keep data on multiple related volumes consistent across storage systems during a planned or unplanned outage

• Monitor and track replication operations

• Automate the mapping of source volumes to target volumes

• Practice disaster recovery procedures
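The automated mapping of source volumes to target volumes in this list can be pictured with a small sketch. It pairs each source with an unused target of identical size, which is one plausible matching rule; the actual criteria that Copy Services Manager applies are not described in the source, and `match_volumes` is a hypothetical name.

```python
def match_volumes(sources, targets):
    """Pair each source volume with an unused target of identical size.

    `sources` and `targets` map volume name -> size in GiB.
    Returns (pairs, unmatched_sources).
    """
    # Group the available targets by size, in name order for determinism.
    free_by_size = {}
    for name, size in sorted(targets.items()):
        free_by_size.setdefault(size, []).append(name)

    pairs, unmatched = {}, []
    for name, size in sorted(sources.items()):
        candidates = free_by_size.get(size)
        if candidates:
            pairs[name] = candidates.pop(0)  # each target is used only once
        else:
            unmatched.append(name)
    return pairs, unmatched
```

Reporting the unmatched sources separately lets an administrator provision the missing targets before a replication session is started.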

The IBM Copy Services Manager family of products consists of the following products:

• Copy Services Manager provides HA and disaster recovery for multiple sites

• Copy Services Manager for IBM z Systems provides HA and disaster recovery for multiple sites

• Copy Services Manager Basic Edition for z Systems provides HA for a single site if a disk storage system failure occurs

The Copy Services Manager overview window is shown in Figure 3-9.

Figure 3-9 Copy Services Manager Overview window

3.7 IBM Spectrum Protect

IBM Spectrum Protect is intuitive, intelligent, and transparent software that provides a set of product features that you can use to design adaptive and comprehensive data protection solutions. It is a comprehensive data protection and recovery solution for virtual, physical, and cloud data. IBM Spectrum Protect provides backup, snapshot, archive, recovery, space management, bare machine recovery, and disaster recovery capabilities.

3.7.1 Key capabilities

IBM Spectrum Protect features the following capabilities:

•Protects virtual, physical, and cloud data with one solution

•Reduces backup and recovery infrastructure costs

•Delivers greater visualization and administrator productivity

•Simplifies backups by consolidating administration tasks

•Space Management moves less active data to less expensive storage, such as tape or cloud

• Provides long-term data archive for data retention, such as for compliance with government regulations

3.7.2 Benefits

IBM Spectrum Protect includes the following benefits:

•Application-aware and VM-aware data protection for any size organization

•Simplified administration

•Built-in efficiency features: Data deduplication, compression, and incremental ‘forever’ backup

•Integrated multi-site replication and disaster recovery

•Multi-site data availability with active-active replication-based architecture and heterogeneous storage flexibility using disk, tape, or cloud

Whatever your data type and infrastructure size, IBM Spectrum Protect scales from a small environment that consists of 10 - 20 machines to a large environment with thousands of machines to protect. The software product consists of the following basic functional components:

•IBM Spectrum Protect server with IBM Db2® database engine

The IBM Spectrum Protect server provides backup, archive, and space management services to the IBM Spectrum Protect clients, and manages the storage repository. The storage repository can be implemented in a hierarchy of storage pools by using any combination of supported media and storage devices. These devices can be directly connected to the IBM Spectrum Protect server system or be accessible through a SAN or be cloud storage accessible using TCP/IP.

•IBM Spectrum Protect clients with application programming interfaces (APIs)

IBM Spectrum Protect enables data protection from failures and other errors by storing backup, archive, space management, and “bare-metal” restore data, and compliance and disaster-recovery data in a hierarchy of auxiliary storage. IBM Spectrum Protect can help protect computers that run various operating systems, on various hardware platforms and connected together through the Internet, wide area networks (WANs), local area networks (LANs), or storage area networks (SANs).

It uses web-based management, intelligent data move-and-store techniques, and comprehensive policy-based automation that work together to increase data protection and potentially decrease time and administration costs.

The progressive incremental methods that are used by IBM Spectrum Protect back up only new or changed versions of files, greatly reducing data redundancy, network bandwidth, and storage pool consumption as compared to traditional methods.
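The core idea of backing up only new or changed files can be sketched as a comparison between the current file system state and a catalog of what was backed up previously. This is a conceptual illustration only: using file size and modification time as the change signature is an assumption, and the function names are hypothetical rather than part of any IBM Spectrum Protect API.

```python
import os


def scan(root):
    """Map each file path under root to its (size, mtime_ns) signature."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            info = os.stat(path)
            state[path] = (info.st_size, info.st_mtime_ns)
    return state


def files_to_back_up(current, catalog):
    """Return paths that are new or whose signature changed since the catalog."""
    return sorted(p for p, sig in current.items() if catalog.get(p) != sig)
```

After a successful backup run, the caller would store the current state as the new catalog, so that the next run again transfers only new or changed files.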

3.7.3 Backup and recovery

Despite rapid data growth, data protection and retention systems are expected to maintain service levels and data governance policies. Data has become integral to business decision-making and basic operations, from production to sales and customer management. Data protection and retention are core capabilities for their role in risk mitigation and for the amount of data involved.

The storage environment offers the following functions that improve the efficiency and effectiveness of data protection and retention:

•Backup and recovery: Provides cost-effective and efficient backup and restore capabilities, improving the performance, reliability, and recovery of data that is aligned to business required service levels. Backups protect current data, and are unlikely to be accessed unless data is lost or corrupted.

•Archiving: Stores data that has long-term retention requirements for compliance or business purposes by providing secure and cost-effective solutions with automated processes for retention policies and data migration to different storage media.

•Node Replication: Ensures uninterrupted access to data for critical business systems, reducing the risk of downtime by providing the capability to fail over transparently and as instantaneously as possible to an active copy of the data.

Optimizing all of these areas helps an organization deliver better services with reduced application downtime. Data protection and retention, archiving, and node replication can improve business agility by ensuring that applications have the correct data when needed, while inactive data is stored in the correct places for the correct length of time.

3.7.4 Tool set

IBM Spectrum Protect is a family of tools that helps manage and control the “information explosion” by delivering a single point of control and administration for storage management needs. It provides a wide range of data protection, recovery management, movement, retention, reporting, and monitoring capabilities by using policy-based automation.

Products: For an updated list of the available products in the IBM Spectrum Protect family, see the following website:

https://www.ibm.com/us-en/marketplace/data-protection-and-recovery

For more information about the most recent releases, see IBM Spectrum Protect Knowledge Center:

https://www.ibm.com/support/knowledgecenter/en/SSEQVQ_8.1.0/tsm/welcome.html

The main features, functions, and benefits that are offered by the IBM Spectrum Protect family are listed in Table 3-2.

Table 3-2 Main features, functions, and benefits of IBM Spectrum Protect

Feature	Function	Benefits
Backup and recovery management	Intelligent backups and restores using a progressive incremental backup and restore strategy, where only new and changed files are backed up	Centralized protection based on smart-move and smart-store technology, which leads to faster backups and restores with fewer network and storage resources needed
Hierarchical storage management	Policy-based management of file backup and archiving	Ability to automate critical processes related to the media on which data is stored while reducing storage media and administrative costs associated with managing data
Archive management	Managed archives	Ability to easily protect and manage documents that need to be kept for a designated length of time
Advanced data reduction	Combines incremental backup, source inline, and target data deduplication, compression, and tape management to provide data reduction	Reduces the costs of data storage, environmental requirements, and administration
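The data deduplication that is listed under advanced data reduction in Table 3-2 can be sketched in a few lines. The sketch assumes fixed-size chunks and a SHA-256 digest as the chunk identifier. Both choices, and the names `dedup_store` and `rebuild`, are hypothetical and are not taken from the product.

```python
import hashlib

CHUNK_SIZE = 4  # deliberately tiny so that the example data repeats


def dedup_store(data: bytes, store: dict) -> list:
    """Split data into chunks, keep each unique chunk once, return the recipe."""
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        key = hashlib.sha256(chunk).hexdigest()
        store.setdefault(key, chunk)  # store the chunk only if it is unseen
        recipe.append(key)
    return recipe


def rebuild(recipe: list, store: dict) -> bytes:
    """Reassemble the original data from its recipe of chunk identifiers."""
    return b"".join(store[key] for key in recipe)


store = {}
data = b"ABCDABCDABCDWXYZ"
recipe = dedup_store(data, store)
print(len(recipe), "chunks referenced,", len(store), "chunks stored")
```

The recipe references four chunks, but only two distinct chunks are kept, which is the source of the storage savings.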

3.7.5 IBM Spectrum Protect Operations Center

IBM Spectrum Protect Operations Center is a graphical user interface (GUI), with new features (as shown in Figure 3-10). It provides an advanced visualization dashboard, built-in analytics, and integrated workflow automation features that dramatically simplify backup administration.

Figure 3-10 IBM Spectrum Protect Operations Center

3.7.6 IBM Spectrum Protect cloud architectures

IBM Spectrum Protect has multiple cloud architectures to meet various requirements. Figure 3-11 shows several IBM Spectrum Protect cloud architectures for storing IBM Spectrum Protect cloud-container storage pools.

Figure 3-11 IBM Spectrum Protect Cloud Architectures

IBM Spectrum Protect supports the following cloud providers:

•IBM Cloud

•Amazon Web Services (Amazon Simple Storage Service S3)

•Microsoft Azure (Blob Storage)

IBM Spectrum Protect also supports IBM Cloud Object Storage (Cleversafe®) as an on-premises storage system that is configured by using dedicated hardware.

Several third-party vendors, including Scality RING and EMC Elastic Cloud Storage, validated their object storage hardware and software for use with IBM Spectrum Protect. For more information about these third-party devices, see this website:

http://www.ibm.com/support/docview.wss?uid=swg22000915

IBM Spectrum Protect can also protect data that is hosted in an OpenStack environment, and use the OpenStack (Swift) environment as a repository for backup and archive objects.

Data privacy considerations

Although the security of sensitive data is always a concern, data that you store off-premises in a cloud computing system should be considered particularly vulnerable. Data can be intercepted during transmission, or a weakness of the cloud computing system might be used to gain access to the data.

To guard against these threats, define a cloud-container storage pool to be encrypted. When you do, the server encrypts data before it is sent to the storage pool. After data is retrieved from the storage pool, the server decrypts it so that it is understandable and usable again.

Your data is protected from eavesdropping and unauthorized access when it is outside your network because it can be understood only when it is back on premises.

3.7.7 Cloud Tiering

Beginning with version 8.1.3, IBM Spectrum Protect allows tiering of data from directory-container storage pools to cloud-container storage pools, as shown in Figure 3-12.

Figure 3-12 IBM Spectrum Protect cloud tiering

With cloud tiering, data is stored on a block storage device for quick ingest and operational recovery. A storage rule keeps data on disk for a specified number of days, after which it is migrated to cloud storage. When restoring data, the IBM Spectrum Protect server automatically restores data from wherever it is stored. The tiering feature supports on-premises and off-premises implementations of cloud storage.
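The effect of an age-based storage rule can be sketched as follows. The `Extent` class, the `apply_tiering_rule` function, and the `keep_days` parameter are hypothetical names that are used only for illustration; they are not IBM Spectrum Protect commands or options.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Extent:
    name: str
    stored_on: date
    tier: str = "disk"  # new data always lands on disk first


def apply_tiering_rule(extents: list, today: date, keep_days: int) -> list:
    """Move extents that are older than keep_days from disk to cloud."""
    moved = []
    for extent in extents:
        age_in_days = (today - extent.stored_on).days
        if extent.tier == "disk" and age_in_days > keep_days:
            extent.tier = "cloud"
            moved.append(extent.name)
    return moved


def locate(extents: list, name: str) -> str:
    """Return the tier that a restore would read from."""
    for extent in extents:
        if extent.name == name:
            return extent.tier
    raise KeyError(name)


extents = [
    Extent("old", date(2018, 6, 1)),
    Extent("recent", date(2018, 6, 25)),
]
moved = apply_tiering_rule(extents, today=date(2018, 6, 30), keep_days=14)
print(moved, locate(extents, "old"), locate(extents, "recent"))
```

A restore in this sketch does not need to know the tier in advance; `locate` resolves it, which mirrors the statement that data is restored from wherever it is stored.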

3.7.8 IBM Spectrum Protect Plus

IBM Spectrum Protect Plus is a data protection and availability solution for virtual environments that can be deployed in minutes and protect your environment within an hour. It simplifies data protection, whether data is hosted in physical, virtual, software-defined or cloud environments. It can be implemented as a stand-alone solution or integrate with your IBM Spectrum Protect environment to off-load copies for long-term storage and data governance with scale and efficiency.

IBM Spectrum Protect Plus uses APIs and incremental-forever data copy technology to create backup copies of Hyper-V and VMware virtual machines. It stores these copies as addressable snapshot images on a vSnap server and optionally offloads them to IBM Spectrum Protect. IBM Spectrum Protect Plus can support all of the tape and cloud environments that are supported by IBM Spectrum Protect.

IBM Spectrum Protect Plus creates and maintains a global catalog of all copies of data and optionally indexes files. When the need to recover arises, this global catalog enables the administrator to quickly search and identify what they want to recover instead of browsing through hundreds of objects and recovery points.

IBM Spectrum Protect Plus provides instant access and restore from the catalog so that an administrator can restore the organization’s operations in a matter of minutes and enables multiple use cases. These key use cases include Data Protection, Disaster Recovery (DR), Development and Test (Dev/Test), and Business Analytics.

Installation and Configuration

Because it is delivered as a Virtual Machine Image, IBM Spectrum Protect Plus is extremely simple to deploy. Install into your virtualization environment and access the dashboard by using a web browser.

Immediately available SLAs provide secure self-service management for defining schedules and protecting virtual machines. A simple-to-use dashboard (see Figure 3-13) provides information about protection status, SLA compliance, VM sprawl, and storage usage.

Figure 3-13 IBM Spectrum Protect Dashboard

3.7.9 IBM Spectrum Protect for Virtual Environments

IBM Spectrum Protect for Virtual Environments simplifies data protection for virtual and cloud environments. It protects VMware and Microsoft Hyper-V virtual machines by offloading backup workloads to a centralized IBM Spectrum Protect server for safe keeping. Administrators can create backup policies or restore virtual machines with a few clicks.

IBM Spectrum Protect for Virtual Environments enables your organization to protect data without the need for a traditional backup window. It allows you to reliably and confidently safeguard the massive amounts of information that virtual machines generate.

IBM Spectrum Protect for Virtual Environments provides the following benefits:

•Improves efficiency with data deduplication, incremental “forever” backup, and other advanced IBM technology to help reduce costs.

•Simplifies backups and restores for VMware with an easy-to-use interface that you can access from within VMware vCenter or vCloud Director.

•Enables VMware vCloud Director and OpenStack cloud backups.

•Enables faster, more frequent snapshots for your most critical virtual machines.

•Flexible recovery and copy options from image-level backups give you the ability to perform recovery at the file, mailbox, database object, volume, or VM image level by using a single backup of a VMware image.

•Reduces the processor usage that is caused by virtual machine backups by supporting VMware vStorage APIs for Data Protection and Microsoft Hyper-V technology, which simplifies and optimizes data protection.

3.8 IBM Spectrum Protect Snapshot

In today’s business world, where application servers are operational 24 hours a day, the data on these servers must be fully protected. You cannot afford to lose any data, but you also cannot afford to stop these critical systems for hours so you can protect the data adequately. As the amount of data that needs protecting continues to grow exponentially, and as the downtime that is associated with backup must be kept to an absolute minimum, IT processes are at their breaking point. Data volume snapshot technologies such as IBM Spectrum Protect Snapshot can help minimize the effect caused by backups and provide near-instant restore capabilities.

Although many storage systems are now equipped with volume snapshot tools, these hardware-based snapshot technologies provide only “crash consistent” copies of data. Many business critical applications, including those that rely on a relational database, need an extra snapshot process to ensure that all parts of a data transaction are flushed from memory and committed to disk before the snapshot. This process is necessary to ensure that you have a usable, consistent copy of the data.
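The difference between a crash-consistent copy and an application-consistent copy can be shown with a toy model. The `Database` class below is a hypothetical stand-in for a real database engine, and the freeze, flush, snapshot, and resume sequence is a simplified sketch of the extra snapshot process, not the mechanism of any specific product.

```python
class Database:
    """Toy database that buffers writes in memory before they reach disk."""

    def __init__(self):
        self.memory = []   # writes that are not yet committed to disk
        self.disk = []     # what a storage-level snapshot can see
        self.frozen = False

    def write(self, record: str) -> None:
        if self.frozen:
            raise RuntimeError("writes are suspended during the snapshot")
        self.memory.append(record)

    def flush(self) -> None:
        """Commit all buffered writes to disk."""
        self.disk.extend(self.memory)
        self.memory.clear()


def crash_consistent_snapshot(db: Database) -> list:
    """Copy the disk as it is, ignoring anything still held in memory."""
    return list(db.disk)


def application_consistent_snapshot(db: Database) -> list:
    """Suspend writes, flush memory to disk, copy, then resume."""
    db.frozen = True
    try:
        db.flush()
        return list(db.disk)
    finally:
        db.frozen = False


db = Database()
db.write("debit A 100")
db.write("credit B 100")
crash_copy = crash_consistent_snapshot(db)
app_copy = application_consistent_snapshot(db)
print(crash_copy, app_copy)
```

The crash-consistent copy misses both halves of the transaction because they were still in memory, while the application-consistent copy contains the complete transaction.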

IBM Spectrum Protect Snapshot helps deliver the highest levels of protection for mission-critical IBM Db2, SAP, Oracle, Microsoft Exchange, and Microsoft SQL Server applications by using integrated, application-aware snapshot backup and restore capabilities. This protection is achieved by using advanced IBM storage hardware snapshot technology to create a high-performance, low-impact application data protection solution.

The snapshots that are captured by IBM Spectrum Protect Snapshot can be retained as backups on local disk. With optional integration with IBM Spectrum Protect, customers can use the full range of advanced data protection and data reduction capabilities, such as data deduplication, progressive incremental backup, hierarchical storage management, and centrally managed policy-based administration, as shown in Figure 3-14.

Figure 3-14 IBM Spectrum Protect Snapshot storage snapshot capabilities

Because a snapshot operation typically takes much less time than the time for a tape backup, the window during which the application must be aware of a backup can be reduced. This advantage facilitates more frequent backups, which can reduce the time that is spent performing forward recovery through transaction logs, increase the flexibility of backup scheduling, and ease administration.

Application availability is also significantly improved due to the reduction of the load on the production servers. IBM Spectrum Protect Snapshot uses storage snapshot capabilities to provide high speed, low impact, application-integrated backup and restore functions for the supported application and storage environments.

Automated policy-based management of multiple snapshot backup versions, together with a simple and guided installation and configuration process, provides an easy-to-use and quick-to-deploy data protection solution that enables the most stringent database recovery time requirements to be met.

For more information, see this website:

https://ibm.biz/BdZgV3

3.9 IBM Spectrum Copy Data Management

IBM Spectrum Copy Data Management makes copies available to data consumers when and where they need them, without creating unnecessary copies or leaving unused copies on valuable storage. It catalogs copy data from across local, hybrid cloud, and off-site cloud infrastructure, identifies duplicates, and compares copy requests to existing copies. This process ensures that the minimum number of copies are created to service business requirements.
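The idea of comparing a copy request to existing copies before creating a new one can be sketched with a small in-memory catalog. The `CopyRecord` and `CopyCatalog` names, and the choice of source, point in time, and location as the matching key, are assumptions made for this illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CopyRecord:
    source: str         # for example, a database or volume name
    point_in_time: str  # identifies which version of the source was copied
    location: str       # where the copy lives


class CopyCatalog:
    """Catalog that reuses an existing copy when a request matches one."""

    def __init__(self):
        self._records = []

    def request_copy(self, source: str, point_in_time: str, location: str):
        """Return (record, created); created is False when a copy is reused."""
        wanted = CopyRecord(source, point_in_time, location)
        if wanted in self._records:
            return wanted, False
        self._records.append(wanted)
        return wanted, True

    def count(self) -> int:
        return len(self._records)


catalog = CopyCatalog()
_, created_first = catalog.request_copy("orders_db", "2018-06-30T00:00", "site-a")
_, created_again = catalog.request_copy("orders_db", "2018-06-30T00:00", "site-a")
_, created_other = catalog.request_copy("orders_db", "2018-06-30T00:00", "cloud")
print(created_first, created_again, created_other, catalog.count())
```

Three requests result in two stored copies because the second request is served from the catalog.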

Data consumers can use the self-service portal to create the copies they need when they need them, creating business agility. Copy processes and workflows are automated to ensure consistency and reduce complexity. IBM Spectrum Copy Data Management deploys rapidly as an agentless VM and helps manage snapshot and FlashCopy images that are made to support DevOps, data protection, disaster recovery, and hybrid cloud computing environments.

This member of the IBM Spectrum Storage family automates the creation and cataloging of copy data on existing storage infrastructure, such as snapshots, vaults, clones, and replicas. One of the key use cases centers on Oracle, Microsoft SQL Server, and other databases that are often copied to support application development, testing, and data protection.

The IBM Spectrum Copy Data Management software is an IT modernization technology that focuses on using existing data in a manner that is efficient, automated, scalable, and easy to use to improve data access. IBM Spectrum Copy Data Management (Figure 3-15), with IBM storage arrays, delivers in-place copy data management that modernizes IT processes and enables key use cases with existing infrastructure.

Figure 3-15 Software-Defined IBM Spectrum Copy Data Management Platform

IBM Spectrum Copy Data Management includes support for the following copy data management use cases:

•Automated copy management

•Development and operations

•Data protection and disaster recovery

•Test and development

•Hybrid cloud computing

Automated copy management

IT functions that rely heavily on copies or ‘snapshots’ are typically managed by using a complex mix of scripts, tools, and other products, none of which are optimized for copy management. With IBM Spectrum Copy Data Management, organizations have a holistic, simplified approach that greatly reduces cycle time and frees staff to manage more productive projects.

IT teams can use the core policy engine, catalog, and reporting of IBM Spectrum Copy Data Management to dramatically improve IT operations that rely on copies of data, including disaster recovery, testing and development, business analytics, and local recovery. IBM Spectrum Copy Data Management improves operations by using automated, service-level based copy policies that are consistent, reliable, and easily repeatable. This feature provides huge savings in operating expenses.

Development and operations (DevOps)

Organizations are increasingly moving toward DevOps for faster delivery of new applications to market. IBM Spectrum Copy Data Management enables IT teams to use their existing storage infrastructure to enable DevOps, helping to meet the needs of the development teams for rapid deployment of the infrastructure. IBM Spectrum Copy Data Management templates define the policies for infrastructure deployment. The whole system is accessible through the REST API.

Rather than following legacy processes to requisition IT resources, developers include the infrastructure deployment commands directly within their development systems, such as IBM Cloud (IBM Bluemix®), Chef, or Puppet. Predefined scripts and plug-ins for popular DevOps tools simplify implementation.

Next-generation data protection and disaster recovery

Through its template-based management and orchestration of application-aware copies, IBM Spectrum Copy Data Management can support next-generation data protection and recovery workflows. IBM Spectrum Copy Data Management enables IT to mount and instantly access copies that are already in the production storage environment. IBM Spectrum Copy Data Management catalogs all snapshots and replicas, and alerts you if a snap or replication job was missed or failed.

Disaster recovery can be fully automated and tested nondisruptively. In addition, IBM Spectrum Copy Data Management can coordinate sending data through the AWS Storage Gateway to the AWS storage infrastructure. This feature provides a simplified, low-cost option for longer term or archival storage of protection copies.

Automated test and development

The speed and effectiveness of test and development processes are most often limited by the time it takes to provision IT infrastructure. With IBM Spectrum Copy Data Management, test and development infrastructure can be spun up in minutes, either on an automated, scheduled basis or on demand.

Hybrid cloud computing

IBM Spectrum Copy Data Management is a powerful enabler of the hybrid cloud, enabling IT to take advantage of cloud compute resources. IBM Spectrum Copy Data Management not only helps customers move data to the cloud, it also enables IT organizations to create live application environments that can use the less expensive, elastic compute infrastructure in the cloud. Being able to spin up workloads and then spin them back down reliably helps maximize the economic benefit of the cloud by only using and paying for infrastructure as needed.

IBM Spectrum Copy Data Management is a software platform that is designed to use the infrastructure in the IT environment. It works directly with hypervisor and enterprise storage APIs to provide the overall orchestration layer that uses the copy services of the underlying infrastructure resources. IBM Spectrum Copy Data Management also integrates with IBM Cloud (IBM Bluemix is now integrated with IBM Cloud), AWS S3 for cloud-based data retention, Puppet, and others.

Database-specific functionality

IBM Spectrum Copy Data Management allows the IT team to easily create and share copies of all popular database management systems by integrating key database management system (DBMS) tasks within well-defined policies and workflows. The solution also includes application-aware integration for Oracle and Microsoft SQL Server platforms, providing a deeper level of coordination with the DBMS.

Secure multi-tenancy

Secure multi-tenancy meets the needs of both managed service providers and large organizations that need to delegate resources internally. Individual “tenants” can be created within a single IBM Spectrum Copy Data Management instance, allowing each tenant its own set of resources and the ability to support administration within the tenancy to create users, define jobs, and perform other functions.

Policy templates for automation and self-service

Template-based provisioning and copy management provides easy self-service access for internal customers to request the resources that they need, when they need them. Templates are predefined by the IT team, and they are accessible through a self-service portal interface or through API calls.

Compatibility

IBM Spectrum Copy Data Management is a simple-to-deploy software platform that is designed to use the existing IT infrastructure. It works directly with hypervisor and storage APIs to provide the overall orchestration layer that uses the copy services of the underlying infrastructure resources. It also integrates with Amazon Web Services S3 for cloud-based data retention.

IBM Spectrum Copy Data Management delivers the following benefits:

•Automate the creation and use of copy data on existing storage infrastructure, such as snapshots, vaults, clones, and replicas

•Reduce time that is spent on infrastructure management while improving reliability

•Modernize existing IT resources by providing automation, user self-service, and API-based operations without the need for any additional hardware

•Simplify management of critical IT functions such as data protection and disaster recovery

•Automate test and development infrastructure provisioning, drastically reducing management time

•Drive new, high-value use cases, such as using hybrid cloud compute

•Catalog and track IT objects, including volumes, snapshots, virtual machines, data stores, and files

3.10 Block, file, and object storage

Block, file, and object are different approaches to accessing data. This section provides a high-level overview of each method. Figure 3-16 is a high-level view of these differences.

Figure 3-16 High-level view of data access differences between file, block, and object storage

3.10.1 Block storage

Block storage offerings are differentiated by speed and throughput (as measured in IOPS) and segmented by the lifecycle of the disk. Data is split into evenly sized chunks, or blocks, each with its own unique address. How blocks of data are accessed is up to the application. Few applications access the blocks directly. Rather, a Portable Operating System Interface (POSIX) file system is typically layered on top of the blocks to organize files hierarchically, so that an individual file can be located by describing the path to that file.
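A minimal sketch can make the relationship between blocks and a file system concrete. It assumes an 8-byte block size and a simple table that maps a path to its block addresses and length. Both are illustrative simplifications and do not describe any particular file system.

```python
BLOCK_SIZE = 8  # bytes per block; real devices use much larger blocks


def write_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Split data into evenly sized blocks, each stored under its own address."""
    blocks = {}
    offsets = range(0, len(data), block_size)
    for address, offset in enumerate(offsets):
        chunk = data[offset:offset + block_size]
        blocks[address] = chunk.ljust(block_size, b"\x00")  # pad the last block
    return blocks


def read_file(blocks: dict, addresses: list, length: int) -> bytes:
    """Reassemble a file from its block addresses and trim the padding."""
    return b"".join(blocks[address] for address in addresses)[:length]


data = b"hello block storage"
device = write_blocks(data)

# The file system layer: a path resolves to block addresses and a length.
file_table = {"/docs/readme.txt": (sorted(device), len(data))}

addresses, length = file_table["/docs/readme.txt"]
print(read_file(device, addresses, length))
```

The device knows only addresses and fixed-size blocks; the path, file length, and ordering live in the file system layer above it.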

3.10.2 File storage

File storage provides access to individual directories and files over various protocols, including NFS and CIFS/SMB. File attributes, such as the owner, who can access the file, and its size, describe a file and its contents. This metadata is stored along with the data or in the related directory structure.

3.10.3 Object storage

With object storage, data is written into self-contained entities called objects. Unlike file systems, an object storage system gives each object a unique ID, which is managed in a flat index. There are no folders and subfolders. Unlike files, objects are created, retrieved, deleted, or replaced in their entirety, rather than being updated or appended in place.

Object storage also introduces the concept of eventual consistency. If one user creates an object, a second user might not see that object listed immediately. Eventually, all users will be able to see the object listed.

When a user or application needs access to an object, it provides the object storage system with the unique ID of that object. This flat index approach provides greater scalability, enabling an object storage system to support faster access to a massively higher quantity of objects or files as compared to traditional file systems.
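The flat index and whole-object semantics can be sketched with an in-memory store. The `ObjectStore` class and its method names are hypothetical. The sketch uses a single dictionary, so it does not model distribution or eventual consistency.

```python
import uuid


class ObjectStore:
    """Toy object store: one flat index that maps a unique ID to an object."""

    def __init__(self):
        self._index = {}  # no folders or subfolders, only IDs

    def put(self, data: bytes, metadata: dict = None) -> str:
        """Store a whole object and return its generated unique ID."""
        object_id = uuid.uuid4().hex
        self._index[object_id] = (bytes(data), dict(metadata or {}))
        return object_id

    def get(self, object_id: str) -> bytes:
        """Retrieve the whole object by its ID."""
        return self._index[object_id][0]

    def replace(self, object_id: str, data: bytes) -> None:
        """Replace the object in its entirety; there is no in-place update."""
        _, metadata = self._index[object_id]
        self._index[object_id] = (bytes(data), metadata)

    def delete(self, object_id: str) -> None:
        del self._index[object_id]


store = ObjectStore()
report_id = store.put(b"quarterly report v1", {"owner": "finance"})
store.replace(report_id, b"quarterly report v2")
print(report_id, store.get(report_id))
```

Because lookup is a single key access in a flat namespace, there is no directory tree to walk, which is the property that the scalability claim rests on.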

3.11 IBM Block Storage solutions

This section describes IBM block storage solutions.

3.11.1 IBM Spectrum Virtualize

IBM Spectrum Virtualize software is at the heart of IBM SAN Volume Controller, IBM Storwize family, IBM FlashSystem® V9000, and VersaStack. It enables these systems to deliver better data value, security, and simplicity through industry-leading virtualization. This virtualization transforms existing and new storage and streamlines deployment for a simpler, more responsive, scalable, and cost-efficient IT infrastructure.

IBM Spectrum Virtualize systems provide storage management from entry and midrange up to enterprise disk systems, and enable hosts to attach through SAN, or through FCoE or iSCSI over Ethernet networks. IBM Spectrum Virtualize is easy to use, which enables staff to start working with it rapidly.

IBM Spectrum Virtualize uses virtualization, thin provisioning, and compression technologies to improve storage utilization and meet changing needs quickly and easily. In this way, IBM Spectrum Virtualize products are the ideal complement to server virtualization strategies.

Key Capabilities

IBM Spectrum Virtualize software capabilities are offered across various platforms, including SAN Volume Controller, Storwize V7000, Storwize V5000, and FlashSystem V9000. The following IBM Spectrum Virtualize capabilities are designed to deliver the benefits of storage virtualization and advanced storage functions in environments from large enterprises to small businesses and midmarket companies:

•IBM Real-time Compression™ for inline, real-time compression

•Stretched Cluster and IBM HyperSwap® for a high-availability solution

•IBM Easy Tier® for automatic and dynamic data tiering

•Distributed RAID for better availability and faster rebuild times

•Encryption for internal and external virtualized capacities

•FlashCopy snapshots

•Remote data replication

Benefits

The sophisticated virtualization, management, and functions of IBM Spectrum Virtualize provide the following storage benefits:

•Improves storage utilization up to 2x

•Supports up to 5x as much data in the same physical space

•Simplifies management of heterogeneous storage systems

•Enables rapid deployment of new storage technologies for greater ROI

•Improves application availability with virtually zero storage-related outages

The SAN Volume Controller combines software and hardware into a comprehensive, modular appliance that uses symmetric virtualization.

Symmetric virtualization is achieved by creating a pool of managed disks from the attached storage systems. Those storage systems are then mapped to a set of volumes for use by the attached host systems. System administrators can view and access a common pool of storage on the SAN. This function helps administrators to use storage resources more efficiently and provides a common base for advanced functions.
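The pooling step can be sketched as follows. The sketch assumes that each managed disk contributes a number of equally sized extents to a pool and that a volume is an ordered list of extents taken from that pool. The `StoragePool` class and the first-available allocation order are illustrative assumptions and do not reproduce the SAN Volume Controller allocation policy.

```python
class StoragePool:
    """Toy pool that carves volumes out of extents from several managed disks."""

    def __init__(self, managed_disks: dict):
        # managed_disks maps a managed disk name to its number of extents.
        self.free = [
            (name, index)
            for name, count in managed_disks.items()
            for index in range(count)
        ]
        self.volumes = {}

    def create_volume(self, name: str, extents: int) -> list:
        """Allocate extents to a volume, regardless of which disk they are on."""
        if extents > len(self.free):
            raise ValueError("not enough free capacity in the pool")
        self.volumes[name] = [self.free.pop(0) for _ in range(extents)]
        return self.volumes[name]


pool = StoragePool({"array_a": 2, "array_b": 3})
mapping = pool.create_volume("vol1", extents=3)
backends = {disk for disk, _ in mapping}
print(mapping, backends, len(pool.free))
```

The host sees one volume, `vol1`, while its extents come from two different back-end systems. This indirection is what allows functions such as migration and tiering to be applied uniformly.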

The IBM Spectrum Virtualize functions are shown in Figure 3-17.

Figure 3-17 IBM Spectrum Virtualize functions

IBM Spectrum Virtualize features and benefits are listed in Table 3-3.

Table 3-3 IBM Spectrum Virtualize features and benefits

Feature	Benefits
Single point of control for storage resources	•Designed to increase management efficiency •Designed to help support business application availability
Pools the storage capacity of multiple storage systems on a SAN	•Helps you manage storage as a resource to meet business requirements and not just as a set of boxes •Helps administrators better deploy storage as required beyond traditional “SAN islands” •Can help increase use of storage assets •Insulates applications from physical changes to the storage infrastructure
Clustered pairs of IBM SAN Volume Controller data engines	•Highly reliable hardware foundation •Designed to avoid single points of hardware failure
IBM Real-time Compression	•Increases effective capacity of storage systems up to five times, helping to lower costs, floor-space requirements, and power and cooling needs •Can be used with a wide range of data, including active primary data, for dramatic savings •Hardware compression acceleration helps transform the economics of data storage
Innovative and tightly integrated support for flash storage	•Designed to deliver ultra-high performance capability for critical application data •Move data to and from flash storage without disruption; make copies of data onto hard disk drive (HDD)
Support for IBM FlashSystem	Enables high performance for critical applications with IBM MicroLatency®, coupled with sophisticated functions
Easy-to-use IBM Storwize family management interface	•Single interface for storage configuration, management, and service tasks regardless of storage vendor •Helps administrators use their existing storage assets more efficiently
IBM Storage Mobile Dashboard	Provides basic monitoring capabilities to securely check system health and performance
Dynamic data migration	•Migrate data among devices without taking applications that are using that data offline •Manage and scale storage capacity without disrupting applications
Manage tiered storage	Helps balance performance needs against infrastructure costs in a tiered storage environment
Advanced network-based copy services	•Copy data across multiple storage systems with IBM FlashCopy •Copy data across metropolitan and global distances as needed to create high-availability storage solutions
Integrated Bridgeworks SANrockIT technology for IP replication	•Optimize use of network bandwidth •Reduce network costs or speed replication cycles, improving the accuracy of remote data
Enhanced stretch cluster configurations	•Provide highly available, concurrent access to a single copy of data from data centers up to 300 km apart •Enable nondisruptive storage and virtual machine mobility between data centers
Thin provisioning and “snapshot” replication	•Dramatically reduce physical storage requirements by using physical storage only when data changes •Improve storage administrator productivity through automated on-demand storage provisioning
Hardware snapshots integrated with IBM Spectrum Protect Snapshot Manager	•Performs near-instant application-aware snapshot backups, with minimal performance impact for IBM DB2®, Oracle, SAP, VMware, Microsoft SQL Server, and Microsoft Exchange •Provides advanced, granular restoration of Microsoft Exchange data

Virtualizing storage with SAN Volume Controller helps make new and existing heterogeneous storage arrays more effective by including many functions that are traditionally deployed within disk array systems. By including these functions in a virtualization system, SAN Volume Controller standardizes functions across virtualized storage for greater flexibility and potentially lower costs.

How SAN Volume Controller stretches virtual volume with heterogeneous storage across data centers is shown in Figure 3-18.

Figure 3-18 Stretching virtual volume across data centers with heterogeneous storage

SAN Volume Controller functions benefit all virtualized storage. For example, IBM Easy Tier optimizes use of flash memory, and Real-time Compression enhances efficiency even further by enabling the storage of up to five times as much active primary data in the same physical disk space¹. Finally, high-performance thin provisioning helps automate provisioning. These benefits can help extend the useful life of existing storage assets, reducing costs.

Integrating these functions into SAN Volume Controller also means that they are designed to operate smoothly together, reducing management effort:

•Storage virtualization: Virtualization is a foundational technology for software-defined infrastructures that enables software configuration of the storage infrastructure. Without virtualization, networked storage capacity utilization averages about 50 percent, depending on the operating platform. Virtualized storage enables up to 90 percent utilization by enabling pooling across storage networks with online data migration for capacity load balancing.

Virtual Storage Center supports virtualization of storage resources from multiple storage systems and vendors (that is, heterogeneous storage). Pooling storage devices enables access to capacity from any networked storage system, which is a significant advantage over the limitations inherent in traditional storage arrays.

•IBM Easy Tier: Virtual Storage Center helps optimize flash memory with automated tiering for critical workloads. Easy Tier helps make the best use of available storage resources by automatically moving the most active data to the fastest storage tier, which helps applications and virtual desktop environments run up to three times faster.

•Thin provisioning: Thin provisioning helps automate provisioning and improve productivity by enabling administrators to focus on overall storage deployment and utilization, and on longer-term strategic requirements, without being distracted by routine storage-provisioning requests.

•Remote mirroring: IBM Metro Mirror and Global Mirror functions automatically copy data to remote sites as it changes, enabling fast failover and recovery. These capabilities are integrated into the advanced GUI, making them easy to deploy.

•IBM Real-time Compression: Real-time Compression is patented technology that is designed to reduce space requirements for active primary data. It enables users to store up to five times as much data in the same physical disk space, and can do so without affecting performance.
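The automated tiering idea in the list above, which moves the most active data to the fastest tier, can be sketched as a ranking by access count. The function name, the per-extent access counters, and the fixed number of flash slots are assumptions for illustration. The actual Easy Tier algorithm is not described in this chapter and is not reproduced here.

```python
def place_by_heat(access_counts: dict, flash_slots: int) -> dict:
    """Assign the most frequently accessed extents to flash, the rest to HDD."""
    # Sort by access count (highest first); break ties by extent name.
    ranked = sorted(access_counts, key=lambda extent: (-access_counts[extent], extent))
    return {
        extent: ("flash" if position < flash_slots else "hdd")
        for position, extent in enumerate(ranked)
    }


access_counts = {"e1": 5, "e2": 900, "e3": 40, "e4": 700}
placement = place_by_heat(access_counts, flash_slots=2)
print(placement)
```

Rerunning the placement as access counts change would move extents between tiers over time, which is the behavior that the term automated tiering refers to.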

IBM Spectrum Virtualize for Public Cloud

A flexible variant of the product delivers a powerful solution for the deployment of IBM Spectrum Virtualize software in public clouds, starting with IBM Cloud. This capability provides a monthly license to deploy and use IBM Spectrum Virtualize in IBM Cloud to enable hybrid cloud solutions. It also offers the ability to transfer data between IBM Cloud and on-premises data centers that use any IBM Spectrum Virtualize based appliance (including SAN Volume Controller, Storwize family, FlashSystem V9000, and VersaStack with Storwize family or SAN Volume Controller) or IBM Spectrum Virtualize Software Only.

Through IP-based replication with Global or Metro Mirror, users can now create secondary copies of their on-premises data in the public cloud for disaster recovery, workload redistribution, or migration of data from on-premises data centers to public clouds.

IBM Spectrum Virtualize Hybrid Cloud opportunities are shown in Figure 3-19.

Figure 3-19 IBM Spectrum Virtualize Hybrid Cloud opportunities

3.11.2 IBM Spectrum Access Blueprint

IBM Spectrum Access Blueprint provides enterprise data management and protection for hybrid and multicloud environments with operational control and efficiency. Deployable with VersaStack solutions from IBM and Cisco, IBM Spectrum Access provides a blueprint to deliver the economics and simplicity of the public cloud with the accessibility, virtualization, security and performance of an on-premises implementation.

Ideally, a cloud platform and cloud technologies are essential to implement microservices. Microservices are based on the ability to spin up business services in distinct containers and intelligently route from one container or endpoint to another. This means that having a cloud-service fabric is important to stand up microservices instances and have them connect and reliably offer the quality of service expected.

One of the challenges with microservices compared with classic service-oriented architecture (SOA) lies in the complexity of orchestrating identity and access management in harmony with the dynamic nature of microservices components. With new services being spun up dynamically, the host names for the instances of each service become dynamic as well.

This end-to-end private cloud solution of IBM Spectrum Access Blueprint with VersaStack converged infrastructure and IBM Cloud Private technologies delivers the essential private cloud service fabric for building and managing on-premises, containerized applications and guarantees to provide seamless access to persistent storage for stateful services, such as database applications. This solution also comes with services for data, messaging, Java, blockchain, DevOps, analytics, and many others.

IBM Spectrum Access Blueprint includes the following key features:

•Build new applications that deliver infrastructure services easily and efficiently

•Scale simply

•Optimize cloud deployments

•Quickly deploy storage classes to comply with business SLAs

•Provision capacity directly with containerized applications

IBM Spectrum Access currently incorporates the following features:

•VersaStack to create a compute and storage platform that consists of IBM Spectrum Virtualize and IBM Storwize systems that are paired with the Cisco UCS compute platform

•IBM Spectrum Connect to provide the ability to connect stateful storage to containerized workloads

•IBM Cloud Private to provide a management and orchestration platform to deploy Containers

For more information, see this website:

https://www.ibm.com/us-en/marketplace/ibm-spectrum-access

IBM Cloud Private 2.1.0

IBM Cloud Private is an application platform for developing and managing on-premises, containerized applications. It is an integrated environment for managing containers that includes the container orchestrator Kubernetes, a private image repository, a management console, and monitoring frameworks.

IBM Cloud Private makes it easy to stand up an elastic runtime that is based on Kubernetes to address each of the following workloads:

•Deliver packaged workloads for traditional middleware. IBM Cloud Private initially supports IBM Db2, IBM MQ, Redis, and several other open source packages.

•Support API connectivity from 12-factor apps (software-as-a-service) to manage API endpoints within an on-premises datacenter.

•Support development and delivery of modern 12-factor apps with Microservice Builder.

Along with great capabilities to run enterprise workloads, IBM Cloud Private is delivering enhanced support to run processor-intensive capabilities, such as machine learning or data analytics quickly by taking advantage of Graphics Processing Unit (GPU) clusters (see Figure 3-20).

Figure 3-20 End-to-end private cloud solution architecture that uses IBM Spectrum Access

The IBM Spectrum Access blueprint offers a pretested and validated solution to help pool your compute, network, and storage resources for cloud application deployment. It delivers a simplified, standardized, and trusted approach for the deployment, use, and management of your shared infrastructure and cloud environment. It helps you monitor and manage applications while providing resource consumption reports. IBM Spectrum Access is ideal for IBM Cloud Private deployment because it has the essential private cloud service fabric for building and managing on-premises, containerized applications with persistent storage.

3.11.3 IBM Spectrum Accelerate

IBM Spectrum Accelerate is a highly flexible, software-defined storage solution that enables rapid deployment of block data storage services for new and traditional workloads on and off premises. It is a key member of the IBM Spectrum Storage portfolio.

IBM Spectrum Accelerate allows you to run hotspot-free, grid-scale software that runs on the XIV Storage System Gen3 enterprise storage platform in your data center infrastructure or in a cloud provider, such as IBM Cloud. It offers proven grid-scale technology, mature features, and ease of use. It is deployed on over 100,000 servers worldwide.

IBM Spectrum Accelerate delivers predictable, consistent storage performance, management scaling to more than 68 petabytes usable, and a rich feature set that includes remote mirroring and granular multi-tenancy. It deploys on premises on x86 commodity servers and on the optimized XIV Storage System, and off-premises as a public cloud service on IBM Cloud.

You can manage all your IBM Spectrum Accelerate instances, wherever they are deployed, in a single, intuitive interface. Hardware-independent, transferable licensing offers superb operational flexibility and cost benefits.

IBM Spectrum Accelerate delivers a single management experience across software-defined storage infrastructure by using IBM Hyper-Scale Manager, which can manage IBM Spectrum Accelerate instances, IBM XIV, and the IBM A9000 all-flash solution. This combination helps cut costs through reduced administration effort and training, reduces procurement costs, standardizes data center storage hardware operations and services, and provides licensing flexibility that enables cost-efficient cloud building.

Figure 3-21 shows how straightforward it is to scale by building a storage grid with IBM Spectrum Accelerate.

Figure 3-21 IBM Spectrum Accelerate iSCSI storage grid

Key capabilities

IBM Spectrum Accelerate gives organizations the following capabilities:

•Enterprise cloud storage in minutes, by using commodity hardware

•Hotspot-free performance and QoS without any manual or background tuning needed

•Advanced remote replication, role-based security, and multi-tenancy

•Deploy on-premises or on the cloud (also as a service on IBM Cloud)

•Hyper-scale management of dozens of petabytes

•Best in class VMware and OpenStack integration

•Run IBM Spectrum Accelerate and other application virtual machines on the same server

IBM Spectrum Accelerate runs as a virtual machine on vSphere ESXi hypervisor, which enables you to build a server-based SAN from commodity hardware that includes x86 servers, Ethernet switches, solid-state drives (SSDs), and direct-attached, high-density disks. IBM Spectrum Accelerate essentially acts as an operating system for your self-built SAN storage, grouping virtual nodes and spreading the data across the entire grid.

IBM Spectrum Accelerate release 11.5.3 manages up to 15 nodes in a grid. It provides a single point of management of up to 144 grids connected through Hyper-Scale Manager, up to 2,160 nodes.
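The node limit follows directly from the two figures in the preceding paragraph. The short calculation below reproduces it; the constant and function names are chosen here only for illustration.

```python
NODES_PER_GRID = 15       # maximum nodes in one grid (release 11.5.3, from the text)
GRIDS_PER_MANAGER = 144   # maximum grids under one Hyper-Scale Manager (from the text)


def max_managed_nodes(nodes_per_grid=NODES_PER_GRID, grids=GRIDS_PER_MANAGER):
    """Upper bound on nodes reachable from a single point of management."""
    return nodes_per_grid * grids


print(max_managed_nodes())  # 2160
```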

IBM Spectrum Accelerate allows you to deploy storage services flexibly across different delivery models, including customer-choice hardware, data center infrastructure, and IBM storage systems.

IBM Spectrum Accelerate includes the following benefits:

•Cost reduction by delivering hotspot-free storage to different deployment models on- and off-premises, enabling organizations to pay less overall for the same capacity by optimizing utilization, acquiring less hardware, and minimizing administrative overhead

•Increased operational agility through easy cloud building, faster provisioning, small capacity increments, and flexible, transferable licensing

•Rapid response through enterprise-class storage availability, data protection, and security for the needs of new and traditional workloads in the data center and other sites, while flexibly balancing capital and operational expenses

•Scaleout across 144 virtual systems and seamless management across IBM Spectrum Accelerate instances on- and off-premises and the XIV storage system

The IBM Spectrum Accelerate features with their associated benefits are listed in Table 3-4.

Table 3-4 IBM Spectrum Accelerate features and benefits

Feature	Benefit
Performance	•Ensures even data distribution through massive parallelism and automatic load balancing including upon capacity add •Distributed cache
Reliability and Availability	•Grid redundancy maintains two copies of each 1-MB data partition with each copy being on a different VM, proactive diagnostics, fast and automatic rebuilds, event externalization •Advanced monitoring; network monitoring; disk performance tracking/reporting; data center monitoring; shared monitoring for some components; data and graphical reports on I/O, usage, and trends •Self-healing, which minimizes the rebuild process by rebuilding only actual data •Automated load balancing across components; minimized risk of disk failure due to rapid return to redundancy
Management	Intuitive GUI: Scales to up to 144 virtual arrays and up to more than 45 PB with IBM Hyper-Scale Manager; extensive CLI; RESTful API; mobile app support with push notifications; multi-tenancy with quality of service by tenant, pool, or host
Cloud automation and Self-service	OpenStack; VMware vRealize Orchestrator through IBM Spectrum Control Base
Snapshot management	Space efficient snapshots: Writable, snapshot of snapshot, restore from snapshot, snapshots for consistency groups, mirroring
Thin provisioning; space reclamation	Thin provisioning per pool, thick-to-thin migration; VMware, Microsoft, Symantec space reclamation support
Mirroring	Synchronous/asynchronous; volumes and consistency groups, recovery point objective (RPO) of seconds; online/offline initialization; failover/failback; mirroring across platforms including with XIV Storage System
Security	Role-based access management, multi-tenancy, iSCSI Challenge Handshake Authentication Protocol (CHAP) and auditing; integrates with Lightweight Directory Access Protocol (LDAP) and Microsoft Active Directory servers

OpenStack device support for IBM XIV

IBM built and contributed the OpenStack Cinder block storage driver for XIV to the OpenStack community. This driver allows IBM Spectrum Accelerate to be the first enterprise class storage system to have OpenStack software support. Because the storage can be managed and provisioned automatically within the OpenStack environment, its ease of use and fast time to implementation are further enhanced.

The IBM Storage Driver for OpenStack Cinder component added support starting with the Folsom release as shown in Figure 3-22, and then expanded the support for the Grizzly and Havana releases. The driver enables OpenStack clouds to directly access and use IBM Spectrum Accelerate Storage System Gen3.

Figure 3-22 OpenStack Cinder support for XIV

Hyperconverged Flexible deployment

IBM Spectrum Accelerate provides customers the capability to create Hyperconverged solutions that run Compute and the Storage services on the same physical x86 servers wherever they are deployed.

By using the VMware ESX hypervisor, extra resources that are not used by the IBM Spectrum Accelerate instances (such as memory and processors) can be provisioned to more guest workloads.

The IBM Spectrum Accelerate instance can be managed and administered from its native GUI or through the IBM Hyper-Scale Manager option. Management tasks include creating pools, setting up replication structures, replacing hardware components, and updating firmware (see Figure 3-23).

Figure 3-23 IBM Spectrum Accelerate in a Hyperconverged infrastructure

VMware Orchestration

By using IBM Spectrum Connect (see Figure 3-3), orchestration of the compute layer, provisioning from predefined IBM Spectrum Accelerate pools, and replication can all be done through the VMware integration points. Therefore, the solution can be controlled through the vRealize suite and can use vCenter and VMware SRM-based APIs.

IBM Spectrum Accelerate as a pre-configured Hyperconverged Solution

Customers who want the Hyperconverged capabilities of IBM Spectrum Accelerate and its ability to work as a storage system that can replicate to the IBM XIV, but who prefer to deploy it as a pre-integrated solution, can order it from Supermicro. IBM and Supermicro jointly designed an appliance deliverable that combines IBM Spectrum Accelerate software with Supermicro hardware.

This product is delivered as a pre-configured, preinstalled, and pre-tested solution that is ready to be integrated into customer networks. The three basic building blocks are Small, Medium, and Large, and those building blocks can be customized based on customer-specific requirements.

For more information, see this website:

https://www.supermicro.com/solutions/spectrum-accelerate.cfm

3.11.4 IBM XIV Storage System Gen3

IBM Spectrum Accelerate is the common software-defined layer that is inside the IBM XIV Storage System Gen3.

Note: For more information about IBM Spectrum Accelerate, see the following IBM publications:

•IBM Spectrum Accelerate Deployment, Usage, and Maintenance, SG24-8267

•Deploying IBM Spectrum Accelerate on Cloud, REDP-5261

•IBM Spectrum Accelerate Reference Architecture, REDP-5260

3.11.5 IBM FlashSystem A9000 and A9000R

IBM Spectrum Accelerate is the common software-defined layer across the IBM FlashSystem A9000 and A9000R all-flash arrays.

Note: For more information about IBM FlashSystem A9000 and A9000R, see the following IBM publications:

•IBM FlashSystem A9000 and IBM FlashSystem A9000R Architecture and Implementation, SG24-8345

•IBM FlashSystem A9000 Product Guide, REDP-5325

3.12 IBM File Storage solutions

This section describes IBM file storage solutions.

3.12.1 IBM Spectrum Scale

IBM Spectrum Scale is a proven, scalable, high-performance file management solution that is based on IBM’s General Parallel File System (GPFS™). IBM Spectrum Scale provides world-class storage management with extreme scalability, flash accelerated performance, and automatic policy-based storage tiering from flash to disk, then to tape. IBM Spectrum Scale reduces storage costs up to 90% while improving security and management efficiency in cloud, big data, and analytics environments.

First introduced in 1998, this mature technology enables a maximum volume size of 8 YB, a maximum file size of 8 EB, and up to 18.4 quintillion (two to the 64th power) files per file system. IBM Spectrum Scale provides simplified data management and integrated information lifecycle tools such as software-defined storage for cloud, big data, and analytics. It introduces enhanced security, flash accelerated performance, and improved usability. It also provides capacity quotas, access control lists (ACLs), and a powerful snapshot function.
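The file-count limit that is quoted in the preceding paragraph can be checked with a one-line calculation. The variable name is illustrative only.

```python
# Two to the 64th power, the per-file-system file limit that the text quotes.
max_files = 2 ** 64

print(f"{max_files:,}")                          # 18,446,744,073,709,551,616
print(f"{max_files / 10**18:.1f} quintillion")   # 18.4 quintillion
```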

Key capabilities

IBM Spectrum Scale adds elasticity with the following capabilities:

•Global namespace with high-performance access scales from departmental to global

•Automated tiering, data lifecycle management from flash (6x acceleration) to tape (10x savings)

•Enterprise ready with data security (encryption), availability, reliability, large scale

•POSIX compliant

•Integrated with OpenStack components and Hadoop

Benefits

IBM Spectrum Scale provides the following benefits:

•Improves performance by removing data-related bottlenecks

•Automated tiering, data lifecycle management from flash (acceleration) to tape (savings)

•Enables sharing of data across multiple applications

•Reduces cost per performance by placing data on most applicable storage (flash to tape or cloud)

IBM Spectrum Scale is part of the IBM market-leading software-defined storage family. Consider the following points:

•As a Software-only solution: Runs on virtually any hardware platform and supports almost any block storage device. IBM Spectrum Scale runs on Linux (including Linux on IBM Z® Systems), IBM AIX®, and Windows systems.

•As an integrated IBM Elastic Storage™ Server solution: A bundled hardware, software, and services offering that includes installation and ease of management with a graphical user interface. Elastic Storage Server provides unsurpassed end-to-end data availability, reliability, and integrity with unique technologies that include IBM Spectrum Scale RAID.

•As a cloud service: IBM Spectrum Scale delivered as a service provides high performance, scalable storage, and integrated data governance for managing large amounts of data and files in the IBM Cloud.

IBM Spectrum Scale features enhanced security with native encryption and secure erase. It can increase performance by using server-side flash cache to increase I/O performance up to six times. IBM Spectrum Scale provides improved usability through data replication capabilities, data migration capabilities, Active File Management (AFM), transparent cloud tiering (TCT), File Placement Optimizer (FPO), and IBM Spectrum Scale Native RAID.

An example of the IBM Spectrum Scale architecture is shown in Figure 3-24.

Figure 3-24 IBM Spectrum Scale architecture

IBM Spectrum Scale is based around the following concepts:

•Storage pools

•File sets

•Policy engine

•Mirroring, replication, and migration capabilities

•Active File Management

•File Placement Optimizer

•Licensing

Storage pools

A storage pool is a collection of disks or arrays with similar attributes. It is an organizational structure that allows the combination of multiple storage locations with identical characteristics. The following types of storage pools are available:

•System Pool

One system pool is needed per file system. The system pool includes file system metadata and can be used to store data.

•Data Pool

A data pool is used to store file data. A data pool is optional.

•External Pool

An external pool is used to attach auxiliary storage, such as tape, to IBM Spectrum Scale. An external pool is optional.

File sets

IBM Spectrum Scale creates a single name space; therefore, tools are available that provide fine-grained management of the directory structure. A file set acts as a partition of a file system, a subdirectory tree.

File sets can be used for operations, such as quotas, or in management policies. A file set is a directory tree that behaves as a “file system” within a file system. Consider the following points:

•It is part of the global namespace.

•It can be linked and unlinked (such as mount and unmount).

•Policy scan can be restricted to only scan file sets. This setting can be helpful when the file system has billions of files.

•A file set can be assigned to a storage pool.

The following types of file sets are available:

•Dependent

A dependent file set allows for a finer granularity of administration. It shares the inode space with another file set.

•Independent

An independent file set features a distinct inode space. An independent file set allows file set level snapshots and independent file scans, and enables advanced features, such as AFM.

Policy engine

The policy engine uses an SQL style syntax to query or operate on files based on file attributes. Policies can be used to migrate all data that has not been accessed in 6 months (for example) to less expensive storage or used to query the contents of a file system. Management policies support advanced query capabilities, though what makes the policy engine most useful is the performance. The policy engine is capable of scanning billions of objects as shown in Table 3-5.

Table 3-5 Speed comparison for GPFS policy engine

Search through 1,000,000,000 (1 billion) files
find	~ 47 hours
GPFS policy engine	~ 5 hours

Table 3-5 shows the power of the GPFS policy engine. Although an average find across 1 billion files took ~47 hours, the GPFS policy engine can satisfy the request within five hours. The GPFS policy engine can also create a candidate list for backup applications to use to achieve a massive reduction in candidate identification time.
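The migration example that is mentioned earlier in this section (move data that has not been accessed in 6 months to less expensive storage) can be mimicked with a minimal sketch. The sketch is not the IBM Spectrum Scale policy language and does not call the policy engine. The FileRecord type, the pool names, and the 180-day threshold are assumptions that are made only to show the attribute-based selection step.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRecord:
    """Hypothetical stand-in for the file attributes that a policy inspects."""
    path: str
    pool: str
    access_time: datetime


def migration_candidates(files, now, from_pool, max_idle_days=180):
    """Return files in from_pool whose last access is older than max_idle_days."""
    cutoff = now - timedelta(days=max_idle_days)
    return [f for f in files if f.pool == from_pool and f.access_time < cutoff]


if __name__ == "__main__":
    now = datetime(2024, 1, 1)
    files = [
        FileRecord("/fs/a.dat", "system", now - timedelta(days=400)),
        FileRecord("/fs/b.dat", "system", now - timedelta(days=10)),
        FileRecord("/fs/c.dat", "data", now - timedelta(days=900)),
    ]
    for f in migration_candidates(files, now, from_pool="system"):
        print("migrate", f.path)
```

The resulting list plays the same role as the candidate list that the policy engine can hand to a backup or migration application.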

IBM Spectrum Scale has next generation availability with features that include rolling software and hardware upgrades. You can add and remove servers to adapt the performance and capacity of the system to changing needs. Storage can be added or replaced online, and you can control how data is balanced after storage is assessed.

Mirroring, replication, and migration capabilities

In IBM Spectrum Scale, you can replicate a single file, a set of files, or the entire file system. You can also change the replication status of a file at any time by using a policy or command. Using these capabilities, you can achieve a replication factor of two, which equals mirroring, or a replication factor of three.

A replication factor of two in IBM Spectrum Scale means that each block of a replicated file is in at least two failure groups. A failure group is defined by the administrator, contains one or more disks, and can be changed at any time. Each storage pool in a file system contains one or more failure groups. Therefore, when a file system is fully replicated, any single failure group can fail and the data remains online.
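The relationship between replication factor and failure groups can be illustrated with a minimal sketch. The round-robin placement, the function names, and the failure group labels are assumptions for illustration; they do not describe how IBM Spectrum Scale places blocks internally. The sketch places each block in distinct failure groups and then checks that every block keeps at least one replica when any single failure group is lost.

```python
def place_replicas(block_id, failure_groups, replication_factor=2):
    """Pick distinct failure groups for one block (simple round-robin, assumed)."""
    if replication_factor > len(failure_groups):
        raise ValueError("need at least as many failure groups as replicas")
    start = block_id % len(failure_groups)
    return [
        failure_groups[(start + i) % len(failure_groups)]
        for i in range(replication_factor)
    ]


def survives_single_failure(placements, failure_groups):
    """True if every block keeps a replica after any one failure group fails."""
    return all(
        any(group != failed for group in replicas)
        for failed in failure_groups
        for replicas in placements.values()
    )


if __name__ == "__main__":
    groups = ["fg1", "fg2", "fg3"]
    placements = {b: place_replicas(b, groups) for b in range(6)}
    print(placements[0])                                # ['fg1', 'fg2']
    print(survives_single_failure(placements, groups))  # True
```

With a replication factor of one, the same check returns False because the loss of a block's only failure group takes that block offline.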

For migration, IBM Spectrum Scale provides the capability to add storage to the file system, migrate the existing data to the new storage, and remove the old storage from the file system. All of this can be done online without disruption to your business.

Active File Management

AFM enables the sharing of data across unreliable or high latency networks. With AFM, you can create associations between IBM Spectrum Scale clusters and define the location and flow of file data. AFM allows you to implement a single name space view across clusters, between buildings, and around the world.

AFM operates at the file set level. This configuration means that you can create hundreds of AFM relationships in each file system. AFM is a caching technology, although the inode and file data in a cache file set are the same as the inode and file data in any IBM Spectrum Scale file system. It is a “real” file that is stored on disk. The job of the cache is to keep the data in the file consistent with the data on the other side of the relationship.

AFM can be implemented in five different modes:

•Read-Only (ro)

•Local-Update (lu)

•Single-Writer (sw)

•Independent Writer (iw)

•Asynchronous DR

These modes can be used to collect data at a remote location (single-writer), create a flash cache for heavily read data (read-only), provide a development copy of data (local-update), create a global interactive name space (independent-writer), and create asynchronous copies of file data (asynchronous DR).

Transparent Cloud Tiering

Data in the enterprise is growing at an alarming rate led by growth in unstructured data, leading to a capacity crisis. Cooler and cold data constitutes a large proportion of data in the enterprise. Migrating this data to lower-cost cloud object storage provides cost savings.

Transparent cloud tiering is a new feature of IBM Spectrum Scale 4.2.1 that provides hybrid cloud storage capability. This software-defined capability enables usage of public, private, and on-premises cloud object storage as a secure, reliable, transparent storage tier that is natively integrated with IBM Spectrum Scale without introducing more hardware appliances or new management touch points. It uses the ILM policy language semantics that are available in IBM Spectrum Scale. The semantics allow administrators to define policies for tiering cooler and cold data to the following cloud object storage targets:

•IBM Cloud Object Storage (Cleversafe)

•Amazon Web Services S3

•OpenStack Swift

This configuration frees up storage capacity in higher-cost storage tiers that can be used for more active data.

The IBM Spectrum Scale transparent cloud tiering feature is shown in Figure 3-25.

Figure 3-25 IBM Spectrum Scale transparent cloud tiering feature highlights

For more information, see Enabling Hybrid Cloud Storage for IBM Spectrum Scale Using Transparent Cloud Tiering, REDP-5411.

IBM Spectrum Scale Management GUI

The IBM Spectrum Scale Management GUI (Graphical User Interface) can be used in conjunction with the existing command line interface. The GUI is meant to support common administrator tasks, such as provisioning more capacity, which can be accomplished faster and without knowledge of the command-line interface.

System health, capacity, and performance displays can be used to identify trends and respond quickly to any issues that arise. The GUI is available to IBM Spectrum Scale Clusters running at or above the 4.2 release for the Standard Edition and Advanced Edition.

For more information, see Figure 3-26.

Figure 3-26 IBM Spectrum Scale management GUI dashboard

The IBM Spectrum Scale management GUI provides an easy way to configure and manage various features that are available with the IBM Spectrum Scale system. You can perform the following important tasks through the IBM Spectrum Scale management GUI:

•Monitoring the performance of the system based on various aspects

•Monitoring system health

•Managing file systems

•Creating file sets and snapshots

•Managing Objects, NFS, and SMB data exports

•Creating administrative users and defining roles for the users

•Creating object users and defining roles for them

•Defining default, user, group, and file set quotas

•Monitoring the capacity details at various levels such as file system, pools, file sets, users, and user groups

File Placement Optimizer

FPO allows IBM Spectrum Scale to use locally attached disks on a cluster of servers that communicate over the network, rather than the usual case of dedicated servers that provide shared disk access (such as through a SAN). IBM Spectrum Scale FPO is suitable for workloads such as SAP HANA and IBM Db2 with Database Partitioning Feature. It can be used as an alternative to Hadoop Distributed File System (HDFS) in big data environments.

The use of FPO extends the core IBM Spectrum Scale architecture, which provides greater control and flexibility to use data location, reduces hardware costs, and improves I/O performance. The following benefits are realized when FPO is used:

•Allows your jobs to be scheduled where the data is located (locality awareness)

•Metablocks that allow large and small block sizes to coexist in the same file system

•Write affinity that allows applications to dictate the layout of files on different nodes, maximizing write and read bandwidth

•Pipelined replication to maximize use of network bandwidth for data replication

•Distributed recovery to minimize the effect of failures on ongoing computation

For more information about IBM Spectrum Scale FPO, see GPFS V4.1: Advanced Administration Guide, SC23-7032.

IBM Spectrum Scale Native RAID

IBM Spectrum Scale Native RAID provides next generation performance and data security. Using IBM Spectrum Scale native RAID, just a bunch of disks (JBOD) are directly attached to the systems running IBM Spectrum Scale software. This technology uses declustered RAID to minimize performance degradation during RAID rebuilds and provides extreme data integrity by using end-to-end checksums and version numbers to detect, locate, and correct silent disk corruption. An advanced disk hospital function automatically addresses storage errors and slow performing drives so that your workload is not affected.
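The role of checksums and version numbers in detecting silent corruption can be illustrated with a minimal sketch. The sketch is not the IBM Spectrum Scale Native RAID implementation: the use of CRC-32, the dictionary layout, and the function names are assumptions that are made only for illustration. The sketch covers detection only; correction also requires a redundant copy or parity, which is not modeled here.

```python
import zlib


def write_block(data: bytes, version: int) -> dict:
    """Store data together with a checksum and a version number."""
    return {"data": data, "version": version, "crc": zlib.crc32(data)}


def verify_block(stored: dict, expected_version: int) -> str:
    """Classify a stored block as 'ok', 'corrupt', or 'stale'."""
    if zlib.crc32(stored["data"]) != stored["crc"]:
        return "corrupt"  # the bytes changed after the checksum was computed
    if stored["version"] != expected_version:
        return "stale"    # the checksum is valid, but an older write was returned
    return "ok"


if __name__ == "__main__":
    good = write_block(b"payload", version=2)
    flipped = dict(good, data=b"pay1oad")          # simulated silent bit rot
    old = write_block(b"old payload", version=1)   # simulated dropped write

    print(verify_block(good, expected_version=2))
    print(verify_block(flipped, expected_version=2))
    print(verify_block(old, expected_version=2))
```

The version number catches a case that a checksum alone misses: a block whose contents are internally consistent but that belongs to an earlier write.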

IBM Spectrum Scale Native RAID is available with the IBM POWER8 architecture in the IBM Elastic Storage Server (ESS) offering.

Licensing

IBM Spectrum Scale V5 offers the following editions so you only pay for the functions that you need:

•Standard Edition includes the base function plus ILM, AFM, and integrated multiprotocol support, which includes NFS, SMB, and Object. It is measured by the number of servers and clients that are attached to the cluster.

•Data Management Edition includes encryption of data at rest, secure erase, asynchronous multisite disaster recovery, and all the features of Standard Edition. It is measured by the amount of storage capacity that is supported in the cluster and includes all connected servers and clients. Data Management also includes tiering-to-Object storage (on-prem or cloud), and file audit logging.

For more information, see the following resources:

•IBM Spectrum Scale:

http://www.ibm.com/systems/storage/spectrum/scale/index.html

•IBM Spectrum Scale (IBM Knowledge Center):

http://www.ibm.com/support/knowledgecenter/STXKQY/ibmspectrumscale_welcome.html

•IBM Spectrum Scale Wiki:

https://ibm.biz/BdFPR2

•IBM Elastic Storage Server

http://www.ibm.com/systems/storage/spectrum/ess

IBM Spectrum Scale for Linux on IBM Z

The IBM Spectrum Scale for Linux on IBM Z implements the IBM Spectrum Scale Software-based delivery model in the Linux on IBM Z environment. The highlights of IBM Spectrum Scale for Linux on IBM Z include the following features:

•Supports extended count key data (IBM ECKD™) DASD disks and Fibre Channel Protocol attached SCSI disks

•Supports IBM HiperSockets™ for communication within one IBM Z System

For more information, see Getting started with IBM Spectrum Scale for Linux on IBM Z, which is available at this website:

http://www.ibm.com/common/ssi/cgi-bin/ssialias?htmlfid=ZSW03272USEN

Using IBM Spectrum Scale in an OpenStack cloud deployment

Deploying OpenStack over IBM Spectrum Scale offers benefits that are provided by the many enterprise features in IBM Spectrum Scale. It also provides the ability to consolidate storage for various OpenStack components and applications that are running on top of the OpenStack infrastructure under a single storage management plan (see Figure 3-27).

Figure 3-27 IBM Spectrum Scale in an OpenStack cloud deployment

One key benefit of IBM Spectrum Scale is that it provides uniform access to data under a single namespace with integrated analytics.

The following OpenStack components are related to IBM Spectrum Scale:

•Cinder: Provides virtualized block storage for virtual machines. The IBM Spectrum Scale Cinder driver, also known as the GPFS driver, is written to take full advantage of the IBM Spectrum Scale enterprise features.

•Glance: Provides the capability to manage virtual machine images. When Glance is configured to use the same IBM Spectrum Scale fileset that stores Cinder volumes, bootable images can be created almost instantly by using the copy-on-write file clone capability.

•Swift: Provides object storage to any user or application that requires access to data through a RESTful API. The Swift object storage configuration was optimized for the IBM Spectrum Scale environment, which provides high availability and simplified management. Swift object storage also supports the native Swift APIs and Amazon S3 APIs for accessing data. Finally, Swift object storage supports access to the same data through the object interface or a file interface (POSIX, NFS, SMB) without creating a copy.

•Manila: Provides a shared file system access to client, virtual, and physical systems. The IBM Spectrum Scale share driver (GPFS driver) is written to take full advantage of the IBM Spectrum Scale enterprise features.

•Keystone: Although not a storage component, internal keystone with in-built HA is provided by IBM Spectrum Scale as part of the Object protocol. In deployments that already have keystone support, the Object protocol can be configured to use the external keystone server rather than the internal one

IBM Spectrum Scale for Amazon Web Services

Amazon Quick Starts are built by Amazon Web Services (AWS) solutions architects and partners to help you deploy popular solutions on AWS, based on AWS best practices for security and HA. A Quick Start automatically deploys a highly available IBM Spectrum Scale cluster on the AWS Cloud, into a configuration of your choice.

IBM Spectrum Scale is placed into a virtual private cloud (VPC) that spans two Availability Zones in your AWS account. You can build a VPC for IBM Spectrum Scale, or deploy the software into your VPC (see Figure 3-28).

Figure 3-28 IBM Spectrum Scale on AWS

For more information about IBM Spectrum Scale on AWS, see this website:

https://aws.amazon.com/quickstart/architecture/ibm-spectrum-scale

3.12.2 IBM Spectrum NAS

IBM Spectrum NAS is a software-defined solution to help customers with enterprise NAS, remote, and departmental file storage needs. It is deployed on storage-rich standard x86 servers, with an initial deployment of four instances (physical servers).

IBM Spectrum NAS provides support for SMB and NFS fileshares to users, leveraging internal storage of the servers. IBM Spectrum NAS is designed to support General Purpose Enterprise NAS workloads, such as home directories, Microsoft Applications that require SMB file storage, and Virtual Machines that require file storage.

IBM Spectrum NAS provides the following key capabilities:

•SMB 1, 2.1 and 3.1.1

•NFS 3, 4.0 and NFS 4.1

IBM Spectrum NAS is managed from a simple and easy-to-use GUI across the cluster, which removes the need to manage independent file servers or filers. IBM Spectrum NAS data management allows mixing workloads on the same cluster and in multiple tiers in the cluster.

IBM Spectrum NAS includes the following features:

•Snapshots

•Quotas

•Tiering

•Encryption

•NENR (Non-Erasable, Non-rewritable) Capabilities

•Data Retention

•Synchronous & Asynchronous Replication for DR

•AntiVirus Integration

•Authentication by way of LDAP, AD, NIS, Kerberos KDC and local databases

IBM Spectrum NAS Architecture

IBM Spectrum NAS uses a true scale out architecture that allows deployments to add storage nodes as needed. The software architecture features the following tiers (as shown in Figure 3-29):

•Scale-Out Protocols: Support connections across the file system on all of the nodes in the cluster.

•Scale-Out Cache Pool: Provides the ability to use NVMe drives to support read and write cache.

•Scale-Out File Systems: Provides for the file storage across the name space in the cluster.

•Scale-Out Data Store: Provides data protection across the cluster by using a tunable erasure coding structure that is based on the number of servers in the cluster and the amount of required redundancy.

Figure 3-29 IBM Spectrum NAS Software architecture

3.12.3 IBM Spectrum Archive

A member of the IBM Spectrum Storage family, IBM Spectrum Archive enables direct, intuitive, and graphical access to data that is stored in IBM tape drives and libraries by incorporating the IBM Linear Tape File System™ (LTFS) format standard for reading, writing, and exchanging descriptive metadata on formatted tape cartridges. IBM Spectrum Archive eliminates the need for extra tape management and software to access data.

IBM Spectrum Archive offers the following software solutions for managing your digital files with the LTFS format:

•IBM Spectrum Archive Single Drive Edition (SDE)

•IBM Spectrum Archive Library Edition (LE)

•IBM Spectrum Archive Enterprise Edition (EE)

With IBM Spectrum Archive Enterprise Edition and IBM Spectrum Scale, tape can now add savings as a low-cost storage tier. The use of a tape tier for active but “cold” data enables enterprises to look at new ways to cost-optimize their unstructured data storage. They can match the value of the data, or the value of the copies of data, to the most appropriate storage media.

In addition, the capability to store the data at the cost of tape storage allows customers to build their cloud environments to take advantage of this new cost structure. IBM Spectrum Archive provides enterprises with the ability to store cold data at costs that can be cheaper than some public cloud provider options.

For more information about the potential costs with large-scale cold data storage and retention, see IBM’s Tape TCO Calculator, which is available at this website:

http://www.ibm.com/systems/storage/tape/tco-calculator

Network attached unstructured data storage with native tape support using LTFS delivers the best mix of performance and lowest cost storage.

Key capabilities

IBM Spectrum Archive options can support small, medium, and enterprise businesses with the following advantages:

•Seamless virtualization of storage tiers

•Policy-based placement of data

•Single universal namespace for all file data

•Security and protection of assets

•Open, non-proprietary, cross platform interchange

•Integrated functionality with IBM Spectrum Scale

Benefits

IBM Spectrum Archive enables direct, intuitive, and graphical access to data that is stored in IBM tape drives and libraries by incorporating the LTFS format standard for reading, writing, and exchanging descriptive metadata on formatted tape cartridges. IBM Spectrum Archive eliminates the need for more tape management and software to access data.

IBM Spectrum Archive takes advantage of the low cost of tape storage while making it easy to use. IBM Spectrum Archive provides the following benefits:

•Access and manage all data in stand-alone tape environments as easily as though it were on disk

•Enable easy-as-disk access to single or multiple cartridges in a tape library

•Improve efficiency and reduce costs for long-term, tiered storage

•Optimize data placement for cost and performance

•Enable data file sharing without proprietary software

•Scalable and low cost

Linear Tape File System

IBM developed LTFS and then contributed it to SNIA as an open standard so that all tape vendors can participate. LTFS is the first file system that works with Linear Tape-Open (LTO) generation 8, 7, 6, and 5 tape technology (or IBM TS1155, TS1150, and TS1140 tape drives) to set a new standard for ease of use and portability for open systems tape storage.

With this application, accessing data that is stored on an IBM tape cartridge is as easy and intuitive as using a USB flash drive. Tapes are self-describing, and you can quickly recall any file from a tape without having to read the whole tape from beginning to end.

Also, any LTFS-capable system can read a tape that is created by any other LTFS-capable system (regardless of the operating system and platform). Any LTFS-capable system can identify and retrieve the files that are stored on it. LTFS-capable systems have the following characteristics:

•Files and directories are displayed to you as a directory tree listing.

•More intuitive searches of cartridge and library content are now possible due to the addition of file tagging.

•Files can be moved to and from LTFS tape by using the familiar drag-and-drop metaphor common to many operating systems.

•Many applications that were written to use files on disk can now use files on tape without any modification.

•All standard File Open, Write, Read, Append, Delete, and Close functions are supported.
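Because an LTFS volume is presented as an ordinary file system, the operations in the preceding list need no tape-specific API. The following minimal Python sketch exercises them with standard library calls only. The mount point is a hypothetical placeholder: a temporary directory stands in for it so that the sketch runs anywhere, and on a real system it would be the directory where the LTFS volume is mounted.

```python
import os
import tempfile


def exercise_file_operations(mount_point):
    """Run open, write, read, append, delete, and close against a directory.

    mount_point is assumed to be where an LTFS volume is mounted; any
    writable directory behaves the same way for this sketch.
    """
    path = os.path.join(mount_point, "report.txt")

    # Open and write; leaving the with-block closes the file.
    with open(path, "w") as f:
        f.write("first line\n")

    # Append to the same file.
    with open(path, "a") as f:
        f.write("second line\n")

    # Read the file back.
    with open(path) as f:
        lines = f.read().splitlines()

    # Delete the file.
    os.remove(path)
    return lines, os.path.exists(path)


# Hypothetical stand-in for an LTFS mount point.
with tempfile.TemporaryDirectory() as fake_mount:
    lines, still_exists = exercise_file_operations(fake_mount)
    print(lines, still_exists)
```

Tape remains a sequential medium, so access latency on a real volume can differ greatly from disk even though the calls are identical.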

IBM Spectrum Archive Editions

As shown in Figure 3-30, IBM Spectrum Archive is available in different editions that support small, medium, and enterprise businesses.

Figure 3-30 IBM Spectrum Archive SDE, LE, and EE implementations

IBM Spectrum Archive Single Drive Edition

The IBM Spectrum Archive Single Drive Edition implements the LTFS Format and allows tapes to be formatted as LTFS Volumes. These LTFS Volumes can then be mounted by using LTFS to allow users and applications direct access to files and directories that are stored on the tape. No integration with tape libraries exists in this edition. You can access and manage all data in stand-alone tape environments as simply as though it were on disk.

IBM Spectrum Archive Library Edition

IBM Spectrum Archive Library Edition extends the file management capability of the IBM Spectrum Archive SDE. IBM Spectrum Archive LE was introduced with Version 2.0 of LTFS. It enables easy-as-disk access to single or multiple cartridges in a tape library.

LTFS is the first file system that works with IBM System Storage tape technology to optimize ease of use and portability for open-systems tape storage. It manages the automation and provides operating system-level access to the contents of the library.

IBM Spectrum Archive LE is based on the LTFS format specification, which enables tape library cartridges to be interchangeable with cartridges that are written with the open source SDE version of IBM Spectrum Archive. IBM Spectrum Archive LE supports most IBM tape libraries, including the following examples:

•TS2900 tape autoloader

•TS3100 tape library

•TS3200 tape library

•TS3310 tape library

•TS3500 tape library

•TS4300 tape library

•TS4500 tape library

Note: IBM TS1155, TS1150, and IBM TS1140 tape drives are supported on the IBM TS4500 and IBM TS3500 tape libraries only.

IBM Spectrum Archive LE enables the reading, writing, searching, and indexing of user data on tape and access to user metadata. Metadata is the descriptive information about user data that is stored on a cartridge. Metadata enables searching and accessing files through the GUI of the operating system. IBM Spectrum Archive LE supports Linux and Windows.

IBM Spectrum Archive LE provides the following product features:

•Direct access and management of data on tape libraries with LTO Ultrium 8 (LTO-8), LTO Ultrium 7 (LTO-7), LTO Ultrium 6 (LTO-6), LTO Ultrium 5 (LTO-5), and TS1155, TS1150, and TS1140 tape drives

•Tagging of files with any text, allowing more intuitive searches of cartridge and library content

•Exploitation of the media partitioning that is defined in the LTO-5 tape format standard

•One-to-one mapping of tape cartridges in tape libraries to file folders

•Capability to create a single file system mount point for a logical library that is managed by a single instance of LTFS and runs on a single computer system

•Capability to cache tape indexes, and to search, query, and display tape content within an IBM tape library without having to mount tape cartridges

The IBM Spectrum Archive LE offers the same basic capabilities as the SDE with additional support of tape libraries. Each LTFS tape cartridge in the library appears as an individual folder within the file space. The user or application can browse to these folders to access the files that are stored on each tape. The IBM Spectrum Archive LE software automatically controls the tape library robotics to load and unload the necessary LTFS Volumes to provide access to the stored files.
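The one-to-one mapping of cartridges to folders means that library content can be browsed with ordinary directory calls. The following sketch, offered as an illustration only, builds a dictionary of cartridge folder names and the files in each. The cartridge labels and the directory tree are made up for the example; a temporary directory stands in for the single mount point of the logical library.

```python
import os
import tempfile


def list_cartridge_contents(library_mount):
    """Map each cartridge folder under the mount point to its file names."""
    contents = {}
    for entry in sorted(os.listdir(library_mount)):
        folder = os.path.join(library_mount, entry)
        if os.path.isdir(folder):
            contents[entry] = sorted(os.listdir(folder))
    return contents


# Build a stand-in directory tree; the cartridge labels are hypothetical.
with tempfile.TemporaryDirectory() as mount:
    for cartridge, name in [("TAPE01L8", "a.mov"), ("TAPE02L8", "b.mov")]:
        os.makedirs(os.path.join(mount, cartridge))
        open(os.path.join(mount, cartridge, name), "w").close()
    print(list_cartridge_contents(mount))
```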

IBM Spectrum Archive Enterprise Edition

IBM Spectrum Archive Enterprise Edition (EE) gives organizations an easy way to use cost-effective IBM tape drives and libraries within a tiered storage infrastructure. By using tape libraries instead of disks for Tier 2 and Tier 3 data storage (data that is stored for long-term retention), organizations can improve efficiency and reduce costs.

In addition, IBM Spectrum Archive EE seamlessly integrates with the scalability, manageability, and performance of IBM Spectrum Scale, which is an IBM enterprise file management platform that enables organizations to move from simply adding storage to optimizing data management.

IBM Spectrum Archive EE includes the following highlights:

•Simplify tape storage with the IBM LTFS format, which is combined with the scalability, manageability, and performance of IBM Spectrum Scale

•Help reduce IT expenses by replacing tiered disk storage (Tier 2 and Tier 3) with IBM tape libraries

•Expand archive capacity by simply adding and provisioning media without affecting the availability of data already in the pool

•Add extensive capacity to IBM Spectrum Scale installations with lower media, floor space, and power costs

•Support for attaching up to two tape libraries to a single IBM Spectrum Scale cluster

IBM Spectrum Archive EE for the IBM TS4500, IBM TS4300, IBM TS3500, and IBM TS3310 tape libraries provides seamless integration of IBM Spectrum Archive with Spectrum Scale by creating an LTFS tape tier. You can run any application that is designed for disk files on tape by using IBM Spectrum Archive EE.

IBM Spectrum Archive EE can play a major role in reducing the cost of storage for data that does not need the access performance of primary disk. This configuration improves efficiency and reduces costs for long-term, tiered storage.

With IBM Spectrum Archive EE, you can enable the use of LTFS for the policy management of tape as a storage tier in an IBM Spectrum Scale environment and use tape as a critical tier in the storage environment. IBM Spectrum Archive EE supports IBM LTO Ultrium 8, 7, 6, and 5 tape drives and IBM System Storage TS1155, TS1150, and TS1140 tape drives that are installed in TS4500 and TS3500 tape libraries. It also supports LTO Ultrium 8, 7, 6, and 5 tape drives that are installed in the TS4300 and TS3310 tape libraries.

The use of IBM Spectrum Archive EE to replace disks with tape in Tier 2 and Tier 3 storage can improve data access over other storage solutions. It also improves efficiency and streamlines management for files on tape. IBM Spectrum Archive EE simplifies the use of tape by making it transparent to the user and manageable by the administrator under a single infrastructure.

The integration of IBM Spectrum Archive EE archive solution with Spectrum Scale is shown in Figure 3-31.

Figure 3-31 Integration of Spectrum Scale and Spectrum Archive EE

The seamless integration offers transparent file access in a continuous name space. It provides file level write and read caching with disk staging area, policy-based movement from disk to tape, creation of multiple data copies on different tapes, load balancing, and HA in multi-node clusters.

It also offers data exchange on LTFS tape by using import and export functions, fast import of file name space from LTFS tapes without reading data, built-in tape reclamation and reconciliation, and simple administration and management.

For more information, see this website:

http://www.ibm.com/systems/storage/tape/ltfs
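The idea behind policy-based movement from disk to tape can be pictured with a small selection rule. The following Python sketch is not the IBM Spectrum Scale policy language or the IBM Spectrum Archive EE migration engine; it only illustrates one plausible rule, which is an assumption made for this example: treat files that were not accessed for a given number of days as “cold” candidates for the tape tier.

```python
import os
import tempfile
import time

SECONDS_PER_DAY = 86400


def select_cold_files(root, min_idle_days, now=None):
    """Return paths under root whose last access is older than min_idle_days."""
    now = time.time() if now is None else now
    cutoff = now - min_idle_days * SECONDS_PER_DAY
    cold = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_atime < cutoff:
                cold.append(path)
    return sorted(cold)


with tempfile.TemporaryDirectory() as root:
    hot = os.path.join(root, "hot.dat")
    cold = os.path.join(root, "cold.dat")
    for path in (hot, cold):
        open(path, "w").close()
    # Pretend that cold.dat was last accessed 90 days ago.
    old = time.time() - 90 * SECONDS_PER_DAY
    os.utime(cold, (old, old))
    candidates = select_cold_files(root, min_idle_days=30)
    print([os.path.basename(p) for p in candidates])
```

A real policy typically combines several attributes (size, age, path, pool occupancy); last access time alone is used here to keep the sketch short.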

IBM Spectrum Archive in two site mode

Asynchronous Archive Replication is an extension to the stretched cluster configuration for users who require that the data that is created is replicated to a secondary site and can be migrated to tape at both sites. It is implemented by incorporating IBM Spectrum Scale AFM into the stretched cluster. In addition to geolocation capabilities, data that is created on home or cache is asynchronously replicated to the other site.

Asynchronous Archive Replication (see Figure 3-32) requires two remote clusters to be configured: the home cluster and a cache cluster in independent writer mode. By using the independent writer mode in this configuration, users can create files at either site, and the data and metadata are asynchronously replicated to the other site.

Figure 3-32 Asynchronous Archive Replication

For more information about the latest IBM Spectrum Archive release, see IBM Spectrum Archive Enterprise Edition V1.2.6 Installation and Configuration Guide, SG24-8333.

Monitoring statistics of IBM Spectrum Archive

IBM Spectrum Archive Enterprise Edition (IBM Spectrum Archive EE) supports a dashboard that helps storage administrators manage and monitor the storage system by using a browser-based graphical interface. By using the dashboard, you can see the following information without the need to log in to a system and enter a command:

•If a system is running without error. If an error exists, the type of detected error is indicated.

•Basic tape-related configuration, such as how many pools are available and the amount of space that is available.

•The storage consumption over time for each tape pool.

•The throughput of each drive for migration and recall.

The IBM Spectrum Archive EE dashboard is shown in Figure 3-33.

Figure 3-33 IBM Spectrum Archive sample dashboard

For more information, see the IBM Spectrum Archive Dashboard Deployment Guide.

OpenStack and IBM Spectrum Archive

IBM Spectrum Archive Enterprise Edition can also be used to provide object storage by using OpenStack Swift. By using this configuration, objects can be stored in the file system and exist on disk or tape tiers within the enterprise.

For more information about creating an object storage Active Archive with IBM Spectrum Scale and Spectrum Archive, see Active Archive Implementation Guide with IBM Spectrum Scale Object and IBM Spectrum Archive, REDP-5237.

3.13 IBM Object Storage solutions

This section describes IBM object storage solutions.

3.13.1 IBM Cloud Object Storage

The IBM Cloud Object Storage (COS) system is a breakthrough cloud platform that helps solve petabyte and beyond storage challenges for companies worldwide. Clients across multiple industries use IBM Cloud Object Storage for large-scale content repository, backup, archive, collaboration, and SaaS.

The Internet of Things (IoT) allows every aspect of life to be instrumented through millions of devices that create, collect, and send data every second. These trends are causing an unprecedented growth in the volume of data being generated. IT organizations are now tasked with finding ways to efficiently preserve, protect, analyze, and maximize the value of their unstructured data as it grows to petabytes and beyond. Object storage is designed to handle unstructured data at web-scale.

The IBM Cloud Object Storage portfolio gives clients strategic data flexibility, simplified management, and consistency with on-premises, cloud, and hybrid cloud deployment options (see Figure 3-34).

Figure 3-34 IBM Cloud Object Storage offers flexibility for on-premises, cloud, and hybrid cloud deployment options

IBM Cloud Object Storage solutions enhance on-premises storage options for clients and service providers with low-cost, large-scale active archives and unstructured data content stores. The solutions complement the software-defined IBM Spectrum Storage portfolio for data protection and backup, tape archive, and a high-performance file and object solution where the focus is on response time.

IBM Cloud Object Storage can be deployed as an on-premises, public cloud, or hybrid solution, which provides unprecedented choice, control, and efficiency:

•On-premises solutions

Deploy IBM Cloud Object Storage on premises for optimal scalability, reliability, and security. The software runs on industry standard hardware for flexibility and simplified management.

•Cloud Solutions

Easily deploy IBM Cloud Object Storage on the IBM Cloud public cloud.

•Hybrid Solutions

For optimal flexibility, deploy IBM Cloud Object Storage as a hybrid solution to support multiple sites across your enterprise (on-premises and in the public cloud) for agility and efficiency.

Access methods

The IBM Cloud Object Storage pool can be shared and is jointly accessible by multiple access protocols:

•Object-based access methods: The Simple Object interface is accessed with an HTTP/REST API. Simple PUT, GET, DELETE, and LIST commands enable applications to access digital content, and the resulting object ID is stored directly within the application. The IBM COS Accesser® does not require a dedicated appliance because the application can talk directly to the IBM COS Slicestor® using object IDs (see Figure 3-35).

•REST API access to storage: REST is a style of software architecture for distributed hypermedia information retrieval systems, such as the World Wide Web. REST-style architectures consist of clients and servers. Clients send requests to servers. Servers process those requests and return associated responses. Requests and responses are built around the transfer of various representations of the resources. The REST API works in a way that is similar to retrieving a Uniform Resource Locator (URL). But instead of requesting a web page, the application references an object.

•File-based access methods: Dispersed storage can also support the traditional NAS protocols (SMB/CIFS and NFS) through integration with third-party gateway appliances. Users and storage administrators are able to easily transfer, access, and preserve data assets over standard file protocol.

Figure 3-35 REST APIs accessing objects using object IDs with IBM COS Slicestor
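The PUT, GET, DELETE, and LIST commands described in the access methods can be sketched as plain HTTP requests. The following Python example only builds the requests and prints them; nothing is sent over the network. The endpoint host name, the container name, the object ID, and the URL layout are all hypothetical and are not taken from IBM COS documentation, and authentication headers are omitted.

```python
from urllib.request import Request

BASE_URL = "https://accesser.example.com"  # hypothetical endpoint


def build_request(method, container, object_id=None, data=None):
    """Build (but do not send) an HTTP request for an object operation."""
    if object_id is None:
        path = f"/{container}"
    else:
        path = f"/{container}/{object_id}"
    return Request(BASE_URL + path, data=data, method=method)


put = build_request("PUT", "archive", "invoice-0001", data=b"hello")
get = build_request("GET", "archive", "invoice-0001")
delete = build_request("DELETE", "archive", "invoice-0001")
# LIST is modeled here as a GET on the container itself (an assumption).
listing = build_request("GET", "archive")

for req in (put, get, delete, listing):
    print(req.get_method(), req.full_url)
```

As with a web page URL, each request names a resource; the difference is that the resource is an object rather than a page.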

The IBM COS System is deployed as a cluster that combines three types of nodes, as shown in Figure 3-36. Each node consists of IBM COS software running on an industry-standard server. IBM COS software is compatible with a wide range of servers from many sources, including a physical or virtual appliance. In addition, IBM conducts certification of specific servers that customers want to use in their environment to help ensure a quick initial installation, long-term reliability, and predictable performance.

Figure 3-36 IBM COS System deployed as a cluster combining three types of nodes

The following three types of nodes are available:

•IBM Cloud Object Storage Manager

•IBM Cloud Object Storage Accesser

•IBM Cloud Object Storage Slicestor

Each IBM COS System includes the following nodes:

•A single Manager node, which provides out-of-band configuration, administration and monitoring capabilities

•One or more Accesser nodes, which provide the storage system endpoint for applications to store and retrieve data

•One or more Slicestor nodes, which provide the data storage capacity for the IBM COS System

The Accesser is a stateless node that presents the storage interface of the IBM COS System to client applications and transforms data using an Information Dispersal Algorithm (IDA). Slicestor nodes receive data to be stored from Accesser nodes on ingest and return data to Accesser nodes as required by reads.

The IDA transforms each object written to the system into a number of slices such that the object can be read bit-perfectly by using a subset of those slices. The number of slices created is called the IDA Width (or Width) and the number required to read the data is called the IDA Read Threshold (or Read Threshold).

The difference between the Width and the Read Threshold is the maximum number of slices that can be lost or temporarily unavailable while still maintaining the ability to read the object. For example, in a system with a width of 12 and threshold of seven, data can be read even if five of the 12 stored slices cannot be read.
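The relationship between Width and Read Threshold is simple arithmetic, which the following short Python sketch makes explicit. The function names are chosen for this illustration only and are not part of any IBM COS interface.

```python
def tolerated_losses(width, read_threshold):
    """Maximum number of slices that can be missing while reads still work."""
    if not 0 < read_threshold <= width:
        raise ValueError("read_threshold must be between 1 and width")
    return width - read_threshold


def can_read(width, read_threshold, unavailable):
    """True if enough slices remain available to meet the Read Threshold."""
    return width - unavailable >= read_threshold


print(tolerated_losses(12, 7))  # width 12, threshold 7
print(can_read(12, 7, 5))       # five slices unavailable
print(can_read(12, 7, 6))       # six slices unavailable
```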

Storage capacity is provided by a group of Slicestor nodes, which are referred to as a storage pool. In the diagram in Figure 3-36, 12 Slicestor nodes are grouped in a storage pool. A single IBM COS System can have one or multiple storage pools.

A Vault is not part of the physical architecture, but is an important concept in an IBM COS System. A Vault is a logical container or a virtual storage space, upon which reliability, data transformation options (for example, IBM COS SecureSlice and IDA algorithm), and access control policies can be defined. Multiple vaults can be provisioned on the same storage pool.

The Information Dispersal Algorithm combines encryption and erasure-coding techniques that are designed to transform the data in a way that enables highly reliable and available storage without making copies of the data as would be required by traditional storage architectures.

Information Dispersal

At the foundation of the IBM COS System is a technology called information dispersal. Information dispersal is the practice of using erasure codes as a means to create redundancy for transferring and storing data. An erasure code is a Forward Error Correction (FEC) code that transforms a message of k symbols into a longer message with n symbols such that the original message can be recovered from a subset of the n symbols (k symbols).

Erasure codes use advanced deterministic math to insert “extra data” in the “original data” that allows a user to need only a subset of the “coded data” to re-create the original data.

An IDA can be made from any Forward Error Correction code. The extra step of the IDA is to split the coded data into multiple segments. These segments can then be stored on different devices or media to attain a high degree of failure independence. For example, the use of FEC alone on files on your computer is unlikely to help if your hard disk drive fails. However, if you use an IDA to separate pieces across machines, you can tolerate multiple failures without losing the ability to reassemble that data.

Figure 3-37 shows five variables (labeled a through e) and eight different equations that use these variables, with each equation yielding a different output. To understand how information dispersal works, imagine that the five variables are bytes.

Following the eight equations, you can compute eight results, each of which is a byte. To solve for the original five bytes, you can use any five of the resulting eight bytes. This process is how information dispersal can support any values for k and n, where k is the number of variables and n is the number of equations.

Figure 3-37 Example of calculations to illustrate how information dispersal works
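The "five variables, eight equations" idea can be tried out with a toy erasure code. The following Python sketch is an illustration under stated assumptions, not the algorithm that IBM COS uses: it picks Vandermonde coefficients and does arithmetic modulo the small prime 257 for readability (so a result can be 256, which does not fit in a byte), whereas production codes typically work in a byte-sized finite field. It encodes five values into eight results and then checks that every choice of five results recovers the original values.

```python
from itertools import combinations

P = 257  # small prime modulus; an assumption made for readability


def encode(data, n):
    """Evaluate n equations over the k input values (Vandermonde rows)."""
    return [sum(d * pow(x, j, P) for j, d in enumerate(data)) % P
            for x in range(1, n + 1)]


def decode(slices, k):
    """Recover the k inputs from k (equation_index, value) pairs."""
    if len(slices) < k:
        raise ValueError("not enough slices")
    chosen = slices[:k]
    # Augmented matrix [coefficients | value] for the chosen equations.
    rows = [[pow(i + 1, j, P) for j in range(k)] + [v] for i, v in chosen]
    # Gauss-Jordan elimination modulo P.
    for col in range(k):
        pivot = next(r for r in range(col, k) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], -1, P)  # modular inverse (Python 3.8+)
        rows[col] = [(a * inv) % P for a in rows[col]]
        for r in range(k):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % P
                           for a, b in zip(rows[r], rows[col])]
    return [rows[r][k] for r in range(k)]


data = [104, 101, 108, 108, 111]  # five byte values, the variables a to e
coded = encode(data, 8)           # eight results, one per equation

# Every choice of 5 of the 8 results should recover the original values.
ok = all(decode([(i, coded[i]) for i in subset], 5) == data
         for subset in combinations(range(8), 5))
print(ok)
```

The recovery works because any five of the eight equations form a solvable system, which is the property that an IDA relies on.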

How the Storage Dispersal and Retrieval works

At a basic level, the IBM COS System uses the following process for slicing, dispersing, and retrieving data (see Figure 3-38):

1. Data is virtualized, transformed, sliced, and dispersed by using IDAs. In the example that is shown in Figure 3-38, the data is separated into 12 slices. Therefore, the “width” (n) of the system is 12.

2. Slices are distributed to separate disks, storage nodes, geographic locations, or some combination of these three. In this example, the slices are distributed to three different sites.

3. The data is retrieved from a subset of slices. In this example, the number of slices that are needed to retrieve the data is 7. Therefore, the “threshold” (k) of the system is 7. Given a width of 12 and a threshold of 7, this example can be called a “7 of 12” (k of n) configuration.

The configuration of a system is determined by the level of reliability required. In a “7 of 12” configuration, five slices can be lost or unavailable and the data can still be retrieved because the threshold of seven slices has been met. With a “5 of 8” configuration, only three slices can be lost, so the level of reliability is lower. Conversely, with a “20 of 32” configuration, 12 slices can be lost, so the level of reliability is higher.

Figure 3-38 COS System’s three steps for slicing, dispersing, and retrieving data
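The “k of n” configurations discussed above can be compared with a few lines of Python. The following sketch also checks whether data stays readable after an entire site is lost. It assumes that slices are spread as evenly as possible across sites, which matches the three-site example but is an assumption for other layouts; the function name is chosen for this illustration.

```python
def survives_site_loss(width, threshold, sites):
    """True if data stays readable after losing any one whole site.

    Assumes slices are spread as evenly as possible across the sites,
    so the worst case is losing the site that holds the largest share.
    """
    largest_share = -(-width // sites)  # ceiling division
    return width - largest_share >= threshold


for threshold, width in [(7, 12), (5, 8), (20, 32)]:
    lost_ok = width - threshold
    site_ok = survives_site_loss(width, threshold, 3)
    print(f"{threshold} of {width}: tolerates {lost_ok} lost slices, "
          f"survives loss of one of three sites: {site_ok}")
```

With a “7 of 12” configuration over three sites, each site holds four slices, so a site outage leaves eight slices, which still meets the threshold of seven. Over only two sites, the same configuration would not survive a site outage.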

Security

IBM COS uses the Information Dispersal Algorithm (IDA) to split and disperse data to all Slicestor nodes; therefore, no whole copy of any object is in any single disk, node, or location. IBM COS implements the following security features on top of the IDA:

•Crucial configuration information is digitally signed.

•Communication between nodes is certificate-based.

•TLS is supported between IBM COS nodes and on Client to Accesser network connections.

•SecureSlice algorithm can optionally be applied when storing data. It can implement RC4-128, AES-128, or AES-256 encryption, with MD5-128 or SHA-256 hash algorithm.

For more information about COS security aspects, see IBM Cloud Object Storage Concepts and Architecture: An Under-the-Hood Guide for IBM Cloud Object Storage, REDP-5435.

Compliance Enabled Vault support

The Compliance Enabled Vault (CEV) solution provides the user with the ability to create compliance vaults. Objects that are stored in compliance vaults are protected objects that include associated retention periods and legal holds. Protected objects cannot be deleted until the retention period expires and all legal holds on the object are deleted.
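The deletion rule for protected objects can be expressed as a single condition. The following Python sketch models that rule only; the class, its fields, and the hold identifier are hypothetical names for this illustration and do not represent the IBM COS implementation or its S3 API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class ProtectedObject:
    stored_at: datetime
    retention: timedelta
    legal_holds: set = field(default_factory=set)

    def can_delete(self, now):
        """Deletable only after retention expires and no legal hold remains."""
        expired = now >= self.stored_at + self.retention
        return expired and not self.legal_holds


obj = ProtectedObject(datetime(2020, 1, 1), timedelta(days=365), {"case-42"})

print(obj.can_delete(datetime(2020, 6, 1)))  # retention still running
print(obj.can_delete(datetime(2021, 6, 1)))  # expired, but a hold remains
obj.legal_holds.discard("case-42")
print(obj.can_delete(datetime(2021, 6, 1)))  # expired and no holds
```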

Applications can leverage CEV storage and control its retention by way of standard S3 API.

With this feature, IBM COS natively complies with the following standards and compliance requirements:

•Securities and Exchange Commission (SEC) Rule 17a-4(f)

•Financial Industry Regulatory Authority (FINRA) Rule 4511, which references requirements of SEC Rule 17a-4(f)

•Commodity Futures Trading Commission (CFTC) Rule 1.31(b)-(c) (July/Aug. ’17 Release)

File support

Although IBM COS is primarily an object storage system, situations exist in which file support is required. In such cases, the following alternatives are available:

•Provide a native file support feature

•Use a solution that is provided by certified partners

Partner-provided file support

The partners that provide a file interface that is certified with IBM COS are listed in Table 3-6.

Table 3-6 IBM COS gateway options

Gateway product	Description
IBM Spectrum Scale	IBM Spectrum Scale is a software-defined parallel file system with a rich HPC heritage. It can make available native IBM GPFS protocol, and CIFS/SMB, NFS, and OpenStack Swift. IBM Spectrum Scale can use IBM COS as an external storage pool and move inactive data to this tier.
IBM Aspera®	IBM Aspera is a software suite that is designed for high-performance data transfer, which is achieved through a protocol stack that replaces TCP with the IBM FASP® protocol. IBM Aspera Direct-to-Cloud provides the capability to make available a file system interface to users and applications through a web interface or through client software that is installed on the server or workstation or mobile device and provide file sync and share capability. Data that is imported through Aspera can be read as an object because one file is uploaded as one object to IBM COS.
Avere FXT Filer	Avere consists of a hardware or virtual offering that makes available CIFS/NFS for users and applications. Because the architecture is caching, active, recent, and hot data is cached locally while the entire data set is kept in IBM COS. Data reduction in the form of compression is available and the customer use cases include I/O intensive workloads, such as rendering and transcoding.
Nasuni Filer	Nasuni filer is a software or hardware solution that provides general purpose NAS capability through SMB/CIFS and NFS. Because the architecture is caching, active, recent, and hot data is cached locally while the entire data set is kept in IBM COS. The suite provides file sync and share capability and a global name space. Data reduction in the form of deduplication and compression is included.
Panzura	Panzura is a software or hardware solution that provides general purpose NAS capability through SMB/CIFS and NFS. Because the architecture is caching, active, recent, and hot data is cached locally while the entire data set is kept in IBM COS. Data reduction in the form of deduplication and compression is included.
Ctera	Ctera is a hardware and software solution that is focused on file sync and share capability. The architecture caches the data set onsite with the master copy retained in IBM COS. Client software can be installed on workstations or mobile devices. Data can also be made available through SMB/CIFS and NFS. Data reduction in the form of deduplication and compression is included.
Storage Made Easy (SME)	SME provides File Sync, which is a software-based file sync and share solution with mobile and workstation clients and a focus on inter-cloud compatibility including IBM COS.
CloudBerry Explorer	CloudBerry Explorer is a software-based object storage client that allows users to directly interact with an IBM COS. Data that is imported through CloudBerry Explorer can be read as an object because one file is uploaded as one object to IBM COS.
Seven10 Storfirst	Seven10 Storfirst is a software-based SMB/CIFS and NFS gateway offering that can communicate with IBM COS, legacy tape, and VTLs.

IBM Cloud Object: Concentrated Dispersal Mode

Concentrated Dispersal Mode enables deployment of a COS System with as few as three Slicestor nodes (six in the case of two-site deployments). Customers can then start with a system as small as 72 TB at reasonable cost, and grow seamlessly to petabytes and beyond. As their object storage capacity needs grow, their cost per TB of capacity decreases significantly.

Concentrated Dispersal Mode is implemented with the following concept:

•When Standard Dispersal Mode is set, data is sliced; then, the slices are spread across the COS system (one slice per Slicestor node).

•With Concentrated Dispersal Mode, after data is sliced, multiple slices are stored within the same Slicestor node. This process allows a COS system to be configured with a wider IDA than the number of Slicestor nodes.

An example of an IDA that is 12 slices wide is shown in Figure 3-39. With standard dispersal mode, this IDA requires 12 Slicestor nodes, whereas with concentrated dispersal mode that is configured with four slices per node, the same IDA requires only three Slicestor nodes.

Figure 3-39 Concentrated Dispersal Mode concept
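The node arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration of the 12-wide IDA case described above; the function name is hypothetical, and the requirement that the IDA width be an exact multiple of the slices per node is an assumption of the sketch, not a documented IBM COS rule.

```python
def slicestor_nodes_required(ida_width: int, slices_per_node: int = 1) -> int:
    """Return how many Slicestor nodes are needed to hold one full set of slices.

    slices_per_node=1 models Standard Dispersal Mode; a larger value models
    Concentrated Dispersal Mode (assumption: width divides evenly across nodes).
    """
    if ida_width <= 0 or slices_per_node <= 0:
        raise ValueError("IDA width and slices per node must be positive")
    if ida_width % slices_per_node != 0:
        raise ValueError("IDA width must be a multiple of slices per node")
    return ida_width // slices_per_node


standard = slicestor_nodes_required(12)         # one slice per node
concentrated = slicestor_nodes_required(12, 4)  # four slices per node
print(f"standard: {standard} nodes, concentrated: {concentrated} nodes")
```

For the 12-wide IDA, the sketch reports 12 nodes in standard mode and three nodes when four slices are concentrated on each node, which matches the figure.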

The number of slices that are stored in the same Slicestor node depends on the configured IDA. A wider IDA reduces the storage expansion factor but requires each Slicestor node to process more slices, which improves storage efficiency but lowers Ops/sec performance. Conversely, a narrower IDA increases the storage expansion factor but reduces the number of slices that each Slicestor node must process, which lowers storage efficiency but improves Ops/sec performance.
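The trade-off can be made concrete with a small calculation. The following sketch assumes that the storage expansion factor is the IDA width divided by the read threshold (the number of slices needed to reconstruct the data); the threshold values and the 24-wide IDA are hypothetical examples chosen for illustration and are not taken from this chapter.

```python
def expansion_factor(width: int, threshold: int) -> float:
    """Raw capacity consumed per unit of user data (assumed: width / threshold)."""
    if not 0 < threshold <= width:
        raise ValueError("threshold must be between 1 and the IDA width")
    return width / threshold


def slices_per_node(width: int, nodes: int) -> int:
    """Slices that each Slicestor node stores and processes per object."""
    if nodes <= 0 or width % nodes != 0:
        raise ValueError("IDA width must divide evenly across the nodes")
    return width // nodes


NODES = 3  # smallest single-site Concentrated Dispersal configuration
for width, threshold in [(12, 8), (24, 20)]:  # hypothetical narrow and wide IDAs
    print(
        f"IDA {width}/{threshold}: "
        f"expansion factor {expansion_factor(width, threshold):.2f}, "
        f"{slices_per_node(width, NODES)} slices per node"
    )
```

Under these assumptions, the wider IDA consumes less raw capacity per stored byte, but each of the three nodes handles twice as many slices per object.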

After a customer deploys such a low-end system, they can expand it at will to fit their current needs by adding a device set to a new storage pool if they need to change their IDA, or adding a device set to an existing storage pool.

The advantages and drawbacks of both expansion options are listed in Table 3-7.

Table 3-7 IBM COS System storage expansion options

Expansion option	Pros	Cons
Add a new storage pool over a new device set	Any legal set size and IDA can be added	•Vaults must be created on new storage pool to use added storage •No system level distribution of writes across storage pools
Add a new device set within the same storage pool	•Existing Vaults can use added storage •System automatically rebalances storage across sets •System distributes new writes across sets to balance capacity percentage usage	Limited set size and IDAs that can be added

VersaStack for IBM Cloud Object Storage: Cisco Validated Design

Cisco servers can now host IBM Cloud Object Storage code. The main purpose of VersaStack for IBM Cloud Object Storage Cisco Validated Design (CVD) is to show which Cisco servers are validated to run IBM Cloud Object Storage code, and how they get integrated into a VersaStack infrastructure.

For more information about VersaStack Cisco Validated Design (CVD), see 4.7, “VersaStack for Hybrid Cloud”.

Cisco UCS C220 M4 servers are validated to host IBM Cloud Object Storage Manager or Accesser code. Cisco UCS S3260 servers are validated to host IBM Cloud Object Storage Slicestor code.

A sample physical layout of an integration of the Cisco servers that host IBM Cloud Object Storage code with VersaStack is shown in Figure 3-40.

Figure 3-40 IBM COS integration with VersaStack

IBM Spectrum Scale Object support

IBM Spectrum Scale supports file and object solutions. For more information about IBM Spectrum Scale object support, see 3.12.1, “IBM Spectrum Scale” on page 59.

IBM Spectrum Scale includes the ability to provide a single namespace for all data, which means that applications can use the POSIX, NFS, and SMB file access protocols with an HDFS connector plug-in, and the Swift and S3 object protocols, all to a single data set. IBM Spectrum Scale Object Storage combines the benefits of IBM Spectrum Scale with the best pieces of OpenStack Swift, which is the most widely used open source object store today.

Core to the benefit of the use of IBM Spectrum Scale for Object services is the integration of file and object in a single system, which provides applications the ability to store data in one place, and administrators to support multiple protocol services from one storage system.

For storage policies that include enabled file access, the same data can be accessed in place from object and file interfaces so that data does not need to be moved between storage pillars for different kinds of processing. IBM Spectrum Scale Object OpenStack Swift is bundled and managed as part of the deliverable, which hides all of the complexities that otherwise are shown in the raw open source project (see Figure 3-41).

Figure 3-41 IBM Spectrum Scale Object Store architecture

As shown in Figure 3-41, all of the IBM Spectrum Scale protocol nodes are active and provide a front end for the entire object store. The Load Balancer, which distributes HTTP requests across the IBM Spectrum Scale protocol nodes, can be based on software or hardware.

The IBM Spectrum Scale protocol nodes run the IBM Spectrum Scale client and all the Swift services. Clients that use the Swift or S3 API (users or applications) first obtain a token from the Keystone authorization service. The token is included in all requests that are made to the Swift proxy service, which verifies the token by comparing it with cached tokens or by contacting the authorization service.
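As a rough illustration of the token step, the following sketch builds (but does not send) a Keystone v3 password-authentication request and the header that later carries the token to the Swift proxy service. It assumes the standard Keystone v3 `POST /v3/auth/tokens` call, which returns the token in the `X-Subject-Token` response header. The host name, user, password, and project are placeholders, and the helper names are hypothetical.

```python
import json


def build_token_request(keystone_url, user, password, project, domain="Default"):
    """Describe a Keystone v3 password-authentication request (not sent here)."""
    body = {
        "auth": {
            "identity": {
                "methods": ["password"],
                "password": {
                    "user": {
                        "name": user,
                        "domain": {"name": domain},
                        "password": password,
                    }
                },
            },
            # Scope the token to one project so Swift can map it to an account.
            "scope": {"project": {"name": project, "domain": {"name": domain}}},
        }
    }
    return {
        "method": "POST",
        "url": keystone_url.rstrip("/") + "/v3/auth/tokens",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }


def auth_headers(token):
    """Header that every later Swift request carries for token verification."""
    return {"X-Auth-Token": token}


# Placeholder endpoint and credentials, for illustration only.
request = build_token_request("http://keystone.example.com:5000", "alice", "secret", "demo")
print(request["method"], request["url"])
print(auth_headers("token-from-X-Subject-Token-header"))
```

A real client would send this request with an HTTP library, read the token from the response header, and reuse it until it expires.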

After applications are authenticated, they perform all object store operations, such as storing and retrieving objects and metadata, or listing account and container information through any of the proxy service daemons (possibly by using an HTTP Load Balancer, as shown in Figure 3-41).

For object requests, the proxy service then contacts the object service for the object, which in turn performs file system-related operations to the IBM Spectrum Scale client. Account and container information requests are handled in a similar manner.
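The object requests that an authenticated client sends to the proxy service can be sketched as follows. The example assumes the standard Swift API path layout `/v1/{account}/{container}/{object}` and the common `AUTH_` account prefix. The load balancer address, account, container, and object names are placeholders, and the function only builds a request description rather than sending it.

```python
from urllib.parse import quote


def swift_object_request(method, proxy_url, account, container, obj, token, data=None):
    """Describe a Swift object request addressed to a proxy service or load balancer."""
    # Account and container names are single path segments; object names may contain '/'.
    path = "/".join(quote(part, safe="") for part in (account, container))
    path += "/" + quote(obj)
    return {
        "method": method,
        "url": f"{proxy_url.rstrip('/')}/v1/{path}",
        "headers": {"X-Auth-Token": token},  # verified by the proxy service
        "body": data,
    }


# Placeholder values, for illustration only.
put = swift_object_request(
    "PUT", "http://lb.example.com:8080", "AUTH_demo", "photos",
    "2024/cat pic.jpg", "example-token", data=b"...image bytes...",
)
get = swift_object_request(
    "GET", "http://lb.example.com:8080", "AUTH_demo", "photos",
    "2024/cat pic.jpg", "example-token",
)
print(put["method"], put["url"])
print(get["method"], get["url"])
```

Because every protocol node is active, the same request can be sent to any proxy service daemon, either directly or through the load balancer.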

IBM Spectrum Scale Object Storage can provide a cost-efficient and space-efficient storage solution that is built on commodity parts and offers high throughput. The density and performance vary with each storage solution.

IBM Spectrum Scale Object Storage features the following benefits:

•Use of IBM Spectrum Scale data protection: Delegating the responsibility of protecting data to IBM Spectrum Scale and not using the Swift three-way replication or the relatively slow erasure coding increases the efficiency and the performance of the system in the following ways:

– With IBM Spectrum Scale RAID as part of the IBM Elastic Storage Server (see “IBM Spectrum NAS” on page 69), storage efficiency rises from 33% to up to 80%.

– Disk failure recovery does not cause data to flow over the storage network. Recovery is handled transparently and with minimal effect on applications. For more information, see “GPFS-based implementation of a hyper-converged system for a software defined infrastructure” by Azagury, et al.

•Applications now realize the full bandwidth of the storage network because IBM Spectrum Scale writes only a single copy of each object to the storage servers. Consider the following points:

– Maximum object size is increased up to a configurable value of 5 TB. With IBM Spectrum Scale data striping, large objects do not cause capacity imbalances or server hotspots. They also do not inefficiently use available network bandwidth.

– No separate replication network is required to replicate data within a single cluster.

– Capacity growth is seamless because the storage capacity can be increased without requiring rebalancing of the objects across the cluster.

– IBM Spectrum Scale protocol nodes can be added or removed without requiring recovery operations or movement of data between nodes and disks.

•Integration of file and object in a single system: Applications can store various application data in a single file system. For storage policies that include enabled file access, the same data can be accessed in place from object and file interfaces so that data does not need to be moved between storage pillars for different kinds of processing.

•Energy saving: High per-server storage density and efficient use of network resources reduce energy costs.

•Enterprise storage management features: IBM Spectrum Scale Object Storage uses all IBM Spectrum Scale features, such as global namespace, compression, encryption, backup, disaster recovery, ILM (auto-tiering), tape integration, transparent cloud tiering (TCT), and remote caching.
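The storage efficiency figures in the data protection benefit (33% rising to up to 80%) can be reproduced with simple arithmetic. The following sketch assumes that the 80% figure corresponds to an erasure-code layout of eight data strips plus two parity strips; the chapter does not state which layout is meant, so treat that choice as an illustrative assumption.

```python
def replication_efficiency(copies: int) -> float:
    """Usable fraction of raw capacity when every object is stored `copies` times."""
    if copies <= 0:
        raise ValueError("copies must be positive")
    return 1 / copies


def erasure_code_efficiency(data_strips: int, parity_strips: int) -> float:
    """Usable fraction of raw capacity for a data + parity erasure-code layout."""
    if data_strips <= 0 or parity_strips < 0:
        raise ValueError("invalid strip counts")
    return data_strips / (data_strips + parity_strips)


swift_default = replication_efficiency(3)      # Swift three-way replication
assumed_raid = erasure_code_efficiency(8, 2)   # assumed 8 data + 2 parity layout
print(f"three-way replication: {swift_default:.0%}")
print(f"8+2 erasure code:      {assumed_raid:.0%}")
```

With these inputs, three copies leave about one third of the raw capacity usable, whereas the assumed 8+2 layout leaves 80% usable.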

For more information, see the following resources:

•http://www.redbooks.ibm.com/abstracts/redp5113.html

•https://ibm.biz/BdZgCM

3.14 IBM storage support of OpenStack components

OpenStack technology is a key enabler of cloud infrastructure as a service (IaaS) capability. OpenStack architecture provides an overall cloud preferred practices workflow solution that is readily installable, and supported by a large ecosystem of worldwide developers in the OpenStack open source community.

Within the overall cloud workflow, specific OpenStack components support storage. The following OpenStack components support storage:

•Cinder (block storage), including the IBM Cinder storage drivers

•Swift (object storage)

•Manila (file storage)

OpenStack architecture is one implementation of a preferred practices cloud workflow. Regardless of the cloud operating system environment that is used, the following key summary points apply:

•Cloud operating systems provide the necessary technology workflow to provide truly elastic, pay per use cloud services

•OpenStack cloud software provides a vibrant open source cloud operating system that is growing quickly

•OpenStack storage components

3.14.1 Cinder

Cinder is an OpenStack project that provides block storage as a service and gives users an API to interact with different storage backend solutions. The Cinder component provides support, provisioning, and control of block storage. Cinder services expect every driver to follow a common set of standards so that they can interact with it properly.

In the Icehouse release, Cinder block storage added backend migrations with tiered storage environments, which allows for performance management in heterogeneous environments. Mandatory testing for external drivers now ensures a consistent user experience across storage platforms, and fully distributed services improve scalability.

3.14.2 Swift

The OpenStack Object Store project, which is known as OpenStack Swift, offers cloud storage software so that you can store and retrieve lots of data with a simple API. It is built for scale and optimized for durability, availability, and concurrency across the entire data set. Swift is ideal for storing unstructured data that can grow without bound.

Note: Do not confuse OpenStack Swift with Apple Swift, a programming language. In this paper, the term “Swift” always refers to OpenStack Swift.

3.14.3 Manila

The OpenStack Manila (File) component provides file storage, which allows coordinated access to shared or distributed file systems. Although the primary consumption of shares would be OpenStack compute instances, the service is also intended to be accessed independently, based on the modular design established by OpenStack services.

Manila features the following capabilities:

•Shared file system services for VMs

•Vendor-neutral API for NFS/CIFS and other network file systems

•IBM Spectrum Scale Manila (in Kilo):

– Extends Spectrum Scale data plane into VM

– Supports both kNFS and Ganesha 2.0

– Create/list/delete Shares and Snapshots

– Allow/deny access to a share based on IP address

– Multi-tenancy
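The IP-based allow/deny capability in the preceding list can be pictured with a small model. The following sketch is not Manila code and does not use the Manila API; it is a hypothetical in-memory class that uses Python's standard `ipaddress` module to show how access rules keyed by IP address or CIDR range might be evaluated. Here, "deny" is modeled as revoking a previously granted rule.

```python
import ipaddress


class ShareAccessRules:
    """Toy model of IP-based access rules for one share (illustrative only)."""

    def __init__(self):
        self._allowed = set()

    def allow(self, cidr: str) -> None:
        """Grant access to a single address or a CIDR range."""
        self._allowed.add(ipaddress.ip_network(cidr))

    def deny(self, cidr: str) -> None:
        """Revoke a rule that was granted earlier; unknown rules are ignored."""
        self._allowed.discard(ipaddress.ip_network(cidr))

    def is_allowed(self, client_ip: str) -> bool:
        """Check whether a client address matches any granted rule."""
        addr = ipaddress.ip_address(client_ip)
        return any(addr in net for net in self._allowed)


rules = ShareAccessRules()
rules.allow("10.0.0.0/24")    # a tenant subnet (placeholder)
rules.allow("192.168.1.7")    # a single VM (placeholder)
print(rules.is_allowed("10.0.0.42"))
rules.deny("10.0.0.0/24")
print(rules.is_allowed("10.0.0.42"))
```

In this model, a client in the tenant subnet can reach the share until the subnet rule is revoked, while the single-address rule stays in effect.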

For more information about OpenStack technology, see the following website:

http://www.openstack.org

3.14.4 IBM SDS products that include interfaces to OpenStack components

The following IBM SDS products include interfaces to OpenStack components:

•The IBM Storage Driver for OpenStack environments: The IBM Storage Driver for OpenStack environments is a software component that integrates with the OpenStack cloud environment. It enables the usage of storage resources that are provided by the following IBM storage systems:

– DS8880: This storage system can offer a range of capabilities that enable more effective storage automation deployments in private or public clouds. Enabling the OpenStack Cinder storage component with DS8880 allows for storage to be made available whenever it is needed without the traditional associated cost of highly skilled administrators and infrastructure. For more information, see Using IBM DS8000 in an OpenStack Environment, REDP-5220.

– IBM Spectrum Accelerate: Remote cloud users can issue requests for storage resources from the OpenStack cloud. These requests are transparently handled by the IBM Storage Driver. The IBM Storage Driver communicates with the IBM Spectrum Accelerate Storage System and controls the storage volumes on it. With the release of Version 11.5 software, IBM Spectrum Accelerate introduced support for multi-tenancy. Multi-tenancy enables cloud providers to divide and isolate the IBM Spectrum Accelerate resources into logical domains, which can then be used by tenants without any knowledge of the rest of the system resources. For more information, see Using XIV in OpenStack Environments, REDP-4971.

– IBM Storwize family/SAN Volume Controller: The volume management driver for the Storwize family and SAN Volume Controller provides OpenStack Compute instances with access to IBM Storwize family or SAN Volume Controller storage systems.

Storwize and SAN Volume Controller support fully transparent live storage migration in OpenStack Havana:

• No interaction with the host is required: All advanced Storwize features are supported and exposed to the Cinder system.

• Real-time Compression with EasyTier supports iSCSI + FC attachment.

– IBM FlashSystem (Kilo release): The volume driver for FlashSystem provides OpenStack Block Storage hosts with access to IBM FlashSystems.

•IBM Spectrum Scale: As of OpenStack Juno Release, Spectrum Scale combines the benefits of Spectrum Scale with the most widely used open source object store today, OpenStack Swift. Spectrum Scale provides enterprise ILM features. OpenStack Swift provides a robust object layer with an active community that is continuously adding innovative new features. To ensure compatibility with the Swift packages over time, no code changes are required to either Spectrum Scale or Swift to build the solution. For more information, see A Deployment Guide for IBM Spectrum Scale Unified File and Object Storage, REDP-5113.

•IBM Spectrum Protect: IBM data protection and data recovery solutions provide protection for virtual, physical, cloud, and software-defined infrastructures and core applications and remote facilities. These solutions fit nearly any size organization and recovery objective. They deliver the functions of IBM Spectrum Protect.

IBM Spectrum Protect enables software-defined storage environments by delivering automated data protection services at the control plane for file, block, and object backup.

IBM Spectrum Protect enables cloud data protection with OpenStack and VMware integration, cloud portal, and cloud deployment options.

For more information, see the following resources:

Protecting OpenStack with Tivoli Storage Manager for Virtual Environments:

https://ibm.biz/BdXZmY

Note: For more information about the IBM storage drivers and functions that are supported in the various OpenStack releases, see the following wiki:

https://wiki.openstack.org/wiki/CinderSupportMatrix

¹ Compression data based on IBM measurements. Compression rates vary by data type and content.
