Much of the data that most backup solutions retain on backup media is redundant. The typical backup process for most organizations consists of a series of daily incremental backups and weekly full backups.
Daily backups are usually retained for a few weeks and weekly full backups are retained for several months to several years. Because of this process, multiple copies of identical or slowly changing data are retained on backup media, leading to a high level of data redundancy.
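To make the redundancy concrete, here is a rough back-of-the-envelope sketch in Python. The function name, the schedule of six incrementals per week, and the sample figures (a 1,000 GB dataset, 10 GB of daily change, 26 weeks of full-backup retention, 4 weeks of incremental retention) are illustrative assumptions, not measurements from any real environment.

```python
def stored_backup_gb(dataset_gb, daily_change_gb, full_weeks, incr_weeks):
    """Estimate the data held on backup media at steady state.

    Assumes one full backup per week plus six daily incrementals,
    and that the dataset size stays roughly constant.
    """
    fulls = dataset_gb * full_weeks                  # every retained full is a complete copy
    incrementals = daily_change_gb * 6 * incr_weeks  # each incremental holds one day's changes
    return fulls + incrementals


# Hypothetical figures, chosen only for illustration.
total = stored_backup_gb(dataset_gb=1000, daily_change_gb=10,
                         full_weeks=26, incr_weeks=4)
print(total)          # 26240
print(total / 1000)   # 26.24 times the size of the protected dataset
```

Under these assumptions almost all of the retained data comes from the weekly fulls, which repeat the same slowly changing dataset week after week.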
A large number of operating system files, application files, and data files are common across multiple systems in an enterprise. Identical files, such as Word documents, PowerPoint presentations, and Excel spreadsheets, are stored by many users across an environment. Backups of these systems will contain a large number of identical files.
Additionally, many users keep multiple versions of files that they are currently working on. Many of these files differ only slightly from other versions, but are seen by backup applications as new data that must be protected.
Backing up redundant data increases the amount of backup storage needed and consumes network bandwidth. Organizations are running out of backup window time and struggling to meet recovery objectives because they must manage many backup versions and large numbers of backup tapes.
Avamar differs from traditional backup and restore solutions by identifying and storing only unique, sub-file data objects. Redundant data is identified at the source, drastically reducing the amount of backup data that travels across the network to be stored and managed by the backup host. When storing data objects, Avamar takes maximum advantage of inherent hard-disk characteristics. Avamar also creates and stores “trees” that link all data objects from a single backup. These “trees” are used to re-create files for restore.
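The idea of storing each unique object once and keeping a "tree" per backup can be sketched with a small content-addressed store. This is a generic Python illustration, not Avamar's actual on-disk format: the class name `ObjectStore`, the SHA-256 keys, the JSON-encoded tree, and the fixed 4-byte chunks (used only to keep the example short) are all assumptions.

```python
import hashlib
import json


class ObjectStore:
    """Toy content-addressed store: each object is keyed by its SHA-256 hash."""

    def __init__(self):
        self.objects = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self.objects.setdefault(key, data)   # an object already present is not stored again
        return key

    def backup(self, files: dict, chunk_size: int = 4) -> str:
        """Store every file as chunks and return the hash of the backup's tree."""
        tree = {}
        for name, content in files.items():
            pieces = [content[i:i + chunk_size]
                      for i in range(0, len(content), chunk_size)]
            tree[name] = [self.put(piece) for piece in pieces]
        # The tree is stored as one more object, so a single hash identifies the backup.
        return self.put(json.dumps(tree, sort_keys=True).encode())

    def restore(self, root: str) -> dict:
        """Follow the tree to reassemble every file of one backup."""
        tree = json.loads(self.objects[root])
        return {name: b"".join(self.objects[h] for h in hashes)
                for name, hashes in tree.items()}


store = ObjectStore()
monday = {"a.txt": b"AAAABBBB", "b.txt": b"AAAACCCC"}
tuesday = {"a.txt": b"AAAABBBB", "b.txt": b"AAAADDDD"}
root_mon = store.backup(monday)
root_tue = store.backup(tuesday)
print(len(store.objects))                 # 6: four unique chunks plus two trees
print(store.restore(root_mon) == monday)  # True
```

The second backup adds only one new chunk and one new tree, yet either backup can still be restored in full from its root hash.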
Backup Storage Journey Diary
Thursday 15 May 2014
Thursday 24 April 2014
Avamar RAIN
Avamar supports two basic types of standard server configuration:
- Non-RAIN
- RAIN (Redundant Array of Independent Nodes)
Non-RAIN configurations consist of a single stand-alone node, which provides both utility and data functions. Non-RAIN configurations require replication. (Previous versions of Avamar supported 1x2 servers, consisting of one utility node and two data nodes.)
RAIN configurations include one utility node, three or more data storage nodes, and a spare data node. Currently, the largest standard configuration consists of 16 data nodes, 1 utility node, and 1 spare data node. The minimum RAIN configuration is a 1x3 server.
In a multi-node system, the nodes operate together as one server. The hostname and the IP address of the utility node are the identity of the Avamar server for access and client/server communication. Avamar load balances data across all available nodes in a server. With node architecture, Avamar can be easily scaled by adding more nodes.
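The text does not say how Avamar decides which node holds which piece of data. One generic way to spread objects evenly is hash-based placement, sketched below in Python. The function `node_for`, the object names, and the node count of 3 are hypothetical and only illustrate the general idea of balancing data across nodes.

```python
import hashlib
from collections import Counter


def node_for(object_key: bytes, node_count: int) -> int:
    """Map an object to a node index using the first 8 bytes of its SHA-256 hash."""
    digest = hashlib.sha256(object_key).digest()
    return int.from_bytes(digest[:8], "big") % node_count


# Place 9,000 hypothetical objects on 3 data nodes and count the objects per node.
counts = Counter(node_for(f"object-{i}".encode(), 3) for i in range(9000))
for node in sorted(counts):
    print(node, counts[node])   # each node ends up with roughly a third of the objects
```

Plain modulo placement like this moves most objects when the node count changes, so a system that scales by adding nodes would need a more careful scheme; that detail is outside what the text describes.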
Monday 21 April 2014
Avamar Replication
Avamar replication is a feature that transfers data from a “source” Avamar server to a “destination” Avamar server. All data in the destination server can be directly restored back to primary storage without having to be staged through the source Avamar server.
You can use either Avamar Administrator or Avamar Enterprise Manager to manage your replication settings.
Efficient Data Transfers. Replication is accomplished by way of highly efficient, asynchronous Internet Protocol (IP) data transfers, which can be scheduled during off-peak hours to make optimum use of network bandwidth.
Additionally, like other members of the Avamar product family, replication uses sophisticated data de-duplication technology that finds and eliminates redundant sequences of data before it is sent to the destination server, thereby reducing network traffic and promoting efficient use of hard disk storage.
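The effect of de-duplicating before transfer can be sketched as follows. This is a minimal Python illustration under the assumption that both servers key their objects by content hash; the helper names `put` and `replicate`, and the use of plain dictionaries as "servers", are hypothetical and do not reflect Avamar's replication protocol.

```python
import hashlib


def put(server: dict, data: bytes) -> str:
    """Store an object on a toy 'server' keyed by its SHA-256 hash."""
    key = hashlib.sha256(data).hexdigest()
    server[key] = data
    return key


def replicate(source: dict, destination: dict) -> int:
    """Copy only the objects the destination lacks; return the bytes transferred."""
    missing = source.keys() - destination.keys()
    for key in missing:
        destination[key] = source[key]
    return sum(len(source[key]) for key in missing)


source, destination = {}, {}
for block in (b"alpha", b"bravo", b"charlie"):
    put(source, block)

print(replicate(source, destination))   # 17: all three objects are new
print(replicate(source, destination))   # 0: nothing has changed
put(source, b"delta")
print(replicate(source, destination))   # 5: only the new object travels
```

Comparing keys first means that repeated replication runs cost network traffic only for data the destination has never seen.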
Remote Branch Disaster Recovery. Replication enables the efficient replication of data stored in a single-node server to a multi-node server. Using replication, a distributed enterprise can centrally protect and manage multiple remote branch offices that are using individual single-node servers for local backup and restore. The centralized multi-node server can then be used for disaster recovery, in the event of catastrophic data loss at any remote branch office.
Enterprise Data Center Disaster Recovery. Replication can also be used to replicate data stored in a multi-node server to any other multi-node server in your enterprise. In this manner, multi-node servers can provide peer-to-peer disaster recovery for each other.
Tuesday 8 April 2014
Avamar Data Deduplication
Data Deduplication
Avamar differs from traditional backup and restore solutions by identifying and storing only unique, sub-file data objects. Redundant data is identified at the source, drastically reducing the amount of backup data that travels across the network to be stored and managed by the backup host. When storing data objects, Avamar takes maximum advantage of inherent hard-disk characteristics. Avamar also creates and stores “trees” that link all data objects from a single backup. These “trees” are used to re-create files for restore.
Data deduplication, or single instance storage, reduces storage needs by identifying duplicate or redundant data. Only unique data is then stored on the storage media. The level at which data deduplication is employed determines the granularity of deduplication. Three levels of data deduplication are:
File level deduplication helps organizations reduce storage needs for file servers by identifying duplicate files within hard disk volumes and providing an efficient mechanism for consolidating them. The most common implementation of single instance storage is at the file level. With this method, a single change in a file results in the entire file being identified as unique. For example, if a backup environment holds 5 versions of a file, all 5 files are stored in their entirety.
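A minimal Python sketch of file level deduplication, assuming files are compared by a SHA-256 hash of their whole content. The function name `file_level_store` and the sample file names are hypothetical.

```python
import hashlib


def file_level_store(files: dict) -> dict:
    """Keep one copy per distinct file content, keyed by the hash of the whole file."""
    store = {}
    for content in files.values():
        store.setdefault(hashlib.sha256(content).hexdigest(), content)
    return store


base = b"quarterly report " * 1000
# Five users holding byte-for-byte identical copies of one document.
copies = {f"user{i}/report.doc": base for i in range(5)}
# Five versions of the document that differ only in the final byte.
versions = {f"report_v{i}.doc": base + str(i).encode() for i in range(5)}

print(len(file_level_store(copies)))     # 1
print(len(file_level_store(versions)))   # 5
```

Identical copies collapse to a single stored file, but the five nearly identical versions are each stored in full.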
Fixed block deduplication, also called fixed length deduplication, is commonly employed in snapshot and replication technologies. This method breaks a file into fixed length sub-objects. However, even with small changes to the data, all fixed length segments in a dataset can change despite the fact that very little of the dataset has actually changed.
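The boundary-shift problem can be shown with a short Python sketch. The 64-byte block size, the function names, and the generated sample data are assumptions made for illustration.

```python
import hashlib


def fixed_blocks(data: bytes, size: int = 64) -> list:
    """Split data into consecutive blocks of a fixed length."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def count_new_blocks(old: bytes, new: bytes, size: int = 64) -> int:
    """Count blocks of `new` whose content does not appear among the blocks of `old`."""
    known = {hashlib.sha256(b).digest() for b in fixed_blocks(old, size)}
    return sum(1 for b in fixed_blocks(new, size)
               if hashlib.sha256(b).digest() not in known)


# 1,024 bytes of deterministic pseudo-random sample data (16 blocks of 64 bytes).
old = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(32))

overwritten = old[:100] + bytes([old[100] ^ 0xFF]) + old[101:]   # change one byte in place
inserted = b"!" + old                                            # insert one byte at the front

print(count_new_blocks(old, overwritten))  # 1
print(count_new_blocks(old, inserted))     # 17
```

Overwriting a byte in place touches a single block, but inserting one byte shifts every later boundary, so every block of the new version looks unique.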
Variable block level deduplication uses an intelligent method of determining segment size that looks at the data itself to determine repeatable boundary points. Variable block level deduplication yields a greater granularity in identifying duplicate data, eliminating the inefficiencies of file level and fixed block level deduplication. With variable block level deduplication, a change in a file results in only the variable-sized block containing the change being identified as unique. Consequently, more data is identified as common data, and in the case of backup, there is less data to store as only the unique data is backed up. This is the method used by Avamar.
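The text does not describe Avamar's actual boundary algorithm. The Python sketch below shows one generic form of content-defined chunking: a boundary is declared wherever a checksum of the preceding 16 bytes is divisible by 64. The constants `WINDOW` and `DIVISOR`, the use of `zlib.crc32`, and the function names are illustrative assumptions. For clarity the checksum is recomputed at every position; production implementations typically use a true rolling hash and enforce minimum and maximum chunk sizes.

```python
import hashlib
import zlib

WINDOW = 16    # bytes examined when deciding whether to cut (assumed value)
DIVISOR = 64   # on average one boundary per 64 positions (assumed value)


def chunk(data: bytes) -> list:
    """Split data at positions chosen by the content of the preceding window."""
    chunks, start = [], 0
    for i in range(WINDOW, len(data) + 1):
        if zlib.crc32(data[i - WINDOW:i]) % DIVISOR == 0:
            chunks.append(data[start:i])
            start = i
    if start < len(data):
        chunks.append(data[start:])   # trailing bytes after the last boundary
    return chunks


def new_chunks(old: bytes, new: bytes) -> list:
    """Return the chunks of `new` whose content is not among the chunks of `old`."""
    known = {hashlib.sha256(c).digest() for c in chunk(old)}
    return [c for c in chunk(new) if hashlib.sha256(c).digest() not in known]


# 4,096 bytes of deterministic pseudo-random sample data.
old = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(128))
inserted = b"!" + old   # insert one byte at the front

print(len(chunk(old)))                  # number of chunks in the original
print(len(new_chunks(old, inserted)))   # at most 2, however many chunks the file has
```

Because each boundary depends only on nearby content, the boundaries after the insertion move along with the data, and every chunk past the first shifted boundary is byte-for-byte identical to one already stored.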
Thursday 3 April 2014
Avamar Components
Avamar Server
- Client backups are stored on this server, which provides the essential processes and services required for client access and remote system administration. Note that the Avamar Administrator Server (mcs) and Avamar Data Server (gsan) processes run on the Avamar server.
Avamar Client Software
- Once installed, Avamar client software runs on each computer or network server that is being backed up. Client software is available for various computing platforms. Each client consists of a client agent and one or more plug-ins.
Avamar Administrator Console
- This console software is installed on any supported Windows or other client computer that can access the Avamar system over the network. It is a user management console application used to remotely administer the Avamar system.
Wednesday 2 April 2014
NetApp Qtrees
A NetApp qtree is a directory with special properties. The "q" originally stood for quota, so a qtree is also known as a "quota tree": it can be used to set a quota on a particular directory.
Nowadays, NetApp also has FlexVols, which can be quota-limited as well. In addition to a quota, a qtree has a few other properties.
A qtree enables you to apply attributes such as oplocks and security style to a subset of files and directories rather than to an entire volume.
Single files can be moved across a qtree without moving the data blocks. Directories cannot be moved across a qtree. However, since most clients use recursion to move the children of directories, the actual observed behavior is that directories are copied and files are then moved. Security style and oplocks settings can differ from those of the rest of the volume.
The following describes how each replication product relates to qtrees:
SnapMirror
- Whole volumes OR qtrees can be replicated
SnapVault
- Only qtrees can be replicated
OSSV (Open Systems SnapVault)
- Only directories can be replicated to qtrees
Find Avamar's Serial Number via SSH
If you are not at the data center and need the hardware serial number of an Avamar node, the following command retrieves it.
1. Log in to the Avamar node as root.
2. Execute the following command:
root@avamarnode:~/#: /usr/bin/ipmitool fru print 0 | grep "Product Asset Tag" | sed "s/^.*: *\(.*\)/\1/"