Group
Extension

Net-Amazon-Glacier/lib/Net/Amazon/Glacier.pm

package Net::Amazon::Glacier;

use 5.10.0;
use strict;
use warnings;

use Net::Amazon::Signature::V4;
use Net::Amazon::TreeHash;

use HTTP::Request;
use LWP::UserAgent;
use JSON 2.61;
use POSIX;
use Digest::SHA;
use File::Slurp 9999.19;
use Carp;

=head1 NAME

Net::Amazon::Glacier - An implementation of the full Amazon Glacier RESTful 2012-06-01 API.

=head1 VERSION

Version 0.15

=cut

our $VERSION = '0.15';

=head1 SYNOPSIS

Amazon Glacier is Amazon's long-term storage service and can be used to store
cold archives with a novel pricing scheme.
This module implements the full Amazon Glacier RESTful API, version 2012-06-01
(current at writing). It can be used to manage Glacier vaults, upload archives
as single part or multipart up to 40.000Gb in a single element and download them
in ranges or single parts.

Perhaps a little code snippet:

	use Net::Amazon::Glacier;

	my $glacier = Net::Amazon::Glacier->new(
		'eu-west-1',
		'AKIMYACCOUNTID',
		'MYSECRET',
	);

	my $vault = 'a_vault';

	my @vaults = $glacier->list_vaults();

	if ( $glacier->create_vault( $vault ) ) {

		if ( my $archive_id = $glacier->upload_archive( './archive.7z' ) ) {

			my $job_id = $glacier->inititate_job( $vault, $archive_id );

			# Jobs generally take about 4 hours to complete
			my $job_description = $glacier->describe_job( $vault, $job_id );

			# For a better way to wait for completion, see
			# http://docs.aws.amazon.com/amazonglacier/latest/dev/api-initiate-job-post.html
			while ( $job_description->{'StatusCode'} ne 'Succeeded' ) {
				sleep 15 * 60 * 60;
				$job_description = $glacier->describe_job( $vault, $job_id );
			}

			my $archive_bytes = $glacier->get_job_output( $vault, $job_id );

			# Jobs live as completed jobs for "a period", according to
			# http://docs.aws.amazon.com/amazonglacier/latest/dev/api-jobs-get.html
			my @jobs = $glacier->list_jobs( $vault );

			# As of 2013-02-09 jobs are blindly created even if a job for the same archive_id and Range exists.
			# Keep $archive_ids, reuse the expensive job resource, and remember 4 hours.
			foreach my $job ( @jobs ) {
				next unless $job->{ArchiveId} eq $archive_id;
				my $archive_bytes = $glacier->get_job_output( $vault, $job_id );
			}

		}

	}

The functions are intended to closely reflect Amazon's Glacier API. Please see
Amazon's API reference for documentation of the functions:
L<http://docs.amazonwebservices.com/amazonglacier/latest/dev/amazon-glacier-api.html>.

=head1 CONSTRUCTOR

=head2 new( $region, $access_key_id, $secret )

=cut

sub new {
	my ( $class, $region, $access_key_id, $secret ) = @_;

	croak "no region specified" unless $region;
	croak "no access key specified" unless $access_key_id;
	croak "no secret specified" unless $secret;

	my $self = {
		region => $region,
		# be well behaved and tell who we are
		ua     => LWP::UserAgent->new( agent=> __PACKAGE__ . '/' . $VERSION ),
		sig    => Net::Amazon::Signature::V4->new( $access_key_id, $secret, $region, 'glacier' ),
	};
	return bless $self, $class;
}

=head1 VAULT OPERATORS

=head2 create_vault( $vault_name )

Creates a vault with the specified name. Returns true on success, croaks on failure.
L<Create Vault (PUT vault)|http://docs.aws.amazon.com/amazonglacier/latest/dev/api-vault-put.html>
=cut

sub create_vault {
	my ( $self, $vault_name ) = @_;

	croak "no vault name given" unless $vault_name;

	my $res = $self->_send_receive( PUT => "/-/vaults/$vault_name" );

	# updated error severity
	croak 'describe_vault failed with error ' . $res->status_line
		unless $res->is_success;

	return 1;

}

=head2 delete_vault( $vault_name )

Deletes the specified vault. Returns true on success, croaks on failure.

L<Delete Vault (DELETE vault)|http://docs.aws.amazon.com/amazonglacier/latest/dev/api-vault-delete.html>
=cut

sub delete_vault {
	my ( $self, $vault_name ) = @_;

	croak "no vault name given" unless $vault_name;

	my $res = $self->_send_receive( DELETE => "/-/vaults/$vault_name" );
	# updated error severity
	croak 'describe_vault failed with error ' . $res->status_line
		unless $res->is_success;

	return 1;
}

=head2 describe_vault( $vault_name )

Fetches information about the specified vault.

Returns a hash reference with
the keys described by L<http://docs.amazonwebservices.com/amazonglacier/latest/dev/api-vault-get.html>.

Croaks on failure.

L<Describe Vault (GET vault)|http://docs.aws.amazon.com/amazonglacier/latest/dev/api-vault-get.html>

=cut

sub describe_vault {
	my ( $self, $vault_name ) = @_;

	croak "no vault name given" unless $vault_name;

	my $res = $self->_send_receive( GET => "/-/vaults/$vault_name" );
	# updated error severity
	croak 'describe_vault failed with error ' . $res->status_line unless $res->is_success;

	return $self->_decode_and_handle_response( $res );
}

=head2 list_vaults

Lists the vaults. Returns an array with all vaults.
L<Amazon Glacier List Vaults (GET vaults)|http://docs.aws.amazon.com/amazonglacier/latest/dev/api-vaults-get.html>.

A call to list_vaults can result in many calls to the Amazon API at a rate
of 1 per 1,000 vaults in existence.
Calls to List Vaults in the API are L<free|http://aws.amazon.com/glacier/pricing/#storagePricing>.

Croaks on failure.

=cut

sub list_vaults {
	my ( $self ) = @_;
	my @vaults;

	my $marker;
	do {
		#1000 is the default limit, send a marker if needed
		my $res = $self->_send_receive( GET => "/-/vaults?limit=1000" . ($marker?'&'.$marker:'') );
		# updated error severity
		croak 'list_vaults failed with error ' . $res->status_line unless $res->is_success;
		my $decoded = $self->_decode_and_handle_response( $res );

		push @vaults, @{$decoded->{VaultList}};
		$marker = $decoded->{Marker};
	} while ( $marker );

	return ( \@vaults );
}

=head2 set_vault_notifications( $vault_name, $sns_topic, $events )

Sets vault notifications for a given vault.

An SNS Topic to send notifications to must be provided. The SNS Topic must
grant permission to the vault to be allowed to publish notifications to the topic.

An array ref to a list of events must be provided. Valid events are
ArchiveRetrievalCompleted and InventoryRetrievalCompleted

Return true on success, croaks on failure.

L<Set Vault Notification Configuration (PUT notification-configuration)|http://docs.aws.amazon.com/amazonglacier/latest/dev/api-vault-notifications-put.html>.

=cut

sub set_vault_notifications {
	my ( $self, $vault_name, $sns_topic, $events ) = @_;

	croak "no vault name given" unless $vault_name;
	croak "no sns topic given" unless $sns_topic;
	croak "events should be an array ref" unless ref $events eq 'ARRAY';

	my $content_raw;

	$content_raw->{SNSTopic} = $sns_topic
		if defined($sns_topic);

	$content_raw->{Events} = $events
		if defined($events);

	my $res = $self->_send_receive(
		PUT => "/-/vaults/$vault_name/notification-configuration",
		[
		],
		encode_json($content_raw),
	);
	# updated error severity
	croak 'get_vault_notifications failed with error ' . $res->status_line
		unless $res->is_success;

	return 1;
}

=head2 get_vault_notifications( $vault_name )

Gets vault notifications status for a given vault.

Returns a hash with an 'SNSTopic' and and array of 'Events' on success, croaks
on failure.

L<Get Vault Notifications (GET notification-configuration)|http://docs.aws.amazon.com/amazonglacier/latest/dev/api-vault-notifications-get.html>.

=cut

sub get_vault_notifications {
	my ( $self, $vault_name, $sns_topic, $events ) = @_;

	croak "no vault name given" unless $vault_name;

	my $res = $self->_send_receive(
		PUT => "/-/vaults/$vault_name/notification-configuration",
	);
	# updated error severity
	croak 'get_vault_notifications failed with error ' . $res->status_line
		unless $res->is_success;

	return $self->_decode_and_handle_response( $res );
}

=head2 delete_vault_notifications( $vault_name )

Deletes vault notifications for a given vault.

Return true on success, croaks on failure.

L<Delete Vault Notifications (DELETE notification-configuration)|http://docs.aws.amazon.com/amazonglacier/latest/dev/api-vault-notifications-delete.html>.

=cut

sub delete_vault_notifications {
	my ( $self, $vault_name, $sns_topic, $events ) = @_;

	croak "no vault name given" unless $vault_name;

	my $res = $self->_send_receive(
		DELETE => "/-/vaults/$vault_name/notification-configuration",
	);
	# updated error severity
	croak 'delete_vault_notifications failed with error ' . $res->status_line
		unless $res->is_success;

	return 1;
}

=head1 ARCHIVE OPERATIONS

=head2 upload_archive( $vault_name, $archive_path, [ $description ] )

Uploads an archive to the specified vault. $archive_path is the local path to
any file smaller than 4GB. For larger files, see MULTIPART UPLOAD OPERATIONS.

An archive description of up to 1024 printable ASCII characters can be supplied.

Returns the Amazon-generated archive ID on success, or false on failure.

L<Upload Archive (POST archive)|http://docs.aws.amazon.com/amazonglacier/latest/dev/api-archive-post.html>

=cut

sub upload_archive {
	my ( $self, $vault_name, $archive_path, $description ) = @_;

	croak "no vault name given" unless $vault_name;
	croak "no archive path given" unless $archive_path;
	croak 'archive path is not a file' unless -f $archive_path;

	$description //= '';
	my $content = File::Slurp::read_file( $archive_path, err_mode => 'croak', binmode => ':raw', scalar_ref => 1 );

	return $self->_do_upload($vault_name, $content, $description);
}

=head2 upload_archive_from_ref( $vault_name, $ref, [ $description ] )

DEPRECATED at birth. Will be dropped in next version. A more robust
upload_archive will support file paths, refs, code refs, filehandles and more.

In the meanwhile...

Like upload_archive, but takes a reference to your data instead of the path to
a file. For data greater than 4GB, see multi-part upload. An archive
description of up to 1024 printable ASCII characters can be supplied. Returns
the Amazon-generated archive ID on success, or false on failure.

=cut

sub upload_archive_from_ref {
	my ( $self, $vault_name, $ref, $description ) = @_;

	croak "no vault name given" unless $vault_name;
	croak "data must be a reference" unless ref $ref;

	return $self->_do_upload($vault_name, $ref, $description);
}

sub _do_upload {
	my ( $self, $vault_name, $content_ref, $description ) = @_;

	_enforce_description_limits( \$description );

	my $th = Net::Amazon::TreeHash->new();
	$th->eat_data ( $content_ref );
	$th->calc_tree;

	my $res = $self->_send_receive(
		POST => "/-/vaults/$vault_name/archives",
		[
			'x-amz-archive-description' => $description,
			'x-amz-sha256-tree-hash' => $th->get_final_hash(),
			'x-amz-content-sha256' => Digest::SHA::sha256_hex( $$content_ref ),
		],
		$$content_ref
	);
	croak 'upload_archive failed with error ' . $res->status_line unless $res->is_success;

	my $rec_archive_id;
	unless ( $res->header('location') =~ m{^/[^/]+/vaults/[^/]+/archives/(.*)$} ) {
		# update severity of error. This method must return an archive id
		croak 'request succeeded, but reported archive location does not match regex: ' . $res->header('location');
	} else {
		$rec_archive_id = $1;
	}

	return $rec_archive_id;
}

=head2 delete_archive( $vault_name, $archive_id )

Issues a request to delete a file from Glacier. $archive_id is the ID you
received either when you uploaded the file originally or from an inventory.
L<Delete Archive (DELETE archive)|http://docs.aws.amazon.com/amazonglacier/latest/dev/api-archive-delete.html>

=cut

sub delete_archive {
	my ( $self, $vault_name, $archive_id ) = @_;

	croak "no vault name given" unless $vault_name;
	croak "no archive ID given" unless $archive_id;

	my $res = $self->_send_receive( DELETE => "/-/vaults/$vault_name/archives/$archive_id" );
	# updated error severity
	croak 'delete_archive failed with error ' . $res->status_line unless $res->is_success;

	return $res->is_success;
}

=head1 MULTIPART UPLOAD OPERATIONS

Amazon requires this method for files larger than 4GB, and recommends it for
files larger than 100MB.

L<Uploading Large Archives in Parts (Multipart Upload)|http://docs.aws.amazon.com/amazonglacier/latest/dev/uploading-archive-mpu.html>

=head2 SYNOPSIS

	use Net::Amazon::Glacier;

	my $glacier = Net::Amazon::Glacier->new(
		'eu-west-1',
		'AKIMYACCOUNTID',
		'MYSECRET',
	);

	my $part_size = $glacier->calculate_multipart_upload_partsize( -s $filename );

	my $upload_id = $glacier->multipart_upload_init( $vault, $part_size, $description );

	open ( A_FILE, '<', 'a_file.bin' );

	my $part_index = 0;
	my $read_bytes;
	my $parts_hash = []; # to store partial tree hash for complete method

	# Upload parts of A_FILE
	do {
		$read_bytes = read ( A_FILE, $part, $part_size );
		$parts_hash->[$part_index] = $glacier->multipart_upload_upload_part( $vault, $upload_id, $part_size, $part_index, \$part );
	} while ( ( $read_bytes == $part_size) && $parts_hash->[$part_index++] =~ /^[0-9a-f]{64}$/ );
	close ( A_FILE );

	my $archive_size = $part_size * ( $part_index ) + $read_bytes;

	# Capture archive id or error code
	my $archive_id = $glacier->multipart_upload_complete( $vault, $upload_id, $parts_hash, $archive_size  );

	# Check if we have a valid $archive_id
	unless ( $archive_id =~ /^[a-zA-Z0-9_\-]{10,}$/ ) {
		# abort partial failed upload
		# could also store upload_id and continue later
		$glacier->multipart_upload_abort( $vault, $upload_id );
	}

	# Other useful methods
	# Get an array ref with incomplete multipart uploads
	my $upload_list = $glacier->multipart_upload_list_uploads( $vault );

	# Get an array ref with uploaded parts for a multipart upload
	my $upload_parts = $glacier->multipart_upload_list_parts( $vault, $upload_id );

=head2 calculate_multipart_upload_partsize ( $archive_size )

Calculates the part size that would allow to uploading files of $archive_size

$archive_size is the maximum expected archive size

Returns the smallest possible part size to upload an archive of
size $archive_size, 0 when files cannot be uploaded in parts (i.e. >39Tb)

=cut

sub calculate_multipart_upload_partsize {
	my ( $self, $archive_size ) = @_;

	# get the size of a part if uploaded in the maximum possible parts in MiB
	my $part_size = ( $archive_size - 1) / 10000;

	# the smallest power of 2 that fits this amount of MiB
	my $part_size_MiB_rounded = 2**(int(log($part_size)/log(2))+1);

	# range check response for minimum and maximum API limits
	if ( $part_size_MiB_rounded < 1024 * 1024 ) {
		# part size must be at least 1MiB
		return 1024 * 1024;
	} elsif ( $part_size_MiB_rounded > 4 * 1024 * 1024 * 1024 ) {
		# updated error severity
		croak 'part size must not exceed 4GiB, this file size is not uploadable';
	} else {
		return $part_size_MiB_rounded;
	}
}

=head2 multipart_upload_init( $vault_name, $part_size, [ $description ] )

Initiates a multipart upload.
$part_size should be carefully calculated to avoid dead ends as documented in
the API. Use calculate_multipart_upload_partsize.

Returns a multipart upload id that should be used while adding parts to the
online archive that is being constructed.

Multipart upload ids are valid until multipart_upload_abort is called or 24
hours after last archive related activity is registered. After that period id
validity should not be expected.

L<Initiate Multipart Upload (POST multipart-uploads)|http://docs.aws.amazon.com/amazonglacier/latest/dev/api-multipart-initiate-upload.html>.

=cut

sub multipart_upload_init {
	my ( $self, $vault_name, $part_size, $description) = @_;

	croak "no vault name given" unless $vault_name;
	croak "no part size given" unless $part_size;
	croak "parameter number mismatch" unless @_ == 3 || @_ == 4;

	_enforce_description_limits( \$description );

	my $multipart_upload_id;

	my $res = $self->_send_receive(
		POST => "/-/vaults/$vault_name/multipart-uploads",
		[
			'x-amz-archive-description' => $description,
			'x-amz-part-size' => $part_size,
		],
	);
	# updated error severity
	croak 'multipart_upload_init failed with error ' . $res->status_line unless $res->is_success;

	$multipart_upload_id = $res->header('x-amz-multipart-upload-id');

	# double check the webservice speaks the same language
	# updated error severity
	croak 'request succeeded, but no multipart upload id was returned' unless ( $multipart_upload_id );

	return $multipart_upload_id;
}

=head2 multipart_upload_upload_part( $vault_name, $multipart_upload_id, $part_size, $part_index, $part )

Uploads a certain range of a multipart upload.

$part_size must be the same supplied to multipart_upload_init for a given
multipart upload.

$part_index should be the index of a file of N $part_size chunks whose data is
passed in $part.

$part can must be a reference to a string or be a filehandle and must be exactly
the part_size supplied to multipart_upload_initiate unless it is the last past
which can be any non-zero size.

Absolute maximum online archive size is 4GB*10000 or slightly over 39Tb.
L<Uploading Large Archives in Parts (Multipart Upload) Quick Facts|docs.aws.amazon.com/amazonglacier/latest/dev/uploading-archive-mpu.html#qfacts>

Returns uploaded part tree-hash (which should be store in an array ref to be
passed to multipart_upload_complete

L<Upload Part (PUT uploadID)|http://docs.aws.amazon.com/amazonglacier/latest/dev/api-upload-part.html>.

=cut

sub multipart_upload_upload_part {
	my ( $self, $vault_name, $multipart_upload_id, $part_size, $part_index, $part ) = @_;

	croak "no vault name given" unless $vault_name;
	croak "no multipart upload id given" unless $multipart_upload_id;
	croak "parameter number mismatch" unless @_ == 6;

	# identify $part as filehandle or string and get content
	my $content = '';

	if ( ref $part eq 'SCALAR' ) {
		# keep scalar reference
		$content = $part;
		croak "no data supplied" unless length $$content;
	} else {
		#try to read any other content as supported by File::Slurp
		eval {
			$content = File::Slurp::read_file( $part, bin_mode => ':raw', err_mode => 'carp', scalar_ref => 1 );
		};
		croak "\$part interpreted as file (GLOB, IO::Handle/File) but error occured while reading: $@" if ( $@ );

		croak "no data read from file" unless length $$content;
	}

	my $upload_part_size = length $$content;

	# compute part hash
	my $th = Net::Amazon::TreeHash->new();

	$th->eat_data( $content );

	$th->calc_tree();

	# range end must not be ( $part_size * ( $part_index + 1 ) - 1 ) or last part
	# will fail.
	my $res = $self->_send_receive(
		PUT => "/-/vaults/$vault_name/multipart-uploads/$multipart_upload_id",
		[
			'Content-Range' => 'bytes ' . ( $part_size * $part_index ) . '-' .  ( ( $part_size * $part_index ) + $upload_part_size - 1 ) . '/*',
			'Content-Length' => $upload_part_size,
			'Content-Type' => 'application/octet-stream',
			'x-amz-sha256-tree-hash' => $th->get_final_hash(),
			'x-amz-content-sha256' => Digest::SHA::sha256_hex( $$content ),
			# documentation seems to suggest x-amz-content-sha256 may not be needed but it is!
		],
		$$content
	);
	# updated error severity
	croak 'multipart_upload_upload_part failed with error ' . $res->status_line unless $res->is_success;

	# check glacier tree-hash = local tree-hash
	# updated error severity; multipart upload id must be returned
	croak 'request succeeded, but reported and computed tree-hash for part do not match' unless ( $th->get_final_hash() eq $res->header('x-amz-sha256-tree-hash') );
	# return computed tree-hash for this part
	return $res->header('x-amz-sha256-tree-hash');
}

=head2 multipart_upload_complete( $vault_name, $multipart_upload_id, $tree_hash_array_ref, $archive_size )

Signals completion of multipart upload.

$tree_hash_array_ref must be an ordered list (same order as final assembled online
archive, as opposed to upload order) of partial tree hashes as returned by
multipart_upload_upload_part

$archive_size is provided at completion to check all parts make up an archive an
not before hand to allow for archive streaming a.k.a. upload archives of unknown
size. Beware of dead ends when choosing part size. Use
calculate_multipart_upload_partsize to select a part size that will work.

Returns an archive id that can be used to request a job to retrieve the archive
at a later time on success and 0 on failure.

On failure multipart_upload_list_parts could be used to determine the missing
part or recover the partial tree hashes, complete the missing parts and
recalculate the correct archive tree hash and call multipart_upload_complete
with a successful result.

L<Complete Multipart Upload (POST uploadID)|http://docs.aws.amazon.com/amazonglacier/latest/dev/api-multipart-complete-upload.html>.

=cut

sub multipart_upload_complete {
	my ( $self, $vault_name, $multipart_upload_id, $tree_hash_array_ref, $archive_size ) = @_;

	croak "no vault name given" unless $vault_name;
	croak "no multipart upload id given" unless $multipart_upload_id;
	croak "no tree hash object given" unless ref $tree_hash_array_ref eq 'ARRAY';
	croak "parameter number mismatch" unless @_ == 5;

	my $archive_tree_hash = $self->_tree_hash_from_array_ref( $tree_hash_array_ref );

	my $res = $self->_send_receive(
		POST => "/-/vaults/$vault_name/multipart-uploads/$multipart_upload_id",
		[
			'x-amz-sha256-tree-hash' => $archive_tree_hash ,
			'x-amz-archive-size' => $archive_size,
		],
	);
	# updated error severity
	croak 'multipart_upload_complete failed with error ' . $res->status_line unless $res->is_success;

	my $rec_archive_id;
	unless ( $res->header('location') =~ m{^/[^/]+/vaults/[^/]+/archives/(.*)$} ) {
		# update severity of error. This method must return an archive id
		croak 'request succeeded, but reported archive location does not match regex: ' . $res->header('location');
	} else {
		$rec_archive_id = $1;
	}

	return $rec_archive_id;
}

=head2 multipart_upload_abort( $vault_name, $multipart_upload_id )

Aborts multipart upload releasing the id and related online resources of
a partially uploaded archive.

L<Abort Multipart Upload (DELETE uploadID)|http://docs.aws.amazon.com/amazonglacier/latest/dev/api-multipart-abort-upload.html>.

=cut

sub multipart_upload_abort {
	my ( $self, $vault_name, $multipart_upload_id ) = @_;

	croak "no vault name given" unless $vault_name;
	croak "no multipart_upload_id given" unless $multipart_upload_id;
	croak "parameter number mismatch" unless @_ == 3;

	my $res = $self->_send_receive(
		DELETE => "/-/vaults/$vault_name/multipart-uploads/$multipart_upload_id",
	);
	# updated error severity
	croak 'multipart_upload_abort failed with error ' . $res->status_line unless $res->is_success;

	# double check the webservice speaks the same language
	# updated error severity
	croak 'request returned an invalid code' unless ( $res->code == 204 );

	return $res->is_success;
}

=head2 multipart_upload_list_parts ( $vault_name, $multipart_upload_id )

Returns an array ref with information on all uploaded parts of the, probably
partially uploaded, online archive.

Useful to recover file part tree hashes and complete a broken multipart upload.

L<List Parts (GET uploadID)|http://docs.aws.amazon.com/amazonglacier/latest/dev/api-multipart-list-parts.html>

A call to multipart_upload_part_list can result in many calls to the
Amazon API at a rate of 1 per 1,000 recently completed job in existence.
Calls to List Parts in the API are L<free|http://aws.amazon.com/glacier/pricing/#storagePricing>.

=cut

sub multipart_upload_list_parts {
	my ( $self, $vault_name, $multipart_upload_id ) = @_;

	croak "no vault name given" unless $vault_name;
	croak "no multipart_upload_id given" unless $multipart_upload_id;
	croak "parameter number mismatch" unless @_ == 3;

	my @upload_part_list;

	my $marker;
	do {
		#1000 is the default limit, send a marker if needed
		my $res = $self->_send_receive( GET => "/-/vaults/$vault_name/multipart-uploads/$multipart_upload_id?limit=1000" . ($marker?'&'.$marker:'') );
		# updated error severity
		croak 'multipart_upload_list_parts failed with error ' . $res->status_line unless $res->is_success;
		my $decoded = $self->_decode_and_handle_response( $res );

		push @upload_part_list, @{$decoded->{Parts}};
		$marker = $decoded->{Marker};
	} while ( $marker );

	return \@upload_part_list;
}

=head2 multipart_upload_list_uploads( $vault_name )

Returns an array ref with information on all non completed multipart uploads.
Useful to recover multipart upload ids.
L<List Multipart Uploads (GET multipart-uploads)|http://docs.aws.amazon.com/amazonglacier/latest/dev/api-multipart-list-uploads.html>

A call to multipart_upload_list can result in many calls to the Amazon API
at a rate of 1 per 1,000 recently completed job in existence.
Calls to List Multipart Uploads in the API are L<free|http://aws.amazon.com/glacier/pricing/#storagePricing>.

=cut

sub multipart_upload_list_uploads {
	my ( $self, $vault_name ) = @_;

	croak "no vault name given" unless $vault_name;
	croak "parameter number mismatch" unless @_ == 2;

	my @upload_list;

	my $marker;
	do {
		#1000 is the default limit, send a marker if needed
		my $res = $self->_send_receive( GET => "/-/vaults/$vault_name/multipart-uploads?limit=1000" . ($marker?'&'.$marker:'') );
		# updated error severity
		croak 'multipart_upload_list_uploads failed with error ' . $res->status_line unless $res->is_success;
		my $decoded = $self->_decode_and_handle_response( $res );

		push @upload_list, @{$decoded->{UploadsList}};
		$marker = $decoded->{Marker};
	} while ( $marker );

	return \@upload_list;
}

=head1 JOB OPERATIONS

=head2 initiate_archive_retrieval( $vault_name, $archive_id, [
$description, $sns_topic ] )

Initiates an archive retrieval job. $archive_id is an ID previously
retrieved from Amazon Glacier.

A job description of up to 1,024 printable ASCII characters may be supplied.
Net::Amazon::Glacier does it's best to enforce this restriction. When unsure
send the string and look for Carp.

An SNS Topic to send notifications to upon job completion may also be supplied.

L<Initiate a Job (POST jobs)|docs.aws.amazon.com/amazonglacier/latest/dev/api-initiate-job-post.html#api-initiate-job-post-requests-syntax>.

=cut

sub initiate_archive_retrieval {
	my ( $self, $vault_name, $archive_id, $description, $sns_topic ) = @_;

	croak "no vault name given" unless $vault_name;
	croak "no archive id given" unless $archive_id;

	my $content_raw = {
		Type => 'archive-retrieval',
		ArchiveId => $archive_id,
	};

	if ( defined $description ) {
		 _enforce_description_limits( \$description );
		$content_raw->{Description} = $description;
	}

	$content_raw->{SNSTopic} = $sns_topic
		if defined($sns_topic);

	my $res = $self->_send_receive(
		POST => "/-/vaults/$vault_name/jobs",
		[ ],
		encode_json($content_raw),
	);
	# updated error severity; method must return a job id
	croak 'initiate_archive_retrieval failed with error ' . $res->status_line unless $res->is_success;

	return $res->header('x-amz-job-id');
}

=head2 initiate_inventory_retrieval( $vault_name, $format, [ $description,
$sns_topic ] )

Initiates an inventory retrieval job. $format is either CSV or JSON.

A job description of up to 1,024 printable ASCII characters may be supplied.
Net::Amazon::Glacier does it's best to enforce this restriction. When unsure
send the string and look for Carp.

An SNS Topic to send notifications to upon job completion may also be supplied.

L<Initiate a Job (POST jobs)|docs.aws.amazon.com/amazonglacier/latest/dev/api-initiate-job-post.html#api-initiate-job-post-requests-syntax>.

=cut

sub initiate_inventory_retrieval {
	my ( $self, $vault_name, $format, $description, $sns_topic ) = @_;

	croak "no vault name given" unless $vault_name;
	croak "no format given" unless $format;

	my $content_raw = {
		Type => 'inventory-retrieval',
	};

	$content_raw->{Format} = $format
		if defined($format);

	if ( defined $description ) {
		_enforce_description_limits( \$description );
		$content_raw->{Description} = $description;
	}

	$content_raw->{SNSTopic} = $sns_topic
		if defined($sns_topic);

	my $res = $self->_send_receive(
		POST => "/-/vaults/$vault_name/jobs",
		[ ],
		encode_json($content_raw),
	);
	# updated error severity; method must return a job id
	croak 'initiate_inventory_retrieval failed with error ' . $res->status_line unless $res->is_success;

	return $res->header('x-amz-job-id');
}

=head2 initiate_job( ( $vault_name, $archive_id, [ $description, $sns_topic ] )

Effectively calls initiate_inventory_retrieval.

Exists for the sole purpose or implementing the Amazon Glacier Developer Guide (API Version 2012-06-01)
nomenclature.

L<Initiate a Job (POST jobs)|http://docs.aws.amazon.com/amazonglacier/latest/dev/api-initiate-job-post.html>.

=cut

sub initiate_job {
	initiate_inventory_retrieval( @_ );
}

=head2 describe_job( $vault_name, $job_id )

Retrieves a hashref with information about the requested JobID.

L<Amazon Glacier Describe Job (GET JobID)|http://docs.aws.amazon.com/amazonglacier/latest/dev/api-describe-job-get.html>.

=cut

sub describe_job {
	my ( $self, $vault_name, $job_id ) = @_;
	my $res = $self->_send_receive( GET => "/-/vaults/$vault_name/jobs/$job_id" );
	# updated error severity
	croak 'describe_job failed with error ' . $res->status_line unless $res->is_success;
	return $self->_decode_and_handle_response( $res );
}

=head2 get_job_output( $vault_name, $job_id, [ $range ] )

Retrieves the output of a job, returns a binary blob. Optional range
parameter is passed as an HTTP header.
L<Amazon Glacier Get Job Output (GET output)|http://docs.aws.amazon.com/amazonglacier/latest/dev/api-job-output-get.html>.

If you pass a range parameter, you're going to want the tree-hash for your
chunk.  That will be returned in an additional return value, so collect it
like this:

	($bytes, $tree_hash) = get_job_output(...)

=cut

sub get_job_output {
	my ( $self, $vault_name, $job_id, $range ) = @_;

	croak "no vault name given" unless $vault_name;
	croak "no job id given" unless $vault_name;

	my $headers = [];

	push @$headers, (Range => $range)
		if defined($range);

	my $res = $self->_send_receive( GET => "/-/vaults/$vault_name/jobs/$job_id/output", $headers );
	# updated error severity
	croak 'get_job_output failed with error ' . $res->status_line unless $res->is_success;

	return wantarray ? ($res->decoded_content, $res->header('x-amz-sha256-tree-hash')) : $res->decoded_content;
}

=head2 list_jobs( $vault_name )

Return an array with information about all recently completed jobs for the
specified vault.
L<Amazon Glacier List Jobs (GET jobs)|http://docs.aws.amazon.com/amazonglacier/latest/dev/api-jobs-get.html>.

A call to list_jobs can result in many calls to the Amazon API at a rate of
1 per 1,000 recently completed job in existence.
Calls to List Jobs in the API are L<free|http://aws.amazon.com/glacier/pricing/#storagePricing>.

=cut

sub list_jobs {
	my ( $self, $vault_name ) = @_;

	croak "no vault name given" unless $vault_name;

	my @completed_jobs;

	my $marker;
	do {
		#1000 is the default limit, send a marker if needed
		my $res = $self->_send_receive( GET => "/-/vaults/$vault_name/jobs?limit=1000" . ($marker?'&'.$marker:'') );
		# updated error severity
		croak 'list_jobs failed with error ' . $res->status_line unless $res->is_success;
		my $decoded = $self->_decode_and_handle_response( $res );

		push @completed_jobs, @{$decoded->{JobList}};
		$marker = $decoded->{Marker};
	} while ( $marker );

	return ( \@completed_jobs );
}

# helper functions

# receives an array ref of hex strings as returned by multipart_upload_upload_part
# the array ref must be in the resulting online archive order as oppossed to the
# upload order
# returns an hex string representing the tree hash of the complete archive for
# use in multipart_upload_complete
sub _tree_hash_from_array_ref {
	my ( $self, $tree_hash_array_ref ) = @_;

	croak "no tree hash object given" unless $tree_hash_array_ref;
	croak "tree hash array ref is not an array reference" unless ref $tree_hash_array_ref eq 'ARRAY';
	croak "tree hash array ref does not seem to contain sha256 hex strings" unless
		length join ('', map m/^[0-9a-fA-F]{64}$/, @$tree_hash_array_ref) == scalar @$tree_hash_array_ref;

	# copy array to temporary array mapped to byte values
	my @prevLvlHashes = map( pack("H*", $_), @{$tree_hash_array_ref} );

	# consume parts in pairs A (+) B until we have one part (unrolled recursive)
	while ( @prevLvlHashes > 1 ) {
		my ( $prevLvlIterator, $currLvlIterator );

		my @currLvlHashes;

		# consume two elements form previous level to make for one element of the
		# next level, last elements on odd sized arrays copied verbatim to next level
		for ( $prevLvlIterator = 0, $currLvlIterator = 0; $prevLvlIterator < @prevLvlHashes; $prevLvlIterator+=2 ) {
			if ( @prevLvlHashes - $prevLvlIterator > 1) {
				# store digest in next level as byte values
				push @currLvlHashes, Digest::SHA::sha256( $prevLvlHashes[ $prevLvlIterator ], $prevLvlHashes[ $prevLvlIterator + 1 ] );
			} else {
				push @currLvlHashes, $prevLvlHashes[ $prevLvlIterator ];
			}
		}

		# advance one level
		@prevLvlHashes = @currLvlHashes;
	}

	# return resulting array as string of hex values
	return unpack( 'H*', $prevLvlHashes[0] );
}

sub _decode_and_handle_response {
	my ( $self, $res ) = @_;

	if ( $res->is_success ) {
		return decode_json( $res->decoded_content );
	} else {
		return undef;
	}
}

sub _send_receive {
	my $self = shift;
	my $req = $self->_craft_request( @_ );
	return $self->_send_request( $req );
}

sub _craft_request {
	my ( $self, $method, $url, $header, $content ) = @_;
	my $host = 'glacier.'.$self->{region}.'.amazonaws.com';
	my $total_header = [
		'x-amz-glacier-version' => '2012-06-01',
		'Host' => $host,
		'Date' => POSIX::strftime( '%Y%m%dT%H%M%SZ', gmtime ),
		$header ? @$header : ()
	];
	my $req = HTTP::Request->new( $method => "https://$host$url", $total_header, $content);
	my $signed_req = $self->{sig}->sign( $req );
	return $signed_req;
}

sub _send_request {
	my ( $self, $req ) = @_;
	my $res = $self->{ua}->request( $req );
	if ( $res->is_error ) {
		# try to decode Glacier error
		eval {
			my $error = decode_json( $res->decoded_content );
			carp sprintf 'Non-successful response: %s (%s)', $res->status_line, $error->{code};
			carp decode_json( $res->decoded_content )->{message};
		};
		if ( $@ ) {
			# fall back to reporting ua errors
			carp sprintf "[%d] %s %s\n", $res->code, $res->message, $res->decoded_content;
		}
	}
	return $res;
}

sub _enforce_description_limits {
	my ( $description ) = @_;
	croak 'Description should be a reference so that I can enforce limits on it.' unless ref $description eq 'SCALAR';
	# order is important. We do not want to loose any characters unless needed.
	my $changes = ( $$description =~ tr/\x20-\x7f//cd );
	carp 'Description contains invalid characters stick to printable ASCII (x20-x7f). Fixed.' if ( $changes );
	if ( length $$description > 1024 ) {
		$$description = substr( $$description, 0, 1024 );
		carp 'Description should not be longer than 1024 characters. Fixed.';
	}

	return $description;
}

=head1 ROADMAP

=over 4

=item * Online tests.

=item * Implement a "simple" interfase in the lines of

		use Net::Amazon::Glacier;

		# Bless and upload something
		my $glacier = Net::Amazon::Glacier->new( $region, $aws_key, $aws_secret, $metadata_store );

		# Upload intelligently, i.e. in resumable parts, split very big files.
		$glacier->simple->upload( $path || $scalar_ref || $some_fh );

		# Support automatic archive_id to some description conversion
		# Ask for a job when first called, return while it is not ready,
		# return content when ready.
		$glacier->simple->download( $archive_id || 'description', [ $ranges ] );

		# Request download and spawn something, wait and execute $some_code_ref
		# when content ready.
		$glacier->simple->download_wait( $archive_id || 'description' , $some_code_ref, [ $ranges ] );

		# Delete online archive
		$glacier->simple->delete( $archive_id || 'description' );

=item * Implement a simple command line cli with access to simple interface.

		glacier new us-east-1 AAIKSAKS... sdoasdod... /metadata/file
		glacier upload /some/file
		glacier download /some/file (this would spawn a daemon waiting for download)
		glacier ls

=back

=head1 SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc Net::Amazon::Glacier

You can also look for information at:

=over 4

=item * RT: CPAN's request tracker (report bugs here)

L<http://rt.cpan.org/NoAuth/Bugs.html?Dist=Net-Amazon-Glacier>

=item * AnnoCPAN: Annotated CPAN documentation

L<http://annocpan.org/dist/Net-Amazon-Glacier>

=item * CPAN Ratings

L<http://cpanratings.perl.org/d/Net-Amazon-Glacier>

=item * Search CPAN

L<http://search.cpan.org/dist/Net-Amazon-Glacier/>

=item * Check the GitHub repo, development branches in particular.

L<https://github.com/gbarco/Net-Amazon-Glacier>

=item * Mail Gonzalo Barco

C<< <gbarco uy at gmail com, no spaces> >>

=back

=head1 BUGS

Please report any bugs or feature requests to C<bug-net-amazon-glacier at rt.cpan.org>,
or through the web interface at L<http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Net-Amazon-Glacier>.
I will be notified, and then you'll automatically be notified of progress on
your bug as I make changes.

=head1 SEE ALSO

See also Victor Efimov's MT::AWS::Glacier, an application for AWS Glacier
synchronization. It is available at L<https://github.com/vsespb/mt-aws-glacier>.

=head1 AUTHORS

Originally written by Tim Nordenfur, C<< <tim at gurka.se> >>.
Maintained by Gonzalo Barco C<< <gbarco uy at gmail com, no spaces> >>
Support for job operations was contributed by Ted Reed at IMVU.
Support for many file operations and multipart uploads by Gonzalo Barco.
Bugs, suggestions and fixes contributed by Victor Efimov and Kevin Goess.

=head1 LICENSE AND COPYRIGHT

Copyright 2012 Tim Nordenfur.

This program is free software; you can redistribute it and/or modify it
under the terms of either: the GNU General Public License as published
by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.

=cut

1; # End of Net::Amazon::Glacier


Powered by Groonga
Maintained by Kenichi Ishigaki <ishigaki@cpan.org>. If you find anything, submit it on GitHub.