package WebService::Browshot;

use 5.006006;
use strict;
use warnings;

use LWP::UserAgent;
use JSON;
use URI::Encode qw(uri_encode);
use File::Basename;
use File::Path qw(make_path);

$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;
use IO::Socket::SSL;
IO::Socket::SSL::set_ctx_defaults( 
     SSL_verifycn_scheme => 'www', 
     SSL_verify_mode => 0,
     verify_mode => 0,
);

our $VERSION = '1.29.0';

=head1 NAME

WebService::Browshot - Perl extension for Browshot (L<https://browshot.com/>), a web service to create screenshots of web pages.

=head1 SYNOPSIS

  use WebService::Browshot;
  
  my $browshot = WebService::Browshot->new(key => 'my_key');
  my $screenshot = $browshot->screenshot_create(url => 'http://www.google.com/');
  [...]
  $browshot->screenshot_thumbnail_file(id => $screenshot->{id}, file => 'google.png');

=head1 DESCRIPTION

Browshot (L<http://www.browshot.com/>) is a web service to easily make screenshots of web pages in any screen size, as any device: iPhone, iPad, Android, Nook, PC, etc. Browshot has full Flash, JavaScript, CSS, & HTML5 support.

The latest API version is detailed at L<http://browshot.com/api/documentation>. WebService::Browshot follows the API documentation very closely: the function names are similar to the URLs used (screenshot/create becomes C<screenshot_create()>, instance/list becomes C<instance_list()>, etc.), the request arguments are exactly the same, etc.

The library version closely matches the API version it handles: WebService::Browshot 1.0.0 is the first release for the API 1.0, WebService::Browshot 1.1.1 is the second release for the API 1.1, etc.

WebService::Browshot can handle most API updates within the same major version, e.g. WebService::Browshot 1.0.0 should be compatible with the API 1.1 or 1.2.

The source code is available on GitHub at L<https://github.com/juliensobrier/browshot-perl>.


=head1 METHODS

=head2 new()

  my $browshot = WebService::Browshot->new(key => 'my_key', base => 'https://api.browshot.com/api/v1/', debug => 1);

Create a new WebService::Browshot object. You must pass your API key (go to your Dashboard to find your API key).

Arguments:

=over 4

=item key

Required.  API key.

=item base

Optional. Base URL for all API requests. You should use the default base provided by the library. Be careful if you decide to use HTTP instead of HTTPS as your API key could be sniffed and your account could be used without your consent.

=item debug

Optional. Set to 1 to print debug output to the standard output. 0 (disabled) by default.

=item timeout

Optional. Set the request timeout - in seconds - against the API. Defaults to 90s.

=back

C<last_error> contains the last error message. It is NEVER reset, i.e. C<last_error> may still be set after a successful API call if an earlier call failed.

=cut

sub new {
  	my ($self, %args) = @_;

	my $ua = LWP::UserAgent->new();
	$ua->timeout($args{'timeout'} || 90);
	$ua->env_proxy;
	$ua->max_redirect(32); # for the simple API only
	$ua->agent("WebService::Browshot $VERSION");
	$ua->ssl_opts( verify_hostname => 0 ); # the LWP option name is verify_hostname (singular)

  	my $browshot = {	
		_key	=> $args{key}	|| '',
		_base	=> $args{base}	|| 'https://api.browshot.com/api/v1/',
		_debug	=> $args{debug}	|| 0,

		_retry	=> 2,
		last_error	=> '',

		_ua		=> $ua,
	};

  return bless($browshot, $self);
}


=head2 api_version()

Return the API version handled by the library. Note that this library can usually handle new arguments in requests without requiring an update.
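
As a rough illustration of that mapping (not a separate API), the first two components of the module version are kept, so a module version such as 1.29.0 corresponds to API 1.29. The C<$version> and C<$api> variables below exist only for this example:

  use strict;
  use warnings;

  # Keep the "major.minor" part of a three-part version string.
  my $version = '1.29.0';
  my ($api) = $version =~ /^(\d+\.\d+)\.\d/;

  # Fall back to the full string when it does not have three parts.
  $api = $version unless defined $api;

  print "$api\n";    # prints 1.29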

=cut

sub api_version {
	my ($self, %args) = @_;

	if ($VERSION =~ /^(\d+\.\d+)\.\d/) {
		return $1;
	}

	return $VERSION;
}



=head2 simple()

   $browshot->simple(url => 'http://mobilito.net')

Retrieve a screenshot in one function. Note: by default, screenshots are cached for 24 hours. You can tune this value with the cache=X parameter.

Return an array (status code, PNG). See L<http://browshot.com/api/documentation#simple> for the list of possible status codes.

Arguments:

See L<http://browshot.com/api/documentation#simple> for the full list of possible arguments.

=over 4

=item url

Required. URL of the website to create a screenshot of.

=back
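
The returned list can be handled as in the following sketch. It assumes that a status code of 200 means the image was retrieved. C<$code> and C<$png> are hard-coded stand-ins for the values returned by C<simple()>, and F<mobilito.png> is an arbitrary file name chosen for the example:

  use strict;
  use warnings;
  use File::Spec;

  # Stand-ins for:
  #   my ($code, $png) = $browshot->simple(url => 'http://mobilito.net');
  my ($code, $png) = (200, "\x89PNG\r\n\x1a\n");

  my $file = File::Spec->catfile(File::Spec->tmpdir, 'mobilito.png');

  if ($code == 200 && length $png) {
      open(my $fh, '>', $file) or die "Cannot open $file for writing: $!";
      binmode $fh;    # keep the binary data intact on Windows
      print {$fh} $png;
      close($fh) or die "Cannot close $file: $!";
  }
  else {
      warn "No screenshot retrieved (status code $code)\n";
  }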

=cut

sub simple {
	my ($self, %args) = @_;

	my $url	= $self->make_url(action => 'simple', parameters => { %args });
	my $res = $self->{_ua}->get($url);

	# $self->info($res->message);
	# $self->info($res->request->as_string);
	# $self->info($res->as_string);
	
	return ($res->code, $res->decoded_content);
}

=head2 simple_file()

   $browshot->simple_file(url => 'http://mobilito.net', file => '/tmp/mobilito.png')

Retrieve a screenshot and save it locally in one function. Note: by default, screenshots are cached for 24 hours. You can tune this value with the cache=X parameter.

Return an array (status code, file name). The file name is empty if the screenshot was not retrieved. See L<http://browshot.com/api/documentation#simple> for the list of possible status codes.

Arguments:

See L<http://browshot.com/api/documentation#simple> for the full list of possible arguments.

=over 4

=item url

Required. URL of the website to create a screenshot of.

=item file

Required. Local file name to write to.

=back

=cut

sub simple_file {
	my ($self, %args) 	= @_;
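	my $file		= $args{file};

	if (! defined $file || $file eq '') {
		$self->error("Missing file in simple_file");
		return (400, '');
	}
	if (-d $file) {
		$self->error("You must specify a file path, not a folder, to save the screenshot");
		return (400, '');
	}
	delete $args{file}; # local path, not an API parameter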

	my $url	= $self->make_url(action => 'simple', parameters => { %args });
	my $res = $self->{_ua}->get($url);

	my $content = $res->decoded_content;

	if ($content ne '') {
		my $fh;
		if (! open($fh, '>', $file)) {
			$self->error("Cannot open $file for writing: $!");
			return ($res->code, '');
		}
		binmode $fh;
		print {$fh} $content;
		close $fh;

		return ($res->code, $file);
	}
	else {
		$self->error("No thumbnail retrieved");
		return ($res->code, '');
	}
}

=head2 instance_list()

Return the list of instances as a hash reference. See L<http://browshot.com/api/documentation#instance_list> for the response format.

=cut

sub instance_list {
	my ($self, %args) = @_;
	
	return $self->return_reply(action => 'instance/list');
}

=head2 instance_info()

   $browshot->instance_info(id => 2)

Return the details of an instance. See L<http://browshot.com/api/documentation#instance_info> for the response format.

Arguments:

=over 4

=item id

Required. Instance ID

=back

=cut

sub instance_info  {
	my ($self, %args) 	= @_;
	my $id							= $args{id}	|| $self->error("Missing id in instance_info");

	return $self->return_reply(action => 'instance/info', parameters => { id => $id });
}

=head2 browser_list()

Return the list of browsers as a hash reference. See L<http://browshot.com/api/documentation#browser_list> for the response format.

=cut

sub browser_list {
	my ($self, %args) = @_;
	
	return $self->return_reply(action => 'browser/list');
}

=head2 browser_info()

   $browshot->browser_info(id => 2)

Return the details of a browser. See L<http://browshot.com/api/documentation#browser_info> for the response format.

Arguments:

=over 4

=item id

Required. Browser ID

=back

=cut

sub browser_info  {
	my ($self, %args) 	= @_;
	my $id				= $args{id}	|| $self->error("Missing id in browser_info");

	return $self->return_reply(action => 'browser/info', parameters => { id => $id });
}


=head2 screenshot_create()

  $browshot->screenshot_create(url => 'http://www.google.com/', instance_id => 3, size => 'page')

Request a screenshot. See L<http://browshot.com/api/documentation#screenshot_create> for the response format.
Note: by default, screenshots are cached for 24 hours. You can tune this value with the cache=X parameter.

Arguments:

See L<http://browshot.com/api/documentation#screenshot_create> for the full list of possible arguments.

=over 4

=item url

Required. URL of the website to create a screenshot of.

=item instance_id

Optional. Instance ID to use for the screenshot.

=item size

Optional. Screenshot size.

=back

=cut

sub screenshot_create {
	my ($self, %args) 	= @_;
	# my $url				= $args{url}			|| $self->error("Missing url in screenshot_create");
	# my $instance_id		= $args{instance_id};
	# my $screen			= $args{screen};
	# my $size			= $args{size}			|| "screen";
	# my $cache			= $args{cache};
	# my $priority		= $args{priority};

	$self->error("Missing url in screenshot_create") 	if (! defined($args{url}));
# 	$args{size} = "screen" 					if (! defined($args{size}));

	return $self->return_reply(action => 'screenshot/create', parameters => { %args });
}

=head2 screenshot_info()

  $browshot->screenshot_info(id => 568978)

Get information about a screenshot requested previously. See L<http://browshot.com/api/documentation#screenshot_info> for the response format.

Arguments:

=over 4

=item id

Required. Screenshot ID.

=back
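
A screenshot may still be in progress when C<screenshot_create()> returns, so callers often poll C<screenshot_info()> until it is done. The helper below is a hypothetical sketch and not part of this module: C<wait_for_screenshot> is an invented name, and the C<finished> and C<error> status values are assumed to match the API documentation.

  use strict;
  use warnings;

  # Call $fetch until the reply reports a final status, or give up.
  sub wait_for_screenshot {
      my ($fetch, $max_tries, $delay) = @_;

      for my $try (1 .. $max_tries) {
          my $info   = $fetch->();
          my $status = $info->{status} || '';
          return $info if $status eq 'finished' || $status eq 'error';
          sleep $delay if $try < $max_tries;
      }

      return;    # still not done after $max_tries attempts
  }

  # Hypothetical usage with a real object and screenshot ID:
  #   my $info = wait_for_screenshot(
  #       sub { $browshot->screenshot_info(id => $id) }, 20, 5);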

=cut

sub screenshot_info {
	my ($self, %args) 	= @_;
	my $id			= $args{id}	|| $self->error("Missing id in screenshot_info");


	return $self->return_reply(action => 'screenshot/info', parameters => { %args });
}

=head2 screenshot_list()

  $browshot->screenshot_list(limit => 50)

Get details about screenshots requested. See L<http://browshot.com/api/documentation#screenshot_list> for the response format.

Arguments:

=over 4

=item limit

Optional. Maximum number of screenshots to retrieve.

=back

=cut

sub screenshot_list {
	my ($self, %args) 	= @_;

	return $self->return_reply(action => 'screenshot/list', parameters => { %args });
}

=head2 screenshot_search()

  $browshot->screenshot_search(url => 'google.com')

Get details about previously requested screenshots whose URL matches a string. See L<http://browshot.com/api/documentation#screenshot_search> for the response format.

Arguments:

=over 4

=item url

Required. URL string to look for.

=back

=cut

sub screenshot_search {
	my ($self, %args) 	= @_;
	my $url			= $args{url}	|| $self->error("Missing url in screenshot_search");

	return $self->return_reply(action => 'screenshot/search', parameters => { %args });
}

=head2 screenshot_host()

  $browshot->screenshot_host(id => 12345, hosting => 'browshot')

Host a screenshot or thumbnail. See L<http://browshot.com/api/documentation#screenshot_host> for the response format.

Arguments:

=over 4

=item id

Required. Screenshot ID.

=back

=cut

sub screenshot_host {
	my ($self, %args) 	= @_;
	my $id			= $args{id}	|| $self->error("Missing id in screenshot_host");

	return $self->return_reply(action => 'screenshot/host', parameters => { %args });
}


=head2 screenshot_thumbnail()

  $browshot->screenshot_thumbnail(id => 52942, width => 500)

Retrieve the screenshot, or a thumbnail. See L<http://browshot.com/api/documentation#screenshot_thumbnail> for the response format.

Return an empty string if the image could not be retrieved.

Arguments:

See L<http://browshot.com/api/documentation#screenshot_thumbnail> for the full list of possible arguments.

=over 4

=item id

Required. Screenshot ID.

=item width

Optional. Maximum width of the thumbnail.

=item height

Optional. Maximum height of the thumbnail.

=back

=cut
sub screenshot_thumbnail {
	my ($self, %args) 	= @_;

	if (exists($args{url}) && $args{url} =~ /image\/(\d+)\?/i && ! exists($args{id})) {
		# get ID from url
		$args{id} = $1;

		# width and height may appear anywhere in the query string
		if ($args{url} =~ /[?&]width=(\d+)/i && ! exists($args{width})) {
			$args{width} = $1;
		}
		if ($args{url} =~ /[?&]height=(\d+)/i && ! exists($args{height})) {
			$args{height} = $1;
		}
	}
	elsif(! exists($args{id}) ) {
		$self->error("Missing id and url in screenshot_thumbnail");
		return '';
	}


	my $url	= $self->make_url(action => 'screenshot/thumbnail', parameters => { %args });
	my $res =  $self->{_ua}->get($url);

	if ($res->is_success) {
		return $res->decoded_content; # raw image file content
	}
	else {
		$self->error("Error in thumbnail request: " . $res->as_string);
		return '';
	}
}


=head2 screenshot_thumbnail_file()

  $browshot->screenshot_thumbnail_file(id => 123456, height => 500, file => '/tmp/google.png')

Retrieve the screenshot, or a thumbnail, and save it to a file. See L<http://browshot.com/api/documentation#thumbnails> for the response format.

Return an empty string if the image could not be retrieved or saved. Return the file name if successful.

Arguments:

See L<http://browshot.com/api/documentation#thumbnails> for the full list of possible arguments.

=over 4

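=item id

Required unless C<url> is given. Screenshot ID.

=item url

Alternative to C<id>. URL of the screenshot (screenshot_url value retrieved from C<screenshot_create()> or C<screenshot_info()>); the screenshot ID is extracted from it. You will get the full image if no other argument is specified.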

=item file

Required. Local file name to write to.

=item width

Optional. Maximum width of the thumbnail.

=item height

Optional. Maximum height of the thumbnail.

=back

=cut
sub screenshot_thumbnail_file {
	my ($self, %args) 	= @_;
	my $file		= $args{file}	|| $self->error("Missing file in screenshot_thumbnail_file");

	delete($args{file});
	my $content = $self->screenshot_thumbnail(%args);

	my $dir = dirname($file);
	if (! -d $dir) {
		make_path($dir)
	}

	if ($content ne '') {
		my $fh;
		if (! open($fh, '>', $file)) {
			$self->error("Cannot open $file for writing: $!");
			return '';
		}
		binmode $fh;
		print {$fh} $content;
		close $fh;

		return $file;
	}
	else {
		$self->error("No thumbnail retrieved");
		return '';
	}
}

=head2 screenshot_share()

  $browshot->screenshot_share(id => 12345, note => 'This is my screenshot')

Share a screenshot. See L<http://browshot.com/api/documentation#screenshot_share> for the response format.

Arguments:

=over 4

=item id

Required. Screenshot ID.

=item note

Optional. Public note to add to the screenshot.

=back

=cut

sub screenshot_share {
	my ($self, %args) 	= @_;
	my $id				= $args{id}	|| $self->error("Missing id in screenshot_share");

	return $self->return_reply(action => 'screenshot/share', parameters => { %args });
}


=head2 screenshot_delete()

  $browshot->screenshot_delete(id => 12345, data => 'url,metadata')

Delete details of a screenshot. See L<http://browshot.com/api/documentation#screenshot_delete> for the response format.

Arguments:

=over 4

=item id

Required. Screenshot ID.

=item data

Optional. Information to delete.

=back

=cut

sub screenshot_delete {
	my ($self, %args) 	= @_;
	my $id			= $args{id}	|| $self->error("Missing id in screenshot_delete");

	return $self->return_reply(action => 'screenshot/delete', parameters => { %args });
}

=head2 screenshot_html()

  $browshot->screenshot_html(id => 12345)

Get the HTML code of the rendered page. See L<http://browshot.com/api/documentation#screenshot_html> for the response format.

Arguments:

=over 4

=item id

Required. Screenshot ID.

=back

=cut

sub screenshot_html {
	my ($self, %args) 	= @_;
	my $id			= $args{id}	|| $self->error("Missing id in screenshot_html");

	return $self->return_string(action => 'screenshot/html', parameters => { %args });
}

=head2 screenshot_multiple()

  $browshot->screenshot_multiple(urls => ['http://mobilito.net/'], instances => [22, 30])

Request multiple screenshots. See L<http://browshot.com/api/documentation#screenshot_multiple> for the response format.

Arguments:

=over 4

=item urls

Required. One or more URLs.

=item instances

Required. One or more instance_id.

=back
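
Each entry of C<urls> and C<instances> is sent as a repeated C<url> or C<instance_id> query parameter. The sketch below shows the shape of the resulting query string. The C<encode> helper is a simplified, ASCII-only stand-in for C<uri_encode($value, 1)> written for this example, and C<my_key> is a placeholder:

  use strict;
  use warnings;

  # Percent-encode everything except RFC 3986 unreserved characters.
  sub encode {
      my ($value) = @_;
      $value =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
      return $value;
  }

  my @urls      = ('http://mobilito.net/');
  my @instances = (22, 30);

  my $query = 'key=' . encode('my_key');
  $query .= '&url=' . encode($_)         for @urls;
  $query .= '&instance_id=' . encode($_) for @instances;

  print "$query\n";
  # key=my_key&url=http%3A%2F%2Fmobilito.net%2F&instance_id=22&instance_id=30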

=cut

sub screenshot_multiple {
	my ($self, %args) 	= @_;
# 	my $urls		= $args{urls}		|| $self->error("Missing urls in screenshot_multiple");
# 	my $instances		= $args{instances}	|| $self->error("Missing instances in screenshot_multiple");

	return $self->return_reply(action => 'screenshot/multiple', parameters => { %args });
}


=head2 batch_create()

  $browshot->batch_create(file => '/my/file/urls.txt', instance_id => 65)

Request multiple screenshots from a text file. See L<http://browshot.com/api/documentation#batch_create> for the response format.

Arguments:

=over 4

=item file

Required. Path to the text file which contains the list of URLs.

=item instance_id

Required. instance_id to use for all screenshots.

=back

=cut

sub batch_create {
	my ($self, %args) 	= @_;
	my $file		= $args{file}		|| $self->error("Missing file in batch_create");
	my $instance_id		= $args{instance_id}	|| $self->error("Missing instance_id in batch_create");

	delete $args{file};
	return $self->return_post_reply(action => 'batch/create', parameters => { %args }, file => $file);
}

=head2 batch_info()

  $browshot->batch_info(id => 5)

Get information about a screenshot batch requested previously. See L<http://browshot.com/api/documentation#batch_info> for the response format.

Arguments:

=over 4

=item id

Required. Batch ID.

=back

=cut

sub batch_info {
	my ($self, %args) 	= @_;
	my $id			= $args{id}	|| $self->error("Missing id in batch_info");

	return $self->return_reply(action => 'batch/info', parameters => { %args });
}


=head2 crawl_create()

  $browshot->crawl_create(domain => 'blitapp.com', url => 'https://blitapp.com/', max => 50, instance_id => 65)

Crawl a domain and screenshot all pages. See L<http://browshot.com/api/documentation#crawl_create> for the response format.

Arguments:

=over 4

=item domain

Required. Domain to crawl.

=item url

Required. URL to start with.

=item instance_id

Required. instance_id to use for all screenshots.

=back

=cut

sub crawl_create {
	my ($self, %args) = @_;
	my $domain				= $args{domain}			|| $self->error("Missing domain in crawl_create");
	my $url						= $args{url}				|| $self->error("Missing url in crawl_create");
	my $instance_id		= $args{instance_id}	|| $self->error("Missing instance_id in crawl_create");

	return $self->return_reply(action => 'crawl/create', parameters => { %args });
}

=head2 crawl_info()

  $browshot->crawl_info(id => 5)

Get information about a crawl requested previously. See L<http://browshot.com/api/documentation#crawl_info> for the response format.

Arguments:

=over 4

=item id

Required. Crawl ID.

=back

=cut

sub crawl_info {
	my ($self, %args) 	= @_;
	my $id			= $args{id}	|| $self->error("Missing id in crawl_info");

	return $self->return_reply(action => 'crawl/info', parameters => { %args });
}

=head2 account_info()

Return information about the user account. See L<http://browshot.com/api/documentation#account_info> for the response format.

=cut

sub account_info {
	my ($self, %args) = @_;
	
	return $self->return_reply(action => 'account/info', parameters => { %args });
}


# Private methods

sub return_string {
	my ($self, %args) 	= @_;

	my $url	= $self->make_url(%args);
	
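	my $res;
	my $try = 0;
	my $failed;

	# Retry only when the request itself dies, up to _retry attempts.
	do {
		$self->info("Try $try");
		$failed = ! eval {
			$res = $self->{_ua}->get($url);
			1;
		};
		$self->error($@) if ($failed);
		$try++;
	}
	until (! $failed || $try >= $self->{_retry});

	# Every attempt died: last_error already holds the exception.
	return '' if (! defined $res);

	if (! $res->is_success) {
		$self->error("Server sent back an error: " . $res->code);
	}
  
	return $res->decoded_content;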
}

sub return_post_string {
	my ($self, %args) 	= @_;
	my $file 		= $args{'file'} || '';

	delete $args{'file'};
	my $url	= $self->make_url(%args);
	
	my $res;
	my $try = 0;
	my $failed;

	# Retry only when the request itself dies, up to _retry attempts.
	do {
		$self->info("Try $try");
		$failed = ! eval {
			$res = $self->{_ua}->post(
			  $url,
			  Content_Type => 'form-data',
			  Content => [
			    file => [$file],
			  ]
			);
			1;
		};
		$self->error($@) if ($failed);
		$try++;
	}
	until (! $failed || $try >= $self->{_retry});

	# Every attempt died: last_error already holds the exception.
	return '' if (! defined $res);

	if (! $res->is_success) {
		$self->error("Server sent back an error: " . $res->code);
		$self->info($res->request->as_string);
		$self->info($res->as_string);
	}
  
	return $res->decoded_content;
}

sub return_reply {
	my ($self, %args) 	= @_;

	my $content = $self->return_string(%args);

	my $info;
	eval {
		$info = decode_json($content);
	};
	if ($@) {
		$self->error("Invalid server response: " . $@);
		return $self->generic_error($@);
	}

	return $info;
}

sub return_post_reply {
	my ($self, %args) 	= @_;

	my $content = $self->return_post_string(%args);

	my $info;
	eval {
		$info = decode_json($content);
	};
	if ($@) {
		$self->error("Invalid server response: " . $@);
		return $self->generic_error($@);
	}

	return $info;
}

sub make_url {
	my ($self, %args) 	= @_;
	my $action		= $args{action}		|| '';
	my $parameters		= $args{parameters}	|| { };

	my $url = $self->{_base} . "$action?key=" . uri_encode($self->{_key}, 1);


	if (exists $parameters->{urls}) {
	  foreach my $uri (@{ $parameters->{urls} }) {
	    $url .= '&url=' . uri_encode($uri, 1);
	  }
	  delete  $parameters->{urls};
	}

	if (exists $parameters->{instances}) {
	  foreach my $instance_id (@{ $parameters->{instances} }) {
	    $url .= '&instance_id=' . uri_encode($instance_id, 1);
	  }
	  delete  $parameters->{instances};
	}

	foreach my $key (keys %$parameters) {
	  $url .= '&' . uri_encode($key) . '=' . uri_encode($parameters->{$key}, 1) if (defined $parameters->{$key});
	}

	$self->info($url);
	return $url;
}

sub info {
	my ($self, $message) = @_;

	if ($self->{_debug}) {
	  print $message, "\n";
	}

	return '';
}

sub error {
	my ($self, $message) = @_;

	$self->{last_error} = $message;

	if ($self->{_debug}) {
		print $message, "\n";
	}

	return '';
}

sub generic_error {
	my ($self, $message) = @_;


	return { error => 1, message => $message };
}

=head1 CHANGES

=over 4

=item 1.24.0

C<screenshot_thumbnail_file> creates the directory structure as needed.

=item 1.16.0

Check that the file is not a folder in C<simple_file>.

=item 1.14.1

Remove deprecated API calls.

=item 1.14.0

Add C<batch_create> and C<batch_info> for API 1.14.

=item 1.13.0

Add C<screenshot_html> and C<screenshot_multiple> for API 1.13.

=item 1.12

Add C<screenshot_search> for API 1.12.

=item 1.11.1

Return Browshot response in case of error if the reply is valid JSON.

=item 1.11

Compatible with API 1.11. Optional HTTP timeout.

=item 1.10

Add C<screenshot_delete> for API 1.10.

=item 1.9.4

Fix status code in error messages.

=item 1.9.3

Keep backward compatibility for C<screenshot_thumbnail>.

=item 1.9.0

Add C<screenshot_share> for API 1.9.

=item 1.8.0

Update C<screenshot_thumbnail> to use new API.

=item 1.7.0

Update C<screenshot_info> to handle additional parameters.

=item 1.5.1

Use binmode to create valid PNG files on Windows.

=item 1.4.1

Fix URI encoding.

=item 1.4.0

Add C<simple> and C<simple_file> methods.

=item 1.3.1

Retry requests (up to 2 times) to browshot.com in case of error.

=back

=head1 SEE ALSO

See L<http://browshot.com/api/documentation> for the API documentation.

Create a free account at L<http://browshot.com/login> to get your free API key.

Go to L<http://browshot.com/dashboard> to find your API key after you have registered.

=head1 AUTHOR

Julien Sobrier, E<lt>julien@sobrier.netE<gt>

=head1 COPYRIGHT AND LICENSE

Copyright (C) 2015 by Julien Sobrier

This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself, either Perl version 5.8.8 or,
at your option, any later version of Perl 5 you may have available.


=cut

1;

