=encoding UTF-8
=head1 NAME
WWW::Wikipedia::LangTitles - get interwiki links from Wikipedia.
=head1 SYNOPSIS
    use utf8;
    use WWW::Wikipedia::LangTitles 'get_wiki_titles';
    my $title = 'Three-phase electric power';
    my $links = get_wiki_titles ($title);
    print "$title is '$links->{de}' in German.\n";
    my $film = '東京物語';
    my $flinks = get_wiki_titles ($film, lang => 'ja');
    print "映画「$film」はイタリア語で「$flinks->{it}」と名付けた。\n";
produces output
    Three-phase electric power is 'Dreiphasenwechselstrom' in German.
    映画「東京物語」はイタリア語で「Viaggio a Tokyo」と名付けた。
(This example is included as L<F<synopsis.pl>|https://fastapi.metacpan.org/source/BKB/WWW-Wikipedia-LangTitles-0.04/examples/synopsis.pl> in the distribution.)
=head1 VERSION
This documents version 0.04 of
WWW::Wikipedia::LangTitles corresponding to L<git commit cd5d0156c401472bc424421159fca7d3c0f769fe|https://github.com/benkasminbullock/www-wikipedia-langtitles/commit/cd5d0156c401472bc424421159fca7d3c0f769fe> released
on Thu Jul 20 13:15:53 2017 +0900.
=head1 DESCRIPTION
This module retrieves the Wikipedia interwiki link titles from the web
site wikidata.org. It can be used, for example, to translate a term in
English into other languages, or to get near equivalents.
=head1 FUNCTIONS
=head2 get_wiki_titles
    my $ref = get_wiki_titles ('Helium');
Given the title of a Wikipedia article as an argument, the return
value is a hash reference whose keys are language codes and whose
values are the titles of the equivalent Wikipedia articles in those
languages. For example, in the above case of B<Helium>,
C<< $ref->{th} >> is ฮีเลียม, the Thai title of the Wikipedia article
on helium.
The language of the original page can be specified like this:
    use utf8;
    my $from_th = get_wiki_titles ('ฮีเลียม', lang => 'th');
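The title is URL-encoded using L<URI::Escape/uri_escape_utf8>, so pass
character strings, not byte strings. For string literals in your
source code, put C<use utf8;> at the top of the script. For titles
read from elsewhere, decode them into characters first.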
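The difference between character and byte strings can be seen by calling C<uri_escape_utf8> directly. The following is a minimal sketch, not code from this module; the variable names are illustrative, and L<Encode> is used only to simulate a title arriving as undecoded UTF-8 bytes.

```perl
use strict;
use warnings;
use utf8;                       # string literals below are character strings
use Encode qw(decode encode);
use URI::Escape 'uri_escape_utf8';

# A character string, which is what uri_escape_utf8 expects.
my $chars = 'ฮีเลียม';

# Escaping the character string gives the percent-encoded UTF-8 bytes.
my $good = uri_escape_utf8 ($chars);

# The same title as raw UTF-8 bytes, as it might arrive from @ARGV or
# from a file opened without an encoding layer.
my $bytes = encode ('UTF-8', $chars);

# Escaping the byte string encodes it a second time, so the result
# no longer names the article.
my $bad = uri_escape_utf8 ($bytes);

# Decoding the bytes into characters first restores the correct result.
my $fixed = uri_escape_utf8 (decode ('UTF-8', $bytes));

print "characters: $good\n";
print "bytes:      $bad\n";
print "decoded:    $fixed\n";
```

The first and third lines printed should agree with each other and with the escaped title in the L</make_wiki_url> example output, while the second should be longer because each byte has been encoded again.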
As of version 0.04, get_wiki_titles removes the entries for
non-encyclopedia sites such as Wikiquote and Wikiversity from the
returned hash.
=head2 make_wiki_url
    my $url = make_wiki_url ('helium');
Make a URL for the Wikidata page. You will then need to retrieve the
page and parse the JSON yourself. Use a second argument to specify the
language of the page:
    use utf8;
    use WWW::Wikipedia::LangTitles 'make_wiki_url';
    print make_wiki_url ('ฮีเลียม', 'th'), "\n";
produces output
    https://www.wikidata.org/w/api.php?action=wbgetentities&sites=thwiki&titles=%E0%B8%AE%E0%B8%B5%E0%B9%80%E0%B8%A5%E0%B8%B5%E0%B8%A2%E0%B8%A1&props=sitelinks/urls|datatype&format=json
(This example is included as L<F<thai-url.pl>|https://fastapi.metacpan.org/source/BKB/WWW-Wikipedia-LangTitles-0.04/examples/thai-url.pl> in the distribution.)
If no language is specified, the default is C<en> for English.
This function was added in version 0.02 of the module.
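If you retrieve the page yourself, the parsing step might look something like the following sketch. It is not the module's own code: C<titles_from_json> is a hypothetical helper, the sample JSON is hand-written in the general shape of a Wikidata C<wbgetentities> response rather than copied from a real one, and the rule for picking out Wikipedia sites is an assumption. The core module L<JSON::PP> is used here so that the sketch has no extra dependencies.

```perl
use strict;
use warnings;
use JSON::PP 'decode_json';

# Hypothetical helper, not part of WWW::Wikipedia::LangTitles. Given the
# JSON text of a wbgetentities response, return a hash reference mapping
# language codes to article titles.
sub titles_from_json
{
    my ($json) = @_;
    my $data = decode_json ($json);
    my %titles;
    for my $entity (values %{$data->{entities} || {}}) {
        # Entities for missing pages have no "sitelinks" field.
        my $sitelinks = $entity->{sitelinks} or next;
        for my $site (keys %$sitelinks) {
            # Assumption: Wikipedia site keys are a language code followed
            # by "wiki", so "enwikiquote" and similar do not match. Keys
            # such as "commonswiki" would need extra filtering.
            if ($site =~ /^(.+)wiki$/) {
                $titles{$1} = $sitelinks->{$site}{title};
            }
        }
    }
    return \%titles;
}

# Hand-written sample data, for illustration only.
my $sample = <<'EOF';
{"entities":{"Q560":{"type":"item","id":"Q560","sitelinks":{
"enwiki":{"site":"enwiki","title":"Helium"},
"dewiki":{"site":"dewiki","title":"Helium"},
"enwikiquote":{"site":"enwikiquote","title":"Helium"}
}}},"success":1}
EOF

my $titles = titles_from_json ($sample);
for my $lang (sort keys %$titles) {
    print "$lang: $titles->{$lang}\n";
}
```

The JSON text itself could be fetched from the address returned by make_wiki_url using L<LWP::UserAgent>, which this module already depends on.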
=head1 SEE ALSO
=over
=item L<Locale::Codes>
L<Locale::Codes> can convert the language codes which
WWW::Wikipedia::LangTitles returns as hash keys into the
English-language names of the languages.
    use utf8;
    use FindBin '$Bin';
    use WWW::Wikipedia::LangTitles 'get_wiki_titles';
    use Locale::Codes::Language;
    my $article = 'King Kong';
    my $titles = get_wiki_titles ($article);
    for my $lang (keys %$titles) {
        my $l2c = code2language ($lang);
        if (! $l2c) {
            $l2c = $lang;
        }
        my $name = $titles->{$lang};
        if ($name ne $article) {
            print "$name in $l2c.\n";
        }
    }
produces output
    king.kong in jbo.
    קינג קונג in Hebrew.
    Кинг Конг in Bulgarian.
    キングコング in Japanese.
    كينغ كونغ in Arabic.
    Кінг-Конг in Ukrainian.
    King Kong (hahmo) in Finnish.
    金剛 (怪獸) in Chinese.
    Քինգ Քոնգ in Armenian.
    คิงคอง in Thai.
    کینگ کونگ in Persian.
    Кинг-Конг in Russian.
    킹콩 in Korean.
    კინგ კონგი in Georgian.
(This example is included as L<F<locale-codes.pl>|https://fastapi.metacpan.org/source/BKB/WWW-Wikipedia-LangTitles-0.04/examples/locale-codes.pl> in the distribution.)
=back
=head1 DEPENDENCIES
=over
=item Carp
L<Carp> is used to report errors.
=item LWP::UserAgent
L<LWP::UserAgent> is used to retrieve the data from Wikidata.
=item JSON::Parse
L<JSON::Parse> is used to parse the JSON data from Wikidata.
=item URI::Escape
L<URI::Escape> is used to make the URLs for Wikidata from the input
titles.
=back
=head1 EXPORTS
Nothing is exported by default. The export tag ':all' exports all the
functions of the module.
    use WWW::Wikipedia::LangTitles ':all';
=head1 TESTING
The default tests of the module do not attempt to connect to the
internet. To test using an internet connection, run F<xt/scrape.t>
like this:
    prove -I lib xt/scrape.t
from the top directory of the distribution.
=head1 HISTORY
This module began as a collection of small scripts which I used to
scrape multilingual article names related to physics from Wikipedia.
Specifically, I used the scripts to add some Japanese element names to
L<Chemistry::Elements>. I made them into a CPAN module because I
thought the method might be useful to other people.
Version 0.02 added the L</make_wiki_url> function for people who want
to retrieve and parse the output themselves.
=head1 AUTHOR
Ben Bullock, <bkb@cpan.org>
=head1 COPYRIGHT & LICENCE
This package and associated files are copyright (C) 2016-2017 Ben
Bullock.
You can use, copy, modify and redistribute this package and associated
files under the Perl Artistic Licence or the GNU General Public
Licence.