=encoding UTF-8

=head1 NAME

WWW::Wikipedia::LangTitles - get interwiki links from Wikipedia.

=head1 SYNOPSIS

    
    use utf8;
    use WWW::Wikipedia::LangTitles 'get_wiki_titles';
    my $title = 'Three-phase electric power';
    my $links = get_wiki_titles ($title);
    print "$title is '$links->{de}' in German.\n";
    my $film = '東京物語';
    my $flinks = get_wiki_titles ($film, lang => 'ja');
    print "映画「$film」はイタリア語で「$flinks->{it}」と名付けた。\n";


produces output

    Three-phase electric power is 'Dreiphasenwechselstrom' in German.
    映画「東京物語」はイタリア語で「Viaggio a Tokyo」と名付けた。


(This example is included as L<F<synopsis.pl>|https://fastapi.metacpan.org/source/BKB/WWW-Wikipedia-LangTitles-0.04/examples/synopsis.pl> in the distribution.)


=head1 VERSION

This documents version 0.04 of
WWW::Wikipedia::LangTitles corresponding to L<git commit cd5d0156c401472bc424421159fca7d3c0f769fe|https://github.com/benkasminbullock/www-wikipedia-langtitles/commit/cd5d0156c401472bc424421159fca7d3c0f769fe> released
on Thu Jul 20 13:15:53 2017 +0900.

=head1 DESCRIPTION

This module retrieves the Wikipedia interwiki link titles from the web
site wikidata.org. It can be used, for example, to translate a term in
English into other languages, or to get near equivalents.

=head1 FUNCTIONS

=head2 get_wiki_titles

    my $ref = get_wiki_titles ('Helium');

Given a word or phrase which is the title of a Wikipedia article, the
return value is a hash reference whose keys are language codes and
whose values are the titles of the equivalent Wikipedia articles in
those languages. For example, in the above case of B<Helium>,
C<< $ref->{th} >> will be equal to ฮีเลียม, the Thai title of the
Wikipedia article on helium.

The language of the original page can be specified like this:

    use utf8;
    my $from_th = get_wiki_titles ('ฮีเลียม', lang => 'th');

The title is URL-encoded using L<URI::Escape/uri_escape_utf8>, so pass
character strings, not byte strings. For example, put C<use utf8;> in
scripts which contain literal non-ASCII titles.
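
As a rough illustration of why this matters, the following sketch
(not taken from the distribution) calls C<uri_escape_utf8> directly on
the first letter of the Thai title above, once as a character string
and once as undecoded UTF-8 bytes. The variable names are made up for
the example, and L<Encode> is used here only to do the decoding.

    use warnings;
    use strict;
    use Encode 'decode_utf8';
    use URI::Escape 'uri_escape_utf8';
    # The three UTF-8 bytes of the Thai letter U+0E2E, as they might
    # arrive from a file or from @ARGV without decoding.
    my $bytes = "\xE0\xB8\xAE";
    # Decode the bytes into a one-character string.
    my $chars = decode_utf8 ($bytes);
    # Escape both forms of the same letter.
    my $good = uri_escape_utf8 ($chars);
    my $bad = uri_escape_utf8 ($bytes);
    print "characters: $good\n";
    print "bytes:      $bad\n";

The character string gives C<%E0%B8%AE>, which matches the start of
the escaped Thai title, but the byte string is encoded a second time
and gives C<%C3%A0%C2%B8%C2%AE>, so a title passed as bytes is
unlikely to match any article.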

As of version 0.04, get_wiki_titles removes the entries for
non-encyclopedia sites such as Wikiquote and Wikiversity from the
returned values.

=head2 make_wiki_url

    my $url = make_wiki_url ('helium');

Make a URL for the Wikidata page. You will then need to retrieve the
page and parse the JSON yourself. Use a second argument to specify the
language of the page:

    
    use utf8;
    use WWW::Wikipedia::LangTitles 'make_wiki_url';
    print make_wiki_url ('ฮีเลียม', 'th'), "\n";


produces output

    https://www.wikidata.org/w/api.php?action=wbgetentities&sites=thwiki&titles=%E0%B8%AE%E0%B8%B5%E0%B9%80%E0%B8%A5%E0%B8%B5%E0%B8%A2%E0%B8%A1&props=sitelinks/urls|datatype&format=json


(This example is included as L<F<thai-url.pl>|https://fastapi.metacpan.org/source/BKB/WWW-Wikipedia-LangTitles-0.04/examples/thai-url.pl> in the distribution.)


If no language is specified, the default is C<en> for English.

This function was added in version 0.02 of the module.
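
For readers who do want to retrieve and parse the page themselves,
here is a minimal sketch, which is not the module's own
implementation. It assumes that the JSON returned by Wikidata has a
top-level C<entities> object whose members each contain a C<sitelinks>
object keyed by site name, with a C<title> field in each entry. Please
check that layout against the current Wikidata API documentation. The
user agent string is a placeholder.

    use warnings;
    use strict;
    use utf8;
    use WWW::Wikipedia::LangTitles 'make_wiki_url';
    use LWP::UserAgent;
    use JSON::Parse 'parse_json';
    binmode STDOUT, ':encoding(UTF-8)';
    my $url = make_wiki_url ('Helium');
    # Placeholder user agent; identify your own script here.
    my $ua = LWP::UserAgent->new (agent => 'my-langtitles-example/0.01');
    my $response = $ua->get ($url);
    if (! $response->is_success ()) {
        die "Could not get $url: " . $response->status_line ();
    }
    # Parse the character-decoded body of the response.
    my $data = parse_json ($response->decoded_content ());
    # Assumed layout: entities -> (ID) -> sitelinks -> (site) -> title.
    for my $entity (values %{$data->{entities}}) {
        my $sitelinks = $entity->{sitelinks};
        # Skip entities without sitelinks, such as missing titles.
        next unless ref $sitelinks eq 'HASH';
        for my $site (sort keys %$sitelinks) {
            print "$site: $sitelinks->{$site}{title}\n";
        }
    }

The keys printed by this sketch are Wikidata site names such as
C<dewiki>, not the bare language codes returned by
L</get_wiki_titles>.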

=head1 SEE ALSO

=over

=item L<Locale::Codes>

This module enables one to convert the language key names given by
this module into the English-language names of the languages.

    
    use utf8;
    use FindBin '$Bin';
    use WWW::Wikipedia::LangTitles 'get_wiki_titles';
    use Locale::Codes::Language;
    my $article = 'King Kong';
    my $titles = get_wiki_titles ($article);
    for my $lang (keys %$titles) {
        my $l2c = code2language ($lang);
        if (! $l2c) {
            $l2c = $lang;
        }
        my $name = $titles->{$lang};
        if ($name ne $article) {
            print "$name in $l2c.\n";
        }
    }


produces output

    king.kong in jbo.
    קינג קונג in Hebrew.
    Кинг Конг in Bulgarian.
    キングコング in Japanese.
    كينغ كونغ in Arabic.
    Кінг-Конг in Ukrainian.
    King Kong (hahmo) in Finnish.
    金剛 (怪獸) in Chinese.
    Քինգ Քոնգ in Armenian.
    คิงคอง in Thai.
    کینگ کونگ in Persian.
    Кинг-Конг in Russian.
    킹콩 in Korean.
    კინგ კონგი in Georgian.


(This example is included as L<F<locale-codes.pl>|https://fastapi.metacpan.org/source/BKB/WWW-Wikipedia-LangTitles-0.04/examples/locale-codes.pl> in the distribution.)


=back

=head1 DEPENDENCIES

=over

=item Carp

L<Carp> is used to report errors.

=item LWP::UserAgent

L<LWP::UserAgent> is used to retrieve the data from Wikidata.

=item JSON::Parse

L<JSON::Parse> is used to parse the JSON data from Wikidata.

=item URI::Escape

L<URI::Escape> is used to make the URLs for Wikidata from the input
titles.

=back

=head1 EXPORTS

Nothing is exported by default. The export tag ':all' exports all the
functions of the module.

    use WWW::Wikipedia::LangTitles ':all';

=head1 TESTING

The default tests of the module do not attempt to connect to the
internet.  To test using an internet connection, run F<xt/scrape.t>
like this:

    prove -I lib xt/scrape.t

from the top directory of the distribution.

=head1 HISTORY

This module began as a collection of small scripts I had been using to
scrape multilingual article names related to physics from Wikipedia.
In particular, I used the scripts to add some Japanese element names
to L<Chemistry::Elements>. I made the scripts into a CPAN module
because I thought the method could be useful to other people.

Version 0.02 added L</make_wiki_url> for people who want to
retrieve and parse the output themselves.


=head1 AUTHOR

Ben Bullock, <bkb@cpan.org>

=head1 COPYRIGHT & LICENCE

This package and associated files are copyright (C) 2016-2017 Ben
Bullock.

You can use, copy, modify and redistribute this package and associated
files under the Perl Artistic Licence or the GNU General Public
Licence.




