Group
Extension

Wiki-JSON/lib/Wiki/JSON.pm

package Wiki::JSON;

use v5.16.3;

use strict;
use warnings;

use Moo;
use Data::Dumper;
use Const::Fast;
use Wiki::JSON::Parser;
use Wiki::JSON::HTML;

our $VERSION = "0.0.31";

const my $MAX_HX_SIZE                                           => 6;
const my $EXTRA_CHARACTERS_BOLD_AND_ITALIC_WHEN_ITALIC          => 3;
const my $LIST_ELEMENT_INTERRUPT_NUMBER_OF_CHARACTERS_TO_IGNORE => 3;
const my $MINIMUM_LINK_SEARCH                                   => 3;
const my $MINIMUM_TEMPLATE_SEARCH                               => 3;
const my $LIST_ELEMENT_DELIMITER                                => "\n* ";

sub parse {
    if (@_ < 2) {
        die 'Parse arguments: <$self> <$wiki_text> [$options]';
    }
    my ( $self, $wiki_text, $options ) = @_;
    $options //= {};
    return Wiki::JSON::Parser->new->parse($wiki_text, $options);
}

sub pre_html {
    my ($self, $wiki_text, $template_callbacks) = @_;
    $template_callbacks //= {};
    $template_callbacks->{is_inline} //= sub { return 1; };
    $template_callbacks->{generate_elements} //= sub {};
    return Wiki::JSON::HTML->new->pre_html_json($wiki_text, $template_callbacks);
}
1;

=encoding utf8

=head1 NAME

Wiki::JSON - Parse wiki-like articles to a data-structure transformable to JSON.

=head1 SYNOPSIS

    use Wiki::JSON;

    my $structure = Wiki::JSON->new->parse(<<'EOF');
    = This is a wiki title =
    '''This is bold'''
    ''This is italic''
    '''''This is bold and italic'''''
    == This is a smaller title, the user can use no more than 6 equal signs ==
    <nowiki>''This is printed without expanding the special characters</nowiki>
    * This
    * Is
    * A
    * Bullet
    * Point
    * List
    {{foo|Templates are generated|with their arguments}}
    {{stub|This is under heavy development}}
    The parser has some quirks == This will generate a title ==
    ''' == '' Unterminated syntaxes will still be parsed until the end of file
    This is a link to a wiki article: [[Cool Article]]
    This is a link to a wiki article with an alias: [[Cool Article|cool article]]
    This is a link to a URL with an alias: [[https://example.com/cool-source.html|cool article]]
    This is a link to a Image [[File:https:/example.com/img.png|50x50px|frame|This is a caption]]
    EOF

=head1 DESCRIPTION

A parser for a subset of a mediawiki-like syntax, quirks include some
supposedly inline elements are parsed multi-line like headers, templates*,
italic and bolds.

Lists are only one level and not everything in mediawiki is supported by the
moment.

=head2 INSTALLING

    cpanm https://github.com/sergiotarxz/Perl-Wiki-JSON.git

=head2 USING AS A COMMAND

    wiki2json file.wiki > output.json

=head1 INSTANCE METHODS

=head2 new

    my $wiki_parser = Wiki::JSON->new;

=head1 SUBROUTINES/METHODS

=head2 parse

    my $structure = $wiki_parser->parse($wiki_string);

Parses the wiki format into a serializable to JSON or YAML Perl data structure.

=head2 pre_html

    my $template_callbacks = {
        generate_elements => sub {
            my ( $template, $options, $parse_sub, $open_html_element_sub,
                $close_html_element_sub )
              = @_;
            my @dom;
            if ( $element->{template_name} eq 'stub' ) {
                push @dom,
                  $open_html_element_sub->( 'span', 0, { style => 'color: red;' } );
                push @dom, @{ $element->{output} };
                push @dom, $close_html_element_sub->('span');
            }
            return \@dom;
        },
        is_inline => sub {
            my ($template) = @_;
            if ($template->{template_name} eq 'stub') {
                return 1;
            }
            if ($template->{template_name} eq 'videoplayer') {
                return 0;
            }
        }
    };

    my $structure = $wiki_parser->pre_html($wiki_string, $template_callbacks);

Retrieves an ArrayRef containing just HashRefs without nesting describing how HTML tags should be open and closed for a wiki text.

=head3 template_callbacks

An optional hashref containing any or both of this two keys pointing to subrefs

=head1 RETURN FROM METHODS

=head2 parse

The return is an ArrayRef in which each element is either a string or a HashRef.

HashRefs can be classified by the key type which can be one of these:

=head3 hx

A header to be printed as h1..h6 in HTML, has the following fields:

=over 4

=item hx_level

A number from 1 to 6 defining the header level.

=item output

An ArrayRef defined by the return from parse.

=back

=head3 template

A template thought for developer defined expansions of how some data shoudl be represented.

=over 4

=item template_name

The name of the template.

=item output

An ArrayRef defined by the return from parse.

=back

=head3 bold

A set of elements that must be represented as bold text.

=over 4

=item output

An ArrayRef defined by the return from parse.

=back

=head3 italic

A set of elements that must be represented as italic text.

=over 4

=item output

An ArrayRef defined by the return from parse.

=back

=head3 bold_and_italic

A set of elements that must be represented as bold and italic text.

=over 4

=item output

An ArrayRef defined by the return from parse.

=back

=head3 unordered_list

A bullet point list.

=over 4

=item output

A ArrayRef of HashRefs from the type list_element.

=back

=head3 list_element

An element in a list, this element must not appear outside of the output element of a list.

=over 4

=item output

An ArrayRef defined by the return from parse.

=back

=head3 link

An URL or a link to other Wiki Article.

=over 4

=item link

The String containing the URL or link to other Wiki Article.

=item title

The text that should be used while showing this URL to point the user where it is going to be directed.

=back

=head3 image

An Image, PDF, or Video.

=over 4

=item link

Where to find the File.

=item caption

What to show the user if the image is requested to explain to the user what he is seeing.

=item options

=back

Undocumented by the moment.

=head1 DEPENDENCIES

The module will pull all the dependencies it needs on install, the minimum supported Perl is v5.16.3, although latest versions are mostly tested for 5.38.2

=head1 CONFIGURATION AND ENVIRONMENT

If your OS Perl is too old perlbrew can be used instead.

=head1 BUGS AND LIMITATIONS

The author thinks it is possible the parser hanging forever, use it in
a subprocess the program can kill if it takes too long.

The developer can use fork, waitpid, pipe, and non-blocking IO for that.

=head1 DIAGNOSTICS

If a string halting forever this module is found, send it to me in the Github issue tracker.

=head1 LICENSE AND COPYRIGHT

    Copyright ©Sergiotarxz (2025)

Licensed under the The GNU General Public License, Version 3, June 2007 L<http://www.gnu.org/licenses/gpl-3.0.txt>.

You can use this software under the terms of the GPLv3 license or a new later
version provided by the FSF or the GNU project.

=head1 INCOMPATIBILITIES

None known.

=head1 VERSION

0.0.x

=head1 AUTHOR

Sergio Iglesias

=head1 SEE ALSO

Look what is supported and how in the tests: L<https://github.com/sergiotarxz/Perl-Wiki-JSON/tree/main/t>

=cut


Powered by Groonga
Maintained by Kenichi Ishigaki <ishigaki@cpan.org>. If you find anything, submit it on GitHub.