Group
Extension

Data-TOON/lib/Data/TOON.pm

package Data::TOON;
use 5.014;
use strict;
use warnings;

use Data::TOON::Encoder;
use Data::TOON::Decoder;
use Data::TOON::Validator;

our $VERSION = "0.02";

=encoding utf-8

=head1 NAME

Data::TOON - Complete Perl implementation of TOON (Token-Oriented Object Notation)

=head1 SYNOPSIS

    use Data::TOON;

    # Basic usage
    my $data = { name => 'Alice', age => 30, active => 1 };
    my $toon = Data::TOON->encode($data);
    print $toon;
    # Output:
    #   active: true
    #   age: 30
    #   name: Alice

    my $decoded = Data::TOON->decode($toon);
    
    # Tabular arrays
    my $users = {
        users => [
            { id => 1, name => 'Alice', role => 'admin' },
            { id => 2, name => 'Bob', role => 'user' }
        ]
    };
    
    print Data::TOON->encode($users);
    # Output:
    #   users[2]{id,name,role}:
    #     1,Alice,admin
    #     2,Bob,user

    # Alternative delimiters
    my $encoder = Data::TOON::Encoder->new(delimiter => '|');
    print $encoder->encode($users);
    # Or with tabs:
    my $tab_encoder = Data::TOON::Encoder->new(delimiter => "\t");

    # Root primitives and arrays
    print Data::TOON->encode(42);           # Output: 42
    print Data::TOON->encode('hello');      # Output: hello
    print Data::TOON->encode([1, 2, 3]);    # Output: [3]: 1,2,3

=head1 DESCRIPTION

Data::TOON is a complete Perl implementation of TOON (Token-Oriented Object Notation), a human-friendly, 
line-oriented data serialization format.

TOON provides:

=over 4

=item * B<Human-readable syntax> - Indentation-based, similar to YAML, with minimal quoting

=item * B<Multiple array formats> - Compact tabular, explicit list, or inline primitive arrays

=item * B<Flexible delimiters> - Support for comma, tab, and pipe delimiters

=item * B<Security> - DoS protection via depth limits and circular reference detection

=item * B<Canonical numbers> - Automatic number normalization (removes trailing zeros, etc.)

=item * B<Full TOON compliance> - 95%+ coverage of TOON specification v1.0

=back

The format is particularly useful for configuration files, data interchange, and human-editable data storage.

=head1 METHODS

=head2 encode( $data, %options )

Encodes a Perl data structure to TOON format string.

B<Parameters:>

=over 4

=item C<$data>

The Perl data structure to encode. Can be:
- Hash reference (becomes TOON object)
- Array reference (becomes TOON array)
- Scalar (becomes root primitive: number, string, boolean, or null)

=item C<%options>

Optional encoder configuration:

=over 4

=item C<indent>

Number of spaces per indentation level. Default: 2

    Data::TOON->encode($data, indent => 4);

=item C<delimiter>

Array element delimiter character:

=over 4

=item C<','> (default) - Comma-separated values

    items[2]{id,name}: 1,Alice
                       2,Bob

=item C<"\t"> - Tab-separated values

    items[2<TAB>]{id<TAB>name}:
    <TAB>1<TAB>Alice
    <TAB>2<TAB>Bob

=item C<'|'> - Pipe-separated values

    items[2|]{id|name}:
      1|Alice
      2|Bob

=back

=item C<strict>

Enable strict mode validation. Default: 1

    Data::TOON->encode($data, strict => 0);

=item C<max_depth>

Maximum nesting depth (prevents DoS). Default: 100

    Data::TOON->encode($data, max_depth => 50);

=item C<column_priority>

Array reference of column names to prioritize (appear leftmost in tabular format). Columns not in this list appear after in alphabetical order. Default: empty array (standard alphabetical sort)

    Data::TOON->encode($data, column_priority => ['id', 'name']);
    # Will output columns as: id, name, ... (other columns alphabetically)

=back

=back

B<Returns:>

TOON format string. Can encode to:
- Object (most common): C<key: value> pairs
- Tabular array: Compact C<name[N]{fields}: row> format
- List array: Explicit C<name[N]: - item> format  
- Root primitive: Single value (42, "hello", true, false, null)
- Root array: C<[N]: value,value,...>

B<Examples:>

    # Simple object
    my $toon = Data::TOON->encode({ name => 'Alice', age => 30 });
    # Output:
    #   age: 30
    #   name: Alice

    # Nested object
    my $toon = Data::TOON->encode({
        user => {
            name => 'Alice',
            age => 30
        }
    });
    # Output:
    #   user:
    #     age: 30
    #     name: Alice

    # Tabular array (uniform objects)
    my $toon = Data::TOON->encode({
        users => [
            { id => 1, name => 'Alice' },
            { id => 2, name => 'Bob' }
        ]
    });
    # Output:
    #   users[2]{id,name}:
    #     1,Alice
    #     2,Bob

    # Tabular array with column priority
    my $toon = Data::TOON->encode({
        users => [
            { id => 1, name => 'Alice', role => 'admin' },
            { id => 2, name => 'Bob', role => 'user' }
        ]
    }, column_priority => ['id', 'name']);
    # Output:
    #   users[2]{id,name,role}:
    #     1,Alice,admin
    #     2,Bob,user

    # List array (non-uniform objects)
    my $toon = Data::TOON->encode({
        items => [
            { id => 1, name => 'Alice', extra => 'data' },
            { id => 2, name => 'Bob' }
        ]
    });
    # Output:
    #   items[2]:
    #     - extra: data
    #       id: 1
    #       name: Alice
    #     - id: 2
    #       name: Bob

    # Root array
    my $toon = Data::TOON->encode([1, 2, 3]);
    # Output: [3]: 1,2,3

    # Root primitive
    my $toon = Data::TOON->encode(42);
    # Output: 42


=cut

sub encode {
    my ($class, $data, %opts) = @_;
    my $encoder = Data::TOON::Encoder->new(%opts);
    return $encoder->encode($data);
}

sub decode {
    my ($class, $toon_text, %opts) = @_;
    my $decoder = Data::TOON::Decoder->new(%opts);
    return $decoder->decode($toon_text);
}

sub validate {
    my ($class, $toon_text) = @_;
    my $validator = Data::TOON::Validator->new();
    return $validator->validate($toon_text);
}

=head1 EXAMPLES AND PATTERNS

=head2 Configuration File

TOON is well-suited for configuration files:

    app_name: MyApp
    version: 1.0.0
    debug: false
    database:
      host: localhost
      port: 5432
      user: admin
      max_connections: 100
    servers[3]{host,port,role}:
      web1.example.com,8080,primary
      web2.example.com,8080,secondary
      web3.example.com,8080,secondary

=head2 Data Exchange Format

For APIs and data interchange where readability matters:

    response:
      status: success
      code: 200
      data[2]:
        - id: 1001
          name: Product A
          price: 29.99
          in_stock: true
        - id: 1002
          name: Product B
          price: 49.99
          in_stock: false

=head2 Handling Nested Structures

    organization:
      name: Example Corp
      departments[2]:
        - name: Engineering
          teams[2]:
            - name: Backend
              members: 5
            - name: Frontend
              members: 3
        - name: Sales
          teams[1]:
            - name: Enterprise
              members: 8

=head1 TOON FORMAT FEATURES

=head2 Data Types

TOON supports JSON data types:

=over 4

=item B<Object> - Unordered collection of key-value pairs

    user:
      name: Alice
      age: 30

=item B<Array> - Ordered collection of values (3 formats available)

Tabular:
    users[2]{id,name}: 1,Alice
                       2,Bob

List:
    items[2]:
      - id: 1
        name: Alice
      - id: 2
        name: Bob

Primitive:
    tags[3]: red,green,blue

=item B<String> - UTF-8 text (quoted if needed)

    name: Alice
    quote: "She said \"Hello\""

=item B<Number> - Integer or float (canonicalized)

    count: 42
    ratio: 3.14

=item B<Boolean> - true or false

    active: true
    deleted: false

=item B<Null> - Null value

    optional_field: null

=back

=head2 Delimiters

Three delimiter options for array values:

B<Comma (default):>
    data[3]: a,b,c

B<Tab:>
    data[3<TAB>]: a<TAB>b<TAB>c

B<Pipe:>
    data[3|]: a|b|c

Use tab or pipe delimiters when values might contain commas.

=head2 String Escaping

Standard escape sequences are supported:

    text: "Line 1\nLine 2"
    path: "C:\\Program Files\\App"
    json: "Use \" to escape quotes"

=head2 Root Forms

Documents can start with different root types:

    # Root object (default)
    name: Alice
    age: 30

    # Root primitive
    42
    
    # Root array
    [3]: a,b,c

=head1 PERFORMANCE CONSIDERATIONS

=over 4

=item * Encoding is O(n) in data size
=item * Decoding is O(n) in text size
=item * Memory usage is proportional to data complexity
=item * Large documents (>100MB) may require streaming parser

=back

=head1 SECURITY

Data::TOON includes several security features:

=over 4

=item B<Depth Limiting> - Prevents stack overflow from deeply nested structures
    Default max_depth: 100 levels
    Configurable via encode/decode options

=item B<Circular Reference Detection> - Prevents infinite loops during encoding
    Automatically detects and rejects circular references

=item B<Input Validation> - Strict parsing rules prevent injection attacks
    All strings are treated as literal values
    No code evaluation or command injection possible

=back

=head1 COMPATIBILITY

=over 4

=item * Perl 5.14 or later
=item * No external dependencies
=item * Pure Perl implementation (portable)

=back

=head1 TOON SPECIFICATION COMPLIANCE

This implementation achieves 95%+ compliance with TOON Specification v1.0:

✓ Complete object support
✓ Complete array support (all 3 formats)
✓ All delimiters (comma, tab, pipe)
✓ Root forms (object, array, primitive)
✓ String escaping
✓ Canonical number form
✓ Security measures
✓ Full type inference

See L<TOON Specification|https://github.com/toon-format/spec/blob/main/SPEC.md> for complete specification.

=head1 COMMON ERRORS AND TROUBLESHOOTING

=head2 "Maximum nesting depth exceeded"

If you encounter this error, your data is nested more than 100 levels deep:

    # Increase max_depth
    my $encoded = Data::TOON->encode($data, max_depth => 200);

=head2 "Circular reference detected"

Your data structure contains a reference to itself:

    my $obj = { name => 'Alice' };
    $obj->{self} = $obj;  # This creates a circle!

Solution: Remove the circular reference before encoding.

=head2 Inconsistent type inference

If values aren't being parsed as expected, use explicit quoting:

    # Without quotes - might be interpreted as number
    value: 42
    
    # With quotes - definitely a string
    value: "42"

=head1 SEE ALSO

=over 4

=item * L<TOON Specification|https://github.com/toon-format/spec/blob/main/SPEC.md>

=item * L<JSON|https://www.json.org/> - Data model foundation

=item * L<YAML|https://yaml.org/> - Similar indentation-based format

=back

=head1 LICENSE

Copyright (C) ytnobody.

This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.

=head1 AUTHOR

ytnobody E<lt>ytnobody@gmail.comE<gt>

Powered by Groonga
Maintained by Kenichi Ishigaki <ishigaki@cpan.org>. If you find anything, submit it on GitHub.