=head1 NAME 

Data::FastPack - FastPack Record format, Parsing and Serialising

=head1 DESCRIPTION 

Implements an incremental FastPack encoder and decoder for arbitrary reliable
streams of data, with arbitrarily named data channels.

This document also defines the FastPack format.


=head1 CONCEPTS 

The message format is primarily intended to store a sequence of time indexed
values which are to be parameterized against another channel.  For example,
indexing a sensor to position, where both the sensor and the position are
sampled separately but can be stored with the same time base.


Its small footprint and easy encoding/decoding also make it a great format for
IPC.

It optionally supports name aliasing for messages (highly recommended).  This
saves bytes in storage or transmission by dynamically assigning an integer
value to a message ID during encoding. The management of mappings is automatic,
and the mappings are inserted into the stream of encoded messages. This makes
it seamless for the decoder to reconstruct the named message.

There is only one reserved channel id, 0, which is a meta data channel for the
application level.  This channel's payload is JSON or MessagePack encoded.  If
no payload is present, the message is a namespace reset, which removes the
named id mappings so new ones can be generated from future data.

The meta data semantics are application dependent, giving great flexibility in
time base, channel relationships, etc.


=over 

=item * Efficient use of memory access for ARM cpus

Multi byte data types are stored in little endian order, unless otherwise
specified in the meta data. Payloads also start on an 8 byte boundary, which
allows direct access to double precision floats.


=item * Messages stored in one or more files

Multiple streams of messages can be stored in a single file if they share a
time base. Streams that have differing time bases are stored in separate
files. This gives good compression ability.

=item * External definition file(s) for message types if required.

The definitions of a file can be pointed to externally, or can be stored
internally in a meta data message.

=item * Highly compressible and suitable for self contained web applications

The message time, id and length fields will be mostly unchanging when multiple
message sources with the same time base are recorded together. The 24 byte
header will basically be reduced to 1 or 2 bytes after compression for most
messages.

=back



=head2 Timing

Timing data is a double float field and can represent many different timing
scenarios.
	
=over

=item * Direct time (seconds)

The simplest case is storing seconds as floating point values in the field.
Whether the value is a difference to the previous message or an absolute value
is based on definition messages for the file.

=item * Multiple of a time base

Similar to direct time above, however the value is multiplied by an external
time base factor to generate the actual time.

=item * Argument/index into a timing function.

The value is used as an index into a timing function stored in external
JavaScript, which when called calculates a time, e.g. for processing video with
fixed non integer frame rates.

=back

For a system reporting only a single message, the time field will constantly be
updated for each message. However with multiple channels only the first message
from the group will have a non zero time when using difference mode.  Most
messages in a system will have a 0 time because of this. 
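
As an illustration of difference mode, the following sketch rebuilds absolute
times from already decoded C<[time, id, payload]> entries. The helper
C<absolute_times> is hypothetical and not part of Data::FastPack. It assumes
every time value is a plain delta in seconds from the previous message in the
same stream.

  use strict;
  use warnings;

  # Hypothetical helper (not part of Data::FastPack): turn difference mode
  # times into absolute times. Assumes each time is a delta from the
  # previous message in the same stream.
  sub absolute_times {
    my ($messages, $start) = @_;
    my $now = $start // 0;
    my @out;
    for my $msg (@$messages) {
      my ($delta, $id, $payload) = @$msg;
      $now += $delta;
      push @out, [$now, $id, $payload];
    }
    return \@out;
  }

  # Two channels sampled together: only the first message of each group
  # carries a non zero delta.
  my $decoded = [
    [0.5, 1, "a"], [0, 2, "b"],
    [0.5, 1, "c"], [0, 2, "d"],
  ];
  my $abs = absolute_times($decoded, 10);
  printf "%.1f %d %s\n", @$_ for @$abs;

Both messages of a group end up with the same absolute time, which is why
most stored time fields can stay at 0.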

=head2 Padding

Padding to an 8 byte boundary is implicit to every message.  Arbitrary bytes
can be appended to ensure the alignment.


=head2 Payload Length

A 32 bit field indicating the length of the data in the payload. It is stored
just before the payload to allow more efficient decoding in scripting
languages.

=head2 Payload

The payload of the message. It is 8 byte aligned for better memory access (i.e.
a double can be extracted directly out of the payload field).


=head1 META AND STRUCTURED DATA

All messages with an id of 0 are designated "Meta Data". This means the
payload is encoded either as a JSON array or object, or as a MessagePack
structure.


This gives fast encoding/decoding of simple time series values and the
ability to have arbitrarily complex data when required.

Decoding of meta data automatically picks MessagePack or JSON as required,
as long as the encoded values are map or array types.

It is recommended that general structured data in other message ids is also
encoded as either JSON or MessagePack.
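
The restriction to map and array types is what makes automatic selection
possible: the first byte of such a payload is unambiguous. The sketch below
shows one way this could be done. It is an assumption about the technique, not
the module's actual implementation, and C<guess_meta_encoding> is a
hypothetical name.

  use strict;
  use warnings;

  # Hypothetical helper: guess the encoding of a meta data payload from its
  # first significant byte. Not the module's actual implementation.
  sub guess_meta_encoding {
    my ($payload) = @_;
    return "empty" unless length $payload;

    # JSON text may have leading whitespace before '{' or '['
    return "json" if $payload =~ /\A[ \t\r\n]*[\[{]/;

    my $first = ord substr($payload, 0, 1);
    # MessagePack fixmap (0x80-0x8f) and fixarray (0x90-0x9f)
    return "msgpack" if $first >= 0x80 and $first <= 0x9f;
    # MessagePack array16, array32, map16 and map32 (0xdc-0xdf)
    return "msgpack" if $first >= 0xdc and $first <= 0xdf;

    return "unknown";
  }

  print guess_meta_encoding('{"rate":100}'), "\n";    # json
  print guess_meta_encoding("\x81\xa1a\x01"), "\n";   # msgpack, { a => 1 }
  print guess_meta_encoding(""), "\n";                # empty

In MessagePack the bytes for C<{>, C<[> and JSON whitespace are all positive
fixints, so a payload that is a map or array can never be mistaken for JSON.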


=head1 NAMED IDs

Static IDs are simple for small closed systems. However, for larger systems,
using dynamic ids allows scoping.  The mapping between names and ids is
automatically applied when using a name space.

Encoding a named id for the first time will insert an entry in the local
encoding side table. It will also inject a message with the dynamic id and a
payload of the name into the stream.  This will be the first time this id is
seen, and is a marker to update the decoding end of the system.

Messages with id "0" are never subject to named id lookup.

A message with id 0 and no payload de-registers all names (clears the name
space).

For a bidirectional link, as an example, 4 name spaces are actually used. The
encoding end of each direction is the master in that direction. The ids likely
don't match, but they don't need to. The decoding end of each direction is a
shadow/slave copy of the encoding table. Over a reliable transport, these
tables will always be in sync with the master from the encoding end.
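
The master and shadow tables can be modelled in a few lines of plain Perl.
This is a sketch of the idea only: the hash layout, the helper names
C<encode_named> and C<decode_named>, the allocation of ids starting at 1 and
the time value given to the mapping message are all assumptions, and do not
reflect the structure returned by C<create_namespace>.

  use strict;
  use warnings;

  # Encoder side (master): name => id. Ids start at 1 because 0 is reserved.
  my %enc = (next => 1, ids => {});
  # Decoder side (shadow): id => name, rebuilt from the stream itself.
  my %dec = (names => {});

  # Returns the [time, id, payload] messages to put on the wire
  sub encode_named {
    my ($ns, $time, $name, $payload) = @_;
    my @msgs;
    my $id = $ns->{ids}{$name};
    unless (defined $id) {
      $id = $ns->{ids}{$name} = $ns->{next}++;
      # First use of the name: announce the mapping, payload is the name
      push @msgs, [$time, $id, $name];
    }
    push @msgs, [$time, $id, $payload];
    return @msgs;
  }

  # Returns the message with its name restored, or nothing if consumed
  sub decode_named {
    my ($ns, $msg) = @_;
    my ($time, $id, $payload) = @$msg;
    if ($id == 0) {
      return $msg if length $payload;   # meta data is never looked up
      %{ $ns->{names} } = ();           # empty payload: name space reset
      return;
    }
    unless (exists $ns->{names}{$id}) {
      $ns->{names}{$id} = $payload;     # mapping message, consumed here
      return;
    }
    return [$time, $ns->{names}{$id}, $payload];
  }

  my @wire = map { encode_named(\%enc, @$_) }
    [1.0, "sensor/temp", "21.5"], [1.1, "sensor/temp", "21.7"];
  my @out = grep { defined $_ } map { decode_named(\%dec, $_) } @wire;
  print scalar(@wire), " on the wire, ", scalar(@out), " delivered\n";

The first use of a name costs one extra message. Every later message carries
only the integer id.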

=head1 FASTPACK FORMAT SUMMARY

  time(double float)     # 8 byte aligned
  id(32)                 # uint32
  len(32)                # uint32
  payload (bytes)        # 8 byte aligned
  padding (as required)

The FastPack format consists of a single type of time indexed binary record.
The semantics of the message are application specific, and are bound to the ID
field in every message. Finally, the message payload is opaque data, with the
entire message padded to a multiple of 8 bytes.


All multi byte fields are in little endian byte order.

  

C<time> is the absolute time of the sample or the difference in time from the
current message to the previous message in the same stream. The exact meaning
of the time is as per the definition messages. It is a double float to allow
web browsers to utilise high resolution time, as they do not support 64bit
integers.


C<id> is the channel id within the file/stream. It relates to a definition
file. 0 indicates a meta data point which is JSON or other structured data,
which alters the processing of the file.

Optionally a namespace can be used to provide dynamic ID generation. The ids
are mapped from a name/label/topic to an integer. Additional mapping messages
are inserted into the stream to facilitate this mapping at both ends.

C<len> is the length of the payload. A payload too large for the 32 bit length
field must be fragmented at the 'application level'.


C<payload> is the data.

C<padding> Every record is padded to an 8 byte boundary, with nulls, if necessary.
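
The layout can be reproduced with Perl's built in C<pack> and C<unpack>. The
sketch below is for illustration only and is not the module's encoder. The
helpers C<pack_record> and C<unpack_record> are hypothetical, and the code
assumes the header is exactly the three fields listed above (8 + 4 + 4 = 16
bytes) and that C<len> counts bytes.

  use strict;
  use warnings;

  # Hypothetical helpers illustrating the record layout.
  # Assumes a 16 byte header: double time, uint32 id, uint32 len.
  sub pack_record {
    my ($time, $id, $payload) = @_;
    my $len = length $payload;
    my $pad = (-$len) % 8;     # bytes needed to reach an 8 byte boundary
    return pack("d< L< L<", $time, $id, $len) . $payload . ("\0" x $pad);
  }

  # Returns (time, id, payload, bytes consumed), or nothing if incomplete
  sub unpack_record {
    my ($buf) = @_;
    return if length($buf) < 16;                  # incomplete header
    my ($time, $id, $len) = unpack("d< L< L<", $buf);
    my $total = 16 + $len + ((-$len) % 8);
    return if length($buf) < $total;              # incomplete record
    return ($time, $id, substr($buf, 16, $len), $total);
  }

  my $rec = pack_record(1.5, 7, "hello");
  my ($time, $id, $payload, $used) = unpack_record($rec);
  print length($rec), " bytes: time=$time id=$id payload=$payload\n";

A 5 byte payload gives a 24 byte record: 16 bytes of header, 5 of payload and
3 of null padding. Returning the number of bytes consumed lets a caller remove
the record from the front of a stream buffer and try again once more data
arrives.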

=head1 API Usage

=head2 Encoding and Decoding

=head3 encode_fastpack

  encode_fastpack $buf, $inputs, $limit, $namespace


Encodes an array ref of array refs C<[time, ID, payload]> into the buffer
C<$buf>.  C<$buf> is aliased internally, and not copied. All newly encoded
messages are appended to the buffer.

If C<$limit> is supplied, and is less than the length of the C<$inputs>
array, only this number of inputs will be consumed.  Inputs consumed are
spliced out of the input array to allow the same array to be appended to
externally.

If C<$namespace> is provided, all messages are encoded assuming the id is a
name, and an ID is dynamically allocated for it in the name space.  Note
C<$limit> must be provided (even if undef) to use this argument.

A namespace is prevented from using C<0>, C<"0">, or C<undef> as names.
Messages with these names issue a warning.

If a C<$namespace> is utilised, the ID/name of the message can be any string.
It is mapped to an integer code. If the message is the first one to use the
ID, an extra message mapping the integer to the name (id becomes the integer
value, payload becomes the name) is inserted into the output.

If the message has an ID of "0" and a payload with no length, this resets the
mappings in the namespace and the message is output as normal.


=head3 decode_fastpack

  decode_fastpack $buf, $output, $limit, $namespace


Decodes FastPack messages from C<$buf>, consuming it as it progresses. C<$buf>
is aliased so new messages can be added to it externally.

C<$output> is a reference to an array to store the decoded messages. The
messages are decoded into C<[time, id, payload]>.

If C<$limit> is provided, this is the maximum number of messages to decode
during a single call. If not provided or undef, 4096 is the default.


C<$namespace>, if provided, enables named ids. Decoded messages have the id
mapped to a name stored in the name space. If the id has not been
encountered before, the payload is used as the name to update the internal
mapping. This message is not sent to the output, as it is intended to update
the mapping only.

If the message has an ID of "0" and a payload with no length, this resets the
mappings in the namespace and the message is not forwarded past the decoder.

=head2 Namespaces

=head3 create_namespace

  create_namespace

Returns a name space structure for named ids. Separate name space structures
are needed for the encoding and decoding ends, even within the same program.


=head3 id_for_name

  id_for_name $namespace, $name

Returns the integer id in C<$namespace> for C<$name>. Useful for testing and
optimisation when many messages with the same name are being encoded.

=head3 name_for_id

  name_for_id $namespace, $id

Returns the name in C<$namespace> for C<$id>. Useful for testing and
optimisation when many messages with the same name are being encoded.

=head1 AUTHOR

Ruben Westerberg, E<lt>drclaw@mac.comE<gt>

=head1 REPOSITORY and BUGS

Please report any bugs via GitHub: L<https://github.com/drclaw1394/perl-data-fastpack>


=head1 COPYRIGHT AND LICENSE

Copyright (C) 2025 by Ruben Westerberg

Licensed under MIT

=head1 DISCLAIMER OF WARRANTIES

THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES,
INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS FOR A PARTICULAR PURPOSE.


=cut


