=encoding utf8
=head1 NAME
Couch::DB - CouchDB database client
=head1 INHERITANCE
Couch::DB is extended by
Couch::DB::Mojolicious
=head1 SYNOPSIS
  use Couch::DB::Mojolicious ();
  my $couch   = Couch::DB::Mojolicious->new(api => '3.3.3');
  my $db      = $couch->db('my-db');      # Couch::DB::Database object
  my $cluster = $couch->cluster;          # Couch::DB::Cluster object
  my $client  = $couch->createClient(...);  # Couch::DB::Client
=head1 DESCRIPTION
When this module was written, there were already a large number of
CouchDB implementations on CPAN. Still, there was a need for one more.
This implementation provides a B<thick interface>: a far higher level
of abstraction than the other modules. This should make your work much,
much easier.
Also, open F<https://perl.overmeer.net/couch-db/reference.html>
in a browser window as a useful cross-reference: the parameters for CouchDB
are not documented in this Perl documentation! For those, you need
to visit F<https://docs.couchdb.org/en/stable/>.
B<Please read> the L</DETAILS> section, further down, at least once
before you start!
=head1 METHODS
=head2 Constructors
=over 4
=item Couch::DB-E<gt>B<new>(%options)
Create a relation with a CouchDB server (cluster). You should use
totally separate L<Couch::DB|Couch::DB>-objects for totally separate database
clusters. B<Note:> you can only instantiate extensions of this class.
When you do not specify a C<server>-url, but have an environment variable
C<PERL_COUCH_DB_SERVER>, then server url, username, and password are
derived from it.
 -Option  --Default
  api       <required>
  auth      'BASIC'
  password  undef
  server    "http://127.0.0.1:5984"
  to_json   +{ }
  to_perl   +{ }
  to_query  +{ }
  username  undef
=over 2
=item api => $version
You MUST specify the version of the server you expect to answer your
queries. L<Couch::DB|Couch::DB> tries to hide differences between your expectations
and the reality of the server version.
The $version can be a string or a version object (see "man version").
=item auth => 'BASIC'|'COOKIE'
Authentication method to be used by default for each client.
=item password => STRING
=item server => URL
The default server to connect to, by URL. See C<< etc/local.ini[chttpd] >>.
This server will be named C<_local>.
You can add more servers using L<addClient()|Couch::DB/"Processing">. In that case, you probably
do not want this default client to be created as well. To achieve this,
explicitly set C<server =E<gt> undef>.
=item to_json => HASH
A table mapping converter name to CODE, to override/add the default PERL to JSON
object conversions for sending structures. See L<toJSON()|Couch::DB/"Processing">.
=item to_perl => HASH
A table mapping converter name to CODE, to override/add the default JSON to PERL
object conversions for L<Couch::DB::Result::values()|Couch::DB::Result/"When the document is collected">. See L<toPerl()|Couch::DB/"Processing"> and L<listToPerl()|Couch::DB/"Processing">.
=item to_query => HASH
A table mapping converter name to CODE, to override/add the default PERL to URL
QUERY conversions. Defaults to the json converters. See L<toQuery()|Couch::DB/"Processing">.
=item username => STRING
When a C<username> is given, it will be used together with C<auth> and
C<password> to login to any created client.
=back
=back
=head2 Accessors
=over 4
=item $obj-E<gt>B<api>()
Returns the interface version you expect the server to run, as a version
object. Differences between reality and expectations are mostly
resolved automatically.
=back
=head2 Interface starting points
=over 4
=item $obj-E<gt>B<cluster>()
Returns a L<Couch::DB::Cluster|Couch::DB::Cluster>-object, which organizes calls to
manipulate replication, sharding, and related jobs. This will always
return the same object.
=item $obj-E<gt>B<createClient>(%options)
Create a client object which handles a server. All options are passed
to L<Couch::DB::Client|Couch::DB::Client>. The C<couch> parameter is added for you.
The client will also be added via L<addClient()|Couch::DB/"Processing">, and is then returned.
It may be useful to create two clients to the same server: one with admin
rights, and one without. Or clients to different nodes, to create
fail-over.
=item $obj-E<gt>B<db>($name, %options)
Declare a database. The database may not exist yet: calling this
method does nothing with the CouchDB server.
  my $db = $couch->db('authors');
  $db->ping or $db->create(...);
=item $obj-E<gt>B<node>($name)
Returns a L<Couch::DB::Node|Couch::DB::Node>-object with the $name. If the object does not
exist yet, it gets created, otherwise reused.
=back
=head2 Ungrouped calls
=over 4
=item $obj-E<gt>B<freshUUID>(%options)
Returns one unique identifier.
=item $obj-E<gt>B<freshUUIDs>($count, %options)
Returns a $count number of UUIDs in a LIST. This uses L<requestUUIDs()|Couch::DB/"Ungrouped calls">
to get a bunch at the same time, for efficiency. You may get fewer than
you want, but only when the server is not sending them.
 -Option--Default
  bulk    50
=over 2
=item bulk => INTEGER
When there are not enough UUIDs in stock, in how large chunks should we ask for
more?
=back
=item $obj-E<gt>B<requestUUIDs>($count, %options)
[CouchDB API "GET /_uuids", since 2.0]
Returns UUIDs (Universally unique identifiers), when the call was
successful. Better use L<freshUUIDs()|Couch::DB/"Ungrouped calls">. It is faster to use Perl
modules to generate UUIDs.
=item $obj-E<gt>B<searchAnalyze>(%options)
[CouchDB API "POST /_search_analyze", since 3.0, UNTESTED]
Check what the built-in Lucene tokenizer(s) will do with your text.
 -Option  --Default
  analyzer  <required>
  text      <required>
=over 2
=item analyzer => KIND
=item text => STRING
=back
=back
=head2 Processing
The methods in this section implement the CouchDB API. You should
usually not need to use these yourself, as this library abstracts them.
=over 4
=item $obj-E<gt>B<addClient>($client)
Add a L<Couch::DB::Client|Couch::DB::Client>-object to be used to contact the CouchDB
cluster. Returned is the couch object, so these calls are stackable.
=item $obj-E<gt>B<call>($method, $path, %options)
Call some CouchDB server, to get work done. This is the base for any
interaction with the server.
Besides the explicitly listed parameters, this C<call> method is also responsible
for handling the generic parameters which influence the connection to the server
(like C<delay>, C<client>, and C<headers>) and hook into events (like C<on_final>
and C<on_error>).
B<Note:> you should probably not use this method yourself: all endpoints of
CouchDB are available via a nice, abstract wrapper.
 -Option--Default
  paging  {}
  query   undef
  send    undef
=over 2
=item paging => HASH
When the endpoint supports paging, then its needed configuration
data has been collected in here. This enables the use of C<succeed>,
C<page>, C<skip>, and friends. See examples in section L</Paging>.
=item query => HASH
Query parameters for the request.
=item send => HASH
The content to be sent with POST and PUT methods. Pass it
in those cases, even when there is nothing to pass on, simply to be
explicit about that.
=back
=item $obj-E<gt>B<check>($condition, $change, $version, $what)
If the $condition is true (usually the existence of some parameter), then
check whether API limitations apply.
Parameter $change is either C<removed>, C<introduced>, or C<deprecated> (as
strings). The $version is taken from the CouchDB API documentation.
The $what describes the element, to be used in error or warning messages.
=item $obj-E<gt>B<client>($name)
Returns the client with the specific $name (which defaults to the server's url).
=item $obj-E<gt>B<clients>(%options)
Returns a LIST with the defined clients; L<Couch::DB::Client|Couch::DB::Client>-objects.
 -Option--Default
  role    undef
=over 2
=item role => $role
When defined, only return clients which are able to fulfill the
specific $role.
=back
=item $obj-E<gt>B<jsonText>($json, %options)
Convert the (complex) $json structure into serialized JSON. By default, it
is beautified.
 -Option --Default
  compact  false
=over 2
=item compact => BOOLEAN
Produce compact (no white-space) JSON.
=back
=item $obj-E<gt>B<listToPerl>($set, $type, @data|\@data)
Returns a LIST from all elements in the LIST @data or the ARRAY, each
converted from JSON to pure Perl according to rule $type.
=item $obj-E<gt>B<toJSON>(\%data, $type, @keys)
Convert the named fields in the %data into a JSON compatible format.
Fields which do not exist are left alone.
=item $obj-E<gt>B<toPerl>(\%data, $type, @keys)
Convert all fields with @keys in the %data HASH into objects
of $type. Fields which do not exist are ignored.
The following default JSON to Perl translations are currently defined:
C<abs_uri>, C<epoch>, C<isotime>, C<mailtime>, C<version>, and
C<node>.
=item $obj-E<gt>B<toQuery>(\%data, $type, @keys)
Convert the named fields in the %data HASH into a Query compatible
format. Fields which do not exist are left alone.
=back
=head1 DETAILS
=head2 Early adopters
B<Be warned> that this module is really new. The 127 different endpoints
that the CouchDB 3.3.3 API defines, are grouped and combined. The result
is often not tested, and certainly not combat ready. Please, report
the result of calls which are currently flagged "UNTESTED".
B<Please help> me fix issues by reporting them. Bugs will be solved within
a day. Please, contribute ideas to make the use of the module lighter.
Together, we can make the quality grow fast.
=head2 Integration with your framework
You need to instantiate an extension of this class. At the moment,
you can pick from:
=over 4
=item * L<Couch::DB::Mojolicious|Couch::DB::Mojolicious>
Implements the client using the Mojolicious framework, using Mojo::URL,
Mojo::UserAgent, Mojo::IOLoop, and many others.
=back
Other extensions will hopefully be added in the future, preferably as part
of this release so they get maintained together. The extensions are not
too difficult to create and certainly quite small.
=head2 Where can I find what?
The CouchDB API lists all endpoints as URLs. This library, however,
creates an Object Oriented interface around these calls: you do not
see the internals in the resulting code. Knowing the CouchDB API,
it is usually immediately clear where to find a certain end-point:
C<< /{db} >> will be in L<Couch::DB::Database|Couch::DB::Database>. A major exception is
anything that has to do with replication and sharding: this is bundled
in L<Couch::DB::Cluster|Couch::DB::Cluster>.
Have a look at F<https://perl.overmeer.net/couch-db/reference.html>.
Keep that page open in your browser while developing.
=head2 Thick interface
The CouchDB client interface is based on HTTP. It is really easy to
construct JSON, and then use a UserAgent to send it to the CouchDB
server. All other CPAN modules which support CouchDB stick to this
level of support; except C<Couch::DB>.
When your library is very low-level, your program needs to put effort into
creating an abstraction around the interface itself. When the library
offers that abstraction already, you need to write much less code!
The Perl programming language works with functions, methods, and
objects, so why would your library require you to play with URLs?
So, C<Couch::DB> has the following extra features:
=over 4
=item *
Calls have a functional name, and are grouped into classes: the
endpoint URL processing is totally abstracted away;
=item *
Define multiple clients at the same time, for automatic fail-over,
read, write, and permission separation, or parallelism;
=item *
Resolving differences between CouchDB-server versions. You may
even run different CouchDB versions on your nodes;
=item *
JSON-types do not match Perl's type concept: this module will
convert boolean and integer parameters (and more) from Perl to
JSON and back transparently;
=item *
Offer error handling and event processing on each call;
=item *
Event framework independent (currently only a Mojolicious connector).
=back
=head2 Using the CouchDB API
All methods which are marked with C<< [CouchDB API] >> are, as the name
says: client calls to some CouchDB server. Often, this connects to a node
on your local server, but you can also connect to other servers and even
multiple servers.
All these API methods return a L<Couch::DB::Result|Couch::DB::Result> object, which can tell
you how the call worked, and the results. The resulting object overloads
boolean context to produce C<false> in case of an error. So typically:
  my $couch  = Couch::DB::Mojolicious->new(api => '3.3.3');
  my $result = $couch->requestUUIDs(100);
  $result or die;
  my $uuids  = $result->values->{uuids};
This CouchDB library hides the fact that endpoint C</_uuids> has been called.
It also hides the client (UserAgent) which was used to collect the data.
You could also write
  my $uuids = $couch->requestUUIDs(100)->values->{uuids};
because C<values()> will terminate when the database call did not result
in a successful answer. Last alternative:
  my @uuids = $couch->freshUUIDs(100);
Besides calls, there are all kinds of facility methods, which add
further abstraction from the server connection.
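To picture how an object can act as a boolean, here is a minimal sketch
based on Perl's core C<overload> pragma. The class C<My::Result> and its
"status code below 400 means success" rule are assumptions made for this
illustration only; L<Couch::DB::Result|Couch::DB::Result> has its own logic.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a result class, for illustration only.
package My::Result {
    # In boolean context, the object reports whether the call succeeded.
    use overload
        bool     => sub { $_[0]->{code} < 400 },
        fallback => 1;

    sub new     { my ($class, %args) = @_; bless {%args}, $class }
    sub message { $_[0]->{message} }
}

my $ok     = My::Result->new(code => 200, message => 'OK');
my $failed = My::Result->new(code => 404, message => 'not found');

my @report;
for my $result ($ok, $failed) {
    # The object itself is the condition: no ->isSuccess call is needed.
    push @report, $result ? 'success' : 'failed: ' . $result->message;
}
print "$_\n" for @report;
```

This is why C<$result or die> reads naturally: the object keeps all
details about the call, yet still behaves like a simple truth value.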
=head3 Type conversions
With the L<Couch::DB::Result::values()|Couch::DB::Result/"When the document is collected"> method, conversions between JSON
syntax and pure Perl are done. This also hides database interface changes
for you, based on your L<new(api)|Couch::DB/"Constructors"> setting. Avoid L<Couch::DB::Result::answer()|Couch::DB::Result/"When the document is collected">,
which gives the uninterpreted, unabstracted results.
This library also converts parameters from Perl space into JSON space.
POST and PUT parameters travel in JSON documents. In JSON, a boolean is
C<true> and C<false> (without quotes). In Perl, these are C<1> and
C<undef> (and many alternatives). For anything besides your own documents,
C<Couch::DB> will totally hide these differences for you!
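The mismatch can be sketched with the core module C<JSON::PP>. This is
not how C<Couch::DB> implements its converters; the helper
C<perl2json_bool> and the parameter names are made up for the example.

```perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical helper: map any Perl truth value onto a real JSON boolean.
sub perl2json_bool { $_[0] ? JSON::PP::true() : JSON::PP::false() }

# Perl-side parameters: truth is just 1 or undef, numbers may be strings.
my %params = (include_docs => 1, descending => undef, limit => '25');

my %json = (
    include_docs => perl2json_bool($params{include_docs}),
    descending   => perl2json_bool($params{descending}),
    limit        => $params{limit} + 0,   # numify, else JSON gets "25"
);

# canonical() sorts the keys, which keeps the output stable.
my $encoded = JSON::PP->new->canonical->encode(\%json);
print "$encoded\n";
# prints {"descending":false,"include_docs":true,"limit":25}
```

Without such a conversion, the server would receive C<1>, C<null>, and
C<"25">, which is not what a JSON boolean or integer parameter expects.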
=head3 Generic parameters
Each method which is labeled C<< [CouchDB API] >> also accepts a few options
which control the calling process. They are handled by the L<call()|Couch::DB/"Processing">
method which implements the API calls.
These parameters are available for every API call, hence nowhere
documented explicitly. These options are either about the connection
between client and server, for result paging, or processing event hooks.
=head4 Connection parameters
At the moment, the following generic C<%options> are supported everywhere:
=over 4
=item * C<delay> =E<gt> BOOLEAN, default C<false>
Do not perform and wait for the actual call, but prepare it to be used in parallel
querying. TO BE IMPLEMENTED/DOCUMENTED.
=item * C<client> =E<gt> $client-object or -name
Use only the specified client (=server) to perform the call.
=item * C<clients> =E<gt> ARRAY-of-clients or a role
Use any of the specified clients to perform the call. When not an ARRAY, the
parameter is a C<role>: select all clients which can perform that role (the
logged-in user of that client is allowed to perform that task).
=item * C<headers> =E<gt> HASH
Add headers to the request. When applicable (for instance, the C<Accept>-header)
this will overrule the internally calculated defaults.
=back
=head4 Processing events
Besides, at the moment we support the following events:
=over 4
=item * C<on_error> =E<gt> CODE or ARRAY-of-CODE
A CODE (sub) which is called when the interaction with the server has
been completed without success. The CODE gets the result object as
only parameter.
=item * C<on_final> =E<gt> CODE or ARRAY-of-CODE
A CODE (sub) which is called when the interaction with the server has
been completed. This may happen much later, when combined with C<delay>.
The CODE gets the result object as only parameter, and returns a result
object which might be different... as calls can be chained.
=item * C<on_chain> =E<gt> CODE
Run the CODE after the call has been processed. It works as if the
chained logic is run after the call, with the difference that this
next step is defined before the call has been made. This sometimes
produces a nicer interface (like paging).
=item * C<on_values> =E<gt> CODE
Run the CODE on the returned JSON data, to translate the
raw C<answer()> into C<values()>. Wherever this seemed useful, it is
already hidden for you. However, there may be cases where you want to
add changes.
=item * C<on_row> =E<gt> CODE
Used to produce rows where the answer of a call produces a list of
answers.
=back
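The calling conventions of C<on_error> and C<on_final> can be sketched in
plain Perl. The dispatcher C<run_hooks>, the helper C<as_list>, and the
HASH which plays the role of the result object are all hypothetical; the
order in which the real library runs its hooks is an assumption here.

```perl
use strict;
use warnings;

# Hypothetical helper: accept either one CODE or an ARRAY of CODE.
sub as_list {
    my $hooks = shift;
    return () unless defined $hooks;
    ref $hooks eq 'ARRAY' ? @$hooks : ($hooks);
}

# Hypothetical dispatcher, only to show the calling conventions.
sub run_hooks {
    my ($result, %options) = @_;

    # on_error: called only on failure, with the result as only parameter.
    unless($result->{ok}) {
        $_->($result) for as_list($options{on_error});
    }

    # on_final: always called; each hook returns the result to continue with.
    $result = $_->($result) for as_list($options{on_final});
    $result;
}

my @log;
my $final = run_hooks(
    { ok => 0, message => 'timeout' },     # stand-in for a result object
    on_error => sub { push @log, "error: $_[0]{message}" },
    on_final => [
        sub { push @log, 'first';  $_[0] },
        sub { push @log, 'second'; +{ %{$_[0]}, seen => 1 } },
    ],
);
print "$_\n" for @log;
```

The second C<on_final> hook returns a different object than it received,
which is what makes chaining possible.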
=head2 Searching with CouchDB
CouchDB supports various search mechanisms, which are confusingly
named. Their configuration is stored in design documents (see
L<Couch::DB::Design|Couch::DB::Design>). The mechanisms can often be combined.
=over 4
=item * Views
Views are docid-key-value tables which are following the changes made
in a database. The value may be complex, and may contain extracts
and computed elements based on each document.
See F<https://docs.couchdb.org/en/stable/ddocs/views/>
=item * Search
Use Lucene (mainly full text search) indexes on a database. The
interface uses Clouseau wrapper libraries around the Lucene core.
See F<https://docs.couchdb.org/en/stable/ddocs/search.html>
Clouseau was introduced in CouchDB 3.0.0, and is now being phased out
because it got stuck on Java 8. It has index type C<text>.
=item * Search
Uses Lucene indexes via the (new) Nouveau wrapper.
Nouveau was introduced as beta in 3.4.1 as an alternative, with index
type C<nouveau>.
=item * Mango
The C<find()> method uses MongoDB compatible search statements,
which can be combined with index and view restrictions.
These indexes are of the (ill-named) type C<json>.
See F<https://docs.couchdb.org/en/stable/ddocs/mango.html>
=back
Confusing? Yes it is. There are often multiple solutions for the
same problem.
=head2 Paging
Searches tend to give a large number of results. The CouchDB server
will refuse to return too many answers at a time (typically 25).
When you need more results, you will need to do more calls.
We distinguish three cases:
=over 4
=item 1. no paging possible
Most commands will give you the full requested knowledge in one go.
In some cases, the result may still present parts of the reply in
convenient rows.
=item 2. paging possible, but not used
The command may support C<skip>, C<limit>, and/or C<bookmark>. You decide
whether you need them. You can even use C<succeed> (see below). However, the
number of rows you get in the result can differ from your expectations.
You may expect to get 50 elements, but sometimes get 10. Use the
"row"- and "doc" methods on the result object.
=item 3. paging used
When you provide options C<all> or C<page_size>, then your call gets
into paging mode: the server will be polled until the requested number
of results has been received.
You cannot use the row commands on the result, because it will only
return the rows of the last call which was made to fulfil your request.
Now, you need to use the page commands of the result object.
=back
=head3 paging possible
To get more answers, CouchDB implements two mechanisms: some calls
provide a C<skip> and C<limit> only. Other calls implement the more
sophisticated bookmark mechanism. Both mechanisms are abstracted away
in this library by the C<succeed> mechanism.
B<Be aware> that you shall provide the same query parameters to each
call of the search method. Succession may be broken when you change
some parameters: it is not fully documented which ones are needed to
continue, so simply pass all again. Probably, it is safe to change
the C<limit> between pages.
The following parameters can be used when the CouchDB call supports
paging (this will be documented explicitly at these methods):
=over 4
=item * C<skip> =E<gt> INTEGER
Skip this number of elements before returning results.
B<Be warned:> use as C<%option>, not as search parameter.
=item * C<limit> =E<gt> INTEGER
Do not request more than C<limit> number of results per request. May be
less than C<page_size>.
B<Be warned:> use as C<%option>, not as search parameter.
=item * C<bookmark> =E<gt> STRING
Use this if you happen to know the bookmark for the search. Usually, this is
automatically picked up via C<succeed>.
=item * C<stop> =E<gt> CODE|'EMPTY'|'SMALLER'|'UPTO($nr)'
When do we stop asking the server for more pages? When the call returned
no rows, then we always stop. C<EMPTY> reflects the same.
With C<SMALLER>, you will also stop when fewer rows were returned than in
the first call. It is not certain how many rows the server normally
returns, but probably always the same number until the source runs out.
When C<UPTO> is used with some value (for instance C<UPTO(5)>) then no new
call is made when that number of rows or fewer is returned.
=item * C<succeed> =E<gt> $result or $result->paging
Make this query as successor of a previous query. Some requests support
paging (via bookmarks). See examples in a section below.
=item * C<harvester> =E<gt> CODE
How or what to extract per request. You may add other information,
like collecting response objects. The CODE returns the extracted LIST of
objects/elements.
=item * C<map> =E<gt> CODE
Call the CODE on each of the (defined) harvested page elements. The CODE
is called with the result object, and one of the harvested elements. When
a single page requires multiple requests to the CouchDB server, this map
will happen at the moment each response has been received, which may help
to create a better interactive experience.
Your CODE may return the harvested object, but also something small
(even undef) which will free-up the memory use of the object immediately.
However: at least return a single scalar (it will be returned in the
"page"), because an empty list signals "end of results".
=back
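How C<skip>, C<limit>, and C<stop> interact can be pictured with a small
simulation in plain Perl. Nothing here talks to CouchDB: the in-memory
C<@source>, and the functions C<fetch_rows> and C<collect_page>, are
hypothetical stand-ins, and the real library logic may differ in detail.

```perl
use strict;
use warnings;

my @source = (1 .. 23);             # pretend these rows live on the server

# Hypothetical stand-in for one server request using skip and limit.
sub fetch_rows {
    my ($skip, $limit) = @_;
    return () if $skip > $#source;
    my $last = $skip + $limit - 1;
    $last = $#source if $last > $#source;
    @source[$skip .. $last];
}

# Collect rows until the page is full or the stop rule fires.
sub collect_page {
    my (%args) = @_;
    my ($skip, $limit, $stop) = @args{qw/skip limit stop/};
    my (@page, $first_size);
    my $requests = 0;

    while(@page < $args{page_size}) {
        my @rows = fetch_rows($skip, $limit);
        $requests++;
        $first_size //= scalar @rows;
        push @page, @rows;
        $skip += scalar @rows;

        last if @rows == 0;                                 # EMPTY: always
        last if $stop eq 'SMALLER' && @rows < $first_size;
        last if $stop =~ /^UPTO\((\d+)\)$/ && @rows <= $1;
    }

    (\@page, $requests);
}

my ($page, $requests) = collect_page(skip => 0, limit => 10,
    page_size => 50, stop => 'SMALLER');
printf "%d rows in %d requests\n", scalar @$page, $requests;
# prints "23 rows in 3 requests"
```

The third request returns only 3 rows, fewer than the 10 of the first
request, so C<SMALLER> stops the loop without a fourth, empty request.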
=head3 paging used
To manage paged results, selected calls support the following options:
=over 4
=item * C<all> =E<gt> BOOLEAN (default false)
Return all results at once. This may involve multiple calls, like when
the number of results is larger than what the server wants to produce
in one go.
Avoid using this when you expect many or large results. Although, in
that case C<map> may help you a lot.
=item * C<page_size> =E<gt> INTEGER (default 25)
The CouchDB server will often not give you more than 25 or 50 answers
at a time, but you do not want to know.
=item * C<page> =E<gt> INTEGER (default 1)
Start-point of returned results, for calls which support paging.
Pages are numbered starting from 1. When available, bookmarks will
be used for next pages. Succeeding searches will automatically move
through pages (see examples)
=item * C<pagenr> =E<gt> INTEGER (default C<page>)
The start of the page counter, for display purposes.
=back
B<. Example: paging through result>
Get page by page, where you may use the C<limit> parameter to request
a number of elements. Do not use C<skip>, except in the first call.
The C<succeed> handling will play tricks with C<page>, C<harvester>,
and C<client>, which you do not wish to know.
  my $page1 = $couch->find(\%search, limit => 12, skip => 300, page_size => 10);
  my $rows1 = $page1->page;
  my @rows1 = $page1->pageRows;
  my @docs1 = map $_->doc, @rows1;
  my $page2 = $couch->find(\%search, succeed => $page1);
  my $rows2 = $page2->page;
  my @docs2 = map $_->doc, $page2->pageRows;
  @docs2    = $page2->pageDocs;     # alternative
  sub h { my ($result) = @_; $result->docs }
  my $page3 = $couch->find(\%search, succeed => $page2, harvester => \&h);
  my $docs3 = $page3->page;         # now docs!
  my @docs3 = $page3->pageRows;
B<. Example: paging with a website user session>
When you cannot ask for pages within a single continuous process, because
the page is shown to a user who has to take action to see an other page,
then save the pagingState.
The state cannot contain code references, so when you have a specific
harvester or map, then you need to resupply those.
  my $page1 = $couch->find(\%search, page_size => 50);
  my $rows1 = $page1->page;
  $session->save(current => serialize($page1->pagingState));
  ...
  my $prev  = deserialize($session->load('current'));
  my $page2 = $couch->find(\%search, succeed => $prev);
  my $rows2 = $page2->page;
B<. Example: get all results in a loop>
Handle the responses which are coming in one by one. This is useful
when the documents (with attachments?) are large. Each C<$list>
is a new result object.
  my $list;
  while($list = $couch->find(\%search, succeed => $list))
  {   my @rows = $list->rows;
      @rows or last;    # nothing left
      ...;              # use the rows
  }
  # Checks the success of $list, not the number of rows
  $list or die "Stopped somewhere with ". $list->message;
This call does not use C<page_size> or C<all>, so the number of rows
received per loop may differ by decision of the server. You cannot
use the C<page*> set of methods, only the C<row*> and C<doc> methods.
B<. Example: get one page of results>
You can jump back and forward in the pages: bookmarks will remember the
pages already seen.
  my $page4 = $couch->find(\%search,
     limit     => 10,        # results per server request
     page_size => 50,        # results until complete
     page      => 4,         # start point, may use bookmark
     harvester => sub { $_[0]->rows },    # default
  );
  my $rows4 = $page4->page;
  my $page5 = $couch->find(\%search, succeed => $page4);
  my $rows5 = $page5->page;
When paging (here selected via C<page_size>), you cannot use
the C<row*> and C<doc> methods on the result object: you need to
use the C<page*> versions.
B<. Example: get all results in one call>
When you ask for all answers at once, you must be aware this can consume
a lot of time and memory. Your clients will keep on requesting data from
the server until all rows have been collected.
  my $all   = $couch->find(\%search, all => 1) or die;
  my @rows6 = $all->pageRows;
You may consider using C<map> here: to extract the data immediately
when it comes in. In this case, you can reduce the amount of information
which has to be kept in memory immediately on arrival.
B<. Example: processing results when they arrive>
When a page (may) require multiple calls to the server, this may enhance
the user experience.
  sub do_something($$) { my ($result, $doc) = @_; ...; 42 }
  my $all = $couch->find(\%search, all => 1, map => \&do_something);
  # $all->page will now show elements containing '42'.
=head1 SEE ALSO
This module is part of Couch-DB distribution version 0.200,
built on June 18, 2025. Website: F<http://perl.overmeer.net/CPAN/>
=head1 LICENSE
Copyrights 2024-2025 by [Mark Overmeer]. For other contributors see ChangeLog.
This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
See F<http://dev.perl.org/licenses/>