NAME

Parallel::Map::Segmented - use Parallel::Map on batches / segments of items.

VERSION

version 0.4.0

SYNOPSIS

    use Parallel::Map::Segmented ();
    use Path::Tiny qw/ path /;

    my $NUM    = 30;
    my $temp_d = Path::Tiny->tempdir;

    my @queue = ( 1 .. $NUM );
    my $proc  = sub {
        my $fn = shift;
        $temp_d->child($fn)->spew_utf8("Wrote $fn .\n");
        return;
    };
    Parallel::Map::Segmented->new()->run(
        {
            WITH_PM      => 1,
            items        => \@queue,
            nproc        => 3,
            batch_size   => 8,
            process_item => $proc,
        }
    );

DESCRIPTION

This module builds upon Parallel::Map and allows one to pass a batch (or "segment") of several items to a worker for processing, rather than a single item at a time. The aim is to speed up the processing.
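To illustrate what a batch means here, the following core-Perl sketch splits a list into segments using the same numbers as the SYNOPSIS (30 items, batch_size of 8). It does not call the module, and the helper name split_into_batches is hypothetical; it only shows the idea and is not necessarily how the module is implemented.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Parallel::Map::Segmented): split a list
# into consecutive batches of at most $batch_size items.
sub split_into_batches
{
    my ( $items, $batch_size ) = @_;
    my @copy = @$items;    # work on a copy so the caller's array is untouched
    my @batches;
    while (@copy)
    {
        push @batches, [ splice @copy, 0, $batch_size ];
    }
    return \@batches;
}

my $batches = split_into_batches( [ 1 .. 30 ], 8 );
printf "%d batches, sizes: %s\n", scalar(@$batches),
    join( ",", map { scalar @$_ } @$batches );
```

With these numbers the last batch is shorter than the others, so a process_batch callback should not assume that every batch holds exactly batch_size items.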

It aims to provide an API compatible with that of Parallel::ForkManager::Segmented, but is based on IO::Async and Parallel::Map rather than on Parallel::ForkManager.

METHODS

my $obj = Parallel::Map::Segmented->new;

Initializes a new object.

my \%ret = $obj->process_args(+{ %ARGS })

Process the arguments - see run().

$obj->run(+{ %ARGS });

Runs the processing. Accepts the following named arguments:

  • process_item

    A reference to a subroutine that accepts one item and processes it.

  • items

    A reference to the array of items.

  • stream_cb

    A reference to a callback that returns new batches of items. It cannot be specified together with 'items'.

    The callback accepts a hash ref whose 'size' key holds the maximal number of items to return (an integer).

    It returns a hash ref whose 'items' key points to an array reference of items, or to undef() upon end-of-stream.

    E.g.:

            # $items is assumed to be an array ref holding the pending items.
            $stream_cb = sub {
                my ($args) = @_;
                my $size = $args->{size};

                return +{ items =>
                        scalar( @$items ? [ splice @$items, 0, $size ] : undef() ),
                };
            };

  • nproc

    The number of child processes to use.

  • batch_size

    The number of items in each batch.

  • disable_fork

    Disable forking and the use of Parallel::Map, and process the items serially.

  • process_batch

    A reference to a subroutine that accepts a reference to an array holding a whole batch, and processes that batch in one call. If specified, process_item is not used.

    Example:

        use strict;
        use warnings;
        use Test::More tests => 30;
        use Parallel::Map::Segmented ();
        use Path::Tiny qw/ path /;
    
        {
            my $NUM    = 30;
            my $temp_d = Path::Tiny->tempdir;
    
            my @queue = ( 1 .. $NUM );
            my $proc  = sub {
                foreach my $fn ( @{ shift(@_) } )
                {
                    $temp_d->child($fn)->spew_utf8("Wrote $fn .\n");
                }
                return;
            };
            Parallel::Map::Segmented->new->run(
                {
                    WITH_PM       => 1,
                    items         => \@queue,
                    nproc         => 3,
                    batch_size    => 8,
                    process_batch => $proc,
                }
            );
            foreach my $i ( 1 .. $NUM )
            {
                # TEST*30
                is( $temp_d->child($i)->slurp_utf8, "Wrote $i .\n", "file $i", );
            }
        }
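The stream_cb contract described above (a 'size' key in, an 'items' key out, with undef() marking end-of-stream) can be exercised on its own. The sketch below uses only core Perl: the serial driver loop is hypothetical and stands in for whatever the module does internally, and the size of 4 and the ten items are arbitrary values chosen for illustration.

```perl
use strict;
use warnings;

my $items = [ 1 .. 10 ];

# The callback from the stream_cb entry, with $items declared.
my $stream_cb = sub {
    my ($args) = @_;
    my $size = $args->{size};

    return +{ items =>
            scalar( @$items ? [ splice @$items, 0, $size ] : undef() ),
    };
};

# Hypothetical serial driver: request batches until 'items' is undef.
my @seen;
while ( defined( my $batch = $stream_cb->( { size => 4 } )->{items} ) )
{
    push @seen, scalar @$batch;
}
print "batch sizes: @seen\n";
```

Note that the callback consumes $items destructively through splice, so the array is empty once the stream has ended.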

SEE ALSO

THANKS

SUPPORT

Websites

The following websites have more information about this module, and may be of help to you. As always, in addition to those websites please use your favorite search engine to discover more resources.

Bugs / Feature Requests

Please report any bugs or feature requests by email to bug-parallel-map-segmented at rt.cpan.org, or through the web interface at https://rt.cpan.org/Public/Bug/Report.html?Queue=Parallel-Map-Segmented. You will be automatically notified of any progress on the request by the system.

Source Code

The code is open to the world, and available for you to hack on. Please feel free to browse it and play with it, or whatever. If you want to contribute patches, please send me a diff or prod me to pull from your repository :)

https://github.com/shlomif/perl-Parallel-ForkManager-Segmented

  git clone https://github.com/shlomif/perl-Parallel-ForkManager-Segmented.git

AUTHOR

Shlomi Fish <shlomif@cpan.org>

BUGS

Please report any bugs or feature requests on the bugtracker website https://github.com/shlomif/parallel-map-segmented/issues

When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature.

COPYRIGHT AND LICENSE

This software is Copyright (c) 2018 by Shlomi Fish.

This is free software, licensed under:

  The MIT (X11) License