NAME

WebService::Speechmatics - ALPHA interface to speech-to-text API from speechmatics.com

SYNOPSIS

 use WebService::Speechmatics;

 my $sm = WebService::Speechmatics->new(
              user_id => 42,
              token   => '...THISISNOTREALLYMYAPITOKEN...',
              lang    => 'en-GB',
          );
 my $response = $sm->submit_job('foobar.wav');

 # wait a bit

 my $transcript = $sm->transcript($response->id);

DESCRIPTION

This module provides an interface to the Speechmatics API for converting speech audio to text.

UNSTABLE: please note that this is very much a work in progress, and all aspects of the interface may change in the future. I've only played with the service so far. Happy to hear suggestions for this module's interface. My current thoughts are in TODO.md.

Before using this module you need to register with speechmatics.com, which will provide you with a user id (integer) and a token to use with the API (a string of random characters).

After submitting a speech audio file, you can either poll until it has been converted to text (or failed), or you can provide a callback URL and Speechmatics will POST the result to your URL.

Specifying language

Whenever you submit a transcription job, you must specify the language you expect the speaker(s) in the audio to be using. Right now that can be one of:

 en-GB    UK English
 en-US    American English

You can either specify the language every time you submit a transcription job, or you can specify it when you instantiate this module, as in the SYNOPSIS.
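As a minimal sketch of how a caller might combine the two, the helper below picks a per-job language if one is given, falls back to a default otherwise, and rejects anything not in the list above. The name resolve_lang and the "per-job value wins" precedence are assumptions made for illustration; they are not part of this module, and the service may accept codes beyond the two listed here.

```perl
use strict;
use warnings;

# Language codes listed in this document; the service may accept others.
my %KNOWN_LANG = map { $_ => 1 } qw(en-GB en-US);

# Hypothetical helper, not part of WebService::Speechmatics.
# Returns the per-job language if given, otherwise the default,
# and dies early if the result is missing or not in the list above.
sub resolve_lang {
    my ($job_lang, $default_lang) = @_;

    my $lang = $job_lang // $default_lang;
    die "no language specified\n" unless defined $lang;
    die "unsupported language '$lang'\n" unless $KNOWN_LANG{$lang};

    return $lang;
}
```

The result could then be passed as the lang value in the hash ref form of submit_job, so that a typo in a language code is caught before anything is uploaded.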

METHODS

new

The following attributes can be passed to the constructor:

  • token - the API token you are given on registering with Speechmatics.

  • user_id - the integer user id which you also get from Speechmatics.

  • lang - the suspected language of the speaker, described above.

  • callback - a URL which transcripts should be POSTed back to.

  • notification - if you set this to 'email' then you'll be emailed when jobs are completed. Defaults to 'none'.

The token and user_id attributes are required, but the others are optional, as they can be specified on a per-job basis as well.

submit_job

There are two ways to submit a job. The simplest is where you just pass the name / path for an audio file:

 $speechmatics->submit_job('i-have-a-dream.wav');

To submit jobs this way, you must specify the language by passing lang to the constructor.

You can also provide additional attributes by passing a hash ref:

 $speechmatics->submit_job({
     filename     => 'i-have-a-dream.wav',
     lang         => 'en-GB',
     notification => 'email',
 });

jobs

Returns a list of your jobs, each of which is an instance of WebService::Speechmatics::Job, which has attributes named exactly the same as the fields given in the Speechmatics API documentation.

balance

Returns an integer, which is the number of Speechmatics credits you have left in your account.

transcript

Returns an instance of WebService::Speechmatics::Transcript, or undef if the job is still in progress. The transcript object has three attributes:

  • job - instance of WebService::Speechmatics::Job with details of the job which produced this transcription.

  • speakers - an array ref of speakers, which will currently always contain the single dominant speaker.

  • words - an array ref of WebService::Speechmatics::Word. Each instance has attributes named exactly as in the Speechmatics API doc.

Here's a simple example of how you might submit a job for transcription, then display the converted text:

 my $sm         = WebService::Speechmatics->new( ... );
 my $response   = $sm->submit_job('sample.wav');

 # wait

 my $transcript = $sm->transcript($response->id);
 my @words      = map { $_->name } @{ $transcript->words };

 print "you said: @words\n";
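The "# wait" step could be filled in with a polling loop. The sketch below relies only on the documented behaviour that transcript returns undef while a job is still in progress. The helper name wait_for_transcript, its max_tries and delay options, and their default values are all assumptions made for illustration; they are not part of this module. How a failed job is reported is not covered here, so the loop simply gives up after a fixed number of attempts.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of WebService::Speechmatics.
# Calls $sm->transcript($job_id) until it returns a defined value,
# or gives up after max_tries attempts.
sub wait_for_transcript {
    my ($sm, $job_id, %opt) = @_;

    my $max_tries = $opt{max_tries} // 30;   # assumed default
    my $delay     = $opt{delay}     // 10;   # seconds between polls, assumed

    for my $try (1 .. $max_tries) {
        my $transcript = $sm->transcript($job_id);
        return $transcript if defined $transcript;

        # Pause before the next attempt, but not after the last one.
        sleep $delay if $delay > 0 && $try < $max_tries;
    }

    return;   # still in progress after the last attempt
}
```

A caller would then write something like my $transcript = wait_for_transcript($sm, $response->id) and check that the result is defined before using words. If you supply a callback URL instead, no polling is needed.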

SEE ALSO

speechmatics.com - home page for Speechmatics

API doc - the official documentation for the Speechmatics API.

REPOSITORY

https://github.com/neilbowers/WebService-Speechmatics

AUTHOR

Neil Bowers <neilb@cpan.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2015 by Neil Bowers <neilb@cpan.org>.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.