Talend and Amazon Polly Integration

Overview

This article outlines how Talend integrates with Amazon Polly, an AWS service. Amazon Polly, a front-runner in text-to-speech technology, helps companies build products with lifelike speech capabilities.

 

This article continues the Talend AWS Machine Learning integration series. You can read the previous articles (Introduction to Talend and Amazon Real-Time Machine Learning, Talend and Amazon Comprehend Integration, Talend and Amazon Translate Integration, and Talend and Amazon Rekognition Integration) in the Talend Community Knowledge Base.

 

Environment for Talend and AWS

This article was written using Talend 7.1. However, you can apply the same logic in earlier versions of Talend to integrate with Amazon Polly.

 

Currently, Amazon Polly is available only in selected AWS regions. Talend recommends verifying the availability of the service in the AWS Global Infrastructure Region Table before designing the overall application architecture.

 

Talend recommends reviewing the list of Languages Supported by Amazon Polly.

 

Practical use case

This section discusses a practical use case in which integrating Talend with Amazon Polly automates the conversion of incoming flight take-off and landing information to audio.

 

Automatic airline audio notification application

Automatic audio notifications through telephone calls and public address systems have become a trend among companies, because they significantly reduce the costs associated with manual intervention and lower the risk of human error.

Figure: Automatic Flight Tracking Audio Notification System

 

The diagram above illustrates the workflow. The steps in the flow are:

  1. Real-time data from flights and air traffic controllers is transferred to web servers.
  2. The data is channeled to producer queues managed by Kafka.
  3. Talend uses its built-in native Kafka connectors to read the producer queues and transmit the data to downstream systems.
  4. Talend calls the Amazon Polly speech synthesis service, passing the input text.
  5. Talend receives the response from the Amazon Polly speech synthesis service and automatically downloads the audio files to the specified local directory on your Talend JobServer for further processing.
  6. Talend transmits the data to the consumer Kafka queue using native Kafka connector components.
  7. The data from the consumer Kafka queue is transmitted to the communication servers, which deliver the audio message announcing the live flight status to the customer's telephone or to the public address systems at the airport.
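
Step 4 passes plain input text to Amazon Polly. As a rough illustration of what that text might look like, the sketch below turns a flight-status record into an announcement sentence. The FlightStatus fields, the AnnouncementText helper, and the wording of the sentence are all hypothetical; this article does not describe the actual message format on the Kafka queues.

```java
// Hypothetical record shape; the real queue payload is not described in this article.
final class FlightStatus {
    final String airline;
    final String flightNumber;
    final String event;   // assumed values such as "landed" or "departed"
    final String airport;
    final String time;    // assumed to be already formatted for reading aloud

    FlightStatus(String airline, String flightNumber, String event, String airport, String time) {
        this.airline = airline;
        this.flightNumber = flightNumber;
        this.event = event;
        this.airport = airport;
        this.time = time;
    }
}

final class AnnouncementText {
    // Builds the plain-text sentence that would be sent as input_text to the routine.
    static String build(FlightStatus s) {
        return String.format("%s flight %s has %s at %s at %s.",
                s.airline, s.flightNumber, s.event, s.airport, s.time);
    }
}
```

The resulting string would be the value passed as the input text parameter of the speech synthesis call.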

 

Configure a Talend routine for Amazon Polly service

Create a Talend user routine by performing the following steps.

  1. Connect to Talend Studio, and create a new routine called AWS_Polly. The routine connects to the AWS Polly service, transmits the incoming input text, and collects the response from the service.

    image.png

     

  2. Insert the following code into the Talend routine:

    package routines;
    
    //Amazon SDK 1.11.438
    
    import com.amazonaws.auth.BasicAWSCredentials;
    import com.amazonaws.auth.AWSStaticCredentialsProvider;
    import com.amazonaws.services.polly.AmazonPolly;
    import com.amazonaws.services.polly.AmazonPollyClientBuilder;
    import com.amazonaws.services.polly.model.SynthesizeSpeechRequest;
    import com.amazonaws.services.polly.model.SynthesizeSpeechResult;
    import com.amazonaws.services.polly.model.OutputFormat;
    
    import org.apache.commons.logging.LogFactory;
    
    import com.fasterxml.jackson.databind.ObjectMapper;
    import com.fasterxml.jackson.annotation.JsonView;
    
    import org.apache.http.protocol.HttpRequestExecutor;
    import org.apache.http.client.HttpClient;
    import org.apache.http.conn.DnsResolver;
    import org.joda.time.format.DateTimeFormat;
    
    import java.io.FileNotFoundException;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.io.FileOutputStream;
    import java.io.File;
    
    
    public class AWS_Polly {
    
        /**
         * Sends input_text to the Amazon Polly SynthesizeSpeech operation and writes
         * the returned audio stream to filepath. Returns 0 on success; failures
         * surface as exceptions.
         */
        public static Integer Speech_Synthesize(String AWS_Access_Key, String AWS_Secret_Key, String languageCode, String AWS_regionName, String input_text, String audio_format, String SampleRate, String VoiceID, String filepath) throws IOException
        {
            // AWS connection: build a Polly client with static credentials for the given region
            BasicAWSCredentials awsCreds = new BasicAWSCredentials(AWS_Access_Key, AWS_Secret_Key);
    
            AmazonPolly client = AmazonPollyClientBuilder.standard().withCredentials(new AWSStaticCredentialsProvider(awsCreds)).withRegion(AWS_regionName).build();
    
            // AWS_Polly: build the speech synthesis request from the incoming parameters
            SynthesizeSpeechRequest request = new SynthesizeSpeechRequest().withLanguageCode(languageCode).withOutputFormat(audio_format).withSampleRate(SampleRate)
                    .withText(input_text).withTextType("text").withVoiceId(VoiceID);
            SynthesizeSpeechResult synthesizeSpeechResult = client.synthesizeSpeech(request);
    
            // Copy the audio stream to the target file.
            // try-with-resources closes both streams even if the copy fails.
            try (InputStream speechStream = synthesizeSpeechResult.getAudioStream();
                 OutputStream outstream = new FileOutputStream(new File(filepath))) {
                byte[] buffer = new byte[4096];
                int len;
                // read() returns -1 at the end of the stream
                while ((len = speechStream.read(buffer)) != -1) {
                    outstream.write(buffer, 0, len);
                }
            }
    
            return 0;
        }
    }
    

     

  3. The Talend routine needs additional JAR files. Install the following JAR files in the routine:

    • AWS SDK 1.11.438
    • apache.commons.logging 1.2.0
    • Jackson core 2.9.7
    • Jackson Annotations 2.9.0
    • Jackson Databind 2.9.7
    • httpcore 4.4.10
    • httpclient 4.5.6
    • joda-time 2.9.4
    • jl 1.0
    • org.osgi.foundation 1.2.0
  4. Add additional Java libraries to the routine. For more information on how to add Java libraries, see the Talend and Amazon Comprehend Integration article of the series.

The setup activities are complete. The next section shows sample Jobs for the functionalities described in the practical use cases.

 

For ease of understanding, and to keep the focus on the integration between Talend and Amazon Polly, the sample Job reads its input from a CSV file and generates audio files of the input text as output.

 

Talend sample Job for Amazon Polly service

The Polly_input_file.csv file, attached to this article, provides the data for the sample Job. The data from the input file is transmitted to the Amazon Polly speech synthesis service, and the response is captured. The response contains the audio stream, and Talend automatically writes the audio data to the file path that the Job passes to the Talend routine.
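
The download step described above amounts to copying an input stream to a file. The following is a minimal, standalone sketch of that copy, using only the Java standard library so that it can be tried without AWS credentials. The AudioStreamWriter class is a hypothetical helper and is not part of the routine in this article; in the routine, the input stream would be the audio stream returned by Amazon Polly.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: copies an audio stream to a file and reports the bytes written.
final class AudioStreamWriter {
    static long writeTo(InputStream audio, Path target) throws IOException {
        long total = 0;
        // try-with-resources closes both streams even if the copy fails part-way
        try (InputStream in = audio; OutputStream out = Files.newOutputStream(target)) {
            byte[] buffer = new byte[4096];
            int len;
            // read() returns -1 at the end of the stream
            while ((len = in.read(buffer)) != -1) {
                out.write(buffer, 0, len);
                total += len;
            }
        }
        return total;
    }
}
```

Returning the byte count is a design choice made for this sketch; it gives the calling Job a simple way to detect an empty audio file.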

 

The configuration details are as follows:

  1. Create a new Standard Job called AWS_Polly_sample_job, or use the sample Job, AWS_Polly_sample_job.zip, attached to this article.

  2. To associate the routine with the Talend Job, first add it to the newly created Job by selecting Setup routine dependencies.

    image.png

     

  3. Add the AWS_Polly routine to the User routines section of the pop-up screen, to link the newly created routine to the Talend Job.

    image.png

     

  4. Review the overall Job flow, shown in the following diagram.

    image.png

     

  5. Configure the context variables, as shown below:

    image.png

     

  6. The input file for the Job, Polly_input_files.csv, attached to this article, contains the text to be processed along with the other parameters used for speech synthesis: language code, audio format, audio sample rate, Polly Voice ID, and output file path.

    image.png

     

  7. Configure the tFileInputDelimited component as shown below:

    image.png

     

  8. Use a tJavaRow component to call the Amazon Polly service through the Talend routine. Pass the parameters in the same order as they are declared in the Speech_Synthesize function of the routine.

    AWS_Polly.Speech_Synthesize(context.AWS_Access_Key, context.AWS_Secret_Key, input_row.languageCode, context.AWS_regionName, input_row.input_text, input_row.audio_format, input_row.SampleRate, input_row.VoiceID, input_row.filepath);

     

  9. Configure the tJavaRow component layout as shown below:

    image.png

     

  10. Notice that Talend automatically downloads the audio output files to the file path given in the filepath column of the input file, which the Job passes to the Talend routine. In this case, the audio files are provided in Output Audio Files.zip, attached to this article.

    image (2).png

     

In practical scenarios, the output at this stage can be passed to downstream systems for further processing and storage.

 

Threshold limits for data processing

At the time of this writing, Amazon Polly supports the MP3, Ogg Vorbis, and raw PCM audio stream formats.

 

The throttle rate per account is 100 transactions (requests or operations) per second (tps) with a burst limit of 120 tps, and the limit on concurrent connections per account is 90. The throttle rate for the SynthesizeSpeech operation is 80 tps with a burst limit of 100 tps.
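
When a Job processes many rows, one way to stay under the SynthesizeSpeech throttle rate is to space out calls on the client side. The following is a minimal sketch, assuming a single-threaded Job and the 80 tps figure quoted above. RequestPacer is a hypothetical helper, not part of Talend or the AWS SDK, and the clock is injected so that the behavior can be checked without sleeping.

```java
import java.util.function.LongSupplier;

// Hypothetical helper: spaces requests evenly so that at most maxPerSecond are sent per second.
final class RequestPacer {
    private final long minIntervalNanos;
    private final LongSupplier nanoClock;
    private long nextAllowedNanos;
    private boolean started;

    RequestPacer(int maxPerSecond, LongSupplier nanoClock) {
        if (maxPerSecond <= 0) {
            throw new IllegalArgumentException("maxPerSecond must be positive");
        }
        this.minIntervalNanos = 1_000_000_000L / maxPerSecond;
        this.nanoClock = nanoClock;
    }

    // Reserves the next send slot and returns how many nanoseconds the caller should wait first.
    long reserve() {
        long now = nanoClock.getAsLong();
        if (!started) {
            started = true;
            nextAllowedNanos = now;
        }
        long wait = Math.max(0L, nextAllowedNanos - now);
        // If the caller is already late, start the next interval from now.
        long base = wait > 0 ? nextAllowedNanos : now;
        nextAllowedNanos = base + minIntervalNanos;
        return wait;
    }
}
```

In a Job, the pacer could be created once as new RequestPacer(80, System::nanoTime), and each row could call java.util.concurrent.TimeUnit.NANOSECONDS.sleep(pacer.reserve()) before invoking the routine.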

 

The input text can contain up to 3,000 billed characters (6,000 total characters). SSML tags are not counted as billed characters. A request can specify up to five lexicons to apply to the input text. The output audio stream (synthesis) is limited to 10 minutes; any speech beyond this limit is cut off.
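
Input text that exceeds the character limit has to be split into several requests. The following is a minimal sketch of one way to do that for plain text, cutting at sentence boundaries where possible. TextChunker is a hypothetical helper. It assumes that every character of the text counts toward the limit, which is a conservative simplification of how billed characters are counted, and it does not handle SSML.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits plain text into chunks of at most maxChars characters.
final class TextChunker {
    static List<String> split(String text, int maxChars) {
        if (maxChars <= 0) {
            throw new IllegalArgumentException("maxChars must be positive");
        }
        List<String> chunks = new ArrayList<>();
        String remaining = text.trim();
        while (remaining.length() > maxChars) {
            int cut = findCut(remaining, maxChars);
            chunks.add(remaining.substring(0, cut).trim());
            remaining = remaining.substring(cut).trim();
        }
        if (!remaining.isEmpty()) {
            chunks.add(remaining);
        }
        return chunks;
    }

    // Prefers the last sentence end that fits, then the last space, then a hard cut.
    private static int findCut(String s, int maxChars) {
        String window = s.substring(0, maxChars);
        int sentenceEnd = Math.max(window.lastIndexOf(". "),
                Math.max(window.lastIndexOf("! "), window.lastIndexOf("? ")));
        if (sentenceEnd > 0) {
            return sentenceEnd + 1; // keep the punctuation in the current chunk
        }
        int space = window.lastIndexOf(' ');
        if (space > 0) {
            return space;
        }
        return maxChars; // no natural break; this may split a word
    }
}
```

Each chunk would then be sent as a separate synthesis request, for example with TextChunker.split(input_text, 3000), and the resulting audio files played or concatenated in order.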

 

The audio frequency is specified in hertz (Hz). For the audio sample rate, the valid values for MP3 and ogg_vorbis are 8,000, 16,000, and 22,050 (the default). The valid values for PCM are 8,000 and 16,000 (the default).
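
Because the sample Job reads the audio format and sample rate from the input file, it can help to check each combination before calling the service. The following sketch encodes the values listed above as a lookup table. PollyAudioSettings is a hypothetical helper. It assumes that the format is given as one of the lowercase identifiers mp3, ogg_vorbis, or pcm, and that the sample rate is given as a plain digit string such as 22050, without a thousands separator.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: valid sample rates per audio format, as listed in this article.
final class PollyAudioSettings {
    private static final Map<String, List<String>> VALID_RATES;
    private static final Map<String, String> DEFAULT_RATES;

    static {
        Map<String, List<String>> rates = new HashMap<>();
        rates.put("mp3", Arrays.asList("8000", "16000", "22050"));
        rates.put("ogg_vorbis", Arrays.asList("8000", "16000", "22050"));
        rates.put("pcm", Arrays.asList("8000", "16000"));
        VALID_RATES = Collections.unmodifiableMap(rates);

        Map<String, String> defaults = new HashMap<>();
        defaults.put("mp3", "22050");
        defaults.put("ogg_vorbis", "22050");
        defaults.put("pcm", "16000");
        DEFAULT_RATES = Collections.unmodifiableMap(defaults);
    }

    // Returns true only if the format is known and the sample rate is valid for it.
    static boolean isValid(String audioFormat, String sampleRate) {
        List<String> rates = VALID_RATES.get(audioFormat);
        return rates != null && rates.contains(sampleRate);
    }

    // Returns the default sample rate for the format, or null if the format is unknown.
    static String defaultRate(String audioFormat) {
        return DEFAULT_RATES.get(audioFormat);
    }
}
```

A tJavaRow component could call PollyAudioSettings.isValid(input_row.audio_format, input_row.SampleRate) and skip or reject rows that fail the check before any request is sent.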

 

For a list of voices in multiple languages, see Voices in Amazon Polly.

 

Conclusion

This article describes a use case for integrating Talend with the Amazon Polly service. In real-world scenarios, input data typically arrives through APIs, batch files, web services, or queues rather than the input file used in the sample Job.

 

Citations

AWS Documentation

Version history
Revision #: 23 of 23
Last update: 06-28-2019 09:20 AM