Before going further, it must be clear that no configuration for this component is intended to be created by hand, nor with the Petals Studio.
In fact, only Talend Open Studio and Talend Integration Suite have the ability to generate a correct configuration for this component.
{column}
{hide-if:display=printable}
{hide-if:display=printable}
Indeed, a job can easily interact with almost any database, any kind of file, or other systems (SAP, Alfresco, etc.). Rather than developing a specific component, it can be worthwhile to create a job for this task and expose it as a service through Petals. *The job here acts as a mediation layer with data stores or systems.*
\\
Second, a job can be seen as *a set of transformation means*.
Indeed, Talend products provide _Talend components_ that are very efficient and convenient for transformations (e.g. the tMap component). However, it must be clear that these transformations only cover flat structures, like schemas of relational databases. Object or XML schemas are not covered. From this point of view, *the Petals-SE-Talend component cannot replace the Petals-SE-XSLT component*, but it can be an alternative in a few cases. Transformations therefore rely either on attachment files (the content of the attached files is transformed by the job) or on Talend components for Petals (known as _tPetalsInput_ and _tPetalsOutput_). The latter solution provides a means to place the content to transform inside the XML message rather than in an attachment, although constraints on the XML shape still apply.
\\
Finally, the last way to look at a Talend job inside Petals is to see it as *an application to execute inside Petals*.
This application may work with data stores (files, databases...), may involve data transformations, and may also use other features (possibly redundant with Petals). For example, a job can send mails, connect to FTP servers, etc. Obviously, these features are also available in Petals. But in some cases, it can be simpler to integrate them directly in the job than to use the Petals ones (where you will have to use EIP or BPEL to link the calls). In other cases, the exact opposite may be the best option, i.e. externalizing some parts to Petals components. *It all depends on the expected granularity and reusability*.
\\
The wide variety of possibilities (allowed by the equally wide variety of _Talend components_, and by the features of the Petals-SE-Talend component) makes this solution a very flexible one. However, as a Swiss-army-knife component, *the Petals-SE-Talend component should mainly be seen as a functional service-engine*. Its performance, without being bad, cannot be the best available. People looking for a very specific and high-performance usage will prefer to develop their own Petals component, or to use the Petals-SE-Pojo or Petals-SE-Jsr181 components.
h1. Message Processing
This section deals with the way messages (or requests) are processed by the Petals-SE-Talend component.
As a user, it is important to understand the logic of the component to use it efficiently.
A request received by this component may have only one goal: execute the target Talend job.
The request processing is made up of the different steps involved between the message reception and the response.
There are five steps in the processing of a request.
h2. Validating the request
When the Petals-SE-Talend component receives a request, it validates the request before actually processing it.
Here are the different steps involved in this validation process.
The first step is the WSDL-based validation of the request's XML payload.
If the *validate-exchange-by-wsdl* parameter is set to *true*, either in the component or in the service-unit, then the XML payload is validated against the WSDL of the service-unit.
If the validation fails, an exception is thrown (which becomes either a fault or an error, depending on the Message Exchange Pattern). Otherwise, the validation goes on.
{warning}
Be careful, WSDL-based validation does not work when the input message contains attachments.
The Talend export for Petals prevents this situation from occurring.
Keep it in mind if you modify the jbi.xml by hand.
{warning}
\\
The WSDL-based validation checks three elements:
# The called operation is defined in the service's WSDL.
# The called operation is associated with the called Message Exchange Pattern (MEP).
# The XML payload is validated against the WSDL's XML schemas.
{warning}
Be careful, the current implementation of this feature accesses the disk, which reduces performance.
{warning}
\\
This component supports *InOut*, *InOnly* and *RobustInOnly* patterns.
\\
The second and last step in the validation is a check about the singleton property of a job.
If a job is singleton, it means that only one instance of this job can be executed at once.
{info}
One typical example of a singleton job is a job which moves data from one database to another one.
It would make no sense for two instances of this job to run at the same time, especially if they work on the same databases.
{info}
If the job is singleton and already running, then an exception (fault or error) is raised.
Otherwise, a new job instance is created. If the job is singleton, then its running state is set to true and locked until this state is released (i.e. until the job has been executed).
\\
{info}
The job creation strategy is a lazy strategy. A job instance is created on every received and validated message.
The consequence for singleton jobs is that all the messages sent to a singleton job while it is running will be rejected.
{info}
Once accepted, the request can now be parsed to prepare the job's input.
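The singleton check described above can be pictured with a small sketch. This is only an illustration under assumptions: the class {{SingletonJobGuard}} and its methods are hypothetical names, not the component's actual code. It shows one possible way to reject a request while a singleton job is running.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical guard illustrating the singleton check; not the component's real class. */
final class SingletonJobGuard {

    private final boolean singleton;
    private final AtomicBoolean running = new AtomicBoolean(false);

    SingletonJobGuard(boolean singleton) {
        this.singleton = singleton;
    }

    /**
     * Returns true if a new job instance may be created.
     * For a singleton job, returns false while another instance is running,
     * in which case the caller would raise a fault or an error.
     */
    boolean tryAcquire() {
        if (!singleton) {
            return true; // non-singleton jobs are never rejected
        }
        // Atomically switch the running state from false to true.
        return running.compareAndSet(false, true);
    }

    /** Releases the running state once the job has been executed. */
    void release() {
        if (singleton) {
            running.set(false);
        }
    }
}
```

With such a guard, a second call to {{tryAcquire()}} on a singleton job fails until {{release()}} has been called, which matches the rejection behaviour described for messages sent to a running singleton job.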
h2. Preparing the job's input
Once the request has been accepted, it is parsed to get the different possible parameters for the job.
The input message contains up to 4 parts, which are described in the service's WSDL.
# The first parameters are the context parameters, child elements of the *contexts* element from the input message. These parameters will be passed to the job in its main method.
# Then, the data flow to be passed to a tPetalsInput instance is retrieved from the request. This data flow may not be present.
# The third kind of parameters is the input attachments.
# Finally, the component processes the native options to be passed to the job.
{info}
If a job does not support receiving a data flow (for a tPetalsInput), an entry is logged, but no fault is raised. The execution goes on normally.
{info}
\\
Input attachments must respect some constraints:
* Each input attachment is serialized as a temporary file.
* Its location will be passed to the job through a context variable. This is why attachments are associated with context variables.
* Be careful, attachments are expected to be passed in MTOM mode. That is to say the attachment element has a grand-child element "xop:include" whose href attribute references an attachment.
* Besides, the name of the attachment element is the name of the context variable that will be associated with the temporary file location.
{info}
As a user, you do not have to worry about this apparent complexity.
The configuration and the WSDL creation are made by the tools, during the export.
And hopefully, clients to call such a service can be generated automatically from the WSDL.{info}
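To make the attachment constraints more concrete, here is a minimal sketch of how input attachments could be turned into context variables. The class {{InputAttachments}}, its method and the temporary file prefix are assumptions made for illustration; the component's real implementation may differ.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; the names are illustrative only. */
final class InputAttachments {

    /**
     * @param attachments attachment element name mapped to the attachment content
     * @return context variable name mapped to the location of the temporary file
     */
    static Map<String, String> toContextVariables(Map<String, byte[]> attachments) {
        Map<String, String> context = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> entry : attachments.entrySet()) {
            try {
                // Each attachment is serialized as a temporary file.
                Path tmp = Files.createTempFile("talend-attachment-", ".tmp");
                Files.write(tmp, entry.getValue());
                // The element name is reused as the context variable name.
                context.put(entry.getKey(), tmp.toString());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return context;
    }
}
```

The job then only sees a file location in one of its context variables, and does not need to know that the content originally came from an MTOM attachment.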
\\
As you can see, from one JBI message (an XML payload and attachments), the Petals-SE-Talend component gets at most 4 kinds of parameters to pass to the job.
Three of them are merged together, since they are passed as contexts to the job. The remaining one concerns the tPetalsInput data.
Notice that the input message may not define any of these parameters. In this case, the component will pass nothing to the job.
{info}
In fact, the WSDL content and the expected parameters depend on the job's content and on the defined options during the export operation.
{info}
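The merge of the three context-related kinds of parameters can be sketched as follows. This is an assumption-laden illustration: the class {{JobArguments}} is hypothetical, and the {{--context=...}} and {{--context_param name=value}} argument syntax is assumed from the usual convention of Talend-generated jobs, not taken from the component's source code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper showing how the parameters could be merged into job arguments. */
final class JobArguments {

    static String[] build(String contextName,
                          Map<String, String> contextParameters,
                          Map<String, String> attachmentLocations,
                          List<String> nativeOptions) {
        List<String> args = new ArrayList<>();
        // Assumed syntax for selecting the context to use.
        args.add("--context=" + contextName);
        // Context parameters taken from the XML payload.
        contextParameters.forEach((name, value) ->
                args.add("--context_param " + name + "=" + value));
        // Attachments are passed as context variables holding file locations.
        attachmentLocations.forEach((name, location) ->
                args.add("--context_param " + name + "=" + location));
        // Native options are passed through unchanged.
        args.addAll(nativeOptions);
        return args.toArray(new String[0]);
    }
}
```

The tPetalsInput data flow is deliberately absent from this sketch, since it is not passed as a context.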
h2. Executing the job
At this point, the Petals-SE-Talend component has built the job instance and prepared the parameters.
If the job contains a tPetalsInput component, the data for this component is passed to the job.
The Talend contexts and options are then passed to the job, right before its execution is launched.
h2. Getting the job's output
The native job's output is an array of arrays of String, that is to say: {code:lang=java}String[][]{code}
This result may contain only an integer, in which case {code:lang=java}String[ 0 ][ 0 ]{code} holds this integer, which indicates the result of the job execution. Otherwise, this array contains raw data (which is the case if the job contains a tBufferOutput).
Checking it can be a way to determine whether the job execution succeeded or not. The Petals-SE-Talend component does not do it. It is the responsibility of the client to make this check (since it depends on the job itself).
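On the client side, the check could look like the sketch below. The class {{JobResults}} is a hypothetical name, and how the returned integer should be interpreted is left to the client because it depends on the job.

```java
import java.util.OptionalInt;

/** Hypothetical client-side helper for reading the job's native result. */
final class JobResults {

    /**
     * Returns the integer held in result[0][0] when the result is a single
     * cell containing an integer, and an empty value when it holds raw data.
     */
    static OptionalInt singleInteger(String[][] result) {
        if (result == null || result.length != 1
                || result[0] == null || result[0].length != 1
                || result[0][0] == null) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(result[0][0].trim()));
        } catch (NumberFormatException e) {
            // The single cell holds raw data, not an integer.
            return OptionalInt.empty();
        }
    }
}
```

Note that a job with a tBufferOutput could also return a single cell that happens to contain an integer, so this helper cannot tell both cases apart on its own. The client has to know what the job is expected to return.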
\\
If the job contained a tPetalsOutput, then the output data flow is retrieved from the job.
{info}
If a job does not support being asked for a data flow (for a tPetalsOutput), an entry is logged, but no fault is raised. The execution goes on normally.
{info}
\\
Finally, if it was specified during the job export that output attachments are expected after the job execution, then they are retrieved from the job.
These attachments must be passed from the job to the component through files. These files are loaded in memory by the component and then deleted from the disk.
The deletion of these files is not optional. Leaving them on the disk could represent a serious risk. Indeed, a malicious client could override the context on each call, thus creating an unbounded number of files on the disk, until the disk is full.
Like input attachments, output attachments are returned in MTOM mode.
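The "load in memory, then delete" behaviour can be illustrated with the following sketch. The class {{OutputAttachments}} is a hypothetical name; the point is only that the deletion happens whether or not the read succeeds.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper illustrating how an output attachment file could be consumed. */
final class OutputAttachments {

    /** Loads the file content in memory and always removes the file from the disk. */
    static byte[] loadAndDelete(Path file) throws IOException {
        try {
            return Files.readAllBytes(file);
        } finally {
            // Deletion is not optional: no file may remain on the disk.
            Files.deleteIfExists(file);
        }
    }
}
```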
{info}
If a component expects output attachments to be returned by the job, and this job does not support it, then an exception (fault or error) is thrown.
This can typically happen if you created your job with Talend Open Studio and exported a context as an "OUT-Attachment".
{info}
h2. Building the response
Now that everything has been gathered from the job, the response can be built and returned.
The response contains up to 3 parts:
# The job's result (remember, the array of arrays of String). This part is always returned.
# The output data beans, if the job contained a tPetalsOutput.
# The output attachments.
Like the input message, the structure of the output message is determined by the job content and the options which were checked during the export of the job for Petals.
h1. Talend Open Studio vs. Talend Integration Suite
h1. Component Configuration
h1. Exposing a Talend job as a Petals service (Provides mode)
The most important thing to understand is that this component is not intended to be used without Talend products.
It is useless to create a JBI descriptor and a service-unit by hand for this component.
In fact, you should not even have to care about the operations this component supports; simply rely on the WSDL generated at export time.
The generated WSDL only contains operations that this component supports.
\\
This component supports two operations:
* *executeJob:* this operation creates a new job instance, passes it the received parameters, executes it and gets the result back before sending it in the response.
* *executeJobOnly:* this operation creates a new job instance, passes it the received parameters and executes it. This operation does not send back a response.
h2. The "executeJob" operation
The fully qualified name of this operation is:
* Name space URI: any URI, provided it is not null (e.g. *{html}http://petals.ow2.org/talend/{html}*)
* Local part: *executeJob*
\\
{warning}
In the version 1.0 (or 1.0.0), the only name space URI that is accepted for this operation is *{html}http://petals.ow2.org/talend/{html}*.
From the version 1.0.1, you can use any name space URI, as long as it matches a WSDL operation.
Once again, rely on the WSDL generated by Talend products.
{warning}
\\
This operation only supports the *InOut* message exchange pattern (MEP).
When invoking this operation, you must call it using its fully qualified name.
\\
The input and output depend on the job itself and are fully described in the generated WSDL definitions.
Once again, rely on the generated WSDL.
h2. The "executeJobOnly" operation
The fully qualified name of this operation is:
* Name space URI: any URI, provided it is not null (e.g. *{html}http://petals.ow2.org/talend/{html}*)
* Local part: *executeJobOnly*
\\
{warning}
In the version 1.0 (or 1.0.0), the only name space URI that is accepted for this operation is *{html}http://petals.ow2.org/talend/{html}*.
From the version 1.0.1, you can use any name space URI, as long as it matches a WSDL operation.
Once again, rely on the WSDL generated by Talend products.
{warning}
\\
This operation only supports the *InOnly* message exchange pattern (MEP).
When invoking this operation, you must call it using its fully qualified name.
\\
The input depends on the job itself and is fully described in the generated WSDL definitions.
This operation does not send any response. Once again, rely on the generated WSDL.
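As an illustration of the two operations and their fully qualified names, here is a small sketch. The class {{TalendOperations}} and the method {{isSupported}} are hypothetical names; the name space URI is the example one given above, and a real client should take it from the generated WSDL.

```java
import javax.xml.namespace.QName;

/** Hypothetical helper pairing each operation with its message exchange pattern. */
final class TalendOperations {

    /** Example name space URI; the generated WSDL is the reference. */
    static final String EXAMPLE_NS = "http://petals.ow2.org/talend/";

    static final QName EXECUTE_JOB = new QName(EXAMPLE_NS, "executeJob");
    static final QName EXECUTE_JOB_ONLY = new QName(EXAMPLE_NS, "executeJobOnly");

    /** Returns true if the operation may be invoked with the given MEP. */
    static boolean isSupported(QName operation, String mep) {
        // The operation must be fully qualified: an empty name space is rejected.
        if (operation == null || operation.getNamespaceURI().isEmpty()) {
            return false;
        }
        switch (operation.getLocalPart()) {
            case "executeJob":
                return "InOut".equals(mep);
            case "executeJobOnly":
                return "InOnly".equals(mep);
            default:
                return false;
        }
    }
}
```

For example, {{isSupported(TalendOperations.EXECUTE_JOB, "InOut")}} returns true, while the same operation with the InOnly pattern is refused.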
h2. JBI Descriptor
The service-unit descriptor file ( jbi.xml ) looks like this:
{code:lang=xml}<?xml version="1.0" encoding="UTF-8"?>
<!-- Remember, this file is intended to be generated by Talend Open Studio or Talend Integration Suite -->
<jbi:jbi version="1.0"
xmlns:jbi="http://java.sun.com/xml/ns/jbi"
xmlns:petalsCDK="http://petals.ow2.org/components/extensions/version-5"
xmlns:talend="http://petals.ow2.org/components/talend/version-1"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<jbi:services binding-component="false">
<jbi:provides
interface-name="generatedNs:AttachmentInOutServicePortType"
service-name="generatedNs:AttachmentInOutService"
endpoint-name="AttachmentInOutEndpoint"
xmlns:generatedNs="http://petals.ow2.org/talend/">
<!-- CDK parameters -->
<petalsCDK:wsdl>AttachmentInOut.wsdl</petalsCDK:wsdl>
<petalsCDK:validate-wsdl>true</petalsCDK:validate-wsdl>
<!-- Component parameters -->
<talend:name>AttachmentInOut</talend:name>
<talend:class-name>
talenddemosjava.attachmentinout_0_1.AttachmentInOut
</talend:class-name>
<talend:context>Default</talend:context>
<talend:singleton>true</talend:singleton>
<talend:validate-exchange-by-wsdl>false</talend:validate-exchange-by-wsdl>
<!--
Define all the expected output attachments.
There can be as many "output-attachment" as required.
-->
<talend:output-attachment>outputFile</talend:output-attachment>
</jbi:provides>
</jbi:services>
</jbi:jbi>
{code}
\\
*Configuration of a Service-Unit to expose a Talend job as a service in Petals ESB:*
|| Parameter || Description || Default || Required ||
| name | The job's name \\ | \- | Yes |
| class-name | The job's class name \\ | \- | Yes |
| context | The context to use (if invalid, the job will use the default context) \\ | \- | Yes |
| singleton | The singleton property \\ | \- | Yes |
| validate-exchange-by-wsdl | Validate the messages against the WSDLs' schemas | False | No \\ |
| output-attachment \\ | The name of a context variable that points to a file that must be attached to the returned message. \\
The cardinality for this element is 0-*. \\ | \- | No \\ |
\\
{include:0 CDK SU Provide Configuration}
\\
{include:0 CDK Interceptor configuration for SU}
h2. Service-Unit content
The service unit must contain the JAR archives of the job and its dependencies, plus the context files.
It is also highly recommended to provide a WSDL description of your job. This WSDL is not mandatory, but not providing it will prevent your service from interacting with other Petals services and components.
By default, a WSDL is generated during the Talend export for Petals.
\\
The directory structure of a SU for the Petals-SE-Talend looks like this:
{noformat}
su-talend-JobName-provide.zip
+ META-INF
- jbi.xml
+ src
- { source files }
+ { Contexts directory }
- *.properties
- JobName.wsdl
- systemRoutines.jar
- userRoutines.jar
- JobName_JobVersion.jar
- JobDependencies.jar (there can be several jars)
{noformat}
h1. Configuring the component
The component can be configured through its JBI descriptor file, as shown below.
{code:lang=xml}<?xml version="1.0" encoding="UTF-8"?>
{code:lang=xml}<?xml version="1.0" encoding="UTF-8"?>
Beware, for the moment, WSDL-based validation does not work with messages having attachments.
h1. Understanding the way messages are processed
This section deals with the way messages (or requests) are processed by the Petals-SE-Talend component.
As a user, it is important to understand the logic of the component to use it efficiently.
A request received by this component may have only one goal: execute the target Talend job.
The request processing is made up of the different steps involved between the message reception and the response.
There are five steps in the processing of a request.
h2. Validating the request
When a request is received and started to be processed in the Petals-SE-Talend component, it is validated before being really processed.
Here are the different steps involved in this validation process.
The first step is the WSDL-based validation of the request's XML payload.
If the *validate-exchange-by-wsdl* parameter is set to *true*, either in the component or in the service-unit, then the XML payload is validated against the WSDL of the service-unit.
If the validation fails, an exception is thrown (which becomes either a fault or an error, depending on the Message Exchange Pattern). Otherwise, the validation goes on.
{warning}
Be careful, WSDL-based validation does not work when the input message contains attachments.
The Talend export for Petals does prevent that from happening.
Just remember it if you modify the jbi.xml by hand.
{warning}
\\
The WSDL-based validation checks three elements:
# The called operation is defined in the service's WSDL.
# The called operation is associated with the called Message Exchange Pattern (MEP).
# The XML payload is validated against the WSDL's XML schemas.
{warning}
Be careful, the current implementation of this feature accesses the disk, which reduces performance.
{warning}
{warning}
The Petals-SE-Talend component can only handle messages coming from inside the bus. Therefore, you cannot specify an external-listener class-name.
{warning}
h1. Service Configuration
\\
This component supports *InOut*, *InOnly* and *RobustInOnly* patterns.
\\
The second and last step in the validation is a check of the job's singleton property.
If a job is a singleton, only one instance of this job can be executed at a time.
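The singleton check can be pictured as a small guard around the job. The following is only a minimal sketch, assuming the running state is held in an atomic flag; the class name {{SingletonJobGuard}} and its methods are hypothetical and are not taken from the component's source code.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical guard for a singleton job; not the component's actual code. */
final class SingletonJobGuard {

    // true while one instance of the job is running
    private final AtomicBoolean running = new AtomicBoolean(false);

    /** Tries to reserve the job; returns false if an instance is already running. */
    boolean tryAcquire() {
        return running.compareAndSet(false, true);
    }

    /** Releases the running state once the job has been executed. */
    void release() {
        running.set(false);
    }
}
```

A caller would reject the request when {{tryAcquire()}} returns false, and call {{release()}} in a {{finally}} block after the job execution so that a failing job does not stay locked.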
{info}
One typical example of a singleton job is a job that moves data from one database to another.
It would make no sense for two instances of this job to run at the same time, especially if they work on the same databases.
{info}
If the job is a singleton and is already running, an exception (fault or error) is raised.
Otherwise, a new job instance is created. If the job is a singleton, its running state is set to true and locked until this state is released (i.e. until the job has been executed).
{info}
The job creation strategy is lazy: a job instance is created for every received and validated message.
The consequence for singleton jobs is that all the messages sent to a singleton job while it is running are rejected.
{info}
Once accepted, the request can be parsed to prepare the job's input.
h2. Service Unit descriptor
The Service Unit descriptor file (jbi.xml) looks like this:
{code:lang=xml}<?xml version="1.0" encoding="UTF-8"?>
<!-- Remember, this file is intended to be generated by Talend Open Studio or Talend Integration Suite -->
<jbi:jbi version="1.0"
    xmlns:jbi="http://java.sun.com/xml/ns/jbi"
    xmlns:petalsCDK="http://petals.ow2.org/components/extensions/version-5"
    xmlns:talend="http://petals.ow2.org/components/talend/version-1"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

    <jbi:services binding-component="false">
        <jbi:provides
            interface-name="generatedNs:AttachmentInOutServicePortType"
            service-name="generatedNs:AttachmentInOutService"
            endpoint-name="AttachmentInOutEndpoint"
            xmlns:generatedNs="http://petals.ow2.org/talend/">

            <!-- CDK parameters -->
            <petalsCDK:wsdl>AttachmentInOut.wsdl</petalsCDK:wsdl>
            <petalsCDK:validate-wsdl>true</petalsCDK:validate-wsdl>

            <!-- Component parameters -->
            <talend:name>AttachmentInOut</talend:name>
            <talend:class-name>talenddemosjava.attachmentinout_0_1.AttachmentInOut</talend:class-name>
            <talend:context>Default</talend:context>
            <talend:singleton>true</talend:singleton>
            <talend:validate-exchange-by-wsdl>false</talend:validate-exchange-by-wsdl>

            <!--
                Define all the expected output attachments.
                There can be as many "output-attachment" as required.
            -->
            <talend:output-attachment>outputFile</talend:output-attachment>
        </jbi:provides>
    </jbi:services>
</jbi:jbi>
{code}
h2. Preparing the job's input
Once the request has been accepted, it is parsed to get the different possible parameters for the job.
The input message contains up to four parts, which are described in the service's WSDL.
# The first parameters are the context parameters, child elements of the *contexts* element of the input message. These parameters are passed to the job in its main method.
# Then, the data flow to be passed to a tPetalsInput instance is retrieved from the request. This data flow may be absent.
# The third kind of parameter is the input attachments.
# Finally, the component processes the native options to be passed to the job.
{info}
If a job does not support receiving a data flow (for a tPetalsInput), an entry is logged, but no fault is raised. The execution goes on normally.
{info}
\\
*Configuration of a Service Unit to expose a Talend job as a service in Petals ESB:*
|| Parameter || Description || Default || Required ||
| name | The job's name \\ | \- | Yes |
| class-name | The job's class name \\ | \- | Yes |
| context | The context to use (if invalid, the job will use the default context) \\ | \- | Yes |
| singleton | The singleton property \\ | \- | Yes |
| validate-exchange-by-wsdl | Validate the messages against the WSDLs' schemas | False | No \\ |
| output-attachment \\ | The name of a context variable that points to a file that must be attached to the returned message. \\
The cardinality for this element is 0-*. \\ | \- | No \\ |
\\
Input attachments must respect some constraints:
* Each input attachment is serialized as a temporary file.
* Its location is passed to the job through a context variable. This is why attachments are associated with context variables.
* Be careful, attachments are expected to be passed in MTOM mode. That is to say, the attachment element has a grandchild element "xop:Include" whose href attribute references an attachment.
* Besides, the name of the attachment element is the name of the context variable that will be associated with the temporary file location.
{info}
As a user, you do not have to worry about this apparent complexity.
The configuration and the WSDL are created by the tools during the export.
Clients that call such a service can also be generated automatically from the WSDL.
{info}
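To make the handling of an input attachment more concrete, here is a minimal sketch of how an attachment could be written to a temporary file whose location is then associated with a context variable. The class name {{InputAttachments}}, the method name and the file naming scheme are assumptions made for illustration; the actual component may proceed differently.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper; not the component's actual code. */
final class InputAttachments {

    /**
     * Writes the attachment content to a temporary file and returns its location.
     * The attachment element name doubles as the context variable name.
     */
    static Path toTempFile(String contextVariable, byte[] content) throws IOException {
        Path file = Files.createTempFile(contextVariable + "-", ".tmp");
        Files.write(file, content);
        return file;
    }
}
```

The returned path, converted to a string, would then be the value given to the context variable named after the attachment element.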
As you can see, from one JBI message (an XML payload and attachments), the Petals-SE-Talend component gets at most four kinds of parameters to pass to the job.
Three of them are merged together, since they are passed as contexts to the job. The remaining one concerns the tPetalsInput data.
Note that the input message may define none of these parameters. In this case, the component passes nothing to the job.
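The merge of the context-like inputs can be sketched as follows. This is only an illustration under stated assumptions: the class {{JobArguments}} is hypothetical, and the {{--context_param name=value}} argument form is the convention commonly accepted by exported Talend jobs, not something this page confirms the component uses. How native options are encoded is not specified here, so the sketch simply appends them unchanged.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; names and argument format are assumptions of this sketch. */
final class JobArguments {

    static String[] buildArgs(Map<String, String> contextParams,
                              Map<String, String> attachmentFiles,
                              List<String> nativeOptions) {
        // Merge the context-like inputs; on a name clash the attachment path wins
        // (an arbitrary choice made for this sketch).
        Map<String, String> contexts = new LinkedHashMap<>(contextParams);
        contexts.putAll(attachmentFiles);

        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> entry : contexts.entrySet()) {
            args.add("--context_param");
            args.add(entry.getKey() + "=" + entry.getValue());
        }
        // Native options are appended unchanged.
        args.addAll(nativeOptions);
        return args.toArray(new String[0]);
    }
}
```

If the message defines none of these parameters, all three inputs are empty and the resulting argument array is empty as well, which matches the case where nothing is passed to the job.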
{info}
In fact, the WSDL content and the expected parameters depend on the job's content and on the defined options during the export operation.
{info}
h2. Executing the job
At this point, the Petals-SE-Talend component has built the job instance and prepared the parameters.
If the job contains a tPetalsInput component, the data for this component is passed to the job.
The Talend contexts and options are then passed to the job, right before its execution is launched.
h2. Getting the job's output
The native job's output is an array of arrays of String, that is to say: {code:lang=java}String[][]{code}
This result may contain only an integer, in which case {code:lang=java}String[ 0 ][ 0 ]{code} is an integer and indicates the result of the job execution. Otherwise, this array contains raw data (which is the case if the job contains a tBufferOutput).
Checking it is one way to determine whether the job execution succeeded. The Petals-SE-Talend component does not do it. It is the responsibility of the client to make this check (since, in fact, it depends on the job itself).
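Since the check is left to the client, a client-side helper could look like the sketch below. It is only an illustration: the class {{JobResult}} is hypothetical, and the rule used to recognize an execution code (a single cell that parses as an integer) is an assumption derived from the description above.

```java
import java.util.OptionalInt;

/** Hypothetical client-side helper; not part of the component. */
final class JobResult {

    /** Returns the execution code when the result is a single integer cell, else empty. */
    static OptionalInt exitCode(String[][] result) {
        if (result == null || result.length != 1
                || result[0] == null || result[0].length != 1
                || result[0][0] == null) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(result[0][0].trim()));
        } catch (NumberFormatException e) {
            // Not an integer: raw data, e.g. produced by a tBufferOutput
            return OptionalInt.empty();
        }
    }
}
```

Note that a job returning raw data made of a single integer cell cannot be told apart from an execution code by this helper alone, which is one more reason why the interpretation depends on the job itself.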
\\
If the job contains a tPetalsOutput, the output data flow is retrieved from the job.
{info}
If a job does not support being asked for a data flow (for a tPetalsOutput), an entry is logged, but no fault is raised. The execution goes on normally.
{info}
\\
Finally, if it was specified during the job export that output attachments are expected after the job execution, they are retrieved from the job.
These attachments must be passed from the job to the component through files. These files are loaded into memory by the component and then deleted from the disk.
The deletion of these files is not optional. Leaving them on the disk could represent a significant risk. Indeed, a malicious client could override the context on each call, thus creating an unbounded number of files on the disk, until the disk is full.
Like input attachments, output attachments are returned in MTOM mode.
{info}
If a component expects output attachments to be returned by the job and this job does not support it, an exception (fault or error) is thrown.
This can typically happen if you created your job with Talend Open Studio and exported a context as an "OUT-Attachment".
{info}
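The "load in memory, then delete" handling of an output attachment file can be sketched in a few lines. The class {{OutputAttachments}} and its method are hypothetical names used for illustration only; the sketch shows the order of operations described above, not the component's actual implementation.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper; not the component's actual code. */
final class OutputAttachments {

    /** Loads the file produced by the job into memory, then deletes it from the disk. */
    static byte[] loadAndDelete(Path file) throws IOException {
        byte[] content = Files.readAllBytes(file);
        // The file is removed only after its content has been fully read.
        Files.delete(file);
        return content;
    }
}
```

Reading before deleting ensures that no data is lost, and deleting unconditionally afterwards keeps repeated calls from accumulating files on the disk.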
h2. Building the response
Now that everything has been gathered from the job, the response can be built and returned.
The response contains up to three parts:
# The job's result (remember, the array of arrays of String). This part is always returned.
# The output data beans, if the job contained a tPetalsOutput.
# The output attachments.
Like the input message, the structure of the output message is determined by the job content and the options which were checked during the export of the job for Petals.