Fwd: To get Schema for jdbc database in Flink

Fwd: To get Schema for jdbc database in Flink

Punit Tandel
Hi,

I was looking into the Flink streaming API and trying to implement a solution for reading data from a JDBC database and writing it back to another JDBC database.

At the moment I can see that the DataStream returns Row from the database, but dataStream.getType().getGenericParameters() returns an empty collection.

Right now I am manually creating a database connection, reading the schema from ResultSetMetaData, and constructing the schema for the table, which is a fairly heavy operation.

So is there any other way to get the table's schema in order to create a new table and write those records to the database?

Please let me know

Thanks
Punit
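[Editor's note: the manual approach described above — reading column names and JDBC type codes via ResultSetMetaData and building a CREATE TABLE statement — can be sketched in plain Java. The class and method names below are made up for illustration, and the type mapping covers only a few JDBC types.]

```java
import java.sql.Types;
import java.util.List;

public class SchemaSketch {

    // Map a small subset of java.sql.Types codes (as returned by
    // ResultSetMetaData.getColumnType) to SQL type names.
    static String sqlType(int jdbcType) {
        switch (jdbcType) {
            case Types.INTEGER: return "INTEGER";
            case Types.BIGINT:  return "BIGINT";
            case Types.DOUBLE:  return "DOUBLE PRECISION";
            case Types.VARCHAR: return "VARCHAR(255)";
            default:            return "VARCHAR(255)"; // crude fallback
        }
    }

    // Build a CREATE TABLE statement from parallel lists of column names
    // and JDBC type codes, i.e. what one would collect from
    // ResultSetMetaData.getColumnName(i) / getColumnType(i).
    static String createTableDdl(String table, List<String> names, List<Integer> types) {
        StringBuilder sb = new StringBuilder("CREATE TABLE ").append(table).append(" (");
        for (int i = 0; i < names.size(); i++) {
            if (i > 0) sb.append(", ");
            sb.append(names.get(i)).append(' ').append(sqlType(types.get(i)));
        }
        return sb.append(")").toString();
    }
}
```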


Re: To get Schema for jdbc database in Flink

Ufuk Celebi
I'm not sure how well this works for the streaming API. Looping in Chesnay, who worked on this.


Re: To get Schema for jdbc database in Flink

rmetzger0
Currently, there is no streaming JDBC connector.



Re: To get Schema for jdbc database in Flink

Punit Tandel

Hi Robert

Thanks for the response. Is this functionality going to be implemented in a near-future Flink release?

Thanks




Re: To get Schema for jdbc database in Flink

Chesnay Schepler
Hello,

I don't understand why you explicitly need the schema, since the batch JDBCInput-/OutputFormats don't require it. That's kind of the nice thing about Rows.

It would be cool if you could tell us what you're planning to do with the schema :)

In any case, to get the schema within the plan you will have to query the DB and build it yourself. Note that this is executed on the client.

Regards,
Chesnay





Re: To get Schema for jdbc database in Flink

Punit Tandel

Hi Chesnay

Currently that is what I have done: reading the schema from the database in order to create a new table in the JDBC database, and writing the rows coming from the JDBCInputFormat.

Overall I am trying to implement a solution that reads streaming data from one source, which could be Kafka, JDBC, Hive, or HDFS, and writes that streaming data to an output source that again could be any of those.

For a simple use case I have taken one scenario, JDBC in and JDBC out: the JDBC input source returns a DataStream of Row, and to write it into a JDBC database we have to create a table, which requires the schema.

Thanks
Punit








Re: To get Schema for jdbc database in Flink

Chesnay Schepler
Hello,

in the JDBC case I would suggest that you extract the schema from the first Row that your sink receives, create the table, and then start writing data.

However, keep in mind that Rows can contain null fields, so you may not be able to extract the entire schema if the first row has a null somewhere.

Regards,
Chesnay
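[Editor's note: a minimal plain-Java sketch of this first-row approach, using an Object[] to stand in for Flink's Row; all names are hypothetical. Note how a null field yields an unknown type, which is exactly the caveat above.]

```java
public class FirstRowSchema {

    // Infer an SQL type name from one field value of the first row.
    // Returns null when the field itself is null: the type cannot be
    // determined from this row alone.
    static String inferType(Object field) {
        if (field == null) return null;
        if (field instanceof Integer) return "INTEGER";
        if (field instanceof Long)    return "BIGINT";
        if (field instanceof Double)  return "DOUBLE PRECISION";
        if (field instanceof String)  return "VARCHAR(255)";
        return "VARCHAR(255)"; // crude fallback for unmapped types
    }

    // Infer a per-column type array from the first row the sink sees.
    // Entries stay null for null fields and would need to be filled in
    // from a later row before the table can be created.
    static String[] inferSchema(Object[] firstRow) {
        String[] types = new String[firstRow.length];
        for (int i = 0; i < firstRow.length; i++) {
            types[i] = inferType(firstRow[i]);
        }
        return types;
    }
}
```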







Re: To get Schema for jdbc database in Flink

Punit Tandel

Hi,

With this approach I will be able to get the data types but not the column names, because TypeInformation<?> typeInformation = dataStream.getType() returns the types but not the column names.

Is there any other way to get the column names from a Row?

Thanks
Punit
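[Editor's note: since a Row carries only field values, not column names, one workaround is to carry the names alongside the values yourself. A hypothetical sketch, independent of the Flink APIs:]

```java
public class NamedRow {
    final String[] names;   // column names, carried next to the values
    final Object[] values;  // the actual row fields

    NamedRow(String[] names, Object[] values) {
        if (names.length != values.length) {
            throw new IllegalArgumentException("names/values length mismatch");
        }
        this.names = names;
        this.values = values;
    }

    // Look up a field by column name; returns null if absent.
    Object get(String name) {
        for (int i = 0; i < names.length; i++) {
            if (names[i].equals(name)) return values[i];
        }
        return null;
    }
}
```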









Re: To get Schema for jdbc database in Flink

Flavio Pompermaier
I also thought about this, and my conclusion was to use a generic SQL parser (e.g. Calcite?) to extract the column names from the input query (because in the query you can rename or add fields). I'd like to hear opinions about this; unfortunately I don't have the time to implement it right now :(
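[Editor's note: as a toy illustration of this idea — a real implementation would use a proper parser such as Calcite — here is a regex-based extraction of output column names from a simple SELECT list. It handles only plain identifiers and AS aliases; all names are invented.]

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SelectListNames {

    // Extract the output column names from a query of the form
    // "SELECT a, b AS c FROM ...". Subqueries, expressions, and quoted
    // identifiers are deliberately not handled.
    static List<String> columnNames(String sql) {
        List<String> names = new ArrayList<>();
        Matcher m = Pattern.compile("(?i)select\\s+(.*?)\\s+from\\s").matcher(sql);
        if (!m.find()) return names;
        for (String item : m.group(1).split(",")) {
            // An alias ("x AS y") wins over the underlying column name.
            String[] parts = item.trim().split("(?i)\\s+as\\s+");
            names.add(parts[parts.length - 1].trim());
        }
        return names;
    }
}
```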










Re: To get Schema for jdbc database in Flink

Punit Tandel

OK. Right now I am simply using a POJO to get the data types and schema, but I need a generic approach to obtain this information.

Thanks












Data stream to write to multiple rds instances

Sathi Chowdhury
In reply to this post by Punit Tandel
Hi All,
Is there any preferred way to manage multiple JDBC connections from Flink? I am new to Flink and looking for guidance on the right pattern and APIs for this. The use case needs to route a stream to a particular JDBC connection depending on a field value, so that records are written to multiple destination DBs.
Thanks
Sathi




=============Notice to Recipient: This e-mail transmission, and any documents, files or previous e-mail messages attached to it may contain information that is confidential or legally privileged, and intended for the use of the individual or entity named above. If you are not the intended recipient, or a person responsible for delivering it to the intended recipient, you are hereby notified that you must not read this transmission and that any disclosure, copying, printing, distribution or use of any of the information contained in or attached to this transmission is STRICTLY PROHIBITED. If you have received this transmission in error, please immediately notify the sender by telephone or return e-mail and delete the original transmission and its attachments without reading or saving in any manner. Thank you. =============

Re: Data stream to write to multiple rds instances

Till Rohrmann
Hi Sathi,

you can split, select, or filter your data stream based on the field's value. Then you obtain multiple data streams, each of which you can output using a JDBCOutputFormat. Be aware, however, that the JDBCOutputFormat does not give you any processing guarantees, since it does not take part in Flink's checkpointing mechanism. Unfortunately, Flink does not have a streaming JDBC connector yet.

Cheers,
Till
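[Editor's note: stripped of the Flink APIs, the routing idea — choose a destination per field value, with one output per destination database — might be sketched as below. All names are invented, and a real sink would hold a JDBCOutputFormat or connection per destination instead of an in-memory buffer.]

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Router {

    // One buffer per routing key; each buffer stands in for the output
    // format / connection of one destination database.
    final Map<String, List<Object[]>> destinations = new HashMap<>();

    // Route one record to the destination selected by the key field.
    void route(String key, Object[] record) {
        destinations.computeIfAbsent(key, k -> new ArrayList<>()).add(record);
    }
}
```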



Re: Data stream to write to multiple rds instances

Sathi Chowdhury
Hi Till,
Thanks for your reply. I guess I will have to write a custom sink function that uses JDBCOutputFormat. I have a question about checkpointing support though: if I am reading a stream from Kinesis (streamA) and it is transformed into streamB, which is written to the DB, will the program on recovery resume from streamB's checkpointed offset? In that case, checkpointing the JDBC side is maybe not so important.
Thanks
Sathi


On Mar 2, 2017, at 5:58 AM, Till Rohrmann <[hidden email]> wrote:

Hi Sathi,

you can split select or filter your data stream based on the field's value. Then you are able to obtain multiple data streams which you can output using a JDBCOutputFormat for each data stream. Be aware, however, that the JDBCOutputFormat does not give you any processing guarantees since it does not take part in Flink's checkpointing mechanism. Unfortunately, Flink does not have a streaming JDBC connector, yet.

Cheers,
Till

On Thu, Mar 2, 2017 at 7:21 AM, Sathi Chowdhury <[hidden email]> wrote:
Hi All,
Is there any preferred way to manage multiple jdbc connections from flink..? I am new to flink and looking for some guidance around the right pattern and apis to do this. The usecase needs to route a stream to a particular jdbc connection depending on a field value.So the records are written to multiple destination dbs.
Thanks
Sathi
On 02/07/2017 04:12 PM, Robert Metzger wrote:
Currently, there is no streaming JDBC connector.
Sent from my iPhone

On Feb 8, 2017, at 1:49 AM, Punit Tandel <[hidden email]> wrote:

Hi Chesnay

Currently that is what i have done, reading the schema from database in order to create a new table in jdbc database and writing the rows coming from jdbcinputformat.

Overall i am trying to implement the solution which reads the streaming data from one source which either could be coming from kafka, Jdbc, Hive, Hdfs and writing those streaming data to output source which is again could be any of those.

For a simple use case i have just taken one scenario using jdbc in and jdbc out, Since the jdbc input source returns the datastream of Row and to write them into jdbc database we have to create a table which requires schema.

Thanks
Punit
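One way to turn the metadata Punit describes into a CREATE TABLE statement, sketched with plain java.sql.Types codes. The column names and the type mapping below are illustrative; in practice the pairs would come from ResultSetMetaData.getColumnName/getColumnType on the source table:

```java
import java.sql.Types;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Builds a CREATE TABLE statement from (column name -> java.sql.Types code)
// pairs, as would be read from ResultSetMetaData on the source table.
public class DdlBuilder {
    // Minimal, deliberately incomplete mapping from JDBC type codes to SQL.
    static String sqlType(int jdbcType) {
        switch (jdbcType) {
            case Types.INTEGER:   return "INTEGER";
            case Types.BIGINT:    return "BIGINT";
            case Types.VARCHAR:   return "VARCHAR(255)";
            case Types.DOUBLE:    return "DOUBLE PRECISION";
            case Types.TIMESTAMP: return "TIMESTAMP";
            default: throw new IllegalArgumentException("unmapped type: " + jdbcType);
        }
    }

    static String createTable(String table, Map<String, Integer> columns) {
        String cols = columns.entrySet().stream()
                .map(e -> e.getKey() + " " + sqlType(e.getValue()))
                .collect(Collectors.joining(", "));
        return "CREATE TABLE " + table + " (" + cols + ")";
    }

    public static void main(String[] args) {
        Map<String, Integer> cols = new LinkedHashMap<>(); // preserves column order
        cols.put("id", Types.INTEGER);
        cols.put("name", Types.VARCHAR);
        System.out.println(createTable("person", cols));
        // CREATE TABLE person (id INTEGER, name VARCHAR(255))
    }
}
```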



On 02/08/2017 08:22 AM, Chesnay Schepler wrote:
Hello,

I don't understand why you explicitly need the schema, since the batch JDBCInput-/OutputFormats don't require it.
That's kind of the nice thing about Rows.

Would be cool if you could tell us what you're planning to do with the schema :)

In any case, to get the schema within the plan you will have to query the DB and build it yourself. Note that this
is executed on the client.

Regards,
Chesnay

On 08.02.2017 00:39, Punit Tandel wrote:

Hi Robert

Thanks for the response. So in a near-future release of Flink, is this functionality going to be implemented?

Thanks

On 02/07/2017 04:12 PM, Robert Metzger wrote:
Currently, there is no streaming JDBC connector.

On Mon, Feb 6, 2017 at 5:00 PM, Ufuk Celebi <[hidden email]> wrote:
I'm not sure how well this works for the streaming API. Looping in
Chesnay, who worked on this.

On Mon, Feb 6, 2017 at 11:09 AM, Punit Tandel <[hidden email]> wrote:
> Hi,
>
> I was looking into the Flink streaming API and trying to implement a solution
> for reading data from a JDBC database and writing it back to a JDBC database.
>
> At the moment I can see the DataStream is returning Row from the database.
> dataStream.getType().getGenericParameters() is returning an empty
> collection.
>
> I am right now manually creating a database connection, getting the
> schema from ResultSetMetaData, and constructing the schema for the table,
> which is a rather heavy operation.
>
> So is there any other way to get the schema for the table, in order to create
> a new table and write those records into the database?
>
> Please let me know.
>
> Thanks
> Punit




=============Notice to Recipient: This e-mail transmission, and any documents, files or previous e-mail messages attached to it may contain information that is confidential or legally privileged, and intended for the use of the individual or entity named above. If you are not the intended recipient, or a person responsible for delivering it to the intended recipient, you are hereby notified that you must not read this transmission and that any disclosure, copying, printing, distribution or use of any of the information contained in or attached to this transmission is STRICTLY PROHIBITED. If you have received this transmission in error, please immediately notify the sender by telephone or return e-mail and delete the original transmission and its attachments without reading or saving in any manner. Thank you. =============


Re: Data stream to write to multiple rds instances

Till Rohrmann
Hi Sathi,

if you read data from Kinesis, then Flink can offer you exactly-once processing guarantees. However, what ends up written to your database depends a little bit on the implementation of your custom sink. If you have a synchronous JDBC client which does not lose data, and you fail your job whenever you see an error, then you should achieve at-least-once.

Cheers,
Till
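Till's point can be illustrated with a fail-fast wrapper: if the synchronous write throws, the exception propagates and fails the job, so Flink restarts from the last checkpoint and the record is retried. The Writer interface below is a hypothetical stand-in for a real JDBC client, not a Flink API:

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a synchronous JDBC write; a real implementation would
// execute an INSERT and throw on any SQLException.
interface Writer {
    void write(String record) throws Exception;
}

// Fail-fast sink body: write errors are never swallowed, so a failure
// surfaces to the framework, the job restarts from the last checkpoint,
// and the record is retried -> at-least-once delivery to the database.
public class FailFastSink {
    private final Writer writer;

    public FailFastSink(Writer writer) { this.writer = writer; }

    public void invoke(String record) throws Exception {
        writer.write(record); // no catch: let the job fail and recover
    }

    public static void main(String[] args) throws Exception {
        List<String> db = new ArrayList<>(); // in-memory "database"
        FailFastSink sink = new FailFastSink(db::add);
        sink.invoke("row-1");
        sink.invoke("row-2");
        System.out.println(db);
    }
}
```

Swallowing the exception instead would silently drop records and turn the guarantee into at-most-once.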

On Thu, Mar 2, 2017 at 4:49 PM, Sathi Chowdhury <[hidden email]> wrote:
Hi Till,
Thanks for your reply. I guess I will have to write a custom sink function that uses JDBCOutputFormat. I have a question about checkpointing support, though: if I am reading a stream from Kinesis (streamA) and it is transformed into streamB, which is written to the DB, will the program on recovery start from streamB's checkpointed offset? In that case, checkpointing the JDBC side is perhaps not so important.
Thanks
Sathi

