Create Dynamic data type

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Create Dynamic data type

Soheil Pourbafrani
Hi,

Using JDBCInputFormat I want to read data from database but the problem is the table columns are dynamic according to the number of rows. In the schema the first column is of type int and in the rest of the column the first half is String and the second half is double. So I need a way to create the data type dynamically.

I tried the following:

Tuple t = Tuple.getTupleClass(num_col + 1).newInstance();
t.setField("Integer", 0);
for(int i = 0; i < num_col; i++) {
if (i < i / 2) t.setField("String", i);
if (i > i / 2) t.setField("Double", i);
}

JDBCInputFormat jdbcInputFormat = JDBCInputFormat.buildJDBCInputFormat()
.setDrivername("com.mysql.jdbc.Driver")
.setDBUrl("url")
.setUsername("root")
.setPassword("pass")
.setQuery("SELECT * FROM table;")
.setFetchSize(100)
.setRowTypeInfo(new RowTypeInfo(TypeExtractor.getForObject(t)))
.finish();
but I got the following error:

Automatic type extraction is not possible on candidates with null values. Please specify the types directly.

Creating the data type using TypeInformation[] fieldTypes I successfully can get the data but it needs the static schema and doesn't fit in my case!

Any help will be appreciated!
Reply | Threaded
Open this post in threaded view
|

Re: Create Dynamic data type

Rong Rong
Hi Soheil,

If I understand correctly, when you said "according to the number of rows", you were trying to dynamically determine the RowType based on how long one row is, correct?
In this case, I am not sure this is considered supported in JDBCInputFormat at this moment and it would be hard to support this.

Even if we extend the JDBCInputFormat to dynamically call Connection.setSchema() every time you consume a row, this would still be tricky because
1. In your "Select *", you won't be able to know how long the row is until you actual executes the statement, but you have to setSchema before you prepare statement.
2. You have to prepare statement every time schema changes.

You might be able to set all fields to just GenericTypeInfo<>(Object.class) and convert it downstream. This will get around the dynamic schema, but you still need to know the length of your select beforehand.

So, the best I can think of is to change your schema into maps or arrays of Strings and Ints, or have your own SourceFunction to consume and deserialize in your own way.

Thanks,
Rong



On Fri, Apr 19, 2019 at 8:19 AM Soheil Pourbafrani <[hidden email]> wrote:
Hi,

Using JDBCInputFormat I want to read data from database but the problem is the table columns are dynamic according to the number of rows. In the schema the first column is of type int and in the rest of the column the first half is String and the second half is double. So I need a way to create the data type dynamically.

I tried the following:

Tuple t = Tuple.getTupleClass(num_col + 1).newInstance();
t.setField("Integer", 0);
for(int i = 0; i < num_col; i++) {
if (i < i / 2) t.setField("String", i);
if (i > i / 2) t.setField("Double", i);
}

JDBCInputFormat jdbcInputFormat = JDBCInputFormat.buildJDBCInputFormat()
.setDrivername("com.mysql.jdbc.Driver")
.setDBUrl("url")
.setUsername("root")
.setPassword("pass")
.setQuery("SELECT * FROM table;")
.setFetchSize(100)
.setRowTypeInfo(new RowTypeInfo(TypeExtractor.getForObject(t)))
.finish();
but I got the following error:

Automatic type extraction is not possible on candidates with null values. Please specify the types directly.

Creating the data type using TypeInformation[] fieldTypes I successfully can get the data but it needs the static schema and doesn't fit in my case!

Any help will be appreciated!