Bug in MATCH_RECOGNIZE?


maverick
Hi,
I'm seeing very strange behaviour when using MATCH_RECOGNIZE.

I'm using some joins and unions to create an event stream. A sample event stream (for one user) looks like this:


Then I'm using the following MATCH_RECOGNIZE definition (the trace function is explained below):

CREATE VIEW scenario_1 AS (
SELECT * FROM events
    MATCH_RECOGNIZE(
        PARTITION BY cif
        ORDER BY ts
        MEASURES
            TRX.v as trx_amount,
            TRX.ts as trx_ts,
            APP_1.ts as app_1_ts,
            APP_2.ts as app_2_ts,
            APP_2.balance as app_2_balance
        ONE ROW PER MATCH
        PATTERN (TRX ANY_EVENT*? APP_1 NOT_LOAN*? APP_2) WITHIN INTERVAL '10' DAY
        DEFINE
        TRX AS trace(TRX.event_type = 'trx' AND TRX.v > 1000,
                  'TRX', TRX.uuid, TRX.cif, TRX.event_type, TRX.ts),
        ANY_EVENT AS trace(true,
                  'ANY_EVENT', TRX.uuid, ANY_EVENT.cif, ANY_EVENT.event_type, ANY_EVENT.ts),
        APP_1 AS trace(APP_1.event_type = 'application' AND APP_1.ts < TRX.ts + INTERVAL '3' DAY,
                  'APP_1', TRX.uuid, APP_1.cif, APP_1.event_type, APP_1.ts),
        APP_2 AS trace(APP_2.event_type = 'application' AND APP_2.ts > APP_1.ts
                   AND APP_2.ts < APP_1.ts + INTERVAL '7' DAY AND APP_2.balance < 100,
                  'APP_2', TRX.uuid, APP_2.cif, APP_2.event_type, APP_2.ts),
        NOT_LOAN AS trace(NOT_LOAN.event_type <> 'loan',
                  'NOT_LOAN', TRX.uuid, NOT_LOAN.cif, NOT_LOAN.event_type, NOT_LOAN.ts)
    ))


This scenario should be matched by the sample events, because:
- TRX is matched by the event with ts 2021-05-01 04:42:57
- APP_1 by the event with ts 2021-05-01 10:29:10
- APP_2 by the event with ts 2021-05-01 10:39:02
Unfortunately, I'm not getting any results, and it's not the watermarks' fault.
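
To double-check the reasoning above, the timestamp conditions from DEFINE and WITHIN can be re-evaluated by hand for these three events. The following is only a sketch in plain Java: `PatternTimingCheck` and `timingMatches` are hypothetical names, the WITHIN bound is treated as strictly less than 10 days (an assumption), and the non-timing conditions (event_type, v > 1000, balance < 100, NOT_LOAN) are not checked because they depend on the sample data.

```java
import java.time.Duration;
import java.time.LocalDateTime;

// Hypothetical helper, not part of the job itself: re-checks only the
// timestamp conditions of the pattern for one TRX / APP_1 / APP_2 candidate.
final class PatternTimingCheck {

    static boolean timingMatches(LocalDateTime trx, LocalDateTime app1, LocalDateTime app2) {
        // Rows are ordered by ts, so the sequence needs TRX <= APP_1 <= APP_2.
        boolean ordered = !app1.isBefore(trx) && !app2.isBefore(app1);
        // APP_1.ts < TRX.ts + INTERVAL '3' DAY
        boolean app1InTime = app1.isBefore(trx.plusDays(3));
        // APP_2.ts > APP_1.ts AND APP_2.ts < APP_1.ts + INTERVAL '7' DAY
        boolean app2InTime = app2.isAfter(app1) && app2.isBefore(app1.plusDays(7));
        // WITHIN INTERVAL '10' DAY, assumed here to mean "strictly less than 10 days"
        boolean withinWindow = Duration.between(trx, app2).compareTo(Duration.ofDays(10)) < 0;
        return ordered && app1InTime && app2InTime && withinWindow;
    }

    public static void main(String[] args) {
        LocalDateTime trx = LocalDateTime.parse("2021-05-01T04:42:57");
        LocalDateTime app1 = LocalDateTime.parse("2021-05-01T10:29:10");
        LocalDateTime app2 = LocalDateTime.parse("2021-05-01T10:39:02");
        System.out.println(timingMatches(trx, app1, app2)); // prints "true"
    }
}
```

So, as far as the time constraints go, the three events above satisfy the pattern.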

The trace function has the following code and gives me some logs:

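import java.util.Arrays;
import java.util.stream.Collectors;
import org.apache.flink.table.annotation.DataTypeHint;
import org.apache.flink.table.annotation.InputGroup;
import org.apache.flink.table.functions.ScalarFunction;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
public class TraceUDF extends ScalarFunction {
    private static final Logger log = LoggerFactory.getLogger(TraceUDF.class); // assumed SLF4J logger; the declaration of `log` was not shown
    public Boolean eval(Boolean condition, @DataTypeHint(inputGroup = InputGroup.ANY) Object... message) {
        log.info((Boolean.TRUE.equals(condition) ? "Condition true: " : "Condition false: ") + Arrays.stream(message).map(String::valueOf).collect(Collectors.joining(" "))); // null-safe for a NULL condition or NULL arguments
        return condition;
    }
}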

The log from this trace function is the following:

2021-07-06 13:09:43,762 INFO TraceUDF                             [] - Condition true: TRX 621456e9-389b-409b-aaca-bca99eeb43b3 0004091386 trx 2021-05-01T04:42:57
2021-07-06 13:12:28,914 INFO  TraceUDF                             [] - Condition true: ANY_EVENT 621456e9-389b-409b-aaca-bca99eeb43b3 0004091386 trx 2021-05-01T15:28:34
2021-07-06 13:12:28,915 INFO  TraceUDF                             [] - Condition false: APP_1 621456e9-389b-409b-aaca-bca99eeb43b3 0004091386 trx 2021-05-01T15:28:34
2021-07-06 13:12:28,915 INFO  TraceUDF                             [] - Condition false: TRX 433ac9bc-d395-457n-986c-19e30e375f2e 0004091386 trx 2021-05-01T15:28:34

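As you can see, 2 events are missing: the events with ts 2021-05-01 10:29:10 and 2021-05-01 10:39:02, which should match APP_1 and APP_2, never reach the trace function.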
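
The gap can also be confirmed mechanically from the trace output. The following is a minimal sketch in plain Java. `TraceLogCheck` and its methods are hypothetical names, and the regular expression assumes every trace line ends with `Condition true|false: <variable> <uuid> <cif> <event_type> <ts>`, as in the log above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the event timestamps that the trace function
// actually evaluated, so they can be compared with the expected ones.
final class TraceLogCheck {

    // Tail of a trace line: "Condition true: TRX <uuid> <cif> <event_type> <ts>"
    private static final Pattern TRACE = Pattern.compile(
            "Condition (true|false): (\\S+) (\\S+) (\\S+) (\\S+) (\\S+)$");

    // Returns the ts field of every trace line, in log order.
    static List<String> evaluatedTimestamps(List<String> logLines) {
        List<String> seen = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = TRACE.matcher(line.trim());
            if (m.find()) {
                seen.add(m.group(6)); // the last field is the event ts
            }
        }
        return seen;
    }

    // Returns the expected timestamps that never show up in the trace log.
    static List<String> missing(List<String> expected, List<String> logLines) {
        List<String> seen = evaluatedTimestamps(logLines);
        List<String> result = new ArrayList<>();
        for (String ts : expected) {
            if (!seen.contains(ts)) {
                result.add(ts);
            }
        }
        return result;
    }
}
```

Called with the four log lines above and the three expected timestamps (in the `2021-05-01T04:42:57` form used by the log), `missing` returns the two timestamps expected for APP_1 and APP_2.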
What can I do?
I failed to create a minimal example of this bug. Any other ideas?