We've been facing issues* w.r.t watermarks not supported per key, which led us to:
Either (a) run the job in Processing time for a KeyedStream -> compromising on use cases which revolve around catching time-based patterns or (b) run the job in Event time for multiple data streams (one data stream per key) -> this is not scalable as the number of operators grow linearly with the number of keys
To address this, we've done a quick (poc) change in the AbstractKeyedCEPPatternOperator to allow for the NFAs to progress based on timestamps extracted from the events arriving into the operator (and not from the watermarks). We've tested it against our usecase and are seeing a significant improvement in memory usage without compromising on the watermark functionality.
It'll be really helpful if someone from the cep dev group can take a look at this branch - https://github.com/jainshailesh/flink/commits/cep_changes and provide comments on the approach taken, and maybe guide us on the next steps for taking it forward.