Think of the watermarks emitted by your chosen watermark generator as an assertion about the progression of time, based on domain knowledge, observation of elements, and connector specifics. The generator is asserting that any element observed after a given watermark will have a later event time, e.g. "we've reached 12:00 PM; subsequent events will have a timestamp greater than 12:00 PM".
Your specific output seems fine to me. It reads as: "event @ 11:59, watermark @ 12:00, event @ 12:02, watermark @ 12:01, event @ 12:03". Every event that arrives after a watermark has a timestamp greater than that watermark, so the assertion wasn't violated in this situation.
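To make that check concrete, here is a minimal sketch in plain Python. It is a simulation, not Flink code: the `find_late_events` function, the `hhmm` helper, and the tuple encoding of the stream are all hypothetical names made up for illustration. It assumes an event counts as late when its timestamp is less than or equal to the highest watermark seen so far.

```python
from typing import List, Tuple

# Hypothetical encoding: ("event", ts) or ("watermark", ts),
# with ts in minutes since midnight.
Item = Tuple[str, int]


def hhmm(hours: int, minutes: int) -> int:
    """Convert a wall-clock time to minutes since midnight."""
    return hours * 60 + minutes


def find_late_events(stream: List[Item]) -> List[int]:
    """Return timestamps of events that violate the watermark assertion."""
    current_watermark = None
    late = []
    for kind, ts in stream:
        if kind == "watermark":
            # Assumption: the effective watermark only moves forward.
            if current_watermark is None or ts > current_watermark:
                current_watermark = ts
        elif current_watermark is not None and ts <= current_watermark:
            # The event arrived after a watermark that claimed
            # no more events with timestamp <= watermark would come.
            late.append(ts)
    return late


# The sequence from the question: no event falls behind a watermark.
stream = [
    ("event", hhmm(11, 59)),
    ("watermark", hhmm(12, 0)),
    ("event", hhmm(12, 2)),
    ("watermark", hhmm(12, 1)),
    ("event", hhmm(12, 3)),
]
print(find_late_events(stream))  # []
```

By contrast, an event @ 11:58 arriving after the 12:00 watermark would be reported as late.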
Some operators provide special "late event" handling logic for the case where the assertion is violated. The process function is quite flexible: it provides timers for observing the progression of time (as driven by watermarks) and lets you handle late events as you see fit. Often a process function will buffer events until a certain time is reached.
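The buffering pattern can be sketched roughly as follows. Again this is a plain-Python simulation under stated assumptions, not the Flink `ProcessFunction` API: the `BufferingProcessor` class and its `on_event` / `on_watermark` methods are hypothetical stand-ins for processing an element and having a timer fire when the watermark advances.

```python
import heapq


class BufferingProcessor:
    """Hypothetical stand-in for a process function that buffers events."""

    def __init__(self):
        self.current_watermark = float("-inf")
        self._buffer = []   # min-heap of (timestamp, sequence, payload)
        self._seq = 0       # tie-breaker so equal timestamps keep arrival order
        self.emitted = []   # events released in event-time order
        self.late = []      # events that violated the watermark assertion

    def on_event(self, timestamp, payload):
        if timestamp <= self.current_watermark:
            # Late event: handle however you see fit; here we set it aside.
            self.late.append((timestamp, payload))
            return
        heapq.heappush(self._buffer, (timestamp, self._seq, payload))
        self._seq += 1

    def on_watermark(self, watermark):
        if watermark <= self.current_watermark:
            return  # assumption: the watermark never moves backwards
        self.current_watermark = watermark
        # Release everything the watermark says is now complete.
        while self._buffer and self._buffer[0][0] <= watermark:
            ts, _, payload = heapq.heappop(self._buffer)
            self.emitted.append((ts, payload))


p = BufferingProcessor()
p.on_event(5, "b")
p.on_event(3, "a")
p.on_watermark(4)      # releases the event at 3, keeps the one at 5
p.on_event(2, "late")  # behind the watermark, so routed to p.late
p.on_watermark(10)     # releases the event at 5
```

Buffering in a heap means events are released in event-time order regardless of arrival order, which is usually the reason for waiting on the watermark in the first place.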