Processing millions of messages in milliseconds real time -- Architecture guide required

Deepak Sharma
Hi all,
I am looking for an architecture to ingest 10 million messages in real-time streaming mode.
If anyone has worked on a similar architecture, can you please point me to any documentation on it: what the architecture should look like, which components/big data ecosystem tools I should consider, etc.
The messages have to be in XML/JSON format and pass through a preprocessor engine or message enhancer, and then finally a processor.
I also thought about using a data cache for serving the data.
The data cache should have the capability to serve the historical data in milliseconds (maybe up to 30 days of data).
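
For concreteness, the flow described above (JSON message, enhancer, processor, cache with a 30-day window) can be sketched roughly as follows. This is only a single-process illustration in plain Python, not a recommendation of any specific tool. The names `preprocess`, `HistoryCache` and `process`, and the `enriched` field, are made up for the example, and it assumes messages arrive in timestamp order.

```python
import json
import time
from collections import deque

RETENTION_SECONDS = 30 * 24 * 60 * 60  # "up to 30 days of data"


def preprocess(raw):
    """Parse one JSON message and attach a placeholder enrichment field."""
    msg = json.loads(raw)
    msg["enriched"] = True  # hypothetical stand-in for the real message enhancer
    return msg


class HistoryCache:
    """In-memory cache of processed messages, evicted by age."""

    def __init__(self, retention=RETENTION_SECONDS):
        self.retention = retention
        self._items = deque()  # (timestamp, message) pairs, oldest first

    def put(self, ts, msg):
        self._items.append((ts, msg))
        self._evict(ts)

    def _evict(self, now):
        # Assumes timestamps are non-decreasing, so the oldest entry is on the left.
        while self._items and self._items[0][0] < now - self.retention:
            self._items.popleft()

    def since(self, ts):
        """Return all cached messages with timestamp >= ts."""
        return [m for t, m in self._items if t >= ts]


def process(raw, cache, now=None):
    """Run one raw message through the enhancer and store it in the cache."""
    now = time.time() if now is None else now
    msg = preprocess(raw)
    cache.put(now, msg)
    return msg
```

At the volumes mentioned here, each stage would be a distributed component rather than a function call; the sketch only shows how the stages hand data to each other.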

--
Thanks
Deepak

Re: Processing millions of messages in milliseconds real time -- Architecture guide required

Hung
Maybe you can refer to this: Kafka + Flink
http://data-artisans.com/kafka-flink-a-practical-how-to/

Re: Processing millions of messages in milliseconds real time -- Architecture guide required

Ken Krugler
In reply to this post by Deepak Sharma
This seems pretty similar to what you’re asking about:


Especially the part where they “...directly exposed the in-flight windows to be queried”, as that sounds like what you meant by “The data cache should have the capability to serve the historical data in milliseconds”
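
As a rough illustration of what "exposing in-flight windows to be queried" means: the window state itself answers reads, instead of waiting for the window to close and be written to a separate store. The sketch below is plain Python and not Flink's API; the class `QueryableWindowCounter` and its methods are hypothetical names, and it only counts events per key in fixed-size (tumbling) windows.

```python
from collections import defaultdict


class QueryableWindowCounter:
    """Per-key event counts in tumbling windows, readable while still open."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._counts = defaultdict(int)  # (window_start, key) -> count

    def _window_start(self, ts):
        # Align a timestamp to the start of the window that contains it.
        return ts - (ts % self.window_seconds)

    def add(self, key, ts):
        """Record one event for `key` at time `ts`."""
        self._counts[(self._window_start(ts), key)] += 1

    def query(self, key, ts):
        """Read the current count for the window containing `ts`."""
        return self._counts.get((self._window_start(ts), key), 0)
```

For example, with 60-second windows, events at t=5 and t=59 land in the same window and can be read back immediately with `query`, without any flush step.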

— Ken



On Apr 18, 2016, at 10:03pm, Deepak Sharma <[hidden email]> wrote:

Hi all,
I am looking for an architecture to ingest 10 million messages in real-time streaming mode.
If anyone has worked on a similar architecture, can you please point me to any documentation on it: what the architecture should look like, which components/big data ecosystem tools I should consider, etc.
The messages have to be in XML/JSON format and pass through a preprocessor engine or message enhancer, and then finally a processor.
I also thought about using a data cache for serving the data.
The data cache should have the capability to serve the historical data in milliseconds (maybe up to 30 days of data).

--
Thanks
Deepak

--------------------------
Ken Krugler
+1 530-210-6378
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr