Storm: Business Use Cases
- Bhupendra patni
- Aug 22, 2015
This post shares business use cases in which Storm can be used to process continuous streams of data in real time, and outlines its benefits.
Business Use Cases
Apple, Amazon, Visa, Bank of America etc. - Retailers and financial services organizations need to process transactions in real time to prevent fraud.
Verizon, AT&T, T-Mobile etc. - Telecom companies need to analyze network traffic to allocate cellular towers in real time.
Swift, Uber, Lyft etc. - Transportation companies need to analyze real-time data to optimize driver routes and save time and fuel costs.
Google, Facebook, Twitter etc. - Internet companies need to monitor application logs in real time to detect and respond to application anomalies as they happen.
Apache Storm is a distributed system for processing continuous streams of real-time data. It augments the batch processing capabilities of Hadoop MapReduce and is commonly used for stream processing, continuous computation, distributed remote procedure calls etc.
Storm processes real-time data by dividing complex jobs into small tasks, which are handled by a series of workers performing different operations. The workers do not always form a linear pipeline: the flow can branch, so the overall job is structured as a directed acyclic graph.
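
The idea of small tasks wired into a branching graph can be sketched in plain Python. This is only an illustration and does not use the Storm API: the step functions, the `edges` mapping and `run_topology` are hypothetical names invented for this example. One step splits sentences into words, and its output branches to two independent downstream steps.

```python
from collections import Counter

# Hypothetical mini-topology (not the Storm API). Each step is a plain
# function that takes one item and returns a list of output items.

def split_sentence(sentence):
    """Break a sentence into lowercase words."""
    return sentence.lower().split()

def make_word_counter(counts):
    """Build a terminal step that counts how often each word is seen."""
    def count_word(word):
        counts[word] += 1
        return []  # terminal step: nothing flows further
    return count_word

def make_length_tracker(lengths):
    """Build a terminal step that counts words by their length."""
    def track_length(word):
        lengths[len(word)] += 1
        return []
    return track_length

def run_topology(source, edges, start):
    """Push every item from the source through the graph of steps."""
    for item in source:
        pending = [(start, item)]
        while pending:
            step, value = pending.pop()
            for out in step(value):
                # Every output is sent to all downstream steps (a branch).
                for next_step in edges.get(step, []):
                    pending.append((next_step, out))

word_counts = Counter()
length_counts = Counter()
count_word = make_word_counter(word_counts)
track_length = make_length_tracker(length_counts)

# split_sentence feeds two branches, so the graph is not a straight line.
edges = {split_sentence: [count_word, track_length]}
run_topology(["the cat sat", "the dog"], edges, split_sentence)
```

In a real cluster each step would run in parallel across many machines. Here everything runs in a single process to keep the shape of the graph easy to see.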
Batch Vs Real-time
In batch processing, data is stored on disk and moved to memory for processing, while in real-time processing data is processed primarily in memory and moved to disk afterwards.
Data in batch processing is usually accumulated for 15 minutes or more before it is processed, while data in real-time processing is less than a few minutes old.
The processing engine for batch is expected to run periodically, while the real-time engine is expected to be always running.
Batch processing takes a few minutes to hours, while real-time processing takes from under a second to a few seconds.
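
The contrast can be shown with a small Python sketch. It is not tied to any particular engine, and the names `batch_average` and `RunningAverage` are made up for this example. The batch function needs the whole collected data set before it can answer, while the streaming class keeps a small in-memory state and has an up-to-date answer after every event.

```python
def batch_average(stored_events):
    """Batch style: run periodically over data that was already collected."""
    return sum(stored_events) / len(stored_events)

class RunningAverage:
    """Streaming style: keep a small in-memory state, update it per event."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def update(self, value):
        """Fold one new event into the state and return the current average."""
        self.count += 1
        self.total += value
        return self.total / self.count

events = [10.0, 20.0, 30.0, 40.0]

# Batch: a single answer, available only once all events are collected.
batch_result = batch_average(events)

# Streaming: a fresh answer after each event arrives.
stream = RunningAverage()
stream_results = [stream.update(value) for value in events]
```

Both approaches end at the same final value. The difference is when the answer becomes available and how much data has to be held at once.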
Storm Benefits
Highly Scalable - Can be scaled horizontally.
Very Fast - Storm can process millions of events per second, depending on the size of the cluster.
Guaranteed Processing - Supports processing semantics such as at-least-once and, through the Trident API, exactly-once.
Fault Tolerant - Highly fault tolerant due to redundant services and operations with automated failover capabilities.
Programming Language Agnostic - Data processing logic can be developed in multiple languages.
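
To make the at-least-once guarantee listed above more concrete, here is a minimal Python sketch of the general acknowledge-and-replay idea. It is an assumption-laden illustration, not Storm's actual mechanism or API: `process_at_least_once`, `flaky_handler` and `max_attempts` are hypothetical names. A message that fails is put back on the queue and delivered again, so it can be seen more than once.

```python
from collections import deque

def process_at_least_once(messages, handler, max_attempts=3):
    """Redeliver each message until the handler succeeds or attempts run out."""
    queue = deque((msg, 1) for msg in messages)
    failed = []
    while queue:
        msg, attempt = queue.popleft()
        try:
            handler(msg)  # returning normally acts as the acknowledgement
        except Exception:
            if attempt < max_attempts:
                queue.append((msg, attempt + 1))  # replay the message later
            else:
                failed.append(msg)  # give up after max_attempts deliveries
    return failed

seen = []
failures_left = {"b": 1}  # simulate message "b" failing on first delivery

def flaky_handler(msg):
    """Record the delivery, then fail once for messages marked as flaky."""
    seen.append(msg)
    if failures_left.get(msg, 0) > 0:
        failures_left[msg] -= 1
        raise RuntimeError("simulated processing failure")

failed = process_at_least_once(["a", "b", "c"], flaky_handler)
```

Message "b" is delivered twice, which is why at-least-once processing may produce duplicates. Exactly-once semantics needs extra bookkeeping on top of replay so that a repeated delivery does not change the result.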
Thank you for reading and I hope the blog was helpful. Please provide your feedback.