S4 Distributed Stream Platform

Yahoo! Open Source S4 Computing Platform

Jessica Thornsby

Platform originally developed for personalising search ads gets open sourced.

Yahoo! have open sourced the S4 distributed stream computing
platform for developing applications that process continuous,
unbounded streams of data. S4 was originally developed for
personalising search advertising products at Yahoo!, where it was
used to process recent queries, clicks and timing information.

S4 routes keyed data events with affinity to Processing
Elements, which consume the events and either emit events that
may be consumed by other Processing Elements or publish the
results. The nodes are symmetric with no centralised service and no
single point of failure, and a cluster management layer based on
ZooKeeper re-routes events to other servers automatically. The S4
team are currently encouraging those interested in stream
processing to get involved in the project.
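
To make the routing idea concrete, here is a minimal Python sketch of keyed events being sent with affinity to per-key Processing Elements. It is an illustration only and does not use the S4 API: the names Event, CountingPE, Node and Cluster are hypothetical, the CRC32-modulo placement is an assumed stand-in for whatever partitioning S4 actually uses, and the remove() method only imitates re-routing after a node loss without involving ZooKeeper.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    stream: str     # event type, e.g. "query" or "click"
    key: str        # value of the keyed attribute, e.g. a user id
    value: int = 1


class CountingPE:
    """Hypothetical Processing Element that counts events for one key."""

    def __init__(self, key):
        self.key = key
        self.count = 0

    def process(self, event):
        self.count += event.value
        # Emit a derived event that a downstream element could consume.
        return [Event("count", self.key, self.count)]


class Node:
    """One of several identical nodes; holds one PE instance per key seen."""

    def __init__(self, name):
        self.name = name
        self.pes = {}

    def handle(self, event):
        if event.key not in self.pes:
            self.pes[event.key] = CountingPE(event.key)
        return self.pes[event.key].process(event)


class Cluster:
    def __init__(self, names):
        self.nodes = [Node(n) for n in names]

    def owner(self, key):
        # Stable hash, so the same key always lands on the same node.
        index = zlib.crc32(key.encode("utf-8")) % len(self.nodes)
        return self.nodes[index]

    def route(self, event):
        return self.owner(event.key).handle(event)

    def remove(self, name):
        # Crude stand-in for failover: keys re-map onto the remaining nodes.
        self.nodes = [n for n in self.nodes if n.name != name]
```

Routing three Event("query", "alice") events through a Cluster(["n1", "n2", "n3"]) sends all of them to the same node, and the last emitted count event carries the value 3. In this sketch a removed node takes its counts with it, so the key starts again from zero on whichever node inherits it.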
