Users can rely on the processing power of more Terrastore servers.
JAXenter speaks to Terrastore founder, Sergio Bossa, on the 0.8.0 release.
Terrastore 0.8.0 has just been released. In this interview,
JAXenter catches up with Terrastore founder Sergio Bossa to find
out what’s new in this release.
JAXenter: Terrastore 0.8.0 has just been
announced. What benefits does the new map/reduce processing
functionality bring to the product?
Sergio Bossa: The new map/reduce processing
will enable users to leverage Terrastore’s distributed architecture
to perform complex aggregations and queries over stored data.
Previously, as with any other product that lacks distributed
map/reduce or even just scatter/gather capability, the only way users
could perform an aggregation over stored data was to retrieve it on
the client side and do the complex processing locally, which
obviously doesn’t scale as data size grows.
With the introduction of map/reduce, users can rely on the
processing power of more Terrastore servers to perform the
aggregation in parallel, taking less time and consuming
almost no memory or CPU on the client side.
So let’s say we have stored documents with data about, among
other things, people’s ages, and you want to find the median age:
with map/reduce, it’s just a matter of writing a mapper function
to extract the desired piece of data, the age, and a
reduce function to compute the median value; Terrastore will take
care of running the parallel computation and returning you the
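The median-age example can be sketched roughly as follows. This is a minimal, language-neutral illustration in Python and not Terrastore's actual API: the function names (`mapper`, `map_partition`, `reducer`, `median_age`), the document layout, and the use of a thread pool to stand in for several servers are all assumptions made for the sake of the example.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import median


def mapper(document):
    """Extract the one field of interest; None if the document has no age."""
    return document.get("age")


def map_partition(documents):
    """Run the mapper over one (hypothetical) server's share of the documents."""
    return [age for age in map(mapper, documents) if age is not None]


def reducer(partial_results):
    """Merge the per-partition age lists and compute the median."""
    ages = [age for part in partial_results for age in part]
    return median(ages)


def median_age(partitions):
    # Each partition is mapped independently, as separate servers would do.
    with ThreadPoolExecutor() as pool:
        partial_results = list(pool.map(map_partition, partitions))
    return reducer(partial_results)


# Hypothetical sample data: three partitions of stored documents.
partitions = [
    [{"name": "Ann", "age": 31}, {"name": "Bob", "age": 45}],
    [{"name": "Cid", "age": 27}, {"name": "Dee"}],
    [{"name": "Eve", "age": 52}, {"name": "Fay", "age": 38}],
]
print(median_age(partitions))
```

The client only writes the two small functions and receives the final value; the documents themselves never have to be pulled to the client side.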
JAXenter: How has the events management
infrastructure been enhanced in 0.8.0?
Sergio Bossa: First, event listeners now have
access to both the old and new versions of a changed document, or to
the old version in the case of a removal: this is very useful for
performing actions that depend on what actually changed in the stored data.
You can now also perform actions that modify stored data, which
I refer to as “active listeners”: this opens up many use cases,
mainly centered on updating dependent documents
and/or creating processing chains that elaborate and store
intermediate document versions up to the final, desired one, with
everything happening inside the store with no user
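The idea of an "active listener" that sees both document versions and updates a dependent document might look something like the sketch below. It uses a toy in-memory store; the `Store` class, the `on_change`/`on_remove` callbacks, and the `order:`/`summary:` key scheme are hypothetical and are not taken from Terrastore.

```python
class Store:
    """Toy in-memory document store that notifies listeners of changes."""

    def __init__(self):
        self.docs = {}
        self.listeners = []

    def put(self, key, value):
        old = self.docs.get(key)  # None when the key is new
        self.docs[key] = value
        for listener in self.listeners:
            listener.on_change(self, key, old, value)

    def remove(self, key):
        old = self.docs.pop(key, None)
        if old is not None:
            for listener in self.listeners:
                listener.on_remove(self, key, old)


class OrderTotalListener:
    """Active listener: keeps a per-customer summary in sync with orders."""

    def on_change(self, store, key, old, new):
        if not key.startswith("order:"):
            return  # ignore summary documents, which also avoids loops
        # Having both versions lets us apply only the difference.
        delta = new["amount"] - (old["amount"] if old else 0)
        self._adjust(store, new["customer"], delta)

    def on_remove(self, store, key, old):
        if key.startswith("order:"):
            self._adjust(store, old["customer"], -old["amount"])

    def _adjust(self, store, customer, delta):
        summary_key = "summary:" + customer
        summary = store.docs.get(summary_key, {"total": 0})
        # The listener itself writes back into the store.
        store.put(summary_key, {"total": summary["total"] + delta})


store = Store()
store.listeners.append(OrderTotalListener())
store.put("order:1", {"customer": "ann", "amount": 10})
store.put("order:1", {"customer": "ann", "amount": 25})
store.put("order:2", {"customer": "ann", "amount": 5})
store.remove("order:1")
print(store.docs["summary:ann"])
```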
JAXenter: What is the ‘Adaptive Ensemble
Scheduling’ that comes as part of the recent release?
Sergio Bossa: This question needs a little
background. First, the ensemble is a Terrastore deployment mode that
provides horizontal scalability by joining together several
clusters, making them work as a whole, so that users can
transparently access any node in any cluster and take
advantage of the whole storage and processing capabilities.
All clusters in the ensemble gain access to each other by exchanging
“cluster views”: the “ensemble scheduler” drives this
view-exchange process. The new “Adaptive Ensemble Scheduling”
mechanism implements a more efficient, more reliable algorithm to
exchange views and so keep all clusters in the ensemble up to date:
it’s based on a dynamic algorithm that computes the optimal
frequency of the exchanges by taking into account previous data,
such as the number of joining/leaving nodes and how often they join
or leave, rather than absolute, fixed data.
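The interview does not spell out the actual algorithm, so the following is only one possible sketch of an adaptive schedule: the delay before the next view exchange shrinks when nodes recently joined or left, and grows back while membership is stable. The function name `next_interval`, the halving/doubling rule, and the bounds are all assumptions.

```python
def next_interval(current, changes, min_interval=1.0, max_interval=60.0):
    """Return the delay (seconds) before the next view exchange.

    current -- the delay used for the previous exchange
    changes -- number of nodes seen joining or leaving since then
    """
    if changes > 0:
        # Membership is churning: exchange views more often.
        proposed = current / (1 + changes)
    else:
        # Membership is stable: back off to save network traffic.
        proposed = current * 2
    # Keep the delay within fixed bounds.
    return max(min_interval, min(max_interval, proposed))


# Hypothetical sequence of observed membership changes per round.
interval = 10.0
for changes in [0, 0, 3, 1, 0]:
    interval = next_interval(interval, changes)
    print(f"changes={changes} -> next exchange in {interval:.1f}s")
```

A fixed schedule would use the same delay in every round; here the delay is derived from what was observed in previous rounds, which is the contrast the answer draws.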
JAXenter: What’s planned for the 0.9.0
Sergio Bossa: Prior to 0.9.0, we’ll probably go
through a few 0.8.x releases, providing bug fixes and minor
enhancements and features. Then, 0.9.0 will probably focus on
enhancing Terrastore’s ensemble functionality, in particular
regarding data replication, and optimizing some aspects of
Terrastore’s performance, in particular memory consumption …
that’s unless users demand something different and more