#Big Data

SMACK — Next generation Big Data

Big Data becomes Fast Data

Big Data is changing. Buzzwords such as Hadoop, Storm, Pig and Hive are no longer the darlings of the industry; they are being replaced by a powerful duo: Fast Data and SMACK. Such a fast change in such a (relatively) young ecosystem raises the following questions: What is wrong with the current approach? What is the difference between Fast and Big Data? And what is SMACK?

“If you can cache everything in a very efficient way, you can often change the game”

Netflix OSS: Change the game with Hollow

Netflix Hollow is a Java library and comprehensive toolset for harnessing small to moderately sized in-memory datasets which are disseminated from a single producer to many consumers for read-only access. It is designed for servers that are busily serving requests at or near maximum capacity, and it aims to address the scaling challenges of in-memory datasets. Let’s see the advantages that come from using Netflix Hollow.

For a good cause

IBM joins R Consortium, aims to make analytics easier

As a (new) member of the R Consortium, IBM will work side by side with the R user community and support the project’s mission to pinpoint, create and implement infrastructure projects that drive standards and best practices for R code.

Open-source streaming analytics

Big Data with Apache Apex

It’s touted as the industry’s only open-source, enterprise-grade unified stream and batch processing platform. Apache Apex community manager Desmond Chan shows us what exactly that means and how this open-source engine handles big data.

Data analysis tool in a new version

Apache Spark 1.6 with Dataset API

After a preview version had been published at the end of November 2015, the final version of Apache Spark 1.6 is at long last ready for download. The update contains a total of over 1,000 changes; release highlights include a variety of performance improvements, the new Dataset API and expanded data science functions.

How to teach computers to learn

Machine learning – An introduction for programmers

If you search Google Scholar for “machine learning”, it returns over 1,800,000 publications. As the buzz around this technology grows, so too does its complexity. Sebastian Raschka, author of Packt’s “Python Machine Learning”, introduces us to the three types of machine learning.

Big Data tool comparisons

Which Logging Tool is right for me?

With a plethora of logging tools available at a range of price points, it can be hard to decide what to use. Rather than diving into a tonne of research, we’ve done it for you – a host of popular approaches to data processing have been fact-checked and outlined for your convenience.

Database decisions

MySQL is a great NoSQL

Nowhere else are business decisions as hype-oriented as in IT. And while NoSQL is all well and good, MySQL is often the sensible choice in terms of operational cost and scalability, says JAX London speaker Aviran Mordo.