It’s touted as the industry’s only open-source, enterprise-grade unified stream and batch processing platform. Apache Apex community manager Desmond Chan shows us exactly what that means and how this open-source engine handles big data.
If you search Google Scholar for “machine learning”, it returns over 1,800,000 publications. As the buzz around this technology grows, so too does its complexity. Sebastian Raschka, author of Packt’s “Python Machine Learning”, introduces us to the three types of machine learning.
At the SAP TechEd in Barcelona, SAP brought its new technology down to developer level, showcasing the latest in SAP Hana, such as the Hana Cloud Platform’s usage of Cloud Foundry, while calling on IT to innovate and build a ‘digital core’, rather than just integrate.
Who’s the sheriff in today’s data centre wild west? Postgres advocate Pierre Fricke looks at the risks that NoSQL will pose in years to come, while doing his best to deflate the Hadoop hype.
With a plethora of logging tools available at a range of price points, it can be hard to decide which to use. Rather than diving into a tonne of research, we’ve done it for you – a host of popular approaches to data processing have been fact-checked and outlined for your convenience.
Nowhere else are business decisions as hype-oriented as in IT. And while NoSQL is all well and good, MySQL is often the sensible choice in terms of operational cost and scalability, says JAX London speaker Aviran Mordo.
Considering a change in your architecture? If you’re looking at Apache Spark, it might be worth seeing what Alex Zhitnitsky has to say about the top 5 things you should consider before the jump. Software architecture is hard.
Data crunchers can rejoice at the sight of Spark 1.4 – support for R, Python 3 plus a load of clustering and container management improvements all make their way to the top of the highlights reel for this cluster computing framework.
Built for scalability across multiple machines, the JSON document store RethinkDB is a distributed database that uses an easy query language. Here’s how to get started.
A total of approximately 480 JIRA tickets is what it takes to update Apache Hive to Version 1.2. The data warehouse software for Apache Hadoop has already reached its third release of the year, with the Hive community continuing its growth.
Although Facebook famously ditched Cassandra to use HBase for its messenger service, the NoSQL database remains largely overlooked. Ubeeko CEO Ghislain Mazars takes a look under the hood of HBase features.
In the final chapter of Cory Isaacson’s data modelling series, he explains why it’s necessary to disrupt your beautifully normalized data model for web-scale performance.
For the next instalment of Cory Isaacson’s Data Modelling series, we’re keeping our ‘Angry Shards’ database normalised while adding complexity.
In the second instalment of Cory Isaacson’s crash-course guide to Data Modelling, we learn how to design a data model good enough to last forever.