14 Jun 2017
Berlin Buzzwords - the best bits
I've been in Germany for Berlin Buzzwords this week. Here are the best bits from the conference.
06 May 2017
Getting started with testing Kafka
There can be a gap between the people asked to support a Kafka cluster and the people whose job it is to produce to and consume from it. In this blog post we aim to get you up and running quickly with a local instance, as well as some portable tooling to test it.
03 May 2017
European Data Conferences
I turned my list of upcoming data conferences into a proper webpage.
28 Apr 2017
When does a project need Hadoop?
When should you use Hadoop in your big data project? Alice takes a slightly tongue-in-cheek look at when you should and shouldn't use Hadoop.
09 Feb 2017
Visited Countries Website
Say you travel quite a lot; you may even consider yourself a collector of countries. Say you're also more than happy using GitHub and related tools. Well then, I think I have the tool for you.
23 Jan 2017
Towards a realtime streaming architecture
An outline of the streaming architecture we are standardising around in the data tribe at Sky Betting & Gaming.
09 Dec 2016
When Hadoop tools disagree with each other
We recently saw an 8-year spike on one of our graphs. It caused much amusement when it was tweeted out, but there’s actually a good story behind this apparent 8-year lag in data processing.
02 Dec 2016
Guardian SSL Actively Harmful
Earlier this week the Guardian posted a piece about how they'd switched to SSL everywhere, how hard this was, and why it's a great thing. Using SSL/TLS is generally a good thing, but in this case it's actually harmful.
25 Nov 2016
Our Top 10 Big Data News Sources
Keeping on top of an area of technology that moves as rapidly as the big data ecosystem is hard. Our data tribe share some of their resources for keeping up to date.
05 May 2016
Measuring Impala performance using Apache JMeter
Our web performance teams regularly use JMeter to load test our websites and identify the performance of the various components involved, but it turns out you can also use it to directly test the performance of a Hadoop data warehouse.