#45 Michal Klos, Localytics and the World of Big Data

Summary
Michal Klos of Localytics tells me about their big data stack and where he thinks the industry is going.

Details
Who he is, what he does; overview of the world of big data, history, batch processing, stream processing and micro-batching; databases, Apache Spark, separating storage and compute; where he thinks the industry is going in the next five years, more about Spark, data lakes, query federation, Presto; how to get started with a big data project, picking technologies, doing a test; most big data projects fail, you should start small, get cross-team involvement; how to scale to petabytes, start small with a short expected lifespan; technologies Localytics uses, blog, they are hiring.

 

Book Recommendations
Hadoop Application Architectures

I Heart Logs: Event Data, Stream Processing, and Data Integration

Systems Performance: Enterprise and the Cloud

Small Is Beautiful: Economics as if People Mattered

Brilliant!: Shuji Nakamura And the Revolution in Lighting Technology

Barbarians at the Gate: The Fall of RJR Nabisco

 

Download mp3 of podcast