Hadoop - lessons learned


Hadoop has proven to be an invaluable tool for many companies over the past few years. Yet it has its ways, and knowing them up front can save valuable time. This session is a rundown of the ever-recurring lessons learned from running various Hadoop clusters in production since version 0.15. What to expect from Hadoop, and what not? How to integrate Hadoop into existing infrastructure? Which data formats to use? What compression? Small files vs. big files? Append or not? Essential configuration and operations tips. What about querying all the data? The session also covers the project, the community, and pointers to interesting projects that complement the Hadoop experience. It ranges from high-level discussion to quite technical details. Check out the slides of the talk here.

Schedule info
Time slot: 4 June 11:55 - 12:35
Presentation Format: Long (40min)