You know, for search. Querying 24 Billion Records in 900ms.
Who doesn't love building highly available, scalable systems holding multiple terabytes of data? Recently we had the pleasure of cracking some tough nuts, and we'd love to share with the community our findings from designing, building, and operating a 120-node, 6TB Elasticsearch (and Hadoop) cluster:

- Dynamically increasing and decreasing cluster size
- Amazon Web Services vs. dedicated hardware
- I/O performance on solid state disks, Amazon Elastic Block Store (EBS), or instance store
- Choosing the right EC2 instance type, dimensioning your hardware
- Tuning the Elasticsearch configuration
- Out of memory: implementing custom facets
- Keeping the cluster responsive while heavily indexing
- Automated deployment (e.g. Puppet), version updates
- Monitoring/tools (Ganglia, Zabbix, elasticsearch-head, and BigDesk)
- Costs (EC2 vs. dedicated), how to save money
- Integration with Hadoop: using Hadoop/MapReduce/Hive to fill the search cluster, HDFS for backups
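To give a flavor of two of those topics (tuning the Elasticsearch configuration, and keeping the cluster responsive while heavily indexing), here is a minimal `elasticsearch.yml` sketch. These are standard settings from the 0.x-era Elasticsearch the talk predates 1.0, not the cluster's actual configuration, and the values are illustrative assumptions to be tuned per hardware:

```yaml
# elasticsearch.yml -- illustrative sketch, not the production config from the talk

# Lock the JVM heap into RAM so the OS never swaps it out
# (pair with "ulimit -l unlimited" for the elasticsearch user)
bootstrap.mlockall: true

# Default refresh interval for new indices. Raising it (or setting it to -1
# via the index-settings API during a bulk load) trades search freshness for
# indexing throughput, which helps keep the cluster responsive while indexing.
index.refresh_interval: 30s

# Share of the JVM heap reserved for in-memory indexing buffers (default: 10%);
# a heavy-indexing cluster typically benefits from more.
indices.memory.index_buffer_size: 30%
```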
Watch the video of Jodok Batlogg's talk here.