Scalability Challenges in Big Data Science

Speaker:

Mikio Braun

Scaling complex data analysis applications has become one of the hottest topics in data science and big data in recent years. We want to apply increasingly complex analysis methods to ever larger data sets. To achieve this, we need to combine methods from computational statistics and machine learning with scalable technologies such as NoSQL databases, stream processing, and MapReduce frameworks, or with concurrency concepts such as actors. In practice this is often anything but trivial, as the two fields have quite different backgrounds. In this presentation we will talk about these challenges based on our experience with real-time social network analysis at TWIMPACT, and in the broader context of machine learning methods in general. We will discuss how concepts like eventually consistent data stores, MapReduce, and stream processing relate to the requirements of machine learning methods, how scaling is usually addressed in machine learning, where the common ground lies, and which issues we currently face.

Schedule info

Time slot:

4 June 15:20 - 16:00

Room:

Kleistsaal

Track:

scale

Experience level:

advanced

Presentation Format:

Long (40min)

Slides:

scalability big data science-mbraun-bbuzz12.pdf
