Highly Scalable Discriminative Spam Filtering

Speaker:

Michael Brückner

Content-based email spam filtering remains a technological challenge. The commercial incentive for spam senders results in an arms race between filtering methods and spam obfuscation techniques. The main difficulty is one of scale: email service providers and large companies have to filter millions or even billions of emails per day. Whereas these providers must employ filters that are accurate for emails of all sorts and languages, spam senders can exploit local weaknesses in one specific type of message. Hence, spam filters should be trained on as many labeled ham and spam emails as are available, which requires highly scalable discriminative classifiers such as support vector machines and logistic regression. In the last couple of years, several new algorithms have been developed that make it possible to train these models on hundreds of millions of training instances with millions of attributes. In this talk I will review these methods and strategies in the context of email spam filtering.
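
To make the idea of a scalable discriminative classifier concrete, here is a minimal sketch of logistic regression trained by stochastic gradient descent on hashed bag-of-words features. It is an illustration only and not necessarily one of the methods covered in the talk; the names `NUM_BUCKETS`, `hashed_features`, `predict`, and `train`, the bucket count, the learning rate, and the toy messages are all assumptions made for this example.

```python
import math
import zlib

# Hypothetical size of the hashed feature space; memory use is bounded by
# this value no matter how many distinct tokens appear in the mail stream.
NUM_BUCKETS = 2 ** 18


def hashed_features(text):
    """Map each lowercased token to a bucket index (the hashing trick)."""
    tokens = text.lower().split()
    return sorted({zlib.crc32(tok.encode("utf-8")) % NUM_BUCKETS for tok in tokens})


def predict(weights, features):
    """Return the estimated probability that a message is spam."""
    score = sum(weights.get(i, 0.0) for i in features)
    score = max(-30.0, min(30.0, score))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-score))


def train(examples, epochs=20, learning_rate=0.5):
    """Fit logistic regression with one SGD update per message.

    Each update touches only the features present in that message, so the
    cost per email does not grow with the size of the feature space.
    """
    weights = {}
    for _ in range(epochs):
        for text, label in examples:
            feats = hashed_features(text)
            error = predict(weights, feats) - label  # gradient of the log loss
            for i in feats:
                weights[i] = weights.get(i, 0.0) - learning_rate * error
    return weights


# Toy data, invented for illustration: label 1 is spam, label 0 is ham.
examples = [
    ("cheap pills buy now", 1),
    ("win money now", 1),
    ("meeting agenda for monday", 0),
    ("lunch tomorrow with the team", 0),
]
weights = train(examples)
spam_probability = predict(weights, hashed_features("cheap pills"))
ham_probability = predict(weights, hashed_features("team meeting monday"))
```

Because the model is updated one message at a time and stores only the weights it has touched, the same loop could in principle stream over a very large corpus; a production system would add regularization, richer features, and parallel training.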

Watch Michael Brückner's video talk here.

Schedule info

Time slot:

4 June 16:10 - 16:30

Room:

Kleistsaal

Track:

scale

Experience level:

intermediate

Presentation Format:

Short (20min)

Slides:

Spam Filtering-mbrückner-bbuzz12.pdf
