Tech Mesh London 2012
Dean Wampler, Big Dataist, O'Reilly Author

Biography: Dean Wampler
Dean Wampler is a Principal Consultant at Think Big Analytics, where he specializes in "Big Data" problems and tools like Hadoop and Machine Learning. Besides Big Data, he specializes in Scala, the JVM ecosystem, JavaScript, Ruby, functional and object-oriented programming, and Agile methods. Dean is a frequent speaker at industry and academic conferences on these topics. He has a Ph.D. in Physics from the University of Washington.
Presentation: Panel Debate: What the hell is Big Data
Panel Debate.
Presentation: Beyond MapReduce
Apache Hadoop is the current darling of the "Big Data" world. At its core is the MapReduce computing model for decomposing large data-analysis jobs into smaller tasks and distributing those tasks around a cluster. MapReduce itself was pioneered at Google for indexing the Web and other computations over massive data sets.
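As a rough illustration of the model described above, here is a minimal single-process sketch of a word count expressed as map, shuffle, and reduce steps. It is not Hadoop code: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and a real cluster would run the map and reduce tasks in parallel on different machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map task: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group the emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce task: combine all values seen for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run the three steps in sequence over a list of input documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

Each call to map_phase depends only on its own document, and each call to reduce_phase depends only on its own key, which is what allows a framework to distribute those calls around a cluster.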
The strengths of MapReduce are cost-effective scalability and relative maturity. Its weaknesses are its batch orientation, making it unsuitable for real-time event processing, and the difficulty of implementing data analysis idioms in the MapReduce computing model.
We can address the weaknesses in several ways. First, higher-level programming languages, which provide common query and manipulation abstractions, make it easier to implement MapReduce programs. However, longer term, we need new distributed computing models that are more flexible for different problems and which provide better real-time performance.
We'll review these strengths and weaknesses of MapReduce and the Hadoop implementation, then discuss several emerging alternatives, such as Google's Pregel system for graph processing and Storm for event processing. We'll finish with some speculation about the longer-term future of Big Data.
Workshop: The Seductions of Scala
Dean was seduced by the Scala language several years ago. This hands-on tutorial will show you why. Attendees will see how Scala fixes many issues with Java's object model and type system, adds powerful Functional Programming features, and provides a great platform for creating Domain Specific Languages (DSLs) and distributed systems.
Keywords: Scala, Functional Programming, Concurrency, Distributed Systems
Target Audience: This tutorial will appeal to developers interested in functional programming and new languages on the JVM, especially for building distributed systems.
I will assume that you already know other programming languages and that you are comfortable writing code for this hands-on tutorial. Bring your laptop with Scala V2.10 installed and your favourite text editor or IDE.