Apache Spark & Storm Use Cases – Big Data Cloud Meetup – April 3 2014



A special meetup brought to you by Big Data Cloud, with the support and sponsorship of Symantec, to showcase the power and beauty of open source technologies as they pertain to enterprise business cases.


  • 6:30 to 7:00 pm
    – Real Time Stream Processing & Batch Analytics Use Cases at Symantec.
    Steve Hallett & Sourabh Satish, Symantec. Symantec regularly collects terabytes of metadata and uses that information to provide best-of-breed security, backup and availability solutions. We will cover some of the real-time and batch analytics use cases, how we are solving them today and how Symantec is building a next-generation platform to address the needs of the future.
  • 7:10 to 7:55 pm
    – Real-time Recommendation Systems using Apache Storm.
    Pranab Ghosh, Third Eye CSS LLC. In recommendation engines, a user’s past history of engagement with different items is generally a key input. However, in many situations in an enterprise’s business cycle, it is necessary to generate recommendations based on user activity in real time. In this session we will show how to turn real-time click streams into meaningful recommendations. Pranab will discuss the real-time recommendation feature of Sifarish, an open source project built on Hadoop, Storm and Redis. Sifarish is a recommendation engine that does content-based recommendation as well as social collaborative filtering based recommendation.
  • 8:00 to 8:45 pm
    – How Hadoop and the New Interactive Query Engines like Apache Shark Change the Game Forever.
    David P. Mariani, AtScale Inc. Remember the promise of a centralized data warehouse, able to store all your company’s data in one place? Unfortunately, technology couldn’t live up to the promise. With the emergence of interactive query engines like Apache Shark, Cloudera Impala and Hortonworks’ Stinger, we are seeing a new opportunity to bring data back together at scale. In this session, you will see how Klout built one of the world’s largest data services using Hadoop and virtually no commercial software. You’ll also see how AtScale helped an online gaming company understand its customers’ behavior and improve game play by leveraging Apache Shark and Hadoop to create a flexible dimensional data model that adapts to new data without complicated ETL, data transformations or pre-aggregation.


Steve Hallett

Steve Hallett, VP – Cloud Infrastructure Engineering, recently joined Symantec to lead the infrastructure and platform development for Symantec’s new Cloud Platform Engineering group. The CPE group is chartered with building and running essential platform and infrastructure shared services for next generation Symantec products and services.

Sourabh Satish

Sourabh Satish is a Distinguished Engineer in the Security Technology and Response group in Symantec’s Office of the CTO, where he leads research and development of security engines and technologies. Mr. Satish has been at Symantec for more than 14 years. He is a prolific innovator with 120 issued patents. In the 18 years since earning his Bachelor’s Degree in Computer Science and Engineering in India, Mr. Satish has worked as an engineer and technical lead on many security technologies such as network IDS, firewall, policy compliance, host IDS, server security, VOIP security, OS and application security, behavioral security, personal information protection and machine learning based applications. For the last several years, Mr. Satish and his team have been applying various innovative machine learning approaches to handle and process large amounts of data, mine intelligence and build applications that can help protect millions of Symantec customers or improve and streamline internal processes for malware analysis.

Pranab Ghosh

A 25+ year veteran of the software industry, Pranab Ghosh has worked with a myriad of technologies and platforms, including mainframes, real-time and systems programming, Java and enterprise applications, big data and cloud technologies. For the last several years, he has been working on Hadoop, Hive, NoSQL databases and the surrounding big data ecosystem, as well as distributed computation and data mining. He previously worked for Oracle, HP, Yahoo, Motorola, Apple, Accenture and many startups and mid-size companies. He is the owner of several big data open source projects hosted on GitHub. He is an active blogger covering topics around big data, cloud and machine learning.

David P. Mariani

David P. Mariani is CEO of AtScale, Inc., an incubating software startup bringing business intelligence into Hadoop. Prior to AtScale, David was VP Engineering of Klout, a social analytics data service that scores over 450 million user profiles daily and collects over 12 billion events across the social web. Previously, David ran the analytics and data pipelines for Yahoo!’s consumer sites and advertising services, where he built the world’s largest cube and drove early Hadoop development and adoption.

Check the details on the Big Data Cloud meetup site: http://www.meetup.com/BigDataCloud/events/171351342/


Posted April 2nd, 2014