Facebook’s Petabyte Scale Data Warehouse using Hive and Hadoop
Summary
Ashish Thusoo and Namit Jain explain how Facebook handles 12 TB of new compressed data every day with Hive’s help. Hive is an open source data warehousing framework built on Hadoop that lets developers analyze large datasets using SQL.
Bio
Ashish Thusoo currently manages the Facebook data infrastructure team. He is the project leader of Hive at Apache and a member of the Hadoop PMC. Namit Jain is a member of Facebook’s data infrastructure group and a committer for Hive. He previously worked for over 10 years at Oracle on streaming technologies, XML, replication and queuing.
About the conference
QCon is a conference organized by the community, for the community. The result is a high-quality conference experience where a tremendous amount of attention and investment has gone into having the best content on the most important topics presented by the leaders in our community. QCon is designed with the technical depth and enterprise focus of interest to technical team leads, architects, and project managers.