Hadoop Content on InfoQ

  • Big Data Architecture at LinkedIn

    In this interview at QCon London, LinkedIn’s Sid Anand discusses the problems they face when serving high-traffic, high-volume data. Sid explains how they’re moving some use cases from Oracle to gain headroom, and lifts the hood on their open source search and data replication projects, including Kafka, Voldemort, Espresso and Databus.

    17:08
  • Optimizing for Big Data at Facebook

    Hive co-creator Ashish Thusoo describes the Big Data challenges Facebook faced and presents solutions in two areas: reducing the data footprint and CPU utilization. Generating 300 to 400 terabytes per day, they store data in RC files, which are split into blocks with the data laid out by column within each block, to get better compression. He also talks about the current Big Data ecosystem and trends for companies going forward.

    16:55
  • All things Hadoop

    In this interview, Ted Dunning talks about Hadoop, its current usage and its future. He explains the reasons for Hadoop's success and makes recommendations on how to start using it.

    25:55
  • Costin Leau on Spring Data, Spring Hadoop and Data Grid Patterns

    In this interview recorded at the JavaOne 2011 conference, Spring Hadoop project lead Costin Leau talks about the current state and upcoming features of the Spring Data and Spring Hadoop projects. He also talks about the Caching and Data Grid architecture patterns.

    28:15
  • Ville Tuulos on Big Data and Map/Reduce in Erlang and Python with Disco

    Ville Tuulos talks about Disco, the Map/Reduce framework for Python and Erlang, real-world data mining with Python, the advantages of Erlang for distributed and fault-tolerant software, and more.

    16:28
  • Ron Bodkin on Big Data and Analytics

    Ron Bodkin discusses big data architecture, real-time analytics, batch processing, map-reduce, and data science.

    22:51
  • What’s Next for jclouds?

    Adrian Cole discusses jclouds, his open source library that helps Java developers get started in the cloud and reuse their Java development skills. Cole also talks about some of the challenges of creating a cloud-agnostic library, such as the use of different hypervisors and the fact that various cloud implementations are written in different languages, such as VB, Python, and Ruby.

    21:02
  • Billy Newport Discusses Parallel Programming in Java

    Billy Newport talks to InfoQ about the need for higher-level abstractions to do parallel programming effectively on multi-core systems. The interview explores some approaches taken with MapReduce products such as Cascading and Pig for a Hadoop cluster, explores the limitations of the actor model and message passing, and touches on IBM's WebSphere eXtreme Scale (ObjectGrid) product.

    26:43