Google File System

Requirements in GFS

  • Big

  • Fast

  • Sharding

  • Automatic recovery

  • Single data center

  • Internal use

  • Big sequential reads and writes


The system has hundreds of clients, one master, and many chunk servers (CS), each with one or two disks. The master knows where the chunks are: it keeps a list of files and, for each file, its chunk information.
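The division of labor above can be sketched in a few lines. This is only an illustration, not the real GFS interface: the names `ChunkServer`, `Master`, `lookup`, and `read_file` are made up for the example, and everything runs in memory with no network. The point is that the client asks the master only where the chunks are, then fetches the data from a chunk server.

```python
class ChunkServer:
    """Hypothetical chunk server: stores chunk contents by chunk handle."""

    def __init__(self, name):
        self.name = name
        self.chunks = {}  # chunk handle -> bytes

    def read(self, handle):
        return self.chunks[handle]


class Master:
    """Hypothetical master: holds only metadata, never file contents."""

    def __init__(self):
        self.files = {}      # filename -> ordered list of chunk handles
        self.locations = {}  # chunk handle -> chunk servers holding a copy

    def lookup(self, filename):
        # Return, in file order, each chunk handle and where it lives.
        return [(h, self.locations[h]) for h in self.files[filename]]


def read_file(master, filename):
    """Client side: metadata from the master, data from chunk servers."""
    data = b""
    for handle, servers in master.lookup(filename):
        # Any replica would do; the first one is picked for simplicity.
        data += servers[0].read(handle)
    return data


# Usage: one file split into two chunks stored on different servers.
cs1, cs2 = ChunkServer("cs1"), ChunkServer("cs2")
cs1.chunks["h1"] = b"hello "
cs2.chunks["h2"] = b"world"
master = Master()
master.files["/notes.txt"] = ["h1", "h2"]
master.locations = {"h1": [cs1], "h2": [cs2]}
print(read_file(master, "/notes.txt"))
```

Keeping file data off the master is what lets a single master serve many clients in this design: it answers small metadata questions while the bulk traffic goes to the chunk servers.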

Master Data

Two tables matter most. One maps each filename to an array of chunk handles. The other maps each chunk handle to a list of chunk servers, a version number, the primary replica, and the lease expiration time. The master also keeps a log and a checkpoint on disk.
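A minimal sketch of this state, under stated assumptions: the class and method names (`ChunkInfo`, `MasterState`, `add_chunk`, `grant_lease`, `take_checkpoint`, `recover`) are invented for illustration, the "disk" is plain Python objects, and the choice to log only chunk creation (not leases) is an assumption of the sketch rather than something these notes specify. The default lease duration is likewise just a placeholder.

```python
import copy
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ChunkInfo:
    servers: list = field(default_factory=list)  # chunk servers with a replica
    version: int = 0                             # version number
    primary: Optional[str] = None                # current primary, if any
    lease_expiration: float = 0.0                # when the primary's lease ends


class MasterState:
    def __init__(self):
        self.files = {}    # table 1: filename -> list of chunk handles
        self.chunks = {}   # table 2: chunk handle -> ChunkInfo
        self.log = []      # stand-in for the on-disk log
        self.checkpoint = ({}, {})  # stand-in for the on-disk checkpoint

    def add_chunk(self, filename, handle, servers):
        # Record the operation in the log before changing the tables.
        self.log.append((filename, handle, list(servers)))
        self._apply_add_chunk(filename, handle, servers)

    def _apply_add_chunk(self, filename, handle, servers):
        self.files.setdefault(filename, []).append(handle)
        self.chunks[handle] = ChunkInfo(servers=list(servers), version=1)

    def grant_lease(self, handle, server, duration=60.0, now=None):
        # Assumption: leases live in memory only and are not logged.
        now = time.time() if now is None else now
        info = self.chunks[handle]
        info.primary = server
        info.lease_expiration = now + duration

    def take_checkpoint(self):
        # Snapshot both tables, after which the older log entries are not needed.
        self.checkpoint = (copy.deepcopy(self.files), copy.deepcopy(self.chunks))
        self.log.clear()

    def recover(self):
        # Rebuild the tables: load the checkpoint, then replay the log.
        files, chunks = self.checkpoint
        self.files = copy.deepcopy(files)
        self.chunks = copy.deepcopy(chunks)
        for filename, handle, servers in self.log:
            self._apply_add_chunk(filename, handle, servers)
```

The checkpoint bounds how much log has to be replayed after a restart: recovery starts from the snapshot and applies only the entries written since.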

