org.apache.hadoop.fs
Class ChecksumFileSystem

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by org.apache.hadoop.fs.FileSystem
          extended by org.apache.hadoop.fs.FilterFileSystem
              extended by org.apache.hadoop.fs.ChecksumFileSystem
All Implemented Interfaces:
Configurable
Direct Known Subclasses:
DistributedFileSystem, InMemoryFileSystem, LocalFileSystem

public abstract class ChecksumFileSystem
extends FilterFileSystem

Abstract checksummed FileSystem. It provides a basic implementation of a checksummed FileSystem, which creates a checksum file for each raw file. It generates and verifies checksums on the client side.
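
The client-side generation and verification described above can be sketched in plain Java. This is a minimal illustration, not the Hadoop implementation: the class ChunkedCrcSketch, its method names, and the choice of java.util.zip.CRC32 over fixed-size chunks are assumptions made for the example.

```java
import java.util.zip.CRC32;

public class ChunkedCrcSketch {
    // Compute one CRC32 value per chunk of bytesPerSum bytes (hypothetical helper).
    static long[] checksums(byte[] data, int bytesPerSum) {
        int chunks = (data.length + bytesPerSum - 1) / bytesPerSum; // round up
        long[] sums = new long[chunks];
        CRC32 crc = new CRC32();
        for (int i = 0; i < chunks; i++) {
            int off = i * bytesPerSum;
            int len = Math.min(bytesPerSum, data.length - off);
            crc.reset();
            crc.update(data, off, len);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    // Return the index of the first chunk that fails verification, or -1 if all match.
    static int firstBadChunk(byte[] data, int bytesPerSum, long[] expected) {
        long[] actual = checksums(data, bytesPerSum);
        int common = Math.min(actual.length, expected.length);
        for (int i = 0; i < common; i++) {
            if (actual[i] != expected[i]) {
                return i;
            }
        }
        // A length mismatch is reported at the first chunk that has no counterpart.
        return actual.length == expected.length ? -1 : common;
    }
}
```

A writer would store the result of checksums alongside the data, and a reader would call firstBadChunk on the bytes it reads back to locate corruption.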

Author:
Hairong Kuang

Field Summary
 
Fields inherited from class org.apache.hadoop.fs.FilterFileSystem
fs
 
Fields inherited from class org.apache.hadoop.fs.FileSystem
LOG
 
Constructor Summary
ChecksumFileSystem(FileSystem fs)
           
 
Method Summary
 void completeLocalOutput(Path fsOutputFile, Path tmpLocalFile)
          Called when we're all done writing to the target.
 void copyFromLocalFile(boolean delSrc, Path src, Path dst)
          The src file is on the local disk.
 void copyToLocalFile(boolean delSrc, Path src, Path dst)
          The src file is under FS, and the dst is on the local disk.
 void copyToLocalFile(Path src, Path dst, boolean copyCrc)
          The src file is under FS, and the dst is on the local disk.
 FSDataOutputStream create(Path f, boolean overwrite, int bufferSize, short replication, long blockSize, Progressable progress)
          Opens an FSDataOutputStream at the indicated Path with write-progress reporting.
 boolean delete(Path f)
          Delete Path f, whether it is a regular file or a directory.
static double getApproxChkSumLength(long size)
           
 int getBytesPerSum()
          Return the number of bytes per checksum.
 Path getChecksumFile(Path file)
          Return the name of the checksum file associated with a file.
 long getChecksumFileLength(Path file, long fileSize)
          Return the length of the checksum file given the size of the actual file.
 FileSystem getRawFileSystem()
          Get the raw file system.
static boolean isChecksumFile(Path file)
          Return true iff file is a checksum file name.
 Path[] listPaths(Path f)
          Filter raw files in the given path using the default checksum filter.
 Path[] listPaths(Path[] files)
          Filter raw files in the given paths using the default checksum filter.
 void lock(Path f, boolean shared)
          Obtain a lock on the given Path
 boolean mkdirs(Path f)
          Make the given file and all non-existent parents into directories.
 FSDataInputStream open(Path f, int bufferSize)
          Opens an FSDataInputStream at the indicated Path.
 void release(Path f)
          Release the lock
 boolean rename(Path src, Path dst)
          Rename files or directories.
 boolean reportChecksumFailure(Path f, FSDataInputStream in, long inPos, FSDataInputStream sums, long sumsPos)
          Report a checksum error to the file system.
 boolean setReplication(Path src, short replication)
          Set replication for an existing file.
 Path startLocalOutput(Path fsOutputFile, Path tmpLocalFile)
          Returns a local File that the user can write output to.
 
Methods inherited from class org.apache.hadoop.fs.FilterFileSystem
checkPath, close, exists, getBlockSize, getConf, getDefaultBlockSize, getDefaultReplication, getFileCacheHints, getLength, getName, getReplication, getUri, getWorkingDirectory, initialize, isDirectory, makeQualified, setWorkingDirectory
 
Methods inherited from class org.apache.hadoop.fs.FileSystem
closeAll, copyFromLocalFile, copyToLocalFile, create, create, create, create, create, create, create, createNewFile, get, get, getContentLength, getLocal, getNamed, getUsed, globPaths, globPaths, isFile, listPaths, listPaths, moveFromLocalFile, moveToLocalFile, open, parseArgs
 
Methods inherited from class org.apache.hadoop.conf.Configured
setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ChecksumFileSystem

public ChecksumFileSystem(FileSystem fs)

Method Detail

getApproxChkSumLength

public static double getApproxChkSumLength(long size)

getRawFileSystem

public FileSystem getRawFileSystem()
Get the raw file system.


getChecksumFile

public Path getChecksumFile(Path file)
Return the name of the checksum file associated with a file.


isChecksumFile

public static boolean isChecksumFile(Path file)
Return true iff file is a checksum file name.
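
The relationship between getChecksumFile and isChecksumFile can be illustrated with a small sketch. It assumes a naming convention in which the checksum file is a hidden sibling named ".<name>.crc"; this page does not state the convention, so treat both the convention and the helper names below as assumptions.

```java
public class ChecksumNameSketch {
    // Derive the checksum file name for a data file name (assumed ".<name>.crc" convention).
    static String checksumName(String fileName) {
        return "." + fileName + ".crc";
    }

    // True if the name looks like a checksum file name under the assumed convention.
    static boolean isChecksumName(String name) {
        return name.startsWith(".") && name.endsWith(".crc");
    }
}
```

Under this assumption, checksumName("part-00000") yields ".part-00000.crc", which isChecksumName accepts, while the data file name itself is rejected.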


getChecksumFileLength

public long getChecksumFileLength(Path file,
                                  long fileSize)
Return the length of the checksum file given the size of the actual file.
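
One plausible way to compute such a length is sketched below. It assumes a layout with a fixed-size header followed by one 4-byte checksum per bytesPerSum bytes of data; the layout, the headerBytes parameter, and the class name are assumptions for illustration and are not taken from this page.

```java
public class ChecksumLengthSketch {
    // Length of a checksum file for fileSize bytes of data, under the assumed layout:
    // headerBytes of header, then one 4-byte checksum per (possibly partial) chunk.
    static long checksumFileLength(long fileSize, int bytesPerSum, int headerBytes) {
        if (bytesPerSum <= 0) {
            throw new IllegalArgumentException("bytesPerSum must be positive");
        }
        long chunks = (fileSize + bytesPerSum - 1) / bytesPerSum; // round up
        return headerBytes + chunks * 4L;
    }
}
```

With bytesPerSum = 512 and an 8-byte header, a 1024-byte file needs 8 + 2 * 4 = 16 bytes of checksum data, and a 1025-byte file needs 20.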


getBytesPerSum

public int getBytesPerSum()
Return the number of bytes per checksum.


open

public FSDataInputStream open(Path f,
                              int bufferSize)
                       throws IOException
Opens an FSDataInputStream at the indicated Path.

Overrides:
open in class FilterFileSystem
Parameters:
f - the file name to open
bufferSize - the size of the buffer to be used.
Throws:
IOException

create

public FSDataOutputStream create(Path f,
                                 boolean overwrite,
                                 int bufferSize,
                                 short replication,
                                 long blockSize,
                                 Progressable progress)
                          throws IOException
Opens an FSDataOutputStream at the indicated Path with write-progress reporting.

Overrides:
create in class FilterFileSystem
Parameters:
f - the file name to open
overwrite - if true, an existing file with this name is overwritten; if false, an error is thrown when the file already exists.
bufferSize - the size of the buffer to be used.
replication - required block replication for the file.
Throws:
IOException

setReplication

public boolean setReplication(Path src,
                              short replication)
                       throws IOException
Set replication for an existing file. Implements the abstract setReplication method of FileSystem.

Overrides:
setReplication in class FilterFileSystem
Parameters:
src - file name
replication - new replication
Returns:
true if successful; false if file does not exist or is a directory
Throws:
IOException

rename

public boolean rename(Path src,
                      Path dst)
               throws IOException
Rename files or directories.

Overrides:
rename in class FilterFileSystem
Throws:
IOException

delete

public boolean delete(Path f)
               throws IOException
Delete Path f, whether it is a regular file or a directory.

Overrides:
delete in class FilterFileSystem
Throws:
IOException

listPaths

public Path[] listPaths(Path[] files)
                 throws IOException
Filter raw files in the given paths using the default checksum filter.

Overrides:
listPaths in class FileSystem
Parameters:
files - a list of paths
Returns:
a list of files under the source paths
Throws:
IOException

listPaths

public Path[] listPaths(Path f)
                 throws IOException
Filter raw files in the given path using the default checksum filter.

Overrides:
listPaths in class FilterFileSystem
Parameters:
f - source path
Returns:
a list of files under the source path
Throws:
IOException
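
The effect of a checksum filter on a listing can be sketched as follows. The example works on plain file names rather than Path objects, and it assumes that checksum files are recognised by a ".<name>.crc" pattern; both the pattern and the class name are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ChecksumFilterSketch {
    // Return the names that are not checksum files, preserving the input order.
    static List<String> withoutChecksumFiles(List<String> names) {
        List<String> kept = new ArrayList<>();
        for (String name : names) {
            boolean isChecksum = name.startsWith(".") && name.endsWith(".crc");
            if (!isChecksum) {
                kept.add(name);
            }
        }
        return kept;
    }
}
```

A caller that lists a directory therefore sees only the data files, while the checksum files stay hidden behind the file system abstraction.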

mkdirs

public boolean mkdirs(Path f)
               throws IOException
Description copied from class: FilterFileSystem
Make the given file and all non-existent parents into directories. Has the semantics of Unix 'mkdir -p'. Existence of the directory hierarchy is not an error.

Overrides:
mkdirs in class FilterFileSystem
Throws:
IOException

lock

public void lock(Path f,
                 boolean shared)
          throws IOException
Description copied from class: FilterFileSystem
Obtain a lock on the given Path

Overrides:
lock in class FilterFileSystem
Throws:
IOException

release

public void release(Path f)
             throws IOException
Description copied from class: FilterFileSystem
Release the lock

Overrides:
release in class FilterFileSystem
Throws:
IOException

copyFromLocalFile

public void copyFromLocalFile(boolean delSrc,
                              Path src,
                              Path dst)
                       throws IOException
Description copied from class: FilterFileSystem
The src file is on the local disk. Add it to FS at the given dst name. delSrc indicates if the source should be removed

Overrides:
copyFromLocalFile in class FilterFileSystem
Throws:
IOException

copyToLocalFile

public void copyToLocalFile(boolean delSrc,
                            Path src,
                            Path dst)
                     throws IOException
The src file is under FS, and the dst is on the local disk. Copy it from FS control to the local dst name.

Overrides:
copyToLocalFile in class FilterFileSystem
Throws:
IOException

copyToLocalFile

public void copyToLocalFile(Path src,
                            Path dst,
                            boolean copyCrc)
                     throws IOException
The src file is under FS, and the dst is on the local disk. Copy it from FS control to the local dst name. If src and dst are directories, the copyCrc parameter determines whether to copy CRC files.

Throws:
IOException

startLocalOutput

public Path startLocalOutput(Path fsOutputFile,
                             Path tmpLocalFile)
                      throws IOException
Description copied from class: FilterFileSystem
Returns a local File that the user can write output to. The caller provides both the eventual FS target name and the local working file. If the FS is local, we write directly into the target. If the FS is remote, we write into the tmp local area.

Overrides:
startLocalOutput in class FilterFileSystem
Throws:
IOException

completeLocalOutput

public void completeLocalOutput(Path fsOutputFile,
                                Path tmpLocalFile)
                         throws IOException
Description copied from class: FilterFileSystem
Called when we're all done writing to the target. A local FS will do nothing, because we've written to exactly the right place. A remote FS will copy the contents of tmpLocalFile to the correct target at fsOutputFile.

Overrides:
completeLocalOutput in class FilterFileSystem
Throws:
IOException
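
The startLocalOutput and completeLocalOutput pair describes a write-locally-then-publish pattern. A minimal sketch using java.nio.file is shown below; the targetIsLocal flag, the class name, and the demo method are assumptions introduced for the example and are not part of this API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class LocalOutputSketch {
    // Pick the file the caller should write to: the target itself if it is local,
    // otherwise the temporary local file.
    static Path startLocalOutput(Path fsOutputFile, Path tmpLocalFile, boolean targetIsLocal) {
        return targetIsLocal ? fsOutputFile : tmpLocalFile;
    }

    // Publish the output: nothing to do for a local target, copy for a remote one.
    static void completeLocalOutput(Path fsOutputFile, Path tmpLocalFile, boolean targetIsLocal) {
        if (targetIsLocal) {
            return;
        }
        try {
            Files.copy(tmpLocalFile, fsOutputFile, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demonstration: treat the target as remote, write via the temporary file, publish, read back.
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("local-output-sketch");
            Path target = dir.resolve("target.txt");
            Path tmp = dir.resolve("tmp.txt");
            Path out = startLocalOutput(target, tmp, false);
            Files.write(out, "payload".getBytes(StandardCharsets.UTF_8));
            completeLocalOutput(target, tmp, false);
            return new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch a plain file copy stands in for the transfer to a remote file system.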

reportChecksumFailure

public boolean reportChecksumFailure(Path f,
                                     FSDataInputStream in,
                                     long inPos,
                                     FSDataInputStream sums,
                                     long sumsPos)
Report a checksum error to the file system.

Parameters:
f - the file name containing the error
in - the stream open on the file
inPos - the position of the beginning of the bad data in the file
sums - the stream open on the checksum file
sumsPos - the position of the beginning of the bad data in the checksum file
Returns:
true if a retry is necessary
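
How a reader might act on the boolean result can be sketched as a retry loop. The sketch is not the Hadoop read path: the Supplier standing in for a read, the BooleanSupplier standing in for the failure report, and the maxAttempts bound are all assumptions made for the example.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;
import java.util.zip.CRC32;

public class ChecksumRetrySketch {
    // Read data, verify it against expectedCrc, and retry while the failure report asks for it.
    static byte[] readVerified(Supplier<byte[]> reader, long expectedCrc,
                               BooleanSupplier reportFailure, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            byte[] data = reader.get();
            CRC32 crc = new CRC32();
            crc.update(data, 0, data.length);
            if (crc.getValue() == expectedCrc) {
                return data; // verified
            }
            // Report the error; a false result means retrying is pointless.
            if (!reportFailure.getAsBoolean()) {
                break;
            }
        }
        throw new IllegalStateException("checksum error");
    }
}
```

The loop gives up either when the report declines a retry or when the attempt bound is reached, so a persistently corrupt source cannot cause an endless loop.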


Copyright © 2006 The Apache Software Foundation