shutil --- High-level file operations

Source code: Lib/shutil.py


The shutil module offers a number of high-level operations on files and collections of files. In particular, functions are provided which support file copying and removal. For operations on individual files, see also the os module.

Warning

Even the higher-level file copying functions (shutil.copy(), shutil.copy2()) cannot copy all file metadata.

On POSIX platforms, this means that file owner and group are lost as well as ACLs. On Mac OS, the resource fork and other metadata are not used. This means that resources will be lost and file type and creator codes will not be correct. On Windows, file owners, ACLs and alternate data streams are not copied.

Directory and files operations

  • shutil.copyfileobj(fsrc, fdst[, length])
  • Copy the contents of the file-like object fsrc to the file-like object fdst. The integer length, if given, is the buffer size. In particular, a negative length value means to copy the data without looping over the source data in chunks; by default the data is read in chunks to avoid uncontrolled memory consumption. Note that if the current file position of the fsrc object is not 0, only the contents from the current file position to the end of the file will be copied.
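The position-dependent behavior described above can be seen with in-memory file objects. This is a minimal sketch; the byte string and the 4-byte buffer size are arbitrary values chosen for illustration.

```python
import io
import shutil

# In-memory file-like objects stand in for real files here.
src = io.BytesIO(b"header:payload")
dst = io.BytesIO()

# Advance the source position past the 7-byte "header:" prefix.
src.seek(7)

# Copy in 4-byte chunks; only data from the current position onward is copied.
shutil.copyfileobj(src, dst, 4)

copied = dst.getvalue()
print(copied)  # b'payload'
```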

  • shutil.copyfile(src, dst, *, follow_symlinks=True)

  • Copy the contents (no metadata) of the file named src to a file named dst and return dst in the most efficient way possible. src and dst are path-like objects or path names given as strings.

dst must be the complete target file name; look at copy() for a copy that accepts a target directory path. If src and dst specify the same file, SameFileError is raised.

The destination location must be writable; otherwise, an OSError exception will be raised. If dst already exists, it will be replaced. Special files such as character or block devices and pipes cannot be copied with this function.

If follow_symlinks is false and src is a symbolic link, a new symbolic link will be created instead of copying the file src points to.

Changed in version 3.3: IOError used to be raised instead of OSError. Added follow_symlinks argument. Now returns dst.

Changed in version 3.4: Raise SameFileError instead of Error. Since the former is a subclass of the latter, this change is backward compatible.

Changed in version 3.8: Platform-specific fast-copy syscalls may be used internally in order to copy the file more efficiently. See the Platform-dependent efficient copy operations section.

  • exception shutil.SameFileError
  • This exception is raised if source and destination in copyfile() are the same file.

New in version 3.4.
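As an illustration of copyfile() and SameFileError, the sketch below copies a file inside a temporary directory and then tries to copy it onto itself. The file names data.txt and copy.txt are hypothetical and chosen only for this example.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "data.txt")  # hypothetical file name
    dst = os.path.join(tmp, "copy.txt")  # hypothetical file name
    with open(src, "w") as f:
        f.write("hello")

    # dst must be a complete file name, not a directory.
    returned = shutil.copyfile(src, dst)
    with open(dst) as f:
        copied_text = f.read()

    # Copying a file onto itself raises SameFileError.
    try:
        shutil.copyfile(src, src)
        same_file_raised = False
    except shutil.SameFileError:
        same_file_raised = True
```

Because SameFileError is a subclass of Error, code that already catches shutil.Error keeps working.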

  • shutil.copymode(src, dst, *, follow_symlinks=True)
  • Copy the permission bits from src to dst. The file contents, owner, and group are unaffected. src and dst are path-like objects or path names given as strings. If follow_symlinks is false, and both src and dst are symbolic links, copymode() will attempt to modify the mode of dst itself (rather than the file it points to). This functionality is not available on every platform; please see copystat() for more information. If copymode() cannot modify symbolic links on the local platform, and it is asked to do so, it will do nothing and return.

Changed in version 3.3: Added follow_symlinks argument.

  • shutil.copystat(src, dst, *, follow_symlinks=True)
  • Copy the permission bits, last access time, last modification time, and flags from src to dst. On Linux, copystat() also copies the "extended attributes" where possible. The file contents, owner, and group are unaffected. src and dst are path-like objects or path names given as strings.

If follow_symlinks is false, and src and dst both refer to symbolic links, copystat() will operate on the symbolic links themselves rather than the files the symbolic links refer to, reading the information from the src symbolic link and writing the information to the dst symbolic link.

Note

Not all platforms provide the ability to examine and modify symbolic links. Python itself can tell you what functionality is locally available.

  • If os.chmod in os.supports_follow_symlinks is True, copystat() can modify the permission bits of a symbolic link.

  • If os.utime in os.supports_follow_symlinks is True, copystat() can modify the last access and modification times of a symbolic link.

  • If os.chflags in os.supports_follow_symlinks is True, copystat() can modify the flags of a symbolic link. (os.chflags is not available on all platforms.)

On platforms where some or all of this functionality is unavailable, when asked to modify a symbolic link, copystat() will copy everything it can. copystat() never returns failure.

Please see os.supports_follow_symlinks for more information.

Changed in version 3.3: Added follow_symlinks argument and support for Linux extended attributes.
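The three membership checks listed in the note above could be run together as follows. This is only a sketch: the labels and the candidates dictionary are illustrative names, and os.chflags is looked up with getattr() because it does not exist on every platform.

```python
import os

# Map a human-readable label to the os function whose symlink support
# we want to check. os.chflags may be missing, hence getattr().
candidates = {
    "permission bits": os.chmod,
    "access/modification times": os.utime,
    "flags": getattr(os, "chflags", None),
}

support = {
    label: (func is not None and func in os.supports_follow_symlinks)
    for label, func in candidates.items()
}

for label, ok in support.items():
    print(f"{label}: {'supported' if ok else 'not supported'}")
```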

  • shutil.copy(src, dst, *, follow_symlinks=True)
  • Copies the file src to the file or directory dst. src and dst should be strings. If dst specifies a directory, the file will be copied into dst using the base filename from src. Returns the path to the newly created file.

If follow_symlinks is false, and src is a symbolic link, dst will be created as a symbolic link. If follow_symlinks is true and src is a symbolic link, dst will be a copy of the file src refers to.

copy() copies the file data and the file's permission mode (see os.chmod()). Other metadata, like the file's creation and modification times, is not preserved. To preserve all file metadata from the original, use copy2() instead.

Changed in version 3.3: Added follow_symlinks argument. Now returns path to the newly created file.

Changed in version 3.8: Platform-specific fast-copy syscalls may be used internally in order to copy the file more efficiently. See the Platform-dependent efficient copy operations section.

  • shutil.copy2(src, dst, *, follow_symlinks=True)
  • Identical to copy() except that copy2() also attempts to preserve file metadata.

When follow_symlinks is false, and src is a symbolic link, copy2() attempts to copy all metadata from the src symbolic link to the newly-created dst symbolic link. However, this functionality is not available on all platforms. On platforms where some or all of this functionality is unavailable, copy2() will preserve all the metadata it can; copy2() never raises an exception because it cannot preserve file metadata.

copy2() uses copystat() to copy the file metadata. Please see copystat() for more information about platform support for modifying symbolic link metadata.

Changed in version 3.3: Added follow_symlinks argument, try to copy extended file system attributes too (currently Linux only). Now returns path to the newly created file.

Changed in version 3.8: Platform-specific fast-copy syscalls may be used internally in order to copy the file more efficiently. See the Platform-dependent efficient copy operations section.
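The difference between copy() and copy2() is easiest to see through the modification time. The sketch below assumes a filesystem that stores modification times with at least one-second resolution; the file names and the timestamp value are arbitrary.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src.txt")
    with open(src, "w") as f:
        f.write("data")

    # Give the source an old, easily recognizable modification time.
    old_time = 1_000_000_000
    os.utime(src, (old_time, old_time))

    plain = shutil.copy(src, os.path.join(tmp, "plain.txt"))
    full = shutil.copy2(src, os.path.join(tmp, "full.txt"))

    # copy() leaves the new file with a fresh timestamp;
    # copy2() carries the source timestamp over via copystat().
    plain_mtime = os.stat(plain).st_mtime
    full_mtime = os.stat(full).st_mtime
```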

  • shutil.ignore_patterns(*patterns)
  • This factory function creates a function that can be used as a callable for copytree()'s ignore argument, ignoring files and directories that match one of the glob-style patterns provided. See the example below.

  • shutil.copytree(src, dst, symlinks=False, ignore=None, copy_function=copy2, ignore_dangling_symlinks=False, dirs_exist_ok=False)

  • Recursively copy an entire directory tree rooted at src to a directory named dst and return the destination directory. dirs_exist_ok dictates whether to raise an exception in case dst or any missing parent directory already exists.

Permissions and times of directories are copied with copystat(), individual files are copied using copy2().

If symlinks is true, symbolic links in the source tree are represented as symbolic links in the new tree and the metadata of the original links will be copied as far as the platform allows; if false or omitted, the contents and metadata of the linked files are copied to the new tree.

When symlinks is false, if the file pointed to by the symlink doesn't exist, an exception will be added to the list of errors raised in an Error exception at the end of the copy process. You can set the optional ignore_dangling_symlinks flag to true if you want to silence this exception. Notice that this option has no effect on platforms that don't support os.symlink().

If ignore is given, it must be a callable that will receive as its arguments the directory being visited by copytree(), and a list of its contents, as returned by os.listdir(). Since copytree() is called recursively, the ignore callable will be called once for each directory that is copied. The callable must return a sequence of directory and file names relative to the current directory (i.e. a subset of the items in its second argument); these names will then be ignored in the copy process. ignore_patterns() can be used to create such a callable that ignores names based on glob-style patterns.

If exception(s) occur, an Error is raised with a list of reasons.

If copy_function is given, it must be a callable that will be used to copy each file. It will be called with the source path and the destination path as arguments. By default, copy2() is used, but any function that supports the same signature (like copy()) can be used.

Raises an auditing event shutil.copytree with arguments src, dst.

Changed in version 3.3: Copy metadata when symlinks is false. Now returns dst.

Changed in version 3.2: Added the copy_function argument to be able to provide a custom copy function. Added the ignore_dangling_symlinks argument to silence dangling symlink errors when symlinks is false.

Changed in version 3.8: Platform-specific fast-copy syscalls may be used internally in order to copy the file more efficiently. See the Platform-dependent efficient copy operations section.

New in version 3.8: The dirs_exist_ok parameter.
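A short sketch of dirs_exist_ok in combination with ignore, assuming Python 3.8 or later. The directory layout (src, dst, pkg) and file names are made up for the example.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src")
    dst = os.path.join(tmp, "dst")
    os.makedirs(os.path.join(src, "pkg"))
    for name in ("pkg/mod.py", "pkg/mod.pyc", "notes.txt"):
        with open(os.path.join(src, *name.split("/")), "w") as f:
            f.write(name)

    os.makedirs(dst)  # the destination already exists

    # Without dirs_exist_ok=True this call would raise FileExistsError.
    shutil.copytree(src, dst, dirs_exist_ok=True,
                    ignore=shutil.ignore_patterns("*.pyc"))

    # Collect the copied files as forward-slash relative paths.
    copied = sorted(
        os.path.relpath(os.path.join(root, f), dst).replace(os.sep, "/")
        for root, _, files in os.walk(dst)
        for f in files
    )

print(copied)  # ['notes.txt', 'pkg/mod.py']
```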

  • shutil.rmtree(path, ignore_errors=False, onerror=None)
  • Delete an entire directory tree; path must point to a directory (but not a symbolic link to a directory). If ignore_errors is true, errors resulting from failed removals will be ignored; if false or omitted, such errors are handled by calling a handler specified by onerror or, if that is omitted, they raise an exception.

Note

On platforms that support the necessary fd-based functions, a symlink attack resistant version of rmtree() is used by default. On other platforms, the rmtree() implementation is susceptible to a symlink attack: given proper timing and circumstances, attackers can manipulate symlinks on the filesystem to delete files they wouldn't be able to access otherwise. Applications can use the rmtree.avoids_symlink_attacks function attribute to determine which case applies.

If onerror is provided, it must be a callable that accepts three parameters: function, path, and excinfo.

The first parameter, function, is the function which raised the exception; it depends on the platform and implementation. The second parameter, path, will be the path name passed to function. The third parameter, excinfo, will be the exception information returned by sys.exc_info(). Exceptions raised by onerror will not be caught.

Raises an auditing event shutil.rmtree with argument path.

Changed in version 3.3: Added a symlink attack resistant version that is used automatically if the platform supports fd-based functions.

Changed in version 3.8: On Windows, will no longer delete the contents of a directory junction before removing the junction.

  • rmtree.avoids_symlink_attacks
  • Indicates whether the current platform and implementation provides a symlink attack resistant version of rmtree(). Currently this is only true for platforms supporting fd-based directory access functions.

New in version 3.3.

  • shutil.move(src, dst, copy_function=copy2)
  • Recursively move a file or directory (src) to another location (dst) and return the destination.

If the destination is an existing directory, then src is moved inside that directory. If the destination already exists but is not a directory, it may be overwritten depending on os.rename() semantics.

If the destination is on the current filesystem, then os.rename() is used. Otherwise, src is copied to dst using copy_function and then removed. In case of symlinks, a new symlink pointing to the target of src will be created in or as dst and src will be removed.

If copy_function is given, it must be a callable that takes two arguments src and dst, and will be used to copy src to dst if os.rename() cannot be used. If the source is a directory, copytree() is called, passing it the copy_function. The default copy_function is copy2(). Using copy() as the copy_function allows the move to succeed when it is not possible to also copy the metadata, at the expense of not copying any of the metadata.

Changed in version 3.3: Added explicit symlink handling for foreign filesystems, thus adapting it to the behavior of GNU's mv. Now returns dst.

Changed in version 3.5: Added the copy_function keyword argument.

Changed in version 3.8: Platform-specific fast-copy syscalls may be used internally in order to copy the file more efficiently. See the Platform-dependent efficient copy operations section.
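The "existing directory" rule can be illustrated as follows. This is a sketch using a temporary directory; report.txt and archive are hypothetical names.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "report.txt")
    target_dir = os.path.join(tmp, "archive")
    os.makedirs(target_dir)
    with open(src, "w") as f:
        f.write("q1")

    # The destination is an existing directory, so src is moved inside it.
    new_path = shutil.move(src, target_dir)

    moved_into_dir = new_path == os.path.join(target_dir, "report.txt")
    source_gone = not os.path.exists(src)
    target_exists = os.path.isfile(new_path)
```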

  • shutil.disk_usage(path)
  • Return disk usage statistics about the given path as a named tuple with the attributes total, used and free, which are the amount of total, used and free space, in bytes. path may be a file or a directory.

New in version 3.3.

Changed in version 3.8: On Windows, path can now be a file or directory.

Availability: Unix, Windows.
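For instance, the named tuple can be formatted in GiB as sketched below. The choice of the current working directory and of GiB as the unit is arbitrary.

```python
import os
import shutil

usage = shutil.disk_usage(os.getcwd())

gib = 1024 ** 3  # bytes per GiB
for field in ("total", "used", "free"):
    print(f"{field}: {getattr(usage, field) / gib:.1f} GiB")
```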

  • shutil.chown(path, user=None, group=None)
  • Change owner user and/or group of the given path.

user can be a system user name or a uid; the same applies to group. At least one argument is required.

See also os.chown(), the underlying function.

Availability: Unix.

New in version 3.3.

  • shutil.which(cmd, mode=os.F_OK | os.X_OK, path=None)
  • Return the path to an executable which would be run if the given cmd was called. If no cmd would be called, return None.

mode is a permission mask passed to os.access(), by default determining if the file exists and is executable.

When no path is specified, os.environ is consulted, returning either the "PATH" value or a fallback of os.defpath.

On Windows, the current directory is always prepended to the path whether or not you use the default or provide your own, which is the behavior the command shell uses when finding executables. Additionally, when finding the cmd in the path, the PATHEXT environment variable is checked. For example, if you call shutil.which("python"), which() will search PATHEXT to know that it should look for python.exe within the path directories. For example, on Windows:

    >>> shutil.which("python")
    'C:\\Python33\\python.EXE'

New in version 3.3.

Changed in version 3.8: The bytes type is now accepted. If cmd type is bytes, the result type is also bytes.
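A portable sketch of which() that does not depend on what is installed: it looks up the running interpreter in its own directory through the path argument, and then a command name that is assumed not to exist. The second name is made up for the example.

```python
import os
import shutil
import sys

# Search only the directory that holds the running interpreter.
python_dir = os.path.dirname(sys.executable)
python_name = os.path.basename(sys.executable)
found = shutil.which(python_name, path=python_dir)

# A command name assumed to be absent from PATH yields None.
missing = shutil.which("no-such-command-hypothetical-12345")

print(found)
print(missing)
```

Checking the result against None before launching a subprocess gives a clearer error than letting the launch fail.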

  • exception shutil.Error
  • This exception collects exceptions that are raised during a multi-file operation. For copytree(), the exception argument is a list of 3-tuples (srcname, dstname, exception).

Platform-dependent efficient copy operations

Starting from Python 3.8 all functions involving a file copy (copyfile(), copy(), copy2(), copytree(), and move()) may use platform-specific "fast-copy" syscalls in order to copy the file more efficiently (see bpo-33671). "fast-copy" means that the copying operation occurs within the kernel, avoiding the use of userspace buffers in Python as in "outfd.write(infd.read())".

On macOS fcopyfile is used to copy the file content (not metadata).

On Linux os.sendfile() is used.

On Windows shutil.copyfile() uses a bigger default buffer size (1 MiB instead of 64 KiB) and a memoryview()-based variant of shutil.copyfileobj() is used.

If the fast-copy operation fails and no data was written in the destination file then shutil will silently fall back on using the less efficient copyfileobj() function internally.

Changed in version 3.8.

copytree example

This example is the implementation of the copytree() function, described above, with the docstring omitted. It demonstrates many of the other functions provided by this module.

    import os
    from shutil import copy2, copystat, Error

    def copytree(src, dst, symlinks=False):
        names = os.listdir(src)
        os.makedirs(dst)
        errors = []
        for name in names:
            srcname = os.path.join(src, name)
            dstname = os.path.join(dst, name)
            try:
                if symlinks and os.path.islink(srcname):
                    # recreate the link itself rather than copying its target
                    linkto = os.readlink(srcname)
                    os.symlink(linkto, dstname)
                elif os.path.isdir(srcname):
                    copytree(srcname, dstname, symlinks)
                else:
                    copy2(srcname, dstname)
                # XXX What about devices, sockets etc.?
            except OSError as why:
                errors.append((srcname, dstname, str(why)))
            # catch the Error from the recursive copytree so that we can
            # continue with other files
            except Error as err:
                errors.extend(err.args[0])
        try:
            copystat(src, dst)
        except OSError as why:
            # can't copy file access times on Windows;
            # the winerror attribute only exists on Windows
            if getattr(why, 'winerror', None) is None:
                errors.append((src, dst, str(why)))
        if errors:
            raise Error(errors)

Another example that uses the ignore_patterns() helper:

    from shutil import copytree, ignore_patterns

    copytree(source, destination, ignore=ignore_patterns('*.pyc', 'tmp*'))

This will copy everything except .pyc files and files or directories whose name starts with tmp.

Another example that uses the ignore argument to add a logging call:

    from shutil import copytree
    import logging

    def _logpath(path, names):
        logging.info('Working in %s', path)
        return []   # nothing will be ignored

    copytree(source, destination, ignore=_logpath)

rmtree example

This example shows how to remove a directory tree on Windows where some of the files have their read-only bit set. It uses the onerror callback to clear the readonly bit and reattempt the remove. Any subsequent failure will propagate.

    import os, stat
    import shutil

    def remove_readonly(func, path, _):
        "Clear the readonly bit and reattempt the removal"
        os.chmod(path, stat.S_IWRITE)
        func(path)

    shutil.rmtree(directory, onerror=remove_readonly)

Archiving operations

New in version 3.2.

Changed in version 3.5: Added support for the xztar format.

High-level utilities to create and read compressed and archived files are also provided. They rely on the zipfile and tarfile modules.

  • shutil.make_archive(base_name, format[, root_dir[, base_dir[, verbose[, dry_run[, owner[, group[, logger]]]]]]])
  • Create an archive file (such as zip or tar) and return its name.

base_name is the name of the file to create, including the path, minus any format-specific extension. format is the archive format: one of "zip" (if the zlib module is available), "tar", "gztar" (if the zlib module is available), "bztar" (if the bz2 module is available), or "xztar" (if the lzma module is available).

root_dir is a directory that will be the root directory of the archive; for example, we typically chdir into root_dir before creating the archive.

base_dir is the directory where we start archiving from; i.e. base_dir will be the common prefix of all files and directories in the archive.

root_dir and base_dir both default to the current directory.

If dry_run is true, no archive is created, but the operations that would be executed are logged to logger.

owner and group are used when creating a tar archive. By default, uses the current owner and group.

logger must be an object compatible with PEP 282, usually an instance of logging.Logger.

The verbose argument is unused and deprecated.

Raises an auditing event shutil.make_archive with arguments base_name, format, root_dir, base_dir.

Changed in version 3.8: The modern pax (POSIX.1-2001) format is now used instead of the legacy GNU format for archives created with format="tar".

  • shutil.get_archive_formats()
  • Return a list of supported formats for archiving. Each element of the returned sequence is a tuple (name, description).

By default shutil provides these formats:

  • zip: ZIP file (if the zlib module is available).

  • tar: Uncompressed tar file. Uses POSIX.1-2001 pax format for new archives.

  • gztar: gzip'ed tar-file (if the zlib module is available).

  • bztar: bzip2'ed tar-file (if the bz2 module is available).

  • xztar: xz'ed tar-file (if the lzma module is available).

You can register new formats or provide your own archiver for any existing formats, by using register_archive_format().

  • shutil.register_archive_format(name, function[, extra_args[, description]])
  • Register an archiver for the format name.

function is the callable that will be used to create archives. The callable will receive the base_name of the file to create, followed by the base_dir (which defaults to os.curdir) to start archiving from. Further arguments are passed as keyword arguments: owner, group, dry_run and logger (as passed in make_archive()).

If given, extra_args is a sequence of (name, value) pairs that will be used as extra keyword arguments when the archiver callable is used.

description is used by get_archive_formats() which returns the list of archivers. Defaults to an empty string.

  • shutil.unregister_archive_format(name)
  • Remove the archive format name from the list of supported formats.
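As a sketch of the registration workflow, the toy archiver below does not compress anything; it only writes the relative paths of the files it is asked to archive into a text file. The format name "listing", the function make_listing, and the ".listing" extension are all hypothetical. The archiver accepts **kwargs so that the owner and group keyword arguments passed by make_archive() are tolerated.

```python
import os
import shutil
import tempfile

def make_listing(base_name, base_dir, dry_run=False, logger=None, **kwargs):
    """Hypothetical archiver: write the file paths under base_dir to a text file."""
    target = base_name + ".listing"
    if not dry_run:
        paths = sorted(
            os.path.join(root, name).replace(os.sep, "/")
            for root, _, files in os.walk(base_dir)
            for name in files
        )
        with open(target, "w") as f:
            f.write("\n".join(paths))
    return target

shutil.register_archive_format("listing", make_listing,
                               description="plain-text file listing")
try:
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "data")
        os.makedirs(root)
        for name in ("a.txt", "b.txt"):
            with open(os.path.join(root, name), "w") as f:
                f.write(name)

        # make_archive() changes into root_dir, so base_dir is os.curdir here.
        result = shutil.make_archive(os.path.join(tmp, "out"), "listing",
                                     root_dir=root)
        with open(result) as f:
            listing_lines = f.read().splitlines()
    format_names = [name for name, _ in shutil.get_archive_formats()]
finally:
    # Remove the toy format again so it does not leak into other code.
    shutil.unregister_archive_format("listing")
```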

  • shutil.unpack_archive(filename[, extract_dir[, format]])

  • Unpack an archive. filename is the full path of the archive.

extract_dir is the name of the target directory where the archive is unpacked. If not provided, the current working directory is used.

format is the archive format: one of "zip", "tar", "gztar", "bztar", or "xztar", or any other format registered with register_unpack_format(). If not provided, unpack_archive() will use the archive file name extension and see if an unpacker was registered for that extension. In case none is found, a ValueError is raised.

Changed in version 3.7: Accepts a path-like object for filename and extract_dir.
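A round trip through make_archive() and unpack_archive() might look like the sketch below. It uses the uncompressed "tar" format so that it does not depend on zlib, bz2, or lzma; the directory and file names are hypothetical. Only archives from trusted sources should be unpacked this way.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    project = os.path.join(tmp, "project")
    os.makedirs(project)
    with open(os.path.join(project, "readme.txt"), "w") as f:
        f.write("hello")

    # Archive the contents of project/ as <tmp>/backup.tar.
    archive = shutil.make_archive(os.path.join(tmp, "backup"), "tar",
                                  root_dir=project)

    out_dir = os.path.join(tmp, "restored")
    # format is omitted, so it is inferred from the ".tar" extension.
    shutil.unpack_archive(archive, out_dir)

    with open(os.path.join(out_dir, "readme.txt")) as f:
        restored_text = f.read()
    archive_name = os.path.basename(archive)
```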

  • shutil.register_unpack_format(name, extensions, function[, extra_args[, description]])
  • Registers an unpack format. name is the name of the format and extensions is a list of extensions corresponding to the format, like .zip for Zip files.

function is the callable that will be used to unpack archives. The callable will receive the path of the archive, followed by the directory the archive must be extracted to.

When provided, extra_args is a sequence of (name, value) tuples that will be passed as keyword arguments to the callable.

description can be provided to describe the format, and will be returned by the get_unpack_formats() function.

  • shutil.unregister_unpack_format(name)
  • Unregister an unpack format. name is the name of the format.
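To illustrate registering and unregistering an unpacker, the sketch below defines a toy bundle format in which every line has the form name=content. The format name "lines", the ".lines" extension, and the function unpack_lines are hypothetical and exist only for this example.

```python
import os
import shutil
import tempfile

def unpack_lines(filename, extract_dir):
    """Hypothetical unpacker: each 'name=content' line becomes one file."""
    os.makedirs(extract_dir, exist_ok=True)
    with open(filename) as f:
        for line in f:
            name, _, content = line.rstrip("\n").partition("=")
            with open(os.path.join(extract_dir, name), "w") as out:
                out.write(content)

shutil.register_unpack_format("lines", [".lines"], unpack_lines,
                              description="toy line-based bundle")
try:
    with tempfile.TemporaryDirectory() as tmp:
        bundle = os.path.join(tmp, "demo.lines")
        with open(bundle, "w") as f:
            f.write("a.txt=alpha\nb.txt=beta\n")

        out_dir = os.path.join(tmp, "out")
        # The ".lines" extension selects the unpacker registered above.
        shutil.unpack_archive(bundle, out_dir)
        extracted = sorted(os.listdir(out_dir))
    registered = "lines" in [n for n, _, _ in shutil.get_unpack_formats()]
finally:
    shutil.unregister_unpack_format("lines")

still_registered = "lines" in [n for n, _, _ in shutil.get_unpack_formats()]
```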

  • shutil.get_unpack_formats()

  • Return a list of all registered formats for unpacking. Each element of the returned sequence is a tuple (name, extensions, description).

By default shutil provides these formats:

  • zip: ZIP file (unpacking compressed files works only if the corresponding module is available).

  • tar: uncompressed tar file.

  • gztar: gzip'ed tar-file (if the zlib module is available).

  • bztar: bzip2'ed tar-file (if the bz2 module is available).

  • xztar: xz'ed tar-file (if the lzma module is available).

You can register new formats or provide your own unpacker for any existing formats, by using register_unpack_format().

Archiving example

In this example, we create a gzip'ed tar-file archive containing all files found in the .ssh directory of the user:

    >>> from shutil import make_archive
    >>> import os
    >>> archive_name = os.path.expanduser(os.path.join('~', 'myarchive'))
    >>> root_dir = os.path.expanduser(os.path.join('~', '.ssh'))
    >>> make_archive(archive_name, 'gztar', root_dir)
    '/Users/tarek/myarchive.tar.gz'

The resulting archive contains:

    $ tar -tzvf /Users/tarek/myarchive.tar.gz
    drwx------ tarek/staff       0 2010-02-01 16:23:40 ./
    -rw-r--r-- tarek/staff     609 2008-06-09 13:26:54 ./authorized_keys
    -rwxr-xr-x tarek/staff      65 2008-06-09 13:26:54 ./config
    -rwx------ tarek/staff     668 2008-06-09 13:26:54 ./id_dsa
    -rwxr-xr-x tarek/staff     609 2008-06-09 13:26:54 ./id_dsa.pub
    -rw------- tarek/staff    1675 2008-06-09 13:26:54 ./id_rsa
    -rw-r--r-- tarek/staff     397 2008-06-09 13:26:54 ./id_rsa.pub
    -rw-r--r-- tarek/staff   37192 2010-02-06 18:23:10 ./known_hosts

Querying the size of the output terminal

  • shutil.get_terminal_size(fallback=(columns, lines))
  • Get the size of the terminal window.

For each of the two dimensions, the environment variable, COLUMNS and LINES respectively, is checked. If the variable is defined and the value is a positive integer, it is used.

When COLUMNS or LINES is not defined, which is the common case, the terminal connected to sys.stdout is queried by invoking os.get_terminal_size().

If the terminal size cannot be successfully queried, either because the system doesn't support querying, or because we are not connected to a terminal, the value given in the fallback parameter is used. fallback defaults to (80, 24) which is the default size used by many terminal emulators.

The value returned is a named tuple of type os.terminal_size.

See also: The Single UNIX Specification, Version 2, Other Environment Variables.

New in version 3.3.
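A typical use is sizing output to the terminal width. The sketch below passes an explicit fallback of (100, 30), an arbitrary value used when no terminal is attached (for example when output is piped).

```python
import shutil

# The fallback is only used when neither the environment variables
# nor the terminal query yield a usable size.
size = shutil.get_terminal_size(fallback=(100, 30))

print(f"{size.columns} columns x {size.lines} lines")

# Draw a horizontal rule spanning the reported width.
rule = "-" * size.columns
print(rule)
```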