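Python Modules for Working with Archive Files

Python's zipfile and tarfile modules create and extract archives. Both can also add new files to an existing archive and extract a single named file from one.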

Python Version: 3.5+

zip

import zipfile

# Create a new archive
z = zipfile.ZipFile('docs.zip', 'w')
# Add the files to archive
z.write("c1.py")
z.write("c2.py")
z.close()

# Add a file to an existing archive
z = zipfile.ZipFile('docs.zip', 'a')
z.write("test.conf")
z.close()

# Extract all files
z = zipfile.ZipFile('docs.zip', 'r')
z.extractall()
z.close()

# Extract a specific file

# List every file name in the archive
z = zipfile.ZipFile('docs.zip', 'r')
for i in z.namelist():
    print(i)

# Extract one file by name
z.extract("c1.py")
z.close()
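
The explicit close() calls above can be replaced by a with statement, which closes the archive even if an error occurs. Note also that ZipFile stores files uncompressed by default (ZIP_STORED); passing compression=zipfile.ZIP_DEFLATED actually compresses them. The following is a minimal sketch under those assumptions; the helper names make_zip and list_zip are hypothetical, not part of zipfile.

```python
import os
import tempfile
import zipfile


def make_zip(archive_path, files):
    """Write each (source_path, archive_name) pair into a deflate-compressed zip."""
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as z:
        for source_path, archive_name in files:
            z.write(source_path, arcname=archive_name)


def list_zip(archive_path):
    """Return the member names stored in the archive."""
    with zipfile.ZipFile(archive_path, "r") as z:
        return z.namelist()


if __name__ == "__main__":
    # Work in a temporary directory so the demo leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "c1.py")
        with open(src, "w") as f:
            f.write("print('hello')\n" * 100)
        archive = os.path.join(tmp, "docs.zip")
        make_zip(archive, [(src, "c1.py")])
        print(list_zip(archive))
```

ZIP_DEFLATED depends on the zlib module, which is present in most Python builds.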

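Notes on extraction:

  • path sets the directory the file is extracted into
  • pwd is the password used to decrypt encrypted files in the archive (a bytes object)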
def extract(self, member, path=None, pwd=None):
    """Extract a member from the archive to the current working directory,
       using its full name. Its file information is extracted as accurately
       as possible. `member' may be a filename or a ZipInfo object. You can
       specify a different directory using `path'.
    """
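
To make the path argument concrete, here is a small sketch that extracts one member into a chosen directory. The helper name extract_member and the directory name out are assumptions for illustration only. The zipfile module cannot create encrypted archives, so pwd is only mentioned in a comment.

```python
import os
import tempfile
import zipfile


def extract_member(archive_path, member, dest_dir):
    """Extract one member into dest_dir and return the path of the extracted file."""
    with zipfile.ZipFile(archive_path, "r") as z:
        # For an encrypted archive, pwd=b"..." would be passed here as well.
        return z.extract(member, path=dest_dir)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "docs.zip")
        # Build a one-file archive directly from a string.
        with zipfile.ZipFile(archive, "w") as z:
            z.writestr("c1.py", "print('c1')\n")
        out_dir = os.path.join(tmp, "out")
        print(extract_member(archive, "c1.py", out_dir))
```

extract() creates the destination directory if it does not exist and returns the path of the file it wrote.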

tar.gz

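import tarfile

# Create a new archive. Mode 'w' writes an uncompressed tar;
# use 'w:gz' with a .tar.gz name for gzip compression.
t = tarfile.open('docs.tar', 'w')
# Add the files to archive
t.add("c1.py")
# arcname stores the file under a different name inside the archive
t.add("c2.py", arcname="tc2.py")
t.close()

# Add a file to an existing archive (append mode only works on uncompressed tar files)
t = tarfile.open('docs.tar', 'a')
t.add("test.conf")
t.close()

# Extract all files
t = tarfile.open('docs.tar', 'r')
t.extractall()
t.close()

# Extract a specific file
# List every file name in the archive
t = tarfile.open('docs.tar', 'r')
for i in t.getnames():
    print(i)

# Extract one file by name
t.extract("tc2.py")

# Look up the member object by name and extract using that object
o = t.getmember('tc2.py')
t.extract(o)
t.close()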
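
For an actual gzip-compressed .tar.gz file, the mode string carries the compression type: 'w:gz' to write and 'r:gz' to read. The sketch below assumes hypothetical helper names (make_tar_gz, list_tar_gz) and temporary files. Appending ('a') is not available for compressed archives, so a .tar.gz has to be written in one pass.

```python
import os
import tarfile
import tempfile


def make_tar_gz(archive_path, files):
    """Write each (source_path, archive_name) pair into a gzip-compressed tar."""
    with tarfile.open(archive_path, "w:gz") as t:
        for source_path, archive_name in files:
            t.add(source_path, arcname=archive_name)


def list_tar_gz(archive_path):
    """Return the member names stored in a gzip-compressed tar."""
    with tarfile.open(archive_path, "r:gz") as t:
        return t.getnames()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sources = []
        for name in ("c1.py", "c2.py"):
            path = os.path.join(tmp, name)
            with open(path, "w") as f:
                f.write("# " + name + "\n")
            sources.append(path)
        archive = os.path.join(tmp, "docs.tar.gz")
        # Store c2.py under the name tc2.py, as in the example above.
        make_tar_gz(archive, [(sources[0], "c1.py"), (sources[1], "tc2.py")])
        print(list_tar_gz(archive))
```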

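Notes on extraction:

  • tarfile can extract a single file in two ways: by file name, or by member object (TarInfo).
  • The extraction methods accept a path argument to extract into a specific directory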
def extractall(self, path=".", members=None, *, numeric_owner=False):
    """Extract all members from the archive to the current working
       directory and set owner, modification time and permissions on
       directories afterwards. `path' specifies a different directory
       to extract to. `members' is optional and must be a subset of the
       list returned by getmembers(). If `numeric_owner` is True, only
       the numbers for user/group names are used and not the names.
    """

def extract(self, member, path="", set_attrs=True, *, numeric_owner=False):
    """Extract a member from the archive to the current working directory,
       using its full name. Its file information is extracted as accurately
       as possible. `member' may be a filename or a TarInfo object. You can
       specify a different directory using `path'. File attributes (owner,
       mtime, mode) are set unless `set_attrs' is False. If `numeric_owner`
       is True, only the numbers for user/group names are used and not
       the names.
    """
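
The two ways of naming a member and the path argument can be combined as in the sketch below. The helper name extract_two_ways and the directory names by_name and by_info are assumptions made for illustration. Archives from untrusted sources should be inspected before extraction, since member names may point outside the target directory.

```python
import os
import tarfile
import tempfile


def extract_two_ways(archive_path, name, dest_dir):
    """Extract `name` by file name, then again through its TarInfo object."""
    by_name_dir = os.path.join(dest_dir, "by_name")
    by_info_dir = os.path.join(dest_dir, "by_info")
    with tarfile.open(archive_path, "r") as t:
        # 1. Pass the member's name as a string.
        t.extract(name, path=by_name_dir)
        # 2. Look up the TarInfo object and pass that instead.
        info = t.getmember(name)
        t.extract(info, path=by_info_dir)
    return os.path.join(by_name_dir, name), os.path.join(by_info_dir, name)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "c2.py")
        with open(src, "w") as f:
            f.write("print('c2')\n")
        archive = os.path.join(tmp, "docs.tar")
        with tarfile.open(archive, "w") as t:
            t.add(src, arcname="tc2.py")
        for path in extract_two_ways(archive, "tc2.py", os.path.join(tmp, "out")):
            print(path, os.path.exists(path))
```

Both calls produce the same file; the TarInfo form is convenient when the member was already obtained from getmembers() and filtered on attributes such as size or type.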