mirror of
https://github.com/ansible-collections/community.general.git
synced 2024-09-14 20:13:21 +02:00
[PR #6274/14b19afc backport][stable-5] archive: Generate crc32 over 16MiB chunks (#6325)
archive: Generate crc32 over 16MiB chunks (#6274)
* archive: Generate crc32 over 16MiB chunks
Running crc32 over the whole content of the compressed file potentially
requires a lot of RAM. The crc32 function in zlib allows for calculating
the checksum in chunks. This changes the code to calculate the checksum
over 16 MiB chunks instead. 16 MiB is the value also used by
shutil.copyfileobj().
* Update changelogs/fragments/6199-archive-generate-checksum-in-chunks.yml
Change the type of change to bugfix
Co-authored-by: Felix Fontein <felix@fontein.de>
* Update changelogs/fragments/6199-archive-generate-checksum-in-chunks.yml
Co-authored-by: Felix Fontein <felix@fontein.de>
---------
Co-authored-by: Felix Fontein <felix@fontein.de>
(cherry picked from commit 14b19afc9a
)
Co-authored-by: Nils Meyer <nils@nm.cx>
This commit is contained in:
parent
cb49b96b4d
commit
72b282fe85
2 changed files with 9 additions and 1 deletions
|
@ -0,0 +1,2 @@
|
|||
bugfixes:
|
||||
- archive - reduce RAM usage by generating CRC32 checksum over chunks (https://github.com/ansible-collections/community.general/pull/6274).
|
|
@ -601,7 +601,13 @@ class TarArchive(Archive):
|
|||
# The python implementations of gzip, bz2, and lzma do not support restoring compressed files
|
||||
# to their original names so only file checksum is returned
|
||||
f = self._open_compressed_file(_to_native_ascii(path), 'r')
|
||||
checksums = set([(b'', crc32(f.read()))])
|
||||
checksum = 0
|
||||
while True:
|
||||
chunk = f.read(16 * 1024 * 1024)
|
||||
if not chunk:
|
||||
break
|
||||
checksum = crc32(chunk, checksum)
|
||||
checksums = set([(b'', checksum)])
|
||||
f.close()
|
||||
except Exception:
|
||||
checksums = set()
|
||||
|
|
Loading…
Reference in a new issue