Why am I leaking memory with this Python loop?
I am writing a custom file system crawler that is passed millions of globs to process through sys.stdin. I'm finding that when the script runs, its memory usage grows massively over time and the whole thing practically crawls to a halt. I've written a minimal case below that shows the problem. Am I doing something wrong, or have I found a bug in Python / the glob module? (I am using Python 2.5.2.)
    #!/usr/bin/env python
    # Python 2 syntax (print statement), matching the 2.5.2 interpreter in use.
    import glob
    import sys
    import gc

    previous_num_objects = 0

    # Each line on stdin is one glob pattern.
    for count, line in enumerate(sys.stdin):
        glob_result = glob.glob(line.rstrip('\n'))
        # Number of objects currently tracked by the garbage collector.
        current_num_objects = len(gc.get_objects())
        new_objects = current_num_objects - previous_num_objects

        print "(%d) This: %d, New: %d, Garbage: %d, Collection Counts: %s" \
            % (count, current_num_objects, new_objects, len(gc.garbage), gc.get_count())
        previous_num_objects = current_num_objects
The output looks like:
    (0) This: 4042, New: 4042, Python Garbage: 0, Python Collection Counts: (660, 5, 0)
    (1) This: 4061, New: 19, Python Garbage: 0, Python Collection Counts: (90, 6, 0)
    (2) This: 4064, New: 3, Python Garbage: 0, Python Collection Counts: (127, 6, 0)
    (3) This: 4067, New: 3, Python Garbage: 0, Python Collection Counts: (130, 6, 0)
    (4) This: 4070, New: 3, Python Garbage: 0, Python Collection Counts: (133, 6, 0)
    (5) This: 4073, New: 3, Python Garbage: 0, Python Collection Counts: (136, 6, 0)
    (6) This: 4076, New: 3, Python Garbage: 0, Python Collection Counts: (139, 6, 0)
    (7) This: 4079, New: 3, Python Garbage: 0, Python Collection Counts: (142, 6, 0)
    (8) This: 4082, New: 3, Python Garbage: 0, Python Collection Counts: (145, 6, 0)
    (9) This: 4085, New: 3, Python Garbage: 0, Python Collection Counts: (148, 6, 0)
Every 100th iteration, 100 objects are freed, so len(gc.get_objects()) increases by 200 every 100 iterations. len(gc.garbage) never changes from 0. The 2nd generation collection count increases slowly, while the 0th and 1st counts go up and down.
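For anyone reproducing this kind of measurement, a minimal sketch of tallying gc.get_objects() by type may help show which kind of object accounts for the growth. It assumes Python 3 rather than the 2.5.2 used above, and type_counts and retained are hypothetical names; the list of lists only stands in for whatever is actually holding on to memory.

```python
import gc
from collections import Counter


def type_counts():
    """Return a Counter of type names for all objects the GC currently tracks."""
    gc.collect()  # drop collectable garbage first so snapshots are comparable
    return Counter(type(obj).__name__ for obj in gc.get_objects())


before = type_counts()

# Hypothetical stand-in for a leak: 500 small lists kept alive by one reference.
retained = [[i] for i in range(500)]

after = type_counts()

# Counter subtraction keeps only the types whose count went up.
growth = after - before
print(growth.most_common(3))
```

Objects that are still reachable, like the ones held by retained here, count toward gc.get_objects() but do not appear in gc.garbage, which only lists unreachable objects the collector could not free.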
I tracked this down to the fnmatch module. glob.glob calls fnmatch to actually perform the globbing, and fnmatch keeps a cache of compiled regular expressions that is never cleared. So in this usage, where every input line is a different pattern, the cache was growing continuously and unchecked. I have filed a bug against the fnmatch library [1].
[1]: Python Bug