Also, I wanted my results to satisfy two requirements: (1) output file information only if the file is in pdf, ppt, or doc format; (2) report the file size in a human-readable format. I chose Python for this little task. Here is the code I used:
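import os
import time, stat
from datetime import datetime

def sizeof_fmt(num):
    # Convert a size in bytes to a human-readable string
    for x in ['bytes','KB','MB','GB','TB']:
        if num < 1024.0:
            return "%3.1f%s" % (num, x)
        num /= 1024.0

# "with" closes output.txt automatically when the loop finishes
with open('output.txt', 'w') as f:
    # Raw string so the backslash in the Windows path is taken literally
    for root, dirs, files in os.walk(r'E:\my_papers'):
        for file in files:
            # Keep only files whose extension is pdf, ppt, or doc
            if file.split('.')[-1] in ('pdf','ppt','doc'):
                st=os.stat(os.path.join(root,file))
                sz=st[stat.ST_SIZE]
                # Last-modified time as a long string, e.g. 'Sun Jun 20 23:21:05 1993'
                tm=time.ctime(st[stat.ST_MTIME])
                tm_tmp=datetime.strptime(tm, '%a %b %d %H:%M:%S %Y')
                tm=tm_tmp.strftime('%Y-%m-%d')
                sz2=sizeof_fmt(sz)
                strg=root+'\t'+file+'\t'+tm +'\t' + sz2 +'\n'
                f.write(strg)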
The modules that are used here are "os", "time", "stat" and "datetime".
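First there is a user-defined function, "sizeof_fmt", that returns the file size in a human-readable format: it keeps dividing the size by 1024 until the value drops below 1024, then attaches the matching unit. I found this interesting function here.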
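As a quick check of what the size function returns, here is a minimal sketch that calls it on a few sample values. The fallback to 'PB' after the loop is my own addition for illustration (without it, the function returns None for sizes of 1024 TB or more), and the sample sizes are made up.

```python
def sizeof_fmt(num):
    """Return a size in bytes as a short human-readable string."""
    for unit in ['bytes', 'KB', 'MB', 'GB', 'TB']:
        if num < 1024.0:
            return "%3.1f%s" % (num, unit)
        num /= 1024.0
    # Assumption: fall back to petabytes instead of returning None
    return "%3.1f%s" % (num, 'PB')

# Made-up sample sizes, in bytes
for size in (512, 2048, 5 * 1024 * 1024):
    print(sizeof_fmt(size))
```

This prints 512.0bytes, 2.0KB and 5.0MB, one per line.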
Next, the os.walk function walks through the given directory tree and, for each directory it visits, yields the directory path together with the lists of subdirectories and files it contains. Then, for each file, the format is obtained by splitting the file name on '.' and taking the last piece. Once a file's format meets my requirement, its size in bytes and its last-modified time are collected. Unfortunately the time comes back as a very long string, some of which is not relevant at all. So I created a "datetime" object "tm_tmp" using "datetime.strptime()", and then created a string "tm" that keeps only the year, month and day of the file. Next the size function is called and the human-readable file size is returned. Finally the directory, filename, time and size are written to the output file, separated by tabs.
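The extension test and the date formatting can also be written more directly. The sketch below is an alternative, not the code I ran: it assumes os.path.splitext for the extension (so a file literally named 'pdf' with no extension is not matched, and upper-case extensions such as '.PDF' are accepted) and datetime.fromtimestamp to skip the round trip through time.ctime. The helper names has_wanted_extension and format_mtime are hypothetical, as are the sample file names.

```python
import os
from datetime import datetime

# Extensions to keep, written the way os.path.splitext returns them
WANTED = ('.pdf', '.ppt', '.doc')

def has_wanted_extension(filename):
    """True if the file name ends in one of the wanted extensions."""
    ext = os.path.splitext(filename)[1]
    return ext.lower() in WANTED

def format_mtime(timestamp):
    """Turn a modification time (seconds since the epoch) into YYYY-MM-DD, local time."""
    return datetime.fromtimestamp(timestamp).strftime('%Y-%m-%d')

print(has_wanted_extension('thesis.PDF'))   # extension matched case-insensitively
print(has_wanted_extension('pdf'))          # no extension at all, so not matched
print(format_mtime(os.stat('.').st_mtime))  # date of the current directory
```

Inside the os.walk loop, these two helpers would stand in for the file.split('.') test and the three "tm" lines.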