Python + inotify = Pyinotify [ how to watch folders for file activity ]

It is sometimes handy to be able to watch a folder on a hard disk for changes. For example, a client app might drop small files into a shared folder while a server app watches that folder for just such an event. Once a file is created, the server kicks into action and performs whatever tasks are required.

This all comes from my CD burning application. My current thinking is that the client apps will drop small XML files, containing information about what to burn, into a folder the webserver has access to, and the cdburner service will be watching...

The Linux kernel provides inotify. From Wikipedia:
inotify is a Linux kernel subsystem that provides file system event notification. It was written by John McCutchan with help from Robert Love and later Amy Griffis to replace dnotify. It was included in the mainline kernel from release 2.6.13 (2005-06-18), and could be compiled into 2.6.12 and possibly earlier releases by use of a patch. Its function is essentially an extension to filesystems to notice changes to the filesystem, and report those changes to applications.
Pyinotify is a Python module that exposes the inotify API in Python. From http://pyinotify.sourceforge.net/:
pyinotify is a Python module for watching filesystem changes. pyinotify can be used for various kinds of fs monitoring. pyinotify relies on a recent Linux kernel feature (merged in kernel 2.6.13) called inotify. inotify is an event-driven notifier; its notifications are exported from kernel space to user space through three system calls. pyinotify binds these system calls and provides an implementation on top of them offering a generic and abstract way to use inotify from Python. Pyinotify doesn't require much detailed knowledge of inotify. Moreover, it only needs a few statements for initializing, watching, handling (optionally through a new separate thread), and processing event notifications through subclassing. The only things to know are the path of the items to watch, the kind of events to monitor and the actions to execute on these notifications. Note: pyinotify requires Python 2.3 and above, and Linux 2.6.13 at least.
I went ahead and gave it a go. I did find, though, that on my Fedora system (Fedora 9 x86_64) the tutorial on the above wiki didn't quite work. Here is a version of the non-threaded tutorial example that does work on my system.


#!/usr/bin/python
import os
import pyinotify

wm = pyinotify.WatchManager()
# Watch only for files being created or deleted.
mask = pyinotify.IN_DELETE | pyinotify.IN_CREATE


class PTmp(pyinotify.ProcessEvent):
    """Print a line for every create or delete event."""

    def process_IN_CREATE(self, event):
        print("Create: %s " % os.path.join(event.path, event.name))

    def process_IN_DELETE(self, event):
        print("Delete: %s " % os.path.join(event.path, event.name))


notifier = pyinotify.Notifier(wm, PTmp())

# rec=True also watches the subdirectories that already exist.
wdd = wm.add_watch('/home/dave/projects', mask, rec=True)

while True:
    try:
        # Dispatch queued events to the handler, then read any new ones.
        notifier.process_events()
        if notifier.check_events():
            notifier.read_events()
    except KeyboardInterrupt:
        # Ctrl-C: stop watching and leave the loop.
        notifier.stop()
        break

Unfortunately, inotify is a Linux kernel technology that is not currently available on Windows. I guess Windows has some other kind of API for filesystem event monitoring, but if you are like me and want to keep things simple with Python, then I am afraid inotify on Windows is not possible. If you feel that you would like to give this stuff a go, I suggest getting yourself a LiveCD of one of the Linux distros.

Comments

Charlie Evatt said…
Hi there, what does this code do when it notices a new file in the folder? Does it print it to the screen?

Is it possible to code this kind of thing so it is constantly running looking at a folder, and to do whatever when a file is dropped into it?

Thanks - Charlie
David Latham said…
Yes it is possible. The code example in my post does exactly that. It simply writes a line of text to the std output ( the screen ) whenever a file is created or deleted in the folder that is being watched.

You can perform whatever functions you like in the IN_CREATE handler (the process_IN_CREATE method).

A specific application example would be:
An FTP service is set up where clients upload files to the FTP location and some automated process is required. That process then kicks off whenever a file is created in the FTP folder being watched. Maybe the incoming file has to be parsed and the data inserted into a database.

I suggest you try it out. It's pretty powerful stuff. Python provides a nice interface to this useful Linux tool.
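The FTP scenario described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the upload is taken to be a two-column comma-separated file, and the `uploads` table, the `store_upload` and `open_database` names and the use of SQLite are all hypothetical choices, not anything from the original post. It is written for Python 3.

```python
import csv
import sqlite3


def open_database(db_path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS uploads (name TEXT, qty INTEGER)")
    return conn


def store_upload(conn, path):
    """Parse a 'name,qty' file and insert its rows; return the row count."""
    with open(path, newline="") as handle:
        rows = [(name, int(qty)) for name, qty in csv.reader(handle)]
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany("INSERT INTO uploads (name, qty) VALUES (?, ?)", rows)
    return len(rows)


# Inside the watcher, the event handler would simply delegate, e.g.:
#
#     def process_IN_CREATE(self, event):
#         store_upload(conn, os.path.join(event.path, event.name))
```

One caveat: IN_CREATE fires as soon as the file appears, which may be before the client has finished writing it. Watching for IN_CLOSE_WRITE instead is usually the safer trigger when the file contents matter.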
Anonymous said…
Thanks for the example!

I'm not sure of the version differences, but for me (using pyinotify 0.7.1), the event codes were not in the pyinotify namespace. I had to go one level further into EventsCodes. Where you have pyinotify.IN_CREATE, I needed pyinotify.EventsCodes.IN_CREATE
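A small lookup helper can paper over this difference between versions. The sketch below assumes only the two layouts mentioned in this thread (constants directly on the module, or one level down on `EventsCodes`); the `event_code` name is made up for illustration.

```python
def event_code(module, name):
    """Return an inotify event constant from either module layout."""
    if hasattr(module, name):
        # Layout used in the post: pyinotify.IN_CREATE
        return getattr(module, name)
    # Layout reported for 0.7.1: pyinotify.EventsCodes.IN_CREATE
    return getattr(module.EventsCodes, name)


# Usage, once pyinotify is imported:
#
#     mask = event_code(pyinotify, "IN_CREATE") | event_code(pyinotify, "IN_DELETE")
```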
David Latham said…
I am not sure how to tell which version of python-inotify I am using. I ran a yum install python-inotify and installed this:
***
Installed: python-inotify.noarch 0:0.8.0-3.r.fc9
***

Then I did a test:
***
Python 2.5.1 (r251:54863, Jun 15 2008, 18:24:56)
[GCC 4.3.0 20080428 (Red Hat 4.3.0-8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyinotify
INFO: Maybe it could speed-up a little bit if you had psyco installed (not required).
>>> mask = pyinotify.IN_CREATE
>>> print mask
256
>>> mask = pyinotify.IN_DELETE
>>> print mask
512
***

So it is probably an issue with versions or simply the distribution.

But thanks anyway for the heads up.
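On the question of telling which version is installed: many Python modules report it through a `__version__` attribute. The helper below assumes the installed pyinotify does the same, which may not hold for every release, so it falls back to "unknown" rather than failing. The `describe_version` name is hypothetical.

```python
def describe_version(module):
    """Return the module's self-reported version, or 'unknown' if it has none."""
    return getattr(module, "__version__", "unknown")


# Usage:
#
#     import pyinotify
#     print(describe_version(pyinotify))
```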
AnGrY_bOb said…
Hello David...
I'm hoping you could help me.
I got this working well, but on the IN_CREATE step I need it to run another Python script in a folder, e.g. the watched folder.
I have the Python script that does what I need perfectly and can execute it from the command line, but it would be better if the script could run automatically on a file change...
I really hope you can help...


thankyou - Adam
David Latham said…
Hi Adam,

First of all, the simplest way to deal with calling another script is to import it and run one of its functions.

Here is my example:

notifier.py (handles create and delete events; calls main() in otherscript.py on create)

import os
import pyinotify
import otherscript

wm = pyinotify.WatchManager()
mask = pyinotify.IN_CREATE | pyinotify.IN_DELETE


class PTmp(pyinotify.ProcessEvent):
    def process_IN_CREATE(self, event):
        print("Create: %s" % os.path.join(event.path, event.name))
        print("Executing otherscript.py")
        # Hand the new file's name to the other script's main() function.
        otherscript.main(event.name)

    def process_IN_DELETE(self, event):
        print("Delete: %s" % os.path.join(event.path, event.name))


# These two statements are at module level, outside the PTmp class.
notifier = pyinotify.Notifier(wm, PTmp())

wdd = wm.add_watch('/home/davidl/test', mask, rec=True)

while True:
    try:
        notifier.process_events()
        if notifier.check_events():
            notifier.read_events()
    except KeyboardInterrupt:
        notifier.stop()
        break



otherscript.py (defines a main() function that takes the filename argument and prints some text to the screen)

#!/usr/bin/python

def main(filename):
    print("Other Python Script Executed!")
    print("Otherscript received filename: %s " % filename)


To use:
1. Create a folder called test
2. Install both of these scripts in the test folder
3. Open a shell and run "python notifier.py"
4. create a new file in the test folder.
5. watch your notifier script and otherscript do the job.

There is also a way to call another application using popen or popen2. Look in this blog for a script called pyVerify for examples.
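As an alternative to importing the other script, it can be run as a separate process. The sketch below uses the standard `subprocess` module rather than popen2 (which no longer exists in Python 3); the `run_script` name and its arguments are illustrative only.

```python
import subprocess
import sys


def run_script(script_path, filename):
    """Run another Python script as a child process and return what it printed."""
    # sys.executable is the interpreter running this code, so the child
    # uses the same Python. check_output raises CalledProcessError if the
    # script exits with a non-zero status.
    output = subprocess.check_output([sys.executable, script_path, filename])
    return output.decode("utf-8", "replace")


# In the watcher this could be called from the create handler, e.g.:
#
#     def process_IN_CREATE(self, event):
#         print(run_script("/path/to/otherscript.py", event.name))
```

Running the script this way keeps a crash in the other script from taking the watcher down with it, at the cost of starting a new interpreter for every event.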
David Latham said…
Sorry - formatting is up the pole.

The lines: notifier = pyinotify.Notifier... and wdd = wm.add.... are on new lines outside of the PTmp class.
AnGrY_bOb said…
thank you david ill try it out tonite and give you some feedback if it worked...

thank you ALOT

Adam
AnGrY_bOb said…
Thank you for your help David, but it didn't work...
Once I got it running without indentation errors it gave a module main error and just sat there...
Unfortunately I'm only learning, so I googled the error and really got no answers as to why it didn't go...
Again, thank you for trying...

Adam
Anonymous said…
Hello. Thank you for this great article. Quick question: I'm using the threaded notifier (because the unthreaded one doesn't work as conveniently for unit testing), but I want the IN_CREATE event to stop the notifier thread and return the created file path. Is there a way to do this with the threaded notifier? I would think this would require a callback to the first thread, which I'm not sure how to do. Thanks!
David Latham said…
@Eric
I have tried to get the ThreadedNotifier to work the way you would like. Specifically, I tried to get it to stop the thread on the IN_CREATE event only. My attempts were largely futile, probably because I lack a thorough understanding of threads.

I have learned the following:
The ThreadedNotifier inherits from threading.Thread so all of the threading.Thread methods should be available.

The ThreadedNotifier also inherits from Notifier but overrides the loop function. This is important because the loop() function in the Notifier Class provides a small callback handler.

The code snippet below shows how callbacks work. The callback is executed every time there is an event and is passed the Notifier object.

Unfortunately the Notifier object does not know exactly which event was fired by the kernel. At about this time, I started to lose interest in the problem. Maybe if I could learn more about the application... :)

#!/usr/bin/env python
import pyinotify


class PTmp(pyinotify.ProcessEvent):
    """Event handler"""

    def process_IN_CREATE(self, event):
        """Capture events for the CREATE event."""
        print("Create: %s" % event.pathname)

    def process_default(self, event):
        """Default handler for ALL OTHER events."""
        print("Default handler hit: %s" % event.pathname)


def myCallBack(notifierObj):
    """Call back function to show how it works."""
    print("Callback received.")


wm = pyinotify.WatchManager()
mask = pyinotify.IN_CREATE | pyinotify.IN_DELETE
notifier = pyinotify.Notifier(wm, PTmp())
path = '/home/dave'
wm.add_watch(path, mask, rec=True)

# loop() runs until interrupted, calling myCallBack after each batch of events.
notifier.loop(callback=myCallBack)
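One possible way to get the behaviour asked about above (stop on the first IN_CREATE and hand the path back) is to let the handler pass the path to the main thread through a thread-safe queue, so the main thread does the stopping. This is an untested sketch, not something from the original discussion: `make_create_handler` and `wait_for_create` are hypothetical names, and it assumes a pyinotify version where `ThreadedNotifier` offers `start()` and `stop()`. It is written for Python 3.

```python
import queue  # this module is named Queue on Python 2


def make_create_handler(base_class, created_paths):
    """Build a handler class that reports each created path through a queue."""

    class CreateReporter(base_class):
        def process_IN_CREATE(self, event):
            # Runs in the notifier thread; Queue.put is thread-safe.
            created_paths.put(event.pathname)

    return CreateReporter


def wait_for_create(directory, timeout=10.0):
    """Block until something is created in `directory`; return its path."""
    import pyinotify  # imported here so the helper above works without it

    created_paths = queue.Queue()
    handler_class = make_create_handler(pyinotify.ProcessEvent, created_paths)
    wm = pyinotify.WatchManager()
    notifier = pyinotify.ThreadedNotifier(wm, handler_class())
    notifier.start()
    wm.add_watch(directory, pyinotify.IN_CREATE)
    try:
        # Raises queue.Empty if nothing is created within `timeout` seconds.
        return created_paths.get(timeout=timeout)
    finally:
        notifier.stop()  # the main thread, not the handler, stops the notifier
```

Because the handler only records the path, it never has to know about threads at all, which also makes it easy to unit test with a stand-in event object.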
archim said…
For anyone looking for a Windows solution, I found some very good examples here:
http://tgolden.sc.sabren.com/python/win32_how_do_i/watch_directory_for_changes.html
archim said…
And here is a pretty useful class based on this technique:
http://tech-artists.org/wiki/Python_Recipes#Watch_a_Directory_for_Changes
David Latham said…
SELECT * FROM Solutions WHERE Solutions.SolutionName = 'Windows';

Empty set (0.00 sec)
psk said…
Hi! Suppose I copy a file from my hard disk to a USB flash drive. After copying the file, I would like to know which files I have copied to the USB drive without inserting the flash drive. What is the procedure to do this on the Linux platform? Please send your response to my email: premshiva1990@mail.com
David Latham said…
@PSK,

So let me see if I understand the question:

1. You want to copy files to a flash drive
2. and then have something tell you what you just copied?
3. And you want to be able to do this WITHOUT inserting the flash drive?

I will try to respond. (Not to your private email - that would deny all the other folks that seem to be following this post any chance at reading it)

1. Copy the files.
2. Remember what you copied.
3. Huh? How did you manage 1 and 2 without having FIRST inserted the flash drive?

You could write a Python script that prompts for the folder to monitor and checks whether the flash drive is still mounted, quitting if it is not.
Then, as files are copied to the drive, your pyinotify script will be watching the write events on the flash drive's filesystem. It could then write them to a log file.

In the end when the flash drive is removed, it could email the contents of the log just written to you.
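The non-pyinotify parts of that idea might look something like the sketch below. The function names, the logger name and the log format are all made up for illustration, and the emailing step is left out. The commented loop at the end shows roughly where these helpers would plug into a watcher.

```python
import logging
import os


def make_copy_logger(log_path):
    """Create a logger that appends timestamped lines to `log_path`."""
    logger = logging.getLogger("usb-copy-log")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path)
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
    logger.addHandler(handler)
    return logger


def record_write(logger, pathname):
    """Log one file that was written to the drive."""
    logger.info("written: %s", pathname)


def drive_present(mount_point):
    """Return True while something is mounted at `mount_point`."""
    return os.path.ismount(mount_point)


# Rough shape of the watcher (assumes a handler whose process_IN_CLOSE_WRITE
# calls record_write(logger, event.pathname)):
#
#     while drive_present(mount_point):
#         if notifier.check_events(timeout=1000):
#             notifier.read_events()
#             notifier.process_events()
#     notifier.stop()
```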

I can only assume you are trying to protect against staff removing sensitive data from your systems. How about an old fashioned security policy which all your staff have to sign...
