pip install callmonitor
or clone this repo and run:
python setup.py install
Using it is simple: just decorate any function with the @intercept decorator.
For example:
from callmonitor import intercept
@intercept
def test_fn_2(x, y=2, z=3):
    pass
This will save the inputs (args, kwargs and argspec) along with a call
database (callmonitor.DB) to: call-monitor/test_fn_2/<invocation count>.
If the call-monitor folder already exists (e.g. from a previous run), then a new
folder call-monitor-1, or call-monitor-2, and so on, is created. See the
section on Data Structure for more details on how this data is saved.
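The folder naming rule can be sketched as follows. This is an illustration of the rule as described, not code taken from callmonitor; next_free_root is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def next_free_root(parent, base="call-monitor"):
    """Return the first of base, base-1, base-2, ... that does not exist yet."""
    parent = Path(parent)
    candidate = parent / base
    suffix = 0
    while candidate.exists():
        suffix += 1
        candidate = parent / f"{base}-{suffix}"
    return candidate


# Simulate three consecutive runs in a throwaway working directory.
workdir = Path(tempfile.mkdtemp())
first = next_free_root(workdir)
first.mkdir()
second = next_free_root(workdir)
second.mkdir()
third = next_free_root(workdir)
```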
To prevent different processes from writing to the same location, callmonitor
appends -tid=<N> to the root (call-monitor) folder. Currently callmonitor
supports mpi4py out of the box: if mpi4py.MPI.COMM_WORLD.Get_rank() > 1,
callmonitor automatically assumes that it's running in multi-threaded mode
and appends -tid=<Get_rank()> to the output. If your program is
multi-threaded with another framework (e.g. concurrent.futures) then you need
to tell callmonitor your thread ID using callmonitor.Settings:
from callmonitor import Settings
settings = Settings()
settings.enable_multi_threading(THREAD_ID)
Do this before the first invocation of intercept: the database is created on
disk when it is first needed, and callmonitor.Settings is read at that point.
Any changes made to callmonitor.Settings afterwards will only take effect if
the database is recreated using callmonitor.CONTEXT.new.
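As a rough illustration of the naming scheme only, the sketch below shows how distinct thread IDs lead to distinct root folders. root_with_tid is a hypothetical helper, not part of callmonitor.

```python
def root_with_tid(base="call-monitor", thread_id=None):
    """Append -tid=<N> to the root folder name when a thread ID is given."""
    if thread_id is None:
        return base
    return f"{base}-tid={thread_id}"


# Three workers with IDs 0, 1 and 2 each get their own root folder.
roots = [root_with_tid(thread_id=tid) for tid in range(3)]
```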
Sometimes pickle just won't cut it in terms of saving function inputs -- e.g.
when we need to save our own fancy data types. callmonitor provides a way of
building your own argument handlers and registering them in the global
callmonitor.REGISTRY. The registry is queried every time function inputs are
processed, so if you build your own argument handler and add it using
callmonitor.REGISTRY.add, it will process any arguments of the associated
datatype from that point forward. For example, numpy provides its own save/load
functions. We have already built (and registered) a numpy argument handler
like so:
import numpy as np
from os.path import join
from callmonitor import Handler, REGISTRY
class NPHandler(Handler):

    def load(self, path):
        # Read the array back from the call's folder.
        self.data = np.load(join(path, f"arg_{self.target}.npy"))

    def save(self, path):
        # Write the array next to input_descriptor.pkl.
        np.save(join(path, f"arg_{self.target}.npy"), self.data)

    @classmethod
    def accumulator_done(cls):
        pass


REGISTRY.add(np.ndarray, NPHandler)
Remember that callmonitor.REGISTRY.add needs to be called before the
first invocation of @intercept that needs this particular Handler. A
custom handler needs to inherit the callmonitor.Handler class and define
save, load, and accumulator_done (the last one being a @classmethod).
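The same pattern can be applied to other data types. Below is a hedged sketch of a handler for a small dataclass that is stored as JSON. So that it runs without callmonitor installed, it uses a stand-in base class: HandlerStandIn, its constructor signature, Point, and PointHandler are all assumptions made for illustration. Only the target and data attributes and the three required methods come from the NPHandler example.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from os.path import join


class HandlerStandIn:
    """Stand-in for callmonitor.Handler (constructor signature is assumed)."""

    def __init__(self, target, data=None):
        self.target = target  # argument label: positional index or keyword
        self.data = data      # the argument value being saved or loaded


@dataclass
class Point:
    x: float
    y: float


class PointHandler(HandlerStandIn):

    def save(self, path):
        with open(join(path, f"arg_{self.target}.json"), "w") as f:
            json.dump(asdict(self.data), f)

    def load(self, path):
        with open(join(path, f"arg_{self.target}.json")) as f:
            self.data = Point(**json.load(f))

    @classmethod
    def accumulator_done(cls):
        pass


# Round trip through a throwaway folder.
call_dir = tempfile.mkdtemp()
PointHandler("p", Point(1.0, 2.0)).save(call_dir)
restored = PointHandler("p")
restored.load(call_dir)
```

With the real package, the handler would presumably inherit from callmonitor.Handler instead and be registered with REGISTRY.add(Point, PointHandler).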
callmonitor.load(<path>) will load a database at <path> (see section on
Data Structure). For example:
from callmonitor import load
db = load("call-monitor")
We can now get the data of individual function calls from the database using DB.get:
args, kwargs = db.get("function_name", invocation_count)
This will also automatically load .npy files and any data saved by custom
handlers -- remember to register these in callmonitor.REGISTRY before
executing db.get.
Remember: invocation_count starts at 1. Therefore, to access the first call to
test_np_1, run:
In [4]: db.get("test_np_1", 1)
Out[4]: ([10, array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])], {})
We try to enable top-level summaries of the following user-facing classes via
their __str__ and __repr__ functions: REGISTRY, DB, and the Args objects
returned by DB.get_args. For example, callmonitor.REGISTRY shows which
datatype/handler pairs are configured:
In [2]: callmonitor.REGISTRY
Out[2]:
{
<class 'numpy.ndarray'>: <class 'callmonitor.handler.NPHandler'>
}
Likewise, the DB object displays a summary of functions called and how often.
In [3]: db = callmonitor.load("call-monitor")
In [4]: db
Out[4]:
{
Locked: True
test_np_1: {
calls: 2
args: ['x', 'n']
defaults: None
}
}
Picking apart args, kwargs, and argspec.defaults can be very tedious --
especially if you're trying to find out the value of a specific argument. Hence
callmonitor.DB provides an additional getter, get_args, which returns an
Args object. callmonitor.Args objects are containers that store each input
argument by name as an attribute. For example:
In [3]: args = db.get_args("test_fn_1", 1)
In [4]: args
Out[4]: dict_keys(['x', 'y', 'z'])
In [5]: args.x
Out[5]: 1
Note: the callmonitor.Args constructor will fill in any arguments not in
args and kwargs from the FullArgSpec defaults. If you just want to
recreate the original function call, the args and kwargs returned by
callmonitor.DB.get should be enough.
While not technically a database, let's call the directories generated by
callmonitor a database for lack of a better term. Each database consists
of a db.pkl file (containing metadata), as well as a folder for each function
containing one numbered subfolder per call. For example:
call-monitor
├── db.pkl
├── test_fn_1
│   ├── 1
│   │   └── input_descriptor.pkl
│   └── 2
│       └── input_descriptor.pkl
└── test_fn_2
    └── 1
        └── input_descriptor.pkl
Special attention is given to numpy inputs -- these are saved as
arg_<label>.npy, where <label> is either the index of the input argument,
or the keyword for kwargs. For example:
call-monitor
├── db.pkl
└── test_np_1
    ├── 1
    │   ├── arg_1.npy
    │   └── input_descriptor.pkl
    └── 2
        ├── arg_n.npy
        └── input_descriptor.pkl
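Because the layout is plain folders, it can be inspected without loading any pickle files. The sketch below walks a database folder and lists the recorded invocation numbers per function. summarize is a hypothetical helper, and the tree it runs on is a throwaway copy of the first listing in this section, built only for the example.

```python
import tempfile
from pathlib import Path


def summarize(root):
    """Return {function name: sorted invocation numbers} for a database folder."""
    summary = {}
    for fn_dir in sorted(Path(root).iterdir()):
        if not fn_dir.is_dir():
            continue  # skip db.pkl and other top-level files
        calls = [
            int(d.name)
            for d in fn_dir.iterdir()
            if d.is_dir() and d.name.isdigit()
        ]
        summary[fn_dir.name] = sorted(calls)
    return summary


# Build an empty tree with the same shape as the listing.
root = Path(tempfile.mkdtemp()) / "call-monitor"
for fn_name, n_calls in [("test_fn_1", 2), ("test_fn_2", 1)]:
    for i in range(1, n_calls + 1):
        call_dir = root / fn_name / str(i)
        call_dir.mkdir(parents=True)
        (call_dir / "input_descriptor.pkl").touch()
(root / "db.pkl").touch()

layout = summarize(root)
```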
Full consideration was given to saving all call data in a single data structure -- maybe even a real database ;) -- but to do this efficiently at scale is not easy, and might make this package cumbersome. Future versions will include the ability to fuse multiple small function calls into a single accumulator object to avoid large numbers of small files.
Version 0.3.0 brings many enhancements to callmonitor. We therefore could no
longer maintain native backward compatibility. A tool that can convert a version
0.2.0 database to version 0.3.0 (or later) is currently being developed.
Versions pre-dating 0.2.0 are no longer supported.