DataFrame.copy(), at least, should be threadsafe

DataFrame.copy() should be atomic (threadsafe): it should produce a consistent DataFrame even if another thread is deleting entries from the frame when .copy() is called, or calls a deletion method while .copy() is still running. In other words, I guess .copy() should acquire a lock that prevents mutation during the copy. That is, the following code, which crashes in 0.7.3, should succeed:

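``` python
import pandas
import threading

# One empty frame shared by both threads.
df = pandas.DataFrame()

# Writer: repeatedly adds column 0 and deletes it again.
def mutateDf(df):
    while True:
        df[0] = pandas.Series([1,2,3])
        del df[0]

# Reader: repeatedly copies the frame and reads from the copy only.
def readDf(df):
    while True:
        dfCopy = df.copy()
        if 0 in dfCopy and 1 in dfCopy[0]:
            a = dfCopy[0][1]

t1 = threading.Thread(target=mutateDf, args=(df,))
t2 = threading.Thread(target=readDf, args=(df,))

# Both loops run forever; the reader thread dies with the traceback below.
t1.start()
t2.start()
```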

```
Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 504, in run
    self.__target(*self.__args, **self.__kwargs)
  File "<ipython-input-5-8aef72c7f1b4>", line 4, in readDf
    if 0 in dfCopy and 1 in dfCopy[0]:
  File "/usr/local/lib/python2.7/dist-packages/pandas-0.7.3-py2.7-linux-x86_64.egg/pandas/core/frame.py", line 1458, in __getitem__
    return self._get_item_cache(key)
  File "/usr/local/lib/python2.7/dist-packages/pandas-0.7.3-py2.7-linux-x86_64.egg/pandas/core/generic.py", line 294, in _get_item_cache
    values = self._data.get(item)
  File "/usr/local/lib/python2.7/dist-packages/pandas-0.7.3-py2.7-linux-x86_64.egg/pandas/core/internals.py", line 625, in get
    _, block = self._find_block(item)
TypeError: 'NoneType' object is not iterable
```
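Until something like that exists inside pandas, the locking idea can be sketched in user code. This is only a rough illustration, not how pandas works or should work internally: `LockedFrame`, `set_column`, `delete_column`, `snapshot` and `run_demo` are hypothetical names made up for this example, and it only helps if every thread goes through the wrapper. A plain dict stands in for the frame so the sketch runs without pandas; the wrapper only relies on `frame[key] = value`, `del frame[key]`, `key in frame` and `frame.copy()`.

```python
import threading


class LockedFrame(object):
    """Hypothetical wrapper: one lock serializes mutation and copying."""

    def __init__(self, frame):
        self._frame = frame
        self._lock = threading.Lock()

    def set_column(self, key, values):
        with self._lock:
            self._frame[key] = values

    def delete_column(self, key):
        with self._lock:
            # Only delete when the key is present, so repeated deletes are harmless.
            if key in self._frame:
                del self._frame[key]

    def snapshot(self):
        # The copy is taken while holding the lock, so no writer can run mid-copy.
        with self._lock:
            return self._frame.copy()


def run_demo(frame, values, iterations=1000):
    """Run one writer and one reader through the wrapper; return any errors seen."""
    shared = LockedFrame(frame)
    errors = []

    def writer():
        for _ in range(iterations):
            shared.set_column(0, values)
            shared.delete_column(0)

    def reader():
        for _ in range(iterations):
            try:
                snap = shared.snapshot()
                if 0 in snap:
                    snap[0][1]  # read from the private copy only
            except Exception as exc:
                errors.append(exc)

    threads = [threading.Thread(target=writer), threading.Thread(target=reader)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


# Assumption: a dict is used as a stand-in for the DataFrame in this demo.
demo_errors = run_demo({}, [1, 2, 3])
```

With pandas installed, the same check would presumably be `run_demo(pandas.DataFrame(), pandas.Series([1, 2, 3]))`; I have not verified that against 0.7.3. The cost of this approach is that readers and writers block each other for the duration of each copy.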


DataFrame.copy(), at least, should be threadsafe #2728
