- New parallel upload feature: use the two new parameters `max_workers` and `chunk_record_size` in `Client`'s `load_table_from_dataframe` API to parallelize uploads to Treasure Data and save time.
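  A minimal sketch of the new knobs, assuming a bulk-import upload; the API key, database, destination table, and the particular `max_workers` and `chunk_record_size` values are illustrative:

  ```python
  import pandas as pd
  import pytd

  # Placeholder credentials and database; substitute your own.
  client = pytd.Client(
      apikey="1/XXXXXXXX",
      endpoint="https://api.treasuredata.com/",
      database="sample_db",
  )

  df = pd.DataFrame({"id": range(1_000_000), "value": [1.0] * 1_000_000})

  # Split the frame into chunks of `chunk_record_size` records and upload
  # up to `max_workers` chunks concurrently.
  client.load_table_from_dataframe(
      df,
      "uploaded_rows",          # hypothetical destination table
      writer="bulk_import",
      if_exists="overwrite",
      max_workers=8,
      chunk_record_size=100_000,
  )
  ```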
- Support for pandas 2
- Broaden the supported numpy range to >= 1.17.3 and < 2.0.0
- Add support for Python 3.11
- Drop support for EOL Python 3.7
- Handle cases where `cur.description` is `None`
- Keep single quotes in `InsertIntoWriter`
- Use `obj.items` instead of `iteritems()`
- Update dependencies; pytd now supports Python 3.9 and 3.10
- Upgrade urllib3 version (#112)
- [hotfix] Set the pandas upper version bound to below 1.2 (#108)
- Presto queries issued by `pandas_td.read_td_query` use the `join_distribution_type` session property instead of the deprecated `distributed_join` property. See our documentation for more information about the change. (#100)
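  For context, a typical `pandas_td.read_td_query` call affected by this change looks roughly like the following; the credentials and query are placeholders:

  ```python
  import pytd.pandas_td as td

  con = td.connect(apikey="1/XXXXXXXX", endpoint="https://api.treasuredata.com/")
  engine = td.create_engine("presto:sample_datasets", con=con)

  # Join behavior is now controlled server-side via the join_distribution_type
  # session property rather than the deprecated distributed_join property.
  df = td.read_td_query(
      "SELECT method, COUNT(1) AS cnt FROM www_access GROUP BY 1",
      engine,
  )
  ```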
Note: pytd does not offer a version 1.4.1 because an unofficial 1.4.1 binary was unexpectedly published to PyPI and deleted immediately; PyPI refuses to re-upload deleted version numbers.
- Deprecate (Py)Spark 2.x and Python 3.5 support, and migrate to Spark 3.x and Python 3.8, respectively. `SparkWriter` requires running with Spark 3.x from now on; see the sketch below. (#94)
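  Uploading through `SparkWriter` is unchanged from the caller's perspective; a sketch, assuming pytd is installed with its Spark extra (`pip install pytd[spark]`) and a Spark 3.x runtime, with placeholder names:

  ```python
  import pandas as pd
  import pytd

  client = pytd.Client(apikey="1/XXXXXXXX", database="sample_db")
  df = pd.DataFrame({"x": [1, 2, 3]})

  # writer="spark" routes the upload through SparkWriter, which now
  # requires Spark 3.x under the hood.
  client.load_table_from_dataframe(df, "spark_upload", writer="spark", if_exists="overwrite")
  ```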
- Enable passing extra keyword arguments (e.g., `fmt="msgpack"`) to `pandas_td.to_td`. (#80)
- Support `engine_version` option in query APIs. (#81)
- Add `force_tdclient` option to Presto query interfaces for deterministically using `tdclient` rather than `prestodb`. (#85) A combined sketch of these three options follows this group.
- Add a precondition check to `Writer#write_dataframe` to verify the type of the `table` argument. (#86)
- Documentation updates. (#82, #89)
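  A combined sketch of the three options above; the connection values, table names, and the specific `if_exists` and `engine_version` values are assumptions for illustration:

  ```python
  import pandas as pd
  import pytd
  import pytd.pandas_td as td

  con = td.connect(apikey="1/XXXXXXXX", endpoint="https://api.treasuredata.com/")

  # fmt="msgpack" is forwarded through pandas_td.to_td to the writer (#80).
  df = pd.DataFrame({"x": [1, 2, 3]})
  td.to_td(df, "sample_db.xs", con, if_exists="replace", index=False, fmt="msgpack")

  client = pytd.Client(apikey="1/XXXXXXXX", database="sample_db")

  # engine_version picks a query engine release line (#81), and
  # force_tdclient=True makes the Presto query go through tdclient
  # instead of prestodb (#85).
  result = client.query(
      "SELECT COUNT(1) AS cnt FROM xs",
      engine_version="stable",      # assumed value for illustration
      force_tdclient=True,
  )
  print(result["columns"], result["data"])
  ```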
- Support nullable columns containing `pandas.NA`, which was newly introduced in pandas 1.0.0. The `Writer` module internally converts `pandas.NA` into `None` before ingesting a `pandas.DataFrame` to Treasure Data. Note that `Writer#write_dataframe` may behave differently before and after upgrading pandas to 1.0.0 because of the experimental, backward-incompatible updates to the dependent package. (#72)
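  A small sketch of the conversion, with placeholder names; the missing cells below reach Treasure Data as plain NULLs:

  ```python
  import pandas as pd
  import pytd

  df = pd.DataFrame({
      "id": pd.array([1, None, 3], dtype="Int64"),  # nullable integer column
      "name": ["foo", pd.NA, "bar"],
  })

  client = pytd.Client(apikey="1/XXXXXXXX", database="sample_db")

  # The Writer replaces every pandas.NA with None before ingestion.
  client.load_table_from_dataframe(df, "nullable_demo", if_exists="overwrite")
  ```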
- Support list-type columns in `BulkImportWriter#write_dataframe`. A list-type column of a `pandas.DataFrame` is stored in a Treasure Data table as an array-type column. (#60)
- Store the resulting object from `Client#query` in `Client.query_executed`. The object can be a Treasure Data job ID if the query is executed via `tdclient`. (#63) See the sketch after this group for both behaviors.
- Support null values in `str`, `bool`, and `"Int64"`-type columns of a `pandas.DataFrame`. (#68, #71)
- Update minimum required pandas version to 0.24.0. (#69)
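  A sketch covering the list-column and `query_executed` behaviors from this group; the database and table names are illustrative:

  ```python
  import pandas as pd
  import pytd

  client = pytd.Client(apikey="1/XXXXXXXX", database="sample_db")

  # The list-type "tags" column is stored as an array-type column (#60).
  df = pd.DataFrame({"id": [1, 2], "tags": [["a", "b"], ["c"]]})
  client.load_table_from_dataframe(df, "tagged_rows", writer="bulk_import", if_exists="overwrite")

  # After running a query via tdclient, the resulting object is kept
  # in Client.query_executed, e.g., a Treasure Data job id (#63).
  client.query("SELECT COUNT(1) AS cnt FROM tagged_rows", force_tdclient=True)
  print(client.query_executed)
  ```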
- Update documentation site. (#49, #57)
- Add Treasure Data API endpoint HTTPS scheme validation. (#51)
- Support bulk importing with the MessagePack format. (#53)
- Improve stability of the `BulkImportWriter` session ID. (#55)
- Require td-client-python version 1.1.0 or later. (#56)
- Add `Client#exists(database, table)` and `Client#create_database_if_not_exists(database)` methods. (#58)
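  A minimal sketch of the two new methods; the database and table names are placeholders:

  ```python
  import pytd

  client = pytd.Client(apikey="1/XXXXXXXX", database="sample_db")

  # Create the database only if it is absent, then probe for a table.
  client.create_database_if_not_exists("sample_db")
  if not client.exists("sample_db", "events"):
      print("sample_db.events does not exist yet")
  ```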
- Clean up docstrings and launch documentation site. (#43, #44)
- Disable `type`, one of the Treasure Data-specific query parameters, because it conflicts with the `engine` option. (#45)
- Add td-pyspark dependency for easy access to the td-spark functionalities. (#46, #47)