JDBC Driver for Apache Cassandra

Type 4 JDBC driver for Apache Cassandra. Building on top of DataStax Java Driver and JSqlParser, it intends to provide better SQL compatibility over CQL, so that it works well with existing tools like SQuirreL SQL for SQL development, JMeter for stress testing, and Pentaho BI Suite for data processing and reporting.

You may find this helpful if you come from the RDBMS world and want to get your hands on Apache Cassandra right away. That said, it is NOT recommended for production use, only for development and research. Use DataStax Java Driver, Spark and maybe Presto if you want to do something serious.

OK, you have been warned :) Now go ahead to download the latest driver and give it a shot!

Features

  • Implicit type conversion for ease of use
...
// set parameter
preparedStatement.setString(index, "13:30:54.234"); // or setTime(index, new Time(1465536654234L))
...
// get query result
resultSet.getTime(index); // or getString(index)
...
  • Per-statement instructions through CQL comments (a.k.a. magic comments)
/* please be aware that only single-line comments beginning with "set" are recognized */
-- set consistency_level = ALL; fetch_size = 10000;
// set no_limit = true; read_timeout = 600;
select * from logs
  • Limit unconstrained queries according to configuration
-- the following SQL will be translated to "SELECT * FROM logs LIMIT 10000"
-- you may change the behavior via magic comments or config.yaml
select * from logs
  • Improved SQL compatibility: table aliases work today, with more to come (group by, select into, insert select, field expressions, Lucene filters if you have Stratio's Cassandra Lucene Index installed)
-- the following SQL will be translated into "SELECT * FROM logs LIMIT 10000"
select l.* from logs l
  • Possibly support alternative Java drivers, for example Netflix Astyanax

  • Possibly support alternative storage (e.g. HBase just for fun)
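
A note on the implicit type conversion example above: the string "13:30:54.234" and new Time(1465536654234L) only describe the same time of day in a zone that is 8 hours ahead of UTC, because java.sql.Time wraps epoch milliseconds. The sketch below shows that relationship using java.time (Java 8 or later). The class and method names are made up for illustration, and which time zone the driver itself applies during conversion is not stated here, so treat that part as an open assumption.

```java
import java.time.Instant;
import java.time.LocalTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical helper, not part of the driver.
final class TimeOfDaySketch {
    // Resolve epoch milliseconds to a wall-clock time in the given zone.
    static LocalTime timeOfDay(long epochMillis, ZoneId zone) {
        return Instant.ofEpochMilli(epochMillis).atZone(zone).toLocalTime();
    }

    public static void main(String[] args) {
        long millis = 1465536654234L; // value used in the feature example
        System.out.println(timeOfDay(millis, ZoneOffset.ofHours(8))); // 13:30:54.234
        System.out.println(timeOfDay(millis, ZoneOffset.UTC));        // 05:30:54.234
    }
}
```

If your JVM runs in a different time zone, prefer the string form when you mean a fixed time of day.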

Get Started

Before you start, please make sure you have JDK 7 or above - JDK 6 is not supported.

Get the driver

The latest release of the driver is available on Maven Central. You can add it to your application using the following Maven dependency:

<dependency>
	<groupId>com.github.zhicwu</groupId>
	<artifactId>cassandra-jdbc-driver</artifactId>
	<version>0.6.1</version>
	<!-- comment out the classifier if you don't need shaded jar -->
	<classifier>shaded</classifier>
</dependency>

If you can't use a dependency management tool, you can download the latest shaded jar from here.

Optionally, if you want to build the driver on your own and have both Git and Maven installed, follow the instructions below:

$ git clone https://github.com/zhicwu/cassandra-jdbc-driver
$ cd cassandra-jdbc-driver
$ mvn clean package
$ ls -alF target/cassandra-jdbc-driver-*-shaded.jar

Say Hello to Cassandra

This is pretty much the same as for any other database, apart from the driver class and connection URL.

...
// Driver driver = new com.github.cassandra.jdbc.CassandraDriver();
Properties props = new Properties();
props.setProperty("user", "cassandra");
props.setProperty("password", "cassandra");

// ":datastax" in the URL is optional, it suggests to use DataStax Java driver as the provider to connect to Cassandra
Connection conn = DriverManager.getConnection("jdbc:c*:datastax://host1,host2/system_auth?consistencyLevel=ONE", props);
// change current keyspace from system_auth to system
conn.setSchema("system");

// query peers table in current keyspace, by default the SQL below will be translated into the following CQL:
// SELECT * FROM peers LIMIT 10000
// Please be aware that the original SQL does not work in Cassandra as table alias is not supported
ResultSet rs = conn.createStatement().executeQuery("select p.* from peers p");
while (rs.next()) {
...
}
...
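
The connection URL above can also be assembled from its parts. This is only a sketch: the helper class is hypothetical, the URL shape is inferred from the single example in this README, and joining several query parameters with "&" is an assumption rather than documented driver behavior.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the driver.
final class CassandraJdbcUrlSketch {
    // provider may be null or empty, since ":datastax" is optional in the URL.
    static String buildUrl(String provider, List<String> hosts,
                           String keyspace, Map<String, String> params) {
        StringBuilder url = new StringBuilder("jdbc:c*");
        if (provider != null && !provider.isEmpty()) {
            url.append(':').append(provider);
        }
        url.append("://").append(String.join(",", hosts)).append('/').append(keyspace);
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator).append(entry.getKey()).append('=').append(entry.getValue());
            separator = '&'; // assumption: additional parameters are joined with '&'
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("consistencyLevel", "ONE");
        // Reproduces the URL used in the example above.
        System.out.println(buildUrl("datastax", Arrays.asList("host1", "host2"), "system_auth", params));
    }
}
```

The resulting string can be passed to DriverManager.getConnection together with the user and password properties shown earlier.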

Configuration

Driver Configuration

Default settings of this driver can be found in config.yaml. Besides changing it in the jar file, you may set system property "cassandra.jdbc.driver.config" to use your own config instead.

$ java -Dcassandra.jdbc.driver.config=/usr/local/private/new_config.yaml ...
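
If you cannot change the JVM command line, the same system property can be set from code. This is a sketch with a hypothetical helper class; it assumes the driver reads the property when it is first loaded, so call it before the first DriverManager.getConnection.

```java
// Hypothetical helper, not part of the driver.
final class DriverConfigSketch {
    // Property name taken from the command line example above.
    static final String CONFIG_PROPERTY = "cassandra.jdbc.driver.config";

    // Point the driver at a custom config file.
    // Assumption: must run before the driver class is loaded.
    static void useConfig(String yamlPath) {
        System.setProperty(CONFIG_PROPERTY, yamlPath);
    }

    public static void main(String[] args) {
        useConfig("/usr/local/private/new_config.yaml");
        System.out.println(System.getProperty(CONFIG_PROPERTY));
    }
}
```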

Connection Properties

Connection Properties

Magic Comments

To set read timeout to 120 seconds just for a specific query, you can do it by adding a single line comment:

-- set read_timeout=120
select * from xyz

Please note that magic comments must start with "-- set " or "// set ", and you can put multiple instructions on one line by separating them with semicolons:

-- set read_timeout = 120; replace_null_value = true
-- set no_limit = true
select * from xyz

All supported instructions in magic comments are declared here.
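
When queries are built in application code, a small helper can prepend the magic comment for you. The class below is a hypothetical sketch and not part of the driver; it only produces the comment syntax described above and does not check that the instruction names are ones the driver supports.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of the driver.
final class MagicCommentSketch {
    // prefix must be "--" or "//", the two comment styles described above.
    static String withMagicComment(String prefix, Map<String, String> instructions, String sql) {
        if (!"--".equals(prefix) && !"//".equals(prefix)) {
            throw new IllegalArgumentException("prefix must be \"--\" or \"//\"");
        }
        if (instructions.isEmpty()) {
            return sql; // nothing to add
        }
        StringJoiner joined = new StringJoiner("; ");
        for (Map.Entry<String, String> entry : instructions.entrySet()) {
            joined.add(entry.getKey() + " = " + entry.getValue());
        }
        // The comment goes on its own line, directly above the statement.
        return prefix + " set " + joined + "\n" + sql;
    }

    public static void main(String[] args) {
        Map<String, String> instructions = new LinkedHashMap<>();
        instructions.put("read_timeout", "120");
        instructions.put("no_limit", "true");
        System.out.println(withMagicComment("//", instructions, "select * from xyz"));
    }
}
```

Using "//" as the prefix keeps the comment intact in tools that strip "--" comments before sending the query.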

HOWTOs

SQuirreL SQL

  1. Configure the Apache Cassandra driver
  2. Create a new alias using the above driver
  3. Congratulations! You can now connect to Cassandra
  4. To use magic comments, please use "//" instead of "--", as SQuirreL SQL removes the latter automatically before sending the query to the JDBC driver.

JMeter

  1. Put the driver in $JMETER_HOME/lib directory
  2. Use JDBC Sampler to access Cassandra

Pentaho Data Integration (a.k.a. Kettle)

  1. Put the driver in $KETTLE_HOME/lib directory
  2. Create new connection to Cassandra
  3. Use TableInput / TableOutput steps to query / update Cassandra data
  4. You may want to add "-- set replace_null_value = true" to your query, as Kettle tries to use NULL values to get metadata

Pentaho BI Server

  1. Put the driver in $BISERVER_HOME/tomcat/lib directory
  2. Create new datasource pointing to Cassandra
  3. Use CDA to issue SQL to access Cassandra - Mondrian is not tested and is not supposed to work

Build

$ mvn -Prelease notice:generate
$ mvn license:format
$ mvn clean package

TODOs

  • Remove CQL Parser to support JDK 7
  • UDT support and smooth type conversion
  • Multiple ResultSet support, especially when tracing is turned on
  • Better SQL compatibility (e.g. SELECT INTO, GROUP BY and probably simple table joins and sub-queries)
  • (Basic) Mondrian support
  • More providers(and storage?)...