Skip to content
This repository was archived by the owner on Apr 11, 2024. It is now read-only.

Commit 2c11b76

Browse files
committed
Merge pull request #1 from eladamitpxi/dev-logstash2
A new version for logstash 2.x
2 parents d7b9ec7 + ed85604 commit 2c11b76

20 files changed

+1701
-316
lines changed

.gitignore

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,4 +2,8 @@
22
Gemfile.lock
33
.bundle
44
vendor
5-
/nbproject/private/
5+
/nbproject/private/
6+
.idea
7+
coverage
8+
tmp
9+
.sonar

CONTRIBUTORS

Lines changed: 5 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -1,21 +1,7 @@
1-
The following is a list of people who have contributed ideas, code, bug
2-
reports, or in general have helped logstash along its way.
1+
The following is a list of people who have contributed (in chronological order) ideas, code, bug
2+
reports, or in general have helped this plugin along its way.
33

44
Contributors:
5-
* Aaron Mildenstein (untergeek)
6-
* Graham Bleach (bleach)
7-
* John E. Vincent (lusis)
8-
* Jordan Sissel (jordansissel)
9-
* Kevin Amorin (kamorin)
10-
* Kevin O'Connor (kjoconnor)
11-
* Kurt Hurtado (kurtado)
12-
* Mathias Gug (zimathias)
13-
* Pete Fritchman (fetep)
14-
* Pier-Hugues Pellerin (ph)
15-
* Richard Pijnenburg (electrical)
16-
* bitsofinfo (bitsofinfo)
17-
18-
Note: If you've sent us patches, bug reports, or otherwise contributed to
19-
Logstash, and you aren't on the list above and want to be, please let us know
20-
and we'll make sure you're here. Contributions from folks like you are what make
21-
open source awesome.
5+
* Oleg Tokarev (otokarev)
6+
* Elad Amit (eladamitpxi, amitelad7)
7+
* Valentin Fischer (valentinul)

Gemfile

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,2 +1,4 @@
1+
#ruby=jruby
2+
#ruby-gemset=logstash-output-cassandra
13
source 'https://rubygems.org'
2-
gemspec
4+
gemspec

README.md

Lines changed: 92 additions & 52 deletions
Original file line numberDiff line numberDiff line change
@@ -1,72 +1,104 @@
11
# Logstash Cassandra Output Plugin
22

3-
This is a plugin for [Logstash](https://github.com/elasticsearch/logstash).
3+
This is a plugin for [Logstash](https://github.com/elastic/logstash).
44

55
It is fully free and fully open source. The license is Apache 2.0, meaning you are pretty much free to use it however you want in whatever way.
66

7+
It was originally a fork of the [logstash-output-cassandra](https://github.com/otokarev/logstash-output-cassandra) plugin by [Oleg Tokarev](https://github.com/otokarev), which has gone unmaintained and went through a major re-design in this version we built.
8+
79
## Usage
810

911
<pre><code>
1012
output {
1113
cassandra {
12-
# Credentials of a target Cassandra, keyspace and table
13-
# where you want to stream data to.
14-
username => "cassandra"
15-
password => "cassandra"
16-
hosts => ["127.0.0.1"]
17-
keyspace => "logs"
18-
table => "query_log"
14+
# List of Cassandra hostname(s) or IP-address(es)
15+
hosts => [ "cass-01", "cass-02" ]
16+
17+
# The port cassandra is listening to
18+
port => 9042
19+
20+
# The protocol version to use with cassandra
21+
protocol_version => 4
22+
1923
# Cassandra consistency level.
20-
# Options: "any", "one", "two", "three", "quorum", "all",
21-
# "local_quorum", "each_quorum", "serial", "local_serial",
22-
# "local_one"
24+
# Options: "any", "one", "two", "three", "quorum", "all", "local_quorum", "each_quorum", "serial", "local_serial", "local_one"
2325
# Default: "one"
24-
consistency => "all"
25-
26-
# Where from the event hash to take a message
27-
source => "payload"
28-
29-
# if cassandra does not understand formats of data
30-
# you feeds it with, just provide some hints here
26+
consistency => 'any'
27+
28+
# The keyspace to use
29+
keyspace => "a_ks"
30+
31+
# The table to use (event level processing (e.g. %{[key]}) is supported)
32+
table => "%{[@metadata][cassandra_table]}"
33+
34+
# Username
35+
username => "cassandra"
36+
37+
# Password
38+
password => "cassandra"
39+
40+
# An optional hints hash which will be used in case filter_transform or filter_transform_event_key are not in use
41+
# It is used to trigger a forced type casting to the cassandra driver types in
42+
# the form of a hash from column name to type name in the following manner:
3143
hints => {
3244
id => "int"
3345
at => "timestamp"
3446
resellerId => "int"
3547
errno => "int"
3648
duration => "float"
37-
ip => "inet"}
38-
39-
# Sometimes it's usefull to ignore malformed messages
40-
# (e.x. source contains nothing),
41-
# in the case set ignore_bad_messages to True.
42-
# By default it is False
43-
ignore_bad_messages => true
44-
45-
# Sometimes it's usefull to ignore problems with a convertation
46-
# of a received value to Cassandra format and set some default
47-
# value (inet: 0.0.0.0, float: 0.0, int: 0,
48-
# uuid: 00000000-0000-0000-0000-000000000000,
49-
# timestamp: 1970-01-01 00:00:00) in the case set
50-
# ignore_bad_messages to True.
51-
# By default it is False
52-
ignore_bad_values => true
53-
54-
# Datastax cassandra driver supports batch insert.
55-
# You can define the batch size explicitely.
56-
# By default it is 1.
57-
batch_size => 100
58-
59-
# Every batch_processor_thread_period sec. a special thread
60-
# pushes all collected messages to Cassandra. By default it is 1 (sec.)
61-
batch_processor_thread_period => 1
62-
63-
# max max_retries times the plugin will push failed batches
64-
# to Cassandra before give up. By defult it is 3.
65-
max_retries => 3
66-
67-
# retry_delay secs. between two sequential tries to push a failed batch
68-
# to Cassandra. By default it is 3 (secs.)
69-
retry_delay => 3
49+
ip => "inet"
50+
}
51+
52+
# The retry policy to use (the default is the default retry policy)
53+
# the hash requires the name of the policy and the params it requires
54+
# The available policy names are:
55+
# * default => retry once if needed / possible
56+
# * downgrading_consistency => retry once with a best guess lowered consistency
57+
# * failthrough => fail immediately (i.e. no retries)
58+
# * backoff => a version of the default retry policy but with configurable backoff retries
59+
# The backoff options are as follows:
60+
# * backoff_type => either * or ** for linear and exponential backoffs respectively
61+
# * backoff_size => the left operand for the backoff type in seconds
62+
# * retry_limit => the maximum amount of retries to allow per query
63+
# example:
64+
# using { "type" => "backoff" "backoff_type" => "**" "backoff_size" => 2 "retry_limit" => 10 } will perform 10 retries with the following wait times: 1, 2, 4, 8, 16, ... 1024
65+
# NOTE: there is an underlying assumption that the insert query is idempotent !!!
66+
# NOTE: when the backoff retry policy is used, it will also be used to handle pure client timeouts and not just ones coming from the coordinator
67+
retry_policy => { "type" => "default" }
68+
69+
# The command execution timeout
70+
request_timeout => 1
71+
72+
# Ignore bad values
73+
ignore_bad_values => false
74+
75+
# In Logstashes >= 2.2 this setting defines the maximum sized bulk request Logstash will make
76+
# You you may want to increase this to be in line with your pipeline's batch size.
77+
# If you specify a number larger than the batch size of your pipeline it will have no effect,
78+
# save for the case where a filter increases the size of an inflight batch by outputting
79+
# events.
80+
#
81+
# In Logstashes <= 2.1 this plugin uses its own internal buffer of events.
82+
# This config option sets that size. In these older logstashes this size may
83+
# have a significant impact on heap usage, whereas in 2.2+ it will never increase it.
84+
# To make efficient bulk API calls, we will buffer a certain number of
85+
# events before flushing that out to Cassandra. This setting
86+
# controls how many events will be buffered before sending a batch
87+
# of events. Increasing the `flush_size` has an effect on Logstash's heap size.
88+
# Remember to also increase the heap size using `LS_HEAP_SIZE` if you are sending big commands
89+
# or have increased the `flush_size` to a higher value.
90+
flush_size => 500
91+
92+
# The amount of time since last flush before a flush is forced.
93+
#
94+
# This setting helps ensure slow event rates don't get stuck in Logstash.
95+
# For example, if your `flush_size` is 100, and you have received 10 events,
96+
# and it has been more than `idle_flush_time` seconds since the last flush,
97+
# Logstash will flush those 10 events automatically.
98+
#
99+
# This helps keep both fast and slow log streams moving along in
100+
# near-real-time.
101+
idle_flush_time => 1
70102
}
71103
}
72104
</code></pre>
@@ -106,4 +138,12 @@ bin/logstash -e 'output {cassandra {}}'
106138
```
107139

108140
## TODO
109-
141+
* Fix the authentication bug (no user;pass in cassandra plugin?!)
142+
* Finish integration specs
143+
* it "properly works with counter columns"
144+
* it "properly adds multiple events to multiple tables in the same bulk"
145+
* Improve retries to include (but probably only handle Errors::Timeout and Errors::NoHostsAvailable):
146+
* \#get_query
147+
* \#execute_async
148+
* Upgrade / test with logstash 2.3
149+
* Upgrade / test with cassandra 3

0 commit comments

Comments
 (0)