CloudLens is a Swift library for processing machine-generated text streams such as log streams. CloudLens supports plain text as well as JSON-encoded streams.
Analyzing logs is challenging. Logs contain a mix of text and semi-structured metadata such as timestamps. Logs are often aggregated from multiple sources with under-specified, heterogeneous formats, so parsing techniques based on schemas or grammars are not practical. In contrast, CloudLens is built on the premise that parsing need not be exhaustive, and it relies on pattern matching for data extraction. Matches can augment the raw text with structured attributes and trigger actions. For instance, a single line of CloudLens code can detect error messages in a log stream, extract the error code for future analysis, and count errors.
stream.process(onPattern: "error (?<error:Number>\\d+)") { _ in errorCount += 1 }
Thanks to IBM’s Swift Sandbox, it is possible to try CloudLens online using this link. Simply press Play to run the example. The code editor is fully functional but the sandbox cannot access the network, so testing is limited to the supplied log.txt file originally produced by Travis CI for Apache OpenWhisk.
CloudLens has been tested on macOS and Linux. CloudLens uses IBM’s fork of SwiftyJSON for Linux compatibility.
Clone the repository:
git clone https://github.com/cloudlens/swift-cloudlens.git
CloudLens is built using the Swift Package Manager. To build, execute in the root CloudLens folder:
swift build --config release
The build process automatically fetches required dependencies from GitHub.
The build process automatically compiles a simple test program available in Sources/Main/main.swift. To run the example program, execute:
.build/release/Main
To load CloudLens in the Swift REPL, execute in the root CloudLens folder:
swift -I.build/release -L.build/release -lCloudLens
Then import the CloudLens module with:
import CloudLens
To build the necessary library on Linux, please follow instructions at the end of Package.swift.
A workspace is provided to support CloudLens development in Xcode. It includes a CloudLens playground to make it easy to experiment with CloudLens.
open CloudLens.xcworkspace
To build and run the example program in Xcode, make sure to select the "Main" target and activate the console.
A CloudLens program constructs and processes streams of JSON objects. JSON support is provided by the SwiftyJSON library.
A CloudLens stream (an instance of the CLStream class) is a lazy sequence of JSON objects. A stream can be derived from various sources. The following code constructs a stream with four elements. Each stream element is a JSON object with a single field "message" of type String:
let stream = CLStream(messages: "error 42", "warning", "info", "error 255")
The next example constructs a stream from a text file.
Each line becomes a JSON object with a single field "message" that contains the line's text.
let stream = CLStream(textFile: "log.txt")
The next example constructs a stream from a file containing an array of JSON objects.
let stream = CLStream(jsonFile: "array.json")
In general, a stream can be constructed from any function of type () -> JSON?.
Streams are constructed lazily when possible. For example, for a stream constructed from a text file, the file is read line by line, as needed.
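The pull-based construction described above can be illustrated without CloudLens. The sketch below is only a stand-in: it uses a plain closure of type () -> String? in place of () -> JSON?, and the names source, pulls, and nextLine are hypothetical. It shows that elements are produced on demand, when the consumer asks for them.

```swift
// Stand-in for a stream source of type () -> Element?; not CloudLens code.
var source = ["error 42", "warning", "info", "error 255"]
var pulls = 0  // counts how many elements have actually been produced

let nextLine: () -> String? = {
    guard !source.isEmpty else { return nil }  // nil signals the end of the stream
    pulls += 1
    return source.removeFirst()
}

// Nothing has been produced yet: the closure runs only when called.
print(pulls)                  // prints 0

let first = nextLine()        // produces exactly one element
print(first ?? "nil", pulls)  // prints: error 42 1

// AnyIterator turns the closure into a sequence that drains the rest on demand.
let rest = Array(AnyIterator(nextLine))
print(rest.count, pulls)      // prints: 3 4
```

A file-backed source would follow the same shape, with the closure reading one line per call instead of removing an array element.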
The process method of the CLStream class registers actions to be executed on the stream elements. The run method triggers the execution of these actions.
For instance, this code specifies an action to be executed on all stream elements:
stream.process { obj in print(obj) }
But nothing happens until run is invoked:
stream.run()
Both methods return self, so the following syntax is also possible:
CLStream(messages: "error 42", "warning", "info", "error 255")
    .process { obj in print(obj) }
    .run()
This example outputs:
{"message":"error 42"}
{"message":"warning"}
{"message":"info"}
{"message":"error 255"}
Stream elements are processed in order. When multiple actions are specified, they are executed in registration order for each stream element. Moreover, all actions for a given stream element are executed before the next stream element is considered. For instance, this code
let stream = CLStream(messages: "foo", "bar")
stream.process { obj in print(1, obj) }
stream.process { obj in print(2, obj) }
stream.run()
outputs:
1 {"message":"foo"}
2 {"message":"foo"}
1 {"message":"bar"}
2 {"message":"bar"}
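The ordering rule can be pictured as two nested loops: the outer loop walks the elements and the inner loop walks the registered actions. The following is a minimal sketch of that rule, not CloudLens's own scheduler. The names elements, actions, and output are hypothetical, and plain strings stand in for JSON objects.

```swift
// Hypothetical model of the ordering rule; strings stand in for JSON objects.
let elements = ["foo", "bar"]
var output: [String] = []

// Actions are kept in registration order.
let actions: [(String) -> Void] = [
    { element in output.append("1 \(element)") },
    { element in output.append("2 \(element)") },
]

// Every action runs on an element before the next element is considered.
for element in elements {
    for action in actions {
        action(element)
    }
}

output.forEach { print($0) }
// prints:
// 1 foo
// 2 foo
// 1 bar
// 2 bar
```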
By default, run preserves the output stream, which becomes the input stream for subsequent actions. For instance, this code
let stream = CLStream(messages: "foo", "bar")
stream.process { obj in print(1, obj) }
stream.run()
stream.process { obj in print(2, obj) }
stream.run()
outputs:
1 {"message":"foo"}
1 {"message":"bar"}
2 {"message":"foo"}
2 {"message":"bar"}
Alternatively, the following invocation of run discards the output stream elements as they are produced:
stream.run(withHistory: false)
The latter is recommended to avoid buffering the entire stream.
It is possible to mutate, replace, or remove the stream element being processed.
// to mutate the stream element
stream.process { obj in obj["timestamp"] = String(describing: Date()) }
// to remove the element from the stream
stream.process { obj in obj = .null }
// to replace the element in the stream
stream.process { obj in obj = otherObject }
// to replace one stream element with multiple objects
stream.process { obj in obj = CLStream.emit([thisObject, thatObject]) }
Actions can be guarded by activation conditions.
stream.process(onPattern: "error", onKey: "message") { obj in print(obj) }
If a key is specified, the action only executes for JSON objects that have a value for the given key. In addition, if a pattern is specified, the field value must match the pattern. If a pattern is specified but no key, the key defaults to "message". Objects that do not satisfy the activation condition are unaffected by the action.
Keys can be paths in JSON objects. Patterns can be simple strings or regular expressions.
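The activation rule above reduces to a small predicate. The sketch below is one assumed reading of it, written against a flat [String: String] dictionary rather than JSON, so key paths are not modeled. The function isActive is hypothetical and is not part of the CloudLens API. It also assumes that a pattern may match anywhere in the field value.

```swift
import Foundation

// Hypothetical predicate mirroring the activation rule; not CloudLens code.
func isActive(_ object: [String: String], pattern: String? = nil, key: String? = nil) -> Bool {
    // With neither a key nor a pattern, the action applies to every element.
    guard pattern != nil || key != nil else { return true }
    // A pattern without a key is checked against the "message" field.
    let field = key ?? "message"
    // The object must have a value for the key.
    guard let value = object[field] else { return false }
    // Without a pattern, having the key is enough.
    guard let pattern = pattern else { return true }
    // Assumption: the pattern may match anywhere in the value.
    return value.range(of: pattern, options: .regularExpression) != nil
}

print(isActive(["message": "error 42"], pattern: "error"))             // prints true
print(isActive(["message": "warning"], pattern: "error"))              // prints false
print(isActive(["message": "info"], key: "error"))                     // prints false
print(isActive(["message": "error 42", "error": "42"], key: "error"))  // prints true
```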
A regular expression pattern cannot include numbered capture groups but it may include named capture groups. Upon a successful match, the JSON object is augmented with new fields that bind each group name to the corresponding substring in the match. For instance,
let stream = CLStream(messages: "error 42", "warning", "info", "error 255")
stream.process(onPattern: "error (?<error>\\d+)") { obj in print(obj) }
stream.run()
outputs:
{"error":"42","message":"error 42"}
{"error":"255","message":"error 255"}
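To make the augmentation step concrete, here is a sketch that uses Foundation's NSRegularExpression directly. It is not how CloudLens is implemented. The function augment and its groups parameter are hypothetical, and the caller must list the group names because this sketch does not extract them from the pattern. It assumes a Foundation version that provides range(withName:) on match results.

```swift
import Foundation

// Hypothetical helper, not part of CloudLens: returns the object augmented with
// one field per matched named group, or nil when the pattern does not match.
func augment(_ object: [String: String], pattern: String, groups: [String]) -> [String: String]? {
    guard let message = object["message"],
          let regex = try? NSRegularExpression(pattern: pattern) else { return nil }
    let whole = NSRange(message.startIndex..<message.endIndex, in: message)
    guard let match = regex.firstMatch(in: message, options: [], range: whole) else { return nil }

    var result = object
    for name in groups {
        // Groups that did not participate in the match report NSNotFound and are skipped.
        if let range = Range(match.range(withName: name), in: message) {
            result[name] = String(message[range])
        }
    }
    return result
}

let messages = ["error 42", "warning", "info", "error 255"]
for message in messages {
    if let object = augment(["message": message], pattern: "error (?<error>\\d+)", groups: ["error"]) {
        print(object["error"] ?? "", object["message"] ?? "")
    }
}
// prints:
// 42 error 42
// 255 error 255
```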
Named capture groups can be given an explicit type using the :type syntax, for example "(?<error:Number>\\d+)". The supported types are Number, String, and Date, with String the implicit default. A Date type should include a date format specification, as in "(?<date:Date[yyyy-MM-dd' 'HH:mm:ss.SSS]>^.{23})".
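As an illustration of what such a date format specification means, the sketch below parses a 23-character timestamp with Foundation's DateFormatter using the same format string. Whether CloudLens uses DateFormatter internally is an assumption, as are the sample log line, the en_US_POSIX locale, and the UTC time zone, which are fixed here only to make parsing deterministic.

```swift
import Foundation

let utc = TimeZone(secondsFromGMT: 0)!  // assumption: timestamps are in UTC

let formatter = DateFormatter()
formatter.locale = Locale(identifier: "en_US_POSIX")  // avoid locale-dependent parsing
formatter.timeZone = utc
formatter.dateFormat = "yyyy-MM-dd' 'HH:mm:ss.SSS"

// Hypothetical log line; the first 23 characters are the timestamp, as in ^.{23}.
let line = "2017-03-01 12:34:56.789 error 42"
let stamp = String(line.prefix(23))
print(stamp)  // prints 2017-03-01 12:34:56.789

if let date = formatter.date(from: stamp) {
    var calendar = Calendar(identifier: .gregorian)
    calendar.timeZone = utc
    let parts = calendar.dateComponents([.year, .month, .day], from: date)
    print(parts.year ?? 0, parts.month ?? 0, parts.day ?? 0)  // prints 2017 3 1
} else {
    print("timestamp does not match the format")
}
```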
The special key CLKey.endOfStream may be used to defer an action until after the complete stream has been processed:
let stream = CLStream(messages: "error 42", "warning", "info", "error 255")
var count = 0
stream.process(onKey: "error") { _ in count += 1 }
stream.process(onKey: CLKey.endOfStream) { _ in print(count, "error(s)") }
stream.run()
outputs:
2 error(s)
A deferred action may append new elements at the end of the stream:
stream.process(onKey: CLKey.endOfStream) { obj in obj = ["message": "\(count) error(s)"] }
CloudLens can easily be extended with new processing lenses, for example:
extension CLStream {
    @discardableResult func grep(_ pattern: String) -> CLStream {
        return process(onPattern: pattern) { obj in print(obj["message"]) }
    }
}
CLStream(messages: "error 42", "warning", "info", "error 255")
    .grep("error")
    .run()
Copyright 2015-2017 IBM Corporation
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.