attractivechaos/k8

Getting Started

# Download precompiled binaries
wget -O- https://github.com/attractivechaos/k8/releases/download/v1.2/k8-1.2.tar.bz2 | tar -jxf -
k8-1.2/k8-x86_64-Linux -e 'print(Math.log(2))'

# Compile from source code. This requires compiling Node.js first (use v18.20.3 instead on Mac):
wget -O- https://nodejs.org/dist/v18.19.1/node-v18.19.1.tar.gz | tar -zxf -
cd node-v18.19.1 && ./configure && make -j16
# Then compile k8
git clone https://github.com/attractivechaos/k8
cd k8 && make

The following example counts the number of lines:

if (arguments.length == 0) { // test command-line arguments
	warn("Usage: k8 lc.js <in.txt>");
	exit(1);
}
let buf = new Bytes();
let n = 0, file = new File(arguments[0]);
while (file.readline(buf) >= 0) ++n;
file.close();
buf.destroy();
print(n);
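
The same read loop extends naturally to field-level processing. The sketch below sums one TAB-delimited column. The helper name sumColumn and the script name colsum.js are made up for illustration, and the code assumes that readline() leaves the newline out of the buffer. The typeof guard is only there so the helper can also be loaded by another runtime for testing; under k8 it is always true.

```javascript
// Sum the values in one TAB-delimited column, reading line by line.
// `file` needs readline(buf) and `buf` needs toString(), as k8's File and Bytes do.
function sumColumn(file, buf, col) {
	let sum = 0;
	while (file.readline(buf) >= 0) {
		const fields = buf.toString().split("\t");
		const x = parseFloat(fields[col]);
		if (!isNaN(x)) sum += x; // skip missing or non-numeric fields
	}
	return sum;
}

// Run the k8-specific part only when k8's globals are present.
if (typeof k8_version === "function") {
	if (arguments.length < 2) {
		warn("Usage: k8 colsum.js <in.txt> <column>");
		exit(1);
	}
	const buf = new Bytes();
	const file = new File(arguments[0]);
	const total = sumColumn(file, buf, parseInt(arguments[1]) - 1); // 1-based column on the command line
	file.close();
	buf.destroy();
	print(total);
}
```

Only one line is held in memory at a time, so the input size does not matter.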

Introduction

K8 is a JavaScript runtime built on top of Google's v8 JavaScript engine. It provides a resizable binary buffer and synchronous APIs for plain file writing and gzip'd file reading.

Motivations

JavaScript is among the fastest scripting languages. It is essential for web development but, in my opinion, is not often used for large-scale text processing or command-line utilities because it lacks sensible file I/O. Current JavaScript runtimes such as Node.js and Deno focus on asynchronous I/O and whole-file reading. Even reading a file line by line, which is required to work with large files, is cumbersome. K8 aims to solve this problem. With synchronous I/O APIs, JavaScript can be a powerful language for developing command-line tools.

Installation

It is recommended to download precompiled binaries (also available from Zenodo). If you want to compile k8, you first need to compile Node.js, which bundles v8 and provides a more convenient way to build v8. As the v8 APIs change quickly, both Node.js and k8 only work with specific versions of v8. The k8-1.x branch is known to work with node-18.x but not 19.x or higher. It is also worth noting that node-18.20.x upgraded c-ares, which is now incompatible with older glibc. Node-18.19.1 is the most recent version that can be compiled on CentOS 7. On the other hand, Node-18.19.x cannot be compiled on macOS with clang-15. Node-18.20.3 is known to work there.

API Documentation

Functions

// Print to stdout (print) or stderr (warn). TAB delimited if multiple arguments.
function print(data: any)
function warn(data: any)

// Exit
function exit(code: number)

// Load a JavaScript file and execute. It searches the working directory, the
// script directory and then the K8_PATH environment variable in order.
function load(fileName: string)

// Read entire file as an ArrayBuffer
function k8_read_file(fileName: string): ArrayBuffer

// Decode $buf to string under the $enc encoding; only "utf-8" is supported for now
// Missing or unknown encoding is treated as Latin-1
function k8_decode(buf: ArrayBuffer|Bytes, enc?: string): string

// Encode $str into an ArrayBuffer
function k8_encode(str: string, enc?: string): ArrayBuffer

// Reverse complement a DNA sequence in string
function k8_revcomp(seq: string): string

// Reverse complement a DNA sequence in place
function k8_revcomp(seq: ArrayBuffer|Bytes)

// Get version string
function k8_version(): string
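
As a rough illustration of how these functions combine, the sketch below reads a small FASTA file in one go, decodes it as UTF-8 and prints the reverse complement of each sequence. The helper parseFasta and the script name rc.js are invented for this example and are not part of k8. The typeof guard only lets the helper be loaded outside k8 for testing.

```javascript
// Split FASTA text into [{name, seq}] records (plain JavaScript, no k8 calls).
function parseFasta(text) {
	const records = [];
	for (const line of text.split("\n")) {
		const s = line.trim();
		if (s.length === 0) continue;
		if (s[0] === ">") records.push({ name: s.slice(1), seq: "" });
		else if (records.length > 0) records[records.length - 1].seq += s;
	}
	return records;
}

// Run the k8-specific part only when k8's globals are present.
if (typeof k8_version === "function") {
	if (arguments.length == 0) {
		warn("Usage: k8 rc.js <in.fa>");
		exit(1);
	}
	// Whole-file read: fine for small inputs only
	const text = k8_decode(k8_read_file(arguments[0]), "utf-8");
	for (const r of parseFasta(text)) {
		print(">" + r.name);
		print(k8_revcomp(r.seq)); // string in, new string out
	}
}
```

For large files, reading line by line with the File object avoids holding the entire input in memory.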

The Bytes Object

Bytes provides a resizable byte array.

// Create a byte array of $len bytes.
new Bytes(len?: number = 0)

// Property: get/set length of the array
.length: number

// Property: get/set the max capacity of the array
.capacity: number

// Property: get ArrayBuffer of the underlying data, not allocated from v8
.buffer: ArrayBuffer

// Deallocate the array. This is necessary as the memory is not managed by the v8 GC.
Bytes.prototype.destroy()

// Overwrite the byte array starting at $offset with $data, where $data can be a number,
// a string, an array or Bytes. The array grows if the written data extends past
// its current end. Return the number of modified bytes.
Bytes.prototype.set(data: number|string|Array|ArrayBuffer, offset?: number) :number

// Convert the byte array to string
Bytes.prototype.toString()
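
A minimal sketch of building up a Bytes object piece by piece, relying only on the documented behaviour that set() returns the number of bytes it modified. The helper appendAll is hypothetical, and the offset is always passed explicitly because the default is not stated above. The typeof guard only lets the helper be loaded outside k8 for testing.

```javascript
// Append each item to `b` by writing at a running offset.
// Relies on set() returning the number of bytes it wrote.
function appendAll(b, items) {
	let off = 0;
	for (const item of items) off += b.set(item, off);
	return off;
}

// Run the k8-specific part only when k8's globals are present.
if (typeof k8_version === "function") {
	const b = new Bytes();
	const n = appendAll(b, ["chr1", "\t", "100"]);
	warn(n, b.length, b.capacity); // bytes written, current length, allocated capacity
	print(b.toString());
	b.destroy(); // the memory is not managed by the v8 GC
}
```

Reusing one Bytes object across iterations and calling destroy() once at the end keeps allocation out of the inner loop.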

The File Object

File provides buffered file I/O.

// Open a plain or gzip'd file for reading or a plain file for writing. $file
// is a file descriptor if it is an integer or a file name if it is a string.
// Each File object can only be read or only be written, not both.
new File(file?: string|number = 0, mode?: string = "r")

// Read a byte and return it
File.prototype.read() :number

// Read the rest of the file into $buf at offset 0. Return the number of bytes read.
File.prototype.read(buf: Bytes) :number

// Read $len bytes into $buf at $offset.
// Return the number of bytes read on success; 0 on file end; <0 on errors
File.prototype.read(buf: Bytes, offset: number, len: number) :number

// Read a line or a token to $buf at $offset. $sep=0 for SPACE, 1 for TAB and 2
// for newline. If $sep is a string, only the first character is considered.
// Return the delimiter if non-negative, -1 upon EOF, or <-1 for errors
File.prototype.readline(buf: Bytes, sep?: number|string = 2, offset?: number = 0) :number

// Write data
File.prototype.write(data: string|ArrayBuffer) :number

// Close a file
File.prototype.close()
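
Putting reading and writing together, the sketch below copies the lines of a plain or gzip'd input to a plain output file, dropping empty lines and lines that start with "#". The helper filterLines and the script name filter.js are made up for illustration. The code assumes that readline() does not store the newline in the buffer, so it is added back on write. The typeof guard only lets the helper be loaded outside k8 for testing.

```javascript
// Copy lines accepted by keep() from `input` to `output`; return how many were written.
// `input`, `output` and `buf` only need readline(), write() and toString().
function filterLines(input, output, buf, keep) {
	let n = 0;
	while (input.readline(buf) >= 0) {
		const line = buf.toString();
		if (keep(line)) {
			output.write(line + "\n");
			++n;
		}
	}
	return n;
}

// Run the k8-specific part only when k8's globals are present.
if (typeof k8_version === "function") {
	if (arguments.length < 2) {
		warn("Usage: k8 filter.js <in.txt[.gz]> <out.txt>");
		exit(1);
	}
	const buf = new Bytes();
	const input = new File(arguments[0]);       // read mode by default
	const output = new File(arguments[1], "w"); // plain file only
	const n = filterLines(input, output, buf, (line) => line.length > 0 && line[0] !== "#");
	output.close();
	input.close();
	buf.destroy();
	warn("Wrote " + n + " lines");
}
```

Two File objects are used because each one can only be read or only be written.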