Quickstart

Matthieu Monsch edited this page Dec 18, 2016 · 44 revisions

What is a Type?

Each Avro type maps to a corresponding JavaScript Type:

  • int maps to IntType.
  • arrays map to ArrayTypes.
  • records map to RecordTypes.
  • etc.

An instance of a Type knows how to encode and decode its corresponding objects. For example, the StringType knows how to handle JavaScript strings:

const stringType = new avro.types.StringType();
const buf = stringType.toBuffer('Hi'); // Buffer containing 'Hi''s Avro encoding.
const str = stringType.fromBuffer(buf); // === 'Hi'

The toBuffer and fromBuffer methods above are convenience functions which encode and decode a single object into/from a standalone buffer.
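To make "Avro encoding" concrete, here is a minimal hand-rolled sketch of how a string is laid out in Avro's binary format: a zigzag varint holding the UTF-8 byte length, followed by the bytes themselves. The helpers `encodeLong`, `encodeAvroString` and `decodeAvroString` are hypothetical names used for illustration only. They are not part of avsc, and they only handle lengths that fit comfortably in 32 bits.

```javascript
// Hypothetical helpers, not part of avsc: a simplified version of the
// Avro binary encoding for strings (32-bit lengths only).

// Zigzag-encode an integer, then emit it as a variable-length byte sequence.
function encodeLong(n) {
  // Zigzag maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ...
  let z = ((n << 1) ^ (n >> 31)) >>> 0;
  const bytes = [];
  while (z > 0x7f) {
    bytes.push((z & 0x7f) | 0x80); // Low 7 bits, with the continuation bit set.
    z >>>= 7;
  }
  bytes.push(z);
  return Buffer.from(bytes);
}

// A string is its UTF-8 byte length (as a long) followed by the UTF-8 bytes.
function encodeAvroString(str) {
  const utf8 = Buffer.from(str, 'utf8');
  return Buffer.concat([encodeLong(utf8.length), utf8]);
}

// Reverse of the above: read the varint, undo the zigzag, slice out the bytes.
function decodeAvroString(buf) {
  let z = 0;
  let shift = 0;
  let pos = 0;
  let b;
  do {
    b = buf[pos++];
    z |= (b & 0x7f) << shift;
    shift += 7;
  } while (b & 0x80);
  const len = (z >>> 1) ^ -(z & 1);
  return buf.toString('utf8', pos, pos + len);
}

const hiBytes = encodeAvroString('Hi'); // <Buffer 04 48 69>
const hiAgain = decodeAvroString(hiBytes); // 'Hi'
```

The first byte, 0x04, is the zigzag encoding of the length 2. In practice there is no need to write any of this by hand, since toBuffer and fromBuffer take care of it.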

Each type also provides other methods which can be useful. Here are a few (refer to the API documentation for the full list):

  • JSON-encoding:

    const jsonString = stringType.toString('Hi'); // === '"Hi"'
    const str = stringType.fromString(jsonString); // === 'Hi'
  • Validity checks:

    const b1 = stringType.isValid('hello'); // === true ('hello' is a valid string.)
    const b2 = stringType.isValid(-2); // === false (-2 is not.)
  • Random object generation:

    const s = stringType.random(); // A random string.

How do I get a Type?

It is possible to instantiate types directly by calling their constructors (available in the avro.types namespace; this is what we used earlier), but in the vast majority of use-cases they will be automatically generated by parsing an existing schema.

avsc exposes a static method, Type.forSchema, to do the heavy lifting and generate a type from its Avro schema definition:

// Equivalent to what we did earlier.
const stringType = avro.Type.forSchema({type: 'string'});

// A slightly more complex type.
const mapType = avro.Type.forSchema({type: 'map', values: 'long'});

// The sky is the limit!
const personType = avro.Type.forSchema({
  name: 'Person',
  type: 'record',
  fields: [
    {name: 'name', type: 'string'},
    {name: 'phone', type: ['null', 'string'], default: null},
    {name: 'address', type: {
      name: 'Address',
      type: 'record',
      fields: [
        {name: 'city', type: 'string'},
        {name: 'zip', type: 'int'}
      ]
    }}
  ]
});

Of course, all the type methods are available. For example:

personType.isValid({
  name: 'Ann',
  phone: null,
  address: {city: 'Cambridge', zip: 2139} // No leading zero: 02139 is a syntax error in strict mode.
}); // === true

personType.isValid({
  name: 'Bob',
  phone: {string: '617-000-1234'},
  address: {city: 'Boston'}
}); // === false (Missing the zip code.)
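When isValid returns false, it can help to know which part of the value is at fault. The sketch below assumes that isValid accepts an options object with an `errorHook` callback, invoked with the path to each invalid value, as described in avsc's API documentation; it is worth checking this against the version you have installed. `findInvalidPaths` is a hypothetical helper name, not part of avsc.

```javascript
// Hypothetical helper (not part of avsc): lists the paths of invalid values.
// Assumes `type.isValid` accepts an `errorHook` option that is called with
// the path (an array of field names) to each invalid value.
function findInvalidPaths(type, value) {
  const paths = [];
  type.isValid(value, {
    // Join right away, in case the path array is reused between calls.
    errorHook: (path) => { paths.push(path.join('.')); }
  });
  return paths;
}

// Possible usage with the type defined earlier:
// findInvalidPaths(personType, {name: 'Bob', phone: null, address: {city: 'Boston'}});
```

The helper takes the type as an argument, so it works with any type returned by Type.forSchema.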

For advanced use-cases, Type.forSchema also has a few options, which are detailed in the API documentation.

What about Avro files?

Avro files (meaning Avro object container files) hold serialized Avro records along with their schema. Reading them is as simple as calling createFileDecoder:

const personStream = avro.createFileDecoder('./persons.avro');

personStream is a readable stream of decoded records, which we can use, for example, as follows:

personStream.on('data', (person) => {
  if (person.address.city === 'San Francisco') {
    doSomethingWith(person);
  }
});
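Since the decoder is a regular readable stream, the usual 'end' and 'error' events are also available. As a rough sketch, the filtering above could be wrapped in a helper that gathers matching records and reports them once the stream finishes. `collectRecords` is a hypothetical name, not an avsc function, and holding every match in memory only suits files that fit there.

```javascript
// Hypothetical helper (not part of avsc): gathers the records accepted by
// `predicate` from a readable object stream, then passes them to a
// Node-style callback once the stream ends (or passes the error if it fails).
function collectRecords(stream, predicate, done) {
  const matches = [];
  stream
    .on('data', (record) => {
      if (predicate(record)) {
        matches.push(record);
      }
    })
    .on('error', (err) => done(err))
    .on('end', () => done(null, matches));
}

// Possible usage:
// collectRecords(
//   avro.createFileDecoder('./persons.avro'),
//   (person) => person.address.city === 'San Francisco',
//   (err, persons) => { /* Handle err, or use persons. */ }
// );
```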

In case we need the records' type or the file's codec, they are available by listening to the 'metadata' event:

personStream.on('metadata', (type, codec) => { /* Something useful. */ });

To access a file's header synchronously, there also exists an extractFileHeader method:

const header = avro.extractFileHeader('persons.avro');
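For context on what a header contains: per the Avro specification, an object container file starts with the four magic bytes 'O', 'b', 'j', 1, followed by the file metadata (including the schema and codec) and a sync marker. The following sketch, independent of avsc, checks only the magic bytes. `looksLikeAvroContainer` is a hypothetical helper name and expects a Node.js Buffer.

```javascript
// Hypothetical helper (not part of avsc). Per the Avro specification, a
// container file begins with the four bytes 'O', 'b', 'j', 1.
const AVRO_MAGIC = Buffer.from([0x4f, 0x62, 0x6a, 0x01]);

// Returns true if the buffer starts with the Avro container magic bytes.
// This is a quick sanity check only; it does not validate the rest of the file.
function looksLikeAvroContainer(bytes) {
  return bytes.length >= AVRO_MAGIC.length &&
    bytes.subarray(0, AVRO_MAGIC.length).equals(AVRO_MAGIC);
}

// Possible usage (reads the whole file, so best kept to small files):
// looksLikeAvroContainer(require('fs').readFileSync('persons.avro'));
```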

Writing to an Avro container file is possible using createFileEncoder:

const encoder = avro.createFileEncoder('./processed.avro', personType);
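Once created, the encoder is written to like any other writable stream. The sketch below assumes the encoder behaves as a standard Node.js writable stream in object mode; `writeRecords` is a hypothetical helper name, not part of avsc, and it ignores backpressure for brevity.

```javascript
// Hypothetical helper (not part of avsc): writes each record to the encoder,
// then ends it so that any buffered data gets flushed.
// Assumes `encoder` behaves like a Node.js writable stream in object mode.
function writeRecords(encoder, records) {
  for (const record of records) {
    // The return value of write() is ignored here; for large inputs, wait
    // for the 'drain' event whenever write() returns false.
    encoder.write(record);
  }
  encoder.end();
}

// Possible usage:
// writeRecords(encoder, [
//   {name: 'Ann', phone: null, address: {city: 'Cambridge', zip: 2139}}
// ]);
```

Each record should be valid for the type passed to createFileEncoder.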

Next steps

The API documentation provides a comprehensive list of available functions and their options. The Advanced usage section goes through a few examples to show how the API can be used, including remote procedure calls.
