Note: Your files never leave your device. We don't upload, transfer, or store your data.
Apache Avro is a data serialization system developed within the Apache Software Foundation. It provides a compact, fast, binary data format with rich data structures, and relies on JSON-based schemas to define data types and protocols. Avro is a core component of the Apache big data ecosystem:
Apache Kafka: Avro is a widely used schema format with Confluent Schema Registry, which manages schemas for Kafka topics.
Apache Spark: Avro is a supported data source format for Spark SQL and Structured Streaming.
Apache Hadoop: Avro originated in the Hadoop project and is commonly used to serialize MapReduce records and to store data files in HDFS.
Apache Flink: Flink can serialize and deserialize Avro records in its streaming pipelines.
The official Apache Avro specification is available at avro.apache.org/docs.
An Avro schema is a JSON document that defines the structure of serialized data. Avro schemas support two categories of types: primitive types and complex types. The primitive types are:
| Avro Type | Description | Example Value |
|---|---|---|
| null | No value | null |
| boolean | True or false | true |
| int | 32-bit signed integer | 42 |
| long | 64-bit signed integer | 1700000000 |
| float | Single precision (32-bit) IEEE 754 | 3.14 |
| double | Double precision (64-bit) IEEE 754 | 3.141592653589793 |
| bytes | Sequence of 8-bit unsigned bytes | "\u00ff" |
| string | Unicode character sequence | "hello" |
The complex types are record, enum, array, map, union, and fixed, and they can be nested inside one another. The most common is the record type:
{
  "type": "record",
  "name": "User",
  "namespace": "com.example",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "name", "type": "string"},
    {"name": "active", "type": "boolean"},
    {"name": "score", "type": ["null", "double"]}
  ]
}
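To make the record schema above concrete, here is a minimal Python sketch that checks whether a dictionary fits it. The `matches` and `conforms` helpers are hypothetical and cover only primitive types, unions of primitives, and flat records; a real project would normally rely on an Avro library instead.

```python
import json

USER_SCHEMA = json.loads("""
{
  "type": "record",
  "name": "User",
  "namespace": "com.example",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "name", "type": "string"},
    {"name": "active", "type": "boolean"},
    {"name": "score", "type": ["null", "double"]}
  ]
}
""")


def matches(avro_type, value):
    """Return True if value fits a primitive type name or a union (list) of them."""
    if isinstance(avro_type, list):  # union: any branch may match
        return any(matches(branch, value) for branch in avro_type)
    if avro_type == "null":
        return value is None
    if avro_type == "boolean":
        return isinstance(value, bool)
    if avro_type in ("int", "long"):
        # bool is a subclass of int in Python, so exclude it explicitly
        if not isinstance(value, int) or isinstance(value, bool):
            return False
        bits = 32 if avro_type == "int" else 64
        return -(2 ** (bits - 1)) <= value < 2 ** (bits - 1)
    if avro_type in ("float", "double"):
        return isinstance(value, float)
    if avro_type == "string":
        return isinstance(value, str)
    if avro_type == "bytes":
        return isinstance(value, bytes)
    return False  # complex types are out of scope for this sketch


def conforms(schema, record):
    """Check that record has exactly the schema's fields, each with a matching type."""
    names = [field["name"] for field in schema["fields"]]
    if sorted(record) != sorted(names):
        return False
    return all(matches(f["type"], record[f["name"]]) for f in schema["fields"])
```

For example, `conforms(USER_SCHEMA, {"id": 1, "name": "Ada", "active": True, "score": None})` is accepted because the `score` union allows null, while a record whose `id` is the string `"1"` is rejected.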
Excel to Avro Schema is a free online tool that converts spreadsheet column definitions from Excel files (.xlsx, .xls, .xlsm) into Apache Avro record schemas. Each column in your spreadsheet becomes a field in the generated Avro schema. The tool processes everything in your browser — your files are never uploaded to any server.
Instant Schema Generation — Convert Excel column headers to Avro record schema fields with one click.
Multi-Sheet Support — Select which worksheet to convert from multi-sheet workbooks.
Built-in Table Editor — Edit cells, transpose, deduplicate, and transform data before conversion.
First Row as Header — Toggle to use the first row as field names or generic column letters.
Output Includes Data — Generates both the schema and the data rows in JSON format.
Copy and Download — Copy the Avro output to clipboard or download as a .avro.json file.
100% Client-Side Processing — No data leaves your device.
Drag and drop an Excel file (.xlsx, .xls, or .xlsm) onto the upload area, or click to browse. Alternatively, click Enter Data to type column definitions manually.
After loading, your data appears in an editable table. Use the toolbar to modify cells, add or remove rows and columns, transpose data, remove duplicates and empty rows, change text case, or find and replace values. Toggle First Row as Header to define column names.
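Avro names must start with a letter or underscore and contain only letters, digits, and underscores, so a header such as "Unit Price ($)" may need cleanup before it can serve as a field name. How the tool itself treats such headers is not stated here; the following Python sketch shows one possible cleanup you could apply to headers beforehand. The helper names and the `column_N` placeholder are assumptions made for illustration.

```python
import re


def to_avro_name(header, index):
    """Turn a spreadsheet header into a valid Avro name (hypothetical helper)."""
    # Replace each run of disallowed characters with a single underscore.
    name = re.sub(r"[^A-Za-z0-9_]+", "_", str(header).strip())
    if not name:
        name = f"column_{index + 1}"  # placeholder for an empty header cell
    if name[0].isdigit():
        name = "_" + name  # names may not start with a digit
    return name


def unique_names(headers):
    """Sanitize all headers and add a numeric suffix to repeated names."""
    seen = {}
    result = []
    for index, header in enumerate(headers):
        base = to_avro_name(header, index)
        count = seen.get(base, 0)
        seen[base] = count + 1
        result.append(base if count == 0 else f"{base}_{count + 1}")
    return result
```

The suffixing is deliberately simple and does not guard against a header that already equals a generated name such as `id_2`.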
Click Convert in the Properties panel. The tool generates a JSON object containing:
"schema" — An Avro record schema with "type": "record", "name": "Record", and a "fields" array.
"data" — An array of objects representing each row as key-value pairs.
Given this Excel table with column headers:
| id | name | price | active |
|---|---|---|---|
| 1 | Widget A | 29.99 | true |
| 2 | Widget B | 49.99 | false |
The tool generates:
{
  "schema": {
    "type": "record",
    "name": "Record",
    "fields": [
      { "name": "id", "type": "string" },
      { "name": "name", "type": "string" },
      { "name": "price", "type": "string" },
      { "name": "active", "type": "string" }
    ]
  },
  "data": [
    { "id": "1", "name": "Widget A", "price": "29.99", "active": "true" },
    { "id": "2", "name": "Widget B", "price": "49.99", "active": "false" }
  ]
}
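The shape of this output can be reproduced with a few lines of Python. This is only an illustrative sketch of the mapping from headers and rows to the `schema` and `data` keys, not the tool's actual implementation; `table_to_avro` is a hypothetical name and cell values are assumed to arrive as text.

```python
import json


def table_to_avro(headers, rows, record_name="Record"):
    """Build a {"schema", "data"} object with every field typed as string."""
    fields = [{"name": header, "type": "string"} for header in headers]
    data = [
        {header: str(cell) for header, cell in zip(headers, row)}
        for row in rows
    ]
    schema = {"type": "record", "name": record_name, "fields": fields}
    return {"schema": schema, "data": data}


headers = ["id", "name", "price", "active"]
rows = [
    ["1", "Widget A", "29.99", "true"],
    ["2", "Widget B", "49.99", "false"],
]
print(json.dumps(table_to_avro(headers, rows), indent=2))
```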
Note: The current version maps all fields to "string" type. For production Avro schemas used with Kafka Schema Registry or Spark, manually update field types to the appropriate Avro primitive types (int, long, double, boolean, null) or use union types like ["null", "string"] for nullable fields.
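The manual retyping described in this note can be partly automated. The sketch below is one possible post-processing step in Python, assuming the tool's output has been loaded into a dictionary with `schema` and `data` keys. The inference rules (only `true`/`false`, plain integers, and simple decimals are recognized, and empty cells make a field nullable) are assumptions chosen for illustration, so review the result before using it.

```python
import re

INT_RE = re.compile(r"^-?[0-9]+$")
DECIMAL_RE = re.compile(r"^-?[0-9]+\.[0-9]+$")

CASTS = {
    "boolean": lambda text: text == "true",
    "long": int,
    "double": float,
    "string": str,
}


def infer_type(values):
    """Pick an Avro type for a column of strings; empty cells make it nullable."""
    present = [v for v in values if v != ""]
    if not present:
        return "string"
    if all(v in ("true", "false") for v in present):
        base = "boolean"
    elif all(INT_RE.match(v) for v in present):
        base = "long"  # note: the 64-bit range is not checked in this sketch
    elif all(INT_RE.match(v) or DECIMAL_RE.match(v) for v in present):
        base = "double"
    else:
        base = "string"
    return ["null", base] if len(present) < len(values) else base


def retype(output):
    """Return a copy of the tool output with inferred types and converted values."""
    rows = output["data"]
    types = {}
    for field in output["schema"]["fields"]:
        name = field["name"]
        types[name] = infer_type([row.get(name, "") for row in rows])
    data = []
    for row in rows:
        converted = {}
        for name, avro_type in types.items():
            raw = row.get(name, "")
            if isinstance(avro_type, list):  # nullable union: ["null", base]
                converted[name] = None if raw == "" else CASTS[avro_type[1]](raw)
            else:
                converted[name] = CASTS[avro_type](raw)
        data.append(converted)
    fields = [{"name": name, "type": t} for name, t in types.items()]
    return {"schema": dict(output["schema"], fields=fields), "data": data}
```

Applied to the example output, this would type `id` as `long`, `price` as `double`, and `active` as `boolean`, leaving `name` as `string`.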
Click Copy to Clipboard to copy the output into your project. For file download, use the Download button (requires Premium Plan). The file is saved with a .avro.json extension.
| Feature | Avro | Protobuf | Parquet | JSON |
|---|---|---|---|---|
| Schema Language | JSON | Proto IDL (.proto files) | Embedded (Parquet metadata) | None built in (JSON Schema optional) |
| Data Format | Binary | Binary | Columnar binary | Text |
| Schema Evolution | Full (add/remove/default) | Partial | Limited | None |
| Confluent Schema Registry Support | Yes | Yes | No | Yes (as JSON Schema) |
| Spark Support | Yes (data source) | Yes (from_protobuf/to_protobuf functions, Spark 3.4+) | Yes (native data source) | Yes (data source) |
| Compression | Snappy, Deflate, Zstandard | None built in (compact varint encoding) | Snappy, Gzip, Zstandard | Gzip (external) |
Kafka Schema Registry — Define Avro schemas for Kafka topics, register them in Confluent Schema Registry, and produce/consume typed messages.
Spark Data Pipelines — Create Avro schemas for Spark SQL data sources used in ETL jobs and Structured Streaming.
Hadoop HDFS Storage — Generate Avro schemas for HDFS files in MapReduce, Hive, and Impala tables.
Data Contract Definition — Define data contracts between microservices using Avro as the schema language.
Database Migration to Big Data — Export database table definitions from Excel and generate Avro schemas for the target data lake.
Event Schema Design — Design event schemas for streaming platforms (Kafka, Kinesis, Event Hubs) in a spreadsheet and convert to Avro.
Documentation — Keep data model documentation in Excel and regenerate Avro schemas when the model evolves.
No. All file parsing and Avro schema generation happens entirely in your browser using client-side JavaScript. Your files are never uploaded, transferred, or stored on any server. The tool works offline once the page has loaded.
The tool supports .xlsx (Excel 2007+), .xls (Excel 97-2003), and .xlsm (macro-enabled) files. Multi-sheet workbooks are supported — use the sheet selector to choose which sheet to convert.
The current version maps all columns to Avro "string" type. For production use with Kafka Schema Registry or Spark, manually update field types to the appropriate Avro primitives (int, long, double, boolean) or use union types like ["null", "string"] for nullable fields. Future versions will include automatic type inference.
JSON is a text-based data format without a built-in schema. Avro uses JSON for schema definition but serializes data in a compact binary format. Avro schemas enable type safety, schema evolution, and efficient compression — making it suitable for high-throughput systems like Kafka and Hadoop. JSON is better for human-readable APIs and configuration files.
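To illustrate the size difference, the sketch below hand-encodes one record in Avro's binary encoding and compares it with the same record as JSON text. It follows the encoding rules in the Avro specification (zig-zag variable-length integers, length-prefixed UTF-8 strings, one byte per boolean, a branch index before union values, little-endian doubles), but it is hard-wired to the four-field `User` schema shown earlier on this page and is not a general encoder. The function names are hypothetical.

```python
import json
import struct


def encode_long(n):
    """Encode a signed integer as Avro does for int and long (zig-zag varint)."""
    z = (n << 1) ^ (n >> 63)  # zig-zag: small magnitudes become small unsigned values
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, with the continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(text):
    """Encode a string as its UTF-8 byte length (a long) followed by the bytes."""
    raw = text.encode("utf-8")
    return encode_long(len(raw)) + raw


def encode_user(user):
    """Encode the fields id, name, active, score in schema order."""
    parts = [
        encode_long(user["id"]),
        encode_string(user["name"]),
        b"\x01" if user["active"] else b"\x00",
    ]
    if user["score"] is None:
        parts.append(encode_long(0))  # union branch 0 is "null"; no payload follows
    else:
        parts.append(encode_long(1) + struct.pack("<d", user["score"]))
    return b"".join(parts)


user = {"id": 1, "name": "Ada", "active": True, "score": None}
binary = encode_user(user)
text = json.dumps(user).encode("utf-8")
print(len(binary), "bytes as Avro binary,", len(text), "bytes as JSON text")
```

The binary form carries no field names at all, which is why a reader needs the schema to decode it.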
Copy the "schema" object from the output and register it with Confluent Schema Registry using the REST API or CLI. Example: curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" --data '{"schema": "{\"type\":\"record\",...}"}' http://localhost:8081/subjects/my-topic-value/versions. Ensure field types are updated from "string" to proper Avro types before registration.
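The awkward part of that request is that the schema must be sent as a JSON string nested inside a JSON body, which is why the curl example escapes every quote. A small Python sketch can build the same request without hand-escaping. It uses only the standard library and the URL, subject, and content type from the curl example above; `build_register_request` is a hypothetical helper name, and the sketch only constructs the request rather than sending it.

```python
import json
import urllib.parse
import urllib.request


def build_register_request(schema, subject, base_url="http://localhost:8081"):
    """Build a POST request that registers an Avro schema under a subject."""
    # The registry expects the schema itself as a string, so it is serialized twice.
    body = json.dumps({"schema": json.dumps(schema)}).encode("utf-8")
    url = f"{base_url}/subjects/{urllib.parse.quote(subject, safe='')}/versions"
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        method="POST",
    )


schema = {
    "type": "record",
    "name": "Record",
    "fields": [{"name": "id", "type": "long"}],
}
request = build_register_request(schema, "my-topic-value")
print(request.full_url)
```

Sending it would be a call to `urllib.request.urlopen(request)`, which requires a Schema Registry instance to be reachable at the given address.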
A union type in Avro is represented as a JSON array of types, e.g., ["null", "string"]. It allows a field to hold values of different types. The most common use is nullable fields — ["null", "string"] means the field can be null or a string. Union types are essential for schema evolution and handling optional data in Kafka and Spark.
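As a sketch of how a field from the generated schema could be made nullable, the hypothetical helper below wraps a field's type in a union and adds a `null` default. It places `"null"` first because, per the Avro specification, a union field's default is matched against the first branch, so a `null` default conventionally goes with a union that lists `"null"` first.

```python
def make_nullable(field):
    """Return a copy of a field whose type is a union starting with "null"."""
    avro_type = field["type"]
    branches = avro_type if isinstance(avro_type, list) else [avro_type]
    # Put "null" first and avoid listing it twice if it was already present.
    branches = ["null"] + [branch for branch in branches if branch != "null"]
    return {**field, "type": branches, "default": None}


price = {"name": "price", "type": "string"}
print(make_nullable(price))
```

When serialized with `json.dumps`, the Python `None` default becomes JSON `null`.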
Yes. After uploading your file, the built-in table editor lets you modify cell values, add or remove rows and columns, transpose data, remove duplicates and empty rows, change text case, and find-and-replace — all before generating the Avro schema.
In Avro, every record schema has a "name" (the record type name, e.g., "User") and an optional "namespace" (a dot-separated qualifier like "com.example.models"). The namespace prevents naming conflicts between schemas. The current tool defaults to "name": "Record" with no namespace. You can modify the output JSON to set your own values.
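Editing the name and namespace by hand works, but the change is also easy to script. The following sketch assumes the tool's `schema` object has been loaded into a Python dictionary; `with_name` and `full_name` are hypothetical helpers, and `full_name` follows the Avro rule that a name containing a dot is already fully qualified.

```python
def with_name(schema, name, namespace=None):
    """Return a copy of a record schema with a new name and optional namespace."""
    named = dict(schema, name=name)
    if namespace:
        named["namespace"] = namespace
    else:
        named.pop("namespace", None)
    return named


def full_name(schema):
    """Compute the fully qualified name of a named schema."""
    name = schema["name"]
    if "." in name:
        return name  # a dotted name is already a full name; namespace is ignored
    namespace = schema.get("namespace")
    return f"{namespace}.{name}" if namespace else name


generated = {"type": "record", "name": "Record", "fields": []}
user = with_name(generated, "User", "com.example.models")
print(full_name(user))
```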
Schema evolution allows you to modify an Avro schema over time while maintaining compatibility with data written using older schemas. Avro supports adding fields with defaults, removing fields, and renaming fields via aliases. Confluent Schema Registry enforces compatibility rules (BACKWARD, FORWARD, FULL) to ensure schema changes don't break consumers.
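A rough feel for the BACKWARD rule (a new schema must be able to read data written with the old one) can be had from the sketch below. It is a deliberately simplified check for flat record schemas, not the algorithm Schema Registry uses: it only verifies that fields added in the new schema carry a default, honors aliases, and flags any type change, even though real Avro resolution permits some promotions such as int to long. The function name is hypothetical.

```python
def backward_problems(old_schema, new_schema):
    """List reasons the new schema might fail to read data written with the old one."""
    old_fields = {field["name"]: field for field in old_schema["fields"]}
    problems = []
    for field in new_schema["fields"]:
        # A renamed field can still be matched through its aliases.
        candidates = [field["name"], *field.get("aliases", [])]
        old = next((old_fields[n] for n in candidates if n in old_fields), None)
        if old is None:
            if "default" not in field:
                problems.append(f"new field '{field['name']}' has no default")
        elif old["type"] != field["type"]:
            # Conservative: real Avro allows certain type promotions.
            problems.append(f"field '{field['name']}' changed type")
    return problems  # fields dropped from the new schema are simply ignored


old = {"type": "record", "name": "Record", "fields": [
    {"name": "id", "type": "string"},
    {"name": "name", "type": "string"},
]}
new = {"type": "record", "name": "Record", "fields": [
    {"name": "id", "type": "string"},
    {"name": "name", "type": "string"},
    {"name": "email", "type": ["null", "string"], "default": None},
]}
print(backward_problems(old, new))
```

An empty list means this limited check found nothing; it is not a guarantee of compatibility.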
Yes. Click the Enter Data button to open a blank editor where you can type or paste column definitions manually, then generate the Avro schema.