Note: Your files never leave your device. We don't upload, transfer, or store your data.
The Markdown To Avro Schema Converter on A.Tools transforms Markdown pipe-delimited tables into Apache Avro schema definitions in JSON format. Five settings let you configure the schema name, namespace, field type detection, indentation, and output minification. All processing runs in your browser. No data leaves your device.
Apache Avro is a data serialization framework widely used in Apache Kafka, Hadoop, Spark, and big data pipelines. Avro schemas define the structure of your data — field names, types, and optional defaults — enabling schema evolution, strong typing, and efficient binary serialization.
Click Enter Data to paste a Markdown table into the input area, or click Choose File (or drag and drop) to load a .md file. Press Sample to load example data.
Once parsed, an interactive spreadsheet appears. Use the toolbar to:
Add or delete rows and columns
Transpose the table (swap rows and columns)
Remove duplicate rows
Delete empty rows and columns
Change text case (UPPERCASE, lowercase, Capitalize)
Find and replace values — supports case-sensitive search and regex
Toggle First Row as Header to define field names
Right-click any cell for context-menu operations.
In the Properties panel:
| Setting | Default | Description |
|---|---|---|
| Schema Name | (empty) | Name of the Avro record type |
| Namespace | (empty) | Java-style package namespace for the schema |
| Field Type Detection | On | Auto-detect field types from data values |
| Indent Size | 2 | Number of spaces per indentation level |
| Minify Output | Off | Remove whitespace for compact JSON |
Click Convert to generate the Avro schema JSON. Use Copy to Clipboard or Download File to save as an .avsc file.
Two input modes: Paste Markdown directly or upload a .md file via drag-and-drop
Full table editor: Edit, transpose, deduplicate, find-and-replace before converting
Auto type detection: Infer Avro types (string, int, long, float, double, boolean) from data values
Schema name and namespace: Configure the record name and Java-style namespace
Indent control: Set indentation to 2, 4, or 8 spaces — or use tab characters
Minified output: Strip whitespace for compact schema JSON
Client-side processing: Files never leave the browser — zero data upload
Undo/Redo: Full edit history with revert support
Context menu: Right-click for quick row/column/cell operations
Header toggle: Treat the first row as field names or regular data
Validation indicator: Real-time feedback on input validity
Given this Markdown input:
| name | age | score | active |
|-------|-----|-------|--------|
| Alice | 30 | 95.5 | true |
| Bob | 25 | 87.3 | false |
With Field Type Detection on, the tool produces this schema:
{
  "type": "record",
  "name": "UserRecord",
  "namespace": "com.example",
  "fields": [
    {
      "name": "name",
      "type": "string"
    },
    {
      "name": "age",
      "type": "int"
    },
    {
      "name": "score",
      "type": "double"
    },
    {
      "name": "active",
      "type": "boolean"
    }
  ]
}
With Field Type Detection off, every field is typed as string:
{
  "type": "record",
  "name": "UserRecord",
  "namespace": "com.example",
  "fields": [
    {
      "name": "name",
      "type": "string"
    },
    {
      "name": "age",
      "type": "string"
    },
    {
      "name": "score",
      "type": "string"
    },
    {
      "name": "active",
      "type": "string"
    }
  ]
}
With Minify Output on, the schema is written on a single line:
{"type":"record","name":"UserRecord","namespace":"com.example","fields":[{"name":"name","type":"string"},{"name":"age","type":"int"},{"name":"score","type":"double"},{"name":"active","type":"boolean"}]}
When a column contains both values and empty cells, the tool may generate nullable union types:
{
  "name": "middle_name",
  "type": ["null", "string"]
}
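The nullable-union rule can be sketched in a few lines of Python. This is only an illustration, not the tool's actual code: the function name `nullable_field` and the `has_empty_cells` flag are hypothetical, and the added `"default": null` is an assumption that goes beyond the output shown above.

```python
def nullable_field(name, avro_type, has_empty_cells):
    """Build an Avro field dict, using a null union when cells can be empty."""
    if not has_empty_cells:
        return {"name": name, "type": avro_type}
    # "null" is listed first because Avro has traditionally required a
    # union field's default to match the first branch of the union.
    # The default itself is an assumption, not part of the tool's output.
    return {"name": name, "type": ["null", avro_type], "default": None}


field = nullable_field("middle_name", "string", has_empty_cells=True)
```

Adding a null default is a common choice because it lets readers that use the new schema fill in the field when older records lack it.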
Apache Avro is a data serialization system that provides rich data structures, a compact binary format, and schema evolution. Kafka itself is format-agnostic, but Avro is one of the most common serialization formats used with it, typically alongside a schema registry. It is also widely used in Hadoop, Spark, and Flink ecosystems.
An Avro schema is a JSON document that defines a data type. For records:
{
  "type": "record",
  "name": "SchemaName",
  "namespace": "com.example.namespace",
  "doc": "Description of the record",
  "fields": [
    {
      "name": "field1",
      "type": "string"
    },
    {
      "name": "field2",
      "type": "int"
    }
  ]
}
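To make the record structure concrete, here is a minimal Python sketch that assembles such a document from a name, a namespace, and a list of (field name, type) pairs. The helper `build_record_schema` is hypothetical and only mirrors the layout shown above.

```python
def build_record_schema(name, namespace, fields, doc=None):
    """Assemble an Avro record schema as a plain dict."""
    schema = {"type": "record", "name": name}
    # Namespace and doc are optional attributes, so omit them when empty.
    if namespace:
        schema["namespace"] = namespace
    if doc:
        schema["doc"] = doc
    schema["fields"] = [{"name": n, "type": t} for n, t in fields]
    return schema


schema = build_record_schema(
    "UserRecord",
    "com.example",
    [("name", "string"), ("age", "int"), ("score", "double"), ("active", "boolean")],
)
```

Because Python dicts preserve insertion order, serializing this dict yields the attributes in the same order as the example above.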
| Type | Description | JSON Example |
|---|---|---|
| null | No value | null |
| boolean | True or false | true |
| int | 32-bit signed integer | 42 |
| long | 64-bit signed integer | 9223372036854775807 |
| float | Single precision (32-bit) IEEE 754 | 3.14 |
| double | Double precision (64-bit) IEEE 754 | 3.141592653589793 |
| bytes | Sequence of 8-bit unsigned bytes | "\u00ff" |
| string | Unicode character sequence | "hello" |
| Type | Description | Schema Example |
|---|---|---|
| record | Structured object with named fields | {"type":"record","name":"...","fields":[...]} |
| enum | Named set of values | {"type":"enum","name":"...","symbols":["A","B"]} |
| array | Ordered collection | {"type":"array","items":"string"} |
| map | Key-value pairs | {"type":"map","values":"string"} |
| union | One of several types | ["null","string"] |
| fixed | Fixed-size binary | {"type":"fixed","name":"...","size":16} |
| Feature | Avro | Protobuf | JSON Schema |
|---|---|---|---|
| Schema format | JSON | .proto IDL | JSON |
| Serialization | Binary | Binary | Text (JSON) |
| Schema evolution | Full (add/remove fields) | Limited | Full |
| Type safety | Strong | Strong | Validation only |
| Kafka support | Schema Registry serializer (most widely used) | Schema Registry serializer | Schema Registry serializer |
| Code generation | Yes | Yes | Limited |
| Namespace support | Java-style packages | Java-style packages | URI-based |
| Used by | Kafka, Hadoop, Spark | gRPC, Google services | REST APIs, validation |
| File extension | .avsc | .proto | .json |
Avro supports forward and backward compatibility through schema evolution rules:
Add a field — Backward compatible if a default value is provided
Remove a field — Forward compatible if a default value was provided
Change a type — Generally not compatible (with narrow exceptions)
Rename a field — Compatible using aliases
This makes Avro ideal for Kafka topics where producers and consumers evolve independently.
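The first rule in the list above (added fields need defaults) is easy to check mechanically. The sketch below is a simplified illustration under stated assumptions: the function name is hypothetical, it only looks at top-level record fields, and it ignores aliases and type changes, which a real compatibility checker such as a schema registry also evaluates.

```python
def added_fields_without_default(old_schema, new_schema):
    """Return names of fields added in new_schema that have no default."""
    old_names = {field["name"] for field in old_schema["fields"]}
    return [
        field["name"]
        for field in new_schema["fields"]
        # A default of null counts: only the presence of the key matters.
        if field["name"] not in old_names and "default" not in field
    ]


old = {"type": "record", "name": "UserRecord",
       "fields": [{"name": "name", "type": "string"}]}
new_ok = {"type": "record", "name": "UserRecord",
          "fields": [{"name": "name", "type": "string"},
                     {"name": "email", "type": ["null", "string"], "default": None}]}
new_bad = {"type": "record", "name": "UserRecord",
           "fields": [{"name": "name", "type": "string"},
                      {"name": "email", "type": "string"}]}
```

An empty result means a reader using the new schema can still decode records written with the old one, at least as far as added fields are concerned.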
| Platform | Usage |
|---|---|
| Apache Kafka | Common message serialization format, typically with Confluent Schema Registry |
| Confluent Platform | Schema Registry stores and enforces schemas |
| AWS Glue | Schema Registry for Kinesis and MSK |
| Apache Hadoop | File format for MapReduce and Hive |
| Apache Spark | Data source format for batch and streaming |
| Apache Flink | Stream processing serialization |
| Azure Event Hubs | Schema Registry for event schemas |
| Apache NiFi | Data flow schema validation |
The name of the Avro record type. This becomes the "name" field in the schema JSON and is used for code generation and class naming.
{
  "type": "record",
  "name": "MySchemaName",
  ...
}
Naming rules:
Must start with a letter or underscore
Can contain letters, digits, and underscores
Use PascalCase for readability: UserRecord, OrderEvent, SensorReading
Avoid spaces and special characters
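The naming rules above correspond to the pattern `[A-Za-z_][A-Za-z0-9_]*` from the Avro specification. A minimal Python check might look like this; the function name `is_valid_avro_name` is hypothetical and not part of the tool.

```python
import re

# First character: letter or underscore. Rest: letters, digits, underscores.
AVRO_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_avro_name(name):
    """Return True if name is a legal Avro record or field name."""
    return AVRO_NAME.fullmatch(name) is not None
```

Running the check on column headers before conversion can catch headers such as "user name" or "2024_total", which would need renaming to be valid field names.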
A Java-style package namespace that qualifies the schema name. Combined with Schema Name, it creates a fully qualified name: com.example.UserRecord.
{
  "type": "record",
  "name": "UserRecord",
  "namespace": "com.company.project.models",
  ...
}
Common namespace patterns:
| Pattern | Example |
|---|---|
| Company domain | com.example |
| Project-specific | com.example.orderservice |
| Domain model | com.example.models.events |
| Kafka topic | com.example.kafka.schemas |
Controls how field types are determined:
| Mode | Behavior |
|---|---|
| Auto-detect (on) | Analyzes data values to infer types: "42" → int, "3.14" → double, "true" → boolean |
| All strings (off) | All fields are typed as string regardless of data values |
When to use auto-detect:
Your data contains a mix of types (strings, numbers, booleans)
You want the schema to accurately reflect the data
The schema will be used for Kafka serialization with type enforcement
When to use all strings:
You want to defer type decisions to the consumer
Your data is inconsistent (mixed types in the same column)
You are prototyping and prefer simplicity
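As a rough illustration of auto-detection, the Python sketch below infers one type per column from its cell strings. The tool's exact rules are not documented here, so everything in it is an assumption: the function name, the decision to skip empty cells, the fallback to string for mixed columns, and the use of long for integers outside the 32-bit range.

```python
import re

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def infer_avro_type(values):
    """Infer a single Avro primitive type for a column of cell strings."""
    cells = [v.strip() for v in values if v.strip()]  # ignore empty cells
    if not cells:
        return "string"
    if all(c.lower() in ("true", "false") for c in cells):
        return "boolean"
    if all(re.fullmatch(r"[+-]?\d+", c) for c in cells):
        # Assumed rule: widen to long when any value exceeds 32 bits.
        fits_int = all(INT32_MIN <= int(c) <= INT32_MAX for c in cells)
        return "int" if fits_int else "long"
    if all(re.fullmatch(r"[+-]?(\d+\.\d*|\.\d+|\d+)", c) for c in cells):
        return "double"
    # Any column with inconsistent or non-numeric values falls back to string.
    return "string"
```

Checking the whole column rather than only the first row is what makes the fallback safe: a single non-numeric cell turns the column into string instead of producing a schema that rejects the data.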
Number of spaces per indentation level in the output JSON. Options: 2, 4, 8, or tab.
| Size | Convention |
|---|---|
| 2 spaces | Standard for Avro schemas and most JSON projects |
| 4 spaces | Common in enterprise Java environments |
| 8 spaces | Legacy style |
| Tab | Personal preference |
Removes the newlines and indentation between JSON tokens, so the schema is written on a single line. Useful for:
Registering schemas via REST API (smaller payload)
Embedding schemas in configuration files
Storing schemas in metadata fields with size constraints
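For comparison, the same two output styles can be reproduced with Python's standard `json` module. This is a sketch of the idea, not the tool's implementation; the helper name `render_schema` and its parameters are hypothetical.

```python
import json


def render_schema(schema, indent=2, minify=False):
    """Serialize a schema dict either indented or on a single line."""
    if minify:
        # Compact separators drop the spaces after "," and ":".
        return json.dumps(schema, separators=(",", ":"))
    # indent may be an int (number of spaces) or a string such as "\t".
    return json.dumps(schema, indent=indent)


schema = {"type": "record", "name": "UserRecord",
          "fields": [{"name": "name", "type": "string"}]}
pretty = render_schema(schema, indent=2)
compact = render_schema(schema, minify=True)
```

Both strings parse back to the same schema, so the choice only affects readability and payload size.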
| Scenario | Why This Tool |
|---|---|
| Define Kafka topic schemas | Generate Avro schemas from data specification tables |
| Register schemas in Confluent Schema Registry | Create .avsc files ready for registration |
| Design data models for Hadoop/Hive | Convert data dictionaries to Avro record definitions |
| Prototype Kafka producers | Generate schemas from sample data tables |
| Create Spark data source schemas | Define Avro schemas for batch and streaming jobs |
| Document data contracts | Turn specification tables into machine-readable schemas |
| Set up AWS Glue Schema Registry | Generate schemas for Kinesis Data Streams or MSK |
| Generate code from schemas | Create .avsc schemas as input for Java, Python, or C# code generation |
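import java.io.File;
import java.util.Properties;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Place the statements below in a method that declares "throws IOException".
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("key.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
props.put("schema.registry.url", "http://localhost:8081");
Producer<String, GenericRecord> producer = new KafkaProducer<>(props);
// Load the schema file generated by the converter.
Schema schema = new Schema.Parser().parse(new File("user.avsc"));
GenericRecord user = new GenericData.Record(schema);
user.put("name", "Alice");
user.put("age", 30);
user.put("score", 95.5);
user.put("active", true);
ProducerRecord<String, GenericRecord> record = new ProducerRecord<>("users", "alice", user);
producer.send(record);
producer.close(); // close() flushes pending records before shutting down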
curl -X POST \
  -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data '{"schema": "{\"type\":\"record\",\"name\":\"UserRecord\",\"namespace\":\"com.example\",\"fields\":[{\"name\":\"name\",\"type\":\"string\"},{\"name\":\"age\",\"type\":\"int\"}]}"}' \
  http://localhost:8081/subjects/users-value/versions
from confluent_kafka import Producer
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroSerializer
from confluent_kafka.serialization import MessageField, SerializationContext

# Load the schema file generated by the converter.
with open("user.avsc") as f:
    schema_str = f.read()

schema_registry_conf = {"url": "http://localhost:8081"}
schema_registry_client = SchemaRegistryClient(schema_registry_conf)
avro_serializer = AvroSerializer(schema_registry_client, schema_str)
producer = Producer({"bootstrap.servers": "localhost:9092"})

def delivery_report(err, msg):
    # Invoked once per message when flush() or poll() serves callbacks.
    if err:
        print(f"Delivery failed: {err}")

user = {"name": "Alice", "age": 30, "score": 95.5, "active": True}
# The serializer derives the registry subject from the topic and field.
ctx = SerializationContext("users", MessageField.VALUE)
producer.produce(
    topic="users",
    value=avro_serializer(user, ctx),
    on_delivery=delivery_report,
)
producer.flush()
from pyspark.sql import SparkSession

# The "avro" format requires the external spark-avro module on the classpath.
spark = SparkSession.builder.getOrCreate()
df = spark.read \
    .format("avro") \
    .option("avroSchema", open("user.avsc").read()) \
    .load("hdfs://path/to/data.avro")
df.show()
No. All file parsing and conversion runs in your browser using JavaScript. Your data stays on your device. A.Tools never receives, stores, or transmits your file contents.
The tool supports standard pipe-delimited Markdown tables in the style of the GitHub Flavored Markdown table extension (core CommonMark does not define tables), including tables with or without leading/trailing pipes and alignment indicators.
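To show what accepting these variants involves, here is a minimal Python parser sketch. It is an illustration rather than the tool's parser: the function name is hypothetical, and it assumes cells contain no escaped pipes (`\|`), which a full implementation would need to handle.

```python
import re


def parse_markdown_table(text):
    """Return (header, rows) from a pipe-delimited Markdown table."""
    def split_row(line):
        line = line.strip()
        # Leading and trailing pipes are optional, so drop them if present.
        if line.startswith("|"):
            line = line[1:]
        if line.endswith("|"):
            line = line[:-1]
        return [cell.strip() for cell in line.split("|")]

    def is_delimiter(cells):
        # Delimiter cells are dashes with optional alignment colons.
        return all(re.fullmatch(r":?-+:?", cell) for cell in cells)

    table = [split_row(ln) for ln in text.splitlines() if ln.strip()]
    header = table[0]
    # Skip the delimiter row only where it is expected: right after the header.
    start = 2 if len(table) > 1 and is_delimiter(table[1]) else 1
    return header, table[start:]


header, rows = parse_markdown_table(
    "| name | age |\n|:-----|----:|\nAlice | 30\n| Bob | 25 |"
)
```

The header list supplies the field names, and each column of `rows` can then be passed to type detection.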
Apache Avro is a data serialization framework that uses JSON schemas to define data types and a compact binary format for serialization. It is one of the most common serialization formats for Apache Kafka, typically alongside a schema registry, and is widely used in Hadoop, Spark, and big data ecosystems.
Use .avsc (Avro Schema) for schema definition files. Use .avro for binary data files that contain serialized records.
When enabled, the tool analyzes data values in each column to infer Avro types — integers become int, decimals become double, true/false becomes boolean, and everything else becomes string. When disabled, all fields are typed as string.
A namespace is a Java-style package identifier that qualifies the schema name. It prevents naming conflicts between schemas. For example, com.example.orders.OrderEvent and com.example.users.OrderEvent are different schemas despite having the same record name.
Use the Schema Registry REST API: send a POST request to http://localhost:8081/subjects/{topic}-value/versions with the schema JSON in the request body. The Minify Output option produces a compact format suitable for API payloads.
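The request body wraps the schema as a JSON string inside a JSON object, which is why the schema is serialized twice. The Python sketch below builds such a request with the standard library only; the helper name `build_registration_request` is hypothetical, and the registry URL is the localhost address used above.

```python
import json
import urllib.request


def build_registration_request(schema, topic, registry_url="http://localhost:8081"):
    """Build (but do not send) a POST request that registers a value schema."""
    # Inner dumps: the schema as a compact JSON string.
    # Outer dumps: the {"schema": "<string>"} envelope the registry expects.
    body = json.dumps({"schema": json.dumps(schema, separators=(",", ":"))})
    return urllib.request.Request(
        url=f"{registry_url}/subjects/{topic}-value/versions",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        method="POST",
    )


schema = {"type": "record", "name": "UserRecord", "namespace": "com.example",
          "fields": [{"name": "name", "type": "string"},
                     {"name": "age", "type": "int"}]}
request = build_registration_request(schema, "users")
```

Passing `request` to `urllib.request.urlopen` would send it; that step is left out here because it needs a running registry.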
Yes. After parsing, the full table editor lets you modify cells, add or remove rows and columns, transpose, deduplicate, change text case, and find-and-replace values.