CSV To Avro Schema

Convert CSV to Avro Schema online — paste, edit, and download Avro.


What Is the CSV to Avro Schema Converter?

Apache Avro is a row-oriented data serialization and remote procedure call framework developed within the Apache Hadoop project. It uses JSON-based schemas to define data structures and a compact binary encoding for efficient storage and transmission. Avro is a common serialization format for Apache Kafka, particularly with Confluent Schema Registry, and is widely used with Apache Spark, Apache Flink, and AWS services.

The CSV to Avro Schema Converter on A.Tools reads your CSV file, analyzes the data in each column, and generates a complete Avro schema in JSON format with automatically inferred types.

All processing runs locally in your browser. No data leaves your device.


Core Features

Automatic Type Inference

The tool examines the actual values in each CSV column and maps them to Avro primitive types:

| Data Pattern | Avro Type | Example Values |
| --- | --- | --- |
| Text, mixed characters | string | "Alice", "NYC" |
| Whole numbers (small) | int | 30, -5, 145 |
| Whole numbers (large) | long | 9223372036854775807 |
| Decimal numbers | float / double | 3.14, 99.99 |
| Boolean | boolean | true, false |
| Empty cells | ["null", "type"] (union) | (empty) |
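
The mapping in the table can be sketched in a few lines of Python. This is not the tool's actual implementation, only an illustration of the idea: the function name infer_avro_type, the regular expressions, the case-insensitive boolean match, and the fallback to string for an all-empty column are assumptions made for this example.

```python
import re

INT_MIN, INT_MAX = -2**31, 2**31 - 1      # Avro int is 32-bit signed
LONG_MIN, LONG_MAX = -2**63, 2**63 - 1    # Avro long is 64-bit signed

_INTEGER = re.compile(r"[+-]?\d+")
_DECIMAL = re.compile(r"[+-]?(\d+\.\d*|\.\d+|\d+)([eE][+-]?\d+)?")


def infer_avro_type(values):
    """Return the narrowest Avro primitive that fits every non-empty cell."""
    cells = [v.strip() for v in values if v.strip() != ""]
    if not cells:
        return "string"  # assumption: an all-empty column falls back to string
    if all(c.lower() in ("true", "false") for c in cells):
        return "boolean"
    if all(_INTEGER.fullmatch(c) for c in cells):
        numbers = [int(c) for c in cells]
        if all(INT_MIN <= n <= INT_MAX for n in numbers):
            return "int"
        if all(LONG_MIN <= n <= LONG_MAX for n in numbers):
            return "long"
        return "string"  # too large even for long
    if all(_DECIMAL.fullmatch(c) for c in cells):
        return "double"
    return "string"


for column in (["30", "-5", "145"], ["9223372036854775807"], ["3.14", "99.99"]):
    print(column, "->", infer_avro_type(column))
```

The sketch always picks double for decimals. Choosing float instead would need an extra precision check that is left out here.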

Nullable Columns (Union Types)

When a column contains empty cells, the tool generates an Avro union type to allow null values:

{
  "name": "age",
  "type": ["null", "int"]
}

This is essential for real-world data where some fields may be missing.
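
As a rough illustration of that rule, the following Python sketch wraps a column's type in a union only when at least one cell is empty. The helper name make_field is hypothetical, and the inferred type is passed in rather than computed.

```python
import json


def make_field(name, avro_type, values):
    """Build one Avro field; use a ["null", type] union if any cell is empty."""
    has_empty = any(v.strip() == "" for v in values)
    if has_empty:
        return {"name": name, "type": ["null", avro_type]}
    return {"name": name, "type": avro_type}


# A column with one missing value becomes nullable.
print(json.dumps(make_field("age", "int", ["30", "", "41"])))
# A fully populated column keeps its plain type.
print(json.dumps(make_field("id", "int", ["1", "2", "3"])))
```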

Avro Record Schema Structure

The output follows the standard Avro schema specification:

{
  "type": "record",
  "name": "MyRecord",
  "namespace": "com.example",
  "fields": [
    {"name": "id", "type": "int"},
    {"name": "name", "type": "string"},
    {"name": "price", "type": "double"}
  ]
}
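
To make the shape of that output concrete, here is a minimal Python sketch that turns CSV text into a record schema of the same form. It is an illustration under simplifying assumptions, not the converter's code: csv_to_schema and simple_type are hypothetical names, and the type check relies on Python's own int() and float() parsing, so it ignores the int/long split and accepts a few spellings (such as "1_000" or "nan") that a stricter tool would likely treat as strings.

```python
import csv
import io
import json


def simple_type(cells):
    """Small stand-in for type inference: boolean, int, double, or string."""
    def all_parse(parser):
        try:
            for c in cells:
                parser(c)
            return True
        except ValueError:
            return False

    if not cells:
        return "string"
    if all(c in ("true", "false") for c in cells):
        return "boolean"
    if all_parse(int):
        return "int"  # simplification: no range check, so never "long"
    if all_parse(float):
        return "double"
    return "string"


def csv_to_schema(text, name="MyRecord", namespace="com.example"):
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]  # first row is treated as the header
    fields = []
    for i, col in enumerate(header):
        column = [row[i] if i < len(row) else "" for row in data]
        non_empty = [c for c in column if c != ""]
        avro_type = simple_type(non_empty)
        if len(non_empty) < len(column):  # at least one empty cell
            avro_type = ["null", avro_type]
        fields.append({"name": col, "type": avro_type})
    return {"type": "record", "name": name, "namespace": namespace, "fields": fields}


schema = csv_to_schema("id,name,price\n1,Widget,9.99\n2,Gadget,12.50\n")
print(json.dumps(schema, indent=2))
```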

Online Table Editor

Edit your data in-browser before converting:

  • Undo / Redo — Full edit history.

  • Add / Delete Rows & Columns — Expand or trim the table.

  • Transpose — Swap rows and columns.

  • Delete Empty — Remove empty rows and columns.

  • Deduplicate — Remove duplicate rows.

  • ABC / abc / Abc — Batch case conversion.

  • Find & Replace — With regex support.

  • First Row as Header — Column headers become Avro field names.
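
Because column headers become field names, it helps to know that the Avro specification requires names to start with a letter or underscore and to contain only letters, digits, and underscores. Whether the tool rewrites headers automatically is not stated here, so the following Python sketch is only one possible way to clean headers yourself before converting. The function names and the "_2" suffix scheme for duplicates are assumptions.

```python
import re


def to_avro_name(header):
    """Replace characters Avro names cannot contain with underscores."""
    name = re.sub(r"[^A-Za-z0-9_]", "_", header.strip())
    if not name or not re.match(r"[A-Za-z_]", name):
        name = "_" + name  # names may not be empty or start with a digit
    return name


def unique_names(headers):
    """Sanitize every header and add a numeric suffix to duplicates."""
    used = set()
    result = []
    for h in headers:
        base = to_avro_name(h)
        candidate, n = base, 1
        while candidate in used:
            n += 1
            candidate = f"{base}_{n}"
        used.add(candidate)
        result.append(candidate)
    return result


print(unique_names(["id", "unit price", "2024 sales", "id"]))
```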

Privacy & Security

All processing runs client-side via the browser File API. Files are never uploaded, transmitted, or stored. Safe for enterprise data models, proprietary schemas, and production field definitions.


How to Use the CSV to Avro Schema Converter

Step 1 — Load Your Data

Upload a .csv or .tsv file by dragging it onto the upload area, or click to browse. Alternatively, click Enter Data to type or paste data directly.

Step 2 — Edit Your Data (Optional)

Use the toolbar to refine your data:

  • Add, insert, or delete rows and columns.

  • Transpose the table.

  • Remove empty rows/columns or duplicate rows.

  • Change text case.

  • Find and replace values (supports regex).

  • Toggle First Row as Header to define field names.

Step 3 — Convert

Click Convert. The tool analyzes each column's data values and generates an Avro schema with inferred types. The JSON schema appears in the Output Data panel.

Step 4 — Copy and Use

Click Copy to Clipboard and use the schema with:

  • Confluent Schema Registry — Register the schema for Kafka topics.

  • Apache Spark — Define the schema for spark.read.format("avro").

  • Apache Kafka Producers/Consumers — Embed in producer/consumer config.

  • AWS Glue / Kinesis Data Analytics — Use as table schema definitions.


Practical Examples

Example 1: Kafka Event Schema

Input CSV:

event_id,event_type,user_id,amount,timestamp,processed
1001,purchase,U-501,49.99,2026-05-07T10:30:00Z,true
1002,refund,U-502,,2026-05-07T11:15:00Z,false
1003,purchase,U-503,125.00,2026-05-07T12:00:00Z,true

Output Avro Schema:

{
  "type": "record",
  "name": "CsvRecord",
  "fields": [
    {"name": "event_id", "type": "int"},
    {"name": "event_type", "type": "string"},
    {"name": "user_id", "type": "string"},
    {"name": "amount", "type": ["null", "double"]},
    {"name": "timestamp", "type": "string"},
    {"name": "processed", "type": "boolean"}
  ]
}

Note: amount is a union ["null", "double"] because row 2 has an empty value.
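
A schema like this describes typed values, while CSV cells are all text, so rows usually need to be converted before they can be serialized. The Python sketch below shows one way to do that for this example's data. The CASTS table and coerce_row helper are hypothetical names, and the sketch handles only primitive types and two-branch unions with null.

```python
import csv
import io

SCHEMA = {
    "type": "record",
    "name": "CsvRecord",
    "fields": [
        {"name": "event_id", "type": "int"},
        {"name": "event_type", "type": "string"},
        {"name": "user_id", "type": "string"},
        {"name": "amount", "type": ["null", "double"]},
        {"name": "timestamp", "type": "string"},
        {"name": "processed", "type": "boolean"},
    ],
}

CSV_TEXT = """event_id,event_type,user_id,amount,timestamp,processed
1001,purchase,U-501,49.99,2026-05-07T10:30:00Z,true
1002,refund,U-502,,2026-05-07T11:15:00Z,false
1003,purchase,U-503,125.00,2026-05-07T12:00:00Z,true
"""

# Text-to-value converters for the Avro primitives used in this sketch.
CASTS = {
    "int": int,
    "long": int,
    "float": float,
    "double": float,
    "boolean": lambda s: s.lower() == "true",
    "string": str,
}


def coerce_row(row, schema):
    """Convert one CSV row (dict of strings) into typed values per the schema."""
    record = {}
    for field in schema["fields"]:
        raw = row[field["name"]]
        ftype = field["type"]
        if isinstance(ftype, list):  # union such as ["null", "double"]
            if raw == "":
                record[field["name"]] = None
                continue
            ftype = next(t for t in ftype if t != "null")
        record[field["name"]] = CASTS[ftype](raw)
    return record


records = [coerce_row(row, SCHEMA) for row in csv.DictReader(io.StringIO(CSV_TEXT))]
print(records[1]["amount"])  # the empty amount in row 2 becomes None
```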

Example 2: Product Catalog Schema

Input CSV:

id,name,category,price,in_stock,rating
1,Widget A,Hardware,12.99,145,4.5
2,Widget B,Hardware,8.50,0,3.8
3,Gadget X,Electronics,45.00,23,4.9

Output Avro Schema:

{
  "type": "record",
  "name": "CsvRecord",
  "fields": [
    {"name": "id", "type": "int"},
    {"name": "name", "type": "string"},
    {"name": "category", "type": "string"},
    {"name": "price", "type": "double"},
    {"name": "in_stock", "type": "int"},
    {"name": "rating", "type": "double"}
  ]
}

Example 3: IoT Sensor Data

Input CSV:

sensor_id,temperature,humidity,active,reading_time
S-001,22.5,65.0,true,2026-05-07T08:00:00Z
S-002,,78.3,true,2026-05-07T08:00:01Z
S-003,19.0,,false,

Output Avro Schema:

{
  "type": "record",
  "name": "CsvRecord",
  "fields": [
    {"name": "sensor_id", "type": "string"},
    {"name": "temperature", "type": ["null", "double"]},
    {"name": "humidity", "type": ["null", "double"]},
    {"name": "active", "type": "boolean"},
    {"name": "reading_time", "type": ["null", "string"]}
  ]
}

The temperature, humidity, and reading_time columns each contain an empty cell, so those fields use union types. The sensor_id and active columns have no empty cells and keep their plain types.


Understanding Apache Avro

What Is Avro?

Apache Avro is a data serialization system that provides:

  • Rich data structures — Records, enums, arrays, maps, unions.

  • Compact binary format — Smaller than JSON or XML.

  • Schema-based — Every data file includes its schema.

  • Schema evolution — Add/remove fields without breaking consumers.

  • Language-agnostic — Bindings for Java, Python, C, C++, C#, Go, Ruby, etc.
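
Schema evolution is easier to picture with a small example. In Avro, a reader matches record fields by name, fills in the declared default for a field the writer did not supply, and ignores writer fields it does not know. The Python sketch below imitates only that part of the resolution rules on plain dictionaries. It is not an Avro library, and the resolve function and sample schemas are made up for illustration.

```python
def resolve(record, reader_schema):
    """Project a decoded record onto the reader's fields, using defaults."""
    out = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            out[name] = record[name]
        elif "default" in field:
            out[name] = field["default"]  # field added after the data was written
        else:
            raise ValueError(f"no value and no default for field {name!r}")
    return out  # writer fields missing from the reader schema are dropped


# Version 2 of a hypothetical schema adds "currency" with a default.
READER_V2 = {
    "type": "record",
    "name": "Product",
    "fields": [
        {"name": "id", "type": "int"},
        {"name": "name", "type": "string"},
        {"name": "currency", "type": "string", "default": "USD"},
    ],
}

old_record = {"id": 1, "name": "Widget", "legacy_code": "W-1"}
print(resolve(old_record, READER_V2))
```

A new field without a default would raise an error here, which mirrors why added fields need defaults to stay compatible with older data.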

Avro is defined by the Apache Avro Specification.

Avro Schema Structure

An Avro schema is a JSON document with this structure:

{
  "type": "record",
  "name": "RecordName",
  "namespace": "com.example.namespace",
  "doc": "Description of this record",
  "fields": [
    {"name": "fieldName", "type": "string", "doc": "Field description"}
  ]
}

Key elements:

  • type: "record" — A record is Avro's equivalent of a struct or class.

  • name — The record type name.

  • namespace — Java-style package name for uniqueness.

  • fields — Array of field definitions, each with name and type.
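
The key elements listed above can be checked mechanically. The following Python sketch is a minimal structural check, assuming the schema has already been parsed from JSON into a dictionary. The function check_record_schema is a hypothetical helper and covers far less than a real Avro parser would.

```python
def check_record_schema(schema):
    """Return a list of problems; an empty list means the basic shape is fine."""
    problems = []
    if schema.get("type") != "record":
        problems.append('"type" must be "record"')
    if not isinstance(schema.get("name"), str) or not schema.get("name"):
        problems.append('"name" must be a non-empty string')
    fields = schema.get("fields")
    if not isinstance(fields, list):
        problems.append('"fields" must be an array')
        return problems
    seen = set()
    for i, field in enumerate(fields):
        if not isinstance(field, dict) or "name" not in field or "type" not in field:
            problems.append(f"field {i} needs both name and type")
            continue
        if field["name"] in seen:
            problems.append(f'duplicate field name {field["name"]!r}')
        seen.add(field["name"])
    return problems


good = {
    "type": "record",
    "name": "RecordName",
    "namespace": "com.example.namespace",
    "fields": [{"name": "fieldName", "type": "string"}],
}
print(check_record_schema(good))
```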

Avro Primitive Types

| Type | Description | Size |
| --- | --- | --- |
| null | No value | 0 bytes |
| boolean | True or false | 1 byte |
| int | 32-bit signed integer | variable (zigzag) |
| long | 64-bit signed integer | variable (zigzag) |
| float | IEEE 754 single precision | 4 bytes |
| double | IEEE 754 double precision | 8 bytes |
| bytes | Sequence of 8-bit bytes | variable |
| string | Unicode character sequence | variable |
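
The "variable (zigzag)" size for int and long means small magnitudes take fewer bytes. As a sketch of how that works, the Python function below applies zigzag mapping followed by a 7-bits-per-byte variable-length encoding, which is the scheme the Avro specification describes for these types. The function name encode_long is chosen for this example.

```python
def encode_long(n):
    """Encode a signed integer the way Avro encodes int and long values."""
    if not -2**63 <= n < 2**63:
        raise ValueError("value does not fit in an Avro long")
    # Zigzag maps 0, -1, 1, -2, 2 ... to 0, 1, 2, 3, 4 ... so that small
    # negative numbers also produce small unsigned values.
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)  # final byte has the continuation bit clear
    return bytes(out)


for value in (0, -1, 1, -64, 64, 145):
    print(value, encode_long(value).hex())
```

Values from -64 to 63 fit in one byte, while the largest long needs ten.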

Avro vs. JSON vs. Protobuf

| Aspect | Avro | JSON | Protobuf |
| --- | --- | --- | --- |
| Schema format | JSON | None (self-describing) | .proto (IDL) |
| Encoding | Binary | Text | Binary |
| Schema evolution | Full (add/remove/alias) | N/A | Partial |
| Type safety | Strong | Weak | Strong |
| Used by | Kafka, Hadoop, Spark | REST APIs, web | gRPC, Google services |
| Field ordering | Must match writer schema | N/A | By field number |

Avro in the Kafka Ecosystem

In Confluent Platform and Kafka:

  • Schema Registry stores Avro schemas with versioning.

  • Producers serialize data using a specific schema ID.

  • Consumers deserialize using the same schema or a compatible evolved version.

  • The generated schema can be registered directly via the Schema Registry REST API:

    POST /subjects/my-topic-value/versions
    { "schema": "<generated schema JSON>" }
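
One detail of that request is easy to miss: the schema travels as a JSON string inside the JSON body, so it is serialized twice. The Python sketch below builds such a request with the standard library without sending it. The base URL http://localhost:8081, the subject name, and the helper build_register_request are assumptions for illustration, so adjust them to your own registry.

```python
import json
import urllib.request


def build_register_request(schema, subject, base_url="http://localhost:8081"):
    """Prepare (but do not send) a Schema Registry registration request."""
    # The inner json.dumps turns the schema into a string; the outer one
    # wraps that string in the {"schema": "..."} body.
    body = json.dumps({"schema": json.dumps(schema)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/subjects/{subject}/versions",
        data=body,
        method="POST",
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
    )


schema = {
    "type": "record",
    "name": "CsvRecord",
    "fields": [{"name": "id", "type": "int"}],
}
request = build_register_request(schema, "my-topic-value")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the prepared request to urllib.request.urlopen would submit it to a running registry.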

Frequently Asked Questions

  • Is my CSV data uploaded to a server?

    No. All file processing happens entirely in your browser using JavaScript. Your CSV data is never uploaded, transferred, or stored on any server.

  • What is Apache Avro?

    Apache Avro is a data serialization framework that uses JSON schemas to define data structures and binary encoding for compact, efficient serialization. It is a common choice for Kafka with Confluent Schema Registry and is widely used in the Hadoop, Spark, and Flink ecosystems.

  • How does type inference work?

    The tool scans the data values in each CSV column. If all non-empty values are whole numbers within the 32-bit int range, it uses int. Larger integers become long. Decimal values become double. Columns containing only true/false become boolean. Everything else defaults to string. Columns with empty cells get union types (["null", "type"]).

  • Can I use the schema with Confluent Schema Registry?

    Yes. Copy the generated schema JSON and register it via the Schema Registry REST API: POST /subjects/<topic-name>-value/versions with {"schema": "<your schema>"}.

  • What Avro types does the tool generate?

    The tool generates Avro primitive types: string, int, long, float, double, boolean, and null. Nullable fields use Avro union types (e.g., ["null", "string"]).

  • What file formats are supported?

    The tool accepts .csv (comma-separated values) and .tsv (tab-separated values) files. You can also enter data manually through the built-in table editor.

  • Can I edit the generated schema manually?

    Yes. The output is plain JSON. You can modify field names, types, add doc descriptions, change the record name/namespace, or add logical types (e.g., {"type": "long", "logicalType": "timestamp-millis"}) after generation.

  • What is a union type in Avro?

    A union type is an array of types that allows a field to hold values of different types. The most common use is ["null", "string"] which means the field can be either null or a string. The tool generates unions when a column has empty cells.
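
The FAQ answer on manual editing mentions adding logical types after generation. As a hedged example, the Python sketch below changes a string timestamp field to the timestamp-millis logical type and converts an ISO 8601 value to the matching epoch milliseconds. The helper names are hypothetical, and the conversion assumes every timestamp carries a "Z" or an explicit UTC offset.

```python
import copy
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def set_timestamp_millis(schema, field_name):
    """Return a copy of the schema with one field typed as timestamp-millis."""
    patched = copy.deepcopy(schema)
    for field in patched["fields"]:
        if field["name"] == field_name:
            logical = {"type": "long", "logicalType": "timestamp-millis"}
            if isinstance(field["type"], list):  # keep the field nullable
                field["type"] = ["null", logical]
            else:
                field["type"] = logical
    return patched


def iso_to_millis(text):
    """Convert an ISO 8601 timestamp with an offset to epoch milliseconds."""
    # Older Python versions reject a trailing "Z", so normalize it first.
    dt = datetime.fromisoformat(text.replace("Z", "+00:00"))
    return (dt - EPOCH) // timedelta(milliseconds=1)


schema = {
    "type": "record",
    "name": "CsvRecord",
    "fields": [
        {"name": "event_id", "type": "int"},
        {"name": "timestamp", "type": "string"},
    ],
}
print(set_timestamp_millis(schema, "timestamp")["fields"][1])
print(iso_to_millis("2026-05-07T10:30:00Z"))
```

After such a change the data must carry integers rather than ISO strings, which is why the two helpers are shown together.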
