Build a Go Tool to Convert CSV to JSON
Simple tutorial to read CSV, validate data, convert to JSON and build a basic CLI in Go. Ideal project to get started.

This is probably the most useful simple project you can build in Go. No frameworks, no databases, no HTTP. Just files, structs, and the standard library. You read a CSV, convert it to JSON, and output it to stdout or write it to a file. Nothing more. And in the process you touch file reading, parsing, validation, serialization, command-line flags, and error handling. Everything you need to feel comfortable with the language before diving into more complex things.
If you’re looking for projects to learn Go, this is an excellent starting point. It’s small enough to finish in an afternoon and real enough that the result is something you can actually use.
What We’re Going to Build
A command-line tool that:
- Takes a CSV file as an argument.
- Reads and parses its content.
- Validates the data: detects empty fields, incorrect types, malformed rows.
- Converts rows to a typed Go structure.
- Serializes that structure to JSON.
- Supports compact or human-readable output (pretty print).
- Allows specifying an output file or printing to stdout.
The final result is used like this:
csvtojson -input data.csv -output result.json -pretty
Or in compact mode directly to stdout:
csvtojson -input data.csv
No external dependencies. Everything comes from Go’s standard library.
Project Setup
Create the project directory and initialize the module:
mkdir csvtojson && cd csvtojson
go mod init csvtojson
Create a main.go file. That will be the entire structure for now. One file, one package. When the project grows you can split it up, but for a tool like this it doesn’t make sense to complicate it from the start.
If you don’t have experience with the go command, the basics you need to know are that go mod init creates the go.mod file that defines the module, and go run main.go compiles and runs the program in one step without leaving a binary in your project directory.
For the example CSV, create a file data.csv with this content:
name,age,email,city
Ana García,34,ana@example.com,Madrid
Pedro López,28,pedro@example.com,Barcelona
María Torres,,maria@example.com,Valencia
,45,no-email,Sevilla
Carlos Ruiz,abc,carlos@example.com,Bilbao
I’ve included dirty data on purpose: an empty age, an empty name, an email without an @, and an age that’s not a number. This will be useful for the validation section.
Reading CSV with encoding/csv
Go has the encoding/csv package in its standard library. You don’t need to install anything. The package exposes a csv.Reader that takes an io.Reader and returns records as slices of strings.
package main
import (
	"encoding/csv"
	"fmt"
	"os"
)
func main() {
	file, err := os.Open("data.csv")
	if err != nil {
		fmt.Fprintf(os.Stderr, "error opening file: %v\n", err)
		os.Exit(1)
	}
	defer file.Close()
	reader := csv.NewReader(file)
	records, err := reader.ReadAll()
	if err != nil {
		fmt.Fprintf(os.Stderr, "error reading CSV: %v\n", err)
		os.Exit(1)
	}
	for i, record := range records {
		fmt.Printf("Row %d: %v\n", i, record)
	}
}
reader.ReadAll() loads the entire file into memory. For small or medium files (up to hundreds of thousands of rows) this is the simplest approach and performance is rarely a concern. If you need to process huge files, you can use reader.Read() in a loop to read row by row, but for this tool it’s not necessary.
The first record (records[0]) contains the headers. The rest is data. That distinction is important because we’ll use the headers to map each field to its position.
One detail about csv.Reader: by default it assumes the delimiter is a comma. If you need a semicolon or another character, you can change it with reader.Comma = ';' before calling ReadAll().
Mapping Rows to Structs
Working with slices of strings is fine for reading, but to serialize to JSON we need a typed structure. Let’s define a struct that represents each row of the CSV and a function that maps the records.
type Person struct {
	Name  string `json:"name"`
	Age   int    `json:"age"`
	Email string `json:"email"`
	City  string `json:"city"`
}
The json:"..." tags control the key used for each field in the JSON output. Without them, Go would use the Go field names exactly as written (Name, Age, Email, City), capital letter included, which isn’t what we want.
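To make the effect of the tags concrete, here is a small sketch that marshals the same data with and without them. The tagged and untagged type names and the marshal helper are made up for this comparison.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// untagged has no struct tags, so encoding/json uses the Go field names.
type untagged struct {
	Name string
	Age  int
}

// tagged maps each field to a lowercase JSON key.
type tagged struct {
	Name string `json:"name"`
	Age  int    `json:"age"`
}

// marshal (hypothetical helper) returns the compact JSON text for v.
func marshal(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(marshal(untagged{Name: "Ana", Age: 34})) // {"Name":"Ana","Age":34}
	fmt.Println(marshal(tagged{Name: "Ana", Age: 34}))   // {"name":"Ana","age":34}
}
```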
Now the mapping function:
import "strconv"
func mapRecordToPerson(header []string, record []string) (Person, error) {
	if len(record) != len(header) {
		return Person{}, fmt.Errorf("row has %d fields, expected %d", len(record), len(header))
	}
	fieldMap := make(map[string]string)
	for i, h := range header {
		fieldMap[h] = record[i]
	}
	age, err := strconv.Atoi(fieldMap["age"])
	if err != nil {
		return Person{}, fmt.Errorf("field 'age' is not a valid number: %q", fieldMap["age"])
	}
	return Person{
		Name:  fieldMap["name"],
		Age:   age,
		Email: fieldMap["email"],
		City:  fieldMap["city"],
	}, nil
}
We use an intermediate map (fieldMap) to avoid depending on column order. This makes the tool work even if someone changes the column order in the CSV, as long as the header names are the same.
strconv.Atoi converts a string to int. If the value isn’t a number, it returns an error. Go forces you to handle it explicitly. No exceptions, no implicit conversions. Each error is checked where it occurs.
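If you want to tell the user why a conversion failed, the error from strconv.Atoi can be inspected. The sketch below assumes a hypothetical describeAge helper; it distinguishes text that isn’t a number from a number that doesn’t fit in an int.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// describeAge (hypothetical helper) converts raw to an int and reports
// what went wrong when the conversion fails.
func describeAge(raw string) string {
	age, err := strconv.Atoi(raw)
	switch {
	case err == nil:
		return fmt.Sprintf("ok: %d", age)
	case errors.Is(err, strconv.ErrSyntax):
		return fmt.Sprintf("%q is not a number", raw)
	case errors.Is(err, strconv.ErrRange):
		return fmt.Sprintf("%q is out of range", raw)
	default:
		return err.Error()
	}
}

func main() {
	for _, raw := range []string{"34", "abc", "", "99999999999999999999999"} {
		fmt.Println(describeAge(raw))
	}
}
```

An empty string is reported as a syntax error too, which is why the validation in the next section checks for empty fields separately.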
Validating Data: Empty Fields, Incorrect Types
The mapping function already detects incorrect types in the age. But we need a more complete validation. Empty fields, emails that don’t make sense, rows with incomplete data. Let’s add a validation function that returns descriptive errors.
import "strings"
type ValidationError struct {
	Row     int
	Field   string
	Message string
}
func (e ValidationError) Error() string {
	return fmt.Sprintf("row %d, field '%s': %s", e.Row, e.Field, e.Message)
}
func validatePerson(row int, header []string, record []string) []ValidationError {
	var errs []ValidationError
	if len(record) != len(header) {
		errs = append(errs, ValidationError{
			Row:     row,
			Field:   "-",
			Message: fmt.Sprintf("incorrect number of fields: has %d, expected %d", len(record), len(header)),
		})
		return errs
	}
	fieldMap := make(map[string]string)
	for i, h := range header {
		fieldMap[h] = strings.TrimSpace(record[i])
	}
	if fieldMap["name"] == "" {
		errs = append(errs, ValidationError{Row: row, Field: "name", Message: "empty field"})
	}
	if fieldMap["age"] == "" {
		errs = append(errs, ValidationError{Row: row, Field: "age", Message: "empty field"})
	} else if _, err := strconv.Atoi(fieldMap["age"]); err != nil {
		errs = append(errs, ValidationError{Row: row, Field: "age", Message: fmt.Sprintf("not a valid number: %q", fieldMap["age"])})
	}
	if fieldMap["email"] == "" {
		errs = append(errs, ValidationError{Row: row, Field: "email", Message: "empty field"})
	} else if !strings.Contains(fieldMap["email"], "@") {
		errs = append(errs, ValidationError{Row: row, Field: "email", Message: "invalid email format"})
	}
	if fieldMap["city"] == "" {
		errs = append(errs, ValidationError{Row: row, Field: "city", Message: "empty field"})
	}
	return errs
}
We define a ValidationError type that implements the error interface. This is idiomatic in Go: instead of throwing exceptions, you return values that describe the problem. The calling code decides what to do with them. It can abort execution, skip the row, or accumulate errors and display a summary.
The email validation is basic: we only check that it contains @. For a real tool you could use net/mail.ParseAddress, but for our case it’s sufficient.
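For reference, a stricter check based on net/mail.ParseAddress could look like the sketch below. The isValidEmail name is hypothetical, and the extra comparison against the input is an assumption about what this tool would want: ParseAddress also accepts forms such as "Ana <ana@example.com>", which probably shouldn’t appear in a CSV email column.

```go
package main

import (
	"fmt"
	"net/mail"
)

// isValidEmail (hypothetical helper) reports whether s parses as a bare
// email address with no display name around it.
func isValidEmail(s string) bool {
	addr, err := mail.ParseAddress(s)
	if err != nil {
		return false
	}
	// Reject inputs like "Ana <ana@example.com>" that parse successfully
	// but are more than a plain address.
	return addr.Address == s
}

func main() {
	for _, s := range []string{"ana@example.com", "no-email", "", "Ana <ana@example.com>"} {
		fmt.Printf("%q valid: %v\n", s, isValidEmail(s))
	}
}
```

This only checks syntax. It says nothing about whether the mailbox exists.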
Converting to JSON with encoding/json
With the data validated and mapped to structs, the conversion to JSON is trivial. The encoding/json package from the standard library does all the work.
import "encoding/json"
func toJSON(people []Person, pretty bool) ([]byte, error) {
	if pretty {
		return json.MarshalIndent(people, "", "  ")
	}
	return json.Marshal(people)
}
json.Marshal serializes a value to a []byte using its exported fields and their json tags. json.MarshalIndent does the same but with readable indentation. The two extra arguments are the prefix (normally empty) and the indentation string (two spaces here).
The result with pretty = true looks like this:
[
  {
    "name": "Ana García",
    "age": 34,
    "email": "ana@example.com",
    "city": "Madrid"
  },
  {
    "name": "Pedro López",
    "age": 28,
    "email": "pedro@example.com",
    "city": "Barcelona"
  }
]
And with pretty = false:
[{"name":"Ana García","age":34,"email":"ana@example.com","city":"Madrid"},{"name":"Pedro López","age":28,"email":"pedro@example.com","city":"Barcelona"}]
The compact version is better for pipelines and automated processing. The formatted version is better for debugging or for humans.
Pretty Printing vs Compact Output
The difference between the two modes isn’t just aesthetic. In production, when you connect the output of one tool to another via pipes, compact output is what you need. Every byte counts if you’re processing millions of records.
But during development, or when you want to inspect the result manually, pretty print saves time. You don’t need to pipe JSON through jq or paste it into an online formatter.
That’s why our tool supports both modes. The -pretty flag activates indentation. Without it, the output is compact by default. It’s a common convention in CLI tools: the silent, efficient mode is the default; the human mode is activated explicitly.
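To see the size difference for yourself, a quick sketch like this one compares the byte length of both encodings. The outputSizes helper is hypothetical and the two-space indent is an assumption; the exact numbers depend on your data, so none are quoted here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Person struct {
	Name  string `json:"name"`
	Age   int    `json:"age"`
	Email string `json:"email"`
	City  string `json:"city"`
}

// outputSizes (hypothetical helper) returns the byte length of the compact
// and the indented JSON encoding of people.
func outputSizes(people []Person) (compact, indented int, err error) {
	c, err := json.Marshal(people)
	if err != nil {
		return 0, 0, err
	}
	p, err := json.MarshalIndent(people, "", "  ")
	if err != nil {
		return 0, 0, err
	}
	return len(c), len(p), nil
}

func main() {
	people := []Person{
		{Name: "Ana García", Age: 34, Email: "ana@example.com", City: "Madrid"},
		{Name: "Pedro López", Age: 28, Email: "pedro@example.com", City: "Barcelona"},
	}
	compact, indented, err := outputSizes(people)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("compact: %d bytes, indented: %d bytes\n", compact, indented)
}
```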
A useful trick: if your tool only writes to stdout, you can combine it with jq to format on the fly:
csvtojson -input data.csv | jq .
But having the flag built-in is more convenient and removes the dependency on jq.
Adding CLI Flags: Input File, Output File, and Format Options
Go has the flag package in its standard library. You don’t need cobra, urfave/cli, or any other dependency for a simple tool. If you later need subcommands or autocomplete, you can look at how to build a CLI in Go with more advanced tools. But for this, flag is perfect.
import "flag"
func main() {
	inputFile := flag.String("input", "", "input CSV file (required)")
	outputFile := flag.String("output", "", "output JSON file (stdout if not specified)")
	pretty := flag.Bool("pretty", false, "JSON output with indentation")
	strict := flag.Bool("strict", false, "abort on validation errors")
	flag.Parse()
	if *inputFile == "" {
		fmt.Fprintln(os.Stderr, "error: you must specify an input file with -input")
		flag.Usage()
		os.Exit(1)
	}
	// outputFile, pretty and strict are used in the complete program;
	// Go will not compile this fragment until they are.
}
flag.String and flag.Bool return pointers. You have to dereference them with * to get the value. It’s one of the things that feels odd in Go at first, but it has its logic: flag.Parse() fills in the values after they’re defined, so it needs a mutable reference.
The -strict flag is useful for different scenarios. In normal mode, the tool skips rows with errors and shows warnings. In strict mode, any validation error stops execution. This is important when you use the tool inside a data pipeline where you don’t want partial data.
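If the pointers bother you, the flag package also offers StringVar and BoolVar, which write into variables you already own. The sketch below is one possible arrangement, not the approach used in this tutorial: the options struct and the parseArgs function are hypothetical, and a flag.FlagSet is used so the parsing can be exercised with an explicit argument list.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// options (hypothetical) groups the values of the tool's flags.
type options struct {
	input  string
	output string
	pretty bool
	strict bool
}

// parseArgs (hypothetical helper) parses args into an options value.
func parseArgs(args []string) (options, error) {
	var opts options
	fs := flag.NewFlagSet("csvtojson", flag.ContinueOnError)
	fs.StringVar(&opts.input, "input", "", "input CSV file (required)")
	fs.StringVar(&opts.output, "output", "", "output JSON file (stdout if not specified)")
	fs.BoolVar(&opts.pretty, "pretty", false, "JSON output with indentation")
	fs.BoolVar(&opts.strict, "strict", false, "abort on validation errors")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	if opts.input == "" {
		return options{}, errors.New("you must specify an input file with -input")
	}
	return opts, nil
}

func main() {
	// A fixed argument list stands in for os.Args[1:] in this sketch.
	opts, err := parseArgs([]string{"-input", "data.csv", "-pretty"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", opts)
}
```

Returning an error instead of calling os.Exit inside the parser also makes the flag handling easy to cover with a unit test.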
Error Handling: Clear Messages for the User
Go has no exceptions. Every function that can fail returns an error as its last return value. This means error handling code is always visible, always explicit. In exchange, you get full control over what to do in each case.
For a CLI tool, error messages must be useful. Nothing like “error: something went wrong”. The user needs to know which file failed, which row has the problem, and which field has incorrect data.
func processCSV(inputPath string, strict bool) ([]Person, error) {
	file, err := os.Open(inputPath)
	if err != nil {
		if os.IsNotExist(err) {
			return nil, fmt.Errorf("file '%s' does not exist", inputPath)
		}
		if os.IsPermission(err) {
			return nil, fmt.Errorf("no permission to read '%s'", inputPath)
		}
		return nil, fmt.Errorf("cannot open '%s': %w", inputPath, err)
	}
	defer file.Close()
	reader := csv.NewReader(file)
	reader.FieldsPerRecord = -1 // accept rows of any length so validatePerson can report them
	records, err := reader.ReadAll()
	if err != nil {
		return nil, fmt.Errorf("error parsing CSV: %w", err)
	}
	if len(records) < 2 {
		return nil, fmt.Errorf("CSV file is empty or has only headers")
	}
	header := records[0]
	var people []Person
	var allErrors []ValidationError
	for i, record := range records[1:] {
		rowNum := i + 2 // +2 because we start at 1 and skip the header
		validationErrs := validatePerson(rowNum, header, record)
		if len(validationErrs) > 0 {
			allErrors = append(allErrors, validationErrs...)
			if strict {
				return nil, fmt.Errorf("strict mode: %v", validationErrs[0])
			}
			for _, ve := range validationErrs {
				fmt.Fprintf(os.Stderr, "warning: %v\n", ve)
			}
			continue
		}
		person, err := mapRecordToPerson(header, record)
		if err != nil {
			fmt.Fprintf(os.Stderr, "warning: row %d: %v\n", rowNum, err)
			continue
		}
		people = append(people, person)
	}
	if len(people) == 0 {
		return nil, fmt.Errorf("no valid rows found (%d errors)", len(allErrors))
	}
	// Count failed rows, not individual errors: one row can have several errors.
	if failed := len(records) - 1 - len(people); failed > 0 {
		fmt.Fprintf(os.Stderr, "\n%d rows with errors, %d rows processed successfully\n", failed, len(people))
	}
	return people, nil
}
Several important points here:
- Error wrapping with %w: allows the calling code to inspect the root cause with errors.Is or errors.As. It’s the standard way to chain errors in Go since version 1.13.
- Warnings to stderr: warning messages go to os.Stderr so they don’t pollute the JSON output going to stdout. This is fundamental if the tool is used in a pipeline.
- Final summary: when done, the user sees how many rows were processed and how many failed. Information, not just an exit code.
- Variable-length rows: by default csv.Reader rejects any row whose field count differs from the first row, which would abort the whole parse. Setting FieldsPerRecord to -1 disables that check so the per-row validation can report the problem instead.
Compiling and Distributing the Binary
One of the practical advantages of Go is that it compiles to a static binary. No runtime dependencies, no need for the target machine to have Go installed. Copy the binary and it works.
go build -o csvtojson main.go
That generates a csvtojson executable for your current operating system and architecture. To distribute it to other platforms, you can do cross-compilation:
# Linux AMD64
GOOS=linux GOARCH=amd64 go build -o csvtojson-linux-amd64 main.go
# macOS ARM (Apple Silicon)
GOOS=darwin GOARCH=arm64 go build -o csvtojson-darwin-arm64 main.go
# Windows
GOOS=windows GOARCH=amd64 go build -o csvtojson.exe main.go
You don’t need Docker, a VM, or a full CI environment. A pure-Go program like this one (no cgo) cross-compiles for any supported platform from any other. It’s one of the reasons Go is so popular for command-line tools.
To reduce binary size, you can use the -ldflags flag:
go build -ldflags="-s -w" -o csvtojson main.go
-s strips the symbol table and -w strips DWARF debug information. For a small tool like this you go from roughly 5-6 MB to 3-4 MB, depending on the Go version and platform. Not critical, but good practice.
If you want to install the tool directly into your Go bin directory ($GOBIN if set, otherwise $GOPATH/bin):
go install
As long as that directory is on your PATH, you can then run csvtojson from any directory.
Extending the Tool: Filtering, Sorting, and More Formats
Once you have the base working, there are natural extensions you can add without changing the architecture. Each one forces you to learn something new about Go.
Filtering Rows
Add a -filter flag that accepts a simple expression like city=Madrid:
filterExpr := flag.String("filter", "", "filter rows (field=value)")
And then filter after validation:
func filterPeople(people []Person, field, value string) []Person {
	var result []Person
	for _, p := range people {
		match := false
		switch field {
		case "name":
			match = strings.EqualFold(p.Name, value)
		case "city":
			match = strings.EqualFold(p.City, value)
		case "email":
			match = strings.EqualFold(p.Email, value)
		}
		if match {
			result = append(result, p)
		}
	}
	return result
}
strings.EqualFold does case-insensitive comparison. It’s more robust than converting everything to lowercase manually. Note that an unknown field name matches nothing, so the result is empty.
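The -filter flag delivers a single string, so something has to split it into the field and value that filterPeople expects. One possible sketch, assuming a hypothetical parseFilter helper and using strings.Cut (available since Go 1.18):

```go
package main

import (
	"fmt"
	"strings"
)

// parseFilter (hypothetical helper) splits an expression such as
// "city=Madrid" into its field and value, and rejects fields that
// filterPeople does not handle.
func parseFilter(expr string) (string, string, error) {
	field, value, found := strings.Cut(expr, "=")
	field = strings.ToLower(strings.TrimSpace(field))
	value = strings.TrimSpace(value)
	if !found || field == "" {
		return "", "", fmt.Errorf("invalid filter %q: expected field=value", expr)
	}
	switch field {
	case "name", "city", "email":
		return field, value, nil
	default:
		return "", "", fmt.Errorf("unsupported filter field %q", field)
	}
}

func main() {
	for _, expr := range []string{"city=Madrid", "Madrid", "age=34"} {
		field, value, err := parseFilter(expr)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("field=%s value=%s\n", field, value)
	}
}
```

Rejecting unknown fields up front gives the user an error message instead of a silently empty result.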
Sorting Results
Use the sort package from the standard library:
import "sort"
func sortPeople(people []Person, field string, ascending bool) {
	sort.Slice(people, func(i, j int) bool {
		a, b := people[i], people[j]
		if !ascending {
			// Swap the operands for descending order. Negating the result
			// would report equal elements as "less" in both directions.
			a, b = b, a
		}
		switch field {
		case "age":
			return a.Age < b.Age
		case "city":
			return a.City < b.City
		default: // "name" and any unknown field
			return a.Name < b.Name
		}
	})
}
sort.Slice is the standard way to sort slices in Go. It takes a comparison function that returns true if element i should come before j. The function must return false for equal elements, which is why descending order swaps the operands instead of negating the result.
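sort.Slice makes no promise about the relative order of equal elements. When that order matters, for instance to keep the CSV’s row order within each city, sort.SliceStable is the drop-in alternative. A small sketch, with a trimmed-down Person and a hypothetical sortByCity helper:

```go
package main

import (
	"fmt"
	"sort"
)

// Person is reduced to the fields this sketch needs.
type Person struct {
	Name string
	City string
}

// sortByCity (hypothetical helper) orders people by city and keeps the
// original input order among people from the same city.
func sortByCity(people []Person) {
	sort.SliceStable(people, func(i, j int) bool {
		return people[i].City < people[j].City
	})
}

func main() {
	people := []Person{
		{Name: "Pedro", City: "Barcelona"},
		{Name: "Ana", City: "Madrid"},
		{Name: "Luis", City: "Madrid"},
		{Name: "Marta", City: "Barcelona"},
	}
	sortByCity(people)
	for _, p := range people {
		fmt.Println(p.City, p.Name)
	}
}
```

The < operator compares strings byte by byte, so accented and uppercase letters won’t sort the way a dictionary would; locale-aware collation is outside the standard library.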
Support for Other Output Formats
You can add output in YAML format (which needs a third-party package, since the standard library has none), NDJSON (one JSON document per line, useful for streaming), or even text tables:
import "bytes"
func toNDJSON(people []Person) ([]byte, error) {
	var buf bytes.Buffer
	encoder := json.NewEncoder(&buf) // Encode writes one value followed by a newline
	for _, p := range people {
		if err := encoder.Encode(p); err != nil {
			return nil, fmt.Errorf("error encoding person: %w", err)
		}
	}
	return buf.Bytes(), nil
}
NDJSON is especially practical when working with tools like jq, because each line is a standalone, valid JSON document. You can process the lines with grep, head, tail, or any standard Unix tool.
Complete Code
Here’s the complete program, ready to compile and run:
package main
import (
	"bytes"
	"encoding/csv"
	"encoding/json"
	"flag"
	"fmt"
	"os"
	"strconv"
	"strings"
)
type Person struct {
	Name  string `json:"name"`
	Age   int    `json:"age"`
	Email string `json:"email"`
	City  string `json:"city"`
}
type ValidationError struct {
	Row     int
	Field   string
	Message string
}
func (e ValidationError) Error() string {
	return fmt.Sprintf("row %d, field '%s': %s", e.Row, e.Field, e.Message)
}
func validatePerson(row int, header []string, record []string) []ValidationError {
	var errs []ValidationError
	if len(record) != len(header) {
		errs = append(errs, ValidationError{
			Row:     row,
			Field:   "-",
			Message: fmt.Sprintf("incorrect number of fields: has %d, expected %d", len(record), len(header)),
		})
		return errs
	}
	fieldMap := make(map[string]string)
	for i, h := range header {
		fieldMap[h] = strings.TrimSpace(record[i])
	}
	if fieldMap["name"] == "" {
		errs = append(errs, ValidationError{Row: row, Field: "name", Message: "empty field"})
	}
	if fieldMap["age"] == "" {
		errs = append(errs, ValidationError{Row: row, Field: "age", Message: "empty field"})
	} else if _, err := strconv.Atoi(fieldMap["age"]); err != nil {
		errs = append(errs, ValidationError{Row: row, Field: "age", Message: fmt.Sprintf("not a valid number: %q", fieldMap["age"])})
	}
	if fieldMap["email"] == "" {
		errs = append(errs, ValidationError{Row: row, Field: "email", Message: "empty field"})
	} else if !strings.Contains(fieldMap["email"], "@") {
		errs = append(errs, ValidationError{Row: row, Field: "email", Message: "invalid email format"})
	}
	if fieldMap["city"] == "" {
		errs = append(errs, ValidationError{Row: row, Field: "city", Message: "empty field"})
	}
	return errs
}
func mapRecordToPerson(header []string, record []string) (Person, error) {
	if len(record) != len(header) {
		return Person{}, fmt.Errorf("row has %d fields, expected %d", len(record), len(header))
	}
	fieldMap := make(map[string]string)
	for i, h := range header {
		fieldMap[h] = strings.TrimSpace(record[i])
	}
	age, err := strconv.Atoi(fieldMap["age"])
	if err != nil {
		return Person{}, fmt.Errorf("field 'age' is not a valid number: %q", fieldMap["age"])
	}
	return Person{
		Name:  fieldMap["name"],
		Age:   age,
		Email: fieldMap["email"],
		City:  fieldMap["city"],
	}, nil
}
func processCSV(inputPath string, strict bool) ([]Person, error) {
	file, err := os.Open(inputPath)
	if err != nil {
		if os.IsNotExist(err) {
			return nil, fmt.Errorf("file '%s' does not exist", inputPath)
		}
		if os.IsPermission(err) {
			return nil, fmt.Errorf("no permission to read '%s'", inputPath)
		}
		return nil, fmt.Errorf("cannot open '%s': %w", inputPath, err)
	}
	defer file.Close()
	reader := csv.NewReader(file)
	reader.FieldsPerRecord = -1 // accept rows of any length so validatePerson can report them
	records, err := reader.ReadAll()
	if err != nil {
		return nil, fmt.Errorf("error parsing CSV: %w", err)
	}
	if len(records) < 2 {
		return nil, fmt.Errorf("CSV file is empty or has only headers")
	}
	header := records[0]
	var people []Person
	var allErrors []ValidationError
	for i, record := range records[1:] {
		rowNum := i + 2 // +2 because we start at 1 and skip the header
		validationErrs := validatePerson(rowNum, header, record)
		if len(validationErrs) > 0 {
			allErrors = append(allErrors, validationErrs...)
			if strict {
				return nil, fmt.Errorf("strict mode: %v", validationErrs[0])
			}
			for _, ve := range validationErrs {
				fmt.Fprintf(os.Stderr, "warning: %v\n", ve)
			}
			continue
		}
		person, err := mapRecordToPerson(header, record)
		if err != nil {
			fmt.Fprintf(os.Stderr, "warning: row %d: %v\n", rowNum, err)
			continue
		}
		people = append(people, person)
	}
	if len(people) == 0 {
		return nil, fmt.Errorf("no valid rows found (%d errors)", len(allErrors))
	}
	// Count failed rows, not individual errors: one row can have several errors.
	if failed := len(records) - 1 - len(people); failed > 0 {
		fmt.Fprintf(os.Stderr, "\n%d rows with errors, %d rows processed successfully\n", failed, len(people))
	}
	return people, nil
}
func toJSON(people []Person, pretty bool) ([]byte, error) {
	if pretty {
		return json.MarshalIndent(people, "", "  ")
	}
	return json.Marshal(people)
}
func toNDJSON(people []Person) ([]byte, error) {
	var buf bytes.Buffer
	encoder := json.NewEncoder(&buf) // Encode writes one value followed by a newline
	for _, p := range people {
		if err := encoder.Encode(p); err != nil {
			return nil, fmt.Errorf("error encoding record: %w", err)
		}
	}
	return buf.Bytes(), nil
}
func main() {
	inputFile := flag.String("input", "", "input CSV file (required)")
	outputFile := flag.String("output", "", "output JSON file (stdout if not specified)")
	pretty := flag.Bool("pretty", false, "JSON output with indentation")
	strict := flag.Bool("strict", false, "abort on validation errors")
	format := flag.String("format", "json", "output format: json or ndjson")
	flag.Parse()
	if *inputFile == "" {
		fmt.Fprintln(os.Stderr, "error: you must specify an input file with -input")
		flag.Usage()
		os.Exit(1)
	}
	people, err := processCSV(*inputFile, *strict)
	if err != nil {
		fmt.Fprintf(os.Stderr, "error: %v\n", err)
		os.Exit(1)
	}
	var output []byte
	switch *format {
	case "json":
		output, err = toJSON(people, *pretty)
	case "ndjson":
		output, err = toNDJSON(people)
	default:
		fmt.Fprintf(os.Stderr, "error: unknown format '%s' (use 'json' or 'ndjson')\n", *format)
		os.Exit(1)
	}
	if err != nil {
		fmt.Fprintf(os.Stderr, "error generating output: %v\n", err)
		os.Exit(1)
	}
	if *outputFile != "" {
		err = os.WriteFile(*outputFile, output, 0644)
		if err != nil {
			fmt.Fprintf(os.Stderr, "error writing '%s': %v\n", *outputFile, err)
			os.Exit(1)
		}
		fmt.Fprintf(os.Stderr, "JSON written to '%s' (%d records)\n", *outputFile, len(people))
	} else {
		// NDJSON output already ends with a newline; trim it so Println
		// does not add a blank line.
		fmt.Println(string(bytes.TrimRight(output, "\n")))
	}
}
Run and Test
# Compile
go build -o csvtojson main.go
# Compact output to stdout
./csvtojson -input data.csv
# Formatted output to file
./csvtojson -input data.csv -output result.json -pretty
# Strict mode: fail on first error
./csvtojson -input data.csv -strict
# NDJSON format
./csvtojson -input data.csv -format ndjson
With our example CSV, the output will include warnings for rows with problematic data and only valid rows will appear in the resulting JSON:
$ ./csvtojson -input data.csv -pretty
warning: row 4, field 'age': empty field
warning: row 5, field 'name': empty field
warning: row 5, field 'email': invalid email format
warning: row 6, field 'age': not a valid number: "abc"

3 rows with errors, 2 rows processed successfully
[
  {
    "name": "Ana García",
    "age": 34,
    "email": "ana@example.com",
    "city": "Madrid"
  },
  {
    "name": "Pedro López",
    "age": 28,
    "email": "pedro@example.com",
    "city": "Barcelona"
  }
]
Under 200 Lines and Zero Dependencies
With under 200 lines of Go you’ve built a functional command-line tool that reads CSV, validates data, converts to JSON, and supports multiple output formats. All with the standard library. No external dependencies, no frameworks, no magic. In the process you’ve touched the fundamental packages of the language: encoding/csv and encoding/json for data transformation, flag for arguments, strconv for type conversion, and os for filesystem interaction. But most importantly, you’ve practiced the if err != nil pattern and the use of structs with tags, which are the foundation of any Go program.
This type of project is exactly what you need to consolidate the fundamentals of the language. It’s not a theoretical exercise. It’s a tool you can use in your day-to-day work, extend according to your needs, and compile for any platform with a single command.
If you want to keep practicing with real projects, take a look at the list of projects to learn Go. And if you’re interested in taking CLI tools further, with subcommands and autocomplete, the natural next step is building a CLI in Go with specialized libraries.


