# fimo-csv (file-mongo-csv)

fimo-csv is a fast, flexible CLI tool written in Rust that imports CSV files into MongoDB as documents, using YAML-based field mappings and Jinja2-style templating. It is well suited to bulk inserts, updates, and upserts where you need full control over document structure.


# 🚀 Features

  • RFC 4180-compliant CSV parsing (including headers, quoting, escaped quotes)
  • 🛠️ Field mapping via YAML configuration
  • 🧠 Custom transformation logic using MiniJinja
  • 🔁 Supports complex templated pipelines for update and upsert operations, enabling aggregation logic and fine-grained control over MongoDB document modifications.
  • 📦 MongoDB insert, update, and upsert support
  • 🧪 Validate-only and dry-run modes
  • 🔄 Batch processing support for large files
  • 🔐 Supports Extended JSON and BSON types
  • 🔣 Configurable CSV delimiter and quote characters
  • 📊 Debug and verbose output for development and testing
  • 📅 NEW: Flexible date parsing with multiple format support (e.g. ISO, MSSQL, Oracle, Go)

# 📦 Installation

cargo install fimo-csv

Or clone and build:

git clone https://github.com/fimo-org/fimo-csv.git
cd fimo-csv
cargo build --release

# 📝 Usage

fimo-csv \
  --input tests/data/extended.csv \
  --mapping tests/mapping/extended.yaml \
  --template-dir tests/templates \
  --mongo-uri mongodb://localhost:27017 \
  --db testdb \
  --collection testcol \
  --operation upsert \
  --extended-json \
  --debug

# 🧪 Example: With Templates and Extended JSON

📁 data.csv

_id,price,created_at,name,active
507f1f77bcf86cd799439011,12.34,2024-01-01T10:00:00Z,Alice,yes

🧩 mapping.yaml

_id:
  type: objectId
price:
  type: decimal
created_at:
  type: date
name:
  type: string
active:
  type: bool
  truthy: ["yes", "true", "1", "Y"]
  falsy: ["no", "false", "0", "N"]

🧾 templates/upsert.j2

{
  "filter": { "_id": {{ ERROR }} },
  "update": {
    "$set": {
      "price": {{ ERROR }},
      "created_at": {{ ERROR }},
      "name": "{{ ERROR }}",
      "active": {{ ERROR }}
    },
    "$setOnInsert": {
      "created_at": {{ ERROR }}
    }
  }
}

▶️ Run the Import

fimo-csv \
  --input data.csv \
  --mapping mapping.yaml \
  --template-dir templates \
  --mongo-uri mongodb://localhost:27017 \
  --db testdb \
  --collection customers \
  --operation upsert \
  --extended-json

# 🧪 Example: Raw Insert (No Templates)

📁 simple.csv

name,age,active
Bob,25,true

🧩 simple.yaml

name:
  type: string
age:
  type: int
active:
  type: bool

▶️ Raw Insert Command

fimo-csv \
  --input simple.csv \
  --mapping simple.yaml \
  --mongo-uri mongodb://localhost:27017 \
  --db demo \
  --collection people \
  --operation insert \
  --raw-insert
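
To make the mapping step concrete, here is a minimal Rust sketch of how a row like `Bob,25,true` could be turned into typed values using the types from simple.yaml. It is an illustration only, not fimo-csv's actual code: `FieldType`, `Value`, `convert`, and `row_to_document` are hypothetical names, `Value` merely stands in for a BSON value, and accepting only `true`/`false` for booleans is an assumption.

```rust
use std::collections::BTreeMap;

/// Field types used in simple.yaml (hypothetical enum for this sketch).
#[derive(Debug, Clone, Copy)]
enum FieldType {
    Str,
    Int,
    Bool,
}

/// A typed value, standing in for a BSON value in this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Str(String),
    Int(i64),
    Bool(bool),
}

/// Convert one CSV cell according to its mapped type.
fn convert(raw: &str, ty: FieldType) -> Result<Value, String> {
    match ty {
        FieldType::Str => Ok(Value::Str(raw.to_string())),
        FieldType::Int => raw
            .trim()
            .parse::<i64>()
            .map(Value::Int)
            .map_err(|e| format!("invalid int {raw:?}: {e}")),
        // Assumption: without truthy/falsy lists only "true"/"false" are accepted.
        FieldType::Bool => match raw.trim() {
            "true" => Ok(Value::Bool(true)),
            "false" => Ok(Value::Bool(false)),
            other => Err(format!("invalid bool {other:?}")),
        },
    }
}

/// Pair headers with cells and build one document-like map for the row.
fn row_to_document(
    headers: &[&str],
    cells: &[&str],
    mapping: &BTreeMap<&str, FieldType>,
) -> Result<BTreeMap<String, Value>, String> {
    let mut doc = BTreeMap::new();
    for (name, raw) in headers.iter().zip(cells) {
        let ty = mapping
            .get(name)
            .ok_or_else(|| format!("no mapping for column {name}"))?;
        doc.insert(name.to_string(), convert(raw, *ty)?);
    }
    Ok(doc)
}
```

Calling `row_to_document(&["name", "age", "active"], &["Bob", "25", "true"], &mapping)` with a mapping of `name: Str`, `age: Int`, `active: Bool` yields a map with a string, an integer, and a boolean. A cell that does not fit its declared type produces an error instead of a document.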

# 🔧 CLI Options

| Option | Description |
| --- | --- |
| `--input` | Path to the CSV file |
| `--mapping` | Path to the YAML mapping file |
| `--mongo-uri` | MongoDB connection URI |
| `--db` | MongoDB database name |
| `--collection` | MongoDB collection name |
| `--operation` | `insert`, `update`, or `upsert` |
| `--batch-size` | Number of docs to write in bulk (default: 0) |
| `--no-header` | Use autogenerated headers (`col_0`, `col_1`, ...) |
| `--delimiter` | CSV delimiter (default: `,`) |
| `--quote` | CSV quote character (default: `"`) |
| `--template-dir` | Directory with Jinja templates |
| `--extended-json` | Enable support for non-JSON BSON values |
| `--validate-only` | Validate rows without writing to MongoDB |
| `--dry-run` | Print documents instead of inserting |
| `--debug` | Enable verbose output |
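
As a rough illustration of the `--no-header` behavior, the following Rust sketch generates placeholder names in the `col_0, col_1, ...` pattern from the width of the first row. The function name `headers_for` is hypothetical, and the sketch assumes that without the flag the first row supplies the header names.

```rust
/// Pick header names for a CSV file.
/// With `no_header` set, names are generated as col_0, col_1, ...;
/// otherwise the first row itself is assumed to hold the names.
fn headers_for(first_row: &[&str], no_header: bool) -> Vec<String> {
    if no_header {
        (0..first_row.len()).map(|i| format!("col_{i}")).collect()
    } else {
        first_row.iter().map(|s| s.to_string()).collect()
    }
}
```

With generated names, a mapping file would then refer to columns by `col_0`, `col_1`, and so on, rather than by names taken from the file.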

# 🧠 Truthy/Falsy Mapping for Booleans

In mapping.yaml, you can define per-field truthy/falsy values:

active:
  type: bool
  truthy: ["yes", "1", "true"]
  falsy: ["no", "0", "false"]

This lets string values such as "yes"/"no" or "Y"/"N" map to true/false, provided each value is listed under truthy or falsy for that field.
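
The lookup can be sketched in Rust roughly as follows. This is an illustration under assumptions, not the tool's implementation: `BoolMapping` is a hypothetical type, and matching is assumed to be exact (case-sensitive) after trimming whitespace.

```rust
/// Per-field boolean mapping, mirroring the truthy / falsy lists above.
struct BoolMapping<'a> {
    truthy: &'a [&'a str],
    falsy: &'a [&'a str],
}

impl BoolMapping<'_> {
    /// Returns Some(true) or Some(false) when the value is listed,
    /// and None when it appears in neither list.
    fn parse(&self, raw: &str) -> Option<bool> {
        let value = raw.trim();
        if self.truthy.contains(&value) {
            Some(true)
        } else if self.falsy.contains(&value) {
            Some(false)
        } else {
            None
        }
    }
}
```

Returning `None` for unlisted values leaves it to the caller to decide whether an unknown string is a validation error or a default.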

# 🧠 Flexible Date Parsing with Custom Formats

Fimo supports parsing date strings using custom formats, giving you the flexibility to import dates from a wide range of sources such as Oracle, MSSQL, or ISO standards.

You can define multiple formats for a date field in your mapping file:

created_at:
  type: date
  formats:
    - "%Y-%m-%dT%H:%M:%S%.fZ"                # ISO 8601
    - "%Y-%m-%d %H:%M:%S"                    # MSSQL style
    - "%Y-%m-%d %H:%M:%S%.f"                 # Go-style (chrono-compatible)
    - "%Y/%m/%d %H:%M"                       # Custom

Fimo will try each format in order until one matches. This makes importing data from diverse systems much easier.
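
The "first matching format wins" behavior can be sketched in std-only Rust as below. Note the assumptions: `match_format` is a toy matcher written for this example, not chrono. It understands only `%Y %m %d %H %M %S`, has no `%.f` support, and does no calendar validation. The names `DateParts`, `match_format`, and `parse_with_formats` are hypothetical.

```rust
/// Parsed fields: year, month, day, hour, minute, second.
type DateParts = [u32; 6];

/// Toy matcher for a tiny subset of strftime-style specifiers.
/// Any other format character must match the input literally.
fn match_format(input: &str, format: &str) -> Option<DateParts> {
    let mut parts = [0u32; 6];
    let mut rest = input;
    let mut fmt = format.chars();
    while let Some(c) = fmt.next() {
        if c == '%' {
            // Map the specifier to an output slot and a fixed digit width.
            let (slot, width) = match fmt.next()? {
                'Y' => (0, 4),
                'm' => (1, 2),
                'd' => (2, 2),
                'H' => (3, 2),
                'M' => (4, 2),
                'S' => (5, 2),
                _ => return None, // unsupported specifier in this sketch
            };
            let digits = rest.get(..width)?;
            if !digits.bytes().all(|b| b.is_ascii_digit()) {
                return None;
            }
            parts[slot] = digits.parse().ok()?;
            rest = &rest[width..];
        } else {
            rest = rest.strip_prefix(c)?;
        }
    }
    // The whole input must be consumed for the format to count as a match.
    if rest.is_empty() {
        Some(parts)
    } else {
        None
    }
}

/// Try each format in order and return the first successful parse.
fn parse_with_formats(input: &str, formats: &[&str]) -> Option<DateParts> {
    formats.iter().find_map(|f| match_format(input, f))
}
```

Because the list is tried top to bottom, more specific formats should generally come before looser ones, so that an early loose match does not shadow a better one.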

# ▶️ Example CSV

name,created_at
Alice,2024-01-01T10:00:00Z
Bob,2024-01-01 10:00:00

# ▶️ Corresponding Mapping

name:
  type: string
created_at:
  type: date
  formats:
    - "%Y-%m-%dT%H:%M:%S%.fZ"
    - "%Y-%m-%d %H:%M:%S"

This feature leverages the chrono crate for robust and standards-compliant date parsing.

ℹ️ If formats is omitted, Fimo defaults to parsing dates as RFC 3339 (e.g. 2024-01-01T10:00:00Z).

# 📁 Project Structure

.
├── src/
│   ├── main.rs             # CLI entry point
│   ├── cli.rs              # Command-line argument parsing
│   ├── mongo.rs            # MongoDB connection
│   ├── transform.rs        # Mapping, templating, BSON conversion
│   ├── mapping.rs          # YAML field type parsing
│   └── template.rs         # Jinja environment loader
├── mappings/               # Sample mapping YAML files
├── templates/              # Sample Jinja templates
├── tests/                  # Sample CSV input for testing
└── Cargo.toml              # Project manifest

# 📚 RFC 4180 Compatibility

Fimo is fully compatible with RFC 4180:

  • Comma-separated fields (configurable)
  • Quoted fields with escape support
  • Optional headers
  • Uniform field count (recommended but not enforced)
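
To illustrate the quoting rules listed above, here is a small Rust sketch that splits one record into fields with a configurable delimiter and quote character. It is a simplified illustration, not the parser fimo-csv uses: `split_record` is a hypothetical function that expects a single complete record and is lenient about quotes appearing mid-field.

```rust
/// Split one CSV record into fields, honoring quoted fields and doubled
/// quote characters (two quotes inside a quoted field mean one literal quote).
fn split_record(record: &str, delimiter: char, quote: char) -> Vec<String> {
    let mut fields = Vec::new();
    let mut field = String::new();
    let mut in_quotes = false;
    let mut chars = record.chars().peekable();

    while let Some(c) = chars.next() {
        if in_quotes {
            if c == quote {
                if chars.peek() == Some(&quote) {
                    // Escaped quote: consume the second one, keep one.
                    chars.next();
                    field.push(quote);
                } else {
                    in_quotes = false; // closing quote
                }
            } else {
                field.push(c); // delimiters are literal inside quotes
            }
        } else if c == quote {
            in_quotes = true;
        } else if c == delimiter {
            // End of field: move the buffer out and start a fresh one.
            fields.push(std::mem::take(&mut field));
        } else {
            field.push(c);
        }
    }
    fields.push(field);
    fields
}
```

Since the function does not check how many fields it returns, records of differing widths pass through unchanged, which matches the "recommended but not enforced" note on uniform field count.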

# ⚠️ Disclaimer

fimo-csv is not affiliated with MongoDB Inc.
MongoDB® is a registered trademark of MongoDB Inc.


# 📜 License

MIT ©