konifay

`diesel-rs` ORM - typed SQL `sum(BigInt)` invocations

Using diesel-rs 2.2.x: sum() with AS and Type Handling (Version Control Example)


Introduction

diesel-rs is a powerful ORM for Rust, enabling safe and efficient database interactions. In version 2.2.x, summing values with an AS alias requires careful handling of SQL types and Diesel’s type system. This post explains how to use sum(_) with AS via raw SQL and the Diesel DSL, using a version control system as inspiration for the schema and tables.


Problem Statement

When using SUM(column) AS alias in SQL, you must alias the result to map it to a Rust struct. Diesel’s type system, however, can complicate this for BigInt columns due to the Foldable trait’s dependency on Numeric types.


Let's exercise using a fictive VCS database example:

Schema Setup

Tables Overview

Here's a simplified version control schema:

  • commits: id, repository_id, author_id, message, timestamp
  • files: id, repository_id, name, path
  • file_versions: id, file_id, commit_id, content_hash, content_size
  • repositories: id, name, description
  • users: id, username, email

Setup Instructions

To generate the schema using diesel with SQLite, omitting any migrations:

-- Create repositories table
-- Stores information about each repository
CREATE TABLE IF NOT EXISTS repositories (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL,
    description TEXT
);

-- Create users table
-- Stores information about users (authors of commits)
CREATE TABLE IF NOT EXISTS users (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    username TEXT NOT NULL UNIQUE,
    email TEXT NOT NULL UNIQUE
);

-- Create commits table
-- Records each commit made in a repository
CREATE TABLE IF NOT EXISTS commits (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    repository_id INTEGER NOT NULL,
    author_id INTEGER NOT NULL,
    message TEXT NOT NULL,
    timestamp DATETIME DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (repository_id) REFERENCES repositories(id),
    FOREIGN KEY (author_id) REFERENCES users(id)
);

-- Create files table
-- Stores information about files in a repository
CREATE TABLE IF NOT EXISTS files (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    repository_id INTEGER NOT NULL,
    name TEXT NOT NULL,
    path TEXT NOT NULL,
    FOREIGN KEY (repository_id) REFERENCES repositories(id)
);

-- Create file_versions table
-- Tracks changes to files over time (e.g., each version of a file in a commit)
CREATE TABLE IF NOT EXISTS file_versions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    file_id INTEGER NOT NULL,
    commit_id INTEGER NOT NULL,
    content_hash TEXT NOT NULL,
    content_size INTEGER NOT NULL,
    FOREIGN KEY (file_id) REFERENCES files(id),
    FOREIGN KEY (commit_id) REFERENCES commits(id)
);
cargo binstall diesel_cli
sqlite3 -init ./src/db/migrations/001-init.sql ephemeral_setup.db ""
diesel setup --database-url=./ephemeral_setup.db
diesel print-schema > src/db/schema.rs

Example Schema File

table! {
    commits (id) {
        id -> Integer,
        repository_id -> Integer,
        author_id -> Integer,
        message -> Text,
        timestamp -> DateTime,
    }
}

table! {
    file_versions (id) {
        id -> Integer,
        file_id -> Integer,
        commit_id -> Integer,
        content_hash -> Text,
        content_size -> BigInt,
    }
}

table! {
    repositories (id) {
        id -> Integer,
        name -> Text,
        description -> Text,
    }
}

Problem: Summing File Sizes

Let’s say we want to sum the total content size of all files in a specific repository.

Raw SQL Approach

Use sql_query() for direct control over SQL and aliasing.

#[derive(Debug, PartialEq, Clone)]
struct TotalSize {
    total_size: i64,
}

impl QueryableByName<Sqlite> for TotalSize {
    fn build<'a>(row: &impl NamedRow<'a, Sqlite>) -> diesel::deserialize::Result<Self> {
        let total_size = NamedRow::<'a, Sqlite>::get::<BigInt, i64>(row, "total_size")?;
        Ok(Self { total_size })
    }
}

let repo_id = 1;
let result = diesel::sql_query(
    "SELECT SUM(content_size) AS total_size FROM file_versions 
     JOIN commits ON file_versions.commit_id = commits.id 
     JOIN repositories ON commits.repository_id = repositories.id 
     WHERE repositories.id = ?",
)
.bind::<Integer, _>(&repo_id)
.get_result::<Option<TotalSize>>(conn)
.optional()?
.flatten();

dbg!(result);

Why this works:

  • Uses raw SQL with AS to alias the result.
  • QueryableByName maps the alias "total_size" to the struct field.
  • Binds parameters safely, avoiding type system conflicts.

Solution 2: Diesel DSL with BigDecimal

Using sum() with AS in Diesel requires type annotations and custom types.

use diesel::dsl::sum;

let maybe_total_size = SelectDsl::select(
    schema::file_versions::table,
    sum::<BigInt, _>(schema::file_versions::content_size),
)
.inner_join(schema::commits::table.on(schema::file_versions::commit_id.eq(schema::commits::id)))
.inner_join(schema::repositories::table.on(schema::commits::repository_id.eq(schema::repositories::id)))
.filter(schema::repositories::id.eq(repo_id))
.get_result::<Option<BigIntSum>>(conn)
.optional()?
.flatten();

Note: If BigInt is not supported, you may need to use BigDecimal for more flexibility with large values.


Final Notes

Using diesel-rs with a version control schema demonstrates how ORMs can be applied in a common, real-world scenario. The example shows how to:

  • Set up a schema with multiple tables.
  • Query using both raw SQL and the Diesel DSL.
  • Sum values with type-safe handling.

ORMs like Diesel are excellent for productivity and safety in most applications, but they are not a one-size-fits-all solution. Use them when they align with your project’s needs, and be prepared to fall back to raw SQL when performance, flexibility, or database-specific features are required.